[2024-04-25 21:50:47,898][44588] Saving configuration to /workspace/metta/train_dir/p2.fa.clean/config.json...
[2024-04-25 21:50:47,906][44588] Rollout worker 0 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 1 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 2 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 3 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 4 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 5 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 6 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 7 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 8 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 9 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 10 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 11 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 12 uses device cpu
[2024-04-25 21:50:47,906][44588] Rollout worker 13 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 14 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 15 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 16 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 17 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 18 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 19 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 20 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 21 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 22 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 23 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 24 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 25 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 26 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 27 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 28 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 29 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 30 uses device cpu
[2024-04-25 21:50:47,907][44588] Rollout worker 31 uses device cpu
[2024-04-25 21:50:48,456][44588] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-04-25 21:50:48,457][44588] InferenceWorker_p0-w0: min num requests: 10
[2024-04-25 21:50:48,499][44588] Starting all processes...
[2024-04-25 21:50:48,499][44588] Starting process learner_proc0
[2024-04-25 21:50:48,553][44588] Starting all processes...
[2024-04-25 21:50:48,557][44588] Starting process inference_proc0-0
[2024-04-25 21:50:48,557][44588] Starting process rollout_proc0
[2024-04-25 21:50:48,557][44588] Starting process rollout_proc1
[2024-04-25 21:50:48,557][44588] Starting process rollout_proc2
[2024-04-25 21:50:48,557][44588] Starting process rollout_proc3
[2024-04-25 21:50:48,557][44588] Starting process rollout_proc4
[2024-04-25 21:50:48,558][44588] Starting process rollout_proc5
[2024-04-25 21:50:48,559][44588] Starting process rollout_proc6
[2024-04-25 21:50:48,559][44588] Starting process rollout_proc7
[2024-04-25 21:50:48,560][44588] Starting process rollout_proc8
[2024-04-25 21:50:48,560][44588] Starting process rollout_proc9
[2024-04-25 21:50:48,560][44588] Starting process rollout_proc10
[2024-04-25 21:50:48,560][44588] Starting process rollout_proc11
[2024-04-25 21:50:48,561][44588] Starting process rollout_proc12
[2024-04-25 21:50:48,561][44588] Starting process rollout_proc13
[2024-04-25 21:50:48,561][44588] Starting process rollout_proc14
[2024-04-25 21:50:48,562][44588] Starting process rollout_proc15
[2024-04-25 21:50:48,562][44588] Starting process rollout_proc16
[2024-04-25 21:50:48,565][44588] Starting process rollout_proc17
[2024-04-25 21:50:48,566][44588] Starting process rollout_proc18
[2024-04-25 21:50:48,566][44588] Starting process rollout_proc19
[2024-04-25 21:50:48,566][44588] Starting process rollout_proc20
[2024-04-25 21:50:48,567][44588] Starting process rollout_proc21
[2024-04-25 21:50:48,569][44588] Starting process rollout_proc22
[2024-04-25 21:50:48,571][44588] Starting process rollout_proc23
[2024-04-25 21:50:48,573][44588] Starting process rollout_proc24
[2024-04-25 21:50:48,573][44588] Starting process rollout_proc25
[2024-04-25 21:50:48,573][44588] Starting process rollout_proc26
[2024-04-25 21:50:48,579][44588] Starting process rollout_proc27
[2024-04-25 21:50:48,580][44588] Starting process rollout_proc28
[2024-04-25 21:50:48,581][44588] Starting process rollout_proc29
[2024-04-25 21:50:48,583][44588] Starting process rollout_proc30
[2024-04-25 21:50:48,584][44588] Starting process rollout_proc31
[2024-04-25 21:50:51,653][44838] Worker 13 uses CPU cores [13]
[2024-04-25 21:50:51,950][44804] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-04-25 21:50:51,950][44804] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2024-04-25 21:50:51,961][44804] Num visible devices: 1
[2024-04-25 21:50:52,035][44804] Starting seed is not provided
[2024-04-25 21:50:52,035][44804] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-04-25 21:50:52,035][44804] Initializing actor-critic model on device cuda:0
[2024-04-25 21:50:52,036][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,047][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,055][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,055][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,055][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,055][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,055][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,055][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,056][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,056][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,056][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,056][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,056][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,056][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,056][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,056][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,057][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,057][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,057][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,057][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,057][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,057][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,057][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,057][44804] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,059][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,059][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,059][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,059][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,059][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,060][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,061][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,061][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,061][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,061][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,061][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,075][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,075][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,075][44804] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,110][44828] Worker 3 uses CPU cores [3]
[2024-04-25 21:50:52,171][44804] Created Actor Critic model with architecture:
[2024-04-25 21:50:52,171][44804] PredictingActorCritic(
  (obs_normalizer): ObservationNormalizer()
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): FeatureEncoder(
    (_global_normalizer): FeatureListNormalizer(
      (_norms_dict): ModuleDict(
        (conf:agent:energy:initial): RunningMeanStdInPlace()
        (conf:agent:energy:max): RunningMeanStdInPlace()
        (conf:agent:energy:regen): RunningMeanStdInPlace()
        (conf:altar:cooldown): RunningMeanStdInPlace()
        (conf:altar:cost): RunningMeanStdInPlace()
        (conf:attack:damage): RunningMeanStdInPlace()
        (conf:attack:freeze_duration): RunningMeanStdInPlace()
        (conf:charger:cooldown): RunningMeanStdInPlace()
        (conf:charger:energy): RunningMeanStdInPlace()
        (conf:cost:attack): RunningMeanStdInPlace()
        (conf:cost:frozen): RunningMeanStdInPlace()
        (conf:cost:jump): RunningMeanStdInPlace()
        (conf:cost:move): RunningMeanStdInPlace()
        (conf:cost:move:predator): RunningMeanStdInPlace()
        (conf:cost:move:prey): RunningMeanStdInPlace()
        (conf:cost:rotate): RunningMeanStdInPlace()
        (conf:cost:shield): RunningMeanStdInPlace()
        (conf:cost:shield:upkeep): RunningMeanStdInPlace()
        (conf:generator:cooldown): RunningMeanStdInPlace()
        (conf:gift:energy): RunningMeanStdInPlace()
        (last_action_id): RunningMeanStdInPlace()
        (last_action_val): RunningMeanStdInPlace()
        (last_reward): RunningMeanStdInPlace()
      )
    )
    (_grid_normalizer): FeatureListNormalizer(
      (_norms_dict): ModuleDict(
        (agent): RunningMeanStdInPlace()
        (altar): RunningMeanStdInPlace()
        (charger): RunningMeanStdInPlace()
        (generator): RunningMeanStdInPlace()
        (wall): RunningMeanStdInPlace()
        (agent:dir): RunningMeanStdInPlace()
        (agent:energy): RunningMeanStdInPlace()
        (agent:frozen): RunningMeanStdInPlace()
        (agent:id): RunningMeanStdInPlace()
        (agent:inv:1): RunningMeanStdInPlace()
        (agent:inv:2): RunningMeanStdInPlace()
        (agent:inv:3): RunningMeanStdInPlace()
        (agent:shield): RunningMeanStdInPlace()
        (agent:species): RunningMeanStdInPlace()
        (altar:ready): RunningMeanStdInPlace()
        (charger:bonus): RunningMeanStdInPlace()
        (charger:input:1): RunningMeanStdInPlace()
        (charger:input:2): RunningMeanStdInPlace()
        (charger:input:3): RunningMeanStdInPlace()
        (charger:output): RunningMeanStdInPlace()
        (charger:ready): RunningMeanStdInPlace()
        (generator:ready): RunningMeanStdInPlace()
        (generator:resource): RunningMeanStdInPlace()
      )
    )
    (_global_encoder): FeatureListEncoder(
      (_embedding_net): Sequential(
        (0): Linear(in_features=5, out_features=8, bias=True)
        (1): ELU(alpha=1.0)
        (2): Linear(in_features=8, out_features=8, bias=True)
        (3): ELU(alpha=1.0)
      )
    )
    (_grid_encoder): FeatureListEncoder(
      (_embedding_net): Sequential(
        (0): Linear(in_features=125, out_features=512, bias=True)
        (1): ELU(alpha=1.0)
        (2): Linear(in_features=512, out_features=512, bias=True)
        (3): ELU(alpha=1.0)
        (4): Linear(in_features=512, out_features=512, bias=True)
        (5): ELU(alpha=1.0)
        (6): Linear(in_features=512, out_features=512, bias=True)
        (7): ELU(alpha=1.0)
      )
    )
    (encoder_head): Sequential(
      (0): Linear(in_features=520, out_features=512, bias=True)
      (1): ELU(alpha=1.0)
      (2): Linear(in_features=512, out_features=512, bias=True)
      (3): ELU(alpha=1.0)
      (4): Linear(in_features=512, out_features=512, bias=True)
      (5): ELU(alpha=1.0)
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): FeatureDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=17, bias=True)
  )
)
[2024-04-25 21:50:52,174][44825] Worker 1 uses CPU cores [1]
[2024-04-25 21:50:52,182][44831] Worker 5 uses CPU cores [5]
[2024-04-25 21:50:52,182][44826] Worker 0 uses CPU cores [0]
[2024-04-25 21:50:52,190][44827] Worker 2 uses CPU cores [2]
[2024-04-25 21:50:52,214][44852] Worker 27 uses CPU cores [27]
[2024-04-25 21:50:52,230][44850] Worker 25 uses CPU cores [25]
[2024-04-25 21:50:52,374][44843] Worker 18 uses CPU cores [18]
[2024-04-25 21:50:52,381][44824] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-04-25 21:50:52,382][44824] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2024-04-25 21:50:52,386][44847] Worker 24 uses CPU cores [24]
[2024-04-25 21:50:52,390][44829] Worker 4 uses CPU cores [4]
[2024-04-25 21:50:52,392][44824] Num visible devices: 1
[2024-04-25 21:50:52,408][44804] Using optimizer
[2024-04-25 21:50:52,430][44834] Worker 8 uses CPU cores [8]
[2024-04-25 21:50:52,450][44845] Worker 20 uses CPU cores [20]
[2024-04-25 21:50:52,454][44844] Worker 19 uses CPU cores [19]
[2024-04-25 21:50:52,474][44837] Worker 11 uses CPU cores [11]
[2024-04-25 21:50:52,490][44841] Worker 16 uses CPU cores [16]
[2024-04-25 21:50:52,502][44832] Worker 7 uses CPU cores [7]
[2024-04-25 21:50:52,522][44853] Worker 28 uses CPU cores [28]
[2024-04-25 21:50:52,522][44851] Worker 26 uses CPU cores [26]
[2024-04-25 21:50:52,526][44848] Worker 22 uses CPU cores [22]
[2024-04-25 21:50:52,526][44855] Worker 31 uses CPU cores [31]
[2024-04-25 21:50:52,527][44835] Worker 10 uses CPU cores [10]
[2024-04-25 21:50:52,558][44836] Worker 12 uses CPU cores [12]
[2024-04-25 21:50:52,561][44833] Worker 9 uses CPU cores [9]
[2024-04-25 21:50:52,570][44830] Worker 6 uses CPU cores [6]
[2024-04-25 21:50:52,573][44849] Worker 23 uses CPU cores [23]
[2024-04-25 21:50:52,595][44804] No checkpoints found
[2024-04-25 21:50:52,595][44804] Did not load from checkpoint, starting from scratch!
[2024-04-25 21:50:52,596][44804] Initialized policy 0 weights for model version 0
[2024-04-25 21:50:52,598][44804] LearnerWorker_p0 finished initialization!
[2024-04-25 21:50:52,598][44804] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-04-25 21:50:52,651][44839] Worker 14 uses CPU cores [14]
[2024-04-25 21:50:52,705][44840] Worker 15 uses CPU cores [15]
[2024-04-25 21:50:52,719][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,729][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,729][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,730][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,730][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,730][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,730][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,730][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,730][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,730][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,730][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,731][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,732][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,732][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,732][44824] RunningMeanStd input shape: (1,)
[2024-04-25 21:50:52,733][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,733][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,733][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,733][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,733][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,733][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,733][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,733][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,734][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,734][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,734][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,734][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,734][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,734][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,734][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,734][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,734][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,735][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,735][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,735][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,735][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,735][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,735][44824] RunningMeanStd input shape: (11, 11)
[2024-04-25 21:50:52,762][44846] Worker 21 uses CPU cores [21]
[2024-04-25 21:50:52,772][44842] Worker 17 uses CPU cores [17]
[2024-04-25 21:50:52,783][44856] Worker 29 uses CPU cores [29]
[2024-04-25 21:50:52,812][44588] Inference worker 0-0 is ready!
[2024-04-25 21:50:52,812][44588] All inference workers are ready! Signal rollout workers to start!
[2024-04-25 21:50:52,841][44854] Worker 30 uses CPU cores [30]
[2024-04-25 21:50:53,366][44838] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,390][44830] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,393][44837] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,449][44831] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,458][44825] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,466][44834] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,472][44826] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,473][44828] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,476][44829] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,477][44827] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,503][44843] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,516][44835] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,520][44852] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,522][44850] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,525][44847] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,534][44844] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,537][44845] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,546][44841] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,547][44832] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,559][44833] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,598][44851] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,603][44855] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,603][44848] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,612][44836] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,615][44839] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,620][44849] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,625][44853] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,685][44840] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,765][44846] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,826][44842] Decorrelating experience for 0 frames...
[2024-04-25 21:50:53,848][44856] Decorrelating experience for 0 frames...
[2024-04-25 21:50:54,019][44854] Decorrelating experience for 0 frames...
[2024-04-25 21:50:54,035][44838] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,040][44837] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,044][44830] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,089][44831] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,091][44834] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,099][44825] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,120][44829] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,137][44826] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,138][44828] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,138][44835] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,142][44827] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,176][44832] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,204][44833] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,234][44843] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,248][44836] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,256][44839] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,258][44844] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,262][44850] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,263][44841] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,264][44847] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,266][44845] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,269][44852] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,326][44848] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,327][44851] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,333][44855] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,340][44840] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,346][44849] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,347][44853] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,451][44846] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,496][44842] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,516][44856] Decorrelating experience for 256 frames...
[2024-04-25 21:50:54,736][44854] Decorrelating experience for 256 frames...
[2024-04-25 21:50:55,588][44588] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-04-25 21:50:58,907][44837] Worker 11, sleep for 51.562 sec to decorrelate experience collection
[2024-04-25 21:50:58,915][44838] Worker 13, sleep for 60.938 sec to decorrelate experience collection
[2024-04-25 21:50:58,921][44825] Worker 1, sleep for 4.688 sec to decorrelate experience collection
[2024-04-25 21:50:58,935][44833] Worker 9, sleep for 42.188 sec to decorrelate experience collection
[2024-04-25 21:50:58,965][44832] Worker 7, sleep for 32.812 sec to decorrelate experience collection
[2024-04-25 21:50:58,972][44831] Worker 5, sleep for 23.438 sec to decorrelate experience collection
[2024-04-25 21:50:58,973][44828] Worker 3, sleep for 14.062 sec to decorrelate experience collection
[2024-04-25 21:50:58,976][44836] Worker 12, sleep for 56.250 sec to decorrelate experience collection
[2024-04-25 21:50:59,004][44835] Worker 10, sleep for 46.875 sec to decorrelate experience collection
[2024-04-25 21:50:59,009][44827] Worker 2, sleep for 9.375 sec to decorrelate experience collection
[2024-04-25 21:50:59,022][44834] Worker 8, sleep for 37.500 sec to decorrelate experience collection
[2024-04-25 21:50:59,049][44840] Worker 15, sleep for 70.312 sec to decorrelate experience collection
[2024-04-25 21:50:59,050][44843] Worker 18, sleep for 84.375 sec to decorrelate experience collection
[2024-04-25 21:50:59,050][44804] Signal inference workers to stop experience collection...
[2024-04-25 21:50:59,055][44824] InferenceWorker_p0-w0: stopping experience collection
[2024-04-25 21:50:59,570][44804] Signal inference workers to resume experience collection...
[2024-04-25 21:50:59,570][44824] InferenceWorker_p0-w0: resuming experience collection
[2024-04-25 21:50:59,602][44855] Worker 31, sleep for 145.312 sec to decorrelate experience collection
[2024-04-25 21:50:59,602][44850] Worker 25, sleep for 117.188 sec to decorrelate experience collection
[2024-04-25 21:50:59,798][44847] Worker 24, sleep for 112.500 sec to decorrelate experience collection
[2024-04-25 21:50:59,798][44851] Worker 26, sleep for 121.875 sec to decorrelate experience collection
[2024-04-25 21:50:59,825][44839] Worker 14, sleep for 65.625 sec to decorrelate experience collection
[2024-04-25 21:50:59,862][44844] Worker 19, sleep for 89.062 sec to decorrelate experience collection
[2024-04-25 21:50:59,899][44853] Worker 28, sleep for 131.250 sec to decorrelate experience collection
[2024-04-25 21:50:59,904][44845] Worker 20, sleep for 93.750 sec to decorrelate experience collection
[2024-04-25 21:50:59,910][44848] Worker 22, sleep for 103.125 sec to decorrelate experience collection
[2024-04-25 21:51:00,005][44841] Worker 16, sleep for 75.000 sec to decorrelate experience collection
[2024-04-25 21:51:00,049][44856] Worker 29, sleep for 135.938 sec to decorrelate experience collection
[2024-04-25 21:51:00,104][44852] Worker 27, sleep for 126.562 sec to decorrelate experience collection
[2024-04-25 21:51:00,109][44849] Worker 23, sleep for 107.812 sec to decorrelate experience collection
[2024-04-25 21:51:00,138][44846] Worker 21, sleep for 98.438 sec to decorrelate experience collection
[2024-04-25 21:51:00,189][44830] Worker 6, sleep for 28.125 sec to decorrelate experience collection
[2024-04-25 21:51:00,473][44842] Worker 17, sleep for 79.688 sec to decorrelate experience collection
[2024-04-25 21:51:00,589][44588] Fps is (10 sec: 29488.6, 60 sec: 29488.6, 300 sec: 29488.6). Total num frames: 147456. Throughput: 0: 60622.7. Samples: 303140. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2024-04-25 21:51:00,747][44829] Worker 4, sleep for 18.750 sec to decorrelate experience collection
[2024-04-25 21:51:00,877][44824] Updated weights for policy 0, policy_version 10 (0.0023)
[2024-04-25 21:51:01,018][44854] Worker 30, sleep for 140.625 sec to decorrelate experience collection
[2024-04-25 21:51:03,630][44825] Worker 1 awakens!
[2024-04-25 21:51:05,588][44588] Fps is (10 sec: 16383.9, 60 sec: 16383.9, 300 sec: 16383.9). Total num frames: 163840. Throughput: 0: 33445.7. Samples: 334460. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2024-04-25 21:51:08,392][44827] Worker 2 awakens!
[2024-04-25 21:51:08,453][44588] Heartbeat connected on Batcher_0
[2024-04-25 21:51:08,455][44588] Heartbeat connected on LearnerWorker_p0
[2024-04-25 21:51:08,459][44588] Heartbeat connected on RolloutWorker_w0
[2024-04-25 21:51:08,460][44588] Heartbeat connected on RolloutWorker_w1
[2024-04-25 21:51:08,461][44588] Heartbeat connected on RolloutWorker_w2
[2024-04-25 21:51:08,519][44588] Heartbeat connected on InferenceWorker_p0-w0
[2024-04-25 21:51:10,589][44588] Fps is (10 sec: 3276.8, 60 sec: 12014.5, 300 sec: 12014.5). Total num frames: 180224. Throughput: 0: 22723.3. Samples: 340860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 10.0)
[2024-04-25 21:51:13,106][44828] Worker 3 awakens!
[2024-04-25 21:51:13,120][44588] Heartbeat connected on RolloutWorker_w3
[2024-04-25 21:51:15,589][44588] Fps is (10 sec: 3276.7, 60 sec: 9830.2, 300 sec: 9830.2). Total num frames: 196608. Throughput: 0: 17988.6. Samples: 359780. Policy #0 lag: (min: 0.0, avg: 4.4, max: 11.0)
[2024-04-25 21:51:19,590][44829] Worker 4 awakens!
[2024-04-25 21:51:19,595][44588] Heartbeat connected on RolloutWorker_w4
[2024-04-25 21:51:20,588][44588] Fps is (10 sec: 4915.5, 60 sec: 9175.1, 300 sec: 9175.1). Total num frames: 229376. Throughput: 0: 15490.5. Samples: 387260. Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0)
[2024-04-25 21:51:20,588][44588] Avg episode reward: [(0, '0.000')]
[2024-04-25 21:51:20,589][44804] Saving new best policy, reward=0.000!
[2024-04-25 21:51:22,509][44831] Worker 5 awakens!
[2024-04-25 21:51:22,514][44588] Heartbeat connected on RolloutWorker_w5
[2024-04-25 21:51:25,270][44824] Updated weights for policy 0, policy_version 20 (0.0016)
[2024-04-25 21:51:25,588][44588] Fps is (10 sec: 13107.9, 60 sec: 10922.7, 300 sec: 10922.7). Total num frames: 327680. Throughput: 0: 14788.7. Samples: 443660. Policy #0 lag: (min: 0.0, avg: 6.5, max: 17.0)
[2024-04-25 21:51:25,588][44588] Avg episode reward: [(0, '0.000')]
[2024-04-25 21:51:28,412][44830] Worker 6 awakens!
[2024-04-25 21:51:28,416][44588] Heartbeat connected on RolloutWorker_w6
[2024-04-25 21:51:30,588][44588] Fps is (10 sec: 21299.4, 60 sec: 12639.1, 300 sec: 12639.1). Total num frames: 442368. Throughput: 0: 16312.6. Samples: 570940. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0)
[2024-04-25 21:51:30,588][44588] Avg episode reward: [(0, '0.000')]
[2024-04-25 21:51:31,594][44824] Updated weights for policy 0, policy_version 30 (0.0013)
[2024-04-25 21:51:31,878][44832] Worker 7 awakens!
[2024-04-25 21:51:31,884][44588] Heartbeat connected on RolloutWorker_w7
[2024-04-25 21:51:35,588][44588] Fps is (10 sec: 26214.2, 60 sec: 14745.6, 300 sec: 14745.6). Total num frames: 589824. Throughput: 0: 18415.0. Samples: 736600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 31.0)
[2024-04-25 21:51:35,588][44588] Avg episode reward: [(0, '0.000')]
[2024-04-25 21:51:36,562][44834] Worker 8 awakens!
[2024-04-25 21:51:36,566][44588] Heartbeat connected on RolloutWorker_w8
[2024-04-25 21:51:37,982][44824] Updated weights for policy 0, policy_version 40 (0.0013)
[2024-04-25 21:51:40,588][44588] Fps is (10 sec: 29490.8, 60 sec: 16384.0, 300 sec: 16384.0). Total num frames: 737280. Throughput: 0: 18326.2. Samples: 824680. Policy #0 lag: (min: 0.0, avg: 2.4, max: 6.0)
[2024-04-25 21:51:40,589][44588] Avg episode reward: [(0, '0.000')]
[2024-04-25 21:51:41,223][44833] Worker 9 awakens!
[2024-04-25 21:51:41,228][44588] Heartbeat connected on RolloutWorker_w9
[2024-04-25 21:51:42,935][44824] Updated weights for policy 0, policy_version 50 (0.0014)
[2024-04-25 21:51:45,588][44588] Fps is (10 sec: 31129.6, 60 sec: 18022.4, 300 sec: 18022.4). Total num frames: 901120. Throughput: 0: 16022.4. Samples: 1024140. Policy #0 lag: (min: 0.0, avg: 3.8, max: 8.0)
[2024-04-25 21:51:45,588][44588] Avg episode reward: [(0, '0.000')]
[2024-04-25 21:51:45,982][44835] Worker 10 awakens!
[2024-04-25 21:51:45,987][44588] Heartbeat connected on RolloutWorker_w10
[2024-04-25 21:51:47,119][44824] Updated weights for policy 0, policy_version 60 (0.0017)
[2024-04-25 21:51:50,570][44837] Worker 11 awakens!
[2024-04-25 21:51:50,577][44588] Heartbeat connected on RolloutWorker_w11
[2024-04-25 21:51:50,588][44588] Fps is (10 sec: 36044.9, 60 sec: 19958.7, 300 sec: 19958.7). Total num frames: 1097728. Throughput: 0: 20665.4. Samples: 1264400. Policy #0 lag: (min: 0.0, avg: 3.2, max: 7.0)
[2024-04-25 21:51:50,588][44588] Avg episode reward: [(0, '0.000')]
[2024-04-25 21:51:50,975][44824] Updated weights for policy 0, policy_version 70 (0.0014)
[2024-04-25 21:51:54,423][44824] Updated weights for policy 0, policy_version 80 (0.0018)
[2024-04-25 21:51:55,326][44836] Worker 12 awakens!
[2024-04-25 21:51:55,350][44588] Heartbeat connected on RolloutWorker_w12
[2024-04-25 21:51:55,588][44588] Fps is (10 sec: 44236.9, 60 sec: 22391.5, 300 sec: 22391.5). Total num frames: 1343488. Throughput: 0: 23455.4. Samples: 1396340. Policy #0 lag: (min: 0.0, avg: 4.3, max: 9.0)
[2024-04-25 21:51:55,588][44588] Avg episode reward: [(0, '0.000')]
[2024-04-25 21:51:58,568][44824] Updated weights for policy 0, policy_version 90 (0.0024)
[2024-04-25 21:51:59,953][44838] Worker 13 awakens!
[2024-04-25 21:51:59,959][44588] Heartbeat connected on RolloutWorker_w13 [2024-04-25 21:52:00,588][44588] Fps is (10 sec: 47513.7, 60 sec: 23757.0, 300 sec: 24197.9). Total num frames: 1572864. Throughput: 0: 28924.3. Samples: 1661360. Policy #0 lag: (min: 0.0, avg: 4.5, max: 10.0) [2024-04-25 21:52:00,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:02,165][44824] Updated weights for policy 0, policy_version 100 (0.0023) [2024-04-25 21:52:05,523][44824] Updated weights for policy 0, policy_version 110 (0.0014) [2024-04-25 21:52:05,548][44839] Worker 14 awakens! [2024-04-25 21:52:05,553][44588] Heartbeat connected on RolloutWorker_w14 [2024-04-25 21:52:05,588][44588] Fps is (10 sec: 45875.1, 60 sec: 27306.7, 300 sec: 25746.3). Total num frames: 1802240. Throughput: 0: 34418.6. Samples: 1936100. Policy #0 lag: (min: 0.0, avg: 5.4, max: 11.0) [2024-04-25 21:52:05,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:09,458][44840] Worker 15 awakens! [2024-04-25 21:52:09,466][44588] Heartbeat connected on RolloutWorker_w15 [2024-04-25 21:52:09,597][44824] Updated weights for policy 0, policy_version 120 (0.0026) [2024-04-25 21:52:10,588][44588] Fps is (10 sec: 42598.4, 60 sec: 30310.7, 300 sec: 26651.3). Total num frames: 1998848. Throughput: 0: 35848.9. Samples: 2056860. Policy #0 lag: (min: 0.0, avg: 4.3, max: 10.0) [2024-04-25 21:52:10,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:13,414][44824] Updated weights for policy 0, policy_version 130 (0.0023) [2024-04-25 21:52:15,105][44841] Worker 16 awakens! [2024-04-25 21:52:15,114][44588] Heartbeat connected on RolloutWorker_w16 [2024-04-25 21:52:15,588][44588] Fps is (10 sec: 40959.9, 60 sec: 33587.4, 300 sec: 27648.0). Total num frames: 2211840. Throughput: 0: 38625.2. Samples: 2309080. 
Policy #0 lag: (min: 0.0, avg: 5.7, max: 11.0) [2024-04-25 21:52:15,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:17,131][44824] Updated weights for policy 0, policy_version 140 (0.0026) [2024-04-25 21:52:20,261][44842] Worker 17 awakens! [2024-04-25 21:52:20,270][44588] Heartbeat connected on RolloutWorker_w17 [2024-04-25 21:52:20,588][44588] Fps is (10 sec: 42598.1, 60 sec: 36590.9, 300 sec: 28527.4). Total num frames: 2424832. Throughput: 0: 41032.5. Samples: 2583060. Policy #0 lag: (min: 0.0, avg: 5.8, max: 11.0) [2024-04-25 21:52:20,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:20,831][44824] Updated weights for policy 0, policy_version 150 (0.0022) [2024-04-25 21:52:23,519][44843] Worker 18 awakens! [2024-04-25 21:52:23,529][44588] Heartbeat connected on RolloutWorker_w18 [2024-04-25 21:52:24,279][44824] Updated weights for policy 0, policy_version 160 (0.0024) [2024-04-25 21:52:25,588][44588] Fps is (10 sec: 44237.2, 60 sec: 38775.5, 300 sec: 29491.2). Total num frames: 2654208. Throughput: 0: 42029.4. Samples: 2716000. Policy #0 lag: (min: 2.0, avg: 5.8, max: 13.0) [2024-04-25 21:52:25,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 21:52:27,989][44824] Updated weights for policy 0, policy_version 170 (0.0023) [2024-04-25 21:52:28,965][44844] Worker 19 awakens! [2024-04-25 21:52:28,975][44588] Heartbeat connected on RolloutWorker_w19 [2024-04-25 21:52:30,588][44588] Fps is (10 sec: 49151.8, 60 sec: 41232.9, 300 sec: 30698.4). Total num frames: 2916352. Throughput: 0: 43977.3. Samples: 3003120. Policy #0 lag: (min: 0.0, avg: 58.7, max: 174.0) [2024-04-25 21:52:30,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:30,911][44824] Updated weights for policy 0, policy_version 180 (0.0027) [2024-04-25 21:52:33,709][44845] Worker 20 awakens! 
[2024-04-25 21:52:33,718][44588] Heartbeat connected on RolloutWorker_w20 [2024-04-25 21:52:34,206][44824] Updated weights for policy 0, policy_version 190 (0.0024) [2024-04-25 21:52:35,588][44588] Fps is (10 sec: 49151.3, 60 sec: 42598.4, 300 sec: 31457.3). Total num frames: 3145728. Throughput: 0: 45232.8. Samples: 3299880. Policy #0 lag: (min: 0.0, avg: 63.3, max: 188.0) [2024-04-25 21:52:35,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:37,827][44824] Updated weights for policy 0, policy_version 200 (0.0027) [2024-04-25 21:52:38,669][44846] Worker 21 awakens! [2024-04-25 21:52:38,679][44588] Heartbeat connected on RolloutWorker_w21 [2024-04-25 21:52:40,588][44588] Fps is (10 sec: 47513.9, 60 sec: 44236.8, 300 sec: 32299.9). Total num frames: 3391488. Throughput: 0: 45622.2. Samples: 3449340. Policy #0 lag: (min: 0.0, avg: 68.9, max: 203.0) [2024-04-25 21:52:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:41,308][44824] Updated weights for policy 0, policy_version 210 (0.0028) [2024-04-25 21:52:43,065][44848] Worker 22 awakens! [2024-04-25 21:52:43,076][44588] Heartbeat connected on RolloutWorker_w22 [2024-04-25 21:52:44,399][44824] Updated weights for policy 0, policy_version 220 (0.0024) [2024-04-25 21:52:45,588][44588] Fps is (10 sec: 50790.7, 60 sec: 45875.2, 300 sec: 33214.8). Total num frames: 3653632. Throughput: 0: 46335.0. Samples: 3746440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 15.0) [2024-04-25 21:52:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:45,601][44804] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000000223_3653632.pth... [2024-04-25 21:52:47,438][44824] Updated weights for policy 0, policy_version 230 (0.0028) [2024-04-25 21:52:48,018][44849] Worker 23 awakens! [2024-04-25 21:52:48,030][44588] Heartbeat connected on RolloutWorker_w23 [2024-04-25 21:52:50,588][44588] Fps is (10 sec: 52428.7, 60 sec: 46967.5, 300 sec: 34050.2). Total num frames: 3915776. 
Throughput: 0: 47192.5. Samples: 4059760. Policy #0 lag: (min: 0.0, avg: 7.1, max: 16.0) [2024-04-25 21:52:50,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:50,644][44824] Updated weights for policy 0, policy_version 240 (0.0027) [2024-04-25 21:52:52,346][44847] Worker 24 awakens! [2024-04-25 21:52:52,358][44588] Heartbeat connected on RolloutWorker_w24 [2024-04-25 21:52:53,487][44824] Updated weights for policy 0, policy_version 250 (0.0026) [2024-04-25 21:52:55,588][44588] Fps is (10 sec: 52428.8, 60 sec: 47240.5, 300 sec: 34816.0). Total num frames: 4177920. Throughput: 0: 47844.3. Samples: 4209860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 17.0) [2024-04-25 21:52:55,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:52:56,886][44850] Worker 25 awakens! [2024-04-25 21:52:56,897][44588] Heartbeat connected on RolloutWorker_w25 [2024-04-25 21:52:56,910][44824] Updated weights for policy 0, policy_version 260 (0.0028) [2024-04-25 21:53:00,015][44824] Updated weights for policy 0, policy_version 270 (0.0026) [2024-04-25 21:53:00,588][44588] Fps is (10 sec: 52428.4, 60 sec: 47786.6, 300 sec: 35520.5). Total num frames: 4440064. Throughput: 0: 49408.0. Samples: 4532440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 17.0) [2024-04-25 21:53:00,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:01,710][44851] Worker 26 awakens! [2024-04-25 21:53:01,721][44588] Heartbeat connected on RolloutWorker_w26 [2024-04-25 21:53:02,830][44824] Updated weights for policy 0, policy_version 280 (0.0032) [2024-04-25 21:53:05,588][44588] Fps is (10 sec: 52428.8, 60 sec: 48332.8, 300 sec: 36170.8). Total num frames: 4702208. Throughput: 0: 50514.6. Samples: 4856220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-04-25 21:53:05,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:06,427][44824] Updated weights for policy 0, policy_version 290 (0.0032) [2024-04-25 21:53:06,683][44852] Worker 27 awakens! 
[2024-04-25 21:53:06,694][44588] Heartbeat connected on RolloutWorker_w27 [2024-04-25 21:53:08,854][44824] Updated weights for policy 0, policy_version 300 (0.0030) [2024-04-25 21:53:10,588][44588] Fps is (10 sec: 54067.3, 60 sec: 49698.0, 300 sec: 36894.3). Total num frames: 4980736. Throughput: 0: 51042.1. Samples: 5012900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 19.0) [2024-04-25 21:53:10,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:11,247][44853] Worker 28 awakens! [2024-04-25 21:53:11,259][44588] Heartbeat connected on RolloutWorker_w28 [2024-04-25 21:53:12,066][44824] Updated weights for policy 0, policy_version 310 (0.0031) [2024-04-25 21:53:15,220][44824] Updated weights for policy 0, policy_version 320 (0.0030) [2024-04-25 21:53:15,588][44588] Fps is (10 sec: 54066.4, 60 sec: 50517.2, 300 sec: 37449.1). Total num frames: 5242880. Throughput: 0: 52002.5. Samples: 5343240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-25 21:53:15,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 21:53:16,035][44856] Worker 29 awakens! [2024-04-25 21:53:16,047][44588] Heartbeat connected on RolloutWorker_w29 [2024-04-25 21:53:17,707][44824] Updated weights for policy 0, policy_version 330 (0.0032) [2024-04-25 21:53:20,588][44588] Fps is (10 sec: 55705.4, 60 sec: 51882.6, 300 sec: 38191.7). Total num frames: 5537792. Throughput: 0: 52768.0. Samples: 5674440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-04-25 21:53:20,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:21,439][44824] Updated weights for policy 0, policy_version 340 (0.0028) [2024-04-25 21:53:21,649][44854] Worker 30 awakens! [2024-04-25 21:53:21,671][44588] Heartbeat connected on RolloutWorker_w30 [2024-04-25 21:53:23,557][44824] Updated weights for policy 0, policy_version 350 (0.0027) [2024-04-25 21:53:25,015][44855] Worker 31 awakens! [2024-04-25 21:53:25,029][44804] Signal inference workers to stop experience collection... 
(50 times) [2024-04-25 21:53:25,028][44588] Heartbeat connected on RolloutWorker_w31 [2024-04-25 21:53:25,029][44804] Signal inference workers to resume experience collection... (50 times) [2024-04-25 21:53:25,041][44824] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-04-25 21:53:25,041][44824] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-04-25 21:53:25,588][44588] Fps is (10 sec: 58983.1, 60 sec: 52974.8, 300 sec: 38884.7). Total num frames: 5832704. Throughput: 0: 53191.5. Samples: 5842960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-25 21:53:25,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:27,394][44824] Updated weights for policy 0, policy_version 360 (0.0033) [2024-04-25 21:53:29,440][44824] Updated weights for policy 0, policy_version 370 (0.0029) [2024-04-25 21:53:30,588][44588] Fps is (10 sec: 57344.7, 60 sec: 53248.1, 300 sec: 39427.3). Total num frames: 6111232. Throughput: 0: 53981.4. Samples: 6175600. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-25 21:53:30,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:33,247][44824] Updated weights for policy 0, policy_version 380 (0.0026) [2024-04-25 21:53:35,304][44824] Updated weights for policy 0, policy_version 390 (0.0030) [2024-04-25 21:53:35,588][44588] Fps is (10 sec: 55705.4, 60 sec: 54067.2, 300 sec: 39936.0). Total num frames: 6389760. Throughput: 0: 54527.9. Samples: 6513520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-25 21:53:35,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:38,832][44824] Updated weights for policy 0, policy_version 400 (0.0028) [2024-04-25 21:53:40,588][44588] Fps is (10 sec: 57344.2, 60 sec: 54886.4, 300 sec: 40513.2). Total num frames: 6684672. Throughput: 0: 55060.1. Samples: 6687560. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 21:53:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:41,001][44824] Updated weights for policy 0, policy_version 410 (0.0031) [2024-04-25 21:53:44,469][44824] Updated weights for policy 0, policy_version 420 (0.0028) [2024-04-25 21:53:45,588][44588] Fps is (10 sec: 55706.5, 60 sec: 54886.5, 300 sec: 40863.6). Total num frames: 6946816. Throughput: 0: 55558.0. Samples: 7032540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-25 21:53:45,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:46,904][44824] Updated weights for policy 0, policy_version 430 (0.0031) [2024-04-25 21:53:50,177][44824] Updated weights for policy 0, policy_version 440 (0.0032) [2024-04-25 21:53:50,588][44588] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 41287.7). Total num frames: 7225344. Throughput: 0: 56001.0. Samples: 7376260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 21:53:50,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:52,763][44824] Updated weights for policy 0, policy_version 450 (0.0038) [2024-04-25 21:53:55,588][44588] Fps is (10 sec: 55704.3, 60 sec: 55432.4, 300 sec: 41688.1). Total num frames: 7503872. Throughput: 0: 56101.7. Samples: 7537480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 21:53:55,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:53:56,146][44824] Updated weights for policy 0, policy_version 460 (0.0027) [2024-04-25 21:53:58,496][44824] Updated weights for policy 0, policy_version 470 (0.0030) [2024-04-25 21:54:00,588][44588] Fps is (10 sec: 57343.1, 60 sec: 55978.7, 300 sec: 42155.6). Total num frames: 7798784. Throughput: 0: 56237.0. Samples: 7873900. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-25 21:54:00,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:01,824][44824] Updated weights for policy 0, policy_version 480 (0.0032) [2024-04-25 21:54:04,219][44824] Updated weights for policy 0, policy_version 490 (0.0036) [2024-04-25 21:54:05,588][44588] Fps is (10 sec: 58983.1, 60 sec: 56524.8, 300 sec: 42598.4). Total num frames: 8093696. Throughput: 0: 56502.3. Samples: 8217040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-25 21:54:05,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:07,503][44824] Updated weights for policy 0, policy_version 500 (0.0046) [2024-04-25 21:54:10,075][44824] Updated weights for policy 0, policy_version 510 (0.0026) [2024-04-25 21:54:10,588][44588] Fps is (10 sec: 57345.1, 60 sec: 56525.0, 300 sec: 42934.5). Total num frames: 8372224. Throughput: 0: 56609.1. Samples: 8390360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 21:54:10,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:13,239][44824] Updated weights for policy 0, policy_version 520 (0.0029) [2024-04-25 21:54:15,588][44588] Fps is (10 sec: 54067.5, 60 sec: 56525.0, 300 sec: 43171.9). Total num frames: 8634368. Throughput: 0: 56726.2. Samples: 8728280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 21:54:15,596][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 21:54:15,986][44824] Updated weights for policy 0, policy_version 530 (0.0027) [2024-04-25 21:54:19,095][44824] Updated weights for policy 0, policy_version 540 (0.0029) [2024-04-25 21:54:20,588][44588] Fps is (10 sec: 55705.2, 60 sec: 56524.9, 300 sec: 43557.5). Total num frames: 8929280. Throughput: 0: 56670.8. Samples: 9063700. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-25 21:54:20,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:21,983][44824] Updated weights for policy 0, policy_version 550 (0.0029) [2024-04-25 21:54:24,799][44824] Updated weights for policy 0, policy_version 560 (0.0026) [2024-04-25 21:54:25,588][44588] Fps is (10 sec: 55705.8, 60 sec: 55978.8, 300 sec: 43768.7). Total num frames: 9191424. Throughput: 0: 56504.5. Samples: 9230260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 21:54:25,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:27,754][44824] Updated weights for policy 0, policy_version 570 (0.0031) [2024-04-25 21:54:30,307][44804] Signal inference workers to stop experience collection... (100 times) [2024-04-25 21:54:30,340][44824] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-04-25 21:54:30,371][44804] Signal inference workers to resume experience collection... (100 times) [2024-04-25 21:54:30,372][44824] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-04-25 21:54:30,588][44588] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 44122.5). Total num frames: 9486336. Throughput: 0: 56275.8. Samples: 9564960. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-25 21:54:30,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:30,597][44824] Updated weights for policy 0, policy_version 580 (0.0030) [2024-04-25 21:54:33,485][44824] Updated weights for policy 0, policy_version 590 (0.0031) [2024-04-25 21:54:35,588][44588] Fps is (10 sec: 55704.9, 60 sec: 55978.7, 300 sec: 44311.3). Total num frames: 9748480. Throughput: 0: 56181.6. Samples: 9904440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-04-25 21:54:35,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:36,488][44824] Updated weights for policy 0, policy_version 600 (0.0026) [2024-04-25 21:54:39,291][44824] Updated weights for policy 0, policy_version 610 (0.0023) [2024-04-25 21:54:40,588][44588] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 44710.1). Total num frames: 10059776. Throughput: 0: 56276.7. Samples: 10069920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-25 21:54:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:42,249][44824] Updated weights for policy 0, policy_version 620 (0.0032) [2024-04-25 21:54:45,088][44824] Updated weights for policy 0, policy_version 630 (0.0030) [2024-04-25 21:54:45,588][44588] Fps is (10 sec: 58983.2, 60 sec: 56524.8, 300 sec: 44949.2). Total num frames: 10338304. Throughput: 0: 56243.3. Samples: 10404840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-25 21:54:45,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:45,596][44804] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000000632_10354688.pth... [2024-04-25 21:54:48,283][44824] Updated weights for policy 0, policy_version 640 (0.0026) [2024-04-25 21:54:50,588][44588] Fps is (10 sec: 55705.1, 60 sec: 56524.7, 300 sec: 45178.0). Total num frames: 10616832. Throughput: 0: 56188.0. Samples: 10745500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 21:54:50,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:51,001][44824] Updated weights for policy 0, policy_version 650 (0.0029) [2024-04-25 21:54:54,068][44824] Updated weights for policy 0, policy_version 660 (0.0033) [2024-04-25 21:54:55,588][44588] Fps is (10 sec: 55705.1, 60 sec: 56524.9, 300 sec: 45397.3). Total num frames: 10895360. Throughput: 0: 56151.8. Samples: 10917200. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 21:54:55,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:54:56,805][44824] Updated weights for policy 0, policy_version 670 (0.0027) [2024-04-25 21:54:59,765][44824] Updated weights for policy 0, policy_version 680 (0.0024) [2024-04-25 21:55:00,588][44588] Fps is (10 sec: 55706.1, 60 sec: 56251.9, 300 sec: 45607.7). Total num frames: 11173888. Throughput: 0: 56198.3. Samples: 11257200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-25 21:55:00,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:02,580][44824] Updated weights for policy 0, policy_version 690 (0.0029) [2024-04-25 21:55:05,588][44588] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 45809.7). Total num frames: 11452416. Throughput: 0: 56296.4. Samples: 11597040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-25 21:55:05,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 21:55:05,658][44824] Updated weights for policy 0, policy_version 700 (0.0031) [2024-04-25 21:55:08,338][44824] Updated weights for policy 0, policy_version 710 (0.0026) [2024-04-25 21:55:10,588][44588] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 45939.5). Total num frames: 11714560. Throughput: 0: 56423.5. Samples: 11769320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-25 21:55:10,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:11,505][44824] Updated weights for policy 0, policy_version 720 (0.0027) [2024-04-25 21:55:14,257][44824] Updated weights for policy 0, policy_version 730 (0.0029) [2024-04-25 21:55:15,588][44588] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 46253.3). Total num frames: 12025856. Throughput: 0: 56377.1. Samples: 12101920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-25 21:55:15,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:17,405][44824] Updated weights for policy 0, policy_version 740 (0.0027) [2024-04-25 21:55:19,981][44824] Updated weights for policy 0, policy_version 750 (0.0032) [2024-04-25 21:55:20,588][44588] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 46431.7). Total num frames: 12304384. Throughput: 0: 56236.6. Samples: 12435080. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-25 21:55:20,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:23,231][44824] Updated weights for policy 0, policy_version 760 (0.0031) [2024-04-25 21:55:25,588][44588] Fps is (10 sec: 55704.8, 60 sec: 56524.7, 300 sec: 46603.4). Total num frames: 12582912. Throughput: 0: 56349.2. Samples: 12605640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 21:55:25,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:25,935][44824] Updated weights for policy 0, policy_version 770 (0.0029) [2024-04-25 21:55:29,060][44824] Updated weights for policy 0, policy_version 780 (0.0029) [2024-04-25 21:55:30,588][44588] Fps is (10 sec: 57343.7, 60 sec: 56524.9, 300 sec: 46828.5). Total num frames: 12877824. Throughput: 0: 56400.4. Samples: 12942860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-25 21:55:30,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:31,777][44824] Updated weights for policy 0, policy_version 790 (0.0031) [2024-04-25 21:55:34,926][44824] Updated weights for policy 0, policy_version 800 (0.0033) [2024-04-25 21:55:35,588][44588] Fps is (10 sec: 55706.4, 60 sec: 56524.9, 300 sec: 46928.5). Total num frames: 13139968. Throughput: 0: 56305.5. Samples: 13279240. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 21:55:35,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:37,465][44824] Updated weights for policy 0, policy_version 810 (0.0035) [2024-04-25 21:55:40,588][44588] Fps is (10 sec: 52428.8, 60 sec: 55705.5, 300 sec: 47025.0). Total num frames: 13402112. Throughput: 0: 56054.3. Samples: 13439640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 21:55:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:40,791][44824] Updated weights for policy 0, policy_version 820 (0.0031) [2024-04-25 21:55:43,046][44804] Signal inference workers to stop experience collection... (150 times) [2024-04-25 21:55:43,047][44804] Signal inference workers to resume experience collection... (150 times) [2024-04-25 21:55:43,062][44824] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-04-25 21:55:43,062][44824] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-04-25 21:55:43,291][44824] Updated weights for policy 0, policy_version 830 (0.0029) [2024-04-25 21:55:45,588][44588] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 47231.1). Total num frames: 13697024. Throughput: 0: 56067.8. Samples: 13780260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 21:55:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:46,569][44824] Updated weights for policy 0, policy_version 840 (0.0036) [2024-04-25 21:55:49,136][44824] Updated weights for policy 0, policy_version 850 (0.0036) [2024-04-25 21:55:50,588][44588] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 47374.8). Total num frames: 13975552. Throughput: 0: 55953.3. Samples: 14114940. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 21:55:50,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:52,406][44824] Updated weights for policy 0, policy_version 860 (0.0027) [2024-04-25 21:55:55,099][44824] Updated weights for policy 0, policy_version 870 (0.0030) [2024-04-25 21:55:55,588][44588] Fps is (10 sec: 57345.1, 60 sec: 56251.8, 300 sec: 47874.7). Total num frames: 14270464. Throughput: 0: 56033.0. Samples: 14290800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 21:55:55,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:55:58,212][44824] Updated weights for policy 0, policy_version 880 (0.0032) [2024-04-25 21:56:00,588][44588] Fps is (10 sec: 58983.2, 60 sec: 56524.9, 300 sec: 48818.8). Total num frames: 14565376. Throughput: 0: 56095.2. Samples: 14626200. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-25 21:56:00,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:00,747][44824] Updated weights for policy 0, policy_version 890 (0.0031) [2024-04-25 21:56:04,067][44824] Updated weights for policy 0, policy_version 900 (0.0029) [2024-04-25 21:56:05,588][44588] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 49707.5). Total num frames: 14843904. Throughput: 0: 56213.0. Samples: 14964660. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-25 21:56:05,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:06,588][44824] Updated weights for policy 0, policy_version 910 (0.0032) [2024-04-25 21:56:09,788][44824] Updated weights for policy 0, policy_version 920 (0.0032) [2024-04-25 21:56:10,588][44588] Fps is (10 sec: 57343.7, 60 sec: 57071.0, 300 sec: 50651.6). Total num frames: 15138816. Throughput: 0: 56381.5. Samples: 15142800. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 21:56:10,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:12,471][44824] Updated weights for policy 0, policy_version 930 (0.0030) [2024-04-25 21:56:15,588][44588] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 51373.6). Total num frames: 15384576. Throughput: 0: 56451.6. Samples: 15483180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 24.0) [2024-04-25 21:56:15,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:15,620][44824] Updated weights for policy 0, policy_version 940 (0.0034) [2024-04-25 21:56:18,220][44824] Updated weights for policy 0, policy_version 950 (0.0027) [2024-04-25 21:56:20,588][44588] Fps is (10 sec: 52428.6, 60 sec: 55978.7, 300 sec: 51984.5). Total num frames: 15663104. Throughput: 0: 56370.6. Samples: 15815920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-25 21:56:20,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:21,637][44824] Updated weights for policy 0, policy_version 960 (0.0026) [2024-04-25 21:56:24,118][44824] Updated weights for policy 0, policy_version 970 (0.0031) [2024-04-25 21:56:25,588][44588] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 52539.9). Total num frames: 15941632. Throughput: 0: 56273.8. Samples: 15971960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 21:56:25,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:27,340][44824] Updated weights for policy 0, policy_version 980 (0.0031) [2024-04-25 21:56:29,854][44824] Updated weights for policy 0, policy_version 990 (0.0029) [2024-04-25 21:56:30,588][44588] Fps is (10 sec: 58982.4, 60 sec: 56251.8, 300 sec: 53095.3). Total num frames: 16252928. Throughput: 0: 56358.4. Samples: 16316380. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-25 21:56:30,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:33,062][44824] Updated weights for policy 0, policy_version 1000 (0.0032) [2024-04-25 21:56:35,575][44824] Updated weights for policy 0, policy_version 1010 (0.0033) [2024-04-25 21:56:35,588][44588] Fps is (10 sec: 60620.8, 60 sec: 56797.8, 300 sec: 53595.1). Total num frames: 16547840. Throughput: 0: 56418.7. Samples: 16653780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 21:56:35,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:38,911][44824] Updated weights for policy 0, policy_version 1020 (0.0033) [2024-04-25 21:56:40,588][44588] Fps is (10 sec: 55705.6, 60 sec: 56797.9, 300 sec: 53928.4). Total num frames: 16809984. Throughput: 0: 56480.9. Samples: 16832440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 21:56:40,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:41,398][44824] Updated weights for policy 0, policy_version 1030 (0.0034) [2024-04-25 21:56:44,512][44824] Updated weights for policy 0, policy_version 1040 (0.0028) [2024-04-25 21:56:44,924][44804] Signal inference workers to stop experience collection... (200 times) [2024-04-25 21:56:44,956][44824] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-04-25 21:56:45,007][44804] Signal inference workers to resume experience collection... (200 times) [2024-04-25 21:56:45,007][44824] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-04-25 21:56:45,588][44588] Fps is (10 sec: 57343.5, 60 sec: 57071.0, 300 sec: 54317.1). Total num frames: 17121280. Throughput: 0: 56671.8. Samples: 17176440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-25 21:56:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:45,599][44804] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000001045_17121280.pth... 
[2024-04-25 21:56:45,650][44804] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000000223_3653632.pth [2024-04-25 21:56:47,345][44824] Updated weights for policy 0, policy_version 1050 (0.0031) [2024-04-25 21:56:50,351][44824] Updated weights for policy 0, policy_version 1060 (0.0034) [2024-04-25 21:56:50,588][44588] Fps is (10 sec: 57344.2, 60 sec: 56797.9, 300 sec: 54372.7). Total num frames: 17383424. Throughput: 0: 56647.5. Samples: 17513800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-25 21:56:50,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:53,344][44824] Updated weights for policy 0, policy_version 1070 (0.0040) [2024-04-25 21:56:55,588][44588] Fps is (10 sec: 50790.2, 60 sec: 55978.5, 300 sec: 54428.2). Total num frames: 17629184. Throughput: 0: 56188.7. Samples: 17671300. Policy #0 lag: (min: 1.0, avg: 9.4, max: 23.0) [2024-04-25 21:56:55,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:56:56,362][44824] Updated weights for policy 0, policy_version 1080 (0.0029) [2024-04-25 21:56:59,308][44824] Updated weights for policy 0, policy_version 1090 (0.0031) [2024-04-25 21:57:00,588][44588] Fps is (10 sec: 52428.8, 60 sec: 55705.5, 300 sec: 54594.8). Total num frames: 17907712. Throughput: 0: 56181.8. Samples: 18011360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-25 21:57:00,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:02,013][44824] Updated weights for policy 0, policy_version 1100 (0.0033) [2024-04-25 21:57:05,083][44824] Updated weights for policy 0, policy_version 1110 (0.0028) [2024-04-25 21:57:05,588][44588] Fps is (10 sec: 57344.6, 60 sec: 55978.5, 300 sec: 54928.0). Total num frames: 18202624. Throughput: 0: 56340.4. Samples: 18351240. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-25 21:57:05,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:07,925][44824] Updated weights for policy 0, policy_version 1120 (0.0032) [2024-04-25 21:57:10,588][44588] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55205.8). Total num frames: 18497536. Throughput: 0: 56568.0. Samples: 18517520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 21:57:10,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:10,835][44824] Updated weights for policy 0, policy_version 1130 (0.0031) [2024-04-25 21:57:13,719][44824] Updated weights for policy 0, policy_version 1140 (0.0034) [2024-04-25 21:57:15,588][44588] Fps is (10 sec: 58981.8, 60 sec: 56797.7, 300 sec: 55483.4). Total num frames: 18792448. Throughput: 0: 56330.5. Samples: 18851260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-25 21:57:15,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:16,793][44824] Updated weights for policy 0, policy_version 1150 (0.0032) [2024-04-25 21:57:19,465][44824] Updated weights for policy 0, policy_version 1160 (0.0023) [2024-04-25 21:57:20,588][44588] Fps is (10 sec: 57343.5, 60 sec: 56797.7, 300 sec: 55650.0). Total num frames: 19070976. Throughput: 0: 56267.0. Samples: 19185800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-25 21:57:20,597][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:22,714][44824] Updated weights for policy 0, policy_version 1170 (0.0029) [2024-04-25 21:57:25,238][44824] Updated weights for policy 0, policy_version 1180 (0.0028) [2024-04-25 21:57:25,588][44588] Fps is (10 sec: 55706.2, 60 sec: 56797.9, 300 sec: 55705.6). Total num frames: 19349504. Throughput: 0: 56237.7. Samples: 19363140. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-25 21:57:25,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:28,481][44824] Updated weights for policy 0, policy_version 1190 (0.0035) [2024-04-25 21:57:30,588][44588] Fps is (10 sec: 54067.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 19611648. Throughput: 0: 55920.1. Samples: 19692840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-25 21:57:30,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:31,077][44824] Updated weights for policy 0, policy_version 1200 (0.0026) [2024-04-25 21:57:34,344][44824] Updated weights for policy 0, policy_version 1210 (0.0033) [2024-04-25 21:57:35,588][44588] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 19873792. Throughput: 0: 56190.2. Samples: 20042360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 21:57:35,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:36,964][44824] Updated weights for policy 0, policy_version 1220 (0.0028) [2024-04-25 21:57:40,241][44824] Updated weights for policy 0, policy_version 1230 (0.0035) [2024-04-25 21:57:40,588][44588] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 20152320. Throughput: 0: 56125.1. Samples: 20196920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 21:57:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:42,734][44824] Updated weights for policy 0, policy_version 1240 (0.0037) [2024-04-25 21:57:43,121][44804] Signal inference workers to stop experience collection... (250 times) [2024-04-25 21:57:43,173][44804] Signal inference workers to resume experience collection... 
(250 times) [2024-04-25 21:57:43,173][44824] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-04-25 21:57:43,186][44824] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-04-25 21:57:45,588][44588] Fps is (10 sec: 57343.4, 60 sec: 55432.6, 300 sec: 56038.8). Total num frames: 20447232. Throughput: 0: 55957.2. Samples: 20529440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-25 21:57:45,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:46,070][44824] Updated weights for policy 0, policy_version 1250 (0.0032) [2024-04-25 21:57:48,713][44824] Updated weights for policy 0, policy_version 1260 (0.0028) [2024-04-25 21:57:50,588][44588] Fps is (10 sec: 58981.5, 60 sec: 55978.5, 300 sec: 56149.9). Total num frames: 20742144. Throughput: 0: 55867.0. Samples: 20865260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-25 21:57:50,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:51,850][44824] Updated weights for policy 0, policy_version 1270 (0.0031) [2024-04-25 21:57:54,461][44824] Updated weights for policy 0, policy_version 1280 (0.0033) [2024-04-25 21:57:55,588][44588] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 21020672. Throughput: 0: 56063.1. Samples: 21040360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 21:57:55,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:57:57,788][44824] Updated weights for policy 0, policy_version 1290 (0.0028) [2024-04-25 21:58:00,406][44824] Updated weights for policy 0, policy_version 1300 (0.0025) [2024-04-25 21:58:00,588][44588] Fps is (10 sec: 57344.3, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 21315584. Throughput: 0: 56070.8. Samples: 21374440. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 21:58:00,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:03,684][44824] Updated weights for policy 0, policy_version 1310 (0.0029) [2024-04-25 21:58:05,588][44588] Fps is (10 sec: 57344.3, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 21594112. Throughput: 0: 55995.7. Samples: 21705600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 21:58:05,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:06,236][44824] Updated weights for policy 0, policy_version 1320 (0.0035) [2024-04-25 21:58:09,708][44824] Updated weights for policy 0, policy_version 1330 (0.0031) [2024-04-25 21:58:10,588][44588] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 21839872. Throughput: 0: 55672.0. Samples: 21868380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-25 21:58:10,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:12,088][44824] Updated weights for policy 0, policy_version 1340 (0.0030) [2024-04-25 21:58:15,529][44824] Updated weights for policy 0, policy_version 1350 (0.0031) [2024-04-25 21:58:15,588][44588] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 56205.5). Total num frames: 22118400. Throughput: 0: 55861.7. Samples: 22206620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-25 21:58:15,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:17,855][44824] Updated weights for policy 0, policy_version 1360 (0.0030) [2024-04-25 21:58:20,588][44588] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 56149.9). Total num frames: 22396928. Throughput: 0: 55582.2. Samples: 22543560. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-25 21:58:20,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:21,520][44824] Updated weights for policy 0, policy_version 1370 (0.0030) [2024-04-25 21:58:23,719][44824] Updated weights for policy 0, policy_version 1380 (0.0032) [2024-04-25 21:58:25,588][44588] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 22691840. Throughput: 0: 55772.4. Samples: 22706680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 21:58:25,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:27,363][44824] Updated weights for policy 0, policy_version 1390 (0.0034) [2024-04-25 21:58:29,573][44824] Updated weights for policy 0, policy_version 1400 (0.0036) [2024-04-25 21:58:30,588][44588] Fps is (10 sec: 58982.8, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 22986752. Throughput: 0: 55842.8. Samples: 23042360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 21:58:30,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:33,229][44824] Updated weights for policy 0, policy_version 1410 (0.0035) [2024-04-25 21:58:35,329][44824] Updated weights for policy 0, policy_version 1420 (0.0028) [2024-04-25 21:58:35,588][44588] Fps is (10 sec: 57343.6, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 23265280. Throughput: 0: 55939.6. Samples: 23382540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-25 21:58:35,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:38,971][44824] Updated weights for policy 0, policy_version 1430 (0.0030) [2024-04-25 21:58:40,588][44588] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 23527424. Throughput: 0: 55877.0. Samples: 23554820. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 21:58:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:41,119][44824] Updated weights for policy 0, policy_version 1440 (0.0026) [2024-04-25 21:58:44,764][44824] Updated weights for policy 0, policy_version 1450 (0.0038) [2024-04-25 21:58:45,588][44588] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 23805952. Throughput: 0: 56012.8. Samples: 23895020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 21:58:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:45,602][44804] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000001453_23805952.pth... [2024-04-25 21:58:45,656][44804] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000000632_10354688.pth [2024-04-25 21:58:47,039][44824] Updated weights for policy 0, policy_version 1460 (0.0029) [2024-04-25 21:58:47,972][44804] Signal inference workers to stop experience collection... (300 times) [2024-04-25 21:58:47,972][44804] Signal inference workers to resume experience collection... (300 times) [2024-04-25 21:58:47,999][44824] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-04-25 21:58:47,999][44824] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-04-25 21:58:50,588][44588] Fps is (10 sec: 54066.7, 60 sec: 55432.6, 300 sec: 56149.9). Total num frames: 24068096. Throughput: 0: 56076.9. Samples: 24229060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-25 21:58:50,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:50,640][44824] Updated weights for policy 0, policy_version 1470 (0.0026) [2024-04-25 21:58:53,091][44824] Updated weights for policy 0, policy_version 1480 (0.0032) [2024-04-25 21:58:55,588][44588] Fps is (10 sec: 52429.5, 60 sec: 55159.5, 300 sec: 56038.8). Total num frames: 24330240. Throughput: 0: 55794.2. Samples: 24379120. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-25 21:58:55,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:58:56,495][44824] Updated weights for policy 0, policy_version 1490 (0.0029) [2024-04-25 21:58:58,774][44824] Updated weights for policy 0, policy_version 1500 (0.0027) [2024-04-25 21:59:00,588][44588] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 56094.4). Total num frames: 24641536. Throughput: 0: 55709.8. Samples: 24713560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 21:59:00,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:02,279][44824] Updated weights for policy 0, policy_version 1510 (0.0031) [2024-04-25 21:59:04,534][44824] Updated weights for policy 0, policy_version 1520 (0.0025) [2024-04-25 21:59:05,588][44588] Fps is (10 sec: 60621.5, 60 sec: 55705.7, 300 sec: 56149.9). Total num frames: 24936448. Throughput: 0: 55756.6. Samples: 25052600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-25 21:59:05,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:08,115][44824] Updated weights for policy 0, policy_version 1530 (0.0031) [2024-04-25 21:59:10,375][44824] Updated weights for policy 0, policy_version 1540 (0.0032) [2024-04-25 21:59:10,588][44588] Fps is (10 sec: 58982.2, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 25231360. Throughput: 0: 56097.6. Samples: 25231080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-25 21:59:10,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:13,958][44824] Updated weights for policy 0, policy_version 1550 (0.0032) [2024-04-25 21:59:15,588][44588] Fps is (10 sec: 55704.5, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 25493504. Throughput: 0: 56209.1. Samples: 25571780. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-25 21:59:15,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:16,401][44824] Updated weights for policy 0, policy_version 1560 (0.0033) [2024-04-25 21:59:19,785][44824] Updated weights for policy 0, policy_version 1570 (0.0027) [2024-04-25 21:59:20,588][44588] Fps is (10 sec: 52429.0, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 25755648. Throughput: 0: 56064.9. Samples: 25905460. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-25 21:59:20,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:22,323][44824] Updated weights for policy 0, policy_version 1580 (0.0028) [2024-04-25 21:59:25,558][44824] Updated weights for policy 0, policy_version 1590 (0.0033) [2024-04-25 21:59:25,588][44588] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 26050560. Throughput: 0: 55833.8. Samples: 26067340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 21:59:25,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:27,995][44824] Updated weights for policy 0, policy_version 1600 (0.0030) [2024-04-25 21:59:30,588][44588] Fps is (10 sec: 55706.5, 60 sec: 55432.6, 300 sec: 56150.0). Total num frames: 26312704. Throughput: 0: 55872.3. Samples: 26409260. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-04-25 21:59:30,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:31,238][44824] Updated weights for policy 0, policy_version 1610 (0.0027) [2024-04-25 21:59:33,707][44824] Updated weights for policy 0, policy_version 1620 (0.0029) [2024-04-25 21:59:35,588][44588] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 56038.8). Total num frames: 26591232. Throughput: 0: 55992.6. Samples: 26748720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-25 21:59:35,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:37,043][44824] Updated weights for policy 0, policy_version 1630 (0.0038) [2024-04-25 21:59:39,665][44824] Updated weights for policy 0, policy_version 1640 (0.0027) [2024-04-25 21:59:40,588][44588] Fps is (10 sec: 57342.6, 60 sec: 55978.5, 300 sec: 56094.3). Total num frames: 26886144. Throughput: 0: 56380.8. Samples: 26916260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 21:59:40,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:42,903][44824] Updated weights for policy 0, policy_version 1650 (0.0027) [2024-04-25 21:59:45,402][44824] Updated weights for policy 0, policy_version 1660 (0.0034) [2024-04-25 21:59:45,588][44588] Fps is (10 sec: 60619.5, 60 sec: 56524.8, 300 sec: 56205.4). Total num frames: 27197440. Throughput: 0: 56461.6. Samples: 27254340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 21:59:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:48,714][44824] Updated weights for policy 0, policy_version 1670 (0.0032) [2024-04-25 21:59:50,588][44588] Fps is (10 sec: 57344.9, 60 sec: 56524.9, 300 sec: 56149.9). Total num frames: 27459584. Throughput: 0: 56429.7. Samples: 27591940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-04-25 21:59:50,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:51,180][44824] Updated weights for policy 0, policy_version 1680 (0.0032) [2024-04-25 21:59:54,504][44824] Updated weights for policy 0, policy_version 1690 (0.0031) [2024-04-25 21:59:55,217][44804] Signal inference workers to stop experience collection... (350 times) [2024-04-25 21:59:55,247][44824] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-04-25 21:59:55,271][44804] Signal inference workers to resume experience collection... 
(350 times) [2024-04-25 21:59:55,273][44824] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-04-25 21:59:55,588][44588] Fps is (10 sec: 55706.8, 60 sec: 57071.0, 300 sec: 56205.5). Total num frames: 27754496. Throughput: 0: 56431.3. Samples: 27770480. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-25 21:59:55,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 21:59:56,923][44824] Updated weights for policy 0, policy_version 1700 (0.0029) [2024-04-25 22:00:00,416][44824] Updated weights for policy 0, policy_version 1710 (0.0033) [2024-04-25 22:00:00,588][44588] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 28016640. Throughput: 0: 56283.6. Samples: 28104540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-25 22:00:00,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:00:02,827][44824] Updated weights for policy 0, policy_version 1720 (0.0023) [2024-04-25 22:00:05,588][44588] Fps is (10 sec: 52428.5, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 28278784. Throughput: 0: 56407.6. Samples: 28443800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 22:00:05,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:06,304][44824] Updated weights for policy 0, policy_version 1730 (0.0031) [2024-04-25 22:00:08,691][44824] Updated weights for policy 0, policy_version 1740 (0.0030) [2024-04-25 22:00:10,588][44588] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 28573696. Throughput: 0: 56380.0. Samples: 28604440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 22:00:10,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:12,030][44824] Updated weights for policy 0, policy_version 1750 (0.0037) [2024-04-25 22:00:14,404][44824] Updated weights for policy 0, policy_version 1760 (0.0032) [2024-04-25 22:00:15,588][44588] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 28852224. 
Throughput: 0: 56212.7. Samples: 28938840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-25 22:00:15,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:17,871][44824] Updated weights for policy 0, policy_version 1770 (0.0027) [2024-04-25 22:00:20,588][44588] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 56149.9). Total num frames: 29147136. Throughput: 0: 56139.3. Samples: 29275000. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-04-25 22:00:20,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:20,725][44824] Updated weights for policy 0, policy_version 1780 (0.0027) [2024-04-25 22:00:23,617][44824] Updated weights for policy 0, policy_version 1790 (0.0037) [2024-04-25 22:00:25,588][44588] Fps is (10 sec: 58983.0, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 29442048. Throughput: 0: 56285.1. Samples: 29449080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 22:00:25,597][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:26,536][44824] Updated weights for policy 0, policy_version 1800 (0.0030) [2024-04-25 22:00:29,541][44824] Updated weights for policy 0, policy_version 1810 (0.0030) [2024-04-25 22:00:30,588][44588] Fps is (10 sec: 55706.1, 60 sec: 56524.6, 300 sec: 56149.9). Total num frames: 29704192. Throughput: 0: 56207.7. Samples: 29783680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:00:30,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:32,258][44824] Updated weights for policy 0, policy_version 1820 (0.0032) [2024-04-25 22:00:35,294][44824] Updated weights for policy 0, policy_version 1830 (0.0042) [2024-04-25 22:00:35,588][44588] Fps is (10 sec: 54067.1, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 29982720. Throughput: 0: 56165.3. Samples: 30119380. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-25 22:00:35,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:37,995][44824] Updated weights for policy 0, policy_version 1840 (0.0031) [2024-04-25 22:00:40,588][44588] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 56094.4). Total num frames: 30244864. Throughput: 0: 55879.1. Samples: 30285040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:00:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:41,160][44824] Updated weights for policy 0, policy_version 1850 (0.0031) [2024-04-25 22:00:43,978][44824] Updated weights for policy 0, policy_version 1860 (0.0030) [2024-04-25 22:00:45,588][44588] Fps is (10 sec: 54067.3, 60 sec: 55432.7, 300 sec: 56094.4). Total num frames: 30523392. Throughput: 0: 55929.0. Samples: 30621340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-25 22:00:45,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:45,672][44804] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000001864_30539776.pth... [2024-04-25 22:00:45,724][44804] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000001045_17121280.pth [2024-04-25 22:00:47,067][44824] Updated weights for policy 0, policy_version 1870 (0.0026) [2024-04-25 22:00:49,847][44824] Updated weights for policy 0, policy_version 1880 (0.0036) [2024-04-25 22:00:50,588][44588] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 30818304. Throughput: 0: 55801.8. Samples: 30954880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 22:00:50,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:52,836][44824] Updated weights for policy 0, policy_version 1890 (0.0038) [2024-04-25 22:00:55,588][44588] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 31113216. Throughput: 0: 56003.6. Samples: 31124600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-25 22:00:55,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:00:55,959][44824] Updated weights for policy 0, policy_version 1900 (0.0026) [2024-04-25 22:00:58,685][44824] Updated weights for policy 0, policy_version 1910 (0.0027) [2024-04-25 22:01:00,588][44588] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56094.3). Total num frames: 31391744. Throughput: 0: 56132.5. Samples: 31464800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:01:00,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:00,692][44804] Signal inference workers to stop experience collection... (400 times) [2024-04-25 22:01:00,728][44824] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-04-25 22:01:00,781][44804] Signal inference workers to resume experience collection... (400 times) [2024-04-25 22:01:00,781][44824] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-04-25 22:01:01,857][44824] Updated weights for policy 0, policy_version 1920 (0.0027) [2024-04-25 22:01:04,561][44824] Updated weights for policy 0, policy_version 1930 (0.0033) [2024-04-25 22:01:05,588][44588] Fps is (10 sec: 55705.9, 60 sec: 56524.9, 300 sec: 56038.8). Total num frames: 31670272. Throughput: 0: 56097.6. Samples: 31799380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:01:05,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:07,673][44824] Updated weights for policy 0, policy_version 1940 (0.0028) [2024-04-25 22:01:10,522][44824] Updated weights for policy 0, policy_version 1950 (0.0038) [2024-04-25 22:01:10,588][44588] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 31948800. Throughput: 0: 56024.0. Samples: 31970160. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-25 22:01:10,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:13,561][44824] Updated weights for policy 0, policy_version 1960 (0.0027) [2024-04-25 22:01:15,588][44588] Fps is (10 sec: 54067.1, 60 sec: 55978.8, 300 sec: 56094.4). Total num frames: 32210944. Throughput: 0: 56220.5. Samples: 32313600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-25 22:01:15,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:16,257][44824] Updated weights for policy 0, policy_version 1970 (0.0035) [2024-04-25 22:01:19,286][44824] Updated weights for policy 0, policy_version 1980 (0.0026) [2024-04-25 22:01:20,588][44588] Fps is (10 sec: 54067.2, 60 sec: 55705.8, 300 sec: 56094.4). Total num frames: 32489472. Throughput: 0: 56180.9. Samples: 32647520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:01:20,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:22,138][44824] Updated weights for policy 0, policy_version 1990 (0.0032) [2024-04-25 22:01:25,221][44824] Updated weights for policy 0, policy_version 2000 (0.0033) [2024-04-25 22:01:25,588][44588] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 56038.8). Total num frames: 32784384. Throughput: 0: 56053.8. Samples: 32807460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-25 22:01:25,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:28,025][44824] Updated weights for policy 0, policy_version 2010 (0.0032) [2024-04-25 22:01:30,588][44588] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 33062912. Throughput: 0: 56011.9. Samples: 33141880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 22:01:30,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:31,076][44824] Updated weights for policy 0, policy_version 2020 (0.0034) [2024-04-25 22:01:33,942][44824] Updated weights for policy 0, policy_version 2030 (0.0031) [2024-04-25 22:01:35,588][44588] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 33341440. Throughput: 0: 56182.7. Samples: 33483100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-25 22:01:35,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:36,782][44824] Updated weights for policy 0, policy_version 2040 (0.0028) [2024-04-25 22:01:39,671][44824] Updated weights for policy 0, policy_version 2050 (0.0033) [2024-04-25 22:01:40,588][44588] Fps is (10 sec: 57344.3, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 33636352. Throughput: 0: 56223.5. Samples: 33654660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 22:01:40,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:01:42,523][44824] Updated weights for policy 0, policy_version 2060 (0.0037) [2024-04-25 22:01:45,263][44824] Updated weights for policy 0, policy_version 2070 (0.0030) [2024-04-25 22:01:45,588][44588] Fps is (10 sec: 58981.6, 60 sec: 56797.8, 300 sec: 56094.3). Total num frames: 33931264. Throughput: 0: 56236.4. Samples: 33995440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-25 22:01:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:48,179][44824] Updated weights for policy 0, policy_version 2080 (0.0026) [2024-04-25 22:01:50,588][44588] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 34177024. Throughput: 0: 56321.2. Samples: 34333840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 22:01:50,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:51,135][44824] Updated weights for policy 0, policy_version 2090 (0.0036) [2024-04-25 22:01:54,074][44824] Updated weights for policy 0, policy_version 2100 (0.0027) [2024-04-25 22:01:55,588][44588] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 34471936. Throughput: 0: 56306.0. Samples: 34503940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:01:55,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:01:56,912][44824] Updated weights for policy 0, policy_version 2110 (0.0031) [2024-04-25 22:01:59,877][44824] Updated weights for policy 0, policy_version 2120 (0.0038) [2024-04-25 22:02:00,588][44588] Fps is (10 sec: 58982.5, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 34766848. Throughput: 0: 56163.1. Samples: 34840940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 22:02:00,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:02:02,801][44824] Updated weights for policy 0, policy_version 2130 (0.0029) [2024-04-25 22:02:05,588][44588] Fps is (10 sec: 57343.7, 60 sec: 56251.5, 300 sec: 56094.3). Total num frames: 35045376. Throughput: 0: 56257.5. Samples: 35179120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 22:02:05,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:02:05,814][44824] Updated weights for policy 0, policy_version 2140 (0.0027) [2024-04-25 22:02:08,574][44824] Updated weights for policy 0, policy_version 2150 (0.0030) [2024-04-25 22:02:10,588][44588] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 56038.8). Total num frames: 35323904. Throughput: 0: 56462.1. Samples: 35348260. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-25 22:02:10,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:02:11,672][44824] Updated weights for policy 0, policy_version 2160 (0.0026) [2024-04-25 22:02:14,321][44824] Updated weights for policy 0, policy_version 2170 (0.0025) [2024-04-25 22:02:15,588][44588] Fps is (10 sec: 55706.1, 60 sec: 56524.7, 300 sec: 56038.8). Total num frames: 35602432. Throughput: 0: 56559.6. Samples: 35687060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-25 22:02:15,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:02:17,349][44824] Updated weights for policy 0, policy_version 2180 (0.0027) [2024-04-25 22:02:20,340][44824] Updated weights for policy 0, policy_version 2190 (0.0027) [2024-04-25 22:02:20,588][44588] Fps is (10 sec: 57344.4, 60 sec: 56797.8, 300 sec: 56094.4). Total num frames: 35897344. Throughput: 0: 56455.0. Samples: 36023580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:02:20,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:02:21,712][44804] Signal inference workers to stop experience collection... (450 times) [2024-04-25 22:02:21,713][44804] Signal inference workers to resume experience collection... (450 times) [2024-04-25 22:02:21,749][44824] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-04-25 22:02:21,749][44824] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-04-25 22:02:23,150][44824] Updated weights for policy 0, policy_version 2200 (0.0030) [2024-04-25 22:02:25,588][44588] Fps is (10 sec: 54066.9, 60 sec: 55978.5, 300 sec: 56038.8). Total num frames: 36143104. Throughput: 0: 56285.2. Samples: 36187500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-25 22:02:25,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:02:26,164][44824] Updated weights for policy 0, policy_version 2210 (0.0031) [2024-04-25 22:02:29,048][44824] Updated weights for policy 0, policy_version 2220 (0.0034) [2024-04-25 22:02:30,588][44588] Fps is (10 sec: 52428.7, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 36421632. Throughput: 0: 56156.5. Samples: 36522480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-04-25 22:02:30,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:02:31,913][44824] Updated weights for policy 0, policy_version 2230 (0.0027) [2024-04-25 22:02:34,907][44824] Updated weights for policy 0, policy_version 2240 (0.0031) [2024-04-25 22:02:35,588][44588] Fps is (10 sec: 57344.6, 60 sec: 56251.6, 300 sec: 56149.9). Total num frames: 36716544. Throughput: 0: 56004.4. Samples: 36854040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-25 22:02:35,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:02:37,826][44824] Updated weights for policy 0, policy_version 2250 (0.0032) [2024-04-25 22:02:40,588][44588] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 36995072. Throughput: 0: 55819.2. Samples: 37015800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 22:02:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:02:40,824][44824] Updated weights for policy 0, policy_version 2260 (0.0029) [2024-04-25 22:02:43,723][44824] Updated weights for policy 0, policy_version 2270 (0.0030) [2024-04-25 22:02:45,588][44588] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 37289984. Throughput: 0: 55801.7. Samples: 37352020. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-25 22:02:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:02:45,600][44804] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000002276_37289984.pth... [2024-04-25 22:02:45,649][44804] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000001453_23805952.pth [2024-04-25 22:02:46,592][44824] Updated weights for policy 0, policy_version 2280 (0.0034) [2024-04-25 22:02:49,584][44824] Updated weights for policy 0, policy_version 2290 (0.0026) [2024-04-25 22:02:50,588][44588] Fps is (10 sec: 57343.5, 60 sec: 56524.7, 300 sec: 56094.4). Total num frames: 37568512. Throughput: 0: 55786.7. Samples: 37689520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-25 22:02:50,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:02:52,423][44824] Updated weights for policy 0, policy_version 2300 (0.0026) [2024-04-25 22:02:55,588][44588] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 56038.8). Total num frames: 37847040. Throughput: 0: 55768.5. Samples: 37857840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-25 22:02:55,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:02:55,594][44824] Updated weights for policy 0, policy_version 2310 (0.0026) [2024-04-25 22:02:58,382][44824] Updated weights for policy 0, policy_version 2320 (0.0026) [2024-04-25 22:03:00,588][44588] Fps is (10 sec: 52429.6, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 38092800. Throughput: 0: 55706.8. Samples: 38193860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:03:00,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:01,441][44824] Updated weights for policy 0, policy_version 2330 (0.0031) [2024-04-25 22:03:04,268][44824] Updated weights for policy 0, policy_version 2340 (0.0026) [2024-04-25 22:03:05,588][44588] Fps is (10 sec: 52428.7, 60 sec: 55432.7, 300 sec: 56038.8). Total num frames: 38371328. 
Throughput: 0: 55639.1. Samples: 38527340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-25 22:03:05,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:07,399][44824] Updated weights for policy 0, policy_version 2350 (0.0029) [2024-04-25 22:03:10,085][44824] Updated weights for policy 0, policy_version 2360 (0.0033) [2024-04-25 22:03:10,588][44588] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 38666240. Throughput: 0: 55605.5. Samples: 38689740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:03:10,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:13,344][44824] Updated weights for policy 0, policy_version 2370 (0.0028) [2024-04-25 22:03:15,588][44588] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 38944768. Throughput: 0: 55700.1. Samples: 39028980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 22:03:15,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:15,930][44824] Updated weights for policy 0, policy_version 2380 (0.0031) [2024-04-25 22:03:18,286][44804] Signal inference workers to stop experience collection... (500 times) [2024-04-25 22:03:18,287][44804] Signal inference workers to resume experience collection... (500 times) [2024-04-25 22:03:18,303][44824] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-04-25 22:03:18,303][44824] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-04-25 22:03:19,211][44824] Updated weights for policy 0, policy_version 2390 (0.0036) [2024-04-25 22:03:20,588][44588] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 39239680. Throughput: 0: 55652.1. Samples: 39358380. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-25 22:03:20,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:03:21,794][44824] Updated weights for policy 0, policy_version 2400 (0.0031) [2024-04-25 22:03:24,979][44824] Updated weights for policy 0, policy_version 2410 (0.0028) [2024-04-25 22:03:25,588][44588] Fps is (10 sec: 55705.5, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 39501824. Throughput: 0: 55820.5. Samples: 39527720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-25 22:03:25,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:03:27,884][44824] Updated weights for policy 0, policy_version 2420 (0.0032) [2024-04-25 22:03:30,588][44588] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 39780352. Throughput: 0: 55718.2. Samples: 39859340. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-25 22:03:30,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:30,873][44824] Updated weights for policy 0, policy_version 2430 (0.0030) [2024-04-25 22:03:33,924][44824] Updated weights for policy 0, policy_version 2440 (0.0036) [2024-04-25 22:03:35,588][44588] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55983.3). Total num frames: 40042496. Throughput: 0: 55808.4. Samples: 40200900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-25 22:03:35,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:36,782][44824] Updated weights for policy 0, policy_version 2450 (0.0028) [2024-04-25 22:03:39,802][44824] Updated weights for policy 0, policy_version 2460 (0.0028) [2024-04-25 22:03:40,588][44588] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55983.3). Total num frames: 40321024. Throughput: 0: 55523.5. Samples: 40356400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-25 22:03:40,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:42,878][44824] Updated weights for policy 0, policy_version 2470 (0.0031) [2024-04-25 22:03:45,588][44588] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 56094.4). Total num frames: 40615936. Throughput: 0: 55389.7. Samples: 40686400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-25 22:03:45,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:45,675][44824] Updated weights for policy 0, policy_version 2480 (0.0026) [2024-04-25 22:03:48,713][44824] Updated weights for policy 0, policy_version 2490 (0.0038) [2024-04-25 22:03:50,588][44588] Fps is (10 sec: 58982.2, 60 sec: 55705.6, 300 sec: 56205.4). Total num frames: 40910848. Throughput: 0: 55551.1. Samples: 41027140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-25 22:03:50,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:51,399][44824] Updated weights for policy 0, policy_version 2500 (0.0028) [2024-04-25 22:03:54,377][44824] Updated weights for policy 0, policy_version 2510 (0.0034) [2024-04-25 22:03:55,588][44588] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 56094.4). Total num frames: 41189376. Throughput: 0: 55878.5. Samples: 41204280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 22:03:55,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:03:57,137][44824] Updated weights for policy 0, policy_version 2520 (0.0030) [2024-04-25 22:04:00,331][44824] Updated weights for policy 0, policy_version 2530 (0.0032) [2024-04-25 22:04:00,588][44588] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 41467904. Throughput: 0: 55803.0. Samples: 41540120. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-25 22:04:00,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:03,152][44824] Updated weights for policy 0, policy_version 2540 (0.0026) [2024-04-25 22:04:05,588][44588] Fps is (10 sec: 52428.6, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 41713664. Throughput: 0: 55889.6. Samples: 41873420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:04:05,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:06,164][44824] Updated weights for policy 0, policy_version 2550 (0.0029) [2024-04-25 22:04:09,034][44824] Updated weights for policy 0, policy_version 2560 (0.0036) [2024-04-25 22:04:10,591][44588] Fps is (10 sec: 54053.1, 60 sec: 55703.1, 300 sec: 55982.8). Total num frames: 42008576. Throughput: 0: 55656.7. Samples: 42032420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-25 22:04:10,592][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:11,966][44824] Updated weights for policy 0, policy_version 2570 (0.0030) [2024-04-25 22:04:14,890][44824] Updated weights for policy 0, policy_version 2580 (0.0032) [2024-04-25 22:04:15,588][44588] Fps is (10 sec: 57344.4, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 42287104. Throughput: 0: 55760.4. Samples: 42368560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-25 22:04:15,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:17,691][44824] Updated weights for policy 0, policy_version 2590 (0.0028) [2024-04-25 22:04:20,521][44824] Updated weights for policy 0, policy_version 2600 (0.0031) [2024-04-25 22:04:20,588][44588] Fps is (10 sec: 58998.2, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 42598400. Throughput: 0: 55893.1. Samples: 42716080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-25 22:04:20,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:23,418][44824] Updated weights for policy 0, policy_version 2610 (0.0027) [2024-04-25 22:04:25,588][44588] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 42860544. Throughput: 0: 56270.3. Samples: 42888560. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-25 22:04:25,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:26,523][44824] Updated weights for policy 0, policy_version 2620 (0.0033) [2024-04-25 22:04:29,359][44824] Updated weights for policy 0, policy_version 2630 (0.0031) [2024-04-25 22:04:30,588][44588] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 43171840. Throughput: 0: 56515.5. Samples: 43229600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 22:04:30,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:31,623][44804] Signal inference workers to stop experience collection... (550 times) [2024-04-25 22:04:31,623][44804] Signal inference workers to resume experience collection... (550 times) [2024-04-25 22:04:31,654][44824] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-04-25 22:04:31,654][44824] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-04-25 22:04:32,241][44824] Updated weights for policy 0, policy_version 2640 (0.0030) [2024-04-25 22:04:35,094][44824] Updated weights for policy 0, policy_version 2650 (0.0025) [2024-04-25 22:04:35,588][44588] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 56094.4). Total num frames: 43433984. Throughput: 0: 56336.1. Samples: 43562260. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 22:04:35,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:37,966][44824] Updated weights for policy 0, policy_version 2660 (0.0026) [2024-04-25 22:04:40,588][44588] Fps is (10 sec: 54067.7, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 43712512. Throughput: 0: 56080.6. Samples: 43727900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-25 22:04:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:41,041][44824] Updated weights for policy 0, policy_version 2670 (0.0028) [2024-04-25 22:04:43,596][44824] Updated weights for policy 0, policy_version 2680 (0.0027) [2024-04-25 22:04:45,588][44588] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 43974656. Throughput: 0: 56187.9. Samples: 44068580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 22:04:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:45,676][44804] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000002685_43991040.pth... [2024-04-25 22:04:45,724][44804] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000001864_30539776.pth [2024-04-25 22:04:46,893][44824] Updated weights for policy 0, policy_version 2690 (0.0031) [2024-04-25 22:04:50,022][44824] Updated weights for policy 0, policy_version 2700 (0.0028) [2024-04-25 22:04:50,588][44588] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 44253184. Throughput: 0: 56413.9. Samples: 44412040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-25 22:04:50,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:04:52,618][44824] Updated weights for policy 0, policy_version 2710 (0.0031) [2024-04-25 22:04:55,588][44588] Fps is (10 sec: 57344.6, 60 sec: 55978.8, 300 sec: 56038.8). Total num frames: 44548096. Throughput: 0: 56524.2. Samples: 44575860. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 22:04:55,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:04:55,889][44824] Updated weights for policy 0, policy_version 2720 (0.0026) [2024-04-25 22:04:58,538][44824] Updated weights for policy 0, policy_version 2730 (0.0030) [2024-04-25 22:05:00,588][44588] Fps is (10 sec: 58982.3, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 44843008. Throughput: 0: 56580.5. Samples: 44914680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 22:05:00,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:05:01,736][44824] Updated weights for policy 0, policy_version 2740 (0.0033) [2024-04-25 22:05:04,498][44824] Updated weights for policy 0, policy_version 2750 (0.0031) [2024-04-25 22:05:05,588][44588] Fps is (10 sec: 58982.4, 60 sec: 57071.1, 300 sec: 56149.9). Total num frames: 45137920. Throughput: 0: 56156.4. Samples: 45243120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-25 22:05:05,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:05:07,504][44824] Updated weights for policy 0, policy_version 2760 (0.0029) [2024-04-25 22:05:10,169][44824] Updated weights for policy 0, policy_version 2770 (0.0031) [2024-04-25 22:05:10,588][44588] Fps is (10 sec: 57344.6, 60 sec: 56800.4, 300 sec: 56149.9). Total num frames: 45416448. Throughput: 0: 56390.7. Samples: 45426140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:05:10,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:05:13,187][44824] Updated weights for policy 0, policy_version 2780 (0.0038) [2024-04-25 22:05:15,588][44588] Fps is (10 sec: 52428.9, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 45662208. Throughput: 0: 56216.1. Samples: 45759320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:05:15,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:05:16,054][44824] Updated weights for policy 0, policy_version 2790 (0.0024) [2024-04-25 22:05:19,290][44824] Updated weights for policy 0, policy_version 2800 (0.0031) [2024-04-25 22:05:20,588][44588] Fps is (10 sec: 54066.4, 60 sec: 55978.5, 300 sec: 55983.3). Total num frames: 45957120. Throughput: 0: 56391.8. Samples: 46099900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 22:05:20,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:05:21,795][44824] Updated weights for policy 0, policy_version 2810 (0.0026) [2024-04-25 22:05:25,083][44824] Updated weights for policy 0, policy_version 2820 (0.0028) [2024-04-25 22:05:25,588][44588] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 46219264. Throughput: 0: 56198.7. Samples: 46256840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-25 22:05:25,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:05:25,602][44804] Saving new best policy, reward=0.001! [2024-04-25 22:05:27,542][44824] Updated weights for policy 0, policy_version 2830 (0.0030) [2024-04-25 22:05:30,588][44588] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 56038.8). Total num frames: 46514176. Throughput: 0: 56246.3. Samples: 46599660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 22:05:30,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:05:30,921][44824] Updated weights for policy 0, policy_version 2840 (0.0033) [2024-04-25 22:05:33,275][44804] Signal inference workers to stop experience collection... (600 times) [2024-04-25 22:05:33,275][44804] Signal inference workers to resume experience collection... 
(600 times) [2024-04-25 22:05:33,289][44824] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-04-25 22:05:33,289][44824] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-04-25 22:05:33,385][44824] Updated weights for policy 0, policy_version 2850 (0.0027) [2024-04-25 22:05:35,588][44588] Fps is (10 sec: 58981.5, 60 sec: 56251.6, 300 sec: 56149.9). Total num frames: 46809088. Throughput: 0: 56011.9. Samples: 46932580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 22:05:35,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:05:36,817][44824] Updated weights for policy 0, policy_version 2860 (0.0031) [2024-04-25 22:05:39,238][44824] Updated weights for policy 0, policy_version 2870 (0.0035) [2024-04-25 22:05:40,588][44588] Fps is (10 sec: 58982.2, 60 sec: 56524.8, 300 sec: 56205.4). Total num frames: 47104000. Throughput: 0: 56184.4. Samples: 47104160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-25 22:05:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:05:42,655][44824] Updated weights for policy 0, policy_version 2880 (0.0031) [2024-04-25 22:05:44,974][44824] Updated weights for policy 0, policy_version 2890 (0.0027) [2024-04-25 22:05:45,588][44588] Fps is (10 sec: 57344.8, 60 sec: 56798.0, 300 sec: 56149.9). Total num frames: 47382528. Throughput: 0: 56204.1. Samples: 47443860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-25 22:05:45,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:05:48,580][44824] Updated weights for policy 0, policy_version 2900 (0.0028) [2024-04-25 22:05:50,588][44588] Fps is (10 sec: 54067.8, 60 sec: 56524.9, 300 sec: 56038.8). Total num frames: 47644672. Throughput: 0: 56323.7. Samples: 47777680. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-25 22:05:50,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:05:50,783][44824] Updated weights for policy 0, policy_version 2910 (0.0035) [2024-04-25 22:05:54,301][44824] Updated weights for policy 0, policy_version 2920 (0.0031) [2024-04-25 22:05:55,588][44588] Fps is (10 sec: 54066.8, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 47923200. Throughput: 0: 56036.8. Samples: 47947800. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-04-25 22:05:55,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:05:56,614][44824] Updated weights for policy 0, policy_version 2930 (0.0030) [2024-04-25 22:06:00,057][44824] Updated weights for policy 0, policy_version 2940 (0.0032) [2024-04-25 22:06:00,588][44588] Fps is (10 sec: 57344.2, 60 sec: 56251.9, 300 sec: 56094.4). Total num frames: 48218112. Throughput: 0: 56190.3. Samples: 48287880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:06:00,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:06:02,532][44824] Updated weights for policy 0, policy_version 2950 (0.0028) [2024-04-25 22:06:05,588][44588] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 48463872. Throughput: 0: 56165.5. Samples: 48627340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-25 22:06:05,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:06:05,967][44824] Updated weights for policy 0, policy_version 2960 (0.0032) [2024-04-25 22:06:08,330][44824] Updated weights for policy 0, policy_version 2970 (0.0025) [2024-04-25 22:06:10,588][44588] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 48758784. Throughput: 0: 56382.2. Samples: 48794040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-25 22:06:10,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:06:11,718][44824] Updated weights for policy 0, policy_version 2980 (0.0026) [2024-04-25 22:06:13,995][44824] Updated weights for policy 0, policy_version 2990 (0.0029) [2024-04-25 22:06:15,588][44588] Fps is (10 sec: 58982.1, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 49053696. Throughput: 0: 56154.7. Samples: 49126620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-25 22:06:15,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:06:17,476][44824] Updated weights for policy 0, policy_version 3000 (0.0028) [2024-04-25 22:06:19,802][44824] Updated weights for policy 0, policy_version 3010 (0.0027) [2024-04-25 22:06:20,588][44588] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 49332224. Throughput: 0: 56151.7. Samples: 49459400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-25 22:06:20,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:06:23,218][44824] Updated weights for policy 0, policy_version 3020 (0.0033) [2024-04-25 22:06:23,478][44804] Signal inference workers to stop experience collection... (650 times) [2024-04-25 22:06:23,478][44804] Signal inference workers to resume experience collection... (650 times) [2024-04-25 22:06:23,498][44824] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-04-25 22:06:23,498][44824] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-04-25 22:06:25,588][44588] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56149.9). Total num frames: 49627136. Throughput: 0: 56396.5. Samples: 49642000. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-25 22:06:25,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:06:25,805][44824] Updated weights for policy 0, policy_version 3030 (0.0034) [2024-04-25 22:06:29,058][44824] Updated weights for policy 0, policy_version 3040 (0.0029) [2024-04-25 22:06:30,588][44588] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 49889280. Throughput: 0: 56285.8. Samples: 49976720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-25 22:06:30,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:06:31,520][44824] Updated weights for policy 0, policy_version 3050 (0.0033) [2024-04-25 22:06:34,904][44824] Updated weights for policy 0, policy_version 3060 (0.0034) [2024-04-25 22:06:35,588][44588] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 50167808. Throughput: 0: 56419.4. Samples: 50316560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 22:06:35,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:06:37,315][44824] Updated weights for policy 0, policy_version 3070 (0.0029) [2024-04-25 22:06:40,588][44588] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 50446336. Throughput: 0: 56193.4. Samples: 50476500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-25 22:06:40,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:06:40,641][44824] Updated weights for policy 0, policy_version 3080 (0.0025) [2024-04-25 22:06:43,145][44824] Updated weights for policy 0, policy_version 3090 (0.0033) [2024-04-25 22:06:45,588][44588] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 50724864. Throughput: 0: 56239.0. Samples: 50818640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 22:06:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:06:45,600][44804] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000003096_50724864.pth... [2024-04-25 22:06:45,644][44804] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000002276_37289984.pth [2024-04-25 22:06:46,359][44824] Updated weights for policy 0, policy_version 3100 (0.0026) [2024-04-25 22:06:49,119][44824] Updated weights for policy 0, policy_version 3110 (0.0033) [2024-04-25 22:06:50,588][44588] Fps is (10 sec: 57343.6, 60 sec: 56251.6, 300 sec: 56094.4). Total num frames: 51019776. Throughput: 0: 56143.5. Samples: 51153800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 22:06:50,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:06:52,208][44824] Updated weights for policy 0, policy_version 3120 (0.0030) [2024-04-25 22:06:54,862][44824] Updated weights for policy 0, policy_version 3130 (0.0029) [2024-04-25 22:06:55,588][44588] Fps is (10 sec: 58982.1, 60 sec: 56524.8, 300 sec: 56094.4). Total num frames: 51314688. Throughput: 0: 56319.4. Samples: 51328420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-25 22:06:55,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:06:58,128][44824] Updated weights for policy 0, policy_version 3140 (0.0027) [2024-04-25 22:07:00,583][44824] Updated weights for policy 0, policy_version 3150 (0.0028) [2024-04-25 22:07:00,588][44588] Fps is (10 sec: 58983.2, 60 sec: 56524.8, 300 sec: 56150.0). Total num frames: 51609600. Throughput: 0: 56502.8. Samples: 51669240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-25 22:07:00,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:07:03,826][44824] Updated weights for policy 0, policy_version 3160 (0.0032) [2024-04-25 22:07:05,588][44588] Fps is (10 sec: 55705.9, 60 sec: 56797.8, 300 sec: 56094.4). Total num frames: 51871744. 
Throughput: 0: 56574.7. Samples: 52005260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-25 22:07:05,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:07:06,333][44824] Updated weights for policy 0, policy_version 3170 (0.0027) [2024-04-25 22:07:09,586][44824] Updated weights for policy 0, policy_version 3180 (0.0034) [2024-04-25 22:07:10,588][44588] Fps is (10 sec: 55704.9, 60 sec: 56797.8, 300 sec: 56149.9). Total num frames: 52166656. Throughput: 0: 56378.3. Samples: 52179020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 22:07:10,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:07:12,353][44824] Updated weights for policy 0, policy_version 3190 (0.0031) [2024-04-25 22:07:15,456][44824] Updated weights for policy 0, policy_version 3200 (0.0030) [2024-04-25 22:07:15,588][44588] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 56094.4). Total num frames: 52445184. Throughput: 0: 56424.0. Samples: 52515800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-25 22:07:15,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:07:18,058][44824] Updated weights for policy 0, policy_version 3210 (0.0029) [2024-04-25 22:07:20,588][44588] Fps is (10 sec: 52428.8, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 52690944. Throughput: 0: 56380.5. Samples: 52853680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 22:07:20,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:07:21,301][44824] Updated weights for policy 0, policy_version 3220 (0.0036) [2024-04-25 22:07:23,751][44824] Updated weights for policy 0, policy_version 3230 (0.0027) [2024-04-25 22:07:25,588][44588] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 52985856. Throughput: 0: 56467.5. Samples: 53017540. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:07:25,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:07:26,980][44824] Updated weights for policy 0, policy_version 3240 (0.0037) [2024-04-25 22:07:29,508][44824] Updated weights for policy 0, policy_version 3250 (0.0035) [2024-04-25 22:07:30,588][44588] Fps is (10 sec: 58982.5, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 53280768. Throughput: 0: 56435.6. Samples: 53358240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 22:07:30,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:07:32,337][44804] Signal inference workers to stop experience collection... (700 times) [2024-04-25 22:07:32,373][44824] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-04-25 22:07:32,425][44804] Signal inference workers to resume experience collection... (700 times) [2024-04-25 22:07:32,425][44824] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-04-25 22:07:32,678][44824] Updated weights for policy 0, policy_version 3260 (0.0030) [2024-04-25 22:07:35,493][44824] Updated weights for policy 0, policy_version 3270 (0.0028) [2024-04-25 22:07:35,588][44588] Fps is (10 sec: 58982.2, 60 sec: 56797.9, 300 sec: 56205.5). Total num frames: 53575680. Throughput: 0: 56498.7. Samples: 53696240. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-25 22:07:35,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:07:38,587][44824] Updated weights for policy 0, policy_version 3280 (0.0030) [2024-04-25 22:07:40,588][44588] Fps is (10 sec: 58981.9, 60 sec: 57070.8, 300 sec: 56205.4). Total num frames: 53870592. Throughput: 0: 56614.2. Samples: 53876060. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-25 22:07:40,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:07:41,217][44824] Updated weights for policy 0, policy_version 3290 (0.0034) [2024-04-25 22:07:44,379][44824] Updated weights for policy 0, policy_version 3300 (0.0027) [2024-04-25 22:07:45,588][44588] Fps is (10 sec: 57343.8, 60 sec: 57070.9, 300 sec: 56205.5). Total num frames: 54149120. Throughput: 0: 56642.0. Samples: 54218140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-25 22:07:45,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:07:47,168][44824] Updated weights for policy 0, policy_version 3310 (0.0029) [2024-04-25 22:07:50,132][44824] Updated weights for policy 0, policy_version 3320 (0.0030) [2024-04-25 22:07:50,588][44588] Fps is (10 sec: 54067.8, 60 sec: 56524.9, 300 sec: 56149.9). Total num frames: 54411264. Throughput: 0: 56663.6. Samples: 54555120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-25 22:07:50,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:07:53,013][44824] Updated weights for policy 0, policy_version 3330 (0.0032) [2024-04-25 22:07:55,588][44588] Fps is (10 sec: 54067.3, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 54689792. Throughput: 0: 56468.9. Samples: 54720120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-25 22:07:55,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:07:55,904][44824] Updated weights for policy 0, policy_version 3340 (0.0030) [2024-04-25 22:07:58,685][44824] Updated weights for policy 0, policy_version 3350 (0.0035) [2024-04-25 22:08:00,588][44588] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 54968320. Throughput: 0: 56466.2. Samples: 55056780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-25 22:08:00,588][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:08:01,764][44824] Updated weights for policy 0, policy_version 3360 (0.0029) [2024-04-25 22:08:04,466][44824] Updated weights for policy 0, policy_version 3370 (0.0031) [2024-04-25 22:08:05,588][44588] Fps is (10 sec: 57344.3, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 55263232. Throughput: 0: 56536.5. Samples: 55397820. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-25 22:08:05,588][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:08:07,598][44824] Updated weights for policy 0, policy_version 3380 (0.0027) [2024-04-25 22:08:10,289][44824] Updated weights for policy 0, policy_version 3390 (0.0033) [2024-04-25 22:08:10,588][44588] Fps is (10 sec: 57343.3, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 55541760. Throughput: 0: 56712.3. Samples: 55569600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:08:10,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:08:13,404][44824] Updated weights for policy 0, policy_version 3400 (0.0034) [2024-04-25 22:08:15,588][44588] Fps is (10 sec: 57343.4, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 55836672. Throughput: 0: 56617.2. Samples: 55906020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-25 22:08:15,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:08:16,112][44824] Updated weights for policy 0, policy_version 3410 (0.0033) [2024-04-25 22:08:19,324][44824] Updated weights for policy 0, policy_version 3420 (0.0028) [2024-04-25 22:08:20,588][44588] Fps is (10 sec: 58982.9, 60 sec: 57344.0, 300 sec: 56372.1). Total num frames: 56131584. Throughput: 0: 56499.1. Samples: 56238700. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 23.0) [2024-04-25 22:08:20,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:08:21,876][44824] Updated weights for policy 0, policy_version 3430 (0.0026) [2024-04-25 22:08:25,171][44824] Updated weights for policy 0, policy_version 3440 (0.0030) [2024-04-25 22:08:25,588][44588] Fps is (10 sec: 57343.5, 60 sec: 57070.8, 300 sec: 56372.0). Total num frames: 56410112. Throughput: 0: 56434.5. Samples: 56415620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-25 22:08:25,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:08:27,724][44824] Updated weights for policy 0, policy_version 3450 (0.0024) [2024-04-25 22:08:30,588][44588] Fps is (10 sec: 52428.5, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 56655872. Throughput: 0: 56252.8. Samples: 56749520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-25 22:08:30,589][44588] Avg episode reward: [(0, '0.001')] [2024-04-25 22:08:30,975][44824] Updated weights for policy 0, policy_version 3460 (0.0029) [2024-04-25 22:08:31,186][44804] Signal inference workers to stop experience collection... (750 times) [2024-04-25 22:08:31,191][44804] Signal inference workers to resume experience collection... (750 times) [2024-04-25 22:08:31,209][44824] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-04-25 22:08:31,209][44824] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-04-25 22:08:33,742][44824] Updated weights for policy 0, policy_version 3470 (0.0030) [2024-04-25 22:08:35,588][44588] Fps is (10 sec: 54067.9, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 56950784. Throughput: 0: 56259.5. Samples: 57086800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 22:08:35,589][44588] Avg episode reward: [(0, '0.000')] [2024-04-25 22:08:36,773][44824] Updated weights for policy 0, policy_version 3480 (0.0034) [2024-04-25 22:08:51,213][47056] Saving configuration to /workspace/metta/train_dir/p2.fa.clean/config.json... [2024-04-25 22:08:51,221][47056] Rollout worker 0 uses device cpu [2024-04-25 22:08:51,221][47056] Rollout worker 1 uses device cpu [2024-04-25 22:08:51,221][47056] Rollout worker 2 uses device cpu [2024-04-25 22:08:51,221][47056] Rollout worker 3 uses device cpu [2024-04-25 22:08:51,221][47056] Rollout worker 4 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 5 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 6 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 7 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 8 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 9 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 10 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 11 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 12 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 13 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 14 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 15 uses device cpu [2024-04-25 22:08:51,222][47056] Rollout worker 16 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 17 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 18 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 19 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 20 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 21 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 22 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 23 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 24 uses device cpu [2024-04-25 
22:08:51,223][47056] Rollout worker 25 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 26 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 27 uses device cpu [2024-04-25 22:08:51,223][47056] Rollout worker 28 uses device cpu [2024-04-25 22:08:51,224][47056] Rollout worker 29 uses device cpu [2024-04-25 22:08:51,224][47056] Rollout worker 30 uses device cpu [2024-04-25 22:08:51,224][47056] Rollout worker 31 uses device cpu [2024-04-25 22:08:51,769][47056] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-25 22:08:51,769][47056] InferenceWorker_p0-w0: min num requests: 10 [2024-04-25 22:08:51,815][47056] Starting all processes... [2024-04-25 22:08:51,815][47056] Starting process learner_proc0 [2024-04-25 22:08:51,872][47056] Starting all processes... [2024-04-25 22:08:51,876][47056] Starting process inference_proc0-0 [2024-04-25 22:08:51,876][47056] Starting process rollout_proc2 [2024-04-25 22:08:51,876][47056] Starting process rollout_proc1 [2024-04-25 22:08:51,876][47056] Starting process rollout_proc0 [2024-04-25 22:08:51,878][47056] Starting process rollout_proc3 [2024-04-25 22:08:51,880][47056] Starting process rollout_proc16 [2024-04-25 22:08:51,878][47056] Starting process rollout_proc5 [2024-04-25 22:08:51,878][47056] Starting process rollout_proc6 [2024-04-25 22:08:51,878][47056] Starting process rollout_proc7 [2024-04-25 22:08:51,878][47056] Starting process rollout_proc8 [2024-04-25 22:08:51,879][47056] Starting process rollout_proc9 [2024-04-25 22:08:51,879][47056] Starting process rollout_proc10 [2024-04-25 22:08:51,879][47056] Starting process rollout_proc11 [2024-04-25 22:08:51,879][47056] Starting process rollout_proc12 [2024-04-25 22:08:51,879][47056] Starting process rollout_proc13 [2024-04-25 22:08:51,880][47056] Starting process rollout_proc14 [2024-04-25 22:08:51,880][47056] Starting process rollout_proc15 [2024-04-25 22:08:51,878][47056] Starting process rollout_proc4 [2024-04-25 
22:08:51,880][47056] Starting process rollout_proc17 [2024-04-25 22:08:51,881][47056] Starting process rollout_proc18 [2024-04-25 22:08:51,883][47056] Starting process rollout_proc19 [2024-04-25 22:08:51,887][47056] Starting process rollout_proc20 [2024-04-25 22:08:51,889][47056] Starting process rollout_proc21 [2024-04-25 22:08:51,889][47056] Starting process rollout_proc22 [2024-04-25 22:08:51,890][47056] Starting process rollout_proc23 [2024-04-25 22:08:51,890][47056] Starting process rollout_proc24 [2024-04-25 22:08:51,892][47056] Starting process rollout_proc25 [2024-04-25 22:08:51,892][47056] Starting process rollout_proc26 [2024-04-25 22:08:51,896][47056] Starting process rollout_proc27 [2024-04-25 22:08:51,896][47056] Starting process rollout_proc28 [2024-04-25 22:08:51,904][47056] Starting process rollout_proc29 [2024-04-25 22:08:51,904][47056] Starting process rollout_proc30 [2024-04-25 22:08:51,905][47056] Starting process rollout_proc31 [2024-04-25 22:08:55,350][47290] Worker 0 uses CPU cores [0] [2024-04-25 22:08:55,407][47291] Worker 3 uses CPU cores [3] [2024-04-25 22:08:55,430][47293] Worker 16 uses CPU cores [16] [2024-04-25 22:08:55,530][47287] Worker 2 uses CPU cores [2] [2024-04-25 22:08:55,542][47289] Worker 1 uses CPU cores [1] [2024-04-25 22:08:55,582][47312] Worker 24 uses CPU cores [24] [2024-04-25 22:08:55,606][47304] Worker 4 uses CPU cores [4] [2024-04-25 22:08:55,662][47294] Worker 7 uses CPU cores [7] [2024-04-25 22:08:55,758][47305] Worker 17 uses CPU cores [17] [2024-04-25 22:08:55,794][47300] Worker 12 uses CPU cores [12] [2024-04-25 22:08:55,798][47292] Worker 6 uses CPU cores [6] [2024-04-25 22:08:55,800][47267] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-25 22:08:55,800][47267] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-04-25 22:08:55,812][47267] Num visible devices: 1 [2024-04-25 22:08:55,828][47299] Worker 11 uses CPU cores [11] [2024-04-25 
22:08:55,846][47297] Worker 5 uses CPU cores [5] [2024-04-25 22:08:55,850][47306] Worker 19 uses CPU cores [19] [2024-04-25 22:08:55,855][47296] Worker 9 uses CPU cores [9] [2024-04-25 22:08:55,855][47288] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-25 22:08:55,855][47288] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-04-25 22:08:55,858][47308] Worker 21 uses CPU cores [21] [2024-04-25 22:08:55,859][47295] Worker 8 uses CPU cores [8] [2024-04-25 22:08:55,864][47288] Num visible devices: 1 [2024-04-25 22:08:55,866][47302] Worker 13 uses CPU cores [13] [2024-04-25 22:08:55,866][47298] Worker 10 uses CPU cores [10] [2024-04-25 22:08:55,866][47301] Worker 14 uses CPU cores [14] [2024-04-25 22:08:55,868][47310] Worker 22 uses CPU cores [22] [2024-04-25 22:08:55,870][47315] Worker 27 uses CPU cores [27] [2024-04-25 22:08:55,875][47267] Starting seed is not provided [2024-04-25 22:08:55,875][47267] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-25 22:08:55,875][47267] Initializing actor-critic model on device cuda:0 [2024-04-25 22:08:55,875][47309] Worker 20 uses CPU cores [20] [2024-04-25 22:08:55,875][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,883][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 
22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,884][47267] RunningMeanStd input shape: (1,) [2024-04-25 22:08:55,885][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,885][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,885][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,885][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,885][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:55,886][47267] RunningMeanStd input shape: (11, 11) [2024-04-25 
22:08:55,887][47267] RunningMeanStd input shape: (11, 11)
[2024-04-25 22:08:55,887][47267] RunningMeanStd input shape: (11, 11)
[2024-04-25 22:08:55,887][47267] RunningMeanStd input shape: (11, 11)
[2024-04-25 22:08:55,887][47267] RunningMeanStd input shape: (11, 11)
[2024-04-25 22:08:55,887][47267] RunningMeanStd input shape: (11, 11)
[2024-04-25 22:08:55,887][47267] RunningMeanStd input shape: (11, 11)
[2024-04-25 22:08:55,890][47318] Worker 31 uses CPU cores [31]
[2024-04-25 22:08:55,895][47313] Worker 25 uses CPU cores [25]
[2024-04-25 22:08:55,943][47267] Created Actor Critic model with architecture:
[2024-04-25 22:08:55,943][47267] PredictingActorCritic(
  (obs_normalizer): ObservationNormalizer()
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): FeatureEncoder(
    (_global_normalizer): FeatureListNormalizer(
      (_norms_dict): ModuleDict(
        (conf:agent:energy:initial): RunningMeanStdInPlace()
        (conf:agent:energy:max): RunningMeanStdInPlace()
        (conf:agent:energy:regen): RunningMeanStdInPlace()
        (conf:altar:cooldown): RunningMeanStdInPlace()
        (conf:altar:cost): RunningMeanStdInPlace()
        (conf:attack:damage): RunningMeanStdInPlace()
        (conf:attack:freeze_duration): RunningMeanStdInPlace()
        (conf:charger:cooldown): RunningMeanStdInPlace()
        (conf:charger:energy): RunningMeanStdInPlace()
        (conf:cost:attack): RunningMeanStdInPlace()
        (conf:cost:frozen): RunningMeanStdInPlace()
        (conf:cost:jump): RunningMeanStdInPlace()
        (conf:cost:move): RunningMeanStdInPlace()
        (conf:cost:move:predator): RunningMeanStdInPlace()
        (conf:cost:move:prey): RunningMeanStdInPlace()
        (conf:cost:rotate): RunningMeanStdInPlace()
        (conf:cost:shield): RunningMeanStdInPlace()
        (conf:cost:shield:upkeep): RunningMeanStdInPlace()
        (conf:generator:cooldown): RunningMeanStdInPlace()
        (conf:gift:energy): RunningMeanStdInPlace()
        (last_action_id): RunningMeanStdInPlace()
        (last_action_val): RunningMeanStdInPlace()
        (last_reward): RunningMeanStdInPlace()
      )
    )
    (_grid_normalizer): FeatureListNormalizer(
      (_norms_dict): ModuleDict(
        (agent): RunningMeanStdInPlace()
        (altar): RunningMeanStdInPlace()
        (charger): RunningMeanStdInPlace()
        (generator): RunningMeanStdInPlace()
        (wall): RunningMeanStdInPlace()
        (agent:dir): RunningMeanStdInPlace()
        (agent:energy): RunningMeanStdInPlace()
        (agent:frozen): RunningMeanStdInPlace()
        (agent:id): RunningMeanStdInPlace()
        (agent:inv:1): RunningMeanStdInPlace()
        (agent:inv:2): RunningMeanStdInPlace()
        (agent:inv:3): RunningMeanStdInPlace()
        (agent:shield): RunningMeanStdInPlace()
        (agent:species): RunningMeanStdInPlace()
        (altar:ready): RunningMeanStdInPlace()
        (charger:bonus): RunningMeanStdInPlace()
        (charger:input:1): RunningMeanStdInPlace()
        (charger:input:2): RunningMeanStdInPlace()
        (charger:input:3): RunningMeanStdInPlace()
        (charger:output): RunningMeanStdInPlace()
        (charger:ready): RunningMeanStdInPlace()
        (generator:ready): RunningMeanStdInPlace()
        (generator:resource): RunningMeanStdInPlace()
      )
    )
    (_global_encoder): FeatureListEncoder(
      (_embedding_net): Sequential(
        (0): Linear(in_features=5, out_features=8, bias=True)
        (1): ELU(alpha=1.0)
        (2): Linear(in_features=8, out_features=8, bias=True)
        (3): ELU(alpha=1.0)
      )
    )
    (_grid_encoder): FeatureListEncoder(
      (_embedding_net): Sequential(
        (0): Linear(in_features=125, out_features=512, bias=True)
        (1): ELU(alpha=1.0)
        (2): Linear(in_features=512, out_features=512, bias=True)
        (3): ELU(alpha=1.0)
        (4): Linear(in_features=512, out_features=512, bias=True)
        (5): ELU(alpha=1.0)
        (6): Linear(in_features=512, out_features=512, bias=True)
        (7): ELU(alpha=1.0)
      )
    )
    (encoder_head): Sequential(
      (0): Linear(in_features=520, out_features=512, bias=True)
      (1): ELU(alpha=1.0)
      (2): Linear(in_features=512, out_features=512, bias=True)
      (3): ELU(alpha=1.0)
      (4): Linear(in_features=512, out_features=512, bias=True)
      (5): ELU(alpha=1.0)
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): FeatureDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=17, bias=True)
  )
)
[2024-04-25 22:08:55,981][47311] Worker 23 uses CPU cores [23]
[2024-04-25 22:08:55,994][47303] Worker 15 uses CPU cores [15]
[2024-04-25 22:08:56,017][47314] Worker 26 uses CPU cores [26]
[2024-04-25 22:08:56,030][47317] Worker 30 uses CPU cores [30]
[2024-04-25 22:08:56,031][47307] Worker 18 uses CPU cores [18]
[2024-04-25 22:08:56,130][47267] Using optimizer
[2024-04-25 22:08:56,139][47316] Worker 28 uses CPU cores [28]
[2024-04-25 22:08:56,139][47319] Worker 29 uses CPU cores [29]
[2024-04-25 22:08:56,235][47267] Loading state from checkpoint /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000003096_50724864.pth...
[2024-04-25 22:08:56,253][47267] Loading model from checkpoint
[2024-04-25 22:08:56,255][47267] Loaded experiment state at self.train_step=3096, self.env_steps=50724864
[2024-04-25 22:08:56,255][47267] Initialized policy 0 weights for model version 3096
[2024-04-25 22:08:56,257][47267] LearnerWorker_p0 finished initialization!
[2024-04-25 22:08:56,257][47267] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-25 22:08:56,341][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,346][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,346][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,346][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,346][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,346][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,346][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,346][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,346][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,347][47288] RunningMeanStd input shape: (1,) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] 
RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,348][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,349][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,349][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,349][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,349][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,349][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,349][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,349][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,349][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,349][47288] RunningMeanStd input shape: (11, 11) [2024-04-25 22:08:56,408][47056] Inference worker 0-0 is ready! [2024-04-25 22:08:56,408][47056] All inference workers are ready! Signal rollout workers to start! [2024-04-25 22:08:57,171][47296] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,177][47289] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,185][47295] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,188][47297] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,189][47290] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,196][47303] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,202][47294] Decorrelating experience for 0 frames... 
[2024-04-25 22:08:57,203][47301] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,205][47291] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,206][47287] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,207][47292] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,210][47300] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,211][47302] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,215][47299] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,216][47304] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,216][47298] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,221][47318] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,223][47317] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,228][47293] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,234][47307] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,234][47312] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,242][47308] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,242][47310] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,242][47305] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,244][47306] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,244][47314] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,245][47309] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,245][47311] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,245][47315] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,246][47313] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,302][47316] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,304][47319] Decorrelating experience for 0 frames... [2024-04-25 22:08:57,877][47296] Decorrelating experience for 256 frames... 
[2024-04-25 22:08:57,888][47289] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,901][47297] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,909][47295] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,912][47290] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,933][47303] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,943][47294] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,946][47301] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,948][47287] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,954][47291] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,959][47292] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,959][47300] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,963][47302] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,965][47304] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,976][47298] Decorrelating experience for 256 frames... [2024-04-25 22:08:57,976][47299] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,008][47318] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,016][47317] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,018][47293] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,021][47312] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,027][47307] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,059][47306] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,062][47308] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,064][47310] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,065][47305] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,069][47309] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,072][47313] Decorrelating experience for 256 frames... 
[2024-04-25 22:08:58,072][47315] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,072][47311] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,075][47314] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,106][47319] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,106][47316] Decorrelating experience for 256 frames... [2024-04-25 22:08:58,923][47056] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 50724864. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-04-25 22:09:02,816][47289] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-04-25 22:09:02,816][47296] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-04-25 22:09:02,816][47303] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-04-25 22:09:02,816][47302] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-04-25 22:09:02,820][47291] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-04-25 22:09:02,825][47299] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-04-25 22:09:02,838][47294] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-04-25 22:09:02,847][47297] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-04-25 22:09:02,852][47300] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-04-25 22:09:02,868][47301] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-04-25 22:09:02,877][47298] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-04-25 22:09:02,879][47295] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-04-25 22:09:02,892][47287] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-04-25 22:09:02,903][47307] Worker 18, sleep for 84.375 sec to decorrelate experience collection 
[2024-04-25 22:09:02,908][47293] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-04-25 22:09:02,916][47310] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-04-25 22:09:02,926][47312] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-04-25 22:09:02,926][47267] Signal inference workers to stop experience collection... [2024-04-25 22:09:02,930][47306] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-04-25 22:09:02,933][47288] InferenceWorker_p0-w0: stopping experience collection [2024-04-25 22:09:03,414][47267] Signal inference workers to resume experience collection... [2024-04-25 22:09:03,414][47288] InferenceWorker_p0-w0: resuming experience collection [2024-04-25 22:09:03,428][47308] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-04-25 22:09:03,428][47314] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-04-25 22:09:03,432][47319] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-04-25 22:09:03,436][47305] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-04-25 22:09:03,436][47317] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-04-25 22:09:03,436][47311] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-04-25 22:09:03,444][47313] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-04-25 22:09:03,444][47315] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-04-25 22:09:03,447][47318] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-04-25 22:09:03,643][47316] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-04-25 22:09:03,680][47309] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-04-25 22:09:03,923][47056] Fps is (10 sec: 16384.2, 60 sec: 16384.2, 300 sec: 16384.2). 
Total num frames: 50806784. Throughput: 0: 57368.5. Samples: 286840. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-25 22:09:04,495][47292] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-04-25 22:09:04,613][47288] Updated weights for policy 0, policy_version 3106 (0.0019) [2024-04-25 22:09:04,706][47304] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-04-25 22:09:07,526][47289] Worker 1 awakens! [2024-04-25 22:09:08,923][47056] Fps is (10 sec: 16383.5, 60 sec: 16383.5, 300 sec: 16383.5). Total num frames: 50888704. Throughput: 0: 33380.9. Samples: 333820. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-25 22:09:11,765][47056] Heartbeat connected on Batcher_0 [2024-04-25 22:09:11,767][47056] Heartbeat connected on LearnerWorker_p0 [2024-04-25 22:09:11,780][47056] Heartbeat connected on RolloutWorker_w1 [2024-04-25 22:09:11,780][47056] Heartbeat connected on RolloutWorker_w0 [2024-04-25 22:09:11,838][47056] Heartbeat connected on InferenceWorker_p0-w0 [2024-04-25 22:09:12,286][47287] Worker 2 awakens! [2024-04-25 22:09:12,296][47056] Heartbeat connected on RolloutWorker_w2 [2024-04-25 22:09:13,923][47056] Fps is (10 sec: 9830.2, 60 sec: 12014.8, 300 sec: 12014.8). Total num frames: 50905088. Throughput: 0: 22675.7. Samples: 340140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 10.0) [2024-04-25 22:09:16,953][47291] Worker 3 awakens! [2024-04-25 22:09:16,960][47056] Heartbeat connected on RolloutWorker_w3 [2024-04-25 22:09:18,923][47056] Fps is (10 sec: 3276.8, 60 sec: 9830.3, 300 sec: 9830.3). Total num frames: 50921472. Throughput: 0: 17924.8. Samples: 358500. Policy #0 lag: (min: 0.0, avg: 4.4, max: 11.0) [2024-04-25 22:09:23,550][47304] Worker 4 awakens! [2024-04-25 22:09:23,560][47056] Heartbeat connected on RolloutWorker_w4 [2024-04-25 22:09:23,923][47056] Fps is (10 sec: 4915.3, 60 sec: 9175.0, 300 sec: 9175.0). Total num frames: 50954240. Throughput: 0: 15337.6. Samples: 383440. 
Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0) [2024-04-25 22:09:23,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:09:26,384][47297] Worker 5 awakens! [2024-04-25 22:09:26,388][47056] Heartbeat connected on RolloutWorker_w5 [2024-04-25 22:09:28,923][47056] Fps is (10 sec: 11469.0, 60 sec: 10376.5, 300 sec: 10376.5). Total num frames: 51036160. Throughput: 0: 14537.3. Samples: 436120. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2024-04-25 22:09:28,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:09:29,066][47288] Updated weights for policy 0, policy_version 3116 (0.0015) [2024-04-25 22:09:32,718][47292] Worker 6 awakens! [2024-04-25 22:09:32,722][47056] Heartbeat connected on RolloutWorker_w6 [2024-04-25 22:09:33,923][47056] Fps is (10 sec: 19660.8, 60 sec: 12171.0, 300 sec: 12171.0). Total num frames: 51150848. Throughput: 0: 16096.6. Samples: 563380. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2024-04-25 22:09:33,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:09:35,673][47288] Updated weights for policy 0, policy_version 3126 (0.0013) [2024-04-25 22:09:35,750][47294] Worker 7 awakens! [2024-04-25 22:09:35,755][47056] Heartbeat connected on RolloutWorker_w7 [2024-04-25 22:09:38,923][47056] Fps is (10 sec: 26214.2, 60 sec: 14336.0, 300 sec: 14336.0). Total num frames: 51298304. Throughput: 0: 18158.0. Samples: 726320. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2024-04-25 22:09:38,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:09:40,478][47295] Worker 8 awakens! [2024-04-25 22:09:40,482][47056] Heartbeat connected on RolloutWorker_w8 [2024-04-25 22:09:41,291][47288] Updated weights for policy 0, policy_version 3136 (0.0013) [2024-04-25 22:09:43,923][47056] Fps is (10 sec: 27852.8, 60 sec: 15655.8, 300 sec: 15655.8). Total num frames: 51429376. Throughput: 0: 18027.6. Samples: 811240. 
Policy #0 lag: (min: 0.0, avg: 2.7, max: 6.0) [2024-04-25 22:09:43,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:09:45,104][47296] Worker 9 awakens! [2024-04-25 22:09:45,109][47056] Heartbeat connected on RolloutWorker_w9 [2024-04-25 22:09:46,939][47288] Updated weights for policy 0, policy_version 3146 (0.0013) [2024-04-25 22:09:48,923][47056] Fps is (10 sec: 31129.9, 60 sec: 17694.7, 300 sec: 17694.7). Total num frames: 51609600. Throughput: 0: 15979.1. Samples: 1005900. Policy #0 lag: (min: 0.0, avg: 3.0, max: 6.0) [2024-04-25 22:09:48,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:09:49,850][47298] Worker 10 awakens! [2024-04-25 22:09:49,855][47056] Heartbeat connected on RolloutWorker_w10 [2024-04-25 22:09:51,296][47288] Updated weights for policy 0, policy_version 3156 (0.0015) [2024-04-25 22:09:53,923][47056] Fps is (10 sec: 37682.9, 60 sec: 19660.8, 300 sec: 19660.8). Total num frames: 51806208. Throughput: 0: 20227.2. Samples: 1244040. Policy #0 lag: (min: 0.0, avg: 3.6, max: 7.0) [2024-04-25 22:09:53,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:09:54,488][47299] Worker 11 awakens! [2024-04-25 22:09:54,492][47056] Heartbeat connected on RolloutWorker_w11 [2024-04-25 22:09:54,829][47288] Updated weights for policy 0, policy_version 3166 (0.0015) [2024-04-25 22:09:58,373][47288] Updated weights for policy 0, policy_version 3176 (0.0017) [2024-04-25 22:09:58,924][47056] Fps is (10 sec: 42593.3, 60 sec: 21844.9, 300 sec: 21844.9). Total num frames: 52035584. Throughput: 0: 23071.9. Samples: 1378400. Policy #0 lag: (min: 0.0, avg: 4.8, max: 9.0) [2024-04-25 22:09:58,924][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:09:59,200][47300] Worker 12 awakens! [2024-04-25 22:09:59,205][47056] Heartbeat connected on RolloutWorker_w12 [2024-04-25 22:10:02,207][47288] Updated weights for policy 0, policy_version 3186 (0.0022) [2024-04-25 22:10:03,854][47302] Worker 13 awakens! 
[2024-04-25 22:10:03,862][47056] Heartbeat connected on RolloutWorker_w13 [2024-04-25 22:10:03,923][47056] Fps is (10 sec: 45875.6, 60 sec: 24302.9, 300 sec: 23693.8). Total num frames: 52264960. Throughput: 0: 28573.9. Samples: 1644320. Policy #0 lag: (min: 0.0, avg: 3.6, max: 10.0) [2024-04-25 22:10:03,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:10:06,105][47288] Updated weights for policy 0, policy_version 3196 (0.0018) [2024-04-25 22:10:08,594][47301] Worker 14 awakens! [2024-04-25 22:10:08,600][47056] Heartbeat connected on RolloutWorker_w14 [2024-04-25 22:10:08,923][47056] Fps is (10 sec: 45880.5, 60 sec: 26760.7, 300 sec: 25278.2). Total num frames: 52494336. Throughput: 0: 33989.3. Samples: 1912960. Policy #0 lag: (min: 0.0, avg: 5.5, max: 10.0) [2024-04-25 22:10:08,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:10:10,067][47288] Updated weights for policy 0, policy_version 3206 (0.0020) [2024-04-25 22:10:13,229][47303] Worker 15 awakens! [2024-04-25 22:10:13,236][47056] Heartbeat connected on RolloutWorker_w15 [2024-04-25 22:10:13,821][47288] Updated weights for policy 0, policy_version 3216 (0.0023) [2024-04-25 22:10:13,923][47056] Fps is (10 sec: 42597.8, 60 sec: 29764.3, 300 sec: 26214.4). Total num frames: 52690944. Throughput: 0: 35532.4. Samples: 2035080. Policy #0 lag: (min: 0.0, avg: 5.5, max: 10.0) [2024-04-25 22:10:13,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:10:17,563][47288] Updated weights for policy 0, policy_version 3226 (0.0025) [2024-04-25 22:10:18,008][47293] Worker 16 awakens! [2024-04-25 22:10:18,016][47056] Heartbeat connected on RolloutWorker_w16 [2024-04-25 22:10:18,923][47056] Fps is (10 sec: 42598.8, 60 sec: 33314.3, 300 sec: 27443.2). Total num frames: 52920320. Throughput: 0: 38449.4. Samples: 2293600. 
Policy #0 lag: (min: 0.0, avg: 5.2, max: 11.0) [2024-04-25 22:10:18,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:10:21,476][47288] Updated weights for policy 0, policy_version 3236 (0.0024) [2024-04-25 22:10:23,222][47305] Worker 17 awakens! [2024-04-25 22:10:23,232][47056] Heartbeat connected on RolloutWorker_w17 [2024-04-25 22:10:23,923][47056] Fps is (10 sec: 42598.3, 60 sec: 36044.7, 300 sec: 28141.9). Total num frames: 53116928. Throughput: 0: 40801.7. Samples: 2562400. Policy #0 lag: (min: 0.0, avg: 4.8, max: 14.0) [2024-04-25 22:10:23,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:10:24,806][47288] Updated weights for policy 0, policy_version 3246 (0.0025) [2024-04-25 22:10:27,378][47307] Worker 18 awakens! [2024-04-25 22:10:27,388][47056] Heartbeat connected on RolloutWorker_w18 [2024-04-25 22:10:28,233][47288] Updated weights for policy 0, policy_version 3256 (0.0027) [2024-04-25 22:10:28,923][47056] Fps is (10 sec: 44236.5, 60 sec: 38775.5, 300 sec: 29309.2). Total num frames: 53362688. Throughput: 0: 41956.0. Samples: 2699260. Policy #0 lag: (min: 0.0, avg: 4.2, max: 11.0) [2024-04-25 22:10:28,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:10:31,763][47288] Updated weights for policy 0, policy_version 3266 (0.0025) [2024-04-25 22:10:32,090][47306] Worker 19 awakens! [2024-04-25 22:10:32,099][47056] Heartbeat connected on RolloutWorker_w19 [2024-04-25 22:10:33,923][47056] Fps is (10 sec: 49151.1, 60 sec: 40959.8, 300 sec: 30353.4). Total num frames: 53608448. Throughput: 0: 44009.5. Samples: 2986340. Policy #0 lag: (min: 0.0, avg: 5.8, max: 13.0) [2024-04-25 22:10:33,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:10:35,098][47288] Updated weights for policy 0, policy_version 3276 (0.0025) [2024-04-25 22:10:37,525][47309] Worker 20 awakens! 
[2024-04-25 22:10:37,534][47056] Heartbeat connected on RolloutWorker_w20 [2024-04-25 22:10:38,273][47288] Updated weights for policy 0, policy_version 3286 (0.0023) [2024-04-25 22:10:38,923][47056] Fps is (10 sec: 49151.6, 60 sec: 42598.4, 300 sec: 31293.4). Total num frames: 53854208. Throughput: 0: 45309.7. Samples: 3282980. Policy #0 lag: (min: 0.0, avg: 63.4, max: 188.0) [2024-04-25 22:10:38,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:10:41,711][47288] Updated weights for policy 0, policy_version 3296 (0.0027) [2024-04-25 22:10:41,966][47308] Worker 21 awakens! [2024-04-25 22:10:41,977][47056] Heartbeat connected on RolloutWorker_w21 [2024-04-25 22:10:43,923][47056] Fps is (10 sec: 49153.1, 60 sec: 44509.8, 300 sec: 32143.8). Total num frames: 54099968. Throughput: 0: 45625.6. Samples: 3431500. Policy #0 lag: (min: 0.0, avg: 7.0, max: 13.0) [2024-04-25 22:10:43,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:10:45,156][47288] Updated weights for policy 0, policy_version 3306 (0.0024) [2024-04-25 22:10:46,066][47310] Worker 22 awakens! [2024-04-25 22:10:46,078][47056] Heartbeat connected on RolloutWorker_w22 [2024-04-25 22:10:47,893][47288] Updated weights for policy 0, policy_version 3316 (0.0026) [2024-04-25 22:10:48,923][47056] Fps is (10 sec: 50790.3, 60 sec: 45875.1, 300 sec: 33065.9). Total num frames: 54362112. Throughput: 0: 46393.2. Samples: 3732020. Policy #0 lag: (min: 0.0, avg: 6.2, max: 13.0) [2024-04-25 22:10:48,924][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:10:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000003318_54362112.pth... [2024-04-25 22:10:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000002685_43991040.pth [2024-04-25 22:10:48,987][47267] Saving new best policy, reward=0.002! [2024-04-25 22:10:51,347][47311] Worker 23 awakens! 
[2024-04-25 22:10:51,357][47056] Heartbeat connected on RolloutWorker_w23 [2024-04-25 22:10:51,490][47288] Updated weights for policy 0, policy_version 3326 (0.0031) [2024-04-25 22:10:53,923][47056] Fps is (10 sec: 52428.2, 60 sec: 46967.4, 300 sec: 33907.7). Total num frames: 54624256. Throughput: 0: 47313.6. Samples: 4042080. Policy #0 lag: (min: 0.0, avg: 6.8, max: 16.0) [2024-04-25 22:10:53,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:10:54,701][47288] Updated weights for policy 0, policy_version 3336 (0.0030) [2024-04-25 22:10:55,513][47312] Worker 24 awakens! [2024-04-25 22:10:55,523][47056] Heartbeat connected on RolloutWorker_w24 [2024-04-25 22:10:57,981][47288] Updated weights for policy 0, policy_version 3346 (0.0030) [2024-04-25 22:10:58,923][47056] Fps is (10 sec: 52428.5, 60 sec: 47514.4, 300 sec: 34679.4). Total num frames: 54886400. Throughput: 0: 48017.7. Samples: 4195880. Policy #0 lag: (min: 1.0, avg: 8.5, max: 16.0) [2024-04-25 22:10:58,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:11:00,644][47288] Updated weights for policy 0, policy_version 3356 (0.0028) [2024-04-25 22:11:00,725][47313] Worker 25 awakens! [2024-04-25 22:11:00,737][47056] Heartbeat connected on RolloutWorker_w25 [2024-04-25 22:11:03,923][47056] Fps is (10 sec: 50791.4, 60 sec: 47786.6, 300 sec: 35258.4). Total num frames: 55132160. Throughput: 0: 49222.1. Samples: 4508600. Policy #0 lag: (min: 0.0, avg: 7.4, max: 16.0) [2024-04-25 22:11:03,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:11:04,011][47288] Updated weights for policy 0, policy_version 3366 (0.0031) [2024-04-25 22:11:05,404][47314] Worker 26 awakens! [2024-04-25 22:11:05,414][47056] Heartbeat connected on RolloutWorker_w26 [2024-04-25 22:11:06,770][47288] Updated weights for policy 0, policy_version 3376 (0.0024) [2024-04-25 22:11:08,923][47056] Fps is (10 sec: 50790.5, 60 sec: 48332.7, 300 sec: 35918.7). Total num frames: 55394304. Throughput: 0: 50490.2. 
Samples: 4834460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-25 22:11:08,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:11:10,106][47315] Worker 27 awakens! [2024-04-25 22:11:10,119][47056] Heartbeat connected on RolloutWorker_w27 [2024-04-25 22:11:10,273][47288] Updated weights for policy 0, policy_version 3386 (0.0029) [2024-04-25 22:11:13,179][47288] Updated weights for policy 0, policy_version 3396 (0.0033) [2024-04-25 22:11:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 49971.2, 300 sec: 36773.0). Total num frames: 55689216. Throughput: 0: 50878.1. Samples: 4988780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 18.0) [2024-04-25 22:11:13,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:11:14,956][47316] Worker 28 awakens! [2024-04-25 22:11:14,967][47056] Heartbeat connected on RolloutWorker_w28 [2024-04-25 22:11:15,917][47288] Updated weights for policy 0, policy_version 3406 (0.0029) [2024-04-25 22:11:18,923][47056] Fps is (10 sec: 55705.0, 60 sec: 50517.1, 300 sec: 37332.0). Total num frames: 55951360. Throughput: 0: 51767.6. Samples: 5315880. Policy #0 lag: (min: 2.0, avg: 9.0, max: 19.0) [2024-04-25 22:11:18,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:11:19,050][47288] Updated weights for policy 0, policy_version 3416 (0.0029) [2024-04-25 22:11:19,456][47319] Worker 29 awakens! [2024-04-25 22:11:19,470][47056] Heartbeat connected on RolloutWorker_w29 [2024-04-25 22:11:20,693][47267] Signal inference workers to stop experience collection... (50 times) [2024-04-25 22:11:20,709][47288] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-04-25 22:11:20,779][47267] Signal inference workers to resume experience collection... 
(50 times) [2024-04-25 22:11:20,779][47288] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-04-25 22:11:21,592][47288] Updated weights for policy 0, policy_version 3426 (0.0026) [2024-04-25 22:11:23,923][47056] Fps is (10 sec: 54066.7, 60 sec: 51882.6, 300 sec: 37965.6). Total num frames: 56229888. Throughput: 0: 52691.5. Samples: 5654100. Policy #0 lag: (min: 0.0, avg: 85.6, max: 321.0) [2024-04-25 22:11:23,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:11:24,161][47317] Worker 30 awakens! [2024-04-25 22:11:24,174][47056] Heartbeat connected on RolloutWorker_w30 [2024-04-25 22:11:24,990][47288] Updated weights for policy 0, policy_version 3436 (0.0027) [2024-04-25 22:11:28,040][47288] Updated weights for policy 0, policy_version 3446 (0.0025) [2024-04-25 22:11:28,860][47318] Worker 31 awakens! [2024-04-25 22:11:28,875][47056] Heartbeat connected on RolloutWorker_w31 [2024-04-25 22:11:28,923][47056] Fps is (10 sec: 54068.7, 60 sec: 52155.8, 300 sec: 38447.8). Total num frames: 56492032. Throughput: 0: 53048.1. Samples: 5818660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:11:28,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:11:30,843][47288] Updated weights for policy 0, policy_version 3456 (0.0035) [2024-04-25 22:11:33,883][47288] Updated weights for policy 0, policy_version 3466 (0.0033) [2024-04-25 22:11:33,923][47056] Fps is (10 sec: 55706.6, 60 sec: 52975.2, 300 sec: 39110.2). Total num frames: 56786944. Throughput: 0: 53787.2. Samples: 6152440. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-25 22:11:33,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:11:36,591][47288] Updated weights for policy 0, policy_version 3476 (0.0031) [2024-04-25 22:11:38,923][47056] Fps is (10 sec: 60620.3, 60 sec: 54067.2, 300 sec: 39833.6). Total num frames: 57098240. Throughput: 0: 54401.9. Samples: 6490160. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-25 22:11:38,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:11:39,517][47288] Updated weights for policy 0, policy_version 3486 (0.0029) [2024-04-25 22:11:42,570][47288] Updated weights for policy 0, policy_version 3496 (0.0029) [2024-04-25 22:11:43,923][47056] Fps is (10 sec: 57343.6, 60 sec: 54340.3, 300 sec: 40215.3). Total num frames: 57360384. Throughput: 0: 54817.4. Samples: 6662660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-25 22:11:43,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:11:45,207][47288] Updated weights for policy 0, policy_version 3506 (0.0029) [2024-04-25 22:11:48,381][47288] Updated weights for policy 0, policy_version 3516 (0.0032) [2024-04-25 22:11:48,923][47056] Fps is (10 sec: 52428.7, 60 sec: 54340.3, 300 sec: 40574.5). Total num frames: 57622528. Throughput: 0: 55357.7. Samples: 6999700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-25 22:11:48,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:11:51,073][47288] Updated weights for policy 0, policy_version 3526 (0.0027) [2024-04-25 22:11:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 54613.4, 300 sec: 41006.8). Total num frames: 57901056. Throughput: 0: 55656.5. Samples: 7339000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-25 22:11:53,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:11:54,314][47288] Updated weights for policy 0, policy_version 3536 (0.0029) [2024-04-25 22:11:56,928][47288] Updated weights for policy 0, policy_version 3546 (0.0022) [2024-04-25 22:11:58,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55432.5, 300 sec: 41597.1). Total num frames: 58212352. Throughput: 0: 55795.9. Samples: 7499600. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 19.0) [2024-04-25 22:11:58,924][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:12:00,025][47288] Updated weights for policy 0, policy_version 3556 (0.0024) [2024-04-25 22:12:02,589][47288] Updated weights for policy 0, policy_version 3566 (0.0032) [2024-04-25 22:12:03,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 41889.9). Total num frames: 58474496. Throughput: 0: 56185.2. Samples: 7844200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-25 22:12:03,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:12:05,662][47288] Updated weights for policy 0, policy_version 3576 (0.0028) [2024-04-25 22:12:08,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55978.9, 300 sec: 42253.5). Total num frames: 58753024. Throughput: 0: 56272.8. Samples: 8186360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 22:12:08,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:12:08,927][47288] Updated weights for policy 0, policy_version 3586 (0.0032) [2024-04-25 22:12:11,295][47288] Updated weights for policy 0, policy_version 3596 (0.0033) [2024-04-25 22:12:12,317][47267] Signal inference workers to stop experience collection... (100 times) [2024-04-25 22:12:12,318][47267] Signal inference workers to resume experience collection... (100 times) [2024-04-25 22:12:12,341][47288] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-04-25 22:12:12,341][47288] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-04-25 22:12:13,923][47056] Fps is (10 sec: 57342.8, 60 sec: 55978.6, 300 sec: 42682.4). Total num frames: 59047936. Throughput: 0: 56297.5. Samples: 8352060. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-25 22:12:13,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:12:14,921][47288] Updated weights for policy 0, policy_version 3606 (0.0029) [2024-04-25 22:12:17,123][47288] Updated weights for policy 0, policy_version 3616 (0.0025) [2024-04-25 22:12:18,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56525.0, 300 sec: 43089.9). Total num frames: 59342848. Throughput: 0: 56496.4. Samples: 8694780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:12:18,931][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:12:20,560][47288] Updated weights for policy 0, policy_version 3626 (0.0032) [2024-04-25 22:12:22,984][47288] Updated weights for policy 0, policy_version 3636 (0.0029) [2024-04-25 22:12:23,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56524.9, 300 sec: 43397.6). Total num frames: 59621376. Throughput: 0: 56663.6. Samples: 9040020. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-25 22:12:23,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:12:26,397][47288] Updated weights for policy 0, policy_version 3646 (0.0027) [2024-04-25 22:12:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56524.7, 300 sec: 43612.6). Total num frames: 59883520. Throughput: 0: 56608.8. Samples: 9210060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 22:12:28,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:12:29,085][47288] Updated weights for policy 0, policy_version 3656 (0.0032) [2024-04-25 22:12:32,272][47288] Updated weights for policy 0, policy_version 3666 (0.0030) [2024-04-25 22:12:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.7, 300 sec: 43893.9). Total num frames: 60162048. Throughput: 0: 56648.1. Samples: 9548860. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-25 22:12:33,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:12:34,921][47288] Updated weights for policy 0, policy_version 3676 (0.0030) [2024-04-25 22:12:37,943][47288] Updated weights for policy 0, policy_version 3686 (0.0030) [2024-04-25 22:12:38,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 44311.3). Total num frames: 60473344. Throughput: 0: 56542.2. Samples: 9883400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 22:12:38,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:12:40,786][47288] Updated weights for policy 0, policy_version 3696 (0.0035) [2024-04-25 22:12:43,647][47288] Updated weights for policy 0, policy_version 3706 (0.0028) [2024-04-25 22:12:43,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 44491.6). Total num frames: 60735488. Throughput: 0: 56669.8. Samples: 10049740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-25 22:12:43,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:12:46,703][47288] Updated weights for policy 0, policy_version 3716 (0.0028) [2024-04-25 22:12:48,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56524.8, 300 sec: 44735.4). Total num frames: 61014016. Throughput: 0: 56492.3. Samples: 10386360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-25 22:12:48,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:12:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000003724_61014016.pth... [2024-04-25 22:12:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000003096_50724864.pth [2024-04-25 22:12:49,460][47288] Updated weights for policy 0, policy_version 3726 (0.0027) [2024-04-25 22:12:52,436][47288] Updated weights for policy 0, policy_version 3736 (0.0033) [2024-04-25 22:12:53,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 45038.6). Total num frames: 61308928. 
Throughput: 0: 56352.7. Samples: 10722240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:12:53,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:12:55,253][47288] Updated weights for policy 0, policy_version 3746 (0.0029) [2024-04-25 22:12:58,092][47288] Updated weights for policy 0, policy_version 3756 (0.0028) [2024-04-25 22:12:58,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56524.9, 300 sec: 45329.1). Total num frames: 61603840. Throughput: 0: 56527.3. Samples: 10895780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-25 22:12:58,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:13:01,011][47288] Updated weights for policy 0, policy_version 3766 (0.0029) [2024-04-25 22:13:03,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 45407.1). Total num frames: 61849600. Throughput: 0: 56372.0. Samples: 11231520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-25 22:13:03,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:13:03,989][47288] Updated weights for policy 0, policy_version 3776 (0.0028) [2024-04-25 22:13:06,793][47288] Updated weights for policy 0, policy_version 3786 (0.0031) [2024-04-25 22:13:08,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.7, 300 sec: 45678.6). Total num frames: 62144512. Throughput: 0: 56208.5. Samples: 11569400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-25 22:13:08,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:13:09,706][47288] Updated weights for policy 0, policy_version 3796 (0.0027) [2024-04-25 22:13:12,662][47288] Updated weights for policy 0, policy_version 3806 (0.0032) [2024-04-25 22:13:13,488][47267] Signal inference workers to stop experience collection... (150 times) [2024-04-25 22:13:13,521][47288] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-04-25 22:13:13,546][47267] Signal inference workers to resume experience collection... 
(150 times) [2024-04-25 22:13:13,547][47288] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-04-25 22:13:13,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 45810.9). Total num frames: 62406656. Throughput: 0: 56114.1. Samples: 11735200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-25 22:13:13,924][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:13:15,619][47288] Updated weights for policy 0, policy_version 3816 (0.0030) [2024-04-25 22:13:18,411][47288] Updated weights for policy 0, policy_version 3826 (0.0031) [2024-04-25 22:13:18,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 46064.2). Total num frames: 62701568. Throughput: 0: 56154.2. Samples: 12075800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-25 22:13:18,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:13:21,416][47288] Updated weights for policy 0, policy_version 3836 (0.0036) [2024-04-25 22:13:23,923][47056] Fps is (10 sec: 57345.3, 60 sec: 55978.7, 300 sec: 46246.2). Total num frames: 62980096. Throughput: 0: 56295.7. Samples: 12416700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 22:13:23,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:13:24,128][47288] Updated weights for policy 0, policy_version 3846 (0.0030) [2024-04-25 22:13:27,147][47288] Updated weights for policy 0, policy_version 3856 (0.0031) [2024-04-25 22:13:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 46482.0). Total num frames: 63275008. Throughput: 0: 56405.0. Samples: 12587960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-25 22:13:28,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:13:29,855][47288] Updated weights for policy 0, policy_version 3866 (0.0026) [2024-04-25 22:13:32,961][47288] Updated weights for policy 0, policy_version 3876 (0.0032) [2024-04-25 22:13:33,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56797.7, 300 sec: 46709.3). Total num frames: 63569920. 
Throughput: 0: 56503.5. Samples: 12929020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 22:13:33,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:13:35,630][47288] Updated weights for policy 0, policy_version 3886 (0.0034) [2024-04-25 22:13:38,837][47288] Updated weights for policy 0, policy_version 3896 (0.0029) [2024-04-25 22:13:38,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 46811.4). Total num frames: 63832064. Throughput: 0: 56511.9. Samples: 13265280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 22:13:38,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:13:41,505][47288] Updated weights for policy 0, policy_version 3906 (0.0035) [2024-04-25 22:13:43,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55978.7, 300 sec: 46910.0). Total num frames: 64094208. Throughput: 0: 56295.9. Samples: 13429100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-25 22:13:43,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:13:44,518][47288] Updated weights for policy 0, policy_version 3916 (0.0028) [2024-04-25 22:13:47,565][47288] Updated weights for policy 0, policy_version 3926 (0.0028) [2024-04-25 22:13:48,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56524.9, 300 sec: 47174.6). Total num frames: 64405504. Throughput: 0: 56474.3. Samples: 13772860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 22:13:48,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:13:50,318][47288] Updated weights for policy 0, policy_version 3936 (0.0029) [2024-04-25 22:13:53,580][47288] Updated weights for policy 0, policy_version 3946 (0.0028) [2024-04-25 22:13:53,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 47263.7). Total num frames: 64667648. Throughput: 0: 56532.5. Samples: 14113360. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-25 22:13:53,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:13:56,136][47288] Updated weights for policy 0, policy_version 3956 (0.0030) [2024-04-25 22:13:58,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 47985.7). Total num frames: 64962560. Throughput: 0: 56347.3. Samples: 14270820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-25 22:13:58,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:13:59,433][47288] Updated weights for policy 0, policy_version 3966 (0.0030) [2024-04-25 22:14:01,936][47288] Updated weights for policy 0, policy_version 3976 (0.0028) [2024-04-25 22:14:03,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56797.8, 300 sec: 48707.7). Total num frames: 65257472. Throughput: 0: 56212.8. Samples: 14605380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 22:14:03,931][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:14:05,226][47288] Updated weights for policy 0, policy_version 3986 (0.0031) [2024-04-25 22:14:05,227][47267] Signal inference workers to stop experience collection... (200 times) [2024-04-25 22:14:05,227][47267] Signal inference workers to resume experience collection... (200 times) [2024-04-25 22:14:05,254][47288] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-04-25 22:14:05,255][47288] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-04-25 22:14:07,799][47288] Updated weights for policy 0, policy_version 3996 (0.0028) [2024-04-25 22:14:08,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56524.9, 300 sec: 49596.4). Total num frames: 65536000. Throughput: 0: 56272.5. Samples: 14948960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 22:14:08,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:14:10,908][47288] Updated weights for policy 0, policy_version 4006 (0.0031) [2024-04-25 22:14:13,536][47288] Updated weights for policy 0, policy_version 4016 (0.0026) [2024-04-25 22:14:13,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56798.0, 300 sec: 50485.0). Total num frames: 65814528. Throughput: 0: 56262.2. Samples: 15119760. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-04-25 22:14:13,932][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:14:16,539][47288] Updated weights for policy 0, policy_version 4026 (0.0029) [2024-04-25 22:14:18,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56251.7, 300 sec: 51262.5). Total num frames: 66076672. Throughput: 0: 56277.9. Samples: 15461520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 27.0) [2024-04-25 22:14:18,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:14:19,287][47288] Updated weights for policy 0, policy_version 4036 (0.0032) [2024-04-25 22:14:22,289][47288] Updated weights for policy 0, policy_version 4046 (0.0027) [2024-04-25 22:14:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 51928.9). Total num frames: 66355200. Throughput: 0: 56290.3. Samples: 15798340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:14:23,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:14:25,093][47288] Updated weights for policy 0, policy_version 4056 (0.0030) [2024-04-25 22:14:28,150][47288] Updated weights for policy 0, policy_version 4066 (0.0028) [2024-04-25 22:14:28,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56524.9, 300 sec: 52595.4). Total num frames: 66666496. Throughput: 0: 56426.0. Samples: 15968260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-25 22:14:28,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:14:30,878][47288] Updated weights for policy 0, policy_version 4076 (0.0029) [2024-04-25 22:14:33,923][47056] Fps is (10 sec: 57342.5, 60 sec: 55978.5, 300 sec: 52984.2). Total num frames: 66928640. Throughput: 0: 56312.5. Samples: 16306940. Policy #0 lag: (min: 1.0, avg: 11.8, max: 21.0) [2024-04-25 22:14:33,924][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:14:34,109][47288] Updated weights for policy 0, policy_version 4086 (0.0032) [2024-04-25 22:14:36,663][47288] Updated weights for policy 0, policy_version 4096 (0.0035) [2024-04-25 22:14:38,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56797.9, 300 sec: 53595.1). Total num frames: 67239936. Throughput: 0: 56230.1. Samples: 16643720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 22:14:38,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:14:39,986][47288] Updated weights for policy 0, policy_version 4106 (0.0031) [2024-04-25 22:14:42,548][47288] Updated weights for policy 0, policy_version 4116 (0.0025) [2024-04-25 22:14:43,923][47056] Fps is (10 sec: 58983.5, 60 sec: 57070.9, 300 sec: 53928.3). Total num frames: 67518464. Throughput: 0: 56699.1. Samples: 16822280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-25 22:14:43,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:14:45,823][47288] Updated weights for policy 0, policy_version 4126 (0.0030) [2024-04-25 22:14:48,295][47288] Updated weights for policy 0, policy_version 4136 (0.0029) [2024-04-25 22:14:48,923][47056] Fps is (10 sec: 55704.2, 60 sec: 56524.5, 300 sec: 54206.0). Total num frames: 67796992. Throughput: 0: 56763.6. Samples: 17159760. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-25 22:14:48,924][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:14:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000004138_67796992.pth... [2024-04-25 22:14:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000003318_54362112.pth [2024-04-25 22:14:51,561][47288] Updated weights for policy 0, policy_version 4146 (0.0030) [2024-04-25 22:14:53,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56797.7, 300 sec: 54372.8). Total num frames: 68075520. Throughput: 0: 56626.3. Samples: 17497160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:14:53,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:14:54,266][47288] Updated weights for policy 0, policy_version 4156 (0.0035) [2024-04-25 22:14:57,358][47288] Updated weights for policy 0, policy_version 4166 (0.0026) [2024-04-25 22:14:58,923][47056] Fps is (10 sec: 52429.7, 60 sec: 55978.6, 300 sec: 54428.2). Total num frames: 68321280. Throughput: 0: 56383.4. Samples: 17657020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-25 22:14:58,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:15:00,026][47288] Updated weights for policy 0, policy_version 4176 (0.0032) [2024-04-25 22:15:03,292][47288] Updated weights for policy 0, policy_version 4186 (0.0030) [2024-04-25 22:15:03,496][47267] Signal inference workers to stop experience collection... (250 times) [2024-04-25 22:15:03,496][47267] Signal inference workers to resume experience collection... (250 times) [2024-04-25 22:15:03,508][47288] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-04-25 22:15:03,508][47288] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-04-25 22:15:03,923][47056] Fps is (10 sec: 55703.3, 60 sec: 56251.2, 300 sec: 54705.8). Total num frames: 68632576. Throughput: 0: 56314.0. Samples: 17995680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:15:03,924][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:15:05,896][47288] Updated weights for policy 0, policy_version 4196 (0.0031) [2024-04-25 22:15:08,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.5, 300 sec: 54928.1). Total num frames: 68894720. Throughput: 0: 56403.1. Samples: 18336480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-25 22:15:08,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:15:08,984][47288] Updated weights for policy 0, policy_version 4206 (0.0037) [2024-04-25 22:15:11,761][47288] Updated weights for policy 0, policy_version 4216 (0.0036) [2024-04-25 22:15:13,923][47056] Fps is (10 sec: 57347.5, 60 sec: 56524.8, 300 sec: 55205.7). Total num frames: 69206016. Throughput: 0: 56267.0. Samples: 18500280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-25 22:15:13,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:15:14,872][47288] Updated weights for policy 0, policy_version 4226 (0.0035) [2024-04-25 22:15:17,551][47288] Updated weights for policy 0, policy_version 4236 (0.0028) [2024-04-25 22:15:18,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56797.9, 300 sec: 55483.5). Total num frames: 69484544. Throughput: 0: 56287.0. Samples: 18839840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:15:18,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:15:20,708][47288] Updated weights for policy 0, policy_version 4246 (0.0028) [2024-04-25 22:15:23,343][47288] Updated weights for policy 0, policy_version 4256 (0.0027) [2024-04-25 22:15:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.9, 300 sec: 55594.5). Total num frames: 69763072. Throughput: 0: 56194.8. Samples: 19172480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 22:15:23,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:15:26,721][47288] Updated weights for policy 0, policy_version 4266 (0.0031) [2024-04-25 22:15:28,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 70041600. Throughput: 0: 56172.9. Samples: 19350060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-25 22:15:28,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:15:29,265][47288] Updated weights for policy 0, policy_version 4276 (0.0031) [2024-04-25 22:15:32,521][47288] Updated weights for policy 0, policy_version 4286 (0.0028) [2024-04-25 22:15:33,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55979.0, 300 sec: 55705.6). Total num frames: 70287360. Throughput: 0: 56127.1. Samples: 19685460. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-25 22:15:33,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:15:35,153][47288] Updated weights for policy 0, policy_version 4296 (0.0030) [2024-04-25 22:15:38,363][47288] Updated weights for policy 0, policy_version 4306 (0.0032) [2024-04-25 22:15:38,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 70582272. Throughput: 0: 55992.1. Samples: 20016800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-25 22:15:38,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:15:40,921][47288] Updated weights for policy 0, policy_version 4316 (0.0029) [2024-04-25 22:15:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 70860800. Throughput: 0: 55975.8. Samples: 20175920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:15:43,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:15:44,070][47288] Updated weights for policy 0, policy_version 4326 (0.0025) [2024-04-25 22:15:46,789][47288] Updated weights for policy 0, policy_version 4336 (0.0030) [2024-04-25 22:15:48,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55979.0, 300 sec: 56038.9). Total num frames: 71155712. Throughput: 0: 55821.7. Samples: 20507620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-25 22:15:48,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:15:50,012][47288] Updated weights for policy 0, policy_version 4346 (0.0031) [2024-04-25 22:15:52,647][47288] Updated weights for policy 0, policy_version 4356 (0.0029) [2024-04-25 22:15:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.8, 300 sec: 56094.4). Total num frames: 71434240. Throughput: 0: 55762.7. Samples: 20845800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-25 22:15:53,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:15:55,789][47288] Updated weights for policy 0, policy_version 4366 (0.0037) [2024-04-25 22:15:58,530][47288] Updated weights for policy 0, policy_version 4376 (0.0032) [2024-04-25 22:15:58,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 71729152. Throughput: 0: 55924.3. Samples: 21016880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 22:15:58,924][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:16:01,701][47288] Updated weights for policy 0, policy_version 4386 (0.0036) [2024-04-25 22:16:03,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55979.2, 300 sec: 56261.0). Total num frames: 71991296. Throughput: 0: 55905.8. Samples: 21355600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-25 22:16:03,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:16:04,225][47288] Updated weights for policy 0, policy_version 4396 (0.0031) [2024-04-25 22:16:07,475][47288] Updated weights for policy 0, policy_version 4406 (0.0028) [2024-04-25 22:16:08,923][47056] Fps is (10 sec: 52425.9, 60 sec: 55978.1, 300 sec: 56149.8). Total num frames: 72253440. Throughput: 0: 55963.2. Samples: 21690860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 22:16:08,932][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:16:10,028][47288] Updated weights for policy 0, policy_version 4416 (0.0031) [2024-04-25 22:16:13,399][47288] Updated weights for policy 0, policy_version 4426 (0.0031) [2024-04-25 22:16:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 56205.5). Total num frames: 72531968. Throughput: 0: 55639.8. Samples: 21853840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 22:16:13,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:16:14,678][47267] Signal inference workers to stop experience collection... (300 times) [2024-04-25 22:16:14,718][47288] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-04-25 22:16:14,726][47267] Signal inference workers to resume experience collection... (300 times) [2024-04-25 22:16:14,734][47288] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-04-25 22:16:15,892][47288] Updated weights for policy 0, policy_version 4436 (0.0031) [2024-04-25 22:16:18,923][47056] Fps is (10 sec: 55709.9, 60 sec: 55432.6, 300 sec: 56205.5). Total num frames: 72810496. Throughput: 0: 55716.6. Samples: 22192700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:16:18,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:16:19,128][47288] Updated weights for policy 0, policy_version 4446 (0.0029) [2024-04-25 22:16:21,666][47288] Updated weights for policy 0, policy_version 4456 (0.0026) [2024-04-25 22:16:23,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55432.5, 300 sec: 56261.0). Total num frames: 73089024. Throughput: 0: 55853.8. Samples: 22530220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-25 22:16:23,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:16:24,806][47288] Updated weights for policy 0, policy_version 4466 (0.0027) [2024-04-25 22:16:27,426][47288] Updated weights for policy 0, policy_version 4476 (0.0029) [2024-04-25 22:16:28,923][47056] Fps is (10 sec: 57342.8, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 73383936. Throughput: 0: 56126.1. Samples: 22701600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 22:16:28,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:16:30,632][47288] Updated weights for policy 0, policy_version 4486 (0.0030) [2024-04-25 22:16:33,182][47288] Updated weights for policy 0, policy_version 4496 (0.0032) [2024-04-25 22:16:33,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56524.6, 300 sec: 56205.4). Total num frames: 73678848. Throughput: 0: 56278.8. Samples: 23040180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 22:16:33,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:16:36,401][47288] Updated weights for policy 0, policy_version 4506 (0.0033) [2024-04-25 22:16:38,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 73973760. Throughput: 0: 56235.3. Samples: 23376400. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-04-25 22:16:38,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:16:39,092][47288] Updated weights for policy 0, policy_version 4516 (0.0034) [2024-04-25 22:16:42,181][47288] Updated weights for policy 0, policy_version 4526 (0.0036) [2024-04-25 22:16:43,923][47056] Fps is (10 sec: 55706.7, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 74235904. Throughput: 0: 56374.8. Samples: 23553740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 22:16:43,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:16:44,940][47288] Updated weights for policy 0, policy_version 4536 (0.0036) [2024-04-25 22:16:48,096][47288] Updated weights for policy 0, policy_version 4546 (0.0029) [2024-04-25 22:16:48,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 74498048. Throughput: 0: 56287.0. Samples: 23888520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-25 22:16:48,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:16:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000004547_74498048.pth... [2024-04-25 22:16:49,001][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000003724_61014016.pth [2024-04-25 22:16:50,634][47288] Updated weights for policy 0, policy_version 4556 (0.0029) [2024-04-25 22:16:53,819][47288] Updated weights for policy 0, policy_version 4566 (0.0029) [2024-04-25 22:16:53,923][47056] Fps is (10 sec: 57342.7, 60 sec: 56251.5, 300 sec: 56261.0). Total num frames: 74809344. Throughput: 0: 56347.2. Samples: 24226460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 22:16:53,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:16:56,298][47288] Updated weights for policy 0, policy_version 4576 (0.0024) [2024-04-25 22:16:58,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55432.7, 300 sec: 56205.5). Total num frames: 75055104. 
Throughput: 0: 56201.3. Samples: 24382900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-25 22:16:58,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:16:59,683][47288] Updated weights for policy 0, policy_version 4586 (0.0031) [2024-04-25 22:17:02,482][47288] Updated weights for policy 0, policy_version 4596 (0.0032) [2024-04-25 22:17:03,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 75350016. Throughput: 0: 56343.4. Samples: 24728160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-25 22:17:03,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:17:05,752][47288] Updated weights for policy 0, policy_version 4606 (0.0027) [2024-04-25 22:17:06,975][47267] Signal inference workers to stop experience collection... (350 times) [2024-04-25 22:17:06,995][47288] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-04-25 22:17:07,062][47267] Signal inference workers to resume experience collection... (350 times) [2024-04-25 22:17:07,063][47288] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-04-25 22:17:08,204][47288] Updated weights for policy 0, policy_version 4616 (0.0024) [2024-04-25 22:17:08,923][47056] Fps is (10 sec: 58981.1, 60 sec: 56525.2, 300 sec: 56261.0). Total num frames: 75644928. Throughput: 0: 56271.0. Samples: 25062420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:17:08,924][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:17:11,625][47288] Updated weights for policy 0, policy_version 4626 (0.0026) [2024-04-25 22:17:13,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 75939840. Throughput: 0: 56366.3. Samples: 25238080. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-25 22:17:13,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:17:14,064][47288] Updated weights for policy 0, policy_version 4636 (0.0031) [2024-04-25 22:17:17,509][47288] Updated weights for policy 0, policy_version 4646 (0.0030) [2024-04-25 22:17:18,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 76218368. Throughput: 0: 56334.0. Samples: 25575200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 22:17:18,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:17:19,975][47288] Updated weights for policy 0, policy_version 4656 (0.0030) [2024-04-25 22:17:23,413][47288] Updated weights for policy 0, policy_version 4666 (0.0028) [2024-04-25 22:17:23,923][47056] Fps is (10 sec: 52428.7, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 76464128. Throughput: 0: 56246.5. Samples: 25907480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:17:23,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:17:25,853][47288] Updated weights for policy 0, policy_version 4676 (0.0030) [2024-04-25 22:17:28,923][47056] Fps is (10 sec: 52427.7, 60 sec: 55978.5, 300 sec: 56205.4). Total num frames: 76742656. Throughput: 0: 55964.2. Samples: 26072140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-25 22:17:28,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:17:29,332][47288] Updated weights for policy 0, policy_version 4686 (0.0035) [2024-04-25 22:17:31,511][47288] Updated weights for policy 0, policy_version 4696 (0.0029) [2024-04-25 22:17:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.9, 300 sec: 56149.9). Total num frames: 77037568. Throughput: 0: 56041.5. Samples: 26410380. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-25 22:17:33,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:17:35,243][47288] Updated weights for policy 0, policy_version 4706 (0.0026) [2024-04-25 22:17:37,265][47288] Updated weights for policy 0, policy_version 4716 (0.0030) [2024-04-25 22:17:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 56149.9). Total num frames: 77299712. Throughput: 0: 56031.6. Samples: 26747880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-25 22:17:38,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:17:41,086][47288] Updated weights for policy 0, policy_version 4726 (0.0028) [2024-04-25 22:17:43,240][47288] Updated weights for policy 0, policy_version 4736 (0.0029) [2024-04-25 22:17:43,923][47056] Fps is (10 sec: 57342.4, 60 sec: 56251.5, 300 sec: 56261.0). Total num frames: 77611008. Throughput: 0: 56318.4. Samples: 26917240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-25 22:17:43,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:17:46,817][47288] Updated weights for policy 0, policy_version 4746 (0.0034) [2024-04-25 22:17:48,923][47056] Fps is (10 sec: 60620.6, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 77905920. Throughput: 0: 56049.2. Samples: 27250380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-04-25 22:17:48,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:17:49,153][47288] Updated weights for policy 0, policy_version 4756 (0.0033) [2024-04-25 22:17:52,725][47288] Updated weights for policy 0, policy_version 4766 (0.0036) [2024-04-25 22:17:53,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56251.9, 300 sec: 56205.4). Total num frames: 78184448. Throughput: 0: 56089.5. Samples: 27586440. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:17:53,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:17:55,478][47288] Updated weights for policy 0, policy_version 4776 (0.0026) [2024-04-25 22:17:58,583][47288] Updated weights for policy 0, policy_version 4786 (0.0034) [2024-04-25 22:17:58,923][47056] Fps is (10 sec: 52430.1, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 78430208. Throughput: 0: 55945.4. Samples: 27755620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-25 22:17:58,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:18:00,059][47267] Signal inference workers to stop experience collection... (400 times) [2024-04-25 22:18:00,059][47267] Signal inference workers to resume experience collection... (400 times) [2024-04-25 22:18:00,070][47288] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-04-25 22:18:00,089][47288] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-04-25 22:18:01,130][47288] Updated weights for policy 0, policy_version 4796 (0.0027) [2024-04-25 22:18:03,922][47056] Fps is (10 sec: 50791.2, 60 sec: 55705.8, 300 sec: 56094.4). Total num frames: 78692352. Throughput: 0: 56026.8. Samples: 28096400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-25 22:18:03,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:18:04,443][47288] Updated weights for policy 0, policy_version 4806 (0.0030) [2024-04-25 22:18:06,941][47288] Updated weights for policy 0, policy_version 4816 (0.0027) [2024-04-25 22:18:08,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 78987264. Throughput: 0: 56131.5. Samples: 28433400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-25 22:18:08,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:18:08,934][47267] Saving new best policy, reward=0.003! 
[2024-04-25 22:18:10,189][47288] Updated weights for policy 0, policy_version 4826 (0.0027) [2024-04-25 22:18:12,878][47288] Updated weights for policy 0, policy_version 4836 (0.0024) [2024-04-25 22:18:13,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 79282176. Throughput: 0: 56060.3. Samples: 28594840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:18:13,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:18:16,014][47288] Updated weights for policy 0, policy_version 4846 (0.0027) [2024-04-25 22:18:18,613][47288] Updated weights for policy 0, policy_version 4856 (0.0033) [2024-04-25 22:18:18,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 79560704. Throughput: 0: 55969.2. Samples: 28929000. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-25 22:18:18,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:18:21,793][47288] Updated weights for policy 0, policy_version 4866 (0.0032) [2024-04-25 22:18:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56205.4). Total num frames: 79855616. Throughput: 0: 55787.3. Samples: 29258300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 22:18:23,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:18:24,387][47288] Updated weights for policy 0, policy_version 4876 (0.0034) [2024-04-25 22:18:27,704][47288] Updated weights for policy 0, policy_version 4886 (0.0026) [2024-04-25 22:18:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56525.0, 300 sec: 56149.9). Total num frames: 80134144. Throughput: 0: 55973.1. Samples: 29436020. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-25 22:18:28,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:18:30,364][47288] Updated weights for policy 0, policy_version 4896 (0.0034) [2024-04-25 22:18:33,441][47288] Updated weights for policy 0, policy_version 4906 (0.0027) [2024-04-25 22:18:33,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.5, 300 sec: 56149.9). Total num frames: 80396288. Throughput: 0: 56148.1. Samples: 29777040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 22:18:33,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:18:36,195][47288] Updated weights for policy 0, policy_version 4916 (0.0035) [2024-04-25 22:18:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.8, 300 sec: 56205.4). Total num frames: 80674816. Throughput: 0: 56035.0. Samples: 30108020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-25 22:18:38,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:18:39,430][47288] Updated weights for policy 0, policy_version 4926 (0.0034) [2024-04-25 22:18:42,157][47288] Updated weights for policy 0, policy_version 4936 (0.0033) [2024-04-25 22:18:43,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.7, 300 sec: 56038.8). Total num frames: 80936960. Throughput: 0: 55818.1. Samples: 30267440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:18:43,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:18:45,313][47288] Updated weights for policy 0, policy_version 4946 (0.0028) [2024-04-25 22:18:47,943][47288] Updated weights for policy 0, policy_version 4956 (0.0032) [2024-04-25 22:18:48,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 56149.9). Total num frames: 81231872. Throughput: 0: 55720.7. Samples: 30603840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-25 22:18:48,931][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:18:49,031][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000004959_81248256.pth... [2024-04-25 22:18:49,077][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000004138_67796992.pth [2024-04-25 22:18:51,045][47288] Updated weights for policy 0, policy_version 4966 (0.0031) [2024-04-25 22:18:53,763][47288] Updated weights for policy 0, policy_version 4976 (0.0033) [2024-04-25 22:18:53,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55705.6, 300 sec: 56149.9). Total num frames: 81526784. Throughput: 0: 55688.5. Samples: 30939380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-25 22:18:53,932][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:18:56,814][47288] Updated weights for policy 0, policy_version 4986 (0.0025) [2024-04-25 22:18:58,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55978.4, 300 sec: 56038.8). Total num frames: 81788928. Throughput: 0: 55778.8. Samples: 31104900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 22:18:58,932][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:18:59,807][47288] Updated weights for policy 0, policy_version 4996 (0.0026) [2024-04-25 22:19:00,355][47267] Signal inference workers to stop experience collection... (450 times) [2024-04-25 22:19:00,389][47288] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-04-25 22:19:00,441][47267] Signal inference workers to resume experience collection... (450 times) [2024-04-25 22:19:00,442][47288] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-04-25 22:19:02,879][47288] Updated weights for policy 0, policy_version 5006 (0.0023) [2024-04-25 22:19:03,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.6, 300 sec: 56094.3). Total num frames: 82083840. Throughput: 0: 55900.4. Samples: 31444520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-25 22:19:03,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:19:05,728][47288] Updated weights for policy 0, policy_version 5016 (0.0029) [2024-04-25 22:19:08,705][47288] Updated weights for policy 0, policy_version 5026 (0.0028) [2024-04-25 22:19:08,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.5, 300 sec: 56094.3). Total num frames: 82362368. Throughput: 0: 56063.7. Samples: 31781180. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-25 22:19:08,924][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:19:11,406][47288] Updated weights for policy 0, policy_version 5036 (0.0031) [2024-04-25 22:19:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 82640896. Throughput: 0: 55677.3. Samples: 31941500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-25 22:19:13,931][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:19:14,449][47288] Updated weights for policy 0, policy_version 5046 (0.0030) [2024-04-25 22:19:17,099][47288] Updated weights for policy 0, policy_version 5056 (0.0033) [2024-04-25 22:19:18,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55432.5, 300 sec: 56038.8). Total num frames: 82886656. Throughput: 0: 55645.3. Samples: 32281080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-25 22:19:18,932][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:19:20,360][47288] Updated weights for policy 0, policy_version 5066 (0.0029) [2024-04-25 22:19:22,993][47288] Updated weights for policy 0, policy_version 5076 (0.0035) [2024-04-25 22:19:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 83197952. Throughput: 0: 55720.4. Samples: 32615440. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-25 22:19:23,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:19:26,287][47288] Updated weights for policy 0, policy_version 5086 (0.0029) [2024-04-25 22:19:28,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55705.5, 300 sec: 56094.4). Total num frames: 83476480. Throughput: 0: 55969.2. Samples: 32786060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:19:28,924][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:19:29,079][47288] Updated weights for policy 0, policy_version 5096 (0.0035) [2024-04-25 22:19:32,190][47288] Updated weights for policy 0, policy_version 5106 (0.0028) [2024-04-25 22:19:33,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 83755008. Throughput: 0: 55978.2. Samples: 33122860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-25 22:19:33,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:19:34,862][47288] Updated weights for policy 0, policy_version 5116 (0.0028) [2024-04-25 22:19:37,922][47288] Updated weights for policy 0, policy_version 5126 (0.0035) [2024-04-25 22:19:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 84033536. Throughput: 0: 55906.2. Samples: 33455160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-25 22:19:38,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:19:40,658][47288] Updated weights for policy 0, policy_version 5136 (0.0029) [2024-04-25 22:19:43,889][47288] Updated weights for policy 0, policy_version 5146 (0.0027) [2024-04-25 22:19:43,923][47056] Fps is (10 sec: 55702.0, 60 sec: 56251.1, 300 sec: 55983.2). Total num frames: 84312064. Throughput: 0: 56030.1. Samples: 33626280. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-25 22:19:43,924][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:19:46,636][47288] Updated weights for policy 0, policy_version 5156 (0.0031) [2024-04-25 22:19:48,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 84590592. Throughput: 0: 56016.6. Samples: 33965260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:19:48,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:19:49,709][47288] Updated weights for policy 0, policy_version 5166 (0.0035) [2024-04-25 22:19:52,445][47288] Updated weights for policy 0, policy_version 5176 (0.0029) [2024-04-25 22:19:53,923][47056] Fps is (10 sec: 55709.4, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 84869120. Throughput: 0: 55917.3. Samples: 34297440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-25 22:19:53,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:19:55,532][47288] Updated weights for policy 0, policy_version 5186 (0.0030) [2024-04-25 22:19:58,170][47288] Updated weights for policy 0, policy_version 5196 (0.0031) [2024-04-25 22:19:58,923][47056] Fps is (10 sec: 55704.1, 60 sec: 55978.6, 300 sec: 55983.4). Total num frames: 85147648. Throughput: 0: 56142.5. Samples: 34467920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-25 22:19:58,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:20:01,356][47288] Updated weights for policy 0, policy_version 5206 (0.0027) [2024-04-25 22:20:02,066][47267] Signal inference workers to stop experience collection... (500 times) [2024-04-25 22:20:02,066][47267] Signal inference workers to resume experience collection... 
(500 times) [2024-04-25 22:20:02,082][47288] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-04-25 22:20:02,082][47288] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-04-25 22:20:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 85442560. Throughput: 0: 56118.4. Samples: 34806400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 22:20:03,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:20:03,989][47288] Updated weights for policy 0, policy_version 5216 (0.0030) [2024-04-25 22:20:07,235][47288] Updated weights for policy 0, policy_version 5226 (0.0032) [2024-04-25 22:20:08,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55705.8, 300 sec: 55927.7). Total num frames: 85704704. Throughput: 0: 56132.0. Samples: 35141380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-25 22:20:08,932][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:20:09,823][47288] Updated weights for policy 0, policy_version 5236 (0.0028) [2024-04-25 22:20:12,990][47288] Updated weights for policy 0, policy_version 5246 (0.0032) [2024-04-25 22:20:13,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 85999616. Throughput: 0: 56080.0. Samples: 35309660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 22:20:13,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:20:15,651][47288] Updated weights for policy 0, policy_version 5256 (0.0033) [2024-04-25 22:20:18,781][47288] Updated weights for policy 0, policy_version 5266 (0.0027) [2024-04-25 22:20:18,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56525.0, 300 sec: 55983.3). Total num frames: 86278144. Throughput: 0: 56084.9. Samples: 35646680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 22:20:18,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:20:21,597][47288] Updated weights for policy 0, policy_version 5276 (0.0026) [2024-04-25 22:20:23,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 86556672. Throughput: 0: 56157.4. Samples: 35982240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 22:20:23,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:20:24,614][47288] Updated weights for policy 0, policy_version 5286 (0.0027) [2024-04-25 22:20:27,400][47288] Updated weights for policy 0, policy_version 5296 (0.0029) [2024-04-25 22:20:28,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 86851584. Throughput: 0: 56254.5. Samples: 36157700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 22:20:28,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:20:30,417][47288] Updated weights for policy 0, policy_version 5306 (0.0029) [2024-04-25 22:20:33,210][47288] Updated weights for policy 0, policy_version 5316 (0.0030) [2024-04-25 22:20:33,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 87146496. Throughput: 0: 56336.9. Samples: 36500420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-25 22:20:33,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:20:36,184][47288] Updated weights for policy 0, policy_version 5326 (0.0035) [2024-04-25 22:20:38,923][47056] Fps is (10 sec: 55704.5, 60 sec: 56251.6, 300 sec: 56094.3). Total num frames: 87408640. Throughput: 0: 56435.2. Samples: 36837040. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 26.0) [2024-04-25 22:20:38,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:20:38,961][47288] Updated weights for policy 0, policy_version 5336 (0.0035) [2024-04-25 22:20:41,971][47288] Updated weights for policy 0, policy_version 5346 (0.0031) [2024-04-25 22:20:43,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55979.1, 300 sec: 55983.3). Total num frames: 87670784. Throughput: 0: 56400.1. Samples: 37005920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:20:43,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:20:44,828][47288] Updated weights for policy 0, policy_version 5356 (0.0029) [2024-04-25 22:20:47,785][47288] Updated weights for policy 0, policy_version 5366 (0.0032) [2024-04-25 22:20:48,923][47056] Fps is (10 sec: 55706.8, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 87965696. Throughput: 0: 56302.2. Samples: 37340000. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-04-25 22:20:48,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:20:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000005369_87965696.pth... [2024-04-25 22:20:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000004547_74498048.pth [2024-04-25 22:20:50,536][47288] Updated weights for policy 0, policy_version 5376 (0.0032) [2024-04-25 22:20:53,589][47288] Updated weights for policy 0, policy_version 5386 (0.0028) [2024-04-25 22:20:53,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56524.7, 300 sec: 56038.9). Total num frames: 88260608. Throughput: 0: 56350.3. Samples: 37677140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-25 22:20:53,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:20:55,853][47267] Signal inference workers to stop experience collection... 
(550 times) [2024-04-25 22:20:55,883][47288] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-04-25 22:20:55,907][47267] Signal inference workers to resume experience collection... (550 times) [2024-04-25 22:20:55,908][47288] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-04-25 22:20:56,427][47288] Updated weights for policy 0, policy_version 5396 (0.0036) [2024-04-25 22:20:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.9, 300 sec: 56038.8). Total num frames: 88522752. Throughput: 0: 56283.2. Samples: 37842400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-25 22:20:58,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:20:59,309][47288] Updated weights for policy 0, policy_version 5406 (0.0030) [2024-04-25 22:21:02,134][47288] Updated weights for policy 0, policy_version 5416 (0.0028) [2024-04-25 22:21:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 56150.0). Total num frames: 88817664. Throughput: 0: 56242.1. Samples: 38177580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-25 22:21:03,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:21:05,105][47288] Updated weights for policy 0, policy_version 5426 (0.0030) [2024-04-25 22:21:07,983][47288] Updated weights for policy 0, policy_version 5436 (0.0031) [2024-04-25 22:21:08,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56797.9, 300 sec: 56205.4). Total num frames: 89112576. Throughput: 0: 56246.7. Samples: 38513340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-25 22:21:08,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:21:10,879][47288] Updated weights for policy 0, policy_version 5446 (0.0027) [2024-04-25 22:21:13,705][47288] Updated weights for policy 0, policy_version 5456 (0.0030) [2024-04-25 22:21:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 56205.4). Total num frames: 89391104. Throughput: 0: 56301.7. Samples: 38691280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 22:21:13,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:21:16,592][47288] Updated weights for policy 0, policy_version 5466 (0.0028) [2024-04-25 22:21:18,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.6, 300 sec: 56149.9). Total num frames: 89653248. Throughput: 0: 56249.6. Samples: 39031660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 22:21:18,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:21:19,669][47288] Updated weights for policy 0, policy_version 5476 (0.0036) [2024-04-25 22:21:22,446][47288] Updated weights for policy 0, policy_version 5486 (0.0027) [2024-04-25 22:21:23,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.6, 300 sec: 56094.4). Total num frames: 89931776. Throughput: 0: 56203.7. Samples: 39366200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:21:23,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:21:25,434][47288] Updated weights for policy 0, policy_version 5496 (0.0030) [2024-04-25 22:21:28,354][47288] Updated weights for policy 0, policy_version 5506 (0.0029) [2024-04-25 22:21:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 56094.4). Total num frames: 90226688. Throughput: 0: 56108.9. Samples: 39530820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:21:28,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:21:31,217][47288] Updated weights for policy 0, policy_version 5516 (0.0031) [2024-04-25 22:21:33,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55978.7, 300 sec: 56038.9). Total num frames: 90505216. Throughput: 0: 56312.5. Samples: 39874060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-25 22:21:33,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:21:34,189][47288] Updated weights for policy 0, policy_version 5526 (0.0030) [2024-04-25 22:21:37,015][47288] Updated weights for policy 0, policy_version 5536 (0.0030) [2024-04-25 22:21:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.9, 300 sec: 56094.4). Total num frames: 90783744. Throughput: 0: 56410.1. Samples: 40215600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-25 22:21:38,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:21:39,998][47288] Updated weights for policy 0, policy_version 5546 (0.0029) [2024-04-25 22:21:42,844][47288] Updated weights for policy 0, policy_version 5556 (0.0033) [2024-04-25 22:21:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.9, 300 sec: 56205.5). Total num frames: 91078656. Throughput: 0: 56510.7. Samples: 40385380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 22:21:43,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:21:46,142][47288] Updated weights for policy 0, policy_version 5566 (0.0030) [2024-04-25 22:21:48,739][47288] Updated weights for policy 0, policy_version 5576 (0.0027) [2024-04-25 22:21:48,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56524.9, 300 sec: 56094.4). Total num frames: 91357184. Throughput: 0: 56512.6. Samples: 40720640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:21:48,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:21:51,816][47288] Updated weights for policy 0, policy_version 5586 (0.0035) [2024-04-25 22:21:53,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 91619328. Throughput: 0: 56369.0. Samples: 41049940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:21:53,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:21:53,929][47267] Signal inference workers to stop experience collection... 
(600 times) [2024-04-25 22:21:53,930][47267] Signal inference workers to resume experience collection... (600 times) [2024-04-25 22:21:53,962][47288] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-04-25 22:21:53,962][47288] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-04-25 22:21:54,634][47288] Updated weights for policy 0, policy_version 5596 (0.0030) [2024-04-25 22:21:57,691][47288] Updated weights for policy 0, policy_version 5606 (0.0027) [2024-04-25 22:21:58,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 91897856. Throughput: 0: 56266.7. Samples: 41223280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:21:58,923][47056] Avg episode reward: [(0, '0.000')] [2024-04-25 22:22:00,370][47288] Updated weights for policy 0, policy_version 5616 (0.0027) [2024-04-25 22:22:03,579][47288] Updated weights for policy 0, policy_version 5626 (0.0028) [2024-04-25 22:22:03,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 92176384. Throughput: 0: 56146.2. Samples: 41558240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-25 22:22:03,932][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:22:06,259][47288] Updated weights for policy 0, policy_version 5636 (0.0033) [2024-04-25 22:22:08,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56251.6, 300 sec: 56094.3). Total num frames: 92487680. Throughput: 0: 56146.7. Samples: 41892800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-25 22:22:08,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:22:09,192][47288] Updated weights for policy 0, policy_version 5646 (0.0025) [2024-04-25 22:22:12,205][47288] Updated weights for policy 0, policy_version 5656 (0.0029) [2024-04-25 22:22:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 92749824. Throughput: 0: 56263.5. Samples: 42062680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 22:22:13,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:22:14,942][47288] Updated weights for policy 0, policy_version 5666 (0.0027) [2024-04-25 22:22:18,002][47288] Updated weights for policy 0, policy_version 5676 (0.0039) [2024-04-25 22:22:18,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 93011968. Throughput: 0: 56054.0. Samples: 42396500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-25 22:22:18,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:22:21,241][47288] Updated weights for policy 0, policy_version 5686 (0.0029) [2024-04-25 22:22:23,677][47288] Updated weights for policy 0, policy_version 5696 (0.0031) [2024-04-25 22:22:23,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 93323264. Throughput: 0: 56057.0. Samples: 42738160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 26.0) [2024-04-25 22:22:23,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:22:26,939][47288] Updated weights for policy 0, policy_version 5706 (0.0032) [2024-04-25 22:22:28,923][47056] Fps is (10 sec: 60620.3, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 93618176. Throughput: 0: 56108.7. Samples: 42910280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:22:28,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:22:29,593][47288] Updated weights for policy 0, policy_version 5716 (0.0026) [2024-04-25 22:22:32,710][47288] Updated weights for policy 0, policy_version 5726 (0.0029) [2024-04-25 22:22:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 93880320. Throughput: 0: 56156.0. Samples: 43247660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 22:22:33,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:22:35,512][47288] Updated weights for policy 0, policy_version 5736 (0.0027) [2024-04-25 22:22:38,572][47288] Updated weights for policy 0, policy_version 5746 (0.0028) [2024-04-25 22:22:38,923][47056] Fps is (10 sec: 52429.7, 60 sec: 55978.7, 300 sec: 56038.9). Total num frames: 94142464. Throughput: 0: 56302.5. Samples: 43583560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-25 22:22:38,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:22:41,382][47288] Updated weights for policy 0, policy_version 5756 (0.0029) [2024-04-25 22:22:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 56038.9). Total num frames: 94437376. Throughput: 0: 56171.1. Samples: 43750980. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-25 22:22:43,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:22:44,369][47288] Updated weights for policy 0, policy_version 5766 (0.0028) [2024-04-25 22:22:47,226][47288] Updated weights for policy 0, policy_version 5776 (0.0026) [2024-04-25 22:22:48,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56251.5, 300 sec: 56094.3). Total num frames: 94732288. Throughput: 0: 56243.0. Samples: 44089180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-25 22:22:48,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:22:49,015][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000005783_94748672.pth... [2024-04-25 22:22:49,060][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000004959_81248256.pth [2024-04-25 22:22:50,068][47288] Updated weights for policy 0, policy_version 5786 (0.0026) [2024-04-25 22:22:53,066][47288] Updated weights for policy 0, policy_version 5796 (0.0034) [2024-04-25 22:22:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 94978048. 
Throughput: 0: 56374.3. Samples: 44429640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-25 22:22:53,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:22:55,736][47288] Updated weights for policy 0, policy_version 5806 (0.0028) [2024-04-25 22:22:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 95272960. Throughput: 0: 56335.2. Samples: 44597760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-04-25 22:22:58,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:22:58,939][47288] Updated weights for policy 0, policy_version 5816 (0.0026) [2024-04-25 22:23:01,570][47267] Signal inference workers to stop experience collection... (650 times) [2024-04-25 22:23:01,618][47288] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-04-25 22:23:01,618][47267] Signal inference workers to resume experience collection... (650 times) [2024-04-25 22:23:01,632][47288] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-04-25 22:23:01,732][47288] Updated weights for policy 0, policy_version 5826 (0.0034) [2024-04-25 22:23:03,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 95567872. Throughput: 0: 56321.5. Samples: 44930960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-25 22:23:03,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:23:04,932][47288] Updated weights for policy 0, policy_version 5836 (0.0030) [2024-04-25 22:23:07,580][47288] Updated weights for policy 0, policy_version 5846 (0.0031) [2024-04-25 22:23:08,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 56149.9). Total num frames: 95846400. Throughput: 0: 56213.9. Samples: 45267780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-25 22:23:08,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:23:10,716][47288] Updated weights for policy 0, policy_version 5856 (0.0028) [2024-04-25 22:23:13,477][47288] Updated weights for policy 0, policy_version 5866 (0.0028) [2024-04-25 22:23:13,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.8, 300 sec: 56205.4). Total num frames: 96141312. Throughput: 0: 56241.0. Samples: 45441120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 22:23:13,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:23:16,518][47288] Updated weights for policy 0, policy_version 5876 (0.0026) [2024-04-25 22:23:18,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.9, 300 sec: 56094.4). Total num frames: 96403456. Throughput: 0: 56052.0. Samples: 45770000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:23:18,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:23:19,258][47288] Updated weights for policy 0, policy_version 5886 (0.0034) [2024-04-25 22:23:22,502][47288] Updated weights for policy 0, policy_version 5896 (0.0029) [2024-04-25 22:23:23,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 56038.8). Total num frames: 96665600. Throughput: 0: 56091.1. Samples: 46107660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 22:23:23,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:23:24,970][47288] Updated weights for policy 0, policy_version 5906 (0.0026) [2024-04-25 22:23:28,475][47288] Updated weights for policy 0, policy_version 5916 (0.0029) [2024-04-25 22:23:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.8, 300 sec: 56094.4). Total num frames: 96944128. Throughput: 0: 55991.7. Samples: 46270600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 22:23:28,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:23:28,931][47267] Saving new best policy, reward=0.004! 
[2024-04-25 22:23:30,825][47288] Updated weights for policy 0, policy_version 5926 (0.0025) [2024-04-25 22:23:33,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 56094.4). Total num frames: 97222656. Throughput: 0: 55927.3. Samples: 46605900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 22:23:33,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:23:34,148][47288] Updated weights for policy 0, policy_version 5936 (0.0028) [2024-04-25 22:23:36,723][47288] Updated weights for policy 0, policy_version 5946 (0.0030) [2024-04-25 22:23:38,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 97517568. Throughput: 0: 55899.9. Samples: 46945140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 22:23:38,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:23:40,006][47288] Updated weights for policy 0, policy_version 5956 (0.0028) [2024-04-25 22:23:42,424][47288] Updated weights for policy 0, policy_version 5966 (0.0039) [2024-04-25 22:23:43,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 97796096. Throughput: 0: 55862.7. Samples: 47111580. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-25 22:23:43,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:23:46,013][47288] Updated weights for policy 0, policy_version 5976 (0.0026) [2024-04-25 22:23:48,208][47288] Updated weights for policy 0, policy_version 5986 (0.0027) [2024-04-25 22:23:48,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 56149.9). Total num frames: 98091008. Throughput: 0: 55910.6. Samples: 47446940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:23:48,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:23:52,018][47288] Updated weights for policy 0, policy_version 5996 (0.0029) [2024-04-25 22:23:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 98369536. 
Throughput: 0: 55860.9. Samples: 47781520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:23:53,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:23:54,214][47288] Updated weights for policy 0, policy_version 6006 (0.0028) [2024-04-25 22:23:57,863][47288] Updated weights for policy 0, policy_version 6016 (0.0032) [2024-04-25 22:23:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55978.8, 300 sec: 56094.4). Total num frames: 98631680. Throughput: 0: 55792.7. Samples: 47951780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:23:58,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:24:00,195][47288] Updated weights for policy 0, policy_version 6026 (0.0034) [2024-04-25 22:24:03,665][47288] Updated weights for policy 0, policy_version 6036 (0.0031) [2024-04-25 22:24:03,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55432.5, 300 sec: 56038.9). Total num frames: 98893824. Throughput: 0: 55956.0. Samples: 48288020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:24:03,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:24:05,956][47288] Updated weights for policy 0, policy_version 6046 (0.0028) [2024-04-25 22:24:08,725][47267] Signal inference workers to stop experience collection... (700 times) [2024-04-25 22:24:08,725][47267] Signal inference workers to resume experience collection... (700 times) [2024-04-25 22:24:08,737][47288] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-04-25 22:24:08,738][47288] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-04-25 22:24:08,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.5, 300 sec: 56094.4). Total num frames: 99188736. Throughput: 0: 55983.1. Samples: 48626900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:24:08,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:24:09,593][47288] Updated weights for policy 0, policy_version 6056 (0.0040) [2024-04-25 22:24:11,799][47288] Updated weights for policy 0, policy_version 6066 (0.0030) [2024-04-25 22:24:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 99483648. Throughput: 0: 55862.6. Samples: 48784420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:24:13,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:24:15,443][47288] Updated weights for policy 0, policy_version 6076 (0.0033) [2024-04-25 22:24:17,447][47288] Updated weights for policy 0, policy_version 6086 (0.0026) [2024-04-25 22:24:18,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 99778560. Throughput: 0: 55965.8. Samples: 49124360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 22:24:18,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:24:21,317][47288] Updated weights for policy 0, policy_version 6096 (0.0027) [2024-04-25 22:24:23,219][47288] Updated weights for policy 0, policy_version 6106 (0.0030) [2024-04-25 22:24:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.9, 300 sec: 56150.0). Total num frames: 100040704. Throughput: 0: 55923.8. Samples: 49461700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-25 22:24:23,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:24:27,093][47288] Updated weights for policy 0, policy_version 6116 (0.0030) [2024-04-25 22:24:28,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.7, 300 sec: 56261.0). Total num frames: 100352000. Throughput: 0: 56159.4. Samples: 49638760. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-25 22:24:28,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:24:29,097][47288] Updated weights for policy 0, policy_version 6126 (0.0029) [2024-04-25 22:24:32,989][47288] Updated weights for policy 0, policy_version 6136 (0.0028) [2024-04-25 22:24:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 100614144. Throughput: 0: 56234.3. Samples: 49977480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:24:33,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:24:35,060][47288] Updated weights for policy 0, policy_version 6146 (0.0029) [2024-04-25 22:24:38,755][47288] Updated weights for policy 0, policy_version 6156 (0.0032) [2024-04-25 22:24:38,923][47056] Fps is (10 sec: 50791.1, 60 sec: 55705.7, 300 sec: 56094.5). Total num frames: 100859904. Throughput: 0: 56401.7. Samples: 50319600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:24:38,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:24:40,764][47288] Updated weights for policy 0, policy_version 6166 (0.0027) [2024-04-25 22:24:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 101154816. Throughput: 0: 56219.9. Samples: 50481680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 22:24:43,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:24:44,596][47288] Updated weights for policy 0, policy_version 6176 (0.0029) [2024-04-25 22:24:47,036][47288] Updated weights for policy 0, policy_version 6186 (0.0029) [2024-04-25 22:24:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 56149.9). Total num frames: 101433344. Throughput: 0: 56140.5. Samples: 50814340. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 22:24:48,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:24:48,962][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000006192_101449728.pth... [2024-04-25 22:24:49,009][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000005369_87965696.pth [2024-04-25 22:24:50,367][47288] Updated weights for policy 0, policy_version 6196 (0.0029) [2024-04-25 22:24:52,884][47288] Updated weights for policy 0, policy_version 6206 (0.0030) [2024-04-25 22:24:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 56205.5). Total num frames: 101728256. Throughput: 0: 56079.7. Samples: 51150480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 22:24:53,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:24:56,059][47288] Updated weights for policy 0, policy_version 6216 (0.0030) [2024-04-25 22:24:58,541][47288] Updated weights for policy 0, policy_version 6226 (0.0035) [2024-04-25 22:24:58,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.5, 300 sec: 56149.9). Total num frames: 102006784. Throughput: 0: 56345.6. Samples: 51319980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-25 22:24:58,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:25:01,931][47267] Signal inference workers to stop experience collection... (750 times) [2024-04-25 22:25:01,931][47267] Signal inference workers to resume experience collection... (750 times) [2024-04-25 22:25:01,958][47288] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-04-25 22:25:01,959][47288] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-04-25 22:25:02,040][47288] Updated weights for policy 0, policy_version 6236 (0.0031) [2024-04-25 22:25:03,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 102301696. Throughput: 0: 56169.7. Samples: 51652000. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:25:03,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:25:04,309][47288] Updated weights for policy 0, policy_version 6246 (0.0033) [2024-04-25 22:25:07,866][47288] Updated weights for policy 0, policy_version 6256 (0.0028) [2024-04-25 22:25:08,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 102580224. Throughput: 0: 56158.1. Samples: 51988820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-25 22:25:08,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:25:10,361][47288] Updated weights for policy 0, policy_version 6266 (0.0031) [2024-04-25 22:25:13,779][47288] Updated weights for policy 0, policy_version 6276 (0.0027) [2024-04-25 22:25:13,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 102825984. Throughput: 0: 55956.7. Samples: 52156800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-25 22:25:13,923][47056] Avg episode reward: [(0, '0.001')] [2024-04-25 22:25:16,325][47288] Updated weights for policy 0, policy_version 6286 (0.0032) [2024-04-25 22:25:18,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 56094.4). Total num frames: 103104512. Throughput: 0: 55947.4. Samples: 52495120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-25 22:25:18,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:25:19,475][47288] Updated weights for policy 0, policy_version 6296 (0.0032) [2024-04-25 22:25:22,023][47288] Updated weights for policy 0, policy_version 6306 (0.0029) [2024-04-25 22:25:23,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 103383040. Throughput: 0: 55775.1. Samples: 52829480. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:25:23,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:25:25,235][47288] Updated weights for policy 0, policy_version 6316 (0.0032) [2024-04-25 22:25:28,014][47288] Updated weights for policy 0, policy_version 6326 (0.0027) [2024-04-25 22:25:28,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 103694336. Throughput: 0: 55884.1. Samples: 52996460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:25:28,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:25:30,963][47288] Updated weights for policy 0, policy_version 6336 (0.0030) [2024-04-25 22:25:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 103956480. Throughput: 0: 55986.7. Samples: 53333740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:25:33,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:25:33,937][47288] Updated weights for policy 0, policy_version 6346 (0.0034) [2024-04-25 22:25:36,868][47288] Updated weights for policy 0, policy_version 6356 (0.0033) [2024-04-25 22:25:38,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 104267776. Throughput: 0: 56079.4. Samples: 53674060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 22:25:38,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:25:39,729][47288] Updated weights for policy 0, policy_version 6366 (0.0028) [2024-04-25 22:25:42,536][47288] Updated weights for policy 0, policy_version 6376 (0.0036) [2024-04-25 22:25:43,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 104562688. Throughput: 0: 56196.2. Samples: 53848800. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 22:25:43,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:25:45,414][47288] Updated weights for policy 0, policy_version 6386 (0.0033) [2024-04-25 22:25:48,380][47288] Updated weights for policy 0, policy_version 6396 (0.0028) [2024-04-25 22:25:48,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 104824832. Throughput: 0: 56290.8. Samples: 54185080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:25:48,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:25:51,138][47288] Updated weights for policy 0, policy_version 6406 (0.0030) [2024-04-25 22:25:53,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55705.5, 300 sec: 56094.4). Total num frames: 105070592. Throughput: 0: 56475.1. Samples: 54530200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:25:53,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:25:54,264][47288] Updated weights for policy 0, policy_version 6416 (0.0031) [2024-04-25 22:25:57,025][47288] Updated weights for policy 0, policy_version 6426 (0.0026) [2024-04-25 22:25:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.8, 300 sec: 56094.4). Total num frames: 105365504. Throughput: 0: 56236.8. Samples: 54687460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-25 22:25:58,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:26:00,151][47288] Updated weights for policy 0, policy_version 6436 (0.0028) [2024-04-25 22:26:02,836][47288] Updated weights for policy 0, policy_version 6446 (0.0026) [2024-04-25 22:26:03,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 105660416. Throughput: 0: 56271.6. Samples: 55027340. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-04-25 22:26:03,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:26:05,573][47267] Signal inference workers to stop experience collection... 
(800 times) [2024-04-25 22:26:05,574][47267] Signal inference workers to resume experience collection... (800 times) [2024-04-25 22:26:05,598][47288] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-04-25 22:26:05,598][47288] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-04-25 22:26:06,028][47288] Updated weights for policy 0, policy_version 6456 (0.0030) [2024-04-25 22:26:08,492][47288] Updated weights for policy 0, policy_version 6466 (0.0032) [2024-04-25 22:26:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 105938944. Throughput: 0: 56264.8. Samples: 55361400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-25 22:26:08,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:26:11,805][47288] Updated weights for policy 0, policy_version 6476 (0.0027) [2024-04-25 22:26:13,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.7, 300 sec: 56149.9). Total num frames: 106217472. Throughput: 0: 56432.9. Samples: 55535940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-25 22:26:13,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:26:14,381][47288] Updated weights for policy 0, policy_version 6486 (0.0030) [2024-04-25 22:26:17,531][47288] Updated weights for policy 0, policy_version 6496 (0.0032) [2024-04-25 22:26:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56797.9, 300 sec: 56205.5). Total num frames: 106512384. Throughput: 0: 56412.3. Samples: 55872300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-25 22:26:18,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:26:20,264][47288] Updated weights for policy 0, policy_version 6506 (0.0039) [2024-04-25 22:26:23,478][47288] Updated weights for policy 0, policy_version 6516 (0.0027) [2024-04-25 22:26:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.9, 300 sec: 56149.9). Total num frames: 106790912. Throughput: 0: 56321.4. Samples: 56208520. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-25 22:26:23,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:26:26,323][47288] Updated weights for policy 0, policy_version 6526 (0.0027) [2024-04-25 22:26:28,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.6, 300 sec: 56038.8). Total num frames: 107036672. Throughput: 0: 56087.1. Samples: 56372720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-25 22:26:28,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:26:29,263][47288] Updated weights for policy 0, policy_version 6536 (0.0031) [2024-04-25 22:26:31,957][47288] Updated weights for policy 0, policy_version 6546 (0.0030) [2024-04-25 22:26:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 56094.4). Total num frames: 107331584. Throughput: 0: 56231.9. Samples: 56715520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-04-25 22:26:33,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:26:34,970][47288] Updated weights for policy 0, policy_version 6556 (0.0032) [2024-04-25 22:26:37,783][47288] Updated weights for policy 0, policy_version 6566 (0.0034) [2024-04-25 22:26:38,923][47056] Fps is (10 sec: 57342.8, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 107610112. Throughput: 0: 56015.0. Samples: 57050880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:26:38,923][47056] Avg episode reward: [(0, '0.006')] [2024-04-25 22:26:39,023][47267] Saving new best policy, reward=0.006! [2024-04-25 22:26:40,905][47288] Updated weights for policy 0, policy_version 6576 (0.0024) [2024-04-25 22:26:43,620][47288] Updated weights for policy 0, policy_version 6586 (0.0031) [2024-04-25 22:26:43,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 56094.3). Total num frames: 107905024. Throughput: 0: 55993.2. Samples: 57207160. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-25 22:26:43,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:26:46,791][47288] Updated weights for policy 0, policy_version 6596 (0.0036) [2024-04-25 22:26:47,299][47267] Signal inference workers to stop experience collection... (850 times) [2024-04-25 22:26:47,300][47267] Signal inference workers to resume experience collection... (850 times) [2024-04-25 22:26:47,316][47288] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-04-25 22:26:47,316][47288] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-04-25 22:26:48,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 56149.9). Total num frames: 108183552. Throughput: 0: 55784.7. Samples: 57537660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-25 22:26:48,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:26:48,930][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000006603_108183552.pth... [2024-04-25 22:26:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000005783_94748672.pth [2024-04-25 22:26:49,487][47288] Updated weights for policy 0, policy_version 6606 (0.0030) [2024-04-25 22:26:52,622][47288] Updated weights for policy 0, policy_version 6616 (0.0035) [2024-04-25 22:26:53,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56797.9, 300 sec: 56205.5). Total num frames: 108478464. Throughput: 0: 55863.1. Samples: 57875240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-25 22:26:53,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:26:55,324][47288] Updated weights for policy 0, policy_version 6626 (0.0036) [2024-04-25 22:26:58,424][47288] Updated weights for policy 0, policy_version 6636 (0.0035) [2024-04-25 22:26:58,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 108773376. Throughput: 0: 55909.7. Samples: 58051880. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-25 22:26:58,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:27:01,772][47288] Updated weights for policy 0, policy_version 6646 (0.0028) [2024-04-25 22:27:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 56038.9). Total num frames: 109019136. Throughput: 0: 55950.4. Samples: 58390060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 22:27:03,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:27:04,386][47288] Updated weights for policy 0, policy_version 6656 (0.0031) [2024-04-25 22:27:07,502][47288] Updated weights for policy 0, policy_version 6666 (0.0030) [2024-04-25 22:27:08,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 109297664. Throughput: 0: 55988.8. Samples: 58728020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 22:27:08,923][47056] Avg episode reward: [(0, '0.002')] [2024-04-25 22:27:10,118][47288] Updated weights for policy 0, policy_version 6676 (0.0030) [2024-04-25 22:27:13,401][47288] Updated weights for policy 0, policy_version 6686 (0.0028) [2024-04-25 22:27:13,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 109559808. Throughput: 0: 55885.3. Samples: 58887560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 22:27:13,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:27:15,910][47288] Updated weights for policy 0, policy_version 6696 (0.0030) [2024-04-25 22:27:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 109838336. Throughput: 0: 55830.3. Samples: 59227880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:27:18,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:27:19,052][47288] Updated weights for policy 0, policy_version 6706 (0.0027) [2024-04-25 22:27:21,726][47288] Updated weights for policy 0, policy_version 6716 (0.0036) [2024-04-25 22:27:23,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 56038.9). Total num frames: 110149632. Throughput: 0: 55902.8. Samples: 59566500. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-25 22:27:23,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:27:24,794][47288] Updated weights for policy 0, policy_version 6726 (0.0026) [2024-04-25 22:27:27,543][47288] Updated weights for policy 0, policy_version 6736 (0.0026) [2024-04-25 22:27:28,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.7, 300 sec: 56094.4). Total num frames: 110428160. Throughput: 0: 56394.7. Samples: 59744920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-25 22:27:28,923][47056] Avg episode reward: [(0, '0.008')] [2024-04-25 22:27:28,978][47267] Saving new best policy, reward=0.008! [2024-04-25 22:27:30,528][47288] Updated weights for policy 0, policy_version 6746 (0.0033) [2024-04-25 22:27:33,308][47288] Updated weights for policy 0, policy_version 6756 (0.0030) [2024-04-25 22:27:33,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 110739456. Throughput: 0: 56497.9. Samples: 60080060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-25 22:27:33,923][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:27:36,381][47288] Updated weights for policy 0, policy_version 6766 (0.0027) [2024-04-25 22:27:38,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 110985216. Throughput: 0: 56460.8. Samples: 60415980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-25 22:27:38,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:27:39,098][47288] Updated weights for policy 0, policy_version 6776 (0.0028) [2024-04-25 22:27:42,256][47288] Updated weights for policy 0, policy_version 6786 (0.0031) [2024-04-25 22:27:43,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55978.8, 300 sec: 56038.9). Total num frames: 111263744. Throughput: 0: 56318.7. Samples: 60586220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 22:27:43,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:27:44,976][47288] Updated weights for policy 0, policy_version 6796 (0.0033) [2024-04-25 22:27:47,593][47267] Signal inference workers to stop experience collection... (900 times) [2024-04-25 22:27:47,599][47267] Signal inference workers to resume experience collection... (900 times) [2024-04-25 22:27:47,619][47288] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-04-25 22:27:47,619][47288] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-04-25 22:27:48,218][47288] Updated weights for policy 0, policy_version 6806 (0.0027) [2024-04-25 22:27:48,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 56094.3). Total num frames: 111525888. Throughput: 0: 56258.0. Samples: 60921680. Policy #0 lag: (min: 2.0, avg: 11.7, max: 22.0) [2024-04-25 22:27:48,923][47056] Avg episode reward: [(0, '0.003')] [2024-04-25 22:27:50,746][47288] Updated weights for policy 0, policy_version 6816 (0.0030) [2024-04-25 22:27:53,861][47288] Updated weights for policy 0, policy_version 6826 (0.0036) [2024-04-25 22:27:53,923][47056] Fps is (10 sec: 57342.7, 60 sec: 55978.5, 300 sec: 56149.9). Total num frames: 111837184. Throughput: 0: 56241.6. Samples: 61258900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-25 22:27:53,923][47056] Avg episode reward: [(0, '0.009')] [2024-04-25 22:27:53,924][47267] Saving new best policy, reward=0.009! 
[2024-04-25 22:27:56,458][47288] Updated weights for policy 0, policy_version 6836 (0.0032) [2024-04-25 22:27:58,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55432.5, 300 sec: 56038.8). Total num frames: 112099328. Throughput: 0: 56252.4. Samples: 61418920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 22:27:58,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:27:59,760][47288] Updated weights for policy 0, policy_version 6846 (0.0031) [2024-04-25 22:28:02,174][47288] Updated weights for policy 0, policy_version 6856 (0.0026) [2024-04-25 22:28:03,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.6, 300 sec: 56094.4). Total num frames: 112394240. Throughput: 0: 56141.2. Samples: 61754240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 22:28:03,923][47056] Avg episode reward: [(0, '0.006')] [2024-04-25 22:28:05,704][47288] Updated weights for policy 0, policy_version 6866 (0.0036) [2024-04-25 22:28:08,200][47288] Updated weights for policy 0, policy_version 6876 (0.0034) [2024-04-25 22:28:08,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56524.9, 300 sec: 56094.4). Total num frames: 112689152. Throughput: 0: 56055.2. Samples: 62088980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 22:28:08,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:28:11,424][47288] Updated weights for policy 0, policy_version 6886 (0.0026) [2024-04-25 22:28:13,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.7, 300 sec: 56149.9). Total num frames: 112967680. Throughput: 0: 56003.4. Samples: 62265080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:28:13,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:28:14,017][47288] Updated weights for policy 0, policy_version 6896 (0.0026) [2024-04-25 22:28:17,113][47288] Updated weights for policy 0, policy_version 6906 (0.0031) [2024-04-25 22:28:18,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56524.7, 300 sec: 56149.9). Total num frames: 113229824. 
Throughput: 0: 56097.3. Samples: 62604440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-25 22:28:18,923][47056] Avg episode reward: [(0, '0.008')] [2024-04-25 22:28:19,921][47288] Updated weights for policy 0, policy_version 6916 (0.0033) [2024-04-25 22:28:23,070][47288] Updated weights for policy 0, policy_version 6926 (0.0031) [2024-04-25 22:28:23,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.6, 300 sec: 56094.3). Total num frames: 113491968. Throughput: 0: 56108.0. Samples: 62940840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-25 22:28:23,923][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:28:25,642][47288] Updated weights for policy 0, policy_version 6936 (0.0032) [2024-04-25 22:28:28,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 56149.9). Total num frames: 113786880. Throughput: 0: 55910.2. Samples: 63102180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-25 22:28:28,923][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:28:28,993][47288] Updated weights for policy 0, policy_version 6946 (0.0034) [2024-04-25 22:28:31,489][47288] Updated weights for policy 0, policy_version 6956 (0.0034) [2024-04-25 22:28:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 56094.4). Total num frames: 114065408. Throughput: 0: 56006.4. Samples: 63441960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 22:28:33,923][47056] Avg episode reward: [(0, '0.006')] [2024-04-25 22:28:34,778][47288] Updated weights for policy 0, policy_version 6966 (0.0030) [2024-04-25 22:28:37,205][47288] Updated weights for policy 0, policy_version 6976 (0.0029) [2024-04-25 22:28:38,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56251.6, 300 sec: 56149.9). Total num frames: 114360320. Throughput: 0: 56130.3. Samples: 63784760. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-25 22:28:38,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:28:40,881][47288] Updated weights for policy 0, policy_version 6986 (0.0034) [2024-04-25 22:28:41,754][47267] Signal inference workers to stop experience collection... (950 times) [2024-04-25 22:28:41,754][47267] Signal inference workers to resume experience collection... (950 times) [2024-04-25 22:28:41,764][47288] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-04-25 22:28:41,764][47288] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-04-25 22:28:43,043][47288] Updated weights for policy 0, policy_version 6996 (0.0027) [2024-04-25 22:28:43,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.7, 300 sec: 56149.9). Total num frames: 114655232. Throughput: 0: 56448.0. Samples: 63959080. Policy #0 lag: (min: 2.0, avg: 11.1, max: 24.0) [2024-04-25 22:28:43,923][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:28:46,644][47288] Updated weights for policy 0, policy_version 7006 (0.0027) [2024-04-25 22:28:48,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56798.0, 300 sec: 56149.9). Total num frames: 114933760. Throughput: 0: 56543.6. Samples: 64298700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 22:28:48,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:28:48,987][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000007016_114950144.pth... [2024-04-25 22:28:48,990][47288] Updated weights for policy 0, policy_version 7016 (0.0027) [2024-04-25 22:28:49,035][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000006192_101449728.pth [2024-04-25 22:28:52,297][47288] Updated weights for policy 0, policy_version 7026 (0.0028) [2024-04-25 22:28:53,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.9, 300 sec: 56149.9). Total num frames: 115195904. Throughput: 0: 56581.4. Samples: 64635140. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 22:28:53,923][47056] Avg episode reward: [(0, '0.008')] [2024-04-25 22:28:54,857][47288] Updated weights for policy 0, policy_version 7036 (0.0034) [2024-04-25 22:28:58,103][47288] Updated weights for policy 0, policy_version 7046 (0.0028) [2024-04-25 22:28:58,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 115474432. Throughput: 0: 56269.9. Samples: 64797220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-25 22:28:58,923][47056] Avg episode reward: [(0, '0.009')] [2024-04-25 22:29:00,780][47288] Updated weights for policy 0, policy_version 7056 (0.0026) [2024-04-25 22:29:03,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.8, 300 sec: 56149.9). Total num frames: 115752960. Throughput: 0: 56183.7. Samples: 65132700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-25 22:29:03,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:29:03,936][47288] Updated weights for policy 0, policy_version 7066 (0.0031) [2024-04-25 22:29:06,725][47288] Updated weights for policy 0, policy_version 7076 (0.0031) [2024-04-25 22:29:08,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 116047872. Throughput: 0: 56216.5. Samples: 65470580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-04-25 22:29:08,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:29:09,601][47288] Updated weights for policy 0, policy_version 7086 (0.0037) [2024-04-25 22:29:12,531][47288] Updated weights for policy 0, policy_version 7096 (0.0035) [2024-04-25 22:29:13,922][47056] Fps is (10 sec: 57344.4, 60 sec: 55979.0, 300 sec: 56094.4). Total num frames: 116326400. Throughput: 0: 56351.3. Samples: 65637980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-04-25 22:29:13,923][47056] Avg episode reward: [(0, '0.009')] [2024-04-25 22:29:15,401][47288] Updated weights for policy 0, policy_version 7106 (0.0031) [2024-04-25 22:29:18,436][47288] Updated weights for policy 0, policy_version 7116 (0.0029) [2024-04-25 22:29:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 116604928. Throughput: 0: 56156.4. Samples: 65969000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-25 22:29:18,932][47056] Avg episode reward: [(0, '0.008')] [2024-04-25 22:29:21,577][47288] Updated weights for policy 0, policy_version 7126 (0.0025) [2024-04-25 22:29:23,923][47056] Fps is (10 sec: 55704.5, 60 sec: 56524.8, 300 sec: 56038.8). Total num frames: 116883456. Throughput: 0: 56038.3. Samples: 66306480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-25 22:29:23,932][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:29:24,374][47288] Updated weights for policy 0, policy_version 7136 (0.0025) [2024-04-25 22:29:27,551][47288] Updated weights for policy 0, policy_version 7146 (0.0037) [2024-04-25 22:29:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56094.4). Total num frames: 117161984. Throughput: 0: 55977.4. Samples: 66478060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 22:29:28,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:29:30,207][47288] Updated weights for policy 0, policy_version 7156 (0.0029) [2024-04-25 22:29:33,600][47288] Updated weights for policy 0, policy_version 7166 (0.0032) [2024-04-25 22:29:33,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 117407744. Throughput: 0: 55884.0. Samples: 66813480. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:29:33,923][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:29:36,029][47288] Updated weights for policy 0, policy_version 7176 (0.0028) [2024-04-25 22:29:38,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 117719040. Throughput: 0: 55776.6. Samples: 67145100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:29:38,923][47056] Avg episode reward: [(0, '0.006')] [2024-04-25 22:29:39,285][47288] Updated weights for policy 0, policy_version 7186 (0.0030) [2024-04-25 22:29:41,153][47267] Signal inference workers to stop experience collection... (1000 times) [2024-04-25 22:29:41,194][47288] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-04-25 22:29:41,201][47267] Signal inference workers to resume experience collection... (1000 times) [2024-04-25 22:29:41,208][47288] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-04-25 22:29:41,828][47288] Updated weights for policy 0, policy_version 7196 (0.0029) [2024-04-25 22:29:43,923][47056] Fps is (10 sec: 58980.9, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 117997568. Throughput: 0: 55825.5. Samples: 67309380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:29:43,932][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:29:45,224][47288] Updated weights for policy 0, policy_version 7206 (0.0031) [2024-04-25 22:29:47,975][47288] Updated weights for policy 0, policy_version 7216 (0.0028) [2024-04-25 22:29:48,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 118292480. Throughput: 0: 55920.8. Samples: 67649140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:29:48,923][47056] Avg episode reward: [(0, '0.009')] [2024-04-25 22:29:51,057][47288] Updated weights for policy 0, policy_version 7226 (0.0040) [2024-04-25 22:29:53,796][47288] Updated weights for policy 0, policy_version 7236 (0.0026) [2024-04-25 22:29:53,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 118554624. Throughput: 0: 55956.0. Samples: 67988600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-25 22:29:53,923][47056] Avg episode reward: [(0, '0.006')] [2024-04-25 22:29:56,725][47288] Updated weights for policy 0, policy_version 7246 (0.0031) [2024-04-25 22:29:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 118849536. Throughput: 0: 55879.9. Samples: 68152580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-25 22:29:58,923][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:29:59,514][47288] Updated weights for policy 0, policy_version 7256 (0.0026) [2024-04-25 22:30:02,430][47288] Updated weights for policy 0, policy_version 7266 (0.0032) [2024-04-25 22:30:03,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 119144448. Throughput: 0: 55991.7. Samples: 68488620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:30:03,923][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:30:05,481][47288] Updated weights for policy 0, policy_version 7276 (0.0035) [2024-04-25 22:30:08,433][47288] Updated weights for policy 0, policy_version 7286 (0.0027) [2024-04-25 22:30:08,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 56149.9). Total num frames: 119390208. Throughput: 0: 56038.8. Samples: 68828220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-25 22:30:08,923][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:30:11,240][47288] Updated weights for policy 0, policy_version 7296 (0.0024) [2024-04-25 22:30:13,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 119668736. Throughput: 0: 55863.6. Samples: 68991920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 22:30:13,923][47056] Avg episode reward: [(0, '0.008')] [2024-04-25 22:30:14,435][47288] Updated weights for policy 0, policy_version 7306 (0.0031) [2024-04-25 22:30:16,918][47288] Updated weights for policy 0, policy_version 7316 (0.0029) [2024-04-25 22:30:18,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 119963648. Throughput: 0: 55893.8. Samples: 69328700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 22:30:18,923][47056] Avg episode reward: [(0, '0.014')] [2024-04-25 22:30:18,939][47267] Saving new best policy, reward=0.014! [2024-04-25 22:30:20,224][47288] Updated weights for policy 0, policy_version 7326 (0.0030) [2024-04-25 22:30:22,845][47288] Updated weights for policy 0, policy_version 7336 (0.0028) [2024-04-25 22:30:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 120242176. Throughput: 0: 56100.2. Samples: 69669600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-25 22:30:23,923][47056] Avg episode reward: [(0, '0.007')] [2024-04-25 22:30:25,950][47288] Updated weights for policy 0, policy_version 7346 (0.0031) [2024-04-25 22:30:28,633][47288] Updated weights for policy 0, policy_version 7356 (0.0030) [2024-04-25 22:30:28,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 120537088. Throughput: 0: 56291.3. Samples: 69842480. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:30:28,923][47056] Avg episode reward: [(0, '0.010')] [2024-04-25 22:30:31,587][47288] Updated weights for policy 0, policy_version 7366 (0.0034) [2024-04-25 22:30:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.8, 300 sec: 56094.4). Total num frames: 120815616. Throughput: 0: 56243.9. Samples: 70180120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 22:30:33,923][47056] Avg episode reward: [(0, '0.006')] [2024-04-25 22:30:34,363][47288] Updated weights for policy 0, policy_version 7376 (0.0031) [2024-04-25 22:30:37,466][47288] Updated weights for policy 0, policy_version 7386 (0.0029) [2024-04-25 22:30:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.9, 300 sec: 56038.8). Total num frames: 121094144. Throughput: 0: 56227.6. Samples: 70518840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 22:30:38,923][47056] Avg episode reward: [(0, '0.012')] [2024-04-25 22:30:40,244][47288] Updated weights for policy 0, policy_version 7396 (0.0030) [2024-04-25 22:30:43,216][47288] Updated weights for policy 0, policy_version 7406 (0.0028) [2024-04-25 22:30:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.9, 300 sec: 56094.4). Total num frames: 121372672. Throughput: 0: 56331.9. Samples: 70687520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 22:30:43,923][47056] Avg episode reward: [(0, '0.010')] [2024-04-25 22:30:46,013][47288] Updated weights for policy 0, policy_version 7416 (0.0028) [2024-04-25 22:30:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 121651200. Throughput: 0: 56343.5. Samples: 71024080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 22:30:48,923][47056] Avg episode reward: [(0, '0.010')] [2024-04-25 22:30:48,990][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000007426_121667584.pth... 
[2024-04-25 22:30:48,996][47288] Updated weights for policy 0, policy_version 7426 (0.0025) [2024-04-25 22:30:49,036][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000006603_108183552.pth [2024-04-25 22:30:51,848][47288] Updated weights for policy 0, policy_version 7436 (0.0026) [2024-04-25 22:30:53,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 121929728. Throughput: 0: 56397.8. Samples: 71366120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:30:53,923][47056] Avg episode reward: [(0, '0.005')] [2024-04-25 22:30:54,899][47288] Updated weights for policy 0, policy_version 7446 (0.0028) [2024-04-25 22:30:55,886][47267] Signal inference workers to stop experience collection... (1050 times) [2024-04-25 22:30:55,887][47267] Signal inference workers to resume experience collection... (1050 times) [2024-04-25 22:30:55,906][47288] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-04-25 22:30:55,907][47288] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-04-25 22:30:57,696][47288] Updated weights for policy 0, policy_version 7456 (0.0028) [2024-04-25 22:30:58,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 122208256. Throughput: 0: 56298.2. Samples: 71525340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:30:58,923][47056] Avg episode reward: [(0, '0.008')] [2024-04-25 22:31:00,734][47288] Updated weights for policy 0, policy_version 7466 (0.0030) [2024-04-25 22:31:03,437][47288] Updated weights for policy 0, policy_version 7476 (0.0034) [2024-04-25 22:31:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 56094.4). Total num frames: 122486784. Throughput: 0: 56264.8. Samples: 71860620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:31:03,923][47056] Avg episode reward: [(0, '0.004')] [2024-04-25 22:31:06,512][47288] Updated weights for policy 0, policy_version 7486 (0.0032) [2024-04-25 22:31:08,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56797.9, 300 sec: 56205.5). Total num frames: 122798080. Throughput: 0: 56282.3. Samples: 72202300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 22:31:08,923][47056] Avg episode reward: [(0, '0.009')] [2024-04-25 22:31:09,154][47288] Updated weights for policy 0, policy_version 7496 (0.0030) [2024-04-25 22:31:12,391][47288] Updated weights for policy 0, policy_version 7506 (0.0027) [2024-04-25 22:31:13,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.8, 300 sec: 56149.9). Total num frames: 123076608. Throughput: 0: 56281.4. Samples: 72375140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:31:13,923][47056] Avg episode reward: [(0, '0.012')] [2024-04-25 22:31:15,333][47288] Updated weights for policy 0, policy_version 7516 (0.0030) [2024-04-25 22:31:18,101][47288] Updated weights for policy 0, policy_version 7526 (0.0031) [2024-04-25 22:31:18,923][47056] Fps is (10 sec: 55704.4, 60 sec: 56524.6, 300 sec: 56149.9). Total num frames: 123355136. Throughput: 0: 56367.8. Samples: 72716680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 22:31:18,923][47056] Avg episode reward: [(0, '0.008')] [2024-04-25 22:31:21,366][47288] Updated weights for policy 0, policy_version 7536 (0.0032) [2024-04-25 22:31:23,870][47288] Updated weights for policy 0, policy_version 7546 (0.0024) [2024-04-25 22:31:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 123633664. Throughput: 0: 56286.7. Samples: 73051740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 22:31:23,923][47056] Avg episode reward: [(0, '0.008')] [2024-04-25 22:31:27,155][47288] Updated weights for policy 0, policy_version 7556 (0.0032) [2024-04-25 22:31:28,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 123879424. Throughput: 0: 56034.1. Samples: 73209060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:31:28,923][47056] Avg episode reward: [(0, '0.012')] [2024-04-25 22:31:29,823][47288] Updated weights for policy 0, policy_version 7566 (0.0036) [2024-04-25 22:31:33,029][47288] Updated weights for policy 0, policy_version 7576 (0.0027) [2024-04-25 22:31:33,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55705.5, 300 sec: 56094.4). Total num frames: 124157952. Throughput: 0: 56030.1. Samples: 73545440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-25 22:31:33,923][47056] Avg episode reward: [(0, '0.014')] [2024-04-25 22:31:35,663][47288] Updated weights for policy 0, policy_version 7586 (0.0028) [2024-04-25 22:31:38,818][47288] Updated weights for policy 0, policy_version 7596 (0.0034) [2024-04-25 22:31:38,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 124452864. Throughput: 0: 55895.1. Samples: 73881400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 22:31:38,923][47056] Avg episode reward: [(0, '0.009')] [2024-04-25 22:31:41,733][47288] Updated weights for policy 0, policy_version 7606 (0.0027) [2024-04-25 22:31:43,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56251.8, 300 sec: 56150.0). Total num frames: 124747776. Throughput: 0: 56127.6. Samples: 74051080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 22:31:43,923][47056] Avg episode reward: [(0, '0.015')] [2024-04-25 22:31:44,651][47288] Updated weights for policy 0, policy_version 7616 (0.0033) [2024-04-25 22:31:46,462][47267] Signal inference workers to stop experience collection... 
(1100 times) [2024-04-25 22:31:46,467][47267] Signal inference workers to resume experience collection... (1100 times) [2024-04-25 22:31:46,493][47288] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-04-25 22:31:46,494][47288] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-04-25 22:31:47,656][47288] Updated weights for policy 0, policy_version 7626 (0.0029) [2024-04-25 22:31:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 56094.4). Total num frames: 125026304. Throughput: 0: 56004.0. Samples: 74380800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 22:31:48,923][47056] Avg episode reward: [(0, '0.009')] [2024-04-25 22:31:50,407][47288] Updated weights for policy 0, policy_version 7636 (0.0029) [2024-04-25 22:31:53,580][47288] Updated weights for policy 0, policy_version 7646 (0.0033) [2024-04-25 22:31:53,922][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 56038.9). Total num frames: 125304832. Throughput: 0: 55866.8. Samples: 74716300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 22:31:53,923][47056] Avg episode reward: [(0, '0.013')] [2024-04-25 22:31:56,145][47288] Updated weights for policy 0, policy_version 7656 (0.0034) [2024-04-25 22:31:58,923][47056] Fps is (10 sec: 54065.8, 60 sec: 55978.5, 300 sec: 56094.3). Total num frames: 125566976. Throughput: 0: 55667.3. Samples: 74880180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 22:31:58,923][47056] Avg episode reward: [(0, '0.012')] [2024-04-25 22:31:59,293][47288] Updated weights for policy 0, policy_version 7666 (0.0033) [2024-04-25 22:32:02,198][47288] Updated weights for policy 0, policy_version 7676 (0.0027) [2024-04-25 22:32:03,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 125845504. Throughput: 0: 55630.0. Samples: 75220020. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-25 22:32:03,923][47056] Avg episode reward: [(0, '0.012')] [2024-04-25 22:32:04,993][47288] Updated weights for policy 0, policy_version 7686 (0.0031) [2024-04-25 22:32:08,100][47288] Updated weights for policy 0, policy_version 7696 (0.0030) [2024-04-25 22:32:08,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55159.4, 300 sec: 56094.4). Total num frames: 126107648. Throughput: 0: 55780.8. Samples: 75561880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-25 22:32:08,923][47056] Avg episode reward: [(0, '0.009')] [2024-04-25 22:32:10,787][47288] Updated weights for policy 0, policy_version 7706 (0.0027) [2024-04-25 22:32:13,863][47288] Updated weights for policy 0, policy_version 7716 (0.0030) [2024-04-25 22:32:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 126418944. Throughput: 0: 55771.3. Samples: 75718760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-25 22:32:13,923][47056] Avg episode reward: [(0, '0.012')] [2024-04-25 22:32:16,651][47288] Updated weights for policy 0, policy_version 7726 (0.0029) [2024-04-25 22:32:18,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 126697472. Throughput: 0: 55765.9. Samples: 76054900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-25 22:32:18,923][47056] Avg episode reward: [(0, '0.010')] [2024-04-25 22:32:19,864][47288] Updated weights for policy 0, policy_version 7736 (0.0032) [2024-04-25 22:32:22,547][47288] Updated weights for policy 0, policy_version 7746 (0.0029) [2024-04-25 22:32:23,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 126992384. Throughput: 0: 55679.4. Samples: 76386980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 22:32:23,923][47056] Avg episode reward: [(0, '0.012')] [2024-04-25 22:32:25,564][47288] Updated weights for policy 0, policy_version 7756 (0.0035) [2024-04-25 22:32:28,309][47288] Updated weights for policy 0, policy_version 7766 (0.0026) [2024-04-25 22:32:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 56038.8). Total num frames: 127270912. Throughput: 0: 56113.7. Samples: 76576200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-25 22:32:28,923][47056] Avg episode reward: [(0, '0.014')] [2024-04-25 22:32:31,351][47288] Updated weights for policy 0, policy_version 7776 (0.0027) [2024-04-25 22:32:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.9, 300 sec: 56149.9). Total num frames: 127549440. Throughput: 0: 56243.1. Samples: 76911740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:32:33,923][47056] Avg episode reward: [(0, '0.013')] [2024-04-25 22:32:33,969][47288] Updated weights for policy 0, policy_version 7786 (0.0027) [2024-04-25 22:32:37,182][47288] Updated weights for policy 0, policy_version 7796 (0.0029) [2024-04-25 22:32:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 127811584. Throughput: 0: 56190.8. Samples: 77244900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:32:38,923][47056] Avg episode reward: [(0, '0.014')] [2024-04-25 22:32:39,929][47288] Updated weights for policy 0, policy_version 7806 (0.0031) [2024-04-25 22:32:43,171][47288] Updated weights for policy 0, policy_version 7816 (0.0035) [2024-04-25 22:32:43,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 56094.4). Total num frames: 128073728. Throughput: 0: 56051.4. Samples: 77402480. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 23.0) [2024-04-25 22:32:43,923][47056] Avg episode reward: [(0, '0.011')] [2024-04-25 22:32:45,686][47288] Updated weights for policy 0, policy_version 7826 (0.0033) [2024-04-25 22:32:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 56038.9). Total num frames: 128368640. Throughput: 0: 56028.4. Samples: 77741300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-25 22:32:48,923][47056] Avg episode reward: [(0, '0.019')] [2024-04-25 22:32:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000007835_128368640.pth... [2024-04-25 22:32:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000007016_114950144.pth [2024-04-25 22:32:48,990][47267] Saving new best policy, reward=0.019! [2024-04-25 22:32:49,469][47288] Updated weights for policy 0, policy_version 7836 (0.0025) [2024-04-25 22:32:51,080][47267] Signal inference workers to stop experience collection... (1150 times) [2024-04-25 22:32:51,130][47288] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-04-25 22:32:51,138][47267] Signal inference workers to resume experience collection... (1150 times) [2024-04-25 22:32:51,144][47288] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-04-25 22:32:51,563][47288] Updated weights for policy 0, policy_version 7846 (0.0031) [2024-04-25 22:32:53,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.4, 300 sec: 56094.4). Total num frames: 128647168. Throughput: 0: 55808.8. Samples: 78073280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-25 22:32:53,923][47056] Avg episode reward: [(0, '0.016')] [2024-04-25 22:32:55,312][47288] Updated weights for policy 0, policy_version 7856 (0.0028) [2024-04-25 22:32:57,380][47288] Updated weights for policy 0, policy_version 7866 (0.0025) [2024-04-25 22:32:58,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56524.9, 300 sec: 56149.9). 
Total num frames: 128958464. Throughput: 0: 56202.4. Samples: 78247880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-25 22:32:58,923][47056] Avg episode reward: [(0, '0.008')] [2024-04-25 22:33:00,984][47288] Updated weights for policy 0, policy_version 7876 (0.0029) [2024-04-25 22:33:03,315][47288] Updated weights for policy 0, policy_version 7886 (0.0031) [2024-04-25 22:33:03,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 129220608. Throughput: 0: 56128.5. Samples: 78580680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 22:33:03,923][47056] Avg episode reward: [(0, '0.019')] [2024-04-25 22:33:06,937][47288] Updated weights for policy 0, policy_version 7896 (0.0029) [2024-04-25 22:33:08,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56524.8, 300 sec: 56038.9). Total num frames: 129499136. Throughput: 0: 56362.7. Samples: 78923300. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-25 22:33:08,923][47056] Avg episode reward: [(0, '0.011')] [2024-04-25 22:33:09,109][47288] Updated weights for policy 0, policy_version 7906 (0.0024) [2024-04-25 22:33:12,833][47288] Updated weights for policy 0, policy_version 7916 (0.0029) [2024-04-25 22:33:13,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55978.4, 300 sec: 56094.3). Total num frames: 129777664. Throughput: 0: 55815.8. Samples: 79087920. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-25 22:33:13,923][47056] Avg episode reward: [(0, '0.009')] [2024-04-25 22:33:15,070][47288] Updated weights for policy 0, policy_version 7926 (0.0032) [2024-04-25 22:33:18,561][47288] Updated weights for policy 0, policy_version 7936 (0.0026) [2024-04-25 22:33:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 130039808. Throughput: 0: 55802.2. Samples: 79422840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-25 22:33:18,923][47056] Avg episode reward: [(0, '0.015')] [2024-04-25 22:33:20,931][47288] Updated weights for policy 0, policy_version 7946 (0.0027) [2024-04-25 22:33:23,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55432.6, 300 sec: 56038.8). Total num frames: 130318336. Throughput: 0: 55945.5. Samples: 79762440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-25 22:33:23,923][47056] Avg episode reward: [(0, '0.013')] [2024-04-25 22:33:24,267][47288] Updated weights for policy 0, policy_version 7956 (0.0030) [2024-04-25 22:33:26,697][47288] Updated weights for policy 0, policy_version 7966 (0.0030) [2024-04-25 22:33:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 56038.8). Total num frames: 130596864. Throughput: 0: 56101.8. Samples: 79927060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:33:28,923][47056] Avg episode reward: [(0, '0.018')] [2024-04-25 22:33:30,087][47288] Updated weights for policy 0, policy_version 7976 (0.0037) [2024-04-25 22:33:32,495][47288] Updated weights for policy 0, policy_version 7986 (0.0030) [2024-04-25 22:33:33,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 130908160. Throughput: 0: 56065.3. Samples: 80264240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:33:33,923][47056] Avg episode reward: [(0, '0.017')] [2024-04-25 22:33:36,023][47288] Updated weights for policy 0, policy_version 7996 (0.0029) [2024-04-25 22:33:38,313][47288] Updated weights for policy 0, policy_version 8006 (0.0028) [2024-04-25 22:33:38,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 131186688. Throughput: 0: 56121.8. Samples: 80598760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 22:33:38,923][47056] Avg episode reward: [(0, '0.018')] [2024-04-25 22:33:41,806][47288] Updated weights for policy 0, policy_version 8016 (0.0029) [2024-04-25 22:33:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.8, 300 sec: 56038.8). Total num frames: 131465216. Throughput: 0: 56213.2. Samples: 80777460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 22:33:43,923][47056] Avg episode reward: [(0, '0.014')] [2024-04-25 22:33:44,136][47288] Updated weights for policy 0, policy_version 8026 (0.0027) [2024-04-25 22:33:47,517][47288] Updated weights for policy 0, policy_version 8036 (0.0027) [2024-04-25 22:33:48,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 131727360. Throughput: 0: 56244.5. Samples: 81111680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 22:33:48,923][47056] Avg episode reward: [(0, '0.013')] [2024-04-25 22:33:50,175][47288] Updated weights for policy 0, policy_version 8046 (0.0030) [2024-04-25 22:33:53,375][47288] Updated weights for policy 0, policy_version 8056 (0.0033) [2024-04-25 22:33:53,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 132005888. Throughput: 0: 56088.8. Samples: 81447300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-25 22:33:53,924][47056] Avg episode reward: [(0, '0.019')] [2024-04-25 22:33:56,013][47288] Updated weights for policy 0, policy_version 8066 (0.0025) [2024-04-25 22:33:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.7, 300 sec: 56038.8). Total num frames: 132284416. Throughput: 0: 56078.0. Samples: 81611420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 22:33:58,923][47056] Avg episode reward: [(0, '0.013')] [2024-04-25 22:33:59,136][47288] Updated weights for policy 0, policy_version 8076 (0.0033) [2024-04-25 22:34:01,707][47288] Updated weights for policy 0, policy_version 8086 (0.0027) [2024-04-25 22:34:03,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 132562944. Throughput: 0: 56179.2. Samples: 81950900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 22:34:03,923][47056] Avg episode reward: [(0, '0.013')] [2024-04-25 22:34:04,977][47288] Updated weights for policy 0, policy_version 8096 (0.0034) [2024-04-25 22:34:07,682][47288] Updated weights for policy 0, policy_version 8106 (0.0033) [2024-04-25 22:34:08,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 132857856. Throughput: 0: 55973.6. Samples: 82281260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 22:34:08,923][47056] Avg episode reward: [(0, '0.017')] [2024-04-25 22:34:10,717][47267] Signal inference workers to stop experience collection... (1200 times) [2024-04-25 22:34:10,762][47288] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-04-25 22:34:10,772][47267] Signal inference workers to resume experience collection... (1200 times) [2024-04-25 22:34:10,781][47288] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-04-25 22:34:10,880][47288] Updated weights for policy 0, policy_version 8116 (0.0031) [2024-04-25 22:34:13,589][47288] Updated weights for policy 0, policy_version 8126 (0.0030) [2024-04-25 22:34:13,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56252.0, 300 sec: 56094.4). Total num frames: 133152768. Throughput: 0: 56087.6. Samples: 82451000. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 22:34:13,923][47056] Avg episode reward: [(0, '0.013')] [2024-04-25 22:34:16,815][47288] Updated weights for policy 0, policy_version 8136 (0.0030) [2024-04-25 22:34:18,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.8, 300 sec: 56038.9). Total num frames: 133414912. Throughput: 0: 56046.3. Samples: 82786320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-25 22:34:18,923][47056] Avg episode reward: [(0, '0.014')] [2024-04-25 22:34:19,416][47288] Updated weights for policy 0, policy_version 8146 (0.0030) [2024-04-25 22:34:22,636][47288] Updated weights for policy 0, policy_version 8156 (0.0031) [2024-04-25 22:34:23,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 133677056. Throughput: 0: 56052.2. Samples: 83121100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-04-25 22:34:23,923][47056] Avg episode reward: [(0, '0.013')] [2024-04-25 22:34:25,298][47288] Updated weights for policy 0, policy_version 8166 (0.0028) [2024-04-25 22:34:28,492][47288] Updated weights for policy 0, policy_version 8176 (0.0028) [2024-04-25 22:34:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 133955584. Throughput: 0: 55676.8. Samples: 83282920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-04-25 22:34:28,923][47056] Avg episode reward: [(0, '0.015')] [2024-04-25 22:34:31,236][47288] Updated weights for policy 0, policy_version 8186 (0.0032) [2024-04-25 22:34:33,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.8, 300 sec: 56094.4). Total num frames: 134266880. Throughput: 0: 55695.6. Samples: 83617980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-04-25 22:34:33,923][47056] Avg episode reward: [(0, '0.016')] [2024-04-25 22:34:34,307][47288] Updated weights for policy 0, policy_version 8196 (0.0028) [2024-04-25 22:34:37,061][47288] Updated weights for policy 0, policy_version 8206 (0.0029) [2024-04-25 22:34:38,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55432.7, 300 sec: 55983.4). Total num frames: 134512640. Throughput: 0: 55808.7. Samples: 83958680. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-25 22:34:38,923][47056] Avg episode reward: [(0, '0.020')] [2024-04-25 22:34:38,933][47267] Saving new best policy, reward=0.020! [2024-04-25 22:34:39,983][47288] Updated weights for policy 0, policy_version 8216 (0.0034) [2024-04-25 22:34:42,843][47288] Updated weights for policy 0, policy_version 8226 (0.0034) [2024-04-25 22:34:43,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.4, 300 sec: 55983.3). Total num frames: 134807552. Throughput: 0: 56001.7. Samples: 84131500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-25 22:34:43,923][47056] Avg episode reward: [(0, '0.019')] [2024-04-25 22:34:46,082][47288] Updated weights for policy 0, policy_version 8236 (0.0029) [2024-04-25 22:34:48,777][47288] Updated weights for policy 0, policy_version 8246 (0.0040) [2024-04-25 22:34:48,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56251.7, 300 sec: 56094.4). Total num frames: 135102464. Throughput: 0: 55786.6. Samples: 84461300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 22:34:48,923][47056] Avg episode reward: [(0, '0.014')] [2024-04-25 22:34:48,970][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000008247_135118848.pth... 
[2024-04-25 22:34:49,018][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000007426_121667584.pth [2024-04-25 22:34:52,080][47288] Updated weights for policy 0, policy_version 8256 (0.0038) [2024-04-25 22:34:53,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55705.8, 300 sec: 55927.8). Total num frames: 135348224. Throughput: 0: 55659.3. Samples: 84785920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 22:34:53,923][47056] Avg episode reward: [(0, '0.018')] [2024-04-25 22:34:54,607][47288] Updated weights for policy 0, policy_version 8266 (0.0037) [2024-04-25 22:34:57,916][47288] Updated weights for policy 0, policy_version 8276 (0.0029) [2024-04-25 22:34:58,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 135643136. Throughput: 0: 55787.6. Samples: 84961440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:34:58,923][47056] Avg episode reward: [(0, '0.026')] [2024-04-25 22:34:58,939][47267] Saving new best policy, reward=0.026! [2024-04-25 22:35:00,521][47288] Updated weights for policy 0, policy_version 8286 (0.0037) [2024-04-25 22:35:03,770][47288] Updated weights for policy 0, policy_version 8296 (0.0030) [2024-04-25 22:35:03,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 135921664. Throughput: 0: 55722.6. Samples: 85293840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:35:03,923][47056] Avg episode reward: [(0, '0.016')] [2024-04-25 22:35:06,455][47288] Updated weights for policy 0, policy_version 8306 (0.0028) [2024-04-25 22:35:08,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.6, 300 sec: 56038.8). Total num frames: 136200192. Throughput: 0: 55746.1. Samples: 85629680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 22:35:08,932][47056] Avg episode reward: [(0, '0.019')] [2024-04-25 22:35:09,762][47288] Updated weights for policy 0, policy_version 8316 (0.0029) [2024-04-25 22:35:12,252][47288] Updated weights for policy 0, policy_version 8326 (0.0031) [2024-04-25 22:35:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55983.3). Total num frames: 136478720. Throughput: 0: 55871.0. Samples: 85797120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 22:35:13,923][47056] Avg episode reward: [(0, '0.014')] [2024-04-25 22:35:15,526][47288] Updated weights for policy 0, policy_version 8336 (0.0034) [2024-04-25 22:35:16,315][47267] Signal inference workers to stop experience collection... (1250 times) [2024-04-25 22:35:16,365][47288] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-04-25 22:35:16,365][47267] Signal inference workers to resume experience collection... (1250 times) [2024-04-25 22:35:16,380][47288] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-04-25 22:35:18,017][47288] Updated weights for policy 0, policy_version 8346 (0.0032) [2024-04-25 22:35:18,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 136757248. Throughput: 0: 55886.2. Samples: 86132860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 22:35:18,923][47056] Avg episode reward: [(0, '0.017')] [2024-04-25 22:35:21,366][47288] Updated weights for policy 0, policy_version 8356 (0.0033) [2024-04-25 22:35:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.5, 300 sec: 55983.3). Total num frames: 137052160. Throughput: 0: 55770.4. Samples: 86468360. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-25 22:35:23,923][47056] Avg episode reward: [(0, '0.019')] [2024-04-25 22:35:24,079][47288] Updated weights for policy 0, policy_version 8366 (0.0033) [2024-04-25 22:35:27,295][47288] Updated weights for policy 0, policy_version 8376 (0.0031) [2024-04-25 22:35:28,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 137314304. Throughput: 0: 55606.4. Samples: 86633780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-25 22:35:28,923][47056] Avg episode reward: [(0, '0.021')] [2024-04-25 22:35:29,836][47288] Updated weights for policy 0, policy_version 8386 (0.0026) [2024-04-25 22:35:33,233][47288] Updated weights for policy 0, policy_version 8396 (0.0031) [2024-04-25 22:35:33,923][47056] Fps is (10 sec: 54068.7, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 137592832. Throughput: 0: 55789.9. Samples: 86971840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-25 22:35:33,923][47056] Avg episode reward: [(0, '0.023')] [2024-04-25 22:35:35,753][47288] Updated weights for policy 0, policy_version 8406 (0.0028) [2024-04-25 22:35:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 137854976. Throughput: 0: 55987.8. Samples: 87305380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 22:35:38,923][47056] Avg episode reward: [(0, '0.020')] [2024-04-25 22:35:39,113][47288] Updated weights for policy 0, policy_version 8416 (0.0028) [2024-04-25 22:35:41,654][47288] Updated weights for policy 0, policy_version 8426 (0.0026) [2024-04-25 22:35:43,923][47056] Fps is (10 sec: 57340.0, 60 sec: 55978.2, 300 sec: 55983.2). Total num frames: 138166272. Throughput: 0: 55755.7. Samples: 87470480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 22:35:43,924][47056] Avg episode reward: [(0, '0.018')] [2024-04-25 22:35:44,915][47288] Updated weights for policy 0, policy_version 8436 (0.0029) [2024-04-25 22:35:47,502][47288] Updated weights for policy 0, policy_version 8446 (0.0028) [2024-04-25 22:35:48,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 138444800. Throughput: 0: 55846.6. Samples: 87806940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 22:35:48,923][47056] Avg episode reward: [(0, '0.021')] [2024-04-25 22:35:50,818][47288] Updated weights for policy 0, policy_version 8456 (0.0027) [2024-04-25 22:35:53,476][47288] Updated weights for policy 0, policy_version 8466 (0.0027) [2024-04-25 22:35:53,923][47056] Fps is (10 sec: 55708.8, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 138723328. Throughput: 0: 55812.1. Samples: 88141220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:35:53,923][47056] Avg episode reward: [(0, '0.018')] [2024-04-25 22:35:56,795][47288] Updated weights for policy 0, policy_version 8476 (0.0026) [2024-04-25 22:35:58,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55978.4, 300 sec: 55983.3). Total num frames: 139001856. Throughput: 0: 56071.0. Samples: 88320320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-25 22:35:58,923][47056] Avg episode reward: [(0, '0.026')] [2024-04-25 22:35:59,215][47288] Updated weights for policy 0, policy_version 8486 (0.0025) [2024-04-25 22:36:02,686][47288] Updated weights for policy 0, policy_version 8496 (0.0026) [2024-04-25 22:36:03,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 139280384. Throughput: 0: 56012.2. Samples: 88653420. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-25 22:36:03,923][47056] Avg episode reward: [(0, '0.020')] [2024-04-25 22:36:05,060][47288] Updated weights for policy 0, policy_version 8506 (0.0033) [2024-04-25 22:36:08,292][47288] Updated weights for policy 0, policy_version 8516 (0.0032) [2024-04-25 22:36:08,923][47056] Fps is (10 sec: 54068.6, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 139542528. Throughput: 0: 56059.4. Samples: 88991020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-25 22:36:08,923][47056] Avg episode reward: [(0, '0.017')] [2024-04-25 22:36:10,841][47288] Updated weights for policy 0, policy_version 8526 (0.0033) [2024-04-25 22:36:13,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 139821056. Throughput: 0: 56043.9. Samples: 89155760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 22:36:13,923][47056] Avg episode reward: [(0, '0.020')] [2024-04-25 22:36:14,173][47288] Updated weights for policy 0, policy_version 8536 (0.0037) [2024-04-25 22:36:16,692][47288] Updated weights for policy 0, policy_version 8546 (0.0027) [2024-04-25 22:36:17,543][47267] Signal inference workers to stop experience collection... (1300 times) [2024-04-25 22:36:17,576][47288] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-04-25 22:36:17,601][47267] Signal inference workers to resume experience collection... (1300 times) [2024-04-25 22:36:17,606][47288] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-04-25 22:36:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 140115968. Throughput: 0: 56051.8. Samples: 89494180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 22:36:18,923][47056] Avg episode reward: [(0, '0.019')] [2024-04-25 22:36:19,976][47288] Updated weights for policy 0, policy_version 8556 (0.0027) [2024-04-25 22:36:22,580][47288] Updated weights for policy 0, policy_version 8566 (0.0033) [2024-04-25 22:36:23,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 140410880. Throughput: 0: 55937.7. Samples: 89822580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 22:36:23,923][47056] Avg episode reward: [(0, '0.016')] [2024-04-25 22:36:25,859][47288] Updated weights for policy 0, policy_version 8576 (0.0035) [2024-04-25 22:36:28,585][47288] Updated weights for policy 0, policy_version 8586 (0.0029) [2024-04-25 22:36:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 140673024. Throughput: 0: 56011.4. Samples: 89990960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-25 22:36:28,923][47056] Avg episode reward: [(0, '0.012')] [2024-04-25 22:36:31,654][47288] Updated weights for policy 0, policy_version 8596 (0.0030) [2024-04-25 22:36:33,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 140935168. Throughput: 0: 55979.2. Samples: 90326000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-25 22:36:33,923][47056] Avg episode reward: [(0, '0.025')] [2024-04-25 22:36:34,599][47288] Updated weights for policy 0, policy_version 8606 (0.0029) [2024-04-25 22:36:37,742][47288] Updated weights for policy 0, policy_version 8616 (0.0032) [2024-04-25 22:36:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 141246464. Throughput: 0: 55972.0. Samples: 90659960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 22:36:38,923][47056] Avg episode reward: [(0, '0.024')] [2024-04-25 22:36:40,408][47288] Updated weights for policy 0, policy_version 8626 (0.0031) [2024-04-25 22:36:43,427][47288] Updated weights for policy 0, policy_version 8636 (0.0029) [2024-04-25 22:36:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55706.0, 300 sec: 55872.2). Total num frames: 141508608. Throughput: 0: 55660.1. Samples: 90825020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-25 22:36:43,923][47056] Avg episode reward: [(0, '0.030')] [2024-04-25 22:36:43,924][47267] Saving new best policy, reward=0.030! [2024-04-25 22:36:46,202][47288] Updated weights for policy 0, policy_version 8646 (0.0034) [2024-04-25 22:36:48,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 141787136. Throughput: 0: 55613.9. Samples: 91156040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-25 22:36:48,923][47056] Avg episode reward: [(0, '0.028')] [2024-04-25 22:36:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000008654_141787136.pth... [2024-04-25 22:36:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000007835_128368640.pth [2024-04-25 22:36:49,285][47288] Updated weights for policy 0, policy_version 8656 (0.0031) [2024-04-25 22:36:52,223][47288] Updated weights for policy 0, policy_version 8666 (0.0034) [2024-04-25 22:36:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 142065664. Throughput: 0: 55539.4. Samples: 91490300. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-25 22:36:53,923][47056] Avg episode reward: [(0, '0.020')] [2024-04-25 22:36:55,109][47288] Updated weights for policy 0, policy_version 8676 (0.0033) [2024-04-25 22:36:58,024][47288] Updated weights for policy 0, policy_version 8686 (0.0032) [2024-04-25 22:36:58,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.9, 300 sec: 55983.3). Total num frames: 142360576. Throughput: 0: 55736.2. Samples: 91663880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-25 22:36:58,923][47056] Avg episode reward: [(0, '0.029')] [2024-04-25 22:37:01,053][47288] Updated weights for policy 0, policy_version 8696 (0.0026) [2024-04-25 22:37:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55983.3). Total num frames: 142622720. Throughput: 0: 55669.9. Samples: 91999320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:37:03,923][47056] Avg episode reward: [(0, '0.026')] [2024-04-25 22:37:03,950][47288] Updated weights for policy 0, policy_version 8706 (0.0030) [2024-04-25 22:37:06,932][47288] Updated weights for policy 0, policy_version 8716 (0.0032) [2024-04-25 22:37:08,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 142917632. Throughput: 0: 55798.3. Samples: 92333500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:37:08,923][47056] Avg episode reward: [(0, '0.026')] [2024-04-25 22:37:09,741][47288] Updated weights for policy 0, policy_version 8726 (0.0032) [2024-04-25 22:37:12,723][47288] Updated weights for policy 0, policy_version 8736 (0.0027) [2024-04-25 22:37:13,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 143179776. Throughput: 0: 55844.6. Samples: 92503980. 
Policy #0 lag: (min: 2.0, avg: 12.0, max: 21.0) [2024-04-25 22:37:13,923][47056] Avg episode reward: [(0, '0.030')] [2024-04-25 22:37:15,639][47288] Updated weights for policy 0, policy_version 8746 (0.0028) [2024-04-25 22:37:18,596][47288] Updated weights for policy 0, policy_version 8756 (0.0032) [2024-04-25 22:37:18,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 143458304. Throughput: 0: 55865.7. Samples: 92839960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-25 22:37:18,924][47056] Avg episode reward: [(0, '0.024')] [2024-04-25 22:37:20,011][47267] Signal inference workers to stop experience collection... (1350 times) [2024-04-25 22:37:20,060][47288] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-04-25 22:37:20,068][47267] Signal inference workers to resume experience collection... (1350 times) [2024-04-25 22:37:20,077][47288] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-04-25 22:37:21,429][47288] Updated weights for policy 0, policy_version 8766 (0.0030) [2024-04-25 22:37:23,923][47056] Fps is (10 sec: 57345.4, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 143753216. Throughput: 0: 56039.1. Samples: 93181720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-25 22:37:23,923][47056] Avg episode reward: [(0, '0.023')] [2024-04-25 22:37:24,391][47288] Updated weights for policy 0, policy_version 8776 (0.0027) [2024-04-25 22:37:27,219][47288] Updated weights for policy 0, policy_version 8786 (0.0026) [2024-04-25 22:37:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 144015360. Throughput: 0: 55963.1. Samples: 93343360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 22:37:28,927][47056] Avg episode reward: [(0, '0.024')] [2024-04-25 22:37:30,153][47288] Updated weights for policy 0, policy_version 8796 (0.0033) [2024-04-25 22:37:33,082][47288] Updated weights for policy 0, policy_version 8806 (0.0030) [2024-04-25 22:37:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 144326656. Throughput: 0: 56097.9. Samples: 93680440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-25 22:37:33,923][47056] Avg episode reward: [(0, '0.025')] [2024-04-25 22:37:36,021][47288] Updated weights for policy 0, policy_version 8816 (0.0026) [2024-04-25 22:37:38,744][47288] Updated weights for policy 0, policy_version 8826 (0.0030) [2024-04-25 22:37:38,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 144605184. Throughput: 0: 56226.8. Samples: 94020500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-25 22:37:38,923][47056] Avg episode reward: [(0, '0.028')] [2024-04-25 22:37:42,089][47288] Updated weights for policy 0, policy_version 8836 (0.0031) [2024-04-25 22:37:43,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 144867328. Throughput: 0: 56118.9. Samples: 94189240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-25 22:37:43,924][47056] Avg episode reward: [(0, '0.025')] [2024-04-25 22:37:44,725][47288] Updated weights for policy 0, policy_version 8846 (0.0033) [2024-04-25 22:37:47,758][47288] Updated weights for policy 0, policy_version 8856 (0.0028) [2024-04-25 22:37:48,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 145145856. Throughput: 0: 56151.1. Samples: 94526120. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-25 22:37:48,923][47056] Avg episode reward: [(0, '0.026')] [2024-04-25 22:37:50,551][47288] Updated weights for policy 0, policy_version 8866 (0.0028) [2024-04-25 22:37:53,429][47288] Updated weights for policy 0, policy_version 8876 (0.0032) [2024-04-25 22:37:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 145424384. Throughput: 0: 56097.8. Samples: 94857900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-25 22:37:53,923][47056] Avg episode reward: [(0, '0.033')] [2024-04-25 22:37:53,924][47267] Saving new best policy, reward=0.033! [2024-04-25 22:37:56,218][47288] Updated weights for policy 0, policy_version 8886 (0.0029) [2024-04-25 22:37:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 145702912. Throughput: 0: 56052.2. Samples: 95026320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-25 22:37:58,923][47056] Avg episode reward: [(0, '0.021')] [2024-04-25 22:37:59,346][47288] Updated weights for policy 0, policy_version 8896 (0.0030) [2024-04-25 22:38:02,148][47288] Updated weights for policy 0, policy_version 8906 (0.0025) [2024-04-25 22:38:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 145997824. Throughput: 0: 56006.6. Samples: 95360260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-25 22:38:03,923][47056] Avg episode reward: [(0, '0.032')] [2024-04-25 22:38:05,649][47288] Updated weights for policy 0, policy_version 8916 (0.0031) [2024-04-25 22:38:07,873][47288] Updated weights for policy 0, policy_version 8926 (0.0034) [2024-04-25 22:38:08,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 146276352. Throughput: 0: 55883.7. Samples: 95696500. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-25 22:38:08,923][47056] Avg episode reward: [(0, '0.029')] [2024-04-25 22:38:11,511][47288] Updated weights for policy 0, policy_version 8936 (0.0028) [2024-04-25 22:38:13,799][47288] Updated weights for policy 0, policy_version 8946 (0.0033) [2024-04-25 22:38:13,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56525.0, 300 sec: 56038.9). Total num frames: 146571264. Throughput: 0: 56142.4. Samples: 95869760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:38:13,923][47056] Avg episode reward: [(0, '0.023')] [2024-04-25 22:38:17,359][47288] Updated weights for policy 0, policy_version 8956 (0.0032) [2024-04-25 22:38:18,923][47056] Fps is (10 sec: 55706.8, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 146833408. Throughput: 0: 56034.2. Samples: 96201980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 22:38:18,923][47056] Avg episode reward: [(0, '0.017')] [2024-04-25 22:38:19,860][47288] Updated weights for policy 0, policy_version 8966 (0.0030) [2024-04-25 22:38:23,058][47288] Updated weights for policy 0, policy_version 8976 (0.0026) [2024-04-25 22:38:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 147111936. Throughput: 0: 55879.2. Samples: 96535060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 22:38:23,923][47056] Avg episode reward: [(0, '0.029')] [2024-04-25 22:38:25,776][47288] Updated weights for policy 0, policy_version 8986 (0.0028) [2024-04-25 22:38:28,803][47288] Updated weights for policy 0, policy_version 8996 (0.0042) [2024-04-25 22:38:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 147390464. Throughput: 0: 55919.3. Samples: 96705600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:38:28,923][47056] Avg episode reward: [(0, '0.027')] [2024-04-25 22:38:31,487][47288] Updated weights for policy 0, policy_version 9006 (0.0032) [2024-04-25 22:38:33,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 147668992. Throughput: 0: 55983.9. Samples: 97045400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 22:38:33,923][47056] Avg episode reward: [(0, '0.031')] [2024-04-25 22:38:34,725][47288] Updated weights for policy 0, policy_version 9016 (0.0028) [2024-04-25 22:38:37,324][47288] Updated weights for policy 0, policy_version 9026 (0.0030) [2024-04-25 22:38:38,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 147947520. Throughput: 0: 55953.4. Samples: 97375800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 22:38:38,923][47056] Avg episode reward: [(0, '0.031')] [2024-04-25 22:38:40,446][47267] Signal inference workers to stop experience collection... (1400 times) [2024-04-25 22:38:40,475][47288] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-04-25 22:38:40,529][47267] Signal inference workers to resume experience collection... (1400 times) [2024-04-25 22:38:40,529][47288] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-04-25 22:38:40,640][47288] Updated weights for policy 0, policy_version 9036 (0.0028) [2024-04-25 22:38:43,341][47288] Updated weights for policy 0, policy_version 9046 (0.0030) [2024-04-25 22:38:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 148226048. Throughput: 0: 55886.3. Samples: 97541200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 22:38:43,932][47056] Avg episode reward: [(0, '0.031')] [2024-04-25 22:38:46,359][47288] Updated weights for policy 0, policy_version 9056 (0.0025) [2024-04-25 22:38:48,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 148504576. Throughput: 0: 55849.8. Samples: 97873500. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-25 22:38:48,923][47056] Avg episode reward: [(0, '0.035')] [2024-04-25 22:38:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000009064_148504576.pth... [2024-04-25 22:38:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000008247_135118848.pth [2024-04-25 22:38:48,989][47267] Saving new best policy, reward=0.035! [2024-04-25 22:38:49,337][47288] Updated weights for policy 0, policy_version 9066 (0.0034) [2024-04-25 22:38:52,213][47288] Updated weights for policy 0, policy_version 9076 (0.0035) [2024-04-25 22:38:53,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56251.9, 300 sec: 55983.3). Total num frames: 148799488. Throughput: 0: 55805.7. Samples: 98207740. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-25 22:38:53,923][47056] Avg episode reward: [(0, '0.023')] [2024-04-25 22:38:55,073][47288] Updated weights for policy 0, policy_version 9086 (0.0035) [2024-04-25 22:38:58,007][47288] Updated weights for policy 0, policy_version 9096 (0.0035) [2024-04-25 22:38:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 149078016. Throughput: 0: 55836.8. Samples: 98382420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 22:38:58,923][47056] Avg episode reward: [(0, '0.029')] [2024-04-25 22:39:01,053][47288] Updated weights for policy 0, policy_version 9106 (0.0038) [2024-04-25 22:39:03,828][47288] Updated weights for policy 0, policy_version 9116 (0.0029) [2024-04-25 22:39:03,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 149356544. Throughput: 0: 55891.3. Samples: 98717100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 22:39:03,923][47056] Avg episode reward: [(0, '0.038')] [2024-04-25 22:39:03,924][47267] Saving new best policy, reward=0.038! [2024-04-25 22:39:07,176][47288] Updated weights for policy 0, policy_version 9126 (0.0033) [2024-04-25 22:39:08,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 149618688. Throughput: 0: 55855.0. Samples: 99048540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 22:39:08,923][47056] Avg episode reward: [(0, '0.026')] [2024-04-25 22:39:09,686][47288] Updated weights for policy 0, policy_version 9136 (0.0028) [2024-04-25 22:39:13,161][47288] Updated weights for policy 0, policy_version 9146 (0.0035) [2024-04-25 22:39:13,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 149897216. Throughput: 0: 55698.6. Samples: 99212040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-25 22:39:13,923][47056] Avg episode reward: [(0, '0.029')] [2024-04-25 22:39:15,499][47288] Updated weights for policy 0, policy_version 9156 (0.0032) [2024-04-25 22:39:18,874][47288] Updated weights for policy 0, policy_version 9166 (0.0035) [2024-04-25 22:39:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 150175744. Throughput: 0: 55576.3. Samples: 99546340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 22:39:18,923][47056] Avg episode reward: [(0, '0.039')] [2024-04-25 22:39:18,931][47267] Saving new best policy, reward=0.039! [2024-04-25 22:39:21,497][47288] Updated weights for policy 0, policy_version 9176 (0.0026) [2024-04-25 22:39:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 150454272. Throughput: 0: 55779.5. Samples: 99885880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 22:39:23,923][47056] Avg episode reward: [(0, '0.037')] [2024-04-25 22:39:24,823][47288] Updated weights for policy 0, policy_version 9186 (0.0030) [2024-04-25 22:39:27,382][47288] Updated weights for policy 0, policy_version 9196 (0.0032) [2024-04-25 22:39:28,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 150749184. Throughput: 0: 55859.1. Samples: 100054860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 22:39:28,923][47056] Avg episode reward: [(0, '0.036')] [2024-04-25 22:39:30,568][47288] Updated weights for policy 0, policy_version 9206 (0.0029) [2024-04-25 22:39:33,222][47288] Updated weights for policy 0, policy_version 9216 (0.0029) [2024-04-25 22:39:33,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 151027712. Throughput: 0: 56023.7. Samples: 100394560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 22:39:33,923][47056] Avg episode reward: [(0, '0.043')] [2024-04-25 22:39:33,923][47267] Saving new best policy, reward=0.043! [2024-04-25 22:39:36,666][47288] Updated weights for policy 0, policy_version 9226 (0.0029) [2024-04-25 22:39:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 151306240. Throughput: 0: 56036.3. Samples: 100729380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 22:39:38,923][47056] Avg episode reward: [(0, '0.027')] [2024-04-25 22:39:38,973][47288] Updated weights for policy 0, policy_version 9236 (0.0028) [2024-04-25 22:39:42,452][47288] Updated weights for policy 0, policy_version 9246 (0.0030) [2024-04-25 22:39:43,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 151601152. Throughput: 0: 55955.5. Samples: 100900420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:39:43,923][47056] Avg episode reward: [(0, '0.020')] [2024-04-25 22:39:44,701][47288] Updated weights for policy 0, policy_version 9256 (0.0029) [2024-04-25 22:39:48,395][47288] Updated weights for policy 0, policy_version 9266 (0.0031) [2024-04-25 22:39:48,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 151846912. Throughput: 0: 55946.4. Samples: 101234680. Policy #0 lag: (min: 1.0, avg: 9.3, max: 23.0) [2024-04-25 22:39:48,923][47056] Avg episode reward: [(0, '0.045')] [2024-04-25 22:39:48,986][47267] Saving new best policy, reward=0.045! [2024-04-25 22:39:50,664][47288] Updated weights for policy 0, policy_version 9276 (0.0027) [2024-04-25 22:39:53,923][47056] Fps is (10 sec: 50790.9, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 152109056. Throughput: 0: 56081.9. Samples: 101572220. Policy #0 lag: (min: 1.0, avg: 9.3, max: 23.0) [2024-04-25 22:39:53,923][47056] Avg episode reward: [(0, '0.032')] [2024-04-25 22:39:54,315][47288] Updated weights for policy 0, policy_version 9286 (0.0030) [2024-04-25 22:39:54,797][47267] Signal inference workers to stop experience collection... (1450 times) [2024-04-25 22:39:54,829][47288] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-04-25 22:39:54,855][47267] Signal inference workers to resume experience collection... 
(1450 times) [2024-04-25 22:39:54,858][47288] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-04-25 22:39:56,446][47288] Updated weights for policy 0, policy_version 9296 (0.0028) [2024-04-25 22:39:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 152403968. Throughput: 0: 55952.4. Samples: 101729900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-25 22:39:58,923][47056] Avg episode reward: [(0, '0.031')] [2024-04-25 22:40:00,211][47288] Updated weights for policy 0, policy_version 9306 (0.0027) [2024-04-25 22:40:02,308][47288] Updated weights for policy 0, policy_version 9316 (0.0029) [2024-04-25 22:40:03,923][47056] Fps is (10 sec: 62259.6, 60 sec: 56252.0, 300 sec: 56038.9). Total num frames: 152731648. Throughput: 0: 56054.0. Samples: 102068760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-25 22:40:03,923][47056] Avg episode reward: [(0, '0.031')] [2024-04-25 22:40:06,010][47288] Updated weights for policy 0, policy_version 9326 (0.0031) [2024-04-25 22:40:08,238][47288] Updated weights for policy 0, policy_version 9336 (0.0027) [2024-04-25 22:40:08,923][47056] Fps is (10 sec: 60621.2, 60 sec: 56524.8, 300 sec: 56038.8). Total num frames: 153010176. Throughput: 0: 56002.8. Samples: 102406000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-25 22:40:08,923][47056] Avg episode reward: [(0, '0.043')] [2024-04-25 22:40:11,789][47288] Updated weights for policy 0, policy_version 9346 (0.0034) [2024-04-25 22:40:13,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 153272320. Throughput: 0: 56168.5. Samples: 102582440. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:40:13,923][47056] Avg episode reward: [(0, '0.042')] [2024-04-25 22:40:13,952][47288] Updated weights for policy 0, policy_version 9356 (0.0033) [2024-04-25 22:40:17,656][47288] Updated weights for policy 0, policy_version 9366 (0.0031) [2024-04-25 22:40:18,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 153550848. Throughput: 0: 56186.0. Samples: 102922940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:40:18,923][47056] Avg episode reward: [(0, '0.025')] [2024-04-25 22:40:19,610][47288] Updated weights for policy 0, policy_version 9376 (0.0032) [2024-04-25 22:40:23,512][47288] Updated weights for policy 0, policy_version 9386 (0.0042) [2024-04-25 22:40:23,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 153796608. Throughput: 0: 56138.6. Samples: 103255620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-25 22:40:23,923][47056] Avg episode reward: [(0, '0.030')] [2024-04-25 22:40:25,564][47288] Updated weights for policy 0, policy_version 9396 (0.0024) [2024-04-25 22:40:28,923][47056] Fps is (10 sec: 50790.9, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 154058752. Throughput: 0: 55717.9. Samples: 103407720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-25 22:40:28,923][47056] Avg episode reward: [(0, '0.034')] [2024-04-25 22:40:29,458][47288] Updated weights for policy 0, policy_version 9406 (0.0029) [2024-04-25 22:40:31,584][47288] Updated weights for policy 0, policy_version 9416 (0.0027) [2024-04-25 22:40:33,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 154370048. Throughput: 0: 55644.9. Samples: 103738700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-25 22:40:33,923][47056] Avg episode reward: [(0, '0.032')] [2024-04-25 22:40:35,324][47288] Updated weights for policy 0, policy_version 9426 (0.0030) [2024-04-25 22:40:37,352][47288] Updated weights for policy 0, policy_version 9436 (0.0035) [2024-04-25 22:40:38,923][47056] Fps is (10 sec: 62258.5, 60 sec: 56251.7, 300 sec: 55983.4). Total num frames: 154681344. Throughput: 0: 55681.6. Samples: 104077900. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-25 22:40:38,924][47056] Avg episode reward: [(0, '0.046')] [2024-04-25 22:40:41,027][47288] Updated weights for policy 0, policy_version 9446 (0.0028) [2024-04-25 22:40:43,155][47288] Updated weights for policy 0, policy_version 9456 (0.0032) [2024-04-25 22:40:43,923][47056] Fps is (10 sec: 60620.9, 60 sec: 56251.8, 300 sec: 56038.8). Total num frames: 154976256. Throughput: 0: 56216.1. Samples: 104259620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-25 22:40:43,923][47056] Avg episode reward: [(0, '0.036')] [2024-04-25 22:40:46,771][47288] Updated weights for policy 0, policy_version 9466 (0.0035) [2024-04-25 22:40:48,923][47056] Fps is (10 sec: 55706.9, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 155238400. Throughput: 0: 56144.9. Samples: 104595280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-25 22:40:48,923][47056] Avg episode reward: [(0, '0.033')] [2024-04-25 22:40:49,002][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000009476_155254784.pth... [2024-04-25 22:40:49,005][47288] Updated weights for policy 0, policy_version 9476 (0.0032) [2024-04-25 22:40:49,050][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000008654_141787136.pth [2024-04-25 22:40:52,701][47288] Updated weights for policy 0, policy_version 9486 (0.0029) [2024-04-25 22:40:53,923][47056] Fps is (10 sec: 52429.3, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 155500544. 
Throughput: 0: 56057.9. Samples: 104928600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-25 22:40:53,923][47056] Avg episode reward: [(0, '0.040')] [2024-04-25 22:40:54,881][47288] Updated weights for policy 0, policy_version 9496 (0.0027) [2024-04-25 22:40:56,443][47267] Signal inference workers to stop experience collection... (1500 times) [2024-04-25 22:40:56,492][47288] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-04-25 22:40:56,500][47267] Signal inference workers to resume experience collection... (1500 times) [2024-04-25 22:40:56,507][47288] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-04-25 22:40:58,622][47288] Updated weights for policy 0, policy_version 9506 (0.0026) [2024-04-25 22:40:58,923][47056] Fps is (10 sec: 52427.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 155762688. Throughput: 0: 55865.9. Samples: 105096420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-25 22:40:58,932][47056] Avg episode reward: [(0, '0.034')] [2024-04-25 22:41:00,678][47288] Updated weights for policy 0, policy_version 9516 (0.0031) [2024-04-25 22:41:03,923][47056] Fps is (10 sec: 52428.6, 60 sec: 54886.3, 300 sec: 55872.2). Total num frames: 156024832. Throughput: 0: 55589.9. Samples: 105424480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 22:41:03,923][47056] Avg episode reward: [(0, '0.046')] [2024-04-25 22:41:04,426][47288] Updated weights for policy 0, policy_version 9526 (0.0025) [2024-04-25 22:41:06,525][47288] Updated weights for policy 0, policy_version 9536 (0.0036) [2024-04-25 22:41:08,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55159.5, 300 sec: 55927.8). Total num frames: 156319744. Throughput: 0: 55673.0. Samples: 105760900. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-25 22:41:08,923][47056] Avg episode reward: [(0, '0.034')] [2024-04-25 22:41:10,164][47288] Updated weights for policy 0, policy_version 9546 (0.0027) [2024-04-25 22:41:12,524][47288] Updated weights for policy 0, policy_version 9556 (0.0029) [2024-04-25 22:41:13,923][47056] Fps is (10 sec: 62258.7, 60 sec: 56251.6, 300 sec: 56038.8). Total num frames: 156647424. Throughput: 0: 56044.4. Samples: 105929720. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-25 22:41:13,932][47056] Avg episode reward: [(0, '0.030')] [2024-04-25 22:41:16,164][47288] Updated weights for policy 0, policy_version 9566 (0.0032) [2024-04-25 22:41:18,223][47288] Updated weights for policy 0, policy_version 9576 (0.0031) [2024-04-25 22:41:18,923][47056] Fps is (10 sec: 62258.6, 60 sec: 56524.8, 300 sec: 56038.8). Total num frames: 156942336. Throughput: 0: 56236.4. Samples: 106269340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-25 22:41:18,923][47056] Avg episode reward: [(0, '0.035')] [2024-04-25 22:41:22,148][47288] Updated weights for policy 0, policy_version 9586 (0.0029) [2024-04-25 22:41:23,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 157188096. Throughput: 0: 56109.5. Samples: 106602820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-25 22:41:23,923][47056] Avg episode reward: [(0, '0.031')] [2024-04-25 22:41:24,089][47288] Updated weights for policy 0, policy_version 9596 (0.0031) [2024-04-25 22:41:27,809][47288] Updated weights for policy 0, policy_version 9606 (0.0032) [2024-04-25 22:41:28,923][47056] Fps is (10 sec: 50790.8, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 157450240. Throughput: 0: 55957.4. Samples: 106777700. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-25 22:41:28,923][47056] Avg episode reward: [(0, '0.039')] [2024-04-25 22:41:29,863][47288] Updated weights for policy 0, policy_version 9616 (0.0028) [2024-04-25 22:41:33,720][47288] Updated weights for policy 0, policy_version 9626 (0.0032) [2024-04-25 22:41:33,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 157728768. Throughput: 0: 55998.5. Samples: 107115220. Policy #0 lag: (min: 0.0, avg: 13.5, max: 22.0) [2024-04-25 22:41:33,923][47056] Avg episode reward: [(0, '0.043')] [2024-04-25 22:41:35,769][47288] Updated weights for policy 0, policy_version 9636 (0.0029) [2024-04-25 22:41:38,923][47056] Fps is (10 sec: 52428.8, 60 sec: 54886.5, 300 sec: 55816.7). Total num frames: 157974528. Throughput: 0: 56099.5. Samples: 107453080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-25 22:41:38,923][47056] Avg episode reward: [(0, '0.039')] [2024-04-25 22:41:39,406][47288] Updated weights for policy 0, policy_version 9646 (0.0027) [2024-04-25 22:41:41,590][47288] Updated weights for policy 0, policy_version 9656 (0.0028) [2024-04-25 22:41:43,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55159.5, 300 sec: 55927.8). Total num frames: 158285824. Throughput: 0: 55819.3. Samples: 107608280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-25 22:41:43,923][47056] Avg episode reward: [(0, '0.036')] [2024-04-25 22:41:45,442][47288] Updated weights for policy 0, policy_version 9666 (0.0030) [2024-04-25 22:41:47,257][47267] Signal inference workers to stop experience collection... (1550 times) [2024-04-25 22:41:47,258][47267] Signal inference workers to resume experience collection... 
(1550 times) [2024-04-25 22:41:47,273][47288] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-04-25 22:41:47,273][47288] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-04-25 22:41:47,375][47288] Updated weights for policy 0, policy_version 9676 (0.0026) [2024-04-25 22:41:48,923][47056] Fps is (10 sec: 60620.4, 60 sec: 55705.4, 300 sec: 55983.3). Total num frames: 158580736. Throughput: 0: 55811.0. Samples: 107935980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-25 22:41:48,923][47056] Avg episode reward: [(0, '0.031')] [2024-04-25 22:41:51,143][47288] Updated weights for policy 0, policy_version 9686 (0.0031) [2024-04-25 22:41:53,343][47288] Updated weights for policy 0, policy_version 9696 (0.0027) [2024-04-25 22:41:53,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 158875648. Throughput: 0: 55866.7. Samples: 108274900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-25 22:41:53,923][47056] Avg episode reward: [(0, '0.035')] [2024-04-25 22:41:57,025][47288] Updated weights for policy 0, policy_version 9706 (0.0030) [2024-04-25 22:41:58,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56798.0, 300 sec: 56094.4). Total num frames: 159170560. Throughput: 0: 56191.6. Samples: 108458340. Policy #0 lag: (min: 0.0, avg: 13.2, max: 26.0) [2024-04-25 22:41:58,923][47056] Avg episode reward: [(0, '0.037')] [2024-04-25 22:41:59,344][47288] Updated weights for policy 0, policy_version 9716 (0.0026) [2024-04-25 22:42:02,816][47288] Updated weights for policy 0, policy_version 9726 (0.0033) [2024-04-25 22:42:03,922][47056] Fps is (10 sec: 52429.4, 60 sec: 56251.9, 300 sec: 55872.3). Total num frames: 159399936. Throughput: 0: 56122.1. Samples: 108794820. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-25 22:42:03,923][47056] Avg episode reward: [(0, '0.038')] [2024-04-25 22:42:05,134][47288] Updated weights for policy 0, policy_version 9736 (0.0034) [2024-04-25 22:42:08,681][47288] Updated weights for policy 0, policy_version 9746 (0.0032) [2024-04-25 22:42:08,923][47056] Fps is (10 sec: 50790.7, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 159678464. Throughput: 0: 56140.0. Samples: 109129120. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-25 22:42:08,923][47056] Avg episode reward: [(0, '0.039')] [2024-04-25 22:42:10,837][47288] Updated weights for policy 0, policy_version 9756 (0.0028) [2024-04-25 22:42:13,923][47056] Fps is (10 sec: 54065.7, 60 sec: 54886.3, 300 sec: 55872.2). Total num frames: 159940608. Throughput: 0: 55615.8. Samples: 109280420. Policy #0 lag: (min: 2.0, avg: 12.0, max: 25.0) [2024-04-25 22:42:13,923][47056] Avg episode reward: [(0, '0.034')] [2024-04-25 22:42:14,678][47288] Updated weights for policy 0, policy_version 9766 (0.0029) [2024-04-25 22:42:16,706][47288] Updated weights for policy 0, policy_version 9776 (0.0027) [2024-04-25 22:42:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 55927.7). Total num frames: 160251904. Throughput: 0: 55590.1. Samples: 109616780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-25 22:42:18,923][47056] Avg episode reward: [(0, '0.033')] [2024-04-25 22:42:20,377][47288] Updated weights for policy 0, policy_version 9786 (0.0030) [2024-04-25 22:42:22,585][47288] Updated weights for policy 0, policy_version 9796 (0.0028) [2024-04-25 22:42:23,923][47056] Fps is (10 sec: 60620.8, 60 sec: 55978.5, 300 sec: 56038.8). Total num frames: 160546816. Throughput: 0: 55617.6. Samples: 109955880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-25 22:42:23,923][47056] Avg episode reward: [(0, '0.049')] [2024-04-25 22:42:23,924][47267] Saving new best policy, reward=0.049! 
[2024-04-25 22:42:26,189][47288] Updated weights for policy 0, policy_version 9806 (0.0027) [2024-04-25 22:42:28,618][47288] Updated weights for policy 0, policy_version 9816 (0.0031) [2024-04-25 22:42:28,923][47056] Fps is (10 sec: 58983.2, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 160841728. Throughput: 0: 56251.2. Samples: 110139580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:42:28,923][47056] Avg episode reward: [(0, '0.050')] [2024-04-25 22:42:28,935][47267] Saving new best policy, reward=0.050! [2024-04-25 22:42:31,918][47288] Updated weights for policy 0, policy_version 9826 (0.0033) [2024-04-25 22:42:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 161120256. Throughput: 0: 56336.4. Samples: 110471120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 22:42:33,923][47056] Avg episode reward: [(0, '0.044')] [2024-04-25 22:42:34,340][47288] Updated weights for policy 0, policy_version 9836 (0.0032) [2024-04-25 22:42:37,843][47288] Updated weights for policy 0, policy_version 9846 (0.0030) [2024-04-25 22:42:38,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56797.8, 300 sec: 55983.3). Total num frames: 161382400. Throughput: 0: 56331.9. Samples: 110809840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-25 22:42:38,923][47056] Avg episode reward: [(0, '0.035')] [2024-04-25 22:42:40,164][47288] Updated weights for policy 0, policy_version 9856 (0.0039) [2024-04-25 22:42:43,568][47288] Updated weights for policy 0, policy_version 9866 (0.0028) [2024-04-25 22:42:43,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 161660928. Throughput: 0: 55939.1. Samples: 110975600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 22:42:43,923][47056] Avg episode reward: [(0, '0.035')] [2024-04-25 22:42:45,882][47267] Signal inference workers to stop experience collection... 
(1600 times) [2024-04-25 22:42:45,934][47288] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-04-25 22:42:45,940][47267] Signal inference workers to resume experience collection... (1600 times) [2024-04-25 22:42:45,947][47288] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-04-25 22:42:46,047][47288] Updated weights for policy 0, policy_version 9876 (0.0032) [2024-04-25 22:42:48,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 161923072. Throughput: 0: 55969.2. Samples: 111313440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 22:42:48,923][47056] Avg episode reward: [(0, '0.038')] [2024-04-25 22:42:48,941][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000009884_161939456.pth... [2024-04-25 22:42:48,988][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000009064_148504576.pth [2024-04-25 22:42:49,491][47288] Updated weights for policy 0, policy_version 9886 (0.0029) [2024-04-25 22:42:51,861][47288] Updated weights for policy 0, policy_version 9896 (0.0033) [2024-04-25 22:42:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 162217984. Throughput: 0: 56028.9. Samples: 111650420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:42:53,923][47056] Avg episode reward: [(0, '0.055')] [2024-04-25 22:42:53,923][47267] Saving new best policy, reward=0.055! [2024-04-25 22:42:55,465][47288] Updated weights for policy 0, policy_version 9906 (0.0027) [2024-04-25 22:42:57,655][47288] Updated weights for policy 0, policy_version 9916 (0.0030) [2024-04-25 22:42:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 162512896. Throughput: 0: 56361.0. Samples: 111816660. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-25 22:42:58,923][47056] Avg episode reward: [(0, '0.036')] [2024-04-25 22:43:01,239][47288] Updated weights for policy 0, policy_version 9926 (0.0030) [2024-04-25 22:43:03,596][47288] Updated weights for policy 0, policy_version 9936 (0.0027) [2024-04-25 22:43:03,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.7, 300 sec: 56038.9). Total num frames: 162807808. Throughput: 0: 56336.9. Samples: 112151940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-25 22:43:03,923][47056] Avg episode reward: [(0, '0.043')] [2024-04-25 22:43:07,127][47288] Updated weights for policy 0, policy_version 9946 (0.0031) [2024-04-25 22:43:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 163053568. Throughput: 0: 56182.4. Samples: 112484080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-04-25 22:43:08,923][47056] Avg episode reward: [(0, '0.043')] [2024-04-25 22:43:09,586][47288] Updated weights for policy 0, policy_version 9956 (0.0031) [2024-04-25 22:43:12,912][47288] Updated weights for policy 0, policy_version 9966 (0.0029) [2024-04-25 22:43:13,923][47056] Fps is (10 sec: 52429.1, 60 sec: 56524.9, 300 sec: 55927.7). Total num frames: 163332096. Throughput: 0: 55822.6. Samples: 112651600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-04-25 22:43:13,923][47056] Avg episode reward: [(0, '0.048')] [2024-04-25 22:43:15,524][47288] Updated weights for policy 0, policy_version 9976 (0.0030) [2024-04-25 22:43:18,750][47288] Updated weights for policy 0, policy_version 9986 (0.0036) [2024-04-25 22:43:18,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 163610624. Throughput: 0: 55966.5. Samples: 112989600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:43:18,923][47056] Avg episode reward: [(0, '0.041')] [2024-04-25 22:43:21,320][47288] Updated weights for policy 0, policy_version 9996 (0.0031) [2024-04-25 22:43:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 163889152. Throughput: 0: 55927.6. Samples: 113326580. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-25 22:43:23,923][47056] Avg episode reward: [(0, '0.047')] [2024-04-25 22:43:24,456][47288] Updated weights for policy 0, policy_version 10006 (0.0030) [2024-04-25 22:43:27,158][47288] Updated weights for policy 0, policy_version 10016 (0.0032) [2024-04-25 22:43:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 164167680. Throughput: 0: 55939.2. Samples: 113492860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-25 22:43:28,932][47056] Avg episode reward: [(0, '0.051')] [2024-04-25 22:43:30,392][47288] Updated weights for policy 0, policy_version 10026 (0.0036) [2024-04-25 22:43:33,124][47288] Updated weights for policy 0, policy_version 10036 (0.0027) [2024-04-25 22:43:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 164462592. Throughput: 0: 55805.3. Samples: 113824680. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-25 22:43:33,923][47056] Avg episode reward: [(0, '0.055')] [2024-04-25 22:43:36,137][47288] Updated weights for policy 0, policy_version 10046 (0.0032) [2024-04-25 22:43:38,907][47267] Signal inference workers to stop experience collection... (1650 times) [2024-04-25 22:43:38,907][47267] Signal inference workers to resume experience collection... (1650 times) [2024-04-25 22:43:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 164741120. Throughput: 0: 55723.9. Samples: 114158000. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-25 22:43:38,923][47056] Avg episode reward: [(0, '0.037')] [2024-04-25 22:43:38,925][47288] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-04-25 22:43:38,932][47288] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-04-25 22:43:39,041][47288] Updated weights for policy 0, policy_version 10056 (0.0035) [2024-04-25 22:43:42,205][47288] Updated weights for policy 0, policy_version 10066 (0.0032) [2024-04-25 22:43:43,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 165003264. Throughput: 0: 55935.5. Samples: 114333760. Policy #0 lag: (min: 2.0, avg: 10.9, max: 21.0) [2024-04-25 22:43:43,932][47056] Avg episode reward: [(0, '0.055')] [2024-04-25 22:43:44,802][47288] Updated weights for policy 0, policy_version 10076 (0.0026) [2024-04-25 22:43:48,128][47288] Updated weights for policy 0, policy_version 10086 (0.0029) [2024-04-25 22:43:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 165298176. Throughput: 0: 55841.9. Samples: 114664820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:43:48,924][47056] Avg episode reward: [(0, '0.046')] [2024-04-25 22:43:50,614][47288] Updated weights for policy 0, policy_version 10096 (0.0030) [2024-04-25 22:43:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 165560320. Throughput: 0: 55781.3. Samples: 114994240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:43:53,923][47056] Avg episode reward: [(0, '0.066')] [2024-04-25 22:43:53,924][47267] Saving new best policy, reward=0.066! [2024-04-25 22:43:53,933][47288] Updated weights for policy 0, policy_version 10106 (0.0029) [2024-04-25 22:43:56,508][47288] Updated weights for policy 0, policy_version 10116 (0.0029) [2024-04-25 22:43:58,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55872.2). 
Total num frames: 165838848. Throughput: 0: 55676.3. Samples: 115157040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 22:43:58,923][47056] Avg episode reward: [(0, '0.055')] [2024-04-25 22:43:59,684][47288] Updated weights for policy 0, policy_version 10126 (0.0029) [2024-04-25 22:44:02,518][47288] Updated weights for policy 0, policy_version 10136 (0.0028) [2024-04-25 22:44:03,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55159.6, 300 sec: 55927.8). Total num frames: 166117376. Throughput: 0: 55652.0. Samples: 115493940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-04-25 22:44:03,923][47056] Avg episode reward: [(0, '0.043')] [2024-04-25 22:44:05,547][47288] Updated weights for policy 0, policy_version 10146 (0.0030) [2024-04-25 22:44:08,334][47288] Updated weights for policy 0, policy_version 10156 (0.0028) [2024-04-25 22:44:08,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 166412288. Throughput: 0: 55579.5. Samples: 115827660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-04-25 22:44:08,923][47056] Avg episode reward: [(0, '0.045')] [2024-04-25 22:44:11,579][47288] Updated weights for policy 0, policy_version 10166 (0.0028) [2024-04-25 22:44:13,923][47056] Fps is (10 sec: 57342.9, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 166690816. Throughput: 0: 55762.5. Samples: 116002180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 22:44:13,923][47056] Avg episode reward: [(0, '0.041')] [2024-04-25 22:44:14,201][47288] Updated weights for policy 0, policy_version 10176 (0.0031) [2024-04-25 22:44:17,466][47288] Updated weights for policy 0, policy_version 10186 (0.0030) [2024-04-25 22:44:18,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 166952960. Throughput: 0: 55694.3. Samples: 116330920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 22:44:18,923][47056] Avg episode reward: [(0, '0.038')] [2024-04-25 22:44:20,188][47288] Updated weights for policy 0, policy_version 10196 (0.0036) [2024-04-25 22:44:23,416][47288] Updated weights for policy 0, policy_version 10206 (0.0028) [2024-04-25 22:44:23,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 167247872. Throughput: 0: 55598.7. Samples: 116659940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 22:44:23,923][47056] Avg episode reward: [(0, '0.061')] [2024-04-25 22:44:26,106][47288] Updated weights for policy 0, policy_version 10216 (0.0032) [2024-04-25 22:44:28,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 167510016. Throughput: 0: 55332.7. Samples: 116823740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:44:28,923][47056] Avg episode reward: [(0, '0.048')] [2024-04-25 22:44:29,237][47288] Updated weights for policy 0, policy_version 10226 (0.0027) [2024-04-25 22:44:31,983][47288] Updated weights for policy 0, policy_version 10236 (0.0029) [2024-04-25 22:44:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 167788544. Throughput: 0: 55526.6. Samples: 117163520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:44:33,923][47056] Avg episode reward: [(0, '0.047')] [2024-04-25 22:44:34,990][47288] Updated weights for policy 0, policy_version 10246 (0.0031) [2024-04-25 22:44:37,971][47288] Updated weights for policy 0, policy_version 10256 (0.0036) [2024-04-25 22:44:38,923][47056] Fps is (10 sec: 55707.2, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 168067072. Throughput: 0: 55710.0. Samples: 117501180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 22:44:38,923][47056] Avg episode reward: [(0, '0.044')] [2024-04-25 22:44:40,764][47288] Updated weights for policy 0, policy_version 10266 (0.0027) [2024-04-25 22:44:43,784][47288] Updated weights for policy 0, policy_version 10276 (0.0035) [2024-04-25 22:44:43,926][47056] Fps is (10 sec: 57325.7, 60 sec: 55975.7, 300 sec: 55982.7). Total num frames: 168361984. Throughput: 0: 55840.1. Samples: 117670020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 22:44:43,926][47056] Avg episode reward: [(0, '0.037')] [2024-04-25 22:44:46,573][47288] Updated weights for policy 0, policy_version 10286 (0.0033) [2024-04-25 22:44:48,923][47056] Fps is (10 sec: 57342.3, 60 sec: 55705.4, 300 sec: 56038.8). Total num frames: 168640512. Throughput: 0: 55851.7. Samples: 118007280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-25 22:44:48,924][47056] Avg episode reward: [(0, '0.046')] [2024-04-25 22:44:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000010293_168640512.pth... [2024-04-25 22:44:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000009476_155254784.pth [2024-04-25 22:44:49,663][47288] Updated weights for policy 0, policy_version 10296 (0.0023) [2024-04-25 22:44:51,424][47267] Signal inference workers to stop experience collection... (1700 times) [2024-04-25 22:44:51,425][47267] Signal inference workers to resume experience collection... (1700 times) [2024-04-25 22:44:51,451][47288] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-04-25 22:44:51,451][47288] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-04-25 22:44:52,401][47288] Updated weights for policy 0, policy_version 10306 (0.0026) [2024-04-25 22:44:53,923][47056] Fps is (10 sec: 55724.0, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 168919040. Throughput: 0: 55677.0. Samples: 118333120. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-25 22:44:53,923][47056] Avg episode reward: [(0, '0.052')] [2024-04-25 22:44:55,557][47288] Updated weights for policy 0, policy_version 10316 (0.0037) [2024-04-25 22:44:58,141][47288] Updated weights for policy 0, policy_version 10326 (0.0036) [2024-04-25 22:44:58,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 169197568. Throughput: 0: 55771.2. Samples: 118511880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-25 22:44:58,923][47056] Avg episode reward: [(0, '0.046')] [2024-04-25 22:45:01,583][47288] Updated weights for policy 0, policy_version 10336 (0.0026) [2024-04-25 22:45:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 169476096. Throughput: 0: 55812.9. Samples: 118842500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 22:45:03,923][47056] Avg episode reward: [(0, '0.061')] [2024-04-25 22:45:04,467][47288] Updated weights for policy 0, policy_version 10346 (0.0029) [2024-04-25 22:45:07,413][47288] Updated weights for policy 0, policy_version 10356 (0.0027) [2024-04-25 22:45:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 169738240. Throughput: 0: 55980.4. Samples: 119179060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 22:45:08,923][47056] Avg episode reward: [(0, '0.075')] [2024-04-25 22:45:08,971][47267] Saving new best policy, reward=0.075! [2024-04-25 22:45:10,175][47288] Updated weights for policy 0, policy_version 10366 (0.0027) [2024-04-25 22:45:13,262][47288] Updated weights for policy 0, policy_version 10376 (0.0033) [2024-04-25 22:45:13,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 170016768. Throughput: 0: 55951.7. Samples: 119341560. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-25 22:45:13,923][47056] Avg episode reward: [(0, '0.049')] [2024-04-25 22:45:16,105][47288] Updated weights for policy 0, policy_version 10386 (0.0027) [2024-04-25 22:45:18,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 170295296. Throughput: 0: 55967.3. Samples: 119682060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 22:45:18,923][47056] Avg episode reward: [(0, '0.055')] [2024-04-25 22:45:19,186][47288] Updated weights for policy 0, policy_version 10396 (0.0028) [2024-04-25 22:45:21,938][47288] Updated weights for policy 0, policy_version 10406 (0.0027) [2024-04-25 22:45:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55983.3). Total num frames: 170573824. Throughput: 0: 55850.9. Samples: 120014480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 22:45:23,923][47056] Avg episode reward: [(0, '0.062')] [2024-04-25 22:45:25,189][47288] Updated weights for policy 0, policy_version 10416 (0.0027) [2024-04-25 22:45:27,875][47288] Updated weights for policy 0, policy_version 10426 (0.0032) [2024-04-25 22:45:28,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 170868736. Throughput: 0: 55753.2. Samples: 120178740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 22:45:28,923][47056] Avg episode reward: [(0, '0.043')] [2024-04-25 22:45:31,000][47288] Updated weights for policy 0, policy_version 10436 (0.0032) [2024-04-25 22:45:33,560][47288] Updated weights for policy 0, policy_version 10446 (0.0033) [2024-04-25 22:45:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 171147264. Throughput: 0: 55618.4. Samples: 120510100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 22:45:33,923][47056] Avg episode reward: [(0, '0.052')] [2024-04-25 22:45:36,921][47288] Updated weights for policy 0, policy_version 10456 (0.0028) [2024-04-25 22:45:38,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 171425792. Throughput: 0: 55707.1. Samples: 120839940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-25 22:45:38,923][47056] Avg episode reward: [(0, '0.056')] [2024-04-25 22:45:39,593][47288] Updated weights for policy 0, policy_version 10466 (0.0030) [2024-04-25 22:45:42,901][47288] Updated weights for policy 0, policy_version 10476 (0.0028) [2024-04-25 22:45:43,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55435.5, 300 sec: 55761.1). Total num frames: 171687936. Throughput: 0: 55470.3. Samples: 121008040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-25 22:45:43,923][47056] Avg episode reward: [(0, '0.033')] [2024-04-25 22:45:45,736][47288] Updated weights for policy 0, policy_version 10486 (0.0035) [2024-04-25 22:45:48,825][47288] Updated weights for policy 0, policy_version 10496 (0.0034) [2024-04-25 22:45:48,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 171966464. Throughput: 0: 55509.2. Samples: 121340420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-25 22:45:48,923][47056] Avg episode reward: [(0, '0.057')] [2024-04-25 22:45:51,665][47288] Updated weights for policy 0, policy_version 10506 (0.0029) [2024-04-25 22:45:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 172228608. Throughput: 0: 55495.6. Samples: 121676360. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 22:45:53,923][47056] Avg episode reward: [(0, '0.072')] [2024-04-25 22:45:54,709][47288] Updated weights for policy 0, policy_version 10516 (0.0029) [2024-04-25 22:45:57,422][47288] Updated weights for policy 0, policy_version 10526 (0.0032) [2024-04-25 22:45:58,609][47267] Signal inference workers to stop experience collection... (1750 times) [2024-04-25 22:45:58,628][47288] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-04-25 22:45:58,695][47267] Signal inference workers to resume experience collection... (1750 times) [2024-04-25 22:45:58,695][47288] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-04-25 22:45:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 172539904. Throughput: 0: 55494.7. Samples: 121838820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 22:45:58,932][47056] Avg episode reward: [(0, '0.062')] [2024-04-25 22:46:00,602][47288] Updated weights for policy 0, policy_version 10536 (0.0028) [2024-04-25 22:46:03,141][47288] Updated weights for policy 0, policy_version 10546 (0.0029) [2024-04-25 22:46:03,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 172818432. Throughput: 0: 55291.4. Samples: 122170160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-25 22:46:03,923][47056] Avg episode reward: [(0, '0.064')] [2024-04-25 22:46:06,614][47288] Updated weights for policy 0, policy_version 10556 (0.0035) [2024-04-25 22:46:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 173096960. Throughput: 0: 55351.1. Samples: 122505280. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-04-25 22:46:08,932][47056] Avg episode reward: [(0, '0.066')] [2024-04-25 22:46:09,054][47288] Updated weights for policy 0, policy_version 10566 (0.0029) [2024-04-25 22:46:12,556][47288] Updated weights for policy 0, policy_version 10576 (0.0033) [2024-04-25 22:46:13,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 173391872. Throughput: 0: 55692.7. Samples: 122684900. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-04-25 22:46:13,923][47056] Avg episode reward: [(0, '0.041')] [2024-04-25 22:46:14,832][47288] Updated weights for policy 0, policy_version 10586 (0.0025) [2024-04-25 22:46:18,359][47288] Updated weights for policy 0, policy_version 10596 (0.0030) [2024-04-25 22:46:18,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 173637632. Throughput: 0: 55858.2. Samples: 123023720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-25 22:46:18,923][47056] Avg episode reward: [(0, '0.057')] [2024-04-25 22:46:20,490][47288] Updated weights for policy 0, policy_version 10606 (0.0038) [2024-04-25 22:46:23,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 173899776. Throughput: 0: 55907.1. Samples: 123355760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-25 22:46:23,923][47056] Avg episode reward: [(0, '0.050')] [2024-04-25 22:46:24,153][47288] Updated weights for policy 0, policy_version 10616 (0.0027) [2024-04-25 22:46:26,552][47288] Updated weights for policy 0, policy_version 10626 (0.0032) [2024-04-25 22:46:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 174194688. Throughput: 0: 55817.2. Samples: 123519820. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-25 22:46:28,923][47056] Avg episode reward: [(0, '0.046')] [2024-04-25 22:46:30,112][47288] Updated weights for policy 0, policy_version 10636 (0.0030) [2024-04-25 22:46:32,684][47288] Updated weights for policy 0, policy_version 10646 (0.0030) [2024-04-25 22:46:33,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 174489600. Throughput: 0: 55856.4. Samples: 123853960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-25 22:46:33,923][47056] Avg episode reward: [(0, '0.062')] [2024-04-25 22:46:35,933][47288] Updated weights for policy 0, policy_version 10656 (0.0030) [2024-04-25 22:46:38,375][47288] Updated weights for policy 0, policy_version 10666 (0.0030) [2024-04-25 22:46:38,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 174768128. Throughput: 0: 55739.5. Samples: 124184640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-25 22:46:38,923][47056] Avg episode reward: [(0, '0.056')] [2024-04-25 22:46:41,720][47288] Updated weights for policy 0, policy_version 10676 (0.0031) [2024-04-25 22:46:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 175046656. Throughput: 0: 56060.9. Samples: 124361560. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-25 22:46:43,923][47056] Avg episode reward: [(0, '0.052')] [2024-04-25 22:46:44,237][47288] Updated weights for policy 0, policy_version 10686 (0.0031) [2024-04-25 22:46:47,649][47288] Updated weights for policy 0, policy_version 10696 (0.0033) [2024-04-25 22:46:48,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 175357952. Throughput: 0: 56094.2. Samples: 124694400. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-25 22:46:48,923][47056] Avg episode reward: [(0, '0.055')] [2024-04-25 22:46:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000010703_175357952.pth... [2024-04-25 22:46:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000009884_161939456.pth [2024-04-25 22:46:50,092][47288] Updated weights for policy 0, policy_version 10706 (0.0037) [2024-04-25 22:46:53,583][47288] Updated weights for policy 0, policy_version 10716 (0.0028) [2024-04-25 22:46:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 175603712. Throughput: 0: 56097.9. Samples: 125029680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:46:53,923][47056] Avg episode reward: [(0, '0.054')] [2024-04-25 22:46:55,917][47288] Updated weights for policy 0, policy_version 10726 (0.0029) [2024-04-25 22:46:58,923][47056] Fps is (10 sec: 49151.4, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 175849472. Throughput: 0: 55639.8. Samples: 125188700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-25 22:46:58,923][47056] Avg episode reward: [(0, '0.060')] [2024-04-25 22:46:59,266][47288] Updated weights for policy 0, policy_version 10736 (0.0029) [2024-04-25 22:47:01,643][47288] Updated weights for policy 0, policy_version 10746 (0.0028) [2024-04-25 22:47:03,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 176144384. Throughput: 0: 55664.4. Samples: 125528620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-25 22:47:03,924][47056] Avg episode reward: [(0, '0.056')] [2024-04-25 22:47:04,962][47267] Signal inference workers to stop experience collection... (1800 times) [2024-04-25 22:47:04,995][47288] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-04-25 22:47:05,021][47267] Signal inference workers to resume experience collection... 
(1800 times) [2024-04-25 22:47:05,021][47288] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-04-25 22:47:05,149][47288] Updated weights for policy 0, policy_version 10756 (0.0034) [2024-04-25 22:47:07,724][47288] Updated weights for policy 0, policy_version 10766 (0.0036) [2024-04-25 22:47:08,923][47056] Fps is (10 sec: 58981.8, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 176439296. Throughput: 0: 55667.7. Samples: 125860820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 22:47:08,923][47056] Avg episode reward: [(0, '0.055')] [2024-04-25 22:47:11,149][47288] Updated weights for policy 0, policy_version 10776 (0.0034) [2024-04-25 22:47:13,604][47288] Updated weights for policy 0, policy_version 10786 (0.0031) [2024-04-25 22:47:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 176717824. Throughput: 0: 55737.4. Samples: 126028000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 22:47:13,923][47056] Avg episode reward: [(0, '0.052')] [2024-04-25 22:47:16,916][47288] Updated weights for policy 0, policy_version 10796 (0.0030) [2024-04-25 22:47:18,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 177012736. Throughput: 0: 55699.2. Samples: 126360420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:47:18,923][47056] Avg episode reward: [(0, '0.047')] [2024-04-25 22:47:19,637][47288] Updated weights for policy 0, policy_version 10806 (0.0026) [2024-04-25 22:47:22,675][47288] Updated weights for policy 0, policy_version 10816 (0.0027) [2024-04-25 22:47:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 177291264. Throughput: 0: 55754.2. Samples: 126693580. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:47:23,923][47056] Avg episode reward: [(0, '0.070')] [2024-04-25 22:47:25,331][47288] Updated weights for policy 0, policy_version 10826 (0.0028) [2024-04-25 22:47:28,523][47288] Updated weights for policy 0, policy_version 10836 (0.0031) [2024-04-25 22:47:28,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 177569792. Throughput: 0: 55721.4. Samples: 126869020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-25 22:47:28,923][47056] Avg episode reward: [(0, '0.060')] [2024-04-25 22:47:31,270][47288] Updated weights for policy 0, policy_version 10846 (0.0037) [2024-04-25 22:47:33,923][47056] Fps is (10 sec: 50789.8, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 177799168. Throughput: 0: 55850.9. Samples: 127207700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-25 22:47:33,923][47056] Avg episode reward: [(0, '0.074')] [2024-04-25 22:47:34,405][47288] Updated weights for policy 0, policy_version 10856 (0.0030) [2024-04-25 22:47:36,964][47288] Updated weights for policy 0, policy_version 10866 (0.0031) [2024-04-25 22:47:38,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 178094080. Throughput: 0: 55866.2. Samples: 127543660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-25 22:47:38,923][47056] Avg episode reward: [(0, '0.073')] [2024-04-25 22:47:40,218][47288] Updated weights for policy 0, policy_version 10876 (0.0036) [2024-04-25 22:47:42,712][47288] Updated weights for policy 0, policy_version 10886 (0.0026) [2024-04-25 22:47:43,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 178388992. Throughput: 0: 55877.0. Samples: 127703160. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-25 22:47:43,923][47056] Avg episode reward: [(0, '0.053')] [2024-04-25 22:47:45,989][47288] Updated weights for policy 0, policy_version 10896 (0.0027) [2024-04-25 22:47:48,557][47288] Updated weights for policy 0, policy_version 10906 (0.0033) [2024-04-25 22:47:48,923][47056] Fps is (10 sec: 58981.2, 60 sec: 55432.3, 300 sec: 55816.6). Total num frames: 178683904. Throughput: 0: 55825.3. Samples: 128040760. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-25 22:47:48,923][47056] Avg episode reward: [(0, '0.045')] [2024-04-25 22:47:51,828][47288] Updated weights for policy 0, policy_version 10916 (0.0035) [2024-04-25 22:47:53,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 178962432. Throughput: 0: 55903.2. Samples: 128376460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:47:53,924][47056] Avg episode reward: [(0, '0.065')] [2024-04-25 22:47:54,422][47288] Updated weights for policy 0, policy_version 10926 (0.0034) [2024-04-25 22:47:57,641][47288] Updated weights for policy 0, policy_version 10936 (0.0025) [2024-04-25 22:47:58,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 179240960. Throughput: 0: 56194.2. Samples: 128556740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:47:58,923][47056] Avg episode reward: [(0, '0.046')] [2024-04-25 22:48:00,326][47288] Updated weights for policy 0, policy_version 10946 (0.0028) [2024-04-25 22:48:03,563][47288] Updated weights for policy 0, policy_version 10956 (0.0032) [2024-04-25 22:48:03,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 179519488. Throughput: 0: 56253.8. Samples: 128891840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 22:48:03,923][47056] Avg episode reward: [(0, '0.063')] [2024-04-25 22:48:06,239][47288] Updated weights for policy 0, policy_version 10966 (0.0028) [2024-04-25 22:48:08,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 179781632. Throughput: 0: 56358.3. Samples: 129229700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-25 22:48:08,923][47056] Avg episode reward: [(0, '0.052')] [2024-04-25 22:48:09,016][47267] Signal inference workers to stop experience collection... (1850 times) [2024-04-25 22:48:09,017][47267] Signal inference workers to resume experience collection... (1850 times) [2024-04-25 22:48:09,036][47288] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-04-25 22:48:09,036][47288] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-04-25 22:48:09,249][47288] Updated weights for policy 0, policy_version 10976 (0.0032) [2024-04-25 22:48:12,193][47288] Updated weights for policy 0, policy_version 10986 (0.0030) [2024-04-25 22:48:13,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 180043776. Throughput: 0: 55827.3. Samples: 129381260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-25 22:48:13,923][47056] Avg episode reward: [(0, '0.056')] [2024-04-25 22:48:15,174][47288] Updated weights for policy 0, policy_version 10996 (0.0028) [2024-04-25 22:48:17,848][47288] Updated weights for policy 0, policy_version 11006 (0.0029) [2024-04-25 22:48:18,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 180355072. Throughput: 0: 55832.5. Samples: 129720160. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 28.0) [2024-04-25 22:48:18,923][47056] Avg episode reward: [(0, '0.064')] [2024-04-25 22:48:21,096][47288] Updated weights for policy 0, policy_version 11016 (0.0027) [2024-04-25 22:48:23,533][47288] Updated weights for policy 0, policy_version 11026 (0.0032) [2024-04-25 22:48:23,923][47056] Fps is (10 sec: 60621.2, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 180649984. Throughput: 0: 55822.5. Samples: 130055680. Policy #0 lag: (min: 0.0, avg: 13.1, max: 28.0) [2024-04-25 22:48:23,923][47056] Avg episode reward: [(0, '0.050')] [2024-04-25 22:48:26,843][47288] Updated weights for policy 0, policy_version 11036 (0.0031) [2024-04-25 22:48:28,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 180928512. Throughput: 0: 56263.1. Samples: 130235000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 22:48:28,923][47056] Avg episode reward: [(0, '0.054')] [2024-04-25 22:48:29,496][47288] Updated weights for policy 0, policy_version 11046 (0.0026) [2024-04-25 22:48:32,762][47288] Updated weights for policy 0, policy_version 11056 (0.0033) [2024-04-25 22:48:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56798.0, 300 sec: 55816.7). Total num frames: 181207040. Throughput: 0: 56098.0. Samples: 130565160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-25 22:48:33,923][47056] Avg episode reward: [(0, '0.068')] [2024-04-25 22:48:35,552][47288] Updated weights for policy 0, policy_version 11066 (0.0029) [2024-04-25 22:48:38,492][47288] Updated weights for policy 0, policy_version 11076 (0.0028) [2024-04-25 22:48:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 181485568. Throughput: 0: 56098.8. Samples: 130900900. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-25 22:48:38,923][47056] Avg episode reward: [(0, '0.070')] [2024-04-25 22:48:41,547][47288] Updated weights for policy 0, policy_version 11086 (0.0032) [2024-04-25 22:48:43,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 181747712. Throughput: 0: 55860.9. Samples: 131070480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-25 22:48:43,923][47056] Avg episode reward: [(0, '0.070')] [2024-04-25 22:48:44,446][47288] Updated weights for policy 0, policy_version 11096 (0.0027) [2024-04-25 22:48:47,350][47288] Updated weights for policy 0, policy_version 11106 (0.0029) [2024-04-25 22:48:48,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 182009856. Throughput: 0: 55787.2. Samples: 131402260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-25 22:48:48,923][47056] Avg episode reward: [(0, '0.071')] [2024-04-25 22:48:48,956][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000011110_182026240.pth... [2024-04-25 22:48:49,002][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000010293_168640512.pth [2024-04-25 22:48:50,256][47288] Updated weights for policy 0, policy_version 11116 (0.0032) [2024-04-25 22:48:53,175][47288] Updated weights for policy 0, policy_version 11126 (0.0035) [2024-04-25 22:48:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 182304768. Throughput: 0: 55814.8. Samples: 131741360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-25 22:48:53,931][47056] Avg episode reward: [(0, '0.076')] [2024-04-25 22:48:56,003][47288] Updated weights for policy 0, policy_version 11136 (0.0032) [2024-04-25 22:48:58,848][47288] Updated weights for policy 0, policy_version 11146 (0.0032) [2024-04-25 22:48:58,923][47056] Fps is (10 sec: 60619.5, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 182616064. 
Throughput: 0: 56092.8. Samples: 131905440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-25 22:48:58,923][47056] Avg episode reward: [(0, '0.072')] [2024-04-25 22:49:01,996][47288] Updated weights for policy 0, policy_version 11156 (0.0031) [2024-04-25 22:49:03,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 182894592. Throughput: 0: 56107.2. Samples: 132244980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:49:03,923][47056] Avg episode reward: [(0, '0.067')] [2024-04-25 22:49:04,510][47288] Updated weights for policy 0, policy_version 11166 (0.0026) [2024-04-25 22:49:07,859][47288] Updated weights for policy 0, policy_version 11176 (0.0032) [2024-04-25 22:49:08,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 183173120. Throughput: 0: 56117.9. Samples: 132580980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:49:08,923][47056] Avg episode reward: [(0, '0.077')] [2024-04-25 22:49:08,931][47267] Saving new best policy, reward=0.077! [2024-04-25 22:49:09,739][47267] Signal inference workers to stop experience collection... (1900 times) [2024-04-25 22:49:09,740][47267] Signal inference workers to resume experience collection... (1900 times) [2024-04-25 22:49:09,785][47288] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-04-25 22:49:09,785][47288] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-04-25 22:49:10,331][47288] Updated weights for policy 0, policy_version 11186 (0.0031) [2024-04-25 22:49:13,550][47288] Updated weights for policy 0, policy_version 11196 (0.0027) [2024-04-25 22:49:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56797.9, 300 sec: 55927.7). Total num frames: 183451648. Throughput: 0: 56076.4. Samples: 132758440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:49:13,923][47056] Avg episode reward: [(0, '0.061')] [2024-04-25 22:49:16,462][47288] Updated weights for policy 0, policy_version 11206 (0.0031) [2024-04-25 22:49:18,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 183730176. Throughput: 0: 56136.7. Samples: 133091320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 22:49:18,923][47056] Avg episode reward: [(0, '0.067')] [2024-04-25 22:49:19,383][47288] Updated weights for policy 0, policy_version 11216 (0.0033) [2024-04-25 22:49:22,520][47288] Updated weights for policy 0, policy_version 11226 (0.0036) [2024-04-25 22:49:23,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 183992320. Throughput: 0: 56201.1. Samples: 133429960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 22:49:23,923][47056] Avg episode reward: [(0, '0.066')] [2024-04-25 22:49:25,166][47288] Updated weights for policy 0, policy_version 11236 (0.0031) [2024-04-25 22:49:28,594][47288] Updated weights for policy 0, policy_version 11246 (0.0028) [2024-04-25 22:49:28,923][47056] Fps is (10 sec: 52427.5, 60 sec: 55432.2, 300 sec: 55816.6). Total num frames: 184254464. Throughput: 0: 56101.8. Samples: 133595080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-25 22:49:28,923][47056] Avg episode reward: [(0, '0.059')] [2024-04-25 22:49:31,165][47288] Updated weights for policy 0, policy_version 11256 (0.0029) [2024-04-25 22:49:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 184549376. Throughput: 0: 56079.8. Samples: 133925860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-25 22:49:33,923][47056] Avg episode reward: [(0, '0.060')] [2024-04-25 22:49:34,461][47288] Updated weights for policy 0, policy_version 11266 (0.0028) [2024-04-25 22:49:36,970][47288] Updated weights for policy 0, policy_version 11276 (0.0037) [2024-04-25 22:49:38,923][47056] Fps is (10 sec: 58984.5, 60 sec: 55978.7, 300 sec: 55872.8). Total num frames: 184844288. Throughput: 0: 56023.5. Samples: 134262420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:49:38,923][47056] Avg episode reward: [(0, '0.056')] [2024-04-25 22:49:40,184][47288] Updated weights for policy 0, policy_version 11286 (0.0026) [2024-04-25 22:49:42,916][47288] Updated weights for policy 0, policy_version 11296 (0.0029) [2024-04-25 22:49:43,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.8, 300 sec: 55872.3). Total num frames: 185122816. Throughput: 0: 56275.9. Samples: 134437840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:49:43,923][47056] Avg episode reward: [(0, '0.078')] [2024-04-25 22:49:43,936][47267] Saving new best policy, reward=0.078! [2024-04-25 22:49:45,928][47288] Updated weights for policy 0, policy_version 11306 (0.0031) [2024-04-25 22:49:48,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 185384960. Throughput: 0: 56158.4. Samples: 134772100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-25 22:49:48,923][47056] Avg episode reward: [(0, '0.060')] [2024-04-25 22:49:48,954][47288] Updated weights for policy 0, policy_version 11316 (0.0028) [2024-04-25 22:49:51,782][47288] Updated weights for policy 0, policy_version 11326 (0.0034) [2024-04-25 22:49:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 185679872. Throughput: 0: 56057.4. Samples: 135103560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 22:49:53,923][47056] Avg episode reward: [(0, '0.059')] [2024-04-25 22:49:54,698][47288] Updated weights for policy 0, policy_version 11336 (0.0028) [2024-04-25 22:49:57,727][47288] Updated weights for policy 0, policy_version 11346 (0.0027) [2024-04-25 22:49:58,923][47056] Fps is (10 sec: 57342.9, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 185958400. Throughput: 0: 55843.5. Samples: 135271400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 22:49:58,923][47056] Avg episode reward: [(0, '0.078')] [2024-04-25 22:50:00,457][47288] Updated weights for policy 0, policy_version 11356 (0.0027) [2024-04-25 22:50:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 186220544. Throughput: 0: 55896.1. Samples: 135606640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:50:03,923][47056] Avg episode reward: [(0, '0.047')] [2024-04-25 22:50:03,928][47288] Updated weights for policy 0, policy_version 11366 (0.0036) [2024-04-25 22:50:06,450][47288] Updated weights for policy 0, policy_version 11376 (0.0031) [2024-04-25 22:50:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 186515456. Throughput: 0: 55838.8. Samples: 135942700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 22:50:08,923][47056] Avg episode reward: [(0, '0.065')] [2024-04-25 22:50:09,873][47288] Updated weights for policy 0, policy_version 11386 (0.0026) [2024-04-25 22:50:09,936][47267] Signal inference workers to stop experience collection... (1950 times) [2024-04-25 22:50:09,965][47288] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-04-25 22:50:09,991][47267] Signal inference workers to resume experience collection... 
(1950 times) [2024-04-25 22:50:09,995][47288] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-04-25 22:50:12,309][47288] Updated weights for policy 0, policy_version 11396 (0.0029) [2024-04-25 22:50:13,923][47056] Fps is (10 sec: 58981.6, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 186810368. Throughput: 0: 55698.9. Samples: 136101520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-04-25 22:50:13,923][47056] Avg episode reward: [(0, '0.078')] [2024-04-25 22:50:15,708][47288] Updated weights for policy 0, policy_version 11406 (0.0032) [2024-04-25 22:50:18,134][47288] Updated weights for policy 0, policy_version 11416 (0.0027) [2024-04-25 22:50:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 187088896. Throughput: 0: 55742.7. Samples: 136434280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-04-25 22:50:18,923][47056] Avg episode reward: [(0, '0.052')] [2024-04-25 22:50:21,481][47288] Updated weights for policy 0, policy_version 11426 (0.0030) [2024-04-25 22:50:23,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 187351040. Throughput: 0: 55868.8. Samples: 136776520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-25 22:50:23,923][47056] Avg episode reward: [(0, '0.077')] [2024-04-25 22:50:24,096][47288] Updated weights for policy 0, policy_version 11436 (0.0031) [2024-04-25 22:50:27,331][47288] Updated weights for policy 0, policy_version 11446 (0.0031) [2024-04-25 22:50:28,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56252.0, 300 sec: 55872.2). Total num frames: 187629568. Throughput: 0: 55604.3. Samples: 136940040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 22:50:28,923][47056] Avg episode reward: [(0, '0.071')] [2024-04-25 22:50:30,117][47288] Updated weights for policy 0, policy_version 11456 (0.0033) [2024-04-25 22:50:33,161][47288] Updated weights for policy 0, policy_version 11466 (0.0039) [2024-04-25 22:50:33,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 187891712. Throughput: 0: 55647.5. Samples: 137276240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 22:50:33,923][47056] Avg episode reward: [(0, '0.066')] [2024-04-25 22:50:36,014][47288] Updated weights for policy 0, policy_version 11476 (0.0035) [2024-04-25 22:50:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 188170240. Throughput: 0: 55759.5. Samples: 137612740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-25 22:50:38,923][47056] Avg episode reward: [(0, '0.076')] [2024-04-25 22:50:38,931][47288] Updated weights for policy 0, policy_version 11486 (0.0033) [2024-04-25 22:50:41,813][47288] Updated weights for policy 0, policy_version 11496 (0.0034) [2024-04-25 22:50:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 188448768. Throughput: 0: 55734.8. Samples: 137779460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-25 22:50:43,923][47056] Avg episode reward: [(0, '0.079')] [2024-04-25 22:50:44,697][47288] Updated weights for policy 0, policy_version 11506 (0.0024) [2024-04-25 22:50:47,618][47288] Updated weights for policy 0, policy_version 11516 (0.0029) [2024-04-25 22:50:48,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.6, 300 sec: 56038.8). Total num frames: 188760064. Throughput: 0: 55730.6. Samples: 138114520. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-25 22:50:48,923][47056] Avg episode reward: [(0, '0.071')] [2024-04-25 22:50:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000011521_188760064.pth... [2024-04-25 22:50:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000010703_175357952.pth [2024-04-25 22:50:50,612][47288] Updated weights for policy 0, policy_version 11526 (0.0031) [2024-04-25 22:50:53,512][47288] Updated weights for policy 0, policy_version 11536 (0.0030) [2024-04-25 22:50:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 189022208. Throughput: 0: 55759.6. Samples: 138451880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-25 22:50:53,923][47056] Avg episode reward: [(0, '0.086')] [2024-04-25 22:50:53,924][47267] Saving new best policy, reward=0.086! [2024-04-25 22:50:56,417][47288] Updated weights for policy 0, policy_version 11546 (0.0035) [2024-04-25 22:50:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 189300736. Throughput: 0: 55836.1. Samples: 138614140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:50:58,923][47056] Avg episode reward: [(0, '0.067')] [2024-04-25 22:50:59,350][47288] Updated weights for policy 0, policy_version 11556 (0.0030) [2024-04-25 22:51:00,049][47267] Signal inference workers to stop experience collection... (2000 times) [2024-04-25 22:51:00,049][47267] Signal inference workers to resume experience collection... (2000 times) [2024-04-25 22:51:00,062][47288] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-04-25 22:51:00,062][47288] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-04-25 22:51:02,354][47288] Updated weights for policy 0, policy_version 11566 (0.0028) [2024-04-25 22:51:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55872.2). 
Total num frames: 189579264. Throughput: 0: 55921.8. Samples: 138950760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:51:03,923][47056] Avg episode reward: [(0, '0.071')] [2024-04-25 22:51:05,145][47288] Updated weights for policy 0, policy_version 11576 (0.0032) [2024-04-25 22:51:08,157][47288] Updated weights for policy 0, policy_version 11586 (0.0024) [2024-04-25 22:51:08,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 189874176. Throughput: 0: 55794.5. Samples: 139287280. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-04-25 22:51:08,923][47056] Avg episode reward: [(0, '0.049')] [2024-04-25 22:51:11,082][47288] Updated weights for policy 0, policy_version 11596 (0.0032) [2024-04-25 22:51:13,881][47288] Updated weights for policy 0, policy_version 11606 (0.0032) [2024-04-25 22:51:13,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 190152704. Throughput: 0: 55915.7. Samples: 139456240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 22:51:13,923][47056] Avg episode reward: [(0, '0.078')] [2024-04-25 22:51:17,027][47288] Updated weights for policy 0, policy_version 11616 (0.0025) [2024-04-25 22:51:18,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 190414848. Throughput: 0: 55944.4. Samples: 139793740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 22:51:18,923][47056] Avg episode reward: [(0, '0.073')] [2024-04-25 22:51:19,613][47288] Updated weights for policy 0, policy_version 11626 (0.0026) [2024-04-25 22:51:22,775][47288] Updated weights for policy 0, policy_version 11636 (0.0031) [2024-04-25 22:51:23,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 190693376. Throughput: 0: 55875.6. Samples: 140127140. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 22:51:23,923][47056] Avg episode reward: [(0, '0.063')] [2024-04-25 22:51:25,522][47288] Updated weights for policy 0, policy_version 11646 (0.0025) [2024-04-25 22:51:28,639][47288] Updated weights for policy 0, policy_version 11656 (0.0031) [2024-04-25 22:51:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 190988288. Throughput: 0: 55917.2. Samples: 140295740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 22:51:28,923][47056] Avg episode reward: [(0, '0.070')] [2024-04-25 22:51:31,506][47288] Updated weights for policy 0, policy_version 11666 (0.0026) [2024-04-25 22:51:33,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 55927.8). Total num frames: 191266816. Throughput: 0: 55893.4. Samples: 140629720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-25 22:51:33,923][47056] Avg episode reward: [(0, '0.042')] [2024-04-25 22:51:34,382][47288] Updated weights for policy 0, policy_version 11676 (0.0029) [2024-04-25 22:51:37,329][47288] Updated weights for policy 0, policy_version 11686 (0.0028) [2024-04-25 22:51:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 191561728. Throughput: 0: 55893.6. Samples: 140967100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-25 22:51:38,923][47056] Avg episode reward: [(0, '0.071')] [2024-04-25 22:51:40,217][47288] Updated weights for policy 0, policy_version 11696 (0.0033) [2024-04-25 22:51:43,025][47288] Updated weights for policy 0, policy_version 11706 (0.0033) [2024-04-25 22:51:43,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 191840256. Throughput: 0: 56147.7. Samples: 141140780. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-25 22:51:43,923][47056] Avg episode reward: [(0, '0.068')] [2024-04-25 22:51:46,072][47288] Updated weights for policy 0, policy_version 11716 (0.0026) [2024-04-25 22:51:48,682][47288] Updated weights for policy 0, policy_version 11726 (0.0028) [2024-04-25 22:51:48,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 192118784. Throughput: 0: 56171.2. Samples: 141478460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-25 22:51:48,923][47056] Avg episode reward: [(0, '0.059')] [2024-04-25 22:51:51,879][47288] Updated weights for policy 0, policy_version 11736 (0.0030) [2024-04-25 22:51:53,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 192380928. Throughput: 0: 56194.4. Samples: 141816020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:51:53,923][47056] Avg episode reward: [(0, '0.073')] [2024-04-25 22:51:54,593][47288] Updated weights for policy 0, policy_version 11746 (0.0031) [2024-04-25 22:51:57,816][47288] Updated weights for policy 0, policy_version 11756 (0.0030) [2024-04-25 22:51:58,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.9, 300 sec: 56038.9). Total num frames: 192675840. Throughput: 0: 56228.5. Samples: 141986520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 22:51:58,923][47056] Avg episode reward: [(0, '0.063')] [2024-04-25 22:52:00,389][47288] Updated weights for policy 0, policy_version 11766 (0.0029) [2024-04-25 22:52:03,695][47288] Updated weights for policy 0, policy_version 11776 (0.0030) [2024-04-25 22:52:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 192954368. Throughput: 0: 56247.0. Samples: 142324860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 22:52:03,923][47056] Avg episode reward: [(0, '0.057')] [2024-04-25 22:52:06,260][47288] Updated weights for policy 0, policy_version 11786 (0.0029) [2024-04-25 22:52:08,923][47056] Fps is (10 sec: 54065.5, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 193216512. Throughput: 0: 56251.2. Samples: 142658460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:52:08,924][47056] Avg episode reward: [(0, '0.081')] [2024-04-25 22:52:09,624][47288] Updated weights for policy 0, policy_version 11796 (0.0031) [2024-04-25 22:52:12,030][47267] Signal inference workers to stop experience collection... (2050 times) [2024-04-25 22:52:12,030][47267] Signal inference workers to resume experience collection... (2050 times) [2024-04-25 22:52:12,050][47288] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-04-25 22:52:12,050][47288] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-04-25 22:52:12,167][47288] Updated weights for policy 0, policy_version 11806 (0.0032) [2024-04-25 22:52:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 193511424. Throughput: 0: 56101.8. Samples: 142820320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:52:13,923][47056] Avg episode reward: [(0, '0.081')] [2024-04-25 22:52:15,467][47288] Updated weights for policy 0, policy_version 11816 (0.0028) [2024-04-25 22:52:18,420][47288] Updated weights for policy 0, policy_version 11826 (0.0032) [2024-04-25 22:52:18,923][47056] Fps is (10 sec: 57345.4, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 193789952. Throughput: 0: 56096.9. Samples: 143154080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-25 22:52:18,923][47056] Avg episode reward: [(0, '0.057')] [2024-04-25 22:52:21,311][47288] Updated weights for policy 0, policy_version 11836 (0.0030) [2024-04-25 22:52:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 194068480. Throughput: 0: 55850.4. Samples: 143480360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-25 22:52:23,923][47056] Avg episode reward: [(0, '0.071')] [2024-04-25 22:52:24,376][47288] Updated weights for policy 0, policy_version 11846 (0.0032) [2024-04-25 22:52:27,305][47288] Updated weights for policy 0, policy_version 11856 (0.0027) [2024-04-25 22:52:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 56038.9). Total num frames: 194330624. Throughput: 0: 55872.9. Samples: 143655060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-25 22:52:28,923][47056] Avg episode reward: [(0, '0.060')] [2024-04-25 22:52:30,392][47288] Updated weights for policy 0, policy_version 11866 (0.0033) [2024-04-25 22:52:33,146][47288] Updated weights for policy 0, policy_version 11876 (0.0026) [2024-04-25 22:52:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 194625536. Throughput: 0: 55821.7. Samples: 143990440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-25 22:52:33,923][47056] Avg episode reward: [(0, '0.070')] [2024-04-25 22:52:36,044][47288] Updated weights for policy 0, policy_version 11886 (0.0028) [2024-04-25 22:52:38,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 194871296. Throughput: 0: 55748.4. Samples: 144324700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:52:38,923][47056] Avg episode reward: [(0, '0.075')] [2024-04-25 22:52:39,094][47288] Updated weights for policy 0, policy_version 11896 (0.0033) [2024-04-25 22:52:41,814][47288] Updated weights for policy 0, policy_version 11906 (0.0032) [2024-04-25 22:52:43,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55159.2, 300 sec: 55816.7). Total num frames: 195149824. Throughput: 0: 55453.0. Samples: 144481920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:52:43,923][47056] Avg episode reward: [(0, '0.063')] [2024-04-25 22:52:44,944][47288] Updated weights for policy 0, policy_version 11916 (0.0029) [2024-04-25 22:52:47,792][47288] Updated weights for policy 0, policy_version 11926 (0.0030) [2024-04-25 22:52:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 195444736. Throughput: 0: 55358.7. Samples: 144816000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-25 22:52:48,923][47056] Avg episode reward: [(0, '0.063')] [2024-04-25 22:52:48,947][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000011930_195461120.pth... [2024-04-25 22:52:48,995][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000011110_182026240.pth [2024-04-25 22:52:50,759][47288] Updated weights for policy 0, policy_version 11936 (0.0033) [2024-04-25 22:52:53,710][47288] Updated weights for policy 0, policy_version 11946 (0.0028) [2024-04-25 22:52:53,923][47056] Fps is (10 sec: 58983.2, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 195739648. Throughput: 0: 55519.7. Samples: 145156840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-04-25 22:52:53,932][47056] Avg episode reward: [(0, '0.066')] [2024-04-25 22:52:56,653][47288] Updated weights for policy 0, policy_version 11956 (0.0028) [2024-04-25 22:52:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 196018176. 
Throughput: 0: 55633.7. Samples: 145323840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 26.0) [2024-04-25 22:52:58,923][47056] Avg episode reward: [(0, '0.085')] [2024-04-25 22:52:59,530][47288] Updated weights for policy 0, policy_version 11966 (0.0034) [2024-04-25 22:53:02,499][47288] Updated weights for policy 0, policy_version 11976 (0.0027) [2024-04-25 22:53:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 196296704. Throughput: 0: 55594.9. Samples: 145655860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-25 22:53:03,923][47056] Avg episode reward: [(0, '0.085')] [2024-04-25 22:53:05,411][47288] Updated weights for policy 0, policy_version 11986 (0.0030) [2024-04-25 22:53:08,269][47288] Updated weights for policy 0, policy_version 11996 (0.0028) [2024-04-25 22:53:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.9, 300 sec: 56038.9). Total num frames: 196575232. Throughput: 0: 55698.3. Samples: 145986780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-25 22:53:08,923][47056] Avg episode reward: [(0, '0.081')] [2024-04-25 22:53:11,423][47288] Updated weights for policy 0, policy_version 12006 (0.0031) [2024-04-25 22:53:13,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 196837376. Throughput: 0: 55654.2. Samples: 146159500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 22:53:13,923][47056] Avg episode reward: [(0, '0.070')] [2024-04-25 22:53:14,193][47288] Updated weights for policy 0, policy_version 12016 (0.0034) [2024-04-25 22:53:14,203][47267] Signal inference workers to stop experience collection... (2100 times) [2024-04-25 22:53:14,203][47267] Signal inference workers to resume experience collection... 
(2100 times) [2024-04-25 22:53:14,228][47288] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-04-25 22:53:14,228][47288] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-04-25 22:53:17,109][47288] Updated weights for policy 0, policy_version 12026 (0.0030) [2024-04-25 22:53:18,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 197115904. Throughput: 0: 55708.8. Samples: 146497340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 22:53:18,923][47056] Avg episode reward: [(0, '0.090')] [2024-04-25 22:53:18,931][47267] Saving new best policy, reward=0.090! [2024-04-25 22:53:20,010][47288] Updated weights for policy 0, policy_version 12036 (0.0030) [2024-04-25 22:53:22,876][47288] Updated weights for policy 0, policy_version 12046 (0.0035) [2024-04-25 22:53:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 197394432. Throughput: 0: 55740.9. Samples: 146833040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-25 22:53:23,923][47056] Avg episode reward: [(0, '0.086')] [2024-04-25 22:53:25,926][47288] Updated weights for policy 0, policy_version 12056 (0.0035) [2024-04-25 22:53:28,855][47288] Updated weights for policy 0, policy_version 12066 (0.0027) [2024-04-25 22:53:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 197689344. Throughput: 0: 55933.1. Samples: 146998900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-25 22:53:28,923][47056] Avg episode reward: [(0, '0.100')] [2024-04-25 22:53:28,933][47267] Saving new best policy, reward=0.100! [2024-04-25 22:53:31,824][47288] Updated weights for policy 0, policy_version 12076 (0.0027) [2024-04-25 22:53:33,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 197984256. Throughput: 0: 55832.9. Samples: 147328480. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:53:33,923][47056] Avg episode reward: [(0, '0.065')] [2024-04-25 22:53:34,768][47288] Updated weights for policy 0, policy_version 12086 (0.0028) [2024-04-25 22:53:37,562][47288] Updated weights for policy 0, policy_version 12096 (0.0032) [2024-04-25 22:53:38,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 198246400. Throughput: 0: 55682.6. Samples: 147662560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 22:53:38,923][47056] Avg episode reward: [(0, '0.066')] [2024-04-25 22:53:40,601][47288] Updated weights for policy 0, policy_version 12106 (0.0035) [2024-04-25 22:53:43,295][47288] Updated weights for policy 0, policy_version 12116 (0.0036) [2024-04-25 22:53:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56525.0, 300 sec: 56038.8). Total num frames: 198541312. Throughput: 0: 55885.0. Samples: 147838660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-25 22:53:43,923][47056] Avg episode reward: [(0, '0.087')] [2024-04-25 22:53:46,428][47288] Updated weights for policy 0, policy_version 12126 (0.0033) [2024-04-25 22:53:48,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 198803456. Throughput: 0: 55959.3. Samples: 148174020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-25 22:53:48,923][47056] Avg episode reward: [(0, '0.076')] [2024-04-25 22:53:49,249][47288] Updated weights for policy 0, policy_version 12136 (0.0026) [2024-04-25 22:53:52,255][47288] Updated weights for policy 0, policy_version 12146 (0.0027) [2024-04-25 22:53:53,923][47056] Fps is (10 sec: 52427.9, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 199065600. Throughput: 0: 56001.9. Samples: 148506880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-25 22:53:53,924][47056] Avg episode reward: [(0, '0.065')] [2024-04-25 22:53:55,092][47288] Updated weights for policy 0, policy_version 12156 (0.0029) [2024-04-25 22:53:58,090][47288] Updated weights for policy 0, policy_version 12166 (0.0029) [2024-04-25 22:53:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 199344128. Throughput: 0: 55761.7. Samples: 148668780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-25 22:53:58,923][47056] Avg episode reward: [(0, '0.074')] [2024-04-25 22:54:00,977][47288] Updated weights for policy 0, policy_version 12176 (0.0029) [2024-04-25 22:54:03,804][47288] Updated weights for policy 0, policy_version 12186 (0.0032) [2024-04-25 22:54:03,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 199655424. Throughput: 0: 55755.2. Samples: 149006320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 22:54:03,923][47056] Avg episode reward: [(0, '0.085')] [2024-04-25 22:54:06,875][47288] Updated weights for policy 0, policy_version 12196 (0.0035) [2024-04-25 22:54:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 199917568. Throughput: 0: 55792.5. Samples: 149343700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 22:54:08,923][47056] Avg episode reward: [(0, '0.073')] [2024-04-25 22:54:09,662][47288] Updated weights for policy 0, policy_version 12206 (0.0027) [2024-04-25 22:54:12,800][47288] Updated weights for policy 0, policy_version 12216 (0.0036) [2024-04-25 22:54:13,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 200212480. Throughput: 0: 55876.6. Samples: 149513340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 22:54:13,923][47056] Avg episode reward: [(0, '0.064')] [2024-04-25 22:54:15,494][47288] Updated weights for policy 0, policy_version 12226 (0.0027) [2024-04-25 22:54:18,542][47288] Updated weights for policy 0, policy_version 12236 (0.0030) [2024-04-25 22:54:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 200491008. Throughput: 0: 56079.8. Samples: 149852080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-25 22:54:18,923][47056] Avg episode reward: [(0, '0.077')] [2024-04-25 22:54:21,524][47288] Updated weights for policy 0, policy_version 12246 (0.0031) [2024-04-25 22:54:23,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 200753152. Throughput: 0: 56042.9. Samples: 150184480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-25 22:54:23,923][47056] Avg episode reward: [(0, '0.094')] [2024-04-25 22:54:24,359][47288] Updated weights for policy 0, policy_version 12256 (0.0028) [2024-04-25 22:54:27,262][47288] Updated weights for policy 0, policy_version 12266 (0.0027) [2024-04-25 22:54:28,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 201031680. Throughput: 0: 55841.9. Samples: 150351540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-25 22:54:28,923][47056] Avg episode reward: [(0, '0.105')] [2024-04-25 22:54:28,931][47267] Saving new best policy, reward=0.105! [2024-04-25 22:54:30,091][47267] Signal inference workers to stop experience collection... (2150 times) [2024-04-25 22:54:30,091][47267] Signal inference workers to resume experience collection... 
(2150 times) [2024-04-25 22:54:30,118][47288] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-04-25 22:54:30,123][47288] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-04-25 22:54:30,196][47288] Updated weights for policy 0, policy_version 12276 (0.0025) [2024-04-25 22:54:33,293][47288] Updated weights for policy 0, policy_version 12286 (0.0028) [2024-04-25 22:54:33,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 201310208. Throughput: 0: 55850.6. Samples: 150687300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-25 22:54:33,923][47056] Avg episode reward: [(0, '0.072')] [2024-04-25 22:54:36,096][47288] Updated weights for policy 0, policy_version 12296 (0.0030) [2024-04-25 22:54:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.9, 300 sec: 55872.2). Total num frames: 201605120. Throughput: 0: 55907.1. Samples: 151022680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-25 22:54:38,923][47056] Avg episode reward: [(0, '0.083')] [2024-04-25 22:54:38,941][47288] Updated weights for policy 0, policy_version 12306 (0.0036) [2024-04-25 22:54:41,989][47288] Updated weights for policy 0, policy_version 12316 (0.0035) [2024-04-25 22:54:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 201867264. Throughput: 0: 56144.5. Samples: 151195280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-25 22:54:43,923][47056] Avg episode reward: [(0, '0.103')] [2024-04-25 22:54:44,911][47288] Updated weights for policy 0, policy_version 12326 (0.0026) [2024-04-25 22:54:47,737][47288] Updated weights for policy 0, policy_version 12336 (0.0023) [2024-04-25 22:54:48,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 202178560. Throughput: 0: 56090.2. Samples: 151530380. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-25 22:54:48,923][47056] Avg episode reward: [(0, '0.070')] [2024-04-25 22:54:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000012340_202178560.pth... [2024-04-25 22:54:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000011521_188760064.pth [2024-04-25 22:54:50,823][47288] Updated weights for policy 0, policy_version 12346 (0.0024) [2024-04-25 22:54:53,676][47288] Updated weights for policy 0, policy_version 12356 (0.0028) [2024-04-25 22:54:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 202440704. Throughput: 0: 56004.4. Samples: 151863900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-25 22:54:53,923][47056] Avg episode reward: [(0, '0.067')] [2024-04-25 22:54:56,549][47288] Updated weights for policy 0, policy_version 12366 (0.0032) [2024-04-25 22:54:58,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 202702848. Throughput: 0: 55919.1. Samples: 152029700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 22:54:58,923][47056] Avg episode reward: [(0, '0.079')] [2024-04-25 22:54:59,537][47288] Updated weights for policy 0, policy_version 12376 (0.0027) [2024-04-25 22:55:02,685][47288] Updated weights for policy 0, policy_version 12386 (0.0031) [2024-04-25 22:55:03,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 202981376. Throughput: 0: 55857.1. Samples: 152365640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 22:55:03,923][47056] Avg episode reward: [(0, '0.089')] [2024-04-25 22:55:05,444][47288] Updated weights for policy 0, policy_version 12396 (0.0028) [2024-04-25 22:55:08,493][47288] Updated weights for policy 0, policy_version 12406 (0.0029) [2024-04-25 22:55:08,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 203259904. 
Throughput: 0: 56017.5. Samples: 152705280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-25 22:55:08,923][47056] Avg episode reward: [(0, '0.084')] [2024-04-25 22:55:11,230][47288] Updated weights for policy 0, policy_version 12416 (0.0034) [2024-04-25 22:55:13,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 203554816. Throughput: 0: 55977.2. Samples: 152870520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-25 22:55:13,923][47056] Avg episode reward: [(0, '0.076')] [2024-04-25 22:55:14,336][47288] Updated weights for policy 0, policy_version 12426 (0.0028) [2024-04-25 22:55:17,297][47288] Updated weights for policy 0, policy_version 12436 (0.0028) [2024-04-25 22:55:18,923][47056] Fps is (10 sec: 58983.5, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 203849728. Throughput: 0: 55792.5. Samples: 153197960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 25.0) [2024-04-25 22:55:18,923][47056] Avg episode reward: [(0, '0.091')] [2024-04-25 22:55:20,286][47288] Updated weights for policy 0, policy_version 12446 (0.0033) [2024-04-25 22:55:23,043][47288] Updated weights for policy 0, policy_version 12456 (0.0036) [2024-04-25 22:55:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55927.8). Total num frames: 204128256. Throughput: 0: 55825.5. Samples: 153534840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-25 22:55:23,923][47056] Avg episode reward: [(0, '0.095')] [2024-04-25 22:55:26,186][47288] Updated weights for policy 0, policy_version 12466 (0.0026) [2024-04-25 22:55:28,922][47288] Updated weights for policy 0, policy_version 12476 (0.0028) [2024-04-25 22:55:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 204406784. Throughput: 0: 55861.2. Samples: 153709040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-25 22:55:28,923][47056] Avg episode reward: [(0, '0.088')] [2024-04-25 22:55:31,861][47288] Updated weights for policy 0, policy_version 12486 (0.0032) [2024-04-25 22:55:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 204668928. Throughput: 0: 55929.8. Samples: 154047220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-25 22:55:33,923][47056] Avg episode reward: [(0, '0.093')] [2024-04-25 22:55:34,850][47288] Updated weights for policy 0, policy_version 12496 (0.0034) [2024-04-25 22:55:37,579][47288] Updated weights for policy 0, policy_version 12506 (0.0028) [2024-04-25 22:55:38,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 204931072. Throughput: 0: 55955.6. Samples: 154381900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-25 22:55:38,923][47056] Avg episode reward: [(0, '0.089')] [2024-04-25 22:55:40,024][47267] Signal inference workers to stop experience collection... (2200 times) [2024-04-25 22:55:40,030][47267] Signal inference workers to resume experience collection... (2200 times) [2024-04-25 22:55:40,054][47288] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-04-25 22:55:40,054][47288] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-04-25 22:55:40,888][47288] Updated weights for policy 0, policy_version 12516 (0.0026) [2024-04-25 22:55:43,687][47288] Updated weights for policy 0, policy_version 12526 (0.0037) [2024-04-25 22:55:43,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 205225984. Throughput: 0: 55905.8. Samples: 154545460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:55:43,923][47056] Avg episode reward: [(0, '0.080')] [2024-04-25 22:55:46,555][47288] Updated weights for policy 0, policy_version 12536 (0.0024) [2024-04-25 22:55:48,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 205520896. Throughput: 0: 55859.0. Samples: 154879300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:55:48,923][47056] Avg episode reward: [(0, '0.059')] [2024-04-25 22:55:49,433][47288] Updated weights for policy 0, policy_version 12546 (0.0026) [2024-04-25 22:55:52,343][47288] Updated weights for policy 0, policy_version 12556 (0.0028) [2024-04-25 22:55:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 205799424. Throughput: 0: 55834.1. Samples: 155217800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-25 22:55:53,923][47056] Avg episode reward: [(0, '0.092')] [2024-04-25 22:55:55,170][47288] Updated weights for policy 0, policy_version 12566 (0.0029) [2024-04-25 22:55:58,174][47288] Updated weights for policy 0, policy_version 12576 (0.0031) [2024-04-25 22:55:58,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 206077952. Throughput: 0: 55948.1. Samples: 155388180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-25 22:55:58,923][47056] Avg episode reward: [(0, '0.101')] [2024-04-25 22:56:01,041][47288] Updated weights for policy 0, policy_version 12586 (0.0031) [2024-04-25 22:56:03,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 206340096. Throughput: 0: 56145.2. Samples: 155724500. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:56:03,923][47056] Avg episode reward: [(0, '0.080')] [2024-04-25 22:56:04,168][47288] Updated weights for policy 0, policy_version 12596 (0.0026) [2024-04-25 22:56:07,132][47288] Updated weights for policy 0, policy_version 12606 (0.0029) [2024-04-25 22:56:08,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 206635008. Throughput: 0: 56045.5. Samples: 156056880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 22:56:08,923][47056] Avg episode reward: [(0, '0.098')] [2024-04-25 22:56:10,073][47288] Updated weights for policy 0, policy_version 12616 (0.0032) [2024-04-25 22:56:13,077][47288] Updated weights for policy 0, policy_version 12626 (0.0037) [2024-04-25 22:56:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 206913536. Throughput: 0: 55779.6. Samples: 156219120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-25 22:56:13,923][47056] Avg episode reward: [(0, '0.081')] [2024-04-25 22:56:15,985][47288] Updated weights for policy 0, policy_version 12636 (0.0032) [2024-04-25 22:56:18,742][47288] Updated weights for policy 0, policy_version 12646 (0.0027) [2024-04-25 22:56:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 207192064. Throughput: 0: 55793.5. Samples: 156557920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-25 22:56:18,923][47056] Avg episode reward: [(0, '0.088')] [2024-04-25 22:56:21,654][47288] Updated weights for policy 0, policy_version 12656 (0.0032) [2024-04-25 22:56:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 207486976. Throughput: 0: 55872.9. Samples: 156896180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-25 22:56:23,923][47056] Avg episode reward: [(0, '0.086')] [2024-04-25 22:56:24,509][47288] Updated weights for policy 0, policy_version 12666 (0.0029) [2024-04-25 22:56:27,523][47288] Updated weights for policy 0, policy_version 12676 (0.0029) [2024-04-25 22:56:28,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.9, 300 sec: 55983.3). Total num frames: 207781888. Throughput: 0: 56028.9. Samples: 157066760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-25 22:56:28,923][47056] Avg episode reward: [(0, '0.093')] [2024-04-25 22:56:30,793][47288] Updated weights for policy 0, policy_version 12686 (0.0030) [2024-04-25 22:56:33,321][47288] Updated weights for policy 0, policy_version 12696 (0.0031) [2024-04-25 22:56:33,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 208011264. Throughput: 0: 55955.2. Samples: 157397280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-25 22:56:33,923][47056] Avg episode reward: [(0, '0.077')] [2024-04-25 22:56:36,839][47288] Updated weights for policy 0, policy_version 12706 (0.0033) [2024-04-25 22:56:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 208322560. Throughput: 0: 55852.8. Samples: 157731180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-25 22:56:38,923][47056] Avg episode reward: [(0, '0.085')] [2024-04-25 22:56:39,090][47288] Updated weights for policy 0, policy_version 12716 (0.0029) [2024-04-25 22:56:42,621][47288] Updated weights for policy 0, policy_version 12726 (0.0032) [2024-04-25 22:56:42,675][47267] Signal inference workers to stop experience collection... (2250 times) [2024-04-25 22:56:42,696][47288] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-04-25 22:56:42,735][47267] Signal inference workers to resume experience collection... 
(2250 times) [2024-04-25 22:56:42,735][47288] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-04-25 22:56:43,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 208601088. Throughput: 0: 55982.6. Samples: 157907400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 22:56:43,923][47056] Avg episode reward: [(0, '0.086')] [2024-04-25 22:56:44,867][47288] Updated weights for policy 0, policy_version 12736 (0.0030) [2024-04-25 22:56:48,472][47288] Updated weights for policy 0, policy_version 12746 (0.0037) [2024-04-25 22:56:48,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 208879616. Throughput: 0: 55921.5. Samples: 158240960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 22:56:48,923][47056] Avg episode reward: [(0, '0.100')] [2024-04-25 22:56:49,039][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000012750_208896000.pth... [2024-04-25 22:56:49,091][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000011930_195461120.pth [2024-04-25 22:56:50,737][47288] Updated weights for policy 0, policy_version 12756 (0.0028) [2024-04-25 22:56:53,923][47056] Fps is (10 sec: 52427.9, 60 sec: 55432.3, 300 sec: 55761.1). Total num frames: 209125376. Throughput: 0: 56003.7. Samples: 158577060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 22:56:53,923][47056] Avg episode reward: [(0, '0.097')] [2024-04-25 22:56:54,332][47288] Updated weights for policy 0, policy_version 12766 (0.0028) [2024-04-25 22:56:56,915][47288] Updated weights for policy 0, policy_version 12776 (0.0032) [2024-04-25 22:56:58,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 209436672. Throughput: 0: 56128.8. Samples: 158744920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 22:56:58,923][47056] Avg episode reward: [(0, '0.085')] [2024-04-25 22:57:00,188][47288] Updated weights for policy 0, policy_version 12786 (0.0034) [2024-04-25 22:57:02,784][47288] Updated weights for policy 0, policy_version 12796 (0.0027) [2024-04-25 22:57:03,923][47056] Fps is (10 sec: 60621.8, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 209731584. Throughput: 0: 56111.5. Samples: 159082940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-25 22:57:03,923][47056] Avg episode reward: [(0, '0.083')] [2024-04-25 22:57:05,944][47288] Updated weights for policy 0, policy_version 12806 (0.0030) [2024-04-25 22:57:08,745][47288] Updated weights for policy 0, policy_version 12816 (0.0027) [2024-04-25 22:57:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 209993728. Throughput: 0: 56218.3. Samples: 159426000. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-25 22:57:08,923][47056] Avg episode reward: [(0, '0.095')] [2024-04-25 22:57:11,837][47288] Updated weights for policy 0, policy_version 12826 (0.0026) [2024-04-25 22:57:13,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 210272256. Throughput: 0: 56192.1. Samples: 159595420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-25 22:57:13,923][47056] Avg episode reward: [(0, '0.091')] [2024-04-25 22:57:14,468][47288] Updated weights for policy 0, policy_version 12836 (0.0027) [2024-04-25 22:57:17,617][47288] Updated weights for policy 0, policy_version 12846 (0.0032) [2024-04-25 22:57:18,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 210567168. Throughput: 0: 56267.1. Samples: 159929300. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-25 22:57:18,926][47056] Avg episode reward: [(0, '0.084')] [2024-04-25 22:57:20,321][47288] Updated weights for policy 0, policy_version 12856 (0.0029) [2024-04-25 22:57:23,339][47288] Updated weights for policy 0, policy_version 12866 (0.0029) [2024-04-25 22:57:23,923][47056] Fps is (10 sec: 57345.2, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 210845696. Throughput: 0: 56197.3. Samples: 160260060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:57:23,923][47056] Avg episode reward: [(0, '0.086')] [2024-04-25 22:57:26,168][47288] Updated weights for policy 0, policy_version 12876 (0.0032) [2024-04-25 22:57:28,877][47267] Signal inference workers to stop experience collection... (2300 times) [2024-04-25 22:57:28,897][47288] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-04-25 22:57:28,923][47056] Fps is (10 sec: 50789.9, 60 sec: 54886.2, 300 sec: 55761.1). Total num frames: 211075072. Throughput: 0: 55991.4. Samples: 160427020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 22:57:28,923][47056] Avg episode reward: [(0, '0.101')] [2024-04-25 22:57:28,937][47267] Signal inference workers to resume experience collection... (2300 times) [2024-04-25 22:57:28,938][47288] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-04-25 22:57:29,183][47288] Updated weights for policy 0, policy_version 12886 (0.0030) [2024-04-25 22:57:31,751][47288] Updated weights for policy 0, policy_version 12896 (0.0025) [2024-04-25 22:57:33,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 211386368. Throughput: 0: 56067.8. Samples: 160764020. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-25 22:57:33,923][47056] Avg episode reward: [(0, '0.090')] [2024-04-25 22:57:35,032][47288] Updated weights for policy 0, policy_version 12906 (0.0030) [2024-04-25 22:57:37,816][47288] Updated weights for policy 0, policy_version 12916 (0.0030) [2024-04-25 22:57:38,923][47056] Fps is (10 sec: 60621.9, 60 sec: 55978.7, 300 sec: 56038.9). Total num frames: 211681280. Throughput: 0: 56024.8. Samples: 161098160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-25 22:57:38,923][47056] Avg episode reward: [(0, '0.090')] [2024-04-25 22:57:40,807][47288] Updated weights for policy 0, policy_version 12926 (0.0029) [2024-04-25 22:57:43,543][47288] Updated weights for policy 0, policy_version 12936 (0.0029) [2024-04-25 22:57:43,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 211959808. Throughput: 0: 56044.6. Samples: 161266920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-25 22:57:43,923][47056] Avg episode reward: [(0, '0.094')] [2024-04-25 22:57:46,653][47288] Updated weights for policy 0, policy_version 12946 (0.0032) [2024-04-25 22:57:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 212238336. Throughput: 0: 56027.6. Samples: 161604180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 22:57:48,924][47056] Avg episode reward: [(0, '0.090')] [2024-04-25 22:57:49,364][47288] Updated weights for policy 0, policy_version 12956 (0.0035) [2024-04-25 22:57:52,418][47288] Updated weights for policy 0, policy_version 12966 (0.0030) [2024-04-25 22:57:53,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56798.0, 300 sec: 55983.3). Total num frames: 212533248. Throughput: 0: 55830.1. Samples: 161938360. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 22:57:53,923][47056] Avg episode reward: [(0, '0.091')] [2024-04-25 22:57:55,092][47288] Updated weights for policy 0, policy_version 12976 (0.0030) [2024-04-25 22:57:58,388][47288] Updated weights for policy 0, policy_version 12986 (0.0033) [2024-04-25 22:57:58,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 212811776. Throughput: 0: 55829.4. Samples: 162107740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 22:57:58,923][47056] Avg episode reward: [(0, '0.093')] [2024-04-25 22:58:00,948][47288] Updated weights for policy 0, policy_version 12996 (0.0026) [2024-04-25 22:58:03,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 213057536. Throughput: 0: 55950.3. Samples: 162447060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 22:58:03,923][47056] Avg episode reward: [(0, '0.077')] [2024-04-25 22:58:04,229][47288] Updated weights for policy 0, policy_version 13006 (0.0031) [2024-04-25 22:58:06,863][47288] Updated weights for policy 0, policy_version 13016 (0.0026) [2024-04-25 22:58:08,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 213336064. Throughput: 0: 56086.6. Samples: 162783960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:58:08,924][47056] Avg episode reward: [(0, '0.111')] [2024-04-25 22:58:08,936][47267] Saving new best policy, reward=0.111! [2024-04-25 22:58:10,059][47288] Updated weights for policy 0, policy_version 13026 (0.0026) [2024-04-25 22:58:12,544][47288] Updated weights for policy 0, policy_version 13036 (0.0025) [2024-04-25 22:58:13,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.9, 300 sec: 56038.9). Total num frames: 213647360. Throughput: 0: 56036.6. Samples: 162948660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:58:13,923][47056] Avg episode reward: [(0, '0.078')] [2024-04-25 22:58:15,886][47288] Updated weights for policy 0, policy_version 13046 (0.0031) [2024-04-25 22:58:18,577][47288] Updated weights for policy 0, policy_version 13056 (0.0034) [2024-04-25 22:58:18,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 213925888. Throughput: 0: 56057.4. Samples: 163286600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 22:58:18,923][47056] Avg episode reward: [(0, '0.076')] [2024-04-25 22:58:21,760][47288] Updated weights for policy 0, policy_version 13066 (0.0036) [2024-04-25 22:58:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 214204416. Throughput: 0: 55924.5. Samples: 163614760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 22:58:23,923][47056] Avg episode reward: [(0, '0.086')] [2024-04-25 22:58:24,567][47288] Updated weights for policy 0, policy_version 13076 (0.0028) [2024-04-25 22:58:27,258][47267] Signal inference workers to stop experience collection... (2350 times) [2024-04-25 22:58:27,311][47267] Signal inference workers to resume experience collection... (2350 times) [2024-04-25 22:58:27,311][47288] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-04-25 22:58:27,325][47288] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-04-25 22:58:27,420][47288] Updated weights for policy 0, policy_version 13086 (0.0029) [2024-04-25 22:58:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 57071.0, 300 sec: 55983.3). Total num frames: 214499328. Throughput: 0: 56170.1. Samples: 163794580. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:58:28,923][47056] Avg episode reward: [(0, '0.107')] [2024-04-25 22:58:30,462][47288] Updated weights for policy 0, policy_version 13096 (0.0030) [2024-04-25 22:58:33,273][47288] Updated weights for policy 0, policy_version 13106 (0.0029) [2024-04-25 22:58:33,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56524.8, 300 sec: 56038.9). Total num frames: 214777856. Throughput: 0: 56324.4. Samples: 164138780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:58:33,923][47056] Avg episode reward: [(0, '0.101')] [2024-04-25 22:58:36,174][47288] Updated weights for policy 0, policy_version 13116 (0.0030) [2024-04-25 22:58:38,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 215023616. Throughput: 0: 56318.3. Samples: 164472680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-25 22:58:38,923][47056] Avg episode reward: [(0, '0.090')] [2024-04-25 22:58:39,232][47288] Updated weights for policy 0, policy_version 13126 (0.0027) [2024-04-25 22:58:41,931][47288] Updated weights for policy 0, policy_version 13136 (0.0025) [2024-04-25 22:58:43,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 215302144. Throughput: 0: 56126.3. Samples: 164633420. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-25 22:58:43,923][47056] Avg episode reward: [(0, '0.093')] [2024-04-25 22:58:44,972][47288] Updated weights for policy 0, policy_version 13146 (0.0032) [2024-04-25 22:58:47,809][47288] Updated weights for policy 0, policy_version 13156 (0.0031) [2024-04-25 22:58:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 56038.9). Total num frames: 215597056. Throughput: 0: 56015.5. Samples: 164967760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-25 22:58:48,923][47056] Avg episode reward: [(0, '0.109')] [2024-04-25 22:58:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000013159_215597056.pth... [2024-04-25 22:58:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000012340_202178560.pth [2024-04-25 22:58:50,782][47288] Updated weights for policy 0, policy_version 13166 (0.0027) [2024-04-25 22:58:53,710][47288] Updated weights for policy 0, policy_version 13176 (0.0034) [2024-04-25 22:58:53,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 215891968. Throughput: 0: 55979.6. Samples: 165303040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-25 22:58:53,923][47056] Avg episode reward: [(0, '0.103')] [2024-04-25 22:58:56,542][47288] Updated weights for policy 0, policy_version 13186 (0.0028) [2024-04-25 22:58:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 216170496. Throughput: 0: 56076.8. Samples: 165472120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:58:58,923][47056] Avg episode reward: [(0, '0.074')] [2024-04-25 22:58:59,461][47288] Updated weights for policy 0, policy_version 13196 (0.0028) [2024-04-25 22:59:02,440][47288] Updated weights for policy 0, policy_version 13206 (0.0033) [2024-04-25 22:59:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.8, 300 sec: 56094.4). Total num frames: 216465408. Throughput: 0: 56054.2. Samples: 165809040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 22:59:03,923][47056] Avg episode reward: [(0, '0.075')] [2024-04-25 22:59:05,352][47288] Updated weights for policy 0, policy_version 13216 (0.0031) [2024-04-25 22:59:08,192][47288] Updated weights for policy 0, policy_version 13226 (0.0030) [2024-04-25 22:59:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 216727552. 
Throughput: 0: 56163.2. Samples: 166142120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 22:59:08,923][47056] Avg episode reward: [(0, '0.110')] [2024-04-25 22:59:11,134][47288] Updated weights for policy 0, policy_version 13236 (0.0032) [2024-04-25 22:59:13,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 216989696. Throughput: 0: 55883.1. Samples: 166309320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 22:59:13,923][47056] Avg episode reward: [(0, '0.127')] [2024-04-25 22:59:14,024][47267] Saving new best policy, reward=0.127! [2024-04-25 22:59:14,205][47288] Updated weights for policy 0, policy_version 13246 (0.0033) [2024-04-25 22:59:16,938][47288] Updated weights for policy 0, policy_version 13256 (0.0034) [2024-04-25 22:59:18,923][47056] Fps is (10 sec: 50791.4, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 217235456. Throughput: 0: 55682.8. Samples: 166644500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:59:18,923][47056] Avg episode reward: [(0, '0.105')] [2024-04-25 22:59:20,063][47288] Updated weights for policy 0, policy_version 13266 (0.0032) [2024-04-25 22:59:22,971][47288] Updated weights for policy 0, policy_version 13276 (0.0029) [2024-04-25 22:59:23,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.4, 300 sec: 55983.3). Total num frames: 217546752. Throughput: 0: 55700.7. Samples: 166979220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 22:59:23,923][47056] Avg episode reward: [(0, '0.086')] [2024-04-25 22:59:25,863][47288] Updated weights for policy 0, policy_version 13286 (0.0031) [2024-04-25 22:59:28,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 217825280. Throughput: 0: 55823.2. Samples: 167145460. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-25 22:59:28,923][47056] Avg episode reward: [(0, '0.091')] [2024-04-25 22:59:28,926][47288] Updated weights for policy 0, policy_version 13296 (0.0035) [2024-04-25 22:59:31,819][47288] Updated weights for policy 0, policy_version 13306 (0.0035) [2024-04-25 22:59:33,467][47267] Signal inference workers to stop experience collection... (2400 times) [2024-04-25 22:59:33,467][47267] Signal inference workers to resume experience collection... (2400 times) [2024-04-25 22:59:33,493][47288] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-04-25 22:59:33,494][47288] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-04-25 22:59:33,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 218136576. Throughput: 0: 55811.1. Samples: 167479260. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-25 22:59:33,923][47056] Avg episode reward: [(0, '0.114')] [2024-04-25 22:59:34,681][47288] Updated weights for policy 0, policy_version 13316 (0.0033) [2024-04-25 22:59:37,624][47288] Updated weights for policy 0, policy_version 13326 (0.0036) [2024-04-25 22:59:38,923][47056] Fps is (10 sec: 57342.8, 60 sec: 56251.6, 300 sec: 56038.8). Total num frames: 218398720. Throughput: 0: 55689.6. Samples: 167809080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-25 22:59:38,924][47056] Avg episode reward: [(0, '0.073')] [2024-04-25 22:59:40,485][47288] Updated weights for policy 0, policy_version 13336 (0.0028) [2024-04-25 22:59:43,503][47288] Updated weights for policy 0, policy_version 13346 (0.0030) [2024-04-25 22:59:43,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 218660864. Throughput: 0: 55754.4. Samples: 167981060. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-25 22:59:43,923][47056] Avg episode reward: [(0, '0.121')] [2024-04-25 22:59:46,834][47288] Updated weights for policy 0, policy_version 13356 (0.0034) [2024-04-25 22:59:48,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 218939392. Throughput: 0: 55671.5. Samples: 168314260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:59:48,923][47056] Avg episode reward: [(0, '0.085')] [2024-04-25 22:59:49,578][47288] Updated weights for policy 0, policy_version 13366 (0.0028) [2024-04-25 22:59:52,537][47288] Updated weights for policy 0, policy_version 13376 (0.0027) [2024-04-25 22:59:53,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55927.7). Total num frames: 219201536. Throughput: 0: 55783.7. Samples: 168652380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 22:59:53,923][47056] Avg episode reward: [(0, '0.100')] [2024-04-25 22:59:55,537][47288] Updated weights for policy 0, policy_version 13386 (0.0029) [2024-04-25 22:59:58,402][47288] Updated weights for policy 0, policy_version 13396 (0.0027) [2024-04-25 22:59:58,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55983.3). Total num frames: 219496448. Throughput: 0: 55504.4. Samples: 168807020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 22:59:58,923][47056] Avg episode reward: [(0, '0.115')] [2024-04-25 23:00:01,468][47288] Updated weights for policy 0, policy_version 13406 (0.0031) [2024-04-25 23:00:03,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55432.5, 300 sec: 56038.9). Total num frames: 219791360. Throughput: 0: 55461.2. Samples: 169140260. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:00:03,923][47056] Avg episode reward: [(0, '0.117')] [2024-04-25 23:00:04,286][47288] Updated weights for policy 0, policy_version 13416 (0.0030) [2024-04-25 23:00:07,219][47288] Updated weights for policy 0, policy_version 13426 (0.0029) [2024-04-25 23:00:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 220069888. Throughput: 0: 55492.1. Samples: 169476360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:00:08,923][47056] Avg episode reward: [(0, '0.100')] [2024-04-25 23:00:10,408][47288] Updated weights for policy 0, policy_version 13436 (0.0029) [2024-04-25 23:00:13,060][47288] Updated weights for policy 0, policy_version 13446 (0.0033) [2024-04-25 23:00:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 220364800. Throughput: 0: 55868.3. Samples: 169659540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:00:13,923][47056] Avg episode reward: [(0, '0.090')] [2024-04-25 23:00:16,157][47288] Updated weights for policy 0, policy_version 13456 (0.0030) [2024-04-25 23:00:18,822][47288] Updated weights for policy 0, policy_version 13466 (0.0027) [2024-04-25 23:00:18,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56524.5, 300 sec: 55927.7). Total num frames: 220626944. Throughput: 0: 55878.4. Samples: 169993800. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-25 23:00:18,923][47056] Avg episode reward: [(0, '0.107')] [2024-04-25 23:00:22,131][47288] Updated weights for policy 0, policy_version 13476 (0.0032) [2024-04-25 23:00:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 220905472. Throughput: 0: 55843.7. Samples: 170322040. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-25 23:00:23,923][47056] Avg episode reward: [(0, '0.105')] [2024-04-25 23:00:24,478][47288] Updated weights for policy 0, policy_version 13486 (0.0033) [2024-04-25 23:00:28,098][47288] Updated weights for policy 0, policy_version 13496 (0.0034) [2024-04-25 23:00:28,923][47056] Fps is (10 sec: 52430.0, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 221151232. Throughput: 0: 55612.8. Samples: 170483640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-25 23:00:28,923][47056] Avg episode reward: [(0, '0.109')] [2024-04-25 23:00:30,558][47288] Updated weights for policy 0, policy_version 13506 (0.0030) [2024-04-25 23:00:33,909][47288] Updated weights for policy 0, policy_version 13516 (0.0031) [2024-04-25 23:00:33,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55159.5, 300 sec: 55983.3). Total num frames: 221446144. Throughput: 0: 55701.9. Samples: 170820840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-25 23:00:33,923][47056] Avg episode reward: [(0, '0.104')] [2024-04-25 23:00:36,236][47288] Updated weights for policy 0, policy_version 13526 (0.0031) [2024-04-25 23:00:38,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 221741056. Throughput: 0: 55718.7. Samples: 171159720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-04-25 23:00:38,923][47056] Avg episode reward: [(0, '0.107')] [2024-04-25 23:00:39,909][47288] Updated weights for policy 0, policy_version 13536 (0.0026) [2024-04-25 23:00:40,854][47267] Signal inference workers to stop experience collection... (2450 times) [2024-04-25 23:00:40,855][47267] Signal inference workers to resume experience collection... 
(2450 times) [2024-04-25 23:00:40,881][47288] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-04-25 23:00:40,881][47288] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-04-25 23:00:42,128][47288] Updated weights for policy 0, policy_version 13546 (0.0033) [2024-04-25 23:00:43,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 222019584. Throughput: 0: 55945.4. Samples: 171324560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 26.0) [2024-04-25 23:00:43,923][47056] Avg episode reward: [(0, '0.100')] [2024-04-25 23:00:45,955][47288] Updated weights for policy 0, policy_version 13556 (0.0032) [2024-04-25 23:00:48,056][47288] Updated weights for policy 0, policy_version 13566 (0.0031) [2024-04-25 23:00:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 222314496. Throughput: 0: 55888.1. Samples: 171655220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-25 23:00:48,923][47056] Avg episode reward: [(0, '0.072')] [2024-04-25 23:00:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000013569_222314496.pth... [2024-04-25 23:00:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000012750_208896000.pth [2024-04-25 23:00:51,642][47288] Updated weights for policy 0, policy_version 13576 (0.0026) [2024-04-25 23:00:53,823][47288] Updated weights for policy 0, policy_version 13586 (0.0026) [2024-04-25 23:00:53,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 222593024. Throughput: 0: 56040.9. Samples: 171998200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-25 23:00:53,923][47056] Avg episode reward: [(0, '0.123')] [2024-04-25 23:00:57,611][47288] Updated weights for policy 0, policy_version 13596 (0.0030) [2024-04-25 23:00:58,923][47056] Fps is (10 sec: 57342.8, 60 sec: 56524.7, 300 sec: 56094.4). 
Total num frames: 222887936. Throughput: 0: 55850.5. Samples: 172172820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 23:00:58,923][47056] Avg episode reward: [(0, '0.109')] [2024-04-25 23:00:59,735][47288] Updated weights for policy 0, policy_version 13606 (0.0033) [2024-04-25 23:01:03,351][47288] Updated weights for policy 0, policy_version 13616 (0.0029) [2024-04-25 23:01:03,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 223117312. Throughput: 0: 55802.9. Samples: 172504920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 23:01:03,923][47056] Avg episode reward: [(0, '0.090')] [2024-04-25 23:01:05,653][47288] Updated weights for policy 0, policy_version 13626 (0.0027) [2024-04-25 23:01:08,923][47056] Fps is (10 sec: 49152.6, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 223379456. Throughput: 0: 55912.0. Samples: 172838080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 23:01:08,923][47056] Avg episode reward: [(0, '0.117')] [2024-04-25 23:01:09,314][47288] Updated weights for policy 0, policy_version 13636 (0.0033) [2024-04-25 23:01:11,451][47288] Updated weights for policy 0, policy_version 13646 (0.0027) [2024-04-25 23:01:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 223674368. Throughput: 0: 55931.0. Samples: 173000540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-25 23:01:13,923][47056] Avg episode reward: [(0, '0.101')] [2024-04-25 23:01:15,224][47288] Updated weights for policy 0, policy_version 13656 (0.0027) [2024-04-25 23:01:17,186][47288] Updated weights for policy 0, policy_version 13666 (0.0028) [2024-04-25 23:01:18,922][47056] Fps is (10 sec: 58983.5, 60 sec: 55705.9, 300 sec: 55872.3). Total num frames: 223969280. Throughput: 0: 55838.3. Samples: 173333560. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-25 23:01:18,923][47056] Avg episode reward: [(0, '0.117')] [2024-04-25 23:01:21,060][47288] Updated weights for policy 0, policy_version 13676 (0.0032) [2024-04-25 23:01:23,249][47288] Updated weights for policy 0, policy_version 13686 (0.0032) [2024-04-25 23:01:23,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 224247808. Throughput: 0: 55679.1. Samples: 173665280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:01:23,923][47056] Avg episode reward: [(0, '0.109')] [2024-04-25 23:01:26,994][47288] Updated weights for policy 0, policy_version 13696 (0.0033) [2024-04-25 23:01:28,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 224526336. Throughput: 0: 56056.9. Samples: 173847120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:01:28,923][47056] Avg episode reward: [(0, '0.140')] [2024-04-25 23:01:28,950][47267] Saving new best policy, reward=0.140! [2024-04-25 23:01:29,225][47288] Updated weights for policy 0, policy_version 13706 (0.0031) [2024-04-25 23:01:32,830][47288] Updated weights for policy 0, policy_version 13716 (0.0031) [2024-04-25 23:01:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.5, 300 sec: 55927.7). Total num frames: 224821248. Throughput: 0: 56104.7. Samples: 174179940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:01:33,923][47056] Avg episode reward: [(0, '0.126')] [2024-04-25 23:01:35,031][47288] Updated weights for policy 0, policy_version 13726 (0.0035) [2024-04-25 23:01:38,462][47267] Signal inference workers to stop experience collection... (2500 times) [2024-04-25 23:01:38,483][47288] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-04-25 23:01:38,555][47267] Signal inference workers to resume experience collection... 
(2500 times) [2024-04-25 23:01:38,555][47288] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-04-25 23:01:38,661][47288] Updated weights for policy 0, policy_version 13736 (0.0027) [2024-04-25 23:01:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 225067008. Throughput: 0: 55902.7. Samples: 174513820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:01:38,923][47056] Avg episode reward: [(0, '0.109')] [2024-04-25 23:01:41,020][47288] Updated weights for policy 0, policy_version 13746 (0.0034) [2024-04-25 23:01:43,923][47056] Fps is (10 sec: 49152.7, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 225312768. Throughput: 0: 55513.1. Samples: 174670900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-25 23:01:43,923][47056] Avg episode reward: [(0, '0.114')] [2024-04-25 23:01:44,529][47288] Updated weights for policy 0, policy_version 13756 (0.0032) [2024-04-25 23:01:47,009][47288] Updated weights for policy 0, policy_version 13766 (0.0025) [2024-04-25 23:01:48,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55927.8). Total num frames: 225624064. Throughput: 0: 55578.4. Samples: 175005940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-25 23:01:48,923][47056] Avg episode reward: [(0, '0.114')] [2024-04-25 23:01:50,345][47288] Updated weights for policy 0, policy_version 13776 (0.0029) [2024-04-25 23:01:52,766][47288] Updated weights for policy 0, policy_version 13786 (0.0029) [2024-04-25 23:01:53,923][47056] Fps is (10 sec: 60621.1, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 225918976. Throughput: 0: 55588.1. Samples: 175339540. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-04-25 23:01:53,923][47056] Avg episode reward: [(0, '0.108')] [2024-04-25 23:01:56,175][47288] Updated weights for policy 0, policy_version 13796 (0.0029) [2024-04-25 23:01:58,572][47288] Updated weights for policy 0, policy_version 13806 (0.0033) [2024-04-25 23:01:58,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 226197504. Throughput: 0: 55614.7. Samples: 175503200. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-04-25 23:01:58,923][47056] Avg episode reward: [(0, '0.113')] [2024-04-25 23:02:02,163][47288] Updated weights for policy 0, policy_version 13816 (0.0034) [2024-04-25 23:02:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.8, 300 sec: 55927.7). Total num frames: 226492416. Throughput: 0: 55571.4. Samples: 175834280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:02:03,923][47056] Avg episode reward: [(0, '0.113')] [2024-04-25 23:02:05,016][47288] Updated weights for policy 0, policy_version 13826 (0.0033) [2024-04-25 23:02:08,051][47288] Updated weights for policy 0, policy_version 13836 (0.0040) [2024-04-25 23:02:08,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56251.9, 300 sec: 55872.3). Total num frames: 226754560. Throughput: 0: 55693.0. Samples: 176171460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:02:08,923][47056] Avg episode reward: [(0, '0.101')] [2024-04-25 23:02:10,766][47288] Updated weights for policy 0, policy_version 13846 (0.0027) [2024-04-25 23:02:13,826][47288] Updated weights for policy 0, policy_version 13856 (0.0029) [2024-04-25 23:02:13,923][47056] Fps is (10 sec: 52427.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 227016704. Throughput: 0: 55304.2. Samples: 176335820. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:02:13,924][47056] Avg episode reward: [(0, '0.113')] [2024-04-25 23:02:16,450][47288] Updated weights for policy 0, policy_version 13866 (0.0031) [2024-04-25 23:02:18,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 227278848. Throughput: 0: 55381.6. Samples: 176672100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:02:18,923][47056] Avg episode reward: [(0, '0.104')] [2024-04-25 23:02:19,565][47288] Updated weights for policy 0, policy_version 13876 (0.0029) [2024-04-25 23:02:22,495][47288] Updated weights for policy 0, policy_version 13886 (0.0033) [2024-04-25 23:02:23,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55159.4, 300 sec: 55872.2). Total num frames: 227557376. Throughput: 0: 55418.6. Samples: 177007660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-25 23:02:23,923][47056] Avg episode reward: [(0, '0.113')] [2024-04-25 23:02:25,364][47288] Updated weights for policy 0, policy_version 13896 (0.0032) [2024-04-25 23:02:28,371][47288] Updated weights for policy 0, policy_version 13906 (0.0032) [2024-04-25 23:02:28,923][47056] Fps is (10 sec: 57342.8, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 227852288. Throughput: 0: 55428.7. Samples: 177165200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-25 23:02:28,923][47056] Avg episode reward: [(0, '0.115')] [2024-04-25 23:02:31,338][47288] Updated weights for policy 0, policy_version 13916 (0.0028) [2024-04-25 23:02:33,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 228147200. Throughput: 0: 55572.3. Samples: 177506700. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-04-25 23:02:33,923][47056] Avg episode reward: [(0, '0.102')] [2024-04-25 23:02:34,034][47288] Updated weights for policy 0, policy_version 13926 (0.0029) [2024-04-25 23:02:37,120][47288] Updated weights for policy 0, policy_version 13936 (0.0042) [2024-04-25 23:02:37,710][47267] Signal inference workers to stop experience collection... (2550 times) [2024-04-25 23:02:37,711][47267] Signal inference workers to resume experience collection... (2550 times) [2024-04-25 23:02:37,741][47288] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-04-25 23:02:37,741][47288] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-04-25 23:02:38,923][47056] Fps is (10 sec: 60621.5, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 228458496. Throughput: 0: 55617.3. Samples: 177842320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-04-25 23:02:38,923][47056] Avg episode reward: [(0, '0.117')] [2024-04-25 23:02:39,752][47288] Updated weights for policy 0, policy_version 13946 (0.0036) [2024-04-25 23:02:42,957][47288] Updated weights for policy 0, policy_version 13956 (0.0029) [2024-04-25 23:02:43,923][47056] Fps is (10 sec: 58981.8, 60 sec: 57070.8, 300 sec: 55927.7). Total num frames: 228737024. Throughput: 0: 55986.1. Samples: 178022580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 23:02:43,923][47056] Avg episode reward: [(0, '0.132')] [2024-04-25 23:02:45,589][47288] Updated weights for policy 0, policy_version 13966 (0.0030) [2024-04-25 23:02:48,759][47288] Updated weights for policy 0, policy_version 13976 (0.0029) [2024-04-25 23:02:48,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 228982784. Throughput: 0: 56085.2. Samples: 178358120. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 23:02:48,923][47056] Avg episode reward: [(0, '0.143')] [2024-04-25 23:02:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000013976_228982784.pth... [2024-04-25 23:02:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000013159_215597056.pth [2024-04-25 23:02:48,987][47267] Saving new best policy, reward=0.143! [2024-04-25 23:02:51,576][47288] Updated weights for policy 0, policy_version 13986 (0.0031) [2024-04-25 23:02:53,923][47056] Fps is (10 sec: 50791.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 229244928. Throughput: 0: 55978.5. Samples: 178690500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:02:53,923][47056] Avg episode reward: [(0, '0.121')] [2024-04-25 23:02:54,676][47288] Updated weights for policy 0, policy_version 13996 (0.0032) [2024-04-25 23:02:57,528][47288] Updated weights for policy 0, policy_version 14006 (0.0033) [2024-04-25 23:02:58,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 229539840. Throughput: 0: 55966.1. Samples: 178854280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:02:58,923][47056] Avg episode reward: [(0, '0.089')] [2024-04-25 23:03:00,433][47288] Updated weights for policy 0, policy_version 14016 (0.0027) [2024-04-25 23:03:03,530][47288] Updated weights for policy 0, policy_version 14026 (0.0024) [2024-04-25 23:03:03,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55159.3, 300 sec: 55816.7). Total num frames: 229801984. Throughput: 0: 55974.8. Samples: 179190980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 23:03:03,923][47056] Avg episode reward: [(0, '0.115')] [2024-04-25 23:03:06,144][47288] Updated weights for policy 0, policy_version 14036 (0.0028) [2024-04-25 23:03:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 230096896. 
Throughput: 0: 55849.8. Samples: 179520900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 23:03:08,932][47056] Avg episode reward: [(0, '0.103')] [2024-04-25 23:03:09,768][47288] Updated weights for policy 0, policy_version 14046 (0.0027) [2024-04-25 23:03:12,040][47288] Updated weights for policy 0, policy_version 14056 (0.0026) [2024-04-25 23:03:13,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 230391808. Throughput: 0: 56283.2. Samples: 179697940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-25 23:03:13,923][47056] Avg episode reward: [(0, '0.127')] [2024-04-25 23:03:15,478][47288] Updated weights for policy 0, policy_version 14066 (0.0031) [2024-04-25 23:03:17,940][47288] Updated weights for policy 0, policy_version 14076 (0.0026) [2024-04-25 23:03:18,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.8, 300 sec: 55872.2). Total num frames: 230686720. Throughput: 0: 56031.1. Samples: 180028100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-25 23:03:18,923][47056] Avg episode reward: [(0, '0.111')] [2024-04-25 23:03:21,557][47288] Updated weights for policy 0, policy_version 14086 (0.0027) [2024-04-25 23:03:23,811][47288] Updated weights for policy 0, policy_version 14096 (0.0033) [2024-04-25 23:03:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 230948864. Throughput: 0: 56024.8. Samples: 180363440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-25 23:03:23,923][47056] Avg episode reward: [(0, '0.122')] [2024-04-25 23:03:27,511][47288] Updated weights for policy 0, policy_version 14106 (0.0038) [2024-04-25 23:03:28,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 231211008. Throughput: 0: 55741.4. Samples: 180530940. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-25 23:03:28,923][47056] Avg episode reward: [(0, '0.100')] [2024-04-25 23:03:29,750][47288] Updated weights for policy 0, policy_version 14116 (0.0029) [2024-04-25 23:03:33,259][47288] Updated weights for policy 0, policy_version 14126 (0.0032) [2024-04-25 23:03:33,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 231489536. Throughput: 0: 55755.8. Samples: 180867120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-25 23:03:33,923][47056] Avg episode reward: [(0, '0.111')] [2024-04-25 23:03:35,733][47288] Updated weights for policy 0, policy_version 14136 (0.0026) [2024-04-25 23:03:38,923][47056] Fps is (10 sec: 52428.9, 60 sec: 54613.3, 300 sec: 55705.6). Total num frames: 231735296. Throughput: 0: 55776.4. Samples: 181200440. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-25 23:03:38,923][47056] Avg episode reward: [(0, '0.107')] [2024-04-25 23:03:39,205][47288] Updated weights for policy 0, policy_version 14146 (0.0030) [2024-04-25 23:03:41,676][47288] Updated weights for policy 0, policy_version 14156 (0.0032) [2024-04-25 23:03:43,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 232046592. Throughput: 0: 55734.2. Samples: 181362320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-25 23:03:43,923][47056] Avg episode reward: [(0, '0.139')] [2024-04-25 23:03:45,164][47288] Updated weights for policy 0, policy_version 14166 (0.0029) [2024-04-25 23:03:47,440][47288] Updated weights for policy 0, policy_version 14176 (0.0025) [2024-04-25 23:03:48,923][47056] Fps is (10 sec: 60621.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 232341504. Throughput: 0: 55626.9. Samples: 181694180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 23:03:48,923][47056] Avg episode reward: [(0, '0.123')] [2024-04-25 23:03:51,032][47288] Updated weights for policy 0, policy_version 14186 (0.0030) [2024-04-25 23:03:53,168][47288] Updated weights for policy 0, policy_version 14196 (0.0029) [2024-04-25 23:03:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 232620032. Throughput: 0: 55720.9. Samples: 182028340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 23:03:53,924][47056] Avg episode reward: [(0, '0.110')] [2024-04-25 23:03:56,911][47288] Updated weights for policy 0, policy_version 14206 (0.0030) [2024-04-25 23:03:57,040][47267] Signal inference workers to stop experience collection... (2600 times) [2024-04-25 23:03:57,046][47267] Signal inference workers to resume experience collection... (2600 times) [2024-04-25 23:03:57,067][47288] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-04-25 23:03:57,068][47288] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-04-25 23:03:58,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 232882176. Throughput: 0: 55688.2. Samples: 182203900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:03:58,923][47056] Avg episode reward: [(0, '0.114')] [2024-04-25 23:03:59,266][47288] Updated weights for policy 0, policy_version 14216 (0.0027) [2024-04-25 23:04:02,651][47288] Updated weights for policy 0, policy_version 14226 (0.0026) [2024-04-25 23:04:03,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55979.0, 300 sec: 55705.7). Total num frames: 233160704. Throughput: 0: 55790.0. Samples: 182538640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:04:03,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:04:05,113][47288] Updated weights for policy 0, policy_version 14236 (0.0032) [2024-04-25 23:04:08,543][47288] Updated weights for policy 0, policy_version 14246 (0.0032) [2024-04-25 23:04:08,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 233422848. Throughput: 0: 55657.7. Samples: 182868040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:04:08,923][47056] Avg episode reward: [(0, '0.090')] [2024-04-25 23:04:11,113][47288] Updated weights for policy 0, policy_version 14256 (0.0025) [2024-04-25 23:04:13,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 233701376. Throughput: 0: 55444.6. Samples: 183025940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:04:13,923][47056] Avg episode reward: [(0, '0.132')] [2024-04-25 23:04:14,424][47288] Updated weights for policy 0, policy_version 14266 (0.0033) [2024-04-25 23:04:16,933][47288] Updated weights for policy 0, policy_version 14276 (0.0037) [2024-04-25 23:04:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55159.4, 300 sec: 55761.2). Total num frames: 233996288. Throughput: 0: 55422.0. Samples: 183361120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:04:18,923][47056] Avg episode reward: [(0, '0.143')] [2024-04-25 23:04:20,148][47288] Updated weights for policy 0, policy_version 14286 (0.0036) [2024-04-25 23:04:22,745][47288] Updated weights for policy 0, policy_version 14296 (0.0027) [2024-04-25 23:04:23,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 234274816. Throughput: 0: 55506.2. Samples: 183698220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:04:23,923][47056] Avg episode reward: [(0, '0.097')] [2024-04-25 23:04:26,104][47288] Updated weights for policy 0, policy_version 14306 (0.0027) [2024-04-25 23:04:28,616][47288] Updated weights for policy 0, policy_version 14316 (0.0030) [2024-04-25 23:04:28,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 234569728. Throughput: 0: 55664.7. Samples: 183867240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 23:04:28,924][47056] Avg episode reward: [(0, '0.098')] [2024-04-25 23:04:31,888][47288] Updated weights for policy 0, policy_version 14326 (0.0028) [2024-04-25 23:04:33,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.5, 300 sec: 55761.2). Total num frames: 234848256. Throughput: 0: 55740.4. Samples: 184202500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 23:04:33,924][47056] Avg episode reward: [(0, '0.101')] [2024-04-25 23:04:34,498][47288] Updated weights for policy 0, policy_version 14336 (0.0032) [2024-04-25 23:04:37,926][47288] Updated weights for policy 0, policy_version 14346 (0.0037) [2024-04-25 23:04:38,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 235110400. Throughput: 0: 55698.3. Samples: 184534760. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-25 23:04:38,923][47056] Avg episode reward: [(0, '0.139')] [2024-04-25 23:04:40,254][47288] Updated weights for policy 0, policy_version 14356 (0.0028) [2024-04-25 23:04:43,888][47288] Updated weights for policy 0, policy_version 14366 (0.0038) [2024-04-25 23:04:43,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 235372544. Throughput: 0: 55408.5. Samples: 184697280. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-25 23:04:43,923][47056] Avg episode reward: [(0, '0.100')] [2024-04-25 23:04:46,262][47288] Updated weights for policy 0, policy_version 14376 (0.0029) [2024-04-25 23:04:48,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 235651072. Throughput: 0: 55331.3. Samples: 185028560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-25 23:04:48,923][47056] Avg episode reward: [(0, '0.126')] [2024-04-25 23:04:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000014383_235651072.pth... [2024-04-25 23:04:48,994][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000013569_222314496.pth [2024-04-25 23:04:49,686][47288] Updated weights for policy 0, policy_version 14386 (0.0029) [2024-04-25 23:04:52,444][47288] Updated weights for policy 0, policy_version 14396 (0.0036) [2024-04-25 23:04:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 235945984. Throughput: 0: 55375.7. Samples: 185359940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-25 23:04:53,923][47056] Avg episode reward: [(0, '0.113')] [2024-04-25 23:04:55,514][47288] Updated weights for policy 0, policy_version 14406 (0.0030) [2024-04-25 23:04:58,256][47288] Updated weights for policy 0, policy_version 14416 (0.0028) [2024-04-25 23:04:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 236208128. Throughput: 0: 55586.6. Samples: 185527340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-25 23:04:58,923][47056] Avg episode reward: [(0, '0.108')] [2024-04-25 23:05:01,386][47267] Signal inference workers to stop experience collection... (2650 times) [2024-04-25 23:05:01,423][47288] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-04-25 23:05:01,444][47267] Signal inference workers to resume experience collection... 
(2650 times) [2024-04-25 23:05:01,445][47288] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-04-25 23:05:01,552][47288] Updated weights for policy 0, policy_version 14426 (0.0031) [2024-04-25 23:05:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 236503040. Throughput: 0: 55422.3. Samples: 185855120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-25 23:05:03,932][47056] Avg episode reward: [(0, '0.104')] [2024-04-25 23:05:04,272][47288] Updated weights for policy 0, policy_version 14436 (0.0032) [2024-04-25 23:05:07,461][47288] Updated weights for policy 0, policy_version 14446 (0.0033) [2024-04-25 23:05:08,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 236781568. Throughput: 0: 55349.1. Samples: 186188920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-25 23:05:08,923][47056] Avg episode reward: [(0, '0.104')] [2024-04-25 23:05:10,280][47288] Updated weights for policy 0, policy_version 14456 (0.0032) [2024-04-25 23:05:13,357][47288] Updated weights for policy 0, policy_version 14466 (0.0025) [2024-04-25 23:05:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 237076480. Throughput: 0: 55380.1. Samples: 186359340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 23:05:13,923][47056] Avg episode reward: [(0, '0.098')] [2024-04-25 23:05:16,001][47288] Updated weights for policy 0, policy_version 14476 (0.0031) [2024-04-25 23:05:18,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 237305856. Throughput: 0: 55327.7. Samples: 186692240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 23:05:18,923][47056] Avg episode reward: [(0, '0.106')] [2024-04-25 23:05:19,271][47288] Updated weights for policy 0, policy_version 14486 (0.0028) [2024-04-25 23:05:21,997][47288] Updated weights for policy 0, policy_version 14496 (0.0035) [2024-04-25 23:05:23,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 237600768. Throughput: 0: 55408.0. Samples: 187028120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 29.0) [2024-04-25 23:05:23,923][47056] Avg episode reward: [(0, '0.120')] [2024-04-25 23:05:25,089][47288] Updated weights for policy 0, policy_version 14506 (0.0032) [2024-04-25 23:05:27,618][47288] Updated weights for policy 0, policy_version 14516 (0.0032) [2024-04-25 23:05:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 237879296. Throughput: 0: 55575.1. Samples: 187198160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 29.0) [2024-04-25 23:05:28,923][47056] Avg episode reward: [(0, '0.119')] [2024-04-25 23:05:30,999][47288] Updated weights for policy 0, policy_version 14526 (0.0028) [2024-04-25 23:05:33,727][47288] Updated weights for policy 0, policy_version 14536 (0.0032) [2024-04-25 23:05:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 238174208. Throughput: 0: 55592.5. Samples: 187530220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-25 23:05:33,924][47056] Avg episode reward: [(0, '0.141')] [2024-04-25 23:05:36,766][47288] Updated weights for policy 0, policy_version 14546 (0.0034) [2024-04-25 23:05:38,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 238436352. Throughput: 0: 55676.2. Samples: 187865380. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-25 23:05:38,923][47056] Avg episode reward: [(0, '0.104')] [2024-04-25 23:05:39,709][47288] Updated weights for policy 0, policy_version 14556 (0.0031) [2024-04-25 23:05:42,581][47288] Updated weights for policy 0, policy_version 14566 (0.0032) [2024-04-25 23:05:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 238731264. Throughput: 0: 55673.9. Samples: 188032660. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-25 23:05:43,923][47056] Avg episode reward: [(0, '0.129')] [2024-04-25 23:05:45,508][47288] Updated weights for policy 0, policy_version 14576 (0.0029) [2024-04-25 23:05:48,458][47288] Updated weights for policy 0, policy_version 14586 (0.0030) [2024-04-25 23:05:48,923][47056] Fps is (10 sec: 57345.3, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 239009792. Throughput: 0: 55854.7. Samples: 188368580. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-25 23:05:48,923][47056] Avg episode reward: [(0, '0.130')] [2024-04-25 23:05:51,365][47288] Updated weights for policy 0, policy_version 14596 (0.0030) [2024-04-25 23:05:52,068][47267] Signal inference workers to stop experience collection... (2700 times) [2024-04-25 23:05:52,117][47288] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-04-25 23:05:52,117][47267] Signal inference workers to resume experience collection... (2700 times) [2024-04-25 23:05:52,131][47288] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-04-25 23:05:53,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 239271936. Throughput: 0: 56040.9. Samples: 188710760. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:05:53,923][47056] Avg episode reward: [(0, '0.143')] [2024-04-25 23:05:54,254][47288] Updated weights for policy 0, policy_version 14606 (0.0035) [2024-04-25 23:05:57,190][47288] Updated weights for policy 0, policy_version 14616 (0.0025) [2024-04-25 23:05:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 239550464. Throughput: 0: 55700.9. Samples: 188865880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:05:58,923][47056] Avg episode reward: [(0, '0.117')] [2024-04-25 23:06:00,354][47288] Updated weights for policy 0, policy_version 14626 (0.0029) [2024-04-25 23:06:03,210][47288] Updated weights for policy 0, policy_version 14636 (0.0023) [2024-04-25 23:06:03,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 239828992. Throughput: 0: 55630.0. Samples: 189195600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:06:03,923][47056] Avg episode reward: [(0, '0.126')] [2024-04-25 23:06:06,134][47288] Updated weights for policy 0, policy_version 14646 (0.0031) [2024-04-25 23:06:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 240107520. Throughput: 0: 55672.9. Samples: 189533400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:06:08,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:06:08,953][47288] Updated weights for policy 0, policy_version 14656 (0.0027) [2024-04-25 23:06:11,946][47288] Updated weights for policy 0, policy_version 14666 (0.0031) [2024-04-25 23:06:13,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 240386048. Throughput: 0: 55804.8. Samples: 189709380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:06:13,925][47056] Avg episode reward: [(0, '0.099')] [2024-04-25 23:06:14,870][47288] Updated weights for policy 0, policy_version 14676 (0.0034) [2024-04-25 23:06:17,890][47288] Updated weights for policy 0, policy_version 14686 (0.0028) [2024-04-25 23:06:18,923][47056] Fps is (10 sec: 57339.9, 60 sec: 56251.0, 300 sec: 55705.5). Total num frames: 240680960. Throughput: 0: 55795.1. Samples: 190041040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-25 23:06:18,924][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:06:20,844][47288] Updated weights for policy 0, policy_version 14696 (0.0029) [2024-04-25 23:06:23,811][47288] Updated weights for policy 0, policy_version 14706 (0.0027) [2024-04-25 23:06:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 240943104. Throughput: 0: 55660.6. Samples: 190370100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-25 23:06:23,923][47056] Avg episode reward: [(0, '0.112')] [2024-04-25 23:06:26,771][47288] Updated weights for policy 0, policy_version 14716 (0.0034) [2024-04-25 23:06:28,923][47056] Fps is (10 sec: 54071.8, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 241221632. Throughput: 0: 55772.9. Samples: 190542440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 23:06:28,923][47056] Avg episode reward: [(0, '0.115')] [2024-04-25 23:06:29,580][47288] Updated weights for policy 0, policy_version 14726 (0.0026) [2024-04-25 23:06:32,755][47288] Updated weights for policy 0, policy_version 14736 (0.0024) [2024-04-25 23:06:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 241483776. Throughput: 0: 55817.6. Samples: 190880380. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 23:06:33,923][47056] Avg episode reward: [(0, '0.142')] [2024-04-25 23:06:35,361][47288] Updated weights for policy 0, policy_version 14746 (0.0033) [2024-04-25 23:06:38,452][47288] Updated weights for policy 0, policy_version 14756 (0.0028) [2024-04-25 23:06:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.9, 300 sec: 55872.2). Total num frames: 241795072. Throughput: 0: 55542.3. Samples: 191210160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-25 23:06:38,923][47056] Avg episode reward: [(0, '0.109')] [2024-04-25 23:06:41,408][47288] Updated weights for policy 0, policy_version 14766 (0.0029) [2024-04-25 23:06:43,923][47056] Fps is (10 sec: 57342.7, 60 sec: 55432.2, 300 sec: 55705.5). Total num frames: 242057216. Throughput: 0: 55703.6. Samples: 191372560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-25 23:06:43,924][47056] Avg episode reward: [(0, '0.113')] [2024-04-25 23:06:44,350][47288] Updated weights for policy 0, policy_version 14776 (0.0029) [2024-04-25 23:06:47,198][47288] Updated weights for policy 0, policy_version 14786 (0.0026) [2024-04-25 23:06:48,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 242352128. Throughput: 0: 55813.3. Samples: 191707200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-25 23:06:48,923][47056] Avg episode reward: [(0, '0.131')] [2024-04-25 23:06:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000014792_242352128.pth... [2024-04-25 23:06:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000013976_228982784.pth [2024-04-25 23:06:50,177][47288] Updated weights for policy 0, policy_version 14796 (0.0027) [2024-04-25 23:06:53,222][47288] Updated weights for policy 0, policy_version 14806 (0.0029) [2024-04-25 23:06:53,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 242630656. 
Throughput: 0: 55739.4. Samples: 192041680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-25 23:06:53,923][47056] Avg episode reward: [(0, '0.130')] [2024-04-25 23:06:55,972][47288] Updated weights for policy 0, policy_version 14816 (0.0032) [2024-04-25 23:06:58,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 242892800. Throughput: 0: 55535.6. Samples: 192208480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-25 23:06:58,923][47056] Avg episode reward: [(0, '0.109')] [2024-04-25 23:06:59,105][47288] Updated weights for policy 0, policy_version 14826 (0.0027) [2024-04-25 23:07:01,899][47288] Updated weights for policy 0, policy_version 14836 (0.0027) [2024-04-25 23:07:03,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 243171328. Throughput: 0: 55586.7. Samples: 192542400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-25 23:07:03,923][47056] Avg episode reward: [(0, '0.123')] [2024-04-25 23:07:04,994][47288] Updated weights for policy 0, policy_version 14846 (0.0028) [2024-04-25 23:07:05,378][47267] Signal inference workers to stop experience collection... (2750 times) [2024-04-25 23:07:05,412][47288] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-04-25 23:07:05,461][47267] Signal inference workers to resume experience collection... (2750 times) [2024-04-25 23:07:05,462][47288] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-04-25 23:07:07,817][47288] Updated weights for policy 0, policy_version 14856 (0.0030) [2024-04-25 23:07:08,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 243433472. Throughput: 0: 55701.2. Samples: 192876660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:07:08,923][47056] Avg episode reward: [(0, '0.127')] [2024-04-25 23:07:10,849][47288] Updated weights for policy 0, policy_version 14866 (0.0030) [2024-04-25 23:07:13,809][47288] Updated weights for policy 0, policy_version 14876 (0.0031) [2024-04-25 23:07:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 243728384. Throughput: 0: 55476.4. Samples: 193038880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:07:13,923][47056] Avg episode reward: [(0, '0.113')] [2024-04-25 23:07:16,808][47288] Updated weights for policy 0, policy_version 14886 (0.0031) [2024-04-25 23:07:18,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55433.3, 300 sec: 55761.2). Total num frames: 244006912. Throughput: 0: 55369.5. Samples: 193372000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:07:18,923][47056] Avg episode reward: [(0, '0.154')] [2024-04-25 23:07:19,033][47267] Saving new best policy, reward=0.154! [2024-04-25 23:07:19,741][47288] Updated weights for policy 0, policy_version 14896 (0.0030) [2024-04-25 23:07:22,830][47288] Updated weights for policy 0, policy_version 14906 (0.0040) [2024-04-25 23:07:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 244285440. Throughput: 0: 55316.4. Samples: 193699400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 23:07:23,923][47056] Avg episode reward: [(0, '0.115')] [2024-04-25 23:07:25,745][47288] Updated weights for policy 0, policy_version 14916 (0.0026) [2024-04-25 23:07:28,838][47288] Updated weights for policy 0, policy_version 14926 (0.0041) [2024-04-25 23:07:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 244547584. Throughput: 0: 55517.7. Samples: 193870840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 23:07:28,923][47056] Avg episode reward: [(0, '0.157')] [2024-04-25 23:07:28,932][47267] Saving new best policy, reward=0.157! [2024-04-25 23:07:31,556][47288] Updated weights for policy 0, policy_version 14936 (0.0028) [2024-04-25 23:07:33,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 244809728. Throughput: 0: 55411.2. Samples: 194200700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 23:07:33,923][47056] Avg episode reward: [(0, '0.148')] [2024-04-25 23:07:34,803][47288] Updated weights for policy 0, policy_version 14946 (0.0036) [2024-04-25 23:07:37,493][47288] Updated weights for policy 0, policy_version 14956 (0.0027) [2024-04-25 23:07:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 245104640. Throughput: 0: 55259.7. Samples: 194528360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 23:07:38,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:07:40,629][47288] Updated weights for policy 0, policy_version 14966 (0.0027) [2024-04-25 23:07:43,607][47288] Updated weights for policy 0, policy_version 14976 (0.0028) [2024-04-25 23:07:43,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 245383168. Throughput: 0: 55186.9. Samples: 194691900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:07:43,923][47056] Avg episode reward: [(0, '0.118')] [2024-04-25 23:07:46,629][47288] Updated weights for policy 0, policy_version 14986 (0.0033) [2024-04-25 23:07:48,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 245661696. Throughput: 0: 55174.1. Samples: 195025240. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:07:48,923][47056] Avg episode reward: [(0, '0.141')] [2024-04-25 23:07:49,545][47288] Updated weights for policy 0, policy_version 14996 (0.0031) [2024-04-25 23:07:52,595][47288] Updated weights for policy 0, policy_version 15006 (0.0031) [2024-04-25 23:07:53,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55159.7, 300 sec: 55594.5). Total num frames: 245940224. Throughput: 0: 55137.1. Samples: 195357820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:07:53,923][47056] Avg episode reward: [(0, '0.159')] [2024-04-25 23:07:53,923][47267] Saving new best policy, reward=0.159! [2024-04-25 23:07:55,314][47288] Updated weights for policy 0, policy_version 15016 (0.0033) [2024-04-25 23:07:58,476][47288] Updated weights for policy 0, policy_version 15026 (0.0031) [2024-04-25 23:07:58,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 55594.6). Total num frames: 246202368. Throughput: 0: 55174.6. Samples: 195521740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:07:58,923][47056] Avg episode reward: [(0, '0.142')] [2024-04-25 23:08:01,410][47288] Updated weights for policy 0, policy_version 15036 (0.0024) [2024-04-25 23:08:03,923][47056] Fps is (10 sec: 52428.4, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 246464512. Throughput: 0: 55143.0. Samples: 195853440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 23:08:03,923][47056] Avg episode reward: [(0, '0.120')] [2024-04-25 23:08:04,433][47288] Updated weights for policy 0, policy_version 15046 (0.0031) [2024-04-25 23:08:07,305][47288] Updated weights for policy 0, policy_version 15056 (0.0030) [2024-04-25 23:08:08,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 246759424. Throughput: 0: 55317.9. Samples: 196188720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 23:08:08,923][47056] Avg episode reward: [(0, '0.118')] [2024-04-25 23:08:10,442][47288] Updated weights for policy 0, policy_version 15066 (0.0028) [2024-04-25 23:08:13,102][47288] Updated weights for policy 0, policy_version 15076 (0.0032) [2024-04-25 23:08:13,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 247054336. Throughput: 0: 55004.8. Samples: 196346060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 23:08:13,923][47056] Avg episode reward: [(0, '0.101')] [2024-04-25 23:08:16,329][47288] Updated weights for policy 0, policy_version 15086 (0.0029) [2024-04-25 23:08:18,821][47288] Updated weights for policy 0, policy_version 15096 (0.0037) [2024-04-25 23:08:18,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 247332864. Throughput: 0: 55098.7. Samples: 196680140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 23:08:18,932][47056] Avg episode reward: [(0, '0.148')] [2024-04-25 23:08:22,191][47288] Updated weights for policy 0, policy_version 15106 (0.0032) [2024-04-25 23:08:23,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 247595008. Throughput: 0: 55310.2. Samples: 197017320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 23:08:23,923][47056] Avg episode reward: [(0, '0.107')] [2024-04-25 23:08:24,853][47288] Updated weights for policy 0, policy_version 15116 (0.0033) [2024-04-25 23:08:28,068][47288] Updated weights for policy 0, policy_version 15126 (0.0035) [2024-04-25 23:08:28,331][47267] Signal inference workers to stop experience collection... (2800 times) [2024-04-25 23:08:28,331][47267] Signal inference workers to resume experience collection... 
(2800 times) [2024-04-25 23:08:28,350][47288] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-04-25 23:08:28,350][47288] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-04-25 23:08:28,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 247906304. Throughput: 0: 55481.0. Samples: 197188540. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-25 23:08:28,923][47056] Avg episode reward: [(0, '0.114')] [2024-04-25 23:08:30,711][47288] Updated weights for policy 0, policy_version 15136 (0.0027) [2024-04-25 23:08:33,903][47288] Updated weights for policy 0, policy_version 15146 (0.0027) [2024-04-25 23:08:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 248152064. Throughput: 0: 55503.9. Samples: 197522920. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-25 23:08:33,923][47056] Avg episode reward: [(0, '0.153')] [2024-04-25 23:08:36,675][47288] Updated weights for policy 0, policy_version 15156 (0.0026) [2024-04-25 23:08:38,923][47056] Fps is (10 sec: 50790.0, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 248414208. Throughput: 0: 55555.3. Samples: 197857820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-25 23:08:38,923][47056] Avg episode reward: [(0, '0.164')] [2024-04-25 23:08:38,932][47267] Saving new best policy, reward=0.164! [2024-04-25 23:08:39,900][47288] Updated weights for policy 0, policy_version 15166 (0.0029) [2024-04-25 23:08:42,523][47288] Updated weights for policy 0, policy_version 15176 (0.0035) [2024-04-25 23:08:43,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 248692736. Throughput: 0: 55507.9. Samples: 198019600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-25 23:08:43,923][47056] Avg episode reward: [(0, '0.115')] [2024-04-25 23:08:45,817][47288] Updated weights for policy 0, policy_version 15186 (0.0028) [2024-04-25 23:08:48,395][47288] Updated weights for policy 0, policy_version 15196 (0.0028) [2024-04-25 23:08:48,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55705.4, 300 sec: 55538.9). Total num frames: 249004032. Throughput: 0: 55595.3. Samples: 198355240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:08:48,923][47056] Avg episode reward: [(0, '0.151')] [2024-04-25 23:08:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000015198_249004032.pth... [2024-04-25 23:08:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000014383_235651072.pth [2024-04-25 23:08:51,661][47288] Updated weights for policy 0, policy_version 15206 (0.0030) [2024-04-25 23:08:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 249266176. Throughput: 0: 55550.0. Samples: 198688460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:08:53,923][47056] Avg episode reward: [(0, '0.117')] [2024-04-25 23:08:54,223][47288] Updated weights for policy 0, policy_version 15216 (0.0031) [2024-04-25 23:08:57,493][47288] Updated weights for policy 0, policy_version 15226 (0.0028) [2024-04-25 23:08:58,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 249561088. Throughput: 0: 55776.5. Samples: 198856000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-25 23:08:58,923][47056] Avg episode reward: [(0, '0.111')] [2024-04-25 23:09:00,047][47288] Updated weights for policy 0, policy_version 15236 (0.0029) [2024-04-25 23:09:03,455][47288] Updated weights for policy 0, policy_version 15246 (0.0026) [2024-04-25 23:09:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 249823232. 
Throughput: 0: 55747.9. Samples: 199188800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-25 23:09:03,923][47056] Avg episode reward: [(0, '0.160')] [2024-04-25 23:09:05,993][47288] Updated weights for policy 0, policy_version 15256 (0.0031) [2024-04-25 23:09:08,923][47056] Fps is (10 sec: 49152.6, 60 sec: 54886.6, 300 sec: 55427.9). Total num frames: 250052608. Throughput: 0: 55680.5. Samples: 199522940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-25 23:09:08,923][47056] Avg episode reward: [(0, '0.163')] [2024-04-25 23:09:09,342][47288] Updated weights for policy 0, policy_version 15266 (0.0032) [2024-04-25 23:09:11,950][47288] Updated weights for policy 0, policy_version 15276 (0.0031) [2024-04-25 23:09:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 250363904. Throughput: 0: 55403.1. Samples: 199681680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-25 23:09:13,923][47056] Avg episode reward: [(0, '0.133')] [2024-04-25 23:09:15,183][47288] Updated weights for policy 0, policy_version 15286 (0.0029) [2024-04-25 23:09:17,844][47288] Updated weights for policy 0, policy_version 15296 (0.0030) [2024-04-25 23:09:18,923][47056] Fps is (10 sec: 60619.5, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 250658816. Throughput: 0: 55441.3. Samples: 200017780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-25 23:09:18,923][47056] Avg episode reward: [(0, '0.115')] [2024-04-25 23:09:21,052][47288] Updated weights for policy 0, policy_version 15306 (0.0026) [2024-04-25 23:09:23,159][47267] Signal inference workers to stop experience collection... (2850 times) [2024-04-25 23:09:23,159][47267] Signal inference workers to resume experience collection... 
(2850 times) [2024-04-25 23:09:23,189][47288] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-04-25 23:09:23,189][47288] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-04-25 23:09:23,578][47288] Updated weights for policy 0, policy_version 15316 (0.0027) [2024-04-25 23:09:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 250937344. Throughput: 0: 55312.5. Samples: 200346880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-25 23:09:23,923][47056] Avg episode reward: [(0, '0.138')] [2024-04-25 23:09:26,915][47288] Updated weights for policy 0, policy_version 15326 (0.0031) [2024-04-25 23:09:28,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 251215872. Throughput: 0: 55637.0. Samples: 200523260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-25 23:09:28,923][47056] Avg episode reward: [(0, '0.128')] [2024-04-25 23:09:29,312][47288] Updated weights for policy 0, policy_version 15336 (0.0031) [2024-04-25 23:09:32,948][47288] Updated weights for policy 0, policy_version 15346 (0.0030) [2024-04-25 23:09:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 251494400. Throughput: 0: 55677.7. Samples: 200860720. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-25 23:09:33,923][47056] Avg episode reward: [(0, '0.131')] [2024-04-25 23:09:35,364][47288] Updated weights for policy 0, policy_version 15356 (0.0026) [2024-04-25 23:09:38,861][47288] Updated weights for policy 0, policy_version 15366 (0.0032) [2024-04-25 23:09:38,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 251756544. Throughput: 0: 55677.0. Samples: 201193920. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-25 23:09:38,923][47056] Avg episode reward: [(0, '0.133')] [2024-04-25 23:09:41,148][47288] Updated weights for policy 0, policy_version 15376 (0.0030) [2024-04-25 23:09:43,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 252035072. Throughput: 0: 55593.3. Samples: 201357700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 23:09:43,923][47056] Avg episode reward: [(0, '0.124')] [2024-04-25 23:09:44,674][47288] Updated weights for policy 0, policy_version 15386 (0.0029) [2024-04-25 23:09:47,164][47288] Updated weights for policy 0, policy_version 15396 (0.0036) [2024-04-25 23:09:48,923][47056] Fps is (10 sec: 54066.8, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 252297216. Throughput: 0: 55632.9. Samples: 201692280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 23:09:48,923][47056] Avg episode reward: [(0, '0.147')] [2024-04-25 23:09:50,435][47288] Updated weights for policy 0, policy_version 15406 (0.0029) [2024-04-25 23:09:53,138][47288] Updated weights for policy 0, policy_version 15416 (0.0033) [2024-04-25 23:09:53,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 252592128. Throughput: 0: 55617.4. Samples: 202025720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 23:09:53,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:09:56,283][47288] Updated weights for policy 0, policy_version 15426 (0.0026) [2024-04-25 23:09:58,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 252887040. Throughput: 0: 55929.8. Samples: 202198520. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 23:09:58,923][47056] Avg episode reward: [(0, '0.145')] [2024-04-25 23:09:59,024][47288] Updated weights for policy 0, policy_version 15436 (0.0031) [2024-04-25 23:10:02,285][47288] Updated weights for policy 0, policy_version 15446 (0.0027) [2024-04-25 23:10:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 253165568. Throughput: 0: 55839.9. Samples: 202530560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 23:10:03,923][47056] Avg episode reward: [(0, '0.152')] [2024-04-25 23:10:04,769][47288] Updated weights for policy 0, policy_version 15456 (0.0032) [2024-04-25 23:10:08,080][47288] Updated weights for policy 0, policy_version 15466 (0.0031) [2024-04-25 23:10:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.8, 300 sec: 55483.5). Total num frames: 253444096. Throughput: 0: 55895.2. Samples: 202862160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:10:08,923][47056] Avg episode reward: [(0, '0.152')] [2024-04-25 23:10:10,718][47288] Updated weights for policy 0, policy_version 15476 (0.0033) [2024-04-25 23:10:13,809][47288] Updated weights for policy 0, policy_version 15486 (0.0027) [2024-04-25 23:10:13,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 253722624. Throughput: 0: 55692.8. Samples: 203029440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:10:13,923][47056] Avg episode reward: [(0, '0.147')] [2024-04-25 23:10:16,651][47288] Updated weights for policy 0, policy_version 15496 (0.0027) [2024-04-25 23:10:18,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 253984768. Throughput: 0: 55705.2. Samples: 203367460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:10:18,932][47056] Avg episode reward: [(0, '0.160')] [2024-04-25 23:10:19,727][47288] Updated weights for policy 0, policy_version 15506 (0.0033) [2024-04-25 23:10:22,431][47288] Updated weights for policy 0, policy_version 15516 (0.0030) [2024-04-25 23:10:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 254279680. Throughput: 0: 55780.7. Samples: 203704060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:10:23,923][47056] Avg episode reward: [(0, '0.120')] [2024-04-25 23:10:25,715][47288] Updated weights for policy 0, policy_version 15526 (0.0032) [2024-04-25 23:10:28,394][47288] Updated weights for policy 0, policy_version 15536 (0.0029) [2024-04-25 23:10:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 254558208. Throughput: 0: 55686.3. Samples: 203863580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-25 23:10:28,923][47056] Avg episode reward: [(0, '0.162')] [2024-04-25 23:10:31,585][47288] Updated weights for policy 0, policy_version 15546 (0.0025) [2024-04-25 23:10:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 254836736. Throughput: 0: 55664.5. Samples: 204197180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-25 23:10:33,923][47056] Avg episode reward: [(0, '0.138')] [2024-04-25 23:10:34,352][47288] Updated weights for policy 0, policy_version 15556 (0.0030) [2024-04-25 23:10:37,348][47288] Updated weights for policy 0, policy_version 15566 (0.0032) [2024-04-25 23:10:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 255131648. Throughput: 0: 55647.4. Samples: 204529860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-25 23:10:38,923][47056] Avg episode reward: [(0, '0.128')] [2024-04-25 23:10:40,168][47288] Updated weights for policy 0, policy_version 15576 (0.0026) [2024-04-25 23:10:43,205][47288] Updated weights for policy 0, policy_version 15586 (0.0028) [2024-04-25 23:10:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 255393792. Throughput: 0: 55560.5. Samples: 204698740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 23:10:43,923][47056] Avg episode reward: [(0, '0.135')] [2024-04-25 23:10:46,211][47288] Updated weights for policy 0, policy_version 15596 (0.0028) [2024-04-25 23:10:46,803][47267] Signal inference workers to stop experience collection... (2900 times) [2024-04-25 23:10:46,803][47267] Signal inference workers to resume experience collection... (2900 times) [2024-04-25 23:10:46,827][47288] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-04-25 23:10:46,827][47288] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-04-25 23:10:48,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 255655936. Throughput: 0: 55710.6. Samples: 205037540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-25 23:10:48,923][47056] Avg episode reward: [(0, '0.138')] [2024-04-25 23:10:48,939][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000015605_255672320.pth... [2024-04-25 23:10:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000014792_242352128.pth [2024-04-25 23:10:49,101][47288] Updated weights for policy 0, policy_version 15606 (0.0027) [2024-04-25 23:10:52,036][47288] Updated weights for policy 0, policy_version 15616 (0.0027) [2024-04-25 23:10:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 255934464. Throughput: 0: 55780.3. Samples: 205372280. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:10:53,923][47056] Avg episode reward: [(0, '0.151')] [2024-04-25 23:10:54,939][47288] Updated weights for policy 0, policy_version 15626 (0.0028) [2024-04-25 23:10:57,851][47288] Updated weights for policy 0, policy_version 15636 (0.0027) [2024-04-25 23:10:58,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 256229376. Throughput: 0: 55577.4. Samples: 205530420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:10:58,923][47056] Avg episode reward: [(0, '0.124')] [2024-04-25 23:11:00,886][47288] Updated weights for policy 0, policy_version 15646 (0.0028) [2024-04-25 23:11:03,786][47288] Updated weights for policy 0, policy_version 15656 (0.0030) [2024-04-25 23:11:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 256507904. Throughput: 0: 55468.0. Samples: 205863520. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-04-25 23:11:03,923][47056] Avg episode reward: [(0, '0.124')] [2024-04-25 23:11:06,916][47288] Updated weights for policy 0, policy_version 15666 (0.0032) [2024-04-25 23:11:08,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 256770048. Throughput: 0: 55461.8. Samples: 206199840. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-04-25 23:11:08,923][47056] Avg episode reward: [(0, '0.156')] [2024-04-25 23:11:09,758][47288] Updated weights for policy 0, policy_version 15676 (0.0033) [2024-04-25 23:11:12,619][47288] Updated weights for policy 0, policy_version 15686 (0.0032) [2024-04-25 23:11:13,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55539.1). Total num frames: 257064960. Throughput: 0: 55912.1. Samples: 206379620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 23:11:13,923][47056] Avg episode reward: [(0, '0.144')] [2024-04-25 23:11:15,478][47288] Updated weights for policy 0, policy_version 15696 (0.0027) [2024-04-25 23:11:18,633][47288] Updated weights for policy 0, policy_version 15706 (0.0031) [2024-04-25 23:11:18,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 257343488. Throughput: 0: 55792.1. Samples: 206707820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 23:11:18,923][47056] Avg episode reward: [(0, '0.124')] [2024-04-25 23:11:21,295][47288] Updated weights for policy 0, policy_version 15716 (0.0027) [2024-04-25 23:11:23,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 257605632. Throughput: 0: 55862.3. Samples: 207043660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 23:11:23,923][47056] Avg episode reward: [(0, '0.132')] [2024-04-25 23:11:24,567][47288] Updated weights for policy 0, policy_version 15726 (0.0030) [2024-04-25 23:11:27,332][47288] Updated weights for policy 0, policy_version 15736 (0.0026) [2024-04-25 23:11:28,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 257884160. Throughput: 0: 55724.2. Samples: 207206340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 23:11:28,923][47056] Avg episode reward: [(0, '0.165')] [2024-04-25 23:11:28,934][47267] Saving new best policy, reward=0.165! [2024-04-25 23:11:30,372][47288] Updated weights for policy 0, policy_version 15746 (0.0029) [2024-04-25 23:11:33,198][47288] Updated weights for policy 0, policy_version 15756 (0.0029) [2024-04-25 23:11:33,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 258162688. Throughput: 0: 55556.1. Samples: 207537560. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 23:11:33,923][47056] Avg episode reward: [(0, '0.154')] [2024-04-25 23:11:36,209][47288] Updated weights for policy 0, policy_version 15766 (0.0029) [2024-04-25 23:11:38,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 55594.6). Total num frames: 258457600. Throughput: 0: 55451.1. Samples: 207867580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-25 23:11:38,923][47056] Avg episode reward: [(0, '0.160')] [2024-04-25 23:11:39,113][47288] Updated weights for policy 0, policy_version 15776 (0.0033) [2024-04-25 23:11:42,161][47288] Updated weights for policy 0, policy_version 15786 (0.0029) [2024-04-25 23:11:43,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 258719744. Throughput: 0: 55843.9. Samples: 208043400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-25 23:11:43,923][47056] Avg episode reward: [(0, '0.162')] [2024-04-25 23:11:45,029][47288] Updated weights for policy 0, policy_version 15796 (0.0032) [2024-04-25 23:11:48,130][47288] Updated weights for policy 0, policy_version 15806 (0.0027) [2024-04-25 23:11:48,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.4, 300 sec: 55539.0). Total num frames: 259014656. Throughput: 0: 55868.2. Samples: 208377600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:11:48,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-25 23:11:48,931][47267] Saving new best policy, reward=0.186! [2024-04-25 23:11:50,791][47288] Updated weights for policy 0, policy_version 15816 (0.0026) [2024-04-25 23:11:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 259276800. Throughput: 0: 55811.2. Samples: 208711340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:11:53,924][47056] Avg episode reward: [(0, '0.151')] [2024-04-25 23:11:54,051][47288] Updated weights for policy 0, policy_version 15826 (0.0031) [2024-04-25 23:11:54,287][47267] Signal inference workers to stop experience collection... (2950 times) [2024-04-25 23:11:54,320][47288] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-04-25 23:11:54,375][47267] Signal inference workers to resume experience collection... (2950 times) [2024-04-25 23:11:54,375][47288] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-04-25 23:11:56,572][47288] Updated weights for policy 0, policy_version 15836 (0.0024) [2024-04-25 23:11:58,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 259555328. Throughput: 0: 55500.7. Samples: 208877160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:11:58,924][47056] Avg episode reward: [(0, '0.151')] [2024-04-25 23:11:59,878][47288] Updated weights for policy 0, policy_version 15846 (0.0031) [2024-04-25 23:12:02,488][47288] Updated weights for policy 0, policy_version 15856 (0.0032) [2024-04-25 23:12:03,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 259833856. Throughput: 0: 55540.0. Samples: 209207120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-25 23:12:03,923][47056] Avg episode reward: [(0, '0.154')] [2024-04-25 23:12:05,734][47288] Updated weights for policy 0, policy_version 15866 (0.0027) [2024-04-25 23:12:08,385][47288] Updated weights for policy 0, policy_version 15876 (0.0026) [2024-04-25 23:12:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 260128768. Throughput: 0: 55481.3. Samples: 209540320. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-25 23:12:08,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:12:11,645][47288] Updated weights for policy 0, policy_version 15886 (0.0027) [2024-04-25 23:12:13,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 260407296. Throughput: 0: 55817.9. Samples: 209718140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 23:12:13,923][47056] Avg episode reward: [(0, '0.117')] [2024-04-25 23:12:14,142][47288] Updated weights for policy 0, policy_version 15896 (0.0029) [2024-04-25 23:12:17,527][47288] Updated weights for policy 0, policy_version 15906 (0.0028) [2024-04-25 23:12:18,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 260669440. Throughput: 0: 55826.3. Samples: 210049740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-25 23:12:18,923][47056] Avg episode reward: [(0, '0.167')] [2024-04-25 23:12:20,113][47288] Updated weights for policy 0, policy_version 15916 (0.0026) [2024-04-25 23:12:23,237][47288] Updated weights for policy 0, policy_version 15926 (0.0030) [2024-04-25 23:12:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 260964352. Throughput: 0: 55868.0. Samples: 210381640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 23:12:23,923][47056] Avg episode reward: [(0, '0.139')] [2024-04-25 23:12:26,012][47288] Updated weights for policy 0, policy_version 15936 (0.0029) [2024-04-25 23:12:28,923][47056] Fps is (10 sec: 57342.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 261242880. Throughput: 0: 55675.5. Samples: 210548800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 23:12:28,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:12:29,232][47288] Updated weights for policy 0, policy_version 15946 (0.0030) [2024-04-25 23:12:32,444][47288] Updated weights for policy 0, policy_version 15956 (0.0035) [2024-04-25 23:12:33,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 261505024. Throughput: 0: 55685.7. Samples: 210883440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 23:12:33,923][47056] Avg episode reward: [(0, '0.128')] [2024-04-25 23:12:35,168][47288] Updated weights for policy 0, policy_version 15966 (0.0035) [2024-04-25 23:12:38,235][47288] Updated weights for policy 0, policy_version 15976 (0.0028) [2024-04-25 23:12:38,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 261799936. Throughput: 0: 55738.8. Samples: 211219580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 23:12:38,923][47056] Avg episode reward: [(0, '0.111')] [2024-04-25 23:12:41,052][47288] Updated weights for policy 0, policy_version 15986 (0.0032) [2024-04-25 23:12:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 262062080. Throughput: 0: 55653.1. Samples: 211381540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-25 23:12:43,923][47056] Avg episode reward: [(0, '0.143')] [2024-04-25 23:12:43,960][47288] Updated weights for policy 0, policy_version 15996 (0.0030) [2024-04-25 23:12:46,880][47288] Updated weights for policy 0, policy_version 16006 (0.0034) [2024-04-25 23:12:48,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 262356992. Throughput: 0: 55700.4. Samples: 211713640. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-04-25 23:12:48,923][47056] Avg episode reward: [(0, '0.138')] [2024-04-25 23:12:48,943][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000016014_262373376.pth... [2024-04-25 23:12:48,991][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000015198_249004032.pth [2024-04-25 23:12:49,918][47288] Updated weights for policy 0, policy_version 16016 (0.0037) [2024-04-25 23:12:52,856][47288] Updated weights for policy 0, policy_version 16026 (0.0027) [2024-04-25 23:12:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 262619136. Throughput: 0: 55639.5. Samples: 212044100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) [2024-04-25 23:12:53,923][47056] Avg episode reward: [(0, '0.114')] [2024-04-25 23:12:55,825][47288] Updated weights for policy 0, policy_version 16036 (0.0033) [2024-04-25 23:12:58,746][47288] Updated weights for policy 0, policy_version 16046 (0.0030) [2024-04-25 23:12:58,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 262897664. Throughput: 0: 55423.4. Samples: 212212200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 23:12:58,923][47056] Avg episode reward: [(0, '0.126')] [2024-04-25 23:13:01,620][47288] Updated weights for policy 0, policy_version 16056 (0.0027) [2024-04-25 23:13:02,557][47267] Signal inference workers to stop experience collection... (3000 times) [2024-04-25 23:13:02,576][47288] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-04-25 23:13:02,613][47267] Signal inference workers to resume experience collection... (3000 times) [2024-04-25 23:13:02,613][47288] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-04-25 23:13:03,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 263176192. Throughput: 0: 55651.5. Samples: 212554060. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-25 23:13:03,923][47056] Avg episode reward: [(0, '0.129')] [2024-04-25 23:13:04,536][47288] Updated weights for policy 0, policy_version 16066 (0.0035) [2024-04-25 23:13:07,334][47288] Updated weights for policy 0, policy_version 16076 (0.0031) [2024-04-25 23:13:08,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 263438336. Throughput: 0: 55732.9. Samples: 212889620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 23:13:08,923][47056] Avg episode reward: [(0, '0.138')] [2024-04-25 23:13:10,388][47288] Updated weights for policy 0, policy_version 16086 (0.0035) [2024-04-25 23:13:13,355][47288] Updated weights for policy 0, policy_version 16096 (0.0028) [2024-04-25 23:13:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 263733248. Throughput: 0: 55556.2. Samples: 213048820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 23:13:13,923][47056] Avg episode reward: [(0, '0.127')] [2024-04-25 23:13:16,247][47288] Updated weights for policy 0, policy_version 16106 (0.0028) [2024-04-25 23:13:18,924][47056] Fps is (10 sec: 57337.1, 60 sec: 55704.4, 300 sec: 55649.8). Total num frames: 264011776. Throughput: 0: 55488.3. Samples: 213380480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 23:13:18,924][47056] Avg episode reward: [(0, '0.151')] [2024-04-25 23:13:19,172][47288] Updated weights for policy 0, policy_version 16116 (0.0029) [2024-04-25 23:13:22,163][47288] Updated weights for policy 0, policy_version 16126 (0.0029) [2024-04-25 23:13:23,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 264306688. Throughput: 0: 55512.7. Samples: 213717660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-04-25 23:13:23,924][47056] Avg episode reward: [(0, '0.122')] [2024-04-25 23:13:24,960][47288] Updated weights for policy 0, policy_version 16136 (0.0035) [2024-04-25 23:13:27,926][47288] Updated weights for policy 0, policy_version 16146 (0.0030) [2024-04-25 23:13:28,923][47056] Fps is (10 sec: 57349.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 264585216. Throughput: 0: 55846.4. Samples: 213894640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-04-25 23:13:28,923][47056] Avg episode reward: [(0, '0.119')] [2024-04-25 23:13:30,708][47288] Updated weights for policy 0, policy_version 16156 (0.0034) [2024-04-25 23:13:33,767][47288] Updated weights for policy 0, policy_version 16166 (0.0027) [2024-04-25 23:13:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 264863744. Throughput: 0: 55965.6. Samples: 214232100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 23:13:33,923][47056] Avg episode reward: [(0, '0.123')] [2024-04-25 23:13:36,592][47288] Updated weights for policy 0, policy_version 16176 (0.0027) [2024-04-25 23:13:38,923][47056] Fps is (10 sec: 54068.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 265125888. Throughput: 0: 56041.9. Samples: 214565980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 23:13:38,923][47056] Avg episode reward: [(0, '0.138')] [2024-04-25 23:13:39,541][47288] Updated weights for policy 0, policy_version 16186 (0.0036) [2024-04-25 23:13:42,681][47288] Updated weights for policy 0, policy_version 16196 (0.0025) [2024-04-25 23:13:43,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.5, 300 sec: 55594.6). Total num frames: 265404416. Throughput: 0: 55827.7. Samples: 214724440. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 23:13:43,923][47056] Avg episode reward: [(0, '0.159')] [2024-04-25 23:13:45,442][47288] Updated weights for policy 0, policy_version 16206 (0.0040) [2024-04-25 23:13:48,412][47288] Updated weights for policy 0, policy_version 16216 (0.0036) [2024-04-25 23:13:48,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 265682944. Throughput: 0: 55690.5. Samples: 215060140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 23:13:48,923][47056] Avg episode reward: [(0, '0.137')] [2024-04-25 23:13:51,411][47288] Updated weights for policy 0, policy_version 16226 (0.0038) [2024-04-25 23:13:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 265977856. Throughput: 0: 55551.1. Samples: 215389420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 23:13:53,923][47056] Avg episode reward: [(0, '0.138')] [2024-04-25 23:13:54,488][47288] Updated weights for policy 0, policy_version 16236 (0.0035) [2024-04-25 23:13:57,319][47288] Updated weights for policy 0, policy_version 16246 (0.0026) [2024-04-25 23:13:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 266256384. Throughput: 0: 55956.7. Samples: 215566880. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-25 23:13:58,923][47056] Avg episode reward: [(0, '0.148')] [2024-04-25 23:14:00,519][47288] Updated weights for policy 0, policy_version 16256 (0.0026) [2024-04-25 23:14:03,129][47288] Updated weights for policy 0, policy_version 16266 (0.0026) [2024-04-25 23:14:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 266518528. Throughput: 0: 55953.4. Samples: 215898320. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-25 23:14:03,923][47056] Avg episode reward: [(0, '0.107')] [2024-04-25 23:14:06,177][47288] Updated weights for policy 0, policy_version 16276 (0.0026) [2024-04-25 23:14:08,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 266797056. Throughput: 0: 55850.8. Samples: 216230940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-25 23:14:08,923][47056] Avg episode reward: [(0, '0.171')] [2024-04-25 23:14:09,084][47288] Updated weights for policy 0, policy_version 16286 (0.0028) [2024-04-25 23:14:11,985][47288] Updated weights for policy 0, policy_version 16296 (0.0029) [2024-04-25 23:14:13,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 267075584. Throughput: 0: 55583.4. Samples: 216395880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-25 23:14:13,923][47056] Avg episode reward: [(0, '0.154')] [2024-04-25 23:14:14,952][47288] Updated weights for policy 0, policy_version 16306 (0.0036) [2024-04-25 23:14:17,019][47267] Signal inference workers to stop experience collection... (3050 times) [2024-04-25 23:14:17,019][47267] Signal inference workers to resume experience collection... (3050 times) [2024-04-25 23:14:17,047][47288] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-04-25 23:14:17,048][47288] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-04-25 23:14:17,954][47288] Updated weights for policy 0, policy_version 16316 (0.0030) [2024-04-25 23:14:18,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55433.6, 300 sec: 55594.5). Total num frames: 267337728. Throughput: 0: 55467.3. Samples: 216728120. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-25 23:14:18,923][47056] Avg episode reward: [(0, '0.146')] [2024-04-25 23:14:20,767][47288] Updated weights for policy 0, policy_version 16326 (0.0028) [2024-04-25 23:14:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 267632640. Throughput: 0: 55421.3. Samples: 217059940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-25 23:14:23,923][47056] Avg episode reward: [(0, '0.100')] [2024-04-25 23:14:23,929][47288] Updated weights for policy 0, policy_version 16336 (0.0029) [2024-04-25 23:14:26,642][47288] Updated weights for policy 0, policy_version 16346 (0.0024) [2024-04-25 23:14:28,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 267927552. Throughput: 0: 55644.6. Samples: 217228440. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-25 23:14:28,923][47056] Avg episode reward: [(0, '0.176')] [2024-04-25 23:14:30,334][47288] Updated weights for policy 0, policy_version 16356 (0.0026) [2024-04-25 23:14:32,507][47288] Updated weights for policy 0, policy_version 16366 (0.0032) [2024-04-25 23:14:33,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 268222464. Throughput: 0: 55606.0. Samples: 217562400. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-25 23:14:33,923][47056] Avg episode reward: [(0, '0.169')] [2024-04-25 23:14:36,323][47288] Updated weights for policy 0, policy_version 16376 (0.0038) [2024-04-25 23:14:38,490][47288] Updated weights for policy 0, policy_version 16386 (0.0027) [2024-04-25 23:14:38,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 268468224. Throughput: 0: 55717.2. Samples: 217896700. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-25 23:14:38,923][47056] Avg episode reward: [(0, '0.153')] [2024-04-25 23:14:42,093][47288] Updated weights for policy 0, policy_version 16396 (0.0029) [2024-04-25 23:14:43,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 268746752. Throughput: 0: 55528.1. Samples: 218065640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 23:14:43,923][47056] Avg episode reward: [(0, '0.141')] [2024-04-25 23:14:44,374][47288] Updated weights for policy 0, policy_version 16406 (0.0028) [2024-04-25 23:14:47,745][47288] Updated weights for policy 0, policy_version 16416 (0.0029) [2024-04-25 23:14:48,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 269025280. Throughput: 0: 55672.1. Samples: 218403560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 23:14:48,923][47056] Avg episode reward: [(0, '0.136')] [2024-04-25 23:14:48,950][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000016421_269041664.pth... [2024-04-25 23:14:48,997][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000015605_255672320.pth [2024-04-25 23:14:50,635][47288] Updated weights for policy 0, policy_version 16426 (0.0029) [2024-04-25 23:14:53,656][47288] Updated weights for policy 0, policy_version 16436 (0.0028) [2024-04-25 23:14:53,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 269287424. Throughput: 0: 55755.6. Samples: 218739940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 23:14:53,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-25 23:14:56,528][47288] Updated weights for policy 0, policy_version 16446 (0.0027) [2024-04-25 23:14:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 269582336. Throughput: 0: 55812.7. Samples: 218907460. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-25 23:14:58,923][47056] Avg episode reward: [(0, '0.145')] [2024-04-25 23:14:59,683][47288] Updated weights for policy 0, policy_version 16456 (0.0027) [2024-04-25 23:15:02,330][47288] Updated weights for policy 0, policy_version 16466 (0.0034) [2024-04-25 23:15:03,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 269877248. Throughput: 0: 55766.2. Samples: 219237600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-25 23:15:03,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:15:05,462][47288] Updated weights for policy 0, policy_version 16476 (0.0032) [2024-04-25 23:15:08,159][47288] Updated weights for policy 0, policy_version 16486 (0.0029) [2024-04-25 23:15:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 270155776. Throughput: 0: 55818.6. Samples: 219571780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 23:15:08,923][47056] Avg episode reward: [(0, '0.125')] [2024-04-25 23:15:11,400][47288] Updated weights for policy 0, policy_version 16496 (0.0034) [2024-04-25 23:15:11,402][47267] Signal inference workers to stop experience collection... (3100 times) [2024-04-25 23:15:11,403][47267] Signal inference workers to resume experience collection... (3100 times) [2024-04-25 23:15:11,430][47288] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-04-25 23:15:11,430][47288] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-04-25 23:15:13,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 270417920. Throughput: 0: 55933.6. Samples: 219745460. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 23:15:13,923][47056] Avg episode reward: [(0, '0.137')] [2024-04-25 23:15:14,065][47288] Updated weights for policy 0, policy_version 16506 (0.0032) [2024-04-25 23:15:17,075][47288] Updated weights for policy 0, policy_version 16516 (0.0030) [2024-04-25 23:15:18,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 270696448. Throughput: 0: 55961.6. Samples: 220080680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-25 23:15:18,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:15:19,962][47288] Updated weights for policy 0, policy_version 16526 (0.0027) [2024-04-25 23:15:23,005][47288] Updated weights for policy 0, policy_version 16536 (0.0033) [2024-04-25 23:15:23,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 271007744. Throughput: 0: 55929.1. Samples: 220413500. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-25 23:15:23,923][47056] Avg episode reward: [(0, '0.156')] [2024-04-25 23:15:25,873][47288] Updated weights for policy 0, policy_version 16546 (0.0032) [2024-04-25 23:15:28,724][47288] Updated weights for policy 0, policy_version 16556 (0.0032) [2024-04-25 23:15:28,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 271253504. Throughput: 0: 55747.5. Samples: 220574280. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-25 23:15:28,923][47056] Avg episode reward: [(0, '0.140')] [2024-04-25 23:15:31,689][47288] Updated weights for policy 0, policy_version 16566 (0.0033) [2024-04-25 23:15:33,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 271532032. Throughput: 0: 55672.1. Samples: 220908800. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-25 23:15:33,923][47056] Avg episode reward: [(0, '0.149')] [2024-04-25 23:15:34,582][47288] Updated weights for policy 0, policy_version 16576 (0.0027) [2024-04-25 23:15:37,473][47288] Updated weights for policy 0, policy_version 16586 (0.0029) [2024-04-25 23:15:38,923][47056] Fps is (10 sec: 57345.3, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 271826944. Throughput: 0: 55689.0. Samples: 221245940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-25 23:15:38,923][47056] Avg episode reward: [(0, '0.165')] [2024-04-25 23:15:40,409][47288] Updated weights for policy 0, policy_version 16596 (0.0032) [2024-04-25 23:15:43,374][47288] Updated weights for policy 0, policy_version 16606 (0.0031) [2024-04-25 23:15:43,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 272121856. Throughput: 0: 55636.0. Samples: 221411080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-25 23:15:43,923][47056] Avg episode reward: [(0, '0.180')] [2024-04-25 23:15:46,178][47288] Updated weights for policy 0, policy_version 16616 (0.0030) [2024-04-25 23:15:48,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 272384000. Throughput: 0: 55946.5. Samples: 221755200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-25 23:15:48,923][47056] Avg episode reward: [(0, '0.215')] [2024-04-25 23:15:48,933][47267] Saving new best policy, reward=0.215! [2024-04-25 23:15:49,114][47288] Updated weights for policy 0, policy_version 16626 (0.0039) [2024-04-25 23:15:52,181][47288] Updated weights for policy 0, policy_version 16636 (0.0031) [2024-04-25 23:15:53,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 272646144. Throughput: 0: 55954.1. Samples: 222089720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:15:53,923][47056] Avg episode reward: [(0, '0.177')] [2024-04-25 23:15:54,992][47288] Updated weights for policy 0, policy_version 16646 (0.0029) [2024-04-25 23:15:58,065][47288] Updated weights for policy 0, policy_version 16656 (0.0029) [2024-04-25 23:15:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 272941056. Throughput: 0: 55640.0. Samples: 222249260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:15:58,923][47056] Avg episode reward: [(0, '0.169')] [2024-04-25 23:16:00,969][47288] Updated weights for policy 0, policy_version 16666 (0.0033) [2024-04-25 23:16:03,889][47288] Updated weights for policy 0, policy_version 16676 (0.0031) [2024-04-25 23:16:03,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 273219584. Throughput: 0: 55647.7. Samples: 222584820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:16:03,923][47056] Avg episode reward: [(0, '0.147')] [2024-04-25 23:16:06,703][47288] Updated weights for policy 0, policy_version 16686 (0.0028) [2024-04-25 23:16:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 273498112. Throughput: 0: 55675.0. Samples: 222918880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-25 23:16:08,923][47056] Avg episode reward: [(0, '0.118')] [2024-04-25 23:16:09,746][47288] Updated weights for policy 0, policy_version 16696 (0.0032) [2024-04-25 23:16:12,698][47288] Updated weights for policy 0, policy_version 16706 (0.0029) [2024-04-25 23:16:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 273793024. Throughput: 0: 55979.3. Samples: 223093340. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-25 23:16:13,923][47056] Avg episode reward: [(0, '0.156')] [2024-04-25 23:16:15,266][47267] Signal inference workers to stop experience collection... 
(3150 times) [2024-04-25 23:16:15,310][47288] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-04-25 23:16:15,320][47267] Signal inference workers to resume experience collection... (3150 times) [2024-04-25 23:16:15,328][47288] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-04-25 23:16:15,427][47288] Updated weights for policy 0, policy_version 16716 (0.0031) [2024-04-25 23:16:18,683][47288] Updated weights for policy 0, policy_version 16726 (0.0030) [2024-04-25 23:16:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 274038784. Throughput: 0: 55862.2. Samples: 223422600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:16:18,923][47056] Avg episode reward: [(0, '0.153')] [2024-04-25 23:16:21,349][47288] Updated weights for policy 0, policy_version 16736 (0.0027) [2024-04-25 23:16:23,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 274333696. Throughput: 0: 55950.6. Samples: 223763720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:16:23,923][47056] Avg episode reward: [(0, '0.153')] [2024-04-25 23:16:24,396][47288] Updated weights for policy 0, policy_version 16746 (0.0024) [2024-04-25 23:16:27,389][47288] Updated weights for policy 0, policy_version 16756 (0.0033) [2024-04-25 23:16:28,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 274612224. Throughput: 0: 55916.0. Samples: 223927300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:16:28,923][47056] Avg episode reward: [(0, '0.162')] [2024-04-25 23:16:30,231][47288] Updated weights for policy 0, policy_version 16766 (0.0041) [2024-04-25 23:16:33,194][47288] Updated weights for policy 0, policy_version 16776 (0.0028) [2024-04-25 23:16:33,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56251.5, 300 sec: 55761.1). Total num frames: 274907136. Throughput: 0: 55684.8. Samples: 224261020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:16:33,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:16:36,166][47288] Updated weights for policy 0, policy_version 16786 (0.0027) [2024-04-25 23:16:38,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 275169280. Throughput: 0: 55674.2. Samples: 224595060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:16:38,924][47056] Avg episode reward: [(0, '0.138')] [2024-04-25 23:16:39,063][47288] Updated weights for policy 0, policy_version 16796 (0.0031) [2024-04-25 23:16:41,987][47288] Updated weights for policy 0, policy_version 16806 (0.0027) [2024-04-25 23:16:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 275464192. Throughput: 0: 55868.8. Samples: 224763360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 23:16:43,923][47056] Avg episode reward: [(0, '0.163')] [2024-04-25 23:16:44,908][47288] Updated weights for policy 0, policy_version 16816 (0.0032) [2024-04-25 23:16:47,917][47288] Updated weights for policy 0, policy_version 16826 (0.0032) [2024-04-25 23:16:48,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 275742720. Throughput: 0: 56018.6. Samples: 225105660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 23:16:48,923][47056] Avg episode reward: [(0, '0.158')] [2024-04-25 23:16:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000016830_275742720.pth... [2024-04-25 23:16:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000016014_262373376.pth [2024-04-25 23:16:50,735][47288] Updated weights for policy 0, policy_version 16836 (0.0034) [2024-04-25 23:16:53,837][47288] Updated weights for policy 0, policy_version 16846 (0.0031) [2024-04-25 23:16:53,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 276004864. 
Throughput: 0: 55996.8. Samples: 225438740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 23:16:53,923][47056] Avg episode reward: [(0, '0.125')] [2024-04-25 23:16:56,524][47288] Updated weights for policy 0, policy_version 16856 (0.0031) [2024-04-25 23:16:58,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 276267008. Throughput: 0: 55728.0. Samples: 225601100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 23:16:58,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-25 23:16:59,867][47288] Updated weights for policy 0, policy_version 16866 (0.0030) [2024-04-25 23:17:02,487][47288] Updated weights for policy 0, policy_version 16876 (0.0042) [2024-04-25 23:17:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 276578304. Throughput: 0: 55795.9. Samples: 225933420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 23:17:03,923][47056] Avg episode reward: [(0, '0.159')] [2024-04-25 23:17:05,628][47288] Updated weights for policy 0, policy_version 16886 (0.0035) [2024-04-25 23:17:08,512][47288] Updated weights for policy 0, policy_version 16896 (0.0027) [2024-04-25 23:17:08,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 276856832. Throughput: 0: 55639.9. Samples: 226267520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 23:17:08,923][47056] Avg episode reward: [(0, '0.182')] [2024-04-25 23:17:11,431][47288] Updated weights for policy 0, policy_version 16906 (0.0032) [2024-04-25 23:17:13,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 277102592. Throughput: 0: 55700.5. Samples: 226433820. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 23:17:13,923][47056] Avg episode reward: [(0, '0.151')] [2024-04-25 23:17:14,326][47288] Updated weights for policy 0, policy_version 16916 (0.0027) [2024-04-25 23:17:17,305][47288] Updated weights for policy 0, policy_version 16926 (0.0027) [2024-04-25 23:17:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 277397504. Throughput: 0: 55720.6. Samples: 226768440. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-25 23:17:18,923][47056] Avg episode reward: [(0, '0.167')] [2024-04-25 23:17:20,261][47288] Updated weights for policy 0, policy_version 16936 (0.0027) [2024-04-25 23:17:23,193][47288] Updated weights for policy 0, policy_version 16946 (0.0032) [2024-04-25 23:17:23,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 277692416. Throughput: 0: 55831.6. Samples: 227107480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-25 23:17:23,923][47056] Avg episode reward: [(0, '0.181')] [2024-04-25 23:17:26,127][47288] Updated weights for policy 0, policy_version 16956 (0.0027) [2024-04-25 23:17:28,704][47267] Signal inference workers to stop experience collection... (3200 times) [2024-04-25 23:17:28,733][47288] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-04-25 23:17:28,790][47267] Signal inference workers to resume experience collection... (3200 times) [2024-04-25 23:17:28,790][47288] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-04-25 23:17:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 277938176. Throughput: 0: 55608.6. Samples: 227265740. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-25 23:17:28,923][47056] Avg episode reward: [(0, '0.179')] [2024-04-25 23:17:29,040][47288] Updated weights for policy 0, policy_version 16966 (0.0035) [2024-04-25 23:17:31,964][47288] Updated weights for policy 0, policy_version 16976 (0.0027) [2024-04-25 23:17:33,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 278233088. Throughput: 0: 55447.5. Samples: 227600800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-25 23:17:33,923][47056] Avg episode reward: [(0, '0.157')] [2024-04-25 23:17:34,968][47288] Updated weights for policy 0, policy_version 16986 (0.0039) [2024-04-25 23:17:37,592][47288] Updated weights for policy 0, policy_version 16996 (0.0025) [2024-04-25 23:17:38,923][47056] Fps is (10 sec: 57342.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 278511616. Throughput: 0: 55624.2. Samples: 227941840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-25 23:17:38,923][47056] Avg episode reward: [(0, '0.147')] [2024-04-25 23:17:40,780][47288] Updated weights for policy 0, policy_version 17006 (0.0036) [2024-04-25 23:17:43,447][47288] Updated weights for policy 0, policy_version 17016 (0.0030) [2024-04-25 23:17:43,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 278806528. Throughput: 0: 55740.9. Samples: 228109440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 23:17:43,923][47056] Avg episode reward: [(0, '0.124')] [2024-04-25 23:17:46,657][47288] Updated weights for policy 0, policy_version 17026 (0.0034) [2024-04-25 23:17:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 279068672. Throughput: 0: 55806.5. Samples: 228444720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 23:17:48,923][47056] Avg episode reward: [(0, '0.181')] [2024-04-25 23:17:49,387][47288] Updated weights for policy 0, policy_version 17036 (0.0027) [2024-04-25 23:17:52,406][47288] Updated weights for policy 0, policy_version 17046 (0.0033) [2024-04-25 23:17:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 279347200. Throughput: 0: 55832.6. Samples: 228779980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 23:17:53,923][47056] Avg episode reward: [(0, '0.169')] [2024-04-25 23:17:55,261][47288] Updated weights for policy 0, policy_version 17056 (0.0031) [2024-04-25 23:17:58,217][47288] Updated weights for policy 0, policy_version 17066 (0.0033) [2024-04-25 23:17:58,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 279625728. Throughput: 0: 55860.1. Samples: 228947520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:17:58,923][47056] Avg episode reward: [(0, '0.139')] [2024-04-25 23:18:01,130][47288] Updated weights for policy 0, policy_version 17076 (0.0027) [2024-04-25 23:18:03,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 279920640. Throughput: 0: 55785.8. Samples: 229278800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:18:03,923][47056] Avg episode reward: [(0, '0.158')] [2024-04-25 23:18:04,240][47288] Updated weights for policy 0, policy_version 17086 (0.0033) [2024-04-25 23:18:06,901][47288] Updated weights for policy 0, policy_version 17096 (0.0028) [2024-04-25 23:18:08,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 280182784. Throughput: 0: 55755.9. Samples: 229616500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:18:08,923][47056] Avg episode reward: [(0, '0.159')] [2024-04-25 23:18:10,231][47288] Updated weights for policy 0, policy_version 17106 (0.0028) [2024-04-25 23:18:12,735][47288] Updated weights for policy 0, policy_version 17116 (0.0031) [2024-04-25 23:18:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 55816.9). Total num frames: 280477696. Throughput: 0: 55831.8. Samples: 229778180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:18:13,923][47056] Avg episode reward: [(0, '0.124')] [2024-04-25 23:18:15,997][47288] Updated weights for policy 0, policy_version 17126 (0.0026) [2024-04-25 23:18:18,588][47288] Updated weights for policy 0, policy_version 17136 (0.0028) [2024-04-25 23:18:18,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 280772608. Throughput: 0: 55891.2. Samples: 230115900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-25 23:18:18,923][47056] Avg episode reward: [(0, '0.149')] [2024-04-25 23:18:22,004][47288] Updated weights for policy 0, policy_version 17146 (0.0029) [2024-04-25 23:18:23,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 281001984. Throughput: 0: 55769.0. Samples: 230451440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-25 23:18:23,923][47056] Avg episode reward: [(0, '0.158')] [2024-04-25 23:18:24,302][47267] Signal inference workers to stop experience collection... (3250 times) [2024-04-25 23:18:24,332][47288] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-04-25 23:18:24,358][47267] Signal inference workers to resume experience collection... 
(3250 times) [2024-04-25 23:18:24,359][47288] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-04-25 23:18:24,470][47288] Updated weights for policy 0, policy_version 17156 (0.0024) [2024-04-25 23:18:27,870][47288] Updated weights for policy 0, policy_version 17166 (0.0033) [2024-04-25 23:18:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 281313280. Throughput: 0: 55761.8. Samples: 230618720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-25 23:18:28,923][47056] Avg episode reward: [(0, '0.187')] [2024-04-25 23:18:30,365][47288] Updated weights for policy 0, policy_version 17176 (0.0030) [2024-04-25 23:18:33,874][47288] Updated weights for policy 0, policy_version 17186 (0.0031) [2024-04-25 23:18:33,923][47056] Fps is (10 sec: 57345.2, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 281575424. Throughput: 0: 55777.2. Samples: 230954680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 26.0) [2024-04-25 23:18:33,923][47056] Avg episode reward: [(0, '0.157')] [2024-04-25 23:18:36,171][47288] Updated weights for policy 0, policy_version 17196 (0.0027) [2024-04-25 23:18:38,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 281870336. Throughput: 0: 55700.3. Samples: 231286500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 26.0) [2024-04-25 23:18:38,923][47056] Avg episode reward: [(0, '0.200')] [2024-04-25 23:18:39,585][47288] Updated weights for policy 0, policy_version 17206 (0.0031) [2024-04-25 23:18:42,130][47288] Updated weights for policy 0, policy_version 17216 (0.0040) [2024-04-25 23:18:43,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 282132480. Throughput: 0: 55803.8. Samples: 231458700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:18:43,923][47056] Avg episode reward: [(0, '0.178')] [2024-04-25 23:18:45,281][47288] Updated weights for policy 0, policy_version 17226 (0.0031) [2024-04-25 23:18:48,030][47288] Updated weights for policy 0, policy_version 17236 (0.0031) [2024-04-25 23:18:48,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 282427392. Throughput: 0: 55834.2. Samples: 231791340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:18:48,923][47056] Avg episode reward: [(0, '0.126')] [2024-04-25 23:18:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000017238_282427392.pth... [2024-04-25 23:18:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000016421_269041664.pth [2024-04-25 23:18:51,294][47288] Updated weights for policy 0, policy_version 17246 (0.0027) [2024-04-25 23:18:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 282705920. Throughput: 0: 55673.4. Samples: 232121800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:18:53,923][47056] Avg episode reward: [(0, '0.182')] [2024-04-25 23:18:53,939][47288] Updated weights for policy 0, policy_version 17256 (0.0028) [2024-04-25 23:18:57,294][47288] Updated weights for policy 0, policy_version 17266 (0.0033) [2024-04-25 23:18:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 282984448. Throughput: 0: 55957.9. Samples: 232296280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:18:58,923][47056] Avg episode reward: [(0, '0.164')] [2024-04-25 23:18:59,633][47288] Updated weights for policy 0, policy_version 17276 (0.0022) [2024-04-25 23:19:03,130][47288] Updated weights for policy 0, policy_version 17286 (0.0034) [2024-04-25 23:19:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 283262976. 
Throughput: 0: 55880.4. Samples: 232630520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:19:03,923][47056] Avg episode reward: [(0, '0.140')] [2024-04-25 23:19:05,398][47288] Updated weights for policy 0, policy_version 17296 (0.0026) [2024-04-25 23:19:08,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 283525120. Throughput: 0: 55876.1. Samples: 232965860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-25 23:19:08,923][47056] Avg episode reward: [(0, '0.158')] [2024-04-25 23:19:08,952][47288] Updated weights for policy 0, policy_version 17306 (0.0033) [2024-04-25 23:19:11,630][47288] Updated weights for policy 0, policy_version 17316 (0.0033) [2024-04-25 23:19:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 283803648. Throughput: 0: 55661.3. Samples: 233123480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-25 23:19:13,923][47056] Avg episode reward: [(0, '0.155')] [2024-04-25 23:19:14,879][47288] Updated weights for policy 0, policy_version 17326 (0.0031) [2024-04-25 23:19:17,511][47288] Updated weights for policy 0, policy_version 17336 (0.0027) [2024-04-25 23:19:18,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 284082176. Throughput: 0: 55569.3. Samples: 233455300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-25 23:19:18,923][47056] Avg episode reward: [(0, '0.125')] [2024-04-25 23:19:20,870][47288] Updated weights for policy 0, policy_version 17346 (0.0028) [2024-04-25 23:19:23,372][47288] Updated weights for policy 0, policy_version 17356 (0.0034) [2024-04-25 23:19:23,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 284377088. Throughput: 0: 55538.6. Samples: 233785740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:19:23,923][47056] Avg episode reward: [(0, '0.151')] [2024-04-25 23:19:26,818][47288] Updated weights for policy 0, policy_version 17366 (0.0032) [2024-04-25 23:19:28,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 284655616. Throughput: 0: 55586.2. Samples: 233960080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:19:28,923][47056] Avg episode reward: [(0, '0.149')] [2024-04-25 23:19:29,545][47288] Updated weights for policy 0, policy_version 17376 (0.0027) [2024-04-25 23:19:32,613][47288] Updated weights for policy 0, policy_version 17386 (0.0029) [2024-04-25 23:19:33,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 284901376. Throughput: 0: 55482.2. Samples: 234288040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:19:33,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-25 23:19:34,044][47267] Saving new best policy, reward=0.217! [2024-04-25 23:19:35,487][47288] Updated weights for policy 0, policy_version 17396 (0.0030) [2024-04-25 23:19:38,501][47288] Updated weights for policy 0, policy_version 17406 (0.0025) [2024-04-25 23:19:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 285196288. Throughput: 0: 55541.7. Samples: 234621180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:19:38,923][47056] Avg episode reward: [(0, '0.149')] [2024-04-25 23:19:41,204][47288] Updated weights for policy 0, policy_version 17416 (0.0034) [2024-04-25 23:19:43,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 285442048. Throughput: 0: 55278.4. Samples: 234783800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:19:43,923][47056] Avg episode reward: [(0, '0.163')] [2024-04-25 23:19:44,455][47288] Updated weights for policy 0, policy_version 17426 (0.0028) [2024-04-25 23:19:46,982][47288] Updated weights for policy 0, policy_version 17436 (0.0028) [2024-04-25 23:19:48,923][47056] Fps is (10 sec: 55701.4, 60 sec: 55431.8, 300 sec: 55816.5). Total num frames: 285753344. Throughput: 0: 55225.3. Samples: 235115700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 23:19:48,924][47056] Avg episode reward: [(0, '0.178')] [2024-04-25 23:19:50,283][47288] Updated weights for policy 0, policy_version 17446 (0.0029) [2024-04-25 23:19:51,538][47267] Signal inference workers to stop experience collection... (3300 times) [2024-04-25 23:19:51,538][47267] Signal inference workers to resume experience collection... (3300 times) [2024-04-25 23:19:51,548][47288] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-04-25 23:19:51,572][47288] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-04-25 23:19:52,981][47288] Updated weights for policy 0, policy_version 17456 (0.0027) [2024-04-25 23:19:53,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 286031872. Throughput: 0: 55166.8. Samples: 235448360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-25 23:19:53,923][47056] Avg episode reward: [(0, '0.161')] [2024-04-25 23:19:56,162][47288] Updated weights for policy 0, policy_version 17466 (0.0027) [2024-04-25 23:19:58,923][47056] Fps is (10 sec: 55709.2, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 286310400. Throughput: 0: 55411.8. Samples: 235617020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-25 23:19:58,924][47056] Avg episode reward: [(0, '0.173')] [2024-04-25 23:19:59,240][47288] Updated weights for policy 0, policy_version 17476 (0.0039) [2024-04-25 23:20:02,082][47288] Updated weights for policy 0, policy_version 17486 (0.0033) [2024-04-25 23:20:03,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 286572544. Throughput: 0: 55246.2. Samples: 235941380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-25 23:20:03,923][47056] Avg episode reward: [(0, '0.151')] [2024-04-25 23:20:05,169][47288] Updated weights for policy 0, policy_version 17496 (0.0027) [2024-04-25 23:20:08,043][47288] Updated weights for policy 0, policy_version 17506 (0.0025) [2024-04-25 23:20:08,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 286834688. Throughput: 0: 55319.1. Samples: 236275100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-25 23:20:08,923][47056] Avg episode reward: [(0, '0.175')] [2024-04-25 23:20:10,891][47288] Updated weights for policy 0, policy_version 17516 (0.0026) [2024-04-25 23:20:13,899][47288] Updated weights for policy 0, policy_version 17526 (0.0027) [2024-04-25 23:20:13,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 287145984. Throughput: 0: 55175.1. Samples: 236442960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-25 23:20:13,923][47056] Avg episode reward: [(0, '0.177')] [2024-04-25 23:20:16,871][47288] Updated weights for policy 0, policy_version 17536 (0.0032) [2024-04-25 23:20:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 287408128. Throughput: 0: 55272.9. Samples: 236775320. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-25 23:20:18,923][47056] Avg episode reward: [(0, '0.178')] [2024-04-25 23:20:19,841][47288] Updated weights for policy 0, policy_version 17546 (0.0032) [2024-04-25 23:20:22,874][47288] Updated weights for policy 0, policy_version 17556 (0.0039) [2024-04-25 23:20:23,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 287686656. Throughput: 0: 55211.5. Samples: 237105700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:20:23,923][47056] Avg episode reward: [(0, '0.208')] [2024-04-25 23:20:25,662][47288] Updated weights for policy 0, policy_version 17566 (0.0029) [2024-04-25 23:20:28,702][47288] Updated weights for policy 0, policy_version 17576 (0.0026) [2024-04-25 23:20:28,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 287965184. Throughput: 0: 55442.5. Samples: 237278720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:20:28,923][47056] Avg episode reward: [(0, '0.182')] [2024-04-25 23:20:31,625][47288] Updated weights for policy 0, policy_version 17586 (0.0029) [2024-04-25 23:20:33,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 288243712. Throughput: 0: 55398.8. Samples: 237608600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:20:33,923][47056] Avg episode reward: [(0, '0.172')] [2024-04-25 23:20:34,625][47288] Updated weights for policy 0, policy_version 17596 (0.0028) [2024-04-25 23:20:37,456][47288] Updated weights for policy 0, policy_version 17606 (0.0034) [2024-04-25 23:20:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 288505856. Throughput: 0: 55396.7. Samples: 237941220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-25 23:20:38,923][47056] Avg episode reward: [(0, '0.139')] [2024-04-25 23:20:40,475][47288] Updated weights for policy 0, policy_version 17616 (0.0026) [2024-04-25 23:20:43,529][47288] Updated weights for policy 0, policy_version 17626 (0.0030) [2024-04-25 23:20:43,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 288784384. Throughput: 0: 55193.5. Samples: 238100720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-25 23:20:43,923][47056] Avg episode reward: [(0, '0.180')] [2024-04-25 23:20:46,465][47288] Updated weights for policy 0, policy_version 17636 (0.0036) [2024-04-25 23:20:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55433.2, 300 sec: 55705.6). Total num frames: 289079296. Throughput: 0: 55373.7. Samples: 238433200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 23:20:48,923][47056] Avg episode reward: [(0, '0.163')] [2024-04-25 23:20:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000017644_289079296.pth... [2024-04-25 23:20:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000016830_275742720.pth [2024-04-25 23:20:49,302][47288] Updated weights for policy 0, policy_version 17646 (0.0029) [2024-04-25 23:20:52,313][47288] Updated weights for policy 0, policy_version 17656 (0.0027) [2024-04-25 23:20:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 289341440. Throughput: 0: 55302.6. Samples: 238763720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 23:20:53,923][47056] Avg episode reward: [(0, '0.203')] [2024-04-25 23:20:55,255][47288] Updated weights for policy 0, policy_version 17666 (0.0029) [2024-04-25 23:20:57,975][47267] Signal inference workers to stop experience collection... 
(3350 times) [2024-04-25 23:20:58,016][47288] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-04-25 23:20:58,024][47267] Signal inference workers to resume experience collection... (3350 times) [2024-04-25 23:20:58,029][47288] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-04-25 23:20:58,137][47288] Updated weights for policy 0, policy_version 17676 (0.0030) [2024-04-25 23:20:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 289619968. Throughput: 0: 55352.1. Samples: 238933800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 23:20:58,924][47056] Avg episode reward: [(0, '0.173')] [2024-04-25 23:21:01,325][47288] Updated weights for policy 0, policy_version 17686 (0.0032) [2024-04-25 23:21:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 289914880. Throughput: 0: 55444.8. Samples: 239270340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-25 23:21:03,923][47056] Avg episode reward: [(0, '0.183')] [2024-04-25 23:21:04,343][47288] Updated weights for policy 0, policy_version 17696 (0.0037) [2024-04-25 23:21:07,536][47288] Updated weights for policy 0, policy_version 17706 (0.0027) [2024-04-25 23:21:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 290177024. Throughput: 0: 55443.6. Samples: 239600660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-25 23:21:08,923][47056] Avg episode reward: [(0, '0.144')] [2024-04-25 23:21:10,050][47288] Updated weights for policy 0, policy_version 17716 (0.0027) [2024-04-25 23:21:13,266][47288] Updated weights for policy 0, policy_version 17726 (0.0028) [2024-04-25 23:21:13,923][47056] Fps is (10 sec: 52428.7, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 290439168. Throughput: 0: 55398.5. Samples: 239771660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-25 23:21:13,924][47056] Avg episode reward: [(0, '0.161')] [2024-04-25 23:21:15,740][47288] Updated weights for policy 0, policy_version 17736 (0.0035) [2024-04-25 23:21:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 290734080. Throughput: 0: 55534.5. Samples: 240107660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-25 23:21:18,923][47056] Avg episode reward: [(0, '0.214')] [2024-04-25 23:21:18,944][47288] Updated weights for policy 0, policy_version 17746 (0.0031) [2024-04-25 23:21:21,706][47288] Updated weights for policy 0, policy_version 17756 (0.0028) [2024-04-25 23:21:23,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 291012608. Throughput: 0: 55595.7. Samples: 240443020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-25 23:21:23,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:21:25,250][47288] Updated weights for policy 0, policy_version 17766 (0.0028) [2024-04-25 23:21:27,864][47288] Updated weights for policy 0, policy_version 17776 (0.0025) [2024-04-25 23:21:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 291291136. Throughput: 0: 55571.5. Samples: 240601440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:21:28,923][47056] Avg episode reward: [(0, '0.182')] [2024-04-25 23:21:31,046][47288] Updated weights for policy 0, policy_version 17786 (0.0037) [2024-04-25 23:21:33,703][47288] Updated weights for policy 0, policy_version 17796 (0.0031) [2024-04-25 23:21:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 291586048. Throughput: 0: 55614.8. Samples: 240935860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:21:33,923][47056] Avg episode reward: [(0, '0.170')] [2024-04-25 23:21:36,846][47288] Updated weights for policy 0, policy_version 17806 (0.0027) [2024-04-25 23:21:38,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 291848192. Throughput: 0: 55676.0. Samples: 241269140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-25 23:21:38,923][47056] Avg episode reward: [(0, '0.190')] [2024-04-25 23:21:39,452][47288] Updated weights for policy 0, policy_version 17816 (0.0032) [2024-04-25 23:21:42,828][47288] Updated weights for policy 0, policy_version 17826 (0.0031) [2024-04-25 23:21:43,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 292126720. Throughput: 0: 55713.3. Samples: 241440900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-25 23:21:43,923][47056] Avg episode reward: [(0, '0.196')] [2024-04-25 23:21:45,317][47288] Updated weights for policy 0, policy_version 17836 (0.0032) [2024-04-25 23:21:48,644][47288] Updated weights for policy 0, policy_version 17846 (0.0031) [2024-04-25 23:21:48,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 292388864. Throughput: 0: 55674.8. Samples: 241775700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-25 23:21:48,923][47056] Avg episode reward: [(0, '0.152')] [2024-04-25 23:21:51,292][47288] Updated weights for policy 0, policy_version 17856 (0.0029) [2024-04-25 23:21:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 292683776. Throughput: 0: 55833.0. Samples: 242113140. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-25 23:21:53,923][47056] Avg episode reward: [(0, '0.137')] [2024-04-25 23:21:54,479][47288] Updated weights for policy 0, policy_version 17866 (0.0031) [2024-04-25 23:21:57,055][47288] Updated weights for policy 0, policy_version 17876 (0.0032) [2024-04-25 23:21:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 292962304. Throughput: 0: 55670.3. Samples: 242276820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-25 23:21:58,924][47056] Avg episode reward: [(0, '0.142')] [2024-04-25 23:22:00,440][47288] Updated weights for policy 0, policy_version 17886 (0.0033) [2024-04-25 23:22:02,894][47288] Updated weights for policy 0, policy_version 17896 (0.0026) [2024-04-25 23:22:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 293240832. Throughput: 0: 55637.5. Samples: 242611340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:22:03,923][47056] Avg episode reward: [(0, '0.155')] [2024-04-25 23:22:06,400][47288] Updated weights for policy 0, policy_version 17906 (0.0029) [2024-04-25 23:22:08,741][47288] Updated weights for policy 0, policy_version 17916 (0.0029) [2024-04-25 23:22:08,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 293535744. Throughput: 0: 55614.7. Samples: 242945680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:22:08,923][47056] Avg episode reward: [(0, '0.126')] [2024-04-25 23:22:12,310][47288] Updated weights for policy 0, policy_version 17926 (0.0030) [2024-04-25 23:22:13,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.9, 300 sec: 55650.1). Total num frames: 293814272. Throughput: 0: 55893.8. Samples: 243116660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:22:13,924][47056] Avg episode reward: [(0, '0.165')] [2024-04-25 23:22:14,678][47288] Updated weights for policy 0, policy_version 17936 (0.0030) [2024-04-25 23:22:18,045][47288] Updated weights for policy 0, policy_version 17946 (0.0027) [2024-04-25 23:22:18,616][47267] Signal inference workers to stop experience collection... (3400 times) [2024-04-25 23:22:18,616][47267] Signal inference workers to resume experience collection... (3400 times) [2024-04-25 23:22:18,639][47288] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-04-25 23:22:18,639][47288] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-04-25 23:22:18,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 294076416. Throughput: 0: 55853.7. Samples: 243449280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-25 23:22:18,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-25 23:22:20,555][47288] Updated weights for policy 0, policy_version 17956 (0.0030) [2024-04-25 23:22:23,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 294338560. Throughput: 0: 55821.0. Samples: 243781080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-25 23:22:23,923][47056] Avg episode reward: [(0, '0.193')] [2024-04-25 23:22:23,970][47288] Updated weights for policy 0, policy_version 17966 (0.0029) [2024-04-25 23:22:26,477][47288] Updated weights for policy 0, policy_version 17976 (0.0034) [2024-04-25 23:22:28,923][47056] Fps is (10 sec: 52427.5, 60 sec: 55159.2, 300 sec: 55483.4). Total num frames: 294600704. Throughput: 0: 55600.5. Samples: 243942940. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:22:28,924][47056] Avg episode reward: [(0, '0.131')] [2024-04-25 23:22:29,782][47288] Updated weights for policy 0, policy_version 17986 (0.0031) [2024-04-25 23:22:32,334][47288] Updated weights for policy 0, policy_version 17996 (0.0035) [2024-04-25 23:22:33,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55594.6). Total num frames: 294912000. Throughput: 0: 55511.1. Samples: 244273700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:22:33,923][47056] Avg episode reward: [(0, '0.147')] [2024-04-25 23:22:35,625][47288] Updated weights for policy 0, policy_version 18006 (0.0033) [2024-04-25 23:22:38,296][47288] Updated weights for policy 0, policy_version 18016 (0.0031) [2024-04-25 23:22:38,923][47056] Fps is (10 sec: 58984.5, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 295190528. Throughput: 0: 55482.3. Samples: 244609840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:22:38,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-25 23:22:41,465][47288] Updated weights for policy 0, policy_version 18026 (0.0026) [2024-04-25 23:22:43,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 295469056. Throughput: 0: 55628.0. Samples: 244780080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-25 23:22:43,924][47056] Avg episode reward: [(0, '0.184')] [2024-04-25 23:22:44,181][47288] Updated weights for policy 0, policy_version 18036 (0.0030) [2024-04-25 23:22:47,258][47288] Updated weights for policy 0, policy_version 18046 (0.0034) [2024-04-25 23:22:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 295763968. Throughput: 0: 55583.9. Samples: 245112620. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-25 23:22:48,923][47056] Avg episode reward: [(0, '0.140')] [2024-04-25 23:22:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000018052_295763968.pth... [2024-04-25 23:22:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000017238_282427392.pth [2024-04-25 23:22:50,103][47288] Updated weights for policy 0, policy_version 18056 (0.0028) [2024-04-25 23:22:53,254][47288] Updated weights for policy 0, policy_version 18066 (0.0032) [2024-04-25 23:22:53,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 296026112. Throughput: 0: 55588.8. Samples: 245447180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-04-25 23:22:53,923][47056] Avg episode reward: [(0, '0.173')] [2024-04-25 23:22:56,032][47288] Updated weights for policy 0, policy_version 18076 (0.0028) [2024-04-25 23:22:58,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 296288256. Throughput: 0: 55379.9. Samples: 245608760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-04-25 23:22:58,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-25 23:22:58,931][47267] Saving new best policy, reward=0.225! [2024-04-25 23:22:59,311][47288] Updated weights for policy 0, policy_version 18086 (0.0035) [2024-04-25 23:23:01,906][47288] Updated weights for policy 0, policy_version 18096 (0.0034) [2024-04-25 23:23:03,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 296550400. Throughput: 0: 55366.2. Samples: 245940760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-04-25 23:23:03,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:23:05,141][47288] Updated weights for policy 0, policy_version 18106 (0.0030) [2024-04-25 23:23:07,804][47288] Updated weights for policy 0, policy_version 18116 (0.0034) [2024-04-25 23:23:08,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 296861696. Throughput: 0: 55437.0. Samples: 246275740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:23:08,923][47056] Avg episode reward: [(0, '0.182')] [2024-04-25 23:23:11,078][47288] Updated weights for policy 0, policy_version 18126 (0.0030) [2024-04-25 23:23:13,765][47288] Updated weights for policy 0, policy_version 18136 (0.0029) [2024-04-25 23:23:13,923][47056] Fps is (10 sec: 58983.5, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 297140224. Throughput: 0: 55618.7. Samples: 246445760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:23:13,923][47056] Avg episode reward: [(0, '0.141')] [2024-04-25 23:23:17,039][47288] Updated weights for policy 0, policy_version 18146 (0.0030) [2024-04-25 23:23:18,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55594.6). Total num frames: 297402368. Throughput: 0: 55616.9. Samples: 246776460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-25 23:23:18,923][47056] Avg episode reward: [(0, '0.196')] [2024-04-25 23:23:19,812][47288] Updated weights for policy 0, policy_version 18156 (0.0027) [2024-04-25 23:23:20,145][47267] Signal inference workers to stop experience collection... (3450 times) [2024-04-25 23:23:20,179][47288] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-04-25 23:23:20,194][47267] Signal inference workers to resume experience collection... 
(3450 times) [2024-04-25 23:23:20,197][47288] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-04-25 23:23:22,992][47288] Updated weights for policy 0, policy_version 18166 (0.0030) [2024-04-25 23:23:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55483.4). Total num frames: 297680896. Throughput: 0: 55523.6. Samples: 247108400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-25 23:23:23,923][47056] Avg episode reward: [(0, '0.144')] [2024-04-25 23:23:25,647][47288] Updated weights for policy 0, policy_version 18176 (0.0030) [2024-04-25 23:23:28,839][47288] Updated weights for policy 0, policy_version 18186 (0.0030) [2024-04-25 23:23:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.9, 300 sec: 55539.0). Total num frames: 297959424. Throughput: 0: 55297.4. Samples: 247268460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-25 23:23:28,923][47056] Avg episode reward: [(0, '0.180')] [2024-04-25 23:23:31,442][47288] Updated weights for policy 0, policy_version 18196 (0.0029) [2024-04-25 23:23:33,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 298221568. Throughput: 0: 55405.0. Samples: 247605840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:23:33,923][47056] Avg episode reward: [(0, '0.178')] [2024-04-25 23:23:34,838][47288] Updated weights for policy 0, policy_version 18206 (0.0029) [2024-04-25 23:23:37,424][47288] Updated weights for policy 0, policy_version 18216 (0.0027) [2024-04-25 23:23:38,923][47056] Fps is (10 sec: 52429.0, 60 sec: 54886.3, 300 sec: 55427.9). Total num frames: 298483712. Throughput: 0: 55388.4. Samples: 247939660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:23:38,923][47056] Avg episode reward: [(0, '0.166')] [2024-04-25 23:23:40,682][47288] Updated weights for policy 0, policy_version 18226 (0.0033) [2024-04-25 23:23:43,199][47288] Updated weights for policy 0, policy_version 18236 (0.0027) [2024-04-25 23:23:43,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 298811392. Throughput: 0: 55442.8. Samples: 248103680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:23:43,923][47056] Avg episode reward: [(0, '0.184')] [2024-04-25 23:23:46,674][47288] Updated weights for policy 0, policy_version 18246 (0.0030) [2024-04-25 23:23:48,923][47056] Fps is (10 sec: 60621.2, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 299089920. Throughput: 0: 55410.9. Samples: 248434240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:23:48,923][47056] Avg episode reward: [(0, '0.158')] [2024-04-25 23:23:48,942][47288] Updated weights for policy 0, policy_version 18256 (0.0030) [2024-04-25 23:23:52,489][47288] Updated weights for policy 0, policy_version 18266 (0.0025) [2024-04-25 23:23:53,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 299352064. Throughput: 0: 55404.3. Samples: 248768940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:23:53,923][47056] Avg episode reward: [(0, '0.176')] [2024-04-25 23:23:54,876][47288] Updated weights for policy 0, policy_version 18276 (0.0030) [2024-04-25 23:23:58,223][47288] Updated weights for policy 0, policy_version 18286 (0.0032) [2024-04-25 23:23:58,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 299614208. Throughput: 0: 55571.4. Samples: 248946480. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 23:23:58,923][47056] Avg episode reward: [(0, '0.213')] [2024-04-25 23:24:00,952][47288] Updated weights for policy 0, policy_version 18296 (0.0026) [2024-04-25 23:24:03,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 299892736. Throughput: 0: 55621.8. Samples: 249279440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 23:24:03,923][47056] Avg episode reward: [(0, '0.169')] [2024-04-25 23:24:04,150][47288] Updated weights for policy 0, policy_version 18306 (0.0032) [2024-04-25 23:24:06,834][47288] Updated weights for policy 0, policy_version 18316 (0.0030) [2024-04-25 23:24:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 300171264. Throughput: 0: 55585.6. Samples: 249609760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-25 23:24:08,923][47056] Avg episode reward: [(0, '0.134')] [2024-04-25 23:24:10,201][47288] Updated weights for policy 0, policy_version 18326 (0.0032) [2024-04-25 23:24:12,889][47288] Updated weights for policy 0, policy_version 18336 (0.0027) [2024-04-25 23:24:13,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 300449792. Throughput: 0: 55653.8. Samples: 249772880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 23:24:13,924][47056] Avg episode reward: [(0, '0.228')] [2024-04-25 23:24:13,924][47267] Saving new best policy, reward=0.228! [2024-04-25 23:24:16,046][47288] Updated weights for policy 0, policy_version 18346 (0.0033) [2024-04-25 23:24:18,858][47288] Updated weights for policy 0, policy_version 18356 (0.0027) [2024-04-25 23:24:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 300744704. Throughput: 0: 55461.6. Samples: 250101620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 23:24:18,923][47056] Avg episode reward: [(0, '0.189')] [2024-04-25 23:24:21,743][47288] Updated weights for policy 0, policy_version 18366 (0.0027) [2024-04-25 23:24:23,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 301039616. Throughput: 0: 55553.4. Samples: 250439560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-25 23:24:23,923][47056] Avg episode reward: [(0, '0.175')] [2024-04-25 23:24:24,562][47288] Updated weights for policy 0, policy_version 18376 (0.0031) [2024-04-25 23:24:27,703][47288] Updated weights for policy 0, policy_version 18386 (0.0029) [2024-04-25 23:24:28,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 301301760. Throughput: 0: 55826.7. Samples: 250615880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-25 23:24:28,923][47056] Avg episode reward: [(0, '0.175')] [2024-04-25 23:24:30,372][47288] Updated weights for policy 0, policy_version 18396 (0.0033) [2024-04-25 23:24:33,719][47288] Updated weights for policy 0, policy_version 18406 (0.0031) [2024-04-25 23:24:33,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 301563904. Throughput: 0: 55786.2. Samples: 250944620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-25 23:24:33,923][47056] Avg episode reward: [(0, '0.197')] [2024-04-25 23:24:36,325][47288] Updated weights for policy 0, policy_version 18416 (0.0032) [2024-04-25 23:24:38,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 301842432. Throughput: 0: 55804.8. Samples: 251280160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-25 23:24:38,923][47056] Avg episode reward: [(0, '0.171')] [2024-04-25 23:24:39,620][47288] Updated weights for policy 0, policy_version 18426 (0.0030) [2024-04-25 23:24:42,322][47288] Updated weights for policy 0, policy_version 18436 (0.0029) [2024-04-25 23:24:43,923][47056] Fps is (10 sec: 54066.8, 60 sec: 54886.4, 300 sec: 55428.1). Total num frames: 302104576. Throughput: 0: 55289.4. Samples: 251434500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-25 23:24:43,923][47056] Avg episode reward: [(0, '0.159')] [2024-04-25 23:24:45,358][47288] Updated weights for policy 0, policy_version 18446 (0.0029) [2024-04-25 23:24:48,320][47288] Updated weights for policy 0, policy_version 18456 (0.0029) [2024-04-25 23:24:48,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 302399488. Throughput: 0: 55284.3. Samples: 251767240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-25 23:24:48,923][47056] Avg episode reward: [(0, '0.201')] [2024-04-25 23:24:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000018457_302399488.pth... [2024-04-25 23:24:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000017644_289079296.pth [2024-04-25 23:24:51,092][47267] Signal inference workers to stop experience collection... (3500 times) [2024-04-25 23:24:51,096][47267] Signal inference workers to resume experience collection... (3500 times) [2024-04-25 23:24:51,122][47288] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-04-25 23:24:51,123][47288] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-04-25 23:24:51,207][47288] Updated weights for policy 0, policy_version 18466 (0.0029) [2024-04-25 23:24:53,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 302678016. Throughput: 0: 55359.7. Samples: 252100940. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 23:24:53,923][47056] Avg episode reward: [(0, '0.168')] [2024-04-25 23:24:54,110][47288] Updated weights for policy 0, policy_version 18476 (0.0028) [2024-04-25 23:24:57,138][47288] Updated weights for policy 0, policy_version 18486 (0.0028) [2024-04-25 23:24:58,923][47056] Fps is (10 sec: 58983.7, 60 sec: 56251.9, 300 sec: 55650.1). Total num frames: 302989312. Throughput: 0: 55563.3. Samples: 252273220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 23:24:58,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-25 23:25:00,240][47288] Updated weights for policy 0, policy_version 18496 (0.0027) [2024-04-25 23:25:03,114][47288] Updated weights for policy 0, policy_version 18506 (0.0026) [2024-04-25 23:25:03,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 303235072. Throughput: 0: 55642.8. Samples: 252605540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-04-25 23:25:03,923][47056] Avg episode reward: [(0, '0.201')] [2024-04-25 23:25:06,068][47288] Updated weights for policy 0, policy_version 18516 (0.0040) [2024-04-25 23:25:08,923][47056] Fps is (10 sec: 52427.7, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 303513600. Throughput: 0: 55512.3. Samples: 252937620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-04-25 23:25:08,923][47056] Avg episode reward: [(0, '0.176')] [2024-04-25 23:25:09,109][47288] Updated weights for policy 0, policy_version 18526 (0.0030) [2024-04-25 23:25:11,844][47288] Updated weights for policy 0, policy_version 18536 (0.0028) [2024-04-25 23:25:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 303775744. Throughput: 0: 55270.2. Samples: 253103040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 26.0) [2024-04-25 23:25:13,923][47056] Avg episode reward: [(0, '0.142')] [2024-04-25 23:25:15,057][47288] Updated weights for policy 0, policy_version 18546 (0.0028) [2024-04-25 23:25:17,749][47288] Updated weights for policy 0, policy_version 18556 (0.0025) [2024-04-25 23:25:18,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 304054272. Throughput: 0: 55286.2. Samples: 253432500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-25 23:25:18,923][47056] Avg episode reward: [(0, '0.180')] [2024-04-25 23:25:20,950][47288] Updated weights for policy 0, policy_version 18566 (0.0025) [2024-04-25 23:25:23,523][47288] Updated weights for policy 0, policy_version 18576 (0.0030) [2024-04-25 23:25:23,923][47056] Fps is (10 sec: 57342.4, 60 sec: 55159.3, 300 sec: 55539.0). Total num frames: 304349184. Throughput: 0: 55192.8. Samples: 253763840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-25 23:25:23,923][47056] Avg episode reward: [(0, '0.168')] [2024-04-25 23:25:26,949][47288] Updated weights for policy 0, policy_version 18586 (0.0035) [2024-04-25 23:25:28,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.3, 300 sec: 55539.0). Total num frames: 304627712. Throughput: 0: 55619.9. Samples: 253937400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-25 23:25:28,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-25 23:25:29,490][47288] Updated weights for policy 0, policy_version 18596 (0.0031) [2024-04-25 23:25:32,641][47288] Updated weights for policy 0, policy_version 18606 (0.0028) [2024-04-25 23:25:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 304922624. Throughput: 0: 55563.9. Samples: 254267620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:25:33,923][47056] Avg episode reward: [(0, '0.148')] [2024-04-25 23:25:35,521][47288] Updated weights for policy 0, policy_version 18616 (0.0032) [2024-04-25 23:25:38,376][47288] Updated weights for policy 0, policy_version 18626 (0.0029) [2024-04-25 23:25:38,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 305168384. Throughput: 0: 55485.8. Samples: 254597800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:25:38,923][47056] Avg episode reward: [(0, '0.220')] [2024-04-25 23:25:41,879][47288] Updated weights for policy 0, policy_version 18636 (0.0033) [2024-04-25 23:25:43,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 305463296. Throughput: 0: 55384.7. Samples: 254765540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 23:25:43,923][47056] Avg episode reward: [(0, '0.175')] [2024-04-25 23:25:44,385][47288] Updated weights for policy 0, policy_version 18646 (0.0033) [2024-04-25 23:25:47,820][47288] Updated weights for policy 0, policy_version 18656 (0.0029) [2024-04-25 23:25:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 305725440. Throughput: 0: 55492.9. Samples: 255102720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 23:25:48,923][47056] Avg episode reward: [(0, '0.182')] [2024-04-25 23:25:49,843][47267] Signal inference workers to stop experience collection... (3550 times) [2024-04-25 23:25:49,879][47288] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-04-25 23:25:49,934][47267] Signal inference workers to resume experience collection... 
(3550 times) [2024-04-25 23:25:49,935][47288] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-04-25 23:25:50,337][47288] Updated weights for policy 0, policy_version 18666 (0.0029) [2024-04-25 23:25:53,794][47288] Updated weights for policy 0, policy_version 18676 (0.0028) [2024-04-25 23:25:53,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 306003968. Throughput: 0: 55577.9. Samples: 255438620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-25 23:25:53,923][47056] Avg episode reward: [(0, '0.179')] [2024-04-25 23:25:56,399][47288] Updated weights for policy 0, policy_version 18686 (0.0035) [2024-04-25 23:25:58,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55159.2, 300 sec: 55539.0). Total num frames: 306298880. Throughput: 0: 55513.0. Samples: 255601140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 23:25:58,923][47056] Avg episode reward: [(0, '0.135')] [2024-04-25 23:25:59,504][47288] Updated weights for policy 0, policy_version 18696 (0.0030) [2024-04-25 23:26:02,493][47288] Updated weights for policy 0, policy_version 18706 (0.0035) [2024-04-25 23:26:03,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 306561024. Throughput: 0: 55674.3. Samples: 255937840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 23:26:03,923][47056] Avg episode reward: [(0, '0.189')] [2024-04-25 23:26:05,292][47288] Updated weights for policy 0, policy_version 18716 (0.0036) [2024-04-25 23:26:08,362][47288] Updated weights for policy 0, policy_version 18726 (0.0029) [2024-04-25 23:26:08,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 306855936. Throughput: 0: 55556.7. Samples: 256263880. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-25 23:26:08,923][47056] Avg episode reward: [(0, '0.184')] [2024-04-25 23:26:11,327][47288] Updated weights for policy 0, policy_version 18736 (0.0035) [2024-04-25 23:26:13,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55432.4, 300 sec: 55483.5). Total num frames: 307101696. Throughput: 0: 55352.5. Samples: 256428260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-25 23:26:13,923][47056] Avg episode reward: [(0, '0.233')] [2024-04-25 23:26:13,924][47267] Saving new best policy, reward=0.233! [2024-04-25 23:26:14,245][47288] Updated weights for policy 0, policy_version 18746 (0.0033) [2024-04-25 23:26:17,154][47288] Updated weights for policy 0, policy_version 18756 (0.0029) [2024-04-25 23:26:18,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 307380224. Throughput: 0: 55395.4. Samples: 256760400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-25 23:26:18,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-25 23:26:20,113][47288] Updated weights for policy 0, policy_version 18766 (0.0028) [2024-04-25 23:26:23,405][47288] Updated weights for policy 0, policy_version 18776 (0.0035) [2024-04-25 23:26:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55159.6, 300 sec: 55483.4). Total num frames: 307658752. Throughput: 0: 55426.2. Samples: 257091980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 23:26:23,923][47056] Avg episode reward: [(0, '0.171')] [2024-04-25 23:26:26,133][47288] Updated weights for policy 0, policy_version 18786 (0.0028) [2024-04-25 23:26:28,923][47056] Fps is (10 sec: 54066.3, 60 sec: 54886.4, 300 sec: 55372.3). Total num frames: 307920896. Throughput: 0: 55153.7. Samples: 257247460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 23:26:28,923][47056] Avg episode reward: [(0, '0.153')] [2024-04-25 23:26:29,262][47288] Updated weights for policy 0, policy_version 18796 (0.0032) [2024-04-25 23:26:32,058][47288] Updated weights for policy 0, policy_version 18806 (0.0029) [2024-04-25 23:26:33,923][47056] Fps is (10 sec: 54067.2, 60 sec: 54613.4, 300 sec: 55427.9). Total num frames: 308199424. Throughput: 0: 54956.4. Samples: 257575760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 23:26:33,923][47056] Avg episode reward: [(0, '0.200')] [2024-04-25 23:26:35,204][47288] Updated weights for policy 0, policy_version 18816 (0.0032) [2024-04-25 23:26:37,904][47288] Updated weights for policy 0, policy_version 18826 (0.0032) [2024-04-25 23:26:38,923][47056] Fps is (10 sec: 60621.0, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 308527104. Throughput: 0: 54843.5. Samples: 257906580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:26:38,923][47056] Avg episode reward: [(0, '0.140')] [2024-04-25 23:26:41,045][47288] Updated weights for policy 0, policy_version 18836 (0.0029) [2024-04-25 23:26:43,892][47288] Updated weights for policy 0, policy_version 18846 (0.0031) [2024-04-25 23:26:43,917][47267] Signal inference workers to stop experience collection... (3600 times) [2024-04-25 23:26:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 308772864. Throughput: 0: 55118.8. Samples: 258081480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:26:43,923][47056] Avg episode reward: [(0, '0.219')] [2024-04-25 23:26:43,934][47288] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-04-25 23:26:44,005][47267] Signal inference workers to resume experience collection... 
(3600 times) [2024-04-25 23:26:44,005][47288] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-04-25 23:26:46,821][47288] Updated weights for policy 0, policy_version 18856 (0.0028) [2024-04-25 23:26:48,923][47056] Fps is (10 sec: 50790.4, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 309035008. Throughput: 0: 55067.0. Samples: 258415860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-25 23:26:48,923][47056] Avg episode reward: [(0, '0.185')] [2024-04-25 23:26:48,930][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000018862_309035008.pth... [2024-04-25 23:26:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000018052_295763968.pth [2024-04-25 23:26:49,752][47288] Updated weights for policy 0, policy_version 18866 (0.0027) [2024-04-25 23:26:52,816][47288] Updated weights for policy 0, policy_version 18876 (0.0030) [2024-04-25 23:26:53,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 309313536. Throughput: 0: 55079.6. Samples: 258742460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-25 23:26:53,923][47056] Avg episode reward: [(0, '0.165')] [2024-04-25 23:26:55,789][47288] Updated weights for policy 0, policy_version 18886 (0.0025) [2024-04-25 23:26:58,697][47288] Updated weights for policy 0, policy_version 18896 (0.0029) [2024-04-25 23:26:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 309592064. Throughput: 0: 54984.0. Samples: 258902540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-25 23:26:58,923][47056] Avg episode reward: [(0, '0.231')] [2024-04-25 23:27:01,556][47288] Updated weights for policy 0, policy_version 18906 (0.0032) [2024-04-25 23:27:03,923][47056] Fps is (10 sec: 54066.9, 60 sec: 54886.3, 300 sec: 55316.8). Total num frames: 309854208. Throughput: 0: 55017.2. Samples: 259236180. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-25 23:27:03,923][47056] Avg episode reward: [(0, '0.212')] [2024-04-25 23:27:04,634][47288] Updated weights for policy 0, policy_version 18916 (0.0036) [2024-04-25 23:27:07,460][47288] Updated weights for policy 0, policy_version 18926 (0.0026) [2024-04-25 23:27:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 310149120. Throughput: 0: 55059.6. Samples: 259569660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-25 23:27:08,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-25 23:27:10,875][47288] Updated weights for policy 0, policy_version 18936 (0.0026) [2024-04-25 23:27:13,263][47288] Updated weights for policy 0, policy_version 18946 (0.0027) [2024-04-25 23:27:13,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 310444032. Throughput: 0: 55383.6. Samples: 259739720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-25 23:27:13,923][47056] Avg episode reward: [(0, '0.220')] [2024-04-25 23:27:16,879][47288] Updated weights for policy 0, policy_version 18956 (0.0030) [2024-04-25 23:27:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 310722560. Throughput: 0: 55583.0. Samples: 260077000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:27:18,924][47056] Avg episode reward: [(0, '0.216')] [2024-04-25 23:27:19,045][47288] Updated weights for policy 0, policy_version 18966 (0.0034) [2024-04-25 23:27:22,714][47288] Updated weights for policy 0, policy_version 18976 (0.0032) [2024-04-25 23:27:23,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 310968320. Throughput: 0: 55661.0. Samples: 260411320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:27:23,923][47056] Avg episode reward: [(0, '0.187')] [2024-04-25 23:27:24,925][47288] Updated weights for policy 0, policy_version 18986 (0.0026) [2024-04-25 23:27:28,497][47288] Updated weights for policy 0, policy_version 18996 (0.0031) [2024-04-25 23:27:28,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 311246848. Throughput: 0: 55376.5. Samples: 260573420. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-25 23:27:28,923][47056] Avg episode reward: [(0, '0.167')] [2024-04-25 23:27:30,823][47288] Updated weights for policy 0, policy_version 19006 (0.0036) [2024-04-25 23:27:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 311525376. Throughput: 0: 55506.4. Samples: 260913640. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-25 23:27:33,923][47056] Avg episode reward: [(0, '0.176')] [2024-04-25 23:27:34,270][47288] Updated weights for policy 0, policy_version 19016 (0.0034) [2024-04-25 23:27:36,678][47288] Updated weights for policy 0, policy_version 19026 (0.0037) [2024-04-25 23:27:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 54613.4, 300 sec: 55372.4). Total num frames: 311803904. Throughput: 0: 55681.7. Samples: 261248140. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-25 23:27:38,923][47056] Avg episode reward: [(0, '0.183')] [2024-04-25 23:27:39,972][47288] Updated weights for policy 0, policy_version 19036 (0.0028) [2024-04-25 23:27:42,420][47288] Updated weights for policy 0, policy_version 19046 (0.0033) [2024-04-25 23:27:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 312098816. Throughput: 0: 55806.3. Samples: 261413820. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 23:27:43,923][47056] Avg episode reward: [(0, '0.173')] [2024-04-25 23:27:45,931][47288] Updated weights for policy 0, policy_version 19056 (0.0030) [2024-04-25 23:27:48,232][47288] Updated weights for policy 0, policy_version 19066 (0.0026) [2024-04-25 23:27:48,923][47056] Fps is (10 sec: 60620.6, 60 sec: 56251.8, 300 sec: 55539.0). Total num frames: 312410112. Throughput: 0: 55841.8. Samples: 261749060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 23:27:48,923][47056] Avg episode reward: [(0, '0.200')] [2024-04-25 23:27:52,002][47288] Updated weights for policy 0, policy_version 19076 (0.0030) [2024-04-25 23:27:53,683][47267] Signal inference workers to stop experience collection... (3650 times) [2024-04-25 23:27:53,684][47267] Signal inference workers to resume experience collection... (3650 times) [2024-04-25 23:27:53,697][47288] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-04-25 23:27:53,697][47288] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-04-25 23:27:53,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 312672256. Throughput: 0: 55880.1. Samples: 262084260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 23:27:53,923][47056] Avg episode reward: [(0, '0.197')] [2024-04-25 23:27:54,067][47288] Updated weights for policy 0, policy_version 19086 (0.0026) [2024-04-25 23:27:57,819][47288] Updated weights for policy 0, policy_version 19096 (0.0030) [2024-04-25 23:27:58,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 312934400. Throughput: 0: 55979.5. Samples: 262258800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:27:58,923][47056] Avg episode reward: [(0, '0.195')] [2024-04-25 23:27:59,919][47288] Updated weights for policy 0, policy_version 19106 (0.0029) [2024-04-25 23:28:03,719][47288] Updated weights for policy 0, policy_version 19116 (0.0030) [2024-04-25 23:28:03,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55705.6, 300 sec: 55372.3). Total num frames: 313196544. Throughput: 0: 55858.7. Samples: 262590640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:28:03,923][47056] Avg episode reward: [(0, '0.181')] [2024-04-25 23:28:05,784][47288] Updated weights for policy 0, policy_version 19126 (0.0030) [2024-04-25 23:28:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 55372.3). Total num frames: 313475072. Throughput: 0: 55836.8. Samples: 262923980. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-25 23:28:08,923][47056] Avg episode reward: [(0, '0.213')] [2024-04-25 23:28:09,593][47288] Updated weights for policy 0, policy_version 19136 (0.0037) [2024-04-25 23:28:11,775][47288] Updated weights for policy 0, policy_version 19146 (0.0026) [2024-04-25 23:28:13,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55432.3, 300 sec: 55483.4). Total num frames: 313769984. Throughput: 0: 55654.0. Samples: 263077860. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-25 23:28:13,924][47056] Avg episode reward: [(0, '0.216')] [2024-04-25 23:28:15,618][47288] Updated weights for policy 0, policy_version 19156 (0.0033) [2024-04-25 23:28:17,506][47288] Updated weights for policy 0, policy_version 19166 (0.0026) [2024-04-25 23:28:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 314032128. Throughput: 0: 55423.8. Samples: 263407720. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-25 23:28:18,923][47056] Avg episode reward: [(0, '0.164')] [2024-04-25 23:28:21,375][47288] Updated weights for policy 0, policy_version 19176 (0.0030) [2024-04-25 23:28:23,458][47288] Updated weights for policy 0, policy_version 19186 (0.0035) [2024-04-25 23:28:23,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56524.7, 300 sec: 55594.5). Total num frames: 314359808. Throughput: 0: 55426.1. Samples: 263742320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:28:23,932][47056] Avg episode reward: [(0, '0.219')] [2024-04-25 23:28:27,251][47288] Updated weights for policy 0, policy_version 19196 (0.0036) [2024-04-25 23:28:28,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 314621952. Throughput: 0: 55861.8. Samples: 263927600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:28:28,923][47056] Avg episode reward: [(0, '0.159')] [2024-04-25 23:28:29,316][47288] Updated weights for policy 0, policy_version 19206 (0.0031) [2024-04-25 23:28:33,118][47288] Updated weights for policy 0, policy_version 19216 (0.0035) [2024-04-25 23:28:33,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 314900480. Throughput: 0: 55871.7. Samples: 264263280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:28:33,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-25 23:28:33,923][47267] Saving new best policy, reward=0.238! [2024-04-25 23:28:35,108][47288] Updated weights for policy 0, policy_version 19226 (0.0027) [2024-04-25 23:28:38,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55705.7, 300 sec: 55372.4). Total num frames: 315146240. Throughput: 0: 55877.3. Samples: 264598740. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 20.0) [2024-04-25 23:28:38,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-25 23:28:39,007][47288] Updated weights for policy 0, policy_version 19236 (0.0035) [2024-04-25 23:28:40,952][47288] Updated weights for policy 0, policy_version 19246 (0.0028) [2024-04-25 23:28:43,923][47056] Fps is (10 sec: 52427.5, 60 sec: 55432.4, 300 sec: 55372.3). Total num frames: 315424768. Throughput: 0: 55290.6. Samples: 264746880. Policy #0 lag: (min: 0.0, avg: 13.3, max: 20.0) [2024-04-25 23:28:43,923][47056] Avg episode reward: [(0, '0.162')] [2024-04-25 23:28:44,858][47288] Updated weights for policy 0, policy_version 19256 (0.0029) [2024-04-25 23:28:46,884][47288] Updated weights for policy 0, policy_version 19266 (0.0026) [2024-04-25 23:28:48,923][47056] Fps is (10 sec: 55704.7, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 315703296. Throughput: 0: 55482.7. Samples: 265087360. Policy #0 lag: (min: 0.0, avg: 13.3, max: 20.0) [2024-04-25 23:28:48,923][47056] Avg episode reward: [(0, '0.180')] [2024-04-25 23:28:49,034][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000019270_315719680.pth... [2024-04-25 23:28:49,091][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000018457_302399488.pth [2024-04-25 23:28:50,677][47267] Signal inference workers to stop experience collection... (3700 times) [2024-04-25 23:28:50,679][47267] Signal inference workers to resume experience collection... 
(3700 times) [2024-04-25 23:28:50,698][47288] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-04-25 23:28:50,699][47288] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-04-25 23:28:50,793][47288] Updated weights for policy 0, policy_version 19276 (0.0032) [2024-04-25 23:28:52,796][47288] Updated weights for policy 0, policy_version 19286 (0.0026) [2024-04-25 23:28:53,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 315998208. Throughput: 0: 55573.9. Samples: 265424800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 23:28:53,923][47056] Avg episode reward: [(0, '0.188')] [2024-04-25 23:28:56,571][47288] Updated weights for policy 0, policy_version 19296 (0.0031) [2024-04-25 23:28:58,576][47288] Updated weights for policy 0, policy_version 19306 (0.0027) [2024-04-25 23:28:58,923][47056] Fps is (10 sec: 60620.8, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 316309504. Throughput: 0: 55982.9. Samples: 265597080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-25 23:28:58,924][47056] Avg episode reward: [(0, '0.199')] [2024-04-25 23:29:02,524][47288] Updated weights for policy 0, policy_version 19316 (0.0028) [2024-04-25 23:29:03,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56525.0, 300 sec: 55650.1). Total num frames: 316588032. Throughput: 0: 56192.7. Samples: 265936380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-25 23:29:03,923][47056] Avg episode reward: [(0, '0.179')] [2024-04-25 23:29:04,524][47288] Updated weights for policy 0, policy_version 19326 (0.0032) [2024-04-25 23:29:08,375][47288] Updated weights for policy 0, policy_version 19336 (0.0029) [2024-04-25 23:29:08,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 316833792. Throughput: 0: 56078.4. Samples: 266265840. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-25 23:29:08,923][47056] Avg episode reward: [(0, '0.221')] [2024-04-25 23:29:10,345][47288] Updated weights for policy 0, policy_version 19346 (0.0034) [2024-04-25 23:29:13,923][47056] Fps is (10 sec: 50790.0, 60 sec: 55432.8, 300 sec: 55427.9). Total num frames: 317095936. Throughput: 0: 55543.1. Samples: 266427040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-25 23:29:13,923][47056] Avg episode reward: [(0, '0.166')] [2024-04-25 23:29:14,338][47288] Updated weights for policy 0, policy_version 19356 (0.0031) [2024-04-25 23:29:16,322][47288] Updated weights for policy 0, policy_version 19366 (0.0033) [2024-04-25 23:29:18,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 317374464. Throughput: 0: 55538.9. Samples: 266762540. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-04-25 23:29:18,923][47056] Avg episode reward: [(0, '0.190')] [2024-04-25 23:29:20,179][47288] Updated weights for policy 0, policy_version 19376 (0.0031) [2024-04-25 23:29:22,185][47288] Updated weights for policy 0, policy_version 19386 (0.0028) [2024-04-25 23:29:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 317652992. Throughput: 0: 55493.7. Samples: 267095960. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-04-25 23:29:23,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-25 23:29:25,993][47288] Updated weights for policy 0, policy_version 19396 (0.0028) [2024-04-25 23:29:28,421][47288] Updated weights for policy 0, policy_version 19406 (0.0026) [2024-04-25 23:29:28,923][47056] Fps is (10 sec: 58982.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 317964288. Throughput: 0: 55904.6. Samples: 267262580. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-04-25 23:29:28,923][47056] Avg episode reward: [(0, '0.195')] [2024-04-25 23:29:32,046][47288] Updated weights for policy 0, policy_version 19416 (0.0030) [2024-04-25 23:29:33,923][47056] Fps is (10 sec: 60620.2, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 318259200. Throughput: 0: 55678.7. Samples: 267592900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:29:33,923][47056] Avg episode reward: [(0, '0.163')] [2024-04-25 23:29:34,134][47288] Updated weights for policy 0, policy_version 19426 (0.0034) [2024-04-25 23:29:37,788][47288] Updated weights for policy 0, policy_version 19436 (0.0026) [2024-04-25 23:29:38,923][47056] Fps is (10 sec: 55704.2, 60 sec: 56251.5, 300 sec: 55650.0). Total num frames: 318521344. Throughput: 0: 55606.8. Samples: 267927120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:29:38,923][47056] Avg episode reward: [(0, '0.213')] [2024-04-25 23:29:39,941][47288] Updated weights for policy 0, policy_version 19446 (0.0031) [2024-04-25 23:29:43,680][47288] Updated weights for policy 0, policy_version 19456 (0.0034) [2024-04-25 23:29:43,923][47056] Fps is (10 sec: 50790.7, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 318767104. Throughput: 0: 55591.2. Samples: 268098680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 23:29:43,923][47056] Avg episode reward: [(0, '0.158')] [2024-04-25 23:29:44,525][47267] Signal inference workers to stop experience collection... (3750 times) [2024-04-25 23:29:44,526][47267] Signal inference workers to resume experience collection... 
(3750 times) [2024-04-25 23:29:44,537][47288] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-04-25 23:29:44,568][47288] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-04-25 23:29:45,917][47288] Updated weights for policy 0, policy_version 19466 (0.0033) [2024-04-25 23:29:48,923][47056] Fps is (10 sec: 50791.4, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 319029248. Throughput: 0: 55463.0. Samples: 268432220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 23:29:48,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-25 23:29:49,630][47288] Updated weights for policy 0, policy_version 19476 (0.0031) [2024-04-25 23:29:51,994][47288] Updated weights for policy 0, policy_version 19486 (0.0031) [2024-04-25 23:29:53,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55159.3, 300 sec: 55316.8). Total num frames: 319307776. Throughput: 0: 55522.0. Samples: 268764340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-25 23:29:53,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-25 23:29:54,037][47267] Saving new best policy, reward=0.251! [2024-04-25 23:29:55,455][47288] Updated weights for policy 0, policy_version 19496 (0.0028) [2024-04-25 23:29:57,805][47288] Updated weights for policy 0, policy_version 19506 (0.0032) [2024-04-25 23:29:58,923][47056] Fps is (10 sec: 57343.9, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 319602688. Throughput: 0: 55535.5. Samples: 268926140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-25 23:29:58,923][47056] Avg episode reward: [(0, '0.203')] [2024-04-25 23:30:01,474][47288] Updated weights for policy 0, policy_version 19516 (0.0034) [2024-04-25 23:30:03,558][47288] Updated weights for policy 0, policy_version 19526 (0.0030) [2024-04-25 23:30:03,923][47056] Fps is (10 sec: 60621.7, 60 sec: 55432.5, 300 sec: 55594.6). Total num frames: 319913984. Throughput: 0: 55305.9. Samples: 269251300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-25 23:30:03,923][47056] Avg episode reward: [(0, '0.214')] [2024-04-25 23:30:07,197][47288] Updated weights for policy 0, policy_version 19536 (0.0029) [2024-04-25 23:30:08,923][47056] Fps is (10 sec: 58981.5, 60 sec: 55978.4, 300 sec: 55650.0). Total num frames: 320192512. Throughput: 0: 55316.1. Samples: 269585200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-25 23:30:08,923][47056] Avg episode reward: [(0, '0.226')] [2024-04-25 23:30:09,780][47288] Updated weights for policy 0, policy_version 19546 (0.0027) [2024-04-25 23:30:13,141][47288] Updated weights for policy 0, policy_version 19556 (0.0028) [2024-04-25 23:30:13,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 320438272. Throughput: 0: 55643.0. Samples: 269766520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-25 23:30:13,923][47056] Avg episode reward: [(0, '0.154')] [2024-04-25 23:30:16,101][47288] Updated weights for policy 0, policy_version 19566 (0.0027) [2024-04-25 23:30:18,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 320716800. Throughput: 0: 55653.7. Samples: 270097320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-25 23:30:18,923][47056] Avg episode reward: [(0, '0.202')] [2024-04-25 23:30:19,047][47288] Updated weights for policy 0, policy_version 19576 (0.0028) [2024-04-25 23:30:22,169][47288] Updated weights for policy 0, policy_version 19586 (0.0026) [2024-04-25 23:30:23,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 320978944. Throughput: 0: 55470.1. Samples: 270423260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-25 23:30:23,923][47056] Avg episode reward: [(0, '0.204')] [2024-04-25 23:30:24,879][47288] Updated weights for policy 0, policy_version 19596 (0.0029) [2024-04-25 23:30:28,153][47288] Updated weights for policy 0, policy_version 19606 (0.0027) [2024-04-25 23:30:28,923][47056] Fps is (10 sec: 52429.0, 60 sec: 54613.3, 300 sec: 55316.8). Total num frames: 321241088. Throughput: 0: 55213.3. Samples: 270583280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-25 23:30:28,923][47056] Avg episode reward: [(0, '0.170')] [2024-04-25 23:30:30,778][47288] Updated weights for policy 0, policy_version 19616 (0.0029) [2024-04-25 23:30:33,896][47288] Updated weights for policy 0, policy_version 19626 (0.0035) [2024-04-25 23:30:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 321552384. Throughput: 0: 55062.2. Samples: 270910020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-25 23:30:33,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-25 23:30:36,691][47288] Updated weights for policy 0, policy_version 19636 (0.0027) [2024-04-25 23:30:38,232][47267] Signal inference workers to stop experience collection... (3800 times) [2024-04-25 23:30:38,233][47267] Signal inference workers to resume experience collection... (3800 times) [2024-04-25 23:30:38,246][47288] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-04-25 23:30:38,247][47288] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-04-25 23:30:38,923][47056] Fps is (10 sec: 60620.9, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 321847296. Throughput: 0: 55066.8. Samples: 271242340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:30:38,923][47056] Avg episode reward: [(0, '0.183')] [2024-04-25 23:30:39,719][47288] Updated weights for policy 0, policy_version 19646 (0.0034) [2024-04-25 23:30:42,601][47288] Updated weights for policy 0, policy_version 19656 (0.0029) [2024-04-25 23:30:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 322109440. Throughput: 0: 55446.3. Samples: 271421220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:30:43,923][47056] Avg episode reward: [(0, '0.181')] [2024-04-25 23:30:45,634][47288] Updated weights for policy 0, policy_version 19666 (0.0032) [2024-04-25 23:30:48,461][47288] Updated weights for policy 0, policy_version 19676 (0.0034) [2024-04-25 23:30:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 322387968. Throughput: 0: 55590.9. Samples: 271752900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:30:48,923][47056] Avg episode reward: [(0, '0.192')] [2024-04-25 23:30:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000019677_322387968.pth... [2024-04-25 23:30:48,995][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000018862_309035008.pth [2024-04-25 23:30:51,679][47288] Updated weights for policy 0, policy_version 19686 (0.0029) [2024-04-25 23:30:53,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 322650112. Throughput: 0: 55632.1. Samples: 272088640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-04-25 23:30:53,923][47056] Avg episode reward: [(0, '0.180')] [2024-04-25 23:30:54,286][47288] Updated weights for policy 0, policy_version 19696 (0.0032) [2024-04-25 23:30:57,547][47288] Updated weights for policy 0, policy_version 19706 (0.0033) [2024-04-25 23:30:58,923][47056] Fps is (10 sec: 52429.7, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 322912256. 
Throughput: 0: 55074.4. Samples: 272244860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-04-25 23:30:58,923][47056] Avg episode reward: [(0, '0.226')] [2024-04-25 23:31:00,182][47288] Updated weights for policy 0, policy_version 19716 (0.0025) [2024-04-25 23:31:03,600][47288] Updated weights for policy 0, policy_version 19726 (0.0030) [2024-04-25 23:31:03,923][47056] Fps is (10 sec: 54067.5, 60 sec: 54613.3, 300 sec: 55372.4). Total num frames: 323190784. Throughput: 0: 55149.8. Samples: 272579060. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-04-25 23:31:03,923][47056] Avg episode reward: [(0, '0.198')] [2024-04-25 23:31:06,131][47288] Updated weights for policy 0, policy_version 19736 (0.0029) [2024-04-25 23:31:08,923][47056] Fps is (10 sec: 57343.0, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 323485696. Throughput: 0: 55410.6. Samples: 272916740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-25 23:31:08,923][47056] Avg episode reward: [(0, '0.187')] [2024-04-25 23:31:09,566][47288] Updated weights for policy 0, policy_version 19746 (0.0036) [2024-04-25 23:31:11,993][47288] Updated weights for policy 0, policy_version 19756 (0.0027) [2024-04-25 23:31:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 323780608. Throughput: 0: 55560.5. Samples: 273083500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-25 23:31:13,923][47056] Avg episode reward: [(0, '0.181')] [2024-04-25 23:31:15,336][47288] Updated weights for policy 0, policy_version 19766 (0.0030) [2024-04-25 23:31:17,857][47288] Updated weights for policy 0, policy_version 19776 (0.0031) [2024-04-25 23:31:18,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 324059136. Throughput: 0: 55693.9. Samples: 273416240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-25 23:31:18,923][47056] Avg episode reward: [(0, '0.207')] [2024-04-25 23:31:21,316][47288] Updated weights for policy 0, policy_version 19786 (0.0028) [2024-04-25 23:31:23,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 324321280. Throughput: 0: 55662.8. Samples: 273747160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-25 23:31:23,923][47056] Avg episode reward: [(0, '0.218')] [2024-04-25 23:31:24,118][47288] Updated weights for policy 0, policy_version 19796 (0.0032) [2024-04-25 23:31:27,147][47288] Updated weights for policy 0, policy_version 19806 (0.0028) [2024-04-25 23:31:28,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 324599808. Throughput: 0: 55385.2. Samples: 273913560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-25 23:31:28,923][47056] Avg episode reward: [(0, '0.175')] [2024-04-25 23:31:29,809][47288] Updated weights for policy 0, policy_version 19816 (0.0024) [2024-04-25 23:31:33,085][47288] Updated weights for policy 0, policy_version 19826 (0.0031) [2024-04-25 23:31:33,923][47056] Fps is (10 sec: 52428.2, 60 sec: 54886.4, 300 sec: 55316.8). Total num frames: 324845568. Throughput: 0: 55507.2. Samples: 274250720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 23:31:33,923][47056] Avg episode reward: [(0, '0.197')] [2024-04-25 23:31:35,450][47267] Signal inference workers to stop experience collection... (3850 times) [2024-04-25 23:31:35,450][47267] Signal inference workers to resume experience collection... 
(3850 times) [2024-04-25 23:31:35,471][47288] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-04-25 23:31:35,471][47288] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-04-25 23:31:35,556][47288] Updated weights for policy 0, policy_version 19836 (0.0035) [2024-04-25 23:31:38,914][47288] Updated weights for policy 0, policy_version 19846 (0.0029) [2024-04-25 23:31:38,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 325156864. Throughput: 0: 55493.4. Samples: 274585840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 23:31:38,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-25 23:31:41,520][47288] Updated weights for policy 0, policy_version 19856 (0.0034) [2024-04-25 23:31:43,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 325419008. Throughput: 0: 55602.3. Samples: 274746960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-25 23:31:43,923][47056] Avg episode reward: [(0, '0.153')] [2024-04-25 23:31:44,761][47288] Updated weights for policy 0, policy_version 19866 (0.0032) [2024-04-25 23:31:47,557][47288] Updated weights for policy 0, policy_version 19876 (0.0027) [2024-04-25 23:31:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 325730304. Throughput: 0: 55561.0. Samples: 275079300. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-25 23:31:48,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-25 23:31:50,788][47288] Updated weights for policy 0, policy_version 19886 (0.0035) [2024-04-25 23:31:53,423][47288] Updated weights for policy 0, policy_version 19896 (0.0030) [2024-04-25 23:31:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 325992448. Throughput: 0: 55324.2. Samples: 275406320. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-25 23:31:53,923][47056] Avg episode reward: [(0, '0.201')] [2024-04-25 23:31:56,801][47288] Updated weights for policy 0, policy_version 19906 (0.0027) [2024-04-25 23:31:58,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 326254592. Throughput: 0: 55501.9. Samples: 275581080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-25 23:31:58,923][47056] Avg episode reward: [(0, '0.198')] [2024-04-25 23:31:59,271][47288] Updated weights for policy 0, policy_version 19916 (0.0031) [2024-04-25 23:32:02,591][47288] Updated weights for policy 0, policy_version 19926 (0.0034) [2024-04-25 23:32:03,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 326516736. Throughput: 0: 55515.0. Samples: 275914420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-04-25 23:32:03,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-25 23:32:05,305][47288] Updated weights for policy 0, policy_version 19936 (0.0028) [2024-04-25 23:32:08,363][47288] Updated weights for policy 0, policy_version 19946 (0.0031) [2024-04-25 23:32:08,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 326811648. Throughput: 0: 55415.0. Samples: 276240840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-04-25 23:32:08,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-25 23:32:11,266][47288] Updated weights for policy 0, policy_version 19956 (0.0027) [2024-04-25 23:32:13,923][47056] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 327073792. Throughput: 0: 55292.6. Samples: 276401720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-25 23:32:13,923][47056] Avg episode reward: [(0, '0.212')] [2024-04-25 23:32:14,399][47288] Updated weights for policy 0, policy_version 19966 (0.0035) [2024-04-25 23:32:17,006][47288] Updated weights for policy 0, policy_version 19976 (0.0026) [2024-04-25 23:32:18,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 327368704. Throughput: 0: 55171.6. Samples: 276733440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-25 23:32:18,923][47056] Avg episode reward: [(0, '0.208')] [2024-04-25 23:32:20,492][47288] Updated weights for policy 0, policy_version 19986 (0.0026) [2024-04-25 23:32:22,926][47288] Updated weights for policy 0, policy_version 19996 (0.0029) [2024-04-25 23:32:23,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 327647232. Throughput: 0: 55167.7. Samples: 277068380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-25 23:32:23,923][47056] Avg episode reward: [(0, '0.206')] [2024-04-25 23:32:26,352][47288] Updated weights for policy 0, policy_version 20006 (0.0024) [2024-04-25 23:32:28,690][47288] Updated weights for policy 0, policy_version 20016 (0.0028) [2024-04-25 23:32:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 327942144. Throughput: 0: 55434.4. Samples: 277241520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:32:28,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-25 23:32:28,997][47267] Saving new best policy, reward=0.253! [2024-04-25 23:32:29,596][47267] Signal inference workers to stop experience collection... (3900 times) [2024-04-25 23:32:29,629][47288] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-04-25 23:32:29,682][47267] Signal inference workers to resume experience collection... 
(3900 times) [2024-04-25 23:32:29,682][47288] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-04-25 23:32:32,181][47288] Updated weights for policy 0, policy_version 20026 (0.0029) [2024-04-25 23:32:33,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 328187904. Throughput: 0: 55365.3. Samples: 277570740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:32:33,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-25 23:32:34,814][47288] Updated weights for policy 0, policy_version 20036 (0.0030) [2024-04-25 23:32:38,091][47288] Updated weights for policy 0, policy_version 20046 (0.0033) [2024-04-25 23:32:38,923][47056] Fps is (10 sec: 50790.5, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 328450048. Throughput: 0: 55442.1. Samples: 277901220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:32:38,923][47056] Avg episode reward: [(0, '0.204')] [2024-04-25 23:32:41,072][47288] Updated weights for policy 0, policy_version 20056 (0.0035) [2024-04-25 23:32:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.4, 300 sec: 55372.4). Total num frames: 328744960. Throughput: 0: 55083.5. Samples: 278059840. Policy #0 lag: (min: 1.0, avg: 12.1, max: 25.0) [2024-04-25 23:32:43,923][47056] Avg episode reward: [(0, '0.203')] [2024-04-25 23:32:44,192][47288] Updated weights for policy 0, policy_version 20066 (0.0028) [2024-04-25 23:32:46,851][47288] Updated weights for policy 0, policy_version 20076 (0.0026) [2024-04-25 23:32:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 329023488. Throughput: 0: 55221.4. Samples: 278399380. Policy #0 lag: (min: 1.0, avg: 12.1, max: 25.0) [2024-04-25 23:32:48,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-25 23:32:48,945][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000020083_329039872.pth... 
[2024-04-25 23:32:48,992][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000019270_315719680.pth [2024-04-25 23:32:49,889][47288] Updated weights for policy 0, policy_version 20086 (0.0028) [2024-04-25 23:32:52,669][47288] Updated weights for policy 0, policy_version 20096 (0.0028) [2024-04-25 23:32:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 329302016. Throughput: 0: 55283.2. Samples: 278728580. Policy #0 lag: (min: 1.0, avg: 12.1, max: 25.0) [2024-04-25 23:32:53,923][47056] Avg episode reward: [(0, '0.183')] [2024-04-25 23:32:55,677][47288] Updated weights for policy 0, policy_version 20106 (0.0036) [2024-04-25 23:32:58,656][47288] Updated weights for policy 0, policy_version 20116 (0.0026) [2024-04-25 23:32:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 329596928. Throughput: 0: 55505.3. Samples: 278899460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-25 23:32:58,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-25 23:32:58,934][47267] Saving new best policy, reward=0.256! [2024-04-25 23:33:01,971][47288] Updated weights for policy 0, policy_version 20126 (0.0032) [2024-04-25 23:33:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 329875456. Throughput: 0: 55435.5. Samples: 279228040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-25 23:33:03,923][47056] Avg episode reward: [(0, '0.153')] [2024-04-25 23:33:04,684][47288] Updated weights for policy 0, policy_version 20136 (0.0033) [2024-04-25 23:33:07,990][47288] Updated weights for policy 0, policy_version 20146 (0.0028) [2024-04-25 23:33:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 330137600. Throughput: 0: 55511.5. Samples: 279566400. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-25 23:33:08,923][47056] Avg episode reward: [(0, '0.200')] [2024-04-25 23:33:10,421][47288] Updated weights for policy 0, policy_version 20156 (0.0030) [2024-04-25 23:33:13,871][47288] Updated weights for policy 0, policy_version 20166 (0.0038) [2024-04-25 23:33:13,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 330399744. Throughput: 0: 55245.3. Samples: 279727560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:33:13,923][47056] Avg episode reward: [(0, '0.148')] [2024-04-25 23:33:16,116][47288] Updated weights for policy 0, policy_version 20176 (0.0027) [2024-04-25 23:33:18,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55372.4). Total num frames: 330694656. Throughput: 0: 55418.5. Samples: 280064580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-25 23:33:18,923][47056] Avg episode reward: [(0, '0.192')] [2024-04-25 23:33:19,700][47288] Updated weights for policy 0, policy_version 20186 (0.0032) [2024-04-25 23:33:22,285][47288] Updated weights for policy 0, policy_version 20196 (0.0030) [2024-04-25 23:33:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 330973184. Throughput: 0: 55387.1. Samples: 280393640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:33:23,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-25 23:33:25,492][47288] Updated weights for policy 0, policy_version 20206 (0.0029) [2024-04-25 23:33:28,237][47288] Updated weights for policy 0, policy_version 20216 (0.0029) [2024-04-25 23:33:28,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55432.7, 300 sec: 55483.4). Total num frames: 331268096. Throughput: 0: 55657.4. Samples: 280564420. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:33:28,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-25 23:33:31,390][47288] Updated weights for policy 0, policy_version 20226 (0.0030) [2024-04-25 23:33:33,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 331530240. Throughput: 0: 55405.9. Samples: 280892640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:33:33,923][47056] Avg episode reward: [(0, '0.179')] [2024-04-25 23:33:34,177][47288] Updated weights for policy 0, policy_version 20236 (0.0027) [2024-04-25 23:33:37,398][47288] Updated weights for policy 0, policy_version 20246 (0.0035) [2024-04-25 23:33:38,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 331825152. Throughput: 0: 55543.5. Samples: 281228040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 23:33:38,923][47056] Avg episode reward: [(0, '0.236')] [2024-04-25 23:33:39,890][47288] Updated weights for policy 0, policy_version 20256 (0.0027) [2024-04-25 23:33:43,257][47288] Updated weights for policy 0, policy_version 20266 (0.0043) [2024-04-25 23:33:43,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 332087296. Throughput: 0: 55527.5. Samples: 281398200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 23:33:43,927][47056] Avg episode reward: [(0, '0.177')] [2024-04-25 23:33:45,796][47288] Updated weights for policy 0, policy_version 20276 (0.0034) [2024-04-25 23:33:48,923][47056] Fps is (10 sec: 50790.8, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 332333056. Throughput: 0: 55555.1. Samples: 281728020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 23:33:48,923][47056] Avg episode reward: [(0, '0.187')] [2024-04-25 23:33:49,153][47288] Updated weights for policy 0, policy_version 20286 (0.0027) [2024-04-25 23:33:49,520][47267] Signal inference workers to stop experience collection... 
(3950 times) [2024-04-25 23:33:49,520][47267] Signal inference workers to resume experience collection... (3950 times) [2024-04-25 23:33:49,536][47288] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-04-25 23:33:49,536][47288] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-04-25 23:33:51,871][47288] Updated weights for policy 0, policy_version 20296 (0.0027) [2024-04-25 23:33:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 332627968. Throughput: 0: 55459.1. Samples: 282062060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:33:53,923][47056] Avg episode reward: [(0, '0.172')] [2024-04-25 23:33:55,277][47288] Updated weights for policy 0, policy_version 20306 (0.0039) [2024-04-25 23:33:57,663][47288] Updated weights for policy 0, policy_version 20316 (0.0033) [2024-04-25 23:33:58,923][47056] Fps is (10 sec: 57342.7, 60 sec: 55159.2, 300 sec: 55316.8). Total num frames: 332906496. Throughput: 0: 55519.8. Samples: 282225960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:33:58,924][47056] Avg episode reward: [(0, '0.180')] [2024-04-25 23:34:01,113][47288] Updated weights for policy 0, policy_version 20326 (0.0031) [2024-04-25 23:34:03,541][47288] Updated weights for policy 0, policy_version 20336 (0.0028) [2024-04-25 23:34:03,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 333217792. Throughput: 0: 55483.3. Samples: 282561320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-25 23:34:03,923][47056] Avg episode reward: [(0, '0.195')] [2024-04-25 23:34:06,994][47288] Updated weights for policy 0, policy_version 20346 (0.0032) [2024-04-25 23:34:08,923][47056] Fps is (10 sec: 54068.7, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 333447168. Throughput: 0: 55503.7. Samples: 282891300. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:34:08,923][47056] Avg episode reward: [(0, '0.179')] [2024-04-25 23:34:09,491][47288] Updated weights for policy 0, policy_version 20356 (0.0026) [2024-04-25 23:34:12,702][47288] Updated weights for policy 0, policy_version 20366 (0.0027) [2024-04-25 23:34:13,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 333758464. Throughput: 0: 55434.5. Samples: 283058980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:34:13,924][47056] Avg episode reward: [(0, '0.150')] [2024-04-25 23:34:15,397][47288] Updated weights for policy 0, policy_version 20376 (0.0029) [2024-04-25 23:34:18,573][47288] Updated weights for policy 0, policy_version 20386 (0.0030) [2024-04-25 23:34:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 334020608. Throughput: 0: 55581.6. Samples: 283393820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:34:18,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-25 23:34:21,316][47288] Updated weights for policy 0, policy_version 20396 (0.0031) [2024-04-25 23:34:23,923][47056] Fps is (10 sec: 52429.8, 60 sec: 55159.6, 300 sec: 55316.8). Total num frames: 334282752. Throughput: 0: 55594.4. Samples: 283729780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-25 23:34:23,923][47056] Avg episode reward: [(0, '0.223')] [2024-04-25 23:34:24,439][47288] Updated weights for policy 0, policy_version 20406 (0.0038) [2024-04-25 23:34:27,073][47288] Updated weights for policy 0, policy_version 20416 (0.0029) [2024-04-25 23:34:28,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55159.2, 300 sec: 55316.8). Total num frames: 334577664. Throughput: 0: 55545.6. Samples: 283897760. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-25 23:34:28,923][47056] Avg episode reward: [(0, '0.190')] [2024-04-25 23:34:30,379][47288] Updated weights for policy 0, policy_version 20426 (0.0026) [2024-04-25 23:34:33,122][47288] Updated weights for policy 0, policy_version 20436 (0.0030) [2024-04-25 23:34:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 334856192. Throughput: 0: 55495.2. Samples: 284225300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-25 23:34:33,923][47056] Avg episode reward: [(0, '0.210')] [2024-04-25 23:34:36,169][47288] Updated weights for policy 0, policy_version 20446 (0.0031) [2024-04-25 23:34:38,889][47288] Updated weights for policy 0, policy_version 20456 (0.0033) [2024-04-25 23:34:38,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 335151104. Throughput: 0: 55385.4. Samples: 284554400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-25 23:34:38,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-25 23:34:42,033][47288] Updated weights for policy 0, policy_version 20466 (0.0031) [2024-04-25 23:34:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 335413248. Throughput: 0: 55697.7. Samples: 284732340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-25 23:34:43,923][47056] Avg episode reward: [(0, '0.235')] [2024-04-25 23:34:44,929][47288] Updated weights for policy 0, policy_version 20476 (0.0037) [2024-04-25 23:34:47,871][47288] Updated weights for policy 0, policy_version 20486 (0.0027) [2024-04-25 23:34:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 335691776. Throughput: 0: 55649.3. Samples: 285065540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-25 23:34:48,923][47056] Avg episode reward: [(0, '0.204')] [2024-04-25 23:34:48,935][47267] Signal inference workers to stop experience collection... 
(4000 times) [2024-04-25 23:34:48,980][47288] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-04-25 23:34:48,993][47267] Signal inference workers to resume experience collection... (4000 times) [2024-04-25 23:34:48,993][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000020490_335708160.pth... [2024-04-25 23:34:48,996][47288] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-04-25 23:34:49,035][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000019677_322387968.pth [2024-04-25 23:34:50,884][47288] Updated weights for policy 0, policy_version 20496 (0.0034) [2024-04-25 23:34:53,880][47288] Updated weights for policy 0, policy_version 20506 (0.0031) [2024-04-25 23:34:53,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 335970304. Throughput: 0: 55575.8. Samples: 285392220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-25 23:34:53,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-25 23:34:56,790][47288] Updated weights for policy 0, policy_version 20516 (0.0032) [2024-04-25 23:34:58,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55159.8, 300 sec: 55261.3). Total num frames: 336216064. Throughput: 0: 55545.6. Samples: 285558520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-25 23:34:58,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-25 23:34:59,821][47288] Updated weights for policy 0, policy_version 20526 (0.0033) [2024-04-25 23:35:02,679][47288] Updated weights for policy 0, policy_version 20536 (0.0034) [2024-04-25 23:35:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 54886.3, 300 sec: 55316.9). Total num frames: 336510976. Throughput: 0: 55511.9. Samples: 285891860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:35:03,923][47056] Avg episode reward: [(0, '0.224')] [2024-04-25 23:35:05,694][47288] Updated weights for policy 0, policy_version 20546 (0.0035) [2024-04-25 23:35:08,604][47288] Updated weights for policy 0, policy_version 20556 (0.0031) [2024-04-25 23:35:08,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55483.5). Total num frames: 336805888. Throughput: 0: 55338.6. Samples: 286220020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:35:08,923][47056] Avg episode reward: [(0, '0.189')] [2024-04-25 23:35:11,544][47288] Updated weights for policy 0, policy_version 20566 (0.0029) [2024-04-25 23:35:13,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 337068032. Throughput: 0: 55428.9. Samples: 286392060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:35:13,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-25 23:35:14,505][47288] Updated weights for policy 0, policy_version 20576 (0.0032) [2024-04-25 23:35:17,538][47288] Updated weights for policy 0, policy_version 20586 (0.0032) [2024-04-25 23:35:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 337362944. Throughput: 0: 55531.0. Samples: 286724200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-25 23:35:18,923][47056] Avg episode reward: [(0, '0.220')] [2024-04-25 23:35:20,454][47288] Updated weights for policy 0, policy_version 20596 (0.0031) [2024-04-25 23:35:23,357][47288] Updated weights for policy 0, policy_version 20606 (0.0025) [2024-04-25 23:35:23,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 337657856. Throughput: 0: 55543.5. Samples: 287053860. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-25 23:35:23,923][47056] Avg episode reward: [(0, '0.233')] [2024-04-25 23:35:26,276][47288] Updated weights for policy 0, policy_version 20616 (0.0029) [2024-04-25 23:35:28,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 337903616. Throughput: 0: 55362.9. Samples: 287223680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-25 23:35:28,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-25 23:35:29,209][47288] Updated weights for policy 0, policy_version 20626 (0.0026) [2024-04-25 23:35:32,156][47288] Updated weights for policy 0, policy_version 20636 (0.0028) [2024-04-25 23:35:33,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 338198528. Throughput: 0: 55418.4. Samples: 287559360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:35:33,923][47056] Avg episode reward: [(0, '0.182')] [2024-04-25 23:35:35,024][47288] Updated weights for policy 0, policy_version 20646 (0.0027) [2024-04-25 23:35:38,044][47288] Updated weights for policy 0, policy_version 20656 (0.0027) [2024-04-25 23:35:38,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 338460672. Throughput: 0: 55622.8. Samples: 287895240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:35:38,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-25 23:35:40,954][47288] Updated weights for policy 0, policy_version 20666 (0.0030) [2024-04-25 23:35:43,888][47288] Updated weights for policy 0, policy_version 20676 (0.0038) [2024-04-25 23:35:43,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 338755584. Throughput: 0: 55556.2. Samples: 288058560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:35:43,923][47056] Avg episode reward: [(0, '0.223')] [2024-04-25 23:35:46,889][47288] Updated weights for policy 0, policy_version 20686 (0.0031) [2024-04-25 23:35:48,923][47056] Fps is (10 sec: 55704.0, 60 sec: 55432.3, 300 sec: 55483.4). Total num frames: 339017728. Throughput: 0: 55600.2. Samples: 288393880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:35:48,923][47056] Avg episode reward: [(0, '0.220')] [2024-04-25 23:35:49,881][47288] Updated weights for policy 0, policy_version 20696 (0.0027) [2024-04-25 23:35:52,605][47288] Updated weights for policy 0, policy_version 20706 (0.0026) [2024-04-25 23:35:53,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 339312640. Throughput: 0: 55662.1. Samples: 288724820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:35:53,923][47056] Avg episode reward: [(0, '0.216')] [2024-04-25 23:35:55,672][47288] Updated weights for policy 0, policy_version 20716 (0.0033) [2024-04-25 23:35:58,551][47288] Updated weights for policy 0, policy_version 20726 (0.0030) [2024-04-25 23:35:58,923][47056] Fps is (10 sec: 57345.3, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 339591168. Throughput: 0: 55819.3. Samples: 288903920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:35:58,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-25 23:35:59,024][47267] Saving new best policy, reward=0.277! [2024-04-25 23:36:01,521][47288] Updated weights for policy 0, policy_version 20736 (0.0029) [2024-04-25 23:36:03,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55432.7, 300 sec: 55427.9). Total num frames: 339836928. Throughput: 0: 55759.7. Samples: 289233380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:36:03,923][47056] Avg episode reward: [(0, '0.235')] [2024-04-25 23:36:04,223][47267] Signal inference workers to stop experience collection... 
(4050 times) [2024-04-25 23:36:04,276][47288] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-04-25 23:36:04,276][47267] Signal inference workers to resume experience collection... (4050 times) [2024-04-25 23:36:04,291][47288] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-04-25 23:36:04,410][47288] Updated weights for policy 0, policy_version 20746 (0.0027) [2024-04-25 23:36:07,499][47288] Updated weights for policy 0, policy_version 20756 (0.0032) [2024-04-25 23:36:08,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 340148224. Throughput: 0: 55755.9. Samples: 289562880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:36:08,923][47056] Avg episode reward: [(0, '0.187')] [2024-04-25 23:36:10,324][47288] Updated weights for policy 0, policy_version 20766 (0.0026) [2024-04-25 23:36:13,325][47288] Updated weights for policy 0, policy_version 20776 (0.0035) [2024-04-25 23:36:13,923][47056] Fps is (10 sec: 57342.9, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 340410368. Throughput: 0: 55611.2. Samples: 289726180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:36:13,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-25 23:36:16,245][47288] Updated weights for policy 0, policy_version 20786 (0.0026) [2024-04-25 23:36:18,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 340705280. Throughput: 0: 55670.0. Samples: 290064520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-25 23:36:18,923][47056] Avg episode reward: [(0, '0.205')] [2024-04-25 23:36:19,147][47288] Updated weights for policy 0, policy_version 20796 (0.0034) [2024-04-25 23:36:22,099][47288] Updated weights for policy 0, policy_version 20806 (0.0028) [2024-04-25 23:36:23,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 340967424. Throughput: 0: 55738.3. Samples: 290403460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-25 23:36:23,923][47056] Avg episode reward: [(0, '0.184')] [2024-04-25 23:36:25,020][47288] Updated weights for policy 0, policy_version 20816 (0.0037) [2024-04-25 23:36:27,896][47288] Updated weights for policy 0, policy_version 20826 (0.0026) [2024-04-25 23:36:28,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 341262336. Throughput: 0: 55736.9. Samples: 290566720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-25 23:36:28,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-25 23:36:30,869][47288] Updated weights for policy 0, policy_version 20836 (0.0036) [2024-04-25 23:36:33,723][47288] Updated weights for policy 0, policy_version 20846 (0.0034) [2024-04-25 23:36:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 341540864. Throughput: 0: 55737.2. Samples: 290902040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-25 23:36:33,923][47056] Avg episode reward: [(0, '0.197')] [2024-04-25 23:36:36,778][47288] Updated weights for policy 0, policy_version 20856 (0.0028) [2024-04-25 23:36:38,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 341819392. Throughput: 0: 55909.4. Samples: 291240740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-25 23:36:38,923][47056] Avg episode reward: [(0, '0.190')] [2024-04-25 23:36:39,736][47288] Updated weights for policy 0, policy_version 20866 (0.0023) [2024-04-25 23:36:42,680][47288] Updated weights for policy 0, policy_version 20876 (0.0029) [2024-04-25 23:36:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 342114304. Throughput: 0: 55814.2. Samples: 291415560. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-25 23:36:43,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-25 23:36:45,488][47288] Updated weights for policy 0, policy_version 20886 (0.0029) [2024-04-25 23:36:48,734][47288] Updated weights for policy 0, policy_version 20896 (0.0029) [2024-04-25 23:36:48,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.9, 300 sec: 55483.4). Total num frames: 342360064. Throughput: 0: 55899.0. Samples: 291748840. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-25 23:36:48,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-25 23:36:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000020896_342360064.pth... [2024-04-25 23:36:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000020083_329039872.pth [2024-04-25 23:36:51,311][47288] Updated weights for policy 0, policy_version 20906 (0.0030) [2024-04-25 23:36:53,923][47056] Fps is (10 sec: 50790.4, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 342622208. Throughput: 0: 55866.7. Samples: 292076880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-25 23:36:53,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-25 23:36:54,780][47288] Updated weights for policy 0, policy_version 20916 (0.0031) [2024-04-25 23:36:57,077][47288] Updated weights for policy 0, policy_version 20926 (0.0031) [2024-04-25 23:36:58,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 342917120. Throughput: 0: 55836.9. Samples: 292238840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:36:58,923][47056] Avg episode reward: [(0, '0.229')] [2024-04-25 23:37:00,513][47288] Updated weights for policy 0, policy_version 20936 (0.0029) [2024-04-25 23:37:03,051][47288] Updated weights for policy 0, policy_version 20946 (0.0027) [2024-04-25 23:37:03,923][47056] Fps is (10 sec: 60621.6, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 343228416. 
Throughput: 0: 55865.5. Samples: 292578460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:37:03,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-25 23:37:06,287][47288] Updated weights for policy 0, policy_version 20956 (0.0027) [2024-04-25 23:37:08,847][47288] Updated weights for policy 0, policy_version 20966 (0.0027) [2024-04-25 23:37:08,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 343506944. Throughput: 0: 55693.1. Samples: 292909660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:37:08,923][47056] Avg episode reward: [(0, '0.229')] [2024-04-25 23:37:12,191][47288] Updated weights for policy 0, policy_version 20976 (0.0028) [2024-04-25 23:37:13,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 343769088. Throughput: 0: 55878.8. Samples: 293081260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:37:13,923][47056] Avg episode reward: [(0, '0.198')] [2024-04-25 23:37:14,466][47267] Signal inference workers to stop experience collection... (4100 times) [2024-04-25 23:37:14,471][47267] Signal inference workers to resume experience collection... (4100 times) [2024-04-25 23:37:14,477][47288] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-04-25 23:37:14,496][47288] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-04-25 23:37:14,573][47288] Updated weights for policy 0, policy_version 20986 (0.0031) [2024-04-25 23:37:18,205][47288] Updated weights for policy 0, policy_version 20996 (0.0031) [2024-04-25 23:37:18,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 344047616. Throughput: 0: 55799.9. Samples: 293413040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:37:18,923][47056] Avg episode reward: [(0, '0.210')] [2024-04-25 23:37:20,480][47288] Updated weights for policy 0, policy_version 21006 (0.0030) [2024-04-25 23:37:23,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 344309760. Throughput: 0: 55691.0. Samples: 293746840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:37:23,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-25 23:37:24,031][47288] Updated weights for policy 0, policy_version 21016 (0.0027) [2024-04-25 23:37:26,481][47288] Updated weights for policy 0, policy_version 21026 (0.0027) [2024-04-25 23:37:28,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 344571904. Throughput: 0: 55221.4. Samples: 293900520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:37:28,923][47056] Avg episode reward: [(0, '0.231')] [2024-04-25 23:37:29,975][47288] Updated weights for policy 0, policy_version 21036 (0.0033) [2024-04-25 23:37:32,388][47288] Updated weights for policy 0, policy_version 21046 (0.0028) [2024-04-25 23:37:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 344866816. Throughput: 0: 55181.7. Samples: 294232020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:37:33,923][47056] Avg episode reward: [(0, '0.153')] [2024-04-25 23:37:35,985][47288] Updated weights for policy 0, policy_version 21056 (0.0028) [2024-04-25 23:37:38,277][47288] Updated weights for policy 0, policy_version 21066 (0.0030) [2024-04-25 23:37:38,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 345161728. Throughput: 0: 55255.5. Samples: 294563380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:37:38,923][47056] Avg episode reward: [(0, '0.213')] [2024-04-25 23:37:41,895][47288] Updated weights for policy 0, policy_version 21076 (0.0027) [2024-04-25 23:37:43,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 345440256. Throughput: 0: 55618.3. Samples: 294741660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-25 23:37:43,923][47056] Avg episode reward: [(0, '0.210')] [2024-04-25 23:37:44,366][47288] Updated weights for policy 0, policy_version 21086 (0.0025) [2024-04-25 23:37:47,678][47288] Updated weights for policy 0, policy_version 21096 (0.0026) [2024-04-25 23:37:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 345702400. Throughput: 0: 55482.6. Samples: 295075180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-25 23:37:48,923][47056] Avg episode reward: [(0, '0.185')] [2024-04-25 23:37:50,256][47288] Updated weights for policy 0, policy_version 21106 (0.0027) [2024-04-25 23:37:53,585][47288] Updated weights for policy 0, policy_version 21116 (0.0028) [2024-04-25 23:37:53,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 345980928. Throughput: 0: 55481.6. Samples: 295406320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-25 23:37:53,923][47056] Avg episode reward: [(0, '0.226')] [2024-04-25 23:37:56,143][47288] Updated weights for policy 0, policy_version 21126 (0.0027) [2024-04-25 23:37:58,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 346226688. Throughput: 0: 55212.5. Samples: 295565820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-25 23:37:58,923][47056] Avg episode reward: [(0, '0.193')] [2024-04-25 23:37:59,485][47288] Updated weights for policy 0, policy_version 21136 (0.0025) [2024-04-25 23:38:02,009][47288] Updated weights for policy 0, policy_version 21146 (0.0026) [2024-04-25 23:38:03,923][47056] Fps is (10 sec: 54066.4, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 346521600. Throughput: 0: 55223.1. Samples: 295898080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-25 23:38:03,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-25 23:38:03,924][47267] Saving new best policy, reward=0.309! [2024-04-25 23:38:05,364][47288] Updated weights for policy 0, policy_version 21156 (0.0025) [2024-04-25 23:38:08,090][47288] Updated weights for policy 0, policy_version 21166 (0.0031) [2024-04-25 23:38:08,923][47056] Fps is (10 sec: 57343.2, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 346800128. Throughput: 0: 55239.5. Samples: 296232620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-25 23:38:08,923][47056] Avg episode reward: [(0, '0.231')] [2024-04-25 23:38:11,184][47288] Updated weights for policy 0, policy_version 21176 (0.0033) [2024-04-25 23:38:11,906][47267] Signal inference workers to stop experience collection... (4150 times) [2024-04-25 23:38:11,907][47267] Signal inference workers to resume experience collection... (4150 times) [2024-04-25 23:38:11,917][47288] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-04-25 23:38:11,917][47288] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-04-25 23:38:13,746][47288] Updated weights for policy 0, policy_version 21186 (0.0027) [2024-04-25 23:38:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 347111424. Throughput: 0: 55592.0. Samples: 296402160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-04-25 23:38:13,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-25 23:38:17,009][47288] Updated weights for policy 0, policy_version 21196 (0.0030) [2024-04-25 23:38:18,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 347373568. Throughput: 0: 55723.2. Samples: 296739560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-04-25 23:38:18,923][47056] Avg episode reward: [(0, '0.178')] [2024-04-25 23:38:19,511][47288] Updated weights for policy 0, policy_version 21206 (0.0033) [2024-04-25 23:38:23,005][47288] Updated weights for policy 0, policy_version 21216 (0.0027) [2024-04-25 23:38:23,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 347635712. Throughput: 0: 55717.0. Samples: 297070640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-04-25 23:38:23,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-25 23:38:25,482][47288] Updated weights for policy 0, policy_version 21226 (0.0033) [2024-04-25 23:38:28,801][47288] Updated weights for policy 0, policy_version 21236 (0.0036) [2024-04-25 23:38:28,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 347930624. Throughput: 0: 55435.0. Samples: 297236240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-25 23:38:28,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-25 23:38:31,565][47288] Updated weights for policy 0, policy_version 21246 (0.0032) [2024-04-25 23:38:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 348192768. Throughput: 0: 55414.7. Samples: 297568840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-25 23:38:33,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-25 23:38:34,697][47288] Updated weights for policy 0, policy_version 21256 (0.0026) [2024-04-25 23:38:37,932][47288] Updated weights for policy 0, policy_version 21266 (0.0030) [2024-04-25 23:38:38,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 348471296. Throughput: 0: 55509.2. Samples: 297904240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-25 23:38:38,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-25 23:38:40,805][47288] Updated weights for policy 0, policy_version 21276 (0.0030) [2024-04-25 23:38:43,923][47056] Fps is (10 sec: 54066.8, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 348733440. Throughput: 0: 55549.2. Samples: 298065540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:38:43,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-25 23:38:43,981][47288] Updated weights for policy 0, policy_version 21286 (0.0029) [2024-04-25 23:38:46,564][47288] Updated weights for policy 0, policy_version 21296 (0.0029) [2024-04-25 23:38:48,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 349044736. Throughput: 0: 55506.9. Samples: 298395900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:38:48,924][47056] Avg episode reward: [(0, '0.218')] [2024-04-25 23:38:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000021304_349044736.pth... [2024-04-25 23:38:48,994][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000020490_335708160.pth [2024-04-25 23:38:49,881][47288] Updated weights for policy 0, policy_version 21306 (0.0024) [2024-04-25 23:38:52,620][47288] Updated weights for policy 0, policy_version 21316 (0.0029) [2024-04-25 23:38:53,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 349323264. 
Throughput: 0: 55479.1. Samples: 298729180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:38:53,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-25 23:38:55,649][47288] Updated weights for policy 0, policy_version 21326 (0.0036) [2024-04-25 23:38:58,474][47288] Updated weights for policy 0, policy_version 21336 (0.0030) [2024-04-25 23:38:58,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55978.6, 300 sec: 55483.4). Total num frames: 349585408. Throughput: 0: 55441.8. Samples: 298897040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 23:38:58,923][47056] Avg episode reward: [(0, '0.215')] [2024-04-25 23:39:01,348][47288] Updated weights for policy 0, policy_version 21346 (0.0031) [2024-04-25 23:39:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 349863936. Throughput: 0: 55431.0. Samples: 299233960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 23:39:03,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-25 23:39:04,500][47288] Updated weights for policy 0, policy_version 21356 (0.0031) [2024-04-25 23:39:07,235][47288] Updated weights for policy 0, policy_version 21366 (0.0028) [2024-04-25 23:39:08,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 350126080. Throughput: 0: 55485.7. Samples: 299567500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-25 23:39:08,923][47056] Avg episode reward: [(0, '0.267')] [2024-04-25 23:39:10,439][47288] Updated weights for policy 0, policy_version 21376 (0.0033) [2024-04-25 23:39:13,334][47288] Updated weights for policy 0, policy_version 21386 (0.0036) [2024-04-25 23:39:13,923][47056] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 350404608. Throughput: 0: 55388.6. Samples: 299728720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:39:13,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-25 23:39:16,349][47288] Updated weights for policy 0, policy_version 21396 (0.0031) [2024-04-25 23:39:18,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 350683136. Throughput: 0: 55407.3. Samples: 300062180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:39:18,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-25 23:39:19,184][47288] Updated weights for policy 0, policy_version 21406 (0.0036) [2024-04-25 23:39:21,145][47267] Signal inference workers to stop experience collection... (4200 times) [2024-04-25 23:39:21,179][47288] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-04-25 23:39:21,231][47267] Signal inference workers to resume experience collection... (4200 times) [2024-04-25 23:39:21,231][47288] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-04-25 23:39:22,064][47288] Updated weights for policy 0, policy_version 21416 (0.0029) [2024-04-25 23:39:23,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 350994432. Throughput: 0: 55429.9. Samples: 300398580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:39:23,923][47056] Avg episode reward: [(0, '0.208')] [2024-04-25 23:39:25,117][47288] Updated weights for policy 0, policy_version 21426 (0.0034) [2024-04-25 23:39:27,790][47288] Updated weights for policy 0, policy_version 21436 (0.0031) [2024-04-25 23:39:28,923][47056] Fps is (10 sec: 57345.4, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 351256576. Throughput: 0: 55630.8. Samples: 300568920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:39:28,923][47056] Avg episode reward: [(0, '0.196')] [2024-04-25 23:39:30,807][47288] Updated weights for policy 0, policy_version 21446 (0.0028) [2024-04-25 23:39:33,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 351518720. Throughput: 0: 55787.8. Samples: 300906340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:39:33,923][47056] Avg episode reward: [(0, '0.216')] [2024-04-25 23:39:34,112][47288] Updated weights for policy 0, policy_version 21456 (0.0030) [2024-04-25 23:39:36,637][47288] Updated weights for policy 0, policy_version 21466 (0.0035) [2024-04-25 23:39:38,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 351797248. Throughput: 0: 55746.1. Samples: 301237760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-25 23:39:38,924][47056] Avg episode reward: [(0, '0.204')] [2024-04-25 23:39:39,871][47288] Updated weights for policy 0, policy_version 21476 (0.0027) [2024-04-25 23:39:42,540][47288] Updated weights for policy 0, policy_version 21486 (0.0028) [2024-04-25 23:39:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 352075776. Throughput: 0: 55565.8. Samples: 301397500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:39:43,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-25 23:39:45,752][47288] Updated weights for policy 0, policy_version 21496 (0.0027) [2024-04-25 23:39:48,310][47288] Updated weights for policy 0, policy_version 21506 (0.0029) [2024-04-25 23:39:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 352354304. Throughput: 0: 55538.1. Samples: 301733180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:39:48,924][47056] Avg episode reward: [(0, '0.235')] [2024-04-25 23:39:51,723][47288] Updated weights for policy 0, policy_version 21516 (0.0029) [2024-04-25 23:39:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 352649216. Throughput: 0: 55501.8. Samples: 302065080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:39:53,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-25 23:39:54,634][47288] Updated weights for policy 0, policy_version 21526 (0.0030) [2024-04-25 23:39:57,692][47288] Updated weights for policy 0, policy_version 21536 (0.0030) [2024-04-25 23:39:58,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 352944128. Throughput: 0: 55700.4. Samples: 302235240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:39:58,923][47056] Avg episode reward: [(0, '0.189')] [2024-04-25 23:40:00,547][47288] Updated weights for policy 0, policy_version 21546 (0.0028) [2024-04-25 23:40:03,423][47288] Updated weights for policy 0, policy_version 21556 (0.0029) [2024-04-25 23:40:03,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 353189888. Throughput: 0: 55796.3. Samples: 302573000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:40:03,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-25 23:40:06,479][47288] Updated weights for policy 0, policy_version 21566 (0.0030) [2024-04-25 23:40:08,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 353468416. Throughput: 0: 55592.4. Samples: 302900240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-25 23:40:08,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-25 23:40:09,312][47288] Updated weights for policy 0, policy_version 21576 (0.0034) [2024-04-25 23:40:12,397][47288] Updated weights for policy 0, policy_version 21586 (0.0032) [2024-04-25 23:40:13,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 353746944. Throughput: 0: 55525.7. Samples: 303067580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:40:13,923][47056] Avg episode reward: [(0, '0.183')] [2024-04-25 23:40:15,172][47288] Updated weights for policy 0, policy_version 21596 (0.0028) [2024-04-25 23:40:18,312][47288] Updated weights for policy 0, policy_version 21606 (0.0032) [2024-04-25 23:40:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.7, 300 sec: 55483.4). Total num frames: 354025472. Throughput: 0: 55468.8. Samples: 303402440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:40:18,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-25 23:40:21,222][47288] Updated weights for policy 0, policy_version 21616 (0.0029) [2024-04-25 23:40:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 354287616. Throughput: 0: 55493.5. Samples: 303734960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:40:23,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-25 23:40:24,127][47288] Updated weights for policy 0, policy_version 21626 (0.0026) [2024-04-25 23:40:27,207][47288] Updated weights for policy 0, policy_version 21636 (0.0032) [2024-04-25 23:40:28,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 354615296. Throughput: 0: 55699.0. Samples: 303903960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:40:28,923][47056] Avg episode reward: [(0, '0.229')] [2024-04-25 23:40:29,714][47267] Signal inference workers to stop experience collection... 
(4250 times) [2024-04-25 23:40:29,757][47288] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-04-25 23:40:29,766][47267] Signal inference workers to resume experience collection... (4250 times) [2024-04-25 23:40:29,773][47288] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-04-25 23:40:29,877][47288] Updated weights for policy 0, policy_version 21646 (0.0028) [2024-04-25 23:40:33,200][47288] Updated weights for policy 0, policy_version 21656 (0.0031) [2024-04-25 23:40:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 354861056. Throughput: 0: 55633.8. Samples: 304236700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-25 23:40:33,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-25 23:40:35,756][47288] Updated weights for policy 0, policy_version 21666 (0.0031) [2024-04-25 23:40:38,923][47056] Fps is (10 sec: 50790.8, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 355123200. Throughput: 0: 55545.0. Samples: 304564600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-25 23:40:38,923][47056] Avg episode reward: [(0, '0.209')] [2024-04-25 23:40:38,987][47288] Updated weights for policy 0, policy_version 21676 (0.0030) [2024-04-25 23:40:41,816][47288] Updated weights for policy 0, policy_version 21686 (0.0028) [2024-04-25 23:40:43,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 355401728. Throughput: 0: 55446.3. Samples: 304730320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-25 23:40:43,923][47056] Avg episode reward: [(0, '0.229')] [2024-04-25 23:40:44,878][47288] Updated weights for policy 0, policy_version 21696 (0.0027) [2024-04-25 23:40:47,634][47288] Updated weights for policy 0, policy_version 21706 (0.0033) [2024-04-25 23:40:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 355680256. Throughput: 0: 55438.6. Samples: 305067740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-25 23:40:48,923][47056] Avg episode reward: [(0, '0.226')] [2024-04-25 23:40:49,008][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000021710_355696640.pth... [2024-04-25 23:40:49,055][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000020896_342360064.pth [2024-04-25 23:40:50,676][47288] Updated weights for policy 0, policy_version 21716 (0.0031) [2024-04-25 23:40:53,628][47288] Updated weights for policy 0, policy_version 21726 (0.0034) [2024-04-25 23:40:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 355975168. Throughput: 0: 55586.2. Samples: 305401620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:40:53,923][47056] Avg episode reward: [(0, '0.219')] [2024-04-25 23:40:56,795][47288] Updated weights for policy 0, policy_version 21736 (0.0032) [2024-04-25 23:40:58,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 356253696. Throughput: 0: 55602.9. Samples: 305569720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:40:58,923][47056] Avg episode reward: [(0, '0.209')] [2024-04-25 23:40:59,457][47288] Updated weights for policy 0, policy_version 21746 (0.0025) [2024-04-25 23:41:02,617][47288] Updated weights for policy 0, policy_version 21756 (0.0036) [2024-04-25 23:41:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 356548608. Throughput: 0: 55572.9. Samples: 305903220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:41:03,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-25 23:41:05,361][47288] Updated weights for policy 0, policy_version 21766 (0.0026) [2024-04-25 23:41:08,394][47288] Updated weights for policy 0, policy_version 21776 (0.0026) [2024-04-25 23:41:08,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 356810752. 
Throughput: 0: 55454.7. Samples: 306230420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-25 23:41:08,923][47056] Avg episode reward: [(0, '0.241')] [2024-04-25 23:41:11,083][47288] Updated weights for policy 0, policy_version 21786 (0.0032) [2024-04-25 23:41:13,923][47056] Fps is (10 sec: 50790.9, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 357056512. Throughput: 0: 55331.2. Samples: 306393860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-25 23:41:13,923][47056] Avg episode reward: [(0, '0.233')] [2024-04-25 23:41:14,349][47288] Updated weights for policy 0, policy_version 21796 (0.0026) [2024-04-25 23:41:16,940][47288] Updated weights for policy 0, policy_version 21806 (0.0032) [2024-04-25 23:41:18,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 357367808. Throughput: 0: 55473.7. Samples: 306733020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-25 23:41:18,923][47056] Avg episode reward: [(0, '0.178')] [2024-04-25 23:41:20,289][47288] Updated weights for policy 0, policy_version 21816 (0.0031) [2024-04-25 23:41:22,913][47288] Updated weights for policy 0, policy_version 21826 (0.0028) [2024-04-25 23:41:23,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 357629952. Throughput: 0: 55556.3. Samples: 307064640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-25 23:41:23,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-25 23:41:26,076][47288] Updated weights for policy 0, policy_version 21836 (0.0034) [2024-04-25 23:41:28,805][47288] Updated weights for policy 0, policy_version 21846 (0.0031) [2024-04-25 23:41:28,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 357924864. Throughput: 0: 55614.5. Samples: 307232980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-25 23:41:28,923][47056] Avg episode reward: [(0, '0.182')] [2024-04-25 23:41:31,947][47288] Updated weights for policy 0, policy_version 21856 (0.0027) [2024-04-25 23:41:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 358203392. Throughput: 0: 55403.6. Samples: 307560900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-25 23:41:33,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-25 23:41:34,536][47288] Updated weights for policy 0, policy_version 21866 (0.0029) [2024-04-25 23:41:37,953][47288] Updated weights for policy 0, policy_version 21876 (0.0031) [2024-04-25 23:41:38,197][47267] Signal inference workers to stop experience collection... (4300 times) [2024-04-25 23:41:38,229][47288] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-04-25 23:41:38,288][47267] Signal inference workers to resume experience collection... (4300 times) [2024-04-25 23:41:38,288][47288] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-04-25 23:41:38,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.6, 300 sec: 55539.0). Total num frames: 358498304. Throughput: 0: 55364.9. Samples: 307893040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-25 23:41:38,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-25 23:41:40,583][47288] Updated weights for policy 0, policy_version 21886 (0.0027) [2024-04-25 23:41:43,793][47288] Updated weights for policy 0, policy_version 21896 (0.0036) [2024-04-25 23:41:43,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 358744064. Throughput: 0: 55390.3. Samples: 308062280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-25 23:41:43,923][47056] Avg episode reward: [(0, '0.137')] [2024-04-25 23:41:46,569][47288] Updated weights for policy 0, policy_version 21906 (0.0035) [2024-04-25 23:41:48,923][47056] Fps is (10 sec: 50790.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 359006208. Throughput: 0: 55353.3. Samples: 308394120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-25 23:41:48,923][47056] Avg episode reward: [(0, '0.255')] [2024-04-25 23:41:49,702][47288] Updated weights for policy 0, policy_version 21916 (0.0029) [2024-04-25 23:41:52,434][47288] Updated weights for policy 0, policy_version 21926 (0.0026) [2024-04-25 23:41:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 359284736. Throughput: 0: 55574.6. Samples: 308731280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:41:53,923][47056] Avg episode reward: [(0, '0.236')] [2024-04-25 23:41:55,527][47288] Updated weights for policy 0, policy_version 21936 (0.0027) [2024-04-25 23:41:58,217][47288] Updated weights for policy 0, policy_version 21946 (0.0032) [2024-04-25 23:41:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 359579648. Throughput: 0: 55507.9. Samples: 308891720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:41:58,923][47056] Avg episode reward: [(0, '0.233')] [2024-04-25 23:42:01,315][47288] Updated weights for policy 0, policy_version 21956 (0.0030) [2024-04-25 23:42:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 359858176. Throughput: 0: 55434.4. Samples: 309227560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:42:03,923][47056] Avg episode reward: [(0, '0.193')] [2024-04-25 23:42:04,089][47288] Updated weights for policy 0, policy_version 21966 (0.0029) [2024-04-25 23:42:07,248][47288] Updated weights for policy 0, policy_version 21976 (0.0030) [2024-04-25 23:42:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 360153088. Throughput: 0: 55388.5. Samples: 309557120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-04-25 23:42:08,923][47056] Avg episode reward: [(0, '0.197')] [2024-04-25 23:42:10,083][47288] Updated weights for policy 0, policy_version 21986 (0.0033) [2024-04-25 23:42:13,008][47288] Updated weights for policy 0, policy_version 21996 (0.0033) [2024-04-25 23:42:13,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 55539.0). Total num frames: 360431616. Throughput: 0: 55612.7. Samples: 309735540. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-04-25 23:42:13,923][47056] Avg episode reward: [(0, '0.221')] [2024-04-25 23:42:15,927][47288] Updated weights for policy 0, policy_version 22006 (0.0028) [2024-04-25 23:42:18,868][47288] Updated weights for policy 0, policy_version 22016 (0.0027) [2024-04-25 23:42:18,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 360710144. Throughput: 0: 55843.2. Samples: 310073840. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-04-25 23:42:18,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-25 23:42:21,680][47288] Updated weights for policy 0, policy_version 22026 (0.0032) [2024-04-25 23:42:23,923][47056] Fps is (10 sec: 50790.0, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 360939520. Throughput: 0: 55903.2. Samples: 310408680. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-25 23:42:23,923][47056] Avg episode reward: [(0, '0.216')] [2024-04-25 23:42:24,773][47288] Updated weights for policy 0, policy_version 22036 (0.0028) [2024-04-25 23:42:27,601][47288] Updated weights for policy 0, policy_version 22046 (0.0032) [2024-04-25 23:42:28,923][47056] Fps is (10 sec: 54063.2, 60 sec: 55432.0, 300 sec: 55538.9). Total num frames: 361250816. Throughput: 0: 55606.4. Samples: 310564600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-25 23:42:28,924][47056] Avg episode reward: [(0, '0.247')] [2024-04-25 23:42:30,675][47288] Updated weights for policy 0, policy_version 22056 (0.0031) [2024-04-25 23:42:33,544][47288] Updated weights for policy 0, policy_version 22066 (0.0028) [2024-04-25 23:42:33,923][47056] Fps is (10 sec: 60620.8, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 361545728. Throughput: 0: 55605.9. Samples: 310896380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-25 23:42:33,923][47056] Avg episode reward: [(0, '0.207')] [2024-04-25 23:42:36,567][47288] Updated weights for policy 0, policy_version 22076 (0.0031) [2024-04-25 23:42:38,923][47056] Fps is (10 sec: 54070.7, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 361791488. Throughput: 0: 55605.8. Samples: 311233540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-25 23:42:38,923][47056] Avg episode reward: [(0, '0.263')] [2024-04-25 23:42:39,493][47288] Updated weights for policy 0, policy_version 22086 (0.0024) [2024-04-25 23:42:41,867][47267] Signal inference workers to stop experience collection... (4350 times) [2024-04-25 23:42:41,915][47288] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-04-25 23:42:41,925][47267] Signal inference workers to resume experience collection... 
(4350 times) [2024-04-25 23:42:41,931][47288] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-04-25 23:42:42,345][47288] Updated weights for policy 0, policy_version 22096 (0.0026) [2024-04-25 23:42:43,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 362102784. Throughput: 0: 55982.2. Samples: 311410920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 23:42:43,923][47056] Avg episode reward: [(0, '0.275')] [2024-04-25 23:42:45,378][47288] Updated weights for policy 0, policy_version 22106 (0.0027) [2024-04-25 23:42:48,167][47288] Updated weights for policy 0, policy_version 22116 (0.0027) [2024-04-25 23:42:48,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56524.8, 300 sec: 55650.0). Total num frames: 362397696. Throughput: 0: 56007.4. Samples: 311747900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 23:42:48,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-25 23:42:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000022119_362397696.pth... [2024-04-25 23:42:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000021304_349044736.pth [2024-04-25 23:42:51,116][47288] Updated weights for policy 0, policy_version 22126 (0.0035) [2024-04-25 23:42:53,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 362643456. Throughput: 0: 55909.6. Samples: 312073060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-25 23:42:53,923][47056] Avg episode reward: [(0, '0.206')] [2024-04-25 23:42:54,104][47288] Updated weights for policy 0, policy_version 22136 (0.0035) [2024-04-25 23:42:57,084][47288] Updated weights for policy 0, policy_version 22146 (0.0025) [2024-04-25 23:42:58,923][47056] Fps is (10 sec: 49152.3, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 362889216. Throughput: 0: 55620.7. Samples: 312238480. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 24.0) [2024-04-25 23:42:58,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-25 23:42:59,987][47288] Updated weights for policy 0, policy_version 22156 (0.0029) [2024-04-25 23:43:03,181][47288] Updated weights for policy 0, policy_version 22166 (0.0028) [2024-04-25 23:43:03,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 363184128. Throughput: 0: 55372.7. Samples: 312565620. Policy #0 lag: (min: 2.0, avg: 10.8, max: 24.0) [2024-04-25 23:43:03,923][47056] Avg episode reward: [(0, '0.228')] [2024-04-25 23:43:06,057][47288] Updated weights for policy 0, policy_version 22176 (0.0029) [2024-04-25 23:43:08,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 363479040. Throughput: 0: 55288.4. Samples: 312896660. Policy #0 lag: (min: 2.0, avg: 10.8, max: 24.0) [2024-04-25 23:43:08,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-25 23:43:08,942][47288] Updated weights for policy 0, policy_version 22186 (0.0028) [2024-04-25 23:43:11,836][47288] Updated weights for policy 0, policy_version 22196 (0.0026) [2024-04-25 23:43:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 363757568. Throughput: 0: 55547.4. Samples: 313064200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:43:13,923][47056] Avg episode reward: [(0, '0.218')] [2024-04-25 23:43:15,171][47288] Updated weights for policy 0, policy_version 22206 (0.0033) [2024-04-25 23:43:17,791][47288] Updated weights for policy 0, policy_version 22216 (0.0029) [2024-04-25 23:43:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 364036096. Throughput: 0: 55542.6. Samples: 313395800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:43:18,923][47056] Avg episode reward: [(0, '0.169')] [2024-04-25 23:43:21,037][47288] Updated weights for policy 0, policy_version 22226 (0.0030) [2024-04-25 23:43:23,594][47288] Updated weights for policy 0, policy_version 22236 (0.0032) [2024-04-25 23:43:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.7, 300 sec: 55594.5). Total num frames: 364331008. Throughput: 0: 55409.3. Samples: 313726960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:43:23,923][47056] Avg episode reward: [(0, '0.179')] [2024-04-25 23:43:27,023][47288] Updated weights for policy 0, policy_version 22246 (0.0037) [2024-04-25 23:43:28,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55433.2, 300 sec: 55539.0). Total num frames: 364576768. Throughput: 0: 55291.3. Samples: 313899020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-25 23:43:28,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-25 23:43:29,432][47288] Updated weights for policy 0, policy_version 22256 (0.0031) [2024-04-25 23:43:33,094][47288] Updated weights for policy 0, policy_version 22266 (0.0027) [2024-04-25 23:43:33,923][47056] Fps is (10 sec: 50790.6, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 364838912. Throughput: 0: 55254.8. Samples: 314234360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-25 23:43:33,923][47056] Avg episode reward: [(0, '0.227')] [2024-04-25 23:43:35,294][47288] Updated weights for policy 0, policy_version 22276 (0.0027) [2024-04-25 23:43:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 365117440. Throughput: 0: 55570.4. Samples: 314573720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-25 23:43:38,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-25 23:43:38,972][47288] Updated weights for policy 0, policy_version 22286 (0.0028) [2024-04-25 23:43:39,833][47267] Signal inference workers to stop experience collection... 
(4400 times) [2024-04-25 23:43:39,837][47267] Signal inference workers to resume experience collection... (4400 times) [2024-04-25 23:43:39,862][47288] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-04-25 23:43:39,862][47288] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-04-25 23:43:41,213][47288] Updated weights for policy 0, policy_version 22296 (0.0028) [2024-04-25 23:43:43,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 365428736. Throughput: 0: 55284.4. Samples: 314726280. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-04-25 23:43:43,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-25 23:43:44,937][47288] Updated weights for policy 0, policy_version 22306 (0.0029) [2024-04-25 23:43:47,089][47288] Updated weights for policy 0, policy_version 22316 (0.0031) [2024-04-25 23:43:48,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 365707264. Throughput: 0: 55456.5. Samples: 315061160. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-04-25 23:43:48,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-25 23:43:50,652][47288] Updated weights for policy 0, policy_version 22326 (0.0029) [2024-04-25 23:43:53,005][47288] Updated weights for policy 0, policy_version 22336 (0.0026) [2024-04-25 23:43:53,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 365985792. Throughput: 0: 55608.3. Samples: 315399040. Policy #0 lag: (min: 2.0, avg: 10.7, max: 25.0) [2024-04-25 23:43:53,923][47056] Avg episode reward: [(0, '0.236')] [2024-04-25 23:43:56,757][47288] Updated weights for policy 0, policy_version 22346 (0.0037) [2024-04-25 23:43:58,769][47288] Updated weights for policy 0, policy_version 22356 (0.0031) [2024-04-25 23:43:58,923][47056] Fps is (10 sec: 57342.5, 60 sec: 56524.6, 300 sec: 55650.0). Total num frames: 366280704. Throughput: 0: 55838.9. Samples: 315576960. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-25 23:43:58,924][47056] Avg episode reward: [(0, '0.197')] [2024-04-25 23:44:02,704][47288] Updated weights for policy 0, policy_version 22366 (0.0038) [2024-04-25 23:44:03,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 366542848. Throughput: 0: 56039.2. Samples: 315917560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-25 23:44:03,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-25 23:44:04,577][47288] Updated weights for policy 0, policy_version 22376 (0.0028) [2024-04-25 23:44:08,486][47288] Updated weights for policy 0, policy_version 22386 (0.0031) [2024-04-25 23:44:08,923][47056] Fps is (10 sec: 50791.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 366788608. Throughput: 0: 56008.5. Samples: 316247340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-25 23:44:08,923][47056] Avg episode reward: [(0, '0.240')] [2024-04-25 23:44:10,486][47288] Updated weights for policy 0, policy_version 22396 (0.0030) [2024-04-25 23:44:13,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 367067136. Throughput: 0: 55640.7. Samples: 316402860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-25 23:44:13,923][47056] Avg episode reward: [(0, '0.245')] [2024-04-25 23:44:14,560][47288] Updated weights for policy 0, policy_version 22406 (0.0032) [2024-04-25 23:44:16,413][47288] Updated weights for policy 0, policy_version 22416 (0.0027) [2024-04-25 23:44:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 367362048. Throughput: 0: 55573.3. Samples: 316735160. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-25 23:44:18,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-25 23:44:20,437][47288] Updated weights for policy 0, policy_version 22426 (0.0027) [2024-04-25 23:44:22,295][47288] Updated weights for policy 0, policy_version 22436 (0.0036) [2024-04-25 23:44:23,923][47056] Fps is (10 sec: 60620.8, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 367673344. Throughput: 0: 55473.3. Samples: 317070020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-25 23:44:23,923][47056] Avg episode reward: [(0, '0.235')] [2024-04-25 23:44:26,239][47288] Updated weights for policy 0, policy_version 22446 (0.0030) [2024-04-25 23:44:26,691][47267] Signal inference workers to stop experience collection... (4450 times) [2024-04-25 23:44:26,738][47288] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-04-25 23:44:26,745][47267] Signal inference workers to resume experience collection... (4450 times) [2024-04-25 23:44:26,752][47288] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-04-25 23:44:28,205][47288] Updated weights for policy 0, policy_version 22456 (0.0024) [2024-04-25 23:44:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.4, 300 sec: 55650.0). Total num frames: 367935488. Throughput: 0: 56074.1. Samples: 317249620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-04-25 23:44:28,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-25 23:44:32,038][47288] Updated weights for policy 0, policy_version 22466 (0.0030) [2024-04-25 23:44:33,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.7, 300 sec: 55705.6). Total num frames: 368230400. Throughput: 0: 56016.2. Samples: 317581900. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-04-25 23:44:33,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-25 23:44:34,201][47288] Updated weights for policy 0, policy_version 22476 (0.0028) [2024-04-25 23:44:37,791][47288] Updated weights for policy 0, policy_version 22486 (0.0033) [2024-04-25 23:44:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 368492544. Throughput: 0: 55888.9. Samples: 317914040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-04-25 23:44:38,923][47056] Avg episode reward: [(0, '0.215')] [2024-04-25 23:44:39,879][47288] Updated weights for policy 0, policy_version 22496 (0.0027) [2024-04-25 23:44:43,714][47288] Updated weights for policy 0, policy_version 22506 (0.0043) [2024-04-25 23:44:43,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 368754688. Throughput: 0: 55518.0. Samples: 318075260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 23:44:43,923][47056] Avg episode reward: [(0, '0.205')] [2024-04-25 23:44:45,761][47288] Updated weights for policy 0, policy_version 22516 (0.0029) [2024-04-25 23:44:48,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 369016832. Throughput: 0: 55407.0. Samples: 318410880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 23:44:48,924][47056] Avg episode reward: [(0, '0.253')] [2024-04-25 23:44:49,031][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000022524_369033216.pth... [2024-04-25 23:44:49,079][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000021710_355696640.pth [2024-04-25 23:44:49,592][47288] Updated weights for policy 0, policy_version 22526 (0.0027) [2024-04-25 23:44:51,727][47288] Updated weights for policy 0, policy_version 22536 (0.0027) [2024-04-25 23:44:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 369311744. 
Throughput: 0: 55515.0. Samples: 318745520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-25 23:44:53,923][47056] Avg episode reward: [(0, '0.189')] [2024-04-25 23:44:55,404][47288] Updated weights for policy 0, policy_version 22546 (0.0029) [2024-04-25 23:44:57,476][47288] Updated weights for policy 0, policy_version 22556 (0.0033) [2024-04-25 23:44:58,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 369606656. Throughput: 0: 55735.0. Samples: 318910940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-25 23:44:58,923][47056] Avg episode reward: [(0, '0.168')] [2024-04-25 23:45:01,175][47288] Updated weights for policy 0, policy_version 22566 (0.0025) [2024-04-25 23:45:03,194][47288] Updated weights for policy 0, policy_version 22576 (0.0026) [2024-04-25 23:45:03,923][47056] Fps is (10 sec: 58982.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 369901568. Throughput: 0: 55792.1. Samples: 319245800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-25 23:45:03,923][47056] Avg episode reward: [(0, '0.223')] [2024-04-25 23:45:07,079][47288] Updated weights for policy 0, policy_version 22586 (0.0028) [2024-04-25 23:45:08,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 370180096. Throughput: 0: 55848.5. Samples: 319583200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-25 23:45:08,923][47056] Avg episode reward: [(0, '0.221')] [2024-04-25 23:45:09,540][47288] Updated weights for policy 0, policy_version 22596 (0.0026) [2024-04-25 23:45:12,964][47288] Updated weights for policy 0, policy_version 22606 (0.0031) [2024-04-25 23:45:13,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55978.8, 300 sec: 55594.6). Total num frames: 370425856. Throughput: 0: 55655.8. Samples: 319754120. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-25 23:45:13,923][47056] Avg episode reward: [(0, '0.269')] [2024-04-25 23:45:15,458][47288] Updated weights for policy 0, policy_version 22616 (0.0027) [2024-04-25 23:45:18,693][47288] Updated weights for policy 0, policy_version 22626 (0.0029) [2024-04-25 23:45:18,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 370704384. Throughput: 0: 55615.2. Samples: 320084580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-25 23:45:18,923][47056] Avg episode reward: [(0, '0.183')] [2024-04-25 23:45:21,173][47288] Updated weights for policy 0, policy_version 22636 (0.0030) [2024-04-25 23:45:22,079][47267] Signal inference workers to stop experience collection... (4500 times) [2024-04-25 23:45:22,097][47288] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-04-25 23:45:22,176][47267] Signal inference workers to resume experience collection... (4500 times) [2024-04-25 23:45:22,177][47288] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-04-25 23:45:23,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 370982912. Throughput: 0: 55769.9. Samples: 320423680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-25 23:45:23,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-25 23:45:24,437][47288] Updated weights for policy 0, policy_version 22646 (0.0030) [2024-04-25 23:45:26,931][47288] Updated weights for policy 0, policy_version 22656 (0.0032) [2024-04-25 23:45:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 371261440. Throughput: 0: 55587.2. Samples: 320576680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 23:45:28,923][47056] Avg episode reward: [(0, '0.226')] [2024-04-25 23:45:30,334][47288] Updated weights for policy 0, policy_version 22666 (0.0032) [2024-04-25 23:45:32,861][47288] Updated weights for policy 0, policy_version 22676 (0.0029) [2024-04-25 23:45:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 371556352. Throughput: 0: 55613.9. Samples: 320913500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 23:45:33,923][47056] Avg episode reward: [(0, '0.231')] [2024-04-25 23:45:36,218][47288] Updated weights for policy 0, policy_version 22686 (0.0026) [2024-04-25 23:45:38,834][47288] Updated weights for policy 0, policy_version 22696 (0.0022) [2024-04-25 23:45:38,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 371851264. Throughput: 0: 55633.4. Samples: 321249020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 23:45:38,923][47056] Avg episode reward: [(0, '0.178')] [2024-04-25 23:45:42,452][47288] Updated weights for policy 0, policy_version 22706 (0.0034) [2024-04-25 23:45:43,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.9, 300 sec: 55761.1). Total num frames: 372129792. Throughput: 0: 55924.7. Samples: 321427540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-25 23:45:43,923][47056] Avg episode reward: [(0, '0.181')] [2024-04-25 23:45:44,663][47288] Updated weights for policy 0, policy_version 22716 (0.0027) [2024-04-25 23:45:48,368][47288] Updated weights for policy 0, policy_version 22726 (0.0026) [2024-04-25 23:45:48,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 372375552. Throughput: 0: 55864.0. Samples: 321759680. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 23.0) [2024-04-25 23:45:48,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-25 23:45:50,520][47288] Updated weights for policy 0, policy_version 22736 (0.0033) [2024-04-25 23:45:53,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 372654080. Throughput: 0: 55710.7. Samples: 322090180. Policy #0 lag: (min: 0.0, avg: 13.0, max: 23.0) [2024-04-25 23:45:53,923][47056] Avg episode reward: [(0, '0.240')] [2024-04-25 23:45:54,399][47288] Updated weights for policy 0, policy_version 22746 (0.0029) [2024-04-25 23:45:56,557][47288] Updated weights for policy 0, policy_version 22756 (0.0032) [2024-04-25 23:45:58,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 372932608. Throughput: 0: 55491.1. Samples: 322251220. Policy #0 lag: (min: 0.0, avg: 13.0, max: 23.0) [2024-04-25 23:45:58,923][47056] Avg episode reward: [(0, '0.250')] [2024-04-25 23:46:00,104][47288] Updated weights for policy 0, policy_version 22766 (0.0031) [2024-04-25 23:46:02,594][47288] Updated weights for policy 0, policy_version 22776 (0.0026) [2024-04-25 23:46:03,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 373211136. Throughput: 0: 55762.8. Samples: 322593900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 23:46:03,923][47056] Avg episode reward: [(0, '0.213')] [2024-04-25 23:46:05,930][47288] Updated weights for policy 0, policy_version 22786 (0.0035) [2024-04-25 23:46:08,429][47288] Updated weights for policy 0, policy_version 22796 (0.0029) [2024-04-25 23:46:08,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 373489664. Throughput: 0: 55551.0. Samples: 322923480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 23:46:08,923][47056] Avg episode reward: [(0, '0.235')] [2024-04-25 23:46:11,799][47288] Updated weights for policy 0, policy_version 22806 (0.0031) [2024-04-25 23:46:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 373784576. Throughput: 0: 55988.0. Samples: 323096140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 23:46:13,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-25 23:46:14,208][47288] Updated weights for policy 0, policy_version 22816 (0.0027) [2024-04-25 23:46:17,748][47288] Updated weights for policy 0, policy_version 22826 (0.0030) [2024-04-25 23:46:18,461][47267] Signal inference workers to stop experience collection... (4550 times) [2024-04-25 23:46:18,461][47267] Signal inference workers to resume experience collection... (4550 times) [2024-04-25 23:46:18,488][47288] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-04-25 23:46:18,489][47288] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-04-25 23:46:18,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 374079488. Throughput: 0: 55944.9. Samples: 323431020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 23:46:18,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-25 23:46:20,168][47288] Updated weights for policy 0, policy_version 22836 (0.0026) [2024-04-25 23:46:23,489][47288] Updated weights for policy 0, policy_version 22846 (0.0030) [2024-04-25 23:46:23,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 374358016. Throughput: 0: 55940.0. Samples: 323766320. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 23:46:23,923][47056] Avg episode reward: [(0, '0.190')] [2024-04-25 23:46:26,150][47288] Updated weights for policy 0, policy_version 22856 (0.0035) [2024-04-25 23:46:28,923][47056] Fps is (10 sec: 50790.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 374587392. Throughput: 0: 55597.7. Samples: 323929440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-25 23:46:28,923][47056] Avg episode reward: [(0, '0.250')] [2024-04-25 23:46:29,394][47288] Updated weights for policy 0, policy_version 22866 (0.0031) [2024-04-25 23:46:31,926][47288] Updated weights for policy 0, policy_version 22876 (0.0026) [2024-04-25 23:46:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 374898688. Throughput: 0: 55752.5. Samples: 324268540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:46:33,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-25 23:46:35,213][47288] Updated weights for policy 0, policy_version 22886 (0.0032) [2024-04-25 23:46:37,709][47288] Updated weights for policy 0, policy_version 22896 (0.0030) [2024-04-25 23:46:38,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 375177216. Throughput: 0: 55872.8. Samples: 324604460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:46:38,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-25 23:46:41,019][47288] Updated weights for policy 0, policy_version 22906 (0.0026) [2024-04-25 23:46:43,828][47288] Updated weights for policy 0, policy_version 22916 (0.0029) [2024-04-25 23:46:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 375455744. Throughput: 0: 55921.8. Samples: 324767700. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-25 23:46:43,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-25 23:46:46,938][47288] Updated weights for policy 0, policy_version 22926 (0.0030) [2024-04-25 23:46:48,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 375734272. Throughput: 0: 55788.3. Samples: 325104380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-25 23:46:48,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-25 23:46:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000022933_375734272.pth... [2024-04-25 23:46:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000022119_362397696.pth [2024-04-25 23:46:49,970][47288] Updated weights for policy 0, policy_version 22936 (0.0030) [2024-04-25 23:46:52,650][47288] Updated weights for policy 0, policy_version 22946 (0.0026) [2024-04-25 23:46:53,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 376045568. Throughput: 0: 55927.1. Samples: 325440200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-25 23:46:53,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-25 23:46:56,083][47288] Updated weights for policy 0, policy_version 22956 (0.0032) [2024-04-25 23:46:58,525][47288] Updated weights for policy 0, policy_version 22966 (0.0026) [2024-04-25 23:46:58,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 376307712. Throughput: 0: 55772.9. Samples: 325605920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-25 23:46:58,923][47056] Avg episode reward: [(0, '0.223')] [2024-04-25 23:47:01,761][47288] Updated weights for policy 0, policy_version 22976 (0.0034) [2024-04-25 23:47:02,001][47267] Signal inference workers to stop experience collection... (4600 times) [2024-04-25 23:47:02,002][47267] Signal inference workers to resume experience collection... 
(4600 times) [2024-04-25 23:47:02,028][47288] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-04-25 23:47:02,028][47288] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-04-25 23:47:03,923][47056] Fps is (10 sec: 50789.6, 60 sec: 55705.3, 300 sec: 55594.5). Total num frames: 376553472. Throughput: 0: 55757.5. Samples: 325940120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-25 23:47:03,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-25 23:47:04,448][47288] Updated weights for policy 0, policy_version 22986 (0.0028) [2024-04-25 23:47:07,539][47288] Updated weights for policy 0, policy_version 22996 (0.0026) [2024-04-25 23:47:08,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 376864768. Throughput: 0: 55681.4. Samples: 326271980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-25 23:47:08,923][47056] Avg episode reward: [(0, '0.204')] [2024-04-25 23:47:10,318][47288] Updated weights for policy 0, policy_version 23006 (0.0028) [2024-04-25 23:47:13,368][47288] Updated weights for policy 0, policy_version 23016 (0.0033) [2024-04-25 23:47:13,923][47056] Fps is (10 sec: 57345.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 377126912. Throughput: 0: 55669.4. Samples: 326434560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-25 23:47:13,923][47056] Avg episode reward: [(0, '0.214')] [2024-04-25 23:47:16,245][47288] Updated weights for policy 0, policy_version 23026 (0.0033) [2024-04-25 23:47:18,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 377389056. Throughput: 0: 55682.6. Samples: 326774260. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-25 23:47:18,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-25 23:47:19,254][47288] Updated weights for policy 0, policy_version 23036 (0.0033) [2024-04-25 23:47:21,998][47288] Updated weights for policy 0, policy_version 23046 (0.0029) [2024-04-25 23:47:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55705.7). Total num frames: 377683968. Throughput: 0: 55688.1. Samples: 327110420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-25 23:47:23,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-25 23:47:25,180][47288] Updated weights for policy 0, policy_version 23056 (0.0035) [2024-04-25 23:47:27,812][47288] Updated weights for policy 0, policy_version 23066 (0.0027) [2024-04-25 23:47:28,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.7, 300 sec: 55705.6). Total num frames: 377978880. Throughput: 0: 55701.6. Samples: 327274280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-25 23:47:28,923][47056] Avg episode reward: [(0, '0.181')] [2024-04-25 23:47:31,038][47288] Updated weights for policy 0, policy_version 23076 (0.0034) [2024-04-25 23:47:33,709][47288] Updated weights for policy 0, policy_version 23086 (0.0027) [2024-04-25 23:47:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 378257408. Throughput: 0: 55684.4. Samples: 327610180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 23:47:33,923][47056] Avg episode reward: [(0, '0.250')] [2024-04-25 23:47:36,964][47288] Updated weights for policy 0, policy_version 23096 (0.0027) [2024-04-25 23:47:38,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 378535936. Throughput: 0: 55769.8. Samples: 327949840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 23:47:38,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-25 23:47:39,377][47288] Updated weights for policy 0, policy_version 23106 (0.0032) [2024-04-25 23:47:42,734][47288] Updated weights for policy 0, policy_version 23116 (0.0030) [2024-04-25 23:47:43,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 378814464. Throughput: 0: 55904.0. Samples: 328121600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 23:47:43,923][47056] Avg episode reward: [(0, '0.196')] [2024-04-25 23:47:45,203][47288] Updated weights for policy 0, policy_version 23126 (0.0029) [2024-04-25 23:47:48,646][47288] Updated weights for policy 0, policy_version 23136 (0.0032) [2024-04-25 23:47:48,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 379076608. Throughput: 0: 55869.3. Samples: 328454220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-25 23:47:48,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-25 23:47:51,220][47288] Updated weights for policy 0, policy_version 23146 (0.0029) [2024-04-25 23:47:53,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 379355136. Throughput: 0: 55995.1. Samples: 328791760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 23:47:53,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-25 23:47:54,448][47288] Updated weights for policy 0, policy_version 23156 (0.0028) [2024-04-25 23:47:57,135][47288] Updated weights for policy 0, policy_version 23166 (0.0030) [2024-04-25 23:47:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 379633664. Throughput: 0: 56083.0. Samples: 328958300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 23:47:58,924][47056] Avg episode reward: [(0, '0.219')] [2024-04-25 23:48:00,025][47267] Signal inference workers to stop experience collection... 
(4650 times) [2024-04-25 23:48:00,054][47288] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-04-25 23:48:00,080][47267] Signal inference workers to resume experience collection... (4650 times) [2024-04-25 23:48:00,080][47288] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-04-25 23:48:00,197][47288] Updated weights for policy 0, policy_version 23176 (0.0033) [2024-04-25 23:48:02,912][47288] Updated weights for policy 0, policy_version 23186 (0.0031) [2024-04-25 23:48:03,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56252.0, 300 sec: 55761.2). Total num frames: 379928576. Throughput: 0: 55926.8. Samples: 329290960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-25 23:48:03,923][47056] Avg episode reward: [(0, '0.207')] [2024-04-25 23:48:05,944][47288] Updated weights for policy 0, policy_version 23196 (0.0033) [2024-04-25 23:48:08,698][47288] Updated weights for policy 0, policy_version 23206 (0.0031) [2024-04-25 23:48:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 380207104. Throughput: 0: 55912.8. Samples: 329626500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 23:48:08,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-25 23:48:11,906][47288] Updated weights for policy 0, policy_version 23216 (0.0029) [2024-04-25 23:48:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 380469248. Throughput: 0: 55997.1. Samples: 329794140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 23:48:13,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-25 23:48:14,618][47288] Updated weights for policy 0, policy_version 23226 (0.0034) [2024-04-25 23:48:17,980][47288] Updated weights for policy 0, policy_version 23236 (0.0029) [2024-04-25 23:48:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 380764160. Throughput: 0: 56106.3. Samples: 330134960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 23:48:18,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-25 23:48:20,495][47288] Updated weights for policy 0, policy_version 23246 (0.0032) [2024-04-25 23:48:23,865][47288] Updated weights for policy 0, policy_version 23256 (0.0028) [2024-04-25 23:48:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 381026304. Throughput: 0: 55923.7. Samples: 330466400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-25 23:48:23,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-25 23:48:26,333][47288] Updated weights for policy 0, policy_version 23266 (0.0032) [2024-04-25 23:48:28,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 381304832. Throughput: 0: 55587.5. Samples: 330623040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-25 23:48:28,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-25 23:48:29,741][47288] Updated weights for policy 0, policy_version 23276 (0.0032) [2024-04-25 23:48:32,098][47288] Updated weights for policy 0, policy_version 23286 (0.0035) [2024-04-25 23:48:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 381599744. Throughput: 0: 55696.3. Samples: 330960560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-25 23:48:33,923][47056] Avg episode reward: [(0, '0.204')] [2024-04-25 23:48:35,711][47288] Updated weights for policy 0, policy_version 23296 (0.0026) [2024-04-25 23:48:38,177][47288] Updated weights for policy 0, policy_version 23306 (0.0027) [2024-04-25 23:48:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 381878272. Throughput: 0: 55553.8. Samples: 331291680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 23:48:38,923][47056] Avg episode reward: [(0, '0.269')] [2024-04-25 23:48:41,518][47288] Updated weights for policy 0, policy_version 23316 (0.0030) [2024-04-25 23:48:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 382156800. Throughput: 0: 55726.1. Samples: 331465980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 23:48:43,923][47056] Avg episode reward: [(0, '0.205')] [2024-04-25 23:48:44,094][47288] Updated weights for policy 0, policy_version 23326 (0.0031) [2024-04-25 23:48:47,333][47288] Updated weights for policy 0, policy_version 23336 (0.0031) [2024-04-25 23:48:48,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 382435328. Throughput: 0: 55716.2. Samples: 331798200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-25 23:48:48,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-25 23:48:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000023342_382435328.pth... [2024-04-25 23:48:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000022524_369033216.pth [2024-04-25 23:48:49,914][47288] Updated weights for policy 0, policy_version 23346 (0.0034) [2024-04-25 23:48:53,205][47288] Updated weights for policy 0, policy_version 23356 (0.0029) [2024-04-25 23:48:53,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 382697472. Throughput: 0: 55705.9. Samples: 332133260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 23:48:53,923][47056] Avg episode reward: [(0, '0.240')] [2024-04-25 23:48:55,653][47288] Updated weights for policy 0, policy_version 23366 (0.0034) [2024-04-25 23:48:58,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 382959616. Throughput: 0: 55693.4. Samples: 332300360. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 23:48:58,924][47056] Avg episode reward: [(0, '0.212')] [2024-04-25 23:48:59,202][47288] Updated weights for policy 0, policy_version 23376 (0.0033) [2024-04-25 23:49:01,432][47288] Updated weights for policy 0, policy_version 23386 (0.0029) [2024-04-25 23:49:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 383254528. Throughput: 0: 55448.9. Samples: 332630160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 23:49:03,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-25 23:49:05,058][47288] Updated weights for policy 0, policy_version 23396 (0.0032) [2024-04-25 23:49:07,405][47288] Updated weights for policy 0, policy_version 23406 (0.0030) [2024-04-25 23:49:08,923][47056] Fps is (10 sec: 58983.6, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 383549440. Throughput: 0: 55525.7. Samples: 332965060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-25 23:49:08,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-25 23:49:10,900][47288] Updated weights for policy 0, policy_version 23416 (0.0027) [2024-04-25 23:49:11,504][47267] Signal inference workers to stop experience collection... (4700 times) [2024-04-25 23:49:11,545][47288] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-04-25 23:49:11,555][47267] Signal inference workers to resume experience collection... (4700 times) [2024-04-25 23:49:11,561][47288] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-04-25 23:49:13,312][47288] Updated weights for policy 0, policy_version 23426 (0.0025) [2024-04-25 23:49:13,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 383827968. Throughput: 0: 56003.5. Samples: 333143200. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-25 23:49:13,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-25 23:49:13,924][47267] Saving new best policy, reward=0.315! [2024-04-25 23:49:16,677][47288] Updated weights for policy 0, policy_version 23436 (0.0024) [2024-04-25 23:49:18,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 384122880. Throughput: 0: 55848.5. Samples: 333473740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-25 23:49:18,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-25 23:49:19,069][47288] Updated weights for policy 0, policy_version 23446 (0.0032) [2024-04-25 23:49:22,580][47288] Updated weights for policy 0, policy_version 23456 (0.0030) [2024-04-25 23:49:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 384368640. Throughput: 0: 55854.3. Samples: 333805120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-25 23:49:23,923][47056] Avg episode reward: [(0, '0.273')] [2024-04-25 23:49:25,009][47288] Updated weights for policy 0, policy_version 23466 (0.0035) [2024-04-25 23:49:28,533][47288] Updated weights for policy 0, policy_version 23476 (0.0033) [2024-04-25 23:49:28,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 384647168. Throughput: 0: 55636.6. Samples: 333969620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:49:28,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-25 23:49:30,820][47288] Updated weights for policy 0, policy_version 23486 (0.0029) [2024-04-25 23:49:33,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 384909312. Throughput: 0: 55676.0. Samples: 334303620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:49:33,923][47056] Avg episode reward: [(0, '0.275')] [2024-04-25 23:49:34,361][47288] Updated weights for policy 0, policy_version 23496 (0.0032) [2024-04-25 23:49:36,692][47288] Updated weights for policy 0, policy_version 23506 (0.0031) [2024-04-25 23:49:38,924][47056] Fps is (10 sec: 57334.6, 60 sec: 55704.2, 300 sec: 55816.4). Total num frames: 385220608. Throughput: 0: 55638.9. Samples: 334637100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-25 23:49:38,925][47056] Avg episode reward: [(0, '0.232')] [2024-04-25 23:49:40,468][47288] Updated weights for policy 0, policy_version 23516 (0.0030) [2024-04-25 23:49:42,743][47288] Updated weights for policy 0, policy_version 23526 (0.0033) [2024-04-25 23:49:43,923][47056] Fps is (10 sec: 60621.2, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 385515520. Throughput: 0: 55685.6. Samples: 334806200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:49:43,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-25 23:49:46,276][47288] Updated weights for policy 0, policy_version 23536 (0.0028) [2024-04-25 23:49:48,757][47288] Updated weights for policy 0, policy_version 23546 (0.0034) [2024-04-25 23:49:48,923][47056] Fps is (10 sec: 55713.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 385777664. Throughput: 0: 55800.3. Samples: 335141180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:49:48,932][47056] Avg episode reward: [(0, '0.196')] [2024-04-25 23:49:52,155][47288] Updated weights for policy 0, policy_version 23556 (0.0028) [2024-04-25 23:49:53,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 386039808. Throughput: 0: 55716.6. Samples: 335472300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-25 23:49:53,923][47056] Avg episode reward: [(0, '0.214')] [2024-04-25 23:49:54,658][47288] Updated weights for policy 0, policy_version 23566 (0.0030) [2024-04-25 23:49:57,838][47288] Updated weights for policy 0, policy_version 23576 (0.0029) [2024-04-25 23:49:58,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.9, 300 sec: 55650.1). Total num frames: 386318336. Throughput: 0: 55532.5. Samples: 335642160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-25 23:49:58,923][47056] Avg episode reward: [(0, '0.221')] [2024-04-25 23:49:59,872][47267] Signal inference workers to stop experience collection... (4750 times) [2024-04-25 23:49:59,924][47288] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-04-25 23:49:59,926][47267] Signal inference workers to resume experience collection... (4750 times) [2024-04-25 23:49:59,934][47288] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-04-25 23:50:00,487][47288] Updated weights for policy 0, policy_version 23586 (0.0031) [2024-04-25 23:50:03,659][47288] Updated weights for policy 0, policy_version 23596 (0.0039) [2024-04-25 23:50:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 386596864. Throughput: 0: 55533.4. Samples: 335972740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-25 23:50:03,923][47056] Avg episode reward: [(0, '0.255')] [2024-04-25 23:50:06,453][47288] Updated weights for policy 0, policy_version 23606 (0.0028) [2024-04-25 23:50:08,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 386859008. Throughput: 0: 55595.1. Samples: 336306900. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-25 23:50:08,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-25 23:50:09,539][47288] Updated weights for policy 0, policy_version 23616 (0.0028) [2024-04-25 23:50:12,450][47288] Updated weights for policy 0, policy_version 23626 (0.0031) [2024-04-25 23:50:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 387170304. Throughput: 0: 55729.7. Samples: 336477460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-25 23:50:13,923][47056] Avg episode reward: [(0, '0.266')] [2024-04-25 23:50:15,398][47288] Updated weights for policy 0, policy_version 23636 (0.0030) [2024-04-25 23:50:18,203][47288] Updated weights for policy 0, policy_version 23646 (0.0027) [2024-04-25 23:50:18,923][47056] Fps is (10 sec: 60621.6, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 387465216. Throughput: 0: 55681.1. Samples: 336809260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-25 23:50:18,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-25 23:50:21,342][47288] Updated weights for policy 0, policy_version 23656 (0.0031) [2024-04-25 23:50:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 387710976. Throughput: 0: 55656.6. Samples: 337141560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-25 23:50:23,923][47056] Avg episode reward: [(0, '0.159')] [2024-04-25 23:50:24,092][47288] Updated weights for policy 0, policy_version 23666 (0.0030) [2024-04-25 23:50:27,250][47288] Updated weights for policy 0, policy_version 23676 (0.0031) [2024-04-25 23:50:28,923][47056] Fps is (10 sec: 50789.5, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 387973120. Throughput: 0: 55704.8. Samples: 337312920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:50:28,924][47056] Avg episode reward: [(0, '0.191')] [2024-04-25 23:50:29,924][47288] Updated weights for policy 0, policy_version 23686 (0.0029) [2024-04-25 23:50:33,257][47288] Updated weights for policy 0, policy_version 23696 (0.0031) [2024-04-25 23:50:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 388251648. Throughput: 0: 55666.4. Samples: 337646160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:50:33,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-25 23:50:35,701][47288] Updated weights for policy 0, policy_version 23706 (0.0035) [2024-04-25 23:50:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55434.0, 300 sec: 55650.0). Total num frames: 388546560. Throughput: 0: 55676.3. Samples: 337977740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:50:38,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-25 23:50:39,113][47288] Updated weights for policy 0, policy_version 23716 (0.0025) [2024-04-25 23:50:41,660][47288] Updated weights for policy 0, policy_version 23726 (0.0028) [2024-04-25 23:50:43,923][47056] Fps is (10 sec: 55705.1, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 388808704. Throughput: 0: 55341.2. Samples: 338132520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-25 23:50:43,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-25 23:50:45,012][47288] Updated weights for policy 0, policy_version 23736 (0.0038) [2024-04-25 23:50:47,482][47288] Updated weights for policy 0, policy_version 23746 (0.0028) [2024-04-25 23:50:48,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55432.8, 300 sec: 55761.2). Total num frames: 389103616. Throughput: 0: 55360.1. Samples: 338463940. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-25 23:50:48,923][47056] Avg episode reward: [(0, '0.182')] [2024-04-25 23:50:48,943][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000023750_389120000.pth... [2024-04-25 23:50:48,986][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000022933_375734272.pth [2024-04-25 23:50:50,909][47288] Updated weights for policy 0, policy_version 23756 (0.0028) [2024-04-25 23:50:52,976][47267] Signal inference workers to stop experience collection... (4800 times) [2024-04-25 23:50:52,981][47267] Signal inference workers to resume experience collection... (4800 times) [2024-04-25 23:50:52,995][47288] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-04-25 23:50:52,995][47288] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-04-25 23:50:53,535][47288] Updated weights for policy 0, policy_version 23766 (0.0031) [2024-04-25 23:50:53,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 389398528. Throughput: 0: 55344.0. Samples: 338797380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-25 23:50:53,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-25 23:50:56,874][47288] Updated weights for policy 0, policy_version 23776 (0.0027) [2024-04-25 23:50:58,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 389677056. Throughput: 0: 55421.4. Samples: 338971420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-25 23:50:58,923][47056] Avg episode reward: [(0, '0.191')] [2024-04-25 23:50:59,437][47288] Updated weights for policy 0, policy_version 23786 (0.0032) [2024-04-25 23:51:02,879][47288] Updated weights for policy 0, policy_version 23796 (0.0026) [2024-04-25 23:51:03,923][47056] Fps is (10 sec: 50790.7, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 389906432. Throughput: 0: 55443.0. Samples: 339304200. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-25 23:51:03,923][47056] Avg episode reward: [(0, '0.220')] [2024-04-25 23:51:05,144][47288] Updated weights for policy 0, policy_version 23806 (0.0029) [2024-04-25 23:51:08,891][47288] Updated weights for policy 0, policy_version 23816 (0.0027) [2024-04-25 23:51:08,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 390201344. Throughput: 0: 55500.0. Samples: 339639060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-25 23:51:08,923][47056] Avg episode reward: [(0, '0.240')] [2024-04-25 23:51:11,027][47288] Updated weights for policy 0, policy_version 23826 (0.0027) [2024-04-25 23:51:13,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 390496256. Throughput: 0: 55178.3. Samples: 339795940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-25 23:51:13,923][47056] Avg episode reward: [(0, '0.263')] [2024-04-25 23:51:14,809][47288] Updated weights for policy 0, policy_version 23836 (0.0028) [2024-04-25 23:51:17,027][47288] Updated weights for policy 0, policy_version 23846 (0.0033) [2024-04-25 23:51:18,923][47056] Fps is (10 sec: 55706.0, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 390758400. Throughput: 0: 55206.3. Samples: 340130440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 23:51:18,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-25 23:51:20,561][47288] Updated weights for policy 0, policy_version 23856 (0.0033) [2024-04-25 23:51:23,002][47288] Updated weights for policy 0, policy_version 23866 (0.0027) [2024-04-25 23:51:23,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 391053312. Throughput: 0: 55258.8. Samples: 340464380. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 23:51:23,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-25 23:51:26,396][47288] Updated weights for policy 0, policy_version 23876 (0.0029) [2024-04-25 23:51:28,859][47288] Updated weights for policy 0, policy_version 23886 (0.0036) [2024-04-25 23:51:28,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 391348224. Throughput: 0: 55616.0. Samples: 340635240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-25 23:51:28,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-25 23:51:32,414][47288] Updated weights for policy 0, policy_version 23896 (0.0033) [2024-04-25 23:51:33,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 391610368. Throughput: 0: 55688.6. Samples: 340969940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-25 23:51:33,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-25 23:51:34,818][47288] Updated weights for policy 0, policy_version 23906 (0.0028) [2024-04-25 23:51:38,390][47288] Updated weights for policy 0, policy_version 23916 (0.0032) [2024-04-25 23:51:38,923][47056] Fps is (10 sec: 50790.6, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 391856128. Throughput: 0: 55666.2. Samples: 341302360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-25 23:51:38,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-25 23:51:40,570][47288] Updated weights for policy 0, policy_version 23926 (0.0032) [2024-04-25 23:51:43,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 392134656. Throughput: 0: 55464.8. Samples: 341467340. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-25 23:51:43,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-25 23:51:44,354][47288] Updated weights for policy 0, policy_version 23936 (0.0029) [2024-04-25 23:51:46,557][47288] Updated weights for policy 0, policy_version 23946 (0.0026) [2024-04-25 23:51:48,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 392413184. Throughput: 0: 55465.0. Samples: 341800120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 23:51:48,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-25 23:51:50,402][47288] Updated weights for policy 0, policy_version 23956 (0.0029) [2024-04-25 23:51:52,652][47288] Updated weights for policy 0, policy_version 23966 (0.0041) [2024-04-25 23:51:53,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 392724480. Throughput: 0: 55412.1. Samples: 342132600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 23:51:53,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-25 23:51:56,213][47288] Updated weights for policy 0, policy_version 23976 (0.0029) [2024-04-25 23:51:56,546][47267] Signal inference workers to stop experience collection... (4850 times) [2024-04-25 23:51:56,547][47267] Signal inference workers to resume experience collection... (4850 times) [2024-04-25 23:51:56,571][47288] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-04-25 23:51:56,571][47288] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-04-25 23:51:58,335][47288] Updated weights for policy 0, policy_version 23986 (0.0032) [2024-04-25 23:51:58,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 392986624. Throughput: 0: 55778.1. Samples: 342305960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 23:51:58,923][47056] Avg episode reward: [(0, '0.275')] [2024-04-25 23:52:02,188][47288] Updated weights for policy 0, policy_version 23996 (0.0029) [2024-04-25 23:52:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 393281536. Throughput: 0: 55610.6. Samples: 342632920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-25 23:52:03,923][47056] Avg episode reward: [(0, '0.198')] [2024-04-25 23:52:04,671][47288] Updated weights for policy 0, policy_version 24006 (0.0024) [2024-04-25 23:52:07,971][47288] Updated weights for policy 0, policy_version 24016 (0.0029) [2024-04-25 23:52:08,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 393560064. Throughput: 0: 55608.8. Samples: 342966780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-25 23:52:08,923][47056] Avg episode reward: [(0, '0.276')] [2024-04-25 23:52:10,615][47288] Updated weights for policy 0, policy_version 24026 (0.0031) [2024-04-25 23:52:13,761][47288] Updated weights for policy 0, policy_version 24036 (0.0037) [2024-04-25 23:52:13,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 393805824. Throughput: 0: 55607.3. Samples: 343137560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-25 23:52:13,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-25 23:52:16,301][47288] Updated weights for policy 0, policy_version 24046 (0.0027) [2024-04-25 23:52:18,923][47056] Fps is (10 sec: 52427.8, 60 sec: 55432.3, 300 sec: 55594.5). Total num frames: 394084352. Throughput: 0: 55589.3. Samples: 343471460. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-25 23:52:18,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-25 23:52:19,674][47288] Updated weights for policy 0, policy_version 24056 (0.0037) [2024-04-25 23:52:22,114][47288] Updated weights for policy 0, policy_version 24066 (0.0036) [2024-04-25 23:52:23,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 394379264. Throughput: 0: 55652.9. Samples: 343806740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-25 23:52:23,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-25 23:52:25,453][47288] Updated weights for policy 0, policy_version 24076 (0.0031) [2024-04-25 23:52:27,982][47288] Updated weights for policy 0, policy_version 24086 (0.0028) [2024-04-25 23:52:28,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 394657792. Throughput: 0: 55551.1. Samples: 343967140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-25 23:52:28,923][47056] Avg episode reward: [(0, '0.297')] [2024-04-25 23:52:31,394][47288] Updated weights for policy 0, policy_version 24096 (0.0031) [2024-04-25 23:52:33,690][47288] Updated weights for policy 0, policy_version 24106 (0.0030) [2024-04-25 23:52:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 394952704. Throughput: 0: 55601.2. Samples: 344302180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-25 23:52:33,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-25 23:52:37,346][47288] Updated weights for policy 0, policy_version 24116 (0.0028) [2024-04-25 23:52:38,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 395231232. Throughput: 0: 55606.5. Samples: 344634900. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-25 23:52:38,923][47056] Avg episode reward: [(0, '0.269')] [2024-04-25 23:52:39,819][47288] Updated weights for policy 0, policy_version 24126 (0.0032) [2024-04-25 23:52:43,101][47288] Updated weights for policy 0, policy_version 24136 (0.0030) [2024-04-25 23:52:43,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 395493376. Throughput: 0: 55593.5. Samples: 344807660. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-25 23:52:43,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-25 23:52:45,947][47288] Updated weights for policy 0, policy_version 24146 (0.0028) [2024-04-25 23:52:48,900][47288] Updated weights for policy 0, policy_version 24156 (0.0033) [2024-04-25 23:52:48,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 395771904. Throughput: 0: 55759.1. Samples: 345142080. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-25 23:52:48,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-25 23:52:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000024156_395771904.pth... [2024-04-25 23:52:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000023342_382435328.pth [2024-04-25 23:52:52,018][47288] Updated weights for policy 0, policy_version 24166 (0.0031) [2024-04-25 23:52:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 396050432. Throughput: 0: 55840.0. Samples: 345479580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:52:53,923][47056] Avg episode reward: [(0, '0.221')] [2024-04-25 23:52:54,844][47288] Updated weights for policy 0, policy_version 24176 (0.0032) [2024-04-25 23:52:57,898][47288] Updated weights for policy 0, policy_version 24186 (0.0038) [2024-04-25 23:52:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 396312576. 
Throughput: 0: 55657.6. Samples: 345642160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:52:58,923][47056] Avg episode reward: [(0, '0.223')] [2024-04-25 23:53:00,596][47288] Updated weights for policy 0, policy_version 24196 (0.0029) [2024-04-25 23:53:03,631][47288] Updated weights for policy 0, policy_version 24206 (0.0033) [2024-04-25 23:53:03,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 396591104. Throughput: 0: 55573.0. Samples: 345972240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:53:03,923][47056] Avg episode reward: [(0, '0.213')] [2024-04-25 23:53:06,490][47288] Updated weights for policy 0, policy_version 24216 (0.0036) [2024-04-25 23:53:08,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 396902400. Throughput: 0: 55523.1. Samples: 346305280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-25 23:53:08,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-25 23:53:09,469][47288] Updated weights for policy 0, policy_version 24226 (0.0027) [2024-04-25 23:53:10,999][47267] Signal inference workers to stop experience collection... (4900 times) [2024-04-25 23:53:10,999][47267] Signal inference workers to resume experience collection... (4900 times) [2024-04-25 23:53:11,041][47288] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-04-25 23:53:11,041][47288] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-04-25 23:53:12,576][47288] Updated weights for policy 0, policy_version 24236 (0.0034) [2024-04-25 23:53:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 397180928. Throughput: 0: 55841.7. Samples: 346480020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 23:53:13,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-25 23:53:15,358][47288] Updated weights for policy 0, policy_version 24246 (0.0031) [2024-04-25 23:53:18,347][47288] Updated weights for policy 0, policy_version 24256 (0.0026) [2024-04-25 23:53:18,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 397459456. Throughput: 0: 55847.6. Samples: 346815320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 23:53:18,923][47056] Avg episode reward: [(0, '0.235')] [2024-04-25 23:53:21,220][47288] Updated weights for policy 0, policy_version 24266 (0.0033) [2024-04-25 23:53:23,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 397705216. Throughput: 0: 55894.4. Samples: 347150140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-25 23:53:23,923][47056] Avg episode reward: [(0, '0.243')] [2024-04-25 23:53:24,247][47288] Updated weights for policy 0, policy_version 24276 (0.0029) [2024-04-25 23:53:27,001][47288] Updated weights for policy 0, policy_version 24286 (0.0031) [2024-04-25 23:53:28,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 397983744. Throughput: 0: 55672.1. Samples: 347312900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-25 23:53:28,923][47056] Avg episode reward: [(0, '0.262')] [2024-04-25 23:53:30,079][47288] Updated weights for policy 0, policy_version 24296 (0.0030) [2024-04-25 23:53:32,992][47288] Updated weights for policy 0, policy_version 24306 (0.0033) [2024-04-25 23:53:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 398262272. Throughput: 0: 55664.5. Samples: 347646980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-25 23:53:33,923][47056] Avg episode reward: [(0, '0.240')] [2024-04-25 23:53:36,002][47288] Updated weights for policy 0, policy_version 24316 (0.0027) [2024-04-25 23:53:38,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 398540800. Throughput: 0: 55671.5. Samples: 347984800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-25 23:53:38,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-25 23:53:39,203][47288] Updated weights for policy 0, policy_version 24326 (0.0028) [2024-04-25 23:53:41,786][47288] Updated weights for policy 0, policy_version 24336 (0.0036) [2024-04-25 23:53:43,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 398852096. Throughput: 0: 55753.4. Samples: 348151060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-25 23:53:43,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-25 23:53:45,082][47288] Updated weights for policy 0, policy_version 24346 (0.0032) [2024-04-25 23:53:47,670][47288] Updated weights for policy 0, policy_version 24356 (0.0029) [2024-04-25 23:53:48,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 399130624. Throughput: 0: 55853.1. Samples: 348485620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-25 23:53:48,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-25 23:53:50,941][47288] Updated weights for policy 0, policy_version 24366 (0.0028) [2024-04-25 23:53:53,616][47288] Updated weights for policy 0, policy_version 24376 (0.0029) [2024-04-25 23:53:53,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 399392768. Throughput: 0: 55798.5. Samples: 348816220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-25 23:53:53,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-25 23:53:56,739][47288] Updated weights for policy 0, policy_version 24386 (0.0029) [2024-04-25 23:53:58,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 399638528. Throughput: 0: 55599.7. Samples: 348982000. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-25 23:53:58,923][47056] Avg episode reward: [(0, '0.280')] [2024-04-25 23:53:59,597][47288] Updated weights for policy 0, policy_version 24396 (0.0035) [2024-04-25 23:54:02,540][47288] Updated weights for policy 0, policy_version 24406 (0.0033) [2024-04-25 23:54:03,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 399949824. Throughput: 0: 55568.1. Samples: 349315880. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-25 23:54:03,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-25 23:54:05,452][47288] Updated weights for policy 0, policy_version 24416 (0.0025) [2024-04-25 23:54:08,439][47288] Updated weights for policy 0, policy_version 24426 (0.0035) [2024-04-25 23:54:08,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 400211968. Throughput: 0: 55495.8. Samples: 349647460. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-25 23:54:08,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-25 23:54:10,868][47267] Signal inference workers to stop experience collection... (4950 times) [2024-04-25 23:54:10,868][47267] Signal inference workers to resume experience collection... 
(4950 times) [2024-04-25 23:54:10,884][47288] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-04-25 23:54:10,884][47288] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-04-25 23:54:11,274][47288] Updated weights for policy 0, policy_version 24436 (0.0026) [2024-04-25 23:54:13,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 400490496. Throughput: 0: 55535.1. Samples: 349811980. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-25 23:54:13,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-25 23:54:14,563][47288] Updated weights for policy 0, policy_version 24446 (0.0038) [2024-04-25 23:54:17,217][47288] Updated weights for policy 0, policy_version 24456 (0.0030) [2024-04-25 23:54:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 400785408. Throughput: 0: 55453.7. Samples: 350142400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 23:54:18,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-25 23:54:20,305][47288] Updated weights for policy 0, policy_version 24466 (0.0035) [2024-04-25 23:54:23,017][47288] Updated weights for policy 0, policy_version 24476 (0.0030) [2024-04-25 23:54:23,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 401080320. Throughput: 0: 55254.4. Samples: 350471240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 23:54:23,923][47056] Avg episode reward: [(0, '0.241')] [2024-04-25 23:54:26,101][47288] Updated weights for policy 0, policy_version 24486 (0.0034) [2024-04-25 23:54:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 401326080. Throughput: 0: 55557.7. Samples: 350651160. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-25 23:54:28,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-25 23:54:28,940][47288] Updated weights for policy 0, policy_version 24496 (0.0027) [2024-04-25 23:54:32,176][47288] Updated weights for policy 0, policy_version 24506 (0.0027) [2024-04-25 23:54:33,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55705.5, 300 sec: 55539.3). Total num frames: 401604608. Throughput: 0: 55525.2. Samples: 350984260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:54:33,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-25 23:54:34,812][47288] Updated weights for policy 0, policy_version 24516 (0.0029) [2024-04-25 23:54:38,163][47288] Updated weights for policy 0, policy_version 24526 (0.0031) [2024-04-25 23:54:38,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 401883136. Throughput: 0: 55433.1. Samples: 351310700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:54:38,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-25 23:54:40,763][47288] Updated weights for policy 0, policy_version 24536 (0.0026) [2024-04-25 23:54:43,923][47056] Fps is (10 sec: 54067.2, 60 sec: 54886.3, 300 sec: 55483.5). Total num frames: 402145280. Throughput: 0: 55234.6. Samples: 351467560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-25 23:54:43,924][47056] Avg episode reward: [(0, '0.181')] [2024-04-25 23:54:43,962][47288] Updated weights for policy 0, policy_version 24546 (0.0034) [2024-04-25 23:54:46,613][47288] Updated weights for policy 0, policy_version 24556 (0.0029) [2024-04-25 23:54:48,923][47056] Fps is (10 sec: 54065.9, 60 sec: 54886.1, 300 sec: 55538.9). Total num frames: 402423808. Throughput: 0: 55237.9. Samples: 351801600. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-25 23:54:48,924][47056] Avg episode reward: [(0, '0.270')] [2024-04-25 23:54:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000024562_402423808.pth... [2024-04-25 23:54:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000023750_389120000.pth [2024-04-25 23:54:49,855][47288] Updated weights for policy 0, policy_version 24566 (0.0033) [2024-04-25 23:54:52,532][47288] Updated weights for policy 0, policy_version 24576 (0.0026) [2024-04-25 23:54:53,923][47056] Fps is (10 sec: 58983.2, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 402735104. Throughput: 0: 55210.4. Samples: 352131920. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-25 23:54:53,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-25 23:54:55,727][47288] Updated weights for policy 0, policy_version 24586 (0.0030) [2024-04-25 23:54:58,390][47288] Updated weights for policy 0, policy_version 24596 (0.0025) [2024-04-25 23:54:58,923][47056] Fps is (10 sec: 60622.5, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 403030016. Throughput: 0: 55410.2. Samples: 352305440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-25 23:54:58,923][47056] Avg episode reward: [(0, '0.206')] [2024-04-25 23:55:01,602][47288] Updated weights for policy 0, policy_version 24606 (0.0034) [2024-04-25 23:55:03,923][47056] Fps is (10 sec: 50790.7, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 403243008. Throughput: 0: 55386.9. Samples: 352634800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-25 23:55:03,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-25 23:55:04,287][47267] Signal inference workers to stop experience collection... (5000 times) [2024-04-25 23:55:04,330][47288] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-04-25 23:55:04,345][47267] Signal inference workers to resume experience collection... 
(5000 times) [2024-04-25 23:55:04,353][47288] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-04-25 23:55:04,357][47288] Updated weights for policy 0, policy_version 24616 (0.0029) [2024-04-25 23:55:07,425][47288] Updated weights for policy 0, policy_version 24626 (0.0038) [2024-04-25 23:55:08,923][47056] Fps is (10 sec: 50789.8, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 403537920. Throughput: 0: 55538.9. Samples: 352970500. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-25 23:55:08,923][47056] Avg episode reward: [(0, '0.197')] [2024-04-25 23:55:10,245][47288] Updated weights for policy 0, policy_version 24636 (0.0029) [2024-04-25 23:55:13,242][47288] Updated weights for policy 0, policy_version 24646 (0.0027) [2024-04-25 23:55:13,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 403832832. Throughput: 0: 55169.5. Samples: 353133780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-25 23:55:13,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-25 23:55:16,050][47288] Updated weights for policy 0, policy_version 24656 (0.0028) [2024-04-25 23:55:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 404078592. Throughput: 0: 55283.6. Samples: 353472020. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-25 23:55:18,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-25 23:55:19,216][47288] Updated weights for policy 0, policy_version 24666 (0.0030) [2024-04-25 23:55:21,818][47288] Updated weights for policy 0, policy_version 24676 (0.0027) [2024-04-25 23:55:23,923][47056] Fps is (10 sec: 54066.1, 60 sec: 54886.2, 300 sec: 55594.5). Total num frames: 404373504. Throughput: 0: 55489.6. Samples: 353807740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-25 23:55:23,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-25 23:55:25,095][47288] Updated weights for policy 0, policy_version 24686 (0.0032) [2024-04-25 23:55:27,692][47288] Updated weights for policy 0, policy_version 24696 (0.0029) [2024-04-25 23:55:28,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 404668416. Throughput: 0: 55843.1. Samples: 353980500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-25 23:55:28,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-25 23:55:31,042][47288] Updated weights for policy 0, policy_version 24706 (0.0031) [2024-04-25 23:55:33,564][47288] Updated weights for policy 0, policy_version 24716 (0.0028) [2024-04-25 23:55:33,923][47056] Fps is (10 sec: 58982.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 404963328. Throughput: 0: 55874.9. Samples: 354315960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-25 23:55:33,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-25 23:55:36,755][47288] Updated weights for policy 0, policy_version 24726 (0.0032) [2024-04-25 23:55:38,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 405225472. Throughput: 0: 55911.8. Samples: 354647960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:55:38,923][47056] Avg episode reward: [(0, '0.267')] [2024-04-25 23:55:39,439][47288] Updated weights for policy 0, policy_version 24736 (0.0026) [2024-04-25 23:55:42,627][47288] Updated weights for policy 0, policy_version 24746 (0.0034) [2024-04-25 23:55:43,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 405487616. Throughput: 0: 55651.0. Samples: 354809740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:55:43,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-25 23:55:45,278][47288] Updated weights for policy 0, policy_version 24756 (0.0033) [2024-04-25 23:55:48,557][47288] Updated weights for policy 0, policy_version 24766 (0.0026) [2024-04-25 23:55:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 405782528. Throughput: 0: 55657.6. Samples: 355139400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:55:48,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-25 23:55:51,260][47288] Updated weights for policy 0, policy_version 24776 (0.0032) [2024-04-25 23:55:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 406061056. Throughput: 0: 55800.4. Samples: 355481520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-25 23:55:53,923][47056] Avg episode reward: [(0, '0.218')] [2024-04-25 23:55:54,401][47288] Updated weights for policy 0, policy_version 24786 (0.0029) [2024-04-25 23:55:57,160][47288] Updated weights for policy 0, policy_version 24796 (0.0026) [2024-04-25 23:55:58,923][47056] Fps is (10 sec: 54068.0, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 406323200. Throughput: 0: 55857.8. Samples: 355647380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-25 23:55:58,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-25 23:56:00,276][47288] Updated weights for policy 0, policy_version 24806 (0.0031) [2024-04-25 23:56:00,283][47267] Signal inference workers to stop experience collection... (5050 times) [2024-04-25 23:56:00,283][47267] Signal inference workers to resume experience collection... 
(5050 times) [2024-04-25 23:56:00,297][47288] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-04-25 23:56:00,297][47288] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-04-25 23:56:02,984][47288] Updated weights for policy 0, policy_version 24816 (0.0030) [2024-04-25 23:56:03,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 406601728. Throughput: 0: 55711.1. Samples: 355979020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-25 23:56:03,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-25 23:56:06,294][47288] Updated weights for policy 0, policy_version 24826 (0.0025) [2024-04-25 23:56:08,922][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.9, 300 sec: 55594.6). Total num frames: 406896640. Throughput: 0: 55551.9. Samples: 356307560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-25 23:56:08,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-25 23:56:08,960][47288] Updated weights for policy 0, policy_version 24836 (0.0030) [2024-04-25 23:56:12,156][47288] Updated weights for policy 0, policy_version 24846 (0.0029) [2024-04-25 23:56:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 407158784. Throughput: 0: 55525.9. Samples: 356479160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-25 23:56:13,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-25 23:56:14,700][47288] Updated weights for policy 0, policy_version 24856 (0.0029) [2024-04-25 23:56:18,072][47288] Updated weights for policy 0, policy_version 24866 (0.0032) [2024-04-25 23:56:18,923][47056] Fps is (10 sec: 52427.3, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 407420928. Throughput: 0: 55595.5. Samples: 356817760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-25 23:56:18,923][47056] Avg episode reward: [(0, '0.206')] [2024-04-25 23:56:20,833][47288] Updated weights for policy 0, policy_version 24876 (0.0026) [2024-04-25 23:56:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 407732224. Throughput: 0: 55529.4. Samples: 357146780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-25 23:56:23,923][47056] Avg episode reward: [(0, '0.236')] [2024-04-25 23:56:23,932][47288] Updated weights for policy 0, policy_version 24886 (0.0033) [2024-04-25 23:56:26,908][47288] Updated weights for policy 0, policy_version 24896 (0.0032) [2024-04-25 23:56:28,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 407994368. Throughput: 0: 55410.6. Samples: 357303220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:56:28,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-25 23:56:29,887][47288] Updated weights for policy 0, policy_version 24906 (0.0030) [2024-04-25 23:56:32,824][47288] Updated weights for policy 0, policy_version 24916 (0.0030) [2024-04-25 23:56:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 408272896. Throughput: 0: 55592.4. Samples: 357641060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:56:33,923][47056] Avg episode reward: [(0, '0.219')] [2024-04-25 23:56:35,832][47288] Updated weights for policy 0, policy_version 24926 (0.0035) [2024-04-25 23:56:38,890][47288] Updated weights for policy 0, policy_version 24936 (0.0029) [2024-04-25 23:56:38,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 408551424. Throughput: 0: 55474.4. Samples: 357977860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:56:38,923][47056] Avg episode reward: [(0, '0.250')] [2024-04-25 23:56:41,586][47288] Updated weights for policy 0, policy_version 24946 (0.0027) [2024-04-25 23:56:43,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 408846336. Throughput: 0: 55483.5. Samples: 358144140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-25 23:56:43,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-25 23:56:44,827][47288] Updated weights for policy 0, policy_version 24956 (0.0030) [2024-04-25 23:56:47,588][47288] Updated weights for policy 0, policy_version 24966 (0.0027) [2024-04-25 23:56:48,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 409092096. Throughput: 0: 55548.8. Samples: 358478720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 23:56:48,923][47056] Avg episode reward: [(0, '0.250')] [2024-04-25 23:56:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000024970_409108480.pth... [2024-04-25 23:56:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000024156_395771904.pth [2024-04-25 23:56:50,726][47288] Updated weights for policy 0, policy_version 24976 (0.0040) [2024-04-25 23:56:53,681][47288] Updated weights for policy 0, policy_version 24986 (0.0033) [2024-04-25 23:56:53,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.7, 300 sec: 55594.6). Total num frames: 409387008. Throughput: 0: 55738.6. Samples: 358815800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 23:56:53,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-25 23:56:56,267][47267] Signal inference workers to stop experience collection... (5100 times) [2024-04-25 23:56:56,303][47288] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-04-25 23:56:56,329][47267] Signal inference workers to resume experience collection... 
(5100 times) [2024-04-25 23:56:56,329][47288] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-04-25 23:56:56,439][47288] Updated weights for policy 0, policy_version 24996 (0.0028) [2024-04-25 23:56:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.3, 300 sec: 55483.4). Total num frames: 409649152. Throughput: 0: 55476.8. Samples: 358975620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-25 23:56:58,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-25 23:56:59,652][47288] Updated weights for policy 0, policy_version 25006 (0.0030) [2024-04-25 23:57:02,390][47288] Updated weights for policy 0, policy_version 25016 (0.0035) [2024-04-25 23:57:03,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 409944064. Throughput: 0: 55364.1. Samples: 359309140. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-04-25 23:57:03,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-25 23:57:05,469][47288] Updated weights for policy 0, policy_version 25026 (0.0028) [2024-04-25 23:57:08,364][47288] Updated weights for policy 0, policy_version 25036 (0.0030) [2024-04-25 23:57:08,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 410222592. Throughput: 0: 55469.4. Samples: 359642900. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-04-25 23:57:08,923][47056] Avg episode reward: [(0, '0.174')] [2024-04-25 23:57:11,278][47288] Updated weights for policy 0, policy_version 25046 (0.0028) [2024-04-25 23:57:13,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 410484736. Throughput: 0: 55632.8. Samples: 359806700. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-04-25 23:57:13,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-25 23:57:14,178][47288] Updated weights for policy 0, policy_version 25056 (0.0031) [2024-04-25 23:57:17,164][47288] Updated weights for policy 0, policy_version 25066 (0.0028) [2024-04-25 23:57:18,923][47056] Fps is (10 sec: 57342.7, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 410796032. Throughput: 0: 55652.8. Samples: 360145440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:57:18,923][47056] Avg episode reward: [(0, '0.229')] [2024-04-25 23:57:20,130][47288] Updated weights for policy 0, policy_version 25076 (0.0030) [2024-04-25 23:57:22,990][47288] Updated weights for policy 0, policy_version 25086 (0.0034) [2024-04-25 23:57:23,923][47056] Fps is (10 sec: 58983.4, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 411074560. Throughput: 0: 55568.5. Samples: 360478440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:57:23,923][47056] Avg episode reward: [(0, '0.243')] [2024-04-25 23:57:26,028][47288] Updated weights for policy 0, policy_version 25096 (0.0032) [2024-04-25 23:57:28,692][47288] Updated weights for policy 0, policy_version 25106 (0.0031) [2024-04-25 23:57:28,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 411336704. Throughput: 0: 55638.9. Samples: 360647900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:57:28,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-25 23:57:31,826][47288] Updated weights for policy 0, policy_version 25116 (0.0027) [2024-04-25 23:57:33,923][47056] Fps is (10 sec: 50789.5, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 411582464. Throughput: 0: 55676.0. Samples: 360984140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-25 23:57:33,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-25 23:57:34,845][47288] Updated weights for policy 0, policy_version 25126 (0.0029) [2024-04-25 23:57:37,728][47288] Updated weights for policy 0, policy_version 25136 (0.0026) [2024-04-25 23:57:38,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 411877376. Throughput: 0: 55536.7. Samples: 361314960. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-04-25 23:57:38,924][47056] Avg episode reward: [(0, '0.278')] [2024-04-25 23:57:40,702][47288] Updated weights for policy 0, policy_version 25146 (0.0026) [2024-04-25 23:57:43,540][47288] Updated weights for policy 0, policy_version 25156 (0.0036) [2024-04-25 23:57:43,923][47056] Fps is (10 sec: 58983.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 412172288. Throughput: 0: 55737.5. Samples: 361483800. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-04-25 23:57:43,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-25 23:57:46,630][47288] Updated weights for policy 0, policy_version 25166 (0.0028) [2024-04-25 23:57:48,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 412450816. Throughput: 0: 55595.0. Samples: 361810920. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-04-25 23:57:48,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-25 23:57:49,424][47288] Updated weights for policy 0, policy_version 25176 (0.0027) [2024-04-25 23:57:52,319][47288] Updated weights for policy 0, policy_version 25186 (0.0027) [2024-04-25 23:57:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 412729344. Throughput: 0: 55645.8. Samples: 362146960. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-25 23:57:53,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-25 23:57:55,372][47288] Updated weights for policy 0, policy_version 25196 (0.0035) [2024-04-25 23:57:58,144][47288] Updated weights for policy 0, policy_version 25206 (0.0028) [2024-04-25 23:57:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 413007872. Throughput: 0: 55732.9. Samples: 362314680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-25 23:57:58,923][47056] Avg episode reward: [(0, '0.243')] [2024-04-25 23:57:59,573][47267] Signal inference workers to stop experience collection... (5150 times) [2024-04-25 23:57:59,573][47267] Signal inference workers to resume experience collection... (5150 times) [2024-04-25 23:57:59,585][47288] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-04-25 23:57:59,585][47288] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-04-25 23:58:01,244][47288] Updated weights for policy 0, policy_version 25216 (0.0033) [2024-04-25 23:58:03,922][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 413286400. Throughput: 0: 55606.6. Samples: 362647720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-25 23:58:03,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-25 23:58:04,090][47288] Updated weights for policy 0, policy_version 25226 (0.0028) [2024-04-25 23:58:07,060][47288] Updated weights for policy 0, policy_version 25236 (0.0044) [2024-04-25 23:58:08,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 413532160. Throughput: 0: 55697.7. Samples: 362984840. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-04-25 23:58:08,923][47056] Avg episode reward: [(0, '0.201')] [2024-04-25 23:58:09,918][47288] Updated weights for policy 0, policy_version 25246 (0.0026) [2024-04-25 23:58:13,060][47288] Updated weights for policy 0, policy_version 25256 (0.0029) [2024-04-25 23:58:13,923][47056] Fps is (10 sec: 52427.7, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 413810688. Throughput: 0: 55460.1. Samples: 363143600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-04-25 23:58:13,923][47056] Avg episode reward: [(0, '0.202')] [2024-04-25 23:58:15,867][47288] Updated weights for policy 0, policy_version 25266 (0.0024) [2024-04-25 23:58:18,795][47288] Updated weights for policy 0, policy_version 25276 (0.0028) [2024-04-25 23:58:18,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 414121984. Throughput: 0: 55381.9. Samples: 363476320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-04-25 23:58:18,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-25 23:58:21,856][47288] Updated weights for policy 0, policy_version 25286 (0.0026) [2024-04-25 23:58:23,923][47056] Fps is (10 sec: 60620.6, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 414416896. Throughput: 0: 55354.6. Samples: 363805920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-04-25 23:58:23,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-25 23:58:24,644][47288] Updated weights for policy 0, policy_version 25296 (0.0028) [2024-04-25 23:58:27,618][47288] Updated weights for policy 0, policy_version 25306 (0.0027) [2024-04-25 23:58:28,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 414662656. Throughput: 0: 55573.3. Samples: 363984600. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 23:58:28,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-25 23:58:30,521][47288] Updated weights for policy 0, policy_version 25316 (0.0031) [2024-04-25 23:58:33,609][47288] Updated weights for policy 0, policy_version 25326 (0.0036) [2024-04-25 23:58:33,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 414941184. Throughput: 0: 55778.4. Samples: 364320940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 23:58:33,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-25 23:58:36,440][47288] Updated weights for policy 0, policy_version 25336 (0.0026) [2024-04-25 23:58:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 415236096. Throughput: 0: 55544.7. Samples: 364646480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-25 23:58:38,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-25 23:58:39,591][47288] Updated weights for policy 0, policy_version 25346 (0.0031) [2024-04-25 23:58:42,520][47288] Updated weights for policy 0, policy_version 25356 (0.0031) [2024-04-25 23:58:43,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 415481856. Throughput: 0: 55448.9. Samples: 364809880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-25 23:58:43,923][47056] Avg episode reward: [(0, '0.196')] [2024-04-25 23:58:45,367][47288] Updated weights for policy 0, policy_version 25366 (0.0031) [2024-04-25 23:58:48,249][47288] Updated weights for policy 0, policy_version 25376 (0.0026) [2024-04-25 23:58:48,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 415760384. Throughput: 0: 55503.4. Samples: 365145380. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-25 23:58:48,923][47056] Avg episode reward: [(0, '0.219')] [2024-04-25 23:58:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000025376_415760384.pth... [2024-04-25 23:58:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000024562_402423808.pth [2024-04-25 23:58:51,117][47288] Updated weights for policy 0, policy_version 25386 (0.0026) [2024-04-25 23:58:53,923][47056] Fps is (10 sec: 58983.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 416071680. Throughput: 0: 55319.3. Samples: 365474200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-25 23:58:53,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-25 23:58:54,077][47288] Updated weights for policy 0, policy_version 25396 (0.0032) [2024-04-25 23:58:55,848][47267] Signal inference workers to stop experience collection... (5200 times) [2024-04-25 23:58:55,884][47288] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-04-25 23:58:55,936][47267] Signal inference workers to resume experience collection... (5200 times) [2024-04-25 23:58:55,936][47288] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-04-25 23:58:57,107][47288] Updated weights for policy 0, policy_version 25406 (0.0026) [2024-04-25 23:58:58,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 416350208. Throughput: 0: 55635.5. Samples: 365647200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-25 23:58:58,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-25 23:59:00,162][47288] Updated weights for policy 0, policy_version 25416 (0.0025) [2024-04-25 23:59:03,269][47288] Updated weights for policy 0, policy_version 25426 (0.0030) [2024-04-25 23:59:03,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55705.4, 300 sec: 55650.1). Total num frames: 416628736. Throughput: 0: 55597.7. Samples: 365978220. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 23:59:03,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-25 23:59:06,048][47288] Updated weights for policy 0, policy_version 25436 (0.0028) [2024-04-25 23:59:08,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 416890880. Throughput: 0: 55676.7. Samples: 366311360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 23:59:08,923][47056] Avg episode reward: [(0, '0.261')] [2024-04-25 23:59:09,438][47288] Updated weights for policy 0, policy_version 25446 (0.0030) [2024-04-25 23:59:11,870][47288] Updated weights for policy 0, policy_version 25456 (0.0032) [2024-04-25 23:59:13,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 417169408. Throughput: 0: 55251.6. Samples: 366470920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-25 23:59:13,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-25 23:59:15,134][47288] Updated weights for policy 0, policy_version 25466 (0.0030) [2024-04-25 23:59:17,582][47288] Updated weights for policy 0, policy_version 25476 (0.0027) [2024-04-25 23:59:18,923][47056] Fps is (10 sec: 52428.4, 60 sec: 54886.5, 300 sec: 55372.4). Total num frames: 417415168. Throughput: 0: 55303.1. Samples: 366809580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-04-25 23:59:18,923][47056] Avg episode reward: [(0, '0.227')] [2024-04-25 23:59:20,969][47288] Updated weights for policy 0, policy_version 25486 (0.0033) [2024-04-25 23:59:23,923][47056] Fps is (10 sec: 54067.4, 60 sec: 54886.6, 300 sec: 55539.0). Total num frames: 417710080. Throughput: 0: 55604.6. Samples: 367148680. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-04-25 23:59:23,923][47056] Avg episode reward: [(0, '0.183')] [2024-04-25 23:59:23,956][47288] Updated weights for policy 0, policy_version 25496 (0.0029) [2024-04-25 23:59:26,773][47288] Updated weights for policy 0, policy_version 25506 (0.0024) [2024-04-25 23:59:28,923][47056] Fps is (10 sec: 60619.5, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 418021376. Throughput: 0: 55653.2. Samples: 367314280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-04-25 23:59:28,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-25 23:59:30,374][47288] Updated weights for policy 0, policy_version 25516 (0.0034) [2024-04-25 23:59:32,450][47288] Updated weights for policy 0, policy_version 25526 (0.0027) [2024-04-25 23:59:33,923][47056] Fps is (10 sec: 58981.5, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 418299904. Throughput: 0: 55572.4. Samples: 367646140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:59:33,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-25 23:59:36,309][47288] Updated weights for policy 0, policy_version 25536 (0.0035) [2024-04-25 23:59:38,119][47288] Updated weights for policy 0, policy_version 25546 (0.0025) [2024-04-25 23:59:38,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 418578432. Throughput: 0: 55752.3. Samples: 367983060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:59:38,924][47056] Avg episode reward: [(0, '0.253')] [2024-04-25 23:59:42,044][47288] Updated weights for policy 0, policy_version 25556 (0.0030) [2024-04-25 23:59:43,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 418840576. Throughput: 0: 55825.4. Samples: 368159340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:59:43,923][47056] Avg episode reward: [(0, '0.275')] [2024-04-25 23:59:44,176][47288] Updated weights for policy 0, policy_version 25566 (0.0025) [2024-04-25 23:59:47,785][47288] Updated weights for policy 0, policy_version 25576 (0.0033) [2024-04-25 23:59:48,923][47056] Fps is (10 sec: 54065.8, 60 sec: 55978.4, 300 sec: 55538.9). Total num frames: 419119104. Throughput: 0: 55914.9. Samples: 368494400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-25 23:59:48,923][47056] Avg episode reward: [(0, '0.198')] [2024-04-25 23:59:50,084][47288] Updated weights for policy 0, policy_version 25586 (0.0028) [2024-04-25 23:59:53,568][47267] Signal inference workers to stop experience collection... (5250 times) [2024-04-25 23:59:53,569][47267] Signal inference workers to resume experience collection... (5250 times) [2024-04-25 23:59:53,582][47288] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-04-25 23:59:53,583][47288] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-04-25 23:59:53,714][47288] Updated weights for policy 0, policy_version 25596 (0.0033) [2024-04-25 23:59:53,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 419381248. Throughput: 0: 55950.6. Samples: 368829140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 23:59:53,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-25 23:59:56,182][47288] Updated weights for policy 0, policy_version 25606 (0.0035) [2024-04-25 23:59:58,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 419659776. Throughput: 0: 55864.2. Samples: 368984820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-25 23:59:58,923][47056] Avg episode reward: [(0, '0.237')] [2024-04-25 23:59:59,592][47288] Updated weights for policy 0, policy_version 25616 (0.0033) [2024-04-26 00:00:02,085][47288] Updated weights for policy 0, policy_version 25626 (0.0029) [2024-04-26 00:00:03,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 419971072. Throughput: 0: 55765.7. Samples: 369319040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:00:03,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-26 00:00:05,350][47288] Updated weights for policy 0, policy_version 25636 (0.0027) [2024-04-26 00:00:07,812][47288] Updated weights for policy 0, policy_version 25646 (0.0031) [2024-04-26 00:00:08,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 420233216. Throughput: 0: 55668.3. Samples: 369653760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 00:00:08,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:00:11,179][47288] Updated weights for policy 0, policy_version 25656 (0.0022) [2024-04-26 00:00:13,605][47288] Updated weights for policy 0, policy_version 25666 (0.0030) [2024-04-26 00:00:13,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 420511744. Throughput: 0: 55820.1. Samples: 369826180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 00:00:13,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 00:00:17,025][47288] Updated weights for policy 0, policy_version 25676 (0.0027) [2024-04-26 00:00:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 420790272. Throughput: 0: 55813.4. Samples: 370157740. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 00:00:18,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:00:19,558][47288] Updated weights for policy 0, policy_version 25686 (0.0030) [2024-04-26 00:00:23,051][47288] Updated weights for policy 0, policy_version 25696 (0.0029) [2024-04-26 00:00:23,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 421052416. Throughput: 0: 55723.1. Samples: 370490600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 00:00:23,923][47056] Avg episode reward: [(0, '0.231')] [2024-04-26 00:00:25,596][47288] Updated weights for policy 0, policy_version 25706 (0.0032) [2024-04-26 00:00:28,923][47056] Fps is (10 sec: 52428.8, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 421314560. Throughput: 0: 55366.2. Samples: 370650820. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 00:00:28,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 00:00:28,967][47288] Updated weights for policy 0, policy_version 25716 (0.0030) [2024-04-26 00:00:31,365][47288] Updated weights for policy 0, policy_version 25726 (0.0026) [2024-04-26 00:00:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 421609472. Throughput: 0: 55292.3. Samples: 370982540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 00:00:33,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:00:34,761][47288] Updated weights for policy 0, policy_version 25736 (0.0025) [2024-04-26 00:00:37,273][47288] Updated weights for policy 0, policy_version 25746 (0.0029) [2024-04-26 00:00:38,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 421904384. Throughput: 0: 55247.5. Samples: 371315280. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 00:00:38,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:00:40,727][47288] Updated weights for policy 0, policy_version 25756 (0.0032) [2024-04-26 00:00:43,097][47267] Signal inference workers to stop experience collection... (5300 times) [2024-04-26 00:00:43,144][47288] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-04-26 00:00:43,155][47267] Signal inference workers to resume experience collection... (5300 times) [2024-04-26 00:00:43,162][47288] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-04-26 00:00:43,264][47288] Updated weights for policy 0, policy_version 25766 (0.0030) [2024-04-26 00:00:43,923][47056] Fps is (10 sec: 58981.5, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 422199296. Throughput: 0: 55639.1. Samples: 371488580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 00:00:43,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 00:00:46,662][47288] Updated weights for policy 0, policy_version 25776 (0.0037) [2024-04-26 00:00:48,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.8, 300 sec: 55539.0). Total num frames: 422445056. Throughput: 0: 55690.8. Samples: 371825120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 00:00:48,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 00:00:49,038][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000025786_422477824.pth... [2024-04-26 00:00:49,041][47288] Updated weights for policy 0, policy_version 25786 (0.0024) [2024-04-26 00:00:49,086][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000024970_409108480.pth [2024-04-26 00:00:52,482][47288] Updated weights for policy 0, policy_version 25796 (0.0027) [2024-04-26 00:00:53,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 422739968. Throughput: 0: 55612.4. Samples: 372156320. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 00:00:53,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:00:54,960][47288] Updated weights for policy 0, policy_version 25806 (0.0030) [2024-04-26 00:00:58,256][47288] Updated weights for policy 0, policy_version 25816 (0.0029) [2024-04-26 00:00:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 423002112. Throughput: 0: 55527.7. Samples: 372324920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 00:00:58,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-26 00:01:00,792][47288] Updated weights for policy 0, policy_version 25826 (0.0025) [2024-04-26 00:01:03,923][47056] Fps is (10 sec: 52428.7, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 423264256. Throughput: 0: 55430.2. Samples: 372652100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:01:03,923][47056] Avg episode reward: [(0, '0.226')] [2024-04-26 00:01:04,199][47288] Updated weights for policy 0, policy_version 25836 (0.0029) [2024-04-26 00:01:06,768][47288] Updated weights for policy 0, policy_version 25846 (0.0029) [2024-04-26 00:01:08,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 423559168. Throughput: 0: 55394.6. Samples: 372983360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:01:08,923][47056] Avg episode reward: [(0, '0.243')] [2024-04-26 00:01:10,344][47288] Updated weights for policy 0, policy_version 25856 (0.0031) [2024-04-26 00:01:12,737][47288] Updated weights for policy 0, policy_version 25866 (0.0030) [2024-04-26 00:01:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 423837696. Throughput: 0: 55542.2. Samples: 373150220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:01:13,923][47056] Avg episode reward: [(0, '0.245')] [2024-04-26 00:01:16,172][47288] Updated weights for policy 0, policy_version 25876 (0.0030) [2024-04-26 00:01:18,544][47288] Updated weights for policy 0, policy_version 25886 (0.0033) [2024-04-26 00:01:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 424132608. Throughput: 0: 55579.1. Samples: 373483600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-26 00:01:18,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 00:01:18,933][47267] Saving new best policy, reward=0.331! [2024-04-26 00:01:22,096][47288] Updated weights for policy 0, policy_version 25896 (0.0025) [2024-04-26 00:01:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 424394752. Throughput: 0: 55496.0. Samples: 373812600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-26 00:01:23,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-26 00:01:24,497][47288] Updated weights for policy 0, policy_version 25906 (0.0033) [2024-04-26 00:01:27,867][47288] Updated weights for policy 0, policy_version 25916 (0.0031) [2024-04-26 00:01:28,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 424656896. Throughput: 0: 55520.6. Samples: 373987000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-26 00:01:28,923][47056] Avg episode reward: [(0, '0.227')] [2024-04-26 00:01:30,340][47288] Updated weights for policy 0, policy_version 25926 (0.0028) [2024-04-26 00:01:33,875][47288] Updated weights for policy 0, policy_version 25936 (0.0032) [2024-04-26 00:01:33,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 424935424. Throughput: 0: 55387.3. Samples: 374317560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-26 00:01:33,929][47056] Avg episode reward: [(0, '0.197')] [2024-04-26 00:01:36,171][47288] Updated weights for policy 0, policy_version 25946 (0.0032) [2024-04-26 00:01:38,923][47056] Fps is (10 sec: 54066.9, 60 sec: 54886.3, 300 sec: 55427.9). Total num frames: 425197568. Throughput: 0: 55479.9. Samples: 374652920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 00:01:38,923][47056] Avg episode reward: [(0, '0.275')] [2024-04-26 00:01:39,755][47288] Updated weights for policy 0, policy_version 25956 (0.0032) [2024-04-26 00:01:42,022][47288] Updated weights for policy 0, policy_version 25966 (0.0029) [2024-04-26 00:01:43,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 425508864. Throughput: 0: 55381.3. Samples: 374817080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 00:01:43,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-26 00:01:45,550][47288] Updated weights for policy 0, policy_version 25976 (0.0028) [2024-04-26 00:01:46,781][47267] Signal inference workers to stop experience collection... (5350 times) [2024-04-26 00:01:46,782][47267] Signal inference workers to resume experience collection... (5350 times) [2024-04-26 00:01:46,811][47288] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-04-26 00:01:46,811][47288] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-04-26 00:01:47,886][47288] Updated weights for policy 0, policy_version 25986 (0.0031) [2024-04-26 00:01:48,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 425787392. Throughput: 0: 55538.6. Samples: 375151340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 00:01:48,923][47056] Avg episode reward: [(0, '0.199')] [2024-04-26 00:01:51,384][47288] Updated weights for policy 0, policy_version 25996 (0.0029) [2024-04-26 00:01:53,848][47288] Updated weights for policy 0, policy_version 26006 (0.0034) [2024-04-26 00:01:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 426082304. Throughput: 0: 55583.6. Samples: 375484620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 00:01:53,923][47056] Avg episode reward: [(0, '0.226')] [2024-04-26 00:01:57,195][47288] Updated weights for policy 0, policy_version 26016 (0.0026) [2024-04-26 00:01:58,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 426360832. Throughput: 0: 55771.1. Samples: 375659920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 00:01:58,924][47056] Avg episode reward: [(0, '0.241')] [2024-04-26 00:01:59,669][47288] Updated weights for policy 0, policy_version 26026 (0.0031) [2024-04-26 00:02:03,157][47288] Updated weights for policy 0, policy_version 26036 (0.0028) [2024-04-26 00:02:03,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 426622976. Throughput: 0: 55998.3. Samples: 376003520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 00:02:03,923][47056] Avg episode reward: [(0, '0.228')] [2024-04-26 00:02:05,369][47288] Updated weights for policy 0, policy_version 26046 (0.0030) [2024-04-26 00:02:08,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 426885120. Throughput: 0: 55982.2. Samples: 376331800. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 00:02:08,923][47056] Avg episode reward: [(0, '0.269')] [2024-04-26 00:02:08,974][47288] Updated weights for policy 0, policy_version 26056 (0.0032) [2024-04-26 00:02:11,399][47288] Updated weights for policy 0, policy_version 26066 (0.0027) [2024-04-26 00:02:13,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 427163648. Throughput: 0: 55573.1. Samples: 376487780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 00:02:13,923][47056] Avg episode reward: [(0, '0.216')] [2024-04-26 00:02:14,810][47288] Updated weights for policy 0, policy_version 26076 (0.0032) [2024-04-26 00:02:17,089][47288] Updated weights for policy 0, policy_version 26086 (0.0027) [2024-04-26 00:02:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 427458560. Throughput: 0: 55749.0. Samples: 376826260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 00:02:18,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-26 00:02:20,717][47288] Updated weights for policy 0, policy_version 26096 (0.0037) [2024-04-26 00:02:23,242][47288] Updated weights for policy 0, policy_version 26106 (0.0030) [2024-04-26 00:02:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 427737088. Throughput: 0: 55758.0. Samples: 377162020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 00:02:23,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 00:02:26,478][47288] Updated weights for policy 0, policy_version 26116 (0.0034) [2024-04-26 00:02:28,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 428032000. Throughput: 0: 55900.7. Samples: 377332620. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 00:02:28,923][47056] Avg episode reward: [(0, '0.275')] [2024-04-26 00:02:29,087][47288] Updated weights for policy 0, policy_version 26126 (0.0029) [2024-04-26 00:02:32,280][47288] Updated weights for policy 0, policy_version 26136 (0.0031) [2024-04-26 00:02:33,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 428294144. Throughput: 0: 55816.1. Samples: 377663060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 00:02:33,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-26 00:02:35,049][47288] Updated weights for policy 0, policy_version 26146 (0.0029) [2024-04-26 00:02:38,185][47288] Updated weights for policy 0, policy_version 26156 (0.0029) [2024-04-26 00:02:38,923][47056] Fps is (10 sec: 52429.9, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 428556288. Throughput: 0: 55822.7. Samples: 377996640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 00:02:38,923][47056] Avg episode reward: [(0, '0.197')] [2024-04-26 00:02:40,788][47288] Updated weights for policy 0, policy_version 26166 (0.0036) [2024-04-26 00:02:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 428851200. Throughput: 0: 55631.6. Samples: 378163340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 00:02:43,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:02:43,931][47288] Updated weights for policy 0, policy_version 26176 (0.0031) [2024-04-26 00:02:46,854][47288] Updated weights for policy 0, policy_version 26186 (0.0027) [2024-04-26 00:02:48,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55538.9). Total num frames: 429113344. Throughput: 0: 55439.8. Samples: 378498320. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 00:02:48,923][47056] Avg episode reward: [(0, '0.229')] [2024-04-26 00:02:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000026191_429113344.pth... [2024-04-26 00:02:48,997][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000025376_415760384.pth [2024-04-26 00:02:49,897][47288] Updated weights for policy 0, policy_version 26196 (0.0026) [2024-04-26 00:02:52,772][47288] Updated weights for policy 0, policy_version 26206 (0.0033) [2024-04-26 00:02:53,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 429408256. Throughput: 0: 55483.6. Samples: 378828560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 00:02:53,923][47056] Avg episode reward: [(0, '0.269')] [2024-04-26 00:02:55,741][47288] Updated weights for policy 0, policy_version 26216 (0.0026) [2024-04-26 00:02:57,357][47267] Signal inference workers to stop experience collection... (5400 times) [2024-04-26 00:02:57,358][47267] Signal inference workers to resume experience collection... (5400 times) [2024-04-26 00:02:57,385][47288] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-04-26 00:02:57,385][47288] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-04-26 00:02:58,528][47288] Updated weights for policy 0, policy_version 26226 (0.0031) [2024-04-26 00:02:58,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 429703168. Throughput: 0: 55835.4. Samples: 379000380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 00:02:58,932][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 00:03:01,874][47288] Updated weights for policy 0, policy_version 26236 (0.0032) [2024-04-26 00:03:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 429965312. Throughput: 0: 55762.2. Samples: 379335560. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 00:03:03,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-26 00:03:04,296][47288] Updated weights for policy 0, policy_version 26246 (0.0034) [2024-04-26 00:03:07,786][47288] Updated weights for policy 0, policy_version 26256 (0.0029) [2024-04-26 00:03:08,922][47056] Fps is (10 sec: 52429.8, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 430227456. Throughput: 0: 55793.9. Samples: 379672740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 00:03:08,923][47056] Avg episode reward: [(0, '0.241')] [2024-04-26 00:03:10,275][47288] Updated weights for policy 0, policy_version 26266 (0.0030) [2024-04-26 00:03:13,517][47288] Updated weights for policy 0, policy_version 26276 (0.0031) [2024-04-26 00:03:13,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 430505984. Throughput: 0: 55543.4. Samples: 379832060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 00:03:13,923][47056] Avg episode reward: [(0, '0.255')] [2024-04-26 00:03:16,160][47288] Updated weights for policy 0, policy_version 26286 (0.0027) [2024-04-26 00:03:18,923][47056] Fps is (10 sec: 57342.3, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 430800896. Throughput: 0: 55676.7. Samples: 380168520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:03:18,923][47056] Avg episode reward: [(0, '0.236')] [2024-04-26 00:03:19,336][47288] Updated weights for policy 0, policy_version 26296 (0.0030) [2024-04-26 00:03:22,078][47288] Updated weights for policy 0, policy_version 26306 (0.0032) [2024-04-26 00:03:23,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 431079424. Throughput: 0: 55689.7. Samples: 380502680. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:03:23,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-26 00:03:25,221][47288] Updated weights for policy 0, policy_version 26316 (0.0028) [2024-04-26 00:03:27,719][47288] Updated weights for policy 0, policy_version 26326 (0.0032) [2024-04-26 00:03:28,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55432.8, 300 sec: 55650.1). Total num frames: 431357952. Throughput: 0: 55791.2. Samples: 380673940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:03:28,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-26 00:03:31,116][47288] Updated weights for policy 0, policy_version 26336 (0.0032) [2024-04-26 00:03:33,597][47288] Updated weights for policy 0, policy_version 26346 (0.0039) [2024-04-26 00:03:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 431652864. Throughput: 0: 55851.2. Samples: 381011620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:03:33,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 00:03:36,879][47288] Updated weights for policy 0, policy_version 26356 (0.0030) [2024-04-26 00:03:38,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 431915008. Throughput: 0: 55944.5. Samples: 381346060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:03:38,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-26 00:03:39,611][47288] Updated weights for policy 0, policy_version 26366 (0.0033) [2024-04-26 00:03:43,192][47288] Updated weights for policy 0, policy_version 26376 (0.0024) [2024-04-26 00:03:43,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 432177152. Throughput: 0: 55829.4. Samples: 381512700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:03:43,923][47056] Avg episode reward: [(0, '0.267')] [2024-04-26 00:03:45,329][47288] Updated weights for policy 0, policy_version 26386 (0.0030) [2024-04-26 00:03:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55538.9). Total num frames: 432455680. Throughput: 0: 55774.7. Samples: 381845420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:03:48,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-26 00:03:49,082][47288] Updated weights for policy 0, policy_version 26396 (0.0031) [2024-04-26 00:03:51,435][47288] Updated weights for policy 0, policy_version 26406 (0.0029) [2024-04-26 00:03:53,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 432766976. Throughput: 0: 55721.7. Samples: 382180220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 00:03:53,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 00:03:54,786][47288] Updated weights for policy 0, policy_version 26416 (0.0031) [2024-04-26 00:03:57,355][47288] Updated weights for policy 0, policy_version 26426 (0.0028) [2024-04-26 00:03:58,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 433045504. Throughput: 0: 55872.7. Samples: 382346340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 00:03:58,923][47056] Avg episode reward: [(0, '0.213')] [2024-04-26 00:04:00,807][47288] Updated weights for policy 0, policy_version 26436 (0.0035) [2024-04-26 00:04:01,825][47267] Signal inference workers to stop experience collection... (5450 times) [2024-04-26 00:04:01,879][47288] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-04-26 00:04:01,879][47267] Signal inference workers to resume experience collection... 
(5450 times) [2024-04-26 00:04:01,893][47288] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-04-26 00:04:03,245][47288] Updated weights for policy 0, policy_version 26446 (0.0027) [2024-04-26 00:04:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 433324032. Throughput: 0: 55726.4. Samples: 382676200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 00:04:03,923][47056] Avg episode reward: [(0, '0.262')] [2024-04-26 00:04:06,693][47288] Updated weights for policy 0, policy_version 26456 (0.0029) [2024-04-26 00:04:08,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.5, 300 sec: 55705.6). Total num frames: 433602560. Throughput: 0: 55699.6. Samples: 383009160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 00:04:08,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 00:04:09,143][47288] Updated weights for policy 0, policy_version 26466 (0.0031) [2024-04-26 00:04:12,544][47288] Updated weights for policy 0, policy_version 26476 (0.0029) [2024-04-26 00:04:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 433881088. Throughput: 0: 55636.3. Samples: 383177580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:04:13,923][47056] Avg episode reward: [(0, '0.226')] [2024-04-26 00:04:14,904][47288] Updated weights for policy 0, policy_version 26486 (0.0031) [2024-04-26 00:04:18,335][47288] Updated weights for policy 0, policy_version 26496 (0.0028) [2024-04-26 00:04:18,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 434126848. Throughput: 0: 55634.6. Samples: 383515180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:04:18,923][47056] Avg episode reward: [(0, '0.187')] [2024-04-26 00:04:20,802][47288] Updated weights for policy 0, policy_version 26506 (0.0029) [2024-04-26 00:04:23,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55539.0). 
Total num frames: 434405376. Throughput: 0: 55698.3. Samples: 383852480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:04:23,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-26 00:04:24,129][47288] Updated weights for policy 0, policy_version 26516 (0.0029) [2024-04-26 00:04:26,501][47288] Updated weights for policy 0, policy_version 26526 (0.0029) [2024-04-26 00:04:28,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 434700288. Throughput: 0: 55483.2. Samples: 384009440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:04:28,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 00:04:30,058][47288] Updated weights for policy 0, policy_version 26536 (0.0026) [2024-04-26 00:04:32,441][47288] Updated weights for policy 0, policy_version 26546 (0.0027) [2024-04-26 00:04:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 434962432. Throughput: 0: 55484.0. Samples: 384342200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 00:04:33,923][47056] Avg episode reward: [(0, '0.261')] [2024-04-26 00:04:35,986][47288] Updated weights for policy 0, policy_version 26556 (0.0031) [2024-04-26 00:04:38,324][47288] Updated weights for policy 0, policy_version 26566 (0.0032) [2024-04-26 00:04:38,923][47056] Fps is (10 sec: 58981.1, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 435290112. Throughput: 0: 55527.3. Samples: 384678960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 00:04:38,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 00:04:41,783][47288] Updated weights for policy 0, policy_version 26576 (0.0031) [2024-04-26 00:04:43,923][47056] Fps is (10 sec: 58983.2, 60 sec: 56251.8, 300 sec: 55705.7). Total num frames: 435552256. Throughput: 0: 55798.4. Samples: 384857260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 00:04:43,923][47056] Avg episode reward: [(0, '0.219')] [2024-04-26 00:04:44,259][47288] Updated weights for policy 0, policy_version 26586 (0.0034) [2024-04-26 00:04:47,635][47288] Updated weights for policy 0, policy_version 26596 (0.0030) [2024-04-26 00:04:48,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 435830784. Throughput: 0: 55822.5. Samples: 385188220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 00:04:48,924][47056] Avg episode reward: [(0, '0.241')] [2024-04-26 00:04:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000026601_435830784.pth... [2024-04-26 00:04:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000025786_422477824.pth [2024-04-26 00:04:50,068][47288] Updated weights for policy 0, policy_version 26606 (0.0029) [2024-04-26 00:04:53,561][47288] Updated weights for policy 0, policy_version 26616 (0.0031) [2024-04-26 00:04:53,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 436092928. Throughput: 0: 55939.1. Samples: 385526420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 00:04:53,923][47056] Avg episode reward: [(0, '0.224')] [2024-04-26 00:04:55,907][47288] Updated weights for policy 0, policy_version 26626 (0.0032) [2024-04-26 00:04:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 436371456. Throughput: 0: 55706.2. Samples: 385684360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 00:04:58,923][47056] Avg episode reward: [(0, '0.220')] [2024-04-26 00:04:59,539][47288] Updated weights for policy 0, policy_version 26636 (0.0031) [2024-04-26 00:05:01,800][47288] Updated weights for policy 0, policy_version 26646 (0.0032) [2024-04-26 00:05:03,808][47267] Signal inference workers to stop experience collection... 
(5500 times) [2024-04-26 00:05:03,839][47288] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-04-26 00:05:03,862][47267] Signal inference workers to resume experience collection... (5500 times) [2024-04-26 00:05:03,866][47288] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-04-26 00:05:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 436633600. Throughput: 0: 55681.3. Samples: 386020840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 00:05:03,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-26 00:05:05,296][47288] Updated weights for policy 0, policy_version 26656 (0.0028) [2024-04-26 00:05:07,519][47288] Updated weights for policy 0, policy_version 26666 (0.0032) [2024-04-26 00:05:08,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 55594.6). Total num frames: 436912128. Throughput: 0: 55689.3. Samples: 386358500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:05:08,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:05:11,430][47288] Updated weights for policy 0, policy_version 26676 (0.0027) [2024-04-26 00:05:13,444][47288] Updated weights for policy 0, policy_version 26686 (0.0029) [2024-04-26 00:05:13,923][47056] Fps is (10 sec: 60621.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 437239808. Throughput: 0: 56020.3. Samples: 386530360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:05:13,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 00:05:17,292][47288] Updated weights for policy 0, policy_version 26696 (0.0029) [2024-04-26 00:05:18,923][47056] Fps is (10 sec: 60619.7, 60 sec: 56524.7, 300 sec: 55816.6). Total num frames: 437518336. Throughput: 0: 56023.9. Samples: 386863280. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:05:18,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-26 00:05:19,216][47288] Updated weights for policy 0, policy_version 26706 (0.0029) [2024-04-26 00:05:23,007][47288] Updated weights for policy 0, policy_version 26716 (0.0029) [2024-04-26 00:05:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 437780480. Throughput: 0: 55853.0. Samples: 387192340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 00:05:23,923][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 00:05:25,090][47288] Updated weights for policy 0, policy_version 26726 (0.0028) [2024-04-26 00:05:28,790][47288] Updated weights for policy 0, policy_version 26736 (0.0023) [2024-04-26 00:05:28,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 438042624. Throughput: 0: 55812.8. Samples: 387368840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 00:05:28,923][47056] Avg episode reward: [(0, '0.235')] [2024-04-26 00:05:31,018][47288] Updated weights for policy 0, policy_version 26746 (0.0034) [2024-04-26 00:05:33,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 438304768. Throughput: 0: 55968.7. Samples: 387706800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 00:05:33,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 00:05:34,610][47288] Updated weights for policy 0, policy_version 26756 (0.0028) [2024-04-26 00:05:36,769][47288] Updated weights for policy 0, policy_version 26766 (0.0029) [2024-04-26 00:05:38,923][47056] Fps is (10 sec: 54066.5, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 438583296. Throughput: 0: 55863.0. Samples: 388040260. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 00:05:38,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:05:40,506][47288] Updated weights for policy 0, policy_version 26776 (0.0028) [2024-04-26 00:05:42,569][47288] Updated weights for policy 0, policy_version 26786 (0.0030) [2024-04-26 00:05:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 438861824. Throughput: 0: 55782.4. Samples: 388194560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:05:43,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 00:05:46,437][47288] Updated weights for policy 0, policy_version 26796 (0.0031) [2024-04-26 00:05:48,411][47288] Updated weights for policy 0, policy_version 26806 (0.0029) [2024-04-26 00:05:48,923][47056] Fps is (10 sec: 60621.2, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 439189504. Throughput: 0: 55686.6. Samples: 388526740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:05:48,923][47056] Avg episode reward: [(0, '0.241')] [2024-04-26 00:05:52,359][47288] Updated weights for policy 0, policy_version 26816 (0.0025) [2024-04-26 00:05:53,923][47056] Fps is (10 sec: 60620.2, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 439468032. Throughput: 0: 55615.9. Samples: 388861220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:05:53,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 00:05:54,323][47288] Updated weights for policy 0, policy_version 26826 (0.0026) [2024-04-26 00:05:56,317][47267] Signal inference workers to stop experience collection... (5550 times) [2024-04-26 00:05:56,362][47288] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-04-26 00:05:56,373][47267] Signal inference workers to resume experience collection... 
(5550 times) [2024-04-26 00:05:56,382][47288] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-04-26 00:05:58,369][47288] Updated weights for policy 0, policy_version 26836 (0.0027) [2024-04-26 00:05:58,922][47056] Fps is (10 sec: 55707.0, 60 sec: 56251.9, 300 sec: 55872.3). Total num frames: 439746560. Throughput: 0: 55745.1. Samples: 389038880. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 00:05:58,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 00:06:00,318][47288] Updated weights for policy 0, policy_version 26846 (0.0022) [2024-04-26 00:06:03,923][47056] Fps is (10 sec: 50790.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 439975936. Throughput: 0: 55706.9. Samples: 389370080. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 00:06:03,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-26 00:06:04,238][47288] Updated weights for policy 0, policy_version 26856 (0.0040) [2024-04-26 00:06:06,780][47288] Updated weights for policy 0, policy_version 26866 (0.0025) [2024-04-26 00:06:08,923][47056] Fps is (10 sec: 50789.5, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 440254464. Throughput: 0: 55773.8. Samples: 389702160. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 00:06:08,923][47056] Avg episode reward: [(0, '0.221')] [2024-04-26 00:06:10,074][47288] Updated weights for policy 0, policy_version 26876 (0.0029) [2024-04-26 00:06:12,581][47288] Updated weights for policy 0, policy_version 26886 (0.0024) [2024-04-26 00:06:13,923][47056] Fps is (10 sec: 54066.6, 60 sec: 54613.3, 300 sec: 55539.0). Total num frames: 440516608. Throughput: 0: 55290.6. Samples: 389856920. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 00:06:13,923][47056] Avg episode reward: [(0, '0.274')] [2024-04-26 00:06:16,060][47288] Updated weights for policy 0, policy_version 26896 (0.0033) [2024-04-26 00:06:18,868][47288] Updated weights for policy 0, policy_version 26906 (0.0035) [2024-04-26 00:06:18,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 440827904. Throughput: 0: 55123.4. Samples: 390187360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:06:18,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-26 00:06:21,864][47288] Updated weights for policy 0, policy_version 26916 (0.0028) [2024-04-26 00:06:23,923][47056] Fps is (10 sec: 60620.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 441122816. Throughput: 0: 55117.0. Samples: 390520520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:06:23,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-26 00:06:25,525][47288] Updated weights for policy 0, policy_version 26926 (0.0027) [2024-04-26 00:06:27,675][47288] Updated weights for policy 0, policy_version 26936 (0.0032) [2024-04-26 00:06:28,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.8, 300 sec: 55872.3). Total num frames: 441417728. Throughput: 0: 55681.3. Samples: 390700220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:06:28,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 00:06:31,195][47288] Updated weights for policy 0, policy_version 26946 (0.0034) [2024-04-26 00:06:33,583][47288] Updated weights for policy 0, policy_version 26956 (0.0029) [2024-04-26 00:06:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 441663488. Throughput: 0: 55728.6. Samples: 391034520. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 00:06:33,923][47056] Avg episode reward: [(0, '0.275')] [2024-04-26 00:06:36,958][47288] Updated weights for policy 0, policy_version 26966 (0.0027) [2024-04-26 00:06:38,923][47056] Fps is (10 sec: 49151.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 441909248. Throughput: 0: 55745.6. Samples: 391369780. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 00:06:38,923][47056] Avg episode reward: [(0, '0.263')] [2024-04-26 00:06:39,542][47288] Updated weights for policy 0, policy_version 26976 (0.0029) [2024-04-26 00:06:39,986][47267] Signal inference workers to stop experience collection... (5600 times) [2024-04-26 00:06:40,030][47288] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-04-26 00:06:40,043][47267] Signal inference workers to resume experience collection... (5600 times) [2024-04-26 00:06:40,047][47288] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-04-26 00:06:42,936][47288] Updated weights for policy 0, policy_version 26986 (0.0031) [2024-04-26 00:06:43,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 442187776. Throughput: 0: 55139.2. Samples: 391520160. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 00:06:43,923][47056] Avg episode reward: [(0, '0.201')] [2024-04-26 00:06:45,503][47288] Updated weights for policy 0, policy_version 26996 (0.0030) [2024-04-26 00:06:48,906][47288] Updated weights for policy 0, policy_version 27006 (0.0023) [2024-04-26 00:06:48,923][47056] Fps is (10 sec: 55706.6, 60 sec: 54613.4, 300 sec: 55539.0). Total num frames: 442466304. Throughput: 0: 55348.4. Samples: 391860760. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 00:06:48,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:06:49,002][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000027007_442482688.pth... 
[2024-04-26 00:06:49,047][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000026191_429113344.pth [2024-04-26 00:06:51,303][47288] Updated weights for policy 0, policy_version 27016 (0.0036) [2024-04-26 00:06:53,923][47056] Fps is (10 sec: 57344.8, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 442761216. Throughput: 0: 55401.9. Samples: 392195240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 00:06:53,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 00:06:54,675][47288] Updated weights for policy 0, policy_version 27026 (0.0028) [2024-04-26 00:06:57,188][47288] Updated weights for policy 0, policy_version 27036 (0.0028) [2024-04-26 00:06:58,923][47056] Fps is (10 sec: 62258.1, 60 sec: 55705.3, 300 sec: 55816.6). Total num frames: 443088896. Throughput: 0: 55888.8. Samples: 392371920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 00:06:58,923][47056] Avg episode reward: [(0, '0.193')] [2024-04-26 00:07:00,472][47288] Updated weights for policy 0, policy_version 27046 (0.0030) [2024-04-26 00:07:03,030][47288] Updated weights for policy 0, policy_version 27056 (0.0027) [2024-04-26 00:07:03,923][47056] Fps is (10 sec: 60621.0, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 443367424. Throughput: 0: 55960.6. Samples: 392705580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 00:07:03,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 00:07:06,303][47288] Updated weights for policy 0, policy_version 27066 (0.0027) [2024-04-26 00:07:08,923][47056] Fps is (10 sec: 50791.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 443596800. Throughput: 0: 56021.4. Samples: 393041480. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-04-26 00:07:08,923][47056] Avg episode reward: [(0, '0.291')] [2024-04-26 00:07:08,929][47288] Updated weights for policy 0, policy_version 27076 (0.0023) [2024-04-26 00:07:12,127][47288] Updated weights for policy 0, policy_version 27086 (0.0026) [2024-04-26 00:07:13,923][47056] Fps is (10 sec: 49151.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 443858944. Throughput: 0: 55581.2. Samples: 393201380. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-04-26 00:07:13,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-26 00:07:14,770][47288] Updated weights for policy 0, policy_version 27096 (0.0031) [2024-04-26 00:07:17,856][47288] Updated weights for policy 0, policy_version 27106 (0.0035) [2024-04-26 00:07:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 444153856. Throughput: 0: 55607.5. Samples: 393536860. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-04-26 00:07:18,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-26 00:07:20,729][47267] Signal inference workers to stop experience collection... (5650 times) [2024-04-26 00:07:20,777][47267] Signal inference workers to resume experience collection... (5650 times) [2024-04-26 00:07:20,778][47288] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-04-26 00:07:20,781][47288] Updated weights for policy 0, policy_version 27116 (0.0031) [2024-04-26 00:07:20,792][47288] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-04-26 00:07:23,596][47288] Updated weights for policy 0, policy_version 27126 (0.0031) [2024-04-26 00:07:23,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 444432384. Throughput: 0: 55577.1. Samples: 393870740. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-04-26 00:07:23,923][47056] Avg episode reward: [(0, '0.216')] [2024-04-26 00:07:26,595][47288] Updated weights for policy 0, policy_version 27136 (0.0028) [2024-04-26 00:07:28,923][47056] Fps is (10 sec: 55705.1, 60 sec: 54886.2, 300 sec: 55650.0). Total num frames: 444710912. Throughput: 0: 55975.5. Samples: 394039060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 00:07:28,923][47056] Avg episode reward: [(0, '0.184')] [2024-04-26 00:07:29,582][47288] Updated weights for policy 0, policy_version 27146 (0.0031) [2024-04-26 00:07:32,494][47288] Updated weights for policy 0, policy_version 27156 (0.0027) [2024-04-26 00:07:33,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 445022208. Throughput: 0: 55702.6. Samples: 394367380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 00:07:33,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-26 00:07:35,608][47288] Updated weights for policy 0, policy_version 27166 (0.0035) [2024-04-26 00:07:38,378][47288] Updated weights for policy 0, policy_version 27176 (0.0032) [2024-04-26 00:07:38,923][47056] Fps is (10 sec: 58983.8, 60 sec: 56525.1, 300 sec: 55761.2). Total num frames: 445300736. Throughput: 0: 55683.7. Samples: 394701000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 00:07:38,923][47056] Avg episode reward: [(0, '0.202')] [2024-04-26 00:07:41,457][47288] Updated weights for policy 0, policy_version 27186 (0.0026) [2024-04-26 00:07:43,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 445546496. Throughput: 0: 55546.0. Samples: 394871480. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 00:07:43,923][47056] Avg episode reward: [(0, '0.279')] [2024-04-26 00:07:44,238][47288] Updated weights for policy 0, policy_version 27196 (0.0030) [2024-04-26 00:07:47,390][47288] Updated weights for policy 0, policy_version 27206 (0.0029) [2024-04-26 00:07:48,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 445825024. Throughput: 0: 55630.2. Samples: 395208940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-26 00:07:48,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 00:07:50,065][47288] Updated weights for policy 0, policy_version 27216 (0.0026) [2024-04-26 00:07:53,138][47288] Updated weights for policy 0, policy_version 27226 (0.0031) [2024-04-26 00:07:53,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 446119936. Throughput: 0: 55589.6. Samples: 395543020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-26 00:07:53,923][47056] Avg episode reward: [(0, '0.209')] [2024-04-26 00:07:55,894][47288] Updated weights for policy 0, policy_version 27236 (0.0030) [2024-04-26 00:07:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 446382080. Throughput: 0: 55526.2. Samples: 395700060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-26 00:07:58,923][47056] Avg episode reward: [(0, '0.263')] [2024-04-26 00:07:59,003][47288] Updated weights for policy 0, policy_version 27246 (0.0031) [2024-04-26 00:08:01,742][47288] Updated weights for policy 0, policy_version 27256 (0.0027) [2024-04-26 00:08:03,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 446676992. Throughput: 0: 55616.8. Samples: 396039620. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 00:08:03,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 00:08:04,724][47288] Updated weights for policy 0, policy_version 27266 (0.0032) [2024-04-26 00:08:07,642][47288] Updated weights for policy 0, policy_version 27276 (0.0027) [2024-04-26 00:08:08,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 446971904. Throughput: 0: 55619.8. Samples: 396373640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 00:08:08,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 00:08:10,591][47288] Updated weights for policy 0, policy_version 27286 (0.0028) [2024-04-26 00:08:12,790][47267] Signal inference workers to stop experience collection... (5700 times) [2024-04-26 00:08:12,791][47267] Signal inference workers to resume experience collection... (5700 times) [2024-04-26 00:08:12,808][47288] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-04-26 00:08:12,809][47288] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-04-26 00:08:13,466][47288] Updated weights for policy 0, policy_version 27296 (0.0034) [2024-04-26 00:08:13,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 447234048. Throughput: 0: 55676.1. Samples: 396544480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 00:08:13,924][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 00:08:16,582][47288] Updated weights for policy 0, policy_version 27306 (0.0031) [2024-04-26 00:08:18,923][47056] Fps is (10 sec: 52430.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 447496192. Throughput: 0: 55785.5. Samples: 396877720. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 00:08:18,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 00:08:19,430][47288] Updated weights for policy 0, policy_version 27316 (0.0030) [2024-04-26 00:08:22,431][47288] Updated weights for policy 0, policy_version 27326 (0.0033) [2024-04-26 00:08:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 447791104. Throughput: 0: 55853.2. Samples: 397214400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-26 00:08:23,923][47056] Avg episode reward: [(0, '0.261')] [2024-04-26 00:08:25,181][47288] Updated weights for policy 0, policy_version 27336 (0.0025) [2024-04-26 00:08:28,115][47288] Updated weights for policy 0, policy_version 27346 (0.0028) [2024-04-26 00:08:28,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 448069632. Throughput: 0: 55612.4. Samples: 397374040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-26 00:08:28,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 00:08:31,182][47288] Updated weights for policy 0, policy_version 27356 (0.0029) [2024-04-26 00:08:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 448348160. Throughput: 0: 55656.9. Samples: 397713500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-26 00:08:33,923][47056] Avg episode reward: [(0, '0.243')] [2024-04-26 00:08:33,972][47288] Updated weights for policy 0, policy_version 27366 (0.0029) [2024-04-26 00:08:37,072][47288] Updated weights for policy 0, policy_version 27376 (0.0032) [2024-04-26 00:08:38,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 448610304. Throughput: 0: 55697.9. Samples: 398049420. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-26 00:08:38,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-26 00:08:39,823][47288] Updated weights for policy 0, policy_version 27386 (0.0031) [2024-04-26 00:08:42,906][47288] Updated weights for policy 0, policy_version 27396 (0.0037) [2024-04-26 00:08:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 448905216. Throughput: 0: 56026.8. Samples: 398221260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 00:08:43,923][47056] Avg episode reward: [(0, '0.266')] [2024-04-26 00:08:45,605][47288] Updated weights for policy 0, policy_version 27406 (0.0031) [2024-04-26 00:08:48,863][47288] Updated weights for policy 0, policy_version 27416 (0.0032) [2024-04-26 00:08:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 449183744. Throughput: 0: 55880.5. Samples: 398554240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 00:08:48,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-26 00:08:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000027416_449183744.pth... [2024-04-26 00:08:48,987][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000026601_435830784.pth [2024-04-26 00:08:51,470][47288] Updated weights for policy 0, policy_version 27426 (0.0030) [2024-04-26 00:08:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 449445888. Throughput: 0: 55886.0. Samples: 398888500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 00:08:53,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-26 00:08:54,776][47288] Updated weights for policy 0, policy_version 27436 (0.0039) [2024-04-26 00:08:57,276][47288] Updated weights for policy 0, policy_version 27446 (0.0036) [2024-04-26 00:08:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 449740800. 
Throughput: 0: 55853.8. Samples: 399057900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-26 00:08:58,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:09:00,481][47288] Updated weights for policy 0, policy_version 27456 (0.0028) [2024-04-26 00:09:03,255][47288] Updated weights for policy 0, policy_version 27466 (0.0035) [2024-04-26 00:09:03,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 450019328. Throughput: 0: 55944.2. Samples: 399395220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-26 00:09:03,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-26 00:09:06,243][47288] Updated weights for policy 0, policy_version 27476 (0.0024) [2024-04-26 00:09:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 450314240. Throughput: 0: 55903.1. Samples: 399730040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-26 00:09:08,923][47056] Avg episode reward: [(0, '0.186')] [2024-04-26 00:09:09,028][47288] Updated weights for policy 0, policy_version 27486 (0.0032) [2024-04-26 00:09:12,140][47288] Updated weights for policy 0, policy_version 27496 (0.0029) [2024-04-26 00:09:13,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 450576384. Throughput: 0: 56093.4. Samples: 399898240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-26 00:09:13,923][47056] Avg episode reward: [(0, '0.212')] [2024-04-26 00:09:14,926][47288] Updated weights for policy 0, policy_version 27506 (0.0026) [2024-04-26 00:09:18,102][47288] Updated weights for policy 0, policy_version 27516 (0.0027) [2024-04-26 00:09:18,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.4, 300 sec: 55761.1). Total num frames: 450854912. Throughput: 0: 56036.7. Samples: 400235160. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 00:09:18,923][47056] Avg episode reward: [(0, '0.206')] [2024-04-26 00:09:20,595][47288] Updated weights for policy 0, policy_version 27526 (0.0032) [2024-04-26 00:09:23,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 451133440. Throughput: 0: 55995.5. Samples: 400569220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 00:09:23,924][47056] Avg episode reward: [(0, '0.243')] [2024-04-26 00:09:24,049][47288] Updated weights for policy 0, policy_version 27536 (0.0030) [2024-04-26 00:09:26,397][47288] Updated weights for policy 0, policy_version 27546 (0.0028) [2024-04-26 00:09:28,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 451411968. Throughput: 0: 55583.9. Samples: 400722540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 00:09:28,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-26 00:09:29,819][47288] Updated weights for policy 0, policy_version 27556 (0.0034) [2024-04-26 00:09:31,511][47267] Signal inference workers to stop experience collection... (5750 times) [2024-04-26 00:09:31,544][47288] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-04-26 00:09:31,568][47267] Signal inference workers to resume experience collection... (5750 times) [2024-04-26 00:09:31,573][47288] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-04-26 00:09:32,310][47288] Updated weights for policy 0, policy_version 27566 (0.0029) [2024-04-26 00:09:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 451690496. Throughput: 0: 55748.0. Samples: 401062900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 00:09:33,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-26 00:09:35,653][47288] Updated weights for policy 0, policy_version 27576 (0.0032) [2024-04-26 00:09:38,165][47288] Updated weights for policy 0, policy_version 27586 (0.0037) [2024-04-26 00:09:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 451985408. Throughput: 0: 55833.2. Samples: 401401000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 00:09:38,923][47056] Avg episode reward: [(0, '0.273')] [2024-04-26 00:09:41,509][47288] Updated weights for policy 0, policy_version 27596 (0.0029) [2024-04-26 00:09:43,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 452280320. Throughput: 0: 55875.9. Samples: 401572320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 00:09:43,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 00:09:44,357][47288] Updated weights for policy 0, policy_version 27606 (0.0028) [2024-04-26 00:09:47,223][47288] Updated weights for policy 0, policy_version 27616 (0.0029) [2024-04-26 00:09:48,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 452526080. Throughput: 0: 55756.7. Samples: 401904260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 00:09:48,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-26 00:09:50,280][47288] Updated weights for policy 0, policy_version 27626 (0.0026) [2024-04-26 00:09:53,318][47288] Updated weights for policy 0, policy_version 27636 (0.0038) [2024-04-26 00:09:53,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 452804608. Throughput: 0: 55714.6. Samples: 402237200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 00:09:53,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 00:09:56,149][47288] Updated weights for policy 0, policy_version 27646 (0.0032) [2024-04-26 00:09:58,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 453099520. Throughput: 0: 55644.3. Samples: 402402240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 00:09:58,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-26 00:09:59,124][47288] Updated weights for policy 0, policy_version 27656 (0.0028) [2024-04-26 00:10:01,889][47288] Updated weights for policy 0, policy_version 27666 (0.0034) [2024-04-26 00:10:03,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 453361664. Throughput: 0: 55634.6. Samples: 402738720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 00:10:03,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 00:10:05,073][47288] Updated weights for policy 0, policy_version 27676 (0.0034) [2024-04-26 00:10:07,571][47288] Updated weights for policy 0, policy_version 27686 (0.0032) [2024-04-26 00:10:08,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 453640192. Throughput: 0: 55714.8. Samples: 403076380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 00:10:08,923][47056] Avg episode reward: [(0, '0.164')] [2024-04-26 00:10:11,103][47288] Updated weights for policy 0, policy_version 27696 (0.0030) [2024-04-26 00:10:13,588][47288] Updated weights for policy 0, policy_version 27706 (0.0032) [2024-04-26 00:10:13,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 453951488. Throughput: 0: 56156.8. Samples: 403249600. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 00:10:13,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 00:10:17,118][47288] Updated weights for policy 0, policy_version 27716 (0.0033) [2024-04-26 00:10:18,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 454213632. Throughput: 0: 56089.0. Samples: 403586900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 00:10:18,923][47056] Avg episode reward: [(0, '0.206')] [2024-04-26 00:10:19,498][47288] Updated weights for policy 0, policy_version 27726 (0.0028) [2024-04-26 00:10:22,827][47288] Updated weights for policy 0, policy_version 27736 (0.0033) [2024-04-26 00:10:23,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 454492160. Throughput: 0: 55960.5. Samples: 403919220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 00:10:23,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:10:25,251][47288] Updated weights for policy 0, policy_version 27746 (0.0030) [2024-04-26 00:10:28,767][47288] Updated weights for policy 0, policy_version 27756 (0.0025) [2024-04-26 00:10:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 454770688. Throughput: 0: 55790.0. Samples: 404082860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:10:28,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-26 00:10:31,243][47288] Updated weights for policy 0, policy_version 27766 (0.0028) [2024-04-26 00:10:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 455049216. Throughput: 0: 55978.9. Samples: 404423320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:10:33,923][47056] Avg episode reward: [(0, '0.210')] [2024-04-26 00:10:34,484][47288] Updated weights for policy 0, policy_version 27776 (0.0028) [2024-04-26 00:10:34,823][47267] Signal inference workers to stop experience collection... 
(5800 times) [2024-04-26 00:10:34,824][47267] Signal inference workers to resume experience collection... (5800 times) [2024-04-26 00:10:34,851][47288] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-04-26 00:10:34,851][47288] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-04-26 00:10:36,929][47288] Updated weights for policy 0, policy_version 27786 (0.0031) [2024-04-26 00:10:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 455344128. Throughput: 0: 56016.0. Samples: 404757920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:10:38,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-26 00:10:40,479][47288] Updated weights for policy 0, policy_version 27796 (0.0027) [2024-04-26 00:10:42,658][47288] Updated weights for policy 0, policy_version 27806 (0.0027) [2024-04-26 00:10:43,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 455589888. Throughput: 0: 55895.1. Samples: 404917520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:10:43,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:10:46,342][47288] Updated weights for policy 0, policy_version 27816 (0.0026) [2024-04-26 00:10:48,634][47288] Updated weights for policy 0, policy_version 27826 (0.0030) [2024-04-26 00:10:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.5, 300 sec: 55705.6). Total num frames: 455901184. Throughput: 0: 55947.6. Samples: 405256360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 00:10:48,923][47056] Avg episode reward: [(0, '0.216')] [2024-04-26 00:10:49,037][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000027827_455917568.pth... 
[2024-04-26 00:10:49,092][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000027007_442482688.pth [2024-04-26 00:10:52,171][47288] Updated weights for policy 0, policy_version 27836 (0.0029) [2024-04-26 00:10:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 456163328. Throughput: 0: 55869.7. Samples: 405590520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 00:10:53,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-26 00:10:54,518][47288] Updated weights for policy 0, policy_version 27846 (0.0030) [2024-04-26 00:10:57,946][47288] Updated weights for policy 0, policy_version 27856 (0.0027) [2024-04-26 00:10:58,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 456458240. Throughput: 0: 55724.1. Samples: 405757180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 00:10:58,923][47056] Avg episode reward: [(0, '0.265')] [2024-04-26 00:11:00,418][47288] Updated weights for policy 0, policy_version 27866 (0.0031) [2024-04-26 00:11:03,667][47288] Updated weights for policy 0, policy_version 27876 (0.0026) [2024-04-26 00:11:03,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 456720384. Throughput: 0: 55774.2. Samples: 406096740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 00:11:03,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:11:06,392][47288] Updated weights for policy 0, policy_version 27886 (0.0035) [2024-04-26 00:11:08,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 456982528. Throughput: 0: 55914.1. Samples: 406435360. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-26 00:11:08,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 00:11:09,555][47288] Updated weights for policy 0, policy_version 27896 (0.0029) [2024-04-26 00:11:12,132][47288] Updated weights for policy 0, policy_version 27906 (0.0027) [2024-04-26 00:11:13,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 457277440. Throughput: 0: 55906.0. Samples: 406598640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-26 00:11:13,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 00:11:15,299][47288] Updated weights for policy 0, policy_version 27916 (0.0028) [2024-04-26 00:11:18,043][47288] Updated weights for policy 0, policy_version 27926 (0.0030) [2024-04-26 00:11:18,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 457555968. Throughput: 0: 55818.2. Samples: 406935140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-26 00:11:18,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 00:11:21,259][47288] Updated weights for policy 0, policy_version 27936 (0.0029) [2024-04-26 00:11:23,911][47288] Updated weights for policy 0, policy_version 27946 (0.0027) [2024-04-26 00:11:23,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 457867264. Throughput: 0: 55860.4. Samples: 407271640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 00:11:23,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-26 00:11:26,325][47267] Signal inference workers to stop experience collection... (5850 times) [2024-04-26 00:11:26,358][47288] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-04-26 00:11:26,406][47267] Signal inference workers to resume experience collection... 
(5850 times) [2024-04-26 00:11:26,406][47288] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-04-26 00:11:27,105][47288] Updated weights for policy 0, policy_version 27956 (0.0028) [2024-04-26 00:11:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 458129408. Throughput: 0: 56127.1. Samples: 407443240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 00:11:28,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 00:11:29,817][47288] Updated weights for policy 0, policy_version 27966 (0.0029) [2024-04-26 00:11:32,791][47288] Updated weights for policy 0, policy_version 27976 (0.0030) [2024-04-26 00:11:33,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 458407936. Throughput: 0: 56080.1. Samples: 407779960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 00:11:33,923][47056] Avg episode reward: [(0, '0.219')] [2024-04-26 00:11:35,599][47288] Updated weights for policy 0, policy_version 27986 (0.0025) [2024-04-26 00:11:38,865][47288] Updated weights for policy 0, policy_version 27996 (0.0024) [2024-04-26 00:11:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 458686464. Throughput: 0: 56010.7. Samples: 408111000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 00:11:38,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 00:11:41,840][47288] Updated weights for policy 0, policy_version 28006 (0.0026) [2024-04-26 00:11:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 458964992. Throughput: 0: 56024.7. Samples: 408278300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 00:11:43,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-26 00:11:44,666][47288] Updated weights for policy 0, policy_version 28016 (0.0029) [2024-04-26 00:11:47,632][47288] Updated weights for policy 0, policy_version 28026 (0.0034) [2024-04-26 00:11:48,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 459243520. Throughput: 0: 55931.4. Samples: 408613660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 00:11:48,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 00:11:50,340][47288] Updated weights for policy 0, policy_version 28036 (0.0031) [2024-04-26 00:11:53,503][47288] Updated weights for policy 0, policy_version 28046 (0.0038) [2024-04-26 00:11:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 459522048. Throughput: 0: 55884.9. Samples: 408950180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 00:11:53,923][47056] Avg episode reward: [(0, '0.237')] [2024-04-26 00:11:56,305][47288] Updated weights for policy 0, policy_version 28056 (0.0032) [2024-04-26 00:11:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 459800576. Throughput: 0: 55875.5. Samples: 409113040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 00:11:58,923][47056] Avg episode reward: [(0, '0.262')] [2024-04-26 00:11:59,307][47288] Updated weights for policy 0, policy_version 28066 (0.0026) [2024-04-26 00:12:02,261][47288] Updated weights for policy 0, policy_version 28076 (0.0027) [2024-04-26 00:12:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 460079104. Throughput: 0: 55799.6. Samples: 409446120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:12:03,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-26 00:12:05,187][47288] Updated weights for policy 0, policy_version 28086 (0.0027) [2024-04-26 00:12:08,053][47288] Updated weights for policy 0, policy_version 28096 (0.0028) [2024-04-26 00:12:08,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 460374016. Throughput: 0: 55832.1. Samples: 409784080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:12:08,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:12:11,089][47288] Updated weights for policy 0, policy_version 28106 (0.0028) [2024-04-26 00:12:13,768][47288] Updated weights for policy 0, policy_version 28116 (0.0033) [2024-04-26 00:12:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 460652544. Throughput: 0: 55773.3. Samples: 409953040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:12:13,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:12:16,986][47288] Updated weights for policy 0, policy_version 28126 (0.0030) [2024-04-26 00:12:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 460914688. Throughput: 0: 55734.7. Samples: 410288020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:12:18,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 00:12:19,652][47288] Updated weights for policy 0, policy_version 28136 (0.0028) [2024-04-26 00:12:22,848][47288] Updated weights for policy 0, policy_version 28146 (0.0025) [2024-04-26 00:12:23,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 461193216. Throughput: 0: 55700.8. Samples: 410617540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 00:12:23,923][47056] Avg episode reward: [(0, '0.222')] [2024-04-26 00:12:25,640][47288] Updated weights for policy 0, policy_version 28156 (0.0027) [2024-04-26 00:12:28,850][47288] Updated weights for policy 0, policy_version 28166 (0.0037) [2024-04-26 00:12:28,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 461471744. Throughput: 0: 55817.9. Samples: 410790100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 00:12:28,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 00:12:30,019][47267] Signal inference workers to stop experience collection... (5900 times) [2024-04-26 00:12:30,063][47288] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-04-26 00:12:30,077][47267] Signal inference workers to resume experience collection... (5900 times) [2024-04-26 00:12:30,086][47288] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-04-26 00:12:31,480][47288] Updated weights for policy 0, policy_version 28176 (0.0042) [2024-04-26 00:12:33,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 461750272. Throughput: 0: 55763.5. Samples: 411123020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 00:12:33,923][47056] Avg episode reward: [(0, '0.167')] [2024-04-26 00:12:34,728][47288] Updated weights for policy 0, policy_version 28186 (0.0029) [2024-04-26 00:12:37,273][47288] Updated weights for policy 0, policy_version 28196 (0.0027) [2024-04-26 00:12:38,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 462012416. Throughput: 0: 55638.9. Samples: 411453920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 00:12:38,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-26 00:12:40,732][47288] Updated weights for policy 0, policy_version 28206 (0.0030) [2024-04-26 00:12:43,111][47288] Updated weights for policy 0, policy_version 28216 (0.0032) [2024-04-26 00:12:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 462307328. Throughput: 0: 55610.4. Samples: 411615500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 00:12:43,923][47056] Avg episode reward: [(0, '0.207')] [2024-04-26 00:12:46,703][47288] Updated weights for policy 0, policy_version 28226 (0.0027) [2024-04-26 00:12:48,923][47056] Fps is (10 sec: 58981.2, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 462602240. Throughput: 0: 55636.9. Samples: 411949780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 00:12:48,932][47056] Avg episode reward: [(0, '0.232')] [2024-04-26 00:12:48,943][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000028235_462602240.pth... [2024-04-26 00:12:48,989][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000027416_449183744.pth [2024-04-26 00:12:49,136][47288] Updated weights for policy 0, policy_version 28236 (0.0029) [2024-04-26 00:12:52,545][47288] Updated weights for policy 0, policy_version 28246 (0.0033) [2024-04-26 00:12:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 462864384. Throughput: 0: 55468.4. Samples: 412280160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 00:12:53,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 00:12:54,968][47288] Updated weights for policy 0, policy_version 28256 (0.0027) [2024-04-26 00:12:58,389][47288] Updated weights for policy 0, policy_version 28266 (0.0031) [2024-04-26 00:12:58,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 463142912. 
Throughput: 0: 55598.3. Samples: 412454960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:12:58,923][47056] Avg episode reward: [(0, '0.287')] [2024-04-26 00:13:01,161][47288] Updated weights for policy 0, policy_version 28276 (0.0026) [2024-04-26 00:13:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 463405056. Throughput: 0: 55556.9. Samples: 412788080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:13:03,923][47056] Avg episode reward: [(0, '0.261')] [2024-04-26 00:13:04,142][47288] Updated weights for policy 0, policy_version 28286 (0.0031) [2024-04-26 00:13:07,299][47288] Updated weights for policy 0, policy_version 28296 (0.0027) [2024-04-26 00:13:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 463683584. Throughput: 0: 55692.5. Samples: 413123700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:13:08,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 00:13:10,032][47288] Updated weights for policy 0, policy_version 28306 (0.0025) [2024-04-26 00:13:13,055][47288] Updated weights for policy 0, policy_version 28316 (0.0028) [2024-04-26 00:13:13,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 463962112. Throughput: 0: 55458.8. Samples: 413285740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:13:13,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 00:13:15,968][47288] Updated weights for policy 0, policy_version 28326 (0.0031) [2024-04-26 00:13:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 464240640. Throughput: 0: 55522.0. Samples: 413621500. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 00:13:18,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-26 00:13:18,989][47288] Updated weights for policy 0, policy_version 28336 (0.0023) [2024-04-26 00:13:21,821][47288] Updated weights for policy 0, policy_version 28346 (0.0026) [2024-04-26 00:13:23,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 464551936. Throughput: 0: 55570.6. Samples: 413954600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 00:13:23,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:13:24,740][47288] Updated weights for policy 0, policy_version 28356 (0.0027) [2024-04-26 00:13:27,658][47288] Updated weights for policy 0, policy_version 28366 (0.0026) [2024-04-26 00:13:28,923][47056] Fps is (10 sec: 58981.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 464830464. Throughput: 0: 55820.9. Samples: 414127440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 00:13:28,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 00:13:30,628][47288] Updated weights for policy 0, policy_version 28376 (0.0035) [2024-04-26 00:13:33,503][47288] Updated weights for policy 0, policy_version 28386 (0.0028) [2024-04-26 00:13:33,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 465092608. Throughput: 0: 55829.8. Samples: 414462120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 00:13:33,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 00:13:36,438][47288] Updated weights for policy 0, policy_version 28396 (0.0036) [2024-04-26 00:13:38,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 465354752. Throughput: 0: 55893.0. Samples: 414795340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 00:13:38,923][47056] Avg episode reward: [(0, '0.229')] [2024-04-26 00:13:39,361][47288] Updated weights for policy 0, policy_version 28406 (0.0026) [2024-04-26 00:13:41,377][47267] Signal inference workers to stop experience collection... (5950 times) [2024-04-26 00:13:41,384][47267] Signal inference workers to resume experience collection... (5950 times) [2024-04-26 00:13:41,397][47288] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-04-26 00:13:41,397][47288] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-04-26 00:13:42,594][47288] Updated weights for policy 0, policy_version 28416 (0.0036) [2024-04-26 00:13:43,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 465616896. Throughput: 0: 55562.3. Samples: 414955260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 00:13:43,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 00:13:45,152][47288] Updated weights for policy 0, policy_version 28426 (0.0026) [2024-04-26 00:13:48,803][47288] Updated weights for policy 0, policy_version 28436 (0.0029) [2024-04-26 00:13:48,923][47056] Fps is (10 sec: 54067.4, 60 sec: 54886.5, 300 sec: 55761.1). Total num frames: 465895424. Throughput: 0: 55565.9. Samples: 415288540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 00:13:48,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-26 00:13:50,940][47288] Updated weights for policy 0, policy_version 28446 (0.0031) [2024-04-26 00:13:53,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 466190336. Throughput: 0: 55531.4. Samples: 415622620. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 00:13:53,924][47056] Avg episode reward: [(0, '0.188')] [2024-04-26 00:13:54,607][47288] Updated weights for policy 0, policy_version 28456 (0.0032) [2024-04-26 00:13:56,860][47288] Updated weights for policy 0, policy_version 28466 (0.0028) [2024-04-26 00:13:58,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 466485248. Throughput: 0: 55762.5. Samples: 415795060. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 00:13:58,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-26 00:14:00,331][47288] Updated weights for policy 0, policy_version 28476 (0.0035) [2024-04-26 00:14:02,782][47288] Updated weights for policy 0, policy_version 28486 (0.0026) [2024-04-26 00:14:03,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 466763776. Throughput: 0: 55753.6. Samples: 416130420. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 00:14:03,923][47056] Avg episode reward: [(0, '0.218')] [2024-04-26 00:14:06,282][47288] Updated weights for policy 0, policy_version 28496 (0.0034) [2024-04-26 00:14:08,643][47288] Updated weights for policy 0, policy_version 28506 (0.0029) [2024-04-26 00:14:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 467042304. Throughput: 0: 55739.1. Samples: 416462860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 00:14:08,923][47056] Avg episode reward: [(0, '0.261')] [2024-04-26 00:14:12,138][47288] Updated weights for policy 0, policy_version 28516 (0.0031) [2024-04-26 00:14:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 467304448. Throughput: 0: 55647.9. Samples: 416631600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 00:14:13,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-26 00:14:14,455][47288] Updated weights for policy 0, policy_version 28526 (0.0025) [2024-04-26 00:14:17,926][47288] Updated weights for policy 0, policy_version 28536 (0.0028) [2024-04-26 00:14:18,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 467582976. Throughput: 0: 55633.3. Samples: 416965620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 00:14:18,923][47056] Avg episode reward: [(0, '0.216')] [2024-04-26 00:14:20,372][47288] Updated weights for policy 0, policy_version 28546 (0.0033) [2024-04-26 00:14:23,687][47288] Updated weights for policy 0, policy_version 28556 (0.0032) [2024-04-26 00:14:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 467861504. Throughput: 0: 55659.9. Samples: 417300040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 00:14:23,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-26 00:14:26,253][47288] Updated weights for policy 0, policy_version 28566 (0.0032) [2024-04-26 00:14:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 468140032. Throughput: 0: 55690.4. Samples: 417461340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 00:14:28,923][47056] Avg episode reward: [(0, '0.210')] [2024-04-26 00:14:29,856][47288] Updated weights for policy 0, policy_version 28576 (0.0028) [2024-04-26 00:14:32,137][47288] Updated weights for policy 0, policy_version 28586 (0.0028) [2024-04-26 00:14:33,923][47056] Fps is (10 sec: 58981.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 468451328. Throughput: 0: 55618.8. Samples: 417791400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:14:33,923][47056] Avg episode reward: [(0, '0.202')] [2024-04-26 00:14:35,540][47267] Signal inference workers to stop experience collection... 
(6000 times) [2024-04-26 00:14:35,540][47267] Signal inference workers to resume experience collection... (6000 times) [2024-04-26 00:14:35,567][47288] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-04-26 00:14:35,568][47288] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-04-26 00:14:35,654][47288] Updated weights for policy 0, policy_version 28596 (0.0027) [2024-04-26 00:14:38,026][47288] Updated weights for policy 0, policy_version 28606 (0.0027) [2024-04-26 00:14:38,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 468729856. Throughput: 0: 55697.5. Samples: 418129000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:14:38,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-26 00:14:41,469][47288] Updated weights for policy 0, policy_version 28616 (0.0028) [2024-04-26 00:14:43,923][47056] Fps is (10 sec: 54068.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 468992000. Throughput: 0: 55687.7. Samples: 418301000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:14:43,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 00:14:44,001][47288] Updated weights for policy 0, policy_version 28626 (0.0029) [2024-04-26 00:14:47,357][47288] Updated weights for policy 0, policy_version 28636 (0.0027) [2024-04-26 00:14:48,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 469254144. Throughput: 0: 55639.7. Samples: 418634200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 00:14:48,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:14:49,029][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000028642_469270528.pth... 
[2024-04-26 00:14:49,083][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000027827_455917568.pth [2024-04-26 00:14:49,943][47288] Updated weights for policy 0, policy_version 28646 (0.0028) [2024-04-26 00:14:53,155][47288] Updated weights for policy 0, policy_version 28656 (0.0033) [2024-04-26 00:14:53,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 469516288. Throughput: 0: 55723.9. Samples: 418970440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 00:14:53,923][47056] Avg episode reward: [(0, '0.200')] [2024-04-26 00:14:55,869][47288] Updated weights for policy 0, policy_version 28666 (0.0032) [2024-04-26 00:14:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 469811200. Throughput: 0: 55497.4. Samples: 419128980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 00:14:58,923][47056] Avg episode reward: [(0, '0.150')] [2024-04-26 00:14:59,001][47288] Updated weights for policy 0, policy_version 28676 (0.0030) [2024-04-26 00:15:01,753][47288] Updated weights for policy 0, policy_version 28686 (0.0031) [2024-04-26 00:15:03,923][47056] Fps is (10 sec: 57340.4, 60 sec: 55432.1, 300 sec: 55761.0). Total num frames: 470089728. Throughput: 0: 55566.0. Samples: 419466120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 00:15:03,924][47056] Avg episode reward: [(0, '0.211')] [2024-04-26 00:15:04,949][47288] Updated weights for policy 0, policy_version 28696 (0.0029) [2024-04-26 00:15:07,518][47288] Updated weights for policy 0, policy_version 28706 (0.0032) [2024-04-26 00:15:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 470384640. Throughput: 0: 55600.0. Samples: 419802040. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:08,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 00:15:11,024][47288] Updated weights for policy 0, policy_version 28716 (0.0030) [2024-04-26 00:15:13,354][47288] Updated weights for policy 0, policy_version 28726 (0.0027) [2024-04-26 00:15:13,923][47056] Fps is (10 sec: 58985.7, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 470679552. Throughput: 0: 55709.9. Samples: 419968280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:13,923][47056] Avg episode reward: [(0, '0.215')] [2024-04-26 00:15:17,004][47288] Updated weights for policy 0, policy_version 28736 (0.0029) [2024-04-26 00:15:18,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 470941696. Throughput: 0: 55919.7. Samples: 420307780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:18,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-26 00:15:19,376][47288] Updated weights for policy 0, policy_version 28746 (0.0031) [2024-04-26 00:15:22,697][47288] Updated weights for policy 0, policy_version 28756 (0.0028) [2024-04-26 00:15:23,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 471203840. Throughput: 0: 55675.1. Samples: 420634380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:23,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:15:25,256][47288] Updated weights for policy 0, policy_version 28766 (0.0030) [2024-04-26 00:15:28,450][47288] Updated weights for policy 0, policy_version 28776 (0.0034) [2024-04-26 00:15:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 471482368. Throughput: 0: 55435.8. Samples: 420795620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:28,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:15:31,256][47288] Updated weights for policy 0, policy_version 28786 (0.0029) [2024-04-26 00:15:33,923][47056] Fps is (10 sec: 52428.8, 60 sec: 54613.5, 300 sec: 55539.0). Total num frames: 471728128. Throughput: 0: 55519.1. Samples: 421132560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:33,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-26 00:15:34,434][47288] Updated weights for policy 0, policy_version 28796 (0.0027) [2024-04-26 00:15:36,966][47288] Updated weights for policy 0, policy_version 28806 (0.0024) [2024-04-26 00:15:38,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 472055808. Throughput: 0: 55509.0. Samples: 421468340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:38,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-26 00:15:40,147][47288] Updated weights for policy 0, policy_version 28816 (0.0033) [2024-04-26 00:15:42,968][47288] Updated weights for policy 0, policy_version 28826 (0.0031) [2024-04-26 00:15:43,923][47056] Fps is (10 sec: 60620.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 472334336. Throughput: 0: 55816.9. Samples: 421640740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:43,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 00:15:45,779][47267] Signal inference workers to stop experience collection... (6050 times) [2024-04-26 00:15:45,800][47288] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-04-26 00:15:45,870][47267] Signal inference workers to resume experience collection... 
(6050 times) [2024-04-26 00:15:45,870][47288] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-04-26 00:15:45,963][47288] Updated weights for policy 0, policy_version 28836 (0.0028) [2024-04-26 00:15:48,924][47056] Fps is (10 sec: 55698.2, 60 sec: 55977.5, 300 sec: 55760.9). Total num frames: 472612864. Throughput: 0: 55757.0. Samples: 421975220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:48,924][47056] Avg episode reward: [(0, '0.254')] [2024-04-26 00:15:48,924][47288] Updated weights for policy 0, policy_version 28846 (0.0034) [2024-04-26 00:15:51,842][47288] Updated weights for policy 0, policy_version 28856 (0.0032) [2024-04-26 00:15:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 472891392. Throughput: 0: 55727.0. Samples: 422309760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:53,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-26 00:15:54,605][47288] Updated weights for policy 0, policy_version 28866 (0.0032) [2024-04-26 00:15:57,703][47288] Updated weights for policy 0, policy_version 28876 (0.0030) [2024-04-26 00:15:58,923][47056] Fps is (10 sec: 55712.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 473169920. Throughput: 0: 55893.0. Samples: 422483460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:15:58,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 00:16:00,276][47288] Updated weights for policy 0, policy_version 28886 (0.0030) [2024-04-26 00:16:03,507][47288] Updated weights for policy 0, policy_version 28896 (0.0031) [2024-04-26 00:16:03,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55706.2, 300 sec: 55761.2). Total num frames: 473432064. Throughput: 0: 55819.6. Samples: 422819660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:16:03,923][47056] Avg episode reward: [(0, '0.245')] [2024-04-26 00:16:06,258][47288] Updated weights for policy 0, policy_version 28906 (0.0032) [2024-04-26 00:16:08,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 473726976. Throughput: 0: 56028.6. Samples: 423155680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 00:16:08,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:16:09,276][47288] Updated weights for policy 0, policy_version 28916 (0.0030) [2024-04-26 00:16:12,221][47288] Updated weights for policy 0, policy_version 28926 (0.0028) [2024-04-26 00:16:13,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 474005504. Throughput: 0: 56176.9. Samples: 423323580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 00:16:13,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 00:16:15,051][47288] Updated weights for policy 0, policy_version 28936 (0.0027) [2024-04-26 00:16:18,013][47288] Updated weights for policy 0, policy_version 28946 (0.0032) [2024-04-26 00:16:18,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 474300416. Throughput: 0: 56191.7. Samples: 423661200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 00:16:18,924][47056] Avg episode reward: [(0, '0.261')] [2024-04-26 00:16:20,807][47288] Updated weights for policy 0, policy_version 28956 (0.0027) [2024-04-26 00:16:23,834][47288] Updated weights for policy 0, policy_version 28966 (0.0031) [2024-04-26 00:16:23,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 474578944. Throughput: 0: 56170.2. Samples: 423996000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 00:16:23,923][47056] Avg episode reward: [(0, '0.245')] [2024-04-26 00:16:26,707][47288] Updated weights for policy 0, policy_version 28976 (0.0035) [2024-04-26 00:16:28,923][47056] Fps is (10 sec: 54068.6, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 474841088. Throughput: 0: 55997.4. Samples: 424160620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 00:16:28,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-26 00:16:29,627][47288] Updated weights for policy 0, policy_version 28986 (0.0031) [2024-04-26 00:16:32,638][47288] Updated weights for policy 0, policy_version 28996 (0.0025) [2024-04-26 00:16:33,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56797.7, 300 sec: 55761.1). Total num frames: 475136000. Throughput: 0: 56013.4. Samples: 424495760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 00:16:33,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:16:35,620][47288] Updated weights for policy 0, policy_version 29006 (0.0028) [2024-04-26 00:16:38,559][47288] Updated weights for policy 0, policy_version 29016 (0.0034) [2024-04-26 00:16:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 475414528. Throughput: 0: 56054.7. Samples: 424832220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 00:16:38,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 00:16:41,550][47288] Updated weights for policy 0, policy_version 29026 (0.0029) [2024-04-26 00:16:43,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 475676672. Throughput: 0: 55896.9. Samples: 424998820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 00:16:43,923][47056] Avg episode reward: [(0, '0.229')] [2024-04-26 00:16:44,547][47288] Updated weights for policy 0, policy_version 29036 (0.0030) [2024-04-26 00:16:47,253][47288] Updated weights for policy 0, policy_version 29046 (0.0035) [2024-04-26 00:16:48,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55706.7, 300 sec: 55705.6). Total num frames: 475955200. Throughput: 0: 55805.2. Samples: 425330900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:16:48,923][47056] Avg episode reward: [(0, '0.212')] [2024-04-26 00:16:48,954][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000029051_475971584.pth... [2024-04-26 00:16:48,999][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000028235_462602240.pth [2024-04-26 00:16:50,356][47288] Updated weights for policy 0, policy_version 29056 (0.0030) [2024-04-26 00:16:53,032][47288] Updated weights for policy 0, policy_version 29066 (0.0031) [2024-04-26 00:16:53,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 476266496. Throughput: 0: 55781.0. Samples: 425665820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:16:53,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-26 00:16:56,240][47288] Updated weights for policy 0, policy_version 29076 (0.0033) [2024-04-26 00:16:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 476512256. Throughput: 0: 55794.3. Samples: 425834320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:16:58,923][47056] Avg episode reward: [(0, '0.193')] [2024-04-26 00:16:59,074][47288] Updated weights for policy 0, policy_version 29086 (0.0025) [2024-04-26 00:17:00,219][47267] Signal inference workers to stop experience collection... 
(6100 times) [2024-04-26 00:17:00,264][47288] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-04-26 00:17:00,276][47267] Signal inference workers to resume experience collection... (6100 times) [2024-04-26 00:17:00,278][47288] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-04-26 00:17:02,052][47288] Updated weights for policy 0, policy_version 29096 (0.0031) [2024-04-26 00:17:03,923][47056] Fps is (10 sec: 52427.8, 60 sec: 55978.4, 300 sec: 55650.0). Total num frames: 476790784. Throughput: 0: 55780.8. Samples: 426171340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:17:03,924][47056] Avg episode reward: [(0, '0.243')] [2024-04-26 00:17:05,014][47288] Updated weights for policy 0, policy_version 29106 (0.0031) [2024-04-26 00:17:07,938][47288] Updated weights for policy 0, policy_version 29116 (0.0028) [2024-04-26 00:17:08,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 477085696. Throughput: 0: 55835.5. Samples: 426508600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 00:17:08,923][47056] Avg episode reward: [(0, '0.287')] [2024-04-26 00:17:10,841][47288] Updated weights for policy 0, policy_version 29126 (0.0027) [2024-04-26 00:17:13,860][47288] Updated weights for policy 0, policy_version 29136 (0.0027) [2024-04-26 00:17:13,923][47056] Fps is (10 sec: 57345.8, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 477364224. Throughput: 0: 55810.7. Samples: 426672100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 00:17:13,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 00:17:16,692][47288] Updated weights for policy 0, policy_version 29146 (0.0032) [2024-04-26 00:17:18,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 477626368. Throughput: 0: 55771.5. Samples: 427005480. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 00:17:18,923][47056] Avg episode reward: [(0, '0.255')] [2024-04-26 00:17:19,779][47288] Updated weights for policy 0, policy_version 29156 (0.0026) [2024-04-26 00:17:22,512][47288] Updated weights for policy 0, policy_version 29166 (0.0027) [2024-04-26 00:17:23,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 477904896. Throughput: 0: 55668.1. Samples: 427337280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:17:23,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-26 00:17:25,730][47288] Updated weights for policy 0, policy_version 29176 (0.0029) [2024-04-26 00:17:28,344][47288] Updated weights for policy 0, policy_version 29186 (0.0031) [2024-04-26 00:17:28,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 478199808. Throughput: 0: 55732.2. Samples: 427506780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:17:28,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 00:17:31,613][47288] Updated weights for policy 0, policy_version 29196 (0.0026) [2024-04-26 00:17:33,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 478478336. Throughput: 0: 55824.0. Samples: 427842980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:17:33,924][47056] Avg episode reward: [(0, '0.221')] [2024-04-26 00:17:34,124][47288] Updated weights for policy 0, policy_version 29206 (0.0029) [2024-04-26 00:17:37,518][47288] Updated weights for policy 0, policy_version 29216 (0.0032) [2024-04-26 00:17:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 478740480. Throughput: 0: 55841.6. Samples: 428178700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:17:38,924][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 00:17:40,108][47288] Updated weights for policy 0, policy_version 29226 (0.0032) [2024-04-26 00:17:43,358][47288] Updated weights for policy 0, policy_version 29236 (0.0031) [2024-04-26 00:17:43,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.4, 300 sec: 55705.6). Total num frames: 479035392. Throughput: 0: 55729.0. Samples: 428342140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 00:17:43,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:17:46,153][47288] Updated weights for policy 0, policy_version 29246 (0.0029) [2024-04-26 00:17:48,923][47056] Fps is (10 sec: 55707.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 479297536. Throughput: 0: 55733.2. Samples: 428679320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 00:17:48,923][47056] Avg episode reward: [(0, '0.275')] [2024-04-26 00:17:49,229][47288] Updated weights for policy 0, policy_version 29256 (0.0029) [2024-04-26 00:17:52,044][47288] Updated weights for policy 0, policy_version 29266 (0.0036) [2024-04-26 00:17:53,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 479576064. Throughput: 0: 55725.6. Samples: 429016260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 00:17:53,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:17:55,070][47288] Updated weights for policy 0, policy_version 29276 (0.0026) [2024-04-26 00:17:57,918][47288] Updated weights for policy 0, policy_version 29286 (0.0032) [2024-04-26 00:17:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 479870976. Throughput: 0: 55786.6. Samples: 429182500. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 00:17:58,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 00:18:00,994][47288] Updated weights for policy 0, policy_version 29296 (0.0032) [2024-04-26 00:18:03,760][47267] Signal inference workers to stop experience collection... (6150 times) [2024-04-26 00:18:03,763][47267] Signal inference workers to resume experience collection... (6150 times) [2024-04-26 00:18:03,771][47288] Updated weights for policy 0, policy_version 29306 (0.0034) [2024-04-26 00:18:03,790][47288] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-04-26 00:18:03,791][47288] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-04-26 00:18:03,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 480165888. Throughput: 0: 55909.8. Samples: 429521420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-26 00:18:03,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 00:18:06,910][47288] Updated weights for policy 0, policy_version 29316 (0.0023) [2024-04-26 00:18:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 480444416. Throughput: 0: 55790.1. Samples: 429847840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-26 00:18:08,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-26 00:18:09,566][47288] Updated weights for policy 0, policy_version 29326 (0.0035) [2024-04-26 00:18:12,954][47288] Updated weights for policy 0, policy_version 29336 (0.0026) [2024-04-26 00:18:13,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 480690176. Throughput: 0: 55895.0. Samples: 430022040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-26 00:18:13,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-26 00:18:15,277][47288] Updated weights for policy 0, policy_version 29346 (0.0027) [2024-04-26 00:18:18,656][47288] Updated weights for policy 0, policy_version 29356 (0.0026) [2024-04-26 00:18:18,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 480985088. Throughput: 0: 55947.1. Samples: 430360600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-26 00:18:18,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-26 00:18:21,129][47288] Updated weights for policy 0, policy_version 29366 (0.0028) [2024-04-26 00:18:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 481247232. Throughput: 0: 55983.5. Samples: 430697940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 00:18:23,923][47056] Avg episode reward: [(0, '0.243')] [2024-04-26 00:18:24,525][47288] Updated weights for policy 0, policy_version 29376 (0.0030) [2024-04-26 00:18:27,037][47288] Updated weights for policy 0, policy_version 29386 (0.0029) [2024-04-26 00:18:28,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 481542144. Throughput: 0: 55858.0. Samples: 430855740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 00:18:28,923][47056] Avg episode reward: [(0, '0.287')] [2024-04-26 00:18:30,363][47288] Updated weights for policy 0, policy_version 29396 (0.0025) [2024-04-26 00:18:33,010][47288] Updated weights for policy 0, policy_version 29406 (0.0027) [2024-04-26 00:18:33,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 481837056. Throughput: 0: 55754.6. Samples: 431188280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 00:18:33,923][47056] Avg episode reward: [(0, '0.216')] [2024-04-26 00:18:36,240][47288] Updated weights for policy 0, policy_version 29416 (0.0027) [2024-04-26 00:18:38,758][47288] Updated weights for policy 0, policy_version 29426 (0.0032) [2024-04-26 00:18:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56252.0, 300 sec: 55927.8). Total num frames: 482115584. Throughput: 0: 55734.4. Samples: 431524300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 00:18:38,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-26 00:18:42,073][47288] Updated weights for policy 0, policy_version 29436 (0.0033) [2024-04-26 00:18:43,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56252.0, 300 sec: 55983.3). Total num frames: 482410496. Throughput: 0: 56017.4. Samples: 431703280. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 00:18:43,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-26 00:18:44,530][47288] Updated weights for policy 0, policy_version 29446 (0.0028) [2024-04-26 00:18:47,940][47288] Updated weights for policy 0, policy_version 29456 (0.0040) [2024-04-26 00:18:48,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 482656256. Throughput: 0: 55889.4. Samples: 432036440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 00:18:48,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 00:18:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000029459_482656256.pth... [2024-04-26 00:18:48,986][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000028642_469270528.pth [2024-04-26 00:18:48,994][47267] Saving new best policy, reward=0.340! 
[2024-04-26 00:18:50,398][47288] Updated weights for policy 0, policy_version 29466 (0.0023) [2024-04-26 00:18:53,880][47288] Updated weights for policy 0, policy_version 29476 (0.0027) [2024-04-26 00:18:53,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 482934784. Throughput: 0: 56088.9. Samples: 432371840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 00:18:53,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 00:18:56,126][47288] Updated weights for policy 0, policy_version 29486 (0.0026) [2024-04-26 00:18:58,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 483196928. Throughput: 0: 55663.8. Samples: 432526920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 00:18:58,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:18:59,820][47288] Updated weights for policy 0, policy_version 29496 (0.0027) [2024-04-26 00:19:02,181][47288] Updated weights for policy 0, policy_version 29506 (0.0027) [2024-04-26 00:19:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 483491840. Throughput: 0: 55746.7. Samples: 432869200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:19:03,923][47056] Avg episode reward: [(0, '0.233')] [2024-04-26 00:19:05,601][47288] Updated weights for policy 0, policy_version 29516 (0.0030) [2024-04-26 00:19:06,246][47267] Signal inference workers to stop experience collection... (6200 times) [2024-04-26 00:19:06,247][47267] Signal inference workers to resume experience collection... 
(6200 times) [2024-04-26 00:19:06,260][47288] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-04-26 00:19:06,260][47288] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-04-26 00:19:07,856][47288] Updated weights for policy 0, policy_version 29526 (0.0024) [2024-04-26 00:19:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 483770368. Throughput: 0: 55613.6. Samples: 433200560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:19:08,923][47056] Avg episode reward: [(0, '0.310')] [2024-04-26 00:19:11,467][47288] Updated weights for policy 0, policy_version 29536 (0.0035) [2024-04-26 00:19:13,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.5, 300 sec: 55872.2). Total num frames: 484065280. Throughput: 0: 55941.2. Samples: 433373100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:19:13,923][47056] Avg episode reward: [(0, '0.274')] [2024-04-26 00:19:13,937][47288] Updated weights for policy 0, policy_version 29546 (0.0028) [2024-04-26 00:19:17,333][47288] Updated weights for policy 0, policy_version 29556 (0.0034) [2024-04-26 00:19:18,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 484360192. Throughput: 0: 56004.9. Samples: 433708500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:19:18,923][47056] Avg episode reward: [(0, '0.291')] [2024-04-26 00:19:19,908][47288] Updated weights for policy 0, policy_version 29566 (0.0036) [2024-04-26 00:19:23,162][47288] Updated weights for policy 0, policy_version 29576 (0.0034) [2024-04-26 00:19:23,923][47056] Fps is (10 sec: 55706.7, 60 sec: 56251.7, 300 sec: 55872.3). Total num frames: 484622336. Throughput: 0: 56026.7. Samples: 434045500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 00:19:23,923][47056] Avg episode reward: [(0, '0.231')] [2024-04-26 00:19:25,861][47288] Updated weights for policy 0, policy_version 29586 (0.0028) [2024-04-26 00:19:28,923][47056] Fps is (10 sec: 50790.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 484868096. Throughput: 0: 55649.8. Samples: 434207520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 00:19:28,923][47056] Avg episode reward: [(0, '0.361')] [2024-04-26 00:19:28,973][47267] Saving new best policy, reward=0.361! [2024-04-26 00:19:29,218][47288] Updated weights for policy 0, policy_version 29596 (0.0038) [2024-04-26 00:19:31,674][47288] Updated weights for policy 0, policy_version 29606 (0.0027) [2024-04-26 00:19:33,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 485146624. Throughput: 0: 55647.9. Samples: 434540600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 00:19:33,923][47056] Avg episode reward: [(0, '0.273')] [2024-04-26 00:19:35,084][47288] Updated weights for policy 0, policy_version 29616 (0.0032) [2024-04-26 00:19:37,514][47288] Updated weights for policy 0, policy_version 29626 (0.0024) [2024-04-26 00:19:38,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 485441536. Throughput: 0: 55550.7. Samples: 434871620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 00:19:38,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 00:19:40,935][47288] Updated weights for policy 0, policy_version 29636 (0.0029) [2024-04-26 00:19:43,473][47288] Updated weights for policy 0, policy_version 29646 (0.0033) [2024-04-26 00:19:43,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 485736448. Throughput: 0: 55880.0. Samples: 435041520. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:19:43,923][47056] Avg episode reward: [(0, '0.295')] [2024-04-26 00:19:46,707][47288] Updated weights for policy 0, policy_version 29656 (0.0031) [2024-04-26 00:19:48,924][47056] Fps is (10 sec: 57337.7, 60 sec: 55977.6, 300 sec: 55927.5). Total num frames: 486014976. Throughput: 0: 55721.8. Samples: 435376740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:19:48,924][47056] Avg episode reward: [(0, '0.238')] [2024-04-26 00:19:49,363][47288] Updated weights for policy 0, policy_version 29666 (0.0028) [2024-04-26 00:19:52,552][47288] Updated weights for policy 0, policy_version 29676 (0.0028) [2024-04-26 00:19:53,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 486326272. Throughput: 0: 55761.3. Samples: 435709820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:19:53,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 00:19:55,154][47288] Updated weights for policy 0, policy_version 29686 (0.0029) [2024-04-26 00:19:58,532][47288] Updated weights for policy 0, policy_version 29696 (0.0033) [2024-04-26 00:19:58,923][47056] Fps is (10 sec: 57350.3, 60 sec: 56524.8, 300 sec: 55927.9). Total num frames: 486588416. Throughput: 0: 55793.0. Samples: 435883780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-26 00:19:58,923][47056] Avg episode reward: [(0, '0.318')] [2024-04-26 00:20:01,192][47288] Updated weights for policy 0, policy_version 29706 (0.0026) [2024-04-26 00:20:03,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 486834176. Throughput: 0: 55811.0. Samples: 436220000. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-26 00:20:03,924][47056] Avg episode reward: [(0, '0.308')] [2024-04-26 00:20:04,333][47288] Updated weights for policy 0, policy_version 29716 (0.0027) [2024-04-26 00:20:05,506][47267] Signal inference workers to stop experience collection... 
(6250 times) [2024-04-26 00:20:05,515][47267] Signal inference workers to resume experience collection... (6250 times) [2024-04-26 00:20:05,528][47288] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-04-26 00:20:05,529][47288] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-04-26 00:20:07,113][47288] Updated weights for policy 0, policy_version 29726 (0.0027) [2024-04-26 00:20:08,923][47056] Fps is (10 sec: 50790.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 487096320. Throughput: 0: 55783.1. Samples: 436555740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-26 00:20:08,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 00:20:10,244][47288] Updated weights for policy 0, policy_version 29736 (0.0026) [2024-04-26 00:20:12,964][47288] Updated weights for policy 0, policy_version 29746 (0.0026) [2024-04-26 00:20:13,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 487391232. Throughput: 0: 55507.6. Samples: 436705360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-26 00:20:13,923][47056] Avg episode reward: [(0, '0.297')] [2024-04-26 00:20:16,056][47288] Updated weights for policy 0, policy_version 29756 (0.0029) [2024-04-26 00:20:18,722][47288] Updated weights for policy 0, policy_version 29766 (0.0025) [2024-04-26 00:20:18,923][47056] Fps is (10 sec: 58981.6, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 487686144. Throughput: 0: 55594.7. Samples: 437042360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:20:18,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 00:20:22,010][47288] Updated weights for policy 0, policy_version 29776 (0.0026) [2024-04-26 00:20:23,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 487964672. Throughput: 0: 55707.7. Samples: 437378460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:20:23,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-26 00:20:24,582][47288] Updated weights for policy 0, policy_version 29786 (0.0033) [2024-04-26 00:20:27,880][47288] Updated weights for policy 0, policy_version 29796 (0.0030) [2024-04-26 00:20:28,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56797.8, 300 sec: 56094.4). Total num frames: 488275968. Throughput: 0: 55852.5. Samples: 437554880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:20:28,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:20:30,572][47288] Updated weights for policy 0, policy_version 29806 (0.0032) [2024-04-26 00:20:33,696][47288] Updated weights for policy 0, policy_version 29816 (0.0029) [2024-04-26 00:20:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 488521728. Throughput: 0: 55832.5. Samples: 437889140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:20:33,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-26 00:20:36,557][47288] Updated weights for policy 0, policy_version 29826 (0.0029) [2024-04-26 00:20:38,923][47056] Fps is (10 sec: 49151.5, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 488767488. Throughput: 0: 55768.8. Samples: 438219420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:20:38,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 00:20:39,564][47288] Updated weights for policy 0, policy_version 29836 (0.0030) [2024-04-26 00:20:42,385][47288] Updated weights for policy 0, policy_version 29846 (0.0029) [2024-04-26 00:20:43,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55761.4). Total num frames: 489062400. Throughput: 0: 55545.8. Samples: 438383340. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:20:43,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 00:20:45,505][47288] Updated weights for policy 0, policy_version 29856 (0.0028) [2024-04-26 00:20:48,399][47288] Updated weights for policy 0, policy_version 29866 (0.0026) [2024-04-26 00:20:48,923][47056] Fps is (10 sec: 57345.2, 60 sec: 55433.6, 300 sec: 55761.2). Total num frames: 489340928. Throughput: 0: 55487.3. Samples: 438716920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:20:48,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 00:20:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000029867_489340928.pth... [2024-04-26 00:20:48,986][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000029051_475971584.pth [2024-04-26 00:20:51,377][47288] Updated weights for policy 0, policy_version 29876 (0.0032) [2024-04-26 00:20:53,923][47056] Fps is (10 sec: 55705.2, 60 sec: 54886.4, 300 sec: 55761.1). Total num frames: 489619456. Throughput: 0: 55428.8. Samples: 439050040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:20:53,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 00:20:53,924][47267] Saving new best policy, reward=0.378! [2024-04-26 00:20:54,493][47288] Updated weights for policy 0, policy_version 29886 (0.0028) [2024-04-26 00:20:57,282][47288] Updated weights for policy 0, policy_version 29896 (0.0030) [2024-04-26 00:20:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 489914368. Throughput: 0: 55829.7. Samples: 439217700. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 00:20:58,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 00:21:00,191][47288] Updated weights for policy 0, policy_version 29906 (0.0026) [2024-04-26 00:21:03,082][47288] Updated weights for policy 0, policy_version 29916 (0.0025) [2024-04-26 00:21:03,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 490192896. Throughput: 0: 55758.8. Samples: 439551500. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 00:21:03,923][47056] Avg episode reward: [(0, '0.236')] [2024-04-26 00:21:03,975][47267] Signal inference workers to stop experience collection... (6300 times) [2024-04-26 00:21:03,976][47267] Signal inference workers to resume experience collection... (6300 times) [2024-04-26 00:21:04,004][47288] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-04-26 00:21:04,004][47288] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-04-26 00:21:05,946][47288] Updated weights for policy 0, policy_version 29926 (0.0033) [2024-04-26 00:21:08,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 490455040. Throughput: 0: 55746.2. Samples: 439887040. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 00:21:08,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-26 00:21:09,009][47288] Updated weights for policy 0, policy_version 29936 (0.0032) [2024-04-26 00:21:11,841][47288] Updated weights for policy 0, policy_version 29946 (0.0032) [2024-04-26 00:21:13,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 490733568. Throughput: 0: 55476.0. Samples: 440051300. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 00:21:13,923][47056] Avg episode reward: [(0, '0.243')] [2024-04-26 00:21:14,883][47288] Updated weights for policy 0, policy_version 29956 (0.0027) [2024-04-26 00:21:17,712][47288] Updated weights for policy 0, policy_version 29966 (0.0030) [2024-04-26 00:21:18,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 490995712. Throughput: 0: 55522.8. Samples: 440387660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 00:21:18,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 00:21:20,721][47288] Updated weights for policy 0, policy_version 29976 (0.0032) [2024-04-26 00:21:23,769][47288] Updated weights for policy 0, policy_version 29986 (0.0027) [2024-04-26 00:21:23,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 491290624. Throughput: 0: 55617.5. Samples: 440722200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 00:21:23,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:21:26,403][47288] Updated weights for policy 0, policy_version 29996 (0.0028) [2024-04-26 00:21:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 54886.5, 300 sec: 55705.6). Total num frames: 491569152. Throughput: 0: 55563.2. Samples: 440883680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 00:21:28,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 00:21:28,932][47267] Saving new best policy, reward=0.409! [2024-04-26 00:21:29,642][47288] Updated weights for policy 0, policy_version 30006 (0.0029) [2024-04-26 00:21:32,273][47288] Updated weights for policy 0, policy_version 30016 (0.0033) [2024-04-26 00:21:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 491864064. Throughput: 0: 55651.4. Samples: 441221240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 00:21:33,923][47056] Avg episode reward: [(0, '0.304')] [2024-04-26 00:21:35,325][47288] Updated weights for policy 0, policy_version 30026 (0.0035) [2024-04-26 00:21:38,212][47288] Updated weights for policy 0, policy_version 30036 (0.0025) [2024-04-26 00:21:38,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56525.0, 300 sec: 55872.2). Total num frames: 492158976. Throughput: 0: 55658.7. Samples: 441554680. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-04-26 00:21:38,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:21:41,339][47288] Updated weights for policy 0, policy_version 30046 (0.0030) [2024-04-26 00:21:43,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 492421120. Throughput: 0: 55741.0. Samples: 441726040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-04-26 00:21:43,923][47056] Avg episode reward: [(0, '0.214')] [2024-04-26 00:21:43,931][47288] Updated weights for policy 0, policy_version 30056 (0.0027) [2024-04-26 00:21:47,306][47288] Updated weights for policy 0, policy_version 30066 (0.0031) [2024-04-26 00:21:48,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 492683264. Throughput: 0: 55765.3. Samples: 442060940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-04-26 00:21:48,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 00:21:50,101][47288] Updated weights for policy 0, policy_version 30076 (0.0031) [2024-04-26 00:21:53,110][47288] Updated weights for policy 0, policy_version 30086 (0.0027) [2024-04-26 00:21:53,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 492945408. Throughput: 0: 55755.9. Samples: 442396060. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 25.0) [2024-04-26 00:21:53,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 00:21:56,029][47288] Updated weights for policy 0, policy_version 30096 (0.0032) [2024-04-26 00:21:58,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 493240320. Throughput: 0: 55640.0. Samples: 442555100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 19.0) [2024-04-26 00:21:58,923][47056] Avg episode reward: [(0, '0.245')] [2024-04-26 00:21:58,962][47288] Updated weights for policy 0, policy_version 30106 (0.0033) [2024-04-26 00:22:01,907][47288] Updated weights for policy 0, policy_version 30116 (0.0034) [2024-04-26 00:22:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 493518848. Throughput: 0: 55511.0. Samples: 442885660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 19.0) [2024-04-26 00:22:03,923][47056] Avg episode reward: [(0, '0.219')] [2024-04-26 00:22:04,941][47288] Updated weights for policy 0, policy_version 30126 (0.0034) [2024-04-26 00:22:07,668][47288] Updated weights for policy 0, policy_version 30136 (0.0028) [2024-04-26 00:22:08,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 493797376. Throughput: 0: 55577.7. Samples: 443223200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 19.0) [2024-04-26 00:22:08,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-26 00:22:11,098][47288] Updated weights for policy 0, policy_version 30146 (0.0028) [2024-04-26 00:22:11,329][47267] Signal inference workers to stop experience collection... (6350 times) [2024-04-26 00:22:11,364][47288] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-04-26 00:22:11,417][47267] Signal inference workers to resume experience collection... 
(6350 times) [2024-04-26 00:22:11,417][47288] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-04-26 00:22:13,570][47288] Updated weights for policy 0, policy_version 30156 (0.0025) [2024-04-26 00:22:13,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 494092288. Throughput: 0: 55961.4. Samples: 443401940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 19.0) [2024-04-26 00:22:13,923][47056] Avg episode reward: [(0, '0.231')] [2024-04-26 00:22:16,942][47288] Updated weights for policy 0, policy_version 30166 (0.0024) [2024-04-26 00:22:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 494370816. Throughput: 0: 55779.2. Samples: 443731300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-26 00:22:18,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 00:22:19,417][47288] Updated weights for policy 0, policy_version 30176 (0.0027) [2024-04-26 00:22:22,749][47288] Updated weights for policy 0, policy_version 30186 (0.0032) [2024-04-26 00:22:23,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 494649344. Throughput: 0: 55835.5. Samples: 444067280. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-26 00:22:23,923][47056] Avg episode reward: [(0, '0.279')] [2024-04-26 00:22:25,245][47288] Updated weights for policy 0, policy_version 30196 (0.0032) [2024-04-26 00:22:28,629][47288] Updated weights for policy 0, policy_version 30206 (0.0033) [2024-04-26 00:22:28,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 494927872. Throughput: 0: 55790.5. Samples: 444236620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-26 00:22:28,923][47056] Avg episode reward: [(0, '0.316')] [2024-04-26 00:22:31,131][47288] Updated weights for policy 0, policy_version 30216 (0.0032) [2024-04-26 00:22:33,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55761.2). 
Total num frames: 495190016. Throughput: 0: 55785.7. Samples: 444571300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-26 00:22:33,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 00:22:34,548][47288] Updated weights for policy 0, policy_version 30226 (0.0032) [2024-04-26 00:22:37,004][47288] Updated weights for policy 0, policy_version 30236 (0.0034) [2024-04-26 00:22:38,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 495468544. Throughput: 0: 55822.2. Samples: 444908060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 00:22:38,923][47056] Avg episode reward: [(0, '0.310')] [2024-04-26 00:22:40,360][47288] Updated weights for policy 0, policy_version 30246 (0.0028) [2024-04-26 00:22:42,733][47288] Updated weights for policy 0, policy_version 30256 (0.0030) [2024-04-26 00:22:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 495747072. Throughput: 0: 55873.3. Samples: 445069400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 00:22:43,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 00:22:45,984][47288] Updated weights for policy 0, policy_version 30266 (0.0031) [2024-04-26 00:22:48,525][47288] Updated weights for policy 0, policy_version 30276 (0.0032) [2024-04-26 00:22:48,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 496058368. Throughput: 0: 56008.9. Samples: 445406060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 00:22:48,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-26 00:22:48,939][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000030278_496074752.pth... 
[2024-04-26 00:22:48,989][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000029459_482656256.pth [2024-04-26 00:22:52,037][47288] Updated weights for policy 0, policy_version 30286 (0.0032) [2024-04-26 00:22:53,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 496320512. Throughput: 0: 56065.3. Samples: 445746140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 00:22:53,923][47056] Avg episode reward: [(0, '0.304')] [2024-04-26 00:22:54,303][47288] Updated weights for policy 0, policy_version 30296 (0.0030) [2024-04-26 00:22:57,757][47288] Updated weights for policy 0, policy_version 30306 (0.0028) [2024-04-26 00:22:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 496615424. Throughput: 0: 56000.2. Samples: 445921960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 00:22:58,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-26 00:23:00,199][47288] Updated weights for policy 0, policy_version 30316 (0.0036) [2024-04-26 00:23:03,525][47288] Updated weights for policy 0, policy_version 30326 (0.0030) [2024-04-26 00:23:03,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 496877568. Throughput: 0: 56140.5. Samples: 446257620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 00:23:03,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 00:23:05,918][47288] Updated weights for policy 0, policy_version 30336 (0.0028) [2024-04-26 00:23:08,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 497139712. Throughput: 0: 56234.9. Samples: 446597860. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 00:23:08,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-26 00:23:09,348][47288] Updated weights for policy 0, policy_version 30346 (0.0026) [2024-04-26 00:23:11,776][47288] Updated weights for policy 0, policy_version 30356 (0.0028) [2024-04-26 00:23:12,175][47267] Signal inference workers to stop experience collection... (6400 times) [2024-04-26 00:23:12,175][47267] Signal inference workers to resume experience collection... (6400 times) [2024-04-26 00:23:12,198][47288] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-04-26 00:23:12,198][47288] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-04-26 00:23:13,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 497434624. Throughput: 0: 55945.9. Samples: 446754180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 00:23:13,923][47056] Avg episode reward: [(0, '0.318')] [2024-04-26 00:23:15,173][47288] Updated weights for policy 0, policy_version 30366 (0.0028) [2024-04-26 00:23:17,637][47288] Updated weights for policy 0, policy_version 30376 (0.0027) [2024-04-26 00:23:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 497713152. Throughput: 0: 56057.6. Samples: 447093900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 00:23:18,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:23:21,014][47288] Updated weights for policy 0, policy_version 30386 (0.0036) [2024-04-26 00:23:23,653][47288] Updated weights for policy 0, policy_version 30396 (0.0027) [2024-04-26 00:23:23,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 498024448. Throughput: 0: 55936.9. Samples: 447425220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 00:23:23,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 00:23:26,751][47288] Updated weights for policy 0, policy_version 30406 (0.0028) [2024-04-26 00:23:28,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 498302976. Throughput: 0: 56223.7. Samples: 447599460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 00:23:28,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 00:23:29,421][47288] Updated weights for policy 0, policy_version 30416 (0.0026) [2024-04-26 00:23:32,490][47288] Updated weights for policy 0, policy_version 30426 (0.0026) [2024-04-26 00:23:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 498581504. Throughput: 0: 56165.0. Samples: 447933480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 00:23:33,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-26 00:23:35,169][47288] Updated weights for policy 0, policy_version 30436 (0.0032) [2024-04-26 00:23:38,392][47288] Updated weights for policy 0, policy_version 30446 (0.0029) [2024-04-26 00:23:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 498843648. Throughput: 0: 56088.9. Samples: 448270140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:23:38,923][47056] Avg episode reward: [(0, '0.229')] [2024-04-26 00:23:41,087][47288] Updated weights for policy 0, policy_version 30456 (0.0033) [2024-04-26 00:23:43,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 499122176. Throughput: 0: 55879.8. Samples: 448436540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:23:43,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 00:23:44,382][47288] Updated weights for policy 0, policy_version 30466 (0.0029) [2024-04-26 00:23:46,913][47288] Updated weights for policy 0, policy_version 30476 (0.0031) [2024-04-26 00:23:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 499400704. Throughput: 0: 55914.8. Samples: 448773800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:23:48,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 00:23:50,211][47288] Updated weights for policy 0, policy_version 30486 (0.0026) [2024-04-26 00:23:52,785][47288] Updated weights for policy 0, policy_version 30496 (0.0028) [2024-04-26 00:23:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.9, 300 sec: 55927.8). Total num frames: 499695616. Throughput: 0: 55919.9. Samples: 449114240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 00:23:53,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-26 00:23:56,075][47288] Updated weights for policy 0, policy_version 30506 (0.0030) [2024-04-26 00:23:58,445][47288] Updated weights for policy 0, policy_version 30516 (0.0026) [2024-04-26 00:23:58,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 499974144. Throughput: 0: 56308.8. Samples: 449288080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-26 00:23:58,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 00:24:01,812][47288] Updated weights for policy 0, policy_version 30526 (0.0032) [2024-04-26 00:24:03,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 500269056. Throughput: 0: 56316.1. Samples: 449628120. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-26 00:24:03,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 00:24:04,310][47288] Updated weights for policy 0, policy_version 30536 (0.0028) [2024-04-26 00:24:07,735][47288] Updated weights for policy 0, policy_version 30546 (0.0031) [2024-04-26 00:24:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56798.0, 300 sec: 55872.2). Total num frames: 500547584. Throughput: 0: 56386.6. Samples: 449962620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-26 00:24:08,932][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 00:24:10,345][47288] Updated weights for policy 0, policy_version 30556 (0.0027) [2024-04-26 00:24:13,491][47288] Updated weights for policy 0, policy_version 30566 (0.0031) [2024-04-26 00:24:13,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 500809728. Throughput: 0: 56130.3. Samples: 450125320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-26 00:24:13,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-26 00:24:16,197][47288] Updated weights for policy 0, policy_version 30576 (0.0033) [2024-04-26 00:24:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 501088256. Throughput: 0: 56227.5. Samples: 450463720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 00:24:18,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-26 00:24:19,410][47288] Updated weights for policy 0, policy_version 30586 (0.0026) [2024-04-26 00:24:22,033][47288] Updated weights for policy 0, policy_version 30596 (0.0038) [2024-04-26 00:24:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 501366784. Throughput: 0: 56147.6. Samples: 450796780. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 00:24:23,923][47056] Avg episode reward: [(0, '0.218')] [2024-04-26 00:24:25,234][47288] Updated weights for policy 0, policy_version 30606 (0.0032) [2024-04-26 00:24:27,679][47288] Updated weights for policy 0, policy_version 30616 (0.0033) [2024-04-26 00:24:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 501661696. Throughput: 0: 56163.0. Samples: 450963880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 00:24:28,923][47056] Avg episode reward: [(0, '0.206')] [2024-04-26 00:24:30,987][47288] Updated weights for policy 0, policy_version 30626 (0.0031) [2024-04-26 00:24:31,727][47267] Signal inference workers to stop experience collection... (6450 times) [2024-04-26 00:24:31,727][47267] Signal inference workers to resume experience collection... (6450 times) [2024-04-26 00:24:31,754][47288] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-04-26 00:24:31,755][47288] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-04-26 00:24:33,568][47288] Updated weights for policy 0, policy_version 30636 (0.0035) [2024-04-26 00:24:33,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 501956608. Throughput: 0: 56146.8. Samples: 451300400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 00:24:33,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:24:36,939][47288] Updated weights for policy 0, policy_version 30646 (0.0036) [2024-04-26 00:24:38,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 502202368. Throughput: 0: 56093.4. Samples: 451638460. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 00:24:38,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 00:24:39,449][47288] Updated weights for policy 0, policy_version 30656 (0.0034) [2024-04-26 00:24:42,766][47288] Updated weights for policy 0, policy_version 30666 (0.0026) [2024-04-26 00:24:43,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.7, 300 sec: 55872.4). Total num frames: 502497280. Throughput: 0: 55918.3. Samples: 451804400. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 00:24:43,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 00:24:45,337][47288] Updated weights for policy 0, policy_version 30676 (0.0030) [2024-04-26 00:24:48,756][47288] Updated weights for policy 0, policy_version 30686 (0.0031) [2024-04-26 00:24:48,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 502759424. Throughput: 0: 55863.1. Samples: 452141960. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 00:24:48,924][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 00:24:49,017][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000030687_502775808.pth... [2024-04-26 00:24:49,082][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000029867_489340928.pth [2024-04-26 00:24:51,276][47288] Updated weights for policy 0, policy_version 30696 (0.0031) [2024-04-26 00:24:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 503037952. Throughput: 0: 55821.9. Samples: 452474600. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 00:24:53,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 00:24:54,475][47288] Updated weights for policy 0, policy_version 30706 (0.0034) [2024-04-26 00:24:57,181][47288] Updated weights for policy 0, policy_version 30716 (0.0033) [2024-04-26 00:24:58,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 503349248. 
Throughput: 0: 56035.4. Samples: 452646920. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-04-26 00:24:58,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 00:25:00,247][47288] Updated weights for policy 0, policy_version 30726 (0.0029) [2024-04-26 00:25:03,041][47288] Updated weights for policy 0, policy_version 30736 (0.0027) [2024-04-26 00:25:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.8, 300 sec: 55983.3). Total num frames: 503611392. Throughput: 0: 55917.1. Samples: 452979980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-04-26 00:25:03,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 00:25:06,272][47288] Updated weights for policy 0, policy_version 30746 (0.0027) [2024-04-26 00:25:08,729][47288] Updated weights for policy 0, policy_version 30756 (0.0027) [2024-04-26 00:25:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 503906304. Throughput: 0: 55917.8. Samples: 453313080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-04-26 00:25:08,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 00:25:12,216][47288] Updated weights for policy 0, policy_version 30766 (0.0030) [2024-04-26 00:25:13,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 504168448. Throughput: 0: 55952.0. Samples: 453481720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-04-26 00:25:13,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 00:25:14,630][47288] Updated weights for policy 0, policy_version 30776 (0.0031) [2024-04-26 00:25:18,015][47288] Updated weights for policy 0, policy_version 30786 (0.0026) [2024-04-26 00:25:18,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 504430592. Throughput: 0: 55886.2. Samples: 453815280. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:25:18,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 00:25:20,592][47288] Updated weights for policy 0, policy_version 30796 (0.0032) [2024-04-26 00:25:23,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 504709120. Throughput: 0: 55799.8. Samples: 454149440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:25:23,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 00:25:24,000][47288] Updated weights for policy 0, policy_version 30806 (0.0026) [2024-04-26 00:25:26,578][47288] Updated weights for policy 0, policy_version 30816 (0.0024) [2024-04-26 00:25:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 505004032. Throughput: 0: 55752.3. Samples: 454313260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:25:28,932][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 00:25:29,812][47288] Updated weights for policy 0, policy_version 30826 (0.0028) [2024-04-26 00:25:32,563][47288] Updated weights for policy 0, policy_version 30836 (0.0030) [2024-04-26 00:25:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 505282560. Throughput: 0: 55664.5. Samples: 454646860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:25:33,923][47056] Avg episode reward: [(0, '0.250')] [2024-04-26 00:25:35,652][47288] Updated weights for policy 0, policy_version 30846 (0.0038) [2024-04-26 00:25:38,320][47288] Updated weights for policy 0, policy_version 30856 (0.0031) [2024-04-26 00:25:38,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.9, 300 sec: 56038.8). Total num frames: 505593856. Throughput: 0: 55594.0. Samples: 454976340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 00:25:38,923][47056] Avg episode reward: [(0, '0.262')] [2024-04-26 00:25:40,603][47267] Signal inference workers to stop experience collection... 
(6500 times) [2024-04-26 00:25:40,635][47288] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-04-26 00:25:40,649][47267] Signal inference workers to resume experience collection... (6500 times) [2024-04-26 00:25:40,656][47288] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-04-26 00:25:41,589][47288] Updated weights for policy 0, policy_version 30866 (0.0033) [2024-04-26 00:25:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 505839616. Throughput: 0: 55665.5. Samples: 455151860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 00:25:43,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:25:44,340][47288] Updated weights for policy 0, policy_version 30876 (0.0031) [2024-04-26 00:25:47,605][47288] Updated weights for policy 0, policy_version 30886 (0.0033) [2024-04-26 00:25:48,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 506118144. Throughput: 0: 55751.5. Samples: 455488800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 00:25:48,923][47056] Avg episode reward: [(0, '0.223')] [2024-04-26 00:25:50,095][47288] Updated weights for policy 0, policy_version 30896 (0.0024) [2024-04-26 00:25:53,395][47288] Updated weights for policy 0, policy_version 30906 (0.0028) [2024-04-26 00:25:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 506380288. Throughput: 0: 55607.6. Samples: 455815420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 00:25:53,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:25:55,912][47288] Updated weights for policy 0, policy_version 30916 (0.0026) [2024-04-26 00:25:58,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 506658816. Throughput: 0: 55519.3. Samples: 455980080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:25:58,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-26 00:25:59,111][47288] Updated weights for policy 0, policy_version 30926 (0.0038) [2024-04-26 00:26:01,858][47288] Updated weights for policy 0, policy_version 30936 (0.0027) [2024-04-26 00:26:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 506937344. Throughput: 0: 55370.3. Samples: 456306940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:26:03,923][47056] Avg episode reward: [(0, '0.265')] [2024-04-26 00:26:05,049][47288] Updated weights for policy 0, policy_version 30946 (0.0030) [2024-04-26 00:26:07,728][47288] Updated weights for policy 0, policy_version 30956 (0.0028) [2024-04-26 00:26:08,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 507232256. Throughput: 0: 55484.0. Samples: 456646220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:26:08,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 00:26:11,091][47288] Updated weights for policy 0, policy_version 30966 (0.0030) [2024-04-26 00:26:13,635][47288] Updated weights for policy 0, policy_version 30976 (0.0030) [2024-04-26 00:26:13,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 507510784. Throughput: 0: 55660.9. Samples: 456818000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:26:13,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-26 00:26:16,928][47288] Updated weights for policy 0, policy_version 30986 (0.0029) [2024-04-26 00:26:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 507789312. Throughput: 0: 55846.7. Samples: 457159960. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 00:26:18,923][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 00:26:19,470][47288] Updated weights for policy 0, policy_version 30996 (0.0030) [2024-04-26 00:26:22,595][47288] Updated weights for policy 0, policy_version 31006 (0.0029) [2024-04-26 00:26:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 508051456. Throughput: 0: 55960.5. Samples: 457494560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 00:26:23,923][47056] Avg episode reward: [(0, '0.231')] [2024-04-26 00:26:25,260][47288] Updated weights for policy 0, policy_version 31016 (0.0033) [2024-04-26 00:26:28,734][47288] Updated weights for policy 0, policy_version 31026 (0.0027) [2024-04-26 00:26:28,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 508329984. Throughput: 0: 55648.3. Samples: 457656040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 00:26:28,923][47056] Avg episode reward: [(0, '0.269')] [2024-04-26 00:26:31,256][47288] Updated weights for policy 0, policy_version 31036 (0.0034) [2024-04-26 00:26:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 508592128. Throughput: 0: 55582.5. Samples: 457990020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 00:26:33,923][47056] Avg episode reward: [(0, '0.291')] [2024-04-26 00:26:34,632][47288] Updated weights for policy 0, policy_version 31046 (0.0027) [2024-04-26 00:26:36,375][47267] Signal inference workers to stop experience collection... (6550 times) [2024-04-26 00:26:36,375][47267] Signal inference workers to resume experience collection... 
(6550 times) [2024-04-26 00:26:36,387][47288] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-04-26 00:26:36,387][47288] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-04-26 00:26:37,261][47288] Updated weights for policy 0, policy_version 31056 (0.0030) [2024-04-26 00:26:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 508903424. Throughput: 0: 55827.4. Samples: 458327660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 00:26:38,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-26 00:26:40,290][47288] Updated weights for policy 0, policy_version 31066 (0.0035) [2024-04-26 00:26:42,950][47288] Updated weights for policy 0, policy_version 31076 (0.0028) [2024-04-26 00:26:43,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 509165568. Throughput: 0: 56006.6. Samples: 458500380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 00:26:43,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-26 00:26:45,980][47288] Updated weights for policy 0, policy_version 31086 (0.0031) [2024-04-26 00:26:48,860][47288] Updated weights for policy 0, policy_version 31096 (0.0029) [2024-04-26 00:26:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.5, 300 sec: 56038.8). Total num frames: 509476864. Throughput: 0: 56115.4. Samples: 458832140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 00:26:48,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 00:26:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000031096_509476864.pth... [2024-04-26 00:26:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000030278_496074752.pth [2024-04-26 00:26:51,811][47288] Updated weights for policy 0, policy_version 31106 (0.0028) [2024-04-26 00:26:53,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55927.8). 
Total num frames: 509739008. Throughput: 0: 56034.2. Samples: 459167760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 00:26:53,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 00:26:54,837][47288] Updated weights for policy 0, policy_version 31116 (0.0027) [2024-04-26 00:26:57,747][47288] Updated weights for policy 0, policy_version 31126 (0.0030) [2024-04-26 00:26:58,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.4, 300 sec: 55983.3). Total num frames: 510033920. Throughput: 0: 55937.5. Samples: 459335200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-26 00:26:58,923][47056] Avg episode reward: [(0, '0.181')] [2024-04-26 00:27:00,700][47288] Updated weights for policy 0, policy_version 31136 (0.0025) [2024-04-26 00:27:03,376][47288] Updated weights for policy 0, policy_version 31146 (0.0027) [2024-04-26 00:27:03,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 510296064. Throughput: 0: 55940.9. Samples: 459677300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-26 00:27:03,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 00:27:06,432][47288] Updated weights for policy 0, policy_version 31156 (0.0028) [2024-04-26 00:27:08,923][47056] Fps is (10 sec: 55707.3, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 510590976. Throughput: 0: 56064.6. Samples: 460017460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-26 00:27:08,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-26 00:27:09,283][47288] Updated weights for policy 0, policy_version 31166 (0.0028) [2024-04-26 00:27:12,174][47288] Updated weights for policy 0, policy_version 31176 (0.0033) [2024-04-26 00:27:13,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 510869504. Throughput: 0: 56101.2. Samples: 460180600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-26 00:27:13,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 00:27:15,208][47288] Updated weights for policy 0, policy_version 31186 (0.0025) [2024-04-26 00:27:18,196][47288] Updated weights for policy 0, policy_version 31196 (0.0026) [2024-04-26 00:27:18,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 511148032. Throughput: 0: 56091.1. Samples: 460514120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-26 00:27:18,923][47056] Avg episode reward: [(0, '0.295')] [2024-04-26 00:27:21,152][47288] Updated weights for policy 0, policy_version 31206 (0.0027) [2024-04-26 00:27:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 511426560. Throughput: 0: 56005.3. Samples: 460847900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-26 00:27:23,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 00:27:24,049][47288] Updated weights for policy 0, policy_version 31216 (0.0028) [2024-04-26 00:27:26,870][47288] Updated weights for policy 0, policy_version 31226 (0.0026) [2024-04-26 00:27:27,807][47267] Signal inference workers to stop experience collection... (6600 times) [2024-04-26 00:27:27,850][47288] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-04-26 00:27:27,856][47267] Signal inference workers to resume experience collection... (6600 times) [2024-04-26 00:27:27,862][47288] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-04-26 00:27:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 511705088. Throughput: 0: 55859.0. Samples: 461014040. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-26 00:27:28,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 00:27:29,836][47288] Updated weights for policy 0, policy_version 31236 (0.0028) [2024-04-26 00:27:32,568][47288] Updated weights for policy 0, policy_version 31246 (0.0028) [2024-04-26 00:27:33,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56797.9, 300 sec: 56038.8). Total num frames: 512000000. Throughput: 0: 55938.8. Samples: 461349380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-26 00:27:33,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-26 00:27:35,704][47288] Updated weights for policy 0, policy_version 31256 (0.0027) [2024-04-26 00:27:38,518][47288] Updated weights for policy 0, policy_version 31266 (0.0030) [2024-04-26 00:27:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 56038.8). Total num frames: 512278528. Throughput: 0: 55975.9. Samples: 461686680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-26 00:27:38,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 00:27:41,746][47288] Updated weights for policy 0, policy_version 31276 (0.0031) [2024-04-26 00:27:43,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 512540672. Throughput: 0: 56028.8. Samples: 461856480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 00:27:43,923][47056] Avg episode reward: [(0, '0.255')] [2024-04-26 00:27:44,457][47288] Updated weights for policy 0, policy_version 31286 (0.0029) [2024-04-26 00:27:47,704][47288] Updated weights for policy 0, policy_version 31296 (0.0035) [2024-04-26 00:27:48,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 512802816. Throughput: 0: 55863.6. Samples: 462191160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 00:27:48,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 00:27:50,222][47288] Updated weights for policy 0, policy_version 31306 (0.0029) [2024-04-26 00:27:53,498][47288] Updated weights for policy 0, policy_version 31316 (0.0030) [2024-04-26 00:27:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 513097728. Throughput: 0: 55741.2. Samples: 462525820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 00:27:53,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-26 00:27:56,253][47288] Updated weights for policy 0, policy_version 31326 (0.0032) [2024-04-26 00:27:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 513392640. Throughput: 0: 55777.9. Samples: 462690600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 00:27:58,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-26 00:27:59,324][47288] Updated weights for policy 0, policy_version 31336 (0.0035) [2024-04-26 00:28:02,048][47288] Updated weights for policy 0, policy_version 31346 (0.0031) [2024-04-26 00:28:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 513654784. Throughput: 0: 55697.7. Samples: 463020520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 00:28:03,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-26 00:28:05,447][47288] Updated weights for policy 0, policy_version 31356 (0.0029) [2024-04-26 00:28:07,811][47288] Updated weights for policy 0, policy_version 31366 (0.0027) [2024-04-26 00:28:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 513949696. Throughput: 0: 55823.2. Samples: 463359940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 00:28:08,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-26 00:28:11,308][47288] Updated weights for policy 0, policy_version 31376 (0.0030) [2024-04-26 00:28:13,617][47288] Updated weights for policy 0, policy_version 31386 (0.0030) [2024-04-26 00:28:13,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.9, 300 sec: 55983.3). Total num frames: 514228224. Throughput: 0: 55965.0. Samples: 463532460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 00:28:13,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-26 00:28:17,051][47288] Updated weights for policy 0, policy_version 31396 (0.0030) [2024-04-26 00:28:18,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 514490368. Throughput: 0: 55942.2. Samples: 463866780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 00:28:18,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 00:28:19,396][47288] Updated weights for policy 0, policy_version 31406 (0.0037) [2024-04-26 00:28:21,092][47267] Signal inference workers to stop experience collection... (6650 times) [2024-04-26 00:28:21,099][47267] Signal inference workers to resume experience collection... (6650 times) [2024-04-26 00:28:21,115][47288] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-04-26 00:28:21,115][47288] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-04-26 00:28:23,150][47288] Updated weights for policy 0, policy_version 31416 (0.0028) [2024-04-26 00:28:23,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 514785280. Throughput: 0: 56032.5. Samples: 464208140. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:28:23,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 00:28:25,086][47288] Updated weights for policy 0, policy_version 31426 (0.0029) [2024-04-26 00:28:28,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 515031040. Throughput: 0: 55750.3. Samples: 464365240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:28:28,923][47056] Avg episode reward: [(0, '0.212')] [2024-04-26 00:28:29,009][47288] Updated weights for policy 0, policy_version 31436 (0.0028) [2024-04-26 00:28:30,997][47288] Updated weights for policy 0, policy_version 31446 (0.0026) [2024-04-26 00:28:33,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 515325952. Throughput: 0: 55983.5. Samples: 464710420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:28:33,924][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:28:34,725][47288] Updated weights for policy 0, policy_version 31456 (0.0028) [2024-04-26 00:28:36,998][47288] Updated weights for policy 0, policy_version 31466 (0.0028) [2024-04-26 00:28:38,923][47056] Fps is (10 sec: 58981.1, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 515620864. Throughput: 0: 55878.1. Samples: 465040340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:28:38,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-26 00:28:40,542][47288] Updated weights for policy 0, policy_version 31476 (0.0029) [2024-04-26 00:28:42,789][47288] Updated weights for policy 0, policy_version 31486 (0.0033) [2024-04-26 00:28:43,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 515915776. Throughput: 0: 56047.9. Samples: 465212760. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 00:28:43,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-26 00:28:46,603][47288] Updated weights for policy 0, policy_version 31496 (0.0033) [2024-04-26 00:28:48,622][47288] Updated weights for policy 0, policy_version 31506 (0.0030) [2024-04-26 00:28:48,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 516194304. Throughput: 0: 56149.4. Samples: 465547240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 00:28:48,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 00:28:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000031506_516194304.pth... [2024-04-26 00:28:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000030687_502775808.pth [2024-04-26 00:28:52,349][47288] Updated weights for policy 0, policy_version 31516 (0.0030) [2024-04-26 00:28:53,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 516472832. Throughput: 0: 56008.0. Samples: 465880300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 00:28:53,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 00:28:54,495][47288] Updated weights for policy 0, policy_version 31526 (0.0028) [2024-04-26 00:28:58,155][47288] Updated weights for policy 0, policy_version 31536 (0.0030) [2024-04-26 00:28:58,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 516718592. Throughput: 0: 55969.8. Samples: 466051100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 00:28:58,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 00:29:00,164][47288] Updated weights for policy 0, policy_version 31546 (0.0027) [2024-04-26 00:29:03,890][47288] Updated weights for policy 0, policy_version 31556 (0.0033) [2024-04-26 00:29:03,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 517013504. 
Throughput: 0: 55981.4. Samples: 466385940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:29:03,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 00:29:06,421][47288] Updated weights for policy 0, policy_version 31566 (0.0029) [2024-04-26 00:29:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 517275648. Throughput: 0: 55883.1. Samples: 466722880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:29:08,923][47056] Avg episode reward: [(0, '0.199')] [2024-04-26 00:29:09,750][47288] Updated weights for policy 0, policy_version 31576 (0.0030) [2024-04-26 00:29:12,454][47288] Updated weights for policy 0, policy_version 31586 (0.0022) [2024-04-26 00:29:13,086][47267] Signal inference workers to stop experience collection... (6700 times) [2024-04-26 00:29:13,086][47267] Signal inference workers to resume experience collection... (6700 times) [2024-04-26 00:29:13,108][47288] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-04-26 00:29:13,108][47288] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-04-26 00:29:13,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 517570560. Throughput: 0: 55807.4. Samples: 466876580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:29:13,923][47056] Avg episode reward: [(0, '0.193')] [2024-04-26 00:29:15,540][47288] Updated weights for policy 0, policy_version 31596 (0.0030) [2024-04-26 00:29:18,208][47288] Updated weights for policy 0, policy_version 31606 (0.0034) [2024-04-26 00:29:18,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 55872.3). Total num frames: 517849088. Throughput: 0: 55582.8. Samples: 467211640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:29:18,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 00:29:21,334][47288] Updated weights for policy 0, policy_version 31616 (0.0026) [2024-04-26 00:29:23,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 518127616. Throughput: 0: 55698.7. Samples: 467546780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:29:23,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:29:24,222][47288] Updated weights for policy 0, policy_version 31626 (0.0029) [2024-04-26 00:29:27,181][47288] Updated weights for policy 0, policy_version 31636 (0.0029) [2024-04-26 00:29:28,923][47056] Fps is (10 sec: 58981.2, 60 sec: 56797.7, 300 sec: 55872.2). Total num frames: 518438912. Throughput: 0: 55800.5. Samples: 467723780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:29:28,924][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 00:29:30,053][47288] Updated weights for policy 0, policy_version 31646 (0.0026) [2024-04-26 00:29:33,172][47288] Updated weights for policy 0, policy_version 31656 (0.0033) [2024-04-26 00:29:33,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 518668288. Throughput: 0: 55823.4. Samples: 468059300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:29:33,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-26 00:29:35,891][47288] Updated weights for policy 0, policy_version 31666 (0.0026) [2024-04-26 00:29:38,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 518963200. Throughput: 0: 55726.7. Samples: 468388000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:29:38,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-26 00:29:38,992][47288] Updated weights for policy 0, policy_version 31676 (0.0034) [2024-04-26 00:29:41,797][47288] Updated weights for policy 0, policy_version 31686 (0.0029) [2024-04-26 00:29:43,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55159.7, 300 sec: 55816.7). Total num frames: 519225344. Throughput: 0: 55565.0. Samples: 468551520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 00:29:43,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:29:44,967][47288] Updated weights for policy 0, policy_version 31696 (0.0031) [2024-04-26 00:29:47,535][47288] Updated weights for policy 0, policy_version 31706 (0.0027) [2024-04-26 00:29:48,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55159.3, 300 sec: 55816.6). Total num frames: 519503872. Throughput: 0: 55508.3. Samples: 468883820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 00:29:48,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 00:29:50,942][47288] Updated weights for policy 0, policy_version 31716 (0.0034) [2024-04-26 00:29:53,546][47288] Updated weights for policy 0, policy_version 31726 (0.0026) [2024-04-26 00:29:53,923][47056] Fps is (10 sec: 58981.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 519815168. Throughput: 0: 55394.6. Samples: 469215640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 00:29:53,923][47056] Avg episode reward: [(0, '0.212')] [2024-04-26 00:29:56,735][47288] Updated weights for policy 0, policy_version 31736 (0.0030) [2024-04-26 00:29:58,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 520077312. Throughput: 0: 55799.1. Samples: 469387540. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 00:29:58,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 00:29:59,452][47288] Updated weights for policy 0, policy_version 31746 (0.0028) [2024-04-26 00:30:02,613][47288] Updated weights for policy 0, policy_version 31756 (0.0030) [2024-04-26 00:30:03,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 520372224. Throughput: 0: 55767.6. Samples: 469721200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 00:30:03,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 00:30:05,423][47288] Updated weights for policy 0, policy_version 31766 (0.0030) [2024-04-26 00:30:08,558][47288] Updated weights for policy 0, policy_version 31776 (0.0026) [2024-04-26 00:30:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 520634368. Throughput: 0: 55678.7. Samples: 470052320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 00:30:08,923][47056] Avg episode reward: [(0, '0.223')] [2024-04-26 00:30:09,192][47267] Signal inference workers to stop experience collection... (6750 times) [2024-04-26 00:30:09,224][47288] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-04-26 00:30:09,248][47267] Signal inference workers to resume experience collection... (6750 times) [2024-04-26 00:30:09,254][47288] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-04-26 00:30:11,408][47288] Updated weights for policy 0, policy_version 31786 (0.0024) [2024-04-26 00:30:13,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 520912896. Throughput: 0: 55362.8. Samples: 470215100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 00:30:13,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 00:30:14,315][47288] Updated weights for policy 0, policy_version 31796 (0.0033) [2024-04-26 00:30:17,298][47288] Updated weights for policy 0, policy_version 31806 (0.0027) [2024-04-26 00:30:18,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 521158656. Throughput: 0: 55407.6. Samples: 470552640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 00:30:18,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 00:30:20,285][47288] Updated weights for policy 0, policy_version 31816 (0.0032) [2024-04-26 00:30:23,052][47288] Updated weights for policy 0, policy_version 31826 (0.0025) [2024-04-26 00:30:23,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 521453568. Throughput: 0: 55604.4. Samples: 470890200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 00:30:23,923][47056] Avg episode reward: [(0, '0.209')] [2024-04-26 00:30:26,275][47288] Updated weights for policy 0, policy_version 31836 (0.0028) [2024-04-26 00:30:28,922][47056] Fps is (10 sec: 58983.7, 60 sec: 55159.7, 300 sec: 55816.7). Total num frames: 521748480. Throughput: 0: 55657.8. Samples: 471056120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 00:30:28,923][47056] Avg episode reward: [(0, '0.262')] [2024-04-26 00:30:29,019][47288] Updated weights for policy 0, policy_version 31846 (0.0030) [2024-04-26 00:30:32,118][47288] Updated weights for policy 0, policy_version 31856 (0.0034) [2024-04-26 00:30:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 522027008. Throughput: 0: 55659.7. Samples: 471388500. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 00:30:33,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 00:30:34,880][47288] Updated weights for policy 0, policy_version 31866 (0.0030) [2024-04-26 00:30:37,908][47288] Updated weights for policy 0, policy_version 31876 (0.0028) [2024-04-26 00:30:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 522305536. Throughput: 0: 55652.2. Samples: 471719980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 00:30:38,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-26 00:30:40,914][47288] Updated weights for policy 0, policy_version 31886 (0.0029) [2024-04-26 00:30:43,651][47288] Updated weights for policy 0, policy_version 31896 (0.0029) [2024-04-26 00:30:43,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 522600448. Throughput: 0: 55710.2. Samples: 471894500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 00:30:43,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 00:30:46,708][47288] Updated weights for policy 0, policy_version 31906 (0.0028) [2024-04-26 00:30:48,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 522862592. Throughput: 0: 55835.2. Samples: 472233780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 00:30:48,924][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 00:30:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000031913_522862592.pth... [2024-04-26 00:30:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000031096_509476864.pth [2024-04-26 00:30:49,654][47288] Updated weights for policy 0, policy_version 31916 (0.0030) [2024-04-26 00:30:52,701][47288] Updated weights for policy 0, policy_version 31926 (0.0027) [2024-04-26 00:30:53,923][47056] Fps is (10 sec: 50790.5, 60 sec: 54886.5, 300 sec: 55761.1). Total num frames: 523108352. 
Throughput: 0: 55855.6. Samples: 472565820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 00:30:53,923][47056] Avg episode reward: [(0, '0.224')] [2024-04-26 00:30:55,521][47288] Updated weights for policy 0, policy_version 31936 (0.0024) [2024-04-26 00:30:58,532][47288] Updated weights for policy 0, policy_version 31946 (0.0035) [2024-04-26 00:30:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 523403264. Throughput: 0: 55673.6. Samples: 472720420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 00:30:58,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:31:01,321][47288] Updated weights for policy 0, policy_version 31956 (0.0029) [2024-04-26 00:31:03,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 523698176. Throughput: 0: 55537.8. Samples: 473051840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 00:31:03,924][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 00:31:04,517][47288] Updated weights for policy 0, policy_version 31966 (0.0029) [2024-04-26 00:31:07,287][47288] Updated weights for policy 0, policy_version 31976 (0.0028) [2024-04-26 00:31:08,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 523993088. Throughput: 0: 55510.6. Samples: 473388180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 00:31:08,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 00:31:10,416][47288] Updated weights for policy 0, policy_version 31986 (0.0026) [2024-04-26 00:31:11,504][47267] Signal inference workers to stop experience collection... (6800 times) [2024-04-26 00:31:11,505][47267] Signal inference workers to resume experience collection... 
(6800 times) [2024-04-26 00:31:11,531][47288] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-04-26 00:31:11,532][47288] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-04-26 00:31:12,959][47288] Updated weights for policy 0, policy_version 31996 (0.0030) [2024-04-26 00:31:13,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 524271616. Throughput: 0: 55979.4. Samples: 473575200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 00:31:13,923][47056] Avg episode reward: [(0, '0.228')] [2024-04-26 00:31:16,269][47288] Updated weights for policy 0, policy_version 32006 (0.0025) [2024-04-26 00:31:18,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 524533760. Throughput: 0: 55917.9. Samples: 473904800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 00:31:18,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-26 00:31:18,959][47288] Updated weights for policy 0, policy_version 32016 (0.0032) [2024-04-26 00:31:22,135][47288] Updated weights for policy 0, policy_version 32026 (0.0027) [2024-04-26 00:31:23,923][47056] Fps is (10 sec: 52427.4, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 524795904. Throughput: 0: 55844.5. Samples: 474233000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 00:31:23,924][47056] Avg episode reward: [(0, '0.203')] [2024-04-26 00:31:24,825][47288] Updated weights for policy 0, policy_version 32036 (0.0030) [2024-04-26 00:31:28,116][47288] Updated weights for policy 0, policy_version 32046 (0.0036) [2024-04-26 00:31:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 525074432. Throughput: 0: 55685.7. Samples: 474400360. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 00:31:28,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-26 00:31:30,488][47288] Updated weights for policy 0, policy_version 32056 (0.0026) [2024-04-26 00:31:33,923][47056] Fps is (10 sec: 54068.7, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 525336576. Throughput: 0: 55740.2. Samples: 474742080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 00:31:33,923][47056] Avg episode reward: [(0, '0.295')] [2024-04-26 00:31:34,177][47288] Updated weights for policy 0, policy_version 32066 (0.0028) [2024-04-26 00:31:36,332][47288] Updated weights for policy 0, policy_version 32076 (0.0026) [2024-04-26 00:31:38,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 525664256. Throughput: 0: 55831.9. Samples: 475078260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 00:31:38,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-26 00:31:39,923][47288] Updated weights for policy 0, policy_version 32086 (0.0039) [2024-04-26 00:31:42,118][47288] Updated weights for policy 0, policy_version 32096 (0.0027) [2024-04-26 00:31:43,923][47056] Fps is (10 sec: 60620.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 525942784. Throughput: 0: 56081.9. Samples: 475244100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 00:31:43,923][47056] Avg episode reward: [(0, '0.295')] [2024-04-26 00:31:45,689][47288] Updated weights for policy 0, policy_version 32106 (0.0031) [2024-04-26 00:31:48,192][47288] Updated weights for policy 0, policy_version 32116 (0.0029) [2024-04-26 00:31:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 526237696. Throughput: 0: 56227.5. Samples: 475582080. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 00:31:48,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-26 00:31:51,545][47288] Updated weights for policy 0, policy_version 32126 (0.0025) [2024-04-26 00:31:53,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56251.6, 300 sec: 55761.2). Total num frames: 526483456. Throughput: 0: 56161.3. Samples: 475915440. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 00:31:53,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-26 00:31:54,092][47288] Updated weights for policy 0, policy_version 32136 (0.0027) [2024-04-26 00:31:57,458][47288] Updated weights for policy 0, policy_version 32146 (0.0034) [2024-04-26 00:31:58,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 526761984. Throughput: 0: 55719.5. Samples: 476082580. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 00:31:58,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 00:31:59,990][47288] Updated weights for policy 0, policy_version 32156 (0.0026) [2024-04-26 00:32:03,277][47288] Updated weights for policy 0, policy_version 32166 (0.0030) [2024-04-26 00:32:03,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 527024128. Throughput: 0: 55835.1. Samples: 476417380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 00:32:03,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 00:32:05,952][47288] Updated weights for policy 0, policy_version 32176 (0.0026) [2024-04-26 00:32:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 527319040. Throughput: 0: 56009.3. Samples: 476753400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 00:32:08,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-26 00:32:08,957][47288] Updated weights for policy 0, policy_version 32186 (0.0026) [2024-04-26 00:32:10,999][47267] Signal inference workers to stop experience collection... 
(6850 times) [2024-04-26 00:32:11,040][47288] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-04-26 00:32:11,050][47267] Signal inference workers to resume experience collection... (6850 times) [2024-04-26 00:32:11,056][47288] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-04-26 00:32:11,687][47288] Updated weights for policy 0, policy_version 32196 (0.0027) [2024-04-26 00:32:13,923][47056] Fps is (10 sec: 58982.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 527613952. Throughput: 0: 55934.8. Samples: 476917420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 00:32:13,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 00:32:14,826][47288] Updated weights for policy 0, policy_version 32206 (0.0029) [2024-04-26 00:32:17,614][47288] Updated weights for policy 0, policy_version 32216 (0.0026) [2024-04-26 00:32:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 527892480. Throughput: 0: 55887.5. Samples: 477257020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 00:32:18,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:32:21,025][47288] Updated weights for policy 0, policy_version 32226 (0.0029) [2024-04-26 00:32:23,449][47288] Updated weights for policy 0, policy_version 32236 (0.0026) [2024-04-26 00:32:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56252.0, 300 sec: 55816.7). Total num frames: 528171008. Throughput: 0: 55765.4. Samples: 477587700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 00:32:23,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 00:32:26,843][47288] Updated weights for policy 0, policy_version 32246 (0.0034) [2024-04-26 00:32:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 528465920. Throughput: 0: 55928.1. Samples: 477760860. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-04-26 00:32:28,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 00:32:29,476][47288] Updated weights for policy 0, policy_version 32256 (0.0043) [2024-04-26 00:32:32,785][47288] Updated weights for policy 0, policy_version 32266 (0.0033) [2024-04-26 00:32:33,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 528695296. Throughput: 0: 55719.6. Samples: 478089460. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-04-26 00:32:33,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:32:35,278][47288] Updated weights for policy 0, policy_version 32276 (0.0030) [2024-04-26 00:32:38,651][47288] Updated weights for policy 0, policy_version 32286 (0.0026) [2024-04-26 00:32:38,923][47056] Fps is (10 sec: 50789.5, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 528973824. Throughput: 0: 55829.4. Samples: 478427760. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-04-26 00:32:38,923][47056] Avg episode reward: [(0, '0.211')] [2024-04-26 00:32:41,249][47288] Updated weights for policy 0, policy_version 32296 (0.0024) [2024-04-26 00:32:43,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 529285120. Throughput: 0: 55734.3. Samples: 478590620. Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-04-26 00:32:43,923][47056] Avg episode reward: [(0, '0.323')] [2024-04-26 00:32:44,371][47288] Updated weights for policy 0, policy_version 32306 (0.0034) [2024-04-26 00:32:47,118][47288] Updated weights for policy 0, policy_version 32316 (0.0039) [2024-04-26 00:32:48,923][47056] Fps is (10 sec: 60622.2, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 529580032. Throughput: 0: 55710.9. Samples: 478924360. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 27.0) [2024-04-26 00:32:48,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-26 00:32:49,012][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000032324_529596416.pth... [2024-04-26 00:32:49,053][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000031506_516194304.pth [2024-04-26 00:32:50,075][47288] Updated weights for policy 0, policy_version 32326 (0.0026) [2024-04-26 00:32:52,982][47288] Updated weights for policy 0, policy_version 32336 (0.0029) [2024-04-26 00:32:53,923][47056] Fps is (10 sec: 54066.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 529825792. Throughput: 0: 55681.9. Samples: 479259100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 00:32:53,924][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:32:56,042][47288] Updated weights for policy 0, policy_version 32346 (0.0033) [2024-04-26 00:32:58,848][47288] Updated weights for policy 0, policy_version 32356 (0.0030) [2024-04-26 00:32:58,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 530120704. Throughput: 0: 55814.6. Samples: 479429080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 00:32:58,923][47056] Avg episode reward: [(0, '0.237')] [2024-04-26 00:33:02,090][47288] Updated weights for policy 0, policy_version 32366 (0.0026) [2024-04-26 00:33:03,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 530399232. Throughput: 0: 55613.3. Samples: 479759620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 00:33:03,923][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 00:33:04,696][47288] Updated weights for policy 0, policy_version 32376 (0.0034) [2024-04-26 00:33:08,088][47288] Updated weights for policy 0, policy_version 32386 (0.0027) [2024-04-26 00:33:08,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 530661376. 
Throughput: 0: 55826.3. Samples: 480099880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 00:33:08,923][47056] Avg episode reward: [(0, '0.239')] [2024-04-26 00:33:10,542][47288] Updated weights for policy 0, policy_version 32396 (0.0030) [2024-04-26 00:33:13,783][47288] Updated weights for policy 0, policy_version 32406 (0.0026) [2024-04-26 00:33:13,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 530939904. Throughput: 0: 55504.7. Samples: 480258580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 00:33:13,923][47056] Avg episode reward: [(0, '0.261')] [2024-04-26 00:33:16,336][47288] Updated weights for policy 0, policy_version 32416 (0.0031) [2024-04-26 00:33:17,698][47267] Signal inference workers to stop experience collection... (6900 times) [2024-04-26 00:33:17,734][47288] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-04-26 00:33:17,745][47267] Signal inference workers to resume experience collection... (6900 times) [2024-04-26 00:33:17,751][47288] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-04-26 00:33:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 531234816. Throughput: 0: 55557.3. Samples: 480589540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 00:33:18,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 00:33:19,546][47288] Updated weights for policy 0, policy_version 32426 (0.0028) [2024-04-26 00:33:22,360][47288] Updated weights for policy 0, policy_version 32436 (0.0033) [2024-04-26 00:33:23,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 531513344. Throughput: 0: 55498.4. Samples: 480925180. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 00:33:23,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 00:33:25,358][47288] Updated weights for policy 0, policy_version 32446 (0.0034) [2024-04-26 00:33:28,214][47288] Updated weights for policy 0, policy_version 32456 (0.0028) [2024-04-26 00:33:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 531775488. Throughput: 0: 55776.5. Samples: 481100560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 00:33:28,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 00:33:31,351][47288] Updated weights for policy 0, policy_version 32466 (0.0031) [2024-04-26 00:33:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 532070400. Throughput: 0: 55804.7. Samples: 481435580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:33:33,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 00:33:34,141][47288] Updated weights for policy 0, policy_version 32476 (0.0028) [2024-04-26 00:33:37,029][47288] Updated weights for policy 0, policy_version 32486 (0.0035) [2024-04-26 00:33:38,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 532348928. Throughput: 0: 55770.4. Samples: 481768760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:33:38,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-26 00:33:39,858][47288] Updated weights for policy 0, policy_version 32496 (0.0030) [2024-04-26 00:33:42,836][47288] Updated weights for policy 0, policy_version 32506 (0.0027) [2024-04-26 00:33:43,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 532611072. Throughput: 0: 55563.1. Samples: 481929420. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:33:43,923][47056] Avg episode reward: [(0, '0.184')] [2024-04-26 00:33:45,616][47288] Updated weights for policy 0, policy_version 32516 (0.0036) [2024-04-26 00:33:48,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.2, 300 sec: 55650.0). Total num frames: 532889600. Throughput: 0: 55712.3. Samples: 482266680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:33:48,923][47056] Avg episode reward: [(0, '0.298')] [2024-04-26 00:33:49,096][47288] Updated weights for policy 0, policy_version 32526 (0.0032) [2024-04-26 00:33:51,476][47288] Updated weights for policy 0, policy_version 32536 (0.0023) [2024-04-26 00:33:53,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 533184512. Throughput: 0: 55627.4. Samples: 482603120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 00:33:53,923][47056] Avg episode reward: [(0, '0.240')] [2024-04-26 00:33:55,048][47288] Updated weights for policy 0, policy_version 32546 (0.0031) [2024-04-26 00:33:57,336][47288] Updated weights for policy 0, policy_version 32556 (0.0026) [2024-04-26 00:33:58,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 533479424. Throughput: 0: 55890.4. Samples: 482773640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 00:33:58,923][47056] Avg episode reward: [(0, '0.287')] [2024-04-26 00:34:00,723][47288] Updated weights for policy 0, policy_version 32566 (0.0039) [2024-04-26 00:34:03,239][47288] Updated weights for policy 0, policy_version 32576 (0.0030) [2024-04-26 00:34:03,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 533757952. Throughput: 0: 56157.7. Samples: 483116640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 00:34:03,923][47056] Avg episode reward: [(0, '0.205')] [2024-04-26 00:34:05,794][47267] Signal inference workers to stop experience collection... 
(6950 times) [2024-04-26 00:34:05,795][47267] Signal inference workers to resume experience collection... (6950 times) [2024-04-26 00:34:05,813][47288] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-04-26 00:34:05,814][47288] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-04-26 00:34:06,452][47288] Updated weights for policy 0, policy_version 32586 (0.0029) [2024-04-26 00:34:08,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 534036480. Throughput: 0: 56106.2. Samples: 483449960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 00:34:08,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:34:09,199][47288] Updated weights for policy 0, policy_version 32596 (0.0031) [2024-04-26 00:34:12,346][47288] Updated weights for policy 0, policy_version 32606 (0.0032) [2024-04-26 00:34:13,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 534298624. Throughput: 0: 55953.7. Samples: 483618480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 00:34:13,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-26 00:34:15,029][47288] Updated weights for policy 0, policy_version 32616 (0.0027) [2024-04-26 00:34:18,157][47288] Updated weights for policy 0, policy_version 32626 (0.0028) [2024-04-26 00:34:18,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 534560768. Throughput: 0: 55966.3. Samples: 483954060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 00:34:18,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 00:34:20,649][47288] Updated weights for policy 0, policy_version 32636 (0.0034) [2024-04-26 00:34:23,823][47288] Updated weights for policy 0, policy_version 32646 (0.0029) [2024-04-26 00:34:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 534872064. Throughput: 0: 55957.4. Samples: 484286840. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 00:34:23,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 00:34:26,503][47288] Updated weights for policy 0, policy_version 32656 (0.0028) [2024-04-26 00:34:28,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 535150592. Throughput: 0: 56089.3. Samples: 484453440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 00:34:28,923][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 00:34:30,102][47288] Updated weights for policy 0, policy_version 32666 (0.0027) [2024-04-26 00:34:32,297][47288] Updated weights for policy 0, policy_version 32676 (0.0027) [2024-04-26 00:34:33,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 535429120. Throughput: 0: 56032.6. Samples: 484788140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 00:34:33,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-26 00:34:36,141][47288] Updated weights for policy 0, policy_version 32686 (0.0034) [2024-04-26 00:34:38,032][47288] Updated weights for policy 0, policy_version 32696 (0.0027) [2024-04-26 00:34:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.9, 300 sec: 55927.7). Total num frames: 535724032. Throughput: 0: 56142.4. Samples: 485129520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:34:38,923][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 00:34:42,064][47288] Updated weights for policy 0, policy_version 32706 (0.0026) [2024-04-26 00:34:43,910][47288] Updated weights for policy 0, policy_version 32716 (0.0032) [2024-04-26 00:34:43,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.9, 300 sec: 55983.3). Total num frames: 536018944. Throughput: 0: 56355.6. Samples: 485309640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:34:43,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 00:34:47,878][47288] Updated weights for policy 0, policy_version 32726 (0.0027) [2024-04-26 00:34:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 536264704. Throughput: 0: 56230.4. Samples: 485647000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:34:48,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 00:34:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000032731_536264704.pth... [2024-04-26 00:34:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000031913_522862592.pth [2024-04-26 00:34:49,880][47288] Updated weights for policy 0, policy_version 32736 (0.0030) [2024-04-26 00:34:53,644][47288] Updated weights for policy 0, policy_version 32746 (0.0031) [2024-04-26 00:34:53,923][47056] Fps is (10 sec: 50789.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 536526848. Throughput: 0: 56247.4. Samples: 485981100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:34:53,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:34:55,768][47288] Updated weights for policy 0, policy_version 32756 (0.0030) [2024-04-26 00:34:58,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 536805376. Throughput: 0: 56099.4. Samples: 486142960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:34:58,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 00:34:59,343][47288] Updated weights for policy 0, policy_version 32766 (0.0035) [2024-04-26 00:35:01,525][47288] Updated weights for policy 0, policy_version 32776 (0.0035) [2024-04-26 00:35:03,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 537116672. Throughput: 0: 56116.4. Samples: 486479300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:35:03,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 00:35:05,453][47288] Updated weights for policy 0, policy_version 32786 (0.0027) [2024-04-26 00:35:07,469][47288] Updated weights for policy 0, policy_version 32796 (0.0029) [2024-04-26 00:35:08,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 537378816. Throughput: 0: 56063.2. Samples: 486809680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:35:08,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 00:35:11,243][47288] Updated weights for policy 0, policy_version 32806 (0.0030) [2024-04-26 00:35:11,512][47267] Signal inference workers to stop experience collection... (7000 times) [2024-04-26 00:35:11,550][47288] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-04-26 00:35:11,560][47267] Signal inference workers to resume experience collection... (7000 times) [2024-04-26 00:35:11,566][47288] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-04-26 00:35:13,297][47288] Updated weights for policy 0, policy_version 32816 (0.0026) [2024-04-26 00:35:13,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 537673728. Throughput: 0: 56247.5. Samples: 486984580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:35:13,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-26 00:35:17,183][47288] Updated weights for policy 0, policy_version 32826 (0.0026) [2024-04-26 00:35:18,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56797.8, 300 sec: 55983.3). Total num frames: 537968640. Throughput: 0: 56170.2. Samples: 487315800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:35:18,923][47056] Avg episode reward: [(0, '0.221')] [2024-04-26 00:35:18,998][47288] Updated weights for policy 0, policy_version 32836 (0.0028) [2024-04-26 00:35:23,148][47288] Updated weights for policy 0, policy_version 32846 (0.0024) [2024-04-26 00:35:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 538230784. Throughput: 0: 56083.9. Samples: 487653300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:35:23,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 00:35:24,814][47288] Updated weights for policy 0, policy_version 32856 (0.0029) [2024-04-26 00:35:28,912][47288] Updated weights for policy 0, policy_version 32866 (0.0033) [2024-04-26 00:35:28,923][47056] Fps is (10 sec: 50789.7, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 538476544. Throughput: 0: 55668.3. Samples: 487814720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:35:28,923][47056] Avg episode reward: [(0, '0.207')] [2024-04-26 00:35:30,880][47288] Updated weights for policy 0, policy_version 32876 (0.0033) [2024-04-26 00:35:33,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 538755072. Throughput: 0: 55560.4. Samples: 488147220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:35:33,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 00:35:34,638][47288] Updated weights for policy 0, policy_version 32886 (0.0030) [2024-04-26 00:35:36,800][47288] Updated weights for policy 0, policy_version 32896 (0.0030) [2024-04-26 00:35:38,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 539066368. Throughput: 0: 55470.7. Samples: 488477280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:35:38,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 00:35:40,582][47288] Updated weights for policy 0, policy_version 32906 (0.0027) [2024-04-26 00:35:42,805][47288] Updated weights for policy 0, policy_version 32916 (0.0027) [2024-04-26 00:35:43,923][47056] Fps is (10 sec: 60620.5, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 539361280. Throughput: 0: 55844.9. Samples: 488655980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:35:43,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-26 00:35:46,299][47288] Updated weights for policy 0, policy_version 32926 (0.0029) [2024-04-26 00:35:48,626][47288] Updated weights for policy 0, policy_version 32936 (0.0034) [2024-04-26 00:35:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 539623424. Throughput: 0: 55771.0. Samples: 488989000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:35:48,923][47056] Avg episode reward: [(0, '0.240')] [2024-04-26 00:35:52,118][47288] Updated weights for policy 0, policy_version 32946 (0.0033) [2024-04-26 00:35:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 539918336. Throughput: 0: 55809.7. Samples: 489321120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:35:53,923][47056] Avg episode reward: [(0, '0.291')] [2024-04-26 00:35:54,436][47288] Updated weights for policy 0, policy_version 32956 (0.0026) [2024-04-26 00:35:57,839][47288] Updated weights for policy 0, policy_version 32966 (0.0026) [2024-04-26 00:35:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 540164096. Throughput: 0: 55658.3. Samples: 489489200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:35:58,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 00:36:00,299][47288] Updated weights for policy 0, policy_version 32976 (0.0033) [2024-04-26 00:36:03,755][47288] Updated weights for policy 0, policy_version 32986 (0.0033) [2024-04-26 00:36:03,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 540442624. Throughput: 0: 55764.8. Samples: 489825220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 00:36:03,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 00:36:06,112][47267] Signal inference workers to stop experience collection... (7050 times) [2024-04-26 00:36:06,165][47288] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-04-26 00:36:06,165][47267] Signal inference workers to resume experience collection... (7050 times) [2024-04-26 00:36:06,181][47288] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-04-26 00:36:06,278][47288] Updated weights for policy 0, policy_version 32996 (0.0035) [2024-04-26 00:36:08,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 540704768. Throughput: 0: 55695.7. Samples: 490159600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 00:36:08,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 00:36:09,981][47288] Updated weights for policy 0, policy_version 33006 (0.0027) [2024-04-26 00:36:12,125][47288] Updated weights for policy 0, policy_version 33016 (0.0030) [2024-04-26 00:36:13,923][47056] Fps is (10 sec: 58981.6, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 541032448. Throughput: 0: 55737.2. Samples: 490322900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 00:36:13,923][47056] Avg episode reward: [(0, '0.247')] [2024-04-26 00:36:15,758][47288] Updated weights for policy 0, policy_version 33026 (0.0036) [2024-04-26 00:36:18,176][47288] Updated weights for policy 0, policy_version 33036 (0.0037) [2024-04-26 00:36:18,923][47056] Fps is (10 sec: 60621.0, 60 sec: 55705.6, 300 sec: 55983.4). Total num frames: 541310976. Throughput: 0: 55748.6. Samples: 490655900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 00:36:18,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 00:36:21,494][47288] Updated weights for policy 0, policy_version 33046 (0.0031) [2024-04-26 00:36:23,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 541573120. Throughput: 0: 56032.1. Samples: 490998720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:36:23,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 00:36:24,029][47288] Updated weights for policy 0, policy_version 33056 (0.0028) [2024-04-26 00:36:27,288][47288] Updated weights for policy 0, policy_version 33066 (0.0028) [2024-04-26 00:36:28,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 541851648. Throughput: 0: 55812.0. Samples: 491167520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:36:28,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-26 00:36:29,814][47288] Updated weights for policy 0, policy_version 33076 (0.0026) [2024-04-26 00:36:33,109][47288] Updated weights for policy 0, policy_version 33086 (0.0026) [2024-04-26 00:36:33,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 542130176. Throughput: 0: 55904.9. Samples: 491504720. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:36:33,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 00:36:35,735][47288] Updated weights for policy 0, policy_version 33096 (0.0031) [2024-04-26 00:36:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 542392320. Throughput: 0: 55979.6. Samples: 491840200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:36:38,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 00:36:39,056][47288] Updated weights for policy 0, policy_version 33106 (0.0027) [2024-04-26 00:36:41,617][47288] Updated weights for policy 0, policy_version 33116 (0.0036) [2024-04-26 00:36:43,924][47056] Fps is (10 sec: 54059.8, 60 sec: 55158.2, 300 sec: 55705.3). Total num frames: 542670848. Throughput: 0: 55933.3. Samples: 492006280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:36:43,925][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 00:36:44,756][47288] Updated weights for policy 0, policy_version 33126 (0.0030) [2024-04-26 00:36:47,586][47288] Updated weights for policy 0, policy_version 33136 (0.0030) [2024-04-26 00:36:48,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 542982144. Throughput: 0: 55969.8. Samples: 492343860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 00:36:48,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 00:36:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000033141_542982144.pth... [2024-04-26 00:36:48,988][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000032324_529596416.pth [2024-04-26 00:36:50,630][47288] Updated weights for policy 0, policy_version 33146 (0.0027) [2024-04-26 00:36:53,396][47288] Updated weights for policy 0, policy_version 33156 (0.0040) [2024-04-26 00:36:53,923][47056] Fps is (10 sec: 60629.8, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 543277056. 
Throughput: 0: 56000.4. Samples: 492679620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 00:36:53,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 00:36:56,558][47288] Updated weights for policy 0, policy_version 33166 (0.0028) [2024-04-26 00:36:58,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 543506432. Throughput: 0: 56147.0. Samples: 492849500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 00:36:58,923][47056] Avg episode reward: [(0, '0.297')] [2024-04-26 00:36:59,266][47288] Updated weights for policy 0, policy_version 33176 (0.0032) [2024-04-26 00:36:59,469][47267] Signal inference workers to stop experience collection... (7100 times) [2024-04-26 00:36:59,517][47267] Signal inference workers to resume experience collection... (7100 times) [2024-04-26 00:36:59,517][47288] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-04-26 00:36:59,529][47288] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-04-26 00:37:02,314][47288] Updated weights for policy 0, policy_version 33186 (0.0027) [2024-04-26 00:37:03,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 543817728. Throughput: 0: 56050.4. Samples: 493178180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 00:37:03,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 00:37:05,104][47288] Updated weights for policy 0, policy_version 33196 (0.0025) [2024-04-26 00:37:08,117][47288] Updated weights for policy 0, policy_version 33206 (0.0031) [2024-04-26 00:37:08,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 544079872. Throughput: 0: 55881.7. Samples: 493513400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 00:37:08,923][47056] Avg episode reward: [(0, '0.330')] [2024-04-26 00:37:11,193][47288] Updated weights for policy 0, policy_version 33216 (0.0026) [2024-04-26 00:37:13,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55432.8, 300 sec: 55816.7). Total num frames: 544358400. Throughput: 0: 55797.1. Samples: 493678380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 00:37:13,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 00:37:14,024][47288] Updated weights for policy 0, policy_version 33226 (0.0035) [2024-04-26 00:37:17,005][47288] Updated weights for policy 0, policy_version 33236 (0.0030) [2024-04-26 00:37:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 544636928. Throughput: 0: 55737.9. Samples: 494012920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 00:37:18,923][47056] Avg episode reward: [(0, '0.248')] [2024-04-26 00:37:19,778][47288] Updated weights for policy 0, policy_version 33246 (0.0032) [2024-04-26 00:37:22,813][47288] Updated weights for policy 0, policy_version 33256 (0.0033) [2024-04-26 00:37:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 544931840. Throughput: 0: 55731.6. Samples: 494348120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 00:37:23,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 00:37:25,802][47288] Updated weights for policy 0, policy_version 33266 (0.0029) [2024-04-26 00:37:28,613][47288] Updated weights for policy 0, policy_version 33276 (0.0027) [2024-04-26 00:37:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 545210368. Throughput: 0: 55807.5. Samples: 494517540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 00:37:28,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:37:31,677][47288] Updated weights for policy 0, policy_version 33286 (0.0030) [2024-04-26 00:37:33,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55432.7, 300 sec: 55872.2). Total num frames: 545456128. Throughput: 0: 55835.7. Samples: 494856460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 00:37:33,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 00:37:34,471][47288] Updated weights for policy 0, policy_version 33296 (0.0029) [2024-04-26 00:37:37,470][47288] Updated weights for policy 0, policy_version 33306 (0.0028) [2024-04-26 00:37:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 545783808. Throughput: 0: 55761.3. Samples: 495188880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 00:37:38,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 00:37:40,296][47288] Updated weights for policy 0, policy_version 33316 (0.0033) [2024-04-26 00:37:43,184][47288] Updated weights for policy 0, policy_version 33326 (0.0027) [2024-04-26 00:37:43,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56253.1, 300 sec: 55816.7). Total num frames: 546045952. Throughput: 0: 55731.9. Samples: 495357440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 00:37:43,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 00:37:45,609][47267] Signal inference workers to stop experience collection... (7150 times) [2024-04-26 00:37:45,645][47288] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-04-26 00:37:45,700][47267] Signal inference workers to resume experience collection... 
(7150 times) [2024-04-26 00:37:45,700][47288] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-04-26 00:37:46,177][47288] Updated weights for policy 0, policy_version 33336 (0.0028) [2024-04-26 00:37:48,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55872.3). Total num frames: 546308096. Throughput: 0: 55895.7. Samples: 495693480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 00:37:48,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:37:49,068][47288] Updated weights for policy 0, policy_version 33346 (0.0027) [2024-04-26 00:37:52,064][47288] Updated weights for policy 0, policy_version 33356 (0.0029) [2024-04-26 00:37:53,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 546603008. Throughput: 0: 55841.8. Samples: 496026280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:37:53,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 00:37:54,988][47288] Updated weights for policy 0, policy_version 33366 (0.0031) [2024-04-26 00:37:57,916][47288] Updated weights for policy 0, policy_version 33376 (0.0031) [2024-04-26 00:37:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 546881536. Throughput: 0: 55956.3. Samples: 496196420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:37:58,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-26 00:38:00,746][47288] Updated weights for policy 0, policy_version 33386 (0.0028) [2024-04-26 00:38:03,733][47288] Updated weights for policy 0, policy_version 33396 (0.0027) [2024-04-26 00:38:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 547176448. Throughput: 0: 56005.3. Samples: 496533160. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:38:03,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 00:38:06,580][47288] Updated weights for policy 0, policy_version 33406 (0.0029) [2024-04-26 00:38:08,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 547405824. Throughput: 0: 55998.2. Samples: 496868040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:38:08,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 00:38:09,613][47288] Updated weights for policy 0, policy_version 33416 (0.0027) [2024-04-26 00:38:12,440][47288] Updated weights for policy 0, policy_version 33426 (0.0029) [2024-04-26 00:38:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.6, 300 sec: 55927.8). Total num frames: 547733504. Throughput: 0: 55899.7. Samples: 497033020. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-26 00:38:13,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-26 00:38:15,481][47288] Updated weights for policy 0, policy_version 33436 (0.0029) [2024-04-26 00:38:18,361][47288] Updated weights for policy 0, policy_version 33446 (0.0029) [2024-04-26 00:38:18,923][47056] Fps is (10 sec: 58981.0, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 547995648. Throughput: 0: 55924.6. Samples: 497373080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-26 00:38:18,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 00:38:21,192][47288] Updated weights for policy 0, policy_version 33456 (0.0027) [2024-04-26 00:38:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 548290560. Throughput: 0: 56069.4. Samples: 497712000. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-26 00:38:23,923][47056] Avg episode reward: [(0, '0.269')] [2024-04-26 00:38:24,234][47288] Updated weights for policy 0, policy_version 33466 (0.0040) [2024-04-26 00:38:27,096][47288] Updated weights for policy 0, policy_version 33476 (0.0031) [2024-04-26 00:38:28,923][47056] Fps is (10 sec: 57345.4, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 548569088. Throughput: 0: 56042.7. Samples: 497879360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-26 00:38:28,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 00:38:30,135][47288] Updated weights for policy 0, policy_version 33486 (0.0029) [2024-04-26 00:38:33,004][47288] Updated weights for policy 0, policy_version 33496 (0.0029) [2024-04-26 00:38:33,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.7, 300 sec: 55927.8). Total num frames: 548847616. Throughput: 0: 55889.7. Samples: 498208520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 00:38:33,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 00:38:36,023][47288] Updated weights for policy 0, policy_version 33506 (0.0031) [2024-04-26 00:38:37,707][47267] Signal inference workers to stop experience collection... (7200 times) [2024-04-26 00:38:37,734][47288] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-04-26 00:38:37,755][47267] Signal inference workers to resume experience collection... (7200 times) [2024-04-26 00:38:37,758][47288] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-04-26 00:38:38,743][47288] Updated weights for policy 0, policy_version 33516 (0.0028) [2024-04-26 00:38:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 549126144. Throughput: 0: 56013.8. Samples: 498546900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 00:38:38,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 00:38:41,753][47288] Updated weights for policy 0, policy_version 33526 (0.0028) [2024-04-26 00:38:43,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 549388288. Throughput: 0: 55851.5. Samples: 498709740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 00:38:43,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 00:38:44,635][47288] Updated weights for policy 0, policy_version 33536 (0.0031) [2024-04-26 00:38:47,559][47288] Updated weights for policy 0, policy_version 33546 (0.0037) [2024-04-26 00:38:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 55927.8). Total num frames: 549683200. Throughput: 0: 55898.2. Samples: 499048580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 00:38:48,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 00:38:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000033550_549683200.pth... [2024-04-26 00:38:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000032731_536264704.pth [2024-04-26 00:38:50,535][47288] Updated weights for policy 0, policy_version 33556 (0.0027) [2024-04-26 00:38:53,502][47288] Updated weights for policy 0, policy_version 33566 (0.0027) [2024-04-26 00:38:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 549961728. Throughput: 0: 55883.4. Samples: 499382800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 00:38:53,923][47056] Avg episode reward: [(0, '0.279')] [2024-04-26 00:38:56,602][47288] Updated weights for policy 0, policy_version 33576 (0.0032) [2024-04-26 00:38:58,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 550240256. Throughput: 0: 55974.4. Samples: 499551880. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 00:38:58,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 00:38:59,580][47288] Updated weights for policy 0, policy_version 33586 (0.0028) [2024-04-26 00:39:02,418][47288] Updated weights for policy 0, policy_version 33596 (0.0031) [2024-04-26 00:39:03,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 550518784. Throughput: 0: 55746.9. Samples: 499881680. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 00:39:03,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 00:39:05,373][47288] Updated weights for policy 0, policy_version 33606 (0.0032) [2024-04-26 00:39:08,175][47288] Updated weights for policy 0, policy_version 33616 (0.0032) [2024-04-26 00:39:08,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 550797312. Throughput: 0: 55569.7. Samples: 500212640. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 00:39:08,923][47056] Avg episode reward: [(0, '0.297')] [2024-04-26 00:39:11,709][47288] Updated weights for policy 0, policy_version 33626 (0.0028) [2024-04-26 00:39:13,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55927.7). Total num frames: 551059456. Throughput: 0: 55663.5. Samples: 500384220. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 00:39:13,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-26 00:39:14,282][47288] Updated weights for policy 0, policy_version 33636 (0.0035) [2024-04-26 00:39:17,557][47288] Updated weights for policy 0, policy_version 33646 (0.0028) [2024-04-26 00:39:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 551337984. Throughput: 0: 55634.3. Samples: 500712060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:39:18,923][47056] Avg episode reward: [(0, '0.197')] [2024-04-26 00:39:20,268][47288] Updated weights for policy 0, policy_version 33656 (0.0032) [2024-04-26 00:39:23,643][47288] Updated weights for policy 0, policy_version 33666 (0.0035) [2024-04-26 00:39:23,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 551600128. Throughput: 0: 55459.0. Samples: 501042560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:39:23,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 00:39:26,043][47288] Updated weights for policy 0, policy_version 33676 (0.0033) [2024-04-26 00:39:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 551878656. Throughput: 0: 55356.5. Samples: 501200780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:39:28,923][47056] Avg episode reward: [(0, '0.280')] [2024-04-26 00:39:29,431][47288] Updated weights for policy 0, policy_version 33686 (0.0028) [2024-04-26 00:39:32,117][47288] Updated weights for policy 0, policy_version 33696 (0.0028) [2024-04-26 00:39:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 552157184. Throughput: 0: 55300.6. Samples: 501537100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:39:33,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 00:39:35,352][47288] Updated weights for policy 0, policy_version 33706 (0.0029) [2024-04-26 00:39:37,948][47288] Updated weights for policy 0, policy_version 33716 (0.0032) [2024-04-26 00:39:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 552452096. Throughput: 0: 55297.9. Samples: 501871200. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 00:39:38,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 00:39:41,245][47288] Updated weights for policy 0, policy_version 33726 (0.0038) [2024-04-26 00:39:41,272][47267] Signal inference workers to stop experience collection... (7250 times) [2024-04-26 00:39:41,272][47267] Signal inference workers to resume experience collection... (7250 times) [2024-04-26 00:39:41,299][47288] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-04-26 00:39:41,299][47288] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-04-26 00:39:43,731][47288] Updated weights for policy 0, policy_version 33736 (0.0033) [2024-04-26 00:39:43,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 552730624. Throughput: 0: 55235.8. Samples: 502037480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 00:39:43,923][47056] Avg episode reward: [(0, '0.308')] [2024-04-26 00:39:47,137][47288] Updated weights for policy 0, policy_version 33746 (0.0031) [2024-04-26 00:39:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 552992768. Throughput: 0: 55273.2. Samples: 502368980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 00:39:48,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-26 00:39:49,644][47288] Updated weights for policy 0, policy_version 33756 (0.0031) [2024-04-26 00:39:52,996][47288] Updated weights for policy 0, policy_version 33766 (0.0028) [2024-04-26 00:39:53,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55159.3, 300 sec: 55816.7). Total num frames: 553271296. Throughput: 0: 55350.0. Samples: 502703400. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 00:39:53,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 00:39:55,516][47288] Updated weights for policy 0, policy_version 33776 (0.0033) [2024-04-26 00:39:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 54886.7, 300 sec: 55650.1). Total num frames: 553533440. Throughput: 0: 55123.2. Samples: 502864760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 00:39:58,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 00:39:58,960][47288] Updated weights for policy 0, policy_version 33786 (0.0025) [2024-04-26 00:40:01,327][47288] Updated weights for policy 0, policy_version 33796 (0.0025) [2024-04-26 00:40:03,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 553828352. Throughput: 0: 55344.8. Samples: 503202580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 00:40:03,924][47056] Avg episode reward: [(0, '0.233')] [2024-04-26 00:40:04,748][47288] Updated weights for policy 0, policy_version 33806 (0.0029) [2024-04-26 00:40:07,132][47288] Updated weights for policy 0, policy_version 33816 (0.0030) [2024-04-26 00:40:08,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 554106880. Throughput: 0: 55453.8. Samples: 503537980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 00:40:08,923][47056] Avg episode reward: [(0, '0.233')] [2024-04-26 00:40:10,567][47288] Updated weights for policy 0, policy_version 33826 (0.0032) [2024-04-26 00:40:13,297][47288] Updated weights for policy 0, policy_version 33836 (0.0029) [2024-04-26 00:40:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 554401792. Throughput: 0: 55677.3. Samples: 503706260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 00:40:13,923][47056] Avg episode reward: [(0, '0.243')] [2024-04-26 00:40:16,427][47288] Updated weights for policy 0, policy_version 33846 (0.0028) [2024-04-26 00:40:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 554680320. Throughput: 0: 55706.9. Samples: 504043920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 00:40:18,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 00:40:18,932][47267] Saving new best policy, reward=0.411! [2024-04-26 00:40:19,145][47288] Updated weights for policy 0, policy_version 33856 (0.0032) [2024-04-26 00:40:22,315][47288] Updated weights for policy 0, policy_version 33866 (0.0037) [2024-04-26 00:40:23,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 554942464. Throughput: 0: 55621.4. Samples: 504374160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:40:23,923][47056] Avg episode reward: [(0, '0.213')] [2024-04-26 00:40:24,891][47288] Updated weights for policy 0, policy_version 33876 (0.0028) [2024-04-26 00:40:28,197][47288] Updated weights for policy 0, policy_version 33886 (0.0027) [2024-04-26 00:40:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 555237376. Throughput: 0: 55769.3. Samples: 504547100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:40:28,923][47056] Avg episode reward: [(0, '0.237')] [2024-04-26 00:40:30,721][47288] Updated weights for policy 0, policy_version 33896 (0.0031) [2024-04-26 00:40:33,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 555499520. Throughput: 0: 55800.8. Samples: 504880020. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:40:33,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:40:34,187][47288] Updated weights for policy 0, policy_version 33906 (0.0032) [2024-04-26 00:40:36,645][47288] Updated weights for policy 0, policy_version 33916 (0.0030) [2024-04-26 00:40:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 555778048. Throughput: 0: 55726.5. Samples: 505211080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:40:38,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 00:40:40,049][47288] Updated weights for policy 0, policy_version 33926 (0.0030) [2024-04-26 00:40:42,442][47288] Updated weights for policy 0, policy_version 33936 (0.0027) [2024-04-26 00:40:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 556056576. Throughput: 0: 55852.4. Samples: 505378120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:40:43,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 00:40:45,740][47288] Updated weights for policy 0, policy_version 33946 (0.0028) [2024-04-26 00:40:48,455][47288] Updated weights for policy 0, policy_version 33956 (0.0036) [2024-04-26 00:40:48,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 556367872. Throughput: 0: 55739.5. Samples: 505710860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:40:48,923][47056] Avg episode reward: [(0, '0.275')] [2024-04-26 00:40:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000033958_556367872.pth... [2024-04-26 00:40:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000033141_542982144.pth [2024-04-26 00:40:50,224][47267] Signal inference workers to stop experience collection... (7300 times) [2024-04-26 00:40:50,225][47267] Signal inference workers to resume experience collection... 
(7300 times) [2024-04-26 00:40:50,269][47288] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-04-26 00:40:50,269][47288] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-04-26 00:40:51,646][47288] Updated weights for policy 0, policy_version 33966 (0.0033) [2024-04-26 00:40:53,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 556630016. Throughput: 0: 55636.5. Samples: 506041620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:40:53,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 00:40:54,376][47288] Updated weights for policy 0, policy_version 33976 (0.0031) [2024-04-26 00:40:57,543][47288] Updated weights for policy 0, policy_version 33986 (0.0027) [2024-04-26 00:40:58,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 556892160. Throughput: 0: 55631.6. Samples: 506209680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:40:58,923][47056] Avg episode reward: [(0, '0.255')] [2024-04-26 00:41:00,200][47288] Updated weights for policy 0, policy_version 33996 (0.0027) [2024-04-26 00:41:03,347][47288] Updated weights for policy 0, policy_version 34006 (0.0026) [2024-04-26 00:41:03,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 557170688. Throughput: 0: 55575.1. Samples: 506544800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 00:41:03,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-26 00:41:06,045][47288] Updated weights for policy 0, policy_version 34016 (0.0028) [2024-04-26 00:41:08,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55705.7). Total num frames: 557465600. Throughput: 0: 55753.0. Samples: 506883040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 00:41:08,923][47056] Avg episode reward: [(0, '0.287')] [2024-04-26 00:41:09,051][47288] Updated weights for policy 0, policy_version 34026 (0.0025) [2024-04-26 00:41:11,960][47288] Updated weights for policy 0, policy_version 34036 (0.0034) [2024-04-26 00:41:13,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 557711360. Throughput: 0: 55482.3. Samples: 507043800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 00:41:13,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-26 00:41:14,941][47288] Updated weights for policy 0, policy_version 34046 (0.0031) [2024-04-26 00:41:17,734][47288] Updated weights for policy 0, policy_version 34056 (0.0026) [2024-04-26 00:41:18,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 557989888. Throughput: 0: 55573.9. Samples: 507380840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 00:41:18,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 00:41:20,891][47288] Updated weights for policy 0, policy_version 34066 (0.0033) [2024-04-26 00:41:23,511][47288] Updated weights for policy 0, policy_version 34076 (0.0033) [2024-04-26 00:41:23,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 558301184. Throughput: 0: 55578.2. Samples: 507712100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 00:41:23,923][47056] Avg episode reward: [(0, '0.295')] [2024-04-26 00:41:26,695][47288] Updated weights for policy 0, policy_version 34086 (0.0027) [2024-04-26 00:41:28,923][47056] Fps is (10 sec: 60620.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 558596096. Throughput: 0: 55757.8. Samples: 507887220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 00:41:28,923][47056] Avg episode reward: [(0, '0.201')] [2024-04-26 00:41:29,477][47288] Updated weights for policy 0, policy_version 34096 (0.0028) [2024-04-26 00:41:32,407][47288] Updated weights for policy 0, policy_version 34106 (0.0028) [2024-04-26 00:41:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 558841856. Throughput: 0: 55602.0. Samples: 508212940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 00:41:33,923][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 00:41:35,526][47288] Updated weights for policy 0, policy_version 34116 (0.0031) [2024-04-26 00:41:38,315][47288] Updated weights for policy 0, policy_version 34126 (0.0030) [2024-04-26 00:41:38,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.6, 300 sec: 55816.9). Total num frames: 559136768. Throughput: 0: 55730.6. Samples: 508549500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 00:41:38,923][47056] Avg episode reward: [(0, '0.291')] [2024-04-26 00:41:41,408][47288] Updated weights for policy 0, policy_version 34136 (0.0031) [2024-04-26 00:41:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 559415296. Throughput: 0: 55695.2. Samples: 508715960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 00:41:43,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-26 00:41:44,218][47288] Updated weights for policy 0, policy_version 34146 (0.0025) [2024-04-26 00:41:47,082][47288] Updated weights for policy 0, policy_version 34156 (0.0029) [2024-04-26 00:41:48,923][47056] Fps is (10 sec: 52429.3, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 559661056. Throughput: 0: 55897.9. Samples: 509060200. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 00:41:48,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 00:41:49,964][47288] Updated weights for policy 0, policy_version 34166 (0.0030) [2024-04-26 00:41:52,925][47288] Updated weights for policy 0, policy_version 34176 (0.0035) [2024-04-26 00:41:53,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 559955968. Throughput: 0: 55843.3. Samples: 509396000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:41:53,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 00:41:56,245][47288] Updated weights for policy 0, policy_version 34186 (0.0032) [2024-04-26 00:41:57,943][47267] Signal inference workers to stop experience collection... (7350 times) [2024-04-26 00:41:57,979][47288] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-04-26 00:41:58,032][47267] Signal inference workers to resume experience collection... (7350 times) [2024-04-26 00:41:58,032][47288] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-04-26 00:41:58,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 560250880. Throughput: 0: 55905.3. Samples: 509559540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:41:58,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:41:58,971][47288] Updated weights for policy 0, policy_version 34196 (0.0025) [2024-04-26 00:42:01,939][47288] Updated weights for policy 0, policy_version 34206 (0.0028) [2024-04-26 00:42:03,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 560529408. Throughput: 0: 55850.7. Samples: 509894120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:42:03,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 00:42:04,735][47288] Updated weights for policy 0, policy_version 34216 (0.0028) [2024-04-26 00:42:07,848][47288] Updated weights for policy 0, policy_version 34226 (0.0025) [2024-04-26 00:42:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 560807936. Throughput: 0: 55899.2. Samples: 510227560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:42:08,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-26 00:42:10,449][47288] Updated weights for policy 0, policy_version 34236 (0.0032) [2024-04-26 00:42:13,705][47288] Updated weights for policy 0, policy_version 34246 (0.0033) [2024-04-26 00:42:13,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 561086464. Throughput: 0: 55876.8. Samples: 510401680. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 00:42:13,923][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 00:42:16,555][47288] Updated weights for policy 0, policy_version 34256 (0.0028) [2024-04-26 00:42:18,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 561364992. Throughput: 0: 56101.2. Samples: 510737500. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 00:42:18,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 00:42:19,434][47288] Updated weights for policy 0, policy_version 34266 (0.0027) [2024-04-26 00:42:22,846][47288] Updated weights for policy 0, policy_version 34276 (0.0030) [2024-04-26 00:42:23,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 561627136. Throughput: 0: 55881.8. Samples: 511064180. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 00:42:23,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 00:42:25,252][47288] Updated weights for policy 0, policy_version 34286 (0.0035) [2024-04-26 00:42:28,877][47288] Updated weights for policy 0, policy_version 34296 (0.0025) [2024-04-26 00:42:28,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55159.2, 300 sec: 55761.1). Total num frames: 561905664. Throughput: 0: 55912.9. Samples: 511232060. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 00:42:28,924][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 00:42:31,114][47288] Updated weights for policy 0, policy_version 34306 (0.0026) [2024-04-26 00:42:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 562200576. Throughput: 0: 55670.6. Samples: 511565380. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 00:42:33,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 00:42:34,771][47288] Updated weights for policy 0, policy_version 34316 (0.0028) [2024-04-26 00:42:37,108][47288] Updated weights for policy 0, policy_version 34326 (0.0031) [2024-04-26 00:42:38,923][47056] Fps is (10 sec: 58983.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 562495488. Throughput: 0: 55752.1. Samples: 511904840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 00:42:38,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 00:42:40,433][47288] Updated weights for policy 0, policy_version 34336 (0.0029) [2024-04-26 00:42:43,048][47288] Updated weights for policy 0, policy_version 34346 (0.0032) [2024-04-26 00:42:43,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 562757632. Throughput: 0: 55922.5. Samples: 512076060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 00:42:43,923][47056] Avg episode reward: [(0, '0.316')] [2024-04-26 00:42:46,288][47288] Updated weights for policy 0, policy_version 34356 (0.0026) [2024-04-26 00:42:48,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 563036160. Throughput: 0: 55830.7. Samples: 512406500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 00:42:48,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 00:42:48,968][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000034366_563052544.pth... [2024-04-26 00:42:48,974][47288] Updated weights for policy 0, policy_version 34366 (0.0028) [2024-04-26 00:42:49,017][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000033550_549683200.pth [2024-04-26 00:42:50,428][47267] Signal inference workers to stop experience collection... (7400 times) [2024-04-26 00:42:50,458][47288] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-04-26 00:42:50,513][47267] Signal inference workers to resume experience collection... (7400 times) [2024-04-26 00:42:50,513][47288] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-04-26 00:42:52,238][47288] Updated weights for policy 0, policy_version 34376 (0.0026) [2024-04-26 00:42:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 563314688. Throughput: 0: 55869.6. Samples: 512741700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 00:42:53,924][47056] Avg episode reward: [(0, '0.220')] [2024-04-26 00:42:54,714][47288] Updated weights for policy 0, policy_version 34386 (0.0034) [2024-04-26 00:42:57,980][47288] Updated weights for policy 0, policy_version 34396 (0.0031) [2024-04-26 00:42:58,923][47056] Fps is (10 sec: 54065.6, 60 sec: 55432.3, 300 sec: 55594.5). Total num frames: 563576832. Throughput: 0: 55566.0. Samples: 512902160. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-04-26 00:42:58,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 00:43:00,648][47288] Updated weights for policy 0, policy_version 34406 (0.0026) [2024-04-26 00:43:03,892][47288] Updated weights for policy 0, policy_version 34416 (0.0028) [2024-04-26 00:43:03,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 563871744. Throughput: 0: 55487.2. Samples: 513234420. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-04-26 00:43:03,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 00:43:06,580][47288] Updated weights for policy 0, policy_version 34426 (0.0033) [2024-04-26 00:43:08,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 564133888. Throughput: 0: 55682.3. Samples: 513569880. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-04-26 00:43:08,923][47056] Avg episode reward: [(0, '0.273')] [2024-04-26 00:43:09,848][47288] Updated weights for policy 0, policy_version 34436 (0.0025) [2024-04-26 00:43:12,359][47288] Updated weights for policy 0, policy_version 34446 (0.0032) [2024-04-26 00:43:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 564428800. Throughput: 0: 55703.4. Samples: 513738700. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-04-26 00:43:13,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 00:43:15,894][47288] Updated weights for policy 0, policy_version 34456 (0.0026) [2024-04-26 00:43:18,049][47288] Updated weights for policy 0, policy_version 34466 (0.0026) [2024-04-26 00:43:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 564707328. Throughput: 0: 55815.9. Samples: 514077100. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-04-26 00:43:18,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 00:43:21,616][47288] Updated weights for policy 0, policy_version 34476 (0.0028) [2024-04-26 00:43:23,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 565002240. Throughput: 0: 55804.8. Samples: 514416060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 00:43:23,923][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 00:43:23,936][47288] Updated weights for policy 0, policy_version 34486 (0.0028) [2024-04-26 00:43:27,453][47288] Updated weights for policy 0, policy_version 34496 (0.0033) [2024-04-26 00:43:28,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 565264384. Throughput: 0: 55757.3. Samples: 514585140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 00:43:28,924][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 00:43:29,913][47288] Updated weights for policy 0, policy_version 34506 (0.0026) [2024-04-26 00:43:33,178][47288] Updated weights for policy 0, policy_version 34516 (0.0027) [2024-04-26 00:43:33,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 565542912. Throughput: 0: 55841.3. Samples: 514919360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 00:43:33,923][47056] Avg episode reward: [(0, '0.318')] [2024-04-26 00:43:35,797][47288] Updated weights for policy 0, policy_version 34526 (0.0029) [2024-04-26 00:43:38,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 565821440. Throughput: 0: 55948.4. Samples: 515259380. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 00:43:38,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 00:43:39,024][47288] Updated weights for policy 0, policy_version 34536 (0.0027) [2024-04-26 00:43:41,456][47288] Updated weights for policy 0, policy_version 34546 (0.0027) [2024-04-26 00:43:43,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 566099968. Throughput: 0: 55922.4. Samples: 515418660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 00:43:43,924][47056] Avg episode reward: [(0, '0.251')] [2024-04-26 00:43:44,869][47288] Updated weights for policy 0, policy_version 34556 (0.0035) [2024-04-26 00:43:47,335][47288] Updated weights for policy 0, policy_version 34566 (0.0034) [2024-04-26 00:43:48,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 566394880. Throughput: 0: 56007.6. Samples: 515754760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 00:43:48,923][47056] Avg episode reward: [(0, '0.240')] [2024-04-26 00:43:50,723][47288] Updated weights for policy 0, policy_version 34576 (0.0025) [2024-04-26 00:43:53,173][47288] Updated weights for policy 0, policy_version 34586 (0.0035) [2024-04-26 00:43:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 566673408. Throughput: 0: 56081.7. Samples: 516093560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 00:43:53,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-26 00:43:56,704][47288] Updated weights for policy 0, policy_version 34596 (0.0029) [2024-04-26 00:43:57,621][47267] Signal inference workers to stop experience collection... (7450 times) [2024-04-26 00:43:57,622][47267] Signal inference workers to resume experience collection... 
(7450 times) [2024-04-26 00:43:57,649][47288] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-04-26 00:43:57,649][47288] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-04-26 00:43:58,838][47288] Updated weights for policy 0, policy_version 34606 (0.0033) [2024-04-26 00:43:58,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56798.0, 300 sec: 55816.7). Total num frames: 566984704. Throughput: 0: 56309.2. Samples: 516272620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 00:43:58,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-26 00:44:02,617][47288] Updated weights for policy 0, policy_version 34616 (0.0033) [2024-04-26 00:44:03,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 567230464. Throughput: 0: 56266.4. Samples: 516609080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:44:03,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 00:44:04,819][47288] Updated weights for policy 0, policy_version 34626 (0.0040) [2024-04-26 00:44:08,429][47288] Updated weights for policy 0, policy_version 34636 (0.0028) [2024-04-26 00:44:08,923][47056] Fps is (10 sec: 50791.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 567492608. Throughput: 0: 56067.3. Samples: 516939080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:44:08,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 00:44:10,953][47288] Updated weights for policy 0, policy_version 34646 (0.0026) [2024-04-26 00:44:13,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 567787520. Throughput: 0: 56087.7. Samples: 517109080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:44:13,923][47056] Avg episode reward: [(0, '0.323')] [2024-04-26 00:44:14,117][47288] Updated weights for policy 0, policy_version 34656 (0.0027) [2024-04-26 00:44:16,844][47288] Updated weights for policy 0, policy_version 34666 (0.0032) [2024-04-26 00:44:18,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 568066048. Throughput: 0: 56086.5. Samples: 517443260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:44:18,923][47056] Avg episode reward: [(0, '0.279')] [2024-04-26 00:44:19,959][47288] Updated weights for policy 0, policy_version 34676 (0.0027) [2024-04-26 00:44:22,835][47288] Updated weights for policy 0, policy_version 34686 (0.0031) [2024-04-26 00:44:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 568360960. Throughput: 0: 55979.7. Samples: 517778460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:44:23,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 00:44:25,784][47288] Updated weights for policy 0, policy_version 34696 (0.0031) [2024-04-26 00:44:28,598][47288] Updated weights for policy 0, policy_version 34706 (0.0031) [2024-04-26 00:44:28,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 568639488. Throughput: 0: 56307.2. Samples: 517952480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 00:44:28,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 00:44:31,697][47288] Updated weights for policy 0, policy_version 34716 (0.0025) [2024-04-26 00:44:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 568918016. Throughput: 0: 56285.7. Samples: 518287620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 00:44:33,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 00:44:34,363][47288] Updated weights for policy 0, policy_version 34726 (0.0028) [2024-04-26 00:44:37,563][47288] Updated weights for policy 0, policy_version 34736 (0.0027) [2024-04-26 00:44:38,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 569212928. Throughput: 0: 56187.3. Samples: 518621980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 00:44:38,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 00:44:40,176][47288] Updated weights for policy 0, policy_version 34746 (0.0028) [2024-04-26 00:44:43,314][47288] Updated weights for policy 0, policy_version 34756 (0.0031) [2024-04-26 00:44:43,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 569458688. Throughput: 0: 55865.9. Samples: 518786580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 00:44:43,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-26 00:44:45,887][47288] Updated weights for policy 0, policy_version 34766 (0.0033) [2024-04-26 00:44:48,923][47056] Fps is (10 sec: 54065.7, 60 sec: 55978.4, 300 sec: 55872.2). Total num frames: 569753600. Throughput: 0: 55883.3. Samples: 519123840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:44:48,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 00:44:48,930][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000034775_569753600.pth... [2024-04-26 00:44:48,990][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000033958_556367872.pth [2024-04-26 00:44:49,106][47288] Updated weights for policy 0, policy_version 34776 (0.0032) [2024-04-26 00:44:51,809][47288] Updated weights for policy 0, policy_version 34786 (0.0032) [2024-04-26 00:44:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 570032128. 
Throughput: 0: 56083.5. Samples: 519462840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:44:53,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 00:44:54,921][47288] Updated weights for policy 0, policy_version 34796 (0.0034) [2024-04-26 00:44:55,650][47267] Signal inference workers to stop experience collection... (7500 times) [2024-04-26 00:44:55,686][47288] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-04-26 00:44:55,743][47267] Signal inference workers to resume experience collection... (7500 times) [2024-04-26 00:44:55,744][47288] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-04-26 00:44:57,758][47288] Updated weights for policy 0, policy_version 34806 (0.0029) [2024-04-26 00:44:58,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 570327040. Throughput: 0: 56069.7. Samples: 519632220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:44:58,923][47056] Avg episode reward: [(0, '0.238')] [2024-04-26 00:45:00,855][47288] Updated weights for policy 0, policy_version 34816 (0.0032) [2024-04-26 00:45:03,739][47288] Updated weights for policy 0, policy_version 34826 (0.0035) [2024-04-26 00:45:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 570605568. Throughput: 0: 55947.7. Samples: 519960900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:45:03,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-26 00:45:06,651][47288] Updated weights for policy 0, policy_version 34836 (0.0027) [2024-04-26 00:45:08,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 570884096. Throughput: 0: 56009.2. Samples: 520298880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 00:45:08,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 00:45:09,457][47288] Updated weights for policy 0, policy_version 34846 (0.0028) [2024-04-26 00:45:12,701][47288] Updated weights for policy 0, policy_version 34856 (0.0032) [2024-04-26 00:45:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 571162624. Throughput: 0: 55870.6. Samples: 520466660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 00:45:13,923][47056] Avg episode reward: [(0, '0.204')] [2024-04-26 00:45:15,279][47288] Updated weights for policy 0, policy_version 34866 (0.0031) [2024-04-26 00:45:18,422][47288] Updated weights for policy 0, policy_version 34876 (0.0025) [2024-04-26 00:45:18,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 571408384. Throughput: 0: 55907.6. Samples: 520803460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 00:45:18,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-26 00:45:21,111][47288] Updated weights for policy 0, policy_version 34886 (0.0025) [2024-04-26 00:45:23,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 571686912. Throughput: 0: 55836.8. Samples: 521134640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 00:45:23,923][47056] Avg episode reward: [(0, '0.352')] [2024-04-26 00:45:24,359][47288] Updated weights for policy 0, policy_version 34896 (0.0028) [2024-04-26 00:45:27,162][47288] Updated weights for policy 0, policy_version 34906 (0.0030) [2024-04-26 00:45:28,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 571981824. Throughput: 0: 55787.0. Samples: 521297000. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 00:45:28,923][47056] Avg episode reward: [(0, '0.330')] [2024-04-26 00:45:30,329][47288] Updated weights for policy 0, policy_version 34916 (0.0030) [2024-04-26 00:45:33,051][47288] Updated weights for policy 0, policy_version 34926 (0.0027) [2024-04-26 00:45:33,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 572276736. Throughput: 0: 55781.5. Samples: 521634000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-26 00:45:33,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 00:45:36,050][47288] Updated weights for policy 0, policy_version 34936 (0.0035) [2024-04-26 00:45:38,886][47288] Updated weights for policy 0, policy_version 34946 (0.0028) [2024-04-26 00:45:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 572555264. Throughput: 0: 55724.8. Samples: 521970460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-26 00:45:38,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-26 00:45:41,841][47288] Updated weights for policy 0, policy_version 34956 (0.0030) [2024-04-26 00:45:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 572833792. Throughput: 0: 55795.5. Samples: 522143020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-26 00:45:43,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 00:45:44,864][47288] Updated weights for policy 0, policy_version 34966 (0.0026) [2024-04-26 00:45:47,768][47288] Updated weights for policy 0, policy_version 34976 (0.0031) [2024-04-26 00:45:48,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 573095936. Throughput: 0: 55905.6. Samples: 522476660. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-26 00:45:48,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 00:45:50,829][47288] Updated weights for policy 0, policy_version 34986 (0.0030) [2024-04-26 00:45:53,777][47288] Updated weights for policy 0, policy_version 34996 (0.0029) [2024-04-26 00:45:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 573374464. Throughput: 0: 55769.3. Samples: 522808500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-26 00:45:53,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 00:45:56,560][47288] Updated weights for policy 0, policy_version 35006 (0.0029) [2024-04-26 00:45:58,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 573652992. Throughput: 0: 55703.2. Samples: 522973300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 00:45:58,923][47056] Avg episode reward: [(0, '0.235')] [2024-04-26 00:45:59,915][47288] Updated weights for policy 0, policy_version 35016 (0.0027) [2024-04-26 00:46:02,282][47288] Updated weights for policy 0, policy_version 35026 (0.0029) [2024-04-26 00:46:03,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 573947904. Throughput: 0: 55694.3. Samples: 523309700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 00:46:03,923][47056] Avg episode reward: [(0, '0.308')] [2024-04-26 00:46:04,691][47267] Signal inference workers to stop experience collection... (7550 times) [2024-04-26 00:46:04,691][47267] Signal inference workers to resume experience collection... 
(7550 times) [2024-04-26 00:46:04,714][47288] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-04-26 00:46:04,714][47288] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-04-26 00:46:05,619][47288] Updated weights for policy 0, policy_version 35036 (0.0028) [2024-04-26 00:46:08,268][47288] Updated weights for policy 0, policy_version 35046 (0.0030) [2024-04-26 00:46:08,923][47056] Fps is (10 sec: 58981.2, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 574242816. Throughput: 0: 55785.1. Samples: 523644980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 00:46:08,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 00:46:11,286][47288] Updated weights for policy 0, policy_version 35056 (0.0029) [2024-04-26 00:46:13,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.6, 300 sec: 55927.7). Total num frames: 574488576. Throughput: 0: 55971.6. Samples: 523815720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 00:46:13,923][47056] Avg episode reward: [(0, '0.208')] [2024-04-26 00:46:14,130][47288] Updated weights for policy 0, policy_version 35066 (0.0032) [2024-04-26 00:46:17,290][47288] Updated weights for policy 0, policy_version 35076 (0.0032) [2024-04-26 00:46:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 574783488. Throughput: 0: 56096.8. Samples: 524158360. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 00:46:18,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 00:46:19,886][47288] Updated weights for policy 0, policy_version 35086 (0.0030) [2024-04-26 00:46:23,274][47288] Updated weights for policy 0, policy_version 35096 (0.0029) [2024-04-26 00:46:23,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 575078400. Throughput: 0: 56041.3. Samples: 524492320. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 00:46:23,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 00:46:25,636][47288] Updated weights for policy 0, policy_version 35106 (0.0028) [2024-04-26 00:46:28,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 575324160. Throughput: 0: 55830.5. Samples: 524655380. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 00:46:28,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 00:46:28,951][47288] Updated weights for policy 0, policy_version 35116 (0.0030) [2024-04-26 00:46:31,523][47288] Updated weights for policy 0, policy_version 35126 (0.0027) [2024-04-26 00:46:33,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 575635456. Throughput: 0: 55892.3. Samples: 524991800. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 00:46:33,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 00:46:34,620][47288] Updated weights for policy 0, policy_version 35136 (0.0026) [2024-04-26 00:46:37,357][47288] Updated weights for policy 0, policy_version 35146 (0.0029) [2024-04-26 00:46:38,923][47056] Fps is (10 sec: 58981.4, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 575913984. Throughput: 0: 56071.6. Samples: 525331720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 00:46:38,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 00:46:40,496][47288] Updated weights for policy 0, policy_version 35156 (0.0027) [2024-04-26 00:46:43,091][47288] Updated weights for policy 0, policy_version 35166 (0.0033) [2024-04-26 00:46:43,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 576176128. Throughput: 0: 56075.9. Samples: 525496720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:46:43,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 00:46:46,411][47288] Updated weights for policy 0, policy_version 35176 (0.0029) [2024-04-26 00:46:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 576471040. Throughput: 0: 56014.1. Samples: 525830340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:46:48,923][47056] Avg episode reward: [(0, '0.347')] [2024-04-26 00:46:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000035185_576471040.pth... [2024-04-26 00:46:48,995][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000034366_563052544.pth [2024-04-26 00:46:49,111][47288] Updated weights for policy 0, policy_version 35186 (0.0030) [2024-04-26 00:46:52,243][47288] Updated weights for policy 0, policy_version 35196 (0.0042) [2024-04-26 00:46:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 576733184. Throughput: 0: 56064.1. Samples: 526167860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:46:53,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-26 00:46:55,022][47288] Updated weights for policy 0, policy_version 35206 (0.0030) [2024-04-26 00:46:58,145][47288] Updated weights for policy 0, policy_version 35216 (0.0026) [2024-04-26 00:46:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 577028096. Throughput: 0: 55978.2. Samples: 526334740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:46:58,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 00:47:00,071][47267] Signal inference workers to stop experience collection... (7600 times) [2024-04-26 00:47:00,075][47267] Signal inference workers to resume experience collection... 
(7600 times) [2024-04-26 00:47:00,102][47288] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-04-26 00:47:00,102][47288] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-04-26 00:47:00,808][47288] Updated weights for policy 0, policy_version 35226 (0.0029) [2024-04-26 00:47:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 577306624. Throughput: 0: 55805.0. Samples: 526669580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 00:47:03,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 00:47:03,925][47288] Updated weights for policy 0, policy_version 35236 (0.0030) [2024-04-26 00:47:06,858][47288] Updated weights for policy 0, policy_version 35246 (0.0031) [2024-04-26 00:47:08,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 577568768. Throughput: 0: 55890.6. Samples: 527007400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 00:47:08,923][47056] Avg episode reward: [(0, '0.325')] [2024-04-26 00:47:09,818][47288] Updated weights for policy 0, policy_version 35256 (0.0029) [2024-04-26 00:47:12,810][47288] Updated weights for policy 0, policy_version 35266 (0.0031) [2024-04-26 00:47:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 577880064. Throughput: 0: 55969.2. Samples: 527174000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 00:47:13,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-26 00:47:15,641][47288] Updated weights for policy 0, policy_version 35276 (0.0030) [2024-04-26 00:47:18,714][47288] Updated weights for policy 0, policy_version 35286 (0.0029) [2024-04-26 00:47:18,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 578125824. Throughput: 0: 55891.7. Samples: 527506940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 00:47:18,923][47056] Avg episode reward: [(0, '0.242')] [2024-04-26 00:47:21,405][47288] Updated weights for policy 0, policy_version 35296 (0.0028) [2024-04-26 00:47:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 578420736. Throughput: 0: 55766.3. Samples: 527841200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 00:47:23,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 00:47:24,546][47288] Updated weights for policy 0, policy_version 35306 (0.0027) [2024-04-26 00:47:27,289][47288] Updated weights for policy 0, policy_version 35316 (0.0034) [2024-04-26 00:47:28,923][47056] Fps is (10 sec: 58983.7, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 578715648. Throughput: 0: 55954.7. Samples: 528014680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 00:47:28,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 00:47:30,395][47288] Updated weights for policy 0, policy_version 35326 (0.0034) [2024-04-26 00:47:33,171][47288] Updated weights for policy 0, policy_version 35336 (0.0027) [2024-04-26 00:47:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 578977792. Throughput: 0: 56099.1. Samples: 528354800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 00:47:33,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 00:47:36,302][47288] Updated weights for policy 0, policy_version 35346 (0.0026) [2024-04-26 00:47:38,851][47288] Updated weights for policy 0, policy_version 35356 (0.0030) [2024-04-26 00:47:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 579272704. Throughput: 0: 56030.3. Samples: 528689220. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 00:47:38,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 00:47:42,094][47288] Updated weights for policy 0, policy_version 35366 (0.0028) [2024-04-26 00:47:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 579518464. Throughput: 0: 55863.9. Samples: 528848620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 00:47:43,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 00:47:44,883][47288] Updated weights for policy 0, policy_version 35376 (0.0031) [2024-04-26 00:47:47,779][47288] Updated weights for policy 0, policy_version 35386 (0.0029) [2024-04-26 00:47:48,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 579813376. Throughput: 0: 55957.7. Samples: 529187680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 00:47:48,923][47056] Avg episode reward: [(0, '0.325')] [2024-04-26 00:47:50,716][47288] Updated weights for policy 0, policy_version 35396 (0.0033) [2024-04-26 00:47:53,809][47288] Updated weights for policy 0, policy_version 35406 (0.0027) [2024-04-26 00:47:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 580091904. Throughput: 0: 55998.8. Samples: 529527340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:47:53,923][47056] Avg episode reward: [(0, '0.249')] [2024-04-26 00:47:56,498][47288] Updated weights for policy 0, policy_version 35416 (0.0028) [2024-04-26 00:47:58,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 580386816. Throughput: 0: 55897.9. Samples: 529689400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:47:58,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 00:47:59,802][47288] Updated weights for policy 0, policy_version 35426 (0.0030) [2024-04-26 00:47:59,921][47267] Signal inference workers to stop experience collection... 
(7650 times) [2024-04-26 00:47:59,925][47267] Signal inference workers to resume experience collection... (7650 times) [2024-04-26 00:47:59,961][47288] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-04-26 00:47:59,961][47288] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-04-26 00:48:02,411][47288] Updated weights for policy 0, policy_version 35436 (0.0036) [2024-04-26 00:48:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 580665344. Throughput: 0: 55847.4. Samples: 530020060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:48:03,923][47056] Avg episode reward: [(0, '0.228')] [2024-04-26 00:48:05,578][47288] Updated weights for policy 0, policy_version 35446 (0.0027) [2024-04-26 00:48:08,560][47288] Updated weights for policy 0, policy_version 35456 (0.0032) [2024-04-26 00:48:08,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 580927488. Throughput: 0: 55975.4. Samples: 530360100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:48:08,923][47056] Avg episode reward: [(0, '0.352')] [2024-04-26 00:48:11,451][47288] Updated weights for policy 0, policy_version 35466 (0.0030) [2024-04-26 00:48:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 581206016. Throughput: 0: 55829.8. Samples: 530527020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:48:13,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 00:48:14,285][47288] Updated weights for policy 0, policy_version 35476 (0.0030) [2024-04-26 00:48:17,240][47288] Updated weights for policy 0, policy_version 35486 (0.0033) [2024-04-26 00:48:18,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55978.9, 300 sec: 55872.2). Total num frames: 581484544. Throughput: 0: 55836.1. Samples: 530867420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:48:18,923][47056] Avg episode reward: [(0, '0.265')] [2024-04-26 00:48:19,974][47288] Updated weights for policy 0, policy_version 35496 (0.0037) [2024-04-26 00:48:23,089][47288] Updated weights for policy 0, policy_version 35506 (0.0026) [2024-04-26 00:48:23,926][47056] Fps is (10 sec: 55684.3, 60 sec: 55702.1, 300 sec: 55927.1). Total num frames: 581763072. Throughput: 0: 55860.1. Samples: 531203140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:48:23,927][47056] Avg episode reward: [(0, '0.267')] [2024-04-26 00:48:25,842][47288] Updated weights for policy 0, policy_version 35516 (0.0025) [2024-04-26 00:48:28,812][47288] Updated weights for policy 0, policy_version 35526 (0.0030) [2024-04-26 00:48:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 582057984. Throughput: 0: 55989.0. Samples: 531368120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:48:28,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 00:48:31,731][47288] Updated weights for policy 0, policy_version 35536 (0.0027) [2024-04-26 00:48:33,923][47056] Fps is (10 sec: 57366.2, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 582336512. Throughput: 0: 55857.5. Samples: 531701260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 00:48:33,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 00:48:34,794][47288] Updated weights for policy 0, policy_version 35546 (0.0029) [2024-04-26 00:48:37,576][47288] Updated weights for policy 0, policy_version 35556 (0.0028) [2024-04-26 00:48:38,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 56038.8). Total num frames: 582631424. Throughput: 0: 55947.0. Samples: 532044960. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 00:48:38,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 00:48:40,598][47288] Updated weights for policy 0, policy_version 35566 (0.0027) [2024-04-26 00:48:43,256][47288] Updated weights for policy 0, policy_version 35576 (0.0032) [2024-04-26 00:48:43,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56798.0, 300 sec: 56038.8). Total num frames: 582926336. Throughput: 0: 56236.8. Samples: 532220060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 00:48:43,923][47056] Avg episode reward: [(0, '0.298')] [2024-04-26 00:48:46,340][47288] Updated weights for policy 0, policy_version 35586 (0.0032) [2024-04-26 00:48:48,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 583172096. Throughput: 0: 56392.8. Samples: 532557740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 00:48:48,923][47056] Avg episode reward: [(0, '0.231')] [2024-04-26 00:48:48,946][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000035595_583188480.pth... [2024-04-26 00:48:48,989][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000034775_569753600.pth [2024-04-26 00:48:49,154][47288] Updated weights for policy 0, policy_version 35596 (0.0029) [2024-04-26 00:48:52,161][47288] Updated weights for policy 0, policy_version 35606 (0.0031) [2024-04-26 00:48:53,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 583450624. Throughput: 0: 56155.3. Samples: 532887080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 00:48:53,923][47056] Avg episode reward: [(0, '0.221')] [2024-04-26 00:48:55,143][47288] Updated weights for policy 0, policy_version 35616 (0.0026) [2024-04-26 00:48:58,033][47288] Updated weights for policy 0, policy_version 35626 (0.0027) [2024-04-26 00:48:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 583729152. 
Throughput: 0: 56121.3. Samples: 533052480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:48:58,923][47056] Avg episode reward: [(0, '0.267')] [2024-04-26 00:49:00,817][47288] Updated weights for policy 0, policy_version 35636 (0.0028) [2024-04-26 00:49:03,829][47288] Updated weights for policy 0, policy_version 35646 (0.0030) [2024-04-26 00:49:03,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 584024064. Throughput: 0: 55972.0. Samples: 533386160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:49:03,923][47056] Avg episode reward: [(0, '0.295')] [2024-04-26 00:49:06,014][47267] Signal inference workers to stop experience collection... (7700 times) [2024-04-26 00:49:06,046][47288] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-04-26 00:49:06,063][47267] Signal inference workers to resume experience collection... (7700 times) [2024-04-26 00:49:06,064][47288] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-04-26 00:49:06,527][47288] Updated weights for policy 0, policy_version 35656 (0.0030) [2024-04-26 00:49:08,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.9, 300 sec: 56038.8). Total num frames: 584318976. Throughput: 0: 56072.3. Samples: 533726180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:49:08,923][47056] Avg episode reward: [(0, '0.218')] [2024-04-26 00:49:09,528][47288] Updated weights for policy 0, policy_version 35666 (0.0032) [2024-04-26 00:49:12,385][47288] Updated weights for policy 0, policy_version 35676 (0.0028) [2024-04-26 00:49:13,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.7, 300 sec: 56038.8). Total num frames: 584597504. Throughput: 0: 56199.9. Samples: 533897120. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:49:13,923][47056] Avg episode reward: [(0, '0.245')] [2024-04-26 00:49:15,373][47288] Updated weights for policy 0, policy_version 35686 (0.0027) [2024-04-26 00:49:18,422][47288] Updated weights for policy 0, policy_version 35696 (0.0036) [2024-04-26 00:49:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 584876032. Throughput: 0: 56397.2. Samples: 534239140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 00:49:18,923][47056] Avg episode reward: [(0, '0.361')] [2024-04-26 00:49:21,346][47288] Updated weights for policy 0, policy_version 35706 (0.0027) [2024-04-26 00:49:23,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56255.2, 300 sec: 55927.7). Total num frames: 585138176. Throughput: 0: 56207.0. Samples: 534574280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:49:23,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 00:49:24,213][47288] Updated weights for policy 0, policy_version 35716 (0.0027) [2024-04-26 00:49:27,140][47288] Updated weights for policy 0, policy_version 35726 (0.0026) [2024-04-26 00:49:28,923][47056] Fps is (10 sec: 54065.8, 60 sec: 55978.3, 300 sec: 55927.7). Total num frames: 585416704. Throughput: 0: 56008.0. Samples: 534740440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:49:28,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-26 00:49:30,006][47288] Updated weights for policy 0, policy_version 35736 (0.0027) [2024-04-26 00:49:32,937][47288] Updated weights for policy 0, policy_version 35746 (0.0030) [2024-04-26 00:49:33,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 585695232. Throughput: 0: 56037.0. Samples: 535079400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:49:33,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 00:49:35,933][47288] Updated weights for policy 0, policy_version 35756 (0.0028) [2024-04-26 00:49:38,857][47288] Updated weights for policy 0, policy_version 35766 (0.0030) [2024-04-26 00:49:38,923][47056] Fps is (10 sec: 57345.9, 60 sec: 55978.8, 300 sec: 56038.8). Total num frames: 585990144. Throughput: 0: 56138.7. Samples: 535413320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:49:38,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 00:49:41,718][47288] Updated weights for policy 0, policy_version 35776 (0.0029) [2024-04-26 00:49:43,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55432.3, 300 sec: 55927.8). Total num frames: 586252288. Throughput: 0: 56310.0. Samples: 535586440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 00:49:43,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 00:49:44,672][47288] Updated weights for policy 0, policy_version 35786 (0.0030) [2024-04-26 00:49:47,448][47288] Updated weights for policy 0, policy_version 35796 (0.0027) [2024-04-26 00:49:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.9, 300 sec: 56038.8). Total num frames: 586563584. Throughput: 0: 56141.8. Samples: 535912540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 00:49:48,923][47056] Avg episode reward: [(0, '0.220')] [2024-04-26 00:49:50,458][47288] Updated weights for policy 0, policy_version 35806 (0.0028) [2024-04-26 00:49:53,481][47288] Updated weights for policy 0, policy_version 35816 (0.0025) [2024-04-26 00:49:53,923][47056] Fps is (10 sec: 57345.3, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 586825728. Throughput: 0: 56060.1. Samples: 536248880. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 00:49:53,923][47056] Avg episode reward: [(0, '0.332')] [2024-04-26 00:49:56,291][47288] Updated weights for policy 0, policy_version 35826 (0.0027) [2024-04-26 00:49:58,923][47056] Fps is (10 sec: 54066.4, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 587104256. Throughput: 0: 55900.4. Samples: 536412640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 00:49:58,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 00:49:59,473][47267] Signal inference workers to stop experience collection... (7750 times) [2024-04-26 00:49:59,479][47288] Updated weights for policy 0, policy_version 35836 (0.0029) [2024-04-26 00:49:59,482][47267] Signal inference workers to resume experience collection... (7750 times) [2024-04-26 00:49:59,495][47288] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-04-26 00:49:59,495][47288] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-04-26 00:50:02,125][47288] Updated weights for policy 0, policy_version 35846 (0.0027) [2024-04-26 00:50:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 587382784. Throughput: 0: 55866.8. Samples: 536753140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 00:50:03,923][47056] Avg episode reward: [(0, '0.215')] [2024-04-26 00:50:05,176][47288] Updated weights for policy 0, policy_version 35856 (0.0029) [2024-04-26 00:50:07,993][47288] Updated weights for policy 0, policy_version 35866 (0.0025) [2024-04-26 00:50:08,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 587644928. Throughput: 0: 55903.7. Samples: 537089940. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:50:08,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-26 00:50:10,955][47288] Updated weights for policy 0, policy_version 35876 (0.0034) [2024-04-26 00:50:13,729][47288] Updated weights for policy 0, policy_version 35886 (0.0031) [2024-04-26 00:50:13,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 587956224. Throughput: 0: 55939.0. Samples: 537257680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:50:13,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-26 00:50:16,845][47288] Updated weights for policy 0, policy_version 35896 (0.0027) [2024-04-26 00:50:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 56038.8). Total num frames: 588218368. Throughput: 0: 55922.9. Samples: 537595940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:50:18,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 00:50:19,469][47288] Updated weights for policy 0, policy_version 35906 (0.0025) [2024-04-26 00:50:22,928][47288] Updated weights for policy 0, policy_version 35916 (0.0029) [2024-04-26 00:50:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.9, 300 sec: 56038.9). Total num frames: 588513280. Throughput: 0: 55887.1. Samples: 537928240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:50:23,923][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 00:50:25,464][47288] Updated weights for policy 0, policy_version 35926 (0.0029) [2024-04-26 00:50:28,609][47288] Updated weights for policy 0, policy_version 35936 (0.0026) [2024-04-26 00:50:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.9, 300 sec: 55927.7). Total num frames: 588775424. Throughput: 0: 55831.2. Samples: 538098840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:50:28,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 00:50:31,525][47288] Updated weights for policy 0, policy_version 35946 (0.0029) [2024-04-26 00:50:33,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 589070336. Throughput: 0: 55985.2. Samples: 538431880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 00:50:33,923][47056] Avg episode reward: [(0, '0.330')] [2024-04-26 00:50:34,323][47288] Updated weights for policy 0, policy_version 35956 (0.0031) [2024-04-26 00:50:37,182][47288] Updated weights for policy 0, policy_version 35966 (0.0030) [2024-04-26 00:50:38,923][47056] Fps is (10 sec: 57345.2, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 589348864. Throughput: 0: 56050.7. Samples: 538771160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 00:50:38,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 00:50:40,200][47288] Updated weights for policy 0, policy_version 35976 (0.0031) [2024-04-26 00:50:43,081][47288] Updated weights for policy 0, policy_version 35986 (0.0025) [2024-04-26 00:50:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 56038.9). Total num frames: 589627392. Throughput: 0: 56048.1. Samples: 538934800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 00:50:43,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 00:50:46,073][47288] Updated weights for policy 0, policy_version 35996 (0.0034) [2024-04-26 00:50:48,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 589905920. Throughput: 0: 55790.0. Samples: 539263700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 00:50:48,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 00:50:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000036005_589905920.pth... 
[2024-04-26 00:50:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000035185_576471040.pth [2024-04-26 00:50:49,182][47288] Updated weights for policy 0, policy_version 36006 (0.0025) [2024-04-26 00:50:51,795][47288] Updated weights for policy 0, policy_version 36016 (0.0039) [2024-04-26 00:50:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.5, 300 sec: 56038.8). Total num frames: 590184448. Throughput: 0: 55878.2. Samples: 539604460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:50:53,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 00:50:54,942][47288] Updated weights for policy 0, policy_version 36026 (0.0025) [2024-04-26 00:50:57,646][47288] Updated weights for policy 0, policy_version 36036 (0.0027) [2024-04-26 00:50:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 590462976. Throughput: 0: 56013.7. Samples: 539778300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:50:58,923][47056] Avg episode reward: [(0, '0.279')] [2024-04-26 00:51:00,752][47288] Updated weights for policy 0, policy_version 36046 (0.0029) [2024-04-26 00:51:03,817][47288] Updated weights for policy 0, policy_version 36056 (0.0036) [2024-04-26 00:51:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.5, 300 sec: 55927.8). Total num frames: 590741504. Throughput: 0: 55876.4. Samples: 540110380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:51:03,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 00:51:04,826][47267] Signal inference workers to stop experience collection... (7800 times) [2024-04-26 00:51:04,847][47288] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-04-26 00:51:04,884][47267] Signal inference workers to resume experience collection... 
(7800 times) [2024-04-26 00:51:04,885][47288] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-04-26 00:51:06,848][47288] Updated weights for policy 0, policy_version 36066 (0.0026) [2024-04-26 00:51:08,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 591020032. Throughput: 0: 55762.5. Samples: 540437560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:51:08,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 00:51:09,827][47288] Updated weights for policy 0, policy_version 36076 (0.0031) [2024-04-26 00:51:12,675][47288] Updated weights for policy 0, policy_version 36086 (0.0026) [2024-04-26 00:51:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55927.7). Total num frames: 591282176. Throughput: 0: 55752.8. Samples: 540607720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 00:51:13,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-26 00:51:15,730][47288] Updated weights for policy 0, policy_version 36096 (0.0035) [2024-04-26 00:51:18,519][47288] Updated weights for policy 0, policy_version 36106 (0.0030) [2024-04-26 00:51:18,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 591577088. Throughput: 0: 55828.1. Samples: 540944140. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-04-26 00:51:18,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 00:51:21,406][47288] Updated weights for policy 0, policy_version 36116 (0.0030) [2024-04-26 00:51:23,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 591855616. Throughput: 0: 55833.6. Samples: 541283680. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-04-26 00:51:23,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 00:51:24,193][47288] Updated weights for policy 0, policy_version 36126 (0.0028) [2024-04-26 00:51:27,376][47288] Updated weights for policy 0, policy_version 36136 (0.0032) [2024-04-26 00:51:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 592134144. Throughput: 0: 55734.2. Samples: 541442840. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-04-26 00:51:28,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:51:30,362][47288] Updated weights for policy 0, policy_version 36146 (0.0027) [2024-04-26 00:51:33,214][47288] Updated weights for policy 0, policy_version 36156 (0.0027) [2024-04-26 00:51:33,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 592379904. Throughput: 0: 55679.7. Samples: 541769280. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-04-26 00:51:33,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 00:51:36,183][47288] Updated weights for policy 0, policy_version 36166 (0.0027) [2024-04-26 00:51:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.4, 300 sec: 55983.3). Total num frames: 592691200. Throughput: 0: 55632.0. Samples: 542107900. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-04-26 00:51:38,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 00:51:39,416][47288] Updated weights for policy 0, policy_version 36176 (0.0027) [2024-04-26 00:51:41,963][47288] Updated weights for policy 0, policy_version 36186 (0.0030) [2024-04-26 00:51:43,923][47056] Fps is (10 sec: 60620.5, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 592986112. Throughput: 0: 55553.4. Samples: 542278200. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 00:51:43,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 00:51:45,232][47288] Updated weights for policy 0, policy_version 36196 (0.0028) [2024-04-26 00:51:47,802][47288] Updated weights for policy 0, policy_version 36206 (0.0027) [2024-04-26 00:51:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 593248256. Throughput: 0: 55576.0. Samples: 542611300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 00:51:48,924][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 00:51:51,025][47288] Updated weights for policy 0, policy_version 36216 (0.0029) [2024-04-26 00:51:52,903][47267] Signal inference workers to stop experience collection... (7850 times) [2024-04-26 00:51:52,903][47267] Signal inference workers to resume experience collection... (7850 times) [2024-04-26 00:51:52,929][47288] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-04-26 00:51:52,929][47288] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-04-26 00:51:53,521][47288] Updated weights for policy 0, policy_version 36226 (0.0033) [2024-04-26 00:51:53,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 593526784. Throughput: 0: 55801.9. Samples: 542948640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 00:51:53,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 00:51:56,953][47288] Updated weights for policy 0, policy_version 36236 (0.0029) [2024-04-26 00:51:58,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 593821696. Throughput: 0: 55939.7. Samples: 543125000. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 00:51:58,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 00:51:59,625][47288] Updated weights for policy 0, policy_version 36246 (0.0031) [2024-04-26 00:52:02,633][47288] Updated weights for policy 0, policy_version 36256 (0.0028) [2024-04-26 00:52:03,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 594067456. Throughput: 0: 55951.9. Samples: 543461980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 00:52:03,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 00:52:05,400][47288] Updated weights for policy 0, policy_version 36266 (0.0034) [2024-04-26 00:52:08,386][47288] Updated weights for policy 0, policy_version 36276 (0.0032) [2024-04-26 00:52:08,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 594345984. Throughput: 0: 55763.1. Samples: 543793020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 00:52:08,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 00:52:11,213][47288] Updated weights for policy 0, policy_version 36286 (0.0032) [2024-04-26 00:52:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 56038.9). Total num frames: 594657280. Throughput: 0: 55886.3. Samples: 543957720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 00:52:13,923][47056] Avg episode reward: [(0, '0.250')] [2024-04-26 00:52:14,094][47288] Updated weights for policy 0, policy_version 36296 (0.0026) [2024-04-26 00:52:17,094][47288] Updated weights for policy 0, policy_version 36306 (0.0031) [2024-04-26 00:52:18,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 594935808. Throughput: 0: 56105.7. Samples: 544294040. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 00:52:18,924][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 00:52:20,269][47288] Updated weights for policy 0, policy_version 36316 (0.0027) [2024-04-26 00:52:22,987][47288] Updated weights for policy 0, policy_version 36326 (0.0034) [2024-04-26 00:52:23,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 595197952. Throughput: 0: 56209.9. Samples: 544637340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 00:52:23,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 00:52:26,362][47288] Updated weights for policy 0, policy_version 36336 (0.0027) [2024-04-26 00:52:28,777][47288] Updated weights for policy 0, policy_version 36346 (0.0029) [2024-04-26 00:52:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 595492864. Throughput: 0: 56187.2. Samples: 544806620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:52:28,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 00:52:32,326][47288] Updated weights for policy 0, policy_version 36356 (0.0030) [2024-04-26 00:52:33,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 595771392. Throughput: 0: 56214.2. Samples: 545140940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:52:33,924][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 00:52:34,455][47288] Updated weights for policy 0, policy_version 36366 (0.0033) [2024-04-26 00:52:38,334][47288] Updated weights for policy 0, policy_version 36376 (0.0032) [2024-04-26 00:52:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 596033536. Throughput: 0: 56164.8. Samples: 545476060. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:52:38,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 00:52:40,592][47288] Updated weights for policy 0, policy_version 36386 (0.0030) [2024-04-26 00:52:43,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55159.4, 300 sec: 55872.2). Total num frames: 596295680. Throughput: 0: 55783.0. Samples: 545635240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:52:43,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 00:52:44,042][47288] Updated weights for policy 0, policy_version 36396 (0.0026) [2024-04-26 00:52:46,488][47288] Updated weights for policy 0, policy_version 36406 (0.0028) [2024-04-26 00:52:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 596606976. Throughput: 0: 55836.8. Samples: 545974640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 00:52:48,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:52:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000036414_596606976.pth... [2024-04-26 00:52:48,998][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000035595_583188480.pth [2024-04-26 00:52:49,881][47288] Updated weights for policy 0, policy_version 36416 (0.0026) [2024-04-26 00:52:50,070][47267] Signal inference workers to stop experience collection... (7900 times) [2024-04-26 00:52:50,106][47288] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-04-26 00:52:50,132][47267] Signal inference workers to resume experience collection... (7900 times) [2024-04-26 00:52:50,132][47288] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-04-26 00:52:52,320][47288] Updated weights for policy 0, policy_version 36426 (0.0031) [2024-04-26 00:52:53,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 596885504. Throughput: 0: 55764.9. Samples: 546302440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 00:52:53,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-26 00:52:55,859][47288] Updated weights for policy 0, policy_version 36436 (0.0028) [2024-04-26 00:52:58,071][47288] Updated weights for policy 0, policy_version 36446 (0.0033) [2024-04-26 00:52:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 597180416. Throughput: 0: 55967.6. Samples: 546476260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 00:52:58,923][47056] Avg episode reward: [(0, '0.298')] [2024-04-26 00:53:01,815][47288] Updated weights for policy 0, policy_version 36456 (0.0028) [2024-04-26 00:53:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 597426176. Throughput: 0: 55873.8. Samples: 546808360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 00:53:03,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 00:53:04,116][47288] Updated weights for policy 0, policy_version 36466 (0.0029) [2024-04-26 00:53:07,569][47288] Updated weights for policy 0, policy_version 36476 (0.0030) [2024-04-26 00:53:08,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 597721088. Throughput: 0: 55780.8. Samples: 547147480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 00:53:08,923][47056] Avg episode reward: [(0, '0.310')] [2024-04-26 00:53:10,136][47288] Updated weights for policy 0, policy_version 36486 (0.0022) [2024-04-26 00:53:13,357][47288] Updated weights for policy 0, policy_version 36496 (0.0035) [2024-04-26 00:53:13,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55432.7, 300 sec: 55927.8). Total num frames: 597983232. Throughput: 0: 55500.5. Samples: 547304140. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-26 00:53:13,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:53:15,968][47288] Updated weights for policy 0, policy_version 36506 (0.0034) [2024-04-26 00:53:18,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55159.5, 300 sec: 55872.9). Total num frames: 598245376. Throughput: 0: 55531.2. Samples: 547639840. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-26 00:53:18,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 00:53:19,197][47288] Updated weights for policy 0, policy_version 36516 (0.0029) [2024-04-26 00:53:21,745][47288] Updated weights for policy 0, policy_version 36526 (0.0030) [2024-04-26 00:53:23,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 598556672. Throughput: 0: 55383.5. Samples: 547968320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-26 00:53:23,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 00:53:25,194][47288] Updated weights for policy 0, policy_version 36536 (0.0031) [2024-04-26 00:53:27,665][47288] Updated weights for policy 0, policy_version 36546 (0.0031) [2024-04-26 00:53:28,923][47056] Fps is (10 sec: 62259.1, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 598867968. Throughput: 0: 55739.2. Samples: 548143500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-26 00:53:28,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 00:53:30,858][47288] Updated weights for policy 0, policy_version 36556 (0.0032) [2024-04-26 00:53:33,612][47288] Updated weights for policy 0, policy_version 36566 (0.0025) [2024-04-26 00:53:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 599113728. Throughput: 0: 55692.2. Samples: 548480780. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-26 00:53:33,923][47056] Avg episode reward: [(0, '0.240')] [2024-04-26 00:53:36,501][47267] Signal inference workers to stop experience collection... 
(7950 times) [2024-04-26 00:53:36,502][47267] Signal inference workers to resume experience collection... (7950 times) [2024-04-26 00:53:36,515][47288] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-04-26 00:53:36,515][47288] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-04-26 00:53:36,627][47288] Updated weights for policy 0, policy_version 36576 (0.0027) [2024-04-26 00:53:38,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 599392256. Throughput: 0: 55933.0. Samples: 548819420. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 00:53:38,923][47056] Avg episode reward: [(0, '0.266')] [2024-04-26 00:53:39,431][47288] Updated weights for policy 0, policy_version 36586 (0.0029) [2024-04-26 00:53:42,535][47288] Updated weights for policy 0, policy_version 36596 (0.0028) [2024-04-26 00:53:43,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 599654400. Throughput: 0: 55650.6. Samples: 548980540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 00:53:43,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 00:53:45,286][47288] Updated weights for policy 0, policy_version 36606 (0.0029) [2024-04-26 00:53:48,374][47288] Updated weights for policy 0, policy_version 36616 (0.0032) [2024-04-26 00:53:48,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.7, 300 sec: 55872.2). Total num frames: 599932928. Throughput: 0: 55783.2. Samples: 549318600. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 00:53:48,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 00:53:51,138][47288] Updated weights for policy 0, policy_version 36626 (0.0029) [2024-04-26 00:53:53,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 600211456. Throughput: 0: 55758.6. Samples: 549656620. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 00:53:53,923][47056] Avg episode reward: [(0, '0.341')] [2024-04-26 00:53:54,124][47288] Updated weights for policy 0, policy_version 36636 (0.0029) [2024-04-26 00:53:57,102][47288] Updated weights for policy 0, policy_version 36646 (0.0032) [2024-04-26 00:53:58,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 600506368. Throughput: 0: 55928.2. Samples: 549820920. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 00:53:58,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 00:53:59,886][47288] Updated weights for policy 0, policy_version 36656 (0.0030) [2024-04-26 00:54:02,910][47288] Updated weights for policy 0, policy_version 36666 (0.0029) [2024-04-26 00:54:03,923][47056] Fps is (10 sec: 60621.1, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 600817664. Throughput: 0: 55960.1. Samples: 550158040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 00:54:03,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 00:54:05,776][47288] Updated weights for policy 0, policy_version 36676 (0.0031) [2024-04-26 00:54:08,689][47288] Updated weights for policy 0, policy_version 36686 (0.0034) [2024-04-26 00:54:08,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 601063424. Throughput: 0: 56064.5. Samples: 550491220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 00:54:08,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 00:54:11,718][47288] Updated weights for policy 0, policy_version 36696 (0.0028) [2024-04-26 00:54:13,923][47056] Fps is (10 sec: 50790.2, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 601325568. Throughput: 0: 55884.5. Samples: 550658300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 00:54:13,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 00:54:14,640][47288] Updated weights for policy 0, policy_version 36706 (0.0029) [2024-04-26 00:54:17,583][47288] Updated weights for policy 0, policy_version 36716 (0.0027) [2024-04-26 00:54:18,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 601604096. Throughput: 0: 55723.9. Samples: 550988360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 00:54:18,923][47056] Avg episode reward: [(0, '0.235')] [2024-04-26 00:54:20,542][47288] Updated weights for policy 0, policy_version 36726 (0.0024) [2024-04-26 00:54:23,309][47288] Updated weights for policy 0, policy_version 36736 (0.0035) [2024-04-26 00:54:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55872.3). Total num frames: 601899008. Throughput: 0: 55596.3. Samples: 551321260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 00:54:23,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 00:54:26,354][47288] Updated weights for policy 0, policy_version 36746 (0.0031) [2024-04-26 00:54:28,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55159.3, 300 sec: 55872.2). Total num frames: 602177536. Throughput: 0: 55699.9. Samples: 551487040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 00:54:28,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 00:54:29,262][47288] Updated weights for policy 0, policy_version 36756 (0.0024) [2024-04-26 00:54:32,162][47288] Updated weights for policy 0, policy_version 36766 (0.0028) [2024-04-26 00:54:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 602456064. Throughput: 0: 55612.4. Samples: 551821160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 00:54:33,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 00:54:35,192][47288] Updated weights for policy 0, policy_version 36776 (0.0029) [2024-04-26 00:54:38,035][47288] Updated weights for policy 0, policy_version 36786 (0.0028) [2024-04-26 00:54:38,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 602750976. Throughput: 0: 55515.5. Samples: 552154820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 00:54:38,923][47056] Avg episode reward: [(0, '0.280')] [2024-04-26 00:54:41,054][47288] Updated weights for policy 0, policy_version 36796 (0.0033) [2024-04-26 00:54:41,704][47267] Signal inference workers to stop experience collection... (8000 times) [2024-04-26 00:54:41,733][47288] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-04-26 00:54:41,760][47267] Signal inference workers to resume experience collection... (8000 times) [2024-04-26 00:54:41,765][47288] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-04-26 00:54:43,894][47288] Updated weights for policy 0, policy_version 36806 (0.0025) [2024-04-26 00:54:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 603029504. Throughput: 0: 55728.0. Samples: 552328680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 00:54:43,923][47056] Avg episode reward: [(0, '0.298')] [2024-04-26 00:54:47,011][47288] Updated weights for policy 0, policy_version 36816 (0.0025) [2024-04-26 00:54:48,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 603275264. Throughput: 0: 55676.9. Samples: 552663500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 00:54:48,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:54:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000036821_603275264.pth... 
[2024-04-26 00:54:48,986][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000036005_589905920.pth [2024-04-26 00:54:49,729][47288] Updated weights for policy 0, policy_version 36826 (0.0027) [2024-04-26 00:54:52,936][47288] Updated weights for policy 0, policy_version 36836 (0.0028) [2024-04-26 00:54:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 603570176. Throughput: 0: 55646.7. Samples: 552995320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 00:54:53,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 00:54:55,730][47288] Updated weights for policy 0, policy_version 36846 (0.0031) [2024-04-26 00:54:58,887][47288] Updated weights for policy 0, policy_version 36856 (0.0034) [2024-04-26 00:54:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 603848704. Throughput: 0: 55383.9. Samples: 553150580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 00:54:58,923][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 00:55:01,571][47288] Updated weights for policy 0, policy_version 36866 (0.0027) [2024-04-26 00:55:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55159.3, 300 sec: 55872.2). Total num frames: 604127232. Throughput: 0: 55622.2. Samples: 553491360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 00:55:03,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 00:55:04,718][47288] Updated weights for policy 0, policy_version 36876 (0.0030) [2024-04-26 00:55:07,406][47288] Updated weights for policy 0, policy_version 36886 (0.0033) [2024-04-26 00:55:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 604422144. Throughput: 0: 55702.6. Samples: 553827880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 00:55:08,923][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 00:55:10,521][47288] Updated weights for policy 0, policy_version 36896 (0.0029) [2024-04-26 00:55:13,380][47288] Updated weights for policy 0, policy_version 36906 (0.0029) [2024-04-26 00:55:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 604700672. Throughput: 0: 55718.4. Samples: 553994360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 00:55:13,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 00:55:16,431][47288] Updated weights for policy 0, policy_version 36916 (0.0029) [2024-04-26 00:55:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 604962816. Throughput: 0: 55723.1. Samples: 554328700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 00:55:18,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 00:55:19,322][47288] Updated weights for policy 0, policy_version 36926 (0.0028) [2024-04-26 00:55:22,553][47288] Updated weights for policy 0, policy_version 36936 (0.0035) [2024-04-26 00:55:23,927][47056] Fps is (10 sec: 50770.4, 60 sec: 55155.8, 300 sec: 55704.9). Total num frames: 605208576. Throughput: 0: 55735.6. Samples: 554663140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 00:55:23,927][47056] Avg episode reward: [(0, '0.295')] [2024-04-26 00:55:25,162][47288] Updated weights for policy 0, policy_version 36946 (0.0032) [2024-04-26 00:55:28,329][47288] Updated weights for policy 0, policy_version 36956 (0.0031) [2024-04-26 00:55:28,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 605503488. Throughput: 0: 55394.6. Samples: 554821440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 00:55:28,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 00:55:28,932][47267] Saving new best policy, reward=0.413! 
[2024-04-26 00:55:30,898][47288] Updated weights for policy 0, policy_version 36966 (0.0027) [2024-04-26 00:55:33,923][47056] Fps is (10 sec: 57367.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 605782016. Throughput: 0: 55458.7. Samples: 555159140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 00:55:33,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-26 00:55:34,134][47288] Updated weights for policy 0, policy_version 36976 (0.0031) [2024-04-26 00:55:36,740][47288] Updated weights for policy 0, policy_version 36986 (0.0037) [2024-04-26 00:55:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 606060544. Throughput: 0: 55554.6. Samples: 555495280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:55:38,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 00:55:40,163][47288] Updated weights for policy 0, policy_version 36996 (0.0030) [2024-04-26 00:55:42,699][47288] Updated weights for policy 0, policy_version 37006 (0.0031) [2024-04-26 00:55:43,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 606371840. Throughput: 0: 55980.5. Samples: 555669700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:55:43,923][47056] Avg episode reward: [(0, '0.304')] [2024-04-26 00:55:45,867][47288] Updated weights for policy 0, policy_version 37016 (0.0028) [2024-04-26 00:55:48,437][47288] Updated weights for policy 0, policy_version 37026 (0.0030) [2024-04-26 00:55:48,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 606633984. Throughput: 0: 55818.1. Samples: 556003180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:55:48,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 00:55:51,604][47288] Updated weights for policy 0, policy_version 37036 (0.0028) [2024-04-26 00:55:53,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 606896128. 
Throughput: 0: 55831.3. Samples: 556340280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 00:55:53,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 00:55:54,440][47288] Updated weights for policy 0, policy_version 37046 (0.0030) [2024-04-26 00:55:57,047][47267] Signal inference workers to stop experience collection... (8050 times) [2024-04-26 00:55:57,048][47267] Signal inference workers to resume experience collection... (8050 times) [2024-04-26 00:55:57,077][47288] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-04-26 00:55:57,077][47288] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-04-26 00:55:57,665][47288] Updated weights for policy 0, policy_version 37056 (0.0032) [2024-04-26 00:55:58,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 607191040. Throughput: 0: 55868.3. Samples: 556508440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:55:58,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 00:56:00,448][47288] Updated weights for policy 0, policy_version 37066 (0.0026) [2024-04-26 00:56:03,620][47288] Updated weights for policy 0, policy_version 37076 (0.0027) [2024-04-26 00:56:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 607453184. Throughput: 0: 55883.2. Samples: 556843440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:56:03,931][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 00:56:06,105][47288] Updated weights for policy 0, policy_version 37086 (0.0032) [2024-04-26 00:56:08,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 607748096. Throughput: 0: 55971.7. Samples: 557181640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:56:08,923][47056] Avg episode reward: [(0, '0.234')] [2024-04-26 00:56:09,429][47288] Updated weights for policy 0, policy_version 37096 (0.0031) [2024-04-26 00:56:11,827][47288] Updated weights for policy 0, policy_version 37106 (0.0031) [2024-04-26 00:56:13,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 608026624. Throughput: 0: 56042.7. Samples: 557343360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:56:13,923][47056] Avg episode reward: [(0, '0.269')] [2024-04-26 00:56:15,372][47288] Updated weights for policy 0, policy_version 37116 (0.0029) [2024-04-26 00:56:17,687][47288] Updated weights for policy 0, policy_version 37126 (0.0030) [2024-04-26 00:56:18,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 608337920. Throughput: 0: 56035.0. Samples: 557680720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 00:56:18,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 00:56:21,218][47288] Updated weights for policy 0, policy_version 37136 (0.0027) [2024-04-26 00:56:23,845][47288] Updated weights for policy 0, policy_version 37146 (0.0027) [2024-04-26 00:56:23,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56528.5, 300 sec: 55816.7). Total num frames: 608600064. Throughput: 0: 56010.7. Samples: 558015760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 00:56:23,924][47056] Avg episode reward: [(0, '0.261')] [2024-04-26 00:56:27,117][47288] Updated weights for policy 0, policy_version 37156 (0.0030) [2024-04-26 00:56:28,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 608878592. Throughput: 0: 56070.6. Samples: 558192880. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 00:56:28,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 00:56:29,668][47288] Updated weights for policy 0, policy_version 37166 (0.0028) [2024-04-26 00:56:32,885][47288] Updated weights for policy 0, policy_version 37176 (0.0027) [2024-04-26 00:56:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 609157120. Throughput: 0: 56031.4. Samples: 558524580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 00:56:33,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 00:56:35,327][47288] Updated weights for policy 0, policy_version 37186 (0.0036) [2024-04-26 00:56:38,701][47288] Updated weights for policy 0, policy_version 37196 (0.0036) [2024-04-26 00:56:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 609435648. Throughput: 0: 56091.5. Samples: 558864400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 00:56:38,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 00:56:41,308][47288] Updated weights for policy 0, policy_version 37206 (0.0030) [2024-04-26 00:56:43,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 609681408. Throughput: 0: 55917.4. Samples: 559024720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 00:56:43,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 00:56:44,540][47288] Updated weights for policy 0, policy_version 37216 (0.0032) [2024-04-26 00:56:47,145][47288] Updated weights for policy 0, policy_version 37226 (0.0028) [2024-04-26 00:56:48,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 609976320. Throughput: 0: 55825.2. Samples: 559355580. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:56:48,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 00:56:49,040][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000037231_609992704.pth... [2024-04-26 00:56:49,091][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000036414_596606976.pth [2024-04-26 00:56:50,435][47288] Updated weights for policy 0, policy_version 37236 (0.0036) [2024-04-26 00:56:52,854][47288] Updated weights for policy 0, policy_version 37246 (0.0034) [2024-04-26 00:56:53,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 610271232. Throughput: 0: 55795.1. Samples: 559692420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:56:53,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 00:56:56,248][47288] Updated weights for policy 0, policy_version 37256 (0.0024) [2024-04-26 00:56:58,644][47288] Updated weights for policy 0, policy_version 37266 (0.0029) [2024-04-26 00:56:58,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 610566144. Throughput: 0: 55889.4. Samples: 559858380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:56:58,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 00:57:02,233][47288] Updated weights for policy 0, policy_version 37276 (0.0033) [2024-04-26 00:57:03,186][47267] Signal inference workers to stop experience collection... (8100 times) [2024-04-26 00:57:03,218][47288] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-04-26 00:57:03,244][47267] Signal inference workers to resume experience collection... (8100 times) [2024-04-26 00:57:03,249][47288] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-04-26 00:57:03,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 610811904. Throughput: 0: 55908.9. Samples: 560196620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:57:03,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 00:57:04,565][47288] Updated weights for policy 0, policy_version 37286 (0.0029) [2024-04-26 00:57:08,110][47288] Updated weights for policy 0, policy_version 37296 (0.0025) [2024-04-26 00:57:08,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 611106816. Throughput: 0: 55875.7. Samples: 560530160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:57:08,923][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 00:57:10,387][47288] Updated weights for policy 0, policy_version 37306 (0.0031) [2024-04-26 00:57:13,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 611368960. Throughput: 0: 55491.6. Samples: 560690000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:57:13,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 00:57:14,021][47288] Updated weights for policy 0, policy_version 37316 (0.0028) [2024-04-26 00:57:16,216][47288] Updated weights for policy 0, policy_version 37326 (0.0028) [2024-04-26 00:57:18,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 611647488. Throughput: 0: 55810.5. Samples: 561036060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:57:18,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 00:57:19,711][47288] Updated weights for policy 0, policy_version 37336 (0.0027) [2024-04-26 00:57:22,090][47288] Updated weights for policy 0, policy_version 37346 (0.0027) [2024-04-26 00:57:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 611926016. Throughput: 0: 55646.7. Samples: 561368500. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:57:23,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 00:57:25,715][47288] Updated weights for policy 0, policy_version 37356 (0.0038) [2024-04-26 00:57:27,906][47288] Updated weights for policy 0, policy_version 37366 (0.0026) [2024-04-26 00:57:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 612220928. Throughput: 0: 55798.1. Samples: 561535640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:57:28,924][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 00:57:31,558][47288] Updated weights for policy 0, policy_version 37376 (0.0031) [2024-04-26 00:57:33,699][47288] Updated weights for policy 0, policy_version 37386 (0.0027) [2024-04-26 00:57:33,923][47056] Fps is (10 sec: 60620.7, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 612532224. Throughput: 0: 55850.3. Samples: 561868840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 00:57:33,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 00:57:37,546][47288] Updated weights for policy 0, policy_version 37396 (0.0035) [2024-04-26 00:57:38,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 612794368. Throughput: 0: 55660.8. Samples: 562197160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:57:38,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 00:57:39,552][47288] Updated weights for policy 0, policy_version 37406 (0.0036) [2024-04-26 00:57:43,304][47288] Updated weights for policy 0, policy_version 37416 (0.0029) [2024-04-26 00:57:43,923][47056] Fps is (10 sec: 50790.7, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 613040128. Throughput: 0: 55853.5. Samples: 562371780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:57:43,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-26 00:57:45,413][47288] Updated weights for policy 0, policy_version 37426 (0.0029) [2024-04-26 00:57:48,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 613318656. Throughput: 0: 55812.4. Samples: 562708180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:57:48,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 00:57:49,253][47288] Updated weights for policy 0, policy_version 37436 (0.0030) [2024-04-26 00:57:50,174][47267] Signal inference workers to stop experience collection... (8150 times) [2024-04-26 00:57:50,175][47267] Signal inference workers to resume experience collection... (8150 times) [2024-04-26 00:57:50,186][47288] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-04-26 00:57:50,205][47288] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-04-26 00:57:51,547][47288] Updated weights for policy 0, policy_version 37446 (0.0030) [2024-04-26 00:57:53,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 613597184. Throughput: 0: 55974.2. Samples: 563049000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 00:57:53,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 00:57:55,158][47288] Updated weights for policy 0, policy_version 37456 (0.0027) [2024-04-26 00:57:57,697][47288] Updated weights for policy 0, policy_version 37466 (0.0038) [2024-04-26 00:57:58,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 613892096. Throughput: 0: 55793.8. Samples: 563200720. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-26 00:57:58,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 00:58:00,970][47288] Updated weights for policy 0, policy_version 37476 (0.0032) [2024-04-26 00:58:03,468][47288] Updated weights for policy 0, policy_version 37486 (0.0025) [2024-04-26 00:58:03,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 614170624. Throughput: 0: 55610.8. Samples: 563538540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-26 00:58:03,923][47056] Avg episode reward: [(0, '0.251')] [2024-04-26 00:58:06,737][47288] Updated weights for policy 0, policy_version 37496 (0.0027) [2024-04-26 00:58:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 614465536. Throughput: 0: 55786.1. Samples: 563878880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-26 00:58:08,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 00:58:09,346][47288] Updated weights for policy 0, policy_version 37506 (0.0024) [2024-04-26 00:58:12,497][47288] Updated weights for policy 0, policy_version 37516 (0.0027) [2024-04-26 00:58:13,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 614760448. Throughput: 0: 55994.1. Samples: 564055360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-26 00:58:13,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 00:58:15,083][47288] Updated weights for policy 0, policy_version 37526 (0.0029) [2024-04-26 00:58:18,518][47288] Updated weights for policy 0, policy_version 37536 (0.0029) [2024-04-26 00:58:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 615022592. Throughput: 0: 55951.9. Samples: 564386680. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-26 00:58:18,923][47056] Avg episode reward: [(0, '0.250')] [2024-04-26 00:58:21,110][47288] Updated weights for policy 0, policy_version 37546 (0.0029) [2024-04-26 00:58:23,923][47056] Fps is (10 sec: 49151.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 615251968. Throughput: 0: 56118.3. Samples: 564722480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:58:23,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 00:58:24,281][47288] Updated weights for policy 0, policy_version 37556 (0.0026) [2024-04-26 00:58:27,001][47288] Updated weights for policy 0, policy_version 37566 (0.0032) [2024-04-26 00:58:28,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 615563264. Throughput: 0: 55798.7. Samples: 564882740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:58:28,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 00:58:30,162][47288] Updated weights for policy 0, policy_version 37576 (0.0031) [2024-04-26 00:58:32,709][47288] Updated weights for policy 0, policy_version 37586 (0.0028) [2024-04-26 00:58:33,923][47056] Fps is (10 sec: 60620.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 615858176. Throughput: 0: 55800.5. Samples: 565219200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:58:33,923][47056] Avg episode reward: [(0, '0.310')] [2024-04-26 00:58:36,098][47288] Updated weights for policy 0, policy_version 37596 (0.0032) [2024-04-26 00:58:38,459][47288] Updated weights for policy 0, policy_version 37606 (0.0030) [2024-04-26 00:58:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 616136704. Throughput: 0: 55574.3. Samples: 565549860. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:58:38,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 00:58:42,004][47288] Updated weights for policy 0, policy_version 37616 (0.0027) [2024-04-26 00:58:43,717][47267] Signal inference workers to stop experience collection... (8200 times) [2024-04-26 00:58:43,717][47267] Signal inference workers to resume experience collection... (8200 times) [2024-04-26 00:58:43,749][47288] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-04-26 00:58:43,749][47288] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-04-26 00:58:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 616431616. Throughput: 0: 56262.7. Samples: 565732540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 00:58:43,923][47056] Avg episode reward: [(0, '0.226')] [2024-04-26 00:58:44,712][47288] Updated weights for policy 0, policy_version 37626 (0.0024) [2024-04-26 00:58:47,746][47288] Updated weights for policy 0, policy_version 37636 (0.0028) [2024-04-26 00:58:48,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56797.9, 300 sec: 55983.3). Total num frames: 616726528. Throughput: 0: 56163.4. Samples: 566065900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-26 00:58:48,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 00:58:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000037642_616726528.pth... [2024-04-26 00:58:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000036821_603275264.pth [2024-04-26 00:58:50,651][47288] Updated weights for policy 0, policy_version 37646 (0.0034) [2024-04-26 00:58:53,545][47288] Updated weights for policy 0, policy_version 37656 (0.0029) [2024-04-26 00:58:53,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56524.5, 300 sec: 55872.2). Total num frames: 616988672. Throughput: 0: 55984.7. Samples: 566398200. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-26 00:58:53,924][47056] Avg episode reward: [(0, '0.280')] [2024-04-26 00:58:56,589][47288] Updated weights for policy 0, policy_version 37666 (0.0027) [2024-04-26 00:58:58,923][47056] Fps is (10 sec: 50790.6, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 617234432. Throughput: 0: 55851.4. Samples: 566568680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-26 00:58:58,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 00:58:59,356][47288] Updated weights for policy 0, policy_version 37676 (0.0034) [2024-04-26 00:59:02,356][47288] Updated weights for policy 0, policy_version 37686 (0.0026) [2024-04-26 00:59:03,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 617512960. Throughput: 0: 55784.9. Samples: 566897000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-26 00:59:03,924][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 00:59:05,170][47288] Updated weights for policy 0, policy_version 37696 (0.0033) [2024-04-26 00:59:08,088][47288] Updated weights for policy 0, policy_version 37706 (0.0030) [2024-04-26 00:59:08,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 617775104. Throughput: 0: 55785.4. Samples: 567232820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-26 00:59:08,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 00:59:10,970][47288] Updated weights for policy 0, policy_version 37716 (0.0026) [2024-04-26 00:59:13,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 618086400. Throughput: 0: 55926.6. Samples: 567399420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 00:59:13,923][47056] Avg episode reward: [(0, '0.244')] [2024-04-26 00:59:14,079][47288] Updated weights for policy 0, policy_version 37726 (0.0034) [2024-04-26 00:59:16,985][47288] Updated weights for policy 0, policy_version 37736 (0.0028) [2024-04-26 00:59:18,923][47056] Fps is (10 sec: 60620.4, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 618381312. Throughput: 0: 55880.9. Samples: 567733840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 00:59:18,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 00:59:19,996][47288] Updated weights for policy 0, policy_version 37746 (0.0031) [2024-04-26 00:59:22,789][47288] Updated weights for policy 0, policy_version 37756 (0.0029) [2024-04-26 00:59:23,923][47056] Fps is (10 sec: 58981.7, 60 sec: 57070.8, 300 sec: 55927.8). Total num frames: 618676224. Throughput: 0: 55906.8. Samples: 568065660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 00:59:23,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 00:59:25,777][47288] Updated weights for policy 0, policy_version 37766 (0.0026) [2024-04-26 00:59:28,529][47288] Updated weights for policy 0, policy_version 37776 (0.0030) [2024-04-26 00:59:28,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 618938368. Throughput: 0: 55781.3. Samples: 568242700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 00:59:28,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 00:59:31,642][47288] Updated weights for policy 0, policy_version 37786 (0.0029) [2024-04-26 00:59:33,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 619200512. Throughput: 0: 55868.5. Samples: 568579980. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 00:59:33,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 00:59:34,256][47288] Updated weights for policy 0, policy_version 37796 (0.0029) [2024-04-26 00:59:37,597][47288] Updated weights for policy 0, policy_version 37806 (0.0030) [2024-04-26 00:59:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.9, 300 sec: 55761.2). Total num frames: 619479040. Throughput: 0: 56049.2. Samples: 568920400. Policy #0 lag: (min: 2.0, avg: 12.8, max: 23.0) [2024-04-26 00:59:38,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 00:59:40,221][47288] Updated weights for policy 0, policy_version 37816 (0.0029) [2024-04-26 00:59:43,775][47288] Updated weights for policy 0, policy_version 37826 (0.0030) [2024-04-26 00:59:43,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 619741184. Throughput: 0: 55585.4. Samples: 569070020. Policy #0 lag: (min: 2.0, avg: 12.8, max: 23.0) [2024-04-26 00:59:43,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-26 00:59:44,978][47267] Signal inference workers to stop experience collection... (8250 times) [2024-04-26 00:59:44,978][47267] Signal inference workers to resume experience collection... (8250 times) [2024-04-26 00:59:45,000][47288] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-04-26 00:59:45,000][47288] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-04-26 00:59:45,983][47288] Updated weights for policy 0, policy_version 37836 (0.0027) [2024-04-26 00:59:48,923][47056] Fps is (10 sec: 55704.2, 60 sec: 55159.3, 300 sec: 55816.6). Total num frames: 620036096. Throughput: 0: 55770.1. Samples: 569406660. 
Policy #0 lag: (min: 2.0, avg: 12.8, max: 23.0) [2024-04-26 00:59:48,924][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 00:59:49,664][47288] Updated weights for policy 0, policy_version 37846 (0.0027) [2024-04-26 00:59:51,882][47288] Updated weights for policy 0, policy_version 37856 (0.0030) [2024-04-26 00:59:53,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 620331008. Throughput: 0: 55819.9. Samples: 569744720. Policy #0 lag: (min: 2.0, avg: 12.8, max: 23.0) [2024-04-26 00:59:53,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 00:59:55,588][47288] Updated weights for policy 0, policy_version 37866 (0.0033) [2024-04-26 00:59:57,840][47288] Updated weights for policy 0, policy_version 37876 (0.0030) [2024-04-26 00:59:58,927][47056] Fps is (10 sec: 60597.8, 60 sec: 56794.1, 300 sec: 55982.5). Total num frames: 620642304. Throughput: 0: 56151.9. Samples: 569926480. Policy #0 lag: (min: 2.0, avg: 12.8, max: 23.0) [2024-04-26 00:59:58,927][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 01:00:01,325][47288] Updated weights for policy 0, policy_version 37886 (0.0027) [2024-04-26 01:00:03,734][47288] Updated weights for policy 0, policy_version 37896 (0.0033) [2024-04-26 01:00:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 620904448. Throughput: 0: 56085.7. Samples: 570257700. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 01:00:03,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:00:07,317][47288] Updated weights for policy 0, policy_version 37906 (0.0038) [2024-04-26 01:00:08,923][47056] Fps is (10 sec: 52450.1, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 621166592. Throughput: 0: 56177.9. Samples: 570593660. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 01:00:08,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 01:00:09,450][47288] Updated weights for policy 0, policy_version 37916 (0.0032) [2024-04-26 01:00:13,317][47288] Updated weights for policy 0, policy_version 37926 (0.0026) [2024-04-26 01:00:13,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 621445120. Throughput: 0: 55890.7. Samples: 570757780. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 01:00:13,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 01:00:15,198][47288] Updated weights for policy 0, policy_version 37936 (0.0033) [2024-04-26 01:00:18,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55873.0). Total num frames: 621690880. Throughput: 0: 55900.1. Samples: 571095480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 01:00:18,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-26 01:00:19,002][47288] Updated weights for policy 0, policy_version 37946 (0.0035) [2024-04-26 01:00:21,076][47288] Updated weights for policy 0, policy_version 37956 (0.0026) [2024-04-26 01:00:23,923][47056] Fps is (10 sec: 52428.2, 60 sec: 54886.4, 300 sec: 55816.7). Total num frames: 621969408. Throughput: 0: 55805.2. Samples: 571431640. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 01:00:23,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-26 01:00:24,856][47288] Updated weights for policy 0, policy_version 37966 (0.0030) [2024-04-26 01:00:26,994][47288] Updated weights for policy 0, policy_version 37976 (0.0030) [2024-04-26 01:00:28,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 622280704. Throughput: 0: 56071.2. Samples: 571593220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 01:00:28,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 01:00:30,880][47288] Updated weights for policy 0, policy_version 37986 (0.0035) [2024-04-26 01:00:32,754][47288] Updated weights for policy 0, policy_version 37996 (0.0032) [2024-04-26 01:00:33,923][47056] Fps is (10 sec: 62260.0, 60 sec: 56524.8, 300 sec: 56038.9). Total num frames: 622592000. Throughput: 0: 56031.0. Samples: 571928040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 01:00:33,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-26 01:00:36,715][47288] Updated weights for policy 0, policy_version 38006 (0.0031) [2024-04-26 01:00:38,788][47288] Updated weights for policy 0, policy_version 38016 (0.0040) [2024-04-26 01:00:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 622854144. Throughput: 0: 55940.4. Samples: 572262040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 01:00:38,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-26 01:00:42,434][47288] Updated weights for policy 0, policy_version 38026 (0.0027) [2024-04-26 01:00:43,923][47056] Fps is (10 sec: 50790.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 623099904. Throughput: 0: 55863.3. Samples: 572440100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 01:00:43,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 01:00:43,966][47267] Signal inference workers to stop experience collection... (8300 times) [2024-04-26 01:00:43,966][47267] Signal inference workers to resume experience collection... 
(8300 times) [2024-04-26 01:00:43,991][47288] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-04-26 01:00:43,992][47288] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-04-26 01:00:44,652][47288] Updated weights for policy 0, policy_version 38036 (0.0036) [2024-04-26 01:00:48,132][47288] Updated weights for policy 0, policy_version 38046 (0.0028) [2024-04-26 01:00:48,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.9, 300 sec: 55927.7). Total num frames: 623394816. Throughput: 0: 56037.8. Samples: 572779400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:00:48,923][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 01:00:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000038049_623394816.pth... [2024-04-26 01:00:48,987][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000037231_609992704.pth [2024-04-26 01:00:50,504][47288] Updated weights for policy 0, policy_version 38056 (0.0027) [2024-04-26 01:00:53,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 623640576. Throughput: 0: 55972.7. Samples: 573112440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:00:53,923][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 01:00:54,082][47288] Updated weights for policy 0, policy_version 38066 (0.0032) [2024-04-26 01:00:56,197][47288] Updated weights for policy 0, policy_version 38076 (0.0035) [2024-04-26 01:00:58,923][47056] Fps is (10 sec: 54067.3, 60 sec: 54890.1, 300 sec: 55872.2). Total num frames: 623935488. Throughput: 0: 55699.6. Samples: 573264260. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:00:58,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 01:00:59,846][47288] Updated weights for policy 0, policy_version 38086 (0.0029) [2024-04-26 01:01:02,015][47288] Updated weights for policy 0, policy_version 38096 (0.0030) [2024-04-26 01:01:03,923][47056] Fps is (10 sec: 60621.6, 60 sec: 55705.7, 300 sec: 55927.7). Total num frames: 624246784. Throughput: 0: 55645.8. Samples: 573599540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:01:03,923][47056] Avg episode reward: [(0, '0.256')] [2024-04-26 01:01:05,662][47288] Updated weights for policy 0, policy_version 38106 (0.0026) [2024-04-26 01:01:07,956][47288] Updated weights for policy 0, policy_version 38116 (0.0027) [2024-04-26 01:01:08,923][47056] Fps is (10 sec: 60620.6, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 624541696. Throughput: 0: 55696.5. Samples: 573937980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:01:08,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 01:01:11,479][47288] Updated weights for policy 0, policy_version 38126 (0.0032) [2024-04-26 01:01:13,787][47288] Updated weights for policy 0, policy_version 38136 (0.0029) [2024-04-26 01:01:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 624820224. Throughput: 0: 56147.6. Samples: 574119860. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 01:01:13,923][47056] Avg episode reward: [(0, '0.200')] [2024-04-26 01:01:17,423][47288] Updated weights for policy 0, policy_version 38146 (0.0024) [2024-04-26 01:01:18,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 625082368. Throughput: 0: 56093.0. Samples: 574452220. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 01:01:18,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:01:19,605][47288] Updated weights for policy 0, policy_version 38156 (0.0028) [2024-04-26 01:01:23,317][47288] Updated weights for policy 0, policy_version 38166 (0.0028) [2024-04-26 01:01:23,923][47056] Fps is (10 sec: 50789.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 625328128. Throughput: 0: 56146.2. Samples: 574788620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 01:01:23,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 01:01:25,564][47288] Updated weights for policy 0, policy_version 38176 (0.0026) [2024-04-26 01:01:28,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 625623040. Throughput: 0: 55596.3. Samples: 574941940. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 01:01:28,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-26 01:01:28,977][47288] Updated weights for policy 0, policy_version 38186 (0.0030) [2024-04-26 01:01:31,365][47288] Updated weights for policy 0, policy_version 38196 (0.0029) [2024-04-26 01:01:33,923][47056] Fps is (10 sec: 55706.3, 60 sec: 54886.5, 300 sec: 55761.2). Total num frames: 625885184. Throughput: 0: 55762.3. Samples: 575288700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 01:01:33,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 01:01:34,797][47288] Updated weights for policy 0, policy_version 38206 (0.0029) [2024-04-26 01:01:37,227][47288] Updated weights for policy 0, policy_version 38216 (0.0029) [2024-04-26 01:01:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 626180096. Throughput: 0: 55787.7. Samples: 575622880. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 01:01:38,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 01:01:40,894][47288] Updated weights for policy 0, policy_version 38226 (0.0034) [2024-04-26 01:01:41,815][47267] Signal inference workers to stop experience collection... (8350 times) [2024-04-26 01:01:41,816][47267] Signal inference workers to resume experience collection... (8350 times) [2024-04-26 01:01:41,857][47288] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-04-26 01:01:41,857][47288] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-04-26 01:01:43,073][47288] Updated weights for policy 0, policy_version 38236 (0.0026) [2024-04-26 01:01:43,923][47056] Fps is (10 sec: 62258.7, 60 sec: 56797.8, 300 sec: 56038.9). Total num frames: 626507776. Throughput: 0: 56280.0. Samples: 575796860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 01:01:43,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 01:01:46,757][47288] Updated weights for policy 0, policy_version 38246 (0.0028) [2024-04-26 01:01:48,891][47288] Updated weights for policy 0, policy_version 38256 (0.0032) [2024-04-26 01:01:48,923][47056] Fps is (10 sec: 60619.7, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 626786304. Throughput: 0: 56248.2. Samples: 576130720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 01:01:48,923][47056] Avg episode reward: [(0, '0.279')] [2024-04-26 01:01:52,798][47288] Updated weights for policy 0, policy_version 38266 (0.0037) [2024-04-26 01:01:53,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56797.9, 300 sec: 55872.2). Total num frames: 627048448. Throughput: 0: 56008.9. Samples: 576458380. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 01:01:53,923][47056] Avg episode reward: [(0, '0.308')] [2024-04-26 01:01:54,854][47288] Updated weights for policy 0, policy_version 38276 (0.0028) [2024-04-26 01:01:58,582][47288] Updated weights for policy 0, policy_version 38286 (0.0031) [2024-04-26 01:01:58,923][47056] Fps is (10 sec: 49153.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 627277824. Throughput: 0: 55753.3. Samples: 576628760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 01:01:58,923][47056] Avg episode reward: [(0, '0.332')] [2024-04-26 01:02:00,593][47288] Updated weights for policy 0, policy_version 38296 (0.0031) [2024-04-26 01:02:03,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 627572736. Throughput: 0: 55804.3. Samples: 576963420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:02:03,923][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 01:02:04,555][47288] Updated weights for policy 0, policy_version 38306 (0.0027) [2024-04-26 01:02:06,601][47288] Updated weights for policy 0, policy_version 38316 (0.0027) [2024-04-26 01:02:08,923][47056] Fps is (10 sec: 57342.7, 60 sec: 55159.4, 300 sec: 55872.2). Total num frames: 627851264. Throughput: 0: 55689.6. Samples: 577294660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:02:08,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-26 01:02:10,498][47288] Updated weights for policy 0, policy_version 38326 (0.0033) [2024-04-26 01:02:12,369][47288] Updated weights for policy 0, policy_version 38336 (0.0025) [2024-04-26 01:02:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.4, 300 sec: 55927.8). Total num frames: 628146176. Throughput: 0: 55916.0. Samples: 577458160. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:02:13,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 01:02:16,250][47288] Updated weights for policy 0, policy_version 38346 (0.0029) [2024-04-26 01:02:18,264][47288] Updated weights for policy 0, policy_version 38356 (0.0029) [2024-04-26 01:02:18,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 628441088. Throughput: 0: 55721.2. Samples: 577796160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:02:18,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 01:02:22,009][47288] Updated weights for policy 0, policy_version 38366 (0.0027) [2024-04-26 01:02:23,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56797.8, 300 sec: 55983.3). Total num frames: 628736000. Throughput: 0: 55706.1. Samples: 578129660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:02:23,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 01:02:24,078][47288] Updated weights for policy 0, policy_version 38376 (0.0026) [2024-04-26 01:02:27,912][47288] Updated weights for policy 0, policy_version 38386 (0.0032) [2024-04-26 01:02:28,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 628981760. Throughput: 0: 55602.1. Samples: 578298960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:02:28,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 01:02:29,967][47288] Updated weights for policy 0, policy_version 38396 (0.0029) [2024-04-26 01:02:33,701][47288] Updated weights for policy 0, policy_version 38406 (0.0034) [2024-04-26 01:02:33,923][47056] Fps is (10 sec: 50790.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 629243904. Throughput: 0: 55633.5. Samples: 578634220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:02:33,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 01:02:35,670][47288] Updated weights for policy 0, policy_version 38416 (0.0028) [2024-04-26 01:02:38,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 629522432. Throughput: 0: 55805.0. Samples: 578969600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:02:38,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 01:02:39,792][47288] Updated weights for policy 0, policy_version 38426 (0.0032) [2024-04-26 01:02:40,931][47267] Signal inference workers to stop experience collection... (8400 times) [2024-04-26 01:02:40,932][47267] Signal inference workers to resume experience collection... (8400 times) [2024-04-26 01:02:40,943][47288] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-04-26 01:02:40,954][47288] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-04-26 01:02:41,706][47288] Updated weights for policy 0, policy_version 38436 (0.0027) [2024-04-26 01:02:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55159.4, 300 sec: 55927.8). Total num frames: 629817344. Throughput: 0: 55518.6. Samples: 579127100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:02:43,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 01:02:45,546][47288] Updated weights for policy 0, policy_version 38446 (0.0024) [2024-04-26 01:02:47,520][47288] Updated weights for policy 0, policy_version 38456 (0.0033) [2024-04-26 01:02:48,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55159.7, 300 sec: 55927.8). Total num frames: 630095872. Throughput: 0: 55516.7. Samples: 579461660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:02:48,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 01:02:48,963][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000038459_630112256.pth... 
[2024-04-26 01:02:49,011][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000037642_616726528.pth [2024-04-26 01:02:51,505][47288] Updated weights for policy 0, policy_version 38466 (0.0026) [2024-04-26 01:02:53,550][47288] Updated weights for policy 0, policy_version 38476 (0.0035) [2024-04-26 01:02:53,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 630407168. Throughput: 0: 55615.2. Samples: 579797340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 01:02:53,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 01:02:57,319][47288] Updated weights for policy 0, policy_version 38486 (0.0027) [2024-04-26 01:02:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 630669312. Throughput: 0: 55894.4. Samples: 579973400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 01:02:58,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 01:02:59,825][47288] Updated weights for policy 0, policy_version 38496 (0.0030) [2024-04-26 01:03:03,017][47288] Updated weights for policy 0, policy_version 38506 (0.0032) [2024-04-26 01:03:03,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 630931456. Throughput: 0: 55827.7. Samples: 580308400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 01:03:03,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 01:03:05,540][47288] Updated weights for policy 0, policy_version 38516 (0.0033) [2024-04-26 01:03:08,735][47288] Updated weights for policy 0, policy_version 38526 (0.0028) [2024-04-26 01:03:08,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 631209984. Throughput: 0: 55862.7. Samples: 580643480. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 01:03:08,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 01:03:11,217][47288] Updated weights for policy 0, policy_version 38536 (0.0029) [2024-04-26 01:03:13,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 631488512. Throughput: 0: 55659.1. Samples: 580803620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 01:03:13,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 01:03:14,659][47288] Updated weights for policy 0, policy_version 38546 (0.0028) [2024-04-26 01:03:17,201][47288] Updated weights for policy 0, policy_version 38556 (0.0033) [2024-04-26 01:03:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55983.3). Total num frames: 631767040. Throughput: 0: 55715.9. Samples: 581141440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 01:03:18,923][47056] Avg episode reward: [(0, '0.225')] [2024-04-26 01:03:20,587][47288] Updated weights for policy 0, policy_version 38566 (0.0029) [2024-04-26 01:03:23,194][47288] Updated weights for policy 0, policy_version 38576 (0.0031) [2024-04-26 01:03:23,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55432.7, 300 sec: 55927.8). Total num frames: 632061952. Throughput: 0: 55745.0. Samples: 581478120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 01:03:23,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 01:03:26,453][47288] Updated weights for policy 0, policy_version 38586 (0.0027) [2024-04-26 01:03:28,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 632356864. Throughput: 0: 56192.4. Samples: 581655760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 01:03:28,923][47056] Avg episode reward: [(0, '0.274')] [2024-04-26 01:03:28,925][47288] Updated weights for policy 0, policy_version 38596 (0.0040) [2024-04-26 01:03:32,196][47288] Updated weights for policy 0, policy_version 38606 (0.0027) [2024-04-26 01:03:33,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 55872.3). Total num frames: 632619008. Throughput: 0: 56026.0. Samples: 581982840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 01:03:33,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 01:03:34,759][47288] Updated weights for policy 0, policy_version 38616 (0.0029) [2024-04-26 01:03:38,011][47288] Updated weights for policy 0, policy_version 38626 (0.0026) [2024-04-26 01:03:38,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 632897536. Throughput: 0: 56044.4. Samples: 582319340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 01:03:38,923][47056] Avg episode reward: [(0, '0.254')] [2024-04-26 01:03:40,782][47288] Updated weights for policy 0, policy_version 38636 (0.0028) [2024-04-26 01:03:43,917][47288] Updated weights for policy 0, policy_version 38646 (0.0027) [2024-04-26 01:03:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 633176064. Throughput: 0: 55716.0. Samples: 582480620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:03:43,923][47056] Avg episode reward: [(0, '0.361')] [2024-04-26 01:03:46,549][47288] Updated weights for policy 0, policy_version 38656 (0.0028) [2024-04-26 01:03:48,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 633454592. Throughput: 0: 55896.4. Samples: 582823740. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:03:48,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 01:03:49,831][47288] Updated weights for policy 0, policy_version 38666 (0.0028) [2024-04-26 01:03:52,292][47288] Updated weights for policy 0, policy_version 38676 (0.0030) [2024-04-26 01:03:53,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 633716736. Throughput: 0: 55866.2. Samples: 583157460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:03:53,923][47056] Avg episode reward: [(0, '0.235')] [2024-04-26 01:03:55,505][47288] Updated weights for policy 0, policy_version 38686 (0.0035) [2024-04-26 01:03:58,156][47288] Updated weights for policy 0, policy_version 38696 (0.0027) [2024-04-26 01:03:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 634011648. Throughput: 0: 55889.3. Samples: 583318640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:03:58,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 01:04:01,376][47288] Updated weights for policy 0, policy_version 38706 (0.0029) [2024-04-26 01:04:03,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 634290176. Throughput: 0: 55805.0. Samples: 583652660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:04:03,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-26 01:04:04,112][47288] Updated weights for policy 0, policy_version 38716 (0.0025) [2024-04-26 01:04:07,326][47288] Updated weights for policy 0, policy_version 38726 (0.0033) [2024-04-26 01:04:08,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 634568704. Throughput: 0: 55789.4. Samples: 583988660. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:04:08,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 01:04:09,877][47288] Updated weights for policy 0, policy_version 38736 (0.0025) [2024-04-26 01:04:13,331][47288] Updated weights for policy 0, policy_version 38746 (0.0027) [2024-04-26 01:04:13,923][47056] Fps is (10 sec: 54065.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 634830848. Throughput: 0: 55573.6. Samples: 584156580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:04:13,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 01:04:14,624][47267] Signal inference workers to stop experience collection... (8450 times) [2024-04-26 01:04:14,647][47288] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-04-26 01:04:14,684][47267] Signal inference workers to resume experience collection... (8450 times) [2024-04-26 01:04:14,685][47288] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-04-26 01:04:15,695][47288] Updated weights for policy 0, policy_version 38756 (0.0028) [2024-04-26 01:04:18,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 635109376. Throughput: 0: 55773.2. Samples: 584492640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:04:18,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 01:04:19,173][47288] Updated weights for policy 0, policy_version 38766 (0.0031) [2024-04-26 01:04:21,740][47288] Updated weights for policy 0, policy_version 38776 (0.0031) [2024-04-26 01:04:23,923][47056] Fps is (10 sec: 54068.8, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 635371520. Throughput: 0: 55724.3. Samples: 584826920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:04:23,923][47056] Avg episode reward: [(0, '0.208')] [2024-04-26 01:04:24,996][47288] Updated weights for policy 0, policy_version 38786 (0.0034) [2024-04-26 01:04:27,615][47288] Updated weights for policy 0, policy_version 38796 (0.0035) [2024-04-26 01:04:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 635666432. Throughput: 0: 55785.1. Samples: 584990960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:04:28,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 01:04:30,770][47288] Updated weights for policy 0, policy_version 38806 (0.0031) [2024-04-26 01:04:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 635944960. Throughput: 0: 55601.4. Samples: 585325800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:04:33,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 01:04:33,944][47288] Updated weights for policy 0, policy_version 38816 (0.0033) [2024-04-26 01:04:36,482][47288] Updated weights for policy 0, policy_version 38826 (0.0031) [2024-04-26 01:04:38,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 636256256. Throughput: 0: 55669.8. Samples: 585662600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:04:38,932][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 01:04:39,649][47288] Updated weights for policy 0, policy_version 38836 (0.0030) [2024-04-26 01:04:42,304][47288] Updated weights for policy 0, policy_version 38846 (0.0035) [2024-04-26 01:04:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 636502016. Throughput: 0: 55959.7. Samples: 585836820. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:04:43,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 01:04:45,636][47288] Updated weights for policy 0, policy_version 38856 (0.0028) [2024-04-26 01:04:48,403][47288] Updated weights for policy 0, policy_version 38866 (0.0031) [2024-04-26 01:04:48,923][47056] Fps is (10 sec: 54065.7, 60 sec: 55705.3, 300 sec: 55816.6). Total num frames: 636796928. Throughput: 0: 55778.3. Samples: 586162700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:04:48,924][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 01:04:48,942][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000038867_636796928.pth... [2024-04-26 01:04:48,994][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000038049_623394816.pth [2024-04-26 01:04:51,597][47288] Updated weights for policy 0, policy_version 38876 (0.0035) [2024-04-26 01:04:53,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55650.8). Total num frames: 637059072. Throughput: 0: 55691.4. Samples: 586494760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:04:53,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 01:04:54,299][47288] Updated weights for policy 0, policy_version 38886 (0.0026) [2024-04-26 01:04:57,328][47288] Updated weights for policy 0, policy_version 38896 (0.0037) [2024-04-26 01:04:58,923][47056] Fps is (10 sec: 52429.9, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 637321216. Throughput: 0: 55453.9. Samples: 586652000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-26 01:04:58,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 01:05:00,233][47288] Updated weights for policy 0, policy_version 38906 (0.0031) [2024-04-26 01:05:03,283][47288] Updated weights for policy 0, policy_version 38916 (0.0028) [2024-04-26 01:05:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 637616128. 
Throughput: 0: 55400.5. Samples: 586985660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-26 01:05:03,923][47056] Avg episode reward: [(0, '0.325')] [2024-04-26 01:05:06,125][47288] Updated weights for policy 0, policy_version 38926 (0.0030) [2024-04-26 01:05:08,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 637911040. Throughput: 0: 55482.5. Samples: 587323640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-26 01:05:08,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 01:05:09,179][47288] Updated weights for policy 0, policy_version 38936 (0.0034) [2024-04-26 01:05:11,918][47288] Updated weights for policy 0, policy_version 38946 (0.0027) [2024-04-26 01:05:13,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.9, 300 sec: 55927.8). Total num frames: 638189568. Throughput: 0: 55685.1. Samples: 587496780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-26 01:05:13,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 01:05:15,033][47288] Updated weights for policy 0, policy_version 38956 (0.0030) [2024-04-26 01:05:17,642][47267] Signal inference workers to stop experience collection... (8500 times) [2024-04-26 01:05:17,674][47288] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-04-26 01:05:17,699][47267] Signal inference workers to resume experience collection... (8500 times) [2024-04-26 01:05:17,702][47288] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-04-26 01:05:17,809][47288] Updated weights for policy 0, policy_version 38966 (0.0028) [2024-04-26 01:05:18,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 638468096. Throughput: 0: 55615.8. Samples: 587828520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-26 01:05:18,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 01:05:21,125][47288] Updated weights for policy 0, policy_version 38976 (0.0029) [2024-04-26 01:05:23,856][47288] Updated weights for policy 0, policy_version 38986 (0.0026) [2024-04-26 01:05:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 638746624. Throughput: 0: 55519.7. Samples: 588160980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 01:05:23,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 01:05:27,386][47288] Updated weights for policy 0, policy_version 38996 (0.0038) [2024-04-26 01:05:28,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 639008768. Throughput: 0: 55424.9. Samples: 588330940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 01:05:28,923][47056] Avg episode reward: [(0, '0.295')] [2024-04-26 01:05:29,626][47288] Updated weights for policy 0, policy_version 39006 (0.0030) [2024-04-26 01:05:33,361][47288] Updated weights for policy 0, policy_version 39016 (0.0030) [2024-04-26 01:05:33,923][47056] Fps is (10 sec: 50789.1, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 639254528. Throughput: 0: 55537.5. Samples: 588661880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 01:05:33,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 01:05:35,661][47288] Updated weights for policy 0, policy_version 39026 (0.0026) [2024-04-26 01:05:38,923][47056] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 55761.1). Total num frames: 639549440. Throughput: 0: 55587.6. Samples: 588996200. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 01:05:38,923][47056] Avg episode reward: [(0, '0.260')] [2024-04-26 01:05:39,289][47288] Updated weights for policy 0, policy_version 39036 (0.0028) [2024-04-26 01:05:41,542][47288] Updated weights for policy 0, policy_version 39046 (0.0031) [2024-04-26 01:05:43,923][47056] Fps is (10 sec: 60622.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 639860736. Throughput: 0: 55747.7. Samples: 589160640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 01:05:43,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 01:05:45,051][47288] Updated weights for policy 0, policy_version 39056 (0.0031) [2024-04-26 01:05:47,545][47288] Updated weights for policy 0, policy_version 39066 (0.0030) [2024-04-26 01:05:48,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55705.9, 300 sec: 55927.8). Total num frames: 640139264. Throughput: 0: 55793.9. Samples: 589496380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:05:48,923][47056] Avg episode reward: [(0, '0.332')] [2024-04-26 01:05:50,765][47288] Updated weights for policy 0, policy_version 39076 (0.0032) [2024-04-26 01:05:53,356][47288] Updated weights for policy 0, policy_version 39086 (0.0026) [2024-04-26 01:05:53,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 640417792. Throughput: 0: 55507.9. Samples: 589821500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:05:53,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 01:05:56,647][47288] Updated weights for policy 0, policy_version 39096 (0.0032) [2024-04-26 01:05:58,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 640663552. Throughput: 0: 55471.0. Samples: 589992980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:05:58,923][47056] Avg episode reward: [(0, '0.250')] [2024-04-26 01:05:59,194][47288] Updated weights for policy 0, policy_version 39106 (0.0026) [2024-04-26 01:06:02,800][47288] Updated weights for policy 0, policy_version 39116 (0.0031) [2024-04-26 01:06:03,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 640942080. Throughput: 0: 55688.2. Samples: 590334480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:06:03,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 01:06:04,986][47288] Updated weights for policy 0, policy_version 39126 (0.0037) [2024-04-26 01:06:08,729][47288] Updated weights for policy 0, policy_version 39136 (0.0028) [2024-04-26 01:06:08,923][47056] Fps is (10 sec: 54067.6, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 641204224. Throughput: 0: 55747.9. Samples: 590669640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 01:06:08,923][47056] Avg episode reward: [(0, '0.272')] [2024-04-26 01:06:10,833][47288] Updated weights for policy 0, policy_version 39146 (0.0026) [2024-04-26 01:06:13,923][47056] Fps is (10 sec: 55704.1, 60 sec: 55159.2, 300 sec: 55650.0). Total num frames: 641499136. Throughput: 0: 55452.1. Samples: 590826300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 01:06:13,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 01:06:14,450][47288] Updated weights for policy 0, policy_version 39156 (0.0031) [2024-04-26 01:06:16,857][47288] Updated weights for policy 0, policy_version 39166 (0.0034) [2024-04-26 01:06:18,923][47056] Fps is (10 sec: 62257.7, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 641826816. Throughput: 0: 55595.1. Samples: 591163660. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 01:06:18,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 01:06:20,358][47288] Updated weights for policy 0, policy_version 39176 (0.0026) [2024-04-26 01:06:21,648][47267] Signal inference workers to stop experience collection... (8550 times) [2024-04-26 01:06:21,649][47267] Signal inference workers to resume experience collection... (8550 times) [2024-04-26 01:06:21,673][47288] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-04-26 01:06:21,673][47288] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-04-26 01:06:22,630][47288] Updated weights for policy 0, policy_version 39186 (0.0029) [2024-04-26 01:06:23,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55705.4, 300 sec: 55816.7). Total num frames: 642088960. Throughput: 0: 55530.5. Samples: 591495080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 01:06:23,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 01:06:26,175][47288] Updated weights for policy 0, policy_version 39196 (0.0028) [2024-04-26 01:06:28,474][47288] Updated weights for policy 0, policy_version 39206 (0.0031) [2024-04-26 01:06:28,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 642383872. Throughput: 0: 55786.6. Samples: 591671040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 01:06:28,924][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 01:06:31,968][47288] Updated weights for policy 0, policy_version 39216 (0.0033) [2024-04-26 01:06:33,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.9, 300 sec: 55761.1). Total num frames: 642629632. Throughput: 0: 55737.4. Samples: 592004560. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 01:06:33,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 01:06:34,537][47288] Updated weights for policy 0, policy_version 39226 (0.0030) [2024-04-26 01:06:37,735][47288] Updated weights for policy 0, policy_version 39236 (0.0028) [2024-04-26 01:06:38,923][47056] Fps is (10 sec: 50790.3, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 642891776. Throughput: 0: 55997.8. Samples: 592341400. Policy #0 lag: (min: 0.0, avg: 12.7, max: 28.0) [2024-04-26 01:06:38,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 01:06:40,297][47288] Updated weights for policy 0, policy_version 39246 (0.0028) [2024-04-26 01:06:43,744][47288] Updated weights for policy 0, policy_version 39256 (0.0031) [2024-04-26 01:06:43,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 643186688. Throughput: 0: 55679.5. Samples: 592498560. Policy #0 lag: (min: 0.0, avg: 12.7, max: 28.0) [2024-04-26 01:06:43,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 01:06:46,069][47288] Updated weights for policy 0, policy_version 39266 (0.0031) [2024-04-26 01:06:48,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 643465216. Throughput: 0: 55642.6. Samples: 592838400. Policy #0 lag: (min: 0.0, avg: 12.7, max: 28.0) [2024-04-26 01:06:48,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 01:06:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000039274_643465216.pth... [2024-04-26 01:06:48,988][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000038459_630112256.pth [2024-04-26 01:06:49,479][47288] Updated weights for policy 0, policy_version 39276 (0.0033) [2024-04-26 01:06:51,886][47288] Updated weights for policy 0, policy_version 39286 (0.0027) [2024-04-26 01:06:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 643760128. 
Throughput: 0: 55602.6. Samples: 593171760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 28.0) [2024-04-26 01:06:53,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 01:06:55,304][47288] Updated weights for policy 0, policy_version 39296 (0.0028) [2024-04-26 01:06:57,896][47288] Updated weights for policy 0, policy_version 39306 (0.0031) [2024-04-26 01:06:58,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 644055040. Throughput: 0: 56008.6. Samples: 593346680. Policy #0 lag: (min: 0.0, avg: 12.7, max: 28.0) [2024-04-26 01:06:58,924][47056] Avg episode reward: [(0, '0.347')] [2024-04-26 01:07:01,524][47288] Updated weights for policy 0, policy_version 39316 (0.0035) [2024-04-26 01:07:03,839][47288] Updated weights for policy 0, policy_version 39326 (0.0028) [2024-04-26 01:07:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 644317184. Throughput: 0: 55891.3. Samples: 593678760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 01:07:03,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 01:07:07,201][47288] Updated weights for policy 0, policy_version 39336 (0.0035) [2024-04-26 01:07:08,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56524.6, 300 sec: 55761.1). Total num frames: 644595712. Throughput: 0: 55953.7. Samples: 594013000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 01:07:08,924][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 01:07:09,584][47288] Updated weights for policy 0, policy_version 39346 (0.0031) [2024-04-26 01:07:10,907][47267] Signal inference workers to stop experience collection... (8600 times) [2024-04-26 01:07:10,909][47267] Signal inference workers to resume experience collection... 
(8600 times) [2024-04-26 01:07:10,943][47288] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-04-26 01:07:10,944][47288] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-04-26 01:07:12,982][47288] Updated weights for policy 0, policy_version 39356 (0.0027) [2024-04-26 01:07:13,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 644841472. Throughput: 0: 55575.0. Samples: 594171920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 01:07:13,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 01:07:15,477][47288] Updated weights for policy 0, policy_version 39366 (0.0028) [2024-04-26 01:07:18,754][47288] Updated weights for policy 0, policy_version 39376 (0.0031) [2024-04-26 01:07:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 645136384. Throughput: 0: 55692.3. Samples: 594510720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 01:07:18,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 01:07:21,515][47288] Updated weights for policy 0, policy_version 39386 (0.0029) [2024-04-26 01:07:23,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 645414912. Throughput: 0: 55726.2. Samples: 594849080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 01:07:23,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 01:07:24,569][47288] Updated weights for policy 0, policy_version 39396 (0.0032) [2024-04-26 01:07:27,435][47288] Updated weights for policy 0, policy_version 39406 (0.0028) [2024-04-26 01:07:28,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 645709824. Throughput: 0: 55859.3. Samples: 595012220. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:07:28,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 01:07:30,557][47288] Updated weights for policy 0, policy_version 39416 (0.0033) [2024-04-26 01:07:33,259][47288] Updated weights for policy 0, policy_version 39426 (0.0033) [2024-04-26 01:07:33,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 646004736. Throughput: 0: 55780.4. Samples: 595348520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:07:33,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 01:07:36,239][47288] Updated weights for policy 0, policy_version 39436 (0.0027) [2024-04-26 01:07:38,923][47056] Fps is (10 sec: 55704.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 646266880. Throughput: 0: 55960.3. Samples: 595689980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:07:38,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 01:07:38,982][47288] Updated weights for policy 0, policy_version 39446 (0.0031) [2024-04-26 01:07:41,988][47288] Updated weights for policy 0, policy_version 39456 (0.0030) [2024-04-26 01:07:43,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55816.6). Total num frames: 646561792. Throughput: 0: 55929.3. Samples: 595863500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:07:43,924][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 01:07:44,648][47288] Updated weights for policy 0, policy_version 39466 (0.0026) [2024-04-26 01:07:47,919][47288] Updated weights for policy 0, policy_version 39476 (0.0026) [2024-04-26 01:07:48,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 646823936. Throughput: 0: 55974.3. Samples: 596197600. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:07:48,923][47056] Avg episode reward: [(0, '0.354')] [2024-04-26 01:07:50,662][47288] Updated weights for policy 0, policy_version 39486 (0.0036) [2024-04-26 01:07:53,716][47288] Updated weights for policy 0, policy_version 39496 (0.0034) [2024-04-26 01:07:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 647102464. Throughput: 0: 56048.9. Samples: 596535200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:07:53,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 01:07:56,550][47288] Updated weights for policy 0, policy_version 39506 (0.0026) [2024-04-26 01:07:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 647380992. Throughput: 0: 56205.4. Samples: 596701160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:07:58,923][47056] Avg episode reward: [(0, '0.339')] [2024-04-26 01:07:59,483][47288] Updated weights for policy 0, policy_version 39516 (0.0031) [2024-04-26 01:08:02,377][47288] Updated weights for policy 0, policy_version 39526 (0.0032) [2024-04-26 01:08:03,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 647659520. Throughput: 0: 56066.4. Samples: 597033700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:08:03,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 01:08:05,205][47288] Updated weights for policy 0, policy_version 39536 (0.0026) [2024-04-26 01:08:08,305][47288] Updated weights for policy 0, policy_version 39546 (0.0028) [2024-04-26 01:08:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 647954432. Throughput: 0: 55948.5. Samples: 597366760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:08:08,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 01:08:11,262][47288] Updated weights for policy 0, policy_version 39556 (0.0030) [2024-04-26 01:08:13,923][47056] Fps is (10 sec: 55704.2, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 648216576. Throughput: 0: 56125.0. Samples: 597537860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:08:13,923][47056] Avg episode reward: [(0, '0.310')] [2024-04-26 01:08:14,246][47288] Updated weights for policy 0, policy_version 39566 (0.0032) [2024-04-26 01:08:17,118][47288] Updated weights for policy 0, policy_version 39576 (0.0032) [2024-04-26 01:08:18,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 648511488. Throughput: 0: 56250.1. Samples: 597879780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:08:18,924][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 01:08:19,981][47288] Updated weights for policy 0, policy_version 39586 (0.0027) [2024-04-26 01:08:22,903][47288] Updated weights for policy 0, policy_version 39596 (0.0033) [2024-04-26 01:08:23,923][47056] Fps is (10 sec: 55706.9, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 648773632. Throughput: 0: 56185.5. Samples: 598218320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:08:23,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 01:08:25,785][47288] Updated weights for policy 0, policy_version 39606 (0.0026) [2024-04-26 01:08:28,509][47267] Signal inference workers to stop experience collection... (8650 times) [2024-04-26 01:08:28,509][47267] Signal inference workers to resume experience collection... 
(8650 times) [2024-04-26 01:08:28,539][47288] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-04-26 01:08:28,539][47288] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-04-26 01:08:28,742][47288] Updated weights for policy 0, policy_version 39616 (0.0031) [2024-04-26 01:08:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 649068544. Throughput: 0: 55893.4. Samples: 598378700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:08:28,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 01:08:31,598][47288] Updated weights for policy 0, policy_version 39626 (0.0028) [2024-04-26 01:08:33,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 649330688. Throughput: 0: 55924.8. Samples: 598714220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:08:33,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 01:08:33,924][47267] Saving new best policy, reward=0.423! [2024-04-26 01:08:34,697][47288] Updated weights for policy 0, policy_version 39636 (0.0030) [2024-04-26 01:08:37,532][47288] Updated weights for policy 0, policy_version 39646 (0.0025) [2024-04-26 01:08:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 649609216. Throughput: 0: 55725.9. Samples: 599042860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:08:38,924][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 01:08:40,471][47288] Updated weights for policy 0, policy_version 39656 (0.0034) [2024-04-26 01:08:43,267][47288] Updated weights for policy 0, policy_version 39666 (0.0035) [2024-04-26 01:08:43,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 649904128. Throughput: 0: 55847.8. Samples: 599214300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 01:08:43,923][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 01:08:46,188][47288] Updated weights for policy 0, policy_version 39676 (0.0031) [2024-04-26 01:08:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 650182656. Throughput: 0: 55992.8. Samples: 599553380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 01:08:48,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 01:08:48,978][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000039685_650199040.pth... [2024-04-26 01:08:49,024][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000038867_636796928.pth [2024-04-26 01:08:49,175][47288] Updated weights for policy 0, policy_version 39686 (0.0030) [2024-04-26 01:08:52,069][47288] Updated weights for policy 0, policy_version 39696 (0.0035) [2024-04-26 01:08:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 650461184. Throughput: 0: 55997.4. Samples: 599886640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 01:08:53,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 01:08:55,187][47288] Updated weights for policy 0, policy_version 39706 (0.0040) [2024-04-26 01:08:58,194][47288] Updated weights for policy 0, policy_version 39716 (0.0031) [2024-04-26 01:08:58,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 650723328. Throughput: 0: 55962.9. Samples: 600056180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 01:08:58,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 01:09:00,920][47288] Updated weights for policy 0, policy_version 39726 (0.0027) [2024-04-26 01:09:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 651018240. Throughput: 0: 55827.7. Samples: 600392020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 01:09:03,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 01:09:04,127][47288] Updated weights for policy 0, policy_version 39736 (0.0026) [2024-04-26 01:09:06,792][47288] Updated weights for policy 0, policy_version 39746 (0.0033) [2024-04-26 01:09:08,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 651280384. Throughput: 0: 55841.8. Samples: 600731200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 01:09:08,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 01:09:09,997][47288] Updated weights for policy 0, policy_version 39756 (0.0029) [2024-04-26 01:09:12,537][47288] Updated weights for policy 0, policy_version 39766 (0.0028) [2024-04-26 01:09:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 651575296. Throughput: 0: 55960.0. Samples: 600896900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 01:09:13,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 01:09:15,841][47288] Updated weights for policy 0, policy_version 39776 (0.0033) [2024-04-26 01:09:18,456][47288] Updated weights for policy 0, policy_version 39786 (0.0029) [2024-04-26 01:09:18,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 651870208. Throughput: 0: 55930.2. Samples: 601231080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 01:09:18,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 01:09:21,608][47288] Updated weights for policy 0, policy_version 39796 (0.0032) [2024-04-26 01:09:23,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55872.3). Total num frames: 652148736. Throughput: 0: 56070.8. Samples: 601566040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 01:09:23,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 01:09:24,222][47288] Updated weights for policy 0, policy_version 39806 (0.0028) [2024-04-26 01:09:27,568][47288] Updated weights for policy 0, policy_version 39816 (0.0025) [2024-04-26 01:09:28,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 652427264. Throughput: 0: 55979.1. Samples: 601733360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 01:09:28,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 01:09:30,264][47288] Updated weights for policy 0, policy_version 39826 (0.0035) [2024-04-26 01:09:33,313][47288] Updated weights for policy 0, policy_version 39836 (0.0029) [2024-04-26 01:09:33,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 652689408. Throughput: 0: 55871.6. Samples: 602067600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:09:33,923][47056] Avg episode reward: [(0, '0.245')] [2024-04-26 01:09:34,196][47267] Signal inference workers to stop experience collection... (8700 times) [2024-04-26 01:09:34,231][47288] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-04-26 01:09:34,287][47267] Signal inference workers to resume experience collection... (8700 times) [2024-04-26 01:09:34,287][47288] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-04-26 01:09:36,099][47288] Updated weights for policy 0, policy_version 39846 (0.0029) [2024-04-26 01:09:38,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 652967936. Throughput: 0: 55964.9. Samples: 602405060. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:09:38,923][47056] Avg episode reward: [(0, '0.267')] [2024-04-26 01:09:39,201][47288] Updated weights for policy 0, policy_version 39856 (0.0027) [2024-04-26 01:09:41,941][47288] Updated weights for policy 0, policy_version 39866 (0.0027) [2024-04-26 01:09:43,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55761.2). Total num frames: 653246464. Throughput: 0: 55897.4. Samples: 602571560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:09:43,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 01:09:45,240][47288] Updated weights for policy 0, policy_version 39876 (0.0030) [2024-04-26 01:09:47,732][47288] Updated weights for policy 0, policy_version 39886 (0.0028) [2024-04-26 01:09:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 653524992. Throughput: 0: 55768.9. Samples: 602901620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:09:48,923][47056] Avg episode reward: [(0, '0.318')] [2024-04-26 01:09:51,083][47288] Updated weights for policy 0, policy_version 39896 (0.0032) [2024-04-26 01:09:53,640][47288] Updated weights for policy 0, policy_version 39906 (0.0029) [2024-04-26 01:09:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 653819904. Throughput: 0: 55769.7. Samples: 603240840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:09:53,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 01:09:56,817][47288] Updated weights for policy 0, policy_version 39916 (0.0029) [2024-04-26 01:09:58,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56251.9, 300 sec: 55872.3). Total num frames: 654098432. Throughput: 0: 55835.3. Samples: 603409480. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-04-26 01:09:58,923][47056] Avg episode reward: [(0, '0.297')] [2024-04-26 01:09:59,484][47288] Updated weights for policy 0, policy_version 39926 (0.0030) [2024-04-26 01:10:02,726][47288] Updated weights for policy 0, policy_version 39936 (0.0036) [2024-04-26 01:10:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 654376960. Throughput: 0: 55796.9. Samples: 603741940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-04-26 01:10:03,923][47056] Avg episode reward: [(0, '0.339')] [2024-04-26 01:10:05,434][47288] Updated weights for policy 0, policy_version 39946 (0.0028) [2024-04-26 01:10:08,670][47288] Updated weights for policy 0, policy_version 39956 (0.0029) [2024-04-26 01:10:08,923][47056] Fps is (10 sec: 54065.9, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 654639104. Throughput: 0: 55671.3. Samples: 604071260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-04-26 01:10:08,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 01:10:11,494][47288] Updated weights for policy 0, policy_version 39966 (0.0036) [2024-04-26 01:10:13,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 654917632. Throughput: 0: 55578.2. Samples: 604234380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-04-26 01:10:13,923][47056] Avg episode reward: [(0, '0.354')] [2024-04-26 01:10:14,529][47288] Updated weights for policy 0, policy_version 39976 (0.0031) [2024-04-26 01:10:17,318][47288] Updated weights for policy 0, policy_version 39986 (0.0028) [2024-04-26 01:10:18,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 655179776. Throughput: 0: 55557.3. Samples: 604567680. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-04-26 01:10:18,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 01:10:20,241][47288] Updated weights for policy 0, policy_version 39996 (0.0027) [2024-04-26 01:10:23,075][47288] Updated weights for policy 0, policy_version 40006 (0.0034) [2024-04-26 01:10:23,923][47056] Fps is (10 sec: 55703.4, 60 sec: 55432.1, 300 sec: 55816.6). Total num frames: 655474688. Throughput: 0: 55520.7. Samples: 604903520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-04-26 01:10:23,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 01:10:26,206][47288] Updated weights for policy 0, policy_version 40016 (0.0026) [2024-04-26 01:10:28,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 655769600. Throughput: 0: 55512.0. Samples: 605069600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-04-26 01:10:28,923][47056] Avg episode reward: [(0, '0.298')] [2024-04-26 01:10:28,988][47288] Updated weights for policy 0, policy_version 40026 (0.0028) [2024-04-26 01:10:32,334][47288] Updated weights for policy 0, policy_version 40036 (0.0026) [2024-04-26 01:10:33,178][47267] Signal inference workers to stop experience collection... (8750 times) [2024-04-26 01:10:33,178][47267] Signal inference workers to resume experience collection... (8750 times) [2024-04-26 01:10:33,192][47288] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-04-26 01:10:33,192][47288] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-04-26 01:10:33,923][47056] Fps is (10 sec: 57346.1, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 656048128. Throughput: 0: 55734.7. Samples: 605409680. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-04-26 01:10:33,924][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 01:10:34,942][47288] Updated weights for policy 0, policy_version 40046 (0.0030) [2024-04-26 01:10:38,043][47288] Updated weights for policy 0, policy_version 40056 (0.0031) [2024-04-26 01:10:38,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 656326656. Throughput: 0: 55658.4. Samples: 605745460. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-04-26 01:10:38,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 01:10:40,618][47288] Updated weights for policy 0, policy_version 40066 (0.0032) [2024-04-26 01:10:43,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 656588800. Throughput: 0: 55690.9. Samples: 605915580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-04-26 01:10:43,923][47056] Avg episode reward: [(0, '0.217')] [2024-04-26 01:10:44,001][47288] Updated weights for policy 0, policy_version 40076 (0.0033) [2024-04-26 01:10:46,387][47288] Updated weights for policy 0, policy_version 40086 (0.0028) [2024-04-26 01:10:48,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 656850944. Throughput: 0: 55732.6. Samples: 606249900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-04-26 01:10:48,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 01:10:48,987][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000040092_656867328.pth... [2024-04-26 01:10:49,049][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000039274_643465216.pth [2024-04-26 01:10:49,867][47288] Updated weights for policy 0, policy_version 40096 (0.0029) [2024-04-26 01:10:52,347][47288] Updated weights for policy 0, policy_version 40106 (0.0025) [2024-04-26 01:10:53,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 657145856. 
Throughput: 0: 55803.8. Samples: 606582420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 01:10:53,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:10:55,716][47288] Updated weights for policy 0, policy_version 40116 (0.0027) [2024-04-26 01:10:58,231][47288] Updated weights for policy 0, policy_version 40126 (0.0031) [2024-04-26 01:10:58,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 657440768. Throughput: 0: 55773.2. Samples: 606744180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 01:10:58,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 01:11:01,499][47288] Updated weights for policy 0, policy_version 40136 (0.0030) [2024-04-26 01:11:03,923][47056] Fps is (10 sec: 58981.3, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 657735680. Throughput: 0: 55913.2. Samples: 607083780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 01:11:03,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 01:11:04,032][47288] Updated weights for policy 0, policy_version 40146 (0.0025) [2024-04-26 01:11:07,338][47288] Updated weights for policy 0, policy_version 40156 (0.0027) [2024-04-26 01:11:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 657997824. Throughput: 0: 55843.7. Samples: 607416460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 01:11:08,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 01:11:09,884][47288] Updated weights for policy 0, policy_version 40166 (0.0027) [2024-04-26 01:11:13,316][47288] Updated weights for policy 0, policy_version 40176 (0.0025) [2024-04-26 01:11:13,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 658292736. Throughput: 0: 55916.4. Samples: 607585840. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 01:11:13,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 01:11:15,862][47288] Updated weights for policy 0, policy_version 40186 (0.0029) [2024-04-26 01:11:18,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 658538496. Throughput: 0: 55762.2. Samples: 607918980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:11:18,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 01:11:19,168][47288] Updated weights for policy 0, policy_version 40196 (0.0028) [2024-04-26 01:11:21,641][47288] Updated weights for policy 0, policy_version 40206 (0.0030) [2024-04-26 01:11:23,923][47056] Fps is (10 sec: 50790.8, 60 sec: 55432.9, 300 sec: 55650.1). Total num frames: 658800640. Throughput: 0: 55684.4. Samples: 608251260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:11:23,923][47056] Avg episode reward: [(0, '0.230')] [2024-04-26 01:11:25,054][47288] Updated weights for policy 0, policy_version 40216 (0.0032) [2024-04-26 01:11:27,481][47288] Updated weights for policy 0, policy_version 40226 (0.0028) [2024-04-26 01:11:28,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 659079168. Throughput: 0: 55464.8. Samples: 608411500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:11:28,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:11:31,007][47288] Updated weights for policy 0, policy_version 40236 (0.0027) [2024-04-26 01:11:31,014][47267] Signal inference workers to stop experience collection... (8800 times) [2024-04-26 01:11:31,015][47267] Signal inference workers to resume experience collection... 
(8800 times) [2024-04-26 01:11:31,027][47288] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-04-26 01:11:31,027][47288] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-04-26 01:11:33,467][47288] Updated weights for policy 0, policy_version 40246 (0.0032) [2024-04-26 01:11:33,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 659390464. Throughput: 0: 55379.9. Samples: 608742000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:11:33,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 01:11:36,931][47288] Updated weights for policy 0, policy_version 40256 (0.0025) [2024-04-26 01:11:38,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 659668992. Throughput: 0: 55414.2. Samples: 609076060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:11:38,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 01:11:39,471][47288] Updated weights for policy 0, policy_version 40266 (0.0032) [2024-04-26 01:11:42,739][47288] Updated weights for policy 0, policy_version 40276 (0.0029) [2024-04-26 01:11:43,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 659931136. Throughput: 0: 55657.0. Samples: 609248740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 01:11:43,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 01:11:45,523][47288] Updated weights for policy 0, policy_version 40286 (0.0030) [2024-04-26 01:11:48,529][47288] Updated weights for policy 0, policy_version 40296 (0.0028) [2024-04-26 01:11:48,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 660226048. Throughput: 0: 55446.7. Samples: 609578880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 01:11:48,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 01:11:52,073][47288] Updated weights for policy 0, policy_version 40306 (0.0032) [2024-04-26 01:11:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 660471808. Throughput: 0: 55481.7. Samples: 609913140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 01:11:53,923][47056] Avg episode reward: [(0, '0.202')] [2024-04-26 01:11:54,499][47288] Updated weights for policy 0, policy_version 40316 (0.0032) [2024-04-26 01:11:57,971][47288] Updated weights for policy 0, policy_version 40326 (0.0032) [2024-04-26 01:11:58,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 660750336. Throughput: 0: 55315.6. Samples: 610075040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 01:11:58,923][47056] Avg episode reward: [(0, '0.261')] [2024-04-26 01:12:00,619][47288] Updated weights for policy 0, policy_version 40336 (0.0029) [2024-04-26 01:12:03,851][47288] Updated weights for policy 0, policy_version 40346 (0.0034) [2024-04-26 01:12:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 54886.5, 300 sec: 55705.6). Total num frames: 661028864. Throughput: 0: 55248.0. Samples: 610405140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 01:12:03,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 01:12:06,445][47288] Updated weights for policy 0, policy_version 40356 (0.0025) [2024-04-26 01:12:08,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 661340160. Throughput: 0: 55188.8. Samples: 610734760. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 01:12:08,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 01:12:09,535][47288] Updated weights for policy 0, policy_version 40366 (0.0028) [2024-04-26 01:12:12,288][47288] Updated weights for policy 0, policy_version 40376 (0.0031) [2024-04-26 01:12:13,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55159.3, 300 sec: 55816.6). Total num frames: 661602304. Throughput: 0: 55494.5. Samples: 610908760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 01:12:13,924][47056] Avg episode reward: [(0, '0.267')] [2024-04-26 01:12:15,307][47288] Updated weights for policy 0, policy_version 40386 (0.0033) [2024-04-26 01:12:18,326][47288] Updated weights for policy 0, policy_version 40396 (0.0028) [2024-04-26 01:12:18,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 661880832. Throughput: 0: 55610.4. Samples: 611244480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 01:12:18,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 01:12:21,351][47288] Updated weights for policy 0, policy_version 40406 (0.0027) [2024-04-26 01:12:23,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 662142976. Throughput: 0: 55572.0. Samples: 611576800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 01:12:23,923][47056] Avg episode reward: [(0, '0.253')] [2024-04-26 01:12:24,241][47288] Updated weights for policy 0, policy_version 40416 (0.0027) [2024-04-26 01:12:27,359][47288] Updated weights for policy 0, policy_version 40426 (0.0027) [2024-04-26 01:12:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 662421504. Throughput: 0: 55307.8. Samples: 611737600. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 01:12:28,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 01:12:30,099][47288] Updated weights for policy 0, policy_version 40436 (0.0031) [2024-04-26 01:12:33,351][47288] Updated weights for policy 0, policy_version 40446 (0.0026) [2024-04-26 01:12:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 662683648. Throughput: 0: 55526.9. Samples: 612077580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 01:12:33,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 01:12:36,020][47288] Updated weights for policy 0, policy_version 40456 (0.0029) [2024-04-26 01:12:38,923][47056] Fps is (10 sec: 54067.5, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 662962176. Throughput: 0: 55471.0. Samples: 612409340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 01:12:38,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 01:12:39,432][47288] Updated weights for policy 0, policy_version 40466 (0.0026) [2024-04-26 01:12:40,052][47267] Signal inference workers to stop experience collection... (8850 times) [2024-04-26 01:12:40,062][47288] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-04-26 01:12:40,142][47267] Signal inference workers to resume experience collection... (8850 times) [2024-04-26 01:12:40,143][47288] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-04-26 01:12:41,998][47288] Updated weights for policy 0, policy_version 40476 (0.0029) [2024-04-26 01:12:43,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 663273472. Throughput: 0: 55567.6. Samples: 612575580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 01:12:43,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 01:12:45,342][47288] Updated weights for policy 0, policy_version 40486 (0.0031) [2024-04-26 01:12:47,993][47288] Updated weights for policy 0, policy_version 40496 (0.0037) [2024-04-26 01:12:48,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 663552000. Throughput: 0: 55656.4. Samples: 612909680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 01:12:48,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 01:12:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000040500_663552000.pth... [2024-04-26 01:12:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000039685_650199040.pth [2024-04-26 01:12:51,027][47288] Updated weights for policy 0, policy_version 40506 (0.0029) [2024-04-26 01:12:53,771][47288] Updated weights for policy 0, policy_version 40516 (0.0032) [2024-04-26 01:12:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 663830528. Throughput: 0: 55846.7. Samples: 613247860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 01:12:53,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 01:12:56,794][47288] Updated weights for policy 0, policy_version 40526 (0.0026) [2024-04-26 01:12:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 664109056. Throughput: 0: 55772.2. Samples: 613418500. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-26 01:12:58,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 01:12:59,629][47288] Updated weights for policy 0, policy_version 40536 (0.0029) [2024-04-26 01:13:02,837][47288] Updated weights for policy 0, policy_version 40546 (0.0035) [2024-04-26 01:13:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 664387584. 
Throughput: 0: 55770.9. Samples: 613754160. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-26 01:13:03,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 01:13:05,426][47288] Updated weights for policy 0, policy_version 40556 (0.0028) [2024-04-26 01:13:08,649][47288] Updated weights for policy 0, policy_version 40566 (0.0034) [2024-04-26 01:13:08,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55159.6, 300 sec: 55705.7). Total num frames: 664649728. Throughput: 0: 55795.2. Samples: 614087580. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-26 01:13:08,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 01:13:11,368][47288] Updated weights for policy 0, policy_version 40576 (0.0031) [2024-04-26 01:13:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 664928256. Throughput: 0: 55734.4. Samples: 614245640. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-26 01:13:13,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 01:13:14,576][47288] Updated weights for policy 0, policy_version 40586 (0.0027) [2024-04-26 01:13:17,254][47288] Updated weights for policy 0, policy_version 40596 (0.0026) [2024-04-26 01:13:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.9, 300 sec: 55761.1). Total num frames: 665223168. Throughput: 0: 55724.4. Samples: 614585180. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-26 01:13:18,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 01:13:20,291][47288] Updated weights for policy 0, policy_version 40606 (0.0031) [2024-04-26 01:13:23,124][47288] Updated weights for policy 0, policy_version 40616 (0.0028) [2024-04-26 01:13:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 665501696. Throughput: 0: 55775.2. Samples: 614919220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:13:23,932][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 01:13:26,191][47288] Updated weights for policy 0, policy_version 40626 (0.0031) [2024-04-26 01:13:28,910][47288] Updated weights for policy 0, policy_version 40636 (0.0031) [2024-04-26 01:13:28,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 665780224. Throughput: 0: 55912.4. Samples: 615091640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:13:28,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 01:13:32,188][47288] Updated weights for policy 0, policy_version 40646 (0.0031) [2024-04-26 01:13:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 666058752. Throughput: 0: 55832.1. Samples: 615422120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:13:33,923][47056] Avg episode reward: [(0, '0.271')] [2024-04-26 01:13:34,794][47288] Updated weights for policy 0, policy_version 40656 (0.0031) [2024-04-26 01:13:37,916][47288] Updated weights for policy 0, policy_version 40666 (0.0026) [2024-04-26 01:13:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 666337280. Throughput: 0: 55808.8. Samples: 615759260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:13:38,923][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 01:13:40,603][47288] Updated weights for policy 0, policy_version 40676 (0.0030) [2024-04-26 01:13:41,646][47267] Signal inference workers to stop experience collection... (8900 times) [2024-04-26 01:13:41,685][47288] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-04-26 01:13:41,739][47267] Signal inference workers to resume experience collection... 
(8900 times) [2024-04-26 01:13:41,739][47288] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-04-26 01:13:43,658][47288] Updated weights for policy 0, policy_version 40686 (0.0030) [2024-04-26 01:13:43,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 666599424. Throughput: 0: 55772.1. Samples: 615928240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:13:43,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:13:46,527][47288] Updated weights for policy 0, policy_version 40696 (0.0030) [2024-04-26 01:13:48,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 666877952. Throughput: 0: 55757.6. Samples: 616263240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:13:48,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 01:13:49,568][47288] Updated weights for policy 0, policy_version 40706 (0.0032) [2024-04-26 01:13:52,710][47288] Updated weights for policy 0, policy_version 40716 (0.0029) [2024-04-26 01:13:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 667172864. Throughput: 0: 55587.4. Samples: 616589020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:13:53,923][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 01:13:55,755][47288] Updated weights for policy 0, policy_version 40726 (0.0026) [2024-04-26 01:13:58,524][47288] Updated weights for policy 0, policy_version 40736 (0.0032) [2024-04-26 01:13:58,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 667435008. Throughput: 0: 55661.3. Samples: 616750400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:13:58,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 01:14:01,506][47288] Updated weights for policy 0, policy_version 40746 (0.0032) [2024-04-26 01:14:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55705.6). 
Total num frames: 667713536. Throughput: 0: 55599.9. Samples: 617087180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:14:03,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 01:14:04,294][47288] Updated weights for policy 0, policy_version 40756 (0.0027) [2024-04-26 01:14:07,463][47288] Updated weights for policy 0, policy_version 40766 (0.0028) [2024-04-26 01:14:08,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 667992064. Throughput: 0: 55634.7. Samples: 617422780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:14:08,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 01:14:10,264][47288] Updated weights for policy 0, policy_version 40776 (0.0033) [2024-04-26 01:14:13,415][47288] Updated weights for policy 0, policy_version 40786 (0.0030) [2024-04-26 01:14:13,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 668270592. Throughput: 0: 55488.1. Samples: 617588600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:14:13,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 01:14:16,209][47288] Updated weights for policy 0, policy_version 40796 (0.0027) [2024-04-26 01:14:18,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55159.3, 300 sec: 55539.0). Total num frames: 668532736. Throughput: 0: 55489.3. Samples: 617919140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:14:18,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 01:14:19,243][47288] Updated weights for policy 0, policy_version 40806 (0.0034) [2024-04-26 01:14:21,971][47288] Updated weights for policy 0, policy_version 40816 (0.0030) [2024-04-26 01:14:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 668844032. Throughput: 0: 55486.4. Samples: 618256140. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:14:23,923][47056] Avg episode reward: [(0, '0.266')] [2024-04-26 01:14:24,952][47288] Updated weights for policy 0, policy_version 40826 (0.0028) [2024-04-26 01:14:27,707][47288] Updated weights for policy 0, policy_version 40836 (0.0029) [2024-04-26 01:14:28,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 669106176. Throughput: 0: 55446.5. Samples: 618423340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:14:28,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:14:30,845][47288] Updated weights for policy 0, policy_version 40846 (0.0029) [2024-04-26 01:14:33,654][47288] Updated weights for policy 0, policy_version 40856 (0.0028) [2024-04-26 01:14:33,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 669384704. Throughput: 0: 55413.6. Samples: 618756860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:14:33,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 01:14:36,736][47288] Updated weights for policy 0, policy_version 40866 (0.0028) [2024-04-26 01:14:38,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 669663232. Throughput: 0: 55617.4. Samples: 619091800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:14:38,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 01:14:39,363][47267] Signal inference workers to stop experience collection... (8950 times) [2024-04-26 01:14:39,394][47288] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-04-26 01:14:39,443][47267] Signal inference workers to resume experience collection... 
(8950 times) [2024-04-26 01:14:39,444][47288] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-04-26 01:14:39,565][47288] Updated weights for policy 0, policy_version 40876 (0.0028) [2024-04-26 01:14:42,510][47288] Updated weights for policy 0, policy_version 40886 (0.0026) [2024-04-26 01:14:43,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 669958144. Throughput: 0: 55740.9. Samples: 619258740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:14:43,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 01:14:45,266][47288] Updated weights for policy 0, policy_version 40896 (0.0030) [2024-04-26 01:14:48,257][47288] Updated weights for policy 0, policy_version 40906 (0.0033) [2024-04-26 01:14:48,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 670220288. Throughput: 0: 55809.1. Samples: 619598580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:14:48,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 01:14:49,002][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000040908_670236672.pth... [2024-04-26 01:14:49,046][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000040092_656867328.pth [2024-04-26 01:14:51,040][47288] Updated weights for policy 0, policy_version 40916 (0.0026) [2024-04-26 01:14:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 670498816. Throughput: 0: 55837.3. Samples: 619935460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:14:53,923][47056] Avg episode reward: [(0, '0.257')] [2024-04-26 01:14:54,291][47288] Updated weights for policy 0, policy_version 40926 (0.0030) [2024-04-26 01:14:56,966][47288] Updated weights for policy 0, policy_version 40936 (0.0029) [2024-04-26 01:14:58,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 670777344. 
Throughput: 0: 55797.2. Samples: 620099480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:14:58,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 01:15:00,280][47288] Updated weights for policy 0, policy_version 40946 (0.0039) [2024-04-26 01:15:02,912][47288] Updated weights for policy 0, policy_version 40956 (0.0036) [2024-04-26 01:15:03,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 671055872. Throughput: 0: 55870.8. Samples: 620433320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:15:03,923][47056] Avg episode reward: [(0, '0.325')] [2024-04-26 01:15:06,259][47288] Updated weights for policy 0, policy_version 40966 (0.0031) [2024-04-26 01:15:08,630][47288] Updated weights for policy 0, policy_version 40976 (0.0032) [2024-04-26 01:15:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 671350784. Throughput: 0: 55773.3. Samples: 620765940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:15:08,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 01:15:12,119][47288] Updated weights for policy 0, policy_version 40986 (0.0029) [2024-04-26 01:15:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 671612928. Throughput: 0: 55754.3. Samples: 620932280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:15:13,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 01:15:14,653][47288] Updated weights for policy 0, policy_version 40996 (0.0036) [2024-04-26 01:15:17,827][47288] Updated weights for policy 0, policy_version 41006 (0.0030) [2024-04-26 01:15:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 55705.7). Total num frames: 671907840. Throughput: 0: 55903.6. Samples: 621272520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:15:18,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 01:15:20,607][47288] Updated weights for policy 0, policy_version 41016 (0.0029) [2024-04-26 01:15:23,476][47288] Updated weights for policy 0, policy_version 41026 (0.0030) [2024-04-26 01:15:23,923][47056] Fps is (10 sec: 58981.1, 60 sec: 55978.4, 300 sec: 55705.6). Total num frames: 672202752. Throughput: 0: 55934.8. Samples: 621608880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:15:23,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 01:15:26,290][47288] Updated weights for policy 0, policy_version 41036 (0.0030) [2024-04-26 01:15:28,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 672464896. Throughput: 0: 55879.5. Samples: 621773320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:15:28,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 01:15:29,445][47288] Updated weights for policy 0, policy_version 41046 (0.0034) [2024-04-26 01:15:32,225][47288] Updated weights for policy 0, policy_version 41056 (0.0033) [2024-04-26 01:15:33,923][47056] Fps is (10 sec: 52429.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 672727040. Throughput: 0: 55788.3. Samples: 622109060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 01:15:33,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 01:15:35,515][47288] Updated weights for policy 0, policy_version 41066 (0.0024) [2024-04-26 01:15:38,210][47288] Updated weights for policy 0, policy_version 41076 (0.0034) [2024-04-26 01:15:38,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 673005568. Throughput: 0: 55821.5. Samples: 622447420. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 01:15:38,923][47056] Avg episode reward: [(0, '0.297')] [2024-04-26 01:15:41,286][47288] Updated weights for policy 0, policy_version 41086 (0.0032) [2024-04-26 01:15:43,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 673300480. Throughput: 0: 55797.9. Samples: 622610380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 01:15:43,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 01:15:43,986][47288] Updated weights for policy 0, policy_version 41096 (0.0030) [2024-04-26 01:15:47,042][47288] Updated weights for policy 0, policy_version 41106 (0.0032) [2024-04-26 01:15:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 673579008. Throughput: 0: 55862.1. Samples: 622947120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 01:15:48,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 01:15:49,848][47288] Updated weights for policy 0, policy_version 41116 (0.0028) [2024-04-26 01:15:52,849][47267] Signal inference workers to stop experience collection... (9000 times) [2024-04-26 01:15:52,888][47288] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-04-26 01:15:52,913][47267] Signal inference workers to resume experience collection... (9000 times) [2024-04-26 01:15:52,914][47288] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-04-26 01:15:53,030][47288] Updated weights for policy 0, policy_version 41126 (0.0032) [2024-04-26 01:15:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 673857536. Throughput: 0: 55888.5. Samples: 623280920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 01:15:53,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 01:15:55,687][47288] Updated weights for policy 0, policy_version 41136 (0.0024) [2024-04-26 01:15:58,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 674119680. Throughput: 0: 55917.3. Samples: 623448560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 01:15:58,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 01:15:59,044][47288] Updated weights for policy 0, policy_version 41146 (0.0028) [2024-04-26 01:16:01,689][47288] Updated weights for policy 0, policy_version 41156 (0.0028) [2024-04-26 01:16:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 674414592. Throughput: 0: 55868.8. Samples: 623786620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:16:03,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 01:16:04,791][47288] Updated weights for policy 0, policy_version 41166 (0.0026) [2024-04-26 01:16:07,499][47288] Updated weights for policy 0, policy_version 41176 (0.0028) [2024-04-26 01:16:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 674693120. Throughput: 0: 55717.1. Samples: 624116140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:16:08,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 01:16:10,615][47288] Updated weights for policy 0, policy_version 41186 (0.0026) [2024-04-26 01:16:13,348][47288] Updated weights for policy 0, policy_version 41196 (0.0027) [2024-04-26 01:16:13,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 674955264. Throughput: 0: 55681.5. Samples: 624278980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:16:13,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 01:16:16,560][47288] Updated weights for policy 0, policy_version 41206 (0.0032) [2024-04-26 01:16:18,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 675233792. Throughput: 0: 55646.8. Samples: 624613160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:16:18,923][47056] Avg episode reward: [(0, '0.304')] [2024-04-26 01:16:19,171][47288] Updated weights for policy 0, policy_version 41216 (0.0031) [2024-04-26 01:16:22,460][47288] Updated weights for policy 0, policy_version 41226 (0.0033) [2024-04-26 01:16:23,923][47056] Fps is (10 sec: 57342.9, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 675528704. Throughput: 0: 55541.5. Samples: 624946800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:16:23,923][47056] Avg episode reward: [(0, '0.401')] [2024-04-26 01:16:25,173][47288] Updated weights for policy 0, policy_version 41236 (0.0027) [2024-04-26 01:16:28,166][47288] Updated weights for policy 0, policy_version 41246 (0.0024) [2024-04-26 01:16:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 675807232. Throughput: 0: 55722.7. Samples: 625117900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:16:28,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 01:16:31,048][47288] Updated weights for policy 0, policy_version 41256 (0.0032) [2024-04-26 01:16:33,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 676085760. Throughput: 0: 55735.2. Samples: 625455200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:16:33,923][47056] Avg episode reward: [(0, '0.287')] [2024-04-26 01:16:34,100][47288] Updated weights for policy 0, policy_version 41266 (0.0033) [2024-04-26 01:16:36,827][47288] Updated weights for policy 0, policy_version 41276 (0.0028) [2024-04-26 01:16:38,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 676364288. Throughput: 0: 55797.3. Samples: 625791800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:16:38,923][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 01:16:39,991][47288] Updated weights for policy 0, policy_version 41286 (0.0029) [2024-04-26 01:16:42,694][47288] Updated weights for policy 0, policy_version 41296 (0.0028) [2024-04-26 01:16:43,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 676642816. Throughput: 0: 55905.7. Samples: 625964320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:16:43,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 01:16:45,658][47288] Updated weights for policy 0, policy_version 41306 (0.0030) [2024-04-26 01:16:48,505][47288] Updated weights for policy 0, policy_version 41316 (0.0026) [2024-04-26 01:16:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 676937728. Throughput: 0: 55990.3. Samples: 626306180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:16:48,923][47056] Avg episode reward: [(0, '0.341')] [2024-04-26 01:16:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000041317_676937728.pth... [2024-04-26 01:16:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000040500_663552000.pth [2024-04-26 01:16:51,485][47288] Updated weights for policy 0, policy_version 41326 (0.0043) [2024-04-26 01:16:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 677199872. 
Throughput: 0: 56097.3. Samples: 626640520. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 01:16:53,923][47056] Avg episode reward: [(0, '0.273')] [2024-04-26 01:16:54,402][47288] Updated weights for policy 0, policy_version 41336 (0.0027) [2024-04-26 01:16:55,342][47267] Signal inference workers to stop experience collection... (9050 times) [2024-04-26 01:16:55,389][47288] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-04-26 01:16:55,400][47267] Signal inference workers to resume experience collection... (9050 times) [2024-04-26 01:16:55,405][47288] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-04-26 01:16:57,382][47288] Updated weights for policy 0, policy_version 41346 (0.0028) [2024-04-26 01:16:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 677494784. Throughput: 0: 56139.9. Samples: 626805280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 01:16:58,924][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 01:17:00,263][47288] Updated weights for policy 0, policy_version 41356 (0.0027) [2024-04-26 01:17:03,334][47288] Updated weights for policy 0, policy_version 41366 (0.0032) [2024-04-26 01:17:03,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 677789696. Throughput: 0: 56106.2. Samples: 627137940. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 01:17:03,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 01:17:06,070][47288] Updated weights for policy 0, policy_version 41376 (0.0027) [2024-04-26 01:17:08,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 678035456. Throughput: 0: 56304.1. Samples: 627480480. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 01:17:08,924][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 01:17:09,160][47288] Updated weights for policy 0, policy_version 41386 (0.0026) [2024-04-26 01:17:12,021][47288] Updated weights for policy 0, policy_version 41396 (0.0027) [2024-04-26 01:17:13,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 678313984. Throughput: 0: 56054.9. Samples: 627640380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 01:17:13,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 01:17:14,944][47288] Updated weights for policy 0, policy_version 41406 (0.0025) [2024-04-26 01:17:17,947][47288] Updated weights for policy 0, policy_version 41416 (0.0029) [2024-04-26 01:17:18,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 678592512. Throughput: 0: 55934.7. Samples: 627972260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:17:18,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:17:20,922][47288] Updated weights for policy 0, policy_version 41426 (0.0028) [2024-04-26 01:17:23,875][47288] Updated weights for policy 0, policy_version 41436 (0.0028) [2024-04-26 01:17:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 678887424. Throughput: 0: 55788.0. Samples: 628302260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:17:23,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:17:26,775][47288] Updated weights for policy 0, policy_version 41446 (0.0027) [2024-04-26 01:17:28,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 679165952. Throughput: 0: 55909.2. Samples: 628480240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:17:28,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 01:17:29,750][47288] Updated weights for policy 0, policy_version 41456 (0.0027) [2024-04-26 01:17:32,618][47288] Updated weights for policy 0, policy_version 41466 (0.0028) [2024-04-26 01:17:33,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 679428096. Throughput: 0: 55683.6. Samples: 628811940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:17:33,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 01:17:35,598][47288] Updated weights for policy 0, policy_version 41476 (0.0024) [2024-04-26 01:17:38,456][47288] Updated weights for policy 0, policy_version 41486 (0.0034) [2024-04-26 01:17:38,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 679739392. Throughput: 0: 55613.3. Samples: 629143120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:17:38,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 01:17:41,597][47288] Updated weights for policy 0, policy_version 41496 (0.0029) [2024-04-26 01:17:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 679985152. Throughput: 0: 55694.8. Samples: 629311540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:17:43,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 01:17:44,481][47288] Updated weights for policy 0, policy_version 41506 (0.0029) [2024-04-26 01:17:47,453][47288] Updated weights for policy 0, policy_version 41516 (0.0026) [2024-04-26 01:17:48,258][47267] Signal inference workers to stop experience collection... (9100 times) [2024-04-26 01:17:48,258][47267] Signal inference workers to resume experience collection... 
(9100 times) [2024-04-26 01:17:48,283][47288] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-04-26 01:17:48,283][47288] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-04-26 01:17:48,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 680280064. Throughput: 0: 55753.3. Samples: 629646840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:17:48,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 01:17:50,270][47288] Updated weights for policy 0, policy_version 41526 (0.0029) [2024-04-26 01:17:53,185][47288] Updated weights for policy 0, policy_version 41536 (0.0032) [2024-04-26 01:17:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 680542208. Throughput: 0: 55565.4. Samples: 629980920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:17:53,923][47056] Avg episode reward: [(0, '0.361')] [2024-04-26 01:17:56,004][47288] Updated weights for policy 0, policy_version 41546 (0.0027) [2024-04-26 01:17:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 680837120. Throughput: 0: 55632.4. Samples: 630143840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:17:58,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 01:17:59,077][47288] Updated weights for policy 0, policy_version 41556 (0.0031) [2024-04-26 01:18:02,118][47288] Updated weights for policy 0, policy_version 41566 (0.0033) [2024-04-26 01:18:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 681099264. Throughput: 0: 55542.8. Samples: 630471680. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:18:03,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 01:18:05,046][47288] Updated weights for policy 0, policy_version 41576 (0.0029) [2024-04-26 01:18:07,916][47288] Updated weights for policy 0, policy_version 41586 (0.0030) [2024-04-26 01:18:08,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 681394176. Throughput: 0: 55738.2. Samples: 630810480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 01:18:08,923][47056] Avg episode reward: [(0, '0.194')] [2024-04-26 01:18:10,973][47288] Updated weights for policy 0, policy_version 41596 (0.0034) [2024-04-26 01:18:13,784][47288] Updated weights for policy 0, policy_version 41606 (0.0029) [2024-04-26 01:18:13,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 681672704. Throughput: 0: 55597.9. Samples: 630982140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 01:18:13,923][47056] Avg episode reward: [(0, '0.298')] [2024-04-26 01:18:16,697][47288] Updated weights for policy 0, policy_version 41616 (0.0029) [2024-04-26 01:18:18,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 681934848. Throughput: 0: 55727.6. Samples: 631319680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 01:18:18,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 01:18:19,666][47288] Updated weights for policy 0, policy_version 41626 (0.0032) [2024-04-26 01:18:22,613][47288] Updated weights for policy 0, policy_version 41636 (0.0027) [2024-04-26 01:18:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 682213376. Throughput: 0: 55741.4. Samples: 631651480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 01:18:23,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 01:18:25,578][47288] Updated weights for policy 0, policy_version 41646 (0.0027) [2024-04-26 01:18:28,527][47288] Updated weights for policy 0, policy_version 41656 (0.0034) [2024-04-26 01:18:28,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 682491904. Throughput: 0: 55573.3. Samples: 631812340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 01:18:28,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 01:18:31,547][47288] Updated weights for policy 0, policy_version 41666 (0.0033) [2024-04-26 01:18:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 682786816. Throughput: 0: 55707.5. Samples: 632153680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 01:18:33,923][47056] Avg episode reward: [(0, '0.347')] [2024-04-26 01:18:34,230][47288] Updated weights for policy 0, policy_version 41676 (0.0030) [2024-04-26 01:18:37,208][47288] Updated weights for policy 0, policy_version 41686 (0.0032) [2024-04-26 01:18:38,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 683065344. Throughput: 0: 55733.7. Samples: 632488940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 01:18:38,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 01:18:40,065][47288] Updated weights for policy 0, policy_version 41696 (0.0028) [2024-04-26 01:18:43,094][47288] Updated weights for policy 0, policy_version 41706 (0.0031) [2024-04-26 01:18:43,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 683327488. Throughput: 0: 55830.9. Samples: 632656240. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 01:18:43,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 01:18:46,512][47288] Updated weights for policy 0, policy_version 41716 (0.0029) [2024-04-26 01:18:48,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 683622400. Throughput: 0: 55798.0. Samples: 632982600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 01:18:48,923][47056] Avg episode reward: [(0, '0.267')] [2024-04-26 01:18:48,930][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000041725_683622400.pth... [2024-04-26 01:18:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000040908_670236672.pth [2024-04-26 01:18:49,201][47288] Updated weights for policy 0, policy_version 41726 (0.0033) [2024-04-26 01:18:52,277][47288] Updated weights for policy 0, policy_version 41736 (0.0029) [2024-04-26 01:18:53,923][47056] Fps is (10 sec: 57345.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 683900928. Throughput: 0: 55719.2. Samples: 633317840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 01:18:53,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 01:18:55,079][47288] Updated weights for policy 0, policy_version 41746 (0.0029) [2024-04-26 01:18:56,158][47267] Signal inference workers to stop experience collection... (9150 times) [2024-04-26 01:18:56,158][47267] Signal inference workers to resume experience collection... (9150 times) [2024-04-26 01:18:56,172][47288] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-04-26 01:18:56,172][47288] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-04-26 01:18:57,968][47288] Updated weights for policy 0, policy_version 41756 (0.0028) [2024-04-26 01:18:58,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 684163072. Throughput: 0: 55573.8. Samples: 633482960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 01:18:58,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 01:19:00,807][47288] Updated weights for policy 0, policy_version 41766 (0.0031) [2024-04-26 01:19:03,804][47288] Updated weights for policy 0, policy_version 41776 (0.0031) [2024-04-26 01:19:03,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 684457984. Throughput: 0: 55586.9. Samples: 633821100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 01:19:03,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 01:19:06,753][47288] Updated weights for policy 0, policy_version 41786 (0.0031) [2024-04-26 01:19:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 684720128. Throughput: 0: 55549.9. Samples: 634151220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 01:19:08,923][47056] Avg episode reward: [(0, '0.273')] [2024-04-26 01:19:09,614][47288] Updated weights for policy 0, policy_version 41796 (0.0027) [2024-04-26 01:19:12,610][47288] Updated weights for policy 0, policy_version 41806 (0.0034) [2024-04-26 01:19:13,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 684998656. Throughput: 0: 55818.7. Samples: 634324180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 01:19:13,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 01:19:15,766][47288] Updated weights for policy 0, policy_version 41816 (0.0027) [2024-04-26 01:19:18,612][47288] Updated weights for policy 0, policy_version 41826 (0.0035) [2024-04-26 01:19:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 685277184. Throughput: 0: 55586.7. Samples: 634655080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 01:19:18,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 01:19:21,516][47288] Updated weights for policy 0, policy_version 41836 (0.0028) [2024-04-26 01:19:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 685572096. Throughput: 0: 55586.8. Samples: 634990340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 01:19:23,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 01:19:24,398][47288] Updated weights for policy 0, policy_version 41846 (0.0028) [2024-04-26 01:19:27,409][47288] Updated weights for policy 0, policy_version 41856 (0.0030) [2024-04-26 01:19:28,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 685850624. Throughput: 0: 55641.2. Samples: 635160080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 01:19:28,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 01:19:28,933][47267] Saving new best policy, reward=0.441! [2024-04-26 01:19:30,140][47288] Updated weights for policy 0, policy_version 41866 (0.0034) [2024-04-26 01:19:33,397][47288] Updated weights for policy 0, policy_version 41876 (0.0030) [2024-04-26 01:19:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 686112768. Throughput: 0: 55812.2. Samples: 635494140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:19:33,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 01:19:36,011][47288] Updated weights for policy 0, policy_version 41886 (0.0034) [2024-04-26 01:19:38,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 686391296. Throughput: 0: 55751.9. Samples: 635826680. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:19:38,923][47056] Avg episode reward: [(0, '0.280')] [2024-04-26 01:19:39,073][47288] Updated weights for policy 0, policy_version 41896 (0.0031) [2024-04-26 01:19:42,063][47288] Updated weights for policy 0, policy_version 41906 (0.0024) [2024-04-26 01:19:43,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 686686208. Throughput: 0: 55783.3. Samples: 635993200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:19:43,923][47056] Avg episode reward: [(0, '0.298')] [2024-04-26 01:19:44,761][47288] Updated weights for policy 0, policy_version 41916 (0.0033) [2024-04-26 01:19:47,749][47288] Updated weights for policy 0, policy_version 41926 (0.0026) [2024-04-26 01:19:48,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 686964736. Throughput: 0: 55770.7. Samples: 636330780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:19:48,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 01:19:50,649][47288] Updated weights for policy 0, policy_version 41936 (0.0033) [2024-04-26 01:19:53,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 687226880. Throughput: 0: 55873.8. Samples: 636665540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:19:53,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 01:19:53,991][47288] Updated weights for policy 0, policy_version 41946 (0.0033) [2024-04-26 01:19:56,498][47288] Updated weights for policy 0, policy_version 41956 (0.0024) [2024-04-26 01:19:58,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 687538176. Throughput: 0: 55635.0. Samples: 636827760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:19:58,932][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 01:19:59,931][47288] Updated weights for policy 0, policy_version 41966 (0.0028) [2024-04-26 01:20:02,515][47288] Updated weights for policy 0, policy_version 41976 (0.0028) [2024-04-26 01:20:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 687800320. Throughput: 0: 55770.9. Samples: 637164760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:20:03,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 01:20:05,728][47288] Updated weights for policy 0, policy_version 41986 (0.0030) [2024-04-26 01:20:08,202][47288] Updated weights for policy 0, policy_version 41996 (0.0029) [2024-04-26 01:20:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 688078848. Throughput: 0: 55815.0. Samples: 637502020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:20:08,924][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 01:20:11,418][47288] Updated weights for policy 0, policy_version 42006 (0.0027) [2024-04-26 01:20:13,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 688357376. Throughput: 0: 55762.2. Samples: 637669380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:20:13,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 01:20:14,229][47288] Updated weights for policy 0, policy_version 42016 (0.0026) [2024-04-26 01:20:16,232][47267] Signal inference workers to stop experience collection... (9200 times) [2024-04-26 01:20:16,232][47267] Signal inference workers to resume experience collection... 
(9200 times) [2024-04-26 01:20:16,245][47288] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-04-26 01:20:16,245][47288] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-04-26 01:20:17,100][47288] Updated weights for policy 0, policy_version 42026 (0.0028) [2024-04-26 01:20:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 688619520. Throughput: 0: 55771.5. Samples: 638003860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:20:18,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 01:20:20,090][47288] Updated weights for policy 0, policy_version 42036 (0.0030) [2024-04-26 01:20:23,052][47288] Updated weights for policy 0, policy_version 42046 (0.0030) [2024-04-26 01:20:23,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 688914432. Throughput: 0: 55880.1. Samples: 638341280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:20:23,923][47056] Avg episode reward: [(0, '0.290')] [2024-04-26 01:20:25,832][47288] Updated weights for policy 0, policy_version 42056 (0.0027) [2024-04-26 01:20:28,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 689192960. Throughput: 0: 55831.9. Samples: 638505640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:20:28,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 01:20:29,101][47288] Updated weights for policy 0, policy_version 42066 (0.0031) [2024-04-26 01:20:31,530][47288] Updated weights for policy 0, policy_version 42076 (0.0029) [2024-04-26 01:20:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 689471488. Throughput: 0: 55806.4. Samples: 638842060. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:20:33,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 01:20:34,880][47288] Updated weights for policy 0, policy_version 42086 (0.0031) [2024-04-26 01:20:37,416][47288] Updated weights for policy 0, policy_version 42096 (0.0033) [2024-04-26 01:20:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 689766400. Throughput: 0: 55764.8. Samples: 639174960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:20:38,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 01:20:40,992][47288] Updated weights for policy 0, policy_version 42106 (0.0027) [2024-04-26 01:20:43,345][47288] Updated weights for policy 0, policy_version 42116 (0.0026) [2024-04-26 01:20:43,923][47056] Fps is (10 sec: 55704.1, 60 sec: 55705.3, 300 sec: 55761.1). Total num frames: 690028544. Throughput: 0: 55994.9. Samples: 639347540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:20:43,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 01:20:46,900][47288] Updated weights for policy 0, policy_version 42126 (0.0034) [2024-04-26 01:20:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 690323456. Throughput: 0: 55840.2. Samples: 639677580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:20:48,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 01:20:48,979][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000042135_690339840.pth... [2024-04-26 01:20:49,023][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000041317_676937728.pth [2024-04-26 01:20:49,418][47288] Updated weights for policy 0, policy_version 42136 (0.0024) [2024-04-26 01:20:52,724][47288] Updated weights for policy 0, policy_version 42146 (0.0031) [2024-04-26 01:20:53,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 690569216. 
Throughput: 0: 55817.7. Samples: 640013820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 01:20:53,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 01:20:55,380][47288] Updated weights for policy 0, policy_version 42156 (0.0029) [2024-04-26 01:20:58,691][47288] Updated weights for policy 0, policy_version 42166 (0.0028) [2024-04-26 01:20:58,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 690864128. Throughput: 0: 55749.6. Samples: 640178120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 01:20:58,924][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 01:21:01,384][47288] Updated weights for policy 0, policy_version 42176 (0.0034) [2024-04-26 01:21:03,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 691159040. Throughput: 0: 55774.7. Samples: 640513720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 01:21:03,923][47056] Avg episode reward: [(0, '0.255')] [2024-04-26 01:21:04,561][47288] Updated weights for policy 0, policy_version 42186 (0.0029) [2024-04-26 01:21:07,261][47288] Updated weights for policy 0, policy_version 42196 (0.0031) [2024-04-26 01:21:08,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 691421184. Throughput: 0: 55825.6. Samples: 640853440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 01:21:08,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 01:21:10,277][47288] Updated weights for policy 0, policy_version 42206 (0.0025) [2024-04-26 01:21:13,056][47288] Updated weights for policy 0, policy_version 42216 (0.0030) [2024-04-26 01:21:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 691699712. Throughput: 0: 55902.2. Samples: 641021240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 01:21:13,923][47056] Avg episode reward: [(0, '0.243')] [2024-04-26 01:21:16,120][47288] Updated weights for policy 0, policy_version 42226 (0.0027) [2024-04-26 01:21:18,830][47288] Updated weights for policy 0, policy_version 42236 (0.0032) [2024-04-26 01:21:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 691994624. Throughput: 0: 55814.0. Samples: 641353700. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 01:21:18,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 01:21:22,130][47288] Updated weights for policy 0, policy_version 42246 (0.0035) [2024-04-26 01:21:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 692273152. Throughput: 0: 55820.0. Samples: 641686860. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 01:21:23,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:21:24,812][47288] Updated weights for policy 0, policy_version 42256 (0.0032) [2024-04-26 01:21:28,070][47288] Updated weights for policy 0, policy_version 42266 (0.0029) [2024-04-26 01:21:28,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 692535296. Throughput: 0: 55654.6. Samples: 641851980. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 01:21:28,923][47056] Avg episode reward: [(0, '0.287')] [2024-04-26 01:21:28,963][47267] Signal inference workers to stop experience collection... (9250 times) [2024-04-26 01:21:29,008][47288] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-04-26 01:21:29,019][47267] Signal inference workers to resume experience collection... 
(9250 times) [2024-04-26 01:21:29,025][47288] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-04-26 01:21:30,672][47288] Updated weights for policy 0, policy_version 42276 (0.0029) [2024-04-26 01:21:33,817][47288] Updated weights for policy 0, policy_version 42286 (0.0033) [2024-04-26 01:21:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 692813824. Throughput: 0: 55809.5. Samples: 642189000. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 01:21:33,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 01:21:36,379][47288] Updated weights for policy 0, policy_version 42296 (0.0026) [2024-04-26 01:21:38,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 693108736. Throughput: 0: 55757.8. Samples: 642522920. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 01:21:38,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 01:21:39,563][47288] Updated weights for policy 0, policy_version 42306 (0.0031) [2024-04-26 01:21:42,345][47288] Updated weights for policy 0, policy_version 42316 (0.0026) [2024-04-26 01:21:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.9, 300 sec: 55761.1). Total num frames: 693387264. Throughput: 0: 55879.4. Samples: 642692680. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 01:21:43,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 01:21:45,508][47288] Updated weights for policy 0, policy_version 42326 (0.0030) [2024-04-26 01:21:48,480][47288] Updated weights for policy 0, policy_version 42336 (0.0028) [2024-04-26 01:21:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 693649408. Throughput: 0: 55906.2. Samples: 643029500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:21:48,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:21:51,543][47288] Updated weights for policy 0, policy_version 42346 (0.0028) [2024-04-26 01:21:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 693944320. Throughput: 0: 55717.9. Samples: 643360740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:21:53,923][47056] Avg episode reward: [(0, '0.284')] [2024-04-26 01:21:54,350][47288] Updated weights for policy 0, policy_version 42356 (0.0026) [2024-04-26 01:21:57,307][47288] Updated weights for policy 0, policy_version 42366 (0.0031) [2024-04-26 01:21:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 694222848. Throughput: 0: 55548.9. Samples: 643520940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:21:58,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 01:22:00,149][47288] Updated weights for policy 0, policy_version 42376 (0.0037) [2024-04-26 01:22:03,249][47288] Updated weights for policy 0, policy_version 42386 (0.0030) [2024-04-26 01:22:03,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 694484992. Throughput: 0: 55719.6. Samples: 643861080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:22:03,923][47056] Avg episode reward: [(0, '0.268')] [2024-04-26 01:22:05,993][47288] Updated weights for policy 0, policy_version 42396 (0.0025) [2024-04-26 01:22:08,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 694747136. Throughput: 0: 55850.5. Samples: 644200140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:22:08,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 01:22:09,036][47288] Updated weights for policy 0, policy_version 42406 (0.0027) [2024-04-26 01:22:11,914][47288] Updated weights for policy 0, policy_version 42416 (0.0031) [2024-04-26 01:22:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 695058432. Throughput: 0: 55744.3. Samples: 644360480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 01:22:13,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:22:14,901][47288] Updated weights for policy 0, policy_version 42426 (0.0031) [2024-04-26 01:22:17,673][47288] Updated weights for policy 0, policy_version 42436 (0.0028) [2024-04-26 01:22:18,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 695336960. Throughput: 0: 55674.9. Samples: 644694380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 01:22:18,923][47056] Avg episode reward: [(0, '0.280')] [2024-04-26 01:22:20,743][47288] Updated weights for policy 0, policy_version 42446 (0.0032) [2024-04-26 01:22:23,592][47288] Updated weights for policy 0, policy_version 42456 (0.0030) [2024-04-26 01:22:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 695615488. Throughput: 0: 55822.2. Samples: 645034920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 01:22:23,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 01:22:26,480][47288] Updated weights for policy 0, policy_version 42466 (0.0027) [2024-04-26 01:22:28,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 695877632. Throughput: 0: 55826.6. Samples: 645204880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 01:22:28,923][47056] Avg episode reward: [(0, '0.262')] [2024-04-26 01:22:29,369][47288] Updated weights for policy 0, policy_version 42476 (0.0028) [2024-04-26 01:22:32,237][47288] Updated weights for policy 0, policy_version 42486 (0.0032) [2024-04-26 01:22:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 696156160. Throughput: 0: 55719.9. Samples: 645536900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 01:22:33,923][47056] Avg episode reward: [(0, '0.291')] [2024-04-26 01:22:35,191][47288] Updated weights for policy 0, policy_version 42496 (0.0048) [2024-04-26 01:22:38,176][47288] Updated weights for policy 0, policy_version 42506 (0.0028) [2024-04-26 01:22:38,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 696434688. Throughput: 0: 55871.1. Samples: 645874940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 01:22:38,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 01:22:39,518][47267] Signal inference workers to stop experience collection... (9300 times) [2024-04-26 01:22:39,548][47288] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-04-26 01:22:39,601][47267] Signal inference workers to resume experience collection... (9300 times) [2024-04-26 01:22:39,601][47288] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-04-26 01:22:40,972][47288] Updated weights for policy 0, policy_version 42516 (0.0033) [2024-04-26 01:22:43,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 696729600. Throughput: 0: 55971.2. Samples: 646039640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 01:22:43,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 01:22:43,967][47288] Updated weights for policy 0, policy_version 42526 (0.0026) [2024-04-26 01:22:46,729][47288] Updated weights for policy 0, policy_version 42536 (0.0034) [2024-04-26 01:22:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 697008128. Throughput: 0: 55978.3. Samples: 646380100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 01:22:48,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 01:22:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000042542_697008128.pth... [2024-04-26 01:22:48,977][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000041725_683622400.pth [2024-04-26 01:22:49,858][47288] Updated weights for policy 0, policy_version 42546 (0.0037) [2024-04-26 01:22:52,675][47288] Updated weights for policy 0, policy_version 42556 (0.0032) [2024-04-26 01:22:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 697303040. Throughput: 0: 55888.6. Samples: 646715120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 01:22:53,923][47056] Avg episode reward: [(0, '0.292')] [2024-04-26 01:22:55,855][47288] Updated weights for policy 0, policy_version 42566 (0.0031) [2024-04-26 01:22:58,572][47288] Updated weights for policy 0, policy_version 42576 (0.0028) [2024-04-26 01:22:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 697581568. Throughput: 0: 56029.4. Samples: 646881800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 01:22:58,923][47056] Avg episode reward: [(0, '0.245')] [2024-04-26 01:23:01,811][47288] Updated weights for policy 0, policy_version 42586 (0.0025) [2024-04-26 01:23:03,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 697843712. 
Throughput: 0: 56148.7. Samples: 647221060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 01:23:03,923][47056] Avg episode reward: [(0, '0.325')] [2024-04-26 01:23:04,375][47288] Updated weights for policy 0, policy_version 42596 (0.0031) [2024-04-26 01:23:07,621][47288] Updated weights for policy 0, policy_version 42606 (0.0024) [2024-04-26 01:23:08,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 698105856. Throughput: 0: 56081.9. Samples: 647558600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:23:08,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 01:23:10,240][47288] Updated weights for policy 0, policy_version 42616 (0.0030) [2024-04-26 01:23:13,449][47288] Updated weights for policy 0, policy_version 42626 (0.0029) [2024-04-26 01:23:13,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 698400768. Throughput: 0: 55896.0. Samples: 647720200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:23:13,924][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 01:23:16,323][47288] Updated weights for policy 0, policy_version 42636 (0.0034) [2024-04-26 01:23:18,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 698695680. Throughput: 0: 55837.4. Samples: 648049580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:23:18,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 01:23:19,174][47288] Updated weights for policy 0, policy_version 42646 (0.0032) [2024-04-26 01:23:21,540][47267] Signal inference workers to stop experience collection... (9350 times) [2024-04-26 01:23:21,541][47267] Signal inference workers to resume experience collection... 
(9350 times) [2024-04-26 01:23:21,564][47288] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-04-26 01:23:21,565][47288] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-04-26 01:23:22,218][47288] Updated weights for policy 0, policy_version 42656 (0.0029) [2024-04-26 01:23:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 698957824. Throughput: 0: 55855.4. Samples: 648388440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:23:23,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 01:23:25,017][47288] Updated weights for policy 0, policy_version 42666 (0.0027) [2024-04-26 01:23:27,944][47288] Updated weights for policy 0, policy_version 42676 (0.0027) [2024-04-26 01:23:28,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 699252736. Throughput: 0: 56011.9. Samples: 648560180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 01:23:28,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 01:23:30,868][47288] Updated weights for policy 0, policy_version 42686 (0.0031) [2024-04-26 01:23:33,812][47288] Updated weights for policy 0, policy_version 42696 (0.0037) [2024-04-26 01:23:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 699531264. Throughput: 0: 55952.4. Samples: 648897960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:23:33,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 01:23:36,728][47288] Updated weights for policy 0, policy_version 42706 (0.0029) [2024-04-26 01:23:38,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 699793408. Throughput: 0: 55903.2. Samples: 649230760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:23:38,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:23:39,601][47288] Updated weights for policy 0, policy_version 42716 (0.0027) [2024-04-26 01:23:42,745][47288] Updated weights for policy 0, policy_version 42726 (0.0026) [2024-04-26 01:23:43,922][47056] Fps is (10 sec: 55706.8, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 700088320. Throughput: 0: 55879.8. Samples: 649396380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:23:43,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 01:23:45,494][47288] Updated weights for policy 0, policy_version 42736 (0.0025) [2024-04-26 01:23:48,600][47288] Updated weights for policy 0, policy_version 42746 (0.0032) [2024-04-26 01:23:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 700350464. Throughput: 0: 55849.2. Samples: 649734280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:23:48,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 01:23:51,329][47288] Updated weights for policy 0, policy_version 42756 (0.0027) [2024-04-26 01:23:53,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 700645376. Throughput: 0: 55738.2. Samples: 650066820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:23:53,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 01:23:54,299][47288] Updated weights for policy 0, policy_version 42766 (0.0028) [2024-04-26 01:23:57,105][47288] Updated weights for policy 0, policy_version 42776 (0.0029) [2024-04-26 01:23:58,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 700940288. Throughput: 0: 56125.5. Samples: 650245840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:23:58,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 01:24:00,032][47288] Updated weights for policy 0, policy_version 42786 (0.0031) [2024-04-26 01:24:03,093][47288] Updated weights for policy 0, policy_version 42796 (0.0026) [2024-04-26 01:24:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55927.8). Total num frames: 701218816. Throughput: 0: 56205.4. Samples: 650578820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:03,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 01:24:06,025][47288] Updated weights for policy 0, policy_version 42806 (0.0035) [2024-04-26 01:24:08,885][47288] Updated weights for policy 0, policy_version 42816 (0.0028) [2024-04-26 01:24:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 701497344. Throughput: 0: 56096.5. Samples: 650912780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:08,923][47056] Avg episode reward: [(0, '0.297')] [2024-04-26 01:24:11,945][47288] Updated weights for policy 0, policy_version 42826 (0.0029) [2024-04-26 01:24:13,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 701759488. Throughput: 0: 55955.2. Samples: 651078160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:13,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 01:24:14,593][47288] Updated weights for policy 0, policy_version 42836 (0.0029) [2024-04-26 01:24:17,778][47288] Updated weights for policy 0, policy_version 42846 (0.0032) [2024-04-26 01:24:18,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 702054400. Throughput: 0: 56024.3. Samples: 651419060. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:18,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 01:24:20,428][47288] Updated weights for policy 0, policy_version 42856 (0.0025) [2024-04-26 01:24:23,628][47288] Updated weights for policy 0, policy_version 42866 (0.0030) [2024-04-26 01:24:23,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 702332928. Throughput: 0: 56032.1. Samples: 651752200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:23,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 01:24:26,377][47288] Updated weights for policy 0, policy_version 42876 (0.0026) [2024-04-26 01:24:28,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 702595072. Throughput: 0: 55962.0. Samples: 651914680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:28,924][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 01:24:29,542][47288] Updated weights for policy 0, policy_version 42886 (0.0032) [2024-04-26 01:24:31,746][47267] Signal inference workers to stop experience collection... (9400 times) [2024-04-26 01:24:31,746][47267] Signal inference workers to resume experience collection... (9400 times) [2024-04-26 01:24:31,759][47288] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-04-26 01:24:31,759][47288] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-04-26 01:24:32,076][47288] Updated weights for policy 0, policy_version 42896 (0.0030) [2024-04-26 01:24:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 702889984. Throughput: 0: 56010.3. Samples: 652254740. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:33,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 01:24:35,420][47288] Updated weights for policy 0, policy_version 42906 (0.0025) [2024-04-26 01:24:38,172][47288] Updated weights for policy 0, policy_version 42916 (0.0029) [2024-04-26 01:24:38,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 703184896. Throughput: 0: 56067.0. Samples: 652589840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:38,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:24:41,131][47288] Updated weights for policy 0, policy_version 42926 (0.0030) [2024-04-26 01:24:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 703447040. Throughput: 0: 55705.0. Samples: 652752560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:43,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 01:24:43,969][47288] Updated weights for policy 0, policy_version 42936 (0.0026) [2024-04-26 01:24:46,773][47288] Updated weights for policy 0, policy_version 42946 (0.0027) [2024-04-26 01:24:48,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 703725568. Throughput: 0: 55848.4. Samples: 653092000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:48,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:24:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000042952_703725568.pth... [2024-04-26 01:24:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000042135_690339840.pth [2024-04-26 01:24:49,999][47288] Updated weights for policy 0, policy_version 42956 (0.0025) [2024-04-26 01:24:52,677][47288] Updated weights for policy 0, policy_version 42966 (0.0027) [2024-04-26 01:24:53,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 704004096. 
Throughput: 0: 55826.7. Samples: 653424980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 01:24:53,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 01:24:55,702][47288] Updated weights for policy 0, policy_version 42976 (0.0026) [2024-04-26 01:24:58,516][47288] Updated weights for policy 0, policy_version 42986 (0.0026) [2024-04-26 01:24:58,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 704315392. Throughput: 0: 56092.9. Samples: 653602340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:24:58,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 01:25:01,373][47288] Updated weights for policy 0, policy_version 42996 (0.0030) [2024-04-26 01:25:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 704577536. Throughput: 0: 55974.9. Samples: 653937920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:25:03,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 01:25:04,308][47288] Updated weights for policy 0, policy_version 43006 (0.0031) [2024-04-26 01:25:07,289][47288] Updated weights for policy 0, policy_version 43016 (0.0030) [2024-04-26 01:25:08,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 704839680. Throughput: 0: 56076.2. Samples: 654275640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:25:08,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:25:10,438][47288] Updated weights for policy 0, policy_version 43026 (0.0033) [2024-04-26 01:25:13,068][47288] Updated weights for policy 0, policy_version 43036 (0.0030) [2024-04-26 01:25:13,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 705118208. Throughput: 0: 56262.4. Samples: 654446480. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:25:13,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 01:25:16,191][47288] Updated weights for policy 0, policy_version 43046 (0.0032) [2024-04-26 01:25:18,922][47288] Updated weights for policy 0, policy_version 43056 (0.0028) [2024-04-26 01:25:18,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56251.9, 300 sec: 55983.3). Total num frames: 705429504. Throughput: 0: 56102.6. Samples: 654779360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:25:18,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 01:25:22,010][47288] Updated weights for policy 0, policy_version 43066 (0.0035) [2024-04-26 01:25:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 705691648. Throughput: 0: 56159.7. Samples: 655117020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:25:23,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 01:25:25,195][47288] Updated weights for policy 0, policy_version 43076 (0.0036) [2024-04-26 01:25:25,939][47267] Signal inference workers to stop experience collection... (9450 times) [2024-04-26 01:25:25,939][47267] Signal inference workers to resume experience collection... (9450 times) [2024-04-26 01:25:25,954][47288] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-04-26 01:25:25,954][47288] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-04-26 01:25:27,764][47288] Updated weights for policy 0, policy_version 43086 (0.0030) [2024-04-26 01:25:28,923][47056] Fps is (10 sec: 50790.0, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 705937408. Throughput: 0: 56069.5. Samples: 655275700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:25:28,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 01:25:31,119][47288] Updated weights for policy 0, policy_version 43096 (0.0032) [2024-04-26 01:25:33,527][47288] Updated weights for policy 0, policy_version 43106 (0.0029) [2024-04-26 01:25:33,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 706265088. Throughput: 0: 56024.4. Samples: 655613100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:25:33,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 01:25:36,956][47288] Updated weights for policy 0, policy_version 43116 (0.0026) [2024-04-26 01:25:38,923][47056] Fps is (10 sec: 62259.6, 60 sec: 56251.7, 300 sec: 56038.9). Total num frames: 706560000. Throughput: 0: 56150.2. Samples: 655951740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:25:38,923][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 01:25:39,258][47288] Updated weights for policy 0, policy_version 43126 (0.0031) [2024-04-26 01:25:42,878][47288] Updated weights for policy 0, policy_version 43136 (0.0025) [2024-04-26 01:25:43,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 706789376. Throughput: 0: 55936.6. Samples: 656119480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 01:25:43,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 01:25:45,128][47288] Updated weights for policy 0, policy_version 43146 (0.0029) [2024-04-26 01:25:48,697][47288] Updated weights for policy 0, policy_version 43156 (0.0034) [2024-04-26 01:25:48,923][47056] Fps is (10 sec: 50790.3, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 707067904. Throughput: 0: 55787.1. Samples: 656448340. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-04-26 01:25:48,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 01:25:50,992][47288] Updated weights for policy 0, policy_version 43166 (0.0031) [2024-04-26 01:25:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 707362816. Throughput: 0: 55670.2. Samples: 656780800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-04-26 01:25:53,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 01:25:54,484][47288] Updated weights for policy 0, policy_version 43176 (0.0027) [2024-04-26 01:25:56,960][47288] Updated weights for policy 0, policy_version 43186 (0.0030) [2024-04-26 01:25:58,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 707624960. Throughput: 0: 55545.8. Samples: 656946040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-04-26 01:25:58,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 01:26:00,454][47288] Updated weights for policy 0, policy_version 43196 (0.0024) [2024-04-26 01:26:02,662][47288] Updated weights for policy 0, policy_version 43206 (0.0027) [2024-04-26 01:26:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 707903488. Throughput: 0: 55553.3. Samples: 657279260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-04-26 01:26:03,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 01:26:06,055][47267] Signal inference workers to stop experience collection... (9500 times) [2024-04-26 01:26:06,105][47288] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-04-26 01:26:06,107][47267] Signal inference workers to resume experience collection... 
(9500 times) [2024-04-26 01:26:06,113][47288] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-04-26 01:26:06,348][47288] Updated weights for policy 0, policy_version 43216 (0.0028) [2024-04-26 01:26:08,410][47288] Updated weights for policy 0, policy_version 43226 (0.0028) [2024-04-26 01:26:08,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 708214784. Throughput: 0: 55397.6. Samples: 657609920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-04-26 01:26:08,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 01:26:12,247][47288] Updated weights for policy 0, policy_version 43236 (0.0038) [2024-04-26 01:26:13,923][47056] Fps is (10 sec: 60620.7, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 708509696. Throughput: 0: 55870.8. Samples: 657789880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-04-26 01:26:13,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 01:26:14,264][47288] Updated weights for policy 0, policy_version 43246 (0.0027) [2024-04-26 01:26:18,169][47288] Updated weights for policy 0, policy_version 43256 (0.0031) [2024-04-26 01:26:18,923][47056] Fps is (10 sec: 52425.6, 60 sec: 55158.8, 300 sec: 55816.6). Total num frames: 708739072. Throughput: 0: 55879.2. Samples: 658127700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:26:18,924][47056] Avg episode reward: [(0, '0.347')] [2024-04-26 01:26:20,349][47288] Updated weights for policy 0, policy_version 43266 (0.0030) [2024-04-26 01:26:23,923][47056] Fps is (10 sec: 49152.0, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 709001216. Throughput: 0: 55792.5. Samples: 658462400. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:26:23,923][47056] Avg episode reward: [(0, '0.330')] [2024-04-26 01:26:24,218][47288] Updated weights for policy 0, policy_version 43276 (0.0035) [2024-04-26 01:26:26,789][47288] Updated weights for policy 0, policy_version 43286 (0.0029) [2024-04-26 01:26:28,923][47056] Fps is (10 sec: 55709.4, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 709296128. Throughput: 0: 55510.2. Samples: 658617440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:26:28,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 01:26:30,026][47288] Updated weights for policy 0, policy_version 43296 (0.0029) [2024-04-26 01:26:32,703][47288] Updated weights for policy 0, policy_version 43306 (0.0026) [2024-04-26 01:26:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 709574656. Throughput: 0: 55696.5. Samples: 658954680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:26:33,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 01:26:35,859][47288] Updated weights for policy 0, policy_version 43316 (0.0032) [2024-04-26 01:26:38,544][47288] Updated weights for policy 0, policy_version 43326 (0.0031) [2024-04-26 01:26:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 709869568. Throughput: 0: 55760.0. Samples: 659290000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 01:26:38,924][47056] Avg episode reward: [(0, '0.298')] [2024-04-26 01:26:41,771][47288] Updated weights for policy 0, policy_version 43336 (0.0030) [2024-04-26 01:26:43,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 710148096. Throughput: 0: 55835.0. Samples: 659458620. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 22.0) [2024-04-26 01:26:43,923][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 01:26:44,374][47288] Updated weights for policy 0, policy_version 43346 (0.0033) [2024-04-26 01:26:47,602][47288] Updated weights for policy 0, policy_version 43356 (0.0027) [2024-04-26 01:26:48,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 710426624. Throughput: 0: 55861.4. Samples: 659793020. Policy #0 lag: (min: 1.0, avg: 12.8, max: 22.0) [2024-04-26 01:26:48,923][47056] Avg episode reward: [(0, '0.291')] [2024-04-26 01:26:48,943][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000043362_710443008.pth... [2024-04-26 01:26:48,996][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000042542_697008128.pth [2024-04-26 01:26:50,295][47288] Updated weights for policy 0, policy_version 43366 (0.0026) [2024-04-26 01:26:52,019][47267] Signal inference workers to stop experience collection... (9550 times) [2024-04-26 01:26:52,019][47267] Signal inference workers to resume experience collection... (9550 times) [2024-04-26 01:26:52,048][47288] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-04-26 01:26:52,048][47288] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-04-26 01:26:53,449][47288] Updated weights for policy 0, policy_version 43376 (0.0028) [2024-04-26 01:26:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 710688768. Throughput: 0: 55857.4. Samples: 660123500. Policy #0 lag: (min: 1.0, avg: 12.8, max: 22.0) [2024-04-26 01:26:53,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:26:56,172][47288] Updated weights for policy 0, policy_version 43386 (0.0028) [2024-04-26 01:26:58,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 710967296. Throughput: 0: 55573.4. Samples: 660290680. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 22.0) [2024-04-26 01:26:58,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 01:26:59,381][47288] Updated weights for policy 0, policy_version 43396 (0.0027) [2024-04-26 01:27:01,849][47288] Updated weights for policy 0, policy_version 43406 (0.0029) [2024-04-26 01:27:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 711245824. Throughput: 0: 55532.4. Samples: 660626620. Policy #0 lag: (min: 1.0, avg: 12.8, max: 22.0) [2024-04-26 01:27:03,923][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 01:27:05,288][47288] Updated weights for policy 0, policy_version 43416 (0.0028) [2024-04-26 01:27:07,642][47288] Updated weights for policy 0, policy_version 43426 (0.0024) [2024-04-26 01:27:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 711524352. Throughput: 0: 55547.2. Samples: 660962020. Policy #0 lag: (min: 1.0, avg: 12.8, max: 22.0) [2024-04-26 01:27:08,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 01:27:11,303][47288] Updated weights for policy 0, policy_version 43436 (0.0037) [2024-04-26 01:27:13,663][47288] Updated weights for policy 0, policy_version 43446 (0.0028) [2024-04-26 01:27:13,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 711819264. Throughput: 0: 55767.1. Samples: 661126960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 01:27:13,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 01:27:17,110][47288] Updated weights for policy 0, policy_version 43456 (0.0025) [2024-04-26 01:27:18,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56252.3, 300 sec: 55927.8). Total num frames: 712114176. Throughput: 0: 55720.9. Samples: 661462120. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 01:27:18,923][47056] Avg episode reward: [(0, '0.264')] [2024-04-26 01:27:19,476][47288] Updated weights for policy 0, policy_version 43466 (0.0033) [2024-04-26 01:27:22,900][47288] Updated weights for policy 0, policy_version 43476 (0.0034) [2024-04-26 01:27:23,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 712392704. Throughput: 0: 55748.6. Samples: 661798680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 01:27:23,923][47056] Avg episode reward: [(0, '0.270')] [2024-04-26 01:27:25,211][47288] Updated weights for policy 0, policy_version 43486 (0.0027) [2024-04-26 01:27:28,787][47288] Updated weights for policy 0, policy_version 43496 (0.0029) [2024-04-26 01:27:28,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 712638464. Throughput: 0: 55773.7. Samples: 661968440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 01:27:28,923][47056] Avg episode reward: [(0, '0.295')] [2024-04-26 01:27:31,459][47288] Updated weights for policy 0, policy_version 43506 (0.0033) [2024-04-26 01:27:33,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 712933376. Throughput: 0: 55811.0. Samples: 662304520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 01:27:33,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 01:27:34,649][47288] Updated weights for policy 0, policy_version 43516 (0.0030) [2024-04-26 01:27:37,201][47288] Updated weights for policy 0, policy_version 43526 (0.0031) [2024-04-26 01:27:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 713195520. Throughput: 0: 55885.4. Samples: 662638340. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:27:38,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 01:27:40,614][47288] Updated weights for policy 0, policy_version 43536 (0.0029) [2024-04-26 01:27:43,125][47288] Updated weights for policy 0, policy_version 43546 (0.0030) [2024-04-26 01:27:43,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 713474048. Throughput: 0: 55695.9. Samples: 662797000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:27:43,923][47056] Avg episode reward: [(0, '0.223')] [2024-04-26 01:27:46,301][47288] Updated weights for policy 0, policy_version 43556 (0.0029) [2024-04-26 01:27:46,892][47267] Signal inference workers to stop experience collection... (9600 times) [2024-04-26 01:27:46,925][47288] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-04-26 01:27:46,951][47267] Signal inference workers to resume experience collection... (9600 times) [2024-04-26 01:27:46,956][47288] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-04-26 01:27:48,895][47288] Updated weights for policy 0, policy_version 43566 (0.0030) [2024-04-26 01:27:48,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 713785344. Throughput: 0: 55672.5. Samples: 663131880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:27:48,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:27:52,326][47288] Updated weights for policy 0, policy_version 43576 (0.0032) [2024-04-26 01:27:53,923][47056] Fps is (10 sec: 60620.8, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 714080256. Throughput: 0: 55655.4. Samples: 663466520. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:27:53,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 01:27:54,646][47288] Updated weights for policy 0, policy_version 43586 (0.0028) [2024-04-26 01:27:58,226][47288] Updated weights for policy 0, policy_version 43596 (0.0030) [2024-04-26 01:27:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 714342400. Throughput: 0: 55936.5. Samples: 663644100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:27:58,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 01:28:00,372][47288] Updated weights for policy 0, policy_version 43606 (0.0034) [2024-04-26 01:28:03,923][47056] Fps is (10 sec: 50790.7, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 714588160. Throughput: 0: 55883.6. Samples: 663976880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:28:03,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 01:28:04,015][47288] Updated weights for policy 0, policy_version 43616 (0.0026) [2024-04-26 01:28:06,255][47288] Updated weights for policy 0, policy_version 43626 (0.0030) [2024-04-26 01:28:08,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 714883072. Throughput: 0: 55819.2. Samples: 664310540. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-26 01:28:08,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 01:28:09,844][47288] Updated weights for policy 0, policy_version 43636 (0.0027) [2024-04-26 01:28:12,100][47288] Updated weights for policy 0, policy_version 43646 (0.0030) [2024-04-26 01:28:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 715161600. Throughput: 0: 55706.7. Samples: 664475240. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-26 01:28:13,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 01:28:15,743][47288] Updated weights for policy 0, policy_version 43656 (0.0026) [2024-04-26 01:28:18,433][47288] Updated weights for policy 0, policy_version 43666 (0.0029) [2024-04-26 01:28:18,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 715440128. Throughput: 0: 55587.0. Samples: 664805940. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-26 01:28:18,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 01:28:21,696][47288] Updated weights for policy 0, policy_version 43676 (0.0027) [2024-04-26 01:28:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 715735040. Throughput: 0: 55536.8. Samples: 665137500. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-26 01:28:23,923][47056] Avg episode reward: [(0, '0.318')] [2024-04-26 01:28:24,479][47288] Updated weights for policy 0, policy_version 43686 (0.0031) [2024-04-26 01:28:27,484][47288] Updated weights for policy 0, policy_version 43696 (0.0032) [2024-04-26 01:28:28,923][47056] Fps is (10 sec: 58983.9, 60 sec: 56525.0, 300 sec: 55927.8). Total num frames: 716029952. Throughput: 0: 55988.2. Samples: 665316460. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-26 01:28:28,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 01:28:30,257][47288] Updated weights for policy 0, policy_version 43706 (0.0033) [2024-04-26 01:28:33,358][47288] Updated weights for policy 0, policy_version 43716 (0.0027) [2024-04-26 01:28:33,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 716292096. Throughput: 0: 55960.9. Samples: 665650120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 01:28:33,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 01:28:35,956][47288] Updated weights for policy 0, policy_version 43726 (0.0033) [2024-04-26 01:28:38,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 716537856. Throughput: 0: 55982.4. Samples: 665985720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 01:28:38,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 01:28:38,934][47267] Signal inference workers to stop experience collection... (9650 times) [2024-04-26 01:28:38,934][47267] Signal inference workers to resume experience collection... (9650 times) [2024-04-26 01:28:38,959][47288] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-04-26 01:28:38,960][47288] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-04-26 01:28:39,183][47288] Updated weights for policy 0, policy_version 43736 (0.0028) [2024-04-26 01:28:41,945][47288] Updated weights for policy 0, policy_version 43746 (0.0028) [2024-04-26 01:28:43,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 716832768. Throughput: 0: 55578.8. Samples: 666145140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 01:28:43,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 01:28:45,116][47288] Updated weights for policy 0, policy_version 43756 (0.0031) [2024-04-26 01:28:48,010][47288] Updated weights for policy 0, policy_version 43766 (0.0027) [2024-04-26 01:28:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 717094912. Throughput: 0: 55630.6. Samples: 666480260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 01:28:48,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 01:28:48,965][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000043769_717111296.pth... 
[2024-04-26 01:28:49,016][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000042952_703725568.pth [2024-04-26 01:28:50,887][47288] Updated weights for policy 0, policy_version 43776 (0.0031) [2024-04-26 01:28:53,704][47288] Updated weights for policy 0, policy_version 43786 (0.0029) [2024-04-26 01:28:53,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 717389824. Throughput: 0: 55667.8. Samples: 666815600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 01:28:53,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 01:28:56,732][47288] Updated weights for policy 0, policy_version 43796 (0.0039) [2024-04-26 01:28:58,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 717684736. Throughput: 0: 55851.5. Samples: 666988560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 01:28:58,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 01:28:59,440][47288] Updated weights for policy 0, policy_version 43806 (0.0031) [2024-04-26 01:29:02,567][47288] Updated weights for policy 0, policy_version 43816 (0.0031) [2024-04-26 01:29:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 717963264. Throughput: 0: 55902.4. Samples: 667321540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:29:03,923][47056] Avg episode reward: [(0, '0.246')] [2024-04-26 01:29:05,500][47288] Updated weights for policy 0, policy_version 43826 (0.0030) [2024-04-26 01:29:08,391][47288] Updated weights for policy 0, policy_version 43836 (0.0031) [2024-04-26 01:29:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 718241792. Throughput: 0: 55869.4. Samples: 667651620. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:29:08,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 01:29:11,415][47288] Updated weights for policy 0, policy_version 43846 (0.0037) [2024-04-26 01:29:13,923][47056] Fps is (10 sec: 52429.7, 60 sec: 55432.7, 300 sec: 55705.7). Total num frames: 718487552. Throughput: 0: 55674.7. Samples: 667821820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:29:13,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 01:29:14,192][47288] Updated weights for policy 0, policy_version 43856 (0.0032) [2024-04-26 01:29:17,135][47288] Updated weights for policy 0, policy_version 43866 (0.0035) [2024-04-26 01:29:18,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 718782464. Throughput: 0: 55692.2. Samples: 668156260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:29:18,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 01:29:19,967][47288] Updated weights for policy 0, policy_version 43876 (0.0028) [2024-04-26 01:29:23,085][47288] Updated weights for policy 0, policy_version 43886 (0.0030) [2024-04-26 01:29:23,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 719044608. Throughput: 0: 55652.4. Samples: 668490080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:29:23,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 01:29:25,513][47267] Signal inference workers to stop experience collection... (9700 times) [2024-04-26 01:29:25,550][47288] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-04-26 01:29:25,600][47267] Signal inference workers to resume experience collection... 
(9700 times) [2024-04-26 01:29:25,600][47288] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-04-26 01:29:25,824][47288] Updated weights for policy 0, policy_version 43896 (0.0028) [2024-04-26 01:29:28,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 719339520. Throughput: 0: 55679.8. Samples: 668650740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:29:28,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 01:29:28,971][47288] Updated weights for policy 0, policy_version 43906 (0.0026) [2024-04-26 01:29:31,792][47288] Updated weights for policy 0, policy_version 43916 (0.0032) [2024-04-26 01:29:33,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 719618048. Throughput: 0: 55703.7. Samples: 668986920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:29:33,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 01:29:34,735][47288] Updated weights for policy 0, policy_version 43926 (0.0027) [2024-04-26 01:29:37,536][47288] Updated weights for policy 0, policy_version 43936 (0.0029) [2024-04-26 01:29:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 719912960. Throughput: 0: 55720.5. Samples: 669323020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:29:38,923][47056] Avg episode reward: [(0, '0.339')] [2024-04-26 01:29:40,473][47288] Updated weights for policy 0, policy_version 43946 (0.0028) [2024-04-26 01:29:43,567][47288] Updated weights for policy 0, policy_version 43956 (0.0031) [2024-04-26 01:29:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 720191488. Throughput: 0: 55801.8. Samples: 669499640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:29:43,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 01:29:46,435][47288] Updated weights for policy 0, policy_version 43966 (0.0030) [2024-04-26 01:29:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 720470016. Throughput: 0: 55826.3. Samples: 669833720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:29:48,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 01:29:49,479][47288] Updated weights for policy 0, policy_version 43976 (0.0028) [2024-04-26 01:29:52,450][47288] Updated weights for policy 0, policy_version 43986 (0.0028) [2024-04-26 01:29:53,923][47056] Fps is (10 sec: 50790.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 720699392. Throughput: 0: 55935.1. Samples: 670168700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 01:29:53,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:29:55,252][47288] Updated weights for policy 0, policy_version 43996 (0.0035) [2024-04-26 01:29:58,282][47288] Updated weights for policy 0, policy_version 44006 (0.0026) [2024-04-26 01:29:58,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 720994304. Throughput: 0: 55565.5. Samples: 670322280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-26 01:29:58,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 01:30:01,061][47288] Updated weights for policy 0, policy_version 44016 (0.0033) [2024-04-26 01:30:03,923][47056] Fps is (10 sec: 60620.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 721305600. Throughput: 0: 55589.2. Samples: 670657780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-26 01:30:03,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 01:30:04,106][47288] Updated weights for policy 0, policy_version 44026 (0.0027) [2024-04-26 01:30:06,977][47288] Updated weights for policy 0, policy_version 44036 (0.0039) [2024-04-26 01:30:08,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 721567744. Throughput: 0: 55516.5. Samples: 670988320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-26 01:30:08,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 01:30:10,324][47288] Updated weights for policy 0, policy_version 44046 (0.0027) [2024-04-26 01:30:13,076][47288] Updated weights for policy 0, policy_version 44056 (0.0033) [2024-04-26 01:30:13,322][47267] Signal inference workers to stop experience collection... (9750 times) [2024-04-26 01:30:13,333][47288] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-04-26 01:30:13,415][47267] Signal inference workers to resume experience collection... (9750 times) [2024-04-26 01:30:13,415][47288] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-04-26 01:30:13,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 721862656. Throughput: 0: 55909.9. Samples: 671166680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-26 01:30:13,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 01:30:16,501][47288] Updated weights for policy 0, policy_version 44066 (0.0031) [2024-04-26 01:30:18,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 722124800. Throughput: 0: 55762.6. Samples: 671496240. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-26 01:30:18,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 01:30:19,185][47288] Updated weights for policy 0, policy_version 44076 (0.0030) [2024-04-26 01:30:22,554][47288] Updated weights for policy 0, policy_version 44086 (0.0032) [2024-04-26 01:30:23,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 722386944. Throughput: 0: 55743.7. Samples: 671831480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-26 01:30:23,923][47056] Avg episode reward: [(0, '0.252')] [2024-04-26 01:30:25,074][47288] Updated weights for policy 0, policy_version 44096 (0.0032) [2024-04-26 01:30:28,451][47288] Updated weights for policy 0, policy_version 44106 (0.0032) [2024-04-26 01:30:28,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 722649088. Throughput: 0: 55297.0. Samples: 671988000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 01:30:28,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 01:30:30,805][47288] Updated weights for policy 0, policy_version 44116 (0.0026) [2024-04-26 01:30:33,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 722944000. Throughput: 0: 55257.2. Samples: 672320300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 01:30:33,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 01:30:34,163][47288] Updated weights for policy 0, policy_version 44126 (0.0037) [2024-04-26 01:30:36,650][47288] Updated weights for policy 0, policy_version 44136 (0.0027) [2024-04-26 01:30:38,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 723238912. Throughput: 0: 55235.5. Samples: 672654300. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 01:30:38,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 01:30:40,108][47288] Updated weights for policy 0, policy_version 44146 (0.0031) [2024-04-26 01:30:42,553][47288] Updated weights for policy 0, policy_version 44156 (0.0032) [2024-04-26 01:30:43,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 723517440. Throughput: 0: 55717.1. Samples: 672829540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 01:30:43,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-26 01:30:46,012][47288] Updated weights for policy 0, policy_version 44166 (0.0031) [2024-04-26 01:30:48,393][47288] Updated weights for policy 0, policy_version 44176 (0.0034) [2024-04-26 01:30:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 723795968. Throughput: 0: 55588.0. Samples: 673159240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 01:30:48,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:30:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000044177_723795968.pth... [2024-04-26 01:30:48,987][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000043362_710443008.pth [2024-04-26 01:30:51,802][47288] Updated weights for policy 0, policy_version 44186 (0.0027) [2024-04-26 01:30:53,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 724074496. Throughput: 0: 55713.4. Samples: 673495420. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-26 01:30:53,923][47056] Avg episode reward: [(0, '0.218')] [2024-04-26 01:30:54,074][47288] Updated weights for policy 0, policy_version 44196 (0.0028) [2024-04-26 01:30:57,622][47288] Updated weights for policy 0, policy_version 44206 (0.0024) [2024-04-26 01:30:58,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 724336640. 
Throughput: 0: 55524.5. Samples: 673665280. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-26 01:30:58,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 01:31:00,051][47288] Updated weights for policy 0, policy_version 44216 (0.0032) [2024-04-26 01:31:03,583][47288] Updated weights for policy 0, policy_version 44226 (0.0032) [2024-04-26 01:31:03,923][47056] Fps is (10 sec: 52427.9, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 724598784. Throughput: 0: 55520.8. Samples: 673994680. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-26 01:31:03,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 01:31:06,052][47288] Updated weights for policy 0, policy_version 44236 (0.0031) [2024-04-26 01:31:08,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 724877312. Throughput: 0: 55431.1. Samples: 674325880. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-26 01:31:08,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 01:31:09,494][47288] Updated weights for policy 0, policy_version 44246 (0.0032) [2024-04-26 01:31:10,579][47267] Signal inference workers to stop experience collection... (9800 times) [2024-04-26 01:31:10,625][47288] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-04-26 01:31:10,635][47267] Signal inference workers to resume experience collection... (9800 times) [2024-04-26 01:31:10,640][47288] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-04-26 01:31:11,859][47288] Updated weights for policy 0, policy_version 44256 (0.0027) [2024-04-26 01:31:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55705.7). Total num frames: 725172224. Throughput: 0: 55615.5. Samples: 674490700. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-26 01:31:13,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 01:31:15,385][47288] Updated weights for policy 0, policy_version 44266 (0.0028) [2024-04-26 01:31:17,706][47288] Updated weights for policy 0, policy_version 44276 (0.0027) [2024-04-26 01:31:18,923][47056] Fps is (10 sec: 62258.8, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 725499904. Throughput: 0: 55669.0. Samples: 674825400. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-26 01:31:18,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 01:31:21,314][47288] Updated weights for policy 0, policy_version 44286 (0.0024) [2024-04-26 01:31:23,524][47288] Updated weights for policy 0, policy_version 44296 (0.0030) [2024-04-26 01:31:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 725745664. Throughput: 0: 55659.2. Samples: 675158960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:31:23,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 01:31:27,241][47288] Updated weights for policy 0, policy_version 44306 (0.0035) [2024-04-26 01:31:28,923][47056] Fps is (10 sec: 52428.4, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 726024192. Throughput: 0: 55723.8. Samples: 675337120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:31:28,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 01:31:29,481][47288] Updated weights for policy 0, policy_version 44316 (0.0030) [2024-04-26 01:31:33,087][47288] Updated weights for policy 0, policy_version 44326 (0.0027) [2024-04-26 01:31:33,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 726269952. Throughput: 0: 55839.7. Samples: 675672020. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:31:33,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 01:31:35,356][47288] Updated weights for policy 0, policy_version 44336 (0.0029) [2024-04-26 01:31:38,912][47288] Updated weights for policy 0, policy_version 44346 (0.0028) [2024-04-26 01:31:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 726564864. Throughput: 0: 55778.0. Samples: 676005440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:31:38,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 01:31:41,065][47288] Updated weights for policy 0, policy_version 44356 (0.0026) [2024-04-26 01:31:43,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 726827008. Throughput: 0: 55462.0. Samples: 676161080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:31:43,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 01:31:44,902][47288] Updated weights for policy 0, policy_version 44366 (0.0026) [2024-04-26 01:31:46,902][47288] Updated weights for policy 0, policy_version 44376 (0.0027) [2024-04-26 01:31:48,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 727138304. Throughput: 0: 55563.4. Samples: 676495040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 01:31:48,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 01:31:50,914][47288] Updated weights for policy 0, policy_version 44386 (0.0030) [2024-04-26 01:31:52,892][47288] Updated weights for policy 0, policy_version 44396 (0.0028) [2024-04-26 01:31:53,923][47056] Fps is (10 sec: 60621.0, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 727433216. Throughput: 0: 55520.8. Samples: 676824320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:31:53,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 01:31:56,733][47288] Updated weights for policy 0, policy_version 44406 (0.0028) [2024-04-26 01:31:58,923][47056] Fps is (10 sec: 55707.2, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 727695360. Throughput: 0: 55906.5. Samples: 677006480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:31:58,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 01:31:58,953][47288] Updated weights for policy 0, policy_version 44416 (0.0026) [2024-04-26 01:32:02,509][47288] Updated weights for policy 0, policy_version 44426 (0.0026) [2024-04-26 01:32:03,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 727973888. Throughput: 0: 55776.9. Samples: 677335360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:32:03,923][47056] Avg episode reward: [(0, '0.332')] [2024-04-26 01:32:04,798][47288] Updated weights for policy 0, policy_version 44436 (0.0029) [2024-04-26 01:32:07,534][47267] Signal inference workers to stop experience collection... (9850 times) [2024-04-26 01:32:07,574][47288] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-04-26 01:32:07,584][47267] Signal inference workers to resume experience collection... (9850 times) [2024-04-26 01:32:07,591][47288] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-04-26 01:32:08,428][47288] Updated weights for policy 0, policy_version 44446 (0.0030) [2024-04-26 01:32:08,923][47056] Fps is (10 sec: 52427.7, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 728219648. Throughput: 0: 55850.6. Samples: 677672240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:32:08,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 01:32:10,646][47288] Updated weights for policy 0, policy_version 44456 (0.0033) [2024-04-26 01:32:13,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 728498176. Throughput: 0: 55366.3. Samples: 677828600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:32:13,923][47056] Avg episode reward: [(0, '0.352')] [2024-04-26 01:32:14,415][47288] Updated weights for policy 0, policy_version 44466 (0.0029) [2024-04-26 01:32:16,420][47288] Updated weights for policy 0, policy_version 44476 (0.0029) [2024-04-26 01:32:18,923][47056] Fps is (10 sec: 57344.1, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 728793088. Throughput: 0: 55457.2. Samples: 678167600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:32:18,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 01:32:20,149][47288] Updated weights for policy 0, policy_version 44486 (0.0030) [2024-04-26 01:32:22,279][47288] Updated weights for policy 0, policy_version 44496 (0.0026) [2024-04-26 01:32:23,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 729088000. Throughput: 0: 55566.7. Samples: 678505940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:32:23,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 01:32:25,871][47288] Updated weights for policy 0, policy_version 44506 (0.0034) [2024-04-26 01:32:28,103][47288] Updated weights for policy 0, policy_version 44516 (0.0039) [2024-04-26 01:32:28,923][47056] Fps is (10 sec: 60621.3, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 729399296. Throughput: 0: 56173.5. Samples: 678688880. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:32:28,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 01:32:32,219][47288] Updated weights for policy 0, policy_version 44526 (0.0028) [2024-04-26 01:32:33,868][47288] Updated weights for policy 0, policy_version 44536 (0.0027) [2024-04-26 01:32:33,923][47056] Fps is (10 sec: 58979.4, 60 sec: 56797.3, 300 sec: 55872.1). Total num frames: 729677824. Throughput: 0: 56073.3. Samples: 679018360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:32:33,924][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 01:32:37,900][47288] Updated weights for policy 0, policy_version 44546 (0.0039) [2024-04-26 01:32:38,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 729939968. Throughput: 0: 56106.2. Samples: 679349100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:32:38,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 01:32:39,720][47288] Updated weights for policy 0, policy_version 44556 (0.0031) [2024-04-26 01:32:43,769][47288] Updated weights for policy 0, policy_version 44566 (0.0028) [2024-04-26 01:32:43,923][47056] Fps is (10 sec: 50792.8, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 730185728. Throughput: 0: 55682.5. Samples: 679512200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:32:43,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 01:32:45,679][47288] Updated weights for policy 0, policy_version 44576 (0.0030) [2024-04-26 01:32:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 730480640. Throughput: 0: 55961.1. Samples: 679853620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 01:32:48,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 01:32:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000044585_730480640.pth... 
[2024-04-26 01:32:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000043769_717111296.pth [2024-04-26 01:32:49,556][47288] Updated weights for policy 0, policy_version 44586 (0.0033) [2024-04-26 01:32:51,500][47288] Updated weights for policy 0, policy_version 44596 (0.0026) [2024-04-26 01:32:52,769][47267] Signal inference workers to stop experience collection... (9900 times) [2024-04-26 01:32:52,769][47267] Signal inference workers to resume experience collection... (9900 times) [2024-04-26 01:32:52,780][47288] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-04-26 01:32:52,780][47288] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-04-26 01:32:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 730759168. Throughput: 0: 55935.5. Samples: 680189340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 01:32:53,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 01:32:55,287][47288] Updated weights for policy 0, policy_version 44606 (0.0028) [2024-04-26 01:32:57,262][47288] Updated weights for policy 0, policy_version 44616 (0.0027) [2024-04-26 01:32:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 731054080. Throughput: 0: 56201.7. Samples: 680357680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 01:32:58,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 01:33:00,973][47288] Updated weights for policy 0, policy_version 44626 (0.0028) [2024-04-26 01:33:03,006][47288] Updated weights for policy 0, policy_version 44636 (0.0030) [2024-04-26 01:33:03,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 731332608. Throughput: 0: 56083.6. Samples: 680691360. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 01:33:03,923][47056] Avg episode reward: [(0, '0.330')] [2024-04-26 01:33:06,951][47288] Updated weights for policy 0, policy_version 44646 (0.0025) [2024-04-26 01:33:08,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56798.0, 300 sec: 55816.7). Total num frames: 731627520. Throughput: 0: 56013.8. Samples: 681026560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 01:33:08,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 01:33:09,026][47288] Updated weights for policy 0, policy_version 44656 (0.0027) [2024-04-26 01:33:12,797][47288] Updated weights for policy 0, policy_version 44666 (0.0029) [2024-04-26 01:33:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.9, 300 sec: 55761.2). Total num frames: 731889664. Throughput: 0: 55899.5. Samples: 681204360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 01:33:13,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 01:33:15,028][47288] Updated weights for policy 0, policy_version 44676 (0.0034) [2024-04-26 01:33:18,545][47288] Updated weights for policy 0, policy_version 44686 (0.0029) [2024-04-26 01:33:18,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 732151808. Throughput: 0: 55946.9. Samples: 681535940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 01:33:18,923][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 01:33:20,746][47288] Updated weights for policy 0, policy_version 44696 (0.0032) [2024-04-26 01:33:23,926][47056] Fps is (10 sec: 54050.1, 60 sec: 55702.7, 300 sec: 55593.9). Total num frames: 732430336. Throughput: 0: 56181.1. Samples: 681877420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 01:33:23,926][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 01:33:24,184][47288] Updated weights for policy 0, policy_version 44706 (0.0034) [2024-04-26 01:33:26,554][47288] Updated weights for policy 0, policy_version 44716 (0.0027) [2024-04-26 01:33:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 732708864. Throughput: 0: 55979.7. Samples: 682031280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 01:33:28,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 01:33:28,953][47267] Saving new best policy, reward=0.467! [2024-04-26 01:33:30,129][47288] Updated weights for policy 0, policy_version 44726 (0.0033) [2024-04-26 01:33:32,769][47288] Updated weights for policy 0, policy_version 44736 (0.0029) [2024-04-26 01:33:33,923][47056] Fps is (10 sec: 57362.7, 60 sec: 55433.1, 300 sec: 55816.7). Total num frames: 733003776. Throughput: 0: 55939.0. Samples: 682370860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 01:33:33,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 01:33:35,945][47288] Updated weights for policy 0, policy_version 44746 (0.0029) [2024-04-26 01:33:38,566][47288] Updated weights for policy 0, policy_version 44756 (0.0032) [2024-04-26 01:33:38,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 733282304. Throughput: 0: 56026.8. Samples: 682710540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 01:33:38,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 01:33:41,876][47288] Updated weights for policy 0, policy_version 44766 (0.0026) [2024-04-26 01:33:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 733577216. Throughput: 0: 56155.7. Samples: 682884680. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:33:43,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 01:33:44,510][47288] Updated weights for policy 0, policy_version 44776 (0.0029) [2024-04-26 01:33:47,621][47288] Updated weights for policy 0, policy_version 44786 (0.0035) [2024-04-26 01:33:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.9, 300 sec: 55761.2). Total num frames: 733839360. Throughput: 0: 56203.2. Samples: 683220500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:33:48,923][47056] Avg episode reward: [(0, '0.352')] [2024-04-26 01:33:50,190][47288] Updated weights for policy 0, policy_version 44796 (0.0027) [2024-04-26 01:33:53,549][47288] Updated weights for policy 0, policy_version 44806 (0.0027) [2024-04-26 01:33:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 734134272. Throughput: 0: 56173.2. Samples: 683554360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:33:53,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 01:33:56,031][47288] Updated weights for policy 0, policy_version 44816 (0.0035) [2024-04-26 01:33:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 734380032. Throughput: 0: 55786.2. Samples: 683714740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:33:58,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:33:59,379][47288] Updated weights for policy 0, policy_version 44826 (0.0025) [2024-04-26 01:33:59,740][47267] Signal inference workers to stop experience collection... (9950 times) [2024-04-26 01:33:59,740][47267] Signal inference workers to resume experience collection... 
(9950 times) [2024-04-26 01:33:59,751][47288] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-04-26 01:33:59,751][47288] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-04-26 01:34:02,132][47288] Updated weights for policy 0, policy_version 44836 (0.0028) [2024-04-26 01:34:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 734674944. Throughput: 0: 55922.1. Samples: 684052440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:34:03,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 01:34:05,351][47288] Updated weights for policy 0, policy_version 44846 (0.0033) [2024-04-26 01:34:07,879][47288] Updated weights for policy 0, policy_version 44856 (0.0027) [2024-04-26 01:34:08,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 734969856. Throughput: 0: 55662.5. Samples: 684382060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:34:08,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 01:34:11,153][47288] Updated weights for policy 0, policy_version 44866 (0.0035) [2024-04-26 01:34:13,744][47288] Updated weights for policy 0, policy_version 44876 (0.0030) [2024-04-26 01:34:13,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 735248384. Throughput: 0: 55875.1. Samples: 684545660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:34:13,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 01:34:16,980][47288] Updated weights for policy 0, policy_version 44886 (0.0026) [2024-04-26 01:34:18,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 735510528. Throughput: 0: 55738.1. Samples: 684879080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:34:18,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 01:34:19,833][47288] Updated weights for policy 0, policy_version 44896 (0.0028) [2024-04-26 01:34:22,714][47288] Updated weights for policy 0, policy_version 44906 (0.0026) [2024-04-26 01:34:23,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56254.6, 300 sec: 55816.7). Total num frames: 735805440. Throughput: 0: 55719.0. Samples: 685217900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:34:23,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 01:34:25,720][47288] Updated weights for policy 0, policy_version 44916 (0.0028) [2024-04-26 01:34:28,713][47288] Updated weights for policy 0, policy_version 44926 (0.0030) [2024-04-26 01:34:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 736083968. Throughput: 0: 55808.4. Samples: 685396060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:34:28,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 01:34:31,491][47288] Updated weights for policy 0, policy_version 44936 (0.0029) [2024-04-26 01:34:33,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 736329728. Throughput: 0: 55695.9. Samples: 685726820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:34:33,924][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 01:34:34,486][47288] Updated weights for policy 0, policy_version 44946 (0.0032) [2024-04-26 01:34:37,217][47288] Updated weights for policy 0, policy_version 44956 (0.0028) [2024-04-26 01:34:38,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 736624640. Throughput: 0: 55703.1. Samples: 686061000. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:34:38,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 01:34:40,415][47288] Updated weights for policy 0, policy_version 44966 (0.0027) [2024-04-26 01:34:43,177][47288] Updated weights for policy 0, policy_version 44976 (0.0025) [2024-04-26 01:34:43,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 736919552. Throughput: 0: 55823.2. Samples: 686226780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:34:43,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 01:34:46,189][47288] Updated weights for policy 0, policy_version 44986 (0.0038) [2024-04-26 01:34:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 737181696. Throughput: 0: 55802.7. Samples: 686563560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:34:48,923][47056] Avg episode reward: [(0, '0.361')] [2024-04-26 01:34:49,033][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000044995_737198080.pth... [2024-04-26 01:34:49,077][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000044177_723795968.pth [2024-04-26 01:34:49,211][47288] Updated weights for policy 0, policy_version 44996 (0.0028) [2024-04-26 01:34:52,169][47288] Updated weights for policy 0, policy_version 45006 (0.0029) [2024-04-26 01:34:53,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 737460224. Throughput: 0: 55822.7. Samples: 686894080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:34:53,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 01:34:54,921][47288] Updated weights for policy 0, policy_version 45016 (0.0026) [2024-04-26 01:34:57,868][47267] Signal inference workers to stop experience collection... 
(10000 times) [2024-04-26 01:34:57,892][47288] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-04-26 01:34:57,960][47267] Signal inference workers to resume experience collection... (10000 times) [2024-04-26 01:34:57,960][47288] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-04-26 01:34:57,962][47288] Updated weights for policy 0, policy_version 45026 (0.0025) [2024-04-26 01:34:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 737738752. Throughput: 0: 55899.0. Samples: 687061120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:34:58,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 01:35:00,870][47288] Updated weights for policy 0, policy_version 45036 (0.0024) [2024-04-26 01:35:03,854][47288] Updated weights for policy 0, policy_version 45046 (0.0035) [2024-04-26 01:35:03,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 738033664. Throughput: 0: 56014.3. Samples: 687399720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 01:35:03,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:35:06,768][47288] Updated weights for policy 0, policy_version 45056 (0.0028) [2024-04-26 01:35:08,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 738295808. Throughput: 0: 55930.1. Samples: 687734760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:35:08,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 01:35:09,687][47288] Updated weights for policy 0, policy_version 45066 (0.0029) [2024-04-26 01:35:12,740][47288] Updated weights for policy 0, policy_version 45076 (0.0036) [2024-04-26 01:35:13,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 738574336. Throughput: 0: 55512.4. Samples: 687894120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:35:13,923][47056] Avg episode reward: [(0, '0.323')] [2024-04-26 01:35:15,621][47288] Updated weights for policy 0, policy_version 45086 (0.0032) [2024-04-26 01:35:18,501][47288] Updated weights for policy 0, policy_version 45096 (0.0039) [2024-04-26 01:35:18,923][47056] Fps is (10 sec: 55707.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 738852864. Throughput: 0: 55531.7. Samples: 688225740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:35:18,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:35:21,417][47288] Updated weights for policy 0, policy_version 45106 (0.0031) [2024-04-26 01:35:23,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 739115008. Throughput: 0: 55633.0. Samples: 688564480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:35:23,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 01:35:24,355][47288] Updated weights for policy 0, policy_version 45116 (0.0033) [2024-04-26 01:35:27,380][47288] Updated weights for policy 0, policy_version 45126 (0.0029) [2024-04-26 01:35:28,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 739426304. Throughput: 0: 55770.0. Samples: 688736440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:35:28,924][47056] Avg episode reward: [(0, '0.263')] [2024-04-26 01:35:30,331][47288] Updated weights for policy 0, policy_version 45136 (0.0030) [2024-04-26 01:35:33,318][47288] Updated weights for policy 0, policy_version 45146 (0.0027) [2024-04-26 01:35:33,923][47056] Fps is (10 sec: 58981.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 739704832. Throughput: 0: 55702.1. Samples: 689070160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:35:33,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 01:35:36,028][47288] Updated weights for policy 0, policy_version 45156 (0.0030) [2024-04-26 01:35:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 739983360. Throughput: 0: 55914.2. Samples: 689410220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-26 01:35:38,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 01:35:39,040][47288] Updated weights for policy 0, policy_version 45166 (0.0036) [2024-04-26 01:35:41,828][47288] Updated weights for policy 0, policy_version 45176 (0.0027) [2024-04-26 01:35:43,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 740245504. Throughput: 0: 55776.8. Samples: 689571080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-26 01:35:43,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 01:35:44,696][47267] Signal inference workers to stop experience collection... (10050 times) [2024-04-26 01:35:44,743][47288] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-04-26 01:35:44,745][47267] Signal inference workers to resume experience collection... (10050 times) [2024-04-26 01:35:44,753][47288] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-04-26 01:35:44,855][47288] Updated weights for policy 0, policy_version 45186 (0.0031) [2024-04-26 01:35:47,815][47288] Updated weights for policy 0, policy_version 45196 (0.0033) [2024-04-26 01:35:48,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 740524032. Throughput: 0: 55614.4. Samples: 689902380. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-26 01:35:48,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 01:35:50,825][47288] Updated weights for policy 0, policy_version 45206 (0.0032) [2024-04-26 01:35:53,833][47288] Updated weights for policy 0, policy_version 45216 (0.0026) [2024-04-26 01:35:53,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 740818944. Throughput: 0: 55567.6. Samples: 690235300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-26 01:35:53,923][47056] Avg episode reward: [(0, '0.316')] [2024-04-26 01:35:56,778][47288] Updated weights for policy 0, policy_version 45226 (0.0027) [2024-04-26 01:35:58,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 741081088. Throughput: 0: 55759.1. Samples: 690403280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-26 01:35:58,932][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 01:35:59,790][47288] Updated weights for policy 0, policy_version 45236 (0.0026) [2024-04-26 01:36:02,570][47288] Updated weights for policy 0, policy_version 45246 (0.0031) [2024-04-26 01:36:03,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.3, 300 sec: 55872.2). Total num frames: 741359616. Throughput: 0: 55813.5. Samples: 690737360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-04-26 01:36:03,923][47056] Avg episode reward: [(0, '0.332')] [2024-04-26 01:36:05,751][47288] Updated weights for policy 0, policy_version 45256 (0.0027) [2024-04-26 01:36:08,352][47288] Updated weights for policy 0, policy_version 45266 (0.0025) [2024-04-26 01:36:08,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 741654528. Throughput: 0: 55661.8. Samples: 691069280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:36:08,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 01:36:11,533][47288] Updated weights for policy 0, policy_version 45276 (0.0028) [2024-04-26 01:36:13,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 741933056. Throughput: 0: 55713.4. Samples: 691243540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:36:13,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 01:36:14,164][47288] Updated weights for policy 0, policy_version 45286 (0.0026) [2024-04-26 01:36:17,252][47288] Updated weights for policy 0, policy_version 45296 (0.0025) [2024-04-26 01:36:18,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 742211584. Throughput: 0: 55686.7. Samples: 691576060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:36:18,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 01:36:20,119][47288] Updated weights for policy 0, policy_version 45306 (0.0031) [2024-04-26 01:36:23,375][47288] Updated weights for policy 0, policy_version 45316 (0.0026) [2024-04-26 01:36:23,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 742473728. Throughput: 0: 55572.1. Samples: 691910960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:36:23,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 01:36:25,902][47288] Updated weights for policy 0, policy_version 45326 (0.0041) [2024-04-26 01:36:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 742752256. Throughput: 0: 55668.6. Samples: 692076160. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:36:28,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 01:36:29,225][47288] Updated weights for policy 0, policy_version 45336 (0.0040) [2024-04-26 01:36:31,856][47288] Updated weights for policy 0, policy_version 45346 (0.0026) [2024-04-26 01:36:33,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 743047168. Throughput: 0: 55790.8. Samples: 692412960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:36:33,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 01:36:35,026][47288] Updated weights for policy 0, policy_version 45356 (0.0026) [2024-04-26 01:36:37,601][47288] Updated weights for policy 0, policy_version 45366 (0.0024) [2024-04-26 01:36:38,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 743342080. Throughput: 0: 55855.8. Samples: 692748800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:36:38,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 01:36:40,985][47288] Updated weights for policy 0, policy_version 45376 (0.0032) [2024-04-26 01:36:43,462][47288] Updated weights for policy 0, policy_version 45386 (0.0028) [2024-04-26 01:36:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 743604224. Throughput: 0: 55888.6. Samples: 692918260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:36:43,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 01:36:46,862][47288] Updated weights for policy 0, policy_version 45396 (0.0026) [2024-04-26 01:36:48,924][47056] Fps is (10 sec: 54059.5, 60 sec: 55977.5, 300 sec: 55760.9). Total num frames: 743882752. Throughput: 0: 55858.0. Samples: 693251040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:36:48,925][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 01:36:48,983][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000045404_743899136.pth... [2024-04-26 01:36:49,033][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000044585_730480640.pth [2024-04-26 01:36:49,287][47288] Updated weights for policy 0, policy_version 45406 (0.0031) [2024-04-26 01:36:52,829][47288] Updated weights for policy 0, policy_version 45416 (0.0032) [2024-04-26 01:36:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 744144896. Throughput: 0: 55922.1. Samples: 693585760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:36:53,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 01:36:55,159][47288] Updated weights for policy 0, policy_version 45426 (0.0027) [2024-04-26 01:36:58,774][47288] Updated weights for policy 0, policy_version 45436 (0.0029) [2024-04-26 01:36:58,923][47056] Fps is (10 sec: 55713.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 744439808. Throughput: 0: 55720.4. Samples: 693750960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:36:58,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 01:37:00,064][47267] Signal inference workers to stop experience collection... (10100 times) [2024-04-26 01:37:00,084][47288] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-04-26 01:37:00,151][47267] Signal inference workers to resume experience collection... (10100 times) [2024-04-26 01:37:00,151][47288] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-04-26 01:37:00,966][47288] Updated weights for policy 0, policy_version 45446 (0.0027) [2024-04-26 01:37:03,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 744718336. Throughput: 0: 55697.4. Samples: 694082440. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:37:03,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 01:37:04,580][47288] Updated weights for policy 0, policy_version 45456 (0.0031) [2024-04-26 01:37:07,535][47288] Updated weights for policy 0, policy_version 45466 (0.0031) [2024-04-26 01:37:08,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 745013248. Throughput: 0: 55776.3. Samples: 694420900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:37:08,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 01:37:10,380][47288] Updated weights for policy 0, policy_version 45476 (0.0029) [2024-04-26 01:37:13,283][47288] Updated weights for policy 0, policy_version 45486 (0.0035) [2024-04-26 01:37:13,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 745291776. Throughput: 0: 55973.0. Samples: 694594940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:37:13,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 01:37:16,316][47288] Updated weights for policy 0, policy_version 45496 (0.0030) [2024-04-26 01:37:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 745553920. Throughput: 0: 55895.1. Samples: 694928240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:37:18,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 01:37:19,220][47288] Updated weights for policy 0, policy_version 45506 (0.0032) [2024-04-26 01:37:22,264][47288] Updated weights for policy 0, policy_version 45516 (0.0024) [2024-04-26 01:37:23,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 745848832. Throughput: 0: 55827.4. Samples: 695261040. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:37:23,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 01:37:25,138][47288] Updated weights for policy 0, policy_version 45526 (0.0030) [2024-04-26 01:37:28,006][47288] Updated weights for policy 0, policy_version 45536 (0.0034) [2024-04-26 01:37:28,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.9, 300 sec: 55761.3). Total num frames: 746127360. Throughput: 0: 55781.9. Samples: 695428440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:37:28,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 01:37:30,855][47288] Updated weights for policy 0, policy_version 45546 (0.0029) [2024-04-26 01:37:33,890][47288] Updated weights for policy 0, policy_version 45556 (0.0028) [2024-04-26 01:37:33,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 746389504. Throughput: 0: 55879.2. Samples: 695765540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 01:37:33,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 01:37:36,734][47288] Updated weights for policy 0, policy_version 45566 (0.0028) [2024-04-26 01:37:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 746668032. Throughput: 0: 55912.4. Samples: 696101820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 01:37:38,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 01:37:39,797][47288] Updated weights for policy 0, policy_version 45576 (0.0029) [2024-04-26 01:37:42,625][47288] Updated weights for policy 0, policy_version 45586 (0.0027) [2024-04-26 01:37:43,923][47056] Fps is (10 sec: 58984.1, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 746979328. Throughput: 0: 55859.2. Samples: 696264620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 01:37:43,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 01:37:45,693][47288] Updated weights for policy 0, policy_version 45596 (0.0031) [2024-04-26 01:37:48,336][47288] Updated weights for policy 0, policy_version 45606 (0.0027) [2024-04-26 01:37:48,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56252.8, 300 sec: 55927.7). Total num frames: 747257856. Throughput: 0: 56059.9. Samples: 696605140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 01:37:48,923][47056] Avg episode reward: [(0, '0.354')] [2024-04-26 01:37:51,630][47288] Updated weights for policy 0, policy_version 45616 (0.0031) [2024-04-26 01:37:53,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 747503616. Throughput: 0: 55943.3. Samples: 696938340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 01:37:53,923][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 01:37:54,080][47288] Updated weights for policy 0, policy_version 45626 (0.0026) [2024-04-26 01:37:55,133][47267] Signal inference workers to stop experience collection... (10150 times) [2024-04-26 01:37:55,165][47288] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-04-26 01:37:55,192][47267] Signal inference workers to resume experience collection... (10150 times) [2024-04-26 01:37:55,197][47288] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-04-26 01:37:57,442][47288] Updated weights for policy 0, policy_version 45636 (0.0034) [2024-04-26 01:37:58,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 747782144. Throughput: 0: 55866.9. Samples: 697108960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 01:37:58,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:38:00,086][47288] Updated weights for policy 0, policy_version 45646 (0.0030) [2024-04-26 01:38:03,173][47288] Updated weights for policy 0, policy_version 45656 (0.0025) [2024-04-26 01:38:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 748077056. Throughput: 0: 55992.1. Samples: 697447880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:38:03,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 01:38:05,870][47288] Updated weights for policy 0, policy_version 45666 (0.0029) [2024-04-26 01:38:08,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 748322816. Throughput: 0: 56110.3. Samples: 697786000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:38:08,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 01:38:09,060][47288] Updated weights for policy 0, policy_version 45676 (0.0030) [2024-04-26 01:38:11,735][47288] Updated weights for policy 0, policy_version 45686 (0.0033) [2024-04-26 01:38:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 748617728. Throughput: 0: 55927.5. Samples: 697945180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:38:13,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 01:38:14,875][47288] Updated weights for policy 0, policy_version 45696 (0.0032) [2024-04-26 01:38:17,554][47288] Updated weights for policy 0, policy_version 45706 (0.0031) [2024-04-26 01:38:18,923][47056] Fps is (10 sec: 60620.8, 60 sec: 56251.8, 300 sec: 55928.4). Total num frames: 748929024. Throughput: 0: 55925.6. Samples: 698282180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:38:18,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 01:38:20,767][47288] Updated weights for policy 0, policy_version 45716 (0.0030) [2024-04-26 01:38:23,365][47288] Updated weights for policy 0, policy_version 45726 (0.0027) [2024-04-26 01:38:23,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 749191168. Throughput: 0: 55800.4. Samples: 698612840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:38:23,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 01:38:26,793][47288] Updated weights for policy 0, policy_version 45736 (0.0032) [2024-04-26 01:38:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 749486080. Throughput: 0: 56139.0. Samples: 698790880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:38:28,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 01:38:29,071][47288] Updated weights for policy 0, policy_version 45746 (0.0027) [2024-04-26 01:38:32,628][47288] Updated weights for policy 0, policy_version 45756 (0.0029) [2024-04-26 01:38:33,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 749731840. Throughput: 0: 56051.4. Samples: 699127440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:38:33,923][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 01:38:35,040][47288] Updated weights for policy 0, policy_version 45766 (0.0030) [2024-04-26 01:38:38,500][47288] Updated weights for policy 0, policy_version 45776 (0.0031) [2024-04-26 01:38:38,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 750010368. Throughput: 0: 55989.2. Samples: 699457860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:38:38,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 01:38:41,003][47288] Updated weights for policy 0, policy_version 45786 (0.0027) [2024-04-26 01:38:43,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 750288896. Throughput: 0: 55693.4. Samples: 699615160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:38:43,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 01:38:44,399][47288] Updated weights for policy 0, policy_version 45796 (0.0030) [2024-04-26 01:38:46,991][47288] Updated weights for policy 0, policy_version 45806 (0.0026) [2024-04-26 01:38:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 750583808. Throughput: 0: 55514.9. Samples: 699946060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:38:48,923][47056] Avg episode reward: [(0, '0.262')] [2024-04-26 01:38:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000045812_750583808.pth... [2024-04-26 01:38:48,995][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000044995_737198080.pth [2024-04-26 01:38:50,404][47288] Updated weights for policy 0, policy_version 45816 (0.0029) [2024-04-26 01:38:51,438][47267] Signal inference workers to stop experience collection... (10200 times) [2024-04-26 01:38:51,483][47288] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-04-26 01:38:51,491][47267] Signal inference workers to resume experience collection... (10200 times) [2024-04-26 01:38:51,497][47288] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-04-26 01:38:52,856][47288] Updated weights for policy 0, policy_version 45826 (0.0033) [2024-04-26 01:38:53,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 750878720. Throughput: 0: 55435.0. Samples: 700280580. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 01:38:53,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 01:38:56,243][47288] Updated weights for policy 0, policy_version 45836 (0.0027) [2024-04-26 01:38:58,629][47288] Updated weights for policy 0, policy_version 45846 (0.0027) [2024-04-26 01:38:58,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 751140864. Throughput: 0: 55822.6. Samples: 700457200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:38:58,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 01:39:02,147][47288] Updated weights for policy 0, policy_version 45856 (0.0028) [2024-04-26 01:39:03,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55978.4, 300 sec: 55816.6). Total num frames: 751435776. Throughput: 0: 55804.6. Samples: 700793400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:39:03,924][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 01:39:04,369][47288] Updated weights for policy 0, policy_version 45866 (0.0025) [2024-04-26 01:39:07,941][47288] Updated weights for policy 0, policy_version 45876 (0.0032) [2024-04-26 01:39:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 751697920. Throughput: 0: 55951.6. Samples: 701130660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:39:08,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 01:39:10,152][47288] Updated weights for policy 0, policy_version 45886 (0.0028) [2024-04-26 01:39:13,923][47056] Fps is (10 sec: 50791.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 751943680. Throughput: 0: 55375.2. Samples: 701282760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:39:13,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 01:39:14,011][47288] Updated weights for policy 0, policy_version 45896 (0.0037) [2024-04-26 01:39:16,081][47288] Updated weights for policy 0, policy_version 45906 (0.0032) [2024-04-26 01:39:18,923][47056] Fps is (10 sec: 52428.7, 60 sec: 54886.3, 300 sec: 55650.1). Total num frames: 752222208. Throughput: 0: 55343.0. Samples: 701617880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:39:18,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 01:39:19,985][47288] Updated weights for policy 0, policy_version 45916 (0.0034) [2024-04-26 01:39:21,798][47288] Updated weights for policy 0, policy_version 45926 (0.0026) [2024-04-26 01:39:23,923][47056] Fps is (10 sec: 58981.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 752533504. Throughput: 0: 55468.9. Samples: 701953960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 01:39:23,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 01:39:25,886][47288] Updated weights for policy 0, policy_version 45936 (0.0031) [2024-04-26 01:39:28,054][47288] Updated weights for policy 0, policy_version 45946 (0.0030) [2024-04-26 01:39:28,923][47056] Fps is (10 sec: 62259.8, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 752844800. Throughput: 0: 55925.0. Samples: 702131780. Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-04-26 01:39:28,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 01:39:31,802][47288] Updated weights for policy 0, policy_version 45956 (0.0030) [2024-04-26 01:39:32,285][47267] Signal inference workers to stop experience collection... (10250 times) [2024-04-26 01:39:32,285][47267] Signal inference workers to resume experience collection... 
(10250 times) [2024-04-26 01:39:32,294][47288] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-04-26 01:39:32,297][47288] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-04-26 01:39:33,815][47288] Updated weights for policy 0, policy_version 45966 (0.0032) [2024-04-26 01:39:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 753106944. Throughput: 0: 55865.0. Samples: 702459980. Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-04-26 01:39:33,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 01:39:37,591][47288] Updated weights for policy 0, policy_version 45976 (0.0027) [2024-04-26 01:39:38,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56251.7, 300 sec: 55816.6). Total num frames: 753385472. Throughput: 0: 55895.5. Samples: 702795880. Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-04-26 01:39:38,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 01:39:39,622][47288] Updated weights for policy 0, policy_version 45986 (0.0032) [2024-04-26 01:39:43,503][47288] Updated weights for policy 0, policy_version 45996 (0.0026) [2024-04-26 01:39:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 753647616. Throughput: 0: 55701.7. Samples: 702963780. Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-04-26 01:39:43,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 01:39:45,561][47288] Updated weights for policy 0, policy_version 46006 (0.0028) [2024-04-26 01:39:48,923][47056] Fps is (10 sec: 50790.6, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 753893376. Throughput: 0: 55647.8. Samples: 703297540. 
Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-04-26 01:39:48,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 01:39:49,343][47288] Updated weights for policy 0, policy_version 46016 (0.0028) [2024-04-26 01:39:51,263][47288] Updated weights for policy 0, policy_version 46026 (0.0027) [2024-04-26 01:39:53,923][47056] Fps is (10 sec: 52429.3, 60 sec: 54886.5, 300 sec: 55705.6). Total num frames: 754171904. Throughput: 0: 55634.3. Samples: 703634200. Policy #0 lag: (min: 1.0, avg: 7.4, max: 21.0) [2024-04-26 01:39:53,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 01:39:55,103][47288] Updated weights for policy 0, policy_version 46036 (0.0028) [2024-04-26 01:39:57,043][47288] Updated weights for policy 0, policy_version 46046 (0.0029) [2024-04-26 01:39:58,923][47056] Fps is (10 sec: 60621.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 754499584. Throughput: 0: 55928.4. Samples: 703799540. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-26 01:39:58,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 01:40:01,074][47288] Updated weights for policy 0, policy_version 46056 (0.0030) [2024-04-26 01:40:02,998][47288] Updated weights for policy 0, policy_version 46066 (0.0024) [2024-04-26 01:40:03,923][47056] Fps is (10 sec: 62258.8, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 754794496. Throughput: 0: 55841.8. Samples: 704130760. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-26 01:40:03,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 01:40:07,047][47288] Updated weights for policy 0, policy_version 46076 (0.0033) [2024-04-26 01:40:08,890][47288] Updated weights for policy 0, policy_version 46086 (0.0026) [2024-04-26 01:40:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 755073024. Throughput: 0: 55877.8. Samples: 704468460. 
Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-26 01:40:08,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 01:40:12,948][47288] Updated weights for policy 0, policy_version 46096 (0.0029) [2024-04-26 01:40:13,541][47267] Signal inference workers to stop experience collection... (10300 times) [2024-04-26 01:40:13,541][47267] Signal inference workers to resume experience collection... (10300 times) [2024-04-26 01:40:13,572][47288] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-04-26 01:40:13,572][47288] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-04-26 01:40:13,923][47056] Fps is (10 sec: 52428.5, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 755318784. Throughput: 0: 55645.6. Samples: 704635840. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-26 01:40:13,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 01:40:14,888][47288] Updated weights for policy 0, policy_version 46106 (0.0031) [2024-04-26 01:40:18,916][47288] Updated weights for policy 0, policy_version 46116 (0.0028) [2024-04-26 01:40:18,923][47056] Fps is (10 sec: 49152.2, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 755564544. Throughput: 0: 55857.4. Samples: 704973560. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-26 01:40:18,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 01:40:20,669][47288] Updated weights for policy 0, policy_version 46126 (0.0023) [2024-04-26 01:40:23,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 755859456. Throughput: 0: 55732.6. Samples: 705303840. 
Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-26 01:40:23,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 01:40:24,773][47288] Updated weights for policy 0, policy_version 46136 (0.0029) [2024-04-26 01:40:26,643][47288] Updated weights for policy 0, policy_version 46146 (0.0029) [2024-04-26 01:40:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 756137984. Throughput: 0: 55508.4. Samples: 705461660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 01:40:28,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 01:40:30,484][47288] Updated weights for policy 0, policy_version 46156 (0.0033) [2024-04-26 01:40:32,623][47288] Updated weights for policy 0, policy_version 46166 (0.0026) [2024-04-26 01:40:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 756432896. Throughput: 0: 55450.2. Samples: 705792800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 01:40:33,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 01:40:36,404][47288] Updated weights for policy 0, policy_version 46176 (0.0031) [2024-04-26 01:40:38,360][47288] Updated weights for policy 0, policy_version 46186 (0.0030) [2024-04-26 01:40:38,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55705.7, 300 sec: 55872.3). Total num frames: 756727808. Throughput: 0: 55360.5. Samples: 706125420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 01:40:38,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 01:40:42,200][47288] Updated weights for policy 0, policy_version 46196 (0.0025) [2024-04-26 01:40:43,922][47056] Fps is (10 sec: 57345.0, 60 sec: 55978.9, 300 sec: 55872.3). Total num frames: 757006336. Throughput: 0: 55692.2. Samples: 706305680. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 01:40:43,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 01:40:44,114][47288] Updated weights for policy 0, policy_version 46206 (0.0030) [2024-04-26 01:40:48,087][47288] Updated weights for policy 0, policy_version 46216 (0.0030) [2024-04-26 01:40:48,279][47267] Signal inference workers to stop experience collection... (10350 times) [2024-04-26 01:40:48,279][47267] Signal inference workers to resume experience collection... (10350 times) [2024-04-26 01:40:48,299][47288] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-04-26 01:40:48,300][47288] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-04-26 01:40:48,923][47056] Fps is (10 sec: 55704.5, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 757284864. Throughput: 0: 55947.0. Samples: 706648380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 01:40:48,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 01:40:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000046221_757284864.pth... [2024-04-26 01:40:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000045404_743899136.pth [2024-04-26 01:40:50,143][47288] Updated weights for policy 0, policy_version 46226 (0.0027) [2024-04-26 01:40:53,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 757530624. Throughput: 0: 55814.7. Samples: 706980120. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 01:40:53,923][47056] Avg episode reward: [(0, '0.332')] [2024-04-26 01:40:53,923][47288] Updated weights for policy 0, policy_version 46236 (0.0029) [2024-04-26 01:40:56,001][47288] Updated weights for policy 0, policy_version 46246 (0.0029) [2024-04-26 01:40:58,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55159.4, 300 sec: 55761.2). Total num frames: 757809152. Throughput: 0: 55413.8. Samples: 707129460. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 01:40:58,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 01:40:59,816][47288] Updated weights for policy 0, policy_version 46256 (0.0030) [2024-04-26 01:41:01,995][47288] Updated weights for policy 0, policy_version 46266 (0.0023) [2024-04-26 01:41:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 758087680. Throughput: 0: 55293.7. Samples: 707461780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 01:41:03,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 01:41:05,581][47288] Updated weights for policy 0, policy_version 46276 (0.0031) [2024-04-26 01:41:07,758][47288] Updated weights for policy 0, policy_version 46286 (0.0033) [2024-04-26 01:41:08,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 758398976. Throughput: 0: 55647.9. Samples: 707808000. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 01:41:08,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 01:41:11,409][47288] Updated weights for policy 0, policy_version 46296 (0.0032) [2024-04-26 01:41:13,607][47288] Updated weights for policy 0, policy_version 46306 (0.0027) [2024-04-26 01:41:13,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 758677504. Throughput: 0: 55956.1. Samples: 707979680. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 01:41:13,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 01:41:17,206][47288] Updated weights for policy 0, policy_version 46316 (0.0029) [2024-04-26 01:41:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 758939648. Throughput: 0: 56044.9. Samples: 708314820. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 01:41:18,923][47056] Avg episode reward: [(0, '0.310')] [2024-04-26 01:41:19,553][47288] Updated weights for policy 0, policy_version 46326 (0.0028) [2024-04-26 01:41:23,100][47288] Updated weights for policy 0, policy_version 46336 (0.0033) [2024-04-26 01:41:23,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 759218176. Throughput: 0: 55979.5. Samples: 708644500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 26.0) [2024-04-26 01:41:23,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 01:41:25,423][47288] Updated weights for policy 0, policy_version 46346 (0.0027) [2024-04-26 01:41:28,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 759463936. Throughput: 0: 55688.6. Samples: 708811680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 26.0) [2024-04-26 01:41:28,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 01:41:29,137][47288] Updated weights for policy 0, policy_version 46356 (0.0026) [2024-04-26 01:41:31,344][47288] Updated weights for policy 0, policy_version 46366 (0.0029) [2024-04-26 01:41:33,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 759758848. Throughput: 0: 55361.8. Samples: 709139660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 26.0) [2024-04-26 01:41:33,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 01:41:35,161][47288] Updated weights for policy 0, policy_version 46376 (0.0026) [2024-04-26 01:41:37,372][47288] Updated weights for policy 0, policy_version 46386 (0.0032) [2024-04-26 01:41:38,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 760037376. Throughput: 0: 55387.9. Samples: 709472580. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 26.0) [2024-04-26 01:41:38,923][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 01:41:41,128][47288] Updated weights for policy 0, policy_version 46396 (0.0034) [2024-04-26 01:41:42,941][47267] Signal inference workers to stop experience collection... (10400 times) [2024-04-26 01:41:42,980][47288] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-04-26 01:41:43,001][47267] Signal inference workers to resume experience collection... (10400 times) [2024-04-26 01:41:43,007][47288] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-04-26 01:41:43,222][47288] Updated weights for policy 0, policy_version 46406 (0.0026) [2024-04-26 01:41:43,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55432.4, 300 sec: 55761.4). Total num frames: 760332288. Throughput: 0: 55593.4. Samples: 709631160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 26.0) [2024-04-26 01:41:43,923][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 01:41:47,005][47288] Updated weights for policy 0, policy_version 46416 (0.0036) [2024-04-26 01:41:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 760610816. Throughput: 0: 55615.0. Samples: 709964460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 26.0) [2024-04-26 01:41:48,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 01:41:49,315][47288] Updated weights for policy 0, policy_version 46426 (0.0032) [2024-04-26 01:41:52,715][47288] Updated weights for policy 0, policy_version 46436 (0.0030) [2024-04-26 01:41:53,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 760889344. Throughput: 0: 55421.2. Samples: 710301960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:41:53,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 01:41:55,154][47288] Updated weights for policy 0, policy_version 46446 (0.0028) [2024-04-26 01:41:58,604][47288] Updated weights for policy 0, policy_version 46456 (0.0028) [2024-04-26 01:41:58,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 761151488. Throughput: 0: 55245.9. Samples: 710465740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:41:58,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 01:42:01,133][47288] Updated weights for policy 0, policy_version 46466 (0.0029) [2024-04-26 01:42:03,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 761413632. Throughput: 0: 55420.4. Samples: 710808740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:42:03,924][47056] Avg episode reward: [(0, '0.339')] [2024-04-26 01:42:04,528][47288] Updated weights for policy 0, policy_version 46476 (0.0025) [2024-04-26 01:42:07,214][47288] Updated weights for policy 0, policy_version 46486 (0.0027) [2024-04-26 01:42:08,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 761708544. Throughput: 0: 55496.4. Samples: 711141840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:42:08,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 01:42:10,347][47288] Updated weights for policy 0, policy_version 46496 (0.0026) [2024-04-26 01:42:13,031][47288] Updated weights for policy 0, policy_version 46506 (0.0027) [2024-04-26 01:42:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 54886.3, 300 sec: 55650.1). Total num frames: 761970688. Throughput: 0: 55267.1. Samples: 711298700. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:42:13,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 01:42:16,349][47288] Updated weights for policy 0, policy_version 46516 (0.0034) [2024-04-26 01:42:18,911][47288] Updated weights for policy 0, policy_version 46526 (0.0027) [2024-04-26 01:42:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 762281984. Throughput: 0: 55329.9. Samples: 711629500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 01:42:18,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 01:42:22,249][47288] Updated weights for policy 0, policy_version 46536 (0.0027) [2024-04-26 01:42:23,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 762544128. Throughput: 0: 55367.2. Samples: 711964100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:42:23,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 01:42:24,822][47288] Updated weights for policy 0, policy_version 46546 (0.0027) [2024-04-26 01:42:28,032][47288] Updated weights for policy 0, policy_version 46556 (0.0026) [2024-04-26 01:42:28,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 762822656. Throughput: 0: 55705.3. Samples: 712137900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:42:28,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:42:30,678][47288] Updated weights for policy 0, policy_version 46566 (0.0032) [2024-04-26 01:42:33,872][47288] Updated weights for policy 0, policy_version 46576 (0.0031) [2024-04-26 01:42:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 763101184. Throughput: 0: 55682.7. Samples: 712470180. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:42:33,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 01:42:36,670][47288] Updated weights for policy 0, policy_version 46586 (0.0037) [2024-04-26 01:42:37,531][47267] Signal inference workers to stop experience collection... (10450 times) [2024-04-26 01:42:37,536][47267] Signal inference workers to resume experience collection... (10450 times) [2024-04-26 01:42:37,562][47288] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-04-26 01:42:37,567][47288] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-04-26 01:42:38,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 763363328. Throughput: 0: 55653.0. Samples: 712806340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:42:38,923][47056] Avg episode reward: [(0, '0.347')] [2024-04-26 01:42:39,728][47288] Updated weights for policy 0, policy_version 46596 (0.0025) [2024-04-26 01:42:42,437][47288] Updated weights for policy 0, policy_version 46606 (0.0031) [2024-04-26 01:42:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 763674624. Throughput: 0: 55715.4. Samples: 712972940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:42:43,923][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 01:42:45,706][47288] Updated weights for policy 0, policy_version 46616 (0.0027) [2024-04-26 01:42:48,346][47288] Updated weights for policy 0, policy_version 46626 (0.0031) [2024-04-26 01:42:48,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 763936768. Throughput: 0: 55508.5. Samples: 713306620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:42:48,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 01:42:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000046627_763936768.pth... 
[2024-04-26 01:42:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000045812_750583808.pth [2024-04-26 01:42:51,438][47288] Updated weights for policy 0, policy_version 46636 (0.0033) [2024-04-26 01:42:53,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 764231680. Throughput: 0: 55603.3. Samples: 713643980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:42:53,923][47056] Avg episode reward: [(0, '0.341')] [2024-04-26 01:42:54,064][47288] Updated weights for policy 0, policy_version 46646 (0.0027) [2024-04-26 01:42:57,225][47288] Updated weights for policy 0, policy_version 46656 (0.0029) [2024-04-26 01:42:58,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 764526592. Throughput: 0: 56069.9. Samples: 713821840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:42:58,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 01:42:59,841][47288] Updated weights for policy 0, policy_version 46666 (0.0030) [2024-04-26 01:43:03,218][47288] Updated weights for policy 0, policy_version 46676 (0.0030) [2024-04-26 01:43:03,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 764772352. Throughput: 0: 56148.0. Samples: 714156160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:43:03,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 01:43:05,691][47288] Updated weights for policy 0, policy_version 46686 (0.0028) [2024-04-26 01:43:08,923][47056] Fps is (10 sec: 50790.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 765034496. Throughput: 0: 56135.7. Samples: 714490200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:43:08,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:43:09,111][47288] Updated weights for policy 0, policy_version 46696 (0.0030) [2024-04-26 01:43:11,617][47288] Updated weights for policy 0, policy_version 46706 (0.0035) [2024-04-26 01:43:13,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 765313024. Throughput: 0: 55777.4. Samples: 714647880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:43:13,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 01:43:13,924][47267] Saving new best policy, reward=0.480! [2024-04-26 01:43:15,069][47288] Updated weights for policy 0, policy_version 46716 (0.0032) [2024-04-26 01:43:17,590][47288] Updated weights for policy 0, policy_version 46726 (0.0029) [2024-04-26 01:43:18,923][47056] Fps is (10 sec: 58981.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 765624320. Throughput: 0: 55725.3. Samples: 714977820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:43:18,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 01:43:20,775][47288] Updated weights for policy 0, policy_version 46736 (0.0026) [2024-04-26 01:43:23,729][47288] Updated weights for policy 0, policy_version 46746 (0.0031) [2024-04-26 01:43:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 765886464. Throughput: 0: 55784.5. Samples: 715316640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:43:23,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 01:43:26,417][47267] Signal inference workers to stop experience collection... (10500 times) [2024-04-26 01:43:26,419][47267] Signal inference workers to resume experience collection... 
(10500 times) [2024-04-26 01:43:26,428][47288] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-04-26 01:43:26,459][47288] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-04-26 01:43:26,526][47288] Updated weights for policy 0, policy_version 46756 (0.0031) [2024-04-26 01:43:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 766164992. Throughput: 0: 55760.5. Samples: 715482160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:43:28,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 01:43:29,417][47288] Updated weights for policy 0, policy_version 46766 (0.0027) [2024-04-26 01:43:32,376][47288] Updated weights for policy 0, policy_version 46776 (0.0033) [2024-04-26 01:43:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 766459904. Throughput: 0: 55729.8. Samples: 715814460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:43:33,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 01:43:35,265][47288] Updated weights for policy 0, policy_version 46786 (0.0033) [2024-04-26 01:43:38,310][47288] Updated weights for policy 0, policy_version 46796 (0.0034) [2024-04-26 01:43:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 766722048. Throughput: 0: 55700.7. Samples: 716150520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:43:38,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:43:41,248][47288] Updated weights for policy 0, policy_version 46806 (0.0032) [2024-04-26 01:43:43,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 767000576. Throughput: 0: 55461.8. Samples: 716317620. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:43:43,923][47056] Avg episode reward: [(0, '0.297')] [2024-04-26 01:43:44,343][47288] Updated weights for policy 0, policy_version 46816 (0.0032) [2024-04-26 01:43:47,123][47288] Updated weights for policy 0, policy_version 46826 (0.0027) [2024-04-26 01:43:48,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 767246336. Throughput: 0: 55484.0. Samples: 716652940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 01:43:48,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 01:43:50,368][47288] Updated weights for policy 0, policy_version 46836 (0.0026) [2024-04-26 01:43:52,903][47288] Updated weights for policy 0, policy_version 46846 (0.0032) [2024-04-26 01:43:53,924][47056] Fps is (10 sec: 57338.6, 60 sec: 55704.7, 300 sec: 55705.4). Total num frames: 767574016. Throughput: 0: 55338.4. Samples: 716980480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 01:43:53,924][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 01:43:56,394][47288] Updated weights for policy 0, policy_version 46856 (0.0024) [2024-04-26 01:43:58,785][47288] Updated weights for policy 0, policy_version 46866 (0.0030) [2024-04-26 01:43:58,923][47056] Fps is (10 sec: 60620.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 767852544. Throughput: 0: 55574.5. Samples: 717148740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 01:43:58,924][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:44:02,297][47288] Updated weights for policy 0, policy_version 46876 (0.0025) [2024-04-26 01:44:03,923][47056] Fps is (10 sec: 54071.5, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 768114688. Throughput: 0: 55752.9. Samples: 717486700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 01:44:03,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 01:44:04,942][47288] Updated weights for policy 0, policy_version 46886 (0.0029) [2024-04-26 01:44:08,132][47288] Updated weights for policy 0, policy_version 46896 (0.0027) [2024-04-26 01:44:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 768409600. Throughput: 0: 55655.2. Samples: 717821120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 01:44:08,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 01:44:10,924][47288] Updated weights for policy 0, policy_version 46906 (0.0040) [2024-04-26 01:44:13,889][47288] Updated weights for policy 0, policy_version 46916 (0.0027) [2024-04-26 01:44:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 768671744. Throughput: 0: 55593.7. Samples: 717983880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 01:44:13,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 01:44:16,749][47288] Updated weights for policy 0, policy_version 46926 (0.0028) [2024-04-26 01:44:18,923][47056] Fps is (10 sec: 54065.8, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 768950272. Throughput: 0: 55629.1. Samples: 718317780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 01:44:18,923][47056] Avg episode reward: [(0, '0.347')] [2024-04-26 01:44:19,705][47288] Updated weights for policy 0, policy_version 46936 (0.0027) [2024-04-26 01:44:22,533][47288] Updated weights for policy 0, policy_version 46946 (0.0033) [2024-04-26 01:44:23,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 769212416. Throughput: 0: 55611.2. Samples: 718653020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:44:23,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 01:44:24,824][47267] Signal inference workers to stop experience collection... 
(10550 times) [2024-04-26 01:44:24,824][47267] Signal inference workers to resume experience collection... (10550 times) [2024-04-26 01:44:24,838][47288] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-04-26 01:44:24,838][47288] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-04-26 01:44:25,636][47288] Updated weights for policy 0, policy_version 46956 (0.0034) [2024-04-26 01:44:28,385][47288] Updated weights for policy 0, policy_version 46966 (0.0028) [2024-04-26 01:44:28,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 769523712. Throughput: 0: 55525.6. Samples: 718816280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:44:28,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 01:44:31,638][47288] Updated weights for policy 0, policy_version 46976 (0.0028) [2024-04-26 01:44:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 769785856. Throughput: 0: 55419.2. Samples: 719146800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:44:33,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 01:44:34,227][47288] Updated weights for policy 0, policy_version 46986 (0.0033) [2024-04-26 01:44:37,411][47288] Updated weights for policy 0, policy_version 46996 (0.0028) [2024-04-26 01:44:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 770064384. Throughput: 0: 55713.6. Samples: 719487540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:44:38,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 01:44:40,037][47288] Updated weights for policy 0, policy_version 47006 (0.0032) [2024-04-26 01:44:43,373][47288] Updated weights for policy 0, policy_version 47016 (0.0027) [2024-04-26 01:44:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 770326528. Throughput: 0: 55725.4. Samples: 719656380. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 01:44:43,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 01:44:45,956][47288] Updated weights for policy 0, policy_version 47026 (0.0028) [2024-04-26 01:44:48,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 770605056. Throughput: 0: 55649.3. Samples: 719990920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:44:48,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 01:44:49,049][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000047035_770621440.pth... [2024-04-26 01:44:49,095][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000046221_757284864.pth [2024-04-26 01:44:49,219][47288] Updated weights for policy 0, policy_version 47036 (0.0029) [2024-04-26 01:44:51,856][47288] Updated weights for policy 0, policy_version 47046 (0.0028) [2024-04-26 01:44:53,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55433.4, 300 sec: 55594.5). Total num frames: 770899968. Throughput: 0: 55700.9. Samples: 720327660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:44:53,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 01:44:55,112][47288] Updated weights for policy 0, policy_version 47056 (0.0028) [2024-04-26 01:44:57,668][47288] Updated weights for policy 0, policy_version 47066 (0.0029) [2024-04-26 01:44:58,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 771162112. Throughput: 0: 55744.5. Samples: 720492380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:44:58,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 01:45:00,996][47288] Updated weights for policy 0, policy_version 47076 (0.0035) [2024-04-26 01:45:03,524][47288] Updated weights for policy 0, policy_version 47086 (0.0027) [2024-04-26 01:45:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 771457024. 
Throughput: 0: 55644.3. Samples: 720821760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:45:03,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 01:45:06,768][47288] Updated weights for policy 0, policy_version 47096 (0.0032) [2024-04-26 01:45:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 771719168. Throughput: 0: 55602.7. Samples: 721155140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:45:08,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 01:45:09,444][47288] Updated weights for policy 0, policy_version 47106 (0.0025) [2024-04-26 01:45:12,725][47288] Updated weights for policy 0, policy_version 47116 (0.0034) [2024-04-26 01:45:13,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 772014080. Throughput: 0: 55656.0. Samples: 721320800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 01:45:13,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 01:45:15,246][47288] Updated weights for policy 0, policy_version 47126 (0.0031) [2024-04-26 01:45:16,809][47267] Signal inference workers to stop experience collection... (10600 times) [2024-04-26 01:45:16,809][47267] Signal inference workers to resume experience collection... (10600 times) [2024-04-26 01:45:16,827][47288] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-04-26 01:45:16,827][47288] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-04-26 01:45:18,719][47288] Updated weights for policy 0, policy_version 47136 (0.0026) [2024-04-26 01:45:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 772292608. Throughput: 0: 55849.7. Samples: 721660040. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:45:18,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 01:45:21,202][47288] Updated weights for policy 0, policy_version 47146 (0.0034) [2024-04-26 01:45:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 772554752. Throughput: 0: 55615.0. Samples: 721990220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:45:23,923][47056] Avg episode reward: [(0, '0.318')] [2024-04-26 01:45:24,621][47288] Updated weights for policy 0, policy_version 47156 (0.0033) [2024-04-26 01:45:26,968][47288] Updated weights for policy 0, policy_version 47166 (0.0033) [2024-04-26 01:45:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 772849664. Throughput: 0: 55522.2. Samples: 722154880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:45:28,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 01:45:30,543][47288] Updated weights for policy 0, policy_version 47176 (0.0029) [2024-04-26 01:45:32,898][47288] Updated weights for policy 0, policy_version 47186 (0.0028) [2024-04-26 01:45:33,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 773111808. Throughput: 0: 55551.4. Samples: 722490720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:45:33,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 01:45:36,306][47288] Updated weights for policy 0, policy_version 47196 (0.0029) [2024-04-26 01:45:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 773406720. Throughput: 0: 55474.5. Samples: 722824020. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:45:38,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 01:45:39,357][47288] Updated weights for policy 0, policy_version 47206 (0.0024) [2024-04-26 01:45:42,079][47288] Updated weights for policy 0, policy_version 47216 (0.0030) [2024-04-26 01:45:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 773668864. Throughput: 0: 55503.8. Samples: 722990040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 01:45:43,923][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 01:45:45,069][47288] Updated weights for policy 0, policy_version 47226 (0.0031) [2024-04-26 01:45:48,071][47288] Updated weights for policy 0, policy_version 47236 (0.0029) [2024-04-26 01:45:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 773963776. Throughput: 0: 55553.7. Samples: 723321680. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-04-26 01:45:48,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 01:45:50,964][47288] Updated weights for policy 0, policy_version 47246 (0.0032) [2024-04-26 01:45:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 774225920. Throughput: 0: 55516.5. Samples: 723653380. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-04-26 01:45:53,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 01:45:54,034][47288] Updated weights for policy 0, policy_version 47256 (0.0027) [2024-04-26 01:45:56,792][47288] Updated weights for policy 0, policy_version 47266 (0.0031) [2024-04-26 01:45:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 774504448. Throughput: 0: 55498.4. Samples: 723818220. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-04-26 01:45:58,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 01:45:59,405][47267] Signal inference workers to stop experience collection... 
(10650 times) [2024-04-26 01:45:59,439][47288] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-04-26 01:45:59,462][47267] Signal inference workers to resume experience collection... (10650 times) [2024-04-26 01:45:59,470][47288] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-04-26 01:45:59,837][47288] Updated weights for policy 0, policy_version 47276 (0.0030) [2024-04-26 01:46:02,529][47288] Updated weights for policy 0, policy_version 47286 (0.0027) [2024-04-26 01:46:03,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 774782976. Throughput: 0: 55381.9. Samples: 724152220. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-04-26 01:46:03,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 01:46:05,743][47288] Updated weights for policy 0, policy_version 47296 (0.0026) [2024-04-26 01:46:08,314][47288] Updated weights for policy 0, policy_version 47306 (0.0030) [2024-04-26 01:46:08,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 775061504. Throughput: 0: 55348.9. Samples: 724480920. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-04-26 01:46:08,923][47056] Avg episode reward: [(0, '0.323')] [2024-04-26 01:46:11,717][47288] Updated weights for policy 0, policy_version 47316 (0.0035) [2024-04-26 01:46:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 775356416. Throughput: 0: 55619.2. Samples: 724657740. Policy #0 lag: (min: 2.0, avg: 10.6, max: 22.0) [2024-04-26 01:46:13,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 01:46:14,209][47288] Updated weights for policy 0, policy_version 47326 (0.0026) [2024-04-26 01:46:17,862][47288] Updated weights for policy 0, policy_version 47336 (0.0027) [2024-04-26 01:46:18,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 775602176. Throughput: 0: 55498.0. Samples: 724988140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 01:46:18,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 01:46:20,404][47288] Updated weights for policy 0, policy_version 47346 (0.0029) [2024-04-26 01:46:23,653][47288] Updated weights for policy 0, policy_version 47356 (0.0032) [2024-04-26 01:46:23,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 775897088. Throughput: 0: 55549.3. Samples: 725323740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 01:46:23,923][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 01:46:26,433][47288] Updated weights for policy 0, policy_version 47366 (0.0030) [2024-04-26 01:46:28,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 776159232. Throughput: 0: 55459.2. Samples: 725485720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 01:46:28,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 01:46:29,517][47288] Updated weights for policy 0, policy_version 47376 (0.0031) [2024-04-26 01:46:32,370][47288] Updated weights for policy 0, policy_version 47386 (0.0030) [2024-04-26 01:46:33,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 776437760. Throughput: 0: 55555.6. Samples: 725821680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 01:46:33,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 01:46:35,293][47288] Updated weights for policy 0, policy_version 47396 (0.0027) [2024-04-26 01:46:38,044][47288] Updated weights for policy 0, policy_version 47406 (0.0035) [2024-04-26 01:46:38,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 776716288. Throughput: 0: 55633.6. Samples: 726156900. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 01:46:38,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 01:46:41,178][47288] Updated weights for policy 0, policy_version 47416 (0.0027) [2024-04-26 01:46:43,619][47267] Signal inference workers to stop experience collection... (10700 times) [2024-04-26 01:46:43,651][47288] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-04-26 01:46:43,675][47267] Signal inference workers to resume experience collection... (10700 times) [2024-04-26 01:46:43,676][47288] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-04-26 01:46:43,803][47288] Updated weights for policy 0, policy_version 47426 (0.0026) [2024-04-26 01:46:43,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 777027584. Throughput: 0: 55642.1. Samples: 726322120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 01:46:43,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 01:46:46,974][47288] Updated weights for policy 0, policy_version 47436 (0.0026) [2024-04-26 01:46:48,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 777306112. Throughput: 0: 55681.8. Samples: 726657900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 01:46:48,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 01:46:48,950][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000047444_777322496.pth... [2024-04-26 01:46:48,996][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000046627_763936768.pth [2024-04-26 01:46:49,614][47288] Updated weights for policy 0, policy_version 47446 (0.0031) [2024-04-26 01:46:52,951][47288] Updated weights for policy 0, policy_version 47456 (0.0031) [2024-04-26 01:46:53,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 777551872. Throughput: 0: 55874.3. Samples: 726995260. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 01:46:53,924][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 01:46:55,459][47288] Updated weights for policy 0, policy_version 47466 (0.0028) [2024-04-26 01:46:58,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 777830400. Throughput: 0: 55695.9. Samples: 727164060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 01:46:58,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 01:46:58,983][47288] Updated weights for policy 0, policy_version 47476 (0.0031) [2024-04-26 01:47:01,313][47288] Updated weights for policy 0, policy_version 47486 (0.0027) [2024-04-26 01:47:03,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 778108928. Throughput: 0: 55741.1. Samples: 727496480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 01:47:03,923][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 01:47:04,741][47288] Updated weights for policy 0, policy_version 47496 (0.0029) [2024-04-26 01:47:07,770][47288] Updated weights for policy 0, policy_version 47506 (0.0025) [2024-04-26 01:47:08,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 778420224. Throughput: 0: 55646.3. Samples: 727827820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 01:47:08,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 01:47:10,460][47288] Updated weights for policy 0, policy_version 47516 (0.0034) [2024-04-26 01:47:13,800][47288] Updated weights for policy 0, policy_version 47526 (0.0034) [2024-04-26 01:47:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 778665984. Throughput: 0: 55829.2. Samples: 727998020. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 01:47:13,923][47056] Avg episode reward: [(0, '0.339')] [2024-04-26 01:47:16,302][47288] Updated weights for policy 0, policy_version 47536 (0.0027) [2024-04-26 01:47:18,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 778960896. Throughput: 0: 55836.0. Samples: 728334300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 01:47:18,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 01:47:19,639][47288] Updated weights for policy 0, policy_version 47546 (0.0028) [2024-04-26 01:47:22,295][47288] Updated weights for policy 0, policy_version 47556 (0.0030) [2024-04-26 01:47:23,923][47056] Fps is (10 sec: 58981.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 779255808. Throughput: 0: 55652.4. Samples: 728661260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 01:47:23,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 01:47:25,440][47288] Updated weights for policy 0, policy_version 47566 (0.0034) [2024-04-26 01:47:27,625][47267] Signal inference workers to stop experience collection... (10750 times) [2024-04-26 01:47:27,628][47267] Signal inference workers to resume experience collection... (10750 times) [2024-04-26 01:47:27,641][47288] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-04-26 01:47:27,641][47288] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-04-26 01:47:28,255][47288] Updated weights for policy 0, policy_version 47576 (0.0036) [2024-04-26 01:47:28,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 779501568. Throughput: 0: 55734.3. Samples: 728830160. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 01:47:28,923][47056] Avg episode reward: [(0, '0.224')] [2024-04-26 01:47:31,264][47288] Updated weights for policy 0, policy_version 47586 (0.0033) [2024-04-26 01:47:33,923][47056] Fps is (10 sec: 52429.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 779780096. Throughput: 0: 55676.8. Samples: 729163360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 01:47:33,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 01:47:34,104][47288] Updated weights for policy 0, policy_version 47596 (0.0031) [2024-04-26 01:47:36,988][47288] Updated weights for policy 0, policy_version 47606 (0.0029) [2024-04-26 01:47:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 780058624. Throughput: 0: 55647.2. Samples: 729499380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 01:47:38,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 01:47:39,947][47288] Updated weights for policy 0, policy_version 47616 (0.0027) [2024-04-26 01:47:42,748][47288] Updated weights for policy 0, policy_version 47626 (0.0029) [2024-04-26 01:47:43,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 780353536. Throughput: 0: 55663.5. Samples: 729668920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 01:47:43,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 01:47:45,932][47288] Updated weights for policy 0, policy_version 47636 (0.0029) [2024-04-26 01:47:48,562][47288] Updated weights for policy 0, policy_version 47646 (0.0028) [2024-04-26 01:47:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 780632064. Throughput: 0: 55648.7. Samples: 730000680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:47:48,923][47056] Avg episode reward: [(0, '0.274')] [2024-04-26 01:47:51,737][47288] Updated weights for policy 0, policy_version 47656 (0.0026) [2024-04-26 01:47:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 780910592. Throughput: 0: 55715.5. Samples: 730335020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:47:53,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 01:47:54,670][47288] Updated weights for policy 0, policy_version 47666 (0.0028) [2024-04-26 01:47:57,415][47288] Updated weights for policy 0, policy_version 47676 (0.0027) [2024-04-26 01:47:58,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 781205504. Throughput: 0: 55785.2. Samples: 730508360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:47:58,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 01:48:00,543][47288] Updated weights for policy 0, policy_version 47686 (0.0030) [2024-04-26 01:48:03,362][47288] Updated weights for policy 0, policy_version 47696 (0.0030) [2024-04-26 01:48:03,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 781467648. Throughput: 0: 55765.7. Samples: 730843760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:48:03,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 01:48:06,427][47288] Updated weights for policy 0, policy_version 47706 (0.0032) [2024-04-26 01:48:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 781746176. Throughput: 0: 55895.3. Samples: 731176540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:48:08,923][47056] Avg episode reward: [(0, '0.325')] [2024-04-26 01:48:09,250][47288] Updated weights for policy 0, policy_version 47716 (0.0030) [2024-04-26 01:48:12,265][47288] Updated weights for policy 0, policy_version 47726 (0.0033) [2024-04-26 01:48:13,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 782024704. Throughput: 0: 55799.0. Samples: 731341120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:48:13,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 01:48:15,135][47288] Updated weights for policy 0, policy_version 47736 (0.0025) [2024-04-26 01:48:17,978][47288] Updated weights for policy 0, policy_version 47746 (0.0029) [2024-04-26 01:48:18,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 782303232. Throughput: 0: 55791.0. Samples: 731673960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:48:18,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 01:48:21,099][47288] Updated weights for policy 0, policy_version 47756 (0.0034) [2024-04-26 01:48:23,742][47288] Updated weights for policy 0, policy_version 47766 (0.0026) [2024-04-26 01:48:23,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 782598144. Throughput: 0: 55785.3. Samples: 732009720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:48:23,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 01:48:27,061][47288] Updated weights for policy 0, policy_version 47776 (0.0026) [2024-04-26 01:48:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 782876672. Throughput: 0: 55724.4. Samples: 732176520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:48:28,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:48:29,731][47288] Updated weights for policy 0, policy_version 47786 (0.0030) [2024-04-26 01:48:32,915][47267] Signal inference workers to stop experience collection... (10800 times) [2024-04-26 01:48:32,916][47267] Signal inference workers to resume experience collection... (10800 times) [2024-04-26 01:48:32,942][47288] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-04-26 01:48:32,943][47288] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-04-26 01:48:33,026][47288] Updated weights for policy 0, policy_version 47796 (0.0032) [2024-04-26 01:48:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 783138816. Throughput: 0: 55750.3. Samples: 732509440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:48:33,923][47056] Avg episode reward: [(0, '0.352')] [2024-04-26 01:48:35,640][47288] Updated weights for policy 0, policy_version 47806 (0.0023) [2024-04-26 01:48:38,817][47288] Updated weights for policy 0, policy_version 47816 (0.0030) [2024-04-26 01:48:38,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 783417344. Throughput: 0: 55800.5. Samples: 732846040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:48:38,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 01:48:41,674][47288] Updated weights for policy 0, policy_version 47826 (0.0029) [2024-04-26 01:48:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 783712256. Throughput: 0: 55721.8. Samples: 733015840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 01:48:43,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 01:48:44,647][47288] Updated weights for policy 0, policy_version 47836 (0.0033) [2024-04-26 01:48:47,658][47288] Updated weights for policy 0, policy_version 47846 (0.0028) [2024-04-26 01:48:48,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.7, 300 sec: 55539.2). Total num frames: 783958016. Throughput: 0: 55654.8. Samples: 733348220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 01:48:48,923][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 01:48:49,020][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000047850_783974400.pth... [2024-04-26 01:48:49,067][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000047035_770621440.pth [2024-04-26 01:48:50,662][47288] Updated weights for policy 0, policy_version 47856 (0.0025) [2024-04-26 01:48:53,682][47288] Updated weights for policy 0, policy_version 47866 (0.0028) [2024-04-26 01:48:53,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 784236544. Throughput: 0: 55621.5. Samples: 733679520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 01:48:53,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 01:48:56,520][47288] Updated weights for policy 0, policy_version 47876 (0.0026) [2024-04-26 01:48:58,923][47056] Fps is (10 sec: 57342.8, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 784531456. Throughput: 0: 55718.1. Samples: 733848440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 01:48:58,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 01:48:59,376][47288] Updated weights for policy 0, policy_version 47886 (0.0028) [2024-04-26 01:49:02,407][47288] Updated weights for policy 0, policy_version 47896 (0.0029) [2024-04-26 01:49:03,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 784826368. 
Throughput: 0: 55777.3. Samples: 734183940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 01:49:03,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 01:49:05,109][47288] Updated weights for policy 0, policy_version 47906 (0.0033) [2024-04-26 01:49:08,198][47288] Updated weights for policy 0, policy_version 47916 (0.0032) [2024-04-26 01:49:08,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 785104896. Throughput: 0: 55786.1. Samples: 734520100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 01:49:08,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 01:49:11,109][47288] Updated weights for policy 0, policy_version 47926 (0.0026) [2024-04-26 01:49:13,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 785350656. Throughput: 0: 55586.7. Samples: 734677920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 01:49:13,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 01:49:14,225][47288] Updated weights for policy 0, policy_version 47936 (0.0027) [2024-04-26 01:49:16,998][47288] Updated weights for policy 0, policy_version 47946 (0.0031) [2024-04-26 01:49:18,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 785645568. Throughput: 0: 55585.8. Samples: 735010800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 01:49:18,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 01:49:20,069][47288] Updated weights for policy 0, policy_version 47956 (0.0027) [2024-04-26 01:49:22,883][47288] Updated weights for policy 0, policy_version 47966 (0.0033) [2024-04-26 01:49:23,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 785924096. Throughput: 0: 55589.9. Samples: 735347580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 01:49:23,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 01:49:25,917][47288] Updated weights for policy 0, policy_version 47976 (0.0030) [2024-04-26 01:49:28,675][47288] Updated weights for policy 0, policy_version 47986 (0.0026) [2024-04-26 01:49:28,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 786219008. Throughput: 0: 55536.6. Samples: 735514980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 01:49:28,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 01:49:29,735][47267] Signal inference workers to stop experience collection... (10850 times) [2024-04-26 01:49:29,765][47288] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-04-26 01:49:29,791][47267] Signal inference workers to resume experience collection... (10850 times) [2024-04-26 01:49:29,792][47288] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-04-26 01:49:31,683][47288] Updated weights for policy 0, policy_version 47996 (0.0031) [2024-04-26 01:49:33,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 786481152. Throughput: 0: 55649.3. Samples: 735852440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 01:49:33,923][47056] Avg episode reward: [(0, '0.354')] [2024-04-26 01:49:34,515][47288] Updated weights for policy 0, policy_version 48006 (0.0029) [2024-04-26 01:49:37,510][47288] Updated weights for policy 0, policy_version 48016 (0.0033) [2024-04-26 01:49:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 786792448. Throughput: 0: 55782.5. Samples: 736189720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 01:49:38,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 01:49:40,375][47288] Updated weights for policy 0, policy_version 48026 (0.0023) [2024-04-26 01:49:43,572][47288] Updated weights for policy 0, policy_version 48036 (0.0032) [2024-04-26 01:49:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 787054592. Throughput: 0: 55921.6. Samples: 736364900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 01:49:43,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 01:49:46,109][47288] Updated weights for policy 0, policy_version 48046 (0.0030) [2024-04-26 01:49:48,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 787316736. Throughput: 0: 55826.0. Samples: 736696100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:49:48,923][47056] Avg episode reward: [(0, '0.341')] [2024-04-26 01:49:49,401][47288] Updated weights for policy 0, policy_version 48056 (0.0026) [2024-04-26 01:49:51,843][47288] Updated weights for policy 0, policy_version 48066 (0.0030) [2024-04-26 01:49:53,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55979.0, 300 sec: 55705.6). Total num frames: 787595264. Throughput: 0: 55715.3. Samples: 737027280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:49:53,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 01:49:55,246][47288] Updated weights for policy 0, policy_version 48076 (0.0028) [2024-04-26 01:49:57,797][47288] Updated weights for policy 0, policy_version 48086 (0.0030) [2024-04-26 01:49:58,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 787857408. Throughput: 0: 55877.8. Samples: 737192420. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:49:58,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 01:50:01,107][47288] Updated weights for policy 0, policy_version 48096 (0.0025) [2024-04-26 01:50:03,882][47288] Updated weights for policy 0, policy_version 48106 (0.0037) [2024-04-26 01:50:03,923][47056] Fps is (10 sec: 57342.6, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 788168704. Throughput: 0: 55863.4. Samples: 737524660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:50:03,924][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 01:50:06,905][47288] Updated weights for policy 0, policy_version 48116 (0.0034) [2024-04-26 01:50:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 788430848. Throughput: 0: 55783.4. Samples: 737857840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:50:08,923][47056] Avg episode reward: [(0, '0.279')] [2024-04-26 01:50:09,722][47288] Updated weights for policy 0, policy_version 48126 (0.0031) [2024-04-26 01:50:12,743][47288] Updated weights for policy 0, policy_version 48136 (0.0036) [2024-04-26 01:50:13,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 788709376. Throughput: 0: 55807.3. Samples: 738026320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 01:50:13,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 01:50:15,510][47288] Updated weights for policy 0, policy_version 48146 (0.0032) [2024-04-26 01:50:18,753][47288] Updated weights for policy 0, policy_version 48156 (0.0029) [2024-04-26 01:50:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 788987904. Throughput: 0: 55740.2. Samples: 738360760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:50:18,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 01:50:20,294][47267] Signal inference workers to stop experience collection... 
(10900 times) [2024-04-26 01:50:20,295][47267] Signal inference workers to resume experience collection... (10900 times) [2024-04-26 01:50:20,321][47288] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-04-26 01:50:20,321][47288] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-04-26 01:50:21,499][47288] Updated weights for policy 0, policy_version 48166 (0.0030) [2024-04-26 01:50:23,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 789250048. Throughput: 0: 55577.0. Samples: 738690680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:50:23,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 01:50:24,569][47288] Updated weights for policy 0, policy_version 48176 (0.0027) [2024-04-26 01:50:27,521][47288] Updated weights for policy 0, policy_version 48186 (0.0026) [2024-04-26 01:50:28,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 789544960. Throughput: 0: 55301.6. Samples: 738853480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:50:28,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 01:50:30,672][47288] Updated weights for policy 0, policy_version 48196 (0.0028) [2024-04-26 01:50:33,379][47288] Updated weights for policy 0, policy_version 48206 (0.0028) [2024-04-26 01:50:33,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 789807104. Throughput: 0: 55337.2. Samples: 739186280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:50:33,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 01:50:36,586][47288] Updated weights for policy 0, policy_version 48216 (0.0029) [2024-04-26 01:50:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 790102016. Throughput: 0: 55507.4. Samples: 739525120. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:50:38,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 01:50:39,198][47288] Updated weights for policy 0, policy_version 48226 (0.0029) [2024-04-26 01:50:42,447][47288] Updated weights for policy 0, policy_version 48236 (0.0029) [2024-04-26 01:50:43,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 790396928. Throughput: 0: 55649.7. Samples: 739696660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:50:43,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 01:50:45,182][47288] Updated weights for policy 0, policy_version 48246 (0.0028) [2024-04-26 01:50:48,190][47288] Updated weights for policy 0, policy_version 48256 (0.0031) [2024-04-26 01:50:48,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 790642688. Throughput: 0: 55698.7. Samples: 740031100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 01:50:48,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 01:50:49,052][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000048258_790659072.pth... [2024-04-26 01:50:49,111][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000047444_777322496.pth [2024-04-26 01:50:51,055][47288] Updated weights for policy 0, policy_version 48266 (0.0037) [2024-04-26 01:50:53,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 790937600. Throughput: 0: 55625.9. Samples: 740361000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 01:50:53,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 01:50:53,994][47288] Updated weights for policy 0, policy_version 48276 (0.0030) [2024-04-26 01:50:56,787][47288] Updated weights for policy 0, policy_version 48286 (0.0031) [2024-04-26 01:50:58,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 791216128. 
Throughput: 0: 55673.8. Samples: 740531640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 01:50:58,924][47056] Avg episode reward: [(0, '0.339')] [2024-04-26 01:50:59,879][47288] Updated weights for policy 0, policy_version 48296 (0.0026) [2024-04-26 01:51:02,732][47288] Updated weights for policy 0, policy_version 48306 (0.0032) [2024-04-26 01:51:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 791494656. Throughput: 0: 55658.0. Samples: 740865360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 01:51:03,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 01:51:05,777][47288] Updated weights for policy 0, policy_version 48316 (0.0027) [2024-04-26 01:51:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 791756800. Throughput: 0: 55830.6. Samples: 741203060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 01:51:08,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 01:51:08,992][47288] Updated weights for policy 0, policy_version 48326 (0.0034) [2024-04-26 01:51:11,082][47267] Signal inference workers to stop experience collection... (10950 times) [2024-04-26 01:51:11,120][47288] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-04-26 01:51:11,146][47267] Signal inference workers to resume experience collection... (10950 times) [2024-04-26 01:51:11,146][47288] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-04-26 01:51:11,561][47288] Updated weights for policy 0, policy_version 48336 (0.0028) [2024-04-26 01:51:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 792035328. Throughput: 0: 55822.0. Samples: 741365460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 01:51:13,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 01:51:14,726][47288] Updated weights for policy 0, policy_version 48346 (0.0030) [2024-04-26 01:51:17,472][47288] Updated weights for policy 0, policy_version 48356 (0.0027) [2024-04-26 01:51:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 792330240. Throughput: 0: 55793.9. Samples: 741697000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 01:51:18,931][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 01:51:20,564][47288] Updated weights for policy 0, policy_version 48366 (0.0029) [2024-04-26 01:51:23,353][47288] Updated weights for policy 0, policy_version 48376 (0.0027) [2024-04-26 01:51:23,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55761.2). Total num frames: 792608768. Throughput: 0: 55748.0. Samples: 742033780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:51:23,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 01:51:26,291][47288] Updated weights for policy 0, policy_version 48386 (0.0033) [2024-04-26 01:51:28,923][47056] Fps is (10 sec: 55704.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 792887296. Throughput: 0: 55678.6. Samples: 742202200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:51:28,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 01:51:29,436][47288] Updated weights for policy 0, policy_version 48396 (0.0027) [2024-04-26 01:51:32,208][47288] Updated weights for policy 0, policy_version 48406 (0.0035) [2024-04-26 01:51:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 793182208. Throughput: 0: 55696.0. Samples: 742537420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:51:33,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 01:51:35,197][47288] Updated weights for policy 0, policy_version 48416 (0.0030) [2024-04-26 01:51:38,144][47288] Updated weights for policy 0, policy_version 48426 (0.0028) [2024-04-26 01:51:38,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 793460736. Throughput: 0: 55749.3. Samples: 742869720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:51:38,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 01:51:41,039][47288] Updated weights for policy 0, policy_version 48436 (0.0030) [2024-04-26 01:51:43,904][47288] Updated weights for policy 0, policy_version 48446 (0.0032) [2024-04-26 01:51:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 793739264. Throughput: 0: 55773.3. Samples: 743041440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:51:43,923][47056] Avg episode reward: [(0, '0.347')] [2024-04-26 01:51:46,848][47288] Updated weights for policy 0, policy_version 48456 (0.0027) [2024-04-26 01:51:48,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 793985024. Throughput: 0: 55778.1. Samples: 743375380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 01:51:48,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 01:51:50,098][47288] Updated weights for policy 0, policy_version 48466 (0.0027) [2024-04-26 01:51:52,798][47288] Updated weights for policy 0, policy_version 48476 (0.0027) [2024-04-26 01:51:53,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 794279936. Throughput: 0: 55706.2. Samples: 743709840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:51:53,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 01:51:55,882][47288] Updated weights for policy 0, policy_version 48486 (0.0026) [2024-04-26 01:51:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 794542080. Throughput: 0: 55693.8. Samples: 743871680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:51:58,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 01:51:59,045][47288] Updated weights for policy 0, policy_version 48496 (0.0031) [2024-04-26 01:52:01,755][47288] Updated weights for policy 0, policy_version 48506 (0.0030) [2024-04-26 01:52:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 794836992. Throughput: 0: 55663.6. Samples: 744201860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:52:03,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 01:52:04,991][47288] Updated weights for policy 0, policy_version 48516 (0.0030) [2024-04-26 01:52:07,426][47288] Updated weights for policy 0, policy_version 48526 (0.0027) [2024-04-26 01:52:08,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 795131904. Throughput: 0: 55642.8. Samples: 744537700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:52:08,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 01:52:10,878][47288] Updated weights for policy 0, policy_version 48536 (0.0025) [2024-04-26 01:52:13,388][47288] Updated weights for policy 0, policy_version 48546 (0.0030) [2024-04-26 01:52:13,768][47267] Signal inference workers to stop experience collection... (11000 times) [2024-04-26 01:52:13,768][47267] Signal inference workers to resume experience collection... 
(11000 times) [2024-04-26 01:52:13,779][47288] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-04-26 01:52:13,779][47288] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-04-26 01:52:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 795410432. Throughput: 0: 55732.3. Samples: 744710140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:52:13,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 01:52:16,731][47288] Updated weights for policy 0, policy_version 48556 (0.0030) [2024-04-26 01:52:18,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 795672576. Throughput: 0: 55593.4. Samples: 745039120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 01:52:18,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 01:52:19,474][47288] Updated weights for policy 0, policy_version 48566 (0.0038) [2024-04-26 01:52:22,869][47288] Updated weights for policy 0, policy_version 48576 (0.0031) [2024-04-26 01:52:23,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 795934720. Throughput: 0: 55579.2. Samples: 745370780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:52:23,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 01:52:25,293][47288] Updated weights for policy 0, policy_version 48586 (0.0032) [2024-04-26 01:52:28,757][47288] Updated weights for policy 0, policy_version 48596 (0.0024) [2024-04-26 01:52:28,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.8, 300 sec: 55705.6). Total num frames: 796213248. Throughput: 0: 55273.1. Samples: 745528720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:52:28,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 01:52:31,185][47288] Updated weights for policy 0, policy_version 48606 (0.0032) [2024-04-26 01:52:33,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55159.5, 300 sec: 55705.6). 
Total num frames: 796491776. Throughput: 0: 55395.6. Samples: 745868180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:52:33,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 01:52:34,490][47288] Updated weights for policy 0, policy_version 48616 (0.0028) [2024-04-26 01:52:37,157][47288] Updated weights for policy 0, policy_version 48626 (0.0031) [2024-04-26 01:52:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 796786688. Throughput: 0: 55492.0. Samples: 746206980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:52:38,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 01:52:40,348][47288] Updated weights for policy 0, policy_version 48636 (0.0030) [2024-04-26 01:52:42,929][47288] Updated weights for policy 0, policy_version 48646 (0.0025) [2024-04-26 01:52:43,923][47056] Fps is (10 sec: 60621.1, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 797097984. Throughput: 0: 55692.9. Samples: 746377860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:52:43,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 01:52:46,216][47288] Updated weights for policy 0, policy_version 48656 (0.0031) [2024-04-26 01:52:48,752][47288] Updated weights for policy 0, policy_version 48666 (0.0034) [2024-04-26 01:52:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 797360128. Throughput: 0: 55800.3. Samples: 746712880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 01:52:48,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 01:52:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000048667_797360128.pth... 
[2024-04-26 01:52:48,987][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000047850_783974400.pth [2024-04-26 01:52:52,189][47288] Updated weights for policy 0, policy_version 48676 (0.0028) [2024-04-26 01:52:53,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 797622272. Throughput: 0: 55809.3. Samples: 747049120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:52:53,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 01:52:54,612][47288] Updated weights for policy 0, policy_version 48686 (0.0032) [2024-04-26 01:52:57,880][47288] Updated weights for policy 0, policy_version 48696 (0.0036) [2024-04-26 01:52:58,923][47056] Fps is (10 sec: 50790.9, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 797868032. Throughput: 0: 55535.5. Samples: 747209240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:52:58,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 01:53:00,587][47288] Updated weights for policy 0, policy_version 48706 (0.0029) [2024-04-26 01:53:03,794][47288] Updated weights for policy 0, policy_version 48716 (0.0032) [2024-04-26 01:53:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 798162944. Throughput: 0: 55613.0. Samples: 747541700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:53:03,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 01:53:06,344][47288] Updated weights for policy 0, policy_version 48726 (0.0029) [2024-04-26 01:53:08,923][47056] Fps is (10 sec: 55704.5, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 798425088. Throughput: 0: 55692.2. Samples: 747876940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:53:08,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 01:53:08,952][47267] Saving new best policy, reward=0.495! [2024-04-26 01:53:09,599][47267] Signal inference workers to stop experience collection... 
(11050 times) [2024-04-26 01:53:09,632][47288] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-04-26 01:53:09,657][47267] Signal inference workers to resume experience collection... (11050 times) [2024-04-26 01:53:09,662][47288] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-04-26 01:53:09,766][47288] Updated weights for policy 0, policy_version 48736 (0.0026) [2024-04-26 01:53:12,126][47288] Updated weights for policy 0, policy_version 48746 (0.0037) [2024-04-26 01:53:13,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 798736384. Throughput: 0: 55931.4. Samples: 748045640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:53:13,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 01:53:15,624][47288] Updated weights for policy 0, policy_version 48756 (0.0029) [2024-04-26 01:53:18,060][47288] Updated weights for policy 0, policy_version 48766 (0.0031) [2024-04-26 01:53:18,923][47056] Fps is (10 sec: 62259.9, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 799047680. Throughput: 0: 55801.8. Samples: 748379260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:53:18,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 01:53:21,336][47288] Updated weights for policy 0, policy_version 48776 (0.0029) [2024-04-26 01:53:23,915][47288] Updated weights for policy 0, policy_version 48786 (0.0033) [2024-04-26 01:53:23,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 799309824. Throughput: 0: 55790.7. Samples: 748717560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 01:53:23,931][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 01:53:27,419][47288] Updated weights for policy 0, policy_version 48796 (0.0027) [2024-04-26 01:53:28,923][47056] Fps is (10 sec: 50789.6, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 799555584. Throughput: 0: 55551.3. Samples: 748877680. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 01:53:28,924][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 01:53:29,820][47288] Updated weights for policy 0, policy_version 48806 (0.0031) [2024-04-26 01:53:33,329][47288] Updated weights for policy 0, policy_version 48816 (0.0029) [2024-04-26 01:53:33,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 799817728. Throughput: 0: 55620.0. Samples: 749215780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 01:53:33,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 01:53:35,562][47288] Updated weights for policy 0, policy_version 48826 (0.0030) [2024-04-26 01:53:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 800112640. Throughput: 0: 55679.8. Samples: 749554720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 01:53:38,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 01:53:39,245][47288] Updated weights for policy 0, policy_version 48836 (0.0030) [2024-04-26 01:53:41,382][47288] Updated weights for policy 0, policy_version 48846 (0.0031) [2024-04-26 01:53:43,923][47056] Fps is (10 sec: 57343.0, 60 sec: 54886.2, 300 sec: 55705.5). Total num frames: 800391168. Throughput: 0: 55588.6. Samples: 749710740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 01:53:43,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 01:53:45,073][47288] Updated weights for policy 0, policy_version 48856 (0.0026) [2024-04-26 01:53:47,333][47288] Updated weights for policy 0, policy_version 48866 (0.0029) [2024-04-26 01:53:48,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 800702464. Throughput: 0: 55640.4. Samples: 750045520. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 01:53:48,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 01:53:50,851][47288] Updated weights for policy 0, policy_version 48876 (0.0027) [2024-04-26 01:53:53,184][47288] Updated weights for policy 0, policy_version 48886 (0.0025) [2024-04-26 01:53:53,923][47056] Fps is (10 sec: 58984.1, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 800980992. Throughput: 0: 55557.6. Samples: 750377020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:53:53,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 01:53:56,679][47288] Updated weights for policy 0, policy_version 48896 (0.0028) [2024-04-26 01:53:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 801259520. Throughput: 0: 55871.2. Samples: 750559840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:53:58,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 01:53:58,965][47288] Updated weights for policy 0, policy_version 48906 (0.0030) [2024-04-26 01:54:02,665][47288] Updated weights for policy 0, policy_version 48916 (0.0030) [2024-04-26 01:54:03,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 801521664. Throughput: 0: 55891.6. Samples: 750894380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:54:03,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 01:54:04,889][47288] Updated weights for policy 0, policy_version 48926 (0.0028) [2024-04-26 01:54:08,690][47288] Updated weights for policy 0, policy_version 48936 (0.0031) [2024-04-26 01:54:08,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 801783808. Throughput: 0: 55690.6. Samples: 751223640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:54:08,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 01:54:10,583][47267] Signal inference workers to stop experience collection... 
(11100 times) [2024-04-26 01:54:10,583][47267] Signal inference workers to resume experience collection... (11100 times) [2024-04-26 01:54:10,595][47288] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-04-26 01:54:10,595][47288] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-04-26 01:54:10,875][47288] Updated weights for policy 0, policy_version 48946 (0.0029) [2024-04-26 01:54:13,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 802045952. Throughput: 0: 55566.5. Samples: 751378160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:54:13,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 01:54:14,621][47288] Updated weights for policy 0, policy_version 48956 (0.0037) [2024-04-26 01:54:16,670][47288] Updated weights for policy 0, policy_version 48966 (0.0031) [2024-04-26 01:54:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 55650.0). Total num frames: 802340864. Throughput: 0: 55363.5. Samples: 751707140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 01:54:18,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 01:54:20,450][47288] Updated weights for policy 0, policy_version 48976 (0.0037) [2024-04-26 01:54:22,616][47288] Updated weights for policy 0, policy_version 48986 (0.0032) [2024-04-26 01:54:23,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 802635776. Throughput: 0: 55184.1. Samples: 752038000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-26 01:54:23,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 01:54:26,388][47288] Updated weights for policy 0, policy_version 48996 (0.0029) [2024-04-26 01:54:28,425][47288] Updated weights for policy 0, policy_version 49006 (0.0033) [2024-04-26 01:54:28,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56251.9, 300 sec: 55761.1). Total num frames: 802930688. Throughput: 0: 55799.8. Samples: 752221720. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-26 01:54:28,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 01:54:32,385][47288] Updated weights for policy 0, policy_version 49016 (0.0032) [2024-04-26 01:54:33,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 803176448. Throughput: 0: 55814.5. Samples: 752557180. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-26 01:54:33,923][47056] Avg episode reward: [(0, '0.330')] [2024-04-26 01:54:34,332][47288] Updated weights for policy 0, policy_version 49026 (0.0031) [2024-04-26 01:54:38,202][47288] Updated weights for policy 0, policy_version 49036 (0.0033) [2024-04-26 01:54:38,923][47056] Fps is (10 sec: 50789.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 803438592. Throughput: 0: 55796.7. Samples: 752887880. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-26 01:54:38,924][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 01:54:40,401][47288] Updated weights for policy 0, policy_version 49046 (0.0035) [2024-04-26 01:54:43,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 803717120. Throughput: 0: 55171.1. Samples: 753042540. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-26 01:54:43,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 01:54:44,148][47288] Updated weights for policy 0, policy_version 49056 (0.0033) [2024-04-26 01:54:46,356][47288] Updated weights for policy 0, policy_version 49066 (0.0032) [2024-04-26 01:54:48,923][47056] Fps is (10 sec: 54067.2, 60 sec: 54613.2, 300 sec: 55538.9). Total num frames: 803979264. Throughput: 0: 55187.4. Samples: 753377820. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-26 01:54:48,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 01:54:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000049072_803995648.pth... 
[2024-04-26 01:54:48,993][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000048258_790659072.pth [2024-04-26 01:54:50,012][47288] Updated weights for policy 0, policy_version 49076 (0.0030) [2024-04-26 01:54:52,260][47288] Updated weights for policy 0, policy_version 49086 (0.0030) [2024-04-26 01:54:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 54886.3, 300 sec: 55650.1). Total num frames: 804274176. Throughput: 0: 55265.8. Samples: 753710600. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-26 01:54:53,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 01:54:56,093][47288] Updated weights for policy 0, policy_version 49096 (0.0033) [2024-04-26 01:54:58,058][47288] Updated weights for policy 0, policy_version 49106 (0.0032) [2024-04-26 01:54:58,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 804569088. Throughput: 0: 55636.7. Samples: 753881820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:54:58,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 01:55:01,853][47288] Updated weights for policy 0, policy_version 49116 (0.0029) [2024-04-26 01:55:03,852][47267] Signal inference workers to stop experience collection... (11150 times) [2024-04-26 01:55:03,885][47288] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-04-26 01:55:03,911][47267] Signal inference workers to resume experience collection... (11150 times) [2024-04-26 01:55:03,914][47288] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-04-26 01:55:03,918][47288] Updated weights for policy 0, policy_version 49126 (0.0031) [2024-04-26 01:55:03,923][47056] Fps is (10 sec: 60621.3, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 804880384. Throughput: 0: 55729.0. Samples: 754214940. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:55:03,923][47056] Avg episode reward: [(0, '0.352')] [2024-04-26 01:55:07,534][47288] Updated weights for policy 0, policy_version 49136 (0.0031) [2024-04-26 01:55:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 805126144. Throughput: 0: 55759.9. Samples: 754547200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:55:08,924][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 01:55:09,741][47288] Updated weights for policy 0, policy_version 49146 (0.0026) [2024-04-26 01:55:13,421][47288] Updated weights for policy 0, policy_version 49156 (0.0026) [2024-04-26 01:55:13,923][47056] Fps is (10 sec: 50790.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 805388288. Throughput: 0: 55623.6. Samples: 754724780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:55:13,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 01:55:15,554][47288] Updated weights for policy 0, policy_version 49166 (0.0028) [2024-04-26 01:55:18,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 805666816. Throughput: 0: 55684.5. Samples: 755062980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:55:18,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 01:55:19,439][47288] Updated weights for policy 0, policy_version 49176 (0.0027) [2024-04-26 01:55:21,514][47288] Updated weights for policy 0, policy_version 49186 (0.0031) [2024-04-26 01:55:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 805945344. Throughput: 0: 55764.6. Samples: 755397280. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 01:55:23,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 01:55:25,192][47288] Updated weights for policy 0, policy_version 49196 (0.0029) [2024-04-26 01:55:27,321][47288] Updated weights for policy 0, policy_version 49206 (0.0034) [2024-04-26 01:55:28,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 806240256. Throughput: 0: 55798.4. Samples: 755553480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 01:55:28,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 01:55:30,956][47288] Updated weights for policy 0, policy_version 49216 (0.0029) [2024-04-26 01:55:33,219][47288] Updated weights for policy 0, policy_version 49226 (0.0031) [2024-04-26 01:55:33,923][47056] Fps is (10 sec: 58981.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 806535168. Throughput: 0: 55844.9. Samples: 755890840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 01:55:33,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 01:55:36,804][47288] Updated weights for policy 0, policy_version 49236 (0.0029) [2024-04-26 01:55:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 806813696. Throughput: 0: 55992.7. Samples: 756230280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 01:55:38,923][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 01:55:39,119][47288] Updated weights for policy 0, policy_version 49246 (0.0034) [2024-04-26 01:55:42,568][47288] Updated weights for policy 0, policy_version 49256 (0.0029) [2024-04-26 01:55:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 807108608. Throughput: 0: 56068.9. Samples: 756404920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 01:55:43,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 01:55:44,809][47267] Signal inference workers to stop experience collection... 
(11200 times) [2024-04-26 01:55:44,815][47267] Signal inference workers to resume experience collection... (11200 times) [2024-04-26 01:55:44,820][47288] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-04-26 01:55:44,848][47288] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-04-26 01:55:44,919][47288] Updated weights for policy 0, policy_version 49266 (0.0034) [2024-04-26 01:55:48,406][47288] Updated weights for policy 0, policy_version 49276 (0.0032) [2024-04-26 01:55:48,923][47056] Fps is (10 sec: 55707.1, 60 sec: 56525.0, 300 sec: 55705.6). Total num frames: 807370752. Throughput: 0: 56155.6. Samples: 756741940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 01:55:48,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 01:55:50,759][47288] Updated weights for policy 0, policy_version 49286 (0.0031) [2024-04-26 01:55:53,923][47056] Fps is (10 sec: 52429.7, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 807632896. Throughput: 0: 56237.2. Samples: 757077860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 01:55:53,923][47056] Avg episode reward: [(0, '0.309')] [2024-04-26 01:55:54,232][47288] Updated weights for policy 0, policy_version 49296 (0.0034) [2024-04-26 01:55:56,578][47288] Updated weights for policy 0, policy_version 49306 (0.0027) [2024-04-26 01:55:58,923][47056] Fps is (10 sec: 50788.9, 60 sec: 55159.4, 300 sec: 55538.9). Total num frames: 807878656. Throughput: 0: 55658.4. Samples: 757229420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-04-26 01:55:58,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 01:56:00,237][47288] Updated weights for policy 0, policy_version 49316 (0.0029) [2024-04-26 01:56:02,544][47288] Updated weights for policy 0, policy_version 49326 (0.0024) [2024-04-26 01:56:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 808189952. Throughput: 0: 55592.1. Samples: 757564620. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-04-26 01:56:03,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 01:56:06,172][47288] Updated weights for policy 0, policy_version 49336 (0.0032) [2024-04-26 01:56:08,417][47288] Updated weights for policy 0, policy_version 49346 (0.0034) [2024-04-26 01:56:08,923][47056] Fps is (10 sec: 62259.9, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 808501248. Throughput: 0: 55410.4. Samples: 757890760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-04-26 01:56:08,923][47056] Avg episode reward: [(0, '0.281')] [2024-04-26 01:56:11,987][47288] Updated weights for policy 0, policy_version 49356 (0.0032) [2024-04-26 01:56:13,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 808763392. Throughput: 0: 55900.2. Samples: 758068980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-04-26 01:56:13,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 01:56:14,291][47288] Updated weights for policy 0, policy_version 49366 (0.0031) [2024-04-26 01:56:17,692][47288] Updated weights for policy 0, policy_version 49376 (0.0030) [2024-04-26 01:56:18,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 809041920. Throughput: 0: 55816.6. Samples: 758402580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-04-26 01:56:18,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 01:56:20,302][47288] Updated weights for policy 0, policy_version 49386 (0.0027) [2024-04-26 01:56:23,595][47288] Updated weights for policy 0, policy_version 49396 (0.0027) [2024-04-26 01:56:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 809320448. Throughput: 0: 55665.0. Samples: 758735200. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-04-26 01:56:23,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 01:56:26,190][47288] Updated weights for policy 0, policy_version 49406 (0.0030) [2024-04-26 01:56:28,923][47056] Fps is (10 sec: 52427.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 809566208. Throughput: 0: 55516.7. Samples: 758903180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:56:28,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 01:56:29,601][47288] Updated weights for policy 0, policy_version 49416 (0.0030) [2024-04-26 01:56:32,057][47288] Updated weights for policy 0, policy_version 49426 (0.0028) [2024-04-26 01:56:33,923][47056] Fps is (10 sec: 50791.1, 60 sec: 54886.5, 300 sec: 55483.5). Total num frames: 809828352. Throughput: 0: 55491.5. Samples: 759239060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:56:33,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 01:56:35,341][47288] Updated weights for policy 0, policy_version 49436 (0.0031) [2024-04-26 01:56:37,925][47288] Updated weights for policy 0, policy_version 49446 (0.0025) [2024-04-26 01:56:38,923][47056] Fps is (10 sec: 57345.7, 60 sec: 55432.8, 300 sec: 55594.5). Total num frames: 810139648. Throughput: 0: 55449.4. Samples: 759573080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:56:38,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 01:56:41,312][47288] Updated weights for policy 0, policy_version 49456 (0.0029) [2024-04-26 01:56:43,873][47288] Updated weights for policy 0, policy_version 49466 (0.0028) [2024-04-26 01:56:43,923][47056] Fps is (10 sec: 62259.3, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 810450944. Throughput: 0: 55794.1. Samples: 759740140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:56:43,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 01:56:46,432][47267] Signal inference workers to stop experience collection... 
(11250 times) [2024-04-26 01:56:46,432][47267] Signal inference workers to resume experience collection... (11250 times) [2024-04-26 01:56:46,443][47288] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-04-26 01:56:46,443][47288] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-04-26 01:56:47,322][47288] Updated weights for policy 0, policy_version 49476 (0.0029) [2024-04-26 01:56:48,923][47056] Fps is (10 sec: 60620.1, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 810745856. Throughput: 0: 55744.3. Samples: 760073120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:56:48,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 01:56:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000049484_810745856.pth... [2024-04-26 01:56:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000048667_797360128.pth [2024-04-26 01:56:49,848][47288] Updated weights for policy 0, policy_version 49486 (0.0035) [2024-04-26 01:56:53,213][47288] Updated weights for policy 0, policy_version 49496 (0.0031) [2024-04-26 01:56:53,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 810991616. Throughput: 0: 55884.2. Samples: 760405540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 01:56:53,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 01:56:55,705][47288] Updated weights for policy 0, policy_version 49506 (0.0029) [2024-04-26 01:56:58,923][47056] Fps is (10 sec: 49152.6, 60 sec: 55978.9, 300 sec: 55594.5). Total num frames: 811237376. Throughput: 0: 55497.5. Samples: 760566360. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-04-26 01:56:58,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 01:56:59,266][47288] Updated weights for policy 0, policy_version 49516 (0.0038) [2024-04-26 01:57:01,699][47288] Updated weights for policy 0, policy_version 49526 (0.0030) [2024-04-26 01:57:03,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 811515904. Throughput: 0: 55447.9. Samples: 760897740. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-04-26 01:57:03,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 01:57:05,005][47288] Updated weights for policy 0, policy_version 49536 (0.0029) [2024-04-26 01:57:07,697][47288] Updated weights for policy 0, policy_version 49546 (0.0027) [2024-04-26 01:57:08,923][47056] Fps is (10 sec: 54066.9, 60 sec: 54613.4, 300 sec: 55483.4). Total num frames: 811778048. Throughput: 0: 55523.2. Samples: 761233740. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-04-26 01:57:08,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 01:57:10,877][47288] Updated weights for policy 0, policy_version 49556 (0.0030) [2024-04-26 01:57:13,589][47288] Updated weights for policy 0, policy_version 49566 (0.0029) [2024-04-26 01:57:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 812089344. Throughput: 0: 55506.0. Samples: 761400940. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-04-26 01:57:13,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 01:57:16,765][47288] Updated weights for policy 0, policy_version 49576 (0.0027) [2024-04-26 01:57:18,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 812367872. Throughput: 0: 55401.2. Samples: 761732120. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-04-26 01:57:18,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 01:57:19,401][47288] Updated weights for policy 0, policy_version 49586 (0.0026) [2024-04-26 01:57:22,548][47288] Updated weights for policy 0, policy_version 49596 (0.0027) [2024-04-26 01:57:23,922][47056] Fps is (10 sec: 57345.0, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 812662784. Throughput: 0: 55357.9. Samples: 762064180. Policy #0 lag: (min: 1.0, avg: 8.4, max: 22.0) [2024-04-26 01:57:23,923][47056] Avg episode reward: [(0, '0.401')] [2024-04-26 01:57:25,263][47288] Updated weights for policy 0, policy_version 49606 (0.0034) [2024-04-26 01:57:28,400][47288] Updated weights for policy 0, policy_version 49616 (0.0030) [2024-04-26 01:57:28,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 812924928. Throughput: 0: 55537.1. Samples: 762239320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:57:28,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 01:57:31,276][47288] Updated weights for policy 0, policy_version 49626 (0.0031) [2024-04-26 01:57:33,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 813187072. Throughput: 0: 55500.2. Samples: 762570620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:57:33,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 01:57:34,309][47288] Updated weights for policy 0, policy_version 49636 (0.0026) [2024-04-26 01:57:37,009][47288] Updated weights for policy 0, policy_version 49646 (0.0035) [2024-04-26 01:57:38,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 813465600. Throughput: 0: 55519.4. Samples: 762903920. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:57:38,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 01:57:40,309][47288] Updated weights for policy 0, policy_version 49656 (0.0029) [2024-04-26 01:57:41,487][47267] Signal inference workers to stop experience collection... (11300 times) [2024-04-26 01:57:41,539][47288] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-04-26 01:57:41,541][47267] Signal inference workers to resume experience collection... (11300 times) [2024-04-26 01:57:41,553][47288] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-04-26 01:57:42,717][47288] Updated weights for policy 0, policy_version 49666 (0.0031) [2024-04-26 01:57:43,923][47056] Fps is (10 sec: 55704.6, 60 sec: 54886.2, 300 sec: 55539.0). Total num frames: 813744128. Throughput: 0: 55690.5. Samples: 763072440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:57:43,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 01:57:46,183][47288] Updated weights for policy 0, policy_version 49676 (0.0026) [2024-04-26 01:57:48,923][47056] Fps is (10 sec: 55706.3, 60 sec: 54613.4, 300 sec: 55594.5). Total num frames: 814022656. Throughput: 0: 55583.2. Samples: 763398980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:57:48,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 01:57:49,076][47288] Updated weights for policy 0, policy_version 49686 (0.0035) [2024-04-26 01:57:51,972][47288] Updated weights for policy 0, policy_version 49696 (0.0028) [2024-04-26 01:57:53,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 814317568. Throughput: 0: 55508.9. Samples: 763731640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:57:53,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 01:57:55,189][47288] Updated weights for policy 0, policy_version 49706 (0.0031) [2024-04-26 01:57:57,694][47288] Updated weights for policy 0, policy_version 49716 (0.0029) [2024-04-26 01:57:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 814579712. Throughput: 0: 55634.3. Samples: 763904480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 01:57:58,923][47056] Avg episode reward: [(0, '0.354')] [2024-04-26 01:58:01,151][47288] Updated weights for policy 0, policy_version 49726 (0.0030) [2024-04-26 01:58:03,605][47288] Updated weights for policy 0, policy_version 49736 (0.0028) [2024-04-26 01:58:03,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 814874624. Throughput: 0: 55695.7. Samples: 764238420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 01:58:03,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 01:58:06,891][47288] Updated weights for policy 0, policy_version 49746 (0.0036) [2024-04-26 01:58:08,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 815136768. Throughput: 0: 55791.2. Samples: 764574800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 01:58:08,924][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 01:58:09,503][47288] Updated weights for policy 0, policy_version 49756 (0.0027) [2024-04-26 01:58:12,628][47288] Updated weights for policy 0, policy_version 49766 (0.0034) [2024-04-26 01:58:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 815415296. Throughput: 0: 55476.2. Samples: 764735740. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 01:58:13,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 01:58:15,570][47288] Updated weights for policy 0, policy_version 49776 (0.0027) [2024-04-26 01:58:18,446][47288] Updated weights for policy 0, policy_version 49786 (0.0028) [2024-04-26 01:58:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 815710208. Throughput: 0: 55583.8. Samples: 765071900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 01:58:18,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 01:58:21,471][47288] Updated weights for policy 0, policy_version 49796 (0.0026) [2024-04-26 01:58:23,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 815988736. Throughput: 0: 55477.8. Samples: 765400420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 01:58:23,923][47056] Avg episode reward: [(0, '0.325')] [2024-04-26 01:58:24,213][47288] Updated weights for policy 0, policy_version 49806 (0.0028) [2024-04-26 01:58:27,190][47288] Updated weights for policy 0, policy_version 49816 (0.0028) [2024-04-26 01:58:28,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 816267264. Throughput: 0: 55720.3. Samples: 765579840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 01:58:28,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 01:58:30,034][47288] Updated weights for policy 0, policy_version 49826 (0.0026) [2024-04-26 01:58:33,087][47288] Updated weights for policy 0, policy_version 49836 (0.0028) [2024-04-26 01:58:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 816529408. Throughput: 0: 55823.5. Samples: 765911040. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:58:33,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 01:58:36,182][47288] Updated weights for policy 0, policy_version 49846 (0.0037) [2024-04-26 01:58:38,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 816824320. Throughput: 0: 55791.0. Samples: 766242240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:58:38,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 01:58:39,074][47288] Updated weights for policy 0, policy_version 49856 (0.0030) [2024-04-26 01:58:42,308][47288] Updated weights for policy 0, policy_version 49866 (0.0027) [2024-04-26 01:58:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 817102848. Throughput: 0: 55741.7. Samples: 766412860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:58:43,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 01:58:44,757][47288] Updated weights for policy 0, policy_version 49876 (0.0027) [2024-04-26 01:58:45,535][47267] Signal inference workers to stop experience collection... (11350 times) [2024-04-26 01:58:45,561][47288] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-04-26 01:58:45,619][47267] Signal inference workers to resume experience collection... (11350 times) [2024-04-26 01:58:45,619][47288] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-04-26 01:58:48,282][47288] Updated weights for policy 0, policy_version 49886 (0.0027) [2024-04-26 01:58:48,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.4, 300 sec: 55538.9). Total num frames: 817364992. Throughput: 0: 55816.1. Samples: 766750160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:58:48,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 01:58:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000049888_817364992.pth... 
[2024-04-26 01:58:48,989][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000049072_803995648.pth [2024-04-26 01:58:50,476][47288] Updated weights for policy 0, policy_version 49896 (0.0028) [2024-04-26 01:58:53,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 817643520. Throughput: 0: 55812.6. Samples: 767086360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:58:53,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 01:58:53,957][47288] Updated weights for policy 0, policy_version 49906 (0.0024) [2024-04-26 01:58:56,572][47288] Updated weights for policy 0, policy_version 49916 (0.0031) [2024-04-26 01:58:58,923][47056] Fps is (10 sec: 58983.5, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 817954816. Throughput: 0: 55908.0. Samples: 767251600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 01:58:58,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 01:58:59,678][47288] Updated weights for policy 0, policy_version 49926 (0.0029) [2024-04-26 01:59:02,753][47288] Updated weights for policy 0, policy_version 49936 (0.0033) [2024-04-26 01:59:03,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 818233344. Throughput: 0: 55898.3. Samples: 767587320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:59:03,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 01:59:05,672][47288] Updated weights for policy 0, policy_version 49946 (0.0027) [2024-04-26 01:59:08,467][47288] Updated weights for policy 0, policy_version 49956 (0.0032) [2024-04-26 01:59:08,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 818495488. Throughput: 0: 56066.7. Samples: 767923420. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:59:08,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 01:59:11,560][47288] Updated weights for policy 0, policy_version 49966 (0.0032) [2024-04-26 01:59:13,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 818774016. Throughput: 0: 55867.0. Samples: 768093860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:59:13,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 01:59:14,186][47288] Updated weights for policy 0, policy_version 49976 (0.0033) [2024-04-26 01:59:17,269][47288] Updated weights for policy 0, policy_version 49986 (0.0033) [2024-04-26 01:59:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 819052544. Throughput: 0: 55895.1. Samples: 768426320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:59:18,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 01:59:20,162][47288] Updated weights for policy 0, policy_version 49996 (0.0028) [2024-04-26 01:59:23,374][47288] Updated weights for policy 0, policy_version 50006 (0.0031) [2024-04-26 01:59:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 819331072. Throughput: 0: 56008.2. Samples: 768762600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:59:23,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 01:59:26,022][47288] Updated weights for policy 0, policy_version 50016 (0.0031) [2024-04-26 01:59:28,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 819593216. Throughput: 0: 55819.1. Samples: 768924720. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:59:28,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 01:59:29,321][47288] Updated weights for policy 0, policy_version 50026 (0.0030) [2024-04-26 01:59:31,710][47288] Updated weights for policy 0, policy_version 50036 (0.0029) [2024-04-26 01:59:33,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 819888128. Throughput: 0: 55865.0. Samples: 769264080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 01:59:33,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 01:59:35,261][47288] Updated weights for policy 0, policy_version 50046 (0.0035) [2024-04-26 01:59:37,526][47288] Updated weights for policy 0, policy_version 50056 (0.0030) [2024-04-26 01:59:38,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 820183040. Throughput: 0: 55801.8. Samples: 769597440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 01:59:38,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 01:59:40,905][47288] Updated weights for policy 0, policy_version 50066 (0.0026) [2024-04-26 01:59:43,526][47288] Updated weights for policy 0, policy_version 50076 (0.0035) [2024-04-26 01:59:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 820461568. Throughput: 0: 56045.2. Samples: 769773640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 01:59:43,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 01:59:46,615][47288] Updated weights for policy 0, policy_version 50086 (0.0028) [2024-04-26 01:59:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 820740096. Throughput: 0: 55941.4. Samples: 770104680. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 01:59:48,923][47056] Avg episode reward: [(0, '0.299')] [2024-04-26 01:59:49,424][47288] Updated weights for policy 0, policy_version 50096 (0.0029) [2024-04-26 01:59:52,515][47288] Updated weights for policy 0, policy_version 50106 (0.0030) [2024-04-26 01:59:53,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 821002240. Throughput: 0: 55912.5. Samples: 770439480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 01:59:53,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 01:59:55,211][47288] Updated weights for policy 0, policy_version 50116 (0.0027) [2024-04-26 01:59:58,336][47288] Updated weights for policy 0, policy_version 50126 (0.0031) [2024-04-26 01:59:58,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 821280768. Throughput: 0: 55714.1. Samples: 770601000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 01:59:58,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 02:00:01,149][47288] Updated weights for policy 0, policy_version 50136 (0.0034) [2024-04-26 02:00:02,270][47267] Signal inference workers to stop experience collection... (11400 times) [2024-04-26 02:00:02,270][47267] Signal inference workers to resume experience collection... (11400 times) [2024-04-26 02:00:02,281][47288] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-04-26 02:00:02,299][47288] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-04-26 02:00:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 821559296. Throughput: 0: 55723.2. Samples: 770933860. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 02:00:03,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 02:00:04,316][47288] Updated weights for policy 0, policy_version 50146 (0.0028) [2024-04-26 02:00:06,853][47288] Updated weights for policy 0, policy_version 50156 (0.0028) [2024-04-26 02:00:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 821854208. Throughput: 0: 55740.7. Samples: 771270940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 02:00:08,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 02:00:10,316][47288] Updated weights for policy 0, policy_version 50166 (0.0031) [2024-04-26 02:00:12,761][47288] Updated weights for policy 0, policy_version 50176 (0.0028) [2024-04-26 02:00:13,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 822149120. Throughput: 0: 55970.6. Samples: 771443400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 02:00:13,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 02:00:16,101][47288] Updated weights for policy 0, policy_version 50186 (0.0028) [2024-04-26 02:00:18,603][47288] Updated weights for policy 0, policy_version 50196 (0.0026) [2024-04-26 02:00:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55816.6). Total num frames: 822411264. Throughput: 0: 55822.2. Samples: 771776080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 02:00:18,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 02:00:22,188][47288] Updated weights for policy 0, policy_version 50206 (0.0031) [2024-04-26 02:00:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.5, 300 sec: 55761.2). Total num frames: 822689792. Throughput: 0: 55824.2. Samples: 772109540. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 02:00:23,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:00:24,526][47288] Updated weights for policy 0, policy_version 50216 (0.0027) [2024-04-26 02:00:28,222][47288] Updated weights for policy 0, policy_version 50226 (0.0036) [2024-04-26 02:00:28,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 822935552. Throughput: 0: 55511.6. Samples: 772271660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 02:00:28,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 02:00:30,460][47288] Updated weights for policy 0, policy_version 50236 (0.0033) [2024-04-26 02:00:33,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55432.6, 300 sec: 55594.6). Total num frames: 823214080. Throughput: 0: 55655.1. Samples: 772609160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 02:00:33,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 02:00:33,999][47288] Updated weights for policy 0, policy_version 50246 (0.0030) [2024-04-26 02:00:36,267][47288] Updated weights for policy 0, policy_version 50256 (0.0030) [2024-04-26 02:00:38,923][47056] Fps is (10 sec: 58982.9, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 823525376. Throughput: 0: 55685.0. Samples: 772945300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 02:00:38,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 02:00:39,874][47288] Updated weights for policy 0, policy_version 50266 (0.0028) [2024-04-26 02:00:42,204][47288] Updated weights for policy 0, policy_version 50276 (0.0030) [2024-04-26 02:00:43,923][47056] Fps is (10 sec: 58980.9, 60 sec: 55705.5, 300 sec: 55705.5). Total num frames: 823803904. Throughput: 0: 55646.4. Samples: 773105100. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 02:00:43,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 02:00:45,749][47288] Updated weights for policy 0, policy_version 50286 (0.0037) [2024-04-26 02:00:47,955][47288] Updated weights for policy 0, policy_version 50296 (0.0032) [2024-04-26 02:00:48,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 824082432. Throughput: 0: 55685.6. Samples: 773439720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 02:00:48,923][47056] Avg episode reward: [(0, '0.401')] [2024-04-26 02:00:48,996][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000050299_824098816.pth... [2024-04-26 02:00:49,047][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000049484_810745856.pth [2024-04-26 02:00:51,391][47267] Signal inference workers to stop experience collection... (11450 times) [2024-04-26 02:00:51,423][47288] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-04-26 02:00:51,451][47267] Signal inference workers to resume experience collection... (11450 times) [2024-04-26 02:00:51,454][47288] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-04-26 02:00:51,566][47288] Updated weights for policy 0, policy_version 50306 (0.0027) [2024-04-26 02:00:53,798][47288] Updated weights for policy 0, policy_version 50316 (0.0033) [2024-04-26 02:00:53,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 824377344. Throughput: 0: 55607.1. Samples: 773773260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 02:00:53,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 02:00:57,476][47288] Updated weights for policy 0, policy_version 50326 (0.0028) [2024-04-26 02:00:58,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 824655872. Throughput: 0: 55721.9. Samples: 773950880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 02:00:58,923][47056] Avg episode reward: [(0, '0.320')] [2024-04-26 02:00:59,637][47288] Updated weights for policy 0, policy_version 50336 (0.0029) [2024-04-26 02:01:03,383][47288] Updated weights for policy 0, policy_version 50346 (0.0030) [2024-04-26 02:01:03,923][47056] Fps is (10 sec: 50790.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 824885248. Throughput: 0: 55798.8. Samples: 774287020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 02:01:03,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 02:01:05,492][47288] Updated weights for policy 0, policy_version 50356 (0.0034) [2024-04-26 02:01:08,923][47056] Fps is (10 sec: 50789.7, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 825163776. Throughput: 0: 55910.7. Samples: 774625520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 02:01:08,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:01:09,320][47288] Updated weights for policy 0, policy_version 50366 (0.0031) [2024-04-26 02:01:11,415][47288] Updated weights for policy 0, policy_version 50376 (0.0034) [2024-04-26 02:01:13,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 825458688. Throughput: 0: 55756.9. Samples: 774780720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 02:01:13,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 02:01:15,188][47288] Updated weights for policy 0, policy_version 50386 (0.0030) [2024-04-26 02:01:17,365][47288] Updated weights for policy 0, policy_version 50396 (0.0027) [2024-04-26 02:01:18,923][47056] Fps is (10 sec: 60621.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 825769984. Throughput: 0: 55759.0. Samples: 775118320. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 02:01:18,923][47056] Avg episode reward: [(0, '0.354')] [2024-04-26 02:01:20,869][47288] Updated weights for policy 0, policy_version 50406 (0.0029) [2024-04-26 02:01:23,117][47288] Updated weights for policy 0, policy_version 50416 (0.0030) [2024-04-26 02:01:23,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 826032128. Throughput: 0: 55741.2. Samples: 775453660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 02:01:23,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 02:01:26,669][47288] Updated weights for policy 0, policy_version 50426 (0.0026) [2024-04-26 02:01:28,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 826327040. Throughput: 0: 56109.2. Samples: 775630000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 02:01:28,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 02:01:28,975][47288] Updated weights for policy 0, policy_version 50436 (0.0030) [2024-04-26 02:01:32,522][47288] Updated weights for policy 0, policy_version 50446 (0.0030) [2024-04-26 02:01:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 826589184. Throughput: 0: 55939.5. Samples: 775957000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 02:01:33,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 02:01:34,994][47288] Updated weights for policy 0, policy_version 50456 (0.0028) [2024-04-26 02:01:38,459][47288] Updated weights for policy 0, policy_version 50466 (0.0027) [2024-04-26 02:01:38,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 826851328. Throughput: 0: 55897.4. Samples: 776288640. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 02:01:38,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 02:01:41,067][47288] Updated weights for policy 0, policy_version 50476 (0.0028) [2024-04-26 02:01:43,923][47056] Fps is (10 sec: 52430.0, 60 sec: 55159.7, 300 sec: 55483.5). Total num frames: 827113472. Throughput: 0: 55522.7. Samples: 776449400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:01:43,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 02:01:44,454][47288] Updated weights for policy 0, policy_version 50486 (0.0032) [2024-04-26 02:01:46,851][47288] Updated weights for policy 0, policy_version 50496 (0.0024) [2024-04-26 02:01:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 827408384. Throughput: 0: 55481.2. Samples: 776783680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:01:48,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 02:01:50,167][47288] Updated weights for policy 0, policy_version 50506 (0.0036) [2024-04-26 02:01:50,992][47267] Signal inference workers to stop experience collection... (11500 times) [2024-04-26 02:01:50,993][47267] Signal inference workers to resume experience collection... (11500 times) [2024-04-26 02:01:51,007][47288] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-04-26 02:01:51,007][47288] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-04-26 02:01:52,887][47288] Updated weights for policy 0, policy_version 50516 (0.0032) [2024-04-26 02:01:53,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 827703296. Throughput: 0: 55356.5. Samples: 777116560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:01:53,923][47056] Avg episode reward: [(0, '0.318')] [2024-04-26 02:01:56,147][47288] Updated weights for policy 0, policy_version 50526 (0.0027) [2024-04-26 02:01:58,918][47288] Updated weights for policy 0, policy_version 50536 (0.0034) [2024-04-26 02:01:58,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 827981824. Throughput: 0: 55727.7. Samples: 777288460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:01:58,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 02:02:02,088][47288] Updated weights for policy 0, policy_version 50546 (0.0029) [2024-04-26 02:02:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 828260352. Throughput: 0: 55573.0. Samples: 777619100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:02:03,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 02:02:04,813][47288] Updated weights for policy 0, policy_version 50556 (0.0033) [2024-04-26 02:02:07,773][47288] Updated weights for policy 0, policy_version 50566 (0.0028) [2024-04-26 02:02:08,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 828522496. Throughput: 0: 55601.4. Samples: 777955720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:02:08,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 02:02:10,667][47288] Updated weights for policy 0, policy_version 50576 (0.0028) [2024-04-26 02:02:13,579][47288] Updated weights for policy 0, policy_version 50586 (0.0032) [2024-04-26 02:02:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 828801024. Throughput: 0: 55347.1. Samples: 778120620. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 02:02:13,924][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 02:02:16,555][47288] Updated weights for policy 0, policy_version 50596 (0.0031) [2024-04-26 02:02:18,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55159.6, 300 sec: 55650.0). Total num frames: 829079552. Throughput: 0: 55520.7. Samples: 778455420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 02:02:18,923][47056] Avg episode reward: [(0, '0.298')] [2024-04-26 02:02:19,518][47288] Updated weights for policy 0, policy_version 50606 (0.0028) [2024-04-26 02:02:22,279][47288] Updated weights for policy 0, policy_version 50616 (0.0031) [2024-04-26 02:02:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 829358080. Throughput: 0: 55585.4. Samples: 778789980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 02:02:23,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 02:02:25,227][47288] Updated weights for policy 0, policy_version 50626 (0.0033) [2024-04-26 02:02:28,176][47288] Updated weights for policy 0, policy_version 50636 (0.0037) [2024-04-26 02:02:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 829652992. Throughput: 0: 55843.6. Samples: 778962360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 02:02:28,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 02:02:31,553][47288] Updated weights for policy 0, policy_version 50646 (0.0031) [2024-04-26 02:02:33,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 829931520. Throughput: 0: 55834.4. Samples: 779296220. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 02:02:33,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 02:02:33,996][47288] Updated weights for policy 0, policy_version 50656 (0.0033) [2024-04-26 02:02:37,282][47288] Updated weights for policy 0, policy_version 50666 (0.0029) [2024-04-26 02:02:38,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 830210048. Throughput: 0: 55824.5. Samples: 779628660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 02:02:38,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 02:02:39,937][47288] Updated weights for policy 0, policy_version 50676 (0.0028) [2024-04-26 02:02:43,021][47288] Updated weights for policy 0, policy_version 50686 (0.0034) [2024-04-26 02:02:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 830488576. Throughput: 0: 55690.7. Samples: 779794540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:02:43,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 02:02:45,760][47288] Updated weights for policy 0, policy_version 50696 (0.0031) [2024-04-26 02:02:48,825][47288] Updated weights for policy 0, policy_version 50706 (0.0031) [2024-04-26 02:02:48,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 830767104. Throughput: 0: 55806.9. Samples: 780130420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:02:48,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 02:02:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000050706_830767104.pth... [2024-04-26 02:02:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000049888_817364992.pth [2024-04-26 02:02:51,782][47288] Updated weights for policy 0, policy_version 50716 (0.0027) [2024-04-26 02:02:53,650][47267] Signal inference workers to stop experience collection... 
(11550 times) [2024-04-26 02:02:53,689][47288] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-04-26 02:02:53,714][47267] Signal inference workers to resume experience collection... (11550 times) [2024-04-26 02:02:53,719][47288] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-04-26 02:02:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 831045632. Throughput: 0: 55818.8. Samples: 780467560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:02:53,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 02:02:54,696][47288] Updated weights for policy 0, policy_version 50726 (0.0029) [2024-04-26 02:02:57,538][47288] Updated weights for policy 0, policy_version 50736 (0.0028) [2024-04-26 02:02:58,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 831324160. Throughput: 0: 55780.0. Samples: 780630720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:02:58,927][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 02:03:00,566][47288] Updated weights for policy 0, policy_version 50746 (0.0027) [2024-04-26 02:03:03,467][47288] Updated weights for policy 0, policy_version 50756 (0.0031) [2024-04-26 02:03:03,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 831602688. Throughput: 0: 55803.4. Samples: 780966580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:03:03,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 02:03:06,489][47288] Updated weights for policy 0, policy_version 50766 (0.0030) [2024-04-26 02:03:08,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 831881216. Throughput: 0: 55861.7. Samples: 781303760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:03:08,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 02:03:09,210][47288] Updated weights for policy 0, policy_version 50776 (0.0027) [2024-04-26 02:03:12,469][47288] Updated weights for policy 0, policy_version 50786 (0.0030) [2024-04-26 02:03:13,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 832159744. Throughput: 0: 55689.7. Samples: 781468400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:03:13,923][47056] Avg episode reward: [(0, '0.323')] [2024-04-26 02:03:15,068][47288] Updated weights for policy 0, policy_version 50796 (0.0033) [2024-04-26 02:03:18,220][47288] Updated weights for policy 0, policy_version 50806 (0.0037) [2024-04-26 02:03:18,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 832421888. Throughput: 0: 55771.9. Samples: 781805960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 02:03:18,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 02:03:20,884][47288] Updated weights for policy 0, policy_version 50816 (0.0030) [2024-04-26 02:03:23,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 832700416. Throughput: 0: 55794.2. Samples: 782139400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 02:03:23,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 02:03:24,114][47288] Updated weights for policy 0, policy_version 50826 (0.0036) [2024-04-26 02:03:26,788][47288] Updated weights for policy 0, policy_version 50836 (0.0032) [2024-04-26 02:03:28,923][47056] Fps is (10 sec: 57342.7, 60 sec: 55705.3, 300 sec: 55816.6). Total num frames: 832995328. Throughput: 0: 55786.7. Samples: 782304960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 02:03:28,924][47056] Avg episode reward: [(0, '0.313')] [2024-04-26 02:03:29,960][47288] Updated weights for policy 0, policy_version 50846 (0.0029) [2024-04-26 02:03:32,703][47288] Updated weights for policy 0, policy_version 50856 (0.0029) [2024-04-26 02:03:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 833273856. Throughput: 0: 55792.1. Samples: 782641060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 02:03:33,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 02:03:35,729][47288] Updated weights for policy 0, policy_version 50866 (0.0029) [2024-04-26 02:03:38,429][47288] Updated weights for policy 0, policy_version 50876 (0.0027) [2024-04-26 02:03:38,923][47056] Fps is (10 sec: 55707.7, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 833552384. Throughput: 0: 55723.2. Samples: 782975100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 02:03:38,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 02:03:41,652][47288] Updated weights for policy 0, policy_version 50886 (0.0033) [2024-04-26 02:03:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 833847296. Throughput: 0: 55906.6. Samples: 783146520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 02:03:43,923][47056] Avg episode reward: [(0, '0.361')] [2024-04-26 02:03:44,230][47288] Updated weights for policy 0, policy_version 50896 (0.0038) [2024-04-26 02:03:46,433][47267] Signal inference workers to stop experience collection... (11600 times) [2024-04-26 02:03:46,439][47267] Signal inference workers to resume experience collection... 
(11600 times) [2024-04-26 02:03:46,460][47288] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-04-26 02:03:46,461][47288] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-04-26 02:03:47,470][47288] Updated weights for policy 0, policy_version 50906 (0.0035) [2024-04-26 02:03:48,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 834109440. Throughput: 0: 55861.0. Samples: 783480320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 02:03:48,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 02:03:50,216][47288] Updated weights for policy 0, policy_version 50916 (0.0026) [2024-04-26 02:03:53,246][47288] Updated weights for policy 0, policy_version 50926 (0.0030) [2024-04-26 02:03:53,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 834371584. Throughput: 0: 55831.8. Samples: 783816180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 02:03:53,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 02:03:56,015][47288] Updated weights for policy 0, policy_version 50936 (0.0028) [2024-04-26 02:03:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 834666496. Throughput: 0: 55829.8. Samples: 783980740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 02:03:58,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 02:03:59,113][47288] Updated weights for policy 0, policy_version 50946 (0.0029) [2024-04-26 02:04:01,754][47288] Updated weights for policy 0, policy_version 50956 (0.0025) [2024-04-26 02:04:03,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 834961408. Throughput: 0: 55855.2. Samples: 784319440. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 02:04:03,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 02:04:05,076][47288] Updated weights for policy 0, policy_version 50966 (0.0029) [2024-04-26 02:04:07,722][47288] Updated weights for policy 0, policy_version 50976 (0.0031) [2024-04-26 02:04:08,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 835223552. Throughput: 0: 55908.3. Samples: 784655280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 02:04:08,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 02:04:11,117][47288] Updated weights for policy 0, policy_version 50986 (0.0026) [2024-04-26 02:04:13,688][47288] Updated weights for policy 0, policy_version 50996 (0.0026) [2024-04-26 02:04:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 835518464. Throughput: 0: 56030.1. Samples: 784826300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 02:04:13,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 02:04:16,819][47288] Updated weights for policy 0, policy_version 51006 (0.0027) [2024-04-26 02:04:18,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 835780608. Throughput: 0: 55959.0. Samples: 785159220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 02:04:18,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 02:04:20,024][47288] Updated weights for policy 0, policy_version 51016 (0.0030) [2024-04-26 02:04:22,532][47288] Updated weights for policy 0, policy_version 51026 (0.0028) [2024-04-26 02:04:23,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 836059136. Throughput: 0: 56063.7. Samples: 785497980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:04:23,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 02:04:25,764][47288] Updated weights for policy 0, policy_version 51036 (0.0026) [2024-04-26 02:04:28,491][47288] Updated weights for policy 0, policy_version 51046 (0.0028) [2024-04-26 02:04:28,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 836337664. Throughput: 0: 55892.9. Samples: 785661700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:04:28,924][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 02:04:31,699][47288] Updated weights for policy 0, policy_version 51056 (0.0030) [2024-04-26 02:04:33,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 836616192. Throughput: 0: 55834.2. Samples: 785992860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:04:33,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 02:04:34,521][47288] Updated weights for policy 0, policy_version 51066 (0.0033) [2024-04-26 02:04:37,544][47288] Updated weights for policy 0, policy_version 51076 (0.0028) [2024-04-26 02:04:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 836894720. Throughput: 0: 55730.2. Samples: 786324040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:04:38,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 02:04:40,477][47288] Updated weights for policy 0, policy_version 51086 (0.0026) [2024-04-26 02:04:43,161][47288] Updated weights for policy 0, policy_version 51096 (0.0029) [2024-04-26 02:04:43,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 837156864. Throughput: 0: 55880.0. Samples: 786495340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:04:43,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 02:04:46,313][47288] Updated weights for policy 0, policy_version 51106 (0.0034) [2024-04-26 02:04:48,818][47288] Updated weights for policy 0, policy_version 51116 (0.0028) [2024-04-26 02:04:48,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 837484544. Throughput: 0: 55952.3. Samples: 786837300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:04:48,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:04:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000051116_837484544.pth... [2024-04-26 02:04:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000050299_824098816.pth [2024-04-26 02:04:50,090][47267] Signal inference workers to stop experience collection... (11650 times) [2024-04-26 02:04:50,143][47288] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-04-26 02:04:50,144][47267] Signal inference workers to resume experience collection... (11650 times) [2024-04-26 02:04:50,157][47288] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-04-26 02:04:52,044][47288] Updated weights for policy 0, policy_version 51126 (0.0028) [2024-04-26 02:04:53,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 837746688. Throughput: 0: 55868.7. Samples: 787169360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 02:04:53,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 02:04:55,020][47288] Updated weights for policy 0, policy_version 51136 (0.0029) [2024-04-26 02:04:57,729][47288] Updated weights for policy 0, policy_version 51146 (0.0032) [2024-04-26 02:04:58,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 838025216. Throughput: 0: 55739.5. Samples: 787334580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 02:04:58,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 02:05:01,136][47288] Updated weights for policy 0, policy_version 51156 (0.0027) [2024-04-26 02:05:03,584][47288] Updated weights for policy 0, policy_version 51166 (0.0030) [2024-04-26 02:05:03,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 838303744. Throughput: 0: 55887.3. Samples: 787674140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 02:05:03,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 02:05:07,022][47288] Updated weights for policy 0, policy_version 51176 (0.0027) [2024-04-26 02:05:08,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 838582272. Throughput: 0: 55759.4. Samples: 788007160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 02:05:08,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 02:05:09,314][47288] Updated weights for policy 0, policy_version 51186 (0.0032) [2024-04-26 02:05:12,912][47288] Updated weights for policy 0, policy_version 51196 (0.0029) [2024-04-26 02:05:13,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 838877184. Throughput: 0: 55907.0. Samples: 788177520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 02:05:13,923][47056] Avg episode reward: [(0, '0.232')] [2024-04-26 02:05:15,211][47288] Updated weights for policy 0, policy_version 51206 (0.0029) [2024-04-26 02:05:18,661][47288] Updated weights for policy 0, policy_version 51216 (0.0032) [2024-04-26 02:05:18,923][47056] Fps is (10 sec: 55707.1, 60 sec: 55978.9, 300 sec: 55761.2). Total num frames: 839139328. Throughput: 0: 55958.7. Samples: 788511000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 02:05:18,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 02:05:21,353][47288] Updated weights for policy 0, policy_version 51226 (0.0029) [2024-04-26 02:05:23,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 839417856. Throughput: 0: 55999.9. Samples: 788844040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:05:23,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 02:05:24,450][47288] Updated weights for policy 0, policy_version 51236 (0.0028) [2024-04-26 02:05:27,256][47288] Updated weights for policy 0, policy_version 51246 (0.0027) [2024-04-26 02:05:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 839712768. Throughput: 0: 56057.6. Samples: 789017940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:05:28,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 02:05:30,191][47288] Updated weights for policy 0, policy_version 51256 (0.0027) [2024-04-26 02:05:33,153][47288] Updated weights for policy 0, policy_version 51266 (0.0033) [2024-04-26 02:05:33,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 839974912. Throughput: 0: 55884.6. Samples: 789352100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:05:33,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 02:05:34,284][47267] Signal inference workers to stop experience collection... (11700 times) [2024-04-26 02:05:34,284][47267] Signal inference workers to resume experience collection... 
(11700 times) [2024-04-26 02:05:34,314][47288] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-04-26 02:05:34,314][47288] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-04-26 02:05:36,197][47288] Updated weights for policy 0, policy_version 51276 (0.0028) [2024-04-26 02:05:38,922][47056] Fps is (10 sec: 54068.2, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 840253440. Throughput: 0: 56043.6. Samples: 789691320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:05:38,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 02:05:39,021][47288] Updated weights for policy 0, policy_version 51286 (0.0029) [2024-04-26 02:05:42,100][47288] Updated weights for policy 0, policy_version 51296 (0.0039) [2024-04-26 02:05:43,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56797.8, 300 sec: 55872.2). Total num frames: 840564736. Throughput: 0: 56108.9. Samples: 789859480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:05:43,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 02:05:44,812][47288] Updated weights for policy 0, policy_version 51306 (0.0032) [2024-04-26 02:05:47,854][47288] Updated weights for policy 0, policy_version 51316 (0.0026) [2024-04-26 02:05:48,923][47056] Fps is (10 sec: 57342.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 840826880. Throughput: 0: 56056.4. Samples: 790196680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:05:48,924][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 02:05:50,568][47288] Updated weights for policy 0, policy_version 51326 (0.0028) [2024-04-26 02:05:53,657][47288] Updated weights for policy 0, policy_version 51336 (0.0027) [2024-04-26 02:05:53,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 841089024. Throughput: 0: 56171.0. Samples: 790534840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:05:53,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 02:05:56,364][47288] Updated weights for policy 0, policy_version 51346 (0.0032) [2024-04-26 02:05:58,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 841367552. Throughput: 0: 55939.7. Samples: 790694800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:05:58,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 02:05:59,597][47288] Updated weights for policy 0, policy_version 51356 (0.0032) [2024-04-26 02:06:02,426][47288] Updated weights for policy 0, policy_version 51366 (0.0030) [2024-04-26 02:06:03,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 841678848. Throughput: 0: 55959.9. Samples: 791029200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:06:03,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 02:06:05,403][47288] Updated weights for policy 0, policy_version 51376 (0.0025) [2024-04-26 02:06:08,189][47288] Updated weights for policy 0, policy_version 51386 (0.0028) [2024-04-26 02:06:08,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 841924608. Throughput: 0: 55928.9. Samples: 791360840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:06:08,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 02:06:11,222][47288] Updated weights for policy 0, policy_version 51396 (0.0031) [2024-04-26 02:06:13,923][47056] Fps is (10 sec: 52427.9, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 842203136. Throughput: 0: 55841.6. Samples: 791530820. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:06:13,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 02:06:14,247][47288] Updated weights for policy 0, policy_version 51406 (0.0027) [2024-04-26 02:06:17,067][47288] Updated weights for policy 0, policy_version 51416 (0.0039) [2024-04-26 02:06:18,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 842498048. Throughput: 0: 55796.9. Samples: 791862960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:06:18,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 02:06:20,200][47288] Updated weights for policy 0, policy_version 51426 (0.0027) [2024-04-26 02:06:23,096][47288] Updated weights for policy 0, policy_version 51436 (0.0036) [2024-04-26 02:06:23,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 842776576. Throughput: 0: 55583.8. Samples: 792192600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:06:23,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 02:06:26,123][47288] Updated weights for policy 0, policy_version 51446 (0.0034) [2024-04-26 02:06:28,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 843038720. Throughput: 0: 55559.5. Samples: 792359660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 02:06:28,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 02:06:29,054][47288] Updated weights for policy 0, policy_version 51456 (0.0030) [2024-04-26 02:06:31,830][47288] Updated weights for policy 0, policy_version 51466 (0.0028) [2024-04-26 02:06:33,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 843317248. Throughput: 0: 55553.5. Samples: 792696580. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 02:06:33,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 02:06:34,811][47288] Updated weights for policy 0, policy_version 51476 (0.0031) [2024-04-26 02:06:37,698][47288] Updated weights for policy 0, policy_version 51486 (0.0029) [2024-04-26 02:06:38,922][47056] Fps is (10 sec: 55706.8, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 843595776. Throughput: 0: 55457.4. Samples: 793030420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 02:06:38,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 02:06:38,942][47267] Signal inference workers to stop experience collection... (11750 times) [2024-04-26 02:06:38,961][47288] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-04-26 02:06:38,999][47267] Signal inference workers to resume experience collection... (11750 times) [2024-04-26 02:06:39,000][47288] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-04-26 02:06:40,940][47288] Updated weights for policy 0, policy_version 51496 (0.0025) [2024-04-26 02:06:43,682][47288] Updated weights for policy 0, policy_version 51506 (0.0033) [2024-04-26 02:06:43,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 843890688. Throughput: 0: 55539.6. Samples: 793194080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 02:06:43,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 02:06:46,728][47288] Updated weights for policy 0, policy_version 51516 (0.0030) [2024-04-26 02:06:48,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 844152832. Throughput: 0: 55441.4. Samples: 793524060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 02:06:48,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 02:06:48,949][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000051524_844169216.pth... 
[2024-04-26 02:06:48,997][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000050706_830767104.pth [2024-04-26 02:06:49,552][47288] Updated weights for policy 0, policy_version 51526 (0.0029) [2024-04-26 02:06:52,581][47288] Updated weights for policy 0, policy_version 51536 (0.0032) [2024-04-26 02:06:53,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 844431360. Throughput: 0: 55427.5. Samples: 793855080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 02:06:53,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 02:06:55,322][47288] Updated weights for policy 0, policy_version 51546 (0.0029) [2024-04-26 02:06:58,543][47288] Updated weights for policy 0, policy_version 51556 (0.0029) [2024-04-26 02:06:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 844693504. Throughput: 0: 55420.3. Samples: 794024720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 02:06:58,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 02:07:01,315][47288] Updated weights for policy 0, policy_version 51566 (0.0032) [2024-04-26 02:07:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 844988416. Throughput: 0: 55367.3. Samples: 794354500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 02:07:03,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 02:07:04,374][47288] Updated weights for policy 0, policy_version 51576 (0.0033) [2024-04-26 02:07:07,110][47288] Updated weights for policy 0, policy_version 51586 (0.0026) [2024-04-26 02:07:08,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 845283328. Throughput: 0: 55466.2. Samples: 794688580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 02:07:08,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 02:07:10,050][47288] Updated weights for policy 0, policy_version 51596 (0.0033) [2024-04-26 02:07:13,084][47288] Updated weights for policy 0, policy_version 51606 (0.0029) [2024-04-26 02:07:13,923][47056] Fps is (10 sec: 55707.0, 60 sec: 55705.9, 300 sec: 55816.7). Total num frames: 845545472. Throughput: 0: 55580.2. Samples: 794860760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 02:07:13,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 02:07:16,187][47288] Updated weights for policy 0, policy_version 51616 (0.0027) [2024-04-26 02:07:18,803][47288] Updated weights for policy 0, policy_version 51626 (0.0026) [2024-04-26 02:07:18,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 845840384. Throughput: 0: 55447.4. Samples: 795191720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 02:07:18,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 02:07:22,147][47288] Updated weights for policy 0, policy_version 51636 (0.0030) [2024-04-26 02:07:23,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 846102528. Throughput: 0: 55401.9. Samples: 795523520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 02:07:23,923][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 02:07:24,633][47288] Updated weights for policy 0, policy_version 51646 (0.0032) [2024-04-26 02:07:27,932][47288] Updated weights for policy 0, policy_version 51656 (0.0031) [2024-04-26 02:07:28,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 846381056. Throughput: 0: 55682.3. Samples: 795699780. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 02:07:28,923][47056] Avg episode reward: [(0, '0.401')] [2024-04-26 02:07:30,422][47288] Updated weights for policy 0, policy_version 51666 (0.0028) [2024-04-26 02:07:33,793][47288] Updated weights for policy 0, policy_version 51676 (0.0030) [2024-04-26 02:07:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 846659584. Throughput: 0: 55805.8. Samples: 796035320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 02:07:33,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 02:07:35,988][47267] Signal inference workers to stop experience collection... (11800 times) [2024-04-26 02:07:35,992][47267] Signal inference workers to resume experience collection... (11800 times) [2024-04-26 02:07:36,016][47288] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-04-26 02:07:36,017][47288] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-04-26 02:07:36,525][47288] Updated weights for policy 0, policy_version 51686 (0.0026) [2024-04-26 02:07:38,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55705.3, 300 sec: 55761.1). Total num frames: 846938112. Throughput: 0: 55853.2. Samples: 796368480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 02:07:38,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 02:07:39,732][47288] Updated weights for policy 0, policy_version 51696 (0.0035) [2024-04-26 02:07:42,696][47288] Updated weights for policy 0, policy_version 51706 (0.0030) [2024-04-26 02:07:43,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 847233024. Throughput: 0: 55765.7. Samples: 796534180. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 02:07:43,923][47056] Avg episode reward: [(0, '0.286')] [2024-04-26 02:07:45,534][47288] Updated weights for policy 0, policy_version 51716 (0.0031) [2024-04-26 02:07:48,661][47288] Updated weights for policy 0, policy_version 51726 (0.0027) [2024-04-26 02:07:48,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 847495168. Throughput: 0: 55805.9. Samples: 796865760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 02:07:48,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 02:07:51,245][47288] Updated weights for policy 0, policy_version 51736 (0.0028) [2024-04-26 02:07:53,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 847757312. Throughput: 0: 55875.9. Samples: 797203000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 02:07:53,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 02:07:54,421][47288] Updated weights for policy 0, policy_version 51746 (0.0029) [2024-04-26 02:07:57,253][47288] Updated weights for policy 0, policy_version 51756 (0.0029) [2024-04-26 02:07:58,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 848068608. Throughput: 0: 55769.2. Samples: 797370380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 02:07:58,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 02:08:00,322][47288] Updated weights for policy 0, policy_version 51766 (0.0031) [2024-04-26 02:08:03,135][47288] Updated weights for policy 0, policy_version 51776 (0.0025) [2024-04-26 02:08:03,923][47056] Fps is (10 sec: 58983.2, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 848347136. Throughput: 0: 55934.7. Samples: 797708780. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:08:03,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 02:08:06,224][47288] Updated weights for policy 0, policy_version 51786 (0.0032) [2024-04-26 02:08:08,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 848609280. Throughput: 0: 55933.3. Samples: 798040520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:08:08,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 02:08:09,064][47288] Updated weights for policy 0, policy_version 51796 (0.0031) [2024-04-26 02:08:12,037][47288] Updated weights for policy 0, policy_version 51806 (0.0031) [2024-04-26 02:08:13,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 848887808. Throughput: 0: 55751.1. Samples: 798208580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:08:13,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 02:08:14,770][47288] Updated weights for policy 0, policy_version 51816 (0.0028) [2024-04-26 02:08:17,732][47288] Updated weights for policy 0, policy_version 51826 (0.0029) [2024-04-26 02:08:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 849182720. Throughput: 0: 55883.1. Samples: 798550060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:08:18,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 02:08:20,688][47288] Updated weights for policy 0, policy_version 51836 (0.0033) [2024-04-26 02:08:23,544][47288] Updated weights for policy 0, policy_version 51846 (0.0028) [2024-04-26 02:08:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 849444864. Throughput: 0: 55847.7. Samples: 798881620. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:08:23,923][47056] Avg episode reward: [(0, '0.277')] [2024-04-26 02:08:26,601][47288] Updated weights for policy 0, policy_version 51856 (0.0029) [2024-04-26 02:08:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 849723392. Throughput: 0: 55932.5. Samples: 799051140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:08:28,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 02:08:29,665][47288] Updated weights for policy 0, policy_version 51866 (0.0030) [2024-04-26 02:08:32,376][47288] Updated weights for policy 0, policy_version 51876 (0.0024) [2024-04-26 02:08:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 850018304. Throughput: 0: 55925.3. Samples: 799382400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:08:33,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 02:08:35,429][47288] Updated weights for policy 0, policy_version 51886 (0.0034) [2024-04-26 02:08:38,186][47288] Updated weights for policy 0, policy_version 51896 (0.0039) [2024-04-26 02:08:38,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 850280448. Throughput: 0: 55917.0. Samples: 799719260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 02:08:38,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 02:08:41,334][47288] Updated weights for policy 0, policy_version 51906 (0.0029) [2024-04-26 02:08:43,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 850575360. Throughput: 0: 55834.3. Samples: 799882920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 02:08:43,923][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 02:08:44,135][47288] Updated weights for policy 0, policy_version 51916 (0.0036) [2024-04-26 02:08:47,133][47288] Updated weights for policy 0, policy_version 51926 (0.0030) [2024-04-26 02:08:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 850853888. Throughput: 0: 55836.8. Samples: 800221440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 02:08:48,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 02:08:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000051932_850853888.pth... [2024-04-26 02:08:48,989][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000051116_837484544.pth [2024-04-26 02:08:50,251][47288] Updated weights for policy 0, policy_version 51936 (0.0031) [2024-04-26 02:08:52,749][47267] Signal inference workers to stop experience collection... (11850 times) [2024-04-26 02:08:52,786][47288] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-04-26 02:08:52,841][47267] Signal inference workers to resume experience collection... (11850 times) [2024-04-26 02:08:52,841][47288] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-04-26 02:08:52,953][47288] Updated weights for policy 0, policy_version 51946 (0.0030) [2024-04-26 02:08:53,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 851132416. Throughput: 0: 55903.6. Samples: 800556180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 02:08:53,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 02:08:56,027][47288] Updated weights for policy 0, policy_version 51956 (0.0028) [2024-04-26 02:08:58,755][47288] Updated weights for policy 0, policy_version 51966 (0.0032) [2024-04-26 02:08:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 851410944. Throughput: 0: 55869.3. Samples: 800722700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 02:08:58,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 02:09:01,863][47288] Updated weights for policy 0, policy_version 51976 (0.0031) [2024-04-26 02:09:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 851673088. Throughput: 0: 55702.7. Samples: 801056680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 02:09:03,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:09:04,732][47288] Updated weights for policy 0, policy_version 51986 (0.0026) [2024-04-26 02:09:07,773][47288] Updated weights for policy 0, policy_version 51996 (0.0029) [2024-04-26 02:09:08,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 851968000. Throughput: 0: 55731.9. Samples: 801389560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:09:08,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 02:09:10,649][47288] Updated weights for policy 0, policy_version 52006 (0.0028) [2024-04-26 02:09:13,544][47288] Updated weights for policy 0, policy_version 52016 (0.0024) [2024-04-26 02:09:13,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 852230144. Throughput: 0: 55620.0. Samples: 801554040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:09:13,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 02:09:16,502][47288] Updated weights for policy 0, policy_version 52026 (0.0030) [2024-04-26 02:09:18,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 852525056. Throughput: 0: 55794.3. Samples: 801893140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:09:18,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 02:09:19,386][47288] Updated weights for policy 0, policy_version 52036 (0.0028) [2024-04-26 02:09:22,419][47288] Updated weights for policy 0, policy_version 52046 (0.0026) [2024-04-26 02:09:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 852803584. Throughput: 0: 55862.3. Samples: 802233060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:09:23,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 02:09:25,112][47288] Updated weights for policy 0, policy_version 52056 (0.0029) [2024-04-26 02:09:28,189][47288] Updated weights for policy 0, policy_version 52066 (0.0030) [2024-04-26 02:09:28,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 853082112. Throughput: 0: 55722.5. Samples: 802390440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:09:28,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 02:09:31,182][47288] Updated weights for policy 0, policy_version 52076 (0.0028) [2024-04-26 02:09:33,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 853360640. Throughput: 0: 55638.7. Samples: 802725180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:09:33,924][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 02:09:34,027][47288] Updated weights for policy 0, policy_version 52086 (0.0028) [2024-04-26 02:09:37,326][47288] Updated weights for policy 0, policy_version 52096 (0.0028) [2024-04-26 02:09:38,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 853639168. Throughput: 0: 55735.6. Samples: 803064280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:09:38,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 02:09:39,931][47288] Updated weights for policy 0, policy_version 52106 (0.0031) [2024-04-26 02:09:43,137][47288] Updated weights for policy 0, policy_version 52116 (0.0029) [2024-04-26 02:09:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 853917696. Throughput: 0: 55756.9. Samples: 803231760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:09:43,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 02:09:45,848][47288] Updated weights for policy 0, policy_version 52126 (0.0027) [2024-04-26 02:09:48,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 854196224. Throughput: 0: 55706.5. Samples: 803563480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:09:48,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 02:09:48,924][47288] Updated weights for policy 0, policy_version 52136 (0.0031) [2024-04-26 02:09:51,568][47288] Updated weights for policy 0, policy_version 52146 (0.0031) [2024-04-26 02:09:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 854491136. Throughput: 0: 55833.4. Samples: 803902060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:09:53,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:09:54,633][47288] Updated weights for policy 0, policy_version 52156 (0.0025) [2024-04-26 02:09:57,594][47288] Updated weights for policy 0, policy_version 52166 (0.0032) [2024-04-26 02:09:58,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 854769664. Throughput: 0: 55987.9. Samples: 804073500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:09:58,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 02:10:00,473][47288] Updated weights for policy 0, policy_version 52176 (0.0027) [2024-04-26 02:10:01,058][47267] Signal inference workers to stop experience collection... (11900 times) [2024-04-26 02:10:01,081][47288] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-04-26 02:10:01,116][47267] Signal inference workers to resume experience collection... (11900 times) [2024-04-26 02:10:01,116][47288] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-04-26 02:10:03,481][47288] Updated weights for policy 0, policy_version 52186 (0.0026) [2024-04-26 02:10:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 855031808. Throughput: 0: 55939.5. Samples: 804410420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:10:03,923][47056] Avg episode reward: [(0, '0.361')] [2024-04-26 02:10:06,234][47288] Updated weights for policy 0, policy_version 52196 (0.0029) [2024-04-26 02:10:08,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 855310336. Throughput: 0: 55850.9. Samples: 804746360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:10:08,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 02:10:09,288][47288] Updated weights for policy 0, policy_version 52206 (0.0025) [2024-04-26 02:10:12,033][47288] Updated weights for policy 0, policy_version 52216 (0.0032) [2024-04-26 02:10:13,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 855588864. Throughput: 0: 55954.4. Samples: 804908380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:10:13,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:10:15,041][47288] Updated weights for policy 0, policy_version 52226 (0.0029) [2024-04-26 02:10:17,959][47288] Updated weights for policy 0, policy_version 52236 (0.0031) [2024-04-26 02:10:18,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 855883776. Throughput: 0: 56167.2. Samples: 805252700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:10:18,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 02:10:20,943][47288] Updated weights for policy 0, policy_version 52246 (0.0031) [2024-04-26 02:10:23,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 856145920. Throughput: 0: 55925.7. Samples: 805580940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:10:23,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 02:10:24,082][47288] Updated weights for policy 0, policy_version 52256 (0.0025) [2024-04-26 02:10:26,790][47288] Updated weights for policy 0, policy_version 52266 (0.0032) [2024-04-26 02:10:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 856424448. Throughput: 0: 55938.5. Samples: 805749000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:10:28,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 02:10:30,015][47288] Updated weights for policy 0, policy_version 52276 (0.0031) [2024-04-26 02:10:32,878][47288] Updated weights for policy 0, policy_version 52286 (0.0028) [2024-04-26 02:10:33,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 856735744. Throughput: 0: 55917.6. Samples: 806079760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:10:33,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 02:10:35,889][47288] Updated weights for policy 0, policy_version 52296 (0.0032) [2024-04-26 02:10:38,620][47288] Updated weights for policy 0, policy_version 52306 (0.0033) [2024-04-26 02:10:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 856981504. Throughput: 0: 55819.6. Samples: 806413940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:10:38,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 02:10:41,738][47288] Updated weights for policy 0, policy_version 52316 (0.0028) [2024-04-26 02:10:43,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 857260032. Throughput: 0: 55682.4. Samples: 806579200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:10:43,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 02:10:44,630][47288] Updated weights for policy 0, policy_version 52326 (0.0027) [2024-04-26 02:10:47,503][47288] Updated weights for policy 0, policy_version 52336 (0.0026) [2024-04-26 02:10:48,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 857554944. Throughput: 0: 55673.3. Samples: 806915720. 
Policy #0 lag: (min: 3.0, avg: 11.6, max: 23.0) [2024-04-26 02:10:48,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 02:10:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000052341_857554944.pth... [2024-04-26 02:10:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000051524_844169216.pth [2024-04-26 02:10:50,744][47288] Updated weights for policy 0, policy_version 52346 (0.0028) [2024-04-26 02:10:53,378][47288] Updated weights for policy 0, policy_version 52356 (0.0027) [2024-04-26 02:10:53,923][47056] Fps is (10 sec: 55704.0, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 857817088. Throughput: 0: 55493.2. Samples: 807243560. Policy #0 lag: (min: 3.0, avg: 11.6, max: 23.0) [2024-04-26 02:10:53,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 02:10:56,608][47288] Updated weights for policy 0, policy_version 52366 (0.0031) [2024-04-26 02:10:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 858095616. Throughput: 0: 55671.0. Samples: 807413580. Policy #0 lag: (min: 3.0, avg: 11.6, max: 23.0) [2024-04-26 02:10:58,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 02:10:59,349][47288] Updated weights for policy 0, policy_version 52376 (0.0028) [2024-04-26 02:11:01,455][47267] Signal inference workers to stop experience collection... (11950 times) [2024-04-26 02:11:01,476][47288] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-04-26 02:11:01,513][47267] Signal inference workers to resume experience collection... (11950 times) [2024-04-26 02:11:01,513][47288] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-04-26 02:11:02,632][47288] Updated weights for policy 0, policy_version 52386 (0.0029) [2024-04-26 02:11:03,923][47056] Fps is (10 sec: 55706.9, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 858374144. Throughput: 0: 55434.8. Samples: 807747260. 
Policy #0 lag: (min: 3.0, avg: 11.6, max: 23.0) [2024-04-26 02:11:03,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 02:11:05,322][47288] Updated weights for policy 0, policy_version 52396 (0.0032) [2024-04-26 02:11:08,362][47288] Updated weights for policy 0, policy_version 52406 (0.0028) [2024-04-26 02:11:08,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 858669056. Throughput: 0: 55515.6. Samples: 808079140. Policy #0 lag: (min: 3.0, avg: 11.6, max: 23.0) [2024-04-26 02:11:08,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 02:11:11,284][47288] Updated weights for policy 0, policy_version 52416 (0.0029) [2024-04-26 02:11:13,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 858931200. Throughput: 0: 55496.6. Samples: 808246340. Policy #0 lag: (min: 3.0, avg: 11.6, max: 23.0) [2024-04-26 02:11:13,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:11:14,432][47288] Updated weights for policy 0, policy_version 52426 (0.0027) [2024-04-26 02:11:17,227][47288] Updated weights for policy 0, policy_version 52436 (0.0031) [2024-04-26 02:11:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 859209728. Throughput: 0: 55487.9. Samples: 808576720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:11:18,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 02:11:20,171][47288] Updated weights for policy 0, policy_version 52446 (0.0026) [2024-04-26 02:11:23,190][47288] Updated weights for policy 0, policy_version 52456 (0.0025) [2024-04-26 02:11:23,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 859471872. Throughput: 0: 55499.6. Samples: 808911420. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:11:23,923][47056] Avg episode reward: [(0, '0.261')] [2024-04-26 02:11:26,034][47288] Updated weights for policy 0, policy_version 52466 (0.0027) [2024-04-26 02:11:28,901][47288] Updated weights for policy 0, policy_version 52476 (0.0028) [2024-04-26 02:11:28,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 859766784. Throughput: 0: 55360.5. Samples: 809070440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:11:28,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 02:11:31,844][47288] Updated weights for policy 0, policy_version 52486 (0.0026) [2024-04-26 02:11:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 860045312. Throughput: 0: 55315.6. Samples: 809404920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:11:33,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 02:11:34,638][47288] Updated weights for policy 0, policy_version 52496 (0.0026) [2024-04-26 02:11:37,847][47288] Updated weights for policy 0, policy_version 52506 (0.0033) [2024-04-26 02:11:38,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 860323840. Throughput: 0: 55455.4. Samples: 809739040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:11:38,923][47056] Avg episode reward: [(0, '0.302')] [2024-04-26 02:11:40,542][47288] Updated weights for policy 0, policy_version 52516 (0.0027) [2024-04-26 02:11:43,680][47288] Updated weights for policy 0, policy_version 52526 (0.0028) [2024-04-26 02:11:43,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 860602368. Throughput: 0: 55495.0. Samples: 809910860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:11:43,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 02:11:46,569][47288] Updated weights for policy 0, policy_version 52536 (0.0028) [2024-04-26 02:11:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 860864512. Throughput: 0: 55468.5. Samples: 810243340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:11:48,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:11:49,531][47288] Updated weights for policy 0, policy_version 52546 (0.0028) [2024-04-26 02:11:52,354][47288] Updated weights for policy 0, policy_version 52556 (0.0032) [2024-04-26 02:11:53,363][47267] Signal inference workers to stop experience collection... (12000 times) [2024-04-26 02:11:53,369][47267] Signal inference workers to resume experience collection... (12000 times) [2024-04-26 02:11:53,383][47288] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-04-26 02:11:53,384][47288] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-04-26 02:11:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 861159424. Throughput: 0: 55474.3. Samples: 810575480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:11:53,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 02:11:55,292][47288] Updated weights for policy 0, policy_version 52566 (0.0027) [2024-04-26 02:11:58,485][47288] Updated weights for policy 0, policy_version 52576 (0.0037) [2024-04-26 02:11:58,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 861421568. Throughput: 0: 55517.7. Samples: 810744640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:11:58,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:12:01,301][47288] Updated weights for policy 0, policy_version 52586 (0.0030) [2024-04-26 02:12:03,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 861700096. Throughput: 0: 55494.6. Samples: 811073980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:12:03,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 02:12:04,599][47288] Updated weights for policy 0, policy_version 52596 (0.0025) [2024-04-26 02:12:07,229][47288] Updated weights for policy 0, policy_version 52606 (0.0027) [2024-04-26 02:12:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 861978624. Throughput: 0: 55428.4. Samples: 811405700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:12:08,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 02:12:10,317][47288] Updated weights for policy 0, policy_version 52616 (0.0027) [2024-04-26 02:12:13,136][47288] Updated weights for policy 0, policy_version 52626 (0.0032) [2024-04-26 02:12:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 862273536. Throughput: 0: 55684.6. Samples: 811576240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:12:13,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 02:12:16,000][47288] Updated weights for policy 0, policy_version 52636 (0.0029) [2024-04-26 02:12:18,850][47288] Updated weights for policy 0, policy_version 52646 (0.0033) [2024-04-26 02:12:18,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 862552064. Throughput: 0: 55759.0. Samples: 811914080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:12:18,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 02:12:22,014][47288] Updated weights for policy 0, policy_version 52656 (0.0034) [2024-04-26 02:12:23,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 862814208. Throughput: 0: 55708.9. Samples: 812245940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:12:23,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 02:12:24,751][47288] Updated weights for policy 0, policy_version 52666 (0.0027) [2024-04-26 02:12:27,872][47288] Updated weights for policy 0, policy_version 52676 (0.0031) [2024-04-26 02:12:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 863092736. Throughput: 0: 55687.1. Samples: 812416780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 02:12:28,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 02:12:30,706][47288] Updated weights for policy 0, policy_version 52686 (0.0026) [2024-04-26 02:12:33,860][47288] Updated weights for policy 0, policy_version 52696 (0.0028) [2024-04-26 02:12:33,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 863371264. Throughput: 0: 55646.9. Samples: 812747460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 02:12:33,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 02:12:36,602][47288] Updated weights for policy 0, policy_version 52706 (0.0024) [2024-04-26 02:12:38,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 863633408. Throughput: 0: 55812.0. Samples: 813087020. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 02:12:38,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 02:12:39,514][47288] Updated weights for policy 0, policy_version 52716 (0.0025) [2024-04-26 02:12:42,303][47288] Updated weights for policy 0, policy_version 52726 (0.0032) [2024-04-26 02:12:43,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 863911936. Throughput: 0: 55726.1. Samples: 813252320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 02:12:43,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 02:12:45,349][47288] Updated weights for policy 0, policy_version 52736 (0.0037) [2024-04-26 02:12:48,065][47288] Updated weights for policy 0, policy_version 52746 (0.0030) [2024-04-26 02:12:48,923][47056] Fps is (10 sec: 60616.1, 60 sec: 56250.9, 300 sec: 55872.1). Total num frames: 864239616. Throughput: 0: 55740.0. Samples: 813582320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 02:12:48,924][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 02:12:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000052749_864239616.pth... [2024-04-26 02:12:48,993][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000051932_850853888.pth [2024-04-26 02:12:51,269][47288] Updated weights for policy 0, policy_version 52756 (0.0028) [2024-04-26 02:12:53,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 864501760. Throughput: 0: 55806.9. Samples: 813917020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 02:12:53,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 02:12:53,981][47288] Updated weights for policy 0, policy_version 52766 (0.0029) [2024-04-26 02:12:57,064][47267] Signal inference workers to stop experience collection... 
(12050 times) [2024-04-26 02:12:57,107][47288] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-04-26 02:12:57,113][47267] Signal inference workers to resume experience collection... (12050 times) [2024-04-26 02:12:57,119][47288] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-04-26 02:12:57,224][47288] Updated weights for policy 0, policy_version 52776 (0.0029) [2024-04-26 02:12:58,923][47056] Fps is (10 sec: 54071.8, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 864780288. Throughput: 0: 55868.5. Samples: 814090320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:12:58,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 02:12:59,930][47288] Updated weights for policy 0, policy_version 52786 (0.0027) [2024-04-26 02:13:02,908][47288] Updated weights for policy 0, policy_version 52796 (0.0030) [2024-04-26 02:13:03,923][47056] Fps is (10 sec: 54068.5, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 865042432. Throughput: 0: 55761.6. Samples: 814423340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:13:03,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 02:13:05,656][47288] Updated weights for policy 0, policy_version 52806 (0.0028) [2024-04-26 02:13:08,745][47288] Updated weights for policy 0, policy_version 52816 (0.0035) [2024-04-26 02:13:08,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 865337344. Throughput: 0: 55764.8. Samples: 814755360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:13:08,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 02:13:11,479][47288] Updated weights for policy 0, policy_version 52826 (0.0034) [2024-04-26 02:13:13,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 865583104. Throughput: 0: 55552.6. Samples: 814916640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:13:13,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 02:13:14,834][47288] Updated weights for policy 0, policy_version 52836 (0.0027) [2024-04-26 02:13:17,347][47288] Updated weights for policy 0, policy_version 52846 (0.0027) [2024-04-26 02:13:18,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 865878016. Throughput: 0: 55579.2. Samples: 815248520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:13:18,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:13:20,640][47288] Updated weights for policy 0, policy_version 52856 (0.0036) [2024-04-26 02:13:23,220][47288] Updated weights for policy 0, policy_version 52866 (0.0024) [2024-04-26 02:13:23,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 866172928. Throughput: 0: 55387.5. Samples: 815579460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:13:23,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 02:13:26,544][47288] Updated weights for policy 0, policy_version 52876 (0.0028) [2024-04-26 02:13:28,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 866451456. Throughput: 0: 55528.9. Samples: 815751100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:13:28,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 02:13:29,075][47288] Updated weights for policy 0, policy_version 52886 (0.0029) [2024-04-26 02:13:32,809][47288] Updated weights for policy 0, policy_version 52896 (0.0032) [2024-04-26 02:13:33,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 866713600. Throughput: 0: 55584.2. Samples: 816083560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 02:13:33,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 02:13:35,107][47288] Updated weights for policy 0, policy_version 52906 (0.0023) [2024-04-26 02:13:38,810][47288] Updated weights for policy 0, policy_version 52916 (0.0027) [2024-04-26 02:13:38,923][47056] Fps is (10 sec: 52427.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 866975744. Throughput: 0: 55572.9. Samples: 816417800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 02:13:38,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 02:13:40,873][47288] Updated weights for policy 0, policy_version 52926 (0.0027) [2024-04-26 02:13:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.9, 300 sec: 55650.1). Total num frames: 867270656. Throughput: 0: 55314.3. Samples: 816579460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 02:13:43,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 02:13:44,462][47288] Updated weights for policy 0, policy_version 52936 (0.0026) [2024-04-26 02:13:46,813][47288] Updated weights for policy 0, policy_version 52946 (0.0036) [2024-04-26 02:13:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55160.2, 300 sec: 55650.1). Total num frames: 867549184. Throughput: 0: 55385.1. Samples: 816915680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 02:13:48,923][47056] Avg episode reward: [(0, '0.308')] [2024-04-26 02:13:49,500][47267] Signal inference workers to stop experience collection... (12100 times) [2024-04-26 02:13:49,546][47288] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-04-26 02:13:49,559][47267] Signal inference workers to resume experience collection... 
(12100 times) [2024-04-26 02:13:49,563][47288] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-04-26 02:13:50,174][47288] Updated weights for policy 0, policy_version 52956 (0.0034) [2024-04-26 02:13:52,661][47288] Updated weights for policy 0, policy_version 52966 (0.0031) [2024-04-26 02:13:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 867844096. Throughput: 0: 55459.7. Samples: 817251040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 02:13:53,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 02:13:56,046][47288] Updated weights for policy 0, policy_version 52976 (0.0027) [2024-04-26 02:13:58,442][47288] Updated weights for policy 0, policy_version 52986 (0.0022) [2024-04-26 02:13:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 868122624. Throughput: 0: 55789.4. Samples: 817427160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 02:13:58,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 02:14:02,169][47288] Updated weights for policy 0, policy_version 52996 (0.0025) [2024-04-26 02:14:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 868401152. Throughput: 0: 55861.9. Samples: 817762300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 02:14:03,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 02:14:04,358][47288] Updated weights for policy 0, policy_version 53006 (0.0032) [2024-04-26 02:14:08,237][47288] Updated weights for policy 0, policy_version 53016 (0.0028) [2024-04-26 02:14:08,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 868646912. Throughput: 0: 55992.5. Samples: 818099120. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 02:14:08,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 02:14:10,256][47288] Updated weights for policy 0, policy_version 53026 (0.0029) [2024-04-26 02:14:13,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 868925440. Throughput: 0: 55691.8. Samples: 818257240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 02:14:13,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 02:14:14,294][47288] Updated weights for policy 0, policy_version 53036 (0.0031) [2024-04-26 02:14:16,135][47288] Updated weights for policy 0, policy_version 53046 (0.0024) [2024-04-26 02:14:18,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 869203968. Throughput: 0: 55621.4. Samples: 818586520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 02:14:18,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 02:14:20,281][47288] Updated weights for policy 0, policy_version 53056 (0.0028) [2024-04-26 02:14:22,073][47288] Updated weights for policy 0, policy_version 53066 (0.0030) [2024-04-26 02:14:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 869482496. Throughput: 0: 55587.1. Samples: 818919220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 02:14:23,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 02:14:26,095][47288] Updated weights for policy 0, policy_version 53076 (0.0029) [2024-04-26 02:14:28,092][47288] Updated weights for policy 0, policy_version 53086 (0.0030) [2024-04-26 02:14:28,923][47056] Fps is (10 sec: 57342.9, 60 sec: 55432.3, 300 sec: 55650.1). Total num frames: 869777408. Throughput: 0: 55833.1. Samples: 819091960. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 02:14:28,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 02:14:31,866][47288] Updated weights for policy 0, policy_version 53096 (0.0034) [2024-04-26 02:14:33,834][47288] Updated weights for policy 0, policy_version 53106 (0.0030) [2024-04-26 02:14:33,923][47056] Fps is (10 sec: 60621.3, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 870088704. Throughput: 0: 55874.7. Samples: 819430040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 02:14:33,924][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 02:14:37,869][47288] Updated weights for policy 0, policy_version 53116 (0.0033) [2024-04-26 02:14:38,663][47267] Signal inference workers to stop experience collection... (12150 times) [2024-04-26 02:14:38,664][47267] Signal inference workers to resume experience collection... (12150 times) [2024-04-26 02:14:38,681][47288] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-04-26 02:14:38,681][47288] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-04-26 02:14:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 870350848. Throughput: 0: 55872.3. Samples: 819765300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 02:14:38,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 02:14:39,647][47288] Updated weights for policy 0, policy_version 53126 (0.0026) [2024-04-26 02:14:43,545][47288] Updated weights for policy 0, policy_version 53136 (0.0029) [2024-04-26 02:14:43,923][47056] Fps is (10 sec: 50790.6, 60 sec: 55432.5, 300 sec: 55594.6). Total num frames: 870596608. Throughput: 0: 55610.7. Samples: 819929640. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 02:14:43,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 02:14:45,481][47288] Updated weights for policy 0, policy_version 53146 (0.0029) [2024-04-26 02:14:48,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 870875136. Throughput: 0: 55635.9. Samples: 820265920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 02:14:48,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 02:14:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000053154_870875136.pth... [2024-04-26 02:14:49,003][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000052341_857554944.pth [2024-04-26 02:14:49,302][47288] Updated weights for policy 0, policy_version 53156 (0.0027) [2024-04-26 02:14:51,310][47288] Updated weights for policy 0, policy_version 53166 (0.0029) [2024-04-26 02:14:53,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 871170048. Throughput: 0: 55623.9. Samples: 820602200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 02:14:53,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 02:14:55,185][47288] Updated weights for policy 0, policy_version 53176 (0.0026) [2024-04-26 02:14:57,109][47288] Updated weights for policy 0, policy_version 53186 (0.0028) [2024-04-26 02:14:58,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 871448576. Throughput: 0: 55715.5. Samples: 820764440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 02:14:58,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 02:15:01,071][47288] Updated weights for policy 0, policy_version 53196 (0.0031) [2024-04-26 02:15:02,963][47288] Updated weights for policy 0, policy_version 53206 (0.0025) [2024-04-26 02:15:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 871743488. 
Throughput: 0: 55809.6. Samples: 821097960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 02:15:03,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 02:15:06,979][47288] Updated weights for policy 0, policy_version 53216 (0.0035) [2024-04-26 02:15:08,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.7, 300 sec: 55761.1). Total num frames: 872038400. Throughput: 0: 55690.6. Samples: 821425300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 02:15:08,924][47056] Avg episode reward: [(0, '0.330')] [2024-04-26 02:15:09,104][47288] Updated weights for policy 0, policy_version 53226 (0.0032) [2024-04-26 02:15:13,002][47288] Updated weights for policy 0, policy_version 53236 (0.0033) [2024-04-26 02:15:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 872284160. Throughput: 0: 55640.1. Samples: 821595760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:15:13,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 02:15:13,966][47267] Saving new best policy, reward=0.520! [2024-04-26 02:15:15,193][47288] Updated weights for policy 0, policy_version 53246 (0.0024) [2024-04-26 02:15:18,785][47288] Updated weights for policy 0, policy_version 53256 (0.0028) [2024-04-26 02:15:18,923][47056] Fps is (10 sec: 50791.5, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 872546304. Throughput: 0: 55623.2. Samples: 821933080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:15:18,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 02:15:20,992][47288] Updated weights for policy 0, policy_version 53266 (0.0035) [2024-04-26 02:15:23,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 872824832. Throughput: 0: 55645.8. Samples: 822269360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:15:23,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 02:15:24,680][47288] Updated weights for policy 0, policy_version 53276 (0.0023) [2024-04-26 02:15:26,915][47288] Updated weights for policy 0, policy_version 53286 (0.0036) [2024-04-26 02:15:28,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 873136128. Throughput: 0: 55547.9. Samples: 822429300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:15:28,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 02:15:30,493][47288] Updated weights for policy 0, policy_version 53296 (0.0035) [2024-04-26 02:15:32,854][47288] Updated weights for policy 0, policy_version 53306 (0.0031) [2024-04-26 02:15:33,923][47056] Fps is (10 sec: 55705.3, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 873381888. Throughput: 0: 55458.2. Samples: 822761540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:15:33,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 02:15:34,379][47267] Signal inference workers to stop experience collection... (12200 times) [2024-04-26 02:15:34,428][47288] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-04-26 02:15:34,436][47267] Signal inference workers to resume experience collection... (12200 times) [2024-04-26 02:15:34,442][47288] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-04-26 02:15:36,388][47288] Updated weights for policy 0, policy_version 53316 (0.0028) [2024-04-26 02:15:38,585][47288] Updated weights for policy 0, policy_version 53326 (0.0027) [2024-04-26 02:15:38,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 873693184. Throughput: 0: 55484.1. Samples: 823098980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:15:38,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 02:15:42,428][47288] Updated weights for policy 0, policy_version 53336 (0.0028) [2024-04-26 02:15:43,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 873971712. Throughput: 0: 55778.9. Samples: 823274480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:15:43,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 02:15:44,391][47288] Updated weights for policy 0, policy_version 53346 (0.0032) [2024-04-26 02:15:48,189][47288] Updated weights for policy 0, policy_version 53356 (0.0028) [2024-04-26 02:15:48,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 874250240. Throughput: 0: 55917.5. Samples: 823614240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:15:48,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 02:15:50,263][47288] Updated weights for policy 0, policy_version 53366 (0.0028) [2024-04-26 02:15:53,923][47056] Fps is (10 sec: 50789.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 874479616. Throughput: 0: 55949.9. Samples: 823943040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:15:53,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 02:15:54,248][47288] Updated weights for policy 0, policy_version 53376 (0.0029) [2024-04-26 02:15:56,161][47288] Updated weights for policy 0, policy_version 53386 (0.0026) [2024-04-26 02:15:58,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 874774528. Throughput: 0: 55562.6. Samples: 824096080. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:15:58,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 02:16:00,041][47288] Updated weights for policy 0, policy_version 53396 (0.0027) [2024-04-26 02:16:02,081][47288] Updated weights for policy 0, policy_version 53406 (0.0028) [2024-04-26 02:16:03,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 875069440. Throughput: 0: 55583.5. Samples: 824434340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:16:03,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 02:16:05,834][47288] Updated weights for policy 0, policy_version 53416 (0.0027) [2024-04-26 02:16:08,293][47288] Updated weights for policy 0, policy_version 53426 (0.0031) [2024-04-26 02:16:08,923][47056] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 875331584. Throughput: 0: 55591.9. Samples: 824771000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:16:08,932][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:16:11,790][47288] Updated weights for policy 0, policy_version 53436 (0.0031) [2024-04-26 02:16:11,825][47267] Signal inference workers to stop experience collection... (12250 times) [2024-04-26 02:16:11,862][47288] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-04-26 02:16:11,915][47267] Signal inference workers to resume experience collection... (12250 times) [2024-04-26 02:16:11,915][47288] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-04-26 02:16:13,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 875642880. Throughput: 0: 55965.3. Samples: 824947740. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:16:13,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 02:16:14,725][47288] Updated weights for policy 0, policy_version 53446 (0.0032) [2024-04-26 02:16:17,528][47288] Updated weights for policy 0, policy_version 53456 (0.0032) [2024-04-26 02:16:18,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 875921408. Throughput: 0: 55926.4. Samples: 825278220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:16:18,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 02:16:20,472][47288] Updated weights for policy 0, policy_version 53466 (0.0032) [2024-04-26 02:16:23,426][47288] Updated weights for policy 0, policy_version 53476 (0.0036) [2024-04-26 02:16:23,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 876183552. Throughput: 0: 55692.0. Samples: 825605120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:16:23,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 02:16:26,189][47288] Updated weights for policy 0, policy_version 53486 (0.0030) [2024-04-26 02:16:28,923][47056] Fps is (10 sec: 49151.4, 60 sec: 54613.3, 300 sec: 55483.4). Total num frames: 876412928. Throughput: 0: 55552.3. Samples: 825774340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:16:28,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 02:16:29,348][47288] Updated weights for policy 0, policy_version 53496 (0.0030) [2024-04-26 02:16:31,988][47288] Updated weights for policy 0, policy_version 53506 (0.0026) [2024-04-26 02:16:33,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 876724224. Throughput: 0: 55432.4. Samples: 826108700. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:16:33,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 02:16:35,406][47288] Updated weights for policy 0, policy_version 53516 (0.0027) [2024-04-26 02:16:37,817][47288] Updated weights for policy 0, policy_version 53526 (0.0029) [2024-04-26 02:16:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 876986368. Throughput: 0: 55536.0. Samples: 826442160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:16:38,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 02:16:41,306][47288] Updated weights for policy 0, policy_version 53536 (0.0026) [2024-04-26 02:16:43,556][47288] Updated weights for policy 0, policy_version 53546 (0.0026) [2024-04-26 02:16:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 877297664. Throughput: 0: 55733.3. Samples: 826604080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:16:43,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 02:16:46,983][47288] Updated weights for policy 0, policy_version 53556 (0.0033) [2024-04-26 02:16:48,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 877559808. Throughput: 0: 55753.4. Samples: 826943240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:16:48,923][47056] Avg episode reward: [(0, '0.285')] [2024-04-26 02:16:48,952][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000053563_877576192.pth... [2024-04-26 02:16:49,006][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000052749_864239616.pth [2024-04-26 02:16:49,605][47288] Updated weights for policy 0, policy_version 53566 (0.0029) [2024-04-26 02:16:52,818][47288] Updated weights for policy 0, policy_version 53576 (0.0028) [2024-04-26 02:16:53,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 877871104. 
Throughput: 0: 55581.4. Samples: 827272160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:16:53,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 02:16:56,023][47288] Updated weights for policy 0, policy_version 53586 (0.0027) [2024-04-26 02:16:58,566][47267] Signal inference workers to stop experience collection... (12300 times) [2024-04-26 02:16:58,602][47288] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-04-26 02:16:58,654][47267] Signal inference workers to resume experience collection... (12300 times) [2024-04-26 02:16:58,655][47288] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-04-26 02:16:58,763][47288] Updated weights for policy 0, policy_version 53596 (0.0033) [2024-04-26 02:16:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 878116864. Throughput: 0: 55451.8. Samples: 827443060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:16:58,923][47056] Avg episode reward: [(0, '0.341')] [2024-04-26 02:17:01,970][47288] Updated weights for policy 0, policy_version 53606 (0.0031) [2024-04-26 02:17:03,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 878395392. Throughput: 0: 55530.5. Samples: 827777100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:17:03,923][47056] Avg episode reward: [(0, '0.282')] [2024-04-26 02:17:04,727][47288] Updated weights for policy 0, policy_version 53616 (0.0032) [2024-04-26 02:17:07,796][47288] Updated weights for policy 0, policy_version 53626 (0.0034) [2024-04-26 02:17:08,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 878673920. Throughput: 0: 55660.8. Samples: 828109860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:17:08,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 02:17:10,540][47288] Updated weights for policy 0, policy_version 53636 (0.0025) [2024-04-26 02:17:13,620][47288] Updated weights for policy 0, policy_version 53646 (0.0034) [2024-04-26 02:17:13,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55159.7, 300 sec: 55594.6). Total num frames: 878952448. Throughput: 0: 55406.4. Samples: 828267620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:17:13,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 02:17:16,404][47288] Updated weights for policy 0, policy_version 53656 (0.0031) [2024-04-26 02:17:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 879230976. Throughput: 0: 55465.3. Samples: 828604640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:17:18,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 02:17:19,431][47288] Updated weights for policy 0, policy_version 53666 (0.0026) [2024-04-26 02:17:22,362][47288] Updated weights for policy 0, policy_version 53676 (0.0032) [2024-04-26 02:17:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 879525888. Throughput: 0: 55565.1. Samples: 828942580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 02:17:23,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 02:17:25,244][47288] Updated weights for policy 0, policy_version 53686 (0.0028) [2024-04-26 02:17:28,148][47288] Updated weights for policy 0, policy_version 53696 (0.0029) [2024-04-26 02:17:28,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56797.9, 300 sec: 55761.2). Total num frames: 879820800. Throughput: 0: 55678.3. Samples: 829109600. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 02:17:28,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 02:17:31,153][47288] Updated weights for policy 0, policy_version 53706 (0.0029) [2024-04-26 02:17:33,900][47288] Updated weights for policy 0, policy_version 53716 (0.0026) [2024-04-26 02:17:33,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 880082944. Throughput: 0: 55562.1. Samples: 829443540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 02:17:33,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 02:17:36,914][47288] Updated weights for policy 0, policy_version 53726 (0.0034) [2024-04-26 02:17:38,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 880345088. Throughput: 0: 55806.7. Samples: 829783460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 02:17:38,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 02:17:39,598][47288] Updated weights for policy 0, policy_version 53736 (0.0029) [2024-04-26 02:17:42,864][47288] Updated weights for policy 0, policy_version 53746 (0.0030) [2024-04-26 02:17:43,275][47267] Signal inference workers to stop experience collection... (12350 times) [2024-04-26 02:17:43,280][47267] Signal inference workers to resume experience collection... (12350 times) [2024-04-26 02:17:43,302][47288] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-04-26 02:17:43,303][47288] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-04-26 02:17:43,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55650.2). Total num frames: 880656384. Throughput: 0: 55793.7. Samples: 829953780. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 02:17:43,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 02:17:45,407][47288] Updated weights for policy 0, policy_version 53756 (0.0027) [2024-04-26 02:17:48,780][47288] Updated weights for policy 0, policy_version 53766 (0.0025) [2024-04-26 02:17:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 880918528. Throughput: 0: 55768.0. Samples: 830286660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 02:17:48,924][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 02:17:51,196][47288] Updated weights for policy 0, policy_version 53776 (0.0031) [2024-04-26 02:17:53,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 881180672. Throughput: 0: 55893.9. Samples: 830625080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 02:17:53,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 02:17:54,630][47288] Updated weights for policy 0, policy_version 53786 (0.0032) [2024-04-26 02:17:57,094][47288] Updated weights for policy 0, policy_version 53796 (0.0029) [2024-04-26 02:17:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 881475584. Throughput: 0: 56216.2. Samples: 830797360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-04-26 02:17:58,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 02:18:00,480][47288] Updated weights for policy 0, policy_version 53806 (0.0028) [2024-04-26 02:18:02,966][47288] Updated weights for policy 0, policy_version 53816 (0.0031) [2024-04-26 02:18:03,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 881770496. Throughput: 0: 56164.7. Samples: 831132040. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-04-26 02:18:03,923][47056] Avg episode reward: [(0, '0.293')] [2024-04-26 02:18:06,265][47288] Updated weights for policy 0, policy_version 53826 (0.0031) [2024-04-26 02:18:08,678][47288] Updated weights for policy 0, policy_version 53836 (0.0028) [2024-04-26 02:18:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 882049024. Throughput: 0: 56035.5. Samples: 831464180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-04-26 02:18:08,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 02:18:12,068][47288] Updated weights for policy 0, policy_version 53846 (0.0032) [2024-04-26 02:18:13,923][47056] Fps is (10 sec: 54066.0, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 882311168. Throughput: 0: 56088.3. Samples: 831633580. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-04-26 02:18:13,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 02:18:14,674][47288] Updated weights for policy 0, policy_version 53856 (0.0025) [2024-04-26 02:18:18,037][47288] Updated weights for policy 0, policy_version 53866 (0.0036) [2024-04-26 02:18:18,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 882606080. Throughput: 0: 56091.4. Samples: 831967660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-04-26 02:18:18,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 02:18:20,384][47288] Updated weights for policy 0, policy_version 53876 (0.0030) [2024-04-26 02:18:23,779][47288] Updated weights for policy 0, policy_version 53886 (0.0029) [2024-04-26 02:18:23,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 882868224. Throughput: 0: 56021.1. Samples: 832304400. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-04-26 02:18:23,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 02:18:26,262][47288] Updated weights for policy 0, policy_version 53896 (0.0028) [2024-04-26 02:18:28,923][47056] Fps is (10 sec: 52429.7, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 883130368. Throughput: 0: 55731.2. Samples: 832461680. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-04-26 02:18:28,923][47056] Avg episode reward: [(0, '0.266')] [2024-04-26 02:18:29,767][47288] Updated weights for policy 0, policy_version 53906 (0.0032) [2024-04-26 02:18:32,020][47288] Updated weights for policy 0, policy_version 53916 (0.0029) [2024-04-26 02:18:33,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 883425280. Throughput: 0: 55737.3. Samples: 832794840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 02:18:33,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 02:18:35,662][47288] Updated weights for policy 0, policy_version 53926 (0.0028) [2024-04-26 02:18:37,971][47288] Updated weights for policy 0, policy_version 53936 (0.0028) [2024-04-26 02:18:38,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 883720192. Throughput: 0: 55650.7. Samples: 833129360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 02:18:38,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 02:18:41,352][47288] Updated weights for policy 0, policy_version 53946 (0.0026) [2024-04-26 02:18:43,886][47288] Updated weights for policy 0, policy_version 53956 (0.0026) [2024-04-26 02:18:43,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 884015104. Throughput: 0: 55749.3. Samples: 833306080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 02:18:43,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 02:18:47,146][47288] Updated weights for policy 0, policy_version 53966 (0.0040) [2024-04-26 02:18:48,462][47267] Signal inference workers to stop experience collection... (12400 times) [2024-04-26 02:18:48,493][47288] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-04-26 02:18:48,518][47267] Signal inference workers to resume experience collection... (12400 times) [2024-04-26 02:18:48,518][47288] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-04-26 02:18:48,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 884260864. Throughput: 0: 55692.6. Samples: 833638220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 02:18:48,924][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 02:18:48,950][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000053972_884277248.pth... [2024-04-26 02:18:49,000][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000053154_870875136.pth [2024-04-26 02:18:49,789][47288] Updated weights for policy 0, policy_version 53976 (0.0026) [2024-04-26 02:18:53,038][47288] Updated weights for policy 0, policy_version 53986 (0.0030) [2024-04-26 02:18:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 884555776. Throughput: 0: 55763.9. Samples: 833973560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 02:18:53,923][47056] Avg episode reward: [(0, '0.258')] [2024-04-26 02:18:55,565][47288] Updated weights for policy 0, policy_version 53996 (0.0036) [2024-04-26 02:18:58,842][47288] Updated weights for policy 0, policy_version 54006 (0.0028) [2024-04-26 02:18:58,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 884834304. Throughput: 0: 55645.4. Samples: 834137620. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 02:18:58,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 02:19:01,440][47288] Updated weights for policy 0, policy_version 54016 (0.0027) [2024-04-26 02:19:03,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 885080064. Throughput: 0: 55698.0. Samples: 834474060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 02:19:03,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 02:19:04,718][47288] Updated weights for policy 0, policy_version 54026 (0.0029) [2024-04-26 02:19:07,343][47288] Updated weights for policy 0, policy_version 54036 (0.0026) [2024-04-26 02:19:08,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 885374976. Throughput: 0: 55702.5. Samples: 834811020. Policy #0 lag: (min: 1.0, avg: 8.0, max: 20.0) [2024-04-26 02:19:08,932][47056] Avg episode reward: [(0, '0.316')] [2024-04-26 02:19:10,598][47288] Updated weights for policy 0, policy_version 54046 (0.0037) [2024-04-26 02:19:13,131][47288] Updated weights for policy 0, policy_version 54056 (0.0033) [2024-04-26 02:19:13,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 885686272. Throughput: 0: 55812.3. Samples: 834973240. Policy #0 lag: (min: 1.0, avg: 8.0, max: 20.0) [2024-04-26 02:19:13,924][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 02:19:16,515][47288] Updated weights for policy 0, policy_version 54066 (0.0024) [2024-04-26 02:19:18,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.9, 300 sec: 55872.2). Total num frames: 885964800. Throughput: 0: 55881.6. Samples: 835309500. 
Policy #0 lag: (min: 1.0, avg: 8.0, max: 20.0) [2024-04-26 02:19:18,923][47056] Avg episode reward: [(0, '0.296')] [2024-04-26 02:19:19,011][47288] Updated weights for policy 0, policy_version 54076 (0.0032) [2024-04-26 02:19:22,392][47288] Updated weights for policy 0, policy_version 54086 (0.0028) [2024-04-26 02:19:23,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 886210560. Throughput: 0: 55859.1. Samples: 835643020. Policy #0 lag: (min: 1.0, avg: 8.0, max: 20.0) [2024-04-26 02:19:23,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 02:19:24,890][47288] Updated weights for policy 0, policy_version 54096 (0.0030) [2024-04-26 02:19:28,222][47288] Updated weights for policy 0, policy_version 54106 (0.0030) [2024-04-26 02:19:28,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 886505472. Throughput: 0: 55669.9. Samples: 835811220. Policy #0 lag: (min: 1.0, avg: 8.0, max: 20.0) [2024-04-26 02:19:28,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 02:19:30,961][47288] Updated weights for policy 0, policy_version 54116 (0.0028) [2024-04-26 02:19:33,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 886767616. Throughput: 0: 55797.7. Samples: 836149120. Policy #0 lag: (min: 1.0, avg: 8.0, max: 20.0) [2024-04-26 02:19:33,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 02:19:34,151][47288] Updated weights for policy 0, policy_version 54126 (0.0030) [2024-04-26 02:19:36,785][47288] Updated weights for policy 0, policy_version 54136 (0.0038) [2024-04-26 02:19:38,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 887029760. Throughput: 0: 55755.3. Samples: 836482540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:19:38,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 02:19:40,041][47288] Updated weights for policy 0, policy_version 54146 (0.0026) [2024-04-26 02:19:42,542][47288] Updated weights for policy 0, policy_version 54156 (0.0029) [2024-04-26 02:19:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 887324672. Throughput: 0: 55780.3. Samples: 836647740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:19:43,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 02:19:45,930][47288] Updated weights for policy 0, policy_version 54166 (0.0029) [2024-04-26 02:19:48,510][47288] Updated weights for policy 0, policy_version 54176 (0.0027) [2024-04-26 02:19:48,926][47056] Fps is (10 sec: 58964.8, 60 sec: 55976.0, 300 sec: 55760.6). Total num frames: 887619584. Throughput: 0: 55680.4. Samples: 836979840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:19:48,926][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 02:19:50,481][47267] Signal inference workers to stop experience collection... (12450 times) [2024-04-26 02:19:50,504][47288] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-04-26 02:19:50,575][47267] Signal inference workers to resume experience collection... (12450 times) [2024-04-26 02:19:50,575][47288] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-04-26 02:19:51,828][47288] Updated weights for policy 0, policy_version 54186 (0.0037) [2024-04-26 02:19:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 887898112. Throughput: 0: 55614.7. Samples: 837313680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:19:53,924][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 02:19:54,345][47288] Updated weights for policy 0, policy_version 54196 (0.0034) [2024-04-26 02:19:57,824][47288] Updated weights for policy 0, policy_version 54206 (0.0033) [2024-04-26 02:19:58,923][47056] Fps is (10 sec: 55722.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 888176640. Throughput: 0: 55788.5. Samples: 837483720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:19:58,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 02:20:00,253][47288] Updated weights for policy 0, policy_version 54216 (0.0034) [2024-04-26 02:20:03,646][47288] Updated weights for policy 0, policy_version 54226 (0.0028) [2024-04-26 02:20:03,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 888438784. Throughput: 0: 55729.5. Samples: 837817340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:20:03,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 02:20:06,230][47288] Updated weights for policy 0, policy_version 54236 (0.0029) [2024-04-26 02:20:08,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 888700928. Throughput: 0: 55717.9. Samples: 838150320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:20:08,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 02:20:09,584][47288] Updated weights for policy 0, policy_version 54246 (0.0030) [2024-04-26 02:20:12,448][47288] Updated weights for policy 0, policy_version 54256 (0.0035) [2024-04-26 02:20:13,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 888995840. Throughput: 0: 55598.2. Samples: 838313140. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:20:13,932][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 02:20:15,318][47288] Updated weights for policy 0, policy_version 54266 (0.0036) [2024-04-26 02:20:18,252][47288] Updated weights for policy 0, policy_version 54276 (0.0026) [2024-04-26 02:20:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 889274368. Throughput: 0: 55610.8. Samples: 838651600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:20:18,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 02:20:21,161][47288] Updated weights for policy 0, policy_version 54286 (0.0026) [2024-04-26 02:20:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 889552896. Throughput: 0: 55548.4. Samples: 838982220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:20:23,923][47056] Avg episode reward: [(0, '0.307')] [2024-04-26 02:20:24,228][47288] Updated weights for policy 0, policy_version 54296 (0.0029) [2024-04-26 02:20:27,023][47288] Updated weights for policy 0, policy_version 54306 (0.0031) [2024-04-26 02:20:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 889847808. Throughput: 0: 55666.8. Samples: 839152740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:20:28,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 02:20:30,001][47288] Updated weights for policy 0, policy_version 54316 (0.0029) [2024-04-26 02:20:32,918][47288] Updated weights for policy 0, policy_version 54326 (0.0031) [2024-04-26 02:20:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 890126336. Throughput: 0: 55820.6. Samples: 839491600. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:20:33,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 02:20:35,818][47288] Updated weights for policy 0, policy_version 54336 (0.0028) [2024-04-26 02:20:38,644][47288] Updated weights for policy 0, policy_version 54346 (0.0031) [2024-04-26 02:20:38,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.5, 300 sec: 55705.6). Total num frames: 890404864. Throughput: 0: 55750.1. Samples: 839822440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:20:38,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 02:20:41,792][47288] Updated weights for policy 0, policy_version 54356 (0.0029) [2024-04-26 02:20:43,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 890650624. Throughput: 0: 55578.6. Samples: 839984760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:20:43,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 02:20:44,841][47288] Updated weights for policy 0, policy_version 54366 (0.0028) [2024-04-26 02:20:47,623][47288] Updated weights for policy 0, policy_version 54376 (0.0030) [2024-04-26 02:20:48,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55435.1, 300 sec: 55816.7). Total num frames: 890945536. Throughput: 0: 55578.6. Samples: 840318380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:20:48,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 02:20:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000054379_890945536.pth... [2024-04-26 02:20:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000053563_877576192.pth [2024-04-26 02:20:50,591][47288] Updated weights for policy 0, policy_version 54386 (0.0033) [2024-04-26 02:20:53,373][47288] Updated weights for policy 0, policy_version 54396 (0.0029) [2024-04-26 02:20:53,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 891240448. 
Throughput: 0: 55574.9. Samples: 840651200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:20:53,923][47056] Avg episode reward: [(0, '0.330')] [2024-04-26 02:20:55,153][47267] Signal inference workers to stop experience collection... (12500 times) [2024-04-26 02:20:55,154][47267] Signal inference workers to resume experience collection... (12500 times) [2024-04-26 02:20:55,176][47288] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-04-26 02:20:55,177][47288] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-04-26 02:20:56,545][47288] Updated weights for policy 0, policy_version 54406 (0.0027) [2024-04-26 02:20:58,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 891518976. Throughput: 0: 55779.1. Samples: 840823200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:20:58,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 02:20:59,363][47288] Updated weights for policy 0, policy_version 54416 (0.0029) [2024-04-26 02:21:02,438][47288] Updated weights for policy 0, policy_version 54426 (0.0031) [2024-04-26 02:21:03,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 891797504. Throughput: 0: 55723.2. Samples: 841159140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:21:03,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 02:21:05,262][47288] Updated weights for policy 0, policy_version 54436 (0.0028) [2024-04-26 02:21:08,269][47288] Updated weights for policy 0, policy_version 54446 (0.0029) [2024-04-26 02:21:08,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 892059648. Throughput: 0: 55791.1. Samples: 841492820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:21:08,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 02:21:11,188][47288] Updated weights for policy 0, policy_version 54456 (0.0029) [2024-04-26 02:21:13,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 892338176. Throughput: 0: 55752.1. Samples: 841661580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:21:13,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 02:21:14,065][47288] Updated weights for policy 0, policy_version 54466 (0.0024) [2024-04-26 02:21:16,958][47288] Updated weights for policy 0, policy_version 54476 (0.0024) [2024-04-26 02:21:18,923][47056] Fps is (10 sec: 54065.9, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 892600320. Throughput: 0: 55691.3. Samples: 841997720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 02:21:18,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 02:21:20,001][47288] Updated weights for policy 0, policy_version 54486 (0.0026) [2024-04-26 02:21:22,754][47288] Updated weights for policy 0, policy_version 54496 (0.0031) [2024-04-26 02:21:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 892878848. Throughput: 0: 55741.6. Samples: 842330800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-04-26 02:21:23,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 02:21:26,117][47288] Updated weights for policy 0, policy_version 54506 (0.0025) [2024-04-26 02:21:28,684][47288] Updated weights for policy 0, policy_version 54516 (0.0034) [2024-04-26 02:21:28,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 893190144. Throughput: 0: 55789.3. Samples: 842495280. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-04-26 02:21:28,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:21:31,966][47288] Updated weights for policy 0, policy_version 54526 (0.0029) [2024-04-26 02:21:33,923][47056] Fps is (10 sec: 58981.1, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 893468672. Throughput: 0: 55773.3. Samples: 842828180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-04-26 02:21:33,924][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 02:21:34,572][47288] Updated weights for policy 0, policy_version 54536 (0.0032) [2024-04-26 02:21:37,756][47288] Updated weights for policy 0, policy_version 54546 (0.0027) [2024-04-26 02:21:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 893747200. Throughput: 0: 55842.6. Samples: 843164120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-04-26 02:21:38,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 02:21:40,516][47288] Updated weights for policy 0, policy_version 54556 (0.0031) [2024-04-26 02:21:43,578][47288] Updated weights for policy 0, policy_version 54566 (0.0032) [2024-04-26 02:21:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 894025728. Throughput: 0: 55862.6. Samples: 843337020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-04-26 02:21:43,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 02:21:46,286][47288] Updated weights for policy 0, policy_version 54576 (0.0028) [2024-04-26 02:21:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 894304256. Throughput: 0: 55884.3. Samples: 843673940. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-04-26 02:21:48,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 02:21:49,426][47288] Updated weights for policy 0, policy_version 54586 (0.0025) [2024-04-26 02:21:52,212][47288] Updated weights for policy 0, policy_version 54596 (0.0030) [2024-04-26 02:21:53,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 894582784. Throughput: 0: 55961.0. Samples: 844011080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:21:53,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 02:21:55,379][47288] Updated weights for policy 0, policy_version 54606 (0.0028) [2024-04-26 02:21:58,095][47288] Updated weights for policy 0, policy_version 54616 (0.0024) [2024-04-26 02:21:58,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 894844928. Throughput: 0: 55725.7. Samples: 844169240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:21:58,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 02:21:59,362][47267] Signal inference workers to stop experience collection... (12550 times) [2024-04-26 02:21:59,362][47267] Signal inference workers to resume experience collection... (12550 times) [2024-04-26 02:21:59,384][47288] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-04-26 02:21:59,385][47288] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-04-26 02:22:01,185][47288] Updated weights for policy 0, policy_version 54626 (0.0032) [2024-04-26 02:22:03,868][47288] Updated weights for policy 0, policy_version 54636 (0.0031) [2024-04-26 02:22:03,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 895156224. Throughput: 0: 55597.9. Samples: 844499620. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:22:03,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 02:22:07,085][47288] Updated weights for policy 0, policy_version 54646 (0.0028) [2024-04-26 02:22:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 895418368. Throughput: 0: 55701.2. Samples: 844837360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:22:08,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:22:09,645][47288] Updated weights for policy 0, policy_version 54656 (0.0028) [2024-04-26 02:22:12,950][47288] Updated weights for policy 0, policy_version 54666 (0.0031) [2024-04-26 02:22:13,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 895696896. Throughput: 0: 55973.5. Samples: 845014080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:22:13,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 02:22:15,448][47288] Updated weights for policy 0, policy_version 54676 (0.0027) [2024-04-26 02:22:18,820][47288] Updated weights for policy 0, policy_version 54686 (0.0025) [2024-04-26 02:22:18,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.9, 300 sec: 55761.1). Total num frames: 895975424. Throughput: 0: 55943.4. Samples: 845345620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:22:18,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 02:22:21,410][47288] Updated weights for policy 0, policy_version 54696 (0.0027) [2024-04-26 02:22:23,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 896237568. Throughput: 0: 55864.1. Samples: 845678000. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:22:23,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 02:22:24,630][47288] Updated weights for policy 0, policy_version 54706 (0.0024) [2024-04-26 02:22:27,619][47288] Updated weights for policy 0, policy_version 54716 (0.0033) [2024-04-26 02:22:28,923][47056] Fps is (10 sec: 54065.8, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 896516096. Throughput: 0: 55707.0. Samples: 845843840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:22:28,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 02:22:30,571][47288] Updated weights for policy 0, policy_version 54726 (0.0026) [2024-04-26 02:22:33,439][47288] Updated weights for policy 0, policy_version 54736 (0.0033) [2024-04-26 02:22:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 896811008. Throughput: 0: 55737.0. Samples: 846182100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:22:33,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:22:36,390][47288] Updated weights for policy 0, policy_version 54746 (0.0031) [2024-04-26 02:22:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 897089536. Throughput: 0: 55591.6. Samples: 846512700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:22:38,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 02:22:39,558][47288] Updated weights for policy 0, policy_version 54756 (0.0034) [2024-04-26 02:22:42,274][47288] Updated weights for policy 0, policy_version 54766 (0.0025) [2024-04-26 02:22:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 897368064. Throughput: 0: 55774.6. Samples: 846679100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:22:43,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 02:22:45,362][47288] Updated weights for policy 0, policy_version 54776 (0.0027) [2024-04-26 02:22:48,075][47267] Signal inference workers to stop experience collection... (12600 times) [2024-04-26 02:22:48,106][47288] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-04-26 02:22:48,159][47267] Signal inference workers to resume experience collection... (12600 times) [2024-04-26 02:22:48,159][47288] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-04-26 02:22:48,161][47288] Updated weights for policy 0, policy_version 54786 (0.0027) [2024-04-26 02:22:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 897646592. Throughput: 0: 55859.1. Samples: 847013280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:22:48,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 02:22:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000054788_897646592.pth... [2024-04-26 02:22:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000053972_884277248.pth [2024-04-26 02:22:51,197][47288] Updated weights for policy 0, policy_version 54796 (0.0034) [2024-04-26 02:22:53,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55432.8, 300 sec: 55705.6). Total num frames: 897908736. Throughput: 0: 55824.6. Samples: 847349460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:22:53,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 02:22:54,072][47288] Updated weights for policy 0, policy_version 54806 (0.0033) [2024-04-26 02:22:57,132][47288] Updated weights for policy 0, policy_version 54816 (0.0029) [2024-04-26 02:22:58,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 898187264. Throughput: 0: 55461.5. Samples: 847509860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:22:58,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 02:22:59,898][47288] Updated weights for policy 0, policy_version 54826 (0.0026) [2024-04-26 02:23:02,847][47288] Updated weights for policy 0, policy_version 54836 (0.0031) [2024-04-26 02:23:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 898465792. Throughput: 0: 55428.8. Samples: 847839920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 02:23:03,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 02:23:05,768][47288] Updated weights for policy 0, policy_version 54846 (0.0029) [2024-04-26 02:23:08,594][47288] Updated weights for policy 0, policy_version 54856 (0.0031) [2024-04-26 02:23:08,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 898760704. Throughput: 0: 55451.7. Samples: 848173320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 02:23:08,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 02:23:11,631][47288] Updated weights for policy 0, policy_version 54866 (0.0029) [2024-04-26 02:23:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 899039232. Throughput: 0: 55597.7. Samples: 848345720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 02:23:13,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 02:23:14,646][47288] Updated weights for policy 0, policy_version 54876 (0.0034) [2024-04-26 02:23:17,485][47288] Updated weights for policy 0, policy_version 54886 (0.0031) [2024-04-26 02:23:18,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 899301376. Throughput: 0: 55521.0. Samples: 848680540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 02:23:18,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 02:23:20,468][47288] Updated weights for policy 0, policy_version 54896 (0.0033) [2024-04-26 02:23:23,401][47288] Updated weights for policy 0, policy_version 54906 (0.0030) [2024-04-26 02:23:23,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 899596288. Throughput: 0: 55515.2. Samples: 849010880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 02:23:23,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 02:23:26,569][47288] Updated weights for policy 0, policy_version 54916 (0.0026) [2024-04-26 02:23:28,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 899874816. Throughput: 0: 55682.3. Samples: 849184800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 02:23:28,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 02:23:29,125][47288] Updated weights for policy 0, policy_version 54926 (0.0028) [2024-04-26 02:23:32,279][47288] Updated weights for policy 0, policy_version 54936 (0.0029) [2024-04-26 02:23:33,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 900153344. Throughput: 0: 55593.2. Samples: 849514980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 02:23:33,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 02:23:35,080][47288] Updated weights for policy 0, policy_version 54946 (0.0026) [2024-04-26 02:23:38,181][47288] Updated weights for policy 0, policy_version 54956 (0.0031) [2024-04-26 02:23:38,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 900415488. Throughput: 0: 55618.8. Samples: 849852320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:23:38,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 02:23:40,912][47288] Updated weights for policy 0, policy_version 54966 (0.0032) [2024-04-26 02:23:43,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 900710400. Throughput: 0: 55727.7. Samples: 850017600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:23:43,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 02:23:44,114][47288] Updated weights for policy 0, policy_version 54976 (0.0026) [2024-04-26 02:23:46,966][47288] Updated weights for policy 0, policy_version 54986 (0.0031) [2024-04-26 02:23:48,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 900972544. Throughput: 0: 55783.1. Samples: 850350160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:23:48,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 02:23:49,961][47288] Updated weights for policy 0, policy_version 54996 (0.0036) [2024-04-26 02:23:53,148][47288] Updated weights for policy 0, policy_version 55006 (0.0031) [2024-04-26 02:23:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 901267456. Throughput: 0: 55789.8. Samples: 850683860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:23:53,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 02:23:56,055][47288] Updated weights for policy 0, policy_version 55016 (0.0031) [2024-04-26 02:23:58,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 901529600. Throughput: 0: 55736.6. Samples: 850853880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:23:58,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 02:23:59,071][47288] Updated weights for policy 0, policy_version 55026 (0.0032) [2024-04-26 02:24:00,375][47267] Signal inference workers to stop experience collection... 
(12650 times) [2024-04-26 02:24:00,409][47288] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-04-26 02:24:00,432][47267] Signal inference workers to resume experience collection... (12650 times) [2024-04-26 02:24:00,433][47288] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-04-26 02:24:01,824][47288] Updated weights for policy 0, policy_version 55036 (0.0029) [2024-04-26 02:24:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 901824512. Throughput: 0: 55744.5. Samples: 851189040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:24:03,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 02:24:04,833][47288] Updated weights for policy 0, policy_version 55046 (0.0030) [2024-04-26 02:24:07,593][47288] Updated weights for policy 0, policy_version 55056 (0.0025) [2024-04-26 02:24:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 902103040. Throughput: 0: 55776.0. Samples: 851520800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:24:08,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 02:24:10,811][47288] Updated weights for policy 0, policy_version 55066 (0.0032) [2024-04-26 02:24:13,524][47288] Updated weights for policy 0, policy_version 55076 (0.0034) [2024-04-26 02:24:13,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 902365184. Throughput: 0: 55497.5. Samples: 851682180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 02:24:13,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 02:24:16,769][47288] Updated weights for policy 0, policy_version 55086 (0.0027) [2024-04-26 02:24:18,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 902660096. Throughput: 0: 55622.4. Samples: 852017980. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 02:24:18,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 02:24:19,511][47288] Updated weights for policy 0, policy_version 55096 (0.0026) [2024-04-26 02:24:22,623][47288] Updated weights for policy 0, policy_version 55106 (0.0030) [2024-04-26 02:24:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 902922240. Throughput: 0: 55587.3. Samples: 852353740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 02:24:23,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 02:24:25,314][47288] Updated weights for policy 0, policy_version 55116 (0.0027) [2024-04-26 02:24:28,645][47288] Updated weights for policy 0, policy_version 55126 (0.0027) [2024-04-26 02:24:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 903217152. Throughput: 0: 55599.1. Samples: 852519560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 02:24:28,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 02:24:31,213][47288] Updated weights for policy 0, policy_version 55136 (0.0026) [2024-04-26 02:24:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 903479296. Throughput: 0: 55727.9. Samples: 852857920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 02:24:33,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 02:24:34,349][47288] Updated weights for policy 0, policy_version 55146 (0.0032) [2024-04-26 02:24:37,296][47288] Updated weights for policy 0, policy_version 55156 (0.0028) [2024-04-26 02:24:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 903757824. Throughput: 0: 55664.4. Samples: 853188760. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 02:24:38,923][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 02:24:40,117][47288] Updated weights for policy 0, policy_version 55166 (0.0027) [2024-04-26 02:24:43,204][47288] Updated weights for policy 0, policy_version 55176 (0.0029) [2024-04-26 02:24:43,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55595.1). Total num frames: 904019968. Throughput: 0: 55610.3. Samples: 853356340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 02:24:43,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 02:24:46,080][47288] Updated weights for policy 0, policy_version 55186 (0.0031) [2024-04-26 02:24:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 904314880. Throughput: 0: 55509.2. Samples: 853686960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 02:24:48,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 02:24:48,981][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000055196_904331264.pth... [2024-04-26 02:24:48,988][47288] Updated weights for policy 0, policy_version 55196 (0.0028) [2024-04-26 02:24:49,025][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000054379_890945536.pth [2024-04-26 02:24:52,094][47288] Updated weights for policy 0, policy_version 55206 (0.0029) [2024-04-26 02:24:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 904593408. Throughput: 0: 55591.1. Samples: 854022400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 02:24:53,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 02:24:54,894][47288] Updated weights for policy 0, policy_version 55216 (0.0031) [2024-04-26 02:24:57,787][47288] Updated weights for policy 0, policy_version 55226 (0.0033) [2024-04-26 02:24:58,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 904855552. 
Throughput: 0: 55574.7. Samples: 854183040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 02:24:58,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 02:25:00,875][47288] Updated weights for policy 0, policy_version 55236 (0.0027) [2024-04-26 02:25:03,632][47288] Updated weights for policy 0, policy_version 55246 (0.0033) [2024-04-26 02:25:03,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55432.3, 300 sec: 55761.1). Total num frames: 905150464. Throughput: 0: 55439.4. Samples: 854512760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 02:25:03,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 02:25:06,657][47288] Updated weights for policy 0, policy_version 55256 (0.0029) [2024-04-26 02:25:08,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 905428992. Throughput: 0: 55422.2. Samples: 854847740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 02:25:08,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 02:25:09,566][47288] Updated weights for policy 0, policy_version 55266 (0.0029) [2024-04-26 02:25:10,615][47267] Signal inference workers to stop experience collection... (12700 times) [2024-04-26 02:25:10,619][47267] Signal inference workers to resume experience collection... (12700 times) [2024-04-26 02:25:10,647][47288] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-04-26 02:25:10,647][47288] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-04-26 02:25:12,444][47288] Updated weights for policy 0, policy_version 55276 (0.0029) [2024-04-26 02:25:13,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 905691136. Throughput: 0: 55632.5. Samples: 855023020. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 02:25:13,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 02:25:15,314][47288] Updated weights for policy 0, policy_version 55286 (0.0027) [2024-04-26 02:25:18,324][47288] Updated weights for policy 0, policy_version 55296 (0.0030) [2024-04-26 02:25:18,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 905986048. Throughput: 0: 55625.2. Samples: 855361060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 02:25:18,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 02:25:21,048][47288] Updated weights for policy 0, policy_version 55306 (0.0033) [2024-04-26 02:25:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 906264576. Throughput: 0: 55730.6. Samples: 855696640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 02:25:23,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 02:25:24,364][47288] Updated weights for policy 0, policy_version 55316 (0.0043) [2024-04-26 02:25:27,156][47288] Updated weights for policy 0, policy_version 55326 (0.0030) [2024-04-26 02:25:28,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 906543104. Throughput: 0: 55524.4. Samples: 855854940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 02:25:28,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 02:25:30,245][47288] Updated weights for policy 0, policy_version 55336 (0.0026) [2024-04-26 02:25:33,158][47288] Updated weights for policy 0, policy_version 55346 (0.0028) [2024-04-26 02:25:33,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.7, 300 sec: 55594.6). Total num frames: 906805248. Throughput: 0: 55576.6. Samples: 856187900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 02:25:33,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 02:25:35,937][47288] Updated weights for policy 0, policy_version 55356 (0.0033) [2024-04-26 02:25:38,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 907100160. Throughput: 0: 55684.4. Samples: 856528200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 02:25:38,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 02:25:39,039][47288] Updated weights for policy 0, policy_version 55366 (0.0034) [2024-04-26 02:25:41,784][47288] Updated weights for policy 0, policy_version 55376 (0.0030) [2024-04-26 02:25:43,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 907395072. Throughput: 0: 55919.3. Samples: 856699420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 02:25:43,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 02:25:44,822][47288] Updated weights for policy 0, policy_version 55386 (0.0030) [2024-04-26 02:25:47,732][47288] Updated weights for policy 0, policy_version 55396 (0.0025) [2024-04-26 02:25:48,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 907673600. Throughput: 0: 56011.2. Samples: 857033260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 02:25:48,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 02:25:50,681][47288] Updated weights for policy 0, policy_version 55406 (0.0028) [2024-04-26 02:25:53,534][47288] Updated weights for policy 0, policy_version 55416 (0.0034) [2024-04-26 02:25:53,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 907935744. Throughput: 0: 55944.3. Samples: 857365240. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 02:25:53,923][47056] Avg episode reward: [(0, '0.361')] [2024-04-26 02:25:56,477][47288] Updated weights for policy 0, policy_version 55426 (0.0026) [2024-04-26 02:25:58,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 908214272. Throughput: 0: 55807.0. Samples: 857534340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 02:25:58,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 02:25:59,465][47288] Updated weights for policy 0, policy_version 55436 (0.0029) [2024-04-26 02:26:02,355][47288] Updated weights for policy 0, policy_version 55446 (0.0028) [2024-04-26 02:26:03,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 908476416. Throughput: 0: 55740.9. Samples: 857869400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 02:26:03,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 02:26:05,201][47288] Updated weights for policy 0, policy_version 55456 (0.0029) [2024-04-26 02:26:08,387][47288] Updated weights for policy 0, policy_version 55466 (0.0029) [2024-04-26 02:26:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 908771328. Throughput: 0: 55762.6. Samples: 858205960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 02:26:08,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 02:26:11,136][47288] Updated weights for policy 0, policy_version 55476 (0.0029) [2024-04-26 02:26:13,923][47056] Fps is (10 sec: 58983.6, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 909066240. Throughput: 0: 55966.8. Samples: 858373440. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 02:26:13,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 02:26:14,260][47288] Updated weights for policy 0, policy_version 55486 (0.0026) [2024-04-26 02:26:17,093][47288] Updated weights for policy 0, policy_version 55496 (0.0026) [2024-04-26 02:26:17,722][47267] Signal inference workers to stop experience collection... (12750 times) [2024-04-26 02:26:17,722][47267] Signal inference workers to resume experience collection... (12750 times) [2024-04-26 02:26:17,740][47288] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-04-26 02:26:17,740][47288] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-04-26 02:26:18,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 909361152. Throughput: 0: 56047.9. Samples: 858710060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 02:26:18,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 02:26:20,049][47288] Updated weights for policy 0, policy_version 55506 (0.0029) [2024-04-26 02:26:22,809][47288] Updated weights for policy 0, policy_version 55516 (0.0031) [2024-04-26 02:26:23,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 909656064. Throughput: 0: 55989.8. Samples: 859047740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 02:26:23,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 02:26:25,865][47288] Updated weights for policy 0, policy_version 55526 (0.0026) [2024-04-26 02:26:28,628][47288] Updated weights for policy 0, policy_version 55536 (0.0030) [2024-04-26 02:26:28,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 909901824. Throughput: 0: 55976.0. Samples: 859218340. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:26:28,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 02:26:31,632][47288] Updated weights for policy 0, policy_version 55546 (0.0025) [2024-04-26 02:26:33,923][47056] Fps is (10 sec: 50790.5, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 910163968. Throughput: 0: 55960.1. Samples: 859551460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:26:33,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 02:26:34,790][47288] Updated weights for policy 0, policy_version 55556 (0.0032) [2024-04-26 02:26:37,444][47288] Updated weights for policy 0, policy_version 55566 (0.0027) [2024-04-26 02:26:38,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 910442496. Throughput: 0: 55994.3. Samples: 859884980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:26:38,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:26:40,606][47288] Updated weights for policy 0, policy_version 55576 (0.0026) [2024-04-26 02:26:43,368][47288] Updated weights for policy 0, policy_version 55586 (0.0031) [2024-04-26 02:26:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 910721024. Throughput: 0: 55688.9. Samples: 860040340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:26:43,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 02:26:46,371][47288] Updated weights for policy 0, policy_version 55596 (0.0030) [2024-04-26 02:26:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 911015936. Throughput: 0: 55598.5. Samples: 860371320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:26:48,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 02:26:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000055604_911015936.pth... 
[2024-04-26 02:26:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000054788_897646592.pth [2024-04-26 02:26:49,272][47288] Updated weights for policy 0, policy_version 55606 (0.0029) [2024-04-26 02:26:52,253][47288] Updated weights for policy 0, policy_version 55616 (0.0028) [2024-04-26 02:26:53,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 911294464. Throughput: 0: 55518.4. Samples: 860704280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:26:53,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 02:26:55,625][47288] Updated weights for policy 0, policy_version 55626 (0.0026) [2024-04-26 02:26:58,339][47288] Updated weights for policy 0, policy_version 55636 (0.0031) [2024-04-26 02:26:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 911589376. Throughput: 0: 55770.6. Samples: 860883120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:26:58,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 02:27:01,416][47288] Updated weights for policy 0, policy_version 55646 (0.0028) [2024-04-26 02:27:03,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 911835136. Throughput: 0: 55756.3. Samples: 861219100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 02:27:03,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 02:27:04,220][47288] Updated weights for policy 0, policy_version 55656 (0.0026) [2024-04-26 02:27:07,071][47288] Updated weights for policy 0, policy_version 55666 (0.0029) [2024-04-26 02:27:08,923][47056] Fps is (10 sec: 52427.9, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 912113664. Throughput: 0: 55779.4. Samples: 861557820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 02:27:08,923][47056] Avg episode reward: [(0, '0.352')] [2024-04-26 02:27:09,813][47267] Signal inference workers to stop experience collection... 
(12800 times) [2024-04-26 02:27:09,813][47267] Signal inference workers to resume experience collection... (12800 times) [2024-04-26 02:27:09,826][47288] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-04-26 02:27:09,827][47288] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-04-26 02:27:09,932][47288] Updated weights for policy 0, policy_version 55676 (0.0030) [2024-04-26 02:27:12,958][47288] Updated weights for policy 0, policy_version 55686 (0.0025) [2024-04-26 02:27:13,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 912392192. Throughput: 0: 55455.2. Samples: 861713820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 02:27:13,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 02:27:15,759][47288] Updated weights for policy 0, policy_version 55696 (0.0026) [2024-04-26 02:27:18,736][47288] Updated weights for policy 0, policy_version 55706 (0.0030) [2024-04-26 02:27:18,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 912687104. Throughput: 0: 55537.2. Samples: 862050640. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 02:27:18,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 02:27:21,580][47288] Updated weights for policy 0, policy_version 55716 (0.0024) [2024-04-26 02:27:23,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 912982016. Throughput: 0: 55611.5. Samples: 862387500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 02:27:23,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 02:27:24,847][47288] Updated weights for policy 0, policy_version 55726 (0.0038) [2024-04-26 02:27:27,354][47288] Updated weights for policy 0, policy_version 55736 (0.0026) [2024-04-26 02:27:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 913260544. Throughput: 0: 56134.1. Samples: 862566380. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 02:27:28,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 02:27:30,637][47288] Updated weights for policy 0, policy_version 55746 (0.0028) [2024-04-26 02:27:33,169][47288] Updated weights for policy 0, policy_version 55756 (0.0026) [2024-04-26 02:27:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 913555456. Throughput: 0: 56133.4. Samples: 862897320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-26 02:27:33,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 02:27:36,642][47288] Updated weights for policy 0, policy_version 55766 (0.0031) [2024-04-26 02:27:38,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 913801216. Throughput: 0: 56140.5. Samples: 863230600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 02:27:38,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 02:27:39,119][47288] Updated weights for policy 0, policy_version 55776 (0.0029) [2024-04-26 02:27:42,549][47288] Updated weights for policy 0, policy_version 55786 (0.0030) [2024-04-26 02:27:43,923][47056] Fps is (10 sec: 49152.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 914046976. Throughput: 0: 55765.8. Samples: 863392580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 02:27:43,923][47056] Avg episode reward: [(0, '0.308')] [2024-04-26 02:27:44,962][47288] Updated weights for policy 0, policy_version 55796 (0.0027) [2024-04-26 02:27:48,243][47288] Updated weights for policy 0, policy_version 55806 (0.0024) [2024-04-26 02:27:48,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 914325504. Throughput: 0: 55745.1. Samples: 863727620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 02:27:48,923][47056] Avg episode reward: [(0, '0.259')] [2024-04-26 02:27:50,763][47288] Updated weights for policy 0, policy_version 55816 (0.0032) [2024-04-26 02:27:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 914620416. Throughput: 0: 55717.6. Samples: 864065100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 02:27:53,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 02:27:54,073][47288] Updated weights for policy 0, policy_version 55826 (0.0028) [2024-04-26 02:27:56,727][47288] Updated weights for policy 0, policy_version 55836 (0.0030) [2024-04-26 02:27:58,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 914915328. Throughput: 0: 55962.7. Samples: 864232140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 02:27:58,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 02:28:00,145][47288] Updated weights for policy 0, policy_version 55846 (0.0031) [2024-04-26 02:28:02,589][47288] Updated weights for policy 0, policy_version 55856 (0.0029) [2024-04-26 02:28:03,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.9, 300 sec: 55761.1). Total num frames: 915210240. Throughput: 0: 55802.4. Samples: 864561740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 02:28:03,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 02:28:06,121][47288] Updated weights for policy 0, policy_version 55866 (0.0028) [2024-04-26 02:28:08,472][47288] Updated weights for policy 0, policy_version 55876 (0.0029) [2024-04-26 02:28:08,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 915505152. Throughput: 0: 55724.9. Samples: 864895120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 02:28:08,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 02:28:11,917][47288] Updated weights for policy 0, policy_version 55886 (0.0027) [2024-04-26 02:28:12,695][47267] Signal inference workers to stop experience collection... (12850 times) [2024-04-26 02:28:12,696][47267] Signal inference workers to resume experience collection... (12850 times) [2024-04-26 02:28:12,723][47288] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-04-26 02:28:12,723][47288] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-04-26 02:28:13,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 915750912. Throughput: 0: 55673.0. Samples: 865071660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:28:13,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 02:28:14,291][47288] Updated weights for policy 0, policy_version 55896 (0.0034) [2024-04-26 02:28:17,805][47288] Updated weights for policy 0, policy_version 55906 (0.0028) [2024-04-26 02:28:18,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 916013056. Throughput: 0: 55849.6. Samples: 865410560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:28:18,923][47056] Avg episode reward: [(0, '0.289')] [2024-04-26 02:28:20,096][47288] Updated weights for policy 0, policy_version 55916 (0.0030) [2024-04-26 02:28:23,759][47288] Updated weights for policy 0, policy_version 55926 (0.0027) [2024-04-26 02:28:23,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 916291584. Throughput: 0: 55921.8. Samples: 865747080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:28:23,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 02:28:25,931][47288] Updated weights for policy 0, policy_version 55936 (0.0029) [2024-04-26 02:28:28,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 916586496. Throughput: 0: 55696.2. Samples: 865898920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:28:28,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 02:28:29,708][47288] Updated weights for policy 0, policy_version 55946 (0.0030) [2024-04-26 02:28:31,820][47288] Updated weights for policy 0, policy_version 55956 (0.0026) [2024-04-26 02:28:33,923][47056] Fps is (10 sec: 58981.4, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 916881408. Throughput: 0: 55706.9. Samples: 866234440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:28:33,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 02:28:35,549][47288] Updated weights for policy 0, policy_version 55966 (0.0030) [2024-04-26 02:28:37,639][47288] Updated weights for policy 0, policy_version 55976 (0.0025) [2024-04-26 02:28:38,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 917176320. Throughput: 0: 55629.2. Samples: 866568420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:28:38,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 02:28:41,432][47288] Updated weights for policy 0, policy_version 55986 (0.0030) [2024-04-26 02:28:43,626][47288] Updated weights for policy 0, policy_version 55996 (0.0034) [2024-04-26 02:28:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.6, 300 sec: 55872.2). Total num frames: 917454848. Throughput: 0: 56039.4. Samples: 866753920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 02:28:43,924][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 02:28:47,341][47288] Updated weights for policy 0, policy_version 56006 (0.0029) [2024-04-26 02:28:48,923][47056] Fps is (10 sec: 52428.9, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 917700608. Throughput: 0: 56162.1. Samples: 867089040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 02:28:48,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 02:28:48,956][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000056013_917716992.pth... [2024-04-26 02:28:49,011][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000055196_904331264.pth [2024-04-26 02:28:49,401][47288] Updated weights for policy 0, policy_version 56016 (0.0029) [2024-04-26 02:28:53,129][47288] Updated weights for policy 0, policy_version 56026 (0.0029) [2024-04-26 02:28:53,923][47056] Fps is (10 sec: 50791.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 917962752. Throughput: 0: 55973.8. Samples: 867413940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 02:28:53,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:28:55,309][47288] Updated weights for policy 0, policy_version 56036 (0.0025) [2024-04-26 02:28:58,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 918224896. Throughput: 0: 55499.6. Samples: 867569140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 02:28:58,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 02:28:59,146][47288] Updated weights for policy 0, policy_version 56046 (0.0029) [2024-04-26 02:29:01,130][47288] Updated weights for policy 0, policy_version 56056 (0.0027) [2024-04-26 02:29:03,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55159.3, 300 sec: 55650.1). Total num frames: 918519808. Throughput: 0: 55381.8. Samples: 867902740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 02:29:03,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 02:29:05,110][47288] Updated weights for policy 0, policy_version 56066 (0.0029) [2024-04-26 02:29:06,679][47267] Signal inference workers to stop experience collection... (12900 times) [2024-04-26 02:29:06,685][47267] Signal inference workers to resume experience collection... (12900 times) [2024-04-26 02:29:06,699][47288] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-04-26 02:29:06,700][47288] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-04-26 02:29:06,920][47288] Updated weights for policy 0, policy_version 56076 (0.0036) [2024-04-26 02:29:08,923][47056] Fps is (10 sec: 60620.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 918831104. Throughput: 0: 55458.6. Samples: 868242720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 02:29:08,923][47056] Avg episode reward: [(0, '0.401')] [2024-04-26 02:29:10,918][47288] Updated weights for policy 0, policy_version 56086 (0.0032) [2024-04-26 02:29:12,774][47288] Updated weights for policy 0, policy_version 56096 (0.0027) [2024-04-26 02:29:13,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 919126016. Throughput: 0: 56053.3. Samples: 868421320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 02:29:13,932][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 02:29:16,666][47288] Updated weights for policy 0, policy_version 56106 (0.0035) [2024-04-26 02:29:18,689][47288] Updated weights for policy 0, policy_version 56116 (0.0023) [2024-04-26 02:29:18,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56798.0, 300 sec: 55927.8). Total num frames: 919420928. Throughput: 0: 56088.2. Samples: 868758400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 02:29:18,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 02:29:22,635][47288] Updated weights for policy 0, policy_version 56126 (0.0026) [2024-04-26 02:29:23,923][47056] Fps is (10 sec: 52429.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 919650304. Throughput: 0: 55951.2. Samples: 869086220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 02:29:23,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:29:24,698][47288] Updated weights for policy 0, policy_version 56136 (0.0030) [2024-04-26 02:29:28,750][47288] Updated weights for policy 0, policy_version 56146 (0.0031) [2024-04-26 02:29:28,923][47056] Fps is (10 sec: 47513.7, 60 sec: 55159.7, 300 sec: 55650.1). Total num frames: 919896064. Throughput: 0: 55303.4. Samples: 869242560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 02:29:28,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 02:29:30,543][47288] Updated weights for policy 0, policy_version 56156 (0.0030) [2024-04-26 02:29:33,923][47056] Fps is (10 sec: 52428.4, 60 sec: 54886.5, 300 sec: 55650.0). Total num frames: 920174592. Throughput: 0: 55227.5. Samples: 869574280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 02:29:33,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 02:29:34,527][47288] Updated weights for policy 0, policy_version 56166 (0.0030) [2024-04-26 02:29:36,365][47288] Updated weights for policy 0, policy_version 56176 (0.0029) [2024-04-26 02:29:38,923][47056] Fps is (10 sec: 57343.1, 60 sec: 54886.4, 300 sec: 55761.1). Total num frames: 920469504. Throughput: 0: 55466.5. Samples: 869909940. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 02:29:38,924][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:29:40,366][47288] Updated weights for policy 0, policy_version 56186 (0.0030) [2024-04-26 02:29:42,325][47288] Updated weights for policy 0, policy_version 56196 (0.0028) [2024-04-26 02:29:43,923][47056] Fps is (10 sec: 60620.4, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 920780800. Throughput: 0: 55716.2. Samples: 870076380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 02:29:43,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 02:29:46,419][47288] Updated weights for policy 0, policy_version 56206 (0.0030) [2024-04-26 02:29:47,761][47267] Signal inference workers to stop experience collection... (12950 times) [2024-04-26 02:29:47,778][47288] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-04-26 02:29:47,851][47267] Signal inference workers to resume experience collection... (12950 times) [2024-04-26 02:29:47,851][47288] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-04-26 02:29:48,106][47288] Updated weights for policy 0, policy_version 56216 (0.0035) [2024-04-26 02:29:48,923][47056] Fps is (10 sec: 60620.3, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 921075712. Throughput: 0: 55718.1. Samples: 870410060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 02:29:48,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 02:29:52,389][47288] Updated weights for policy 0, policy_version 56226 (0.0030) [2024-04-26 02:29:53,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 921354240. Throughput: 0: 55513.3. Samples: 870740820. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 02:29:53,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 02:29:53,938][47288] Updated weights for policy 0, policy_version 56236 (0.0027) [2024-04-26 02:29:58,160][47288] Updated weights for policy 0, policy_version 56246 (0.0035) [2024-04-26 02:29:58,923][47056] Fps is (10 sec: 49153.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 921567232. Throughput: 0: 55436.8. Samples: 870915960. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-26 02:29:58,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 02:29:59,864][47288] Updated weights for policy 0, policy_version 56256 (0.0034) [2024-04-26 02:30:03,923][47056] Fps is (10 sec: 49151.7, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 921845760. Throughput: 0: 55318.6. Samples: 871247740. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-26 02:30:03,923][47056] Avg episode reward: [(0, '0.300')] [2024-04-26 02:30:04,181][47288] Updated weights for policy 0, policy_version 56266 (0.0028) [2024-04-26 02:30:05,835][47288] Updated weights for policy 0, policy_version 56276 (0.0026) [2024-04-26 02:30:08,923][47056] Fps is (10 sec: 54066.7, 60 sec: 54613.3, 300 sec: 55650.1). Total num frames: 922107904. Throughput: 0: 55423.6. Samples: 871580280. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-26 02:30:08,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 02:30:10,006][47288] Updated weights for policy 0, policy_version 56286 (0.0032) [2024-04-26 02:30:11,774][47288] Updated weights for policy 0, policy_version 56296 (0.0028) [2024-04-26 02:30:13,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 922435584. Throughput: 0: 55388.4. Samples: 871735040. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-26 02:30:13,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 02:30:15,860][47288] Updated weights for policy 0, policy_version 56306 (0.0028) [2024-04-26 02:30:17,504][47288] Updated weights for policy 0, policy_version 56316 (0.0032) [2024-04-26 02:30:18,923][47056] Fps is (10 sec: 62257.8, 60 sec: 55159.2, 300 sec: 55816.6). Total num frames: 922730496. Throughput: 0: 55362.9. Samples: 872065620. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-26 02:30:18,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 02:30:21,824][47288] Updated weights for policy 0, policy_version 56326 (0.0030) [2024-04-26 02:30:23,453][47288] Updated weights for policy 0, policy_version 56336 (0.0032) [2024-04-26 02:30:23,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 923025408. Throughput: 0: 55207.6. Samples: 872394280. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-26 02:30:23,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 02:30:27,587][47288] Updated weights for policy 0, policy_version 56346 (0.0031) [2024-04-26 02:30:28,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 923287552. Throughput: 0: 55709.9. Samples: 872583320. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-26 02:30:28,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 02:30:29,055][47267] Signal inference workers to stop experience collection... (13000 times) [2024-04-26 02:30:29,056][47267] Signal inference workers to resume experience collection... 
(13000 times) [2024-04-26 02:30:29,085][47288] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-04-26 02:30:29,085][47288] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-04-26 02:30:29,323][47288] Updated weights for policy 0, policy_version 56356 (0.0031) [2024-04-26 02:30:33,571][47288] Updated weights for policy 0, policy_version 56366 (0.0029) [2024-04-26 02:30:33,923][47056] Fps is (10 sec: 49151.3, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 923516928. Throughput: 0: 55822.2. Samples: 872922060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 02:30:33,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 02:30:35,125][47288] Updated weights for policy 0, policy_version 56376 (0.0029) [2024-04-26 02:30:38,923][47056] Fps is (10 sec: 49151.5, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 923779072. Throughput: 0: 55910.4. Samples: 873256800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 02:30:38,924][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 02:30:39,534][47288] Updated weights for policy 0, policy_version 56386 (0.0026) [2024-04-26 02:30:40,974][47288] Updated weights for policy 0, policy_version 56396 (0.0024) [2024-04-26 02:30:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 924073984. Throughput: 0: 55118.4. Samples: 873396300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 02:30:43,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 02:30:45,513][47288] Updated weights for policy 0, policy_version 56406 (0.0031) [2024-04-26 02:30:46,950][47288] Updated weights for policy 0, policy_version 56416 (0.0025) [2024-04-26 02:30:48,923][47056] Fps is (10 sec: 58983.2, 60 sec: 54886.6, 300 sec: 55705.6). Total num frames: 924368896. Throughput: 0: 55119.6. Samples: 873728120. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 02:30:48,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 02:30:48,942][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000056419_924368896.pth... [2024-04-26 02:30:48,988][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000055604_911015936.pth [2024-04-26 02:30:51,522][47288] Updated weights for policy 0, policy_version 56426 (0.0029) [2024-04-26 02:30:52,887][47288] Updated weights for policy 0, policy_version 56436 (0.0032) [2024-04-26 02:30:53,923][47056] Fps is (10 sec: 62259.4, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 924696576. Throughput: 0: 55172.8. Samples: 874063060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 02:30:53,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 02:30:57,291][47288] Updated weights for policy 0, policy_version 56446 (0.0031) [2024-04-26 02:30:58,677][47288] Updated weights for policy 0, policy_version 56456 (0.0035) [2024-04-26 02:30:58,923][47056] Fps is (10 sec: 62258.9, 60 sec: 57070.7, 300 sec: 55983.3). Total num frames: 924991488. Throughput: 0: 55987.4. Samples: 874254480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 02:30:58,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 02:31:03,021][47288] Updated weights for policy 0, policy_version 56466 (0.0033) [2024-04-26 02:31:03,923][47056] Fps is (10 sec: 52429.0, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 925220864. Throughput: 0: 56090.4. Samples: 874589680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 02:31:03,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 02:31:04,455][47267] Signal inference workers to stop experience collection... (13050 times) [2024-04-26 02:31:04,455][47267] Signal inference workers to resume experience collection... 
(13050 times) [2024-04-26 02:31:04,468][47288] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-04-26 02:31:04,468][47288] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-04-26 02:31:04,565][47288] Updated weights for policy 0, policy_version 56476 (0.0025) [2024-04-26 02:31:08,911][47288] Updated weights for policy 0, policy_version 56486 (0.0028) [2024-04-26 02:31:08,923][47056] Fps is (10 sec: 47514.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 925466624. Throughput: 0: 56073.5. Samples: 874917580. Policy #0 lag: (min: 0.0, avg: 5.8, max: 20.0) [2024-04-26 02:31:08,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:31:10,472][47288] Updated weights for policy 0, policy_version 56496 (0.0024) [2024-04-26 02:31:13,923][47056] Fps is (10 sec: 50791.1, 60 sec: 54886.5, 300 sec: 55483.5). Total num frames: 925728768. Throughput: 0: 55211.7. Samples: 875067840. Policy #0 lag: (min: 0.0, avg: 5.8, max: 20.0) [2024-04-26 02:31:13,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 02:31:14,732][47288] Updated weights for policy 0, policy_version 56506 (0.0030) [2024-04-26 02:31:16,461][47288] Updated weights for policy 0, policy_version 56516 (0.0028) [2024-04-26 02:31:18,923][47056] Fps is (10 sec: 54066.0, 60 sec: 54613.3, 300 sec: 55427.9). Total num frames: 926007296. Throughput: 0: 55130.2. Samples: 875402920. Policy #0 lag: (min: 0.0, avg: 5.8, max: 20.0) [2024-04-26 02:31:18,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 02:31:20,594][47288] Updated weights for policy 0, policy_version 56526 (0.0029) [2024-04-26 02:31:22,288][47288] Updated weights for policy 0, policy_version 56536 (0.0027) [2024-04-26 02:31:23,923][47056] Fps is (10 sec: 58981.7, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 926318592. Throughput: 0: 55077.0. Samples: 875735260. 
Policy #0 lag: (min: 0.0, avg: 5.8, max: 20.0) [2024-04-26 02:31:23,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 02:31:26,516][47288] Updated weights for policy 0, policy_version 56546 (0.0031) [2024-04-26 02:31:28,063][47288] Updated weights for policy 0, policy_version 56556 (0.0026) [2024-04-26 02:31:28,923][47056] Fps is (10 sec: 62260.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 926629888. Throughput: 0: 55981.5. Samples: 875915460. Policy #0 lag: (min: 0.0, avg: 5.8, max: 20.0) [2024-04-26 02:31:28,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 02:31:32,473][47288] Updated weights for policy 0, policy_version 56566 (0.0027) [2024-04-26 02:31:33,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56525.1, 300 sec: 55816.7). Total num frames: 926908416. Throughput: 0: 56011.3. Samples: 876248620. Policy #0 lag: (min: 0.0, avg: 5.8, max: 20.0) [2024-04-26 02:31:33,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 02:31:34,062][47288] Updated weights for policy 0, policy_version 56576 (0.0027) [2024-04-26 02:31:38,302][47288] Updated weights for policy 0, policy_version 56586 (0.0030) [2024-04-26 02:31:38,923][47056] Fps is (10 sec: 52429.4, 60 sec: 56252.0, 300 sec: 55705.6). Total num frames: 927154176. Throughput: 0: 56002.9. Samples: 876583180. Policy #0 lag: (min: 0.0, avg: 5.8, max: 20.0) [2024-04-26 02:31:38,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 02:31:39,818][47267] Signal inference workers to stop experience collection... (13100 times) [2024-04-26 02:31:39,854][47288] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-04-26 02:31:39,907][47267] Signal inference workers to resume experience collection... 
(13100 times) [2024-04-26 02:31:39,907][47288] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-04-26 02:31:40,017][47288] Updated weights for policy 0, policy_version 56596 (0.0025) [2024-04-26 02:31:43,923][47056] Fps is (10 sec: 49151.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 927399936. Throughput: 0: 55192.1. Samples: 876738120. Policy #0 lag: (min: 0.0, avg: 6.9, max: 20.0) [2024-04-26 02:31:43,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 02:31:44,169][47288] Updated weights for policy 0, policy_version 56606 (0.0028) [2024-04-26 02:31:45,918][47288] Updated weights for policy 0, policy_version 56616 (0.0031) [2024-04-26 02:31:48,923][47056] Fps is (10 sec: 50789.9, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 927662080. Throughput: 0: 55266.3. Samples: 877076660. Policy #0 lag: (min: 0.0, avg: 6.9, max: 20.0) [2024-04-26 02:31:48,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 02:31:49,926][47288] Updated weights for policy 0, policy_version 56626 (0.0027) [2024-04-26 02:31:51,809][47288] Updated weights for policy 0, policy_version 56636 (0.0033) [2024-04-26 02:31:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 54340.3, 300 sec: 55483.4). Total num frames: 927956992. Throughput: 0: 55383.0. Samples: 877409820. Policy #0 lag: (min: 0.0, avg: 6.9, max: 20.0) [2024-04-26 02:31:53,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 02:31:55,693][47288] Updated weights for policy 0, policy_version 56646 (0.0029) [2024-04-26 02:31:57,822][47288] Updated weights for policy 0, policy_version 56656 (0.0024) [2024-04-26 02:31:58,923][47056] Fps is (10 sec: 60620.9, 60 sec: 54613.4, 300 sec: 55705.6). Total num frames: 928268288. Throughput: 0: 55805.7. Samples: 877579100. 
Policy #0 lag: (min: 0.0, avg: 6.9, max: 20.0) [2024-04-26 02:31:58,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 02:32:01,606][47288] Updated weights for policy 0, policy_version 56666 (0.0037) [2024-04-26 02:32:03,781][47288] Updated weights for policy 0, policy_version 56676 (0.0030) [2024-04-26 02:32:03,923][47056] Fps is (10 sec: 62259.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 928579584. Throughput: 0: 55590.4. Samples: 877904480. Policy #0 lag: (min: 0.0, avg: 6.9, max: 20.0) [2024-04-26 02:32:03,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 02:32:07,463][47288] Updated weights for policy 0, policy_version 56686 (0.0037) [2024-04-26 02:32:08,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 928858112. Throughput: 0: 55658.7. Samples: 878239900. Policy #0 lag: (min: 0.0, avg: 6.9, max: 20.0) [2024-04-26 02:32:08,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 02:32:09,661][47288] Updated weights for policy 0, policy_version 56696 (0.0029) [2024-04-26 02:32:13,271][47288] Updated weights for policy 0, policy_version 56706 (0.0028) [2024-04-26 02:32:13,922][47056] Fps is (10 sec: 52429.7, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 929103872. Throughput: 0: 55855.7. Samples: 878428960. Policy #0 lag: (min: 0.0, avg: 6.9, max: 20.0) [2024-04-26 02:32:13,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 02:32:15,498][47288] Updated weights for policy 0, policy_version 56716 (0.0032) [2024-04-26 02:32:18,923][47056] Fps is (10 sec: 50791.1, 60 sec: 55978.9, 300 sec: 55539.0). Total num frames: 929366016. Throughput: 0: 55816.0. Samples: 878760340. 
Policy #0 lag: (min: 1.0, avg: 7.3, max: 21.0) [2024-04-26 02:32:18,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 02:32:19,138][47288] Updated weights for policy 0, policy_version 56726 (0.0027) [2024-04-26 02:32:21,441][47288] Updated weights for policy 0, policy_version 56736 (0.0032) [2024-04-26 02:32:23,923][47056] Fps is (10 sec: 50789.9, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 929611776. Throughput: 0: 55759.9. Samples: 879092380. Policy #0 lag: (min: 1.0, avg: 7.3, max: 21.0) [2024-04-26 02:32:23,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 02:32:25,053][47288] Updated weights for policy 0, policy_version 56746 (0.0031) [2024-04-26 02:32:27,502][47288] Updated weights for policy 0, policy_version 56756 (0.0033) [2024-04-26 02:32:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 54613.4, 300 sec: 55427.9). Total num frames: 929906688. Throughput: 0: 55688.9. Samples: 879244120. Policy #0 lag: (min: 1.0, avg: 7.3, max: 21.0) [2024-04-26 02:32:28,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 02:32:29,800][47267] Signal inference workers to stop experience collection... (13150 times) [2024-04-26 02:32:29,827][47288] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-04-26 02:32:29,853][47267] Signal inference workers to resume experience collection... (13150 times) [2024-04-26 02:32:29,858][47288] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-04-26 02:32:30,841][47288] Updated weights for policy 0, policy_version 56766 (0.0036) [2024-04-26 02:32:33,307][47288] Updated weights for policy 0, policy_version 56776 (0.0028) [2024-04-26 02:32:33,923][47056] Fps is (10 sec: 62258.3, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 930234368. Throughput: 0: 55691.9. Samples: 879582800. 
Policy #0 lag: (min: 1.0, avg: 7.3, max: 21.0) [2024-04-26 02:32:33,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 02:32:36,650][47288] Updated weights for policy 0, policy_version 56786 (0.0030) [2024-04-26 02:32:38,923][47056] Fps is (10 sec: 60621.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 930512896. Throughput: 0: 55741.5. Samples: 879918180. Policy #0 lag: (min: 1.0, avg: 7.3, max: 21.0) [2024-04-26 02:32:38,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 02:32:39,120][47288] Updated weights for policy 0, policy_version 56796 (0.0027) [2024-04-26 02:32:42,480][47288] Updated weights for policy 0, policy_version 56806 (0.0035) [2024-04-26 02:32:43,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56797.9, 300 sec: 55872.2). Total num frames: 930807808. Throughput: 0: 55980.0. Samples: 880098200. Policy #0 lag: (min: 1.0, avg: 7.3, max: 21.0) [2024-04-26 02:32:43,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 02:32:45,073][47288] Updated weights for policy 0, policy_version 56816 (0.0029) [2024-04-26 02:32:48,329][47288] Updated weights for policy 0, policy_version 56826 (0.0031) [2024-04-26 02:32:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56797.9, 300 sec: 55761.1). Total num frames: 931069952. Throughput: 0: 56085.4. Samples: 880428320. Policy #0 lag: (min: 1.0, avg: 7.3, max: 21.0) [2024-04-26 02:32:48,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 02:32:49,038][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000056829_931086336.pth... [2024-04-26 02:32:49,089][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000056013_917716992.pth [2024-04-26 02:32:51,092][47288] Updated weights for policy 0, policy_version 56836 (0.0033) [2024-04-26 02:32:53,923][47056] Fps is (10 sec: 50789.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 931315712. Throughput: 0: 56121.7. Samples: 880765380. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 02:32:53,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 02:32:54,224][47288] Updated weights for policy 0, policy_version 56846 (0.0023) [2024-04-26 02:32:56,821][47288] Updated weights for policy 0, policy_version 56856 (0.0032) [2024-04-26 02:32:58,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.4, 300 sec: 55538.9). Total num frames: 931594240. Throughput: 0: 55321.0. Samples: 880918420. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 02:32:58,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 02:33:00,225][47288] Updated weights for policy 0, policy_version 56866 (0.0029) [2024-04-26 02:33:02,679][47288] Updated weights for policy 0, policy_version 56876 (0.0034) [2024-04-26 02:33:03,923][47056] Fps is (10 sec: 54067.5, 60 sec: 54613.3, 300 sec: 55427.9). Total num frames: 931856384. Throughput: 0: 55337.6. Samples: 881250540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 02:33:03,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 02:33:05,990][47288] Updated weights for policy 0, policy_version 56886 (0.0032) [2024-04-26 02:33:08,526][47288] Updated weights for policy 0, policy_version 56896 (0.0028) [2024-04-26 02:33:08,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 932184064. Throughput: 0: 55356.4. Samples: 881583420. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 02:33:08,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 02:33:11,778][47288] Updated weights for policy 0, policy_version 56906 (0.0028) [2024-04-26 02:33:13,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 932446208. Throughput: 0: 55904.8. Samples: 881759840. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 02:33:13,924][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 02:33:14,656][47288] Updated weights for policy 0, policy_version 56916 (0.0030) [2024-04-26 02:33:17,696][47288] Updated weights for policy 0, policy_version 56926 (0.0033) [2024-04-26 02:33:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 932741120. Throughput: 0: 55833.0. Samples: 882095280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 02:33:18,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 02:33:20,710][47288] Updated weights for policy 0, policy_version 56936 (0.0029) [2024-04-26 02:33:22,221][47267] Signal inference workers to stop experience collection... (13200 times) [2024-04-26 02:33:22,221][47267] Signal inference workers to resume experience collection... (13200 times) [2024-04-26 02:33:22,237][47288] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-04-26 02:33:22,237][47288] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-04-26 02:33:23,665][47288] Updated weights for policy 0, policy_version 56946 (0.0032) [2024-04-26 02:33:23,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 933003264. Throughput: 0: 55802.6. Samples: 882429300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 02:33:23,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 02:33:27,162][47288] Updated weights for policy 0, policy_version 56956 (0.0031) [2024-04-26 02:33:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 933298176. Throughput: 0: 55570.2. Samples: 882598860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:33:28,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 02:33:29,462][47288] Updated weights for policy 0, policy_version 56966 (0.0036) [2024-04-26 02:33:33,201][47288] Updated weights for policy 0, policy_version 56976 (0.0033) [2024-04-26 02:33:33,923][47056] Fps is (10 sec: 52427.9, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 933527552. Throughput: 0: 55687.4. Samples: 882934260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:33:33,924][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 02:33:35,314][47288] Updated weights for policy 0, policy_version 56986 (0.0031) [2024-04-26 02:33:38,923][47056] Fps is (10 sec: 50789.9, 60 sec: 54886.3, 300 sec: 55427.9). Total num frames: 933806080. Throughput: 0: 55716.5. Samples: 883272620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:33:38,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 02:33:39,091][47288] Updated weights for policy 0, policy_version 56996 (0.0025) [2024-04-26 02:33:41,238][47288] Updated weights for policy 0, policy_version 57006 (0.0035) [2024-04-26 02:33:43,923][47056] Fps is (10 sec: 58983.7, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 934117376. Throughput: 0: 55866.5. Samples: 883432400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:33:43,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:33:44,769][47288] Updated weights for policy 0, policy_version 57016 (0.0031) [2024-04-26 02:33:47,014][47288] Updated weights for policy 0, policy_version 57026 (0.0031) [2024-04-26 02:33:48,923][47056] Fps is (10 sec: 60620.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 934412288. Throughput: 0: 55920.9. Samples: 883766980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:33:48,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 02:33:50,450][47288] Updated weights for policy 0, policy_version 57036 (0.0027) [2024-04-26 02:33:52,812][47288] Updated weights for policy 0, policy_version 57046 (0.0027) [2024-04-26 02:33:53,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 934690816. Throughput: 0: 56100.8. Samples: 884107960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:33:53,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 02:33:56,284][47288] Updated weights for policy 0, policy_version 57056 (0.0028) [2024-04-26 02:33:58,620][47288] Updated weights for policy 0, policy_version 57066 (0.0033) [2024-04-26 02:33:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 934985728. Throughput: 0: 56013.4. Samples: 884280440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:33:58,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 02:34:02,174][47288] Updated weights for policy 0, policy_version 57076 (0.0032) [2024-04-26 02:34:03,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56797.9, 300 sec: 55705.6). Total num frames: 935264256. Throughput: 0: 56035.2. Samples: 884616860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 02:34:03,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 02:34:04,550][47288] Updated weights for policy 0, policy_version 57086 (0.0037) [2024-04-26 02:34:07,945][47288] Updated weights for policy 0, policy_version 57096 (0.0030) [2024-04-26 02:34:08,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 935510016. Throughput: 0: 56164.4. Samples: 884956700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 02:34:08,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 02:34:10,449][47288] Updated weights for policy 0, policy_version 57106 (0.0027) [2024-04-26 02:34:13,904][47288] Updated weights for policy 0, policy_version 57116 (0.0030) [2024-04-26 02:34:13,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 935788544. Throughput: 0: 55812.3. Samples: 885110420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 02:34:13,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 02:34:15,650][47267] Signal inference workers to stop experience collection... (13250 times) [2024-04-26 02:34:15,651][47267] Signal inference workers to resume experience collection... (13250 times) [2024-04-26 02:34:15,683][47288] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-04-26 02:34:15,683][47288] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-04-26 02:34:16,224][47288] Updated weights for policy 0, policy_version 57126 (0.0032) [2024-04-26 02:34:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 936067072. Throughput: 0: 55720.5. Samples: 885441680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 02:34:18,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 02:34:19,994][47288] Updated weights for policy 0, policy_version 57136 (0.0029) [2024-04-26 02:34:21,993][47288] Updated weights for policy 0, policy_version 57146 (0.0026) [2024-04-26 02:34:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 936361984. Throughput: 0: 55718.6. Samples: 885779960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 02:34:23,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:34:25,800][47288] Updated weights for policy 0, policy_version 57156 (0.0030) [2024-04-26 02:34:27,948][47288] Updated weights for policy 0, policy_version 57166 (0.0033) [2024-04-26 02:34:28,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 936656896. Throughput: 0: 56066.0. Samples: 885955380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 02:34:28,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 02:34:31,525][47288] Updated weights for policy 0, policy_version 57176 (0.0030) [2024-04-26 02:34:33,885][47288] Updated weights for policy 0, policy_version 57186 (0.0031) [2024-04-26 02:34:33,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56798.1, 300 sec: 55816.7). Total num frames: 936935424. Throughput: 0: 56019.7. Samples: 886287860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 02:34:33,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:34:37,337][47288] Updated weights for policy 0, policy_version 57196 (0.0031) [2024-04-26 02:34:38,923][47056] Fps is (10 sec: 52428.9, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 937181184. Throughput: 0: 55868.0. Samples: 886622020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 02:34:38,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 02:34:39,596][47288] Updated weights for policy 0, policy_version 57206 (0.0032) [2024-04-26 02:34:43,262][47288] Updated weights for policy 0, policy_version 57216 (0.0029) [2024-04-26 02:34:43,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 937443328. Throughput: 0: 55655.6. Samples: 886784940. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 02:34:43,923][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 02:34:45,392][47288] Updated weights for policy 0, policy_version 57226 (0.0028) [2024-04-26 02:34:48,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 937738240. Throughput: 0: 55653.8. Samples: 887121280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 02:34:48,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 02:34:49,008][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000057236_937754624.pth... [2024-04-26 02:34:49,013][47288] Updated weights for policy 0, policy_version 57236 (0.0033) [2024-04-26 02:34:49,057][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000056419_924368896.pth [2024-04-26 02:34:51,254][47288] Updated weights for policy 0, policy_version 57246 (0.0027) [2024-04-26 02:34:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 938016768. Throughput: 0: 55570.3. Samples: 887457360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 02:34:53,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 02:34:55,037][47288] Updated weights for policy 0, policy_version 57256 (0.0031) [2024-04-26 02:34:57,221][47288] Updated weights for policy 0, policy_version 57266 (0.0032) [2024-04-26 02:34:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 938311680. Throughput: 0: 55783.3. Samples: 887620660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 02:34:58,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 02:35:00,912][47288] Updated weights for policy 0, policy_version 57276 (0.0034) [2024-04-26 02:35:02,960][47288] Updated weights for policy 0, policy_version 57286 (0.0029) [2024-04-26 02:35:03,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 938590208. 
Throughput: 0: 55862.4. Samples: 887955480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 02:35:03,923][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 02:35:06,808][47288] Updated weights for policy 0, policy_version 57296 (0.0032) [2024-04-26 02:35:08,264][47267] Signal inference workers to stop experience collection... (13300 times) [2024-04-26 02:35:08,301][47288] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-04-26 02:35:08,355][47267] Signal inference workers to resume experience collection... (13300 times) [2024-04-26 02:35:08,355][47288] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-04-26 02:35:08,754][47288] Updated weights for policy 0, policy_version 57306 (0.0032) [2024-04-26 02:35:08,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56524.7, 300 sec: 55816.6). Total num frames: 938901504. Throughput: 0: 55742.6. Samples: 888288380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 02:35:08,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 02:35:12,587][47288] Updated weights for policy 0, policy_version 57316 (0.0028) [2024-04-26 02:35:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 939147264. Throughput: 0: 55716.1. Samples: 888462600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 02:35:13,923][47056] Avg episode reward: [(0, '0.318')] [2024-04-26 02:35:14,656][47288] Updated weights for policy 0, policy_version 57326 (0.0030) [2024-04-26 02:35:18,451][47288] Updated weights for policy 0, policy_version 57336 (0.0027) [2024-04-26 02:35:18,923][47056] Fps is (10 sec: 50791.1, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 939409408. Throughput: 0: 55847.0. Samples: 888800980. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:35:18,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 02:35:20,532][47288] Updated weights for policy 0, policy_version 57346 (0.0027) [2024-04-26 02:35:23,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 939687936. Throughput: 0: 55872.6. Samples: 889136280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:35:23,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 02:35:24,278][47288] Updated weights for policy 0, policy_version 57356 (0.0030) [2024-04-26 02:35:26,381][47288] Updated weights for policy 0, policy_version 57366 (0.0031) [2024-04-26 02:35:28,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55432.8, 300 sec: 55816.7). Total num frames: 939982848. Throughput: 0: 55553.9. Samples: 889284860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:35:28,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 02:35:30,122][47288] Updated weights for policy 0, policy_version 57376 (0.0032) [2024-04-26 02:35:32,192][47288] Updated weights for policy 0, policy_version 57386 (0.0026) [2024-04-26 02:35:33,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55872.3). Total num frames: 940261376. Throughput: 0: 55542.6. Samples: 889620700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:35:33,923][47056] Avg episode reward: [(0, '0.283')] [2024-04-26 02:35:36,024][47288] Updated weights for policy 0, policy_version 57396 (0.0029) [2024-04-26 02:35:38,134][47288] Updated weights for policy 0, policy_version 57406 (0.0030) [2024-04-26 02:35:38,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 940556288. Throughput: 0: 55568.9. Samples: 889957960. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:35:38,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 02:35:42,002][47288] Updated weights for policy 0, policy_version 57416 (0.0031) [2024-04-26 02:35:43,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56797.7, 300 sec: 55872.2). Total num frames: 940851200. Throughput: 0: 55834.1. Samples: 890133200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:35:43,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 02:35:44,006][47288] Updated weights for policy 0, policy_version 57426 (0.0029) [2024-04-26 02:35:47,806][47288] Updated weights for policy 0, policy_version 57436 (0.0025) [2024-04-26 02:35:48,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 941096960. Throughput: 0: 55869.4. Samples: 890469600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 02:35:48,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 02:35:49,736][47288] Updated weights for policy 0, policy_version 57446 (0.0032) [2024-04-26 02:35:53,793][47288] Updated weights for policy 0, policy_version 57456 (0.0030) [2024-04-26 02:35:53,923][47056] Fps is (10 sec: 50791.1, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 941359104. Throughput: 0: 56052.7. Samples: 890810740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 30.0) [2024-04-26 02:35:53,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 02:35:55,553][47288] Updated weights for policy 0, policy_version 57466 (0.0033) [2024-04-26 02:35:58,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 941654016. Throughput: 0: 55687.8. Samples: 890968560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 30.0) [2024-04-26 02:35:58,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 02:35:59,616][47288] Updated weights for policy 0, policy_version 57476 (0.0027) [2024-04-26 02:36:00,955][47267] Signal inference workers to stop experience collection... 
(13350 times) [2024-04-26 02:36:00,997][47288] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-04-26 02:36:01,004][47267] Signal inference workers to resume experience collection... (13350 times) [2024-04-26 02:36:01,008][47288] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-04-26 02:36:01,493][47288] Updated weights for policy 0, policy_version 57486 (0.0030) [2024-04-26 02:36:03,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 941932544. Throughput: 0: 55598.6. Samples: 891302920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 30.0) [2024-04-26 02:36:03,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 02:36:05,513][47288] Updated weights for policy 0, policy_version 57496 (0.0026) [2024-04-26 02:36:07,816][47288] Updated weights for policy 0, policy_version 57506 (0.0027) [2024-04-26 02:36:08,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 55927.7). Total num frames: 942227456. Throughput: 0: 55605.7. Samples: 891638540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 30.0) [2024-04-26 02:36:08,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 02:36:11,425][47288] Updated weights for policy 0, policy_version 57516 (0.0028) [2024-04-26 02:36:13,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55872.3). Total num frames: 942489600. Throughput: 0: 56161.2. Samples: 891812120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 30.0) [2024-04-26 02:36:13,923][47056] Avg episode reward: [(0, '0.323')] [2024-04-26 02:36:14,028][47288] Updated weights for policy 0, policy_version 57526 (0.0028) [2024-04-26 02:36:17,322][47288] Updated weights for policy 0, policy_version 57536 (0.0026) [2024-04-26 02:36:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 942800896. Throughput: 0: 56143.9. Samples: 892147180. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 30.0) [2024-04-26 02:36:18,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 02:36:19,736][47288] Updated weights for policy 0, policy_version 57546 (0.0027) [2024-04-26 02:36:23,162][47288] Updated weights for policy 0, policy_version 57556 (0.0028) [2024-04-26 02:36:23,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 943046656. Throughput: 0: 56106.1. Samples: 892482740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 30.0) [2024-04-26 02:36:23,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 02:36:25,572][47288] Updated weights for policy 0, policy_version 57566 (0.0025) [2024-04-26 02:36:28,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55432.3, 300 sec: 55594.5). Total num frames: 943308800. Throughput: 0: 55877.3. Samples: 892647680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:36:28,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 02:36:28,968][47288] Updated weights for policy 0, policy_version 57576 (0.0032) [2024-04-26 02:36:31,492][47288] Updated weights for policy 0, policy_version 57586 (0.0028) [2024-04-26 02:36:33,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55705.3, 300 sec: 55761.1). Total num frames: 943603712. Throughput: 0: 55784.1. Samples: 892979900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:36:33,924][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 02:36:34,935][47288] Updated weights for policy 0, policy_version 57596 (0.0031) [2024-04-26 02:36:37,154][47288] Updated weights for policy 0, policy_version 57606 (0.0038) [2024-04-26 02:36:38,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 943882240. Throughput: 0: 55688.5. Samples: 893316720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:36:38,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 02:36:40,809][47288] Updated weights for policy 0, policy_version 57616 (0.0027) [2024-04-26 02:36:41,666][47267] Signal inference workers to stop experience collection... (13400 times) [2024-04-26 02:36:41,686][47288] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-04-26 02:36:41,724][47267] Signal inference workers to resume experience collection... (13400 times) [2024-04-26 02:36:41,725][47288] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-04-26 02:36:43,075][47288] Updated weights for policy 0, policy_version 57626 (0.0024) [2024-04-26 02:36:43,923][47056] Fps is (10 sec: 57345.6, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 944177152. Throughput: 0: 56007.8. Samples: 893488900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:36:43,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 02:36:46,480][47288] Updated weights for policy 0, policy_version 57636 (0.0032) [2024-04-26 02:36:48,786][47288] Updated weights for policy 0, policy_version 57646 (0.0027) [2024-04-26 02:36:48,923][47056] Fps is (10 sec: 58980.9, 60 sec: 56251.5, 300 sec: 55983.3). Total num frames: 944472064. Throughput: 0: 56045.6. Samples: 893824980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:36:48,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 02:36:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000057646_944472064.pth... [2024-04-26 02:36:48,988][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000056829_931086336.pth [2024-04-26 02:36:52,275][47288] Updated weights for policy 0, policy_version 57656 (0.0030) [2024-04-26 02:36:53,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 944750592. Throughput: 0: 56004.4. Samples: 894158740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:36:53,924][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 02:36:54,766][47288] Updated weights for policy 0, policy_version 57666 (0.0027) [2024-04-26 02:36:58,110][47288] Updated weights for policy 0, policy_version 57676 (0.0031) [2024-04-26 02:36:58,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 945012736. Throughput: 0: 55813.2. Samples: 894323720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 02:36:58,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 02:37:00,729][47288] Updated weights for policy 0, policy_version 57686 (0.0028) [2024-04-26 02:37:03,923][47056] Fps is (10 sec: 52429.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 945274880. Throughput: 0: 55861.5. Samples: 894660940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 02:37:03,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 02:37:03,979][47288] Updated weights for policy 0, policy_version 57696 (0.0027) [2024-04-26 02:37:06,489][47288] Updated weights for policy 0, policy_version 57706 (0.0030) [2024-04-26 02:37:08,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 945569792. Throughput: 0: 55957.8. Samples: 895000840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 02:37:08,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 02:37:09,787][47288] Updated weights for policy 0, policy_version 57716 (0.0030) [2024-04-26 02:37:12,407][47288] Updated weights for policy 0, policy_version 57726 (0.0031) [2024-04-26 02:37:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 945848320. Throughput: 0: 55824.7. Samples: 895159780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 02:37:13,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 02:37:15,764][47288] Updated weights for policy 0, policy_version 57736 (0.0027) [2024-04-26 02:37:18,403][47288] Updated weights for policy 0, policy_version 57746 (0.0034) [2024-04-26 02:37:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 946126848. Throughput: 0: 55926.5. Samples: 895496580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 02:37:18,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 02:37:21,661][47288] Updated weights for policy 0, policy_version 57756 (0.0029) [2024-04-26 02:37:22,368][47267] Signal inference workers to stop experience collection... (13450 times) [2024-04-26 02:37:22,392][47288] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-04-26 02:37:22,425][47267] Signal inference workers to resume experience collection... (13450 times) [2024-04-26 02:37:22,426][47288] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-04-26 02:37:23,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 946405376. Throughput: 0: 55839.9. Samples: 895829520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 02:37:23,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 02:37:24,182][47288] Updated weights for policy 0, policy_version 57766 (0.0026) [2024-04-26 02:37:27,377][47288] Updated weights for policy 0, policy_version 57776 (0.0033) [2024-04-26 02:37:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 946700288. Throughput: 0: 55944.5. Samples: 896006400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 02:37:28,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 02:37:29,956][47288] Updated weights for policy 0, policy_version 57786 (0.0029) [2024-04-26 02:37:33,172][47288] Updated weights for policy 0, policy_version 57796 (0.0033) [2024-04-26 02:37:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 946946048. Throughput: 0: 55893.0. Samples: 896340160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 02:37:33,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 02:37:35,921][47288] Updated weights for policy 0, policy_version 57806 (0.0031) [2024-04-26 02:37:38,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 947224576. Throughput: 0: 55879.2. Samples: 896673300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 02:37:38,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 02:37:39,079][47288] Updated weights for policy 0, policy_version 57816 (0.0031) [2024-04-26 02:37:41,827][47288] Updated weights for policy 0, policy_version 57826 (0.0028) [2024-04-26 02:37:43,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 947519488. Throughput: 0: 55857.8. Samples: 896837320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 02:37:43,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 02:37:44,977][47288] Updated weights for policy 0, policy_version 57836 (0.0024) [2024-04-26 02:37:47,558][47288] Updated weights for policy 0, policy_version 57846 (0.0033) [2024-04-26 02:37:48,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 947798016. Throughput: 0: 55802.3. Samples: 897172060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 02:37:48,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 02:37:50,980][47288] Updated weights for policy 0, policy_version 57856 (0.0030) [2024-04-26 02:37:53,276][47288] Updated weights for policy 0, policy_version 57866 (0.0028) [2024-04-26 02:37:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 948076544. Throughput: 0: 55594.6. Samples: 897502600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 02:37:53,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 02:37:57,000][47288] Updated weights for policy 0, policy_version 57876 (0.0029) [2024-04-26 02:37:58,923][47056] Fps is (10 sec: 57345.2, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 948371456. Throughput: 0: 55971.0. Samples: 897678480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 02:37:58,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 02:37:59,133][47288] Updated weights for policy 0, policy_version 57886 (0.0030) [2024-04-26 02:38:02,881][47288] Updated weights for policy 0, policy_version 57896 (0.0032) [2024-04-26 02:38:03,310][47267] Signal inference workers to stop experience collection... (13500 times) [2024-04-26 02:38:03,310][47267] Signal inference workers to resume experience collection... (13500 times) [2024-04-26 02:38:03,330][47288] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-04-26 02:38:03,331][47288] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-04-26 02:38:03,923][47056] Fps is (10 sec: 58983.5, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 948666368. Throughput: 0: 56006.8. Samples: 898016880. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 02:38:03,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 02:38:04,980][47288] Updated weights for policy 0, policy_version 57906 (0.0026) [2024-04-26 02:38:08,780][47288] Updated weights for policy 0, policy_version 57916 (0.0030) [2024-04-26 02:38:08,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 948912128. Throughput: 0: 56073.7. Samples: 898352840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 02:38:08,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 02:38:11,090][47288] Updated weights for policy 0, policy_version 57926 (0.0029) [2024-04-26 02:38:13,923][47056] Fps is (10 sec: 50789.6, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 949174272. Throughput: 0: 55617.6. Samples: 898509200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 02:38:13,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 02:38:14,646][47288] Updated weights for policy 0, policy_version 57936 (0.0026) [2024-04-26 02:38:16,817][47288] Updated weights for policy 0, policy_version 57946 (0.0031) [2024-04-26 02:38:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 949485568. Throughput: 0: 55748.1. Samples: 898848820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:38:18,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 02:38:20,515][47288] Updated weights for policy 0, policy_version 57956 (0.0026) [2024-04-26 02:38:22,965][47288] Updated weights for policy 0, policy_version 57966 (0.0028) [2024-04-26 02:38:23,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 949764096. Throughput: 0: 55763.4. Samples: 899182660. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:38:23,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 02:38:26,298][47288] Updated weights for policy 0, policy_version 57976 (0.0028) [2024-04-26 02:38:28,795][47288] Updated weights for policy 0, policy_version 57986 (0.0029) [2024-04-26 02:38:28,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.4, 300 sec: 55983.3). Total num frames: 950042624. Throughput: 0: 55798.6. Samples: 899348260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:38:28,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 02:38:31,988][47288] Updated weights for policy 0, policy_version 57996 (0.0029) [2024-04-26 02:38:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 56038.8). Total num frames: 950337536. Throughput: 0: 55839.1. Samples: 899684820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:38:33,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 02:38:34,506][47288] Updated weights for policy 0, policy_version 58006 (0.0030) [2024-04-26 02:38:37,977][47288] Updated weights for policy 0, policy_version 58016 (0.0033) [2024-04-26 02:38:38,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 950616064. Throughput: 0: 55974.9. Samples: 900021460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:38:38,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 02:38:40,222][47288] Updated weights for policy 0, policy_version 58026 (0.0025) [2024-04-26 02:38:43,916][47288] Updated weights for policy 0, policy_version 58036 (0.0026) [2024-04-26 02:38:43,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 950861824. Throughput: 0: 55880.8. Samples: 900193120. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:38:43,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 02:38:45,972][47288] Updated weights for policy 0, policy_version 58046 (0.0034) [2024-04-26 02:38:48,923][47056] Fps is (10 sec: 50789.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 951123968. Throughput: 0: 55759.8. Samples: 900526080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 02:38:48,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 02:38:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000058052_951123968.pth... [2024-04-26 02:38:48,995][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000057236_937754624.pth [2024-04-26 02:38:49,638][47288] Updated weights for policy 0, policy_version 58056 (0.0026) [2024-04-26 02:38:52,121][47288] Updated weights for policy 0, policy_version 58066 (0.0036) [2024-04-26 02:38:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 951418880. Throughput: 0: 55657.9. Samples: 900857440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:38:53,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 02:38:55,472][47288] Updated weights for policy 0, policy_version 58076 (0.0027) [2024-04-26 02:38:57,107][47267] Signal inference workers to stop experience collection... (13550 times) [2024-04-26 02:38:57,140][47288] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-04-26 02:38:57,164][47267] Signal inference workers to resume experience collection... (13550 times) [2024-04-26 02:38:57,169][47288] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-04-26 02:38:57,928][47288] Updated weights for policy 0, policy_version 58086 (0.0027) [2024-04-26 02:38:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 951697408. Throughput: 0: 55883.9. Samples: 901023980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:38:58,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 02:39:01,458][47288] Updated weights for policy 0, policy_version 58096 (0.0033) [2024-04-26 02:39:03,692][47288] Updated weights for policy 0, policy_version 58106 (0.0028) [2024-04-26 02:39:03,923][47056] Fps is (10 sec: 60620.2, 60 sec: 55978.5, 300 sec: 55983.3). Total num frames: 952025088. Throughput: 0: 55636.8. Samples: 901352480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:39:03,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 02:39:07,308][47288] Updated weights for policy 0, policy_version 58116 (0.0029) [2024-04-26 02:39:08,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 952270848. Throughput: 0: 55688.6. Samples: 901688640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:39:08,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 02:39:09,533][47288] Updated weights for policy 0, policy_version 58126 (0.0032) [2024-04-26 02:39:13,014][47288] Updated weights for policy 0, policy_version 58136 (0.0031) [2024-04-26 02:39:13,923][47056] Fps is (10 sec: 54068.3, 60 sec: 56525.0, 300 sec: 55927.8). Total num frames: 952565760. Throughput: 0: 56077.2. Samples: 901871720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:39:13,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 02:39:15,447][47288] Updated weights for policy 0, policy_version 58146 (0.0032) [2024-04-26 02:39:18,853][47288] Updated weights for policy 0, policy_version 58156 (0.0032) [2024-04-26 02:39:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 952827904. Throughput: 0: 56055.7. Samples: 902207320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:39:18,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 02:39:21,231][47288] Updated weights for policy 0, policy_version 58166 (0.0030) [2024-04-26 02:39:23,923][47056] Fps is (10 sec: 50790.3, 60 sec: 55159.7, 300 sec: 55650.1). Total num frames: 953073664. Throughput: 0: 55866.3. Samples: 902535440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 02:39:23,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 02:39:24,894][47288] Updated weights for policy 0, policy_version 58176 (0.0033) [2024-04-26 02:39:27,027][47288] Updated weights for policy 0, policy_version 58186 (0.0026) [2024-04-26 02:39:28,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 953368576. Throughput: 0: 55567.6. Samples: 902693660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 02:39:28,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 02:39:30,725][47288] Updated weights for policy 0, policy_version 58196 (0.0025) [2024-04-26 02:39:32,879][47288] Updated weights for policy 0, policy_version 58206 (0.0028) [2024-04-26 02:39:33,923][47056] Fps is (10 sec: 58980.7, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 953663488. Throughput: 0: 55560.3. Samples: 903026300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 02:39:33,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 02:39:36,464][47288] Updated weights for policy 0, policy_version 58216 (0.0031) [2024-04-26 02:39:38,858][47288] Updated weights for policy 0, policy_version 58226 (0.0024) [2024-04-26 02:39:38,923][47056] Fps is (10 sec: 60620.4, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 953974784. Throughput: 0: 55746.6. Samples: 903366040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 02:39:38,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 02:39:42,058][47267] Signal inference workers to stop experience collection... 
(13600 times) [2024-04-26 02:39:42,090][47288] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-04-26 02:39:42,147][47267] Signal inference workers to resume experience collection... (13600 times) [2024-04-26 02:39:42,147][47288] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-04-26 02:39:42,251][47288] Updated weights for policy 0, policy_version 58236 (0.0027) [2024-04-26 02:39:43,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 954236928. Throughput: 0: 56141.5. Samples: 903550340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 02:39:43,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 02:39:44,629][47288] Updated weights for policy 0, policy_version 58246 (0.0033) [2024-04-26 02:39:48,130][47288] Updated weights for policy 0, policy_version 58256 (0.0034) [2024-04-26 02:39:48,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 954515456. Throughput: 0: 56208.0. Samples: 903881840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 02:39:48,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 02:39:50,804][47288] Updated weights for policy 0, policy_version 58266 (0.0027) [2024-04-26 02:39:53,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 954777600. Throughput: 0: 56182.8. Samples: 904216860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 02:39:53,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 02:39:54,035][47288] Updated weights for policy 0, policy_version 58276 (0.0030) [2024-04-26 02:39:56,681][47288] Updated weights for policy 0, policy_version 58286 (0.0032) [2024-04-26 02:39:58,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 955039744. Throughput: 0: 55562.0. Samples: 904372020. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 02:39:58,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 02:39:59,812][47288] Updated weights for policy 0, policy_version 58296 (0.0024) [2024-04-26 02:40:02,600][47288] Updated weights for policy 0, policy_version 58306 (0.0035) [2024-04-26 02:40:03,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 955334656. Throughput: 0: 55593.3. Samples: 904709020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-26 02:40:03,924][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 02:40:05,666][47288] Updated weights for policy 0, policy_version 58316 (0.0034) [2024-04-26 02:40:08,371][47288] Updated weights for policy 0, policy_version 58326 (0.0031) [2024-04-26 02:40:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 955613184. Throughput: 0: 55585.1. Samples: 905036780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-26 02:40:08,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 02:40:11,550][47288] Updated weights for policy 0, policy_version 58336 (0.0032) [2024-04-26 02:40:13,923][47056] Fps is (10 sec: 58983.6, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 955924480. Throughput: 0: 55973.9. Samples: 905212480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-26 02:40:13,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:40:14,037][47288] Updated weights for policy 0, policy_version 58346 (0.0029) [2024-04-26 02:40:17,437][47288] Updated weights for policy 0, policy_version 58356 (0.0027) [2024-04-26 02:40:18,923][47056] Fps is (10 sec: 58983.5, 60 sec: 56251.9, 300 sec: 55983.3). Total num frames: 956203008. Throughput: 0: 55948.8. Samples: 905543980. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-26 02:40:18,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 02:40:19,910][47288] Updated weights for policy 0, policy_version 58366 (0.0027) [2024-04-26 02:40:23,143][47288] Updated weights for policy 0, policy_version 58376 (0.0025) [2024-04-26 02:40:23,923][47056] Fps is (10 sec: 52428.0, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 956448768. Throughput: 0: 55937.4. Samples: 905883220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-26 02:40:23,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 02:40:25,967][47288] Updated weights for policy 0, policy_version 58386 (0.0029) [2024-04-26 02:40:28,454][47267] Signal inference workers to stop experience collection... (13650 times) [2024-04-26 02:40:28,484][47288] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-04-26 02:40:28,511][47267] Signal inference workers to resume experience collection... (13650 times) [2024-04-26 02:40:28,511][47288] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-04-26 02:40:28,905][47288] Updated weights for policy 0, policy_version 58396 (0.0027) [2024-04-26 02:40:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 956760064. Throughput: 0: 55539.6. Samples: 906049620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-26 02:40:28,924][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 02:40:32,039][47288] Updated weights for policy 0, policy_version 58406 (0.0029) [2024-04-26 02:40:33,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 956989440. Throughput: 0: 55747.9. Samples: 906390500. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-26 02:40:33,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 02:40:34,821][47288] Updated weights for policy 0, policy_version 58416 (0.0031) [2024-04-26 02:40:38,065][47288] Updated weights for policy 0, policy_version 58426 (0.0032) [2024-04-26 02:40:38,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 957284352. Throughput: 0: 55750.1. Samples: 906725620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 02:40:38,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 02:40:40,730][47288] Updated weights for policy 0, policy_version 58436 (0.0031) [2024-04-26 02:40:43,858][47288] Updated weights for policy 0, policy_version 58446 (0.0035) [2024-04-26 02:40:43,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 957579264. Throughput: 0: 55888.1. Samples: 906886980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 02:40:43,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 02:40:46,705][47288] Updated weights for policy 0, policy_version 58456 (0.0039) [2024-04-26 02:40:48,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 957874176. Throughput: 0: 55655.5. Samples: 907213520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 02:40:48,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 02:40:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000058464_957874176.pth... [2024-04-26 02:40:48,976][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000057646_944472064.pth [2024-04-26 02:40:49,746][47288] Updated weights for policy 0, policy_version 58466 (0.0027) [2024-04-26 02:40:52,611][47288] Updated weights for policy 0, policy_version 58476 (0.0037) [2024-04-26 02:40:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 958152704. 
Throughput: 0: 55873.0. Samples: 907551060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 02:40:53,923][47056] Avg episode reward: [(0, '0.301')] [2024-04-26 02:40:55,692][47288] Updated weights for policy 0, policy_version 58486 (0.0030) [2024-04-26 02:40:58,513][47288] Updated weights for policy 0, policy_version 58496 (0.0034) [2024-04-26 02:40:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 958431232. Throughput: 0: 55927.3. Samples: 907729220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 02:40:58,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 02:41:01,517][47288] Updated weights for policy 0, policy_version 58506 (0.0041) [2024-04-26 02:41:03,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 958693376. Throughput: 0: 56007.8. Samples: 908064340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 02:41:03,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 02:41:04,441][47288] Updated weights for policy 0, policy_version 58516 (0.0029) [2024-04-26 02:41:07,261][47288] Updated weights for policy 0, policy_version 58526 (0.0029) [2024-04-26 02:41:08,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 958955520. Throughput: 0: 55908.9. Samples: 908399120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 02:41:08,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 02:41:10,192][47288] Updated weights for policy 0, policy_version 58536 (0.0032) [2024-04-26 02:41:13,534][47288] Updated weights for policy 0, policy_version 58546 (0.0029) [2024-04-26 02:41:13,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 959250432. Throughput: 0: 55916.9. Samples: 908565880. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-26 02:41:13,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 02:41:15,998][47288] Updated weights for policy 0, policy_version 58556 (0.0024) [2024-04-26 02:41:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55432.3, 300 sec: 55872.2). Total num frames: 959528960. Throughput: 0: 55763.5. Samples: 908899860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:41:18,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 02:41:19,230][47288] Updated weights for policy 0, policy_version 58566 (0.0028) [2024-04-26 02:41:21,816][47288] Updated weights for policy 0, policy_version 58576 (0.0030) [2024-04-26 02:41:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 959807488. Throughput: 0: 55804.0. Samples: 909236800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:41:23,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 02:41:25,149][47288] Updated weights for policy 0, policy_version 58586 (0.0029) [2024-04-26 02:41:27,507][47288] Updated weights for policy 0, policy_version 58596 (0.0031) [2024-04-26 02:41:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 960102400. Throughput: 0: 56049.7. Samples: 909409220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:41:28,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 02:41:31,070][47288] Updated weights for policy 0, policy_version 58606 (0.0028) [2024-04-26 02:41:32,148][47267] Signal inference workers to stop experience collection... (13700 times) [2024-04-26 02:41:32,148][47267] Signal inference workers to resume experience collection... 
(13700 times) [2024-04-26 02:41:32,173][47288] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-04-26 02:41:32,173][47288] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-04-26 02:41:33,572][47288] Updated weights for policy 0, policy_version 58616 (0.0031) [2024-04-26 02:41:33,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.9, 300 sec: 55983.3). Total num frames: 960397312. Throughput: 0: 56177.4. Samples: 909741500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:41:33,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 02:41:36,769][47288] Updated weights for policy 0, policy_version 58626 (0.0030) [2024-04-26 02:41:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 960659456. Throughput: 0: 56165.3. Samples: 910078500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:41:38,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 02:41:39,434][47288] Updated weights for policy 0, policy_version 58636 (0.0030) [2024-04-26 02:41:42,580][47288] Updated weights for policy 0, policy_version 58646 (0.0028) [2024-04-26 02:41:43,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 960921600. Throughput: 0: 55952.2. Samples: 910247060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:41:43,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 02:41:45,473][47288] Updated weights for policy 0, policy_version 58656 (0.0032) [2024-04-26 02:41:48,524][47288] Updated weights for policy 0, policy_version 58666 (0.0028) [2024-04-26 02:41:48,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 961200128. Throughput: 0: 55847.1. Samples: 910577460. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 02:41:48,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 02:41:51,160][47288] Updated weights for policy 0, policy_version 58676 (0.0027) [2024-04-26 02:41:53,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 961462272. Throughput: 0: 56008.0. Samples: 910919480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 02:41:53,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 02:41:54,301][47288] Updated weights for policy 0, policy_version 58686 (0.0031) [2024-04-26 02:41:57,009][47288] Updated weights for policy 0, policy_version 58696 (0.0027) [2024-04-26 02:41:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 961757184. Throughput: 0: 55845.8. Samples: 911078940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 02:41:58,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 02:42:00,201][47288] Updated weights for policy 0, policy_version 58706 (0.0030) [2024-04-26 02:42:02,919][47288] Updated weights for policy 0, policy_version 58716 (0.0029) [2024-04-26 02:42:03,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 962052096. Throughput: 0: 55869.0. Samples: 911413960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 02:42:03,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:42:06,024][47288] Updated weights for policy 0, policy_version 58726 (0.0030) [2024-04-26 02:42:08,779][47288] Updated weights for policy 0, policy_version 58736 (0.0035) [2024-04-26 02:42:08,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 962347008. Throughput: 0: 55887.0. Samples: 911751720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 02:42:08,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 02:42:12,101][47288] Updated weights for policy 0, policy_version 58746 (0.0035) [2024-04-26 02:42:13,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 962625536. Throughput: 0: 55926.8. Samples: 911925920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 02:42:13,923][47056] Avg episode reward: [(0, '0.323')] [2024-04-26 02:42:14,475][47288] Updated weights for policy 0, policy_version 58756 (0.0026) [2024-04-26 02:42:17,818][47288] Updated weights for policy 0, policy_version 58766 (0.0034) [2024-04-26 02:42:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 962887680. Throughput: 0: 55879.5. Samples: 912256080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 02:42:18,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 02:42:20,360][47288] Updated weights for policy 0, policy_version 58776 (0.0026) [2024-04-26 02:42:23,734][47288] Updated weights for policy 0, policy_version 58786 (0.0032) [2024-04-26 02:42:23,923][47056] Fps is (10 sec: 52427.6, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 963149824. Throughput: 0: 55844.2. Samples: 912591500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 02:42:23,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 02:42:26,291][47288] Updated weights for policy 0, policy_version 58796 (0.0033) [2024-04-26 02:42:28,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 963428352. Throughput: 0: 55556.3. Samples: 912747100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 02:42:28,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 02:42:29,573][47288] Updated weights for policy 0, policy_version 58806 (0.0029) [2024-04-26 02:42:30,709][47267] Signal inference workers to stop experience collection... 
(13750 times) [2024-04-26 02:42:30,710][47267] Signal inference workers to resume experience collection... (13750 times) [2024-04-26 02:42:30,723][47288] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-04-26 02:42:30,723][47288] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-04-26 02:42:32,212][47288] Updated weights for policy 0, policy_version 58816 (0.0033) [2024-04-26 02:42:33,923][47056] Fps is (10 sec: 54068.4, 60 sec: 54886.5, 300 sec: 55816.7). Total num frames: 963690496. Throughput: 0: 55579.2. Samples: 913078520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 02:42:33,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 02:42:33,945][47267] Saving new best policy, reward=0.527! [2024-04-26 02:42:35,446][47288] Updated weights for policy 0, policy_version 58826 (0.0027) [2024-04-26 02:42:37,962][47288] Updated weights for policy 0, policy_version 58836 (0.0026) [2024-04-26 02:42:38,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 963985408. Throughput: 0: 55523.8. Samples: 913418060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 02:42:38,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 02:42:41,259][47288] Updated weights for policy 0, policy_version 58846 (0.0030) [2024-04-26 02:42:43,701][47288] Updated weights for policy 0, policy_version 58856 (0.0026) [2024-04-26 02:42:43,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 964296704. Throughput: 0: 55823.1. Samples: 913590980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 02:42:43,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 02:42:47,290][47288] Updated weights for policy 0, policy_version 58866 (0.0031) [2024-04-26 02:42:48,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 964558848. Throughput: 0: 55932.1. Samples: 913930900. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 02:42:48,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 02:42:49,036][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000058873_964575232.pth... [2024-04-26 02:42:49,092][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000058052_951123968.pth [2024-04-26 02:42:49,563][47288] Updated weights for policy 0, policy_version 58876 (0.0026) [2024-04-26 02:42:53,037][47288] Updated weights for policy 0, policy_version 58886 (0.0035) [2024-04-26 02:42:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 964853760. Throughput: 0: 55873.4. Samples: 914266020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 02:42:53,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 02:42:55,454][47288] Updated weights for policy 0, policy_version 58896 (0.0025) [2024-04-26 02:42:58,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 965099520. Throughput: 0: 55747.9. Samples: 914434580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 02:42:58,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 02:42:58,983][47288] Updated weights for policy 0, policy_version 58906 (0.0026) [2024-04-26 02:43:01,216][47288] Updated weights for policy 0, policy_version 58916 (0.0032) [2024-04-26 02:43:03,923][47056] Fps is (10 sec: 50790.4, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 965361664. Throughput: 0: 55881.4. Samples: 914770740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 02:43:03,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 02:43:04,990][47288] Updated weights for policy 0, policy_version 58926 (0.0032) [2024-04-26 02:43:07,045][47288] Updated weights for policy 0, policy_version 58936 (0.0030) [2024-04-26 02:43:08,924][47056] Fps is (10 sec: 55698.7, 60 sec: 55158.3, 300 sec: 55872.0). Total num frames: 965656576. 
Throughput: 0: 55749.3. Samples: 915100280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:43:08,925][47056] Avg episode reward: [(0, '0.332')] [2024-04-26 02:43:10,727][47288] Updated weights for policy 0, policy_version 58946 (0.0027) [2024-04-26 02:43:13,180][47288] Updated weights for policy 0, policy_version 58956 (0.0028) [2024-04-26 02:43:13,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 965951488. Throughput: 0: 55952.5. Samples: 915264960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:43:13,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 02:43:16,498][47288] Updated weights for policy 0, policy_version 58966 (0.0032) [2024-04-26 02:43:17,027][47267] Signal inference workers to stop experience collection... (13800 times) [2024-04-26 02:43:17,029][47267] Signal inference workers to resume experience collection... (13800 times) [2024-04-26 02:43:17,052][47288] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-04-26 02:43:17,052][47288] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-04-26 02:43:18,923][47056] Fps is (10 sec: 58990.1, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 966246400. Throughput: 0: 55870.2. Samples: 915592680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:43:18,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 02:43:18,985][47288] Updated weights for policy 0, policy_version 58976 (0.0029) [2024-04-26 02:43:22,352][47288] Updated weights for policy 0, policy_version 58986 (0.0030) [2024-04-26 02:43:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 966524928. Throughput: 0: 55740.6. Samples: 915926380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:43:23,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 02:43:24,740][47288] Updated weights for policy 0, policy_version 58996 (0.0032) [2024-04-26 02:43:28,299][47288] Updated weights for policy 0, policy_version 59006 (0.0032) [2024-04-26 02:43:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 966803456. Throughput: 0: 55862.7. Samples: 916104800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:43:28,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 02:43:30,500][47288] Updated weights for policy 0, policy_version 59016 (0.0030) [2024-04-26 02:43:33,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 967049216. Throughput: 0: 55747.1. Samples: 916439520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:43:33,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 02:43:34,254][47288] Updated weights for policy 0, policy_version 59026 (0.0035) [2024-04-26 02:43:36,478][47288] Updated weights for policy 0, policy_version 59036 (0.0027) [2024-04-26 02:43:38,923][47056] Fps is (10 sec: 49150.7, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 967294976. Throughput: 0: 55742.0. Samples: 916774420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:43:38,924][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 02:43:40,065][47288] Updated weights for policy 0, policy_version 59046 (0.0029) [2024-04-26 02:43:42,434][47288] Updated weights for policy 0, policy_version 59056 (0.0028) [2024-04-26 02:43:43,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.4, 300 sec: 55927.7). Total num frames: 967622656. Throughput: 0: 55499.9. Samples: 916932080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:43:43,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 02:43:45,938][47288] Updated weights for policy 0, policy_version 59066 (0.0032) [2024-04-26 02:43:48,201][47288] Updated weights for policy 0, policy_version 59076 (0.0026) [2024-04-26 02:43:48,923][47056] Fps is (10 sec: 60622.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 967901184. Throughput: 0: 55443.6. Samples: 917265700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:43:48,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 02:43:51,813][47288] Updated weights for policy 0, policy_version 59086 (0.0030) [2024-04-26 02:43:53,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 968212480. Throughput: 0: 55642.5. Samples: 917604120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:43:53,923][47056] Avg episode reward: [(0, '0.332')] [2024-04-26 02:43:54,128][47288] Updated weights for policy 0, policy_version 59096 (0.0033) [2024-04-26 02:43:57,745][47288] Updated weights for policy 0, policy_version 59106 (0.0026) [2024-04-26 02:43:58,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 968474624. Throughput: 0: 55847.0. Samples: 917778080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:43:58,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 02:44:00,145][47288] Updated weights for policy 0, policy_version 59116 (0.0031) [2024-04-26 02:44:03,555][47288] Updated weights for policy 0, policy_version 59126 (0.0029) [2024-04-26 02:44:03,923][47056] Fps is (10 sec: 52428.7, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 968736768. Throughput: 0: 55911.6. Samples: 918108700. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:44:03,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 02:44:06,459][47288] Updated weights for policy 0, policy_version 59136 (0.0029) [2024-04-26 02:44:08,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55706.8, 300 sec: 55705.6). Total num frames: 968998912. Throughput: 0: 55836.0. Samples: 918439000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:44:08,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 02:44:09,421][47288] Updated weights for policy 0, policy_version 59146 (0.0028) [2024-04-26 02:44:09,941][47267] Signal inference workers to stop experience collection... (13850 times) [2024-04-26 02:44:09,941][47267] Signal inference workers to resume experience collection... (13850 times) [2024-04-26 02:44:09,965][47288] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-04-26 02:44:09,965][47288] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-04-26 02:44:12,454][47288] Updated weights for policy 0, policy_version 59156 (0.0032) [2024-04-26 02:44:13,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 969277440. Throughput: 0: 55510.3. Samples: 918602760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:44:13,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 02:44:15,406][47288] Updated weights for policy 0, policy_version 59166 (0.0028) [2024-04-26 02:44:18,118][47288] Updated weights for policy 0, policy_version 59176 (0.0030) [2024-04-26 02:44:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55927.7). Total num frames: 969572352. Throughput: 0: 55525.3. Samples: 918938160. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 02:44:18,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:44:21,278][47288] Updated weights for policy 0, policy_version 59186 (0.0031) [2024-04-26 02:44:23,874][47288] Updated weights for policy 0, policy_version 59196 (0.0039) [2024-04-26 02:44:23,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 969867264. Throughput: 0: 55491.0. Samples: 919271500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 02:44:23,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 02:44:27,148][47288] Updated weights for policy 0, policy_version 59206 (0.0026) [2024-04-26 02:44:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 970129408. Throughput: 0: 55897.6. Samples: 919447460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 02:44:28,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 02:44:29,893][47288] Updated weights for policy 0, policy_version 59216 (0.0029) [2024-04-26 02:44:32,924][47288] Updated weights for policy 0, policy_version 59226 (0.0025) [2024-04-26 02:44:33,923][47056] Fps is (10 sec: 55704.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 970424320. Throughput: 0: 56001.1. Samples: 919785760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 02:44:33,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 02:44:35,897][47288] Updated weights for policy 0, policy_version 59236 (0.0029) [2024-04-26 02:44:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 970670080. Throughput: 0: 55807.1. Samples: 920115440. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 02:44:38,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 02:44:38,967][47288] Updated weights for policy 0, policy_version 59246 (0.0033) [2024-04-26 02:44:41,715][47288] Updated weights for policy 0, policy_version 59256 (0.0028) [2024-04-26 02:44:43,923][47056] Fps is (10 sec: 52429.8, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 970948608. Throughput: 0: 55524.6. Samples: 920276680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 02:44:43,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 02:44:45,006][47288] Updated weights for policy 0, policy_version 59266 (0.0031) [2024-04-26 02:44:47,650][47288] Updated weights for policy 0, policy_version 59276 (0.0028) [2024-04-26 02:44:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 971227136. Throughput: 0: 55485.3. Samples: 920605540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 02:44:48,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 02:44:49,099][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000059281_971259904.pth... [2024-04-26 02:44:49,151][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000058464_957874176.pth [2024-04-26 02:44:50,763][47288] Updated weights for policy 0, policy_version 59286 (0.0026) [2024-04-26 02:44:53,603][47288] Updated weights for policy 0, policy_version 59296 (0.0028) [2024-04-26 02:44:53,923][47056] Fps is (10 sec: 55705.6, 60 sec: 54886.4, 300 sec: 55816.7). Total num frames: 971505664. Throughput: 0: 55636.5. Samples: 920942640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 02:44:53,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 02:44:54,697][47267] Signal inference workers to stop experience collection... (13900 times) [2024-04-26 02:44:54,703][47267] Signal inference workers to resume experience collection... 
(13900 times) [2024-04-26 02:44:54,717][47288] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-04-26 02:44:54,717][47288] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-04-26 02:44:56,583][47288] Updated weights for policy 0, policy_version 59306 (0.0034) [2024-04-26 02:44:58,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 971800576. Throughput: 0: 55821.2. Samples: 921114720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:44:58,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 02:44:59,499][47288] Updated weights for policy 0, policy_version 59316 (0.0027) [2024-04-26 02:45:02,490][47288] Updated weights for policy 0, policy_version 59326 (0.0032) [2024-04-26 02:45:03,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 972095488. Throughput: 0: 55796.5. Samples: 921449000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:45:03,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 02:45:05,195][47288] Updated weights for policy 0, policy_version 59336 (0.0031) [2024-04-26 02:45:08,432][47288] Updated weights for policy 0, policy_version 59346 (0.0029) [2024-04-26 02:45:08,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 972374016. Throughput: 0: 55784.7. Samples: 921781820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:45:08,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 02:45:10,964][47288] Updated weights for policy 0, policy_version 59356 (0.0027) [2024-04-26 02:45:13,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 972619776. Throughput: 0: 55566.0. Samples: 921947940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:45:13,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 02:45:14,387][47288] Updated weights for policy 0, policy_version 59366 (0.0029) [2024-04-26 02:45:16,944][47288] Updated weights for policy 0, policy_version 59376 (0.0030) [2024-04-26 02:45:18,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 972898304. Throughput: 0: 55522.4. Samples: 922284260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:45:18,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 02:45:20,132][47288] Updated weights for policy 0, policy_version 59386 (0.0025) [2024-04-26 02:45:22,996][47288] Updated weights for policy 0, policy_version 59396 (0.0030) [2024-04-26 02:45:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 973193216. Throughput: 0: 55651.9. Samples: 922619780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:45:23,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 02:45:25,913][47288] Updated weights for policy 0, policy_version 59406 (0.0028) [2024-04-26 02:45:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 973455360. Throughput: 0: 55495.1. Samples: 922773960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 02:45:28,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 02:45:29,024][47288] Updated weights for policy 0, policy_version 59416 (0.0030) [2024-04-26 02:45:31,720][47288] Updated weights for policy 0, policy_version 59426 (0.0036) [2024-04-26 02:45:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 973733888. Throughput: 0: 55684.8. Samples: 923111360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 02:45:33,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 02:45:34,969][47288] Updated weights for policy 0, policy_version 59436 (0.0031) [2024-04-26 02:45:37,615][47288] Updated weights for policy 0, policy_version 59446 (0.0029) [2024-04-26 02:45:38,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 974028800. Throughput: 0: 55741.9. Samples: 923451020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 02:45:38,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 02:45:40,749][47288] Updated weights for policy 0, policy_version 59456 (0.0027) [2024-04-26 02:45:43,216][47267] Signal inference workers to stop experience collection... (13950 times) [2024-04-26 02:45:43,216][47267] Signal inference workers to resume experience collection... (13950 times) [2024-04-26 02:45:43,231][47288] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-04-26 02:45:43,231][47288] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-04-26 02:45:43,333][47288] Updated weights for policy 0, policy_version 59466 (0.0029) [2024-04-26 02:45:43,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 974323712. Throughput: 0: 55747.4. Samples: 923623360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 02:45:43,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 02:45:46,454][47288] Updated weights for policy 0, policy_version 59476 (0.0029) [2024-04-26 02:45:48,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 974569472. Throughput: 0: 55833.4. Samples: 923961500. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 02:45:48,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 02:45:49,260][47288] Updated weights for policy 0, policy_version 59486 (0.0025) [2024-04-26 02:45:52,166][47288] Updated weights for policy 0, policy_version 59496 (0.0031) [2024-04-26 02:45:53,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 974864384. Throughput: 0: 55901.1. Samples: 924297360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 02:45:53,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 02:45:55,192][47288] Updated weights for policy 0, policy_version 59506 (0.0025) [2024-04-26 02:45:58,075][47288] Updated weights for policy 0, policy_version 59516 (0.0030) [2024-04-26 02:45:58,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 975159296. Throughput: 0: 55717.5. Samples: 924455220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 02:45:58,923][47056] Avg episode reward: [(0, '0.306')] [2024-04-26 02:46:00,941][47288] Updated weights for policy 0, policy_version 59526 (0.0039) [2024-04-26 02:46:03,844][47288] Updated weights for policy 0, policy_version 59536 (0.0028) [2024-04-26 02:46:03,923][47056] Fps is (10 sec: 57342.5, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 975437824. Throughput: 0: 55809.1. Samples: 924795680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 02:46:03,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 02:46:06,643][47288] Updated weights for policy 0, policy_version 59546 (0.0036) [2024-04-26 02:46:08,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 975683584. Throughput: 0: 55990.2. Samples: 925139340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 02:46:08,924][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 02:46:09,741][47288] Updated weights for policy 0, policy_version 59556 (0.0023) [2024-04-26 02:46:12,609][47288] Updated weights for policy 0, policy_version 59566 (0.0034) [2024-04-26 02:46:13,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 975994880. Throughput: 0: 56212.0. Samples: 925303500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 02:46:13,923][47056] Avg episode reward: [(0, '0.341')] [2024-04-26 02:46:15,766][47288] Updated weights for policy 0, policy_version 59576 (0.0032) [2024-04-26 02:46:18,593][47288] Updated weights for policy 0, policy_version 59586 (0.0029) [2024-04-26 02:46:18,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 976273408. Throughput: 0: 56152.0. Samples: 925638200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 02:46:18,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 02:46:21,563][47288] Updated weights for policy 0, policy_version 59596 (0.0033) [2024-04-26 02:46:23,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 976551936. Throughput: 0: 56030.8. Samples: 925972420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 02:46:23,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 02:46:24,351][47288] Updated weights for policy 0, policy_version 59606 (0.0033) [2024-04-26 02:46:27,345][47288] Updated weights for policy 0, policy_version 59616 (0.0029) [2024-04-26 02:46:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 976830464. Throughput: 0: 55973.5. Samples: 926142160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 02:46:28,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 02:46:30,098][47288] Updated weights for policy 0, policy_version 59626 (0.0027) [2024-04-26 02:46:33,169][47288] Updated weights for policy 0, policy_version 59636 (0.0027) [2024-04-26 02:46:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 977108992. Throughput: 0: 55866.5. Samples: 926475500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 02:46:33,924][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 02:46:36,251][47288] Updated weights for policy 0, policy_version 59646 (0.0027) [2024-04-26 02:46:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 977387520. Throughput: 0: 55864.9. Samples: 926811280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 02:46:38,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 02:46:39,029][47288] Updated weights for policy 0, policy_version 59656 (0.0027) [2024-04-26 02:46:40,288][47267] Signal inference workers to stop experience collection... (14000 times) [2024-04-26 02:46:40,288][47267] Signal inference workers to resume experience collection... (14000 times) [2024-04-26 02:46:40,300][47288] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-04-26 02:46:40,301][47288] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-04-26 02:46:42,091][47288] Updated weights for policy 0, policy_version 59666 (0.0042) [2024-04-26 02:46:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 977666048. Throughput: 0: 56152.5. Samples: 926982080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 02:46:43,923][47056] Avg episode reward: [(0, '0.347')] [2024-04-26 02:46:44,872][47288] Updated weights for policy 0, policy_version 59676 (0.0037) [2024-04-26 02:46:47,896][47288] Updated weights for policy 0, policy_version 59686 (0.0031) [2024-04-26 02:46:48,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 977928192. Throughput: 0: 55850.4. Samples: 927308940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 02:46:48,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 02:46:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000059688_977928192.pth... [2024-04-26 02:46:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000058873_964575232.pth [2024-04-26 02:46:50,894][47288] Updated weights for policy 0, policy_version 59696 (0.0029) [2024-04-26 02:46:53,912][47288] Updated weights for policy 0, policy_version 59706 (0.0028) [2024-04-26 02:46:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 978223104. Throughput: 0: 55637.8. Samples: 927643040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 02:46:53,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 02:46:56,690][47288] Updated weights for policy 0, policy_version 59716 (0.0034) [2024-04-26 02:46:58,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 978501632. Throughput: 0: 55837.8. Samples: 927816200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 02:46:58,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 02:46:59,897][47288] Updated weights for policy 0, policy_version 59726 (0.0026) [2024-04-26 02:47:02,676][47288] Updated weights for policy 0, policy_version 59736 (0.0032) [2024-04-26 02:47:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 978796544. 
Throughput: 0: 55791.1. Samples: 928148800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 02:47:03,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 02:47:05,640][47288] Updated weights for policy 0, policy_version 59746 (0.0033) [2024-04-26 02:47:08,600][47288] Updated weights for policy 0, policy_version 59756 (0.0028) [2024-04-26 02:47:08,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 979058688. Throughput: 0: 55759.3. Samples: 928481580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 02:47:08,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 02:47:11,370][47288] Updated weights for policy 0, policy_version 59766 (0.0041) [2024-04-26 02:47:13,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 979320832. Throughput: 0: 55604.7. Samples: 928644380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 02:47:13,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 02:47:14,334][47288] Updated weights for policy 0, policy_version 59776 (0.0027) [2024-04-26 02:47:17,485][47288] Updated weights for policy 0, policy_version 59786 (0.0031) [2024-04-26 02:47:18,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 979615744. Throughput: 0: 55715.1. Samples: 928982680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 02:47:18,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 02:47:20,153][47288] Updated weights for policy 0, policy_version 59796 (0.0031) [2024-04-26 02:47:23,593][47288] Updated weights for policy 0, policy_version 59806 (0.0027) [2024-04-26 02:47:23,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 979877888. Throughput: 0: 55636.8. Samples: 929314940. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:47:23,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 02:47:26,085][47288] Updated weights for policy 0, policy_version 59816 (0.0026) [2024-04-26 02:47:28,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 980156416. Throughput: 0: 55394.6. Samples: 929474840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:47:28,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 02:47:29,673][47288] Updated weights for policy 0, policy_version 59826 (0.0032) [2024-04-26 02:47:32,076][47288] Updated weights for policy 0, policy_version 59836 (0.0028) [2024-04-26 02:47:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 980451328. Throughput: 0: 55533.4. Samples: 929807940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:47:33,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 02:47:35,473][47288] Updated weights for policy 0, policy_version 59846 (0.0034) [2024-04-26 02:47:37,850][47288] Updated weights for policy 0, policy_version 59856 (0.0028) [2024-04-26 02:47:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 980729856. Throughput: 0: 55598.0. Samples: 930144940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:47:38,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 02:47:41,161][47288] Updated weights for policy 0, policy_version 59866 (0.0029) [2024-04-26 02:47:41,703][47267] Signal inference workers to stop experience collection... (14050 times) [2024-04-26 02:47:41,703][47267] Signal inference workers to resume experience collection... 
(14050 times) [2024-04-26 02:47:41,715][47288] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-04-26 02:47:41,734][47288] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-04-26 02:47:43,783][47288] Updated weights for policy 0, policy_version 59876 (0.0032) [2024-04-26 02:47:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 981008384. Throughput: 0: 55712.4. Samples: 930323260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:47:43,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 02:47:47,082][47288] Updated weights for policy 0, policy_version 59886 (0.0028) [2024-04-26 02:47:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 981270528. Throughput: 0: 55668.6. Samples: 930653880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:47:48,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 02:47:49,667][47288] Updated weights for policy 0, policy_version 59896 (0.0031) [2024-04-26 02:47:52,980][47288] Updated weights for policy 0, policy_version 59906 (0.0035) [2024-04-26 02:47:53,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 981565440. Throughput: 0: 55635.4. Samples: 930985180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 02:47:53,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 02:47:55,387][47288] Updated weights for policy 0, policy_version 59916 (0.0029) [2024-04-26 02:47:58,787][47288] Updated weights for policy 0, policy_version 59926 (0.0032) [2024-04-26 02:47:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 981827584. Throughput: 0: 55683.3. Samples: 931150120. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:47:58,923][47056] Avg episode reward: [(0, '0.354')] [2024-04-26 02:48:01,179][47288] Updated weights for policy 0, policy_version 59936 (0.0024) [2024-04-26 02:48:03,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55159.3, 300 sec: 55761.3). Total num frames: 982106112. Throughput: 0: 55742.0. Samples: 931491080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:48:03,924][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 02:48:04,472][47288] Updated weights for policy 0, policy_version 59946 (0.0030) [2024-04-26 02:48:07,037][47288] Updated weights for policy 0, policy_version 59956 (0.0034) [2024-04-26 02:48:08,923][47056] Fps is (10 sec: 57342.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 982401024. Throughput: 0: 55841.1. Samples: 931827800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:48:08,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 02:48:10,371][47288] Updated weights for policy 0, policy_version 59966 (0.0029) [2024-04-26 02:48:12,981][47288] Updated weights for policy 0, policy_version 59976 (0.0029) [2024-04-26 02:48:13,923][47056] Fps is (10 sec: 58983.6, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 982695936. Throughput: 0: 55892.8. Samples: 931990020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:48:13,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:48:16,210][47288] Updated weights for policy 0, policy_version 59986 (0.0035) [2024-04-26 02:48:18,896][47288] Updated weights for policy 0, policy_version 59996 (0.0029) [2024-04-26 02:48:18,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 982974464. Throughput: 0: 55952.4. Samples: 932325800. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:48:18,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 02:48:22,023][47288] Updated weights for policy 0, policy_version 60006 (0.0030) [2024-04-26 02:48:23,923][47056] Fps is (10 sec: 50790.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 983203840. Throughput: 0: 56015.9. Samples: 932665660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:48:23,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 02:48:24,768][47288] Updated weights for policy 0, policy_version 60016 (0.0025) [2024-04-26 02:48:27,780][47288] Updated weights for policy 0, policy_version 60026 (0.0033) [2024-04-26 02:48:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 983515136. Throughput: 0: 55658.1. Samples: 932827880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:48:28,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 02:48:30,812][47288] Updated weights for policy 0, policy_version 60036 (0.0036) [2024-04-26 02:48:33,726][47288] Updated weights for policy 0, policy_version 60046 (0.0028) [2024-04-26 02:48:33,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 983793664. Throughput: 0: 55597.2. Samples: 933155760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 02:48:33,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 02:48:36,654][47288] Updated weights for policy 0, policy_version 60056 (0.0033) [2024-04-26 02:48:38,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 984072192. Throughput: 0: 55651.8. Samples: 933489500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 02:48:38,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 02:48:39,501][47288] Updated weights for policy 0, policy_version 60066 (0.0029) [2024-04-26 02:48:40,361][47267] Signal inference workers to stop experience collection... 
(14100 times) [2024-04-26 02:48:40,362][47267] Signal inference workers to resume experience collection... (14100 times) [2024-04-26 02:48:40,392][47288] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-04-26 02:48:40,392][47288] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-04-26 02:48:42,513][47288] Updated weights for policy 0, policy_version 60076 (0.0031) [2024-04-26 02:48:43,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 984367104. Throughput: 0: 56059.5. Samples: 933672800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 02:48:43,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 02:48:45,177][47288] Updated weights for policy 0, policy_version 60086 (0.0026) [2024-04-26 02:48:48,343][47288] Updated weights for policy 0, policy_version 60096 (0.0027) [2024-04-26 02:48:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 984645632. Throughput: 0: 55932.8. Samples: 934008040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 02:48:48,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 02:48:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000060098_984645632.pth... [2024-04-26 02:48:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000059281_971259904.pth [2024-04-26 02:48:51,316][47288] Updated weights for policy 0, policy_version 60106 (0.0036) [2024-04-26 02:48:53,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.9, 300 sec: 55705.7). Total num frames: 984907776. Throughput: 0: 55859.1. Samples: 934341440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 02:48:53,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:48:54,084][47288] Updated weights for policy 0, policy_version 60116 (0.0027) [2024-04-26 02:48:57,342][47288] Updated weights for policy 0, policy_version 60126 (0.0033) [2024-04-26 02:48:58,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 985186304. Throughput: 0: 55881.1. Samples: 934504660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 02:48:58,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 02:49:00,007][47288] Updated weights for policy 0, policy_version 60136 (0.0034) [2024-04-26 02:49:03,216][47288] Updated weights for policy 0, policy_version 60146 (0.0034) [2024-04-26 02:49:03,923][47056] Fps is (10 sec: 54065.7, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 985448448. Throughput: 0: 55803.9. Samples: 934836980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 02:49:03,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 02:49:05,691][47288] Updated weights for policy 0, policy_version 60156 (0.0032) [2024-04-26 02:49:08,876][47288] Updated weights for policy 0, policy_version 60166 (0.0029) [2024-04-26 02:49:08,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 985759744. Throughput: 0: 55852.3. Samples: 935179020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 02:49:08,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 02:49:11,633][47288] Updated weights for policy 0, policy_version 60176 (0.0032) [2024-04-26 02:49:13,923][47056] Fps is (10 sec: 60621.4, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 986054656. Throughput: 0: 56042.3. Samples: 935349780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 02:49:13,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 02:49:14,625][47288] Updated weights for policy 0, policy_version 60186 (0.0028) [2024-04-26 02:49:17,455][47288] Updated weights for policy 0, policy_version 60196 (0.0032) [2024-04-26 02:49:18,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 986316800. Throughput: 0: 55959.0. Samples: 935673900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 02:49:18,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 02:49:20,598][47288] Updated weights for policy 0, policy_version 60206 (0.0024) [2024-04-26 02:49:23,409][47288] Updated weights for policy 0, policy_version 60216 (0.0027) [2024-04-26 02:49:23,923][47056] Fps is (10 sec: 52428.4, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 986578944. Throughput: 0: 55866.5. Samples: 936003500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 02:49:23,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 02:49:26,526][47288] Updated weights for policy 0, policy_version 60226 (0.0029) [2024-04-26 02:49:28,565][47267] Signal inference workers to stop experience collection... (14150 times) [2024-04-26 02:49:28,566][47267] Signal inference workers to resume experience collection... (14150 times) [2024-04-26 02:49:28,584][47288] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-04-26 02:49:28,585][47288] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-04-26 02:49:28,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 986873856. Throughput: 0: 55662.2. Samples: 936177600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 02:49:28,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 02:49:29,389][47288] Updated weights for policy 0, policy_version 60236 (0.0027) [2024-04-26 02:49:32,263][47288] Updated weights for policy 0, policy_version 60246 (0.0033) [2024-04-26 02:49:33,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 987152384. Throughput: 0: 55747.2. Samples: 936516660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 02:49:33,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 02:49:35,141][47288] Updated weights for policy 0, policy_version 60256 (0.0024) [2024-04-26 02:49:38,327][47288] Updated weights for policy 0, policy_version 60266 (0.0032) [2024-04-26 02:49:38,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 987414528. Throughput: 0: 55715.8. Samples: 936848660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 02:49:38,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 02:49:41,066][47288] Updated weights for policy 0, policy_version 60276 (0.0032) [2024-04-26 02:49:43,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 987693056. Throughput: 0: 55782.9. Samples: 937014900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 02:49:43,924][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 02:49:44,237][47288] Updated weights for policy 0, policy_version 60286 (0.0028) [2024-04-26 02:49:46,879][47288] Updated weights for policy 0, policy_version 60296 (0.0027) [2024-04-26 02:49:48,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 987987968. Throughput: 0: 55877.0. Samples: 937351440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 02:49:48,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 02:49:50,139][47288] Updated weights for policy 0, policy_version 60306 (0.0029) [2024-04-26 02:49:52,614][47288] Updated weights for policy 0, policy_version 60316 (0.0029) [2024-04-26 02:49:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 988266496. Throughput: 0: 55770.3. Samples: 937688680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:49:53,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 02:49:55,852][47288] Updated weights for policy 0, policy_version 60326 (0.0025) [2024-04-26 02:49:58,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 988528640. Throughput: 0: 55831.1. Samples: 937862180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:49:58,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 02:49:59,074][47288] Updated weights for policy 0, policy_version 60336 (0.0035) [2024-04-26 02:50:01,724][47288] Updated weights for policy 0, policy_version 60346 (0.0031) [2024-04-26 02:50:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 988823552. Throughput: 0: 56041.2. Samples: 938195760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:50:03,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 02:50:04,846][47288] Updated weights for policy 0, policy_version 60356 (0.0030) [2024-04-26 02:50:07,262][47267] Signal inference workers to stop experience collection... (14200 times) [2024-04-26 02:50:07,267][47267] Signal inference workers to resume experience collection... 
(14200 times) [2024-04-26 02:50:07,291][47288] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-04-26 02:50:07,291][47288] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-04-26 02:50:07,664][47288] Updated weights for policy 0, policy_version 60366 (0.0027) [2024-04-26 02:50:08,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 989069312. Throughput: 0: 56139.7. Samples: 938529780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:50:08,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 02:50:10,634][47288] Updated weights for policy 0, policy_version 60376 (0.0038) [2024-04-26 02:50:13,376][47288] Updated weights for policy 0, policy_version 60386 (0.0035) [2024-04-26 02:50:13,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 989364224. Throughput: 0: 55773.3. Samples: 938687400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:50:13,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 02:50:16,338][47288] Updated weights for policy 0, policy_version 60396 (0.0026) [2024-04-26 02:50:18,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 989659136. Throughput: 0: 55570.2. Samples: 939017320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:50:18,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 02:50:19,917][47288] Updated weights for policy 0, policy_version 60406 (0.0030) [2024-04-26 02:50:22,163][47288] Updated weights for policy 0, policy_version 60416 (0.0027) [2024-04-26 02:50:23,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 989937664. Throughput: 0: 55527.7. Samples: 939347400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:50:23,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 02:50:26,125][47288] Updated weights for policy 0, policy_version 60426 (0.0027) [2024-04-26 02:50:28,017][47288] Updated weights for policy 0, policy_version 60436 (0.0031) [2024-04-26 02:50:28,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 990199808. Throughput: 0: 55779.2. Samples: 939524960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 02:50:28,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 02:50:31,900][47288] Updated weights for policy 0, policy_version 60446 (0.0029) [2024-04-26 02:50:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 990494720. Throughput: 0: 55652.9. Samples: 939855820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 02:50:33,923][47056] Avg episode reward: [(0, '0.276')] [2024-04-26 02:50:33,988][47288] Updated weights for policy 0, policy_version 60456 (0.0034) [2024-04-26 02:50:37,648][47288] Updated weights for policy 0, policy_version 60466 (0.0034) [2024-04-26 02:50:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 990773248. Throughput: 0: 55648.8. Samples: 940192880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 02:50:38,924][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 02:50:39,930][47288] Updated weights for policy 0, policy_version 60476 (0.0030) [2024-04-26 02:50:43,543][47288] Updated weights for policy 0, policy_version 60486 (0.0034) [2024-04-26 02:50:43,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 991019008. Throughput: 0: 55429.3. Samples: 940356500. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 02:50:43,923][47056] Avg episode reward: [(0, '0.315')] [2024-04-26 02:50:45,898][47288] Updated weights for policy 0, policy_version 60496 (0.0029) [2024-04-26 02:50:46,408][47267] Signal inference workers to stop experience collection... (14250 times) [2024-04-26 02:50:46,441][47288] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-04-26 02:50:46,464][47267] Signal inference workers to resume experience collection... (14250 times) [2024-04-26 02:50:46,464][47288] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-04-26 02:50:48,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 991297536. Throughput: 0: 55466.4. Samples: 940691760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 02:50:48,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 02:50:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000060504_991297536.pth... [2024-04-26 02:50:48,992][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000059688_977928192.pth [2024-04-26 02:50:49,441][47288] Updated weights for policy 0, policy_version 60506 (0.0035) [2024-04-26 02:50:51,763][47288] Updated weights for policy 0, policy_version 60516 (0.0032) [2024-04-26 02:50:53,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 991592448. Throughput: 0: 55425.3. Samples: 941023920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 02:50:53,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 02:50:55,184][47288] Updated weights for policy 0, policy_version 60526 (0.0031) [2024-04-26 02:50:57,573][47288] Updated weights for policy 0, policy_version 60536 (0.0032) [2024-04-26 02:50:58,923][47056] Fps is (10 sec: 58983.6, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 991887360. Throughput: 0: 55565.4. Samples: 941187840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 02:50:58,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 02:51:00,852][47288] Updated weights for policy 0, policy_version 60546 (0.0026) [2024-04-26 02:51:03,317][47288] Updated weights for policy 0, policy_version 60556 (0.0029) [2024-04-26 02:51:03,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 992182272. Throughput: 0: 55802.6. Samples: 941528440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 02:51:03,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 02:51:06,834][47288] Updated weights for policy 0, policy_version 60566 (0.0029) [2024-04-26 02:51:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 992460800. Throughput: 0: 55783.5. Samples: 941857660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 02:51:08,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 02:51:09,270][47288] Updated weights for policy 0, policy_version 60576 (0.0029) [2024-04-26 02:51:12,791][47288] Updated weights for policy 0, policy_version 60586 (0.0027) [2024-04-26 02:51:13,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 992722944. Throughput: 0: 55823.6. Samples: 942037020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 02:51:13,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 02:51:14,972][47288] Updated weights for policy 0, policy_version 60596 (0.0032) [2024-04-26 02:51:18,655][47288] Updated weights for policy 0, policy_version 60606 (0.0029) [2024-04-26 02:51:18,923][47056] Fps is (10 sec: 50790.3, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 992968704. Throughput: 0: 55917.4. Samples: 942372100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 02:51:18,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 02:51:20,716][47288] Updated weights for policy 0, policy_version 60616 (0.0028) [2024-04-26 02:51:23,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 993263616. Throughput: 0: 55793.8. Samples: 942703600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 02:51:23,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 02:51:24,641][47288] Updated weights for policy 0, policy_version 60626 (0.0030) [2024-04-26 02:51:26,808][47288] Updated weights for policy 0, policy_version 60636 (0.0030) [2024-04-26 02:51:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 993542144. Throughput: 0: 55655.1. Samples: 942860980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 02:51:28,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 02:51:30,446][47288] Updated weights for policy 0, policy_version 60646 (0.0028) [2024-04-26 02:51:32,828][47288] Updated weights for policy 0, policy_version 60656 (0.0031) [2024-04-26 02:51:33,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 993837056. Throughput: 0: 55651.4. Samples: 943196060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 02:51:33,923][47056] Avg episode reward: [(0, '0.356')] [2024-04-26 02:51:36,260][47288] Updated weights for policy 0, policy_version 60666 (0.0029) [2024-04-26 02:51:38,580][47288] Updated weights for policy 0, policy_version 60676 (0.0028) [2024-04-26 02:51:38,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 994131968. Throughput: 0: 55668.7. Samples: 943529020. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 02:51:38,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 02:51:42,170][47288] Updated weights for policy 0, policy_version 60686 (0.0034) [2024-04-26 02:51:43,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 994410496. Throughput: 0: 55874.1. Samples: 943702180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:51:43,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 02:51:44,496][47288] Updated weights for policy 0, policy_version 60696 (0.0029) [2024-04-26 02:51:47,735][47267] Signal inference workers to stop experience collection... (14300 times) [2024-04-26 02:51:47,770][47288] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-04-26 02:51:47,823][47267] Signal inference workers to resume experience collection... (14300 times) [2024-04-26 02:51:47,823][47288] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-04-26 02:51:47,949][47288] Updated weights for policy 0, policy_version 60706 (0.0024) [2024-04-26 02:51:48,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 994672640. Throughput: 0: 55733.7. Samples: 944036460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:51:48,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 02:51:50,311][47288] Updated weights for policy 0, policy_version 60716 (0.0027) [2024-04-26 02:51:53,679][47288] Updated weights for policy 0, policy_version 60726 (0.0034) [2024-04-26 02:51:53,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 994934784. Throughput: 0: 55915.9. Samples: 944373880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:51:53,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 02:51:56,226][47288] Updated weights for policy 0, policy_version 60736 (0.0032) [2024-04-26 02:51:58,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 995196928. Throughput: 0: 55333.3. Samples: 944527020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:51:58,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 02:51:59,543][47288] Updated weights for policy 0, policy_version 60746 (0.0037) [2024-04-26 02:52:01,993][47288] Updated weights for policy 0, policy_version 60756 (0.0028) [2024-04-26 02:52:03,923][47056] Fps is (10 sec: 54067.7, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 995475456. Throughput: 0: 55405.3. Samples: 944865340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:52:03,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 02:52:05,468][47288] Updated weights for policy 0, policy_version 60766 (0.0030) [2024-04-26 02:52:07,794][47288] Updated weights for policy 0, policy_version 60776 (0.0027) [2024-04-26 02:52:08,923][47056] Fps is (10 sec: 60620.6, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 995803136. Throughput: 0: 55539.0. Samples: 945202860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:52:08,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 02:52:11,197][47288] Updated weights for policy 0, policy_version 60786 (0.0027) [2024-04-26 02:52:13,636][47288] Updated weights for policy 0, policy_version 60796 (0.0027) [2024-04-26 02:52:13,923][47056] Fps is (10 sec: 62259.1, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 996098048. Throughput: 0: 55938.8. Samples: 945378220. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:52:13,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 02:52:17,158][47288] Updated weights for policy 0, policy_version 60806 (0.0029) [2024-04-26 02:52:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 996343808. Throughput: 0: 55954.5. Samples: 945714020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 02:52:18,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:52:19,521][47288] Updated weights for policy 0, policy_version 60816 (0.0031) [2024-04-26 02:52:23,098][47288] Updated weights for policy 0, policy_version 60826 (0.0030) [2024-04-26 02:52:23,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 996622336. Throughput: 0: 55881.5. Samples: 946043680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:52:23,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 02:52:25,409][47288] Updated weights for policy 0, policy_version 60836 (0.0029) [2024-04-26 02:52:28,838][47288] Updated weights for policy 0, policy_version 60846 (0.0026) [2024-04-26 02:52:28,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 996900864. Throughput: 0: 55670.2. Samples: 946207340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:52:28,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 02:52:31,163][47288] Updated weights for policy 0, policy_version 60856 (0.0033) [2024-04-26 02:52:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 997163008. Throughput: 0: 55766.2. Samples: 946545940. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:52:33,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 02:52:34,664][47288] Updated weights for policy 0, policy_version 60866 (0.0031) [2024-04-26 02:52:37,061][47288] Updated weights for policy 0, policy_version 60876 (0.0028) [2024-04-26 02:52:38,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 997441536. Throughput: 0: 55713.8. Samples: 946881000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:52:38,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 02:52:40,636][47288] Updated weights for policy 0, policy_version 60886 (0.0035) [2024-04-26 02:52:43,050][47288] Updated weights for policy 0, policy_version 60896 (0.0024) [2024-04-26 02:52:43,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 997736448. Throughput: 0: 55893.0. Samples: 947042200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:52:43,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 02:52:46,481][47288] Updated weights for policy 0, policy_version 60906 (0.0033) [2024-04-26 02:52:48,534][47267] Signal inference workers to stop experience collection... (14350 times) [2024-04-26 02:52:48,534][47267] Signal inference workers to resume experience collection... (14350 times) [2024-04-26 02:52:48,560][47288] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-04-26 02:52:48,561][47288] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-04-26 02:52:48,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 998031360. Throughput: 0: 55757.6. Samples: 947374440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:52:48,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 02:52:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000060915_998031360.pth... 
[2024-04-26 02:52:48,990][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000060098_984645632.pth [2024-04-26 02:52:49,123][47288] Updated weights for policy 0, policy_version 60916 (0.0035) [2024-04-26 02:52:52,346][47288] Updated weights for policy 0, policy_version 60926 (0.0027) [2024-04-26 02:52:53,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 998293504. Throughput: 0: 55814.6. Samples: 947714520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 02:52:53,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 02:52:54,922][47288] Updated weights for policy 0, policy_version 60936 (0.0029) [2024-04-26 02:52:58,348][47288] Updated weights for policy 0, policy_version 60946 (0.0029) [2024-04-26 02:52:58,923][47056] Fps is (10 sec: 52430.0, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 998555648. Throughput: 0: 55652.5. Samples: 947882580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 02:52:58,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 02:53:00,697][47288] Updated weights for policy 0, policy_version 60956 (0.0026) [2024-04-26 02:53:03,923][47056] Fps is (10 sec: 54068.5, 60 sec: 55978.7, 300 sec: 55705.7). Total num frames: 998834176. Throughput: 0: 55674.9. Samples: 948219380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 02:53:03,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 02:53:04,253][47288] Updated weights for policy 0, policy_version 60966 (0.0031) [2024-04-26 02:53:06,616][47288] Updated weights for policy 0, policy_version 60976 (0.0034) [2024-04-26 02:53:08,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 999112704. Throughput: 0: 55709.8. Samples: 948550620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 02:53:08,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 02:53:10,087][47288] Updated weights for policy 0, policy_version 60986 (0.0030) [2024-04-26 02:53:12,688][47288] Updated weights for policy 0, policy_version 60996 (0.0037) [2024-04-26 02:53:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 999391232. Throughput: 0: 55669.4. Samples: 948712460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 02:53:13,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 02:53:15,975][47288] Updated weights for policy 0, policy_version 61006 (0.0028) [2024-04-26 02:53:18,501][47288] Updated weights for policy 0, policy_version 61016 (0.0028) [2024-04-26 02:53:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 999686144. Throughput: 0: 55623.2. Samples: 949048980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 02:53:18,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 02:53:21,929][47288] Updated weights for policy 0, policy_version 61026 (0.0033) [2024-04-26 02:53:23,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 999981056. Throughput: 0: 55677.8. Samples: 949386500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 02:53:23,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 02:53:24,194][47288] Updated weights for policy 0, policy_version 61036 (0.0029) [2024-04-26 02:53:27,586][47288] Updated weights for policy 0, policy_version 61046 (0.0028) [2024-04-26 02:53:28,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1000226816. Throughput: 0: 55798.2. Samples: 949553120. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 02:53:28,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 02:53:30,134][47288] Updated weights for policy 0, policy_version 61056 (0.0026) [2024-04-26 02:53:33,516][47288] Updated weights for policy 0, policy_version 61066 (0.0028) [2024-04-26 02:53:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 1000521728. Throughput: 0: 55802.3. Samples: 949885540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 02:53:33,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 02:53:36,251][47288] Updated weights for policy 0, policy_version 61076 (0.0029) [2024-04-26 02:53:38,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1000800256. Throughput: 0: 55798.8. Samples: 950225460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:53:38,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 02:53:39,240][47288] Updated weights for policy 0, policy_version 61086 (0.0029) [2024-04-26 02:53:42,159][47288] Updated weights for policy 0, policy_version 61096 (0.0043) [2024-04-26 02:53:43,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 1001062400. Throughput: 0: 55637.5. Samples: 950386280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:53:43,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 02:53:45,133][47288] Updated weights for policy 0, policy_version 61106 (0.0027) [2024-04-26 02:53:47,987][47288] Updated weights for policy 0, policy_version 61116 (0.0029) [2024-04-26 02:53:48,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 1001340928. Throughput: 0: 55611.0. Samples: 950721880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:53:48,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 02:53:51,050][47288] Updated weights for policy 0, policy_version 61126 (0.0027) [2024-04-26 02:53:53,802][47288] Updated weights for policy 0, policy_version 61136 (0.0031) [2024-04-26 02:53:53,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 1001652224. Throughput: 0: 55606.7. Samples: 951052920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:53:53,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 02:53:56,943][47288] Updated weights for policy 0, policy_version 61146 (0.0028) [2024-04-26 02:53:58,433][47267] Signal inference workers to stop experience collection... (14400 times) [2024-04-26 02:53:58,433][47267] Signal inference workers to resume experience collection... (14400 times) [2024-04-26 02:53:58,455][47288] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-04-26 02:53:58,455][47288] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-04-26 02:53:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 1001914368. Throughput: 0: 55807.0. Samples: 951223780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:53:58,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 02:54:00,128][47288] Updated weights for policy 0, policy_version 61156 (0.0035) [2024-04-26 02:54:02,739][47288] Updated weights for policy 0, policy_version 61166 (0.0033) [2024-04-26 02:54:03,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 1002176512. Throughput: 0: 55796.8. Samples: 951559840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:54:03,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 02:54:05,896][47288] Updated weights for policy 0, policy_version 61176 (0.0030) [2024-04-26 02:54:08,601][47288] Updated weights for policy 0, policy_version 61186 (0.0030) [2024-04-26 02:54:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 1002471424. Throughput: 0: 55744.9. Samples: 951895020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 02:54:08,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 02:54:11,652][47288] Updated weights for policy 0, policy_version 61196 (0.0027) [2024-04-26 02:54:13,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1002749952. Throughput: 0: 55737.8. Samples: 952061320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 02:54:13,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 02:54:14,457][47288] Updated weights for policy 0, policy_version 61206 (0.0034) [2024-04-26 02:54:17,576][47288] Updated weights for policy 0, policy_version 61216 (0.0031) [2024-04-26 02:54:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55761.2). Total num frames: 1003028480. Throughput: 0: 55844.0. Samples: 952398520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 02:54:18,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 02:54:20,810][47288] Updated weights for policy 0, policy_version 61226 (0.0032) [2024-04-26 02:54:23,553][47288] Updated weights for policy 0, policy_version 61236 (0.0033) [2024-04-26 02:54:23,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 1003290624. Throughput: 0: 55793.2. Samples: 952736160. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 02:54:23,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 02:54:26,682][47288] Updated weights for policy 0, policy_version 61246 (0.0027) [2024-04-26 02:54:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1003585536. Throughput: 0: 56006.8. Samples: 952906580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 02:54:28,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 02:54:29,627][47288] Updated weights for policy 0, policy_version 61256 (0.0029) [2024-04-26 02:54:32,613][47288] Updated weights for policy 0, policy_version 61266 (0.0027) [2024-04-26 02:54:33,923][47056] Fps is (10 sec: 58983.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1003880448. Throughput: 0: 56003.5. Samples: 953242040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 02:54:33,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 02:54:35,629][47288] Updated weights for policy 0, policy_version 61276 (0.0033) [2024-04-26 02:54:38,346][47288] Updated weights for policy 0, policy_version 61286 (0.0031) [2024-04-26 02:54:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1004126208. Throughput: 0: 55940.9. Samples: 953570260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 02:54:38,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 02:54:41,540][47288] Updated weights for policy 0, policy_version 61296 (0.0025) [2024-04-26 02:54:43,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 1004421120. Throughput: 0: 55943.7. Samples: 953741240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 02:54:43,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 02:54:44,176][47288] Updated weights for policy 0, policy_version 61306 (0.0031) [2024-04-26 02:54:47,267][47288] Updated weights for policy 0, policy_version 61316 (0.0030) [2024-04-26 02:54:48,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 1004716032. Throughput: 0: 55893.4. Samples: 954075040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 02:54:48,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 02:54:48,991][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000061324_1004732416.pth... [2024-04-26 02:54:49,036][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000060504_991297536.pth [2024-04-26 02:54:50,018][47288] Updated weights for policy 0, policy_version 61326 (0.0026) [2024-04-26 02:54:53,078][47288] Updated weights for policy 0, policy_version 61336 (0.0029) [2024-04-26 02:54:53,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 1004978176. Throughput: 0: 55952.1. Samples: 954412860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:54:53,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 02:54:55,843][47288] Updated weights for policy 0, policy_version 61346 (0.0027) [2024-04-26 02:54:58,899][47288] Updated weights for policy 0, policy_version 61356 (0.0031) [2024-04-26 02:54:58,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1005256704. Throughput: 0: 55949.7. Samples: 954579060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:54:58,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 02:55:01,630][47267] Signal inference workers to stop experience collection... 
(14450 times) [2024-04-26 02:55:01,665][47288] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-04-26 02:55:01,718][47267] Signal inference workers to resume experience collection... (14450 times) [2024-04-26 02:55:01,718][47288] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-04-26 02:55:01,827][47288] Updated weights for policy 0, policy_version 61366 (0.0024) [2024-04-26 02:55:03,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 1005551616. Throughput: 0: 55841.2. Samples: 954911380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:55:03,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 02:55:04,674][47288] Updated weights for policy 0, policy_version 61376 (0.0024) [2024-04-26 02:55:07,633][47288] Updated weights for policy 0, policy_version 61386 (0.0030) [2024-04-26 02:55:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1005830144. Throughput: 0: 55749.1. Samples: 955244860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:55:08,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 02:55:10,481][47288] Updated weights for policy 0, policy_version 61396 (0.0028) [2024-04-26 02:55:13,289][47288] Updated weights for policy 0, policy_version 61406 (0.0029) [2024-04-26 02:55:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 1006092288. Throughput: 0: 55688.4. Samples: 955412560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:55:13,924][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 02:55:16,206][47288] Updated weights for policy 0, policy_version 61416 (0.0031) [2024-04-26 02:55:18,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1006370816. Throughput: 0: 55625.9. Samples: 955745200. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:55:18,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 02:55:19,202][47288] Updated weights for policy 0, policy_version 61426 (0.0029) [2024-04-26 02:55:22,213][47288] Updated weights for policy 0, policy_version 61436 (0.0030) [2024-04-26 02:55:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 1006665728. Throughput: 0: 55719.5. Samples: 956077640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 02:55:23,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 02:55:25,000][47288] Updated weights for policy 0, policy_version 61446 (0.0029) [2024-04-26 02:55:28,118][47288] Updated weights for policy 0, policy_version 61456 (0.0027) [2024-04-26 02:55:28,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1006927872. Throughput: 0: 55722.7. Samples: 956248760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:55:28,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 02:55:30,776][47288] Updated weights for policy 0, policy_version 61466 (0.0030) [2024-04-26 02:55:33,913][47288] Updated weights for policy 0, policy_version 61476 (0.0027) [2024-04-26 02:55:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1007222784. Throughput: 0: 55695.9. Samples: 956581360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:55:33,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 02:55:36,737][47288] Updated weights for policy 0, policy_version 61486 (0.0027) [2024-04-26 02:55:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 1007484928. Throughput: 0: 55660.9. Samples: 956917600. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:55:38,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 02:55:39,691][47288] Updated weights for policy 0, policy_version 61496 (0.0031) [2024-04-26 02:55:42,835][47288] Updated weights for policy 0, policy_version 61506 (0.0032) [2024-04-26 02:55:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55872.3). Total num frames: 1007779840. Throughput: 0: 55675.2. Samples: 957084440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:55:43,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 02:55:45,581][47288] Updated weights for policy 0, policy_version 61516 (0.0031) [2024-04-26 02:55:48,752][47288] Updated weights for policy 0, policy_version 61526 (0.0027) [2024-04-26 02:55:48,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 1008041984. Throughput: 0: 55709.5. Samples: 957418300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:55:48,932][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 02:55:51,586][47288] Updated weights for policy 0, policy_version 61536 (0.0027) [2024-04-26 02:55:53,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1008320512. Throughput: 0: 55714.4. Samples: 957752000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:55:53,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 02:55:54,478][47288] Updated weights for policy 0, policy_version 61546 (0.0027) [2024-04-26 02:55:57,327][47288] Updated weights for policy 0, policy_version 61556 (0.0029) [2024-04-26 02:55:58,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 1008631808. Throughput: 0: 55820.3. Samples: 957924480. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:55:58,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:56:00,276][47288] Updated weights for policy 0, policy_version 61566 (0.0030) [2024-04-26 02:56:03,315][47288] Updated weights for policy 0, policy_version 61576 (0.0034) [2024-04-26 02:56:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 1008893952. Throughput: 0: 55932.0. Samples: 958262140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 02:56:03,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 02:56:06,005][47267] Signal inference workers to stop experience collection... (14500 times) [2024-04-26 02:56:06,005][47267] Signal inference workers to resume experience collection... (14500 times) [2024-04-26 02:56:06,034][47288] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-04-26 02:56:06,034][47288] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-04-26 02:56:06,116][47288] Updated weights for policy 0, policy_version 61586 (0.0036) [2024-04-26 02:56:08,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 1009156096. Throughput: 0: 55936.7. Samples: 958594800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 02:56:08,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 02:56:09,226][47288] Updated weights for policy 0, policy_version 61596 (0.0032) [2024-04-26 02:56:11,835][47288] Updated weights for policy 0, policy_version 61606 (0.0028) [2024-04-26 02:56:13,923][47056] Fps is (10 sec: 54065.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 1009434624. Throughput: 0: 55707.4. Samples: 958755600. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 02:56:13,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 02:56:15,118][47288] Updated weights for policy 0, policy_version 61616 (0.0026) [2024-04-26 02:56:17,825][47288] Updated weights for policy 0, policy_version 61626 (0.0029) [2024-04-26 02:56:18,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 1009696768. Throughput: 0: 55688.4. Samples: 959087340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 02:56:18,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 02:56:20,857][47288] Updated weights for policy 0, policy_version 61636 (0.0029) [2024-04-26 02:56:23,881][47288] Updated weights for policy 0, policy_version 61646 (0.0027) [2024-04-26 02:56:23,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 1010008064. Throughput: 0: 55665.8. Samples: 959422560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 02:56:23,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 02:56:26,713][47288] Updated weights for policy 0, policy_version 61656 (0.0031) [2024-04-26 02:56:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 1010270208. Throughput: 0: 55619.9. Samples: 959587340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 02:56:28,923][47056] Avg episode reward: [(0, '0.325')] [2024-04-26 02:56:29,923][47288] Updated weights for policy 0, policy_version 61666 (0.0027) [2024-04-26 02:56:32,556][47288] Updated weights for policy 0, policy_version 61676 (0.0029) [2024-04-26 02:56:33,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 1010548736. Throughput: 0: 55500.9. Samples: 959915840. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 02:56:33,923][47056] Avg episode reward: [(0, '0.305')] [2024-04-26 02:56:36,080][47288] Updated weights for policy 0, policy_version 61686 (0.0032) [2024-04-26 02:56:38,531][47288] Updated weights for policy 0, policy_version 61696 (0.0028) [2024-04-26 02:56:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1010827264. Throughput: 0: 55534.9. Samples: 960251080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 02:56:38,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 02:56:41,818][47288] Updated weights for policy 0, policy_version 61706 (0.0029) [2024-04-26 02:56:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1011105792. Throughput: 0: 55614.5. Samples: 960427120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 02:56:43,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 02:56:44,261][47288] Updated weights for policy 0, policy_version 61716 (0.0028) [2024-04-26 02:56:47,686][47288] Updated weights for policy 0, policy_version 61726 (0.0037) [2024-04-26 02:56:48,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1011367936. Throughput: 0: 55582.9. Samples: 960763380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:56:48,924][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 02:56:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000061729_1011367936.pth... [2024-04-26 02:56:49,014][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000060915_998031360.pth [2024-04-26 02:56:50,243][47288] Updated weights for policy 0, policy_version 61736 (0.0031) [2024-04-26 02:56:53,621][47288] Updated weights for policy 0, policy_version 61746 (0.0026) [2024-04-26 02:56:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.4, 300 sec: 55816.7). 
Total num frames: 1011662848. Throughput: 0: 55532.0. Samples: 961093740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:56:53,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 02:56:56,650][47288] Updated weights for policy 0, policy_version 61756 (0.0029) [2024-04-26 02:56:58,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 1011957760. Throughput: 0: 55673.4. Samples: 961260900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:56:58,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 02:56:59,429][47288] Updated weights for policy 0, policy_version 61766 (0.0026) [2024-04-26 02:57:02,438][47267] Signal inference workers to stop experience collection... (14550 times) [2024-04-26 02:57:02,439][47267] Signal inference workers to resume experience collection... (14550 times) [2024-04-26 02:57:02,449][47288] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-04-26 02:57:02,450][47288] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-04-26 02:57:02,562][47288] Updated weights for policy 0, policy_version 61776 (0.0028) [2024-04-26 02:57:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 1012236288. Throughput: 0: 55790.7. Samples: 961597920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:57:03,924][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 02:57:05,164][47288] Updated weights for policy 0, policy_version 61786 (0.0028) [2024-04-26 02:57:08,321][47288] Updated weights for policy 0, policy_version 61796 (0.0028) [2024-04-26 02:57:08,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 1012514816. Throughput: 0: 55639.5. Samples: 961926340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:57:08,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 02:57:11,088][47288] Updated weights for policy 0, policy_version 61806 (0.0028) [2024-04-26 02:57:13,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 1012760576. Throughput: 0: 55643.1. Samples: 962091280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:57:13,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 02:57:14,108][47288] Updated weights for policy 0, policy_version 61816 (0.0026) [2024-04-26 02:57:16,982][47288] Updated weights for policy 0, policy_version 61826 (0.0035) [2024-04-26 02:57:18,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1013055488. Throughput: 0: 55754.6. Samples: 962424800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 02:57:18,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 02:57:20,045][47288] Updated weights for policy 0, policy_version 61836 (0.0031) [2024-04-26 02:57:23,074][47288] Updated weights for policy 0, policy_version 61846 (0.0032) [2024-04-26 02:57:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 1013317632. Throughput: 0: 55741.8. Samples: 962759460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 02:57:23,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 02:57:25,961][47288] Updated weights for policy 0, policy_version 61856 (0.0034) [2024-04-26 02:57:28,860][47288] Updated weights for policy 0, policy_version 61866 (0.0032) [2024-04-26 02:57:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1013612544. Throughput: 0: 55479.0. Samples: 962923680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 02:57:28,924][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 02:57:31,760][47288] Updated weights for policy 0, policy_version 61876 (0.0025) [2024-04-26 02:57:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1013891072. Throughput: 0: 55359.6. Samples: 963254560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 02:57:33,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 02:57:34,601][47288] Updated weights for policy 0, policy_version 61886 (0.0030) [2024-04-26 02:57:37,574][47288] Updated weights for policy 0, policy_version 61896 (0.0025) [2024-04-26 02:57:38,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 1014185984. Throughput: 0: 55426.7. Samples: 963587940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 02:57:38,923][47056] Avg episode reward: [(0, '0.322')] [2024-04-26 02:57:40,572][47288] Updated weights for policy 0, policy_version 61906 (0.0038) [2024-04-26 02:57:43,497][47288] Updated weights for policy 0, policy_version 61916 (0.0028) [2024-04-26 02:57:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1014448128. Throughput: 0: 55642.8. Samples: 963764820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 02:57:43,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 02:57:46,489][47288] Updated weights for policy 0, policy_version 61926 (0.0029) [2024-04-26 02:57:48,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1014710272. Throughput: 0: 55452.4. Samples: 964093280. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 02:57:48,923][47056] Avg episode reward: [(0, '0.288')] [2024-04-26 02:57:49,465][47288] Updated weights for policy 0, policy_version 61936 (0.0027) [2024-04-26 02:57:52,219][47288] Updated weights for policy 0, policy_version 61946 (0.0034) [2024-04-26 02:57:53,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 1015005184. Throughput: 0: 55525.8. Samples: 964425000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 02:57:53,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 02:57:54,940][47267] Signal inference workers to stop experience collection... (14600 times) [2024-04-26 02:57:54,940][47267] Signal inference workers to resume experience collection... (14600 times) [2024-04-26 02:57:54,962][47288] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-04-26 02:57:54,962][47288] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-04-26 02:57:55,177][47288] Updated weights for policy 0, policy_version 61956 (0.0028) [2024-04-26 02:57:57,973][47288] Updated weights for policy 0, policy_version 61966 (0.0030) [2024-04-26 02:57:58,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 1015283712. Throughput: 0: 55551.5. Samples: 964591100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 02:57:58,927][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 02:58:01,043][47288] Updated weights for policy 0, policy_version 61976 (0.0030) [2024-04-26 02:58:03,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 1015545856. Throughput: 0: 55536.1. Samples: 964923920. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:03,932][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 02:58:04,068][47288] Updated weights for policy 0, policy_version 61986 (0.0027) [2024-04-26 02:58:06,954][47288] Updated weights for policy 0, policy_version 61996 (0.0028) [2024-04-26 02:58:08,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 1015840768. Throughput: 0: 55525.9. Samples: 965258120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:08,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 02:58:10,107][47288] Updated weights for policy 0, policy_version 62006 (0.0035) [2024-04-26 02:58:12,875][47288] Updated weights for policy 0, policy_version 62016 (0.0034) [2024-04-26 02:58:13,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 1016135680. Throughput: 0: 55771.1. Samples: 965433380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:13,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 02:58:15,835][47288] Updated weights for policy 0, policy_version 62026 (0.0030) [2024-04-26 02:58:18,657][47288] Updated weights for policy 0, policy_version 62036 (0.0029) [2024-04-26 02:58:18,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1016414208. Throughput: 0: 55963.1. Samples: 965772900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:18,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 02:58:21,837][47288] Updated weights for policy 0, policy_version 62046 (0.0029) [2024-04-26 02:58:23,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1016659968. Throughput: 0: 56031.6. Samples: 966109360. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:23,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 02:58:24,543][47288] Updated weights for policy 0, policy_version 62056 (0.0029) [2024-04-26 02:58:27,793][47288] Updated weights for policy 0, policy_version 62066 (0.0033) [2024-04-26 02:58:28,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 1016938496. Throughput: 0: 55517.6. Samples: 966263120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:28,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 02:58:30,529][47288] Updated weights for policy 0, policy_version 62076 (0.0031) [2024-04-26 02:58:33,525][47288] Updated weights for policy 0, policy_version 62086 (0.0031) [2024-04-26 02:58:33,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1017233408. Throughput: 0: 55630.3. Samples: 966596640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:33,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 02:58:36,441][47288] Updated weights for policy 0, policy_version 62096 (0.0035) [2024-04-26 02:58:38,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 1017511936. Throughput: 0: 55754.5. Samples: 966933960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:38,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 02:58:39,274][47288] Updated weights for policy 0, policy_version 62106 (0.0027) [2024-04-26 02:58:42,393][47288] Updated weights for policy 0, policy_version 62116 (0.0033) [2024-04-26 02:58:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1017790464. Throughput: 0: 56038.7. Samples: 967112840. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:43,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 02:58:45,265][47288] Updated weights for policy 0, policy_version 62126 (0.0029) [2024-04-26 02:58:48,211][47288] Updated weights for policy 0, policy_version 62136 (0.0031) [2024-04-26 02:58:48,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 1018085376. Throughput: 0: 56003.2. Samples: 967444060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:48,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 02:58:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000062139_1018085376.pth... [2024-04-26 02:58:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000061324_1004732416.pth [2024-04-26 02:58:51,208][47288] Updated weights for policy 0, policy_version 62146 (0.0031) [2024-04-26 02:58:53,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 1018331136. Throughput: 0: 55938.7. Samples: 967775360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:53,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 02:58:54,053][47288] Updated weights for policy 0, policy_version 62156 (0.0032) [2024-04-26 02:58:54,875][47267] Signal inference workers to stop experience collection... (14650 times) [2024-04-26 02:58:54,879][47267] Signal inference workers to resume experience collection... (14650 times) [2024-04-26 02:58:54,916][47288] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-04-26 02:58:54,916][47288] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-04-26 02:58:56,895][47288] Updated weights for policy 0, policy_version 62166 (0.0031) [2024-04-26 02:58:58,923][47056] Fps is (10 sec: 50789.4, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 1018593280. Throughput: 0: 55620.4. Samples: 967936300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 02:58:58,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 02:58:59,800][47288] Updated weights for policy 0, policy_version 62176 (0.0026) [2024-04-26 02:59:02,634][47288] Updated weights for policy 0, policy_version 62186 (0.0027) [2024-04-26 02:59:03,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 1018888192. Throughput: 0: 55608.4. Samples: 968275280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 02:59:03,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 02:59:05,881][47288] Updated weights for policy 0, policy_version 62196 (0.0032) [2024-04-26 02:59:08,623][47288] Updated weights for policy 0, policy_version 62206 (0.0034) [2024-04-26 02:59:08,923][47056] Fps is (10 sec: 58983.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1019183104. Throughput: 0: 55576.9. Samples: 968610320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 02:59:08,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 02:59:11,774][47288] Updated weights for policy 0, policy_version 62216 (0.0027) [2024-04-26 02:59:13,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1019478016. Throughput: 0: 55828.4. Samples: 968775400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 02:59:13,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 02:59:14,704][47288] Updated weights for policy 0, policy_version 62226 (0.0029) [2024-04-26 02:59:17,679][47288] Updated weights for policy 0, policy_version 62236 (0.0030) [2024-04-26 02:59:18,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 1019756544. Throughput: 0: 55750.6. Samples: 969105420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:59:18,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 02:59:20,452][47288] Updated weights for policy 0, policy_version 62246 (0.0028) [2024-04-26 02:59:23,653][47288] Updated weights for policy 0, policy_version 62256 (0.0032) [2024-04-26 02:59:23,923][47056] Fps is (10 sec: 54068.8, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 1020018688. Throughput: 0: 55717.6. Samples: 969441240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:59:23,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 02:59:26,308][47288] Updated weights for policy 0, policy_version 62266 (0.0027) [2024-04-26 02:59:28,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 1020280832. Throughput: 0: 55311.9. Samples: 969601880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:59:28,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 02:59:29,499][47288] Updated weights for policy 0, policy_version 62276 (0.0025) [2024-04-26 02:59:32,148][47288] Updated weights for policy 0, policy_version 62286 (0.0032) [2024-04-26 02:59:33,923][47056] Fps is (10 sec: 52427.7, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 1020542976. Throughput: 0: 55470.5. Samples: 969940240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:59:33,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 02:59:35,337][47288] Updated weights for policy 0, policy_version 62296 (0.0029) [2024-04-26 02:59:38,053][47288] Updated weights for policy 0, policy_version 62306 (0.0027) [2024-04-26 02:59:38,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 1020837888. Throughput: 0: 55491.0. Samples: 970272460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:59:38,923][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 02:59:41,177][47288] Updated weights for policy 0, policy_version 62316 (0.0031) [2024-04-26 02:59:43,776][47288] Updated weights for policy 0, policy_version 62326 (0.0030) [2024-04-26 02:59:43,923][47056] Fps is (10 sec: 60620.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1021149184. Throughput: 0: 55575.2. Samples: 970437180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:59:43,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 02:59:47,023][47288] Updated weights for policy 0, policy_version 62336 (0.0025) [2024-04-26 02:59:48,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1021427712. Throughput: 0: 55553.9. Samples: 970775200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:59:48,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 02:59:49,600][47288] Updated weights for policy 0, policy_version 62346 (0.0035) [2024-04-26 02:59:53,008][47288] Updated weights for policy 0, policy_version 62356 (0.0029) [2024-04-26 02:59:53,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1021689856. Throughput: 0: 55459.1. Samples: 971105980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 02:59:53,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 02:59:55,539][47288] Updated weights for policy 0, policy_version 62366 (0.0026) [2024-04-26 02:59:58,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 1021952000. Throughput: 0: 55515.7. Samples: 971273600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 02:59:58,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 02:59:59,060][47288] Updated weights for policy 0, policy_version 62376 (0.0026) [2024-04-26 03:00:01,439][47288] Updated weights for policy 0, policy_version 62386 (0.0026) [2024-04-26 03:00:03,923][47056] Fps is (10 sec: 50790.5, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 1022197760. Throughput: 0: 55595.7. Samples: 971607220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 03:00:03,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 03:00:04,896][47288] Updated weights for policy 0, policy_version 62396 (0.0028) [2024-04-26 03:00:06,477][47267] Signal inference workers to stop experience collection... (14700 times) [2024-04-26 03:00:06,478][47267] Signal inference workers to resume experience collection... (14700 times) [2024-04-26 03:00:06,502][47288] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-04-26 03:00:06,502][47288] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-04-26 03:00:07,162][47288] Updated weights for policy 0, policy_version 62406 (0.0028) [2024-04-26 03:00:08,923][47056] Fps is (10 sec: 52428.3, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 1022476288. Throughput: 0: 55652.6. Samples: 971945620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 03:00:08,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 03:00:10,642][47288] Updated weights for policy 0, policy_version 62416 (0.0029) [2024-04-26 03:00:13,131][47288] Updated weights for policy 0, policy_version 62426 (0.0028) [2024-04-26 03:00:13,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55159.7, 300 sec: 55650.1). Total num frames: 1022787584. Throughput: 0: 55722.9. Samples: 972109400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 03:00:13,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 03:00:16,638][47288] Updated weights for policy 0, policy_version 62436 (0.0026) [2024-04-26 03:00:18,923][47056] Fps is (10 sec: 62259.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1023098880. Throughput: 0: 55580.9. Samples: 972441380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 03:00:18,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 03:00:19,122][47288] Updated weights for policy 0, policy_version 62446 (0.0027) [2024-04-26 03:00:22,591][47288] Updated weights for policy 0, policy_version 62456 (0.0031) [2024-04-26 03:00:23,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 1023377408. Throughput: 0: 55534.6. Samples: 972771520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 03:00:23,923][47056] Avg episode reward: [(0, '0.278')] [2024-04-26 03:00:24,836][47288] Updated weights for policy 0, policy_version 62466 (0.0034) [2024-04-26 03:00:28,385][47288] Updated weights for policy 0, policy_version 62476 (0.0030) [2024-04-26 03:00:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 1023639552. Throughput: 0: 55845.9. Samples: 972950240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 03:00:28,923][47056] Avg episode reward: [(0, '0.311')] [2024-04-26 03:00:30,914][47288] Updated weights for policy 0, policy_version 62486 (0.0028) [2024-04-26 03:00:33,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 1023901696. Throughput: 0: 55835.1. Samples: 973287780. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 03:00:33,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 03:00:34,187][47288] Updated weights for policy 0, policy_version 62496 (0.0026) [2024-04-26 03:00:37,085][47288] Updated weights for policy 0, policy_version 62506 (0.0028) [2024-04-26 03:00:38,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 1024163840. Throughput: 0: 55897.9. Samples: 973621380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 03:00:38,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 03:00:40,141][47288] Updated weights for policy 0, policy_version 62516 (0.0028) [2024-04-26 03:00:43,000][47288] Updated weights for policy 0, policy_version 62526 (0.0030) [2024-04-26 03:00:43,923][47056] Fps is (10 sec: 54066.4, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 1024442368. Throughput: 0: 55563.4. Samples: 973773960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 03:00:43,924][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 03:00:45,991][47288] Updated weights for policy 0, policy_version 62536 (0.0026) [2024-04-26 03:00:48,910][47288] Updated weights for policy 0, policy_version 62546 (0.0031) [2024-04-26 03:00:48,923][47056] Fps is (10 sec: 58981.5, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1024753664. Throughput: 0: 55589.2. Samples: 974108740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 03:00:48,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 03:00:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000062546_1024753664.pth... [2024-04-26 03:00:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000061729_1011367936.pth [2024-04-26 03:00:52,043][47288] Updated weights for policy 0, policy_version 62556 (0.0025) [2024-04-26 03:00:53,923][47056] Fps is (10 sec: 60621.1, 60 sec: 55978.6, 300 sec: 55650.1). 
Total num frames: 1025048576. Throughput: 0: 55459.1. Samples: 974441280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 03:00:53,925][47056] Avg episode reward: [(0, '0.308')] [2024-04-26 03:00:54,648][47288] Updated weights for policy 0, policy_version 62566 (0.0023) [2024-04-26 03:00:57,765][47288] Updated weights for policy 0, policy_version 62576 (0.0029) [2024-04-26 03:00:58,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 1025327104. Throughput: 0: 55932.4. Samples: 974626360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 03:00:58,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 03:01:01,014][47288] Updated weights for policy 0, policy_version 62586 (0.0034) [2024-04-26 03:01:03,664][47288] Updated weights for policy 0, policy_version 62596 (0.0034) [2024-04-26 03:01:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.7, 300 sec: 55705.6). Total num frames: 1025589248. Throughput: 0: 55910.2. Samples: 974957340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 03:01:03,932][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 03:01:04,648][47267] Signal inference workers to stop experience collection... (14750 times) [2024-04-26 03:01:04,648][47267] Signal inference workers to resume experience collection... (14750 times) [2024-04-26 03:01:04,664][47288] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-04-26 03:01:04,664][47288] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-04-26 03:01:06,692][47288] Updated weights for policy 0, policy_version 62606 (0.0031) [2024-04-26 03:01:08,923][47056] Fps is (10 sec: 50790.5, 60 sec: 55978.8, 300 sec: 55594.6). Total num frames: 1025835008. Throughput: 0: 56013.4. Samples: 975292120. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 03:01:08,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 03:01:09,671][47288] Updated weights for policy 0, policy_version 62616 (0.0027) [2024-04-26 03:01:12,340][47288] Updated weights for policy 0, policy_version 62626 (0.0029) [2024-04-26 03:01:13,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 1026113536. Throughput: 0: 55508.6. Samples: 975448140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 03:01:13,923][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 03:01:15,499][47288] Updated weights for policy 0, policy_version 62636 (0.0031) [2024-04-26 03:01:18,156][47288] Updated weights for policy 0, policy_version 62646 (0.0027) [2024-04-26 03:01:18,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 1026408448. Throughput: 0: 55472.4. Samples: 975784040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 03:01:18,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 03:01:21,303][47288] Updated weights for policy 0, policy_version 62656 (0.0029) [2024-04-26 03:01:23,923][47056] Fps is (10 sec: 58983.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1026703360. Throughput: 0: 55511.0. Samples: 976119380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 03:01:23,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 03:01:24,112][47288] Updated weights for policy 0, policy_version 62666 (0.0031) [2024-04-26 03:01:27,425][47288] Updated weights for policy 0, policy_version 62676 (0.0036) [2024-04-26 03:01:28,923][47056] Fps is (10 sec: 60620.7, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 1027014656. Throughput: 0: 56030.8. Samples: 976295340. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 03:01:28,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 03:01:30,132][47288] Updated weights for policy 0, policy_version 62686 (0.0031) [2024-04-26 03:01:33,348][47288] Updated weights for policy 0, policy_version 62696 (0.0036) [2024-04-26 03:01:33,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 1027276800. Throughput: 0: 55995.9. Samples: 976628560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 03:01:33,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 03:01:36,060][47288] Updated weights for policy 0, policy_version 62706 (0.0030) [2024-04-26 03:01:38,923][47056] Fps is (10 sec: 49152.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 1027506176. Throughput: 0: 55979.7. Samples: 976960360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 03:01:38,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 03:01:39,155][47288] Updated weights for policy 0, policy_version 62716 (0.0030) [2024-04-26 03:01:41,769][47288] Updated weights for policy 0, policy_version 62726 (0.0029) [2024-04-26 03:01:43,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 1027801088. Throughput: 0: 55432.4. Samples: 977120820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 03:01:43,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 03:01:44,912][47288] Updated weights for policy 0, policy_version 62736 (0.0028) [2024-04-26 03:01:47,634][47288] Updated weights for policy 0, policy_version 62746 (0.0027) [2024-04-26 03:01:48,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 1028079616. Throughput: 0: 55570.7. Samples: 977458020. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 03:01:48,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 03:01:50,950][47288] Updated weights for policy 0, policy_version 62756 (0.0033) [2024-04-26 03:01:53,425][47288] Updated weights for policy 0, policy_version 62766 (0.0028) [2024-04-26 03:01:53,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 1028358144. Throughput: 0: 55489.3. Samples: 977789140. Policy #0 lag: (min: 0.0, avg: 14.4, max: 29.0) [2024-04-26 03:01:53,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 03:01:56,822][47288] Updated weights for policy 0, policy_version 62776 (0.0028) [2024-04-26 03:01:58,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 1028653056. Throughput: 0: 55909.9. Samples: 977964080. Policy #0 lag: (min: 0.0, avg: 14.4, max: 29.0) [2024-04-26 03:01:58,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 03:01:59,452][47288] Updated weights for policy 0, policy_version 62786 (0.0029) [2024-04-26 03:02:02,635][47288] Updated weights for policy 0, policy_version 62796 (0.0035) [2024-04-26 03:02:02,857][47267] Signal inference workers to stop experience collection... (14800 times) [2024-04-26 03:02:02,892][47288] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-04-26 03:02:02,916][47267] Signal inference workers to resume experience collection... (14800 times) [2024-04-26 03:02:02,916][47288] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-04-26 03:02:03,923][47056] Fps is (10 sec: 58981.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1028947968. Throughput: 0: 55883.3. Samples: 978298800. 
Policy #0 lag: (min: 0.0, avg: 14.4, max: 29.0) [2024-04-26 03:02:03,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 03:02:05,223][47288] Updated weights for policy 0, policy_version 62806 (0.0034) [2024-04-26 03:02:08,358][47288] Updated weights for policy 0, policy_version 62816 (0.0027) [2024-04-26 03:02:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 1029226496. Throughput: 0: 55807.9. Samples: 978630740. Policy #0 lag: (min: 0.0, avg: 14.4, max: 29.0) [2024-04-26 03:02:08,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 03:02:11,625][47288] Updated weights for policy 0, policy_version 62826 (0.0031) [2024-04-26 03:02:13,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 1029472256. Throughput: 0: 55752.4. Samples: 978804200. Policy #0 lag: (min: 0.0, avg: 14.4, max: 29.0) [2024-04-26 03:02:13,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 03:02:14,304][47288] Updated weights for policy 0, policy_version 62836 (0.0030) [2024-04-26 03:02:17,478][47288] Updated weights for policy 0, policy_version 62846 (0.0031) [2024-04-26 03:02:18,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 1029750784. Throughput: 0: 55717.4. Samples: 979135840. Policy #0 lag: (min: 0.0, avg: 14.4, max: 29.0) [2024-04-26 03:02:18,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 03:02:20,062][47288] Updated weights for policy 0, policy_version 62856 (0.0037) [2024-04-26 03:02:23,203][47288] Updated weights for policy 0, policy_version 62866 (0.0036) [2024-04-26 03:02:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 1030029312. Throughput: 0: 55807.0. Samples: 979471680. 
Policy #0 lag: (min: 0.0, avg: 14.4, max: 29.0) [2024-04-26 03:02:23,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 03:02:25,963][47288] Updated weights for policy 0, policy_version 62876 (0.0031) [2024-04-26 03:02:28,923][47056] Fps is (10 sec: 55706.3, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 1030307840. Throughput: 0: 55727.2. Samples: 979628540. Policy #0 lag: (min: 0.0, avg: 14.4, max: 29.0) [2024-04-26 03:02:28,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 03:02:28,994][47288] Updated weights for policy 0, policy_version 62886 (0.0031) [2024-04-26 03:02:31,808][47288] Updated weights for policy 0, policy_version 62896 (0.0031) [2024-04-26 03:02:33,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1030619136. Throughput: 0: 55744.0. Samples: 979966500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:02:33,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 03:02:34,780][47288] Updated weights for policy 0, policy_version 62906 (0.0034) [2024-04-26 03:02:37,610][47288] Updated weights for policy 0, policy_version 62916 (0.0040) [2024-04-26 03:02:38,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 1030897664. Throughput: 0: 55832.9. Samples: 980301620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:02:38,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 03:02:40,744][47288] Updated weights for policy 0, policy_version 62926 (0.0030) [2024-04-26 03:02:43,443][47288] Updated weights for policy 0, policy_version 62936 (0.0029) [2024-04-26 03:02:43,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 1031159808. Throughput: 0: 55777.0. Samples: 980474040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:02:43,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 03:02:46,784][47288] Updated weights for policy 0, policy_version 62946 (0.0031) [2024-04-26 03:02:48,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 1031421952. Throughput: 0: 55783.6. Samples: 980809060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:02:48,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 03:02:49,062][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000062954_1031438336.pth... [2024-04-26 03:02:49,105][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000062139_1018085376.pth [2024-04-26 03:02:49,347][47288] Updated weights for policy 0, policy_version 62956 (0.0039) [2024-04-26 03:02:52,549][47288] Updated weights for policy 0, policy_version 62966 (0.0027) [2024-04-26 03:02:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1031700480. Throughput: 0: 55841.8. Samples: 981143620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:02:53,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 03:02:55,225][47288] Updated weights for policy 0, policy_version 62976 (0.0030) [2024-04-26 03:02:58,450][47288] Updated weights for policy 0, policy_version 62986 (0.0029) [2024-04-26 03:02:58,923][47056] Fps is (10 sec: 55707.1, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 1031979008. Throughput: 0: 55529.5. Samples: 981303020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:02:58,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 03:03:01,101][47288] Updated weights for policy 0, policy_version 62996 (0.0029) [2024-04-26 03:03:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55159.6, 300 sec: 55650.0). Total num frames: 1032257536. Throughput: 0: 55602.3. Samples: 981637940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:03:03,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 03:03:04,260][47288] Updated weights for policy 0, policy_version 63006 (0.0028) [2024-04-26 03:03:06,995][47288] Updated weights for policy 0, policy_version 63016 (0.0026) [2024-04-26 03:03:07,256][47267] Signal inference workers to stop experience collection... (14850 times) [2024-04-26 03:03:07,290][47288] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-04-26 03:03:07,342][47267] Signal inference workers to resume experience collection... (14850 times) [2024-04-26 03:03:07,343][47288] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-04-26 03:03:08,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 1032552448. Throughput: 0: 55648.6. Samples: 981975860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:03:08,923][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 03:03:10,121][47288] Updated weights for policy 0, policy_version 63026 (0.0033) [2024-04-26 03:03:12,853][47288] Updated weights for policy 0, policy_version 63036 (0.0025) [2024-04-26 03:03:13,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 1032863744. Throughput: 0: 55987.0. Samples: 982147960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:03:13,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 03:03:15,875][47288] Updated weights for policy 0, policy_version 63046 (0.0029) [2024-04-26 03:03:18,779][47288] Updated weights for policy 0, policy_version 63056 (0.0028) [2024-04-26 03:03:18,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 1033109504. Throughput: 0: 56005.3. Samples: 982486740. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:03:18,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 03:03:21,586][47288] Updated weights for policy 0, policy_version 63066 (0.0026) [2024-04-26 03:03:23,923][47056] Fps is (10 sec: 50790.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1033371648. Throughput: 0: 55980.0. Samples: 982820720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:03:23,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 03:03:24,612][47288] Updated weights for policy 0, policy_version 63076 (0.0030) [2024-04-26 03:03:27,808][47288] Updated weights for policy 0, policy_version 63086 (0.0031) [2024-04-26 03:03:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 1033650176. Throughput: 0: 55685.4. Samples: 982979880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:03:28,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 03:03:30,428][47288] Updated weights for policy 0, policy_version 63096 (0.0028) [2024-04-26 03:03:33,593][47288] Updated weights for policy 0, policy_version 63106 (0.0027) [2024-04-26 03:03:33,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 1033961472. Throughput: 0: 55727.3. Samples: 983316780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:03:33,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 03:03:36,247][47288] Updated weights for policy 0, policy_version 63116 (0.0024) [2024-04-26 03:03:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1034223616. Throughput: 0: 55856.8. Samples: 983657180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:03:38,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 03:03:39,388][47288] Updated weights for policy 0, policy_version 63126 (0.0026) [2024-04-26 03:03:42,203][47288] Updated weights for policy 0, policy_version 63136 (0.0029) [2024-04-26 03:03:43,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1034502144. Throughput: 0: 55989.2. Samples: 983822540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:03:43,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 03:03:45,318][47288] Updated weights for policy 0, policy_version 63146 (0.0030) [2024-04-26 03:03:48,014][47288] Updated weights for policy 0, policy_version 63156 (0.0029) [2024-04-26 03:03:48,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56252.0, 300 sec: 55816.7). Total num frames: 1034797056. Throughput: 0: 56051.7. Samples: 984160260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:03:48,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:03:51,112][47288] Updated weights for policy 0, policy_version 63166 (0.0028) [2024-04-26 03:03:53,876][47288] Updated weights for policy 0, policy_version 63176 (0.0032) [2024-04-26 03:03:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 1035075584. Throughput: 0: 55832.9. Samples: 984488340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-26 03:03:53,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 03:03:56,846][47288] Updated weights for policy 0, policy_version 63186 (0.0028) [2024-04-26 03:03:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 1035337728. Throughput: 0: 55861.0. Samples: 984661700. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-26 03:03:58,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 03:03:59,817][47288] Updated weights for policy 0, policy_version 63196 (0.0026) [2024-04-26 03:04:02,738][47288] Updated weights for policy 0, policy_version 63206 (0.0027) [2024-04-26 03:04:03,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1035616256. Throughput: 0: 55755.6. Samples: 984995740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-26 03:04:03,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 03:04:03,924][47267] Saving new best policy, reward=0.549! [2024-04-26 03:04:04,378][47267] Signal inference workers to stop experience collection... (14900 times) [2024-04-26 03:04:04,387][47267] Signal inference workers to resume experience collection... (14900 times) [2024-04-26 03:04:04,409][47288] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-04-26 03:04:04,409][47288] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-04-26 03:04:05,590][47288] Updated weights for policy 0, policy_version 63216 (0.0035) [2024-04-26 03:04:08,712][47288] Updated weights for policy 0, policy_version 63226 (0.0028) [2024-04-26 03:04:08,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1035911168. Throughput: 0: 55808.8. Samples: 985332120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-26 03:04:08,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 03:04:11,455][47288] Updated weights for policy 0, policy_version 63236 (0.0029) [2024-04-26 03:04:13,923][47056] Fps is (10 sec: 54067.9, 60 sec: 54886.5, 300 sec: 55594.6). Total num frames: 1036156928. Throughput: 0: 55840.5. Samples: 985492700. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-26 03:04:13,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 03:04:14,531][47288] Updated weights for policy 0, policy_version 63246 (0.0024) [2024-04-26 03:04:17,256][47288] Updated weights for policy 0, policy_version 63256 (0.0024) [2024-04-26 03:04:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 1036468224. Throughput: 0: 55809.6. Samples: 985828220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-26 03:04:18,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 03:04:20,264][47288] Updated weights for policy 0, policy_version 63266 (0.0030) [2024-04-26 03:04:23,023][47288] Updated weights for policy 0, policy_version 63276 (0.0036) [2024-04-26 03:04:23,923][47056] Fps is (10 sec: 60620.9, 60 sec: 56524.8, 300 sec: 55872.3). Total num frames: 1036763136. Throughput: 0: 55685.5. Samples: 986163020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-26 03:04:23,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 03:04:26,185][47288] Updated weights for policy 0, policy_version 63286 (0.0033) [2024-04-26 03:04:28,908][47288] Updated weights for policy 0, policy_version 63296 (0.0031) [2024-04-26 03:04:28,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 1037041664. Throughput: 0: 55822.2. Samples: 986334540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:04:28,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 03:04:32,110][47288] Updated weights for policy 0, policy_version 63306 (0.0030) [2024-04-26 03:04:33,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 1037303808. Throughput: 0: 55792.3. Samples: 986670920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:04:33,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 03:04:34,825][47288] Updated weights for policy 0, policy_version 63316 (0.0030) [2024-04-26 03:04:38,040][47288] Updated weights for policy 0, policy_version 63326 (0.0030) [2024-04-26 03:04:38,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1037565952. Throughput: 0: 55989.0. Samples: 987007840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:04:38,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 03:04:40,588][47288] Updated weights for policy 0, policy_version 63336 (0.0031) [2024-04-26 03:04:43,751][47288] Updated weights for policy 0, policy_version 63346 (0.0034) [2024-04-26 03:04:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1037860864. Throughput: 0: 55746.2. Samples: 987170280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:04:43,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 03:04:46,630][47288] Updated weights for policy 0, policy_version 63356 (0.0034) [2024-04-26 03:04:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1038123008. Throughput: 0: 55741.9. Samples: 987504120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:04:48,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 03:04:49,074][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000063364_1038155776.pth... [2024-04-26 03:04:49,121][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000062546_1024753664.pth [2024-04-26 03:04:49,128][47267] Saving new best policy, reward=0.564! 
[2024-04-26 03:04:50,001][47288] Updated weights for policy 0, policy_version 63366 (0.0033) [2024-04-26 03:04:52,623][47288] Updated weights for policy 0, policy_version 63376 (0.0032) [2024-04-26 03:04:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 1038434304. Throughput: 0: 55622.8. Samples: 987835140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:04:53,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 03:04:55,841][47288] Updated weights for policy 0, policy_version 63386 (0.0028) [2024-04-26 03:04:58,415][47288] Updated weights for policy 0, policy_version 63396 (0.0039) [2024-04-26 03:04:58,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 1038696448. Throughput: 0: 55963.4. Samples: 988011060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:04:58,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 03:05:01,620][47288] Updated weights for policy 0, policy_version 63406 (0.0030) [2024-04-26 03:05:03,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 1038974976. Throughput: 0: 55810.7. Samples: 988339700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:05:03,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 03:05:04,473][47288] Updated weights for policy 0, policy_version 63416 (0.0029) [2024-04-26 03:05:06,849][47267] Signal inference workers to stop experience collection... (14950 times) [2024-04-26 03:05:06,850][47267] Signal inference workers to resume experience collection... 
(14950 times) [2024-04-26 03:05:06,869][47288] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-04-26 03:05:06,869][47288] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-04-26 03:05:07,490][47288] Updated weights for policy 0, policy_version 63426 (0.0033) [2024-04-26 03:05:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 1039253504. Throughput: 0: 55792.3. Samples: 988673680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 03:05:08,923][47056] Avg episode reward: [(0, '0.354')] [2024-04-26 03:05:10,339][47288] Updated weights for policy 0, policy_version 63436 (0.0029) [2024-04-26 03:05:13,462][47288] Updated weights for policy 0, policy_version 63446 (0.0030) [2024-04-26 03:05:13,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 1039515648. Throughput: 0: 55604.5. Samples: 988836740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 03:05:13,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:05:16,336][47288] Updated weights for policy 0, policy_version 63456 (0.0023) [2024-04-26 03:05:18,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 1039794176. Throughput: 0: 55476.0. Samples: 989167340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 03:05:18,924][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 03:05:19,383][47288] Updated weights for policy 0, policy_version 63466 (0.0028) [2024-04-26 03:05:22,122][47288] Updated weights for policy 0, policy_version 63476 (0.0028) [2024-04-26 03:05:23,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 1040072704. Throughput: 0: 55434.1. Samples: 989502380. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 03:05:23,924][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 03:05:25,126][47288] Updated weights for policy 0, policy_version 63486 (0.0031) [2024-04-26 03:05:27,999][47288] Updated weights for policy 0, policy_version 63496 (0.0029) [2024-04-26 03:05:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 1040367616. Throughput: 0: 55440.8. Samples: 989665120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 03:05:28,923][47056] Avg episode reward: [(0, '0.338')] [2024-04-26 03:05:30,847][47288] Updated weights for policy 0, policy_version 63506 (0.0029) [2024-04-26 03:05:33,743][47288] Updated weights for policy 0, policy_version 63516 (0.0031) [2024-04-26 03:05:33,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 1040646144. Throughput: 0: 55577.8. Samples: 990005120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 03:05:33,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 03:05:37,167][47288] Updated weights for policy 0, policy_version 63526 (0.0026) [2024-04-26 03:05:38,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 1040908288. Throughput: 0: 55629.3. Samples: 990338460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 03:05:38,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 03:05:39,760][47288] Updated weights for policy 0, policy_version 63536 (0.0029) [2024-04-26 03:05:42,851][47288] Updated weights for policy 0, policy_version 63546 (0.0029) [2024-04-26 03:05:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 1041203200. Throughput: 0: 55462.8. Samples: 990506880. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 03:05:43,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 03:05:45,675][47288] Updated weights for policy 0, policy_version 63556 (0.0029) [2024-04-26 03:05:48,581][47288] Updated weights for policy 0, policy_version 63566 (0.0029) [2024-04-26 03:05:48,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1041465344. Throughput: 0: 55736.2. Samples: 990847820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-26 03:05:48,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 03:05:51,419][47288] Updated weights for policy 0, policy_version 63576 (0.0027) [2024-04-26 03:05:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1041760256. Throughput: 0: 55713.4. Samples: 991180780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-26 03:05:53,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 03:05:54,481][47288] Updated weights for policy 0, policy_version 63586 (0.0027) [2024-04-26 03:05:57,312][47288] Updated weights for policy 0, policy_version 63596 (0.0026) [2024-04-26 03:05:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1042022400. Throughput: 0: 55816.8. Samples: 991348500. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-26 03:05:58,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 03:06:00,354][47288] Updated weights for policy 0, policy_version 63606 (0.0028) [2024-04-26 03:06:03,145][47288] Updated weights for policy 0, policy_version 63616 (0.0033) [2024-04-26 03:06:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 1042300928. Throughput: 0: 55793.9. Samples: 991678060. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-26 03:06:03,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 03:06:06,248][47288] Updated weights for policy 0, policy_version 63626 (0.0030) [2024-04-26 03:06:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 1042595840. Throughput: 0: 55775.2. Samples: 992012260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-26 03:06:08,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 03:06:08,961][47288] Updated weights for policy 0, policy_version 63636 (0.0028) [2024-04-26 03:06:11,915][47267] Signal inference workers to stop experience collection... (15000 times) [2024-04-26 03:06:11,915][47267] Signal inference workers to resume experience collection... (15000 times) [2024-04-26 03:06:11,947][47288] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-04-26 03:06:11,947][47288] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-04-26 03:06:12,020][47288] Updated weights for policy 0, policy_version 63646 (0.0030) [2024-04-26 03:06:13,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1042874368. Throughput: 0: 55917.1. Samples: 992181380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-26 03:06:13,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 03:06:14,845][47288] Updated weights for policy 0, policy_version 63656 (0.0029) [2024-04-26 03:06:17,880][47288] Updated weights for policy 0, policy_version 63666 (0.0026) [2024-04-26 03:06:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 1043152896. Throughput: 0: 55988.8. Samples: 992524620. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-26 03:06:18,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 03:06:20,634][47288] Updated weights for policy 0, policy_version 63676 (0.0025) [2024-04-26 03:06:23,754][47288] Updated weights for policy 0, policy_version 63686 (0.0028) [2024-04-26 03:06:23,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 1043431424. Throughput: 0: 55999.4. Samples: 992858440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-26 03:06:23,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 03:06:26,515][47288] Updated weights for policy 0, policy_version 63696 (0.0033) [2024-04-26 03:06:28,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1043709952. Throughput: 0: 55921.1. Samples: 993023340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 03:06:28,923][47056] Avg episode reward: [(0, '0.329')] [2024-04-26 03:06:29,646][47288] Updated weights for policy 0, policy_version 63706 (0.0031) [2024-04-26 03:06:32,420][47288] Updated weights for policy 0, policy_version 63716 (0.0028) [2024-04-26 03:06:33,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 1043988480. Throughput: 0: 55745.4. Samples: 993356360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 03:06:33,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 03:06:35,362][47288] Updated weights for policy 0, policy_version 63726 (0.0033) [2024-04-26 03:06:38,126][47288] Updated weights for policy 0, policy_version 63736 (0.0028) [2024-04-26 03:06:38,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 1044267008. Throughput: 0: 55839.0. Samples: 993693540. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 03:06:38,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 03:06:41,045][47288] Updated weights for policy 0, policy_version 63746 (0.0030) [2024-04-26 03:06:43,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 1044561920. Throughput: 0: 55878.2. Samples: 993863020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 03:06:43,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 03:06:43,974][47288] Updated weights for policy 0, policy_version 63756 (0.0031) [2024-04-26 03:06:47,024][47288] Updated weights for policy 0, policy_version 63766 (0.0027) [2024-04-26 03:06:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 1044824064. Throughput: 0: 56044.8. Samples: 994200080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 03:06:48,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 03:06:49,058][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000063772_1044840448.pth... [2024-04-26 03:06:49,106][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000062954_1031438336.pth [2024-04-26 03:06:49,998][47288] Updated weights for policy 0, policy_version 63776 (0.0031) [2024-04-26 03:06:53,003][47288] Updated weights for policy 0, policy_version 63786 (0.0029) [2024-04-26 03:06:53,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 1045102592. Throughput: 0: 55942.5. Samples: 994529680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 03:06:53,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 03:06:55,859][47288] Updated weights for policy 0, policy_version 63796 (0.0031) [2024-04-26 03:06:58,786][47288] Updated weights for policy 0, policy_version 63806 (0.0028) [2024-04-26 03:06:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55761.2). 
Total num frames: 1045397504. Throughput: 0: 56008.6. Samples: 994701780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 03:06:58,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 03:07:01,606][47288] Updated weights for policy 0, policy_version 63816 (0.0029) [2024-04-26 03:07:03,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1045659648. Throughput: 0: 55836.1. Samples: 995037240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 03:07:03,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 03:07:04,540][47288] Updated weights for policy 0, policy_version 63826 (0.0030) [2024-04-26 03:07:07,599][47288] Updated weights for policy 0, policy_version 63836 (0.0034) [2024-04-26 03:07:08,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 1045954560. Throughput: 0: 55903.5. Samples: 995374100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 03:07:08,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 03:07:10,650][47288] Updated weights for policy 0, policy_version 63846 (0.0024) [2024-04-26 03:07:13,465][47288] Updated weights for policy 0, policy_version 63856 (0.0033) [2024-04-26 03:07:13,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.4, 300 sec: 55816.7). Total num frames: 1046216704. Throughput: 0: 55877.4. Samples: 995537820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 03:07:13,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 03:07:16,275][47267] Signal inference workers to stop experience collection... (15050 times) [2024-04-26 03:07:16,305][47288] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-04-26 03:07:16,330][47267] Signal inference workers to resume experience collection... 
(15050 times) [2024-04-26 03:07:16,331][47288] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-04-26 03:07:16,443][47288] Updated weights for policy 0, policy_version 63866 (0.0031) [2024-04-26 03:07:18,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 1046511616. Throughput: 0: 56015.6. Samples: 995877080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 03:07:18,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 03:07:19,294][47288] Updated weights for policy 0, policy_version 63876 (0.0028) [2024-04-26 03:07:22,216][47288] Updated weights for policy 0, policy_version 63886 (0.0037) [2024-04-26 03:07:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 1046757376. Throughput: 0: 55945.7. Samples: 996211100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 03:07:23,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 03:07:25,248][47288] Updated weights for policy 0, policy_version 63896 (0.0027) [2024-04-26 03:07:28,033][47288] Updated weights for policy 0, policy_version 63906 (0.0028) [2024-04-26 03:07:28,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1047052288. Throughput: 0: 55779.5. Samples: 996373100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 03:07:28,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 03:07:31,184][47288] Updated weights for policy 0, policy_version 63916 (0.0032) [2024-04-26 03:07:33,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1047330816. Throughput: 0: 55684.6. Samples: 996705880. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 03:07:33,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 03:07:34,228][47288] Updated weights for policy 0, policy_version 63926 (0.0026) [2024-04-26 03:07:37,339][47288] Updated weights for policy 0, policy_version 63936 (0.0027) [2024-04-26 03:07:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1047609344. Throughput: 0: 55682.4. Samples: 997035380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 03:07:38,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 03:07:40,025][47288] Updated weights for policy 0, policy_version 63946 (0.0027) [2024-04-26 03:07:43,104][47288] Updated weights for policy 0, policy_version 63956 (0.0033) [2024-04-26 03:07:43,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 1047887872. Throughput: 0: 55718.7. Samples: 997209120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 03:07:43,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 03:07:45,791][47288] Updated weights for policy 0, policy_version 63966 (0.0035) [2024-04-26 03:07:48,812][47288] Updated weights for policy 0, policy_version 63976 (0.0028) [2024-04-26 03:07:48,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 1048182784. Throughput: 0: 55797.8. Samples: 997548140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 03:07:48,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 03:07:51,636][47288] Updated weights for policy 0, policy_version 63986 (0.0031) [2024-04-26 03:07:53,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 1048444928. Throughput: 0: 55658.2. Samples: 997878720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 03:07:53,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 03:07:54,805][47288] Updated weights for policy 0, policy_version 63996 (0.0039) [2024-04-26 03:07:57,439][47288] Updated weights for policy 0, policy_version 64006 (0.0030) [2024-04-26 03:07:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 1048723456. Throughput: 0: 55658.8. Samples: 998042460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 03:07:58,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 03:08:00,801][47288] Updated weights for policy 0, policy_version 64016 (0.0024) [2024-04-26 03:08:03,338][47288] Updated weights for policy 0, policy_version 64026 (0.0033) [2024-04-26 03:08:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1049001984. Throughput: 0: 55573.9. Samples: 998377900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 03:08:03,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 03:08:06,536][47288] Updated weights for policy 0, policy_version 64036 (0.0028) [2024-04-26 03:08:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1049296896. Throughput: 0: 55634.7. Samples: 998714660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 03:08:08,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 03:08:09,163][47288] Updated weights for policy 0, policy_version 64046 (0.0034) [2024-04-26 03:08:12,198][47288] Updated weights for policy 0, policy_version 64056 (0.0027) [2024-04-26 03:08:13,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 1049575424. Throughput: 0: 55769.1. Samples: 998882700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 03:08:13,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 03:08:15,237][47288] Updated weights for policy 0, policy_version 64066 (0.0028) [2024-04-26 03:08:18,240][47288] Updated weights for policy 0, policy_version 64076 (0.0031) [2024-04-26 03:08:18,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.9, 300 sec: 55872.2). Total num frames: 1049853952. Throughput: 0: 55786.2. Samples: 999216260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 03:08:18,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 03:08:20,974][47288] Updated weights for policy 0, policy_version 64086 (0.0032) [2024-04-26 03:08:23,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1050116096. Throughput: 0: 55863.2. Samples: 999549220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 03:08:23,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 03:08:24,214][47288] Updated weights for policy 0, policy_version 64096 (0.0034) [2024-04-26 03:08:26,862][47288] Updated weights for policy 0, policy_version 64106 (0.0032) [2024-04-26 03:08:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1050394624. Throughput: 0: 55633.9. Samples: 999712640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 03:08:28,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 03:08:30,186][47288] Updated weights for policy 0, policy_version 64116 (0.0035) [2024-04-26 03:08:32,752][47288] Updated weights for policy 0, policy_version 64126 (0.0030) [2024-04-26 03:08:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1050673152. Throughput: 0: 55585.2. Samples: 1000049480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 03:08:33,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 03:08:35,737][47267] Signal inference workers to stop experience collection... 
(15100 times) [2024-04-26 03:08:35,781][47288] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-04-26 03:08:35,792][47267] Signal inference workers to resume experience collection... (15100 times) [2024-04-26 03:08:35,798][47288] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-04-26 03:08:35,896][47288] Updated weights for policy 0, policy_version 64136 (0.0027) [2024-04-26 03:08:38,527][47288] Updated weights for policy 0, policy_version 64146 (0.0032) [2024-04-26 03:08:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1050968064. Throughput: 0: 55661.1. Samples: 1000383460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 03:08:38,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 03:08:41,744][47288] Updated weights for policy 0, policy_version 64156 (0.0031) [2024-04-26 03:08:43,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 1051246592. Throughput: 0: 55865.1. Samples: 1000556400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 03:08:43,924][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 03:08:44,286][47288] Updated weights for policy 0, policy_version 64166 (0.0026) [2024-04-26 03:08:47,688][47288] Updated weights for policy 0, policy_version 64176 (0.0033) [2024-04-26 03:08:48,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1051525120. Throughput: 0: 55803.1. Samples: 1000889040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 03:08:48,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 03:08:48,999][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000064181_1051541504.pth... 
[2024-04-26 03:08:49,043][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000063364_1038155776.pth [2024-04-26 03:08:50,490][47288] Updated weights for policy 0, policy_version 64186 (0.0028) [2024-04-26 03:08:53,561][47288] Updated weights for policy 0, policy_version 64196 (0.0029) [2024-04-26 03:08:53,923][47056] Fps is (10 sec: 55707.0, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 1051803648. Throughput: 0: 55814.7. Samples: 1001226320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 03:08:53,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 03:08:56,305][47288] Updated weights for policy 0, policy_version 64206 (0.0036) [2024-04-26 03:08:58,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 1052065792. Throughput: 0: 55728.0. Samples: 1001390460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 03:08:58,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 03:08:59,438][47288] Updated weights for policy 0, policy_version 64216 (0.0029) [2024-04-26 03:09:02,135][47288] Updated weights for policy 0, policy_version 64226 (0.0036) [2024-04-26 03:09:03,923][47056] Fps is (10 sec: 54066.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1052344320. Throughput: 0: 55771.7. Samples: 1001726000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 03:09:03,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 03:09:05,263][47288] Updated weights for policy 0, policy_version 64236 (0.0027) [2024-04-26 03:09:07,873][47288] Updated weights for policy 0, policy_version 64246 (0.0027) [2024-04-26 03:09:08,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 1052622848. Throughput: 0: 55821.3. Samples: 1002061180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 03:09:08,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 03:09:11,196][47288] Updated weights for policy 0, policy_version 64256 (0.0030) [2024-04-26 03:09:13,792][47288] Updated weights for policy 0, policy_version 64266 (0.0029) [2024-04-26 03:09:13,923][47056] Fps is (10 sec: 58982.9, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 1052934144. Throughput: 0: 55797.7. Samples: 1002223540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 03:09:13,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 03:09:17,056][47288] Updated weights for policy 0, policy_version 64276 (0.0028) [2024-04-26 03:09:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 1053196288. Throughput: 0: 55843.5. Samples: 1002562440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 03:09:18,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 03:09:19,638][47288] Updated weights for policy 0, policy_version 64286 (0.0026) [2024-04-26 03:09:22,835][47288] Updated weights for policy 0, policy_version 64296 (0.0031) [2024-04-26 03:09:23,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1053474816. Throughput: 0: 55890.3. Samples: 1002898520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 03:09:23,923][47056] Avg episode reward: [(0, '0.317')] [2024-04-26 03:09:25,481][47288] Updated weights for policy 0, policy_version 64306 (0.0027) [2024-04-26 03:09:28,633][47288] Updated weights for policy 0, policy_version 64316 (0.0024) [2024-04-26 03:09:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 1053753344. Throughput: 0: 55774.3. Samples: 1003066240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 03:09:28,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 03:09:31,393][47288] Updated weights for policy 0, policy_version 64326 (0.0025) [2024-04-26 03:09:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1054031872. Throughput: 0: 55901.9. Samples: 1003404620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 03:09:33,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 03:09:34,489][47288] Updated weights for policy 0, policy_version 64336 (0.0028) [2024-04-26 03:09:37,789][47288] Updated weights for policy 0, policy_version 64346 (0.0028) [2024-04-26 03:09:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1054294016. Throughput: 0: 55879.4. Samples: 1003740900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 03:09:38,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 03:09:40,179][47288] Updated weights for policy 0, policy_version 64356 (0.0029) [2024-04-26 03:09:43,539][47288] Updated weights for policy 0, policy_version 64366 (0.0029) [2024-04-26 03:09:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 1054588928. Throughput: 0: 55839.9. Samples: 1003903260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 03:09:43,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 03:09:46,057][47288] Updated weights for policy 0, policy_version 64376 (0.0027) [2024-04-26 03:09:48,171][47267] Signal inference workers to stop experience collection... (15150 times) [2024-04-26 03:09:48,179][47267] Signal inference workers to resume experience collection... 
(15150 times) [2024-04-26 03:09:48,199][47288] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-04-26 03:09:48,199][47288] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-04-26 03:09:48,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1054867456. Throughput: 0: 55757.0. Samples: 1004235060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 03:09:48,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 03:09:49,432][47288] Updated weights for policy 0, policy_version 64386 (0.0033) [2024-04-26 03:09:51,838][47288] Updated weights for policy 0, policy_version 64396 (0.0035) [2024-04-26 03:09:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 1055162368. Throughput: 0: 55721.7. Samples: 1004568660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 03:09:53,932][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 03:09:55,265][47288] Updated weights for policy 0, policy_version 64406 (0.0029) [2024-04-26 03:09:58,154][47288] Updated weights for policy 0, policy_version 64416 (0.0032) [2024-04-26 03:09:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 1055440896. Throughput: 0: 56092.9. Samples: 1004747720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 03:09:58,932][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 03:10:01,060][47288] Updated weights for policy 0, policy_version 64426 (0.0026) [2024-04-26 03:10:03,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 1055703040. Throughput: 0: 55960.9. Samples: 1005080680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 03:10:03,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 03:10:03,931][47288] Updated weights for policy 0, policy_version 64436 (0.0029) [2024-04-26 03:10:06,874][47288] Updated weights for policy 0, policy_version 64446 (0.0024) [2024-04-26 03:10:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 1055997952. Throughput: 0: 56006.2. Samples: 1005418800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 03:10:08,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 03:10:09,698][47288] Updated weights for policy 0, policy_version 64456 (0.0030) [2024-04-26 03:10:12,759][47288] Updated weights for policy 0, policy_version 64466 (0.0035) [2024-04-26 03:10:13,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 1056260096. Throughput: 0: 55839.7. Samples: 1005579020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 03:10:13,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 03:10:15,634][47288] Updated weights for policy 0, policy_version 64476 (0.0028) [2024-04-26 03:10:18,642][47288] Updated weights for policy 0, policy_version 64486 (0.0030) [2024-04-26 03:10:18,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 1056538624. Throughput: 0: 55734.1. Samples: 1005912660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 03:10:18,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 03:10:21,475][47288] Updated weights for policy 0, policy_version 64496 (0.0035) [2024-04-26 03:10:23,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 1056817152. Throughput: 0: 55640.3. Samples: 1006244720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:10:23,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 03:10:24,554][47288] Updated weights for policy 0, policy_version 64506 (0.0030) [2024-04-26 03:10:27,402][47288] Updated weights for policy 0, policy_version 64516 (0.0028) [2024-04-26 03:10:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1057112064. Throughput: 0: 55846.6. Samples: 1006416360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:10:28,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 03:10:30,489][47288] Updated weights for policy 0, policy_version 64526 (0.0028) [2024-04-26 03:10:33,130][47288] Updated weights for policy 0, policy_version 64536 (0.0033) [2024-04-26 03:10:33,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 1057374208. Throughput: 0: 55763.7. Samples: 1006744420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:10:33,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 03:10:36,231][47267] Signal inference workers to stop experience collection... (15200 times) [2024-04-26 03:10:36,231][47267] Signal inference workers to resume experience collection... (15200 times) [2024-04-26 03:10:36,253][47288] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-04-26 03:10:36,253][47288] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-04-26 03:10:36,338][47288] Updated weights for policy 0, policy_version 64546 (0.0031) [2024-04-26 03:10:38,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 1057669120. Throughput: 0: 55761.9. Samples: 1007077940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:10:38,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 03:10:39,190][47288] Updated weights for policy 0, policy_version 64556 (0.0030) [2024-04-26 03:10:42,239][47288] Updated weights for policy 0, policy_version 64566 (0.0034) [2024-04-26 03:10:43,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 1057931264. Throughput: 0: 55696.8. Samples: 1007254080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:10:43,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 03:10:45,143][47288] Updated weights for policy 0, policy_version 64576 (0.0027) [2024-04-26 03:10:48,297][47288] Updated weights for policy 0, policy_version 64586 (0.0028) [2024-04-26 03:10:48,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 1058193408. Throughput: 0: 55653.8. Samples: 1007585100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:10:48,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 03:10:48,991][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000064588_1058209792.pth... [2024-04-26 03:10:49,045][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000063772_1044840448.pth [2024-04-26 03:10:51,043][47288] Updated weights for policy 0, policy_version 64596 (0.0027) [2024-04-26 03:10:53,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 1058488320. Throughput: 0: 55523.0. Samples: 1007917340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:10:53,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 03:10:54,045][47288] Updated weights for policy 0, policy_version 64606 (0.0030) [2024-04-26 03:10:56,872][47288] Updated weights for policy 0, policy_version 64616 (0.0036) [2024-04-26 03:10:58,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55432.5, 300 sec: 55816.6). 
Total num frames: 1058766848. Throughput: 0: 55655.4. Samples: 1008083520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:10:58,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 03:10:59,862][47288] Updated weights for policy 0, policy_version 64626 (0.0028) [2024-04-26 03:11:02,796][47288] Updated weights for policy 0, policy_version 64636 (0.0029) [2024-04-26 03:11:03,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1059061760. Throughput: 0: 55555.3. Samples: 1008412640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 03:11:03,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 03:11:05,996][47288] Updated weights for policy 0, policy_version 64646 (0.0026) [2024-04-26 03:11:08,666][47288] Updated weights for policy 0, policy_version 64656 (0.0030) [2024-04-26 03:11:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 1059323904. Throughput: 0: 55620.0. Samples: 1008747620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 03:11:08,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 03:11:11,956][47288] Updated weights for policy 0, policy_version 64666 (0.0029) [2024-04-26 03:11:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1059618816. Throughput: 0: 55541.9. Samples: 1008915740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 03:11:13,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 03:11:14,681][47288] Updated weights for policy 0, policy_version 64676 (0.0031) [2024-04-26 03:11:17,688][47288] Updated weights for policy 0, policy_version 64686 (0.0030) [2024-04-26 03:11:18,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 1059880960. Throughput: 0: 55688.9. Samples: 1009250420. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 03:11:18,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 03:11:20,673][47288] Updated weights for policy 0, policy_version 64696 (0.0029) [2024-04-26 03:11:23,555][47288] Updated weights for policy 0, policy_version 64706 (0.0029) [2024-04-26 03:11:23,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 1060143104. Throughput: 0: 55558.2. Samples: 1009578060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 03:11:23,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 03:11:26,388][47288] Updated weights for policy 0, policy_version 64716 (0.0028) [2024-04-26 03:11:28,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 1060421632. Throughput: 0: 55241.4. Samples: 1009739940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 03:11:28,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 03:11:29,465][47288] Updated weights for policy 0, policy_version 64726 (0.0033) [2024-04-26 03:11:32,293][47288] Updated weights for policy 0, policy_version 64736 (0.0027) [2024-04-26 03:11:33,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 1060700160. Throughput: 0: 55374.1. Samples: 1010076940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 03:11:33,924][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 03:11:35,216][47288] Updated weights for policy 0, policy_version 64746 (0.0028) [2024-04-26 03:11:38,099][47288] Updated weights for policy 0, policy_version 64756 (0.0028) [2024-04-26 03:11:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 1060978688. Throughput: 0: 55568.9. Samples: 1010417940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 03:11:38,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 03:11:41,169][47288] Updated weights for policy 0, policy_version 64766 (0.0031) [2024-04-26 03:11:43,850][47288] Updated weights for policy 0, policy_version 64776 (0.0029) [2024-04-26 03:11:43,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 1061289984. Throughput: 0: 55544.2. Samples: 1010583000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:11:43,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 03:11:47,083][47288] Updated weights for policy 0, policy_version 64786 (0.0030) [2024-04-26 03:11:48,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 1061568512. Throughput: 0: 55764.4. Samples: 1010922040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:11:48,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 03:11:49,866][47288] Updated weights for policy 0, policy_version 64796 (0.0032) [2024-04-26 03:11:52,884][47288] Updated weights for policy 0, policy_version 64806 (0.0032) [2024-04-26 03:11:53,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1061830656. Throughput: 0: 55737.0. Samples: 1011255780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:11:53,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 03:11:55,650][47288] Updated weights for policy 0, policy_version 64816 (0.0030) [2024-04-26 03:11:57,368][47267] Signal inference workers to stop experience collection... (15250 times) [2024-04-26 03:11:57,375][47267] Signal inference workers to resume experience collection... 
(15250 times) [2024-04-26 03:11:57,379][47288] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-04-26 03:11:57,400][47288] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-04-26 03:11:58,611][47288] Updated weights for policy 0, policy_version 64826 (0.0030) [2024-04-26 03:11:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 1062109184. Throughput: 0: 55788.4. Samples: 1011426220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:11:58,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 03:12:01,451][47288] Updated weights for policy 0, policy_version 64836 (0.0026) [2024-04-26 03:12:03,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 1062371328. Throughput: 0: 55910.6. Samples: 1011766400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:03,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 03:12:04,568][47288] Updated weights for policy 0, policy_version 64846 (0.0027) [2024-04-26 03:12:07,285][47288] Updated weights for policy 0, policy_version 64856 (0.0027) [2024-04-26 03:12:08,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1062649856. Throughput: 0: 56038.9. Samples: 1012099820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:08,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 03:12:10,604][47288] Updated weights for policy 0, policy_version 64866 (0.0043) [2024-04-26 03:12:13,346][47288] Updated weights for policy 0, policy_version 64876 (0.0033) [2024-04-26 03:12:13,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1062944768. Throughput: 0: 55967.7. Samples: 1012258480. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:13,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 03:12:16,344][47288] Updated weights for policy 0, policy_version 64886 (0.0029) [2024-04-26 03:12:18,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 1063223296. Throughput: 0: 55797.3. Samples: 1012587820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:18,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 03:12:19,120][47288] Updated weights for policy 0, policy_version 64896 (0.0029) [2024-04-26 03:12:22,151][47288] Updated weights for policy 0, policy_version 64906 (0.0029) [2024-04-26 03:12:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 1063518208. Throughput: 0: 55703.5. Samples: 1012924600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:23,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 03:12:25,094][47288] Updated weights for policy 0, policy_version 64916 (0.0029) [2024-04-26 03:12:27,994][47288] Updated weights for policy 0, policy_version 64926 (0.0028) [2024-04-26 03:12:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 1063796736. Throughput: 0: 55972.4. Samples: 1013101760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:28,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 03:12:31,142][47288] Updated weights for policy 0, policy_version 64936 (0.0030) [2024-04-26 03:12:33,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 1064058880. Throughput: 0: 55816.7. Samples: 1013433800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:33,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 03:12:34,164][47288] Updated weights for policy 0, policy_version 64946 (0.0027) [2024-04-26 03:12:36,869][47288] Updated weights for policy 0, policy_version 64956 (0.0026) [2024-04-26 03:12:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 1064337408. Throughput: 0: 55812.8. Samples: 1013767360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:38,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 03:12:39,897][47288] Updated weights for policy 0, policy_version 64966 (0.0030) [2024-04-26 03:12:42,821][47288] Updated weights for policy 0, policy_version 64976 (0.0029) [2024-04-26 03:12:43,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 1064599552. Throughput: 0: 55563.4. Samples: 1013926580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:43,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 03:12:45,769][47288] Updated weights for policy 0, policy_version 64986 (0.0028) [2024-04-26 03:12:48,797][47288] Updated weights for policy 0, policy_version 64996 (0.0027) [2024-04-26 03:12:48,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.4, 300 sec: 55761.2). Total num frames: 1064894464. Throughput: 0: 55498.5. Samples: 1014263840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:48,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 03:12:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000064996_1064894464.pth... [2024-04-26 03:12:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000064181_1051541504.pth [2024-04-26 03:12:51,722][47288] Updated weights for policy 0, policy_version 65006 (0.0037) [2024-04-26 03:12:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55761.1). 
Total num frames: 1065172992. Throughput: 0: 55457.9. Samples: 1014595420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:53,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 03:12:54,611][47288] Updated weights for policy 0, policy_version 65016 (0.0023) [2024-04-26 03:12:57,533][47288] Updated weights for policy 0, policy_version 65026 (0.0027) [2024-04-26 03:12:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 1065467904. Throughput: 0: 55717.3. Samples: 1014765760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 03:12:58,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 03:13:00,425][47288] Updated weights for policy 0, policy_version 65036 (0.0032) [2024-04-26 03:13:03,215][47267] Signal inference workers to stop experience collection... (15300 times) [2024-04-26 03:13:03,245][47288] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-04-26 03:13:03,271][47267] Signal inference workers to resume experience collection... (15300 times) [2024-04-26 03:13:03,272][47288] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-04-26 03:13:03,385][47288] Updated weights for policy 0, policy_version 65046 (0.0032) [2024-04-26 03:13:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 1065730048. Throughput: 0: 55800.0. Samples: 1015098820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-04-26 03:13:03,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 03:13:06,386][47288] Updated weights for policy 0, policy_version 65056 (0.0027) [2024-04-26 03:13:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1066008576. Throughput: 0: 55798.2. Samples: 1015435520. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-04-26 03:13:08,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 03:13:09,323][47288] Updated weights for policy 0, policy_version 65066 (0.0027) [2024-04-26 03:13:12,270][47288] Updated weights for policy 0, policy_version 65076 (0.0031) [2024-04-26 03:13:13,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 1066270720. Throughput: 0: 55539.8. Samples: 1015601060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-04-26 03:13:13,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 03:13:15,106][47288] Updated weights for policy 0, policy_version 65086 (0.0030) [2024-04-26 03:13:18,526][47288] Updated weights for policy 0, policy_version 65096 (0.0030) [2024-04-26 03:13:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1066549248. Throughput: 0: 55525.8. Samples: 1015932460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-04-26 03:13:18,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 03:13:21,026][47288] Updated weights for policy 0, policy_version 65106 (0.0028) [2024-04-26 03:13:23,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 1066844160. Throughput: 0: 55492.5. Samples: 1016264520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-04-26 03:13:23,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 03:13:24,365][47288] Updated weights for policy 0, policy_version 65116 (0.0030) [2024-04-26 03:13:26,921][47288] Updated weights for policy 0, policy_version 65126 (0.0035) [2024-04-26 03:13:28,923][47056] Fps is (10 sec: 58981.4, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 1067139072. Throughput: 0: 55734.1. Samples: 1016434620. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-04-26 03:13:28,924][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 03:13:29,995][47288] Updated weights for policy 0, policy_version 65136 (0.0032) [2024-04-26 03:13:32,800][47288] Updated weights for policy 0, policy_version 65146 (0.0027) [2024-04-26 03:13:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 1067417600. Throughput: 0: 55696.0. Samples: 1016770160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-04-26 03:13:33,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 03:13:35,717][47288] Updated weights for policy 0, policy_version 65156 (0.0034) [2024-04-26 03:13:38,809][47288] Updated weights for policy 0, policy_version 65166 (0.0033) [2024-04-26 03:13:38,923][47056] Fps is (10 sec: 55707.1, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 1067696128. Throughput: 0: 55844.1. Samples: 1017108400. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-04-26 03:13:38,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 03:13:41,670][47288] Updated weights for policy 0, policy_version 65176 (0.0023) [2024-04-26 03:13:43,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1067958272. Throughput: 0: 55750.2. Samples: 1017274520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:13:43,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 03:13:44,587][47288] Updated weights for policy 0, policy_version 65186 (0.0031) [2024-04-26 03:13:47,594][47288] Updated weights for policy 0, policy_version 65196 (0.0024) [2024-04-26 03:13:48,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 1068220416. Throughput: 0: 55845.9. Samples: 1017611880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:13:48,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 03:13:50,437][47288] Updated weights for policy 0, policy_version 65206 (0.0032) [2024-04-26 03:13:53,286][47288] Updated weights for policy 0, policy_version 65216 (0.0034) [2024-04-26 03:13:53,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1068498944. Throughput: 0: 55689.9. Samples: 1017941560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:13:53,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 03:13:56,288][47288] Updated weights for policy 0, policy_version 65226 (0.0028) [2024-04-26 03:13:58,335][47267] Signal inference workers to stop experience collection... (15350 times) [2024-04-26 03:13:58,373][47288] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-04-26 03:13:58,425][47267] Signal inference workers to resume experience collection... (15350 times) [2024-04-26 03:13:58,426][47288] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-04-26 03:13:58,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 1068793856. Throughput: 0: 55703.2. Samples: 1018107700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:13:58,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 03:13:59,308][47288] Updated weights for policy 0, policy_version 65236 (0.0025) [2024-04-26 03:14:02,263][47288] Updated weights for policy 0, policy_version 65246 (0.0031) [2024-04-26 03:14:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 1069072384. Throughput: 0: 55622.7. Samples: 1018435480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:03,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 03:14:05,391][47288] Updated weights for policy 0, policy_version 65256 (0.0031) [2024-04-26 03:14:08,062][47288] Updated weights for policy 0, policy_version 65266 (0.0032) [2024-04-26 03:14:08,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1069367296. Throughput: 0: 55662.5. Samples: 1018769340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:08,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 03:14:11,359][47288] Updated weights for policy 0, policy_version 65276 (0.0028) [2024-04-26 03:14:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1069629440. Throughput: 0: 55735.3. Samples: 1018942700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:13,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 03:14:13,972][47288] Updated weights for policy 0, policy_version 65286 (0.0031) [2024-04-26 03:14:17,147][47288] Updated weights for policy 0, policy_version 65296 (0.0027) [2024-04-26 03:14:18,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1069907968. Throughput: 0: 55739.1. Samples: 1019278420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:18,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:14:19,783][47288] Updated weights for policy 0, policy_version 65306 (0.0030) [2024-04-26 03:14:22,899][47288] Updated weights for policy 0, policy_version 65316 (0.0028) [2024-04-26 03:14:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1070186496. Throughput: 0: 55636.8. Samples: 1019612060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:23,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 03:14:25,745][47288] Updated weights for policy 0, policy_version 65326 (0.0026) [2024-04-26 03:14:28,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 1070448640. Throughput: 0: 55431.6. Samples: 1019768940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:28,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 03:14:29,106][47288] Updated weights for policy 0, policy_version 65336 (0.0027) [2024-04-26 03:14:31,673][47288] Updated weights for policy 0, policy_version 65346 (0.0027) [2024-04-26 03:14:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 1070743552. Throughput: 0: 55533.8. Samples: 1020110900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:33,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 03:14:34,857][47288] Updated weights for policy 0, policy_version 65356 (0.0029) [2024-04-26 03:14:37,616][47288] Updated weights for policy 0, policy_version 65366 (0.0029) [2024-04-26 03:14:38,923][47056] Fps is (10 sec: 58982.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1071038464. Throughput: 0: 55678.2. Samples: 1020447080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:38,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 03:14:40,596][47288] Updated weights for policy 0, policy_version 65376 (0.0032) [2024-04-26 03:14:43,358][47288] Updated weights for policy 0, policy_version 65386 (0.0033) [2024-04-26 03:14:43,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 1071316992. Throughput: 0: 55808.8. Samples: 1020619100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:43,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 03:14:46,535][47288] Updated weights for policy 0, policy_version 65396 (0.0033) [2024-04-26 03:14:48,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 1071579136. Throughput: 0: 55915.9. Samples: 1020951700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:48,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 03:14:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000065404_1071579136.pth... [2024-04-26 03:14:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000064588_1058209792.pth [2024-04-26 03:14:49,286][47288] Updated weights for policy 0, policy_version 65406 (0.0030) [2024-04-26 03:14:52,462][47288] Updated weights for policy 0, policy_version 65416 (0.0032) [2024-04-26 03:14:53,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 1071841280. Throughput: 0: 55865.0. Samples: 1021283260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:53,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 03:14:55,216][47288] Updated weights for policy 0, policy_version 65426 (0.0028) [2024-04-26 03:14:58,314][47288] Updated weights for policy 0, policy_version 65436 (0.0027) [2024-04-26 03:14:58,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 1072119808. Throughput: 0: 55621.0. Samples: 1021445640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 03:14:58,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 03:14:59,615][47267] Signal inference workers to stop experience collection... (15400 times) [2024-04-26 03:14:59,620][47267] Signal inference workers to resume experience collection... 
(15400 times) [2024-04-26 03:14:59,645][47288] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-04-26 03:14:59,645][47288] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-04-26 03:15:00,977][47288] Updated weights for policy 0, policy_version 65446 (0.0030) [2024-04-26 03:15:03,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1072414720. Throughput: 0: 55664.2. Samples: 1021783300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 03:15:03,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 03:15:04,182][47288] Updated weights for policy 0, policy_version 65456 (0.0027) [2024-04-26 03:15:06,899][47288] Updated weights for policy 0, policy_version 65466 (0.0026) [2024-04-26 03:15:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 1072693248. Throughput: 0: 55714.3. Samples: 1022119200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 03:15:08,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 03:15:09,913][47288] Updated weights for policy 0, policy_version 65476 (0.0028) [2024-04-26 03:15:12,842][47288] Updated weights for policy 0, policy_version 65486 (0.0032) [2024-04-26 03:15:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1072971776. Throughput: 0: 55945.0. Samples: 1022286460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 03:15:13,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 03:15:15,923][47288] Updated weights for policy 0, policy_version 65496 (0.0028) [2024-04-26 03:15:18,746][47288] Updated weights for policy 0, policy_version 65506 (0.0027) [2024-04-26 03:15:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1073250304. Throughput: 0: 55731.3. Samples: 1022618800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 03:15:18,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 03:15:21,994][47288] Updated weights for policy 0, policy_version 65516 (0.0032) [2024-04-26 03:15:23,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 1073512448. Throughput: 0: 55686.2. Samples: 1022952960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 03:15:23,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 03:15:24,566][47288] Updated weights for policy 0, policy_version 65526 (0.0029) [2024-04-26 03:15:27,670][47288] Updated weights for policy 0, policy_version 65536 (0.0029) [2024-04-26 03:15:28,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 1073790976. Throughput: 0: 55468.1. Samples: 1023115160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 03:15:28,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 03:15:30,434][47288] Updated weights for policy 0, policy_version 65546 (0.0038) [2024-04-26 03:15:33,481][47288] Updated weights for policy 0, policy_version 65556 (0.0030) [2024-04-26 03:15:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 1074085888. Throughput: 0: 55622.2. Samples: 1023454700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 03:15:33,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 03:15:36,470][47288] Updated weights for policy 0, policy_version 65566 (0.0029) [2024-04-26 03:15:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 1074348032. Throughput: 0: 55618.3. Samples: 1023786080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 03:15:38,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 03:15:39,548][47288] Updated weights for policy 0, policy_version 65576 (0.0030) [2024-04-26 03:15:42,526][47288] Updated weights for policy 0, policy_version 65586 (0.0029) [2024-04-26 03:15:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 1074642944. Throughput: 0: 55655.8. Samples: 1023950160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 03:15:43,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 03:15:45,533][47288] Updated weights for policy 0, policy_version 65596 (0.0029) [2024-04-26 03:15:48,248][47288] Updated weights for policy 0, policy_version 65606 (0.0029) [2024-04-26 03:15:48,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1074921472. Throughput: 0: 55602.1. Samples: 1024285400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 03:15:48,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 03:15:51,312][47288] Updated weights for policy 0, policy_version 65616 (0.0029) [2024-04-26 03:15:53,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1075200000. Throughput: 0: 55524.4. Samples: 1024617800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 03:15:53,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 03:15:54,031][47288] Updated weights for policy 0, policy_version 65626 (0.0034) [2024-04-26 03:15:57,396][47288] Updated weights for policy 0, policy_version 65636 (0.0029) [2024-04-26 03:15:58,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 1075462144. Throughput: 0: 55519.5. Samples: 1024784840. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 03:15:58,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 03:15:59,987][47288] Updated weights for policy 0, policy_version 65646 (0.0026) [2024-04-26 03:16:02,546][47267] Signal inference workers to stop experience collection... (15450 times) [2024-04-26 03:16:02,547][47267] Signal inference workers to resume experience collection... (15450 times) [2024-04-26 03:16:02,572][47288] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-04-26 03:16:02,572][47288] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-04-26 03:16:03,136][47288] Updated weights for policy 0, policy_version 65656 (0.0029) [2024-04-26 03:16:03,923][47056] Fps is (10 sec: 54065.9, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 1075740672. Throughput: 0: 55522.3. Samples: 1025117320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 03:16:03,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 03:16:05,970][47288] Updated weights for policy 0, policy_version 65666 (0.0028) [2024-04-26 03:16:08,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 1076002816. Throughput: 0: 55431.4. Samples: 1025447380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 03:16:08,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 03:16:09,060][47288] Updated weights for policy 0, policy_version 65676 (0.0028) [2024-04-26 03:16:11,696][47288] Updated weights for policy 0, policy_version 65686 (0.0029) [2024-04-26 03:16:13,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 1076297728. Throughput: 0: 55460.5. Samples: 1025610880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 03:16:13,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 03:16:14,850][47288] Updated weights for policy 0, policy_version 65696 (0.0033) [2024-04-26 03:16:17,782][47288] Updated weights for policy 0, policy_version 65706 (0.0028) [2024-04-26 03:16:18,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1076576256. Throughput: 0: 55473.4. Samples: 1025951000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 03:16:18,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 03:16:20,510][47288] Updated weights for policy 0, policy_version 65716 (0.0032) [2024-04-26 03:16:23,446][47288] Updated weights for policy 0, policy_version 65726 (0.0028) [2024-04-26 03:16:23,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 1076871168. Throughput: 0: 55541.3. Samples: 1026285440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:16:23,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 03:16:26,337][47288] Updated weights for policy 0, policy_version 65736 (0.0025) [2024-04-26 03:16:28,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 1077149696. Throughput: 0: 55604.5. Samples: 1026452360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:16:28,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 03:16:29,150][47288] Updated weights for policy 0, policy_version 65746 (0.0029) [2024-04-26 03:16:32,308][47288] Updated weights for policy 0, policy_version 65756 (0.0029) [2024-04-26 03:16:33,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1077411840. Throughput: 0: 55673.4. Samples: 1026790700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:16:33,923][47056] Avg episode reward: [(0, '0.310')] [2024-04-26 03:16:34,987][47288] Updated weights for policy 0, policy_version 65766 (0.0030) [2024-04-26 03:16:38,196][47288] Updated weights for policy 0, policy_version 65776 (0.0032) [2024-04-26 03:16:38,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 1077690368. Throughput: 0: 55646.6. Samples: 1027121900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:16:38,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 03:16:40,787][47288] Updated weights for policy 0, policy_version 65786 (0.0029) [2024-04-26 03:16:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 1077985280. Throughput: 0: 55715.9. Samples: 1027292060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:16:43,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 03:16:43,985][47288] Updated weights for policy 0, policy_version 65796 (0.0032) [2024-04-26 03:16:46,626][47288] Updated weights for policy 0, policy_version 65806 (0.0033) [2024-04-26 03:16:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 1078247424. Throughput: 0: 55754.1. Samples: 1027626240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:16:48,923][47056] Avg episode reward: [(0, '0.345')] [2024-04-26 03:16:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000065811_1078247424.pth... [2024-04-26 03:16:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000064996_1064894464.pth [2024-04-26 03:16:49,773][47288] Updated weights for policy 0, policy_version 65816 (0.0031) [2024-04-26 03:16:52,509][47288] Updated weights for policy 0, policy_version 65826 (0.0028) [2024-04-26 03:16:53,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55159.4, 300 sec: 55594.5). 
Total num frames: 1078509568. Throughput: 0: 56013.0. Samples: 1027967960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:16:53,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 03:16:55,590][47288] Updated weights for policy 0, policy_version 65836 (0.0029) [2024-04-26 03:16:58,432][47288] Updated weights for policy 0, policy_version 65846 (0.0032) [2024-04-26 03:16:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 1078820864. Throughput: 0: 56152.2. Samples: 1028137720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 03:16:58,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 03:17:01,320][47288] Updated weights for policy 0, policy_version 65856 (0.0027) [2024-04-26 03:17:03,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 1079115776. Throughput: 0: 55961.2. Samples: 1028469260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:17:03,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 03:17:04,636][47288] Updated weights for policy 0, policy_version 65866 (0.0038) [2024-04-26 03:17:06,605][47267] Signal inference workers to stop experience collection... (15500 times) [2024-04-26 03:17:06,605][47267] Signal inference workers to resume experience collection... (15500 times) [2024-04-26 03:17:06,628][47288] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-04-26 03:17:06,649][47288] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-04-26 03:17:07,178][47288] Updated weights for policy 0, policy_version 65876 (0.0028) [2024-04-26 03:17:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 1079377920. Throughput: 0: 56014.2. Samples: 1028806080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:17:08,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 03:17:10,376][47288] Updated weights for policy 0, policy_version 65886 (0.0031) [2024-04-26 03:17:13,533][47288] Updated weights for policy 0, policy_version 65896 (0.0032) [2024-04-26 03:17:13,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1079656448. Throughput: 0: 56115.1. Samples: 1028977540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:17:13,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:17:16,222][47288] Updated weights for policy 0, policy_version 65906 (0.0030) [2024-04-26 03:17:18,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 1079934976. Throughput: 0: 56114.7. Samples: 1029315860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:17:18,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 03:17:19,285][47288] Updated weights for policy 0, policy_version 65916 (0.0029) [2024-04-26 03:17:22,239][47288] Updated weights for policy 0, policy_version 65926 (0.0036) [2024-04-26 03:17:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 1080197120. Throughput: 0: 56040.9. Samples: 1029643740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:17:23,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 03:17:25,244][47288] Updated weights for policy 0, policy_version 65936 (0.0037) [2024-04-26 03:17:27,974][47288] Updated weights for policy 0, policy_version 65946 (0.0029) [2024-04-26 03:17:28,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 1080475648. Throughput: 0: 55905.2. Samples: 1029807800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:17:28,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 03:17:31,328][47288] Updated weights for policy 0, policy_version 65956 (0.0030) [2024-04-26 03:17:33,844][47288] Updated weights for policy 0, policy_version 65966 (0.0033) [2024-04-26 03:17:33,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 1080786944. Throughput: 0: 55842.3. Samples: 1030139140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:17:33,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 03:17:37,068][47288] Updated weights for policy 0, policy_version 65976 (0.0026) [2024-04-26 03:17:38,923][47056] Fps is (10 sec: 60621.1, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 1081081856. Throughput: 0: 55622.6. Samples: 1030470980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:17:38,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 03:17:39,671][47288] Updated weights for policy 0, policy_version 65986 (0.0039) [2024-04-26 03:17:42,826][47288] Updated weights for policy 0, policy_version 65996 (0.0034) [2024-04-26 03:17:43,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1081327616. Throughput: 0: 55719.0. Samples: 1030645080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-26 03:17:43,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 03:17:45,564][47288] Updated weights for policy 0, policy_version 66006 (0.0030) [2024-04-26 03:17:48,615][47288] Updated weights for policy 0, policy_version 66016 (0.0029) [2024-04-26 03:17:48,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1081606144. Throughput: 0: 55772.9. Samples: 1030979040. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-26 03:17:48,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 03:17:51,485][47288] Updated weights for policy 0, policy_version 66026 (0.0031) [2024-04-26 03:17:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 1081884672. Throughput: 0: 55697.2. Samples: 1031312460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-26 03:17:53,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 03:17:54,698][47288] Updated weights for policy 0, policy_version 66036 (0.0027) [2024-04-26 03:17:57,421][47288] Updated weights for policy 0, policy_version 66046 (0.0029) [2024-04-26 03:17:58,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 1082163200. Throughput: 0: 55428.1. Samples: 1031471800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-26 03:17:58,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 03:18:00,588][47288] Updated weights for policy 0, policy_version 66056 (0.0024) [2024-04-26 03:18:03,354][47288] Updated weights for policy 0, policy_version 66066 (0.0036) [2024-04-26 03:18:03,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 1082425344. Throughput: 0: 55425.9. Samples: 1031810020. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-26 03:18:03,923][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 03:18:06,283][47288] Updated weights for policy 0, policy_version 66076 (0.0027) [2024-04-26 03:18:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 1082736640. Throughput: 0: 55781.8. Samples: 1032153920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-26 03:18:08,931][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 03:18:08,971][47288] Updated weights for policy 0, policy_version 66086 (0.0032) [2024-04-26 03:18:11,711][47267] Signal inference workers to stop experience collection... 
(15550 times) [2024-04-26 03:18:11,715][47267] Signal inference workers to resume experience collection... (15550 times) [2024-04-26 03:18:11,742][47288] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-04-26 03:18:11,743][47288] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-04-26 03:18:11,959][47288] Updated weights for policy 0, policy_version 66096 (0.0029) [2024-04-26 03:18:13,923][47056] Fps is (10 sec: 60619.8, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 1083031552. Throughput: 0: 56029.4. Samples: 1032329120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-26 03:18:13,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 03:18:14,792][47288] Updated weights for policy 0, policy_version 66106 (0.0036) [2024-04-26 03:18:17,980][47288] Updated weights for policy 0, policy_version 66116 (0.0033) [2024-04-26 03:18:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 1083293696. Throughput: 0: 56069.3. Samples: 1032662260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-26 03:18:18,923][47056] Avg episode reward: [(0, '0.314')] [2024-04-26 03:18:20,986][47288] Updated weights for policy 0, policy_version 66126 (0.0033) [2024-04-26 03:18:23,846][47288] Updated weights for policy 0, policy_version 66136 (0.0031) [2024-04-26 03:18:23,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 1083572224. Throughput: 0: 56072.8. Samples: 1032994260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-26 03:18:23,924][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 03:18:27,116][47288] Updated weights for policy 0, policy_version 66146 (0.0030) [2024-04-26 03:18:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 1083834368. Throughput: 0: 55903.0. Samples: 1033160720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 03:18:28,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 03:18:29,577][47288] Updated weights for policy 0, policy_version 66156 (0.0030) [2024-04-26 03:18:32,905][47288] Updated weights for policy 0, policy_version 66166 (0.0031) [2024-04-26 03:18:33,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 1084112896. Throughput: 0: 55968.1. Samples: 1033497600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 03:18:33,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 03:18:35,459][47288] Updated weights for policy 0, policy_version 66176 (0.0032) [2024-04-26 03:18:38,823][47288] Updated weights for policy 0, policy_version 66186 (0.0033) [2024-04-26 03:18:38,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 1084391424. Throughput: 0: 55991.2. Samples: 1033832060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 03:18:38,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 03:18:41,433][47288] Updated weights for policy 0, policy_version 66196 (0.0034) [2024-04-26 03:18:43,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 1084686336. Throughput: 0: 56079.0. Samples: 1033995360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 03:18:43,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 03:18:44,657][47288] Updated weights for policy 0, policy_version 66206 (0.0036) [2024-04-26 03:18:47,247][47288] Updated weights for policy 0, policy_version 66216 (0.0031) [2024-04-26 03:18:48,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 1084981248. Throughput: 0: 56115.7. Samples: 1034335240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 03:18:48,932][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 03:18:49,038][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000066223_1084997632.pth... [2024-04-26 03:18:49,082][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000065404_1071579136.pth [2024-04-26 03:18:50,417][47288] Updated weights for policy 0, policy_version 66226 (0.0035) [2024-04-26 03:18:53,180][47288] Updated weights for policy 0, policy_version 66236 (0.0030) [2024-04-26 03:18:53,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 1085259776. Throughput: 0: 55760.5. Samples: 1034663140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 03:18:53,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 03:18:56,284][47288] Updated weights for policy 0, policy_version 66246 (0.0034) [2024-04-26 03:18:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 1085521920. Throughput: 0: 55669.4. Samples: 1034834240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 03:18:58,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 03:18:59,049][47288] Updated weights for policy 0, policy_version 66256 (0.0029) [2024-04-26 03:19:02,079][47288] Updated weights for policy 0, policy_version 66266 (0.0032) [2024-04-26 03:19:03,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 1085784064. Throughput: 0: 55591.6. Samples: 1035163880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 03:19:03,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 03:19:04,972][47288] Updated weights for policy 0, policy_version 66276 (0.0030) [2024-04-26 03:19:05,718][47267] Signal inference workers to stop experience collection... (15600 times) [2024-04-26 03:19:05,719][47267] Signal inference workers to resume experience collection... 
(15600 times) [2024-04-26 03:19:05,729][47288] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-04-26 03:19:05,748][47288] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-04-26 03:19:08,108][47288] Updated weights for policy 0, policy_version 66286 (0.0035) [2024-04-26 03:19:08,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1086062592. Throughput: 0: 55704.6. Samples: 1035500960. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-26 03:19:08,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 03:19:10,845][47288] Updated weights for policy 0, policy_version 66296 (0.0028) [2024-04-26 03:19:13,897][47288] Updated weights for policy 0, policy_version 66306 (0.0030) [2024-04-26 03:19:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 1086357504. Throughput: 0: 55724.6. Samples: 1035668320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-26 03:19:13,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 03:19:16,626][47288] Updated weights for policy 0, policy_version 66316 (0.0030) [2024-04-26 03:19:18,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 1086652416. Throughput: 0: 55615.9. Samples: 1036000320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-26 03:19:18,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 03:19:19,632][47288] Updated weights for policy 0, policy_version 66326 (0.0034) [2024-04-26 03:19:22,446][47288] Updated weights for policy 0, policy_version 66336 (0.0027) [2024-04-26 03:19:23,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 1086930944. Throughput: 0: 55614.3. Samples: 1036334700. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-26 03:19:23,923][47056] Avg episode reward: [(0, '0.319')] [2024-04-26 03:19:25,429][47288] Updated weights for policy 0, policy_version 66346 (0.0034) [2024-04-26 03:19:28,570][47288] Updated weights for policy 0, policy_version 66356 (0.0032) [2024-04-26 03:19:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 1087193088. Throughput: 0: 55956.6. Samples: 1036513400. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-26 03:19:28,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 03:19:31,537][47288] Updated weights for policy 0, policy_version 66366 (0.0028) [2024-04-26 03:19:33,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1087455232. Throughput: 0: 55883.9. Samples: 1036850000. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-26 03:19:33,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 03:19:34,233][47288] Updated weights for policy 0, policy_version 66376 (0.0026) [2024-04-26 03:19:37,317][47288] Updated weights for policy 0, policy_version 66386 (0.0030) [2024-04-26 03:19:38,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1087750144. Throughput: 0: 55997.2. Samples: 1037183020. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-26 03:19:38,923][47056] Avg episode reward: [(0, '0.323')] [2024-04-26 03:19:40,224][47288] Updated weights for policy 0, policy_version 66396 (0.0030) [2024-04-26 03:19:43,043][47288] Updated weights for policy 0, policy_version 66406 (0.0030) [2024-04-26 03:19:43,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1088012288. Throughput: 0: 55690.2. Samples: 1037340300. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-26 03:19:43,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 03:19:46,260][47288] Updated weights for policy 0, policy_version 66416 (0.0025) [2024-04-26 03:19:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 1088307200. Throughput: 0: 55762.4. Samples: 1037673200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:19:48,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 03:19:49,041][47288] Updated weights for policy 0, policy_version 66426 (0.0033) [2024-04-26 03:19:52,141][47288] Updated weights for policy 0, policy_version 66436 (0.0024) [2024-04-26 03:19:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55432.3, 300 sec: 55816.6). Total num frames: 1088585728. Throughput: 0: 55730.1. Samples: 1038008820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:19:53,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 03:19:54,983][47288] Updated weights for policy 0, policy_version 66446 (0.0030) [2024-04-26 03:19:57,863][47288] Updated weights for policy 0, policy_version 66456 (0.0033) [2024-04-26 03:19:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 1088880640. Throughput: 0: 55787.7. Samples: 1038178780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:19:58,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 03:20:01,021][47288] Updated weights for policy 0, policy_version 66466 (0.0043) [2024-04-26 03:20:03,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1089126400. Throughput: 0: 55814.3. Samples: 1038511960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:20:03,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 03:20:03,954][47288] Updated weights for policy 0, policy_version 66476 (0.0034) [2024-04-26 03:20:06,957][47288] Updated weights for policy 0, policy_version 66486 (0.0031) [2024-04-26 03:20:08,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55705.4, 300 sec: 55705.5). Total num frames: 1089404928. Throughput: 0: 55853.5. Samples: 1038848120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:20:08,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:20:09,915][47288] Updated weights for policy 0, policy_version 66496 (0.0026) [2024-04-26 03:20:10,485][47267] Signal inference workers to stop experience collection... (15650 times) [2024-04-26 03:20:10,485][47267] Signal inference workers to resume experience collection... (15650 times) [2024-04-26 03:20:10,500][47288] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-04-26 03:20:10,500][47288] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-04-26 03:20:12,765][47288] Updated weights for policy 0, policy_version 66506 (0.0032) [2024-04-26 03:20:13,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1089699840. Throughput: 0: 55457.7. Samples: 1039009000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:20:13,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 03:20:15,643][47288] Updated weights for policy 0, policy_version 66516 (0.0025) [2024-04-26 03:20:18,769][47288] Updated weights for policy 0, policy_version 66526 (0.0027) [2024-04-26 03:20:18,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 1089961984. Throughput: 0: 55443.4. Samples: 1039344960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:20:18,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 03:20:21,476][47288] Updated weights for policy 0, policy_version 66536 (0.0033) [2024-04-26 03:20:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 1090256896. Throughput: 0: 55504.1. Samples: 1039680700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 03:20:23,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 03:20:24,582][47288] Updated weights for policy 0, policy_version 66546 (0.0031) [2024-04-26 03:20:27,423][47288] Updated weights for policy 0, policy_version 66556 (0.0026) [2024-04-26 03:20:28,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 1090551808. Throughput: 0: 55746.7. Samples: 1039848900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:20:28,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 03:20:30,534][47288] Updated weights for policy 0, policy_version 66566 (0.0026) [2024-04-26 03:20:33,431][47288] Updated weights for policy 0, policy_version 66576 (0.0025) [2024-04-26 03:20:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 1090830336. Throughput: 0: 55674.4. Samples: 1040178540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:20:33,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 03:20:36,274][47288] Updated weights for policy 0, policy_version 66586 (0.0028) [2024-04-26 03:20:38,923][47056] Fps is (10 sec: 50790.9, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 1091059712. Throughput: 0: 55577.6. Samples: 1040509800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:20:38,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 03:20:39,332][47288] Updated weights for policy 0, policy_version 66596 (0.0029) [2024-04-26 03:20:42,165][47288] Updated weights for policy 0, policy_version 66606 (0.0030) [2024-04-26 03:20:43,923][47056] Fps is (10 sec: 50789.9, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 1091338240. Throughput: 0: 55448.9. Samples: 1040673980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:20:43,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 03:20:45,252][47288] Updated weights for policy 0, policy_version 66616 (0.0029) [2024-04-26 03:20:48,146][47288] Updated weights for policy 0, policy_version 66626 (0.0027) [2024-04-26 03:20:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1091633152. Throughput: 0: 55443.1. Samples: 1041006900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:20:48,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 03:20:49,018][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000066629_1091649536.pth... [2024-04-26 03:20:49,065][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000065811_1078247424.pth [2024-04-26 03:20:51,079][47288] Updated weights for policy 0, policy_version 66636 (0.0028) [2024-04-26 03:20:53,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 1091911680. Throughput: 0: 55424.3. Samples: 1041342200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:20:53,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 03:20:53,974][47288] Updated weights for policy 0, policy_version 66646 (0.0033) [2024-04-26 03:20:57,073][47288] Updated weights for policy 0, policy_version 66656 (0.0027) [2024-04-26 03:20:58,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55159.5, 300 sec: 55761.2). 
Total num frames: 1092190208. Throughput: 0: 55573.7. Samples: 1041509820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:20:58,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 03:20:59,953][47288] Updated weights for policy 0, policy_version 66666 (0.0025) [2024-04-26 03:21:00,817][47267] Signal inference workers to stop experience collection... (15700 times) [2024-04-26 03:21:00,817][47267] Signal inference workers to resume experience collection... (15700 times) [2024-04-26 03:21:00,828][47288] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-04-26 03:21:00,828][47288] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-04-26 03:21:02,943][47288] Updated weights for policy 0, policy_version 66676 (0.0028) [2024-04-26 03:21:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 1092485120. Throughput: 0: 55444.1. Samples: 1041839940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:21:03,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 03:21:05,715][47288] Updated weights for policy 0, policy_version 66686 (0.0028) [2024-04-26 03:21:08,738][47288] Updated weights for policy 0, policy_version 66696 (0.0031) [2024-04-26 03:21:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1092747264. Throughput: 0: 55395.3. Samples: 1042173500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 03:21:08,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 03:21:11,553][47288] Updated weights for policy 0, policy_version 66706 (0.0026) [2024-04-26 03:21:13,923][47056] Fps is (10 sec: 50790.3, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 1092993024. Throughput: 0: 55210.7. Samples: 1042333380. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 03:21:13,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 03:21:14,678][47288] Updated weights for policy 0, policy_version 66716 (0.0030) [2024-04-26 03:21:17,460][47288] Updated weights for policy 0, policy_version 66726 (0.0028) [2024-04-26 03:21:18,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 1093271552. Throughput: 0: 55240.3. Samples: 1042664360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 03:21:18,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 03:21:20,625][47288] Updated weights for policy 0, policy_version 66736 (0.0028) [2024-04-26 03:21:23,233][47288] Updated weights for policy 0, policy_version 66746 (0.0030) [2024-04-26 03:21:23,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 1093566464. Throughput: 0: 55281.2. Samples: 1042997460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 03:21:23,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 03:21:26,471][47288] Updated weights for policy 0, policy_version 66756 (0.0025) [2024-04-26 03:21:28,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 1093861376. Throughput: 0: 55460.5. Samples: 1043169700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 03:21:28,924][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 03:21:29,547][47288] Updated weights for policy 0, policy_version 66766 (0.0033) [2024-04-26 03:21:32,283][47288] Updated weights for policy 0, policy_version 66776 (0.0038) [2024-04-26 03:21:33,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 1094139904. Throughput: 0: 55429.7. Samples: 1043501240. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 03:21:33,923][47056] Avg episode reward: [(0, '0.361')] [2024-04-26 03:21:35,612][47288] Updated weights for policy 0, policy_version 66786 (0.0031) [2024-04-26 03:21:38,272][47288] Updated weights for policy 0, policy_version 66796 (0.0032) [2024-04-26 03:21:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 1094434816. Throughput: 0: 55261.2. Samples: 1043828960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 03:21:38,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 03:21:41,505][47288] Updated weights for policy 0, policy_version 66806 (0.0028) [2024-04-26 03:21:43,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1094680576. Throughput: 0: 55430.9. Samples: 1044004200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 03:21:43,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 03:21:44,056][47288] Updated weights for policy 0, policy_version 66816 (0.0028) [2024-04-26 03:21:47,401][47288] Updated weights for policy 0, policy_version 66826 (0.0037) [2024-04-26 03:21:48,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 1094959104. Throughput: 0: 55451.1. Samples: 1044335240. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 03:21:48,931][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 03:21:49,944][47288] Updated weights for policy 0, policy_version 66836 (0.0027) [2024-04-26 03:21:53,265][47288] Updated weights for policy 0, policy_version 66846 (0.0029) [2024-04-26 03:21:53,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 1095221248. Throughput: 0: 55483.2. Samples: 1044670240. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 03:21:53,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 03:21:55,886][47288] Updated weights for policy 0, policy_version 66856 (0.0027) [2024-04-26 03:21:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.7, 300 sec: 55594.6). Total num frames: 1095516160. Throughput: 0: 55432.0. Samples: 1044827820. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 03:21:58,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 03:21:59,057][47288] Updated weights for policy 0, policy_version 66866 (0.0026) [2024-04-26 03:22:01,550][47267] Signal inference workers to stop experience collection... (15750 times) [2024-04-26 03:22:01,584][47288] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-04-26 03:22:01,636][47267] Signal inference workers to resume experience collection... (15750 times) [2024-04-26 03:22:01,636][47288] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-04-26 03:22:01,747][47288] Updated weights for policy 0, policy_version 66876 (0.0027) [2024-04-26 03:22:03,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 1095811072. Throughput: 0: 55504.5. Samples: 1045162060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 03:22:03,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 03:22:04,940][47288] Updated weights for policy 0, policy_version 66886 (0.0025) [2024-04-26 03:22:07,740][47288] Updated weights for policy 0, policy_version 66896 (0.0023) [2024-04-26 03:22:08,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 1096105984. Throughput: 0: 55524.1. Samples: 1045496040. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 03:22:08,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 03:22:10,771][47288] Updated weights for policy 0, policy_version 66906 (0.0027) [2024-04-26 03:22:13,542][47288] Updated weights for policy 0, policy_version 66916 (0.0033) [2024-04-26 03:22:13,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 1096368128. Throughput: 0: 55739.6. Samples: 1045677980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 03:22:13,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 03:22:16,743][47288] Updated weights for policy 0, policy_version 66926 (0.0032) [2024-04-26 03:22:18,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 1096630272. Throughput: 0: 55673.5. Samples: 1046006540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 03:22:18,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 03:22:19,333][47288] Updated weights for policy 0, policy_version 66936 (0.0028) [2024-04-26 03:22:22,474][47288] Updated weights for policy 0, policy_version 66946 (0.0029) [2024-04-26 03:22:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1096908800. Throughput: 0: 55919.6. Samples: 1046345340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 03:22:23,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 03:22:25,140][47288] Updated weights for policy 0, policy_version 66956 (0.0024) [2024-04-26 03:22:28,414][47288] Updated weights for policy 0, policy_version 66966 (0.0027) [2024-04-26 03:22:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 1097170944. Throughput: 0: 55454.2. Samples: 1046499640. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 03:22:28,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 03:22:31,145][47288] Updated weights for policy 0, policy_version 66976 (0.0026) [2024-04-26 03:22:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 1097465856. Throughput: 0: 55550.7. Samples: 1046835020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-26 03:22:33,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 03:22:34,685][47288] Updated weights for policy 0, policy_version 66986 (0.0027) [2024-04-26 03:22:37,077][47288] Updated weights for policy 0, policy_version 66996 (0.0031) [2024-04-26 03:22:38,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1097760768. Throughput: 0: 55511.1. Samples: 1047168240. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-26 03:22:38,924][47056] Avg episode reward: [(0, '0.354')] [2024-04-26 03:22:40,638][47288] Updated weights for policy 0, policy_version 67006 (0.0029) [2024-04-26 03:22:42,943][47288] Updated weights for policy 0, policy_version 67016 (0.0027) [2024-04-26 03:22:43,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 1098055680. Throughput: 0: 55951.6. Samples: 1047345640. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-26 03:22:43,923][47056] Avg episode reward: [(0, '0.337')] [2024-04-26 03:22:46,373][47288] Updated weights for policy 0, policy_version 67026 (0.0028) [2024-04-26 03:22:48,836][47288] Updated weights for policy 0, policy_version 67036 (0.0029) [2024-04-26 03:22:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.4, 300 sec: 55705.6). Total num frames: 1098317824. Throughput: 0: 55928.7. Samples: 1047678860. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-26 03:22:48,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 03:22:49,023][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000067037_1098334208.pth... [2024-04-26 03:22:49,070][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000066223_1084997632.pth [2024-04-26 03:22:52,290][47288] Updated weights for policy 0, policy_version 67046 (0.0027) [2024-04-26 03:22:53,923][47056] Fps is (10 sec: 52427.6, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 1098579968. Throughput: 0: 55920.4. Samples: 1048012460. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-26 03:22:53,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 03:22:54,763][47288] Updated weights for policy 0, policy_version 67056 (0.0028) [2024-04-26 03:22:58,242][47288] Updated weights for policy 0, policy_version 67066 (0.0031) [2024-04-26 03:22:58,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 1098858496. Throughput: 0: 55457.4. Samples: 1048173560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-26 03:22:58,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 03:23:00,684][47288] Updated weights for policy 0, policy_version 67076 (0.0026) [2024-04-26 03:23:03,923][47056] Fps is (10 sec: 52429.3, 60 sec: 54886.5, 300 sec: 55483.4). Total num frames: 1099104256. Throughput: 0: 55566.1. Samples: 1048507020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-26 03:23:03,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 03:23:04,113][47288] Updated weights for policy 0, policy_version 67086 (0.0036) [2024-04-26 03:23:06,504][47288] Updated weights for policy 0, policy_version 67096 (0.0029) [2024-04-26 03:23:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 1099415552. Throughput: 0: 55483.0. Samples: 1048842080. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-26 03:23:08,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 03:23:10,021][47288] Updated weights for policy 0, policy_version 67106 (0.0035) [2024-04-26 03:23:10,883][47267] Signal inference workers to stop experience collection... (15800 times) [2024-04-26 03:23:10,918][47288] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-04-26 03:23:10,966][47267] Signal inference workers to resume experience collection... (15800 times) [2024-04-26 03:23:10,967][47288] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-04-26 03:23:12,353][47288] Updated weights for policy 0, policy_version 67116 (0.0026) [2024-04-26 03:23:13,923][47056] Fps is (10 sec: 60620.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1099710464. Throughput: 0: 55655.5. Samples: 1049004140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 03:23:13,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 03:23:15,925][47288] Updated weights for policy 0, policy_version 67126 (0.0027) [2024-04-26 03:23:18,341][47288] Updated weights for policy 0, policy_version 67136 (0.0026) [2024-04-26 03:23:18,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 1100005376. Throughput: 0: 55712.8. Samples: 1049342100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 03:23:18,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 03:23:21,823][47288] Updated weights for policy 0, policy_version 67146 (0.0028) [2024-04-26 03:23:23,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1100251136. Throughput: 0: 55686.4. Samples: 1049674120. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 03:23:23,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 03:23:24,216][47288] Updated weights for policy 0, policy_version 67156 (0.0027) [2024-04-26 03:23:27,752][47288] Updated weights for policy 0, policy_version 67166 (0.0029) [2024-04-26 03:23:28,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 1100529664. Throughput: 0: 55551.8. Samples: 1049845480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 03:23:28,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 03:23:29,977][47288] Updated weights for policy 0, policy_version 67176 (0.0033) [2024-04-26 03:23:33,561][47288] Updated weights for policy 0, policy_version 67186 (0.0032) [2024-04-26 03:23:33,923][47056] Fps is (10 sec: 52427.0, 60 sec: 55159.2, 300 sec: 55538.9). Total num frames: 1100775424. Throughput: 0: 55480.8. Samples: 1050175500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 03:23:33,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 03:23:35,793][47288] Updated weights for policy 0, policy_version 67196 (0.0037) [2024-04-26 03:23:38,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 1101070336. Throughput: 0: 55526.8. Samples: 1050511160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 03:23:38,923][47056] Avg episode reward: [(0, '0.287')] [2024-04-26 03:23:39,514][47288] Updated weights for policy 0, policy_version 67206 (0.0030) [2024-04-26 03:23:41,676][47288] Updated weights for policy 0, policy_version 67216 (0.0030) [2024-04-26 03:23:43,923][47056] Fps is (10 sec: 57345.6, 60 sec: 54886.3, 300 sec: 55483.5). Total num frames: 1101348864. Throughput: 0: 55448.9. Samples: 1050668760. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 03:23:43,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 03:23:45,305][47288] Updated weights for policy 0, policy_version 67226 (0.0029) [2024-04-26 03:23:47,567][47288] Updated weights for policy 0, policy_version 67236 (0.0033) [2024-04-26 03:23:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 1101643776. Throughput: 0: 55375.2. Samples: 1050998900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 03:23:48,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 03:23:51,348][47288] Updated weights for policy 0, policy_version 67246 (0.0030) [2024-04-26 03:23:53,431][47288] Updated weights for policy 0, policy_version 67256 (0.0034) [2024-04-26 03:23:53,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 1101938688. Throughput: 0: 55279.2. Samples: 1051329640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 03:23:53,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 03:23:57,292][47288] Updated weights for policy 0, policy_version 67266 (0.0035) [2024-04-26 03:23:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1102200832. Throughput: 0: 55693.9. Samples: 1051510360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 03:23:58,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 03:23:59,305][47288] Updated weights for policy 0, policy_version 67276 (0.0026) [2024-04-26 03:24:03,037][47288] Updated weights for policy 0, policy_version 67286 (0.0030) [2024-04-26 03:24:03,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 1102462976. Throughput: 0: 55600.9. Samples: 1051844140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 03:24:03,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 03:24:05,178][47288] Updated weights for policy 0, policy_version 67296 (0.0027) [2024-04-26 03:24:08,802][47288] Updated weights for policy 0, policy_version 67306 (0.0025) [2024-04-26 03:24:08,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55432.5, 300 sec: 55538.9). Total num frames: 1102741504. Throughput: 0: 55746.8. Samples: 1052182740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 03:24:08,924][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:24:11,070][47288] Updated weights for policy 0, policy_version 67316 (0.0027) [2024-04-26 03:24:13,923][47056] Fps is (10 sec: 54067.0, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 1103003648. Throughput: 0: 55375.2. Samples: 1052337360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 03:24:13,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 03:24:14,571][47267] Signal inference workers to stop experience collection... (15850 times) [2024-04-26 03:24:14,610][47288] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-04-26 03:24:14,622][47267] Signal inference workers to resume experience collection... (15850 times) [2024-04-26 03:24:14,631][47288] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-04-26 03:24:14,731][47288] Updated weights for policy 0, policy_version 67326 (0.0032) [2024-04-26 03:24:17,035][47288] Updated weights for policy 0, policy_version 67336 (0.0029) [2024-04-26 03:24:18,923][47056] Fps is (10 sec: 54066.5, 60 sec: 54613.1, 300 sec: 55427.9). Total num frames: 1103282176. Throughput: 0: 55481.7. Samples: 1052672180. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 03:24:18,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 03:24:20,714][47288] Updated weights for policy 0, policy_version 67346 (0.0029) [2024-04-26 03:24:23,016][47288] Updated weights for policy 0, policy_version 67356 (0.0031) [2024-04-26 03:24:23,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 1103593472. Throughput: 0: 55483.5. Samples: 1053007920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 03:24:23,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 03:24:26,427][47288] Updated weights for policy 0, policy_version 67366 (0.0027) [2024-04-26 03:24:28,923][47056] Fps is (10 sec: 58984.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1103872000. Throughput: 0: 55680.1. Samples: 1053174360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 03:24:28,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 03:24:28,931][47288] Updated weights for policy 0, policy_version 67376 (0.0027) [2024-04-26 03:24:32,416][47288] Updated weights for policy 0, policy_version 67386 (0.0036) [2024-04-26 03:24:33,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56798.0, 300 sec: 55705.6). Total num frames: 1104183296. Throughput: 0: 55781.1. Samples: 1053509060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 03:24:33,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 03:24:34,980][47288] Updated weights for policy 0, policy_version 67396 (0.0032) [2024-04-26 03:24:38,226][47288] Updated weights for policy 0, policy_version 67406 (0.0029) [2024-04-26 03:24:38,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 1104412672. Throughput: 0: 55868.5. Samples: 1053843720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 03:24:38,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 03:24:40,860][47288] Updated weights for policy 0, policy_version 67416 (0.0028) [2024-04-26 03:24:43,923][47056] Fps is (10 sec: 49152.9, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 1104674816. Throughput: 0: 55497.3. Samples: 1054007740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 03:24:43,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 03:24:44,220][47288] Updated weights for policy 0, policy_version 67426 (0.0031) [2024-04-26 03:24:46,670][47288] Updated weights for policy 0, policy_version 67436 (0.0025) [2024-04-26 03:24:48,923][47056] Fps is (10 sec: 52428.2, 60 sec: 54886.3, 300 sec: 55427.9). Total num frames: 1104936960. Throughput: 0: 55483.4. Samples: 1054340900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 03:24:48,924][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 03:24:49,003][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000067441_1104953344.pth... [2024-04-26 03:24:49,056][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000066629_1091649536.pth [2024-04-26 03:24:50,019][47288] Updated weights for policy 0, policy_version 67446 (0.0033) [2024-04-26 03:24:52,629][47288] Updated weights for policy 0, policy_version 67456 (0.0031) [2024-04-26 03:24:53,923][47056] Fps is (10 sec: 55705.2, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 1105231872. Throughput: 0: 55399.7. Samples: 1054675720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 03:24:53,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 03:24:56,138][47288] Updated weights for policy 0, policy_version 67466 (0.0030) [2024-04-26 03:24:58,900][47288] Updated weights for policy 0, policy_version 67476 (0.0025) [2024-04-26 03:24:58,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55432.4, 300 sec: 55594.5). 
Total num frames: 1105526784. Throughput: 0: 55672.3. Samples: 1054842620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 03:24:58,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 03:25:02,086][47288] Updated weights for policy 0, policy_version 67486 (0.0030) [2024-04-26 03:25:03,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 1105821696. Throughput: 0: 55527.4. Samples: 1055170900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 03:25:03,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 03:25:04,868][47288] Updated weights for policy 0, policy_version 67496 (0.0029) [2024-04-26 03:25:07,809][47288] Updated weights for policy 0, policy_version 67506 (0.0025) [2024-04-26 03:25:08,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 1106116608. Throughput: 0: 55363.1. Samples: 1055499260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 03:25:08,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 03:25:10,590][47288] Updated weights for policy 0, policy_version 67516 (0.0029) [2024-04-26 03:25:13,692][47288] Updated weights for policy 0, policy_version 67526 (0.0034) [2024-04-26 03:25:13,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 1106362368. Throughput: 0: 55622.0. Samples: 1055677360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 03:25:13,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 03:25:14,265][47267] Signal inference workers to stop experience collection... (15900 times) [2024-04-26 03:25:14,270][47267] Signal inference workers to resume experience collection... 
(15900 times) [2024-04-26 03:25:14,307][47288] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-04-26 03:25:14,311][47288] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-04-26 03:25:16,478][47288] Updated weights for policy 0, policy_version 67536 (0.0029) [2024-04-26 03:25:18,923][47056] Fps is (10 sec: 49151.7, 60 sec: 55432.7, 300 sec: 55427.9). Total num frames: 1106608128. Throughput: 0: 55598.7. Samples: 1056011000. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 03:25:18,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 03:25:19,581][47288] Updated weights for policy 0, policy_version 67546 (0.0027) [2024-04-26 03:25:22,588][47288] Updated weights for policy 0, policy_version 67556 (0.0030) [2024-04-26 03:25:23,923][47056] Fps is (10 sec: 52428.5, 60 sec: 54886.3, 300 sec: 55372.3). Total num frames: 1106886656. Throughput: 0: 55530.0. Samples: 1056342580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 03:25:23,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 03:25:25,662][47288] Updated weights for policy 0, policy_version 67566 (0.0029) [2024-04-26 03:25:28,435][47288] Updated weights for policy 0, policy_version 67576 (0.0025) [2024-04-26 03:25:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 54886.3, 300 sec: 55372.4). Total num frames: 1107165184. Throughput: 0: 55154.1. Samples: 1056489680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 03:25:28,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 03:25:31,471][47288] Updated weights for policy 0, policy_version 67586 (0.0025) [2024-04-26 03:25:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 54613.4, 300 sec: 55594.5). Total num frames: 1107460096. Throughput: 0: 55145.3. Samples: 1056822440. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 03:25:33,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 03:25:34,176][47288] Updated weights for policy 0, policy_version 67596 (0.0027) [2024-04-26 03:25:37,238][47288] Updated weights for policy 0, policy_version 67606 (0.0027) [2024-04-26 03:25:38,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 1107755008. Throughput: 0: 55082.2. Samples: 1057154420. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 03:25:38,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 03:25:40,150][47288] Updated weights for policy 0, policy_version 67616 (0.0029) [2024-04-26 03:25:43,241][47288] Updated weights for policy 0, policy_version 67626 (0.0032) [2024-04-26 03:25:43,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56524.6, 300 sec: 55705.6). Total num frames: 1108066304. Throughput: 0: 55492.4. Samples: 1057339780. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 03:25:43,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 03:25:46,188][47288] Updated weights for policy 0, policy_version 67636 (0.0029) [2024-04-26 03:25:48,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55483.4). Total num frames: 1108279296. Throughput: 0: 55663.1. Samples: 1057675740. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 03:25:48,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 03:25:49,114][47288] Updated weights for policy 0, policy_version 67646 (0.0029) [2024-04-26 03:25:52,050][47288] Updated weights for policy 0, policy_version 67656 (0.0031) [2024-04-26 03:25:53,923][47056] Fps is (10 sec: 49152.3, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 1108557824. Throughput: 0: 55709.3. Samples: 1058006180. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 03:25:53,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 03:25:54,900][47288] Updated weights for policy 0, policy_version 67666 (0.0030) [2024-04-26 03:25:58,161][47288] Updated weights for policy 0, policy_version 67676 (0.0029) [2024-04-26 03:25:58,923][47056] Fps is (10 sec: 54066.5, 60 sec: 54886.4, 300 sec: 55372.3). Total num frames: 1108819968. Throughput: 0: 55181.7. Samples: 1058160540. Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-04-26 03:25:58,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 03:26:00,772][47288] Updated weights for policy 0, policy_version 67686 (0.0026) [2024-04-26 03:26:00,908][47267] Signal inference workers to stop experience collection... (15950 times) [2024-04-26 03:26:00,955][47288] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-04-26 03:26:00,965][47267] Signal inference workers to resume experience collection... (15950 times) [2024-04-26 03:26:00,970][47288] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-04-26 03:26:03,923][47056] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 55483.5). Total num frames: 1109114880. Throughput: 0: 55202.8. Samples: 1058495120. Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-04-26 03:26:03,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 03:26:03,927][47288] Updated weights for policy 0, policy_version 67696 (0.0025) [2024-04-26 03:26:06,650][47288] Updated weights for policy 0, policy_version 67706 (0.0027) [2024-04-26 03:26:08,923][47056] Fps is (10 sec: 58982.7, 60 sec: 54886.4, 300 sec: 55650.0). Total num frames: 1109409792. Throughput: 0: 55365.9. Samples: 1058834040. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-04-26 03:26:08,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 03:26:09,693][47288] Updated weights for policy 0, policy_version 67716 (0.0029) [2024-04-26 03:26:12,503][47288] Updated weights for policy 0, policy_version 67726 (0.0033) [2024-04-26 03:26:13,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1109704704. Throughput: 0: 55984.0. Samples: 1059008960. Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-04-26 03:26:13,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 03:26:15,730][47288] Updated weights for policy 0, policy_version 67736 (0.0032) [2024-04-26 03:26:18,292][47288] Updated weights for policy 0, policy_version 67746 (0.0039) [2024-04-26 03:26:18,923][47056] Fps is (10 sec: 60620.6, 60 sec: 56797.9, 300 sec: 55761.1). Total num frames: 1110016000. Throughput: 0: 55979.1. Samples: 1059341500. Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-04-26 03:26:18,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 03:26:21,562][47288] Updated weights for policy 0, policy_version 67756 (0.0030) [2024-04-26 03:26:23,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 1110228992. Throughput: 0: 55990.2. Samples: 1059673980. Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-04-26 03:26:23,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 03:26:24,288][47288] Updated weights for policy 0, policy_version 67766 (0.0026) [2024-04-26 03:26:27,436][47288] Updated weights for policy 0, policy_version 67776 (0.0028) [2024-04-26 03:26:28,923][47056] Fps is (10 sec: 49152.1, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 1110507520. Throughput: 0: 55536.5. Samples: 1059838920. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-04-26 03:26:28,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 03:26:30,062][47288] Updated weights for policy 0, policy_version 67786 (0.0026) [2024-04-26 03:26:33,523][47288] Updated weights for policy 0, policy_version 67796 (0.0027) [2024-04-26 03:26:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 1110786048. Throughput: 0: 55609.2. Samples: 1060178160. Policy #0 lag: (min: 0.0, avg: 12.8, max: 20.0) [2024-04-26 03:26:33,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 03:26:35,906][47288] Updated weights for policy 0, policy_version 67806 (0.0026) [2024-04-26 03:26:38,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 1111080960. Throughput: 0: 55541.9. Samples: 1060505560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:26:38,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 03:26:39,318][47288] Updated weights for policy 0, policy_version 67816 (0.0028) [2024-04-26 03:26:41,816][47288] Updated weights for policy 0, policy_version 67826 (0.0026) [2024-04-26 03:26:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 1111359488. Throughput: 0: 55938.2. Samples: 1060677760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:26:43,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 03:26:45,118][47288] Updated weights for policy 0, policy_version 67836 (0.0026) [2024-04-26 03:26:47,669][47288] Updated weights for policy 0, policy_version 67846 (0.0026) [2024-04-26 03:26:48,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 1111654400. Throughput: 0: 55919.1. Samples: 1061011480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:26:48,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 03:26:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000067850_1111654400.pth... [2024-04-26 03:26:49,009][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000067037_1098334208.pth [2024-04-26 03:26:50,976][47288] Updated weights for policy 0, policy_version 67856 (0.0033) [2024-04-26 03:26:53,107][47267] Signal inference workers to stop experience collection... (16000 times) [2024-04-26 03:26:53,111][47267] Signal inference workers to resume experience collection... (16000 times) [2024-04-26 03:26:53,146][47288] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-04-26 03:26:53,146][47288] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-04-26 03:26:53,488][47288] Updated weights for policy 0, policy_version 67866 (0.0029) [2024-04-26 03:26:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 1111932928. Throughput: 0: 55665.3. Samples: 1061338980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:26:53,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 03:26:56,940][47288] Updated weights for policy 0, policy_version 67876 (0.0030) [2024-04-26 03:26:58,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55978.7, 300 sec: 55483.5). Total num frames: 1112178688. Throughput: 0: 55626.7. Samples: 1061512160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:26:58,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 03:26:59,355][47288] Updated weights for policy 0, policy_version 67886 (0.0031) [2024-04-26 03:27:02,947][47288] Updated weights for policy 0, policy_version 67896 (0.0031) [2024-04-26 03:27:03,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 1112457216. Throughput: 0: 55602.2. Samples: 1061843600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:27:03,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 03:27:05,227][47288] Updated weights for policy 0, policy_version 67906 (0.0029) [2024-04-26 03:27:08,700][47288] Updated weights for policy 0, policy_version 67916 (0.0028) [2024-04-26 03:27:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 1112735744. Throughput: 0: 55611.6. Samples: 1062176500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:27:08,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 03:27:11,265][47288] Updated weights for policy 0, policy_version 67926 (0.0026) [2024-04-26 03:27:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 1113030656. Throughput: 0: 55391.1. Samples: 1062331520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:27:13,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 03:27:14,529][47288] Updated weights for policy 0, policy_version 67936 (0.0029) [2024-04-26 03:27:17,159][47288] Updated weights for policy 0, policy_version 67946 (0.0027) [2024-04-26 03:27:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 1113309184. Throughput: 0: 55225.9. Samples: 1062663320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:27:18,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 03:27:20,498][47288] Updated weights for policy 0, policy_version 67956 (0.0034) [2024-04-26 03:27:22,944][47288] Updated weights for policy 0, policy_version 67966 (0.0025) [2024-04-26 03:27:23,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 1113604096. Throughput: 0: 55379.6. Samples: 1062997640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 03:27:23,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 03:27:26,461][47288] Updated weights for policy 0, policy_version 67976 (0.0027) [2024-04-26 03:27:28,881][47288] Updated weights for policy 0, policy_version 67986 (0.0030) [2024-04-26 03:27:28,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 1113882624. Throughput: 0: 55492.0. Samples: 1063174900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 03:27:28,924][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 03:27:32,188][47288] Updated weights for policy 0, policy_version 67996 (0.0032) [2024-04-26 03:27:33,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 1114112000. Throughput: 0: 55514.2. Samples: 1063509620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 03:27:33,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 03:27:34,837][47288] Updated weights for policy 0, policy_version 68006 (0.0026) [2024-04-26 03:27:38,181][47288] Updated weights for policy 0, policy_version 68016 (0.0033) [2024-04-26 03:27:38,923][47056] Fps is (10 sec: 50790.7, 60 sec: 55159.4, 300 sec: 55372.3). Total num frames: 1114390528. Throughput: 0: 55628.9. Samples: 1063842280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 03:27:38,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 03:27:40,756][47288] Updated weights for policy 0, policy_version 68026 (0.0027) [2024-04-26 03:27:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 1114669056. Throughput: 0: 55307.9. Samples: 1064001020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 03:27:43,923][47056] Avg episode reward: [(0, '0.350')] [2024-04-26 03:27:44,161][47288] Updated weights for policy 0, policy_version 68036 (0.0028) [2024-04-26 03:27:46,578][47288] Updated weights for policy 0, policy_version 68046 (0.0037) [2024-04-26 03:27:48,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 1114963968. Throughput: 0: 55397.3. Samples: 1064336480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 03:27:48,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 03:27:49,818][47267] Signal inference workers to stop experience collection... (16050 times) [2024-04-26 03:27:49,819][47267] Signal inference workers to resume experience collection... (16050 times) [2024-04-26 03:27:49,835][47288] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-04-26 03:27:49,836][47288] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-04-26 03:27:49,927][47288] Updated weights for policy 0, policy_version 68056 (0.0032) [2024-04-26 03:27:52,452][47288] Updated weights for policy 0, policy_version 68066 (0.0032) [2024-04-26 03:27:53,923][47056] Fps is (10 sec: 58983.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 1115258880. Throughput: 0: 55450.3. Samples: 1064671760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 03:27:53,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 03:27:55,831][47288] Updated weights for policy 0, policy_version 68076 (0.0030) [2024-04-26 03:27:58,419][47288] Updated weights for policy 0, policy_version 68086 (0.0033) [2024-04-26 03:27:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1115537408. Throughput: 0: 55970.6. Samples: 1064850200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 03:27:58,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 03:28:01,693][47288] Updated weights for policy 0, policy_version 68096 (0.0028) [2024-04-26 03:28:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 1115815936. Throughput: 0: 55963.0. Samples: 1065181660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:28:03,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 03:28:04,502][47288] Updated weights for policy 0, policy_version 68106 (0.0032) [2024-04-26 03:28:07,656][47288] Updated weights for policy 0, policy_version 68116 (0.0027) [2024-04-26 03:28:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 1116078080. Throughput: 0: 55913.7. Samples: 1065513760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:28:08,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 03:28:10,298][47288] Updated weights for policy 0, policy_version 68126 (0.0031) [2024-04-26 03:28:13,372][47288] Updated weights for policy 0, policy_version 68136 (0.0027) [2024-04-26 03:28:13,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 1116340224. Throughput: 0: 55444.9. Samples: 1065669920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:28:13,924][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 03:28:16,177][47288] Updated weights for policy 0, policy_version 68146 (0.0033) [2024-04-26 03:28:18,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 1116635136. Throughput: 0: 55453.8. Samples: 1066005040. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:28:18,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 03:28:19,077][47288] Updated weights for policy 0, policy_version 68156 (0.0029) [2024-04-26 03:28:22,249][47288] Updated weights for policy 0, policy_version 68166 (0.0028) [2024-04-26 03:28:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 1116913664. Throughput: 0: 55454.6. Samples: 1066337740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:28:23,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 03:28:25,062][47288] Updated weights for policy 0, policy_version 68176 (0.0028) [2024-04-26 03:28:28,018][47288] Updated weights for policy 0, policy_version 68186 (0.0030) [2024-04-26 03:28:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1117208576. Throughput: 0: 55793.4. Samples: 1066511720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:28:28,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 03:28:31,171][47288] Updated weights for policy 0, policy_version 68196 (0.0028) [2024-04-26 03:28:33,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 1117470720. Throughput: 0: 55755.2. Samples: 1066845460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:28:33,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 03:28:33,940][47288] Updated weights for policy 0, policy_version 68206 (0.0030) [2024-04-26 03:28:37,223][47288] Updated weights for policy 0, policy_version 68216 (0.0026) [2024-04-26 03:28:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 1117765632. Throughput: 0: 55725.6. Samples: 1067179420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:28:38,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 03:28:39,890][47288] Updated weights for policy 0, policy_version 68226 (0.0031) [2024-04-26 03:28:42,959][47288] Updated weights for policy 0, policy_version 68236 (0.0033) [2024-04-26 03:28:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 1118027776. Throughput: 0: 55501.1. Samples: 1067347740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:28:43,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 03:28:45,641][47288] Updated weights for policy 0, policy_version 68246 (0.0026) [2024-04-26 03:28:48,820][47288] Updated weights for policy 0, policy_version 68256 (0.0032) [2024-04-26 03:28:48,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55483.4). Total num frames: 1118306304. Throughput: 0: 55612.4. Samples: 1067684220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:28:48,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 03:28:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000068256_1118306304.pth... [2024-04-26 03:28:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000067441_1104953344.pth [2024-04-26 03:28:51,424][47288] Updated weights for policy 0, policy_version 68266 (0.0033) [2024-04-26 03:28:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 1118584832. Throughput: 0: 55768.9. Samples: 1068023360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:28:53,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 03:28:54,658][47288] Updated weights for policy 0, policy_version 68276 (0.0030) [2024-04-26 03:28:57,278][47288] Updated weights for policy 0, policy_version 68286 (0.0032) [2024-04-26 03:28:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55650.0). 
Total num frames: 1118879744. Throughput: 0: 55942.1. Samples: 1068187320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:28:58,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:29:00,410][47288] Updated weights for policy 0, policy_version 68296 (0.0030) [2024-04-26 03:29:03,136][47288] Updated weights for policy 0, policy_version 68306 (0.0029) [2024-04-26 03:29:03,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1119174656. Throughput: 0: 55893.4. Samples: 1068520240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:29:03,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 03:29:06,414][47288] Updated weights for policy 0, policy_version 68316 (0.0036) [2024-04-26 03:29:07,196][47267] Signal inference workers to stop experience collection... (16100 times) [2024-04-26 03:29:07,222][47288] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-04-26 03:29:07,258][47267] Signal inference workers to resume experience collection... (16100 times) [2024-04-26 03:29:07,258][47288] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-04-26 03:29:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1119436800. Throughput: 0: 56040.0. Samples: 1068859540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:29:08,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 03:29:09,047][47288] Updated weights for policy 0, policy_version 68326 (0.0030) [2024-04-26 03:29:12,077][47288] Updated weights for policy 0, policy_version 68336 (0.0030) [2024-04-26 03:29:13,923][47056] Fps is (10 sec: 54066.3, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 1119715328. Throughput: 0: 55970.5. Samples: 1069030400. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:29:13,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 03:29:14,833][47288] Updated weights for policy 0, policy_version 68346 (0.0031) [2024-04-26 03:29:18,072][47288] Updated weights for policy 0, policy_version 68356 (0.0031) [2024-04-26 03:29:18,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 1119993856. Throughput: 0: 55975.9. Samples: 1069364380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:29:18,926][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 03:29:20,673][47288] Updated weights for policy 0, policy_version 68366 (0.0032) [2024-04-26 03:29:23,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 1120256000. Throughput: 0: 55914.4. Samples: 1069695560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:29:23,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 03:29:23,940][47288] Updated weights for policy 0, policy_version 68376 (0.0034) [2024-04-26 03:29:26,399][47288] Updated weights for policy 0, policy_version 68386 (0.0033) [2024-04-26 03:29:28,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 1120534528. Throughput: 0: 55730.6. Samples: 1069855620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 03:29:28,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:29:29,903][47288] Updated weights for policy 0, policy_version 68396 (0.0032) [2024-04-26 03:29:32,302][47288] Updated weights for policy 0, policy_version 68406 (0.0029) [2024-04-26 03:29:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 1120829440. Throughput: 0: 55701.0. Samples: 1070190760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 03:29:33,923][47056] Avg episode reward: [(0, '0.355')] [2024-04-26 03:29:35,640][47288] Updated weights for policy 0, policy_version 68416 (0.0033) [2024-04-26 03:29:38,343][47288] Updated weights for policy 0, policy_version 68426 (0.0026) [2024-04-26 03:29:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1121107968. Throughput: 0: 55606.7. Samples: 1070525660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 03:29:38,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 03:29:41,505][47288] Updated weights for policy 0, policy_version 68436 (0.0032) [2024-04-26 03:29:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 1121386496. Throughput: 0: 55781.6. Samples: 1070697480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 03:29:43,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 03:29:44,218][47288] Updated weights for policy 0, policy_version 68446 (0.0036) [2024-04-26 03:29:47,426][47288] Updated weights for policy 0, policy_version 68456 (0.0027) [2024-04-26 03:29:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1121648640. Throughput: 0: 55737.4. Samples: 1071028420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 03:29:48,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 03:29:50,353][47288] Updated weights for policy 0, policy_version 68466 (0.0030) [2024-04-26 03:29:53,284][47288] Updated weights for policy 0, policy_version 68476 (0.0028) [2024-04-26 03:29:53,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 1121927168. Throughput: 0: 55682.8. Samples: 1071365260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 03:29:53,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 03:29:56,160][47288] Updated weights for policy 0, policy_version 68486 (0.0032) [2024-04-26 03:29:58,923][47056] Fps is (10 sec: 55704.2, 60 sec: 55432.5, 300 sec: 55538.9). Total num frames: 1122205696. Throughput: 0: 55553.8. Samples: 1071530320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 03:29:58,924][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 03:29:59,170][47288] Updated weights for policy 0, policy_version 68496 (0.0026) [2024-04-26 03:30:02,085][47288] Updated weights for policy 0, policy_version 68506 (0.0030) [2024-04-26 03:30:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 1122484224. Throughput: 0: 55552.0. Samples: 1071864220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 03:30:03,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 03:30:05,301][47288] Updated weights for policy 0, policy_version 68516 (0.0027) [2024-04-26 03:30:07,811][47288] Updated weights for policy 0, policy_version 68526 (0.0033) [2024-04-26 03:30:08,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1122779136. Throughput: 0: 55619.0. Samples: 1072198420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 03:30:08,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 03:30:11,029][47288] Updated weights for policy 0, policy_version 68536 (0.0029) [2024-04-26 03:30:13,764][47288] Updated weights for policy 0, policy_version 68546 (0.0027) [2024-04-26 03:30:13,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 1123074048. Throughput: 0: 55987.2. Samples: 1072375040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 03:30:13,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 03:30:16,799][47288] Updated weights for policy 0, policy_version 68556 (0.0028) [2024-04-26 03:30:18,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 1123336192. Throughput: 0: 55945.4. Samples: 1072708300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 03:30:18,923][47056] Avg episode reward: [(0, '0.339')] [2024-04-26 03:30:19,543][47288] Updated weights for policy 0, policy_version 68566 (0.0027) [2024-04-26 03:30:19,560][47267] Signal inference workers to stop experience collection... (16150 times) [2024-04-26 03:30:19,560][47267] Signal inference workers to resume experience collection... (16150 times) [2024-04-26 03:30:19,572][47288] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-04-26 03:30:19,572][47288] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-04-26 03:30:22,724][47288] Updated weights for policy 0, policy_version 68576 (0.0024) [2024-04-26 03:30:23,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 1123614720. Throughput: 0: 55905.8. Samples: 1073041420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 03:30:23,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 03:30:25,247][47288] Updated weights for policy 0, policy_version 68586 (0.0033) [2024-04-26 03:30:28,707][47288] Updated weights for policy 0, policy_version 68596 (0.0029) [2024-04-26 03:30:28,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1123876864. Throughput: 0: 55677.6. Samples: 1073202980. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 03:30:28,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 03:30:31,044][47288] Updated weights for policy 0, policy_version 68606 (0.0026) [2024-04-26 03:30:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 1124171776. Throughput: 0: 55806.1. Samples: 1073539700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 03:30:33,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 03:30:34,655][47288] Updated weights for policy 0, policy_version 68616 (0.0031) [2024-04-26 03:30:36,988][47288] Updated weights for policy 0, policy_version 68626 (0.0027) [2024-04-26 03:30:38,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 1124433920. Throughput: 0: 55786.5. Samples: 1073875660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 03:30:38,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 03:30:40,496][47288] Updated weights for policy 0, policy_version 68636 (0.0037) [2024-04-26 03:30:42,746][47288] Updated weights for policy 0, policy_version 68646 (0.0029) [2024-04-26 03:30:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1124728832. Throughput: 0: 55738.5. Samples: 1074038540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 03:30:43,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 03:30:46,317][47288] Updated weights for policy 0, policy_version 68656 (0.0033) [2024-04-26 03:30:48,858][47288] Updated weights for policy 0, policy_version 68666 (0.0036) [2024-04-26 03:30:48,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 1125023744. Throughput: 0: 55690.7. Samples: 1074370300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 03:30:48,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 03:30:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000068666_1125023744.pth... [2024-04-26 03:30:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000067850_1111654400.pth [2024-04-26 03:30:52,138][47288] Updated weights for policy 0, policy_version 68676 (0.0028) [2024-04-26 03:30:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 1125302272. Throughput: 0: 55788.9. Samples: 1074708920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 03:30:53,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 03:30:54,765][47288] Updated weights for policy 0, policy_version 68686 (0.0026) [2024-04-26 03:30:58,047][47288] Updated weights for policy 0, policy_version 68696 (0.0036) [2024-04-26 03:30:58,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 1125564416. Throughput: 0: 55541.7. Samples: 1074874420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 03:30:58,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 03:31:00,581][47288] Updated weights for policy 0, policy_version 68706 (0.0034) [2024-04-26 03:31:03,801][47288] Updated weights for policy 0, policy_version 68716 (0.0033) [2024-04-26 03:31:03,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 1125842944. Throughput: 0: 55756.9. Samples: 1075217380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 03:31:03,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 03:31:06,619][47288] Updated weights for policy 0, policy_version 68726 (0.0027) [2024-04-26 03:31:08,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 1126121472. Throughput: 0: 55741.5. Samples: 1075549800. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 03:31:08,923][47056] Avg episode reward: [(0, '0.342')] [2024-04-26 03:31:09,597][47288] Updated weights for policy 0, policy_version 68736 (0.0033) [2024-04-26 03:31:12,428][47288] Updated weights for policy 0, policy_version 68746 (0.0033) [2024-04-26 03:31:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 1126383616. Throughput: 0: 55695.5. Samples: 1075709280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 03:31:13,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 03:31:15,587][47288] Updated weights for policy 0, policy_version 68756 (0.0027) [2024-04-26 03:31:18,260][47288] Updated weights for policy 0, policy_version 68766 (0.0029) [2024-04-26 03:31:18,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1126678528. Throughput: 0: 55517.8. Samples: 1076038000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 03:31:18,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:31:21,471][47288] Updated weights for policy 0, policy_version 68776 (0.0031) [2024-04-26 03:31:23,923][47056] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 1126957056. Throughput: 0: 55509.5. Samples: 1076373580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 03:31:23,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 03:31:24,115][47288] Updated weights for policy 0, policy_version 68786 (0.0035) [2024-04-26 03:31:27,335][47288] Updated weights for policy 0, policy_version 68796 (0.0030) [2024-04-26 03:31:28,923][47056] Fps is (10 sec: 57342.8, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 1127251968. Throughput: 0: 55739.7. Samples: 1076546840. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 03:31:28,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 03:31:29,904][47288] Updated weights for policy 0, policy_version 68806 (0.0033) [2024-04-26 03:31:33,182][47288] Updated weights for policy 0, policy_version 68816 (0.0033) [2024-04-26 03:31:33,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 1127497728. Throughput: 0: 55795.3. Samples: 1076881100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 03:31:33,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 03:31:36,052][47288] Updated weights for policy 0, policy_version 68826 (0.0029) [2024-04-26 03:31:38,429][47267] Signal inference workers to stop experience collection... (16200 times) [2024-04-26 03:31:38,477][47288] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-04-26 03:31:38,485][47267] Signal inference workers to resume experience collection... (16200 times) [2024-04-26 03:31:38,489][47288] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-04-26 03:31:38,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1127792640. Throughput: 0: 55691.6. Samples: 1077215040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 03:31:38,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 03:31:39,089][47288] Updated weights for policy 0, policy_version 68836 (0.0033) [2024-04-26 03:31:41,841][47288] Updated weights for policy 0, policy_version 68846 (0.0030) [2024-04-26 03:31:43,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 1128054784. Throughput: 0: 55501.0. Samples: 1077371960. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 03:31:43,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:31:44,980][47288] Updated weights for policy 0, policy_version 68856 (0.0030) [2024-04-26 03:31:47,796][47288] Updated weights for policy 0, policy_version 68866 (0.0032) [2024-04-26 03:31:48,923][47056] Fps is (10 sec: 52428.6, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 1128316928. Throughput: 0: 55307.2. Samples: 1077706200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 03:31:48,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 03:31:50,793][47288] Updated weights for policy 0, policy_version 68876 (0.0028) [2024-04-26 03:31:53,816][47288] Updated weights for policy 0, policy_version 68886 (0.0037) [2024-04-26 03:31:53,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 1128628224. Throughput: 0: 55255.4. Samples: 1078036280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 03:31:53,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 03:31:57,003][47288] Updated weights for policy 0, policy_version 68896 (0.0035) [2024-04-26 03:31:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 1128890368. Throughput: 0: 55467.9. Samples: 1078205340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 03:31:58,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 03:31:59,833][47288] Updated weights for policy 0, policy_version 68906 (0.0032) [2024-04-26 03:32:03,027][47288] Updated weights for policy 0, policy_version 68916 (0.0032) [2024-04-26 03:32:03,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 1129185280. Throughput: 0: 55491.9. Samples: 1078535140. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 03:32:03,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 03:32:05,895][47288] Updated weights for policy 0, policy_version 68926 (0.0033) [2024-04-26 03:32:08,781][47288] Updated weights for policy 0, policy_version 68936 (0.0030) [2024-04-26 03:32:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 1129447424. Throughput: 0: 55532.3. Samples: 1078872540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 03:32:08,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 03:32:11,588][47288] Updated weights for policy 0, policy_version 68946 (0.0030) [2024-04-26 03:32:13,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1129725952. Throughput: 0: 55436.8. Samples: 1079041480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-26 03:32:13,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 03:32:14,464][47288] Updated weights for policy 0, policy_version 68956 (0.0030) [2024-04-26 03:32:17,478][47288] Updated weights for policy 0, policy_version 68966 (0.0027) [2024-04-26 03:32:18,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 1129988096. Throughput: 0: 55476.7. Samples: 1079377540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 03:32:18,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 03:32:20,607][47288] Updated weights for policy 0, policy_version 68976 (0.0033) [2024-04-26 03:32:23,619][47288] Updated weights for policy 0, policy_version 68986 (0.0029) [2024-04-26 03:32:23,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 1130266624. Throughput: 0: 55583.7. Samples: 1079716300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 03:32:23,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 03:32:26,364][47288] Updated weights for policy 0, policy_version 68996 (0.0028) [2024-04-26 03:32:28,923][47056] Fps is (10 sec: 57342.5, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 1130561536. Throughput: 0: 55611.7. Samples: 1079874500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 03:32:28,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 03:32:29,516][47288] Updated weights for policy 0, policy_version 69006 (0.0031) [2024-04-26 03:32:32,273][47288] Updated weights for policy 0, policy_version 69016 (0.0032) [2024-04-26 03:32:33,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 1130840064. Throughput: 0: 55600.7. Samples: 1080208220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 03:32:33,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 03:32:35,262][47288] Updated weights for policy 0, policy_version 69026 (0.0029) [2024-04-26 03:32:38,051][47288] Updated weights for policy 0, policy_version 69036 (0.0024) [2024-04-26 03:32:38,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 1131134976. Throughput: 0: 55693.2. Samples: 1080542480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 03:32:38,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 03:32:41,055][47288] Updated weights for policy 0, policy_version 69046 (0.0027) [2024-04-26 03:32:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1131397120. Throughput: 0: 55815.9. Samples: 1080717040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 03:32:43,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 03:32:43,994][47288] Updated weights for policy 0, policy_version 69056 (0.0029) [2024-04-26 03:32:47,250][47288] Updated weights for policy 0, policy_version 69066 (0.0029) [2024-04-26 03:32:48,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 1131675648. Throughput: 0: 56010.6. Samples: 1081055620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 03:32:48,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 03:32:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000069072_1131675648.pth... [2024-04-26 03:32:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000068256_1118306304.pth [2024-04-26 03:32:49,778][47288] Updated weights for policy 0, policy_version 69076 (0.0032) [2024-04-26 03:32:53,307][47288] Updated weights for policy 0, policy_version 69086 (0.0031) [2024-04-26 03:32:53,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 1131937792. Throughput: 0: 55897.5. Samples: 1081387920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 03:32:53,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 03:32:55,645][47288] Updated weights for policy 0, policy_version 69096 (0.0029) [2024-04-26 03:32:58,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 1132216320. Throughput: 0: 55621.2. Samples: 1081544440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 03:32:58,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 03:32:59,174][47288] Updated weights for policy 0, policy_version 69106 (0.0029) [2024-04-26 03:32:59,924][47267] Signal inference workers to stop experience collection... 
(16250 times) [2024-04-26 03:32:59,961][47288] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-04-26 03:32:59,978][47267] Signal inference workers to resume experience collection... (16250 times) [2024-04-26 03:32:59,980][47288] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-04-26 03:33:01,582][47288] Updated weights for policy 0, policy_version 69116 (0.0028) [2024-04-26 03:33:03,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1132511232. Throughput: 0: 55614.1. Samples: 1081880180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 03:33:03,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 03:33:05,108][47288] Updated weights for policy 0, policy_version 69126 (0.0027) [2024-04-26 03:33:07,509][47288] Updated weights for policy 0, policy_version 69136 (0.0028) [2024-04-26 03:33:08,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1132806144. Throughput: 0: 55455.5. Samples: 1082211800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 03:33:08,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 03:33:10,889][47288] Updated weights for policy 0, policy_version 69146 (0.0027) [2024-04-26 03:33:13,278][47288] Updated weights for policy 0, policy_version 69156 (0.0029) [2024-04-26 03:33:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 1133101056. Throughput: 0: 55828.7. Samples: 1082386780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 03:33:13,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 03:33:16,690][47288] Updated weights for policy 0, policy_version 69166 (0.0029) [2024-04-26 03:33:18,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.5, 300 sec: 55761.1). Total num frames: 1133363200. Throughput: 0: 55905.9. Samples: 1082724000. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 03:33:18,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 03:33:19,058][47288] Updated weights for policy 0, policy_version 69176 (0.0029) [2024-04-26 03:33:22,466][47288] Updated weights for policy 0, policy_version 69186 (0.0032) [2024-04-26 03:33:23,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 1133625344. Throughput: 0: 55897.2. Samples: 1083057860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 03:33:23,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 03:33:25,002][47288] Updated weights for policy 0, policy_version 69196 (0.0034) [2024-04-26 03:33:28,403][47288] Updated weights for policy 0, policy_version 69206 (0.0027) [2024-04-26 03:33:28,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 1133920256. Throughput: 0: 55675.3. Samples: 1083222440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 03:33:28,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:33:30,870][47288] Updated weights for policy 0, policy_version 69216 (0.0031) [2024-04-26 03:33:33,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55432.5, 300 sec: 55594.6). Total num frames: 1134166016. Throughput: 0: 55649.2. Samples: 1083559820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 03:33:33,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 03:33:34,309][47288] Updated weights for policy 0, policy_version 69226 (0.0028) [2024-04-26 03:33:36,634][47288] Updated weights for policy 0, policy_version 69236 (0.0039) [2024-04-26 03:33:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1134460928. Throughput: 0: 55773.7. Samples: 1083897740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 03:33:38,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 03:33:40,012][47288] Updated weights for policy 0, policy_version 69246 (0.0027) [2024-04-26 03:33:42,299][47288] Updated weights for policy 0, policy_version 69256 (0.0029) [2024-04-26 03:33:43,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 1134755840. Throughput: 0: 56022.7. Samples: 1084065460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 03:33:43,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:33:45,839][47288] Updated weights for policy 0, policy_version 69266 (0.0034) [2024-04-26 03:33:48,217][47288] Updated weights for policy 0, policy_version 69276 (0.0030) [2024-04-26 03:33:48,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.9, 300 sec: 55761.1). Total num frames: 1135034368. Throughput: 0: 55928.6. Samples: 1084396960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 03:33:48,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 03:33:52,034][47288] Updated weights for policy 0, policy_version 69286 (0.0028) [2024-04-26 03:33:53,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 1135312896. Throughput: 0: 55998.1. Samples: 1084731720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 03:33:53,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 03:33:54,214][47288] Updated weights for policy 0, policy_version 69296 (0.0029) [2024-04-26 03:33:57,690][47288] Updated weights for policy 0, policy_version 69306 (0.0031) [2024-04-26 03:33:58,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 1135575040. Throughput: 0: 55867.2. Samples: 1084900800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 03:33:58,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 03:33:59,897][47267] Signal inference workers to stop experience collection... 
(16300 times) [2024-04-26 03:33:59,897][47267] Signal inference workers to resume experience collection... (16300 times) [2024-04-26 03:33:59,925][47288] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-04-26 03:33:59,926][47288] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-04-26 03:34:00,005][47288] Updated weights for policy 0, policy_version 69316 (0.0032) [2024-04-26 03:34:03,384][47288] Updated weights for policy 0, policy_version 69326 (0.0026) [2024-04-26 03:34:03,923][47056] Fps is (10 sec: 54068.5, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 1135853568. Throughput: 0: 55887.9. Samples: 1085238940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 03:34:03,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 03:34:05,811][47288] Updated weights for policy 0, policy_version 69336 (0.0030) [2024-04-26 03:34:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 1136132096. Throughput: 0: 55877.2. Samples: 1085572320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 03:34:08,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 03:34:09,278][47288] Updated weights for policy 0, policy_version 69346 (0.0027) [2024-04-26 03:34:11,967][47288] Updated weights for policy 0, policy_version 69356 (0.0029) [2024-04-26 03:34:13,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 1136410624. Throughput: 0: 55810.3. Samples: 1085733900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 03:34:13,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 03:34:15,170][47288] Updated weights for policy 0, policy_version 69366 (0.0028) [2024-04-26 03:34:17,686][47288] Updated weights for policy 0, policy_version 69376 (0.0029) [2024-04-26 03:34:18,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 1136705536. Throughput: 0: 55684.3. Samples: 1086065620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 03:34:18,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 03:34:21,095][47288] Updated weights for policy 0, policy_version 69386 (0.0031) [2024-04-26 03:34:23,417][47288] Updated weights for policy 0, policy_version 69396 (0.0029) [2024-04-26 03:34:23,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 1137000448. Throughput: 0: 55588.5. Samples: 1086399220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 03:34:23,932][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 03:34:26,764][47288] Updated weights for policy 0, policy_version 69406 (0.0029) [2024-04-26 03:34:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 1137278976. Throughput: 0: 55828.0. Samples: 1086577720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 03:34:28,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 03:34:29,281][47288] Updated weights for policy 0, policy_version 69416 (0.0033) [2024-04-26 03:34:32,610][47288] Updated weights for policy 0, policy_version 69426 (0.0023) [2024-04-26 03:34:33,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 1137524736. Throughput: 0: 55903.1. Samples: 1086912600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 03:34:33,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:34:35,061][47288] Updated weights for policy 0, policy_version 69436 (0.0025) [2024-04-26 03:34:38,633][47288] Updated weights for policy 0, policy_version 69446 (0.0031) [2024-04-26 03:34:38,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 1137803264. Throughput: 0: 55879.7. Samples: 1087246300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 03:34:38,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 03:34:41,073][47288] Updated weights for policy 0, policy_version 69456 (0.0033) [2024-04-26 03:34:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1138081792. Throughput: 0: 55770.3. Samples: 1087410460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 03:34:43,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 03:34:44,518][47288] Updated weights for policy 0, policy_version 69466 (0.0030) [2024-04-26 03:34:46,766][47288] Updated weights for policy 0, policy_version 69476 (0.0028) [2024-04-26 03:34:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1138376704. Throughput: 0: 55794.5. Samples: 1087749700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 03:34:48,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 03:34:48,974][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000069482_1138393088.pth... [2024-04-26 03:34:49,020][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000068666_1125023744.pth [2024-04-26 03:34:50,428][47288] Updated weights for policy 0, policy_version 69486 (0.0041) [2024-04-26 03:34:52,819][47288] Updated weights for policy 0, policy_version 69496 (0.0031) [2024-04-26 03:34:53,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 1138671616. Throughput: 0: 55828.4. Samples: 1088084600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 03:34:53,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 03:34:56,143][47288] Updated weights for policy 0, policy_version 69506 (0.0029) [2024-04-26 03:34:58,740][47288] Updated weights for policy 0, policy_version 69516 (0.0029) [2024-04-26 03:34:58,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55816.7). 
Total num frames: 1138950144. Throughput: 0: 55841.5. Samples: 1088246760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 03:34:58,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 03:34:59,650][47267] Signal inference workers to stop experience collection... (16350 times) [2024-04-26 03:34:59,689][47288] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-04-26 03:34:59,713][47267] Signal inference workers to resume experience collection... (16350 times) [2024-04-26 03:34:59,714][47288] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-04-26 03:35:02,068][47288] Updated weights for policy 0, policy_version 69526 (0.0032) [2024-04-26 03:35:03,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1139212288. Throughput: 0: 55902.0. Samples: 1088581200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:35:03,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 03:35:04,511][47288] Updated weights for policy 0, policy_version 69536 (0.0027) [2024-04-26 03:35:07,981][47288] Updated weights for policy 0, policy_version 69546 (0.0027) [2024-04-26 03:35:08,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 1139490816. Throughput: 0: 56005.7. Samples: 1088919480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:35:08,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 03:35:10,386][47288] Updated weights for policy 0, policy_version 69556 (0.0033) [2024-04-26 03:35:13,740][47288] Updated weights for policy 0, policy_version 69566 (0.0028) [2024-04-26 03:35:13,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 1139769344. Throughput: 0: 55746.2. Samples: 1089086300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:35:13,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 03:35:16,350][47288] Updated weights for policy 0, policy_version 69576 (0.0028) [2024-04-26 03:35:18,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 1140031488. Throughput: 0: 55623.1. Samples: 1089415640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:35:18,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 03:35:19,670][47288] Updated weights for policy 0, policy_version 69586 (0.0026) [2024-04-26 03:35:22,181][47288] Updated weights for policy 0, policy_version 69596 (0.0030) [2024-04-26 03:35:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 1140310016. Throughput: 0: 55623.6. Samples: 1089749360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:35:23,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 03:35:25,456][47288] Updated weights for policy 0, policy_version 69606 (0.0032) [2024-04-26 03:35:28,518][47288] Updated weights for policy 0, policy_version 69616 (0.0028) [2024-04-26 03:35:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1140604928. Throughput: 0: 55805.3. Samples: 1089921700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:35:28,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 03:35:31,311][47288] Updated weights for policy 0, policy_version 69626 (0.0031) [2024-04-26 03:35:33,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 1140899840. Throughput: 0: 55715.6. Samples: 1090256900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:35:33,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 03:35:34,467][47288] Updated weights for policy 0, policy_version 69636 (0.0026) [2024-04-26 03:35:37,296][47288] Updated weights for policy 0, policy_version 69646 (0.0028) [2024-04-26 03:35:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 1141161984. Throughput: 0: 55669.4. Samples: 1090589720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:35:38,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 03:35:40,443][47288] Updated weights for policy 0, policy_version 69656 (0.0033) [2024-04-26 03:35:42,989][47288] Updated weights for policy 0, policy_version 69666 (0.0031) [2024-04-26 03:35:43,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 1141424128. Throughput: 0: 55760.3. Samples: 1090755980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:35:43,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 03:35:46,222][47288] Updated weights for policy 0, policy_version 69676 (0.0028) [2024-04-26 03:35:48,747][47288] Updated weights for policy 0, policy_version 69686 (0.0035) [2024-04-26 03:35:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1141735424. Throughput: 0: 55764.3. Samples: 1091090600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-26 03:35:48,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 03:35:52,276][47288] Updated weights for policy 0, policy_version 69696 (0.0025) [2024-04-26 03:35:53,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1141997568. Throughput: 0: 55642.9. Samples: 1091423400. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-26 03:35:53,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 03:35:54,836][47288] Updated weights for policy 0, policy_version 69706 (0.0026) [2024-04-26 03:35:58,075][47288] Updated weights for policy 0, policy_version 69716 (0.0027) [2024-04-26 03:35:58,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 1142259712. Throughput: 0: 55472.9. Samples: 1091582580. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-26 03:35:58,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 03:36:00,890][47288] Updated weights for policy 0, policy_version 69726 (0.0026) [2024-04-26 03:36:03,866][47288] Updated weights for policy 0, policy_version 69736 (0.0028) [2024-04-26 03:36:03,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 1142554624. Throughput: 0: 55517.7. Samples: 1091913940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-26 03:36:03,924][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 03:36:07,168][47288] Updated weights for policy 0, policy_version 69746 (0.0033) [2024-04-26 03:36:08,746][47267] Signal inference workers to stop experience collection... (16400 times) [2024-04-26 03:36:08,775][47288] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-04-26 03:36:08,798][47267] Signal inference workers to resume experience collection... (16400 times) [2024-04-26 03:36:08,798][47288] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-04-26 03:36:08,923][47056] Fps is (10 sec: 58981.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 1142849536. Throughput: 0: 55556.7. Samples: 1092249420. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-26 03:36:08,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 03:36:09,636][47288] Updated weights for policy 0, policy_version 69756 (0.0035) [2024-04-26 03:36:12,854][47288] Updated weights for policy 0, policy_version 69766 (0.0034) [2024-04-26 03:36:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 1143095296. Throughput: 0: 55534.6. Samples: 1092420760. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-26 03:36:13,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 03:36:15,637][47288] Updated weights for policy 0, policy_version 69776 (0.0027) [2024-04-26 03:36:18,607][47288] Updated weights for policy 0, policy_version 69786 (0.0031) [2024-04-26 03:36:18,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 1143373824. Throughput: 0: 55483.9. Samples: 1092753680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-26 03:36:18,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 03:36:21,432][47288] Updated weights for policy 0, policy_version 69796 (0.0034) [2024-04-26 03:36:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 1143668736. Throughput: 0: 55563.5. Samples: 1093090080. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-26 03:36:23,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 03:36:24,508][47288] Updated weights for policy 0, policy_version 69806 (0.0026) [2024-04-26 03:36:27,389][47288] Updated weights for policy 0, policy_version 69816 (0.0029) [2024-04-26 03:36:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55761.2). Total num frames: 1143947264. Throughput: 0: 55684.0. Samples: 1093261760. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-26 03:36:28,932][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 03:36:30,480][47288] Updated weights for policy 0, policy_version 69826 (0.0031) [2024-04-26 03:36:33,230][47288] Updated weights for policy 0, policy_version 69836 (0.0033) [2024-04-26 03:36:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 1144209408. Throughput: 0: 55551.5. Samples: 1093590420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 03:36:33,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 03:36:36,238][47288] Updated weights for policy 0, policy_version 69846 (0.0027) [2024-04-26 03:36:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 1144504320. Throughput: 0: 55669.5. Samples: 1093928540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 03:36:38,932][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 03:36:39,140][47288] Updated weights for policy 0, policy_version 69856 (0.0028) [2024-04-26 03:36:42,067][47288] Updated weights for policy 0, policy_version 69866 (0.0028) [2024-04-26 03:36:43,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 1144782848. Throughput: 0: 55761.4. Samples: 1094091840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 03:36:43,923][47056] Avg episode reward: [(0, '0.303')] [2024-04-26 03:36:44,929][47288] Updated weights for policy 0, policy_version 69876 (0.0033) [2024-04-26 03:36:47,975][47288] Updated weights for policy 0, policy_version 69886 (0.0028) [2024-04-26 03:36:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 1145044992. Throughput: 0: 55928.8. Samples: 1094430740. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 03:36:48,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 03:36:48,999][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000069889_1145061376.pth... [2024-04-26 03:36:49,041][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000069072_1131675648.pth [2024-04-26 03:36:50,915][47288] Updated weights for policy 0, policy_version 69896 (0.0029) [2024-04-26 03:36:53,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 1145323520. Throughput: 0: 55820.5. Samples: 1094761340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 03:36:53,924][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 03:36:54,154][47288] Updated weights for policy 0, policy_version 69906 (0.0029) [2024-04-26 03:36:56,774][47288] Updated weights for policy 0, policy_version 69916 (0.0031) [2024-04-26 03:36:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 1145602048. Throughput: 0: 55607.0. Samples: 1094923080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 03:36:58,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 03:37:00,180][47288] Updated weights for policy 0, policy_version 69926 (0.0029) [2024-04-26 03:37:02,512][47288] Updated weights for policy 0, policy_version 69936 (0.0030) [2024-04-26 03:37:03,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 1145913344. Throughput: 0: 55637.8. Samples: 1095257380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 03:37:03,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 03:37:05,924][47288] Updated weights for policy 0, policy_version 69946 (0.0031) [2024-04-26 03:37:08,412][47288] Updated weights for policy 0, policy_version 69956 (0.0032) [2024-04-26 03:37:08,923][47056] Fps is (10 sec: 58983.4, 60 sec: 55705.8, 300 sec: 55816.7). 
Total num frames: 1146191872. Throughput: 0: 55561.8. Samples: 1095590360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 03:37:08,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 03:37:11,712][47288] Updated weights for policy 0, policy_version 69966 (0.0030) [2024-04-26 03:37:13,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1146454016. Throughput: 0: 55718.3. Samples: 1095769080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:37:13,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 03:37:14,364][47288] Updated weights for policy 0, policy_version 69976 (0.0029) [2024-04-26 03:37:17,654][47288] Updated weights for policy 0, policy_version 69986 (0.0028) [2024-04-26 03:37:18,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1146732544. Throughput: 0: 55748.4. Samples: 1096099100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:37:18,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 03:37:20,416][47288] Updated weights for policy 0, policy_version 69996 (0.0028) [2024-04-26 03:37:23,472][47288] Updated weights for policy 0, policy_version 70006 (0.0032) [2024-04-26 03:37:23,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 1146994688. Throughput: 0: 55616.9. Samples: 1096431300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:37:23,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:37:26,205][47288] Updated weights for policy 0, policy_version 70016 (0.0031) [2024-04-26 03:37:28,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55432.4, 300 sec: 55705.5). Total num frames: 1147273216. Throughput: 0: 55691.2. Samples: 1096597960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:37:28,924][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 03:37:29,346][47288] Updated weights for policy 0, policy_version 70026 (0.0041) [2024-04-26 03:37:31,257][47267] Signal inference workers to stop experience collection... (16450 times) [2024-04-26 03:37:31,257][47267] Signal inference workers to resume experience collection... (16450 times) [2024-04-26 03:37:31,276][47288] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-04-26 03:37:31,276][47288] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-04-26 03:37:31,920][47288] Updated weights for policy 0, policy_version 70036 (0.0035) [2024-04-26 03:37:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1147551744. Throughput: 0: 55540.0. Samples: 1096930040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:37:33,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 03:37:35,329][47288] Updated weights for policy 0, policy_version 70046 (0.0030) [2024-04-26 03:37:37,813][47288] Updated weights for policy 0, policy_version 70056 (0.0038) [2024-04-26 03:37:38,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 1147846656. Throughput: 0: 55489.8. Samples: 1097258380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:37:38,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 03:37:41,317][47288] Updated weights for policy 0, policy_version 70066 (0.0030) [2024-04-26 03:37:43,723][47288] Updated weights for policy 0, policy_version 70076 (0.0040) [2024-04-26 03:37:43,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55705.3, 300 sec: 55761.1). Total num frames: 1148125184. Throughput: 0: 55813.6. Samples: 1097434700. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:37:43,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 03:37:47,270][47288] Updated weights for policy 0, policy_version 70086 (0.0029) [2024-04-26 03:37:48,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1148403712. Throughput: 0: 55792.5. Samples: 1097768040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:37:48,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 03:37:49,464][47288] Updated weights for policy 0, policy_version 70096 (0.0028) [2024-04-26 03:37:52,999][47288] Updated weights for policy 0, policy_version 70106 (0.0029) [2024-04-26 03:37:53,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1148665856. Throughput: 0: 55843.4. Samples: 1098103320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 03:37:53,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 03:37:55,546][47288] Updated weights for policy 0, policy_version 70116 (0.0031) [2024-04-26 03:37:58,779][47288] Updated weights for policy 0, policy_version 70126 (0.0032) [2024-04-26 03:37:58,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1148944384. Throughput: 0: 55446.7. Samples: 1098264180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 03:37:58,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 03:38:01,488][47288] Updated weights for policy 0, policy_version 70136 (0.0030) [2024-04-26 03:38:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 1149206528. Throughput: 0: 55424.0. Samples: 1098593180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 03:38:03,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 03:38:04,759][47288] Updated weights for policy 0, policy_version 70146 (0.0034) [2024-04-26 03:38:07,370][47288] Updated weights for policy 0, policy_version 70156 (0.0024) [2024-04-26 03:38:08,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 1149517824. Throughput: 0: 55586.2. Samples: 1098932680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 03:38:08,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 03:38:10,691][47288] Updated weights for policy 0, policy_version 70166 (0.0036) [2024-04-26 03:38:13,135][47288] Updated weights for policy 0, policy_version 70176 (0.0033) [2024-04-26 03:38:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 1149779968. Throughput: 0: 55652.7. Samples: 1099102320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 03:38:13,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 03:38:16,411][47288] Updated weights for policy 0, policy_version 70186 (0.0032) [2024-04-26 03:38:18,920][47288] Updated weights for policy 0, policy_version 70196 (0.0033) [2024-04-26 03:38:18,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 1150091264. Throughput: 0: 55712.2. Samples: 1099437080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 03:38:18,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 03:38:22,452][47288] Updated weights for policy 0, policy_version 70206 (0.0035) [2024-04-26 03:38:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1150353408. Throughput: 0: 55855.5. Samples: 1099771880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 03:38:23,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 03:38:24,660][47288] Updated weights for policy 0, policy_version 70216 (0.0028) [2024-04-26 03:38:28,190][47288] Updated weights for policy 0, policy_version 70226 (0.0033) [2024-04-26 03:38:28,923][47056] Fps is (10 sec: 50789.8, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 1150599168. Throughput: 0: 55622.0. Samples: 1099937680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 03:38:28,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 03:38:28,976][47267] Signal inference workers to stop experience collection... (16500 times) [2024-04-26 03:38:29,006][47288] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-04-26 03:38:29,032][47267] Signal inference workers to resume experience collection... (16500 times) [2024-04-26 03:38:29,037][47288] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-04-26 03:38:30,451][47288] Updated weights for policy 0, policy_version 70236 (0.0027) [2024-04-26 03:38:33,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1150894080. Throughput: 0: 55740.1. Samples: 1100276340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 03:38:33,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 03:38:33,964][47288] Updated weights for policy 0, policy_version 70246 (0.0036) [2024-04-26 03:38:36,372][47288] Updated weights for policy 0, policy_version 70256 (0.0029) [2024-04-26 03:38:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 1151156224. Throughput: 0: 55783.5. Samples: 1100613580. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 03:38:38,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 03:38:39,807][47288] Updated weights for policy 0, policy_version 70266 (0.0026) [2024-04-26 03:38:42,227][47288] Updated weights for policy 0, policy_version 70276 (0.0024) [2024-04-26 03:38:43,923][47056] Fps is (10 sec: 58981.6, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 1151483904. Throughput: 0: 55971.9. Samples: 1100782920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 03:38:43,923][47056] Avg episode reward: [(0, '0.341')] [2024-04-26 03:38:45,701][47288] Updated weights for policy 0, policy_version 70286 (0.0036) [2024-04-26 03:38:48,070][47288] Updated weights for policy 0, policy_version 70296 (0.0026) [2024-04-26 03:38:48,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1151746048. Throughput: 0: 55982.8. Samples: 1101112400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 03:38:48,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 03:38:49,002][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000070298_1151762432.pth... [2024-04-26 03:38:49,047][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000069482_1138393088.pth [2024-04-26 03:38:51,560][47288] Updated weights for policy 0, policy_version 70306 (0.0029) [2024-04-26 03:38:53,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 1152024576. Throughput: 0: 55808.2. Samples: 1101444040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 03:38:53,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 03:38:54,096][47288] Updated weights for policy 0, policy_version 70316 (0.0028) [2024-04-26 03:38:57,273][47288] Updated weights for policy 0, policy_version 70326 (0.0034) [2024-04-26 03:38:58,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55705.5, 300 sec: 55705.6). 
Total num frames: 1152286720. Throughput: 0: 55914.1. Samples: 1101618460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 03:38:58,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 03:38:59,920][47288] Updated weights for policy 0, policy_version 70336 (0.0029) [2024-04-26 03:39:03,213][47288] Updated weights for policy 0, policy_version 70346 (0.0031) [2024-04-26 03:39:03,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 1152581632. Throughput: 0: 56018.5. Samples: 1101957920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 03:39:03,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 03:39:05,729][47288] Updated weights for policy 0, policy_version 70356 (0.0029) [2024-04-26 03:39:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1152843776. Throughput: 0: 55956.8. Samples: 1102289940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 03:39:08,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 03:39:09,247][47288] Updated weights for policy 0, policy_version 70366 (0.0028) [2024-04-26 03:39:11,579][47288] Updated weights for policy 0, policy_version 70376 (0.0029) [2024-04-26 03:39:13,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 1153122304. Throughput: 0: 55709.2. Samples: 1102444600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 03:39:13,924][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 03:39:15,239][47288] Updated weights for policy 0, policy_version 70386 (0.0030) [2024-04-26 03:39:17,562][47288] Updated weights for policy 0, policy_version 70396 (0.0028) [2024-04-26 03:39:18,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 1153433600. Throughput: 0: 55669.1. Samples: 1102781460. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 03:39:18,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 03:39:21,034][47288] Updated weights for policy 0, policy_version 70406 (0.0035) [2024-04-26 03:39:23,497][47288] Updated weights for policy 0, policy_version 70416 (0.0032) [2024-04-26 03:39:23,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1153712128. Throughput: 0: 55567.6. Samples: 1103114120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:39:23,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 03:39:26,958][47288] Updated weights for policy 0, policy_version 70426 (0.0032) [2024-04-26 03:39:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 1153974272. Throughput: 0: 55653.8. Samples: 1103287340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:39:28,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 03:39:29,494][47267] Signal inference workers to stop experience collection... (16550 times) [2024-04-26 03:39:29,499][47267] Signal inference workers to resume experience collection... (16550 times) [2024-04-26 03:39:29,524][47288] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-04-26 03:39:29,524][47288] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-04-26 03:39:29,606][47288] Updated weights for policy 0, policy_version 70436 (0.0038) [2024-04-26 03:39:32,683][47288] Updated weights for policy 0, policy_version 70446 (0.0030) [2024-04-26 03:39:33,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55978.4, 300 sec: 55761.1). Total num frames: 1154252800. Throughput: 0: 55718.3. Samples: 1103619740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:39:33,924][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 03:39:35,525][47288] Updated weights for policy 0, policy_version 70456 (0.0034) [2024-04-26 03:39:38,436][47288] Updated weights for policy 0, policy_version 70466 (0.0031) [2024-04-26 03:39:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 1154531328. Throughput: 0: 55682.5. Samples: 1103949760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:39:38,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 03:39:41,371][47288] Updated weights for policy 0, policy_version 70476 (0.0028) [2024-04-26 03:39:43,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1154826240. Throughput: 0: 55566.6. Samples: 1104118960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:39:43,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 03:39:44,233][47288] Updated weights for policy 0, policy_version 70486 (0.0029) [2024-04-26 03:39:47,165][47288] Updated weights for policy 0, policy_version 70496 (0.0031) [2024-04-26 03:39:48,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 1155072000. Throughput: 0: 55424.4. Samples: 1104452020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:39:48,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 03:39:50,184][47288] Updated weights for policy 0, policy_version 70506 (0.0031) [2024-04-26 03:39:52,964][47288] Updated weights for policy 0, policy_version 70516 (0.0032) [2024-04-26 03:39:53,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1155383296. Throughput: 0: 55465.0. Samples: 1104785860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:39:53,927][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 03:39:56,193][47288] Updated weights for policy 0, policy_version 70526 (0.0031) [2024-04-26 03:39:58,877][47288] Updated weights for policy 0, policy_version 70536 (0.0033) [2024-04-26 03:39:58,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 1155661824. Throughput: 0: 55828.1. Samples: 1104956860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:39:58,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 03:40:02,017][47288] Updated weights for policy 0, policy_version 70546 (0.0034) [2024-04-26 03:40:03,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1155923968. Throughput: 0: 55758.8. Samples: 1105290600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 03:40:03,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 03:40:04,676][47288] Updated weights for policy 0, policy_version 70556 (0.0027) [2024-04-26 03:40:07,943][47288] Updated weights for policy 0, policy_version 70566 (0.0032) [2024-04-26 03:40:08,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 1156202496. Throughput: 0: 55906.7. Samples: 1105629920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-26 03:40:08,923][47056] Avg episode reward: [(0, '0.368')] [2024-04-26 03:40:10,568][47288] Updated weights for policy 0, policy_version 70576 (0.0028) [2024-04-26 03:40:13,665][47288] Updated weights for policy 0, policy_version 70586 (0.0033) [2024-04-26 03:40:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 1156497408. Throughput: 0: 55664.5. Samples: 1105792240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-26 03:40:13,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 03:40:16,409][47288] Updated weights for policy 0, policy_version 70596 (0.0032) [2024-04-26 03:40:18,923][47056] Fps is (10 sec: 57342.8, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 1156775936. Throughput: 0: 55724.5. Samples: 1106127340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-26 03:40:18,924][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 03:40:19,437][47288] Updated weights for policy 0, policy_version 70606 (0.0029) [2024-04-26 03:40:22,279][47288] Updated weights for policy 0, policy_version 70616 (0.0030) [2024-04-26 03:40:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1157054464. Throughput: 0: 56054.6. Samples: 1106472220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-26 03:40:23,924][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:40:25,260][47288] Updated weights for policy 0, policy_version 70626 (0.0037) [2024-04-26 03:40:28,321][47288] Updated weights for policy 0, policy_version 70636 (0.0032) [2024-04-26 03:40:28,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 1157316608. Throughput: 0: 56012.9. Samples: 1106639540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-26 03:40:28,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 03:40:31,053][47288] Updated weights for policy 0, policy_version 70646 (0.0034) [2024-04-26 03:40:33,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1157595136. Throughput: 0: 55911.2. Samples: 1106968020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-26 03:40:33,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 03:40:34,207][47288] Updated weights for policy 0, policy_version 70656 (0.0033) [2024-04-26 03:40:37,172][47288] Updated weights for policy 0, policy_version 70666 (0.0038) [2024-04-26 03:40:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1157873664. Throughput: 0: 55922.1. Samples: 1107302360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-26 03:40:38,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 03:40:39,592][47267] Signal inference workers to stop experience collection... (16600 times) [2024-04-26 03:40:39,593][47267] Signal inference workers to resume experience collection... (16600 times) [2024-04-26 03:40:39,611][47288] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-04-26 03:40:39,611][47288] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-04-26 03:40:39,985][47288] Updated weights for policy 0, policy_version 70676 (0.0030) [2024-04-26 03:40:43,208][47288] Updated weights for policy 0, policy_version 70686 (0.0036) [2024-04-26 03:40:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 1158152192. Throughput: 0: 55843.5. Samples: 1107469820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-26 03:40:43,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 03:40:45,815][47288] Updated weights for policy 0, policy_version 70696 (0.0028) [2024-04-26 03:40:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 1158430720. Throughput: 0: 55842.7. Samples: 1107803520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 03:40:48,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 03:40:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000070705_1158430720.pth... 
[2024-04-26 03:40:48,992][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000069889_1145061376.pth [2024-04-26 03:40:49,106][47288] Updated weights for policy 0, policy_version 70706 (0.0031) [2024-04-26 03:40:51,624][47288] Updated weights for policy 0, policy_version 70716 (0.0029) [2024-04-26 03:40:53,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 1158742016. Throughput: 0: 55676.3. Samples: 1108135360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 03:40:53,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 03:40:54,936][47288] Updated weights for policy 0, policy_version 70726 (0.0033) [2024-04-26 03:40:57,516][47288] Updated weights for policy 0, policy_version 70736 (0.0026) [2024-04-26 03:40:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1158987776. Throughput: 0: 56061.7. Samples: 1108315020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 03:40:58,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 03:41:00,643][47288] Updated weights for policy 0, policy_version 70746 (0.0030) [2024-04-26 03:41:03,532][47288] Updated weights for policy 0, policy_version 70756 (0.0030) [2024-04-26 03:41:03,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 1159266304. Throughput: 0: 55990.7. Samples: 1108646920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 03:41:03,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 03:41:06,364][47288] Updated weights for policy 0, policy_version 70766 (0.0026) [2024-04-26 03:41:08,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1159528448. Throughput: 0: 55809.4. Samples: 1108983640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 03:41:08,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 03:41:09,351][47288] Updated weights for policy 0, policy_version 70776 (0.0029) [2024-04-26 03:41:12,216][47288] Updated weights for policy 0, policy_version 70786 (0.0032) [2024-04-26 03:41:13,923][47056] Fps is (10 sec: 52429.3, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 1159790592. Throughput: 0: 55450.3. Samples: 1109134800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 03:41:13,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 03:41:15,216][47288] Updated weights for policy 0, policy_version 70796 (0.0028) [2024-04-26 03:41:18,090][47288] Updated weights for policy 0, policy_version 70806 (0.0027) [2024-04-26 03:41:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1160101888. Throughput: 0: 55630.2. Samples: 1109471380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 03:41:18,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 03:41:20,997][47288] Updated weights for policy 0, policy_version 70816 (0.0030) [2024-04-26 03:41:23,923][47056] Fps is (10 sec: 58983.2, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 1160380416. Throughput: 0: 55658.9. Samples: 1109807000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 03:41:23,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 03:41:24,132][47288] Updated weights for policy 0, policy_version 70826 (0.0036) [2024-04-26 03:41:26,957][47288] Updated weights for policy 0, policy_version 70836 (0.0026) [2024-04-26 03:41:28,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 1160691712. Throughput: 0: 55833.8. Samples: 1109982340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 03:41:28,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 03:41:30,243][47288] Updated weights for policy 0, policy_version 70846 (0.0028) [2024-04-26 03:41:30,850][47267] Signal inference workers to stop experience collection... (16650 times) [2024-04-26 03:41:30,851][47267] Signal inference workers to resume experience collection... (16650 times) [2024-04-26 03:41:30,882][47288] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-04-26 03:41:30,882][47288] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-04-26 03:41:32,751][47288] Updated weights for policy 0, policy_version 70856 (0.0026) [2024-04-26 03:41:33,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 1160970240. Throughput: 0: 55871.1. Samples: 1110317720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:41:33,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 03:41:35,925][47288] Updated weights for policy 0, policy_version 70866 (0.0027) [2024-04-26 03:41:38,596][47288] Updated weights for policy 0, policy_version 70876 (0.0029) [2024-04-26 03:41:38,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 1161232384. Throughput: 0: 55829.3. Samples: 1110647680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:41:38,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 03:41:41,625][47288] Updated weights for policy 0, policy_version 70886 (0.0029) [2024-04-26 03:41:43,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 1161494528. Throughput: 0: 55636.5. Samples: 1110818660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:41:43,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 03:41:44,514][47288] Updated weights for policy 0, policy_version 70896 (0.0035) [2024-04-26 03:41:47,487][47288] Updated weights for policy 0, policy_version 70906 (0.0033) [2024-04-26 03:41:48,923][47056] Fps is (10 sec: 50790.5, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 1161740288. Throughput: 0: 55669.8. Samples: 1111152060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:41:48,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 03:41:50,416][47288] Updated weights for policy 0, policy_version 70916 (0.0032) [2024-04-26 03:41:53,511][47288] Updated weights for policy 0, policy_version 70926 (0.0028) [2024-04-26 03:41:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 1162051584. Throughput: 0: 55668.2. Samples: 1111488720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:41:53,924][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 03:41:56,381][47288] Updated weights for policy 0, policy_version 70936 (0.0030) [2024-04-26 03:41:58,923][47056] Fps is (10 sec: 60620.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1162346496. Throughput: 0: 55930.1. Samples: 1111651660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:41:58,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 03:41:59,208][47288] Updated weights for policy 0, policy_version 70946 (0.0028) [2024-04-26 03:42:02,329][47288] Updated weights for policy 0, policy_version 70956 (0.0030) [2024-04-26 03:42:03,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 1162641408. Throughput: 0: 55839.5. Samples: 1111984160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:42:03,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 03:42:05,080][47288] Updated weights for policy 0, policy_version 70966 (0.0026) [2024-04-26 03:42:07,993][47288] Updated weights for policy 0, policy_version 70976 (0.0025) [2024-04-26 03:42:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 1162919936. Throughput: 0: 55941.0. Samples: 1112324360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:42:08,924][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 03:42:11,485][47288] Updated weights for policy 0, policy_version 70986 (0.0026) [2024-04-26 03:42:13,832][47288] Updated weights for policy 0, policy_version 70996 (0.0027) [2024-04-26 03:42:13,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56797.9, 300 sec: 55816.7). Total num frames: 1163198464. Throughput: 0: 55954.8. Samples: 1112500300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:42:13,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 03:42:17,160][47288] Updated weights for policy 0, policy_version 71006 (0.0032) [2024-04-26 03:42:18,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 1163460608. Throughput: 0: 55960.5. Samples: 1112835940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 03:42:18,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 03:42:19,441][47267] Signal inference workers to stop experience collection... (16700 times) [2024-04-26 03:42:19,445][47267] Signal inference workers to resume experience collection... 
(16700 times) [2024-04-26 03:42:19,462][47288] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-04-26 03:42:19,463][47288] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-04-26 03:42:19,693][47288] Updated weights for policy 0, policy_version 71016 (0.0026) [2024-04-26 03:42:23,102][47288] Updated weights for policy 0, policy_version 71026 (0.0034) [2024-04-26 03:42:23,923][47056] Fps is (10 sec: 49151.9, 60 sec: 55159.3, 300 sec: 55650.1). Total num frames: 1163689984. Throughput: 0: 56104.9. Samples: 1113172400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 03:42:23,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 03:42:25,566][47288] Updated weights for policy 0, policy_version 71036 (0.0029) [2024-04-26 03:42:28,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 1164001280. Throughput: 0: 55699.6. Samples: 1113325140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 03:42:28,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 03:42:29,193][47288] Updated weights for policy 0, policy_version 71046 (0.0039) [2024-04-26 03:42:31,615][47288] Updated weights for policy 0, policy_version 71056 (0.0028) [2024-04-26 03:42:33,923][47056] Fps is (10 sec: 62259.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 1164312576. Throughput: 0: 55551.7. Samples: 1113651880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 03:42:33,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 03:42:34,948][47288] Updated weights for policy 0, policy_version 71066 (0.0028) [2024-04-26 03:42:37,475][47288] Updated weights for policy 0, policy_version 71076 (0.0027) [2024-04-26 03:42:38,923][47056] Fps is (10 sec: 60620.2, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 1164607488. Throughput: 0: 55476.6. Samples: 1113985160. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 03:42:38,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 03:42:40,690][47288] Updated weights for policy 0, policy_version 71086 (0.0029) [2024-04-26 03:42:43,327][47288] Updated weights for policy 0, policy_version 71096 (0.0024) [2024-04-26 03:42:43,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 1164853248. Throughput: 0: 56040.4. Samples: 1114173480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 03:42:43,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 03:42:46,616][47288] Updated weights for policy 0, policy_version 71106 (0.0029) [2024-04-26 03:42:48,923][47056] Fps is (10 sec: 52428.7, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 1165131776. Throughput: 0: 56058.3. Samples: 1114506780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 03:42:48,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 03:42:48,929][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000071114_1165131776.pth... [2024-04-26 03:42:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000070298_1151762432.pth [2024-04-26 03:42:49,241][47288] Updated weights for policy 0, policy_version 71116 (0.0030) [2024-04-26 03:42:52,500][47288] Updated weights for policy 0, policy_version 71126 (0.0027) [2024-04-26 03:42:53,923][47056] Fps is (10 sec: 50791.6, 60 sec: 55159.7, 300 sec: 55650.1). Total num frames: 1165361152. Throughput: 0: 55865.6. Samples: 1114838300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 03:42:53,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 03:42:55,061][47288] Updated weights for policy 0, policy_version 71136 (0.0028) [2024-04-26 03:42:58,561][47288] Updated weights for policy 0, policy_version 71146 (0.0031) [2024-04-26 03:42:58,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55159.4, 300 sec: 55761.1). 
Total num frames: 1165656064. Throughput: 0: 55324.3. Samples: 1114989900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:42:58,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 03:43:00,684][47267] Signal inference workers to stop experience collection... (16750 times) [2024-04-26 03:43:00,684][47267] Signal inference workers to resume experience collection... (16750 times) [2024-04-26 03:43:00,711][47288] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-04-26 03:43:00,711][47288] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-04-26 03:43:00,955][47288] Updated weights for policy 0, policy_version 71156 (0.0026) [2024-04-26 03:43:03,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 1165950976. Throughput: 0: 55270.2. Samples: 1115323100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:43:03,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 03:43:04,672][47288] Updated weights for policy 0, policy_version 71166 (0.0028) [2024-04-26 03:43:06,886][47288] Updated weights for policy 0, policy_version 71176 (0.0027) [2024-04-26 03:43:08,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 1166245888. Throughput: 0: 55205.3. Samples: 1115656640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:43:08,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 03:43:11,045][47288] Updated weights for policy 0, policy_version 71186 (0.0029) [2024-04-26 03:43:12,667][47288] Updated weights for policy 0, policy_version 71196 (0.0025) [2024-04-26 03:43:13,923][47056] Fps is (10 sec: 60619.3, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 1166557184. Throughput: 0: 55891.3. Samples: 1115840260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:43:13,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 03:43:16,834][47288] Updated weights for policy 0, policy_version 71206 (0.0026) [2024-04-26 03:43:18,658][47288] Updated weights for policy 0, policy_version 71216 (0.0029) [2024-04-26 03:43:18,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 1166835712. Throughput: 0: 56019.5. Samples: 1116172760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:43:18,923][47056] Avg episode reward: [(0, '0.326')] [2024-04-26 03:43:22,779][47288] Updated weights for policy 0, policy_version 71226 (0.0028) [2024-04-26 03:43:23,923][47056] Fps is (10 sec: 50790.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 1167065088. Throughput: 0: 55949.7. Samples: 1116502900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:43:23,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 03:43:24,500][47288] Updated weights for policy 0, policy_version 71236 (0.0027) [2024-04-26 03:43:28,532][47288] Updated weights for policy 0, policy_version 71246 (0.0033) [2024-04-26 03:43:28,923][47056] Fps is (10 sec: 47513.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 1167310848. Throughput: 0: 55214.4. Samples: 1116658120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:43:28,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:43:30,400][47288] Updated weights for policy 0, policy_version 71256 (0.0026) [2024-04-26 03:43:33,923][47056] Fps is (10 sec: 54067.5, 60 sec: 54886.3, 300 sec: 55761.1). Total num frames: 1167605760. Throughput: 0: 55218.3. Samples: 1116991600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:43:33,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 03:43:34,245][47288] Updated weights for policy 0, policy_version 71266 (0.0026) [2024-04-26 03:43:36,217][47288] Updated weights for policy 0, policy_version 71276 (0.0035) [2024-04-26 03:43:38,923][47056] Fps is (10 sec: 58981.3, 60 sec: 54886.3, 300 sec: 55650.1). Total num frames: 1167900672. Throughput: 0: 55373.5. Samples: 1117330120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 03:43:38,924][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 03:43:40,170][47288] Updated weights for policy 0, policy_version 71286 (0.0028) [2024-04-26 03:43:42,103][47288] Updated weights for policy 0, policy_version 71296 (0.0034) [2024-04-26 03:43:43,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 1168195584. Throughput: 0: 55796.1. Samples: 1117500720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-04-26 03:43:43,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 03:43:46,016][47288] Updated weights for policy 0, policy_version 71306 (0.0032) [2024-04-26 03:43:47,932][47288] Updated weights for policy 0, policy_version 71316 (0.0027) [2024-04-26 03:43:48,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 1168490496. Throughput: 0: 55802.5. Samples: 1117834220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-04-26 03:43:48,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 03:43:52,138][47288] Updated weights for policy 0, policy_version 71326 (0.0031) [2024-04-26 03:43:52,720][47267] Signal inference workers to stop experience collection... (16800 times) [2024-04-26 03:43:52,721][47267] Signal inference workers to resume experience collection... 
(16800 times) [2024-04-26 03:43:52,746][47288] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-04-26 03:43:52,751][47288] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-04-26 03:43:53,893][47288] Updated weights for policy 0, policy_version 71336 (0.0026) [2024-04-26 03:43:53,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56797.8, 300 sec: 55872.2). Total num frames: 1168769024. Throughput: 0: 55711.2. Samples: 1118163640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-04-26 03:43:53,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 03:43:57,904][47288] Updated weights for policy 0, policy_version 71346 (0.0024) [2024-04-26 03:43:58,923][47056] Fps is (10 sec: 49152.6, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 1168982016. Throughput: 0: 55350.0. Samples: 1118331000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-04-26 03:43:58,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:43:59,850][47288] Updated weights for policy 0, policy_version 71356 (0.0027) [2024-04-26 03:44:03,692][47288] Updated weights for policy 0, policy_version 71366 (0.0028) [2024-04-26 03:44:03,923][47056] Fps is (10 sec: 50789.9, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 1169276928. Throughput: 0: 55387.0. Samples: 1118665180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-04-26 03:44:03,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 03:44:05,639][47288] Updated weights for policy 0, policy_version 71376 (0.0027) [2024-04-26 03:44:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 1169555456. Throughput: 0: 55484.2. Samples: 1118999680. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-04-26 03:44:08,927][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 03:44:09,523][47288] Updated weights for policy 0, policy_version 71386 (0.0027) [2024-04-26 03:44:11,445][47288] Updated weights for policy 0, policy_version 71396 (0.0025) [2024-04-26 03:44:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 1169850368. Throughput: 0: 55503.8. Samples: 1119155800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-04-26 03:44:13,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 03:44:15,300][47288] Updated weights for policy 0, policy_version 71406 (0.0029) [2024-04-26 03:44:17,372][47288] Updated weights for policy 0, policy_version 71416 (0.0025) [2024-04-26 03:44:18,923][47056] Fps is (10 sec: 58981.5, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 1170145280. Throughput: 0: 55547.5. Samples: 1119491240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-04-26 03:44:18,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 03:44:21,026][47288] Updated weights for policy 0, policy_version 71426 (0.0029) [2024-04-26 03:44:23,408][47288] Updated weights for policy 0, policy_version 71436 (0.0031) [2024-04-26 03:44:23,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 1170423808. Throughput: 0: 55545.6. Samples: 1119829660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-04-26 03:44:23,923][47056] Avg episode reward: [(0, '0.308')] [2024-04-26 03:44:26,935][47288] Updated weights for policy 0, policy_version 71446 (0.0034) [2024-04-26 03:44:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.7, 300 sec: 55816.7). Total num frames: 1170718720. Throughput: 0: 55752.0. Samples: 1120009560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 03:44:28,924][47056] Avg episode reward: [(0, '0.352')] [2024-04-26 03:44:29,181][47288] Updated weights for policy 0, policy_version 71456 (0.0025) [2024-04-26 03:44:32,809][47288] Updated weights for policy 0, policy_version 71466 (0.0029) [2024-04-26 03:44:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 1170980864. Throughput: 0: 55812.2. Samples: 1120345760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 03:44:33,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 03:44:35,039][47288] Updated weights for policy 0, policy_version 71476 (0.0032) [2024-04-26 03:44:38,678][47288] Updated weights for policy 0, policy_version 71486 (0.0027) [2024-04-26 03:44:38,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1171243008. Throughput: 0: 55944.3. Samples: 1120681140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 03:44:38,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 03:44:40,778][47288] Updated weights for policy 0, policy_version 71496 (0.0029) [2024-04-26 03:44:41,701][47267] Signal inference workers to stop experience collection... (16850 times) [2024-04-26 03:44:41,721][47288] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-04-26 03:44:41,787][47267] Signal inference workers to resume experience collection... (16850 times) [2024-04-26 03:44:41,787][47288] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-04-26 03:44:43,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 1171505152. Throughput: 0: 55753.3. Samples: 1120839900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 03:44:43,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 03:44:44,473][47288] Updated weights for policy 0, policy_version 71506 (0.0026) [2024-04-26 03:44:46,701][47288] Updated weights for policy 0, policy_version 71516 (0.0028) [2024-04-26 03:44:48,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 1171800064. Throughput: 0: 55786.8. Samples: 1121175580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 03:44:48,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 03:44:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000071521_1171800064.pth... [2024-04-26 03:44:48,987][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000070705_1158430720.pth [2024-04-26 03:44:50,358][47288] Updated weights for policy 0, policy_version 71526 (0.0030) [2024-04-26 03:44:52,748][47288] Updated weights for policy 0, policy_version 71536 (0.0031) [2024-04-26 03:44:53,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1172094976. Throughput: 0: 55815.2. Samples: 1121511360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 03:44:53,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 03:44:56,114][47288] Updated weights for policy 0, policy_version 71546 (0.0027) [2024-04-26 03:44:58,828][47288] Updated weights for policy 0, policy_version 71556 (0.0031) [2024-04-26 03:44:58,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56524.7, 300 sec: 55761.1). Total num frames: 1172373504. Throughput: 0: 56195.1. Samples: 1121684580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 03:44:58,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 03:45:01,914][47288] Updated weights for policy 0, policy_version 71566 (0.0030) [2024-04-26 03:45:03,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.8, 300 sec: 55705.6). 
Total num frames: 1172635648. Throughput: 0: 56167.8. Samples: 1122018780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 03:45:03,923][47056] Avg episode reward: [(0, '0.362')] [2024-04-26 03:45:04,681][47288] Updated weights for policy 0, policy_version 71576 (0.0029) [2024-04-26 03:45:07,756][47288] Updated weights for policy 0, policy_version 71586 (0.0034) [2024-04-26 03:45:08,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 1172930560. Throughput: 0: 55969.7. Samples: 1122348300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 03:45:08,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 03:45:10,556][47288] Updated weights for policy 0, policy_version 71596 (0.0027) [2024-04-26 03:45:13,683][47288] Updated weights for policy 0, policy_version 71606 (0.0035) [2024-04-26 03:45:13,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 1173209088. Throughput: 0: 55712.2. Samples: 1122516600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:45:13,923][47056] Avg episode reward: [(0, '0.324')] [2024-04-26 03:45:16,367][47288] Updated weights for policy 0, policy_version 71616 (0.0028) [2024-04-26 03:45:18,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55159.7, 300 sec: 55594.6). Total num frames: 1173454848. Throughput: 0: 55805.0. Samples: 1122856980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:45:18,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 03:45:19,503][47288] Updated weights for policy 0, policy_version 71626 (0.0027) [2024-04-26 03:45:22,226][47288] Updated weights for policy 0, policy_version 71636 (0.0028) [2024-04-26 03:45:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 1173766144. Throughput: 0: 55800.6. Samples: 1123192160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:45:23,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 03:45:25,330][47288] Updated weights for policy 0, policy_version 71646 (0.0031) [2024-04-26 03:45:28,123][47288] Updated weights for policy 0, policy_version 71656 (0.0029) [2024-04-26 03:45:28,923][47056] Fps is (10 sec: 60620.6, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 1174061056. Throughput: 0: 55918.8. Samples: 1123356240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:45:28,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 03:45:31,154][47288] Updated weights for policy 0, policy_version 71666 (0.0029) [2024-04-26 03:45:33,911][47288] Updated weights for policy 0, policy_version 71676 (0.0029) [2024-04-26 03:45:33,923][47056] Fps is (10 sec: 57342.6, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 1174339584. Throughput: 0: 56002.8. Samples: 1123695720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:45:33,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 03:45:37,021][47288] Updated weights for policy 0, policy_version 71686 (0.0027) [2024-04-26 03:45:38,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1174585344. Throughput: 0: 55969.7. Samples: 1124030000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:45:38,923][47056] Avg episode reward: [(0, '0.328')] [2024-04-26 03:45:39,870][47288] Updated weights for policy 0, policy_version 71696 (0.0030) [2024-04-26 03:45:39,884][47267] Signal inference workers to stop experience collection... (16900 times) [2024-04-26 03:45:39,884][47267] Signal inference workers to resume experience collection... 
(16900 times) [2024-04-26 03:45:39,897][47288] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-04-26 03:45:39,897][47288] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-04-26 03:45:42,798][47288] Updated weights for policy 0, policy_version 71706 (0.0036) [2024-04-26 03:45:43,923][47056] Fps is (10 sec: 54068.3, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 1174880256. Throughput: 0: 55882.0. Samples: 1124199260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:45:43,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 03:45:45,789][47288] Updated weights for policy 0, policy_version 71716 (0.0027) [2024-04-26 03:45:48,616][47288] Updated weights for policy 0, policy_version 71726 (0.0026) [2024-04-26 03:45:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 1175158784. Throughput: 0: 55831.4. Samples: 1124531200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 03:45:48,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 03:45:51,664][47288] Updated weights for policy 0, policy_version 71736 (0.0038) [2024-04-26 03:45:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1175420928. Throughput: 0: 55948.5. Samples: 1124865980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 03:45:53,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 03:45:54,459][47288] Updated weights for policy 0, policy_version 71746 (0.0036) [2024-04-26 03:45:57,548][47288] Updated weights for policy 0, policy_version 71756 (0.0035) [2024-04-26 03:45:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1175715840. Throughput: 0: 56011.3. Samples: 1125037120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 03:45:58,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:46:00,377][47288] Updated weights for policy 0, policy_version 71766 (0.0028) [2024-04-26 03:46:03,379][47288] Updated weights for policy 0, policy_version 71776 (0.0031) [2024-04-26 03:46:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1175994368. Throughput: 0: 55918.6. Samples: 1125373320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 03:46:03,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:46:06,479][47288] Updated weights for policy 0, policy_version 71786 (0.0029) [2024-04-26 03:46:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 1176272896. Throughput: 0: 55863.8. Samples: 1125706040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 03:46:08,923][47056] Avg episode reward: [(0, '0.341')] [2024-04-26 03:46:09,221][47288] Updated weights for policy 0, policy_version 71796 (0.0031) [2024-04-26 03:46:12,249][47288] Updated weights for policy 0, policy_version 71806 (0.0028) [2024-04-26 03:46:13,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1176551424. Throughput: 0: 55930.9. Samples: 1125873140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 03:46:13,924][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 03:46:15,170][47288] Updated weights for policy 0, policy_version 71816 (0.0030) [2024-04-26 03:46:18,022][47288] Updated weights for policy 0, policy_version 71826 (0.0033) [2024-04-26 03:46:18,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 1176846336. Throughput: 0: 55805.1. Samples: 1126206940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 03:46:18,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 03:46:21,113][47288] Updated weights for policy 0, policy_version 71836 (0.0031) [2024-04-26 03:46:23,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 1177108480. Throughput: 0: 55688.0. Samples: 1126535960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 03:46:23,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 03:46:23,935][47288] Updated weights for policy 0, policy_version 71846 (0.0027) [2024-04-26 03:46:27,049][47288] Updated weights for policy 0, policy_version 71856 (0.0028) [2024-04-26 03:46:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 1177387008. Throughput: 0: 55781.7. Samples: 1126709440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 03:46:28,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 03:46:29,835][47288] Updated weights for policy 0, policy_version 71866 (0.0029) [2024-04-26 03:46:32,920][47288] Updated weights for policy 0, policy_version 71876 (0.0025) [2024-04-26 03:46:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 1177681920. Throughput: 0: 55869.3. Samples: 1127045320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 03:46:33,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 03:46:35,554][47288] Updated weights for policy 0, policy_version 71886 (0.0032) [2024-04-26 03:46:36,376][47267] Signal inference workers to stop experience collection... (16950 times) [2024-04-26 03:46:36,421][47288] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-04-26 03:46:36,470][47267] Signal inference workers to resume experience collection... 
(16950 times) [2024-04-26 03:46:36,470][47288] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-04-26 03:46:38,738][47288] Updated weights for policy 0, policy_version 71896 (0.0027) [2024-04-26 03:46:38,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 1177960448. Throughput: 0: 55819.4. Samples: 1127377860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:46:38,923][47056] Avg episode reward: [(0, '0.316')] [2024-04-26 03:46:41,750][47288] Updated weights for policy 0, policy_version 71906 (0.0028) [2024-04-26 03:46:43,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55432.3, 300 sec: 55816.7). Total num frames: 1178206208. Throughput: 0: 55530.6. Samples: 1127536000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:46:43,924][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 03:46:44,634][47288] Updated weights for policy 0, policy_version 71916 (0.0030) [2024-04-26 03:46:47,462][47288] Updated weights for policy 0, policy_version 71926 (0.0026) [2024-04-26 03:46:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 1178501120. Throughput: 0: 55503.0. Samples: 1127870960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:46:48,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 03:46:49,009][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000071931_1178517504.pth... [2024-04-26 03:46:49,055][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000071114_1165131776.pth [2024-04-26 03:46:50,415][47288] Updated weights for policy 0, policy_version 71936 (0.0031) [2024-04-26 03:46:53,441][47288] Updated weights for policy 0, policy_version 71946 (0.0028) [2024-04-26 03:46:53,923][47056] Fps is (10 sec: 57345.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1178779648. Throughput: 0: 55512.2. Samples: 1128204080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:46:53,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 03:46:56,435][47288] Updated weights for policy 0, policy_version 71956 (0.0034) [2024-04-26 03:46:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1179058176. Throughput: 0: 55485.8. Samples: 1128370000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:46:58,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 03:46:59,331][47288] Updated weights for policy 0, policy_version 71966 (0.0026) [2024-04-26 03:47:02,336][47288] Updated weights for policy 0, policy_version 71976 (0.0024) [2024-04-26 03:47:03,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55705.4, 300 sec: 55650.1). Total num frames: 1179336704. Throughput: 0: 55452.3. Samples: 1128702300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:47:03,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 03:47:05,233][47288] Updated weights for policy 0, policy_version 71986 (0.0031) [2024-04-26 03:47:08,231][47288] Updated weights for policy 0, policy_version 71996 (0.0027) [2024-04-26 03:47:08,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1179631616. Throughput: 0: 55490.2. Samples: 1129033020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:47:08,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 03:47:11,045][47288] Updated weights for policy 0, policy_version 72006 (0.0030) [2024-04-26 03:47:13,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 1179877376. Throughput: 0: 55480.0. Samples: 1129206040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:47:13,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 03:47:14,100][47288] Updated weights for policy 0, policy_version 72016 (0.0027) [2024-04-26 03:47:16,738][47288] Updated weights for policy 0, policy_version 72026 (0.0027) [2024-04-26 03:47:18,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55159.3, 300 sec: 55816.7). Total num frames: 1180155904. Throughput: 0: 55483.4. Samples: 1129542080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 03:47:18,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 03:47:19,956][47288] Updated weights for policy 0, policy_version 72036 (0.0026) [2024-04-26 03:47:22,751][47288] Updated weights for policy 0, policy_version 72046 (0.0029) [2024-04-26 03:47:23,683][47267] Signal inference workers to stop experience collection... (17000 times) [2024-04-26 03:47:23,683][47267] Signal inference workers to resume experience collection... (17000 times) [2024-04-26 03:47:23,706][47288] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-04-26 03:47:23,706][47288] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-04-26 03:47:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 1180450816. Throughput: 0: 55452.0. Samples: 1129873200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:47:23,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 03:47:25,867][47288] Updated weights for policy 0, policy_version 72056 (0.0027) [2024-04-26 03:47:28,759][47288] Updated weights for policy 0, policy_version 72066 (0.0028) [2024-04-26 03:47:28,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 1180729344. Throughput: 0: 55596.3. Samples: 1130037820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:47:28,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 03:47:31,796][47288] Updated weights for policy 0, policy_version 72076 (0.0027) [2024-04-26 03:47:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 1181007872. Throughput: 0: 55492.3. Samples: 1130368120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:47:33,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 03:47:34,698][47288] Updated weights for policy 0, policy_version 72086 (0.0032) [2024-04-26 03:47:37,631][47288] Updated weights for policy 0, policy_version 72096 (0.0028) [2024-04-26 03:47:38,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 1181286400. Throughput: 0: 55461.2. Samples: 1130699840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:47:38,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 03:47:40,932][47288] Updated weights for policy 0, policy_version 72106 (0.0031) [2024-04-26 03:47:43,556][47288] Updated weights for policy 0, policy_version 72116 (0.0035) [2024-04-26 03:47:43,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 1181564928. Throughput: 0: 55446.3. Samples: 1130865080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:47:43,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 03:47:46,658][47288] Updated weights for policy 0, policy_version 72126 (0.0030) [2024-04-26 03:47:48,923][47056] Fps is (10 sec: 52429.9, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 1181810688. Throughput: 0: 55529.2. Samples: 1131201100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:47:48,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 03:47:49,514][47288] Updated weights for policy 0, policy_version 72136 (0.0031) [2024-04-26 03:47:52,438][47288] Updated weights for policy 0, policy_version 72146 (0.0029) [2024-04-26 03:47:53,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 1182105600. Throughput: 0: 55635.1. Samples: 1131536600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:47:53,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 03:47:55,400][47288] Updated weights for policy 0, policy_version 72156 (0.0033) [2024-04-26 03:47:58,462][47288] Updated weights for policy 0, policy_version 72166 (0.0026) [2024-04-26 03:47:58,923][47056] Fps is (10 sec: 58981.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1182400512. Throughput: 0: 55443.1. Samples: 1131700980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:47:58,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 03:48:01,358][47288] Updated weights for policy 0, policy_version 72176 (0.0031) [2024-04-26 03:48:03,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 1182662656. Throughput: 0: 55401.6. Samples: 1132035140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:48:03,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 03:48:04,488][47288] Updated weights for policy 0, policy_version 72186 (0.0032) [2024-04-26 03:48:07,307][47288] Updated weights for policy 0, policy_version 72196 (0.0031) [2024-04-26 03:48:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55594.6). Total num frames: 1182957568. Throughput: 0: 55327.2. Samples: 1132362920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 03:48:08,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 03:48:10,229][47288] Updated weights for policy 0, policy_version 72206 (0.0031) [2024-04-26 03:48:13,197][47288] Updated weights for policy 0, policy_version 72216 (0.0031) [2024-04-26 03:48:13,923][47056] Fps is (10 sec: 55704.2, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 1183219712. Throughput: 0: 55476.7. Samples: 1132534280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 03:48:13,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 03:48:16,071][47288] Updated weights for policy 0, policy_version 72226 (0.0033) [2024-04-26 03:48:18,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 1183498240. Throughput: 0: 55428.0. Samples: 1132862380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 03:48:18,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 03:48:19,121][47288] Updated weights for policy 0, policy_version 72236 (0.0028) [2024-04-26 03:48:20,399][47267] Signal inference workers to stop experience collection... (17050 times) [2024-04-26 03:48:20,435][47288] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-04-26 03:48:20,447][47267] Signal inference workers to resume experience collection... (17050 times) [2024-04-26 03:48:20,451][47288] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-04-26 03:48:22,010][47288] Updated weights for policy 0, policy_version 72246 (0.0024) [2024-04-26 03:48:23,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 1183760384. Throughput: 0: 55579.6. Samples: 1133200920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 03:48:23,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 03:48:24,963][47288] Updated weights for policy 0, policy_version 72256 (0.0033) [2024-04-26 03:48:27,806][47288] Updated weights for policy 0, policy_version 72266 (0.0035) [2024-04-26 03:48:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 1184055296. Throughput: 0: 55505.3. Samples: 1133362820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 03:48:28,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 03:48:30,659][47288] Updated weights for policy 0, policy_version 72276 (0.0031) [2024-04-26 03:48:33,829][47288] Updated weights for policy 0, policy_version 72286 (0.0030) [2024-04-26 03:48:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 1184333824. Throughput: 0: 55469.1. Samples: 1133697220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 03:48:33,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 03:48:36,757][47288] Updated weights for policy 0, policy_version 72296 (0.0030) [2024-04-26 03:48:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 1184612352. Throughput: 0: 55463.5. Samples: 1134032460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 03:48:38,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 03:48:39,577][47288] Updated weights for policy 0, policy_version 72306 (0.0030) [2024-04-26 03:48:42,557][47288] Updated weights for policy 0, policy_version 72316 (0.0028) [2024-04-26 03:48:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 1184890880. Throughput: 0: 55723.5. Samples: 1134208540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 03:48:43,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 03:48:45,488][47288] Updated weights for policy 0, policy_version 72326 (0.0024) [2024-04-26 03:48:48,331][47288] Updated weights for policy 0, policy_version 72336 (0.0031) [2024-04-26 03:48:48,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 1185169408. Throughput: 0: 55782.9. Samples: 1134545380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 03:48:48,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 03:48:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000072337_1185169408.pth... [2024-04-26 03:48:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000071521_1171800064.pth [2024-04-26 03:48:51,354][47288] Updated weights for policy 0, policy_version 72346 (0.0031) [2024-04-26 03:48:53,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 1185447936. Throughput: 0: 55756.4. Samples: 1134871960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 03:48:53,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 03:48:54,500][47288] Updated weights for policy 0, policy_version 72356 (0.0027) [2024-04-26 03:48:57,302][47288] Updated weights for policy 0, policy_version 72366 (0.0029) [2024-04-26 03:48:58,923][47056] Fps is (10 sec: 52429.0, 60 sec: 54886.3, 300 sec: 55650.1). Total num frames: 1185693696. Throughput: 0: 55638.3. Samples: 1135038000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 03:48:58,924][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 03:49:00,295][47288] Updated weights for policy 0, policy_version 72376 (0.0034) [2024-04-26 03:49:03,068][47288] Updated weights for policy 0, policy_version 72386 (0.0026) [2024-04-26 03:49:03,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.4, 300 sec: 55761.1). 
Total num frames: 1186004992. Throughput: 0: 55714.7. Samples: 1135369540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 03:49:03,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 03:49:06,137][47288] Updated weights for policy 0, policy_version 72396 (0.0033) [2024-04-26 03:49:08,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 1186283520. Throughput: 0: 55576.8. Samples: 1135701880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 03:49:08,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 03:49:08,952][47288] Updated weights for policy 0, policy_version 72406 (0.0025) [2024-04-26 03:49:11,957][47288] Updated weights for policy 0, policy_version 72416 (0.0031) [2024-04-26 03:49:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 1186578432. Throughput: 0: 55873.3. Samples: 1135877120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 03:49:13,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 03:49:15,003][47288] Updated weights for policy 0, policy_version 72426 (0.0029) [2024-04-26 03:49:17,749][47267] Signal inference workers to stop experience collection... (17100 times) [2024-04-26 03:49:17,790][47288] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-04-26 03:49:17,803][47267] Signal inference workers to resume experience collection... (17100 times) [2024-04-26 03:49:17,812][47288] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-04-26 03:49:17,907][47288] Updated weights for policy 0, policy_version 72436 (0.0030) [2024-04-26 03:49:18,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1186856960. Throughput: 0: 55979.1. Samples: 1136216280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 03:49:18,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 03:49:20,877][47288] Updated weights for policy 0, policy_version 72446 (0.0029) [2024-04-26 03:49:23,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 1187102720. Throughput: 0: 55932.1. Samples: 1136549400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 03:49:23,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 03:49:23,935][47288] Updated weights for policy 0, policy_version 72456 (0.0032) [2024-04-26 03:49:26,626][47288] Updated weights for policy 0, policy_version 72466 (0.0033) [2024-04-26 03:49:28,923][47056] Fps is (10 sec: 52429.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 1187381248. Throughput: 0: 55632.1. Samples: 1136711980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 03:49:28,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 03:49:29,762][47288] Updated weights for policy 0, policy_version 72476 (0.0028) [2024-04-26 03:49:32,585][47288] Updated weights for policy 0, policy_version 72486 (0.0031) [2024-04-26 03:49:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 1187659776. Throughput: 0: 55604.3. Samples: 1137047560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 03:49:33,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 03:49:35,651][47288] Updated weights for policy 0, policy_version 72496 (0.0026) [2024-04-26 03:49:38,475][47288] Updated weights for policy 0, policy_version 72506 (0.0031) [2024-04-26 03:49:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 1187954688. Throughput: 0: 55749.4. Samples: 1137380680. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:49:38,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 03:49:41,493][47288] Updated weights for policy 0, policy_version 72516 (0.0027) [2024-04-26 03:49:43,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1188233216. Throughput: 0: 55835.6. Samples: 1137550600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:49:43,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 03:49:44,268][47288] Updated weights for policy 0, policy_version 72526 (0.0034) [2024-04-26 03:49:47,384][47288] Updated weights for policy 0, policy_version 72536 (0.0028) [2024-04-26 03:49:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 1188528128. Throughput: 0: 55969.1. Samples: 1137888140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:49:48,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 03:49:49,938][47288] Updated weights for policy 0, policy_version 72546 (0.0032) [2024-04-26 03:49:53,269][47288] Updated weights for policy 0, policy_version 72556 (0.0029) [2024-04-26 03:49:53,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1188806656. Throughput: 0: 56006.0. Samples: 1138222140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:49:53,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 03:49:55,872][47288] Updated weights for policy 0, policy_version 72566 (0.0029) [2024-04-26 03:49:58,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 1189068800. Throughput: 0: 55890.3. Samples: 1138392180. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:49:58,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 03:49:59,025][47288] Updated weights for policy 0, policy_version 72576 (0.0031) [2024-04-26 03:50:01,794][47288] Updated weights for policy 0, policy_version 72586 (0.0030) [2024-04-26 03:50:03,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 1189347328. Throughput: 0: 55870.7. Samples: 1138730460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:50:03,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 03:50:05,001][47288] Updated weights for policy 0, policy_version 72596 (0.0026) [2024-04-26 03:50:07,454][47288] Updated weights for policy 0, policy_version 72606 (0.0026) [2024-04-26 03:50:08,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 1189625856. Throughput: 0: 55900.4. Samples: 1139064920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:50:08,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 03:50:10,905][47288] Updated weights for policy 0, policy_version 72616 (0.0029) [2024-04-26 03:50:13,177][47288] Updated weights for policy 0, policy_version 72626 (0.0027) [2024-04-26 03:50:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 1189920768. Throughput: 0: 55947.0. Samples: 1139229600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 03:50:13,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 03:50:16,727][47288] Updated weights for policy 0, policy_version 72636 (0.0026) [2024-04-26 03:50:18,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 1190215680. Throughput: 0: 55841.7. Samples: 1139560440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 03:50:18,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 03:50:19,016][47288] Updated weights for policy 0, policy_version 72646 (0.0034) [2024-04-26 03:50:22,508][47288] Updated weights for policy 0, policy_version 72656 (0.0029) [2024-04-26 03:50:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 55705.6). Total num frames: 1190494208. Throughput: 0: 55809.7. Samples: 1139892120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 03:50:23,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 03:50:25,052][47288] Updated weights for policy 0, policy_version 72666 (0.0036) [2024-04-26 03:50:28,401][47267] Signal inference workers to stop experience collection... (17150 times) [2024-04-26 03:50:28,412][47288] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-04-26 03:50:28,459][47267] Signal inference workers to resume experience collection... (17150 times) [2024-04-26 03:50:28,459][47288] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-04-26 03:50:28,461][47288] Updated weights for policy 0, policy_version 72676 (0.0024) [2024-04-26 03:50:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 1190756352. Throughput: 0: 55934.3. Samples: 1140067640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 03:50:28,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 03:50:30,827][47288] Updated weights for policy 0, policy_version 72686 (0.0029) [2024-04-26 03:50:33,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 1191018496. Throughput: 0: 55932.8. Samples: 1140405120. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 03:50:33,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 03:50:34,362][47288] Updated weights for policy 0, policy_version 72696 (0.0029) [2024-04-26 03:50:36,556][47288] Updated weights for policy 0, policy_version 72706 (0.0024) [2024-04-26 03:50:38,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 1191280640. Throughput: 0: 55967.0. Samples: 1140740660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 03:50:38,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 03:50:40,181][47288] Updated weights for policy 0, policy_version 72716 (0.0029) [2024-04-26 03:50:42,493][47288] Updated weights for policy 0, policy_version 72726 (0.0027) [2024-04-26 03:50:43,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 1191575552. Throughput: 0: 55740.8. Samples: 1140900520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 03:50:43,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 03:50:45,915][47288] Updated weights for policy 0, policy_version 72736 (0.0032) [2024-04-26 03:50:48,344][47288] Updated weights for policy 0, policy_version 72746 (0.0027) [2024-04-26 03:50:48,923][47056] Fps is (10 sec: 60620.1, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 1191886848. Throughput: 0: 55735.5. Samples: 1141238560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 03:50:48,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 03:50:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000072747_1191886848.pth... [2024-04-26 03:50:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000071931_1178517504.pth [2024-04-26 03:50:51,819][47288] Updated weights for policy 0, policy_version 72756 (0.0034) [2024-04-26 03:50:53,923][47056] Fps is (10 sec: 58983.2, 60 sec: 55978.6, 300 sec: 55761.2). 
Total num frames: 1192165376. Throughput: 0: 55744.9. Samples: 1141573440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 03:50:53,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 03:50:54,149][47288] Updated weights for policy 0, policy_version 72766 (0.0029) [2024-04-26 03:50:57,717][47288] Updated weights for policy 0, policy_version 72776 (0.0030) [2024-04-26 03:50:58,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 1192460288. Throughput: 0: 55933.8. Samples: 1141746620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 03:50:58,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 03:50:59,863][47288] Updated weights for policy 0, policy_version 72786 (0.0025) [2024-04-26 03:51:03,512][47288] Updated weights for policy 0, policy_version 72796 (0.0029) [2024-04-26 03:51:03,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 1192722432. Throughput: 0: 56061.7. Samples: 1142083220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 03:51:03,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 03:51:05,795][47288] Updated weights for policy 0, policy_version 72806 (0.0026) [2024-04-26 03:51:08,923][47056] Fps is (10 sec: 50789.6, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 1192968192. Throughput: 0: 56111.0. Samples: 1142417120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 03:51:08,924][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 03:51:09,504][47288] Updated weights for policy 0, policy_version 72816 (0.0030) [2024-04-26 03:51:11,750][47288] Updated weights for policy 0, policy_version 72826 (0.0029) [2024-04-26 03:51:13,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 1193246720. Throughput: 0: 55730.7. Samples: 1142575520. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 03:51:13,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 03:51:15,418][47288] Updated weights for policy 0, policy_version 72836 (0.0031) [2024-04-26 03:51:16,064][47267] Signal inference workers to stop experience collection... (17200 times) [2024-04-26 03:51:16,110][47288] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-04-26 03:51:16,120][47267] Signal inference workers to resume experience collection... (17200 times) [2024-04-26 03:51:16,128][47288] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-04-26 03:51:17,775][47288] Updated weights for policy 0, policy_version 72846 (0.0032) [2024-04-26 03:51:18,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 1193525248. Throughput: 0: 55507.9. Samples: 1142902980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 03:51:18,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 03:51:21,288][47288] Updated weights for policy 0, policy_version 72856 (0.0034) [2024-04-26 03:51:23,627][47288] Updated weights for policy 0, policy_version 72866 (0.0027) [2024-04-26 03:51:23,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1193836544. Throughput: 0: 55478.6. Samples: 1143237200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 03:51:23,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 03:51:27,029][47288] Updated weights for policy 0, policy_version 72876 (0.0030) [2024-04-26 03:51:28,923][47056] Fps is (10 sec: 60621.5, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 1194131456. Throughput: 0: 55897.5. Samples: 1143415900. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 03:51:28,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:51:29,336][47288] Updated weights for policy 0, policy_version 72886 (0.0029) [2024-04-26 03:51:33,108][47288] Updated weights for policy 0, policy_version 72896 (0.0028) [2024-04-26 03:51:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 1194393600. Throughput: 0: 55769.6. Samples: 1143748180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 03:51:33,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 03:51:35,670][47288] Updated weights for policy 0, policy_version 72906 (0.0029) [2024-04-26 03:51:38,923][47056] Fps is (10 sec: 50789.3, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 1194639360. Throughput: 0: 55839.7. Samples: 1144086240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 03:51:38,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 03:51:39,013][47288] Updated weights for policy 0, policy_version 72916 (0.0029) [2024-04-26 03:51:41,658][47288] Updated weights for policy 0, policy_version 72926 (0.0029) [2024-04-26 03:51:43,923][47056] Fps is (10 sec: 50789.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 1194901504. Throughput: 0: 55434.0. Samples: 1144241160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 03:51:43,923][47056] Avg episode reward: [(0, '0.401')] [2024-04-26 03:51:44,877][47288] Updated weights for policy 0, policy_version 72936 (0.0026) [2024-04-26 03:51:47,468][47288] Updated weights for policy 0, policy_version 72946 (0.0027) [2024-04-26 03:51:48,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 1195196416. Throughput: 0: 55445.2. Samples: 1144578260. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-26 03:51:48,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 03:51:50,621][47288] Updated weights for policy 0, policy_version 72956 (0.0034) [2024-04-26 03:51:53,342][47288] Updated weights for policy 0, policy_version 72966 (0.0029) [2024-04-26 03:51:53,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 1195474944. Throughput: 0: 55419.3. Samples: 1144910980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-26 03:51:53,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 03:51:56,539][47288] Updated weights for policy 0, policy_version 72976 (0.0029) [2024-04-26 03:51:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 1195769856. Throughput: 0: 55742.5. Samples: 1145083940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-26 03:51:58,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 03:51:59,419][47288] Updated weights for policy 0, policy_version 72986 (0.0027) [2024-04-26 03:52:02,572][47288] Updated weights for policy 0, policy_version 72996 (0.0027) [2024-04-26 03:52:03,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 1196064768. Throughput: 0: 55833.1. Samples: 1145415460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-26 03:52:03,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 03:52:05,608][47288] Updated weights for policy 0, policy_version 73006 (0.0029) [2024-04-26 03:52:08,333][47288] Updated weights for policy 0, policy_version 73016 (0.0026) [2024-04-26 03:52:08,534][47267] Signal inference workers to stop experience collection... (17250 times) [2024-04-26 03:52:08,534][47267] Signal inference workers to resume experience collection... 
(17250 times) [2024-04-26 03:52:08,548][47288] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-04-26 03:52:08,549][47288] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-04-26 03:52:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 1196343296. Throughput: 0: 55759.4. Samples: 1145746380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-26 03:52:08,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 03:52:11,478][47288] Updated weights for policy 0, policy_version 73026 (0.0029) [2024-04-26 03:52:13,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 1196605440. Throughput: 0: 55699.9. Samples: 1145922400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-26 03:52:13,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 03:52:14,024][47267] Saving new best policy, reward=0.588! [2024-04-26 03:52:14,031][47288] Updated weights for policy 0, policy_version 73036 (0.0033) [2024-04-26 03:52:17,397][47288] Updated weights for policy 0, policy_version 73046 (0.0028) [2024-04-26 03:52:18,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 1196867584. Throughput: 0: 55825.5. Samples: 1146260340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-26 03:52:18,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 03:52:19,818][47288] Updated weights for policy 0, policy_version 73056 (0.0035) [2024-04-26 03:52:23,119][47288] Updated weights for policy 0, policy_version 73066 (0.0030) [2024-04-26 03:52:23,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 1197178880. Throughput: 0: 55755.3. Samples: 1146595220. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-26 03:52:23,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 03:52:25,801][47288] Updated weights for policy 0, policy_version 73076 (0.0027) [2024-04-26 03:52:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 54613.2, 300 sec: 55594.5). Total num frames: 1197408256. Throughput: 0: 55782.8. Samples: 1146751380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-26 03:52:28,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 03:52:29,119][47288] Updated weights for policy 0, policy_version 73086 (0.0025) [2024-04-26 03:52:31,616][47288] Updated weights for policy 0, policy_version 73096 (0.0028) [2024-04-26 03:52:33,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 1197735936. Throughput: 0: 55742.6. Samples: 1147086680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 19.0) [2024-04-26 03:52:33,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 03:52:35,099][47288] Updated weights for policy 0, policy_version 73106 (0.0035) [2024-04-26 03:52:37,456][47288] Updated weights for policy 0, policy_version 73116 (0.0030) [2024-04-26 03:52:38,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 1198014464. Throughput: 0: 55809.2. Samples: 1147422400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 19.0) [2024-04-26 03:52:38,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 03:52:40,919][47288] Updated weights for policy 0, policy_version 73126 (0.0034) [2024-04-26 03:52:43,244][47288] Updated weights for policy 0, policy_version 73136 (0.0025) [2024-04-26 03:52:43,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.8, 300 sec: 55927.7). Total num frames: 1198309376. Throughput: 0: 55796.3. Samples: 1147594780. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 19.0) [2024-04-26 03:52:43,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 03:52:46,665][47288] Updated weights for policy 0, policy_version 73146 (0.0026) [2024-04-26 03:52:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 1198555136. Throughput: 0: 55969.2. Samples: 1147934080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 19.0) [2024-04-26 03:52:48,924][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 03:52:48,930][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000073155_1198571520.pth... [2024-04-26 03:52:48,970][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000072337_1185169408.pth [2024-04-26 03:52:49,128][47288] Updated weights for policy 0, policy_version 73156 (0.0027) [2024-04-26 03:52:52,501][47288] Updated weights for policy 0, policy_version 73166 (0.0028) [2024-04-26 03:52:53,923][47056] Fps is (10 sec: 52430.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 1198833664. Throughput: 0: 56103.8. Samples: 1148271040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 19.0) [2024-04-26 03:52:53,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 03:52:54,977][47288] Updated weights for policy 0, policy_version 73176 (0.0031) [2024-04-26 03:52:58,268][47288] Updated weights for policy 0, policy_version 73186 (0.0029) [2024-04-26 03:52:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1199128576. Throughput: 0: 55818.7. Samples: 1148434240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 19.0) [2024-04-26 03:52:58,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 03:53:00,911][47288] Updated weights for policy 0, policy_version 73196 (0.0034) [2024-04-26 03:53:03,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 1199374336. Throughput: 0: 55773.0. Samples: 1148770120. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 19.0) [2024-04-26 03:53:03,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 03:53:04,088][47288] Updated weights for policy 0, policy_version 73206 (0.0033) [2024-04-26 03:53:06,701][47288] Updated weights for policy 0, policy_version 73216 (0.0027) [2024-04-26 03:53:08,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 1199669248. Throughput: 0: 55809.0. Samples: 1149106640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 19.0) [2024-04-26 03:53:08,924][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 03:53:10,012][47288] Updated weights for policy 0, policy_version 73226 (0.0027) [2024-04-26 03:53:11,463][47267] Signal inference workers to stop experience collection... (17300 times) [2024-04-26 03:53:11,463][47267] Signal inference workers to resume experience collection... (17300 times) [2024-04-26 03:53:11,476][47288] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-04-26 03:53:11,480][47288] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-04-26 03:53:12,438][47288] Updated weights for policy 0, policy_version 73236 (0.0033) [2024-04-26 03:53:13,923][47056] Fps is (10 sec: 60620.7, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 1199980544. Throughput: 0: 56181.4. Samples: 1149279540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 19.0) [2024-04-26 03:53:13,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 03:53:15,813][47288] Updated weights for policy 0, policy_version 73246 (0.0035) [2024-04-26 03:53:18,420][47288] Updated weights for policy 0, policy_version 73256 (0.0030) [2024-04-26 03:53:18,923][47056] Fps is (10 sec: 58984.2, 60 sec: 56525.0, 300 sec: 55927.8). Total num frames: 1200259072. Throughput: 0: 56169.6. Samples: 1149614300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:53:18,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 03:53:21,627][47288] Updated weights for policy 0, policy_version 73266 (0.0031) [2024-04-26 03:53:23,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 1200537600. Throughput: 0: 56192.3. Samples: 1149951040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:53:23,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 03:53:24,101][47288] Updated weights for policy 0, policy_version 73276 (0.0031) [2024-04-26 03:53:27,370][47288] Updated weights for policy 0, policy_version 73286 (0.0031) [2024-04-26 03:53:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 1200799744. Throughput: 0: 56112.3. Samples: 1150119820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:53:28,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 03:53:29,866][47288] Updated weights for policy 0, policy_version 73296 (0.0026) [2024-04-26 03:53:33,184][47288] Updated weights for policy 0, policy_version 73306 (0.0033) [2024-04-26 03:53:33,923][47056] Fps is (10 sec: 54065.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 1201078272. Throughput: 0: 56068.8. Samples: 1150457180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:53:33,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 03:53:35,801][47288] Updated weights for policy 0, policy_version 73316 (0.0027) [2024-04-26 03:53:38,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 1201340416. Throughput: 0: 56026.1. Samples: 1150792220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:53:38,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 03:53:39,096][47288] Updated weights for policy 0, policy_version 73326 (0.0035) [2024-04-26 03:53:41,599][47288] Updated weights for policy 0, policy_version 73336 (0.0027) [2024-04-26 03:53:43,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55432.8, 300 sec: 55816.7). Total num frames: 1201635328. Throughput: 0: 55914.8. Samples: 1150950400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:53:43,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 03:53:44,889][47288] Updated weights for policy 0, policy_version 73346 (0.0027) [2024-04-26 03:53:47,265][47288] Updated weights for policy 0, policy_version 73356 (0.0030) [2024-04-26 03:53:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1201913856. Throughput: 0: 55886.2. Samples: 1151285000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:53:48,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 03:53:50,684][47288] Updated weights for policy 0, policy_version 73366 (0.0040) [2024-04-26 03:53:53,367][47288] Updated weights for policy 0, policy_version 73376 (0.0024) [2024-04-26 03:53:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 1202208768. Throughput: 0: 55886.0. Samples: 1151621500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:53:53,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 03:53:56,597][47288] Updated weights for policy 0, policy_version 73386 (0.0027) [2024-04-26 03:53:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 1202503680. Throughput: 0: 55961.3. Samples: 1151797800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 03:53:58,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 03:53:59,188][47288] Updated weights for policy 0, policy_version 73396 (0.0027) [2024-04-26 03:54:02,600][47288] Updated weights for policy 0, policy_version 73406 (0.0032) [2024-04-26 03:54:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 1202765824. Throughput: 0: 55982.0. Samples: 1152133500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 03:54:03,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 03:54:05,027][47288] Updated weights for policy 0, policy_version 73416 (0.0024) [2024-04-26 03:54:08,337][47288] Updated weights for policy 0, policy_version 73426 (0.0025) [2024-04-26 03:54:08,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 1203027968. Throughput: 0: 55914.9. Samples: 1152467220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 03:54:08,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 03:54:09,696][47267] Signal inference workers to stop experience collection... (17350 times) [2024-04-26 03:54:09,739][47288] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-04-26 03:54:09,749][47267] Signal inference workers to resume experience collection... (17350 times) [2024-04-26 03:54:09,753][47288] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-04-26 03:54:10,722][47288] Updated weights for policy 0, policy_version 73436 (0.0027) [2024-04-26 03:54:13,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 1203306496. Throughput: 0: 55865.9. Samples: 1152633780. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 03:54:13,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 03:54:14,133][47288] Updated weights for policy 0, policy_version 73446 (0.0030) [2024-04-26 03:54:16,531][47288] Updated weights for policy 0, policy_version 73456 (0.0027) [2024-04-26 03:54:18,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 1203585024. Throughput: 0: 55863.3. Samples: 1152971020. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 03:54:18,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 03:54:19,959][47288] Updated weights for policy 0, policy_version 73466 (0.0029) [2024-04-26 03:54:22,636][47288] Updated weights for policy 0, policy_version 73476 (0.0027) [2024-04-26 03:54:23,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 1203879936. Throughput: 0: 56019.6. Samples: 1153313100. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 03:54:23,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 03:54:25,772][47288] Updated weights for policy 0, policy_version 73486 (0.0029) [2024-04-26 03:54:28,410][47288] Updated weights for policy 0, policy_version 73496 (0.0026) [2024-04-26 03:54:28,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 1204158464. Throughput: 0: 56244.0. Samples: 1153481380. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 03:54:28,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 03:54:31,549][47288] Updated weights for policy 0, policy_version 73506 (0.0029) [2024-04-26 03:54:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.9, 300 sec: 55927.8). Total num frames: 1204453376. Throughput: 0: 56175.7. Samples: 1153812900. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 03:54:33,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 03:54:34,260][47288] Updated weights for policy 0, policy_version 73516 (0.0030) [2024-04-26 03:54:37,376][47288] Updated weights for policy 0, policy_version 73526 (0.0028) [2024-04-26 03:54:38,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 1204731904. Throughput: 0: 56222.2. Samples: 1154151500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 03:54:38,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 03:54:40,478][47288] Updated weights for policy 0, policy_version 73536 (0.0027) [2024-04-26 03:54:43,311][47288] Updated weights for policy 0, policy_version 73546 (0.0036) [2024-04-26 03:54:43,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 1204994048. Throughput: 0: 55929.7. Samples: 1154314640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 03:54:43,924][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 03:54:46,301][47288] Updated weights for policy 0, policy_version 73556 (0.0028) [2024-04-26 03:54:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 1205272576. Throughput: 0: 55909.4. Samples: 1154649420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:54:48,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 03:54:49,056][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000073566_1205305344.pth... [2024-04-26 03:54:49,058][47288] Updated weights for policy 0, policy_version 73566 (0.0028) [2024-04-26 03:54:49,103][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000072747_1191886848.pth [2024-04-26 03:54:52,170][47288] Updated weights for policy 0, policy_version 73576 (0.0034) [2024-04-26 03:54:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55872.2). 
Total num frames: 1205551104. Throughput: 0: 56165.3. Samples: 1154994660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:54:53,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 03:54:54,644][47267] Signal inference workers to stop experience collection... (17400 times) [2024-04-26 03:54:54,687][47288] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-04-26 03:54:54,700][47267] Signal inference workers to resume experience collection... (17400 times) [2024-04-26 03:54:54,709][47288] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-04-26 03:54:54,809][47288] Updated weights for policy 0, policy_version 73586 (0.0028) [2024-04-26 03:54:57,894][47288] Updated weights for policy 0, policy_version 73596 (0.0028) [2024-04-26 03:54:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 1205846016. Throughput: 0: 56062.5. Samples: 1155156600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:54:58,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 03:55:00,830][47288] Updated weights for policy 0, policy_version 73606 (0.0027) [2024-04-26 03:55:03,697][47288] Updated weights for policy 0, policy_version 73616 (0.0027) [2024-04-26 03:55:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 1206124544. Throughput: 0: 56054.5. Samples: 1155493480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:55:03,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 03:55:06,757][47288] Updated weights for policy 0, policy_version 73626 (0.0026) [2024-04-26 03:55:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 1206419456. Throughput: 0: 56013.3. Samples: 1155833700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:55:08,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 03:55:09,644][47288] Updated weights for policy 0, policy_version 73636 (0.0028) [2024-04-26 03:55:12,357][47288] Updated weights for policy 0, policy_version 73646 (0.0027) [2024-04-26 03:55:13,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.6, 300 sec: 55927.7). Total num frames: 1206714368. Throughput: 0: 56100.2. Samples: 1156005900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:55:13,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 03:55:15,561][47288] Updated weights for policy 0, policy_version 73656 (0.0024) [2024-04-26 03:55:18,094][47288] Updated weights for policy 0, policy_version 73666 (0.0030) [2024-04-26 03:55:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56797.7, 300 sec: 55927.7). Total num frames: 1206992896. Throughput: 0: 56223.7. Samples: 1156342980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:55:18,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 03:55:21,371][47288] Updated weights for policy 0, policy_version 73676 (0.0028) [2024-04-26 03:55:23,923][47056] Fps is (10 sec: 54068.5, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 1207255040. Throughput: 0: 56153.5. Samples: 1156678400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:55:23,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 03:55:24,073][47288] Updated weights for policy 0, policy_version 73686 (0.0030) [2024-04-26 03:55:27,145][47288] Updated weights for policy 0, policy_version 73696 (0.0026) [2024-04-26 03:55:28,923][47056] Fps is (10 sec: 54068.3, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 1207533568. Throughput: 0: 56303.4. Samples: 1156848280. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 03:55:28,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 03:55:29,906][47288] Updated weights for policy 0, policy_version 73706 (0.0029) [2024-04-26 03:55:32,872][47288] Updated weights for policy 0, policy_version 73716 (0.0028) [2024-04-26 03:55:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 1207812096. Throughput: 0: 56452.5. Samples: 1157189780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 03:55:33,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 03:55:35,620][47288] Updated weights for policy 0, policy_version 73726 (0.0031) [2024-04-26 03:55:38,735][47288] Updated weights for policy 0, policy_version 73736 (0.0034) [2024-04-26 03:55:38,923][47056] Fps is (10 sec: 55704.1, 60 sec: 55978.5, 300 sec: 55983.3). Total num frames: 1208090624. Throughput: 0: 56380.3. Samples: 1157531780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 03:55:38,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 03:55:41,330][47288] Updated weights for policy 0, policy_version 73746 (0.0030) [2024-04-26 03:55:43,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56525.0, 300 sec: 55927.8). Total num frames: 1208385536. Throughput: 0: 56352.5. Samples: 1157692460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 03:55:43,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 03:55:44,536][47288] Updated weights for policy 0, policy_version 73756 (0.0030) [2024-04-26 03:55:47,190][47288] Updated weights for policy 0, policy_version 73766 (0.0026) [2024-04-26 03:55:48,923][47056] Fps is (10 sec: 58983.9, 60 sec: 56797.9, 300 sec: 55983.3). Total num frames: 1208680448. Throughput: 0: 56323.3. Samples: 1158028020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 03:55:48,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 03:55:50,216][47288] Updated weights for policy 0, policy_version 73776 (0.0027) [2024-04-26 03:55:53,049][47288] Updated weights for policy 0, policy_version 73786 (0.0026) [2024-04-26 03:55:53,410][47267] Signal inference workers to stop experience collection... (17450 times) [2024-04-26 03:55:53,446][47288] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-04-26 03:55:53,459][47267] Signal inference workers to resume experience collection... (17450 times) [2024-04-26 03:55:53,463][47288] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-04-26 03:55:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56798.0, 300 sec: 55927.8). Total num frames: 1208958976. Throughput: 0: 56373.4. Samples: 1158370500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 03:55:53,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 03:55:56,143][47288] Updated weights for policy 0, policy_version 73796 (0.0025) [2024-04-26 03:55:58,765][47288] Updated weights for policy 0, policy_version 73806 (0.0032) [2024-04-26 03:55:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 1209237504. Throughput: 0: 56383.8. Samples: 1158543160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 03:55:58,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 03:56:02,082][47288] Updated weights for policy 0, policy_version 73816 (0.0027) [2024-04-26 03:56:03,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.8, 300 sec: 56038.9). Total num frames: 1209499648. Throughput: 0: 56293.5. Samples: 1158876180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 03:56:03,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 03:56:04,435][47288] Updated weights for policy 0, policy_version 73826 (0.0032) [2024-04-26 03:56:07,784][47288] Updated weights for policy 0, policy_version 73836 (0.0029) [2024-04-26 03:56:08,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 1209778176. Throughput: 0: 56323.8. Samples: 1159212980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 03:56:08,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 03:56:10,416][47288] Updated weights for policy 0, policy_version 73846 (0.0033) [2024-04-26 03:56:13,685][47288] Updated weights for policy 0, policy_version 73856 (0.0030) [2024-04-26 03:56:13,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 1210073088. Throughput: 0: 56245.9. Samples: 1159379360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 03:56:13,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 03:56:16,281][47288] Updated weights for policy 0, policy_version 73866 (0.0029) [2024-04-26 03:56:18,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 1210335232. Throughput: 0: 56103.1. Samples: 1159714420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:56:18,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 03:56:19,512][47288] Updated weights for policy 0, policy_version 73876 (0.0036) [2024-04-26 03:56:22,082][47288] Updated weights for policy 0, policy_version 73886 (0.0029) [2024-04-26 03:56:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.6, 300 sec: 55983.3). Total num frames: 1210646528. Throughput: 0: 55891.2. Samples: 1160046880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:56:23,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 03:56:25,360][47288] Updated weights for policy 0, policy_version 73896 (0.0038) [2024-04-26 03:56:27,912][47288] Updated weights for policy 0, policy_version 73906 (0.0030) [2024-04-26 03:56:28,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.8, 300 sec: 56038.8). Total num frames: 1210925056. Throughput: 0: 56380.0. Samples: 1160229560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:56:28,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 03:56:31,134][47288] Updated weights for policy 0, policy_version 73916 (0.0032) [2024-04-26 03:56:33,919][47288] Updated weights for policy 0, policy_version 73926 (0.0031) [2024-04-26 03:56:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 1211203584. Throughput: 0: 56379.9. Samples: 1160565120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:56:33,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 03:56:36,985][47288] Updated weights for policy 0, policy_version 73936 (0.0039) [2024-04-26 03:56:38,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56798.1, 300 sec: 56261.0). Total num frames: 1211498496. Throughput: 0: 56208.0. Samples: 1160899860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:56:38,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 03:56:39,694][47288] Updated weights for policy 0, policy_version 73946 (0.0031) [2024-04-26 03:56:42,870][47288] Updated weights for policy 0, policy_version 73956 (0.0027) [2024-04-26 03:56:43,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 56038.9). Total num frames: 1211727872. Throughput: 0: 56075.7. Samples: 1161066560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:56:43,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 03:56:45,403][47288] Updated weights for policy 0, policy_version 73966 (0.0034) [2024-04-26 03:56:48,697][47288] Updated weights for policy 0, policy_version 73976 (0.0032) [2024-04-26 03:56:48,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 1212022784. Throughput: 0: 56202.7. Samples: 1161405300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:56:48,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 03:56:49,005][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000073977_1212039168.pth... [2024-04-26 03:56:49,050][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000073155_1198571520.pth [2024-04-26 03:56:51,221][47288] Updated weights for policy 0, policy_version 73986 (0.0027) [2024-04-26 03:56:53,923][47056] Fps is (10 sec: 55704.1, 60 sec: 55432.3, 300 sec: 55983.3). Total num frames: 1212284928. Throughput: 0: 56159.9. Samples: 1161740180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:56:53,924][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 03:56:54,649][47288] Updated weights for policy 0, policy_version 73996 (0.0029) [2024-04-26 03:56:57,213][47288] Updated weights for policy 0, policy_version 74006 (0.0034) [2024-04-26 03:56:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 1212596224. Throughput: 0: 56036.6. Samples: 1161901000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-26 03:56:58,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 03:57:00,347][47288] Updated weights for policy 0, policy_version 74016 (0.0033) [2024-04-26 03:57:01,487][47267] Signal inference workers to stop experience collection... (17500 times) [2024-04-26 03:57:01,488][47267] Signal inference workers to resume experience collection... 
(17500 times) [2024-04-26 03:57:01,519][47288] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-04-26 03:57:01,519][47288] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-04-26 03:57:02,902][47288] Updated weights for policy 0, policy_version 74026 (0.0031) [2024-04-26 03:57:03,923][47056] Fps is (10 sec: 60622.6, 60 sec: 56524.9, 300 sec: 56094.4). Total num frames: 1212891136. Throughput: 0: 56100.1. Samples: 1162238920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 28.0) [2024-04-26 03:57:03,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 03:57:06,249][47288] Updated weights for policy 0, policy_version 74036 (0.0031) [2024-04-26 03:57:08,749][47288] Updated weights for policy 0, policy_version 74046 (0.0028) [2024-04-26 03:57:08,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 1213169664. Throughput: 0: 56339.5. Samples: 1162582160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 28.0) [2024-04-26 03:57:08,924][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 03:57:12,192][47288] Updated weights for policy 0, policy_version 74056 (0.0030) [2024-04-26 03:57:13,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.9, 300 sec: 56205.5). Total num frames: 1213448192. Throughput: 0: 56148.9. Samples: 1162756260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 28.0) [2024-04-26 03:57:13,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 03:57:14,663][47288] Updated weights for policy 0, policy_version 74066 (0.0039) [2024-04-26 03:57:17,871][47288] Updated weights for policy 0, policy_version 74076 (0.0031) [2024-04-26 03:57:18,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56524.8, 300 sec: 56094.4). Total num frames: 1213726720. Throughput: 0: 56084.9. Samples: 1163088940. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 28.0) [2024-04-26 03:57:18,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 03:57:20,585][47288] Updated weights for policy 0, policy_version 74086 (0.0025) [2024-04-26 03:57:23,537][47288] Updated weights for policy 0, policy_version 74096 (0.0030) [2024-04-26 03:57:23,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.8, 300 sec: 56205.5). Total num frames: 1213988864. Throughput: 0: 56153.4. Samples: 1163426760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 28.0) [2024-04-26 03:57:23,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 03:57:26,276][47288] Updated weights for policy 0, policy_version 74106 (0.0033) [2024-04-26 03:57:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 56038.9). Total num frames: 1214267392. Throughput: 0: 56080.0. Samples: 1163590160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 28.0) [2024-04-26 03:57:28,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 03:57:29,551][47288] Updated weights for policy 0, policy_version 74116 (0.0027) [2024-04-26 03:57:32,028][47288] Updated weights for policy 0, policy_version 74126 (0.0029) [2024-04-26 03:57:33,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 1214562304. Throughput: 0: 56085.2. Samples: 1163929140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 28.0) [2024-04-26 03:57:33,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 03:57:35,690][47288] Updated weights for policy 0, policy_version 74136 (0.0032) [2024-04-26 03:57:37,915][47288] Updated weights for policy 0, policy_version 74146 (0.0032) [2024-04-26 03:57:38,923][47056] Fps is (10 sec: 58980.8, 60 sec: 55978.4, 300 sec: 56094.4). Total num frames: 1214857216. Throughput: 0: 56076.5. Samples: 1164263620. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 28.0) [2024-04-26 03:57:38,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 03:57:41,436][47288] Updated weights for policy 0, policy_version 74156 (0.0028) [2024-04-26 03:57:43,716][47288] Updated weights for policy 0, policy_version 74166 (0.0031) [2024-04-26 03:57:43,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56797.7, 300 sec: 56205.4). Total num frames: 1215135744. Throughput: 0: 56442.0. Samples: 1164440900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 28.0) [2024-04-26 03:57:43,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 03:57:47,238][47288] Updated weights for policy 0, policy_version 74176 (0.0030) [2024-04-26 03:57:48,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1215397888. Throughput: 0: 56381.6. Samples: 1164776100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 03:57:48,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 03:57:49,399][47288] Updated weights for policy 0, policy_version 74186 (0.0034) [2024-04-26 03:57:52,911][47288] Updated weights for policy 0, policy_version 74196 (0.0029) [2024-04-26 03:57:53,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56525.0, 300 sec: 56094.4). Total num frames: 1215676416. Throughput: 0: 56385.1. Samples: 1165119480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 03:57:53,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 03:57:55,700][47288] Updated weights for policy 0, policy_version 74206 (0.0024) [2024-04-26 03:57:58,535][47288] Updated weights for policy 0, policy_version 74216 (0.0032) [2024-04-26 03:57:58,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1215971328. Throughput: 0: 56024.8. Samples: 1165277380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 03:57:58,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 03:58:01,309][47288] Updated weights for policy 0, policy_version 74226 (0.0033) [2024-04-26 03:58:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 56205.5). Total num frames: 1216249856. Throughput: 0: 56334.6. Samples: 1165624000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 03:58:03,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 03:58:04,326][47288] Updated weights for policy 0, policy_version 74236 (0.0027) [2024-04-26 03:58:06,264][47267] Signal inference workers to stop experience collection... (17550 times) [2024-04-26 03:58:06,310][47288] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-04-26 03:58:06,321][47267] Signal inference workers to resume experience collection... (17550 times) [2024-04-26 03:58:06,330][47288] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-04-26 03:58:07,113][47288] Updated weights for policy 0, policy_version 74246 (0.0032) [2024-04-26 03:58:08,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55705.8, 300 sec: 56038.9). Total num frames: 1216512000. Throughput: 0: 56376.4. Samples: 1165963700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 03:58:08,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 03:58:10,085][47288] Updated weights for policy 0, policy_version 74256 (0.0027) [2024-04-26 03:58:12,805][47288] Updated weights for policy 0, policy_version 74266 (0.0029) [2024-04-26 03:58:13,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1216823296. Throughput: 0: 56524.3. Samples: 1166133760. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 03:58:13,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 03:58:15,806][47288] Updated weights for policy 0, policy_version 74276 (0.0025) [2024-04-26 03:58:18,471][47288] Updated weights for policy 0, policy_version 74286 (0.0029) [2024-04-26 03:58:18,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56524.8, 300 sec: 56205.4). Total num frames: 1217118208. Throughput: 0: 56512.9. Samples: 1166472220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 03:58:18,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 03:58:21,669][47288] Updated weights for policy 0, policy_version 74296 (0.0025) [2024-04-26 03:58:23,923][47056] Fps is (10 sec: 58982.3, 60 sec: 57070.8, 300 sec: 56316.5). Total num frames: 1217413120. Throughput: 0: 56584.2. Samples: 1166809900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 03:58:23,923][47056] Avg episode reward: [(0, '0.369')] [2024-04-26 03:58:24,158][47288] Updated weights for policy 0, policy_version 74306 (0.0028) [2024-04-26 03:58:27,476][47288] Updated weights for policy 0, policy_version 74316 (0.0025) [2024-04-26 03:58:28,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56524.7, 300 sec: 56205.5). Total num frames: 1217658880. Throughput: 0: 56584.1. Samples: 1166987180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 03:58:28,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 03:58:29,876][47288] Updated weights for policy 0, policy_version 74326 (0.0030) [2024-04-26 03:58:33,210][47288] Updated weights for policy 0, policy_version 74336 (0.0034) [2024-04-26 03:58:33,923][47056] Fps is (10 sec: 52429.3, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1217937408. Throughput: 0: 56615.6. Samples: 1167323800. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-26 03:58:33,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 03:58:35,836][47288] Updated weights for policy 0, policy_version 74346 (0.0031) [2024-04-26 03:58:38,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.8, 300 sec: 56260.9). Total num frames: 1218232320. Throughput: 0: 56530.9. Samples: 1167663380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-26 03:58:38,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 03:58:39,132][47288] Updated weights for policy 0, policy_version 74356 (0.0025) [2024-04-26 03:58:41,630][47288] Updated weights for policy 0, policy_version 74366 (0.0033) [2024-04-26 03:58:43,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1218494464. Throughput: 0: 56712.4. Samples: 1167829440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-26 03:58:43,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 03:58:44,768][47288] Updated weights for policy 0, policy_version 74376 (0.0026) [2024-04-26 03:58:47,509][47288] Updated weights for policy 0, policy_version 74386 (0.0029) [2024-04-26 03:58:48,923][47056] Fps is (10 sec: 55707.1, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1218789376. Throughput: 0: 56527.7. Samples: 1168167740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-26 03:58:48,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 03:58:49,049][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000074391_1218822144.pth... [2024-04-26 03:58:49,093][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000073566_1205305344.pth [2024-04-26 03:58:50,446][47288] Updated weights for policy 0, policy_version 74396 (0.0033) [2024-04-26 03:58:53,218][47288] Updated weights for policy 0, policy_version 74406 (0.0026) [2024-04-26 03:58:53,923][47056] Fps is (10 sec: 58983.9, 60 sec: 56797.9, 300 sec: 56205.5). 
Total num frames: 1219084288. Throughput: 0: 56424.9. Samples: 1168502820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-26 03:58:53,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 03:58:56,270][47288] Updated weights for policy 0, policy_version 74416 (0.0034) [2024-04-26 03:58:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56798.0, 300 sec: 56316.6). Total num frames: 1219379200. Throughput: 0: 56505.9. Samples: 1168676520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-26 03:58:58,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 03:58:59,029][47288] Updated weights for policy 0, policy_version 74426 (0.0032) [2024-04-26 03:59:02,137][47288] Updated weights for policy 0, policy_version 74436 (0.0039) [2024-04-26 03:59:03,364][47267] Signal inference workers to stop experience collection... (17600 times) [2024-04-26 03:59:03,393][47288] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-04-26 03:59:03,411][47267] Signal inference workers to resume experience collection... (17600 times) [2024-04-26 03:59:03,412][47288] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-04-26 03:59:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1219657728. Throughput: 0: 56689.4. Samples: 1169023240. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-26 03:59:03,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 03:59:04,766][47288] Updated weights for policy 0, policy_version 74446 (0.0027) [2024-04-26 03:59:08,183][47288] Updated weights for policy 0, policy_version 74456 (0.0025) [2024-04-26 03:59:08,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1219919872. Throughput: 0: 56667.3. Samples: 1169359920. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-26 03:59:08,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 03:59:10,603][47288] Updated weights for policy 0, policy_version 74466 (0.0028) [2024-04-26 03:59:13,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1220198400. Throughput: 0: 56419.2. Samples: 1169526040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 24.0) [2024-04-26 03:59:13,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 03:59:14,034][47288] Updated weights for policy 0, policy_version 74476 (0.0029) [2024-04-26 03:59:16,310][47288] Updated weights for policy 0, policy_version 74486 (0.0028) [2024-04-26 03:59:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1220493312. Throughput: 0: 56388.0. Samples: 1169861260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 03:59:18,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 03:59:19,980][47288] Updated weights for policy 0, policy_version 74496 (0.0031) [2024-04-26 03:59:22,233][47288] Updated weights for policy 0, policy_version 74506 (0.0034) [2024-04-26 03:59:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1220771840. Throughput: 0: 56405.5. Samples: 1170201620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 03:59:23,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 03:59:25,630][47288] Updated weights for policy 0, policy_version 74516 (0.0034) [2024-04-26 03:59:28,114][47288] Updated weights for policy 0, policy_version 74526 (0.0030) [2024-04-26 03:59:28,923][47056] Fps is (10 sec: 58981.9, 60 sec: 57070.9, 300 sec: 56372.0). Total num frames: 1221083136. Throughput: 0: 56580.1. Samples: 1170375540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 03:59:28,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 03:59:31,299][47288] Updated weights for policy 0, policy_version 74536 (0.0025) [2024-04-26 03:59:33,859][47288] Updated weights for policy 0, policy_version 74546 (0.0032) [2024-04-26 03:59:33,923][47056] Fps is (10 sec: 58982.6, 60 sec: 57070.9, 300 sec: 56372.1). Total num frames: 1221361664. Throughput: 0: 56570.1. Samples: 1170713400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 03:59:33,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 03:59:37,179][47288] Updated weights for policy 0, policy_version 74556 (0.0025) [2024-04-26 03:59:38,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 1221623808. Throughput: 0: 56536.8. Samples: 1171046980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 03:59:38,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 03:59:39,714][47288] Updated weights for policy 0, policy_version 74566 (0.0029) [2024-04-26 03:59:43,094][47288] Updated weights for policy 0, policy_version 74576 (0.0029) [2024-04-26 03:59:43,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56798.0, 300 sec: 56372.1). Total num frames: 1221902336. Throughput: 0: 56396.5. Samples: 1171214360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 03:59:43,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 03:59:45,490][47288] Updated weights for policy 0, policy_version 74586 (0.0028) [2024-04-26 03:59:48,846][47288] Updated weights for policy 0, policy_version 74596 (0.0030) [2024-04-26 03:59:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1222180864. Throughput: 0: 56181.2. Samples: 1171551400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 03:59:48,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 03:59:51,268][47288] Updated weights for policy 0, policy_version 74606 (0.0031) [2024-04-26 03:59:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1222443008. Throughput: 0: 56204.4. Samples: 1171889120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 03:59:53,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 03:59:54,885][47288] Updated weights for policy 0, policy_version 74616 (0.0031) [2024-04-26 03:59:57,167][47288] Updated weights for policy 0, policy_version 74626 (0.0027) [2024-04-26 03:59:58,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1222754304. Throughput: 0: 56191.0. Samples: 1172054640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 03:59:58,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 04:00:00,639][47288] Updated weights for policy 0, policy_version 74636 (0.0032) [2024-04-26 04:00:01,349][47267] Signal inference workers to stop experience collection... (17650 times) [2024-04-26 04:00:01,391][47288] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-04-26 04:00:01,404][47267] Signal inference workers to resume experience collection... (17650 times) [2024-04-26 04:00:01,410][47288] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-04-26 04:00:02,980][47288] Updated weights for policy 0, policy_version 74646 (0.0030) [2024-04-26 04:00:03,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1223016448. Throughput: 0: 56246.1. Samples: 1172392340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 04:00:03,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 04:00:06,594][47288] Updated weights for policy 0, policy_version 74656 (0.0031) [2024-04-26 04:00:08,786][47288] Updated weights for policy 0, policy_version 74666 (0.0037) [2024-04-26 04:00:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.7, 300 sec: 56316.5). Total num frames: 1223327744. Throughput: 0: 56234.1. Samples: 1172732160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 04:00:08,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 04:00:12,380][47288] Updated weights for policy 0, policy_version 74676 (0.0027) [2024-04-26 04:00:13,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1223589888. Throughput: 0: 56161.5. Samples: 1172902800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 04:00:13,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 04:00:14,904][47288] Updated weights for policy 0, policy_version 74686 (0.0035) [2024-04-26 04:00:18,110][47288] Updated weights for policy 0, policy_version 74696 (0.0029) [2024-04-26 04:00:18,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 1223868416. Throughput: 0: 56111.4. Samples: 1173238420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 04:00:18,924][47056] Avg episode reward: [(0, '0.333')] [2024-04-26 04:00:20,641][47288] Updated weights for policy 0, policy_version 74706 (0.0025) [2024-04-26 04:00:23,851][47288] Updated weights for policy 0, policy_version 74716 (0.0027) [2024-04-26 04:00:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1224146944. Throughput: 0: 56310.8. Samples: 1173580960. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 04:00:23,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 04:00:26,614][47288] Updated weights for policy 0, policy_version 74726 (0.0027) [2024-04-26 04:00:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1224441856. Throughput: 0: 56280.8. Samples: 1173747000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 04:00:28,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 04:00:29,704][47288] Updated weights for policy 0, policy_version 74736 (0.0038) [2024-04-26 04:00:32,388][47288] Updated weights for policy 0, policy_version 74746 (0.0026) [2024-04-26 04:00:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1224720384. Throughput: 0: 56290.8. Samples: 1174084480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 04:00:33,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 04:00:35,507][47288] Updated weights for policy 0, policy_version 74756 (0.0028) [2024-04-26 04:00:38,202][47288] Updated weights for policy 0, policy_version 74766 (0.0026) [2024-04-26 04:00:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1225015296. Throughput: 0: 56147.0. Samples: 1174415740. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 04:00:38,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 04:00:41,459][47288] Updated weights for policy 0, policy_version 74776 (0.0025) [2024-04-26 04:00:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1225277440. Throughput: 0: 56521.5. Samples: 1174598100. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 04:00:43,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 04:00:43,986][47288] Updated weights for policy 0, policy_version 74786 (0.0028) [2024-04-26 04:00:47,165][47288] Updated weights for policy 0, policy_version 74796 (0.0026) [2024-04-26 04:00:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1225572352. Throughput: 0: 56484.4. Samples: 1174934140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 04:00:48,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 04:00:49,017][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000074804_1225588736.pth... [2024-04-26 04:00:49,063][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000073977_1212039168.pth [2024-04-26 04:00:49,656][47288] Updated weights for policy 0, policy_version 74806 (0.0034) [2024-04-26 04:00:52,916][47288] Updated weights for policy 0, policy_version 74816 (0.0026) [2024-04-26 04:00:53,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1225834496. Throughput: 0: 56342.4. Samples: 1175267560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 04:00:53,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 04:00:55,440][47288] Updated weights for policy 0, policy_version 74826 (0.0027) [2024-04-26 04:00:58,832][47288] Updated weights for policy 0, policy_version 74836 (0.0030) [2024-04-26 04:00:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1226113024. Throughput: 0: 56097.6. Samples: 1175427200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 04:00:58,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 04:01:01,307][47288] Updated weights for policy 0, policy_version 74846 (0.0028) [2024-04-26 04:01:03,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 56372.1). 
Total num frames: 1226407936. Throughput: 0: 56215.2. Samples: 1175768100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 04:01:03,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 04:01:04,627][47288] Updated weights for policy 0, policy_version 74856 (0.0027) [2024-04-26 04:01:07,102][47288] Updated weights for policy 0, policy_version 74866 (0.0029) [2024-04-26 04:01:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 1226670080. Throughput: 0: 56163.0. Samples: 1176108300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 04:01:08,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 04:01:10,255][47288] Updated weights for policy 0, policy_version 74876 (0.0027) [2024-04-26 04:01:12,201][47267] Signal inference workers to stop experience collection... (17700 times) [2024-04-26 04:01:12,236][47288] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-04-26 04:01:12,288][47267] Signal inference workers to resume experience collection... (17700 times) [2024-04-26 04:01:12,288][47288] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-04-26 04:01:12,910][47288] Updated weights for policy 0, policy_version 74886 (0.0029) [2024-04-26 04:01:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1226964992. Throughput: 0: 56220.8. Samples: 1176276940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 04:01:13,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 04:01:16,035][47288] Updated weights for policy 0, policy_version 74896 (0.0029) [2024-04-26 04:01:18,585][47288] Updated weights for policy 0, policy_version 74906 (0.0033) [2024-04-26 04:01:18,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.9, 300 sec: 56316.6). Total num frames: 1227259904. Throughput: 0: 56188.0. Samples: 1176612940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 04:01:18,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 04:01:22,287][47288] Updated weights for policy 0, policy_version 74916 (0.0027) [2024-04-26 04:01:23,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1227538432. Throughput: 0: 56242.4. Samples: 1176946640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 04:01:23,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 04:01:24,440][47288] Updated weights for policy 0, policy_version 74926 (0.0022) [2024-04-26 04:01:28,112][47288] Updated weights for policy 0, policy_version 74936 (0.0033) [2024-04-26 04:01:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1227800576. Throughput: 0: 56166.5. Samples: 1177125600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 04:01:28,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 04:01:30,229][47288] Updated weights for policy 0, policy_version 74946 (0.0022) [2024-04-26 04:01:33,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 1228062720. Throughput: 0: 56044.9. Samples: 1177456160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 04:01:33,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 04:01:33,953][47288] Updated weights for policy 0, policy_version 74956 (0.0030) [2024-04-26 04:01:36,206][47288] Updated weights for policy 0, policy_version 74966 (0.0036) [2024-04-26 04:01:38,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 56316.5). Total num frames: 1228341248. Throughput: 0: 56172.8. Samples: 1177795340. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:01:38,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 04:01:39,586][47288] Updated weights for policy 0, policy_version 74976 (0.0025) [2024-04-26 04:01:42,496][47288] Updated weights for policy 0, policy_version 74986 (0.0029) [2024-04-26 04:01:43,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1228652544. Throughput: 0: 56191.1. Samples: 1177955800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:01:43,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 04:01:45,247][47288] Updated weights for policy 0, policy_version 74996 (0.0029) [2024-04-26 04:01:48,394][47288] Updated weights for policy 0, policy_version 75006 (0.0035) [2024-04-26 04:01:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 56372.1). Total num frames: 1228914688. Throughput: 0: 56165.6. Samples: 1178295560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:01:48,924][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 04:01:51,217][47288] Updated weights for policy 0, policy_version 75016 (0.0031) [2024-04-26 04:01:53,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.5, 300 sec: 56261.0). Total num frames: 1229193216. Throughput: 0: 56165.1. Samples: 1178635740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:01:53,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 04:01:54,173][47288] Updated weights for policy 0, policy_version 75026 (0.0025) [2024-04-26 04:01:57,045][47288] Updated weights for policy 0, policy_version 75036 (0.0031) [2024-04-26 04:01:58,923][47056] Fps is (10 sec: 60621.3, 60 sec: 56797.9, 300 sec: 56372.0). Total num frames: 1229520896. Throughput: 0: 56214.2. Samples: 1178806580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:01:58,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 04:02:00,050][47288] Updated weights for policy 0, policy_version 75046 (0.0023) [2024-04-26 04:02:02,620][47267] Signal inference workers to stop experience collection... (17750 times) [2024-04-26 04:02:02,670][47288] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-04-26 04:02:02,676][47267] Signal inference workers to resume experience collection... (17750 times) [2024-04-26 04:02:02,685][47288] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-04-26 04:02:02,799][47288] Updated weights for policy 0, policy_version 75056 (0.0027) [2024-04-26 04:02:03,923][47056] Fps is (10 sec: 60621.7, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1229799424. Throughput: 0: 56279.6. Samples: 1179145520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:02:03,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 04:02:05,862][47288] Updated weights for policy 0, policy_version 75066 (0.0031) [2024-04-26 04:02:08,647][47288] Updated weights for policy 0, policy_version 75076 (0.0032) [2024-04-26 04:02:08,923][47056] Fps is (10 sec: 52429.8, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1230045184. Throughput: 0: 56262.3. Samples: 1179478440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:02:08,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 04:02:11,623][47288] Updated weights for policy 0, policy_version 75086 (0.0029) [2024-04-26 04:02:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1230340096. Throughput: 0: 56196.1. Samples: 1179654420. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:02:13,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 04:02:14,405][47288] Updated weights for policy 0, policy_version 75096 (0.0035) [2024-04-26 04:02:17,549][47288] Updated weights for policy 0, policy_version 75106 (0.0027) [2024-04-26 04:02:18,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1230602240. Throughput: 0: 56388.5. Samples: 1179993640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:02:18,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 04:02:20,117][47288] Updated weights for policy 0, policy_version 75116 (0.0026) [2024-04-26 04:02:23,323][47288] Updated weights for policy 0, policy_version 75126 (0.0028) [2024-04-26 04:02:23,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.5, 300 sec: 56372.0). Total num frames: 1230897152. Throughput: 0: 56339.9. Samples: 1180330640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 04:02:23,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 04:02:25,759][47288] Updated weights for policy 0, policy_version 75136 (0.0032) [2024-04-26 04:02:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 56205.4). Total num frames: 1231142912. Throughput: 0: 56441.8. Samples: 1180495680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 04:02:28,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 04:02:29,308][47288] Updated weights for policy 0, policy_version 75146 (0.0030) [2024-04-26 04:02:31,570][47288] Updated weights for policy 0, policy_version 75156 (0.0034) [2024-04-26 04:02:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1231454208. Throughput: 0: 56363.7. Samples: 1180831920. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 04:02:33,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 04:02:35,101][47288] Updated weights for policy 0, policy_version 75166 (0.0029) [2024-04-26 04:02:37,484][47288] Updated weights for policy 0, policy_version 75176 (0.0030) [2024-04-26 04:02:38,923][47056] Fps is (10 sec: 63897.2, 60 sec: 57344.0, 300 sec: 56427.6). Total num frames: 1231781888. Throughput: 0: 56189.4. Samples: 1181164260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 04:02:38,924][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 04:02:40,949][47288] Updated weights for policy 0, policy_version 75186 (0.0027) [2024-04-26 04:02:43,246][47288] Updated weights for policy 0, policy_version 75196 (0.0031) [2024-04-26 04:02:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1232027648. Throughput: 0: 56291.1. Samples: 1181339680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 04:02:43,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 04:02:46,777][47288] Updated weights for policy 0, policy_version 75206 (0.0026) [2024-04-26 04:02:48,015][47267] Signal inference workers to stop experience collection... (17800 times) [2024-04-26 04:02:48,051][47288] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-04-26 04:02:48,061][47267] Signal inference workers to resume experience collection... (17800 times) [2024-04-26 04:02:48,067][47288] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-04-26 04:02:48,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56524.9, 300 sec: 56372.0). Total num frames: 1232306176. Throughput: 0: 56346.5. Samples: 1181681120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 04:02:48,923][47056] Avg episode reward: [(0, '0.294')] [2024-04-26 04:02:49,013][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000075215_1232322560.pth... 
[2024-04-26 04:02:49,063][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000074391_1218822144.pth [2024-04-26 04:02:49,224][47288] Updated weights for policy 0, policy_version 75216 (0.0029) [2024-04-26 04:02:52,651][47288] Updated weights for policy 0, policy_version 75226 (0.0028) [2024-04-26 04:02:53,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1232568320. Throughput: 0: 56452.3. Samples: 1182018800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 04:02:53,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 04:02:55,134][47288] Updated weights for policy 0, policy_version 75236 (0.0027) [2024-04-26 04:02:58,271][47288] Updated weights for policy 0, policy_version 75246 (0.0030) [2024-04-26 04:02:58,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 1232863232. Throughput: 0: 55962.7. Samples: 1182172740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 04:02:58,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 04:03:01,005][47288] Updated weights for policy 0, policy_version 75256 (0.0030) [2024-04-26 04:03:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 56372.1). Total num frames: 1233141760. Throughput: 0: 56060.0. Samples: 1182516340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 04:03:03,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 04:03:04,034][47288] Updated weights for policy 0, policy_version 75266 (0.0027) [2024-04-26 04:03:06,667][47288] Updated weights for policy 0, policy_version 75276 (0.0026) [2024-04-26 04:03:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1233420288. Throughput: 0: 56195.6. Samples: 1182859440. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-26 04:03:08,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 04:03:09,908][47288] Updated weights for policy 0, policy_version 75286 (0.0035) [2024-04-26 04:03:12,365][47288] Updated weights for policy 0, policy_version 75296 (0.0027) [2024-04-26 04:03:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1233731584. Throughput: 0: 56285.9. Samples: 1183028540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-26 04:03:13,923][47056] Avg episode reward: [(0, '0.312')] [2024-04-26 04:03:15,862][47288] Updated weights for policy 0, policy_version 75306 (0.0036) [2024-04-26 04:03:18,041][47288] Updated weights for policy 0, policy_version 75316 (0.0029) [2024-04-26 04:03:18,923][47056] Fps is (10 sec: 62259.0, 60 sec: 57343.9, 300 sec: 56372.1). Total num frames: 1234042880. Throughput: 0: 56204.4. Samples: 1183361120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-26 04:03:18,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 04:03:21,846][47288] Updated weights for policy 0, policy_version 75326 (0.0027) [2024-04-26 04:03:23,794][47288] Updated weights for policy 0, policy_version 75336 (0.0034) [2024-04-26 04:03:23,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1234305024. Throughput: 0: 56317.8. Samples: 1183698560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-26 04:03:23,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 04:03:27,612][47288] Updated weights for policy 0, policy_version 75346 (0.0028) [2024-04-26 04:03:28,923][47056] Fps is (10 sec: 50790.6, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1234550784. Throughput: 0: 56396.9. Samples: 1183877540. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-26 04:03:28,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 04:03:29,757][47288] Updated weights for policy 0, policy_version 75356 (0.0032) [2024-04-26 04:03:33,536][47288] Updated weights for policy 0, policy_version 75366 (0.0031) [2024-04-26 04:03:33,923][47056] Fps is (10 sec: 49152.4, 60 sec: 55705.6, 300 sec: 56149.9). Total num frames: 1234796544. Throughput: 0: 56299.2. Samples: 1184214580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-26 04:03:33,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 04:03:35,582][47288] Updated weights for policy 0, policy_version 75376 (0.0026) [2024-04-26 04:03:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 56261.0). Total num frames: 1235091456. Throughput: 0: 56233.2. Samples: 1184549300. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-26 04:03:38,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 04:03:39,485][47288] Updated weights for policy 0, policy_version 75386 (0.0026) [2024-04-26 04:03:41,392][47288] Updated weights for policy 0, policy_version 75396 (0.0025) [2024-04-26 04:03:43,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1235386368. Throughput: 0: 56306.2. Samples: 1184706520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-26 04:03:43,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 04:03:45,170][47288] Updated weights for policy 0, policy_version 75406 (0.0025) [2024-04-26 04:03:45,903][47267] Signal inference workers to stop experience collection... (17850 times) [2024-04-26 04:03:45,908][47267] Signal inference workers to resume experience collection... 
(17850 times) [2024-04-26 04:03:45,934][47288] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-04-26 04:03:45,934][47288] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-04-26 04:03:47,111][47288] Updated weights for policy 0, policy_version 75416 (0.0025) [2024-04-26 04:03:48,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1235681280. Throughput: 0: 56183.2. Samples: 1185044580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-26 04:03:48,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 04:03:50,874][47288] Updated weights for policy 0, policy_version 75426 (0.0036) [2024-04-26 04:03:52,913][47288] Updated weights for policy 0, policy_version 75436 (0.0024) [2024-04-26 04:03:53,923][47056] Fps is (10 sec: 60620.9, 60 sec: 57070.9, 300 sec: 56316.5). Total num frames: 1235992576. Throughput: 0: 55941.0. Samples: 1185376780. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:03:53,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 04:03:56,795][47288] Updated weights for policy 0, policy_version 75446 (0.0025) [2024-04-26 04:03:58,659][47288] Updated weights for policy 0, policy_version 75456 (0.0027) [2024-04-26 04:03:58,923][47056] Fps is (10 sec: 60619.5, 60 sec: 57070.8, 300 sec: 56372.0). Total num frames: 1236287488. Throughput: 0: 56370.4. Samples: 1185565220. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:03:58,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 04:04:02,704][47288] Updated weights for policy 0, policy_version 75466 (0.0031) [2024-04-26 04:04:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1236533248. Throughput: 0: 56558.8. Samples: 1185906260. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:04:03,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 04:04:04,354][47288] Updated weights for policy 0, policy_version 75476 (0.0027) [2024-04-26 04:04:08,459][47288] Updated weights for policy 0, policy_version 75486 (0.0032) [2024-04-26 04:04:08,923][47056] Fps is (10 sec: 49152.7, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1236779008. Throughput: 0: 56545.5. Samples: 1186243100. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:04:08,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 04:04:10,199][47288] Updated weights for policy 0, policy_version 75496 (0.0027) [2024-04-26 04:04:13,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55432.4, 300 sec: 56149.9). Total num frames: 1237057536. Throughput: 0: 55930.7. Samples: 1186394420. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:04:13,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 04:04:14,194][47288] Updated weights for policy 0, policy_version 75506 (0.0027) [2024-04-26 04:04:16,177][47288] Updated weights for policy 0, policy_version 75516 (0.0023) [2024-04-26 04:04:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55159.5, 300 sec: 56205.4). Total num frames: 1237352448. Throughput: 0: 55953.3. Samples: 1186732480. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:04:18,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 04:04:19,991][47288] Updated weights for policy 0, policy_version 75526 (0.0034) [2024-04-26 04:04:21,915][47288] Updated weights for policy 0, policy_version 75536 (0.0030) [2024-04-26 04:04:23,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55705.6, 300 sec: 56149.9). Total num frames: 1237647360. Throughput: 0: 56060.0. Samples: 1187072000. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:04:23,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 04:04:25,992][47288] Updated weights for policy 0, policy_version 75546 (0.0025) [2024-04-26 04:04:27,431][47267] Signal inference workers to stop experience collection... (17900 times) [2024-04-26 04:04:27,431][47267] Signal inference workers to resume experience collection... (17900 times) [2024-04-26 04:04:27,444][47288] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-04-26 04:04:27,444][47288] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-04-26 04:04:27,724][47288] Updated weights for policy 0, policy_version 75556 (0.0029) [2024-04-26 04:04:28,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 56205.4). Total num frames: 1237942272. Throughput: 0: 56570.6. Samples: 1187252200. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:04:28,924][47056] Avg episode reward: [(0, '0.339')] [2024-04-26 04:04:31,791][47288] Updated weights for policy 0, policy_version 75566 (0.0033) [2024-04-26 04:04:33,503][47288] Updated weights for policy 0, policy_version 75576 (0.0026) [2024-04-26 04:04:33,923][47056] Fps is (10 sec: 60621.4, 60 sec: 57617.1, 300 sec: 56372.1). Total num frames: 1238253568. Throughput: 0: 56467.9. Samples: 1187585640. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:04:33,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 04:04:37,477][47288] Updated weights for policy 0, policy_version 75586 (0.0031) [2024-04-26 04:04:38,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1238499328. Throughput: 0: 56538.6. Samples: 1187921020. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 04:04:38,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 04:04:39,341][47288] Updated weights for policy 0, policy_version 75596 (0.0036) [2024-04-26 04:04:43,328][47288] Updated weights for policy 0, policy_version 75606 (0.0027) [2024-04-26 04:04:43,923][47056] Fps is (10 sec: 50790.2, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1238761472. Throughput: 0: 56205.9. Samples: 1188094480. Policy #0 lag: (min: 1.0, avg: 13.2, max: 23.0) [2024-04-26 04:04:43,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 04:04:45,180][47288] Updated weights for policy 0, policy_version 75616 (0.0026) [2024-04-26 04:04:48,923][47056] Fps is (10 sec: 50791.0, 60 sec: 55432.5, 300 sec: 56149.9). Total num frames: 1239007232. Throughput: 0: 56057.4. Samples: 1188428840. Policy #0 lag: (min: 1.0, avg: 13.2, max: 23.0) [2024-04-26 04:04:48,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 04:04:49,123][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000075625_1239040000.pth... [2024-04-26 04:04:49,172][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000074804_1225588736.pth [2024-04-26 04:04:49,289][47288] Updated weights for policy 0, policy_version 75626 (0.0038) [2024-04-26 04:04:50,912][47288] Updated weights for policy 0, policy_version 75636 (0.0026) [2024-04-26 04:04:53,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55159.6, 300 sec: 56094.4). Total num frames: 1239302144. Throughput: 0: 55992.1. Samples: 1188762740. Policy #0 lag: (min: 1.0, avg: 13.2, max: 23.0) [2024-04-26 04:04:53,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 04:04:55,281][47288] Updated weights for policy 0, policy_version 75646 (0.0033) [2024-04-26 04:04:56,837][47288] Updated weights for policy 0, policy_version 75656 (0.0036) [2024-04-26 04:04:58,923][47056] Fps is (10 sec: 62258.1, 60 sec: 55705.7, 300 sec: 56316.5). 
Total num frames: 1239629824. Throughput: 0: 56218.6. Samples: 1188924260. Policy #0 lag: (min: 1.0, avg: 13.2, max: 23.0) [2024-04-26 04:04:58,923][47056] Avg episode reward: [(0, '0.343')] [2024-04-26 04:05:00,933][47288] Updated weights for policy 0, policy_version 75666 (0.0027) [2024-04-26 04:05:02,688][47288] Updated weights for policy 0, policy_version 75676 (0.0035) [2024-04-26 04:05:03,923][47056] Fps is (10 sec: 62256.9, 60 sec: 56524.6, 300 sec: 56261.0). Total num frames: 1239924736. Throughput: 0: 56171.8. Samples: 1189260220. Policy #0 lag: (min: 1.0, avg: 13.2, max: 23.0) [2024-04-26 04:05:03,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 04:05:06,831][47288] Updated weights for policy 0, policy_version 75686 (0.0031) [2024-04-26 04:05:08,606][47288] Updated weights for policy 0, policy_version 75696 (0.0035) [2024-04-26 04:05:08,923][47056] Fps is (10 sec: 57345.1, 60 sec: 57071.0, 300 sec: 56316.5). Total num frames: 1240203264. Throughput: 0: 56056.2. Samples: 1189594520. Policy #0 lag: (min: 1.0, avg: 13.2, max: 23.0) [2024-04-26 04:05:08,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 04:05:12,507][47288] Updated weights for policy 0, policy_version 75706 (0.0026) [2024-04-26 04:05:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 57343.9, 300 sec: 56372.0). Total num frames: 1240498176. Throughput: 0: 56125.6. Samples: 1189777860. Policy #0 lag: (min: 1.0, avg: 13.2, max: 23.0) [2024-04-26 04:05:13,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 04:05:14,438][47288] Updated weights for policy 0, policy_version 75716 (0.0029) [2024-04-26 04:05:18,314][47288] Updated weights for policy 0, policy_version 75726 (0.0030) [2024-04-26 04:05:18,923][47056] Fps is (10 sec: 52428.3, 60 sec: 56251.8, 300 sec: 56205.4). Total num frames: 1240727552. Throughput: 0: 56258.7. Samples: 1190117280. 
Policy #0 lag: (min: 1.0, avg: 13.2, max: 23.0) [2024-04-26 04:05:18,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 04:05:19,272][47267] Signal inference workers to stop experience collection... (17950 times) [2024-04-26 04:05:19,272][47267] Signal inference workers to resume experience collection... (17950 times) [2024-04-26 04:05:19,299][47288] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-04-26 04:05:19,300][47288] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-04-26 04:05:20,195][47288] Updated weights for policy 0, policy_version 75736 (0.0037) [2024-04-26 04:05:23,923][47056] Fps is (10 sec: 49153.4, 60 sec: 55705.8, 300 sec: 56094.4). Total num frames: 1240989696. Throughput: 0: 56207.7. Samples: 1190450360. Policy #0 lag: (min: 1.0, avg: 13.2, max: 23.0) [2024-04-26 04:05:23,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 04:05:24,183][47288] Updated weights for policy 0, policy_version 75746 (0.0028) [2024-04-26 04:05:26,136][47288] Updated weights for policy 0, policy_version 75756 (0.0028) [2024-04-26 04:05:28,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 56094.3). Total num frames: 1241268224. Throughput: 0: 55735.0. Samples: 1190602560. Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-04-26 04:05:28,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 04:05:30,059][47288] Updated weights for policy 0, policy_version 75766 (0.0028) [2024-04-26 04:05:31,937][47288] Updated weights for policy 0, policy_version 75776 (0.0027) [2024-04-26 04:05:33,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55159.5, 300 sec: 56094.4). Total num frames: 1241563136. Throughput: 0: 55759.4. Samples: 1190938020. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-04-26 04:05:33,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 04:05:35,942][47288] Updated weights for policy 0, policy_version 75786 (0.0024) [2024-04-26 04:05:37,738][47288] Updated weights for policy 0, policy_version 75796 (0.0027) [2024-04-26 04:05:38,923][47056] Fps is (10 sec: 60621.6, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1241874432. Throughput: 0: 55898.6. Samples: 1191278180. Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-04-26 04:05:38,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 04:05:41,664][47288] Updated weights for policy 0, policy_version 75806 (0.0033) [2024-04-26 04:05:43,578][47288] Updated weights for policy 0, policy_version 75816 (0.0033) [2024-04-26 04:05:43,923][47056] Fps is (10 sec: 62259.4, 60 sec: 57071.0, 300 sec: 56316.5). Total num frames: 1242185728. Throughput: 0: 56457.0. Samples: 1191464820. Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-04-26 04:05:43,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 04:05:47,419][47288] Updated weights for policy 0, policy_version 75826 (0.0027) [2024-04-26 04:05:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 57070.8, 300 sec: 56261.0). Total num frames: 1242431488. Throughput: 0: 56366.9. Samples: 1191796720. Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-04-26 04:05:48,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 04:05:49,519][47288] Updated weights for policy 0, policy_version 75836 (0.0030) [2024-04-26 04:05:53,204][47288] Updated weights for policy 0, policy_version 75846 (0.0036) [2024-04-26 04:05:53,923][47056] Fps is (10 sec: 52428.5, 60 sec: 56797.7, 300 sec: 56261.0). Total num frames: 1242710016. Throughput: 0: 56366.9. Samples: 1192131040. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-04-26 04:05:53,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 04:05:55,345][47288] Updated weights for policy 0, policy_version 75856 (0.0028) [2024-04-26 04:05:58,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 1242972160. Throughput: 0: 56075.2. Samples: 1192301240. Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-04-26 04:05:58,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 04:05:59,057][47288] Updated weights for policy 0, policy_version 75866 (0.0034) [2024-04-26 04:06:00,715][47267] Signal inference workers to stop experience collection... (18000 times) [2024-04-26 04:06:00,736][47288] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-04-26 04:06:00,769][47267] Signal inference workers to resume experience collection... (18000 times) [2024-04-26 04:06:00,770][47288] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-04-26 04:06:01,171][47288] Updated weights for policy 0, policy_version 75876 (0.0030) [2024-04-26 04:06:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.7, 300 sec: 56205.4). Total num frames: 1243250688. Throughput: 0: 56121.7. Samples: 1192642760. Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-04-26 04:06:03,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 04:06:04,765][47288] Updated weights for policy 0, policy_version 75886 (0.0029) [2024-04-26 04:06:06,896][47288] Updated weights for policy 0, policy_version 75896 (0.0026) [2024-04-26 04:06:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.3, 300 sec: 56149.9). Total num frames: 1243529216. Throughput: 0: 56226.0. Samples: 1192980540. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 23.0) [2024-04-26 04:06:08,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 04:06:10,616][47288] Updated weights for policy 0, policy_version 75906 (0.0036) [2024-04-26 04:06:12,814][47288] Updated weights for policy 0, policy_version 75916 (0.0033) [2024-04-26 04:06:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 56149.9). Total num frames: 1243824128. Throughput: 0: 56398.2. Samples: 1193140480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 04:06:13,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 04:06:16,438][47288] Updated weights for policy 0, policy_version 75926 (0.0029) [2024-04-26 04:06:18,776][47288] Updated weights for policy 0, policy_version 75936 (0.0028) [2024-04-26 04:06:18,923][47056] Fps is (10 sec: 60621.4, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 1244135424. Throughput: 0: 56393.8. Samples: 1193475740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 04:06:18,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 04:06:22,312][47288] Updated weights for policy 0, policy_version 75946 (0.0030) [2024-04-26 04:06:23,923][47056] Fps is (10 sec: 58982.4, 60 sec: 57070.8, 300 sec: 56316.5). Total num frames: 1244413952. Throughput: 0: 56331.9. Samples: 1193813120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 04:06:23,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 04:06:24,530][47288] Updated weights for policy 0, policy_version 75956 (0.0029) [2024-04-26 04:06:28,026][47288] Updated weights for policy 0, policy_version 75966 (0.0032) [2024-04-26 04:06:28,923][47056] Fps is (10 sec: 55704.9, 60 sec: 57070.9, 300 sec: 56372.0). Total num frames: 1244692480. Throughput: 0: 56192.2. Samples: 1193993480. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 04:06:28,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 04:06:30,285][47288] Updated weights for policy 0, policy_version 75976 (0.0028) [2024-04-26 04:06:33,722][47288] Updated weights for policy 0, policy_version 75986 (0.0034) [2024-04-26 04:06:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1244954624. Throughput: 0: 56429.8. Samples: 1194336060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 04:06:33,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 04:06:36,261][47288] Updated weights for policy 0, policy_version 75996 (0.0031) [2024-04-26 04:06:38,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 1245216768. Throughput: 0: 56548.5. Samples: 1194675720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 04:06:38,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 04:06:39,596][47288] Updated weights for policy 0, policy_version 76006 (0.0024) [2024-04-26 04:06:42,050][47288] Updated weights for policy 0, policy_version 76016 (0.0029) [2024-04-26 04:06:43,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 56261.0). Total num frames: 1245511680. Throughput: 0: 56311.7. Samples: 1194835260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 04:06:43,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 04:06:45,401][47288] Updated weights for policy 0, policy_version 76026 (0.0029) [2024-04-26 04:06:47,782][47288] Updated weights for policy 0, policy_version 76036 (0.0029) [2024-04-26 04:06:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1245790208. Throughput: 0: 56257.4. Samples: 1195174340. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 04:06:48,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:06:48,973][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000076038_1245806592.pth... [2024-04-26 04:06:49,021][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000075215_1232322560.pth [2024-04-26 04:06:51,131][47288] Updated weights for policy 0, policy_version 76046 (0.0037) [2024-04-26 04:06:53,831][47288] Updated weights for policy 0, policy_version 76056 (0.0027) [2024-04-26 04:06:53,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1246101504. Throughput: 0: 56220.7. Samples: 1195510460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 04:06:53,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 04:06:56,923][47288] Updated weights for policy 0, policy_version 76066 (0.0039) [2024-04-26 04:06:58,923][47056] Fps is (10 sec: 60620.2, 60 sec: 57071.0, 300 sec: 56261.0). Total num frames: 1246396416. Throughput: 0: 56689.8. Samples: 1195691520. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:06:58,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 04:06:59,505][47288] Updated weights for policy 0, policy_version 76076 (0.0030) [2024-04-26 04:07:02,315][47267] Signal inference workers to stop experience collection... (18050 times) [2024-04-26 04:07:02,316][47267] Signal inference workers to resume experience collection... (18050 times) [2024-04-26 04:07:02,333][47288] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-04-26 04:07:02,338][47288] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-04-26 04:07:02,931][47288] Updated weights for policy 0, policy_version 76086 (0.0029) [2024-04-26 04:07:03,923][47056] Fps is (10 sec: 57342.9, 60 sec: 57070.9, 300 sec: 56372.0). Total num frames: 1246674944. Throughput: 0: 56689.2. Samples: 1196026760. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:07:03,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 04:07:05,384][47288] Updated weights for policy 0, policy_version 76096 (0.0031) [2024-04-26 04:07:08,653][47288] Updated weights for policy 0, policy_version 76106 (0.0033) [2024-04-26 04:07:08,923][47056] Fps is (10 sec: 52429.4, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1246920704. Throughput: 0: 56618.8. Samples: 1196360960. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:07:08,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 04:07:11,267][47288] Updated weights for policy 0, policy_version 76116 (0.0028) [2024-04-26 04:07:13,923][47056] Fps is (10 sec: 50790.9, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1247182848. Throughput: 0: 56055.7. Samples: 1196515980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:07:13,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 04:07:14,570][47288] Updated weights for policy 0, policy_version 76126 (0.0027) [2024-04-26 04:07:17,035][47288] Updated weights for policy 0, policy_version 76136 (0.0029) [2024-04-26 04:07:18,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 1247477760. Throughput: 0: 56046.9. Samples: 1196858180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:07:18,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 04:07:20,479][47288] Updated weights for policy 0, policy_version 76146 (0.0030) [2024-04-26 04:07:22,979][47288] Updated weights for policy 0, policy_version 76156 (0.0025) [2024-04-26 04:07:23,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 56316.5). Total num frames: 1247756288. Throughput: 0: 55968.3. Samples: 1197194300. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:07:23,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 04:07:26,247][47288] Updated weights for policy 0, policy_version 76166 (0.0030) [2024-04-26 04:07:28,744][47288] Updated weights for policy 0, policy_version 76176 (0.0034) [2024-04-26 04:07:28,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56251.9, 300 sec: 56316.5). Total num frames: 1248067584. Throughput: 0: 56038.7. Samples: 1197357000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:07:28,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 04:07:32,074][47288] Updated weights for policy 0, policy_version 76186 (0.0027) [2024-04-26 04:07:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 56094.4). Total num frames: 1248329728. Throughput: 0: 55979.5. Samples: 1197693420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:07:33,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 04:07:34,520][47288] Updated weights for policy 0, policy_version 76196 (0.0027) [2024-04-26 04:07:37,864][47288] Updated weights for policy 0, policy_version 76206 (0.0033) [2024-04-26 04:07:38,923][47056] Fps is (10 sec: 57343.1, 60 sec: 57070.8, 300 sec: 56316.5). Total num frames: 1248641024. Throughput: 0: 55920.6. Samples: 1198026900. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:07:38,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 04:07:40,631][47288] Updated weights for policy 0, policy_version 76216 (0.0025) [2024-04-26 04:07:43,782][47288] Updated weights for policy 0, policy_version 76226 (0.0026) [2024-04-26 04:07:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1248886784. Throughput: 0: 55865.5. Samples: 1198205460. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-26 04:07:43,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 04:07:46,517][47288] Updated weights for policy 0, policy_version 76236 (0.0031) [2024-04-26 04:07:48,923][47056] Fps is (10 sec: 50791.2, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1249148928. Throughput: 0: 55777.5. Samples: 1198536740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 04:07:48,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 04:07:48,955][47267] Signal inference workers to stop experience collection... (18100 times) [2024-04-26 04:07:48,989][47288] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-04-26 04:07:49,013][47267] Signal inference workers to resume experience collection... (18100 times) [2024-04-26 04:07:49,013][47288] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-04-26 04:07:49,592][47288] Updated weights for policy 0, policy_version 76246 (0.0030) [2024-04-26 04:07:52,201][47288] Updated weights for policy 0, policy_version 76256 (0.0030) [2024-04-26 04:07:53,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 56149.9). Total num frames: 1249427456. Throughput: 0: 55828.4. Samples: 1198873240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 04:07:53,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 04:07:55,388][47288] Updated weights for policy 0, policy_version 76266 (0.0030) [2024-04-26 04:07:58,049][47288] Updated weights for policy 0, policy_version 76276 (0.0030) [2024-04-26 04:07:58,923][47056] Fps is (10 sec: 58981.8, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1249738752. Throughput: 0: 56151.9. Samples: 1199042820. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 04:07:58,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 04:08:01,157][47288] Updated weights for policy 0, policy_version 76286 (0.0027) [2024-04-26 04:08:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.7, 300 sec: 56205.5). Total num frames: 1250000896. Throughput: 0: 55942.4. Samples: 1199375580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 04:08:03,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 04:08:04,417][47288] Updated weights for policy 0, policy_version 76296 (0.0033) [2024-04-26 04:08:07,067][47288] Updated weights for policy 0, policy_version 76306 (0.0030) [2024-04-26 04:08:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1250295808. Throughput: 0: 55760.6. Samples: 1199703520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 04:08:08,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 04:08:10,377][47288] Updated weights for policy 0, policy_version 76316 (0.0026) [2024-04-26 04:08:13,098][47288] Updated weights for policy 0, policy_version 76326 (0.0033) [2024-04-26 04:08:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 56038.9). Total num frames: 1250574336. Throughput: 0: 56045.4. Samples: 1199879040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 04:08:13,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 04:08:16,120][47288] Updated weights for policy 0, policy_version 76336 (0.0030) [2024-04-26 04:08:18,814][47288] Updated weights for policy 0, policy_version 76346 (0.0028) [2024-04-26 04:08:18,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 1250852864. Throughput: 0: 56217.7. Samples: 1200223220. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 04:08:18,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 04:08:21,781][47288] Updated weights for policy 0, policy_version 76356 (0.0031) [2024-04-26 04:08:23,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 1251098624. Throughput: 0: 56167.3. Samples: 1200554420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 04:08:23,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 04:08:24,798][47288] Updated weights for policy 0, policy_version 76366 (0.0029) [2024-04-26 04:08:27,724][47288] Updated weights for policy 0, policy_version 76376 (0.0031) [2024-04-26 04:08:28,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55159.4, 300 sec: 56205.4). Total num frames: 1251377152. Throughput: 0: 55677.2. Samples: 1200710940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 04:08:28,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 04:08:30,725][47288] Updated weights for policy 0, policy_version 76386 (0.0026) [2024-04-26 04:08:33,595][47288] Updated weights for policy 0, policy_version 76396 (0.0030) [2024-04-26 04:08:33,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 1251672064. Throughput: 0: 55818.8. Samples: 1201048580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 04:08:33,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 04:08:36,452][47288] Updated weights for policy 0, policy_version 76406 (0.0030) [2024-04-26 04:08:38,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55432.7, 300 sec: 56205.5). Total num frames: 1251966976. Throughput: 0: 55760.5. Samples: 1201382460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 04:08:38,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 04:08:39,281][47288] Updated weights for policy 0, policy_version 76416 (0.0034) [2024-04-26 04:08:42,325][47288] Updated weights for policy 0, policy_version 76426 (0.0031) [2024-04-26 04:08:43,923][47056] Fps is (10 sec: 57342.9, 60 sec: 55978.5, 300 sec: 56149.9). Total num frames: 1252245504. Throughput: 0: 55751.1. Samples: 1201551620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 04:08:43,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 04:08:45,207][47288] Updated weights for policy 0, policy_version 76436 (0.0029) [2024-04-26 04:08:46,295][47267] Signal inference workers to stop experience collection... (18150 times) [2024-04-26 04:08:46,344][47288] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-04-26 04:08:46,352][47267] Signal inference workers to resume experience collection... (18150 times) [2024-04-26 04:08:46,358][47288] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-04-26 04:08:48,126][47288] Updated weights for policy 0, policy_version 76446 (0.0025) [2024-04-26 04:08:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 56094.4). Total num frames: 1252540416. Throughput: 0: 55906.3. Samples: 1201891360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 04:08:48,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 04:08:49,014][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000076450_1252556800.pth... [2024-04-26 04:08:49,061][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000075625_1239040000.pth [2024-04-26 04:08:51,411][47288] Updated weights for policy 0, policy_version 76456 (0.0031) [2024-04-26 04:08:53,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56251.9, 300 sec: 55983.3). Total num frames: 1252802560. Throughput: 0: 55937.9. Samples: 1202220720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 04:08:53,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 04:08:53,960][47288] Updated weights for policy 0, policy_version 76466 (0.0029) [2024-04-26 04:08:57,270][47288] Updated weights for policy 0, policy_version 76476 (0.0027) [2024-04-26 04:08:58,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55705.6, 300 sec: 56094.3). Total num frames: 1253081088. Throughput: 0: 55781.1. Samples: 1202389200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 04:08:58,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 04:08:59,779][47288] Updated weights for policy 0, policy_version 76486 (0.0028) [2024-04-26 04:09:03,116][47288] Updated weights for policy 0, policy_version 76496 (0.0036) [2024-04-26 04:09:03,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 1253343232. Throughput: 0: 55611.5. Samples: 1202725740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 04:09:03,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 04:09:05,599][47288] Updated weights for policy 0, policy_version 76506 (0.0023) [2024-04-26 04:09:08,759][47288] Updated weights for policy 0, policy_version 76516 (0.0038) [2024-04-26 04:09:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 1253638144. Throughput: 0: 55751.6. Samples: 1203063240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 04:09:08,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 04:09:11,593][47288] Updated weights for policy 0, policy_version 76526 (0.0031) [2024-04-26 04:09:13,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55705.6, 300 sec: 56149.9). Total num frames: 1253916672. Throughput: 0: 56037.4. Samples: 1203232620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 04:09:13,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 04:09:14,523][47288] Updated weights for policy 0, policy_version 76536 (0.0029) [2024-04-26 04:09:17,480][47288] Updated weights for policy 0, policy_version 76546 (0.0028) [2024-04-26 04:09:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 1254211584. Throughput: 0: 55862.4. Samples: 1203562400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:09:18,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 04:09:20,418][47288] Updated weights for policy 0, policy_version 76556 (0.0031) [2024-04-26 04:09:23,312][47288] Updated weights for policy 0, policy_version 76566 (0.0030) [2024-04-26 04:09:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.8, 300 sec: 56038.8). Total num frames: 1254473728. Throughput: 0: 55792.0. Samples: 1203893100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:09:23,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 04:09:26,159][47288] Updated weights for policy 0, policy_version 76576 (0.0031) [2024-04-26 04:09:28,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 1254768640. Throughput: 0: 55906.9. Samples: 1204067420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:09:28,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 04:09:29,042][47288] Updated weights for policy 0, policy_version 76586 (0.0031) [2024-04-26 04:09:32,196][47288] Updated weights for policy 0, policy_version 76596 (0.0027) [2024-04-26 04:09:33,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.5, 300 sec: 56038.8). Total num frames: 1255030784. Throughput: 0: 55897.2. Samples: 1204406740. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:09:33,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 04:09:34,759][47288] Updated weights for policy 0, policy_version 76606 (0.0031) [2024-04-26 04:09:38,286][47288] Updated weights for policy 0, policy_version 76616 (0.0028) [2024-04-26 04:09:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 1255309312. Throughput: 0: 56082.2. Samples: 1204744420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:09:38,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 04:09:40,691][47288] Updated weights for policy 0, policy_version 76626 (0.0031) [2024-04-26 04:09:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 56205.4). Total num frames: 1255587840. Throughput: 0: 55841.9. Samples: 1204902080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:09:43,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 04:09:44,492][47288] Updated weights for policy 0, policy_version 76636 (0.0031) [2024-04-26 04:09:46,296][47267] Signal inference workers to stop experience collection... (18200 times) [2024-04-26 04:09:46,297][47267] Signal inference workers to resume experience collection... (18200 times) [2024-04-26 04:09:46,312][47288] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-04-26 04:09:46,312][47288] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-04-26 04:09:46,411][47288] Updated weights for policy 0, policy_version 76646 (0.0040) [2024-04-26 04:09:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 56149.9). Total num frames: 1255866368. Throughput: 0: 55827.2. Samples: 1205237960. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:09:48,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 04:09:50,187][47288] Updated weights for policy 0, policy_version 76656 (0.0026) [2024-04-26 04:09:52,307][47288] Updated weights for policy 0, policy_version 76666 (0.0030) [2024-04-26 04:09:53,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.6, 300 sec: 56094.4). Total num frames: 1256177664. Throughput: 0: 55951.1. Samples: 1205581040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:09:53,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:09:55,841][47288] Updated weights for policy 0, policy_version 76676 (0.0032) [2024-04-26 04:09:58,072][47288] Updated weights for policy 0, policy_version 76686 (0.0029) [2024-04-26 04:09:58,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56524.8, 300 sec: 56094.4). Total num frames: 1256472576. Throughput: 0: 56199.4. Samples: 1205761600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:09:58,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 04:10:01,652][47288] Updated weights for policy 0, policy_version 76696 (0.0029) [2024-04-26 04:10:03,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56525.0, 300 sec: 56038.8). Total num frames: 1256734720. Throughput: 0: 56341.1. Samples: 1206097740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:10:03,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 04:10:04,012][47288] Updated weights for policy 0, policy_version 76706 (0.0028) [2024-04-26 04:10:07,425][47288] Updated weights for policy 0, policy_version 76716 (0.0030) [2024-04-26 04:10:08,923][47056] Fps is (10 sec: 50790.7, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 1256980480. Throughput: 0: 56454.6. Samples: 1206433560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 04:10:08,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 04:10:09,838][47288] Updated weights for policy 0, policy_version 76726 (0.0025) [2024-04-26 04:10:13,110][47288] Updated weights for policy 0, policy_version 76736 (0.0031) [2024-04-26 04:10:13,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 1257259008. Throughput: 0: 56215.8. Samples: 1206597140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 04:10:13,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 04:10:15,560][47288] Updated weights for policy 0, policy_version 76746 (0.0028) [2024-04-26 04:10:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 56149.9). Total num frames: 1257553920. Throughput: 0: 56107.6. Samples: 1206931580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 04:10:18,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 04:10:19,034][47288] Updated weights for policy 0, policy_version 76756 (0.0025) [2024-04-26 04:10:21,307][47288] Updated weights for policy 0, policy_version 76766 (0.0030) [2024-04-26 04:10:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1257832448. Throughput: 0: 56032.8. Samples: 1207265900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 04:10:23,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 04:10:24,922][47288] Updated weights for policy 0, policy_version 76776 (0.0029) [2024-04-26 04:10:27,319][47288] Updated weights for policy 0, policy_version 76786 (0.0037) [2024-04-26 04:10:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1258127360. Throughput: 0: 56289.7. Samples: 1207435120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 04:10:28,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 04:10:30,776][47288] Updated weights for policy 0, policy_version 76796 (0.0036) [2024-04-26 04:10:33,116][47288] Updated weights for policy 0, policy_version 76806 (0.0029) [2024-04-26 04:10:33,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 56094.4). Total num frames: 1258422272. Throughput: 0: 56284.9. Samples: 1207770780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 04:10:33,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 04:10:36,664][47288] Updated weights for policy 0, policy_version 76816 (0.0031) [2024-04-26 04:10:38,913][47288] Updated weights for policy 0, policy_version 76826 (0.0029) [2024-04-26 04:10:38,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.8, 300 sec: 56038.8). Total num frames: 1258717184. Throughput: 0: 56096.0. Samples: 1208105360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 04:10:38,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 04:10:42,470][47288] Updated weights for policy 0, policy_version 76836 (0.0028) [2024-04-26 04:10:43,923][47056] Fps is (10 sec: 52429.7, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 1258946560. Throughput: 0: 55928.7. Samples: 1208278380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 04:10:43,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 04:10:43,994][47267] Signal inference workers to stop experience collection... (18250 times) [2024-04-26 04:10:44,014][47288] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-04-26 04:10:44,053][47267] Signal inference workers to resume experience collection... 
(18250 times) [2024-04-26 04:10:44,053][47288] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-04-26 04:10:44,904][47288] Updated weights for policy 0, policy_version 76846 (0.0026) [2024-04-26 04:10:48,324][47288] Updated weights for policy 0, policy_version 76856 (0.0033) [2024-04-26 04:10:48,923][47056] Fps is (10 sec: 49152.0, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 1259208704. Throughput: 0: 55806.1. Samples: 1208609020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 04:10:48,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 04:10:48,958][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000076857_1259225088.pth... [2024-04-26 04:10:49,000][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000076038_1245806592.pth [2024-04-26 04:10:50,808][47288] Updated weights for policy 0, policy_version 76866 (0.0030) [2024-04-26 04:10:53,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55432.5, 300 sec: 56038.9). Total num frames: 1259503616. Throughput: 0: 55972.5. Samples: 1208952320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:10:53,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 04:10:54,084][47288] Updated weights for policy 0, policy_version 76876 (0.0031) [2024-04-26 04:10:56,517][47288] Updated weights for policy 0, policy_version 76886 (0.0025) [2024-04-26 04:10:58,923][47056] Fps is (10 sec: 58981.4, 60 sec: 55432.4, 300 sec: 56094.3). Total num frames: 1259798528. Throughput: 0: 55782.9. Samples: 1209107380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:10:58,924][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 04:11:00,146][47288] Updated weights for policy 0, policy_version 76896 (0.0030) [2024-04-26 04:11:02,296][47288] Updated weights for policy 0, policy_version 76906 (0.0035) [2024-04-26 04:11:03,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55978.6, 300 sec: 56149.9). 
Total num frames: 1260093440. Throughput: 0: 55787.1. Samples: 1209442000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:11:03,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 04:11:05,841][47288] Updated weights for policy 0, policy_version 76916 (0.0029) [2024-04-26 04:11:08,155][47288] Updated weights for policy 0, policy_version 76926 (0.0033) [2024-04-26 04:11:08,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56524.9, 300 sec: 56094.4). Total num frames: 1260371968. Throughput: 0: 55953.8. Samples: 1209783820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:11:08,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 04:11:11,694][47288] Updated weights for policy 0, policy_version 76936 (0.0025) [2024-04-26 04:11:13,923][47056] Fps is (10 sec: 55704.0, 60 sec: 56524.6, 300 sec: 55983.2). Total num frames: 1260650496. Throughput: 0: 56146.8. Samples: 1209961740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:11:13,924][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 04:11:14,165][47288] Updated weights for policy 0, policy_version 76946 (0.0031) [2024-04-26 04:11:17,794][47288] Updated weights for policy 0, policy_version 76956 (0.0035) [2024-04-26 04:11:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 1260912640. Throughput: 0: 56075.2. Samples: 1210294160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:11:18,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 04:11:20,070][47288] Updated weights for policy 0, policy_version 76966 (0.0028) [2024-04-26 04:11:23,475][47288] Updated weights for policy 0, policy_version 76976 (0.0035) [2024-04-26 04:11:23,923][47056] Fps is (10 sec: 52430.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 1261174784. Throughput: 0: 56189.8. Samples: 1210633900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:11:23,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 04:11:25,756][47288] Updated weights for policy 0, policy_version 76986 (0.0029) [2024-04-26 04:11:28,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 1261469696. Throughput: 0: 55952.2. Samples: 1210796240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:11:28,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 04:11:29,135][47288] Updated weights for policy 0, policy_version 76996 (0.0029) [2024-04-26 04:11:31,482][47267] Signal inference workers to stop experience collection... (18300 times) [2024-04-26 04:11:31,529][47288] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-04-26 04:11:31,529][47267] Signal inference workers to resume experience collection... (18300 times) [2024-04-26 04:11:31,541][47288] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-04-26 04:11:31,638][47288] Updated weights for policy 0, policy_version 77006 (0.0027) [2024-04-26 04:11:33,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55432.4, 300 sec: 56038.8). Total num frames: 1261748224. Throughput: 0: 56012.3. Samples: 1211129580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:11:33,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 04:11:34,952][47288] Updated weights for policy 0, policy_version 77016 (0.0024) [2024-04-26 04:11:37,584][47288] Updated weights for policy 0, policy_version 77026 (0.0029) [2024-04-26 04:11:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 56038.8). Total num frames: 1262043136. Throughput: 0: 55799.6. Samples: 1211463300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 04:11:38,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 04:11:40,969][47288] Updated weights for policy 0, policy_version 77036 (0.0026) [2024-04-26 04:11:43,484][47288] Updated weights for policy 0, policy_version 77046 (0.0027) [2024-04-26 04:11:43,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56524.6, 300 sec: 56094.4). Total num frames: 1262338048. Throughput: 0: 56198.4. Samples: 1211636300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 04:11:43,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 04:11:46,757][47288] Updated weights for policy 0, policy_version 77056 (0.0031) [2024-04-26 04:11:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56797.8, 300 sec: 55983.3). Total num frames: 1262616576. Throughput: 0: 56311.9. Samples: 1211976040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 04:11:48,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 04:11:49,214][47288] Updated weights for policy 0, policy_version 77066 (0.0032) [2024-04-26 04:11:52,458][47288] Updated weights for policy 0, policy_version 77076 (0.0030) [2024-04-26 04:11:53,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 1262878720. Throughput: 0: 56163.6. Samples: 1212311180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 04:11:53,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 04:11:55,086][47288] Updated weights for policy 0, policy_version 77086 (0.0033) [2024-04-26 04:11:58,802][47288] Updated weights for policy 0, policy_version 77096 (0.0030) [2024-04-26 04:11:58,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 1263140864. Throughput: 0: 55722.4. Samples: 1212469240. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 04:11:58,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 04:12:00,950][47288] Updated weights for policy 0, policy_version 77106 (0.0032) [2024-04-26 04:12:03,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55432.4, 300 sec: 55927.7). Total num frames: 1263419392. Throughput: 0: 55795.0. Samples: 1212804940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 04:12:03,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 04:12:04,723][47288] Updated weights for policy 0, policy_version 77116 (0.0026) [2024-04-26 04:12:06,846][47288] Updated weights for policy 0, policy_version 77126 (0.0031) [2024-04-26 04:12:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 1263714304. Throughput: 0: 55823.4. Samples: 1213145960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 04:12:08,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 04:12:10,606][47288] Updated weights for policy 0, policy_version 77136 (0.0024) [2024-04-26 04:12:12,693][47288] Updated weights for policy 0, policy_version 77146 (0.0024) [2024-04-26 04:12:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55978.9, 300 sec: 56038.8). Total num frames: 1264009216. Throughput: 0: 55856.0. Samples: 1213309760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 04:12:13,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 04:12:16,258][47288] Updated weights for policy 0, policy_version 77156 (0.0028) [2024-04-26 04:12:18,590][47288] Updated weights for policy 0, policy_version 77166 (0.0031) [2024-04-26 04:12:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.6, 300 sec: 56038.8). Total num frames: 1264287744. Throughput: 0: 55865.4. Samples: 1213643520. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 04:12:18,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 04:12:22,209][47288] Updated weights for policy 0, policy_version 77176 (0.0032) [2024-04-26 04:12:23,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56797.7, 300 sec: 55983.3). Total num frames: 1264582656. Throughput: 0: 55916.6. Samples: 1213979560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-26 04:12:23,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 04:12:24,615][47288] Updated weights for policy 0, policy_version 77186 (0.0028) [2024-04-26 04:12:27,974][47288] Updated weights for policy 0, policy_version 77196 (0.0027) [2024-04-26 04:12:28,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 1264828416. Throughput: 0: 55927.7. Samples: 1214153060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:12:28,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 04:12:29,304][47267] Signal inference workers to stop experience collection... (18350 times) [2024-04-26 04:12:29,305][47267] Signal inference workers to resume experience collection... (18350 times) [2024-04-26 04:12:29,316][47288] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-04-26 04:12:29,335][47288] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-04-26 04:12:30,321][47288] Updated weights for policy 0, policy_version 77206 (0.0026) [2024-04-26 04:12:33,842][47288] Updated weights for policy 0, policy_version 77216 (0.0028) [2024-04-26 04:12:33,923][47056] Fps is (10 sec: 52429.8, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 1265106944. Throughput: 0: 55929.0. Samples: 1214492840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:12:33,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 04:12:36,176][47288] Updated weights for policy 0, policy_version 77226 (0.0027) [2024-04-26 04:12:38,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 1265385472. Throughput: 0: 55910.5. Samples: 1214827160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:12:38,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 04:12:39,668][47288] Updated weights for policy 0, policy_version 77236 (0.0031) [2024-04-26 04:12:42,069][47288] Updated weights for policy 0, policy_version 77246 (0.0030) [2024-04-26 04:12:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 1265664000. Throughput: 0: 55853.9. Samples: 1214982660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:12:43,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 04:12:45,554][47288] Updated weights for policy 0, policy_version 77256 (0.0033) [2024-04-26 04:12:47,796][47288] Updated weights for policy 0, policy_version 77266 (0.0031) [2024-04-26 04:12:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 1265958912. Throughput: 0: 55947.9. Samples: 1215322600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:12:48,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 04:12:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000077268_1265958912.pth... [2024-04-26 04:12:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000076450_1252556800.pth [2024-04-26 04:12:51,491][47288] Updated weights for policy 0, policy_version 77276 (0.0027) [2024-04-26 04:12:53,757][47288] Updated weights for policy 0, policy_version 77286 (0.0028) [2024-04-26 04:12:53,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 55983.3). 
Total num frames: 1266253824. Throughput: 0: 55829.9. Samples: 1215658300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:12:53,924][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 04:12:57,254][47288] Updated weights for policy 0, policy_version 77296 (0.0029) [2024-04-26 04:12:58,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56524.9, 300 sec: 56038.8). Total num frames: 1266532352. Throughput: 0: 56147.2. Samples: 1215836380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:12:58,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 04:12:59,804][47288] Updated weights for policy 0, policy_version 77306 (0.0022) [2024-04-26 04:13:03,021][47288] Updated weights for policy 0, policy_version 77316 (0.0031) [2024-04-26 04:13:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56798.0, 300 sec: 56038.8). Total num frames: 1266827264. Throughput: 0: 56109.4. Samples: 1216168440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:13:03,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 04:13:05,708][47288] Updated weights for policy 0, policy_version 77326 (0.0026) [2024-04-26 04:13:08,749][47288] Updated weights for policy 0, policy_version 77336 (0.0028) [2024-04-26 04:13:08,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 1267073024. Throughput: 0: 56162.0. Samples: 1216506840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:13:08,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 04:13:11,502][47288] Updated weights for policy 0, policy_version 77346 (0.0031) [2024-04-26 04:13:13,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 1267351552. Throughput: 0: 55995.7. Samples: 1216672860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:13,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 04:13:14,575][47288] Updated weights for policy 0, policy_version 77356 (0.0031) [2024-04-26 04:13:17,181][47288] Updated weights for policy 0, policy_version 77366 (0.0028) [2024-04-26 04:13:18,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 1267646464. Throughput: 0: 56039.0. Samples: 1217014600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:18,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 04:13:20,423][47288] Updated weights for policy 0, policy_version 77376 (0.0025) [2024-04-26 04:13:22,860][47288] Updated weights for policy 0, policy_version 77386 (0.0026) [2024-04-26 04:13:23,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 1267924992. Throughput: 0: 56133.8. Samples: 1217353180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:23,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 04:13:26,329][47288] Updated weights for policy 0, policy_version 77396 (0.0030) [2024-04-26 04:13:28,782][47288] Updated weights for policy 0, policy_version 77406 (0.0033) [2024-04-26 04:13:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.9, 300 sec: 56094.3). Total num frames: 1268219904. Throughput: 0: 56415.8. Samples: 1217521380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:28,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 04:13:32,054][47288] Updated weights for policy 0, policy_version 77416 (0.0032) [2024-04-26 04:13:33,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56797.8, 300 sec: 56094.4). Total num frames: 1268514816. Throughput: 0: 56307.7. Samples: 1217856440. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:33,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 04:13:34,714][47288] Updated weights for policy 0, policy_version 77426 (0.0029) [2024-04-26 04:13:37,830][47288] Updated weights for policy 0, policy_version 77436 (0.0025) [2024-04-26 04:13:38,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56524.9, 300 sec: 56038.9). Total num frames: 1268776960. Throughput: 0: 56305.4. Samples: 1218192040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:38,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 04:13:40,416][47288] Updated weights for policy 0, policy_version 77446 (0.0033) [2024-04-26 04:13:43,669][47288] Updated weights for policy 0, policy_version 77456 (0.0028) [2024-04-26 04:13:43,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 1269055488. Throughput: 0: 56109.6. Samples: 1218361320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:43,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 04:13:46,582][47288] Updated weights for policy 0, policy_version 77466 (0.0028) [2024-04-26 04:13:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.9, 300 sec: 56038.8). Total num frames: 1269334016. Throughput: 0: 56285.0. Samples: 1218701260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:48,923][47056] Avg episode reward: [(0, '0.401')] [2024-04-26 04:13:49,515][47288] Updated weights for policy 0, policy_version 77476 (0.0028) [2024-04-26 04:13:49,699][47267] Signal inference workers to stop experience collection... (18400 times) [2024-04-26 04:13:49,703][47267] Signal inference workers to resume experience collection... 
(18400 times) [2024-04-26 04:13:49,718][47288] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-04-26 04:13:49,718][47288] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-04-26 04:13:52,637][47288] Updated weights for policy 0, policy_version 77486 (0.0033) [2024-04-26 04:13:53,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 56038.9). Total num frames: 1269612544. Throughput: 0: 56307.2. Samples: 1219040660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:53,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 04:13:55,211][47288] Updated weights for policy 0, policy_version 77496 (0.0031) [2024-04-26 04:13:58,278][47288] Updated weights for policy 0, policy_version 77506 (0.0028) [2024-04-26 04:13:58,923][47056] Fps is (10 sec: 55704.8, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 1269891072. Throughput: 0: 56304.1. Samples: 1219206540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 04:13:58,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 04:14:00,850][47288] Updated weights for policy 0, policy_version 77516 (0.0027) [2024-04-26 04:14:03,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 56038.9). Total num frames: 1270169600. Throughput: 0: 56296.3. Samples: 1219547920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 04:14:03,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 04:14:03,998][47288] Updated weights for policy 0, policy_version 77526 (0.0033) [2024-04-26 04:14:06,737][47288] Updated weights for policy 0, policy_version 77536 (0.0037) [2024-04-26 04:14:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 56094.4). Total num frames: 1270464512. Throughput: 0: 56238.2. Samples: 1219883900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 04:14:08,923][47056] Avg episode reward: [(0, '0.327')] [2024-04-26 04:14:09,800][47288] Updated weights for policy 0, policy_version 77546 (0.0032) [2024-04-26 04:14:12,628][47288] Updated weights for policy 0, policy_version 77556 (0.0025) [2024-04-26 04:14:13,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56798.0, 300 sec: 56094.4). Total num frames: 1270759424. Throughput: 0: 56353.5. Samples: 1220057280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 04:14:13,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 04:14:15,601][47288] Updated weights for policy 0, policy_version 77566 (0.0034) [2024-04-26 04:14:18,338][47288] Updated weights for policy 0, policy_version 77576 (0.0032) [2024-04-26 04:14:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 1271037952. Throughput: 0: 56400.4. Samples: 1220394460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 04:14:18,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 04:14:21,321][47288] Updated weights for policy 0, policy_version 77586 (0.0033) [2024-04-26 04:14:23,923][47056] Fps is (10 sec: 55704.5, 60 sec: 56524.7, 300 sec: 56094.3). Total num frames: 1271316480. Throughput: 0: 56471.8. Samples: 1220733280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 04:14:23,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 04:14:24,192][47288] Updated weights for policy 0, policy_version 77596 (0.0027) [2024-04-26 04:14:27,006][47288] Updated weights for policy 0, policy_version 77606 (0.0031) [2024-04-26 04:14:28,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1271595008. Throughput: 0: 56544.0. Samples: 1220905800. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 04:14:28,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 04:14:30,023][47288] Updated weights for policy 0, policy_version 77616 (0.0031) [2024-04-26 04:14:32,933][47288] Updated weights for policy 0, policy_version 77626 (0.0024) [2024-04-26 04:14:33,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1271873536. Throughput: 0: 56528.2. Samples: 1221245040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 04:14:33,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 04:14:35,747][47288] Updated weights for policy 0, policy_version 77636 (0.0034) [2024-04-26 04:14:38,851][47288] Updated weights for policy 0, policy_version 77646 (0.0024) [2024-04-26 04:14:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.6, 300 sec: 56149.9). Total num frames: 1272152064. Throughput: 0: 56347.4. Samples: 1221576300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 04:14:38,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 04:14:41,646][47288] Updated weights for policy 0, policy_version 77656 (0.0026) [2024-04-26 04:14:43,923][47056] Fps is (10 sec: 54065.4, 60 sec: 55978.4, 300 sec: 56094.3). Total num frames: 1272414208. Throughput: 0: 56268.9. Samples: 1221738660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 04:14:43,924][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 04:14:44,660][47288] Updated weights for policy 0, policy_version 77666 (0.0026) [2024-04-26 04:14:47,564][47288] Updated weights for policy 0, policy_version 77676 (0.0032) [2024-04-26 04:14:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.6, 300 sec: 56094.4). Total num frames: 1272725504. Throughput: 0: 56241.5. Samples: 1222078800. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:14:48,924][47056] Avg episode reward: [(0, '0.310')] [2024-04-26 04:14:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000077681_1272725504.pth... [2024-04-26 04:14:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000076857_1259225088.pth [2024-04-26 04:14:50,481][47288] Updated weights for policy 0, policy_version 77686 (0.0030) [2024-04-26 04:14:53,322][47288] Updated weights for policy 0, policy_version 77696 (0.0026) [2024-04-26 04:14:53,923][47056] Fps is (10 sec: 57346.4, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 1272987648. Throughput: 0: 56267.2. Samples: 1222415920. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:14:53,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 04:14:56,216][47288] Updated weights for policy 0, policy_version 77706 (0.0034) [2024-04-26 04:14:58,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.8, 300 sec: 56038.8). Total num frames: 1273266176. Throughput: 0: 56133.7. Samples: 1222583300. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:14:58,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:14:59,207][47288] Updated weights for policy 0, policy_version 77716 (0.0026) [2024-04-26 04:15:02,160][47288] Updated weights for policy 0, policy_version 77726 (0.0031) [2024-04-26 04:15:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1273544704. Throughput: 0: 55975.3. Samples: 1222913340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:15:03,923][47056] Avg episode reward: [(0, '0.341')] [2024-04-26 04:15:05,173][47288] Updated weights for policy 0, policy_version 77736 (0.0028) [2024-04-26 04:15:07,989][47288] Updated weights for policy 0, policy_version 77746 (0.0029) [2024-04-26 04:15:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 56205.4). 
Total num frames: 1273839616. Throughput: 0: 56023.2. Samples: 1223254320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:15:08,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 04:15:10,922][47288] Updated weights for policy 0, policy_version 77756 (0.0027) [2024-04-26 04:15:13,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 56094.3). Total num frames: 1274101760. Throughput: 0: 56050.2. Samples: 1223428060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:15:13,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 04:15:13,936][47288] Updated weights for policy 0, policy_version 77766 (0.0030) [2024-04-26 04:15:15,028][47267] Signal inference workers to stop experience collection... (18450 times) [2024-04-26 04:15:15,028][47267] Signal inference workers to resume experience collection... (18450 times) [2024-04-26 04:15:15,053][47288] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-04-26 04:15:15,053][47288] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-04-26 04:15:16,602][47288] Updated weights for policy 0, policy_version 77776 (0.0029) [2024-04-26 04:15:18,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 1274380288. Throughput: 0: 55899.2. Samples: 1223760500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:15:18,923][47056] Avg episode reward: [(0, '0.336')] [2024-04-26 04:15:19,832][47288] Updated weights for policy 0, policy_version 77786 (0.0026) [2024-04-26 04:15:22,545][47288] Updated weights for policy 0, policy_version 77796 (0.0026) [2024-04-26 04:15:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 56038.8). Total num frames: 1274658816. Throughput: 0: 55960.1. Samples: 1224094500. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:15:23,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 04:15:25,583][47288] Updated weights for policy 0, policy_version 77806 (0.0026) [2024-04-26 04:15:28,390][47288] Updated weights for policy 0, policy_version 77816 (0.0028) [2024-04-26 04:15:28,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.8, 300 sec: 56038.8). Total num frames: 1274953728. Throughput: 0: 56205.5. Samples: 1224267880. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:15:28,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 04:15:31,275][47288] Updated weights for policy 0, policy_version 77826 (0.0032) [2024-04-26 04:15:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 1275232256. Throughput: 0: 56136.1. Samples: 1224604920. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-04-26 04:15:33,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 04:15:34,414][47288] Updated weights for policy 0, policy_version 77836 (0.0029) [2024-04-26 04:15:37,083][47288] Updated weights for policy 0, policy_version 77846 (0.0029) [2024-04-26 04:15:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 56149.9). Total num frames: 1275510784. Throughput: 0: 56139.1. Samples: 1224942180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:15:38,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 04:15:40,308][47288] Updated weights for policy 0, policy_version 77856 (0.0029) [2024-04-26 04:15:42,960][47288] Updated weights for policy 0, policy_version 77866 (0.0029) [2024-04-26 04:15:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56252.1, 300 sec: 56205.4). Total num frames: 1275789312. Throughput: 0: 56127.1. Samples: 1225109020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:15:43,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 04:15:46,297][47288] Updated weights for policy 0, policy_version 77876 (0.0029) [2024-04-26 04:15:48,672][47288] Updated weights for policy 0, policy_version 77886 (0.0030) [2024-04-26 04:15:48,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1276084224. Throughput: 0: 56350.5. Samples: 1225449120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:15:48,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 04:15:52,188][47288] Updated weights for policy 0, policy_version 77896 (0.0029) [2024-04-26 04:15:53,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 56150.0). Total num frames: 1276362752. Throughput: 0: 56253.1. Samples: 1225785700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:15:53,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 04:15:54,652][47288] Updated weights for policy 0, policy_version 77906 (0.0027) [2024-04-26 04:15:57,936][47288] Updated weights for policy 0, policy_version 77916 (0.0035) [2024-04-26 04:15:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 1276657664. Throughput: 0: 56125.4. Samples: 1225953700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:15:58,932][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 04:16:00,606][47288] Updated weights for policy 0, policy_version 77926 (0.0033) [2024-04-26 04:16:03,809][47288] Updated weights for policy 0, policy_version 77936 (0.0026) [2024-04-26 04:16:03,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.5, 300 sec: 56038.8). Total num frames: 1276903424. Throughput: 0: 56212.0. Samples: 1226290040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:16:03,932][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 04:16:06,407][47288] Updated weights for policy 0, policy_version 77946 (0.0030) [2024-04-26 04:16:08,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 1277198336. Throughput: 0: 56074.4. Samples: 1226617860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:16:08,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 04:16:09,677][47288] Updated weights for policy 0, policy_version 77956 (0.0029) [2024-04-26 04:16:12,228][47288] Updated weights for policy 0, policy_version 77966 (0.0034) [2024-04-26 04:16:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1277476864. Throughput: 0: 56106.1. Samples: 1226792660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:16:13,924][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 04:16:15,554][47288] Updated weights for policy 0, policy_version 77976 (0.0033) [2024-04-26 04:16:18,182][47288] Updated weights for policy 0, policy_version 77986 (0.0028) [2024-04-26 04:16:18,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1277771776. Throughput: 0: 56087.9. Samples: 1227128880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:16:18,923][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 04:16:21,400][47288] Updated weights for policy 0, policy_version 77996 (0.0025) [2024-04-26 04:16:22,082][47267] Signal inference workers to stop experience collection... (18500 times) [2024-04-26 04:16:22,110][47288] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-04-26 04:16:22,139][47267] Signal inference workers to resume experience collection... 
(18500 times) [2024-04-26 04:16:22,142][47288] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-04-26 04:16:23,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56251.9, 300 sec: 56149.9). Total num frames: 1278033920. Throughput: 0: 56028.2. Samples: 1227463440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:16:23,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 04:16:23,991][47288] Updated weights for policy 0, policy_version 78006 (0.0031) [2024-04-26 04:16:27,395][47288] Updated weights for policy 0, policy_version 78016 (0.0032) [2024-04-26 04:16:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.5, 300 sec: 56149.9). Total num frames: 1278312448. Throughput: 0: 56096.8. Samples: 1227633380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:16:28,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 04:16:29,805][47288] Updated weights for policy 0, policy_version 78026 (0.0028) [2024-04-26 04:16:33,249][47288] Updated weights for policy 0, policy_version 78036 (0.0026) [2024-04-26 04:16:33,923][47056] Fps is (10 sec: 57342.7, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1278607360. Throughput: 0: 56010.6. Samples: 1227969600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:16:33,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 04:16:35,495][47288] Updated weights for policy 0, policy_version 78046 (0.0033) [2024-04-26 04:16:38,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 1278853120. Throughput: 0: 56070.1. Samples: 1228308860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:16:38,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 04:16:38,946][47288] Updated weights for policy 0, policy_version 78056 (0.0024) [2024-04-26 04:16:41,463][47288] Updated weights for policy 0, policy_version 78066 (0.0031) [2024-04-26 04:16:43,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 56038.8). 
Total num frames: 1279148032. Throughput: 0: 55843.6. Samples: 1228466660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:16:43,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 04:16:44,719][47288] Updated weights for policy 0, policy_version 78076 (0.0030) [2024-04-26 04:16:47,302][47288] Updated weights for policy 0, policy_version 78086 (0.0027) [2024-04-26 04:16:48,923][47056] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1279442944. Throughput: 0: 55796.4. Samples: 1228800880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:16:48,923][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 04:16:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000078091_1279442944.pth... [2024-04-26 04:16:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000077268_1265958912.pth [2024-04-26 04:16:50,538][47288] Updated weights for policy 0, policy_version 78096 (0.0029) [2024-04-26 04:16:53,169][47288] Updated weights for policy 0, policy_version 78106 (0.0025) [2024-04-26 04:16:53,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1279737856. Throughput: 0: 55993.9. Samples: 1229137580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:16:53,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 04:16:56,299][47288] Updated weights for policy 0, policy_version 78116 (0.0030) [2024-04-26 04:16:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 1280000000. Throughput: 0: 56031.5. Samples: 1229314080. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:16:58,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 04:16:58,929][47288] Updated weights for policy 0, policy_version 78126 (0.0025) [2024-04-26 04:17:01,972][47288] Updated weights for policy 0, policy_version 78136 (0.0031) [2024-04-26 04:17:03,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1280278528. Throughput: 0: 56029.5. Samples: 1229650200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:17:03,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 04:17:04,658][47288] Updated weights for policy 0, policy_version 78146 (0.0037) [2024-04-26 04:17:07,850][47288] Updated weights for policy 0, policy_version 78156 (0.0024) [2024-04-26 04:17:08,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56252.0, 300 sec: 56149.9). Total num frames: 1280573440. Throughput: 0: 56145.7. Samples: 1229990000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-26 04:17:08,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 04:17:10,429][47288] Updated weights for policy 0, policy_version 78166 (0.0022) [2024-04-26 04:17:13,738][47288] Updated weights for policy 0, policy_version 78176 (0.0031) [2024-04-26 04:17:13,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 1280835584. Throughput: 0: 55998.0. Samples: 1230153280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:17:13,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 04:17:16,058][47267] Signal inference workers to stop experience collection... (18550 times) [2024-04-26 04:17:16,059][47267] Signal inference workers to resume experience collection... 
(18550 times) [2024-04-26 04:17:16,068][47288] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-04-26 04:17:16,068][47288] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-04-26 04:17:16,320][47288] Updated weights for policy 0, policy_version 78186 (0.0030) [2024-04-26 04:17:18,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 1281097728. Throughput: 0: 55964.0. Samples: 1230487980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:17:18,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 04:17:19,562][47288] Updated weights for policy 0, policy_version 78196 (0.0038) [2024-04-26 04:17:22,152][47288] Updated weights for policy 0, policy_version 78206 (0.0033) [2024-04-26 04:17:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.6, 300 sec: 56205.5). Total num frames: 1281409024. Throughput: 0: 55935.2. Samples: 1230825940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:17:23,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 04:17:25,309][47288] Updated weights for policy 0, policy_version 78216 (0.0031) [2024-04-26 04:17:27,959][47288] Updated weights for policy 0, policy_version 78226 (0.0030) [2024-04-26 04:17:28,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1281703936. Throughput: 0: 56262.1. Samples: 1230998460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:17:28,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 04:17:31,112][47288] Updated weights for policy 0, policy_version 78236 (0.0034) [2024-04-26 04:17:33,762][47288] Updated weights for policy 0, policy_version 78246 (0.0029) [2024-04-26 04:17:33,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1281982464. Throughput: 0: 56249.8. Samples: 1231332120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:17:33,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 04:17:36,933][47288] Updated weights for policy 0, policy_version 78256 (0.0025) [2024-04-26 04:17:38,923][47056] Fps is (10 sec: 52429.8, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1282228224. Throughput: 0: 56228.2. Samples: 1231667840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:17:38,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 04:17:39,597][47288] Updated weights for policy 0, policy_version 78266 (0.0031) [2024-04-26 04:17:42,669][47288] Updated weights for policy 0, policy_version 78276 (0.0031) [2024-04-26 04:17:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1282523136. Throughput: 0: 56055.5. Samples: 1231836580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:17:43,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 04:17:45,323][47288] Updated weights for policy 0, policy_version 78286 (0.0028) [2024-04-26 04:17:48,457][47288] Updated weights for policy 0, policy_version 78296 (0.0027) [2024-04-26 04:17:48,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 1282801664. Throughput: 0: 56191.0. Samples: 1232178800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:17:48,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 04:17:51,271][47288] Updated weights for policy 0, policy_version 78306 (0.0030) [2024-04-26 04:17:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 56094.3). Total num frames: 1283080192. Throughput: 0: 56160.6. Samples: 1232517240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:17:53,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 04:17:54,344][47288] Updated weights for policy 0, policy_version 78316 (0.0035) [2024-04-26 04:17:56,970][47288] Updated weights for policy 0, policy_version 78326 (0.0033) [2024-04-26 04:17:58,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56251.9, 300 sec: 56094.4). Total num frames: 1283375104. Throughput: 0: 56143.2. Samples: 1232679720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:17:58,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 04:18:00,201][47288] Updated weights for policy 0, policy_version 78336 (0.0034) [2024-04-26 04:18:02,889][47288] Updated weights for policy 0, policy_version 78346 (0.0031) [2024-04-26 04:18:03,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1283637248. Throughput: 0: 56233.5. Samples: 1233018480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:18:03,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 04:18:05,818][47267] Signal inference workers to stop experience collection... (18600 times) [2024-04-26 04:18:05,819][47267] Signal inference workers to resume experience collection... (18600 times) [2024-04-26 04:18:05,834][47288] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-04-26 04:18:05,837][47288] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-04-26 04:18:05,932][47288] Updated weights for policy 0, policy_version 78356 (0.0035) [2024-04-26 04:18:08,835][47288] Updated weights for policy 0, policy_version 78366 (0.0030) [2024-04-26 04:18:08,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1283948544. Throughput: 0: 56136.4. Samples: 1233352080. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:18:08,923][47056] Avg episode reward: [(0, '0.401')] [2024-04-26 04:18:11,874][47288] Updated weights for policy 0, policy_version 78376 (0.0030) [2024-04-26 04:18:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1284210688. Throughput: 0: 56014.5. Samples: 1233519100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:18:13,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 04:18:14,722][47288] Updated weights for policy 0, policy_version 78386 (0.0029) [2024-04-26 04:18:17,515][47288] Updated weights for policy 0, policy_version 78396 (0.0028) [2024-04-26 04:18:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56797.9, 300 sec: 56205.4). Total num frames: 1284505600. Throughput: 0: 56171.5. Samples: 1233859840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:18:18,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 04:18:20,549][47288] Updated weights for policy 0, policy_version 78406 (0.0030) [2024-04-26 04:18:23,578][47288] Updated weights for policy 0, policy_version 78416 (0.0026) [2024-04-26 04:18:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 1284767744. Throughput: 0: 56239.1. Samples: 1234198600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:18:23,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 04:18:26,448][47288] Updated weights for policy 0, policy_version 78426 (0.0034) [2024-04-26 04:18:28,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.8, 300 sec: 56038.9). Total num frames: 1285046272. Throughput: 0: 56206.4. Samples: 1234365860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:18:28,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 04:18:29,513][47288] Updated weights for policy 0, policy_version 78436 (0.0029) [2024-04-26 04:18:32,374][47288] Updated weights for policy 0, policy_version 78446 (0.0032) [2024-04-26 04:18:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 1285341184. Throughput: 0: 56231.2. Samples: 1234709200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:18:33,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 04:18:35,362][47288] Updated weights for policy 0, policy_version 78456 (0.0030) [2024-04-26 04:18:38,030][47288] Updated weights for policy 0, policy_version 78466 (0.0023) [2024-04-26 04:18:38,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 56094.4). Total num frames: 1285603328. Throughput: 0: 56328.1. Samples: 1235052000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:18:38,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 04:18:41,367][47288] Updated weights for policy 0, policy_version 78476 (0.0025) [2024-04-26 04:18:43,738][47288] Updated weights for policy 0, policy_version 78486 (0.0031) [2024-04-26 04:18:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 56205.4). Total num frames: 1285914624. Throughput: 0: 56473.2. Samples: 1235221020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 04:18:43,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 04:18:47,170][47288] Updated weights for policy 0, policy_version 78496 (0.0026) [2024-04-26 04:18:48,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 1286193152. Throughput: 0: 56405.2. Samples: 1235556720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:18:48,924][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 04:18:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000078503_1286193152.pth... [2024-04-26 04:18:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000077681_1272725504.pth [2024-04-26 04:18:49,714][47288] Updated weights for policy 0, policy_version 78506 (0.0034) [2024-04-26 04:18:52,775][47288] Updated weights for policy 0, policy_version 78516 (0.0029) [2024-04-26 04:18:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56525.0, 300 sec: 56205.5). Total num frames: 1286471680. Throughput: 0: 56529.9. Samples: 1235895920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:18:53,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 04:18:55,544][47288] Updated weights for policy 0, policy_version 78526 (0.0028) [2024-04-26 04:18:58,411][47288] Updated weights for policy 0, policy_version 78536 (0.0029) [2024-04-26 04:18:58,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1286750208. Throughput: 0: 56530.2. Samples: 1236062960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:18:58,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 04:19:01,589][47288] Updated weights for policy 0, policy_version 78546 (0.0028) [2024-04-26 04:19:03,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 1287012352. Throughput: 0: 56352.2. Samples: 1236395680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:19:03,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 04:19:04,221][47288] Updated weights for policy 0, policy_version 78556 (0.0032) [2024-04-26 04:19:07,438][47288] Updated weights for policy 0, policy_version 78566 (0.0031) [2024-04-26 04:19:08,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 56149.9). 
Total num frames: 1287323648. Throughput: 0: 56345.6. Samples: 1236734160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:19:08,929][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 04:19:10,208][47288] Updated weights for policy 0, policy_version 78576 (0.0030) [2024-04-26 04:19:13,034][47288] Updated weights for policy 0, policy_version 78586 (0.0032) [2024-04-26 04:19:13,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.7, 300 sec: 56149.9). Total num frames: 1287602176. Throughput: 0: 56430.6. Samples: 1236905240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:19:13,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 04:19:14,102][47267] Signal inference workers to stop experience collection... (18650 times) [2024-04-26 04:19:14,147][47288] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-04-26 04:19:14,155][47267] Signal inference workers to resume experience collection... (18650 times) [2024-04-26 04:19:14,163][47288] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-04-26 04:19:15,947][47288] Updated weights for policy 0, policy_version 78596 (0.0032) [2024-04-26 04:19:18,759][47288] Updated weights for policy 0, policy_version 78606 (0.0028) [2024-04-26 04:19:18,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1287880704. Throughput: 0: 56354.7. Samples: 1237245160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:19:18,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 04:19:21,676][47288] Updated weights for policy 0, policy_version 78616 (0.0029) [2024-04-26 04:19:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.8, 300 sec: 56205.5). Total num frames: 1288175616. Throughput: 0: 56382.7. Samples: 1237589220. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:19:23,932][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 04:19:24,558][47288] Updated weights for policy 0, policy_version 78626 (0.0030) [2024-04-26 04:19:27,566][47288] Updated weights for policy 0, policy_version 78636 (0.0028) [2024-04-26 04:19:28,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.7, 300 sec: 56205.5). Total num frames: 1288454144. Throughput: 0: 56296.3. Samples: 1237754360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:19:28,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 04:19:30,428][47288] Updated weights for policy 0, policy_version 78646 (0.0030) [2024-04-26 04:19:33,240][47288] Updated weights for policy 0, policy_version 78656 (0.0026) [2024-04-26 04:19:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1288716288. Throughput: 0: 56381.0. Samples: 1238093860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:19:33,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 04:19:36,316][47288] Updated weights for policy 0, policy_version 78666 (0.0028) [2024-04-26 04:19:38,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56798.0, 300 sec: 56261.1). Total num frames: 1289011200. Throughput: 0: 56348.5. Samples: 1238431600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 04:19:38,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 04:19:38,968][47288] Updated weights for policy 0, policy_version 78676 (0.0030) [2024-04-26 04:19:42,071][47288] Updated weights for policy 0, policy_version 78686 (0.0028) [2024-04-26 04:19:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 1289273344. Throughput: 0: 56418.7. Samples: 1238601800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 04:19:43,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 04:19:44,817][47288] Updated weights for policy 0, policy_version 78696 (0.0023) [2024-04-26 04:19:48,001][47288] Updated weights for policy 0, policy_version 78706 (0.0033) [2024-04-26 04:19:48,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1289584640. Throughput: 0: 56401.2. Samples: 1238933740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 04:19:48,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 04:19:50,854][47288] Updated weights for policy 0, policy_version 78716 (0.0026) [2024-04-26 04:19:53,898][47288] Updated weights for policy 0, policy_version 78726 (0.0027) [2024-04-26 04:19:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 56205.5). Total num frames: 1289846784. Throughput: 0: 56302.3. Samples: 1239267760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 04:19:53,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 04:19:56,607][47288] Updated weights for policy 0, policy_version 78736 (0.0030) [2024-04-26 04:19:58,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1290125312. Throughput: 0: 56249.3. Samples: 1239436460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 04:19:58,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:19:59,761][47288] Updated weights for policy 0, policy_version 78746 (0.0025) [2024-04-26 04:20:02,313][47288] Updated weights for policy 0, policy_version 78756 (0.0028) [2024-04-26 04:20:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.6, 300 sec: 56149.9). Total num frames: 1290403840. Throughput: 0: 56050.1. Samples: 1239767420. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 04:20:03,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 04:20:05,680][47288] Updated weights for policy 0, policy_version 78766 (0.0028) [2024-04-26 04:20:06,865][47267] Signal inference workers to stop experience collection... (18700 times) [2024-04-26 04:20:06,865][47267] Signal inference workers to resume experience collection... (18700 times) [2024-04-26 04:20:06,889][47288] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-04-26 04:20:06,889][47288] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-04-26 04:20:08,260][47288] Updated weights for policy 0, policy_version 78776 (0.0023) [2024-04-26 04:20:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 56149.9). Total num frames: 1290665984. Throughput: 0: 55965.0. Samples: 1240107640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 04:20:08,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 04:20:11,372][47288] Updated weights for policy 0, policy_version 78786 (0.0026) [2024-04-26 04:20:13,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1290960896. Throughput: 0: 56105.1. Samples: 1240279080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 04:20:13,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 04:20:14,404][47288] Updated weights for policy 0, policy_version 78796 (0.0034) [2024-04-26 04:20:17,211][47288] Updated weights for policy 0, policy_version 78806 (0.0035) [2024-04-26 04:20:18,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1291239424. Throughput: 0: 55919.1. Samples: 1240610220. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 04:20:18,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 04:20:20,069][47288] Updated weights for policy 0, policy_version 78816 (0.0032) [2024-04-26 04:20:23,177][47288] Updated weights for policy 0, policy_version 78826 (0.0026) [2024-04-26 04:20:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 56149.9). Total num frames: 1291517952. Throughput: 0: 55894.2. Samples: 1240946840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:20:23,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 04:20:25,735][47288] Updated weights for policy 0, policy_version 78836 (0.0029) [2024-04-26 04:20:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 56149.9). Total num frames: 1291796480. Throughput: 0: 55983.0. Samples: 1241121040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:20:28,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 04:20:28,984][47288] Updated weights for policy 0, policy_version 78846 (0.0028) [2024-04-26 04:20:31,551][47288] Updated weights for policy 0, policy_version 78856 (0.0028) [2024-04-26 04:20:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1292091392. Throughput: 0: 56123.6. Samples: 1241459300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:20:33,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 04:20:34,784][47288] Updated weights for policy 0, policy_version 78866 (0.0027) [2024-04-26 04:20:38,088][47288] Updated weights for policy 0, policy_version 78876 (0.0027) [2024-04-26 04:20:38,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1292386304. Throughput: 0: 56109.7. Samples: 1241792700. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:20:38,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 04:20:40,624][47288] Updated weights for policy 0, policy_version 78886 (0.0024) [2024-04-26 04:20:43,775][47288] Updated weights for policy 0, policy_version 78896 (0.0027) [2024-04-26 04:20:43,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.5, 300 sec: 56094.4). Total num frames: 1292632064. Throughput: 0: 56092.8. Samples: 1241960640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:20:43,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 04:20:46,485][47288] Updated weights for policy 0, policy_version 78906 (0.0039) [2024-04-26 04:20:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1292943360. Throughput: 0: 56291.6. Samples: 1242300540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:20:48,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 04:20:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000078915_1292943360.pth... [2024-04-26 04:20:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000078091_1279442944.pth [2024-04-26 04:20:49,409][47288] Updated weights for policy 0, policy_version 78916 (0.0037) [2024-04-26 04:20:52,241][47288] Updated weights for policy 0, policy_version 78926 (0.0029) [2024-04-26 04:20:53,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 1293205504. Throughput: 0: 56166.7. Samples: 1242635140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:20:53,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 04:20:55,337][47288] Updated weights for policy 0, policy_version 78936 (0.0026) [2024-04-26 04:20:58,156][47288] Updated weights for policy 0, policy_version 78946 (0.0030) [2024-04-26 04:20:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56261.0). 
Total num frames: 1293500416. Throughput: 0: 55915.9. Samples: 1242795300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:20:58,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 04:21:01,209][47267] Signal inference workers to stop experience collection... (18750 times) [2024-04-26 04:21:01,209][47267] Signal inference workers to resume experience collection... (18750 times) [2024-04-26 04:21:01,222][47288] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-04-26 04:21:01,222][47288] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-04-26 04:21:01,324][47288] Updated weights for policy 0, policy_version 78956 (0.0028) [2024-04-26 04:21:03,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1293778944. Throughput: 0: 56000.9. Samples: 1243130260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:21:03,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 04:21:03,928][47288] Updated weights for policy 0, policy_version 78966 (0.0027) [2024-04-26 04:21:07,043][47288] Updated weights for policy 0, policy_version 78976 (0.0035) [2024-04-26 04:21:08,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1294041088. Throughput: 0: 56100.3. Samples: 1243471360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 04:21:08,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 04:21:09,742][47288] Updated weights for policy 0, policy_version 78986 (0.0025) [2024-04-26 04:21:12,678][47288] Updated weights for policy 0, policy_version 78996 (0.0026) [2024-04-26 04:21:13,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 1294352384. Throughput: 0: 56121.5. Samples: 1243646500. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-26 04:21:13,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 04:21:15,521][47288] Updated weights for policy 0, policy_version 79006 (0.0027) [2024-04-26 04:21:18,741][47288] Updated weights for policy 0, policy_version 79016 (0.0028) [2024-04-26 04:21:18,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 56149.8). Total num frames: 1294598144. Throughput: 0: 56000.2. Samples: 1243979320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-26 04:21:18,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 04:21:21,320][47288] Updated weights for policy 0, policy_version 79026 (0.0037) [2024-04-26 04:21:23,923][47056] Fps is (10 sec: 54065.9, 60 sec: 56251.5, 300 sec: 56205.5). Total num frames: 1294893056. Throughput: 0: 56048.8. Samples: 1244314900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-26 04:21:23,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 04:21:24,539][47288] Updated weights for policy 0, policy_version 79036 (0.0028) [2024-04-26 04:21:27,083][47288] Updated weights for policy 0, policy_version 79046 (0.0028) [2024-04-26 04:21:28,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1295171584. Throughput: 0: 56216.1. Samples: 1244490360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-26 04:21:28,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 04:21:30,174][47288] Updated weights for policy 0, policy_version 79056 (0.0029) [2024-04-26 04:21:32,894][47288] Updated weights for policy 0, policy_version 79066 (0.0030) [2024-04-26 04:21:33,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1295450112. Throughput: 0: 56144.0. Samples: 1244827020. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-26 04:21:33,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 04:21:35,888][47288] Updated weights for policy 0, policy_version 79076 (0.0026) [2024-04-26 04:21:38,714][47288] Updated weights for policy 0, policy_version 79086 (0.0034) [2024-04-26 04:21:38,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1295761408. Throughput: 0: 56175.9. Samples: 1245163060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-26 04:21:38,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 04:21:41,757][47288] Updated weights for policy 0, policy_version 79096 (0.0032) [2024-04-26 04:21:43,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56525.0, 300 sec: 56205.5). Total num frames: 1296023552. Throughput: 0: 56438.9. Samples: 1245335040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-26 04:21:43,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 04:21:44,344][47288] Updated weights for policy 0, policy_version 79106 (0.0030) [2024-04-26 04:21:47,421][47267] Signal inference workers to stop experience collection... (18800 times) [2024-04-26 04:21:47,422][47267] Signal inference workers to resume experience collection... (18800 times) [2024-04-26 04:21:47,452][47288] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-04-26 04:21:47,452][47288] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-04-26 04:21:47,533][47288] Updated weights for policy 0, policy_version 79116 (0.0027) [2024-04-26 04:21:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1296318464. Throughput: 0: 56563.6. Samples: 1245675620. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-26 04:21:48,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 04:21:50,132][47288] Updated weights for policy 0, policy_version 79126 (0.0031) [2024-04-26 04:21:53,300][47288] Updated weights for policy 0, policy_version 79136 (0.0025) [2024-04-26 04:21:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1296580608. Throughput: 0: 56382.3. Samples: 1246008560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-04-26 04:21:53,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 04:21:55,973][47288] Updated weights for policy 0, policy_version 79146 (0.0035) [2024-04-26 04:21:58,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 56205.5). Total num frames: 1296859136. Throughput: 0: 56211.5. Samples: 1246176020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:21:58,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 04:21:59,289][47288] Updated weights for policy 0, policy_version 79156 (0.0028) [2024-04-26 04:22:01,778][47288] Updated weights for policy 0, policy_version 79166 (0.0033) [2024-04-26 04:22:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 56205.4). Total num frames: 1297154048. Throughput: 0: 56348.7. Samples: 1246515000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:22:03,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 04:22:05,084][47288] Updated weights for policy 0, policy_version 79176 (0.0028) [2024-04-26 04:22:07,580][47288] Updated weights for policy 0, policy_version 79186 (0.0031) [2024-04-26 04:22:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.8, 300 sec: 56205.4). Total num frames: 1297416192. Throughput: 0: 56434.4. Samples: 1246854440. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:22:08,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 04:22:10,755][47288] Updated weights for policy 0, policy_version 79196 (0.0029) [2024-04-26 04:22:13,441][47288] Updated weights for policy 0, policy_version 79206 (0.0027) [2024-04-26 04:22:13,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 56316.6). Total num frames: 1297711104. Throughput: 0: 56262.8. Samples: 1247022180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:22:13,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 04:22:16,653][47288] Updated weights for policy 0, policy_version 79216 (0.0025) [2024-04-26 04:22:18,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56798.1, 300 sec: 56261.0). Total num frames: 1298006016. Throughput: 0: 56278.9. Samples: 1247359560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:22:18,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 04:22:19,149][47288] Updated weights for policy 0, policy_version 79226 (0.0029) [2024-04-26 04:22:22,507][47288] Updated weights for policy 0, policy_version 79236 (0.0030) [2024-04-26 04:22:23,923][47056] Fps is (10 sec: 55704.1, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1298268160. Throughput: 0: 56401.1. Samples: 1247701120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:22:23,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 04:22:25,063][47288] Updated weights for policy 0, policy_version 79246 (0.0031) [2024-04-26 04:22:28,226][47288] Updated weights for policy 0, policy_version 79256 (0.0025) [2024-04-26 04:22:28,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1298563072. Throughput: 0: 56373.2. Samples: 1247871840. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:22:28,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 04:22:30,885][47288] Updated weights for policy 0, policy_version 79266 (0.0027) [2024-04-26 04:22:33,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.6, 300 sec: 56260.9). Total num frames: 1298825216. Throughput: 0: 56340.2. Samples: 1248210940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:22:33,924][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 04:22:34,157][47288] Updated weights for policy 0, policy_version 79276 (0.0028) [2024-04-26 04:22:36,699][47288] Updated weights for policy 0, policy_version 79286 (0.0027) [2024-04-26 04:22:38,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1299120128. Throughput: 0: 56423.9. Samples: 1248547640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:22:38,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 04:22:40,056][47288] Updated weights for policy 0, policy_version 79296 (0.0028) [2024-04-26 04:22:42,372][47288] Updated weights for policy 0, policy_version 79306 (0.0029) [2024-04-26 04:22:43,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1299398656. Throughput: 0: 56422.1. Samples: 1248715020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:22:43,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 04:22:45,759][47288] Updated weights for policy 0, policy_version 79316 (0.0027) [2024-04-26 04:22:48,241][47288] Updated weights for policy 0, policy_version 79326 (0.0034) [2024-04-26 04:22:48,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 56316.6). Total num frames: 1299693568. Throughput: 0: 56412.9. Samples: 1249053580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:22:48,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 04:22:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000079327_1299693568.pth... [2024-04-26 04:22:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000078503_1286193152.pth [2024-04-26 04:22:49,319][47267] Signal inference workers to stop experience collection... (18850 times) [2024-04-26 04:22:49,319][47267] Signal inference workers to resume experience collection... (18850 times) [2024-04-26 04:22:49,339][47288] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-04-26 04:22:49,339][47288] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-04-26 04:22:51,636][47288] Updated weights for policy 0, policy_version 79336 (0.0028) [2024-04-26 04:22:53,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1299988480. Throughput: 0: 56398.6. Samples: 1249392380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:22:53,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 04:22:54,113][47288] Updated weights for policy 0, policy_version 79346 (0.0032) [2024-04-26 04:22:57,518][47288] Updated weights for policy 0, policy_version 79356 (0.0024) [2024-04-26 04:22:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1300250624. Throughput: 0: 56522.5. Samples: 1249565700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:22:58,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 04:23:00,160][47288] Updated weights for policy 0, policy_version 79366 (0.0033) [2024-04-26 04:23:03,290][47288] Updated weights for policy 0, policy_version 79376 (0.0024) [2024-04-26 04:23:03,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1300529152. Throughput: 0: 56517.0. Samples: 1249902840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:23:03,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 04:23:06,060][47288] Updated weights for policy 0, policy_version 79386 (0.0033) [2024-04-26 04:23:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1300807680. Throughput: 0: 56422.2. Samples: 1250240100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:23:08,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 04:23:08,954][47288] Updated weights for policy 0, policy_version 79396 (0.0026) [2024-04-26 04:23:11,965][47288] Updated weights for policy 0, policy_version 79406 (0.0027) [2024-04-26 04:23:13,923][47056] Fps is (10 sec: 55706.9, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1301086208. Throughput: 0: 56205.8. Samples: 1250401100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:23:13,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 04:23:14,812][47288] Updated weights for policy 0, policy_version 79416 (0.0029) [2024-04-26 04:23:17,672][47288] Updated weights for policy 0, policy_version 79426 (0.0025) [2024-04-26 04:23:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1301381120. Throughput: 0: 56164.7. Samples: 1250738340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:23:18,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 04:23:20,713][47288] Updated weights for policy 0, policy_version 79436 (0.0028) [2024-04-26 04:23:23,430][47288] Updated weights for policy 0, policy_version 79446 (0.0031) [2024-04-26 04:23:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56252.0, 300 sec: 56261.0). Total num frames: 1301643264. Throughput: 0: 56105.5. Samples: 1251072380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:23:23,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 04:23:26,699][47288] Updated weights for policy 0, policy_version 79456 (0.0032) [2024-04-26 04:23:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1301938176. Throughput: 0: 56291.7. Samples: 1251248140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:23:28,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 04:23:29,208][47288] Updated weights for policy 0, policy_version 79466 (0.0030) [2024-04-26 04:23:32,409][47288] Updated weights for policy 0, policy_version 79476 (0.0024) [2024-04-26 04:23:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1302216704. Throughput: 0: 56233.6. Samples: 1251584100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-26 04:23:33,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 04:23:35,127][47288] Updated weights for policy 0, policy_version 79486 (0.0027) [2024-04-26 04:23:38,315][47288] Updated weights for policy 0, policy_version 79496 (0.0032) [2024-04-26 04:23:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.9, 300 sec: 56149.9). Total num frames: 1302478848. Throughput: 0: 56210.5. Samples: 1251921840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:23:38,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 04:23:40,812][47288] Updated weights for policy 0, policy_version 79506 (0.0025) [2024-04-26 04:23:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1302773760. Throughput: 0: 56188.5. Samples: 1252094180. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:23:43,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 04:23:44,003][47288] Updated weights for policy 0, policy_version 79516 (0.0030) [2024-04-26 04:23:46,748][47288] Updated weights for policy 0, policy_version 79526 (0.0032) [2024-04-26 04:23:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1303052288. Throughput: 0: 56138.5. Samples: 1252429060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:23:48,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 04:23:49,651][47288] Updated weights for policy 0, policy_version 79536 (0.0029) [2024-04-26 04:23:52,604][47288] Updated weights for policy 0, policy_version 79546 (0.0028) [2024-04-26 04:23:53,459][47267] Signal inference workers to stop experience collection... (18900 times) [2024-04-26 04:23:53,459][47267] Signal inference workers to resume experience collection... (18900 times) [2024-04-26 04:23:53,484][47288] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-04-26 04:23:53,485][47288] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-04-26 04:23:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 56205.4). Total num frames: 1303330816. Throughput: 0: 56035.8. Samples: 1252761720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:23:53,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 04:23:55,428][47288] Updated weights for policy 0, policy_version 79556 (0.0031) [2024-04-26 04:23:58,445][47288] Updated weights for policy 0, policy_version 79566 (0.0033) [2024-04-26 04:23:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1303609344. Throughput: 0: 56263.0. Samples: 1252932940. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:23:58,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 04:24:01,481][47288] Updated weights for policy 0, policy_version 79576 (0.0028) [2024-04-26 04:24:03,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56252.0, 300 sec: 56205.5). Total num frames: 1303904256. Throughput: 0: 56201.0. Samples: 1253267380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:24:03,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 04:24:04,274][47288] Updated weights for policy 0, policy_version 79586 (0.0027) [2024-04-26 04:24:07,302][47288] Updated weights for policy 0, policy_version 79596 (0.0039) [2024-04-26 04:24:08,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.6, 300 sec: 56261.0). Total num frames: 1304199168. Throughput: 0: 56314.0. Samples: 1253606520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:24:08,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 04:24:10,056][47288] Updated weights for policy 0, policy_version 79606 (0.0033) [2024-04-26 04:24:13,212][47288] Updated weights for policy 0, policy_version 79616 (0.0026) [2024-04-26 04:24:13,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1304461312. Throughput: 0: 56188.8. Samples: 1253776640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:24:13,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 04:24:15,817][47288] Updated weights for policy 0, policy_version 79626 (0.0029) [2024-04-26 04:24:18,897][47288] Updated weights for policy 0, policy_version 79636 (0.0031) [2024-04-26 04:24:18,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1304756224. Throughput: 0: 56222.3. Samples: 1254114100. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:24:18,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 04:24:21,759][47288] Updated weights for policy 0, policy_version 79646 (0.0026) [2024-04-26 04:24:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1305018368. Throughput: 0: 56106.5. Samples: 1254446640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-26 04:24:23,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 04:24:24,675][47288] Updated weights for policy 0, policy_version 79656 (0.0028) [2024-04-26 04:24:27,695][47288] Updated weights for policy 0, policy_version 79666 (0.0033) [2024-04-26 04:24:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.6, 300 sec: 56205.5). Total num frames: 1305296896. Throughput: 0: 56080.5. Samples: 1254617800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:24:28,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 04:24:30,648][47288] Updated weights for policy 0, policy_version 79676 (0.0025) [2024-04-26 04:24:33,883][47288] Updated weights for policy 0, policy_version 79686 (0.0030) [2024-04-26 04:24:33,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 56149.9). Total num frames: 1305575424. Throughput: 0: 56125.8. Samples: 1254954720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:24:33,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 04:24:36,559][47288] Updated weights for policy 0, policy_version 79696 (0.0026) [2024-04-26 04:24:38,923][47056] Fps is (10 sec: 57342.5, 60 sec: 56524.5, 300 sec: 56260.9). Total num frames: 1305870336. Throughput: 0: 56125.6. Samples: 1255287380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:24:38,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 04:24:39,904][47288] Updated weights for policy 0, policy_version 79706 (0.0026) [2024-04-26 04:24:42,307][47288] Updated weights for policy 0, policy_version 79716 (0.0032) [2024-04-26 04:24:43,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1306148864. Throughput: 0: 56025.9. Samples: 1255454100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:24:43,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 04:24:45,668][47288] Updated weights for policy 0, policy_version 79726 (0.0032) [2024-04-26 04:24:48,218][47288] Updated weights for policy 0, policy_version 79736 (0.0029) [2024-04-26 04:24:48,923][47056] Fps is (10 sec: 57345.6, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1306443776. Throughput: 0: 56208.4. Samples: 1255796760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:24:48,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 04:24:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000079739_1306443776.pth... [2024-04-26 04:24:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000078915_1292943360.pth [2024-04-26 04:24:51,438][47288] Updated weights for policy 0, policy_version 79746 (0.0030) [2024-04-26 04:24:53,923][47056] Fps is (10 sec: 55704.1, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1306705920. Throughput: 0: 56129.2. Samples: 1256132340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:24:53,924][47056] Avg episode reward: [(0, '0.414')] [2024-04-26 04:24:54,154][47288] Updated weights for policy 0, policy_version 79756 (0.0032) [2024-04-26 04:24:57,212][47288] Updated weights for policy 0, policy_version 79766 (0.0034) [2024-04-26 04:24:58,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56524.8, 300 sec: 56261.0). 
Total num frames: 1307000832. Throughput: 0: 55996.4. Samples: 1256296480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:24:58,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 04:24:59,827][47267] Signal inference workers to stop experience collection... (18950 times) [2024-04-26 04:24:59,827][47267] Signal inference workers to resume experience collection... (18950 times) [2024-04-26 04:24:59,855][47288] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-04-26 04:24:59,856][47288] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-04-26 04:24:59,939][47288] Updated weights for policy 0, policy_version 79776 (0.0029) [2024-04-26 04:25:03,001][47288] Updated weights for policy 0, policy_version 79786 (0.0027) [2024-04-26 04:25:03,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1307279360. Throughput: 0: 56030.7. Samples: 1256635480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:25:03,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 04:25:05,762][47288] Updated weights for policy 0, policy_version 79796 (0.0032) [2024-04-26 04:25:08,858][47288] Updated weights for policy 0, policy_version 79806 (0.0028) [2024-04-26 04:25:08,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 1307541504. Throughput: 0: 56198.9. Samples: 1256975600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 04:25:08,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 04:25:11,554][47288] Updated weights for policy 0, policy_version 79816 (0.0033) [2024-04-26 04:25:13,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 1307803648. Throughput: 0: 56119.7. Samples: 1257143200. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:13,924][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 04:25:14,802][47288] Updated weights for policy 0, policy_version 79826 (0.0026) [2024-04-26 04:25:17,518][47288] Updated weights for policy 0, policy_version 79836 (0.0030) [2024-04-26 04:25:18,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1308114944. Throughput: 0: 56156.3. Samples: 1257481760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:18,923][47056] Avg episode reward: [(0, '0.352')] [2024-04-26 04:25:20,508][47288] Updated weights for policy 0, policy_version 79846 (0.0033) [2024-04-26 04:25:23,180][47288] Updated weights for policy 0, policy_version 79856 (0.0027) [2024-04-26 04:25:23,923][47056] Fps is (10 sec: 57345.4, 60 sec: 55978.8, 300 sec: 56205.5). Total num frames: 1308377088. Throughput: 0: 56215.0. Samples: 1257817040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:23,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 04:25:26,341][47288] Updated weights for policy 0, policy_version 79866 (0.0030) [2024-04-26 04:25:28,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.6, 300 sec: 56261.0). Total num frames: 1308688384. Throughput: 0: 56321.5. Samples: 1257988580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:28,923][47288] Updated weights for policy 0, policy_version 79876 (0.0030) [2024-04-26 04:25:28,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 04:25:32,223][47288] Updated weights for policy 0, policy_version 79886 (0.0030) [2024-04-26 04:25:33,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.6, 300 sec: 56149.9). Total num frames: 1308950528. Throughput: 0: 56082.0. Samples: 1258320460. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:33,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 04:25:35,015][47288] Updated weights for policy 0, policy_version 79896 (0.0030) [2024-04-26 04:25:37,942][47288] Updated weights for policy 0, policy_version 79906 (0.0036) [2024-04-26 04:25:38,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55978.9, 300 sec: 56261.0). Total num frames: 1309229056. Throughput: 0: 56072.7. Samples: 1258655600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:38,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 04:25:40,956][47288] Updated weights for policy 0, policy_version 79916 (0.0030) [2024-04-26 04:25:43,688][47288] Updated weights for policy 0, policy_version 79926 (0.0029) [2024-04-26 04:25:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1309507584. Throughput: 0: 56202.3. Samples: 1258825580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:43,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 04:25:46,641][47288] Updated weights for policy 0, policy_version 79936 (0.0031) [2024-04-26 04:25:48,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 56149.9). Total num frames: 1309769728. Throughput: 0: 56136.9. Samples: 1259161640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:48,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 04:25:49,625][47288] Updated weights for policy 0, policy_version 79946 (0.0027) [2024-04-26 04:25:52,430][47288] Updated weights for policy 0, policy_version 79956 (0.0027) [2024-04-26 04:25:53,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.9, 300 sec: 56149.9). Total num frames: 1310064640. Throughput: 0: 56030.1. Samples: 1259496940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:53,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 04:25:55,533][47288] Updated weights for policy 0, policy_version 79966 (0.0026) [2024-04-26 04:25:56,633][47267] Signal inference workers to stop experience collection... (19000 times) [2024-04-26 04:25:56,639][47267] Signal inference workers to resume experience collection... (19000 times) [2024-04-26 04:25:56,664][47288] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-04-26 04:25:56,664][47288] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-04-26 04:25:58,313][47288] Updated weights for policy 0, policy_version 79976 (0.0027) [2024-04-26 04:25:58,923][47056] Fps is (10 sec: 60620.1, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1310375936. Throughput: 0: 56051.6. Samples: 1259665520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 04:25:58,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 04:26:01,328][47288] Updated weights for policy 0, policy_version 79986 (0.0031) [2024-04-26 04:26:03,923][47056] Fps is (10 sec: 55704.2, 60 sec: 55705.4, 300 sec: 56205.4). Total num frames: 1310621696. Throughput: 0: 55970.1. Samples: 1260000420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:03,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 04:26:04,106][47288] Updated weights for policy 0, policy_version 79996 (0.0030) [2024-04-26 04:26:07,192][47288] Updated weights for policy 0, policy_version 80006 (0.0028) [2024-04-26 04:26:08,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.9, 300 sec: 56149.9). Total num frames: 1310916608. Throughput: 0: 56085.6. Samples: 1260340900. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:08,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 04:26:09,806][47288] Updated weights for policy 0, policy_version 80016 (0.0027) [2024-04-26 04:26:12,952][47288] Updated weights for policy 0, policy_version 80026 (0.0026) [2024-04-26 04:26:13,923][47056] Fps is (10 sec: 58984.1, 60 sec: 56798.1, 300 sec: 56316.6). Total num frames: 1311211520. Throughput: 0: 56138.6. Samples: 1260514800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:13,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 04:26:15,635][47288] Updated weights for policy 0, policy_version 80036 (0.0025) [2024-04-26 04:26:18,900][47288] Updated weights for policy 0, policy_version 80046 (0.0028) [2024-04-26 04:26:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 56205.5). Total num frames: 1311473664. Throughput: 0: 56291.6. Samples: 1260853580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:18,923][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 04:26:21,533][47288] Updated weights for policy 0, policy_version 80056 (0.0029) [2024-04-26 04:26:23,923][47056] Fps is (10 sec: 54066.3, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1311752192. Throughput: 0: 56354.1. Samples: 1261191540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:23,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 04:26:24,593][47288] Updated weights for policy 0, policy_version 80066 (0.0027) [2024-04-26 04:26:27,261][47288] Updated weights for policy 0, policy_version 80076 (0.0030) [2024-04-26 04:26:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1312047104. Throughput: 0: 56249.3. Samples: 1261356800. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:28,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 04:26:30,264][47288] Updated weights for policy 0, policy_version 80086 (0.0025) [2024-04-26 04:26:32,979][47288] Updated weights for policy 0, policy_version 80096 (0.0037) [2024-04-26 04:26:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1312325632. Throughput: 0: 56393.3. Samples: 1261699340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:33,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 04:26:36,123][47288] Updated weights for policy 0, policy_version 80106 (0.0026) [2024-04-26 04:26:38,793][47288] Updated weights for policy 0, policy_version 80116 (0.0026) [2024-04-26 04:26:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1312620544. Throughput: 0: 56350.5. Samples: 1262032720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:38,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 04:26:42,024][47288] Updated weights for policy 0, policy_version 80126 (0.0030) [2024-04-26 04:26:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1312882688. Throughput: 0: 56460.2. Samples: 1262206220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:43,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 04:26:44,657][47288] Updated weights for policy 0, policy_version 80136 (0.0029) [2024-04-26 04:26:47,934][47267] Signal inference workers to stop experience collection... (19050 times) [2024-04-26 04:26:47,975][47288] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-04-26 04:26:47,988][47267] Signal inference workers to resume experience collection... 
(19050 times) [2024-04-26 04:26:47,995][47288] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-04-26 04:26:47,998][47288] Updated weights for policy 0, policy_version 80146 (0.0032) [2024-04-26 04:26:48,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1313177600. Throughput: 0: 56566.0. Samples: 1262545880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 04:26:48,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 04:26:49,005][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000080151_1313193984.pth... [2024-04-26 04:26:49,051][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000079327_1299693568.pth [2024-04-26 04:26:50,447][47288] Updated weights for policy 0, policy_version 80156 (0.0027) [2024-04-26 04:26:53,629][47288] Updated weights for policy 0, policy_version 80166 (0.0028) [2024-04-26 04:26:53,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1313456128. Throughput: 0: 56366.7. Samples: 1262877400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:26:53,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 04:26:56,140][47288] Updated weights for policy 0, policy_version 80176 (0.0029) [2024-04-26 04:26:58,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 56094.4). Total num frames: 1313701888. Throughput: 0: 56181.6. Samples: 1263042980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:26:58,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 04:26:59,376][47288] Updated weights for policy 0, policy_version 80186 (0.0027) [2024-04-26 04:27:01,975][47288] Updated weights for policy 0, policy_version 80196 (0.0041) [2024-04-26 04:27:03,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1314013184. Throughput: 0: 56152.9. Samples: 1263380460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:27:03,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 04:27:05,221][47288] Updated weights for policy 0, policy_version 80206 (0.0031) [2024-04-26 04:27:07,893][47288] Updated weights for policy 0, policy_version 80216 (0.0023) [2024-04-26 04:27:08,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1314308096. Throughput: 0: 56192.9. Samples: 1263720220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:27:08,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 04:27:10,988][47288] Updated weights for policy 0, policy_version 80226 (0.0032) [2024-04-26 04:27:13,628][47288] Updated weights for policy 0, policy_version 80236 (0.0033) [2024-04-26 04:27:13,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1314586624. Throughput: 0: 56235.6. Samples: 1263887400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:27:13,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 04:27:16,718][47288] Updated weights for policy 0, policy_version 80246 (0.0023) [2024-04-26 04:27:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1314865152. Throughput: 0: 56125.3. Samples: 1264224980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:27:18,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 04:27:19,435][47288] Updated weights for policy 0, policy_version 80256 (0.0027) [2024-04-26 04:27:22,613][47288] Updated weights for policy 0, policy_version 80266 (0.0025) [2024-04-26 04:27:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 56205.4). Total num frames: 1315143680. Throughput: 0: 56177.0. Samples: 1264560680. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:27:23,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 04:27:25,210][47288] Updated weights for policy 0, policy_version 80276 (0.0027) [2024-04-26 04:27:28,540][47288] Updated weights for policy 0, policy_version 80286 (0.0028) [2024-04-26 04:27:28,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1315422208. Throughput: 0: 56127.5. Samples: 1264731960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:27:28,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 04:27:30,996][47288] Updated weights for policy 0, policy_version 80296 (0.0031) [2024-04-26 04:27:33,922][47056] Fps is (10 sec: 54068.1, 60 sec: 55978.8, 300 sec: 56150.0). Total num frames: 1315684352. Throughput: 0: 56185.9. Samples: 1265074240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:27:33,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 04:27:34,255][47288] Updated weights for policy 0, policy_version 80306 (0.0035) [2024-04-26 04:27:36,623][47267] Signal inference workers to stop experience collection... (19100 times) [2024-04-26 04:27:36,655][47288] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-04-26 04:27:36,709][47267] Signal inference workers to resume experience collection... (19100 times) [2024-04-26 04:27:36,709][47288] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-04-26 04:27:36,822][47288] Updated weights for policy 0, policy_version 80316 (0.0026) [2024-04-26 04:27:38,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 56094.4). Total num frames: 1315946496. Throughput: 0: 56400.1. Samples: 1265415400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 04:27:38,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 04:27:40,080][47288] Updated weights for policy 0, policy_version 80326 (0.0029) [2024-04-26 04:27:42,695][47288] Updated weights for policy 0, policy_version 80336 (0.0029) [2024-04-26 04:27:43,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 1316274176. Throughput: 0: 56402.1. Samples: 1265581080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:27:43,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 04:27:45,939][47288] Updated weights for policy 0, policy_version 80346 (0.0027) [2024-04-26 04:27:48,516][47288] Updated weights for policy 0, policy_version 80356 (0.0029) [2024-04-26 04:27:48,923][47056] Fps is (10 sec: 62259.6, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 1316569088. Throughput: 0: 56352.1. Samples: 1265916300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:27:48,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 04:27:51,746][47288] Updated weights for policy 0, policy_version 80366 (0.0032) [2024-04-26 04:27:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1316847616. Throughput: 0: 56284.9. Samples: 1266253040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:27:53,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 04:27:54,406][47288] Updated weights for policy 0, policy_version 80376 (0.0027) [2024-04-26 04:27:57,641][47288] Updated weights for policy 0, policy_version 80386 (0.0026) [2024-04-26 04:27:58,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56797.7, 300 sec: 56205.5). Total num frames: 1317109760. Throughput: 0: 56600.8. Samples: 1266434440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:27:58,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 04:28:00,450][47288] Updated weights for policy 0, policy_version 80396 (0.0030) [2024-04-26 04:28:03,380][47288] Updated weights for policy 0, policy_version 80406 (0.0028) [2024-04-26 04:28:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1317404672. Throughput: 0: 56606.6. Samples: 1266772280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:28:03,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 04:28:06,103][47288] Updated weights for policy 0, policy_version 80416 (0.0028) [2024-04-26 04:28:08,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 1317650432. Throughput: 0: 56585.7. Samples: 1267107040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:28:08,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 04:28:09,276][47288] Updated weights for policy 0, policy_version 80426 (0.0028) [2024-04-26 04:28:11,780][47288] Updated weights for policy 0, policy_version 80436 (0.0025) [2024-04-26 04:28:13,923][47056] Fps is (10 sec: 49151.8, 60 sec: 55159.4, 300 sec: 55983.3). Total num frames: 1317896192. Throughput: 0: 56093.2. Samples: 1267256160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:28:13,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 04:28:15,060][47288] Updated weights for policy 0, policy_version 80446 (0.0027) [2024-04-26 04:28:17,661][47288] Updated weights for policy 0, policy_version 80456 (0.0027) [2024-04-26 04:28:17,673][47267] Signal inference workers to stop experience collection... (19150 times) [2024-04-26 04:28:17,674][47267] Signal inference workers to resume experience collection... 
(19150 times) [2024-04-26 04:28:17,702][47288] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-04-26 04:28:17,702][47288] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-04-26 04:28:18,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.5, 300 sec: 56205.4). Total num frames: 1318223872. Throughput: 0: 56057.8. Samples: 1267596860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:28:18,923][47056] Avg episode reward: [(0, '0.351')] [2024-04-26 04:28:20,884][47288] Updated weights for policy 0, policy_version 80466 (0.0027) [2024-04-26 04:28:23,524][47288] Updated weights for policy 0, policy_version 80476 (0.0024) [2024-04-26 04:28:23,923][47056] Fps is (10 sec: 63897.9, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1318535168. Throughput: 0: 56048.9. Samples: 1267937600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:28:23,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 04:28:26,592][47288] Updated weights for policy 0, policy_version 80486 (0.0031) [2024-04-26 04:28:28,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.6, 300 sec: 56261.0). Total num frames: 1318813696. Throughput: 0: 56191.0. Samples: 1268109680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:28:28,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 04:28:29,374][47288] Updated weights for policy 0, policy_version 80496 (0.0028) [2024-04-26 04:28:32,434][47288] Updated weights for policy 0, policy_version 80506 (0.0027) [2024-04-26 04:28:33,923][47056] Fps is (10 sec: 54066.4, 60 sec: 56524.5, 300 sec: 56260.9). Total num frames: 1319075840. Throughput: 0: 56143.7. Samples: 1268442780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:28:33,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 04:28:35,142][47288] Updated weights for policy 0, policy_version 80516 (0.0024) [2024-04-26 04:28:38,238][47288] Updated weights for policy 0, policy_version 80526 (0.0024) [2024-04-26 04:28:38,923][47056] Fps is (10 sec: 55706.4, 60 sec: 57070.9, 300 sec: 56261.0). Total num frames: 1319370752. Throughput: 0: 56099.1. Samples: 1268777500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:28:38,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 04:28:40,876][47288] Updated weights for policy 0, policy_version 80536 (0.0031) [2024-04-26 04:28:43,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1319649280. Throughput: 0: 55899.2. Samples: 1268949900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:28:43,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 04:28:44,086][47288] Updated weights for policy 0, policy_version 80546 (0.0031) [2024-04-26 04:28:46,755][47288] Updated weights for policy 0, policy_version 80556 (0.0027) [2024-04-26 04:28:48,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 56149.9). Total num frames: 1319895040. Throughput: 0: 55877.0. Samples: 1269286740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:28:48,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 04:28:48,969][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000080561_1319911424.pth... [2024-04-26 04:28:49,018][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000079739_1306443776.pth [2024-04-26 04:28:49,914][47288] Updated weights for policy 0, policy_version 80566 (0.0030) [2024-04-26 04:28:52,593][47288] Updated weights for policy 0, policy_version 80576 (0.0030) [2024-04-26 04:28:53,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55432.5, 300 sec: 56149.9). 
Total num frames: 1320173568. Throughput: 0: 55964.1. Samples: 1269625420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:28:53,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 04:28:55,774][47288] Updated weights for policy 0, policy_version 80586 (0.0031) [2024-04-26 04:28:57,745][47267] Signal inference workers to stop experience collection... (19200 times) [2024-04-26 04:28:57,745][47267] Signal inference workers to resume experience collection... (19200 times) [2024-04-26 04:28:57,764][47288] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-04-26 04:28:57,765][47288] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-04-26 04:28:58,353][47288] Updated weights for policy 0, policy_version 80596 (0.0033) [2024-04-26 04:28:58,923][47056] Fps is (10 sec: 60619.9, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1320501248. Throughput: 0: 56448.4. Samples: 1269796340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:28:58,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 04:29:01,655][47288] Updated weights for policy 0, policy_version 80606 (0.0032) [2024-04-26 04:29:03,923][47056] Fps is (10 sec: 62260.1, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1320796160. Throughput: 0: 56439.4. Samples: 1270136620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:29:03,923][47056] Avg episode reward: [(0, '0.363')] [2024-04-26 04:29:04,132][47288] Updated weights for policy 0, policy_version 80616 (0.0035) [2024-04-26 04:29:07,380][47288] Updated weights for policy 0, policy_version 80626 (0.0029) [2024-04-26 04:29:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1321058304. Throughput: 0: 56287.9. Samples: 1270470560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:29:08,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 04:29:09,877][47288] Updated weights for policy 0, policy_version 80636 (0.0032) [2024-04-26 04:29:13,121][47288] Updated weights for policy 0, policy_version 80646 (0.0021) [2024-04-26 04:29:13,923][47056] Fps is (10 sec: 52428.5, 60 sec: 57071.1, 300 sec: 56149.9). Total num frames: 1321320448. Throughput: 0: 56295.4. Samples: 1270642960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:29:13,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 04:29:15,919][47288] Updated weights for policy 0, policy_version 80656 (0.0027) [2024-04-26 04:29:18,923][47056] Fps is (10 sec: 54066.4, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1321598976. Throughput: 0: 56427.5. Samples: 1270982020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:29:18,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 04:29:19,066][47288] Updated weights for policy 0, policy_version 80666 (0.0025) [2024-04-26 04:29:21,805][47288] Updated weights for policy 0, policy_version 80676 (0.0031) [2024-04-26 04:29:23,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55432.6, 300 sec: 56149.9). Total num frames: 1321861120. Throughput: 0: 56346.2. Samples: 1271313080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:29:23,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 04:29:25,004][47288] Updated weights for policy 0, policy_version 80686 (0.0028) [2024-04-26 04:29:27,678][47288] Updated weights for policy 0, policy_version 80696 (0.0027) [2024-04-26 04:29:28,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55432.6, 300 sec: 56149.9). Total num frames: 1322139648. Throughput: 0: 56109.3. Samples: 1271474820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:29:28,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 04:29:30,824][47288] Updated weights for policy 0, policy_version 80706 (0.0033) [2024-04-26 04:29:33,483][47288] Updated weights for policy 0, policy_version 80716 (0.0029) [2024-04-26 04:29:33,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1322450944. Throughput: 0: 56135.9. Samples: 1271812860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:29:33,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 04:29:36,498][47288] Updated weights for policy 0, policy_version 80726 (0.0027) [2024-04-26 04:29:38,923][47056] Fps is (10 sec: 60621.1, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1322745856. Throughput: 0: 55959.6. Samples: 1272143600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:29:38,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 04:29:39,430][47288] Updated weights for policy 0, policy_version 80736 (0.0034) [2024-04-26 04:29:42,275][47267] Signal inference workers to stop experience collection... (19250 times) [2024-04-26 04:29:42,301][47288] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-04-26 04:29:42,329][47267] Signal inference workers to resume experience collection... (19250 times) [2024-04-26 04:29:42,334][47288] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-04-26 04:29:42,458][47288] Updated weights for policy 0, policy_version 80746 (0.0029) [2024-04-26 04:29:43,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 56205.4). Total num frames: 1323024384. Throughput: 0: 56185.1. Samples: 1272324660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:29:43,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 04:29:45,304][47288] Updated weights for policy 0, policy_version 80756 (0.0031) [2024-04-26 04:29:48,449][47288] Updated weights for policy 0, policy_version 80766 (0.0034) [2024-04-26 04:29:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 1323286528. Throughput: 0: 56116.0. Samples: 1272661840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:29:48,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 04:29:51,133][47288] Updated weights for policy 0, policy_version 80776 (0.0028) [2024-04-26 04:29:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.9, 300 sec: 56149.9). Total num frames: 1323565056. Throughput: 0: 56048.6. Samples: 1272992740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:29:53,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 04:29:54,507][47288] Updated weights for policy 0, policy_version 80786 (0.0027) [2024-04-26 04:29:57,274][47288] Updated weights for policy 0, policy_version 80796 (0.0032) [2024-04-26 04:29:58,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.7, 300 sec: 56149.9). Total num frames: 1323843584. Throughput: 0: 55867.0. Samples: 1273156980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:29:58,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 04:30:00,280][47288] Updated weights for policy 0, policy_version 80806 (0.0027) [2024-04-26 04:30:03,372][47288] Updated weights for policy 0, policy_version 80816 (0.0032) [2024-04-26 04:30:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 56150.0). Total num frames: 1324105728. Throughput: 0: 55895.4. Samples: 1273497300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 04:30:03,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 04:30:05,939][47288] Updated weights for policy 0, policy_version 80826 (0.0035) [2024-04-26 04:30:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 1324400640. Throughput: 0: 56083.1. Samples: 1273836820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:08,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 04:30:09,128][47288] Updated weights for policy 0, policy_version 80836 (0.0030) [2024-04-26 04:30:11,814][47288] Updated weights for policy 0, policy_version 80846 (0.0031) [2024-04-26 04:30:13,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1324695552. Throughput: 0: 56187.6. Samples: 1274003260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:13,923][47056] Avg episode reward: [(0, '0.573')] [2024-04-26 04:30:15,068][47288] Updated weights for policy 0, policy_version 80856 (0.0028) [2024-04-26 04:30:17,614][47288] Updated weights for policy 0, policy_version 80866 (0.0027) [2024-04-26 04:30:18,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56525.0, 300 sec: 56316.5). Total num frames: 1324990464. Throughput: 0: 56166.3. Samples: 1274340340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:18,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 04:30:20,950][47288] Updated weights for policy 0, policy_version 80876 (0.0024) [2024-04-26 04:30:23,717][47288] Updated weights for policy 0, policy_version 80886 (0.0033) [2024-04-26 04:30:23,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.8, 300 sec: 56150.0). Total num frames: 1325252608. Throughput: 0: 56257.4. Samples: 1274675180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:23,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 04:30:26,672][47288] Updated weights for policy 0, policy_version 80896 (0.0033) [2024-04-26 04:30:28,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 1325531136. Throughput: 0: 56099.0. Samples: 1274849120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:28,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 04:30:29,353][47288] Updated weights for policy 0, policy_version 80906 (0.0029) [2024-04-26 04:30:32,481][47288] Updated weights for policy 0, policy_version 80916 (0.0028) [2024-04-26 04:30:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.8, 300 sec: 56205.5). Total num frames: 1325809664. Throughput: 0: 56046.2. Samples: 1275183920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:33,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:30:35,214][47288] Updated weights for policy 0, policy_version 80926 (0.0030) [2024-04-26 04:30:38,144][47288] Updated weights for policy 0, policy_version 80936 (0.0030) [2024-04-26 04:30:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 56205.4). Total num frames: 1326088192. Throughput: 0: 56200.8. Samples: 1275521780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:38,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 04:30:41,008][47288] Updated weights for policy 0, policy_version 80946 (0.0031) [2024-04-26 04:30:43,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1326366720. Throughput: 0: 56270.3. Samples: 1275689140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:43,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 04:30:44,018][47288] Updated weights for policy 0, policy_version 80956 (0.0027) [2024-04-26 04:30:46,627][47267] Signal inference workers to stop experience collection... 
(19300 times) [2024-04-26 04:30:46,662][47288] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-04-26 04:30:46,687][47267] Signal inference workers to resume experience collection... (19300 times) [2024-04-26 04:30:46,688][47288] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-04-26 04:30:46,690][47288] Updated weights for policy 0, policy_version 80966 (0.0024) [2024-04-26 04:30:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1326645248. Throughput: 0: 56134.2. Samples: 1276023340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:48,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 04:30:48,988][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000080973_1326661632.pth... [2024-04-26 04:30:49,042][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000080151_1313193984.pth [2024-04-26 04:30:49,835][47288] Updated weights for policy 0, policy_version 80976 (0.0028) [2024-04-26 04:30:52,367][47288] Updated weights for policy 0, policy_version 80986 (0.0030) [2024-04-26 04:30:53,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.7, 300 sec: 56205.5). Total num frames: 1326956544. Throughput: 0: 56078.6. Samples: 1276360360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:30:53,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 04:30:55,579][47288] Updated weights for policy 0, policy_version 80996 (0.0034) [2024-04-26 04:30:58,326][47288] Updated weights for policy 0, policy_version 81006 (0.0025) [2024-04-26 04:30:58,923][47056] Fps is (10 sec: 60621.1, 60 sec: 56798.0, 300 sec: 56372.1). Total num frames: 1327251456. Throughput: 0: 56449.5. Samples: 1276543480. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:30:58,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 04:31:01,368][47288] Updated weights for policy 0, policy_version 81016 (0.0024) [2024-04-26 04:31:03,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1327513600. Throughput: 0: 56336.0. Samples: 1276875460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:31:03,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 04:31:04,007][47288] Updated weights for policy 0, policy_version 81026 (0.0030) [2024-04-26 04:31:07,060][47288] Updated weights for policy 0, policy_version 81036 (0.0027) [2024-04-26 04:31:08,923][47056] Fps is (10 sec: 54066.0, 60 sec: 56524.6, 300 sec: 56205.4). Total num frames: 1327792128. Throughput: 0: 56517.5. Samples: 1277218480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:31:08,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 04:31:09,629][47288] Updated weights for policy 0, policy_version 81046 (0.0026) [2024-04-26 04:31:12,717][47288] Updated weights for policy 0, policy_version 81056 (0.0023) [2024-04-26 04:31:13,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 56205.5). Total num frames: 1328054272. Throughput: 0: 56323.1. Samples: 1277383660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:31:13,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 04:31:15,384][47288] Updated weights for policy 0, policy_version 81066 (0.0036) [2024-04-26 04:31:18,438][47288] Updated weights for policy 0, policy_version 81076 (0.0026) [2024-04-26 04:31:18,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1328349184. Throughput: 0: 56500.8. Samples: 1277726460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:31:18,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 04:31:21,282][47288] Updated weights for policy 0, policy_version 81086 (0.0028) [2024-04-26 04:31:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1328627712. Throughput: 0: 56516.0. Samples: 1278065000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:31:23,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 04:31:24,290][47288] Updated weights for policy 0, policy_version 81096 (0.0026) [2024-04-26 04:31:27,125][47288] Updated weights for policy 0, policy_version 81106 (0.0032) [2024-04-26 04:31:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1328922624. Throughput: 0: 56579.4. Samples: 1278235220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:31:28,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 04:31:30,252][47288] Updated weights for policy 0, policy_version 81116 (0.0030) [2024-04-26 04:31:32,734][47288] Updated weights for policy 0, policy_version 81126 (0.0037) [2024-04-26 04:31:33,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.7, 300 sec: 56261.0). Total num frames: 1329217536. Throughput: 0: 56530.5. Samples: 1278567220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:31:33,924][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 04:31:35,261][47267] Signal inference workers to stop experience collection... (19350 times) [2024-04-26 04:31:35,298][47288] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-04-26 04:31:35,351][47267] Signal inference workers to resume experience collection... 
(19350 times) [2024-04-26 04:31:35,351][47288] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-04-26 04:31:35,909][47288] Updated weights for policy 0, policy_version 81136 (0.0024) [2024-04-26 04:31:38,456][47288] Updated weights for policy 0, policy_version 81146 (0.0027) [2024-04-26 04:31:38,923][47056] Fps is (10 sec: 58982.7, 60 sec: 57070.9, 300 sec: 56372.0). Total num frames: 1329512448. Throughput: 0: 56482.2. Samples: 1278902060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:31:38,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 04:31:41,713][47288] Updated weights for policy 0, policy_version 81156 (0.0026) [2024-04-26 04:31:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 1329774592. Throughput: 0: 56336.7. Samples: 1279078640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 04:31:43,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 04:31:44,597][47288] Updated weights for policy 0, policy_version 81166 (0.0027) [2024-04-26 04:31:48,140][47288] Updated weights for policy 0, policy_version 81176 (0.0025) [2024-04-26 04:31:48,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 1330036736. Throughput: 0: 56473.7. Samples: 1279416780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:31:48,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 04:31:50,289][47288] Updated weights for policy 0, policy_version 81186 (0.0030) [2024-04-26 04:31:53,820][47288] Updated weights for policy 0, policy_version 81196 (0.0032) [2024-04-26 04:31:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1330315264. Throughput: 0: 56477.9. Samples: 1279759980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:31:53,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 04:31:55,980][47288] Updated weights for policy 0, policy_version 81206 (0.0024) [2024-04-26 04:31:58,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1330626560. Throughput: 0: 56385.4. Samples: 1279921000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:31:58,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 04:31:59,780][47288] Updated weights for policy 0, policy_version 81216 (0.0031) [2024-04-26 04:32:01,810][47288] Updated weights for policy 0, policy_version 81226 (0.0030) [2024-04-26 04:32:03,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1330888704. Throughput: 0: 56358.7. Samples: 1280262600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:32:03,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 04:32:05,688][47288] Updated weights for policy 0, policy_version 81236 (0.0031) [2024-04-26 04:32:07,659][47288] Updated weights for policy 0, policy_version 81246 (0.0029) [2024-04-26 04:32:08,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.8, 300 sec: 56205.4). Total num frames: 1331167232. Throughput: 0: 56302.2. Samples: 1280598600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:32:08,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 04:32:11,442][47288] Updated weights for policy 0, policy_version 81256 (0.0028) [2024-04-26 04:32:13,422][47288] Updated weights for policy 0, policy_version 81266 (0.0027) [2024-04-26 04:32:13,923][47056] Fps is (10 sec: 58981.8, 60 sec: 57070.9, 300 sec: 56316.5). Total num frames: 1331478528. Throughput: 0: 56366.3. Samples: 1280771700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:32:13,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 04:32:17,188][47288] Updated weights for policy 0, policy_version 81276 (0.0034) [2024-04-26 04:32:18,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1331757056. Throughput: 0: 56544.5. Samples: 1281111720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:32:18,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 04:32:19,270][47288] Updated weights for policy 0, policy_version 81286 (0.0027) [2024-04-26 04:32:22,916][47288] Updated weights for policy 0, policy_version 81296 (0.0028) [2024-04-26 04:32:23,923][47056] Fps is (10 sec: 52429.4, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1332002816. Throughput: 0: 56689.9. Samples: 1281453100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:32:23,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 04:32:25,215][47288] Updated weights for policy 0, policy_version 81306 (0.0027) [2024-04-26 04:32:28,805][47288] Updated weights for policy 0, policy_version 81316 (0.0029) [2024-04-26 04:32:28,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1332281344. Throughput: 0: 56407.6. Samples: 1281616980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:32:28,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 04:32:29,617][47267] Signal inference workers to stop experience collection... (19400 times) [2024-04-26 04:32:29,655][47288] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-04-26 04:32:29,666][47267] Signal inference workers to resume experience collection... 
(19400 times) [2024-04-26 04:32:29,674][47288] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-04-26 04:32:31,086][47288] Updated weights for policy 0, policy_version 81326 (0.0032) [2024-04-26 04:32:33,923][47056] Fps is (10 sec: 57340.9, 60 sec: 55978.3, 300 sec: 56372.0). Total num frames: 1332576256. Throughput: 0: 56340.3. Samples: 1281952120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 04:32:33,924][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 04:32:34,519][47288] Updated weights for policy 0, policy_version 81336 (0.0024) [2024-04-26 04:32:36,820][47288] Updated weights for policy 0, policy_version 81346 (0.0031) [2024-04-26 04:32:38,923][47056] Fps is (10 sec: 58982.2, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1332871168. Throughput: 0: 56073.3. Samples: 1282283280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 04:32:38,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 04:32:40,328][47288] Updated weights for policy 0, policy_version 81356 (0.0037) [2024-04-26 04:32:42,810][47288] Updated weights for policy 0, policy_version 81366 (0.0029) [2024-04-26 04:32:43,923][47056] Fps is (10 sec: 54069.5, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 1333116928. Throughput: 0: 56326.6. Samples: 1282455700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 04:32:43,924][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 04:32:46,329][47288] Updated weights for policy 0, policy_version 81376 (0.0024) [2024-04-26 04:32:48,665][47288] Updated weights for policy 0, policy_version 81386 (0.0033) [2024-04-26 04:32:48,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 1333428224. Throughput: 0: 56183.5. Samples: 1282790860. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 04:32:48,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 04:32:49,007][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000081387_1333444608.pth... [2024-04-26 04:32:49,054][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000080561_1319911424.pth [2024-04-26 04:32:52,201][47288] Updated weights for policy 0, policy_version 81396 (0.0028) [2024-04-26 04:32:53,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1333706752. Throughput: 0: 56214.2. Samples: 1283128240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 04:32:53,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 04:32:54,504][47288] Updated weights for policy 0, policy_version 81406 (0.0032) [2024-04-26 04:32:57,911][47288] Updated weights for policy 0, policy_version 81416 (0.0031) [2024-04-26 04:32:58,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1333985280. Throughput: 0: 56007.7. Samples: 1283292040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 04:32:58,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 04:33:00,243][47288] Updated weights for policy 0, policy_version 81426 (0.0024) [2024-04-26 04:33:03,606][47288] Updated weights for policy 0, policy_version 81436 (0.0027) [2024-04-26 04:33:03,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.5, 300 sec: 56261.0). Total num frames: 1334247424. Throughput: 0: 55997.2. Samples: 1283631600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 04:33:03,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 04:33:05,956][47288] Updated weights for policy 0, policy_version 81446 (0.0025) [2024-04-26 04:33:08,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1334542336. Throughput: 0: 55903.0. Samples: 1283968740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 04:33:08,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 04:33:09,521][47288] Updated weights for policy 0, policy_version 81456 (0.0027) [2024-04-26 04:33:11,986][47288] Updated weights for policy 0, policy_version 81466 (0.0031) [2024-04-26 04:33:13,923][47056] Fps is (10 sec: 58982.8, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1334837248. Throughput: 0: 56047.1. Samples: 1284139100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 04:33:13,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 04:33:15,388][47288] Updated weights for policy 0, policy_version 81476 (0.0030) [2024-04-26 04:33:17,961][47288] Updated weights for policy 0, policy_version 81486 (0.0028) [2024-04-26 04:33:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.5, 300 sec: 56205.4). Total num frames: 1335115776. Throughput: 0: 56076.9. Samples: 1284475560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 04:33:18,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 04:33:21,177][47288] Updated weights for policy 0, policy_version 81496 (0.0028) [2024-04-26 04:33:23,775][47288] Updated weights for policy 0, policy_version 81506 (0.0035) [2024-04-26 04:33:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.7, 300 sec: 56205.5). Total num frames: 1335394304. Throughput: 0: 56278.3. Samples: 1284815800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:33:23,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 04:33:26,859][47288] Updated weights for policy 0, policy_version 81516 (0.0031) [2024-04-26 04:33:28,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56797.9, 300 sec: 56316.6). Total num frames: 1335689216. Throughput: 0: 56221.4. Samples: 1284985660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:33:28,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 04:33:29,554][47288] Updated weights for policy 0, policy_version 81526 (0.0028) [2024-04-26 04:33:32,658][47288] Updated weights for policy 0, policy_version 81536 (0.0031) [2024-04-26 04:33:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56252.2, 300 sec: 56205.5). Total num frames: 1335951360. Throughput: 0: 56318.7. Samples: 1285325200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:33:33,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 04:33:35,481][47288] Updated weights for policy 0, policy_version 81546 (0.0030) [2024-04-26 04:33:35,494][47267] Signal inference workers to stop experience collection... (19450 times) [2024-04-26 04:33:35,494][47267] Signal inference workers to resume experience collection... (19450 times) [2024-04-26 04:33:35,521][47288] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-04-26 04:33:35,522][47288] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-04-26 04:33:38,537][47288] Updated weights for policy 0, policy_version 81556 (0.0030) [2024-04-26 04:33:38,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1336229888. Throughput: 0: 56241.2. Samples: 1285659100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:33:38,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 04:33:41,224][47288] Updated weights for policy 0, policy_version 81566 (0.0033) [2024-04-26 04:33:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1336508416. Throughput: 0: 56292.8. Samples: 1285825220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:33:43,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 04:33:44,432][47288] Updated weights for policy 0, policy_version 81576 (0.0028) [2024-04-26 04:33:47,086][47288] Updated weights for policy 0, policy_version 81586 (0.0027) [2024-04-26 04:33:48,923][47056] Fps is (10 sec: 57345.4, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1336803328. Throughput: 0: 56301.1. Samples: 1286165140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:33:48,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:33:50,097][47288] Updated weights for policy 0, policy_version 81596 (0.0024) [2024-04-26 04:33:53,008][47288] Updated weights for policy 0, policy_version 81606 (0.0029) [2024-04-26 04:33:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1337081856. Throughput: 0: 56264.4. Samples: 1286500640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:33:53,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 04:33:55,959][47288] Updated weights for policy 0, policy_version 81616 (0.0028) [2024-04-26 04:33:58,712][47288] Updated weights for policy 0, policy_version 81626 (0.0030) [2024-04-26 04:33:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1337360384. Throughput: 0: 56376.0. Samples: 1286676020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:33:58,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 04:34:02,097][47288] Updated weights for policy 0, policy_version 81636 (0.0032) [2024-04-26 04:34:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1337638912. Throughput: 0: 56220.2. Samples: 1287005460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:34:03,923][47056] Avg episode reward: [(0, '0.321')] [2024-04-26 04:34:04,628][47288] Updated weights for policy 0, policy_version 81646 (0.0029) [2024-04-26 04:34:07,788][47288] Updated weights for policy 0, policy_version 81656 (0.0028) [2024-04-26 04:34:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1337917440. Throughput: 0: 56221.0. Samples: 1287345740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 04:34:08,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:34:10,376][47288] Updated weights for policy 0, policy_version 81666 (0.0024) [2024-04-26 04:34:13,759][47288] Updated weights for policy 0, policy_version 81676 (0.0025) [2024-04-26 04:34:13,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 1338179584. Throughput: 0: 56154.8. Samples: 1287512620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:13,923][47056] Avg episode reward: [(0, '0.331')] [2024-04-26 04:34:16,250][47288] Updated weights for policy 0, policy_version 81686 (0.0031) [2024-04-26 04:34:18,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56251.8, 300 sec: 56372.0). Total num frames: 1338490880. Throughput: 0: 56114.0. Samples: 1287850340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:18,924][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 04:34:19,506][47288] Updated weights for policy 0, policy_version 81696 (0.0028) [2024-04-26 04:34:22,102][47288] Updated weights for policy 0, policy_version 81706 (0.0034) [2024-04-26 04:34:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1338753024. Throughput: 0: 56198.9. Samples: 1288188040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:23,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 04:34:25,272][47288] Updated weights for policy 0, policy_version 81716 (0.0031) [2024-04-26 04:34:28,005][47288] Updated weights for policy 0, policy_version 81726 (0.0033) [2024-04-26 04:34:28,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 1339031552. Throughput: 0: 56104.0. Samples: 1288349900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:28,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 04:34:31,181][47288] Updated weights for policy 0, policy_version 81736 (0.0029) [2024-04-26 04:34:33,680][47288] Updated weights for policy 0, policy_version 81746 (0.0033) [2024-04-26 04:34:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1339326464. Throughput: 0: 56007.8. Samples: 1288685500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:33,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 04:34:37,042][47288] Updated weights for policy 0, policy_version 81756 (0.0033) [2024-04-26 04:34:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.9, 300 sec: 56205.4). Total num frames: 1339604992. Throughput: 0: 56054.8. Samples: 1289023100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:38,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 04:34:39,356][47288] Updated weights for policy 0, policy_version 81766 (0.0028) [2024-04-26 04:34:42,875][47288] Updated weights for policy 0, policy_version 81776 (0.0029) [2024-04-26 04:34:43,023][47267] Signal inference workers to stop experience collection... (19500 times) [2024-04-26 04:34:43,023][47267] Signal inference workers to resume experience collection... 
(19500 times) [2024-04-26 04:34:43,039][47288] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-04-26 04:34:43,039][47288] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-04-26 04:34:43,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1339899904. Throughput: 0: 55999.5. Samples: 1289196000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:43,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 04:34:45,604][47288] Updated weights for policy 0, policy_version 81786 (0.0030) [2024-04-26 04:34:48,693][47288] Updated weights for policy 0, policy_version 81796 (0.0028) [2024-04-26 04:34:48,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 56261.0). Total num frames: 1340162048. Throughput: 0: 56168.8. Samples: 1289533060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:48,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 04:34:48,986][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000081798_1340178432.pth... [2024-04-26 04:34:49,042][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000080973_1326661632.pth [2024-04-26 04:34:51,516][47288] Updated weights for policy 0, policy_version 81806 (0.0026) [2024-04-26 04:34:53,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1340440576. Throughput: 0: 56030.5. Samples: 1289867120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:53,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 04:34:54,498][47288] Updated weights for policy 0, policy_version 81816 (0.0030) [2024-04-26 04:34:57,279][47288] Updated weights for policy 0, policy_version 81826 (0.0028) [2024-04-26 04:34:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1340719104. Throughput: 0: 56005.0. Samples: 1290032860. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:34:58,924][47056] Avg episode reward: [(0, '0.373')] [2024-04-26 04:35:00,325][47288] Updated weights for policy 0, policy_version 81836 (0.0027) [2024-04-26 04:35:03,119][47288] Updated weights for policy 0, policy_version 81846 (0.0029) [2024-04-26 04:35:03,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1340997632. Throughput: 0: 55986.5. Samples: 1290369720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:03,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 04:35:06,135][47288] Updated weights for policy 0, policy_version 81856 (0.0037) [2024-04-26 04:35:08,923][47056] Fps is (10 sec: 55706.9, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1341276160. Throughput: 0: 55968.4. Samples: 1290706620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:08,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 04:35:09,051][47288] Updated weights for policy 0, policy_version 81866 (0.0027) [2024-04-26 04:35:11,922][47288] Updated weights for policy 0, policy_version 81876 (0.0027) [2024-04-26 04:35:13,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1341554688. Throughput: 0: 56243.7. Samples: 1290880860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:13,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 04:35:14,981][47288] Updated weights for policy 0, policy_version 81886 (0.0027) [2024-04-26 04:35:17,676][47288] Updated weights for policy 0, policy_version 81896 (0.0033) [2024-04-26 04:35:18,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1341865984. Throughput: 0: 56248.0. Samples: 1291216660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:18,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 04:35:20,757][47288] Updated weights for policy 0, policy_version 81906 (0.0032) [2024-04-26 04:35:23,525][47288] Updated weights for policy 0, policy_version 81916 (0.0026) [2024-04-26 04:35:23,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1342144512. Throughput: 0: 56235.1. Samples: 1291553680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:23,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 04:35:26,413][47288] Updated weights for policy 0, policy_version 81926 (0.0025) [2024-04-26 04:35:28,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1342390272. Throughput: 0: 56065.8. Samples: 1291718960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:28,923][47056] Avg episode reward: [(0, '0.401')] [2024-04-26 04:35:29,387][47288] Updated weights for policy 0, policy_version 81936 (0.0029) [2024-04-26 04:35:32,265][47288] Updated weights for policy 0, policy_version 81946 (0.0025) [2024-04-26 04:35:33,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.6, 300 sec: 56205.4). Total num frames: 1342668800. Throughput: 0: 56159.2. Samples: 1292060220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:33,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 04:35:35,161][47288] Updated weights for policy 0, policy_version 81956 (0.0029) [2024-04-26 04:35:38,010][47288] Updated weights for policy 0, policy_version 81966 (0.0026) [2024-04-26 04:35:38,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 56205.4). Total num frames: 1342947328. Throughput: 0: 56233.0. Samples: 1292397600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:38,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 04:35:40,920][47288] Updated weights for policy 0, policy_version 81976 (0.0036) [2024-04-26 04:35:43,866][47288] Updated weights for policy 0, policy_version 81986 (0.0029) [2024-04-26 04:35:43,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1343258624. Throughput: 0: 56199.0. Samples: 1292561800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:43,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 04:35:46,659][47288] Updated weights for policy 0, policy_version 81996 (0.0039) [2024-04-26 04:35:48,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1343537152. Throughput: 0: 56191.0. Samples: 1292898320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 04:35:48,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 04:35:49,892][47288] Updated weights for policy 0, policy_version 82006 (0.0026) [2024-04-26 04:35:52,133][47267] Signal inference workers to stop experience collection... (19550 times) [2024-04-26 04:35:52,133][47267] Signal inference workers to resume experience collection... (19550 times) [2024-04-26 04:35:52,159][47288] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-04-26 04:35:52,159][47288] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-04-26 04:35:52,518][47288] Updated weights for policy 0, policy_version 82016 (0.0037) [2024-04-26 04:35:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.8, 300 sec: 56205.4). Total num frames: 1343832064. Throughput: 0: 56074.1. Samples: 1293229960. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:35:53,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 04:35:55,595][47288] Updated weights for policy 0, policy_version 82026 (0.0024) [2024-04-26 04:35:58,348][47288] Updated weights for policy 0, policy_version 82036 (0.0029) [2024-04-26 04:35:58,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56798.1, 300 sec: 56316.5). Total num frames: 1344126976. Throughput: 0: 56367.9. Samples: 1293417420. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:35:58,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 04:36:01,217][47288] Updated weights for policy 0, policy_version 82046 (0.0032) [2024-04-26 04:36:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56251.6, 300 sec: 56205.5). Total num frames: 1344372736. Throughput: 0: 56464.5. Samples: 1293757560. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:36:03,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:36:04,203][47288] Updated weights for policy 0, policy_version 82056 (0.0031) [2024-04-26 04:36:07,158][47288] Updated weights for policy 0, policy_version 82066 (0.0031) [2024-04-26 04:36:08,923][47056] Fps is (10 sec: 50789.6, 60 sec: 55978.5, 300 sec: 56205.4). Total num frames: 1344634880. Throughput: 0: 56395.0. Samples: 1294091460. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:36:08,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 04:36:09,996][47288] Updated weights for policy 0, policy_version 82076 (0.0036) [2024-04-26 04:36:13,069][47288] Updated weights for policy 0, policy_version 82086 (0.0026) [2024-04-26 04:36:13,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1344913408. Throughput: 0: 56157.9. Samples: 1294246060. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:36:13,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 04:36:15,644][47288] Updated weights for policy 0, policy_version 82096 (0.0024) [2024-04-26 04:36:18,767][47288] Updated weights for policy 0, policy_version 82106 (0.0030) [2024-04-26 04:36:18,923][47056] Fps is (10 sec: 58983.2, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1345224704. Throughput: 0: 56042.8. Samples: 1294582140. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:36:18,932][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 04:36:21,480][47288] Updated weights for policy 0, policy_version 82116 (0.0026) [2024-04-26 04:36:23,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1345503232. Throughput: 0: 56369.0. Samples: 1294934200. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:36:23,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 04:36:24,401][47288] Updated weights for policy 0, policy_version 82126 (0.0026) [2024-04-26 04:36:27,264][47288] Updated weights for policy 0, policy_version 82136 (0.0028) [2024-04-26 04:36:28,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56797.8, 300 sec: 56205.5). Total num frames: 1345798144. Throughput: 0: 56482.1. Samples: 1295103500. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:36:28,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 04:36:30,490][47288] Updated weights for policy 0, policy_version 82146 (0.0027) [2024-04-26 04:36:33,023][47288] Updated weights for policy 0, policy_version 82156 (0.0031) [2024-04-26 04:36:33,923][47056] Fps is (10 sec: 58981.5, 60 sec: 57070.9, 300 sec: 56205.5). Total num frames: 1346093056. Throughput: 0: 56521.2. Samples: 1295441780. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:36:33,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 04:36:36,543][47288] Updated weights for policy 0, policy_version 82166 (0.0031) [2024-04-26 04:36:38,756][47288] Updated weights for policy 0, policy_version 82176 (0.0039) [2024-04-26 04:36:38,923][47056] Fps is (10 sec: 57343.2, 60 sec: 57070.8, 300 sec: 56261.0). Total num frames: 1346371584. Throughput: 0: 56513.2. Samples: 1295773060. Policy #0 lag: (min: 1.0, avg: 12.0, max: 21.0) [2024-04-26 04:36:38,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 04:36:42,417][47288] Updated weights for policy 0, policy_version 82186 (0.0027) [2024-04-26 04:36:43,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1346633728. Throughput: 0: 56222.5. Samples: 1295947440. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:36:43,924][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 04:36:44,204][47267] Signal inference workers to stop experience collection... (19600 times) [2024-04-26 04:36:44,238][47288] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-04-26 04:36:44,291][47267] Signal inference workers to resume experience collection... (19600 times) [2024-04-26 04:36:44,291][47288] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-04-26 04:36:44,523][47288] Updated weights for policy 0, policy_version 82196 (0.0030) [2024-04-26 04:36:48,106][47288] Updated weights for policy 0, policy_version 82206 (0.0026) [2024-04-26 04:36:48,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1346895872. Throughput: 0: 56200.3. Samples: 1296286580. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:36:48,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 04:36:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000082208_1346895872.pth... 
[2024-04-26 04:36:48,988][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000081387_1333444608.pth [2024-04-26 04:36:50,399][47288] Updated weights for policy 0, policy_version 82216 (0.0029) [2024-04-26 04:36:53,787][47288] Updated weights for policy 0, policy_version 82226 (0.0027) [2024-04-26 04:36:53,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 1347190784. Throughput: 0: 56418.7. Samples: 1296630300. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:36:53,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 04:36:56,137][47288] Updated weights for policy 0, policy_version 82236 (0.0030) [2024-04-26 04:36:58,923][47056] Fps is (10 sec: 58982.7, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1347485696. Throughput: 0: 56618.1. Samples: 1296793880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:36:58,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 04:36:59,862][47288] Updated weights for policy 0, policy_version 82246 (0.0030) [2024-04-26 04:37:01,842][47288] Updated weights for policy 0, policy_version 82256 (0.0032) [2024-04-26 04:37:03,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1347764224. Throughput: 0: 56525.0. Samples: 1297125760. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:37:03,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 04:37:05,692][47288] Updated weights for policy 0, policy_version 82266 (0.0030) [2024-04-26 04:37:07,683][47288] Updated weights for policy 0, policy_version 82276 (0.0027) [2024-04-26 04:37:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 57071.0, 300 sec: 56205.4). Total num frames: 1348059136. Throughput: 0: 56286.0. Samples: 1297467080. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:37:08,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 04:37:11,320][47288] Updated weights for policy 0, policy_version 82286 (0.0026) [2024-04-26 04:37:13,595][47288] Updated weights for policy 0, policy_version 82296 (0.0031) [2024-04-26 04:37:13,923][47056] Fps is (10 sec: 58981.3, 60 sec: 57343.9, 300 sec: 56261.0). Total num frames: 1348354048. Throughput: 0: 56545.3. Samples: 1297648040. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:37:13,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 04:37:17,046][47288] Updated weights for policy 0, policy_version 82306 (0.0034) [2024-04-26 04:37:18,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1348616192. Throughput: 0: 56541.1. Samples: 1297986120. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:37:18,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 04:37:19,321][47288] Updated weights for policy 0, policy_version 82316 (0.0022) [2024-04-26 04:37:22,816][47288] Updated weights for policy 0, policy_version 82326 (0.0033) [2024-04-26 04:37:23,923][47056] Fps is (10 sec: 52429.5, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1348878336. Throughput: 0: 56796.3. Samples: 1298328880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:37:23,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 04:37:25,015][47288] Updated weights for policy 0, policy_version 82336 (0.0023) [2024-04-26 04:37:28,769][47288] Updated weights for policy 0, policy_version 82346 (0.0027) [2024-04-26 04:37:28,923][47056] Fps is (10 sec: 54066.0, 60 sec: 55978.6, 300 sec: 56205.5). Total num frames: 1349156864. Throughput: 0: 56451.9. Samples: 1298487780. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 04:37:28,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 04:37:30,826][47288] Updated weights for policy 0, policy_version 82356 (0.0026) [2024-04-26 04:37:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1349451776. Throughput: 0: 56502.2. Samples: 1298829180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:37:33,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 04:37:34,440][47288] Updated weights for policy 0, policy_version 82366 (0.0026) [2024-04-26 04:37:35,531][47267] Signal inference workers to stop experience collection... (19650 times) [2024-04-26 04:37:35,532][47267] Signal inference workers to resume experience collection... (19650 times) [2024-04-26 04:37:35,553][47288] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-04-26 04:37:35,553][47288] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-04-26 04:37:36,793][47288] Updated weights for policy 0, policy_version 82376 (0.0028) [2024-04-26 04:37:38,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1349730304. Throughput: 0: 56368.9. Samples: 1299166900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:37:38,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 04:37:40,354][47288] Updated weights for policy 0, policy_version 82386 (0.0029) [2024-04-26 04:37:42,524][47288] Updated weights for policy 0, policy_version 82396 (0.0024) [2024-04-26 04:37:43,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1350025216. Throughput: 0: 56501.0. Samples: 1299336420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:37:43,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 04:37:46,375][47288] Updated weights for policy 0, policy_version 82406 (0.0027) [2024-04-26 04:37:48,279][47288] Updated weights for policy 0, policy_version 82416 (0.0034) [2024-04-26 04:37:48,923][47056] Fps is (10 sec: 58982.6, 60 sec: 57071.0, 300 sec: 56316.5). Total num frames: 1350320128. Throughput: 0: 56572.3. Samples: 1299671520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:37:48,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 04:37:52,136][47288] Updated weights for policy 0, policy_version 82426 (0.0027) [2024-04-26 04:37:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56798.0, 300 sec: 56316.5). Total num frames: 1350598656. Throughput: 0: 56590.4. Samples: 1300013640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:37:53,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 04:37:54,061][47288] Updated weights for policy 0, policy_version 82436 (0.0030) [2024-04-26 04:37:57,841][47288] Updated weights for policy 0, policy_version 82446 (0.0029) [2024-04-26 04:37:58,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1350877184. Throughput: 0: 56452.6. Samples: 1300188400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:37:58,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 04:38:00,007][47288] Updated weights for policy 0, policy_version 82456 (0.0034) [2024-04-26 04:38:03,808][47288] Updated weights for policy 0, policy_version 82466 (0.0027) [2024-04-26 04:38:03,923][47056] Fps is (10 sec: 54066.4, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1351139328. Throughput: 0: 56400.3. Samples: 1300524140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:38:03,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 04:38:05,797][47288] Updated weights for policy 0, policy_version 82476 (0.0034) [2024-04-26 04:38:08,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1351417856. Throughput: 0: 56199.4. Samples: 1300857860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:38:08,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 04:38:09,603][47288] Updated weights for policy 0, policy_version 82486 (0.0034) [2024-04-26 04:38:11,559][47288] Updated weights for policy 0, policy_version 82496 (0.0031) [2024-04-26 04:38:13,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 1351696384. Throughput: 0: 56159.8. Samples: 1301014960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:38:13,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 04:38:15,313][47288] Updated weights for policy 0, policy_version 82506 (0.0028) [2024-04-26 04:38:17,370][47288] Updated weights for policy 0, policy_version 82516 (0.0029) [2024-04-26 04:38:18,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.6, 300 sec: 56316.5). Total num frames: 1352007680. Throughput: 0: 56108.9. Samples: 1301354080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 04:38:18,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 04:38:20,968][47288] Updated weights for policy 0, policy_version 82526 (0.0032) [2024-04-26 04:38:23,328][47288] Updated weights for policy 0, policy_version 82536 (0.0031) [2024-04-26 04:38:23,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1352286208. Throughput: 0: 56172.6. Samples: 1301694660. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:38:23,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 04:38:26,799][47288] Updated weights for policy 0, policy_version 82546 (0.0028) [2024-04-26 04:38:28,923][47056] Fps is (10 sec: 57344.8, 60 sec: 57071.1, 300 sec: 56372.1). Total num frames: 1352581120. Throughput: 0: 56352.8. Samples: 1301872300. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:38:28,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 04:38:29,126][47288] Updated weights for policy 0, policy_version 82556 (0.0029) [2024-04-26 04:38:32,370][47267] Signal inference workers to stop experience collection... (19700 times) [2024-04-26 04:38:32,381][47288] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-04-26 04:38:32,462][47267] Signal inference workers to resume experience collection... (19700 times) [2024-04-26 04:38:32,462][47288] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-04-26 04:38:32,567][47288] Updated weights for policy 0, policy_version 82566 (0.0026) [2024-04-26 04:38:33,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1352843264. Throughput: 0: 56441.8. Samples: 1302211400. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:38:33,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 04:38:34,785][47288] Updated weights for policy 0, policy_version 82576 (0.0028) [2024-04-26 04:38:38,195][47288] Updated weights for policy 0, policy_version 82586 (0.0026) [2024-04-26 04:38:38,923][47056] Fps is (10 sec: 52428.5, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1353105408. Throughput: 0: 56363.0. Samples: 1302549980. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:38:38,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 04:38:40,685][47288] Updated weights for policy 0, policy_version 82596 (0.0031) [2024-04-26 04:38:43,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1353383936. Throughput: 0: 56053.9. Samples: 1302710820. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:38:43,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 04:38:44,045][47288] Updated weights for policy 0, policy_version 82606 (0.0036) [2024-04-26 04:38:46,482][47288] Updated weights for policy 0, policy_version 82616 (0.0036) [2024-04-26 04:38:48,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 1353662464. Throughput: 0: 56135.1. Samples: 1303050220. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:38:48,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 04:38:48,959][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000082622_1353678848.pth... [2024-04-26 04:38:49,007][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000081798_1340178432.pth [2024-04-26 04:38:49,834][47288] Updated weights for policy 0, policy_version 82626 (0.0029) [2024-04-26 04:38:52,382][47288] Updated weights for policy 0, policy_version 82636 (0.0022) [2024-04-26 04:38:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1353957376. Throughput: 0: 56169.5. Samples: 1303385480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:38:53,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 04:38:55,863][47288] Updated weights for policy 0, policy_version 82646 (0.0030) [2024-04-26 04:38:58,254][47288] Updated weights for policy 0, policy_version 82656 (0.0033) [2024-04-26 04:38:58,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.7, 300 sec: 56316.5). 
Total num frames: 1354252288. Throughput: 0: 56511.0. Samples: 1303557960. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:38:58,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 04:39:01,833][47288] Updated weights for policy 0, policy_version 82666 (0.0025) [2024-04-26 04:39:03,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1354547200. Throughput: 0: 56363.7. Samples: 1303890440. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:39:03,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 04:39:04,019][47288] Updated weights for policy 0, policy_version 82676 (0.0036) [2024-04-26 04:39:07,741][47288] Updated weights for policy 0, policy_version 82686 (0.0033) [2024-04-26 04:39:08,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.8, 300 sec: 56372.0). Total num frames: 1354809344. Throughput: 0: 56258.5. Samples: 1304226300. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 04:39:08,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 04:39:09,912][47288] Updated weights for policy 0, policy_version 82696 (0.0031) [2024-04-26 04:39:13,606][47288] Updated weights for policy 0, policy_version 82706 (0.0029) [2024-04-26 04:39:13,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1355071488. Throughput: 0: 56168.4. Samples: 1304399880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:13,924][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 04:39:15,633][47288] Updated weights for policy 0, policy_version 82716 (0.0030) [2024-04-26 04:39:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1355350016. Throughput: 0: 56109.8. Samples: 1304736340. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:18,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 04:39:19,396][47288] Updated weights for policy 0, policy_version 82726 (0.0031) [2024-04-26 04:39:21,235][47267] Signal inference workers to stop experience collection... (19750 times) [2024-04-26 04:39:21,273][47288] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-04-26 04:39:21,299][47267] Signal inference workers to resume experience collection... (19750 times) [2024-04-26 04:39:21,300][47288] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-04-26 04:39:21,408][47288] Updated weights for policy 0, policy_version 82736 (0.0026) [2024-04-26 04:39:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 1355628544. Throughput: 0: 55888.0. Samples: 1305064940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:23,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 04:39:25,156][47288] Updated weights for policy 0, policy_version 82746 (0.0031) [2024-04-26 04:39:27,591][47288] Updated weights for policy 0, policy_version 82756 (0.0027) [2024-04-26 04:39:28,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1355923456. Throughput: 0: 56147.0. Samples: 1305237440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:28,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 04:39:30,996][47288] Updated weights for policy 0, policy_version 82766 (0.0029) [2024-04-26 04:39:33,328][47288] Updated weights for policy 0, policy_version 82776 (0.0029) [2024-04-26 04:39:33,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1356218368. Throughput: 0: 56094.8. Samples: 1305574480. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:33,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 04:39:36,839][47288] Updated weights for policy 0, policy_version 82786 (0.0032) [2024-04-26 04:39:38,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1356513280. Throughput: 0: 56097.2. Samples: 1305909860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:38,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 04:39:39,151][47288] Updated weights for policy 0, policy_version 82796 (0.0028) [2024-04-26 04:39:42,633][47288] Updated weights for policy 0, policy_version 82806 (0.0026) [2024-04-26 04:39:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.6, 300 sec: 56316.5). Total num frames: 1356775424. Throughput: 0: 56099.2. Samples: 1306082420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:43,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 04:39:45,073][47288] Updated weights for policy 0, policy_version 82816 (0.0032) [2024-04-26 04:39:48,467][47288] Updated weights for policy 0, policy_version 82826 (0.0027) [2024-04-26 04:39:48,923][47056] Fps is (10 sec: 52429.0, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1357037568. Throughput: 0: 56257.4. Samples: 1306422020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:48,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 04:39:51,021][47288] Updated weights for policy 0, policy_version 82836 (0.0026) [2024-04-26 04:39:53,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1357316096. Throughput: 0: 56164.6. Samples: 1306753700. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:53,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 04:39:54,391][47288] Updated weights for policy 0, policy_version 82846 (0.0030) [2024-04-26 04:39:56,857][47288] Updated weights for policy 0, policy_version 82856 (0.0032) [2024-04-26 04:39:58,923][47056] Fps is (10 sec: 55704.2, 60 sec: 55705.4, 300 sec: 56260.9). Total num frames: 1357594624. Throughput: 0: 55884.2. Samples: 1306914680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 04:39:58,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 04:40:00,186][47288] Updated weights for policy 0, policy_version 82866 (0.0029) [2024-04-26 04:40:02,707][47288] Updated weights for policy 0, policy_version 82876 (0.0030) [2024-04-26 04:40:03,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1357889536. Throughput: 0: 55773.3. Samples: 1307246140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:03,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 04:40:06,026][47288] Updated weights for policy 0, policy_version 82886 (0.0033) [2024-04-26 04:40:08,717][47288] Updated weights for policy 0, policy_version 82896 (0.0028) [2024-04-26 04:40:08,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1358168064. Throughput: 0: 56036.9. Samples: 1307586600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:08,924][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 04:40:11,939][47288] Updated weights for policy 0, policy_version 82906 (0.0027) [2024-04-26 04:40:13,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.6, 300 sec: 56261.0). Total num frames: 1358462976. Throughput: 0: 55950.4. Samples: 1307755220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:13,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 04:40:14,585][47288] Updated weights for policy 0, policy_version 82916 (0.0033) [2024-04-26 04:40:17,822][47288] Updated weights for policy 0, policy_version 82926 (0.0032) [2024-04-26 04:40:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1358741504. Throughput: 0: 56052.8. Samples: 1308096860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:18,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 04:40:20,403][47288] Updated weights for policy 0, policy_version 82936 (0.0028) [2024-04-26 04:40:23,658][47288] Updated weights for policy 0, policy_version 82946 (0.0027) [2024-04-26 04:40:23,923][47056] Fps is (10 sec: 52430.1, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1358987264. Throughput: 0: 56016.6. Samples: 1308430600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:23,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 04:40:26,226][47288] Updated weights for policy 0, policy_version 82956 (0.0032) [2024-04-26 04:40:28,923][47056] Fps is (10 sec: 50790.0, 60 sec: 55432.4, 300 sec: 56205.4). Total num frames: 1359249408. Throughput: 0: 55751.4. Samples: 1308591240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:28,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 04:40:29,449][47288] Updated weights for policy 0, policy_version 82966 (0.0028) [2024-04-26 04:40:32,257][47288] Updated weights for policy 0, policy_version 82976 (0.0026) [2024-04-26 04:40:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1359544320. Throughput: 0: 55667.2. Samples: 1308927040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:33,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 04:40:35,404][47288] Updated weights for policy 0, policy_version 82986 (0.0026) [2024-04-26 04:40:37,981][47288] Updated weights for policy 0, policy_version 82996 (0.0032) [2024-04-26 04:40:38,923][47056] Fps is (10 sec: 60621.1, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1359855616. Throughput: 0: 55731.4. Samples: 1309261620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:38,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 04:40:40,603][47267] Signal inference workers to stop experience collection... (19800 times) [2024-04-26 04:40:40,633][47288] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-04-26 04:40:40,686][47267] Signal inference workers to resume experience collection... (19800 times) [2024-04-26 04:40:40,686][47288] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-04-26 04:40:41,220][47288] Updated weights for policy 0, policy_version 83006 (0.0027) [2024-04-26 04:40:43,687][47288] Updated weights for policy 0, policy_version 83016 (0.0035) [2024-04-26 04:40:43,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1360134144. Throughput: 0: 56120.8. Samples: 1309440100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:43,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 04:40:46,881][47288] Updated weights for policy 0, policy_version 83026 (0.0030) [2024-04-26 04:40:48,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1360412672. Throughput: 0: 56226.7. Samples: 1309776340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:48,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 04:40:49,032][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000083034_1360429056.pth... 
[2024-04-26 04:40:49,072][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000082208_1346895872.pth [2024-04-26 04:40:49,623][47288] Updated weights for policy 0, policy_version 83036 (0.0028) [2024-04-26 04:40:52,761][47288] Updated weights for policy 0, policy_version 83046 (0.0030) [2024-04-26 04:40:53,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1360691200. Throughput: 0: 56106.0. Samples: 1310111360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:40:53,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 04:40:55,619][47288] Updated weights for policy 0, policy_version 83056 (0.0027) [2024-04-26 04:40:58,580][47288] Updated weights for policy 0, policy_version 83066 (0.0034) [2024-04-26 04:40:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1360969728. Throughput: 0: 56045.5. Samples: 1310277260. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:40:58,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 04:41:01,390][47288] Updated weights for policy 0, policy_version 83076 (0.0029) [2024-04-26 04:41:03,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1361231872. Throughput: 0: 56121.3. Samples: 1310622320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:41:03,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 04:41:04,449][47288] Updated weights for policy 0, policy_version 83086 (0.0032) [2024-04-26 04:41:07,132][47288] Updated weights for policy 0, policy_version 83096 (0.0030) [2024-04-26 04:41:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1361510400. Throughput: 0: 56136.3. Samples: 1310956740. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:41:08,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 04:41:10,254][47288] Updated weights for policy 0, policy_version 83106 (0.0033) [2024-04-26 04:41:13,092][47288] Updated weights for policy 0, policy_version 83116 (0.0026) [2024-04-26 04:41:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 56205.4). Total num frames: 1361805312. Throughput: 0: 56100.1. Samples: 1311115740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:41:13,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 04:41:15,988][47288] Updated weights for policy 0, policy_version 83126 (0.0031) [2024-04-26 04:41:18,731][47288] Updated weights for policy 0, policy_version 83136 (0.0031) [2024-04-26 04:41:18,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1362100224. Throughput: 0: 56222.3. Samples: 1311457040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:41:18,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 04:41:22,022][47288] Updated weights for policy 0, policy_version 83146 (0.0034) [2024-04-26 04:41:23,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.7, 300 sec: 56205.5). Total num frames: 1362378752. Throughput: 0: 56233.9. Samples: 1311792140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:41:23,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 04:41:24,612][47288] Updated weights for policy 0, policy_version 83156 (0.0031) [2024-04-26 04:41:27,815][47288] Updated weights for policy 0, policy_version 83166 (0.0030) [2024-04-26 04:41:28,923][47056] Fps is (10 sec: 55704.5, 60 sec: 56797.9, 300 sec: 56149.9). Total num frames: 1362657280. Throughput: 0: 56186.0. Samples: 1311968480. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:41:28,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 04:41:30,373][47288] Updated weights for policy 0, policy_version 83176 (0.0027) [2024-04-26 04:41:33,572][47288] Updated weights for policy 0, policy_version 83186 (0.0030) [2024-04-26 04:41:33,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.7, 300 sec: 56149.9). Total num frames: 1362935808. Throughput: 0: 56351.5. Samples: 1312312160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:41:33,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 04:41:36,324][47288] Updated weights for policy 0, policy_version 83196 (0.0026) [2024-04-26 04:41:38,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55978.8, 300 sec: 56205.5). Total num frames: 1363214336. Throughput: 0: 56372.9. Samples: 1312648140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:41:38,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 04:41:39,361][47288] Updated weights for policy 0, policy_version 83206 (0.0031) [2024-04-26 04:41:42,101][47288] Updated weights for policy 0, policy_version 83216 (0.0023) [2024-04-26 04:41:43,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 56205.5). Total num frames: 1363476480. Throughput: 0: 56428.9. Samples: 1312816560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 04:41:43,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 04:41:45,041][47288] Updated weights for policy 0, policy_version 83226 (0.0029) [2024-04-26 04:41:47,830][47288] Updated weights for policy 0, policy_version 83236 (0.0026) [2024-04-26 04:41:48,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1363787776. Throughput: 0: 56209.8. Samples: 1313151760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:41:48,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 04:41:50,930][47288] Updated weights for policy 0, policy_version 83246 (0.0028) [2024-04-26 04:41:53,845][47288] Updated weights for policy 0, policy_version 83256 (0.0035) [2024-04-26 04:41:53,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.6, 300 sec: 56205.5). Total num frames: 1364066304. Throughput: 0: 56107.1. Samples: 1313481560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:41:53,924][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 04:41:56,784][47267] Signal inference workers to stop experience collection... (19850 times) [2024-04-26 04:41:56,820][47288] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-04-26 04:41:56,869][47267] Signal inference workers to resume experience collection... (19850 times) [2024-04-26 04:41:56,869][47288] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-04-26 04:41:56,975][47288] Updated weights for policy 0, policy_version 83266 (0.0030) [2024-04-26 04:41:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1364344832. Throughput: 0: 56424.8. Samples: 1313654860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:41:58,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 04:41:59,703][47288] Updated weights for policy 0, policy_version 83276 (0.0027) [2024-04-26 04:42:02,822][47288] Updated weights for policy 0, policy_version 83286 (0.0029) [2024-04-26 04:42:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.9, 300 sec: 56149.9). Total num frames: 1364623360. Throughput: 0: 56342.6. Samples: 1313992460. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:42:03,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 04:42:05,551][47288] Updated weights for policy 0, policy_version 83296 (0.0029) [2024-04-26 04:42:08,558][47288] Updated weights for policy 0, policy_version 83306 (0.0030) [2024-04-26 04:42:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 1364885504. Throughput: 0: 56227.0. Samples: 1314322360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:42:08,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 04:42:11,321][47288] Updated weights for policy 0, policy_version 83316 (0.0027) [2024-04-26 04:42:13,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1365180416. Throughput: 0: 56038.9. Samples: 1314490220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:42:13,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 04:42:14,604][47288] Updated weights for policy 0, policy_version 83326 (0.0034) [2024-04-26 04:42:17,089][47288] Updated weights for policy 0, policy_version 83336 (0.0032) [2024-04-26 04:42:18,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1365458944. Throughput: 0: 55999.6. Samples: 1314832140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:42:18,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 04:42:20,361][47288] Updated weights for policy 0, policy_version 83346 (0.0031) [2024-04-26 04:42:22,989][47288] Updated weights for policy 0, policy_version 83356 (0.0030) [2024-04-26 04:42:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1365737472. Throughput: 0: 55998.6. Samples: 1315168080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:42:23,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 04:42:26,036][47288] Updated weights for policy 0, policy_version 83366 (0.0030) [2024-04-26 04:42:28,779][47288] Updated weights for policy 0, policy_version 83376 (0.0027) [2024-04-26 04:42:28,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1366032384. Throughput: 0: 56118.7. Samples: 1315341900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:42:28,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 04:42:31,781][47288] Updated weights for policy 0, policy_version 83386 (0.0029) [2024-04-26 04:42:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1366310912. Throughput: 0: 56148.9. Samples: 1315678460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 04:42:33,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 04:42:34,554][47288] Updated weights for policy 0, policy_version 83396 (0.0026) [2024-04-26 04:42:37,644][47288] Updated weights for policy 0, policy_version 83406 (0.0031) [2024-04-26 04:42:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 1366605824. Throughput: 0: 56176.8. Samples: 1316009520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:42:38,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 04:42:40,351][47288] Updated weights for policy 0, policy_version 83416 (0.0028) [2024-04-26 04:42:43,367][47288] Updated weights for policy 0, policy_version 83426 (0.0030) [2024-04-26 04:42:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.9, 300 sec: 56149.9). Total num frames: 1366884352. Throughput: 0: 56258.8. Samples: 1316186500. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:42:43,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 04:42:46,289][47288] Updated weights for policy 0, policy_version 83436 (0.0026) [2024-04-26 04:42:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 56094.3). Total num frames: 1367146496. Throughput: 0: 56271.5. Samples: 1316524680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:42:48,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 04:42:48,952][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000083445_1367162880.pth... [2024-04-26 04:42:49,002][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000082622_1353678848.pth [2024-04-26 04:42:49,194][47288] Updated weights for policy 0, policy_version 83446 (0.0026) [2024-04-26 04:42:52,109][47288] Updated weights for policy 0, policy_version 83456 (0.0025) [2024-04-26 04:42:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1367441408. Throughput: 0: 56549.9. Samples: 1316867100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:42:53,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 04:42:54,876][47288] Updated weights for policy 0, policy_version 83466 (0.0027) [2024-04-26 04:42:57,817][47288] Updated weights for policy 0, policy_version 83476 (0.0029) [2024-04-26 04:42:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1367736320. Throughput: 0: 56571.8. Samples: 1317035960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:42:58,932][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 04:43:00,792][47288] Updated weights for policy 0, policy_version 83486 (0.0031) [2024-04-26 04:43:03,611][47288] Updated weights for policy 0, policy_version 83496 (0.0028) [2024-04-26 04:43:03,923][47056] Fps is (10 sec: 57342.6, 60 sec: 56524.6, 300 sec: 56261.0). 
Total num frames: 1368014848. Throughput: 0: 56479.8. Samples: 1317373740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:43:03,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 04:43:06,484][47288] Updated weights for policy 0, policy_version 83506 (0.0030) [2024-04-26 04:43:06,722][47267] Signal inference workers to stop experience collection... (19900 times) [2024-04-26 04:43:06,723][47267] Signal inference workers to resume experience collection... (19900 times) [2024-04-26 04:43:06,741][47288] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-04-26 04:43:06,741][47288] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-04-26 04:43:08,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 1368276992. Throughput: 0: 56596.2. Samples: 1317714920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:43:08,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 04:43:09,502][47288] Updated weights for policy 0, policy_version 83516 (0.0028) [2024-04-26 04:43:12,343][47288] Updated weights for policy 0, policy_version 83526 (0.0027) [2024-04-26 04:43:13,923][47056] Fps is (10 sec: 57345.4, 60 sec: 56797.9, 300 sec: 56205.5). Total num frames: 1368588288. Throughput: 0: 56581.0. Samples: 1317888040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:43:13,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 04:43:15,229][47288] Updated weights for policy 0, policy_version 83536 (0.0028) [2024-04-26 04:43:18,173][47288] Updated weights for policy 0, policy_version 83546 (0.0026) [2024-04-26 04:43:18,923][47056] Fps is (10 sec: 58983.9, 60 sec: 56798.0, 300 sec: 56205.5). Total num frames: 1368866816. Throughput: 0: 56587.2. Samples: 1318224880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:43:18,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 04:43:21,033][47288] Updated weights for policy 0, policy_version 83556 (0.0032) [2024-04-26 04:43:23,850][47288] Updated weights for policy 0, policy_version 83566 (0.0024) [2024-04-26 04:43:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56797.8, 300 sec: 56149.9). Total num frames: 1369145344. Throughput: 0: 56848.5. Samples: 1318567700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 04:43:23,924][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:43:26,785][47288] Updated weights for policy 0, policy_version 83576 (0.0031) [2024-04-26 04:43:28,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 1369423872. Throughput: 0: 56679.0. Samples: 1318737060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:43:28,924][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 04:43:29,594][47288] Updated weights for policy 0, policy_version 83586 (0.0026) [2024-04-26 04:43:32,589][47288] Updated weights for policy 0, policy_version 83596 (0.0034) [2024-04-26 04:43:33,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56798.0, 300 sec: 56316.6). Total num frames: 1369718784. Throughput: 0: 56585.1. Samples: 1319071000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:43:33,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 04:43:35,499][47288] Updated weights for policy 0, policy_version 83606 (0.0027) [2024-04-26 04:43:38,412][47288] Updated weights for policy 0, policy_version 83616 (0.0027) [2024-04-26 04:43:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1369997312. Throughput: 0: 56499.8. Samples: 1319409600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:43:38,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 04:43:41,350][47288] Updated weights for policy 0, policy_version 83626 (0.0027) [2024-04-26 04:43:43,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1370259456. Throughput: 0: 56582.9. Samples: 1319582180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:43:43,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 04:43:44,229][47288] Updated weights for policy 0, policy_version 83636 (0.0026) [2024-04-26 04:43:47,001][47288] Updated weights for policy 0, policy_version 83646 (0.0025) [2024-04-26 04:43:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56797.7, 300 sec: 56260.9). Total num frames: 1370554368. Throughput: 0: 56632.0. Samples: 1319922180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:43:48,924][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 04:43:49,882][47288] Updated weights for policy 0, policy_version 83656 (0.0031) [2024-04-26 04:43:52,665][47288] Updated weights for policy 0, policy_version 83666 (0.0025) [2024-04-26 04:43:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.7, 300 sec: 56205.5). Total num frames: 1370832896. Throughput: 0: 56635.3. Samples: 1320263500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:43:53,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 04:43:55,544][47288] Updated weights for policy 0, policy_version 83676 (0.0028) [2024-04-26 04:43:58,554][47288] Updated weights for policy 0, policy_version 83686 (0.0031) [2024-04-26 04:43:58,923][47056] Fps is (10 sec: 57345.3, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1371127808. Throughput: 0: 56601.3. Samples: 1320435100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:43:58,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 04:44:01,423][47288] Updated weights for policy 0, policy_version 83696 (0.0029) [2024-04-26 04:44:03,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56252.0, 300 sec: 56205.5). Total num frames: 1371389952. Throughput: 0: 56672.0. Samples: 1320775120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:44:03,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 04:44:04,485][47288] Updated weights for policy 0, policy_version 83706 (0.0027) [2024-04-26 04:44:07,089][47267] Signal inference workers to stop experience collection... (19950 times) [2024-04-26 04:44:07,090][47267] Signal inference workers to resume experience collection... (19950 times) [2024-04-26 04:44:07,110][47288] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-04-26 04:44:07,110][47288] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-04-26 04:44:07,202][47288] Updated weights for policy 0, policy_version 83716 (0.0034) [2024-04-26 04:44:08,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1371684864. Throughput: 0: 56511.5. Samples: 1321110720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:44:08,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 04:44:10,321][47288] Updated weights for policy 0, policy_version 83726 (0.0030) [2024-04-26 04:44:12,907][47288] Updated weights for policy 0, policy_version 83736 (0.0034) [2024-04-26 04:44:13,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1371963392. Throughput: 0: 56516.2. Samples: 1321280280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 04:44:13,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 04:44:16,016][47288] Updated weights for policy 0, policy_version 83746 (0.0028) [2024-04-26 04:44:18,828][47288] Updated weights for policy 0, policy_version 83756 (0.0032) [2024-04-26 04:44:18,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1372258304. Throughput: 0: 56589.7. Samples: 1321617540. Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:44:18,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 04:44:21,865][47288] Updated weights for policy 0, policy_version 83766 (0.0028) [2024-04-26 04:44:23,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1372520448. Throughput: 0: 56565.4. Samples: 1321955040. Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:44:23,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 04:44:24,783][47288] Updated weights for policy 0, policy_version 83776 (0.0030) [2024-04-26 04:44:27,650][47288] Updated weights for policy 0, policy_version 83786 (0.0032) [2024-04-26 04:44:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56525.0, 300 sec: 56261.0). Total num frames: 1372815360. Throughput: 0: 56526.7. Samples: 1322125880. Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:44:28,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 04:44:30,647][47288] Updated weights for policy 0, policy_version 83796 (0.0034) [2024-04-26 04:44:33,307][47288] Updated weights for policy 0, policy_version 83806 (0.0030) [2024-04-26 04:44:33,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1373093888. Throughput: 0: 56374.1. Samples: 1322459000. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:44:33,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 04:44:36,529][47288] Updated weights for policy 0, policy_version 83816 (0.0027) [2024-04-26 04:44:38,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1373372416. Throughput: 0: 56440.0. Samples: 1322803300. Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:44:38,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 04:44:39,196][47288] Updated weights for policy 0, policy_version 83826 (0.0036) [2024-04-26 04:44:42,410][47288] Updated weights for policy 0, policy_version 83836 (0.0027) [2024-04-26 04:44:43,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1373667328. Throughput: 0: 56316.9. Samples: 1322969360. Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:44:43,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 04:44:45,029][47288] Updated weights for policy 0, policy_version 83846 (0.0033) [2024-04-26 04:44:48,051][47288] Updated weights for policy 0, policy_version 83856 (0.0028) [2024-04-26 04:44:48,923][47056] Fps is (10 sec: 57342.5, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1373945856. Throughput: 0: 56297.3. Samples: 1323308520. Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:44:48,924][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 04:44:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000083859_1373945856.pth... [2024-04-26 04:44:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000083034_1360429056.pth [2024-04-26 04:44:50,803][47288] Updated weights for policy 0, policy_version 83866 (0.0036) [2024-04-26 04:44:53,887][47288] Updated weights for policy 0, policy_version 83876 (0.0037) [2024-04-26 04:44:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.8, 300 sec: 56372.1). 
Total num frames: 1374224384. Throughput: 0: 56317.8. Samples: 1323645020. Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:44:53,924][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 04:44:57,085][47288] Updated weights for policy 0, policy_version 83886 (0.0032) [2024-04-26 04:44:58,923][47056] Fps is (10 sec: 55707.7, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1374502912. Throughput: 0: 56404.5. Samples: 1323818480. Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:44:58,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 04:44:59,743][47288] Updated weights for policy 0, policy_version 83896 (0.0038) [2024-04-26 04:45:00,045][47267] Signal inference workers to stop experience collection... (20000 times) [2024-04-26 04:45:00,045][47267] Signal inference workers to resume experience collection... (20000 times) [2024-04-26 04:45:00,056][47288] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-04-26 04:45:00,068][47288] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-04-26 04:45:02,772][47288] Updated weights for policy 0, policy_version 83906 (0.0031) [2024-04-26 04:45:03,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1374797824. Throughput: 0: 56460.9. Samples: 1324158280. Policy #0 lag: (min: 2.0, avg: 10.9, max: 24.0) [2024-04-26 04:45:03,923][47056] Avg episode reward: [(0, '0.395')] [2024-04-26 04:45:05,553][47288] Updated weights for policy 0, policy_version 83916 (0.0027) [2024-04-26 04:45:08,592][47288] Updated weights for policy 0, policy_version 83926 (0.0028) [2024-04-26 04:45:08,923][47056] Fps is (10 sec: 55704.5, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1375059968. Throughput: 0: 56405.8. Samples: 1324493300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:08,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 04:45:11,338][47288] Updated weights for policy 0, policy_version 83936 (0.0030) [2024-04-26 04:45:13,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1375338496. Throughput: 0: 56291.8. Samples: 1324659020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:13,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 04:45:14,392][47288] Updated weights for policy 0, policy_version 83946 (0.0031) [2024-04-26 04:45:17,162][47288] Updated weights for policy 0, policy_version 83956 (0.0030) [2024-04-26 04:45:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1375633408. Throughput: 0: 56314.9. Samples: 1324993180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:18,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 04:45:20,151][47288] Updated weights for policy 0, policy_version 83966 (0.0032) [2024-04-26 04:45:23,082][47288] Updated weights for policy 0, policy_version 83976 (0.0025) [2024-04-26 04:45:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1375911936. Throughput: 0: 56121.7. Samples: 1325328780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:23,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 04:45:26,001][47288] Updated weights for policy 0, policy_version 83986 (0.0033) [2024-04-26 04:45:28,718][47288] Updated weights for policy 0, policy_version 83996 (0.0032) [2024-04-26 04:45:28,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1376190464. Throughput: 0: 56300.4. Samples: 1325502880. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:28,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 04:45:31,861][47288] Updated weights for policy 0, policy_version 84006 (0.0028) [2024-04-26 04:45:33,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56251.7, 300 sec: 56316.6). Total num frames: 1376468992. Throughput: 0: 56210.2. Samples: 1325837960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:33,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 04:45:34,537][47288] Updated weights for policy 0, policy_version 84016 (0.0031) [2024-04-26 04:45:37,799][47288] Updated weights for policy 0, policy_version 84026 (0.0031) [2024-04-26 04:45:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1376763904. Throughput: 0: 56211.3. Samples: 1326174520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:38,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 04:45:40,294][47288] Updated weights for policy 0, policy_version 84036 (0.0031) [2024-04-26 04:45:43,504][47288] Updated weights for policy 0, policy_version 84046 (0.0028) [2024-04-26 04:45:43,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1377026048. Throughput: 0: 56158.4. Samples: 1326345620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:43,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 04:45:46,054][47288] Updated weights for policy 0, policy_version 84056 (0.0027) [2024-04-26 04:45:48,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.9, 300 sec: 56261.0). Total num frames: 1377288192. Throughput: 0: 56040.9. Samples: 1326680120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:48,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 04:45:49,377][47288] Updated weights for policy 0, policy_version 84066 (0.0027) [2024-04-26 04:45:51,948][47288] Updated weights for policy 0, policy_version 84076 (0.0029) [2024-04-26 04:45:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1377583104. Throughput: 0: 56135.2. Samples: 1327019380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:53,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 04:45:55,111][47288] Updated weights for policy 0, policy_version 84086 (0.0030) [2024-04-26 04:45:57,740][47288] Updated weights for policy 0, policy_version 84096 (0.0029) [2024-04-26 04:45:58,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1377878016. Throughput: 0: 56245.1. Samples: 1327190040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 04:45:58,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 04:46:00,968][47288] Updated weights for policy 0, policy_version 84106 (0.0033) [2024-04-26 04:46:03,387][47288] Updated weights for policy 0, policy_version 84116 (0.0030) [2024-04-26 04:46:03,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1378172928. Throughput: 0: 56279.6. Samples: 1327525760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:03,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 04:46:06,583][47288] Updated weights for policy 0, policy_version 84126 (0.0034) [2024-04-26 04:46:08,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1378451456. Throughput: 0: 56413.0. Samples: 1327867360. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:08,923][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 04:46:09,221][47288] Updated weights for policy 0, policy_version 84136 (0.0032) [2024-04-26 04:46:12,509][47288] Updated weights for policy 0, policy_version 84146 (0.0029) [2024-04-26 04:46:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1378746368. Throughput: 0: 56454.6. Samples: 1328043340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:13,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 04:46:15,065][47288] Updated weights for policy 0, policy_version 84156 (0.0027) [2024-04-26 04:46:18,329][47288] Updated weights for policy 0, policy_version 84166 (0.0030) [2024-04-26 04:46:18,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1379008512. Throughput: 0: 56596.5. Samples: 1328384800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:18,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 04:46:20,721][47288] Updated weights for policy 0, policy_version 84176 (0.0024) [2024-04-26 04:46:23,863][47267] Signal inference workers to stop experience collection... (20050 times) [2024-04-26 04:46:23,864][47267] Signal inference workers to resume experience collection... (20050 times) [2024-04-26 04:46:23,889][47288] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-04-26 04:46:23,889][47288] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-04-26 04:46:23,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1379287040. Throughput: 0: 56671.6. Samples: 1328724740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:23,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 04:46:23,975][47288] Updated weights for policy 0, policy_version 84186 (0.0034) [2024-04-26 04:46:26,585][47288] Updated weights for policy 0, policy_version 84196 (0.0034) [2024-04-26 04:46:28,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1379549184. Throughput: 0: 56356.6. Samples: 1328881660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:28,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 04:46:29,841][47288] Updated weights for policy 0, policy_version 84206 (0.0027) [2024-04-26 04:46:32,425][47288] Updated weights for policy 0, policy_version 84216 (0.0033) [2024-04-26 04:46:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1379860480. Throughput: 0: 56430.7. Samples: 1329219500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:33,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 04:46:35,947][47288] Updated weights for policy 0, policy_version 84226 (0.0024) [2024-04-26 04:46:38,255][47288] Updated weights for policy 0, policy_version 84236 (0.0028) [2024-04-26 04:46:38,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 1380139008. Throughput: 0: 56488.8. Samples: 1329561380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:38,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 04:46:41,582][47288] Updated weights for policy 0, policy_version 84246 (0.0025) [2024-04-26 04:46:43,851][47288] Updated weights for policy 0, policy_version 84256 (0.0035) [2024-04-26 04:46:43,923][47056] Fps is (10 sec: 58981.4, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 1380450304. Throughput: 0: 56665.1. Samples: 1329739980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:43,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 04:46:47,496][47288] Updated weights for policy 0, policy_version 84266 (0.0027) [2024-04-26 04:46:48,923][47056] Fps is (10 sec: 57344.0, 60 sec: 57070.8, 300 sec: 56427.6). Total num frames: 1380712448. Throughput: 0: 56767.4. Samples: 1330080300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 04:46:48,923][47056] Avg episode reward: [(0, '0.392')] [2024-04-26 04:46:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000084272_1380712448.pth... [2024-04-26 04:46:48,991][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000083445_1367162880.pth [2024-04-26 04:46:49,703][47288] Updated weights for policy 0, policy_version 84276 (0.0025) [2024-04-26 04:46:53,416][47288] Updated weights for policy 0, policy_version 84286 (0.0029) [2024-04-26 04:46:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1380990976. Throughput: 0: 56644.0. Samples: 1330416340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:46:53,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 04:46:55,524][47288] Updated weights for policy 0, policy_version 84296 (0.0029) [2024-04-26 04:46:58,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1381236736. Throughput: 0: 56358.6. Samples: 1330579480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:46:58,924][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 04:46:59,159][47288] Updated weights for policy 0, policy_version 84306 (0.0035) [2024-04-26 04:47:01,385][47288] Updated weights for policy 0, policy_version 84316 (0.0034) [2024-04-26 04:47:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1381531648. Throughput: 0: 56259.9. Samples: 1330916500. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:47:03,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 04:47:04,786][47288] Updated weights for policy 0, policy_version 84326 (0.0027) [2024-04-26 04:47:07,049][47288] Updated weights for policy 0, policy_version 84336 (0.0029) [2024-04-26 04:47:08,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1381826560. Throughput: 0: 56286.0. Samples: 1331257620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:47:08,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 04:47:10,621][47288] Updated weights for policy 0, policy_version 84346 (0.0033) [2024-04-26 04:47:12,969][47288] Updated weights for policy 0, policy_version 84356 (0.0034) [2024-04-26 04:47:13,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1382105088. Throughput: 0: 56467.5. Samples: 1331422700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:47:13,924][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 04:47:16,473][47288] Updated weights for policy 0, policy_version 84366 (0.0028) [2024-04-26 04:47:18,508][47267] Signal inference workers to stop experience collection... (20100 times) [2024-04-26 04:47:18,543][47288] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-04-26 04:47:18,596][47267] Signal inference workers to resume experience collection... (20100 times) [2024-04-26 04:47:18,596][47288] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-04-26 04:47:18,705][47288] Updated weights for policy 0, policy_version 84376 (0.0026) [2024-04-26 04:47:18,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 1382416384. Throughput: 0: 56448.2. Samples: 1331759680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:47:18,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 04:47:22,258][47288] Updated weights for policy 0, policy_version 84386 (0.0021) [2024-04-26 04:47:23,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56797.6, 300 sec: 56483.1). Total num frames: 1382694912. Throughput: 0: 56328.4. Samples: 1332096160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:47:23,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 04:47:24,674][47288] Updated weights for policy 0, policy_version 84396 (0.0035) [2024-04-26 04:47:27,989][47288] Updated weights for policy 0, policy_version 84406 (0.0026) [2024-04-26 04:47:28,923][47056] Fps is (10 sec: 57345.2, 60 sec: 57344.1, 300 sec: 56538.7). Total num frames: 1382989824. Throughput: 0: 56393.5. Samples: 1332277680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:47:28,923][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 04:47:30,338][47288] Updated weights for policy 0, policy_version 84416 (0.0030) [2024-04-26 04:47:33,763][47288] Updated weights for policy 0, policy_version 84426 (0.0023) [2024-04-26 04:47:33,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1383251968. Throughput: 0: 56248.6. Samples: 1332611480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:47:33,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 04:47:36,089][47288] Updated weights for policy 0, policy_version 84436 (0.0034) [2024-04-26 04:47:38,923][47056] Fps is (10 sec: 50789.6, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1383497728. Throughput: 0: 56373.8. Samples: 1332953160. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 04:47:38,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 04:47:39,577][47288] Updated weights for policy 0, policy_version 84446 (0.0030) [2024-04-26 04:47:42,017][47288] Updated weights for policy 0, policy_version 84456 (0.0032) [2024-04-26 04:47:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56483.1). Total num frames: 1383809024. Throughput: 0: 56356.9. Samples: 1333115540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:47:43,923][47056] Avg episode reward: [(0, '0.372')] [2024-04-26 04:47:45,286][47288] Updated weights for policy 0, policy_version 84466 (0.0029) [2024-04-26 04:47:48,367][47288] Updated weights for policy 0, policy_version 84476 (0.0029) [2024-04-26 04:47:48,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1384071168. Throughput: 0: 56427.1. Samples: 1333455720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:47:48,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 04:47:51,053][47288] Updated weights for policy 0, policy_version 84486 (0.0030) [2024-04-26 04:47:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1384366080. Throughput: 0: 56289.3. Samples: 1333790640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:47:53,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 04:47:54,234][47288] Updated weights for policy 0, policy_version 84496 (0.0031) [2024-04-26 04:47:56,808][47288] Updated weights for policy 0, policy_version 84506 (0.0026) [2024-04-26 04:47:58,923][47056] Fps is (10 sec: 58982.9, 60 sec: 57071.0, 300 sec: 56427.7). Total num frames: 1384660992. Throughput: 0: 56585.0. Samples: 1333969020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:47:58,923][47056] Avg episode reward: [(0, '0.615')] [2024-04-26 04:47:58,990][47267] Saving new best policy, reward=0.615! 
[2024-04-26 04:48:00,046][47288] Updated weights for policy 0, policy_version 84516 (0.0029) [2024-04-26 04:48:02,593][47288] Updated weights for policy 0, policy_version 84526 (0.0028) [2024-04-26 04:48:03,923][47056] Fps is (10 sec: 58982.1, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 1384955904. Throughput: 0: 56524.4. Samples: 1334303280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:48:03,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 04:48:05,841][47288] Updated weights for policy 0, policy_version 84536 (0.0028) [2024-04-26 04:48:06,298][47267] Signal inference workers to stop experience collection... (20150 times) [2024-04-26 04:48:06,298][47267] Signal inference workers to resume experience collection... (20150 times) [2024-04-26 04:48:06,328][47288] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-04-26 04:48:06,328][47288] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-04-26 04:48:08,440][47288] Updated weights for policy 0, policy_version 84546 (0.0034) [2024-04-26 04:48:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1385234432. Throughput: 0: 56472.6. Samples: 1334637420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:48:08,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 04:48:11,731][47288] Updated weights for policy 0, policy_version 84556 (0.0025) [2024-04-26 04:48:13,923][47056] Fps is (10 sec: 54068.7, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 1385496576. Throughput: 0: 56431.2. Samples: 1334817080. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:48:13,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 04:48:14,258][47288] Updated weights for policy 0, policy_version 84566 (0.0031) [2024-04-26 04:48:17,660][47288] Updated weights for policy 0, policy_version 84576 (0.0029) [2024-04-26 04:48:18,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 1385758720. Throughput: 0: 56488.1. Samples: 1335153440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:48:18,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 04:48:20,031][47288] Updated weights for policy 0, policy_version 84586 (0.0029) [2024-04-26 04:48:23,330][47288] Updated weights for policy 0, policy_version 84596 (0.0027) [2024-04-26 04:48:23,923][47056] Fps is (10 sec: 55704.2, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1386053632. Throughput: 0: 56322.6. Samples: 1335487680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:48:23,924][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 04:48:25,820][47288] Updated weights for policy 0, policy_version 84606 (0.0034) [2024-04-26 04:48:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.4, 300 sec: 56316.5). Total num frames: 1386332160. Throughput: 0: 56395.1. Samples: 1335653320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 04:48:28,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 04:48:29,158][47288] Updated weights for policy 0, policy_version 84616 (0.0029) [2024-04-26 04:48:31,569][47288] Updated weights for policy 0, policy_version 84626 (0.0029) [2024-04-26 04:48:33,923][47056] Fps is (10 sec: 58983.5, 60 sec: 56524.9, 300 sec: 56427.7). Total num frames: 1386643456. Throughput: 0: 56242.8. Samples: 1335986640. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:48:33,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 04:48:35,092][47288] Updated weights for policy 0, policy_version 84636 (0.0029) [2024-04-26 04:48:37,358][47288] Updated weights for policy 0, policy_version 84646 (0.0035) [2024-04-26 04:48:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1386905600. Throughput: 0: 56350.2. Samples: 1336326400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:48:38,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 04:48:40,751][47288] Updated weights for policy 0, policy_version 84656 (0.0030) [2024-04-26 04:48:43,212][47288] Updated weights for policy 0, policy_version 84666 (0.0029) [2024-04-26 04:48:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.9, 300 sec: 56427.7). Total num frames: 1387200512. Throughput: 0: 56193.8. Samples: 1336497740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:48:43,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 04:48:46,585][47288] Updated weights for policy 0, policy_version 84676 (0.0028) [2024-04-26 04:48:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1387462656. Throughput: 0: 56246.7. Samples: 1336834380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:48:48,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 04:48:48,985][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000084685_1387479040.pth... [2024-04-26 04:48:49,042][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000083859_1373945856.pth [2024-04-26 04:48:49,181][47288] Updated weights for policy 0, policy_version 84686 (0.0032) [2024-04-26 04:48:52,462][47288] Updated weights for policy 0, policy_version 84696 (0.0028) [2024-04-26 04:48:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.9, 300 sec: 56316.5). 
Total num frames: 1387741184. Throughput: 0: 56348.1. Samples: 1337173080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:48:53,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 04:48:54,925][47288] Updated weights for policy 0, policy_version 84706 (0.0030) [2024-04-26 04:48:58,342][47288] Updated weights for policy 0, policy_version 84716 (0.0027) [2024-04-26 04:48:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 56372.0). Total num frames: 1388019712. Throughput: 0: 56011.3. Samples: 1337337600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:48:58,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 04:49:00,752][47288] Updated weights for policy 0, policy_version 84726 (0.0027) [2024-04-26 04:49:03,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1388298240. Throughput: 0: 56007.3. Samples: 1337673780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:49:03,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 04:49:04,083][47288] Updated weights for policy 0, policy_version 84736 (0.0033) [2024-04-26 04:49:05,480][47267] Signal inference workers to stop experience collection... (20200 times) [2024-04-26 04:49:05,522][47288] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-04-26 04:49:05,535][47267] Signal inference workers to resume experience collection... (20200 times) [2024-04-26 04:49:05,541][47288] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-04-26 04:49:06,573][47288] Updated weights for policy 0, policy_version 84746 (0.0030) [2024-04-26 04:49:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 1388576768. Throughput: 0: 56145.1. Samples: 1338014200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:49:08,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 04:49:09,714][47288] Updated weights for policy 0, policy_version 84756 (0.0031) [2024-04-26 04:49:12,415][47288] Updated weights for policy 0, policy_version 84766 (0.0024) [2024-04-26 04:49:13,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.5, 300 sec: 56316.5). Total num frames: 1388871680. Throughput: 0: 56162.7. Samples: 1338180640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:49:13,923][47056] Avg episode reward: [(0, '0.358')] [2024-04-26 04:49:15,672][47288] Updated weights for policy 0, policy_version 84776 (0.0030) [2024-04-26 04:49:18,004][47288] Updated weights for policy 0, policy_version 84786 (0.0026) [2024-04-26 04:49:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1389150208. Throughput: 0: 56344.9. Samples: 1338522160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:49:18,923][47056] Avg episode reward: [(0, '0.579')] [2024-04-26 04:49:21,397][47288] Updated weights for policy 0, policy_version 84796 (0.0029) [2024-04-26 04:49:23,830][47288] Updated weights for policy 0, policy_version 84806 (0.0031) [2024-04-26 04:49:23,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1389461504. Throughput: 0: 56246.7. Samples: 1338857500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 04:49:23,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 04:49:27,250][47288] Updated weights for policy 0, policy_version 84816 (0.0031) [2024-04-26 04:49:28,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1389723648. Throughput: 0: 56331.1. Samples: 1339032640. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:49:28,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 04:49:29,609][47288] Updated weights for policy 0, policy_version 84826 (0.0029) [2024-04-26 04:49:33,118][47288] Updated weights for policy 0, policy_version 84836 (0.0031) [2024-04-26 04:49:33,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1390002176. Throughput: 0: 56386.0. Samples: 1339371740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:49:33,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 04:49:35,473][47288] Updated weights for policy 0, policy_version 84846 (0.0028) [2024-04-26 04:49:38,837][47288] Updated weights for policy 0, policy_version 84856 (0.0031) [2024-04-26 04:49:38,923][47056] Fps is (10 sec: 55704.3, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 1390280704. Throughput: 0: 56400.1. Samples: 1339711100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:49:38,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 04:49:41,160][47288] Updated weights for policy 0, policy_version 84866 (0.0030) [2024-04-26 04:49:43,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 56316.6). Total num frames: 1390559232. Throughput: 0: 56370.3. Samples: 1339874260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:49:43,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 04:49:44,738][47288] Updated weights for policy 0, policy_version 84876 (0.0026) [2024-04-26 04:49:47,022][47288] Updated weights for policy 0, policy_version 84886 (0.0028) [2024-04-26 04:49:48,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1390837760. Throughput: 0: 56528.2. Samples: 1340217540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:49:48,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 04:49:50,365][47288] Updated weights for policy 0, policy_version 84896 (0.0026) [2024-04-26 04:49:52,958][47288] Updated weights for policy 0, policy_version 84906 (0.0030) [2024-04-26 04:49:53,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 1391116288. Throughput: 0: 56519.4. Samples: 1340557580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:49:53,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 04:49:56,256][47288] Updated weights for policy 0, policy_version 84916 (0.0031) [2024-04-26 04:49:58,609][47288] Updated weights for policy 0, policy_version 84926 (0.0030) [2024-04-26 04:49:58,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1391427584. Throughput: 0: 56584.9. Samples: 1340726960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:49:58,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 04:50:02,105][47288] Updated weights for policy 0, policy_version 84936 (0.0031) [2024-04-26 04:50:03,923][47056] Fps is (10 sec: 60621.4, 60 sec: 57071.1, 300 sec: 56483.2). Total num frames: 1391722496. Throughput: 0: 56487.5. Samples: 1341064100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:50:03,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 04:50:04,293][47288] Updated weights for policy 0, policy_version 84946 (0.0024) [2024-04-26 04:50:05,603][47267] Signal inference workers to stop experience collection... (20250 times) [2024-04-26 04:50:05,603][47267] Signal inference workers to resume experience collection... 
(20250 times) [2024-04-26 04:50:05,629][47288] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-04-26 04:50:05,630][47288] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-04-26 04:50:07,972][47288] Updated weights for policy 0, policy_version 84956 (0.0027) [2024-04-26 04:50:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1391984640. Throughput: 0: 56643.2. Samples: 1341406440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:50:08,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 04:50:10,238][47288] Updated weights for policy 0, policy_version 84966 (0.0030) [2024-04-26 04:50:13,677][47288] Updated weights for policy 0, policy_version 84976 (0.0029) [2024-04-26 04:50:13,923][47056] Fps is (10 sec: 52429.1, 60 sec: 56251.9, 300 sec: 56316.6). Total num frames: 1392246784. Throughput: 0: 56402.7. Samples: 1341570760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 04:50:13,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 04:50:16,114][47288] Updated weights for policy 0, policy_version 84986 (0.0026) [2024-04-26 04:50:18,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 1392525312. Throughput: 0: 56405.6. Samples: 1341910000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:50:18,924][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 04:50:19,555][47288] Updated weights for policy 0, policy_version 84996 (0.0033) [2024-04-26 04:50:22,184][47288] Updated weights for policy 0, policy_version 85006 (0.0036) [2024-04-26 04:50:23,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55978.6, 300 sec: 56372.0). Total num frames: 1392820224. Throughput: 0: 56269.5. Samples: 1342243220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:50:23,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 04:50:25,274][47288] Updated weights for policy 0, policy_version 85016 (0.0027) [2024-04-26 04:50:28,125][47288] Updated weights for policy 0, policy_version 85026 (0.0029) [2024-04-26 04:50:28,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1393115136. Throughput: 0: 56439.0. Samples: 1342414020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:50:28,924][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 04:50:31,011][47288] Updated weights for policy 0, policy_version 85036 (0.0029) [2024-04-26 04:50:33,923][47056] Fps is (10 sec: 55706.7, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1393377280. Throughput: 0: 56279.7. Samples: 1342750120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:50:33,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 04:50:34,032][47288] Updated weights for policy 0, policy_version 85046 (0.0031) [2024-04-26 04:50:36,720][47288] Updated weights for policy 0, policy_version 85056 (0.0029) [2024-04-26 04:50:38,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1393672192. Throughput: 0: 56272.3. Samples: 1343089840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:50:38,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 04:50:39,887][47288] Updated weights for policy 0, policy_version 85066 (0.0030) [2024-04-26 04:50:42,618][47288] Updated weights for policy 0, policy_version 85076 (0.0030) [2024-04-26 04:50:43,923][47056] Fps is (10 sec: 60620.9, 60 sec: 57071.1, 300 sec: 56594.2). Total num frames: 1393983488. Throughput: 0: 56434.9. Samples: 1343266520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:50:43,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 04:50:45,715][47288] Updated weights for policy 0, policy_version 85086 (0.0033) [2024-04-26 04:50:48,397][47288] Updated weights for policy 0, policy_version 85096 (0.0036) [2024-04-26 04:50:48,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1394229248. Throughput: 0: 56514.5. Samples: 1343607260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:50:48,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 04:50:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000085097_1394229248.pth... [2024-04-26 04:50:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000084272_1380712448.pth [2024-04-26 04:50:51,377][47288] Updated weights for policy 0, policy_version 85106 (0.0028) [2024-04-26 04:50:53,923][47056] Fps is (10 sec: 52428.6, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 1394507776. Throughput: 0: 56459.7. Samples: 1343947120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:50:53,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 04:50:54,346][47288] Updated weights for policy 0, policy_version 85116 (0.0026) [2024-04-26 04:50:57,086][47288] Updated weights for policy 0, policy_version 85126 (0.0028) [2024-04-26 04:50:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1394786304. Throughput: 0: 56361.1. Samples: 1344107020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:50:58,923][47056] Avg episode reward: [(0, '0.316')] [2024-04-26 04:51:00,176][47288] Updated weights for policy 0, policy_version 85136 (0.0027) [2024-04-26 04:51:02,967][47288] Updated weights for policy 0, policy_version 85146 (0.0031) [2024-04-26 04:51:03,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.6, 300 sec: 56372.1). 
Total num frames: 1395081216. Throughput: 0: 56380.0. Samples: 1344447100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 04:51:03,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 04:51:05,805][47267] Signal inference workers to stop experience collection... (20300 times) [2024-04-26 04:51:05,805][47267] Signal inference workers to resume experience collection... (20300 times) [2024-04-26 04:51:05,826][47288] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-04-26 04:51:05,826][47288] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-04-26 04:51:05,922][47288] Updated weights for policy 0, policy_version 85156 (0.0029) [2024-04-26 04:51:08,711][47288] Updated weights for policy 0, policy_version 85166 (0.0026) [2024-04-26 04:51:08,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1395376128. Throughput: 0: 56609.5. Samples: 1344790640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:08,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 04:51:11,660][47288] Updated weights for policy 0, policy_version 85176 (0.0024) [2024-04-26 04:51:13,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1395638272. Throughput: 0: 56560.6. Samples: 1344959240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:13,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 04:51:14,383][47288] Updated weights for policy 0, policy_version 85186 (0.0028) [2024-04-26 04:51:17,452][47288] Updated weights for policy 0, policy_version 85196 (0.0026) [2024-04-26 04:51:18,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1395933184. Throughput: 0: 56596.6. Samples: 1345296980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:18,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 04:51:20,330][47288] Updated weights for policy 0, policy_version 85206 (0.0029) [2024-04-26 04:51:23,181][47288] Updated weights for policy 0, policy_version 85216 (0.0025) [2024-04-26 04:51:23,923][47056] Fps is (10 sec: 58981.0, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1396228096. Throughput: 0: 56592.0. Samples: 1345636480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:23,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 04:51:26,133][47288] Updated weights for policy 0, policy_version 85226 (0.0029) [2024-04-26 04:51:28,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1396506624. Throughput: 0: 56369.2. Samples: 1345803140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:28,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 04:51:28,924][47288] Updated weights for policy 0, policy_version 85236 (0.0034) [2024-04-26 04:51:31,846][47288] Updated weights for policy 0, policy_version 85246 (0.0025) [2024-04-26 04:51:33,923][47056] Fps is (10 sec: 54068.5, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1396768768. Throughput: 0: 56261.1. Samples: 1346139000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:33,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 04:51:34,695][47288] Updated weights for policy 0, policy_version 85256 (0.0030) [2024-04-26 04:51:37,579][47288] Updated weights for policy 0, policy_version 85266 (0.0034) [2024-04-26 04:51:38,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1397047296. Throughput: 0: 56333.6. Samples: 1346482140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:38,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 04:51:40,786][47288] Updated weights for policy 0, policy_version 85276 (0.0026) [2024-04-26 04:51:43,200][47288] Updated weights for policy 0, policy_version 85286 (0.0030) [2024-04-26 04:51:43,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1397358592. Throughput: 0: 56460.9. Samples: 1346647760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:43,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 04:51:46,481][47288] Updated weights for policy 0, policy_version 85296 (0.0035) [2024-04-26 04:51:48,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 1397637120. Throughput: 0: 56488.6. Samples: 1346989080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:48,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 04:51:49,273][47288] Updated weights for policy 0, policy_version 85306 (0.0030) [2024-04-26 04:51:52,384][47288] Updated weights for policy 0, policy_version 85316 (0.0036) [2024-04-26 04:51:53,927][47056] Fps is (10 sec: 50770.5, 60 sec: 55974.9, 300 sec: 56371.3). Total num frames: 1397866496. Throughput: 0: 56303.5. Samples: 1347324520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:53,927][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 04:51:55,131][47288] Updated weights for policy 0, policy_version 85326 (0.0028) [2024-04-26 04:51:58,088][47288] Updated weights for policy 0, policy_version 85336 (0.0027) [2024-04-26 04:51:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1398177792. Throughput: 0: 56250.7. Samples: 1347490520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 04:51:58,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 04:52:01,075][47288] Updated weights for policy 0, policy_version 85346 (0.0030) [2024-04-26 04:52:03,883][47288] Updated weights for policy 0, policy_version 85356 (0.0030) [2024-04-26 04:52:03,923][47056] Fps is (10 sec: 60644.6, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1398472704. Throughput: 0: 56190.8. Samples: 1347825560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:03,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 04:52:06,832][47288] Updated weights for policy 0, policy_version 85366 (0.0033) [2024-04-26 04:52:08,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1398734848. Throughput: 0: 56208.6. Samples: 1348165860. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:08,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 04:52:09,626][47288] Updated weights for policy 0, policy_version 85376 (0.0023) [2024-04-26 04:52:12,506][47288] Updated weights for policy 0, policy_version 85386 (0.0032) [2024-04-26 04:52:13,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56524.9, 300 sec: 56316.6). Total num frames: 1399029760. Throughput: 0: 56362.3. Samples: 1348339440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:13,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 04:52:15,479][47288] Updated weights for policy 0, policy_version 85396 (0.0030) [2024-04-26 04:52:18,498][47288] Updated weights for policy 0, policy_version 85406 (0.0031) [2024-04-26 04:52:18,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56316.6). Total num frames: 1399308288. Throughput: 0: 56339.0. Samples: 1348674260. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:18,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 04:52:21,404][47288] Updated weights for policy 0, policy_version 85416 (0.0029) [2024-04-26 04:52:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.9, 300 sec: 56261.0). Total num frames: 1399586816. Throughput: 0: 56169.0. Samples: 1349009740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:23,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 04:52:24,366][47288] Updated weights for policy 0, policy_version 85426 (0.0027) [2024-04-26 04:52:27,181][47288] Updated weights for policy 0, policy_version 85436 (0.0027) [2024-04-26 04:52:28,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.4, 300 sec: 56316.5). Total num frames: 1399865344. Throughput: 0: 56266.0. Samples: 1349179740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:28,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 04:52:30,162][47288] Updated weights for policy 0, policy_version 85446 (0.0026) [2024-04-26 04:52:32,566][47267] Signal inference workers to stop experience collection... (20350 times) [2024-04-26 04:52:32,572][47267] Signal inference workers to resume experience collection... (20350 times) [2024-04-26 04:52:32,586][47288] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-04-26 04:52:32,586][47288] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-04-26 04:52:32,977][47288] Updated weights for policy 0, policy_version 85456 (0.0031) [2024-04-26 04:52:33,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1400127488. Throughput: 0: 56219.0. Samples: 1349518940. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:33,923][47056] Avg episode reward: [(0, '0.579')] [2024-04-26 04:52:35,837][47288] Updated weights for policy 0, policy_version 85466 (0.0025) [2024-04-26 04:52:38,743][47288] Updated weights for policy 0, policy_version 85476 (0.0029) [2024-04-26 04:52:38,923][47056] Fps is (10 sec: 57345.3, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1400438784. Throughput: 0: 56203.2. Samples: 1349853440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:38,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 04:52:41,804][47288] Updated weights for policy 0, policy_version 85486 (0.0033) [2024-04-26 04:52:43,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 56372.1). Total num frames: 1400700928. Throughput: 0: 56365.2. Samples: 1350026960. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:43,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 04:52:44,542][47288] Updated weights for policy 0, policy_version 85496 (0.0027) [2024-04-26 04:52:47,497][47288] Updated weights for policy 0, policy_version 85506 (0.0033) [2024-04-26 04:52:48,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.5, 300 sec: 56427.6). Total num frames: 1401012224. Throughput: 0: 56338.5. Samples: 1350360800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-26 04:52:48,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 04:52:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000085511_1401012224.pth... [2024-04-26 04:52:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000084685_1387479040.pth [2024-04-26 04:52:50,398][47288] Updated weights for policy 0, policy_version 85516 (0.0027) [2024-04-26 04:52:53,320][47288] Updated weights for policy 0, policy_version 85526 (0.0025) [2024-04-26 04:52:53,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56801.6, 300 sec: 56316.5). 
Total num frames: 1401274368. Throughput: 0: 56274.7. Samples: 1350698220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:52:53,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 04:52:56,074][47288] Updated weights for policy 0, policy_version 85536 (0.0026) [2024-04-26 04:52:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1401552896. Throughput: 0: 56146.4. Samples: 1350866040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:52:58,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 04:52:59,193][47288] Updated weights for policy 0, policy_version 85546 (0.0031) [2024-04-26 04:53:02,211][47288] Updated weights for policy 0, policy_version 85556 (0.0033) [2024-04-26 04:53:03,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1401831424. Throughput: 0: 56321.4. Samples: 1351208720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:53:03,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 04:53:04,829][47288] Updated weights for policy 0, policy_version 85566 (0.0025) [2024-04-26 04:53:08,347][47288] Updated weights for policy 0, policy_version 85576 (0.0028) [2024-04-26 04:53:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1402093568. Throughput: 0: 56488.3. Samples: 1351551720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:53:08,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 04:53:10,525][47288] Updated weights for policy 0, policy_version 85586 (0.0023) [2024-04-26 04:53:13,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.5, 300 sec: 56372.1). Total num frames: 1402388480. Throughput: 0: 56233.6. Samples: 1351710240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:53:13,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 04:53:14,141][47288] Updated weights for policy 0, policy_version 85596 (0.0026) [2024-04-26 04:53:16,283][47288] Updated weights for policy 0, policy_version 85606 (0.0025) [2024-04-26 04:53:18,923][47056] Fps is (10 sec: 60620.7, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1402699776. Throughput: 0: 56278.6. Samples: 1352051480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:53:18,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 04:53:19,766][47288] Updated weights for policy 0, policy_version 85616 (0.0037) [2024-04-26 04:53:22,240][47288] Updated weights for policy 0, policy_version 85626 (0.0034) [2024-04-26 04:53:23,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 1402978304. Throughput: 0: 56374.9. Samples: 1352390320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:53:23,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 04:53:25,630][47288] Updated weights for policy 0, policy_version 85636 (0.0029) [2024-04-26 04:53:28,291][47288] Updated weights for policy 0, policy_version 85646 (0.0026) [2024-04-26 04:53:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1403256832. Throughput: 0: 56482.7. Samples: 1352568680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:53:28,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 04:53:31,372][47288] Updated weights for policy 0, policy_version 85656 (0.0026) [2024-04-26 04:53:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1403535360. Throughput: 0: 56561.5. Samples: 1352906060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:53:33,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 04:53:33,987][47288] Updated weights for policy 0, policy_version 85666 (0.0024) [2024-04-26 04:53:37,085][47288] Updated weights for policy 0, policy_version 85676 (0.0030) [2024-04-26 04:53:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1403813888. Throughput: 0: 56554.7. Samples: 1353243180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 04:53:38,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 04:53:39,640][47288] Updated weights for policy 0, policy_version 85686 (0.0027) [2024-04-26 04:53:42,889][47288] Updated weights for policy 0, policy_version 85696 (0.0026) [2024-04-26 04:53:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1404092416. Throughput: 0: 56497.8. Samples: 1353408440. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:53:43,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 04:53:45,510][47288] Updated weights for policy 0, policy_version 85706 (0.0029) [2024-04-26 04:53:48,791][47288] Updated weights for policy 0, policy_version 85716 (0.0025) [2024-04-26 04:53:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1404370944. Throughput: 0: 56424.5. Samples: 1353747820. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:53:48,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 04:53:51,145][47288] Updated weights for policy 0, policy_version 85726 (0.0027) [2024-04-26 04:53:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1404665856. Throughput: 0: 56281.2. Samples: 1354084380. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:53:53,924][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 04:53:54,658][47288] Updated weights for policy 0, policy_version 85736 (0.0031) [2024-04-26 04:53:56,467][47267] Signal inference workers to stop experience collection... (20400 times) [2024-04-26 04:53:56,496][47288] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-04-26 04:53:56,562][47267] Signal inference workers to resume experience collection... (20400 times) [2024-04-26 04:53:56,562][47288] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-04-26 04:53:56,796][47288] Updated weights for policy 0, policy_version 85746 (0.0025) [2024-04-26 04:53:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1404944384. Throughput: 0: 56473.8. Samples: 1354251560. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:53:58,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 04:54:00,560][47288] Updated weights for policy 0, policy_version 85756 (0.0025) [2024-04-26 04:54:02,873][47288] Updated weights for policy 0, policy_version 85766 (0.0032) [2024-04-26 04:54:03,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1405222912. Throughput: 0: 56350.8. Samples: 1354587260. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:54:03,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 04:54:06,348][47288] Updated weights for policy 0, policy_version 85776 (0.0025) [2024-04-26 04:54:08,803][47288] Updated weights for policy 0, policy_version 85786 (0.0026) [2024-04-26 04:54:08,923][47056] Fps is (10 sec: 57344.3, 60 sec: 57071.0, 300 sec: 56427.6). Total num frames: 1405517824. Throughput: 0: 56342.9. Samples: 1354925740. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:54:08,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 04:54:12,169][47288] Updated weights for policy 0, policy_version 85796 (0.0027) [2024-04-26 04:54:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1405796352. Throughput: 0: 56292.1. Samples: 1355101820. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:54:13,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 04:54:14,759][47288] Updated weights for policy 0, policy_version 85806 (0.0030) [2024-04-26 04:54:17,861][47288] Updated weights for policy 0, policy_version 85816 (0.0031) [2024-04-26 04:54:18,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1406058496. Throughput: 0: 56371.0. Samples: 1355442760. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:54:18,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 04:54:20,462][47288] Updated weights for policy 0, policy_version 85826 (0.0026) [2024-04-26 04:54:23,496][47288] Updated weights for policy 0, policy_version 85836 (0.0027) [2024-04-26 04:54:23,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1406337024. Throughput: 0: 56398.2. Samples: 1355781100. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:54:23,923][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 04:54:26,222][47288] Updated weights for policy 0, policy_version 85846 (0.0032) [2024-04-26 04:54:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1406631936. Throughput: 0: 56447.1. Samples: 1355948560. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:54:28,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 04:54:29,322][47288] Updated weights for policy 0, policy_version 85856 (0.0028) [2024-04-26 04:54:31,952][47288] Updated weights for policy 0, policy_version 85866 (0.0030) [2024-04-26 04:54:33,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56524.9, 300 sec: 56427.7). Total num frames: 1406926848. Throughput: 0: 56474.3. Samples: 1356289160. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-26 04:54:33,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 04:54:35,295][47288] Updated weights for policy 0, policy_version 85876 (0.0026) [2024-04-26 04:54:37,674][47288] Updated weights for policy 0, policy_version 85886 (0.0025) [2024-04-26 04:54:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 1407205376. Throughput: 0: 56588.4. Samples: 1356630860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:54:38,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 04:54:41,141][47288] Updated weights for policy 0, policy_version 85896 (0.0028) [2024-04-26 04:54:43,553][47288] Updated weights for policy 0, policy_version 85906 (0.0026) [2024-04-26 04:54:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1407483904. Throughput: 0: 56832.5. Samples: 1356809020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:54:43,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 04:54:46,934][47288] Updated weights for policy 0, policy_version 85916 (0.0029) [2024-04-26 04:54:48,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56797.8, 300 sec: 56483.2). Total num frames: 1407778816. Throughput: 0: 56877.3. Samples: 1357146740. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:54:48,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 04:54:48,986][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000085925_1407795200.pth... [2024-04-26 04:54:49,037][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000085097_1394229248.pth [2024-04-26 04:54:49,460][47288] Updated weights for policy 0, policy_version 85926 (0.0038) [2024-04-26 04:54:52,744][47288] Updated weights for policy 0, policy_version 85936 (0.0024) [2024-04-26 04:54:53,923][47056] Fps is (10 sec: 58981.1, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1408073728. Throughput: 0: 56785.1. Samples: 1357481080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:54:53,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 04:54:55,248][47288] Updated weights for policy 0, policy_version 85946 (0.0033) [2024-04-26 04:54:58,484][47288] Updated weights for policy 0, policy_version 85956 (0.0028) [2024-04-26 04:54:58,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1408319488. Throughput: 0: 56488.1. Samples: 1357643780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:54:58,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 04:55:00,952][47267] Signal inference workers to stop experience collection... (20450 times) [2024-04-26 04:55:00,952][47267] Signal inference workers to resume experience collection... (20450 times) [2024-04-26 04:55:00,971][47288] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-04-26 04:55:00,971][47288] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-04-26 04:55:01,069][47288] Updated weights for policy 0, policy_version 85966 (0.0027) [2024-04-26 04:55:03,923][47056] Fps is (10 sec: 52429.5, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1408598016. Throughput: 0: 56451.6. Samples: 1357983080. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:55:03,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 04:55:04,266][47288] Updated weights for policy 0, policy_version 85976 (0.0032) [2024-04-26 04:55:06,892][47288] Updated weights for policy 0, policy_version 85986 (0.0025) [2024-04-26 04:55:08,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1408909312. Throughput: 0: 56609.3. Samples: 1358328520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:55:08,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 04:55:10,029][47288] Updated weights for policy 0, policy_version 85996 (0.0027) [2024-04-26 04:55:12,634][47288] Updated weights for policy 0, policy_version 86006 (0.0032) [2024-04-26 04:55:13,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1409187840. Throughput: 0: 56647.6. Samples: 1358497700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:55:13,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 04:55:15,699][47288] Updated weights for policy 0, policy_version 86016 (0.0032) [2024-04-26 04:55:18,299][47288] Updated weights for policy 0, policy_version 86026 (0.0027) [2024-04-26 04:55:18,923][47056] Fps is (10 sec: 57343.7, 60 sec: 57070.9, 300 sec: 56483.2). Total num frames: 1409482752. Throughput: 0: 56658.5. Samples: 1358838800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:55:18,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 04:55:21,651][47288] Updated weights for policy 0, policy_version 86036 (0.0031) [2024-04-26 04:55:23,923][47056] Fps is (10 sec: 57343.6, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 1409761280. Throughput: 0: 56522.3. Samples: 1359174360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-04-26 04:55:23,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 04:55:23,999][47288] Updated weights for policy 0, policy_version 86046 (0.0027) [2024-04-26 04:55:27,399][47288] Updated weights for policy 0, policy_version 86056 (0.0025) [2024-04-26 04:55:28,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56798.0, 300 sec: 56483.1). Total num frames: 1410039808. Throughput: 0: 56546.7. Samples: 1359353620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:55:28,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 04:55:29,929][47288] Updated weights for policy 0, policy_version 86066 (0.0030) [2024-04-26 04:55:33,094][47288] Updated weights for policy 0, policy_version 86076 (0.0028) [2024-04-26 04:55:33,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56524.5, 300 sec: 56427.6). Total num frames: 1410318336. Throughput: 0: 56668.1. Samples: 1359696820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:55:33,924][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 04:55:35,610][47288] Updated weights for policy 0, policy_version 86086 (0.0035) [2024-04-26 04:55:38,885][47288] Updated weights for policy 0, policy_version 86096 (0.0026) [2024-04-26 04:55:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56525.0, 300 sec: 56316.5). Total num frames: 1410596864. Throughput: 0: 56611.0. Samples: 1360028560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:55:38,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 04:55:41,510][47288] Updated weights for policy 0, policy_version 86106 (0.0024) [2024-04-26 04:55:43,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1410875392. Throughput: 0: 56653.5. Samples: 1360193200. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:55:43,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 04:55:44,634][47288] Updated weights for policy 0, policy_version 86116 (0.0028) [2024-04-26 04:55:47,358][47288] Updated weights for policy 0, policy_version 86126 (0.0035) [2024-04-26 04:55:48,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1411170304. Throughput: 0: 56644.9. Samples: 1360532100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:55:48,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 04:55:50,252][47288] Updated weights for policy 0, policy_version 86136 (0.0028) [2024-04-26 04:55:53,089][47288] Updated weights for policy 0, policy_version 86146 (0.0030) [2024-04-26 04:55:53,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 1411448832. Throughput: 0: 56528.4. Samples: 1360872300. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:55:53,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 04:55:55,987][47288] Updated weights for policy 0, policy_version 86156 (0.0026) [2024-04-26 04:55:58,785][47288] Updated weights for policy 0, policy_version 86166 (0.0028) [2024-04-26 04:55:58,923][47056] Fps is (10 sec: 57343.7, 60 sec: 57070.7, 300 sec: 56483.1). Total num frames: 1411743744. Throughput: 0: 56667.9. Samples: 1361047760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:55:58,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 04:56:01,876][47288] Updated weights for policy 0, policy_version 86176 (0.0034) [2024-04-26 04:56:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56797.8, 300 sec: 56372.0). Total num frames: 1412005888. Throughput: 0: 56504.0. Samples: 1361381480. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:56:03,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 04:56:04,559][47288] Updated weights for policy 0, policy_version 86186 (0.0026) [2024-04-26 04:56:07,092][47267] Signal inference workers to stop experience collection... (20500 times) [2024-04-26 04:56:07,112][47288] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-04-26 04:56:07,148][47267] Signal inference workers to resume experience collection... (20500 times) [2024-04-26 04:56:07,149][47288] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-04-26 04:56:07,870][47288] Updated weights for policy 0, policy_version 86196 (0.0032) [2024-04-26 04:56:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1412317184. Throughput: 0: 56607.6. Samples: 1361721700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:56:08,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 04:56:10,331][47288] Updated weights for policy 0, policy_version 86206 (0.0031) [2024-04-26 04:56:13,687][47288] Updated weights for policy 0, policy_version 86216 (0.0031) [2024-04-26 04:56:13,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1412562944. Throughput: 0: 56342.3. Samples: 1361889040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:56:13,924][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 04:56:16,217][47288] Updated weights for policy 0, policy_version 86226 (0.0027) [2024-04-26 04:56:18,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55978.8, 300 sec: 56316.6). Total num frames: 1412841472. Throughput: 0: 56196.9. Samples: 1362225660. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 04:56:18,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 04:56:19,418][47288] Updated weights for policy 0, policy_version 86236 (0.0029) [2024-04-26 04:56:22,081][47288] Updated weights for policy 0, policy_version 86246 (0.0029) [2024-04-26 04:56:23,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 56372.0). Total num frames: 1413136384. Throughput: 0: 56376.7. Samples: 1362565520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:56:23,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 04:56:25,312][47288] Updated weights for policy 0, policy_version 86256 (0.0027) [2024-04-26 04:56:27,957][47288] Updated weights for policy 0, policy_version 86266 (0.0029) [2024-04-26 04:56:28,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1413431296. Throughput: 0: 56610.8. Samples: 1362740680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:56:28,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 04:56:31,041][47288] Updated weights for policy 0, policy_version 86276 (0.0024) [2024-04-26 04:56:33,770][47288] Updated weights for policy 0, policy_version 86286 (0.0039) [2024-04-26 04:56:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56525.1, 300 sec: 56483.2). Total num frames: 1413709824. Throughput: 0: 56516.5. Samples: 1363075340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:56:33,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 04:56:36,905][47288] Updated weights for policy 0, policy_version 86296 (0.0027) [2024-04-26 04:56:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1413988352. Throughput: 0: 56382.7. Samples: 1363409520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:56:38,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 04:56:39,558][47288] Updated weights for policy 0, policy_version 86306 (0.0026) [2024-04-26 04:56:42,748][47288] Updated weights for policy 0, policy_version 86316 (0.0026) [2024-04-26 04:56:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1414266880. Throughput: 0: 56420.6. Samples: 1363586680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:56:43,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 04:56:45,443][47288] Updated weights for policy 0, policy_version 86326 (0.0027) [2024-04-26 04:56:48,445][47288] Updated weights for policy 0, policy_version 86336 (0.0026) [2024-04-26 04:56:48,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 56595.0). Total num frames: 1414561792. Throughput: 0: 56610.3. Samples: 1363928940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:56:48,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 04:56:49,011][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000086339_1414578176.pth... [2024-04-26 04:56:49,063][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000085511_1401012224.pth [2024-04-26 04:56:51,158][47288] Updated weights for policy 0, policy_version 86346 (0.0030) [2024-04-26 04:56:53,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1414823936. Throughput: 0: 56552.4. Samples: 1364266560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:56:53,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 04:56:54,291][47288] Updated weights for policy 0, policy_version 86356 (0.0030) [2024-04-26 04:56:56,858][47288] Updated weights for policy 0, policy_version 86366 (0.0030) [2024-04-26 04:56:58,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 56372.1). 
Total num frames: 1415102464. Throughput: 0: 56332.8. Samples: 1364424000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:56:58,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 04:57:00,269][47288] Updated weights for policy 0, policy_version 86376 (0.0032) [2024-04-26 04:57:02,781][47288] Updated weights for policy 0, policy_version 86386 (0.0029) [2024-04-26 04:57:03,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1415413760. Throughput: 0: 56561.1. Samples: 1364770920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:57:03,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 04:57:05,978][47288] Updated weights for policy 0, policy_version 86396 (0.0033) [2024-04-26 04:57:08,596][47288] Updated weights for policy 0, policy_version 86406 (0.0030) [2024-04-26 04:57:08,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 1415692288. Throughput: 0: 56493.4. Samples: 1365107720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 04:57:08,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 04:57:11,782][47288] Updated weights for policy 0, policy_version 86416 (0.0032) [2024-04-26 04:57:13,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56798.0, 300 sec: 56483.1). Total num frames: 1415970816. Throughput: 0: 56448.0. Samples: 1365280840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:13,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 04:57:14,414][47288] Updated weights for policy 0, policy_version 86426 (0.0030) [2024-04-26 04:57:17,618][47288] Updated weights for policy 0, policy_version 86436 (0.0026) [2024-04-26 04:57:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1416249344. Throughput: 0: 56612.0. Samples: 1365622880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:18,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 04:57:19,928][47267] Signal inference workers to stop experience collection... (20550 times) [2024-04-26 04:57:19,969][47288] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-04-26 04:57:19,978][47267] Signal inference workers to resume experience collection... (20550 times) [2024-04-26 04:57:19,986][47288] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-04-26 04:57:20,084][47288] Updated weights for policy 0, policy_version 86446 (0.0030) [2024-04-26 04:57:23,554][47288] Updated weights for policy 0, policy_version 86456 (0.0029) [2024-04-26 04:57:23,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1416511488. Throughput: 0: 56660.7. Samples: 1365959260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:23,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 04:57:26,015][47288] Updated weights for policy 0, policy_version 86466 (0.0033) [2024-04-26 04:57:28,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 56483.2). Total num frames: 1416790016. Throughput: 0: 56204.4. Samples: 1366115880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:28,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 04:57:29,389][47288] Updated weights for policy 0, policy_version 86476 (0.0029) [2024-04-26 04:57:31,930][47288] Updated weights for policy 0, policy_version 86486 (0.0026) [2024-04-26 04:57:33,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1417068544. Throughput: 0: 56044.5. Samples: 1366450940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:33,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 04:57:35,282][47288] Updated weights for policy 0, policy_version 86496 (0.0027) [2024-04-26 04:57:37,637][47288] Updated weights for policy 0, policy_version 86506 (0.0033) [2024-04-26 04:57:38,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1417379840. Throughput: 0: 55972.6. Samples: 1366785320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:38,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 04:57:41,069][47288] Updated weights for policy 0, policy_version 86516 (0.0027) [2024-04-26 04:57:43,302][47288] Updated weights for policy 0, policy_version 86526 (0.0037) [2024-04-26 04:57:43,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1417658368. Throughput: 0: 56573.3. Samples: 1366969800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:43,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 04:57:47,004][47288] Updated weights for policy 0, policy_version 86536 (0.0029) [2024-04-26 04:57:48,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1417936896. Throughput: 0: 56294.4. Samples: 1367304160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:48,924][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 04:57:49,213][47288] Updated weights for policy 0, policy_version 86546 (0.0025) [2024-04-26 04:57:52,648][47288] Updated weights for policy 0, policy_version 86556 (0.0029) [2024-04-26 04:57:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1418215424. Throughput: 0: 56374.7. Samples: 1367644580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:53,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 04:57:55,141][47288] Updated weights for policy 0, policy_version 86566 (0.0031) [2024-04-26 04:57:58,406][47288] Updated weights for policy 0, policy_version 86576 (0.0028) [2024-04-26 04:57:58,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1418477568. Throughput: 0: 56040.1. Samples: 1367802640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:57:58,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 04:58:00,900][47288] Updated weights for policy 0, policy_version 86586 (0.0029) [2024-04-26 04:58:03,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 56483.2). Total num frames: 1418756096. Throughput: 0: 56093.3. Samples: 1368147080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 04:58:03,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 04:58:04,282][47288] Updated weights for policy 0, policy_version 86596 (0.0025) [2024-04-26 04:58:06,620][47288] Updated weights for policy 0, policy_version 86606 (0.0035) [2024-04-26 04:58:08,923][47056] Fps is (10 sec: 58981.0, 60 sec: 56251.5, 300 sec: 56538.7). Total num frames: 1419067392. Throughput: 0: 56306.6. Samples: 1368493060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:08,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 04:58:09,910][47288] Updated weights for policy 0, policy_version 86616 (0.0030) [2024-04-26 04:58:12,265][47288] Updated weights for policy 0, policy_version 86626 (0.0023) [2024-04-26 04:58:13,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1419329536. Throughput: 0: 56436.9. Samples: 1368655540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:13,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 04:58:15,598][47288] Updated weights for policy 0, policy_version 86636 (0.0037) [2024-04-26 04:58:18,062][47288] Updated weights for policy 0, policy_version 86646 (0.0030) [2024-04-26 04:58:18,923][47056] Fps is (10 sec: 58983.8, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1419657216. Throughput: 0: 56525.8. Samples: 1368994600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:18,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 04:58:21,445][47288] Updated weights for policy 0, policy_version 86656 (0.0027) [2024-04-26 04:58:22,691][47267] Signal inference workers to stop experience collection... (20600 times) [2024-04-26 04:58:22,739][47288] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-04-26 04:58:22,750][47267] Signal inference workers to resume experience collection... (20600 times) [2024-04-26 04:58:22,758][47288] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-04-26 04:58:23,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 1419919360. Throughput: 0: 56696.4. Samples: 1369336660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:23,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 04:58:23,945][47288] Updated weights for policy 0, policy_version 86666 (0.0028) [2024-04-26 04:58:27,315][47288] Updated weights for policy 0, policy_version 86676 (0.0033) [2024-04-26 04:58:28,923][47056] Fps is (10 sec: 55705.5, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 1420214272. Throughput: 0: 56629.3. Samples: 1369518120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:28,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 04:58:29,581][47288] Updated weights for policy 0, policy_version 86686 (0.0034) [2024-04-26 04:58:32,897][47288] Updated weights for policy 0, policy_version 86696 (0.0025) [2024-04-26 04:58:33,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1420476416. Throughput: 0: 56888.5. Samples: 1369864140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:33,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 04:58:35,377][47288] Updated weights for policy 0, policy_version 86706 (0.0026) [2024-04-26 04:58:38,592][47288] Updated weights for policy 0, policy_version 86716 (0.0026) [2024-04-26 04:58:38,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 56483.2). Total num frames: 1420754944. Throughput: 0: 56698.2. Samples: 1370196000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:38,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 04:58:41,150][47288] Updated weights for policy 0, policy_version 86726 (0.0032) [2024-04-26 04:58:43,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1421017088. Throughput: 0: 56674.3. Samples: 1370352980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:43,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 04:58:44,444][47288] Updated weights for policy 0, policy_version 86736 (0.0030) [2024-04-26 04:58:47,049][47288] Updated weights for policy 0, policy_version 86746 (0.0025) [2024-04-26 04:58:48,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1421328384. Throughput: 0: 56611.4. Samples: 1370694600. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:48,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 04:58:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000086751_1421328384.pth... [2024-04-26 04:58:48,986][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000085925_1407795200.pth [2024-04-26 04:58:50,233][47288] Updated weights for policy 0, policy_version 86756 (0.0028) [2024-04-26 04:58:52,700][47288] Updated weights for policy 0, policy_version 86766 (0.0029) [2024-04-26 04:58:53,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1421606912. Throughput: 0: 56502.8. Samples: 1371035680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 04:58:53,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 04:58:55,897][47288] Updated weights for policy 0, policy_version 86776 (0.0029) [2024-04-26 04:58:58,410][47288] Updated weights for policy 0, policy_version 86786 (0.0025) [2024-04-26 04:58:58,923][47056] Fps is (10 sec: 58983.0, 60 sec: 57343.9, 300 sec: 56594.2). Total num frames: 1421918208. Throughput: 0: 56812.8. Samples: 1371212120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:58:58,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 04:59:01,638][47288] Updated weights for policy 0, policy_version 86796 (0.0031) [2024-04-26 04:59:03,923][47056] Fps is (10 sec: 58982.3, 60 sec: 57343.9, 300 sec: 56538.7). Total num frames: 1422196736. Throughput: 0: 56752.7. Samples: 1371548480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:03,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 04:59:04,277][47288] Updated weights for policy 0, policy_version 86806 (0.0033) [2024-04-26 04:59:07,506][47288] Updated weights for policy 0, policy_version 86816 (0.0028) [2024-04-26 04:59:08,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56525.0, 300 sec: 56483.1). 
Total num frames: 1422458880. Throughput: 0: 56652.9. Samples: 1371886040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:08,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 04:59:10,128][47288] Updated weights for policy 0, policy_version 86826 (0.0027) [2024-04-26 04:59:13,540][47288] Updated weights for policy 0, policy_version 86836 (0.0030) [2024-04-26 04:59:13,923][47056] Fps is (10 sec: 54068.1, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1422737408. Throughput: 0: 56343.2. Samples: 1372053560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:13,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 04:59:15,288][47267] Signal inference workers to stop experience collection... (20650 times) [2024-04-26 04:59:15,309][47288] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-04-26 04:59:15,346][47267] Signal inference workers to resume experience collection... (20650 times) [2024-04-26 04:59:15,347][47288] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-04-26 04:59:16,110][47288] Updated weights for policy 0, policy_version 86846 (0.0026) [2024-04-26 04:59:18,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 1423015936. Throughput: 0: 56256.3. Samples: 1372395680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:18,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 04:59:19,222][47288] Updated weights for policy 0, policy_version 86856 (0.0029) [2024-04-26 04:59:22,013][47288] Updated weights for policy 0, policy_version 86866 (0.0032) [2024-04-26 04:59:23,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1423294464. Throughput: 0: 56440.8. Samples: 1372735840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:23,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 04:59:25,012][47288] Updated weights for policy 0, policy_version 86876 (0.0031) [2024-04-26 04:59:27,673][47288] Updated weights for policy 0, policy_version 86886 (0.0028) [2024-04-26 04:59:28,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 56427.6). Total num frames: 1423572992. Throughput: 0: 56582.8. Samples: 1372899220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:28,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 04:59:30,888][47288] Updated weights for policy 0, policy_version 86896 (0.0026) [2024-04-26 04:59:33,576][47288] Updated weights for policy 0, policy_version 86906 (0.0031) [2024-04-26 04:59:33,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1423884288. Throughput: 0: 56449.4. Samples: 1373234820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:33,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 04:59:36,543][47288] Updated weights for policy 0, policy_version 86916 (0.0030) [2024-04-26 04:59:38,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1424162816. Throughput: 0: 56377.4. Samples: 1373572660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:38,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 04:59:39,447][47288] Updated weights for policy 0, policy_version 86926 (0.0028) [2024-04-26 04:59:42,385][47288] Updated weights for policy 0, policy_version 86936 (0.0029) [2024-04-26 04:59:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 57070.9, 300 sec: 56483.2). Total num frames: 1424441344. Throughput: 0: 56445.4. Samples: 1373752160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:43,923][47056] Avg episode reward: [(0, '0.353')] [2024-04-26 04:59:45,171][47288] Updated weights for policy 0, policy_version 86946 (0.0028) [2024-04-26 04:59:48,217][47288] Updated weights for policy 0, policy_version 86956 (0.0030) [2024-04-26 04:59:48,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1424736256. Throughput: 0: 56430.6. Samples: 1374087860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 04:59:48,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 04:59:50,933][47288] Updated weights for policy 0, policy_version 86966 (0.0030) [2024-04-26 04:59:53,884][47288] Updated weights for policy 0, policy_version 86976 (0.0028) [2024-04-26 04:59:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 1425014784. Throughput: 0: 56340.0. Samples: 1374421340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:59:53,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 04:59:56,757][47288] Updated weights for policy 0, policy_version 86986 (0.0027) [2024-04-26 04:59:58,923][47056] Fps is (10 sec: 52429.8, 60 sec: 55705.6, 300 sec: 56483.2). Total num frames: 1425260544. Throughput: 0: 56288.4. Samples: 1374586540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 04:59:58,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 04:59:59,875][47288] Updated weights for policy 0, policy_version 86996 (0.0025) [2024-04-26 05:00:02,813][47288] Updated weights for policy 0, policy_version 87006 (0.0026) [2024-04-26 05:00:03,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 1425555456. Throughput: 0: 56269.1. Samples: 1374927780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:00:03,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:00:05,753][47288] Updated weights for policy 0, policy_version 87016 (0.0023) [2024-04-26 05:00:08,705][47288] Updated weights for policy 0, policy_version 87026 (0.0028) [2024-04-26 05:00:08,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1425833984. Throughput: 0: 56195.9. Samples: 1375264660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:00:08,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 05:00:11,360][47288] Updated weights for policy 0, policy_version 87036 (0.0025) [2024-04-26 05:00:13,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1426128896. Throughput: 0: 56288.2. Samples: 1375432180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:00:13,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 05:00:14,363][47288] Updated weights for policy 0, policy_version 87046 (0.0027) [2024-04-26 05:00:17,090][47288] Updated weights for policy 0, policy_version 87056 (0.0029) [2024-04-26 05:00:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1426391040. Throughput: 0: 56337.8. Samples: 1375770020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:00:18,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 05:00:20,413][47288] Updated weights for policy 0, policy_version 87066 (0.0027) [2024-04-26 05:00:23,010][47267] Signal inference workers to stop experience collection... (20700 times) [2024-04-26 05:00:23,055][47288] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-04-26 05:00:23,063][47267] Signal inference workers to resume experience collection... 
(20700 times) [2024-04-26 05:00:23,072][47288] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-04-26 05:00:23,074][47288] Updated weights for policy 0, policy_version 87076 (0.0034) [2024-04-26 05:00:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 1426702336. Throughput: 0: 56209.3. Samples: 1376102080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:00:23,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 05:00:26,381][47288] Updated weights for policy 0, policy_version 87086 (0.0027) [2024-04-26 05:00:28,764][47288] Updated weights for policy 0, policy_version 87096 (0.0029) [2024-04-26 05:00:28,923][47056] Fps is (10 sec: 58983.7, 60 sec: 56798.2, 300 sec: 56483.2). Total num frames: 1426980864. Throughput: 0: 56272.6. Samples: 1376284420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:00:28,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 05:00:32,358][47288] Updated weights for policy 0, policy_version 87106 (0.0037) [2024-04-26 05:00:33,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1427243008. Throughput: 0: 56302.3. Samples: 1376621460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:00:33,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 05:00:34,506][47288] Updated weights for policy 0, policy_version 87116 (0.0030) [2024-04-26 05:00:38,117][47288] Updated weights for policy 0, policy_version 87126 (0.0030) [2024-04-26 05:00:38,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1427521536. Throughput: 0: 56380.9. Samples: 1376958480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:00:38,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 05:00:40,381][47288] Updated weights for policy 0, policy_version 87136 (0.0025) [2024-04-26 05:00:43,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.6, 300 sec: 56316.6). 
Total num frames: 1427783680. Throughput: 0: 56310.8. Samples: 1377120520. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:00:43,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 05:00:43,996][47288] Updated weights for policy 0, policy_version 87146 (0.0028) [2024-04-26 05:00:46,067][47288] Updated weights for policy 0, policy_version 87156 (0.0028) [2024-04-26 05:00:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.8, 300 sec: 56372.1). Total num frames: 1428078592. Throughput: 0: 56383.9. Samples: 1377465060. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:00:48,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 05:00:48,987][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000087164_1428094976.pth... [2024-04-26 05:00:49,032][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000086339_1414578176.pth [2024-04-26 05:00:49,798][47288] Updated weights for policy 0, policy_version 87166 (0.0028) [2024-04-26 05:00:51,772][47288] Updated weights for policy 0, policy_version 87176 (0.0024) [2024-04-26 05:00:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 56316.6). Total num frames: 1428357120. Throughput: 0: 56633.5. Samples: 1377813160. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:00:53,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 05:00:55,500][47288] Updated weights for policy 0, policy_version 87186 (0.0029) [2024-04-26 05:00:57,507][47288] Updated weights for policy 0, policy_version 87196 (0.0030) [2024-04-26 05:00:58,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1428652032. Throughput: 0: 56466.2. Samples: 1377973160. 
Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:00:58,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 05:01:01,394][47288] Updated weights for policy 0, policy_version 87206 (0.0032) [2024-04-26 05:01:03,224][47288] Updated weights for policy 0, policy_version 87216 (0.0025) [2024-04-26 05:01:03,923][47056] Fps is (10 sec: 62258.7, 60 sec: 57070.8, 300 sec: 56483.2). Total num frames: 1428979712. Throughput: 0: 56457.0. Samples: 1378310580. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:01:03,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:01:07,163][47288] Updated weights for policy 0, policy_version 87226 (0.0028) [2024-04-26 05:01:08,923][47056] Fps is (10 sec: 60621.3, 60 sec: 57071.0, 300 sec: 56594.3). Total num frames: 1429258240. Throughput: 0: 56561.4. Samples: 1378647340. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:01:08,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 05:01:09,088][47288] Updated weights for policy 0, policy_version 87236 (0.0029) [2024-04-26 05:01:12,979][47288] Updated weights for policy 0, policy_version 87246 (0.0033) [2024-04-26 05:01:13,923][47056] Fps is (10 sec: 52429.4, 60 sec: 56251.9, 300 sec: 56483.1). Total num frames: 1429504000. Throughput: 0: 56449.3. Samples: 1378824640. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:01:13,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 05:01:14,857][47288] Updated weights for policy 0, policy_version 87256 (0.0028) [2024-04-26 05:01:18,754][47288] Updated weights for policy 0, policy_version 87266 (0.0030) [2024-04-26 05:01:18,923][47056] Fps is (10 sec: 50790.5, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1429766144. Throughput: 0: 56494.4. Samples: 1379163700. 
Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:01:18,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 05:01:20,703][47288] Updated weights for policy 0, policy_version 87276 (0.0028) [2024-04-26 05:01:22,973][47267] Signal inference workers to stop experience collection... (20750 times) [2024-04-26 05:01:23,005][47288] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-04-26 05:01:23,031][47267] Signal inference workers to resume experience collection... (20750 times) [2024-04-26 05:01:23,034][47288] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-04-26 05:01:23,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1430061056. Throughput: 0: 56422.6. Samples: 1379497500. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:01:23,932][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 05:01:24,501][47288] Updated weights for policy 0, policy_version 87286 (0.0026) [2024-04-26 05:01:26,454][47288] Updated weights for policy 0, policy_version 87296 (0.0031) [2024-04-26 05:01:28,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55705.4, 300 sec: 56316.5). Total num frames: 1430323200. Throughput: 0: 56206.9. Samples: 1379649840. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:01:28,932][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 05:01:30,258][47288] Updated weights for policy 0, policy_version 87306 (0.0033) [2024-04-26 05:01:32,189][47288] Updated weights for policy 0, policy_version 87316 (0.0026) [2024-04-26 05:01:33,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1430601728. Throughput: 0: 56111.6. Samples: 1379990080. 
Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 05:01:33,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 05:01:36,133][47288] Updated weights for policy 0, policy_version 87326 (0.0034) [2024-04-26 05:01:38,002][47288] Updated weights for policy 0, policy_version 87336 (0.0029) [2024-04-26 05:01:38,923][47056] Fps is (10 sec: 60620.0, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 1430929408. Throughput: 0: 55816.6. Samples: 1380324920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:01:38,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 05:01:42,059][47288] Updated weights for policy 0, policy_version 87346 (0.0025) [2024-04-26 05:01:43,835][47288] Updated weights for policy 0, policy_version 87356 (0.0031) [2024-04-26 05:01:43,923][47056] Fps is (10 sec: 63897.6, 60 sec: 57617.0, 300 sec: 56538.7). Total num frames: 1431240704. Throughput: 0: 56301.0. Samples: 1380506700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:01:43,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 05:01:47,852][47288] Updated weights for policy 0, policy_version 87366 (0.0031) [2024-04-26 05:01:48,923][47056] Fps is (10 sec: 55707.2, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 1431486464. Throughput: 0: 56352.6. Samples: 1380846440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:01:48,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 05:01:49,639][47288] Updated weights for policy 0, policy_version 87376 (0.0030) [2024-04-26 05:01:53,782][47288] Updated weights for policy 0, policy_version 87386 (0.0035) [2024-04-26 05:01:53,923][47056] Fps is (10 sec: 49152.1, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1431732224. Throughput: 0: 56281.4. Samples: 1381180000. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:01:53,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 05:01:55,552][47288] Updated weights for policy 0, policy_version 87396 (0.0027) [2024-04-26 05:01:58,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1432010752. Throughput: 0: 55842.9. Samples: 1381337580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:01:58,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 05:01:59,487][47288] Updated weights for policy 0, policy_version 87406 (0.0028) [2024-04-26 05:02:01,419][47288] Updated weights for policy 0, policy_version 87416 (0.0023) [2024-04-26 05:02:03,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55159.4, 300 sec: 56261.0). Total num frames: 1432289280. Throughput: 0: 55939.4. Samples: 1381680980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:02:03,924][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 05:02:05,351][47288] Updated weights for policy 0, policy_version 87426 (0.0026) [2024-04-26 05:02:06,483][47267] Signal inference workers to stop experience collection... (20800 times) [2024-04-26 05:02:06,483][47267] Signal inference workers to resume experience collection... (20800 times) [2024-04-26 05:02:06,514][47288] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-04-26 05:02:06,514][47288] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-04-26 05:02:07,421][47288] Updated weights for policy 0, policy_version 87436 (0.0030) [2024-04-26 05:02:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55159.5, 300 sec: 56261.0). Total num frames: 1432567808. Throughput: 0: 56071.8. Samples: 1382020720. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:02:08,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 05:02:11,087][47288] Updated weights for policy 0, policy_version 87446 (0.0026) [2024-04-26 05:02:13,333][47288] Updated weights for policy 0, policy_version 87456 (0.0032) [2024-04-26 05:02:13,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1432879104. Throughput: 0: 56469.0. Samples: 1382190940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:02:13,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 05:02:16,883][47288] Updated weights for policy 0, policy_version 87466 (0.0026) [2024-04-26 05:02:18,923][47056] Fps is (10 sec: 62258.8, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 1433190400. Throughput: 0: 56349.3. Samples: 1382525800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:02:18,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 05:02:19,010][47288] Updated weights for policy 0, policy_version 87476 (0.0031) [2024-04-26 05:02:22,619][47288] Updated weights for policy 0, policy_version 87486 (0.0027) [2024-04-26 05:02:23,923][47056] Fps is (10 sec: 60620.8, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 1433485312. Throughput: 0: 56351.3. Samples: 1382860720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 05:02:23,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:02:24,761][47288] Updated weights for policy 0, policy_version 87496 (0.0029) [2024-04-26 05:02:28,510][47288] Updated weights for policy 0, policy_version 87506 (0.0029) [2024-04-26 05:02:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1433747456. Throughput: 0: 56504.8. Samples: 1383049420. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:02:28,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 05:02:30,579][47288] Updated weights for policy 0, policy_version 87516 (0.0029) [2024-04-26 05:02:33,923][47056] Fps is (10 sec: 50790.4, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1433993216. Throughput: 0: 56537.6. Samples: 1383390640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:02:33,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 05:02:34,222][47288] Updated weights for policy 0, policy_version 87526 (0.0029) [2024-04-26 05:02:36,446][47288] Updated weights for policy 0, policy_version 87536 (0.0026) [2024-04-26 05:02:38,923][47056] Fps is (10 sec: 50790.3, 60 sec: 55432.7, 300 sec: 56261.0). Total num frames: 1434255360. Throughput: 0: 56606.6. Samples: 1383727300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:02:38,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 05:02:40,001][47288] Updated weights for policy 0, policy_version 87546 (0.0028) [2024-04-26 05:02:42,493][47288] Updated weights for policy 0, policy_version 87556 (0.0027) [2024-04-26 05:02:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 56316.5). Total num frames: 1434550272. Throughput: 0: 56365.4. Samples: 1383874020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:02:43,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 05:02:45,731][47288] Updated weights for policy 0, policy_version 87566 (0.0032) [2024-04-26 05:02:48,344][47288] Updated weights for policy 0, policy_version 87576 (0.0029) [2024-04-26 05:02:48,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1434845184. Throughput: 0: 56316.1. Samples: 1384215200. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:02:48,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 05:02:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000087576_1434845184.pth... [2024-04-26 05:02:48,994][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000086751_1421328384.pth [2024-04-26 05:02:50,563][47267] Signal inference workers to stop experience collection... (20850 times) [2024-04-26 05:02:50,600][47288] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-04-26 05:02:50,626][47267] Signal inference workers to resume experience collection... (20850 times) [2024-04-26 05:02:50,628][47288] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-04-26 05:02:51,506][47288] Updated weights for policy 0, policy_version 87586 (0.0026) [2024-04-26 05:02:53,923][47056] Fps is (10 sec: 60620.6, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 1435156480. Throughput: 0: 56325.7. Samples: 1384555380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:02:53,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 05:02:54,204][47288] Updated weights for policy 0, policy_version 87596 (0.0029) [2024-04-26 05:02:57,305][47288] Updated weights for policy 0, policy_version 87606 (0.0028) [2024-04-26 05:02:58,923][47056] Fps is (10 sec: 60621.1, 60 sec: 57344.1, 300 sec: 56594.2). Total num frames: 1435451392. Throughput: 0: 56572.5. Samples: 1384736700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:02:58,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 05:03:00,347][47288] Updated weights for policy 0, policy_version 87616 (0.0027) [2024-04-26 05:03:03,082][47288] Updated weights for policy 0, policy_version 87626 (0.0029) [2024-04-26 05:03:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 57344.1, 300 sec: 56483.2). Total num frames: 1435729920. Throughput: 0: 56628.4. Samples: 1385074080. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:03:03,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 05:03:06,028][47288] Updated weights for policy 0, policy_version 87636 (0.0034) [2024-04-26 05:03:08,858][47288] Updated weights for policy 0, policy_version 87646 (0.0030) [2024-04-26 05:03:08,923][47056] Fps is (10 sec: 54067.6, 60 sec: 57071.0, 300 sec: 56483.2). Total num frames: 1435992064. Throughput: 0: 56656.2. Samples: 1385410240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:03:08,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 05:03:11,969][47288] Updated weights for policy 0, policy_version 87656 (0.0028) [2024-04-26 05:03:13,923][47056] Fps is (10 sec: 50789.9, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1436237824. Throughput: 0: 56082.5. Samples: 1385573140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:03:13,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 05:03:14,669][47288] Updated weights for policy 0, policy_version 87666 (0.0031) [2024-04-26 05:03:17,864][47288] Updated weights for policy 0, policy_version 87676 (0.0027) [2024-04-26 05:03:18,923][47056] Fps is (10 sec: 50789.5, 60 sec: 55159.4, 300 sec: 56205.4). Total num frames: 1436499968. Throughput: 0: 56061.3. Samples: 1385913400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 05:03:18,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 05:03:20,386][47288] Updated weights for policy 0, policy_version 87686 (0.0023) [2024-04-26 05:03:23,506][47288] Updated weights for policy 0, policy_version 87696 (0.0031) [2024-04-26 05:03:23,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 56261.0). Total num frames: 1436811264. Throughput: 0: 56166.6. Samples: 1386254800. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:03:23,924][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 05:03:26,241][47288] Updated weights for policy 0, policy_version 87706 (0.0028) [2024-04-26 05:03:28,388][47267] Signal inference workers to stop experience collection... (20900 times) [2024-04-26 05:03:28,388][47267] Signal inference workers to resume experience collection... (20900 times) [2024-04-26 05:03:28,411][47288] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-04-26 05:03:28,412][47288] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-04-26 05:03:28,923][47056] Fps is (10 sec: 60621.2, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1437106176. Throughput: 0: 56510.6. Samples: 1386417000. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:03:28,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:03:29,169][47288] Updated weights for policy 0, policy_version 87716 (0.0025) [2024-04-26 05:03:31,929][47288] Updated weights for policy 0, policy_version 87726 (0.0033) [2024-04-26 05:03:33,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 1437401088. Throughput: 0: 56403.7. Samples: 1386753380. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:03:33,924][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 05:03:35,426][47288] Updated weights for policy 0, policy_version 87736 (0.0028) [2024-04-26 05:03:37,776][47288] Updated weights for policy 0, policy_version 87746 (0.0025) [2024-04-26 05:03:38,923][47056] Fps is (10 sec: 60620.5, 60 sec: 57617.0, 300 sec: 56594.2). Total num frames: 1437712384. Throughput: 0: 56395.1. Samples: 1387093160. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:03:38,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 05:03:41,043][47288] Updated weights for policy 0, policy_version 87756 (0.0033) [2024-04-26 05:03:43,496][47288] Updated weights for policy 0, policy_version 87766 (0.0024) [2024-04-26 05:03:43,923][47056] Fps is (10 sec: 60622.2, 60 sec: 57617.0, 300 sec: 56538.7). Total num frames: 1438007296. Throughput: 0: 56440.8. Samples: 1387276540. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:03:43,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 05:03:46,714][47288] Updated weights for policy 0, policy_version 87776 (0.0028) [2024-04-26 05:03:48,923][47056] Fps is (10 sec: 50790.6, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1438220288. Throughput: 0: 56510.6. Samples: 1387617060. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:03:48,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 05:03:49,286][47288] Updated weights for policy 0, policy_version 87786 (0.0026) [2024-04-26 05:03:52,615][47288] Updated weights for policy 0, policy_version 87796 (0.0028) [2024-04-26 05:03:53,923][47056] Fps is (10 sec: 49151.3, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 1438498816. Throughput: 0: 56637.0. Samples: 1387958920. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:03:53,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 05:03:55,073][47288] Updated weights for policy 0, policy_version 87806 (0.0027) [2024-04-26 05:03:58,479][47288] Updated weights for policy 0, policy_version 87816 (0.0033) [2024-04-26 05:03:58,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 56205.4). Total num frames: 1438777344. Throughput: 0: 56428.4. Samples: 1388112420. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:03:58,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:04:00,934][47288] Updated weights for policy 0, policy_version 87826 (0.0020) [2024-04-26 05:04:03,923][47056] Fps is (10 sec: 57345.2, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 1439072256. Throughput: 0: 56280.6. Samples: 1388446020. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:04:03,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 05:04:04,226][47288] Updated weights for policy 0, policy_version 87836 (0.0026) [2024-04-26 05:04:06,775][47288] Updated weights for policy 0, policy_version 87846 (0.0035) [2024-04-26 05:04:08,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.5, 300 sec: 56372.0). Total num frames: 1439367168. Throughput: 0: 56350.5. Samples: 1388790580. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:04:08,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:04:10,024][47288] Updated weights for policy 0, policy_version 87856 (0.0027) [2024-04-26 05:04:12,449][47288] Updated weights for policy 0, policy_version 87866 (0.0029) [2024-04-26 05:04:13,923][47056] Fps is (10 sec: 60620.5, 60 sec: 57344.1, 300 sec: 56483.2). Total num frames: 1439678464. Throughput: 0: 56712.5. Samples: 1388969060. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-04-26 05:04:13,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 05:04:15,810][47288] Updated weights for policy 0, policy_version 87876 (0.0028) [2024-04-26 05:04:17,085][47267] Signal inference workers to stop experience collection... (20950 times) [2024-04-26 05:04:17,086][47267] Signal inference workers to resume experience collection... 
(20950 times) [2024-04-26 05:04:17,114][47288] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-04-26 05:04:17,114][47288] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-04-26 05:04:18,374][47288] Updated weights for policy 0, policy_version 87886 (0.0027) [2024-04-26 05:04:18,923][47056] Fps is (10 sec: 58983.4, 60 sec: 57617.1, 300 sec: 56483.1). Total num frames: 1439956992. Throughput: 0: 56705.2. Samples: 1389305100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:04:18,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 05:04:21,706][47288] Updated weights for policy 0, policy_version 87896 (0.0030) [2024-04-26 05:04:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1440219136. Throughput: 0: 56677.9. Samples: 1389643660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:04:23,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 05:04:24,262][47288] Updated weights for policy 0, policy_version 87906 (0.0025) [2024-04-26 05:04:27,590][47288] Updated weights for policy 0, policy_version 87916 (0.0030) [2024-04-26 05:04:28,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1440497664. Throughput: 0: 56348.9. Samples: 1389812240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:04:28,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 05:04:29,925][47288] Updated weights for policy 0, policy_version 87926 (0.0027) [2024-04-26 05:04:33,395][47288] Updated weights for policy 0, policy_version 87936 (0.0027) [2024-04-26 05:04:33,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.9, 300 sec: 56316.5). Total num frames: 1440776192. Throughput: 0: 56420.9. Samples: 1390156000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:04:33,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 05:04:35,680][47288] Updated weights for policy 0, policy_version 87946 (0.0027) [2024-04-26 05:04:38,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1441038336. Throughput: 0: 56486.9. Samples: 1390500820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:04:38,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 05:04:39,109][47288] Updated weights for policy 0, policy_version 87956 (0.0033) [2024-04-26 05:04:41,548][47288] Updated weights for policy 0, policy_version 87966 (0.0031) [2024-04-26 05:04:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1441333248. Throughput: 0: 56661.1. Samples: 1390662160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:04:43,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 05:04:44,895][47288] Updated weights for policy 0, policy_version 87976 (0.0027) [2024-04-26 05:04:47,379][47288] Updated weights for policy 0, policy_version 87986 (0.0028) [2024-04-26 05:04:48,923][47056] Fps is (10 sec: 60620.5, 60 sec: 57070.9, 300 sec: 56372.1). Total num frames: 1441644544. Throughput: 0: 56695.9. Samples: 1390997340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:04:48,923][47056] Avg episode reward: [(0, '0.375')] [2024-04-26 05:04:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000087991_1441644544.pth... [2024-04-26 05:04:48,995][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000087164_1428094976.pth [2024-04-26 05:04:50,665][47288] Updated weights for policy 0, policy_version 87996 (0.0029) [2024-04-26 05:04:53,125][47288] Updated weights for policy 0, policy_version 88006 (0.0026) [2024-04-26 05:04:53,923][47056] Fps is (10 sec: 60620.3, 60 sec: 57344.1, 300 sec: 56538.7). 
Total num frames: 1441939456. Throughput: 0: 56521.5. Samples: 1391334040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:04:53,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 05:04:56,298][47288] Updated weights for policy 0, policy_version 88016 (0.0027) [2024-04-26 05:04:58,923][47056] Fps is (10 sec: 55706.2, 60 sec: 57071.2, 300 sec: 56427.6). Total num frames: 1442201600. Throughput: 0: 56581.9. Samples: 1391515240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:04:58,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 05:04:58,998][47288] Updated weights for policy 0, policy_version 88026 (0.0028) [2024-04-26 05:05:01,951][47288] Updated weights for policy 0, policy_version 88036 (0.0028) [2024-04-26 05:05:03,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1442480128. Throughput: 0: 56660.5. Samples: 1391854820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 05:05:03,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 05:05:04,773][47288] Updated weights for policy 0, policy_version 88046 (0.0027) [2024-04-26 05:05:07,820][47288] Updated weights for policy 0, policy_version 88056 (0.0035) [2024-04-26 05:05:08,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 1442758656. Throughput: 0: 56691.1. Samples: 1392194760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:08,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 05:05:10,515][47288] Updated weights for policy 0, policy_version 88066 (0.0032) [2024-04-26 05:05:13,620][47288] Updated weights for policy 0, policy_version 88076 (0.0031) [2024-04-26 05:05:13,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1443037184. Throughput: 0: 56442.3. Samples: 1392352140. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:13,923][47056] Avg episode reward: [(0, '0.405')] [2024-04-26 05:05:15,940][47267] Signal inference workers to stop experience collection... (21000 times) [2024-04-26 05:05:15,941][47267] Signal inference workers to resume experience collection... (21000 times) [2024-04-26 05:05:15,965][47288] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-04-26 05:05:15,966][47288] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-04-26 05:05:16,179][47288] Updated weights for policy 0, policy_version 88086 (0.0030) [2024-04-26 05:05:18,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1443315712. Throughput: 0: 56408.6. Samples: 1392694380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:18,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:05:19,400][47288] Updated weights for policy 0, policy_version 88096 (0.0028) [2024-04-26 05:05:22,087][47288] Updated weights for policy 0, policy_version 88106 (0.0028) [2024-04-26 05:05:23,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.8, 300 sec: 56372.0). Total num frames: 1443610624. Throughput: 0: 56274.6. Samples: 1393033180. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:23,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 05:05:25,435][47288] Updated weights for policy 0, policy_version 88116 (0.0025) [2024-04-26 05:05:27,941][47288] Updated weights for policy 0, policy_version 88126 (0.0029) [2024-04-26 05:05:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1443889152. Throughput: 0: 56605.0. Samples: 1393209380. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:28,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 05:05:31,114][47288] Updated weights for policy 0, policy_version 88136 (0.0032) [2024-04-26 05:05:33,717][47288] Updated weights for policy 0, policy_version 88146 (0.0025) [2024-04-26 05:05:33,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 1444184064. Throughput: 0: 56549.0. Samples: 1393542040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:33,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 05:05:36,948][47288] Updated weights for policy 0, policy_version 88156 (0.0029) [2024-04-26 05:05:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1444429824. Throughput: 0: 56669.9. Samples: 1393884180. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:38,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 05:05:39,461][47288] Updated weights for policy 0, policy_version 88166 (0.0027) [2024-04-26 05:05:42,780][47288] Updated weights for policy 0, policy_version 88176 (0.0029) [2024-04-26 05:05:43,923][47056] Fps is (10 sec: 54066.4, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1444724736. Throughput: 0: 56213.6. Samples: 1394044860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:43,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 05:05:45,344][47288] Updated weights for policy 0, policy_version 88186 (0.0025) [2024-04-26 05:05:48,657][47288] Updated weights for policy 0, policy_version 88196 (0.0026) [2024-04-26 05:05:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1445003264. Throughput: 0: 56218.5. Samples: 1394384660. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:48,923][47056] Avg episode reward: [(0, '0.335')] [2024-04-26 05:05:51,253][47288] Updated weights for policy 0, policy_version 88206 (0.0029) [2024-04-26 05:05:53,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1445298176. Throughput: 0: 56044.4. Samples: 1394716760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:53,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 05:05:54,546][47288] Updated weights for policy 0, policy_version 88216 (0.0030) [2024-04-26 05:05:56,989][47288] Updated weights for policy 0, policy_version 88226 (0.0026) [2024-04-26 05:05:58,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1445593088. Throughput: 0: 56567.9. Samples: 1394897700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 05:05:58,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 05:06:00,177][47288] Updated weights for policy 0, policy_version 88236 (0.0031) [2024-04-26 05:06:02,791][47288] Updated weights for policy 0, policy_version 88246 (0.0024) [2024-04-26 05:06:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1445871616. Throughput: 0: 56412.8. Samples: 1395232960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:03,932][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 05:06:05,772][47288] Updated weights for policy 0, policy_version 88256 (0.0033) [2024-04-26 05:06:08,637][47288] Updated weights for policy 0, policy_version 88266 (0.0027) [2024-04-26 05:06:08,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56798.0, 300 sec: 56483.1). Total num frames: 1446166528. Throughput: 0: 56400.6. Samples: 1395571200. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:08,923][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 05:06:11,658][47288] Updated weights for policy 0, policy_version 88276 (0.0032) [2024-04-26 05:06:13,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1446412288. Throughput: 0: 56250.2. Samples: 1395740640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:13,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 05:06:14,578][47288] Updated weights for policy 0, policy_version 88286 (0.0027) [2024-04-26 05:06:17,534][47288] Updated weights for policy 0, policy_version 88296 (0.0025) [2024-04-26 05:06:18,642][47267] Signal inference workers to stop experience collection... (21050 times) [2024-04-26 05:06:18,676][47288] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-04-26 05:06:18,688][47267] Signal inference workers to resume experience collection... (21050 times) [2024-04-26 05:06:18,694][47288] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-04-26 05:06:18,923][47056] Fps is (10 sec: 52428.3, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1446690816. Throughput: 0: 56299.9. Samples: 1396075540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:18,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 05:06:20,296][47288] Updated weights for policy 0, policy_version 88306 (0.0028) [2024-04-26 05:06:23,224][47288] Updated weights for policy 0, policy_version 88316 (0.0032) [2024-04-26 05:06:23,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1446969344. Throughput: 0: 56231.5. Samples: 1396414600. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:23,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 05:06:26,112][47288] Updated weights for policy 0, policy_version 88326 (0.0032) [2024-04-26 05:06:28,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.6, 300 sec: 56538.7). Total num frames: 1447280640. Throughput: 0: 56248.0. Samples: 1396576020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:28,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 05:06:29,632][47288] Updated weights for policy 0, policy_version 88336 (0.0030) [2024-04-26 05:06:31,953][47288] Updated weights for policy 0, policy_version 88346 (0.0035) [2024-04-26 05:06:33,923][47056] Fps is (10 sec: 60620.9, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1447575552. Throughput: 0: 56192.5. Samples: 1396913320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:33,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:06:35,464][47288] Updated weights for policy 0, policy_version 88356 (0.0031) [2024-04-26 05:06:37,796][47288] Updated weights for policy 0, policy_version 88366 (0.0033) [2024-04-26 05:06:38,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 1447837696. Throughput: 0: 56324.0. Samples: 1397251340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:38,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 05:06:41,224][47288] Updated weights for policy 0, policy_version 88376 (0.0027) [2024-04-26 05:06:43,452][47288] Updated weights for policy 0, policy_version 88386 (0.0025) [2024-04-26 05:06:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1448132608. Throughput: 0: 56393.8. Samples: 1397435420. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:43,924][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 05:06:46,875][47288] Updated weights for policy 0, policy_version 88396 (0.0025) [2024-04-26 05:06:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 1448394752. Throughput: 0: 56540.9. Samples: 1397777300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:48,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:06:48,976][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000088404_1448411136.pth... [2024-04-26 05:06:49,018][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000087576_1434845184.pth [2024-04-26 05:06:49,473][47288] Updated weights for policy 0, policy_version 88406 (0.0029) [2024-04-26 05:06:52,549][47288] Updated weights for policy 0, policy_version 88416 (0.0028) [2024-04-26 05:06:53,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 1448656896. Throughput: 0: 56492.4. Samples: 1398113360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 05:06:53,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 05:06:55,289][47288] Updated weights for policy 0, policy_version 88426 (0.0029) [2024-04-26 05:06:58,531][47288] Updated weights for policy 0, policy_version 88436 (0.0029) [2024-04-26 05:06:58,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 56483.2). Total num frames: 1448951808. Throughput: 0: 56194.9. Samples: 1398269420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:06:58,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 05:07:00,963][47288] Updated weights for policy 0, policy_version 88446 (0.0028) [2024-04-26 05:07:03,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.6, 300 sec: 56483.1). Total num frames: 1449230336. Throughput: 0: 56187.5. Samples: 1398603980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:07:03,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 05:07:04,419][47288] Updated weights for policy 0, policy_version 88456 (0.0025) [2024-04-26 05:07:06,657][47288] Updated weights for policy 0, policy_version 88466 (0.0033) [2024-04-26 05:07:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.5, 300 sec: 56427.6). Total num frames: 1449525248. Throughput: 0: 56162.2. Samples: 1398941900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:07:08,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 05:07:10,250][47288] Updated weights for policy 0, policy_version 88476 (0.0029) [2024-04-26 05:07:12,465][47288] Updated weights for policy 0, policy_version 88486 (0.0030) [2024-04-26 05:07:13,923][47056] Fps is (10 sec: 60620.2, 60 sec: 57070.7, 300 sec: 56427.6). Total num frames: 1449836544. Throughput: 0: 56739.0. Samples: 1399129280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:07:13,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 05:07:16,118][47288] Updated weights for policy 0, policy_version 88496 (0.0027) [2024-04-26 05:07:18,345][47288] Updated weights for policy 0, policy_version 88506 (0.0029) [2024-04-26 05:07:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1450098688. Throughput: 0: 56673.4. Samples: 1399463620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:07:18,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 05:07:21,828][47288] Updated weights for policy 0, policy_version 88516 (0.0025) [2024-04-26 05:07:23,923][47056] Fps is (10 sec: 54068.3, 60 sec: 56798.0, 300 sec: 56372.1). Total num frames: 1450377216. Throughput: 0: 56654.8. Samples: 1399800800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:07:23,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 05:07:24,079][47288] Updated weights for policy 0, policy_version 88526 (0.0029) [2024-04-26 05:07:25,106][47267] Signal inference workers to stop experience collection... (21100 times) [2024-04-26 05:07:25,107][47267] Signal inference workers to resume experience collection... (21100 times) [2024-04-26 05:07:25,125][47288] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-04-26 05:07:25,126][47288] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-04-26 05:07:27,675][47288] Updated weights for policy 0, policy_version 88536 (0.0031) [2024-04-26 05:07:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1450655744. Throughput: 0: 56210.6. Samples: 1399964900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:07:28,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 05:07:30,361][47288] Updated weights for policy 0, policy_version 88546 (0.0035) [2024-04-26 05:07:33,386][47288] Updated weights for policy 0, policy_version 88556 (0.0031) [2024-04-26 05:07:33,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55705.6, 300 sec: 56483.1). Total num frames: 1450917888. Throughput: 0: 56083.5. Samples: 1400301060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:07:33,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 05:07:36,023][47288] Updated weights for policy 0, policy_version 88566 (0.0026) [2024-04-26 05:07:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 1451212800. Throughput: 0: 56285.6. Samples: 1400646220. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:07:38,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 05:07:39,005][47288] Updated weights for policy 0, policy_version 88576 (0.0025) [2024-04-26 05:07:41,755][47288] Updated weights for policy 0, policy_version 88586 (0.0029) [2024-04-26 05:07:43,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1451507712. Throughput: 0: 56518.2. Samples: 1400812740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:07:43,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 05:07:44,866][47288] Updated weights for policy 0, policy_version 88596 (0.0028) [2024-04-26 05:07:47,621][47288] Updated weights for policy 0, policy_version 88606 (0.0026) [2024-04-26 05:07:48,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1451802624. Throughput: 0: 56679.7. Samples: 1401154560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:07:48,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 05:07:50,651][47288] Updated weights for policy 0, policy_version 88616 (0.0032) [2024-04-26 05:07:53,192][47288] Updated weights for policy 0, policy_version 88626 (0.0030) [2024-04-26 05:07:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 57070.8, 300 sec: 56372.1). Total num frames: 1452081152. Throughput: 0: 56768.9. Samples: 1401496500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:07:53,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 05:07:56,453][47288] Updated weights for policy 0, policy_version 88636 (0.0030) [2024-04-26 05:07:58,824][47288] Updated weights for policy 0, policy_version 88646 (0.0031) [2024-04-26 05:07:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 1452376064. Throughput: 0: 56489.9. Samples: 1401671320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:07:58,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 05:08:02,271][47288] Updated weights for policy 0, policy_version 88656 (0.0031) [2024-04-26 05:08:03,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56524.8, 300 sec: 56372.0). Total num frames: 1452621824. Throughput: 0: 56567.4. Samples: 1402009160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:08:03,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 05:08:04,644][47288] Updated weights for policy 0, policy_version 88666 (0.0031) [2024-04-26 05:08:07,888][47288] Updated weights for policy 0, policy_version 88676 (0.0028) [2024-04-26 05:08:08,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1452916736. Throughput: 0: 56533.8. Samples: 1402344820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:08:08,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 05:08:10,476][47288] Updated weights for policy 0, policy_version 88686 (0.0029) [2024-04-26 05:08:13,654][47288] Updated weights for policy 0, policy_version 88696 (0.0026) [2024-04-26 05:08:13,923][47056] Fps is (10 sec: 57345.1, 60 sec: 55978.9, 300 sec: 56594.3). Total num frames: 1453195264. Throughput: 0: 56547.8. Samples: 1402509540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:08:13,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 05:08:16,307][47288] Updated weights for policy 0, policy_version 88706 (0.0032) [2024-04-26 05:08:18,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56483.2). Total num frames: 1453473792. Throughput: 0: 56668.9. Samples: 1402851160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:08:18,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 05:08:19,516][47288] Updated weights for policy 0, policy_version 88716 (0.0028) [2024-04-26 05:08:21,525][47267] Signal inference workers to stop experience collection... 
(21150 times) [2024-04-26 05:08:21,569][47288] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-04-26 05:08:21,581][47267] Signal inference workers to resume experience collection... (21150 times) [2024-04-26 05:08:21,588][47288] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-04-26 05:08:21,949][47288] Updated weights for policy 0, policy_version 88726 (0.0026) [2024-04-26 05:08:23,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1453785088. Throughput: 0: 56520.9. Samples: 1403189660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:08:23,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 05:08:25,306][47288] Updated weights for policy 0, policy_version 88736 (0.0029) [2024-04-26 05:08:27,897][47288] Updated weights for policy 0, policy_version 88746 (0.0034) [2024-04-26 05:08:28,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1454030848. Throughput: 0: 56558.9. Samples: 1403357880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:08:28,923][47056] Avg episode reward: [(0, '0.407')] [2024-04-26 05:08:31,288][47288] Updated weights for policy 0, policy_version 88756 (0.0030) [2024-04-26 05:08:33,748][47288] Updated weights for policy 0, policy_version 88766 (0.0025) [2024-04-26 05:08:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 57071.0, 300 sec: 56372.1). Total num frames: 1454342144. Throughput: 0: 56452.4. Samples: 1403694920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:08:33,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 05:08:36,950][47288] Updated weights for policy 0, policy_version 88776 (0.0028) [2024-04-26 05:08:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1454604288. Throughput: 0: 56426.7. Samples: 1404035700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 05:08:38,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 05:08:39,446][47288] Updated weights for policy 0, policy_version 88786 (0.0031) [2024-04-26 05:08:42,759][47288] Updated weights for policy 0, policy_version 88796 (0.0024) [2024-04-26 05:08:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1454882816. Throughput: 0: 56335.1. Samples: 1404206400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:08:43,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 05:08:45,296][47288] Updated weights for policy 0, policy_version 88806 (0.0030) [2024-04-26 05:08:48,587][47288] Updated weights for policy 0, policy_version 88816 (0.0032) [2024-04-26 05:08:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56483.2). Total num frames: 1455161344. Throughput: 0: 56386.4. Samples: 1404546540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:08:48,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 05:08:48,971][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000088817_1455177728.pth... [2024-04-26 05:08:49,017][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000087991_1441644544.pth [2024-04-26 05:08:51,333][47288] Updated weights for policy 0, policy_version 88826 (0.0032) [2024-04-26 05:08:53,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 1455456256. Throughput: 0: 56412.9. Samples: 1404883400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:08:53,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 05:08:54,229][47288] Updated weights for policy 0, policy_version 88836 (0.0030) [2024-04-26 05:08:57,031][47288] Updated weights for policy 0, policy_version 88846 (0.0029) [2024-04-26 05:08:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 56483.1). 
Total num frames: 1455734784. Throughput: 0: 56558.1. Samples: 1405054660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:08:58,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 05:08:59,864][47288] Updated weights for policy 0, policy_version 88856 (0.0030) [2024-04-26 05:09:02,972][47288] Updated weights for policy 0, policy_version 88866 (0.0031) [2024-04-26 05:09:03,923][47056] Fps is (10 sec: 58981.4, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1456046080. Throughput: 0: 56647.9. Samples: 1405400320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:09:03,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 05:09:05,657][47288] Updated weights for policy 0, policy_version 88876 (0.0024) [2024-04-26 05:09:08,710][47288] Updated weights for policy 0, policy_version 88886 (0.0027) [2024-04-26 05:09:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1456308224. Throughput: 0: 56575.2. Samples: 1405735540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:09:08,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 05:09:11,514][47288] Updated weights for policy 0, policy_version 88896 (0.0029) [2024-04-26 05:09:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1456586752. Throughput: 0: 56433.6. Samples: 1405897400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:09:13,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 05:09:14,656][47288] Updated weights for policy 0, policy_version 88906 (0.0029) [2024-04-26 05:09:15,113][47267] Signal inference workers to stop experience collection... (21200 times) [2024-04-26 05:09:15,150][47288] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-04-26 05:09:15,174][47267] Signal inference workers to resume experience collection... 
(21200 times) [2024-04-26 05:09:15,175][47288] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-04-26 05:09:17,503][47288] Updated weights for policy 0, policy_version 88916 (0.0032) [2024-04-26 05:09:18,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1456865280. Throughput: 0: 56458.1. Samples: 1406235540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:09:18,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 05:09:20,405][47288] Updated weights for policy 0, policy_version 88926 (0.0027) [2024-04-26 05:09:23,342][47288] Updated weights for policy 0, policy_version 88936 (0.0026) [2024-04-26 05:09:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1457143808. Throughput: 0: 56303.5. Samples: 1406569360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:09:23,924][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 05:09:26,212][47288] Updated weights for policy 0, policy_version 88946 (0.0031) [2024-04-26 05:09:28,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56797.8, 300 sec: 56483.2). Total num frames: 1457438720. Throughput: 0: 56310.8. Samples: 1406740380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:09:28,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 05:09:29,151][47288] Updated weights for policy 0, policy_version 88956 (0.0028) [2024-04-26 05:09:32,207][47288] Updated weights for policy 0, policy_version 88966 (0.0026) [2024-04-26 05:09:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 56483.1). Total num frames: 1457700864. Throughput: 0: 56199.9. Samples: 1407075540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 05:09:33,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 05:09:35,134][47288] Updated weights for policy 0, policy_version 88976 (0.0037) [2024-04-26 05:09:38,032][47288] Updated weights for policy 0, policy_version 88986 (0.0026) [2024-04-26 05:09:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1457995776. Throughput: 0: 56127.9. Samples: 1407409160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:09:38,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 05:09:40,785][47288] Updated weights for policy 0, policy_version 88996 (0.0031) [2024-04-26 05:09:43,815][47288] Updated weights for policy 0, policy_version 89006 (0.0027) [2024-04-26 05:09:43,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1458274304. Throughput: 0: 56181.5. Samples: 1407582820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:09:43,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:09:46,501][47288] Updated weights for policy 0, policy_version 89016 (0.0033) [2024-04-26 05:09:48,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1458536448. Throughput: 0: 55877.9. Samples: 1407914820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:09:48,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 05:09:49,605][47288] Updated weights for policy 0, policy_version 89026 (0.0026) [2024-04-26 05:09:52,374][47288] Updated weights for policy 0, policy_version 89036 (0.0027) [2024-04-26 05:09:53,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1458814976. Throughput: 0: 55982.5. Samples: 1408254760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:09:53,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 05:09:55,334][47288] Updated weights for policy 0, policy_version 89046 (0.0031) [2024-04-26 05:09:58,277][47288] Updated weights for policy 0, policy_version 89056 (0.0027) [2024-04-26 05:09:58,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1459142656. Throughput: 0: 56280.3. Samples: 1408430020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:09:58,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 05:10:01,138][47288] Updated weights for policy 0, policy_version 89066 (0.0026) [2024-04-26 05:10:03,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 1459404800. Throughput: 0: 56250.0. Samples: 1408766780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:10:03,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 05:10:03,927][47267] Signal inference workers to stop experience collection... (21250 times) [2024-04-26 05:10:03,931][47267] Signal inference workers to resume experience collection... (21250 times) [2024-04-26 05:10:03,968][47288] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-04-26 05:10:03,969][47288] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-04-26 05:10:04,037][47288] Updated weights for policy 0, policy_version 89076 (0.0035) [2024-04-26 05:10:06,915][47288] Updated weights for policy 0, policy_version 89086 (0.0032) [2024-04-26 05:10:08,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55978.5, 300 sec: 56372.0). Total num frames: 1459666944. Throughput: 0: 56410.6. Samples: 1409107840. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:10:08,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 05:10:09,837][47288] Updated weights for policy 0, policy_version 89096 (0.0028) [2024-04-26 05:10:12,552][47288] Updated weights for policy 0, policy_version 89106 (0.0027) [2024-04-26 05:10:13,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1459961856. Throughput: 0: 56344.7. Samples: 1409275900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:10:13,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 05:10:15,725][47288] Updated weights for policy 0, policy_version 89116 (0.0025) [2024-04-26 05:10:18,418][47288] Updated weights for policy 0, policy_version 89126 (0.0025) [2024-04-26 05:10:18,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1460240384. Throughput: 0: 56514.8. Samples: 1409618700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:10:18,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:10:21,472][47288] Updated weights for policy 0, policy_version 89136 (0.0029) [2024-04-26 05:10:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1460518912. Throughput: 0: 56628.3. Samples: 1409957440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:10:23,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 05:10:24,302][47288] Updated weights for policy 0, policy_version 89146 (0.0032) [2024-04-26 05:10:27,230][47288] Updated weights for policy 0, policy_version 89156 (0.0031) [2024-04-26 05:10:28,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1460797440. Throughput: 0: 56458.5. Samples: 1410123460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 05:10:28,923][47056] Avg episode reward: [(0, '0.379')] [2024-04-26 05:10:30,377][47288] Updated weights for policy 0, policy_version 89166 (0.0029) [2024-04-26 05:10:32,908][47288] Updated weights for policy 0, policy_version 89176 (0.0030) [2024-04-26 05:10:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1461092352. Throughput: 0: 56501.7. Samples: 1410457400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:10:33,923][47056] Avg episode reward: [(0, '0.584')] [2024-04-26 05:10:36,301][47288] Updated weights for policy 0, policy_version 89186 (0.0029) [2024-04-26 05:10:38,680][47288] Updated weights for policy 0, policy_version 89196 (0.0028) [2024-04-26 05:10:38,923][47056] Fps is (10 sec: 60621.4, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1461403648. Throughput: 0: 56578.9. Samples: 1410800800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:10:38,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 05:10:42,098][47288] Updated weights for policy 0, policy_version 89206 (0.0027) [2024-04-26 05:10:43,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.7, 300 sec: 56483.2). Total num frames: 1461665792. Throughput: 0: 56459.7. Samples: 1410970700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:10:43,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 05:10:44,414][47288] Updated weights for policy 0, policy_version 89216 (0.0027) [2024-04-26 05:10:47,833][47288] Updated weights for policy 0, policy_version 89226 (0.0026) [2024-04-26 05:10:48,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1461944320. Throughput: 0: 56487.0. Samples: 1411308700. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:10:48,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 05:10:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000089230_1461944320.pth... [2024-04-26 05:10:48,987][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000088404_1448411136.pth [2024-04-26 05:10:50,300][47288] Updated weights for policy 0, policy_version 89236 (0.0029) [2024-04-26 05:10:53,617][47288] Updated weights for policy 0, policy_version 89246 (0.0029) [2024-04-26 05:10:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1462222848. Throughput: 0: 56493.9. Samples: 1411650060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:10:53,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 05:10:55,927][47288] Updated weights for policy 0, policy_version 89256 (0.0028) [2024-04-26 05:10:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1462501376. Throughput: 0: 56527.2. Samples: 1411819620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:10:58,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 05:10:59,415][47288] Updated weights for policy 0, policy_version 89266 (0.0030) [2024-04-26 05:11:01,777][47288] Updated weights for policy 0, policy_version 89276 (0.0026) [2024-04-26 05:11:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1462779904. Throughput: 0: 56471.1. Samples: 1412159900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:11:03,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 05:11:05,151][47288] Updated weights for policy 0, policy_version 89286 (0.0030) [2024-04-26 05:11:07,604][47288] Updated weights for policy 0, policy_version 89296 (0.0025) [2024-04-26 05:11:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.9, 300 sec: 56427.6). 
Total num frames: 1463058432. Throughput: 0: 56377.5. Samples: 1412494420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:11:08,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 05:11:10,898][47288] Updated weights for policy 0, policy_version 89306 (0.0026) [2024-04-26 05:11:13,236][47288] Updated weights for policy 0, policy_version 89316 (0.0025) [2024-04-26 05:11:13,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1463353344. Throughput: 0: 56386.6. Samples: 1412660860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:11:13,924][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 05:11:16,759][47288] Updated weights for policy 0, policy_version 89326 (0.0030) [2024-04-26 05:11:18,923][47056] Fps is (10 sec: 60620.0, 60 sec: 57070.8, 300 sec: 56594.2). Total num frames: 1463664640. Throughput: 0: 56708.4. Samples: 1413009280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:11:18,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 05:11:18,962][47288] Updated weights for policy 0, policy_version 89336 (0.0023) [2024-04-26 05:11:22,715][47288] Updated weights for policy 0, policy_version 89346 (0.0026) [2024-04-26 05:11:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1463926784. Throughput: 0: 56471.3. Samples: 1413342020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:11:23,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 05:11:24,868][47288] Updated weights for policy 0, policy_version 89356 (0.0030) [2024-04-26 05:11:28,292][47267] Signal inference workers to stop experience collection... (21300 times) [2024-04-26 05:11:28,340][47288] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-04-26 05:11:28,350][47267] Signal inference workers to resume experience collection... 
(21300 times) [2024-04-26 05:11:28,355][47288] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-04-26 05:11:28,458][47288] Updated weights for policy 0, policy_version 89366 (0.0027) [2024-04-26 05:11:28,923][47056] Fps is (10 sec: 52429.6, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1464188928. Throughput: 0: 56400.9. Samples: 1413508740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:11:28,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 05:11:30,695][47288] Updated weights for policy 0, policy_version 89376 (0.0027) [2024-04-26 05:11:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1464467456. Throughput: 0: 56513.8. Samples: 1413851820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:11:33,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 05:11:34,104][47288] Updated weights for policy 0, policy_version 89386 (0.0035) [2024-04-26 05:11:36,863][47288] Updated weights for policy 0, policy_version 89396 (0.0032) [2024-04-26 05:11:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1464745984. Throughput: 0: 56480.9. Samples: 1414191700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:11:38,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 05:11:39,823][47288] Updated weights for policy 0, policy_version 89406 (0.0029) [2024-04-26 05:11:42,864][47288] Updated weights for policy 0, policy_version 89416 (0.0037) [2024-04-26 05:11:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1465024512. Throughput: 0: 56268.5. Samples: 1414351700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:11:43,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 05:11:45,786][47288] Updated weights for policy 0, policy_version 89426 (0.0028) [2024-04-26 05:11:48,722][47288] Updated weights for policy 0, policy_version 89436 (0.0034) [2024-04-26 05:11:48,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1465319424. Throughput: 0: 56224.2. Samples: 1414690000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:11:48,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 05:11:51,520][47288] Updated weights for policy 0, policy_version 89446 (0.0031) [2024-04-26 05:11:53,923][47056] Fps is (10 sec: 60620.9, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1465630720. Throughput: 0: 56264.3. Samples: 1415026320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:11:53,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 05:11:54,500][47288] Updated weights for policy 0, policy_version 89456 (0.0028) [2024-04-26 05:11:57,320][47288] Updated weights for policy 0, policy_version 89466 (0.0028) [2024-04-26 05:11:58,923][47056] Fps is (10 sec: 60621.1, 60 sec: 57070.9, 300 sec: 56594.2). Total num frames: 1465925632. Throughput: 0: 56667.6. Samples: 1415210900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:11:58,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 05:12:00,301][47288] Updated weights for policy 0, policy_version 89476 (0.0035) [2024-04-26 05:12:03,126][47288] Updated weights for policy 0, policy_version 89486 (0.0026) [2024-04-26 05:12:03,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 1466171392. Throughput: 0: 56393.7. Samples: 1415547000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:12:03,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 05:12:06,096][47288] Updated weights for policy 0, policy_version 89496 (0.0025) [2024-04-26 05:12:08,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1466449920. Throughput: 0: 56525.4. Samples: 1415885660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:12:08,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 05:12:09,120][47288] Updated weights for policy 0, policy_version 89506 (0.0028) [2024-04-26 05:12:11,731][47288] Updated weights for policy 0, policy_version 89516 (0.0026) [2024-04-26 05:12:13,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1466728448. Throughput: 0: 56425.0. Samples: 1416047880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:12:13,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 05:12:14,959][47288] Updated weights for policy 0, policy_version 89526 (0.0034) [2024-04-26 05:12:17,878][47288] Updated weights for policy 0, policy_version 89536 (0.0025) [2024-04-26 05:12:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1467023360. Throughput: 0: 56392.3. Samples: 1416389480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:12:18,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 05:12:20,674][47288] Updated weights for policy 0, policy_version 89546 (0.0028) [2024-04-26 05:12:21,407][47267] Signal inference workers to stop experience collection... (21350 times) [2024-04-26 05:12:21,411][47267] Signal inference workers to resume experience collection... 
(21350 times) [2024-04-26 05:12:21,438][47288] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-04-26 05:12:21,438][47288] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-04-26 05:12:23,598][47288] Updated weights for policy 0, policy_version 89556 (0.0033) [2024-04-26 05:12:23,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1467301888. Throughput: 0: 56446.6. Samples: 1416731800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:12:23,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 05:12:26,421][47288] Updated weights for policy 0, policy_version 89566 (0.0028) [2024-04-26 05:12:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1467580416. Throughput: 0: 56575.1. Samples: 1416897580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:12:28,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 05:12:29,427][47288] Updated weights for policy 0, policy_version 89576 (0.0032) [2024-04-26 05:12:32,212][47288] Updated weights for policy 0, policy_version 89586 (0.0032) [2024-04-26 05:12:33,923][47056] Fps is (10 sec: 58982.9, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1467891712. Throughput: 0: 56548.2. Samples: 1417234660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:12:33,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 05:12:35,220][47288] Updated weights for policy 0, policy_version 89596 (0.0030) [2024-04-26 05:12:38,135][47288] Updated weights for policy 0, policy_version 89606 (0.0026) [2024-04-26 05:12:38,923][47056] Fps is (10 sec: 60620.9, 60 sec: 57344.0, 300 sec: 56538.7). Total num frames: 1468186624. Throughput: 0: 56588.0. Samples: 1417572780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:12:38,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 05:12:41,151][47288] Updated weights for policy 0, policy_version 89616 (0.0028) [2024-04-26 05:12:43,837][47288] Updated weights for policy 0, policy_version 89626 (0.0032) [2024-04-26 05:12:43,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1468432384. Throughput: 0: 56428.0. Samples: 1417750160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:12:43,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 05:12:46,863][47288] Updated weights for policy 0, policy_version 89636 (0.0032) [2024-04-26 05:12:48,923][47056] Fps is (10 sec: 50790.4, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1468694528. Throughput: 0: 56397.9. Samples: 1418084900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:12:48,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 05:12:49,039][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000089644_1468727296.pth... [2024-04-26 05:12:49,091][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000088817_1455177728.pth [2024-04-26 05:12:49,513][47288] Updated weights for policy 0, policy_version 89646 (0.0026) [2024-04-26 05:12:52,685][47288] Updated weights for policy 0, policy_version 89656 (0.0022) [2024-04-26 05:12:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1468989440. Throughput: 0: 56386.7. Samples: 1418423060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:12:53,924][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 05:12:55,436][47288] Updated weights for policy 0, policy_version 89666 (0.0029) [2024-04-26 05:12:58,624][47288] Updated weights for policy 0, policy_version 89676 (0.0027) [2024-04-26 05:12:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 56427.6). 
Total num frames: 1469267968. Throughput: 0: 56265.1. Samples: 1418579800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:12:58,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 05:13:01,227][47288] Updated weights for policy 0, policy_version 89686 (0.0029) [2024-04-26 05:13:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1469546496. Throughput: 0: 56254.6. Samples: 1418920940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:13:03,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 05:13:04,354][47288] Updated weights for policy 0, policy_version 89696 (0.0032) [2024-04-26 05:13:07,028][47288] Updated weights for policy 0, policy_version 89706 (0.0033) [2024-04-26 05:13:08,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 1469857792. Throughput: 0: 56123.6. Samples: 1419257360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 05:13:08,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 05:13:10,078][47288] Updated weights for policy 0, policy_version 89716 (0.0028) [2024-04-26 05:13:12,797][47288] Updated weights for policy 0, policy_version 89726 (0.0028) [2024-04-26 05:13:13,923][47056] Fps is (10 sec: 60621.5, 60 sec: 57071.1, 300 sec: 56538.7). Total num frames: 1470152704. Throughput: 0: 56474.7. Samples: 1419438940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:13,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 05:13:15,850][47288] Updated weights for policy 0, policy_version 89736 (0.0029) [2024-04-26 05:13:18,578][47288] Updated weights for policy 0, policy_version 89746 (0.0031) [2024-04-26 05:13:18,923][47056] Fps is (10 sec: 55704.4, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1470414848. Throughput: 0: 56602.8. Samples: 1419781800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:18,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 05:13:21,707][47288] Updated weights for policy 0, policy_version 89756 (0.0031) [2024-04-26 05:13:23,923][47056] Fps is (10 sec: 50791.2, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1470660608. Throughput: 0: 56488.2. Samples: 1420114740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:23,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:13:24,075][47267] Signal inference workers to stop experience collection... (21400 times) [2024-04-26 05:13:24,075][47267] Signal inference workers to resume experience collection... (21400 times) [2024-04-26 05:13:24,086][47288] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-04-26 05:13:24,086][47288] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-04-26 05:13:24,463][47288] Updated weights for policy 0, policy_version 89766 (0.0025) [2024-04-26 05:13:27,457][47288] Updated weights for policy 0, policy_version 89776 (0.0027) [2024-04-26 05:13:28,926][47056] Fps is (10 sec: 54052.1, 60 sec: 56249.0, 300 sec: 56316.0). Total num frames: 1470955520. Throughput: 0: 56216.8. Samples: 1420280080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:28,926][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 05:13:30,260][47288] Updated weights for policy 0, policy_version 89786 (0.0028) [2024-04-26 05:13:33,262][47288] Updated weights for policy 0, policy_version 89796 (0.0029) [2024-04-26 05:13:33,923][47056] Fps is (10 sec: 58981.6, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1471250432. Throughput: 0: 56399.1. Samples: 1420622860. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:33,928][47056] Avg episode reward: [(0, '0.365')] [2024-04-26 05:13:36,133][47288] Updated weights for policy 0, policy_version 89806 (0.0025) [2024-04-26 05:13:38,923][47056] Fps is (10 sec: 55722.4, 60 sec: 55432.6, 300 sec: 56372.1). Total num frames: 1471512576. Throughput: 0: 56357.0. Samples: 1420959120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:38,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 05:13:39,168][47288] Updated weights for policy 0, policy_version 89816 (0.0024) [2024-04-26 05:13:41,905][47288] Updated weights for policy 0, policy_version 89826 (0.0034) [2024-04-26 05:13:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1471807488. Throughput: 0: 56528.0. Samples: 1421123560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:43,924][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 05:13:45,109][47288] Updated weights for policy 0, policy_version 89836 (0.0028) [2024-04-26 05:13:47,783][47288] Updated weights for policy 0, policy_version 89846 (0.0031) [2024-04-26 05:13:48,923][47056] Fps is (10 sec: 60621.4, 60 sec: 57071.1, 300 sec: 56483.2). Total num frames: 1472118784. Throughput: 0: 56400.8. Samples: 1421458960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:48,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 05:13:50,830][47288] Updated weights for policy 0, policy_version 89856 (0.0027) [2024-04-26 05:13:53,705][47288] Updated weights for policy 0, policy_version 89866 (0.0031) [2024-04-26 05:13:53,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1472380928. Throughput: 0: 56478.2. Samples: 1421798880. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:53,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 05:13:56,501][47288] Updated weights for policy 0, policy_version 89876 (0.0030) [2024-04-26 05:13:58,923][47056] Fps is (10 sec: 52428.0, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1472643072. Throughput: 0: 56152.0. Samples: 1421965780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:13:58,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 05:13:59,525][47288] Updated weights for policy 0, policy_version 89886 (0.0032) [2024-04-26 05:14:02,268][47288] Updated weights for policy 0, policy_version 89896 (0.0034) [2024-04-26 05:14:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1472937984. Throughput: 0: 56002.4. Samples: 1422301900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 05:14:03,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 05:14:05,193][47288] Updated weights for policy 0, policy_version 89906 (0.0027) [2024-04-26 05:14:08,138][47288] Updated weights for policy 0, policy_version 89916 (0.0027) [2024-04-26 05:14:08,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1473216512. Throughput: 0: 56097.7. Samples: 1422639140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:08,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 05:14:11,055][47288] Updated weights for policy 0, policy_version 89926 (0.0029) [2024-04-26 05:14:13,897][47288] Updated weights for policy 0, policy_version 89936 (0.0036) [2024-04-26 05:14:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1473511424. Throughput: 0: 56063.3. Samples: 1422802760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:13,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 05:14:16,944][47288] Updated weights for policy 0, policy_version 89946 (0.0036) [2024-04-26 05:14:18,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1473773568. Throughput: 0: 55898.6. Samples: 1423138300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:18,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 05:14:19,769][47288] Updated weights for policy 0, policy_version 89956 (0.0029) [2024-04-26 05:14:22,778][47288] Updated weights for policy 0, policy_version 89966 (0.0027) [2024-04-26 05:14:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1474068480. Throughput: 0: 55928.4. Samples: 1423475900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:23,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 05:14:25,778][47288] Updated weights for policy 0, policy_version 89976 (0.0030) [2024-04-26 05:14:26,684][47267] Signal inference workers to stop experience collection... (21450 times) [2024-04-26 05:14:26,734][47288] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-04-26 05:14:26,740][47267] Signal inference workers to resume experience collection... (21450 times) [2024-04-26 05:14:26,749][47288] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-04-26 05:14:28,630][47288] Updated weights for policy 0, policy_version 89986 (0.0029) [2024-04-26 05:14:28,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56800.6, 300 sec: 56483.1). Total num frames: 1474363392. Throughput: 0: 56277.3. Samples: 1423656040. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:28,924][47056] Avg episode reward: [(0, '0.334')] [2024-04-26 05:14:31,439][47288] Updated weights for policy 0, policy_version 89996 (0.0028) [2024-04-26 05:14:33,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1474609152. Throughput: 0: 56366.2. Samples: 1423995440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:33,923][47056] Avg episode reward: [(0, '0.346')] [2024-04-26 05:14:34,327][47288] Updated weights for policy 0, policy_version 90006 (0.0033) [2024-04-26 05:14:37,143][47288] Updated weights for policy 0, policy_version 90016 (0.0035) [2024-04-26 05:14:38,923][47056] Fps is (10 sec: 52429.3, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1474887680. Throughput: 0: 56254.3. Samples: 1424330320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:38,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 05:14:40,221][47288] Updated weights for policy 0, policy_version 90026 (0.0033) [2024-04-26 05:14:42,989][47288] Updated weights for policy 0, policy_version 90036 (0.0026) [2024-04-26 05:14:43,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1475166208. Throughput: 0: 56259.1. Samples: 1424497440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:43,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 05:14:46,100][47288] Updated weights for policy 0, policy_version 90046 (0.0026) [2024-04-26 05:14:48,788][47288] Updated weights for policy 0, policy_version 90056 (0.0034) [2024-04-26 05:14:48,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55978.5, 300 sec: 56483.2). Total num frames: 1475477504. Throughput: 0: 56293.3. Samples: 1424835100. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:48,924][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 05:14:48,974][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000090057_1475493888.pth... [2024-04-26 05:14:49,023][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000089230_1461944320.pth [2024-04-26 05:14:51,836][47288] Updated weights for policy 0, policy_version 90066 (0.0032) [2024-04-26 05:14:53,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56251.8, 300 sec: 56316.6). Total num frames: 1475756032. Throughput: 0: 56211.6. Samples: 1425168660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:53,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 05:14:54,727][47288] Updated weights for policy 0, policy_version 90076 (0.0032) [2024-04-26 05:14:57,561][47288] Updated weights for policy 0, policy_version 90086 (0.0026) [2024-04-26 05:14:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1476018176. Throughput: 0: 56380.8. Samples: 1425339900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 05:14:58,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 05:15:00,453][47288] Updated weights for policy 0, policy_version 90096 (0.0025) [2024-04-26 05:15:03,426][47288] Updated weights for policy 0, policy_version 90106 (0.0028) [2024-04-26 05:15:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1476313088. Throughput: 0: 56489.0. Samples: 1425680300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:03,932][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 05:15:06,293][47288] Updated weights for policy 0, policy_version 90116 (0.0029) [2024-04-26 05:15:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1476591616. Throughput: 0: 56569.8. Samples: 1426021540. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:08,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 05:15:09,214][47288] Updated weights for policy 0, policy_version 90126 (0.0026) [2024-04-26 05:15:12,035][47288] Updated weights for policy 0, policy_version 90136 (0.0030) [2024-04-26 05:15:13,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1476853760. Throughput: 0: 56234.8. Samples: 1426186600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:13,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 05:15:14,934][47288] Updated weights for policy 0, policy_version 90146 (0.0028) [2024-04-26 05:15:17,979][47288] Updated weights for policy 0, policy_version 90156 (0.0028) [2024-04-26 05:15:18,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1477132288. Throughput: 0: 56191.3. Samples: 1426524060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:18,924][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 05:15:20,714][47288] Updated weights for policy 0, policy_version 90166 (0.0026) [2024-04-26 05:15:23,720][47288] Updated weights for policy 0, policy_version 90176 (0.0025) [2024-04-26 05:15:23,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1477443584. Throughput: 0: 56343.4. Samples: 1426865780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:23,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:15:26,784][47288] Updated weights for policy 0, policy_version 90186 (0.0033) [2024-04-26 05:15:28,923][47056] Fps is (10 sec: 60621.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1477738496. Throughput: 0: 56344.4. Samples: 1427032940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:28,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 05:15:29,482][47288] Updated weights for policy 0, policy_version 90196 (0.0031) [2024-04-26 05:15:32,736][47288] Updated weights for policy 0, policy_version 90206 (0.0029) [2024-04-26 05:15:33,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1478017024. Throughput: 0: 56480.0. Samples: 1427376700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:33,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 05:15:35,147][47288] Updated weights for policy 0, policy_version 90216 (0.0027) [2024-04-26 05:15:38,460][47288] Updated weights for policy 0, policy_version 90226 (0.0026) [2024-04-26 05:15:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1478279168. Throughput: 0: 56656.8. Samples: 1427718220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:38,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 05:15:39,689][47267] Signal inference workers to stop experience collection... (21500 times) [2024-04-26 05:15:39,689][47267] Signal inference workers to resume experience collection... (21500 times) [2024-04-26 05:15:39,715][47288] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-04-26 05:15:39,716][47288] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-04-26 05:15:41,085][47288] Updated weights for policy 0, policy_version 90236 (0.0024) [2024-04-26 05:15:43,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1478574080. Throughput: 0: 56598.6. Samples: 1427886840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:43,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 05:15:44,184][47288] Updated weights for policy 0, policy_version 90246 (0.0035) [2024-04-26 05:15:46,934][47288] Updated weights for policy 0, policy_version 90256 (0.0031) [2024-04-26 05:15:48,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1478852608. Throughput: 0: 56637.2. Samples: 1428228980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:48,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 05:15:49,862][47288] Updated weights for policy 0, policy_version 90266 (0.0029) [2024-04-26 05:15:52,907][47288] Updated weights for policy 0, policy_version 90276 (0.0032) [2024-04-26 05:15:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.5, 300 sec: 56372.1). Total num frames: 1479131136. Throughput: 0: 56596.7. Samples: 1428568400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 05:15:53,923][47056] Avg episode reward: [(0, '0.370')] [2024-04-26 05:15:55,700][47288] Updated weights for policy 0, policy_version 90286 (0.0034) [2024-04-26 05:15:58,685][47288] Updated weights for policy 0, policy_version 90296 (0.0030) [2024-04-26 05:15:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1479409664. Throughput: 0: 56495.1. Samples: 1428728880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:15:58,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 05:16:01,325][47288] Updated weights for policy 0, policy_version 90306 (0.0026) [2024-04-26 05:16:03,923][47056] Fps is (10 sec: 57345.3, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1479704576. Throughput: 0: 56654.6. Samples: 1429073500. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:03,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 05:16:04,391][47288] Updated weights for policy 0, policy_version 90316 (0.0025) [2024-04-26 05:16:07,142][47288] Updated weights for policy 0, policy_version 90326 (0.0026) [2024-04-26 05:16:08,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1479999488. Throughput: 0: 56513.4. Samples: 1429408880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:08,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 05:16:10,042][47288] Updated weights for policy 0, policy_version 90336 (0.0028) [2024-04-26 05:16:13,093][47288] Updated weights for policy 0, policy_version 90346 (0.0035) [2024-04-26 05:16:13,923][47056] Fps is (10 sec: 57343.6, 60 sec: 57070.9, 300 sec: 56316.6). Total num frames: 1480278016. Throughput: 0: 56801.9. Samples: 1429589020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:13,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 05:16:15,649][47288] Updated weights for policy 0, policy_version 90356 (0.0032) [2024-04-26 05:16:18,868][47288] Updated weights for policy 0, policy_version 90366 (0.0030) [2024-04-26 05:16:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 57071.0, 300 sec: 56372.1). Total num frames: 1480556544. Throughput: 0: 56644.8. Samples: 1429925720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:18,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 05:16:21,399][47288] Updated weights for policy 0, policy_version 90376 (0.0027) [2024-04-26 05:16:23,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 1480851456. Throughput: 0: 56511.4. Samples: 1430261240. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:23,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 05:16:24,543][47288] Updated weights for policy 0, policy_version 90386 (0.0029) [2024-04-26 05:16:27,411][47288] Updated weights for policy 0, policy_version 90396 (0.0026) [2024-04-26 05:16:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1481129984. Throughput: 0: 56606.4. Samples: 1430434120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:28,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 05:16:30,139][47288] Updated weights for policy 0, policy_version 90406 (0.0027) [2024-04-26 05:16:33,350][47288] Updated weights for policy 0, policy_version 90416 (0.0030) [2024-04-26 05:16:33,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1481392128. Throughput: 0: 56698.7. Samples: 1430780420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:33,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 05:16:35,913][47288] Updated weights for policy 0, policy_version 90426 (0.0025) [2024-04-26 05:16:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 1481687040. Throughput: 0: 56535.3. Samples: 1431112480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:38,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 05:16:38,998][47288] Updated weights for policy 0, policy_version 90436 (0.0031) [2024-04-26 05:16:41,693][47288] Updated weights for policy 0, policy_version 90446 (0.0030) [2024-04-26 05:16:43,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1481965568. Throughput: 0: 56742.4. Samples: 1431282300. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:43,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 05:16:44,862][47288] Updated weights for policy 0, policy_version 90456 (0.0029) [2024-04-26 05:16:45,704][47267] Signal inference workers to stop experience collection... (21550 times) [2024-04-26 05:16:45,705][47267] Signal inference workers to resume experience collection... (21550 times) [2024-04-26 05:16:45,722][47288] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-04-26 05:16:45,722][47288] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-04-26 05:16:47,502][47288] Updated weights for policy 0, policy_version 90466 (0.0030) [2024-04-26 05:16:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1482260480. Throughput: 0: 56523.3. Samples: 1431617060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 05:16:48,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 05:16:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000090470_1482260480.pth... [2024-04-26 05:16:48,977][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000089644_1468727296.pth [2024-04-26 05:16:50,810][47288] Updated weights for policy 0, policy_version 90476 (0.0033) [2024-04-26 05:16:53,505][47288] Updated weights for policy 0, policy_version 90486 (0.0031) [2024-04-26 05:16:53,923][47056] Fps is (10 sec: 55706.8, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1482522624. Throughput: 0: 56621.4. Samples: 1431956840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:16:53,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 05:16:56,438][47288] Updated weights for policy 0, policy_version 90496 (0.0027) [2024-04-26 05:16:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 57070.8, 300 sec: 56483.1). Total num frames: 1482833920. Throughput: 0: 56408.7. Samples: 1432127420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:16:58,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 05:16:59,182][47288] Updated weights for policy 0, policy_version 90506 (0.0024) [2024-04-26 05:17:02,333][47288] Updated weights for policy 0, policy_version 90516 (0.0032) [2024-04-26 05:17:03,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56797.8, 300 sec: 56483.2). Total num frames: 1483112448. Throughput: 0: 56516.6. Samples: 1432468960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:17:03,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 05:17:04,957][47288] Updated weights for policy 0, policy_version 90526 (0.0032) [2024-04-26 05:17:08,201][47288] Updated weights for policy 0, policy_version 90536 (0.0027) [2024-04-26 05:17:08,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1483374592. Throughput: 0: 56614.1. Samples: 1432808880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:17:08,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 05:17:10,578][47288] Updated weights for policy 0, policy_version 90546 (0.0028) [2024-04-26 05:17:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1483653120. Throughput: 0: 56604.0. Samples: 1432981300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:17:13,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 05:17:14,007][47288] Updated weights for policy 0, policy_version 90556 (0.0026) [2024-04-26 05:17:16,257][47288] Updated weights for policy 0, policy_version 90566 (0.0028) [2024-04-26 05:17:18,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1483948032. Throughput: 0: 56440.0. Samples: 1433320220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:17:18,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 05:17:19,897][47288] Updated weights for policy 0, policy_version 90576 (0.0034) [2024-04-26 05:17:22,145][47288] Updated weights for policy 0, policy_version 90586 (0.0027) [2024-04-26 05:17:23,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1484242944. Throughput: 0: 56505.6. Samples: 1433655240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:17:23,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 05:17:25,752][47288] Updated weights for policy 0, policy_version 90596 (0.0026) [2024-04-26 05:17:27,935][47288] Updated weights for policy 0, policy_version 90606 (0.0029) [2024-04-26 05:17:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1484521472. Throughput: 0: 56569.5. Samples: 1433827920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:17:28,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 05:17:31,487][47288] Updated weights for policy 0, policy_version 90616 (0.0028) [2024-04-26 05:17:33,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1484800000. Throughput: 0: 56552.8. Samples: 1434161940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:17:33,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 05:17:34,173][47288] Updated weights for policy 0, policy_version 90626 (0.0027) [2024-04-26 05:17:37,370][47288] Updated weights for policy 0, policy_version 90636 (0.0040) [2024-04-26 05:17:38,081][47267] Signal inference workers to stop experience collection... (21600 times) [2024-04-26 05:17:38,132][47288] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-04-26 05:17:38,168][47267] Signal inference workers to resume experience collection... 
(21600 times) [2024-04-26 05:17:38,168][47288] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-04-26 05:17:38,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1485094912. Throughput: 0: 56452.4. Samples: 1434497200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:17:38,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 05:17:40,018][47288] Updated weights for policy 0, policy_version 90646 (0.0030) [2024-04-26 05:17:43,220][47288] Updated weights for policy 0, policy_version 90656 (0.0028) [2024-04-26 05:17:43,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56798.1, 300 sec: 56538.7). Total num frames: 1485373440. Throughput: 0: 56498.8. Samples: 1434669860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 05:17:43,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 05:17:45,700][47288] Updated weights for policy 0, policy_version 90666 (0.0033) [2024-04-26 05:17:48,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1485619200. Throughput: 0: 56509.7. Samples: 1435011900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:17:48,923][47056] Avg episode reward: [(0, '0.590')] [2024-04-26 05:17:48,956][47288] Updated weights for policy 0, policy_version 90676 (0.0028) [2024-04-26 05:17:51,532][47288] Updated weights for policy 0, policy_version 90686 (0.0025) [2024-04-26 05:17:53,923][47056] Fps is (10 sec: 52428.5, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1485897728. Throughput: 0: 56450.8. Samples: 1435349160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:17:53,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 05:17:54,987][47288] Updated weights for policy 0, policy_version 90696 (0.0032) [2024-04-26 05:17:57,354][47288] Updated weights for policy 0, policy_version 90706 (0.0030) [2024-04-26 05:17:58,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.8, 300 sec: 56483.2). 
Total num frames: 1486209024. Throughput: 0: 56139.0. Samples: 1435507560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:17:58,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 05:18:00,874][47288] Updated weights for policy 0, policy_version 90716 (0.0031) [2024-04-26 05:18:03,035][47288] Updated weights for policy 0, policy_version 90726 (0.0033) [2024-04-26 05:18:03,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1486471168. Throughput: 0: 56081.5. Samples: 1435843880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:18:03,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 05:18:06,690][47288] Updated weights for policy 0, policy_version 90736 (0.0028) [2024-04-26 05:18:08,712][47288] Updated weights for policy 0, policy_version 90746 (0.0027) [2024-04-26 05:18:08,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56798.0, 300 sec: 56372.1). Total num frames: 1486782464. Throughput: 0: 56194.2. Samples: 1436183980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:18:08,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 05:18:12,429][47288] Updated weights for policy 0, policy_version 90756 (0.0028) [2024-04-26 05:18:13,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56797.9, 300 sec: 56427.7). Total num frames: 1487060992. Throughput: 0: 56320.6. Samples: 1436362340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:18:13,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 05:18:14,521][47288] Updated weights for policy 0, policy_version 90766 (0.0036) [2024-04-26 05:18:18,207][47288] Updated weights for policy 0, policy_version 90776 (0.0029) [2024-04-26 05:18:18,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1487339520. Throughput: 0: 56418.8. Samples: 1436700780. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:18:18,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 05:18:20,406][47288] Updated weights for policy 0, policy_version 90786 (0.0031) [2024-04-26 05:18:23,923][47056] Fps is (10 sec: 50790.0, 60 sec: 55432.6, 300 sec: 56317.1). Total num frames: 1487568896. Throughput: 0: 56554.3. Samples: 1437042140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:18:23,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 05:18:24,117][47288] Updated weights for policy 0, policy_version 90796 (0.0032) [2024-04-26 05:18:26,413][47288] Updated weights for policy 0, policy_version 90806 (0.0028) [2024-04-26 05:18:28,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1487880192. Throughput: 0: 56205.7. Samples: 1437199120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:18:28,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 05:18:29,756][47288] Updated weights for policy 0, policy_version 90816 (0.0026) [2024-04-26 05:18:31,525][47267] Signal inference workers to stop experience collection... (21650 times) [2024-04-26 05:18:31,525][47267] Signal inference workers to resume experience collection... (21650 times) [2024-04-26 05:18:31,548][47288] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-04-26 05:18:31,549][47288] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-04-26 05:18:32,274][47288] Updated weights for policy 0, policy_version 90826 (0.0028) [2024-04-26 05:18:33,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 1488175104. Throughput: 0: 56275.5. Samples: 1437544300. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:18:33,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 05:18:35,439][47288] Updated weights for policy 0, policy_version 90836 (0.0029) [2024-04-26 05:18:38,014][47288] Updated weights for policy 0, policy_version 90846 (0.0028) [2024-04-26 05:18:38,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 1488470016. Throughput: 0: 56426.0. Samples: 1437888320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 05:18:38,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 05:18:41,100][47288] Updated weights for policy 0, policy_version 90856 (0.0035) [2024-04-26 05:18:43,617][47288] Updated weights for policy 0, policy_version 90866 (0.0028) [2024-04-26 05:18:43,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1488748544. Throughput: 0: 56636.5. Samples: 1438056200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:18:43,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 05:18:46,918][47288] Updated weights for policy 0, policy_version 90876 (0.0028) [2024-04-26 05:18:48,923][47056] Fps is (10 sec: 57343.4, 60 sec: 57070.9, 300 sec: 56483.1). Total num frames: 1489043456. Throughput: 0: 56720.8. Samples: 1438396320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:18:48,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 05:18:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000090884_1489043456.pth... [2024-04-26 05:18:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000090057_1475493888.pth [2024-04-26 05:18:49,262][47288] Updated weights for policy 0, policy_version 90886 (0.0028) [2024-04-26 05:18:52,669][47288] Updated weights for policy 0, policy_version 90896 (0.0034) [2024-04-26 05:18:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 57071.0, 300 sec: 56538.7). 
Total num frames: 1489321984. Throughput: 0: 56610.3. Samples: 1438731440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:18:53,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 05:18:55,200][47288] Updated weights for policy 0, policy_version 90906 (0.0030) [2024-04-26 05:18:58,557][47288] Updated weights for policy 0, policy_version 90916 (0.0037) [2024-04-26 05:18:58,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1489584128. Throughput: 0: 56436.9. Samples: 1438902000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:18:58,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 05:19:01,263][47288] Updated weights for policy 0, policy_version 90926 (0.0027) [2024-04-26 05:19:03,923][47056] Fps is (10 sec: 52428.7, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1489846272. Throughput: 0: 56486.3. Samples: 1439242660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:19:03,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 05:19:04,409][47288] Updated weights for policy 0, policy_version 90936 (0.0032) [2024-04-26 05:19:06,930][47288] Updated weights for policy 0, policy_version 90946 (0.0028) [2024-04-26 05:19:08,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55978.6, 300 sec: 56372.0). Total num frames: 1490141184. Throughput: 0: 56517.6. Samples: 1439585440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:19:08,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 05:19:10,087][47288] Updated weights for policy 0, policy_version 90956 (0.0030) [2024-04-26 05:19:12,585][47288] Updated weights for policy 0, policy_version 90966 (0.0026) [2024-04-26 05:19:13,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56251.7, 300 sec: 56483.2). Total num frames: 1490436096. Throughput: 0: 56687.7. Samples: 1439750060. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:19:13,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:19:15,850][47288] Updated weights for policy 0, policy_version 90976 (0.0030) [2024-04-26 05:19:18,488][47288] Updated weights for policy 0, policy_version 90986 (0.0027) [2024-04-26 05:19:18,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1490731008. Throughput: 0: 56575.6. Samples: 1440090200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:19:18,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 05:19:21,618][47288] Updated weights for policy 0, policy_version 90996 (0.0032) [2024-04-26 05:19:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 57344.0, 300 sec: 56427.6). Total num frames: 1491009536. Throughput: 0: 56469.7. Samples: 1440429460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:19:23,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 05:19:24,204][47288] Updated weights for policy 0, policy_version 91006 (0.0029) [2024-04-26 05:19:26,752][47267] Signal inference workers to stop experience collection... (21700 times) [2024-04-26 05:19:26,791][47288] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-04-26 05:19:26,839][47267] Signal inference workers to resume experience collection... (21700 times) [2024-04-26 05:19:26,839][47288] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-04-26 05:19:27,374][47288] Updated weights for policy 0, policy_version 91016 (0.0023) [2024-04-26 05:19:28,923][47056] Fps is (10 sec: 57342.6, 60 sec: 57070.7, 300 sec: 56594.2). Total num frames: 1491304448. Throughput: 0: 56675.7. Samples: 1440606620. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:19:28,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 05:19:30,010][47288] Updated weights for policy 0, policy_version 91026 (0.0027) [2024-04-26 05:19:33,222][47288] Updated weights for policy 0, policy_version 91036 (0.0026) [2024-04-26 05:19:33,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1491582976. Throughput: 0: 56661.2. Samples: 1440946080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 05:19:33,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 05:19:35,697][47288] Updated weights for policy 0, policy_version 91046 (0.0030) [2024-04-26 05:19:38,923][47056] Fps is (10 sec: 52429.8, 60 sec: 55978.5, 300 sec: 56483.1). Total num frames: 1491828736. Throughput: 0: 56640.8. Samples: 1441280280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:19:38,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 05:19:39,044][47288] Updated weights for policy 0, policy_version 91056 (0.0034) [2024-04-26 05:19:41,597][47288] Updated weights for policy 0, policy_version 91066 (0.0027) [2024-04-26 05:19:43,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1492123648. Throughput: 0: 56492.3. Samples: 1441444160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:19:43,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 05:19:44,784][47288] Updated weights for policy 0, policy_version 91076 (0.0027) [2024-04-26 05:19:47,273][47288] Updated weights for policy 0, policy_version 91086 (0.0034) [2024-04-26 05:19:48,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1492402176. Throughput: 0: 56369.4. Samples: 1441779280. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:19:48,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 05:19:50,433][47288] Updated weights for policy 0, policy_version 91096 (0.0030) [2024-04-26 05:19:52,984][47288] Updated weights for policy 0, policy_version 91106 (0.0029) [2024-04-26 05:19:53,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56251.6, 300 sec: 56538.7). Total num frames: 1492697088. Throughput: 0: 56354.6. Samples: 1442121400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:19:53,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 05:19:56,332][47288] Updated weights for policy 0, policy_version 91116 (0.0039) [2024-04-26 05:19:58,877][47288] Updated weights for policy 0, policy_version 91126 (0.0028) [2024-04-26 05:19:58,923][47056] Fps is (10 sec: 60620.0, 60 sec: 57070.8, 300 sec: 56594.2). Total num frames: 1493008384. Throughput: 0: 56478.1. Samples: 1442291580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:19:58,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 05:20:02,264][47288] Updated weights for policy 0, policy_version 91136 (0.0033) [2024-04-26 05:20:03,923][47056] Fps is (10 sec: 57345.4, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1493270528. Throughput: 0: 56386.7. Samples: 1442627600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:20:03,923][47056] Avg episode reward: [(0, '0.593')] [2024-04-26 05:20:04,758][47288] Updated weights for policy 0, policy_version 91146 (0.0028) [2024-04-26 05:20:07,883][47288] Updated weights for policy 0, policy_version 91156 (0.0024) [2024-04-26 05:20:08,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 1493549056. Throughput: 0: 56347.8. Samples: 1442965120. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:20:08,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 05:20:10,708][47288] Updated weights for policy 0, policy_version 91166 (0.0023) [2024-04-26 05:20:13,578][47288] Updated weights for policy 0, policy_version 91176 (0.0029) [2024-04-26 05:20:13,923][47056] Fps is (10 sec: 55704.3, 60 sec: 56524.6, 300 sec: 56594.2). Total num frames: 1493827584. Throughput: 0: 56388.1. Samples: 1443144080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:20:13,924][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 05:20:16,470][47288] Updated weights for policy 0, policy_version 91186 (0.0036) [2024-04-26 05:20:18,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 1494106112. Throughput: 0: 56361.9. Samples: 1443482360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:20:18,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 05:20:19,378][47288] Updated weights for policy 0, policy_version 91196 (0.0031) [2024-04-26 05:20:22,450][47288] Updated weights for policy 0, policy_version 91206 (0.0029) [2024-04-26 05:20:23,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1494368256. Throughput: 0: 56479.1. Samples: 1443821840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:20:23,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 05:20:25,357][47267] Signal inference workers to stop experience collection... (21750 times) [2024-04-26 05:20:25,412][47288] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-04-26 05:20:25,417][47267] Signal inference workers to resume experience collection... 
(21750 times) [2024-04-26 05:20:25,422][47288] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-04-26 05:20:25,425][47288] Updated weights for policy 0, policy_version 91216 (0.0027) [2024-04-26 05:20:28,283][47288] Updated weights for policy 0, policy_version 91226 (0.0029) [2024-04-26 05:20:28,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55705.8, 300 sec: 56372.1). Total num frames: 1494646784. Throughput: 0: 56343.1. Samples: 1443979600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 05:20:28,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 05:20:31,071][47288] Updated weights for policy 0, policy_version 91236 (0.0037) [2024-04-26 05:20:33,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 1494958080. Throughput: 0: 56380.7. Samples: 1444316420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:20:33,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 05:20:34,121][47288] Updated weights for policy 0, policy_version 91246 (0.0030) [2024-04-26 05:20:36,720][47288] Updated weights for policy 0, policy_version 91256 (0.0025) [2024-04-26 05:20:38,923][47056] Fps is (10 sec: 60621.1, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1495252992. Throughput: 0: 56436.3. Samples: 1444661020. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:20:38,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 05:20:39,787][47288] Updated weights for policy 0, policy_version 91266 (0.0028) [2024-04-26 05:20:42,778][47288] Updated weights for policy 0, policy_version 91276 (0.0029) [2024-04-26 05:20:43,923][47056] Fps is (10 sec: 58982.9, 60 sec: 57070.9, 300 sec: 56594.2). Total num frames: 1495547904. Throughput: 0: 56545.9. Samples: 1444836140. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:20:43,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 05:20:45,610][47288] Updated weights for policy 0, policy_version 91286 (0.0034) [2024-04-26 05:20:48,531][47288] Updated weights for policy 0, policy_version 91296 (0.0026) [2024-04-26 05:20:48,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1495810048. Throughput: 0: 56506.6. Samples: 1445170400. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:20:48,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:20:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000091297_1495810048.pth... [2024-04-26 05:20:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000090470_1482260480.pth [2024-04-26 05:20:51,703][47288] Updated weights for policy 0, policy_version 91306 (0.0030) [2024-04-26 05:20:53,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56525.0, 300 sec: 56538.7). Total num frames: 1496088576. Throughput: 0: 56568.2. Samples: 1445510680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:20:53,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 05:20:54,140][47288] Updated weights for policy 0, policy_version 91316 (0.0028) [2024-04-26 05:20:57,472][47288] Updated weights for policy 0, policy_version 91326 (0.0028) [2024-04-26 05:20:58,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 56483.1). Total num frames: 1496367104. Throughput: 0: 56277.1. Samples: 1445676540. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:20:58,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 05:21:00,117][47288] Updated weights for policy 0, policy_version 91336 (0.0029) [2024-04-26 05:21:03,300][47288] Updated weights for policy 0, policy_version 91346 (0.0034) [2024-04-26 05:21:03,923][47056] Fps is (10 sec: 54065.5, 60 sec: 55978.4, 300 sec: 56372.0). 
Total num frames: 1496629248. Throughput: 0: 56430.3. Samples: 1446021740. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:21:03,926][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 05:21:05,974][47288] Updated weights for policy 0, policy_version 91356 (0.0037) [2024-04-26 05:21:08,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1496924160. Throughput: 0: 56358.8. Samples: 1446357980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:21:08,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 05:21:09,060][47288] Updated weights for policy 0, policy_version 91366 (0.0031) [2024-04-26 05:21:11,865][47288] Updated weights for policy 0, policy_version 91376 (0.0033) [2024-04-26 05:21:13,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1497202688. Throughput: 0: 56544.8. Samples: 1446524120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:21:13,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 05:21:14,854][47288] Updated weights for policy 0, policy_version 91386 (0.0036) [2024-04-26 05:21:17,816][47288] Updated weights for policy 0, policy_version 91396 (0.0033) [2024-04-26 05:21:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1497481216. Throughput: 0: 56519.7. Samples: 1446859800. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:21:18,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 05:21:20,578][47288] Updated weights for policy 0, policy_version 91406 (0.0030) [2024-04-26 05:21:23,423][47288] Updated weights for policy 0, policy_version 91416 (0.0025) [2024-04-26 05:21:23,923][47056] Fps is (10 sec: 58982.6, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 1497792512. Throughput: 0: 56333.3. Samples: 1447196020. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 05:21:23,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 05:21:26,437][47288] Updated weights for policy 0, policy_version 91426 (0.0032) [2024-04-26 05:21:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.8, 300 sec: 56483.2). Total num frames: 1498054656. Throughput: 0: 56239.9. Samples: 1447366940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:21:28,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 05:21:29,214][47288] Updated weights for policy 0, policy_version 91436 (0.0028) [2024-04-26 05:21:29,228][47267] Signal inference workers to stop experience collection... (21800 times) [2024-04-26 05:21:29,229][47267] Signal inference workers to resume experience collection... (21800 times) [2024-04-26 05:21:29,251][47288] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-04-26 05:21:29,252][47288] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-04-26 05:21:32,330][47288] Updated weights for policy 0, policy_version 91446 (0.0036) [2024-04-26 05:21:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1498333184. Throughput: 0: 56396.0. Samples: 1447708220. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:21:33,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 05:21:34,985][47288] Updated weights for policy 0, policy_version 91456 (0.0031) [2024-04-26 05:21:38,396][47288] Updated weights for policy 0, policy_version 91466 (0.0037) [2024-04-26 05:21:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 56427.7). Total num frames: 1498611712. Throughput: 0: 56387.1. Samples: 1448048100. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:21:38,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 05:21:40,931][47288] Updated weights for policy 0, policy_version 91476 (0.0024) [2024-04-26 05:21:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 56372.1). Total num frames: 1498890240. Throughput: 0: 56170.6. Samples: 1448204220. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:21:43,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 05:21:44,150][47288] Updated weights for policy 0, policy_version 91486 (0.0027) [2024-04-26 05:21:46,628][47288] Updated weights for policy 0, policy_version 91496 (0.0035) [2024-04-26 05:21:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1499185152. Throughput: 0: 56065.1. Samples: 1448544660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:21:48,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 05:21:49,897][47288] Updated weights for policy 0, policy_version 91506 (0.0033) [2024-04-26 05:21:52,389][47288] Updated weights for policy 0, policy_version 91516 (0.0029) [2024-04-26 05:21:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1499463680. Throughput: 0: 56044.8. Samples: 1448880000. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:21:53,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:21:55,974][47288] Updated weights for policy 0, policy_version 91526 (0.0028) [2024-04-26 05:21:58,284][47288] Updated weights for policy 0, policy_version 91536 (0.0031) [2024-04-26 05:21:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1499758592. Throughput: 0: 56245.9. Samples: 1449055180. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:21:58,923][47056] Avg episode reward: [(0, '0.367')] [2024-04-26 05:22:01,849][47288] Updated weights for policy 0, policy_version 91546 (0.0025) [2024-04-26 05:22:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 1500020736. Throughput: 0: 56243.5. Samples: 1449390760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:22:03,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 05:22:04,136][47288] Updated weights for policy 0, policy_version 91556 (0.0029) [2024-04-26 05:22:07,506][47288] Updated weights for policy 0, policy_version 91566 (0.0027) [2024-04-26 05:22:08,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1500299264. Throughput: 0: 56393.3. Samples: 1449733720. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:22:08,923][47056] Avg episode reward: [(0, '0.581')] [2024-04-26 05:22:09,817][47288] Updated weights for policy 0, policy_version 91576 (0.0035) [2024-04-26 05:22:13,438][47288] Updated weights for policy 0, policy_version 91586 (0.0029) [2024-04-26 05:22:13,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1500594176. Throughput: 0: 56351.2. Samples: 1449902740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:22:13,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 05:22:15,745][47288] Updated weights for policy 0, policy_version 91596 (0.0025) [2024-04-26 05:22:18,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1500839936. Throughput: 0: 56131.6. Samples: 1450234140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 05:22:18,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 05:22:19,126][47288] Updated weights for policy 0, policy_version 91606 (0.0028) [2024-04-26 05:22:21,027][47267] Signal inference workers to stop experience collection... 
(21850 times) [2024-04-26 05:22:21,031][47267] Signal inference workers to resume experience collection... (21850 times) [2024-04-26 05:22:21,061][47288] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-04-26 05:22:21,061][47288] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-04-26 05:22:21,449][47288] Updated weights for policy 0, policy_version 91616 (0.0033) [2024-04-26 05:22:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1501151232. Throughput: 0: 56065.2. Samples: 1450571040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:22:23,932][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 05:22:24,911][47288] Updated weights for policy 0, policy_version 91626 (0.0027) [2024-04-26 05:22:27,156][47288] Updated weights for policy 0, policy_version 91636 (0.0023) [2024-04-26 05:22:28,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1501429760. Throughput: 0: 56455.2. Samples: 1450744700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:22:28,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 05:22:30,674][47288] Updated weights for policy 0, policy_version 91646 (0.0028) [2024-04-26 05:22:32,856][47288] Updated weights for policy 0, policy_version 91656 (0.0033) [2024-04-26 05:22:33,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1501708288. Throughput: 0: 56495.2. Samples: 1451086940. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:22:33,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 05:22:36,492][47288] Updated weights for policy 0, policy_version 91666 (0.0030) [2024-04-26 05:22:38,913][47288] Updated weights for policy 0, policy_version 91676 (0.0026) [2024-04-26 05:22:38,923][47056] Fps is (10 sec: 58981.4, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 1502019584. Throughput: 0: 56461.6. Samples: 1451420780. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:22:38,932][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 05:22:42,293][47288] Updated weights for policy 0, policy_version 91686 (0.0038) [2024-04-26 05:22:43,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1502281728. Throughput: 0: 56477.0. Samples: 1451596640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:22:43,923][47056] Avg episode reward: [(0, '0.371')] [2024-04-26 05:22:44,670][47288] Updated weights for policy 0, policy_version 91696 (0.0027) [2024-04-26 05:22:47,940][47288] Updated weights for policy 0, policy_version 91706 (0.0023) [2024-04-26 05:22:48,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1502543872. Throughput: 0: 56539.1. Samples: 1451935020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:22:48,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 05:22:48,950][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000091709_1502560256.pth... [2024-04-26 05:22:48,997][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000090884_1489043456.pth [2024-04-26 05:22:50,291][47288] Updated weights for policy 0, policy_version 91716 (0.0028) [2024-04-26 05:22:53,653][47288] Updated weights for policy 0, policy_version 91726 (0.0025) [2024-04-26 05:22:53,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1502838784. Throughput: 0: 56350.2. Samples: 1452269480. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:22:53,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:22:56,506][47288] Updated weights for policy 0, policy_version 91736 (0.0029) [2024-04-26 05:22:58,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 56372.1). Total num frames: 1503100928. Throughput: 0: 56315.0. Samples: 1452436920. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:22:58,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 05:22:59,447][47288] Updated weights for policy 0, policy_version 91746 (0.0023) [2024-04-26 05:23:02,510][47288] Updated weights for policy 0, policy_version 91756 (0.0026) [2024-04-26 05:23:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1503412224. Throughput: 0: 56532.0. Samples: 1452778080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:23:03,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 05:23:05,301][47288] Updated weights for policy 0, policy_version 91766 (0.0029) [2024-04-26 05:23:08,362][47288] Updated weights for policy 0, policy_version 91776 (0.0029) [2024-04-26 05:23:08,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1503690752. Throughput: 0: 56516.1. Samples: 1453114260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:23:08,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 05:23:11,074][47288] Updated weights for policy 0, policy_version 91786 (0.0026) [2024-04-26 05:23:13,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1503969280. Throughput: 0: 56499.7. Samples: 1453287180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 05:23:13,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 05:23:13,993][47288] Updated weights for policy 0, policy_version 91796 (0.0031) [2024-04-26 05:23:16,945][47288] Updated weights for policy 0, policy_version 91806 (0.0028) [2024-04-26 05:23:18,923][47056] Fps is (10 sec: 57343.2, 60 sec: 57070.8, 300 sec: 56594.2). Total num frames: 1504264192. Throughput: 0: 56422.5. Samples: 1453625960. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:23:18,924][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 05:23:19,652][47288] Updated weights for policy 0, policy_version 91816 (0.0032) [2024-04-26 05:23:22,806][47288] Updated weights for policy 0, policy_version 91826 (0.0029) [2024-04-26 05:23:23,838][47267] Signal inference workers to stop experience collection... (21900 times) [2024-04-26 05:23:23,865][47288] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-04-26 05:23:23,896][47267] Signal inference workers to resume experience collection... (21900 times) [2024-04-26 05:23:23,896][47288] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-04-26 05:23:23,923][47056] Fps is (10 sec: 54065.8, 60 sec: 55978.6, 300 sec: 56372.0). Total num frames: 1504509952. Throughput: 0: 56433.8. Samples: 1453960300. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:23:23,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:23:25,560][47288] Updated weights for policy 0, policy_version 91836 (0.0030) [2024-04-26 05:23:28,529][47288] Updated weights for policy 0, policy_version 91846 (0.0029) [2024-04-26 05:23:28,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1504804864. Throughput: 0: 56108.2. Samples: 1454121520. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:23:28,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 05:23:31,514][47288] Updated weights for policy 0, policy_version 91856 (0.0029) [2024-04-26 05:23:33,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1505083392. Throughput: 0: 56008.5. Samples: 1454455400. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:23:33,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 05:23:34,366][47288] Updated weights for policy 0, policy_version 91866 (0.0027) [2024-04-26 05:23:37,330][47288] Updated weights for policy 0, policy_version 91876 (0.0031) [2024-04-26 05:23:38,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1505378304. Throughput: 0: 56182.7. Samples: 1454797700. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:23:38,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 05:23:40,243][47288] Updated weights for policy 0, policy_version 91886 (0.0029) [2024-04-26 05:23:43,210][47288] Updated weights for policy 0, policy_version 91896 (0.0027) [2024-04-26 05:23:43,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.6, 300 sec: 56372.1). Total num frames: 1505673216. Throughput: 0: 56420.4. Samples: 1454975840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:23:43,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 05:23:45,881][47288] Updated weights for policy 0, policy_version 91906 (0.0028) [2024-04-26 05:23:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1505935360. Throughput: 0: 56234.7. Samples: 1455308640. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:23:48,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 05:23:48,956][47288] Updated weights for policy 0, policy_version 91916 (0.0037) [2024-04-26 05:23:51,714][47288] Updated weights for policy 0, policy_version 91926 (0.0028) [2024-04-26 05:23:53,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1506213888. Throughput: 0: 56327.6. Samples: 1455649000. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:23:53,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 05:23:54,621][47288] Updated weights for policy 0, policy_version 91936 (0.0031) [2024-04-26 05:23:57,885][47288] Updated weights for policy 0, policy_version 91946 (0.0028) [2024-04-26 05:23:58,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 1506508800. Throughput: 0: 56232.4. Samples: 1455817640. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:23:58,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 05:24:00,483][47288] Updated weights for policy 0, policy_version 91956 (0.0034) [2024-04-26 05:24:03,685][47288] Updated weights for policy 0, policy_version 91966 (0.0026) [2024-04-26 05:24:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1506770944. Throughput: 0: 56217.0. Samples: 1456155720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:24:03,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 05:24:06,209][47288] Updated weights for policy 0, policy_version 91976 (0.0030) [2024-04-26 05:24:08,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1507065856. Throughput: 0: 56322.4. Samples: 1456494800. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-26 05:24:08,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 05:24:09,386][47288] Updated weights for policy 0, policy_version 91986 (0.0031) [2024-04-26 05:24:11,907][47288] Updated weights for policy 0, policy_version 91996 (0.0030) [2024-04-26 05:24:13,923][47056] Fps is (10 sec: 57340.6, 60 sec: 56251.1, 300 sec: 56316.4). Total num frames: 1507344384. Throughput: 0: 56420.2. Samples: 1456660460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:13,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 05:24:15,267][47288] Updated weights for policy 0, policy_version 92006 (0.0036) [2024-04-26 05:24:17,964][47288] Updated weights for policy 0, policy_version 92016 (0.0028) [2024-04-26 05:24:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1507639296. Throughput: 0: 56553.8. Samples: 1457000320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:18,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 05:24:20,973][47288] Updated weights for policy 0, policy_version 92026 (0.0029) [2024-04-26 05:24:23,091][47267] Signal inference workers to stop experience collection... (21950 times) [2024-04-26 05:24:23,119][47288] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-04-26 05:24:23,151][47267] Signal inference workers to resume experience collection... (21950 times) [2024-04-26 05:24:23,151][47288] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-04-26 05:24:23,730][47288] Updated weights for policy 0, policy_version 92036 (0.0030) [2024-04-26 05:24:23,923][47056] Fps is (10 sec: 58985.8, 60 sec: 57071.1, 300 sec: 56372.1). Total num frames: 1507934208. Throughput: 0: 56499.1. Samples: 1457340160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:23,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 05:24:26,750][47288] Updated weights for policy 0, policy_version 92046 (0.0031) [2024-04-26 05:24:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1508179968. Throughput: 0: 56254.3. Samples: 1457507280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:28,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 05:24:29,674][47288] Updated weights for policy 0, policy_version 92056 (0.0027) [2024-04-26 05:24:32,441][47288] Updated weights for policy 0, policy_version 92066 (0.0029) [2024-04-26 05:24:33,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1508474880. Throughput: 0: 56224.8. Samples: 1457838760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:33,924][47056] Avg episode reward: [(0, '0.589')] [2024-04-26 05:24:35,548][47288] Updated weights for policy 0, policy_version 92076 (0.0031) [2024-04-26 05:24:38,287][47288] Updated weights for policy 0, policy_version 92086 (0.0030) [2024-04-26 05:24:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1508753408. Throughput: 0: 56264.6. Samples: 1458180920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:38,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 05:24:41,250][47288] Updated weights for policy 0, policy_version 92096 (0.0026) [2024-04-26 05:24:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1509031936. Throughput: 0: 56231.0. Samples: 1458348040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:43,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:24:44,395][47288] Updated weights for policy 0, policy_version 92106 (0.0029) [2024-04-26 05:24:46,978][47288] Updated weights for policy 0, policy_version 92116 (0.0029) [2024-04-26 05:24:48,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1509326848. Throughput: 0: 56289.6. Samples: 1458688760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:48,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 05:24:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000092122_1509326848.pth... [2024-04-26 05:24:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000091297_1495810048.pth [2024-04-26 05:24:50,161][47288] Updated weights for policy 0, policy_version 92126 (0.0030) [2024-04-26 05:24:52,780][47288] Updated weights for policy 0, policy_version 92136 (0.0027) [2024-04-26 05:24:53,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1509605376. Throughput: 0: 56246.3. Samples: 1459025880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:53,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 05:24:56,085][47288] Updated weights for policy 0, policy_version 92146 (0.0029) [2024-04-26 05:24:58,593][47288] Updated weights for policy 0, policy_version 92156 (0.0029) [2024-04-26 05:24:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1509900288. Throughput: 0: 56505.1. Samples: 1459203160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:24:58,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 05:25:01,969][47288] Updated weights for policy 0, policy_version 92166 (0.0033) [2024-04-26 05:25:03,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1510146048. Throughput: 0: 56346.6. Samples: 1459535920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 05:25:03,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 05:25:04,556][47288] Updated weights for policy 0, policy_version 92176 (0.0028) [2024-04-26 05:25:07,666][47288] Updated weights for policy 0, policy_version 92186 (0.0033) [2024-04-26 05:25:08,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 56316.5). 
Total num frames: 1510440960. Throughput: 0: 56280.8. Samples: 1459872800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:08,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 05:25:10,397][47288] Updated weights for policy 0, policy_version 92196 (0.0027) [2024-04-26 05:25:13,332][47288] Updated weights for policy 0, policy_version 92206 (0.0026) [2024-04-26 05:25:13,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56525.3, 300 sec: 56372.1). Total num frames: 1510735872. Throughput: 0: 56263.6. Samples: 1460039140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:13,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:25:16,280][47288] Updated weights for policy 0, policy_version 92216 (0.0030) [2024-04-26 05:25:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1510998016. Throughput: 0: 56529.8. Samples: 1460382600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:18,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 05:25:19,250][47288] Updated weights for policy 0, policy_version 92226 (0.0032) [2024-04-26 05:25:22,105][47288] Updated weights for policy 0, policy_version 92236 (0.0031) [2024-04-26 05:25:23,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 56372.0). Total num frames: 1511276544. Throughput: 0: 56369.3. Samples: 1460717540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:23,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 05:25:25,168][47288] Updated weights for policy 0, policy_version 92246 (0.0032) [2024-04-26 05:25:27,709][47288] Updated weights for policy 0, policy_version 92256 (0.0031) [2024-04-26 05:25:28,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1511587840. Throughput: 0: 56315.0. Samples: 1460882220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:28,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:25:31,080][47288] Updated weights for policy 0, policy_version 92266 (0.0033) [2024-04-26 05:25:33,465][47267] Signal inference workers to stop experience collection... (22000 times) [2024-04-26 05:25:33,466][47267] Signal inference workers to resume experience collection... (22000 times) [2024-04-26 05:25:33,482][47288] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-04-26 05:25:33,482][47288] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-04-26 05:25:33,586][47288] Updated weights for policy 0, policy_version 92276 (0.0029) [2024-04-26 05:25:33,923][47056] Fps is (10 sec: 58983.5, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1511866368. Throughput: 0: 56172.2. Samples: 1461216500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:33,923][47056] Avg episode reward: [(0, '0.584')] [2024-04-26 05:25:36,884][47288] Updated weights for policy 0, policy_version 92286 (0.0029) [2024-04-26 05:25:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.8, 300 sec: 56205.4). Total num frames: 1512128512. Throughput: 0: 56308.0. Samples: 1461559740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:38,923][47056] Avg episode reward: [(0, '0.348')] [2024-04-26 05:25:39,464][47288] Updated weights for policy 0, policy_version 92296 (0.0026) [2024-04-26 05:25:42,639][47288] Updated weights for policy 0, policy_version 92306 (0.0027) [2024-04-26 05:25:43,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1512407040. Throughput: 0: 56188.1. Samples: 1461731620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:43,923][47056] Avg episode reward: [(0, '0.389')] [2024-04-26 05:25:45,203][47288] Updated weights for policy 0, policy_version 92316 (0.0031) [2024-04-26 05:25:48,347][47288] Updated weights for policy 0, policy_version 92326 (0.0032) [2024-04-26 05:25:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1512701952. Throughput: 0: 56316.4. Samples: 1462070160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:48,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 05:25:51,188][47288] Updated weights for policy 0, policy_version 92336 (0.0030) [2024-04-26 05:25:53,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1512964096. Throughput: 0: 56278.7. Samples: 1462405340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:53,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 05:25:54,235][47288] Updated weights for policy 0, policy_version 92346 (0.0027) [2024-04-26 05:25:56,969][47288] Updated weights for policy 0, policy_version 92356 (0.0029) [2024-04-26 05:25:58,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 56372.1). Total num frames: 1513259008. Throughput: 0: 56199.3. Samples: 1462568120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 05:25:58,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:25:59,943][47288] Updated weights for policy 0, policy_version 92366 (0.0031) [2024-04-26 05:26:02,796][47288] Updated weights for policy 0, policy_version 92376 (0.0032) [2024-04-26 05:26:03,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1513553920. Throughput: 0: 56176.1. Samples: 1462910520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:03,923][47056] Avg episode reward: [(0, '0.344')] [2024-04-26 05:26:05,794][47288] Updated weights for policy 0, policy_version 92386 (0.0037) [2024-04-26 05:26:08,508][47288] Updated weights for policy 0, policy_version 92396 (0.0034) [2024-04-26 05:26:08,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1513832448. Throughput: 0: 56280.1. Samples: 1463250140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:08,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 05:26:11,532][47288] Updated weights for policy 0, policy_version 92406 (0.0030) [2024-04-26 05:26:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1514110976. Throughput: 0: 56396.5. Samples: 1463420060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:13,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 05:26:14,358][47288] Updated weights for policy 0, policy_version 92416 (0.0025) [2024-04-26 05:26:17,455][47288] Updated weights for policy 0, policy_version 92426 (0.0028) [2024-04-26 05:26:18,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1514405888. Throughput: 0: 56568.7. Samples: 1463762100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:18,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 05:26:20,151][47288] Updated weights for policy 0, policy_version 92436 (0.0028) [2024-04-26 05:26:23,215][47288] Updated weights for policy 0, policy_version 92446 (0.0032) [2024-04-26 05:26:23,923][47056] Fps is (10 sec: 54066.1, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1514651648. Throughput: 0: 56333.5. Samples: 1464094760. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:23,924][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 05:26:26,000][47288] Updated weights for policy 0, policy_version 92456 (0.0031) [2024-04-26 05:26:28,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55978.8, 300 sec: 56316.6). Total num frames: 1514946560. Throughput: 0: 56315.2. Samples: 1464265800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:28,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:26:28,957][47288] Updated weights for policy 0, policy_version 92466 (0.0029) [2024-04-26 05:26:31,713][47288] Updated weights for policy 0, policy_version 92476 (0.0027) [2024-04-26 05:26:33,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1515225088. Throughput: 0: 56300.8. Samples: 1464603700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:33,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 05:26:34,660][47288] Updated weights for policy 0, policy_version 92486 (0.0028) [2024-04-26 05:26:37,688][47288] Updated weights for policy 0, policy_version 92496 (0.0040) [2024-04-26 05:26:38,923][47056] Fps is (10 sec: 57342.8, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1515520000. Throughput: 0: 56381.8. Samples: 1464942520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:38,924][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 05:26:40,551][47288] Updated weights for policy 0, policy_version 92506 (0.0031) [2024-04-26 05:26:43,557][47288] Updated weights for policy 0, policy_version 92516 (0.0026) [2024-04-26 05:26:43,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1515798528. Throughput: 0: 56504.3. Samples: 1465110800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:43,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 05:26:44,902][47267] Signal inference workers to stop experience collection... 
(22050 times) [2024-04-26 05:26:44,907][47267] Signal inference workers to resume experience collection... (22050 times) [2024-04-26 05:26:44,946][47288] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-04-26 05:26:44,946][47288] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-04-26 05:26:46,295][47288] Updated weights for policy 0, policy_version 92526 (0.0028) [2024-04-26 05:26:48,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1516060672. Throughput: 0: 56428.0. Samples: 1465449780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:48,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 05:26:48,941][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000092534_1516077056.pth... [2024-04-26 05:26:48,988][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000091709_1502560256.pth [2024-04-26 05:26:49,289][47288] Updated weights for policy 0, policy_version 92536 (0.0028) [2024-04-26 05:26:51,955][47288] Updated weights for policy 0, policy_version 92546 (0.0033) [2024-04-26 05:26:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1516371968. Throughput: 0: 56268.5. Samples: 1465782220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:53,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 05:26:55,087][47288] Updated weights for policy 0, policy_version 92556 (0.0025) [2024-04-26 05:26:57,827][47288] Updated weights for policy 0, policy_version 92566 (0.0026) [2024-04-26 05:26:58,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56798.1, 300 sec: 56427.6). Total num frames: 1516666880. Throughput: 0: 56572.4. Samples: 1465965820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 05:26:58,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 05:27:00,748][47288] Updated weights for policy 0, policy_version 92576 (0.0028) [2024-04-26 05:27:03,700][47288] Updated weights for policy 0, policy_version 92586 (0.0033) [2024-04-26 05:27:03,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1516929024. Throughput: 0: 56431.1. Samples: 1466301500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:03,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 05:27:06,660][47288] Updated weights for policy 0, policy_version 92596 (0.0028) [2024-04-26 05:27:08,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1517191168. Throughput: 0: 56525.5. Samples: 1466638400. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:08,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 05:27:09,381][47288] Updated weights for policy 0, policy_version 92606 (0.0026) [2024-04-26 05:27:12,413][47288] Updated weights for policy 0, policy_version 92616 (0.0032) [2024-04-26 05:27:13,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1517486080. Throughput: 0: 56416.4. Samples: 1466804540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:13,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 05:27:15,176][47288] Updated weights for policy 0, policy_version 92626 (0.0026) [2024-04-26 05:27:18,381][47288] Updated weights for policy 0, policy_version 92636 (0.0032) [2024-04-26 05:27:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1517764608. Throughput: 0: 56476.5. Samples: 1467145140. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:18,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 05:27:20,971][47288] Updated weights for policy 0, policy_version 92646 (0.0026) [2024-04-26 05:27:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56525.1, 300 sec: 56316.5). Total num frames: 1518043136. Throughput: 0: 56422.9. Samples: 1467481540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:23,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 05:27:24,250][47288] Updated weights for policy 0, policy_version 92656 (0.0025) [2024-04-26 05:27:26,725][47288] Updated weights for policy 0, policy_version 92666 (0.0027) [2024-04-26 05:27:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.6, 300 sec: 56372.1). Total num frames: 1518338048. Throughput: 0: 56328.8. Samples: 1467645600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:28,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 05:27:30,164][47288] Updated weights for policy 0, policy_version 92676 (0.0028) [2024-04-26 05:27:32,102][47267] Signal inference workers to stop experience collection... (22100 times) [2024-04-26 05:27:32,103][47267] Signal inference workers to resume experience collection... (22100 times) [2024-04-26 05:27:32,128][47288] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-04-26 05:27:32,129][47288] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-04-26 05:27:32,529][47288] Updated weights for policy 0, policy_version 92686 (0.0029) [2024-04-26 05:27:33,923][47056] Fps is (10 sec: 60621.0, 60 sec: 57071.2, 300 sec: 56372.1). Total num frames: 1518649344. Throughput: 0: 56253.9. Samples: 1467981200. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:33,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 05:27:36,044][47288] Updated weights for policy 0, policy_version 92696 (0.0026) [2024-04-26 05:27:38,466][47288] Updated weights for policy 0, policy_version 92706 (0.0030) [2024-04-26 05:27:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56372.0). Total num frames: 1518911488. Throughput: 0: 56401.7. Samples: 1468320300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:38,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 05:27:41,793][47288] Updated weights for policy 0, policy_version 92716 (0.0030) [2024-04-26 05:27:43,923][47056] Fps is (10 sec: 52428.2, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1519173632. Throughput: 0: 56280.4. Samples: 1468498440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:43,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 05:27:44,442][47288] Updated weights for policy 0, policy_version 92726 (0.0029) [2024-04-26 05:27:47,590][47288] Updated weights for policy 0, policy_version 92736 (0.0030) [2024-04-26 05:27:48,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1519452160. Throughput: 0: 56342.3. Samples: 1468836900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:48,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 05:27:50,361][47288] Updated weights for policy 0, policy_version 92746 (0.0026) [2024-04-26 05:27:53,465][47288] Updated weights for policy 0, policy_version 92756 (0.0033) [2024-04-26 05:27:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1519730688. Throughput: 0: 56343.7. Samples: 1469173860. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-04-26 05:27:53,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:27:56,031][47288] Updated weights for policy 0, policy_version 92766 (0.0032) [2024-04-26 05:27:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1520025600. Throughput: 0: 56370.5. Samples: 1469341220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:27:58,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 05:27:59,246][47288] Updated weights for policy 0, policy_version 92776 (0.0029) [2024-04-26 05:28:01,866][47288] Updated weights for policy 0, policy_version 92786 (0.0025) [2024-04-26 05:28:03,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1520304128. Throughput: 0: 56212.0. Samples: 1469674680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:03,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 05:28:05,099][47288] Updated weights for policy 0, policy_version 92796 (0.0030) [2024-04-26 05:28:07,731][47288] Updated weights for policy 0, policy_version 92806 (0.0033) [2024-04-26 05:28:08,923][47056] Fps is (10 sec: 58982.4, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 1520615424. Throughput: 0: 56154.9. Samples: 1470008520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:08,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 05:28:10,869][47288] Updated weights for policy 0, policy_version 92816 (0.0027) [2024-04-26 05:28:13,490][47288] Updated weights for policy 0, policy_version 92826 (0.0031) [2024-04-26 05:28:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1520877568. Throughput: 0: 56467.5. Samples: 1470186640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:13,924][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 05:28:16,795][47288] Updated weights for policy 0, policy_version 92836 (0.0026) [2024-04-26 05:28:18,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56797.8, 300 sec: 56483.2). Total num frames: 1521172480. Throughput: 0: 56387.7. Samples: 1470518660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:18,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 05:28:19,148][47288] Updated weights for policy 0, policy_version 92846 (0.0023) [2024-04-26 05:28:22,742][47288] Updated weights for policy 0, policy_version 92856 (0.0027) [2024-04-26 05:28:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1521434624. Throughput: 0: 56367.1. Samples: 1470856820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:23,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 05:28:25,169][47288] Updated weights for policy 0, policy_version 92866 (0.0024) [2024-04-26 05:28:28,599][47288] Updated weights for policy 0, policy_version 92876 (0.0028) [2024-04-26 05:28:28,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1521696768. Throughput: 0: 56022.7. Samples: 1471019460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:28,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 05:28:30,536][47267] Signal inference workers to stop experience collection... (22150 times) [2024-04-26 05:28:30,537][47267] Signal inference workers to resume experience collection... 
(22150 times) [2024-04-26 05:28:30,561][47288] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-04-26 05:28:30,561][47288] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-04-26 05:28:31,129][47288] Updated weights for policy 0, policy_version 92886 (0.0033) [2024-04-26 05:28:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.4, 300 sec: 56316.5). Total num frames: 1521991680. Throughput: 0: 55982.2. Samples: 1471356100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:33,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:28:34,284][47288] Updated weights for policy 0, policy_version 92896 (0.0025) [2024-04-26 05:28:36,959][47288] Updated weights for policy 0, policy_version 92906 (0.0031) [2024-04-26 05:28:38,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1522270208. Throughput: 0: 56007.4. Samples: 1471694200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:38,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 05:28:39,939][47288] Updated weights for policy 0, policy_version 92916 (0.0028) [2024-04-26 05:28:42,864][47288] Updated weights for policy 0, policy_version 92926 (0.0033) [2024-04-26 05:28:43,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1522565120. Throughput: 0: 56087.5. Samples: 1471865160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:43,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 05:28:45,743][47288] Updated weights for policy 0, policy_version 92936 (0.0032) [2024-04-26 05:28:48,701][47288] Updated weights for policy 0, policy_version 92946 (0.0030) [2024-04-26 05:28:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1522843648. Throughput: 0: 56202.6. Samples: 1472203800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 05:28:48,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 05:28:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000092947_1522843648.pth... [2024-04-26 05:28:48,990][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000092122_1509326848.pth [2024-04-26 05:28:51,585][47288] Updated weights for policy 0, policy_version 92956 (0.0031) [2024-04-26 05:28:53,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1523122176. Throughput: 0: 56233.0. Samples: 1472539000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:28:53,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 05:28:54,426][47288] Updated weights for policy 0, policy_version 92966 (0.0028) [2024-04-26 05:28:57,435][47288] Updated weights for policy 0, policy_version 92976 (0.0027) [2024-04-26 05:28:58,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1523400704. Throughput: 0: 56090.9. Samples: 1472710740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:28:58,924][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 05:29:00,128][47288] Updated weights for policy 0, policy_version 92986 (0.0028) [2024-04-26 05:29:03,452][47288] Updated weights for policy 0, policy_version 92996 (0.0031) [2024-04-26 05:29:03,923][47056] Fps is (10 sec: 52428.0, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 1523646464. Throughput: 0: 56190.6. Samples: 1473047240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:29:03,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 05:29:05,962][47288] Updated weights for policy 0, policy_version 93006 (0.0026) [2024-04-26 05:29:08,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 56316.6). Total num frames: 1523957760. Throughput: 0: 56213.3. Samples: 1473386420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:29:08,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 05:29:09,245][47288] Updated weights for policy 0, policy_version 93016 (0.0029) [2024-04-26 05:29:11,798][47288] Updated weights for policy 0, policy_version 93026 (0.0027) [2024-04-26 05:29:13,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1524236288. Throughput: 0: 56390.6. Samples: 1473557040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:29:13,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 05:29:15,104][47288] Updated weights for policy 0, policy_version 93036 (0.0027) [2024-04-26 05:29:17,526][47288] Updated weights for policy 0, policy_version 93046 (0.0033) [2024-04-26 05:29:18,927][47056] Fps is (10 sec: 57322.0, 60 sec: 55975.1, 300 sec: 56260.2). Total num frames: 1524531200. Throughput: 0: 56404.0. Samples: 1473894500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:29:18,927][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 05:29:20,880][47288] Updated weights for policy 0, policy_version 93056 (0.0028) [2024-04-26 05:29:23,287][47288] Updated weights for policy 0, policy_version 93066 (0.0027) [2024-04-26 05:29:23,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1524826112. Throughput: 0: 56393.4. Samples: 1474231900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:29:23,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 05:29:26,733][47288] Updated weights for policy 0, policy_version 93076 (0.0030) [2024-04-26 05:29:28,923][47056] Fps is (10 sec: 57367.0, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1525104640. Throughput: 0: 56537.5. Samples: 1474409340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:29:28,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 05:29:29,079][47288] Updated weights for policy 0, policy_version 93086 (0.0031) [2024-04-26 05:29:32,706][47288] Updated weights for policy 0, policy_version 93096 (0.0030) [2024-04-26 05:29:32,943][47267] Signal inference workers to stop experience collection... (22200 times) [2024-04-26 05:29:32,990][47288] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-04-26 05:29:33,001][47267] Signal inference workers to resume experience collection... (22200 times) [2024-04-26 05:29:33,008][47288] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-04-26 05:29:33,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1525383168. Throughput: 0: 56600.0. Samples: 1474750800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:29:33,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 05:29:34,735][47288] Updated weights for policy 0, policy_version 93106 (0.0029) [2024-04-26 05:29:38,635][47288] Updated weights for policy 0, policy_version 93116 (0.0029) [2024-04-26 05:29:38,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1525645312. Throughput: 0: 56775.0. Samples: 1475093880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:29:38,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 05:29:40,400][47288] Updated weights for policy 0, policy_version 93126 (0.0031) [2024-04-26 05:29:43,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55705.8, 300 sec: 56205.5). Total num frames: 1525907456. Throughput: 0: 56405.2. Samples: 1475248960. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-04-26 05:29:43,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 05:29:44,280][47288] Updated weights for policy 0, policy_version 93136 (0.0031) [2024-04-26 05:29:46,278][47288] Updated weights for policy 0, policy_version 93146 (0.0027) [2024-04-26 05:29:48,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1526202368. Throughput: 0: 56460.3. Samples: 1475587940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:29:48,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 05:29:50,044][47288] Updated weights for policy 0, policy_version 93156 (0.0030) [2024-04-26 05:29:52,391][47288] Updated weights for policy 0, policy_version 93166 (0.0027) [2024-04-26 05:29:53,923][47056] Fps is (10 sec: 60621.1, 60 sec: 56524.9, 300 sec: 56316.6). Total num frames: 1526513664. Throughput: 0: 56462.5. Samples: 1475927220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:29:53,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 05:29:55,812][47288] Updated weights for policy 0, policy_version 93176 (0.0025) [2024-04-26 05:29:58,077][47288] Updated weights for policy 0, policy_version 93186 (0.0032) [2024-04-26 05:29:58,923][47056] Fps is (10 sec: 58980.7, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1526792192. Throughput: 0: 56450.0. Samples: 1476097300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:29:58,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 05:30:01,805][47288] Updated weights for policy 0, policy_version 93196 (0.0037) [2024-04-26 05:30:03,920][47288] Updated weights for policy 0, policy_version 93206 (0.0028) [2024-04-26 05:30:03,923][47056] Fps is (10 sec: 57343.3, 60 sec: 57344.1, 300 sec: 56427.6). Total num frames: 1527087104. Throughput: 0: 56344.5. Samples: 1476429780. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:03,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 05:30:07,581][47288] Updated weights for policy 0, policy_version 93216 (0.0028) [2024-04-26 05:30:08,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1527332864. Throughput: 0: 56320.8. Samples: 1476766340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:08,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 05:30:09,836][47288] Updated weights for policy 0, policy_version 93226 (0.0028) [2024-04-26 05:30:13,288][47288] Updated weights for policy 0, policy_version 93236 (0.0027) [2024-04-26 05:30:13,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1527627776. Throughput: 0: 56139.1. Samples: 1476935600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:13,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:30:15,521][47288] Updated weights for policy 0, policy_version 93246 (0.0031) [2024-04-26 05:30:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55709.2, 300 sec: 56261.0). Total num frames: 1527873536. Throughput: 0: 56161.8. Samples: 1477278080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:18,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:30:19,109][47288] Updated weights for policy 0, policy_version 93256 (0.0028) [2024-04-26 05:30:21,182][47288] Updated weights for policy 0, policy_version 93266 (0.0032) [2024-04-26 05:30:23,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 56205.5). Total num frames: 1528168448. Throughput: 0: 56154.7. Samples: 1477620840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:23,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 05:30:25,006][47288] Updated weights for policy 0, policy_version 93276 (0.0028) [2024-04-26 05:30:25,253][47267] Signal inference workers to stop experience collection... 
(22250 times) [2024-04-26 05:30:25,253][47267] Signal inference workers to resume experience collection... (22250 times) [2024-04-26 05:30:25,281][47288] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-04-26 05:30:25,281][47288] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-04-26 05:30:27,118][47288] Updated weights for policy 0, policy_version 93286 (0.0028) [2024-04-26 05:30:28,923][47056] Fps is (10 sec: 60620.7, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 1528479744. Throughput: 0: 56210.9. Samples: 1477778460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:28,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 05:30:30,903][47288] Updated weights for policy 0, policy_version 93296 (0.0031) [2024-04-26 05:30:33,150][47288] Updated weights for policy 0, policy_version 93306 (0.0028) [2024-04-26 05:30:33,923][47056] Fps is (10 sec: 60620.8, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1528774656. Throughput: 0: 56174.5. Samples: 1478115800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:33,923][47056] Avg episode reward: [(0, '0.577')] [2024-04-26 05:30:36,607][47288] Updated weights for policy 0, policy_version 93316 (0.0028) [2024-04-26 05:30:38,883][47288] Updated weights for policy 0, policy_version 93326 (0.0031) [2024-04-26 05:30:38,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1529053184. Throughput: 0: 56141.2. Samples: 1478453580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:38,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 05:30:42,409][47288] Updated weights for policy 0, policy_version 93336 (0.0029) [2024-04-26 05:30:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 57070.9, 300 sec: 56372.1). Total num frames: 1529331712. Throughput: 0: 56280.7. Samples: 1478629920. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:43,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:30:44,628][47288] Updated weights for policy 0, policy_version 93346 (0.0029) [2024-04-26 05:30:48,249][47288] Updated weights for policy 0, policy_version 93356 (0.0035) [2024-04-26 05:30:48,923][47056] Fps is (10 sec: 52428.9, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1529577472. Throughput: 0: 56392.9. Samples: 1478967460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:48,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 05:30:49,005][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000093359_1529593856.pth... [2024-04-26 05:30:49,063][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000092534_1516077056.pth [2024-04-26 05:30:50,482][47288] Updated weights for policy 0, policy_version 93366 (0.0027) [2024-04-26 05:30:53,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55432.4, 300 sec: 56205.5). Total num frames: 1529839616. Throughput: 0: 56356.5. Samples: 1479302380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:53,923][47056] Avg episode reward: [(0, '0.357')] [2024-04-26 05:30:54,196][47288] Updated weights for policy 0, policy_version 93376 (0.0028) [2024-04-26 05:30:56,474][47288] Updated weights for policy 0, policy_version 93386 (0.0026) [2024-04-26 05:30:58,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.7, 300 sec: 56205.4). Total num frames: 1530134528. Throughput: 0: 56195.0. Samples: 1479464380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:30:58,924][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 05:30:59,818][47288] Updated weights for policy 0, policy_version 93396 (0.0028) [2024-04-26 05:31:02,125][47288] Updated weights for policy 0, policy_version 93406 (0.0028) [2024-04-26 05:31:03,923][47056] Fps is (10 sec: 58982.6, 60 sec: 55705.6, 300 sec: 56261.0). 
Total num frames: 1530429440. Throughput: 0: 56089.9. Samples: 1479802120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:31:03,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 05:31:05,800][47288] Updated weights for policy 0, policy_version 93416 (0.0026) [2024-04-26 05:31:07,768][47288] Updated weights for policy 0, policy_version 93426 (0.0035) [2024-04-26 05:31:08,923][47056] Fps is (10 sec: 60620.7, 60 sec: 56797.9, 300 sec: 56372.0). Total num frames: 1530740736. Throughput: 0: 55996.4. Samples: 1480140680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:31:08,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:31:11,548][47288] Updated weights for policy 0, policy_version 93436 (0.0031) [2024-04-26 05:31:13,552][47288] Updated weights for policy 0, policy_version 93446 (0.0028) [2024-04-26 05:31:13,923][47056] Fps is (10 sec: 60621.2, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1531035648. Throughput: 0: 56509.1. Samples: 1480321360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:31:13,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 05:31:17,409][47288] Updated weights for policy 0, policy_version 93456 (0.0041) [2024-04-26 05:31:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 1531297792. Throughput: 0: 56461.7. Samples: 1480656580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:31:18,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:31:19,446][47288] Updated weights for policy 0, policy_version 93466 (0.0036) [2024-04-26 05:31:22,883][47267] Signal inference workers to stop experience collection... (22300 times) [2024-04-26 05:31:22,883][47267] Signal inference workers to resume experience collection... 
(22300 times) [2024-04-26 05:31:22,907][47288] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-04-26 05:31:22,907][47288] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-04-26 05:31:23,163][47288] Updated weights for policy 0, policy_version 93476 (0.0027) [2024-04-26 05:31:23,923][47056] Fps is (10 sec: 52428.9, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1531559936. Throughput: 0: 56504.5. Samples: 1480996280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:31:23,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 05:31:25,112][47288] Updated weights for policy 0, policy_version 93486 (0.0025) [2024-04-26 05:31:28,923][47056] Fps is (10 sec: 50791.1, 60 sec: 55432.6, 300 sec: 56205.5). Total num frames: 1531805696. Throughput: 0: 56128.4. Samples: 1481155700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:31:28,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 05:31:29,142][47288] Updated weights for policy 0, policy_version 93496 (0.0033) [2024-04-26 05:31:30,850][47288] Updated weights for policy 0, policy_version 93506 (0.0029) [2024-04-26 05:31:33,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55159.5, 300 sec: 56149.9). Total num frames: 1532084224. Throughput: 0: 56112.4. Samples: 1481492520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:31:33,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 05:31:34,989][47288] Updated weights for policy 0, policy_version 93516 (0.0029) [2024-04-26 05:31:36,727][47288] Updated weights for policy 0, policy_version 93526 (0.0028) [2024-04-26 05:31:38,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 56205.5). Total num frames: 1532379136. Throughput: 0: 56206.8. Samples: 1481831680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:31:38,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 05:31:40,727][47288] Updated weights for policy 0, policy_version 93536 (0.0029) [2024-04-26 05:31:42,611][47288] Updated weights for policy 0, policy_version 93546 (0.0024) [2024-04-26 05:31:43,923][47056] Fps is (10 sec: 62258.8, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1532706816. Throughput: 0: 56343.1. Samples: 1481999820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:31:43,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 05:31:46,523][47288] Updated weights for policy 0, policy_version 93556 (0.0029) [2024-04-26 05:31:48,311][47288] Updated weights for policy 0, policy_version 93566 (0.0026) [2024-04-26 05:31:48,923][47056] Fps is (10 sec: 62259.2, 60 sec: 57070.9, 300 sec: 56372.1). Total num frames: 1533001728. Throughput: 0: 56303.6. Samples: 1482335780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:31:48,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 05:31:52,472][47288] Updated weights for policy 0, policy_version 93576 (0.0028) [2024-04-26 05:31:53,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57344.1, 300 sec: 56316.5). Total num frames: 1533280256. Throughput: 0: 56257.5. Samples: 1482672260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:31:53,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 05:31:54,058][47288] Updated weights for policy 0, policy_version 93586 (0.0028) [2024-04-26 05:31:58,345][47288] Updated weights for policy 0, policy_version 93596 (0.0026) [2024-04-26 05:31:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1533542400. Throughput: 0: 56198.6. Samples: 1482850300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:31:58,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 05:31:59,497][47267] Signal inference workers to stop experience collection... 
(22350 times) [2024-04-26 05:31:59,531][47288] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-04-26 05:31:59,555][47267] Signal inference workers to resume experience collection... (22350 times) [2024-04-26 05:31:59,555][47288] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-04-26 05:31:59,936][47288] Updated weights for policy 0, policy_version 93606 (0.0032) [2024-04-26 05:32:03,923][47056] Fps is (10 sec: 50790.3, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1533788160. Throughput: 0: 56256.2. Samples: 1483188100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:32:03,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 05:32:04,032][47288] Updated weights for policy 0, policy_version 93616 (0.0027) [2024-04-26 05:32:05,733][47288] Updated weights for policy 0, policy_version 93626 (0.0027) [2024-04-26 05:32:08,923][47056] Fps is (10 sec: 50790.7, 60 sec: 55159.6, 300 sec: 56149.9). Total num frames: 1534050304. Throughput: 0: 56315.1. Samples: 1483530460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:32:08,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 05:32:09,893][47288] Updated weights for policy 0, policy_version 93636 (0.0029) [2024-04-26 05:32:11,501][47288] Updated weights for policy 0, policy_version 93646 (0.0029) [2024-04-26 05:32:13,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 56261.0). Total num frames: 1534361600. Throughput: 0: 56051.5. Samples: 1483678020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:32:13,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 05:32:15,649][47288] Updated weights for policy 0, policy_version 93656 (0.0032) [2024-04-26 05:32:17,236][47288] Updated weights for policy 0, policy_version 93666 (0.0027) [2024-04-26 05:32:18,923][47056] Fps is (10 sec: 60620.0, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1534656512. Throughput: 0: 56051.0. Samples: 1484014820. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:32:18,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 05:32:21,450][47288] Updated weights for policy 0, policy_version 93676 (0.0027) [2024-04-26 05:32:23,259][47288] Updated weights for policy 0, policy_version 93686 (0.0026) [2024-04-26 05:32:23,923][47056] Fps is (10 sec: 60620.6, 60 sec: 56797.7, 300 sec: 56372.1). Total num frames: 1534967808. Throughput: 0: 56014.1. Samples: 1484352320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:32:23,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 05:32:27,164][47288] Updated weights for policy 0, policy_version 93696 (0.0036) [2024-04-26 05:32:28,926][47056] Fps is (10 sec: 58965.2, 60 sec: 57341.2, 300 sec: 56260.4). Total num frames: 1535246336. Throughput: 0: 56538.1. Samples: 1484544200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:32:28,926][47056] Avg episode reward: [(0, '0.605')] [2024-04-26 05:32:29,113][47288] Updated weights for policy 0, policy_version 93706 (0.0026) [2024-04-26 05:32:33,003][47288] Updated weights for policy 0, policy_version 93716 (0.0028) [2024-04-26 05:32:33,923][47056] Fps is (10 sec: 55706.5, 60 sec: 57344.1, 300 sec: 56316.6). Total num frames: 1535524864. Throughput: 0: 56692.0. Samples: 1484886920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:32:33,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 05:32:35,029][47288] Updated weights for policy 0, policy_version 93726 (0.0026) [2024-04-26 05:32:38,857][47288] Updated weights for policy 0, policy_version 93736 (0.0032) [2024-04-26 05:32:38,923][47056] Fps is (10 sec: 52444.0, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1535770624. Throughput: 0: 56753.6. Samples: 1485226180. Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:32:38,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 05:32:39,203][47267] Signal inference workers to stop experience collection... 
(22400 times) [2024-04-26 05:32:39,233][47288] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-04-26 05:32:39,259][47267] Signal inference workers to resume experience collection... (22400 times) [2024-04-26 05:32:39,260][47288] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-04-26 05:32:40,706][47288] Updated weights for policy 0, policy_version 93746 (0.0024) [2024-04-26 05:32:43,923][47056] Fps is (10 sec: 50790.6, 60 sec: 55432.7, 300 sec: 56205.5). Total num frames: 1536032768. Throughput: 0: 56166.3. Samples: 1485377780. Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:32:43,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 05:32:44,472][47288] Updated weights for policy 0, policy_version 93756 (0.0031) [2024-04-26 05:32:46,522][47288] Updated weights for policy 0, policy_version 93766 (0.0028) [2024-04-26 05:32:48,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55159.3, 300 sec: 56205.4). Total num frames: 1536311296. Throughput: 0: 56297.2. Samples: 1485721480. Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:32:48,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 05:32:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000093769_1536311296.pth... [2024-04-26 05:32:48,995][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000092947_1522843648.pth [2024-04-26 05:32:50,310][47288] Updated weights for policy 0, policy_version 93776 (0.0032) [2024-04-26 05:32:52,390][47288] Updated weights for policy 0, policy_version 93786 (0.0026) [2024-04-26 05:32:53,923][47056] Fps is (10 sec: 60619.9, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1536638976. Throughput: 0: 56159.9. Samples: 1486057660. 
Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:32:53,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 05:32:56,206][47288] Updated weights for policy 0, policy_version 93796 (0.0026) [2024-04-26 05:32:58,199][47288] Updated weights for policy 0, policy_version 93806 (0.0027) [2024-04-26 05:32:58,923][47056] Fps is (10 sec: 62259.2, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1536933888. Throughput: 0: 56719.5. Samples: 1486230400. Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:32:58,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 05:33:01,961][47288] Updated weights for policy 0, policy_version 93816 (0.0029) [2024-04-26 05:33:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 57070.9, 300 sec: 56261.0). Total num frames: 1537212416. Throughput: 0: 56535.6. Samples: 1486558920. Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:33:03,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 05:33:04,162][47288] Updated weights for policy 0, policy_version 93826 (0.0023) [2024-04-26 05:33:07,709][47288] Updated weights for policy 0, policy_version 93836 (0.0030) [2024-04-26 05:33:08,923][47056] Fps is (10 sec: 55706.7, 60 sec: 57344.0, 300 sec: 56316.6). Total num frames: 1537490944. Throughput: 0: 56418.9. Samples: 1486891160. Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:33:08,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:33:10,104][47288] Updated weights for policy 0, policy_version 93846 (0.0028) [2024-04-26 05:33:13,564][47288] Updated weights for policy 0, policy_version 93856 (0.0029) [2024-04-26 05:33:13,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56798.0, 300 sec: 56261.0). Total num frames: 1537769472. Throughput: 0: 56206.9. Samples: 1487073340. 
Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:33:13,923][47056] Avg episode reward: [(0, '0.576')] [2024-04-26 05:33:15,982][47288] Updated weights for policy 0, policy_version 93866 (0.0030) [2024-04-26 05:33:18,922][47267] Signal inference workers to stop experience collection... (22450 times) [2024-04-26 05:33:18,922][47267] Signal inference workers to resume experience collection... (22450 times) [2024-04-26 05:33:18,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55978.8, 300 sec: 56205.5). Total num frames: 1538015232. Throughput: 0: 56154.3. Samples: 1487413860. Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:33:18,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 05:33:18,935][47288] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-04-26 05:33:18,950][47288] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-04-26 05:33:19,346][47288] Updated weights for policy 0, policy_version 93876 (0.0030) [2024-04-26 05:33:21,804][47288] Updated weights for policy 0, policy_version 93886 (0.0028) [2024-04-26 05:33:23,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55432.7, 300 sec: 56261.0). Total num frames: 1538293760. Throughput: 0: 56127.3. Samples: 1487751900. Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:33:23,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 05:33:25,072][47288] Updated weights for policy 0, policy_version 93896 (0.0029) [2024-04-26 05:33:27,689][47288] Updated weights for policy 0, policy_version 93906 (0.0027) [2024-04-26 05:33:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55708.4, 300 sec: 56261.0). Total num frames: 1538588672. Throughput: 0: 56275.9. Samples: 1487910200. 
Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:33:28,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 05:33:31,000][47288] Updated weights for policy 0, policy_version 93916 (0.0030) [2024-04-26 05:33:33,597][47288] Updated weights for policy 0, policy_version 93926 (0.0030) [2024-04-26 05:33:33,923][47056] Fps is (10 sec: 60620.3, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1538899968. Throughput: 0: 56165.0. Samples: 1488248900. Policy #0 lag: (min: 0.0, avg: 6.6, max: 21.0) [2024-04-26 05:33:33,932][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 05:33:36,759][47288] Updated weights for policy 0, policy_version 93936 (0.0027) [2024-04-26 05:33:38,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1539178496. Throughput: 0: 56190.6. Samples: 1488586240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:33:38,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 05:33:39,343][47288] Updated weights for policy 0, policy_version 93946 (0.0030) [2024-04-26 05:33:42,444][47288] Updated weights for policy 0, policy_version 93956 (0.0029) [2024-04-26 05:33:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 57070.8, 300 sec: 56316.5). Total num frames: 1539457024. Throughput: 0: 56398.8. Samples: 1488768340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:33:43,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 05:33:45,115][47288] Updated weights for policy 0, policy_version 93966 (0.0027) [2024-04-26 05:33:48,284][47288] Updated weights for policy 0, policy_version 93976 (0.0027) [2024-04-26 05:33:48,923][47056] Fps is (10 sec: 55706.3, 60 sec: 57071.1, 300 sec: 56316.5). Total num frames: 1539735552. Throughput: 0: 56439.6. Samples: 1489098700. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:33:48,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 05:33:50,986][47288] Updated weights for policy 0, policy_version 93986 (0.0029) [2024-04-26 05:33:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1539997696. Throughput: 0: 56446.6. Samples: 1489431260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:33:53,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 05:33:54,233][47288] Updated weights for policy 0, policy_version 93996 (0.0030) [2024-04-26 05:33:56,701][47288] Updated weights for policy 0, policy_version 94006 (0.0027) [2024-04-26 05:33:58,923][47056] Fps is (10 sec: 50790.3, 60 sec: 55159.6, 300 sec: 56261.0). Total num frames: 1540243456. Throughput: 0: 55876.0. Samples: 1489587760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:33:58,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 05:33:59,531][47267] Signal inference workers to stop experience collection... (22500 times) [2024-04-26 05:33:59,561][47288] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-04-26 05:33:59,587][47267] Signal inference workers to resume experience collection... (22500 times) [2024-04-26 05:33:59,588][47288] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-04-26 05:34:00,010][47288] Updated weights for policy 0, policy_version 94016 (0.0026) [2024-04-26 05:34:02,407][47288] Updated weights for policy 0, policy_version 94026 (0.0028) [2024-04-26 05:34:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 56205.5). Total num frames: 1540538368. Throughput: 0: 55641.2. Samples: 1489917720. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:34:03,923][47056] Avg episode reward: [(0, '0.360')] [2024-04-26 05:34:05,683][47288] Updated weights for policy 0, policy_version 94036 (0.0024) [2024-04-26 05:34:08,675][47288] Updated weights for policy 0, policy_version 94046 (0.0032) [2024-04-26 05:34:08,923][47056] Fps is (10 sec: 60621.0, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1540849664. Throughput: 0: 55877.8. Samples: 1490266400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:34:08,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 05:34:11,563][47288] Updated weights for policy 0, policy_version 94056 (0.0033) [2024-04-26 05:34:13,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.6, 300 sec: 56261.7). Total num frames: 1541128192. Throughput: 0: 56121.3. Samples: 1490435660. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:34:13,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 05:34:14,719][47288] Updated weights for policy 0, policy_version 94066 (0.0027) [2024-04-26 05:34:17,480][47288] Updated weights for policy 0, policy_version 94076 (0.0029) [2024-04-26 05:34:18,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 1541406720. Throughput: 0: 56039.5. Samples: 1490770680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:34:18,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 05:34:20,715][47288] Updated weights for policy 0, policy_version 94086 (0.0029) [2024-04-26 05:34:23,154][47288] Updated weights for policy 0, policy_version 94096 (0.0031) [2024-04-26 05:34:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.7, 300 sec: 56261.0). Total num frames: 1541701632. Throughput: 0: 55946.2. Samples: 1491103820. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:34:23,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:34:26,703][47288] Updated weights for policy 0, policy_version 94106 (0.0028) [2024-04-26 05:34:28,821][47288] Updated weights for policy 0, policy_version 94116 (0.0028) [2024-04-26 05:34:28,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1541996544. Throughput: 0: 55920.4. Samples: 1491284760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 05:34:28,924][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 05:34:32,494][47288] Updated weights for policy 0, policy_version 94126 (0.0030) [2024-04-26 05:34:33,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 1542242304. Throughput: 0: 56173.8. Samples: 1491626520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:34:33,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 05:34:34,735][47288] Updated weights for policy 0, policy_version 94136 (0.0026) [2024-04-26 05:34:38,416][47288] Updated weights for policy 0, policy_version 94146 (0.0027) [2024-04-26 05:34:38,923][47056] Fps is (10 sec: 50790.6, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1542504448. Throughput: 0: 56297.3. Samples: 1491964640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:34:38,924][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 05:34:40,788][47288] Updated weights for policy 0, policy_version 94156 (0.0025) [2024-04-26 05:34:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1542799360. Throughput: 0: 56488.0. Samples: 1492129720. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:34:43,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 05:34:44,168][47288] Updated weights for policy 0, policy_version 94166 (0.0029) [2024-04-26 05:34:46,448][47288] Updated weights for policy 0, policy_version 94176 (0.0033) [2024-04-26 05:34:48,923][47056] Fps is (10 sec: 60620.2, 60 sec: 56251.6, 300 sec: 56260.9). Total num frames: 1543110656. Throughput: 0: 56748.7. Samples: 1492471420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:34:48,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 05:34:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000094184_1543110656.pth... [2024-04-26 05:34:48,991][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000093359_1529593856.pth [2024-04-26 05:34:49,778][47288] Updated weights for policy 0, policy_version 94186 (0.0028) [2024-04-26 05:34:51,449][47267] Signal inference workers to stop experience collection... (22550 times) [2024-04-26 05:34:51,450][47267] Signal inference workers to resume experience collection... (22550 times) [2024-04-26 05:34:51,467][47288] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-04-26 05:34:51,468][47288] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-04-26 05:34:52,270][47288] Updated weights for policy 0, policy_version 94196 (0.0025) [2024-04-26 05:34:53,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1543389184. Throughput: 0: 56464.9. Samples: 1492807320. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:34:53,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 05:34:55,570][47288] Updated weights for policy 0, policy_version 94206 (0.0032) [2024-04-26 05:34:58,079][47288] Updated weights for policy 0, policy_version 94216 (0.0026) [2024-04-26 05:34:58,923][47056] Fps is (10 sec: 55706.6, 60 sec: 57071.0, 300 sec: 56205.5). Total num frames: 1543667712. Throughput: 0: 56507.2. Samples: 1492978480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:34:58,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 05:35:01,257][47288] Updated weights for policy 0, policy_version 94226 (0.0036) [2024-04-26 05:35:03,743][47288] Updated weights for policy 0, policy_version 94236 (0.0030) [2024-04-26 05:35:03,923][47056] Fps is (10 sec: 57343.2, 60 sec: 57070.9, 300 sec: 56372.1). Total num frames: 1543962624. Throughput: 0: 56622.2. Samples: 1493318680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:35:03,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 05:35:07,389][47288] Updated weights for policy 0, policy_version 94246 (0.0025) [2024-04-26 05:35:08,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1544224768. Throughput: 0: 56692.0. Samples: 1493654960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:35:08,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 05:35:09,438][47288] Updated weights for policy 0, policy_version 94256 (0.0025) [2024-04-26 05:35:13,167][47288] Updated weights for policy 0, policy_version 94266 (0.0024) [2024-04-26 05:35:13,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55978.7, 300 sec: 56316.6). Total num frames: 1544486912. Throughput: 0: 56289.1. Samples: 1493817760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:35:13,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 05:35:15,355][47288] Updated weights for policy 0, policy_version 94276 (0.0026) [2024-04-26 05:35:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1544765440. Throughput: 0: 56181.2. Samples: 1494154680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:35:18,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 05:35:19,022][47288] Updated weights for policy 0, policy_version 94286 (0.0035) [2024-04-26 05:35:21,687][47288] Updated weights for policy 0, policy_version 94296 (0.0031) [2024-04-26 05:35:23,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1545060352. Throughput: 0: 56049.7. Samples: 1494486880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:35:23,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 05:35:24,742][47288] Updated weights for policy 0, policy_version 94306 (0.0031) [2024-04-26 05:35:27,401][47288] Updated weights for policy 0, policy_version 94316 (0.0024) [2024-04-26 05:35:28,923][47056] Fps is (10 sec: 58982.0, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1545355264. Throughput: 0: 56059.0. Samples: 1494652380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 05:35:28,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 05:35:30,615][47288] Updated weights for policy 0, policy_version 94326 (0.0032) [2024-04-26 05:35:33,551][47288] Updated weights for policy 0, policy_version 94336 (0.0032) [2024-04-26 05:35:33,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1545617408. Throughput: 0: 55904.2. Samples: 1494987100. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:35:33,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 05:35:36,415][47288] Updated weights for policy 0, policy_version 94346 (0.0028) [2024-04-26 05:35:38,923][47056] Fps is (10 sec: 54068.0, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 1545895936. Throughput: 0: 55871.1. Samples: 1495321520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:35:38,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 05:35:39,477][47288] Updated weights for policy 0, policy_version 94356 (0.0026) [2024-04-26 05:35:42,186][47288] Updated weights for policy 0, policy_version 94366 (0.0030) [2024-04-26 05:35:43,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1546190848. Throughput: 0: 56058.1. Samples: 1495501100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:35:43,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 05:35:45,113][47288] Updated weights for policy 0, policy_version 94376 (0.0029) [2024-04-26 05:35:48,080][47288] Updated weights for policy 0, policy_version 94386 (0.0028) [2024-04-26 05:35:48,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1546436608. Throughput: 0: 55905.3. Samples: 1495834420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:35:48,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 05:35:49,398][47267] Signal inference workers to stop experience collection... (22600 times) [2024-04-26 05:35:49,451][47267] Signal inference workers to resume experience collection... 
(22600 times) [2024-04-26 05:35:49,452][47288] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-04-26 05:35:49,464][47288] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-04-26 05:35:50,895][47288] Updated weights for policy 0, policy_version 94396 (0.0030) [2024-04-26 05:35:53,765][47288] Updated weights for policy 0, policy_version 94406 (0.0026) [2024-04-26 05:35:53,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.6, 300 sec: 56316.6). Total num frames: 1546747904. Throughput: 0: 55935.7. Samples: 1496172060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:35:53,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 05:35:56,767][47288] Updated weights for policy 0, policy_version 94416 (0.0032) [2024-04-26 05:35:58,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 1547010048. Throughput: 0: 55965.7. Samples: 1496336220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:35:58,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 05:35:59,569][47288] Updated weights for policy 0, policy_version 94426 (0.0031) [2024-04-26 05:36:02,570][47288] Updated weights for policy 0, policy_version 94436 (0.0029) [2024-04-26 05:36:03,923][47056] Fps is (10 sec: 57342.9, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1547321344. Throughput: 0: 56044.8. Samples: 1496676700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:36:03,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 05:36:05,478][47288] Updated weights for policy 0, policy_version 94446 (0.0032) [2024-04-26 05:36:08,306][47288] Updated weights for policy 0, policy_version 94456 (0.0027) [2024-04-26 05:36:08,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56251.9, 300 sec: 56149.9). Total num frames: 1547599872. Throughput: 0: 55937.1. Samples: 1497004040. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:36:08,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:36:11,496][47288] Updated weights for policy 0, policy_version 94466 (0.0030) [2024-04-26 05:36:13,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1547862016. Throughput: 0: 56081.9. Samples: 1497176060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:36:13,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 05:36:14,125][47288] Updated weights for policy 0, policy_version 94476 (0.0027) [2024-04-26 05:36:17,220][47288] Updated weights for policy 0, policy_version 94486 (0.0032) [2024-04-26 05:36:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.9, 300 sec: 56205.5). Total num frames: 1548140544. Throughput: 0: 56038.2. Samples: 1497508820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:36:18,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 05:36:20,122][47288] Updated weights for policy 0, policy_version 94496 (0.0027) [2024-04-26 05:36:22,952][47288] Updated weights for policy 0, policy_version 94506 (0.0030) [2024-04-26 05:36:23,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.8, 300 sec: 56261.0). Total num frames: 1548402688. Throughput: 0: 56079.2. Samples: 1497845080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-04-26 05:36:23,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 05:36:25,925][47288] Updated weights for policy 0, policy_version 94516 (0.0028) [2024-04-26 05:36:28,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 1548697600. Throughput: 0: 55763.2. Samples: 1498010440. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:36:28,924][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 05:36:29,231][47288] Updated weights for policy 0, policy_version 94526 (0.0028) [2024-04-26 05:36:31,780][47288] Updated weights for policy 0, policy_version 94536 (0.0034) [2024-04-26 05:36:33,923][47056] Fps is (10 sec: 57342.8, 60 sec: 55978.5, 300 sec: 56261.0). Total num frames: 1548976128. Throughput: 0: 55770.2. Samples: 1498344080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:36:33,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:36:35,107][47288] Updated weights for policy 0, policy_version 94546 (0.0031) [2024-04-26 05:36:37,577][47288] Updated weights for policy 0, policy_version 94556 (0.0028) [2024-04-26 05:36:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 56094.4). Total num frames: 1549254656. Throughput: 0: 55758.2. Samples: 1498681180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:36:38,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 05:36:40,817][47288] Updated weights for policy 0, policy_version 94566 (0.0030) [2024-04-26 05:36:43,360][47288] Updated weights for policy 0, policy_version 94576 (0.0028) [2024-04-26 05:36:43,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 56094.3). Total num frames: 1549549568. Throughput: 0: 56013.1. Samples: 1498856820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:36:43,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 05:36:46,727][47288] Updated weights for policy 0, policy_version 94586 (0.0028) [2024-04-26 05:36:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56525.0, 300 sec: 56094.4). Total num frames: 1549828096. Throughput: 0: 55931.0. Samples: 1499193580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:36:48,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 05:36:49,028][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000094595_1549844480.pth... [2024-04-26 05:36:49,077][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000093769_1536311296.pth [2024-04-26 05:36:49,228][47288] Updated weights for policy 0, policy_version 94596 (0.0030) [2024-04-26 05:36:52,536][47288] Updated weights for policy 0, policy_version 94606 (0.0030) [2024-04-26 05:36:53,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 1550090240. Throughput: 0: 56005.7. Samples: 1499524300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:36:53,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 05:36:55,103][47288] Updated weights for policy 0, policy_version 94616 (0.0025) [2024-04-26 05:36:58,475][47288] Updated weights for policy 0, policy_version 94626 (0.0034) [2024-04-26 05:36:58,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1550368768. Throughput: 0: 55810.2. Samples: 1499687520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:36:58,924][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 05:37:00,265][47267] Signal inference workers to stop experience collection... (22650 times) [2024-04-26 05:37:00,265][47267] Signal inference workers to resume experience collection... (22650 times) [2024-04-26 05:37:00,284][47288] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-04-26 05:37:00,285][47288] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-04-26 05:37:00,917][47288] Updated weights for policy 0, policy_version 94636 (0.0024) [2024-04-26 05:37:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1550647296. Throughput: 0: 55951.8. Samples: 1500026660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:37:03,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 05:37:04,162][47288] Updated weights for policy 0, policy_version 94646 (0.0026) [2024-04-26 05:37:06,948][47288] Updated weights for policy 0, policy_version 94656 (0.0027) [2024-04-26 05:37:08,923][47056] Fps is (10 sec: 57342.9, 60 sec: 55705.3, 300 sec: 56205.4). Total num frames: 1550942208. Throughput: 0: 55940.5. Samples: 1500362420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:37:08,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 05:37:10,001][47288] Updated weights for policy 0, policy_version 94666 (0.0029) [2024-04-26 05:37:12,796][47288] Updated weights for policy 0, policy_version 94676 (0.0028) [2024-04-26 05:37:13,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 56149.9). Total num frames: 1551220736. Throughput: 0: 56157.4. Samples: 1500537520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:37:13,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 05:37:15,934][47288] Updated weights for policy 0, policy_version 94686 (0.0027) [2024-04-26 05:37:18,688][47288] Updated weights for policy 0, policy_version 94696 (0.0025) [2024-04-26 05:37:18,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.6, 300 sec: 56094.4). Total num frames: 1551515648. Throughput: 0: 56224.0. Samples: 1500874160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 05:37:18,924][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 05:37:21,738][47288] Updated weights for policy 0, policy_version 94706 (0.0029) [2024-04-26 05:37:23,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56797.7, 300 sec: 56150.5). Total num frames: 1551810560. Throughput: 0: 56189.2. Samples: 1501209700. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:37:23,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 05:37:24,326][47288] Updated weights for policy 0, policy_version 94716 (0.0027) [2024-04-26 05:37:27,476][47288] Updated weights for policy 0, policy_version 94726 (0.0026) [2024-04-26 05:37:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 1552056320. Throughput: 0: 56180.2. Samples: 1501384920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:37:28,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 05:37:30,015][47288] Updated weights for policy 0, policy_version 94736 (0.0028) [2024-04-26 05:37:33,439][47288] Updated weights for policy 0, policy_version 94746 (0.0028) [2024-04-26 05:37:33,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 1552334848. Throughput: 0: 56196.3. Samples: 1501722420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:37:33,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 05:37:35,805][47288] Updated weights for policy 0, policy_version 94756 (0.0030) [2024-04-26 05:37:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1552629760. Throughput: 0: 56333.3. Samples: 1502059300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:37:38,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 05:37:39,300][47288] Updated weights for policy 0, policy_version 94766 (0.0026) [2024-04-26 05:37:41,678][47288] Updated weights for policy 0, policy_version 94776 (0.0028) [2024-04-26 05:37:43,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1552908288. Throughput: 0: 56399.1. Samples: 1502225480. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:37:43,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 05:37:44,911][47288] Updated weights for policy 0, policy_version 94786 (0.0032) [2024-04-26 05:37:47,638][47288] Updated weights for policy 0, policy_version 94796 (0.0027) [2024-04-26 05:37:48,222][47267] Signal inference workers to stop experience collection... (22700 times) [2024-04-26 05:37:48,258][47288] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-04-26 05:37:48,311][47267] Signal inference workers to resume experience collection... (22700 times) [2024-04-26 05:37:48,312][47288] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-04-26 05:37:48,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.7, 300 sec: 56205.5). Total num frames: 1553219584. Throughput: 0: 56517.0. Samples: 1502569920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:37:48,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 05:37:50,599][47288] Updated weights for policy 0, policy_version 94806 (0.0028) [2024-04-26 05:37:53,511][47288] Updated weights for policy 0, policy_version 94816 (0.0034) [2024-04-26 05:37:53,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.8, 300 sec: 56094.4). Total num frames: 1553481728. Throughput: 0: 56670.2. Samples: 1502912560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:37:53,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 05:37:56,502][47288] Updated weights for policy 0, policy_version 94826 (0.0029) [2024-04-26 05:37:58,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56524.9, 300 sec: 56094.4). Total num frames: 1553760256. Throughput: 0: 56526.2. Samples: 1503081200. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:37:58,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:37:59,369][47288] Updated weights for policy 0, policy_version 94836 (0.0030) [2024-04-26 05:38:02,369][47288] Updated weights for policy 0, policy_version 94846 (0.0027) [2024-04-26 05:38:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.9, 300 sec: 56149.9). Total num frames: 1554055168. Throughput: 0: 56408.6. Samples: 1503412540. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:38:03,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 05:38:05,061][47288] Updated weights for policy 0, policy_version 94856 (0.0036) [2024-04-26 05:38:08,051][47288] Updated weights for policy 0, policy_version 94866 (0.0025) [2024-04-26 05:38:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56525.0, 300 sec: 56149.9). Total num frames: 1554333696. Throughput: 0: 56534.3. Samples: 1503753740. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:38:08,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 05:38:10,859][47288] Updated weights for policy 0, policy_version 94876 (0.0025) [2024-04-26 05:38:13,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1554595840. Throughput: 0: 56374.7. Samples: 1503921780. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:38:13,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 05:38:14,190][47288] Updated weights for policy 0, policy_version 94886 (0.0025) [2024-04-26 05:38:16,638][47288] Updated weights for policy 0, policy_version 94896 (0.0032) [2024-04-26 05:38:18,927][47056] Fps is (10 sec: 54045.6, 60 sec: 55975.0, 300 sec: 56204.7). Total num frames: 1554874368. Throughput: 0: 56379.5. Samples: 1504259720. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-26 05:38:18,927][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 05:38:19,904][47288] Updated weights for policy 0, policy_version 94906 (0.0028) [2024-04-26 05:38:22,509][47288] Updated weights for policy 0, policy_version 94916 (0.0031) [2024-04-26 05:38:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1555169280. Throughput: 0: 56309.7. Samples: 1504593240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:38:23,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 05:38:25,706][47288] Updated weights for policy 0, policy_version 94926 (0.0029) [2024-04-26 05:38:28,341][47288] Updated weights for policy 0, policy_version 94936 (0.0028) [2024-04-26 05:38:28,923][47056] Fps is (10 sec: 55728.0, 60 sec: 56251.8, 300 sec: 56038.9). Total num frames: 1555431424. Throughput: 0: 56274.8. Samples: 1504757840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:38:28,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 05:38:31,478][47288] Updated weights for policy 0, policy_version 94946 (0.0028) [2024-04-26 05:38:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.8, 300 sec: 56149.9). Total num frames: 1555742720. Throughput: 0: 56126.6. Samples: 1505095620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:38:33,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 05:38:34,321][47288] Updated weights for policy 0, policy_version 94956 (0.0027) [2024-04-26 05:38:37,252][47288] Updated weights for policy 0, policy_version 94966 (0.0034) [2024-04-26 05:38:38,923][47056] Fps is (10 sec: 58981.2, 60 sec: 56524.7, 300 sec: 56149.9). Total num frames: 1556021248. Throughput: 0: 56021.0. Samples: 1505433520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:38:38,924][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:38:40,030][47288] Updated weights for policy 0, policy_version 94976 (0.0029) [2024-04-26 05:38:43,100][47288] Updated weights for policy 0, policy_version 94986 (0.0024) [2024-04-26 05:38:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 1556299776. Throughput: 0: 56262.6. Samples: 1505613020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:38:43,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:38:45,804][47288] Updated weights for policy 0, policy_version 94996 (0.0028) [2024-04-26 05:38:47,357][47267] Signal inference workers to stop experience collection... (22750 times) [2024-04-26 05:38:47,397][47288] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-04-26 05:38:47,407][47267] Signal inference workers to resume experience collection... (22750 times) [2024-04-26 05:38:47,414][47288] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-04-26 05:38:48,875][47288] Updated weights for policy 0, policy_version 95006 (0.0025) [2024-04-26 05:38:48,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1556578304. Throughput: 0: 56283.1. Samples: 1505945280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:38:48,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 05:38:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000095006_1556578304.pth... [2024-04-26 05:38:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000094184_1543110656.pth [2024-04-26 05:38:51,658][47288] Updated weights for policy 0, policy_version 95016 (0.0028) [2024-04-26 05:38:53,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1556840448. Throughput: 0: 56187.7. Samples: 1506282180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:38:53,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 05:38:54,717][47288] Updated weights for policy 0, policy_version 95026 (0.0027) [2024-04-26 05:38:57,446][47288] Updated weights for policy 0, policy_version 95036 (0.0035) [2024-04-26 05:38:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 56205.5). Total num frames: 1557118976. Throughput: 0: 56129.8. Samples: 1506447620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:38:58,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 05:39:00,507][47288] Updated weights for policy 0, policy_version 95046 (0.0028) [2024-04-26 05:39:03,845][47288] Updated weights for policy 0, policy_version 95056 (0.0029) [2024-04-26 05:39:03,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 1557397504. Throughput: 0: 56126.3. Samples: 1506785180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:39:03,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 05:39:06,313][47288] Updated weights for policy 0, policy_version 95066 (0.0031) [2024-04-26 05:39:08,923][47056] Fps is (10 sec: 57342.7, 60 sec: 55978.5, 300 sec: 56149.9). Total num frames: 1557692416. Throughput: 0: 56240.2. Samples: 1507124060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:39:08,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 05:39:09,787][47288] Updated weights for policy 0, policy_version 95076 (0.0028) [2024-04-26 05:39:12,015][47288] Updated weights for policy 0, policy_version 95086 (0.0037) [2024-04-26 05:39:13,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 1557987328. Throughput: 0: 56347.5. Samples: 1507293480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 05:39:13,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:39:15,454][47288] Updated weights for policy 0, policy_version 95096 (0.0026) [2024-04-26 05:39:17,755][47288] Updated weights for policy 0, policy_version 95106 (0.0029) [2024-04-26 05:39:18,923][47056] Fps is (10 sec: 58984.2, 60 sec: 56801.7, 300 sec: 56205.5). Total num frames: 1558282240. Throughput: 0: 56351.8. Samples: 1507631440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:39:18,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 05:39:21,227][47288] Updated weights for policy 0, policy_version 95116 (0.0026) [2024-04-26 05:39:23,589][47288] Updated weights for policy 0, policy_version 95126 (0.0036) [2024-04-26 05:39:23,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.8, 300 sec: 56149.9). Total num frames: 1558560768. Throughput: 0: 56362.3. Samples: 1507969820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:39:23,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 05:39:27,078][47288] Updated weights for policy 0, policy_version 95136 (0.0036) [2024-04-26 05:39:28,923][47056] Fps is (10 sec: 54066.0, 60 sec: 56524.6, 300 sec: 56205.4). Total num frames: 1558822912. Throughput: 0: 56310.1. Samples: 1508146980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:39:28,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 05:39:29,304][47288] Updated weights for policy 0, policy_version 95146 (0.0028) [2024-04-26 05:39:33,041][47288] Updated weights for policy 0, policy_version 95156 (0.0036) [2024-04-26 05:39:33,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 56205.4). Total num frames: 1559085056. Throughput: 0: 56449.2. Samples: 1508485500. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:39:33,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 05:39:35,125][47288] Updated weights for policy 0, policy_version 95166 (0.0028) [2024-04-26 05:39:38,791][47288] Updated weights for policy 0, policy_version 95176 (0.0026) [2024-04-26 05:39:38,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55705.8, 300 sec: 56149.9). Total num frames: 1559363584. Throughput: 0: 56571.0. Samples: 1508827880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:39:38,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 05:39:41,104][47288] Updated weights for policy 0, policy_version 95186 (0.0034) [2024-04-26 05:39:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 1559658496. Throughput: 0: 56359.4. Samples: 1508983800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:39:43,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 05:39:44,702][47288] Updated weights for policy 0, policy_version 95196 (0.0033) [2024-04-26 05:39:46,818][47288] Updated weights for policy 0, policy_version 95206 (0.0032) [2024-04-26 05:39:48,926][47056] Fps is (10 sec: 60603.2, 60 sec: 56522.1, 300 sec: 56204.9). Total num frames: 1559969792. Throughput: 0: 56300.8. Samples: 1509318880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:39:48,926][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:39:50,450][47288] Updated weights for policy 0, policy_version 95216 (0.0034) [2024-04-26 05:39:50,736][47267] Signal inference workers to stop experience collection... (22800 times) [2024-04-26 05:39:50,766][47288] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-04-26 05:39:50,824][47267] Signal inference workers to resume experience collection... 
(22800 times) [2024-04-26 05:39:50,824][47288] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-04-26 05:39:52,607][47288] Updated weights for policy 0, policy_version 95226 (0.0027) [2024-04-26 05:39:53,923][47056] Fps is (10 sec: 60616.9, 60 sec: 57070.2, 300 sec: 56260.8). Total num frames: 1560264704. Throughput: 0: 56262.5. Samples: 1509655900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:39:53,924][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 05:39:56,243][47288] Updated weights for policy 0, policy_version 95236 (0.0033) [2024-04-26 05:39:58,352][47288] Updated weights for policy 0, policy_version 95246 (0.0030) [2024-04-26 05:39:58,923][47056] Fps is (10 sec: 57360.7, 60 sec: 57070.9, 300 sec: 56205.5). Total num frames: 1560543232. Throughput: 0: 56486.2. Samples: 1509835360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:39:58,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 05:40:02,070][47288] Updated weights for policy 0, policy_version 95256 (0.0026) [2024-04-26 05:40:03,923][47056] Fps is (10 sec: 54070.8, 60 sec: 56797.8, 300 sec: 56205.5). Total num frames: 1560805376. Throughput: 0: 56454.9. Samples: 1510171920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:40:03,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 05:40:04,393][47288] Updated weights for policy 0, policy_version 95266 (0.0027) [2024-04-26 05:40:07,759][47288] Updated weights for policy 0, policy_version 95276 (0.0024) [2024-04-26 05:40:08,923][47056] Fps is (10 sec: 52429.1, 60 sec: 56252.0, 300 sec: 56205.5). Total num frames: 1561067520. Throughput: 0: 56474.4. Samples: 1510511160. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:40:08,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 05:40:10,075][47288] Updated weights for policy 0, policy_version 95286 (0.0030) [2024-04-26 05:40:13,554][47288] Updated weights for policy 0, policy_version 95296 (0.0025) [2024-04-26 05:40:13,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1561346048. Throughput: 0: 56153.4. Samples: 1510673880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 05:40:13,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:40:15,785][47288] Updated weights for policy 0, policy_version 95306 (0.0038) [2024-04-26 05:40:18,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55432.4, 300 sec: 56094.4). Total num frames: 1561608192. Throughput: 0: 56201.7. Samples: 1511014580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:40:18,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 05:40:19,322][47288] Updated weights for policy 0, policy_version 95316 (0.0024) [2024-04-26 05:40:21,644][47288] Updated weights for policy 0, policy_version 95326 (0.0029) [2024-04-26 05:40:23,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 1561919488. Throughput: 0: 56081.3. Samples: 1511351540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:40:23,924][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 05:40:25,088][47288] Updated weights for policy 0, policy_version 95336 (0.0023) [2024-04-26 05:40:27,367][47288] Updated weights for policy 0, policy_version 95346 (0.0026) [2024-04-26 05:40:28,923][47056] Fps is (10 sec: 63898.7, 60 sec: 57071.1, 300 sec: 56372.1). Total num frames: 1562247168. Throughput: 0: 56575.8. Samples: 1511529700. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:40:28,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 05:40:30,869][47288] Updated weights for policy 0, policy_version 95356 (0.0027) [2024-04-26 05:40:32,868][47267] Signal inference workers to stop experience collection... (22850 times) [2024-04-26 05:40:32,921][47267] Signal inference workers to resume experience collection... (22850 times) [2024-04-26 05:40:32,921][47288] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-04-26 05:40:32,936][47288] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-04-26 05:40:33,031][47288] Updated weights for policy 0, policy_version 95366 (0.0032) [2024-04-26 05:40:33,923][47056] Fps is (10 sec: 62257.7, 60 sec: 57616.9, 300 sec: 56427.6). Total num frames: 1562542080. Throughput: 0: 56716.6. Samples: 1511870980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:40:33,924][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 05:40:36,815][47288] Updated weights for policy 0, policy_version 95376 (0.0027) [2024-04-26 05:40:38,844][47288] Updated weights for policy 0, policy_version 95386 (0.0034) [2024-04-26 05:40:38,923][47056] Fps is (10 sec: 55705.0, 60 sec: 57343.9, 300 sec: 56316.5). Total num frames: 1562804224. Throughput: 0: 56620.0. Samples: 1512203760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:40:38,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:40:42,574][47288] Updated weights for policy 0, policy_version 95396 (0.0028) [2024-04-26 05:40:43,923][47056] Fps is (10 sec: 49153.2, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1563033600. Throughput: 0: 56503.1. Samples: 1512378000. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:40:43,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 05:40:44,702][47288] Updated weights for policy 0, policy_version 95406 (0.0029) [2024-04-26 05:40:48,460][47288] Updated weights for policy 0, policy_version 95416 (0.0024) [2024-04-26 05:40:48,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55981.3, 300 sec: 56205.4). Total num frames: 1563328512. Throughput: 0: 56666.6. Samples: 1512721920. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:40:48,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 05:40:49,002][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000095419_1563344896.pth... [2024-04-26 05:40:49,045][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000094595_1549844480.pth [2024-04-26 05:40:50,408][47288] Updated weights for policy 0, policy_version 95426 (0.0025) [2024-04-26 05:40:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55706.3, 300 sec: 56261.0). Total num frames: 1563607040. Throughput: 0: 56543.5. Samples: 1513055620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:40:53,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 05:40:54,174][47288] Updated weights for policy 0, policy_version 95436 (0.0034) [2024-04-26 05:40:56,161][47288] Updated weights for policy 0, policy_version 95446 (0.0028) [2024-04-26 05:40:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 56149.9). Total num frames: 1563885568. Throughput: 0: 56437.8. Samples: 1513213580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:40:58,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 05:40:59,953][47288] Updated weights for policy 0, policy_version 95456 (0.0029) [2024-04-26 05:41:02,043][47288] Updated weights for policy 0, policy_version 95466 (0.0028) [2024-04-26 05:41:03,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56524.7, 300 sec: 56261.0). 
Total num frames: 1564196864. Throughput: 0: 56156.9. Samples: 1513541640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:41:03,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 05:41:05,836][47288] Updated weights for policy 0, policy_version 95476 (0.0031) [2024-04-26 05:41:07,947][47288] Updated weights for policy 0, policy_version 95486 (0.0025) [2024-04-26 05:41:08,923][47056] Fps is (10 sec: 62258.4, 60 sec: 57343.7, 300 sec: 56427.6). Total num frames: 1564508160. Throughput: 0: 56178.0. Samples: 1513879560. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 05:41:08,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 05:41:11,831][47288] Updated weights for policy 0, policy_version 95496 (0.0024) [2024-04-26 05:41:13,620][47288] Updated weights for policy 0, policy_version 95506 (0.0027) [2024-04-26 05:41:13,923][47056] Fps is (10 sec: 58983.4, 60 sec: 57344.1, 300 sec: 56427.6). Total num frames: 1564786688. Throughput: 0: 56466.2. Samples: 1514070680. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:13,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 05:41:17,578][47288] Updated weights for policy 0, policy_version 95516 (0.0029) [2024-04-26 05:41:18,923][47056] Fps is (10 sec: 50791.0, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1565016064. Throughput: 0: 56417.6. Samples: 1514409760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:18,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 05:41:19,022][47267] Signal inference workers to stop experience collection... (22900 times) [2024-04-26 05:41:19,057][47288] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-04-26 05:41:19,109][47267] Signal inference workers to resume experience collection... 
(22900 times) [2024-04-26 05:41:19,109][47288] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-04-26 05:41:19,476][47288] Updated weights for policy 0, policy_version 95526 (0.0026) [2024-04-26 05:41:23,452][47288] Updated weights for policy 0, policy_version 95536 (0.0026) [2024-04-26 05:41:23,923][47056] Fps is (10 sec: 49150.7, 60 sec: 55978.5, 300 sec: 56205.4). Total num frames: 1565278208. Throughput: 0: 56408.2. Samples: 1514742140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:23,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 05:41:25,329][47288] Updated weights for policy 0, policy_version 95546 (0.0026) [2024-04-26 05:41:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55159.3, 300 sec: 56205.5). Total num frames: 1565556736. Throughput: 0: 55924.8. Samples: 1514894620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:28,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 05:41:29,306][47288] Updated weights for policy 0, policy_version 95556 (0.0029) [2024-04-26 05:41:31,123][47288] Updated weights for policy 0, policy_version 95566 (0.0031) [2024-04-26 05:41:33,923][47056] Fps is (10 sec: 58983.2, 60 sec: 55432.7, 300 sec: 56316.5). Total num frames: 1565868032. Throughput: 0: 55764.0. Samples: 1515231300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:33,924][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 05:41:35,005][47288] Updated weights for policy 0, policy_version 95576 (0.0035) [2024-04-26 05:41:36,911][47288] Updated weights for policy 0, policy_version 95586 (0.0030) [2024-04-26 05:41:38,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 1566146560. Throughput: 0: 55983.6. Samples: 1515574880. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:38,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 05:41:40,710][47288] Updated weights for policy 0, policy_version 95596 (0.0033) [2024-04-26 05:41:42,805][47288] Updated weights for policy 0, policy_version 95606 (0.0035) [2024-04-26 05:41:43,923][47056] Fps is (10 sec: 58982.5, 60 sec: 57070.9, 300 sec: 56372.0). Total num frames: 1566457856. Throughput: 0: 56491.6. Samples: 1515755700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:43,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 05:41:46,619][47288] Updated weights for policy 0, policy_version 95616 (0.0025) [2024-04-26 05:41:48,563][47288] Updated weights for policy 0, policy_version 95626 (0.0028) [2024-04-26 05:41:48,923][47056] Fps is (10 sec: 60619.7, 60 sec: 57070.9, 300 sec: 56483.1). Total num frames: 1566752768. Throughput: 0: 56602.6. Samples: 1516088760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:48,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 05:41:52,595][47288] Updated weights for policy 0, policy_version 95636 (0.0033) [2024-04-26 05:41:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1566998528. Throughput: 0: 56516.2. Samples: 1516422780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:53,924][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 05:41:54,421][47288] Updated weights for policy 0, policy_version 95646 (0.0028) [2024-04-26 05:41:58,322][47288] Updated weights for policy 0, policy_version 95656 (0.0032) [2024-04-26 05:41:58,923][47056] Fps is (10 sec: 50790.9, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1567260672. Throughput: 0: 56105.2. Samples: 1516595420. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:41:58,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 05:41:59,518][47267] Signal inference workers to stop experience collection... 
(22950 times) [2024-04-26 05:41:59,519][47267] Signal inference workers to resume experience collection... (22950 times) [2024-04-26 05:41:59,531][47288] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-04-26 05:41:59,531][47288] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-04-26 05:42:00,118][47288] Updated weights for policy 0, policy_version 95666 (0.0029) [2024-04-26 05:42:03,923][47056] Fps is (10 sec: 52429.2, 60 sec: 55432.6, 300 sec: 56205.5). Total num frames: 1567522816. Throughput: 0: 55976.5. Samples: 1516928700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:42:03,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 05:42:04,237][47288] Updated weights for policy 0, policy_version 95676 (0.0028) [2024-04-26 05:42:05,912][47288] Updated weights for policy 0, policy_version 95686 (0.0036) [2024-04-26 05:42:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55432.7, 300 sec: 56316.5). Total num frames: 1567834112. Throughput: 0: 56079.8. Samples: 1517265720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 05:42:08,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 05:42:10,006][47288] Updated weights for policy 0, policy_version 95696 (0.0030) [2024-04-26 05:42:11,951][47288] Updated weights for policy 0, policy_version 95706 (0.0032) [2024-04-26 05:42:13,923][47056] Fps is (10 sec: 58981.4, 60 sec: 55432.3, 300 sec: 56261.0). Total num frames: 1568112640. Throughput: 0: 56259.5. Samples: 1517426300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:13,924][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 05:42:15,902][47288] Updated weights for policy 0, policy_version 95716 (0.0026) [2024-04-26 05:42:17,682][47288] Updated weights for policy 0, policy_version 95726 (0.0028) [2024-04-26 05:42:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1568407552. Throughput: 0: 56174.7. Samples: 1517759160. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:18,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 05:42:21,597][47288] Updated weights for policy 0, policy_version 95736 (0.0027) [2024-04-26 05:42:23,335][47288] Updated weights for policy 0, policy_version 95746 (0.0033) [2024-04-26 05:42:23,923][47056] Fps is (10 sec: 58983.0, 60 sec: 57071.1, 300 sec: 56427.6). Total num frames: 1568702464. Throughput: 0: 56091.5. Samples: 1518099000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:23,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 05:42:27,382][47288] Updated weights for policy 0, policy_version 95756 (0.0030) [2024-04-26 05:42:28,923][47056] Fps is (10 sec: 58983.0, 60 sec: 57344.1, 300 sec: 56483.2). Total num frames: 1568997376. Throughput: 0: 56021.0. Samples: 1518276640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:28,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 05:42:29,361][47288] Updated weights for policy 0, policy_version 95766 (0.0031) [2024-04-26 05:42:33,355][47288] Updated weights for policy 0, policy_version 95776 (0.0030) [2024-04-26 05:42:33,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1569226752. Throughput: 0: 56059.1. Samples: 1518611420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:33,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 05:42:35,352][47288] Updated weights for policy 0, policy_version 95786 (0.0027) [2024-04-26 05:42:38,923][47056] Fps is (10 sec: 49151.5, 60 sec: 55705.5, 300 sec: 56205.5). Total num frames: 1569488896. Throughput: 0: 56240.9. Samples: 1518953620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:38,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 05:42:39,351][47288] Updated weights for policy 0, policy_version 95796 (0.0032) [2024-04-26 05:42:39,754][47267] Signal inference workers to stop experience collection... 
(23000 times) [2024-04-26 05:42:39,755][47267] Signal inference workers to resume experience collection... (23000 times) [2024-04-26 05:42:39,770][47288] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-04-26 05:42:39,770][47288] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-04-26 05:42:41,072][47288] Updated weights for policy 0, policy_version 95806 (0.0027) [2024-04-26 05:42:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 56149.9). Total num frames: 1569783808. Throughput: 0: 55855.9. Samples: 1519108940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:43,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 05:42:45,052][47288] Updated weights for policy 0, policy_version 95816 (0.0025) [2024-04-26 05:42:47,052][47288] Updated weights for policy 0, policy_version 95826 (0.0027) [2024-04-26 05:42:48,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1570078720. Throughput: 0: 55979.0. Samples: 1519447760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:48,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 05:42:49,015][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000095831_1570095104.pth... [2024-04-26 05:42:49,057][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000095006_1556578304.pth [2024-04-26 05:42:50,895][47288] Updated weights for policy 0, policy_version 95836 (0.0027) [2024-04-26 05:42:53,151][47288] Updated weights for policy 0, policy_version 95846 (0.0030) [2024-04-26 05:42:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1570357248. Throughput: 0: 55871.5. Samples: 1519779940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:53,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 05:42:56,748][47288] Updated weights for policy 0, policy_version 95856 (0.0036) [2024-04-26 05:42:58,820][47288] Updated weights for policy 0, policy_version 95866 (0.0037) [2024-04-26 05:42:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1570668544. Throughput: 0: 56433.4. Samples: 1519965800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:42:58,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:43:02,475][47288] Updated weights for policy 0, policy_version 95876 (0.0028) [2024-04-26 05:43:03,923][47056] Fps is (10 sec: 60621.2, 60 sec: 57344.0, 300 sec: 56372.1). Total num frames: 1570963456. Throughput: 0: 56483.7. Samples: 1520300920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 05:43:03,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 05:43:04,900][47288] Updated weights for policy 0, policy_version 95886 (0.0029) [2024-04-26 05:43:08,171][47288] Updated weights for policy 0, policy_version 95896 (0.0027) [2024-04-26 05:43:08,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1571225600. Throughput: 0: 56525.8. Samples: 1520642660. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:08,923][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 05:43:10,572][47288] Updated weights for policy 0, policy_version 95906 (0.0027) [2024-04-26 05:43:13,915][47288] Updated weights for policy 0, policy_version 95916 (0.0027) [2024-04-26 05:43:13,923][47056] Fps is (10 sec: 52428.0, 60 sec: 56251.8, 300 sec: 56317.3). Total num frames: 1571487744. Throughput: 0: 56252.7. Samples: 1520808020. 
Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:13,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 05:43:16,472][47288] Updated weights for policy 0, policy_version 95926 (0.0028) [2024-04-26 05:43:18,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 1571749888. Throughput: 0: 56164.9. Samples: 1521138840. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:18,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 05:43:19,798][47288] Updated weights for policy 0, policy_version 95936 (0.0035) [2024-04-26 05:43:22,146][47288] Updated weights for policy 0, policy_version 95946 (0.0027) [2024-04-26 05:43:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 56316.5). Total num frames: 1572044800. Throughput: 0: 56103.1. Samples: 1521478260. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:23,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 05:43:25,670][47288] Updated weights for policy 0, policy_version 95956 (0.0026) [2024-04-26 05:43:27,803][47288] Updated weights for policy 0, policy_version 95966 (0.0027) [2024-04-26 05:43:28,923][47056] Fps is (10 sec: 60620.5, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1572356096. Throughput: 0: 56293.7. Samples: 1521642160. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:28,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 05:43:31,245][47267] Signal inference workers to stop experience collection... (23050 times) [2024-04-26 05:43:31,245][47267] Signal inference workers to resume experience collection... 
(23050 times) [2024-04-26 05:43:31,259][47288] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-04-26 05:43:31,282][47288] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-04-26 05:43:31,354][47288] Updated weights for policy 0, policy_version 95976 (0.0027) [2024-04-26 05:43:33,724][47288] Updated weights for policy 0, policy_version 95986 (0.0030) [2024-04-26 05:43:33,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56798.0, 300 sec: 56316.6). Total num frames: 1572634624. Throughput: 0: 56408.1. Samples: 1521986120. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:33,923][47056] Avg episode reward: [(0, '0.378')] [2024-04-26 05:43:37,103][47288] Updated weights for policy 0, policy_version 95996 (0.0031) [2024-04-26 05:43:38,923][47056] Fps is (10 sec: 57344.6, 60 sec: 57344.0, 300 sec: 56372.1). Total num frames: 1572929536. Throughput: 0: 56491.5. Samples: 1522322060. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:38,932][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 05:43:39,589][47288] Updated weights for policy 0, policy_version 96006 (0.0031) [2024-04-26 05:43:42,927][47288] Updated weights for policy 0, policy_version 96016 (0.0029) [2024-04-26 05:43:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56798.0, 300 sec: 56316.5). Total num frames: 1573191680. Throughput: 0: 56246.0. Samples: 1522496860. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:43,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 05:43:45,626][47288] Updated weights for policy 0, policy_version 96026 (0.0029) [2024-04-26 05:43:48,684][47288] Updated weights for policy 0, policy_version 96036 (0.0028) [2024-04-26 05:43:48,923][47056] Fps is (10 sec: 52428.6, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1573453824. Throughput: 0: 56417.1. Samples: 1522839700. 
Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:48,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 05:43:51,401][47288] Updated weights for policy 0, policy_version 96046 (0.0028) [2024-04-26 05:43:53,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1573732352. Throughput: 0: 56469.4. Samples: 1523183780. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:53,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 05:43:54,528][47288] Updated weights for policy 0, policy_version 96056 (0.0039) [2024-04-26 05:43:57,239][47288] Updated weights for policy 0, policy_version 96066 (0.0036) [2024-04-26 05:43:58,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1574027264. Throughput: 0: 56321.8. Samples: 1523342500. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:43:58,923][47056] Avg episode reward: [(0, '0.381')] [2024-04-26 05:44:00,333][47288] Updated weights for policy 0, policy_version 96076 (0.0027) [2024-04-26 05:44:02,950][47288] Updated weights for policy 0, policy_version 96086 (0.0022) [2024-04-26 05:44:03,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1574322176. Throughput: 0: 56553.0. Samples: 1523683720. Policy #0 lag: (min: 2.0, avg: 10.0, max: 22.0) [2024-04-26 05:44:03,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 05:44:06,108][47288] Updated weights for policy 0, policy_version 96096 (0.0030) [2024-04-26 05:44:08,759][47288] Updated weights for policy 0, policy_version 96106 (0.0027) [2024-04-26 05:44:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1574600704. Throughput: 0: 56581.8. Samples: 1524024440. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:08,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 05:44:12,030][47288] Updated weights for policy 0, policy_version 96116 (0.0030) [2024-04-26 05:44:13,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56798.0, 300 sec: 56316.5). Total num frames: 1574895616. Throughput: 0: 56955.7. Samples: 1524205160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:13,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 05:44:14,362][47288] Updated weights for policy 0, policy_version 96126 (0.0029) [2024-04-26 05:44:17,842][47288] Updated weights for policy 0, policy_version 96136 (0.0030) [2024-04-26 05:44:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 57071.0, 300 sec: 56316.5). Total num frames: 1575174144. Throughput: 0: 56765.7. Samples: 1524540580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:18,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 05:44:20,230][47288] Updated weights for policy 0, policy_version 96146 (0.0030) [2024-04-26 05:44:20,563][47267] Signal inference workers to stop experience collection... (23100 times) [2024-04-26 05:44:20,596][47288] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-04-26 05:44:20,648][47267] Signal inference workers to resume experience collection... (23100 times) [2024-04-26 05:44:20,649][47288] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-04-26 05:44:23,576][47288] Updated weights for policy 0, policy_version 96156 (0.0024) [2024-04-26 05:44:23,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56798.0, 300 sec: 56372.1). Total num frames: 1575452672. Throughput: 0: 56877.5. Samples: 1524881540. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:23,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 05:44:26,057][47288] Updated weights for policy 0, policy_version 96166 (0.0036) [2024-04-26 05:44:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.9, 300 sec: 56372.1). Total num frames: 1575714816. Throughput: 0: 56503.9. Samples: 1525039540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:28,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 05:44:29,346][47288] Updated weights for policy 0, policy_version 96176 (0.0025) [2024-04-26 05:44:31,686][47288] Updated weights for policy 0, policy_version 96186 (0.0029) [2024-04-26 05:44:33,923][47056] Fps is (10 sec: 55704.1, 60 sec: 56251.5, 300 sec: 56427.6). Total num frames: 1576009728. Throughput: 0: 56409.7. Samples: 1525378140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:33,924][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 05:44:35,019][47288] Updated weights for policy 0, policy_version 96196 (0.0032) [2024-04-26 05:44:37,569][47288] Updated weights for policy 0, policy_version 96206 (0.0025) [2024-04-26 05:44:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1576288256. Throughput: 0: 56279.6. Samples: 1525716360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:38,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 05:44:40,825][47288] Updated weights for policy 0, policy_version 96216 (0.0026) [2024-04-26 05:44:43,223][47288] Updated weights for policy 0, policy_version 96226 (0.0028) [2024-04-26 05:44:43,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56524.7, 300 sec: 56317.1). Total num frames: 1576583168. Throughput: 0: 56556.5. Samples: 1525887540. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:43,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 05:44:46,674][47288] Updated weights for policy 0, policy_version 96236 (0.0027) [2024-04-26 05:44:48,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56798.0, 300 sec: 56261.1). Total num frames: 1576861696. Throughput: 0: 56376.5. Samples: 1526220660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:48,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 05:44:48,981][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000096245_1576878080.pth... [2024-04-26 05:44:49,030][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000095419_1563344896.pth [2024-04-26 05:44:49,162][47288] Updated weights for policy 0, policy_version 96246 (0.0026) [2024-04-26 05:44:52,554][47288] Updated weights for policy 0, policy_version 96256 (0.0031) [2024-04-26 05:44:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 57071.0, 300 sec: 56316.5). Total num frames: 1577156608. Throughput: 0: 56427.3. Samples: 1526563660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:53,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 05:44:54,826][47288] Updated weights for policy 0, policy_version 96266 (0.0031) [2024-04-26 05:44:58,218][47288] Updated weights for policy 0, policy_version 96276 (0.0033) [2024-04-26 05:44:58,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56797.8, 300 sec: 56372.0). Total num frames: 1577435136. Throughput: 0: 56411.4. Samples: 1526743680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:44:58,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 05:45:00,657][47288] Updated weights for policy 0, policy_version 96286 (0.0028) [2024-04-26 05:45:03,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1577697280. Throughput: 0: 56427.9. Samples: 1527079840. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 05:45:03,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 05:45:04,040][47288] Updated weights for policy 0, policy_version 96296 (0.0034) [2024-04-26 05:45:06,345][47288] Updated weights for policy 0, policy_version 96306 (0.0025) [2024-04-26 05:45:08,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1577959424. Throughput: 0: 56307.4. Samples: 1527415380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:08,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 05:45:09,756][47288] Updated weights for policy 0, policy_version 96316 (0.0032) [2024-04-26 05:45:12,210][47288] Updated weights for policy 0, policy_version 96326 (0.0030) [2024-04-26 05:45:13,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1578254336. Throughput: 0: 56396.8. Samples: 1527577400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:13,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 05:45:15,595][47288] Updated weights for policy 0, policy_version 96336 (0.0026) [2024-04-26 05:45:16,401][47267] Signal inference workers to stop experience collection... (23150 times) [2024-04-26 05:45:16,423][47288] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-04-26 05:45:16,493][47267] Signal inference workers to resume experience collection... (23150 times) [2024-04-26 05:45:16,493][47288] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-04-26 05:45:18,042][47288] Updated weights for policy 0, policy_version 96346 (0.0025) [2024-04-26 05:45:18,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1578549248. Throughput: 0: 56337.6. Samples: 1527913320. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:18,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:45:21,386][47288] Updated weights for policy 0, policy_version 96356 (0.0035) [2024-04-26 05:45:23,884][47288] Updated weights for policy 0, policy_version 96366 (0.0029) [2024-04-26 05:45:23,923][47056] Fps is (10 sec: 60620.6, 60 sec: 56797.7, 300 sec: 56316.5). Total num frames: 1578860544. Throughput: 0: 56280.7. Samples: 1528249000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:23,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 05:45:27,169][47288] Updated weights for policy 0, policy_version 96376 (0.0027) [2024-04-26 05:45:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 56205.5). Total num frames: 1579122688. Throughput: 0: 56554.3. Samples: 1528432480. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:28,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 05:45:29,645][47288] Updated weights for policy 0, policy_version 96386 (0.0030) [2024-04-26 05:45:33,042][47288] Updated weights for policy 0, policy_version 96396 (0.0028) [2024-04-26 05:45:33,923][47056] Fps is (10 sec: 52429.7, 60 sec: 56252.0, 300 sec: 56205.5). Total num frames: 1579384832. Throughput: 0: 56507.2. Samples: 1528763480. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:33,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 05:45:35,378][47288] Updated weights for policy 0, policy_version 96406 (0.0027) [2024-04-26 05:45:38,862][47288] Updated weights for policy 0, policy_version 96416 (0.0032) [2024-04-26 05:45:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1579679744. Throughput: 0: 56459.1. Samples: 1529104320. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:38,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 05:45:41,210][47288] Updated weights for policy 0, policy_version 96426 (0.0026) [2024-04-26 05:45:43,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1579941888. Throughput: 0: 56009.1. Samples: 1529264080. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:43,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 05:45:44,723][47288] Updated weights for policy 0, policy_version 96436 (0.0029) [2024-04-26 05:45:46,887][47288] Updated weights for policy 0, policy_version 96446 (0.0031) [2024-04-26 05:45:48,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1580236800. Throughput: 0: 56164.5. Samples: 1529607240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:48,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 05:45:50,541][47288] Updated weights for policy 0, policy_version 96456 (0.0031) [2024-04-26 05:45:52,726][47288] Updated weights for policy 0, policy_version 96466 (0.0025) [2024-04-26 05:45:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 56316.5). Total num frames: 1580498944. Throughput: 0: 56216.9. Samples: 1529945140. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:53,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:45:56,311][47288] Updated weights for policy 0, policy_version 96476 (0.0030) [2024-04-26 05:45:58,599][47288] Updated weights for policy 0, policy_version 96486 (0.0029) [2024-04-26 05:45:58,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1580826624. Throughput: 0: 56370.3. Samples: 1530114060. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 05:45:58,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 05:46:01,993][47288] Updated weights for policy 0, policy_version 96496 (0.0028) [2024-04-26 05:46:03,923][47056] Fps is (10 sec: 60621.2, 60 sec: 56798.0, 300 sec: 56261.0). Total num frames: 1581105152. Throughput: 0: 56510.3. Samples: 1530456280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:03,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 05:46:04,288][47288] Updated weights for policy 0, policy_version 96506 (0.0031) [2024-04-26 05:46:08,087][47288] Updated weights for policy 0, policy_version 96516 (0.0029) [2024-04-26 05:46:08,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56797.8, 300 sec: 56205.4). Total num frames: 1581367296. Throughput: 0: 56617.8. Samples: 1530796800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:08,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 05:46:09,172][47267] Signal inference workers to stop experience collection... (23200 times) [2024-04-26 05:46:09,212][47288] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-04-26 05:46:09,225][47267] Signal inference workers to resume experience collection... (23200 times) [2024-04-26 05:46:09,231][47288] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-04-26 05:46:10,613][47288] Updated weights for policy 0, policy_version 96526 (0.0026) [2024-04-26 05:46:13,903][47288] Updated weights for policy 0, policy_version 96536 (0.0034) [2024-04-26 05:46:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1581645824. Throughput: 0: 56173.7. Samples: 1530960300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:13,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 05:46:16,526][47288] Updated weights for policy 0, policy_version 96546 (0.0030) [2024-04-26 05:46:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1581940736. Throughput: 0: 56326.5. Samples: 1531298180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:18,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 05:46:19,649][47288] Updated weights for policy 0, policy_version 96556 (0.0039) [2024-04-26 05:46:22,461][47288] Updated weights for policy 0, policy_version 96566 (0.0031) [2024-04-26 05:46:23,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 56372.1). Total num frames: 1582186496. Throughput: 0: 56270.1. Samples: 1531636480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:23,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 05:46:25,461][47288] Updated weights for policy 0, policy_version 96576 (0.0028) [2024-04-26 05:46:28,307][47288] Updated weights for policy 0, policy_version 96586 (0.0028) [2024-04-26 05:46:28,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1582481408. Throughput: 0: 56462.5. Samples: 1531804900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:28,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 05:46:31,347][47288] Updated weights for policy 0, policy_version 96596 (0.0024) [2024-04-26 05:46:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 1582759936. Throughput: 0: 56385.8. Samples: 1532144600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:33,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 05:46:34,165][47288] Updated weights for policy 0, policy_version 96606 (0.0023) [2024-04-26 05:46:37,061][47288] Updated weights for policy 0, policy_version 96616 (0.0030) [2024-04-26 05:46:38,923][47056] Fps is (10 sec: 60620.6, 60 sec: 56797.7, 300 sec: 56372.1). Total num frames: 1583087616. Throughput: 0: 56287.4. Samples: 1532478080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:38,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 05:46:40,006][47288] Updated weights for policy 0, policy_version 96626 (0.0027) [2024-04-26 05:46:42,836][47288] Updated weights for policy 0, policy_version 96636 (0.0027) [2024-04-26 05:46:43,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1583333376. Throughput: 0: 56309.0. Samples: 1532647960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:43,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 05:46:45,645][47288] Updated weights for policy 0, policy_version 96646 (0.0026) [2024-04-26 05:46:48,609][47288] Updated weights for policy 0, policy_version 96656 (0.0028) [2024-04-26 05:46:48,923][47056] Fps is (10 sec: 52429.6, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1583611904. Throughput: 0: 56209.3. Samples: 1532985700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:48,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 05:46:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000096656_1583611904.pth... [2024-04-26 05:46:48,976][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000095831_1570095104.pth [2024-04-26 05:46:51,376][47288] Updated weights for policy 0, policy_version 96666 (0.0033) [2024-04-26 05:46:53,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56524.8, 300 sec: 56372.1). 
Total num frames: 1583890432. Throughput: 0: 56053.4. Samples: 1533319200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:53,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 05:46:54,980][47288] Updated weights for policy 0, policy_version 96676 (0.0030) [2024-04-26 05:46:57,385][47288] Updated weights for policy 0, policy_version 96686 (0.0027) [2024-04-26 05:46:58,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 56483.2). Total num frames: 1584185344. Throughput: 0: 56205.4. Samples: 1533489540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:46:58,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 05:47:00,820][47288] Updated weights for policy 0, policy_version 96696 (0.0032) [2024-04-26 05:47:03,253][47288] Updated weights for policy 0, policy_version 96706 (0.0025) [2024-04-26 05:47:03,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1584431104. Throughput: 0: 56161.1. Samples: 1533825420. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:03,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 05:47:06,688][47288] Updated weights for policy 0, policy_version 96716 (0.0036) [2024-04-26 05:47:08,445][47267] Signal inference workers to stop experience collection... (23250 times) [2024-04-26 05:47:08,487][47288] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-04-26 05:47:08,498][47267] Signal inference workers to resume experience collection... (23250 times) [2024-04-26 05:47:08,504][47288] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-04-26 05:47:08,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1584742400. Throughput: 0: 56135.6. Samples: 1534162580. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:08,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 05:47:09,052][47288] Updated weights for policy 0, policy_version 96726 (0.0026) [2024-04-26 05:47:12,357][47288] Updated weights for policy 0, policy_version 96736 (0.0030) [2024-04-26 05:47:13,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1585020928. Throughput: 0: 56048.5. Samples: 1534327080. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:13,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 05:47:14,864][47288] Updated weights for policy 0, policy_version 96746 (0.0030) [2024-04-26 05:47:18,216][47288] Updated weights for policy 0, policy_version 96756 (0.0029) [2024-04-26 05:47:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1585299456. Throughput: 0: 56150.8. Samples: 1534671380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:18,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 05:47:20,804][47288] Updated weights for policy 0, policy_version 96766 (0.0024) [2024-04-26 05:47:23,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55978.8, 300 sec: 56094.4). Total num frames: 1585545216. Throughput: 0: 56250.5. Samples: 1535009340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:23,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 05:47:24,090][47288] Updated weights for policy 0, policy_version 96776 (0.0027) [2024-04-26 05:47:26,740][47288] Updated weights for policy 0, policy_version 96786 (0.0029) [2024-04-26 05:47:28,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1585872896. Throughput: 0: 56117.5. Samples: 1535173260. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:28,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 05:47:29,761][47288] Updated weights for policy 0, policy_version 96796 (0.0030) [2024-04-26 05:47:32,395][47288] Updated weights for policy 0, policy_version 96806 (0.0025) [2024-04-26 05:47:33,923][47056] Fps is (10 sec: 60620.0, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1586151424. Throughput: 0: 56111.0. Samples: 1535510700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:33,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 05:47:35,347][47288] Updated weights for policy 0, policy_version 96816 (0.0027) [2024-04-26 05:47:38,178][47288] Updated weights for policy 0, policy_version 96826 (0.0031) [2024-04-26 05:47:38,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55705.8, 300 sec: 56427.6). Total num frames: 1586429952. Throughput: 0: 56346.8. Samples: 1535854800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:38,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 05:47:41,150][47288] Updated weights for policy 0, policy_version 96836 (0.0031) [2024-04-26 05:47:43,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55978.7, 300 sec: 56316.6). Total num frames: 1586692096. Throughput: 0: 56372.9. Samples: 1536026320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:43,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 05:47:44,260][47288] Updated weights for policy 0, policy_version 96846 (0.0032) [2024-04-26 05:47:47,000][47288] Updated weights for policy 0, policy_version 96856 (0.0036) [2024-04-26 05:47:48,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1586987008. Throughput: 0: 56319.0. Samples: 1536359780. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:48,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 05:47:50,167][47288] Updated weights for policy 0, policy_version 96866 (0.0027) [2024-04-26 05:47:52,883][47288] Updated weights for policy 0, policy_version 96876 (0.0026) [2024-04-26 05:47:53,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1587265536. Throughput: 0: 56319.9. Samples: 1536696980. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:53,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 05:47:55,877][47288] Updated weights for policy 0, policy_version 96886 (0.0029) [2024-04-26 05:47:58,581][47288] Updated weights for policy 0, policy_version 96896 (0.0027) [2024-04-26 05:47:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1587544064. Throughput: 0: 56413.9. Samples: 1536865700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 05:47:58,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 05:48:01,649][47288] Updated weights for policy 0, policy_version 96906 (0.0032) [2024-04-26 05:48:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1587838976. Throughput: 0: 56180.4. Samples: 1537199500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:03,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 05:48:04,563][47288] Updated weights for policy 0, policy_version 96916 (0.0026) [2024-04-26 05:48:07,489][47288] Updated weights for policy 0, policy_version 96926 (0.0032) [2024-04-26 05:48:08,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1588133888. Throughput: 0: 56114.2. Samples: 1537534480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:08,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 05:48:10,361][47288] Updated weights for policy 0, policy_version 96936 (0.0027) [2024-04-26 05:48:13,375][47288] Updated weights for policy 0, policy_version 96946 (0.0035) [2024-04-26 05:48:13,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1588396032. Throughput: 0: 56322.9. Samples: 1537707780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:13,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 05:48:16,039][47288] Updated weights for policy 0, policy_version 96956 (0.0028) [2024-04-26 05:48:18,923][47056] Fps is (10 sec: 52427.9, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1588658176. Throughput: 0: 56275.1. Samples: 1538043080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:18,923][47056] Avg episode reward: [(0, '0.595')] [2024-04-26 05:48:19,148][47288] Updated weights for policy 0, policy_version 96966 (0.0029) [2024-04-26 05:48:22,029][47288] Updated weights for policy 0, policy_version 96976 (0.0031) [2024-04-26 05:48:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 1588953088. Throughput: 0: 56188.4. Samples: 1538383280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:23,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 05:48:24,883][47288] Updated weights for policy 0, policy_version 96986 (0.0034) [2024-04-26 05:48:27,833][47288] Updated weights for policy 0, policy_version 96996 (0.0026) [2024-04-26 05:48:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1589231616. Throughput: 0: 56147.8. Samples: 1538552980. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:28,923][47056] Avg episode reward: [(0, '0.595')] [2024-04-26 05:48:30,951][47288] Updated weights for policy 0, policy_version 97006 (0.0031) [2024-04-26 05:48:33,715][47288] Updated weights for policy 0, policy_version 97016 (0.0027) [2024-04-26 05:48:33,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1589510144. Throughput: 0: 56100.4. Samples: 1538884300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:33,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 05:48:35,855][47267] Signal inference workers to stop experience collection... (23300 times) [2024-04-26 05:48:35,896][47288] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-04-26 05:48:35,912][47267] Signal inference workers to resume experience collection... (23300 times) [2024-04-26 05:48:35,913][47288] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-04-26 05:48:36,692][47288] Updated weights for policy 0, policy_version 97026 (0.0028) [2024-04-26 05:48:38,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1589805056. Throughput: 0: 56085.8. Samples: 1539220840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:38,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 05:48:39,527][47288] Updated weights for policy 0, policy_version 97036 (0.0028) [2024-04-26 05:48:42,432][47288] Updated weights for policy 0, policy_version 97046 (0.0035) [2024-04-26 05:48:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1590083584. Throughput: 0: 56354.1. Samples: 1539401640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:43,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 05:48:45,216][47288] Updated weights for policy 0, policy_version 97056 (0.0030) [2024-04-26 05:48:48,263][47288] Updated weights for policy 0, policy_version 97066 (0.0038) [2024-04-26 05:48:48,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1590378496. Throughput: 0: 56418.7. Samples: 1539738340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:48,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 05:48:49,055][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000097070_1590394880.pth... [2024-04-26 05:48:49,098][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000096245_1576878080.pth [2024-04-26 05:48:51,152][47288] Updated weights for policy 0, policy_version 97076 (0.0029) [2024-04-26 05:48:53,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1590624256. Throughput: 0: 56362.5. Samples: 1540070800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 05:48:53,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 05:48:54,100][47288] Updated weights for policy 0, policy_version 97086 (0.0027) [2024-04-26 05:48:57,178][47288] Updated weights for policy 0, policy_version 97096 (0.0030) [2024-04-26 05:48:58,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1590919168. Throughput: 0: 56119.9. Samples: 1540233180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:48:58,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 05:48:59,926][47288] Updated weights for policy 0, policy_version 97106 (0.0029) [2024-04-26 05:49:02,833][47288] Updated weights for policy 0, policy_version 97116 (0.0028) [2024-04-26 05:49:03,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 56261.0). 
Total num frames: 1591197696. Throughput: 0: 56239.3. Samples: 1540573840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:03,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 05:49:05,579][47288] Updated weights for policy 0, policy_version 97126 (0.0030) [2024-04-26 05:49:08,778][47288] Updated weights for policy 0, policy_version 97136 (0.0027) [2024-04-26 05:49:08,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 1591476224. Throughput: 0: 56304.8. Samples: 1540917000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:08,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 05:49:11,447][47288] Updated weights for policy 0, policy_version 97146 (0.0033) [2024-04-26 05:49:13,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1591771136. Throughput: 0: 56186.2. Samples: 1541081360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:13,923][47056] Avg episode reward: [(0, '0.585')] [2024-04-26 05:49:14,609][47288] Updated weights for policy 0, policy_version 97156 (0.0027) [2024-04-26 05:49:17,249][47288] Updated weights for policy 0, policy_version 97166 (0.0025) [2024-04-26 05:49:18,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1592066048. Throughput: 0: 56193.7. Samples: 1541413020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:18,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 05:49:20,301][47288] Updated weights for policy 0, policy_version 97176 (0.0027) [2024-04-26 05:49:23,078][47288] Updated weights for policy 0, policy_version 97186 (0.0032) [2024-04-26 05:49:23,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1592328192. Throughput: 0: 56175.6. Samples: 1541748740. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:23,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 05:49:25,940][47288] Updated weights for policy 0, policy_version 97196 (0.0031) [2024-04-26 05:49:28,828][47288] Updated weights for policy 0, policy_version 97206 (0.0036) [2024-04-26 05:49:28,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.8, 300 sec: 56316.6). Total num frames: 1592623104. Throughput: 0: 56072.4. Samples: 1541924900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:28,924][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 05:49:31,808][47288] Updated weights for policy 0, policy_version 97216 (0.0024) [2024-04-26 05:49:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1592885248. Throughput: 0: 56154.7. Samples: 1542265300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:33,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 05:49:34,501][47288] Updated weights for policy 0, policy_version 97226 (0.0027) [2024-04-26 05:49:37,879][47288] Updated weights for policy 0, policy_version 97236 (0.0025) [2024-04-26 05:49:38,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1593163776. Throughput: 0: 56272.3. Samples: 1542603060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:38,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 05:49:40,407][47288] Updated weights for policy 0, policy_version 97246 (0.0029) [2024-04-26 05:49:43,841][47288] Updated weights for policy 0, policy_version 97256 (0.0031) [2024-04-26 05:49:43,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1593442304. Throughput: 0: 56178.2. Samples: 1542761200. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:43,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 05:49:46,274][47288] Updated weights for policy 0, policy_version 97266 (0.0030) [2024-04-26 05:49:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1593737216. Throughput: 0: 56110.5. Samples: 1543098820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:48,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 05:49:49,602][47288] Updated weights for policy 0, policy_version 97276 (0.0034) [2024-04-26 05:49:51,152][47267] Signal inference workers to stop experience collection... (23350 times) [2024-04-26 05:49:51,152][47267] Signal inference workers to resume experience collection... (23350 times) [2024-04-26 05:49:51,165][47288] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-04-26 05:49:51,166][47288] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-04-26 05:49:52,084][47288] Updated weights for policy 0, policy_version 97286 (0.0027) [2024-04-26 05:49:53,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1594015744. Throughput: 0: 56000.1. Samples: 1543437000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 05:49:53,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 05:49:55,478][47288] Updated weights for policy 0, policy_version 97296 (0.0031) [2024-04-26 05:49:57,950][47288] Updated weights for policy 0, policy_version 97306 (0.0032) [2024-04-26 05:49:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1594310656. Throughput: 0: 56171.2. Samples: 1543609060. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:49:58,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 05:50:01,424][47288] Updated weights for policy 0, policy_version 97316 (0.0024) [2024-04-26 05:50:03,791][47288] Updated weights for policy 0, policy_version 97326 (0.0032) [2024-04-26 05:50:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1594589184. Throughput: 0: 56316.3. Samples: 1543947240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:03,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 05:50:07,418][47288] Updated weights for policy 0, policy_version 97336 (0.0030) [2024-04-26 05:50:08,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55978.8, 300 sec: 56205.5). Total num frames: 1594834944. Throughput: 0: 56304.5. Samples: 1544282440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:08,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 05:50:09,672][47288] Updated weights for policy 0, policy_version 97346 (0.0030) [2024-04-26 05:50:13,173][47288] Updated weights for policy 0, policy_version 97356 (0.0033) [2024-04-26 05:50:13,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.7, 300 sec: 56149.9). Total num frames: 1595113472. Throughput: 0: 55822.8. Samples: 1544436920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:13,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 05:50:15,498][47288] Updated weights for policy 0, policy_version 97366 (0.0027) [2024-04-26 05:50:18,827][47288] Updated weights for policy 0, policy_version 97376 (0.0032) [2024-04-26 05:50:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 1595408384. Throughput: 0: 55918.5. Samples: 1544781640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:18,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 05:50:21,234][47288] Updated weights for policy 0, policy_version 97386 (0.0032) [2024-04-26 05:50:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 1595686912. Throughput: 0: 55914.4. Samples: 1545119200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:23,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 05:50:24,641][47288] Updated weights for policy 0, policy_version 97396 (0.0030) [2024-04-26 05:50:27,050][47288] Updated weights for policy 0, policy_version 97406 (0.0033) [2024-04-26 05:50:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1595981824. Throughput: 0: 56194.8. Samples: 1545289960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:28,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 05:50:30,522][47288] Updated weights for policy 0, policy_version 97416 (0.0031) [2024-04-26 05:50:32,848][47288] Updated weights for policy 0, policy_version 97426 (0.0036) [2024-04-26 05:50:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1596260352. Throughput: 0: 56117.0. Samples: 1545624080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:33,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 05:50:36,165][47288] Updated weights for policy 0, policy_version 97436 (0.0034) [2024-04-26 05:50:38,522][47288] Updated weights for policy 0, policy_version 97446 (0.0030) [2024-04-26 05:50:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1596555264. Throughput: 0: 56140.7. Samples: 1545963340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:38,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 05:50:41,998][47288] Updated weights for policy 0, policy_version 97456 (0.0029) [2024-04-26 05:50:43,111][47267] Signal inference workers to stop experience collection... (23400 times) [2024-04-26 05:50:43,143][47288] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-04-26 05:50:43,170][47267] Signal inference workers to resume experience collection... (23400 times) [2024-04-26 05:50:43,170][47288] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-04-26 05:50:43,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1596833792. Throughput: 0: 56340.4. Samples: 1546144380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:43,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 05:50:44,491][47288] Updated weights for policy 0, policy_version 97466 (0.0033) [2024-04-26 05:50:47,969][47288] Updated weights for policy 0, policy_version 97476 (0.0028) [2024-04-26 05:50:48,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1597095936. Throughput: 0: 56330.5. Samples: 1546482120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:48,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 05:50:49,004][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000097480_1597112320.pth... [2024-04-26 05:50:49,046][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000096656_1583611904.pth [2024-04-26 05:50:50,441][47288] Updated weights for policy 0, policy_version 97486 (0.0027) [2024-04-26 05:50:53,834][47288] Updated weights for policy 0, policy_version 97496 (0.0030) [2024-04-26 05:50:53,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.4, 300 sec: 56094.3). Total num frames: 1597374464. Throughput: 0: 56401.0. Samples: 1546820500. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 25.0) [2024-04-26 05:50:53,924][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 05:50:56,344][47288] Updated weights for policy 0, policy_version 97506 (0.0029) [2024-04-26 05:50:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 56094.4). Total num frames: 1597652992. Throughput: 0: 56587.6. Samples: 1546983360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:50:58,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 05:50:59,581][47288] Updated weights for policy 0, policy_version 97516 (0.0028) [2024-04-26 05:51:02,152][47288] Updated weights for policy 0, policy_version 97526 (0.0026) [2024-04-26 05:51:03,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.5, 300 sec: 56205.5). Total num frames: 1597947904. Throughput: 0: 56512.0. Samples: 1547324680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:03,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:51:05,358][47288] Updated weights for policy 0, policy_version 97536 (0.0025) [2024-04-26 05:51:07,986][47288] Updated weights for policy 0, policy_version 97546 (0.0030) [2024-04-26 05:51:08,923][47056] Fps is (10 sec: 60620.9, 60 sec: 57070.9, 300 sec: 56316.5). Total num frames: 1598259200. Throughput: 0: 56467.1. Samples: 1547660220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:08,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 05:51:11,225][47288] Updated weights for policy 0, policy_version 97556 (0.0033) [2024-04-26 05:51:13,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56524.6, 300 sec: 56149.9). Total num frames: 1598504960. Throughput: 0: 56511.3. Samples: 1547832980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:13,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 05:51:13,934][47288] Updated weights for policy 0, policy_version 97566 (0.0028) [2024-04-26 05:51:17,012][47288] Updated weights for policy 0, policy_version 97576 (0.0029) [2024-04-26 05:51:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1598816256. Throughput: 0: 56582.6. Samples: 1548170300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:18,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 05:51:19,615][47288] Updated weights for policy 0, policy_version 97586 (0.0034) [2024-04-26 05:51:22,921][47288] Updated weights for policy 0, policy_version 97596 (0.0027) [2024-04-26 05:51:23,923][47056] Fps is (10 sec: 57346.0, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1599078400. Throughput: 0: 56625.6. Samples: 1548511480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:23,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 05:51:25,425][47288] Updated weights for policy 0, policy_version 97606 (0.0024) [2024-04-26 05:51:28,669][47288] Updated weights for policy 0, policy_version 97616 (0.0035) [2024-04-26 05:51:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1599356928. Throughput: 0: 56161.9. Samples: 1548671660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:28,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 05:51:31,210][47288] Updated weights for policy 0, policy_version 97626 (0.0030) [2024-04-26 05:51:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 56038.9). Total num frames: 1599619072. Throughput: 0: 56106.3. Samples: 1549006900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:33,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 05:51:34,431][47288] Updated weights for policy 0, policy_version 97636 (0.0031) [2024-04-26 05:51:37,029][47288] Updated weights for policy 0, policy_version 97646 (0.0027) [2024-04-26 05:51:38,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1599913984. Throughput: 0: 56194.5. Samples: 1549349240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:38,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 05:51:40,205][47288] Updated weights for policy 0, policy_version 97656 (0.0027) [2024-04-26 05:51:42,883][47288] Updated weights for policy 0, policy_version 97666 (0.0027) [2024-04-26 05:51:43,923][47056] Fps is (10 sec: 60620.8, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1600225280. Throughput: 0: 56361.8. Samples: 1549519640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:43,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 05:51:45,934][47288] Updated weights for policy 0, policy_version 97676 (0.0026) [2024-04-26 05:51:48,583][47288] Updated weights for policy 0, policy_version 97686 (0.0030) [2024-04-26 05:51:48,923][47056] Fps is (10 sec: 60620.9, 60 sec: 57071.0, 300 sec: 56372.1). Total num frames: 1600520192. Throughput: 0: 56398.3. Samples: 1549862600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:48,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 05:51:51,698][47288] Updated weights for policy 0, policy_version 97696 (0.0031) [2024-04-26 05:51:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56525.1, 300 sec: 56205.5). Total num frames: 1600765952. Throughput: 0: 56467.1. Samples: 1550201240. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 05:51:53,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 05:51:54,386][47288] Updated weights for policy 0, policy_version 97706 (0.0031) [2024-04-26 05:51:54,656][47267] Signal inference workers to stop experience collection... (23450 times) [2024-04-26 05:51:54,687][47288] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-04-26 05:51:54,744][47267] Signal inference workers to resume experience collection... (23450 times) [2024-04-26 05:51:54,744][47288] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-04-26 05:51:57,567][47288] Updated weights for policy 0, policy_version 97716 (0.0026) [2024-04-26 05:51:58,923][47056] Fps is (10 sec: 55704.7, 60 sec: 57070.8, 300 sec: 56427.6). Total num frames: 1601077248. Throughput: 0: 56494.8. Samples: 1550375240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:51:58,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 05:52:00,092][47288] Updated weights for policy 0, policy_version 97726 (0.0027) [2024-04-26 05:52:03,468][47288] Updated weights for policy 0, policy_version 97736 (0.0029) [2024-04-26 05:52:03,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1601339392. Throughput: 0: 56602.3. Samples: 1550717400. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:03,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 05:52:05,950][47288] Updated weights for policy 0, policy_version 97746 (0.0028) [2024-04-26 05:52:08,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.5, 300 sec: 56205.5). Total num frames: 1601601536. Throughput: 0: 56462.0. Samples: 1551052280. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:08,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 05:52:09,335][47288] Updated weights for policy 0, policy_version 97756 (0.0030) [2024-04-26 05:52:11,621][47288] Updated weights for policy 0, policy_version 97766 (0.0026) [2024-04-26 05:52:13,923][47056] Fps is (10 sec: 54065.8, 60 sec: 56251.8, 300 sec: 56205.4). Total num frames: 1601880064. Throughput: 0: 56535.3. Samples: 1551215760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:13,923][47056] Avg episode reward: [(0, '0.601')] [2024-04-26 05:52:15,198][47288] Updated weights for policy 0, policy_version 97776 (0.0031) [2024-04-26 05:52:17,481][47288] Updated weights for policy 0, policy_version 97786 (0.0028) [2024-04-26 05:52:18,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1602191360. Throughput: 0: 56491.8. Samples: 1551549040. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:18,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 05:52:20,945][47288] Updated weights for policy 0, policy_version 97796 (0.0027) [2024-04-26 05:52:23,313][47288] Updated weights for policy 0, policy_version 97806 (0.0033) [2024-04-26 05:52:23,923][47056] Fps is (10 sec: 62260.5, 60 sec: 57070.8, 300 sec: 56372.1). Total num frames: 1602502656. Throughput: 0: 56347.6. Samples: 1551884880. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:23,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 05:52:26,648][47288] Updated weights for policy 0, policy_version 97816 (0.0031) [2024-04-26 05:52:28,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1602748416. Throughput: 0: 56542.5. Samples: 1552064060. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:28,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 05:52:29,113][47288] Updated weights for policy 0, policy_version 97826 (0.0023) [2024-04-26 05:52:32,454][47288] Updated weights for policy 0, policy_version 97836 (0.0028) [2024-04-26 05:52:33,923][47056] Fps is (10 sec: 52428.2, 60 sec: 56797.7, 300 sec: 56261.0). Total num frames: 1603026944. Throughput: 0: 56382.5. Samples: 1552399820. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:33,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 05:52:34,898][47288] Updated weights for policy 0, policy_version 97846 (0.0033) [2024-04-26 05:52:38,389][47288] Updated weights for policy 0, policy_version 97856 (0.0034) [2024-04-26 05:52:38,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1603305472. Throughput: 0: 56307.9. Samples: 1552735100. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:38,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 05:52:40,707][47288] Updated weights for policy 0, policy_version 97866 (0.0027) [2024-04-26 05:52:43,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 1603567616. Throughput: 0: 56090.5. Samples: 1552899300. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:43,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 05:52:44,344][47288] Updated weights for policy 0, policy_version 97876 (0.0031) [2024-04-26 05:52:46,524][47288] Updated weights for policy 0, policy_version 97886 (0.0030) [2024-04-26 05:52:48,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55159.4, 300 sec: 56149.9). Total num frames: 1603829760. Throughput: 0: 55777.2. Samples: 1553227380. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:48,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 05:52:48,997][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000097891_1603846144.pth... [2024-04-26 05:52:49,043][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000097070_1590394880.pth [2024-04-26 05:52:50,230][47288] Updated weights for policy 0, policy_version 97896 (0.0030) [2024-04-26 05:52:52,282][47288] Updated weights for policy 0, policy_version 97906 (0.0025) [2024-04-26 05:52:53,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1604141056. Throughput: 0: 55863.9. Samples: 1553566160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 20.0) [2024-04-26 05:52:53,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 05:52:55,909][47288] Updated weights for policy 0, policy_version 97916 (0.0028) [2024-04-26 05:52:58,116][47288] Updated weights for policy 0, policy_version 97926 (0.0029) [2024-04-26 05:52:58,923][47056] Fps is (10 sec: 62259.9, 60 sec: 56251.9, 300 sec: 56316.5). Total num frames: 1604452352. Throughput: 0: 56272.3. Samples: 1553748000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:52:58,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 05:53:01,752][47288] Updated weights for policy 0, policy_version 97936 (0.0028) [2024-04-26 05:53:03,893][47288] Updated weights for policy 0, policy_version 97946 (0.0032) [2024-04-26 05:53:03,923][47056] Fps is (10 sec: 60621.9, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1604747264. Throughput: 0: 56316.2. Samples: 1554083260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:03,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 05:53:04,657][47267] Signal inference workers to stop experience collection... (23500 times) [2024-04-26 05:53:04,657][47267] Signal inference workers to resume experience collection... 
(23500 times) [2024-04-26 05:53:04,666][47288] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-04-26 05:53:04,667][47288] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-04-26 05:53:07,646][47288] Updated weights for policy 0, policy_version 97956 (0.0029) [2024-04-26 05:53:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1604993024. Throughput: 0: 56296.0. Samples: 1554418200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:08,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 05:53:09,872][47288] Updated weights for policy 0, policy_version 97966 (0.0027) [2024-04-26 05:53:13,382][47288] Updated weights for policy 0, policy_version 97976 (0.0029) [2024-04-26 05:53:13,923][47056] Fps is (10 sec: 52428.3, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1605271552. Throughput: 0: 55983.6. Samples: 1554583320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:13,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 05:53:15,692][47288] Updated weights for policy 0, policy_version 97986 (0.0031) [2024-04-26 05:53:18,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 1605533696. Throughput: 0: 56199.3. Samples: 1554928780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:18,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 05:53:19,104][47288] Updated weights for policy 0, policy_version 97996 (0.0028) [2024-04-26 05:53:21,335][47288] Updated weights for policy 0, policy_version 98006 (0.0024) [2024-04-26 05:53:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1605828608. Throughput: 0: 56354.4. Samples: 1555271040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:23,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 05:53:24,921][47288] Updated weights for policy 0, policy_version 98016 (0.0028) [2024-04-26 05:53:27,062][47288] Updated weights for policy 0, policy_version 98026 (0.0025) [2024-04-26 05:53:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1606107136. Throughput: 0: 56303.0. Samples: 1555432940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:28,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 05:53:30,707][47288] Updated weights for policy 0, policy_version 98036 (0.0028) [2024-04-26 05:53:32,959][47288] Updated weights for policy 0, policy_version 98046 (0.0025) [2024-04-26 05:53:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1606402048. Throughput: 0: 56472.1. Samples: 1555768620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:33,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 05:53:36,586][47288] Updated weights for policy 0, policy_version 98056 (0.0029) [2024-04-26 05:53:38,794][47288] Updated weights for policy 0, policy_version 98066 (0.0036) [2024-04-26 05:53:38,923][47056] Fps is (10 sec: 60621.2, 60 sec: 56798.0, 300 sec: 56372.1). Total num frames: 1606713344. Throughput: 0: 56463.3. Samples: 1556107000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:38,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 05:53:42,347][47288] Updated weights for policy 0, policy_version 98076 (0.0031) [2024-04-26 05:53:43,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 1606975488. Throughput: 0: 56435.1. Samples: 1556287580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:43,923][47056] Avg episode reward: [(0, '0.595')] [2024-04-26 05:53:44,537][47288] Updated weights for policy 0, policy_version 98086 (0.0023) [2024-04-26 05:53:48,251][47288] Updated weights for policy 0, policy_version 98096 (0.0027) [2024-04-26 05:53:48,923][47056] Fps is (10 sec: 52428.2, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1607237632. Throughput: 0: 56351.8. Samples: 1556619100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 05:53:48,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 05:53:50,416][47288] Updated weights for policy 0, policy_version 98106 (0.0030) [2024-04-26 05:53:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1607516160. Throughput: 0: 56379.5. Samples: 1556955280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:53:53,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 05:53:54,015][47288] Updated weights for policy 0, policy_version 98116 (0.0029) [2024-04-26 05:53:56,514][47288] Updated weights for policy 0, policy_version 98126 (0.0027) [2024-04-26 05:53:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 1607794688. Throughput: 0: 56296.0. Samples: 1557116640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:53:58,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 05:53:59,931][47288] Updated weights for policy 0, policy_version 98136 (0.0031) [2024-04-26 05:54:02,432][47288] Updated weights for policy 0, policy_version 98146 (0.0029) [2024-04-26 05:54:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 56316.5). Total num frames: 1608089600. Throughput: 0: 56159.0. Samples: 1557455940. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:03,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 05:54:05,609][47288] Updated weights for policy 0, policy_version 98156 (0.0028) [2024-04-26 05:54:07,695][47267] Signal inference workers to stop experience collection... (23550 times) [2024-04-26 05:54:07,695][47267] Signal inference workers to resume experience collection... (23550 times) [2024-04-26 05:54:07,709][47288] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-04-26 05:54:07,709][47288] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-04-26 05:54:08,286][47288] Updated weights for policy 0, policy_version 98166 (0.0033) [2024-04-26 05:54:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1608368128. Throughput: 0: 56053.2. Samples: 1557793440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:08,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 05:54:11,508][47288] Updated weights for policy 0, policy_version 98176 (0.0029) [2024-04-26 05:54:13,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1608663040. Throughput: 0: 56313.3. Samples: 1557967040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:13,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:54:13,984][47288] Updated weights for policy 0, policy_version 98186 (0.0029) [2024-04-26 05:54:17,169][47288] Updated weights for policy 0, policy_version 98196 (0.0025) [2024-04-26 05:54:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.6, 300 sec: 56316.5). Total num frames: 1608941568. Throughput: 0: 56264.7. Samples: 1558300540. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:18,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 05:54:19,750][47288] Updated weights for policy 0, policy_version 98206 (0.0029) [2024-04-26 05:54:23,126][47288] Updated weights for policy 0, policy_version 98216 (0.0028) [2024-04-26 05:54:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.6, 300 sec: 56261.0). Total num frames: 1609220096. Throughput: 0: 56227.4. Samples: 1558637240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:23,923][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 05:54:25,459][47288] Updated weights for policy 0, policy_version 98226 (0.0033) [2024-04-26 05:54:28,795][47288] Updated weights for policy 0, policy_version 98236 (0.0030) [2024-04-26 05:54:28,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1609498624. Throughput: 0: 56157.6. Samples: 1558814680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:28,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 05:54:31,328][47288] Updated weights for policy 0, policy_version 98246 (0.0026) [2024-04-26 05:54:33,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1609760768. Throughput: 0: 56305.7. Samples: 1559152860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:33,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 05:54:34,583][47288] Updated weights for policy 0, policy_version 98256 (0.0027) [2024-04-26 05:54:37,303][47288] Updated weights for policy 0, policy_version 98266 (0.0028) [2024-04-26 05:54:38,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 56316.6). Total num frames: 1610055680. Throughput: 0: 56343.1. Samples: 1559490720. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:38,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 05:54:40,316][47288] Updated weights for policy 0, policy_version 98276 (0.0038) [2024-04-26 05:54:43,197][47288] Updated weights for policy 0, policy_version 98286 (0.0028) [2024-04-26 05:54:43,923][47056] Fps is (10 sec: 57345.2, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1610334208. Throughput: 0: 56545.5. Samples: 1559661180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:43,923][47056] Avg episode reward: [(0, '0.364')] [2024-04-26 05:54:46,165][47288] Updated weights for policy 0, policy_version 98296 (0.0029) [2024-04-26 05:54:48,923][47056] Fps is (10 sec: 57342.5, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1610629120. Throughput: 0: 56432.2. Samples: 1559995400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 05:54:48,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 05:54:48,937][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000098305_1610629120.pth... [2024-04-26 05:54:48,997][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000097480_1597112320.pth [2024-04-26 05:54:49,107][47288] Updated weights for policy 0, policy_version 98306 (0.0028) [2024-04-26 05:54:51,869][47288] Updated weights for policy 0, policy_version 98316 (0.0029) [2024-04-26 05:54:53,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56798.0, 300 sec: 56316.6). Total num frames: 1610924032. Throughput: 0: 56477.6. Samples: 1560334920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:54:53,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 05:54:54,828][47288] Updated weights for policy 0, policy_version 98326 (0.0030) [2024-04-26 05:54:57,598][47288] Updated weights for policy 0, policy_version 98336 (0.0030) [2024-04-26 05:54:58,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56797.9, 300 sec: 56316.5). 
Total num frames: 1611202560. Throughput: 0: 56657.4. Samples: 1560516620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:54:58,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 05:55:00,585][47288] Updated weights for policy 0, policy_version 98346 (0.0029) [2024-04-26 05:55:03,600][47288] Updated weights for policy 0, policy_version 98356 (0.0030) [2024-04-26 05:55:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1611481088. Throughput: 0: 56646.0. Samples: 1560849600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:03,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 05:55:06,370][47288] Updated weights for policy 0, policy_version 98366 (0.0032) [2024-04-26 05:55:08,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1611743232. Throughput: 0: 56667.3. Samples: 1561187260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:08,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 05:55:09,357][47288] Updated weights for policy 0, policy_version 98376 (0.0027) [2024-04-26 05:55:12,132][47288] Updated weights for policy 0, policy_version 98386 (0.0027) [2024-04-26 05:55:13,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1612021760. Throughput: 0: 56218.3. Samples: 1561344500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:13,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 05:55:15,183][47288] Updated weights for policy 0, policy_version 98396 (0.0036) [2024-04-26 05:55:15,746][47267] Signal inference workers to stop experience collection... (23600 times) [2024-04-26 05:55:15,765][47288] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-04-26 05:55:15,804][47267] Signal inference workers to resume experience collection... 
(23600 times) [2024-04-26 05:55:15,804][47288] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-04-26 05:55:17,930][47288] Updated weights for policy 0, policy_version 98406 (0.0029) [2024-04-26 05:55:18,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1612300288. Throughput: 0: 56265.2. Samples: 1561684800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:18,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 05:55:20,970][47288] Updated weights for policy 0, policy_version 98416 (0.0031) [2024-04-26 05:55:23,816][47288] Updated weights for policy 0, policy_version 98426 (0.0033) [2024-04-26 05:55:23,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 1612611584. Throughput: 0: 56297.3. Samples: 1562024100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:23,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 05:55:26,731][47288] Updated weights for policy 0, policy_version 98436 (0.0031) [2024-04-26 05:55:28,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1612873728. Throughput: 0: 56270.0. Samples: 1562193340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:28,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 05:55:29,811][47288] Updated weights for policy 0, policy_version 98446 (0.0034) [2024-04-26 05:55:32,604][47288] Updated weights for policy 0, policy_version 98456 (0.0030) [2024-04-26 05:55:33,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56798.1, 300 sec: 56316.6). Total num frames: 1613168640. Throughput: 0: 56321.3. Samples: 1562529840. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:33,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 05:55:35,699][47288] Updated weights for policy 0, policy_version 98466 (0.0027) [2024-04-26 05:55:38,575][47288] Updated weights for policy 0, policy_version 98476 (0.0031) [2024-04-26 05:55:38,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.8, 300 sec: 56316.6). Total num frames: 1613447168. Throughput: 0: 56240.4. Samples: 1562865740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:38,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 05:55:41,492][47288] Updated weights for policy 0, policy_version 98486 (0.0041) [2024-04-26 05:55:43,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1613709312. Throughput: 0: 55949.8. Samples: 1563034360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:43,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 05:55:44,326][47288] Updated weights for policy 0, policy_version 98496 (0.0023) [2024-04-26 05:55:47,357][47288] Updated weights for policy 0, policy_version 98506 (0.0028) [2024-04-26 05:55:48,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55978.8, 300 sec: 56316.6). Total num frames: 1613987840. Throughput: 0: 55912.8. Samples: 1563365680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 05:55:48,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 05:55:50,321][47288] Updated weights for policy 0, policy_version 98516 (0.0030) [2024-04-26 05:55:53,054][47288] Updated weights for policy 0, policy_version 98526 (0.0031) [2024-04-26 05:55:53,923][47056] Fps is (10 sec: 54064.6, 60 sec: 55432.0, 300 sec: 56260.9). Total num frames: 1614249984. Throughput: 0: 55994.1. Samples: 1563707020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:55:53,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 05:55:56,052][47288] Updated weights for policy 0, policy_version 98536 (0.0034) [2024-04-26 05:55:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1614561280. Throughput: 0: 56209.8. Samples: 1563873940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:55:58,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 05:55:58,997][47288] Updated weights for policy 0, policy_version 98546 (0.0029) [2024-04-26 05:56:01,999][47288] Updated weights for policy 0, policy_version 98556 (0.0027) [2024-04-26 05:56:03,923][47056] Fps is (10 sec: 58985.7, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1614839808. Throughput: 0: 56192.8. Samples: 1564213460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:03,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 05:56:04,913][47288] Updated weights for policy 0, policy_version 98566 (0.0027) [2024-04-26 05:56:07,822][47288] Updated weights for policy 0, policy_version 98576 (0.0029) [2024-04-26 05:56:08,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1615134720. Throughput: 0: 56029.7. Samples: 1564545440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:08,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 05:56:10,729][47288] Updated weights for policy 0, policy_version 98586 (0.0029) [2024-04-26 05:56:13,487][47288] Updated weights for policy 0, policy_version 98596 (0.0028) [2024-04-26 05:56:13,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1615413248. Throughput: 0: 56265.5. Samples: 1564725280. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:13,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 05:56:16,988][47288] Updated weights for policy 0, policy_version 98606 (0.0035) [2024-04-26 05:56:17,610][47267] Signal inference workers to stop experience collection... (23650 times) [2024-04-26 05:56:17,612][47267] Signal inference workers to resume experience collection... (23650 times) [2024-04-26 05:56:17,633][47288] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-04-26 05:56:17,633][47288] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-04-26 05:56:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56525.0, 300 sec: 56316.5). Total num frames: 1615691776. Throughput: 0: 56330.6. Samples: 1565064720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:18,923][47056] Avg episode reward: [(0, '0.603')] [2024-04-26 05:56:19,119][47288] Updated weights for policy 0, policy_version 98616 (0.0024) [2024-04-26 05:56:22,944][47288] Updated weights for policy 0, policy_version 98626 (0.0030) [2024-04-26 05:56:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1615986688. Throughput: 0: 56376.9. Samples: 1565402700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:23,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 05:56:24,995][47288] Updated weights for policy 0, policy_version 98636 (0.0032) [2024-04-26 05:56:28,537][47288] Updated weights for policy 0, policy_version 98646 (0.0028) [2024-04-26 05:56:28,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1616232448. Throughput: 0: 56290.6. Samples: 1565567440. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:28,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 05:56:30,810][47288] Updated weights for policy 0, policy_version 98656 (0.0028) [2024-04-26 05:56:33,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 1616510976. Throughput: 0: 56360.5. Samples: 1565901900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:33,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 05:56:34,291][47288] Updated weights for policy 0, policy_version 98666 (0.0032) [2024-04-26 05:56:36,598][47288] Updated weights for policy 0, policy_version 98676 (0.0032) [2024-04-26 05:56:38,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1616822272. Throughput: 0: 56264.5. Samples: 1566238900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:38,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 05:56:40,177][47288] Updated weights for policy 0, policy_version 98686 (0.0027) [2024-04-26 05:56:42,298][47288] Updated weights for policy 0, policy_version 98696 (0.0028) [2024-04-26 05:56:43,923][47056] Fps is (10 sec: 62258.6, 60 sec: 57070.8, 300 sec: 56316.5). Total num frames: 1617133568. Throughput: 0: 56547.0. Samples: 1566418560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:43,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 05:56:45,859][47288] Updated weights for policy 0, policy_version 98706 (0.0027) [2024-04-26 05:56:48,242][47288] Updated weights for policy 0, policy_version 98716 (0.0029) [2024-04-26 05:56:48,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.7, 300 sec: 56372.0). Total num frames: 1617395712. Throughput: 0: 56514.3. Samples: 1566756620. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 05:56:48,924][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 05:56:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000098718_1617395712.pth... [2024-04-26 05:56:48,987][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000097891_1603846144.pth [2024-04-26 05:56:51,519][47288] Updated weights for policy 0, policy_version 98726 (0.0026) [2024-04-26 05:56:53,923][47056] Fps is (10 sec: 54066.6, 60 sec: 57071.2, 300 sec: 56261.0). Total num frames: 1617674240. Throughput: 0: 56533.1. Samples: 1567089440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:56:53,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 05:56:54,067][47288] Updated weights for policy 0, policy_version 98736 (0.0032) [2024-04-26 05:56:57,433][47288] Updated weights for policy 0, policy_version 98746 (0.0036) [2024-04-26 05:56:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1617936384. Throughput: 0: 56444.8. Samples: 1567265300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:56:58,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 05:56:59,824][47288] Updated weights for policy 0, policy_version 98756 (0.0034) [2024-04-26 05:57:03,340][47288] Updated weights for policy 0, policy_version 98766 (0.0033) [2024-04-26 05:57:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.6, 300 sec: 56372.1). Total num frames: 1618231296. Throughput: 0: 56440.7. Samples: 1567604560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:03,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 05:57:05,456][47288] Updated weights for policy 0, policy_version 98776 (0.0027) [2024-04-26 05:57:08,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 1618477056. Throughput: 0: 56469.0. Samples: 1567943800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:08,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 05:57:09,142][47288] Updated weights for policy 0, policy_version 98786 (0.0024) [2024-04-26 05:57:11,416][47288] Updated weights for policy 0, policy_version 98796 (0.0034) [2024-04-26 05:57:13,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1618771968. Throughput: 0: 56167.0. Samples: 1568094960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:13,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 05:57:14,957][47288] Updated weights for policy 0, policy_version 98806 (0.0031) [2024-04-26 05:57:17,184][47267] Signal inference workers to stop experience collection... (23700 times) [2024-04-26 05:57:17,234][47288] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-04-26 05:57:17,271][47267] Signal inference workers to resume experience collection... (23700 times) [2024-04-26 05:57:17,271][47288] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-04-26 05:57:17,375][47288] Updated weights for policy 0, policy_version 98816 (0.0028) [2024-04-26 05:57:18,923][47056] Fps is (10 sec: 62259.0, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1619099648. Throughput: 0: 56210.7. Samples: 1568431380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:18,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 05:57:20,682][47288] Updated weights for policy 0, policy_version 98826 (0.0027) [2024-04-26 05:57:23,177][47288] Updated weights for policy 0, policy_version 98836 (0.0032) [2024-04-26 05:57:23,923][47056] Fps is (10 sec: 60620.8, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1619378176. Throughput: 0: 56361.3. Samples: 1568775160. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:23,923][47056] Avg episode reward: [(0, '0.592')] [2024-04-26 05:57:26,428][47288] Updated weights for policy 0, policy_version 98846 (0.0026) [2024-04-26 05:57:28,923][47056] Fps is (10 sec: 54065.8, 60 sec: 56797.6, 300 sec: 56316.5). Total num frames: 1619640320. Throughput: 0: 56214.0. Samples: 1568948200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:28,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 05:57:29,285][47288] Updated weights for policy 0, policy_version 98856 (0.0028) [2024-04-26 05:57:32,260][47288] Updated weights for policy 0, policy_version 98866 (0.0034) [2024-04-26 05:57:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1619918848. Throughput: 0: 56242.0. Samples: 1569287500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:33,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 05:57:34,999][47288] Updated weights for policy 0, policy_version 98876 (0.0029) [2024-04-26 05:57:38,064][47288] Updated weights for policy 0, policy_version 98886 (0.0025) [2024-04-26 05:57:38,923][47056] Fps is (10 sec: 55707.0, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1620197376. Throughput: 0: 56274.5. Samples: 1569621780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:38,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 05:57:40,667][47288] Updated weights for policy 0, policy_version 98896 (0.0033) [2024-04-26 05:57:43,787][47288] Updated weights for policy 0, policy_version 98906 (0.0030) [2024-04-26 05:57:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 56427.6). Total num frames: 1620475904. Throughput: 0: 56016.4. Samples: 1569786040. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:43,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 05:57:46,464][47288] Updated weights for policy 0, policy_version 98916 (0.0030) [2024-04-26 05:57:48,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 1620738048. Throughput: 0: 56029.8. Samples: 1570125900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 05:57:48,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 05:57:49,609][47288] Updated weights for policy 0, policy_version 98926 (0.0026) [2024-04-26 05:57:52,153][47288] Updated weights for policy 0, policy_version 98936 (0.0027) [2024-04-26 05:57:53,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56252.0, 300 sec: 56261.0). Total num frames: 1621049344. Throughput: 0: 55964.0. Samples: 1570462180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:57:53,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 05:57:55,486][47288] Updated weights for policy 0, policy_version 98946 (0.0030) [2024-04-26 05:57:57,962][47288] Updated weights for policy 0, policy_version 98956 (0.0029) [2024-04-26 05:57:58,923][47056] Fps is (10 sec: 60620.3, 60 sec: 56797.8, 300 sec: 56260.9). Total num frames: 1621344256. Throughput: 0: 56519.5. Samples: 1570638340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:57:58,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 05:58:01,256][47288] Updated weights for policy 0, policy_version 98966 (0.0025) [2024-04-26 05:58:03,829][47288] Updated weights for policy 0, policy_version 98976 (0.0031) [2024-04-26 05:58:03,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1621622784. Throughput: 0: 56536.8. Samples: 1570975540. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:03,924][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 05:58:06,984][47288] Updated weights for policy 0, policy_version 98986 (0.0037) [2024-04-26 05:58:08,923][47056] Fps is (10 sec: 54068.0, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1621884928. Throughput: 0: 56433.0. Samples: 1571314640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:08,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 05:58:09,715][47288] Updated weights for policy 0, policy_version 98996 (0.0026) [2024-04-26 05:58:12,730][47288] Updated weights for policy 0, policy_version 99006 (0.0026) [2024-04-26 05:58:13,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1622163456. Throughput: 0: 56330.6. Samples: 1571483060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:13,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 05:58:15,622][47288] Updated weights for policy 0, policy_version 99016 (0.0036) [2024-04-26 05:58:18,573][47288] Updated weights for policy 0, policy_version 99026 (0.0029) [2024-04-26 05:58:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1622458368. Throughput: 0: 56367.2. Samples: 1571824020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:18,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 05:58:20,830][47267] Signal inference workers to stop experience collection... (23750 times) [2024-04-26 05:58:20,875][47288] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-04-26 05:58:20,889][47267] Signal inference workers to resume experience collection... 
(23750 times) [2024-04-26 05:58:20,892][47288] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-04-26 05:58:21,448][47288] Updated weights for policy 0, policy_version 99036 (0.0033) [2024-04-26 05:58:23,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55432.5, 300 sec: 56261.0). Total num frames: 1622704128. Throughput: 0: 56297.2. Samples: 1572155160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:23,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 05:58:24,509][47288] Updated weights for policy 0, policy_version 99046 (0.0024) [2024-04-26 05:58:27,450][47288] Updated weights for policy 0, policy_version 99056 (0.0028) [2024-04-26 05:58:28,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55705.8, 300 sec: 56205.5). Total num frames: 1622982656. Throughput: 0: 56348.6. Samples: 1572321720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:28,923][47056] Avg episode reward: [(0, '0.604')] [2024-04-26 05:58:30,310][47288] Updated weights for policy 0, policy_version 99066 (0.0025) [2024-04-26 05:58:33,281][47288] Updated weights for policy 0, policy_version 99076 (0.0029) [2024-04-26 05:58:33,923][47056] Fps is (10 sec: 60621.3, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1623310336. Throughput: 0: 56260.1. Samples: 1572657600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:33,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 05:58:36,197][47288] Updated weights for policy 0, policy_version 99086 (0.0026) [2024-04-26 05:58:38,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1623572480. Throughput: 0: 56185.2. Samples: 1572990520. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:38,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 05:58:39,043][47288] Updated weights for policy 0, policy_version 99096 (0.0026) [2024-04-26 05:58:42,127][47288] Updated weights for policy 0, policy_version 99106 (0.0032) [2024-04-26 05:58:43,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1623851008. Throughput: 0: 56178.7. Samples: 1573166380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:43,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 05:58:44,898][47288] Updated weights for policy 0, policy_version 99116 (0.0032) [2024-04-26 05:58:47,890][47288] Updated weights for policy 0, policy_version 99126 (0.0026) [2024-04-26 05:58:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1624129536. Throughput: 0: 56028.3. Samples: 1573496820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 05:58:48,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 05:58:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000099129_1624129536.pth... [2024-04-26 05:58:48,988][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000098305_1610629120.pth [2024-04-26 05:58:50,800][47288] Updated weights for policy 0, policy_version 99136 (0.0029) [2024-04-26 05:58:53,662][47288] Updated weights for policy 0, policy_version 99146 (0.0029) [2024-04-26 05:58:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1624424448. Throughput: 0: 56011.9. Samples: 1573835180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:58:53,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 05:58:56,628][47288] Updated weights for policy 0, policy_version 99156 (0.0027) [2024-04-26 05:58:58,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55705.8, 300 sec: 56261.0). 
Total num frames: 1624686592. Throughput: 0: 56120.9. Samples: 1574008500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:58:58,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 05:58:59,397][47288] Updated weights for policy 0, policy_version 99166 (0.0031) [2024-04-26 05:59:02,382][47288] Updated weights for policy 0, policy_version 99176 (0.0028) [2024-04-26 05:59:03,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1624965120. Throughput: 0: 55914.6. Samples: 1574340180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:03,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 05:59:05,172][47288] Updated weights for policy 0, policy_version 99186 (0.0031) [2024-04-26 05:59:08,209][47288] Updated weights for policy 0, policy_version 99196 (0.0037) [2024-04-26 05:59:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1625243648. Throughput: 0: 56071.7. Samples: 1574678380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:08,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 05:59:11,037][47288] Updated weights for policy 0, policy_version 99206 (0.0031) [2024-04-26 05:59:13,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 56205.5). Total num frames: 1625522176. Throughput: 0: 56032.8. Samples: 1574843200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:13,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 05:59:14,083][47288] Updated weights for policy 0, policy_version 99216 (0.0027) [2024-04-26 05:59:16,789][47288] Updated weights for policy 0, policy_version 99226 (0.0036) [2024-04-26 05:59:18,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.5, 300 sec: 56261.0). Total num frames: 1625817088. Throughput: 0: 56082.5. Samples: 1575181320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:18,924][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 05:59:19,884][47288] Updated weights for policy 0, policy_version 99236 (0.0037) [2024-04-26 05:59:22,693][47288] Updated weights for policy 0, policy_version 99246 (0.0034) [2024-04-26 05:59:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1626095616. Throughput: 0: 56180.6. Samples: 1575518640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:23,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 05:59:25,538][47288] Updated weights for policy 0, policy_version 99256 (0.0025) [2024-04-26 05:59:28,530][47288] Updated weights for policy 0, policy_version 99266 (0.0027) [2024-04-26 05:59:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1626390528. Throughput: 0: 56118.7. Samples: 1575691720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:28,923][47056] Avg episode reward: [(0, '0.598')] [2024-04-26 05:59:31,652][47288] Updated weights for policy 0, policy_version 99276 (0.0029) [2024-04-26 05:59:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1626669056. Throughput: 0: 56210.8. Samples: 1576026300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:33,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 05:59:34,497][47288] Updated weights for policy 0, policy_version 99286 (0.0028) [2024-04-26 05:59:37,567][47288] Updated weights for policy 0, policy_version 99296 (0.0027) [2024-04-26 05:59:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1626947584. Throughput: 0: 56314.2. Samples: 1576369320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:38,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 05:59:40,234][47288] Updated weights for policy 0, policy_version 99306 (0.0025) [2024-04-26 05:59:43,267][47288] Updated weights for policy 0, policy_version 99316 (0.0027) [2024-04-26 05:59:43,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 56205.5). Total num frames: 1627209728. Throughput: 0: 56171.5. Samples: 1576536220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:43,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 05:59:44,521][47267] Signal inference workers to stop experience collection... (23800 times) [2024-04-26 05:59:44,526][47267] Signal inference workers to resume experience collection... (23800 times) [2024-04-26 05:59:44,539][47288] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-04-26 05:59:44,547][47288] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-04-26 05:59:45,957][47288] Updated weights for policy 0, policy_version 99326 (0.0031) [2024-04-26 05:59:48,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.9, 300 sec: 56205.4). Total num frames: 1627504640. Throughput: 0: 56344.1. Samples: 1576875660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 05:59:48,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 05:59:48,932][47288] Updated weights for policy 0, policy_version 99336 (0.0026) [2024-04-26 05:59:51,993][47288] Updated weights for policy 0, policy_version 99346 (0.0028) [2024-04-26 05:59:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1627783168. Throughput: 0: 56369.3. Samples: 1577215000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:59:53,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 05:59:54,856][47288] Updated weights for policy 0, policy_version 99356 (0.0027) [2024-04-26 05:59:57,816][47288] Updated weights for policy 0, policy_version 99366 (0.0030) [2024-04-26 05:59:58,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1628061696. Throughput: 0: 56388.9. Samples: 1577380700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 05:59:58,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 06:00:00,732][47288] Updated weights for policy 0, policy_version 99376 (0.0028) [2024-04-26 06:00:03,492][47288] Updated weights for policy 0, policy_version 99386 (0.0033) [2024-04-26 06:00:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1628356608. Throughput: 0: 56294.8. Samples: 1577714580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:03,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:00:06,433][47288] Updated weights for policy 0, policy_version 99396 (0.0033) [2024-04-26 06:00:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1628635136. Throughput: 0: 56312.8. Samples: 1578052720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:08,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 06:00:09,384][47288] Updated weights for policy 0, policy_version 99406 (0.0027) [2024-04-26 06:00:12,461][47288] Updated weights for policy 0, policy_version 99416 (0.0034) [2024-04-26 06:00:13,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56316.6). Total num frames: 1628913664. Throughput: 0: 56312.3. Samples: 1578225760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:13,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 06:00:15,281][47288] Updated weights for policy 0, policy_version 99426 (0.0032) [2024-04-26 06:00:18,158][47288] Updated weights for policy 0, policy_version 99436 (0.0031) [2024-04-26 06:00:18,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 1629175808. Throughput: 0: 56408.9. Samples: 1578564700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:18,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 06:00:21,197][47288] Updated weights for policy 0, policy_version 99446 (0.0033) [2024-04-26 06:00:23,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1629454336. Throughput: 0: 56265.1. Samples: 1578901240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:23,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 06:00:24,072][47288] Updated weights for policy 0, policy_version 99456 (0.0031) [2024-04-26 06:00:26,893][47288] Updated weights for policy 0, policy_version 99466 (0.0031) [2024-04-26 06:00:28,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 56205.4). Total num frames: 1629749248. Throughput: 0: 56306.2. Samples: 1579070000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:28,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 06:00:29,803][47288] Updated weights for policy 0, policy_version 99476 (0.0033) [2024-04-26 06:00:32,698][47288] Updated weights for policy 0, policy_version 99486 (0.0028) [2024-04-26 06:00:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1630027776. Throughput: 0: 56431.1. Samples: 1579415060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:33,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 06:00:35,422][47288] Updated weights for policy 0, policy_version 99496 (0.0030) [2024-04-26 06:00:36,617][47267] Signal inference workers to stop experience collection... (23850 times) [2024-04-26 06:00:36,617][47267] Signal inference workers to resume experience collection... (23850 times) [2024-04-26 06:00:36,642][47288] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-04-26 06:00:36,643][47288] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-04-26 06:00:38,580][47288] Updated weights for policy 0, policy_version 99506 (0.0029) [2024-04-26 06:00:38,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1630339072. Throughput: 0: 56325.9. Samples: 1579749660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:38,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 06:00:41,647][47288] Updated weights for policy 0, policy_version 99516 (0.0033) [2024-04-26 06:00:43,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1630601216. Throughput: 0: 56388.4. Samples: 1579918180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:43,924][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 06:00:44,239][47288] Updated weights for policy 0, policy_version 99526 (0.0029) [2024-04-26 06:00:47,559][47288] Updated weights for policy 0, policy_version 99536 (0.0028) [2024-04-26 06:00:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.7, 300 sec: 56427.7). Total num frames: 1630896128. Throughput: 0: 56613.2. Samples: 1580262180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 06:00:48,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 06:00:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000099542_1630896128.pth... 
[2024-04-26 06:00:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000098718_1617395712.pth [2024-04-26 06:00:50,009][47288] Updated weights for policy 0, policy_version 99546 (0.0032) [2024-04-26 06:00:53,269][47288] Updated weights for policy 0, policy_version 99556 (0.0029) [2024-04-26 06:00:53,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1631158272. Throughput: 0: 56542.7. Samples: 1580597140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:00:53,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 06:00:55,796][47288] Updated weights for policy 0, policy_version 99566 (0.0030) [2024-04-26 06:00:58,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1631420416. Throughput: 0: 56289.0. Samples: 1580758780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:00:58,932][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 06:00:59,177][47288] Updated weights for policy 0, policy_version 99576 (0.0034) [2024-04-26 06:01:01,725][47288] Updated weights for policy 0, policy_version 99586 (0.0027) [2024-04-26 06:01:03,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 56205.4). Total num frames: 1631715328. Throughput: 0: 56214.9. Samples: 1581094380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:03,924][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 06:01:04,878][47288] Updated weights for policy 0, policy_version 99596 (0.0028) [2024-04-26 06:01:07,413][47288] Updated weights for policy 0, policy_version 99606 (0.0031) [2024-04-26 06:01:08,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1632010240. Throughput: 0: 56250.4. Samples: 1581432520. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:08,924][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 06:01:10,664][47288] Updated weights for policy 0, policy_version 99616 (0.0027) [2024-04-26 06:01:13,106][47288] Updated weights for policy 0, policy_version 99626 (0.0029) [2024-04-26 06:01:13,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1632288768. Throughput: 0: 56293.2. Samples: 1581603200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:13,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 06:01:16,450][47288] Updated weights for policy 0, policy_version 99636 (0.0034) [2024-04-26 06:01:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1632583680. Throughput: 0: 56083.4. Samples: 1581938820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:18,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 06:01:19,203][47288] Updated weights for policy 0, policy_version 99646 (0.0028) [2024-04-26 06:01:22,218][47288] Updated weights for policy 0, policy_version 99656 (0.0033) [2024-04-26 06:01:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1632845824. Throughput: 0: 56142.1. Samples: 1582276060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:23,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 06:01:25,156][47288] Updated weights for policy 0, policy_version 99666 (0.0032) [2024-04-26 06:01:28,081][47288] Updated weights for policy 0, policy_version 99676 (0.0028) [2024-04-26 06:01:28,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1633157120. Throughput: 0: 56252.2. Samples: 1582449520. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:28,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 06:01:30,914][47288] Updated weights for policy 0, policy_version 99686 (0.0029) [2024-04-26 06:01:33,852][47288] Updated weights for policy 0, policy_version 99696 (0.0028) [2024-04-26 06:01:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1633419264. Throughput: 0: 56065.0. Samples: 1582785100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:33,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 06:01:36,578][47288] Updated weights for policy 0, policy_version 99706 (0.0033) [2024-04-26 06:01:38,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55705.5, 300 sec: 56094.4). Total num frames: 1633681408. Throughput: 0: 56273.2. Samples: 1583129440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:38,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 06:01:39,728][47288] Updated weights for policy 0, policy_version 99716 (0.0025) [2024-04-26 06:01:42,449][47288] Updated weights for policy 0, policy_version 99726 (0.0032) [2024-04-26 06:01:43,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1633992704. Throughput: 0: 56326.8. Samples: 1583293480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:43,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 06:01:45,584][47267] Signal inference workers to stop experience collection... (23900 times) [2024-04-26 06:01:45,630][47288] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-04-26 06:01:45,640][47267] Signal inference workers to resume experience collection... 
(23900 times) [2024-04-26 06:01:45,644][47288] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-04-26 06:01:45,647][47288] Updated weights for policy 0, policy_version 99736 (0.0026) [2024-04-26 06:01:48,304][47288] Updated weights for policy 0, policy_version 99746 (0.0025) [2024-04-26 06:01:48,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1634271232. Throughput: 0: 56376.2. Samples: 1583631300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 06:01:48,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 06:01:51,411][47288] Updated weights for policy 0, policy_version 99756 (0.0028) [2024-04-26 06:01:53,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1634549760. Throughput: 0: 56507.3. Samples: 1583975340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:01:53,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 06:01:54,075][47288] Updated weights for policy 0, policy_version 99766 (0.0034) [2024-04-26 06:01:57,200][47288] Updated weights for policy 0, policy_version 99776 (0.0038) [2024-04-26 06:01:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1634828288. Throughput: 0: 56541.3. Samples: 1584147560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:01:58,923][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 06:01:59,909][47288] Updated weights for policy 0, policy_version 99786 (0.0027) [2024-04-26 06:02:02,918][47288] Updated weights for policy 0, policy_version 99796 (0.0032) [2024-04-26 06:02:03,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 1635106816. Throughput: 0: 56564.6. Samples: 1584484220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:03,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 06:02:05,819][47288] Updated weights for policy 0, policy_version 99806 (0.0033) [2024-04-26 06:02:08,714][47288] Updated weights for policy 0, policy_version 99816 (0.0028) [2024-04-26 06:02:08,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1635401728. Throughput: 0: 56619.6. Samples: 1584823940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:08,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 06:02:11,722][47288] Updated weights for policy 0, policy_version 99826 (0.0028) [2024-04-26 06:02:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1635663872. Throughput: 0: 56468.0. Samples: 1584990580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:13,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 06:02:14,516][47288] Updated weights for policy 0, policy_version 99836 (0.0025) [2024-04-26 06:02:17,584][47288] Updated weights for policy 0, policy_version 99846 (0.0033) [2024-04-26 06:02:18,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1635958784. Throughput: 0: 56562.5. Samples: 1585330420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:18,924][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 06:02:20,254][47288] Updated weights for policy 0, policy_version 99856 (0.0028) [2024-04-26 06:02:23,323][47288] Updated weights for policy 0, policy_version 99866 (0.0034) [2024-04-26 06:02:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1636237312. Throughput: 0: 56353.5. Samples: 1585665340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:23,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 06:02:26,015][47288] Updated weights for policy 0, policy_version 99876 (0.0028) [2024-04-26 06:02:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 1636499456. Throughput: 0: 56488.1. Samples: 1585835440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:28,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 06:02:29,106][47288] Updated weights for policy 0, policy_version 99886 (0.0026) [2024-04-26 06:02:31,718][47288] Updated weights for policy 0, policy_version 99896 (0.0027) [2024-04-26 06:02:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1636794368. Throughput: 0: 56537.4. Samples: 1586175480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:33,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 06:02:34,753][47288] Updated weights for policy 0, policy_version 99906 (0.0028) [2024-04-26 06:02:37,414][47288] Updated weights for policy 0, policy_version 99916 (0.0028) [2024-04-26 06:02:38,923][47056] Fps is (10 sec: 60620.4, 60 sec: 57070.9, 300 sec: 56372.1). Total num frames: 1637105664. Throughput: 0: 56363.0. Samples: 1586511680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:38,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 06:02:40,573][47288] Updated weights for policy 0, policy_version 99926 (0.0028) [2024-04-26 06:02:43,246][47288] Updated weights for policy 0, policy_version 99936 (0.0027) [2024-04-26 06:02:43,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1637384192. Throughput: 0: 56422.7. Samples: 1586686580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:43,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 06:02:46,340][47288] Updated weights for policy 0, policy_version 99946 (0.0027) [2024-04-26 06:02:48,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1637662720. Throughput: 0: 56549.7. Samples: 1587028960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 06:02:48,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 06:02:48,930][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000099956_1637679104.pth... [2024-04-26 06:02:48,936][47288] Updated weights for policy 0, policy_version 99956 (0.0029) [2024-04-26 06:02:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000099129_1624129536.pth [2024-04-26 06:02:52,147][47267] Signal inference workers to stop experience collection... (23950 times) [2024-04-26 06:02:52,147][47267] Signal inference workers to resume experience collection... (23950 times) [2024-04-26 06:02:52,158][47288] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-04-26 06:02:52,158][47288] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-04-26 06:02:52,255][47288] Updated weights for policy 0, policy_version 99966 (0.0033) [2024-04-26 06:02:53,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1637924864. Throughput: 0: 56436.5. Samples: 1587363580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:02:53,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 06:02:54,797][47288] Updated weights for policy 0, policy_version 99976 (0.0032) [2024-04-26 06:02:58,029][47288] Updated weights for policy 0, policy_version 99986 (0.0034) [2024-04-26 06:02:58,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.9, 300 sec: 56205.5). Total num frames: 1638203392. Throughput: 0: 56407.2. Samples: 1587528900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:02:58,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 06:03:00,658][47288] Updated weights for policy 0, policy_version 99996 (0.0025) [2024-04-26 06:03:03,857][47288] Updated weights for policy 0, policy_version 100006 (0.0026) [2024-04-26 06:03:03,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 56316.6). Total num frames: 1638498304. Throughput: 0: 56400.3. Samples: 1587868420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:03,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 06:03:06,439][47288] Updated weights for policy 0, policy_version 100016 (0.0025) [2024-04-26 06:03:08,923][47056] Fps is (10 sec: 58981.4, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1638793216. Throughput: 0: 56644.3. Samples: 1588214340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:08,923][47056] Avg episode reward: [(0, '0.404')] [2024-04-26 06:03:09,552][47288] Updated weights for policy 0, policy_version 100026 (0.0028) [2024-04-26 06:03:12,153][47288] Updated weights for policy 0, policy_version 100036 (0.0033) [2024-04-26 06:03:13,923][47056] Fps is (10 sec: 57342.0, 60 sec: 56797.6, 300 sec: 56316.5). Total num frames: 1639071744. Throughput: 0: 56668.2. Samples: 1588385520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:13,924][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 06:03:15,393][47288] Updated weights for policy 0, policy_version 100046 (0.0034) [2024-04-26 06:03:17,984][47288] Updated weights for policy 0, policy_version 100056 (0.0031) [2024-04-26 06:03:18,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1639350272. Throughput: 0: 56536.9. Samples: 1588719640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:18,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 06:03:21,137][47288] Updated weights for policy 0, policy_version 100066 (0.0034) [2024-04-26 06:03:23,720][47288] Updated weights for policy 0, policy_version 100076 (0.0035) [2024-04-26 06:03:23,923][47056] Fps is (10 sec: 57345.5, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 1639645184. Throughput: 0: 56511.7. Samples: 1589054700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:23,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 06:03:26,917][47288] Updated weights for policy 0, policy_version 100086 (0.0029) [2024-04-26 06:03:28,923][47056] Fps is (10 sec: 55704.5, 60 sec: 56797.7, 300 sec: 56261.0). Total num frames: 1639907328. Throughput: 0: 56498.6. Samples: 1589229020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:28,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 06:03:29,332][47288] Updated weights for policy 0, policy_version 100096 (0.0024) [2024-04-26 06:03:32,840][47288] Updated weights for policy 0, policy_version 100106 (0.0034) [2024-04-26 06:03:33,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1640185856. Throughput: 0: 56498.1. Samples: 1589571380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:33,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 06:03:35,320][47288] Updated weights for policy 0, policy_version 100116 (0.0027) [2024-04-26 06:03:38,623][47288] Updated weights for policy 0, policy_version 100126 (0.0026) [2024-04-26 06:03:38,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55978.8, 300 sec: 56316.6). Total num frames: 1640464384. Throughput: 0: 56565.3. Samples: 1589909020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:38,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 06:03:41,237][47288] Updated weights for policy 0, policy_version 100136 (0.0026) [2024-04-26 06:03:43,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1640759296. Throughput: 0: 56547.9. Samples: 1590073560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:43,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 06:03:44,435][47288] Updated weights for policy 0, policy_version 100146 (0.0035) [2024-04-26 06:03:46,885][47288] Updated weights for policy 0, policy_version 100156 (0.0029) [2024-04-26 06:03:48,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1641054208. Throughput: 0: 56509.1. Samples: 1590411340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:03:48,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 06:03:50,098][47288] Updated weights for policy 0, policy_version 100166 (0.0028) [2024-04-26 06:03:52,776][47288] Updated weights for policy 0, policy_version 100176 (0.0030) [2024-04-26 06:03:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1641332736. Throughput: 0: 56293.9. Samples: 1590747560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:03:53,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:03:55,849][47288] Updated weights for policy 0, policy_version 100186 (0.0034) [2024-04-26 06:03:58,645][47288] Updated weights for policy 0, policy_version 100196 (0.0031) [2024-04-26 06:03:58,923][47056] Fps is (10 sec: 57344.9, 60 sec: 57070.9, 300 sec: 56483.2). Total num frames: 1641627648. Throughput: 0: 56387.5. Samples: 1590922940. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:03:58,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 06:04:01,787][47288] Updated weights for policy 0, policy_version 100206 (0.0029) [2024-04-26 06:04:03,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1641889792. Throughput: 0: 56567.2. Samples: 1591265160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:03,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 06:04:04,278][47288] Updated weights for policy 0, policy_version 100216 (0.0026) [2024-04-26 06:04:07,605][47288] Updated weights for policy 0, policy_version 100226 (0.0030) [2024-04-26 06:04:08,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1642151936. Throughput: 0: 56683.6. Samples: 1591605460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:08,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 06:04:10,020][47288] Updated weights for policy 0, policy_version 100236 (0.0030) [2024-04-26 06:04:13,398][47288] Updated weights for policy 0, policy_version 100246 (0.0028) [2024-04-26 06:04:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55979.0, 300 sec: 56316.6). Total num frames: 1642430464. Throughput: 0: 56425.6. Samples: 1591768160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:13,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 06:04:16,063][47288] Updated weights for policy 0, policy_version 100256 (0.0031) [2024-04-26 06:04:18,923][47056] Fps is (10 sec: 57340.0, 60 sec: 56251.1, 300 sec: 56371.9). Total num frames: 1642725376. Throughput: 0: 56221.1. Samples: 1592101360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:18,924][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 06:04:19,229][47288] Updated weights for policy 0, policy_version 100266 (0.0034) [2024-04-26 06:04:21,922][47288] Updated weights for policy 0, policy_version 100276 (0.0027) [2024-04-26 06:04:23,111][47267] Signal inference workers to stop experience collection... (24000 times) [2024-04-26 06:04:23,111][47267] Signal inference workers to resume experience collection... (24000 times) [2024-04-26 06:04:23,139][47288] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-04-26 06:04:23,140][47288] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-04-26 06:04:23,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1643020288. Throughput: 0: 56205.3. Samples: 1592438260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:23,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 06:04:25,156][47288] Updated weights for policy 0, policy_version 100286 (0.0027) [2024-04-26 06:04:27,810][47288] Updated weights for policy 0, policy_version 100296 (0.0030) [2024-04-26 06:04:28,923][47056] Fps is (10 sec: 58985.9, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 1643315200. Throughput: 0: 56372.3. Samples: 1592610320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:28,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 06:04:31,047][47288] Updated weights for policy 0, policy_version 100306 (0.0031) [2024-04-26 06:04:33,524][47288] Updated weights for policy 0, policy_version 100316 (0.0033) [2024-04-26 06:04:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 1643593728. Throughput: 0: 56425.5. Samples: 1592950480. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:33,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 06:04:36,855][47288] Updated weights for policy 0, policy_version 100326 (0.0026) [2024-04-26 06:04:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1643855872. Throughput: 0: 56340.9. Samples: 1593282900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:38,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 06:04:39,393][47288] Updated weights for policy 0, policy_version 100336 (0.0029) [2024-04-26 06:04:42,547][47288] Updated weights for policy 0, policy_version 100346 (0.0026) [2024-04-26 06:04:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1644150784. Throughput: 0: 56283.2. Samples: 1593455680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:43,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 06:04:45,253][47288] Updated weights for policy 0, policy_version 100356 (0.0023) [2024-04-26 06:04:48,593][47288] Updated weights for policy 0, policy_version 100366 (0.0023) [2024-04-26 06:04:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 56372.0). Total num frames: 1644412928. Throughput: 0: 56161.5. Samples: 1593792440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 06:04:48,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:04:48,930][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000100367_1644412928.pth... [2024-04-26 06:04:48,975][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000099542_1630896128.pth [2024-04-26 06:04:51,158][47288] Updated weights for policy 0, policy_version 100376 (0.0024) [2024-04-26 06:04:53,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1644691456. Throughput: 0: 55980.8. Samples: 1594124600. 
Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:04:53,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 06:04:54,460][47288] Updated weights for policy 0, policy_version 100386 (0.0027) [2024-04-26 06:04:56,924][47288] Updated weights for policy 0, policy_version 100396 (0.0032) [2024-04-26 06:04:58,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1644986368. Throughput: 0: 56124.0. Samples: 1594293740. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:04:58,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 06:05:00,265][47288] Updated weights for policy 0, policy_version 100406 (0.0033) [2024-04-26 06:05:02,732][47288] Updated weights for policy 0, policy_version 100416 (0.0027) [2024-04-26 06:05:03,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1645281280. Throughput: 0: 56226.6. Samples: 1594631520. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:03,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 06:05:06,212][47288] Updated weights for policy 0, policy_version 100426 (0.0026) [2024-04-26 06:05:08,702][47288] Updated weights for policy 0, policy_version 100436 (0.0037) [2024-04-26 06:05:08,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1645543424. Throughput: 0: 56255.6. Samples: 1594969760. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:08,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 06:05:11,962][47288] Updated weights for policy 0, policy_version 100446 (0.0026) [2024-04-26 06:05:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1645821952. Throughput: 0: 56244.9. Samples: 1595141340. 
Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:13,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 06:05:14,477][47288] Updated weights for policy 0, policy_version 100456 (0.0025) [2024-04-26 06:05:17,760][47288] Updated weights for policy 0, policy_version 100466 (0.0034) [2024-04-26 06:05:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56525.4, 300 sec: 56483.1). Total num frames: 1646116864. Throughput: 0: 56330.6. Samples: 1595485360. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:18,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 06:05:20,165][47288] Updated weights for policy 0, policy_version 100476 (0.0029) [2024-04-26 06:05:23,604][47288] Updated weights for policy 0, policy_version 100486 (0.0036) [2024-04-26 06:05:23,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1646379008. Throughput: 0: 56398.3. Samples: 1595820820. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:23,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 06:05:24,133][47267] Signal inference workers to stop experience collection... (24050 times) [2024-04-26 06:05:24,165][47288] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-04-26 06:05:24,191][47267] Signal inference workers to resume experience collection... (24050 times) [2024-04-26 06:05:24,195][47288] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-04-26 06:05:25,948][47288] Updated weights for policy 0, policy_version 100496 (0.0024) [2024-04-26 06:05:28,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55705.8, 300 sec: 56372.1). Total num frames: 1646657536. Throughput: 0: 56225.8. Samples: 1595985840. 
Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:28,927][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 06:05:29,294][47288] Updated weights for policy 0, policy_version 100506 (0.0028) [2024-04-26 06:05:31,763][47288] Updated weights for policy 0, policy_version 100516 (0.0028) [2024-04-26 06:05:33,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1646952448. Throughput: 0: 56277.9. Samples: 1596324940. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:33,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 06:05:35,287][47288] Updated weights for policy 0, policy_version 100526 (0.0031) [2024-04-26 06:05:37,744][47288] Updated weights for policy 0, policy_version 100536 (0.0026) [2024-04-26 06:05:38,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1647230976. Throughput: 0: 56434.3. Samples: 1596664140. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:38,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 06:05:40,894][47288] Updated weights for policy 0, policy_version 100546 (0.0028) [2024-04-26 06:05:43,350][47288] Updated weights for policy 0, policy_version 100556 (0.0032) [2024-04-26 06:05:43,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1647525888. Throughput: 0: 56497.8. Samples: 1596836140. Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:43,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 06:05:46,800][47288] Updated weights for policy 0, policy_version 100566 (0.0033) [2024-04-26 06:05:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 1647804416. Throughput: 0: 56431.6. Samples: 1597170940. 
Policy #0 lag: (min: 2.0, avg: 11.3, max: 22.0) [2024-04-26 06:05:48,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 06:05:49,289][47288] Updated weights for policy 0, policy_version 100576 (0.0030) [2024-04-26 06:05:52,665][47288] Updated weights for policy 0, policy_version 100586 (0.0027) [2024-04-26 06:05:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1648099328. Throughput: 0: 56263.0. Samples: 1597501600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:05:53,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 06:05:55,204][47288] Updated weights for policy 0, policy_version 100596 (0.0024) [2024-04-26 06:05:58,502][47288] Updated weights for policy 0, policy_version 100606 (0.0030) [2024-04-26 06:05:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1648345088. Throughput: 0: 56291.7. Samples: 1597674460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:05:58,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:06:00,990][47288] Updated weights for policy 0, policy_version 100616 (0.0032) [2024-04-26 06:06:03,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1648623616. Throughput: 0: 56119.6. Samples: 1598010740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:03,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 06:06:04,225][47288] Updated weights for policy 0, policy_version 100626 (0.0026) [2024-04-26 06:06:06,653][47288] Updated weights for policy 0, policy_version 100636 (0.0030) [2024-04-26 06:06:08,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1648918528. Throughput: 0: 56257.6. Samples: 1598352420. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:08,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 06:06:09,980][47288] Updated weights for policy 0, policy_version 100646 (0.0028) [2024-04-26 06:06:12,555][47288] Updated weights for policy 0, policy_version 100656 (0.0026) [2024-04-26 06:06:13,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1649213440. Throughput: 0: 56315.0. Samples: 1598520020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:13,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 06:06:15,950][47288] Updated weights for policy 0, policy_version 100666 (0.0038) [2024-04-26 06:06:18,320][47288] Updated weights for policy 0, policy_version 100676 (0.0028) [2024-04-26 06:06:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1649491968. Throughput: 0: 56187.5. Samples: 1598853380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:18,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:06:21,782][47288] Updated weights for policy 0, policy_version 100686 (0.0028) [2024-04-26 06:06:22,210][47267] Signal inference workers to stop experience collection... (24100 times) [2024-04-26 06:06:22,211][47267] Signal inference workers to resume experience collection... (24100 times) [2024-04-26 06:06:22,226][47288] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-04-26 06:06:22,226][47288] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-04-26 06:06:23,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1649786880. Throughput: 0: 56167.5. Samples: 1599191680. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:23,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 06:06:24,082][47288] Updated weights for policy 0, policy_version 100696 (0.0027) [2024-04-26 06:06:27,515][47288] Updated weights for policy 0, policy_version 100706 (0.0036) [2024-04-26 06:06:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 1650065408. Throughput: 0: 56360.3. Samples: 1599372360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:28,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 06:06:29,708][47288] Updated weights for policy 0, policy_version 100716 (0.0028) [2024-04-26 06:06:33,333][47288] Updated weights for policy 0, policy_version 100726 (0.0028) [2024-04-26 06:06:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1650343936. Throughput: 0: 56424.9. Samples: 1599710060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:33,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:06:35,584][47288] Updated weights for policy 0, policy_version 100736 (0.0034) [2024-04-26 06:06:38,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1650606080. Throughput: 0: 56647.6. Samples: 1600050740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:38,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 06:06:39,066][47288] Updated weights for policy 0, policy_version 100746 (0.0035) [2024-04-26 06:06:41,409][47288] Updated weights for policy 0, policy_version 100756 (0.0027) [2024-04-26 06:06:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1650884608. Throughput: 0: 56335.5. Samples: 1600209560. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:43,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 06:06:44,846][47288] Updated weights for policy 0, policy_version 100766 (0.0030) [2024-04-26 06:06:47,130][47288] Updated weights for policy 0, policy_version 100776 (0.0027) [2024-04-26 06:06:48,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1651179520. Throughput: 0: 56440.9. Samples: 1600550580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-26 06:06:48,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 06:06:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000100780_1651179520.pth... [2024-04-26 06:06:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000099956_1637679104.pth [2024-04-26 06:06:50,720][47288] Updated weights for policy 0, policy_version 100786 (0.0027) [2024-04-26 06:06:53,022][47288] Updated weights for policy 0, policy_version 100796 (0.0027) [2024-04-26 06:06:53,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1651474432. Throughput: 0: 56388.5. Samples: 1600889900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:06:53,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 06:06:56,358][47288] Updated weights for policy 0, policy_version 100806 (0.0024) [2024-04-26 06:06:58,779][47288] Updated weights for policy 0, policy_version 100816 (0.0029) [2024-04-26 06:06:58,923][47056] Fps is (10 sec: 58982.0, 60 sec: 57070.8, 300 sec: 56483.1). Total num frames: 1651769344. Throughput: 0: 56636.7. Samples: 1601068680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:06:58,923][47056] Avg episode reward: [(0, '0.592')] [2024-04-26 06:07:02,208][47288] Updated weights for policy 0, policy_version 100826 (0.0032) [2024-04-26 06:07:03,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57071.0, 300 sec: 56427.6). 
Total num frames: 1652047872. Throughput: 0: 56668.2. Samples: 1601403440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:03,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 06:07:04,528][47288] Updated weights for policy 0, policy_version 100836 (0.0026) [2024-04-26 06:07:08,030][47288] Updated weights for policy 0, policy_version 100846 (0.0034) [2024-04-26 06:07:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1652342784. Throughput: 0: 56663.1. Samples: 1601741520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:08,923][47056] Avg episode reward: [(0, '0.340')] [2024-04-26 06:07:10,367][47288] Updated weights for policy 0, policy_version 100856 (0.0031) [2024-04-26 06:07:13,680][47288] Updated weights for policy 0, policy_version 100866 (0.0032) [2024-04-26 06:07:13,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1652604928. Throughput: 0: 56534.6. Samples: 1601916420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:13,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:07:16,139][47288] Updated weights for policy 0, policy_version 100876 (0.0034) [2024-04-26 06:07:18,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1652850688. Throughput: 0: 56442.5. Samples: 1602249980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:18,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 06:07:19,571][47288] Updated weights for policy 0, policy_version 100886 (0.0036) [2024-04-26 06:07:22,106][47288] Updated weights for policy 0, policy_version 100896 (0.0027) [2024-04-26 06:07:23,193][47267] Signal inference workers to stop experience collection... (24150 times) [2024-04-26 06:07:23,193][47267] Signal inference workers to resume experience collection... 
(24150 times) [2024-04-26 06:07:23,220][47288] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-04-26 06:07:23,220][47288] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-04-26 06:07:23,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 1653161984. Throughput: 0: 56415.6. Samples: 1602589440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:23,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:07:25,428][47288] Updated weights for policy 0, policy_version 100906 (0.0029) [2024-04-26 06:07:27,739][47288] Updated weights for policy 0, policy_version 100916 (0.0026) [2024-04-26 06:07:28,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1653424128. Throughput: 0: 56488.4. Samples: 1602751540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:28,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 06:07:31,149][47288] Updated weights for policy 0, policy_version 100926 (0.0033) [2024-04-26 06:07:33,524][47288] Updated weights for policy 0, policy_version 100936 (0.0029) [2024-04-26 06:07:33,923][47056] Fps is (10 sec: 57342.6, 60 sec: 56524.5, 300 sec: 56372.0). Total num frames: 1653735424. Throughput: 0: 56471.8. Samples: 1603091820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:33,923][47056] Avg episode reward: [(0, '0.614')] [2024-04-26 06:07:36,948][47288] Updated weights for policy 0, policy_version 100946 (0.0030) [2024-04-26 06:07:38,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1654013952. Throughput: 0: 56452.5. Samples: 1603430260. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:38,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 06:07:39,498][47288] Updated weights for policy 0, policy_version 100956 (0.0029) [2024-04-26 06:07:42,849][47288] Updated weights for policy 0, policy_version 100966 (0.0026) [2024-04-26 06:07:43,923][47056] Fps is (10 sec: 57344.8, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 1654308864. Throughput: 0: 56383.6. Samples: 1603605940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:43,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 06:07:45,460][47288] Updated weights for policy 0, policy_version 100976 (0.0030) [2024-04-26 06:07:48,691][47288] Updated weights for policy 0, policy_version 100986 (0.0025) [2024-04-26 06:07:48,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1654571008. Throughput: 0: 56484.4. Samples: 1603945240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-26 06:07:48,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 06:07:51,487][47288] Updated weights for policy 0, policy_version 100996 (0.0027) [2024-04-26 06:07:53,923][47056] Fps is (10 sec: 52428.3, 60 sec: 55978.5, 300 sec: 56372.0). Total num frames: 1654833152. Throughput: 0: 56472.7. Samples: 1604282800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:07:53,924][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 06:07:54,489][47288] Updated weights for policy 0, policy_version 101006 (0.0027) [2024-04-26 06:07:57,389][47288] Updated weights for policy 0, policy_version 101016 (0.0030) [2024-04-26 06:07:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1655128064. Throughput: 0: 56128.6. Samples: 1604442200. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:07:58,923][47056] Avg episode reward: [(0, '0.402')] [2024-04-26 06:08:00,236][47288] Updated weights for policy 0, policy_version 101026 (0.0028) [2024-04-26 06:08:03,153][47288] Updated weights for policy 0, policy_version 101036 (0.0031) [2024-04-26 06:08:03,923][47056] Fps is (10 sec: 57345.3, 60 sec: 55978.7, 300 sec: 56316.6). Total num frames: 1655406592. Throughput: 0: 56256.6. Samples: 1604781520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:03,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 06:08:06,056][47288] Updated weights for policy 0, policy_version 101046 (0.0026) [2024-04-26 06:08:08,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 56316.6). Total num frames: 1655685120. Throughput: 0: 56332.4. Samples: 1605124400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:08,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 06:08:08,948][47288] Updated weights for policy 0, policy_version 101056 (0.0026) [2024-04-26 06:08:11,916][47288] Updated weights for policy 0, policy_version 101066 (0.0027) [2024-04-26 06:08:13,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1655996416. Throughput: 0: 56438.6. Samples: 1605291280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:13,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 06:08:14,793][47288] Updated weights for policy 0, policy_version 101076 (0.0032) [2024-04-26 06:08:17,699][47288] Updated weights for policy 0, policy_version 101086 (0.0034) [2024-04-26 06:08:18,923][47056] Fps is (10 sec: 60621.1, 60 sec: 57344.1, 300 sec: 56427.6). Total num frames: 1656291328. Throughput: 0: 56432.7. Samples: 1605631280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:18,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 06:08:20,680][47288] Updated weights for policy 0, policy_version 101096 (0.0030) [2024-04-26 06:08:23,474][47288] Updated weights for policy 0, policy_version 101106 (0.0027) [2024-04-26 06:08:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1656553472. Throughput: 0: 56390.1. Samples: 1605967820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:23,924][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 06:08:26,370][47288] Updated weights for policy 0, policy_version 101116 (0.0030) [2024-04-26 06:08:27,049][47267] Signal inference workers to stop experience collection... (24200 times) [2024-04-26 06:08:27,089][47288] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-04-26 06:08:27,098][47267] Signal inference workers to resume experience collection... (24200 times) [2024-04-26 06:08:27,107][47288] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-04-26 06:08:28,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1656815616. Throughput: 0: 56304.6. Samples: 1606139640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:28,931][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 06:08:29,265][47288] Updated weights for policy 0, policy_version 101126 (0.0032) [2024-04-26 06:08:32,212][47288] Updated weights for policy 0, policy_version 101136 (0.0026) [2024-04-26 06:08:33,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1657094144. Throughput: 0: 56196.8. Samples: 1606474100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:33,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 06:08:35,104][47288] Updated weights for policy 0, policy_version 101146 (0.0027) [2024-04-26 06:08:37,998][47288] Updated weights for policy 0, policy_version 101156 (0.0034) [2024-04-26 06:08:38,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1657389056. Throughput: 0: 56249.0. Samples: 1606814000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:38,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 06:08:40,803][47288] Updated weights for policy 0, policy_version 101166 (0.0028) [2024-04-26 06:08:43,832][47288] Updated weights for policy 0, policy_version 101176 (0.0029) [2024-04-26 06:08:43,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1657667584. Throughput: 0: 56378.6. Samples: 1606979240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:43,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 06:08:46,503][47288] Updated weights for policy 0, policy_version 101186 (0.0026) [2024-04-26 06:08:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1657962496. Throughput: 0: 56344.3. Samples: 1607317020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:08:48,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 06:08:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000101194_1657962496.pth... [2024-04-26 06:08:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000100367_1644412928.pth [2024-04-26 06:08:49,571][47288] Updated weights for policy 0, policy_version 101196 (0.0024) [2024-04-26 06:08:52,324][47288] Updated weights for policy 0, policy_version 101206 (0.0033) [2024-04-26 06:08:53,923][47056] Fps is (10 sec: 58981.9, 60 sec: 57071.0, 300 sec: 56372.0). 
Total num frames: 1658257408. Throughput: 0: 56315.9. Samples: 1607658620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:08:53,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 06:08:55,257][47288] Updated weights for policy 0, policy_version 101216 (0.0029) [2024-04-26 06:08:58,081][47288] Updated weights for policy 0, policy_version 101226 (0.0026) [2024-04-26 06:08:58,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1658535936. Throughput: 0: 56582.2. Samples: 1607837480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:08:58,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 06:09:01,140][47288] Updated weights for policy 0, policy_version 101236 (0.0035) [2024-04-26 06:09:03,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1658798080. Throughput: 0: 56589.7. Samples: 1608177820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:03,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 06:09:04,000][47288] Updated weights for policy 0, policy_version 101246 (0.0029) [2024-04-26 06:09:07,039][47288] Updated weights for policy 0, policy_version 101256 (0.0032) [2024-04-26 06:09:08,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1659076608. Throughput: 0: 56666.8. Samples: 1608517820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:08,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 06:09:09,723][47288] Updated weights for policy 0, policy_version 101266 (0.0031) [2024-04-26 06:09:12,784][47288] Updated weights for policy 0, policy_version 101276 (0.0035) [2024-04-26 06:09:13,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 56372.2). Total num frames: 1659355136. Throughput: 0: 56579.2. Samples: 1608685700. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:13,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 06:09:15,539][47288] Updated weights for policy 0, policy_version 101286 (0.0029) [2024-04-26 06:09:18,398][47288] Updated weights for policy 0, policy_version 101296 (0.0029) [2024-04-26 06:09:18,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1659666432. Throughput: 0: 56650.2. Samples: 1609023360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:18,923][47056] Avg episode reward: [(0, '0.605')] [2024-04-26 06:09:21,366][47288] Updated weights for policy 0, policy_version 101306 (0.0030) [2024-04-26 06:09:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1659928576. Throughput: 0: 56708.2. Samples: 1609365860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:23,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 06:09:24,330][47288] Updated weights for policy 0, policy_version 101316 (0.0029) [2024-04-26 06:09:27,063][47288] Updated weights for policy 0, policy_version 101326 (0.0027) [2024-04-26 06:09:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1660207104. Throughput: 0: 56683.4. Samples: 1609530000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:28,924][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 06:09:30,035][47288] Updated weights for policy 0, policy_version 101336 (0.0028) [2024-04-26 06:09:32,658][47288] Updated weights for policy 0, policy_version 101346 (0.0028) [2024-04-26 06:09:33,923][47056] Fps is (10 sec: 58981.6, 60 sec: 57070.9, 300 sec: 56483.1). Total num frames: 1660518400. Throughput: 0: 56758.2. Samples: 1609871140. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:33,924][47056] Avg episode reward: [(0, '0.366')] [2024-04-26 06:09:35,767][47288] Updated weights for policy 0, policy_version 101356 (0.0034) [2024-04-26 06:09:37,976][47267] Signal inference workers to stop experience collection... (24250 times) [2024-04-26 06:09:37,977][47267] Signal inference workers to resume experience collection... (24250 times) [2024-04-26 06:09:37,997][47288] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-04-26 06:09:37,997][47288] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-04-26 06:09:38,468][47288] Updated weights for policy 0, policy_version 101366 (0.0030) [2024-04-26 06:09:38,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1660796928. Throughput: 0: 56695.1. Samples: 1610209900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:38,923][47056] Avg episode reward: [(0, '0.616')] [2024-04-26 06:09:38,934][47267] Saving new best policy, reward=0.616! [2024-04-26 06:09:41,430][47288] Updated weights for policy 0, policy_version 101376 (0.0031) [2024-04-26 06:09:43,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 1661075456. Throughput: 0: 56434.7. Samples: 1610377040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:43,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 06:09:44,653][47288] Updated weights for policy 0, policy_version 101386 (0.0031) [2024-04-26 06:09:47,258][47288] Updated weights for policy 0, policy_version 101396 (0.0030) [2024-04-26 06:09:48,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1661353984. Throughput: 0: 56461.4. Samples: 1610718580. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:48,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 06:09:50,309][47288] Updated weights for policy 0, policy_version 101406 (0.0030) [2024-04-26 06:09:53,119][47288] Updated weights for policy 0, policy_version 101416 (0.0033) [2024-04-26 06:09:53,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1661616128. Throughput: 0: 56359.1. Samples: 1611053980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:09:53,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 06:09:56,063][47288] Updated weights for policy 0, policy_version 101426 (0.0028) [2024-04-26 06:09:58,799][47288] Updated weights for policy 0, policy_version 101436 (0.0029) [2024-04-26 06:09:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1661927424. Throughput: 0: 56576.8. Samples: 1611231660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:09:58,923][47056] Avg episode reward: [(0, '0.605')] [2024-04-26 06:10:01,850][47288] Updated weights for policy 0, policy_version 101446 (0.0028) [2024-04-26 06:10:03,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1662205952. Throughput: 0: 56524.4. Samples: 1611566960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:03,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 06:10:04,829][47288] Updated weights for policy 0, policy_version 101456 (0.0027) [2024-04-26 06:10:07,736][47288] Updated weights for policy 0, policy_version 101466 (0.0029) [2024-04-26 06:10:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1662468096. Throughput: 0: 56376.9. Samples: 1611902820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:08,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 06:10:10,665][47288] Updated weights for policy 0, policy_version 101476 (0.0026) [2024-04-26 06:10:13,456][47288] Updated weights for policy 0, policy_version 101486 (0.0028) [2024-04-26 06:10:13,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1662763008. Throughput: 0: 56713.5. Samples: 1612082100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:13,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 06:10:16,409][47288] Updated weights for policy 0, policy_version 101496 (0.0023) [2024-04-26 06:10:18,922][47056] Fps is (10 sec: 58982.8, 60 sec: 56525.0, 300 sec: 56538.7). Total num frames: 1663057920. Throughput: 0: 56552.3. Samples: 1612415980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:18,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:10:19,297][47288] Updated weights for policy 0, policy_version 101506 (0.0039) [2024-04-26 06:10:22,100][47288] Updated weights for policy 0, policy_version 101516 (0.0034) [2024-04-26 06:10:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1663336448. Throughput: 0: 56490.4. Samples: 1612751960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:23,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 06:10:25,287][47288] Updated weights for policy 0, policy_version 101526 (0.0027) [2024-04-26 06:10:27,989][47288] Updated weights for policy 0, policy_version 101536 (0.0030) [2024-04-26 06:10:28,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 1663614976. Throughput: 0: 56559.6. Samples: 1612922220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:28,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 06:10:31,135][47288] Updated weights for policy 0, policy_version 101546 (0.0028) [2024-04-26 06:10:33,585][47288] Updated weights for policy 0, policy_version 101556 (0.0040) [2024-04-26 06:10:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.9, 300 sec: 56483.1). Total num frames: 1663893504. Throughput: 0: 56449.8. Samples: 1613258820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:33,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 06:10:34,915][47267] Signal inference workers to stop experience collection... (24300 times) [2024-04-26 06:10:34,915][47267] Signal inference workers to resume experience collection... (24300 times) [2024-04-26 06:10:34,945][47288] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-04-26 06:10:34,945][47288] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-04-26 06:10:36,819][47288] Updated weights for policy 0, policy_version 101566 (0.0034) [2024-04-26 06:10:38,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1664172032. Throughput: 0: 56650.1. Samples: 1613603240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:38,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 06:10:39,245][47288] Updated weights for policy 0, policy_version 101576 (0.0025) [2024-04-26 06:10:42,419][47288] Updated weights for policy 0, policy_version 101586 (0.0029) [2024-04-26 06:10:43,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1664450560. Throughput: 0: 56349.2. Samples: 1613767380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:43,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 06:10:45,297][47288] Updated weights for policy 0, policy_version 101596 (0.0027) [2024-04-26 06:10:48,295][47288] Updated weights for policy 0, policy_version 101606 (0.0026) [2024-04-26 06:10:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1664729088. Throughput: 0: 56439.7. Samples: 1614106740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:48,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 06:10:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000101607_1664729088.pth... [2024-04-26 06:10:48,986][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000100780_1651179520.pth [2024-04-26 06:10:51,313][47288] Updated weights for policy 0, policy_version 101616 (0.0027) [2024-04-26 06:10:53,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 1665024000. Throughput: 0: 56503.3. Samples: 1614445480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 06:10:53,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:10:54,067][47288] Updated weights for policy 0, policy_version 101626 (0.0026) [2024-04-26 06:10:57,192][47288] Updated weights for policy 0, policy_version 101636 (0.0026) [2024-04-26 06:10:58,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 1665318912. Throughput: 0: 56294.1. Samples: 1614615340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:10:58,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 06:10:59,726][47288] Updated weights for policy 0, policy_version 101646 (0.0028) [2024-04-26 06:11:02,843][47288] Updated weights for policy 0, policy_version 101656 (0.0031) [2024-04-26 06:11:03,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56251.8, 300 sec: 56483.2). 
Total num frames: 1665581056. Throughput: 0: 56496.3. Samples: 1614958320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:03,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 06:11:05,566][47288] Updated weights for policy 0, policy_version 101666 (0.0026) [2024-04-26 06:11:08,641][47288] Updated weights for policy 0, policy_version 101676 (0.0035) [2024-04-26 06:11:08,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1665875968. Throughput: 0: 56673.3. Samples: 1615302260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:08,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 06:11:11,568][47288] Updated weights for policy 0, policy_version 101686 (0.0030) [2024-04-26 06:11:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1666154496. Throughput: 0: 56588.8. Samples: 1615468720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:13,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 06:11:14,420][47288] Updated weights for policy 0, policy_version 101696 (0.0029) [2024-04-26 06:11:17,452][47288] Updated weights for policy 0, policy_version 101706 (0.0036) [2024-04-26 06:11:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1666433024. Throughput: 0: 56653.4. Samples: 1615808220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:18,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 06:11:20,326][47288] Updated weights for policy 0, policy_version 101716 (0.0026) [2024-04-26 06:11:23,172][47288] Updated weights for policy 0, policy_version 101726 (0.0032) [2024-04-26 06:11:23,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1666711552. Throughput: 0: 56542.6. Samples: 1616147660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:23,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 06:11:26,010][47288] Updated weights for policy 0, policy_version 101736 (0.0025) [2024-04-26 06:11:28,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1666990080. Throughput: 0: 56437.9. Samples: 1616307080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:28,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 06:11:29,048][47288] Updated weights for policy 0, policy_version 101746 (0.0032) [2024-04-26 06:11:31,933][47288] Updated weights for policy 0, policy_version 101756 (0.0030) [2024-04-26 06:11:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 1667268608. Throughput: 0: 56351.5. Samples: 1616642560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:33,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 06:11:34,890][47288] Updated weights for policy 0, policy_version 101766 (0.0031) [2024-04-26 06:11:37,872][47288] Updated weights for policy 0, policy_version 101776 (0.0033) [2024-04-26 06:11:38,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1667579904. Throughput: 0: 56396.9. Samples: 1616983340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:38,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 06:11:40,474][47288] Updated weights for policy 0, policy_version 101786 (0.0031) [2024-04-26 06:11:43,620][47288] Updated weights for policy 0, policy_version 101796 (0.0025) [2024-04-26 06:11:43,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56798.0, 300 sec: 56538.7). Total num frames: 1667858432. Throughput: 0: 56564.7. Samples: 1617160740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:43,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 06:11:46,227][47288] Updated weights for policy 0, policy_version 101806 (0.0031) [2024-04-26 06:11:47,469][47267] Signal inference workers to stop experience collection... (24350 times) [2024-04-26 06:11:47,470][47267] Signal inference workers to resume experience collection... (24350 times) [2024-04-26 06:11:47,509][47288] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-04-26 06:11:47,509][47288] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-04-26 06:11:48,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1668120576. Throughput: 0: 56591.5. Samples: 1617504940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:48,923][47056] Avg episode reward: [(0, '0.599')] [2024-04-26 06:11:49,280][47288] Updated weights for policy 0, policy_version 101816 (0.0028) [2024-04-26 06:11:52,227][47288] Updated weights for policy 0, policy_version 101826 (0.0026) [2024-04-26 06:11:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 1668415488. Throughput: 0: 56201.0. Samples: 1617831300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 06:11:53,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 06:11:55,101][47288] Updated weights for policy 0, policy_version 101836 (0.0035) [2024-04-26 06:11:58,323][47288] Updated weights for policy 0, policy_version 101846 (0.0031) [2024-04-26 06:11:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 56372.0). Total num frames: 1668677632. Throughput: 0: 56202.5. Samples: 1617997840. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:11:58,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 06:12:01,041][47288] Updated weights for policy 0, policy_version 101856 (0.0030) [2024-04-26 06:12:03,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1668956160. Throughput: 0: 56123.9. Samples: 1618333800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:03,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 06:12:04,225][47288] Updated weights for policy 0, policy_version 101866 (0.0038) [2024-04-26 06:12:06,874][47288] Updated weights for policy 0, policy_version 101876 (0.0032) [2024-04-26 06:12:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1669234688. Throughput: 0: 56194.8. Samples: 1618676420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:08,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 06:12:09,891][47288] Updated weights for policy 0, policy_version 101886 (0.0034) [2024-04-26 06:12:12,464][47288] Updated weights for policy 0, policy_version 101896 (0.0028) [2024-04-26 06:12:13,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 1669545984. Throughput: 0: 56456.3. Samples: 1618847620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:13,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 06:12:15,511][47288] Updated weights for policy 0, policy_version 101906 (0.0028) [2024-04-26 06:12:18,348][47288] Updated weights for policy 0, policy_version 101916 (0.0028) [2024-04-26 06:12:18,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1669824512. Throughput: 0: 56472.2. Samples: 1619183800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:18,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 06:12:21,400][47288] Updated weights for policy 0, policy_version 101926 (0.0030) [2024-04-26 06:12:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1670086656. Throughput: 0: 56381.3. Samples: 1619520500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:23,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:12:24,166][47288] Updated weights for policy 0, policy_version 101936 (0.0028) [2024-04-26 06:12:27,149][47288] Updated weights for policy 0, policy_version 101946 (0.0030) [2024-04-26 06:12:28,923][47056] Fps is (10 sec: 55704.4, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1670381568. Throughput: 0: 56203.3. Samples: 1619689900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:28,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 06:12:29,991][47288] Updated weights for policy 0, policy_version 101956 (0.0033) [2024-04-26 06:12:32,793][47288] Updated weights for policy 0, policy_version 101966 (0.0024) [2024-04-26 06:12:33,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1670660096. Throughput: 0: 56212.1. Samples: 1620034480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:33,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 06:12:35,634][47288] Updated weights for policy 0, policy_version 101976 (0.0028) [2024-04-26 06:12:38,679][47288] Updated weights for policy 0, policy_version 101986 (0.0034) [2024-04-26 06:12:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1670938624. Throughput: 0: 56543.3. Samples: 1620375760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:38,924][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 06:12:41,337][47288] Updated weights for policy 0, policy_version 101996 (0.0027) [2024-04-26 06:12:43,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55978.5, 300 sec: 56427.6). Total num frames: 1671217152. Throughput: 0: 56392.4. Samples: 1620535500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:43,924][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 06:12:44,861][47288] Updated weights for policy 0, policy_version 102006 (0.0025) [2024-04-26 06:12:47,152][47288] Updated weights for policy 0, policy_version 102016 (0.0028) [2024-04-26 06:12:48,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1671512064. Throughput: 0: 56407.1. Samples: 1620872120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:48,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 06:12:48,941][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000102021_1671512064.pth... [2024-04-26 06:12:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000101194_1657962496.pth [2024-04-26 06:12:50,560][47288] Updated weights for policy 0, policy_version 102026 (0.0025) [2024-04-26 06:12:52,948][47288] Updated weights for policy 0, policy_version 102036 (0.0031) [2024-04-26 06:12:53,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56524.6, 300 sec: 56538.7). Total num frames: 1671806976. Throughput: 0: 56359.4. Samples: 1621212600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 06:12:53,932][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 06:12:56,324][47288] Updated weights for policy 0, policy_version 102046 (0.0029) [2024-04-26 06:12:58,687][47288] Updated weights for policy 0, policy_version 102056 (0.0032) [2024-04-26 06:12:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56798.0, 300 sec: 56538.7). 
Total num frames: 1672085504. Throughput: 0: 56454.8. Samples: 1621388080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:12:58,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 06:13:02,072][47288] Updated weights for policy 0, policy_version 102066 (0.0031) [2024-04-26 06:13:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1672347648. Throughput: 0: 56495.8. Samples: 1621726120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:03,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 06:13:04,637][47288] Updated weights for policy 0, policy_version 102076 (0.0026) [2024-04-26 06:13:07,687][47267] Signal inference workers to stop experience collection... (24400 times) [2024-04-26 06:13:07,716][47288] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-04-26 06:13:07,770][47267] Signal inference workers to resume experience collection... (24400 times) [2024-04-26 06:13:07,771][47288] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-04-26 06:13:07,884][47288] Updated weights for policy 0, policy_version 102086 (0.0029) [2024-04-26 06:13:08,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 1672642560. Throughput: 0: 56447.6. Samples: 1622060640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:08,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 06:13:10,380][47288] Updated weights for policy 0, policy_version 102096 (0.0026) [2024-04-26 06:13:13,696][47288] Updated weights for policy 0, policy_version 102106 (0.0027) [2024-04-26 06:13:13,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1672921088. Throughput: 0: 56473.7. Samples: 1622231220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:13,923][47056] Avg episode reward: [(0, '0.386')] [2024-04-26 06:13:16,148][47288] Updated weights for policy 0, policy_version 102116 (0.0035) [2024-04-26 06:13:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1673199616. Throughput: 0: 56311.9. Samples: 1622568520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:18,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 06:13:19,368][47288] Updated weights for policy 0, policy_version 102126 (0.0026) [2024-04-26 06:13:22,028][47288] Updated weights for policy 0, policy_version 102136 (0.0032) [2024-04-26 06:13:23,923][47056] Fps is (10 sec: 54068.3, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1673461760. Throughput: 0: 56367.2. Samples: 1622912280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:23,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 06:13:25,326][47288] Updated weights for policy 0, policy_version 102146 (0.0027) [2024-04-26 06:13:27,644][47288] Updated weights for policy 0, policy_version 102156 (0.0031) [2024-04-26 06:13:28,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1673773056. Throughput: 0: 56518.3. Samples: 1623078820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:28,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 06:13:31,300][47288] Updated weights for policy 0, policy_version 102166 (0.0033) [2024-04-26 06:13:33,466][47288] Updated weights for policy 0, policy_version 102176 (0.0031) [2024-04-26 06:13:33,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1674051584. Throughput: 0: 56543.2. Samples: 1623416560. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:33,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 06:13:36,915][47288] Updated weights for policy 0, policy_version 102186 (0.0028) [2024-04-26 06:13:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1674346496. Throughput: 0: 56406.2. Samples: 1623750880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:38,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 06:13:39,249][47288] Updated weights for policy 0, policy_version 102196 (0.0025) [2024-04-26 06:13:42,692][47288] Updated weights for policy 0, policy_version 102206 (0.0030) [2024-04-26 06:13:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1674592256. Throughput: 0: 56406.7. Samples: 1623926380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:43,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 06:13:45,137][47288] Updated weights for policy 0, policy_version 102216 (0.0024) [2024-04-26 06:13:48,520][47288] Updated weights for policy 0, policy_version 102226 (0.0028) [2024-04-26 06:13:48,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1674903552. Throughput: 0: 56434.3. Samples: 1624265660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:48,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 06:13:51,215][47288] Updated weights for policy 0, policy_version 102236 (0.0027) [2024-04-26 06:13:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1675165696. Throughput: 0: 56386.9. Samples: 1624598040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 06:13:53,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 06:13:54,345][47288] Updated weights for policy 0, policy_version 102246 (0.0025) [2024-04-26 06:13:57,015][47288] Updated weights for policy 0, policy_version 102256 (0.0033) [2024-04-26 06:13:58,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1675444224. Throughput: 0: 56224.2. Samples: 1624761300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:13:58,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 06:14:00,136][47288] Updated weights for policy 0, policy_version 102266 (0.0030) [2024-04-26 06:14:02,752][47288] Updated weights for policy 0, policy_version 102276 (0.0027) [2024-04-26 06:14:03,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1675739136. Throughput: 0: 56186.7. Samples: 1625096920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:03,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 06:14:05,924][47288] Updated weights for policy 0, policy_version 102286 (0.0026) [2024-04-26 06:14:06,305][47267] Signal inference workers to stop experience collection... (24450 times) [2024-04-26 06:14:06,335][47288] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-04-26 06:14:06,361][47267] Signal inference workers to resume experience collection... (24450 times) [2024-04-26 06:14:06,366][47288] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-04-26 06:14:08,649][47288] Updated weights for policy 0, policy_version 102296 (0.0031) [2024-04-26 06:14:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.9, 300 sec: 56483.1). Total num frames: 1676017664. Throughput: 0: 56146.3. Samples: 1625438860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:08,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:14:11,759][47288] Updated weights for policy 0, policy_version 102306 (0.0028) [2024-04-26 06:14:13,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56798.1, 300 sec: 56483.2). Total num frames: 1676328960. Throughput: 0: 56320.6. Samples: 1625613240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:13,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 06:14:14,585][47288] Updated weights for policy 0, policy_version 102316 (0.0028) [2024-04-26 06:14:17,609][47288] Updated weights for policy 0, policy_version 102326 (0.0028) [2024-04-26 06:14:18,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1676574720. Throughput: 0: 56325.2. Samples: 1625951200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:18,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 06:14:20,238][47288] Updated weights for policy 0, policy_version 102336 (0.0027) [2024-04-26 06:14:23,369][47288] Updated weights for policy 0, policy_version 102346 (0.0030) [2024-04-26 06:14:23,923][47056] Fps is (10 sec: 52428.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1676853248. Throughput: 0: 56431.3. Samples: 1626290280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:23,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:14:26,058][47288] Updated weights for policy 0, policy_version 102356 (0.0027) [2024-04-26 06:14:28,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1677131776. Throughput: 0: 56175.4. Samples: 1626454280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:28,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 06:14:29,142][47288] Updated weights for policy 0, policy_version 102366 (0.0033) [2024-04-26 06:14:31,868][47288] Updated weights for policy 0, policy_version 102376 (0.0030) [2024-04-26 06:14:33,923][47056] Fps is (10 sec: 57342.4, 60 sec: 56251.4, 300 sec: 56372.0). Total num frames: 1677426688. Throughput: 0: 56148.6. Samples: 1626792360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:33,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 06:14:34,965][47288] Updated weights for policy 0, policy_version 102386 (0.0026) [2024-04-26 06:14:37,576][47288] Updated weights for policy 0, policy_version 102396 (0.0026) [2024-04-26 06:14:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1677688832. Throughput: 0: 56270.9. Samples: 1627130240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:38,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 06:14:40,658][47288] Updated weights for policy 0, policy_version 102406 (0.0029) [2024-04-26 06:14:43,462][47288] Updated weights for policy 0, policy_version 102416 (0.0036) [2024-04-26 06:14:43,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1678000128. Throughput: 0: 56586.2. Samples: 1627307680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:43,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 06:14:46,964][47288] Updated weights for policy 0, policy_version 102426 (0.0032) [2024-04-26 06:14:48,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1678278656. Throughput: 0: 56606.7. Samples: 1627644220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:48,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 06:14:48,937][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000102434_1678278656.pth... [2024-04-26 06:14:48,997][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000101607_1664729088.pth [2024-04-26 06:14:49,260][47288] Updated weights for policy 0, policy_version 102436 (0.0031) [2024-04-26 06:14:52,703][47288] Updated weights for policy 0, policy_version 102446 (0.0031) [2024-04-26 06:14:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1678557184. Throughput: 0: 56458.1. Samples: 1627979480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 06:14:53,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 06:14:54,937][47288] Updated weights for policy 0, policy_version 102456 (0.0031) [2024-04-26 06:14:58,389][47288] Updated weights for policy 0, policy_version 102466 (0.0030) [2024-04-26 06:14:58,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.8, 300 sec: 56316.6). Total num frames: 1678819328. Throughput: 0: 56395.2. Samples: 1628151020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:14:58,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 06:15:00,873][47288] Updated weights for policy 0, policy_version 102476 (0.0030) [2024-04-26 06:15:03,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1679114240. Throughput: 0: 56414.1. Samples: 1628489840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:03,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 06:15:04,304][47288] Updated weights for policy 0, policy_version 102486 (0.0028) [2024-04-26 06:15:06,769][47288] Updated weights for policy 0, policy_version 102496 (0.0025) [2024-04-26 06:15:08,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 56372.1). 
Total num frames: 1679392768. Throughput: 0: 56387.0. Samples: 1628827700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:08,924][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 06:15:10,101][47288] Updated weights for policy 0, policy_version 102506 (0.0027) [2024-04-26 06:15:10,595][47267] Signal inference workers to stop experience collection... (24500 times) [2024-04-26 06:15:10,595][47267] Signal inference workers to resume experience collection... (24500 times) [2024-04-26 06:15:10,609][47288] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-04-26 06:15:10,609][47288] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-04-26 06:15:12,496][47288] Updated weights for policy 0, policy_version 102516 (0.0029) [2024-04-26 06:15:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 56372.0). Total num frames: 1679687680. Throughput: 0: 56448.0. Samples: 1628994440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:13,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 06:15:16,013][47288] Updated weights for policy 0, policy_version 102526 (0.0027) [2024-04-26 06:15:18,218][47288] Updated weights for policy 0, policy_version 102536 (0.0030) [2024-04-26 06:15:18,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1679982592. Throughput: 0: 56411.3. Samples: 1629330860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:18,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 06:15:21,614][47288] Updated weights for policy 0, policy_version 102546 (0.0030) [2024-04-26 06:15:23,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1680261120. Throughput: 0: 56494.0. Samples: 1629672460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:23,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 06:15:24,007][47288] Updated weights for policy 0, policy_version 102556 (0.0027) [2024-04-26 06:15:27,411][47288] Updated weights for policy 0, policy_version 102566 (0.0025) [2024-04-26 06:15:28,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 1680539648. Throughput: 0: 56570.3. Samples: 1629853340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:28,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 06:15:29,810][47288] Updated weights for policy 0, policy_version 102576 (0.0027) [2024-04-26 06:15:33,300][47288] Updated weights for policy 0, policy_version 102586 (0.0025) [2024-04-26 06:15:33,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 1680818176. Throughput: 0: 56618.2. Samples: 1630192040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:33,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 06:15:35,697][47288] Updated weights for policy 0, policy_version 102596 (0.0027) [2024-04-26 06:15:38,923][47056] Fps is (10 sec: 54066.2, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1681080320. Throughput: 0: 56565.7. Samples: 1630524940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:38,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 06:15:39,020][47288] Updated weights for policy 0, policy_version 102606 (0.0031) [2024-04-26 06:15:41,432][47288] Updated weights for policy 0, policy_version 102616 (0.0029) [2024-04-26 06:15:43,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1681358848. Throughput: 0: 56411.6. Samples: 1630689540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:43,923][47056] Avg episode reward: [(0, '0.573')] [2024-04-26 06:15:44,688][47288] Updated weights for policy 0, policy_version 102626 (0.0033) [2024-04-26 06:15:47,363][47288] Updated weights for policy 0, policy_version 102636 (0.0031) [2024-04-26 06:15:48,923][47056] Fps is (10 sec: 58983.6, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1681670144. Throughput: 0: 56403.3. Samples: 1631027980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:48,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 06:15:50,554][47288] Updated weights for policy 0, policy_version 102646 (0.0027) [2024-04-26 06:15:53,122][47288] Updated weights for policy 0, policy_version 102656 (0.0026) [2024-04-26 06:15:53,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1681965056. Throughput: 0: 56439.2. Samples: 1631367460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:15:53,923][47056] Avg episode reward: [(0, '0.576')] [2024-04-26 06:15:56,461][47288] Updated weights for policy 0, policy_version 102666 (0.0029) [2024-04-26 06:15:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1682227200. Throughput: 0: 56590.4. Samples: 1631541000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:15:58,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 06:15:58,949][47288] Updated weights for policy 0, policy_version 102676 (0.0029) [2024-04-26 06:16:02,266][47288] Updated weights for policy 0, policy_version 102686 (0.0029) [2024-04-26 06:16:03,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1682522112. Throughput: 0: 56519.5. Samples: 1631874240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:03,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 06:16:04,662][47288] Updated weights for policy 0, policy_version 102696 (0.0028) [2024-04-26 06:16:07,939][47288] Updated weights for policy 0, policy_version 102706 (0.0034) [2024-04-26 06:16:08,923][47056] Fps is (10 sec: 55704.3, 60 sec: 56524.6, 300 sec: 56372.0). Total num frames: 1682784256. Throughput: 0: 56392.5. Samples: 1632210140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:08,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 06:16:10,347][47267] Signal inference workers to stop experience collection... (24550 times) [2024-04-26 06:16:10,347][47267] Signal inference workers to resume experience collection... (24550 times) [2024-04-26 06:16:10,359][47288] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-04-26 06:16:10,360][47288] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-04-26 06:16:10,455][47288] Updated weights for policy 0, policy_version 102716 (0.0030) [2024-04-26 06:16:13,837][47288] Updated weights for policy 0, policy_version 102726 (0.0027) [2024-04-26 06:16:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.8, 300 sec: 56372.0). Total num frames: 1683062784. Throughput: 0: 56089.7. Samples: 1632377380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:13,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 06:16:16,322][47288] Updated weights for policy 0, policy_version 102736 (0.0027) [2024-04-26 06:16:18,923][47056] Fps is (10 sec: 55706.6, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1683341312. Throughput: 0: 56101.3. Samples: 1632716600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:18,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 06:16:19,659][47288] Updated weights for policy 0, policy_version 102746 (0.0034) [2024-04-26 06:16:22,078][47288] Updated weights for policy 0, policy_version 102756 (0.0028) [2024-04-26 06:16:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1683636224. Throughput: 0: 56289.4. Samples: 1633057960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:23,923][47056] Avg episode reward: [(0, '0.594')] [2024-04-26 06:16:25,397][47288] Updated weights for policy 0, policy_version 102766 (0.0032) [2024-04-26 06:16:27,911][47288] Updated weights for policy 0, policy_version 102776 (0.0030) [2024-04-26 06:16:28,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1683931136. Throughput: 0: 56471.0. Samples: 1633230740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:28,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 06:16:31,157][47288] Updated weights for policy 0, policy_version 102786 (0.0029) [2024-04-26 06:16:33,852][47288] Updated weights for policy 0, policy_version 102796 (0.0031) [2024-04-26 06:16:33,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1684209664. Throughput: 0: 56435.1. Samples: 1633567560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:33,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:16:36,960][47288] Updated weights for policy 0, policy_version 102806 (0.0028) [2024-04-26 06:16:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56798.1, 300 sec: 56372.1). Total num frames: 1684488192. Throughput: 0: 56285.4. Samples: 1633900300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:38,924][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 06:16:39,719][47288] Updated weights for policy 0, policy_version 102816 (0.0027) [2024-04-26 06:16:42,861][47288] Updated weights for policy 0, policy_version 102826 (0.0031) [2024-04-26 06:16:43,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1684750336. Throughput: 0: 56313.7. Samples: 1634075120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:43,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 06:16:45,418][47288] Updated weights for policy 0, policy_version 102836 (0.0025) [2024-04-26 06:16:48,568][47288] Updated weights for policy 0, policy_version 102846 (0.0027) [2024-04-26 06:16:48,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1685045248. Throughput: 0: 56530.7. Samples: 1634418120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:48,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 06:16:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000102847_1685045248.pth... [2024-04-26 06:16:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000102021_1671512064.pth [2024-04-26 06:16:51,156][47288] Updated weights for policy 0, policy_version 102856 (0.0029) [2024-04-26 06:16:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 56316.5). Total num frames: 1685291008. Throughput: 0: 56509.5. Samples: 1634753060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 06:16:53,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 06:16:54,420][47288] Updated weights for policy 0, policy_version 102866 (0.0035) [2024-04-26 06:16:56,991][47288] Updated weights for policy 0, policy_version 102876 (0.0028) [2024-04-26 06:16:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 56372.0). 
Total num frames: 1685585920. Throughput: 0: 56368.4. Samples: 1634913960. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:16:58,923][47056] Avg episode reward: [(0, '0.597')] [2024-04-26 06:17:00,241][47288] Updated weights for policy 0, policy_version 102886 (0.0035) [2024-04-26 06:17:02,780][47288] Updated weights for policy 0, policy_version 102896 (0.0031) [2024-04-26 06:17:03,923][47056] Fps is (10 sec: 60621.1, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 1685897216. Throughput: 0: 56306.7. Samples: 1635250400. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:03,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 06:17:06,062][47288] Updated weights for policy 0, policy_version 102906 (0.0029) [2024-04-26 06:17:08,439][47288] Updated weights for policy 0, policy_version 102916 (0.0029) [2024-04-26 06:17:08,923][47056] Fps is (10 sec: 60621.2, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 1686192128. Throughput: 0: 56229.4. Samples: 1635588280. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:08,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 06:17:11,347][47267] Signal inference workers to stop experience collection... (24600 times) [2024-04-26 06:17:11,348][47267] Signal inference workers to resume experience collection... (24600 times) [2024-04-26 06:17:11,359][47288] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-04-26 06:17:11,389][47288] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-04-26 06:17:11,719][47288] Updated weights for policy 0, policy_version 102926 (0.0037) [2024-04-26 06:17:13,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1686454272. Throughput: 0: 56356.0. Samples: 1635766760. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:13,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 06:17:14,327][47288] Updated weights for policy 0, policy_version 102936 (0.0029) [2024-04-26 06:17:17,684][47288] Updated weights for policy 0, policy_version 102946 (0.0031) [2024-04-26 06:17:18,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 1686749184. Throughput: 0: 56230.2. Samples: 1636097920. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:18,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 06:17:20,090][47288] Updated weights for policy 0, policy_version 102956 (0.0026) [2024-04-26 06:17:23,628][47288] Updated weights for policy 0, policy_version 102966 (0.0032) [2024-04-26 06:17:23,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1687011328. Throughput: 0: 56323.4. Samples: 1636434860. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:23,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 06:17:25,899][47288] Updated weights for policy 0, policy_version 102976 (0.0028) [2024-04-26 06:17:28,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 56372.0). Total num frames: 1687289856. Throughput: 0: 56018.2. Samples: 1636595940. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:28,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 06:17:29,375][47288] Updated weights for policy 0, policy_version 102986 (0.0034) [2024-04-26 06:17:31,704][47288] Updated weights for policy 0, policy_version 102996 (0.0023) [2024-04-26 06:17:33,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 56261.0). Total num frames: 1687535616. Throughput: 0: 55933.0. Samples: 1636935100. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:33,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 06:17:35,419][47288] Updated weights for policy 0, policy_version 103006 (0.0032) [2024-04-26 06:17:37,465][47288] Updated weights for policy 0, policy_version 103016 (0.0028) [2024-04-26 06:17:38,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.5, 300 sec: 56372.1). Total num frames: 1687846912. Throughput: 0: 56073.3. Samples: 1637276360. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:38,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 06:17:41,047][47288] Updated weights for policy 0, policy_version 103026 (0.0027) [2024-04-26 06:17:43,266][47288] Updated weights for policy 0, policy_version 103036 (0.0030) [2024-04-26 06:17:43,923][47056] Fps is (10 sec: 62259.4, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1688158208. Throughput: 0: 56392.6. Samples: 1637451620. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:43,923][47056] Avg episode reward: [(0, '0.596')] [2024-04-26 06:17:46,813][47288] Updated weights for policy 0, policy_version 103046 (0.0029) [2024-04-26 06:17:48,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1688436736. Throughput: 0: 56500.9. Samples: 1637792940. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:48,923][47056] Avg episode reward: [(0, '0.600')] [2024-04-26 06:17:49,180][47288] Updated weights for policy 0, policy_version 103056 (0.0028) [2024-04-26 06:17:52,516][47288] Updated weights for policy 0, policy_version 103066 (0.0029) [2024-04-26 06:17:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1688698880. Throughput: 0: 56374.3. Samples: 1638125120. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 06:17:53,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 06:17:55,012][47288] Updated weights for policy 0, policy_version 103076 (0.0027) [2024-04-26 06:17:58,285][47288] Updated weights for policy 0, policy_version 103086 (0.0030) [2024-04-26 06:17:58,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1688977408. Throughput: 0: 56231.8. Samples: 1638297200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:17:58,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 06:18:00,990][47288] Updated weights for policy 0, policy_version 103096 (0.0038) [2024-04-26 06:18:02,654][47267] Signal inference workers to stop experience collection... (24650 times) [2024-04-26 06:18:02,687][47288] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-04-26 06:18:02,748][47267] Signal inference workers to resume experience collection... (24650 times) [2024-04-26 06:18:02,748][47288] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-04-26 06:18:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1689272320. Throughput: 0: 56472.8. Samples: 1638639200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:03,923][47056] Avg episode reward: [(0, '0.614')] [2024-04-26 06:18:04,105][47288] Updated weights for policy 0, policy_version 103106 (0.0029) [2024-04-26 06:18:06,933][47288] Updated weights for policy 0, policy_version 103116 (0.0029) [2024-04-26 06:18:08,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 56316.6). Total num frames: 1689534464. Throughput: 0: 56469.0. Samples: 1638975960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:08,932][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 06:18:09,946][47288] Updated weights for policy 0, policy_version 103126 (0.0030) [2024-04-26 06:18:12,709][47288] Updated weights for policy 0, policy_version 103136 (0.0029) [2024-04-26 06:18:13,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 1689796608. Throughput: 0: 56285.9. Samples: 1639128800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:13,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 06:18:16,102][47288] Updated weights for policy 0, policy_version 103146 (0.0028) [2024-04-26 06:18:18,492][47288] Updated weights for policy 0, policy_version 103156 (0.0028) [2024-04-26 06:18:18,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 1690124288. Throughput: 0: 56291.0. Samples: 1639468200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:18,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 06:18:22,082][47288] Updated weights for policy 0, policy_version 103166 (0.0029) [2024-04-26 06:18:23,923][47056] Fps is (10 sec: 60620.1, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1690402816. Throughput: 0: 56239.5. Samples: 1639807140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:23,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 06:18:24,277][47288] Updated weights for policy 0, policy_version 103176 (0.0028) [2024-04-26 06:18:27,753][47288] Updated weights for policy 0, policy_version 103186 (0.0029) [2024-04-26 06:18:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1690681344. Throughput: 0: 56172.9. Samples: 1639979400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:28,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 06:18:30,382][47288] Updated weights for policy 0, policy_version 103196 (0.0028) [2024-04-26 06:18:33,591][47288] Updated weights for policy 0, policy_version 103206 (0.0028) [2024-04-26 06:18:33,923][47056] Fps is (10 sec: 54068.1, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1690943488. Throughput: 0: 56124.1. Samples: 1640318520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:33,923][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 06:18:36,205][47288] Updated weights for policy 0, policy_version 103216 (0.0028) [2024-04-26 06:18:38,923][47056] Fps is (10 sec: 54065.9, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1691222016. Throughput: 0: 56302.4. Samples: 1640658740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:38,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 06:18:39,436][47288] Updated weights for policy 0, policy_version 103226 (0.0033) [2024-04-26 06:18:41,944][47288] Updated weights for policy 0, policy_version 103236 (0.0030) [2024-04-26 06:18:43,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1691516928. Throughput: 0: 56229.4. Samples: 1640827520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:43,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 06:18:45,115][47288] Updated weights for policy 0, policy_version 103246 (0.0029) [2024-04-26 06:18:47,903][47288] Updated weights for policy 0, policy_version 103256 (0.0026) [2024-04-26 06:18:48,923][47056] Fps is (10 sec: 55707.2, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 1691779072. Throughput: 0: 56101.8. Samples: 1641163780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:48,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 06:18:49,004][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000103259_1691795456.pth... [2024-04-26 06:18:49,058][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000102434_1678278656.pth [2024-04-26 06:18:50,790][47288] Updated weights for policy 0, policy_version 103266 (0.0030) [2024-04-26 06:18:52,655][47267] Signal inference workers to stop experience collection... (24700 times) [2024-04-26 06:18:52,683][47288] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-04-26 06:18:52,709][47267] Signal inference workers to resume experience collection... (24700 times) [2024-04-26 06:18:52,709][47288] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-04-26 06:18:53,880][47288] Updated weights for policy 0, policy_version 103276 (0.0032) [2024-04-26 06:18:53,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1692073984. Throughput: 0: 56192.9. Samples: 1641504640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:18:53,923][47056] Avg episode reward: [(0, '0.380')] [2024-04-26 06:18:56,813][47288] Updated weights for policy 0, policy_version 103286 (0.0037) [2024-04-26 06:18:58,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1692368896. Throughput: 0: 56649.2. Samples: 1641678020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:18:58,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 06:18:59,733][47288] Updated weights for policy 0, policy_version 103296 (0.0033) [2024-04-26 06:19:02,680][47288] Updated weights for policy 0, policy_version 103306 (0.0028) [2024-04-26 06:19:03,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1692663808. Throughput: 0: 56625.9. Samples: 1642016360. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:03,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 06:19:05,531][47288] Updated weights for policy 0, policy_version 103316 (0.0030) [2024-04-26 06:19:08,403][47288] Updated weights for policy 0, policy_version 103326 (0.0031) [2024-04-26 06:19:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1692925952. Throughput: 0: 56386.3. Samples: 1642344520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:08,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 06:19:11,358][47288] Updated weights for policy 0, policy_version 103336 (0.0032) [2024-04-26 06:19:13,923][47056] Fps is (10 sec: 52428.6, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1693188096. Throughput: 0: 56311.5. Samples: 1642513420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:13,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 06:19:14,094][47288] Updated weights for policy 0, policy_version 103346 (0.0030) [2024-04-26 06:19:17,198][47288] Updated weights for policy 0, policy_version 103356 (0.0029) [2024-04-26 06:19:18,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 1693466624. Throughput: 0: 56288.5. Samples: 1642851500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:18,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 06:19:20,050][47288] Updated weights for policy 0, policy_version 103366 (0.0025) [2024-04-26 06:19:23,313][47288] Updated weights for policy 0, policy_version 103376 (0.0026) [2024-04-26 06:19:23,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1693777920. Throughput: 0: 56275.2. Samples: 1643191120. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:23,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 06:19:25,783][47288] Updated weights for policy 0, policy_version 103386 (0.0033) [2024-04-26 06:19:28,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 1694023680. Throughput: 0: 56082.2. Samples: 1643351220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:28,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 06:19:29,173][47288] Updated weights for policy 0, policy_version 103396 (0.0029) [2024-04-26 06:19:31,496][47288] Updated weights for policy 0, policy_version 103406 (0.0033) [2024-04-26 06:19:33,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1694318592. Throughput: 0: 56142.2. Samples: 1643690180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:33,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 06:19:35,153][47288] Updated weights for policy 0, policy_version 103416 (0.0027) [2024-04-26 06:19:37,277][47288] Updated weights for policy 0, policy_version 103426 (0.0035) [2024-04-26 06:19:38,923][47056] Fps is (10 sec: 60621.0, 60 sec: 56798.1, 300 sec: 56372.1). Total num frames: 1694629888. Throughput: 0: 55979.6. Samples: 1644023720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:38,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:19:40,793][47288] Updated weights for policy 0, policy_version 103436 (0.0027) [2024-04-26 06:19:43,147][47288] Updated weights for policy 0, policy_version 103446 (0.0030) [2024-04-26 06:19:43,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1694908416. Throughput: 0: 55903.8. Samples: 1644193680. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:43,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 06:19:46,679][47288] Updated weights for policy 0, policy_version 103456 (0.0031) [2024-04-26 06:19:46,923][47267] Signal inference workers to stop experience collection... (24750 times) [2024-04-26 06:19:46,923][47267] Signal inference workers to resume experience collection... (24750 times) [2024-04-26 06:19:46,947][47288] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-04-26 06:19:46,951][47288] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-04-26 06:19:48,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1695170560. Throughput: 0: 56040.3. Samples: 1644538180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:48,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:19:48,981][47288] Updated weights for policy 0, policy_version 103466 (0.0023) [2024-04-26 06:19:52,301][47288] Updated weights for policy 0, policy_version 103476 (0.0026) [2024-04-26 06:19:53,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1695449088. Throughput: 0: 56155.1. Samples: 1644871500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:53,923][47056] Avg episode reward: [(0, '0.421')] [2024-04-26 06:19:54,857][47288] Updated weights for policy 0, policy_version 103486 (0.0029) [2024-04-26 06:19:58,195][47288] Updated weights for policy 0, policy_version 103496 (0.0032) [2024-04-26 06:19:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1695727616. Throughput: 0: 56181.7. Samples: 1645041600. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 06:19:58,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 06:20:00,547][47288] Updated weights for policy 0, policy_version 103506 (0.0027) [2024-04-26 06:20:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 56261.0). Total num frames: 1695989760. Throughput: 0: 56257.3. Samples: 1645383080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:03,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 06:20:03,941][47288] Updated weights for policy 0, policy_version 103516 (0.0029) [2024-04-26 06:20:06,324][47288] Updated weights for policy 0, policy_version 103526 (0.0029) [2024-04-26 06:20:08,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 1696268288. Throughput: 0: 56285.5. Samples: 1645723960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:08,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 06:20:09,699][47288] Updated weights for policy 0, policy_version 103536 (0.0032) [2024-04-26 06:20:12,102][47288] Updated weights for policy 0, policy_version 103546 (0.0029) [2024-04-26 06:20:13,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1696579584. Throughput: 0: 56408.2. Samples: 1645889580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:13,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 06:20:15,631][47288] Updated weights for policy 0, policy_version 103556 (0.0033) [2024-04-26 06:20:17,956][47288] Updated weights for policy 0, policy_version 103566 (0.0030) [2024-04-26 06:20:18,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56797.7, 300 sec: 56316.5). Total num frames: 1696874496. Throughput: 0: 56345.7. Samples: 1646225740. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:18,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 06:20:21,429][47288] Updated weights for policy 0, policy_version 103576 (0.0032) [2024-04-26 06:20:23,763][47288] Updated weights for policy 0, policy_version 103586 (0.0027) [2024-04-26 06:20:23,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1697153024. Throughput: 0: 56291.0. Samples: 1646556820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:23,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 06:20:27,366][47288] Updated weights for policy 0, policy_version 103596 (0.0033) [2024-04-26 06:20:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1697415168. Throughput: 0: 56436.8. Samples: 1646733340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:28,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 06:20:29,487][47288] Updated weights for policy 0, policy_version 103606 (0.0026) [2024-04-26 06:20:33,179][47288] Updated weights for policy 0, policy_version 103616 (0.0030) [2024-04-26 06:20:33,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.8, 300 sec: 56316.6). Total num frames: 1697693696. Throughput: 0: 56285.4. Samples: 1647071020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:33,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 06:20:35,509][47288] Updated weights for policy 0, policy_version 103626 (0.0029) [2024-04-26 06:20:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1697955840. Throughput: 0: 56493.4. Samples: 1647413700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:38,923][47056] Avg episode reward: [(0, '0.592')] [2024-04-26 06:20:38,924][47267] Signal inference workers to stop experience collection... 
(24800 times) [2024-04-26 06:20:38,925][47267] Signal inference workers to resume experience collection... (24800 times) [2024-04-26 06:20:38,943][47288] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-04-26 06:20:38,943][47288] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-04-26 06:20:39,059][47288] Updated weights for policy 0, policy_version 103636 (0.0027) [2024-04-26 06:20:41,261][47288] Updated weights for policy 0, policy_version 103646 (0.0030) [2024-04-26 06:20:43,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 56149.9). Total num frames: 1698234368. Throughput: 0: 56164.0. Samples: 1647568980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:43,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:20:44,835][47288] Updated weights for policy 0, policy_version 103656 (0.0032) [2024-04-26 06:20:47,028][47288] Updated weights for policy 0, policy_version 103666 (0.0030) [2024-04-26 06:20:48,923][47056] Fps is (10 sec: 58980.8, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1698545664. Throughput: 0: 56068.2. Samples: 1647906160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:48,924][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:20:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000103671_1698545664.pth... [2024-04-26 06:20:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000102847_1685045248.pth [2024-04-26 06:20:50,655][47288] Updated weights for policy 0, policy_version 103676 (0.0026) [2024-04-26 06:20:53,039][47288] Updated weights for policy 0, policy_version 103686 (0.0030) [2024-04-26 06:20:53,923][47056] Fps is (10 sec: 60621.0, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1698840576. Throughput: 0: 55847.1. Samples: 1648237080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:53,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 06:20:56,634][47288] Updated weights for policy 0, policy_version 103696 (0.0034) [2024-04-26 06:20:58,841][47288] Updated weights for policy 0, policy_version 103706 (0.0027) [2024-04-26 06:20:58,923][47056] Fps is (10 sec: 57345.3, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1699119104. Throughput: 0: 56098.1. Samples: 1648414000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 06:20:58,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 06:21:02,383][47288] Updated weights for policy 0, policy_version 103716 (0.0029) [2024-04-26 06:21:03,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1699381248. Throughput: 0: 56053.8. Samples: 1648748160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:03,923][47056] Avg episode reward: [(0, '0.595')] [2024-04-26 06:21:04,709][47288] Updated weights for policy 0, policy_version 103726 (0.0038) [2024-04-26 06:21:08,308][47288] Updated weights for policy 0, policy_version 103736 (0.0036) [2024-04-26 06:21:08,923][47056] Fps is (10 sec: 52428.7, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1699643392. Throughput: 0: 56086.7. Samples: 1649080720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:08,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 06:21:10,453][47288] Updated weights for policy 0, policy_version 103746 (0.0027) [2024-04-26 06:21:13,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 1699921920. Throughput: 0: 55781.5. Samples: 1649243500. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:13,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 06:21:14,009][47288] Updated weights for policy 0, policy_version 103756 (0.0027) [2024-04-26 06:21:16,316][47288] Updated weights for policy 0, policy_version 103766 (0.0033) [2024-04-26 06:21:18,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 56094.4). Total num frames: 1700184064. Throughput: 0: 55899.0. Samples: 1649586480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:18,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 06:21:19,852][47288] Updated weights for policy 0, policy_version 103776 (0.0030) [2024-04-26 06:21:22,150][47288] Updated weights for policy 0, policy_version 103786 (0.0036) [2024-04-26 06:21:23,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 56149.9). Total num frames: 1700495360. Throughput: 0: 55750.6. Samples: 1649922480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:23,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 06:21:25,634][47288] Updated weights for policy 0, policy_version 103796 (0.0034) [2024-04-26 06:21:28,001][47288] Updated weights for policy 0, policy_version 103806 (0.0031) [2024-04-26 06:21:28,923][47056] Fps is (10 sec: 62259.2, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1700806656. Throughput: 0: 56075.5. Samples: 1650092380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:28,923][47056] Avg episode reward: [(0, '0.612')] [2024-04-26 06:21:31,351][47288] Updated weights for policy 0, policy_version 103816 (0.0027) [2024-04-26 06:21:33,885][47288] Updated weights for policy 0, policy_version 103826 (0.0029) [2024-04-26 06:21:33,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1701085184. Throughput: 0: 56051.7. Samples: 1650428480. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:33,924][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 06:21:37,266][47288] Updated weights for policy 0, policy_version 103836 (0.0028) [2024-04-26 06:21:38,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1701347328. Throughput: 0: 56094.4. Samples: 1650761320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:38,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 06:21:39,140][47267] Signal inference workers to stop experience collection... (24850 times) [2024-04-26 06:21:39,193][47288] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-04-26 06:21:39,193][47267] Signal inference workers to resume experience collection... (24850 times) [2024-04-26 06:21:39,207][47288] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-04-26 06:21:39,891][47288] Updated weights for policy 0, policy_version 103846 (0.0030) [2024-04-26 06:21:42,944][47288] Updated weights for policy 0, policy_version 103856 (0.0028) [2024-04-26 06:21:43,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56798.0, 300 sec: 56261.0). Total num frames: 1701642240. Throughput: 0: 56181.4. Samples: 1650942160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:43,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 06:21:45,731][47288] Updated weights for policy 0, policy_version 103866 (0.0030) [2024-04-26 06:21:48,686][47288] Updated weights for policy 0, policy_version 103876 (0.0025) [2024-04-26 06:21:48,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1701904384. Throughput: 0: 56269.3. Samples: 1651280280. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:48,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 06:21:51,500][47288] Updated weights for policy 0, policy_version 103886 (0.0031) [2024-04-26 06:21:53,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 56205.5). Total num frames: 1702166528. Throughput: 0: 56525.4. Samples: 1651624360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:53,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 06:21:54,664][47288] Updated weights for policy 0, policy_version 103896 (0.0026) [2024-04-26 06:21:57,180][47288] Updated weights for policy 0, policy_version 103906 (0.0030) [2024-04-26 06:21:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55432.4, 300 sec: 56094.4). Total num frames: 1702445056. Throughput: 0: 56307.7. Samples: 1651777360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 06:21:58,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:22:00,536][47288] Updated weights for policy 0, policy_version 103916 (0.0033) [2024-04-26 06:22:02,878][47288] Updated weights for policy 0, policy_version 103926 (0.0034) [2024-04-26 06:22:03,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56251.7, 300 sec: 56149.9). Total num frames: 1702756352. Throughput: 0: 56194.2. Samples: 1652115220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:03,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 06:22:06,208][47288] Updated weights for policy 0, policy_version 103936 (0.0031) [2024-04-26 06:22:08,782][47288] Updated weights for policy 0, policy_version 103946 (0.0023) [2024-04-26 06:22:08,923][47056] Fps is (10 sec: 62259.9, 60 sec: 57071.0, 300 sec: 56316.5). Total num frames: 1703067648. Throughput: 0: 56386.3. Samples: 1652459860. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:08,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 06:22:11,926][47288] Updated weights for policy 0, policy_version 103956 (0.0030) [2024-04-26 06:22:13,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56797.8, 300 sec: 56205.4). Total num frames: 1703329792. Throughput: 0: 56535.2. Samples: 1652636460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:13,923][47056] Avg episode reward: [(0, '0.603')] [2024-04-26 06:22:14,486][47288] Updated weights for policy 0, policy_version 103966 (0.0027) [2024-04-26 06:22:17,808][47288] Updated weights for policy 0, policy_version 103976 (0.0029) [2024-04-26 06:22:18,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56797.9, 300 sec: 56205.5). Total num frames: 1703591936. Throughput: 0: 56514.3. Samples: 1652971620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:18,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:22:20,252][47288] Updated weights for policy 0, policy_version 103986 (0.0034) [2024-04-26 06:22:23,562][47288] Updated weights for policy 0, policy_version 103996 (0.0041) [2024-04-26 06:22:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1703903232. Throughput: 0: 56629.2. Samples: 1653309640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:23,923][47056] Avg episode reward: [(0, '0.390')] [2024-04-26 06:22:26,026][47288] Updated weights for policy 0, policy_version 104006 (0.0028) [2024-04-26 06:22:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1704165376. Throughput: 0: 56339.9. Samples: 1653477460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:28,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 06:22:29,190][47267] Signal inference workers to stop experience collection... 
(24900 times) [2024-04-26 06:22:29,232][47288] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-04-26 06:22:29,242][47267] Signal inference workers to resume experience collection... (24900 times) [2024-04-26 06:22:29,249][47288] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-04-26 06:22:29,252][47288] Updated weights for policy 0, policy_version 104016 (0.0026) [2024-04-26 06:22:31,781][47288] Updated weights for policy 0, policy_version 104026 (0.0032) [2024-04-26 06:22:33,923][47056] Fps is (10 sec: 50790.5, 60 sec: 55432.6, 300 sec: 56149.9). Total num frames: 1704411136. Throughput: 0: 56335.6. Samples: 1653815380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:33,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 06:22:35,037][47288] Updated weights for policy 0, policy_version 104036 (0.0028) [2024-04-26 06:22:37,511][47288] Updated weights for policy 0, policy_version 104046 (0.0031) [2024-04-26 06:22:38,924][47056] Fps is (10 sec: 54060.9, 60 sec: 55977.5, 300 sec: 56094.2). Total num frames: 1704706048. Throughput: 0: 56195.8. Samples: 1654153240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:38,924][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 06:22:40,896][47288] Updated weights for policy 0, policy_version 104056 (0.0030) [2024-04-26 06:22:43,296][47288] Updated weights for policy 0, policy_version 104066 (0.0029) [2024-04-26 06:22:43,923][47056] Fps is (10 sec: 62260.1, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1705033728. Throughput: 0: 56578.0. Samples: 1654323360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:43,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 06:22:46,697][47288] Updated weights for policy 0, policy_version 104076 (0.0029) [2024-04-26 06:22:48,923][47056] Fps is (10 sec: 62266.0, 60 sec: 57071.0, 300 sec: 56372.1). Total num frames: 1705328640. Throughput: 0: 56524.0. 
Samples: 1654658800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:48,924][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 06:22:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000104085_1705328640.pth... [2024-04-26 06:22:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000103259_1691795456.pth [2024-04-26 06:22:49,182][47288] Updated weights for policy 0, policy_version 104086 (0.0029) [2024-04-26 06:22:52,374][47288] Updated weights for policy 0, policy_version 104096 (0.0030) [2024-04-26 06:22:53,923][47056] Fps is (10 sec: 55705.0, 60 sec: 57070.9, 300 sec: 56316.5). Total num frames: 1705590784. Throughput: 0: 56572.0. Samples: 1655005600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:53,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:22:54,857][47288] Updated weights for policy 0, policy_version 104106 (0.0028) [2024-04-26 06:22:58,135][47288] Updated weights for policy 0, policy_version 104116 (0.0031) [2024-04-26 06:22:58,923][47056] Fps is (10 sec: 52429.0, 60 sec: 56798.0, 300 sec: 56205.4). Total num frames: 1705852928. Throughput: 0: 56592.9. Samples: 1655183140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 23.0) [2024-04-26 06:22:58,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 06:23:01,058][47288] Updated weights for policy 0, policy_version 104126 (0.0029) [2024-04-26 06:23:03,825][47288] Updated weights for policy 0, policy_version 104136 (0.0026) [2024-04-26 06:23:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1706164224. Throughput: 0: 56677.8. Samples: 1655522120. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:03,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 06:23:06,964][47288] Updated weights for policy 0, policy_version 104146 (0.0034) [2024-04-26 06:23:08,923][47056] Fps is (10 sec: 55704.5, 60 sec: 55705.4, 300 sec: 56316.5). Total num frames: 1706409984. Throughput: 0: 56616.7. Samples: 1655857400. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:08,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 06:23:09,600][47288] Updated weights for policy 0, policy_version 104156 (0.0032) [2024-04-26 06:23:12,750][47288] Updated weights for policy 0, policy_version 104166 (0.0027) [2024-04-26 06:23:13,923][47056] Fps is (10 sec: 50790.5, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 1706672128. Throughput: 0: 56330.2. Samples: 1656012320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:13,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 06:23:14,815][47267] Signal inference workers to stop experience collection... (24950 times) [2024-04-26 06:23:14,847][47288] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-04-26 06:23:14,899][47267] Signal inference workers to resume experience collection... (24950 times) [2024-04-26 06:23:14,899][47288] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-04-26 06:23:15,372][47288] Updated weights for policy 0, policy_version 104176 (0.0024) [2024-04-26 06:23:18,617][47288] Updated weights for policy 0, policy_version 104186 (0.0026) [2024-04-26 06:23:18,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 1706983424. Throughput: 0: 56336.0. Samples: 1656350500. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:18,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 06:23:21,310][47288] Updated weights for policy 0, policy_version 104196 (0.0028) [2024-04-26 06:23:23,923][47056] Fps is (10 sec: 62258.7, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1707294720. Throughput: 0: 56279.6. Samples: 1656685760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:23,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 06:23:24,421][47288] Updated weights for policy 0, policy_version 104206 (0.0030) [2024-04-26 06:23:27,713][47288] Updated weights for policy 0, policy_version 104216 (0.0030) [2024-04-26 06:23:28,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1707573248. Throughput: 0: 56353.6. Samples: 1656859280. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:28,932][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 06:23:30,139][47288] Updated weights for policy 0, policy_version 104226 (0.0032) [2024-04-26 06:23:33,390][47288] Updated weights for policy 0, policy_version 104236 (0.0029) [2024-04-26 06:23:33,923][47056] Fps is (10 sec: 52429.3, 60 sec: 56798.0, 300 sec: 56261.0). Total num frames: 1707819008. Throughput: 0: 56414.8. Samples: 1657197460. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:33,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 06:23:35,901][47288] Updated weights for policy 0, policy_version 104246 (0.0031) [2024-04-26 06:23:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56798.9, 300 sec: 56261.0). Total num frames: 1708113920. Throughput: 0: 56179.5. Samples: 1657533680. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:38,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 06:23:39,154][47288] Updated weights for policy 0, policy_version 104256 (0.0033) [2024-04-26 06:23:41,843][47288] Updated weights for policy 0, policy_version 104266 (0.0028) [2024-04-26 06:23:43,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1708408832. Throughput: 0: 55989.1. Samples: 1657702640. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:43,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:23:45,070][47288] Updated weights for policy 0, policy_version 104276 (0.0029) [2024-04-26 06:23:48,173][47288] Updated weights for policy 0, policy_version 104286 (0.0029) [2024-04-26 06:23:48,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 56205.5). Total num frames: 1708654592. Throughput: 0: 56056.0. Samples: 1658044640. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:48,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 06:23:50,991][47288] Updated weights for policy 0, policy_version 104296 (0.0024) [2024-04-26 06:23:53,923][47056] Fps is (10 sec: 50789.4, 60 sec: 55432.5, 300 sec: 56094.4). Total num frames: 1708916736. Throughput: 0: 56141.1. Samples: 1658383740. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:53,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:23:54,446][47288] Updated weights for policy 0, policy_version 104306 (0.0029) [2024-04-26 06:23:56,753][47288] Updated weights for policy 0, policy_version 104316 (0.0024) [2024-04-26 06:23:58,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1709244416. Throughput: 0: 56397.8. Samples: 1658550220. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 06:23:58,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 06:24:00,152][47288] Updated weights for policy 0, policy_version 104326 (0.0030) [2024-04-26 06:24:02,514][47288] Updated weights for policy 0, policy_version 104336 (0.0030) [2024-04-26 06:24:02,985][47267] Signal inference workers to stop experience collection... (25000 times) [2024-04-26 06:24:02,987][47267] Signal inference workers to resume experience collection... (25000 times) [2024-04-26 06:24:03,006][47288] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-04-26 06:24:03,007][47288] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-04-26 06:24:03,923][47056] Fps is (10 sec: 62259.9, 60 sec: 56251.8, 300 sec: 56316.6). Total num frames: 1709539328. Throughput: 0: 56410.4. Samples: 1658888960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:03,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 06:24:05,893][47288] Updated weights for policy 0, policy_version 104346 (0.0029) [2024-04-26 06:24:08,449][47288] Updated weights for policy 0, policy_version 104356 (0.0031) [2024-04-26 06:24:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56798.1, 300 sec: 56372.1). Total num frames: 1709817856. Throughput: 0: 56441.8. Samples: 1659225640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:08,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 06:24:11,762][47288] Updated weights for policy 0, policy_version 104366 (0.0036) [2024-04-26 06:24:13,923][47056] Fps is (10 sec: 52428.0, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1710063616. Throughput: 0: 56480.8. Samples: 1659400920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:13,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 06:24:14,144][47288] Updated weights for policy 0, policy_version 104376 (0.0030) [2024-04-26 06:24:17,415][47288] Updated weights for policy 0, policy_version 104386 (0.0025) [2024-04-26 06:24:18,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1710374912. Throughput: 0: 56415.4. Samples: 1659736160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:18,923][47056] Avg episode reward: [(0, '0.573')] [2024-04-26 06:24:19,995][47288] Updated weights for policy 0, policy_version 104396 (0.0028) [2024-04-26 06:24:23,396][47288] Updated weights for policy 0, policy_version 104406 (0.0028) [2024-04-26 06:24:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1710637056. Throughput: 0: 56431.6. Samples: 1660073100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:23,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 06:24:25,821][47288] Updated weights for policy 0, policy_version 104416 (0.0029) [2024-04-26 06:24:28,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55432.4, 300 sec: 56205.4). Total num frames: 1710899200. Throughput: 0: 56137.9. Samples: 1660228860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:28,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 06:24:29,049][47288] Updated weights for policy 0, policy_version 104426 (0.0029) [2024-04-26 06:24:31,724][47288] Updated weights for policy 0, policy_version 104436 (0.0027) [2024-04-26 06:24:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.6, 300 sec: 56149.9). Total num frames: 1711194112. Throughput: 0: 56005.2. Samples: 1660564880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:33,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 06:24:35,023][47288] Updated weights for policy 0, policy_version 104446 (0.0028) [2024-04-26 06:24:37,535][47288] Updated weights for policy 0, policy_version 104456 (0.0024) [2024-04-26 06:24:38,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1711489024. Throughput: 0: 55947.5. Samples: 1660901380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:38,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 06:24:40,826][47288] Updated weights for policy 0, policy_version 104466 (0.0033) [2024-04-26 06:24:43,204][47288] Updated weights for policy 0, policy_version 104476 (0.0032) [2024-04-26 06:24:43,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 1711783936. Throughput: 0: 56115.5. Samples: 1661075420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:43,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 06:24:46,550][47288] Updated weights for policy 0, policy_version 104486 (0.0028) [2024-04-26 06:24:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1712046080. Throughput: 0: 56030.9. Samples: 1661410360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:48,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 06:24:48,930][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000104495_1712046080.pth... [2024-04-26 06:24:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000103671_1698545664.pth [2024-04-26 06:24:49,132][47288] Updated weights for policy 0, policy_version 104496 (0.0029) [2024-04-26 06:24:49,466][47267] Signal inference workers to stop experience collection... (25050 times) [2024-04-26 06:24:49,467][47267] Signal inference workers to resume experience collection... 
(25050 times) [2024-04-26 06:24:49,480][47288] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-04-26 06:24:49,503][47288] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-04-26 06:24:52,535][47288] Updated weights for policy 0, policy_version 104506 (0.0033) [2024-04-26 06:24:53,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56798.0, 300 sec: 56261.0). Total num frames: 1712324608. Throughput: 0: 55893.0. Samples: 1661740820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:53,923][47056] Avg episode reward: [(0, '0.607')] [2024-04-26 06:24:54,986][47288] Updated weights for policy 0, policy_version 104516 (0.0029) [2024-04-26 06:24:58,260][47288] Updated weights for policy 0, policy_version 104526 (0.0027) [2024-04-26 06:24:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1712603136. Throughput: 0: 55847.1. Samples: 1661914040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:24:58,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:25:00,900][47288] Updated weights for policy 0, policy_version 104536 (0.0028) [2024-04-26 06:25:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 56261.0). Total num frames: 1712865280. Throughput: 0: 55905.9. Samples: 1662251920. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:03,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 06:25:03,978][47288] Updated weights for policy 0, policy_version 104546 (0.0029) [2024-04-26 06:25:06,608][47288] Updated weights for policy 0, policy_version 104556 (0.0027) [2024-04-26 06:25:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 1713160192. Throughput: 0: 56047.1. Samples: 1662595220. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:08,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 06:25:09,763][47288] Updated weights for policy 0, policy_version 104566 (0.0032) [2024-04-26 06:25:12,461][47288] Updated weights for policy 0, policy_version 104576 (0.0033) [2024-04-26 06:25:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1713438720. Throughput: 0: 56298.4. Samples: 1662762280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:13,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 06:25:15,658][47288] Updated weights for policy 0, policy_version 104586 (0.0032) [2024-04-26 06:25:18,152][47288] Updated weights for policy 0, policy_version 104596 (0.0035) [2024-04-26 06:25:18,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1713750016. Throughput: 0: 56303.0. Samples: 1663098520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:18,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 06:25:21,436][47288] Updated weights for policy 0, policy_version 104606 (0.0027) [2024-04-26 06:25:23,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1714012160. Throughput: 0: 56227.3. Samples: 1663431600. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:23,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 06:25:24,023][47288] Updated weights for policy 0, policy_version 104616 (0.0038) [2024-04-26 06:25:27,219][47288] Updated weights for policy 0, policy_version 104626 (0.0026) [2024-04-26 06:25:28,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1714290688. Throughput: 0: 56345.2. Samples: 1663610960. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:28,924][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 06:25:29,794][47288] Updated weights for policy 0, policy_version 104636 (0.0029) [2024-04-26 06:25:32,881][47288] Updated weights for policy 0, policy_version 104646 (0.0028) [2024-04-26 06:25:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1714585600. Throughput: 0: 56385.6. Samples: 1663947700. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:33,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 06:25:35,558][47288] Updated weights for policy 0, policy_version 104656 (0.0030) [2024-04-26 06:25:38,724][47288] Updated weights for policy 0, policy_version 104666 (0.0032) [2024-04-26 06:25:38,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1714864128. Throughput: 0: 56563.4. Samples: 1664286180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:38,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 06:25:41,508][47288] Updated weights for policy 0, policy_version 104676 (0.0028) [2024-04-26 06:25:43,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1715142656. Throughput: 0: 56386.3. Samples: 1664451420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:43,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 06:25:44,401][47288] Updated weights for policy 0, policy_version 104686 (0.0035) [2024-04-26 06:25:47,161][47288] Updated weights for policy 0, policy_version 104696 (0.0027) [2024-04-26 06:25:48,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 56149.9). Total num frames: 1715404800. Throughput: 0: 56332.8. Samples: 1664786900. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:48,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:25:50,116][47288] Updated weights for policy 0, policy_version 104706 (0.0031) [2024-04-26 06:25:52,969][47288] Updated weights for policy 0, policy_version 104716 (0.0031) [2024-04-26 06:25:53,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1715699712. Throughput: 0: 56365.0. Samples: 1665131640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:53,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 06:25:55,939][47288] Updated weights for policy 0, policy_version 104726 (0.0028) [2024-04-26 06:25:57,754][47267] Signal inference workers to stop experience collection... (25100 times) [2024-04-26 06:25:57,754][47267] Signal inference workers to resume experience collection... (25100 times) [2024-04-26 06:25:57,781][47288] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-04-26 06:25:57,781][47288] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-04-26 06:25:58,780][47288] Updated weights for policy 0, policy_version 104736 (0.0029) [2024-04-26 06:25:58,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56524.9, 300 sec: 56316.6). Total num frames: 1715994624. Throughput: 0: 56340.5. Samples: 1665297600. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 06:25:58,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 06:26:01,597][47288] Updated weights for policy 0, policy_version 104746 (0.0029) [2024-04-26 06:26:03,923][47056] Fps is (10 sec: 58981.8, 60 sec: 57070.8, 300 sec: 56427.6). Total num frames: 1716289536. Throughput: 0: 56421.8. Samples: 1665637500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:03,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 06:26:04,509][47288] Updated weights for policy 0, policy_version 104756 (0.0035) [2024-04-26 06:26:07,349][47288] Updated weights for policy 0, policy_version 104766 (0.0031) [2024-04-26 06:26:08,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56524.8, 300 sec: 56372.0). Total num frames: 1716551680. Throughput: 0: 56497.5. Samples: 1665974000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:08,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 06:26:10,218][47288] Updated weights for policy 0, policy_version 104776 (0.0026) [2024-04-26 06:26:13,173][47288] Updated weights for policy 0, policy_version 104786 (0.0030) [2024-04-26 06:26:13,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 1716846592. Throughput: 0: 56445.1. Samples: 1666150980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:13,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 06:26:16,123][47288] Updated weights for policy 0, policy_version 104796 (0.0028) [2024-04-26 06:26:18,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1717125120. Throughput: 0: 56469.4. Samples: 1666488820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:18,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 06:26:19,012][47288] Updated weights for policy 0, policy_version 104806 (0.0029) [2024-04-26 06:26:22,031][47288] Updated weights for policy 0, policy_version 104816 (0.0026) [2024-04-26 06:26:23,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56524.6, 300 sec: 56261.0). Total num frames: 1717403648. Throughput: 0: 56445.7. Samples: 1666826240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:23,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 06:26:24,881][47288] Updated weights for policy 0, policy_version 104826 (0.0029) [2024-04-26 06:26:27,785][47288] Updated weights for policy 0, policy_version 104836 (0.0027) [2024-04-26 06:26:28,923][47056] Fps is (10 sec: 52427.0, 60 sec: 55978.5, 300 sec: 56149.9). Total num frames: 1717649408. Throughput: 0: 56488.6. Samples: 1666993420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:28,924][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 06:26:30,582][47288] Updated weights for policy 0, policy_version 104846 (0.0031) [2024-04-26 06:26:33,591][47288] Updated weights for policy 0, policy_version 104856 (0.0031) [2024-04-26 06:26:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1717960704. Throughput: 0: 56566.7. Samples: 1667332400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:33,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 06:26:36,502][47288] Updated weights for policy 0, policy_version 104866 (0.0033) [2024-04-26 06:26:38,923][47056] Fps is (10 sec: 57346.0, 60 sec: 55978.8, 300 sec: 56205.5). Total num frames: 1718222848. Throughput: 0: 56472.1. Samples: 1667672880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:38,923][47056] Avg episode reward: [(0, '0.415')] [2024-04-26 06:26:39,347][47288] Updated weights for policy 0, policy_version 104876 (0.0029) [2024-04-26 06:26:42,193][47288] Updated weights for policy 0, policy_version 104886 (0.0029) [2024-04-26 06:26:43,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1718534144. Throughput: 0: 56506.5. Samples: 1667840400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:43,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 06:26:45,334][47288] Updated weights for policy 0, policy_version 104896 (0.0028) [2024-04-26 06:26:48,052][47288] Updated weights for policy 0, policy_version 104906 (0.0028) [2024-04-26 06:26:48,923][47056] Fps is (10 sec: 60620.5, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 1718829056. Throughput: 0: 56376.2. Samples: 1668174420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:48,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 06:26:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000104909_1718829056.pth... [2024-04-26 06:26:48,986][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000104085_1705328640.pth [2024-04-26 06:26:51,225][47288] Updated weights for policy 0, policy_version 104916 (0.0028) [2024-04-26 06:26:52,941][47267] Signal inference workers to stop experience collection... (25150 times) [2024-04-26 06:26:52,975][47288] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-04-26 06:26:53,027][47267] Signal inference workers to resume experience collection... (25150 times) [2024-04-26 06:26:53,028][47288] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-04-26 06:26:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1719091200. Throughput: 0: 56369.0. Samples: 1668510600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:53,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 06:26:53,944][47288] Updated weights for policy 0, policy_version 104926 (0.0027) [2024-04-26 06:26:56,993][47288] Updated weights for policy 0, policy_version 104936 (0.0027) [2024-04-26 06:26:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1719386112. Throughput: 0: 56289.7. 
Samples: 1668684020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 06:26:58,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 06:26:59,590][47288] Updated weights for policy 0, policy_version 104946 (0.0027) [2024-04-26 06:27:02,835][47288] Updated weights for policy 0, policy_version 104956 (0.0028) [2024-04-26 06:27:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1719648256. Throughput: 0: 56412.3. Samples: 1669027380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:03,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 06:27:05,389][47288] Updated weights for policy 0, policy_version 104966 (0.0029) [2024-04-26 06:27:08,669][47288] Updated weights for policy 0, policy_version 104976 (0.0041) [2024-04-26 06:27:08,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1719926784. Throughput: 0: 56355.1. Samples: 1669362220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:08,923][47056] Avg episode reward: [(0, '0.596')] [2024-04-26 06:27:11,205][47288] Updated weights for policy 0, policy_version 104986 (0.0032) [2024-04-26 06:27:13,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1720205312. Throughput: 0: 56136.9. Samples: 1669519560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:13,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 06:27:14,564][47288] Updated weights for policy 0, policy_version 104996 (0.0028) [2024-04-26 06:27:17,051][47288] Updated weights for policy 0, policy_version 105006 (0.0029) [2024-04-26 06:27:18,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.5, 300 sec: 56205.4). Total num frames: 1720483840. Throughput: 0: 56154.1. Samples: 1669859340. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:18,924][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 06:27:20,400][47288] Updated weights for policy 0, policy_version 105016 (0.0030) [2024-04-26 06:27:22,726][47288] Updated weights for policy 0, policy_version 105026 (0.0033) [2024-04-26 06:27:23,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1720778752. Throughput: 0: 56078.9. Samples: 1670196440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:23,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 06:27:26,150][47288] Updated weights for policy 0, policy_version 105036 (0.0027) [2024-04-26 06:27:28,588][47288] Updated weights for policy 0, policy_version 105046 (0.0038) [2024-04-26 06:27:28,923][47056] Fps is (10 sec: 60621.4, 60 sec: 57344.3, 300 sec: 56538.7). Total num frames: 1721090048. Throughput: 0: 56325.9. Samples: 1670375060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:28,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 06:27:32,021][47288] Updated weights for policy 0, policy_version 105056 (0.0028) [2024-04-26 06:27:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 56427.8). Total num frames: 1721352192. Throughput: 0: 56372.4. Samples: 1670711180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:33,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 06:27:34,498][47288] Updated weights for policy 0, policy_version 105066 (0.0029) [2024-04-26 06:27:37,879][47288] Updated weights for policy 0, policy_version 105076 (0.0029) [2024-04-26 06:27:38,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 1721630720. Throughput: 0: 56469.3. Samples: 1671051720. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:38,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 06:27:40,202][47288] Updated weights for policy 0, policy_version 105086 (0.0039) [2024-04-26 06:27:43,712][47288] Updated weights for policy 0, policy_version 105096 (0.0028) [2024-04-26 06:27:43,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1721909248. Throughput: 0: 56406.6. Samples: 1671222320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:43,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 06:27:45,856][47288] Updated weights for policy 0, policy_version 105106 (0.0026) [2024-04-26 06:27:47,301][47267] Signal inference workers to stop experience collection... (25200 times) [2024-04-26 06:27:47,317][47288] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-04-26 06:27:47,397][47267] Signal inference workers to resume experience collection... (25200 times) [2024-04-26 06:27:47,397][47288] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-04-26 06:27:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1722187776. Throughput: 0: 56434.2. Samples: 1671566920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:48,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 06:27:49,469][47288] Updated weights for policy 0, policy_version 105116 (0.0026) [2024-04-26 06:27:51,708][47288] Updated weights for policy 0, policy_version 105126 (0.0027) [2024-04-26 06:27:53,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 1722433536. Throughput: 0: 56382.8. Samples: 1671899440. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:53,923][47056] Avg episode reward: [(0, '0.595')] [2024-04-26 06:27:55,313][47288] Updated weights for policy 0, policy_version 105136 (0.0028) [2024-04-26 06:27:57,589][47288] Updated weights for policy 0, policy_version 105146 (0.0033) [2024-04-26 06:27:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1722744832. Throughput: 0: 56575.0. Samples: 1672065440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 06:27:58,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 06:28:01,242][47288] Updated weights for policy 0, policy_version 105156 (0.0033) [2024-04-26 06:28:03,408][47288] Updated weights for policy 0, policy_version 105166 (0.0028) [2024-04-26 06:28:03,923][47056] Fps is (10 sec: 62259.1, 60 sec: 56798.0, 300 sec: 56427.7). Total num frames: 1723056128. Throughput: 0: 56505.9. Samples: 1672402100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:03,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 06:28:07,070][47288] Updated weights for policy 0, policy_version 105176 (0.0026) [2024-04-26 06:28:08,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 1723334656. Throughput: 0: 56470.4. Samples: 1672737600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:08,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:28:09,126][47288] Updated weights for policy 0, policy_version 105186 (0.0028) [2024-04-26 06:28:12,810][47288] Updated weights for policy 0, policy_version 105196 (0.0035) [2024-04-26 06:28:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1723596800. Throughput: 0: 56364.0. Samples: 1672911440. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:13,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 06:28:14,950][47288] Updated weights for policy 0, policy_version 105206 (0.0029) [2024-04-26 06:28:18,497][47288] Updated weights for policy 0, policy_version 105216 (0.0030) [2024-04-26 06:28:18,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1723875328. Throughput: 0: 56451.1. Samples: 1673251480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:18,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 06:28:20,929][47288] Updated weights for policy 0, policy_version 105226 (0.0031) [2024-04-26 06:28:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1724153856. Throughput: 0: 56314.2. Samples: 1673585860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:23,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 06:28:24,466][47288] Updated weights for policy 0, policy_version 105236 (0.0036) [2024-04-26 06:28:26,657][47288] Updated weights for policy 0, policy_version 105246 (0.0030) [2024-04-26 06:28:28,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1724432384. Throughput: 0: 56044.5. Samples: 1673744320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:28,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 06:28:30,388][47288] Updated weights for policy 0, policy_version 105256 (0.0026) [2024-04-26 06:28:32,513][47288] Updated weights for policy 0, policy_version 105266 (0.0027) [2024-04-26 06:28:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1724710912. Throughput: 0: 55821.0. Samples: 1674078860. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:33,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 06:28:36,173][47288] Updated weights for policy 0, policy_version 105276 (0.0029) [2024-04-26 06:28:38,329][47288] Updated weights for policy 0, policy_version 105286 (0.0028) [2024-04-26 06:28:38,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1725022208. Throughput: 0: 55927.0. Samples: 1674416160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:38,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 06:28:42,019][47288] Updated weights for policy 0, policy_version 105296 (0.0032) [2024-04-26 06:28:43,923][47056] Fps is (10 sec: 60620.1, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1725317120. Throughput: 0: 56278.1. Samples: 1674597960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:43,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 06:28:43,985][47288] Updated weights for policy 0, policy_version 105306 (0.0023) [2024-04-26 06:28:47,812][47288] Updated weights for policy 0, policy_version 105316 (0.0027) [2024-04-26 06:28:48,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1725579264. Throughput: 0: 56511.6. Samples: 1674945120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:48,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 06:28:49,030][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000105322_1725595648.pth... [2024-04-26 06:28:49,076][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000104495_1712046080.pth [2024-04-26 06:28:49,274][47267] Signal inference workers to stop experience collection... 
(25250 times) [2024-04-26 06:28:49,306][47288] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-04-26 06:28:49,359][47267] Signal inference workers to resume experience collection... (25250 times) [2024-04-26 06:28:49,360][47288] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-04-26 06:28:50,216][47288] Updated weights for policy 0, policy_version 105326 (0.0030) [2024-04-26 06:28:53,549][47288] Updated weights for policy 0, policy_version 105336 (0.0031) [2024-04-26 06:28:53,923][47056] Fps is (10 sec: 52429.7, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 1725841408. Throughput: 0: 56508.4. Samples: 1675280480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:53,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 06:28:55,868][47288] Updated weights for policy 0, policy_version 105346 (0.0033) [2024-04-26 06:28:58,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.8, 300 sec: 56205.4). Total num frames: 1726119936. Throughput: 0: 56405.8. Samples: 1675449700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 06:28:58,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 06:28:59,309][47288] Updated weights for policy 0, policy_version 105356 (0.0026) [2024-04-26 06:29:01,565][47288] Updated weights for policy 0, policy_version 105366 (0.0025) [2024-04-26 06:29:03,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1726414848. Throughput: 0: 56285.1. Samples: 1675784300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:03,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:29:05,129][47288] Updated weights for policy 0, policy_version 105376 (0.0027) [2024-04-26 06:29:07,414][47288] Updated weights for policy 0, policy_version 105386 (0.0030) [2024-04-26 06:29:08,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.4, 300 sec: 56316.5). Total num frames: 1726676992. Throughput: 0: 56384.7. 
Samples: 1676123180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:08,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 06:29:10,919][47288] Updated weights for policy 0, policy_version 105396 (0.0034) [2024-04-26 06:29:13,298][47288] Updated weights for policy 0, policy_version 105406 (0.0025) [2024-04-26 06:29:13,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1726971904. Throughput: 0: 56626.2. Samples: 1676292500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:13,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 06:29:16,689][47288] Updated weights for policy 0, policy_version 105416 (0.0027) [2024-04-26 06:29:18,923][47056] Fps is (10 sec: 60622.0, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1727283200. Throughput: 0: 56730.7. Samples: 1676631740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:18,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 06:29:19,108][47288] Updated weights for policy 0, policy_version 105426 (0.0027) [2024-04-26 06:29:22,476][47288] Updated weights for policy 0, policy_version 105436 (0.0033) [2024-04-26 06:29:23,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 1727561728. Throughput: 0: 56621.8. Samples: 1676964140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:23,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 06:29:24,750][47288] Updated weights for policy 0, policy_version 105446 (0.0030) [2024-04-26 06:29:28,210][47288] Updated weights for policy 0, policy_version 105456 (0.0024) [2024-04-26 06:29:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1727823872. Throughput: 0: 56545.6. Samples: 1677142500. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:28,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 06:29:30,615][47288] Updated weights for policy 0, policy_version 105466 (0.0027) [2024-04-26 06:29:33,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1728102400. Throughput: 0: 56291.5. Samples: 1677478240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:33,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 06:29:34,010][47288] Updated weights for policy 0, policy_version 105476 (0.0027) [2024-04-26 06:29:36,691][47288] Updated weights for policy 0, policy_version 105486 (0.0029) [2024-04-26 06:29:38,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1728380928. Throughput: 0: 56451.4. Samples: 1677820800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:38,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 06:29:39,733][47288] Updated weights for policy 0, policy_version 105496 (0.0030) [2024-04-26 06:29:42,341][47288] Updated weights for policy 0, policy_version 105506 (0.0028) [2024-04-26 06:29:43,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1728692224. Throughput: 0: 56449.4. Samples: 1677989920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:43,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 06:29:45,397][47288] Updated weights for policy 0, policy_version 105516 (0.0029) [2024-04-26 06:29:48,084][47288] Updated weights for policy 0, policy_version 105526 (0.0030) [2024-04-26 06:29:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1728954368. Throughput: 0: 56547.7. Samples: 1678328960. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:48,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 06:29:51,253][47288] Updated weights for policy 0, policy_version 105536 (0.0037) [2024-04-26 06:29:51,988][47267] Signal inference workers to stop experience collection... (25300 times) [2024-04-26 06:29:51,988][47267] Signal inference workers to resume experience collection... (25300 times) [2024-04-26 06:29:52,014][47288] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-04-26 06:29:52,014][47288] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-04-26 06:29:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1729249280. Throughput: 0: 56468.3. Samples: 1678664240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:53,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 06:29:54,268][47288] Updated weights for policy 0, policy_version 105546 (0.0027) [2024-04-26 06:29:57,057][47288] Updated weights for policy 0, policy_version 105556 (0.0030) [2024-04-26 06:29:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1729527808. Throughput: 0: 56685.7. Samples: 1678843360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:29:58,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 06:29:59,980][47288] Updated weights for policy 0, policy_version 105566 (0.0035) [2024-04-26 06:30:02,873][47288] Updated weights for policy 0, policy_version 105576 (0.0034) [2024-04-26 06:30:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56483.2). Total num frames: 1729822720. Throughput: 0: 56544.5. Samples: 1679176240. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-04-26 06:30:03,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:30:05,810][47288] Updated weights for policy 0, policy_version 105586 (0.0028) [2024-04-26 06:30:08,629][47288] Updated weights for policy 0, policy_version 105596 (0.0029) [2024-04-26 06:30:08,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1730084864. Throughput: 0: 56720.7. Samples: 1679516580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:08,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 06:30:11,615][47288] Updated weights for policy 0, policy_version 105606 (0.0031) [2024-04-26 06:30:13,923][47056] Fps is (10 sec: 52427.8, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1730347008. Throughput: 0: 56432.2. Samples: 1679681960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:13,924][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 06:30:14,582][47288] Updated weights for policy 0, policy_version 105616 (0.0026) [2024-04-26 06:30:17,427][47288] Updated weights for policy 0, policy_version 105626 (0.0032) [2024-04-26 06:30:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1730658304. Throughput: 0: 56462.1. Samples: 1680019040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:18,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 06:30:20,499][47288] Updated weights for policy 0, policy_version 105636 (0.0032) [2024-04-26 06:30:23,187][47288] Updated weights for policy 0, policy_version 105646 (0.0032) [2024-04-26 06:30:23,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1730936832. Throughput: 0: 56415.5. Samples: 1680359500. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:23,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:30:26,262][47288] Updated weights for policy 0, policy_version 105656 (0.0030) [2024-04-26 06:30:28,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.6, 300 sec: 56372.0). Total num frames: 1731215360. Throughput: 0: 56421.1. Samples: 1680528880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:28,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 06:30:28,981][47288] Updated weights for policy 0, policy_version 105666 (0.0028) [2024-04-26 06:30:32,065][47288] Updated weights for policy 0, policy_version 105676 (0.0024) [2024-04-26 06:30:33,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1731493888. Throughput: 0: 56317.8. Samples: 1680863260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:33,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:30:34,953][47288] Updated weights for policy 0, policy_version 105686 (0.0027) [2024-04-26 06:30:37,971][47288] Updated weights for policy 0, policy_version 105696 (0.0027) [2024-04-26 06:30:38,923][47056] Fps is (10 sec: 55707.0, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1731772416. Throughput: 0: 56301.8. Samples: 1681197820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:38,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 06:30:40,757][47288] Updated weights for policy 0, policy_version 105706 (0.0027) [2024-04-26 06:30:43,788][47288] Updated weights for policy 0, policy_version 105716 (0.0035) [2024-04-26 06:30:43,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1732050944. Throughput: 0: 56150.8. Samples: 1681370140. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:43,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 06:30:46,551][47288] Updated weights for policy 0, policy_version 105726 (0.0027) [2024-04-26 06:30:48,923][47056] Fps is (10 sec: 57342.0, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 1732345856. Throughput: 0: 56333.9. Samples: 1681711280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:48,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 06:30:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000105734_1732345856.pth... [2024-04-26 06:30:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000104909_1718829056.pth [2024-04-26 06:30:49,569][47288] Updated weights for policy 0, policy_version 105736 (0.0027) [2024-04-26 06:30:52,250][47288] Updated weights for policy 0, policy_version 105746 (0.0034) [2024-04-26 06:30:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1732608000. Throughput: 0: 56204.3. Samples: 1682045760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:53,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 06:30:55,373][47288] Updated weights for policy 0, policy_version 105756 (0.0035) [2024-04-26 06:30:57,956][47288] Updated weights for policy 0, policy_version 105766 (0.0035) [2024-04-26 06:30:58,923][47056] Fps is (10 sec: 54068.7, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1732886528. Throughput: 0: 56150.0. Samples: 1682208700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:30:58,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 06:31:01,297][47288] Updated weights for policy 0, policy_version 105776 (0.0035) [2024-04-26 06:31:03,800][47288] Updated weights for policy 0, policy_version 105786 (0.0029) [2024-04-26 06:31:03,923][47056] Fps is (10 sec: 58981.1, 60 sec: 56251.6, 300 sec: 56427.6). 
Total num frames: 1733197824. Throughput: 0: 56217.3. Samples: 1682548820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 06:31:03,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 06:31:06,986][47288] Updated weights for policy 0, policy_version 105796 (0.0031) [2024-04-26 06:31:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.9, 300 sec: 56316.5). Total num frames: 1733459968. Throughput: 0: 56176.1. Samples: 1682887420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:08,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 06:31:09,632][47288] Updated weights for policy 0, policy_version 105806 (0.0031) [2024-04-26 06:31:12,595][47267] Signal inference workers to stop experience collection... (25350 times) [2024-04-26 06:31:12,618][47288] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-04-26 06:31:12,655][47267] Signal inference workers to resume experience collection... (25350 times) [2024-04-26 06:31:12,656][47288] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-04-26 06:31:12,761][47288] Updated weights for policy 0, policy_version 105816 (0.0029) [2024-04-26 06:31:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1733738496. Throughput: 0: 56106.3. Samples: 1683053660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:13,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 06:31:15,382][47288] Updated weights for policy 0, policy_version 105826 (0.0025) [2024-04-26 06:31:18,601][47288] Updated weights for policy 0, policy_version 105836 (0.0027) [2024-04-26 06:31:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 1734017024. Throughput: 0: 56176.1. Samples: 1683391180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:18,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 06:31:21,530][47288] Updated weights for policy 0, policy_version 105846 (0.0031) [2024-04-26 06:31:23,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 1734311936. Throughput: 0: 56318.5. Samples: 1683732160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:23,923][47056] Avg episode reward: [(0, '0.403')] [2024-04-26 06:31:24,457][47288] Updated weights for policy 0, policy_version 105856 (0.0026) [2024-04-26 06:31:27,298][47288] Updated weights for policy 0, policy_version 105866 (0.0025) [2024-04-26 06:31:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1734590464. Throughput: 0: 56202.2. Samples: 1683899240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:28,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 06:31:30,146][47288] Updated weights for policy 0, policy_version 105876 (0.0032) [2024-04-26 06:31:33,080][47288] Updated weights for policy 0, policy_version 105886 (0.0029) [2024-04-26 06:31:33,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 56372.0). Total num frames: 1734852608. Throughput: 0: 56116.1. Samples: 1684236500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:33,923][47056] Avg episode reward: [(0, '0.406')] [2024-04-26 06:31:35,976][47288] Updated weights for policy 0, policy_version 105896 (0.0028) [2024-04-26 06:31:38,897][47288] Updated weights for policy 0, policy_version 105906 (0.0033) [2024-04-26 06:31:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1735163904. Throughput: 0: 56254.0. Samples: 1684577200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:38,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 06:31:41,875][47288] Updated weights for policy 0, policy_version 105916 (0.0027) [2024-04-26 06:31:43,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1735442432. Throughput: 0: 56404.8. Samples: 1684746920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:43,924][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 06:31:44,642][47288] Updated weights for policy 0, policy_version 105926 (0.0027) [2024-04-26 06:31:47,763][47288] Updated weights for policy 0, policy_version 105936 (0.0031) [2024-04-26 06:31:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1735720960. Throughput: 0: 56411.7. Samples: 1685087340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:48,923][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 06:31:50,581][47288] Updated weights for policy 0, policy_version 105946 (0.0027) [2024-04-26 06:31:53,564][47288] Updated weights for policy 0, policy_version 105956 (0.0033) [2024-04-26 06:31:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1735999488. Throughput: 0: 56372.9. Samples: 1685424200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:53,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 06:31:56,362][47288] Updated weights for policy 0, policy_version 105966 (0.0029) [2024-04-26 06:31:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1736278016. Throughput: 0: 56311.6. Samples: 1685587680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:31:58,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 06:31:59,445][47288] Updated weights for policy 0, policy_version 105976 (0.0024) [2024-04-26 06:32:02,107][47288] Updated weights for policy 0, policy_version 105986 (0.0028) [2024-04-26 06:32:03,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1736556544. Throughput: 0: 56415.2. Samples: 1685929860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 06:32:03,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 06:32:05,256][47288] Updated weights for policy 0, policy_version 105996 (0.0025) [2024-04-26 06:32:08,032][47288] Updated weights for policy 0, policy_version 106006 (0.0032) [2024-04-26 06:32:08,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1736835072. Throughput: 0: 56330.7. Samples: 1686267040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:08,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 06:32:10,996][47288] Updated weights for policy 0, policy_version 106016 (0.0027) [2024-04-26 06:32:13,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1737113600. Throughput: 0: 56382.2. Samples: 1686436440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:13,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 06:32:13,946][47288] Updated weights for policy 0, policy_version 106026 (0.0026) [2024-04-26 06:32:16,778][47288] Updated weights for policy 0, policy_version 106036 (0.0027) [2024-04-26 06:32:18,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1737424896. Throughput: 0: 56580.1. Samples: 1686782600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:18,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 06:32:19,638][47288] Updated weights for policy 0, policy_version 106046 (0.0030) [2024-04-26 06:32:22,373][47267] Signal inference workers to stop experience collection... (25400 times) [2024-04-26 06:32:22,405][47288] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-04-26 06:32:22,459][47267] Signal inference workers to resume experience collection... (25400 times) [2024-04-26 06:32:22,459][47288] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-04-26 06:32:22,595][47288] Updated weights for policy 0, policy_version 106056 (0.0024) [2024-04-26 06:32:23,923][47056] Fps is (10 sec: 60620.2, 60 sec: 56797.8, 300 sec: 56372.0). Total num frames: 1737719808. Throughput: 0: 56454.1. Samples: 1687117640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:23,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 06:32:25,441][47288] Updated weights for policy 0, policy_version 106066 (0.0030) [2024-04-26 06:32:28,254][47288] Updated weights for policy 0, policy_version 106076 (0.0033) [2024-04-26 06:32:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1737981952. Throughput: 0: 56500.4. Samples: 1687289440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:28,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 06:32:31,311][47288] Updated weights for policy 0, policy_version 106086 (0.0028) [2024-04-26 06:32:33,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1738260480. Throughput: 0: 56435.5. Samples: 1687626940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:33,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 06:32:34,080][47288] Updated weights for policy 0, policy_version 106096 (0.0030) [2024-04-26 06:32:37,159][47288] Updated weights for policy 0, policy_version 106106 (0.0030) [2024-04-26 06:32:38,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1738522624. Throughput: 0: 56545.2. Samples: 1687968740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:38,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 06:32:39,875][47288] Updated weights for policy 0, policy_version 106116 (0.0026) [2024-04-26 06:32:42,992][47288] Updated weights for policy 0, policy_version 106126 (0.0029) [2024-04-26 06:32:43,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1738817536. Throughput: 0: 56547.7. Samples: 1688132320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:43,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 06:32:45,748][47288] Updated weights for policy 0, policy_version 106136 (0.0026) [2024-04-26 06:32:48,638][47288] Updated weights for policy 0, policy_version 106146 (0.0032) [2024-04-26 06:32:48,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 1739096064. Throughput: 0: 56463.1. Samples: 1688470700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:48,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:32:48,998][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000106147_1739112448.pth... [2024-04-26 06:32:49,045][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000105322_1725595648.pth [2024-04-26 06:32:51,518][47288] Updated weights for policy 0, policy_version 106156 (0.0035) [2024-04-26 06:32:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56427.6). 
Total num frames: 1739390976. Throughput: 0: 56588.9. Samples: 1688813540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:53,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 06:32:54,467][47288] Updated weights for policy 0, policy_version 106166 (0.0030) [2024-04-26 06:32:57,321][47288] Updated weights for policy 0, policy_version 106176 (0.0035) [2024-04-26 06:32:58,923][47056] Fps is (10 sec: 58981.4, 60 sec: 56797.9, 300 sec: 56372.0). Total num frames: 1739685888. Throughput: 0: 56763.4. Samples: 1688990800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:32:58,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 06:33:00,323][47288] Updated weights for policy 0, policy_version 106186 (0.0027) [2024-04-26 06:33:02,977][47288] Updated weights for policy 0, policy_version 106196 (0.0029) [2024-04-26 06:33:03,923][47056] Fps is (10 sec: 58981.8, 60 sec: 57070.8, 300 sec: 56427.6). Total num frames: 1739980800. Throughput: 0: 56539.1. Samples: 1689326860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 06:33:03,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 06:33:06,148][47288] Updated weights for policy 0, policy_version 106206 (0.0028) [2024-04-26 06:33:08,840][47288] Updated weights for policy 0, policy_version 106216 (0.0030) [2024-04-26 06:33:08,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 1740242944. Throughput: 0: 56648.9. Samples: 1689666840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:08,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 06:33:12,003][47288] Updated weights for policy 0, policy_version 106226 (0.0028) [2024-04-26 06:33:13,923][47056] Fps is (10 sec: 52428.9, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1740505088. Throughput: 0: 56462.7. Samples: 1689830260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:13,924][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 06:33:14,609][47288] Updated weights for policy 0, policy_version 106236 (0.0032) [2024-04-26 06:33:17,793][47288] Updated weights for policy 0, policy_version 106246 (0.0031) [2024-04-26 06:33:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1740800000. Throughput: 0: 56580.8. Samples: 1690173080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:18,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 06:33:20,423][47288] Updated weights for policy 0, policy_version 106256 (0.0028) [2024-04-26 06:33:23,609][47288] Updated weights for policy 0, policy_version 106266 (0.0029) [2024-04-26 06:33:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1741078528. Throughput: 0: 56530.7. Samples: 1690512620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:23,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:33:26,370][47288] Updated weights for policy 0, policy_version 106276 (0.0027) [2024-04-26 06:33:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1741357056. Throughput: 0: 56519.9. Samples: 1690675720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:28,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 06:33:29,527][47288] Updated weights for policy 0, policy_version 106286 (0.0029) [2024-04-26 06:33:32,013][47288] Updated weights for policy 0, policy_version 106296 (0.0026) [2024-04-26 06:33:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1741651968. Throughput: 0: 56480.7. Samples: 1691012340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:33,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 06:33:35,323][47288] Updated weights for policy 0, policy_version 106306 (0.0032) [2024-04-26 06:33:38,074][47288] Updated weights for policy 0, policy_version 106316 (0.0029) [2024-04-26 06:33:38,923][47056] Fps is (10 sec: 58982.8, 60 sec: 57071.0, 300 sec: 56372.1). Total num frames: 1741946880. Throughput: 0: 56310.7. Samples: 1691347520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:38,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:33:40,957][47288] Updated weights for policy 0, policy_version 106326 (0.0028) [2024-04-26 06:33:41,457][47267] Signal inference workers to stop experience collection... (25450 times) [2024-04-26 06:33:41,457][47267] Signal inference workers to resume experience collection... (25450 times) [2024-04-26 06:33:41,472][47288] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-04-26 06:33:41,477][47288] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-04-26 06:33:43,806][47288] Updated weights for policy 0, policy_version 106336 (0.0028) [2024-04-26 06:33:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1742209024. Throughput: 0: 56333.4. Samples: 1691525800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:43,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 06:33:46,778][47288] Updated weights for policy 0, policy_version 106346 (0.0027) [2024-04-26 06:33:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 1742503936. Throughput: 0: 56342.2. Samples: 1691862260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:48,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:33:49,473][47288] Updated weights for policy 0, policy_version 106356 (0.0029) [2024-04-26 06:33:52,660][47288] Updated weights for policy 0, policy_version 106366 (0.0031) [2024-04-26 06:33:53,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1742766080. Throughput: 0: 56360.2. Samples: 1692203040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:53,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 06:33:55,085][47288] Updated weights for policy 0, policy_version 106376 (0.0033) [2024-04-26 06:33:58,402][47288] Updated weights for policy 0, policy_version 106386 (0.0035) [2024-04-26 06:33:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1743060992. Throughput: 0: 56354.2. Samples: 1692366200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:33:58,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 06:34:00,977][47288] Updated weights for policy 0, policy_version 106396 (0.0039) [2024-04-26 06:34:03,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 56427.7). Total num frames: 1743323136. Throughput: 0: 56288.2. Samples: 1692706040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 06:34:03,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 06:34:04,366][47288] Updated weights for policy 0, policy_version 106406 (0.0033) [2024-04-26 06:34:06,798][47288] Updated weights for policy 0, policy_version 106416 (0.0028) [2024-04-26 06:34:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1743618048. Throughput: 0: 56338.2. Samples: 1693047840. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:08,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 06:34:10,188][47288] Updated weights for policy 0, policy_version 106426 (0.0031) [2024-04-26 06:34:12,482][47288] Updated weights for policy 0, policy_version 106436 (0.0025) [2024-04-26 06:34:13,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56798.0, 300 sec: 56372.1). Total num frames: 1743912960. Throughput: 0: 56494.3. Samples: 1693217960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:13,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 06:34:15,875][47288] Updated weights for policy 0, policy_version 106446 (0.0034) [2024-04-26 06:34:18,317][47288] Updated weights for policy 0, policy_version 106456 (0.0029) [2024-04-26 06:34:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1744191488. Throughput: 0: 56541.8. Samples: 1693556720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:18,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 06:34:21,625][47288] Updated weights for policy 0, policy_version 106466 (0.0030) [2024-04-26 06:34:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1744470016. Throughput: 0: 56458.2. Samples: 1693888140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:23,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 06:34:24,173][47288] Updated weights for policy 0, policy_version 106476 (0.0031) [2024-04-26 06:34:27,418][47288] Updated weights for policy 0, policy_version 106486 (0.0034) [2024-04-26 06:34:28,923][47056] Fps is (10 sec: 57342.8, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 1744764928. Throughput: 0: 56454.4. Samples: 1694066260. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:28,924][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 06:34:30,084][47288] Updated weights for policy 0, policy_version 106496 (0.0029) [2024-04-26 06:34:33,272][47288] Updated weights for policy 0, policy_version 106506 (0.0026) [2024-04-26 06:34:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1745027072. Throughput: 0: 56450.2. Samples: 1694402520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:33,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 06:34:35,702][47288] Updated weights for policy 0, policy_version 106516 (0.0028) [2024-04-26 06:34:38,923][47056] Fps is (10 sec: 54068.4, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1745305600. Throughput: 0: 56625.6. Samples: 1694751200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:38,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 06:34:39,048][47288] Updated weights for policy 0, policy_version 106526 (0.0034) [2024-04-26 06:34:41,332][47288] Updated weights for policy 0, policy_version 106536 (0.0025) [2024-04-26 06:34:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1745584128. Throughput: 0: 56411.1. Samples: 1694904700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:43,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 06:34:44,899][47288] Updated weights for policy 0, policy_version 106546 (0.0027) [2024-04-26 06:34:47,178][47288] Updated weights for policy 0, policy_version 106556 (0.0028) [2024-04-26 06:34:48,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1745895424. Throughput: 0: 56490.0. Samples: 1695248100. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:48,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 06:34:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000106561_1745895424.pth... [2024-04-26 06:34:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000105734_1732345856.pth [2024-04-26 06:34:50,435][47267] Signal inference workers to stop experience collection... (25500 times) [2024-04-26 06:34:50,444][47288] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-04-26 06:34:50,526][47267] Signal inference workers to resume experience collection... (25500 times) [2024-04-26 06:34:50,527][47288] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-04-26 06:34:50,638][47288] Updated weights for policy 0, policy_version 106566 (0.0025) [2024-04-26 06:34:53,083][47288] Updated weights for policy 0, policy_version 106576 (0.0030) [2024-04-26 06:34:53,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 1746173952. Throughput: 0: 56477.8. Samples: 1695589340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:53,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 06:34:56,302][47288] Updated weights for policy 0, policy_version 106586 (0.0032) [2024-04-26 06:34:58,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1746452480. Throughput: 0: 56489.0. Samples: 1695759980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:34:58,924][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:34:59,157][47288] Updated weights for policy 0, policy_version 106596 (0.0027) [2024-04-26 06:35:02,033][47288] Updated weights for policy 0, policy_version 106606 (0.0025) [2024-04-26 06:35:03,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57070.9, 300 sec: 56483.2). Total num frames: 1746747392. Throughput: 0: 56543.6. 
Samples: 1696101180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-04-26 06:35:03,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 06:35:05,079][47288] Updated weights for policy 0, policy_version 106616 (0.0028) [2024-04-26 06:35:07,887][47288] Updated weights for policy 0, policy_version 106626 (0.0036) [2024-04-26 06:35:08,923][47056] Fps is (10 sec: 55707.2, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1747009536. Throughput: 0: 56653.9. Samples: 1696437560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:08,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 06:35:10,856][47288] Updated weights for policy 0, policy_version 106636 (0.0029) [2024-04-26 06:35:13,599][47288] Updated weights for policy 0, policy_version 106646 (0.0029) [2024-04-26 06:35:13,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1747304448. Throughput: 0: 56629.8. Samples: 1696614580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:13,923][47056] Avg episode reward: [(0, '0.614')] [2024-04-26 06:35:16,528][47288] Updated weights for policy 0, policy_version 106656 (0.0030) [2024-04-26 06:35:18,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1747550208. Throughput: 0: 56625.8. Samples: 1696950680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:18,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 06:35:19,560][47288] Updated weights for policy 0, policy_version 106666 (0.0032) [2024-04-26 06:35:22,200][47288] Updated weights for policy 0, policy_version 106676 (0.0031) [2024-04-26 06:35:23,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1747861504. Throughput: 0: 56367.6. Samples: 1697287740. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:23,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 06:35:25,414][47288] Updated weights for policy 0, policy_version 106686 (0.0027) [2024-04-26 06:35:28,060][47288] Updated weights for policy 0, policy_version 106696 (0.0027) [2024-04-26 06:35:28,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1748123648. Throughput: 0: 56635.1. Samples: 1697453280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:28,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 06:35:31,031][47288] Updated weights for policy 0, policy_version 106706 (0.0026) [2024-04-26 06:35:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1748418560. Throughput: 0: 56453.9. Samples: 1697788520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:33,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 06:35:33,957][47288] Updated weights for policy 0, policy_version 106716 (0.0026) [2024-04-26 06:35:36,738][47288] Updated weights for policy 0, policy_version 106726 (0.0035) [2024-04-26 06:35:38,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 1748713472. Throughput: 0: 56446.8. Samples: 1698129440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:38,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 06:35:39,733][47288] Updated weights for policy 0, policy_version 106736 (0.0031) [2024-04-26 06:35:42,462][47288] Updated weights for policy 0, policy_version 106746 (0.0028) [2024-04-26 06:35:43,923][47056] Fps is (10 sec: 58982.4, 60 sec: 57071.0, 300 sec: 56483.2). Total num frames: 1749008384. Throughput: 0: 56615.9. Samples: 1698307680. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:43,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 06:35:45,631][47288] Updated weights for policy 0, policy_version 106756 (0.0029) [2024-04-26 06:35:48,257][47288] Updated weights for policy 0, policy_version 106766 (0.0026) [2024-04-26 06:35:48,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1749286912. Throughput: 0: 56578.6. Samples: 1698647220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:48,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:35:51,405][47288] Updated weights for policy 0, policy_version 106776 (0.0025) [2024-04-26 06:35:53,600][47267] Signal inference workers to stop experience collection... (25550 times) [2024-04-26 06:35:53,600][47267] Signal inference workers to resume experience collection... (25550 times) [2024-04-26 06:35:53,618][47288] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-04-26 06:35:53,618][47288] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-04-26 06:35:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1749565440. Throughput: 0: 56626.6. Samples: 1698985760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:53,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 06:35:53,975][47288] Updated weights for policy 0, policy_version 106786 (0.0027) [2024-04-26 06:35:57,205][47288] Updated weights for policy 0, policy_version 106796 (0.0025) [2024-04-26 06:35:58,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55978.9, 300 sec: 56316.6). Total num frames: 1749811200. Throughput: 0: 56295.0. Samples: 1699147860. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:35:58,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 06:35:59,708][47288] Updated weights for policy 0, policy_version 106806 (0.0028) [2024-04-26 06:36:02,924][47288] Updated weights for policy 0, policy_version 106816 (0.0027) [2024-04-26 06:36:03,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.5, 300 sec: 56372.1). Total num frames: 1750089728. Throughput: 0: 56424.9. Samples: 1699489800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 06:36:03,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 06:36:05,625][47288] Updated weights for policy 0, policy_version 106826 (0.0024) [2024-04-26 06:36:08,772][47288] Updated weights for policy 0, policy_version 106836 (0.0034) [2024-04-26 06:36:08,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1750401024. Throughput: 0: 56491.2. Samples: 1699829840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:08,923][47056] Avg episode reward: [(0, '0.399')] [2024-04-26 06:36:11,370][47288] Updated weights for policy 0, policy_version 106846 (0.0031) [2024-04-26 06:36:13,923][47056] Fps is (10 sec: 60620.2, 60 sec: 56524.5, 300 sec: 56538.7). Total num frames: 1750695936. Throughput: 0: 56395.5. Samples: 1699991080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:13,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 06:36:14,617][47288] Updated weights for policy 0, policy_version 106856 (0.0029) [2024-04-26 06:36:17,185][47288] Updated weights for policy 0, policy_version 106866 (0.0031) [2024-04-26 06:36:18,923][47056] Fps is (10 sec: 57343.2, 60 sec: 57070.9, 300 sec: 56483.1). Total num frames: 1750974464. Throughput: 0: 56428.7. Samples: 1700327820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:18,923][47056] Avg episode reward: [(0, '0.585')] [2024-04-26 06:36:20,394][47288] Updated weights for policy 0, policy_version 106876 (0.0025) [2024-04-26 06:36:23,049][47288] Updated weights for policy 0, policy_version 106886 (0.0025) [2024-04-26 06:36:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1751252992. Throughput: 0: 56252.3. Samples: 1700660800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:23,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 06:36:26,451][47288] Updated weights for policy 0, policy_version 106896 (0.0031) [2024-04-26 06:36:28,816][47288] Updated weights for policy 0, policy_version 106906 (0.0033) [2024-04-26 06:36:28,923][47056] Fps is (10 sec: 57344.8, 60 sec: 57071.1, 300 sec: 56594.3). Total num frames: 1751547904. Throughput: 0: 56194.2. Samples: 1700836420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:28,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 06:36:32,551][47288] Updated weights for policy 0, policy_version 106916 (0.0028) [2024-04-26 06:36:33,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1751793664. Throughput: 0: 56181.4. Samples: 1701175380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:33,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 06:36:34,636][47288] Updated weights for policy 0, policy_version 106926 (0.0030) [2024-04-26 06:36:38,419][47288] Updated weights for policy 0, policy_version 106936 (0.0032) [2024-04-26 06:36:38,923][47056] Fps is (10 sec: 50789.2, 60 sec: 55705.4, 300 sec: 56316.5). Total num frames: 1752055808. Throughput: 0: 56172.6. Samples: 1701513540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:38,924][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 06:36:40,512][47288] Updated weights for policy 0, policy_version 106946 (0.0030) [2024-04-26 06:36:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 56372.1). Total num frames: 1752350720. Throughput: 0: 56051.0. Samples: 1701670160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:43,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 06:36:44,339][47288] Updated weights for policy 0, policy_version 106956 (0.0029) [2024-04-26 06:36:46,546][47288] Updated weights for policy 0, policy_version 106966 (0.0028) [2024-04-26 06:36:48,923][47056] Fps is (10 sec: 57345.4, 60 sec: 55705.7, 300 sec: 56372.1). Total num frames: 1752629248. Throughput: 0: 55916.6. Samples: 1702006040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:48,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 06:36:48,997][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000106973_1752645632.pth... [2024-04-26 06:36:49,046][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000106147_1739112448.pth [2024-04-26 06:36:50,205][47288] Updated weights for policy 0, policy_version 106976 (0.0031) [2024-04-26 06:36:50,677][47267] Signal inference workers to stop experience collection... (25600 times) [2024-04-26 06:36:50,708][47288] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-04-26 06:36:50,734][47267] Signal inference workers to resume experience collection... (25600 times) [2024-04-26 06:36:50,735][47288] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-04-26 06:36:52,514][47288] Updated weights for policy 0, policy_version 106986 (0.0029) [2024-04-26 06:36:53,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1752940544. Throughput: 0: 55848.8. 
Samples: 1702343040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:53,924][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:36:55,858][47288] Updated weights for policy 0, policy_version 106996 (0.0029) [2024-04-26 06:36:58,180][47288] Updated weights for policy 0, policy_version 107006 (0.0031) [2024-04-26 06:36:58,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1753202688. Throughput: 0: 56317.4. Samples: 1702525360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:36:58,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 06:37:01,544][47288] Updated weights for policy 0, policy_version 107016 (0.0030) [2024-04-26 06:37:03,901][47288] Updated weights for policy 0, policy_version 107026 (0.0037) [2024-04-26 06:37:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1753513984. Throughput: 0: 56238.8. Samples: 1702858560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:37:03,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 06:37:07,438][47288] Updated weights for policy 0, policy_version 107036 (0.0026) [2024-04-26 06:37:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.5, 300 sec: 56483.1). Total num frames: 1753776128. Throughput: 0: 56314.2. Samples: 1703194940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 06:37:08,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 06:37:09,758][47288] Updated weights for policy 0, policy_version 107046 (0.0027) [2024-04-26 06:37:13,340][47288] Updated weights for policy 0, policy_version 107056 (0.0027) [2024-04-26 06:37:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.9, 300 sec: 56372.1). Total num frames: 1754054656. Throughput: 0: 56177.8. Samples: 1703364420. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:13,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 06:37:15,521][47288] Updated weights for policy 0, policy_version 107066 (0.0029) [2024-04-26 06:37:18,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55432.6, 300 sec: 56205.5). Total num frames: 1754300416. Throughput: 0: 56099.1. Samples: 1703699840. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:18,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 06:37:19,258][47288] Updated weights for policy 0, policy_version 107076 (0.0030) [2024-04-26 06:37:21,337][47288] Updated weights for policy 0, policy_version 107086 (0.0026) [2024-04-26 06:37:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.8, 300 sec: 56316.6). Total num frames: 1754595328. Throughput: 0: 56088.8. Samples: 1704037520. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:23,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 06:37:24,950][47288] Updated weights for policy 0, policy_version 107096 (0.0031) [2024-04-26 06:37:27,045][47288] Updated weights for policy 0, policy_version 107106 (0.0028) [2024-04-26 06:37:28,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55705.6, 300 sec: 56372.1). Total num frames: 1754890240. Throughput: 0: 56270.3. Samples: 1704202320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:28,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 06:37:30,693][47288] Updated weights for policy 0, policy_version 107116 (0.0028) [2024-04-26 06:37:33,088][47288] Updated weights for policy 0, policy_version 107126 (0.0030) [2024-04-26 06:37:33,923][47056] Fps is (10 sec: 58981.4, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1755185152. Throughput: 0: 56304.8. Samples: 1704539760. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:33,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 06:37:36,564][47288] Updated weights for policy 0, policy_version 107136 (0.0024) [2024-04-26 06:37:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56798.1, 300 sec: 56427.6). Total num frames: 1755463680. Throughput: 0: 56312.1. Samples: 1704877080. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:38,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:37:39,139][47288] Updated weights for policy 0, policy_version 107146 (0.0026) [2024-04-26 06:37:42,379][47288] Updated weights for policy 0, policy_version 107156 (0.0031) [2024-04-26 06:37:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1755742208. Throughput: 0: 56041.4. Samples: 1705047220. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:43,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 06:37:45,054][47288] Updated weights for policy 0, policy_version 107166 (0.0030) [2024-04-26 06:37:48,051][47288] Updated weights for policy 0, policy_version 107176 (0.0028) [2024-04-26 06:37:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1756020736. Throughput: 0: 56226.2. Samples: 1705388740. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:48,923][47056] Avg episode reward: [(0, '0.587')] [2024-04-26 06:37:50,760][47288] Updated weights for policy 0, policy_version 107186 (0.0028) [2024-04-26 06:37:53,797][47288] Updated weights for policy 0, policy_version 107196 (0.0029) [2024-04-26 06:37:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1756299264. Throughput: 0: 56247.7. Samples: 1705726080. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:53,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 06:37:56,447][47288] Updated weights for policy 0, policy_version 107206 (0.0028) [2024-04-26 06:37:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 1756561408. Throughput: 0: 56065.6. Samples: 1705887380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:37:58,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 06:37:59,550][47267] Signal inference workers to stop experience collection... (25650 times) [2024-04-26 06:37:59,581][47288] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-04-26 06:37:59,605][47267] Signal inference workers to resume experience collection... (25650 times) [2024-04-26 06:37:59,605][47288] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-04-26 06:37:59,714][47288] Updated weights for policy 0, policy_version 107216 (0.0031) [2024-04-26 06:38:02,200][47288] Updated weights for policy 0, policy_version 107226 (0.0029) [2024-04-26 06:38:03,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1756839936. Throughput: 0: 56165.1. Samples: 1706227260. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:38:03,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:38:05,598][47288] Updated weights for policy 0, policy_version 107236 (0.0033) [2024-04-26 06:38:08,007][47288] Updated weights for policy 0, policy_version 107246 (0.0030) [2024-04-26 06:38:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1757134848. Throughput: 0: 56153.7. Samples: 1706564440. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-26 06:38:08,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 06:38:11,387][47288] Updated weights for policy 0, policy_version 107256 (0.0029) [2024-04-26 06:38:13,749][47288] Updated weights for policy 0, policy_version 107266 (0.0027) [2024-04-26 06:38:13,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1757446144. Throughput: 0: 56384.9. Samples: 1706739640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:13,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 06:38:17,059][47288] Updated weights for policy 0, policy_version 107276 (0.0027) [2024-04-26 06:38:18,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1757708288. Throughput: 0: 56321.5. Samples: 1707074220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:18,923][47056] Avg episode reward: [(0, '0.607')] [2024-04-26 06:38:19,643][47288] Updated weights for policy 0, policy_version 107286 (0.0026) [2024-04-26 06:38:22,803][47288] Updated weights for policy 0, policy_version 107296 (0.0024) [2024-04-26 06:38:23,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1757986816. Throughput: 0: 56376.9. Samples: 1707414040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:23,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 06:38:25,598][47288] Updated weights for policy 0, policy_version 107306 (0.0026) [2024-04-26 06:38:28,557][47288] Updated weights for policy 0, policy_version 107316 (0.0027) [2024-04-26 06:38:28,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1758281728. Throughput: 0: 56480.2. Samples: 1707588820. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:28,923][47056] Avg episode reward: [(0, '0.377')] [2024-04-26 06:38:31,287][47288] Updated weights for policy 0, policy_version 107326 (0.0026) [2024-04-26 06:38:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1758560256. Throughput: 0: 56499.6. Samples: 1707931220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:33,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 06:38:34,324][47288] Updated weights for policy 0, policy_version 107336 (0.0027) [2024-04-26 06:38:36,934][47288] Updated weights for policy 0, policy_version 107346 (0.0032) [2024-04-26 06:38:38,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1758822400. Throughput: 0: 56588.1. Samples: 1708272540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:38,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 06:38:40,023][47288] Updated weights for policy 0, policy_version 107356 (0.0030) [2024-04-26 06:38:42,762][47288] Updated weights for policy 0, policy_version 107366 (0.0026) [2024-04-26 06:38:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1759100928. Throughput: 0: 56505.8. Samples: 1708430140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:43,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 06:38:45,922][47288] Updated weights for policy 0, policy_version 107376 (0.0026) [2024-04-26 06:38:48,470][47288] Updated weights for policy 0, policy_version 107386 (0.0035) [2024-04-26 06:38:48,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1759412224. Throughput: 0: 56480.2. Samples: 1708768880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:48,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 06:38:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000107386_1759412224.pth... [2024-04-26 06:38:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000106561_1745895424.pth [2024-04-26 06:38:51,767][47288] Updated weights for policy 0, policy_version 107396 (0.0024) [2024-04-26 06:38:53,923][47056] Fps is (10 sec: 60621.3, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1759707136. Throughput: 0: 56522.7. Samples: 1709107960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:53,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 06:38:54,201][47288] Updated weights for policy 0, policy_version 107406 (0.0029) [2024-04-26 06:38:57,445][47288] Updated weights for policy 0, policy_version 107416 (0.0027) [2024-04-26 06:38:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 1759985664. Throughput: 0: 56612.4. Samples: 1709287200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:38:58,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 06:39:00,356][47288] Updated weights for policy 0, policy_version 107426 (0.0027) [2024-04-26 06:39:03,231][47288] Updated weights for policy 0, policy_version 107436 (0.0025) [2024-04-26 06:39:03,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1760247808. Throughput: 0: 56657.4. Samples: 1709623800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:39:03,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:39:06,409][47288] Updated weights for policy 0, policy_version 107446 (0.0033) [2024-04-26 06:39:08,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56797.7, 300 sec: 56372.0). Total num frames: 1760542720. Throughput: 0: 56580.2. Samples: 1709960160. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 06:39:08,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 06:39:09,061][47288] Updated weights for policy 0, policy_version 107456 (0.0034) [2024-04-26 06:39:12,280][47288] Updated weights for policy 0, policy_version 107466 (0.0030) [2024-04-26 06:39:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1760821248. Throughput: 0: 56421.3. Samples: 1710127780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:13,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 06:39:14,980][47288] Updated weights for policy 0, policy_version 107476 (0.0030) [2024-04-26 06:39:18,064][47288] Updated weights for policy 0, policy_version 107486 (0.0034) [2024-04-26 06:39:18,787][47267] Signal inference workers to stop experience collection... (25700 times) [2024-04-26 06:39:18,829][47288] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-04-26 06:39:18,837][47267] Signal inference workers to resume experience collection... (25700 times) [2024-04-26 06:39:18,843][47288] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-04-26 06:39:18,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1761083392. Throughput: 0: 56436.0. Samples: 1710470840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:18,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 06:39:20,728][47288] Updated weights for policy 0, policy_version 107496 (0.0031) [2024-04-26 06:39:23,902][47288] Updated weights for policy 0, policy_version 107506 (0.0026) [2024-04-26 06:39:23,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56524.8, 300 sec: 56316.6). Total num frames: 1761378304. Throughput: 0: 56455.1. Samples: 1710813020. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:23,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 06:39:26,478][47288] Updated weights for policy 0, policy_version 107516 (0.0026) [2024-04-26 06:39:28,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 1761673216. Throughput: 0: 56572.0. Samples: 1710975880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:28,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 06:39:29,621][47288] Updated weights for policy 0, policy_version 107526 (0.0026) [2024-04-26 06:39:32,456][47288] Updated weights for policy 0, policy_version 107536 (0.0027) [2024-04-26 06:39:33,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1761951744. Throughput: 0: 56600.6. Samples: 1711315900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:33,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:39:35,300][47288] Updated weights for policy 0, policy_version 107546 (0.0034) [2024-04-26 06:39:38,389][47288] Updated weights for policy 0, policy_version 107556 (0.0027) [2024-04-26 06:39:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1762230272. Throughput: 0: 56636.8. Samples: 1711656620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:38,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 06:39:41,154][47288] Updated weights for policy 0, policy_version 107566 (0.0031) [2024-04-26 06:39:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1762508800. Throughput: 0: 56388.9. Samples: 1711824700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:43,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 06:39:44,176][47288] Updated weights for policy 0, policy_version 107576 (0.0026) [2024-04-26 06:39:46,988][47288] Updated weights for policy 0, policy_version 107586 (0.0028) [2024-04-26 06:39:48,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 1762803712. Throughput: 0: 56440.1. Samples: 1712163600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:48,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 06:39:49,825][47288] Updated weights for policy 0, policy_version 107596 (0.0028) [2024-04-26 06:39:52,793][47288] Updated weights for policy 0, policy_version 107606 (0.0027) [2024-04-26 06:39:53,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1763082240. Throughput: 0: 56401.4. Samples: 1712498220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:53,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 06:39:55,610][47288] Updated weights for policy 0, policy_version 107616 (0.0025) [2024-04-26 06:39:58,546][47288] Updated weights for policy 0, policy_version 107626 (0.0032) [2024-04-26 06:39:58,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1763360768. Throughput: 0: 56573.2. Samples: 1712673580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:39:58,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 06:40:01,447][47288] Updated weights for policy 0, policy_version 107636 (0.0027) [2024-04-26 06:40:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.6, 300 sec: 56372.0). Total num frames: 1763639296. Throughput: 0: 56455.9. Samples: 1713011360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:40:03,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 06:40:04,370][47288] Updated weights for policy 0, policy_version 107646 (0.0028) [2024-04-26 06:40:07,145][47288] Updated weights for policy 0, policy_version 107656 (0.0031) [2024-04-26 06:40:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 1763934208. Throughput: 0: 56288.5. Samples: 1713346000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:40:08,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 06:40:10,124][47288] Updated weights for policy 0, policy_version 107666 (0.0028) [2024-04-26 06:40:13,062][47288] Updated weights for policy 0, policy_version 107676 (0.0032) [2024-04-26 06:40:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.6, 300 sec: 56483.1). Total num frames: 1764212736. Throughput: 0: 56660.0. Samples: 1713525580. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:13,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 06:40:15,773][47288] Updated weights for policy 0, policy_version 107686 (0.0030) [2024-04-26 06:40:18,923][47056] Fps is (10 sec: 52429.4, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1764458496. Throughput: 0: 56544.1. Samples: 1713860380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:18,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 06:40:19,031][47288] Updated weights for policy 0, policy_version 107696 (0.0029) [2024-04-26 06:40:21,539][47288] Updated weights for policy 0, policy_version 107706 (0.0030) [2024-04-26 06:40:23,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1764769792. Throughput: 0: 56468.6. Samples: 1714197700. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:23,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 06:40:24,862][47288] Updated weights for policy 0, policy_version 107716 (0.0026) [2024-04-26 06:40:27,275][47288] Updated weights for policy 0, policy_version 107726 (0.0027) [2024-04-26 06:40:28,923][47056] Fps is (10 sec: 60620.0, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1765064704. Throughput: 0: 56546.3. Samples: 1714369280. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:28,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 06:40:30,510][47288] Updated weights for policy 0, policy_version 107736 (0.0036) [2024-04-26 06:40:33,089][47288] Updated weights for policy 0, policy_version 107746 (0.0033) [2024-04-26 06:40:33,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1765359616. Throughput: 0: 56635.8. Samples: 1714712220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:33,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 06:40:36,329][47288] Updated weights for policy 0, policy_version 107756 (0.0033) [2024-04-26 06:40:38,911][47288] Updated weights for policy 0, policy_version 107766 (0.0030) [2024-04-26 06:40:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1765638144. Throughput: 0: 56709.9. Samples: 1715050160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:38,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:40:42,242][47288] Updated weights for policy 0, policy_version 107776 (0.0028) [2024-04-26 06:40:43,923][47056] Fps is (10 sec: 52428.4, 60 sec: 56251.6, 300 sec: 56261.0). Total num frames: 1765883904. Throughput: 0: 56508.7. Samples: 1715216480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:43,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 06:40:43,963][47267] Signal inference workers to stop experience collection... 
(25750 times) [2024-04-26 06:40:43,993][47288] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-04-26 06:40:44,050][47267] Signal inference workers to resume experience collection... (25750 times) [2024-04-26 06:40:44,051][47288] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-04-26 06:40:44,754][47288] Updated weights for policy 0, policy_version 107786 (0.0025) [2024-04-26 06:40:47,891][47288] Updated weights for policy 0, policy_version 107796 (0.0028) [2024-04-26 06:40:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.6, 300 sec: 56372.0). Total num frames: 1766195200. Throughput: 0: 56583.5. Samples: 1715557620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:48,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 06:40:49,042][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000107801_1766211584.pth... [2024-04-26 06:40:49,084][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000106973_1752645632.pth [2024-04-26 06:40:50,529][47288] Updated weights for policy 0, policy_version 107806 (0.0029) [2024-04-26 06:40:53,636][47288] Updated weights for policy 0, policy_version 107816 (0.0027) [2024-04-26 06:40:53,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1766457344. Throughput: 0: 56817.6. Samples: 1715902800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:53,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 06:40:56,336][47288] Updated weights for policy 0, policy_version 107826 (0.0028) [2024-04-26 06:40:58,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1766735872. Throughput: 0: 56298.2. Samples: 1716059000. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:40:58,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 06:40:59,660][47288] Updated weights for policy 0, policy_version 107836 (0.0026) [2024-04-26 06:41:02,029][47288] Updated weights for policy 0, policy_version 107846 (0.0025) [2024-04-26 06:41:03,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1767030784. Throughput: 0: 56384.2. Samples: 1716397680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:03,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 06:41:05,524][47288] Updated weights for policy 0, policy_version 107856 (0.0033) [2024-04-26 06:41:07,753][47288] Updated weights for policy 0, policy_version 107866 (0.0025) [2024-04-26 06:41:08,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1767325696. Throughput: 0: 56515.3. Samples: 1716740900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:08,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 06:41:11,318][47288] Updated weights for policy 0, policy_version 107876 (0.0026) [2024-04-26 06:41:13,579][47288] Updated weights for policy 0, policy_version 107886 (0.0028) [2024-04-26 06:41:13,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1767620608. Throughput: 0: 56625.2. Samples: 1716917420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:13,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 06:41:17,068][47288] Updated weights for policy 0, policy_version 107896 (0.0027) [2024-04-26 06:41:18,923][47056] Fps is (10 sec: 55706.2, 60 sec: 57070.8, 300 sec: 56372.1). Total num frames: 1767882752. Throughput: 0: 56484.5. Samples: 1717254020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:18,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 06:41:19,377][47288] Updated weights for policy 0, policy_version 107906 (0.0029) [2024-04-26 06:41:22,960][47288] Updated weights for policy 0, policy_version 107916 (0.0026) [2024-04-26 06:41:23,923][47056] Fps is (10 sec: 54068.3, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1768161280. Throughput: 0: 56463.8. Samples: 1717591020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:23,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 06:41:25,082][47288] Updated weights for policy 0, policy_version 107926 (0.0029) [2024-04-26 06:41:28,689][47288] Updated weights for policy 0, policy_version 107936 (0.0032) [2024-04-26 06:41:28,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1768439808. Throughput: 0: 56524.5. Samples: 1717760080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:28,924][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 06:41:30,853][47288] Updated weights for policy 0, policy_version 107946 (0.0028) [2024-04-26 06:41:33,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55978.7, 300 sec: 56483.2). Total num frames: 1768718336. Throughput: 0: 56486.3. Samples: 1718099500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:33,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 06:41:34,341][47288] Updated weights for policy 0, policy_version 107956 (0.0033) [2024-04-26 06:41:36,647][47288] Updated weights for policy 0, policy_version 107966 (0.0029) [2024-04-26 06:41:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1769013248. Throughput: 0: 56476.5. Samples: 1718444240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:38,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 06:41:40,193][47288] Updated weights for policy 0, policy_version 107976 (0.0031) [2024-04-26 06:41:42,334][47288] Updated weights for policy 0, policy_version 107986 (0.0027) [2024-04-26 06:41:43,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56798.0, 300 sec: 56483.1). Total num frames: 1769291776. Throughput: 0: 56644.1. Samples: 1718607980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:43,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 06:41:46,058][47288] Updated weights for policy 0, policy_version 107996 (0.0026) [2024-04-26 06:41:48,029][47288] Updated weights for policy 0, policy_version 108006 (0.0028) [2024-04-26 06:41:48,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 1769603072. Throughput: 0: 56621.4. Samples: 1718945640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:48,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 06:41:52,027][47288] Updated weights for policy 0, policy_version 108016 (0.0029) [2024-04-26 06:41:53,902][47288] Updated weights for policy 0, policy_version 108026 (0.0025) [2024-04-26 06:41:53,923][47056] Fps is (10 sec: 60620.2, 60 sec: 57344.0, 300 sec: 56594.2). Total num frames: 1769897984. Throughput: 0: 56519.2. Samples: 1719284260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:53,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 06:41:57,694][47288] Updated weights for policy 0, policy_version 108036 (0.0025) [2024-04-26 06:41:58,618][47267] Signal inference workers to stop experience collection... (25800 times) [2024-04-26 06:41:58,618][47267] Signal inference workers to resume experience collection... 
(25800 times) [2024-04-26 06:41:58,644][47288] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-04-26 06:41:58,644][47288] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-04-26 06:41:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 57344.0, 300 sec: 56483.1). Total num frames: 1770176512. Throughput: 0: 56668.0. Samples: 1719467480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:41:58,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 06:41:59,744][47288] Updated weights for policy 0, policy_version 108046 (0.0029) [2024-04-26 06:42:03,403][47288] Updated weights for policy 0, policy_version 108056 (0.0028) [2024-04-26 06:42:03,923][47056] Fps is (10 sec: 50790.7, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1770405888. Throughput: 0: 56783.6. Samples: 1719809280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:42:03,932][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 06:42:05,487][47288] Updated weights for policy 0, policy_version 108066 (0.0030) [2024-04-26 06:42:08,923][47056] Fps is (10 sec: 50791.0, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1770684416. Throughput: 0: 56909.3. Samples: 1720151940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 06:42:08,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 06:42:09,173][47288] Updated weights for policy 0, policy_version 108076 (0.0026) [2024-04-26 06:42:11,308][47288] Updated weights for policy 0, policy_version 108086 (0.0027) [2024-04-26 06:42:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 56538.7). Total num frames: 1770979328. Throughput: 0: 56597.5. Samples: 1720306960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:13,923][47056] Avg episode reward: [(0, '0.584')] [2024-04-26 06:42:14,912][47288] Updated weights for policy 0, policy_version 108096 (0.0030) [2024-04-26 06:42:17,044][47288] Updated weights for policy 0, policy_version 108106 (0.0033) [2024-04-26 06:42:18,923][47056] Fps is (10 sec: 60620.6, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 1771290624. Throughput: 0: 56661.0. Samples: 1720649240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:18,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 06:42:20,653][47288] Updated weights for policy 0, policy_version 108116 (0.0025) [2024-04-26 06:42:22,767][47288] Updated weights for policy 0, policy_version 108126 (0.0026) [2024-04-26 06:42:23,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1771569152. Throughput: 0: 56597.9. Samples: 1720991140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:23,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:42:26,507][47288] Updated weights for policy 0, policy_version 108136 (0.0027) [2024-04-26 06:42:28,429][47288] Updated weights for policy 0, policy_version 108146 (0.0030) [2024-04-26 06:42:28,923][47056] Fps is (10 sec: 58982.1, 60 sec: 57344.0, 300 sec: 56594.2). Total num frames: 1771880448. Throughput: 0: 56899.5. Samples: 1721168460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:28,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 06:42:32,356][47288] Updated weights for policy 0, policy_version 108156 (0.0027) [2024-04-26 06:42:33,923][47056] Fps is (10 sec: 58981.0, 60 sec: 57343.9, 300 sec: 56594.2). Total num frames: 1772158976. Throughput: 0: 56891.8. Samples: 1721505780. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:33,923][47056] Avg episode reward: [(0, '0.438')] [2024-04-26 06:42:34,237][47288] Updated weights for policy 0, policy_version 108166 (0.0035) [2024-04-26 06:42:38,180][47288] Updated weights for policy 0, policy_version 108176 (0.0030) [2024-04-26 06:42:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1772421120. Throughput: 0: 56796.0. Samples: 1721840080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:38,923][47056] Avg episode reward: [(0, '0.385')] [2024-04-26 06:42:40,049][47288] Updated weights for policy 0, policy_version 108186 (0.0024) [2024-04-26 06:42:43,923][47056] Fps is (10 sec: 50791.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1772666880. Throughput: 0: 56377.0. Samples: 1722004440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:43,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 06:42:43,966][47288] Updated weights for policy 0, policy_version 108196 (0.0030) [2024-04-26 06:42:45,928][47288] Updated weights for policy 0, policy_version 108206 (0.0028) [2024-04-26 06:42:48,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 56427.6). Total num frames: 1772945408. Throughput: 0: 56385.9. Samples: 1722346640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:48,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 06:42:48,988][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000108213_1772961792.pth... [2024-04-26 06:42:49,033][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000107386_1759412224.pth [2024-04-26 06:42:49,694][47288] Updated weights for policy 0, policy_version 108216 (0.0027) [2024-04-26 06:42:51,693][47288] Updated weights for policy 0, policy_version 108226 (0.0031) [2024-04-26 06:42:53,923][47056] Fps is (10 sec: 58981.5, 60 sec: 55978.6, 300 sec: 56594.2). 
Total num frames: 1773256704. Throughput: 0: 56211.8. Samples: 1722681480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:53,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 06:42:55,444][47288] Updated weights for policy 0, policy_version 108236 (0.0027) [2024-04-26 06:42:55,984][47267] Signal inference workers to stop experience collection... (25850 times) [2024-04-26 06:42:55,985][47267] Signal inference workers to resume experience collection... (25850 times) [2024-04-26 06:42:56,014][47288] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-04-26 06:42:56,014][47288] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-04-26 06:42:57,681][47288] Updated weights for policy 0, policy_version 108246 (0.0025) [2024-04-26 06:42:58,923][47056] Fps is (10 sec: 60619.3, 60 sec: 56251.6, 300 sec: 56649.7). Total num frames: 1773551616. Throughput: 0: 56566.4. Samples: 1722852460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:42:58,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 06:43:01,303][47288] Updated weights for policy 0, policy_version 108256 (0.0027) [2024-04-26 06:43:03,597][47288] Updated weights for policy 0, policy_version 108266 (0.0026) [2024-04-26 06:43:03,923][47056] Fps is (10 sec: 57344.9, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 1773830144. Throughput: 0: 56434.2. Samples: 1723188780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:43:03,923][47056] Avg episode reward: [(0, '0.587')] [2024-04-26 06:43:06,911][47288] Updated weights for policy 0, policy_version 108276 (0.0035) [2024-04-26 06:43:08,923][47056] Fps is (10 sec: 57345.0, 60 sec: 57343.9, 300 sec: 56538.7). Total num frames: 1774125056. Throughput: 0: 56374.6. Samples: 1723528000. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:43:08,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 06:43:09,271][47288] Updated weights for policy 0, policy_version 108286 (0.0037) [2024-04-26 06:43:12,696][47288] Updated weights for policy 0, policy_version 108296 (0.0026) [2024-04-26 06:43:13,923][47056] Fps is (10 sec: 57344.4, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 1774403584. Throughput: 0: 56557.5. Samples: 1723713540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 06:43:13,923][47056] Avg episode reward: [(0, '0.382')] [2024-04-26 06:43:15,122][47288] Updated weights for policy 0, policy_version 108306 (0.0029) [2024-04-26 06:43:18,426][47288] Updated weights for policy 0, policy_version 108316 (0.0031) [2024-04-26 06:43:18,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 1774665728. Throughput: 0: 56598.8. Samples: 1724052720. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:43:18,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 06:43:20,937][47288] Updated weights for policy 0, policy_version 108326 (0.0030) [2024-04-26 06:43:23,923][47056] Fps is (10 sec: 54066.3, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 1774944256. Throughput: 0: 56770.7. Samples: 1724394760. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:43:23,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 06:43:24,182][47288] Updated weights for policy 0, policy_version 108336 (0.0030) [2024-04-26 06:43:26,704][47288] Updated weights for policy 0, policy_version 108346 (0.0028) [2024-04-26 06:43:28,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 56483.1). Total num frames: 1775222784. Throughput: 0: 56698.1. Samples: 1724555860. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:43:28,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 06:43:30,078][47288] Updated weights for policy 0, policy_version 108356 (0.0033) [2024-04-26 06:43:32,552][47288] Updated weights for policy 0, policy_version 108366 (0.0027) [2024-04-26 06:43:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.8, 300 sec: 56594.2). Total num frames: 1775517696. Throughput: 0: 56618.1. Samples: 1724894460. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:43:33,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 06:43:35,793][47288] Updated weights for policy 0, policy_version 108376 (0.0033) [2024-04-26 06:43:38,406][47288] Updated weights for policy 0, policy_version 108386 (0.0030) [2024-04-26 06:43:38,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 56649.8). Total num frames: 1775812608. Throughput: 0: 56646.3. Samples: 1725230560. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:43:38,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 06:43:41,495][47288] Updated weights for policy 0, policy_version 108396 (0.0027) [2024-04-26 06:43:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 1776091136. Throughput: 0: 56742.4. Samples: 1725405860. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:43:43,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 06:43:44,204][47288] Updated weights for policy 0, policy_version 108406 (0.0030) [2024-04-26 06:43:47,263][47288] Updated weights for policy 0, policy_version 108416 (0.0027) [2024-04-26 06:43:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 57343.9, 300 sec: 56538.7). Total num frames: 1776386048. Throughput: 0: 56795.4. Samples: 1725744580. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:43:48,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 06:43:49,950][47288] Updated weights for policy 0, policy_version 108426 (0.0027) [2024-04-26 06:43:53,180][47288] Updated weights for policy 0, policy_version 108436 (0.0029) [2024-04-26 06:43:53,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1776664576. Throughput: 0: 56711.4. Samples: 1726080020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:43:53,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 06:43:55,819][47288] Updated weights for policy 0, policy_version 108446 (0.0029) [2024-04-26 06:43:58,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.9, 300 sec: 56538.7). Total num frames: 1776926720. Throughput: 0: 56363.4. Samples: 1726249900. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:43:58,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 06:43:58,969][47288] Updated weights for policy 0, policy_version 108456 (0.0027) [2024-04-26 06:44:01,513][47288] Updated weights for policy 0, policy_version 108466 (0.0024) [2024-04-26 06:44:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.6, 300 sec: 56483.2). Total num frames: 1777205248. Throughput: 0: 56441.3. Samples: 1726592580. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:44:03,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:44:04,648][47288] Updated weights for policy 0, policy_version 108476 (0.0029) [2024-04-26 06:44:05,683][47267] Signal inference workers to stop experience collection... (25900 times) [2024-04-26 06:44:05,683][47267] Signal inference workers to resume experience collection... 
(25900 times) [2024-04-26 06:44:05,694][47288] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-04-26 06:44:05,694][47288] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-04-26 06:44:07,181][47288] Updated weights for policy 0, policy_version 108486 (0.0028) [2024-04-26 06:44:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 1777500160. Throughput: 0: 56387.6. Samples: 1726932200. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:44:08,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 06:44:10,449][47288] Updated weights for policy 0, policy_version 108496 (0.0032) [2024-04-26 06:44:13,040][47288] Updated weights for policy 0, policy_version 108506 (0.0027) [2024-04-26 06:44:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.5, 300 sec: 56594.2). Total num frames: 1777778688. Throughput: 0: 56442.2. Samples: 1727095760. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 06:44:13,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 06:44:16,346][47288] Updated weights for policy 0, policy_version 108516 (0.0028) [2024-04-26 06:44:18,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56798.0, 300 sec: 56594.2). Total num frames: 1778073600. Throughput: 0: 56422.4. Samples: 1727433460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:44:18,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 06:44:18,962][47288] Updated weights for policy 0, policy_version 108526 (0.0031) [2024-04-26 06:44:22,254][47288] Updated weights for policy 0, policy_version 108536 (0.0029) [2024-04-26 06:44:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1778352128. Throughput: 0: 56544.5. Samples: 1727775060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:44:23,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 06:44:24,827][47288] Updated weights for policy 0, policy_version 108546 (0.0029) [2024-04-26 06:44:28,070][47288] Updated weights for policy 0, policy_version 108556 (0.0026) [2024-04-26 06:44:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 57071.1, 300 sec: 56594.2). Total num frames: 1778647040. Throughput: 0: 56467.3. Samples: 1727946880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:44:28,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 06:44:30,593][47288] Updated weights for policy 0, policy_version 108566 (0.0030) [2024-04-26 06:44:33,650][47288] Updated weights for policy 0, policy_version 108576 (0.0028) [2024-04-26 06:44:33,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1778909184. Throughput: 0: 56606.3. Samples: 1728291860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:44:33,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 06:44:36,714][47288] Updated weights for policy 0, policy_version 108586 (0.0035) [2024-04-26 06:44:38,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 1779204096. Throughput: 0: 56555.2. Samples: 1728625000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:44:38,923][47056] Avg episode reward: [(0, '0.604')] [2024-04-26 06:44:39,474][47288] Updated weights for policy 0, policy_version 108596 (0.0034) [2024-04-26 06:44:42,538][47288] Updated weights for policy 0, policy_version 108606 (0.0027) [2024-04-26 06:44:43,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.8, 300 sec: 56538.6). Total num frames: 1779482624. Throughput: 0: 56543.4. Samples: 1728794360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:44:43,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 06:44:45,283][47288] Updated weights for policy 0, policy_version 108616 (0.0026) [2024-04-26 06:44:48,415][47288] Updated weights for policy 0, policy_version 108626 (0.0037) [2024-04-26 06:44:48,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.9, 300 sec: 56594.3). Total num frames: 1779777536. Throughput: 0: 56426.8. Samples: 1729131780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:44:48,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 06:44:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000108629_1779777536.pth... [2024-04-26 06:44:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000107801_1766211584.pth [2024-04-26 06:44:51,020][47288] Updated weights for policy 0, policy_version 108636 (0.0026) [2024-04-26 06:44:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 1780039680. Throughput: 0: 56443.5. Samples: 1729472160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:44:53,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 06:44:54,061][47288] Updated weights for policy 0, policy_version 108646 (0.0026) [2024-04-26 06:44:56,715][47288] Updated weights for policy 0, policy_version 108656 (0.0030) [2024-04-26 06:44:58,923][47056] Fps is (10 sec: 54066.3, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 1780318208. Throughput: 0: 56674.6. Samples: 1729646120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:44:58,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 06:44:59,848][47288] Updated weights for policy 0, policy_version 108666 (0.0029) [2024-04-26 06:45:02,785][47288] Updated weights for policy 0, policy_version 108676 (0.0025) [2024-04-26 06:45:03,923][47056] Fps is (10 sec: 58982.7, 60 sec: 57071.0, 300 sec: 56594.2). 
Total num frames: 1780629504. Throughput: 0: 56629.7. Samples: 1729981800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:45:03,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 06:45:05,626][47288] Updated weights for policy 0, policy_version 108686 (0.0027) [2024-04-26 06:45:08,502][47288] Updated weights for policy 0, policy_version 108696 (0.0029) [2024-04-26 06:45:08,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 1780908032. Throughput: 0: 56612.9. Samples: 1730322640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:45:08,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 06:45:11,328][47288] Updated weights for policy 0, policy_version 108706 (0.0030) [2024-04-26 06:45:13,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56524.8, 300 sec: 56649.7). Total num frames: 1781170176. Throughput: 0: 56569.2. Samples: 1730492500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 06:45:13,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 06:45:14,308][47288] Updated weights for policy 0, policy_version 108716 (0.0026) [2024-04-26 06:45:17,130][47288] Updated weights for policy 0, policy_version 108726 (0.0028) [2024-04-26 06:45:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 1781465088. Throughput: 0: 56427.1. Samples: 1730831080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:45:18,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 06:45:20,010][47288] Updated weights for policy 0, policy_version 108736 (0.0033) [2024-04-26 06:45:22,492][47267] Signal inference workers to stop experience collection... (25950 times) [2024-04-26 06:45:22,492][47267] Signal inference workers to resume experience collection... 
(25950 times) [2024-04-26 06:45:22,529][47288] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-04-26 06:45:22,529][47288] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-04-26 06:45:23,187][47288] Updated weights for policy 0, policy_version 108746 (0.0030) [2024-04-26 06:45:23,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56524.7, 300 sec: 56538.6). Total num frames: 1781743616. Throughput: 0: 56536.7. Samples: 1731169160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:45:23,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 06:45:25,796][47288] Updated weights for policy 0, policy_version 108756 (0.0031) [2024-04-26 06:45:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1782005760. Throughput: 0: 56427.2. Samples: 1731333580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:45:28,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 06:45:28,969][47288] Updated weights for policy 0, policy_version 108766 (0.0023) [2024-04-26 06:45:31,675][47288] Updated weights for policy 0, policy_version 108776 (0.0030) [2024-04-26 06:45:33,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1782284288. Throughput: 0: 56438.5. Samples: 1731671520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:45:33,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 06:45:34,666][47288] Updated weights for policy 0, policy_version 108786 (0.0029) [2024-04-26 06:45:37,473][47288] Updated weights for policy 0, policy_version 108796 (0.0030) [2024-04-26 06:45:38,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.9, 300 sec: 56594.3). Total num frames: 1782579200. Throughput: 0: 56341.9. Samples: 1732007540. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:45:38,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 06:45:40,330][47288] Updated weights for policy 0, policy_version 108806 (0.0029) [2024-04-26 06:45:43,258][47288] Updated weights for policy 0, policy_version 108816 (0.0032) [2024-04-26 06:45:43,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1782874112. Throughput: 0: 56420.2. Samples: 1732185020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:45:43,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 06:45:46,188][47288] Updated weights for policy 0, policy_version 108826 (0.0035) [2024-04-26 06:45:48,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 1783152640. Throughput: 0: 56420.0. Samples: 1732520700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:45:48,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 06:45:49,011][47288] Updated weights for policy 0, policy_version 108836 (0.0025) [2024-04-26 06:45:52,135][47288] Updated weights for policy 0, policy_version 108846 (0.0029) [2024-04-26 06:45:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 1783431168. Throughput: 0: 56340.9. Samples: 1732857980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:45:53,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 06:45:54,895][47288] Updated weights for policy 0, policy_version 108856 (0.0028) [2024-04-26 06:45:57,801][47288] Updated weights for policy 0, policy_version 108866 (0.0029) [2024-04-26 06:45:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1783709696. Throughput: 0: 56355.9. Samples: 1733028520. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:45:58,924][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 06:46:00,543][47288] Updated weights for policy 0, policy_version 108876 (0.0026) [2024-04-26 06:46:03,546][47288] Updated weights for policy 0, policy_version 108886 (0.0026) [2024-04-26 06:46:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 1784004608. Throughput: 0: 56450.6. Samples: 1733371360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:46:03,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 06:46:06,253][47288] Updated weights for policy 0, policy_version 108896 (0.0027) [2024-04-26 06:46:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1784266752. Throughput: 0: 56415.8. Samples: 1733707860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:46:08,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 06:46:09,300][47288] Updated weights for policy 0, policy_version 108906 (0.0025) [2024-04-26 06:46:12,284][47288] Updated weights for policy 0, policy_version 108916 (0.0028) [2024-04-26 06:46:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1784561664. Throughput: 0: 56599.5. Samples: 1733880560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 06:46:13,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:46:14,894][47288] Updated weights for policy 0, policy_version 108926 (0.0027) [2024-04-26 06:46:18,040][47288] Updated weights for policy 0, policy_version 108936 (0.0027) [2024-04-26 06:46:18,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 1784856576. Throughput: 0: 56574.7. Samples: 1734217380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:46:18,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 06:46:20,411][47267] Signal inference workers to stop experience collection... (26000 times) [2024-04-26 06:46:20,455][47288] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-04-26 06:46:20,466][47267] Signal inference workers to resume experience collection... (26000 times) [2024-04-26 06:46:20,474][47288] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-04-26 06:46:20,569][47288] Updated weights for policy 0, policy_version 108946 (0.0027) [2024-04-26 06:46:23,879][47288] Updated weights for policy 0, policy_version 108956 (0.0034) [2024-04-26 06:46:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 1785135104. Throughput: 0: 56725.6. Samples: 1734560200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:46:23,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 06:46:26,390][47288] Updated weights for policy 0, policy_version 108966 (0.0027) [2024-04-26 06:46:28,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1785397248. Throughput: 0: 56397.0. Samples: 1734722880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:46:28,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 06:46:29,481][47288] Updated weights for policy 0, policy_version 108976 (0.0031) [2024-04-26 06:46:32,247][47288] Updated weights for policy 0, policy_version 108986 (0.0027) [2024-04-26 06:46:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 1785708544. Throughput: 0: 56534.7. Samples: 1735064760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:46:33,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 06:46:35,273][47288] Updated weights for policy 0, policy_version 108996 (0.0033) [2024-04-26 06:46:38,287][47288] Updated weights for policy 0, policy_version 109006 (0.0031) [2024-04-26 06:46:38,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 1785987072. Throughput: 0: 56666.7. Samples: 1735407980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:46:38,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 06:46:41,021][47288] Updated weights for policy 0, policy_version 109016 (0.0034) [2024-04-26 06:46:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1786265600. Throughput: 0: 56582.4. Samples: 1735574720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:46:43,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 06:46:43,963][47288] Updated weights for policy 0, policy_version 109026 (0.0035) [2024-04-26 06:46:46,759][47288] Updated weights for policy 0, policy_version 109036 (0.0029) [2024-04-26 06:46:48,923][47056] Fps is (10 sec: 57342.5, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 1786560512. Throughput: 0: 56564.2. Samples: 1735916760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:46:48,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 06:46:49,029][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000109044_1786576896.pth... [2024-04-26 06:46:49,080][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000108213_1772961792.pth [2024-04-26 06:46:49,817][47288] Updated weights for policy 0, policy_version 109046 (0.0032) [2024-04-26 06:46:52,537][47288] Updated weights for policy 0, policy_version 109056 (0.0033) [2024-04-26 06:46:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 56427.6). 
Total num frames: 1786822656. Throughput: 0: 56547.1. Samples: 1736252480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:46:53,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 06:46:55,713][47288] Updated weights for policy 0, policy_version 109066 (0.0027) [2024-04-26 06:46:58,313][47288] Updated weights for policy 0, policy_version 109076 (0.0036) [2024-04-26 06:46:58,923][47056] Fps is (10 sec: 54068.3, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 1787101184. Throughput: 0: 56526.8. Samples: 1736424260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:46:58,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 06:47:01,504][47288] Updated weights for policy 0, policy_version 109086 (0.0026) [2024-04-26 06:47:03,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56649.8). Total num frames: 1787396096. Throughput: 0: 56552.1. Samples: 1736762220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:47:03,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 06:47:04,237][47288] Updated weights for policy 0, policy_version 109096 (0.0026) [2024-04-26 06:47:07,414][47288] Updated weights for policy 0, policy_version 109106 (0.0026) [2024-04-26 06:47:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1787674624. Throughput: 0: 56477.0. Samples: 1737101660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:47:08,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 06:47:09,962][47288] Updated weights for policy 0, policy_version 109116 (0.0027) [2024-04-26 06:47:13,384][47288] Updated weights for policy 0, policy_version 109126 (0.0032) [2024-04-26 06:47:13,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1787936768. Throughput: 0: 56565.7. Samples: 1737268340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 06:47:13,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 06:47:16,134][47288] Updated weights for policy 0, policy_version 109136 (0.0028) [2024-04-26 06:47:18,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 1788215296. Throughput: 0: 56353.4. Samples: 1737600660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:47:18,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 06:47:19,175][47288] Updated weights for policy 0, policy_version 109146 (0.0029) [2024-04-26 06:47:22,007][47288] Updated weights for policy 0, policy_version 109156 (0.0031) [2024-04-26 06:47:23,883][47267] Signal inference workers to stop experience collection... (26050 times) [2024-04-26 06:47:23,887][47267] Signal inference workers to resume experience collection... (26050 times) [2024-04-26 06:47:23,911][47288] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-04-26 06:47:23,911][47288] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-04-26 06:47:23,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 1788526592. Throughput: 0: 56182.7. Samples: 1737936200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:47:23,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 06:47:24,982][47288] Updated weights for policy 0, policy_version 109166 (0.0026) [2024-04-26 06:47:27,656][47288] Updated weights for policy 0, policy_version 109176 (0.0025) [2024-04-26 06:47:28,923][47056] Fps is (10 sec: 60619.0, 60 sec: 57070.6, 300 sec: 56483.1). Total num frames: 1788821504. Throughput: 0: 56420.1. Samples: 1738113640. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:47:28,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 06:47:30,854][47288] Updated weights for policy 0, policy_version 109186 (0.0030) [2024-04-26 06:47:33,434][47288] Updated weights for policy 0, policy_version 109196 (0.0029) [2024-04-26 06:47:33,923][47056] Fps is (10 sec: 54066.0, 60 sec: 55978.5, 300 sec: 56427.6). Total num frames: 1789067264. Throughput: 0: 56335.6. Samples: 1738451860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:47:33,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 06:47:36,807][47288] Updated weights for policy 0, policy_version 109206 (0.0030) [2024-04-26 06:47:38,923][47056] Fps is (10 sec: 54068.1, 60 sec: 56251.6, 300 sec: 56594.2). Total num frames: 1789362176. Throughput: 0: 56334.1. Samples: 1738787520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:47:38,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 06:47:39,653][47288] Updated weights for policy 0, policy_version 109216 (0.0028) [2024-04-26 06:47:42,542][47288] Updated weights for policy 0, policy_version 109226 (0.0035) [2024-04-26 06:47:43,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 1789640704. Throughput: 0: 56243.5. Samples: 1738955220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:47:43,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 06:47:45,311][47288] Updated weights for policy 0, policy_version 109236 (0.0026) [2024-04-26 06:47:48,357][47288] Updated weights for policy 0, policy_version 109246 (0.0028) [2024-04-26 06:47:48,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56251.9, 300 sec: 56538.7). Total num frames: 1789935616. Throughput: 0: 56264.9. Samples: 1739294140. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:47:48,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 06:47:50,991][47288] Updated weights for policy 0, policy_version 109256 (0.0032) [2024-04-26 06:47:53,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1790197760. Throughput: 0: 56324.4. Samples: 1739636260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:47:53,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 06:47:54,010][47288] Updated weights for policy 0, policy_version 109266 (0.0030) [2024-04-26 06:47:56,845][47288] Updated weights for policy 0, policy_version 109276 (0.0030) [2024-04-26 06:47:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1790492672. Throughput: 0: 56512.0. Samples: 1739811380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:47:58,923][47056] Avg episode reward: [(0, '0.622')] [2024-04-26 06:47:58,931][47267] Saving new best policy, reward=0.622! [2024-04-26 06:47:59,915][47288] Updated weights for policy 0, policy_version 109286 (0.0030) [2024-04-26 06:48:02,817][47288] Updated weights for policy 0, policy_version 109296 (0.0029) [2024-04-26 06:48:03,923][47056] Fps is (10 sec: 60620.3, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1790803968. Throughput: 0: 56597.6. Samples: 1740147560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:48:03,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 06:48:05,845][47288] Updated weights for policy 0, policy_version 109306 (0.0026) [2024-04-26 06:48:08,539][47288] Updated weights for policy 0, policy_version 109316 (0.0029) [2024-04-26 06:48:08,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1791049728. Throughput: 0: 56489.9. Samples: 1740478260. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:48:08,923][47056] Avg episode reward: [(0, '0.576')] [2024-04-26 06:48:11,615][47288] Updated weights for policy 0, policy_version 109326 (0.0028) [2024-04-26 06:48:13,923][47056] Fps is (10 sec: 52429.1, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1791328256. Throughput: 0: 56404.3. Samples: 1740651820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 06:48:13,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 06:48:14,190][47288] Updated weights for policy 0, policy_version 109336 (0.0031) [2024-04-26 06:48:17,439][47288] Updated weights for policy 0, policy_version 109346 (0.0030) [2024-04-26 06:48:18,923][47056] Fps is (10 sec: 57345.3, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1791623168. Throughput: 0: 56382.0. Samples: 1740989040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:48:18,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 06:48:20,075][47288] Updated weights for policy 0, policy_version 109356 (0.0029) [2024-04-26 06:48:23,277][47288] Updated weights for policy 0, policy_version 109366 (0.0029) [2024-04-26 06:48:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 56483.2). Total num frames: 1791885312. Throughput: 0: 56371.6. Samples: 1741324240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:48:23,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 06:48:25,906][47288] Updated weights for policy 0, policy_version 109376 (0.0029) [2024-04-26 06:48:28,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55705.7, 300 sec: 56427.6). Total num frames: 1792163840. Throughput: 0: 56225.2. Samples: 1741485360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:48:28,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 06:48:28,970][47288] Updated weights for policy 0, policy_version 109386 (0.0027) [2024-04-26 06:48:31,660][47267] Signal inference workers to stop experience collection... (26100 times) [2024-04-26 06:48:31,704][47288] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-04-26 06:48:31,714][47267] Signal inference workers to resume experience collection... (26100 times) [2024-04-26 06:48:31,721][47288] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-04-26 06:48:31,724][47288] Updated weights for policy 0, policy_version 109396 (0.0030) [2024-04-26 06:48:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1792458752. Throughput: 0: 56091.4. Samples: 1741818260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:48:33,924][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 06:48:34,822][47288] Updated weights for policy 0, policy_version 109406 (0.0038) [2024-04-26 06:48:37,417][47288] Updated weights for policy 0, policy_version 109416 (0.0025) [2024-04-26 06:48:38,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1792753664. Throughput: 0: 56084.9. Samples: 1742160080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:48:38,923][47056] Avg episode reward: [(0, '0.383')] [2024-04-26 06:48:40,689][47288] Updated weights for policy 0, policy_version 109426 (0.0032) [2024-04-26 06:48:43,138][47288] Updated weights for policy 0, policy_version 109436 (0.0032) [2024-04-26 06:48:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1793032192. Throughput: 0: 56207.9. Samples: 1742340740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:48:43,924][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 06:48:46,392][47288] Updated weights for policy 0, policy_version 109446 (0.0030) [2024-04-26 06:48:48,912][47288] Updated weights for policy 0, policy_version 109456 (0.0027) [2024-04-26 06:48:48,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1793327104. Throughput: 0: 56149.5. Samples: 1742674280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:48:48,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 06:48:48,931][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000109456_1793327104.pth... [2024-04-26 06:48:48,976][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000108629_1779777536.pth [2024-04-26 06:48:52,182][47288] Updated weights for policy 0, policy_version 109466 (0.0028) [2024-04-26 06:48:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1793589248. Throughput: 0: 56300.1. Samples: 1743011760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:48:53,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 06:48:54,562][47288] Updated weights for policy 0, policy_version 109476 (0.0033) [2024-04-26 06:48:57,965][47288] Updated weights for policy 0, policy_version 109486 (0.0034) [2024-04-26 06:48:58,923][47056] Fps is (10 sec: 54065.8, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 1793867776. Throughput: 0: 56226.0. Samples: 1743182000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:48:58,923][47056] Avg episode reward: [(0, '0.418')] [2024-04-26 06:49:00,466][47288] Updated weights for policy 0, policy_version 109496 (0.0025) [2024-04-26 06:49:03,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55432.7, 300 sec: 56372.1). Total num frames: 1794129920. Throughput: 0: 56176.9. Samples: 1743517000. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:49:03,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:49:04,034][47288] Updated weights for policy 0, policy_version 109506 (0.0025) [2024-04-26 06:49:06,261][47288] Updated weights for policy 0, policy_version 109516 (0.0028) [2024-04-26 06:49:08,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1794408448. Throughput: 0: 56292.9. Samples: 1743857420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:49:08,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:49:09,803][47288] Updated weights for policy 0, policy_version 109526 (0.0032) [2024-04-26 06:49:12,125][47288] Updated weights for policy 0, policy_version 109536 (0.0029) [2024-04-26 06:49:13,923][47056] Fps is (10 sec: 60620.3, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 1794736128. Throughput: 0: 56401.0. Samples: 1744023400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:49:13,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 06:49:15,601][47288] Updated weights for policy 0, policy_version 109546 (0.0031) [2024-04-26 06:49:18,165][47288] Updated weights for policy 0, policy_version 109556 (0.0026) [2024-04-26 06:49:18,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1794998272. Throughput: 0: 56379.2. Samples: 1744355320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 06:49:18,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 06:49:21,308][47288] Updated weights for policy 0, policy_version 109566 (0.0028) [2024-04-26 06:49:23,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1795276800. Throughput: 0: 56423.9. Samples: 1744699160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:49:23,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 06:49:23,932][47288] Updated weights for policy 0, policy_version 109576 (0.0027) [2024-04-26 06:49:27,035][47288] Updated weights for policy 0, policy_version 109586 (0.0027) [2024-04-26 06:49:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 1795571712. Throughput: 0: 56498.8. Samples: 1744883180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:49:28,923][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 06:49:29,553][47288] Updated weights for policy 0, policy_version 109596 (0.0025) [2024-04-26 06:49:33,176][47288] Updated weights for policy 0, policy_version 109606 (0.0035) [2024-04-26 06:49:33,177][47267] Signal inference workers to stop experience collection... (26150 times) [2024-04-26 06:49:33,177][47267] Signal inference workers to resume experience collection... (26150 times) [2024-04-26 06:49:33,194][47288] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-04-26 06:49:33,195][47288] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-04-26 06:49:33,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1795833856. Throughput: 0: 56565.3. Samples: 1745219720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:49:33,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 06:49:35,351][47288] Updated weights for policy 0, policy_version 109616 (0.0026) [2024-04-26 06:49:38,819][47288] Updated weights for policy 0, policy_version 109626 (0.0029) [2024-04-26 06:49:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1796128768. Throughput: 0: 56580.2. Samples: 1745557860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:49:38,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 06:49:41,240][47288] Updated weights for policy 0, policy_version 109636 (0.0025) [2024-04-26 06:49:43,923][47056] Fps is (10 sec: 54066.2, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1796374528. Throughput: 0: 56352.5. Samples: 1745717860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:49:43,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 06:49:44,577][47288] Updated weights for policy 0, policy_version 109646 (0.0029) [2024-04-26 06:49:47,100][47288] Updated weights for policy 0, policy_version 109656 (0.0028) [2024-04-26 06:49:48,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1796685824. Throughput: 0: 56424.7. Samples: 1746056120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:49:48,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 06:49:50,488][47288] Updated weights for policy 0, policy_version 109666 (0.0032) [2024-04-26 06:49:52,924][47288] Updated weights for policy 0, policy_version 109676 (0.0032) [2024-04-26 06:49:53,923][47056] Fps is (10 sec: 60620.1, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1796980736. Throughput: 0: 56368.2. Samples: 1746394000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:49:53,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 06:49:56,305][47288] Updated weights for policy 0, policy_version 109686 (0.0029) [2024-04-26 06:49:58,765][47288] Updated weights for policy 0, policy_version 109696 (0.0027) [2024-04-26 06:49:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1797259264. Throughput: 0: 56516.0. Samples: 1746566620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:49:58,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 06:50:02,044][47288] Updated weights for policy 0, policy_version 109706 (0.0032) [2024-04-26 06:50:03,923][47056] Fps is (10 sec: 57345.5, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 1797554176. Throughput: 0: 56635.1. Samples: 1746903900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:50:03,923][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 06:50:04,495][47288] Updated weights for policy 0, policy_version 109716 (0.0030) [2024-04-26 06:50:07,763][47288] Updated weights for policy 0, policy_version 109726 (0.0030) [2024-04-26 06:50:08,923][47056] Fps is (10 sec: 57343.1, 60 sec: 57070.8, 300 sec: 56483.1). Total num frames: 1797832704. Throughput: 0: 56466.5. Samples: 1747240160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:50:08,923][47056] Avg episode reward: [(0, '0.601')] [2024-04-26 06:50:10,278][47288] Updated weights for policy 0, policy_version 109736 (0.0030) [2024-04-26 06:50:13,644][47288] Updated weights for policy 0, policy_version 109746 (0.0024) [2024-04-26 06:50:13,922][47056] Fps is (10 sec: 55706.5, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1798111232. Throughput: 0: 56257.5. Samples: 1747414760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:50:13,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 06:50:16,239][47288] Updated weights for policy 0, policy_version 109756 (0.0025) [2024-04-26 06:50:18,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1798356992. Throughput: 0: 56394.0. Samples: 1747757460. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 06:50:18,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 06:50:19,315][47288] Updated weights for policy 0, policy_version 109766 (0.0026) [2024-04-26 06:50:22,015][47288] Updated weights for policy 0, policy_version 109776 (0.0029) [2024-04-26 06:50:23,923][47056] Fps is (10 sec: 54065.8, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1798651904. Throughput: 0: 56446.1. Samples: 1748097940. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:50:23,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 06:50:25,098][47288] Updated weights for policy 0, policy_version 109786 (0.0027) [2024-04-26 06:50:27,679][47288] Updated weights for policy 0, policy_version 109796 (0.0033) [2024-04-26 06:50:28,477][47267] Signal inference workers to stop experience collection... (26200 times) [2024-04-26 06:50:28,478][47267] Signal inference workers to resume experience collection... (26200 times) [2024-04-26 06:50:28,507][47288] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-04-26 06:50:28,507][47288] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-04-26 06:50:28,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 1798946816. Throughput: 0: 56431.5. Samples: 1748257280. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:50:28,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 06:50:30,857][47288] Updated weights for policy 0, policy_version 109806 (0.0031) [2024-04-26 06:50:33,447][47288] Updated weights for policy 0, policy_version 109816 (0.0029) [2024-04-26 06:50:33,923][47056] Fps is (10 sec: 58983.2, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1799241728. Throughput: 0: 56545.0. Samples: 1748600640. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:50:33,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 06:50:36,677][47288] Updated weights for policy 0, policy_version 109826 (0.0028) [2024-04-26 06:50:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1799520256. Throughput: 0: 56496.6. Samples: 1748936340. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:50:38,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 06:50:39,309][47288] Updated weights for policy 0, policy_version 109836 (0.0028) [2024-04-26 06:50:42,337][47288] Updated weights for policy 0, policy_version 109846 (0.0028) [2024-04-26 06:50:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 57344.1, 300 sec: 56483.1). Total num frames: 1799815168. Throughput: 0: 56744.0. Samples: 1749120100. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:50:43,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 06:50:44,960][47288] Updated weights for policy 0, policy_version 109856 (0.0031) [2024-04-26 06:50:48,032][47288] Updated weights for policy 0, policy_version 109866 (0.0028) [2024-04-26 06:50:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 1800093696. Throughput: 0: 56745.5. Samples: 1749457460. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:50:48,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 06:50:48,995][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000109870_1800110080.pth... [2024-04-26 06:50:49,038][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000109044_1786576896.pth [2024-04-26 06:50:50,848][47288] Updated weights for policy 0, policy_version 109876 (0.0029) [2024-04-26 06:50:53,890][47288] Updated weights for policy 0, policy_version 109886 (0.0028) [2024-04-26 06:50:53,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56525.1, 300 sec: 56483.2). 
Total num frames: 1800372224. Throughput: 0: 56770.6. Samples: 1749794820. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:50:53,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 06:50:56,829][47288] Updated weights for policy 0, policy_version 109896 (0.0034) [2024-04-26 06:50:58,923][47056] Fps is (10 sec: 52429.9, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1800617984. Throughput: 0: 56685.6. Samples: 1749965620. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:50:58,923][47056] Avg episode reward: [(0, '0.598')] [2024-04-26 06:50:59,639][47288] Updated weights for policy 0, policy_version 109906 (0.0033) [2024-04-26 06:51:02,571][47288] Updated weights for policy 0, policy_version 109916 (0.0029) [2024-04-26 06:51:03,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1800912896. Throughput: 0: 56598.7. Samples: 1750304400. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:51:03,924][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 06:51:05,364][47288] Updated weights for policy 0, policy_version 109926 (0.0025) [2024-04-26 06:51:08,406][47288] Updated weights for policy 0, policy_version 109936 (0.0028) [2024-04-26 06:51:08,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1801207808. Throughput: 0: 56510.2. Samples: 1750640900. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:51:08,924][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 06:51:11,029][47288] Updated weights for policy 0, policy_version 109946 (0.0029) [2024-04-26 06:51:13,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56524.5, 300 sec: 56427.6). Total num frames: 1801502720. Throughput: 0: 56687.6. Samples: 1750808220. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:51:13,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 06:51:14,061][47288] Updated weights for policy 0, policy_version 109956 (0.0028) [2024-04-26 06:51:16,887][47288] Updated weights for policy 0, policy_version 109966 (0.0033) [2024-04-26 06:51:18,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57071.0, 300 sec: 56427.6). Total num frames: 1801781248. Throughput: 0: 56423.0. Samples: 1751139680. Policy #0 lag: (min: 0.0, avg: 13.0, max: 29.0) [2024-04-26 06:51:18,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 06:51:19,794][47288] Updated weights for policy 0, policy_version 109976 (0.0029) [2024-04-26 06:51:22,428][47267] Signal inference workers to stop experience collection... (26250 times) [2024-04-26 06:51:22,461][47288] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-04-26 06:51:22,486][47267] Signal inference workers to resume experience collection... (26250 times) [2024-04-26 06:51:22,487][47288] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-04-26 06:51:22,760][47288] Updated weights for policy 0, policy_version 109986 (0.0028) [2024-04-26 06:51:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1802076160. Throughput: 0: 56487.6. Samples: 1751478280. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:51:23,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:51:25,689][47288] Updated weights for policy 0, policy_version 109996 (0.0032) [2024-04-26 06:51:28,465][47288] Updated weights for policy 0, policy_version 110006 (0.0028) [2024-04-26 06:51:28,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1802354688. Throughput: 0: 56362.2. Samples: 1751656400. 
Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:51:28,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 06:51:31,655][47288] Updated weights for policy 0, policy_version 110016 (0.0025) [2024-04-26 06:51:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1802633216. Throughput: 0: 56414.1. Samples: 1751996080. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:51:33,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 06:51:34,137][47288] Updated weights for policy 0, policy_version 110026 (0.0033) [2024-04-26 06:51:37,322][47288] Updated weights for policy 0, policy_version 110036 (0.0032) [2024-04-26 06:51:38,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 1802878976. Throughput: 0: 56502.5. Samples: 1752337440. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:51:38,923][47056] Avg episode reward: [(0, '0.608')] [2024-04-26 06:51:40,032][47288] Updated weights for policy 0, policy_version 110046 (0.0026) [2024-04-26 06:51:43,285][47288] Updated weights for policy 0, policy_version 110056 (0.0029) [2024-04-26 06:51:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 56316.6). Total num frames: 1803173888. Throughput: 0: 56202.7. Samples: 1752494740. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:51:43,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 06:51:45,885][47288] Updated weights for policy 0, policy_version 110066 (0.0024) [2024-04-26 06:51:48,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1803468800. Throughput: 0: 56132.4. Samples: 1752830360. 
Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:51:48,923][47056] Avg episode reward: [(0, '0.599')] [2024-04-26 06:51:49,118][47288] Updated weights for policy 0, policy_version 110076 (0.0027) [2024-04-26 06:51:51,632][47288] Updated weights for policy 0, policy_version 110086 (0.0035) [2024-04-26 06:51:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1803747328. Throughput: 0: 56233.0. Samples: 1753171380. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:51:53,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 06:51:54,934][47288] Updated weights for policy 0, policy_version 110096 (0.0028) [2024-04-26 06:51:57,598][47288] Updated weights for policy 0, policy_version 110106 (0.0035) [2024-04-26 06:51:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.7, 300 sec: 56372.0). Total num frames: 1804025856. Throughput: 0: 56356.8. Samples: 1753344280. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:51:58,924][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 06:52:00,589][47288] Updated weights for policy 0, policy_version 110116 (0.0026) [2024-04-26 06:52:03,479][47288] Updated weights for policy 0, policy_version 110126 (0.0031) [2024-04-26 06:52:03,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 1804320768. Throughput: 0: 56445.0. Samples: 1753679700. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:52:03,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 06:52:06,538][47288] Updated weights for policy 0, policy_version 110136 (0.0031) [2024-04-26 06:52:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1804599296. Throughput: 0: 56383.5. Samples: 1754015540. 
Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:52:08,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 06:52:09,199][47288] Updated weights for policy 0, policy_version 110146 (0.0032) [2024-04-26 06:52:12,555][47288] Updated weights for policy 0, policy_version 110156 (0.0028) [2024-04-26 06:52:13,923][47056] Fps is (10 sec: 55704.5, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1804877824. Throughput: 0: 56220.4. Samples: 1754186320. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:52:13,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 06:52:15,031][47288] Updated weights for policy 0, policy_version 110166 (0.0025) [2024-04-26 06:52:18,238][47288] Updated weights for policy 0, policy_version 110176 (0.0035) [2024-04-26 06:52:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1805139968. Throughput: 0: 56148.7. Samples: 1754522780. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-04-26 06:52:18,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 06:52:20,792][47288] Updated weights for policy 0, policy_version 110186 (0.0034) [2024-04-26 06:52:23,319][47267] Signal inference workers to stop experience collection... (26300 times) [2024-04-26 06:52:23,320][47267] Signal inference workers to resume experience collection... (26300 times) [2024-04-26 06:52:23,342][47288] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-04-26 06:52:23,342][47288] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-04-26 06:52:23,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 56316.6). Total num frames: 1805434880. Throughput: 0: 56097.3. Samples: 1754861820. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:52:23,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 06:52:24,148][47288] Updated weights for policy 0, policy_version 110196 (0.0030) [2024-04-26 06:52:26,636][47288] Updated weights for policy 0, policy_version 110206 (0.0034) [2024-04-26 06:52:28,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 1805713408. Throughput: 0: 56299.6. Samples: 1755028220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:52:28,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 06:52:30,243][47288] Updated weights for policy 0, policy_version 110216 (0.0034) [2024-04-26 06:52:32,510][47288] Updated weights for policy 0, policy_version 110226 (0.0030) [2024-04-26 06:52:33,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1806008320. Throughput: 0: 56389.6. Samples: 1755367880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:52:33,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 06:52:35,899][47288] Updated weights for policy 0, policy_version 110236 (0.0035) [2024-04-26 06:52:38,488][47288] Updated weights for policy 0, policy_version 110246 (0.0030) [2024-04-26 06:52:38,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1806286848. Throughput: 0: 56325.3. Samples: 1755706020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:52:38,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 06:52:41,591][47288] Updated weights for policy 0, policy_version 110256 (0.0032) [2024-04-26 06:52:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1806565376. Throughput: 0: 56402.9. Samples: 1755882400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:52:43,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 06:52:44,318][47288] Updated weights for policy 0, policy_version 110266 (0.0027) [2024-04-26 06:52:47,346][47288] Updated weights for policy 0, policy_version 110276 (0.0031) [2024-04-26 06:52:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1806843904. Throughput: 0: 56349.6. Samples: 1756215440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:52:48,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 06:52:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000110281_1806843904.pth... [2024-04-26 06:52:48,989][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000109456_1793327104.pth [2024-04-26 06:52:50,147][47288] Updated weights for policy 0, policy_version 110286 (0.0035) [2024-04-26 06:52:53,325][47288] Updated weights for policy 0, policy_version 110296 (0.0034) [2024-04-26 06:52:53,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1807138816. Throughput: 0: 56414.2. Samples: 1756554180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:52:53,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 06:52:56,211][47288] Updated weights for policy 0, policy_version 110306 (0.0026) [2024-04-26 06:52:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1807400960. Throughput: 0: 56302.4. Samples: 1756719920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:52:58,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 06:52:59,097][47288] Updated weights for policy 0, policy_version 110316 (0.0035) [2024-04-26 06:53:01,873][47288] Updated weights for policy 0, policy_version 110326 (0.0033) [2024-04-26 06:53:03,923][47056] Fps is (10 sec: 55706.7, 60 sec: 56251.8, 300 sec: 56427.7). 
Total num frames: 1807695872. Throughput: 0: 56383.7. Samples: 1757060040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:53:03,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 06:53:04,740][47288] Updated weights for policy 0, policy_version 110336 (0.0024) [2024-04-26 06:53:07,766][47288] Updated weights for policy 0, policy_version 110346 (0.0028) [2024-04-26 06:53:08,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 1807990784. Throughput: 0: 56388.5. Samples: 1757399300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:53:08,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 06:53:10,497][47288] Updated weights for policy 0, policy_version 110356 (0.0027) [2024-04-26 06:53:13,509][47288] Updated weights for policy 0, policy_version 110366 (0.0029) [2024-04-26 06:53:13,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56251.8, 300 sec: 56372.0). Total num frames: 1808252928. Throughput: 0: 56415.9. Samples: 1757566940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:53:13,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 06:53:16,323][47288] Updated weights for policy 0, policy_version 110376 (0.0026) [2024-04-26 06:53:18,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1808531456. Throughput: 0: 56406.0. Samples: 1757906160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 06:53:18,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 06:53:19,224][47288] Updated weights for policy 0, policy_version 110386 (0.0028) [2024-04-26 06:53:22,067][47288] Updated weights for policy 0, policy_version 110396 (0.0031) [2024-04-26 06:53:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1808826368. Throughput: 0: 56365.3. Samples: 1758242460. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:53:23,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 06:53:25,032][47288] Updated weights for policy 0, policy_version 110406 (0.0028) [2024-04-26 06:53:27,993][47288] Updated weights for policy 0, policy_version 110416 (0.0028) [2024-04-26 06:53:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1809104896. Throughput: 0: 56262.4. Samples: 1758414220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:53:28,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 06:53:30,945][47288] Updated weights for policy 0, policy_version 110426 (0.0027) [2024-04-26 06:53:33,551][47267] Signal inference workers to stop experience collection... (26350 times) [2024-04-26 06:53:33,551][47267] Signal inference workers to resume experience collection... (26350 times) [2024-04-26 06:53:33,561][47288] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-04-26 06:53:33,561][47288] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-04-26 06:53:33,660][47288] Updated weights for policy 0, policy_version 110436 (0.0035) [2024-04-26 06:53:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1809383424. Throughput: 0: 56367.6. Samples: 1758751980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:53:33,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 06:53:36,649][47288] Updated weights for policy 0, policy_version 110446 (0.0025) [2024-04-26 06:53:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1809661952. Throughput: 0: 56507.2. Samples: 1759097000. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:53:38,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 06:53:39,420][47288] Updated weights for policy 0, policy_version 110456 (0.0035) [2024-04-26 06:53:42,503][47288] Updated weights for policy 0, policy_version 110466 (0.0035) [2024-04-26 06:53:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 1809940480. Throughput: 0: 56571.5. Samples: 1759265640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:53:43,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 06:53:45,186][47288] Updated weights for policy 0, policy_version 110476 (0.0025) [2024-04-26 06:53:48,313][47288] Updated weights for policy 0, policy_version 110486 (0.0033) [2024-04-26 06:53:48,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1810235392. Throughput: 0: 56532.8. Samples: 1759604020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:53:48,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 06:53:50,863][47288] Updated weights for policy 0, policy_version 110496 (0.0030) [2024-04-26 06:53:53,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1810497536. Throughput: 0: 56640.5. Samples: 1759948120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:53:53,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 06:53:54,123][47288] Updated weights for policy 0, policy_version 110506 (0.0028) [2024-04-26 06:53:56,647][47288] Updated weights for policy 0, policy_version 110516 (0.0032) [2024-04-26 06:53:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1810792448. Throughput: 0: 56510.3. Samples: 1760109900. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:53:58,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 06:53:59,801][47288] Updated weights for policy 0, policy_version 110526 (0.0026) [2024-04-26 06:54:02,511][47288] Updated weights for policy 0, policy_version 110536 (0.0030) [2024-04-26 06:54:03,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56524.6, 300 sec: 56538.7). Total num frames: 1811087360. Throughput: 0: 56560.4. Samples: 1760451380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:54:03,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 06:54:05,714][47288] Updated weights for policy 0, policy_version 110546 (0.0030) [2024-04-26 06:54:08,460][47288] Updated weights for policy 0, policy_version 110556 (0.0035) [2024-04-26 06:54:08,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1811382272. Throughput: 0: 56567.7. Samples: 1760788000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:54:08,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 06:54:11,690][47288] Updated weights for policy 0, policy_version 110566 (0.0027) [2024-04-26 06:54:13,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1811644416. Throughput: 0: 56602.8. Samples: 1760961340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:54:13,923][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 06:54:14,346][47288] Updated weights for policy 0, policy_version 110576 (0.0032) [2024-04-26 06:54:17,384][47288] Updated weights for policy 0, policy_version 110586 (0.0030) [2024-04-26 06:54:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 1811939328. Throughput: 0: 56550.7. Samples: 1761296760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:54:18,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 06:54:20,193][47288] Updated weights for policy 0, policy_version 110596 (0.0026) [2024-04-26 06:54:23,099][47288] Updated weights for policy 0, policy_version 110606 (0.0030) [2024-04-26 06:54:23,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1812201472. Throughput: 0: 56378.1. Samples: 1761634020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 06:54:23,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 06:54:25,998][47288] Updated weights for policy 0, policy_version 110616 (0.0026) [2024-04-26 06:54:28,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1812480000. Throughput: 0: 56272.3. Samples: 1761797900. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:54:28,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 06:54:28,930][47288] Updated weights for policy 0, policy_version 110626 (0.0027) [2024-04-26 06:54:31,822][47288] Updated weights for policy 0, policy_version 110636 (0.0024) [2024-04-26 06:54:33,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1812758528. Throughput: 0: 56157.7. Samples: 1762131120. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:54:33,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 06:54:35,166][47288] Updated weights for policy 0, policy_version 110646 (0.0030) [2024-04-26 06:54:37,601][47288] Updated weights for policy 0, policy_version 110656 (0.0028) [2024-04-26 06:54:38,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 1813037056. Throughput: 0: 56070.2. Samples: 1762471280. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:54:38,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 06:54:40,936][47288] Updated weights for policy 0, policy_version 110666 (0.0029) [2024-04-26 06:54:43,389][47288] Updated weights for policy 0, policy_version 110676 (0.0028) [2024-04-26 06:54:43,923][47056] Fps is (10 sec: 60620.5, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 1813364736. Throughput: 0: 56340.4. Samples: 1762645220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:54:43,924][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 06:54:46,764][47288] Updated weights for policy 0, policy_version 110686 (0.0030) [2024-04-26 06:54:48,923][47056] Fps is (10 sec: 57342.4, 60 sec: 56251.5, 300 sec: 56372.1). Total num frames: 1813610496. Throughput: 0: 56254.1. Samples: 1762982820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:54:48,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 06:54:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000110694_1813610496.pth... [2024-04-26 06:54:48,996][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000109870_1800110080.pth [2024-04-26 06:54:49,259][47288] Updated weights for policy 0, policy_version 110696 (0.0027) [2024-04-26 06:54:52,604][47288] Updated weights for policy 0, policy_version 110706 (0.0029) [2024-04-26 06:54:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1813905408. Throughput: 0: 56115.4. Samples: 1763313200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:54:53,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 06:54:55,169][47288] Updated weights for policy 0, policy_version 110716 (0.0026) [2024-04-26 06:54:58,438][47288] Updated weights for policy 0, policy_version 110726 (0.0027) [2024-04-26 06:54:58,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.6, 300 sec: 56261.0). 
Total num frames: 1814151168. Throughput: 0: 56076.7. Samples: 1763484800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:54:58,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 06:54:58,944][47267] Signal inference workers to stop experience collection... (26400 times) [2024-04-26 06:54:58,944][47267] Signal inference workers to resume experience collection... (26400 times) [2024-04-26 06:54:58,955][47288] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-04-26 06:54:58,955][47288] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-04-26 06:55:01,079][47288] Updated weights for policy 0, policy_version 110736 (0.0032) [2024-04-26 06:55:03,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 1814429696. Throughput: 0: 56139.6. Samples: 1763823040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:55:03,923][47056] Avg episode reward: [(0, '0.582')] [2024-04-26 06:55:04,198][47288] Updated weights for policy 0, policy_version 110746 (0.0027) [2024-04-26 06:55:06,827][47288] Updated weights for policy 0, policy_version 110756 (0.0027) [2024-04-26 06:55:08,923][47056] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1814724608. Throughput: 0: 56168.7. Samples: 1764161600. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:55:08,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 06:55:09,940][47288] Updated weights for policy 0, policy_version 110766 (0.0028) [2024-04-26 06:55:12,515][47288] Updated weights for policy 0, policy_version 110776 (0.0030) [2024-04-26 06:55:13,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 56372.1). Total num frames: 1814986752. Throughput: 0: 56100.7. Samples: 1764322420. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:55:13,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 06:55:15,867][47288] Updated weights for policy 0, policy_version 110786 (0.0031) [2024-04-26 06:55:18,416][47288] Updated weights for policy 0, policy_version 110796 (0.0034) [2024-04-26 06:55:18,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1815298048. Throughput: 0: 56070.7. Samples: 1764654300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:55:18,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 06:55:21,854][47288] Updated weights for policy 0, policy_version 110806 (0.0032) [2024-04-26 06:55:23,923][47056] Fps is (10 sec: 60620.0, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1815592960. Throughput: 0: 56058.1. Samples: 1764993900. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 06:55:23,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 06:55:24,165][47288] Updated weights for policy 0, policy_version 110816 (0.0024) [2024-04-26 06:55:27,628][47288] Updated weights for policy 0, policy_version 110826 (0.0041) [2024-04-26 06:55:28,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1815871488. Throughput: 0: 56092.5. Samples: 1765169380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:55:28,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 06:55:29,833][47288] Updated weights for policy 0, policy_version 110836 (0.0032) [2024-04-26 06:55:33,237][47288] Updated weights for policy 0, policy_version 110846 (0.0026) [2024-04-26 06:55:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1816150016. Throughput: 0: 56230.8. Samples: 1765513200. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:55:33,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 06:55:35,646][47288] Updated weights for policy 0, policy_version 110856 (0.0029) [2024-04-26 06:55:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1816412160. Throughput: 0: 56313.9. Samples: 1765847320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:55:38,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 06:55:38,947][47288] Updated weights for policy 0, policy_version 110866 (0.0030) [2024-04-26 06:55:41,716][47288] Updated weights for policy 0, policy_version 110876 (0.0028) [2024-04-26 06:55:43,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 56261.0). Total num frames: 1816690688. Throughput: 0: 56126.7. Samples: 1766010500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:55:43,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 06:55:44,826][47288] Updated weights for policy 0, policy_version 110886 (0.0027) [2024-04-26 06:55:47,237][47267] Signal inference workers to stop experience collection... (26450 times) [2024-04-26 06:55:47,271][47288] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-04-26 06:55:47,323][47267] Signal inference workers to resume experience collection... (26450 times) [2024-04-26 06:55:47,324][47288] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-04-26 06:55:47,430][47288] Updated weights for policy 0, policy_version 110896 (0.0029) [2024-04-26 06:55:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1816969216. Throughput: 0: 56140.3. Samples: 1766349360. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:55:48,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 06:55:50,608][47288] Updated weights for policy 0, policy_version 110906 (0.0030) [2024-04-26 06:55:53,257][47288] Updated weights for policy 0, policy_version 110916 (0.0027) [2024-04-26 06:55:53,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 56372.1). Total num frames: 1817247744. Throughput: 0: 56240.4. Samples: 1766692420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:55:53,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 06:55:56,309][47288] Updated weights for policy 0, policy_version 110926 (0.0033) [2024-04-26 06:55:58,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1817559040. Throughput: 0: 56404.7. Samples: 1766860640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:55:58,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 06:55:59,101][47288] Updated weights for policy 0, policy_version 110936 (0.0030) [2024-04-26 06:56:02,214][47288] Updated weights for policy 0, policy_version 110946 (0.0026) [2024-04-26 06:56:03,923][47056] Fps is (10 sec: 60620.7, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 1817853952. Throughput: 0: 56610.6. Samples: 1767201780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:56:03,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 06:56:04,874][47288] Updated weights for policy 0, policy_version 110956 (0.0030) [2024-04-26 06:56:08,159][47288] Updated weights for policy 0, policy_version 110966 (0.0031) [2024-04-26 06:56:08,923][47056] Fps is (10 sec: 58983.6, 60 sec: 57071.0, 300 sec: 56427.6). Total num frames: 1818148864. Throughput: 0: 56459.7. Samples: 1767534580. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:56:08,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 06:56:10,590][47288] Updated weights for policy 0, policy_version 110976 (0.0029) [2024-04-26 06:56:13,807][47288] Updated weights for policy 0, policy_version 110986 (0.0026) [2024-04-26 06:56:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 57070.8, 300 sec: 56372.1). Total num frames: 1818411008. Throughput: 0: 56468.9. Samples: 1767710480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:56:13,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 06:56:16,466][47288] Updated weights for policy 0, policy_version 110996 (0.0029) [2024-04-26 06:56:18,923][47056] Fps is (10 sec: 54066.4, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1818689536. Throughput: 0: 56292.9. Samples: 1768046380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:56:18,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 06:56:19,521][47288] Updated weights for policy 0, policy_version 111006 (0.0030) [2024-04-26 06:56:22,881][47288] Updated weights for policy 0, policy_version 111016 (0.0027) [2024-04-26 06:56:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1818968064. Throughput: 0: 56289.7. Samples: 1768380360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 06:56:23,924][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 06:56:25,507][47288] Updated weights for policy 0, policy_version 111026 (0.0031) [2024-04-26 06:56:28,537][47288] Updated weights for policy 0, policy_version 111036 (0.0031) [2024-04-26 06:56:28,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 1819213824. Throughput: 0: 56317.1. Samples: 1768544760. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:56:28,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 06:56:31,243][47288] Updated weights for policy 0, policy_version 111046 (0.0027) [2024-04-26 06:56:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1819508736. Throughput: 0: 56246.3. Samples: 1768880440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:56:33,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 06:56:34,570][47288] Updated weights for policy 0, policy_version 111056 (0.0036) [2024-04-26 06:56:36,954][47288] Updated weights for policy 0, policy_version 111066 (0.0038) [2024-04-26 06:56:38,923][47056] Fps is (10 sec: 58981.1, 60 sec: 56524.6, 300 sec: 56372.0). Total num frames: 1819803648. Throughput: 0: 56186.0. Samples: 1769220800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:56:38,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 06:56:40,294][47288] Updated weights for policy 0, policy_version 111076 (0.0028) [2024-04-26 06:56:42,801][47288] Updated weights for policy 0, policy_version 111086 (0.0024) [2024-04-26 06:56:43,923][47056] Fps is (10 sec: 60620.2, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 1820114944. Throughput: 0: 56388.9. Samples: 1769398140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:56:43,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 06:56:46,142][47288] Updated weights for policy 0, policy_version 111096 (0.0027) [2024-04-26 06:56:48,795][47288] Updated weights for policy 0, policy_version 111106 (0.0031) [2024-04-26 06:56:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.6, 300 sec: 56316.5). Total num frames: 1820360704. Throughput: 0: 56193.1. Samples: 1769730480. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:56:48,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 06:56:49,022][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000111107_1820377088.pth... [2024-04-26 06:56:49,066][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000110281_1806843904.pth [2024-04-26 06:56:52,025][47288] Updated weights for policy 0, policy_version 111116 (0.0027) [2024-04-26 06:56:53,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56524.8, 300 sec: 56316.6). Total num frames: 1820639232. Throughput: 0: 56187.4. Samples: 1770063020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:56:53,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 06:56:54,617][47288] Updated weights for policy 0, policy_version 111126 (0.0031) [2024-04-26 06:56:57,692][47288] Updated weights for policy 0, policy_version 111136 (0.0031) [2024-04-26 06:56:58,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1820917760. Throughput: 0: 56084.0. Samples: 1770234260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:56:58,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 06:56:59,056][47267] Signal inference workers to stop experience collection... (26500 times) [2024-04-26 06:56:59,061][47267] Signal inference workers to resume experience collection... (26500 times) [2024-04-26 06:56:59,086][47288] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-04-26 06:56:59,087][47288] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-04-26 06:57:00,267][47288] Updated weights for policy 0, policy_version 111146 (0.0026) [2024-04-26 06:57:03,542][47288] Updated weights for policy 0, policy_version 111156 (0.0026) [2024-04-26 06:57:03,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 1821196288. Throughput: 0: 56206.4. 
Samples: 1770575660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:57:03,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 06:57:06,174][47288] Updated weights for policy 0, policy_version 111166 (0.0031) [2024-04-26 06:57:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 56205.5). Total num frames: 1821458432. Throughput: 0: 56258.8. Samples: 1770912000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:57:08,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 06:57:09,436][47288] Updated weights for policy 0, policy_version 111176 (0.0033) [2024-04-26 06:57:11,963][47288] Updated weights for policy 0, policy_version 111186 (0.0027) [2024-04-26 06:57:13,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1821769728. Throughput: 0: 56226.5. Samples: 1771074960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:57:13,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 06:57:15,224][47288] Updated weights for policy 0, policy_version 111196 (0.0029) [2024-04-26 06:57:17,734][47288] Updated weights for policy 0, policy_version 111206 (0.0027) [2024-04-26 06:57:18,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1822064640. Throughput: 0: 56349.3. Samples: 1771416160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:57:18,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 06:57:20,987][47288] Updated weights for policy 0, policy_version 111216 (0.0028) [2024-04-26 06:57:23,542][47288] Updated weights for policy 0, policy_version 111226 (0.0028) [2024-04-26 06:57:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1822343168. Throughput: 0: 56204.1. Samples: 1771749980. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 06:57:23,923][47056] Avg episode reward: [(0, '0.388')] [2024-04-26 06:57:26,949][47288] Updated weights for policy 0, policy_version 111236 (0.0025) [2024-04-26 06:57:28,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1822605312. Throughput: 0: 56143.6. Samples: 1771924600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:57:28,924][47056] Avg episode reward: [(0, '0.468')] [2024-04-26 06:57:29,479][47288] Updated weights for policy 0, policy_version 111246 (0.0032) [2024-04-26 06:57:32,890][47288] Updated weights for policy 0, policy_version 111256 (0.0024) [2024-04-26 06:57:33,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 56316.6). Total num frames: 1822900224. Throughput: 0: 56169.2. Samples: 1772258080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:57:33,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 06:57:35,314][47288] Updated weights for policy 0, policy_version 111266 (0.0031) [2024-04-26 06:57:38,610][47288] Updated weights for policy 0, policy_version 111276 (0.0033) [2024-04-26 06:57:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1823162368. Throughput: 0: 56302.2. Samples: 1772596620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:57:38,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 06:57:41,154][47288] Updated weights for policy 0, policy_version 111286 (0.0022) [2024-04-26 06:57:43,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1823440896. Throughput: 0: 56101.0. Samples: 1772758800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:57:43,923][47056] Avg episode reward: [(0, '0.387')] [2024-04-26 06:57:44,396][47288] Updated weights for policy 0, policy_version 111296 (0.0025) [2024-04-26 06:57:47,008][47288] Updated weights for policy 0, policy_version 111306 (0.0032) [2024-04-26 06:57:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.9, 300 sec: 56205.5). Total num frames: 1823719424. Throughput: 0: 55854.1. Samples: 1773089100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:57:48,924][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 06:57:50,172][47288] Updated weights for policy 0, policy_version 111316 (0.0026) [2024-04-26 06:57:52,876][47288] Updated weights for policy 0, policy_version 111326 (0.0032) [2024-04-26 06:57:53,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1824014336. Throughput: 0: 55922.7. Samples: 1773428520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:57:53,932][47056] Avg episode reward: [(0, '0.591')] [2024-04-26 06:57:56,145][47288] Updated weights for policy 0, policy_version 111336 (0.0030) [2024-04-26 06:57:58,310][47267] Signal inference workers to stop experience collection... (26550 times) [2024-04-26 06:57:58,355][47288] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-04-26 06:57:58,365][47267] Signal inference workers to resume experience collection... (26550 times) [2024-04-26 06:57:58,371][47288] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-04-26 06:57:58,607][47288] Updated weights for policy 0, policy_version 111346 (0.0032) [2024-04-26 06:57:58,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1824309248. Throughput: 0: 56071.0. Samples: 1773598160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:57:58,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 06:58:01,862][47288] Updated weights for policy 0, policy_version 111356 (0.0031) [2024-04-26 06:58:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1824555008. Throughput: 0: 55945.9. Samples: 1773933720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:58:03,932][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 06:58:04,566][47288] Updated weights for policy 0, policy_version 111366 (0.0025) [2024-04-26 06:58:07,632][47288] Updated weights for policy 0, policy_version 111376 (0.0029) [2024-04-26 06:58:08,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56524.7, 300 sec: 56261.0). Total num frames: 1824849920. Throughput: 0: 56078.7. Samples: 1774273520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:58:08,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 06:58:10,371][47288] Updated weights for policy 0, policy_version 111386 (0.0029) [2024-04-26 06:58:13,458][47288] Updated weights for policy 0, policy_version 111396 (0.0032) [2024-04-26 06:58:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.8, 300 sec: 56261.0). Total num frames: 1825128448. Throughput: 0: 55908.1. Samples: 1774440460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:58:13,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 06:58:16,184][47288] Updated weights for policy 0, policy_version 111406 (0.0028) [2024-04-26 06:58:18,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 56149.9). Total num frames: 1825390592. Throughput: 0: 56014.2. Samples: 1774778720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:58:18,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 06:58:19,274][47288] Updated weights for policy 0, policy_version 111416 (0.0031) [2024-04-26 06:58:22,075][47288] Updated weights for policy 0, policy_version 111426 (0.0030) [2024-04-26 06:58:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 56205.5). Total num frames: 1825685504. Throughput: 0: 55945.3. Samples: 1775114160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 06:58:23,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 06:58:25,144][47288] Updated weights for policy 0, policy_version 111436 (0.0024) [2024-04-26 06:58:27,908][47288] Updated weights for policy 0, policy_version 111446 (0.0028) [2024-04-26 06:58:28,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1825964032. Throughput: 0: 56125.2. Samples: 1775284440. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:58:28,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 06:58:30,866][47288] Updated weights for policy 0, policy_version 111456 (0.0030) [2024-04-26 06:58:33,689][47288] Updated weights for policy 0, policy_version 111466 (0.0033) [2024-04-26 06:58:33,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1826275328. Throughput: 0: 56223.1. Samples: 1775619140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:58:33,923][47056] Avg episode reward: [(0, '0.622')] [2024-04-26 06:58:36,646][47288] Updated weights for policy 0, policy_version 111476 (0.0028) [2024-04-26 06:58:38,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 56205.4). Total num frames: 1826521088. Throughput: 0: 56209.6. Samples: 1775957960. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:58:38,924][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 06:58:39,535][47288] Updated weights for policy 0, policy_version 111486 (0.0029) [2024-04-26 06:58:42,551][47288] Updated weights for policy 0, policy_version 111496 (0.0031) [2024-04-26 06:58:43,923][47056] Fps is (10 sec: 54065.4, 60 sec: 56251.4, 300 sec: 56205.4). Total num frames: 1826816000. Throughput: 0: 56256.6. Samples: 1776129720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:58:43,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 06:58:45,509][47288] Updated weights for policy 0, policy_version 111506 (0.0031) [2024-04-26 06:58:48,288][47288] Updated weights for policy 0, policy_version 111516 (0.0031) [2024-04-26 06:58:48,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1827110912. Throughput: 0: 56253.6. Samples: 1776465140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:58:48,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 06:58:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000111518_1827110912.pth... [2024-04-26 06:58:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000110694_1813610496.pth [2024-04-26 06:58:51,161][47288] Updated weights for policy 0, policy_version 111526 (0.0028) [2024-04-26 06:58:53,642][47267] Signal inference workers to stop experience collection... (26600 times) [2024-04-26 06:58:53,643][47267] Signal inference workers to resume experience collection... (26600 times) [2024-04-26 06:58:53,659][47288] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-04-26 06:58:53,660][47288] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-04-26 06:58:53,923][47056] Fps is (10 sec: 57346.4, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1827389440. Throughput: 0: 56255.3. 
Samples: 1776805000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:58:53,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 06:58:54,016][47288] Updated weights for policy 0, policy_version 111536 (0.0032) [2024-04-26 06:58:56,941][47288] Updated weights for policy 0, policy_version 111546 (0.0028) [2024-04-26 06:58:58,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 56094.4). Total num frames: 1827635200. Throughput: 0: 56304.3. Samples: 1776974160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:58:58,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 06:58:59,814][47288] Updated weights for policy 0, policy_version 111556 (0.0025) [2024-04-26 06:59:02,883][47288] Updated weights for policy 0, policy_version 111566 (0.0029) [2024-04-26 06:59:03,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.7, 300 sec: 56149.9). Total num frames: 1827946496. Throughput: 0: 56183.4. Samples: 1777306980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:59:03,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 06:59:05,731][47288] Updated weights for policy 0, policy_version 111576 (0.0034) [2024-04-26 06:59:08,714][47288] Updated weights for policy 0, policy_version 111586 (0.0027) [2024-04-26 06:59:08,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 1828225024. Throughput: 0: 56209.6. Samples: 1777643600. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:59:08,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 06:59:11,388][47288] Updated weights for policy 0, policy_version 111596 (0.0027) [2024-04-26 06:59:13,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 56094.4). Total num frames: 1828487168. Throughput: 0: 56111.7. Samples: 1777809460. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:59:13,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 06:59:14,460][47288] Updated weights for policy 0, policy_version 111606 (0.0032) [2024-04-26 06:59:17,076][47288] Updated weights for policy 0, policy_version 111616 (0.0030) [2024-04-26 06:59:18,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.6, 300 sec: 56205.4). Total num frames: 1828782080. Throughput: 0: 56293.7. Samples: 1778152360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:59:18,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 06:59:20,667][47288] Updated weights for policy 0, policy_version 111626 (0.0028) [2024-04-26 06:59:23,114][47288] Updated weights for policy 0, policy_version 111636 (0.0031) [2024-04-26 06:59:23,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1829076992. Throughput: 0: 56262.8. Samples: 1778489780. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:59:23,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 06:59:26,382][47288] Updated weights for policy 0, policy_version 111646 (0.0029) [2024-04-26 06:59:28,769][47288] Updated weights for policy 0, policy_version 111656 (0.0029) [2024-04-26 06:59:28,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 1829371904. Throughput: 0: 56218.9. Samples: 1778659560. Policy #0 lag: (min: 0.0, avg: 12.7, max: 25.0) [2024-04-26 06:59:28,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 06:59:32,278][47288] Updated weights for policy 0, policy_version 111666 (0.0026) [2024-04-26 06:59:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1829634048. Throughput: 0: 56249.0. Samples: 1778996340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:59:33,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 06:59:34,660][47288] Updated weights for policy 0, policy_version 111676 (0.0030) [2024-04-26 06:59:37,981][47288] Updated weights for policy 0, policy_version 111686 (0.0025) [2024-04-26 06:59:38,923][47056] Fps is (10 sec: 50791.4, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 1829879808. Throughput: 0: 56258.2. Samples: 1779336620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:59:38,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 06:59:40,600][47288] Updated weights for policy 0, policy_version 111696 (0.0033) [2024-04-26 06:59:43,774][47288] Updated weights for policy 0, policy_version 111706 (0.0028) [2024-04-26 06:59:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56252.0, 300 sec: 56205.5). Total num frames: 1830191104. Throughput: 0: 56246.7. Samples: 1779505260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:59:43,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 06:59:46,454][47288] Updated weights for policy 0, policy_version 111716 (0.0033) [2024-04-26 06:59:48,923][47056] Fps is (10 sec: 58981.2, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1830469632. Throughput: 0: 56347.0. Samples: 1779842600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:59:48,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 06:59:49,680][47288] Updated weights for policy 0, policy_version 111726 (0.0029) [2024-04-26 06:59:52,175][47288] Updated weights for policy 0, policy_version 111736 (0.0028) [2024-04-26 06:59:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 56205.5). Total num frames: 1830731776. Throughput: 0: 56224.2. Samples: 1780173680. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:59:53,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 06:59:54,003][47267] Signal inference workers to stop experience collection... (26650 times) [2024-04-26 06:59:54,004][47267] Signal inference workers to resume experience collection... (26650 times) [2024-04-26 06:59:54,016][47288] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-04-26 06:59:54,017][47288] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-04-26 06:59:55,409][47288] Updated weights for policy 0, policy_version 111746 (0.0035) [2024-04-26 06:59:57,851][47288] Updated weights for policy 0, policy_version 111756 (0.0025) [2024-04-26 06:59:58,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1831026688. Throughput: 0: 56430.2. Samples: 1780348820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 06:59:58,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 07:00:01,183][47288] Updated weights for policy 0, policy_version 111766 (0.0029) [2024-04-26 07:00:03,761][47288] Updated weights for policy 0, policy_version 111776 (0.0028) [2024-04-26 07:00:03,923][47056] Fps is (10 sec: 60620.3, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1831337984. Throughput: 0: 56182.2. Samples: 1780680560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:00:03,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:00:07,184][47288] Updated weights for policy 0, policy_version 111786 (0.0026) [2024-04-26 07:00:08,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56252.0, 300 sec: 56316.5). Total num frames: 1831600128. Throughput: 0: 56225.1. Samples: 1781019900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:00:08,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 07:00:09,655][47288] Updated weights for policy 0, policy_version 111796 (0.0027) [2024-04-26 07:00:12,909][47288] Updated weights for policy 0, policy_version 111806 (0.0034) [2024-04-26 07:00:13,923][47056] Fps is (10 sec: 52429.4, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1831862272. Throughput: 0: 56053.5. Samples: 1781181960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:00:13,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 07:00:15,303][47288] Updated weights for policy 0, policy_version 111816 (0.0037) [2024-04-26 07:00:18,785][47288] Updated weights for policy 0, policy_version 111826 (0.0027) [2024-04-26 07:00:18,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.8, 300 sec: 56149.9). Total num frames: 1832157184. Throughput: 0: 56166.7. Samples: 1781523840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:00:18,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 07:00:20,947][47288] Updated weights for policy 0, policy_version 111836 (0.0022) [2024-04-26 07:00:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1832435712. Throughput: 0: 56231.8. Samples: 1781867060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:00:23,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 07:00:24,668][47288] Updated weights for policy 0, policy_version 111846 (0.0029) [2024-04-26 07:00:27,202][47288] Updated weights for policy 0, policy_version 111856 (0.0031) [2024-04-26 07:00:28,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 1832730624. Throughput: 0: 55976.7. Samples: 1782024220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:00:28,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 07:00:30,434][47288] Updated weights for policy 0, policy_version 111866 (0.0030) [2024-04-26 07:00:33,500][47288] Updated weights for policy 0, policy_version 111876 (0.0031) [2024-04-26 07:00:33,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 56149.9). Total num frames: 1832976384. Throughput: 0: 56039.3. Samples: 1782364360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:00:33,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 07:00:36,316][47288] Updated weights for policy 0, policy_version 111886 (0.0029) [2024-04-26 07:00:38,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 1833287680. Throughput: 0: 56070.3. Samples: 1782696840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:00:38,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 07:00:39,576][47288] Updated weights for policy 0, policy_version 111896 (0.0026) [2024-04-26 07:00:42,223][47288] Updated weights for policy 0, policy_version 111906 (0.0028) [2024-04-26 07:00:43,923][47056] Fps is (10 sec: 62259.2, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1833598976. Throughput: 0: 56169.0. Samples: 1782876420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:00:43,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 07:00:45,260][47288] Updated weights for policy 0, policy_version 111916 (0.0027) [2024-04-26 07:00:48,083][47288] Updated weights for policy 0, policy_version 111926 (0.0030) [2024-04-26 07:00:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1833844736. Throughput: 0: 56313.5. Samples: 1783214660. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:00:48,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 07:00:49,096][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000111931_1833877504.pth... [2024-04-26 07:00:49,140][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000111107_1820377088.pth [2024-04-26 07:00:51,083][47288] Updated weights for policy 0, policy_version 111936 (0.0029) [2024-04-26 07:00:52,074][47267] Signal inference workers to stop experience collection... (26700 times) [2024-04-26 07:00:52,074][47267] Signal inference workers to resume experience collection... (26700 times) [2024-04-26 07:00:52,099][47288] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-04-26 07:00:52,099][47288] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-04-26 07:00:53,755][47288] Updated weights for policy 0, policy_version 111946 (0.0030) [2024-04-26 07:00:53,923][47056] Fps is (10 sec: 52428.9, 60 sec: 56524.9, 300 sec: 56149.9). Total num frames: 1834123264. Throughput: 0: 56247.9. Samples: 1783551060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:00:53,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:00:57,049][47288] Updated weights for policy 0, policy_version 111956 (0.0028) [2024-04-26 07:00:58,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.6, 300 sec: 56094.4). Total num frames: 1834401792. Throughput: 0: 56400.3. Samples: 1783719980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:00:58,923][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 07:00:59,687][47288] Updated weights for policy 0, policy_version 111966 (0.0029) [2024-04-26 07:01:02,787][47288] Updated weights for policy 0, policy_version 111976 (0.0027) [2024-04-26 07:01:03,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.8, 300 sec: 56094.4). Total num frames: 1834696704. Throughput: 0: 56370.2. Samples: 1784060500. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:01:03,923][47056] Avg episode reward: [(0, '0.589')] [2024-04-26 07:01:05,670][47288] Updated weights for policy 0, policy_version 111986 (0.0041) [2024-04-26 07:01:08,518][47288] Updated weights for policy 0, policy_version 111996 (0.0028) [2024-04-26 07:01:08,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.5, 300 sec: 56094.4). Total num frames: 1834958848. Throughput: 0: 56184.0. Samples: 1784395340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:01:08,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:01:11,396][47288] Updated weights for policy 0, policy_version 112006 (0.0029) [2024-04-26 07:01:13,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 1835237376. Throughput: 0: 56407.4. Samples: 1784562540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:01:13,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 07:01:14,225][47288] Updated weights for policy 0, policy_version 112016 (0.0027) [2024-04-26 07:01:17,268][47288] Updated weights for policy 0, policy_version 112026 (0.0033) [2024-04-26 07:01:18,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.8, 300 sec: 56205.5). Total num frames: 1835548672. Throughput: 0: 56455.5. Samples: 1784904860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:01:18,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:01:20,071][47288] Updated weights for policy 0, policy_version 112036 (0.0038) [2024-04-26 07:01:23,081][47288] Updated weights for policy 0, policy_version 112046 (0.0026) [2024-04-26 07:01:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1835810816. Throughput: 0: 56503.6. Samples: 1785239500. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:01:23,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 07:01:25,877][47288] Updated weights for policy 0, policy_version 112056 (0.0028) [2024-04-26 07:01:28,727][47288] Updated weights for policy 0, policy_version 112066 (0.0030) [2024-04-26 07:01:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.8, 300 sec: 56205.4). Total num frames: 1836089344. Throughput: 0: 56306.1. Samples: 1785410200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 07:01:28,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 07:01:31,716][47288] Updated weights for policy 0, policy_version 112076 (0.0030) [2024-04-26 07:01:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56797.8, 300 sec: 56205.5). Total num frames: 1836384256. Throughput: 0: 56286.2. Samples: 1785747540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:01:33,924][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 07:01:34,443][47288] Updated weights for policy 0, policy_version 112086 (0.0027) [2024-04-26 07:01:37,618][47288] Updated weights for policy 0, policy_version 112096 (0.0030) [2024-04-26 07:01:38,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 1836662784. Throughput: 0: 56301.7. Samples: 1786084640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:01:38,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:01:40,295][47288] Updated weights for policy 0, policy_version 112106 (0.0028) [2024-04-26 07:01:43,297][47288] Updated weights for policy 0, policy_version 112116 (0.0029) [2024-04-26 07:01:43,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 56149.9). Total num frames: 1836924928. Throughput: 0: 56170.8. Samples: 1786247660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:01:43,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 07:01:45,998][47288] Updated weights for policy 0, policy_version 112126 (0.0032) [2024-04-26 07:01:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 56149.9). Total num frames: 1837203456. Throughput: 0: 56098.2. Samples: 1786584920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:01:48,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 07:01:49,199][47288] Updated weights for policy 0, policy_version 112136 (0.0032) [2024-04-26 07:01:51,776][47288] Updated weights for policy 0, policy_version 112146 (0.0037) [2024-04-26 07:01:53,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1837514752. Throughput: 0: 56330.4. Samples: 1786930200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:01:53,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 07:01:54,859][47288] Updated weights for policy 0, policy_version 112156 (0.0027) [2024-04-26 07:01:57,561][47288] Updated weights for policy 0, policy_version 112166 (0.0029) [2024-04-26 07:01:58,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1837793280. Throughput: 0: 56483.9. Samples: 1787104320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:01:58,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 07:02:00,605][47288] Updated weights for policy 0, policy_version 112176 (0.0024) [2024-04-26 07:02:03,439][47288] Updated weights for policy 0, policy_version 112186 (0.0030) [2024-04-26 07:02:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1838071808. Throughput: 0: 56309.0. Samples: 1787438760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:02:03,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 07:02:06,407][47288] Updated weights for policy 0, policy_version 112196 (0.0026) [2024-04-26 07:02:08,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56205.5). Total num frames: 1838350336. Throughput: 0: 56407.1. Samples: 1787777820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:02:08,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 07:02:09,258][47288] Updated weights for policy 0, policy_version 112206 (0.0028) [2024-04-26 07:02:12,335][47288] Updated weights for policy 0, policy_version 112216 (0.0030) [2024-04-26 07:02:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.7, 300 sec: 56149.9). Total num frames: 1838628864. Throughput: 0: 56427.6. Samples: 1787949440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:02:13,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 07:02:14,880][47288] Updated weights for policy 0, policy_version 112226 (0.0025) [2024-04-26 07:02:18,233][47288] Updated weights for policy 0, policy_version 112236 (0.0033) [2024-04-26 07:02:18,575][47267] Signal inference workers to stop experience collection... (26750 times) [2024-04-26 07:02:18,576][47267] Signal inference workers to resume experience collection... (26750 times) [2024-04-26 07:02:18,588][47288] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-04-26 07:02:18,588][47288] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-04-26 07:02:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 1838923776. Throughput: 0: 56496.9. Samples: 1788289900. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:02:18,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 07:02:20,580][47288] Updated weights for policy 0, policy_version 112246 (0.0029) [2024-04-26 07:02:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 1839185920. Throughput: 0: 56540.1. Samples: 1788628940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:02:23,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 07:02:24,047][47288] Updated weights for policy 0, policy_version 112256 (0.0027) [2024-04-26 07:02:26,341][47288] Updated weights for policy 0, policy_version 112266 (0.0029) [2024-04-26 07:02:28,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56205.4). Total num frames: 1839480832. Throughput: 0: 56533.0. Samples: 1788791640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:02:28,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 07:02:29,974][47288] Updated weights for policy 0, policy_version 112276 (0.0026) [2024-04-26 07:02:32,250][47288] Updated weights for policy 0, policy_version 112286 (0.0030) [2024-04-26 07:02:33,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1839775744. Throughput: 0: 56533.7. Samples: 1789128940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:02:33,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 07:02:35,640][47288] Updated weights for policy 0, policy_version 112296 (0.0028) [2024-04-26 07:02:38,134][47288] Updated weights for policy 0, policy_version 112306 (0.0034) [2024-04-26 07:02:38,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1840054272. Throughput: 0: 56553.2. Samples: 1789475100. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:02:38,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 07:02:41,543][47288] Updated weights for policy 0, policy_version 112316 (0.0026) [2024-04-26 07:02:43,843][47288] Updated weights for policy 0, policy_version 112326 (0.0028) [2024-04-26 07:02:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 57070.9, 300 sec: 56372.1). Total num frames: 1840349184. Throughput: 0: 56584.0. Samples: 1789650600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:02:43,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 07:02:47,250][47288] Updated weights for policy 0, policy_version 112336 (0.0032) [2024-04-26 07:02:48,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 1840611328. Throughput: 0: 56625.1. Samples: 1789986900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:02:48,924][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 07:02:48,937][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000112342_1840611328.pth... [2024-04-26 07:02:49,005][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000111518_1827110912.pth [2024-04-26 07:02:49,712][47288] Updated weights for policy 0, policy_version 112346 (0.0033) [2024-04-26 07:02:53,130][47288] Updated weights for policy 0, policy_version 112356 (0.0031) [2024-04-26 07:02:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.6, 300 sec: 56205.5). Total num frames: 1840889856. Throughput: 0: 56467.4. Samples: 1790318860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:02:53,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 07:02:55,510][47288] Updated weights for policy 0, policy_version 112366 (0.0028) [2024-04-26 07:02:58,877][47288] Updated weights for policy 0, policy_version 112376 (0.0028) [2024-04-26 07:02:58,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 56316.5). 
Total num frames: 1841168384. Throughput: 0: 56308.4. Samples: 1790483320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:02:58,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 07:03:01,274][47288] Updated weights for policy 0, policy_version 112386 (0.0028) [2024-04-26 07:03:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.5, 300 sec: 56205.4). Total num frames: 1841430528. Throughput: 0: 56279.5. Samples: 1790822480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:03:03,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 07:03:04,805][47288] Updated weights for policy 0, policy_version 112396 (0.0030) [2024-04-26 07:03:07,017][47288] Updated weights for policy 0, policy_version 112406 (0.0026) [2024-04-26 07:03:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1841741824. Throughput: 0: 56291.0. Samples: 1791162040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:03:08,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 07:03:10,667][47288] Updated weights for policy 0, policy_version 112416 (0.0031) [2024-04-26 07:03:12,959][47288] Updated weights for policy 0, policy_version 112426 (0.0033) [2024-04-26 07:03:13,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56524.6, 300 sec: 56372.0). Total num frames: 1842020352. Throughput: 0: 56519.3. Samples: 1791335020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:03:13,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 07:03:16,571][47288] Updated weights for policy 0, policy_version 112436 (0.0028) [2024-04-26 07:03:18,692][47267] Signal inference workers to stop experience collection... (26800 times) [2024-04-26 07:03:18,746][47267] Signal inference workers to resume experience collection... 
(26800 times) [2024-04-26 07:03:18,746][47288] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-04-26 07:03:18,770][47288] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-04-26 07:03:18,855][47288] Updated weights for policy 0, policy_version 112446 (0.0032) [2024-04-26 07:03:18,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1842315264. Throughput: 0: 56364.4. Samples: 1791665340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:03:18,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 07:03:22,280][47288] Updated weights for policy 0, policy_version 112456 (0.0029) [2024-04-26 07:03:23,923][47056] Fps is (10 sec: 55706.9, 60 sec: 56524.8, 300 sec: 56316.6). Total num frames: 1842577408. Throughput: 0: 56256.9. Samples: 1792006660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:03:23,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 07:03:24,683][47288] Updated weights for policy 0, policy_version 112466 (0.0034) [2024-04-26 07:03:27,943][47288] Updated weights for policy 0, policy_version 112476 (0.0027) [2024-04-26 07:03:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1842872320. Throughput: 0: 56132.6. Samples: 1792176560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:03:28,923][47056] Avg episode reward: [(0, '0.453')] [2024-04-26 07:03:30,445][47288] Updated weights for policy 0, policy_version 112486 (0.0026) [2024-04-26 07:03:33,854][47288] Updated weights for policy 0, policy_version 112496 (0.0028) [2024-04-26 07:03:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.8, 300 sec: 56316.6). Total num frames: 1843134464. Throughput: 0: 56252.7. Samples: 1792518260. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 07:03:33,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 07:03:36,218][47288] Updated weights for policy 0, policy_version 112506 (0.0024) [2024-04-26 07:03:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 56261.1). Total num frames: 1843412992. Throughput: 0: 56391.8. Samples: 1792856480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:03:38,923][47056] Avg episode reward: [(0, '0.626')] [2024-04-26 07:03:39,033][47267] Saving new best policy, reward=0.626! [2024-04-26 07:03:39,592][47288] Updated weights for policy 0, policy_version 112516 (0.0025) [2024-04-26 07:03:42,194][47288] Updated weights for policy 0, policy_version 112526 (0.0029) [2024-04-26 07:03:43,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 1843707904. Throughput: 0: 56417.7. Samples: 1793022120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:03:43,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 07:03:45,359][47288] Updated weights for policy 0, policy_version 112536 (0.0034) [2024-04-26 07:03:47,854][47288] Updated weights for policy 0, policy_version 112546 (0.0026) [2024-04-26 07:03:48,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1843986432. Throughput: 0: 56426.5. Samples: 1793361660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:03:48,923][47056] Avg episode reward: [(0, '0.609')] [2024-04-26 07:03:51,009][47288] Updated weights for policy 0, policy_version 112556 (0.0027) [2024-04-26 07:03:53,673][47288] Updated weights for policy 0, policy_version 112566 (0.0026) [2024-04-26 07:03:53,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1844297728. Throughput: 0: 56363.9. Samples: 1793698420. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:03:53,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 07:03:56,852][47288] Updated weights for policy 0, policy_version 112576 (0.0025) [2024-04-26 07:03:58,923][47056] Fps is (10 sec: 57342.3, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1844559872. Throughput: 0: 56504.9. Samples: 1793877740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:03:58,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 07:03:59,422][47288] Updated weights for policy 0, policy_version 112586 (0.0030) [2024-04-26 07:04:02,709][47288] Updated weights for policy 0, policy_version 112596 (0.0028) [2024-04-26 07:04:03,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 1844838400. Throughput: 0: 56584.4. Samples: 1794211640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:04:03,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 07:04:05,205][47288] Updated weights for policy 0, policy_version 112606 (0.0028) [2024-04-26 07:04:08,377][47288] Updated weights for policy 0, policy_version 112616 (0.0025) [2024-04-26 07:04:08,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1845133312. Throughput: 0: 56461.3. Samples: 1794547420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:04:08,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 07:04:11,328][47288] Updated weights for policy 0, policy_version 112626 (0.0031) [2024-04-26 07:04:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1845395456. Throughput: 0: 56455.9. Samples: 1794717080. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:04:13,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 07:04:14,197][47288] Updated weights for policy 0, policy_version 112636 (0.0031) [2024-04-26 07:04:17,175][47288] Updated weights for policy 0, policy_version 112646 (0.0031) [2024-04-26 07:04:18,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1845673984. Throughput: 0: 56342.0. Samples: 1795053660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:04:18,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 07:04:20,019][47267] Signal inference workers to stop experience collection... (26850 times) [2024-04-26 07:04:20,055][47288] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-04-26 07:04:20,110][47267] Signal inference workers to resume experience collection... (26850 times) [2024-04-26 07:04:20,110][47288] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-04-26 07:04:20,112][47288] Updated weights for policy 0, policy_version 112656 (0.0027) [2024-04-26 07:04:23,020][47288] Updated weights for policy 0, policy_version 112666 (0.0028) [2024-04-26 07:04:23,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56251.7, 300 sec: 56205.5). Total num frames: 1845952512. Throughput: 0: 56401.7. Samples: 1795394560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:04:23,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 07:04:25,833][47288] Updated weights for policy 0, policy_version 112676 (0.0027) [2024-04-26 07:04:28,687][47288] Updated weights for policy 0, policy_version 112686 (0.0030) [2024-04-26 07:04:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 1846247424. Throughput: 0: 56304.0. Samples: 1795555800. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:04:28,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 07:04:31,693][47288] Updated weights for policy 0, policy_version 112696 (0.0027) [2024-04-26 07:04:33,923][47056] Fps is (10 sec: 58978.1, 60 sec: 56797.2, 300 sec: 56483.0). Total num frames: 1846542336. Throughput: 0: 56315.9. Samples: 1795895920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:04:33,924][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 07:04:34,326][47288] Updated weights for policy 0, policy_version 112706 (0.0024) [2024-04-26 07:04:37,542][47288] Updated weights for policy 0, policy_version 112716 (0.0028) [2024-04-26 07:04:38,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1846820864. Throughput: 0: 56446.5. Samples: 1796238500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:04:38,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 07:04:40,037][47288] Updated weights for policy 0, policy_version 112726 (0.0028) [2024-04-26 07:04:43,221][47288] Updated weights for policy 0, policy_version 112736 (0.0027) [2024-04-26 07:04:43,923][47056] Fps is (10 sec: 55709.5, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1847099392. Throughput: 0: 56290.9. Samples: 1796410820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:04:43,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 07:04:46,151][47288] Updated weights for policy 0, policy_version 112746 (0.0027) [2024-04-26 07:04:48,816][47288] Updated weights for policy 0, policy_version 112756 (0.0027) [2024-04-26 07:04:48,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 1847394304. Throughput: 0: 56404.5. Samples: 1796749840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:04:48,924][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 07:04:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000112756_1847394304.pth... [2024-04-26 07:04:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000111931_1833877504.pth [2024-04-26 07:04:51,831][47288] Updated weights for policy 0, policy_version 112766 (0.0029) [2024-04-26 07:04:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1847656448. Throughput: 0: 56564.3. Samples: 1797092820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:04:53,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 07:04:54,614][47288] Updated weights for policy 0, policy_version 112776 (0.0031) [2024-04-26 07:04:57,682][47288] Updated weights for policy 0, policy_version 112786 (0.0027) [2024-04-26 07:04:58,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1847934976. Throughput: 0: 56492.9. Samples: 1797259260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:04:58,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 07:05:00,377][47288] Updated weights for policy 0, policy_version 112796 (0.0025) [2024-04-26 07:05:03,463][47288] Updated weights for policy 0, policy_version 112806 (0.0027) [2024-04-26 07:05:03,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1848213504. Throughput: 0: 56525.5. Samples: 1797597300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:05:03,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 07:05:06,357][47288] Updated weights for policy 0, policy_version 112816 (0.0027) [2024-04-26 07:05:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1848508416. Throughput: 0: 56342.5. Samples: 1797929980. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:05:08,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:05:09,330][47288] Updated weights for policy 0, policy_version 112826 (0.0026) [2024-04-26 07:05:12,326][47288] Updated weights for policy 0, policy_version 112836 (0.0027) [2024-04-26 07:05:13,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1848803328. Throughput: 0: 56560.0. Samples: 1798101000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:05:13,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 07:05:15,009][47288] Updated weights for policy 0, policy_version 112846 (0.0029) [2024-04-26 07:05:18,002][47288] Updated weights for policy 0, policy_version 112856 (0.0029) [2024-04-26 07:05:18,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1849065472. Throughput: 0: 56657.0. Samples: 1798445460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:05:18,924][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 07:05:20,654][47288] Updated weights for policy 0, policy_version 112866 (0.0028) [2024-04-26 07:05:23,896][47288] Updated weights for policy 0, policy_version 112876 (0.0028) [2024-04-26 07:05:23,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1849360384. Throughput: 0: 56499.1. Samples: 1798780960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:05:23,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:05:26,485][47288] Updated weights for policy 0, policy_version 112886 (0.0027) [2024-04-26 07:05:28,923][47056] Fps is (10 sec: 55707.2, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1849622528. Throughput: 0: 56368.9. Samples: 1798947420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:05:28,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 07:05:29,694][47288] Updated weights for policy 0, policy_version 112896 (0.0022) [2024-04-26 07:05:32,632][47288] Updated weights for policy 0, policy_version 112906 (0.0025) [2024-04-26 07:05:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56525.4, 300 sec: 56427.6). Total num frames: 1849933824. Throughput: 0: 56576.8. Samples: 1799295800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:05:33,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 07:05:35,309][47288] Updated weights for policy 0, policy_version 112916 (0.0027) [2024-04-26 07:05:38,496][47288] Updated weights for policy 0, policy_version 112926 (0.0027) [2024-04-26 07:05:38,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1850195968. Throughput: 0: 56609.0. Samples: 1799640220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:05:38,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 07:05:39,108][47267] Signal inference workers to stop experience collection... (26900 times) [2024-04-26 07:05:39,109][47267] Signal inference workers to resume experience collection... (26900 times) [2024-04-26 07:05:39,122][47288] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-04-26 07:05:39,122][47288] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-04-26 07:05:41,084][47288] Updated weights for policy 0, policy_version 112936 (0.0026) [2024-04-26 07:05:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1850490880. Throughput: 0: 56498.2. Samples: 1799801680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:05:43,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 07:05:44,300][47288] Updated weights for policy 0, policy_version 112946 (0.0028) [2024-04-26 07:05:46,812][47288] Updated weights for policy 0, policy_version 112956 (0.0031) [2024-04-26 07:05:48,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1850785792. Throughput: 0: 56582.2. Samples: 1800143500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:05:48,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 07:05:50,110][47288] Updated weights for policy 0, policy_version 112966 (0.0027) [2024-04-26 07:05:52,546][47288] Updated weights for policy 0, policy_version 112976 (0.0027) [2024-04-26 07:05:53,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 1851064320. Throughput: 0: 56775.6. Samples: 1800484880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:05:53,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 07:05:55,835][47288] Updated weights for policy 0, policy_version 112986 (0.0031) [2024-04-26 07:05:58,430][47288] Updated weights for policy 0, policy_version 112996 (0.0029) [2024-04-26 07:05:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1851326464. Throughput: 0: 56691.3. Samples: 1800652100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:05:58,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:06:01,543][47288] Updated weights for policy 0, policy_version 113006 (0.0030) [2024-04-26 07:06:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1851621376. Throughput: 0: 56559.8. Samples: 1800990640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:06:03,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:06:04,191][47288] Updated weights for policy 0, policy_version 113016 (0.0029) [2024-04-26 07:06:07,341][47288] Updated weights for policy 0, policy_version 113026 (0.0032) [2024-04-26 07:06:08,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1851899904. Throughput: 0: 56550.1. Samples: 1801325720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:06:08,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 07:06:10,337][47288] Updated weights for policy 0, policy_version 113036 (0.0028) [2024-04-26 07:06:13,148][47288] Updated weights for policy 0, policy_version 113046 (0.0028) [2024-04-26 07:06:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1852194816. Throughput: 0: 56539.4. Samples: 1801491700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:06:13,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 07:06:16,296][47288] Updated weights for policy 0, policy_version 113056 (0.0028) [2024-04-26 07:06:18,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56525.1, 300 sec: 56427.6). Total num frames: 1852456960. Throughput: 0: 56309.6. Samples: 1801829720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:06:18,923][47056] Avg episode reward: [(0, '0.579')] [2024-04-26 07:06:18,957][47288] Updated weights for policy 0, policy_version 113066 (0.0027) [2024-04-26 07:06:22,055][47288] Updated weights for policy 0, policy_version 113076 (0.0030) [2024-04-26 07:06:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1852751872. Throughput: 0: 56208.0. Samples: 1802169580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:06:23,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 07:06:24,737][47288] Updated weights for policy 0, policy_version 113086 (0.0034) [2024-04-26 07:06:27,936][47288] Updated weights for policy 0, policy_version 113096 (0.0030) [2024-04-26 07:06:28,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1853030400. Throughput: 0: 56430.0. Samples: 1802341020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:06:28,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 07:06:30,528][47288] Updated weights for policy 0, policy_version 113106 (0.0025) [2024-04-26 07:06:33,770][47288] Updated weights for policy 0, policy_version 113116 (0.0035) [2024-04-26 07:06:33,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1853292544. Throughput: 0: 56309.8. Samples: 1802677440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 07:06:33,923][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 07:06:36,340][47288] Updated weights for policy 0, policy_version 113126 (0.0026) [2024-04-26 07:06:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1853571072. Throughput: 0: 56295.6. Samples: 1803018180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:06:38,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 07:06:39,578][47288] Updated weights for policy 0, policy_version 113136 (0.0029) [2024-04-26 07:06:42,075][47288] Updated weights for policy 0, policy_version 113146 (0.0026) [2024-04-26 07:06:43,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1853882368. Throughput: 0: 56243.0. Samples: 1803183040. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:06:43,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 07:06:45,399][47288] Updated weights for policy 0, policy_version 113156 (0.0026) [2024-04-26 07:06:47,922][47288] Updated weights for policy 0, policy_version 113166 (0.0031) [2024-04-26 07:06:48,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1854160896. Throughput: 0: 56148.1. Samples: 1803517300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:06:48,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 07:06:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000113169_1854160896.pth... [2024-04-26 07:06:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000112342_1840611328.pth [2024-04-26 07:06:49,522][47267] Signal inference workers to stop experience collection... (26950 times) [2024-04-26 07:06:49,522][47267] Signal inference workers to resume experience collection... (26950 times) [2024-04-26 07:06:49,535][47288] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-04-26 07:06:49,535][47288] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-04-26 07:06:51,321][47288] Updated weights for policy 0, policy_version 113176 (0.0026) [2024-04-26 07:06:53,729][47288] Updated weights for policy 0, policy_version 113186 (0.0030) [2024-04-26 07:06:53,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1854439424. Throughput: 0: 56163.5. Samples: 1803853080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:06:53,924][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 07:06:57,190][47288] Updated weights for policy 0, policy_version 113196 (0.0024) [2024-04-26 07:06:58,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1854701568. Throughput: 0: 56365.3. 
Samples: 1804028140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:06:58,924][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 07:06:59,536][47288] Updated weights for policy 0, policy_version 113206 (0.0026) [2024-04-26 07:07:02,989][47288] Updated weights for policy 0, policy_version 113216 (0.0032) [2024-04-26 07:07:03,923][47056] Fps is (10 sec: 55706.8, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1854996480. Throughput: 0: 56376.9. Samples: 1804366680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:07:03,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 07:07:05,268][47288] Updated weights for policy 0, policy_version 113226 (0.0027) [2024-04-26 07:07:08,742][47288] Updated weights for policy 0, policy_version 113236 (0.0028) [2024-04-26 07:07:08,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1855258624. Throughput: 0: 56393.4. Samples: 1804707280. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:07:08,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 07:07:11,108][47288] Updated weights for policy 0, policy_version 113246 (0.0021) [2024-04-26 07:07:13,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 1855537152. Throughput: 0: 56030.2. Samples: 1804862380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:07:13,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 07:07:14,651][47288] Updated weights for policy 0, policy_version 113256 (0.0029) [2024-04-26 07:07:16,768][47288] Updated weights for policy 0, policy_version 113266 (0.0027) [2024-04-26 07:07:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1855832064. Throughput: 0: 56074.6. Samples: 1805200800. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:07:18,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 07:07:20,446][47288] Updated weights for policy 0, policy_version 113276 (0.0029) [2024-04-26 07:07:22,617][47288] Updated weights for policy 0, policy_version 113286 (0.0023) [2024-04-26 07:07:23,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1856126976. Throughput: 0: 55992.4. Samples: 1805537840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:07:23,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 07:07:26,296][47288] Updated weights for policy 0, policy_version 113296 (0.0032) [2024-04-26 07:07:28,336][47288] Updated weights for policy 0, policy_version 113306 (0.0024) [2024-04-26 07:07:28,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1856421888. Throughput: 0: 56131.6. Samples: 1805708960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:07:28,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 07:07:32,046][47288] Updated weights for policy 0, policy_version 113316 (0.0030) [2024-04-26 07:07:33,923][47056] Fps is (10 sec: 57342.3, 60 sec: 56797.6, 300 sec: 56427.6). Total num frames: 1856700416. Throughput: 0: 56326.8. Samples: 1806052020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-26 07:07:33,923][47056] Avg episode reward: [(0, '0.585')] [2024-04-26 07:07:34,200][47288] Updated weights for policy 0, policy_version 113326 (0.0031) [2024-04-26 07:07:37,935][47288] Updated weights for policy 0, policy_version 113336 (0.0035) [2024-04-26 07:07:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1856962560. Throughput: 0: 56165.8. Samples: 1806380540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:07:38,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 07:07:40,168][47288] Updated weights for policy 0, policy_version 113346 (0.0027) [2024-04-26 07:07:43,747][47288] Updated weights for policy 0, policy_version 113356 (0.0028) [2024-04-26 07:07:43,923][47056] Fps is (10 sec: 54068.9, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1857241088. Throughput: 0: 56038.9. Samples: 1806549880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:07:43,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 07:07:45,955][47288] Updated weights for policy 0, policy_version 113366 (0.0031) [2024-04-26 07:07:48,923][47056] Fps is (10 sec: 52429.7, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 1857486848. Throughput: 0: 56040.4. Samples: 1806888500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:07:48,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 07:07:49,418][47267] Signal inference workers to stop experience collection... (27000 times) [2024-04-26 07:07:49,423][47267] Signal inference workers to resume experience collection... (27000 times) [2024-04-26 07:07:49,443][47288] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-04-26 07:07:49,443][47288] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-04-26 07:07:49,548][47288] Updated weights for policy 0, policy_version 113376 (0.0031) [2024-04-26 07:07:51,715][47288] Updated weights for policy 0, policy_version 113386 (0.0028) [2024-04-26 07:07:53,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 1857781760. Throughput: 0: 56050.2. Samples: 1807229540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:07:53,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 07:07:55,477][47288] Updated weights for policy 0, policy_version 113396 (0.0029) [2024-04-26 07:07:57,682][47288] Updated weights for policy 0, policy_version 113406 (0.0027) [2024-04-26 07:07:58,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1858093056. Throughput: 0: 56162.7. Samples: 1807389700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:07:58,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 07:08:01,270][47288] Updated weights for policy 0, policy_version 113416 (0.0030) [2024-04-26 07:08:03,621][47288] Updated weights for policy 0, policy_version 113426 (0.0026) [2024-04-26 07:08:03,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 1858371584. Throughput: 0: 56161.8. Samples: 1807728080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:08:03,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 07:08:07,126][47288] Updated weights for policy 0, policy_version 113436 (0.0028) [2024-04-26 07:08:08,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1858650112. Throughput: 0: 56089.3. Samples: 1808061860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:08:08,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 07:08:09,480][47288] Updated weights for policy 0, policy_version 113446 (0.0039) [2024-04-26 07:08:12,852][47288] Updated weights for policy 0, policy_version 113456 (0.0026) [2024-04-26 07:08:13,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1858945024. Throughput: 0: 56330.4. Samples: 1808243820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:08:13,923][47056] Avg episode reward: [(0, '0.636')] [2024-04-26 07:08:13,939][47267] Saving new best policy, reward=0.636! 
[2024-04-26 07:08:15,356][47288] Updated weights for policy 0, policy_version 113466 (0.0029) [2024-04-26 07:08:18,693][47288] Updated weights for policy 0, policy_version 113476 (0.0030) [2024-04-26 07:08:18,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1859207168. Throughput: 0: 56207.3. Samples: 1808581340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:08:18,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 07:08:21,439][47288] Updated weights for policy 0, policy_version 113486 (0.0030) [2024-04-26 07:08:23,923][47056] Fps is (10 sec: 52427.8, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 1859469312. Throughput: 0: 56435.6. Samples: 1808920140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:08:23,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 07:08:24,421][47288] Updated weights for policy 0, policy_version 113496 (0.0031) [2024-04-26 07:08:27,126][47288] Updated weights for policy 0, policy_version 113506 (0.0027) [2024-04-26 07:08:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 56316.5). Total num frames: 1859747840. Throughput: 0: 56180.1. Samples: 1809078000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:08:28,924][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:08:30,214][47288] Updated weights for policy 0, policy_version 113516 (0.0032) [2024-04-26 07:08:33,086][47288] Updated weights for policy 0, policy_version 113526 (0.0032) [2024-04-26 07:08:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55705.8, 300 sec: 56372.0). Total num frames: 1860042752. Throughput: 0: 56214.9. Samples: 1809418180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:08:33,923][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 07:08:36,024][47288] Updated weights for policy 0, policy_version 113536 (0.0034) [2024-04-26 07:08:38,762][47288] Updated weights for policy 0, policy_version 113546 (0.0033) [2024-04-26 07:08:38,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1860337664. Throughput: 0: 56210.4. Samples: 1809759020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:08:38,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 07:08:41,950][47288] Updated weights for policy 0, policy_version 113556 (0.0029) [2024-04-26 07:08:43,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 1860632576. Throughput: 0: 56431.4. Samples: 1809929120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:08:43,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 07:08:44,656][47288] Updated weights for policy 0, policy_version 113566 (0.0025) [2024-04-26 07:08:47,741][47288] Updated weights for policy 0, policy_version 113576 (0.0030) [2024-04-26 07:08:48,923][47056] Fps is (10 sec: 57345.4, 60 sec: 57070.9, 300 sec: 56316.6). Total num frames: 1860911104. Throughput: 0: 56430.8. Samples: 1810267460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:08:48,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 07:08:49,007][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000113582_1860927488.pth... [2024-04-26 07:08:49,055][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000112756_1847394304.pth [2024-04-26 07:08:50,407][47288] Updated weights for policy 0, policy_version 113586 (0.0028) [2024-04-26 07:08:53,516][47288] Updated weights for policy 0, policy_version 113596 (0.0031) [2024-04-26 07:08:53,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56797.9, 300 sec: 56372.1). 
Total num frames: 1861189632. Throughput: 0: 56497.4. Samples: 1810604240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:08:53,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 07:08:53,992][47267] Signal inference workers to stop experience collection... (27050 times) [2024-04-26 07:08:54,032][47288] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-04-26 07:08:54,042][47267] Signal inference workers to resume experience collection... (27050 times) [2024-04-26 07:08:54,045][47288] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-04-26 07:08:56,282][47288] Updated weights for policy 0, policy_version 113606 (0.0024) [2024-04-26 07:08:58,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1861451776. Throughput: 0: 56038.4. Samples: 1810765560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:08:58,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 07:08:59,316][47288] Updated weights for policy 0, policy_version 113616 (0.0029) [2024-04-26 07:09:02,124][47288] Updated weights for policy 0, policy_version 113626 (0.0027) [2024-04-26 07:09:03,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 1861713920. Throughput: 0: 56187.8. Samples: 1811109780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:09:03,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 07:09:05,183][47288] Updated weights for policy 0, policy_version 113636 (0.0029) [2024-04-26 07:09:07,910][47288] Updated weights for policy 0, policy_version 113646 (0.0026) [2024-04-26 07:09:08,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1861992448. Throughput: 0: 56214.7. Samples: 1811449800. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:09:08,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 07:09:10,840][47288] Updated weights for policy 0, policy_version 113656 (0.0027) [2024-04-26 07:09:13,801][47288] Updated weights for policy 0, policy_version 113666 (0.0026) [2024-04-26 07:09:13,923][47056] Fps is (10 sec: 58981.2, 60 sec: 55978.4, 300 sec: 56372.1). Total num frames: 1862303744. Throughput: 0: 56142.3. Samples: 1811604400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:09:13,924][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 07:09:16,537][47288] Updated weights for policy 0, policy_version 113676 (0.0032) [2024-04-26 07:09:18,923][47056] Fps is (10 sec: 60621.2, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 1862598656. Throughput: 0: 56075.2. Samples: 1811941560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:09:18,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 07:09:19,714][47288] Updated weights for policy 0, policy_version 113686 (0.0031) [2024-04-26 07:09:22,427][47288] Updated weights for policy 0, policy_version 113696 (0.0028) [2024-04-26 07:09:23,923][47056] Fps is (10 sec: 58983.5, 60 sec: 57071.0, 300 sec: 56427.6). Total num frames: 1862893568. Throughput: 0: 56049.6. Samples: 1812281240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:09:23,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 07:09:25,472][47288] Updated weights for policy 0, policy_version 113706 (0.0025) [2024-04-26 07:09:28,238][47288] Updated weights for policy 0, policy_version 113716 (0.0035) [2024-04-26 07:09:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 57071.2, 300 sec: 56372.2). Total num frames: 1863172096. Throughput: 0: 56388.7. Samples: 1812466600. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:09:28,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 07:09:31,280][47288] Updated weights for policy 0, policy_version 113726 (0.0025) [2024-04-26 07:09:33,923][47056] Fps is (10 sec: 52429.0, 60 sec: 56251.9, 300 sec: 56261.0). Total num frames: 1863417856. Throughput: 0: 56351.1. Samples: 1812803260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:09:33,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 07:09:34,071][47288] Updated weights for policy 0, policy_version 113736 (0.0024) [2024-04-26 07:09:37,212][47288] Updated weights for policy 0, policy_version 113746 (0.0030) [2024-04-26 07:09:38,923][47056] Fps is (10 sec: 50790.1, 60 sec: 55705.8, 300 sec: 56205.4). Total num frames: 1863680000. Throughput: 0: 56327.5. Samples: 1813138980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 07:09:38,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:09:39,985][47288] Updated weights for policy 0, policy_version 113756 (0.0030) [2024-04-26 07:09:42,975][47288] Updated weights for policy 0, policy_version 113766 (0.0030) [2024-04-26 07:09:43,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55432.7, 300 sec: 56149.9). Total num frames: 1863958528. Throughput: 0: 56277.0. Samples: 1813298020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:09:43,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 07:09:44,181][47267] Signal inference workers to stop experience collection... (27100 times) [2024-04-26 07:09:44,182][47267] Signal inference workers to resume experience collection... 
(27100 times) [2024-04-26 07:09:44,200][47288] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-04-26 07:09:44,200][47288] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-04-26 07:09:45,701][47288] Updated weights for policy 0, policy_version 113776 (0.0031) [2024-04-26 07:09:48,830][47288] Updated weights for policy 0, policy_version 113786 (0.0029) [2024-04-26 07:09:48,923][47056] Fps is (10 sec: 58981.9, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1864269824. Throughput: 0: 56147.8. Samples: 1813636440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:09:48,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 07:09:51,417][47288] Updated weights for policy 0, policy_version 113796 (0.0026) [2024-04-26 07:09:53,923][47056] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 56316.6). Total num frames: 1864548352. Throughput: 0: 56105.9. Samples: 1813974560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:09:53,923][47056] Avg episode reward: [(0, '0.581')] [2024-04-26 07:09:54,778][47288] Updated weights for policy 0, policy_version 113806 (0.0031) [2024-04-26 07:09:57,250][47288] Updated weights for policy 0, policy_version 113816 (0.0025) [2024-04-26 07:09:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1864843264. Throughput: 0: 56297.4. Samples: 1814137780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:09:58,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 07:10:00,845][47288] Updated weights for policy 0, policy_version 113826 (0.0031) [2024-04-26 07:10:03,150][47288] Updated weights for policy 0, policy_version 113836 (0.0028) [2024-04-26 07:10:03,923][47056] Fps is (10 sec: 58981.7, 60 sec: 57070.8, 300 sec: 56372.1). Total num frames: 1865138176. Throughput: 0: 56228.3. Samples: 1814471840. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:10:03,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:10:06,770][47288] Updated weights for policy 0, policy_version 113846 (0.0030) [2024-04-26 07:10:08,814][47288] Updated weights for policy 0, policy_version 113856 (0.0031) [2024-04-26 07:10:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 57070.8, 300 sec: 56316.5). Total num frames: 1865416704. Throughput: 0: 56266.4. Samples: 1814813240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:10:08,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 07:10:12,694][47288] Updated weights for policy 0, policy_version 113866 (0.0029) [2024-04-26 07:10:13,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1865695232. Throughput: 0: 56111.0. Samples: 1814991600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:10:13,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 07:10:14,558][47288] Updated weights for policy 0, policy_version 113876 (0.0027) [2024-04-26 07:10:18,618][47288] Updated weights for policy 0, policy_version 113886 (0.0029) [2024-04-26 07:10:18,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.4, 300 sec: 56205.4). Total num frames: 1865940992. Throughput: 0: 56194.3. Samples: 1815332020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:10:18,923][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 07:10:20,538][47288] Updated weights for policy 0, policy_version 113896 (0.0025) [2024-04-26 07:10:23,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55432.5, 300 sec: 56261.0). Total num frames: 1866219520. Throughput: 0: 56288.5. Samples: 1815671960. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:10:23,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 07:10:24,241][47288] Updated weights for policy 0, policy_version 113906 (0.0027) [2024-04-26 07:10:26,398][47288] Updated weights for policy 0, policy_version 113916 (0.0026) [2024-04-26 07:10:28,923][47056] Fps is (10 sec: 58983.1, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 1866530816. Throughput: 0: 56432.3. Samples: 1815837480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:10:28,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 07:10:29,847][47288] Updated weights for policy 0, policy_version 113926 (0.0031) [2024-04-26 07:10:32,098][47288] Updated weights for policy 0, policy_version 113936 (0.0027) [2024-04-26 07:10:33,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56524.6, 300 sec: 56316.5). Total num frames: 1866809344. Throughput: 0: 56607.5. Samples: 1816183780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:10:33,924][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 07:10:35,677][47288] Updated weights for policy 0, policy_version 113946 (0.0029) [2024-04-26 07:10:36,306][47267] Signal inference workers to stop experience collection... (27150 times) [2024-04-26 07:10:36,335][47288] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-04-26 07:10:36,361][47267] Signal inference workers to resume experience collection... (27150 times) [2024-04-26 07:10:36,362][47288] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-04-26 07:10:37,766][47288] Updated weights for policy 0, policy_version 113956 (0.0031) [2024-04-26 07:10:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 57070.9, 300 sec: 56316.5). Total num frames: 1867104256. Throughput: 0: 56597.2. Samples: 1816521440. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-04-26 07:10:38,923][47056] Avg episode reward: [(0, '0.614')] [2024-04-26 07:10:41,526][47288] Updated weights for policy 0, policy_version 113966 (0.0026) [2024-04-26 07:10:43,652][47288] Updated weights for policy 0, policy_version 113976 (0.0028) [2024-04-26 07:10:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 57070.8, 300 sec: 56261.0). Total num frames: 1867382784. Throughput: 0: 56984.0. Samples: 1816702060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:10:43,924][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 07:10:47,331][47288] Updated weights for policy 0, policy_version 113986 (0.0030) [2024-04-26 07:10:48,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 1867661312. Throughput: 0: 56965.4. Samples: 1817035280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:10:48,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 07:10:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000113993_1867661312.pth... [2024-04-26 07:10:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000113169_1854160896.pth [2024-04-26 07:10:49,507][47288] Updated weights for policy 0, policy_version 113996 (0.0032) [2024-04-26 07:10:52,918][47288] Updated weights for policy 0, policy_version 114006 (0.0026) [2024-04-26 07:10:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.7, 300 sec: 56372.0). Total num frames: 1867956224. Throughput: 0: 56835.2. Samples: 1817370820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:10:53,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 07:10:55,196][47288] Updated weights for policy 0, policy_version 114016 (0.0026) [2024-04-26 07:10:58,721][47288] Updated weights for policy 0, policy_version 114026 (0.0030) [2024-04-26 07:10:58,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 56261.0). 
Total num frames: 1868218368. Throughput: 0: 56537.6. Samples: 1817535800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:10:58,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 07:11:01,072][47288] Updated weights for policy 0, policy_version 114036 (0.0035) [2024-04-26 07:11:03,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 56205.5). Total num frames: 1868480512. Throughput: 0: 56495.8. Samples: 1817874320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:11:03,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 07:11:04,574][47288] Updated weights for policy 0, policy_version 114046 (0.0033) [2024-04-26 07:11:07,220][47288] Updated weights for policy 0, policy_version 114056 (0.0028) [2024-04-26 07:11:08,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1868808192. Throughput: 0: 56415.3. Samples: 1818210660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:11:08,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 07:11:10,205][47288] Updated weights for policy 0, policy_version 114066 (0.0027) [2024-04-26 07:11:12,959][47288] Updated weights for policy 0, policy_version 114076 (0.0027) [2024-04-26 07:11:13,923][47056] Fps is (10 sec: 60620.3, 60 sec: 56524.8, 300 sec: 56372.0). Total num frames: 1869086720. Throughput: 0: 56646.6. Samples: 1818386580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:11:13,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 07:11:15,870][47288] Updated weights for policy 0, policy_version 114086 (0.0030) [2024-04-26 07:11:18,679][47288] Updated weights for policy 0, policy_version 114096 (0.0031) [2024-04-26 07:11:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 57071.0, 300 sec: 56316.5). Total num frames: 1869365248. Throughput: 0: 56486.7. Samples: 1818725680. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:11:18,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 07:11:21,712][47288] Updated weights for policy 0, policy_version 114106 (0.0025) [2024-04-26 07:11:22,252][47267] Signal inference workers to stop experience collection... (27200 times) [2024-04-26 07:11:22,252][47267] Signal inference workers to resume experience collection... (27200 times) [2024-04-26 07:11:22,263][47288] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-04-26 07:11:22,264][47288] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-04-26 07:11:23,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56797.7, 300 sec: 56261.0). Total num frames: 1869627392. Throughput: 0: 56548.0. Samples: 1819066100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:11:23,924][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 07:11:24,595][47288] Updated weights for policy 0, policy_version 114116 (0.0031) [2024-04-26 07:11:27,584][47288] Updated weights for policy 0, policy_version 114126 (0.0025) [2024-04-26 07:11:28,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1869938688. Throughput: 0: 56362.7. Samples: 1819238380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:11:28,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 07:11:30,408][47288] Updated weights for policy 0, policy_version 114136 (0.0027) [2024-04-26 07:11:33,285][47288] Updated weights for policy 0, policy_version 114146 (0.0027) [2024-04-26 07:11:33,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 1870217216. Throughput: 0: 56549.4. Samples: 1819580000. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:11:33,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 07:11:36,099][47288] Updated weights for policy 0, policy_version 114156 (0.0026) [2024-04-26 07:11:38,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1870479360. Throughput: 0: 56579.2. Samples: 1819916880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-26 07:11:38,923][47056] Avg episode reward: [(0, '0.622')] [2024-04-26 07:11:39,139][47288] Updated weights for policy 0, policy_version 114166 (0.0030) [2024-04-26 07:11:41,927][47288] Updated weights for policy 0, policy_version 114176 (0.0032) [2024-04-26 07:11:43,923][47056] Fps is (10 sec: 54066.0, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1870757888. Throughput: 0: 56713.8. Samples: 1820087920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:11:43,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 07:11:44,807][47288] Updated weights for policy 0, policy_version 114186 (0.0028) [2024-04-26 07:11:47,735][47288] Updated weights for policy 0, policy_version 114196 (0.0025) [2024-04-26 07:11:48,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1871069184. Throughput: 0: 56621.7. Samples: 1820422300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:11:48,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 07:11:50,621][47288] Updated weights for policy 0, policy_version 114206 (0.0025) [2024-04-26 07:11:53,560][47288] Updated weights for policy 0, policy_version 114216 (0.0031) [2024-04-26 07:11:53,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1871331328. Throughput: 0: 56760.2. Samples: 1820764860. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:11:53,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 07:11:56,278][47288] Updated weights for policy 0, policy_version 114226 (0.0032) [2024-04-26 07:11:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56525.0, 300 sec: 56316.5). Total num frames: 1871609856. Throughput: 0: 56459.7. Samples: 1820927260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:11:58,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:11:59,258][47288] Updated weights for policy 0, policy_version 114236 (0.0034) [2024-04-26 07:12:01,961][47288] Updated weights for policy 0, policy_version 114246 (0.0025) [2024-04-26 07:12:03,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56797.8, 300 sec: 56372.0). Total num frames: 1871888384. Throughput: 0: 56381.8. Samples: 1821262860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:12:03,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 07:12:05,048][47288] Updated weights for policy 0, policy_version 114256 (0.0027) [2024-04-26 07:12:07,836][47288] Updated weights for policy 0, policy_version 114266 (0.0023) [2024-04-26 07:12:08,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 1872199680. Throughput: 0: 56281.8. Samples: 1821598780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:12:08,923][47056] Avg episode reward: [(0, '0.613')] [2024-04-26 07:12:10,736][47267] Signal inference workers to stop experience collection... (27250 times) [2024-04-26 07:12:10,769][47288] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-04-26 07:12:10,795][47267] Signal inference workers to resume experience collection... 
(27250 times) [2024-04-26 07:12:10,795][47288] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-04-26 07:12:10,908][47288] Updated weights for policy 0, policy_version 114276 (0.0037) [2024-04-26 07:12:13,646][47288] Updated weights for policy 0, policy_version 114286 (0.0029) [2024-04-26 07:12:13,923][47056] Fps is (10 sec: 60621.6, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 1872494592. Throughput: 0: 56455.7. Samples: 1821778880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:12:13,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 07:12:16,700][47288] Updated weights for policy 0, policy_version 114296 (0.0028) [2024-04-26 07:12:18,923][47056] Fps is (10 sec: 52429.6, 60 sec: 55978.9, 300 sec: 56261.0). Total num frames: 1872723968. Throughput: 0: 56270.3. Samples: 1822112160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:12:18,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 07:12:19,361][47288] Updated weights for policy 0, policy_version 114306 (0.0031) [2024-04-26 07:12:22,355][47288] Updated weights for policy 0, policy_version 114316 (0.0035) [2024-04-26 07:12:23,923][47056] Fps is (10 sec: 52428.4, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 1873018880. Throughput: 0: 56420.0. Samples: 1822455780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:12:23,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 07:12:25,176][47288] Updated weights for policy 0, policy_version 114326 (0.0026) [2024-04-26 07:12:28,239][47288] Updated weights for policy 0, policy_version 114336 (0.0033) [2024-04-26 07:12:28,923][47056] Fps is (10 sec: 58980.8, 60 sec: 56251.7, 300 sec: 56316.6). Total num frames: 1873313792. Throughput: 0: 56377.3. Samples: 1822624900. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:12:28,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 07:12:30,915][47288] Updated weights for policy 0, policy_version 114346 (0.0030) [2024-04-26 07:12:33,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1873575936. Throughput: 0: 56572.1. Samples: 1822968040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:12:33,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 07:12:34,064][47288] Updated weights for policy 0, policy_version 114356 (0.0030) [2024-04-26 07:12:36,707][47288] Updated weights for policy 0, policy_version 114366 (0.0031) [2024-04-26 07:12:38,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1873854464. Throughput: 0: 56454.1. Samples: 1823305300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:12:38,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 07:12:39,830][47288] Updated weights for policy 0, policy_version 114376 (0.0030) [2024-04-26 07:12:42,545][47288] Updated weights for policy 0, policy_version 114386 (0.0027) [2024-04-26 07:12:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 1874149376. Throughput: 0: 56441.2. Samples: 1823467120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 07:12:43,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 07:12:45,660][47288] Updated weights for policy 0, policy_version 114396 (0.0027) [2024-04-26 07:12:48,217][47288] Updated weights for policy 0, policy_version 114406 (0.0027) [2024-04-26 07:12:48,923][47056] Fps is (10 sec: 62258.6, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1874477056. Throughput: 0: 56672.8. Samples: 1823813140. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:12:48,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 07:12:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000114409_1874477056.pth... [2024-04-26 07:12:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000113582_1860927488.pth [2024-04-26 07:12:51,436][47288] Updated weights for policy 0, policy_version 114416 (0.0042) [2024-04-26 07:12:53,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1874722816. Throughput: 0: 56685.5. Samples: 1824149620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:12:53,923][47056] Avg episode reward: [(0, '0.618')] [2024-04-26 07:12:54,084][47288] Updated weights for policy 0, policy_version 114426 (0.0032) [2024-04-26 07:12:57,143][47288] Updated weights for policy 0, policy_version 114436 (0.0026) [2024-04-26 07:12:58,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1875001344. Throughput: 0: 56502.9. Samples: 1824321520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:12:58,923][47056] Avg episode reward: [(0, '0.624')] [2024-04-26 07:12:59,938][47288] Updated weights for policy 0, policy_version 114446 (0.0030) [2024-04-26 07:13:02,880][47288] Updated weights for policy 0, policy_version 114456 (0.0031) [2024-04-26 07:13:03,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1875279872. Throughput: 0: 56573.6. Samples: 1824657980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:13:03,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 07:13:05,696][47288] Updated weights for policy 0, policy_version 114466 (0.0025) [2024-04-26 07:13:08,738][47288] Updated weights for policy 0, policy_version 114476 (0.0029) [2024-04-26 07:13:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 56372.0). 
Total num frames: 1875574784. Throughput: 0: 56391.5. Samples: 1824993400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:13:08,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 07:13:11,217][47267] Signal inference workers to stop experience collection... (27300 times) [2024-04-26 07:13:11,260][47288] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-04-26 07:13:11,271][47267] Signal inference workers to resume experience collection... (27300 times) [2024-04-26 07:13:11,277][47288] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-04-26 07:13:11,375][47288] Updated weights for policy 0, policy_version 114486 (0.0034) [2024-04-26 07:13:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 56372.1). Total num frames: 1875836928. Throughput: 0: 56262.7. Samples: 1825156720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:13:13,924][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 07:13:14,595][47288] Updated weights for policy 0, policy_version 114496 (0.0029) [2024-04-26 07:13:17,154][47288] Updated weights for policy 0, policy_version 114506 (0.0032) [2024-04-26 07:13:18,923][47056] Fps is (10 sec: 54068.1, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1876115456. Throughput: 0: 56124.1. Samples: 1825493620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:13:18,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 07:13:20,538][47288] Updated weights for policy 0, policy_version 114516 (0.0033) [2024-04-26 07:13:22,939][47288] Updated weights for policy 0, policy_version 114526 (0.0028) [2024-04-26 07:13:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1876410368. Throughput: 0: 56232.0. Samples: 1825835740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:13:23,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 07:13:26,181][47288] Updated weights for policy 0, policy_version 114536 (0.0025) [2024-04-26 07:13:28,844][47288] Updated weights for policy 0, policy_version 114546 (0.0030) [2024-04-26 07:13:28,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56798.0, 300 sec: 56538.7). Total num frames: 1876721664. Throughput: 0: 56532.1. Samples: 1826011060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:13:28,923][47056] Avg episode reward: [(0, '0.597')] [2024-04-26 07:13:31,961][47288] Updated weights for policy 0, policy_version 114556 (0.0027) [2024-04-26 07:13:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1876983808. Throughput: 0: 56444.6. Samples: 1826353140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:13:33,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 07:13:34,564][47288] Updated weights for policy 0, policy_version 114566 (0.0026) [2024-04-26 07:13:37,822][47288] Updated weights for policy 0, policy_version 114576 (0.0037) [2024-04-26 07:13:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56798.0, 300 sec: 56372.1). Total num frames: 1877262336. Throughput: 0: 56465.7. Samples: 1826690580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:13:38,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 07:13:40,373][47288] Updated weights for policy 0, policy_version 114586 (0.0029) [2024-04-26 07:13:43,436][47288] Updated weights for policy 0, policy_version 114596 (0.0025) [2024-04-26 07:13:43,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1877557248. Throughput: 0: 56515.9. Samples: 1826864740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 07:13:43,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 07:13:46,435][47288] Updated weights for policy 0, policy_version 114606 (0.0031) [2024-04-26 07:13:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.8, 300 sec: 56372.1). Total num frames: 1877819392. Throughput: 0: 56541.5. Samples: 1827202340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:13:48,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 07:13:49,210][47288] Updated weights for policy 0, policy_version 114616 (0.0035) [2024-04-26 07:13:52,255][47288] Updated weights for policy 0, policy_version 114626 (0.0029) [2024-04-26 07:13:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.6, 300 sec: 56483.1). Total num frames: 1878114304. Throughput: 0: 56642.2. Samples: 1827542300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:13:53,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 07:13:55,042][47288] Updated weights for policy 0, policy_version 114636 (0.0027) [2024-04-26 07:13:57,963][47288] Updated weights for policy 0, policy_version 114646 (0.0035) [2024-04-26 07:13:58,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1878376448. Throughput: 0: 56664.5. Samples: 1827706620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:13:58,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 07:14:00,858][47288] Updated weights for policy 0, policy_version 114656 (0.0026) [2024-04-26 07:14:03,633][47288] Updated weights for policy 0, policy_version 114666 (0.0027) [2024-04-26 07:14:03,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1878687744. Throughput: 0: 56710.0. Samples: 1828045580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:14:03,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 07:14:06,506][47288] Updated weights for policy 0, policy_version 114676 (0.0028) [2024-04-26 07:14:08,923][47056] Fps is (10 sec: 58981.1, 60 sec: 56524.6, 300 sec: 56483.1). Total num frames: 1878966272. Throughput: 0: 56636.6. Samples: 1828384400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:14:08,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 07:14:09,492][47288] Updated weights for policy 0, policy_version 114686 (0.0033) [2024-04-26 07:14:12,364][47288] Updated weights for policy 0, policy_version 114696 (0.0030) [2024-04-26 07:14:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 1879261184. Throughput: 0: 56564.4. Samples: 1828556460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:14:13,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 07:14:15,651][47288] Updated weights for policy 0, policy_version 114706 (0.0033) [2024-04-26 07:14:18,162][47288] Updated weights for policy 0, policy_version 114716 (0.0028) [2024-04-26 07:14:18,923][47056] Fps is (10 sec: 55707.4, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 1879523328. Throughput: 0: 56552.0. Samples: 1828897980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:14:18,923][47056] Avg episode reward: [(0, '0.585')] [2024-04-26 07:14:21,334][47288] Updated weights for policy 0, policy_version 114726 (0.0034) [2024-04-26 07:14:23,845][47288] Updated weights for policy 0, policy_version 114736 (0.0029) [2024-04-26 07:14:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 57070.9, 300 sec: 56483.1). Total num frames: 1879834624. Throughput: 0: 56571.4. Samples: 1829236300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:14:23,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 07:14:27,168][47288] Updated weights for policy 0, policy_version 114746 (0.0027) [2024-04-26 07:14:28,485][47267] Signal inference workers to stop experience collection... (27350 times) [2024-04-26 07:14:28,520][47288] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-04-26 07:14:28,573][47267] Signal inference workers to resume experience collection... (27350 times) [2024-04-26 07:14:28,574][47288] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-04-26 07:14:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 1880096768. Throughput: 0: 56406.5. Samples: 1829403020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:14:28,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 07:14:29,723][47288] Updated weights for policy 0, policy_version 114756 (0.0028) [2024-04-26 07:14:33,160][47288] Updated weights for policy 0, policy_version 114766 (0.0031) [2024-04-26 07:14:33,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 1880358912. Throughput: 0: 56566.9. Samples: 1829747860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:14:33,923][47056] Avg episode reward: [(0, '0.605')] [2024-04-26 07:14:35,488][47288] Updated weights for policy 0, policy_version 114776 (0.0025) [2024-04-26 07:14:38,914][47288] Updated weights for policy 0, policy_version 114786 (0.0035) [2024-04-26 07:14:38,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 1880653824. Throughput: 0: 56693.9. Samples: 1830093520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:14:38,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 07:14:41,254][47288] Updated weights for policy 0, policy_version 114796 (0.0026) [2024-04-26 07:14:43,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1880948736. Throughput: 0: 56684.5. Samples: 1830257420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 07:14:43,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 07:14:44,760][47288] Updated weights for policy 0, policy_version 114806 (0.0027) [2024-04-26 07:14:46,923][47288] Updated weights for policy 0, policy_version 114816 (0.0032) [2024-04-26 07:14:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1881227264. Throughput: 0: 56628.1. Samples: 1830593840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:14:48,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 07:14:48,947][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000114822_1881243648.pth... [2024-04-26 07:14:48,990][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000113993_1867661312.pth [2024-04-26 07:14:50,597][47288] Updated weights for policy 0, policy_version 114826 (0.0034) [2024-04-26 07:14:52,935][47288] Updated weights for policy 0, policy_version 114836 (0.0029) [2024-04-26 07:14:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1881522176. Throughput: 0: 56639.8. Samples: 1830933180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:14:53,923][47056] Avg episode reward: [(0, '0.610')] [2024-04-26 07:14:56,311][47288] Updated weights for policy 0, policy_version 114846 (0.0030) [2024-04-26 07:14:58,740][47288] Updated weights for policy 0, policy_version 114856 (0.0034) [2024-04-26 07:14:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 57070.9, 300 sec: 56483.1). 
Total num frames: 1881800704. Throughput: 0: 56813.3. Samples: 1831113060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:14:58,923][47056] Avg episode reward: [(0, '0.597')] [2024-04-26 07:15:02,169][47288] Updated weights for policy 0, policy_version 114866 (0.0032) [2024-04-26 07:15:03,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1882079232. Throughput: 0: 56748.5. Samples: 1831451660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:15:03,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 07:15:04,400][47288] Updated weights for policy 0, policy_version 114876 (0.0028) [2024-04-26 07:15:07,891][47288] Updated weights for policy 0, policy_version 114886 (0.0028) [2024-04-26 07:15:08,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56525.1, 300 sec: 56483.2). Total num frames: 1882357760. Throughput: 0: 56753.9. Samples: 1831790220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:15:08,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 07:15:10,186][47288] Updated weights for policy 0, policy_version 114896 (0.0034) [2024-04-26 07:15:13,725][47288] Updated weights for policy 0, policy_version 114906 (0.0027) [2024-04-26 07:15:13,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55978.7, 300 sec: 56538.7). Total num frames: 1882619904. Throughput: 0: 56628.3. Samples: 1831951300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:15:13,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 07:15:15,950][47288] Updated weights for policy 0, policy_version 114916 (0.0027) [2024-04-26 07:15:18,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 1882898432. Throughput: 0: 56453.8. Samples: 1832288280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:15:18,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 07:15:19,709][47288] Updated weights for policy 0, policy_version 114926 (0.0026) [2024-04-26 07:15:22,035][47288] Updated weights for policy 0, policy_version 114936 (0.0031) [2024-04-26 07:15:23,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 1883209728. Throughput: 0: 56291.6. Samples: 1832626640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:15:23,923][47056] Avg episode reward: [(0, '0.436')] [2024-04-26 07:15:25,385][47288] Updated weights for policy 0, policy_version 114946 (0.0027) [2024-04-26 07:15:27,717][47288] Updated weights for policy 0, policy_version 114956 (0.0030) [2024-04-26 07:15:28,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56797.7, 300 sec: 56594.2). Total num frames: 1883504640. Throughput: 0: 56357.6. Samples: 1832793520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:15:28,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 07:15:31,161][47288] Updated weights for policy 0, policy_version 114966 (0.0025) [2024-04-26 07:15:31,643][47267] Signal inference workers to stop experience collection... (27400 times) [2024-04-26 07:15:31,643][47267] Signal inference workers to resume experience collection... (27400 times) [2024-04-26 07:15:31,657][47288] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-04-26 07:15:31,657][47288] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-04-26 07:15:33,399][47288] Updated weights for policy 0, policy_version 114976 (0.0025) [2024-04-26 07:15:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1883783168. Throughput: 0: 56462.1. Samples: 1833134640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:15:33,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 07:15:37,046][47288] Updated weights for policy 0, policy_version 114986 (0.0032) [2024-04-26 07:15:38,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1884061696. Throughput: 0: 56367.6. Samples: 1833469720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:15:38,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 07:15:39,143][47288] Updated weights for policy 0, policy_version 114996 (0.0023) [2024-04-26 07:15:42,801][47288] Updated weights for policy 0, policy_version 115006 (0.0031) [2024-04-26 07:15:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1884356608. Throughput: 0: 56452.4. Samples: 1833653420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 07:15:43,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 07:15:45,121][47288] Updated weights for policy 0, policy_version 115016 (0.0026) [2024-04-26 07:15:48,405][47288] Updated weights for policy 0, policy_version 115026 (0.0027) [2024-04-26 07:15:48,922][47056] Fps is (10 sec: 55706.8, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1884618752. Throughput: 0: 56581.9. Samples: 1833997840. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:15:48,923][47056] Avg episode reward: [(0, '0.601')] [2024-04-26 07:15:50,958][47288] Updated weights for policy 0, policy_version 115036 (0.0029) [2024-04-26 07:15:53,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 56483.2). Total num frames: 1884880896. Throughput: 0: 56658.1. Samples: 1834339840. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:15:53,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 07:15:54,231][47288] Updated weights for policy 0, policy_version 115046 (0.0026) [2024-04-26 07:15:56,723][47288] Updated weights for policy 0, policy_version 115056 (0.0030) [2024-04-26 07:15:58,923][47056] Fps is (10 sec: 55704.3, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 1885175808. Throughput: 0: 56599.5. Samples: 1834498280. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:15:58,924][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 07:15:59,993][47288] Updated weights for policy 0, policy_version 115066 (0.0026) [2024-04-26 07:16:02,427][47288] Updated weights for policy 0, policy_version 115076 (0.0030) [2024-04-26 07:16:03,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.7, 300 sec: 56483.2). Total num frames: 1885470720. Throughput: 0: 56611.1. Samples: 1834835780. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:03,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 07:16:05,780][47288] Updated weights for policy 0, policy_version 115086 (0.0026) [2024-04-26 07:16:08,328][47288] Updated weights for policy 0, policy_version 115096 (0.0036) [2024-04-26 07:16:08,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 1885765632. Throughput: 0: 56643.5. Samples: 1835175600. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:08,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 07:16:11,521][47288] Updated weights for policy 0, policy_version 115106 (0.0032) [2024-04-26 07:16:13,923][47056] Fps is (10 sec: 57344.6, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 1886044160. Throughput: 0: 56796.6. Samples: 1835349360. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:13,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 07:16:14,181][47288] Updated weights for policy 0, policy_version 115116 (0.0026) [2024-04-26 07:16:17,380][47288] Updated weights for policy 0, policy_version 115126 (0.0027) [2024-04-26 07:16:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 57344.0, 300 sec: 56649.8). Total num frames: 1886339072. Throughput: 0: 56718.3. Samples: 1835686960. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:18,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 07:16:19,894][47288] Updated weights for policy 0, policy_version 115136 (0.0032) [2024-04-26 07:16:23,129][47288] Updated weights for policy 0, policy_version 115146 (0.0030) [2024-04-26 07:16:23,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1886601216. Throughput: 0: 56705.7. Samples: 1836021480. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:23,924][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:16:25,742][47288] Updated weights for policy 0, policy_version 115156 (0.0030) [2024-04-26 07:16:28,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 1886863360. Throughput: 0: 56414.8. Samples: 1836192080. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:28,923][47056] Avg episode reward: [(0, '0.576')] [2024-04-26 07:16:28,961][47288] Updated weights for policy 0, policy_version 115166 (0.0033) [2024-04-26 07:16:29,021][47267] Signal inference workers to stop experience collection... (27450 times) [2024-04-26 07:16:29,068][47288] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-04-26 07:16:29,078][47267] Signal inference workers to resume experience collection... 
(27450 times) [2024-04-26 07:16:29,086][47288] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-04-26 07:16:31,449][47288] Updated weights for policy 0, policy_version 115176 (0.0027) [2024-04-26 07:16:33,923][47056] Fps is (10 sec: 54067.9, 60 sec: 55978.8, 300 sec: 56483.2). Total num frames: 1887141888. Throughput: 0: 56235.4. Samples: 1836528440. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:33,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 07:16:34,684][47288] Updated weights for policy 0, policy_version 115186 (0.0026) [2024-04-26 07:16:37,241][47288] Updated weights for policy 0, policy_version 115196 (0.0027) [2024-04-26 07:16:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 1887436800. Throughput: 0: 56256.0. Samples: 1836871360. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:38,924][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 07:16:40,398][47288] Updated weights for policy 0, policy_version 115206 (0.0032) [2024-04-26 07:16:42,996][47288] Updated weights for policy 0, policy_version 115216 (0.0031) [2024-04-26 07:16:43,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 1887731712. Throughput: 0: 56351.7. Samples: 1837034100. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:43,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 07:16:46,216][47288] Updated weights for policy 0, policy_version 115226 (0.0036) [2024-04-26 07:16:48,803][47288] Updated weights for policy 0, policy_version 115236 (0.0031) [2024-04-26 07:16:48,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56797.7, 300 sec: 56594.2). Total num frames: 1888026624. Throughput: 0: 56446.3. Samples: 1837375860. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-26 07:16:48,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 07:16:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000115236_1888026624.pth... [2024-04-26 07:16:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000114409_1874477056.pth [2024-04-26 07:16:51,941][47288] Updated weights for policy 0, policy_version 115246 (0.0033) [2024-04-26 07:16:53,923][47056] Fps is (10 sec: 55704.5, 60 sec: 56797.7, 300 sec: 56538.6). Total num frames: 1888288768. Throughput: 0: 56471.4. Samples: 1837716820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:16:53,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 07:16:54,736][47288] Updated weights for policy 0, policy_version 115256 (0.0033) [2024-04-26 07:16:57,777][47288] Updated weights for policy 0, policy_version 115266 (0.0030) [2024-04-26 07:16:58,923][47056] Fps is (10 sec: 57343.7, 60 sec: 57071.0, 300 sec: 56649.8). Total num frames: 1888600064. Throughput: 0: 56606.6. Samples: 1837896660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:16:58,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 07:17:00,326][47288] Updated weights for policy 0, policy_version 115276 (0.0027) [2024-04-26 07:17:03,516][47288] Updated weights for policy 0, policy_version 115286 (0.0034) [2024-04-26 07:17:03,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1888862208. Throughput: 0: 56616.0. Samples: 1838234680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:03,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 07:17:06,100][47288] Updated weights for policy 0, policy_version 115296 (0.0034) [2024-04-26 07:17:08,923][47056] Fps is (10 sec: 52427.8, 60 sec: 55978.5, 300 sec: 56372.0). Total num frames: 1889124352. Throughput: 0: 56724.7. Samples: 1838574100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:08,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 07:17:09,324][47288] Updated weights for policy 0, policy_version 115306 (0.0026) [2024-04-26 07:17:11,878][47288] Updated weights for policy 0, policy_version 115316 (0.0039) [2024-04-26 07:17:13,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 56594.2). Total num frames: 1889419264. Throughput: 0: 56527.8. Samples: 1838735840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:13,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 07:17:15,191][47288] Updated weights for policy 0, policy_version 115326 (0.0031) [2024-04-26 07:17:17,818][47288] Updated weights for policy 0, policy_version 115336 (0.0034) [2024-04-26 07:17:18,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 1889697792. Throughput: 0: 56540.3. Samples: 1839072760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:18,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 07:17:20,862][47288] Updated weights for policy 0, policy_version 115346 (0.0027) [2024-04-26 07:17:23,511][47288] Updated weights for policy 0, policy_version 115356 (0.0027) [2024-04-26 07:17:23,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1889992704. Throughput: 0: 56391.6. Samples: 1839408980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:23,923][47056] Avg episode reward: [(0, '0.593')] [2024-04-26 07:17:26,280][47267] Signal inference workers to stop experience collection... (27500 times) [2024-04-26 07:17:26,281][47267] Signal inference workers to resume experience collection... 
(27500 times) [2024-04-26 07:17:26,304][47288] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-04-26 07:17:26,304][47288] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-04-26 07:17:26,651][47288] Updated weights for policy 0, policy_version 115366 (0.0033) [2024-04-26 07:17:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1890271232. Throughput: 0: 56664.4. Samples: 1839584000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:28,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 07:17:29,690][47288] Updated weights for policy 0, policy_version 115376 (0.0034) [2024-04-26 07:17:32,471][47288] Updated weights for policy 0, policy_version 115386 (0.0031) [2024-04-26 07:17:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 57070.8, 300 sec: 56649.8). Total num frames: 1890566144. Throughput: 0: 56535.8. Samples: 1839919980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:33,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 07:17:35,393][47288] Updated weights for policy 0, policy_version 115396 (0.0031) [2024-04-26 07:17:38,207][47288] Updated weights for policy 0, policy_version 115406 (0.0028) [2024-04-26 07:17:38,923][47056] Fps is (10 sec: 58982.8, 60 sec: 57071.0, 300 sec: 56649.8). Total num frames: 1890861056. Throughput: 0: 56387.8. Samples: 1840254260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:38,923][47056] Avg episode reward: [(0, '0.644')] [2024-04-26 07:17:38,933][47267] Saving new best policy, reward=0.644! [2024-04-26 07:17:41,047][47288] Updated weights for policy 0, policy_version 115416 (0.0027) [2024-04-26 07:17:43,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1891123200. Throughput: 0: 56310.6. Samples: 1840430640. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:43,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 07:17:44,010][47288] Updated weights for policy 0, policy_version 115426 (0.0031) [2024-04-26 07:17:46,857][47288] Updated weights for policy 0, policy_version 115436 (0.0029) [2024-04-26 07:17:48,923][47056] Fps is (10 sec: 52428.2, 60 sec: 55978.6, 300 sec: 56483.1). Total num frames: 1891385344. Throughput: 0: 56234.6. Samples: 1840765240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:17:48,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 07:17:49,912][47288] Updated weights for policy 0, policy_version 115446 (0.0025) [2024-04-26 07:17:52,641][47288] Updated weights for policy 0, policy_version 115456 (0.0026) [2024-04-26 07:17:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 1891663872. Throughput: 0: 56299.7. Samples: 1841107580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:17:53,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 07:17:55,636][47288] Updated weights for policy 0, policy_version 115466 (0.0031) [2024-04-26 07:17:58,519][47288] Updated weights for policy 0, policy_version 115476 (0.0030) [2024-04-26 07:17:58,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 56538.7). Total num frames: 1891958784. Throughput: 0: 56423.7. Samples: 1841274900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:17:58,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 07:18:01,354][47288] Updated weights for policy 0, policy_version 115486 (0.0035) [2024-04-26 07:18:03,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 1892253696. Throughput: 0: 56543.6. Samples: 1841617220. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:03,923][47056] Avg episode reward: [(0, '0.591')] [2024-04-26 07:18:04,286][47288] Updated weights for policy 0, policy_version 115496 (0.0029) [2024-04-26 07:18:07,205][47288] Updated weights for policy 0, policy_version 115506 (0.0030) [2024-04-26 07:18:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56798.1, 300 sec: 56594.2). Total num frames: 1892532224. Throughput: 0: 56516.9. Samples: 1841952240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:08,923][47056] Avg episode reward: [(0, '0.397')] [2024-04-26 07:18:10,356][47288] Updated weights for policy 0, policy_version 115516 (0.0032) [2024-04-26 07:18:13,144][47288] Updated weights for policy 0, policy_version 115526 (0.0030) [2024-04-26 07:18:13,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56525.0, 300 sec: 56594.2). Total num frames: 1892810752. Throughput: 0: 56487.7. Samples: 1842125940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:13,923][47056] Avg episode reward: [(0, '0.606')] [2024-04-26 07:18:16,238][47288] Updated weights for policy 0, policy_version 115536 (0.0034) [2024-04-26 07:18:18,826][47288] Updated weights for policy 0, policy_version 115546 (0.0026) [2024-04-26 07:18:18,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 1893105664. Throughput: 0: 56524.1. Samples: 1842463560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:18,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 07:18:22,034][47288] Updated weights for policy 0, policy_version 115556 (0.0032) [2024-04-26 07:18:23,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1893367808. Throughput: 0: 56609.6. Samples: 1842801700. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:23,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 07:18:24,765][47288] Updated weights for policy 0, policy_version 115566 (0.0032) [2024-04-26 07:18:27,753][47288] Updated weights for policy 0, policy_version 115576 (0.0024) [2024-04-26 07:18:28,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1893646336. Throughput: 0: 56394.3. Samples: 1842968380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:28,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 07:18:29,677][47267] Signal inference workers to stop experience collection... (27550 times) [2024-04-26 07:18:29,677][47267] Signal inference workers to resume experience collection... (27550 times) [2024-04-26 07:18:29,688][47288] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-04-26 07:18:29,706][47288] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-04-26 07:18:30,607][47288] Updated weights for policy 0, policy_version 115586 (0.0028) [2024-04-26 07:18:33,413][47288] Updated weights for policy 0, policy_version 115596 (0.0028) [2024-04-26 07:18:33,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 1893941248. Throughput: 0: 56375.6. Samples: 1843302140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:33,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:18:36,348][47288] Updated weights for policy 0, policy_version 115606 (0.0025) [2024-04-26 07:18:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 56483.2). Total num frames: 1894219776. Throughput: 0: 56372.6. Samples: 1843644340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:38,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 07:18:39,125][47288] Updated weights for policy 0, policy_version 115616 (0.0027) [2024-04-26 07:18:42,210][47288] Updated weights for policy 0, policy_version 115626 (0.0027) [2024-04-26 07:18:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 1894514688. Throughput: 0: 56587.4. Samples: 1843821340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:43,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 07:18:44,936][47288] Updated weights for policy 0, policy_version 115636 (0.0029) [2024-04-26 07:18:47,940][47288] Updated weights for policy 0, policy_version 115646 (0.0031) [2024-04-26 07:18:48,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1894793216. Throughput: 0: 56457.8. Samples: 1844157820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:18:48,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 07:18:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000115649_1894793216.pth... [2024-04-26 07:18:48,989][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000114822_1881243648.pth [2024-04-26 07:18:50,800][47288] Updated weights for policy 0, policy_version 115656 (0.0026) [2024-04-26 07:18:53,688][47288] Updated weights for policy 0, policy_version 115666 (0.0027) [2024-04-26 07:18:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 1895071744. Throughput: 0: 56560.0. Samples: 1844497440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:18:53,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 07:18:56,627][47288] Updated weights for policy 0, policy_version 115676 (0.0027) [2024-04-26 07:18:58,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56538.7). 
Total num frames: 1895366656. Throughput: 0: 56531.8. Samples: 1844669880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:18:58,924][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 07:18:59,565][47288] Updated weights for policy 0, policy_version 115686 (0.0024) [2024-04-26 07:19:02,281][47288] Updated weights for policy 0, policy_version 115696 (0.0027) [2024-04-26 07:19:03,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 56427.7). Total num frames: 1895612416. Throughput: 0: 56420.6. Samples: 1845002480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:03,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 07:19:05,403][47288] Updated weights for policy 0, policy_version 115706 (0.0029) [2024-04-26 07:19:08,058][47288] Updated weights for policy 0, policy_version 115716 (0.0030) [2024-04-26 07:19:08,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1895907328. Throughput: 0: 56572.1. Samples: 1845347440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:08,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 07:19:11,103][47288] Updated weights for policy 0, policy_version 115726 (0.0029) [2024-04-26 07:19:13,872][47288] Updated weights for policy 0, policy_version 115736 (0.0029) [2024-04-26 07:19:13,923][47056] Fps is (10 sec: 60620.8, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1896218624. Throughput: 0: 56591.2. Samples: 1845514980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:13,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 07:19:16,941][47288] Updated weights for policy 0, policy_version 115746 (0.0030) [2024-04-26 07:19:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1896480768. Throughput: 0: 56676.1. Samples: 1845852560. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:18,923][47056] Avg episode reward: [(0, '0.376')] [2024-04-26 07:19:19,703][47288] Updated weights for policy 0, policy_version 115756 (0.0029) [2024-04-26 07:19:22,711][47288] Updated weights for policy 0, policy_version 115766 (0.0031) [2024-04-26 07:19:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 1896792064. Throughput: 0: 56603.1. Samples: 1846191480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:23,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 07:19:25,366][47288] Updated weights for policy 0, policy_version 115776 (0.0031) [2024-04-26 07:19:28,546][47288] Updated weights for policy 0, policy_version 115786 (0.0027) [2024-04-26 07:19:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56798.0, 300 sec: 56594.3). Total num frames: 1897054208. Throughput: 0: 56676.2. Samples: 1846371760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:28,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 07:19:31,107][47288] Updated weights for policy 0, policy_version 115796 (0.0028) [2024-04-26 07:19:33,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 1897316352. Throughput: 0: 56776.6. Samples: 1846712760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:33,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 07:19:34,334][47288] Updated weights for policy 0, policy_version 115806 (0.0028) [2024-04-26 07:19:37,067][47288] Updated weights for policy 0, policy_version 115816 (0.0028) [2024-04-26 07:19:38,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 1897627648. Throughput: 0: 56793.7. Samples: 1847053160. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:38,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 07:19:40,015][47288] Updated weights for policy 0, policy_version 115826 (0.0030) [2024-04-26 07:19:40,118][47267] Signal inference workers to stop experience collection... (27600 times) [2024-04-26 07:19:40,159][47288] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-04-26 07:19:40,174][47267] Signal inference workers to resume experience collection... (27600 times) [2024-04-26 07:19:40,177][47288] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-04-26 07:19:42,950][47288] Updated weights for policy 0, policy_version 115836 (0.0029) [2024-04-26 07:19:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1897889792. Throughput: 0: 56562.7. Samples: 1847215200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:43,923][47056] Avg episode reward: [(0, '0.596')] [2024-04-26 07:19:45,667][47288] Updated weights for policy 0, policy_version 115846 (0.0027) [2024-04-26 07:19:48,643][47288] Updated weights for policy 0, policy_version 115856 (0.0026) [2024-04-26 07:19:48,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1898184704. Throughput: 0: 56836.8. Samples: 1847560140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:19:48,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 07:19:51,454][47288] Updated weights for policy 0, policy_version 115866 (0.0028) [2024-04-26 07:19:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1898463232. Throughput: 0: 56715.5. Samples: 1847899640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:19:53,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 07:19:54,464][47288] Updated weights for policy 0, policy_version 115876 (0.0030) [2024-04-26 07:19:57,215][47288] Updated weights for policy 0, policy_version 115886 (0.0030) [2024-04-26 07:19:58,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1898758144. Throughput: 0: 56866.9. Samples: 1848074000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:19:58,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 07:20:00,202][47288] Updated weights for policy 0, policy_version 115896 (0.0029) [2024-04-26 07:20:03,026][47288] Updated weights for policy 0, policy_version 115906 (0.0030) [2024-04-26 07:20:03,927][47056] Fps is (10 sec: 58959.0, 60 sec: 57340.2, 300 sec: 56593.5). Total num frames: 1899053056. Throughput: 0: 56940.3. Samples: 1848415100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:03,927][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 07:20:05,942][47288] Updated weights for policy 0, policy_version 115916 (0.0028) [2024-04-26 07:20:08,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56797.7, 300 sec: 56594.2). Total num frames: 1899315200. Throughput: 0: 56758.5. Samples: 1848745620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:08,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 07:20:09,047][47288] Updated weights for policy 0, policy_version 115926 (0.0029) [2024-04-26 07:20:11,728][47288] Updated weights for policy 0, policy_version 115936 (0.0028) [2024-04-26 07:20:13,923][47056] Fps is (10 sec: 54088.5, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 1899593728. Throughput: 0: 56609.2. Samples: 1848919180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:13,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 07:20:14,821][47288] Updated weights for policy 0, policy_version 115946 (0.0034) [2024-04-26 07:20:17,373][47288] Updated weights for policy 0, policy_version 115956 (0.0031) [2024-04-26 07:20:18,923][47056] Fps is (10 sec: 55706.7, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1899872256. Throughput: 0: 56518.3. Samples: 1849256080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:18,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 07:20:20,418][47288] Updated weights for policy 0, policy_version 115966 (0.0027) [2024-04-26 07:20:23,392][47288] Updated weights for policy 0, policy_version 115976 (0.0032) [2024-04-26 07:20:23,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 1900167168. Throughput: 0: 56642.6. Samples: 1849602080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:23,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:20:26,075][47288] Updated weights for policy 0, policy_version 115986 (0.0031) [2024-04-26 07:20:28,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1900445696. Throughput: 0: 56677.9. Samples: 1849765700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:28,923][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 07:20:29,095][47288] Updated weights for policy 0, policy_version 115996 (0.0027) [2024-04-26 07:20:31,913][47288] Updated weights for policy 0, policy_version 116006 (0.0027) [2024-04-26 07:20:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 1900740608. Throughput: 0: 56521.1. Samples: 1850103600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:33,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 07:20:34,924][47288] Updated weights for policy 0, policy_version 116016 (0.0031) [2024-04-26 07:20:37,752][47288] Updated weights for policy 0, policy_version 116026 (0.0026) [2024-04-26 07:20:38,923][47056] Fps is (10 sec: 60621.0, 60 sec: 57071.1, 300 sec: 56594.3). Total num frames: 1901051904. Throughput: 0: 56597.0. Samples: 1850446500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:38,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 07:20:40,832][47288] Updated weights for policy 0, policy_version 116036 (0.0029) [2024-04-26 07:20:43,505][47288] Updated weights for policy 0, policy_version 116046 (0.0022) [2024-04-26 07:20:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 57070.8, 300 sec: 56594.2). Total num frames: 1901314048. Throughput: 0: 56708.4. Samples: 1850625880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:43,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 07:20:46,624][47288] Updated weights for policy 0, policy_version 116056 (0.0032) [2024-04-26 07:20:47,471][47267] Signal inference workers to stop experience collection... (27650 times) [2024-04-26 07:20:47,472][47267] Signal inference workers to resume experience collection... (27650 times) [2024-04-26 07:20:47,488][47288] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-04-26 07:20:47,488][47288] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-04-26 07:20:48,923][47056] Fps is (10 sec: 54066.3, 60 sec: 56797.8, 300 sec: 56649.7). Total num frames: 1901592576. Throughput: 0: 56716.0. Samples: 1850967100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:48,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 07:20:49,062][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000116065_1901608960.pth... 
[2024-04-26 07:20:49,108][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000115236_1888026624.pth [2024-04-26 07:20:49,221][47288] Updated weights for policy 0, policy_version 116066 (0.0028) [2024-04-26 07:20:52,350][47288] Updated weights for policy 0, policy_version 116076 (0.0030) [2024-04-26 07:20:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1901871104. Throughput: 0: 56833.3. Samples: 1851303120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:20:53,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 07:20:55,092][47288] Updated weights for policy 0, policy_version 116086 (0.0034) [2024-04-26 07:20:58,219][47288] Updated weights for policy 0, policy_version 116096 (0.0033) [2024-04-26 07:20:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1902133248. Throughput: 0: 56583.0. Samples: 1851465420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:20:58,923][47056] Avg episode reward: [(0, '0.628')] [2024-04-26 07:21:00,984][47288] Updated weights for policy 0, policy_version 116106 (0.0029) [2024-04-26 07:21:03,923][47056] Fps is (10 sec: 55706.8, 60 sec: 56255.5, 300 sec: 56483.2). Total num frames: 1902428160. Throughput: 0: 56597.8. Samples: 1851802980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:03,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 07:21:04,104][47288] Updated weights for policy 0, policy_version 116116 (0.0037) [2024-04-26 07:21:06,698][47288] Updated weights for policy 0, policy_version 116126 (0.0027) [2024-04-26 07:21:08,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 1902706688. Throughput: 0: 56420.1. Samples: 1852140980. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:08,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 07:21:09,931][47288] Updated weights for policy 0, policy_version 116136 (0.0030) [2024-04-26 07:21:12,464][47288] Updated weights for policy 0, policy_version 116146 (0.0029) [2024-04-26 07:21:13,923][47056] Fps is (10 sec: 58981.4, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 1903017984. Throughput: 0: 56580.3. Samples: 1852311820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:13,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 07:21:15,851][47288] Updated weights for policy 0, policy_version 116156 (0.0033) [2024-04-26 07:21:18,114][47288] Updated weights for policy 0, policy_version 116166 (0.0029) [2024-04-26 07:21:18,923][47056] Fps is (10 sec: 58982.3, 60 sec: 57070.8, 300 sec: 56594.2). Total num frames: 1903296512. Throughput: 0: 56647.3. Samples: 1852652720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:18,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 07:21:21,506][47288] Updated weights for policy 0, policy_version 116176 (0.0032) [2024-04-26 07:21:23,873][47288] Updated weights for policy 0, policy_version 116186 (0.0035) [2024-04-26 07:21:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 57071.0, 300 sec: 56705.3). Total num frames: 1903591424. Throughput: 0: 56595.0. Samples: 1852993280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:23,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:21:27,242][47288] Updated weights for policy 0, policy_version 116196 (0.0028) [2024-04-26 07:21:28,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 1903837184. Throughput: 0: 56522.8. Samples: 1853169400. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:28,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 07:21:29,636][47288] Updated weights for policy 0, policy_version 116206 (0.0025) [2024-04-26 07:21:33,066][47288] Updated weights for policy 0, policy_version 116216 (0.0027) [2024-04-26 07:21:33,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 1904132096. Throughput: 0: 56421.0. Samples: 1853506040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:33,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 07:21:35,006][47267] Signal inference workers to stop experience collection... (27700 times) [2024-04-26 07:21:35,006][47267] Signal inference workers to resume experience collection... (27700 times) [2024-04-26 07:21:35,022][47288] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-04-26 07:21:35,022][47288] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-04-26 07:21:35,527][47288] Updated weights for policy 0, policy_version 116226 (0.0027) [2024-04-26 07:21:38,854][47288] Updated weights for policy 0, policy_version 116236 (0.0029) [2024-04-26 07:21:38,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.7, 300 sec: 56538.7). Total num frames: 1904410624. Throughput: 0: 56571.8. Samples: 1853848840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:38,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 07:21:41,309][47288] Updated weights for policy 0, policy_version 116246 (0.0029) [2024-04-26 07:21:43,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 1904689152. Throughput: 0: 56540.0. Samples: 1854009720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:43,924][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 07:21:44,598][47288] Updated weights for policy 0, policy_version 116256 (0.0031) [2024-04-26 07:21:47,067][47288] Updated weights for policy 0, policy_version 116266 (0.0028) [2024-04-26 07:21:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 1904984064. Throughput: 0: 56624.7. Samples: 1854351100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:48,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 07:21:50,333][47288] Updated weights for policy 0, policy_version 116276 (0.0025) [2024-04-26 07:21:52,682][47288] Updated weights for policy 0, policy_version 116286 (0.0025) [2024-04-26 07:21:53,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1905278976. Throughput: 0: 56591.4. Samples: 1854687600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 07:21:53,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 07:21:56,093][47288] Updated weights for policy 0, policy_version 116296 (0.0026) [2024-04-26 07:21:58,299][47288] Updated weights for policy 0, policy_version 116306 (0.0026) [2024-04-26 07:21:58,923][47056] Fps is (10 sec: 58982.7, 60 sec: 57344.1, 300 sec: 56649.8). Total num frames: 1905573888. Throughput: 0: 56751.6. Samples: 1854865640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:21:58,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 07:22:01,990][47288] Updated weights for policy 0, policy_version 116316 (0.0031) [2024-04-26 07:22:03,923][47056] Fps is (10 sec: 57345.2, 60 sec: 57070.9, 300 sec: 56705.4). Total num frames: 1905852416. Throughput: 0: 56825.0. Samples: 1855209840. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:03,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 07:22:04,031][47288] Updated weights for policy 0, policy_version 116326 (0.0027) [2024-04-26 07:22:07,993][47288] Updated weights for policy 0, policy_version 116336 (0.0028) [2024-04-26 07:22:08,923][47056] Fps is (10 sec: 52428.5, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 1906098176. Throughput: 0: 56734.2. Samples: 1855546320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:08,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 07:22:09,813][47288] Updated weights for policy 0, policy_version 116346 (0.0031) [2024-04-26 07:22:13,713][47288] Updated weights for policy 0, policy_version 116356 (0.0024) [2024-04-26 07:22:13,923][47056] Fps is (10 sec: 52428.9, 60 sec: 55978.8, 300 sec: 56538.7). Total num frames: 1906376704. Throughput: 0: 56422.9. Samples: 1855708420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:13,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 07:22:15,805][47288] Updated weights for policy 0, policy_version 116366 (0.0026) [2024-04-26 07:22:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 56538.6). Total num frames: 1906671616. Throughput: 0: 56424.7. Samples: 1856045160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:18,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 07:22:19,336][47288] Updated weights for policy 0, policy_version 116376 (0.0032) [2024-04-26 07:22:21,690][47288] Updated weights for policy 0, policy_version 116386 (0.0025) [2024-04-26 07:22:23,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 1906966528. Throughput: 0: 56435.4. Samples: 1856388440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:23,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 07:22:25,094][47288] Updated weights for policy 0, policy_version 116396 (0.0032) [2024-04-26 07:22:27,335][47288] Updated weights for policy 0, policy_version 116406 (0.0035) [2024-04-26 07:22:28,803][47267] Signal inference workers to stop experience collection... (27750 times) [2024-04-26 07:22:28,803][47267] Signal inference workers to resume experience collection... (27750 times) [2024-04-26 07:22:28,817][47288] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-04-26 07:22:28,817][47288] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-04-26 07:22:28,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1907245056. Throughput: 0: 56803.6. Samples: 1856565880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:28,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 07:22:30,907][47288] Updated weights for policy 0, policy_version 116416 (0.0031) [2024-04-26 07:22:33,169][47288] Updated weights for policy 0, policy_version 116426 (0.0028) [2024-04-26 07:22:33,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1907539968. Throughput: 0: 56609.0. Samples: 1856898500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:33,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 07:22:36,656][47288] Updated weights for policy 0, policy_version 116436 (0.0030) [2024-04-26 07:22:38,923][47056] Fps is (10 sec: 58982.9, 60 sec: 57070.9, 300 sec: 56649.8). Total num frames: 1907834880. Throughput: 0: 56567.8. Samples: 1857233140. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:38,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 07:22:39,053][47288] Updated weights for policy 0, policy_version 116446 (0.0029) [2024-04-26 07:22:42,361][47288] Updated weights for policy 0, policy_version 116456 (0.0030) [2024-04-26 07:22:43,923][47056] Fps is (10 sec: 58981.5, 60 sec: 57344.0, 300 sec: 56760.8). Total num frames: 1908129792. Throughput: 0: 56687.0. Samples: 1857416560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:43,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:22:44,657][47288] Updated weights for policy 0, policy_version 116466 (0.0031) [2024-04-26 07:22:48,248][47288] Updated weights for policy 0, policy_version 116476 (0.0029) [2024-04-26 07:22:48,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56524.9, 300 sec: 56649.8). Total num frames: 1908375552. Throughput: 0: 56667.1. Samples: 1857759860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:48,923][47056] Avg episode reward: [(0, '0.612')] [2024-04-26 07:22:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000116478_1908375552.pth... [2024-04-26 07:22:48,995][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000115649_1894793216.pth [2024-04-26 07:22:50,334][47288] Updated weights for policy 0, policy_version 116486 (0.0029) [2024-04-26 07:22:53,923][47056] Fps is (10 sec: 50790.8, 60 sec: 55978.8, 300 sec: 56538.7). Total num frames: 1908637696. Throughput: 0: 56774.3. Samples: 1858101160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-26 07:22:53,923][47056] Avg episode reward: [(0, '0.607')] [2024-04-26 07:22:54,127][47288] Updated weights for policy 0, policy_version 116496 (0.0027) [2024-04-26 07:22:56,153][47288] Updated weights for policy 0, policy_version 116506 (0.0028) [2024-04-26 07:22:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 56538.7). 
Total num frames: 1908932608. Throughput: 0: 56587.0. Samples: 1858254840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:22:58,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 07:22:59,925][47288] Updated weights for policy 0, policy_version 116516 (0.0030) [2024-04-26 07:23:02,074][47288] Updated weights for policy 0, policy_version 116526 (0.0030) [2024-04-26 07:23:03,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.6, 300 sec: 56594.2). Total num frames: 1909227520. Throughput: 0: 56654.8. Samples: 1858594620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:03,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 07:23:05,675][47288] Updated weights for policy 0, policy_version 116536 (0.0027) [2024-04-26 07:23:08,222][47288] Updated weights for policy 0, policy_version 116546 (0.0026) [2024-04-26 07:23:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 1909506048. Throughput: 0: 56640.9. Samples: 1858937280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:08,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 07:23:11,404][47288] Updated weights for policy 0, policy_version 116556 (0.0026) [2024-04-26 07:23:13,904][47288] Updated weights for policy 0, policy_version 116566 (0.0034) [2024-04-26 07:23:13,923][47056] Fps is (10 sec: 58982.1, 60 sec: 57343.8, 300 sec: 56649.8). Total num frames: 1909817344. Throughput: 0: 56592.4. Samples: 1859112540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:13,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 07:23:17,209][47288] Updated weights for policy 0, policy_version 116576 (0.0033) [2024-04-26 07:23:18,923][47056] Fps is (10 sec: 58983.4, 60 sec: 57071.2, 300 sec: 56705.3). Total num frames: 1910095872. Throughput: 0: 56711.2. Samples: 1859450500. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:18,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 07:23:19,596][47288] Updated weights for policy 0, policy_version 116586 (0.0033) [2024-04-26 07:23:22,942][47288] Updated weights for policy 0, policy_version 116596 (0.0027) [2024-04-26 07:23:23,923][47056] Fps is (10 sec: 57344.6, 60 sec: 57071.0, 300 sec: 56760.8). Total num frames: 1910390784. Throughput: 0: 56802.6. Samples: 1859789260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:23,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:23:25,381][47288] Updated weights for policy 0, policy_version 116606 (0.0037) [2024-04-26 07:23:27,671][47267] Signal inference workers to stop experience collection... (27800 times) [2024-04-26 07:23:27,672][47267] Signal inference workers to resume experience collection... (27800 times) [2024-04-26 07:23:27,686][47288] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-04-26 07:23:27,687][47288] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-04-26 07:23:28,773][47288] Updated weights for policy 0, policy_version 116616 (0.0030) [2024-04-26 07:23:28,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56797.9, 300 sec: 56649.8). Total num frames: 1910652928. Throughput: 0: 56647.7. Samples: 1859965700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:28,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 07:23:31,131][47288] Updated weights for policy 0, policy_version 116626 (0.0028) [2024-04-26 07:23:33,923][47056] Fps is (10 sec: 52427.7, 60 sec: 56251.5, 300 sec: 56594.2). Total num frames: 1910915072. Throughput: 0: 56569.9. Samples: 1860305520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:33,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 07:23:34,484][47288] Updated weights for policy 0, policy_version 116636 (0.0033) [2024-04-26 07:23:36,917][47288] Updated weights for policy 0, policy_version 116646 (0.0031) [2024-04-26 07:23:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 1911209984. Throughput: 0: 56516.0. Samples: 1860644380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:38,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 07:23:40,252][47288] Updated weights for policy 0, policy_version 116656 (0.0029) [2024-04-26 07:23:42,853][47288] Updated weights for policy 0, policy_version 116666 (0.0030) [2024-04-26 07:23:43,923][47056] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 56594.2). Total num frames: 1911488512. Throughput: 0: 56590.6. Samples: 1860801420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:43,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:23:46,104][47288] Updated weights for policy 0, policy_version 116676 (0.0025) [2024-04-26 07:23:48,663][47288] Updated weights for policy 0, policy_version 116686 (0.0026) [2024-04-26 07:23:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56649.8). Total num frames: 1911783424. Throughput: 0: 56585.4. Samples: 1861140960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:48,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 07:23:51,781][47288] Updated weights for policy 0, policy_version 116696 (0.0024) [2024-04-26 07:23:53,923][47056] Fps is (10 sec: 58982.4, 60 sec: 57344.0, 300 sec: 56649.8). Total num frames: 1912078336. Throughput: 0: 56614.7. Samples: 1861484940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:53,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 07:23:54,435][47288] Updated weights for policy 0, policy_version 116706 (0.0024) [2024-04-26 07:23:57,539][47288] Updated weights for policy 0, policy_version 116716 (0.0026) [2024-04-26 07:23:58,923][47056] Fps is (10 sec: 58982.4, 60 sec: 57344.0, 300 sec: 56816.4). Total num frames: 1912373248. Throughput: 0: 56643.2. Samples: 1861661480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 07:23:58,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 07:24:00,487][47288] Updated weights for policy 0, policy_version 116726 (0.0030) [2024-04-26 07:24:03,324][47288] Updated weights for policy 0, policy_version 116736 (0.0026) [2024-04-26 07:24:03,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56798.0, 300 sec: 56705.3). Total num frames: 1912635392. Throughput: 0: 56604.8. Samples: 1861997720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:03,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 07:24:06,386][47288] Updated weights for policy 0, policy_version 116746 (0.0028) [2024-04-26 07:24:08,923][47056] Fps is (10 sec: 52429.1, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1912897536. Throughput: 0: 56649.8. Samples: 1862338500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:08,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 07:24:09,076][47288] Updated weights for policy 0, policy_version 116756 (0.0025) [2024-04-26 07:24:12,267][47288] Updated weights for policy 0, policy_version 116766 (0.0025) [2024-04-26 07:24:13,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55705.7, 300 sec: 56538.7). Total num frames: 1913159680. Throughput: 0: 56330.2. Samples: 1862500560. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:13,923][47056] Avg episode reward: [(0, '0.620')] [2024-04-26 07:24:14,809][47288] Updated weights for policy 0, policy_version 116776 (0.0027) [2024-04-26 07:24:17,975][47288] Updated weights for policy 0, policy_version 116786 (0.0031) [2024-04-26 07:24:18,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55978.5, 300 sec: 56483.1). Total num frames: 1913454592. Throughput: 0: 56289.5. Samples: 1862838540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:18,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 07:24:20,961][47267] Signal inference workers to stop experience collection... (27850 times) [2024-04-26 07:24:20,988][47288] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-04-26 07:24:21,045][47267] Signal inference workers to resume experience collection... (27850 times) [2024-04-26 07:24:21,045][47288] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-04-26 07:24:21,049][47288] Updated weights for policy 0, policy_version 116796 (0.0032) [2024-04-26 07:24:23,827][47288] Updated weights for policy 0, policy_version 116806 (0.0029) [2024-04-26 07:24:23,923][47056] Fps is (10 sec: 58981.0, 60 sec: 55978.5, 300 sec: 56594.2). Total num frames: 1913749504. Throughput: 0: 56265.6. Samples: 1863176340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:23,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 07:24:26,698][47288] Updated weights for policy 0, policy_version 116816 (0.0026) [2024-04-26 07:24:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.6, 300 sec: 56649.7). Total num frames: 1914028032. Throughput: 0: 56523.1. Samples: 1863344960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:28,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 07:24:29,746][47288] Updated weights for policy 0, policy_version 116826 (0.0028) [2024-04-26 07:24:32,500][47288] Updated weights for policy 0, policy_version 116836 (0.0028) [2024-04-26 07:24:33,923][47056] Fps is (10 sec: 57345.4, 60 sec: 56798.1, 300 sec: 56594.2). Total num frames: 1914322944. Throughput: 0: 56521.9. Samples: 1863684440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:33,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 07:24:35,450][47288] Updated weights for policy 0, policy_version 116846 (0.0030) [2024-04-26 07:24:38,321][47288] Updated weights for policy 0, policy_version 116856 (0.0034) [2024-04-26 07:24:38,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56797.9, 300 sec: 56705.3). Total num frames: 1914617856. Throughput: 0: 56380.5. Samples: 1864022060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:38,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 07:24:41,247][47288] Updated weights for policy 0, policy_version 116866 (0.0030) [2024-04-26 07:24:43,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 1914880000. Throughput: 0: 56327.9. Samples: 1864196240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:43,924][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 07:24:44,071][47288] Updated weights for policy 0, policy_version 116876 (0.0030) [2024-04-26 07:24:47,162][47288] Updated weights for policy 0, policy_version 116886 (0.0030) [2024-04-26 07:24:48,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55978.7, 300 sec: 56538.7). Total num frames: 1915142144. Throughput: 0: 56355.0. Samples: 1864533700. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:48,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 07:24:48,947][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000116892_1915158528.pth... [2024-04-26 07:24:49,007][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000116065_1901608960.pth [2024-04-26 07:24:49,759][47288] Updated weights for policy 0, policy_version 116896 (0.0026) [2024-04-26 07:24:52,999][47288] Updated weights for policy 0, policy_version 116906 (0.0025) [2024-04-26 07:24:53,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55432.6, 300 sec: 56427.6). Total num frames: 1915404288. Throughput: 0: 56287.9. Samples: 1864871460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:53,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 07:24:55,640][47288] Updated weights for policy 0, policy_version 116916 (0.0032) [2024-04-26 07:24:58,816][47288] Updated weights for policy 0, policy_version 116926 (0.0029) [2024-04-26 07:24:58,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 56483.9). Total num frames: 1915715584. Throughput: 0: 56258.6. Samples: 1865032200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 07:24:58,932][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:25:01,574][47288] Updated weights for policy 0, policy_version 116936 (0.0024) [2024-04-26 07:25:03,923][47056] Fps is (10 sec: 60621.0, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 1916010496. Throughput: 0: 56141.4. Samples: 1865364900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:03,932][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 07:25:04,497][47288] Updated weights for policy 0, policy_version 116946 (0.0027) [2024-04-26 07:25:07,293][47288] Updated weights for policy 0, policy_version 116956 (0.0030) [2024-04-26 07:25:07,778][47267] Signal inference workers to stop experience collection... 
(27900 times) [2024-04-26 07:25:07,778][47267] Signal inference workers to resume experience collection... (27900 times) [2024-04-26 07:25:07,799][47288] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-04-26 07:25:07,799][47288] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-04-26 07:25:08,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 1916289024. Throughput: 0: 56236.7. Samples: 1865706980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:08,923][47056] Avg episode reward: [(0, '0.587')] [2024-04-26 07:25:10,203][47288] Updated weights for policy 0, policy_version 116966 (0.0028) [2024-04-26 07:25:13,133][47288] Updated weights for policy 0, policy_version 116976 (0.0031) [2024-04-26 07:25:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 57070.9, 300 sec: 56649.8). Total num frames: 1916583936. Throughput: 0: 56486.3. Samples: 1865886840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:13,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 07:25:16,064][47288] Updated weights for policy 0, policy_version 116986 (0.0031) [2024-04-26 07:25:18,770][47288] Updated weights for policy 0, policy_version 116996 (0.0028) [2024-04-26 07:25:18,923][47056] Fps is (10 sec: 57342.8, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1916862464. Throughput: 0: 56387.7. Samples: 1866221900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:18,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 07:25:22,219][47288] Updated weights for policy 0, policy_version 117006 (0.0028) [2024-04-26 07:25:23,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56525.1, 300 sec: 56594.2). Total num frames: 1917140992. Throughput: 0: 56349.0. Samples: 1866557760. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:23,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 07:25:24,708][47288] Updated weights for policy 0, policy_version 117016 (0.0025) [2024-04-26 07:25:27,938][47288] Updated weights for policy 0, policy_version 117026 (0.0028) [2024-04-26 07:25:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1917435904. Throughput: 0: 56386.7. Samples: 1866733640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:28,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 07:25:30,366][47288] Updated weights for policy 0, policy_version 117036 (0.0030) [2024-04-26 07:25:33,897][47288] Updated weights for policy 0, policy_version 117046 (0.0029) [2024-04-26 07:25:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1917681664. Throughput: 0: 56378.3. Samples: 1867070720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:33,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 07:25:36,147][47288] Updated weights for policy 0, policy_version 117056 (0.0027) [2024-04-26 07:25:38,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 56483.2). Total num frames: 1917976576. Throughput: 0: 56472.0. Samples: 1867412700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:38,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 07:25:39,820][47288] Updated weights for policy 0, policy_version 117066 (0.0027) [2024-04-26 07:25:41,923][47288] Updated weights for policy 0, policy_version 117076 (0.0032) [2024-04-26 07:25:43,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 1918255104. Throughput: 0: 56520.5. Samples: 1867575620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:43,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 07:25:45,569][47288] Updated weights for policy 0, policy_version 117086 (0.0027) [2024-04-26 07:25:47,668][47288] Updated weights for policy 0, policy_version 117096 (0.0026) [2024-04-26 07:25:48,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 1918550016. Throughput: 0: 56706.0. Samples: 1867916680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:48,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 07:25:51,267][47288] Updated weights for policy 0, policy_version 117106 (0.0033) [2024-04-26 07:25:53,503][47288] Updated weights for policy 0, policy_version 117116 (0.0025) [2024-04-26 07:25:53,923][47056] Fps is (10 sec: 58982.6, 60 sec: 57344.0, 300 sec: 56649.8). Total num frames: 1918844928. Throughput: 0: 56546.2. Samples: 1868251560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:53,923][47056] Avg episode reward: [(0, '0.606')] [2024-04-26 07:25:56,952][47288] Updated weights for policy 0, policy_version 117126 (0.0029) [2024-04-26 07:25:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1919123456. Throughput: 0: 56492.2. Samples: 1868429000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 07:25:58,923][47056] Avg episode reward: [(0, '0.593')] [2024-04-26 07:25:59,162][47288] Updated weights for policy 0, policy_version 117136 (0.0029) [2024-04-26 07:26:02,773][47288] Updated weights for policy 0, policy_version 117146 (0.0027) [2024-04-26 07:26:03,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 1919401984. Throughput: 0: 56637.0. Samples: 1868770560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:03,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 07:26:04,367][47267] Signal inference workers to stop experience collection... 
(27950 times) [2024-04-26 07:26:04,388][47288] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-04-26 07:26:04,425][47267] Signal inference workers to resume experience collection... (27950 times) [2024-04-26 07:26:04,426][47288] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-04-26 07:26:05,055][47288] Updated weights for policy 0, policy_version 117156 (0.0025) [2024-04-26 07:26:08,568][47288] Updated weights for policy 0, policy_version 117166 (0.0029) [2024-04-26 07:26:08,923][47056] Fps is (10 sec: 54068.6, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1919664128. Throughput: 0: 56723.6. Samples: 1869110320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:08,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 07:26:10,758][47288] Updated weights for policy 0, policy_version 117176 (0.0031) [2024-04-26 07:26:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1919942656. Throughput: 0: 56279.7. Samples: 1869266220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:13,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 07:26:14,232][47288] Updated weights for policy 0, policy_version 117186 (0.0032) [2024-04-26 07:26:16,575][47288] Updated weights for policy 0, policy_version 117196 (0.0026) [2024-04-26 07:26:18,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1920237568. Throughput: 0: 56417.2. Samples: 1869609500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:18,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 07:26:20,058][47288] Updated weights for policy 0, policy_version 117206 (0.0027) [2024-04-26 07:26:22,350][47288] Updated weights for policy 0, policy_version 117216 (0.0030) [2024-04-26 07:26:23,923][47056] Fps is (10 sec: 58981.2, 60 sec: 56524.6, 300 sec: 56594.2). Total num frames: 1920532480. Throughput: 0: 56386.6. 
Samples: 1869950100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:23,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 07:26:25,694][47288] Updated weights for policy 0, policy_version 117226 (0.0027) [2024-04-26 07:26:28,051][47288] Updated weights for policy 0, policy_version 117236 (0.0030) [2024-04-26 07:26:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 1920811008. Throughput: 0: 56784.9. Samples: 1870130940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:28,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 07:26:31,416][47288] Updated weights for policy 0, policy_version 117246 (0.0023) [2024-04-26 07:26:33,882][47288] Updated weights for policy 0, policy_version 117256 (0.0028) [2024-04-26 07:26:33,923][47056] Fps is (10 sec: 58982.5, 60 sec: 57343.8, 300 sec: 56649.7). Total num frames: 1921122304. Throughput: 0: 56637.8. Samples: 1870465380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:33,924][47056] Avg episode reward: [(0, '0.592')] [2024-04-26 07:26:37,146][47288] Updated weights for policy 0, policy_version 117266 (0.0032) [2024-04-26 07:26:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 1921384448. Throughput: 0: 56721.7. Samples: 1870804040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:38,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 07:26:39,865][47288] Updated weights for policy 0, policy_version 117276 (0.0028) [2024-04-26 07:26:42,910][47288] Updated weights for policy 0, policy_version 117286 (0.0030) [2024-04-26 07:26:43,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1921662976. Throughput: 0: 56664.1. Samples: 1870978880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:43,923][47056] Avg episode reward: [(0, '0.596')] [2024-04-26 07:26:45,525][47288] Updated weights for policy 0, policy_version 117296 (0.0029) [2024-04-26 07:26:48,778][47288] Updated weights for policy 0, policy_version 117306 (0.0032) [2024-04-26 07:26:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1921957888. Throughput: 0: 56538.6. Samples: 1871314800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:48,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 07:26:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000117307_1921957888.pth... [2024-04-26 07:26:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000116478_1908375552.pth [2024-04-26 07:26:51,341][47288] Updated weights for policy 0, policy_version 117316 (0.0034) [2024-04-26 07:26:53,923][47056] Fps is (10 sec: 54068.0, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1922203648. Throughput: 0: 56485.2. Samples: 1871652160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:53,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 07:26:54,687][47288] Updated weights for policy 0, policy_version 117326 (0.0030) [2024-04-26 07:26:57,143][47288] Updated weights for policy 0, policy_version 117336 (0.0030) [2024-04-26 07:26:58,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1922498560. Throughput: 0: 56632.9. Samples: 1871814700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 07:26:58,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:27:00,389][47288] Updated weights for policy 0, policy_version 117346 (0.0036) [2024-04-26 07:27:02,894][47288] Updated weights for policy 0, policy_version 117356 (0.0030) [2024-04-26 07:27:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 56538.7). 
Total num frames: 1922777088. Throughput: 0: 56454.3. Samples: 1872149940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:03,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:27:06,184][47288] Updated weights for policy 0, policy_version 117366 (0.0031) [2024-04-26 07:27:08,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 1923072000. Throughput: 0: 56477.6. Samples: 1872491580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:08,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 07:27:09,006][47288] Updated weights for policy 0, policy_version 117376 (0.0029) [2024-04-26 07:27:11,916][47288] Updated weights for policy 0, policy_version 117386 (0.0036) [2024-04-26 07:27:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 57071.0, 300 sec: 56594.3). Total num frames: 1923366912. Throughput: 0: 56459.3. Samples: 1872671600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:13,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 07:27:14,633][47288] Updated weights for policy 0, policy_version 117396 (0.0030) [2024-04-26 07:27:17,442][47267] Signal inference workers to stop experience collection... (28000 times) [2024-04-26 07:27:17,475][47288] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-04-26 07:27:17,528][47267] Signal inference workers to resume experience collection... (28000 times) [2024-04-26 07:27:17,528][47288] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-04-26 07:27:17,634][47288] Updated weights for policy 0, policy_version 117406 (0.0025) [2024-04-26 07:27:18,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1923629056. Throughput: 0: 56608.9. Samples: 1873012780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:18,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 07:27:20,379][47288] Updated weights for policy 0, policy_version 117416 (0.0027) [2024-04-26 07:27:23,407][47288] Updated weights for policy 0, policy_version 117426 (0.0034) [2024-04-26 07:27:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1923923968. Throughput: 0: 56487.2. Samples: 1873345960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:23,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:27:26,505][47288] Updated weights for policy 0, policy_version 117436 (0.0024) [2024-04-26 07:27:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1924202496. Throughput: 0: 56511.2. Samples: 1873521880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:28,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 07:27:29,203][47288] Updated weights for policy 0, policy_version 117446 (0.0030) [2024-04-26 07:27:32,228][47288] Updated weights for policy 0, policy_version 117456 (0.0026) [2024-04-26 07:27:33,923][47056] Fps is (10 sec: 54067.8, 60 sec: 55705.8, 300 sec: 56372.1). Total num frames: 1924464640. Throughput: 0: 56546.0. Samples: 1873859360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:33,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 07:27:34,892][47288] Updated weights for policy 0, policy_version 117466 (0.0028) [2024-04-26 07:27:37,871][47288] Updated weights for policy 0, policy_version 117476 (0.0029) [2024-04-26 07:27:38,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1924759552. Throughput: 0: 56609.6. Samples: 1874199600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:38,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 07:27:40,643][47288] Updated weights for policy 0, policy_version 117486 (0.0026) [2024-04-26 07:27:43,700][47288] Updated weights for policy 0, policy_version 117496 (0.0028) [2024-04-26 07:27:43,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1925054464. Throughput: 0: 56709.8. Samples: 1874366640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:43,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 07:27:46,404][47288] Updated weights for policy 0, policy_version 117506 (0.0028) [2024-04-26 07:27:48,923][47056] Fps is (10 sec: 58983.2, 60 sec: 56524.9, 300 sec: 56649.8). Total num frames: 1925349376. Throughput: 0: 56821.3. Samples: 1874706900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:48,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 07:27:49,364][47288] Updated weights for policy 0, policy_version 117516 (0.0031) [2024-04-26 07:27:52,164][47288] Updated weights for policy 0, policy_version 117526 (0.0029) [2024-04-26 07:27:53,923][47056] Fps is (10 sec: 57344.0, 60 sec: 57070.9, 300 sec: 56594.2). Total num frames: 1925627904. Throughput: 0: 56855.1. Samples: 1875050060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:53,923][47056] Avg episode reward: [(0, '0.394')] [2024-04-26 07:27:55,341][47288] Updated weights for policy 0, policy_version 117536 (0.0025) [2024-04-26 07:27:57,935][47288] Updated weights for policy 0, policy_version 117546 (0.0030) [2024-04-26 07:27:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1925906432. Throughput: 0: 56686.1. Samples: 1875222480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:27:58,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 07:28:01,104][47288] Updated weights for policy 0, policy_version 117556 (0.0033) [2024-04-26 07:28:03,626][47288] Updated weights for policy 0, policy_version 117566 (0.0040) [2024-04-26 07:28:03,923][47056] Fps is (10 sec: 57343.7, 60 sec: 57070.9, 300 sec: 56594.2). Total num frames: 1926201344. Throughput: 0: 56668.1. Samples: 1875562840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 07:28:03,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 07:28:06,890][47288] Updated weights for policy 0, policy_version 117576 (0.0029) [2024-04-26 07:28:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56483.2). Total num frames: 1926479872. Throughput: 0: 56796.9. Samples: 1875901820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:08,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 07:28:09,381][47288] Updated weights for policy 0, policy_version 117586 (0.0030) [2024-04-26 07:28:12,694][47288] Updated weights for policy 0, policy_version 117596 (0.0033) [2024-04-26 07:28:13,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1926758400. Throughput: 0: 56727.7. Samples: 1876074620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:13,923][47056] Avg episode reward: [(0, '0.396')] [2024-04-26 07:28:15,336][47288] Updated weights for policy 0, policy_version 117606 (0.0031) [2024-04-26 07:28:18,491][47288] Updated weights for policy 0, policy_version 117616 (0.0030) [2024-04-26 07:28:18,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1927036928. Throughput: 0: 56806.0. Samples: 1876415640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:18,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 07:28:21,129][47288] Updated weights for policy 0, policy_version 117626 (0.0033) [2024-04-26 07:28:23,923][47056] Fps is (10 sec: 55704.4, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 1927315456. Throughput: 0: 56722.7. Samples: 1876752120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:23,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 07:28:24,202][47288] Updated weights for policy 0, policy_version 117636 (0.0029) [2024-04-26 07:28:26,713][47288] Updated weights for policy 0, policy_version 117646 (0.0028) [2024-04-26 07:28:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.9, 300 sec: 56594.3). Total num frames: 1927610368. Throughput: 0: 56755.5. Samples: 1876920640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:28,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 07:28:29,844][47288] Updated weights for policy 0, policy_version 117656 (0.0029) [2024-04-26 07:28:32,432][47288] Updated weights for policy 0, policy_version 117666 (0.0026) [2024-04-26 07:28:33,923][47056] Fps is (10 sec: 57344.7, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 1927888896. Throughput: 0: 56704.9. Samples: 1877258620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:33,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 07:28:35,623][47288] Updated weights for policy 0, policy_version 117676 (0.0036) [2024-04-26 07:28:37,573][47267] Signal inference workers to stop experience collection... (28050 times) [2024-04-26 07:28:37,576][47267] Signal inference workers to resume experience collection... 
(28050 times) [2024-04-26 07:28:37,609][47288] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-04-26 07:28:37,609][47288] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-04-26 07:28:38,510][47288] Updated weights for policy 0, policy_version 117686 (0.0029) [2024-04-26 07:28:38,923][47056] Fps is (10 sec: 57344.2, 60 sec: 57071.1, 300 sec: 56594.2). Total num frames: 1928183808. Throughput: 0: 56821.3. Samples: 1877607020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:38,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 07:28:41,564][47288] Updated weights for policy 0, policy_version 117696 (0.0030) [2024-04-26 07:28:43,923][47056] Fps is (10 sec: 58981.9, 60 sec: 57070.8, 300 sec: 56594.2). Total num frames: 1928478720. Throughput: 0: 56874.1. Samples: 1877781820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:43,923][47056] Avg episode reward: [(0, '0.577')] [2024-04-26 07:28:44,121][47288] Updated weights for policy 0, policy_version 117706 (0.0028) [2024-04-26 07:28:47,292][47288] Updated weights for policy 0, policy_version 117716 (0.0027) [2024-04-26 07:28:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1928757248. Throughput: 0: 56905.7. Samples: 1878123600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:48,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 07:28:48,977][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000117723_1928773632.pth... [2024-04-26 07:28:49,025][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000116892_1915158528.pth [2024-04-26 07:28:49,781][47288] Updated weights for policy 0, policy_version 117726 (0.0030) [2024-04-26 07:28:53,101][47288] Updated weights for policy 0, policy_version 117736 (0.0031) [2024-04-26 07:28:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.7, 300 sec: 56483.1). 
Total num frames: 1929035776. Throughput: 0: 56901.7. Samples: 1878462400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:53,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 07:28:55,545][47288] Updated weights for policy 0, policy_version 117746 (0.0031) [2024-04-26 07:28:58,832][47288] Updated weights for policy 0, policy_version 117756 (0.0029) [2024-04-26 07:28:58,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 1929314304. Throughput: 0: 56756.6. Samples: 1878628680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:28:58,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 07:29:01,346][47288] Updated weights for policy 0, policy_version 117766 (0.0027) [2024-04-26 07:29:03,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 1929576448. Throughput: 0: 56771.1. Samples: 1878970340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 07:29:03,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 07:29:04,537][47288] Updated weights for policy 0, policy_version 117776 (0.0027) [2024-04-26 07:29:06,966][47288] Updated weights for policy 0, policy_version 117786 (0.0031) [2024-04-26 07:29:08,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.7, 300 sec: 56705.3). Total num frames: 1929887744. Throughput: 0: 56933.7. Samples: 1879314140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:08,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 07:29:10,310][47288] Updated weights for policy 0, policy_version 117796 (0.0029) [2024-04-26 07:29:12,809][47288] Updated weights for policy 0, policy_version 117806 (0.0031) [2024-04-26 07:29:13,924][47056] Fps is (10 sec: 58977.2, 60 sec: 56796.9, 300 sec: 56649.6). Total num frames: 1930166272. Throughput: 0: 56859.7. Samples: 1879479380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:13,924][47056] Avg episode reward: [(0, '0.419')] [2024-04-26 07:29:16,141][47288] Updated weights for policy 0, policy_version 117816 (0.0024) [2024-04-26 07:29:18,497][47288] Updated weights for policy 0, policy_version 117826 (0.0029) [2024-04-26 07:29:18,923][47056] Fps is (10 sec: 57344.6, 60 sec: 57070.9, 300 sec: 56649.8). Total num frames: 1930461184. Throughput: 0: 57003.9. Samples: 1879823800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:18,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 07:29:21,926][47288] Updated weights for policy 0, policy_version 117836 (0.0028) [2024-04-26 07:29:23,923][47056] Fps is (10 sec: 57348.5, 60 sec: 57070.9, 300 sec: 56649.7). Total num frames: 1930739712. Throughput: 0: 56731.3. Samples: 1880159940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:23,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 07:29:24,466][47288] Updated weights for policy 0, policy_version 117846 (0.0026) [2024-04-26 07:29:27,684][47288] Updated weights for policy 0, policy_version 117856 (0.0028) [2024-04-26 07:29:28,923][47056] Fps is (10 sec: 57344.8, 60 sec: 57071.0, 300 sec: 56649.8). Total num frames: 1931034624. Throughput: 0: 56710.4. Samples: 1880333780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:28,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:29:30,523][47288] Updated weights for policy 0, policy_version 117866 (0.0035) [2024-04-26 07:29:33,475][47288] Updated weights for policy 0, policy_version 117876 (0.0030) [2024-04-26 07:29:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57070.9, 300 sec: 56594.2). Total num frames: 1931313152. Throughput: 0: 56778.7. Samples: 1880678640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:33,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 07:29:36,323][47288] Updated weights for policy 0, policy_version 117886 (0.0028) [2024-04-26 07:29:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 1931575296. Throughput: 0: 56701.0. Samples: 1881013940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:38,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:29:39,317][47288] Updated weights for policy 0, policy_version 117896 (0.0025) [2024-04-26 07:29:42,053][47288] Updated weights for policy 0, policy_version 117906 (0.0031) [2024-04-26 07:29:43,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.8, 300 sec: 56649.8). Total num frames: 1931853824. Throughput: 0: 56592.2. Samples: 1881175320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:43,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 07:29:45,014][47288] Updated weights for policy 0, policy_version 117916 (0.0025) [2024-04-26 07:29:47,857][47288] Updated weights for policy 0, policy_version 117926 (0.0026) [2024-04-26 07:29:48,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.9, 300 sec: 56760.9). Total num frames: 1932148736. Throughput: 0: 56629.9. Samples: 1881518680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:48,923][47056] Avg episode reward: [(0, '0.374')] [2024-04-26 07:29:50,703][47288] Updated weights for policy 0, policy_version 117936 (0.0028) [2024-04-26 07:29:53,769][47288] Updated weights for policy 0, policy_version 117946 (0.0023) [2024-04-26 07:29:53,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.9, 300 sec: 56649.8). Total num frames: 1932427264. Throughput: 0: 56491.8. Samples: 1881856260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:53,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 07:29:56,616][47288] Updated weights for policy 0, policy_version 117956 (0.0028) [2024-04-26 07:29:57,274][47267] Signal inference workers to stop experience collection... (28100 times) [2024-04-26 07:29:57,274][47267] Signal inference workers to resume experience collection... (28100 times) [2024-04-26 07:29:57,291][47288] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-04-26 07:29:57,291][47288] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-04-26 07:29:58,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56797.9, 300 sec: 56649.8). Total num frames: 1932722176. Throughput: 0: 56622.4. Samples: 1882027340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:29:58,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 07:29:59,552][47288] Updated weights for policy 0, policy_version 117966 (0.0031) [2024-04-26 07:30:02,438][47288] Updated weights for policy 0, policy_version 117976 (0.0029) [2024-04-26 07:30:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 57070.9, 300 sec: 56649.7). Total num frames: 1933000704. Throughput: 0: 56427.6. Samples: 1882363040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 07:30:03,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 07:30:05,148][47288] Updated weights for policy 0, policy_version 117986 (0.0031) [2024-04-26 07:30:08,149][47288] Updated weights for policy 0, policy_version 117996 (0.0031) [2024-04-26 07:30:08,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 1933279232. Throughput: 0: 56478.8. Samples: 1882701480. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:08,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 07:30:10,901][47288] Updated weights for policy 0, policy_version 118006 (0.0038) [2024-04-26 07:30:13,898][47288] Updated weights for policy 0, policy_version 118016 (0.0034) [2024-04-26 07:30:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56798.7, 300 sec: 56649.8). Total num frames: 1933574144. Throughput: 0: 56367.4. Samples: 1882870320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:13,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 07:30:16,795][47288] Updated weights for policy 0, policy_version 118026 (0.0025) [2024-04-26 07:30:18,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 56594.2). Total num frames: 1933836288. Throughput: 0: 56179.6. Samples: 1883206720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:18,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 07:30:19,741][47288] Updated weights for policy 0, policy_version 118036 (0.0031) [2024-04-26 07:30:22,681][47288] Updated weights for policy 0, policy_version 118046 (0.0026) [2024-04-26 07:30:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56525.0, 300 sec: 56594.3). Total num frames: 1934131200. Throughput: 0: 56371.2. Samples: 1883550640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:23,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 07:30:25,503][47288] Updated weights for policy 0, policy_version 118056 (0.0033) [2024-04-26 07:30:28,692][47288] Updated weights for policy 0, policy_version 118066 (0.0030) [2024-04-26 07:30:28,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 56649.8). Total num frames: 1934393344. Throughput: 0: 56451.6. Samples: 1883715640. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:28,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 07:30:31,228][47288] Updated weights for policy 0, policy_version 118076 (0.0034) [2024-04-26 07:30:33,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 56649.8). Total num frames: 1934688256. Throughput: 0: 56430.9. Samples: 1884058080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:33,923][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 07:30:34,398][47288] Updated weights for policy 0, policy_version 118086 (0.0028) [2024-04-26 07:30:36,880][47288] Updated weights for policy 0, policy_version 118096 (0.0031) [2024-04-26 07:30:38,923][47056] Fps is (10 sec: 58981.1, 60 sec: 56797.7, 300 sec: 56705.3). Total num frames: 1934983168. Throughput: 0: 56551.4. Samples: 1884401080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:38,924][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 07:30:40,103][47288] Updated weights for policy 0, policy_version 118106 (0.0026) [2024-04-26 07:30:42,984][47288] Updated weights for policy 0, policy_version 118116 (0.0029) [2024-04-26 07:30:43,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56797.8, 300 sec: 56649.8). Total num frames: 1935261696. Throughput: 0: 56424.9. Samples: 1884566460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:43,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 07:30:45,979][47288] Updated weights for policy 0, policy_version 118126 (0.0030) [2024-04-26 07:30:48,797][47288] Updated weights for policy 0, policy_version 118136 (0.0035) [2024-04-26 07:30:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.5, 300 sec: 56594.2). Total num frames: 1935540224. Throughput: 0: 56549.1. Samples: 1884907760. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:48,924][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 07:30:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000118136_1935540224.pth... [2024-04-26 07:30:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000117307_1921957888.pth [2024-04-26 07:30:51,661][47288] Updated weights for policy 0, policy_version 118146 (0.0034) [2024-04-26 07:30:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.9, 300 sec: 56649.8). Total num frames: 1935835136. Throughput: 0: 56485.8. Samples: 1885243340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:53,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 07:30:54,503][47288] Updated weights for policy 0, policy_version 118156 (0.0024) [2024-04-26 07:30:57,333][47288] Updated weights for policy 0, policy_version 118166 (0.0026) [2024-04-26 07:30:58,923][47056] Fps is (10 sec: 57345.3, 60 sec: 56524.9, 300 sec: 56649.8). Total num frames: 1936113664. Throughput: 0: 56571.2. Samples: 1885416020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:30:58,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 07:31:00,320][47288] Updated weights for policy 0, policy_version 118176 (0.0027) [2024-04-26 07:31:03,204][47288] Updated weights for policy 0, policy_version 118186 (0.0026) [2024-04-26 07:31:03,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.8, 300 sec: 56705.3). Total num frames: 1936392192. Throughput: 0: 56717.8. Samples: 1885759020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:03,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 07:31:06,152][47288] Updated weights for policy 0, policy_version 118196 (0.0027) [2024-04-26 07:31:08,923][47056] Fps is (10 sec: 54065.7, 60 sec: 56251.5, 300 sec: 56649.7). Total num frames: 1936654336. Throughput: 0: 56573.8. Samples: 1886096480. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:08,924][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 07:31:09,358][47288] Updated weights for policy 0, policy_version 118206 (0.0030) [2024-04-26 07:31:11,945][47288] Updated weights for policy 0, policy_version 118216 (0.0032) [2024-04-26 07:31:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 56594.2). Total num frames: 1936932864. Throughput: 0: 56578.1. Samples: 1886261660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:13,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 07:31:14,988][47267] Signal inference workers to stop experience collection... (28150 times) [2024-04-26 07:31:14,989][47267] Signal inference workers to resume experience collection... (28150 times) [2024-04-26 07:31:15,013][47288] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-04-26 07:31:15,014][47288] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-04-26 07:31:15,121][47288] Updated weights for policy 0, policy_version 118226 (0.0031) [2024-04-26 07:31:17,584][47288] Updated weights for policy 0, policy_version 118236 (0.0031) [2024-04-26 07:31:18,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 1937227776. Throughput: 0: 56506.6. Samples: 1886600880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:18,923][47056] Avg episode reward: [(0, '0.582')] [2024-04-26 07:31:20,785][47288] Updated weights for policy 0, policy_version 118246 (0.0026) [2024-04-26 07:31:23,385][47288] Updated weights for policy 0, policy_version 118256 (0.0026) [2024-04-26 07:31:23,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56524.6, 300 sec: 56649.7). Total num frames: 1937522688. Throughput: 0: 56315.1. Samples: 1886935260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:23,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 07:31:26,603][47288] Updated weights for policy 0, policy_version 118266 (0.0029) [2024-04-26 07:31:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 1937801216. Throughput: 0: 56655.6. Samples: 1887115960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:28,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 07:31:29,305][47288] Updated weights for policy 0, policy_version 118276 (0.0031) [2024-04-26 07:31:32,398][47288] Updated weights for policy 0, policy_version 118286 (0.0027) [2024-04-26 07:31:33,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 56649.8). Total num frames: 1938096128. Throughput: 0: 56601.5. Samples: 1887454820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:33,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 07:31:35,120][47288] Updated weights for policy 0, policy_version 118296 (0.0027) [2024-04-26 07:31:38,167][47288] Updated weights for policy 0, policy_version 118306 (0.0036) [2024-04-26 07:31:38,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56525.0, 300 sec: 56649.8). Total num frames: 1938374656. Throughput: 0: 56630.3. Samples: 1887791700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:38,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 07:31:40,842][47288] Updated weights for policy 0, policy_version 118316 (0.0029) [2024-04-26 07:31:43,789][47288] Updated weights for policy 0, policy_version 118326 (0.0027) [2024-04-26 07:31:43,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 1938653184. Throughput: 0: 56779.2. Samples: 1887971080. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:43,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 07:31:46,522][47288] Updated weights for policy 0, policy_version 118336 (0.0027) [2024-04-26 07:31:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.9, 300 sec: 56649.7). Total num frames: 1938915328. Throughput: 0: 56641.4. Samples: 1888307880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:48,923][47056] Avg episode reward: [(0, '0.411')] [2024-04-26 07:31:49,541][47288] Updated weights for policy 0, policy_version 118346 (0.0031) [2024-04-26 07:31:52,374][47288] Updated weights for policy 0, policy_version 118356 (0.0027) [2024-04-26 07:31:53,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 56594.2). Total num frames: 1939193856. Throughput: 0: 56631.9. Samples: 1888644900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:53,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 07:31:55,491][47288] Updated weights for policy 0, policy_version 118366 (0.0022) [2024-04-26 07:31:58,242][47288] Updated weights for policy 0, policy_version 118376 (0.0025) [2024-04-26 07:31:58,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 56649.8). Total num frames: 1939488768. Throughput: 0: 56526.7. Samples: 1888805360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:31:58,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 07:32:01,228][47288] Updated weights for policy 0, policy_version 118386 (0.0032) [2024-04-26 07:32:03,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.9, 300 sec: 56649.8). Total num frames: 1939783680. Throughput: 0: 56702.9. Samples: 1889152500. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:32:03,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 07:32:03,924][47288] Updated weights for policy 0, policy_version 118396 (0.0028) [2024-04-26 07:32:06,957][47288] Updated weights for policy 0, policy_version 118406 (0.0029) [2024-04-26 07:32:08,923][47056] Fps is (10 sec: 60620.9, 60 sec: 57344.3, 300 sec: 56705.3). Total num frames: 1940094976. Throughput: 0: 56691.8. Samples: 1889486380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 07:32:08,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 07:32:09,936][47288] Updated weights for policy 0, policy_version 118416 (0.0035) [2024-04-26 07:32:12,704][47288] Updated weights for policy 0, policy_version 118426 (0.0040) [2024-04-26 07:32:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 57071.0, 300 sec: 56705.3). Total num frames: 1940357120. Throughput: 0: 56683.6. Samples: 1889666720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:13,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 07:32:15,584][47288] Updated weights for policy 0, policy_version 118436 (0.0033) [2024-04-26 07:32:18,508][47267] Signal inference workers to stop experience collection... (28200 times) [2024-04-26 07:32:18,512][47288] Updated weights for policy 0, policy_version 118446 (0.0029) [2024-04-26 07:32:18,514][47267] Signal inference workers to resume experience collection... (28200 times) [2024-04-26 07:32:18,521][47288] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-04-26 07:32:18,542][47288] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-04-26 07:32:18,923][47056] Fps is (10 sec: 55705.1, 60 sec: 57071.0, 300 sec: 56705.3). Total num frames: 1940652032. Throughput: 0: 56697.8. Samples: 1890006220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:18,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 07:32:21,354][47288] Updated weights for policy 0, policy_version 118456 (0.0027) [2024-04-26 07:32:23,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56525.0, 300 sec: 56649.8). Total num frames: 1940914176. Throughput: 0: 56838.7. Samples: 1890349440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:23,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 07:32:24,132][47288] Updated weights for policy 0, policy_version 118466 (0.0035) [2024-04-26 07:32:27,169][47288] Updated weights for policy 0, policy_version 118476 (0.0037) [2024-04-26 07:32:28,923][47056] Fps is (10 sec: 52428.7, 60 sec: 56251.7, 300 sec: 56649.7). Total num frames: 1941176320. Throughput: 0: 56537.7. Samples: 1890515280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:28,923][47056] Avg episode reward: [(0, '0.630')] [2024-04-26 07:32:29,884][47288] Updated weights for policy 0, policy_version 118486 (0.0035) [2024-04-26 07:32:32,849][47288] Updated weights for policy 0, policy_version 118496 (0.0026) [2024-04-26 07:32:33,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.8, 300 sec: 56649.8). Total num frames: 1941471232. Throughput: 0: 56669.3. Samples: 1890858000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:33,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 07:32:35,741][47288] Updated weights for policy 0, policy_version 118506 (0.0029) [2024-04-26 07:32:38,624][47288] Updated weights for policy 0, policy_version 118516 (0.0027) [2024-04-26 07:32:38,923][47056] Fps is (10 sec: 60620.9, 60 sec: 56797.8, 300 sec: 56705.3). Total num frames: 1941782528. Throughput: 0: 56538.6. Samples: 1891189140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:38,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 07:32:41,623][47288] Updated weights for policy 0, policy_version 118526 (0.0025) [2024-04-26 07:32:43,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 1942044672. Throughput: 0: 56758.3. Samples: 1891359480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:43,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 07:32:44,464][47288] Updated weights for policy 0, policy_version 118536 (0.0030) [2024-04-26 07:32:47,281][47288] Updated weights for policy 0, policy_version 118546 (0.0030) [2024-04-26 07:32:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 57343.9, 300 sec: 56705.3). Total num frames: 1942355968. Throughput: 0: 56615.4. Samples: 1891700200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:48,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 07:32:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000118552_1942355968.pth... [2024-04-26 07:32:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000117723_1928773632.pth [2024-04-26 07:32:50,202][47288] Updated weights for policy 0, policy_version 118556 (0.0030) [2024-04-26 07:32:53,106][47288] Updated weights for policy 0, policy_version 118566 (0.0026) [2024-04-26 07:32:53,923][47056] Fps is (10 sec: 57343.4, 60 sec: 57070.9, 300 sec: 56649.7). Total num frames: 1942618112. Throughput: 0: 56739.4. Samples: 1892039660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:53,923][47056] Avg episode reward: [(0, '0.579')] [2024-04-26 07:32:55,956][47288] Updated weights for policy 0, policy_version 118576 (0.0026) [2024-04-26 07:32:58,923][47056] Fps is (10 sec: 52429.5, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1942880256. Throughput: 0: 56411.2. Samples: 1892205220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:32:58,923][47056] Avg episode reward: [(0, '0.584')] [2024-04-26 07:32:59,073][47288] Updated weights for policy 0, policy_version 118586 (0.0035) [2024-04-26 07:33:02,001][47288] Updated weights for policy 0, policy_version 118596 (0.0030) [2024-04-26 07:33:03,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.6, 300 sec: 56538.7). Total num frames: 1943158784. Throughput: 0: 56357.3. Samples: 1892542300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:33:03,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 07:33:04,875][47288] Updated weights for policy 0, policy_version 118606 (0.0028) [2024-04-26 07:33:07,888][47288] Updated weights for policy 0, policy_version 118616 (0.0029) [2024-04-26 07:33:08,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 56538.7). Total num frames: 1943437312. Throughput: 0: 56295.4. Samples: 1892882740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 07:33:08,924][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 07:33:10,631][47288] Updated weights for policy 0, policy_version 118626 (0.0026) [2024-04-26 07:33:13,800][47288] Updated weights for policy 0, policy_version 118636 (0.0030) [2024-04-26 07:33:13,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 56649.8). Total num frames: 1943748608. Throughput: 0: 56185.8. Samples: 1893043640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:13,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 07:33:16,523][47288] Updated weights for policy 0, policy_version 118646 (0.0031) [2024-04-26 07:33:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 56594.2). Total num frames: 1944010752. Throughput: 0: 56052.8. Samples: 1893380380. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:18,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 07:33:19,530][47288] Updated weights for policy 0, policy_version 118656 (0.0028) [2024-04-26 07:33:22,296][47288] Updated weights for policy 0, policy_version 118666 (0.0038) [2024-04-26 07:33:23,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.8, 300 sec: 56649.8). Total num frames: 1944322048. Throughput: 0: 56269.3. Samples: 1893721260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:23,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 07:33:25,296][47288] Updated weights for policy 0, policy_version 118676 (0.0029) [2024-04-26 07:33:28,103][47288] Updated weights for policy 0, policy_version 118686 (0.0027) [2024-04-26 07:33:28,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56798.0, 300 sec: 56594.2). Total num frames: 1944584192. Throughput: 0: 56322.7. Samples: 1893894000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:28,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 07:33:29,234][47267] Signal inference workers to stop experience collection... (28250 times) [2024-04-26 07:33:29,237][47267] Signal inference workers to resume experience collection... (28250 times) [2024-04-26 07:33:29,260][47288] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-04-26 07:33:29,260][47288] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-04-26 07:33:31,095][47288] Updated weights for policy 0, policy_version 118696 (0.0032) [2024-04-26 07:33:33,923][47056] Fps is (10 sec: 54068.1, 60 sec: 56525.0, 300 sec: 56538.7). Total num frames: 1944862720. Throughput: 0: 56204.3. Samples: 1894229380. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:33,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 07:33:34,007][47288] Updated weights for policy 0, policy_version 118706 (0.0032) [2024-04-26 07:33:37,039][47288] Updated weights for policy 0, policy_version 118716 (0.0037) [2024-04-26 07:33:38,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55978.5, 300 sec: 56483.1). Total num frames: 1945141248. Throughput: 0: 56197.6. Samples: 1894568560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:38,924][47056] Avg episode reward: [(0, '0.579')] [2024-04-26 07:33:39,935][47288] Updated weights for policy 0, policy_version 118726 (0.0030) [2024-04-26 07:33:43,072][47288] Updated weights for policy 0, policy_version 118736 (0.0028) [2024-04-26 07:33:43,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1945403392. Throughput: 0: 56224.4. Samples: 1894735320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:43,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 07:33:45,738][47288] Updated weights for policy 0, policy_version 118746 (0.0026) [2024-04-26 07:33:48,919][47288] Updated weights for policy 0, policy_version 118756 (0.0030) [2024-04-26 07:33:48,923][47056] Fps is (10 sec: 55707.0, 60 sec: 55705.8, 300 sec: 56483.2). Total num frames: 1945698304. Throughput: 0: 56268.6. Samples: 1895074380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:48,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 07:33:51,440][47288] Updated weights for policy 0, policy_version 118766 (0.0027) [2024-04-26 07:33:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 56483.2). Total num frames: 1945976832. Throughput: 0: 56181.9. Samples: 1895410920. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:53,932][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 07:33:54,667][47288] Updated weights for policy 0, policy_version 118776 (0.0025) [2024-04-26 07:33:57,335][47288] Updated weights for policy 0, policy_version 118786 (0.0032) [2024-04-26 07:33:58,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 1946271744. Throughput: 0: 56352.8. Samples: 1895579520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:33:58,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:34:00,640][47288] Updated weights for policy 0, policy_version 118796 (0.0037) [2024-04-26 07:34:03,141][47288] Updated weights for policy 0, policy_version 118806 (0.0028) [2024-04-26 07:34:03,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 1946566656. Throughput: 0: 56472.5. Samples: 1895921640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:34:03,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 07:34:06,287][47288] Updated weights for policy 0, policy_version 118816 (0.0031) [2024-04-26 07:34:08,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56483.3). Total num frames: 1946828800. Throughput: 0: 56308.9. Samples: 1896255160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:34:08,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 07:34:08,996][47288] Updated weights for policy 0, policy_version 118826 (0.0039) [2024-04-26 07:34:11,954][47288] Updated weights for policy 0, policy_version 118836 (0.0029) [2024-04-26 07:34:13,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1947107328. Throughput: 0: 56263.9. Samples: 1896425880. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 07:34:13,923][47056] Avg episode reward: [(0, '0.384')] [2024-04-26 07:34:14,753][47288] Updated weights for policy 0, policy_version 118846 (0.0024) [2024-04-26 07:34:17,824][47288] Updated weights for policy 0, policy_version 118856 (0.0033) [2024-04-26 07:34:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 1947402240. Throughput: 0: 56336.6. Samples: 1896764540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:34:18,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 07:34:20,596][47288] Updated weights for policy 0, policy_version 118866 (0.0024) [2024-04-26 07:34:23,797][47288] Updated weights for policy 0, policy_version 118876 (0.0028) [2024-04-26 07:34:23,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 56372.1). Total num frames: 1947664384. Throughput: 0: 56476.3. Samples: 1897109980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:34:23,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 07:34:26,267][47288] Updated weights for policy 0, policy_version 118886 (0.0033) [2024-04-26 07:34:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 1947959296. Throughput: 0: 56406.0. Samples: 1897273600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:34:28,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 07:34:29,522][47288] Updated weights for policy 0, policy_version 118896 (0.0030) [2024-04-26 07:34:30,559][47267] Signal inference workers to stop experience collection... (28300 times) [2024-04-26 07:34:30,560][47267] Signal inference workers to resume experience collection... 
(28300 times) [2024-04-26 07:34:30,573][47288] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-04-26 07:34:30,573][47288] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-04-26 07:34:31,982][47288] Updated weights for policy 0, policy_version 118906 (0.0029) [2024-04-26 07:34:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 1948237824. Throughput: 0: 56386.1. Samples: 1897611760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:34:33,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:34:35,171][47288] Updated weights for policy 0, policy_version 118916 (0.0030) [2024-04-26 07:34:37,777][47288] Updated weights for policy 0, policy_version 118926 (0.0031) [2024-04-26 07:34:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 1948532736. Throughput: 0: 56371.4. Samples: 1897947640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:34:38,923][47056] Avg episode reward: [(0, '0.612')] [2024-04-26 07:34:41,117][47288] Updated weights for policy 0, policy_version 118936 (0.0029) [2024-04-26 07:34:43,535][47288] Updated weights for policy 0, policy_version 118946 (0.0043) [2024-04-26 07:34:43,923][47056] Fps is (10 sec: 58982.3, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 1948827648. Throughput: 0: 56592.6. Samples: 1898126180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:34:43,923][47056] Avg episode reward: [(0, '0.584')] [2024-04-26 07:34:46,866][47288] Updated weights for policy 0, policy_version 118956 (0.0029) [2024-04-26 07:34:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.6, 300 sec: 56483.1). Total num frames: 1949089792. Throughput: 0: 56479.5. Samples: 1898463220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:34:48,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 07:34:49,038][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000118964_1949106176.pth... [2024-04-26 07:34:49,080][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000118136_1935540224.pth [2024-04-26 07:34:49,435][47288] Updated weights for policy 0, policy_version 118966 (0.0031) [2024-04-26 07:34:52,544][47288] Updated weights for policy 0, policy_version 118976 (0.0031) [2024-04-26 07:34:53,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1949368320. Throughput: 0: 56528.4. Samples: 1898798940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:34:53,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 07:34:55,193][47288] Updated weights for policy 0, policy_version 118986 (0.0031) [2024-04-26 07:34:58,216][47288] Updated weights for policy 0, policy_version 118996 (0.0027) [2024-04-26 07:34:58,923][47056] Fps is (10 sec: 55706.7, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1949646848. Throughput: 0: 56547.7. Samples: 1898970520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:34:58,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 07:35:00,942][47288] Updated weights for policy 0, policy_version 119006 (0.0026) [2024-04-26 07:35:03,923][47056] Fps is (10 sec: 55704.2, 60 sec: 55978.5, 300 sec: 56427.6). Total num frames: 1949925376. Throughput: 0: 56594.5. Samples: 1899311300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:35:03,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 07:35:04,181][47288] Updated weights for policy 0, policy_version 119016 (0.0027) [2024-04-26 07:35:06,700][47288] Updated weights for policy 0, policy_version 119026 (0.0032) [2024-04-26 07:35:08,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56524.7, 300 sec: 56427.6). 
Total num frames: 1950220288. Throughput: 0: 56413.6. Samples: 1899648600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:35:08,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 07:35:10,050][47288] Updated weights for policy 0, policy_version 119036 (0.0025) [2024-04-26 07:35:12,643][47288] Updated weights for policy 0, policy_version 119046 (0.0023) [2024-04-26 07:35:13,923][47056] Fps is (10 sec: 55706.7, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1950482432. Throughput: 0: 56359.2. Samples: 1899809760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:35:13,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 07:35:15,787][47288] Updated weights for policy 0, policy_version 119056 (0.0027) [2024-04-26 07:35:18,445][47288] Updated weights for policy 0, policy_version 119066 (0.0031) [2024-04-26 07:35:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 1950793728. Throughput: 0: 56455.5. Samples: 1900152260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:35:18,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 07:35:21,380][47288] Updated weights for policy 0, policy_version 119076 (0.0026) [2024-04-26 07:35:23,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56797.7, 300 sec: 56538.6). Total num frames: 1951072256. Throughput: 0: 56541.3. Samples: 1900492000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:35:23,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 07:35:24,149][47288] Updated weights for policy 0, policy_version 119086 (0.0031) [2024-04-26 07:35:27,380][47288] Updated weights for policy 0, policy_version 119096 (0.0025) [2024-04-26 07:35:28,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1951350784. Throughput: 0: 56376.0. Samples: 1900663100. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:35:28,924][47056] Avg episode reward: [(0, '0.445')] [2024-04-26 07:35:29,907][47288] Updated weights for policy 0, policy_version 119106 (0.0033) [2024-04-26 07:35:33,167][47288] Updated weights for policy 0, policy_version 119116 (0.0031) [2024-04-26 07:35:33,923][47056] Fps is (10 sec: 54068.0, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1951612928. Throughput: 0: 56429.9. Samples: 1901002560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:35:33,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 07:35:35,241][47267] Signal inference workers to stop experience collection... (28350 times) [2024-04-26 07:35:35,276][47288] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-04-26 07:35:35,288][47267] Signal inference workers to resume experience collection... (28350 times) [2024-04-26 07:35:35,295][47288] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-04-26 07:35:35,766][47288] Updated weights for policy 0, policy_version 119126 (0.0030) [2024-04-26 07:35:38,826][47288] Updated weights for policy 0, policy_version 119136 (0.0026) [2024-04-26 07:35:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1951924224. Throughput: 0: 56385.7. Samples: 1901336300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:35:38,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 07:35:41,668][47288] Updated weights for policy 0, policy_version 119146 (0.0029) [2024-04-26 07:35:43,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1952186368. Throughput: 0: 56232.2. Samples: 1901500980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:35:43,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 07:35:44,745][47288] Updated weights for policy 0, policy_version 119156 (0.0032) [2024-04-26 07:35:47,374][47288] Updated weights for policy 0, policy_version 119166 (0.0028) [2024-04-26 07:35:48,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1952448512. Throughput: 0: 56149.9. Samples: 1901838040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:35:48,924][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 07:35:50,773][47288] Updated weights for policy 0, policy_version 119176 (0.0032) [2024-04-26 07:35:53,180][47288] Updated weights for policy 0, policy_version 119186 (0.0031) [2024-04-26 07:35:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1952743424. Throughput: 0: 56159.5. Samples: 1902175780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:35:53,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 07:35:56,525][47288] Updated weights for policy 0, policy_version 119196 (0.0032) [2024-04-26 07:35:58,923][47056] Fps is (10 sec: 60622.3, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 1953054720. Throughput: 0: 56465.5. Samples: 1902350700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:35:58,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 07:35:59,029][47288] Updated weights for policy 0, policy_version 119206 (0.0030) [2024-04-26 07:36:02,336][47288] Updated weights for policy 0, policy_version 119216 (0.0029) [2024-04-26 07:36:03,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56525.0, 300 sec: 56483.2). Total num frames: 1953316864. Throughput: 0: 56306.2. Samples: 1902686040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:36:03,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 07:36:04,971][47288] Updated weights for policy 0, policy_version 119226 (0.0033) [2024-04-26 07:36:08,134][47288] Updated weights for policy 0, policy_version 119236 (0.0036) [2024-04-26 07:36:08,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 1953595392. Throughput: 0: 56277.0. Samples: 1903024460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:36:08,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 07:36:10,932][47288] Updated weights for policy 0, policy_version 119246 (0.0033) [2024-04-26 07:36:13,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56525.0, 300 sec: 56427.7). Total num frames: 1953873920. Throughput: 0: 56317.6. Samples: 1903197380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 07:36:13,923][47056] Avg episode reward: [(0, '0.464')] [2024-04-26 07:36:14,008][47288] Updated weights for policy 0, policy_version 119256 (0.0031) [2024-04-26 07:36:16,906][47288] Updated weights for policy 0, policy_version 119266 (0.0031) [2024-04-26 07:36:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1954168832. Throughput: 0: 56212.3. Samples: 1903532120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:36:18,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 07:36:19,914][47288] Updated weights for policy 0, policy_version 119276 (0.0033) [2024-04-26 07:36:22,678][47288] Updated weights for policy 0, policy_version 119286 (0.0028) [2024-04-26 07:36:23,923][47056] Fps is (10 sec: 57342.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1954447360. Throughput: 0: 56293.3. Samples: 1903869500. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:36:23,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 07:36:25,639][47288] Updated weights for policy 0, policy_version 119296 (0.0030) [2024-04-26 07:36:28,394][47288] Updated weights for policy 0, policy_version 119306 (0.0028) [2024-04-26 07:36:28,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1954709504. Throughput: 0: 56362.7. Samples: 1904037300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:36:28,923][47056] Avg episode reward: [(0, '0.607')] [2024-04-26 07:36:31,491][47288] Updated weights for policy 0, policy_version 119316 (0.0026) [2024-04-26 07:36:33,923][47056] Fps is (10 sec: 55706.8, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1955004416. Throughput: 0: 56287.0. Samples: 1904370940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:36:33,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 07:36:34,373][47288] Updated weights for policy 0, policy_version 119326 (0.0030) [2024-04-26 07:36:37,321][47288] Updated weights for policy 0, policy_version 119336 (0.0035) [2024-04-26 07:36:38,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1955299328. Throughput: 0: 56202.8. Samples: 1904704900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:36:38,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 07:36:40,150][47288] Updated weights for policy 0, policy_version 119346 (0.0027) [2024-04-26 07:36:41,035][47267] Signal inference workers to stop experience collection... (28400 times) [2024-04-26 07:36:41,063][47288] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-04-26 07:36:41,089][47267] Signal inference workers to resume experience collection... 
(28400 times) [2024-04-26 07:36:41,090][47288] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-04-26 07:36:43,036][47288] Updated weights for policy 0, policy_version 119356 (0.0031) [2024-04-26 07:36:43,923][47056] Fps is (10 sec: 54066.0, 60 sec: 55978.6, 300 sec: 56372.0). Total num frames: 1955545088. Throughput: 0: 56104.2. Samples: 1904875400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:36:43,924][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 07:36:45,870][47288] Updated weights for policy 0, policy_version 119366 (0.0028) [2024-04-26 07:36:48,841][47288] Updated weights for policy 0, policy_version 119376 (0.0026) [2024-04-26 07:36:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1955856384. Throughput: 0: 56142.5. Samples: 1905212460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:36:48,924][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 07:36:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000119376_1955856384.pth... [2024-04-26 07:36:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000118552_1942355968.pth [2024-04-26 07:36:52,029][47288] Updated weights for policy 0, policy_version 119386 (0.0028) [2024-04-26 07:36:53,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1956118528. Throughput: 0: 56169.1. Samples: 1905552060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:36:53,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 07:36:54,643][47288] Updated weights for policy 0, policy_version 119396 (0.0027) [2024-04-26 07:36:57,777][47288] Updated weights for policy 0, policy_version 119406 (0.0029) [2024-04-26 07:36:58,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.5, 300 sec: 56372.1). Total num frames: 1956413440. Throughput: 0: 56113.1. Samples: 1905722480. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:36:58,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 07:37:00,545][47288] Updated weights for policy 0, policy_version 119416 (0.0026) [2024-04-26 07:37:03,500][47288] Updated weights for policy 0, policy_version 119426 (0.0023) [2024-04-26 07:37:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 1956691968. Throughput: 0: 56300.7. Samples: 1906065640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:37:03,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 07:37:06,260][47288] Updated weights for policy 0, policy_version 119436 (0.0029) [2024-04-26 07:37:08,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 1956986880. Throughput: 0: 56359.1. Samples: 1906405660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:37:08,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 07:37:09,134][47288] Updated weights for policy 0, policy_version 119446 (0.0028) [2024-04-26 07:37:12,003][47288] Updated weights for policy 0, policy_version 119456 (0.0028) [2024-04-26 07:37:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1957265408. Throughput: 0: 56413.5. Samples: 1906575900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:37:13,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 07:37:14,825][47288] Updated weights for policy 0, policy_version 119466 (0.0026) [2024-04-26 07:37:17,816][47288] Updated weights for policy 0, policy_version 119476 (0.0035) [2024-04-26 07:37:18,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1957543936. Throughput: 0: 56505.1. Samples: 1906913680. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:37:18,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 07:37:20,782][47288] Updated weights for policy 0, policy_version 119486 (0.0026) [2024-04-26 07:37:23,585][47288] Updated weights for policy 0, policy_version 119496 (0.0027) [2024-04-26 07:37:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1957822464. Throughput: 0: 56562.7. Samples: 1907250220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:37:23,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 07:37:26,561][47288] Updated weights for policy 0, policy_version 119506 (0.0027) [2024-04-26 07:37:28,922][47056] Fps is (10 sec: 57345.6, 60 sec: 56798.1, 300 sec: 56427.7). Total num frames: 1958117376. Throughput: 0: 56539.0. Samples: 1907419640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:37:28,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 07:37:29,294][47288] Updated weights for policy 0, policy_version 119516 (0.0033) [2024-04-26 07:37:32,448][47288] Updated weights for policy 0, policy_version 119526 (0.0027) [2024-04-26 07:37:33,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.6, 300 sec: 56316.5). Total num frames: 1958395904. Throughput: 0: 56541.0. Samples: 1907756800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:37:33,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 07:37:35,257][47288] Updated weights for policy 0, policy_version 119536 (0.0031) [2024-04-26 07:37:38,262][47288] Updated weights for policy 0, policy_version 119546 (0.0030) [2024-04-26 07:37:38,923][47056] Fps is (10 sec: 54065.7, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 1958658048. Throughput: 0: 56534.4. Samples: 1908096120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:37:38,924][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 07:37:41,156][47288] Updated weights for policy 0, policy_version 119556 (0.0030) [2024-04-26 07:37:43,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56798.1, 300 sec: 56261.0). Total num frames: 1958952960. Throughput: 0: 56560.1. Samples: 1908267680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:37:43,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 07:37:43,990][47288] Updated weights for policy 0, policy_version 119566 (0.0025) [2024-04-26 07:37:46,932][47288] Updated weights for policy 0, policy_version 119576 (0.0035) [2024-04-26 07:37:48,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56252.0, 300 sec: 56316.5). Total num frames: 1959231488. Throughput: 0: 56469.3. Samples: 1908606760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:37:48,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 07:37:49,748][47288] Updated weights for policy 0, policy_version 119586 (0.0035) [2024-04-26 07:37:52,605][47267] Signal inference workers to stop experience collection... (28450 times) [2024-04-26 07:37:52,606][47267] Signal inference workers to resume experience collection... (28450 times) [2024-04-26 07:37:52,647][47288] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-04-26 07:37:52,647][47288] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-04-26 07:37:52,716][47288] Updated weights for policy 0, policy_version 119596 (0.0027) [2024-04-26 07:37:53,923][47056] Fps is (10 sec: 55703.9, 60 sec: 56524.5, 300 sec: 56372.0). Total num frames: 1959510016. Throughput: 0: 56526.1. Samples: 1908949340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:37:53,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:37:55,540][47288] Updated weights for policy 0, policy_version 119606 (0.0027) [2024-04-26 07:37:58,754][47288] Updated weights for policy 0, policy_version 119616 (0.0027) [2024-04-26 07:37:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1959804928. Throughput: 0: 56375.0. Samples: 1909112780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:37:58,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 07:38:01,254][47288] Updated weights for policy 0, policy_version 119626 (0.0026) [2024-04-26 07:38:03,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1960083456. Throughput: 0: 56508.1. Samples: 1909456540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:38:03,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:38:04,451][47288] Updated weights for policy 0, policy_version 119636 (0.0031) [2024-04-26 07:38:07,038][47288] Updated weights for policy 0, policy_version 119646 (0.0034) [2024-04-26 07:38:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 1960378368. Throughput: 0: 56437.8. Samples: 1909789920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:38:08,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 07:38:10,151][47288] Updated weights for policy 0, policy_version 119656 (0.0031) [2024-04-26 07:38:13,020][47288] Updated weights for policy 0, policy_version 119666 (0.0026) [2024-04-26 07:38:13,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1960640512. Throughput: 0: 56473.2. Samples: 1909960940. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:38:13,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 07:38:16,018][47288] Updated weights for policy 0, policy_version 119676 (0.0031) [2024-04-26 07:38:18,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56252.0, 300 sec: 56261.0). Total num frames: 1960919040. Throughput: 0: 56466.5. Samples: 1910297780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 07:38:18,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:38:19,004][47288] Updated weights for policy 0, policy_version 119686 (0.0029) [2024-04-26 07:38:21,878][47288] Updated weights for policy 0, policy_version 119696 (0.0028) [2024-04-26 07:38:23,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1961197568. Throughput: 0: 56460.9. Samples: 1910636860. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:38:23,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 07:38:24,774][47288] Updated weights for policy 0, policy_version 119706 (0.0031) [2024-04-26 07:38:27,558][47288] Updated weights for policy 0, policy_version 119716 (0.0036) [2024-04-26 07:38:28,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1961492480. Throughput: 0: 56484.3. Samples: 1910809480. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:38:28,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 07:38:30,525][47288] Updated weights for policy 0, policy_version 119726 (0.0026) [2024-04-26 07:38:33,464][47288] Updated weights for policy 0, policy_version 119736 (0.0029) [2024-04-26 07:38:33,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1961771008. Throughput: 0: 56473.4. Samples: 1911148060. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:38:33,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 07:38:36,246][47288] Updated weights for policy 0, policy_version 119746 (0.0029) [2024-04-26 07:38:38,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1962049536. Throughput: 0: 56497.9. Samples: 1911491740. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:38:38,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 07:38:39,379][47288] Updated weights for policy 0, policy_version 119756 (0.0025) [2024-04-26 07:38:41,943][47288] Updated weights for policy 0, policy_version 119766 (0.0031) [2024-04-26 07:38:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1962344448. Throughput: 0: 56476.1. Samples: 1911654200. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:38:43,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 07:38:45,177][47288] Updated weights for policy 0, policy_version 119776 (0.0038) [2024-04-26 07:38:47,789][47288] Updated weights for policy 0, policy_version 119786 (0.0028) [2024-04-26 07:38:48,923][47056] Fps is (10 sec: 60620.5, 60 sec: 57070.7, 300 sec: 56538.6). Total num frames: 1962655744. Throughput: 0: 56428.8. Samples: 1911995840. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:38:48,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 07:38:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000119791_1962655744.pth... [2024-04-26 07:38:48,989][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000118964_1949106176.pth [2024-04-26 07:38:50,940][47288] Updated weights for policy 0, policy_version 119796 (0.0030) [2024-04-26 07:38:53,701][47288] Updated weights for policy 0, policy_version 119806 (0.0039) [2024-04-26 07:38:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56525.0, 300 sec: 56372.1). 
Total num frames: 1962901504. Throughput: 0: 56555.6. Samples: 1912334920. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:38:53,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 07:38:56,746][47288] Updated weights for policy 0, policy_version 119816 (0.0027) [2024-04-26 07:38:58,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1963196416. Throughput: 0: 56519.0. Samples: 1912504300. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:38:58,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 07:38:59,679][47288] Updated weights for policy 0, policy_version 119826 (0.0030) [2024-04-26 07:39:00,075][47267] Signal inference workers to stop experience collection... (28500 times) [2024-04-26 07:39:00,076][47267] Signal inference workers to resume experience collection... (28500 times) [2024-04-26 07:39:00,103][47288] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-04-26 07:39:00,104][47288] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-04-26 07:39:02,671][47288] Updated weights for policy 0, policy_version 119836 (0.0024) [2024-04-26 07:39:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1963458560. Throughput: 0: 56621.7. Samples: 1912845760. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:39:03,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 07:39:05,562][47288] Updated weights for policy 0, policy_version 119846 (0.0030) [2024-04-26 07:39:08,375][47288] Updated weights for policy 0, policy_version 119856 (0.0029) [2024-04-26 07:39:08,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1963753472. Throughput: 0: 56395.6. Samples: 1913174660. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:39:08,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:39:11,332][47288] Updated weights for policy 0, policy_version 119866 (0.0025) [2024-04-26 07:39:13,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 56316.6). Total num frames: 1964015616. Throughput: 0: 56348.6. Samples: 1913345160. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:39:13,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 07:39:14,128][47288] Updated weights for policy 0, policy_version 119876 (0.0028) [2024-04-26 07:39:17,093][47288] Updated weights for policy 0, policy_version 119886 (0.0032) [2024-04-26 07:39:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1964310528. Throughput: 0: 56263.0. Samples: 1913679900. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-04-26 07:39:18,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 07:39:20,167][47288] Updated weights for policy 0, policy_version 119896 (0.0036) [2024-04-26 07:39:22,820][47288] Updated weights for policy 0, policy_version 119906 (0.0028) [2024-04-26 07:39:23,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1964589056. Throughput: 0: 55983.3. Samples: 1914010980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:39:23,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:39:25,841][47288] Updated weights for policy 0, policy_version 119916 (0.0030) [2024-04-26 07:39:28,534][47288] Updated weights for policy 0, policy_version 119926 (0.0030) [2024-04-26 07:39:28,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1964883968. Throughput: 0: 56253.4. Samples: 1914185600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:39:28,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 07:39:31,820][47288] Updated weights for policy 0, policy_version 119936 (0.0033) [2024-04-26 07:39:33,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1965162496. Throughput: 0: 56096.6. Samples: 1914520180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:39:33,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:39:34,408][47288] Updated weights for policy 0, policy_version 119946 (0.0027) [2024-04-26 07:39:37,675][47288] Updated weights for policy 0, policy_version 119956 (0.0036) [2024-04-26 07:39:38,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56524.8, 300 sec: 56316.5). Total num frames: 1965441024. Throughput: 0: 56125.7. Samples: 1914860580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:39:38,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 07:39:40,247][47288] Updated weights for policy 0, policy_version 119966 (0.0027) [2024-04-26 07:39:43,381][47288] Updated weights for policy 0, policy_version 119976 (0.0029) [2024-04-26 07:39:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1965719552. Throughput: 0: 56066.7. Samples: 1915027300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:39:43,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 07:39:45,981][47288] Updated weights for policy 0, policy_version 119986 (0.0025) [2024-04-26 07:39:48,923][47056] Fps is (10 sec: 55706.8, 60 sec: 55705.8, 300 sec: 56372.1). Total num frames: 1965998080. Throughput: 0: 55991.2. Samples: 1915365360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:39:48,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 07:39:49,162][47288] Updated weights for policy 0, policy_version 119996 (0.0026) [2024-04-26 07:39:51,921][47288] Updated weights for policy 0, policy_version 120006 (0.0026) [2024-04-26 07:39:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 1966276608. Throughput: 0: 56330.1. Samples: 1915709520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:39:53,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:39:54,894][47288] Updated weights for policy 0, policy_version 120016 (0.0028) [2024-04-26 07:39:57,607][47288] Updated weights for policy 0, policy_version 120026 (0.0026) [2024-04-26 07:39:58,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1966571520. Throughput: 0: 56232.6. Samples: 1915875640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:39:58,923][47056] Avg episode reward: [(0, '0.595')] [2024-04-26 07:40:00,598][47288] Updated weights for policy 0, policy_version 120036 (0.0033) [2024-04-26 07:40:03,265][47288] Updated weights for policy 0, policy_version 120046 (0.0025) [2024-04-26 07:40:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 1966850048. Throughput: 0: 56327.5. Samples: 1916214640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:40:03,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 07:40:06,621][47288] Updated weights for policy 0, policy_version 120056 (0.0027) [2024-04-26 07:40:08,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1967144960. Throughput: 0: 56587.1. Samples: 1916557400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:40:08,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 07:40:08,988][47288] Updated weights for policy 0, policy_version 120066 (0.0025) [2024-04-26 07:40:10,825][47267] Signal inference workers to stop experience collection... (28550 times) [2024-04-26 07:40:10,826][47267] Signal inference workers to resume experience collection... (28550 times) [2024-04-26 07:40:10,837][47288] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-04-26 07:40:10,859][47288] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-04-26 07:40:12,321][47288] Updated weights for policy 0, policy_version 120076 (0.0030) [2024-04-26 07:40:13,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 1967407104. Throughput: 0: 56518.2. Samples: 1916728920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:40:13,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 07:40:14,744][47288] Updated weights for policy 0, policy_version 120086 (0.0029) [2024-04-26 07:40:18,045][47288] Updated weights for policy 0, policy_version 120096 (0.0026) [2024-04-26 07:40:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1967702016. Throughput: 0: 56628.5. Samples: 1917068460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 07:40:18,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 07:40:20,497][47288] Updated weights for policy 0, policy_version 120106 (0.0030) [2024-04-26 07:40:23,913][47288] Updated weights for policy 0, policy_version 120116 (0.0037) [2024-04-26 07:40:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 1967980544. Throughput: 0: 56532.6. Samples: 1917404540. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:40:23,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 07:40:26,470][47288] Updated weights for policy 0, policy_version 120126 (0.0030) [2024-04-26 07:40:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1968242688. Throughput: 0: 56431.3. Samples: 1917566700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:40:28,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 07:40:29,678][47288] Updated weights for policy 0, policy_version 120136 (0.0036) [2024-04-26 07:40:32,362][47288] Updated weights for policy 0, policy_version 120146 (0.0025) [2024-04-26 07:40:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1968537600. Throughput: 0: 56549.1. Samples: 1917910080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:40:33,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 07:40:35,294][47288] Updated weights for policy 0, policy_version 120156 (0.0029) [2024-04-26 07:40:38,079][47288] Updated weights for policy 0, policy_version 120166 (0.0031) [2024-04-26 07:40:38,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1968832512. Throughput: 0: 56536.2. Samples: 1918253640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:40:38,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 07:40:40,954][47288] Updated weights for policy 0, policy_version 120176 (0.0028) [2024-04-26 07:40:43,923][47056] Fps is (10 sec: 58983.2, 60 sec: 56798.0, 300 sec: 56538.7). Total num frames: 1969127424. Throughput: 0: 56560.2. Samples: 1918420840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:40:43,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 07:40:43,928][47288] Updated weights for policy 0, policy_version 120186 (0.0038) [2024-04-26 07:40:46,857][47288] Updated weights for policy 0, policy_version 120196 (0.0027) [2024-04-26 07:40:48,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1969389568. Throughput: 0: 56579.1. Samples: 1918760700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:40:48,923][47056] Avg episode reward: [(0, '0.581')] [2024-04-26 07:40:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000120202_1969389568.pth... [2024-04-26 07:40:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000119376_1955856384.pth [2024-04-26 07:40:49,845][47288] Updated weights for policy 0, policy_version 120206 (0.0025) [2024-04-26 07:40:52,725][47288] Updated weights for policy 0, policy_version 120216 (0.0026) [2024-04-26 07:40:53,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1969668096. Throughput: 0: 56540.0. Samples: 1919101700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:40:53,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 07:40:55,587][47288] Updated weights for policy 0, policy_version 120226 (0.0030) [2024-04-26 07:40:58,450][47288] Updated weights for policy 0, policy_version 120236 (0.0026) [2024-04-26 07:40:58,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1969963008. Throughput: 0: 56490.6. Samples: 1919271000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:40:58,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 07:41:01,342][47288] Updated weights for policy 0, policy_version 120246 (0.0027) [2024-04-26 07:41:03,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.9, 300 sec: 56427.6). 
Total num frames: 1970241536. Throughput: 0: 56512.4. Samples: 1919611520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:41:03,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:41:04,297][47288] Updated weights for policy 0, policy_version 120256 (0.0030) [2024-04-26 07:41:06,955][47288] Updated weights for policy 0, policy_version 120266 (0.0027) [2024-04-26 07:41:08,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1970520064. Throughput: 0: 56591.2. Samples: 1919951140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:41:08,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 07:41:10,222][47288] Updated weights for policy 0, policy_version 120276 (0.0027) [2024-04-26 07:41:12,914][47288] Updated weights for policy 0, policy_version 120286 (0.0030) [2024-04-26 07:41:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1970814976. Throughput: 0: 56790.6. Samples: 1920122280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:41:13,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 07:41:15,829][47288] Updated weights for policy 0, policy_version 120296 (0.0030) [2024-04-26 07:41:18,791][47288] Updated weights for policy 0, policy_version 120306 (0.0030) [2024-04-26 07:41:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1971093504. Throughput: 0: 56696.4. Samples: 1920461420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:41:18,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 07:41:21,439][47288] Updated weights for policy 0, policy_version 120316 (0.0032) [2024-04-26 07:41:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 1971372032. Throughput: 0: 56655.1. Samples: 1920803120. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 07:41:23,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:41:24,701][47288] Updated weights for policy 0, policy_version 120326 (0.0029) [2024-04-26 07:41:27,300][47288] Updated weights for policy 0, policy_version 120336 (0.0034) [2024-04-26 07:41:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 57070.6, 300 sec: 56483.1). Total num frames: 1971666944. Throughput: 0: 56665.4. Samples: 1920970800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:41:28,924][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 07:41:30,413][47288] Updated weights for policy 0, policy_version 120346 (0.0030) [2024-04-26 07:41:33,328][47288] Updated weights for policy 0, policy_version 120356 (0.0030) [2024-04-26 07:41:33,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 1971945472. Throughput: 0: 56651.2. Samples: 1921310000. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:41:33,923][47056] Avg episode reward: [(0, '0.609')] [2024-04-26 07:41:36,269][47288] Updated weights for policy 0, policy_version 120366 (0.0033) [2024-04-26 07:41:38,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 1972224000. Throughput: 0: 56618.1. Samples: 1921649520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:41:38,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 07:41:39,016][47288] Updated weights for policy 0, policy_version 120376 (0.0027) [2024-04-26 07:41:41,986][47288] Updated weights for policy 0, policy_version 120386 (0.0031) [2024-04-26 07:41:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56427.7). Total num frames: 1972502528. Throughput: 0: 56532.2. Samples: 1921814940. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:41:43,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 07:41:44,930][47288] Updated weights for policy 0, policy_version 120396 (0.0032) [2024-04-26 07:41:47,793][47288] Updated weights for policy 0, policy_version 120406 (0.0028) [2024-04-26 07:41:48,479][47267] Signal inference workers to stop experience collection... (28600 times) [2024-04-26 07:41:48,479][47267] Signal inference workers to resume experience collection... (28600 times) [2024-04-26 07:41:48,506][47288] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-04-26 07:41:48,506][47288] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-04-26 07:41:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1972781056. Throughput: 0: 56548.8. Samples: 1922156220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:41:48,923][47056] Avg episode reward: [(0, '0.625')] [2024-04-26 07:41:50,597][47288] Updated weights for policy 0, policy_version 120416 (0.0029) [2024-04-26 07:41:53,719][47288] Updated weights for policy 0, policy_version 120426 (0.0032) [2024-04-26 07:41:53,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 1973075968. Throughput: 0: 56528.5. Samples: 1922494920. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:41:53,923][47056] Avg episode reward: [(0, '0.597')] [2024-04-26 07:41:56,397][47288] Updated weights for policy 0, policy_version 120436 (0.0026) [2024-04-26 07:41:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1973338112. Throughput: 0: 56444.4. Samples: 1922662280. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:41:58,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 07:41:59,497][47288] Updated weights for policy 0, policy_version 120446 (0.0025) [2024-04-26 07:42:02,064][47288] Updated weights for policy 0, policy_version 120456 (0.0033) [2024-04-26 07:42:03,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1973633024. Throughput: 0: 56492.9. Samples: 1923003600. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:42:03,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 07:42:05,484][47288] Updated weights for policy 0, policy_version 120466 (0.0032) [2024-04-26 07:42:07,819][47288] Updated weights for policy 0, policy_version 120476 (0.0028) [2024-04-26 07:42:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1973911552. Throughput: 0: 56362.5. Samples: 1923339440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:42:08,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 07:42:11,267][47288] Updated weights for policy 0, policy_version 120486 (0.0035) [2024-04-26 07:42:13,650][47288] Updated weights for policy 0, policy_version 120496 (0.0026) [2024-04-26 07:42:13,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 56483.2). Total num frames: 1974206464. Throughput: 0: 56262.8. Samples: 1923502620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:42:13,924][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 07:42:17,104][47288] Updated weights for policy 0, policy_version 120506 (0.0026) [2024-04-26 07:42:18,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 1974468608. Throughput: 0: 56218.7. Samples: 1923839840. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:42:18,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 07:42:19,692][47288] Updated weights for policy 0, policy_version 120516 (0.0031) [2024-04-26 07:42:22,941][47288] Updated weights for policy 0, policy_version 120526 (0.0027) [2024-04-26 07:42:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1974747136. Throughput: 0: 56211.0. Samples: 1924179020. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-04-26 07:42:23,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 07:42:25,596][47288] Updated weights for policy 0, policy_version 120536 (0.0028) [2024-04-26 07:42:28,686][47288] Updated weights for policy 0, policy_version 120546 (0.0030) [2024-04-26 07:42:28,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1975042048. Throughput: 0: 56214.0. Samples: 1924344580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:42:28,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 07:42:31,544][47288] Updated weights for policy 0, policy_version 120556 (0.0026) [2024-04-26 07:42:33,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1975304192. Throughput: 0: 56121.0. Samples: 1924681660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:42:33,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 07:42:34,485][47288] Updated weights for policy 0, policy_version 120566 (0.0031) [2024-04-26 07:42:37,224][47288] Updated weights for policy 0, policy_version 120576 (0.0025) [2024-04-26 07:42:38,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1975599104. Throughput: 0: 56079.5. Samples: 1925018500. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:42:38,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 07:42:40,345][47288] Updated weights for policy 0, policy_version 120586 (0.0028) [2024-04-26 07:42:43,057][47288] Updated weights for policy 0, policy_version 120596 (0.0027) [2024-04-26 07:42:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1975877632. Throughput: 0: 56159.6. Samples: 1925189460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:42:43,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 07:42:46,283][47288] Updated weights for policy 0, policy_version 120606 (0.0026) [2024-04-26 07:42:48,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 1976156160. Throughput: 0: 56095.5. Samples: 1925527900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:42:48,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:42:48,974][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000120616_1976172544.pth... [2024-04-26 07:42:48,978][47288] Updated weights for policy 0, policy_version 120616 (0.0030) [2024-04-26 07:42:49,028][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000119791_1962655744.pth [2024-04-26 07:42:52,291][47288] Updated weights for policy 0, policy_version 120626 (0.0028) [2024-04-26 07:42:53,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 1976434688. Throughput: 0: 55849.0. Samples: 1925852640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:42:53,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:42:54,704][47288] Updated weights for policy 0, policy_version 120636 (0.0027) [2024-04-26 07:42:58,144][47288] Updated weights for policy 0, policy_version 120646 (0.0026) [2024-04-26 07:42:58,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 56316.6). 
Total num frames: 1976696832. Throughput: 0: 55983.3. Samples: 1926021860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:42:58,923][47056] Avg episode reward: [(0, '0.577')] [2024-04-26 07:42:59,140][47267] Signal inference workers to stop experience collection... (28650 times) [2024-04-26 07:42:59,140][47267] Signal inference workers to resume experience collection... (28650 times) [2024-04-26 07:42:59,151][47288] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-04-26 07:42:59,171][47288] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-04-26 07:43:00,402][47288] Updated weights for policy 0, policy_version 120656 (0.0028) [2024-04-26 07:43:03,923][47056] Fps is (10 sec: 54066.6, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 1976975360. Throughput: 0: 56053.2. Samples: 1926362240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:43:03,923][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 07:43:03,963][47288] Updated weights for policy 0, policy_version 120666 (0.0029) [2024-04-26 07:43:06,334][47288] Updated weights for policy 0, policy_version 120676 (0.0031) [2024-04-26 07:43:08,923][47056] Fps is (10 sec: 55704.7, 60 sec: 55705.5, 300 sec: 56316.5). Total num frames: 1977253888. Throughput: 0: 55942.6. Samples: 1926696440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:43:08,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 07:43:09,663][47288] Updated weights for policy 0, policy_version 120686 (0.0027) [2024-04-26 07:43:12,619][47288] Updated weights for policy 0, policy_version 120696 (0.0029) [2024-04-26 07:43:13,923][47056] Fps is (10 sec: 58982.5, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1977565184. Throughput: 0: 56064.9. Samples: 1926867500. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:43:13,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:43:15,350][47288] Updated weights for policy 0, policy_version 120706 (0.0029) [2024-04-26 07:43:18,519][47288] Updated weights for policy 0, policy_version 120716 (0.0028) [2024-04-26 07:43:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 56372.1). Total num frames: 1977827328. Throughput: 0: 56044.3. Samples: 1927203660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:43:18,923][47056] Avg episode reward: [(0, '0.587')] [2024-04-26 07:43:21,161][47288] Updated weights for policy 0, policy_version 120726 (0.0031) [2024-04-26 07:43:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1978122240. Throughput: 0: 56077.3. Samples: 1927541980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 07:43:23,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 07:43:24,233][47288] Updated weights for policy 0, policy_version 120736 (0.0025) [2024-04-26 07:43:26,930][47288] Updated weights for policy 0, policy_version 120746 (0.0028) [2024-04-26 07:43:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 56372.0). Total num frames: 1978400768. Throughput: 0: 56249.6. Samples: 1927720700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:43:28,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 07:43:29,905][47288] Updated weights for policy 0, policy_version 120756 (0.0032) [2024-04-26 07:43:32,805][47288] Updated weights for policy 0, policy_version 120766 (0.0026) [2024-04-26 07:43:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1978695680. Throughput: 0: 56201.4. Samples: 1928056960. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:43:33,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 07:43:35,787][47288] Updated weights for policy 0, policy_version 120776 (0.0029) [2024-04-26 07:43:38,912][47288] Updated weights for policy 0, policy_version 120786 (0.0029) [2024-04-26 07:43:38,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 1978957824. Throughput: 0: 56590.8. Samples: 1928399240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:43:38,924][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 07:43:41,406][47288] Updated weights for policy 0, policy_version 120796 (0.0028) [2024-04-26 07:43:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 1979252736. Throughput: 0: 56459.9. Samples: 1928562560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:43:43,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 07:43:44,531][47288] Updated weights for policy 0, policy_version 120806 (0.0034) [2024-04-26 07:43:47,246][47288] Updated weights for policy 0, policy_version 120816 (0.0032) [2024-04-26 07:43:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 1979531264. Throughput: 0: 56462.5. Samples: 1928903060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:43:48,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 07:43:50,318][47288] Updated weights for policy 0, policy_version 120826 (0.0032) [2024-04-26 07:43:52,966][47288] Updated weights for policy 0, policy_version 120836 (0.0030) [2024-04-26 07:43:53,923][47056] Fps is (10 sec: 60620.2, 60 sec: 57070.8, 300 sec: 56483.1). Total num frames: 1979858944. Throughput: 0: 56649.8. Samples: 1929245680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:43:53,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 07:43:55,922][47288] Updated weights for policy 0, policy_version 120846 (0.0028) [2024-04-26 07:43:58,763][47288] Updated weights for policy 0, policy_version 120856 (0.0025) [2024-04-26 07:43:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.6, 300 sec: 56427.6). Total num frames: 1980104704. Throughput: 0: 56730.1. Samples: 1929420360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:43:58,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 07:43:59,310][47267] Signal inference workers to stop experience collection... (28700 times) [2024-04-26 07:43:59,329][47288] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-04-26 07:43:59,397][47267] Signal inference workers to resume experience collection... (28700 times) [2024-04-26 07:43:59,397][47288] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-04-26 07:44:01,704][47288] Updated weights for policy 0, policy_version 120866 (0.0032) [2024-04-26 07:44:03,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 1980383232. Throughput: 0: 56792.5. Samples: 1929759320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:44:03,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 07:44:04,581][47288] Updated weights for policy 0, policy_version 120876 (0.0026) [2024-04-26 07:44:07,455][47288] Updated weights for policy 0, policy_version 120886 (0.0028) [2024-04-26 07:44:08,923][47056] Fps is (10 sec: 57345.1, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 1980678144. Throughput: 0: 56828.5. Samples: 1930099260. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:44:08,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 07:44:10,281][47288] Updated weights for policy 0, policy_version 120896 (0.0026) [2024-04-26 07:44:13,151][47288] Updated weights for policy 0, policy_version 120906 (0.0029) [2024-04-26 07:44:13,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1980940288. Throughput: 0: 56609.8. Samples: 1930268140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:44:13,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:44:16,041][47288] Updated weights for policy 0, policy_version 120916 (0.0037) [2024-04-26 07:44:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1981235200. Throughput: 0: 56737.8. Samples: 1930610160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:44:18,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 07:44:19,144][47288] Updated weights for policy 0, policy_version 120926 (0.0035) [2024-04-26 07:44:21,851][47288] Updated weights for policy 0, policy_version 120936 (0.0024) [2024-04-26 07:44:23,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1981513728. Throughput: 0: 56682.1. Samples: 1930949920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:44:23,923][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 07:44:24,932][47288] Updated weights for policy 0, policy_version 120946 (0.0032) [2024-04-26 07:44:27,729][47288] Updated weights for policy 0, policy_version 120956 (0.0031) [2024-04-26 07:44:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1981792256. Throughput: 0: 56558.2. Samples: 1931107680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 07:44:28,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:44:30,569][47288] Updated weights for policy 0, policy_version 120966 (0.0033) [2024-04-26 07:44:33,471][47288] Updated weights for policy 0, policy_version 120976 (0.0027) [2024-04-26 07:44:33,923][47056] Fps is (10 sec: 60620.0, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 1982119936. Throughput: 0: 56629.0. Samples: 1931451360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:44:33,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 07:44:36,208][47288] Updated weights for policy 0, policy_version 120986 (0.0039) [2024-04-26 07:44:38,923][47056] Fps is (10 sec: 58981.7, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 1982382080. Throughput: 0: 56635.1. Samples: 1931794260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:44:38,924][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 07:44:39,188][47288] Updated weights for policy 0, policy_version 120996 (0.0039) [2024-04-26 07:44:42,001][47288] Updated weights for policy 0, policy_version 121006 (0.0026) [2024-04-26 07:44:43,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1982644224. Throughput: 0: 56570.9. Samples: 1931966040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:44:43,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 07:44:45,110][47288] Updated weights for policy 0, policy_version 121016 (0.0031) [2024-04-26 07:44:47,832][47288] Updated weights for policy 0, policy_version 121026 (0.0024) [2024-04-26 07:44:48,923][47056] Fps is (10 sec: 54068.0, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 1982922752. Throughput: 0: 56542.7. Samples: 1932303740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:44:48,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 07:44:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000121029_1982939136.pth... [2024-04-26 07:44:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000120202_1969389568.pth [2024-04-26 07:44:50,843][47288] Updated weights for policy 0, policy_version 121036 (0.0028) [2024-04-26 07:44:53,223][47267] Signal inference workers to stop experience collection... (28750 times) [2024-04-26 07:44:53,261][47288] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-04-26 07:44:53,310][47267] Signal inference workers to resume experience collection... (28750 times) [2024-04-26 07:44:53,310][47288] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-04-26 07:44:53,539][47288] Updated weights for policy 0, policy_version 121046 (0.0023) [2024-04-26 07:44:53,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 1983217664. Throughput: 0: 56526.7. Samples: 1932642960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:44:53,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:44:56,466][47288] Updated weights for policy 0, policy_version 121056 (0.0027) [2024-04-26 07:44:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1983496192. Throughput: 0: 56488.1. Samples: 1932810100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:44:58,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 07:44:59,477][47288] Updated weights for policy 0, policy_version 121066 (0.0028) [2024-04-26 07:45:02,309][47288] Updated weights for policy 0, policy_version 121076 (0.0030) [2024-04-26 07:45:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 1983774720. Throughput: 0: 56349.0. Samples: 1933145860. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:45:03,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 07:45:05,367][47288] Updated weights for policy 0, policy_version 121086 (0.0031) [2024-04-26 07:45:08,176][47288] Updated weights for policy 0, policy_version 121096 (0.0031) [2024-04-26 07:45:08,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1984069632. Throughput: 0: 56406.6. Samples: 1933488220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:45:08,923][47056] Avg episode reward: [(0, '0.573')] [2024-04-26 07:45:11,504][47288] Updated weights for policy 0, policy_version 121106 (0.0031) [2024-04-26 07:45:13,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1984348160. Throughput: 0: 56860.4. Samples: 1933666400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:45:13,923][47056] Avg episode reward: [(0, '0.585')] [2024-04-26 07:45:13,953][47288] Updated weights for policy 0, policy_version 121116 (0.0027) [2024-04-26 07:45:17,184][47288] Updated weights for policy 0, policy_version 121126 (0.0038) [2024-04-26 07:45:18,923][47056] Fps is (10 sec: 55704.2, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 1984626688. Throughput: 0: 56709.2. Samples: 1934003280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:45:18,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 07:45:19,661][47288] Updated weights for policy 0, policy_version 121136 (0.0030) [2024-04-26 07:45:23,126][47288] Updated weights for policy 0, policy_version 121146 (0.0028) [2024-04-26 07:45:23,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 1984905216. Throughput: 0: 56602.9. Samples: 1934341380. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:45:23,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 07:45:25,365][47288] Updated weights for policy 0, policy_version 121156 (0.0027) [2024-04-26 07:45:28,923][47056] Fps is (10 sec: 54068.4, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1985167360. Throughput: 0: 56525.4. Samples: 1934509680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 07:45:28,923][47056] Avg episode reward: [(0, '0.617')] [2024-04-26 07:45:28,936][47288] Updated weights for policy 0, policy_version 121166 (0.0029) [2024-04-26 07:45:31,355][47288] Updated weights for policy 0, policy_version 121176 (0.0032) [2024-04-26 07:45:33,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1985478656. Throughput: 0: 56505.7. Samples: 1934846500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:45:33,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 07:45:34,563][47288] Updated weights for policy 0, policy_version 121186 (0.0028) [2024-04-26 07:45:37,161][47288] Updated weights for policy 0, policy_version 121196 (0.0029) [2024-04-26 07:45:38,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 1985757184. Throughput: 0: 56534.6. Samples: 1935187020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:45:38,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 07:45:40,162][47288] Updated weights for policy 0, policy_version 121206 (0.0022) [2024-04-26 07:45:42,836][47288] Updated weights for policy 0, policy_version 121216 (0.0031) [2024-04-26 07:45:43,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1986019328. Throughput: 0: 56443.2. Samples: 1935350040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:45:43,923][47056] Avg episode reward: [(0, '0.461')] [2024-04-26 07:45:46,153][47288] Updated weights for policy 0, policy_version 121226 (0.0030) [2024-04-26 07:45:48,535][47288] Updated weights for policy 0, policy_version 121236 (0.0029) [2024-04-26 07:45:48,923][47056] Fps is (10 sec: 58981.8, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 1986347008. Throughput: 0: 56548.3. Samples: 1935690540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:45:48,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:45:52,102][47288] Updated weights for policy 0, policy_version 121246 (0.0032) [2024-04-26 07:45:53,923][47056] Fps is (10 sec: 60620.0, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 1986625536. Throughput: 0: 56446.1. Samples: 1936028300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:45:53,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 07:45:54,407][47288] Updated weights for policy 0, policy_version 121256 (0.0029) [2024-04-26 07:45:57,809][47288] Updated weights for policy 0, policy_version 121266 (0.0029) [2024-04-26 07:45:58,366][47267] Signal inference workers to stop experience collection... (28800 times) [2024-04-26 07:45:58,387][47288] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-04-26 07:45:58,422][47267] Signal inference workers to resume experience collection... (28800 times) [2024-04-26 07:45:58,422][47288] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-04-26 07:45:58,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 1986887680. Throughput: 0: 56278.8. Samples: 1936198940. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:45:58,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:46:00,175][47288] Updated weights for policy 0, policy_version 121276 (0.0028) [2024-04-26 07:46:03,644][47288] Updated weights for policy 0, policy_version 121286 (0.0026) [2024-04-26 07:46:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1987166208. Throughput: 0: 56328.2. Samples: 1936538040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:46:03,924][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 07:46:05,954][47288] Updated weights for policy 0, policy_version 121296 (0.0026) [2024-04-26 07:46:08,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 1987444736. Throughput: 0: 56455.9. Samples: 1936881900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:46:08,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 07:46:09,486][47288] Updated weights for policy 0, policy_version 121306 (0.0027) [2024-04-26 07:46:11,709][47288] Updated weights for policy 0, policy_version 121316 (0.0028) [2024-04-26 07:46:13,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1987739648. Throughput: 0: 56246.1. Samples: 1937040760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:46:13,923][47056] Avg episode reward: [(0, '0.349')] [2024-04-26 07:46:15,284][47288] Updated weights for policy 0, policy_version 121326 (0.0027) [2024-04-26 07:46:18,214][47288] Updated weights for policy 0, policy_version 121336 (0.0034) [2024-04-26 07:46:18,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 1988018176. Throughput: 0: 56299.5. Samples: 1937379980. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:46:18,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 07:46:21,061][47288] Updated weights for policy 0, policy_version 121346 (0.0043) [2024-04-26 07:46:23,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56251.7, 300 sec: 56316.6). Total num frames: 1988280320. Throughput: 0: 56225.3. Samples: 1937717160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:46:23,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 07:46:24,086][47288] Updated weights for policy 0, policy_version 121356 (0.0030) [2024-04-26 07:46:26,794][47288] Updated weights for policy 0, policy_version 121366 (0.0024) [2024-04-26 07:46:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 1988591616. Throughput: 0: 56631.9. Samples: 1937898480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-26 07:46:28,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 07:46:29,737][47288] Updated weights for policy 0, policy_version 121376 (0.0035) [2024-04-26 07:46:32,653][47288] Updated weights for policy 0, policy_version 121386 (0.0031) [2024-04-26 07:46:33,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1988870144. Throughput: 0: 56567.7. Samples: 1938236080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:46:33,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 07:46:35,349][47288] Updated weights for policy 0, policy_version 121396 (0.0028) [2024-04-26 07:46:38,529][47288] Updated weights for policy 0, policy_version 121406 (0.0027) [2024-04-26 07:46:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 1989165056. Throughput: 0: 56518.3. Samples: 1938571620. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:46:38,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 07:46:41,197][47288] Updated weights for policy 0, policy_version 121416 (0.0030) [2024-04-26 07:46:43,923][47056] Fps is (10 sec: 52428.9, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 1989394432. Throughput: 0: 56381.7. Samples: 1938736120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:46:43,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 07:46:44,336][47288] Updated weights for policy 0, policy_version 121426 (0.0029) [2024-04-26 07:46:47,194][47288] Updated weights for policy 0, policy_version 121436 (0.0025) [2024-04-26 07:46:48,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 1989689344. Throughput: 0: 56327.9. Samples: 1939072800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:46:48,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 07:46:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000121441_1989689344.pth... [2024-04-26 07:46:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000120616_1976172544.pth [2024-04-26 07:46:50,142][47288] Updated weights for policy 0, policy_version 121446 (0.0026) [2024-04-26 07:46:51,531][47267] Signal inference workers to stop experience collection... (28850 times) [2024-04-26 07:46:51,531][47267] Signal inference workers to resume experience collection... (28850 times) [2024-04-26 07:46:51,562][47288] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-04-26 07:46:51,562][47288] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-04-26 07:46:53,055][47288] Updated weights for policy 0, policy_version 121456 (0.0034) [2024-04-26 07:46:53,923][47056] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 1989984256. Throughput: 0: 56157.3. 
Samples: 1939408980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:46:53,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 07:46:56,000][47288] Updated weights for policy 0, policy_version 121466 (0.0034) [2024-04-26 07:46:58,783][47288] Updated weights for policy 0, policy_version 121476 (0.0027) [2024-04-26 07:46:58,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1990279168. Throughput: 0: 56322.7. Samples: 1939575280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:46:58,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 07:47:01,673][47288] Updated weights for policy 0, policy_version 121486 (0.0026) [2024-04-26 07:47:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1990557696. Throughput: 0: 56340.4. Samples: 1939915300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:47:03,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 07:47:04,608][47288] Updated weights for policy 0, policy_version 121496 (0.0025) [2024-04-26 07:47:07,567][47288] Updated weights for policy 0, policy_version 121506 (0.0035) [2024-04-26 07:47:08,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 1990852608. Throughput: 0: 56392.6. Samples: 1940254820. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:47:08,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 07:47:10,242][47288] Updated weights for policy 0, policy_version 121516 (0.0028) [2024-04-26 07:47:13,270][47288] Updated weights for policy 0, policy_version 121526 (0.0028) [2024-04-26 07:47:13,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 1991131136. Throughput: 0: 56262.7. Samples: 1940430300. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:47:13,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 07:47:15,956][47288] Updated weights for policy 0, policy_version 121536 (0.0029) [2024-04-26 07:47:18,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 1991393280. Throughput: 0: 56484.5. Samples: 1940777880. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:47:18,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 07:47:19,035][47288] Updated weights for policy 0, policy_version 121546 (0.0029) [2024-04-26 07:47:21,633][47288] Updated weights for policy 0, policy_version 121556 (0.0025) [2024-04-26 07:47:23,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56251.7, 300 sec: 56316.6). Total num frames: 1991655424. Throughput: 0: 56459.6. Samples: 1941112300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:47:23,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 07:47:24,782][47288] Updated weights for policy 0, policy_version 121566 (0.0029) [2024-04-26 07:47:27,439][47288] Updated weights for policy 0, policy_version 121576 (0.0028) [2024-04-26 07:47:28,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1991950336. Throughput: 0: 56281.6. Samples: 1941268800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:47:28,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 07:47:30,661][47288] Updated weights for policy 0, policy_version 121586 (0.0031) [2024-04-26 07:47:33,320][47288] Updated weights for policy 0, policy_version 121596 (0.0028) [2024-04-26 07:47:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 1992228864. Throughput: 0: 56397.4. Samples: 1941610680. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-04-26 07:47:33,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 07:47:36,396][47288] Updated weights for policy 0, policy_version 121606 (0.0038) [2024-04-26 07:47:38,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1992540160. Throughput: 0: 56482.2. Samples: 1941950680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:47:38,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 07:47:39,006][47288] Updated weights for policy 0, policy_version 121616 (0.0028) [2024-04-26 07:47:42,114][47288] Updated weights for policy 0, policy_version 121626 (0.0026) [2024-04-26 07:47:43,923][47056] Fps is (10 sec: 58981.6, 60 sec: 57070.8, 300 sec: 56483.1). Total num frames: 1992818688. Throughput: 0: 56695.8. Samples: 1942126600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:47:43,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 07:47:44,694][47288] Updated weights for policy 0, policy_version 121636 (0.0027) [2024-04-26 07:47:47,920][47288] Updated weights for policy 0, policy_version 121646 (0.0025) [2024-04-26 07:47:48,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 1993097216. Throughput: 0: 56727.1. Samples: 1942468020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:47:48,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 07:47:50,892][47288] Updated weights for policy 0, policy_version 121656 (0.0033) [2024-04-26 07:47:53,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 1993359360. Throughput: 0: 56596.2. Samples: 1942801660. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:47:53,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 07:47:53,974][47288] Updated weights for policy 0, policy_version 121666 (0.0029) [2024-04-26 07:47:56,834][47288] Updated weights for policy 0, policy_version 121676 (0.0028) [2024-04-26 07:47:58,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 56427.6). Total num frames: 1993621504. Throughput: 0: 56284.0. Samples: 1942963080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:47:58,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 07:47:59,695][47288] Updated weights for policy 0, policy_version 121686 (0.0028) [2024-04-26 07:47:59,996][47267] Signal inference workers to stop experience collection... (28900 times) [2024-04-26 07:47:59,996][47267] Signal inference workers to resume experience collection... (28900 times) [2024-04-26 07:48:00,013][47288] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-04-26 07:48:00,013][47288] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-04-26 07:48:02,684][47288] Updated weights for policy 0, policy_version 121696 (0.0027) [2024-04-26 07:48:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.5, 300 sec: 56483.1). Total num frames: 1993916416. Throughput: 0: 56220.6. Samples: 1943307820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:48:03,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 07:48:05,356][47288] Updated weights for policy 0, policy_version 121706 (0.0030) [2024-04-26 07:48:08,315][47288] Updated weights for policy 0, policy_version 121716 (0.0026) [2024-04-26 07:48:08,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.4, 300 sec: 56372.1). Total num frames: 1994194944. Throughput: 0: 56308.8. Samples: 1943646200. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:48:08,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 07:48:11,298][47288] Updated weights for policy 0, policy_version 121726 (0.0033) [2024-04-26 07:48:13,923][47056] Fps is (10 sec: 58983.7, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 1994506240. Throughput: 0: 56474.9. Samples: 1943810160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:48:13,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 07:48:14,190][47288] Updated weights for policy 0, policy_version 121736 (0.0029) [2024-04-26 07:48:17,114][47288] Updated weights for policy 0, policy_version 121746 (0.0029) [2024-04-26 07:48:18,923][47056] Fps is (10 sec: 62259.1, 60 sec: 57070.8, 300 sec: 56594.2). Total num frames: 1994817536. Throughput: 0: 56330.6. Samples: 1944145560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:48:18,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 07:48:20,231][47288] Updated weights for policy 0, policy_version 121756 (0.0027) [2024-04-26 07:48:23,005][47288] Updated weights for policy 0, policy_version 121766 (0.0037) [2024-04-26 07:48:23,923][47056] Fps is (10 sec: 57343.9, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 1995079680. Throughput: 0: 56304.5. Samples: 1944484380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:48:23,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:48:25,906][47288] Updated weights for policy 0, policy_version 121776 (0.0028) [2024-04-26 07:48:28,876][47288] Updated weights for policy 0, policy_version 121786 (0.0030) [2024-04-26 07:48:28,923][47056] Fps is (10 sec: 52428.5, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1995341824. Throughput: 0: 56182.2. Samples: 1944654800. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:48:28,932][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 07:48:31,549][47288] Updated weights for policy 0, policy_version 121796 (0.0032) [2024-04-26 07:48:33,923][47056] Fps is (10 sec: 52429.0, 60 sec: 56251.8, 300 sec: 56427.7). Total num frames: 1995603968. Throughput: 0: 56117.5. Samples: 1944993300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 07:48:33,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 07:48:34,542][47288] Updated weights for policy 0, policy_version 121806 (0.0024) [2024-04-26 07:48:37,510][47288] Updated weights for policy 0, policy_version 121816 (0.0026) [2024-04-26 07:48:38,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55705.6, 300 sec: 56372.1). Total num frames: 1995882496. Throughput: 0: 56223.3. Samples: 1945331700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:48:38,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 07:48:40,702][47288] Updated weights for policy 0, policy_version 121826 (0.0030) [2024-04-26 07:48:43,404][47288] Updated weights for policy 0, policy_version 121836 (0.0028) [2024-04-26 07:48:43,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 1996177408. Throughput: 0: 56354.2. Samples: 1945499020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:48:43,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 07:48:46,313][47288] Updated weights for policy 0, policy_version 121846 (0.0034) [2024-04-26 07:48:48,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 1996472320. Throughput: 0: 56189.1. Samples: 1945836320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:48:48,923][47056] Avg episode reward: [(0, '0.610')] [2024-04-26 07:48:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000121855_1996472320.pth... 
[2024-04-26 07:48:48,987][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000121029_1982939136.pth [2024-04-26 07:48:49,139][47288] Updated weights for policy 0, policy_version 121856 (0.0030) [2024-04-26 07:48:52,156][47288] Updated weights for policy 0, policy_version 121866 (0.0030) [2024-04-26 07:48:53,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 1996767232. Throughput: 0: 56263.2. Samples: 1946178040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:48:53,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 07:48:54,835][47288] Updated weights for policy 0, policy_version 121876 (0.0036) [2024-04-26 07:48:58,022][47288] Updated weights for policy 0, policy_version 121886 (0.0034) [2024-04-26 07:48:58,923][47056] Fps is (10 sec: 58981.8, 60 sec: 57343.9, 300 sec: 56538.7). Total num frames: 1997062144. Throughput: 0: 56625.6. Samples: 1946358320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:48:58,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:49:00,624][47288] Updated weights for policy 0, policy_version 121896 (0.0034) [2024-04-26 07:49:02,982][47267] Signal inference workers to stop experience collection... (28950 times) [2024-04-26 07:49:02,987][47267] Signal inference workers to resume experience collection... (28950 times) [2024-04-26 07:49:03,012][47288] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-04-26 07:49:03,012][47288] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-04-26 07:49:03,710][47288] Updated weights for policy 0, policy_version 121906 (0.0029) [2024-04-26 07:49:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 1997324288. Throughput: 0: 56609.4. Samples: 1946692980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:49:03,923][47056] Avg episode reward: [(0, '0.359')] [2024-04-26 07:49:06,577][47288] Updated weights for policy 0, policy_version 121916 (0.0024) [2024-04-26 07:49:08,923][47056] Fps is (10 sec: 52428.3, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 1997586432. Throughput: 0: 56570.4. Samples: 1947030060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:49:08,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 07:49:09,425][47288] Updated weights for policy 0, policy_version 121926 (0.0027) [2024-04-26 07:49:12,236][47288] Updated weights for policy 0, policy_version 121936 (0.0029) [2024-04-26 07:49:13,923][47056] Fps is (10 sec: 54068.3, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 1997864960. Throughput: 0: 56505.3. Samples: 1947197520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:49:13,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 07:49:15,282][47288] Updated weights for policy 0, policy_version 121946 (0.0029) [2024-04-26 07:49:17,863][47288] Updated weights for policy 0, policy_version 121956 (0.0031) [2024-04-26 07:49:18,923][47056] Fps is (10 sec: 57345.3, 60 sec: 55705.7, 300 sec: 56427.6). Total num frames: 1998159872. Throughput: 0: 56461.7. Samples: 1947534080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:49:18,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 07:49:21,176][47288] Updated weights for policy 0, policy_version 121966 (0.0028) [2024-04-26 07:49:23,923][47056] Fps is (10 sec: 57343.0, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 1998438400. Throughput: 0: 56467.1. Samples: 1947872720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:49:23,923][47056] Avg episode reward: [(0, '0.606')] [2024-04-26 07:49:24,102][47288] Updated weights for policy 0, policy_version 121976 (0.0027) [2024-04-26 07:49:26,845][47288] Updated weights for policy 0, policy_version 121986 (0.0034) [2024-04-26 07:49:28,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.9, 300 sec: 56316.5). Total num frames: 1998733312. Throughput: 0: 56573.3. Samples: 1948044820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:49:28,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 07:49:29,863][47288] Updated weights for policy 0, policy_version 121996 (0.0032) [2024-04-26 07:49:32,702][47288] Updated weights for policy 0, policy_version 122006 (0.0029) [2024-04-26 07:49:33,923][47056] Fps is (10 sec: 58982.2, 60 sec: 57070.8, 300 sec: 56427.6). Total num frames: 1999028224. Throughput: 0: 56497.3. Samples: 1948378700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:49:33,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 07:49:35,523][47288] Updated weights for policy 0, policy_version 122016 (0.0026) [2024-04-26 07:49:38,516][47288] Updated weights for policy 0, policy_version 122026 (0.0031) [2024-04-26 07:49:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 1999290368. Throughput: 0: 56372.8. Samples: 1948714820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 07:49:38,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:49:41,462][47288] Updated weights for policy 0, policy_version 122036 (0.0028) [2024-04-26 07:49:43,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 1999568896. Throughput: 0: 56120.6. Samples: 1948883740. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:49:43,923][47056] Avg episode reward: [(0, '0.596')] [2024-04-26 07:49:44,307][47288] Updated weights for policy 0, policy_version 122046 (0.0022) [2024-04-26 07:49:47,401][47288] Updated weights for policy 0, policy_version 122056 (0.0031) [2024-04-26 07:49:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 1999847424. Throughput: 0: 56239.2. Samples: 1949223740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:49:48,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:49:49,998][47288] Updated weights for policy 0, policy_version 122066 (0.0027) [2024-04-26 07:49:53,293][47288] Updated weights for policy 0, policy_version 122076 (0.0031) [2024-04-26 07:49:53,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 2000125952. Throughput: 0: 56210.5. Samples: 1949559520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:49:53,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 07:49:55,769][47288] Updated weights for policy 0, policy_version 122086 (0.0027) [2024-04-26 07:49:58,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 56372.1). Total num frames: 2000404480. Throughput: 0: 56112.4. Samples: 1949722580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:49:58,923][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 07:49:58,991][47288] Updated weights for policy 0, policy_version 122096 (0.0034) [2024-04-26 07:50:01,124][47267] Signal inference workers to stop experience collection... (29000 times) [2024-04-26 07:50:01,144][47288] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-04-26 07:50:01,179][47267] Signal inference workers to resume experience collection... 
(29000 times) [2024-04-26 07:50:01,180][47288] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-04-26 07:50:01,635][47288] Updated weights for policy 0, policy_version 122106 (0.0030) [2024-04-26 07:50:03,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2000715776. Throughput: 0: 56220.3. Samples: 1950064000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:50:03,924][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 07:50:04,791][47288] Updated weights for policy 0, policy_version 122116 (0.0026) [2024-04-26 07:50:07,518][47288] Updated weights for policy 0, policy_version 122126 (0.0027) [2024-04-26 07:50:08,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 2000977920. Throughput: 0: 56307.1. Samples: 1950406540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:50:08,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 07:50:10,426][47288] Updated weights for policy 0, policy_version 122136 (0.0032) [2024-04-26 07:50:13,193][47288] Updated weights for policy 0, policy_version 122146 (0.0029) [2024-04-26 07:50:13,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 2001272832. Throughput: 0: 56256.0. Samples: 1950576340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:50:13,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 07:50:16,368][47288] Updated weights for policy 0, policy_version 122156 (0.0028) [2024-04-26 07:50:18,889][47288] Updated weights for policy 0, policy_version 122166 (0.0026) [2024-04-26 07:50:18,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 2001567744. Throughput: 0: 56425.0. Samples: 1950917820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:50:18,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 07:50:22,233][47288] Updated weights for policy 0, policy_version 122176 (0.0025) [2024-04-26 07:50:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2001813504. Throughput: 0: 56516.5. Samples: 1951258060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:50:23,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 07:50:24,764][47288] Updated weights for policy 0, policy_version 122186 (0.0028) [2024-04-26 07:50:28,042][47288] Updated weights for policy 0, policy_version 122196 (0.0036) [2024-04-26 07:50:28,923][47056] Fps is (10 sec: 54066.2, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 2002108416. Throughput: 0: 56371.8. Samples: 1951420480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:50:28,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 07:50:30,706][47288] Updated weights for policy 0, policy_version 122206 (0.0031) [2024-04-26 07:50:33,889][47288] Updated weights for policy 0, policy_version 122216 (0.0027) [2024-04-26 07:50:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 2002386944. Throughput: 0: 56417.4. Samples: 1951762520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:50:33,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 07:50:36,325][47288] Updated weights for policy 0, policy_version 122226 (0.0039) [2024-04-26 07:50:38,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2002665472. Throughput: 0: 56490.1. Samples: 1952101580. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 07:50:38,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 07:50:39,495][47288] Updated weights for policy 0, policy_version 122236 (0.0027) [2024-04-26 07:50:42,039][47288] Updated weights for policy 0, policy_version 122246 (0.0028) [2024-04-26 07:50:43,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 2002960384. Throughput: 0: 56625.1. Samples: 1952270720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:50:43,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 07:50:45,326][47288] Updated weights for policy 0, policy_version 122256 (0.0029) [2024-04-26 07:50:47,951][47288] Updated weights for policy 0, policy_version 122266 (0.0030) [2024-04-26 07:50:48,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 56261.0). Total num frames: 2003222528. Throughput: 0: 56493.6. Samples: 1952606200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:50:48,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 07:50:48,999][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000122268_2003238912.pth... [2024-04-26 07:50:49,044][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000121441_1989689344.pth [2024-04-26 07:50:51,103][47288] Updated weights for policy 0, policy_version 122276 (0.0026) [2024-04-26 07:50:53,729][47288] Updated weights for policy 0, policy_version 122286 (0.0029) [2024-04-26 07:50:53,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 2003533824. Throughput: 0: 56324.0. Samples: 1952941120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:50:53,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 07:50:56,794][47288] Updated weights for policy 0, policy_version 122296 (0.0030) [2024-04-26 07:50:58,873][47267] Signal inference workers to stop experience collection... 
(29050 times) [2024-04-26 07:50:58,873][47267] Signal inference workers to resume experience collection... (29050 times) [2024-04-26 07:50:58,903][47288] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-04-26 07:50:58,903][47288] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-04-26 07:50:58,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 2003795968. Throughput: 0: 56360.4. Samples: 1953112560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:50:58,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 07:50:59,388][47288] Updated weights for policy 0, policy_version 122306 (0.0027) [2024-04-26 07:51:02,740][47288] Updated weights for policy 0, policy_version 122316 (0.0030) [2024-04-26 07:51:03,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55978.9, 300 sec: 56372.1). Total num frames: 2004074496. Throughput: 0: 56317.0. Samples: 1953452080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:51:03,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 07:51:05,148][47288] Updated weights for policy 0, policy_version 122326 (0.0032) [2024-04-26 07:51:08,554][47288] Updated weights for policy 0, policy_version 122336 (0.0030) [2024-04-26 07:51:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 2004369408. Throughput: 0: 56215.0. Samples: 1953787740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:51:08,924][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 07:51:11,098][47288] Updated weights for policy 0, policy_version 122346 (0.0035) [2024-04-26 07:51:13,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 2004647936. Throughput: 0: 56304.5. Samples: 1953954180. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:51:13,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 07:51:14,463][47288] Updated weights for policy 0, policy_version 122356 (0.0030) [2024-04-26 07:51:16,819][47288] Updated weights for policy 0, policy_version 122366 (0.0037) [2024-04-26 07:51:18,923][47056] Fps is (10 sec: 54068.2, 60 sec: 55705.6, 300 sec: 56372.1). Total num frames: 2004910080. Throughput: 0: 56203.6. Samples: 1954291680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:51:18,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 07:51:20,105][47288] Updated weights for policy 0, policy_version 122376 (0.0027) [2024-04-26 07:51:22,809][47288] Updated weights for policy 0, policy_version 122386 (0.0025) [2024-04-26 07:51:23,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56524.9, 300 sec: 56316.6). Total num frames: 2005204992. Throughput: 0: 56274.0. Samples: 1954633900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:51:23,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 07:51:25,920][47288] Updated weights for policy 0, policy_version 122396 (0.0028) [2024-04-26 07:51:28,467][47288] Updated weights for policy 0, policy_version 122406 (0.0030) [2024-04-26 07:51:28,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 2005499904. Throughput: 0: 56337.5. Samples: 1954805900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:51:28,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 07:51:31,681][47288] Updated weights for policy 0, policy_version 122416 (0.0027) [2024-04-26 07:51:33,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 2005762048. Throughput: 0: 56376.8. Samples: 1955143160. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:51:33,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 07:51:34,475][47288] Updated weights for policy 0, policy_version 122426 (0.0027) [2024-04-26 07:51:37,394][47288] Updated weights for policy 0, policy_version 122436 (0.0030) [2024-04-26 07:51:38,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2006073344. Throughput: 0: 56356.1. Samples: 1955477140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 07:51:38,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 07:51:40,312][47288] Updated weights for policy 0, policy_version 122446 (0.0031) [2024-04-26 07:51:43,209][47288] Updated weights for policy 0, policy_version 122456 (0.0029) [2024-04-26 07:51:43,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2006335488. Throughput: 0: 56428.1. Samples: 1955651820. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:51:43,923][47056] Avg episode reward: [(0, '0.591')] [2024-04-26 07:51:46,041][47288] Updated weights for policy 0, policy_version 122466 (0.0027) [2024-04-26 07:51:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 2006614016. Throughput: 0: 56584.7. Samples: 1955998400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:51:48,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 07:51:49,110][47288] Updated weights for policy 0, policy_version 122476 (0.0032) [2024-04-26 07:51:51,670][47288] Updated weights for policy 0, policy_version 122486 (0.0027) [2024-04-26 07:51:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 2006892544. Throughput: 0: 56758.8. Samples: 1956341880. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:51:53,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 07:51:55,247][47288] Updated weights for policy 0, policy_version 122496 (0.0027) [2024-04-26 07:51:57,418][47288] Updated weights for policy 0, policy_version 122506 (0.0036) [2024-04-26 07:51:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 2007171072. Throughput: 0: 56538.3. Samples: 1956498400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:51:58,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 07:52:01,008][47288] Updated weights for policy 0, policy_version 122516 (0.0029) [2024-04-26 07:52:01,915][47267] Signal inference workers to stop experience collection... (29100 times) [2024-04-26 07:52:01,916][47267] Signal inference workers to resume experience collection... (29100 times) [2024-04-26 07:52:01,932][47288] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-04-26 07:52:01,932][47288] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-04-26 07:52:03,063][47288] Updated weights for policy 0, policy_version 122526 (0.0029) [2024-04-26 07:52:03,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 2007482368. Throughput: 0: 56508.4. Samples: 1956834560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:52:03,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 07:52:06,744][47288] Updated weights for policy 0, policy_version 122536 (0.0025) [2024-04-26 07:52:08,865][47288] Updated weights for policy 0, policy_version 122546 (0.0027) [2024-04-26 07:52:08,923][47056] Fps is (10 sec: 62258.6, 60 sec: 57070.9, 300 sec: 56483.1). Total num frames: 2007793664. Throughput: 0: 56501.9. Samples: 1957176500. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:52:08,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 07:52:12,484][47288] Updated weights for policy 0, policy_version 122556 (0.0028) [2024-04-26 07:52:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2008039424. Throughput: 0: 56617.7. Samples: 1957353700. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:52:13,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 07:52:15,077][47288] Updated weights for policy 0, policy_version 122566 (0.0035) [2024-04-26 07:52:18,174][47288] Updated weights for policy 0, policy_version 122576 (0.0025) [2024-04-26 07:52:18,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 2008317952. Throughput: 0: 56623.4. Samples: 1957691220. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:52:18,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 07:52:20,773][47288] Updated weights for policy 0, policy_version 122586 (0.0028) [2024-04-26 07:52:23,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2008596480. Throughput: 0: 56746.3. Samples: 1958030720. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:52:23,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 07:52:23,974][47288] Updated weights for policy 0, policy_version 122596 (0.0030) [2024-04-26 07:52:26,674][47288] Updated weights for policy 0, policy_version 122606 (0.0026) [2024-04-26 07:52:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2008875008. Throughput: 0: 56499.1. Samples: 1958194280. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:52:28,923][47056] Avg episode reward: [(0, '0.398')] [2024-04-26 07:52:29,795][47288] Updated weights for policy 0, policy_version 122616 (0.0032) [2024-04-26 07:52:32,509][47288] Updated weights for policy 0, policy_version 122626 (0.0032) [2024-04-26 07:52:33,923][47056] Fps is (10 sec: 54065.5, 60 sec: 56251.5, 300 sec: 56260.9). Total num frames: 2009137152. Throughput: 0: 56387.3. Samples: 1958535840. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:52:33,924][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 07:52:35,487][47288] Updated weights for policy 0, policy_version 122636 (0.0034) [2024-04-26 07:52:38,138][47288] Updated weights for policy 0, policy_version 122646 (0.0030) [2024-04-26 07:52:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 2009432064. Throughput: 0: 56244.4. Samples: 1958872880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:52:38,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 07:52:41,527][47288] Updated weights for policy 0, policy_version 122656 (0.0027) [2024-04-26 07:52:43,788][47288] Updated weights for policy 0, policy_version 122666 (0.0027) [2024-04-26 07:52:43,923][47056] Fps is (10 sec: 62260.8, 60 sec: 57071.0, 300 sec: 56483.2). Total num frames: 2009759744. Throughput: 0: 56603.6. Samples: 1959045560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 07:52:43,923][47056] Avg episode reward: [(0, '0.600')] [2024-04-26 07:52:47,206][47288] Updated weights for policy 0, policy_version 122676 (0.0031) [2024-04-26 07:52:48,923][47056] Fps is (10 sec: 62259.3, 60 sec: 57344.0, 300 sec: 56594.2). Total num frames: 2010054656. Throughput: 0: 56635.9. Samples: 1959383180. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:52:48,923][47056] Avg episode reward: [(0, '0.410')] [2024-04-26 07:52:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000122684_2010054656.pth... [2024-04-26 07:52:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000121855_1996472320.pth [2024-04-26 07:52:49,612][47288] Updated weights for policy 0, policy_version 122686 (0.0027) [2024-04-26 07:52:53,112][47288] Updated weights for policy 0, policy_version 122696 (0.0033) [2024-04-26 07:52:53,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 2010300416. Throughput: 0: 56466.7. Samples: 1959717500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:52:53,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 07:52:54,561][47267] Signal inference workers to stop experience collection... (29150 times) [2024-04-26 07:52:54,564][47267] Signal inference workers to resume experience collection... (29150 times) [2024-04-26 07:52:54,579][47288] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-04-26 07:52:54,580][47288] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-04-26 07:52:55,680][47288] Updated weights for policy 0, policy_version 122706 (0.0026) [2024-04-26 07:52:58,923][47056] Fps is (10 sec: 50790.9, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2010562560. Throughput: 0: 56319.2. Samples: 1959888060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:52:58,923][47056] Avg episode reward: [(0, '0.629')] [2024-04-26 07:52:58,999][47288] Updated weights for policy 0, policy_version 122716 (0.0027) [2024-04-26 07:53:01,696][47288] Updated weights for policy 0, policy_version 122726 (0.0028) [2024-04-26 07:53:03,923][47056] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2010841088. Throughput: 0: 56398.8. Samples: 1960229160. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:53:03,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 07:53:04,673][47288] Updated weights for policy 0, policy_version 122736 (0.0024) [2024-04-26 07:53:07,969][47288] Updated weights for policy 0, policy_version 122746 (0.0032) [2024-04-26 07:53:08,923][47056] Fps is (10 sec: 54065.9, 60 sec: 55159.4, 300 sec: 56260.9). Total num frames: 2011103232. Throughput: 0: 56310.8. Samples: 1960564720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:53:08,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 07:53:10,435][47288] Updated weights for policy 0, policy_version 122756 (0.0031) [2024-04-26 07:53:13,871][47288] Updated weights for policy 0, policy_version 122766 (0.0030) [2024-04-26 07:53:13,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 56205.5). Total num frames: 2011398144. Throughput: 0: 56148.4. Samples: 1960720960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:53:13,923][47056] Avg episode reward: [(0, '0.628')] [2024-04-26 07:53:16,202][47288] Updated weights for policy 0, policy_version 122776 (0.0028) [2024-04-26 07:53:18,923][47056] Fps is (10 sec: 58984.0, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 2011693056. Throughput: 0: 56068.4. Samples: 1961058900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:53:18,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 07:53:19,723][47288] Updated weights for policy 0, policy_version 122786 (0.0029) [2024-04-26 07:53:22,378][47288] Updated weights for policy 0, policy_version 122796 (0.0033) [2024-04-26 07:53:23,923][47056] Fps is (10 sec: 62259.1, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 2012020736. Throughput: 0: 55982.7. Samples: 1961392100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:53:23,923][47056] Avg episode reward: [(0, '0.431')] [2024-04-26 07:53:25,452][47288] Updated weights for policy 0, policy_version 122806 (0.0029) [2024-04-26 07:53:28,264][47288] Updated weights for policy 0, policy_version 122816 (0.0031) [2024-04-26 07:53:28,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2012282880. Throughput: 0: 56363.6. Samples: 1961581920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:53:28,923][47056] Avg episode reward: [(0, '0.584')] [2024-04-26 07:53:31,117][47288] Updated weights for policy 0, policy_version 122826 (0.0025) [2024-04-26 07:53:33,923][47056] Fps is (10 sec: 50790.3, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 2012528640. Throughput: 0: 56239.5. Samples: 1961913960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:53:33,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 07:53:34,079][47288] Updated weights for policy 0, policy_version 122836 (0.0025) [2024-04-26 07:53:36,980][47288] Updated weights for policy 0, policy_version 122846 (0.0027) [2024-04-26 07:53:38,923][47056] Fps is (10 sec: 52428.4, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2012807168. Throughput: 0: 56357.4. Samples: 1962253580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:53:38,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 07:53:39,848][47288] Updated weights for policy 0, policy_version 122856 (0.0031) [2024-04-26 07:53:40,019][47267] Signal inference workers to stop experience collection... (29200 times) [2024-04-26 07:53:40,068][47267] Signal inference workers to resume experience collection... 
(29200 times) [2024-04-26 07:53:40,068][47288] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-04-26 07:53:40,082][47288] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-04-26 07:53:42,891][47288] Updated weights for policy 0, policy_version 122866 (0.0025) [2024-04-26 07:53:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 56316.5). Total num frames: 2013085696. Throughput: 0: 56050.6. Samples: 1962410340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 07:53:43,924][47056] Avg episode reward: [(0, '0.623')] [2024-04-26 07:53:45,510][47288] Updated weights for policy 0, policy_version 122876 (0.0032) [2024-04-26 07:53:48,760][47288] Updated weights for policy 0, policy_version 122886 (0.0033) [2024-04-26 07:53:48,923][47056] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 56261.0). Total num frames: 2013364224. Throughput: 0: 56038.0. Samples: 1962750880. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:53:48,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 07:53:51,438][47288] Updated weights for policy 0, policy_version 122896 (0.0028) [2024-04-26 07:53:53,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 2013659136. Throughput: 0: 56156.0. Samples: 1963091740. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:53:53,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 07:53:54,422][47288] Updated weights for policy 0, policy_version 122906 (0.0030) [2024-04-26 07:53:57,110][47288] Updated weights for policy 0, policy_version 122916 (0.0034) [2024-04-26 07:53:58,923][47056] Fps is (10 sec: 60621.0, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 2013970432. Throughput: 0: 56467.9. Samples: 1963262020. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:53:58,923][47056] Avg episode reward: [(0, '0.427')] [2024-04-26 07:54:00,449][47288] Updated weights for policy 0, policy_version 122926 (0.0028) [2024-04-26 07:54:02,944][47288] Updated weights for policy 0, policy_version 122936 (0.0035) [2024-04-26 07:54:03,923][47056] Fps is (10 sec: 60622.2, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 2014265344. Throughput: 0: 56473.3. Samples: 1963600200. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:03,923][47056] Avg episode reward: [(0, '0.587')] [2024-04-26 07:54:06,314][47288] Updated weights for policy 0, policy_version 122946 (0.0026) [2024-04-26 07:54:08,692][47288] Updated weights for policy 0, policy_version 122956 (0.0026) [2024-04-26 07:54:08,923][47056] Fps is (10 sec: 55706.6, 60 sec: 57071.2, 300 sec: 56483.1). Total num frames: 2014527488. Throughput: 0: 56627.3. Samples: 1963940320. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:08,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 07:54:11,945][47288] Updated weights for policy 0, policy_version 122966 (0.0027) [2024-04-26 07:54:13,922][47056] Fps is (10 sec: 52429.3, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 2014789632. Throughput: 0: 56149.0. Samples: 1964108620. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:13,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 07:54:14,557][47288] Updated weights for policy 0, policy_version 122976 (0.0025) [2024-04-26 07:54:17,656][47288] Updated weights for policy 0, policy_version 122986 (0.0031) [2024-04-26 07:54:18,923][47056] Fps is (10 sec: 54066.2, 60 sec: 56251.6, 300 sec: 56372.1). Total num frames: 2015068160. Throughput: 0: 56267.5. Samples: 1964446000. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:18,923][47056] Avg episode reward: [(0, '0.596')] [2024-04-26 07:54:20,429][47288] Updated weights for policy 0, policy_version 122996 (0.0026) [2024-04-26 07:54:23,524][47288] Updated weights for policy 0, policy_version 123006 (0.0033) [2024-04-26 07:54:23,923][47056] Fps is (10 sec: 54065.8, 60 sec: 55159.4, 300 sec: 56261.0). Total num frames: 2015330304. Throughput: 0: 56159.9. Samples: 1964780780. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:23,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 07:54:26,108][47288] Updated weights for policy 0, policy_version 123016 (0.0025) [2024-04-26 07:54:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 2015625216. Throughput: 0: 56306.2. Samples: 1964944120. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:28,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 07:54:29,447][47288] Updated weights for policy 0, policy_version 123026 (0.0030) [2024-04-26 07:54:31,870][47288] Updated weights for policy 0, policy_version 123036 (0.0024) [2024-04-26 07:54:33,923][47056] Fps is (10 sec: 60621.6, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 2015936512. Throughput: 0: 56238.9. Samples: 1965281620. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:33,923][47056] Avg episode reward: [(0, '0.604')] [2024-04-26 07:54:35,096][47288] Updated weights for policy 0, policy_version 123046 (0.0032) [2024-04-26 07:54:37,751][47288] Updated weights for policy 0, policy_version 123056 (0.0031) [2024-04-26 07:54:38,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 2016215040. Throughput: 0: 56156.2. Samples: 1965618760. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:38,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 07:54:40,730][47288] Updated weights for policy 0, policy_version 123066 (0.0027) [2024-04-26 07:54:42,793][47267] Signal inference workers to stop experience collection... (29250 times) [2024-04-26 07:54:42,794][47267] Signal inference workers to resume experience collection... (29250 times) [2024-04-26 07:54:42,805][47288] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-04-26 07:54:42,806][47288] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-04-26 07:54:43,495][47288] Updated weights for policy 0, policy_version 123076 (0.0029) [2024-04-26 07:54:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 2016493568. Throughput: 0: 56340.6. Samples: 1965797340. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:43,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 07:54:46,566][47288] Updated weights for policy 0, policy_version 123086 (0.0027) [2024-04-26 07:54:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 2016772096. Throughput: 0: 56310.6. Samples: 1966134180. Policy #0 lag: (min: 0.0, avg: 13.4, max: 23.0) [2024-04-26 07:54:48,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 07:54:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000123094_2016772096.pth... [2024-04-26 07:54:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000122268_2003238912.pth [2024-04-26 07:54:49,445][47288] Updated weights for policy 0, policy_version 123096 (0.0031) [2024-04-26 07:54:52,527][47288] Updated weights for policy 0, policy_version 123106 (0.0032) [2024-04-26 07:54:53,923][47056] Fps is (10 sec: 54066.0, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 2017034240. Throughput: 0: 56325.4. 
Samples: 1966474980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:54:53,923][47056] Avg episode reward: [(0, '0.604')] [2024-04-26 07:54:55,075][47288] Updated weights for policy 0, policy_version 123116 (0.0031) [2024-04-26 07:54:58,183][47288] Updated weights for policy 0, policy_version 123126 (0.0032) [2024-04-26 07:54:58,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.8, 300 sec: 56261.0). Total num frames: 2017312768. Throughput: 0: 56161.7. Samples: 1966635900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:54:58,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 07:55:00,926][47288] Updated weights for policy 0, policy_version 123136 (0.0025) [2024-04-26 07:55:03,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55705.3, 300 sec: 56372.0). Total num frames: 2017607680. Throughput: 0: 56222.5. Samples: 1966976020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:03,923][47056] Avg episode reward: [(0, '0.534')] [2024-04-26 07:55:04,057][47288] Updated weights for policy 0, policy_version 123146 (0.0028) [2024-04-26 07:55:06,757][47288] Updated weights for policy 0, policy_version 123156 (0.0025) [2024-04-26 07:55:08,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.5, 300 sec: 56316.5). Total num frames: 2017886208. Throughput: 0: 56311.1. Samples: 1967314780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:08,923][47056] Avg episode reward: [(0, '0.607')] [2024-04-26 07:55:09,933][47288] Updated weights for policy 0, policy_version 123166 (0.0025) [2024-04-26 07:55:12,584][47288] Updated weights for policy 0, policy_version 123176 (0.0030) [2024-04-26 07:55:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.5, 300 sec: 56316.5). Total num frames: 2018181120. Throughput: 0: 56506.5. Samples: 1967486920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:13,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 07:55:15,650][47288] Updated weights for policy 0, policy_version 123186 (0.0029) [2024-04-26 07:55:18,269][47288] Updated weights for policy 0, policy_version 123196 (0.0027) [2024-04-26 07:55:18,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 2018476032. Throughput: 0: 56592.4. Samples: 1967828280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:18,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 07:55:21,386][47288] Updated weights for policy 0, policy_version 123206 (0.0031) [2024-04-26 07:55:23,923][47056] Fps is (10 sec: 57345.0, 60 sec: 57071.0, 300 sec: 56427.6). Total num frames: 2018754560. Throughput: 0: 56698.3. Samples: 1968170180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:23,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 07:55:24,087][47288] Updated weights for policy 0, policy_version 123216 (0.0026) [2024-04-26 07:55:27,119][47288] Updated weights for policy 0, policy_version 123226 (0.0028) [2024-04-26 07:55:28,923][47056] Fps is (10 sec: 54067.6, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 2019016704. Throughput: 0: 56407.2. Samples: 1968335660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:28,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 07:55:29,892][47288] Updated weights for policy 0, policy_version 123236 (0.0028) [2024-04-26 07:55:33,071][47288] Updated weights for policy 0, policy_version 123246 (0.0028) [2024-04-26 07:55:33,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 2019295232. Throughput: 0: 56570.6. Samples: 1968679860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:33,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 07:55:35,706][47288] Updated weights for policy 0, policy_version 123256 (0.0028) [2024-04-26 07:55:38,923][47056] Fps is (10 sec: 55701.8, 60 sec: 55978.2, 300 sec: 56316.4). Total num frames: 2019573760. Throughput: 0: 56548.8. Samples: 1969019700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:38,924][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 07:55:39,013][47288] Updated weights for policy 0, policy_version 123266 (0.0027) [2024-04-26 07:55:41,441][47288] Updated weights for policy 0, policy_version 123276 (0.0032) [2024-04-26 07:55:43,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2019868672. Throughput: 0: 56556.0. Samples: 1969180920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:43,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 07:55:44,843][47288] Updated weights for policy 0, policy_version 123286 (0.0033) [2024-04-26 07:55:47,274][47288] Updated weights for policy 0, policy_version 123296 (0.0027) [2024-04-26 07:55:48,923][47056] Fps is (10 sec: 57347.0, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 2020147200. Throughput: 0: 56453.6. Samples: 1969516420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 07:55:48,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 07:55:49,684][47267] Signal inference workers to stop experience collection... (29300 times) [2024-04-26 07:55:49,711][47288] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-04-26 07:55:49,732][47267] Signal inference workers to resume experience collection... 
(29300 times) [2024-04-26 07:55:49,732][47288] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-04-26 07:55:50,565][47288] Updated weights for policy 0, policy_version 123306 (0.0025) [2024-04-26 07:55:53,005][47288] Updated weights for policy 0, policy_version 123316 (0.0026) [2024-04-26 07:55:53,923][47056] Fps is (10 sec: 58981.9, 60 sec: 57071.1, 300 sec: 56483.2). Total num frames: 2020458496. Throughput: 0: 56598.7. Samples: 1969861720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:55:53,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 07:55:56,287][47288] Updated weights for policy 0, policy_version 123326 (0.0030) [2024-04-26 07:55:58,849][47288] Updated weights for policy 0, policy_version 123336 (0.0027) [2024-04-26 07:55:58,923][47056] Fps is (10 sec: 58982.8, 60 sec: 57070.9, 300 sec: 56483.1). Total num frames: 2020737024. Throughput: 0: 56658.0. Samples: 1970036520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:55:58,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 07:56:02,061][47288] Updated weights for policy 0, policy_version 123346 (0.0034) [2024-04-26 07:56:03,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 2020999168. Throughput: 0: 56614.2. Samples: 1970375920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:03,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 07:56:04,622][47288] Updated weights for policy 0, policy_version 123356 (0.0038) [2024-04-26 07:56:07,801][47288] Updated weights for policy 0, policy_version 123366 (0.0029) [2024-04-26 07:56:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56524.9, 300 sec: 56372.1). Total num frames: 2021277696. Throughput: 0: 56468.0. Samples: 1970711240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:08,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 07:56:10,366][47288] Updated weights for policy 0, policy_version 123376 (0.0033) [2024-04-26 07:56:13,679][47288] Updated weights for policy 0, policy_version 123386 (0.0035) [2024-04-26 07:56:13,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 2021556224. Throughput: 0: 56541.2. Samples: 1970880020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:13,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 07:56:16,249][47288] Updated weights for policy 0, policy_version 123396 (0.0030) [2024-04-26 07:56:18,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 2021851136. Throughput: 0: 56439.9. Samples: 1971219660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:18,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 07:56:19,545][47288] Updated weights for policy 0, policy_version 123406 (0.0030) [2024-04-26 07:56:22,144][47288] Updated weights for policy 0, policy_version 123416 (0.0028) [2024-04-26 07:56:23,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 2022129664. Throughput: 0: 56475.3. Samples: 1971561060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:23,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 07:56:25,198][47288] Updated weights for policy 0, policy_version 123426 (0.0031) [2024-04-26 07:56:27,836][47288] Updated weights for policy 0, policy_version 123436 (0.0030) [2024-04-26 07:56:28,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2022408192. Throughput: 0: 56674.9. Samples: 1971731300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:28,923][47056] Avg episode reward: [(0, '0.600')] [2024-04-26 07:56:31,032][47288] Updated weights for policy 0, policy_version 123446 (0.0029) [2024-04-26 07:56:33,604][47288] Updated weights for policy 0, policy_version 123456 (0.0025) [2024-04-26 07:56:33,923][47056] Fps is (10 sec: 58982.5, 60 sec: 57070.9, 300 sec: 56427.6). Total num frames: 2022719488. Throughput: 0: 56709.3. Samples: 1972068340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:33,924][47056] Avg episode reward: [(0, '0.592')] [2024-04-26 07:56:36,869][47288] Updated weights for policy 0, policy_version 123466 (0.0028) [2024-04-26 07:56:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56798.4, 300 sec: 56427.6). Total num frames: 2022981632. Throughput: 0: 56586.6. Samples: 1972408120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:38,923][47056] Avg episode reward: [(0, '0.663')] [2024-04-26 07:56:38,929][47267] Saving new best policy, reward=0.663! [2024-04-26 07:56:39,515][47288] Updated weights for policy 0, policy_version 123476 (0.0029) [2024-04-26 07:56:42,829][47288] Updated weights for policy 0, policy_version 123486 (0.0028) [2024-04-26 07:56:43,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2023260160. Throughput: 0: 56327.6. Samples: 1972571260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:43,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 07:56:45,062][47267] Signal inference workers to stop experience collection... (29350 times) [2024-04-26 07:56:45,102][47288] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-04-26 07:56:45,112][47267] Signal inference workers to resume experience collection... 
(29350 times) [2024-04-26 07:56:45,117][47288] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-04-26 07:56:45,233][47288] Updated weights for policy 0, policy_version 123496 (0.0026) [2024-04-26 07:56:48,529][47288] Updated weights for policy 0, policy_version 123506 (0.0027) [2024-04-26 07:56:48,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2023538688. Throughput: 0: 56381.3. Samples: 1972913080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 07:56:48,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 07:56:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000123507_2023538688.pth... [2024-04-26 07:56:48,992][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000122684_2010054656.pth [2024-04-26 07:56:50,970][47288] Updated weights for policy 0, policy_version 123516 (0.0030) [2024-04-26 07:56:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2023817216. Throughput: 0: 56429.3. Samples: 1973250560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:56:53,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 07:56:54,357][47288] Updated weights for policy 0, policy_version 123526 (0.0035) [2024-04-26 07:56:56,847][47288] Updated weights for policy 0, policy_version 123536 (0.0026) [2024-04-26 07:56:58,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 2024112128. Throughput: 0: 56456.4. Samples: 1973420560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:56:58,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 07:57:00,075][47288] Updated weights for policy 0, policy_version 123546 (0.0032) [2024-04-26 07:57:02,728][47288] Updated weights for policy 0, policy_version 123556 (0.0035) [2024-04-26 07:57:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 56261.0). 
Total num frames: 2024390656. Throughput: 0: 56336.6. Samples: 1973754800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:03,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 07:57:06,048][47288] Updated weights for policy 0, policy_version 123566 (0.0027) [2024-04-26 07:57:08,446][47288] Updated weights for policy 0, policy_version 123576 (0.0027) [2024-04-26 07:57:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2024669184. Throughput: 0: 56171.2. Samples: 1974088760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:08,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 07:57:11,815][47288] Updated weights for policy 0, policy_version 123586 (0.0026) [2024-04-26 07:57:13,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2024947712. Throughput: 0: 56274.7. Samples: 1974263660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:13,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 07:57:14,430][47288] Updated weights for policy 0, policy_version 123596 (0.0026) [2024-04-26 07:57:17,686][47288] Updated weights for policy 0, policy_version 123606 (0.0031) [2024-04-26 07:57:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 2025209856. Throughput: 0: 56222.3. Samples: 1974598340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:18,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 07:57:20,382][47288] Updated weights for policy 0, policy_version 123616 (0.0033) [2024-04-26 07:57:23,368][47288] Updated weights for policy 0, policy_version 123626 (0.0029) [2024-04-26 07:57:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2025504768. Throughput: 0: 56072.9. Samples: 1974931400. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:23,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 07:57:26,252][47288] Updated weights for policy 0, policy_version 123636 (0.0031) [2024-04-26 07:57:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 56427.7). Total num frames: 2025783296. Throughput: 0: 56216.8. Samples: 1975101020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:28,923][47056] Avg episode reward: [(0, '0.616')] [2024-04-26 07:57:29,236][47288] Updated weights for policy 0, policy_version 123646 (0.0028) [2024-04-26 07:57:32,064][47288] Updated weights for policy 0, policy_version 123656 (0.0028) [2024-04-26 07:57:33,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 56372.1). Total num frames: 2026061824. Throughput: 0: 56191.3. Samples: 1975441700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:33,923][47056] Avg episode reward: [(0, '0.597')] [2024-04-26 07:57:35,069][47288] Updated weights for policy 0, policy_version 123666 (0.0027) [2024-04-26 07:57:37,951][47288] Updated weights for policy 0, policy_version 123676 (0.0026) [2024-04-26 07:57:38,923][47056] Fps is (10 sec: 60620.7, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 2026389504. Throughput: 0: 56099.1. Samples: 1975775020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:38,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 07:57:40,931][47288] Updated weights for policy 0, policy_version 123686 (0.0020) [2024-04-26 07:57:43,878][47288] Updated weights for policy 0, policy_version 123696 (0.0028) [2024-04-26 07:57:43,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 2026635264. Throughput: 0: 56083.5. Samples: 1975944320. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:43,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 07:57:46,667][47288] Updated weights for policy 0, policy_version 123706 (0.0026) [2024-04-26 07:57:48,923][47056] Fps is (10 sec: 50791.0, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 2026897408. Throughput: 0: 56178.8. Samples: 1976282840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:48,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 07:57:49,771][47288] Updated weights for policy 0, policy_version 123716 (0.0029) [2024-04-26 07:57:50,425][47267] Signal inference workers to stop experience collection... (29400 times) [2024-04-26 07:57:50,472][47288] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-04-26 07:57:50,480][47267] Signal inference workers to resume experience collection... (29400 times) [2024-04-26 07:57:50,489][47288] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-04-26 07:57:52,420][47288] Updated weights for policy 0, policy_version 123726 (0.0034) [2024-04-26 07:57:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 2027175936. Throughput: 0: 56292.4. Samples: 1976621920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-04-26 07:57:53,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 07:57:55,440][47288] Updated weights for policy 0, policy_version 123736 (0.0028) [2024-04-26 07:57:58,289][47288] Updated weights for policy 0, policy_version 123746 (0.0031) [2024-04-26 07:57:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 2027454464. Throughput: 0: 55940.1. Samples: 1976780960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:57:58,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 07:58:01,421][47288] Updated weights for policy 0, policy_version 123756 (0.0027) [2024-04-26 07:58:03,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 56427.7). Total num frames: 2027749376. Throughput: 0: 56045.4. Samples: 1977120380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:03,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 07:58:04,134][47288] Updated weights for policy 0, policy_version 123766 (0.0026) [2024-04-26 07:58:07,148][47288] Updated weights for policy 0, policy_version 123776 (0.0030) [2024-04-26 07:58:08,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2028044288. Throughput: 0: 56222.3. Samples: 1977461400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:08,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 07:58:09,872][47288] Updated weights for policy 0, policy_version 123786 (0.0027) [2024-04-26 07:58:12,909][47288] Updated weights for policy 0, policy_version 123796 (0.0030) [2024-04-26 07:58:13,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2028339200. Throughput: 0: 56276.5. Samples: 1977633460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:13,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 07:58:15,495][47288] Updated weights for policy 0, policy_version 123806 (0.0041) [2024-04-26 07:58:18,691][47288] Updated weights for policy 0, policy_version 123816 (0.0033) [2024-04-26 07:58:18,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.8, 300 sec: 56261.0). Total num frames: 2028617728. Throughput: 0: 56227.7. Samples: 1977971940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:18,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 07:58:21,308][47288] Updated weights for policy 0, policy_version 123826 (0.0027) [2024-04-26 07:58:23,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 2028863488. Throughput: 0: 56345.3. Samples: 1978310560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:23,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 07:58:24,653][47288] Updated weights for policy 0, policy_version 123836 (0.0036) [2024-04-26 07:58:27,176][47288] Updated weights for policy 0, policy_version 123846 (0.0027) [2024-04-26 07:58:28,923][47056] Fps is (10 sec: 54068.0, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2029158400. Throughput: 0: 56246.9. Samples: 1978475420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:28,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 07:58:30,382][47288] Updated weights for policy 0, policy_version 123856 (0.0026) [2024-04-26 07:58:32,919][47288] Updated weights for policy 0, policy_version 123866 (0.0028) [2024-04-26 07:58:33,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2029436928. Throughput: 0: 56320.2. Samples: 1978817260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:33,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 07:58:36,125][47288] Updated weights for policy 0, policy_version 123876 (0.0035) [2024-04-26 07:58:38,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 56427.6). Total num frames: 2029731840. Throughput: 0: 56257.9. Samples: 1979153520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:38,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 07:58:39,025][47288] Updated weights for policy 0, policy_version 123886 (0.0031) [2024-04-26 07:58:41,821][47288] Updated weights for policy 0, policy_version 123896 (0.0027) [2024-04-26 07:58:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2030010368. Throughput: 0: 56635.7. Samples: 1979329580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:43,923][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 07:58:44,810][47288] Updated weights for policy 0, policy_version 123906 (0.0026) [2024-04-26 07:58:47,502][47288] Updated weights for policy 0, policy_version 123916 (0.0029) [2024-04-26 07:58:48,095][47267] Signal inference workers to stop experience collection... (29450 times) [2024-04-26 07:58:48,096][47267] Signal inference workers to resume experience collection... (29450 times) [2024-04-26 07:58:48,124][47288] InferenceWorker_p0-w0: stopping experience collection (29450 times) [2024-04-26 07:58:48,125][47288] InferenceWorker_p0-w0: resuming experience collection (29450 times) [2024-04-26 07:58:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 2030305280. Throughput: 0: 56577.7. Samples: 1979666380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:48,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 07:58:49,036][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000123921_2030321664.pth... 
[2024-04-26 07:58:49,079][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000123094_2016772096.pth [2024-04-26 07:58:50,606][47288] Updated weights for policy 0, policy_version 123926 (0.0030) [2024-04-26 07:58:53,234][47288] Updated weights for policy 0, policy_version 123936 (0.0024) [2024-04-26 07:58:53,924][47056] Fps is (10 sec: 58977.5, 60 sec: 57070.0, 300 sec: 56371.9). Total num frames: 2030600192. Throughput: 0: 56539.6. Samples: 1980005740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 07:58:53,924][47056] Avg episode reward: [(0, '0.605')] [2024-04-26 07:58:56,299][47288] Updated weights for policy 0, policy_version 123946 (0.0026) [2024-04-26 07:58:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56797.9, 300 sec: 56261.0). Total num frames: 2030862336. Throughput: 0: 56503.2. Samples: 1980176100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:58:58,923][47056] Avg episode reward: [(0, '0.604')] [2024-04-26 07:58:59,099][47288] Updated weights for policy 0, policy_version 123956 (0.0032) [2024-04-26 07:59:02,083][47288] Updated weights for policy 0, policy_version 123966 (0.0036) [2024-04-26 07:59:03,923][47056] Fps is (10 sec: 54072.3, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 2031140864. Throughput: 0: 56629.8. Samples: 1980520280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:03,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 07:59:04,887][47288] Updated weights for policy 0, policy_version 123976 (0.0027) [2024-04-26 07:59:07,901][47288] Updated weights for policy 0, policy_version 123986 (0.0029) [2024-04-26 07:59:08,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2031435776. Throughput: 0: 56644.9. Samples: 1980859580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:08,924][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 07:59:11,096][47288] Updated weights for policy 0, policy_version 123996 (0.0034) [2024-04-26 07:59:13,554][47288] Updated weights for policy 0, policy_version 124006 (0.0032) [2024-04-26 07:59:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2031714304. Throughput: 0: 56518.5. Samples: 1981018760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:13,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 07:59:16,752][47288] Updated weights for policy 0, policy_version 124016 (0.0026) [2024-04-26 07:59:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2032009216. Throughput: 0: 56538.7. Samples: 1981361500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:18,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 07:59:19,313][47288] Updated weights for policy 0, policy_version 124026 (0.0028) [2024-04-26 07:59:22,629][47288] Updated weights for policy 0, policy_version 124036 (0.0029) [2024-04-26 07:59:23,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 2032271360. Throughput: 0: 56635.6. Samples: 1981702120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:23,923][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 07:59:25,264][47288] Updated weights for policy 0, policy_version 124046 (0.0032) [2024-04-26 07:59:28,331][47288] Updated weights for policy 0, policy_version 124056 (0.0026) [2024-04-26 07:59:28,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 2032566272. Throughput: 0: 56601.6. Samples: 1981876640. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:28,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 07:59:31,358][47288] Updated weights for policy 0, policy_version 124066 (0.0029) [2024-04-26 07:59:33,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 2032844800. Throughput: 0: 56638.5. Samples: 1982215120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:33,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 07:59:34,102][47288] Updated weights for policy 0, policy_version 124076 (0.0031) [2024-04-26 07:59:37,130][47288] Updated weights for policy 0, policy_version 124086 (0.0029) [2024-04-26 07:59:38,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 2033139712. Throughput: 0: 56615.0. Samples: 1982553360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:38,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 07:59:39,875][47288] Updated weights for policy 0, policy_version 124096 (0.0028) [2024-04-26 07:59:42,960][47288] Updated weights for policy 0, policy_version 124106 (0.0027) [2024-04-26 07:59:43,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 56372.0). Total num frames: 2033401856. Throughput: 0: 56586.8. Samples: 1982722520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:43,924][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 07:59:45,595][47288] Updated weights for policy 0, policy_version 124116 (0.0031) [2024-04-26 07:59:48,606][47288] Updated weights for policy 0, policy_version 124126 (0.0027) [2024-04-26 07:59:48,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2033680384. Throughput: 0: 56472.6. Samples: 1983061540. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:48,923][47056] Avg episode reward: [(0, '0.587')] [2024-04-26 07:59:51,572][47288] Updated weights for policy 0, policy_version 124136 (0.0028) [2024-04-26 07:59:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56252.6, 300 sec: 56483.1). Total num frames: 2033975296. Throughput: 0: 56401.7. Samples: 1983397660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 07:59:53,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 07:59:54,266][47288] Updated weights for policy 0, policy_version 124146 (0.0030) [2024-04-26 07:59:55,434][47267] Signal inference workers to stop experience collection... (29500 times) [2024-04-26 07:59:55,473][47288] InferenceWorker_p0-w0: stopping experience collection (29500 times) [2024-04-26 07:59:55,485][47267] Signal inference workers to resume experience collection... (29500 times) [2024-04-26 07:59:55,486][47288] InferenceWorker_p0-w0: resuming experience collection (29500 times) [2024-04-26 07:59:57,563][47288] Updated weights for policy 0, policy_version 124156 (0.0028) [2024-04-26 07:59:58,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.7, 300 sec: 56483.2). Total num frames: 2034270208. Throughput: 0: 56710.2. Samples: 1983570720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 07:59:58,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 08:00:00,043][47288] Updated weights for policy 0, policy_version 124166 (0.0030) [2024-04-26 08:00:03,224][47288] Updated weights for policy 0, policy_version 124176 (0.0032) [2024-04-26 08:00:03,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 2034548736. Throughput: 0: 56673.9. Samples: 1983911820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:03,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 08:00:05,872][47288] Updated weights for policy 0, policy_version 124186 (0.0029) [2024-04-26 08:00:08,923][47056] Fps is (10 sec: 55706.8, 60 sec: 56525.0, 300 sec: 56427.7). Total num frames: 2034827264. Throughput: 0: 56670.3. Samples: 1984252280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:08,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 08:00:08,954][47288] Updated weights for policy 0, policy_version 124196 (0.0030) [2024-04-26 08:00:11,846][47288] Updated weights for policy 0, policy_version 124206 (0.0031) [2024-04-26 08:00:13,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2035105792. Throughput: 0: 56588.4. Samples: 1984423120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:13,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:00:14,781][47288] Updated weights for policy 0, policy_version 124216 (0.0030) [2024-04-26 08:00:17,647][47288] Updated weights for policy 0, policy_version 124226 (0.0037) [2024-04-26 08:00:18,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 2035417088. Throughput: 0: 56608.1. Samples: 1984762480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:18,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 08:00:20,572][47288] Updated weights for policy 0, policy_version 124236 (0.0024) [2024-04-26 08:00:23,438][47288] Updated weights for policy 0, policy_version 124246 (0.0033) [2024-04-26 08:00:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 2035679232. Throughput: 0: 56579.2. Samples: 1985099420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:23,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 08:00:26,313][47288] Updated weights for policy 0, policy_version 124256 (0.0032) [2024-04-26 08:00:28,923][47056] Fps is (10 sec: 52428.6, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 2035941376. Throughput: 0: 56551.2. Samples: 1985267320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:28,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 08:00:29,311][47288] Updated weights for policy 0, policy_version 124266 (0.0032) [2024-04-26 08:00:32,082][47288] Updated weights for policy 0, policy_version 124276 (0.0034) [2024-04-26 08:00:33,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 2036236288. Throughput: 0: 56596.3. Samples: 1985608380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:33,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 08:00:34,982][47288] Updated weights for policy 0, policy_version 124286 (0.0029) [2024-04-26 08:00:38,139][47288] Updated weights for policy 0, policy_version 124296 (0.0025) [2024-04-26 08:00:38,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 2036514816. Throughput: 0: 56625.6. Samples: 1985945800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:38,923][47056] Avg episode reward: [(0, '0.439')] [2024-04-26 08:00:40,628][47288] Updated weights for policy 0, policy_version 124306 (0.0030) [2024-04-26 08:00:43,911][47288] Updated weights for policy 0, policy_version 124316 (0.0027) [2024-04-26 08:00:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2036793344. Throughput: 0: 56482.2. Samples: 1986112420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:43,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 08:00:45,305][47267] Signal inference workers to stop experience collection... (29550 times) [2024-04-26 08:00:45,305][47267] Signal inference workers to resume experience collection... (29550 times) [2024-04-26 08:00:45,331][47288] InferenceWorker_p0-w0: stopping experience collection (29550 times) [2024-04-26 08:00:45,332][47288] InferenceWorker_p0-w0: resuming experience collection (29550 times) [2024-04-26 08:00:46,494][47288] Updated weights for policy 0, policy_version 124326 (0.0031) [2024-04-26 08:00:48,923][47056] Fps is (10 sec: 54065.6, 60 sec: 56251.5, 300 sec: 56261.0). Total num frames: 2037055488. Throughput: 0: 56340.2. Samples: 1986447140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:48,924][47056] Avg episode reward: [(0, '0.566')] [2024-04-26 08:00:48,999][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000124333_2037071872.pth... [2024-04-26 08:00:49,045][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000123507_2023538688.pth [2024-04-26 08:00:49,615][47288] Updated weights for policy 0, policy_version 124336 (0.0030) [2024-04-26 08:00:52,278][47288] Updated weights for policy 0, policy_version 124346 (0.0031) [2024-04-26 08:00:53,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 2037383168. Throughput: 0: 56336.7. Samples: 1986787440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:53,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 08:00:55,305][47288] Updated weights for policy 0, policy_version 124356 (0.0027) [2024-04-26 08:00:58,114][47288] Updated weights for policy 0, policy_version 124366 (0.0033) [2024-04-26 08:00:58,923][47056] Fps is (10 sec: 60622.4, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 2037661696. Throughput: 0: 56486.8. 
Samples: 1986965020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 08:00:58,923][47056] Avg episode reward: [(0, '0.420')] [2024-04-26 08:01:01,261][47288] Updated weights for policy 0, policy_version 124376 (0.0025) [2024-04-26 08:01:03,765][47288] Updated weights for policy 0, policy_version 124386 (0.0032) [2024-04-26 08:01:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 2037940224. Throughput: 0: 56421.0. Samples: 1987301420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:03,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 08:01:07,078][47288] Updated weights for policy 0, policy_version 124396 (0.0032) [2024-04-26 08:01:08,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2038202368. Throughput: 0: 56433.0. Samples: 1987638900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:08,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 08:01:09,517][47288] Updated weights for policy 0, policy_version 124406 (0.0025) [2024-04-26 08:01:12,758][47288] Updated weights for policy 0, policy_version 124416 (0.0025) [2024-04-26 08:01:13,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2038497280. Throughput: 0: 56643.6. Samples: 1987816280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:13,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 08:01:15,406][47288] Updated weights for policy 0, policy_version 124426 (0.0028) [2024-04-26 08:01:18,526][47288] Updated weights for policy 0, policy_version 124436 (0.0029) [2024-04-26 08:01:18,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2038775808. Throughput: 0: 56727.2. Samples: 1988161100. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:18,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 08:01:21,170][47288] Updated weights for policy 0, policy_version 124446 (0.0034) [2024-04-26 08:01:23,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 2039037952. Throughput: 0: 56716.3. Samples: 1988498040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:23,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 08:01:24,379][47288] Updated weights for policy 0, policy_version 124456 (0.0034) [2024-04-26 08:01:26,896][47288] Updated weights for policy 0, policy_version 124466 (0.0028) [2024-04-26 08:01:27,299][47267] Signal inference workers to stop experience collection... (29600 times) [2024-04-26 08:01:27,349][47288] InferenceWorker_p0-w0: stopping experience collection (29600 times) [2024-04-26 08:01:27,353][47267] Signal inference workers to resume experience collection... (29600 times) [2024-04-26 08:01:27,361][47288] InferenceWorker_p0-w0: resuming experience collection (29600 times) [2024-04-26 08:01:28,923][47056] Fps is (10 sec: 55704.7, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 2039332864. Throughput: 0: 56633.6. Samples: 1988660940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:28,924][47056] Avg episode reward: [(0, '0.391')] [2024-04-26 08:01:30,218][47288] Updated weights for policy 0, policy_version 124476 (0.0031) [2024-04-26 08:01:32,686][47288] Updated weights for policy 0, policy_version 124486 (0.0030) [2024-04-26 08:01:33,923][47056] Fps is (10 sec: 60620.0, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 2039644160. Throughput: 0: 56737.0. Samples: 1989000300. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:33,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 08:01:35,953][47288] Updated weights for policy 0, policy_version 124496 (0.0032) [2024-04-26 08:01:38,516][47288] Updated weights for policy 0, policy_version 124506 (0.0033) [2024-04-26 08:01:38,923][47056] Fps is (10 sec: 60622.6, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 2039939072. Throughput: 0: 56777.5. Samples: 1989342420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:38,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 08:01:41,578][47288] Updated weights for policy 0, policy_version 124516 (0.0026) [2024-04-26 08:01:43,923][47056] Fps is (10 sec: 54068.4, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2040184832. Throughput: 0: 56773.4. Samples: 1989519820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:43,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:01:44,222][47288] Updated weights for policy 0, policy_version 124526 (0.0033) [2024-04-26 08:01:47,426][47288] Updated weights for policy 0, policy_version 124536 (0.0037) [2024-04-26 08:01:48,923][47056] Fps is (10 sec: 54066.7, 60 sec: 57071.1, 300 sec: 56483.1). Total num frames: 2040479744. Throughput: 0: 56660.0. Samples: 1989851120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:48,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 08:01:50,010][47288] Updated weights for policy 0, policy_version 124546 (0.0031) [2024-04-26 08:01:53,371][47288] Updated weights for policy 0, policy_version 124556 (0.0032) [2024-04-26 08:01:53,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 2040774656. Throughput: 0: 56700.3. Samples: 1990190420. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:53,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 08:01:55,776][47288] Updated weights for policy 0, policy_version 124566 (0.0030) [2024-04-26 08:01:58,923][47056] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 2041020416. Throughput: 0: 56388.1. Samples: 1990353740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:01:58,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 08:01:59,136][47288] Updated weights for policy 0, policy_version 124576 (0.0034) [2024-04-26 08:02:01,523][47288] Updated weights for policy 0, policy_version 124586 (0.0027) [2024-04-26 08:02:03,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 2041298944. Throughput: 0: 56310.8. Samples: 1990695080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:03,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 08:02:04,860][47288] Updated weights for policy 0, policy_version 124596 (0.0028) [2024-04-26 08:02:07,137][47288] Updated weights for policy 0, policy_version 124606 (0.0032) [2024-04-26 08:02:08,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2041593856. Throughput: 0: 56387.3. Samples: 1991035460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:08,923][47056] Avg episode reward: [(0, '0.660')] [2024-04-26 08:02:10,772][47288] Updated weights for policy 0, policy_version 124616 (0.0028) [2024-04-26 08:02:12,949][47288] Updated weights for policy 0, policy_version 124626 (0.0031) [2024-04-26 08:02:13,923][47056] Fps is (10 sec: 62258.7, 60 sec: 57070.9, 300 sec: 56649.8). Total num frames: 2041921536. Throughput: 0: 56584.2. Samples: 1991207220. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:13,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 08:02:16,737][47288] Updated weights for policy 0, policy_version 124636 (0.0026) [2024-04-26 08:02:18,770][47288] Updated weights for policy 0, policy_version 124646 (0.0029) [2024-04-26 08:02:18,923][47056] Fps is (10 sec: 60620.3, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 2042200064. Throughput: 0: 56522.4. Samples: 1991543800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:18,923][47056] Avg episode reward: [(0, '0.544')] [2024-04-26 08:02:19,015][47267] Signal inference workers to stop experience collection... (29650 times) [2024-04-26 08:02:19,016][47267] Signal inference workers to resume experience collection... (29650 times) [2024-04-26 08:02:19,039][47288] InferenceWorker_p0-w0: stopping experience collection (29650 times) [2024-04-26 08:02:19,040][47288] InferenceWorker_p0-w0: resuming experience collection (29650 times) [2024-04-26 08:02:22,425][47288] Updated weights for policy 0, policy_version 124656 (0.0033) [2024-04-26 08:02:23,923][47056] Fps is (10 sec: 54067.2, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 2042462208. Throughput: 0: 56466.9. Samples: 1991883440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:23,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:02:24,640][47288] Updated weights for policy 0, policy_version 124666 (0.0027) [2024-04-26 08:02:28,130][47288] Updated weights for policy 0, policy_version 124676 (0.0034) [2024-04-26 08:02:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56798.1, 300 sec: 56538.7). Total num frames: 2042740736. Throughput: 0: 56303.0. Samples: 1992053460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:28,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 08:02:30,330][47288] Updated weights for policy 0, policy_version 124686 (0.0029) [2024-04-26 08:02:33,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.9, 300 sec: 56316.5). Total num frames: 2043002880. Throughput: 0: 56480.9. Samples: 1992392760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:33,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 08:02:33,964][47288] Updated weights for policy 0, policy_version 124696 (0.0030) [2024-04-26 08:02:36,231][47288] Updated weights for policy 0, policy_version 124706 (0.0028) [2024-04-26 08:02:38,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 56427.6). Total num frames: 2043281408. Throughput: 0: 56445.3. Samples: 1992730460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:38,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:02:39,849][47288] Updated weights for policy 0, policy_version 124716 (0.0030) [2024-04-26 08:02:41,906][47288] Updated weights for policy 0, policy_version 124726 (0.0023) [2024-04-26 08:02:43,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 2043592704. Throughput: 0: 56647.9. Samples: 1992902900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:43,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 08:02:45,521][47288] Updated weights for policy 0, policy_version 124736 (0.0025) [2024-04-26 08:02:47,704][47288] Updated weights for policy 0, policy_version 124746 (0.0035) [2024-04-26 08:02:48,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2043854848. Throughput: 0: 56548.0. Samples: 1993239740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:48,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:02:49,053][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000124749_2043887616.pth... [2024-04-26 08:02:49,098][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000123921_2030321664.pth [2024-04-26 08:02:51,281][47288] Updated weights for policy 0, policy_version 124756 (0.0029) [2024-04-26 08:02:53,404][47288] Updated weights for policy 0, policy_version 124766 (0.0029) [2024-04-26 08:02:53,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56797.9, 300 sec: 56705.3). Total num frames: 2044182528. Throughput: 0: 56388.3. Samples: 1993572940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:53,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 08:02:57,096][47288] Updated weights for policy 0, policy_version 124776 (0.0034) [2024-04-26 08:02:58,923][47056] Fps is (10 sec: 60621.0, 60 sec: 57344.0, 300 sec: 56649.8). Total num frames: 2044461056. Throughput: 0: 56738.3. Samples: 1993760440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:02:58,923][47056] Avg episode reward: [(0, '0.408')] [2024-04-26 08:02:59,131][47288] Updated weights for policy 0, policy_version 124786 (0.0033) [2024-04-26 08:03:02,919][47288] Updated weights for policy 0, policy_version 124796 (0.0028) [2024-04-26 08:03:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 57343.9, 300 sec: 56594.2). Total num frames: 2044739584. Throughput: 0: 56822.5. Samples: 1994100820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:03:03,924][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 08:03:04,727][47267] Signal inference workers to stop experience collection... (29700 times) [2024-04-26 08:03:04,761][47288] InferenceWorker_p0-w0: stopping experience collection (29700 times) [2024-04-26 08:03:04,785][47267] Signal inference workers to resume experience collection... 
(29700 times) [2024-04-26 08:03:04,786][47288] InferenceWorker_p0-w0: resuming experience collection (29700 times) [2024-04-26 08:03:04,895][47288] Updated weights for policy 0, policy_version 124806 (0.0028) [2024-04-26 08:03:08,521][47288] Updated weights for policy 0, policy_version 124816 (0.0029) [2024-04-26 08:03:08,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 2045001728. Throughput: 0: 56791.6. Samples: 1994439060. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:08,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 08:03:10,835][47288] Updated weights for policy 0, policy_version 124826 (0.0030) [2024-04-26 08:03:13,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55705.7, 300 sec: 56427.6). Total num frames: 2045263872. Throughput: 0: 56731.6. Samples: 1994606380. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:13,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 08:03:14,247][47288] Updated weights for policy 0, policy_version 124836 (0.0026) [2024-04-26 08:03:16,599][47288] Updated weights for policy 0, policy_version 124846 (0.0034) [2024-04-26 08:03:18,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 56594.2). Total num frames: 2045558784. Throughput: 0: 56799.0. Samples: 1994948720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:18,923][47056] Avg episode reward: [(0, '0.582')] [2024-04-26 08:03:19,992][47288] Updated weights for policy 0, policy_version 124856 (0.0027) [2024-04-26 08:03:22,361][47288] Updated weights for policy 0, policy_version 124866 (0.0030) [2024-04-26 08:03:23,922][47056] Fps is (10 sec: 60621.7, 60 sec: 56798.1, 300 sec: 56649.8). Total num frames: 2045870080. Throughput: 0: 56832.3. Samples: 1995287900. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:23,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 08:03:25,666][47288] Updated weights for policy 0, policy_version 124876 (0.0029) [2024-04-26 08:03:28,337][47288] Updated weights for policy 0, policy_version 124886 (0.0034) [2024-04-26 08:03:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 2046132224. Throughput: 0: 56754.2. Samples: 1995456840. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:28,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 08:03:31,371][47288] Updated weights for policy 0, policy_version 124896 (0.0033) [2024-04-26 08:03:33,923][47056] Fps is (10 sec: 57342.5, 60 sec: 57343.9, 300 sec: 56649.7). Total num frames: 2046443520. Throughput: 0: 56736.3. Samples: 1995792880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:33,923][47056] Avg episode reward: [(0, '0.443')] [2024-04-26 08:03:34,136][47288] Updated weights for policy 0, policy_version 124906 (0.0038) [2024-04-26 08:03:37,197][47288] Updated weights for policy 0, policy_version 124916 (0.0031) [2024-04-26 08:03:38,923][47056] Fps is (10 sec: 60621.5, 60 sec: 57617.2, 300 sec: 56705.3). Total num frames: 2046738432. Throughput: 0: 56880.1. Samples: 1996132540. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:38,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:03:39,928][47288] Updated weights for policy 0, policy_version 124926 (0.0032) [2024-04-26 08:03:42,954][47288] Updated weights for policy 0, policy_version 124936 (0.0030) [2024-04-26 08:03:43,923][47056] Fps is (10 sec: 55706.7, 60 sec: 56798.0, 300 sec: 56594.2). Total num frames: 2047000576. Throughput: 0: 56740.5. Samples: 1996313760. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:43,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 08:03:45,689][47288] Updated weights for policy 0, policy_version 124946 (0.0034) [2024-04-26 08:03:48,806][47288] Updated weights for policy 0, policy_version 124956 (0.0032) [2024-04-26 08:03:48,923][47056] Fps is (10 sec: 54067.2, 60 sec: 57071.0, 300 sec: 56538.9). Total num frames: 2047279104. Throughput: 0: 56622.4. Samples: 1996648820. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:48,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 08:03:51,673][47288] Updated weights for policy 0, policy_version 124966 (0.0032) [2024-04-26 08:03:53,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55978.7, 300 sec: 56538.7). Total num frames: 2047541248. Throughput: 0: 56662.2. Samples: 1996988860. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:53,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 08:03:54,689][47288] Updated weights for policy 0, policy_version 124976 (0.0026) [2024-04-26 08:03:57,647][47288] Updated weights for policy 0, policy_version 124986 (0.0034) [2024-04-26 08:03:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 2047836160. Throughput: 0: 56556.9. Samples: 1997151440. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:03:58,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:04:00,530][47288] Updated weights for policy 0, policy_version 124996 (0.0024) [2024-04-26 08:04:03,438][47288] Updated weights for policy 0, policy_version 125006 (0.0036) [2024-04-26 08:04:03,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2048114688. Throughput: 0: 56523.6. Samples: 1997492280. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-26 08:04:03,923][47056] Avg episode reward: [(0, '0.616')] [2024-04-26 08:04:06,116][47288] Updated weights for policy 0, policy_version 125016 (0.0029) [2024-04-26 08:04:08,923][47056] Fps is (10 sec: 55704.2, 60 sec: 56524.6, 300 sec: 56538.7). Total num frames: 2048393216. Throughput: 0: 56473.7. Samples: 1997829240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:08,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 08:04:09,346][47288] Updated weights for policy 0, policy_version 125026 (0.0027) [2024-04-26 08:04:11,854][47288] Updated weights for policy 0, policy_version 125036 (0.0032) [2024-04-26 08:04:13,923][47056] Fps is (10 sec: 58982.0, 60 sec: 57343.9, 300 sec: 56594.2). Total num frames: 2048704512. Throughput: 0: 56621.7. Samples: 1998004820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:13,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 08:04:15,129][47288] Updated weights for policy 0, policy_version 125046 (0.0032) [2024-04-26 08:04:17,737][47288] Updated weights for policy 0, policy_version 125056 (0.0030) [2024-04-26 08:04:18,923][47056] Fps is (10 sec: 60621.7, 60 sec: 57344.0, 300 sec: 56705.3). Total num frames: 2048999424. Throughput: 0: 56684.0. Samples: 1998343660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:18,923][47056] Avg episode reward: [(0, '0.610')] [2024-04-26 08:04:21,056][47288] Updated weights for policy 0, policy_version 125066 (0.0027) [2024-04-26 08:04:23,512][47288] Updated weights for policy 0, policy_version 125076 (0.0028) [2024-04-26 08:04:23,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 2049261568. Throughput: 0: 56531.1. Samples: 1998676440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:23,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 08:04:24,761][47267] Signal inference workers to stop experience collection... (29750 times) [2024-04-26 08:04:24,766][47267] Signal inference workers to resume experience collection... (29750 times) [2024-04-26 08:04:24,791][47288] InferenceWorker_p0-w0: stopping experience collection (29750 times) [2024-04-26 08:04:24,791][47288] InferenceWorker_p0-w0: resuming experience collection (29750 times) [2024-04-26 08:04:26,880][47288] Updated weights for policy 0, policy_version 125086 (0.0032) [2024-04-26 08:04:28,923][47056] Fps is (10 sec: 52429.5, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2049523712. Throughput: 0: 56459.5. Samples: 1998854440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:28,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 08:04:29,263][47288] Updated weights for policy 0, policy_version 125096 (0.0037) [2024-04-26 08:04:32,618][47288] Updated weights for policy 0, policy_version 125106 (0.0034) [2024-04-26 08:04:33,923][47056] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 56483.2). Total num frames: 2049802240. Throughput: 0: 56534.7. Samples: 1999192880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:33,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 08:04:35,071][47288] Updated weights for policy 0, policy_version 125116 (0.0034) [2024-04-26 08:04:38,248][47288] Updated weights for policy 0, policy_version 125126 (0.0032) [2024-04-26 08:04:38,923][47056] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 56594.3). Total num frames: 2050097152. Throughput: 0: 56388.5. Samples: 1999526340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:38,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 08:04:40,804][47288] Updated weights for policy 0, policy_version 125136 (0.0024) [2024-04-26 08:04:43,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 56594.2). Total num frames: 2050375680. Throughput: 0: 56687.5. Samples: 1999702380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:43,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 08:04:43,998][47288] Updated weights for policy 0, policy_version 125146 (0.0029) [2024-04-26 08:04:46,476][47288] Updated weights for policy 0, policy_version 125156 (0.0029) [2024-04-26 08:04:48,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 2050670592. Throughput: 0: 56584.5. Samples: 2000038580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:48,923][47056] Avg episode reward: [(0, '0.435')] [2024-04-26 08:04:48,942][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000125164_2050686976.pth... [2024-04-26 08:04:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000124333_2037071872.pth [2024-04-26 08:04:49,992][47288] Updated weights for policy 0, policy_version 125166 (0.0032) [2024-04-26 08:04:52,362][47288] Updated weights for policy 0, policy_version 125176 (0.0027) [2024-04-26 08:04:53,923][47056] Fps is (10 sec: 58982.2, 60 sec: 57070.9, 300 sec: 56594.2). Total num frames: 2050965504. Throughput: 0: 56537.5. Samples: 2000373420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:53,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 08:04:55,644][47288] Updated weights for policy 0, policy_version 125186 (0.0030) [2024-04-26 08:04:58,278][47288] Updated weights for policy 0, policy_version 125196 (0.0030) [2024-04-26 08:04:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 56538.7). 
Total num frames: 2051227648. Throughput: 0: 56517.6. Samples: 2000548100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:04:58,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:05:01,307][47288] Updated weights for policy 0, policy_version 125206 (0.0027) [2024-04-26 08:05:03,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2051522560. Throughput: 0: 56631.1. Samples: 2000892060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:05:03,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 08:05:04,094][47288] Updated weights for policy 0, policy_version 125216 (0.0026) [2024-04-26 08:05:07,129][47288] Updated weights for policy 0, policy_version 125226 (0.0027) [2024-04-26 08:05:08,925][47056] Fps is (10 sec: 55691.2, 60 sec: 56522.6, 300 sec: 56538.2). Total num frames: 2051784704. Throughput: 0: 56658.6. Samples: 2001226220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 08:05:08,926][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 08:05:09,899][47288] Updated weights for policy 0, policy_version 125236 (0.0025) [2024-04-26 08:05:10,743][47267] Signal inference workers to stop experience collection... (29800 times) [2024-04-26 08:05:10,777][47288] InferenceWorker_p0-w0: stopping experience collection (29800 times) [2024-04-26 08:05:10,803][47267] Signal inference workers to resume experience collection... (29800 times) [2024-04-26 08:05:10,806][47288] InferenceWorker_p0-w0: resuming experience collection (29800 times) [2024-04-26 08:05:12,899][47288] Updated weights for policy 0, policy_version 125246 (0.0025) [2024-04-26 08:05:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 2052063232. Throughput: 0: 56354.6. Samples: 2001390400. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:13,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:05:15,603][47288] Updated weights for policy 0, policy_version 125256 (0.0031) [2024-04-26 08:05:18,574][47288] Updated weights for policy 0, policy_version 125266 (0.0028) [2024-04-26 08:05:18,923][47056] Fps is (10 sec: 57359.1, 60 sec: 55978.8, 300 sec: 56538.7). Total num frames: 2052358144. Throughput: 0: 56370.3. Samples: 2001729540. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:18,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:05:21,520][47288] Updated weights for policy 0, policy_version 125276 (0.0026) [2024-04-26 08:05:23,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56524.8, 300 sec: 56649.8). Total num frames: 2052653056. Throughput: 0: 56618.6. Samples: 2002074180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:23,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 08:05:24,368][47288] Updated weights for policy 0, policy_version 125286 (0.0027) [2024-04-26 08:05:27,188][47288] Updated weights for policy 0, policy_version 125296 (0.0027) [2024-04-26 08:05:28,923][47056] Fps is (10 sec: 58981.5, 60 sec: 57070.8, 300 sec: 56649.8). Total num frames: 2052947968. Throughput: 0: 56418.2. Samples: 2002241200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:28,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 08:05:30,189][47288] Updated weights for policy 0, policy_version 125306 (0.0035) [2024-04-26 08:05:32,971][47288] Updated weights for policy 0, policy_version 125316 (0.0024) [2024-04-26 08:05:33,923][47056] Fps is (10 sec: 57343.9, 60 sec: 57070.8, 300 sec: 56649.7). Total num frames: 2053226496. Throughput: 0: 56430.7. Samples: 2002577960. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:33,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 08:05:35,955][47288] Updated weights for policy 0, policy_version 125326 (0.0030) [2024-04-26 08:05:38,916][47288] Updated weights for policy 0, policy_version 125336 (0.0026) [2024-04-26 08:05:38,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56798.0, 300 sec: 56649.8). Total num frames: 2053505024. Throughput: 0: 56575.8. Samples: 2002919320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:38,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 08:05:41,886][47288] Updated weights for policy 0, policy_version 125346 (0.0026) [2024-04-26 08:05:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56797.8, 300 sec: 56705.3). Total num frames: 2053783552. Throughput: 0: 56508.3. Samples: 2003090980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:43,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 08:05:44,688][47288] Updated weights for policy 0, policy_version 125356 (0.0032) [2024-04-26 08:05:47,690][47288] Updated weights for policy 0, policy_version 125366 (0.0030) [2024-04-26 08:05:48,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2054062080. Throughput: 0: 56419.2. Samples: 2003430920. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:48,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 08:05:50,409][47288] Updated weights for policy 0, policy_version 125376 (0.0027) [2024-04-26 08:05:53,360][47288] Updated weights for policy 0, policy_version 125386 (0.0029) [2024-04-26 08:05:53,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2054340608. Throughput: 0: 56480.5. Samples: 2003767700. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:53,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 08:05:56,401][47288] Updated weights for policy 0, policy_version 125396 (0.0029) [2024-04-26 08:05:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2054619136. Throughput: 0: 56571.9. Samples: 2003936140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:05:58,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 08:05:59,045][47288] Updated weights for policy 0, policy_version 125406 (0.0032) [2024-04-26 08:06:02,244][47288] Updated weights for policy 0, policy_version 125416 (0.0026) [2024-04-26 08:06:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56649.7). Total num frames: 2054914048. Throughput: 0: 56610.1. Samples: 2004277000. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:06:03,923][47056] Avg episode reward: [(0, '0.600')] [2024-04-26 08:06:04,763][47288] Updated weights for policy 0, policy_version 125426 (0.0027) [2024-04-26 08:06:07,932][47288] Updated weights for policy 0, policy_version 125436 (0.0029) [2024-04-26 08:06:08,923][47056] Fps is (10 sec: 58982.2, 60 sec: 57073.3, 300 sec: 56649.8). Total num frames: 2055208960. Throughput: 0: 56641.8. Samples: 2004623060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 25.0) [2024-04-26 08:06:08,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 08:06:10,701][47288] Updated weights for policy 0, policy_version 125446 (0.0027) [2024-04-26 08:06:13,619][47288] Updated weights for policy 0, policy_version 125456 (0.0032) [2024-04-26 08:06:13,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2055471104. Throughput: 0: 56694.4. Samples: 2004792440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:13,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 08:06:16,501][47288] Updated weights for policy 0, policy_version 125466 (0.0028) [2024-04-26 08:06:18,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56797.8, 300 sec: 56705.3). Total num frames: 2055766016. Throughput: 0: 56705.4. Samples: 2005129700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:18,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:06:19,406][47288] Updated weights for policy 0, policy_version 125476 (0.0031) [2024-04-26 08:06:22,202][47288] Updated weights for policy 0, policy_version 125486 (0.0029) [2024-04-26 08:06:23,923][47056] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 56594.3). Total num frames: 2056028160. Throughput: 0: 56627.3. Samples: 2005467560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:23,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 08:06:25,298][47288] Updated weights for policy 0, policy_version 125496 (0.0037) [2024-04-26 08:06:27,870][47288] Updated weights for policy 0, policy_version 125506 (0.0027) [2024-04-26 08:06:28,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2056323072. Throughput: 0: 56541.3. Samples: 2005635340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:28,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 08:06:31,092][47288] Updated weights for policy 0, policy_version 125516 (0.0029) [2024-04-26 08:06:33,655][47288] Updated weights for policy 0, policy_version 125526 (0.0028) [2024-04-26 08:06:33,923][47056] Fps is (10 sec: 58983.2, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2056617984. Throughput: 0: 56496.1. Samples: 2005973240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:33,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 08:06:36,763][47288] Updated weights for policy 0, policy_version 125536 (0.0027) [2024-04-26 08:06:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.6, 300 sec: 56649.7). Total num frames: 2056896512. Throughput: 0: 56520.4. Samples: 2006311120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:38,923][47056] Avg episode reward: [(0, '0.626')] [2024-04-26 08:06:39,478][47288] Updated weights for policy 0, policy_version 125546 (0.0030) [2024-04-26 08:06:42,592][47288] Updated weights for policy 0, policy_version 125556 (0.0033) [2024-04-26 08:06:43,103][47267] Signal inference workers to stop experience collection... (29850 times) [2024-04-26 08:06:43,139][47288] InferenceWorker_p0-w0: stopping experience collection (29850 times) [2024-04-26 08:06:43,193][47267] Signal inference workers to resume experience collection... (29850 times) [2024-04-26 08:06:43,193][47288] InferenceWorker_p0-w0: resuming experience collection (29850 times) [2024-04-26 08:06:43,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 2057175040. Throughput: 0: 56697.0. Samples: 2006487500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:43,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 08:06:45,199][47288] Updated weights for policy 0, policy_version 125566 (0.0027) [2024-04-26 08:06:48,378][47288] Updated weights for policy 0, policy_version 125576 (0.0031) [2024-04-26 08:06:48,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2057453568. Throughput: 0: 56708.6. Samples: 2006828880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:48,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:06:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000125578_2057469952.pth... 
[2024-04-26 08:06:49,000][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000124749_2043887616.pth [2024-04-26 08:06:50,926][47288] Updated weights for policy 0, policy_version 125586 (0.0040) [2024-04-26 08:06:53,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 56649.8). Total num frames: 2057732096. Throughput: 0: 56542.3. Samples: 2007167460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:53,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 08:06:54,349][47288] Updated weights for policy 0, policy_version 125596 (0.0034) [2024-04-26 08:06:56,600][47288] Updated weights for policy 0, policy_version 125606 (0.0033) [2024-04-26 08:06:58,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.9, 300 sec: 56705.3). Total num frames: 2058027008. Throughput: 0: 56458.5. Samples: 2007333080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:06:58,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 08:07:00,030][47288] Updated weights for policy 0, policy_version 125616 (0.0030) [2024-04-26 08:07:02,432][47288] Updated weights for policy 0, policy_version 125626 (0.0034) [2024-04-26 08:07:03,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 56594.2). Total num frames: 2058289152. Throughput: 0: 56377.8. Samples: 2007666700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:07:03,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 08:07:05,951][47288] Updated weights for policy 0, policy_version 125636 (0.0034) [2024-04-26 08:07:08,307][47288] Updated weights for policy 0, policy_version 125646 (0.0031) [2024-04-26 08:07:08,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2058600448. Throughput: 0: 56374.6. Samples: 2008004420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 08:07:08,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 08:07:11,612][47288] Updated weights for policy 0, policy_version 125656 (0.0028) [2024-04-26 08:07:13,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 2058878976. Throughput: 0: 56673.5. Samples: 2008185640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:13,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 08:07:14,371][47288] Updated weights for policy 0, policy_version 125666 (0.0025) [2024-04-26 08:07:17,507][47288] Updated weights for policy 0, policy_version 125676 (0.0028) [2024-04-26 08:07:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 2059157504. Throughput: 0: 56675.8. Samples: 2008523660. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:18,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 08:07:20,172][47288] Updated weights for policy 0, policy_version 125686 (0.0032) [2024-04-26 08:07:23,275][47288] Updated weights for policy 0, policy_version 125696 (0.0029) [2024-04-26 08:07:23,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 2059436032. Throughput: 0: 56636.4. Samples: 2008859760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:23,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 08:07:25,840][47288] Updated weights for policy 0, policy_version 125706 (0.0031) [2024-04-26 08:07:28,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 56649.7). Total num frames: 2059714560. Throughput: 0: 56488.8. Samples: 2009029500. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:28,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 08:07:29,001][47288] Updated weights for policy 0, policy_version 125716 (0.0028) [2024-04-26 08:07:31,723][47288] Updated weights for policy 0, policy_version 125726 (0.0023) [2024-04-26 08:07:33,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.6, 300 sec: 56649.8). Total num frames: 2059993088. Throughput: 0: 56390.5. Samples: 2009366460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:33,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 08:07:34,868][47288] Updated weights for policy 0, policy_version 125736 (0.0030) [2024-04-26 08:07:37,460][47288] Updated weights for policy 0, policy_version 125746 (0.0034) [2024-04-26 08:07:38,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56251.6, 300 sec: 56538.7). Total num frames: 2060271616. Throughput: 0: 56426.0. Samples: 2009706640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:38,923][47056] Avg episode reward: [(0, '0.460')] [2024-04-26 08:07:40,689][47288] Updated weights for policy 0, policy_version 125756 (0.0026) [2024-04-26 08:07:43,359][47288] Updated weights for policy 0, policy_version 125766 (0.0032) [2024-04-26 08:07:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.6, 300 sec: 56594.2). Total num frames: 2060550144. Throughput: 0: 56413.3. Samples: 2009871680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:43,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 08:07:46,621][47288] Updated weights for policy 0, policy_version 125776 (0.0033) [2024-04-26 08:07:48,923][47056] Fps is (10 sec: 57345.3, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2060845056. Throughput: 0: 56490.6. Samples: 2010208780. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:48,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 08:07:49,134][47288] Updated weights for policy 0, policy_version 125786 (0.0038) [2024-04-26 08:07:52,493][47288] Updated weights for policy 0, policy_version 125796 (0.0028) [2024-04-26 08:07:53,923][47056] Fps is (10 sec: 58983.2, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2061139968. Throughput: 0: 56478.4. Samples: 2010545940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:53,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 08:07:54,966][47288] Updated weights for policy 0, policy_version 125806 (0.0030) [2024-04-26 08:07:58,410][47267] Signal inference workers to stop experience collection... (29900 times) [2024-04-26 08:07:58,411][47267] Signal inference workers to resume experience collection... (29900 times) [2024-04-26 08:07:58,415][47288] Updated weights for policy 0, policy_version 125816 (0.0032) [2024-04-26 08:07:58,424][47288] InferenceWorker_p0-w0: stopping experience collection (29900 times) [2024-04-26 08:07:58,425][47288] InferenceWorker_p0-w0: resuming experience collection (29900 times) [2024-04-26 08:07:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2061402112. Throughput: 0: 56270.5. Samples: 2010717820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:07:58,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 08:08:01,017][47288] Updated weights for policy 0, policy_version 125826 (0.0028) [2024-04-26 08:08:03,923][47056] Fps is (10 sec: 52428.9, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2061664256. Throughput: 0: 56144.2. Samples: 2011050140. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:08:03,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 08:08:04,149][47288] Updated weights for policy 0, policy_version 125836 (0.0027) [2024-04-26 08:08:07,116][47288] Updated weights for policy 0, policy_version 125846 (0.0034) [2024-04-26 08:08:08,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 56594.2). Total num frames: 2061959168. Throughput: 0: 56217.8. Samples: 2011389560. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:08:08,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 08:08:09,817][47288] Updated weights for policy 0, policy_version 125856 (0.0033) [2024-04-26 08:08:12,787][47288] Updated weights for policy 0, policy_version 125866 (0.0031) [2024-04-26 08:08:13,923][47056] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2062237696. Throughput: 0: 56185.4. Samples: 2011557840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 08:08:13,923][47056] Avg episode reward: [(0, '0.624')] [2024-04-26 08:08:15,654][47288] Updated weights for policy 0, policy_version 125876 (0.0031) [2024-04-26 08:08:18,631][47288] Updated weights for policy 0, policy_version 125886 (0.0033) [2024-04-26 08:08:18,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 2062532608. Throughput: 0: 56329.4. Samples: 2011901280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:08:18,923][47056] Avg episode reward: [(0, '0.628')] [2024-04-26 08:08:21,489][47288] Updated weights for policy 0, policy_version 125896 (0.0031) [2024-04-26 08:08:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2062811136. Throughput: 0: 56306.0. Samples: 2012240400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:08:23,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 08:08:24,616][47288] Updated weights for policy 0, policy_version 125906 (0.0027) [2024-04-26 08:08:27,258][47288] Updated weights for policy 0, policy_version 125916 (0.0025) [2024-04-26 08:08:28,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 2063122432. Throughput: 0: 56510.2. Samples: 2012414640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:08:28,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 08:08:30,239][47288] Updated weights for policy 0, policy_version 125926 (0.0030) [2024-04-26 08:08:33,013][47288] Updated weights for policy 0, policy_version 125936 (0.0026) [2024-04-26 08:08:33,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56525.0, 300 sec: 56427.6). Total num frames: 2063384576. Throughput: 0: 56509.5. Samples: 2012751700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:08:33,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 08:08:36,093][47288] Updated weights for policy 0, policy_version 125946 (0.0030) [2024-04-26 08:08:38,751][47288] Updated weights for policy 0, policy_version 125956 (0.0028) [2024-04-26 08:08:38,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56525.0, 300 sec: 56483.1). Total num frames: 2063663104. Throughput: 0: 56661.3. Samples: 2013095700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:08:38,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 08:08:41,850][47288] Updated weights for policy 0, policy_version 125966 (0.0025) [2024-04-26 08:08:43,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56798.1, 300 sec: 56538.7). Total num frames: 2063958016. Throughput: 0: 56725.2. Samples: 2013270440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:08:43,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 08:08:44,599][47288] Updated weights for policy 0, policy_version 125976 (0.0028) [2024-04-26 08:08:47,597][47288] Updated weights for policy 0, policy_version 125986 (0.0036) [2024-04-26 08:08:48,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56524.6, 300 sec: 56594.2). Total num frames: 2064236544. Throughput: 0: 56877.9. Samples: 2013609660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:08:48,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 08:08:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000125991_2064236544.pth... [2024-04-26 08:08:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000125164_2050686976.pth [2024-04-26 08:08:50,325][47288] Updated weights for policy 0, policy_version 125996 (0.0030) [2024-04-26 08:08:53,539][47288] Updated weights for policy 0, policy_version 126006 (0.0031) [2024-04-26 08:08:53,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2064515072. Throughput: 0: 56818.7. Samples: 2013946400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:08:53,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 08:08:56,011][47288] Updated weights for policy 0, policy_version 126016 (0.0025) [2024-04-26 08:08:58,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2064777216. Throughput: 0: 56729.3. Samples: 2014110660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:08:58,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:08:59,231][47288] Updated weights for policy 0, policy_version 126026 (0.0027) [2024-04-26 08:09:01,893][47288] Updated weights for policy 0, policy_version 126036 (0.0024) [2024-04-26 08:09:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56797.8, 300 sec: 56538.7). 
Total num frames: 2065072128. Throughput: 0: 56584.1. Samples: 2014447560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:09:03,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:09:04,954][47288] Updated weights for policy 0, policy_version 126046 (0.0026) [2024-04-26 08:09:07,611][47288] Updated weights for policy 0, policy_version 126056 (0.0027) [2024-04-26 08:09:08,631][47267] Signal inference workers to stop experience collection... (29950 times) [2024-04-26 08:09:08,631][47267] Signal inference workers to resume experience collection... (29950 times) [2024-04-26 08:09:08,661][47288] InferenceWorker_p0-w0: stopping experience collection (29950 times) [2024-04-26 08:09:08,661][47288] InferenceWorker_p0-w0: resuming experience collection (29950 times) [2024-04-26 08:09:08,923][47056] Fps is (10 sec: 60621.3, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 2065383424. Throughput: 0: 56560.1. Samples: 2014785600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:09:08,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 08:09:10,749][47288] Updated weights for policy 0, policy_version 126066 (0.0033) [2024-04-26 08:09:13,451][47288] Updated weights for policy 0, policy_version 126076 (0.0030) [2024-04-26 08:09:13,923][47056] Fps is (10 sec: 57342.6, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 2065645568. Throughput: 0: 56466.1. Samples: 2014955620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 08:09:13,923][47056] Avg episode reward: [(0, '0.575')] [2024-04-26 08:09:16,557][47288] Updated weights for policy 0, policy_version 126086 (0.0025) [2024-04-26 08:09:18,923][47056] Fps is (10 sec: 54065.7, 60 sec: 56524.6, 300 sec: 56483.1). Total num frames: 2065924096. Throughput: 0: 56600.4. Samples: 2015298740. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:09:18,924][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:09:19,212][47288] Updated weights for policy 0, policy_version 126096 (0.0033) [2024-04-26 08:09:22,370][47288] Updated weights for policy 0, policy_version 126106 (0.0034) [2024-04-26 08:09:23,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56797.7, 300 sec: 56594.2). Total num frames: 2066219008. Throughput: 0: 56413.0. Samples: 2015634300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:09:23,924][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:09:25,069][47288] Updated weights for policy 0, policy_version 126116 (0.0028) [2024-04-26 08:09:28,113][47288] Updated weights for policy 0, policy_version 126126 (0.0033) [2024-04-26 08:09:28,923][47056] Fps is (10 sec: 57345.9, 60 sec: 56251.9, 300 sec: 56594.2). Total num frames: 2066497536. Throughput: 0: 56316.8. Samples: 2015804700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:09:28,923][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 08:09:30,893][47288] Updated weights for policy 0, policy_version 126136 (0.0031) [2024-04-26 08:09:33,814][47288] Updated weights for policy 0, policy_version 126146 (0.0028) [2024-04-26 08:09:33,923][47056] Fps is (10 sec: 55707.2, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2066776064. Throughput: 0: 56356.2. Samples: 2016145680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:09:33,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 08:09:36,593][47288] Updated weights for policy 0, policy_version 126156 (0.0036) [2024-04-26 08:09:38,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.7, 300 sec: 56483.2). Total num frames: 2067038208. Throughput: 0: 56424.0. Samples: 2016485480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:09:38,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 08:09:39,765][47288] Updated weights for policy 0, policy_version 126166 (0.0027) [2024-04-26 08:09:42,526][47288] Updated weights for policy 0, policy_version 126176 (0.0026) [2024-04-26 08:09:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.6, 300 sec: 56483.2). Total num frames: 2067333120. Throughput: 0: 56405.9. Samples: 2016648920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:09:43,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 08:09:45,596][47288] Updated weights for policy 0, policy_version 126186 (0.0026) [2024-04-26 08:09:48,192][47288] Updated weights for policy 0, policy_version 126196 (0.0025) [2024-04-26 08:09:48,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 2067628032. Throughput: 0: 56551.4. Samples: 2016992380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:09:48,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 08:09:51,216][47288] Updated weights for policy 0, policy_version 126206 (0.0028) [2024-04-26 08:09:53,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2067906560. Throughput: 0: 56574.6. Samples: 2017331460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:09:53,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 08:09:53,940][47288] Updated weights for policy 0, policy_version 126216 (0.0028) [2024-04-26 08:09:56,894][47288] Updated weights for policy 0, policy_version 126226 (0.0026) [2024-04-26 08:09:58,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 2068185088. Throughput: 0: 56647.8. Samples: 2017504760. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:09:58,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 08:09:59,795][47288] Updated weights for policy 0, policy_version 126236 (0.0028) [2024-04-26 08:10:02,716][47288] Updated weights for policy 0, policy_version 126246 (0.0030) [2024-04-26 08:10:03,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.7, 300 sec: 56594.7). Total num frames: 2068480000. Throughput: 0: 56464.6. Samples: 2017839640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:10:03,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 08:10:05,572][47288] Updated weights for policy 0, policy_version 126256 (0.0029) [2024-04-26 08:10:08,532][47288] Updated weights for policy 0, policy_version 126266 (0.0030) [2024-04-26 08:10:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 2068758528. Throughput: 0: 56494.2. Samples: 2018176520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:10:08,923][47056] Avg episode reward: [(0, '0.597')] [2024-04-26 08:10:11,360][47288] Updated weights for policy 0, policy_version 126276 (0.0029) [2024-04-26 08:10:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.9, 300 sec: 56538.6). Total num frames: 2069037056. Throughput: 0: 56405.1. Samples: 2018342940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:10:13,924][47056] Avg episode reward: [(0, '0.508')] [2024-04-26 08:10:14,303][47288] Updated weights for policy 0, policy_version 126286 (0.0034) [2024-04-26 08:10:14,963][47267] Signal inference workers to stop experience collection... (30000 times) [2024-04-26 08:10:14,964][47267] Signal inference workers to resume experience collection... 
(30000 times) [2024-04-26 08:10:14,990][47288] InferenceWorker_p0-w0: stopping experience collection (30000 times) [2024-04-26 08:10:14,990][47288] InferenceWorker_p0-w0: resuming experience collection (30000 times) [2024-04-26 08:10:17,035][47288] Updated weights for policy 0, policy_version 126296 (0.0034) [2024-04-26 08:10:18,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56252.0, 300 sec: 56427.6). Total num frames: 2069299200. Throughput: 0: 56431.1. Samples: 2018685080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 08:10:18,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 08:10:20,024][47288] Updated weights for policy 0, policy_version 126306 (0.0027) [2024-04-26 08:10:22,858][47288] Updated weights for policy 0, policy_version 126316 (0.0035) [2024-04-26 08:10:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2069594112. Throughput: 0: 56430.0. Samples: 2019024840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:10:23,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 08:10:25,807][47288] Updated weights for policy 0, policy_version 126326 (0.0026) [2024-04-26 08:10:28,756][47288] Updated weights for policy 0, policy_version 126336 (0.0025) [2024-04-26 08:10:28,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56524.7, 300 sec: 56483.2). Total num frames: 2069889024. Throughput: 0: 56504.9. Samples: 2019191640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:10:28,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:10:31,986][47288] Updated weights for policy 0, policy_version 126346 (0.0028) [2024-04-26 08:10:33,923][47056] Fps is (10 sec: 57345.5, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2070167552. Throughput: 0: 56377.5. Samples: 2019529360. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:10:33,923][47056] Avg episode reward: [(0, '0.629')] [2024-04-26 08:10:34,528][47288] Updated weights for policy 0, policy_version 126356 (0.0031) [2024-04-26 08:10:37,659][47288] Updated weights for policy 0, policy_version 126366 (0.0035) [2024-04-26 08:10:38,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 2070446080. Throughput: 0: 56415.7. Samples: 2019870160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:10:38,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 08:10:40,218][47288] Updated weights for policy 0, policy_version 126376 (0.0029) [2024-04-26 08:10:43,328][47288] Updated weights for policy 0, policy_version 126386 (0.0027) [2024-04-26 08:10:43,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 2070740992. Throughput: 0: 56419.5. Samples: 2020043640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:10:43,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 08:10:46,160][47288] Updated weights for policy 0, policy_version 126396 (0.0032) [2024-04-26 08:10:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2071019520. Throughput: 0: 56575.7. Samples: 2020385540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:10:48,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 08:10:48,937][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000126405_2071019520.pth... [2024-04-26 08:10:49,122][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000125578_2057469952.pth [2024-04-26 08:10:49,293][47288] Updated weights for policy 0, policy_version 126406 (0.0028) [2024-04-26 08:10:52,258][47288] Updated weights for policy 0, policy_version 126416 (0.0037) [2024-04-26 08:10:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 56483.1). 
Total num frames: 2071281664. Throughput: 0: 56507.5. Samples: 2020719360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:10:53,923][47056] Avg episode reward: [(0, '0.582')] [2024-04-26 08:10:54,798][47288] Updated weights for policy 0, policy_version 126426 (0.0027) [2024-04-26 08:10:58,168][47288] Updated weights for policy 0, policy_version 126436 (0.0027) [2024-04-26 08:10:58,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 2071543808. Throughput: 0: 56409.5. Samples: 2020881360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:10:58,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 08:11:00,617][47288] Updated weights for policy 0, policy_version 126446 (0.0024) [2024-04-26 08:11:03,838][47288] Updated weights for policy 0, policy_version 126456 (0.0030) [2024-04-26 08:11:03,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2071855104. Throughput: 0: 56441.7. Samples: 2021224960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:11:03,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 08:11:06,452][47288] Updated weights for policy 0, policy_version 126466 (0.0027) [2024-04-26 08:11:08,923][47056] Fps is (10 sec: 60620.8, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2072150016. Throughput: 0: 56443.4. Samples: 2021564780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:11:08,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 08:11:09,575][47288] Updated weights for policy 0, policy_version 126476 (0.0031) [2024-04-26 08:11:12,394][47288] Updated weights for policy 0, policy_version 126486 (0.0031) [2024-04-26 08:11:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 2072428544. Throughput: 0: 56447.9. Samples: 2021731800. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:11:13,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 08:11:15,496][47288] Updated weights for policy 0, policy_version 126496 (0.0038) [2024-04-26 08:11:18,260][47288] Updated weights for policy 0, policy_version 126506 (0.0026) [2024-04-26 08:11:18,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2072707072. Throughput: 0: 56546.6. Samples: 2022073960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 08:11:18,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 08:11:21,283][47288] Updated weights for policy 0, policy_version 126516 (0.0034) [2024-04-26 08:11:23,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56525.0, 300 sec: 56483.2). Total num frames: 2072985600. Throughput: 0: 56446.2. Samples: 2022410240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:11:23,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 08:11:23,955][47288] Updated weights for policy 0, policy_version 126526 (0.0027) [2024-04-26 08:11:24,696][47267] Signal inference workers to stop experience collection... (30050 times) [2024-04-26 08:11:24,696][47267] Signal inference workers to resume experience collection... (30050 times) [2024-04-26 08:11:24,723][47288] InferenceWorker_p0-w0: stopping experience collection (30050 times) [2024-04-26 08:11:24,723][47288] InferenceWorker_p0-w0: resuming experience collection (30050 times) [2024-04-26 08:11:27,070][47288] Updated weights for policy 0, policy_version 126536 (0.0030) [2024-04-26 08:11:28,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2073264128. Throughput: 0: 56292.9. Samples: 2022576820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:11:28,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 08:11:29,652][47288] Updated weights for policy 0, policy_version 126546 (0.0027) [2024-04-26 08:11:32,864][47288] Updated weights for policy 0, policy_version 126556 (0.0027) [2024-04-26 08:11:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2073542656. Throughput: 0: 56424.5. Samples: 2022924640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:11:33,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:11:35,531][47288] Updated weights for policy 0, policy_version 126566 (0.0027) [2024-04-26 08:11:38,923][47056] Fps is (10 sec: 52428.5, 60 sec: 55705.4, 300 sec: 56316.5). Total num frames: 2073788416. Throughput: 0: 56557.7. Samples: 2023264460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:11:38,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 08:11:39,106][47288] Updated weights for policy 0, policy_version 126576 (0.0026) [2024-04-26 08:11:41,257][47288] Updated weights for policy 0, policy_version 126586 (0.0032) [2024-04-26 08:11:43,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2074099712. Throughput: 0: 56442.2. Samples: 2023421260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:11:43,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:11:44,843][47288] Updated weights for policy 0, policy_version 126596 (0.0030) [2024-04-26 08:11:46,976][47288] Updated weights for policy 0, policy_version 126606 (0.0029) [2024-04-26 08:11:48,923][47056] Fps is (10 sec: 62258.6, 60 sec: 56524.6, 300 sec: 56538.7). Total num frames: 2074411008. Throughput: 0: 56450.5. Samples: 2023765240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:11:48,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 08:11:50,478][47288] Updated weights for policy 0, policy_version 126616 (0.0027) [2024-04-26 08:11:52,821][47288] Updated weights for policy 0, policy_version 126626 (0.0029) [2024-04-26 08:11:53,923][47056] Fps is (10 sec: 60620.5, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 2074705920. Throughput: 0: 56291.5. Samples: 2024097900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:11:53,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 08:11:56,225][47288] Updated weights for policy 0, policy_version 126636 (0.0032) [2024-04-26 08:11:58,841][47288] Updated weights for policy 0, policy_version 126646 (0.0030) [2024-04-26 08:11:58,923][47056] Fps is (10 sec: 55705.9, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 2074968064. Throughput: 0: 56683.0. Samples: 2024282540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:11:58,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 08:12:02,079][47288] Updated weights for policy 0, policy_version 126656 (0.0028) [2024-04-26 08:12:03,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2075246592. Throughput: 0: 56560.8. Samples: 2024619200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:12:03,923][47056] Avg episode reward: [(0, '0.612')] [2024-04-26 08:12:04,668][47288] Updated weights for policy 0, policy_version 126666 (0.0030) [2024-04-26 08:12:07,864][47288] Updated weights for policy 0, policy_version 126676 (0.0035) [2024-04-26 08:12:08,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2075525120. Throughput: 0: 56612.8. Samples: 2024957820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:12:08,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 08:12:10,442][47288] Updated weights for policy 0, policy_version 126686 (0.0031) [2024-04-26 08:12:13,595][47288] Updated weights for policy 0, policy_version 126696 (0.0032) [2024-04-26 08:12:13,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 2075787264. Throughput: 0: 56484.2. Samples: 2025118600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:12:13,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 08:12:16,274][47288] Updated weights for policy 0, policy_version 126706 (0.0028) [2024-04-26 08:12:18,923][47056] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 2076049408. Throughput: 0: 56163.5. Samples: 2025452000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 08:12:18,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:12:19,684][47288] Updated weights for policy 0, policy_version 126716 (0.0027) [2024-04-26 08:12:22,038][47288] Updated weights for policy 0, policy_version 126726 (0.0026) [2024-04-26 08:12:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2076360704. Throughput: 0: 56147.3. Samples: 2025791080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:12:23,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 08:12:25,435][47288] Updated weights for policy 0, policy_version 126736 (0.0035) [2024-04-26 08:12:26,332][47267] Signal inference workers to stop experience collection... (30100 times) [2024-04-26 08:12:26,356][47288] InferenceWorker_p0-w0: stopping experience collection (30100 times) [2024-04-26 08:12:26,425][47267] Signal inference workers to resume experience collection... 
(30100 times) [2024-04-26 08:12:26,426][47288] InferenceWorker_p0-w0: resuming experience collection (30100 times) [2024-04-26 08:12:27,820][47288] Updated weights for policy 0, policy_version 126746 (0.0032) [2024-04-26 08:12:28,923][47056] Fps is (10 sec: 62259.1, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2076672000. Throughput: 0: 56357.3. Samples: 2025957340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:12:28,923][47056] Avg episode reward: [(0, '0.579')] [2024-04-26 08:12:31,299][47288] Updated weights for policy 0, policy_version 126756 (0.0028) [2024-04-26 08:12:33,473][47288] Updated weights for policy 0, policy_version 126766 (0.0026) [2024-04-26 08:12:33,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 2076950528. Throughput: 0: 56227.2. Samples: 2026295460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:12:33,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 08:12:37,199][47288] Updated weights for policy 0, policy_version 126776 (0.0029) [2024-04-26 08:12:38,923][47056] Fps is (10 sec: 54067.5, 60 sec: 57071.1, 300 sec: 56483.2). Total num frames: 2077212672. Throughput: 0: 56338.8. Samples: 2026633140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:12:38,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 08:12:39,312][47288] Updated weights for policy 0, policy_version 126786 (0.0025) [2024-04-26 08:12:42,902][47288] Updated weights for policy 0, policy_version 126796 (0.0025) [2024-04-26 08:12:43,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2077491200. Throughput: 0: 56082.5. Samples: 2026806240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:12:43,923][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 08:12:45,313][47288] Updated weights for policy 0, policy_version 126806 (0.0025) [2024-04-26 08:12:48,791][47288] Updated weights for policy 0, policy_version 126816 (0.0028) [2024-04-26 08:12:48,923][47056] Fps is (10 sec: 54066.5, 60 sec: 55705.7, 300 sec: 56316.5). Total num frames: 2077753344. Throughput: 0: 55986.6. Samples: 2027138600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:12:48,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 08:12:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000126816_2077753344.pth... [2024-04-26 08:12:48,985][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000125991_2064236544.pth [2024-04-26 08:12:51,047][47288] Updated weights for policy 0, policy_version 126826 (0.0026) [2024-04-26 08:12:53,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 56427.6). Total num frames: 2078048256. Throughput: 0: 56045.4. Samples: 2027479860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:12:53,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 08:12:54,569][47288] Updated weights for policy 0, policy_version 126836 (0.0025) [2024-04-26 08:12:56,819][47288] Updated weights for policy 0, policy_version 126846 (0.0030) [2024-04-26 08:12:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 56483.1). Total num frames: 2078326784. Throughput: 0: 56055.0. Samples: 2027641080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:12:58,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 08:13:00,492][47288] Updated weights for policy 0, policy_version 126856 (0.0027) [2024-04-26 08:13:02,680][47288] Updated weights for policy 0, policy_version 126866 (0.0027) [2024-04-26 08:13:03,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 56427.6). 
Total num frames: 2078605312. Throughput: 0: 56012.9. Samples: 2027972580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:13:03,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 08:13:06,369][47288] Updated weights for policy 0, policy_version 126876 (0.0033) [2024-04-26 08:13:08,346][47288] Updated weights for policy 0, policy_version 126886 (0.0031) [2024-04-26 08:13:08,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2078900224. Throughput: 0: 55949.4. Samples: 2028308800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:13:08,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 08:13:12,198][47288] Updated weights for policy 0, policy_version 126896 (0.0034) [2024-04-26 08:13:13,923][47056] Fps is (10 sec: 60620.8, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 2079211520. Throughput: 0: 56454.3. Samples: 2028497780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:13:13,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:13:14,275][47288] Updated weights for policy 0, policy_version 126906 (0.0037) [2024-04-26 08:13:17,911][47288] Updated weights for policy 0, policy_version 126916 (0.0031) [2024-04-26 08:13:18,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56797.8, 300 sec: 56427.6). Total num frames: 2079457280. Throughput: 0: 56443.2. Samples: 2028835400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:13:18,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 08:13:20,337][47288] Updated weights for policy 0, policy_version 126926 (0.0031) [2024-04-26 08:13:23,695][47288] Updated weights for policy 0, policy_version 126936 (0.0028) [2024-04-26 08:13:23,923][47056] Fps is (10 sec: 52428.6, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 2079735808. Throughput: 0: 56319.1. Samples: 2029167500. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:13:23,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 08:13:26,279][47288] Updated weights for policy 0, policy_version 126946 (0.0033) [2024-04-26 08:13:28,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 56316.5). Total num frames: 2079997952. Throughput: 0: 56103.5. Samples: 2029330900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:13:28,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 08:13:29,413][47288] Updated weights for policy 0, policy_version 126956 (0.0023) [2024-04-26 08:13:32,090][47288] Updated weights for policy 0, policy_version 126966 (0.0035) [2024-04-26 08:13:33,923][47056] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 2080309248. Throughput: 0: 56195.0. Samples: 2029667380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:13:33,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 08:13:35,293][47288] Updated weights for policy 0, policy_version 126976 (0.0029) [2024-04-26 08:13:37,785][47288] Updated weights for policy 0, policy_version 126986 (0.0031) [2024-04-26 08:13:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 2080555008. Throughput: 0: 56150.7. Samples: 2030006640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:13:38,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:13:40,939][47267] Signal inference workers to stop experience collection... (30150 times) [2024-04-26 08:13:40,939][47267] Signal inference workers to resume experience collection... 
(30150 times) [2024-04-26 08:13:40,949][47288] InferenceWorker_p0-w0: stopping experience collection (30150 times) [2024-04-26 08:13:40,949][47288] InferenceWorker_p0-w0: resuming experience collection (30150 times) [2024-04-26 08:13:41,047][47288] Updated weights for policy 0, policy_version 126996 (0.0026) [2024-04-26 08:13:43,699][47288] Updated weights for policy 0, policy_version 127006 (0.0030) [2024-04-26 08:13:43,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 2080882688. Throughput: 0: 56361.3. Samples: 2030177340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:13:43,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 08:13:46,899][47288] Updated weights for policy 0, policy_version 127016 (0.0031) [2024-04-26 08:13:48,923][47056] Fps is (10 sec: 60620.1, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 2081161216. Throughput: 0: 56420.8. Samples: 2030511520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:13:48,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 08:13:49,424][47288] Updated weights for policy 0, policy_version 127026 (0.0029) [2024-04-26 08:13:52,811][47288] Updated weights for policy 0, policy_version 127036 (0.0035) [2024-04-26 08:13:53,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56524.6, 300 sec: 56483.1). Total num frames: 2081439744. Throughput: 0: 56482.8. Samples: 2030850540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:13:53,924][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 08:13:55,064][47288] Updated weights for policy 0, policy_version 127046 (0.0030) [2024-04-26 08:13:58,669][47288] Updated weights for policy 0, policy_version 127056 (0.0029) [2024-04-26 08:13:58,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2081701888. Throughput: 0: 56180.8. Samples: 2031025920. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:13:58,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 08:14:00,853][47288] Updated weights for policy 0, policy_version 127066 (0.0023) [2024-04-26 08:14:03,923][47056] Fps is (10 sec: 52430.3, 60 sec: 55978.7, 300 sec: 56205.5). Total num frames: 2081964032. Throughput: 0: 56069.5. Samples: 2031358520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:14:03,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 08:14:04,414][47288] Updated weights for policy 0, policy_version 127076 (0.0036) [2024-04-26 08:14:06,844][47288] Updated weights for policy 0, policy_version 127086 (0.0029) [2024-04-26 08:14:08,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.5, 300 sec: 56316.6). Total num frames: 2082258944. Throughput: 0: 56087.0. Samples: 2031691420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:14:08,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 08:14:10,212][47288] Updated weights for policy 0, policy_version 127096 (0.0027) [2024-04-26 08:14:12,502][47288] Updated weights for policy 0, policy_version 127106 (0.0026) [2024-04-26 08:14:13,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55432.4, 300 sec: 56316.6). Total num frames: 2082537472. Throughput: 0: 56195.0. Samples: 2031859680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:14:13,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 08:14:16,016][47288] Updated weights for policy 0, policy_version 127116 (0.0026) [2024-04-26 08:14:18,516][47288] Updated weights for policy 0, policy_version 127126 (0.0026) [2024-04-26 08:14:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 56316.6). Total num frames: 2082832384. Throughput: 0: 56274.8. Samples: 2032199740. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:14:18,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 08:14:21,771][47288] Updated weights for policy 0, policy_version 127136 (0.0027) [2024-04-26 08:14:23,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 2083127296. Throughput: 0: 56297.6. Samples: 2032540040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 08:14:23,924][47056] Avg episode reward: [(0, '0.531')] [2024-04-26 08:14:24,322][47288] Updated weights for policy 0, policy_version 127146 (0.0032) [2024-04-26 08:14:27,519][47288] Updated weights for policy 0, policy_version 127156 (0.0031) [2024-04-26 08:14:28,923][47056] Fps is (10 sec: 58982.0, 60 sec: 57070.8, 300 sec: 56427.6). Total num frames: 2083422208. Throughput: 0: 56347.6. Samples: 2032712980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:14:28,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 08:14:30,037][47288] Updated weights for policy 0, policy_version 127166 (0.0033) [2024-04-26 08:14:33,244][47267] Signal inference workers to stop experience collection... (30200 times) [2024-04-26 08:14:33,250][47267] Signal inference workers to resume experience collection... (30200 times) [2024-04-26 08:14:33,275][47288] InferenceWorker_p0-w0: stopping experience collection (30200 times) [2024-04-26 08:14:33,275][47288] InferenceWorker_p0-w0: resuming experience collection (30200 times) [2024-04-26 08:14:33,360][47288] Updated weights for policy 0, policy_version 127176 (0.0030) [2024-04-26 08:14:33,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56251.9, 300 sec: 56427.6). Total num frames: 2083684352. Throughput: 0: 56496.6. Samples: 2033053860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:14:33,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 08:14:35,715][47288] Updated weights for policy 0, policy_version 127186 (0.0029) [2024-04-26 08:14:38,923][47056] Fps is (10 sec: 52429.0, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 2083946496. Throughput: 0: 56528.2. Samples: 2033394300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:14:38,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:14:39,222][47288] Updated weights for policy 0, policy_version 127196 (0.0029) [2024-04-26 08:14:41,604][47288] Updated weights for policy 0, policy_version 127206 (0.0035) [2024-04-26 08:14:43,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.8, 300 sec: 56261.0). Total num frames: 2084225024. Throughput: 0: 56164.6. Samples: 2033553320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:14:43,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:14:45,099][47288] Updated weights for policy 0, policy_version 127216 (0.0025) [2024-04-26 08:14:47,527][47288] Updated weights for policy 0, policy_version 127226 (0.0029) [2024-04-26 08:14:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 2084519936. Throughput: 0: 56298.5. Samples: 2033891960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:14:48,923][47056] Avg episode reward: [(0, '0.581')] [2024-04-26 08:14:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000127229_2084519936.pth... [2024-04-26 08:14:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000126405_2071019520.pth [2024-04-26 08:14:50,823][47288] Updated weights for policy 0, policy_version 127236 (0.0027) [2024-04-26 08:14:53,792][47288] Updated weights for policy 0, policy_version 127246 (0.0032) [2024-04-26 08:14:53,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.9, 300 sec: 56372.1). 
Total num frames: 2084814848. Throughput: 0: 56440.6. Samples: 2034231240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:14:53,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 08:14:56,586][47288] Updated weights for policy 0, policy_version 127256 (0.0024) [2024-04-26 08:14:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 2085109760. Throughput: 0: 56532.4. Samples: 2034403640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:14:58,924][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 08:14:59,488][47288] Updated weights for policy 0, policy_version 127266 (0.0031) [2024-04-26 08:15:02,419][47288] Updated weights for policy 0, policy_version 127276 (0.0033) [2024-04-26 08:15:03,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 2085371904. Throughput: 0: 56422.3. Samples: 2034738740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:15:03,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 08:15:05,315][47288] Updated weights for policy 0, policy_version 127286 (0.0026) [2024-04-26 08:15:08,009][47288] Updated weights for policy 0, policy_version 127296 (0.0031) [2024-04-26 08:15:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 2085666816. Throughput: 0: 56432.1. Samples: 2035079480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:15:08,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 08:15:10,994][47288] Updated weights for policy 0, policy_version 127306 (0.0031) [2024-04-26 08:15:13,899][47288] Updated weights for policy 0, policy_version 127316 (0.0026) [2024-04-26 08:15:13,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 2085945344. Throughput: 0: 56473.9. Samples: 2035254300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:15:13,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 08:15:16,812][47288] Updated weights for policy 0, policy_version 127326 (0.0032) [2024-04-26 08:15:18,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2086223872. Throughput: 0: 56420.4. Samples: 2035592780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:15:18,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:15:19,573][47288] Updated weights for policy 0, policy_version 127336 (0.0029) [2024-04-26 08:15:22,595][47288] Updated weights for policy 0, policy_version 127346 (0.0030) [2024-04-26 08:15:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 2086502400. Throughput: 0: 56314.6. Samples: 2035928460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:15:23,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 08:15:25,375][47288] Updated weights for policy 0, policy_version 127356 (0.0028) [2024-04-26 08:15:28,398][47288] Updated weights for policy 0, policy_version 127366 (0.0037) [2024-04-26 08:15:28,923][47056] Fps is (10 sec: 57342.3, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 2086797312. Throughput: 0: 56655.6. Samples: 2036102840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 08:15:28,924][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 08:15:31,229][47288] Updated weights for policy 0, policy_version 127376 (0.0025) [2024-04-26 08:15:33,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2087075840. Throughput: 0: 56543.3. Samples: 2036436400. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:15:33,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:15:34,046][47288] Updated weights for policy 0, policy_version 127386 (0.0034) [2024-04-26 08:15:36,860][47288] Updated weights for policy 0, policy_version 127396 (0.0034) [2024-04-26 08:15:38,923][47056] Fps is (10 sec: 55706.8, 60 sec: 56797.9, 300 sec: 56316.5). Total num frames: 2087354368. Throughput: 0: 56576.4. Samples: 2036777180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:15:38,923][47056] Avg episode reward: [(0, '0.457')] [2024-04-26 08:15:39,927][47288] Updated weights for policy 0, policy_version 127406 (0.0031) [2024-04-26 08:15:42,591][47288] Updated weights for policy 0, policy_version 127416 (0.0028) [2024-04-26 08:15:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 2087632896. Throughput: 0: 56565.9. Samples: 2036949100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:15:43,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 08:15:45,659][47288] Updated weights for policy 0, policy_version 127426 (0.0027) [2024-04-26 08:15:47,757][47267] Signal inference workers to stop experience collection... (30250 times) [2024-04-26 08:15:47,796][47288] InferenceWorker_p0-w0: stopping experience collection (30250 times) [2024-04-26 08:15:47,817][47267] Signal inference workers to resume experience collection... (30250 times) [2024-04-26 08:15:47,818][47288] InferenceWorker_p0-w0: resuming experience collection (30250 times) [2024-04-26 08:15:48,479][47288] Updated weights for policy 0, policy_version 127436 (0.0030) [2024-04-26 08:15:48,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 2087927808. Throughput: 0: 56633.3. Samples: 2037287240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:15:48,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 08:15:51,453][47288] Updated weights for policy 0, policy_version 127446 (0.0027) [2024-04-26 08:15:53,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2088206336. Throughput: 0: 56641.3. Samples: 2037628340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:15:53,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 08:15:54,242][47288] Updated weights for policy 0, policy_version 127456 (0.0032) [2024-04-26 08:15:57,223][47288] Updated weights for policy 0, policy_version 127466 (0.0033) [2024-04-26 08:15:58,924][47056] Fps is (10 sec: 54060.3, 60 sec: 55977.6, 300 sec: 56316.3). Total num frames: 2088468480. Throughput: 0: 56615.3. Samples: 2037802060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:15:58,925][47056] Avg episode reward: [(0, '0.581')] [2024-04-26 08:16:00,111][47288] Updated weights for policy 0, policy_version 127476 (0.0030) [2024-04-26 08:16:03,008][47288] Updated weights for policy 0, policy_version 127486 (0.0023) [2024-04-26 08:16:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 2088779776. Throughput: 0: 56601.3. Samples: 2038139840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:16:03,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 08:16:05,858][47288] Updated weights for policy 0, policy_version 127496 (0.0034) [2024-04-26 08:16:08,650][47288] Updated weights for policy 0, policy_version 127506 (0.0028) [2024-04-26 08:16:08,923][47056] Fps is (10 sec: 60628.8, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 2089074688. Throughput: 0: 56693.6. Samples: 2038479660. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:16:08,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 08:16:11,749][47288] Updated weights for policy 0, policy_version 127516 (0.0030) [2024-04-26 08:16:13,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 2089320448. Throughput: 0: 56584.7. Samples: 2038649140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:16:13,932][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 08:16:14,382][47288] Updated weights for policy 0, policy_version 127526 (0.0033) [2024-04-26 08:16:17,482][47288] Updated weights for policy 0, policy_version 127536 (0.0029) [2024-04-26 08:16:18,923][47056] Fps is (10 sec: 52428.2, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 2089598976. Throughput: 0: 56662.1. Samples: 2038986200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:16:18,932][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 08:16:20,250][47288] Updated weights for policy 0, policy_version 127546 (0.0033) [2024-04-26 08:16:23,260][47288] Updated weights for policy 0, policy_version 127556 (0.0026) [2024-04-26 08:16:23,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 2089877504. Throughput: 0: 56614.5. Samples: 2039324840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:16:23,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 08:16:25,982][47288] Updated weights for policy 0, policy_version 127566 (0.0026) [2024-04-26 08:16:28,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56251.8, 300 sec: 56372.0). Total num frames: 2090172416. Throughput: 0: 56503.7. Samples: 2039491780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:16:28,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:16:29,094][47288] Updated weights for policy 0, policy_version 127576 (0.0024) [2024-04-26 08:16:31,840][47288] Updated weights for policy 0, policy_version 127586 (0.0028) [2024-04-26 08:16:33,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2090467328. Throughput: 0: 56476.0. Samples: 2039828660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:16:33,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 08:16:35,040][47288] Updated weights for policy 0, policy_version 127596 (0.0034) [2024-04-26 08:16:37,673][47288] Updated weights for policy 0, policy_version 127606 (0.0024) [2024-04-26 08:16:38,923][47056] Fps is (10 sec: 57345.9, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2090745856. Throughput: 0: 56412.1. Samples: 2040166880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:16:38,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:16:40,851][47288] Updated weights for policy 0, policy_version 127616 (0.0025) [2024-04-26 08:16:43,296][47288] Updated weights for policy 0, policy_version 127626 (0.0028) [2024-04-26 08:16:43,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56316.6). Total num frames: 2091024384. Throughput: 0: 56532.0. Samples: 2040345920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:16:43,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 08:16:46,714][47288] Updated weights for policy 0, policy_version 127636 (0.0030) [2024-04-26 08:16:48,923][47056] Fps is (10 sec: 58981.5, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 2091335680. Throughput: 0: 56600.8. Samples: 2040686880. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:16:48,923][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 08:16:49,018][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000127646_2091352064.pth... [2024-04-26 08:16:49,018][47288] Updated weights for policy 0, policy_version 127646 (0.0031) [2024-04-26 08:16:49,072][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000126816_2077753344.pth [2024-04-26 08:16:52,464][47288] Updated weights for policy 0, policy_version 127656 (0.0034) [2024-04-26 08:16:53,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2091597824. Throughput: 0: 56609.2. Samples: 2041027080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:16:53,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 08:16:54,793][47288] Updated weights for policy 0, policy_version 127666 (0.0032) [2024-04-26 08:16:55,827][47267] Signal inference workers to stop experience collection... (30300 times) [2024-04-26 08:16:55,858][47288] InferenceWorker_p0-w0: stopping experience collection (30300 times) [2024-04-26 08:16:55,882][47267] Signal inference workers to resume experience collection... (30300 times) [2024-04-26 08:16:55,886][47288] InferenceWorker_p0-w0: resuming experience collection (30300 times) [2024-04-26 08:16:58,187][47288] Updated weights for policy 0, policy_version 127676 (0.0029) [2024-04-26 08:16:58,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56526.0, 300 sec: 56316.5). Total num frames: 2091859968. Throughput: 0: 56579.2. Samples: 2041195200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:16:58,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 08:17:00,721][47288] Updated weights for policy 0, policy_version 127686 (0.0028) [2024-04-26 08:17:03,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 2092138496. Throughput: 0: 56499.7. 
Samples: 2041528680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:17:03,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 08:17:04,107][47288] Updated weights for policy 0, policy_version 127696 (0.0029) [2024-04-26 08:17:06,449][47288] Updated weights for policy 0, policy_version 127706 (0.0027) [2024-04-26 08:17:08,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 2092449792. Throughput: 0: 56531.7. Samples: 2041868760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:17:08,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 08:17:09,976][47288] Updated weights for policy 0, policy_version 127716 (0.0030) [2024-04-26 08:17:12,184][47288] Updated weights for policy 0, policy_version 127726 (0.0025) [2024-04-26 08:17:13,923][47056] Fps is (10 sec: 60620.4, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 2092744704. Throughput: 0: 56681.2. Samples: 2042042420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:17:13,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 08:17:15,836][47288] Updated weights for policy 0, policy_version 127736 (0.0030) [2024-04-26 08:17:17,991][47288] Updated weights for policy 0, policy_version 127746 (0.0026) [2024-04-26 08:17:18,923][47056] Fps is (10 sec: 57344.2, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 2093023232. Throughput: 0: 56588.4. Samples: 2042375140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:17:18,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 08:17:21,585][47288] Updated weights for policy 0, policy_version 127756 (0.0031) [2024-04-26 08:17:23,843][47288] Updated weights for policy 0, policy_version 127766 (0.0028) [2024-04-26 08:17:23,923][47056] Fps is (10 sec: 57344.7, 60 sec: 57344.2, 300 sec: 56427.6). Total num frames: 2093318144. Throughput: 0: 56632.0. Samples: 2042715320. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:17:23,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 08:17:27,397][47288] Updated weights for policy 0, policy_version 127776 (0.0027) [2024-04-26 08:17:28,923][47056] Fps is (10 sec: 57344.4, 60 sec: 57071.2, 300 sec: 56427.6). Total num frames: 2093596672. Throughput: 0: 56696.3. Samples: 2042897260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:17:28,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 08:17:29,652][47288] Updated weights for policy 0, policy_version 127786 (0.0022) [2024-04-26 08:17:33,222][47288] Updated weights for policy 0, policy_version 127796 (0.0025) [2024-04-26 08:17:33,923][47056] Fps is (10 sec: 52427.9, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 2093842432. Throughput: 0: 56676.4. Samples: 2043237320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:17:33,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 08:17:35,420][47288] Updated weights for policy 0, policy_version 127806 (0.0029) [2024-04-26 08:17:38,923][47056] Fps is (10 sec: 52428.2, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 2094120960. Throughput: 0: 56564.0. Samples: 2043572460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:17:38,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 08:17:38,966][47288] Updated weights for policy 0, policy_version 127816 (0.0031) [2024-04-26 08:17:41,217][47288] Updated weights for policy 0, policy_version 127826 (0.0032) [2024-04-26 08:17:43,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.7, 300 sec: 56483.2). Total num frames: 2094415872. Throughput: 0: 56457.4. Samples: 2043735780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:17:43,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 08:17:44,638][47288] Updated weights for policy 0, policy_version 127836 (0.0028) [2024-04-26 08:17:47,074][47288] Updated weights for policy 0, policy_version 127846 (0.0026) [2024-04-26 08:17:48,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2094710784. Throughput: 0: 56636.3. Samples: 2044077320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:17:48,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 08:17:50,487][47288] Updated weights for policy 0, policy_version 127856 (0.0028) [2024-04-26 08:17:53,217][47288] Updated weights for policy 0, policy_version 127866 (0.0030) [2024-04-26 08:17:53,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 2095005696. Throughput: 0: 56500.4. Samples: 2044411280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:17:53,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 08:17:56,430][47288] Updated weights for policy 0, policy_version 127876 (0.0033) [2024-04-26 08:17:58,854][47288] Updated weights for policy 0, policy_version 127886 (0.0026) [2024-04-26 08:17:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 2095284224. Throughput: 0: 56552.4. Samples: 2044587280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:17:58,923][47056] Avg episode reward: [(0, '0.488')] [2024-04-26 08:18:02,261][47288] Updated weights for policy 0, policy_version 127896 (0.0033) [2024-04-26 08:18:03,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57344.0, 300 sec: 56538.7). Total num frames: 2095579136. Throughput: 0: 56681.4. Samples: 2044925800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:18:03,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 08:18:04,644][47288] Updated weights for policy 0, policy_version 127906 (0.0032) [2024-04-26 08:18:05,403][47267] Signal inference workers to stop experience collection... (30350 times) [2024-04-26 08:18:05,403][47267] Signal inference workers to resume experience collection... (30350 times) [2024-04-26 08:18:05,413][47288] InferenceWorker_p0-w0: stopping experience collection (30350 times) [2024-04-26 08:18:05,413][47288] InferenceWorker_p0-w0: resuming experience collection (30350 times) [2024-04-26 08:18:08,091][47288] Updated weights for policy 0, policy_version 127916 (0.0033) [2024-04-26 08:18:08,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 2095824896. Throughput: 0: 56652.3. Samples: 2045264680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:18:08,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 08:18:10,515][47288] Updated weights for policy 0, policy_version 127926 (0.0029) [2024-04-26 08:18:13,825][47288] Updated weights for policy 0, policy_version 127936 (0.0027) [2024-04-26 08:18:13,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2096103424. Throughput: 0: 56112.8. Samples: 2045422340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:18:13,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 08:18:16,488][47288] Updated weights for policy 0, policy_version 127946 (0.0026) [2024-04-26 08:18:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 56372.1). Total num frames: 2096365568. Throughput: 0: 56129.8. Samples: 2045763160. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:18:18,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 08:18:19,643][47288] Updated weights for policy 0, policy_version 127956 (0.0028) [2024-04-26 08:18:22,091][47288] Updated weights for policy 0, policy_version 127966 (0.0027) [2024-04-26 08:18:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2096676864. Throughput: 0: 56153.9. Samples: 2046099380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:18:23,923][47056] Avg episode reward: [(0, '0.592')] [2024-04-26 08:18:25,434][47288] Updated weights for policy 0, policy_version 127976 (0.0027) [2024-04-26 08:18:28,001][47288] Updated weights for policy 0, policy_version 127986 (0.0027) [2024-04-26 08:18:28,923][47056] Fps is (10 sec: 60620.8, 60 sec: 56251.6, 300 sec: 56483.2). Total num frames: 2096971776. Throughput: 0: 56369.2. Samples: 2046272400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:18:28,923][47056] Avg episode reward: [(0, '0.428')] [2024-04-26 08:18:31,284][47288] Updated weights for policy 0, policy_version 127996 (0.0027) [2024-04-26 08:18:33,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56798.0, 300 sec: 56594.2). Total num frames: 2097250304. Throughput: 0: 56265.0. Samples: 2046609240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-04-26 08:18:33,923][47056] Avg episode reward: [(0, '0.582')] [2024-04-26 08:18:33,924][47288] Updated weights for policy 0, policy_version 128006 (0.0026) [2024-04-26 08:18:36,914][47288] Updated weights for policy 0, policy_version 128016 (0.0031) [2024-04-26 08:18:38,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 2097528832. Throughput: 0: 56334.3. Samples: 2046946320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:18:38,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:18:39,635][47288] Updated weights for policy 0, policy_version 128026 (0.0028) [2024-04-26 08:18:42,776][47288] Updated weights for policy 0, policy_version 128036 (0.0031) [2024-04-26 08:18:43,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2097807360. Throughput: 0: 56374.8. Samples: 2047124140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:18:43,923][47056] Avg episode reward: [(0, '0.619')] [2024-04-26 08:18:45,441][47288] Updated weights for policy 0, policy_version 128046 (0.0027) [2024-04-26 08:18:48,633][47288] Updated weights for policy 0, policy_version 128056 (0.0034) [2024-04-26 08:18:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2098085888. Throughput: 0: 56368.8. Samples: 2047462400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:18:48,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 08:18:48,979][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000128058_2098102272.pth... [2024-04-26 08:18:49,022][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000127229_2084519936.pth [2024-04-26 08:18:51,242][47288] Updated weights for policy 0, policy_version 128066 (0.0030) [2024-04-26 08:18:53,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 56427.6). Total num frames: 2098348032. Throughput: 0: 56415.6. Samples: 2047803380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:18:53,923][47056] Avg episode reward: [(0, '0.599')] [2024-04-26 08:18:54,318][47288] Updated weights for policy 0, policy_version 128076 (0.0032) [2024-04-26 08:18:57,109][47288] Updated weights for policy 0, policy_version 128086 (0.0027) [2024-04-26 08:18:57,661][47267] Signal inference workers to stop experience collection... 
(30400 times) [2024-04-26 08:18:57,662][47267] Signal inference workers to resume experience collection... (30400 times) [2024-04-26 08:18:57,685][47288] InferenceWorker_p0-w0: stopping experience collection (30400 times) [2024-04-26 08:18:57,685][47288] InferenceWorker_p0-w0: resuming experience collection (30400 times) [2024-04-26 08:18:58,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 56594.2). Total num frames: 2098659328. Throughput: 0: 56525.2. Samples: 2047965980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:18:58,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 08:19:00,288][47288] Updated weights for policy 0, policy_version 128096 (0.0031) [2024-04-26 08:19:02,753][47288] Updated weights for policy 0, policy_version 128106 (0.0028) [2024-04-26 08:19:03,923][47056] Fps is (10 sec: 60620.9, 60 sec: 56251.7, 300 sec: 56594.2). Total num frames: 2098954240. Throughput: 0: 56655.2. Samples: 2048312640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:19:03,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 08:19:05,897][47288] Updated weights for policy 0, policy_version 128116 (0.0032) [2024-04-26 08:19:08,647][47288] Updated weights for policy 0, policy_version 128126 (0.0026) [2024-04-26 08:19:08,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2099232768. Throughput: 0: 56741.8. Samples: 2048652760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:19:08,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 08:19:11,755][47288] Updated weights for policy 0, policy_version 128136 (0.0031) [2024-04-26 08:19:13,923][47056] Fps is (10 sec: 52429.3, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2099478528. Throughput: 0: 56649.1. Samples: 2048821600. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:19:13,923][47056] Avg episode reward: [(0, '0.434')] [2024-04-26 08:19:14,527][47288] Updated weights for policy 0, policy_version 128146 (0.0024) [2024-04-26 08:19:17,482][47288] Updated weights for policy 0, policy_version 128156 (0.0026) [2024-04-26 08:19:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 57344.1, 300 sec: 56538.7). Total num frames: 2099806208. Throughput: 0: 56674.6. Samples: 2049159600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:19:18,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 08:19:20,167][47288] Updated weights for policy 0, policy_version 128166 (0.0023) [2024-04-26 08:19:23,103][47288] Updated weights for policy 0, policy_version 128176 (0.0026) [2024-04-26 08:19:23,923][47056] Fps is (10 sec: 60619.9, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 2100084736. Throughput: 0: 56653.2. Samples: 2049495720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:19:23,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 08:19:26,063][47288] Updated weights for policy 0, policy_version 128186 (0.0024) [2024-04-26 08:19:28,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 2100346880. Throughput: 0: 56432.1. Samples: 2049663580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:19:28,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 08:19:28,982][47288] Updated weights for policy 0, policy_version 128196 (0.0028) [2024-04-26 08:19:31,913][47288] Updated weights for policy 0, policy_version 128206 (0.0030) [2024-04-26 08:19:33,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.6, 300 sec: 56538.7). Total num frames: 2100625408. Throughput: 0: 56491.9. Samples: 2050004540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 08:19:33,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:19:34,862][47288] Updated weights for policy 0, policy_version 128216 (0.0027) [2024-04-26 08:19:37,529][47288] Updated weights for policy 0, policy_version 128226 (0.0027) [2024-04-26 08:19:38,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2100903936. Throughput: 0: 56369.0. Samples: 2050339980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:19:38,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 08:19:40,794][47288] Updated weights for policy 0, policy_version 128236 (0.0026) [2024-04-26 08:19:43,209][47288] Updated weights for policy 0, policy_version 128246 (0.0030) [2024-04-26 08:19:43,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56797.7, 300 sec: 56594.2). Total num frames: 2101215232. Throughput: 0: 56539.1. Samples: 2050510240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:19:43,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 08:19:46,402][47288] Updated weights for policy 0, policy_version 128256 (0.0026) [2024-04-26 08:19:48,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56798.0, 300 sec: 56538.7). Total num frames: 2101493760. Throughput: 0: 56460.6. Samples: 2050853360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:19:48,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 08:19:48,973][47267] Signal inference workers to stop experience collection... (30450 times) [2024-04-26 08:19:49,022][47288] InferenceWorker_p0-w0: stopping experience collection (30450 times) [2024-04-26 08:19:49,029][47267] Signal inference workers to resume experience collection... 
(30450 times) [2024-04-26 08:19:49,034][47288] InferenceWorker_p0-w0: resuming experience collection (30450 times) [2024-04-26 08:19:49,037][47288] Updated weights for policy 0, policy_version 128266 (0.0031) [2024-04-26 08:19:52,138][47288] Updated weights for policy 0, policy_version 128276 (0.0031) [2024-04-26 08:19:53,923][47056] Fps is (10 sec: 54068.1, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 2101755904. Throughput: 0: 56505.4. Samples: 2051195500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:19:53,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:19:54,948][47288] Updated weights for policy 0, policy_version 128286 (0.0029) [2024-04-26 08:19:57,904][47288] Updated weights for policy 0, policy_version 128296 (0.0030) [2024-04-26 08:19:58,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2102067200. Throughput: 0: 56425.5. Samples: 2051360760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:19:58,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:20:00,620][47288] Updated weights for policy 0, policy_version 128306 (0.0026) [2024-04-26 08:20:03,786][47288] Updated weights for policy 0, policy_version 128316 (0.0027) [2024-04-26 08:20:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2102329344. Throughput: 0: 56395.1. Samples: 2051697380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:20:03,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:20:06,452][47288] Updated weights for policy 0, policy_version 128326 (0.0036) [2024-04-26 08:20:08,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2102607872. Throughput: 0: 56612.1. Samples: 2052043260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:20:08,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 08:20:09,440][47288] Updated weights for policy 0, policy_version 128336 (0.0026) [2024-04-26 08:20:12,256][47288] Updated weights for policy 0, policy_version 128346 (0.0034) [2024-04-26 08:20:13,923][47056] Fps is (10 sec: 57343.6, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 2102902784. Throughput: 0: 56649.2. Samples: 2052212800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:20:13,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:20:15,250][47288] Updated weights for policy 0, policy_version 128356 (0.0025) [2024-04-26 08:20:17,934][47288] Updated weights for policy 0, policy_version 128366 (0.0030) [2024-04-26 08:20:18,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 2103197696. Throughput: 0: 56747.7. Samples: 2052558180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:20:18,923][47056] Avg episode reward: [(0, '0.576')] [2024-04-26 08:20:21,137][47288] Updated weights for policy 0, policy_version 128376 (0.0028) [2024-04-26 08:20:23,652][47288] Updated weights for policy 0, policy_version 128386 (0.0032) [2024-04-26 08:20:23,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56525.0, 300 sec: 56538.8). Total num frames: 2103476224. Throughput: 0: 56642.3. Samples: 2052888880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:20:23,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 08:20:26,976][47288] Updated weights for policy 0, policy_version 128396 (0.0031) [2024-04-26 08:20:28,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2103738368. Throughput: 0: 56725.1. Samples: 2053062860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:20:28,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 08:20:29,408][47288] Updated weights for policy 0, policy_version 128406 (0.0030) [2024-04-26 08:20:32,634][47288] Updated weights for policy 0, policy_version 128416 (0.0036) [2024-04-26 08:20:33,923][47056] Fps is (10 sec: 54065.8, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2104016896. Throughput: 0: 56729.1. Samples: 2053406180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:20:33,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 08:20:35,256][47288] Updated weights for policy 0, policy_version 128426 (0.0028) [2024-04-26 08:20:36,173][47267] Signal inference workers to stop experience collection... (30500 times) [2024-04-26 08:20:36,174][47267] Signal inference workers to resume experience collection... (30500 times) [2024-04-26 08:20:36,186][47288] InferenceWorker_p0-w0: stopping experience collection (30500 times) [2024-04-26 08:20:36,186][47288] InferenceWorker_p0-w0: resuming experience collection (30500 times) [2024-04-26 08:20:38,239][47288] Updated weights for policy 0, policy_version 128436 (0.0031) [2024-04-26 08:20:38,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2104311808. Throughput: 0: 56552.0. Samples: 2053740340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 08:20:38,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 08:20:41,222][47288] Updated weights for policy 0, policy_version 128446 (0.0031) [2024-04-26 08:20:43,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 2104590336. Throughput: 0: 56653.4. Samples: 2053910160. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:20:43,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 08:20:44,119][47288] Updated weights for policy 0, policy_version 128456 (0.0030) [2024-04-26 08:20:47,251][47288] Updated weights for policy 0, policy_version 128466 (0.0025) [2024-04-26 08:20:48,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2104885248. Throughput: 0: 56756.0. Samples: 2054251400. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:20:48,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 08:20:48,995][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000128473_2104901632.pth... [2024-04-26 08:20:49,047][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000127646_2091352064.pth [2024-04-26 08:20:49,841][47288] Updated weights for policy 0, policy_version 128476 (0.0025) [2024-04-26 08:20:52,986][47288] Updated weights for policy 0, policy_version 128486 (0.0032) [2024-04-26 08:20:53,923][47056] Fps is (10 sec: 58983.0, 60 sec: 57071.0, 300 sec: 56650.0). Total num frames: 2105180160. Throughput: 0: 56532.5. Samples: 2054587220. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:20:53,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 08:20:55,567][47288] Updated weights for policy 0, policy_version 128496 (0.0025) [2024-04-26 08:20:58,641][47288] Updated weights for policy 0, policy_version 128506 (0.0028) [2024-04-26 08:20:58,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2105458688. Throughput: 0: 56634.2. Samples: 2054761340. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:20:58,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 08:21:01,620][47288] Updated weights for policy 0, policy_version 128516 (0.0030) [2024-04-26 08:21:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56797.8, 300 sec: 56483.1). 
Total num frames: 2105737216. Throughput: 0: 56401.8. Samples: 2055096260. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:21:03,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 08:21:04,480][47288] Updated weights for policy 0, policy_version 128526 (0.0033) [2024-04-26 08:21:07,538][47288] Updated weights for policy 0, policy_version 128536 (0.0027) [2024-04-26 08:21:08,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2105999360. Throughput: 0: 56704.4. Samples: 2055440580. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:21:08,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 08:21:10,305][47288] Updated weights for policy 0, policy_version 128546 (0.0028) [2024-04-26 08:21:13,402][47288] Updated weights for policy 0, policy_version 128556 (0.0024) [2024-04-26 08:21:13,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2106277888. Throughput: 0: 56439.6. Samples: 2055602640. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:21:13,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 08:21:15,893][47288] Updated weights for policy 0, policy_version 128566 (0.0028) [2024-04-26 08:21:18,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 56594.3). Total num frames: 2106572800. Throughput: 0: 56468.6. Samples: 2055947260. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:21:18,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 08:21:19,134][47288] Updated weights for policy 0, policy_version 128576 (0.0028) [2024-04-26 08:21:21,664][47288] Updated weights for policy 0, policy_version 128586 (0.0029) [2024-04-26 08:21:23,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56524.6, 300 sec: 56594.3). Total num frames: 2106867712. Throughput: 0: 56514.1. Samples: 2056283480. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:21:23,924][47056] Avg episode reward: [(0, '0.514')] [2024-04-26 08:21:24,777][47288] Updated weights for policy 0, policy_version 128596 (0.0029) [2024-04-26 08:21:27,581][47288] Updated weights for policy 0, policy_version 128606 (0.0028) [2024-04-26 08:21:28,923][47056] Fps is (10 sec: 58981.9, 60 sec: 57070.8, 300 sec: 56594.2). Total num frames: 2107162624. Throughput: 0: 56547.1. Samples: 2056454780. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:21:28,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:21:30,525][47288] Updated weights for policy 0, policy_version 128616 (0.0033) [2024-04-26 08:21:33,122][47267] Signal inference workers to stop experience collection... (30550 times) [2024-04-26 08:21:33,123][47267] Signal inference workers to resume experience collection... (30550 times) [2024-04-26 08:21:33,152][47288] InferenceWorker_p0-w0: stopping experience collection (30550 times) [2024-04-26 08:21:33,152][47288] InferenceWorker_p0-w0: resuming experience collection (30550 times) [2024-04-26 08:21:33,230][47288] Updated weights for policy 0, policy_version 128626 (0.0033) [2024-04-26 08:21:33,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2107424768. Throughput: 0: 56480.8. Samples: 2056793040. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:21:33,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 08:21:36,315][47288] Updated weights for policy 0, policy_version 128636 (0.0027) [2024-04-26 08:21:38,909][47288] Updated weights for policy 0, policy_version 128646 (0.0029) [2024-04-26 08:21:38,923][47056] Fps is (10 sec: 57343.6, 60 sec: 57070.8, 300 sec: 56649.7). Total num frames: 2107736064. Throughput: 0: 56613.5. Samples: 2057134840. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-04-26 08:21:38,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 08:21:41,969][47288] Updated weights for policy 0, policy_version 128656 (0.0030) [2024-04-26 08:21:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 2107998208. Throughput: 0: 56536.0. Samples: 2057305460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:21:43,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 08:21:44,757][47288] Updated weights for policy 0, policy_version 128666 (0.0028) [2024-04-26 08:21:48,047][47288] Updated weights for policy 0, policy_version 128676 (0.0027) [2024-04-26 08:21:48,923][47056] Fps is (10 sec: 52429.8, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2108260352. Throughput: 0: 56615.2. Samples: 2057643940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:21:48,923][47056] Avg episode reward: [(0, '0.422')] [2024-04-26 08:21:50,677][47288] Updated weights for policy 0, policy_version 128686 (0.0028) [2024-04-26 08:21:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56251.6, 300 sec: 56594.2). Total num frames: 2108555264. Throughput: 0: 56564.3. Samples: 2057985980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:21:53,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 08:21:53,925][47288] Updated weights for policy 0, policy_version 128696 (0.0032) [2024-04-26 08:21:56,447][47288] Updated weights for policy 0, policy_version 128706 (0.0029) [2024-04-26 08:21:58,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.9, 300 sec: 56594.2). Total num frames: 2108833792. Throughput: 0: 56597.0. Samples: 2058149500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:21:58,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 08:21:59,716][47288] Updated weights for policy 0, policy_version 128716 (0.0027) [2024-04-26 08:22:02,197][47288] Updated weights for policy 0, policy_version 128726 (0.0030) [2024-04-26 08:22:03,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2109128704. Throughput: 0: 56523.9. Samples: 2058490840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:22:03,924][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:22:05,507][47288] Updated weights for policy 0, policy_version 128736 (0.0026) [2024-04-26 08:22:08,178][47288] Updated weights for policy 0, policy_version 128746 (0.0034) [2024-04-26 08:22:08,923][47056] Fps is (10 sec: 58981.7, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 2109423616. Throughput: 0: 56646.8. Samples: 2058832580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:22:08,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 08:22:11,331][47288] Updated weights for policy 0, policy_version 128756 (0.0028) [2024-04-26 08:22:13,923][47056] Fps is (10 sec: 54068.0, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2109669376. Throughput: 0: 56531.7. Samples: 2058998700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:22:13,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 08:22:14,119][47288] Updated weights for policy 0, policy_version 128766 (0.0027) [2024-04-26 08:22:17,184][47288] Updated weights for policy 0, policy_version 128776 (0.0033) [2024-04-26 08:22:18,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2109964288. Throughput: 0: 56581.3. Samples: 2059339200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:22:18,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 08:22:20,003][47288] Updated weights for policy 0, policy_version 128786 (0.0029) [2024-04-26 08:22:22,981][47288] Updated weights for policy 0, policy_version 128796 (0.0035) [2024-04-26 08:22:23,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2110259200. Throughput: 0: 56504.0. Samples: 2059677520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:22:23,924][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 08:22:25,937][47288] Updated weights for policy 0, policy_version 128806 (0.0026) [2024-04-26 08:22:28,826][47288] Updated weights for policy 0, policy_version 128816 (0.0028) [2024-04-26 08:22:28,923][47056] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2110521344. Throughput: 0: 56361.7. Samples: 2059841740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:22:28,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:22:31,581][47288] Updated weights for policy 0, policy_version 128826 (0.0025) [2024-04-26 08:22:33,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 2110816256. Throughput: 0: 56328.3. Samples: 2060178720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:22:33,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:22:34,688][47288] Updated weights for policy 0, policy_version 128836 (0.0028) [2024-04-26 08:22:37,340][47288] Updated weights for policy 0, policy_version 128846 (0.0032) [2024-04-26 08:22:38,923][47056] Fps is (10 sec: 55706.7, 60 sec: 55705.8, 300 sec: 56483.2). Total num frames: 2111078400. Throughput: 0: 56243.3. Samples: 2060516920. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:22:38,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 08:22:40,563][47288] Updated weights for policy 0, policy_version 128856 (0.0032) [2024-04-26 08:22:43,145][47288] Updated weights for policy 0, policy_version 128866 (0.0028) [2024-04-26 08:22:43,920][47267] Signal inference workers to stop experience collection... (30600 times) [2024-04-26 08:22:43,920][47267] Signal inference workers to resume experience collection... (30600 times) [2024-04-26 08:22:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2111373312. Throughput: 0: 56334.1. Samples: 2060684540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:22:43,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 08:22:43,933][47288] InferenceWorker_p0-w0: stopping experience collection (30600 times) [2024-04-26 08:22:43,933][47288] InferenceWorker_p0-w0: resuming experience collection (30600 times) [2024-04-26 08:22:46,391][47288] Updated weights for policy 0, policy_version 128876 (0.0026) [2024-04-26 08:22:48,852][47288] Updated weights for policy 0, policy_version 128886 (0.0032) [2024-04-26 08:22:48,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 2111668224. Throughput: 0: 56304.6. Samples: 2061024540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:22:48,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 08:22:48,949][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000128887_2111684608.pth... [2024-04-26 08:22:48,997][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000128058_2098102272.pth [2024-04-26 08:22:52,172][47288] Updated weights for policy 0, policy_version 128896 (0.0033) [2024-04-26 08:22:53,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2111963136. Throughput: 0: 56195.6. 
Samples: 2061361380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:22:53,923][47056] Avg episode reward: [(0, '0.582')] [2024-04-26 08:22:54,693][47288] Updated weights for policy 0, policy_version 128906 (0.0024) [2024-04-26 08:22:57,823][47288] Updated weights for policy 0, policy_version 128916 (0.0032) [2024-04-26 08:22:58,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2112225280. Throughput: 0: 56444.9. Samples: 2061538720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:22:58,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 08:23:00,366][47288] Updated weights for policy 0, policy_version 128926 (0.0032) [2024-04-26 08:23:03,613][47288] Updated weights for policy 0, policy_version 128936 (0.0025) [2024-04-26 08:23:03,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2112503808. Throughput: 0: 56393.0. Samples: 2061876880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:23:03,923][47056] Avg episode reward: [(0, '0.456')] [2024-04-26 08:23:06,226][47288] Updated weights for policy 0, policy_version 128946 (0.0030) [2024-04-26 08:23:08,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2112782336. Throughput: 0: 56433.8. Samples: 2062217040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:23:08,923][47056] Avg episode reward: [(0, '0.512')] [2024-04-26 08:23:09,341][47288] Updated weights for policy 0, policy_version 128956 (0.0025) [2024-04-26 08:23:12,025][47288] Updated weights for policy 0, policy_version 128966 (0.0029) [2024-04-26 08:23:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 2113060864. Throughput: 0: 56355.6. Samples: 2062377740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:23:13,923][47056] Avg episode reward: [(0, '0.546')] [2024-04-26 08:23:15,092][47288] Updated weights for policy 0, policy_version 128976 (0.0030) [2024-04-26 08:23:17,838][47288] Updated weights for policy 0, policy_version 128986 (0.0031) [2024-04-26 08:23:18,923][47056] Fps is (10 sec: 55706.9, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 2113339392. Throughput: 0: 56434.0. Samples: 2062718240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:23:18,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 08:23:21,091][47288] Updated weights for policy 0, policy_version 128996 (0.0026) [2024-04-26 08:23:23,737][47288] Updated weights for policy 0, policy_version 129006 (0.0029) [2024-04-26 08:23:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 2113634304. Throughput: 0: 56297.2. Samples: 2063050300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:23:23,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 08:23:26,770][47288] Updated weights for policy 0, policy_version 129016 (0.0028) [2024-04-26 08:23:28,923][47056] Fps is (10 sec: 58981.4, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2113929216. Throughput: 0: 56526.6. Samples: 2063228240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:23:28,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 08:23:29,527][47288] Updated weights for policy 0, policy_version 129026 (0.0035) [2024-04-26 08:23:32,571][47288] Updated weights for policy 0, policy_version 129036 (0.0027) [2024-04-26 08:23:33,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2114224128. Throughput: 0: 56524.8. Samples: 2063568160. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:23:33,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 08:23:35,243][47288] Updated weights for policy 0, policy_version 129046 (0.0025) [2024-04-26 08:23:38,452][47288] Updated weights for policy 0, policy_version 129056 (0.0032) [2024-04-26 08:23:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56524.6, 300 sec: 56483.1). Total num frames: 2114469888. Throughput: 0: 56434.1. Samples: 2063900920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:23:38,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 08:23:41,147][47288] Updated weights for policy 0, policy_version 129066 (0.0031) [2024-04-26 08:23:43,923][47056] Fps is (10 sec: 50790.4, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2114732032. Throughput: 0: 56202.6. Samples: 2064067840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 08:23:43,923][47056] Avg episode reward: [(0, '0.605')] [2024-04-26 08:23:44,233][47288] Updated weights for policy 0, policy_version 129076 (0.0036) [2024-04-26 08:23:47,146][47288] Updated weights for policy 0, policy_version 129086 (0.0030) [2024-04-26 08:23:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2115026944. Throughput: 0: 56142.6. Samples: 2064403300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:23:48,923][47056] Avg episode reward: [(0, '0.581')] [2024-04-26 08:23:49,516][47267] Signal inference workers to stop experience collection... (30650 times) [2024-04-26 08:23:49,516][47267] Signal inference workers to resume experience collection... 
(30650 times) [2024-04-26 08:23:49,526][47288] InferenceWorker_p0-w0: stopping experience collection (30650 times) [2024-04-26 08:23:49,527][47288] InferenceWorker_p0-w0: resuming experience collection (30650 times) [2024-04-26 08:23:50,120][47288] Updated weights for policy 0, policy_version 129096 (0.0029) [2024-04-26 08:23:52,977][47288] Updated weights for policy 0, policy_version 129106 (0.0032) [2024-04-26 08:23:53,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 56372.1). Total num frames: 2115289088. Throughput: 0: 56079.8. Samples: 2064740620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:23:53,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 08:23:55,851][47288] Updated weights for policy 0, policy_version 129116 (0.0026) [2024-04-26 08:23:58,629][47288] Updated weights for policy 0, policy_version 129126 (0.0028) [2024-04-26 08:23:58,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 2115600384. Throughput: 0: 56216.4. Samples: 2064907480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:23:58,923][47056] Avg episode reward: [(0, '0.599')] [2024-04-26 08:24:01,709][47288] Updated weights for policy 0, policy_version 129136 (0.0025) [2024-04-26 08:24:03,923][47056] Fps is (10 sec: 60619.7, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2115895296. Throughput: 0: 56121.1. Samples: 2065243700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:03,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 08:24:04,590][47288] Updated weights for policy 0, policy_version 129146 (0.0030) [2024-04-26 08:24:07,578][47288] Updated weights for policy 0, policy_version 129156 (0.0031) [2024-04-26 08:24:08,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 2116173824. Throughput: 0: 56258.5. Samples: 2065581940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:08,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 08:24:10,500][47288] Updated weights for policy 0, policy_version 129166 (0.0025) [2024-04-26 08:24:13,259][47288] Updated weights for policy 0, policy_version 129176 (0.0026) [2024-04-26 08:24:13,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2116452352. Throughput: 0: 56315.9. Samples: 2065762460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:13,923][47056] Avg episode reward: [(0, '0.444')] [2024-04-26 08:24:16,183][47288] Updated weights for policy 0, policy_version 129186 (0.0033) [2024-04-26 08:24:18,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2116730880. Throughput: 0: 56218.7. Samples: 2066098000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:18,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 08:24:19,063][47288] Updated weights for policy 0, policy_version 129196 (0.0031) [2024-04-26 08:24:22,206][47288] Updated weights for policy 0, policy_version 129206 (0.0029) [2024-04-26 08:24:23,923][47056] Fps is (10 sec: 55706.5, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2117009408. Throughput: 0: 56400.6. Samples: 2066438940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:23,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 08:24:24,791][47288] Updated weights for policy 0, policy_version 129216 (0.0028) [2024-04-26 08:24:28,006][47288] Updated weights for policy 0, policy_version 129226 (0.0029) [2024-04-26 08:24:28,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.8, 300 sec: 56427.7). Total num frames: 2117271552. Throughput: 0: 56238.8. Samples: 2066598580. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:28,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 08:24:30,540][47288] Updated weights for policy 0, policy_version 129236 (0.0027) [2024-04-26 08:24:33,800][47288] Updated weights for policy 0, policy_version 129246 (0.0032) [2024-04-26 08:24:33,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 56483.1). Total num frames: 2117566464. Throughput: 0: 56302.8. Samples: 2066936920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:33,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 08:24:36,442][47288] Updated weights for policy 0, policy_version 129256 (0.0026) [2024-04-26 08:24:38,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 2117844992. Throughput: 0: 56298.6. Samples: 2067274060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:38,923][47056] Avg episode reward: [(0, '0.416')] [2024-04-26 08:24:39,696][47288] Updated weights for policy 0, policy_version 129266 (0.0028) [2024-04-26 08:24:42,225][47288] Updated weights for policy 0, policy_version 129276 (0.0027) [2024-04-26 08:24:43,923][47056] Fps is (10 sec: 58982.4, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 2118156288. Throughput: 0: 56502.4. Samples: 2067450080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:43,923][47056] Avg episode reward: [(0, '0.594')] [2024-04-26 08:24:45,630][47288] Updated weights for policy 0, policy_version 129286 (0.0029) [2024-04-26 08:24:47,969][47288] Updated weights for policy 0, policy_version 129296 (0.0025) [2024-04-26 08:24:48,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2118434816. Throughput: 0: 56555.2. Samples: 2067788680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 08:24:48,923][47056] Avg episode reward: [(0, '0.597')] [2024-04-26 08:24:48,960][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000129300_2118451200.pth... [2024-04-26 08:24:49,002][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000128473_2104901632.pth [2024-04-26 08:24:51,485][47288] Updated weights for policy 0, policy_version 129306 (0.0025) [2024-04-26 08:24:53,723][47288] Updated weights for policy 0, policy_version 129316 (0.0027) [2024-04-26 08:24:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 57070.8, 300 sec: 56427.6). Total num frames: 2118713344. Throughput: 0: 56444.6. Samples: 2068121940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:24:53,924][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 08:24:57,308][47288] Updated weights for policy 0, policy_version 129326 (0.0029) [2024-04-26 08:24:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 2118991872. Throughput: 0: 56261.0. Samples: 2068294200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:24:58,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:24:59,541][47288] Updated weights for policy 0, policy_version 129336 (0.0027) [2024-04-26 08:25:03,020][47288] Updated weights for policy 0, policy_version 129346 (0.0027) [2024-04-26 08:25:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2119254016. Throughput: 0: 56500.8. Samples: 2068640540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:03,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:25:05,448][47288] Updated weights for policy 0, policy_version 129356 (0.0032) [2024-04-26 08:25:05,915][47267] Signal inference workers to stop experience collection... (30700 times) [2024-04-26 08:25:05,916][47267] Signal inference workers to resume experience collection... 
(30700 times) [2024-04-26 08:25:05,928][47288] InferenceWorker_p0-w0: stopping experience collection (30700 times) [2024-04-26 08:25:05,928][47288] InferenceWorker_p0-w0: resuming experience collection (30700 times) [2024-04-26 08:25:08,818][47288] Updated weights for policy 0, policy_version 129366 (0.0028) [2024-04-26 08:25:08,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 2119532544. Throughput: 0: 56534.2. Samples: 2068982980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:08,924][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 08:25:11,216][47288] Updated weights for policy 0, policy_version 129376 (0.0031) [2024-04-26 08:25:13,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 2119827456. Throughput: 0: 56567.0. Samples: 2069144100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:13,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 08:25:14,726][47288] Updated weights for policy 0, policy_version 129386 (0.0029) [2024-04-26 08:25:17,010][47288] Updated weights for policy 0, policy_version 129396 (0.0025) [2024-04-26 08:25:18,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 2120105984. Throughput: 0: 56648.4. Samples: 2069486100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:18,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 08:25:20,561][47288] Updated weights for policy 0, policy_version 129406 (0.0028) [2024-04-26 08:25:22,655][47288] Updated weights for policy 0, policy_version 129416 (0.0027) [2024-04-26 08:25:23,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2120417280. Throughput: 0: 56646.3. Samples: 2069823140. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:23,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 08:25:26,137][47288] Updated weights for policy 0, policy_version 129426 (0.0034) [2024-04-26 08:25:28,332][47288] Updated weights for policy 0, policy_version 129436 (0.0031) [2024-04-26 08:25:28,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.7, 300 sec: 56483.2). Total num frames: 2120679424. Throughput: 0: 56579.9. Samples: 2069996180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:28,923][47056] Avg episode reward: [(0, '0.440')] [2024-04-26 08:25:31,981][47288] Updated weights for policy 0, policy_version 129446 (0.0031) [2024-04-26 08:25:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 2120974336. Throughput: 0: 56722.4. Samples: 2070341180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:33,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 08:25:34,157][47288] Updated weights for policy 0, policy_version 129456 (0.0037) [2024-04-26 08:25:37,745][47288] Updated weights for policy 0, policy_version 129466 (0.0032) [2024-04-26 08:25:38,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 2121252864. Throughput: 0: 56730.4. Samples: 2070674800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:38,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:25:39,959][47288] Updated weights for policy 0, policy_version 129476 (0.0030) [2024-04-26 08:25:43,468][47288] Updated weights for policy 0, policy_version 129486 (0.0034) [2024-04-26 08:25:43,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 2121515008. Throughput: 0: 56616.6. Samples: 2070841940. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:43,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 08:25:45,717][47288] Updated weights for policy 0, policy_version 129496 (0.0027) [2024-04-26 08:25:48,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 2121793536. Throughput: 0: 56617.5. Samples: 2071188320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:25:48,923][47056] Avg episode reward: [(0, '0.626')] [2024-04-26 08:25:49,151][47288] Updated weights for policy 0, policy_version 129506 (0.0032) [2024-04-26 08:25:51,854][47288] Updated weights for policy 0, policy_version 129516 (0.0028) [2024-04-26 08:25:53,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 2122088448. Throughput: 0: 56524.4. Samples: 2071526580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:25:53,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 08:25:55,077][47288] Updated weights for policy 0, policy_version 129526 (0.0034) [2024-04-26 08:25:57,441][47288] Updated weights for policy 0, policy_version 129536 (0.0029) [2024-04-26 08:25:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2122383360. Throughput: 0: 56622.7. Samples: 2071692120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:25:58,923][47056] Avg episode reward: [(0, '0.400')] [2024-04-26 08:26:00,802][47288] Updated weights for policy 0, policy_version 129546 (0.0027) [2024-04-26 08:26:03,252][47288] Updated weights for policy 0, policy_version 129556 (0.0030) [2024-04-26 08:26:03,893][47267] Signal inference workers to stop experience collection... (30750 times) [2024-04-26 08:26:03,923][47056] Fps is (10 sec: 57345.1, 60 sec: 56798.1, 300 sec: 56483.2). Total num frames: 2122661888. Throughput: 0: 56638.8. Samples: 2072034840. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:03,923][47056] Avg episode reward: [(0, '0.501')] [2024-04-26 08:26:03,935][47288] InferenceWorker_p0-w0: stopping experience collection (30750 times) [2024-04-26 08:26:03,943][47267] Signal inference workers to resume experience collection... (30750 times) [2024-04-26 08:26:03,952][47288] InferenceWorker_p0-w0: resuming experience collection (30750 times) [2024-04-26 08:26:06,419][47288] Updated weights for policy 0, policy_version 129566 (0.0033) [2024-04-26 08:26:08,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 2122940416. Throughput: 0: 56709.3. Samples: 2072375060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:08,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 08:26:09,058][47288] Updated weights for policy 0, policy_version 129576 (0.0036) [2024-04-26 08:26:12,037][47288] Updated weights for policy 0, policy_version 129586 (0.0028) [2024-04-26 08:26:13,923][47056] Fps is (10 sec: 60619.5, 60 sec: 57343.9, 300 sec: 56594.2). Total num frames: 2123268096. Throughput: 0: 56890.7. Samples: 2072556260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:13,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:26:14,664][47288] Updated weights for policy 0, policy_version 129596 (0.0033) [2024-04-26 08:26:18,014][47288] Updated weights for policy 0, policy_version 129606 (0.0031) [2024-04-26 08:26:18,923][47056] Fps is (10 sec: 58981.6, 60 sec: 57070.8, 300 sec: 56483.1). Total num frames: 2123530240. Throughput: 0: 56662.0. Samples: 2072890980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:18,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 08:26:20,332][47288] Updated weights for policy 0, policy_version 129616 (0.0031) [2024-04-26 08:26:23,790][47288] Updated weights for policy 0, policy_version 129626 (0.0030) [2024-04-26 08:26:23,923][47056] Fps is (10 sec: 52429.6, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 2123792384. Throughput: 0: 56801.3. Samples: 2073230860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:23,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:26:26,160][47288] Updated weights for policy 0, policy_version 129636 (0.0028) [2024-04-26 08:26:28,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2124070912. Throughput: 0: 56708.9. Samples: 2073393840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:28,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 08:26:29,552][47288] Updated weights for policy 0, policy_version 129646 (0.0028) [2024-04-26 08:26:31,928][47288] Updated weights for policy 0, policy_version 129656 (0.0029) [2024-04-26 08:26:33,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2124365824. Throughput: 0: 56588.8. Samples: 2073734820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:33,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 08:26:35,416][47288] Updated weights for policy 0, policy_version 129666 (0.0031) [2024-04-26 08:26:37,847][47288] Updated weights for policy 0, policy_version 129676 (0.0028) [2024-04-26 08:26:38,923][47056] Fps is (10 sec: 57342.1, 60 sec: 56524.5, 300 sec: 56427.6). Total num frames: 2124644352. Throughput: 0: 56703.3. Samples: 2074078240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:38,924][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:26:41,030][47288] Updated weights for policy 0, policy_version 129686 (0.0028) [2024-04-26 08:26:43,498][47288] Updated weights for policy 0, policy_version 129696 (0.0027) [2024-04-26 08:26:43,923][47056] Fps is (10 sec: 57343.9, 60 sec: 57070.8, 300 sec: 56538.7). Total num frames: 2124939264. Throughput: 0: 56921.3. Samples: 2074253580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:43,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 08:26:46,628][47288] Updated weights for policy 0, policy_version 129706 (0.0028) [2024-04-26 08:26:48,923][47056] Fps is (10 sec: 58983.7, 60 sec: 57343.9, 300 sec: 56538.7). Total num frames: 2125234176. Throughput: 0: 56805.6. Samples: 2074591100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:48,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 08:26:49,035][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000129715_2125250560.pth... [2024-04-26 08:26:49,082][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000128887_2111684608.pth [2024-04-26 08:26:49,216][47288] Updated weights for policy 0, policy_version 129716 (0.0032) [2024-04-26 08:26:52,421][47288] Updated weights for policy 0, policy_version 129726 (0.0025) [2024-04-26 08:26:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 2125512704. Throughput: 0: 56779.6. Samples: 2074930140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:26:53,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 08:26:55,288][47288] Updated weights for policy 0, policy_version 129736 (0.0027) [2024-04-26 08:26:58,157][47288] Updated weights for policy 0, policy_version 129746 (0.0030) [2024-04-26 08:26:58,923][47056] Fps is (10 sec: 57343.7, 60 sec: 57070.8, 300 sec: 56538.7). 
Total num frames: 2125807616. Throughput: 0: 56668.9. Samples: 2075106360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:26:58,923][47056] Avg episode reward: [(0, '0.585')] [2024-04-26 08:27:01,112][47288] Updated weights for policy 0, policy_version 129756 (0.0025) [2024-04-26 08:27:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 2126069760. Throughput: 0: 56780.6. Samples: 2075446100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:03,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:27:03,937][47288] Updated weights for policy 0, policy_version 129766 (0.0025) [2024-04-26 08:27:06,888][47288] Updated weights for policy 0, policy_version 129776 (0.0027) [2024-04-26 08:27:08,923][47056] Fps is (10 sec: 52429.0, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2126331904. Throughput: 0: 56712.8. Samples: 2075782940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:08,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 08:27:09,426][47267] Signal inference workers to stop experience collection... (30800 times) [2024-04-26 08:27:09,456][47288] InferenceWorker_p0-w0: stopping experience collection (30800 times) [2024-04-26 08:27:09,482][47267] Signal inference workers to resume experience collection... (30800 times) [2024-04-26 08:27:09,486][47288] InferenceWorker_p0-w0: resuming experience collection (30800 times) [2024-04-26 08:27:09,597][47288] Updated weights for policy 0, policy_version 129786 (0.0029) [2024-04-26 08:27:12,694][47288] Updated weights for policy 0, policy_version 129796 (0.0030) [2024-04-26 08:27:13,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56483.2). Total num frames: 2126626816. Throughput: 0: 56698.1. Samples: 2075945260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:13,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 08:27:15,551][47288] Updated weights for policy 0, policy_version 129806 (0.0030) [2024-04-26 08:27:18,457][47288] Updated weights for policy 0, policy_version 129816 (0.0026) [2024-04-26 08:27:18,923][47056] Fps is (10 sec: 58982.6, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 2126921728. Throughput: 0: 56728.9. Samples: 2076287620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:18,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 08:27:21,396][47288] Updated weights for policy 0, policy_version 129826 (0.0034) [2024-04-26 08:27:23,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 2127200256. Throughput: 0: 56606.0. Samples: 2076625500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:23,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:27:24,345][47288] Updated weights for policy 0, policy_version 129836 (0.0029) [2024-04-26 08:27:27,007][47288] Updated weights for policy 0, policy_version 129846 (0.0035) [2024-04-26 08:27:28,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 2127478784. Throughput: 0: 56670.6. Samples: 2076803760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:28,932][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 08:27:30,168][47288] Updated weights for policy 0, policy_version 129856 (0.0029) [2024-04-26 08:27:32,712][47288] Updated weights for policy 0, policy_version 129866 (0.0033) [2024-04-26 08:27:33,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2127773696. Throughput: 0: 56627.2. Samples: 2077139320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:33,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 08:27:35,970][47288] Updated weights for policy 0, policy_version 129876 (0.0032) [2024-04-26 08:27:38,561][47288] Updated weights for policy 0, policy_version 129886 (0.0034) [2024-04-26 08:27:38,923][47056] Fps is (10 sec: 58981.7, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 2128068608. Throughput: 0: 56629.1. Samples: 2077478460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:38,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 08:27:41,585][47288] Updated weights for policy 0, policy_version 129896 (0.0026) [2024-04-26 08:27:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2128330752. Throughput: 0: 56500.2. Samples: 2077648860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:43,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 08:27:44,377][47288] Updated weights for policy 0, policy_version 129906 (0.0027) [2024-04-26 08:27:47,361][47288] Updated weights for policy 0, policy_version 129916 (0.0032) [2024-04-26 08:27:48,923][47056] Fps is (10 sec: 54068.6, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2128609280. Throughput: 0: 56535.2. Samples: 2077990180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:48,923][47056] Avg episode reward: [(0, '0.447')] [2024-04-26 08:27:50,171][47288] Updated weights for policy 0, policy_version 129926 (0.0027) [2024-04-26 08:27:53,220][47288] Updated weights for policy 0, policy_version 129936 (0.0027) [2024-04-26 08:27:53,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2128904192. Throughput: 0: 56529.4. Samples: 2078326760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:27:53,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 08:27:55,901][47288] Updated weights for policy 0, policy_version 129946 (0.0032) [2024-04-26 08:27:58,918][47288] Updated weights for policy 0, policy_version 129956 (0.0031) [2024-04-26 08:27:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 2129199104. Throughput: 0: 56621.9. Samples: 2078493240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:27:58,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:28:01,814][47288] Updated weights for policy 0, policy_version 129966 (0.0030) [2024-04-26 08:28:03,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2129461248. Throughput: 0: 56568.7. Samples: 2078833220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:03,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 08:28:04,606][47288] Updated weights for policy 0, policy_version 129976 (0.0027) [2024-04-26 08:28:07,676][47288] Updated weights for policy 0, policy_version 129986 (0.0029) [2024-04-26 08:28:08,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2129739776. Throughput: 0: 56615.7. Samples: 2079173200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:08,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 08:28:10,519][47288] Updated weights for policy 0, policy_version 129996 (0.0030) [2024-04-26 08:28:13,452][47288] Updated weights for policy 0, policy_version 130006 (0.0032) [2024-04-26 08:28:13,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2130034688. Throughput: 0: 56518.3. Samples: 2079347080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:13,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 08:28:16,416][47288] Updated weights for policy 0, policy_version 130016 (0.0027) [2024-04-26 08:28:18,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2130313216. Throughput: 0: 56460.7. Samples: 2079680060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:18,923][47056] Avg episode reward: [(0, '0.573')] [2024-04-26 08:28:19,087][47267] Signal inference workers to stop experience collection... (30850 times) [2024-04-26 08:28:19,087][47267] Signal inference workers to resume experience collection... (30850 times) [2024-04-26 08:28:19,103][47288] InferenceWorker_p0-w0: stopping experience collection (30850 times) [2024-04-26 08:28:19,103][47288] InferenceWorker_p0-w0: resuming experience collection (30850 times) [2024-04-26 08:28:19,210][47288] Updated weights for policy 0, policy_version 130026 (0.0028) [2024-04-26 08:28:22,243][47288] Updated weights for policy 0, policy_version 130036 (0.0032) [2024-04-26 08:28:23,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 2130591744. Throughput: 0: 56590.1. Samples: 2080025000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:23,923][47056] Avg episode reward: [(0, '0.621')] [2024-04-26 08:28:24,998][47288] Updated weights for policy 0, policy_version 130046 (0.0028) [2024-04-26 08:28:27,853][47288] Updated weights for policy 0, policy_version 130056 (0.0029) [2024-04-26 08:28:28,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.6, 300 sec: 56372.0). Total num frames: 2130853888. Throughput: 0: 56485.1. Samples: 2080190700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:28,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 08:28:30,923][47288] Updated weights for policy 0, policy_version 130066 (0.0031) [2024-04-26 08:28:33,707][47288] Updated weights for policy 0, policy_version 130076 (0.0028) [2024-04-26 08:28:33,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 2131165184. Throughput: 0: 56271.9. Samples: 2080522420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:33,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 08:28:36,765][47288] Updated weights for policy 0, policy_version 130086 (0.0029) [2024-04-26 08:28:38,923][47056] Fps is (10 sec: 57345.2, 60 sec: 55978.9, 300 sec: 56594.2). Total num frames: 2131427328. Throughput: 0: 56340.0. Samples: 2080862060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:38,923][47056] Avg episode reward: [(0, '0.598')] [2024-04-26 08:28:39,587][47288] Updated weights for policy 0, policy_version 130096 (0.0034) [2024-04-26 08:28:42,701][47288] Updated weights for policy 0, policy_version 130106 (0.0030) [2024-04-26 08:28:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 2131722240. Throughput: 0: 56390.7. Samples: 2081030820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:43,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:28:45,240][47288] Updated weights for policy 0, policy_version 130116 (0.0024) [2024-04-26 08:28:48,445][47288] Updated weights for policy 0, policy_version 130126 (0.0026) [2024-04-26 08:28:48,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56524.6, 300 sec: 56649.7). Total num frames: 2132000768. Throughput: 0: 56284.0. Samples: 2081366000. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:48,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 08:28:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000130127_2132000768.pth... [2024-04-26 08:28:48,979][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000129300_2118451200.pth [2024-04-26 08:28:50,978][47288] Updated weights for policy 0, policy_version 130136 (0.0028) [2024-04-26 08:28:53,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.7, 300 sec: 56594.2). Total num frames: 2132295680. Throughput: 0: 56255.4. Samples: 2081704700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 08:28:53,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 08:28:54,143][47288] Updated weights for policy 0, policy_version 130146 (0.0031) [2024-04-26 08:28:56,874][47288] Updated weights for policy 0, policy_version 130156 (0.0026) [2024-04-26 08:28:58,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2132574208. Throughput: 0: 56237.3. Samples: 2081877760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:28:58,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 08:29:00,006][47288] Updated weights for policy 0, policy_version 130166 (0.0030) [2024-04-26 08:29:02,946][47288] Updated weights for policy 0, policy_version 130176 (0.0036) [2024-04-26 08:29:03,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 2132836352. Throughput: 0: 56313.5. Samples: 2082214160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:03,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 08:29:05,796][47288] Updated weights for policy 0, policy_version 130186 (0.0025) [2024-04-26 08:29:08,689][47288] Updated weights for policy 0, policy_version 130196 (0.0029) [2024-04-26 08:29:08,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 56538.7). 
Total num frames: 2133131264. Throughput: 0: 56182.1. Samples: 2082553200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:08,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:29:11,502][47288] Updated weights for policy 0, policy_version 130206 (0.0028) [2024-04-26 08:29:13,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2133409792. Throughput: 0: 56175.8. Samples: 2082718600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:13,923][47056] Avg episode reward: [(0, '0.442')] [2024-04-26 08:29:14,545][47288] Updated weights for policy 0, policy_version 130216 (0.0026) [2024-04-26 08:29:16,658][47267] Signal inference workers to stop experience collection... (30900 times) [2024-04-26 08:29:16,693][47288] InferenceWorker_p0-w0: stopping experience collection (30900 times) [2024-04-26 08:29:16,719][47267] Signal inference workers to resume experience collection... (30900 times) [2024-04-26 08:29:16,719][47288] InferenceWorker_p0-w0: resuming experience collection (30900 times) [2024-04-26 08:29:17,229][47288] Updated weights for policy 0, policy_version 130226 (0.0032) [2024-04-26 08:29:18,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2133688320. Throughput: 0: 56254.1. Samples: 2083053860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:18,923][47056] Avg episode reward: [(0, '0.579')] [2024-04-26 08:29:20,188][47288] Updated weights for policy 0, policy_version 130236 (0.0029) [2024-04-26 08:29:23,226][47288] Updated weights for policy 0, policy_version 130246 (0.0030) [2024-04-26 08:29:23,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.7, 300 sec: 56649.7). Total num frames: 2133983232. Throughput: 0: 56279.0. Samples: 2083394620. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:23,924][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 08:29:26,103][47288] Updated weights for policy 0, policy_version 130256 (0.0032) [2024-04-26 08:29:28,889][47288] Updated weights for policy 0, policy_version 130266 (0.0024) [2024-04-26 08:29:28,923][47056] Fps is (10 sec: 58982.8, 60 sec: 57071.0, 300 sec: 56649.8). Total num frames: 2134278144. Throughput: 0: 56412.3. Samples: 2083569380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:28,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 08:29:31,832][47288] Updated weights for policy 0, policy_version 130276 (0.0030) [2024-04-26 08:29:33,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56251.8, 300 sec: 56594.2). Total num frames: 2134540288. Throughput: 0: 56553.1. Samples: 2083910880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:33,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 08:29:34,624][47288] Updated weights for policy 0, policy_version 130286 (0.0031) [2024-04-26 08:29:37,605][47288] Updated weights for policy 0, policy_version 130296 (0.0037) [2024-04-26 08:29:38,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 2134835200. Throughput: 0: 56666.3. Samples: 2084254680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:38,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:29:40,383][47288] Updated weights for policy 0, policy_version 130306 (0.0026) [2024-04-26 08:29:43,466][47288] Updated weights for policy 0, policy_version 130316 (0.0028) [2024-04-26 08:29:43,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2135113728. Throughput: 0: 56606.6. Samples: 2084425060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:43,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 08:29:46,179][47288] Updated weights for policy 0, policy_version 130326 (0.0028) [2024-04-26 08:29:48,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 2135375872. Throughput: 0: 56696.3. Samples: 2084765500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:48,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 08:29:49,280][47288] Updated weights for policy 0, policy_version 130336 (0.0026) [2024-04-26 08:29:51,824][47288] Updated weights for policy 0, policy_version 130346 (0.0025) [2024-04-26 08:29:53,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2135670784. Throughput: 0: 56668.3. Samples: 2085103280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:53,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 08:29:55,162][47288] Updated weights for policy 0, policy_version 130356 (0.0027) [2024-04-26 08:29:57,646][47288] Updated weights for policy 0, policy_version 130366 (0.0027) [2024-04-26 08:29:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 56594.2). Total num frames: 2135949312. Throughput: 0: 56698.7. Samples: 2085270040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:29:58,923][47056] Avg episode reward: [(0, '0.589')] [2024-04-26 08:30:00,980][47288] Updated weights for policy 0, policy_version 130376 (0.0023) [2024-04-26 08:30:03,477][47288] Updated weights for policy 0, policy_version 130386 (0.0026) [2024-04-26 08:30:03,923][47056] Fps is (10 sec: 58983.0, 60 sec: 57070.9, 300 sec: 56705.3). Total num frames: 2136260608. Throughput: 0: 56822.8. Samples: 2085610880. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:03,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 08:30:06,739][47288] Updated weights for policy 0, policy_version 130396 (0.0040) [2024-04-26 08:30:08,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.9, 300 sec: 56649.8). Total num frames: 2136539136. Throughput: 0: 56805.4. Samples: 2085950860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:08,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 08:30:09,444][47288] Updated weights for policy 0, policy_version 130406 (0.0031) [2024-04-26 08:30:12,395][47288] Updated weights for policy 0, policy_version 130416 (0.0027) [2024-04-26 08:30:13,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56797.8, 300 sec: 56649.8). Total num frames: 2136817664. Throughput: 0: 56656.0. Samples: 2086118900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:13,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 08:30:15,237][47288] Updated weights for policy 0, policy_version 130426 (0.0029) [2024-04-26 08:30:16,360][47267] Signal inference workers to stop experience collection... (30950 times) [2024-04-26 08:30:16,369][47288] InferenceWorker_p0-w0: stopping experience collection (30950 times) [2024-04-26 08:30:16,455][47267] Signal inference workers to resume experience collection... (30950 times) [2024-04-26 08:30:16,455][47288] InferenceWorker_p0-w0: resuming experience collection (30950 times) [2024-04-26 08:30:18,192][47288] Updated weights for policy 0, policy_version 130436 (0.0032) [2024-04-26 08:30:18,923][47056] Fps is (10 sec: 54065.8, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2137079808. Throughput: 0: 56661.0. Samples: 2086460640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:18,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 08:30:20,913][47288] Updated weights for policy 0, policy_version 130446 (0.0025) [2024-04-26 08:30:23,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 2137374720. Throughput: 0: 56580.4. Samples: 2086800800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:23,923][47056] Avg episode reward: [(0, '0.603')] [2024-04-26 08:30:23,990][47288] Updated weights for policy 0, policy_version 130456 (0.0031) [2024-04-26 08:30:26,732][47288] Updated weights for policy 0, policy_version 130466 (0.0032) [2024-04-26 08:30:28,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.6, 300 sec: 56483.1). Total num frames: 2137636864. Throughput: 0: 56410.2. Samples: 2086963520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:28,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 08:30:29,884][47288] Updated weights for policy 0, policy_version 130476 (0.0027) [2024-04-26 08:30:32,588][47288] Updated weights for policy 0, policy_version 130486 (0.0027) [2024-04-26 08:30:33,923][47056] Fps is (10 sec: 52428.7, 60 sec: 55978.6, 300 sec: 56427.6). Total num frames: 2137899008. Throughput: 0: 56298.2. Samples: 2087298920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:33,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 08:30:35,727][47288] Updated weights for policy 0, policy_version 130496 (0.0034) [2024-04-26 08:30:38,308][47288] Updated weights for policy 0, policy_version 130506 (0.0031) [2024-04-26 08:30:38,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56524.7, 300 sec: 56649.7). Total num frames: 2138226688. Throughput: 0: 56317.3. Samples: 2087637560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:38,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 08:30:41,497][47288] Updated weights for policy 0, policy_version 130516 (0.0028) [2024-04-26 08:30:43,923][47056] Fps is (10 sec: 62259.9, 60 sec: 56798.0, 300 sec: 56705.3). Total num frames: 2138521600. Throughput: 0: 56498.3. Samples: 2087812460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:43,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:30:43,976][47288] Updated weights for policy 0, policy_version 130526 (0.0028) [2024-04-26 08:30:47,215][47288] Updated weights for policy 0, policy_version 130536 (0.0024) [2024-04-26 08:30:48,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56798.0, 300 sec: 56594.2). Total num frames: 2138783744. Throughput: 0: 56422.3. Samples: 2088149880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:48,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 08:30:48,991][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000130542_2138800128.pth... [2024-04-26 08:30:49,048][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000129715_2125250560.pth [2024-04-26 08:30:49,863][47288] Updated weights for policy 0, policy_version 130546 (0.0031) [2024-04-26 08:30:53,119][47288] Updated weights for policy 0, policy_version 130556 (0.0031) [2024-04-26 08:30:53,923][47056] Fps is (10 sec: 55704.9, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2139078656. Throughput: 0: 56372.8. Samples: 2088487640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:53,923][47056] Avg episode reward: [(0, '0.604')] [2024-04-26 08:30:55,848][47288] Updated weights for policy 0, policy_version 130566 (0.0031) [2024-04-26 08:30:58,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2139340800. Throughput: 0: 56483.7. Samples: 2088660660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:30:58,923][47056] Avg episode reward: [(0, '0.409')] [2024-04-26 08:30:58,930][47288] Updated weights for policy 0, policy_version 130576 (0.0028) [2024-04-26 08:31:01,558][47288] Updated weights for policy 0, policy_version 130586 (0.0032) [2024-04-26 08:31:03,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 56538.7). Total num frames: 2139619328. Throughput: 0: 56492.8. Samples: 2089002800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:03,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 08:31:04,642][47288] Updated weights for policy 0, policy_version 130596 (0.0042) [2024-04-26 08:31:07,270][47288] Updated weights for policy 0, policy_version 130606 (0.0026) [2024-04-26 08:31:08,923][47056] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 56316.5). Total num frames: 2139881472. Throughput: 0: 56417.7. Samples: 2089339600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:08,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 08:31:10,439][47288] Updated weights for policy 0, policy_version 130616 (0.0028) [2024-04-26 08:31:13,133][47288] Updated weights for policy 0, policy_version 130626 (0.0030) [2024-04-26 08:31:13,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2140192768. Throughput: 0: 56433.1. Samples: 2089503000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:13,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 08:31:15,885][47267] Signal inference workers to stop experience collection... (31000 times) [2024-04-26 08:31:15,921][47288] InferenceWorker_p0-w0: stopping experience collection (31000 times) [2024-04-26 08:31:15,942][47267] Signal inference workers to resume experience collection... 
(31000 times) [2024-04-26 08:31:15,946][47288] InferenceWorker_p0-w0: resuming experience collection (31000 times) [2024-04-26 08:31:16,362][47288] Updated weights for policy 0, policy_version 130636 (0.0025) [2024-04-26 08:31:18,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56798.0, 300 sec: 56594.2). Total num frames: 2140487680. Throughput: 0: 56512.8. Samples: 2089842000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:18,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 08:31:18,949][47288] Updated weights for policy 0, policy_version 130646 (0.0032) [2024-04-26 08:31:22,333][47288] Updated weights for policy 0, policy_version 130656 (0.0030) [2024-04-26 08:31:23,923][47056] Fps is (10 sec: 60620.3, 60 sec: 57070.9, 300 sec: 56705.3). Total num frames: 2140798976. Throughput: 0: 56465.9. Samples: 2090178520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:23,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:31:24,619][47288] Updated weights for policy 0, policy_version 130666 (0.0034) [2024-04-26 08:31:28,043][47288] Updated weights for policy 0, policy_version 130676 (0.0030) [2024-04-26 08:31:28,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2141044736. Throughput: 0: 56539.8. Samples: 2090356760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:28,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 08:31:30,304][47288] Updated weights for policy 0, policy_version 130686 (0.0031) [2024-04-26 08:31:33,727][47288] Updated weights for policy 0, policy_version 130696 (0.0025) [2024-04-26 08:31:33,923][47056] Fps is (10 sec: 52429.2, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 2141323264. Throughput: 0: 56666.2. Samples: 2090699860. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:33,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 08:31:36,423][47288] Updated weights for policy 0, policy_version 130706 (0.0025) [2024-04-26 08:31:38,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2141601792. Throughput: 0: 56692.3. Samples: 2091038800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:38,923][47056] Avg episode reward: [(0, '0.393')] [2024-04-26 08:31:39,419][47288] Updated weights for policy 0, policy_version 130716 (0.0034) [2024-04-26 08:31:42,442][47288] Updated weights for policy 0, policy_version 130726 (0.0032) [2024-04-26 08:31:43,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 56372.1). Total num frames: 2141863936. Throughput: 0: 56384.9. Samples: 2091197980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:43,923][47056] Avg episode reward: [(0, '0.584')] [2024-04-26 08:31:45,367][47288] Updated weights for policy 0, policy_version 130736 (0.0037) [2024-04-26 08:31:48,251][47288] Updated weights for policy 0, policy_version 130746 (0.0034) [2024-04-26 08:31:48,923][47056] Fps is (10 sec: 55706.4, 60 sec: 56251.6, 300 sec: 56427.6). Total num frames: 2142158848. Throughput: 0: 56428.8. Samples: 2091542100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:48,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 08:31:51,113][47288] Updated weights for policy 0, policy_version 130756 (0.0027) [2024-04-26 08:31:53,884][47288] Updated weights for policy 0, policy_version 130766 (0.0028) [2024-04-26 08:31:53,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 2142470144. Throughput: 0: 56616.6. Samples: 2091887340. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:53,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 08:31:56,827][47288] Updated weights for policy 0, policy_version 130776 (0.0027) [2024-04-26 08:31:58,923][47056] Fps is (10 sec: 60620.3, 60 sec: 57070.7, 300 sec: 56594.2). Total num frames: 2142765056. Throughput: 0: 56787.3. Samples: 2092058440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:31:58,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 08:31:59,509][47288] Updated weights for policy 0, policy_version 130786 (0.0031) [2024-04-26 08:32:02,628][47288] Updated weights for policy 0, policy_version 130796 (0.0025) [2024-04-26 08:32:03,923][47056] Fps is (10 sec: 57344.6, 60 sec: 57071.0, 300 sec: 56649.8). Total num frames: 2143043584. Throughput: 0: 56893.7. Samples: 2092402200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 08:32:03,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:32:05,333][47288] Updated weights for policy 0, policy_version 130806 (0.0032) [2024-04-26 08:32:08,549][47288] Updated weights for policy 0, policy_version 130816 (0.0028) [2024-04-26 08:32:08,923][47056] Fps is (10 sec: 55706.4, 60 sec: 57344.1, 300 sec: 56594.2). Total num frames: 2143322112. Throughput: 0: 56912.0. Samples: 2092739560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:08,923][47056] Avg episode reward: [(0, '0.463')] [2024-04-26 08:32:11,100][47288] Updated weights for policy 0, policy_version 130826 (0.0033) [2024-04-26 08:32:13,923][47056] Fps is (10 sec: 54066.1, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2143584256. Throughput: 0: 56536.0. Samples: 2092900880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:13,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:32:14,298][47288] Updated weights for policy 0, policy_version 130836 (0.0028) [2024-04-26 08:32:16,948][47288] Updated weights for policy 0, policy_version 130846 (0.0025) [2024-04-26 08:32:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 2143862784. Throughput: 0: 56679.5. Samples: 2093250440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:18,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 08:32:19,194][47267] Signal inference workers to stop experience collection... (31050 times) [2024-04-26 08:32:19,245][47288] InferenceWorker_p0-w0: stopping experience collection (31050 times) [2024-04-26 08:32:19,249][47267] Signal inference workers to resume experience collection... (31050 times) [2024-04-26 08:32:19,263][47288] InferenceWorker_p0-w0: resuming experience collection (31050 times) [2024-04-26 08:32:19,978][47288] Updated weights for policy 0, policy_version 130856 (0.0028) [2024-04-26 08:32:22,569][47288] Updated weights for policy 0, policy_version 130866 (0.0026) [2024-04-26 08:32:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 56483.1). Total num frames: 2144141312. Throughput: 0: 56649.9. Samples: 2093588040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:23,924][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 08:32:25,885][47288] Updated weights for policy 0, policy_version 130876 (0.0037) [2024-04-26 08:32:28,444][47288] Updated weights for policy 0, policy_version 130886 (0.0033) [2024-04-26 08:32:28,923][47056] Fps is (10 sec: 57342.6, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2144436224. Throughput: 0: 56644.9. Samples: 2093747020. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:28,924][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:32:31,688][47288] Updated weights for policy 0, policy_version 130896 (0.0034) [2024-04-26 08:32:33,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56797.8, 300 sec: 56483.2). Total num frames: 2144731136. Throughput: 0: 56520.5. Samples: 2094085520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:33,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 08:32:34,355][47288] Updated weights for policy 0, policy_version 130906 (0.0028) [2024-04-26 08:32:37,427][47288] Updated weights for policy 0, policy_version 130916 (0.0025) [2024-04-26 08:32:38,923][47056] Fps is (10 sec: 60621.9, 60 sec: 57344.1, 300 sec: 56649.7). Total num frames: 2145042432. Throughput: 0: 56418.1. Samples: 2094426160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:38,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 08:32:40,127][47288] Updated weights for policy 0, policy_version 130926 (0.0029) [2024-04-26 08:32:43,174][47288] Updated weights for policy 0, policy_version 130936 (0.0034) [2024-04-26 08:32:43,923][47056] Fps is (10 sec: 57344.7, 60 sec: 57344.0, 300 sec: 56594.2). Total num frames: 2145304576. Throughput: 0: 56605.2. Samples: 2094605660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:43,923][47056] Avg episode reward: [(0, '0.596')] [2024-04-26 08:32:45,977][47288] Updated weights for policy 0, policy_version 130946 (0.0038) [2024-04-26 08:32:48,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 2145566720. Throughput: 0: 56664.3. Samples: 2094952100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:48,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 08:32:48,973][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000130956_2145583104.pth... 
[2024-04-26 08:32:48,978][47288] Updated weights for policy 0, policy_version 130956 (0.0037) [2024-04-26 08:32:49,017][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000130127_2132000768.pth [2024-04-26 08:32:51,627][47288] Updated weights for policy 0, policy_version 130966 (0.0027) [2024-04-26 08:32:53,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 2145828864. Throughput: 0: 56662.3. Samples: 2095289360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:53,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 08:32:54,690][47288] Updated weights for policy 0, policy_version 130976 (0.0027) [2024-04-26 08:32:57,551][47288] Updated weights for policy 0, policy_version 130986 (0.0029) [2024-04-26 08:32:58,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 56483.2). Total num frames: 2146123776. Throughput: 0: 56473.0. Samples: 2095442160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:32:58,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 08:33:00,590][47288] Updated weights for policy 0, policy_version 130996 (0.0033) [2024-04-26 08:33:03,533][47288] Updated weights for policy 0, policy_version 131006 (0.0033) [2024-04-26 08:33:03,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.6, 300 sec: 56538.7). Total num frames: 2146418688. Throughput: 0: 56389.8. Samples: 2095787980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 08:33:03,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:33:06,397][47288] Updated weights for policy 0, policy_version 131016 (0.0031) [2024-04-26 08:33:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2146697216. Throughput: 0: 56438.9. Samples: 2096127780. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:08,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 08:33:09,126][47288] Updated weights for policy 0, policy_version 131026 (0.0033) [2024-04-26 08:33:12,021][47288] Updated weights for policy 0, policy_version 131036 (0.0035) [2024-04-26 08:33:13,923][47056] Fps is (10 sec: 58982.7, 60 sec: 57071.1, 300 sec: 56594.3). Total num frames: 2147008512. Throughput: 0: 56783.1. Samples: 2096302240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:13,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 08:33:14,787][47288] Updated weights for policy 0, policy_version 131046 (0.0031) [2024-04-26 08:33:17,715][47288] Updated weights for policy 0, policy_version 131056 (0.0032) [2024-04-26 08:33:18,923][47056] Fps is (10 sec: 58981.9, 60 sec: 57070.9, 300 sec: 56594.2). Total num frames: 2147287040. Throughput: 0: 56813.7. Samples: 2096642140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:18,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 08:33:20,816][47288] Updated weights for policy 0, policy_version 131066 (0.0030) [2024-04-26 08:33:23,637][47288] Updated weights for policy 0, policy_version 131076 (0.0029) [2024-04-26 08:33:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 57071.1, 300 sec: 56649.8). Total num frames: 2147565568. Throughput: 0: 56772.2. Samples: 2096980900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:23,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 08:33:26,614][47288] Updated weights for policy 0, policy_version 131086 (0.0030) [2024-04-26 08:33:28,521][47267] Signal inference workers to stop experience collection... (31100 times) [2024-04-26 08:33:28,563][47288] InferenceWorker_p0-w0: stopping experience collection (31100 times) [2024-04-26 08:33:28,574][47267] Signal inference workers to resume experience collection... 
(31100 times) [2024-04-26 08:33:28,577][47288] InferenceWorker_p0-w0: resuming experience collection (31100 times) [2024-04-26 08:33:28,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56798.1, 300 sec: 56538.7). Total num frames: 2147844096. Throughput: 0: 56629.2. Samples: 2097153980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:28,923][47056] Avg episode reward: [(0, '0.577')] [2024-04-26 08:33:29,378][47288] Updated weights for policy 0, policy_version 131096 (0.0035) [2024-04-26 08:33:32,456][47288] Updated weights for policy 0, policy_version 131106 (0.0029) [2024-04-26 08:33:33,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2148106240. Throughput: 0: 56427.9. Samples: 2097491360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:33,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 08:33:35,166][47288] Updated weights for policy 0, policy_version 131116 (0.0025) [2024-04-26 08:33:38,054][47288] Updated weights for policy 0, policy_version 131126 (0.0026) [2024-04-26 08:33:38,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 56483.1). Total num frames: 2148384768. Throughput: 0: 56586.4. Samples: 2097835760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:38,923][47056] Avg episode reward: [(0, '0.528')] [2024-04-26 08:33:40,883][47288] Updated weights for policy 0, policy_version 131136 (0.0025) [2024-04-26 08:33:43,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.6, 300 sec: 56538.7). Total num frames: 2148679680. Throughput: 0: 56725.7. Samples: 2097994820. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:43,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:33:44,100][47288] Updated weights for policy 0, policy_version 131146 (0.0031) [2024-04-26 08:33:46,606][47288] Updated weights for policy 0, policy_version 131156 (0.0027) [2024-04-26 08:33:48,923][47056] Fps is (10 sec: 60621.5, 60 sec: 57070.9, 300 sec: 56594.2). Total num frames: 2148990976. Throughput: 0: 56687.5. Samples: 2098338920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:48,923][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 08:33:49,838][47288] Updated weights for policy 0, policy_version 131166 (0.0028) [2024-04-26 08:33:52,389][47288] Updated weights for policy 0, policy_version 131176 (0.0028) [2024-04-26 08:33:53,923][47056] Fps is (10 sec: 58981.1, 60 sec: 57343.7, 300 sec: 56594.2). Total num frames: 2149269504. Throughput: 0: 56720.5. Samples: 2098680220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:53,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 08:33:55,609][47288] Updated weights for policy 0, policy_version 131186 (0.0029) [2024-04-26 08:33:58,095][47288] Updated weights for policy 0, policy_version 131196 (0.0030) [2024-04-26 08:33:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 57344.0, 300 sec: 56705.3). Total num frames: 2149564416. Throughput: 0: 56844.3. Samples: 2098860240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:33:58,923][47056] Avg episode reward: [(0, '0.589')] [2024-04-26 08:34:01,255][47288] Updated weights for policy 0, policy_version 131206 (0.0027) [2024-04-26 08:34:03,923][47056] Fps is (10 sec: 55707.2, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2149826560. Throughput: 0: 56828.1. Samples: 2099199400. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:34:03,923][47056] Avg episode reward: [(0, '0.614')] [2024-04-26 08:34:03,953][47288] Updated weights for policy 0, policy_version 131216 (0.0027) [2024-04-26 08:34:07,113][47288] Updated weights for policy 0, policy_version 131226 (0.0028) [2024-04-26 08:34:08,923][47056] Fps is (10 sec: 52428.7, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2150088704. Throughput: 0: 56715.0. Samples: 2099533080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 08:34:08,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 08:34:09,703][47288] Updated weights for policy 0, policy_version 131236 (0.0027) [2024-04-26 08:34:13,133][47288] Updated weights for policy 0, policy_version 131246 (0.0029) [2024-04-26 08:34:13,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 56594.3). Total num frames: 2150383616. Throughput: 0: 56665.5. Samples: 2099703920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:13,923][47056] Avg episode reward: [(0, '0.478')] [2024-04-26 08:34:15,465][47288] Updated weights for policy 0, policy_version 131256 (0.0034) [2024-04-26 08:34:18,751][47288] Updated weights for policy 0, policy_version 131266 (0.0028) [2024-04-26 08:34:18,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2150662144. Throughput: 0: 56692.0. Samples: 2100042500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:18,923][47056] Avg episode reward: [(0, '0.609')] [2024-04-26 08:34:21,307][47288] Updated weights for policy 0, policy_version 131276 (0.0030) [2024-04-26 08:34:23,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2150957056. Throughput: 0: 56590.3. Samples: 2100382320. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:23,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 08:34:24,446][47288] Updated weights for policy 0, policy_version 131286 (0.0030) [2024-04-26 08:34:27,070][47288] Updated weights for policy 0, policy_version 131296 (0.0031) [2024-04-26 08:34:28,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56797.9, 300 sec: 56649.7). Total num frames: 2151251968. Throughput: 0: 57054.3. Samples: 2100562260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:28,923][47056] Avg episode reward: [(0, '0.485')] [2024-04-26 08:34:30,294][47288] Updated weights for policy 0, policy_version 131306 (0.0028) [2024-04-26 08:34:32,725][47288] Updated weights for policy 0, policy_version 131316 (0.0032) [2024-04-26 08:34:33,923][47056] Fps is (10 sec: 58982.3, 60 sec: 57344.0, 300 sec: 56649.8). Total num frames: 2151546880. Throughput: 0: 56950.6. Samples: 2100901700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:33,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 08:34:36,159][47288] Updated weights for policy 0, policy_version 131326 (0.0031) [2024-04-26 08:34:36,636][47267] Signal inference workers to stop experience collection... (31150 times) [2024-04-26 08:34:36,672][47288] InferenceWorker_p0-w0: stopping experience collection (31150 times) [2024-04-26 08:34:36,726][47267] Signal inference workers to resume experience collection... (31150 times) [2024-04-26 08:34:36,726][47288] InferenceWorker_p0-w0: resuming experience collection (31150 times) [2024-04-26 08:34:38,520][47288] Updated weights for policy 0, policy_version 131336 (0.0029) [2024-04-26 08:34:38,923][47056] Fps is (10 sec: 57343.3, 60 sec: 57344.0, 300 sec: 56649.7). Total num frames: 2151825408. Throughput: 0: 56745.5. Samples: 2101233760. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:38,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 08:34:41,920][47288] Updated weights for policy 0, policy_version 131346 (0.0032) [2024-04-26 08:34:43,923][47056] Fps is (10 sec: 52429.0, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 2152071168. Throughput: 0: 56625.8. Samples: 2101408400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:43,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 08:34:44,410][47288] Updated weights for policy 0, policy_version 131356 (0.0029) [2024-04-26 08:34:47,667][47288] Updated weights for policy 0, policy_version 131366 (0.0025) [2024-04-26 08:34:48,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2152349696. Throughput: 0: 56624.8. Samples: 2101747520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:48,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 08:34:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000131369_2152349696.pth... [2024-04-26 08:34:48,984][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000130542_2138800128.pth [2024-04-26 08:34:50,190][47288] Updated weights for policy 0, policy_version 131376 (0.0037) [2024-04-26 08:34:53,524][47288] Updated weights for policy 0, policy_version 131386 (0.0027) [2024-04-26 08:34:53,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56252.0, 300 sec: 56594.2). Total num frames: 2152644608. Throughput: 0: 56815.2. Samples: 2102089760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:53,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 08:34:55,934][47288] Updated weights for policy 0, policy_version 131396 (0.0029) [2024-04-26 08:34:58,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 56483.2). Total num frames: 2152923136. Throughput: 0: 56557.3. Samples: 2102249000. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:34:58,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 08:34:59,252][47288] Updated weights for policy 0, policy_version 131406 (0.0031) [2024-04-26 08:35:01,686][47288] Updated weights for policy 0, policy_version 131416 (0.0031) [2024-04-26 08:35:03,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2153218048. Throughput: 0: 56595.9. Samples: 2102589320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:35:03,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 08:35:04,921][47288] Updated weights for policy 0, policy_version 131426 (0.0030) [2024-04-26 08:35:07,513][47288] Updated weights for policy 0, policy_version 131436 (0.0029) [2024-04-26 08:35:08,923][47056] Fps is (10 sec: 60621.0, 60 sec: 57344.1, 300 sec: 56649.8). Total num frames: 2153529344. Throughput: 0: 56649.5. Samples: 2102931540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 08:35:08,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 08:35:10,798][47288] Updated weights for policy 0, policy_version 131446 (0.0027) [2024-04-26 08:35:13,313][47288] Updated weights for policy 0, policy_version 131456 (0.0028) [2024-04-26 08:35:13,923][47056] Fps is (10 sec: 60621.6, 60 sec: 57343.9, 300 sec: 56760.9). Total num frames: 2153824256. Throughput: 0: 56515.6. Samples: 2103105460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:13,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 08:35:16,486][47288] Updated weights for policy 0, policy_version 131466 (0.0025) [2024-04-26 08:35:18,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56798.0, 300 sec: 56594.2). Total num frames: 2154070016. Throughput: 0: 56642.0. Samples: 2103450580. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:18,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 08:35:19,087][47288] Updated weights for policy 0, policy_version 131476 (0.0031) [2024-04-26 08:35:22,209][47288] Updated weights for policy 0, policy_version 131486 (0.0025) [2024-04-26 08:35:23,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56524.8, 300 sec: 56649.8). Total num frames: 2154348544. Throughput: 0: 56758.4. Samples: 2103787880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:23,923][47056] Avg episode reward: [(0, '0.603')] [2024-04-26 08:35:24,799][47288] Updated weights for policy 0, policy_version 131496 (0.0030) [2024-04-26 08:35:27,974][47288] Updated weights for policy 0, policy_version 131506 (0.0029) [2024-04-26 08:35:28,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56251.7, 300 sec: 56705.3). Total num frames: 2154627072. Throughput: 0: 56454.1. Samples: 2103948840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:28,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 08:35:30,635][47288] Updated weights for policy 0, policy_version 131516 (0.0034) [2024-04-26 08:35:33,680][47288] Updated weights for policy 0, policy_version 131526 (0.0027) [2024-04-26 08:35:33,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56524.8, 300 sec: 56649.8). Total num frames: 2154938368. Throughput: 0: 56579.5. Samples: 2104293600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:33,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 08:35:36,398][47288] Updated weights for policy 0, policy_version 131536 (0.0025) [2024-04-26 08:35:38,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56251.9, 300 sec: 56538.7). Total num frames: 2155200512. Throughput: 0: 56544.0. Samples: 2104634240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:38,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 08:35:39,428][47288] Updated weights for policy 0, policy_version 131546 (0.0033) [2024-04-26 08:35:42,165][47288] Updated weights for policy 0, policy_version 131556 (0.0028) [2024-04-26 08:35:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 57070.9, 300 sec: 56649.7). Total num frames: 2155495424. Throughput: 0: 56746.5. Samples: 2104802600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:43,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 08:35:45,250][47288] Updated weights for policy 0, policy_version 131566 (0.0032) [2024-04-26 08:35:47,867][47288] Updated weights for policy 0, policy_version 131576 (0.0028) [2024-04-26 08:35:48,313][47267] Signal inference workers to stop experience collection... (31200 times) [2024-04-26 08:35:48,314][47267] Signal inference workers to resume experience collection... (31200 times) [2024-04-26 08:35:48,327][47288] InferenceWorker_p0-w0: stopping experience collection (31200 times) [2024-04-26 08:35:48,345][47288] InferenceWorker_p0-w0: resuming experience collection (31200 times) [2024-04-26 08:35:48,923][47056] Fps is (10 sec: 58982.6, 60 sec: 57344.1, 300 sec: 56649.8). Total num frames: 2155790336. Throughput: 0: 56726.9. Samples: 2105142020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:48,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 08:35:51,233][47288] Updated weights for policy 0, policy_version 131586 (0.0029) [2024-04-26 08:35:53,858][47288] Updated weights for policy 0, policy_version 131596 (0.0031) [2024-04-26 08:35:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 57070.8, 300 sec: 56705.3). Total num frames: 2156068864. Throughput: 0: 56651.8. Samples: 2105480880. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:53,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 08:35:56,991][47288] Updated weights for policy 0, policy_version 131606 (0.0027) [2024-04-26 08:35:58,923][47056] Fps is (10 sec: 54065.5, 60 sec: 56797.6, 300 sec: 56649.7). Total num frames: 2156331008. Throughput: 0: 56593.9. Samples: 2105652200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:35:58,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 08:35:59,837][47288] Updated weights for policy 0, policy_version 131616 (0.0030) [2024-04-26 08:36:02,782][47288] Updated weights for policy 0, policy_version 131626 (0.0026) [2024-04-26 08:36:03,923][47056] Fps is (10 sec: 54068.0, 60 sec: 56525.0, 300 sec: 56705.3). Total num frames: 2156609536. Throughput: 0: 56576.9. Samples: 2105996540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:36:03,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:36:05,557][47288] Updated weights for policy 0, policy_version 131636 (0.0032) [2024-04-26 08:36:08,618][47288] Updated weights for policy 0, policy_version 131646 (0.0030) [2024-04-26 08:36:08,923][47056] Fps is (10 sec: 55707.1, 60 sec: 55978.6, 300 sec: 56594.2). Total num frames: 2156888064. Throughput: 0: 56583.6. Samples: 2106334140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:36:08,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 08:36:11,354][47288] Updated weights for policy 0, policy_version 131656 (0.0029) [2024-04-26 08:36:13,923][47056] Fps is (10 sec: 57343.3, 60 sec: 55978.7, 300 sec: 56594.3). Total num frames: 2157182976. Throughput: 0: 56730.8. Samples: 2106501720. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-26 08:36:13,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 08:36:14,268][47288] Updated weights for policy 0, policy_version 131666 (0.0030) [2024-04-26 08:36:17,165][47288] Updated weights for policy 0, policy_version 131676 (0.0032) [2024-04-26 08:36:18,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 2157477888. Throughput: 0: 56538.3. Samples: 2106837820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:36:18,932][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 08:36:20,155][47288] Updated weights for policy 0, policy_version 131686 (0.0029) [2024-04-26 08:36:22,857][47288] Updated weights for policy 0, policy_version 131696 (0.0029) [2024-04-26 08:36:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 56649.8). Total num frames: 2157756416. Throughput: 0: 56521.4. Samples: 2107177700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:36:23,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 08:36:26,064][47288] Updated weights for policy 0, policy_version 131706 (0.0029) [2024-04-26 08:36:28,668][47288] Updated weights for policy 0, policy_version 131716 (0.0030) [2024-04-26 08:36:28,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56797.9, 300 sec: 56649.7). Total num frames: 2158034944. Throughput: 0: 56581.3. Samples: 2107348760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:36:28,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 08:36:31,771][47288] Updated weights for policy 0, policy_version 131726 (0.0031) [2024-04-26 08:36:33,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.8, 300 sec: 56705.3). Total num frames: 2158329856. Throughput: 0: 56581.2. Samples: 2107688180. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:36:33,932][47056] Avg episode reward: [(0, '0.417')] [2024-04-26 08:36:34,421][47288] Updated weights for policy 0, policy_version 131736 (0.0031) [2024-04-26 08:36:37,579][47288] Updated weights for policy 0, policy_version 131746 (0.0026) [2024-04-26 08:36:38,923][47056] Fps is (10 sec: 58982.9, 60 sec: 57070.9, 300 sec: 56816.4). Total num frames: 2158624768. Throughput: 0: 56634.2. Samples: 2108029420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:36:38,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 08:36:40,171][47288] Updated weights for policy 0, policy_version 131756 (0.0029) [2024-04-26 08:36:43,441][47288] Updated weights for policy 0, policy_version 131766 (0.0030) [2024-04-26 08:36:43,923][47056] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 56705.3). Total num frames: 2158886912. Throughput: 0: 56449.6. Samples: 2108192420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:36:43,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 08:36:46,218][47288] Updated weights for policy 0, policy_version 131776 (0.0035) [2024-04-26 08:36:48,923][47056] Fps is (10 sec: 52429.3, 60 sec: 55978.7, 300 sec: 56538.7). Total num frames: 2159149056. Throughput: 0: 56382.6. Samples: 2108533760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:36:48,923][47056] Avg episode reward: [(0, '0.575')] [2024-04-26 08:36:48,945][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000131785_2159165440.pth... [2024-04-26 08:36:49,001][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000130956_2145583104.pth [2024-04-26 08:36:49,135][47288] Updated weights for policy 0, policy_version 131786 (0.0026) [2024-04-26 08:36:51,838][47288] Updated weights for policy 0, policy_version 131796 (0.0025) [2024-04-26 08:36:53,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.8, 300 sec: 56594.2). 
Total num frames: 2159460352. Throughput: 0: 56482.1. Samples: 2108875840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:36:53,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 08:36:54,898][47288] Updated weights for policy 0, policy_version 131806 (0.0036) [2024-04-26 08:36:57,926][47288] Updated weights for policy 0, policy_version 131816 (0.0029) [2024-04-26 08:36:58,923][47056] Fps is (10 sec: 60619.7, 60 sec: 57071.1, 300 sec: 56649.7). Total num frames: 2159755264. Throughput: 0: 56470.5. Samples: 2109042900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:36:58,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 08:37:00,702][47288] Updated weights for policy 0, policy_version 131826 (0.0030) [2024-04-26 08:37:03,763][47288] Updated weights for policy 0, policy_version 131836 (0.0026) [2024-04-26 08:37:03,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2160001024. Throughput: 0: 56561.4. Samples: 2109383080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:37:03,923][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 08:37:06,506][47288] Updated weights for policy 0, policy_version 131846 (0.0028) [2024-04-26 08:37:08,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56797.7, 300 sec: 56649.8). Total num frames: 2160295936. Throughput: 0: 56506.4. Samples: 2109720500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:37:08,923][47056] Avg episode reward: [(0, '0.454')] [2024-04-26 08:37:09,429][47288] Updated weights for policy 0, policy_version 131856 (0.0028) [2024-04-26 08:37:12,187][47267] Signal inference workers to stop experience collection... (31250 times) [2024-04-26 08:37:12,233][47288] InferenceWorker_p0-w0: stopping experience collection (31250 times) [2024-04-26 08:37:12,241][47267] Signal inference workers to resume experience collection... 
(31250 times) [2024-04-26 08:37:12,249][47288] InferenceWorker_p0-w0: resuming experience collection (31250 times) [2024-04-26 08:37:12,348][47288] Updated weights for policy 0, policy_version 131866 (0.0026) [2024-04-26 08:37:13,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.8, 300 sec: 56705.3). Total num frames: 2160590848. Throughput: 0: 56640.5. Samples: 2109897580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 08:37:13,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:37:15,097][47288] Updated weights for policy 0, policy_version 131876 (0.0030) [2024-04-26 08:37:18,008][47288] Updated weights for policy 0, policy_version 131886 (0.0038) [2024-04-26 08:37:18,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.7, 300 sec: 56705.3). Total num frames: 2160869376. Throughput: 0: 56716.4. Samples: 2110240420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:37:18,923][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 08:37:20,909][47288] Updated weights for policy 0, policy_version 131896 (0.0029) [2024-04-26 08:37:23,856][47288] Updated weights for policy 0, policy_version 131906 (0.0033) [2024-04-26 08:37:23,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.7, 300 sec: 56649.8). Total num frames: 2161147904. Throughput: 0: 56582.2. Samples: 2110575620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:37:23,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 08:37:26,768][47288] Updated weights for policy 0, policy_version 131916 (0.0031) [2024-04-26 08:37:28,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 2161426432. Throughput: 0: 56803.6. Samples: 2110748580. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:37:28,923][47056] Avg episode reward: [(0, '0.593')] [2024-04-26 08:37:29,617][47288] Updated weights for policy 0, policy_version 131926 (0.0031) [2024-04-26 08:37:32,467][47288] Updated weights for policy 0, policy_version 131936 (0.0035) [2024-04-26 08:37:33,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2161721344. Throughput: 0: 56694.9. Samples: 2111085040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:37:33,923][47056] Avg episode reward: [(0, '0.470')] [2024-04-26 08:37:35,401][47288] Updated weights for policy 0, policy_version 131946 (0.0029) [2024-04-26 08:37:38,292][47288] Updated weights for policy 0, policy_version 131956 (0.0034) [2024-04-26 08:37:38,923][47056] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 56538.6). Total num frames: 2161983488. Throughput: 0: 56643.9. Samples: 2111424820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:37:38,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 08:37:41,138][47288] Updated weights for policy 0, policy_version 131966 (0.0035) [2024-04-26 08:37:43,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.8, 300 sec: 56594.2). Total num frames: 2162262016. Throughput: 0: 56577.1. Samples: 2111588860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:37:43,923][47056] Avg episode reward: [(0, '0.441')] [2024-04-26 08:37:44,193][47288] Updated weights for policy 0, policy_version 131976 (0.0027) [2024-04-26 08:37:46,790][47288] Updated weights for policy 0, policy_version 131986 (0.0033) [2024-04-26 08:37:48,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56797.8, 300 sec: 56705.3). Total num frames: 2162556928. Throughput: 0: 56578.6. Samples: 2111929120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:37:48,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 08:37:49,978][47288] Updated weights for policy 0, policy_version 131996 (0.0027) [2024-04-26 08:37:52,763][47288] Updated weights for policy 0, policy_version 132006 (0.0030) [2024-04-26 08:37:53,923][47056] Fps is (10 sec: 60620.3, 60 sec: 56797.9, 300 sec: 56760.8). Total num frames: 2162868224. Throughput: 0: 56549.9. Samples: 2112265240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:37:53,923][47056] Avg episode reward: [(0, '0.582')] [2024-04-26 08:37:55,700][47288] Updated weights for policy 0, policy_version 132016 (0.0031) [2024-04-26 08:37:58,471][47288] Updated weights for policy 0, policy_version 132026 (0.0026) [2024-04-26 08:37:58,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.7, 300 sec: 56649.7). Total num frames: 2163130368. Throughput: 0: 56519.0. Samples: 2112440940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:37:58,923][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 08:38:01,387][47288] Updated weights for policy 0, policy_version 132036 (0.0031) [2024-04-26 08:38:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56797.7, 300 sec: 56649.7). Total num frames: 2163408896. Throughput: 0: 56429.3. Samples: 2112779740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:38:03,923][47056] Avg episode reward: [(0, '0.484')] [2024-04-26 08:38:04,246][47288] Updated weights for policy 0, policy_version 132046 (0.0028) [2024-04-26 08:38:07,105][47288] Updated weights for policy 0, policy_version 132056 (0.0037) [2024-04-26 08:38:08,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2163687424. Throughput: 0: 56516.5. Samples: 2113118860. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:38:08,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 08:38:10,127][47288] Updated weights for policy 0, policy_version 132066 (0.0032) [2024-04-26 08:38:12,873][47288] Updated weights for policy 0, policy_version 132076 (0.0030) [2024-04-26 08:38:13,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2163965952. Throughput: 0: 56405.4. Samples: 2113286820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:38:13,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 08:38:15,820][47288] Updated weights for policy 0, policy_version 132086 (0.0031) [2024-04-26 08:38:18,844][47288] Updated weights for policy 0, policy_version 132096 (0.0028) [2024-04-26 08:38:18,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.8, 300 sec: 56594.2). Total num frames: 2164260864. Throughput: 0: 56536.9. Samples: 2113629200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:38:18,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 08:38:21,427][47288] Updated weights for policy 0, policy_version 132106 (0.0030) [2024-04-26 08:38:22,594][47267] Signal inference workers to stop experience collection... (31300 times) [2024-04-26 08:38:22,595][47267] Signal inference workers to resume experience collection... (31300 times) [2024-04-26 08:38:22,614][47288] InferenceWorker_p0-w0: stopping experience collection (31300 times) [2024-04-26 08:38:22,614][47288] InferenceWorker_p0-w0: resuming experience collection (31300 times) [2024-04-26 08:38:23,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 2164539392. Throughput: 0: 56545.5. Samples: 2113969360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:38:23,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 08:38:25,042][47288] Updated weights for policy 0, policy_version 132116 (0.0025) [2024-04-26 08:38:27,295][47288] Updated weights for policy 0, policy_version 132126 (0.0029) [2024-04-26 08:38:28,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56524.9, 300 sec: 56649.8). Total num frames: 2164817920. Throughput: 0: 56619.6. Samples: 2114136740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:38:28,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 08:38:30,940][47288] Updated weights for policy 0, policy_version 132136 (0.0031) [2024-04-26 08:38:33,109][47288] Updated weights for policy 0, policy_version 132146 (0.0038) [2024-04-26 08:38:33,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56798.0, 300 sec: 56760.9). Total num frames: 2165129216. Throughput: 0: 56515.6. Samples: 2114472320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:38:33,923][47056] Avg episode reward: [(0, '0.487')] [2024-04-26 08:38:36,644][47288] Updated weights for policy 0, policy_version 132156 (0.0025) [2024-04-26 08:38:38,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56798.1, 300 sec: 56649.8). Total num frames: 2165391360. Throughput: 0: 56698.9. Samples: 2114816680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:38:38,923][47056] Avg episode reward: [(0, '0.437')] [2024-04-26 08:38:39,016][47288] Updated weights for policy 0, policy_version 132166 (0.0035) [2024-04-26 08:38:42,379][47288] Updated weights for policy 0, policy_version 132176 (0.0027) [2024-04-26 08:38:43,923][47056] Fps is (10 sec: 54066.4, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 2165669888. Throughput: 0: 56434.7. Samples: 2114980500. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:38:43,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 08:38:44,692][47288] Updated weights for policy 0, policy_version 132186 (0.0027) [2024-04-26 08:38:48,312][47288] Updated weights for policy 0, policy_version 132196 (0.0027) [2024-04-26 08:38:48,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56797.8, 300 sec: 56594.3). Total num frames: 2165964800. Throughput: 0: 56513.0. Samples: 2115322820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:38:48,923][47056] Avg episode reward: [(0, '0.573')] [2024-04-26 08:38:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000132200_2165964800.pth... [2024-04-26 08:38:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000131369_2152349696.pth [2024-04-26 08:38:50,576][47288] Updated weights for policy 0, policy_version 132206 (0.0029) [2024-04-26 08:38:53,923][47056] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 56427.6). Total num frames: 2166210560. Throughput: 0: 56588.8. Samples: 2115665360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:38:53,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 08:38:54,058][47288] Updated weights for policy 0, policy_version 132216 (0.0030) [2024-04-26 08:38:56,356][47288] Updated weights for policy 0, policy_version 132226 (0.0034) [2024-04-26 08:38:58,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2166505472. Throughput: 0: 56459.9. Samples: 2115827520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:38:58,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 08:38:59,677][47288] Updated weights for policy 0, policy_version 132236 (0.0031) [2024-04-26 08:39:02,043][47288] Updated weights for policy 0, policy_version 132246 (0.0028) [2024-04-26 08:39:03,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56524.9, 300 sec: 56649.8). 
Total num frames: 2166800384. Throughput: 0: 56374.2. Samples: 2116166040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:39:03,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 08:39:05,603][47288] Updated weights for policy 0, policy_version 132256 (0.0031) [2024-04-26 08:39:07,657][47288] Updated weights for policy 0, policy_version 132266 (0.0023) [2024-04-26 08:39:08,923][47056] Fps is (10 sec: 58982.8, 60 sec: 56797.9, 300 sec: 56649.7). Total num frames: 2167095296. Throughput: 0: 56500.4. Samples: 2116511880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:39:08,923][47056] Avg episode reward: [(0, '0.472')] [2024-04-26 08:39:11,443][47288] Updated weights for policy 0, policy_version 132276 (0.0028) [2024-04-26 08:39:13,420][47288] Updated weights for policy 0, policy_version 132286 (0.0027) [2024-04-26 08:39:13,923][47056] Fps is (10 sec: 58982.3, 60 sec: 57070.9, 300 sec: 56705.3). Total num frames: 2167390208. Throughput: 0: 56699.4. Samples: 2116688220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:39:13,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 08:39:17,172][47288] Updated weights for policy 0, policy_version 132296 (0.0032) [2024-04-26 08:39:18,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.9, 300 sec: 56649.8). Total num frames: 2167668736. Throughput: 0: 56732.8. Samples: 2117025300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 08:39:18,923][47056] Avg episode reward: [(0, '0.548')] [2024-04-26 08:39:19,241][47288] Updated weights for policy 0, policy_version 132306 (0.0024) [2024-04-26 08:39:22,868][47288] Updated weights for policy 0, policy_version 132316 (0.0026) [2024-04-26 08:39:23,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2167930880. Throughput: 0: 56636.5. Samples: 2117365320. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:39:23,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 08:39:25,109][47288] Updated weights for policy 0, policy_version 132326 (0.0029) [2024-04-26 08:39:28,502][47288] Updated weights for policy 0, policy_version 132336 (0.0034) [2024-04-26 08:39:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2168209408. Throughput: 0: 56742.3. Samples: 2117533900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:39:28,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 08:39:30,970][47288] Updated weights for policy 0, policy_version 132346 (0.0035) [2024-04-26 08:39:33,923][47056] Fps is (10 sec: 54065.6, 60 sec: 55705.4, 300 sec: 56427.6). Total num frames: 2168471552. Throughput: 0: 56556.7. Samples: 2117867880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:39:33,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 08:39:34,400][47288] Updated weights for policy 0, policy_version 132356 (0.0026) [2024-04-26 08:39:36,264][47267] Signal inference workers to stop experience collection... (31350 times) [2024-04-26 08:39:36,265][47267] Signal inference workers to resume experience collection... (31350 times) [2024-04-26 08:39:36,292][47288] InferenceWorker_p0-w0: stopping experience collection (31350 times) [2024-04-26 08:39:36,292][47288] InferenceWorker_p0-w0: resuming experience collection (31350 times) [2024-04-26 08:39:36,662][47288] Updated weights for policy 0, policy_version 132366 (0.0028) [2024-04-26 08:39:38,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.6, 300 sec: 56649.7). Total num frames: 2168782848. Throughput: 0: 56583.1. Samples: 2118211600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:39:38,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 08:39:40,151][47288] Updated weights for policy 0, policy_version 132376 (0.0029) [2024-04-26 08:39:42,375][47288] Updated weights for policy 0, policy_version 132386 (0.0031) [2024-04-26 08:39:43,923][47056] Fps is (10 sec: 58983.5, 60 sec: 56524.9, 300 sec: 56649.8). Total num frames: 2169061376. Throughput: 0: 56675.2. Samples: 2118377900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:39:43,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 08:39:46,061][47288] Updated weights for policy 0, policy_version 132396 (0.0030) [2024-04-26 08:39:48,266][47288] Updated weights for policy 0, policy_version 132406 (0.0028) [2024-04-26 08:39:48,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 56649.7). Total num frames: 2169356288. Throughput: 0: 56607.9. Samples: 2118713400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:39:48,924][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 08:39:51,939][47288] Updated weights for policy 0, policy_version 132416 (0.0027) [2024-04-26 08:39:53,923][47056] Fps is (10 sec: 58981.9, 60 sec: 57344.0, 300 sec: 56705.3). Total num frames: 2169651200. Throughput: 0: 56388.8. Samples: 2119049380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:39:53,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 08:39:54,082][47288] Updated weights for policy 0, policy_version 132426 (0.0035) [2024-04-26 08:39:57,682][47288] Updated weights for policy 0, policy_version 132436 (0.0031) [2024-04-26 08:39:58,923][47056] Fps is (10 sec: 57344.7, 60 sec: 57071.0, 300 sec: 56649.8). Total num frames: 2169929728. Throughput: 0: 56510.3. Samples: 2119231180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:39:58,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 08:39:59,712][47288] Updated weights for policy 0, policy_version 132446 (0.0031) [2024-04-26 08:40:03,356][47288] Updated weights for policy 0, policy_version 132456 (0.0024) [2024-04-26 08:40:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2170191872. Throughput: 0: 56497.7. Samples: 2119567700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:40:03,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 08:40:05,782][47288] Updated weights for policy 0, policy_version 132466 (0.0031) [2024-04-26 08:40:08,923][47056] Fps is (10 sec: 52428.4, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 2170454016. Throughput: 0: 56515.7. Samples: 2119908540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:40:08,923][47056] Avg episode reward: [(0, '0.481')] [2024-04-26 08:40:09,268][47288] Updated weights for policy 0, policy_version 132476 (0.0030) [2024-04-26 08:40:11,530][47288] Updated weights for policy 0, policy_version 132486 (0.0032) [2024-04-26 08:40:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 56538.6). Total num frames: 2170748928. Throughput: 0: 56299.4. Samples: 2120067380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:40:13,923][47056] Avg episode reward: [(0, '0.630')] [2024-04-26 08:40:14,984][47288] Updated weights for policy 0, policy_version 132496 (0.0034) [2024-04-26 08:40:17,271][47288] Updated weights for policy 0, policy_version 132506 (0.0029) [2024-04-26 08:40:18,923][47056] Fps is (10 sec: 58983.4, 60 sec: 56251.8, 300 sec: 56594.2). Total num frames: 2171043840. Throughput: 0: 56418.5. Samples: 2120406700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:40:18,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 08:40:20,689][47288] Updated weights for policy 0, policy_version 132516 (0.0028) [2024-04-26 08:40:23,117][47288] Updated weights for policy 0, policy_version 132526 (0.0025) [2024-04-26 08:40:23,923][47056] Fps is (10 sec: 58983.0, 60 sec: 56797.7, 300 sec: 56649.8). Total num frames: 2171338752. Throughput: 0: 56310.8. Samples: 2120745580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 08:40:23,923][47056] Avg episode reward: [(0, '0.574')] [2024-04-26 08:40:26,502][47288] Updated weights for policy 0, policy_version 132536 (0.0030) [2024-04-26 08:40:28,810][47288] Updated weights for policy 0, policy_version 132546 (0.0027) [2024-04-26 08:40:28,923][47056] Fps is (10 sec: 58982.1, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 2171633664. Throughput: 0: 56573.4. Samples: 2120923700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:40:28,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 08:40:32,374][47288] Updated weights for policy 0, policy_version 132556 (0.0026) [2024-04-26 08:40:33,923][47056] Fps is (10 sec: 55706.0, 60 sec: 57071.1, 300 sec: 56594.2). Total num frames: 2171895808. Throughput: 0: 56607.3. Samples: 2121260720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:40:33,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 08:40:34,669][47288] Updated weights for policy 0, policy_version 132566 (0.0030) [2024-04-26 08:40:38,188][47288] Updated weights for policy 0, policy_version 132576 (0.0028) [2024-04-26 08:40:38,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 2172157952. Throughput: 0: 56545.5. Samples: 2121593920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:40:38,923][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 08:40:40,432][47288] Updated weights for policy 0, policy_version 132586 (0.0030) [2024-04-26 08:40:43,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2172436480. Throughput: 0: 56208.0. Samples: 2121760540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:40:43,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:40:44,007][47288] Updated weights for policy 0, policy_version 132596 (0.0027) [2024-04-26 08:40:45,090][47267] Signal inference workers to stop experience collection... (31400 times) [2024-04-26 08:40:45,125][47288] InferenceWorker_p0-w0: stopping experience collection (31400 times) [2024-04-26 08:40:45,148][47267] Signal inference workers to resume experience collection... (31400 times) [2024-04-26 08:40:45,148][47288] InferenceWorker_p0-w0: resuming experience collection (31400 times) [2024-04-26 08:40:46,261][47288] Updated weights for policy 0, policy_version 132606 (0.0030) [2024-04-26 08:40:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2172715008. Throughput: 0: 56172.4. Samples: 2122095460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:40:48,923][47056] Avg episode reward: [(0, '0.479')] [2024-04-26 08:40:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000132612_2172715008.pth... [2024-04-26 08:40:48,980][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000131785_2159165440.pth [2024-04-26 08:40:49,864][47288] Updated weights for policy 0, policy_version 132616 (0.0028) [2024-04-26 08:40:52,134][47288] Updated weights for policy 0, policy_version 132626 (0.0031) [2024-04-26 08:40:53,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56251.8, 300 sec: 56594.3). Total num frames: 2173026304. Throughput: 0: 56068.1. Samples: 2122431600. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:40:53,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 08:40:55,656][47288] Updated weights for policy 0, policy_version 132636 (0.0029) [2024-04-26 08:40:58,103][47288] Updated weights for policy 0, policy_version 132646 (0.0031) [2024-04-26 08:40:58,923][47056] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 56538.7). Total num frames: 2173288448. Throughput: 0: 56576.2. Samples: 2122613300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:40:58,923][47056] Avg episode reward: [(0, '0.482')] [2024-04-26 08:41:01,525][47288] Updated weights for policy 0, policy_version 132656 (0.0030) [2024-04-26 08:41:03,786][47288] Updated weights for policy 0, policy_version 132666 (0.0029) [2024-04-26 08:41:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.8, 300 sec: 56649.7). Total num frames: 2173599744. Throughput: 0: 56480.3. Samples: 2122948320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:41:03,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 08:41:07,218][47288] Updated weights for policy 0, policy_version 132676 (0.0031) [2024-04-26 08:41:08,923][47056] Fps is (10 sec: 58981.9, 60 sec: 57071.0, 300 sec: 56594.2). Total num frames: 2173878272. Throughput: 0: 56378.2. Samples: 2123282600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:41:08,923][47056] Avg episode reward: [(0, '0.451')] [2024-04-26 08:41:09,593][47288] Updated weights for policy 0, policy_version 132686 (0.0025) [2024-04-26 08:41:13,086][47288] Updated weights for policy 0, policy_version 132696 (0.0039) [2024-04-26 08:41:13,923][47056] Fps is (10 sec: 52428.8, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2174124032. Throughput: 0: 56262.6. Samples: 2123455520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:41:13,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 08:41:15,438][47288] Updated weights for policy 0, policy_version 132706 (0.0024) [2024-04-26 08:41:18,873][47288] Updated weights for policy 0, policy_version 132716 (0.0032) [2024-04-26 08:41:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.6, 300 sec: 56483.1). Total num frames: 2174418944. Throughput: 0: 56328.3. Samples: 2123795500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:41:18,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:41:21,263][47288] Updated weights for policy 0, policy_version 132726 (0.0026) [2024-04-26 08:41:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 56427.6). Total num frames: 2174681088. Throughput: 0: 56488.4. Samples: 2124135900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:41:23,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 08:41:24,687][47288] Updated weights for policy 0, policy_version 132736 (0.0025) [2024-04-26 08:41:27,024][47288] Updated weights for policy 0, policy_version 132746 (0.0030) [2024-04-26 08:41:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 56427.6). Total num frames: 2174976000. Throughput: 0: 56302.7. Samples: 2124294160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:41:28,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:41:30,545][47288] Updated weights for policy 0, policy_version 132756 (0.0028) [2024-04-26 08:41:32,875][47288] Updated weights for policy 0, policy_version 132766 (0.0034) [2024-04-26 08:41:33,923][47056] Fps is (10 sec: 60620.5, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2175287296. Throughput: 0: 56452.4. Samples: 2124635820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:41:33,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 08:41:36,313][47288] Updated weights for policy 0, policy_version 132776 (0.0030) [2024-04-26 08:41:38,777][47288] Updated weights for policy 0, policy_version 132786 (0.0025) [2024-04-26 08:41:38,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 2175565824. Throughput: 0: 56532.9. Samples: 2124975580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:41:38,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 08:41:42,036][47288] Updated weights for policy 0, policy_version 132796 (0.0026) [2024-04-26 08:41:42,943][47267] Signal inference workers to stop experience collection... (31450 times) [2024-04-26 08:41:42,945][47267] Signal inference workers to resume experience collection... (31450 times) [2024-04-26 08:41:42,966][47288] InferenceWorker_p0-w0: stopping experience collection (31450 times) [2024-04-26 08:41:42,966][47288] InferenceWorker_p0-w0: resuming experience collection (31450 times) [2024-04-26 08:41:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 2175844352. Throughput: 0: 56453.7. Samples: 2125153720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:41:43,923][47056] Avg episode reward: [(0, '0.590')] [2024-04-26 08:41:44,471][47288] Updated weights for policy 0, policy_version 132806 (0.0034) [2024-04-26 08:41:47,771][47288] Updated weights for policy 0, policy_version 132816 (0.0026) [2024-04-26 08:41:48,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 2176122880. Throughput: 0: 56506.8. Samples: 2125491120. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:41:48,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 08:41:50,287][47288] Updated weights for policy 0, policy_version 132826 (0.0027) [2024-04-26 08:41:53,661][47288] Updated weights for policy 0, policy_version 132836 (0.0028) [2024-04-26 08:41:53,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2176401408. Throughput: 0: 56683.1. Samples: 2125833340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:41:53,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 08:41:56,171][47288] Updated weights for policy 0, policy_version 132846 (0.0032) [2024-04-26 08:41:58,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 2176663552. Throughput: 0: 56339.3. Samples: 2125990780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:41:58,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 08:41:59,473][47288] Updated weights for policy 0, policy_version 132856 (0.0035) [2024-04-26 08:42:01,873][47288] Updated weights for policy 0, policy_version 132866 (0.0024) [2024-04-26 08:42:03,923][47056] Fps is (10 sec: 55706.5, 60 sec: 55978.9, 300 sec: 56483.2). Total num frames: 2176958464. Throughput: 0: 56352.2. Samples: 2126331340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:42:03,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 08:42:05,132][47288] Updated weights for policy 0, policy_version 132876 (0.0028) [2024-04-26 08:42:07,740][47288] Updated weights for policy 0, policy_version 132886 (0.0028) [2024-04-26 08:42:08,923][47056] Fps is (10 sec: 58980.4, 60 sec: 56251.5, 300 sec: 56483.1). Total num frames: 2177253376. Throughput: 0: 56349.4. Samples: 2126671640. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:42:08,924][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 08:42:10,974][47288] Updated weights for policy 0, policy_version 132896 (0.0028) [2024-04-26 08:42:13,482][47288] Updated weights for policy 0, policy_version 132906 (0.0027) [2024-04-26 08:42:13,923][47056] Fps is (10 sec: 58981.1, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 2177548288. Throughput: 0: 56653.2. Samples: 2126843560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:42:13,924][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 08:42:16,933][47288] Updated weights for policy 0, policy_version 132916 (0.0031) [2024-04-26 08:42:18,923][47056] Fps is (10 sec: 57345.5, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2177826816. Throughput: 0: 56659.6. Samples: 2127185500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:42:18,923][47056] Avg episode reward: [(0, '0.448')] [2024-04-26 08:42:19,228][47288] Updated weights for policy 0, policy_version 132926 (0.0030) [2024-04-26 08:42:22,652][47288] Updated weights for policy 0, policy_version 132936 (0.0027) [2024-04-26 08:42:23,923][47056] Fps is (10 sec: 55704.9, 60 sec: 57070.7, 300 sec: 56538.6). Total num frames: 2178105344. Throughput: 0: 56654.0. Samples: 2127525020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:42:23,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 08:42:25,009][47288] Updated weights for policy 0, policy_version 132946 (0.0030) [2024-04-26 08:42:28,367][47288] Updated weights for policy 0, policy_version 132956 (0.0030) [2024-04-26 08:42:28,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2178367488. Throughput: 0: 56479.9. Samples: 2127695320. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 08:42:28,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 08:42:30,879][47288] Updated weights for policy 0, policy_version 132966 (0.0030) [2024-04-26 08:42:33,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2178662400. Throughput: 0: 56635.8. Samples: 2128039740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:42:33,924][47056] Avg episode reward: [(0, '0.450')] [2024-04-26 08:42:34,050][47288] Updated weights for policy 0, policy_version 132976 (0.0035) [2024-04-26 08:42:36,640][47288] Updated weights for policy 0, policy_version 132986 (0.0031) [2024-04-26 08:42:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2178940928. Throughput: 0: 56609.3. Samples: 2128380760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:42:38,924][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 08:42:40,025][47288] Updated weights for policy 0, policy_version 132996 (0.0028) [2024-04-26 08:42:42,428][47288] Updated weights for policy 0, policy_version 133006 (0.0026) [2024-04-26 08:42:43,923][47056] Fps is (10 sec: 55706.8, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2179219456. Throughput: 0: 56644.0. Samples: 2128539760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:42:43,923][47056] Avg episode reward: [(0, '0.577')] [2024-04-26 08:42:45,853][47288] Updated weights for policy 0, policy_version 133016 (0.0028) [2024-04-26 08:42:48,293][47288] Updated weights for policy 0, policy_version 133026 (0.0029) [2024-04-26 08:42:48,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.7, 300 sec: 56483.1). Total num frames: 2179530752. Throughput: 0: 56646.3. Samples: 2128880440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:42:48,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 08:42:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000133028_2179530752.pth... [2024-04-26 08:42:48,986][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000132200_2165964800.pth [2024-04-26 08:42:51,580][47267] Signal inference workers to stop experience collection... (31500 times) [2024-04-26 08:42:51,584][47267] Signal inference workers to resume experience collection... (31500 times) [2024-04-26 08:42:51,594][47288] Updated weights for policy 0, policy_version 133036 (0.0025) [2024-04-26 08:42:51,616][47288] InferenceWorker_p0-w0: stopping experience collection (31500 times) [2024-04-26 08:42:51,616][47288] InferenceWorker_p0-w0: resuming experience collection (31500 times) [2024-04-26 08:42:53,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2179809280. Throughput: 0: 56577.2. Samples: 2129217600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:42:53,924][47056] Avg episode reward: [(0, '0.650')] [2024-04-26 08:42:54,110][47288] Updated weights for policy 0, policy_version 133046 (0.0025) [2024-04-26 08:42:57,251][47288] Updated weights for policy 0, policy_version 133056 (0.0028) [2024-04-26 08:42:58,923][47056] Fps is (10 sec: 55706.3, 60 sec: 57070.9, 300 sec: 56538.7). Total num frames: 2180087808. Throughput: 0: 56686.8. Samples: 2129394460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:42:58,923][47056] Avg episode reward: [(0, '0.590')] [2024-04-26 08:42:59,848][47288] Updated weights for policy 0, policy_version 133066 (0.0029) [2024-04-26 08:43:03,276][47288] Updated weights for policy 0, policy_version 133076 (0.0028) [2024-04-26 08:43:03,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 2180366336. Throughput: 0: 56663.5. Samples: 2129735360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:43:03,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 08:43:05,615][47288] Updated weights for policy 0, policy_version 133086 (0.0027) [2024-04-26 08:43:08,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56252.0, 300 sec: 56483.1). Total num frames: 2180628480. Throughput: 0: 56687.8. Samples: 2130075960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:43:08,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:43:09,017][47288] Updated weights for policy 0, policy_version 133096 (0.0030) [2024-04-26 08:43:11,232][47288] Updated weights for policy 0, policy_version 133106 (0.0027) [2024-04-26 08:43:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2180923392. Throughput: 0: 56505.3. Samples: 2130238060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:43:13,923][47056] Avg episode reward: [(0, '0.539')] [2024-04-26 08:43:14,737][47288] Updated weights for policy 0, policy_version 133116 (0.0030) [2024-04-26 08:43:16,986][47288] Updated weights for policy 0, policy_version 133126 (0.0027) [2024-04-26 08:43:18,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2181218304. Throughput: 0: 56367.3. Samples: 2130576260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:43:18,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:43:20,477][47288] Updated weights for policy 0, policy_version 133136 (0.0024) [2024-04-26 08:43:22,928][47288] Updated weights for policy 0, policy_version 133146 (0.0030) [2024-04-26 08:43:23,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2181496832. Throughput: 0: 56466.2. Samples: 2130921740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:43:23,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 08:43:26,255][47288] Updated weights for policy 0, policy_version 133156 (0.0031) [2024-04-26 08:43:28,601][47288] Updated weights for policy 0, policy_version 133166 (0.0027) [2024-04-26 08:43:28,923][47056] Fps is (10 sec: 57344.2, 60 sec: 57071.1, 300 sec: 56483.1). Total num frames: 2181791744. Throughput: 0: 56748.8. Samples: 2131093460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 08:43:28,923][47056] Avg episode reward: [(0, '0.506')] [2024-04-26 08:43:32,150][47288] Updated weights for policy 0, policy_version 133176 (0.0030) [2024-04-26 08:43:33,923][47056] Fps is (10 sec: 58983.2, 60 sec: 57071.1, 300 sec: 56594.2). Total num frames: 2182086656. Throughput: 0: 56727.4. Samples: 2131433160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:43:33,923][47056] Avg episode reward: [(0, '0.446')] [2024-04-26 08:43:34,610][47288] Updated weights for policy 0, policy_version 133186 (0.0025) [2024-04-26 08:43:37,875][47288] Updated weights for policy 0, policy_version 133196 (0.0028) [2024-04-26 08:43:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2182348800. Throughput: 0: 56777.3. Samples: 2131772580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:43:38,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 08:43:40,486][47288] Updated weights for policy 0, policy_version 133206 (0.0029) [2024-04-26 08:43:43,652][47288] Updated weights for policy 0, policy_version 133216 (0.0028) [2024-04-26 08:43:43,923][47056] Fps is (10 sec: 52428.6, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2182610944. Throughput: 0: 56560.0. Samples: 2131939660. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:43:43,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:43:46,212][47288] Updated weights for policy 0, policy_version 133226 (0.0040) [2024-04-26 08:43:48,923][47056] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 56538.7). Total num frames: 2182889472. Throughput: 0: 56398.6. Samples: 2132273300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:43:48,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:43:49,455][47288] Updated weights for policy 0, policy_version 133236 (0.0036) [2024-04-26 08:43:52,176][47288] Updated weights for policy 0, policy_version 133246 (0.0032) [2024-04-26 08:43:53,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2183184384. Throughput: 0: 56337.8. Samples: 2132611160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:43:53,923][47056] Avg episode reward: [(0, '0.412')] [2024-04-26 08:43:55,276][47288] Updated weights for policy 0, policy_version 133256 (0.0027) [2024-04-26 08:43:58,029][47288] Updated weights for policy 0, policy_version 133266 (0.0026) [2024-04-26 08:43:58,923][47056] Fps is (10 sec: 58982.7, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2183479296. Throughput: 0: 56576.1. Samples: 2132783980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:43:58,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 08:43:58,942][47267] Signal inference workers to stop experience collection... (31550 times) [2024-04-26 08:43:58,942][47267] Signal inference workers to resume experience collection... 
(31550 times) [2024-04-26 08:43:58,957][47288] InferenceWorker_p0-w0: stopping experience collection (31550 times) [2024-04-26 08:43:58,957][47288] InferenceWorker_p0-w0: resuming experience collection (31550 times) [2024-04-26 08:44:01,104][47288] Updated weights for policy 0, policy_version 133276 (0.0036) [2024-04-26 08:44:03,820][47288] Updated weights for policy 0, policy_version 133286 (0.0029) [2024-04-26 08:44:03,923][47056] Fps is (10 sec: 57342.8, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2183757824. Throughput: 0: 56493.1. Samples: 2133118460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:44:03,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 08:44:07,039][47288] Updated weights for policy 0, policy_version 133296 (0.0029) [2024-04-26 08:44:08,923][47056] Fps is (10 sec: 57343.6, 60 sec: 57070.8, 300 sec: 56483.1). Total num frames: 2184052736. Throughput: 0: 56421.3. Samples: 2133460700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:44:08,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 08:44:09,606][47288] Updated weights for policy 0, policy_version 133306 (0.0033) [2024-04-26 08:44:12,636][47288] Updated weights for policy 0, policy_version 133316 (0.0030) [2024-04-26 08:44:13,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56797.9, 300 sec: 56483.1). Total num frames: 2184331264. Throughput: 0: 56475.5. Samples: 2133634860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:44:13,923][47056] Avg episode reward: [(0, '0.541')] [2024-04-26 08:44:15,352][47288] Updated weights for policy 0, policy_version 133326 (0.0030) [2024-04-26 08:44:18,341][47288] Updated weights for policy 0, policy_version 133336 (0.0035) [2024-04-26 08:44:18,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2184593408. Throughput: 0: 56532.3. Samples: 2133977120. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:44:18,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 08:44:21,287][47288] Updated weights for policy 0, policy_version 133346 (0.0029) [2024-04-26 08:44:23,923][47056] Fps is (10 sec: 54067.4, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2184871936. Throughput: 0: 56453.8. Samples: 2134313000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:44:23,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:44:24,300][47288] Updated weights for policy 0, policy_version 133356 (0.0029) [2024-04-26 08:44:27,186][47288] Updated weights for policy 0, policy_version 133366 (0.0032) [2024-04-26 08:44:28,923][47056] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2185150464. Throughput: 0: 56376.4. Samples: 2134476600. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:44:28,923][47056] Avg episode reward: [(0, '0.634')] [2024-04-26 08:44:30,070][47288] Updated weights for policy 0, policy_version 133376 (0.0030) [2024-04-26 08:44:32,839][47288] Updated weights for policy 0, policy_version 133386 (0.0031) [2024-04-26 08:44:33,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 56483.2). Total num frames: 2185445376. Throughput: 0: 56582.4. Samples: 2134819500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-26 08:44:33,923][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 08:44:35,773][47288] Updated weights for policy 0, policy_version 133396 (0.0025) [2024-04-26 08:44:38,617][47288] Updated weights for policy 0, policy_version 133406 (0.0024) [2024-04-26 08:44:38,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2185740288. Throughput: 0: 56622.9. Samples: 2135159200. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:44:38,923][47056] Avg episode reward: [(0, '0.519')] [2024-04-26 08:44:41,591][47288] Updated weights for policy 0, policy_version 133416 (0.0029) [2024-04-26 08:44:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.8, 300 sec: 56483.2). Total num frames: 2186018816. Throughput: 0: 56627.6. Samples: 2135332220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:44:43,923][47056] Avg episode reward: [(0, '0.469')] [2024-04-26 08:44:44,383][47288] Updated weights for policy 0, policy_version 133426 (0.0026) [2024-04-26 08:44:47,513][47288] Updated weights for policy 0, policy_version 133436 (0.0026) [2024-04-26 08:44:48,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57071.0, 300 sec: 56483.2). Total num frames: 2186313728. Throughput: 0: 56767.7. Samples: 2135673000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:44:48,923][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 08:44:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000133442_2186313728.pth... [2024-04-26 08:44:48,990][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000132612_2172715008.pth [2024-04-26 08:44:50,130][47288] Updated weights for policy 0, policy_version 133446 (0.0027) [2024-04-26 08:44:53,359][47288] Updated weights for policy 0, policy_version 133456 (0.0034) [2024-04-26 08:44:53,383][47267] Signal inference workers to stop experience collection... (31600 times) [2024-04-26 08:44:53,383][47267] Signal inference workers to resume experience collection... (31600 times) [2024-04-26 08:44:53,397][47288] InferenceWorker_p0-w0: stopping experience collection (31600 times) [2024-04-26 08:44:53,397][47288] InferenceWorker_p0-w0: resuming experience collection (31600 times) [2024-04-26 08:44:53,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 2186592256. Throughput: 0: 56659.6. Samples: 2136010380. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:44:53,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:44:55,958][47288] Updated weights for policy 0, policy_version 133466 (0.0031) [2024-04-26 08:44:58,923][47056] Fps is (10 sec: 52429.5, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 2186838016. Throughput: 0: 56376.2. Samples: 2136171780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:44:58,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 08:44:59,111][47288] Updated weights for policy 0, policy_version 133476 (0.0030) [2024-04-26 08:45:01,647][47288] Updated weights for policy 0, policy_version 133486 (0.0028) [2024-04-26 08:45:03,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2187132928. Throughput: 0: 56302.2. Samples: 2136510720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:45:03,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 08:45:04,855][47288] Updated weights for policy 0, policy_version 133496 (0.0023) [2024-04-26 08:45:07,500][47288] Updated weights for policy 0, policy_version 133506 (0.0025) [2024-04-26 08:45:08,923][47056] Fps is (10 sec: 57343.1, 60 sec: 55978.7, 300 sec: 56483.2). Total num frames: 2187411456. Throughput: 0: 56445.3. Samples: 2136853040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:45:08,923][47056] Avg episode reward: [(0, '0.578')] [2024-04-26 08:45:10,562][47288] Updated weights for policy 0, policy_version 133516 (0.0031) [2024-04-26 08:45:13,169][47288] Updated weights for policy 0, policy_version 133526 (0.0030) [2024-04-26 08:45:13,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56524.7, 300 sec: 56538.6). Total num frames: 2187722752. Throughput: 0: 56544.7. Samples: 2137021120. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:45:13,923][47056] Avg episode reward: [(0, '0.497')] [2024-04-26 08:45:16,420][47288] Updated weights for policy 0, policy_version 133536 (0.0028) [2024-04-26 08:45:18,920][47288] Updated weights for policy 0, policy_version 133546 (0.0029) [2024-04-26 08:45:18,923][47056] Fps is (10 sec: 60620.8, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 2188017664. Throughput: 0: 56507.0. Samples: 2137362320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:45:18,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 08:45:22,185][47288] Updated weights for policy 0, policy_version 133556 (0.0027) [2024-04-26 08:45:23,923][47056] Fps is (10 sec: 55707.1, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 2188279808. Throughput: 0: 56455.4. Samples: 2137699680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:45:23,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:45:24,721][47288] Updated weights for policy 0, policy_version 133566 (0.0032) [2024-04-26 08:45:27,971][47288] Updated weights for policy 0, policy_version 133576 (0.0030) [2024-04-26 08:45:28,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 2188558336. Throughput: 0: 56652.8. Samples: 2137881600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:45:28,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 08:45:30,349][47288] Updated weights for policy 0, policy_version 133586 (0.0028) [2024-04-26 08:45:33,775][47288] Updated weights for policy 0, policy_version 133596 (0.0037) [2024-04-26 08:45:33,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2188836864. Throughput: 0: 56518.7. Samples: 2138216340. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 08:45:33,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 08:45:36,365][47288] Updated weights for policy 0, policy_version 133606 (0.0031) [2024-04-26 08:45:38,923][47056] Fps is (10 sec: 54066.1, 60 sec: 55978.5, 300 sec: 56483.1). Total num frames: 2189099008. Throughput: 0: 56602.9. Samples: 2138557520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:45:38,923][47056] Avg episode reward: [(0, '0.576')] [2024-04-26 08:45:39,576][47288] Updated weights for policy 0, policy_version 133616 (0.0028) [2024-04-26 08:45:42,179][47288] Updated weights for policy 0, policy_version 133626 (0.0029) [2024-04-26 08:45:43,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2189393920. Throughput: 0: 56560.4. Samples: 2138717000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:45:43,923][47056] Avg episode reward: [(0, '0.505')] [2024-04-26 08:45:45,206][47288] Updated weights for policy 0, policy_version 133636 (0.0031) [2024-04-26 08:45:48,037][47288] Updated weights for policy 0, policy_version 133646 (0.0034) [2024-04-26 08:45:48,923][47056] Fps is (10 sec: 58983.7, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2189688832. Throughput: 0: 56590.7. Samples: 2139057300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:45:48,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:45:50,791][47267] Signal inference workers to stop experience collection... (31650 times) [2024-04-26 08:45:50,791][47267] Signal inference workers to resume experience collection... 
(31650 times) [2024-04-26 08:45:50,819][47288] InferenceWorker_p0-w0: stopping experience collection (31650 times) [2024-04-26 08:45:50,820][47288] InferenceWorker_p0-w0: resuming experience collection (31650 times) [2024-04-26 08:45:50,901][47288] Updated weights for policy 0, policy_version 133656 (0.0027) [2024-04-26 08:45:53,915][47288] Updated weights for policy 0, policy_version 133666 (0.0028) [2024-04-26 08:45:53,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 2189983744. Throughput: 0: 56548.1. Samples: 2139397700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:45:53,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 08:45:56,702][47288] Updated weights for policy 0, policy_version 133676 (0.0026) [2024-04-26 08:45:58,923][47056] Fps is (10 sec: 58982.6, 60 sec: 57343.9, 300 sec: 56538.7). Total num frames: 2190278656. Throughput: 0: 56689.1. Samples: 2139572120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:45:58,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 08:45:59,604][47288] Updated weights for policy 0, policy_version 133686 (0.0024) [2024-04-26 08:46:02,594][47288] Updated weights for policy 0, policy_version 133696 (0.0032) [2024-04-26 08:46:03,923][47056] Fps is (10 sec: 57344.0, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 2190557184. Throughput: 0: 56565.0. Samples: 2139907740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:46:03,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 08:46:05,394][47288] Updated weights for policy 0, policy_version 133706 (0.0028) [2024-04-26 08:46:08,384][47288] Updated weights for policy 0, policy_version 133716 (0.0032) [2024-04-26 08:46:08,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 2190819328. Throughput: 0: 56494.9. Samples: 2140241960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:46:08,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 08:46:11,157][47288] Updated weights for policy 0, policy_version 133726 (0.0026) [2024-04-26 08:46:13,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56251.8, 300 sec: 56538.7). Total num frames: 2191097856. Throughput: 0: 56186.6. Samples: 2140410000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:46:13,923][47056] Avg episode reward: [(0, '0.522')] [2024-04-26 08:46:14,203][47288] Updated weights for policy 0, policy_version 133736 (0.0033) [2024-04-26 08:46:17,011][47288] Updated weights for policy 0, policy_version 133746 (0.0031) [2024-04-26 08:46:18,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 56594.2). Total num frames: 2191376384. Throughput: 0: 56285.4. Samples: 2140749180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:46:18,923][47056] Avg episode reward: [(0, '0.666')] [2024-04-26 08:46:18,929][47267] Saving new best policy, reward=0.666! [2024-04-26 08:46:20,185][47288] Updated weights for policy 0, policy_version 133756 (0.0029) [2024-04-26 08:46:22,860][47288] Updated weights for policy 0, policy_version 133766 (0.0029) [2024-04-26 08:46:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.6, 300 sec: 56538.7). Total num frames: 2191654912. Throughput: 0: 56315.4. Samples: 2141091700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:46:23,923][47056] Avg episode reward: [(0, '0.495')] [2024-04-26 08:46:25,897][47288] Updated weights for policy 0, policy_version 133776 (0.0027) [2024-04-26 08:46:28,736][47288] Updated weights for policy 0, policy_version 133786 (0.0031) [2024-04-26 08:46:28,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2191949824. Throughput: 0: 56495.3. Samples: 2141259300. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:46:28,923][47056] Avg episode reward: [(0, '0.577')] [2024-04-26 08:46:31,709][47288] Updated weights for policy 0, policy_version 133796 (0.0033) [2024-04-26 08:46:33,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 2192228352. Throughput: 0: 56377.0. Samples: 2141594260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:46:33,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 08:46:34,606][47288] Updated weights for policy 0, policy_version 133806 (0.0026) [2024-04-26 08:46:37,564][47288] Updated weights for policy 0, policy_version 133816 (0.0031) [2024-04-26 08:46:38,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57071.1, 300 sec: 56538.7). Total num frames: 2192523264. Throughput: 0: 56318.5. Samples: 2141932040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 08:46:38,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 08:46:40,713][47288] Updated weights for policy 0, policy_version 133826 (0.0027) [2024-04-26 08:46:43,335][47288] Updated weights for policy 0, policy_version 133836 (0.0028) [2024-04-26 08:46:43,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56797.8, 300 sec: 56538.7). Total num frames: 2192801792. Throughput: 0: 56340.8. Samples: 2142107460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:46:43,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 08:46:46,457][47288] Updated weights for policy 0, policy_version 133846 (0.0028) [2024-04-26 08:46:48,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2193080320. Throughput: 0: 56521.3. Samples: 2142451200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:46:48,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 08:46:48,929][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000133855_2193080320.pth... 
[2024-04-26 08:46:48,978][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000133028_2179530752.pth [2024-04-26 08:46:49,240][47288] Updated weights for policy 0, policy_version 133856 (0.0027) [2024-04-26 08:46:52,137][47288] Updated weights for policy 0, policy_version 133866 (0.0028) [2024-04-26 08:46:53,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2193342464. Throughput: 0: 56486.3. Samples: 2142783840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:46:53,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 08:46:54,372][47267] Signal inference workers to stop experience collection... (31700 times) [2024-04-26 08:46:54,426][47267] Signal inference workers to resume experience collection... (31700 times) [2024-04-26 08:46:54,426][47288] InferenceWorker_p0-w0: stopping experience collection (31700 times) [2024-04-26 08:46:54,438][47288] InferenceWorker_p0-w0: resuming experience collection (31700 times) [2024-04-26 08:46:55,049][47288] Updated weights for policy 0, policy_version 133876 (0.0027) [2024-04-26 08:46:57,748][47288] Updated weights for policy 0, policy_version 133886 (0.0026) [2024-04-26 08:46:58,923][47056] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 56483.1). Total num frames: 2193620992. Throughput: 0: 56336.9. Samples: 2142945160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:46:58,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 08:47:00,890][47288] Updated weights for policy 0, policy_version 133896 (0.0026) [2024-04-26 08:47:03,651][47288] Updated weights for policy 0, policy_version 133906 (0.0032) [2024-04-26 08:47:03,923][47056] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 56483.2). Total num frames: 2193915904. Throughput: 0: 56309.3. Samples: 2143283100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:47:03,932][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 08:47:06,853][47288] Updated weights for policy 0, policy_version 133916 (0.0029) [2024-04-26 08:47:08,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2194210816. Throughput: 0: 56231.5. Samples: 2143622120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:47:08,923][47056] Avg episode reward: [(0, '0.635')] [2024-04-26 08:47:09,584][47288] Updated weights for policy 0, policy_version 133926 (0.0032) [2024-04-26 08:47:12,529][47288] Updated weights for policy 0, policy_version 133936 (0.0033) [2024-04-26 08:47:13,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 56483.2). Total num frames: 2194489344. Throughput: 0: 56471.3. Samples: 2143800500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:47:13,923][47056] Avg episode reward: [(0, '0.573')] [2024-04-26 08:47:15,358][47288] Updated weights for policy 0, policy_version 133946 (0.0027) [2024-04-26 08:47:18,197][47288] Updated weights for policy 0, policy_version 133956 (0.0028) [2024-04-26 08:47:18,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2194784256. Throughput: 0: 56473.3. Samples: 2144135560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:47:18,923][47056] Avg episode reward: [(0, '0.467')] [2024-04-26 08:47:21,539][47288] Updated weights for policy 0, policy_version 133966 (0.0027) [2024-04-26 08:47:23,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2195046400. Throughput: 0: 56416.6. Samples: 2144470780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:47:23,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:47:23,975][47288] Updated weights for policy 0, policy_version 133976 (0.0031) [2024-04-26 08:47:27,335][47288] Updated weights for policy 0, policy_version 133986 (0.0030) [2024-04-26 08:47:28,923][47056] Fps is (10 sec: 54066.6, 60 sec: 56251.8, 300 sec: 56483.2). Total num frames: 2195324928. Throughput: 0: 56253.3. Samples: 2144638860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:47:28,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 08:47:29,784][47288] Updated weights for policy 0, policy_version 133996 (0.0034) [2024-04-26 08:47:33,049][47288] Updated weights for policy 0, policy_version 134006 (0.0029) [2024-04-26 08:47:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2195587072. Throughput: 0: 56212.5. Samples: 2144980760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:47:33,923][47056] Avg episode reward: [(0, '0.423')] [2024-04-26 08:47:35,632][47288] Updated weights for policy 0, policy_version 134016 (0.0022) [2024-04-26 08:47:38,658][47288] Updated weights for policy 0, policy_version 134026 (0.0028) [2024-04-26 08:47:38,923][47056] Fps is (10 sec: 55706.4, 60 sec: 55978.8, 300 sec: 56483.1). Total num frames: 2195881984. Throughput: 0: 56331.6. Samples: 2145318760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 08:47:38,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 08:47:41,681][47288] Updated weights for policy 0, policy_version 134036 (0.0027) [2024-04-26 08:47:43,923][47056] Fps is (10 sec: 58981.9, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2196176896. Throughput: 0: 56678.3. Samples: 2145495680. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:47:43,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 08:47:44,386][47288] Updated weights for policy 0, policy_version 134046 (0.0028) [2024-04-26 08:47:47,431][47288] Updated weights for policy 0, policy_version 134056 (0.0032) [2024-04-26 08:47:48,923][47056] Fps is (10 sec: 58981.3, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2196471808. Throughput: 0: 56577.2. Samples: 2145829080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:47:48,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 08:47:50,109][47288] Updated weights for policy 0, policy_version 134066 (0.0030) [2024-04-26 08:47:53,256][47288] Updated weights for policy 0, policy_version 134076 (0.0031) [2024-04-26 08:47:53,923][47056] Fps is (10 sec: 58982.7, 60 sec: 57071.0, 300 sec: 56538.7). Total num frames: 2196766720. Throughput: 0: 56637.0. Samples: 2146170780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:47:53,923][47056] Avg episode reward: [(0, '0.581')] [2024-04-26 08:47:55,796][47288] Updated weights for policy 0, policy_version 134086 (0.0036) [2024-04-26 08:47:58,755][47267] Signal inference workers to stop experience collection... (31750 times) [2024-04-26 08:47:58,783][47288] InferenceWorker_p0-w0: stopping experience collection (31750 times) [2024-04-26 08:47:58,841][47267] Signal inference workers to resume experience collection... (31750 times) [2024-04-26 08:47:58,842][47288] InferenceWorker_p0-w0: resuming experience collection (31750 times) [2024-04-26 08:47:58,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2197012480. Throughput: 0: 56380.8. Samples: 2146337640. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:47:58,923][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 08:47:58,945][47288] Updated weights for policy 0, policy_version 134096 (0.0032) [2024-04-26 08:48:01,730][47288] Updated weights for policy 0, policy_version 134106 (0.0024) [2024-04-26 08:48:03,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 2197323776. Throughput: 0: 56494.1. Samples: 2146677800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:48:03,923][47056] Avg episode reward: [(0, '0.542')] [2024-04-26 08:48:04,737][47288] Updated weights for policy 0, policy_version 134116 (0.0023) [2024-04-26 08:48:07,718][47288] Updated weights for policy 0, policy_version 134126 (0.0027) [2024-04-26 08:48:08,923][47056] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 2197569536. Throughput: 0: 56536.1. Samples: 2147014900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:48:08,923][47056] Avg episode reward: [(0, '0.643')] [2024-04-26 08:48:10,576][47288] Updated weights for policy 0, policy_version 134136 (0.0029) [2024-04-26 08:48:13,386][47288] Updated weights for policy 0, policy_version 134146 (0.0030) [2024-04-26 08:48:13,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55978.6, 300 sec: 56372.1). Total num frames: 2197848064. Throughput: 0: 56451.2. Samples: 2147179160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:48:13,923][47056] Avg episode reward: [(0, '0.585')] [2024-04-26 08:48:16,350][47288] Updated weights for policy 0, policy_version 134156 (0.0028) [2024-04-26 08:48:18,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56251.7, 300 sec: 56483.2). Total num frames: 2198159360. Throughput: 0: 56383.9. Samples: 2147518040. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:48:18,923][47056] Avg episode reward: [(0, '0.592')] [2024-04-26 08:48:19,309][47288] Updated weights for policy 0, policy_version 134166 (0.0037) [2024-04-26 08:48:22,166][47288] Updated weights for policy 0, policy_version 134176 (0.0030) [2024-04-26 08:48:23,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2198437888. Throughput: 0: 56340.3. Samples: 2147854080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:48:23,923][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 08:48:25,313][47288] Updated weights for policy 0, policy_version 134186 (0.0028) [2024-04-26 08:48:27,910][47288] Updated weights for policy 0, policy_version 134196 (0.0025) [2024-04-26 08:48:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56798.0, 300 sec: 56427.6). Total num frames: 2198732800. Throughput: 0: 56239.2. Samples: 2148026440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:48:28,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 08:48:30,948][47288] Updated weights for policy 0, policy_version 134206 (0.0032) [2024-04-26 08:48:33,793][47288] Updated weights for policy 0, policy_version 134216 (0.0028) [2024-04-26 08:48:33,923][47056] Fps is (10 sec: 55705.5, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 2198994944. Throughput: 0: 56356.1. Samples: 2148365100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:48:33,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 08:48:36,695][47288] Updated weights for policy 0, policy_version 134226 (0.0026) [2024-04-26 08:48:38,923][47056] Fps is (10 sec: 54066.5, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2199273472. Throughput: 0: 56340.3. Samples: 2148706100. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:48:38,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 08:48:39,549][47288] Updated weights for policy 0, policy_version 134236 (0.0028) [2024-04-26 08:48:42,595][47288] Updated weights for policy 0, policy_version 134246 (0.0031) [2024-04-26 08:48:43,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2199568384. Throughput: 0: 56430.3. Samples: 2148877000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 08:48:43,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 08:48:45,234][47288] Updated weights for policy 0, policy_version 134256 (0.0030) [2024-04-26 08:48:48,556][47288] Updated weights for policy 0, policy_version 134266 (0.0024) [2024-04-26 08:48:48,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.8, 300 sec: 56427.6). Total num frames: 2199830528. Throughput: 0: 56228.1. Samples: 2149208060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:48:48,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 08:48:48,935][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000134267_2199830528.pth... [2024-04-26 08:48:48,983][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000133442_2186313728.pth [2024-04-26 08:48:51,190][47288] Updated weights for policy 0, policy_version 134276 (0.0032) [2024-04-26 08:48:53,923][47056] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 56372.1). Total num frames: 2200109056. Throughput: 0: 56262.6. Samples: 2149546720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:48:53,923][47056] Avg episode reward: [(0, '0.466')] [2024-04-26 08:48:54,251][47288] Updated weights for policy 0, policy_version 134286 (0.0026) [2024-04-26 08:48:56,938][47288] Updated weights for policy 0, policy_version 134296 (0.0028) [2024-04-26 08:48:58,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.9, 300 sec: 56483.2). 
Total num frames: 2200420352. Throughput: 0: 56381.8. Samples: 2149716340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:48:58,923][47056] Avg episode reward: [(0, '0.582')] [2024-04-26 08:49:00,090][47288] Updated weights for policy 0, policy_version 134306 (0.0030) [2024-04-26 08:49:02,686][47288] Updated weights for policy 0, policy_version 134316 (0.0034) [2024-04-26 08:49:03,923][47056] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 2200682496. Throughput: 0: 56391.2. Samples: 2150055640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:49:03,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:49:05,858][47288] Updated weights for policy 0, policy_version 134326 (0.0029) [2024-04-26 08:49:06,239][47267] Signal inference workers to stop experience collection... (31800 times) [2024-04-26 08:49:06,271][47288] InferenceWorker_p0-w0: stopping experience collection (31800 times) [2024-04-26 08:49:06,298][47267] Signal inference workers to resume experience collection... (31800 times) [2024-04-26 08:49:06,303][47288] InferenceWorker_p0-w0: resuming experience collection (31800 times) [2024-04-26 08:49:08,700][47288] Updated weights for policy 0, policy_version 134336 (0.0028) [2024-04-26 08:49:08,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56797.7, 300 sec: 56427.6). Total num frames: 2200977408. Throughput: 0: 56420.8. Samples: 2150393020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:49:08,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:49:11,717][47288] Updated weights for policy 0, policy_version 134346 (0.0030) [2024-04-26 08:49:13,923][47056] Fps is (10 sec: 57343.1, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 2201255936. Throughput: 0: 56484.3. Samples: 2150568240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:49:13,923][47056] Avg episode reward: [(0, '0.592')] [2024-04-26 08:49:14,475][47288] Updated weights for policy 0, policy_version 134356 (0.0027) [2024-04-26 08:49:17,539][47288] Updated weights for policy 0, policy_version 134366 (0.0037) [2024-04-26 08:49:18,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 56483.1). Total num frames: 2201534464. Throughput: 0: 56439.6. Samples: 2150904880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:49:18,923][47056] Avg episode reward: [(0, '0.587')] [2024-04-26 08:49:20,085][47288] Updated weights for policy 0, policy_version 134376 (0.0027) [2024-04-26 08:49:23,222][47288] Updated weights for policy 0, policy_version 134386 (0.0030) [2024-04-26 08:49:23,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2201829376. Throughput: 0: 56334.3. Samples: 2151241140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:49:23,923][47056] Avg episode reward: [(0, '0.624')] [2024-04-26 08:49:26,094][47288] Updated weights for policy 0, policy_version 134396 (0.0027) [2024-04-26 08:49:28,923][47056] Fps is (10 sec: 55704.4, 60 sec: 55978.3, 300 sec: 56427.5). Total num frames: 2202091520. Throughput: 0: 56264.1. Samples: 2151408900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:49:28,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 08:49:29,024][47288] Updated weights for policy 0, policy_version 134406 (0.0027) [2024-04-26 08:49:31,772][47288] Updated weights for policy 0, policy_version 134416 (0.0024) [2024-04-26 08:49:33,923][47056] Fps is (10 sec: 54067.2, 60 sec: 56251.9, 300 sec: 56372.1). Total num frames: 2202370048. Throughput: 0: 56501.8. Samples: 2151750640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:49:33,923][47056] Avg episode reward: [(0, '0.601')] [2024-04-26 08:49:34,844][47288] Updated weights for policy 0, policy_version 134426 (0.0027) [2024-04-26 08:49:37,563][47288] Updated weights for policy 0, policy_version 134436 (0.0024) [2024-04-26 08:49:38,923][47056] Fps is (10 sec: 57345.0, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2202664960. Throughput: 0: 56349.3. Samples: 2152082440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:49:38,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:49:40,727][47288] Updated weights for policy 0, policy_version 134446 (0.0034) [2024-04-26 08:49:43,449][47288] Updated weights for policy 0, policy_version 134456 (0.0032) [2024-04-26 08:49:43,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 2202927104. Throughput: 0: 56342.3. Samples: 2152251740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 08:49:43,924][47056] Avg episode reward: [(0, '0.504')] [2024-04-26 08:49:46,408][47288] Updated weights for policy 0, policy_version 134466 (0.0029) [2024-04-26 08:49:48,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2203222016. Throughput: 0: 56351.8. Samples: 2152591480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:49:48,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:49:49,140][47288] Updated weights for policy 0, policy_version 134476 (0.0032) [2024-04-26 08:49:52,265][47288] Updated weights for policy 0, policy_version 134486 (0.0029) [2024-04-26 08:49:53,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2203484160. Throughput: 0: 56303.1. Samples: 2152926660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:49:53,923][47056] Avg episode reward: [(0, '0.474')] [2024-04-26 08:49:54,941][47288] Updated weights for policy 0, policy_version 134496 (0.0027) [2024-04-26 08:49:58,013][47288] Updated weights for policy 0, policy_version 134506 (0.0023) [2024-04-26 08:49:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 56427.6). Total num frames: 2203779072. Throughput: 0: 56054.8. Samples: 2153090700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:49:58,923][47056] Avg episode reward: [(0, '0.561')] [2024-04-26 08:50:00,786][47288] Updated weights for policy 0, policy_version 134516 (0.0033) [2024-04-26 08:50:03,826][47288] Updated weights for policy 0, policy_version 134526 (0.0035) [2024-04-26 08:50:03,923][47056] Fps is (10 sec: 58983.3, 60 sec: 56524.7, 300 sec: 56483.2). Total num frames: 2204073984. Throughput: 0: 56173.9. Samples: 2153432700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:03,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 08:50:06,543][47288] Updated weights for policy 0, policy_version 134536 (0.0030) [2024-04-26 08:50:08,923][47056] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 2204319744. Throughput: 0: 56347.5. Samples: 2153776780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:08,923][47056] Avg episode reward: [(0, '0.499')] [2024-04-26 08:50:09,614][47288] Updated weights for policy 0, policy_version 134546 (0.0036) [2024-04-26 08:50:12,528][47288] Updated weights for policy 0, policy_version 134556 (0.0026) [2024-04-26 08:50:13,256][47267] Signal inference workers to stop experience collection... (31850 times) [2024-04-26 08:50:13,287][47288] InferenceWorker_p0-w0: stopping experience collection (31850 times) [2024-04-26 08:50:13,311][47267] Signal inference workers to resume experience collection... 
(31850 times) [2024-04-26 08:50:13,314][47288] InferenceWorker_p0-w0: resuming experience collection (31850 times) [2024-04-26 08:50:13,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2204647424. Throughput: 0: 56377.2. Samples: 2153945860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:13,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:50:15,430][47288] Updated weights for policy 0, policy_version 134566 (0.0030) [2024-04-26 08:50:18,279][47288] Updated weights for policy 0, policy_version 134576 (0.0027) [2024-04-26 08:50:18,923][47056] Fps is (10 sec: 58982.0, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 2204909568. Throughput: 0: 56241.7. Samples: 2154281520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:18,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:50:21,213][47288] Updated weights for policy 0, policy_version 134586 (0.0030) [2024-04-26 08:50:23,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2205204480. Throughput: 0: 56282.3. Samples: 2154615140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:23,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 08:50:24,113][47288] Updated weights for policy 0, policy_version 134596 (0.0026) [2024-04-26 08:50:26,986][47288] Updated weights for policy 0, policy_version 134606 (0.0026) [2024-04-26 08:50:28,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56251.9, 300 sec: 56372.0). Total num frames: 2205466624. Throughput: 0: 56508.3. Samples: 2154794620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:28,923][47056] Avg episode reward: [(0, '0.622')] [2024-04-26 08:50:29,768][47288] Updated weights for policy 0, policy_version 134616 (0.0037) [2024-04-26 08:50:33,048][47288] Updated weights for policy 0, policy_version 134626 (0.0033) [2024-04-26 08:50:33,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56483.2). Total num frames: 2205761536. Throughput: 0: 56345.9. Samples: 2155127040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:33,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 08:50:35,625][47288] Updated weights for policy 0, policy_version 134636 (0.0027) [2024-04-26 08:50:38,923][47056] Fps is (10 sec: 55706.9, 60 sec: 55978.8, 300 sec: 56372.1). Total num frames: 2206023680. Throughput: 0: 56369.6. Samples: 2155463280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:38,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 08:50:38,965][47288] Updated weights for policy 0, policy_version 134646 (0.0029) [2024-04-26 08:50:41,372][47288] Updated weights for policy 0, policy_version 134656 (0.0032) [2024-04-26 08:50:43,923][47056] Fps is (10 sec: 54066.0, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 2206302208. Throughput: 0: 56288.2. Samples: 2155623680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:43,923][47056] Avg episode reward: [(0, '0.637')] [2024-04-26 08:50:44,737][47288] Updated weights for policy 0, policy_version 134666 (0.0026) [2024-04-26 08:50:47,256][47288] Updated weights for policy 0, policy_version 134676 (0.0031) [2024-04-26 08:50:48,923][47056] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 2206580736. Throughput: 0: 56194.5. Samples: 2155961460. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 08:50:48,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 08:50:48,932][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000134679_2206580736.pth... [2024-04-26 08:50:49,011][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000133855_2193080320.pth [2024-04-26 08:50:50,352][47288] Updated weights for policy 0, policy_version 134686 (0.0024) [2024-04-26 08:50:52,979][47288] Updated weights for policy 0, policy_version 134696 (0.0027) [2024-04-26 08:50:53,923][47056] Fps is (10 sec: 57345.2, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 2206875648. Throughput: 0: 56226.7. Samples: 2156306980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:50:53,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 08:50:56,305][47288] Updated weights for policy 0, policy_version 134706 (0.0025) [2024-04-26 08:50:58,749][47288] Updated weights for policy 0, policy_version 134716 (0.0027) [2024-04-26 08:50:58,923][47056] Fps is (10 sec: 60621.3, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 2207186944. Throughput: 0: 56060.5. Samples: 2156468580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:50:58,923][47056] Avg episode reward: [(0, '0.513')] [2024-04-26 08:51:02,156][47288] Updated weights for policy 0, policy_version 134726 (0.0029) [2024-04-26 08:51:03,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 56372.1). Total num frames: 2207449088. Throughput: 0: 56173.4. Samples: 2156809320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:03,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 08:51:04,441][47288] Updated weights for policy 0, policy_version 134736 (0.0029) [2024-04-26 08:51:07,804][47288] Updated weights for policy 0, policy_version 134746 (0.0036) [2024-04-26 08:51:08,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56524.8, 300 sec: 56316.6). 
Total num frames: 2207711232. Throughput: 0: 56313.4. Samples: 2157149240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:08,923][47056] Avg episode reward: [(0, '0.587')] [2024-04-26 08:51:10,398][47288] Updated weights for policy 0, policy_version 134756 (0.0027) [2024-04-26 08:51:13,567][47288] Updated weights for policy 0, policy_version 134766 (0.0030) [2024-04-26 08:51:13,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 2208006144. Throughput: 0: 56181.9. Samples: 2157322800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:13,923][47056] Avg episode reward: [(0, '0.433')] [2024-04-26 08:51:16,275][47288] Updated weights for policy 0, policy_version 134776 (0.0030) [2024-04-26 08:51:18,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 2208268288. Throughput: 0: 56345.8. Samples: 2157662600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:18,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:51:19,359][47288] Updated weights for policy 0, policy_version 134786 (0.0030) [2024-04-26 08:51:21,250][47267] Signal inference workers to stop experience collection... (31900 times) [2024-04-26 08:51:21,290][47288] InferenceWorker_p0-w0: stopping experience collection (31900 times) [2024-04-26 08:51:21,314][47267] Signal inference workers to resume experience collection... (31900 times) [2024-04-26 08:51:21,319][47288] InferenceWorker_p0-w0: resuming experience collection (31900 times) [2024-04-26 08:51:22,561][47288] Updated weights for policy 0, policy_version 134796 (0.0027) [2024-04-26 08:51:23,923][47056] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 56316.6). Total num frames: 2208563200. Throughput: 0: 56310.9. Samples: 2157997280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:23,923][47056] Avg episode reward: [(0, '0.609')] [2024-04-26 08:51:25,363][47288] Updated weights for policy 0, policy_version 134806 (0.0032) [2024-04-26 08:51:28,238][47288] Updated weights for policy 0, policy_version 134816 (0.0027) [2024-04-26 08:51:28,923][47056] Fps is (10 sec: 57343.2, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 2208841728. Throughput: 0: 56504.2. Samples: 2158166360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:28,924][47056] Avg episode reward: [(0, '0.516')] [2024-04-26 08:51:31,044][47288] Updated weights for policy 0, policy_version 134826 (0.0026) [2024-04-26 08:51:33,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 56316.6). Total num frames: 2209136640. Throughput: 0: 56429.5. Samples: 2158500780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:33,923][47056] Avg episode reward: [(0, '0.473')] [2024-04-26 08:51:34,008][47288] Updated weights for policy 0, policy_version 134836 (0.0026) [2024-04-26 08:51:36,952][47288] Updated weights for policy 0, policy_version 134846 (0.0036) [2024-04-26 08:51:38,923][47056] Fps is (10 sec: 58982.3, 60 sec: 56797.7, 300 sec: 56372.1). Total num frames: 2209431552. Throughput: 0: 56311.0. Samples: 2158840980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:38,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 08:51:39,961][47288] Updated weights for policy 0, policy_version 134856 (0.0025) [2024-04-26 08:51:42,777][47288] Updated weights for policy 0, policy_version 134866 (0.0031) [2024-04-26 08:51:43,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56798.1, 300 sec: 56372.1). Total num frames: 2209710080. Throughput: 0: 56492.5. Samples: 2159010740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:43,923][47056] Avg episode reward: [(0, '0.585')] [2024-04-26 08:51:45,979][47288] Updated weights for policy 0, policy_version 134876 (0.0031) [2024-04-26 08:51:48,923][47056] Fps is (10 sec: 52429.3, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 2209955840. Throughput: 0: 56473.8. Samples: 2159350640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 08:51:48,923][47056] Avg episode reward: [(0, '0.503')] [2024-04-26 08:51:48,934][47288] Updated weights for policy 0, policy_version 134886 (0.0026) [2024-04-26 08:51:51,835][47288] Updated weights for policy 0, policy_version 134896 (0.0030) [2024-04-26 08:51:53,923][47056] Fps is (10 sec: 55704.6, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 2210267136. Throughput: 0: 56421.5. Samples: 2159688220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:51:53,923][47056] Avg episode reward: [(0, '0.577')] [2024-04-26 08:51:54,827][47288] Updated weights for policy 0, policy_version 134906 (0.0032) [2024-04-26 08:51:57,748][47288] Updated weights for policy 0, policy_version 134916 (0.0030) [2024-04-26 08:51:58,923][47056] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 56316.5). Total num frames: 2210529280. Throughput: 0: 56228.9. Samples: 2159853100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:51:58,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:52:00,543][47288] Updated weights for policy 0, policy_version 134926 (0.0027) [2024-04-26 08:52:03,375][47288] Updated weights for policy 0, policy_version 134936 (0.0031) [2024-04-26 08:52:03,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 56316.5). Total num frames: 2210824192. Throughput: 0: 56186.1. Samples: 2160190980. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:03,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 08:52:06,194][47288] Updated weights for policy 0, policy_version 134946 (0.0033) [2024-04-26 08:52:08,923][47056] Fps is (10 sec: 55706.2, 60 sec: 56251.7, 300 sec: 56261.0). Total num frames: 2211086336. Throughput: 0: 56309.5. Samples: 2160531200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:08,923][47056] Avg episode reward: [(0, '0.494')] [2024-04-26 08:52:09,254][47288] Updated weights for policy 0, policy_version 134956 (0.0029) [2024-04-26 08:52:12,151][47288] Updated weights for policy 0, policy_version 134966 (0.0029) [2024-04-26 08:52:13,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56524.6, 300 sec: 56316.5). Total num frames: 2211397632. Throughput: 0: 56334.4. Samples: 2160701420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:13,923][47056] Avg episode reward: [(0, '0.563')] [2024-04-26 08:52:15,081][47288] Updated weights for policy 0, policy_version 134976 (0.0036) [2024-04-26 08:52:17,857][47288] Updated weights for policy 0, policy_version 134986 (0.0033) [2024-04-26 08:52:18,923][47056] Fps is (10 sec: 60619.2, 60 sec: 57070.7, 300 sec: 56427.6). Total num frames: 2211692544. Throughput: 0: 56490.4. Samples: 2161042860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:18,923][47056] Avg episode reward: [(0, '0.459')] [2024-04-26 08:52:20,693][47288] Updated weights for policy 0, policy_version 134996 (0.0026) [2024-04-26 08:52:23,502][47288] Updated weights for policy 0, policy_version 135006 (0.0033) [2024-04-26 08:52:23,923][47056] Fps is (10 sec: 54068.8, 60 sec: 56251.8, 300 sec: 56316.6). Total num frames: 2211938304. Throughput: 0: 56425.9. Samples: 2161380140. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:23,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 08:52:26,247][47267] Signal inference workers to stop experience collection... 
(31950 times) [2024-04-26 08:52:26,277][47288] InferenceWorker_p0-w0: stopping experience collection (31950 times) [2024-04-26 08:52:26,329][47267] Signal inference workers to resume experience collection... (31950 times) [2024-04-26 08:52:26,330][47288] InferenceWorker_p0-w0: resuming experience collection (31950 times) [2024-04-26 08:52:26,434][47288] Updated weights for policy 0, policy_version 135016 (0.0024) [2024-04-26 08:52:28,923][47056] Fps is (10 sec: 54067.8, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2212233216. Throughput: 0: 56372.7. Samples: 2161547520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:28,923][47056] Avg episode reward: [(0, '0.597')] [2024-04-26 08:52:29,457][47288] Updated weights for policy 0, policy_version 135026 (0.0030) [2024-04-26 08:52:32,310][47288] Updated weights for policy 0, policy_version 135036 (0.0028) [2024-04-26 08:52:33,923][47056] Fps is (10 sec: 58981.7, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2212528128. Throughput: 0: 56386.1. Samples: 2161888020. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:33,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 08:52:35,192][47288] Updated weights for policy 0, policy_version 135046 (0.0031) [2024-04-26 08:52:38,222][47288] Updated weights for policy 0, policy_version 135056 (0.0030) [2024-04-26 08:52:38,923][47056] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 2212790272. Throughput: 0: 56437.0. Samples: 2162227880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:38,923][47056] Avg episode reward: [(0, '0.486')] [2024-04-26 08:52:40,935][47288] Updated weights for policy 0, policy_version 135066 (0.0032) [2024-04-26 08:52:43,923][47056] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 56261.0). Total num frames: 2213068800. Throughput: 0: 56525.8. Samples: 2162396760. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:43,923][47056] Avg episode reward: [(0, '0.591')] [2024-04-26 08:52:43,968][47288] Updated weights for policy 0, policy_version 135076 (0.0027) [2024-04-26 08:52:46,589][47288] Updated weights for policy 0, policy_version 135086 (0.0027) [2024-04-26 08:52:48,923][47056] Fps is (10 sec: 57343.3, 60 sec: 56797.7, 300 sec: 56261.0). Total num frames: 2213363712. Throughput: 0: 56465.7. Samples: 2162731940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:48,924][47056] Avg episode reward: [(0, '0.518')] [2024-04-26 08:52:48,934][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000135093_2213363712.pth... [2024-04-26 08:52:48,976][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000134267_2199830528.pth [2024-04-26 08:52:49,729][47288] Updated weights for policy 0, policy_version 135096 (0.0026) [2024-04-26 08:52:52,441][47288] Updated weights for policy 0, policy_version 135106 (0.0026) [2024-04-26 08:52:53,923][47056] Fps is (10 sec: 58981.6, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2213658624. Throughput: 0: 56308.2. Samples: 2163065080. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-26 08:52:53,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 08:52:55,520][47288] Updated weights for policy 0, policy_version 135116 (0.0033) [2024-04-26 08:52:58,255][47288] Updated weights for policy 0, policy_version 135126 (0.0029) [2024-04-26 08:52:58,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 2213920768. Throughput: 0: 56443.4. Samples: 2163241360. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:52:58,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 08:53:01,386][47288] Updated weights for policy 0, policy_version 135136 (0.0026) [2024-04-26 08:53:03,923][47056] Fps is (10 sec: 55706.7, 60 sec: 56524.9, 300 sec: 56427.6). 
Total num frames: 2214215680. Throughput: 0: 56373.6. Samples: 2163579660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:03,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 08:53:04,077][47288] Updated weights for policy 0, policy_version 135146 (0.0025) [2024-04-26 08:53:07,248][47288] Updated weights for policy 0, policy_version 135156 (0.0027) [2024-04-26 08:53:08,923][47056] Fps is (10 sec: 57343.0, 60 sec: 56797.6, 300 sec: 56427.6). Total num frames: 2214494208. Throughput: 0: 56383.2. Samples: 2163917400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:08,923][47056] Avg episode reward: [(0, '0.452')] [2024-04-26 08:53:10,032][47288] Updated weights for policy 0, policy_version 135166 (0.0029) [2024-04-26 08:53:12,994][47288] Updated weights for policy 0, policy_version 135176 (0.0030) [2024-04-26 08:53:13,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56252.0, 300 sec: 56316.5). Total num frames: 2214772736. Throughput: 0: 56482.4. Samples: 2164089220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:13,923][47056] Avg episode reward: [(0, '0.610')] [2024-04-26 08:53:15,754][47288] Updated weights for policy 0, policy_version 135186 (0.0030) [2024-04-26 08:53:18,774][47288] Updated weights for policy 0, policy_version 135196 (0.0031) [2024-04-26 08:53:18,923][47056] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 2215051264. Throughput: 0: 56334.6. Samples: 2164423080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:18,923][47056] Avg episode reward: [(0, '0.602')] [2024-04-26 08:53:21,667][47288] Updated weights for policy 0, policy_version 135206 (0.0035) [2024-04-26 08:53:23,923][47056] Fps is (10 sec: 54067.5, 60 sec: 56251.8, 300 sec: 56205.5). Total num frames: 2215313408. Throughput: 0: 56321.0. Samples: 2164762320. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:23,923][47056] Avg episode reward: [(0, '0.577')] [2024-04-26 08:53:24,621][47288] Updated weights for policy 0, policy_version 135216 (0.0024) [2024-04-26 08:53:27,545][47288] Updated weights for policy 0, policy_version 135226 (0.0030) [2024-04-26 08:53:28,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2215624704. Throughput: 0: 56332.4. Samples: 2164931720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:28,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 08:53:30,518][47288] Updated weights for policy 0, policy_version 135236 (0.0031) [2024-04-26 08:53:33,293][47288] Updated weights for policy 0, policy_version 135246 (0.0024) [2024-04-26 08:53:33,704][47267] Signal inference workers to stop experience collection... (32000 times) [2024-04-26 08:53:33,704][47267] Signal inference workers to resume experience collection... (32000 times) [2024-04-26 08:53:33,727][47288] InferenceWorker_p0-w0: stopping experience collection (32000 times) [2024-04-26 08:53:33,727][47288] InferenceWorker_p0-w0: resuming experience collection (32000 times) [2024-04-26 08:53:33,923][47056] Fps is (10 sec: 60620.1, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2215919616. Throughput: 0: 56261.5. Samples: 2165263700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:33,923][47056] Avg episode reward: [(0, '0.535')] [2024-04-26 08:53:36,298][47288] Updated weights for policy 0, policy_version 135256 (0.0030) [2024-04-26 08:53:38,923][47056] Fps is (10 sec: 55705.1, 60 sec: 56524.7, 300 sec: 56316.5). Total num frames: 2216181760. Throughput: 0: 56509.4. Samples: 2165608000. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:38,923][47056] Avg episode reward: [(0, '0.597')] [2024-04-26 08:53:38,982][47288] Updated weights for policy 0, policy_version 135266 (0.0029) [2024-04-26 08:53:42,094][47288] Updated weights for policy 0, policy_version 135276 (0.0029) [2024-04-26 08:53:43,923][47056] Fps is (10 sec: 57344.4, 60 sec: 57071.0, 300 sec: 56483.2). Total num frames: 2216493056. Throughput: 0: 56543.6. Samples: 2165785820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:43,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 08:53:44,863][47288] Updated weights for policy 0, policy_version 135286 (0.0028) [2024-04-26 08:53:47,927][47288] Updated weights for policy 0, policy_version 135296 (0.0028) [2024-04-26 08:53:48,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2216738816. Throughput: 0: 56525.6. Samples: 2166123320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:48,923][47056] Avg episode reward: [(0, '0.559')] [2024-04-26 08:53:50,735][47288] Updated weights for policy 0, policy_version 135306 (0.0026) [2024-04-26 08:53:53,728][47288] Updated weights for policy 0, policy_version 135316 (0.0031) [2024-04-26 08:53:53,923][47056] Fps is (10 sec: 54066.3, 60 sec: 56251.8, 300 sec: 56316.5). Total num frames: 2217033728. Throughput: 0: 56636.1. Samples: 2166466020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:53,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:53:56,550][47288] Updated weights for policy 0, policy_version 135326 (0.0027) [2024-04-26 08:53:58,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.7, 300 sec: 56372.0). Total num frames: 2217312256. Throughput: 0: 56526.9. Samples: 2166632940. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-26 08:53:58,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 08:53:59,395][47288] Updated weights for policy 0, policy_version 135336 (0.0024) [2024-04-26 08:54:02,205][47288] Updated weights for policy 0, policy_version 135346 (0.0027) [2024-04-26 08:54:03,923][47056] Fps is (10 sec: 54068.5, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 2217574400. Throughput: 0: 56454.9. Samples: 2166963540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:03,923][47056] Avg episode reward: [(0, '0.424')] [2024-04-26 08:54:05,292][47288] Updated weights for policy 0, policy_version 135356 (0.0028) [2024-04-26 08:54:08,058][47288] Updated weights for policy 0, policy_version 135366 (0.0032) [2024-04-26 08:54:08,923][47056] Fps is (10 sec: 57344.5, 60 sec: 56525.0, 300 sec: 56372.1). Total num frames: 2217885696. Throughput: 0: 56459.9. Samples: 2167303020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:08,923][47056] Avg episode reward: [(0, '0.603')] [2024-04-26 08:54:11,187][47288] Updated weights for policy 0, policy_version 135376 (0.0032) [2024-04-26 08:54:13,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56251.6, 300 sec: 56316.5). Total num frames: 2218147840. Throughput: 0: 56507.1. Samples: 2167474540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:13,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 08:54:14,049][47288] Updated weights for policy 0, policy_version 135386 (0.0030) [2024-04-26 08:54:17,114][47288] Updated weights for policy 0, policy_version 135396 (0.0030) [2024-04-26 08:54:18,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56798.0, 300 sec: 56372.1). Total num frames: 2218459136. Throughput: 0: 56619.7. Samples: 2167811580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:18,923][47056] Avg episode reward: [(0, '0.620')] [2024-04-26 08:54:19,717][47288] Updated weights for policy 0, policy_version 135406 (0.0029) [2024-04-26 08:54:23,079][47288] Updated weights for policy 0, policy_version 135416 (0.0031) [2024-04-26 08:54:23,923][47056] Fps is (10 sec: 58982.5, 60 sec: 57070.8, 300 sec: 56427.6). Total num frames: 2218737664. Throughput: 0: 56417.4. Samples: 2168146780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:23,923][47056] Avg episode reward: [(0, '0.465')] [2024-04-26 08:54:25,386][47288] Updated weights for policy 0, policy_version 135426 (0.0028) [2024-04-26 08:54:28,789][47288] Updated weights for policy 0, policy_version 135436 (0.0028) [2024-04-26 08:54:28,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 2218983424. Throughput: 0: 56097.8. Samples: 2168310220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:28,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 08:54:31,347][47288] Updated weights for policy 0, policy_version 135446 (0.0027) [2024-04-26 08:54:33,923][47056] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 2219261952. Throughput: 0: 56134.0. Samples: 2168649340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:33,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 08:54:34,568][47288] Updated weights for policy 0, policy_version 135456 (0.0027) [2024-04-26 08:54:35,269][47267] Signal inference workers to stop experience collection... (32050 times) [2024-04-26 08:54:35,270][47267] Signal inference workers to resume experience collection... 
(32050 times) [2024-04-26 08:54:35,309][47288] InferenceWorker_p0-w0: stopping experience collection (32050 times) [2024-04-26 08:54:35,309][47288] InferenceWorker_p0-w0: resuming experience collection (32050 times) [2024-04-26 08:54:37,221][47288] Updated weights for policy 0, policy_version 135466 (0.0029) [2024-04-26 08:54:38,923][47056] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 56316.5). Total num frames: 2219540480. Throughput: 0: 56182.0. Samples: 2168994200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:38,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 08:54:40,305][47288] Updated weights for policy 0, policy_version 135476 (0.0030) [2024-04-26 08:54:42,878][47288] Updated weights for policy 0, policy_version 135486 (0.0029) [2024-04-26 08:54:43,923][47056] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 56316.6). Total num frames: 2219835392. Throughput: 0: 56064.7. Samples: 2169155840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:43,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 08:54:46,178][47288] Updated weights for policy 0, policy_version 135496 (0.0027) [2024-04-26 08:54:48,609][47288] Updated weights for policy 0, policy_version 135506 (0.0029) [2024-04-26 08:54:48,923][47056] Fps is (10 sec: 60620.1, 60 sec: 56797.9, 300 sec: 56483.2). Total num frames: 2220146688. Throughput: 0: 56274.8. Samples: 2169495920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:48,924][47056] Avg episode reward: [(0, '0.492')] [2024-04-26 08:54:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000135507_2220146688.pth... [2024-04-26 08:54:48,982][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000134679_2206580736.pth [2024-04-26 08:54:51,876][47288] Updated weights for policy 0, policy_version 135516 (0.0026) [2024-04-26 08:54:53,923][47056] Fps is (10 sec: 58982.5, 60 sec: 56525.0, 300 sec: 56427.6). 
Total num frames: 2220425216. Throughput: 0: 56081.9. Samples: 2169826700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:53,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 08:54:54,521][47288] Updated weights for policy 0, policy_version 135526 (0.0026) [2024-04-26 08:54:57,855][47288] Updated weights for policy 0, policy_version 135536 (0.0030) [2024-04-26 08:54:58,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56524.8, 300 sec: 56372.0). Total num frames: 2220703744. Throughput: 0: 56342.1. Samples: 2170009940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 08:54:58,923][47056] Avg episode reward: [(0, '0.551')] [2024-04-26 08:55:00,289][47288] Updated weights for policy 0, policy_version 135546 (0.0031) [2024-04-26 08:55:03,581][47288] Updated weights for policy 0, policy_version 135556 (0.0030) [2024-04-26 08:55:03,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56797.8, 300 sec: 56483.1). Total num frames: 2220982272. Throughput: 0: 56314.6. Samples: 2170345740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:03,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 08:55:05,997][47288] Updated weights for policy 0, policy_version 135566 (0.0027) [2024-04-26 08:55:08,923][47056] Fps is (10 sec: 52429.1, 60 sec: 55705.5, 300 sec: 56205.4). Total num frames: 2221228032. Throughput: 0: 56416.4. Samples: 2170685520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:08,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:55:09,471][47288] Updated weights for policy 0, policy_version 135576 (0.0025) [2024-04-26 08:55:11,694][47288] Updated weights for policy 0, policy_version 135586 (0.0025) [2024-04-26 08:55:13,923][47056] Fps is (10 sec: 52428.8, 60 sec: 55978.7, 300 sec: 56261.0). Total num frames: 2221506560. Throughput: 0: 56278.1. Samples: 2170842740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:13,923][47056] Avg episode reward: [(0, '0.584')] [2024-04-26 08:55:15,195][47288] Updated weights for policy 0, policy_version 135596 (0.0029) [2024-04-26 08:55:17,582][47288] Updated weights for policy 0, policy_version 135606 (0.0031) [2024-04-26 08:55:18,923][47056] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 56261.0). Total num frames: 2221801472. Throughput: 0: 56191.1. Samples: 2171177940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:18,923][47056] Avg episode reward: [(0, '0.637')] [2024-04-26 08:55:20,929][47288] Updated weights for policy 0, policy_version 135616 (0.0029) [2024-04-26 08:55:23,587][47288] Updated weights for policy 0, policy_version 135626 (0.0029) [2024-04-26 08:55:23,923][47056] Fps is (10 sec: 60620.9, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2222112768. Throughput: 0: 56181.3. Samples: 2171522360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:23,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:55:26,656][47288] Updated weights for policy 0, policy_version 135636 (0.0039) [2024-04-26 08:55:28,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56797.7, 300 sec: 56372.0). Total num frames: 2222391296. Throughput: 0: 56478.5. Samples: 2171697380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:28,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 08:55:29,368][47288] Updated weights for policy 0, policy_version 135646 (0.0026) [2024-04-26 08:55:32,243][47267] Signal inference workers to stop experience collection... (32100 times) [2024-04-26 08:55:32,244][47267] Signal inference workers to resume experience collection... 
(32100 times) [2024-04-26 08:55:32,259][47288] InferenceWorker_p0-w0: stopping experience collection (32100 times) [2024-04-26 08:55:32,259][47288] InferenceWorker_p0-w0: resuming experience collection (32100 times) [2024-04-26 08:55:32,535][47288] Updated weights for policy 0, policy_version 135656 (0.0033) [2024-04-26 08:55:33,923][47056] Fps is (10 sec: 57343.4, 60 sec: 57070.8, 300 sec: 56483.1). Total num frames: 2222686208. Throughput: 0: 56412.4. Samples: 2172034480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:33,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 08:55:35,078][47288] Updated weights for policy 0, policy_version 135666 (0.0025) [2024-04-26 08:55:38,239][47288] Updated weights for policy 0, policy_version 135676 (0.0026) [2024-04-26 08:55:38,923][47056] Fps is (10 sec: 57343.9, 60 sec: 57070.8, 300 sec: 56483.2). Total num frames: 2222964736. Throughput: 0: 56524.7. Samples: 2172370320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:38,923][47056] Avg episode reward: [(0, '0.511')] [2024-04-26 08:55:40,878][47288] Updated weights for policy 0, policy_version 135686 (0.0028) [2024-04-26 08:55:43,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56524.6, 300 sec: 56427.6). Total num frames: 2223226880. Throughput: 0: 56270.7. Samples: 2172542120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:43,923][47056] Avg episode reward: [(0, '0.553')] [2024-04-26 08:55:44,124][47288] Updated weights for policy 0, policy_version 135696 (0.0033) [2024-04-26 08:55:46,688][47288] Updated weights for policy 0, policy_version 135706 (0.0030) [2024-04-26 08:55:48,923][47056] Fps is (10 sec: 50790.9, 60 sec: 55432.6, 300 sec: 56261.0). Total num frames: 2223472640. Throughput: 0: 56316.5. Samples: 2172879980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:48,923][47056] Avg episode reward: [(0, '0.594')] [2024-04-26 08:55:50,042][47288] Updated weights for policy 0, policy_version 135716 (0.0032) [2024-04-26 08:55:52,344][47288] Updated weights for policy 0, policy_version 135726 (0.0027) [2024-04-26 08:55:53,923][47056] Fps is (10 sec: 55705.7, 60 sec: 55978.5, 300 sec: 56261.0). Total num frames: 2223783936. Throughput: 0: 56239.1. Samples: 2173216280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:53,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 08:55:55,771][47288] Updated weights for policy 0, policy_version 135736 (0.0030) [2024-04-26 08:55:58,392][47288] Updated weights for policy 0, policy_version 135746 (0.0035) [2024-04-26 08:55:58,923][47056] Fps is (10 sec: 60619.9, 60 sec: 56251.7, 300 sec: 56372.0). Total num frames: 2224078848. Throughput: 0: 56551.4. Samples: 2173387560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:55:58,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 08:56:01,622][47288] Updated weights for policy 0, policy_version 135756 (0.0025) [2024-04-26 08:56:03,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2224373760. Throughput: 0: 56447.8. Samples: 2173718100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 08:56:03,923][47056] Avg episode reward: [(0, '0.491')] [2024-04-26 08:56:04,239][47288] Updated weights for policy 0, policy_version 135766 (0.0032) [2024-04-26 08:56:07,601][47288] Updated weights for policy 0, policy_version 135776 (0.0033) [2024-04-26 08:56:08,924][47056] Fps is (10 sec: 57338.2, 60 sec: 57069.9, 300 sec: 56427.4). Total num frames: 2224652288. Throughput: 0: 56231.9. Samples: 2174052860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:08,925][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 08:56:10,178][47288] Updated weights for policy 0, policy_version 135786 (0.0027) [2024-04-26 08:56:13,542][47288] Updated weights for policy 0, policy_version 135796 (0.0030) [2024-04-26 08:56:13,923][47056] Fps is (10 sec: 55706.6, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 2224930816. Throughput: 0: 56294.9. Samples: 2174230640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:13,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 08:56:15,916][47288] Updated weights for policy 0, policy_version 135806 (0.0025) [2024-04-26 08:56:18,923][47056] Fps is (10 sec: 54073.1, 60 sec: 56524.7, 300 sec: 56372.1). Total num frames: 2225192960. Throughput: 0: 56331.2. Samples: 2174569380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:18,923][47056] Avg episode reward: [(0, '0.557')] [2024-04-26 08:56:19,290][47288] Updated weights for policy 0, policy_version 135816 (0.0030) [2024-04-26 08:56:21,728][47288] Updated weights for policy 0, policy_version 135826 (0.0030) [2024-04-26 08:56:23,923][47056] Fps is (10 sec: 52428.1, 60 sec: 55705.5, 300 sec: 56316.5). Total num frames: 2225455104. Throughput: 0: 56332.0. Samples: 2174905260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:23,923][47056] Avg episode reward: [(0, '0.532')] [2024-04-26 08:56:24,918][47288] Updated weights for policy 0, policy_version 135836 (0.0035) [2024-04-26 08:56:27,664][47288] Updated weights for policy 0, policy_version 135846 (0.0028) [2024-04-26 08:56:28,923][47056] Fps is (10 sec: 54067.7, 60 sec: 55705.7, 300 sec: 56261.0). Total num frames: 2225733632. Throughput: 0: 55942.4. Samples: 2175059520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:28,923][47056] Avg episode reward: [(0, '0.413')] [2024-04-26 08:56:30,895][47288] Updated weights for policy 0, policy_version 135856 (0.0032) [2024-04-26 08:56:33,305][47288] Updated weights for policy 0, policy_version 135866 (0.0025) [2024-04-26 08:56:33,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55978.8, 300 sec: 56316.6). Total num frames: 2226044928. Throughput: 0: 55885.4. Samples: 2175394820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:33,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 08:56:36,897][47288] Updated weights for policy 0, policy_version 135876 (0.0030) [2024-04-26 08:56:37,302][47267] Signal inference workers to stop experience collection... (32150 times) [2024-04-26 08:56:37,335][47288] InferenceWorker_p0-w0: stopping experience collection (32150 times) [2024-04-26 08:56:37,368][47267] Signal inference workers to resume experience collection... (32150 times) [2024-04-26 08:56:37,369][47288] InferenceWorker_p0-w0: resuming experience collection (32150 times) [2024-04-26 08:56:38,923][47056] Fps is (10 sec: 60620.4, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2226339840. Throughput: 0: 55959.2. Samples: 2175734440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:38,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 08:56:39,000][47288] Updated weights for policy 0, policy_version 135886 (0.0029) [2024-04-26 08:56:42,781][47288] Updated weights for policy 0, policy_version 135896 (0.0036) [2024-04-26 08:56:43,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.9, 300 sec: 56483.1). Total num frames: 2226618368. Throughput: 0: 56192.2. Samples: 2175916200. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:43,923][47056] Avg episode reward: [(0, '0.493')] [2024-04-26 08:56:44,816][47288] Updated weights for policy 0, policy_version 135906 (0.0029) [2024-04-26 08:56:48,550][47288] Updated weights for policy 0, policy_version 135916 (0.0029) [2024-04-26 08:56:48,923][47056] Fps is (10 sec: 52429.0, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 2226864128. Throughput: 0: 56241.0. Samples: 2176248940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:48,923][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 08:56:49,037][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000135919_2226896896.pth... [2024-04-26 08:56:49,093][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000135093_2213363712.pth [2024-04-26 08:56:50,924][47288] Updated weights for policy 0, policy_version 135926 (0.0030) [2024-04-26 08:56:53,923][47056] Fps is (10 sec: 52428.6, 60 sec: 55978.7, 300 sec: 56316.5). Total num frames: 2227142656. Throughput: 0: 56313.4. Samples: 2176586900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:53,923][47056] Avg episode reward: [(0, '0.554')] [2024-04-26 08:56:54,262][47288] Updated weights for policy 0, policy_version 135936 (0.0026) [2024-04-26 08:56:57,040][47288] Updated weights for policy 0, policy_version 135946 (0.0029) [2024-04-26 08:56:58,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55705.8, 300 sec: 56261.0). Total num frames: 2227421184. Throughput: 0: 55836.8. Samples: 2176743300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:56:58,923][47056] Avg episode reward: [(0, '0.589')] [2024-04-26 08:57:00,115][47288] Updated weights for policy 0, policy_version 135956 (0.0024) [2024-04-26 08:57:02,945][47288] Updated weights for policy 0, policy_version 135966 (0.0028) [2024-04-26 08:57:03,923][47056] Fps is (10 sec: 55706.3, 60 sec: 55432.7, 300 sec: 56316.5). 
Total num frames: 2227699712. Throughput: 0: 55864.2. Samples: 2177083260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 08:57:03,923][47056] Avg episode reward: [(0, '0.458')] [2024-04-26 08:57:06,024][47288] Updated weights for policy 0, policy_version 135976 (0.0026) [2024-04-26 08:57:08,825][47288] Updated weights for policy 0, policy_version 135986 (0.0026) [2024-04-26 08:57:08,923][47056] Fps is (10 sec: 57343.8, 60 sec: 55706.7, 300 sec: 56261.0). Total num frames: 2227994624. Throughput: 0: 55948.1. Samples: 2177422920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:08,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 08:57:11,685][47288] Updated weights for policy 0, policy_version 135996 (0.0028) [2024-04-26 08:57:13,923][47056] Fps is (10 sec: 60620.1, 60 sec: 56251.7, 300 sec: 56316.6). Total num frames: 2228305920. Throughput: 0: 56375.0. Samples: 2177596400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:13,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 08:57:14,487][47288] Updated weights for policy 0, policy_version 136006 (0.0033) [2024-04-26 08:57:17,494][47288] Updated weights for policy 0, policy_version 136016 (0.0027) [2024-04-26 08:57:18,923][47056] Fps is (10 sec: 58981.4, 60 sec: 56524.7, 300 sec: 56427.6). Total num frames: 2228584448. Throughput: 0: 56497.5. Samples: 2177937220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:18,923][47056] Avg episode reward: [(0, '0.579')] [2024-04-26 08:57:20,181][47288] Updated weights for policy 0, policy_version 136026 (0.0028) [2024-04-26 08:57:23,455][47288] Updated weights for policy 0, policy_version 136036 (0.0033) [2024-04-26 08:57:23,923][47056] Fps is (10 sec: 54067.9, 60 sec: 56525.0, 300 sec: 56316.6). Total num frames: 2228846592. Throughput: 0: 56398.8. Samples: 2178272380. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:23,923][47056] Avg episode reward: [(0, '0.425')] [2024-04-26 08:57:25,959][47288] Updated weights for policy 0, policy_version 136046 (0.0030) [2024-04-26 08:57:28,923][47056] Fps is (10 sec: 52429.2, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 2229108736. Throughput: 0: 56068.8. Samples: 2178439300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:28,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 08:57:29,157][47288] Updated weights for policy 0, policy_version 136056 (0.0030) [2024-04-26 08:57:31,693][47288] Updated weights for policy 0, policy_version 136066 (0.0032) [2024-04-26 08:57:33,923][47056] Fps is (10 sec: 54066.3, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 2229387264. Throughput: 0: 56217.7. Samples: 2178778740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:33,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 08:57:34,894][47288] Updated weights for policy 0, policy_version 136076 (0.0028) [2024-04-26 08:57:37,437][47288] Updated weights for policy 0, policy_version 136086 (0.0036) [2024-04-26 08:57:38,923][47056] Fps is (10 sec: 58983.0, 60 sec: 55978.7, 300 sec: 56372.1). Total num frames: 2229698560. Throughput: 0: 56309.8. Samples: 2179120840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:38,923][47056] Avg episode reward: [(0, '0.429')] [2024-04-26 08:57:40,400][47267] Signal inference workers to stop experience collection... (32200 times) [2024-04-26 08:57:40,400][47267] Signal inference workers to resume experience collection... 
(32200 times) [2024-04-26 08:57:40,409][47288] InferenceWorker_p0-w0: stopping experience collection (32200 times) [2024-04-26 08:57:40,409][47288] InferenceWorker_p0-w0: resuming experience collection (32200 times) [2024-04-26 08:57:40,653][47288] Updated weights for policy 0, policy_version 136096 (0.0030) [2024-04-26 08:57:43,782][47288] Updated weights for policy 0, policy_version 136106 (0.0031) [2024-04-26 08:57:43,923][47056] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 56261.0). Total num frames: 2229960704. Throughput: 0: 56583.5. Samples: 2179289560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:43,923][47056] Avg episode reward: [(0, '0.523')] [2024-04-26 08:57:46,453][47288] Updated weights for policy 0, policy_version 136116 (0.0029) [2024-04-26 08:57:48,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56524.8, 300 sec: 56261.0). Total num frames: 2230255616. Throughput: 0: 56485.6. Samples: 2179625120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:48,923][47056] Avg episode reward: [(0, '0.568')] [2024-04-26 08:57:49,501][47288] Updated weights for policy 0, policy_version 136126 (0.0031) [2024-04-26 08:57:52,228][47288] Updated weights for policy 0, policy_version 136136 (0.0031) [2024-04-26 08:57:53,923][47056] Fps is (10 sec: 58983.1, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 2230550528. Throughput: 0: 56335.6. Samples: 2179958020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:53,923][47056] Avg episode reward: [(0, '0.577')] [2024-04-26 08:57:55,392][47288] Updated weights for policy 0, policy_version 136146 (0.0029) [2024-04-26 08:57:57,938][47288] Updated weights for policy 0, policy_version 136156 (0.0031) [2024-04-26 08:57:58,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56797.8, 300 sec: 56316.5). Total num frames: 2230829056. Throughput: 0: 56494.6. Samples: 2180138660. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:57:58,923][47056] Avg episode reward: [(0, '0.462')] [2024-04-26 08:58:01,012][47288] Updated weights for policy 0, policy_version 136166 (0.0032) [2024-04-26 08:58:03,881][47288] Updated weights for policy 0, policy_version 136176 (0.0028) [2024-04-26 08:58:03,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56797.8, 300 sec: 56316.6). Total num frames: 2231107584. Throughput: 0: 56484.2. Samples: 2180479000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:58:03,923][47056] Avg episode reward: [(0, '0.432')] [2024-04-26 08:58:06,678][47288] Updated weights for policy 0, policy_version 136186 (0.0028) [2024-04-26 08:58:08,923][47056] Fps is (10 sec: 54066.1, 60 sec: 56251.5, 300 sec: 56260.9). Total num frames: 2231369728. Throughput: 0: 56578.7. Samples: 2180818440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 08:58:08,923][47056] Avg episode reward: [(0, '0.588')] [2024-04-26 08:58:09,739][47288] Updated weights for policy 0, policy_version 136196 (0.0027) [2024-04-26 08:58:12,500][47288] Updated weights for policy 0, policy_version 136206 (0.0027) [2024-04-26 08:58:13,923][47056] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 56316.5). Total num frames: 2231664640. Throughput: 0: 56457.7. Samples: 2180979900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:13,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 08:58:15,633][47288] Updated weights for policy 0, policy_version 136216 (0.0028) [2024-04-26 08:58:18,105][47288] Updated weights for policy 0, policy_version 136226 (0.0030) [2024-04-26 08:58:18,923][47056] Fps is (10 sec: 57345.9, 60 sec: 55978.9, 300 sec: 56372.1). Total num frames: 2231943168. Throughput: 0: 56546.8. Samples: 2181323340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:18,923][47056] Avg episode reward: [(0, '0.483')] [2024-04-26 08:58:21,331][47288] Updated weights for policy 0, policy_version 136236 (0.0027) [2024-04-26 08:58:23,817][47288] Updated weights for policy 0, policy_version 136246 (0.0032) [2024-04-26 08:58:23,923][47056] Fps is (10 sec: 58982.9, 60 sec: 56797.7, 300 sec: 56372.1). Total num frames: 2232254464. Throughput: 0: 56574.6. Samples: 2181666700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:23,923][47056] Avg episode reward: [(0, '0.580')] [2024-04-26 08:58:27,133][47288] Updated weights for policy 0, policy_version 136256 (0.0026) [2024-04-26 08:58:28,923][47056] Fps is (10 sec: 58982.1, 60 sec: 57071.0, 300 sec: 56316.5). Total num frames: 2232532992. Throughput: 0: 56669.0. Samples: 2181839660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:28,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 08:58:29,802][47288] Updated weights for policy 0, policy_version 136266 (0.0031) [2024-04-26 08:58:32,786][47288] Updated weights for policy 0, policy_version 136276 (0.0028) [2024-04-26 08:58:33,923][47056] Fps is (10 sec: 57344.5, 60 sec: 57344.1, 300 sec: 56427.6). Total num frames: 2232827904. Throughput: 0: 56738.3. Samples: 2182178340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:33,923][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 08:58:35,408][47288] Updated weights for policy 0, policy_version 136286 (0.0031) [2024-04-26 08:58:38,500][47288] Updated weights for policy 0, policy_version 136296 (0.0025) [2024-04-26 08:58:38,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56251.7, 300 sec: 56205.4). Total num frames: 2233073664. Throughput: 0: 56946.1. Samples: 2182520600. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:38,923][47056] Avg episode reward: [(0, '0.526')] [2024-04-26 08:58:41,193][47288] Updated weights for policy 0, policy_version 136306 (0.0028) [2024-04-26 08:58:43,923][47056] Fps is (10 sec: 54066.9, 60 sec: 56797.9, 300 sec: 56372.1). Total num frames: 2233368576. Throughput: 0: 56722.7. Samples: 2182691180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:43,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 08:58:44,241][47288] Updated weights for policy 0, policy_version 136316 (0.0030) [2024-04-26 08:58:47,037][47288] Updated weights for policy 0, policy_version 136326 (0.0029) [2024-04-26 08:58:48,923][47056] Fps is (10 sec: 57344.6, 60 sec: 56524.9, 300 sec: 56316.6). Total num frames: 2233647104. Throughput: 0: 56772.1. Samples: 2183033740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:48,923][47056] Avg episode reward: [(0, '0.455')] [2024-04-26 08:58:48,990][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000136332_2233663488.pth... [2024-04-26 08:58:49,040][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000135507_2220146688.pth [2024-04-26 08:58:49,985][47288] Updated weights for policy 0, policy_version 136336 (0.0030) [2024-04-26 08:58:52,757][47288] Updated weights for policy 0, policy_version 136346 (0.0027) [2024-04-26 08:58:53,923][47056] Fps is (10 sec: 57344.4, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2233942016. Throughput: 0: 56727.5. Samples: 2183371160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:53,923][47056] Avg episode reward: [(0, '0.536')] [2024-04-26 08:58:55,907][47288] Updated weights for policy 0, policy_version 136356 (0.0033) [2024-04-26 08:58:57,016][47267] Signal inference workers to stop experience collection... (32250 times) [2024-04-26 08:58:57,017][47267] Signal inference workers to resume experience collection... 
(32250 times) [2024-04-26 08:58:57,030][47288] InferenceWorker_p0-w0: stopping experience collection (32250 times) [2024-04-26 08:58:57,031][47288] InferenceWorker_p0-w0: resuming experience collection (32250 times) [2024-04-26 08:58:58,450][47288] Updated weights for policy 0, policy_version 136366 (0.0031) [2024-04-26 08:58:58,923][47056] Fps is (10 sec: 57342.9, 60 sec: 56524.8, 300 sec: 56427.6). Total num frames: 2234220544. Throughput: 0: 56894.3. Samples: 2183540140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:58:58,923][47056] Avg episode reward: [(0, '0.558')] [2024-04-26 08:59:01,776][47288] Updated weights for policy 0, policy_version 136376 (0.0029) [2024-04-26 08:59:03,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 2234515456. Throughput: 0: 56833.6. Samples: 2183880860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:59:03,923][47056] Avg episode reward: [(0, '0.581')] [2024-04-26 08:59:04,283][47288] Updated weights for policy 0, policy_version 136386 (0.0033) [2024-04-26 08:59:07,578][47288] Updated weights for policy 0, policy_version 136396 (0.0028) [2024-04-26 08:59:08,923][47056] Fps is (10 sec: 58982.0, 60 sec: 57344.1, 300 sec: 56483.1). Total num frames: 2234810368. Throughput: 0: 56928.3. Samples: 2184228480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 08:59:08,932][47056] Avg episode reward: [(0, '0.572')] [2024-04-26 08:59:10,284][47288] Updated weights for policy 0, policy_version 136406 (0.0028) [2024-04-26 08:59:13,272][47288] Updated weights for policy 0, policy_version 136416 (0.0027) [2024-04-26 08:59:13,923][47056] Fps is (10 sec: 57344.2, 60 sec: 57071.1, 300 sec: 56372.1). Total num frames: 2235088896. Throughput: 0: 56787.1. Samples: 2184395080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:13,923][47056] Avg episode reward: [(0, '0.500')] [2024-04-26 08:59:16,069][47288] Updated weights for policy 0, policy_version 136426 (0.0025) [2024-04-26 08:59:18,913][47288] Updated weights for policy 0, policy_version 136436 (0.0029) [2024-04-26 08:59:18,923][47056] Fps is (10 sec: 55706.0, 60 sec: 57070.8, 300 sec: 56372.1). Total num frames: 2235367424. Throughput: 0: 56737.2. Samples: 2184731520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:18,923][47056] Avg episode reward: [(0, '0.589')] [2024-04-26 08:59:21,979][47288] Updated weights for policy 0, policy_version 136446 (0.0029) [2024-04-26 08:59:23,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56797.9, 300 sec: 56538.7). Total num frames: 2235662336. Throughput: 0: 56604.8. Samples: 2185067820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:23,923][47056] Avg episode reward: [(0, '0.564')] [2024-04-26 08:59:24,646][47288] Updated weights for policy 0, policy_version 136456 (0.0035) [2024-04-26 08:59:27,583][47288] Updated weights for policy 0, policy_version 136466 (0.0027) [2024-04-26 08:59:28,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56797.7, 300 sec: 56538.7). Total num frames: 2235940864. Throughput: 0: 56711.9. Samples: 2185243220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:28,923][47056] Avg episode reward: [(0, '0.426')] [2024-04-26 08:59:30,619][47288] Updated weights for policy 0, policy_version 136476 (0.0026) [2024-04-26 08:59:33,262][47288] Updated weights for policy 0, policy_version 136486 (0.0030) [2024-04-26 08:59:33,923][47056] Fps is (10 sec: 55705.4, 60 sec: 56524.7, 300 sec: 56538.7). Total num frames: 2236219392. Throughput: 0: 56758.4. Samples: 2185587880. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:33,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 08:59:36,285][47288] Updated weights for policy 0, policy_version 136496 (0.0026) [2024-04-26 08:59:38,923][47056] Fps is (10 sec: 55706.5, 60 sec: 57071.0, 300 sec: 56483.1). Total num frames: 2236497920. Throughput: 0: 56762.7. Samples: 2185925480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:38,923][47056] Avg episode reward: [(0, '0.583')] [2024-04-26 08:59:39,019][47288] Updated weights for policy 0, policy_version 136506 (0.0033) [2024-04-26 08:59:42,144][47288] Updated weights for policy 0, policy_version 136516 (0.0026) [2024-04-26 08:59:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56797.8, 300 sec: 56372.1). Total num frames: 2236776448. Throughput: 0: 56736.9. Samples: 2186093300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:43,923][47056] Avg episode reward: [(0, '0.524')] [2024-04-26 08:59:44,725][47288] Updated weights for policy 0, policy_version 136526 (0.0033) [2024-04-26 08:59:48,023][47288] Updated weights for policy 0, policy_version 136536 (0.0031) [2024-04-26 08:59:48,923][47056] Fps is (10 sec: 57343.7, 60 sec: 57070.8, 300 sec: 56427.6). Total num frames: 2237071360. Throughput: 0: 56755.1. Samples: 2186434840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:48,923][47056] Avg episode reward: [(0, '0.471')] [2024-04-26 08:59:50,535][47288] Updated weights for policy 0, policy_version 136546 (0.0038) [2024-04-26 08:59:53,792][47288] Updated weights for policy 0, policy_version 136556 (0.0026) [2024-04-26 08:59:53,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56372.1). Total num frames: 2237333504. Throughput: 0: 56642.9. Samples: 2186777400. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:53,923][47056] Avg episode reward: [(0, '0.496')] [2024-04-26 08:59:56,246][47288] Updated weights for policy 0, policy_version 136566 (0.0031) [2024-04-26 08:59:58,923][47056] Fps is (10 sec: 55705.2, 60 sec: 56797.9, 300 sec: 56427.6). Total num frames: 2237628416. Throughput: 0: 56488.3. Samples: 2186937060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 08:59:58,923][47056] Avg episode reward: [(0, '0.549')] [2024-04-26 08:59:59,474][47288] Updated weights for policy 0, policy_version 136576 (0.0024) [2024-04-26 09:00:02,116][47288] Updated weights for policy 0, policy_version 136586 (0.0033) [2024-04-26 09:00:03,227][47267] Signal inference workers to stop experience collection... (32300 times) [2024-04-26 09:00:03,228][47267] Signal inference workers to resume experience collection... (32300 times) [2024-04-26 09:00:03,239][47288] InferenceWorker_p0-w0: stopping experience collection (32300 times) [2024-04-26 09:00:03,240][47288] InferenceWorker_p0-w0: resuming experience collection (32300 times) [2024-04-26 09:00:03,923][47056] Fps is (10 sec: 58982.1, 60 sec: 56797.8, 300 sec: 56594.2). Total num frames: 2237923328. Throughput: 0: 56526.7. Samples: 2187275220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 09:00:03,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 09:00:05,242][47288] Updated weights for policy 0, policy_version 136596 (0.0028) [2024-04-26 09:00:07,972][47288] Updated weights for policy 0, policy_version 136606 (0.0029) [2024-04-26 09:00:08,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 2238201856. Throughput: 0: 56662.7. Samples: 2187617640. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 09:00:08,923][47056] Avg episode reward: [(0, '0.430')] [2024-04-26 09:00:10,902][47288] Updated weights for policy 0, policy_version 136616 (0.0030) [2024-04-26 09:00:13,613][47288] Updated weights for policy 0, policy_version 136626 (0.0027) [2024-04-26 09:00:13,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2238480384. Throughput: 0: 56605.9. Samples: 2187790480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 09:00:13,923][47056] Avg episode reward: [(0, '0.571')] [2024-04-26 09:00:16,867][47288] Updated weights for policy 0, policy_version 136636 (0.0029) [2024-04-26 09:00:18,923][47056] Fps is (10 sec: 54067.7, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2238742528. Throughput: 0: 56455.7. Samples: 2188128380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:00:18,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 09:00:19,372][47288] Updated weights for policy 0, policy_version 136646 (0.0037) [2024-04-26 09:00:22,557][47288] Updated weights for policy 0, policy_version 136656 (0.0037) [2024-04-26 09:00:23,923][47056] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 56427.6). Total num frames: 2239037440. Throughput: 0: 56440.8. Samples: 2188465320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:00:23,923][47056] Avg episode reward: [(0, '0.525')] [2024-04-26 09:00:25,158][47288] Updated weights for policy 0, policy_version 136666 (0.0026) [2024-04-26 09:00:28,592][47288] Updated weights for policy 0, policy_version 136676 (0.0029) [2024-04-26 09:00:28,923][47056] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2239315968. Throughput: 0: 56314.3. Samples: 2188627440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:00:28,923][47056] Avg episode reward: [(0, '0.569')] [2024-04-26 09:00:31,150][47288] Updated weights for policy 0, policy_version 136686 (0.0030) [2024-04-26 09:00:33,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 56372.1). Total num frames: 2239594496. Throughput: 0: 56138.2. Samples: 2188961060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:00:33,932][47056] Avg episode reward: [(0, '0.449')] [2024-04-26 09:00:34,397][47288] Updated weights for policy 0, policy_version 136696 (0.0026) [2024-04-26 09:00:37,106][47288] Updated weights for policy 0, policy_version 136706 (0.0028) [2024-04-26 09:00:38,923][47056] Fps is (10 sec: 57343.4, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2239889408. Throughput: 0: 56221.2. Samples: 2189307360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:00:38,932][47056] Avg episode reward: [(0, '0.555')] [2024-04-26 09:00:40,003][47288] Updated weights for policy 0, policy_version 136716 (0.0029) [2024-04-26 09:00:42,758][47288] Updated weights for policy 0, policy_version 136726 (0.0027) [2024-04-26 09:00:43,923][47056] Fps is (10 sec: 58983.2, 60 sec: 56798.0, 300 sec: 56649.8). Total num frames: 2240184320. Throughput: 0: 56637.1. Samples: 2189485720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:00:43,923][47056] Avg episode reward: [(0, '0.550')] [2024-04-26 09:00:45,616][47288] Updated weights for policy 0, policy_version 136736 (0.0026) [2024-04-26 09:00:48,374][47288] Updated weights for policy 0, policy_version 136746 (0.0028) [2024-04-26 09:00:48,923][47056] Fps is (10 sec: 57344.7, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2240462848. Throughput: 0: 56707.2. Samples: 2189827040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:00:48,923][47056] Avg episode reward: [(0, '0.613')] [2024-04-26 09:00:48,933][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000136747_2240462848.pth... [2024-04-26 09:00:48,981][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000135919_2226896896.pth [2024-04-26 09:00:51,520][47288] Updated weights for policy 0, policy_version 136756 (0.0026) [2024-04-26 09:00:53,924][47056] Fps is (10 sec: 55698.3, 60 sec: 56796.7, 300 sec: 56482.9). Total num frames: 2240741376. Throughput: 0: 56629.6. Samples: 2190166040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:00:53,924][47056] Avg episode reward: [(0, '0.510')] [2024-04-26 09:00:54,628][47288] Updated weights for policy 0, policy_version 136766 (0.0026) [2024-04-26 09:00:57,366][47288] Updated weights for policy 0, policy_version 136776 (0.0030) [2024-04-26 09:00:57,874][47267] Signal inference workers to stop experience collection... (32350 times) [2024-04-26 09:00:57,874][47267] Signal inference workers to resume experience collection... (32350 times) [2024-04-26 09:00:57,895][47288] InferenceWorker_p0-w0: stopping experience collection (32350 times) [2024-04-26 09:00:57,895][47288] InferenceWorker_p0-w0: resuming experience collection (32350 times) [2024-04-26 09:00:58,923][47056] Fps is (10 sec: 57344.1, 60 sec: 56798.0, 300 sec: 56483.2). Total num frames: 2241036288. Throughput: 0: 56477.4. Samples: 2190331960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:00:58,923][47056] Avg episode reward: [(0, '0.560')] [2024-04-26 09:01:00,367][47288] Updated weights for policy 0, policy_version 136786 (0.0033) [2024-04-26 09:01:03,053][47288] Updated weights for policy 0, policy_version 136796 (0.0027) [2024-04-26 09:01:03,923][47056] Fps is (10 sec: 55712.6, 60 sec: 56251.8, 300 sec: 56427.8). Total num frames: 2241298432. Throughput: 0: 56536.0. 
Samples: 2190672500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:01:03,923][47056] Avg episode reward: [(0, '0.540')] [2024-04-26 09:01:06,145][47288] Updated weights for policy 0, policy_version 136806 (0.0026) [2024-04-26 09:01:08,923][47056] Fps is (10 sec: 54066.8, 60 sec: 56251.8, 300 sec: 56427.6). Total num frames: 2241576960. Throughput: 0: 56625.8. Samples: 2191013480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:01:08,923][47056] Avg episode reward: [(0, '0.567')] [2024-04-26 09:01:08,946][47288] Updated weights for policy 0, policy_version 136816 (0.0033) [2024-04-26 09:01:11,870][47288] Updated weights for policy 0, policy_version 136826 (0.0026) [2024-04-26 09:01:13,923][47056] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2241871872. Throughput: 0: 56728.4. Samples: 2191180220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:01:13,923][47056] Avg episode reward: [(0, '0.498')] [2024-04-26 09:01:14,983][47288] Updated weights for policy 0, policy_version 136836 (0.0031) [2024-04-26 09:01:17,687][47288] Updated weights for policy 0, policy_version 136846 (0.0029) [2024-04-26 09:01:18,923][47056] Fps is (10 sec: 58982.7, 60 sec: 57070.9, 300 sec: 56649.8). Total num frames: 2242166784. Throughput: 0: 56983.7. Samples: 2191525320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:01:18,923][47056] Avg episode reward: [(0, '0.585')] [2024-04-26 09:01:20,976][47288] Updated weights for policy 0, policy_version 136856 (0.0029) [2024-04-26 09:01:23,505][47288] Updated weights for policy 0, policy_version 136866 (0.0031) [2024-04-26 09:01:23,923][47056] Fps is (10 sec: 55705.9, 60 sec: 56524.9, 300 sec: 56594.2). Total num frames: 2242428928. Throughput: 0: 56774.8. Samples: 2191862220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:01:23,923][47056] Avg episode reward: [(0, '0.489')] [2024-04-26 09:01:26,631][47288] Updated weights for policy 0, policy_version 136876 (0.0026) [2024-04-26 09:01:28,923][47056] Fps is (10 sec: 54066.0, 60 sec: 56524.6, 300 sec: 56483.1). Total num frames: 2242707456. Throughput: 0: 56634.7. Samples: 2192034300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:01:28,923][47056] Avg episode reward: [(0, '0.573')] [2024-04-26 09:01:29,241][47288] Updated weights for policy 0, policy_version 136886 (0.0023) [2024-04-26 09:01:32,339][47288] Updated weights for policy 0, policy_version 136896 (0.0031) [2024-04-26 09:01:33,923][47056] Fps is (10 sec: 55705.6, 60 sec: 56524.9, 300 sec: 56427.6). Total num frames: 2242985984. Throughput: 0: 56519.5. Samples: 2192370420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:01:33,923][47056] Avg episode reward: [(0, '0.633')] [2024-04-26 09:01:35,034][47288] Updated weights for policy 0, policy_version 136906 (0.0031) [2024-04-26 09:01:38,069][47288] Updated weights for policy 0, policy_version 136916 (0.0029) [2024-04-26 09:01:38,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2243280896. Throughput: 0: 56483.7. Samples: 2192707740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:01:38,923][47056] Avg episode reward: [(0, '0.570')] [2024-04-26 09:01:40,714][47288] Updated weights for policy 0, policy_version 136926 (0.0027) [2024-04-26 09:01:43,923][47056] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2243543040. Throughput: 0: 56647.6. Samples: 2192881100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:01:43,923][47056] Avg episode reward: [(0, '0.547')] [2024-04-26 09:01:44,003][47288] Updated weights for policy 0, policy_version 136936 (0.0033) [2024-04-26 09:01:46,486][47288] Updated weights for policy 0, policy_version 136946 (0.0035) [2024-04-26 09:01:48,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 56649.8). Total num frames: 2243854336. Throughput: 0: 56533.7. Samples: 2193216520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:01:48,923][47056] Avg episode reward: [(0, '0.477')] [2024-04-26 09:01:49,659][47288] Updated weights for policy 0, policy_version 136956 (0.0030) [2024-04-26 09:01:52,250][47288] Updated weights for policy 0, policy_version 136966 (0.0033) [2024-04-26 09:01:53,923][47056] Fps is (10 sec: 58981.8, 60 sec: 56525.9, 300 sec: 56649.7). Total num frames: 2244132864. Throughput: 0: 56380.9. Samples: 2193550620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:01:53,923][47056] Avg episode reward: [(0, '0.490')] [2024-04-26 09:01:55,406][47288] Updated weights for policy 0, policy_version 136976 (0.0030) [2024-04-26 09:01:58,045][47288] Updated weights for policy 0, policy_version 136986 (0.0034) [2024-04-26 09:01:58,923][47056] Fps is (10 sec: 57343.8, 60 sec: 56524.7, 300 sec: 56705.3). Total num frames: 2244427776. Throughput: 0: 56637.3. Samples: 2193728900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:01:58,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 09:02:01,393][47288] Updated weights for policy 0, policy_version 136996 (0.0030) [2024-04-26 09:02:03,819][47288] Updated weights for policy 0, policy_version 137006 (0.0031) [2024-04-26 09:02:03,923][47056] Fps is (10 sec: 57344.2, 60 sec: 56797.8, 300 sec: 56649.8). Total num frames: 2244706304. Throughput: 0: 56424.8. Samples: 2194064440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:02:03,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 09:02:07,049][47288] Updated weights for policy 0, policy_version 137016 (0.0030) [2024-04-26 09:02:08,923][47056] Fps is (10 sec: 54066.7, 60 sec: 56524.7, 300 sec: 56483.1). Total num frames: 2244968448. Throughput: 0: 56480.2. Samples: 2194403840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:02:08,923][47056] Avg episode reward: [(0, '0.537')] [2024-04-26 09:02:09,539][47288] Updated weights for policy 0, policy_version 137026 (0.0027) [2024-04-26 09:02:12,884][47288] Updated weights for policy 0, policy_version 137036 (0.0029) [2024-04-26 09:02:13,923][47056] Fps is (10 sec: 55705.8, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2245263360. Throughput: 0: 56369.6. Samples: 2194570920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:02:13,923][47056] Avg episode reward: [(0, '0.476')] [2024-04-26 09:02:15,352][47288] Updated weights for policy 0, policy_version 137046 (0.0034) [2024-04-26 09:02:16,068][47267] Signal inference workers to stop experience collection... (32400 times) [2024-04-26 09:02:16,091][47288] InferenceWorker_p0-w0: stopping experience collection (32400 times) [2024-04-26 09:02:16,124][47267] Signal inference workers to resume experience collection... (32400 times) [2024-04-26 09:02:16,125][47288] InferenceWorker_p0-w0: resuming experience collection (32400 times) [2024-04-26 09:02:18,639][47288] Updated weights for policy 0, policy_version 137056 (0.0028) [2024-04-26 09:02:18,923][47056] Fps is (10 sec: 57344.3, 60 sec: 56251.6, 300 sec: 56594.2). Total num frames: 2245541888. Throughput: 0: 56456.8. Samples: 2194910980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-04-26 09:02:18,923][47056] Avg episode reward: [(0, '0.507')] [2024-04-26 09:02:21,310][47288] Updated weights for policy 0, policy_version 137066 (0.0028) [2024-04-26 09:02:23,923][47056] Fps is (10 sec: 55705.0, 60 sec: 56524.7, 300 sec: 56649.8). Total num frames: 2245820416. Throughput: 0: 56728.8. Samples: 2195260540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:02:23,923][47056] Avg episode reward: [(0, '0.562')] [2024-04-26 09:02:24,332][47288] Updated weights for policy 0, policy_version 137076 (0.0031) [2024-04-26 09:02:27,088][47288] Updated weights for policy 0, policy_version 137086 (0.0028) [2024-04-26 09:02:28,923][47056] Fps is (10 sec: 57344.0, 60 sec: 56798.0, 300 sec: 56705.3). Total num frames: 2246115328. Throughput: 0: 56512.7. Samples: 2195424180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:02:28,923][47056] Avg episode reward: [(0, '0.533')] [2024-04-26 09:02:30,158][47288] Updated weights for policy 0, policy_version 137096 (0.0031) [2024-04-26 09:02:32,870][47288] Updated weights for policy 0, policy_version 137106 (0.0026) [2024-04-26 09:02:33,923][47056] Fps is (10 sec: 60621.0, 60 sec: 57343.9, 300 sec: 56705.3). Total num frames: 2246426624. Throughput: 0: 56555.5. Samples: 2195761520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:02:33,923][47056] Avg episode reward: [(0, '0.545')] [2024-04-26 09:02:35,855][47288] Updated weights for policy 0, policy_version 137116 (0.0033) [2024-04-26 09:02:38,552][47288] Updated weights for policy 0, policy_version 137126 (0.0028) [2024-04-26 09:02:38,923][47056] Fps is (10 sec: 57344.9, 60 sec: 56798.0, 300 sec: 56705.3). Total num frames: 2246688768. Throughput: 0: 56671.7. Samples: 2196100840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:02:38,923][47056] Avg episode reward: [(0, '0.502')] [2024-04-26 09:02:41,784][47288] Updated weights for policy 0, policy_version 137136 (0.0034) [2024-04-26 09:02:43,923][47056] Fps is (10 sec: 54066.2, 60 sec: 57070.7, 300 sec: 56649.7). Total num frames: 2246967296. Throughput: 0: 56646.0. Samples: 2196277980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:02:43,924][47056] Avg episode reward: [(0, '0.543')] [2024-04-26 09:02:44,215][47288] Updated weights for policy 0, policy_version 137146 (0.0029) [2024-04-26 09:02:47,524][47288] Updated weights for policy 0, policy_version 137156 (0.0031) [2024-04-26 09:02:48,923][47056] Fps is (10 sec: 55704.2, 60 sec: 56524.6, 300 sec: 56594.2). Total num frames: 2247245824. Throughput: 0: 56702.4. Samples: 2196616060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:02:48,923][47056] Avg episode reward: [(0, '0.552')] [2024-04-26 09:02:48,936][47267] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000137161_2247245824.pth... [2024-04-26 09:02:48,993][47267] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000136332_2233663488.pth [2024-04-26 09:02:50,224][47288] Updated weights for policy 0, policy_version 137166 (0.0024) [2024-04-26 09:02:53,315][47288] Updated weights for policy 0, policy_version 137176 (0.0033) [2024-04-26 09:02:53,923][47056] Fps is (10 sec: 54068.2, 60 sec: 56251.7, 300 sec: 56538.7). Total num frames: 2247507968. Throughput: 0: 56637.0. Samples: 2196952500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:02:53,924][47056] Avg episode reward: [(0, '0.586')] [2024-04-26 09:02:56,109][47288] Updated weights for policy 0, policy_version 137186 (0.0030) [2024-04-26 09:02:58,923][47056] Fps is (10 sec: 55706.6, 60 sec: 56251.8, 300 sec: 56594.2). Total num frames: 2247802880. Throughput: 0: 56607.5. Samples: 2197118260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:02:58,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 09:02:59,080][47288] Updated weights for policy 0, policy_version 137196 (0.0034) [2024-04-26 09:03:01,893][47288] Updated weights for policy 0, policy_version 137206 (0.0026) [2024-04-26 09:03:03,923][47056] Fps is (10 sec: 58982.4, 60 sec: 56524.8, 300 sec: 56705.3). Total num frames: 2248097792. Throughput: 0: 56459.1. Samples: 2197451640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:03:03,923][47056] Avg episode reward: [(0, '0.615')] [2024-04-26 09:03:04,716][47288] Updated weights for policy 0, policy_version 137216 (0.0025) [2024-04-26 09:03:07,671][47288] Updated weights for policy 0, policy_version 137226 (0.0027) [2024-04-26 09:03:08,923][47056] Fps is (10 sec: 57343.5, 60 sec: 56797.9, 300 sec: 56649.8). Total num frames: 2248376320. Throughput: 0: 56312.0. Samples: 2197794580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:03:08,923][47056] Avg episode reward: [(0, '0.475')] [2024-04-26 09:03:10,689][47288] Updated weights for policy 0, policy_version 137236 (0.0026) [2024-04-26 09:03:13,385][47288] Updated weights for policy 0, policy_version 137246 (0.0027) [2024-04-26 09:03:13,923][47056] Fps is (10 sec: 58982.5, 60 sec: 57070.9, 300 sec: 56760.8). Total num frames: 2248687616. Throughput: 0: 56621.8. Samples: 2197972160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:03:13,923][47056] Avg episode reward: [(0, '0.527')] [2024-04-26 09:03:16,536][47288] Updated weights for policy 0, policy_version 137256 (0.0029) [2024-04-26 09:03:17,485][47267] Signal inference workers to stop experience collection... (32450 times) [2024-04-26 09:03:17,523][47288] InferenceWorker_p0-w0: stopping experience collection (32450 times) [2024-04-26 09:03:17,533][47267] Signal inference workers to resume experience collection... 
(32450 times) [2024-04-26 09:03:17,539][47288] InferenceWorker_p0-w0: resuming experience collection (32450 times) [2024-04-26 09:03:18,923][47056] Fps is (10 sec: 55706.0, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2248933376. Throughput: 0: 56744.0. Samples: 2198315000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 09:03:18,923][47056] Avg episode reward: [(0, '0.637')] [2024-04-26 09:03:19,078][47288] Updated weights for policy 0, policy_version 137266 (0.0030) [2024-04-26 09:03:22,358][47288] Updated weights for policy 0, policy_version 137276 (0.0029) [2024-04-26 09:03:23,923][47056] Fps is (10 sec: 54067.1, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2249228288. Throughput: 0: 56668.7. Samples: 2198650940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:03:23,923][47056] Avg episode reward: [(0, '0.520')] [2024-04-26 09:03:24,740][47288] Updated weights for policy 0, policy_version 137286 (0.0036) [2024-04-26 09:03:28,119][47288] Updated weights for policy 0, policy_version 137296 (0.0028) [2024-04-26 09:03:28,923][47056] Fps is (10 sec: 55706.3, 60 sec: 56251.9, 300 sec: 56483.2). Total num frames: 2249490432. Throughput: 0: 56561.3. Samples: 2198823220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:03:28,923][47056] Avg episode reward: [(0, '0.509')] [2024-04-26 09:03:30,589][47288] Updated weights for policy 0, policy_version 137306 (0.0033) [2024-04-26 09:03:33,923][47056] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 56594.2). Total num frames: 2249768960. Throughput: 0: 56584.3. Samples: 2199162340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:03:33,923][47056] Avg episode reward: [(0, '0.515')] [2024-04-26 09:03:33,978][47288] Updated weights for policy 0, policy_version 137316 (0.0033) [2024-04-26 09:03:36,409][47288] Updated weights for policy 0, policy_version 137326 (0.0024) [2024-04-26 09:03:38,923][47056] Fps is (10 sec: 55704.3, 60 sec: 55978.5, 300 sec: 56538.7). Total num frames: 2250047488. Throughput: 0: 56574.6. Samples: 2199498360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:03:38,923][47056] Avg episode reward: [(0, '0.556')] [2024-04-26 09:03:39,678][47288] Updated weights for policy 0, policy_version 137336 (0.0035) [2024-04-26 09:03:42,078][47288] Updated weights for policy 0, policy_version 137346 (0.0026) [2024-04-26 09:03:43,923][47056] Fps is (10 sec: 58982.2, 60 sec: 56525.0, 300 sec: 56649.7). Total num frames: 2250358784. Throughput: 0: 56716.4. Samples: 2199670500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:03:43,924][47056] Avg episode reward: [(0, '0.480')] [2024-04-26 09:03:45,306][47288] Updated weights for policy 0, policy_version 137356 (0.0030) [2024-04-26 09:03:47,866][47288] Updated weights for policy 0, policy_version 137366 (0.0028) [2024-04-26 09:03:48,923][47056] Fps is (10 sec: 60620.9, 60 sec: 56797.9, 300 sec: 56649.7). Total num frames: 2250653696. Throughput: 0: 56871.9. Samples: 2200010880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:03:48,923][47056] Avg episode reward: [(0, '0.573')] [2024-04-26 09:03:51,041][47288] Updated weights for policy 0, policy_version 137376 (0.0029) [2024-04-26 09:03:53,729][47288] Updated weights for policy 0, policy_version 137386 (0.0033) [2024-04-26 09:03:53,923][47056] Fps is (10 sec: 58982.5, 60 sec: 57344.0, 300 sec: 56705.3). Total num frames: 2250948608. Throughput: 0: 56617.9. Samples: 2200342380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:03:53,923][47056] Avg episode reward: [(0, '0.517')] [2024-04-26 09:03:57,356][47288] Updated weights for policy 0, policy_version 137396 (0.0031) [2024-04-26 09:03:58,923][47056] Fps is (10 sec: 55706.1, 60 sec: 56797.9, 300 sec: 56594.2). Total num frames: 2251210752. Throughput: 0: 56531.6. Samples: 2200516080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:03:58,923][47056] Avg episode reward: [(0, '0.530')] [2024-04-26 09:03:59,570][47288] Updated weights for policy 0, policy_version 137406 (0.0034) [2024-04-26 09:04:03,041][47288] Updated weights for policy 0, policy_version 137416 (0.0028) [2024-04-26 09:04:03,923][47056] Fps is (10 sec: 54067.0, 60 sec: 56524.8, 300 sec: 56538.7). Total num frames: 2251489280. Throughput: 0: 56415.1. Samples: 2200853680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:04:03,923][47056] Avg episode reward: [(0, '0.565')] [2024-04-26 09:04:05,181][47288] Updated weights for policy 0, policy_version 137426 (0.0026) [2024-04-26 09:04:08,902][47288] Updated weights for policy 0, policy_version 137436 (0.0027) [2024-04-26 09:04:08,923][47056] Fps is (10 sec: 54067.3, 60 sec: 56251.8, 300 sec: 56483.1). Total num frames: 2251751424. Throughput: 0: 56555.2. Samples: 2201195920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:04:08,923][47056] Avg episode reward: [(0, '0.529')] [2024-04-26 09:04:11,034][47288] Updated weights for policy 0, policy_version 137446 (0.0029) [2024-04-26 09:04:13,923][47056] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 56538.7). Total num frames: 2252046336. Throughput: 0: 56386.9. Samples: 2201360640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:04:13,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 09:04:14,720][47288] Updated weights for policy 0, policy_version 137456 (0.0031) [2024-04-26 09:04:16,858][47288] Updated weights for policy 0, policy_version 137466 (0.0025) [2024-04-26 09:04:18,923][47056] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 56483.1). Total num frames: 2252324864. Throughput: 0: 56435.5. Samples: 2201701940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:04:18,923][47056] Avg episode reward: [(0, '0.521')] [2024-04-26 09:04:20,312][47288] Updated weights for policy 0, policy_version 137476 (0.0033) [2024-04-26 09:04:21,926][47267] Signal inference workers to stop experience collection... (32500 times) [2024-04-26 09:04:21,947][47288] InferenceWorker_p0-w0: stopping experience collection (32500 times) [2024-04-26 09:04:22,013][47267] Signal inference workers to resume experience collection... (32500 times) [2024-04-26 09:04:22,013][47288] InferenceWorker_p0-w0: resuming experience collection (32500 times) [2024-04-26 09:04:22,484][47288] Updated weights for policy 0, policy_version 137486 (0.0026) [2024-04-26 09:04:23,923][47056] Fps is (10 sec: 57344.8, 60 sec: 56524.9, 300 sec: 56538.7). Total num frames: 2252619776. Throughput: 0: 56523.3. Samples: 2202041900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:04:23,923][47056] Avg episode reward: [(0, '0.538')] [2024-04-26 09:04:44,389][49517] Saving configuration to /workspace/metta/train_dir/p2.fa.clean/config.json... 
[2024-04-26 09:04:44,397][49517] Rollout worker 0 uses device cpu [2024-04-26 09:04:44,397][49517] Rollout worker 1 uses device cpu [2024-04-26 09:04:44,397][49517] Rollout worker 2 uses device cpu [2024-04-26 09:04:44,397][49517] Rollout worker 3 uses device cpu [2024-04-26 09:04:44,397][49517] Rollout worker 4 uses device cpu [2024-04-26 09:04:44,397][49517] Rollout worker 5 uses device cpu [2024-04-26 09:04:44,397][49517] Rollout worker 6 uses device cpu [2024-04-26 09:04:44,397][49517] Rollout worker 7 uses device cpu [2024-04-26 09:04:44,397][49517] Rollout worker 8 uses device cpu [2024-04-26 09:04:44,397][49517] Rollout worker 9 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 10 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 11 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 12 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 13 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 14 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 15 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 16 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 17 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 18 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 19 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 20 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 21 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 22 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 23 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 24 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 25 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 26 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 27 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 28 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 29 uses device cpu 
[2024-04-26 09:04:44,398][49517] Rollout worker 30 uses device cpu [2024-04-26 09:04:44,398][49517] Rollout worker 31 uses device cpu [2024-04-26 09:04:45,003][49517] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-26 09:04:45,003][49517] InferenceWorker_p0-w0: min num requests: 10 [2024-04-26 09:04:45,045][49517] Starting all processes... [2024-04-26 09:04:45,045][49517] Starting process learner_proc0 [2024-04-26 09:04:45,102][49517] Starting all processes... [2024-04-26 09:04:45,106][49517] Starting process inference_proc0-0 [2024-04-26 09:04:45,106][49517] Starting process rollout_proc0 [2024-04-26 09:04:45,106][49517] Starting process rollout_proc1 [2024-04-26 09:04:45,106][49517] Starting process rollout_proc2 [2024-04-26 09:04:45,106][49517] Starting process rollout_proc3 [2024-04-26 09:04:45,107][49517] Starting process rollout_proc4 [2024-04-26 09:04:45,107][49517] Starting process rollout_proc5 [2024-04-26 09:04:45,107][49517] Starting process rollout_proc6 [2024-04-26 09:04:45,107][49517] Starting process rollout_proc7 [2024-04-26 09:04:45,109][49517] Starting process rollout_proc8 [2024-04-26 09:04:45,109][49517] Starting process rollout_proc9 [2024-04-26 09:04:45,109][49517] Starting process rollout_proc10 [2024-04-26 09:04:45,109][49517] Starting process rollout_proc11 [2024-04-26 09:04:45,110][49517] Starting process rollout_proc12 [2024-04-26 09:04:45,110][49517] Starting process rollout_proc13 [2024-04-26 09:04:45,110][49517] Starting process rollout_proc14 [2024-04-26 09:04:45,110][49517] Starting process rollout_proc15 [2024-04-26 09:04:45,110][49517] Starting process rollout_proc16 [2024-04-26 09:04:45,110][49517] Starting process rollout_proc17 [2024-04-26 09:04:45,111][49517] Starting process rollout_proc18 [2024-04-26 09:04:45,113][49517] Starting process rollout_proc19 [2024-04-26 09:04:45,115][49517] Starting process rollout_proc20 [2024-04-26 09:04:45,115][49517] Starting process rollout_proc21 [2024-04-26 
09:04:45,115][49517] Starting process rollout_proc22 [2024-04-26 09:04:45,118][49517] Starting process rollout_proc23 [2024-04-26 09:04:45,120][49517] Starting process rollout_proc24 [2024-04-26 09:04:45,122][49517] Starting process rollout_proc25 [2024-04-26 09:04:45,122][49517] Starting process rollout_proc26 [2024-04-26 09:04:45,127][49517] Starting process rollout_proc27 [2024-04-26 09:04:45,128][49517] Starting process rollout_proc28 [2024-04-26 09:04:45,128][49517] Starting process rollout_proc29 [2024-04-26 09:04:45,133][49517] Starting process rollout_proc30 [2024-04-26 09:04:45,133][49517] Starting process rollout_proc31 [2024-04-26 09:04:48,531][49749] Worker 1 uses CPU cores [1] [2024-04-26 09:04:48,590][49752] Worker 3 uses CPU cores [3] [2024-04-26 09:04:48,666][49728] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-26 09:04:48,666][49728] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-04-26 09:04:48,669][49748] Worker 0 uses CPU cores [0] [2024-04-26 09:04:48,691][49728] Num visible devices: 1 [2024-04-26 09:04:48,747][49728] Starting seed is not provided [2024-04-26 09:04:48,747][49728] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-26 09:04:48,747][49728] Initializing actor-critic model on device cuda:0 [2024-04-26 09:04:48,748][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,757][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,757][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 
09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,758][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,759][49728] RunningMeanStd input shape: (1,) [2024-04-26 09:04:48,761][49750] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-26 09:04:48,760][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49750] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,775][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] 
RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,776][49728] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:48,785][49750] Num visible devices: 1 [2024-04-26 09:04:48,785][49756] Worker 7 uses CPU cores [7] [2024-04-26 09:04:48,794][49777] Worker 28 uses CPU cores [28] [2024-04-26 09:04:48,822][49755] Worker 6 uses CPU cores [6] [2024-04-26 09:04:48,838][49776] Worker 27 uses CPU cores [27] [2024-04-26 09:04:48,838][49774] Worker 26 uses CPU cores [26] [2024-04-26 09:04:48,846][49773] Worker 22 uses CPU cores [22] [2024-04-26 09:04:48,858][49751] Worker 2 uses CPU cores [2] [2024-04-26 09:04:48,864][49728] Created Actor Critic model with architecture: [2024-04-26 09:04:48,864][49728] PredictingActorCritic( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): FeatureEncoder( (_global_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (conf:agent:energy:initial): RunningMeanStdInPlace() 
(conf:agent:energy:max): RunningMeanStdInPlace() (conf:agent:energy:regen): RunningMeanStdInPlace() (conf:altar:cooldown): RunningMeanStdInPlace() (conf:altar:cost): RunningMeanStdInPlace() (conf:attack:damage): RunningMeanStdInPlace() (conf:attack:freeze_duration): RunningMeanStdInPlace() (conf:charger:cooldown): RunningMeanStdInPlace() (conf:charger:energy): RunningMeanStdInPlace() (conf:cost:attack): RunningMeanStdInPlace() (conf:cost:frozen): RunningMeanStdInPlace() (conf:cost:jump): RunningMeanStdInPlace() (conf:cost:move): RunningMeanStdInPlace() (conf:cost:move:predator): RunningMeanStdInPlace() (conf:cost:move:prey): RunningMeanStdInPlace() (conf:cost:rotate): RunningMeanStdInPlace() (conf:cost:shield): RunningMeanStdInPlace() (conf:cost:shield:upkeep): RunningMeanStdInPlace() (conf:generator:cooldown): RunningMeanStdInPlace() (conf:gift:energy): RunningMeanStdInPlace() (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() (last_reward): RunningMeanStdInPlace() ) ) (_grid_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (charger): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:extra_property:0): RunningMeanStdInPlace() (agent:extra_property:1): RunningMeanStdInPlace() (agent:extra_property:2): RunningMeanStdInPlace() (agent:extra_property:3): RunningMeanStdInPlace() (agent:extra_property:4): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv:1): RunningMeanStdInPlace() (agent:inv:2): RunningMeanStdInPlace() (agent:inv:3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (agent:species): RunningMeanStdInPlace() (altar:ready): RunningMeanStdInPlace() (charger:bonus): RunningMeanStdInPlace() (charger:input:1): RunningMeanStdInPlace() 
(charger:input:2): RunningMeanStdInPlace() (charger:input:3): RunningMeanStdInPlace() (charger:output): RunningMeanStdInPlace() (charger:ready): RunningMeanStdInPlace() (generator:ready): RunningMeanStdInPlace() (generator:resource): RunningMeanStdInPlace() ) ) (_global_encoder): FeatureListEncoder( (_embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (_grid_encoder): FeatureListEncoder( (_embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (encoder_head): Sequential( (0): Linear(in_features=520, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): FeatureDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=17, bias=True) ) ) [2024-04-26 09:04:48,914][49753] Worker 4 uses CPU cores [4] [2024-04-26 09:04:48,925][49770] Worker 21 uses CPU cores [21] [2024-04-26 09:04:48,966][49771] Worker 23 uses CPU cores [23] [2024-04-26 09:04:49,006][49764] Worker 14 uses CPU cores [14] [2024-04-26 09:04:49,032][49754] Worker 5 uses CPU cores [5] [2024-04-26 09:04:49,036][49769] Worker 20 uses CPU cores [20] [2024-04-26 09:04:49,046][49766] Worker 16 uses CPU cores [16] [2024-04-26 09:04:49,049][49780] Worker 29 uses CPU cores [29] [2024-04-26 09:04:49,050][49762] Worker 12 uses CPU cores [12] 
[2024-04-26 09:04:49,052][49758] Worker 9 uses CPU cores [9] [2024-04-26 09:04:49,054][49757] Worker 8 uses CPU cores [8] [2024-04-26 09:04:49,058][49775] Worker 25 uses CPU cores [25] [2024-04-26 09:04:49,071][49778] Worker 30 uses CPU cores [30] [2024-04-26 09:04:49,074][49765] Worker 17 uses CPU cores [17] [2024-04-26 09:04:49,074][49768] Worker 19 uses CPU cores [19] [2024-04-26 09:04:49,074][49728] Using optimizer [2024-04-26 09:04:49,076][49763] Worker 15 uses CPU cores [15] [2024-04-26 09:04:49,098][49779] Worker 31 uses CPU cores [31] [2024-04-26 09:04:49,114][49760] Worker 11 uses CPU cores [11] [2024-04-26 09:04:49,185][49772] Worker 24 uses CPU cores [24] [2024-04-26 09:04:49,219][49728] Loading state from checkpoint /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000137161_2247245824.pth... [2024-04-26 09:04:49,240][49728] Loading model from checkpoint [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:0.running_mean from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:0.running_var from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:0.count from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:1.running_mean from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:1.running_var from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:1.count from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:2.running_mean from the checkpoint, mean=0, var=1 [2024-04-26 
09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:2.running_var from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:2.count from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:3.running_mean from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:3.running_var from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:3.count from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:4.running_mean from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:4.running_var from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,241][49728] Could not load encoder._grid_normalizer._norms_dict.agent:extra_property:4.count from the checkpoint, mean=0, var=1 [2024-04-26 09:04:49,243][49728] Not restoring optimizer state from the checkpoint [2024-04-26 09:04:49,243][49728] Loaded experiment state at self.train_step=137161, self.env_steps=2247245824 [2024-04-26 09:04:49,243][49728] Initialized policy 0 weights for model version 137161 [2024-04-26 09:04:49,246][49728] LearnerWorker_p0 finished initialization! 
[2024-04-26 09:04:49,246][49728] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-26 09:04:49,342][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,348][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,348][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,348][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,349][49750] RunningMeanStd input shape: (1,) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] 
RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,350][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,351][49750] RunningMeanStd input shape: (11, 11) [2024-04-26 09:04:49,377][49767] Worker 18 uses CPU cores [18] [2024-04-26 09:04:49,412][49759] Worker 10 uses CPU cores [10] [2024-04-26 09:04:49,416][49517] Inference worker 0-0 is ready! [2024-04-26 09:04:49,416][49517] All inference workers are ready! Signal rollout workers to start! 
[2024-04-26 09:04:49,666][49761] Worker 13 uses CPU cores [13] [2024-04-26 09:04:49,982][49760] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,029][49758] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,037][49762] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,072][49764] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,090][49756] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,115][49749] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,117][49752] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,127][49748] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,128][49751] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,143][49754] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,144][49753] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,146][49763] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,173][49776] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,173][49780] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,177][49774] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,178][49777] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,180][49755] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,182][49775] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,184][49770] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,185][49773] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,186][49771] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,187][49766] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,188][49769] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,189][49757] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,194][49778] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,239][49779] Decorrelating experience for 0 frames... 
[2024-04-26 09:04:50,252][49772] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,253][49768] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,255][49765] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,360][49759] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,403][49767] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,648][49760] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,687][49762] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,709][49758] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,744][49764] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,751][49756] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,792][49749] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,808][49752] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,815][49763] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,816][49761] Decorrelating experience for 0 frames... [2024-04-26 09:04:50,818][49748] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,819][49751] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,829][49757] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,838][49755] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,857][49754] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,859][49753] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,924][49776] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,926][49766] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,927][49780] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,933][49774] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,940][49777] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,942][49771] Decorrelating experience for 256 frames... 
[2024-04-26 09:04:50,943][49775] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,948][49778] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,949][49769] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,953][49770] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,954][49773] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,987][49779] Decorrelating experience for 256 frames... [2024-04-26 09:04:50,993][49768] Decorrelating experience for 256 frames... [2024-04-26 09:04:51,009][49759] Decorrelating experience for 256 frames... [2024-04-26 09:04:51,019][49765] Decorrelating experience for 256 frames... [2024-04-26 09:04:51,021][49772] Decorrelating experience for 256 frames... [2024-04-26 09:04:51,124][49767] Decorrelating experience for 256 frames... [2024-04-26 09:04:51,471][49761] Decorrelating experience for 256 frames... [2024-04-26 09:04:52,062][49517] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 2247245824. Throughput: 0: nan. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-04-26 09:04:55,990][49760] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-04-26 09:04:55,990][49763] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-04-26 09:04:55,999][49758] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-04-26 09:04:55,999][49756] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-04-26 09:04:56,006][49749] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-04-26 09:04:56,007][49752] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-04-26 09:04:56,008][49751] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-04-26 09:04:56,016][49757] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-04-26 09:04:56,059][49762] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-04-26 09:04:56,077][49759] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-04-26 09:04:56,093][49754] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-04-26 09:04:56,094][49778] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-04-26 09:04:56,105][49764] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-04-26 09:04:56,112][49728] Signal inference workers to stop experience collection... [2024-04-26 09:04:56,121][49750] InferenceWorker_p0-w0: stopping experience collection [2024-04-26 09:04:56,650][49728] Signal inference workers to resume experience collection... 
[2024-04-26 09:04:56,650][49750] InferenceWorker_p0-w0: resuming experience collection [2024-04-26 09:04:56,672][49780] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-04-26 09:04:56,886][49771] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-04-26 09:04:56,894][49774] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-04-26 09:04:56,921][49777] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-04-26 09:04:56,921][49776] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-04-26 09:04:57,003][49779] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-04-26 09:04:57,003][49770] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-04-26 09:04:57,008][49766] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-04-26 09:04:57,012][49769] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-04-26 09:04:57,012][49773] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-04-26 09:04:57,063][49517] Fps is (10 sec: 9830.2, 60 sec: 9830.2, 300 sec: 9830.2). Total num frames: 2247294976. Throughput: 0: 54247.1. Samples: 271240. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:04:57,145][49775] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-04-26 09:04:57,236][49765] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-04-26 09:04:57,272][49768] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-04-26 09:04:57,351][49772] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-04-26 09:04:57,386][49755] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-04-26 09:04:57,437][49761] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-04-26 09:04:57,439][49767] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-04-26 09:04:57,798][49750] Updated weights for policy 0, policy_version 137171 (0.0017) [2024-04-26 09:04:58,153][49753] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-04-26 09:05:00,717][49749] Worker 1 awakens! [2024-04-26 09:05:02,063][49517] Fps is (10 sec: 16383.6, 60 sec: 16383.6, 300 sec: 16383.6). Total num frames: 2247409664. Throughput: 0: 33343.2. Samples: 333440. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:04,999][49517] Heartbeat connected on Batcher_0 [2024-04-26 09:05:05,001][49517] Heartbeat connected on LearnerWorker_p0 [2024-04-26 09:05:05,006][49517] Heartbeat connected on RolloutWorker_w0 [2024-04-26 09:05:05,006][49517] Heartbeat connected on RolloutWorker_w1 [2024-04-26 09:05:05,065][49517] Heartbeat connected on InferenceWorker_p0-w0 [2024-04-26 09:05:05,430][49751] Worker 2 awakens! [2024-04-26 09:05:05,437][49517] Heartbeat connected on RolloutWorker_w2 [2024-04-26 09:05:07,063][49517] Fps is (10 sec: 13107.1, 60 sec: 12014.8, 300 sec: 12014.8). Total num frames: 2247426048. Throughput: 0: 22661.1. Samples: 339920. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:10,138][49752] Worker 3 awakens! 
[2024-04-26 09:05:10,146][49517] Heartbeat connected on RolloutWorker_w3 [2024-04-26 09:05:12,062][49517] Fps is (10 sec: 3276.9, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 2247442432. Throughput: 0: 17922.0. Samples: 358440. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:16,994][49753] Worker 4 awakens! [2024-04-26 09:05:17,004][49517] Heartbeat connected on RolloutWorker_w4 [2024-04-26 09:05:17,062][49517] Fps is (10 sec: 3276.9, 60 sec: 8519.7, 300 sec: 8519.7). Total num frames: 2247458816. Throughput: 0: 15337.6. Samples: 383440. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:17,063][49517] Avg episode reward: [(0, '0.028')] [2024-04-26 09:05:19,630][49754] Worker 5 awakens! [2024-04-26 09:05:19,634][49517] Heartbeat connected on RolloutWorker_w5 [2024-04-26 09:05:22,062][49517] Fps is (10 sec: 9830.4, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 2247540736. Throughput: 0: 14260.7. Samples: 427820. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:22,063][49517] Avg episode reward: [(0, '0.044')] [2024-04-26 09:05:22,945][49750] Updated weights for policy 0, policy_version 137181 (0.0017) [2024-04-26 09:05:25,610][49755] Worker 6 awakens! [2024-04-26 09:05:25,614][49517] Heartbeat connected on RolloutWorker_w6 [2024-04-26 09:05:27,062][49517] Fps is (10 sec: 19660.6, 60 sec: 11702.8, 300 sec: 11702.8). Total num frames: 2247655424. Throughput: 0: 15516.0. Samples: 543060. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:27,063][49517] Avg episode reward: [(0, '0.045')] [2024-04-26 09:05:28,912][49756] Worker 7 awakens! [2024-04-26 09:05:28,917][49517] Heartbeat connected on RolloutWorker_w7 [2024-04-26 09:05:29,492][49750] Updated weights for policy 0, policy_version 137191 (0.0015) [2024-04-26 09:05:32,062][49517] Fps is (10 sec: 24575.8, 60 sec: 13516.8, 300 sec: 13516.8). Total num frames: 2247786496. Throughput: 0: 17387.5. Samples: 695500. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:32,063][49517] Avg episode reward: [(0, '0.039')] [2024-04-26 09:05:33,614][49757] Worker 8 awakens! [2024-04-26 09:05:33,619][49517] Heartbeat connected on RolloutWorker_w8 [2024-04-26 09:05:35,884][49750] Updated weights for policy 0, policy_version 137201 (0.0015) [2024-04-26 09:05:37,062][49517] Fps is (10 sec: 26214.5, 60 sec: 14927.6, 300 sec: 14927.6). Total num frames: 2247917568. Throughput: 0: 17236.0. Samples: 775620. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:37,063][49517] Avg episode reward: [(0, '0.048')] [2024-04-26 09:05:38,286][49758] Worker 9 awakens! [2024-04-26 09:05:38,291][49517] Heartbeat connected on RolloutWorker_w9 [2024-04-26 09:05:41,969][49750] Updated weights for policy 0, policy_version 137211 (0.0015) [2024-04-26 09:05:42,062][49517] Fps is (10 sec: 27853.1, 60 sec: 16384.0, 300 sec: 16384.0). Total num frames: 2248065024. Throughput: 0: 15068.5. Samples: 949320. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:42,063][49517] Avg episode reward: [(0, '0.086')] [2024-04-26 09:05:43,053][49759] Worker 10 awakens! [2024-04-26 09:05:43,057][49517] Heartbeat connected on RolloutWorker_w10 [2024-04-26 09:05:45,572][49750] Updated weights for policy 0, policy_version 137221 (0.0017) [2024-04-26 09:05:47,062][49517] Fps is (10 sec: 31129.6, 60 sec: 17873.4, 300 sec: 17873.4). Total num frames: 2248228864. Throughput: 0: 18577.4. Samples: 1169420. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:47,063][49517] Avg episode reward: [(0, '0.119')] [2024-04-26 09:05:47,653][49760] Worker 11 awakens! [2024-04-26 09:05:47,657][49517] Heartbeat connected on RolloutWorker_w11 [2024-04-26 09:05:50,603][49750] Updated weights for policy 0, policy_version 137231 (0.0017) [2024-04-26 09:05:52,063][49517] Fps is (10 sec: 40959.5, 60 sec: 20480.0, 300 sec: 20480.0). Total num frames: 2248474624. Throughput: 0: 20899.2. Samples: 1280380. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:52,063][49517] Avg episode reward: [(0, '0.147')] [2024-04-26 09:05:52,409][49762] Worker 12 awakens! [2024-04-26 09:05:52,415][49517] Heartbeat connected on RolloutWorker_w12 [2024-04-26 09:05:54,367][49750] Updated weights for policy 0, policy_version 137241 (0.0020) [2024-04-26 09:05:57,062][49517] Fps is (10 sec: 42598.9, 60 sec: 22664.6, 300 sec: 21677.3). Total num frames: 2248654848. Throughput: 0: 25848.0. Samples: 1521600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-26 09:05:57,063][49517] Avg episode reward: [(0, '0.180')] [2024-04-26 09:05:58,036][49750] Updated weights for policy 0, policy_version 137251 (0.0018) [2024-04-26 09:05:58,475][49761] Worker 13 awakens! [2024-04-26 09:05:58,482][49517] Heartbeat connected on RolloutWorker_w13 [2024-04-26 09:06:01,832][49764] Worker 14 awakens! [2024-04-26 09:06:01,837][49517] Heartbeat connected on RolloutWorker_w14 [2024-04-26 09:06:02,063][49517] Fps is (10 sec: 39321.7, 60 sec: 24303.0, 300 sec: 23171.6). Total num frames: 2248867840. Throughput: 0: 30657.7. Samples: 1763040. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:02,063][49517] Avg episode reward: [(0, '0.166')] [2024-04-26 09:06:02,371][49750] Updated weights for policy 0, policy_version 137261 (0.0021) [2024-04-26 09:06:06,403][49763] Worker 15 awakens! [2024-04-26 09:06:06,410][49517] Heartbeat connected on RolloutWorker_w15 [2024-04-26 09:06:06,566][49750] Updated weights for policy 0, policy_version 137271 (0.0023) [2024-04-26 09:06:07,062][49517] Fps is (10 sec: 39321.3, 60 sec: 27033.7, 300 sec: 24029.9). Total num frames: 2249048064. Throughput: 0: 32269.8. Samples: 1879960. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:07,063][49517] Avg episode reward: [(0, '0.153')] [2024-04-26 09:06:10,762][49750] Updated weights for policy 0, policy_version 137281 (0.0026) [2024-04-26 09:06:12,063][49517] Fps is (10 sec: 37682.9, 60 sec: 30037.3, 300 sec: 24985.6). Total num frames: 2249244672. Throughput: 0: 34908.8. Samples: 2113960. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:12,063][49517] Avg episode reward: [(0, '0.219')] [2024-04-26 09:06:12,106][49766] Worker 16 awakens! [2024-04-26 09:06:12,116][49517] Heartbeat connected on RolloutWorker_w16 [2024-04-26 09:06:14,718][49750] Updated weights for policy 0, policy_version 137291 (0.0029) [2024-04-26 09:06:17,023][49765] Worker 17 awakens! [2024-04-26 09:06:17,032][49517] Heartbeat connected on RolloutWorker_w17 [2024-04-26 09:06:17,062][49517] Fps is (10 sec: 40959.9, 60 sec: 33314.1, 300 sec: 26021.6). Total num frames: 2249457664. Throughput: 0: 37048.9. Samples: 2362700. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:17,063][49517] Avg episode reward: [(0, '0.209')] [2024-04-26 09:06:18,963][49750] Updated weights for policy 0, policy_version 137301 (0.0027) [2024-04-26 09:06:21,884][49767] Worker 18 awakens! [2024-04-26 09:06:21,893][49517] Heartbeat connected on RolloutWorker_w18 [2024-04-26 09:06:22,063][49517] Fps is (10 sec: 44236.9, 60 sec: 35771.6, 300 sec: 27124.6). Total num frames: 2249687040. Throughput: 0: 38027.0. Samples: 2486840. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:22,063][49517] Avg episode reward: [(0, '0.301')] [2024-04-26 09:06:22,396][49750] Updated weights for policy 0, policy_version 137311 (0.0028) [2024-04-26 09:06:26,285][49750] Updated weights for policy 0, policy_version 137321 (0.0024) [2024-04-26 09:06:26,434][49768] Worker 19 awakens! 
[2024-04-26 09:06:26,444][49517] Heartbeat connected on RolloutWorker_w19 [2024-04-26 09:06:27,063][49517] Fps is (10 sec: 44236.4, 60 sec: 37410.1, 300 sec: 27939.0). Total num frames: 2249900032. Throughput: 0: 39914.5. Samples: 2745480. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:27,063][49517] Avg episode reward: [(0, '0.310')] [2024-04-26 09:06:30,348][49750] Updated weights for policy 0, policy_version 137331 (0.0025) [2024-04-26 09:06:30,814][49769] Worker 20 awakens! [2024-04-26 09:06:30,823][49517] Heartbeat connected on RolloutWorker_w20 [2024-04-26 09:06:32,063][49517] Fps is (10 sec: 42598.5, 60 sec: 38775.4, 300 sec: 28672.0). Total num frames: 2250113024. Throughput: 0: 40794.6. Samples: 3005180. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:32,063][49517] Avg episode reward: [(0, '0.341')] [2024-04-26 09:06:34,071][49750] Updated weights for policy 0, policy_version 137341 (0.0030) [2024-04-26 09:06:35,477][49770] Worker 21 awakens! [2024-04-26 09:06:35,487][49517] Heartbeat connected on RolloutWorker_w21 [2024-04-26 09:06:37,063][49517] Fps is (10 sec: 42598.3, 60 sec: 40140.7, 300 sec: 29335.1). Total num frames: 2250326016. Throughput: 0: 41541.7. Samples: 3149760. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:37,063][49517] Avg episode reward: [(0, '0.304')] [2024-04-26 09:06:37,307][49750] Updated weights for policy 0, policy_version 137351 (0.0025) [2024-04-26 09:06:40,207][49773] Worker 22 awakens! [2024-04-26 09:06:40,216][49517] Heartbeat connected on RolloutWorker_w22 [2024-04-26 09:06:40,652][49750] Updated weights for policy 0, policy_version 137361 (0.0026) [2024-04-26 09:06:42,063][49517] Fps is (10 sec: 44236.6, 60 sec: 41506.0, 300 sec: 30086.9). Total num frames: 2250555392. Throughput: 0: 42002.4. Samples: 3411720. 
Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:42,063][49517] Avg episode reward: [(0, '0.319')] [2024-04-26 09:06:42,252][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000137365_2250588160.pth... [2024-04-26 09:06:42,296][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000136747_2240462848.pth [2024-04-26 09:06:44,798][49771] Worker 23 awakens! [2024-04-26 09:06:44,810][49517] Heartbeat connected on RolloutWorker_w23 [2024-04-26 09:06:44,921][49750] Updated weights for policy 0, policy_version 137371 (0.0024) [2024-04-26 09:06:47,062][49517] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 30773.4). Total num frames: 2250784768. Throughput: 0: 42829.0. Samples: 3690340. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:47,063][49517] Avg episode reward: [(0, '0.413')] [2024-04-26 09:06:48,311][49750] Updated weights for policy 0, policy_version 137381 (0.0025) [2024-04-26 09:06:49,920][49772] Worker 24 awakens! [2024-04-26 09:06:49,931][49517] Heartbeat connected on RolloutWorker_w24 [2024-04-26 09:06:50,880][49750] Updated weights for policy 0, policy_version 137391 (0.0026) [2024-04-26 09:06:52,063][49517] Fps is (10 sec: 49152.4, 60 sec: 42871.5, 300 sec: 31675.7). Total num frames: 2251046912. Throughput: 0: 43574.1. Samples: 3840800. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:52,063][49517] Avg episode reward: [(0, '0.443')] [2024-04-26 09:06:54,427][49775] Worker 25 awakens! [2024-04-26 09:06:54,439][49517] Heartbeat connected on RolloutWorker_w25 [2024-04-26 09:06:55,256][49750] Updated weights for policy 0, policy_version 137401 (0.0028) [2024-04-26 09:06:55,878][49728] Signal inference workers to stop experience collection... (50 times) [2024-04-26 09:06:55,880][49728] Signal inference workers to resume experience collection... 
(50 times) [2024-04-26 09:06:55,897][49750] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-04-26 09:06:55,898][49750] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-04-26 09:06:57,063][49517] Fps is (10 sec: 50789.3, 60 sec: 43963.5, 300 sec: 32374.7). Total num frames: 2251292672. Throughput: 0: 44588.0. Samples: 4120420. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:06:57,063][49517] Avg episode reward: [(0, '0.322')] [2024-04-26 09:06:58,457][49750] Updated weights for policy 0, policy_version 137411 (0.0028) [2024-04-26 09:06:58,845][49774] Worker 26 awakens! [2024-04-26 09:06:58,858][49517] Heartbeat connected on RolloutWorker_w26 [2024-04-26 09:07:01,466][49750] Updated weights for policy 0, policy_version 137421 (0.0031) [2024-04-26 09:07:02,063][49517] Fps is (10 sec: 49151.6, 60 sec: 44509.8, 300 sec: 33020.0). Total num frames: 2251538432. Throughput: 0: 45339.4. Samples: 4402980. Policy #0 lag: (min: 0.0, avg: 3.8, max: 9.0) [2024-04-26 09:07:02,063][49517] Avg episode reward: [(0, '0.392')] [2024-04-26 09:07:03,547][49776] Worker 27 awakens! [2024-04-26 09:07:03,559][49517] Heartbeat connected on RolloutWorker_w27 [2024-04-26 09:07:05,718][49750] Updated weights for policy 0, policy_version 137431 (0.0032) [2024-04-26 09:07:07,063][49517] Fps is (10 sec: 45875.0, 60 sec: 45055.8, 300 sec: 33374.8). Total num frames: 2251751424. Throughput: 0: 46044.4. Samples: 4558840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:07,063][49517] Avg episode reward: [(0, '0.401')] [2024-04-26 09:07:08,008][49750] Updated weights for policy 0, policy_version 137441 (0.0030) [2024-04-26 09:07:08,270][49777] Worker 28 awakens! [2024-04-26 09:07:08,283][49517] Heartbeat connected on RolloutWorker_w28 [2024-04-26 09:07:12,062][49517] Fps is (10 sec: 44237.5, 60 sec: 45602.2, 300 sec: 33821.3). Total num frames: 2251980800. Throughput: 0: 46822.8. Samples: 4852500. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:12,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 09:07:12,421][49750] Updated weights for policy 0, policy_version 137451 (0.0035) [2024-04-26 09:07:12,702][49780] Worker 29 awakens! [2024-04-26 09:07:12,714][49517] Heartbeat connected on RolloutWorker_w29 [2024-04-26 09:07:15,139][49750] Updated weights for policy 0, policy_version 137461 (0.0033) [2024-04-26 09:07:16,819][49778] Worker 30 awakens! [2024-04-26 09:07:16,833][49517] Heartbeat connected on RolloutWorker_w30 [2024-04-26 09:07:17,062][49517] Fps is (10 sec: 47514.7, 60 sec: 46148.3, 300 sec: 34349.9). Total num frames: 2252226560. Throughput: 0: 47488.1. Samples: 5142140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:17,063][49517] Avg episode reward: [(0, '0.366')] [2024-04-26 09:07:18,847][49750] Updated weights for policy 0, policy_version 137471 (0.0031) [2024-04-26 09:07:21,698][49750] Updated weights for policy 0, policy_version 137481 (0.0027) [2024-04-26 09:07:22,063][49517] Fps is (10 sec: 50789.6, 60 sec: 46694.4, 300 sec: 34952.5). Total num frames: 2252488704. Throughput: 0: 47454.2. Samples: 5285200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:22,072][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 09:07:22,395][49779] Worker 31 awakens! [2024-04-26 09:07:22,407][49517] Heartbeat connected on RolloutWorker_w31 [2024-04-26 09:07:25,396][49750] Updated weights for policy 0, policy_version 137491 (0.0035) [2024-04-26 09:07:27,062][49517] Fps is (10 sec: 52428.5, 60 sec: 47513.7, 300 sec: 35516.3). Total num frames: 2252750848. Throughput: 0: 48402.4. Samples: 5589820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:27,063][49517] Avg episode reward: [(0, '0.410')] [2024-04-26 09:07:28,379][49750] Updated weights for policy 0, policy_version 137501 (0.0028) [2024-04-26 09:07:31,828][49750] Updated weights for policy 0, policy_version 137511 (0.0036) [2024-04-26 09:07:32,063][49517] Fps is (10 sec: 50790.6, 60 sec: 48059.7, 300 sec: 35942.4). Total num frames: 2252996608. Throughput: 0: 49030.5. Samples: 5896720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:32,072][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 09:07:34,709][49750] Updated weights for policy 0, policy_version 137521 (0.0033) [2024-04-26 09:07:37,063][49517] Fps is (10 sec: 47513.2, 60 sec: 48332.8, 300 sec: 36243.4). Total num frames: 2253225984. Throughput: 0: 48928.8. Samples: 6042600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:37,072][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 09:07:38,145][49750] Updated weights for policy 0, policy_version 137531 (0.0038) [2024-04-26 09:07:41,276][49750] Updated weights for policy 0, policy_version 137541 (0.0034) [2024-04-26 09:07:42,062][49517] Fps is (10 sec: 49152.5, 60 sec: 48879.0, 300 sec: 36719.4). Total num frames: 2253488128. Throughput: 0: 49472.6. Samples: 6346680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:42,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 09:07:44,606][49750] Updated weights for policy 0, policy_version 137551 (0.0036) [2024-04-26 09:07:47,063][49517] Fps is (10 sec: 54067.0, 60 sec: 49698.0, 300 sec: 37261.9). Total num frames: 2253766656. Throughput: 0: 49847.1. Samples: 6646100. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:47,063][49517] Avg episode reward: [(0, '0.444')] [2024-04-26 09:07:47,824][49750] Updated weights for policy 0, policy_version 137561 (0.0029) [2024-04-26 09:07:51,136][49750] Updated weights for policy 0, policy_version 137571 (0.0032) [2024-04-26 09:07:52,063][49517] Fps is (10 sec: 55705.3, 60 sec: 49971.2, 300 sec: 37774.2). Total num frames: 2254045184. Throughput: 0: 50049.0. Samples: 6811040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:52,072][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 09:07:54,218][49750] Updated weights for policy 0, policy_version 137581 (0.0030) [2024-04-26 09:07:57,062][49517] Fps is (10 sec: 45876.2, 60 sec: 48879.1, 300 sec: 37727.5). Total num frames: 2254225408. Throughput: 0: 50136.5. Samples: 7108640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:07:57,063][49517] Avg episode reward: [(0, '0.448')] [2024-04-26 09:07:57,738][49750] Updated weights for policy 0, policy_version 137591 (0.0036) [2024-04-26 09:07:58,350][49728] Signal inference workers to stop experience collection... (100 times) [2024-04-26 09:07:58,350][49728] Signal inference workers to resume experience collection... (100 times) [2024-04-26 09:07:58,379][49750] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-04-26 09:07:58,379][49750] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-04-26 09:08:00,545][49750] Updated weights for policy 0, policy_version 137601 (0.0035) [2024-04-26 09:08:02,062][49517] Fps is (10 sec: 44237.5, 60 sec: 49152.2, 300 sec: 38114.4). Total num frames: 2254487552. Throughput: 0: 50436.0. Samples: 7411760. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 17.0) [2024-04-26 09:08:02,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 09:08:04,101][49750] Updated weights for policy 0, policy_version 137611 (0.0031) [2024-04-26 09:08:07,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50244.4, 300 sec: 38565.4). Total num frames: 2254766080. Throughput: 0: 50485.5. Samples: 7557040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 09:08:07,181][49750] Updated weights for policy 0, policy_version 137621 (0.0027) [2024-04-26 09:08:10,434][49750] Updated weights for policy 0, policy_version 137631 (0.0030) [2024-04-26 09:08:12,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.3, 300 sec: 38912.0). Total num frames: 2255028224. Throughput: 0: 50490.2. Samples: 7861880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:12,072][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 09:08:13,593][49750] Updated weights for policy 0, policy_version 137641 (0.0035) [2024-04-26 09:08:17,047][49750] Updated weights for policy 0, policy_version 137651 (0.0034) [2024-04-26 09:08:17,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 39161.7). Total num frames: 2255273984. Throughput: 0: 50588.0. Samples: 8173180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:17,072][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 09:08:20,243][49750] Updated weights for policy 0, policy_version 137661 (0.0033) [2024-04-26 09:08:22,062][49517] Fps is (10 sec: 45875.9, 60 sec: 49971.4, 300 sec: 39243.6). Total num frames: 2255486976. Throughput: 0: 50558.4. Samples: 8317720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:22,063][49517] Avg episode reward: [(0, '0.438')] [2024-04-26 09:08:23,551][49750] Updated weights for policy 0, policy_version 137671 (0.0028) [2024-04-26 09:08:27,062][49517] Fps is (10 sec: 47514.5, 60 sec: 49971.3, 300 sec: 39550.2). 
Total num frames: 2255749120. Throughput: 0: 50568.5. Samples: 8622260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:27,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 09:08:27,075][49750] Updated weights for policy 0, policy_version 137681 (0.0033) [2024-04-26 09:08:29,999][49750] Updated weights for policy 0, policy_version 137691 (0.0032) [2024-04-26 09:08:32,063][49517] Fps is (10 sec: 55704.6, 60 sec: 50790.4, 300 sec: 39991.8). Total num frames: 2256044032. Throughput: 0: 50639.2. Samples: 8924860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:32,063][49517] Avg episode reward: [(0, '0.436')] [2024-04-26 09:08:33,545][49750] Updated weights for policy 0, policy_version 137701 (0.0031) [2024-04-26 09:08:36,271][49750] Updated weights for policy 0, policy_version 137711 (0.0027) [2024-04-26 09:08:37,062][49517] Fps is (10 sec: 57344.4, 60 sec: 51609.8, 300 sec: 40341.1). Total num frames: 2256322560. Throughput: 0: 50640.7. Samples: 9089860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:37,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 09:08:39,840][49750] Updated weights for policy 0, policy_version 137721 (0.0028) [2024-04-26 09:08:42,063][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.2, 300 sec: 40247.6). Total num frames: 2256502784. Throughput: 0: 50700.7. Samples: 9390180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:42,063][49517] Avg episode reward: [(0, '0.436')] [2024-04-26 09:08:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000137727_2256519168.pth... 
[2024-04-26 09:08:42,128][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000137161_2247245824.pth [2024-04-26 09:08:42,849][49750] Updated weights for policy 0, policy_version 137731 (0.0029) [2024-04-26 09:08:46,170][49750] Updated weights for policy 0, policy_version 137741 (0.0029) [2024-04-26 09:08:47,062][49517] Fps is (10 sec: 44236.6, 60 sec: 49971.4, 300 sec: 40506.8). Total num frames: 2256764928. Throughput: 0: 50618.2. Samples: 9689580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:47,063][49517] Avg episode reward: [(0, '0.425')] [2024-04-26 09:08:49,347][49750] Updated weights for policy 0, policy_version 137751 (0.0032) [2024-04-26 09:08:49,581][49728] Signal inference workers to stop experience collection... (150 times) [2024-04-26 09:08:49,597][49750] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-04-26 09:08:49,698][49728] Signal inference workers to resume experience collection... (150 times) [2024-04-26 09:08:49,699][49750] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-04-26 09:08:52,063][49517] Fps is (10 sec: 54067.2, 60 sec: 49971.2, 300 sec: 40823.5). Total num frames: 2257043456. Throughput: 0: 50635.0. Samples: 9835620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:52,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 09:08:52,690][49750] Updated weights for policy 0, policy_version 137761 (0.0035) [2024-04-26 09:08:55,860][49750] Updated weights for policy 0, policy_version 137771 (0.0027) [2024-04-26 09:08:57,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 41060.3). Total num frames: 2257305600. Throughput: 0: 50783.2. Samples: 10147120. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:08:57,063][49517] Avg episode reward: [(0, '0.436')] [2024-04-26 09:08:59,274][49750] Updated weights for policy 0, policy_version 137781 (0.0028) [2024-04-26 09:09:02,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50517.4, 300 sec: 41091.1). Total num frames: 2257518592. Throughput: 0: 50518.0. Samples: 10446480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:09:02,063][49517] Avg episode reward: [(0, '0.427')] [2024-04-26 09:09:02,433][49750] Updated weights for policy 0, policy_version 137791 (0.0033) [2024-04-26 09:09:05,697][49750] Updated weights for policy 0, policy_version 137801 (0.0040) [2024-04-26 09:09:07,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.2, 300 sec: 41249.1). Total num frames: 2257764352. Throughput: 0: 50448.0. Samples: 10587880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:09:07,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 09:09:08,866][49750] Updated weights for policy 0, policy_version 137811 (0.0034) [2024-04-26 09:09:12,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50244.3, 300 sec: 41527.1). Total num frames: 2258042880. Throughput: 0: 50319.5. Samples: 10886640. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:12,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 09:09:12,134][49750] Updated weights for policy 0, policy_version 137821 (0.0035) [2024-04-26 09:09:15,552][49750] Updated weights for policy 0, policy_version 137831 (0.0037) [2024-04-26 09:09:17,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50517.4, 300 sec: 41732.8). Total num frames: 2258305024. Throughput: 0: 50145.5. Samples: 11181400. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:17,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 09:09:18,880][49750] Updated weights for policy 0, policy_version 137841 (0.0035) [2024-04-26 09:09:22,019][49750] Updated weights for policy 0, policy_version 137851 (0.0034) [2024-04-26 09:09:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 41870.2). Total num frames: 2258550784. Throughput: 0: 50046.1. Samples: 11341940. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:22,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 09:09:25,788][49750] Updated weights for policy 0, policy_version 137861 (0.0031) [2024-04-26 09:09:27,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 41943.0). Total num frames: 2258780160. Throughput: 0: 50152.6. Samples: 11647040. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:27,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 09:09:28,469][49750] Updated weights for policy 0, policy_version 137871 (0.0034) [2024-04-26 09:09:32,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49698.2, 300 sec: 42071.8). Total num frames: 2259025920. Throughput: 0: 49968.4. Samples: 11938160. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 09:09:32,248][49750] Updated weights for policy 0, policy_version 137881 (0.0027) [2024-04-26 09:09:35,006][49750] Updated weights for policy 0, policy_version 137891 (0.0030) [2024-04-26 09:09:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49425.0, 300 sec: 42253.5). Total num frames: 2259288064. Throughput: 0: 50253.9. Samples: 12097040. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:37,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 09:09:38,728][49750] Updated weights for policy 0, policy_version 137901 (0.0032) [2024-04-26 09:09:41,608][49750] Updated weights for policy 0, policy_version 137911 (0.0034) [2024-04-26 09:09:42,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.5, 300 sec: 42485.4). Total num frames: 2259566592. Throughput: 0: 49980.8. Samples: 12396260. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:42,071][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 09:09:45,394][49750] Updated weights for policy 0, policy_version 137921 (0.0034) [2024-04-26 09:09:47,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.2, 300 sec: 42431.8). Total num frames: 2259763200. Throughput: 0: 50061.2. Samples: 12699240. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:47,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 09:09:48,260][49750] Updated weights for policy 0, policy_version 137931 (0.0034) [2024-04-26 09:09:51,815][49750] Updated weights for policy 0, policy_version 137941 (0.0032) [2024-04-26 09:09:52,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49698.2, 300 sec: 43153.8). Total num frames: 2260025344. Throughput: 0: 49798.2. Samples: 12828800. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:52,063][49517] Avg episode reward: [(0, '0.421')] [2024-04-26 09:09:54,667][49728] Signal inference workers to stop experience collection... (200 times) [2024-04-26 09:09:54,706][49750] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-04-26 09:09:54,740][49728] Signal inference workers to resume experience collection... 
(200 times) [2024-04-26 09:09:54,747][49750] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-04-26 09:09:54,883][49750] Updated weights for policy 0, policy_version 137951 (0.0026) [2024-04-26 09:09:57,062][49517] Fps is (10 sec: 54067.0, 60 sec: 49971.2, 300 sec: 43709.2). Total num frames: 2260303872. Throughput: 0: 49880.4. Samples: 13131260. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:09:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 09:09:58,152][49750] Updated weights for policy 0, policy_version 137961 (0.0029) [2024-04-26 09:10:01,239][49750] Updated weights for policy 0, policy_version 137971 (0.0035) [2024-04-26 09:10:02,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.2, 300 sec: 44542.3). Total num frames: 2260566016. Throughput: 0: 50205.7. Samples: 13440660. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:10:02,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 09:10:04,580][49750] Updated weights for policy 0, policy_version 137981 (0.0040) [2024-04-26 09:10:07,062][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.2, 300 sec: 45153.2). Total num frames: 2260762624. Throughput: 0: 50019.1. Samples: 13592800. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-04-26 09:10:07,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 09:10:07,772][49750] Updated weights for policy 0, policy_version 137991 (0.0030) [2024-04-26 09:10:11,231][49750] Updated weights for policy 0, policy_version 138001 (0.0031) [2024-04-26 09:10:12,063][49517] Fps is (10 sec: 45875.3, 60 sec: 49698.1, 300 sec: 45986.3). Total num frames: 2261024768. Throughput: 0: 49955.4. Samples: 13895040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:12,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 09:10:14,327][49750] Updated weights for policy 0, policy_version 138011 (0.0029) [2024-04-26 09:10:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 49698.1, 300 sec: 46597.2). 
Total num frames: 2261286912. Throughput: 0: 50084.9. Samples: 14191980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:17,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 09:10:17,778][49750] Updated weights for policy 0, policy_version 138021 (0.0033) [2024-04-26 09:10:20,818][49750] Updated weights for policy 0, policy_version 138031 (0.0033) [2024-04-26 09:10:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 49971.1, 300 sec: 47097.1). Total num frames: 2261549056. Throughput: 0: 50028.4. Samples: 14348320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:22,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 09:10:24,220][49750] Updated weights for policy 0, policy_version 138041 (0.0029) [2024-04-26 09:10:27,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 47485.9). Total num frames: 2261794816. Throughput: 0: 50148.2. Samples: 14652920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 09:10:27,271][49750] Updated weights for policy 0, policy_version 138051 (0.0037) [2024-04-26 09:10:30,996][49750] Updated weights for policy 0, policy_version 138061 (0.0027) [2024-04-26 09:10:32,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.3, 300 sec: 47819.1). Total num frames: 2262024192. Throughput: 0: 50090.8. Samples: 14953320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:32,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 09:10:33,735][49750] Updated weights for policy 0, policy_version 138071 (0.0038) [2024-04-26 09:10:37,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 48263.4). Total num frames: 2262302720. Throughput: 0: 50245.7. Samples: 15089860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:37,063][49517] Avg episode reward: [(0, '0.456')] [2024-04-26 09:10:37,519][49750] Updated weights for policy 0, policy_version 138081 (0.0029) [2024-04-26 09:10:40,330][49750] Updated weights for policy 0, policy_version 138091 (0.0033) [2024-04-26 09:10:42,063][49517] Fps is (10 sec: 54064.9, 60 sec: 49971.0, 300 sec: 48596.6). Total num frames: 2262564864. Throughput: 0: 50374.8. Samples: 15398140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:42,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 09:10:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000138096_2262564864.pth... [2024-04-26 09:10:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000137365_2250588160.pth [2024-04-26 09:10:43,816][49750] Updated weights for policy 0, policy_version 138101 (0.0033) [2024-04-26 09:10:46,772][49750] Updated weights for policy 0, policy_version 138111 (0.0030) [2024-04-26 09:10:47,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 48652.2). Total num frames: 2262827008. Throughput: 0: 50314.8. Samples: 15704820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:47,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 09:10:50,364][49750] Updated weights for policy 0, policy_version 138121 (0.0033) [2024-04-26 09:10:52,062][49517] Fps is (10 sec: 47515.1, 60 sec: 50244.2, 300 sec: 48763.2). Total num frames: 2263040000. Throughput: 0: 50287.5. Samples: 15855740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:52,063][49517] Avg episode reward: [(0, '0.430')] [2024-04-26 09:10:53,228][49750] Updated weights for policy 0, policy_version 138131 (0.0042) [2024-04-26 09:10:56,946][49750] Updated weights for policy 0, policy_version 138141 (0.0033) [2024-04-26 09:10:57,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.3, 300 sec: 48929.9). 
Total num frames: 2263302144. Throughput: 0: 50231.3. Samples: 16155440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:10:57,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 09:10:59,885][49750] Updated weights for policy 0, policy_version 138151 (0.0031) [2024-04-26 09:11:02,063][49517] Fps is (10 sec: 52428.3, 60 sec: 49971.2, 300 sec: 49207.5). Total num frames: 2263564288. Throughput: 0: 50389.2. Samples: 16459500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:11:02,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 09:11:03,132][49728] Signal inference workers to stop experience collection... (250 times) [2024-04-26 09:11:03,132][49728] Signal inference workers to resume experience collection... (250 times) [2024-04-26 09:11:03,161][49750] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-04-26 09:11:03,161][49750] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-04-26 09:11:03,259][49750] Updated weights for policy 0, policy_version 138161 (0.0034) [2024-04-26 09:11:06,354][49750] Updated weights for policy 0, policy_version 138171 (0.0032) [2024-04-26 09:11:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 49429.7). Total num frames: 2263826432. Throughput: 0: 50410.4. Samples: 16616780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:11:07,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 09:11:09,949][49750] Updated weights for policy 0, policy_version 138181 (0.0040) [2024-04-26 09:11:12,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 49485.2). Total num frames: 2264055808. Throughput: 0: 50259.1. Samples: 16914580. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:11:12,063][49517] Avg episode reward: [(0, '0.463')] [2024-04-26 09:11:12,716][49750] Updated weights for policy 0, policy_version 138191 (0.0032) [2024-04-26 09:11:16,290][49750] Updated weights for policy 0, policy_version 138201 (0.0030) [2024-04-26 09:11:17,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 49540.8). Total num frames: 2264301568. Throughput: 0: 50255.0. Samples: 17214800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:11:17,063][49517] Avg episode reward: [(0, '0.432')] [2024-04-26 09:11:19,101][49750] Updated weights for policy 0, policy_version 138211 (0.0037) [2024-04-26 09:11:22,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 49707.4). Total num frames: 2264563712. Throughput: 0: 50469.9. Samples: 17361000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:11:22,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 09:11:22,706][49750] Updated weights for policy 0, policy_version 138221 (0.0030) [2024-04-26 09:11:25,585][49750] Updated weights for policy 0, policy_version 138231 (0.0033) [2024-04-26 09:11:27,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 49874.0). Total num frames: 2264825856. Throughput: 0: 50336.9. Samples: 17663280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:11:27,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 09:11:29,223][49750] Updated weights for policy 0, policy_version 138241 (0.0042) [2024-04-26 09:11:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50040.6). Total num frames: 2265088000. Throughput: 0: 50310.6. Samples: 17968800. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:11:32,063][49517] Avg episode reward: [(0, '0.427')] [2024-04-26 09:11:32,380][49750] Updated weights for policy 0, policy_version 138251 (0.0032) [2024-04-26 09:11:36,061][49750] Updated weights for policy 0, policy_version 138261 (0.0034) [2024-04-26 09:11:37,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2265317376. Throughput: 0: 50410.6. Samples: 18124220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:11:37,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 09:11:39,016][49750] Updated weights for policy 0, policy_version 138271 (0.0030) [2024-04-26 09:11:42,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.5, 300 sec: 50151.7). Total num frames: 2265579520. Throughput: 0: 50328.3. Samples: 18420220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:11:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 09:11:42,622][49750] Updated weights for policy 0, policy_version 138281 (0.0027) [2024-04-26 09:11:45,513][49750] Updated weights for policy 0, policy_version 138291 (0.0032) [2024-04-26 09:11:47,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2265841664. Throughput: 0: 50249.8. Samples: 18720740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:11:47,063][49517] Avg episode reward: [(0, '0.421')] [2024-04-26 09:11:49,141][49750] Updated weights for policy 0, policy_version 138301 (0.0037) [2024-04-26 09:11:51,884][49750] Updated weights for policy 0, policy_version 138311 (0.0033) [2024-04-26 09:11:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50151.7). Total num frames: 2266087424. Throughput: 0: 50249.7. Samples: 18878020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:11:52,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 09:11:55,705][49750] Updated weights for policy 0, policy_version 138321 (0.0033) [2024-04-26 09:11:57,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2266300416. Throughput: 0: 50193.3. Samples: 19173280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:11:57,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 09:11:58,507][49750] Updated weights for policy 0, policy_version 138331 (0.0040) [2024-04-26 09:12:02,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2266562560. Throughput: 0: 50143.5. Samples: 19471260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:12:02,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 09:12:02,235][49750] Updated weights for policy 0, policy_version 138341 (0.0033) [2024-04-26 09:12:05,199][49750] Updated weights for policy 0, policy_version 138351 (0.0031) [2024-04-26 09:12:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2266824704. Throughput: 0: 50147.5. Samples: 19617640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:12:07,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 09:12:08,733][49750] Updated weights for policy 0, policy_version 138361 (0.0028) [2024-04-26 09:12:11,594][49750] Updated weights for policy 0, policy_version 138371 (0.0034) [2024-04-26 09:12:12,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2267070464. Throughput: 0: 50172.9. Samples: 19921060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:12:12,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 09:12:15,468][49750] Updated weights for policy 0, policy_version 138381 (0.0032) [2024-04-26 09:12:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50262.8). 
Total num frames: 2267316224. Throughput: 0: 50137.4. Samples: 20224980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 09:12:17,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 09:12:17,086][49728] Signal inference workers to stop experience collection... (300 times) [2024-04-26 09:12:17,130][49750] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-04-26 09:12:17,153][49728] Signal inference workers to resume experience collection... (300 times) [2024-04-26 09:12:17,162][49750] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-04-26 09:12:17,985][49750] Updated weights for policy 0, policy_version 138391 (0.0031) [2024-04-26 09:12:21,893][49750] Updated weights for policy 0, policy_version 138401 (0.0029) [2024-04-26 09:12:22,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49971.0, 300 sec: 50207.2). Total num frames: 2267561984. Throughput: 0: 49899.5. Samples: 20369700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:12:22,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 09:12:24,594][49750] Updated weights for policy 0, policy_version 138411 (0.0032) [2024-04-26 09:12:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 50207.3). Total num frames: 2267807744. Throughput: 0: 50133.0. Samples: 20676200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:12:27,063][49517] Avg episode reward: [(0, '0.451')] [2024-04-26 09:12:28,326][49750] Updated weights for policy 0, policy_version 138421 (0.0028) [2024-04-26 09:12:31,120][49750] Updated weights for policy 0, policy_version 138431 (0.0035) [2024-04-26 09:12:32,062][49517] Fps is (10 sec: 50791.2, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2268069888. Throughput: 0: 49982.3. Samples: 20969940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:12:32,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 09:12:34,897][49750] Updated weights for policy 0, policy_version 138441 (0.0028) [2024-04-26 09:12:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2268315648. Throughput: 0: 49929.8. Samples: 21124860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:12:37,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 09:12:37,595][49750] Updated weights for policy 0, policy_version 138451 (0.0032) [2024-04-26 09:12:41,576][49750] Updated weights for policy 0, policy_version 138461 (0.0032) [2024-04-26 09:12:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 50151.7). Total num frames: 2268561408. Throughput: 0: 50100.5. Samples: 21427800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:12:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 09:12:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000138462_2268561408.pth... [2024-04-26 09:12:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000137727_2256519168.pth [2024-04-26 09:12:44,371][49750] Updated weights for policy 0, policy_version 138471 (0.0030) [2024-04-26 09:12:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 50040.6). Total num frames: 2268807168. Throughput: 0: 50072.0. Samples: 21724500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:12:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 09:12:47,996][49750] Updated weights for policy 0, policy_version 138481 (0.0032) [2024-04-26 09:12:50,778][49750] Updated weights for policy 0, policy_version 138491 (0.0029) [2024-04-26 09:12:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2269085696. Throughput: 0: 50202.7. Samples: 21876760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:12:52,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 09:12:54,555][49750] Updated weights for policy 0, policy_version 138501 (0.0030) [2024-04-26 09:12:57,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2269331456. Throughput: 0: 50163.9. Samples: 22178440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:12:57,063][49517] Avg episode reward: [(0, '0.446')] [2024-04-26 09:12:57,433][49750] Updated weights for policy 0, policy_version 138511 (0.0032) [2024-04-26 09:13:01,125][49750] Updated weights for policy 0, policy_version 138521 (0.0031) [2024-04-26 09:13:02,063][49517] Fps is (10 sec: 45874.7, 60 sec: 49698.1, 300 sec: 50096.1). Total num frames: 2269544448. Throughput: 0: 49989.2. Samples: 22474500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:13:02,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 09:13:04,000][49750] Updated weights for policy 0, policy_version 138531 (0.0032) [2024-04-26 09:13:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.0, 300 sec: 50151.7). Total num frames: 2269822976. Throughput: 0: 49944.0. Samples: 22617180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:13:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 09:13:07,664][49750] Updated weights for policy 0, policy_version 138541 (0.0031) [2024-04-26 09:13:10,651][49750] Updated weights for policy 0, policy_version 138551 (0.0035) [2024-04-26 09:13:12,063][49517] Fps is (10 sec: 52428.9, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2270068736. Throughput: 0: 49739.0. Samples: 22914460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:13:12,063][49517] Avg episode reward: [(0, '0.457')] [2024-04-26 09:13:14,335][49750] Updated weights for policy 0, policy_version 138561 (0.0032) [2024-04-26 09:13:17,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.1, 300 sec: 50318.3). 
Total num frames: 2270330880. Throughput: 0: 49895.9. Samples: 23215260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 09:13:17,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 09:13:17,230][49750] Updated weights for policy 0, policy_version 138571 (0.0034) [2024-04-26 09:13:20,997][49750] Updated weights for policy 0, policy_version 138581 (0.0033) [2024-04-26 09:13:22,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.3, 300 sec: 50207.2). Total num frames: 2270560256. Throughput: 0: 49838.2. Samples: 23367580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:13:22,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 09:13:23,686][49750] Updated weights for policy 0, policy_version 138591 (0.0035) [2024-04-26 09:13:27,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2270806016. Throughput: 0: 49757.6. Samples: 23666900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:13:27,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 09:13:27,666][49750] Updated weights for policy 0, policy_version 138601 (0.0027) [2024-04-26 09:13:30,070][49750] Updated weights for policy 0, policy_version 138611 (0.0038) [2024-04-26 09:13:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2271068160. Throughput: 0: 49883.1. Samples: 23969240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:13:32,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 09:13:34,159][49750] Updated weights for policy 0, policy_version 138621 (0.0033) [2024-04-26 09:13:34,890][49728] Signal inference workers to stop experience collection... (350 times) [2024-04-26 09:13:34,934][49750] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-04-26 09:13:35,001][49728] Signal inference workers to resume experience collection... 
(350 times) [2024-04-26 09:13:35,002][49750] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-04-26 09:13:36,591][49750] Updated weights for policy 0, policy_version 138631 (0.0038) [2024-04-26 09:13:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2271330304. Throughput: 0: 50046.2. Samples: 24128840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:13:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 09:13:40,553][49750] Updated weights for policy 0, policy_version 138641 (0.0033) [2024-04-26 09:13:42,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2271576064. Throughput: 0: 50006.2. Samples: 24428720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:13:42,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 09:13:43,215][49750] Updated weights for policy 0, policy_version 138651 (0.0040) [2024-04-26 09:13:47,063][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2271805440. Throughput: 0: 49988.4. Samples: 24723980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:13:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 09:13:47,421][49750] Updated weights for policy 0, policy_version 138661 (0.0032) [2024-04-26 09:13:49,732][49750] Updated weights for policy 0, policy_version 138671 (0.0031) [2024-04-26 09:13:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.1, 300 sec: 50096.1). Total num frames: 2272083968. Throughput: 0: 50005.9. Samples: 24867440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:13:52,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 09:13:53,951][49750] Updated weights for policy 0, policy_version 138681 (0.0031) [2024-04-26 09:13:56,343][49750] Updated weights for policy 0, policy_version 138691 (0.0030) [2024-04-26 09:13:57,062][49517] Fps is (10 sec: 52429.5, 60 sec: 49971.3, 300 sec: 50207.2). 
Total num frames: 2272329728. Throughput: 0: 50047.2. Samples: 25166580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:13:57,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 09:14:00,570][49750] Updated weights for policy 0, policy_version 138701 (0.0031) [2024-04-26 09:14:02,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50207.2). Total num frames: 2272575488. Throughput: 0: 50137.4. Samples: 25471440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:14:02,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 09:14:02,829][49750] Updated weights for policy 0, policy_version 138711 (0.0031) [2024-04-26 09:14:07,062][49517] Fps is (10 sec: 45875.0, 60 sec: 49425.2, 300 sec: 49985.1). Total num frames: 2272788480. Throughput: 0: 50057.8. Samples: 25620180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:14:07,063][49517] Avg episode reward: [(0, '0.427')] [2024-04-26 09:14:07,103][49750] Updated weights for policy 0, policy_version 138721 (0.0036) [2024-04-26 09:14:09,511][49750] Updated weights for policy 0, policy_version 138731 (0.0036) [2024-04-26 09:14:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2273067008. Throughput: 0: 50058.4. Samples: 25919520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:14:12,063][49517] Avg episode reward: [(0, '0.443')] [2024-04-26 09:14:13,574][49750] Updated weights for policy 0, policy_version 138741 (0.0038) [2024-04-26 09:14:16,013][49750] Updated weights for policy 0, policy_version 138751 (0.0031) [2024-04-26 09:14:17,063][49517] Fps is (10 sec: 52428.4, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2273312768. Throughput: 0: 49962.6. Samples: 26217560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:14:17,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 09:14:20,103][49750] Updated weights for policy 0, policy_version 138761 (0.0030) [2024-04-26 09:14:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2273591296. Throughput: 0: 49858.6. Samples: 26372480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:14:22,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 09:14:22,674][49750] Updated weights for policy 0, policy_version 138771 (0.0033) [2024-04-26 09:14:26,660][49750] Updated weights for policy 0, policy_version 138781 (0.0039) [2024-04-26 09:14:27,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2273787904. Throughput: 0: 49903.5. Samples: 26674380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:14:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 09:14:29,188][49750] Updated weights for policy 0, policy_version 138791 (0.0027) [2024-04-26 09:14:30,660][49728] Signal inference workers to stop experience collection... (400 times) [2024-04-26 09:14:30,660][49728] Signal inference workers to resume experience collection... (400 times) [2024-04-26 09:14:30,673][49750] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-04-26 09:14:30,673][49750] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-04-26 09:14:32,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2274050048. Throughput: 0: 49935.7. Samples: 26971080. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:14:32,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 09:14:33,240][49750] Updated weights for policy 0, policy_version 138801 (0.0031) [2024-04-26 09:14:35,562][49750] Updated weights for policy 0, policy_version 138811 (0.0034) [2024-04-26 09:14:37,063][49517] Fps is (10 sec: 54067.5, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2274328576. Throughput: 0: 49954.2. Samples: 27115380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:14:37,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 09:14:39,772][49750] Updated weights for policy 0, policy_version 138821 (0.0030) [2024-04-26 09:14:42,047][49750] Updated weights for policy 0, policy_version 138831 (0.0032) [2024-04-26 09:14:42,063][49517] Fps is (10 sec: 55705.1, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2274607104. Throughput: 0: 50162.1. Samples: 27423880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:14:42,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 09:14:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000138831_2274607104.pth... [2024-04-26 09:14:42,114][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000138096_2262564864.pth [2024-04-26 09:14:46,558][49750] Updated weights for policy 0, policy_version 138841 (0.0030) [2024-04-26 09:14:47,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2274803712. Throughput: 0: 50202.8. Samples: 27730560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:14:47,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 09:14:48,705][49750] Updated weights for policy 0, policy_version 138851 (0.0030) [2024-04-26 09:14:52,062][49517] Fps is (10 sec: 44237.1, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2275049472. Throughput: 0: 49869.8. Samples: 27864320. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:14:52,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 09:14:53,019][49750] Updated weights for policy 0, policy_version 138861 (0.0035) [2024-04-26 09:14:55,247][49750] Updated weights for policy 0, policy_version 138871 (0.0038) [2024-04-26 09:14:57,062][49517] Fps is (10 sec: 52428.4, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2275328000. Throughput: 0: 49786.6. Samples: 28159920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:14:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 09:14:59,478][49750] Updated weights for policy 0, policy_version 138881 (0.0030) [2024-04-26 09:15:01,765][49750] Updated weights for policy 0, policy_version 138891 (0.0032) [2024-04-26 09:15:02,063][49517] Fps is (10 sec: 54067.0, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2275590144. Throughput: 0: 49877.8. Samples: 28462060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:15:02,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 09:15:06,017][49750] Updated weights for policy 0, policy_version 138901 (0.0032) [2024-04-26 09:15:07,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50151.7). Total num frames: 2275819520. Throughput: 0: 49886.6. Samples: 28617380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:15:07,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 09:15:08,633][49750] Updated weights for policy 0, policy_version 138911 (0.0027) [2024-04-26 09:15:12,062][49517] Fps is (10 sec: 44237.5, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2276032512. Throughput: 0: 49771.4. Samples: 28914080. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:15:12,063][49517] Avg episode reward: [(0, '0.467')] [2024-04-26 09:15:12,710][49750] Updated weights for policy 0, policy_version 138921 (0.0030) [2024-04-26 09:15:15,053][49750] Updated weights for policy 0, policy_version 138931 (0.0027) [2024-04-26 09:15:17,062][49517] Fps is (10 sec: 49153.2, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2276311040. Throughput: 0: 49888.5. Samples: 29216060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:15:17,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 09:15:19,201][49750] Updated weights for policy 0, policy_version 138941 (0.0033) [2024-04-26 09:15:21,484][49750] Updated weights for policy 0, policy_version 138951 (0.0036) [2024-04-26 09:15:22,063][49517] Fps is (10 sec: 54062.6, 60 sec: 49697.6, 300 sec: 50096.0). Total num frames: 2276573184. Throughput: 0: 49987.2. Samples: 29364840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 09:15:22,064][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 09:15:25,692][49750] Updated weights for policy 0, policy_version 138961 (0.0031) [2024-04-26 09:15:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.6, 300 sec: 50207.2). Total num frames: 2276835328. Throughput: 0: 49974.0. Samples: 29672700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:15:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 09:15:28,134][49750] Updated weights for policy 0, policy_version 138971 (0.0036) [2024-04-26 09:15:32,063][49517] Fps is (10 sec: 45877.8, 60 sec: 49698.0, 300 sec: 49929.5). Total num frames: 2277031936. Throughput: 0: 49875.3. Samples: 29974960. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:15:32,063][49517] Avg episode reward: [(0, '0.453')] [2024-04-26 09:15:32,330][49750] Updated weights for policy 0, policy_version 138981 (0.0031) [2024-04-26 09:15:34,712][49750] Updated weights for policy 0, policy_version 138991 (0.0032) [2024-04-26 09:15:37,063][49517] Fps is (10 sec: 47512.3, 60 sec: 49698.0, 300 sec: 49985.1). Total num frames: 2277310464. Throughput: 0: 49905.6. Samples: 30110080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:15:37,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 09:15:38,639][49728] Signal inference workers to stop experience collection... (450 times) [2024-04-26 09:15:38,663][49750] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-04-26 09:15:38,753][49728] Signal inference workers to resume experience collection... (450 times) [2024-04-26 09:15:38,753][49750] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-04-26 09:15:38,893][49750] Updated weights for policy 0, policy_version 139001 (0.0037) [2024-04-26 09:15:41,181][49750] Updated weights for policy 0, policy_version 139011 (0.0028) [2024-04-26 09:15:42,063][49517] Fps is (10 sec: 54067.8, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2277572608. Throughput: 0: 49924.8. Samples: 30406540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:15:42,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 09:15:45,445][49750] Updated weights for policy 0, policy_version 139021 (0.0035) [2024-04-26 09:15:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.0, 300 sec: 50096.1). Total num frames: 2277818368. Throughput: 0: 49993.1. Samples: 30711760. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:15:47,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 09:15:47,796][49750] Updated weights for policy 0, policy_version 139031 (0.0032) [2024-04-26 09:15:51,836][49750] Updated weights for policy 0, policy_version 139041 (0.0033) [2024-04-26 09:15:52,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2278047744. Throughput: 0: 49932.5. Samples: 30864340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:15:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 09:15:54,544][49750] Updated weights for policy 0, policy_version 139051 (0.0030) [2024-04-26 09:15:57,062][49517] Fps is (10 sec: 47514.9, 60 sec: 49425.1, 300 sec: 49929.6). Total num frames: 2278293504. Throughput: 0: 49820.4. Samples: 31156000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:15:57,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 09:15:58,450][49750] Updated weights for policy 0, policy_version 139061 (0.0031) [2024-04-26 09:16:01,058][49750] Updated weights for policy 0, policy_version 139071 (0.0031) [2024-04-26 09:16:02,063][49517] Fps is (10 sec: 52428.6, 60 sec: 49698.1, 300 sec: 49985.0). Total num frames: 2278572032. Throughput: 0: 49873.5. Samples: 31460380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:16:02,063][49517] Avg episode reward: [(0, '0.437')] [2024-04-26 09:16:05,013][49750] Updated weights for policy 0, policy_version 139081 (0.0037) [2024-04-26 09:16:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 49971.4, 300 sec: 50040.6). Total num frames: 2278817792. Throughput: 0: 50061.4. Samples: 31617560. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:16:07,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 09:16:08,095][49750] Updated weights for policy 0, policy_version 139091 (0.0029) [2024-04-26 09:16:11,585][49750] Updated weights for policy 0, policy_version 139101 (0.0030) [2024-04-26 09:16:12,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2279047168. Throughput: 0: 49893.3. Samples: 31917900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:16:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 09:16:14,736][49750] Updated weights for policy 0, policy_version 139111 (0.0030) [2024-04-26 09:16:17,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 2279292928. Throughput: 0: 49732.7. Samples: 32212920. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:16:17,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 09:16:18,107][49750] Updated weights for policy 0, policy_version 139121 (0.0033) [2024-04-26 09:16:21,179][49750] Updated weights for policy 0, policy_version 139131 (0.0034) [2024-04-26 09:16:22,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49698.7, 300 sec: 49929.5). Total num frames: 2279555072. Throughput: 0: 49723.7. Samples: 32347640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:16:22,063][49517] Avg episode reward: [(0, '0.422')] [2024-04-26 09:16:24,582][49750] Updated weights for policy 0, policy_version 139141 (0.0037) [2024-04-26 09:16:27,062][49517] Fps is (10 sec: 54066.9, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2279833600. Throughput: 0: 49937.9. Samples: 32653740. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 09:16:27,063][49517] Avg episode reward: [(0, '0.439')] [2024-04-26 09:16:27,571][49750] Updated weights for policy 0, policy_version 139151 (0.0030) [2024-04-26 09:16:31,085][49750] Updated weights for policy 0, policy_version 139161 (0.0032) [2024-04-26 09:16:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.5, 300 sec: 49985.1). Total num frames: 2280062976. Throughput: 0: 49903.8. Samples: 32957420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:16:32,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 09:16:34,182][49750] Updated weights for policy 0, policy_version 139171 (0.0029) [2024-04-26 09:16:37,062][49517] Fps is (10 sec: 44236.9, 60 sec: 49425.2, 300 sec: 49818.5). Total num frames: 2280275968. Throughput: 0: 49866.3. Samples: 33108320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:16:37,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 09:16:37,853][49750] Updated weights for policy 0, policy_version 139181 (0.0029) [2024-04-26 09:16:38,441][49728] Signal inference workers to stop experience collection... (500 times) [2024-04-26 09:16:38,441][49728] Signal inference workers to resume experience collection... (500 times) [2024-04-26 09:16:38,472][49750] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-04-26 09:16:38,472][49750] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-04-26 09:16:40,801][49750] Updated weights for policy 0, policy_version 139191 (0.0035) [2024-04-26 09:16:42,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49698.0, 300 sec: 49874.0). Total num frames: 2280554496. Throughput: 0: 49891.7. Samples: 33401140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:16:42,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 09:16:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000139194_2280554496.pth... 
[2024-04-26 09:16:42,128][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000138462_2268561408.pth [2024-04-26 09:16:44,423][49750] Updated weights for policy 0, policy_version 139201 (0.0029) [2024-04-26 09:16:47,063][49517] Fps is (10 sec: 54066.9, 60 sec: 49971.4, 300 sec: 49929.5). Total num frames: 2280816640. Throughput: 0: 49579.2. Samples: 33691440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:16:47,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 09:16:47,301][49750] Updated weights for policy 0, policy_version 139211 (0.0033) [2024-04-26 09:16:51,018][49750] Updated weights for policy 0, policy_version 139221 (0.0032) [2024-04-26 09:16:52,062][49517] Fps is (10 sec: 49153.0, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2281046016. Throughput: 0: 49764.3. Samples: 33856960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:16:52,063][49517] Avg episode reward: [(0, '0.410')] [2024-04-26 09:16:53,707][49750] Updated weights for policy 0, policy_version 139231 (0.0032) [2024-04-26 09:16:57,063][49517] Fps is (10 sec: 45875.1, 60 sec: 49698.0, 300 sec: 49874.0). Total num frames: 2281275392. Throughput: 0: 49645.7. Samples: 34151960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:16:57,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 09:16:57,520][49750] Updated weights for policy 0, policy_version 139241 (0.0030) [2024-04-26 09:17:00,429][49750] Updated weights for policy 0, policy_version 139251 (0.0036) [2024-04-26 09:17:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49425.2, 300 sec: 49874.0). Total num frames: 2281537536. Throughput: 0: 49693.8. Samples: 34449140. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:17:02,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 09:17:04,084][49750] Updated weights for policy 0, policy_version 139261 (0.0034) [2024-04-26 09:17:07,063][49517] Fps is (10 sec: 52428.9, 60 sec: 49698.0, 300 sec: 49929.5). Total num frames: 2281799680. Throughput: 0: 50000.9. Samples: 34597680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:17:07,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 09:17:07,265][49750] Updated weights for policy 0, policy_version 139271 (0.0032) [2024-04-26 09:17:10,617][49750] Updated weights for policy 0, policy_version 139281 (0.0033) [2024-04-26 09:17:12,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.4, 300 sec: 50040.6). Total num frames: 2282078208. Throughput: 0: 49891.6. Samples: 34898860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:17:12,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 09:17:13,716][49750] Updated weights for policy 0, policy_version 139291 (0.0031) [2024-04-26 09:17:17,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49971.2, 300 sec: 49929.6). Total num frames: 2282291200. Throughput: 0: 49878.3. Samples: 35201940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:17:17,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 09:17:17,181][49750] Updated weights for policy 0, policy_version 139301 (0.0030) [2024-04-26 09:17:20,398][49750] Updated weights for policy 0, policy_version 139311 (0.0038) [2024-04-26 09:17:22,063][49517] Fps is (10 sec: 44236.1, 60 sec: 49425.0, 300 sec: 49874.0). Total num frames: 2282520576. Throughput: 0: 49574.5. Samples: 35339180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:17:22,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 09:17:23,615][49750] Updated weights for policy 0, policy_version 139321 (0.0035) [2024-04-26 09:17:27,062][49517] Fps is (10 sec: 49151.4, 60 sec: 49152.0, 300 sec: 49874.0). 
Total num frames: 2282782720. Throughput: 0: 49800.2. Samples: 35642140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:17:27,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 09:17:27,097][49750] Updated weights for policy 0, policy_version 139331 (0.0034) [2024-04-26 09:17:30,185][49750] Updated weights for policy 0, policy_version 139341 (0.0031) [2024-04-26 09:17:32,063][49517] Fps is (10 sec: 54067.4, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2283061248. Throughput: 0: 49939.5. Samples: 35938720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:17:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 09:17:33,517][49750] Updated weights for policy 0, policy_version 139351 (0.0031) [2024-04-26 09:17:36,869][49750] Updated weights for policy 0, policy_version 139361 (0.0029) [2024-04-26 09:17:37,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.1, 300 sec: 49929.5). Total num frames: 2283290624. Throughput: 0: 49934.9. Samples: 36104040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:17:37,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 09:17:39,944][49750] Updated weights for policy 0, policy_version 139371 (0.0034) [2024-04-26 09:17:42,063][49517] Fps is (10 sec: 44236.5, 60 sec: 49152.1, 300 sec: 49818.4). Total num frames: 2283503616. Throughput: 0: 49935.9. Samples: 36399080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:17:42,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 09:17:43,525][49750] Updated weights for policy 0, policy_version 139381 (0.0034) [2024-04-26 09:17:46,495][49750] Updated weights for policy 0, policy_version 139391 (0.0037) [2024-04-26 09:17:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 49425.1, 300 sec: 49818.5). Total num frames: 2283782144. Throughput: 0: 49911.5. Samples: 36695160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:17:47,063][49517] Avg episode reward: [(0, '0.442')] [2024-04-26 09:17:49,830][49728] Signal inference workers to stop experience collection... (550 times) [2024-04-26 09:17:49,830][49728] Signal inference workers to resume experience collection... (550 times) [2024-04-26 09:17:49,842][49750] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-04-26 09:17:49,862][49750] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-04-26 09:17:49,961][49750] Updated weights for policy 0, policy_version 139401 (0.0033) [2024-04-26 09:17:52,063][49517] Fps is (10 sec: 55704.8, 60 sec: 50244.0, 300 sec: 49929.5). Total num frames: 2284060672. Throughput: 0: 50073.5. Samples: 36851000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:17:52,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 09:17:53,058][49750] Updated weights for policy 0, policy_version 139411 (0.0037) [2024-04-26 09:17:56,580][49750] Updated weights for policy 0, policy_version 139421 (0.0030) [2024-04-26 09:17:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.3, 300 sec: 50040.6). Total num frames: 2284306432. Throughput: 0: 50058.0. Samples: 37151480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:17:57,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 09:17:59,532][49750] Updated weights for policy 0, policy_version 139431 (0.0031) [2024-04-26 09:18:02,062][49517] Fps is (10 sec: 47515.0, 60 sec: 49971.2, 300 sec: 49874.0). Total num frames: 2284535808. Throughput: 0: 49989.3. Samples: 37451460. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:18:02,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 09:18:03,138][49750] Updated weights for policy 0, policy_version 139441 (0.0029) [2024-04-26 09:18:06,206][49750] Updated weights for policy 0, policy_version 139451 (0.0037) [2024-04-26 09:18:07,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49698.2, 300 sec: 49874.0). Total num frames: 2284781568. Throughput: 0: 50002.8. Samples: 37589300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:18:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 09:18:09,582][49750] Updated weights for policy 0, policy_version 139461 (0.0030) [2024-04-26 09:18:12,062][49517] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 49929.6). Total num frames: 2285060096. Throughput: 0: 49978.3. Samples: 37891160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:18:12,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 09:18:12,954][49750] Updated weights for policy 0, policy_version 139471 (0.0037) [2024-04-26 09:18:15,999][49750] Updated weights for policy 0, policy_version 139481 (0.0034) [2024-04-26 09:18:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 2285305856. Throughput: 0: 50116.2. Samples: 38193940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:18:17,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 09:18:19,372][49750] Updated weights for policy 0, policy_version 139491 (0.0036) [2024-04-26 09:18:22,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.4, 300 sec: 49929.6). Total num frames: 2285535232. Throughput: 0: 49819.4. Samples: 38345900. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:18:22,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 09:18:22,564][49750] Updated weights for policy 0, policy_version 139501 (0.0030) [2024-04-26 09:18:25,817][49750] Updated weights for policy 0, policy_version 139511 (0.0028) [2024-04-26 09:18:27,063][49517] Fps is (10 sec: 45874.3, 60 sec: 49698.1, 300 sec: 49818.5). Total num frames: 2285764608. Throughput: 0: 49952.5. Samples: 38646940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:18:27,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 09:18:29,126][49750] Updated weights for policy 0, policy_version 139521 (0.0032) [2024-04-26 09:18:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 49874.0). Total num frames: 2286043136. Throughput: 0: 49886.3. Samples: 38940040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 09:18:32,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 09:18:32,392][49750] Updated weights for policy 0, policy_version 139531 (0.0030) [2024-04-26 09:18:35,655][49750] Updated weights for policy 0, policy_version 139541 (0.0029) [2024-04-26 09:18:37,062][49517] Fps is (10 sec: 55707.0, 60 sec: 50517.6, 300 sec: 49985.1). Total num frames: 2286321664. Throughput: 0: 50195.6. Samples: 39109780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:18:37,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 09:18:38,976][49750] Updated weights for policy 0, policy_version 139551 (0.0036) [2024-04-26 09:18:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 49985.1). Total num frames: 2286551040. Throughput: 0: 50069.4. Samples: 39404600. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:18:42,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 09:18:42,172][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000139561_2286567424.pth... 
[2024-04-26 09:18:42,175][49750] Updated weights for policy 0, policy_version 139561 (0.0038) [2024-04-26 09:18:42,214][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000138831_2274607104.pth [2024-04-26 09:18:45,428][49750] Updated weights for policy 0, policy_version 139571 (0.0041) [2024-04-26 09:18:47,063][49517] Fps is (10 sec: 44235.7, 60 sec: 49698.0, 300 sec: 49762.9). Total num frames: 2286764032. Throughput: 0: 50028.7. Samples: 39702760. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:18:47,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 09:18:48,754][49750] Updated weights for policy 0, policy_version 139581 (0.0030) [2024-04-26 09:18:51,959][49750] Updated weights for policy 0, policy_version 139591 (0.0028) [2024-04-26 09:18:52,063][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.4, 300 sec: 49929.5). Total num frames: 2287058944. Throughput: 0: 50125.3. Samples: 39844940. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:18:52,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 09:18:54,867][49728] Signal inference workers to stop experience collection... (600 times) [2024-04-26 09:18:54,913][49750] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-04-26 09:18:54,977][49728] Signal inference workers to resume experience collection... (600 times) [2024-04-26 09:18:54,977][49750] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-04-26 09:18:55,239][49750] Updated weights for policy 0, policy_version 139601 (0.0032) [2024-04-26 09:18:57,062][49517] Fps is (10 sec: 55706.2, 60 sec: 50244.4, 300 sec: 49985.1). Total num frames: 2287321088. Throughput: 0: 50130.7. Samples: 40147040. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:18:57,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 09:18:58,420][49750] Updated weights for policy 0, policy_version 139611 (0.0036) [2024-04-26 09:19:01,747][49750] Updated weights for policy 0, policy_version 139621 (0.0040) [2024-04-26 09:19:02,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50096.2). Total num frames: 2287566848. Throughput: 0: 50177.2. Samples: 40451920. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:19:02,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 09:19:04,875][49750] Updated weights for policy 0, policy_version 139631 (0.0036) [2024-04-26 09:19:07,062][49517] Fps is (10 sec: 44237.2, 60 sec: 49698.2, 300 sec: 49818.5). Total num frames: 2287763456. Throughput: 0: 50074.3. Samples: 40599240. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:19:07,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 09:19:08,171][49750] Updated weights for policy 0, policy_version 139641 (0.0033) [2024-04-26 09:19:11,571][49750] Updated weights for policy 0, policy_version 139651 (0.0034) [2024-04-26 09:19:12,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49698.2, 300 sec: 49929.6). Total num frames: 2288041984. Throughput: 0: 50116.6. Samples: 40902180. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:19:12,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 09:19:14,922][49750] Updated weights for policy 0, policy_version 139661 (0.0033) [2024-04-26 09:19:17,062][49517] Fps is (10 sec: 54066.8, 60 sec: 49971.1, 300 sec: 49874.0). Total num frames: 2288304128. Throughput: 0: 50192.4. Samples: 41198700. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:19:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 09:19:18,205][49750] Updated weights for policy 0, policy_version 139671 (0.0031) [2024-04-26 09:19:21,351][49750] Updated weights for policy 0, policy_version 139681 (0.0030) [2024-04-26 09:19:22,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.4, 300 sec: 50151.7). Total num frames: 2288582656. Throughput: 0: 50131.0. Samples: 41365680. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:19:22,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 09:19:24,610][49750] Updated weights for policy 0, policy_version 139691 (0.0035) [2024-04-26 09:19:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 49985.1). Total num frames: 2288795648. Throughput: 0: 50132.1. Samples: 41660540. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:19:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 09:19:27,885][49750] Updated weights for policy 0, policy_version 139701 (0.0033) [2024-04-26 09:19:31,218][49750] Updated weights for policy 0, policy_version 139711 (0.0032) [2024-04-26 09:19:32,063][49517] Fps is (10 sec: 44236.0, 60 sec: 49698.0, 300 sec: 49818.5). Total num frames: 2289025024. Throughput: 0: 49983.5. Samples: 41952020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:19:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 09:19:34,476][49750] Updated weights for policy 0, policy_version 139721 (0.0032) [2024-04-26 09:19:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 49971.0, 300 sec: 49874.0). Total num frames: 2289319936. Throughput: 0: 50123.4. Samples: 42100500. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-04-26 09:19:37,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 09:19:37,745][49750] Updated weights for policy 0, policy_version 139731 (0.0028) [2024-04-26 09:19:40,997][49750] Updated weights for policy 0, policy_version 139741 (0.0034) [2024-04-26 09:19:42,062][49517] Fps is (10 sec: 54068.3, 60 sec: 50244.4, 300 sec: 50040.6). Total num frames: 2289565696. Throughput: 0: 50238.3. Samples: 42407760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:19:42,063][49517] Avg episode reward: [(0, '0.446')] [2024-04-26 09:19:44,150][49750] Updated weights for policy 0, policy_version 139751 (0.0030) [2024-04-26 09:19:47,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.5, 300 sec: 49985.1). Total num frames: 2289795072. Throughput: 0: 50223.2. Samples: 42711960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:19:47,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 09:19:47,534][49750] Updated weights for policy 0, policy_version 139761 (0.0033) [2024-04-26 09:19:50,561][49750] Updated weights for policy 0, policy_version 139771 (0.0029) [2024-04-26 09:19:52,063][49517] Fps is (10 sec: 45874.4, 60 sec: 49425.0, 300 sec: 49818.5). Total num frames: 2290024448. Throughput: 0: 49962.0. Samples: 42847540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:19:52,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 09:19:53,798][49728] Signal inference workers to stop experience collection... (650 times) [2024-04-26 09:19:53,836][49750] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-04-26 09:19:53,869][49728] Signal inference workers to resume experience collection... 
(650 times) [2024-04-26 09:19:53,870][49750] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-04-26 09:19:54,003][49750] Updated weights for policy 0, policy_version 139781 (0.0034) [2024-04-26 09:19:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 49929.6). Total num frames: 2290319360. Throughput: 0: 50092.4. Samples: 43156340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:19:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 09:19:57,157][49750] Updated weights for policy 0, policy_version 139791 (0.0036) [2024-04-26 09:20:00,619][49750] Updated weights for policy 0, policy_version 139801 (0.0032) [2024-04-26 09:20:02,063][49517] Fps is (10 sec: 54067.6, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2290565120. Throughput: 0: 50081.7. Samples: 43452380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:20:02,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 09:20:03,643][49750] Updated weights for policy 0, policy_version 139811 (0.0036) [2024-04-26 09:20:07,010][49750] Updated weights for policy 0, policy_version 139821 (0.0025) [2024-04-26 09:20:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50151.7). Total num frames: 2290827264. Throughput: 0: 49878.6. Samples: 43610220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:20:07,063][49517] Avg episode reward: [(0, '0.434')] [2024-04-26 09:20:10,342][49750] Updated weights for policy 0, policy_version 139831 (0.0032) [2024-04-26 09:20:12,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49698.1, 300 sec: 49874.0). Total num frames: 2291023872. Throughput: 0: 49772.9. Samples: 43900320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:20:12,063][49517] Avg episode reward: [(0, '0.452')] [2024-04-26 09:20:13,557][49750] Updated weights for policy 0, policy_version 139841 (0.0030) [2024-04-26 09:20:17,063][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.1, 300 sec: 49929.7). 
Total num frames: 2291302400. Throughput: 0: 49982.3. Samples: 44201220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:20:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 09:20:17,718][49750] Updated weights for policy 0, policy_version 139851 (0.0035) [2024-04-26 09:20:20,220][49750] Updated weights for policy 0, policy_version 139861 (0.0031) [2024-04-26 09:20:22,062][49517] Fps is (10 sec: 54067.3, 60 sec: 49698.2, 300 sec: 49929.5). Total num frames: 2291564544. Throughput: 0: 50138.5. Samples: 44356720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:20:22,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 09:20:24,511][49750] Updated weights for policy 0, policy_version 139871 (0.0029) [2024-04-26 09:20:26,690][49750] Updated weights for policy 0, policy_version 139881 (0.0031) [2024-04-26 09:20:27,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2291826688. Throughput: 0: 50048.4. Samples: 44659940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:20:27,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 09:20:30,970][49750] Updated weights for policy 0, policy_version 139891 (0.0030) [2024-04-26 09:20:32,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.4, 300 sec: 49874.0). Total num frames: 2292023296. Throughput: 0: 50139.2. Samples: 44968220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:20:32,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 09:20:33,157][49750] Updated weights for policy 0, policy_version 139901 (0.0036) [2024-04-26 09:20:37,063][49517] Fps is (10 sec: 45874.9, 60 sec: 49425.2, 300 sec: 49874.0). Total num frames: 2292285440. Throughput: 0: 50145.4. Samples: 45104080. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 09:20:37,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 09:20:37,360][49750] Updated weights for policy 0, policy_version 139911 (0.0038) [2024-04-26 09:20:39,752][49750] Updated weights for policy 0, policy_version 139921 (0.0029) [2024-04-26 09:20:42,063][49517] Fps is (10 sec: 55704.3, 60 sec: 50244.1, 300 sec: 50040.6). Total num frames: 2292580352. Throughput: 0: 49922.5. Samples: 45402860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:20:42,063][49517] Avg episode reward: [(0, '0.457')] [2024-04-26 09:20:42,161][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000139929_2292596736.pth... [2024-04-26 09:20:42,207][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000139194_2280554496.pth [2024-04-26 09:20:43,756][49750] Updated weights for policy 0, policy_version 139931 (0.0032) [2024-04-26 09:20:46,431][49750] Updated weights for policy 0, policy_version 139941 (0.0032) [2024-04-26 09:20:47,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2292809728. Throughput: 0: 50006.6. Samples: 45702680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:20:47,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 09:20:47,564][49728] Signal inference workers to stop experience collection... (700 times) [2024-04-26 09:20:47,593][49750] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-04-26 09:20:47,681][49728] Signal inference workers to resume experience collection... (700 times) [2024-04-26 09:20:47,681][49750] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-04-26 09:20:50,209][49750] Updated weights for policy 0, policy_version 139951 (0.0030) [2024-04-26 09:20:52,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50040.6). Total num frames: 2293055488. Throughput: 0: 49840.7. Samples: 45853060. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:20:52,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 09:20:52,919][49750] Updated weights for policy 0, policy_version 139961 (0.0029) [2024-04-26 09:20:56,692][49750] Updated weights for policy 0, policy_version 139971 (0.0032) [2024-04-26 09:20:57,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49425.1, 300 sec: 49874.0). Total num frames: 2293284864. Throughput: 0: 50004.0. Samples: 46150500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:20:57,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 09:20:59,421][49750] Updated weights for policy 0, policy_version 139981 (0.0028) [2024-04-26 09:21:02,062][49517] Fps is (10 sec: 50791.6, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2293563392. Throughput: 0: 50079.3. Samples: 46454780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:21:02,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 09:21:03,089][49750] Updated weights for policy 0, policy_version 139991 (0.0033) [2024-04-26 09:21:06,174][49750] Updated weights for policy 0, policy_version 140001 (0.0031) [2024-04-26 09:21:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2293809152. Throughput: 0: 50100.0. Samples: 46611220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:21:07,063][49517] Avg episode reward: [(0, '0.476')] [2024-04-26 09:21:09,625][49750] Updated weights for policy 0, policy_version 140011 (0.0034) [2024-04-26 09:21:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50096.2). Total num frames: 2294071296. Throughput: 0: 50058.7. Samples: 46912580. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:21:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 09:21:12,706][49750] Updated weights for policy 0, policy_version 140021 (0.0033) [2024-04-26 09:21:16,399][49750] Updated weights for policy 0, policy_version 140031 (0.0030) [2024-04-26 09:21:17,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49698.2, 300 sec: 49929.6). Total num frames: 2294284288. Throughput: 0: 49832.4. Samples: 47210680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:21:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 09:21:19,233][49750] Updated weights for policy 0, policy_version 140041 (0.0032) [2024-04-26 09:21:22,063][49517] Fps is (10 sec: 49151.0, 60 sec: 49971.0, 300 sec: 49929.5). Total num frames: 2294562816. Throughput: 0: 49995.9. Samples: 47353900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:21:22,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 09:21:23,033][49750] Updated weights for policy 0, policy_version 140051 (0.0036) [2024-04-26 09:21:25,685][49750] Updated weights for policy 0, policy_version 140061 (0.0030) [2024-04-26 09:21:27,063][49517] Fps is (10 sec: 54066.7, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2294824960. Throughput: 0: 50030.8. Samples: 47654240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:21:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 09:21:29,813][49750] Updated weights for policy 0, policy_version 140071 (0.0030) [2024-04-26 09:21:32,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.2, 300 sec: 50096.1). Total num frames: 2295054336. Throughput: 0: 50149.7. Samples: 47959420. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:21:32,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 09:21:32,300][49750] Updated weights for policy 0, policy_version 140081 (0.0027) [2024-04-26 09:21:36,467][49750] Updated weights for policy 0, policy_version 140091 (0.0028) [2024-04-26 09:21:37,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49971.3, 300 sec: 49929.6). Total num frames: 2295283712. Throughput: 0: 50135.8. Samples: 48109160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:21:37,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 09:21:38,901][49750] Updated weights for policy 0, policy_version 140101 (0.0034) [2024-04-26 09:21:42,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49425.1, 300 sec: 49929.5). Total num frames: 2295545856. Throughput: 0: 50010.5. Samples: 48400980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 27.0) [2024-04-26 09:21:42,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 09:21:42,910][49750] Updated weights for policy 0, policy_version 140111 (0.0033) [2024-04-26 09:21:45,490][49750] Updated weights for policy 0, policy_version 140121 (0.0028) [2024-04-26 09:21:47,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2295824384. Throughput: 0: 49894.6. Samples: 48700040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:21:47,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 09:21:49,300][49750] Updated weights for policy 0, policy_version 140131 (0.0037) [2024-04-26 09:21:52,006][49750] Updated weights for policy 0, policy_version 140141 (0.0030) [2024-04-26 09:21:52,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2296070144. Throughput: 0: 50028.7. Samples: 48862520. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:21:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 09:21:55,753][49750] Updated weights for policy 0, policy_version 140151 (0.0030) [2024-04-26 09:21:57,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50244.1, 300 sec: 50040.6). Total num frames: 2296299520. Throughput: 0: 49922.1. Samples: 49159080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:21:57,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 09:21:58,636][49750] Updated weights for policy 0, policy_version 140161 (0.0024) [2024-04-26 09:21:59,231][49728] Signal inference workers to stop experience collection... (750 times) [2024-04-26 09:21:59,231][49728] Signal inference workers to resume experience collection... (750 times) [2024-04-26 09:21:59,256][49750] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-04-26 09:21:59,261][49750] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-04-26 09:22:02,062][49517] Fps is (10 sec: 47514.5, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2296545280. Throughput: 0: 49805.0. Samples: 49451900. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:22:02,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 09:22:02,246][49750] Updated weights for policy 0, policy_version 140171 (0.0042) [2024-04-26 09:22:05,129][49750] Updated weights for policy 0, policy_version 140181 (0.0026) [2024-04-26 09:22:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 2296807424. Throughput: 0: 50126.5. Samples: 49609580. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:22:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 09:22:08,871][49750] Updated weights for policy 0, policy_version 140191 (0.0039) [2024-04-26 09:22:11,731][49750] Updated weights for policy 0, policy_version 140201 (0.0033) [2024-04-26 09:22:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2297069568. Throughput: 0: 49964.1. Samples: 49902620. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:22:12,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 09:22:15,332][49750] Updated weights for policy 0, policy_version 140211 (0.0033) [2024-04-26 09:22:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2297298944. Throughput: 0: 49910.4. Samples: 50205380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:22:17,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 09:22:18,353][49750] Updated weights for policy 0, policy_version 140221 (0.0033) [2024-04-26 09:22:22,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49425.3, 300 sec: 49985.1). Total num frames: 2297528320. Throughput: 0: 49817.4. Samples: 50350940. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:22:22,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 09:22:22,149][49750] Updated weights for policy 0, policy_version 140231 (0.0041) [2024-04-26 09:22:24,853][49750] Updated weights for policy 0, policy_version 140241 (0.0034) [2024-04-26 09:22:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2297806848. Throughput: 0: 49710.7. Samples: 50637960. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:22:27,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 09:22:28,815][49750] Updated weights for policy 0, policy_version 140251 (0.0030) [2024-04-26 09:22:31,283][49750] Updated weights for policy 0, policy_version 140261 (0.0030) [2024-04-26 09:22:32,063][49517] Fps is (10 sec: 55704.6, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2298085376. Throughput: 0: 49951.0. Samples: 50947840. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:22:32,072][49517] Avg episode reward: [(0, '0.447')] [2024-04-26 09:22:35,384][49750] Updated weights for policy 0, policy_version 140271 (0.0033) [2024-04-26 09:22:37,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2298298368. Throughput: 0: 49780.9. Samples: 51102660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:22:37,071][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 09:22:37,851][49750] Updated weights for policy 0, policy_version 140281 (0.0029) [2024-04-26 09:22:41,764][49750] Updated weights for policy 0, policy_version 140291 (0.0029) [2024-04-26 09:22:42,063][49517] Fps is (10 sec: 44236.9, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2298527744. Throughput: 0: 49768.5. Samples: 51398660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-04-26 09:22:42,072][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 09:22:42,097][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000140292_2298544128.pth... [2024-04-26 09:22:42,142][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000139561_2286567424.pth [2024-04-26 09:22:44,481][49750] Updated weights for policy 0, policy_version 140301 (0.0030) [2024-04-26 09:22:47,063][49517] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2298806272. Throughput: 0: 49924.7. Samples: 51698520. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:22:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 09:22:48,426][49750] Updated weights for policy 0, policy_version 140311 (0.0040) [2024-04-26 09:22:50,959][49750] Updated weights for policy 0, policy_version 140321 (0.0036) [2024-04-26 09:22:52,062][49517] Fps is (10 sec: 52429.1, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2299052032. Throughput: 0: 49949.7. Samples: 51857320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:22:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 09:22:54,851][49750] Updated weights for policy 0, policy_version 140331 (0.0028) [2024-04-26 09:22:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2299297792. Throughput: 0: 50066.2. Samples: 52155600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:22:57,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 09:22:57,465][49750] Updated weights for policy 0, policy_version 140341 (0.0031) [2024-04-26 09:23:01,371][49750] Updated weights for policy 0, policy_version 140351 (0.0030) [2024-04-26 09:23:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2299543552. Throughput: 0: 50064.0. Samples: 52458260. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:02,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 09:23:03,946][49750] Updated weights for policy 0, policy_version 140361 (0.0035) [2024-04-26 09:23:07,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 49929.6). Total num frames: 2299789312. Throughput: 0: 49896.8. Samples: 52596300. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 09:23:08,040][49750] Updated weights for policy 0, policy_version 140371 (0.0029) [2024-04-26 09:23:09,058][49728] Signal inference workers to stop experience collection... 
(800 times) [2024-04-26 09:23:09,067][49728] Signal inference workers to resume experience collection... (800 times) [2024-04-26 09:23:09,092][49750] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-04-26 09:23:09,092][49750] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-04-26 09:23:10,597][49750] Updated weights for policy 0, policy_version 140381 (0.0034) [2024-04-26 09:23:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2300051456. Throughput: 0: 50257.0. Samples: 52899520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:12,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 09:23:14,691][49750] Updated weights for policy 0, policy_version 140391 (0.0029) [2024-04-26 09:23:17,052][49750] Updated weights for policy 0, policy_version 140401 (0.0033) [2024-04-26 09:23:17,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50517.2, 300 sec: 50151.7). Total num frames: 2300329984. Throughput: 0: 50025.8. Samples: 53199000. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:17,063][49517] Avg episode reward: [(0, '0.430')] [2024-04-26 09:23:21,068][49750] Updated weights for policy 0, policy_version 140411 (0.0037) [2024-04-26 09:23:22,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50151.7). Total num frames: 2300559360. Throughput: 0: 49963.5. Samples: 53351020. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:22,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 09:23:23,577][49750] Updated weights for policy 0, policy_version 140421 (0.0030) [2024-04-26 09:23:27,062][49517] Fps is (10 sec: 44237.6, 60 sec: 49425.2, 300 sec: 49929.5). Total num frames: 2300772352. Throughput: 0: 49985.0. Samples: 53647980. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 09:23:27,733][49750] Updated weights for policy 0, policy_version 140431 (0.0038) [2024-04-26 09:23:30,074][49750] Updated weights for policy 0, policy_version 140441 (0.0029) [2024-04-26 09:23:32,063][49517] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 49985.0). Total num frames: 2301067264. Throughput: 0: 50038.7. Samples: 53950260. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:32,063][49517] Avg episode reward: [(0, '0.437')] [2024-04-26 09:23:34,364][49750] Updated weights for policy 0, policy_version 140451 (0.0027) [2024-04-26 09:23:36,536][49750] Updated weights for policy 0, policy_version 140461 (0.0040) [2024-04-26 09:23:37,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50244.4, 300 sec: 50040.7). Total num frames: 2301313024. Throughput: 0: 50092.1. Samples: 54111460. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:37,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 09:23:40,788][49750] Updated weights for policy 0, policy_version 140471 (0.0034) [2024-04-26 09:23:42,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2301558784. Throughput: 0: 50185.6. Samples: 54413960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 09:23:43,064][49750] Updated weights for policy 0, policy_version 140481 (0.0033) [2024-04-26 09:23:47,062][49517] Fps is (10 sec: 45874.8, 60 sec: 49425.1, 300 sec: 49874.0). Total num frames: 2301771776. Throughput: 0: 50002.1. Samples: 54708360. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 09:23:47,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 09:23:47,210][49750] Updated weights for policy 0, policy_version 140491 (0.0034) [2024-04-26 09:23:50,136][49750] Updated weights for policy 0, policy_version 140501 (0.0038) [2024-04-26 09:23:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 2302066688. Throughput: 0: 50052.0. Samples: 54848640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:23:52,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 09:23:53,860][49750] Updated weights for policy 0, policy_version 140511 (0.0033) [2024-04-26 09:23:57,035][49750] Updated weights for policy 0, policy_version 140521 (0.0029) [2024-04-26 09:23:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 49971.2, 300 sec: 49929.6). Total num frames: 2302296064. Throughput: 0: 50029.4. Samples: 55150840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:23:57,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 09:24:00,536][49750] Updated weights for policy 0, policy_version 140531 (0.0036) [2024-04-26 09:24:02,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.1, 300 sec: 50151.7). Total num frames: 2302558208. Throughput: 0: 50060.0. Samples: 55451700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 09:24:03,641][49750] Updated weights for policy 0, policy_version 140541 (0.0038) [2024-04-26 09:24:06,944][49750] Updated weights for policy 0, policy_version 140551 (0.0032) [2024-04-26 09:24:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2302787584. Throughput: 0: 50056.9. Samples: 55603580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 09:24:10,119][49750] Updated weights for policy 0, policy_version 140561 (0.0030) [2024-04-26 09:24:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2303049728. Throughput: 0: 50140.4. Samples: 55904300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 09:24:13,329][49750] Updated weights for policy 0, policy_version 140571 (0.0029) [2024-04-26 09:24:16,648][49750] Updated weights for policy 0, policy_version 140581 (0.0032) [2024-04-26 09:24:17,063][49517] Fps is (10 sec: 52428.8, 60 sec: 49698.2, 300 sec: 49929.5). Total num frames: 2303311872. Throughput: 0: 50219.1. Samples: 56210120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:17,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 09:24:19,767][49728] Signal inference workers to stop experience collection... (850 times) [2024-04-26 09:24:19,813][49750] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-04-26 09:24:19,833][49728] Signal inference workers to resume experience collection... (850 times) [2024-04-26 09:24:19,834][49750] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-04-26 09:24:19,969][49750] Updated weights for policy 0, policy_version 140591 (0.0039) [2024-04-26 09:24:22,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2303574016. Throughput: 0: 50022.1. Samples: 56362460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:22,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 09:24:23,088][49750] Updated weights for policy 0, policy_version 140601 (0.0031) [2024-04-26 09:24:26,510][49750] Updated weights for policy 0, policy_version 140611 (0.0032) [2024-04-26 09:24:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.2, 300 sec: 50151.7). Total num frames: 2303819776. Throughput: 0: 50005.3. Samples: 56664200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 09:24:29,492][49750] Updated weights for policy 0, policy_version 140621 (0.0030) [2024-04-26 09:24:32,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49425.2, 300 sec: 49874.0). Total num frames: 2304032768. Throughput: 0: 50009.0. Samples: 56958760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:32,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 09:24:32,970][49750] Updated weights for policy 0, policy_version 140631 (0.0035) [2024-04-26 09:24:35,874][49750] Updated weights for policy 0, policy_version 140641 (0.0031) [2024-04-26 09:24:37,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.0, 300 sec: 49985.1). Total num frames: 2304311296. Throughput: 0: 50057.6. Samples: 57101240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:37,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 09:24:39,539][49750] Updated weights for policy 0, policy_version 140651 (0.0033) [2024-04-26 09:24:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2304557056. Throughput: 0: 50049.8. Samples: 57403080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:42,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 09:24:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000140660_2304573440.pth... 
[2024-04-26 09:24:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000139929_2292596736.pth [2024-04-26 09:24:42,351][49750] Updated weights for policy 0, policy_version 140661 (0.0034) [2024-04-26 09:24:46,130][49750] Updated weights for policy 0, policy_version 140671 (0.0035) [2024-04-26 09:24:47,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 2304802816. Throughput: 0: 50102.5. Samples: 57706300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 09:24:47,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 09:24:48,968][49750] Updated weights for policy 0, policy_version 140681 (0.0037) [2024-04-26 09:24:52,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 49874.0). Total num frames: 2305032192. Throughput: 0: 49999.6. Samples: 57853560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:24:52,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 09:24:52,647][49750] Updated weights for policy 0, policy_version 140691 (0.0037) [2024-04-26 09:24:55,476][49750] Updated weights for policy 0, policy_version 140701 (0.0029) [2024-04-26 09:24:57,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50244.1, 300 sec: 49985.1). Total num frames: 2305310720. Throughput: 0: 49963.4. Samples: 58152660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:24:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 09:24:59,183][49750] Updated weights for policy 0, policy_version 140711 (0.0039) [2024-04-26 09:25:02,062][49517] Fps is (10 sec: 52429.1, 60 sec: 49971.3, 300 sec: 49929.6). Total num frames: 2305556480. Throughput: 0: 49857.9. Samples: 58453720. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:02,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 09:25:02,145][49750] Updated weights for policy 0, policy_version 140721 (0.0028) [2024-04-26 09:25:05,535][49750] Updated weights for policy 0, policy_version 140731 (0.0038) [2024-04-26 09:25:07,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2305818624. Throughput: 0: 49905.2. Samples: 58608200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:07,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 09:25:08,605][49750] Updated weights for policy 0, policy_version 140741 (0.0032) [2024-04-26 09:25:12,062][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2306048000. Throughput: 0: 49846.8. Samples: 58907300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:12,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 09:25:12,105][49750] Updated weights for policy 0, policy_version 140751 (0.0036) [2024-04-26 09:25:15,043][49750] Updated weights for policy 0, policy_version 140761 (0.0032) [2024-04-26 09:25:17,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49425.1, 300 sec: 49874.0). Total num frames: 2306277376. Throughput: 0: 49913.7. Samples: 59204880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:17,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 09:25:18,739][49750] Updated weights for policy 0, policy_version 140771 (0.0030) [2024-04-26 09:25:21,788][49750] Updated weights for policy 0, policy_version 140781 (0.0034) [2024-04-26 09:25:22,063][49517] Fps is (10 sec: 50790.1, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 2306555904. Throughput: 0: 49880.5. Samples: 59345860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:22,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 09:25:25,148][49750] Updated weights for policy 0, policy_version 140791 (0.0031) [2024-04-26 09:25:27,062][49517] Fps is (10 sec: 54067.5, 60 sec: 49971.4, 300 sec: 50151.7). Total num frames: 2306818048. Throughput: 0: 50031.1. Samples: 59654480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:27,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 09:25:28,453][49750] Updated weights for policy 0, policy_version 140801 (0.0036) [2024-04-26 09:25:29,204][49728] Signal inference workers to stop experience collection... (900 times) [2024-04-26 09:25:29,221][49750] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-04-26 09:25:29,312][49728] Signal inference workers to resume experience collection... (900 times) [2024-04-26 09:25:29,312][49750] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-04-26 09:25:31,765][49750] Updated weights for policy 0, policy_version 140811 (0.0029) [2024-04-26 09:25:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50096.2). Total num frames: 2307063808. Throughput: 0: 49987.4. Samples: 59955740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 09:25:34,839][49750] Updated weights for policy 0, policy_version 140821 (0.0031) [2024-04-26 09:25:37,062][49517] Fps is (10 sec: 45875.0, 60 sec: 49425.2, 300 sec: 49818.5). Total num frames: 2307276800. Throughput: 0: 49941.8. Samples: 60100940. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 09:25:38,202][49750] Updated weights for policy 0, policy_version 140831 (0.0031) [2024-04-26 09:25:41,394][49750] Updated weights for policy 0, policy_version 140841 (0.0035) [2024-04-26 09:25:42,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49971.0, 300 sec: 49985.1). Total num frames: 2307555328. Throughput: 0: 50078.1. Samples: 60406180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:42,064][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 09:25:44,627][49750] Updated weights for policy 0, policy_version 140851 (0.0027) [2024-04-26 09:25:47,063][49517] Fps is (10 sec: 54066.2, 60 sec: 50244.0, 300 sec: 50040.6). Total num frames: 2307817472. Throughput: 0: 50051.7. Samples: 60706060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:47,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 09:25:48,047][49750] Updated weights for policy 0, policy_version 140861 (0.0031) [2024-04-26 09:25:51,292][49750] Updated weights for policy 0, policy_version 140871 (0.0028) [2024-04-26 09:25:52,062][49517] Fps is (10 sec: 52430.3, 60 sec: 50790.5, 300 sec: 50151.7). Total num frames: 2308079616. Throughput: 0: 50114.9. Samples: 60863360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 09:25:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 09:25:54,486][49750] Updated weights for policy 0, policy_version 140881 (0.0034) [2024-04-26 09:25:57,063][49517] Fps is (10 sec: 47514.2, 60 sec: 49698.2, 300 sec: 49929.5). Total num frames: 2308292608. Throughput: 0: 50066.6. Samples: 61160300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:25:57,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 09:25:57,864][49750] Updated weights for policy 0, policy_version 140891 (0.0030) [2024-04-26 09:26:00,957][49750] Updated weights for policy 0, policy_version 140901 (0.0033) [2024-04-26 09:26:02,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2308554752. Throughput: 0: 50173.3. Samples: 61462680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:02,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 09:26:04,377][49750] Updated weights for policy 0, policy_version 140911 (0.0033) [2024-04-26 09:26:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 49698.3, 300 sec: 49929.5). Total num frames: 2308800512. Throughput: 0: 50241.5. Samples: 61606720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:07,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 09:26:07,644][49750] Updated weights for policy 0, policy_version 140921 (0.0035) [2024-04-26 09:26:10,775][49750] Updated weights for policy 0, policy_version 140931 (0.0028) [2024-04-26 09:26:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2309062656. Throughput: 0: 50111.0. Samples: 61909480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:12,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 09:26:14,324][49750] Updated weights for policy 0, policy_version 140941 (0.0033) [2024-04-26 09:26:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 49985.1). Total num frames: 2309308416. Throughput: 0: 50228.5. Samples: 62216020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:17,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 09:26:17,427][49750] Updated weights for policy 0, policy_version 140951 (0.0027) [2024-04-26 09:26:20,898][49750] Updated weights for policy 0, policy_version 140961 (0.0030) [2024-04-26 09:26:22,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.2, 300 sec: 49874.0). Total num frames: 2309537792. Throughput: 0: 50182.2. Samples: 62359140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:22,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 09:26:23,942][49750] Updated weights for policy 0, policy_version 140971 (0.0032) [2024-04-26 09:26:27,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2309799936. Throughput: 0: 49966.9. Samples: 62654680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:27,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 09:26:27,308][49750] Updated weights for policy 0, policy_version 140981 (0.0031) [2024-04-26 09:26:30,494][49750] Updated weights for policy 0, policy_version 140991 (0.0034) [2024-04-26 09:26:32,063][49517] Fps is (10 sec: 55704.7, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 2310094848. Throughput: 0: 50023.6. Samples: 62957120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:32,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 09:26:33,654][49728] Signal inference workers to stop experience collection... (950 times) [2024-04-26 09:26:33,654][49728] Signal inference workers to resume experience collection... 
(950 times) [2024-04-26 09:26:33,696][49750] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-04-26 09:26:33,697][49750] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-04-26 09:26:33,779][49750] Updated weights for policy 0, policy_version 141001 (0.0048) [2024-04-26 09:26:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50040.7). Total num frames: 2310307840. Throughput: 0: 50028.4. Samples: 63114640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:37,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 09:26:37,124][49750] Updated weights for policy 0, policy_version 141011 (0.0030) [2024-04-26 09:26:40,553][49750] Updated weights for policy 0, policy_version 141021 (0.0037) [2024-04-26 09:26:42,062][49517] Fps is (10 sec: 44237.7, 60 sec: 49698.3, 300 sec: 49874.0). Total num frames: 2310537216. Throughput: 0: 50057.4. Samples: 63412880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:42,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 09:26:42,155][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000141025_2310553600.pth... [2024-04-26 09:26:42,200][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000140292_2298544128.pth [2024-04-26 09:26:43,623][49750] Updated weights for policy 0, policy_version 141031 (0.0031) [2024-04-26 09:26:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49698.3, 300 sec: 49929.6). Total num frames: 2310799360. Throughput: 0: 49961.0. Samples: 63710920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:47,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 09:26:47,193][49750] Updated weights for policy 0, policy_version 141041 (0.0040) [2024-04-26 09:26:50,097][49750] Updated weights for policy 0, policy_version 141051 (0.0028) [2024-04-26 09:26:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 49698.1, 300 sec: 50040.6). 
Total num frames: 2311061504. Throughput: 0: 50133.4. Samples: 63862720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:52,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 09:26:53,587][49750] Updated weights for policy 0, policy_version 141061 (0.0030) [2024-04-26 09:26:56,549][49750] Updated weights for policy 0, policy_version 141071 (0.0033) [2024-04-26 09:26:57,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50096.1). Total num frames: 2311323648. Throughput: 0: 50178.2. Samples: 64167500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 09:26:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 09:27:00,342][49750] Updated weights for policy 0, policy_version 141081 (0.0035) [2024-04-26 09:27:02,063][49517] Fps is (10 sec: 47513.0, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 2311536640. Throughput: 0: 49953.6. Samples: 64463940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:02,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 09:27:03,145][49750] Updated weights for policy 0, policy_version 141091 (0.0032) [2024-04-26 09:27:07,023][49750] Updated weights for policy 0, policy_version 141101 (0.0033) [2024-04-26 09:27:07,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.1, 300 sec: 49929.5). Total num frames: 2311798784. Throughput: 0: 49807.5. Samples: 64600480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:07,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 09:27:09,654][49750] Updated weights for policy 0, policy_version 141111 (0.0035) [2024-04-26 09:27:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2312060928. Throughput: 0: 49871.5. Samples: 64898900. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:12,072][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 09:27:13,929][49750] Updated weights for policy 0, policy_version 141121 (0.0031) [2024-04-26 09:27:16,237][49750] Updated weights for policy 0, policy_version 141131 (0.0035) [2024-04-26 09:27:17,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2312339456. Throughput: 0: 49834.0. Samples: 65199640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 09:27:20,417][49750] Updated weights for policy 0, policy_version 141141 (0.0035) [2024-04-26 09:27:22,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.3, 300 sec: 49929.6). Total num frames: 2312536064. Throughput: 0: 49927.6. Samples: 65361380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 09:27:22,787][49750] Updated weights for policy 0, policy_version 141151 (0.0030) [2024-04-26 09:27:26,849][49750] Updated weights for policy 0, policy_version 141161 (0.0034) [2024-04-26 09:27:27,062][49517] Fps is (10 sec: 44236.9, 60 sec: 49698.2, 300 sec: 49818.5). Total num frames: 2312781824. Throughput: 0: 50025.4. Samples: 65664020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:27,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 09:27:29,374][49750] Updated weights for policy 0, policy_version 141171 (0.0031) [2024-04-26 09:27:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 49152.2, 300 sec: 49985.1). Total num frames: 2313043968. Throughput: 0: 49861.7. Samples: 65954700. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:32,063][49517] Avg episode reward: [(0, '0.436')] [2024-04-26 09:27:33,450][49750] Updated weights for policy 0, policy_version 141181 (0.0030) [2024-04-26 09:27:35,921][49750] Updated weights for policy 0, policy_version 141191 (0.0032) [2024-04-26 09:27:37,062][49517] Fps is (10 sec: 55705.6, 60 sec: 50517.3, 300 sec: 50207.3). Total num frames: 2313338880. Throughput: 0: 50065.8. Samples: 66115680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:37,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 09:27:39,953][49750] Updated weights for policy 0, policy_version 141201 (0.0028) [2024-04-26 09:27:42,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2313551872. Throughput: 0: 49856.0. Samples: 66411020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:42,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 09:27:42,465][49750] Updated weights for policy 0, policy_version 141211 (0.0027) [2024-04-26 09:27:46,444][49750] Updated weights for policy 0, policy_version 141221 (0.0040) [2024-04-26 09:27:47,062][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2313797632. Throughput: 0: 50013.9. Samples: 66714560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:47,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 09:27:48,447][49728] Signal inference workers to stop experience collection... (1000 times) [2024-04-26 09:27:48,468][49750] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-04-26 09:27:48,553][49728] Signal inference workers to resume experience collection... 
(1000 times) [2024-04-26 09:27:48,553][49750] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-04-26 09:27:49,025][49750] Updated weights for policy 0, policy_version 141231 (0.0032) [2024-04-26 09:27:52,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.0, 300 sec: 50040.6). Total num frames: 2314059776. Throughput: 0: 50087.4. Samples: 66854420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 09:27:52,881][49750] Updated weights for policy 0, policy_version 141241 (0.0033) [2024-04-26 09:27:55,513][49750] Updated weights for policy 0, policy_version 141251 (0.0032) [2024-04-26 09:27:57,067][49517] Fps is (10 sec: 52406.6, 60 sec: 49967.7, 300 sec: 50095.4). Total num frames: 2314321920. Throughput: 0: 50067.3. Samples: 67152140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 09:27:57,067][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 09:27:59,333][49750] Updated weights for policy 0, policy_version 141261 (0.0031) [2024-04-26 09:28:02,049][49750] Updated weights for policy 0, policy_version 141271 (0.0036) [2024-04-26 09:28:02,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50790.5, 300 sec: 50151.7). Total num frames: 2314584064. Throughput: 0: 50212.8. Samples: 67459220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:02,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 09:28:05,892][49750] Updated weights for policy 0, policy_version 141281 (0.0036) [2024-04-26 09:28:07,063][49517] Fps is (10 sec: 47533.2, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2314797056. Throughput: 0: 49905.1. Samples: 67607120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:07,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 09:28:08,602][49750] Updated weights for policy 0, policy_version 141291 (0.0033) [2024-04-26 09:28:12,063][49517] Fps is (10 sec: 45875.0, 60 sec: 49698.1, 300 sec: 49874.0). 
Total num frames: 2315042816. Throughput: 0: 49776.3. Samples: 67903960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:12,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 09:28:12,684][49750] Updated weights for policy 0, policy_version 141301 (0.0025) [2024-04-26 09:28:15,240][49750] Updated weights for policy 0, policy_version 141311 (0.0031) [2024-04-26 09:28:17,062][49517] Fps is (10 sec: 52429.6, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2315321344. Throughput: 0: 49850.3. Samples: 68197960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:17,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 09:28:19,499][49750] Updated weights for policy 0, policy_version 141321 (0.0030) [2024-04-26 09:28:21,681][49750] Updated weights for policy 0, policy_version 141331 (0.0031) [2024-04-26 09:28:22,062][49517] Fps is (10 sec: 54067.8, 60 sec: 50790.4, 300 sec: 50207.2). Total num frames: 2315583488. Throughput: 0: 50069.8. Samples: 68368820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 09:28:25,953][49750] Updated weights for policy 0, policy_version 141341 (0.0032) [2024-04-26 09:28:27,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.2, 300 sec: 49874.0). Total num frames: 2315780096. Throughput: 0: 50030.8. Samples: 68662400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:27,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 09:28:28,254][49750] Updated weights for policy 0, policy_version 141351 (0.0030) [2024-04-26 09:28:32,062][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 2316042240. Throughput: 0: 49773.3. Samples: 68954360. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 09:28:32,490][49750] Updated weights for policy 0, policy_version 141361 (0.0034) [2024-04-26 09:28:34,944][49750] Updated weights for policy 0, policy_version 141371 (0.0028) [2024-04-26 09:28:37,063][49517] Fps is (10 sec: 54066.1, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 2316320768. Throughput: 0: 50112.1. Samples: 69109460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 09:28:39,254][49750] Updated weights for policy 0, policy_version 141381 (0.0039) [2024-04-26 09:28:41,599][49750] Updated weights for policy 0, policy_version 141391 (0.0032) [2024-04-26 09:28:42,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2316566528. Throughput: 0: 50047.4. Samples: 69404060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 09:28:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000141392_2316566528.pth... [2024-04-26 09:28:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000140660_2304573440.pth [2024-04-26 09:28:42,311][49728] Signal inference workers to stop experience collection... (1050 times) [2024-04-26 09:28:42,311][49728] Signal inference workers to resume experience collection... (1050 times) [2024-04-26 09:28:42,326][49750] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-04-26 09:28:42,326][49750] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-04-26 09:28:45,757][49750] Updated weights for policy 0, policy_version 141401 (0.0032) [2024-04-26 09:28:47,062][49517] Fps is (10 sec: 47514.8, 60 sec: 49971.3, 300 sec: 49929.6). Total num frames: 2316795904. Throughput: 0: 49955.7. Samples: 69707220. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:47,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 09:28:48,235][49750] Updated weights for policy 0, policy_version 141411 (0.0033) [2024-04-26 09:28:52,063][49517] Fps is (10 sec: 45874.3, 60 sec: 49425.1, 300 sec: 49929.5). Total num frames: 2317025280. Throughput: 0: 49752.8. Samples: 69846000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:52,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 09:28:52,246][49750] Updated weights for policy 0, policy_version 141421 (0.0027) [2024-04-26 09:28:54,743][49750] Updated weights for policy 0, policy_version 141431 (0.0031) [2024-04-26 09:28:57,062][49517] Fps is (10 sec: 49151.3, 60 sec: 49428.5, 300 sec: 49929.6). Total num frames: 2317287424. Throughput: 0: 49558.7. Samples: 70134100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:28:57,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 09:28:58,892][49750] Updated weights for policy 0, policy_version 141441 (0.0030) [2024-04-26 09:29:01,333][49750] Updated weights for policy 0, policy_version 141451 (0.0024) [2024-04-26 09:29:02,063][49517] Fps is (10 sec: 55705.5, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2317582336. Throughput: 0: 49777.1. Samples: 70437940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-26 09:29:02,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 09:29:05,524][49750] Updated weights for policy 0, policy_version 141461 (0.0035) [2024-04-26 09:29:07,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49698.3, 300 sec: 49929.6). Total num frames: 2317778944. Throughput: 0: 49468.0. Samples: 70594880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:07,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 09:29:07,877][49750] Updated weights for policy 0, policy_version 141471 (0.0032) [2024-04-26 09:29:12,062][49517] Fps is (10 sec: 42599.6, 60 sec: 49425.2, 300 sec: 49818.5). 
Total num frames: 2318008320. Throughput: 0: 49645.4. Samples: 70896440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:12,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 09:29:12,122][49750] Updated weights for policy 0, policy_version 141481 (0.0030) [2024-04-26 09:29:14,370][49750] Updated weights for policy 0, policy_version 141491 (0.0035) [2024-04-26 09:29:17,062][49517] Fps is (10 sec: 52428.5, 60 sec: 49698.1, 300 sec: 49929.6). Total num frames: 2318303232. Throughput: 0: 49775.6. Samples: 71194260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:17,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 09:29:18,631][49750] Updated weights for policy 0, policy_version 141501 (0.0035) [2024-04-26 09:29:20,910][49750] Updated weights for policy 0, policy_version 141511 (0.0030) [2024-04-26 09:29:22,062][49517] Fps is (10 sec: 55705.3, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2318565376. Throughput: 0: 49835.8. Samples: 71352060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:22,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 09:29:25,216][49750] Updated weights for policy 0, policy_version 141521 (0.0033) [2024-04-26 09:29:27,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50096.1). Total num frames: 2318811136. Throughput: 0: 50043.8. Samples: 71656040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:27,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 09:29:27,373][49750] Updated weights for policy 0, policy_version 141531 (0.0028) [2024-04-26 09:29:31,769][49750] Updated weights for policy 0, policy_version 141541 (0.0036) [2024-04-26 09:29:32,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49698.2, 300 sec: 49874.0). Total num frames: 2319024128. Throughput: 0: 50122.6. Samples: 71962740. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 09:29:34,044][49750] Updated weights for policy 0, policy_version 141551 (0.0031) [2024-04-26 09:29:35,633][49728] Signal inference workers to stop experience collection... (1100 times) [2024-04-26 09:29:35,693][49750] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-04-26 09:29:35,701][49728] Signal inference workers to resume experience collection... (1100 times) [2024-04-26 09:29:35,709][49750] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-04-26 09:29:37,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2319302656. Throughput: 0: 49927.7. Samples: 72092740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:37,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 09:29:38,273][49750] Updated weights for policy 0, policy_version 141561 (0.0038) [2024-04-26 09:29:40,577][49750] Updated weights for policy 0, policy_version 141571 (0.0026) [2024-04-26 09:29:42,063][49517] Fps is (10 sec: 52428.3, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2319548416. Throughput: 0: 50215.1. Samples: 72393780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:42,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 09:29:44,743][49750] Updated weights for policy 0, policy_version 141581 (0.0029) [2024-04-26 09:29:46,981][49750] Updated weights for policy 0, policy_version 141591 (0.0032) [2024-04-26 09:29:47,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2319826944. Throughput: 0: 50270.5. Samples: 72700100. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 09:29:51,201][49750] Updated weights for policy 0, policy_version 141601 (0.0035) [2024-04-26 09:29:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 49929.6). Total num frames: 2320039936. Throughput: 0: 49980.3. Samples: 72844000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:52,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 09:29:53,518][49750] Updated weights for policy 0, policy_version 141611 (0.0036) [2024-04-26 09:29:57,063][49517] Fps is (10 sec: 45874.5, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 2320285696. Throughput: 0: 49975.4. Samples: 73145340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:29:57,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 09:29:57,840][49750] Updated weights for policy 0, policy_version 141621 (0.0031) [2024-04-26 09:30:00,111][49750] Updated weights for policy 0, policy_version 141631 (0.0033) [2024-04-26 09:30:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2320564224. Throughput: 0: 50163.8. Samples: 73451640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:30:02,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 09:30:04,432][49750] Updated weights for policy 0, policy_version 141641 (0.0034) [2024-04-26 09:30:06,589][49750] Updated weights for policy 0, policy_version 141651 (0.0031) [2024-04-26 09:30:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.1, 300 sec: 50040.6). Total num frames: 2320809984. Throughput: 0: 50096.2. Samples: 73606400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 09:30:07,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 09:30:10,846][49750] Updated weights for policy 0, policy_version 141661 (0.0034) [2024-04-26 09:30:12,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.3, 300 sec: 50096.2). 
Total num frames: 2321055744. Throughput: 0: 50139.2. Samples: 73912300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:12,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 09:30:13,035][49750] Updated weights for policy 0, policy_version 141671 (0.0035) [2024-04-26 09:30:17,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49425.0, 300 sec: 49874.0). Total num frames: 2321268736. Throughput: 0: 49973.3. Samples: 74211540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:17,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 09:30:17,249][49750] Updated weights for policy 0, policy_version 141681 (0.0040) [2024-04-26 09:30:19,539][49750] Updated weights for policy 0, policy_version 141691 (0.0030) [2024-04-26 09:30:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2321563648. Throughput: 0: 50051.6. Samples: 74345060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 09:30:23,802][49750] Updated weights for policy 0, policy_version 141701 (0.0032) [2024-04-26 09:30:26,200][49750] Updated weights for policy 0, policy_version 141711 (0.0032) [2024-04-26 09:30:27,063][49517] Fps is (10 sec: 54066.6, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2321809408. Throughput: 0: 50221.3. Samples: 74653740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:27,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 09:30:30,465][49750] Updated weights for policy 0, policy_version 141721 (0.0034) [2024-04-26 09:30:31,345][49728] Signal inference workers to stop experience collection... (1150 times) [2024-04-26 09:30:31,380][49750] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-04-26 09:30:31,415][49728] Signal inference workers to resume experience collection... 
(1150 times) [2024-04-26 09:30:31,415][49750] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-04-26 09:30:32,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50096.1). Total num frames: 2322055168. Throughput: 0: 50078.0. Samples: 74953620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:32,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 09:30:32,742][49750] Updated weights for policy 0, policy_version 141731 (0.0028) [2024-04-26 09:30:37,007][49750] Updated weights for policy 0, policy_version 141741 (0.0031) [2024-04-26 09:30:37,063][49517] Fps is (10 sec: 47513.5, 60 sec: 49698.0, 300 sec: 49929.6). Total num frames: 2322284544. Throughput: 0: 50058.2. Samples: 75096620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:37,063][49517] Avg episode reward: [(0, '0.457')] [2024-04-26 09:30:39,324][49750] Updated weights for policy 0, policy_version 141751 (0.0033) [2024-04-26 09:30:42,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.2, 300 sec: 49929.6). Total num frames: 2322546688. Throughput: 0: 50080.0. Samples: 75398940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:42,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 09:30:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000141757_2322546688.pth... [2024-04-26 09:30:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000141025_2310553600.pth [2024-04-26 09:30:43,461][49750] Updated weights for policy 0, policy_version 141761 (0.0034) [2024-04-26 09:30:45,926][49750] Updated weights for policy 0, policy_version 141771 (0.0029) [2024-04-26 09:30:47,062][49517] Fps is (10 sec: 54068.2, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2322825216. Throughput: 0: 49940.7. Samples: 75698960. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:47,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 09:30:50,006][49750] Updated weights for policy 0, policy_version 141781 (0.0027) [2024-04-26 09:30:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.4, 300 sec: 50040.6). Total num frames: 2323054592. Throughput: 0: 50194.0. Samples: 75865120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:52,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 09:30:52,528][49750] Updated weights for policy 0, policy_version 141791 (0.0035) [2024-04-26 09:30:56,480][49750] Updated weights for policy 0, policy_version 141801 (0.0037) [2024-04-26 09:30:57,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 2323300352. Throughput: 0: 50086.7. Samples: 76166200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:30:57,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 09:30:58,940][49750] Updated weights for policy 0, policy_version 141811 (0.0030) [2024-04-26 09:31:02,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49425.1, 300 sec: 49929.5). Total num frames: 2323529728. Throughput: 0: 50031.5. Samples: 76462960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:31:02,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 09:31:03,203][49750] Updated weights for policy 0, policy_version 141821 (0.0036) [2024-04-26 09:31:05,544][49750] Updated weights for policy 0, policy_version 141831 (0.0032) [2024-04-26 09:31:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2323808256. Throughput: 0: 50259.1. Samples: 76606720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 09:31:07,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 09:31:09,518][49750] Updated weights for policy 0, policy_version 141841 (0.0029) [2024-04-26 09:31:12,063][49517] Fps is (10 sec: 52429.1, 60 sec: 49971.2, 300 sec: 49985.1). 
Total num frames: 2324054016. Throughput: 0: 50101.0. Samples: 76908280. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:12,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 09:31:12,194][49750] Updated weights for policy 0, policy_version 141851 (0.0037) [2024-04-26 09:31:16,131][49750] Updated weights for policy 0, policy_version 141861 (0.0039) [2024-04-26 09:31:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50096.2). Total num frames: 2324316160. Throughput: 0: 50202.0. Samples: 77212700. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:17,063][49517] Avg episode reward: [(0, '0.416')] [2024-04-26 09:31:18,779][49750] Updated weights for policy 0, policy_version 141871 (0.0034) [2024-04-26 09:31:22,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 49929.6). Total num frames: 2324529152. Throughput: 0: 50142.0. Samples: 77353000. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:22,063][49517] Avg episode reward: [(0, '0.467')] [2024-04-26 09:31:22,575][49750] Updated weights for policy 0, policy_version 141881 (0.0034) [2024-04-26 09:31:25,391][49750] Updated weights for policy 0, policy_version 141891 (0.0034) [2024-04-26 09:31:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.4, 300 sec: 49874.1). Total num frames: 2324807680. Throughput: 0: 50124.7. Samples: 77654540. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:27,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 09:31:29,168][49750] Updated weights for policy 0, policy_version 141901 (0.0032) [2024-04-26 09:31:31,888][49750] Updated weights for policy 0, policy_version 141911 (0.0031) [2024-04-26 09:31:32,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50244.4, 300 sec: 50040.6). Total num frames: 2325069824. Throughput: 0: 50225.8. Samples: 77959120. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:32,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 09:31:35,700][49750] Updated weights for policy 0, policy_version 141921 (0.0032) [2024-04-26 09:31:37,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.4, 300 sec: 50096.1). Total num frames: 2325315584. Throughput: 0: 49911.4. Samples: 78111140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:37,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 09:31:38,505][49750] Updated weights for policy 0, policy_version 141931 (0.0035) [2024-04-26 09:31:42,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2325544960. Throughput: 0: 49862.1. Samples: 78410000. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:42,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 09:31:42,200][49750] Updated weights for policy 0, policy_version 141941 (0.0039) [2024-04-26 09:31:43,037][49728] Signal inference workers to stop experience collection... (1200 times) [2024-04-26 09:31:43,057][49750] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-04-26 09:31:43,105][49728] Signal inference workers to resume experience collection... (1200 times) [2024-04-26 09:31:43,105][49750] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-04-26 09:31:45,001][49750] Updated weights for policy 0, policy_version 141951 (0.0031) [2024-04-26 09:31:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2325807104. Throughput: 0: 49881.4. Samples: 78707620. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 09:31:48,708][49750] Updated weights for policy 0, policy_version 141961 (0.0029) [2024-04-26 09:31:51,468][49750] Updated weights for policy 0, policy_version 141971 (0.0033) [2024-04-26 09:31:52,063][49517] Fps is (10 sec: 54067.1, 60 sec: 50517.2, 300 sec: 50040.6). Total num frames: 2326085632. Throughput: 0: 50054.0. Samples: 78859160. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:52,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 09:31:55,139][49750] Updated weights for policy 0, policy_version 141981 (0.0040) [2024-04-26 09:31:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2326315008. Throughput: 0: 49978.8. Samples: 79157320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:31:57,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 09:31:58,034][49750] Updated weights for policy 0, policy_version 141991 (0.0031) [2024-04-26 09:32:01,719][49750] Updated weights for policy 0, policy_version 142001 (0.0041) [2024-04-26 09:32:02,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 2326544384. Throughput: 0: 49967.5. Samples: 79461240. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:32:02,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 09:32:04,588][49750] Updated weights for policy 0, policy_version 142011 (0.0030) [2024-04-26 09:32:07,063][49517] Fps is (10 sec: 47513.0, 60 sec: 49698.0, 300 sec: 49929.5). Total num frames: 2326790144. Throughput: 0: 50055.0. Samples: 79605480. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:32:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 09:32:08,255][49750] Updated weights for policy 0, policy_version 142021 (0.0034) [2024-04-26 09:32:11,096][49750] Updated weights for policy 0, policy_version 142031 (0.0027) [2024-04-26 09:32:12,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50244.1, 300 sec: 49929.5). Total num frames: 2327068672. Throughput: 0: 50153.9. Samples: 79911480. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 09:32:12,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 09:32:14,800][49750] Updated weights for policy 0, policy_version 142041 (0.0030) [2024-04-26 09:32:17,062][49517] Fps is (10 sec: 52429.8, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2327314432. Throughput: 0: 49956.5. Samples: 80207160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:32:17,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 09:32:17,570][49750] Updated weights for policy 0, policy_version 142051 (0.0031) [2024-04-26 09:32:21,282][49750] Updated weights for policy 0, policy_version 142061 (0.0030) [2024-04-26 09:32:22,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2327543808. Throughput: 0: 49988.5. Samples: 80360620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:32:22,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 09:32:24,359][49750] Updated weights for policy 0, policy_version 142071 (0.0038) [2024-04-26 09:32:27,062][49517] Fps is (10 sec: 45874.9, 60 sec: 49425.0, 300 sec: 49929.6). Total num frames: 2327773184. Throughput: 0: 49941.5. Samples: 80657360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:32:27,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 09:32:27,735][49750] Updated weights for policy 0, policy_version 142081 (0.0028) [2024-04-26 09:32:30,893][49750] Updated weights for policy 0, policy_version 142091 (0.0029) [2024-04-26 09:32:32,063][49517] Fps is (10 sec: 52428.7, 60 sec: 49971.1, 300 sec: 49929.5). Total num frames: 2328068096. Throughput: 0: 50023.0. Samples: 80958660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:32:32,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 09:32:34,337][49750] Updated weights for policy 0, policy_version 142101 (0.0032) [2024-04-26 09:32:37,063][49517] Fps is (10 sec: 54066.2, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2328313856. Throughput: 0: 49991.6. Samples: 81108780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:32:37,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 09:32:37,373][49750] Updated weights for policy 0, policy_version 142111 (0.0033) [2024-04-26 09:32:40,933][49750] Updated weights for policy 0, policy_version 142121 (0.0032) [2024-04-26 09:32:41,867][49728] Signal inference workers to stop experience collection... (1250 times) [2024-04-26 09:32:41,868][49728] Signal inference workers to resume experience collection... (1250 times) [2024-04-26 09:32:41,883][49750] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-04-26 09:32:41,883][49750] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-04-26 09:32:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 2328576000. Throughput: 0: 50132.4. Samples: 81413280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:32:42,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 09:32:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000142125_2328576000.pth... 
[2024-04-26 09:32:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000141392_2316566528.pth [2024-04-26 09:32:43,881][49750] Updated weights for policy 0, policy_version 142131 (0.0032) [2024-04-26 09:32:47,063][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2328805376. Throughput: 0: 50124.4. Samples: 81716840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:32:47,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 09:32:47,602][49750] Updated weights for policy 0, policy_version 142141 (0.0031) [2024-04-26 09:32:50,405][49750] Updated weights for policy 0, policy_version 142151 (0.0036) [2024-04-26 09:32:52,063][49517] Fps is (10 sec: 47513.2, 60 sec: 49425.1, 300 sec: 49930.3). Total num frames: 2329051136. Throughput: 0: 50039.1. Samples: 81857240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:32:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 09:32:54,326][49750] Updated weights for policy 0, policy_version 142161 (0.0028) [2024-04-26 09:32:56,950][49750] Updated weights for policy 0, policy_version 142171 (0.0039) [2024-04-26 09:32:57,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50244.1, 300 sec: 49985.1). Total num frames: 2329329664. Throughput: 0: 49884.5. Samples: 82156280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:32:57,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 09:33:00,659][49750] Updated weights for policy 0, policy_version 142181 (0.0036) [2024-04-26 09:33:02,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2329559040. Throughput: 0: 50027.3. Samples: 82458400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:33:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 09:33:03,516][49750] Updated weights for policy 0, policy_version 142191 (0.0036) [2024-04-26 09:33:07,062][49517] Fps is (10 sec: 45876.2, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2329788416. Throughput: 0: 49826.3. Samples: 82602800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:33:07,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 09:33:07,388][49750] Updated weights for policy 0, policy_version 142201 (0.0035) [2024-04-26 09:33:10,363][49750] Updated weights for policy 0, policy_version 142211 (0.0032) [2024-04-26 09:33:12,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49698.3, 300 sec: 49929.5). Total num frames: 2330050560. Throughput: 0: 49949.7. Samples: 82905100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 09:33:12,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 09:33:13,972][49750] Updated weights for policy 0, policy_version 142221 (0.0031) [2024-04-26 09:33:17,060][49750] Updated weights for policy 0, policy_version 142231 (0.0032) [2024-04-26 09:33:17,062][49517] Fps is (10 sec: 52428.2, 60 sec: 49971.1, 300 sec: 49929.5). Total num frames: 2330312704. Throughput: 0: 49835.1. Samples: 83201240. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:33:17,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 09:33:20,415][49750] Updated weights for policy 0, policy_version 142241 (0.0037) [2024-04-26 09:33:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2330558464. Throughput: 0: 50073.1. Samples: 83362060. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:33:22,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 09:33:23,420][49750] Updated weights for policy 0, policy_version 142251 (0.0031) [2024-04-26 09:33:26,804][49750] Updated weights for policy 0, policy_version 142261 (0.0033) [2024-04-26 09:33:27,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50040.6). Total num frames: 2330804224. Throughput: 0: 49919.5. Samples: 83659660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:33:27,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 09:33:29,920][49750] Updated weights for policy 0, policy_version 142271 (0.0030) [2024-04-26 09:33:32,062][49517] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 49929.6). Total num frames: 2331049984. Throughput: 0: 49934.6. Samples: 83963900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:33:32,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 09:33:33,442][49750] Updated weights for policy 0, policy_version 142281 (0.0030) [2024-04-26 09:33:36,606][49750] Updated weights for policy 0, policy_version 142291 (0.0030) [2024-04-26 09:33:37,062][49517] Fps is (10 sec: 50791.2, 60 sec: 49971.4, 300 sec: 49985.1). Total num frames: 2331312128. Throughput: 0: 49996.6. Samples: 84107080. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:33:37,063][49517] Avg episode reward: [(0, '0.457')] [2024-04-26 09:33:40,123][49750] Updated weights for policy 0, policy_version 142301 (0.0032) [2024-04-26 09:33:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 2331557888. Throughput: 0: 50100.5. Samples: 84410800. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:33:42,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 09:33:43,310][49750] Updated weights for policy 0, policy_version 142311 (0.0033) [2024-04-26 09:33:46,783][49750] Updated weights for policy 0, policy_version 142321 (0.0037) [2024-04-26 09:33:47,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2331803648. Throughput: 0: 50106.0. Samples: 84713160. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:33:47,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 09:33:49,960][49750] Updated weights for policy 0, policy_version 142331 (0.0030) [2024-04-26 09:33:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2332049408. Throughput: 0: 50124.4. Samples: 84858400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:33:52,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 09:33:53,220][49750] Updated weights for policy 0, policy_version 142341 (0.0034) [2024-04-26 09:33:56,452][49750] Updated weights for policy 0, policy_version 142351 (0.0033) [2024-04-26 09:33:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49425.2, 300 sec: 49874.0). Total num frames: 2332295168. Throughput: 0: 49996.9. Samples: 85154960. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:33:57,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 09:33:59,749][49750] Updated weights for policy 0, policy_version 142361 (0.0032) [2024-04-26 09:34:02,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.5, 300 sec: 50207.2). Total num frames: 2332590080. Throughput: 0: 50180.5. Samples: 85459360. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:34:02,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 09:34:03,014][49750] Updated weights for policy 0, policy_version 142371 (0.0028) [2024-04-26 09:34:06,228][49750] Updated weights for policy 0, policy_version 142381 (0.0032) [2024-04-26 09:34:07,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 2332835840. Throughput: 0: 50131.1. Samples: 85617960. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:34:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 09:34:09,611][49750] Updated weights for policy 0, policy_version 142391 (0.0034) [2024-04-26 09:34:12,063][49517] Fps is (10 sec: 45874.5, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2333048832. Throughput: 0: 50100.4. Samples: 85914180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:34:12,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 09:34:12,730][49750] Updated weights for policy 0, policy_version 142401 (0.0038) [2024-04-26 09:34:14,525][49728] Signal inference workers to stop experience collection... (1300 times) [2024-04-26 09:34:14,526][49728] Signal inference workers to resume experience collection... (1300 times) [2024-04-26 09:34:14,542][49750] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-04-26 09:34:14,542][49750] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-04-26 09:34:16,233][49750] Updated weights for policy 0, policy_version 142411 (0.0033) [2024-04-26 09:34:17,063][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2333310976. Throughput: 0: 49980.0. Samples: 86213000. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-26 09:34:17,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 09:34:19,233][49750] Updated weights for policy 0, policy_version 142421 (0.0029) [2024-04-26 09:34:22,063][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2333556736. Throughput: 0: 50018.9. Samples: 86357940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:34:22,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 09:34:22,845][49750] Updated weights for policy 0, policy_version 142431 (0.0040) [2024-04-26 09:34:25,718][49750] Updated weights for policy 0, policy_version 142441 (0.0029) [2024-04-26 09:34:27,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2333818880. Throughput: 0: 50158.4. Samples: 86667920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:34:27,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 09:34:29,235][49750] Updated weights for policy 0, policy_version 142451 (0.0027) [2024-04-26 09:34:32,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.4, 300 sec: 50040.6). Total num frames: 2334064640. Throughput: 0: 50201.3. Samples: 86972220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:34:32,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 09:34:32,166][49750] Updated weights for policy 0, policy_version 142461 (0.0029) [2024-04-26 09:34:35,656][49750] Updated weights for policy 0, policy_version 142471 (0.0025) [2024-04-26 09:34:37,062][49517] Fps is (10 sec: 47513.2, 60 sec: 49698.0, 300 sec: 49985.1). Total num frames: 2334294016. Throughput: 0: 49983.1. Samples: 87107640. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:34:37,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 09:34:38,622][49750] Updated weights for policy 0, policy_version 142481 (0.0038) [2024-04-26 09:34:42,063][49517] Fps is (10 sec: 49151.0, 60 sec: 49971.2, 300 sec: 49929.5). 
Total num frames: 2334556160. Throughput: 0: 50199.4. Samples: 87413940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:34:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 09:34:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000142490_2334556160.pth... [2024-04-26 09:34:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000141757_2322546688.pth [2024-04-26 09:34:42,383][49750] Updated weights for policy 0, policy_version 142491 (0.0029) [2024-04-26 09:34:45,127][49750] Updated weights for policy 0, policy_version 142501 (0.0031) [2024-04-26 09:34:47,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2334834688. Throughput: 0: 50149.4. Samples: 87716080. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:34:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 09:34:48,749][49750] Updated weights for policy 0, policy_version 142511 (0.0043) [2024-04-26 09:34:51,754][49750] Updated weights for policy 0, policy_version 142521 (0.0034) [2024-04-26 09:34:52,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2335080448. Throughput: 0: 50112.8. Samples: 87873040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:34:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 09:34:55,483][49750] Updated weights for policy 0, policy_version 142531 (0.0036) [2024-04-26 09:34:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.4, 300 sec: 50040.7). Total num frames: 2335326208. Throughput: 0: 50229.9. Samples: 88174520. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:34:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 09:34:58,157][49750] Updated weights for policy 0, policy_version 142541 (0.0036) [2024-04-26 09:35:01,882][49750] Updated weights for policy 0, policy_version 142551 (0.0032) [2024-04-26 09:35:02,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49425.0, 300 sec: 49985.1). Total num frames: 2335555584. Throughput: 0: 50073.4. Samples: 88466300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:35:02,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 09:35:04,703][49750] Updated weights for policy 0, policy_version 142561 (0.0036) [2024-04-26 09:35:07,063][49517] Fps is (10 sec: 49150.9, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 2335817728. Throughput: 0: 50212.8. Samples: 88617520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:35:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 09:35:08,470][49750] Updated weights for policy 0, policy_version 142571 (0.0029) [2024-04-26 09:35:11,284][49750] Updated weights for policy 0, policy_version 142581 (0.0038) [2024-04-26 09:35:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2336063488. Throughput: 0: 50044.4. Samples: 88919920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:35:12,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 09:35:13,539][49728] Signal inference workers to stop experience collection... (1350 times) [2024-04-26 09:35:13,539][49728] Signal inference workers to resume experience collection... 
(1350 times) [2024-04-26 09:35:13,568][49750] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-04-26 09:35:13,569][49750] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-04-26 09:35:15,308][49750] Updated weights for policy 0, policy_version 142591 (0.0034) [2024-04-26 09:35:17,062][49517] Fps is (10 sec: 49153.2, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2336309248. Throughput: 0: 49904.0. Samples: 89217900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:35:17,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 09:35:18,037][49750] Updated weights for policy 0, policy_version 142601 (0.0029) [2024-04-26 09:35:21,647][49750] Updated weights for policy 0, policy_version 142611 (0.0030) [2024-04-26 09:35:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2336555008. Throughput: 0: 50177.7. Samples: 89365640. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-26 09:35:22,072][49517] Avg episode reward: [(0, '0.437')] [2024-04-26 09:35:24,413][49750] Updated weights for policy 0, policy_version 142621 (0.0031) [2024-04-26 09:35:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 50040.7). Total num frames: 2336817152. Throughput: 0: 50170.4. Samples: 89671600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:35:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 09:35:28,034][49750] Updated weights for policy 0, policy_version 142631 (0.0030) [2024-04-26 09:35:30,839][49750] Updated weights for policy 0, policy_version 142641 (0.0032) [2024-04-26 09:35:32,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.0, 300 sec: 50151.7). Total num frames: 2337079296. Throughput: 0: 50207.2. Samples: 89975420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:35:32,072][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 09:35:34,620][49750] Updated weights for policy 0, policy_version 142651 (0.0030) [2024-04-26 09:35:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50151.7). Total num frames: 2337341440. Throughput: 0: 50136.9. Samples: 90129200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:35:37,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 09:35:37,384][49750] Updated weights for policy 0, policy_version 142661 (0.0038) [2024-04-26 09:35:41,212][49750] Updated weights for policy 0, policy_version 142671 (0.0030) [2024-04-26 09:35:42,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.4, 300 sec: 49985.1). Total num frames: 2337570816. Throughput: 0: 50003.0. Samples: 90424660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:35:42,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 09:35:44,098][49750] Updated weights for policy 0, policy_version 142681 (0.0041) [2024-04-26 09:35:47,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49697.9, 300 sec: 50040.6). Total num frames: 2337816576. Throughput: 0: 50178.1. Samples: 90724320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:35:47,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 09:35:47,774][49750] Updated weights for policy 0, policy_version 142691 (0.0028) [2024-04-26 09:35:50,514][49750] Updated weights for policy 0, policy_version 142701 (0.0034) [2024-04-26 09:35:52,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2338078720. Throughput: 0: 50042.8. Samples: 90869440. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:35:52,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 09:35:54,241][49750] Updated weights for policy 0, policy_version 142711 (0.0029) [2024-04-26 09:35:56,945][49750] Updated weights for policy 0, policy_version 142721 (0.0029) [2024-04-26 09:35:57,063][49517] Fps is (10 sec: 52429.3, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2338340864. Throughput: 0: 50076.3. Samples: 91173360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:35:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 09:36:00,800][49750] Updated weights for policy 0, policy_version 142731 (0.0031) [2024-04-26 09:36:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2338570240. Throughput: 0: 50171.9. Samples: 91475640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:36:02,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 09:36:03,493][49750] Updated weights for policy 0, policy_version 142741 (0.0032) [2024-04-26 09:36:07,063][49517] Fps is (10 sec: 45875.2, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2338799616. Throughput: 0: 50205.8. Samples: 91624900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:36:07,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 09:36:07,333][49750] Updated weights for policy 0, policy_version 142751 (0.0036) [2024-04-26 09:36:10,131][49750] Updated weights for policy 0, policy_version 142761 (0.0029) [2024-04-26 09:36:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.1, 300 sec: 50040.6). Total num frames: 2339078144. Throughput: 0: 50015.8. Samples: 91922320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:36:12,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 09:36:13,798][49750] Updated weights for policy 0, policy_version 142771 (0.0030) [2024-04-26 09:36:16,901][49750] Updated weights for policy 0, policy_version 142781 (0.0030) [2024-04-26 09:36:17,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2339323904. Throughput: 0: 49909.1. Samples: 92221320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:36:17,067][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 09:36:20,486][49750] Updated weights for policy 0, policy_version 142791 (0.0031) [2024-04-26 09:36:22,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50517.5, 300 sec: 50096.2). Total num frames: 2339586048. Throughput: 0: 50055.2. Samples: 92381680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 09:36:22,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 09:36:23,463][49750] Updated weights for policy 0, policy_version 142801 (0.0032) [2024-04-26 09:36:26,897][49750] Updated weights for policy 0, policy_version 142811 (0.0035) [2024-04-26 09:36:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2339815424. Throughput: 0: 49999.2. Samples: 92674620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:36:27,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 09:36:29,975][49750] Updated weights for policy 0, policy_version 142821 (0.0034) [2024-04-26 09:36:30,615][49728] Signal inference workers to stop experience collection... (1400 times) [2024-04-26 09:36:30,615][49728] Signal inference workers to resume experience collection... 
(1400 times) [2024-04-26 09:36:30,635][49750] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-04-26 09:36:30,635][49750] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-04-26 09:36:32,062][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.4, 300 sec: 50040.6). Total num frames: 2340077568. Throughput: 0: 50025.0. Samples: 92975440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:36:32,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 09:36:33,495][49750] Updated weights for policy 0, policy_version 142831 (0.0032) [2024-04-26 09:36:36,413][49750] Updated weights for policy 0, policy_version 142841 (0.0030) [2024-04-26 09:36:37,063][49517] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 2340323328. Throughput: 0: 50109.7. Samples: 93124380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:36:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 09:36:40,044][49750] Updated weights for policy 0, policy_version 142851 (0.0030) [2024-04-26 09:36:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 50096.1). Total num frames: 2340585472. Throughput: 0: 50153.2. Samples: 93430260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:36:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 09:36:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000142858_2340585472.pth... [2024-04-26 09:36:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000142125_2328576000.pth [2024-04-26 09:36:42,850][49750] Updated weights for policy 0, policy_version 142861 (0.0028) [2024-04-26 09:36:46,745][49750] Updated weights for policy 0, policy_version 142871 (0.0035) [2024-04-26 09:36:47,062][49517] Fps is (10 sec: 49152.9, 60 sec: 49971.4, 300 sec: 49929.6). Total num frames: 2340814848. Throughput: 0: 50199.3. Samples: 93734600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:36:47,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 09:36:49,545][49750] Updated weights for policy 0, policy_version 142881 (0.0029) [2024-04-26 09:36:52,063][49517] Fps is (10 sec: 47513.8, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2341060608. Throughput: 0: 50076.4. Samples: 93878340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:36:52,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 09:36:53,201][49750] Updated weights for policy 0, policy_version 142891 (0.0029) [2024-04-26 09:36:55,965][49750] Updated weights for policy 0, policy_version 142901 (0.0032) [2024-04-26 09:36:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2341339136. Throughput: 0: 50060.5. Samples: 94175040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:36:57,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 09:36:59,768][49750] Updated weights for policy 0, policy_version 142911 (0.0039) [2024-04-26 09:37:02,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2341584896. Throughput: 0: 50051.1. Samples: 94473620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:37:02,072][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 09:37:02,345][49750] Updated weights for policy 0, policy_version 142921 (0.0035) [2024-04-26 09:37:06,272][49750] Updated weights for policy 0, policy_version 142931 (0.0032) [2024-04-26 09:37:07,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50040.6). Total num frames: 2341830656. Throughput: 0: 49859.8. Samples: 94625380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:37:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 09:37:09,082][49750] Updated weights for policy 0, policy_version 142941 (0.0030) [2024-04-26 09:37:12,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49425.2, 300 sec: 49929.5). 
Total num frames: 2342043648. Throughput: 0: 50014.2. Samples: 94925260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:37:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 09:37:12,799][49750] Updated weights for policy 0, policy_version 142951 (0.0032) [2024-04-26 09:37:15,836][49750] Updated weights for policy 0, policy_version 142961 (0.0032) [2024-04-26 09:37:17,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2342338560. Throughput: 0: 50018.6. Samples: 95226280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:37:17,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 09:37:19,240][49750] Updated weights for policy 0, policy_version 142971 (0.0035) [2024-04-26 09:37:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 49698.0, 300 sec: 50151.7). Total num frames: 2342567936. Throughput: 0: 50037.8. Samples: 95376080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:37:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 09:37:22,278][49750] Updated weights for policy 0, policy_version 142981 (0.0042) [2024-04-26 09:37:25,864][49750] Updated weights for policy 0, policy_version 142991 (0.0032) [2024-04-26 09:37:27,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2342830080. Throughput: 0: 50115.3. Samples: 95685440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 09:37:27,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 09:37:28,706][49750] Updated weights for policy 0, policy_version 143001 (0.0029) [2024-04-26 09:37:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2343059456. Throughput: 0: 49972.8. Samples: 95983380. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:37:32,063][49517] Avg episode reward: [(0, '0.467')] [2024-04-26 09:37:32,443][49750] Updated weights for policy 0, policy_version 143011 (0.0029) [2024-04-26 09:37:35,121][49728] Signal inference workers to stop experience collection... (1450 times) [2024-04-26 09:37:35,183][49750] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-04-26 09:37:35,186][49728] Signal inference workers to resume experience collection... (1450 times) [2024-04-26 09:37:35,196][49750] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-04-26 09:37:35,323][49750] Updated weights for policy 0, policy_version 143021 (0.0032) [2024-04-26 09:37:37,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2343337984. Throughput: 0: 49955.6. Samples: 96126340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:37:37,072][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 09:37:38,866][49750] Updated weights for policy 0, policy_version 143031 (0.0035) [2024-04-26 09:37:41,932][49750] Updated weights for policy 0, policy_version 143041 (0.0033) [2024-04-26 09:37:42,063][49517] Fps is (10 sec: 52428.5, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2343583744. Throughput: 0: 49943.2. Samples: 96422480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:37:42,072][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 09:37:45,326][49750] Updated weights for policy 0, policy_version 143051 (0.0039) [2024-04-26 09:37:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50096.2). Total num frames: 2343829504. Throughput: 0: 50130.3. Samples: 96729480. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:37:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 09:37:48,320][49750] Updated weights for policy 0, policy_version 143061 (0.0031) [2024-04-26 09:37:51,892][49750] Updated weights for policy 0, policy_version 143071 (0.0031) [2024-04-26 09:37:52,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 49985.1). Total num frames: 2344075264. Throughput: 0: 50069.1. Samples: 96878480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:37:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 09:37:54,717][49750] Updated weights for policy 0, policy_version 143081 (0.0034) [2024-04-26 09:37:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2344321024. Throughput: 0: 50100.4. Samples: 97179780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:37:57,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 09:37:58,423][49750] Updated weights for policy 0, policy_version 143091 (0.0032) [2024-04-26 09:38:01,289][49750] Updated weights for policy 0, policy_version 143101 (0.0030) [2024-04-26 09:38:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2344583168. Throughput: 0: 50106.7. Samples: 97481080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:38:02,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 09:38:04,920][49750] Updated weights for policy 0, policy_version 143111 (0.0034) [2024-04-26 09:38:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2344845312. Throughput: 0: 50221.0. Samples: 97636020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:38:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 09:38:07,925][49750] Updated weights for policy 0, policy_version 143121 (0.0029) [2024-04-26 09:38:11,339][49750] Updated weights for policy 0, policy_version 143131 (0.0039) [2024-04-26 09:38:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50040.6). Total num frames: 2345074688. Throughput: 0: 50054.1. Samples: 97937880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:38:12,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 09:38:14,309][49750] Updated weights for policy 0, policy_version 143141 (0.0034) [2024-04-26 09:38:17,063][49517] Fps is (10 sec: 45874.1, 60 sec: 49425.0, 300 sec: 49985.0). Total num frames: 2345304064. Throughput: 0: 50145.5. Samples: 98239940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:38:17,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 09:38:17,786][49750] Updated weights for policy 0, policy_version 143151 (0.0033) [2024-04-26 09:38:21,049][49750] Updated weights for policy 0, policy_version 143161 (0.0031) [2024-04-26 09:38:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50151.7). Total num frames: 2345598976. Throughput: 0: 50154.5. Samples: 98383300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:38:22,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 09:38:24,544][49750] Updated weights for policy 0, policy_version 143171 (0.0030) [2024-04-26 09:38:27,063][49517] Fps is (10 sec: 54068.0, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2345844736. Throughput: 0: 50276.4. Samples: 98684920. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 09:38:27,072][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 09:38:27,655][49750] Updated weights for policy 0, policy_version 143181 (0.0030) [2024-04-26 09:38:31,174][49750] Updated weights for policy 0, policy_version 143191 (0.0034) [2024-04-26 09:38:32,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50517.3, 300 sec: 50096.2). Total num frames: 2346090496. Throughput: 0: 50163.6. Samples: 98986840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:38:32,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 09:38:34,084][49750] Updated weights for policy 0, policy_version 143201 (0.0026) [2024-04-26 09:38:37,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2346336256. Throughput: 0: 50170.1. Samples: 99136140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:38:37,063][49517] Avg episode reward: [(0, '0.436')] [2024-04-26 09:38:37,588][49750] Updated weights for policy 0, policy_version 143211 (0.0034) [2024-04-26 09:38:40,466][49750] Updated weights for policy 0, policy_version 143221 (0.0029) [2024-04-26 09:38:42,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 50096.1). Total num frames: 2346582016. Throughput: 0: 50038.1. Samples: 99431500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:38:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 09:38:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000143224_2346582016.pth... [2024-04-26 09:38:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000142490_2334556160.pth [2024-04-26 09:38:44,155][49750] Updated weights for policy 0, policy_version 143231 (0.0031) [2024-04-26 09:38:46,914][49750] Updated weights for policy 0, policy_version 143241 (0.0030) [2024-04-26 09:38:47,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50207.2). 
Total num frames: 2346860544. Throughput: 0: 50140.9. Samples: 99737420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:38:47,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 09:38:50,627][49750] Updated weights for policy 0, policy_version 143251 (0.0032) [2024-04-26 09:38:51,069][49728] Signal inference workers to stop experience collection... (1500 times) [2024-04-26 09:38:51,069][49728] Signal inference workers to resume experience collection... (1500 times) [2024-04-26 09:38:51,087][49750] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-04-26 09:38:51,087][49750] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-04-26 09:38:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.1, 300 sec: 50151.7). Total num frames: 2347089920. Throughput: 0: 50046.1. Samples: 99888100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:38:52,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 09:38:53,660][49750] Updated weights for policy 0, policy_version 143261 (0.0026) [2024-04-26 09:38:57,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2347335680. Throughput: 0: 50019.6. Samples: 100188760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:38:57,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 09:38:57,197][49750] Updated weights for policy 0, policy_version 143271 (0.0032) [2024-04-26 09:39:00,707][49750] Updated weights for policy 0, policy_version 143281 (0.0034) [2024-04-26 09:39:02,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2347581440. Throughput: 0: 49851.7. Samples: 100483260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:39:02,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 09:39:03,790][49750] Updated weights for policy 0, policy_version 143291 (0.0036) [2024-04-26 09:39:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49698.0, 300 sec: 50096.2). Total num frames: 2347827200. Throughput: 0: 49989.0. Samples: 100632800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:39:07,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 09:39:07,251][49750] Updated weights for policy 0, policy_version 143301 (0.0031) [2024-04-26 09:39:10,216][49750] Updated weights for policy 0, policy_version 143311 (0.0029) [2024-04-26 09:39:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2348089344. Throughput: 0: 50056.5. Samples: 100937460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:39:12,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 09:39:13,673][49750] Updated weights for policy 0, policy_version 143321 (0.0030) [2024-04-26 09:39:16,871][49750] Updated weights for policy 0, policy_version 143331 (0.0030) [2024-04-26 09:39:17,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 2348335104. Throughput: 0: 50063.0. Samples: 101239680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:39:17,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 09:39:20,287][49750] Updated weights for policy 0, policy_version 143341 (0.0033) [2024-04-26 09:39:22,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2348580864. Throughput: 0: 49984.8. Samples: 101385460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:39:22,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 09:39:23,408][49750] Updated weights for policy 0, policy_version 143351 (0.0029) [2024-04-26 09:39:26,851][49750] Updated weights for policy 0, policy_version 143361 (0.0028) [2024-04-26 09:39:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2348826624. Throughput: 0: 50085.4. Samples: 101685340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:39:27,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 09:39:30,042][49750] Updated weights for policy 0, policy_version 143371 (0.0037) [2024-04-26 09:39:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2349088768. Throughput: 0: 49972.0. Samples: 101986160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 09:39:32,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 09:39:33,472][49750] Updated weights for policy 0, policy_version 143381 (0.0039) [2024-04-26 09:39:36,405][49750] Updated weights for policy 0, policy_version 143391 (0.0032) [2024-04-26 09:39:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2349334528. Throughput: 0: 49978.7. Samples: 102137140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:39:37,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 09:39:40,110][49750] Updated weights for policy 0, policy_version 143401 (0.0034) [2024-04-26 09:39:42,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2349580288. Throughput: 0: 50004.8. Samples: 102438980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:39:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 09:39:42,936][49750] Updated weights for policy 0, policy_version 143411 (0.0030) [2024-04-26 09:39:46,605][49750] Updated weights for policy 0, policy_version 143421 (0.0030) [2024-04-26 09:39:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2349842432. Throughput: 0: 50190.7. Samples: 102741840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:39:47,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 09:39:49,472][49750] Updated weights for policy 0, policy_version 143431 (0.0030) [2024-04-26 09:39:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2350071808. Throughput: 0: 50009.5. Samples: 102883220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:39:52,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 09:39:53,073][49750] Updated weights for policy 0, policy_version 143441 (0.0034) [2024-04-26 09:39:56,083][49750] Updated weights for policy 0, policy_version 143451 (0.0030) [2024-04-26 09:39:57,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.1, 300 sec: 50151.7). Total num frames: 2350350336. Throughput: 0: 50182.1. Samples: 103195660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:39:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 09:39:59,531][49750] Updated weights for policy 0, policy_version 143461 (0.0036) [2024-04-26 09:40:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.3, 300 sec: 50040.7). Total num frames: 2350579712. Throughput: 0: 49999.3. Samples: 103489640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:40:02,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 09:40:02,648][49750] Updated weights for policy 0, policy_version 143471 (0.0031) [2024-04-26 09:40:06,158][49750] Updated weights for policy 0, policy_version 143481 (0.0028) [2024-04-26 09:40:07,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2350841856. Throughput: 0: 50087.7. Samples: 103639400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:40:07,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 09:40:09,241][49750] Updated weights for policy 0, policy_version 143491 (0.0038) [2024-04-26 09:40:12,062][49517] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2351071232. Throughput: 0: 50085.8. Samples: 103939200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:40:12,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 09:40:12,659][49750] Updated weights for policy 0, policy_version 143501 (0.0033) [2024-04-26 09:40:15,740][49750] Updated weights for policy 0, policy_version 143511 (0.0037) [2024-04-26 09:40:17,062][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2351333376. Throughput: 0: 49960.0. Samples: 104234360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:40:17,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 09:40:19,184][49750] Updated weights for policy 0, policy_version 143521 (0.0034) [2024-04-26 09:40:21,782][49728] Signal inference workers to stop experience collection... (1550 times) [2024-04-26 09:40:21,816][49750] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-04-26 09:40:21,851][49728] Signal inference workers to resume experience collection... 
(1550 times) [2024-04-26 09:40:21,852][49750] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-04-26 09:40:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2351595520. Throughput: 0: 50034.3. Samples: 104388680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:40:22,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 09:40:22,145][49750] Updated weights for policy 0, policy_version 143531 (0.0029) [2024-04-26 09:40:25,720][49750] Updated weights for policy 0, policy_version 143541 (0.0029) [2024-04-26 09:40:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.3, 300 sec: 50040.7). Total num frames: 2351841280. Throughput: 0: 50141.0. Samples: 104695320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:40:27,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 09:40:28,670][49750] Updated weights for policy 0, policy_version 143551 (0.0032) [2024-04-26 09:40:32,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49698.2, 300 sec: 49929.6). Total num frames: 2352070656. Throughput: 0: 50117.0. Samples: 104997100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:40:32,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 09:40:32,234][49750] Updated weights for policy 0, policy_version 143561 (0.0037) [2024-04-26 09:40:35,367][49750] Updated weights for policy 0, policy_version 143571 (0.0031) [2024-04-26 09:40:37,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50244.2, 300 sec: 50096.1). Total num frames: 2352349184. Throughput: 0: 50206.5. Samples: 105142520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 09:40:37,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 09:40:38,756][49750] Updated weights for policy 0, policy_version 143581 (0.0037) [2024-04-26 09:40:41,757][49750] Updated weights for policy 0, policy_version 143591 (0.0029) [2024-04-26 09:40:42,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.2, 300 sec: 50096.2). Total num frames: 2352594944. Throughput: 0: 49911.2. Samples: 105441660. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:40:42,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 09:40:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000143591_2352594944.pth... [2024-04-26 09:40:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000142858_2340585472.pth [2024-04-26 09:40:45,295][49750] Updated weights for policy 0, policy_version 143601 (0.0035) [2024-04-26 09:40:47,062][49517] Fps is (10 sec: 47514.6, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2352824320. Throughput: 0: 50013.8. Samples: 105740260. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:40:47,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 09:40:48,146][49750] Updated weights for policy 0, policy_version 143611 (0.0033) [2024-04-26 09:40:51,748][49750] Updated weights for policy 0, policy_version 143621 (0.0031) [2024-04-26 09:40:52,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2353086464. Throughput: 0: 49921.2. Samples: 105885860. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:40:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 09:40:54,843][49750] Updated weights for policy 0, policy_version 143631 (0.0030) [2024-04-26 09:40:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49698.3, 300 sec: 50040.6). Total num frames: 2353332224. Throughput: 0: 50014.7. Samples: 106189860. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:40:57,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 09:40:58,454][49750] Updated weights for policy 0, policy_version 143641 (0.0029) [2024-04-26 09:41:01,404][49750] Updated weights for policy 0, policy_version 143651 (0.0030) [2024-04-26 09:41:02,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.1, 300 sec: 50151.7). Total num frames: 2353594368. Throughput: 0: 50049.7. Samples: 106486600. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:41:02,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 09:41:04,895][49750] Updated weights for policy 0, policy_version 143661 (0.0038) [2024-04-26 09:41:07,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2353840128. Throughput: 0: 50130.9. Samples: 106644580. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:41:07,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 09:41:07,745][49750] Updated weights for policy 0, policy_version 143671 (0.0029) [2024-04-26 09:41:11,557][49750] Updated weights for policy 0, policy_version 143681 (0.0028) [2024-04-26 09:41:12,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2354085888. Throughput: 0: 50064.8. Samples: 106948240. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:41:12,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 09:41:14,260][49750] Updated weights for policy 0, policy_version 143691 (0.0028) [2024-04-26 09:41:17,063][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2354331648. Throughput: 0: 49991.0. Samples: 107246700. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:41:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 09:41:18,026][49750] Updated weights for policy 0, policy_version 143701 (0.0034) [2024-04-26 09:41:20,867][49750] Updated weights for policy 0, policy_version 143711 (0.0030) [2024-04-26 09:41:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2354577408. Throughput: 0: 50007.7. Samples: 107392860. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:41:22,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 09:41:24,552][49750] Updated weights for policy 0, policy_version 143721 (0.0038) [2024-04-26 09:41:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2354839552. Throughput: 0: 50028.9. Samples: 107692960. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:41:27,071][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 09:41:27,522][49750] Updated weights for policy 0, policy_version 143731 (0.0042) [2024-04-26 09:41:28,751][49728] Signal inference workers to stop experience collection... (1600 times) [2024-04-26 09:41:28,751][49728] Signal inference workers to resume experience collection... (1600 times) [2024-04-26 09:41:28,776][49750] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-04-26 09:41:28,777][49750] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-04-26 09:41:31,233][49750] Updated weights for policy 0, policy_version 143741 (0.0033) [2024-04-26 09:41:32,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.1, 300 sec: 50040.6). Total num frames: 2355085312. Throughput: 0: 49994.4. Samples: 107990020. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:41:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 09:41:34,156][49750] Updated weights for policy 0, policy_version 143751 (0.0028) [2024-04-26 09:41:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2355347456. Throughput: 0: 50129.4. Samples: 108141680. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 09:41:37,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 09:41:37,549][49750] Updated weights for policy 0, policy_version 143761 (0.0029) [2024-04-26 09:41:40,699][49750] Updated weights for policy 0, policy_version 143771 (0.0029) [2024-04-26 09:41:42,062][49517] Fps is (10 sec: 49153.1, 60 sec: 49698.3, 300 sec: 50040.6). Total num frames: 2355576832. Throughput: 0: 50070.2. Samples: 108443020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:41:42,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 09:41:44,021][49750] Updated weights for policy 0, policy_version 143781 (0.0033) [2024-04-26 09:41:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2355855360. Throughput: 0: 50096.5. Samples: 108740940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:41:47,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 09:41:47,176][49750] Updated weights for policy 0, policy_version 143791 (0.0030) [2024-04-26 09:41:50,523][49750] Updated weights for policy 0, policy_version 143801 (0.0029) [2024-04-26 09:41:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.4, 300 sec: 50040.6). Total num frames: 2356101120. Throughput: 0: 49992.6. Samples: 108894240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:41:52,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 09:41:53,780][49750] Updated weights for policy 0, policy_version 143811 (0.0031) [2024-04-26 09:41:57,041][49750] Updated weights for policy 0, policy_version 143821 (0.0039) [2024-04-26 09:41:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 2356363264. Throughput: 0: 50037.9. Samples: 109199940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:41:57,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 09:42:00,318][49750] Updated weights for policy 0, policy_version 143831 (0.0034) [2024-04-26 09:42:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2356576256. Throughput: 0: 50129.4. Samples: 109502520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:42:02,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 09:42:03,488][49750] Updated weights for policy 0, policy_version 143841 (0.0036) [2024-04-26 09:42:06,916][49750] Updated weights for policy 0, policy_version 143851 (0.0031) [2024-04-26 09:42:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2356854784. Throughput: 0: 50077.8. Samples: 109646360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:42:07,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 09:42:10,102][49750] Updated weights for policy 0, policy_version 143861 (0.0033) [2024-04-26 09:42:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2357100544. Throughput: 0: 49959.1. Samples: 109941120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:42:12,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 09:42:13,492][49750] Updated weights for policy 0, policy_version 143871 (0.0035) [2024-04-26 09:42:16,837][49750] Updated weights for policy 0, policy_version 143881 (0.0032) [2024-04-26 09:42:17,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2357346304. Throughput: 0: 50044.2. Samples: 110242000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:42:17,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 09:42:19,847][49750] Updated weights for policy 0, policy_version 143891 (0.0028) [2024-04-26 09:42:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2357592064. Throughput: 0: 50024.9. Samples: 110392800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:42:22,071][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 09:42:23,430][49750] Updated weights for policy 0, policy_version 143901 (0.0036) [2024-04-26 09:42:26,253][49750] Updated weights for policy 0, policy_version 143911 (0.0030) [2024-04-26 09:42:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2357837824. Throughput: 0: 49999.1. Samples: 110692980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:42:27,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 09:42:30,038][49750] Updated weights for policy 0, policy_version 143921 (0.0027) [2024-04-26 09:42:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.5, 300 sec: 50040.6). Total num frames: 2358099968. Throughput: 0: 50130.8. Samples: 110996820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:42:32,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 09:42:32,769][49750] Updated weights for policy 0, policy_version 143931 (0.0041) [2024-04-26 09:42:36,440][49750] Updated weights for policy 0, policy_version 143941 (0.0032) [2024-04-26 09:42:37,063][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2358345728. Throughput: 0: 50211.9. Samples: 111153780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:42:37,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 09:42:39,354][49750] Updated weights for policy 0, policy_version 143951 (0.0037) [2024-04-26 09:42:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2358591488. Throughput: 0: 50059.5. Samples: 111452620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 09:42:42,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 09:42:42,099][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000143958_2358607872.pth... [2024-04-26 09:42:42,149][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000143224_2346582016.pth [2024-04-26 09:42:42,989][49750] Updated weights for policy 0, policy_version 143961 (0.0040) [2024-04-26 09:42:46,595][49750] Updated weights for policy 0, policy_version 143971 (0.0034) [2024-04-26 09:42:47,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2358837248. Throughput: 0: 49888.8. Samples: 111747520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:42:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 09:42:49,554][49750] Updated weights for policy 0, policy_version 143981 (0.0033) [2024-04-26 09:42:52,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2359115776. Throughput: 0: 49859.0. Samples: 111890020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:42:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 09:42:53,174][49750] Updated weights for policy 0, policy_version 143991 (0.0035) [2024-04-26 09:42:54,210][49728] Signal inference workers to stop experience collection... (1650 times) [2024-04-26 09:42:54,210][49728] Signal inference workers to resume experience collection... (1650 times) [2024-04-26 09:42:54,222][49750] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-04-26 09:42:54,223][49750] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-04-26 09:42:55,996][49750] Updated weights for policy 0, policy_version 144001 (0.0027) [2024-04-26 09:42:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2359345152. Throughput: 0: 49989.0. Samples: 112190620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:42:57,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 09:42:59,665][49750] Updated weights for policy 0, policy_version 144011 (0.0028) [2024-04-26 09:43:02,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2359590912. Throughput: 0: 50080.5. Samples: 112495620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:02,071][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 09:43:02,522][49750] Updated weights for policy 0, policy_version 144021 (0.0035) [2024-04-26 09:43:06,551][49750] Updated weights for policy 0, policy_version 144031 (0.0029) [2024-04-26 09:43:07,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2359820288. Throughput: 0: 50007.1. Samples: 112643120. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:07,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 09:43:09,124][49750] Updated weights for policy 0, policy_version 144041 (0.0038) [2024-04-26 09:43:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2360098816. Throughput: 0: 50048.9. Samples: 112945180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:12,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 09:43:12,972][49750] Updated weights for policy 0, policy_version 144051 (0.0032) [2024-04-26 09:43:15,633][49750] Updated weights for policy 0, policy_version 144061 (0.0039) [2024-04-26 09:43:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2360344576. Throughput: 0: 49976.4. Samples: 113245760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:17,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 09:43:19,453][49750] Updated weights for policy 0, policy_version 144071 (0.0034) [2024-04-26 09:43:22,046][49750] Updated weights for policy 0, policy_version 144081 (0.0037) [2024-04-26 09:43:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50096.2). Total num frames: 2360623104. Throughput: 0: 49832.5. Samples: 113396240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:22,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 09:43:25,898][49750] Updated weights for policy 0, policy_version 144091 (0.0032) [2024-04-26 09:43:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2360836096. Throughput: 0: 49876.7. Samples: 113697080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 09:43:28,669][49750] Updated weights for policy 0, policy_version 144101 (0.0030) [2024-04-26 09:43:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50040.6). 
Total num frames: 2361098240. Throughput: 0: 49886.8. Samples: 113992420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:32,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 09:43:32,414][49750] Updated weights for policy 0, policy_version 144111 (0.0039) [2024-04-26 09:43:35,301][49750] Updated weights for policy 0, policy_version 144121 (0.0035) [2024-04-26 09:43:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2361360384. Throughput: 0: 50246.7. Samples: 114151120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:37,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 09:43:38,805][49750] Updated weights for policy 0, policy_version 144131 (0.0031) [2024-04-26 09:43:41,851][49750] Updated weights for policy 0, policy_version 144141 (0.0033) [2024-04-26 09:43:42,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50244.1, 300 sec: 49985.1). Total num frames: 2361606144. Throughput: 0: 50254.5. Samples: 114452080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 09:43:45,428][49750] Updated weights for policy 0, policy_version 144151 (0.0032) [2024-04-26 09:43:47,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2361835520. Throughput: 0: 50055.1. Samples: 114748100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:43:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 09:43:48,462][49750] Updated weights for policy 0, policy_version 144161 (0.0032) [2024-04-26 09:43:52,029][49750] Updated weights for policy 0, policy_version 144171 (0.0034) [2024-04-26 09:43:52,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2362097664. Throughput: 0: 50263.0. Samples: 114904960. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:43:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 09:43:55,109][49750] Updated weights for policy 0, policy_version 144181 (0.0030) [2024-04-26 09:43:56,293][49728] Signal inference workers to stop experience collection... (1700 times) [2024-04-26 09:43:56,293][49728] Signal inference workers to resume experience collection... (1700 times) [2024-04-26 09:43:56,309][49750] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-04-26 09:43:56,310][49750] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-04-26 09:43:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.2, 300 sec: 50096.2). Total num frames: 2362359808. Throughput: 0: 50132.9. Samples: 115201160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:43:57,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 09:43:58,573][49750] Updated weights for policy 0, policy_version 144191 (0.0034) [2024-04-26 09:44:01,718][49750] Updated weights for policy 0, policy_version 144201 (0.0032) [2024-04-26 09:44:02,062][49517] Fps is (10 sec: 49152.8, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2362589184. Throughput: 0: 50028.5. Samples: 115497040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:02,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 09:44:05,366][49750] Updated weights for policy 0, policy_version 144211 (0.0030) [2024-04-26 09:44:07,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50096.1). Total num frames: 2362867712. Throughput: 0: 50035.4. Samples: 115647840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 09:44:08,203][49750] Updated weights for policy 0, policy_version 144221 (0.0034) [2024-04-26 09:44:11,785][49750] Updated weights for policy 0, policy_version 144231 (0.0032) [2024-04-26 09:44:12,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2363097088. Throughput: 0: 50003.2. Samples: 115947220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 09:44:14,635][49750] Updated weights for policy 0, policy_version 144241 (0.0032) [2024-04-26 09:44:17,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2363359232. Throughput: 0: 49970.2. Samples: 116241080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 09:44:18,332][49750] Updated weights for policy 0, policy_version 144251 (0.0029) [2024-04-26 09:44:21,159][49750] Updated weights for policy 0, policy_version 144261 (0.0037) [2024-04-26 09:44:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 50040.6). Total num frames: 2363588608. Throughput: 0: 50013.9. Samples: 116401740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 09:44:24,864][49750] Updated weights for policy 0, policy_version 144271 (0.0042) [2024-04-26 09:44:27,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2363834368. Throughput: 0: 49916.2. Samples: 116698300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:27,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 09:44:27,951][49750] Updated weights for policy 0, policy_version 144281 (0.0030) [2024-04-26 09:44:31,274][49750] Updated weights for policy 0, policy_version 144291 (0.0034) [2024-04-26 09:44:32,063][49517] Fps is (10 sec: 49151.3, 60 sec: 49698.0, 300 sec: 49985.1). Total num frames: 2364080128. Throughput: 0: 49900.3. Samples: 116993620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:32,072][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 09:44:34,801][49750] Updated weights for policy 0, policy_version 144301 (0.0035) [2024-04-26 09:44:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2364358656. Throughput: 0: 49872.2. Samples: 117149200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:37,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 09:44:37,734][49750] Updated weights for policy 0, policy_version 144311 (0.0031) [2024-04-26 09:44:41,484][49750] Updated weights for policy 0, policy_version 144321 (0.0038) [2024-04-26 09:44:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2364588032. Throughput: 0: 49966.0. Samples: 117449640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:42,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 09:44:42,118][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000144324_2364604416.pth... [2024-04-26 09:44:42,167][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000143591_2352594944.pth [2024-04-26 09:44:44,306][49750] Updated weights for policy 0, policy_version 144331 (0.0029) [2024-04-26 09:44:47,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2364833792. Throughput: 0: 49951.9. Samples: 117744880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:44:47,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 09:44:48,130][49750] Updated weights for policy 0, policy_version 144341 (0.0045) [2024-04-26 09:44:50,905][49750] Updated weights for policy 0, policy_version 144351 (0.0030) [2024-04-26 09:44:52,062][49517] Fps is (10 sec: 50791.3, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2365095936. Throughput: 0: 49763.7. Samples: 117887200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:44:52,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 09:44:54,608][49750] Updated weights for policy 0, policy_version 144361 (0.0031) [2024-04-26 09:44:57,063][49517] Fps is (10 sec: 50789.4, 60 sec: 49697.9, 300 sec: 50040.6). Total num frames: 2365341696. Throughput: 0: 49809.2. Samples: 118188640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:44:57,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 09:44:57,356][49750] Updated weights for policy 0, policy_version 144371 (0.0031) [2024-04-26 09:45:01,023][49750] Updated weights for policy 0, policy_version 144381 (0.0030) [2024-04-26 09:45:02,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2365603840. Throughput: 0: 50059.4. Samples: 118493760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:02,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 09:45:04,179][49750] Updated weights for policy 0, policy_version 144391 (0.0037) [2024-04-26 09:45:07,062][49517] Fps is (10 sec: 49153.5, 60 sec: 49425.2, 300 sec: 50040.6). Total num frames: 2365833216. Throughput: 0: 49918.7. Samples: 118648080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:07,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 09:45:07,555][49750] Updated weights for policy 0, policy_version 144401 (0.0036) [2024-04-26 09:45:07,865][49728] Signal inference workers to stop experience collection... 
(1750 times) [2024-04-26 09:45:07,870][49728] Signal inference workers to resume experience collection... (1750 times) [2024-04-26 09:45:07,885][49750] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-04-26 09:45:07,886][49750] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-04-26 09:45:10,898][49750] Updated weights for policy 0, policy_version 144411 (0.0038) [2024-04-26 09:45:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2366095360. Throughput: 0: 50024.7. Samples: 118949420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:12,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 09:45:14,196][49750] Updated weights for policy 0, policy_version 144421 (0.0031) [2024-04-26 09:45:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49425.0, 300 sec: 49929.5). Total num frames: 2366324736. Throughput: 0: 49990.3. Samples: 119243180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 09:45:17,522][49750] Updated weights for policy 0, policy_version 144431 (0.0031) [2024-04-26 09:45:20,607][49750] Updated weights for policy 0, policy_version 144441 (0.0037) [2024-04-26 09:45:22,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.0, 300 sec: 50040.6). Total num frames: 2366603264. Throughput: 0: 49856.1. Samples: 119392740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:22,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 09:45:23,991][49750] Updated weights for policy 0, policy_version 144451 (0.0035) [2024-04-26 09:45:27,034][49750] Updated weights for policy 0, policy_version 144461 (0.0035) [2024-04-26 09:45:27,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.2, 300 sec: 50096.1). Total num frames: 2366849024. Throughput: 0: 49916.1. Samples: 119695860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:27,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 09:45:30,356][49750] Updated weights for policy 0, policy_version 144471 (0.0039) [2024-04-26 09:45:32,062][49517] Fps is (10 sec: 47514.7, 60 sec: 49971.3, 300 sec: 49929.6). Total num frames: 2367078400. Throughput: 0: 50036.1. Samples: 119996500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:32,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 09:45:33,599][49750] Updated weights for policy 0, policy_version 144481 (0.0032) [2024-04-26 09:45:36,896][49750] Updated weights for policy 0, policy_version 144491 (0.0029) [2024-04-26 09:45:37,063][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2367356928. Throughput: 0: 49996.0. Samples: 120137020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 09:45:40,407][49750] Updated weights for policy 0, policy_version 144501 (0.0030) [2024-04-26 09:45:42,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.4, 300 sec: 50096.1). Total num frames: 2367602688. Throughput: 0: 50231.3. Samples: 120449040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 09:45:43,405][49750] Updated weights for policy 0, policy_version 144511 (0.0026) [2024-04-26 09:45:46,907][49750] Updated weights for policy 0, policy_version 144521 (0.0029) [2024-04-26 09:45:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2367832064. Throughput: 0: 50003.7. Samples: 120743920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:47,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 09:45:49,866][49750] Updated weights for policy 0, policy_version 144531 (0.0032) [2024-04-26 09:45:52,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49698.2, 300 sec: 49985.1). 
Total num frames: 2368077824. Throughput: 0: 49827.6. Samples: 120890320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 09:45:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 09:45:53,360][49750] Updated weights for policy 0, policy_version 144541 (0.0033) [2024-04-26 09:45:56,469][49750] Updated weights for policy 0, policy_version 144551 (0.0032) [2024-04-26 09:45:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.4, 300 sec: 49985.1). Total num frames: 2368339968. Throughput: 0: 49844.6. Samples: 121192420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:45:57,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 09:45:59,903][49750] Updated weights for policy 0, policy_version 144561 (0.0029) [2024-04-26 09:46:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49698.3, 300 sec: 49985.1). Total num frames: 2368585728. Throughput: 0: 49902.8. Samples: 121488800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:02,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 09:46:02,934][49750] Updated weights for policy 0, policy_version 144571 (0.0032) [2024-04-26 09:46:03,770][49728] Signal inference workers to stop experience collection... (1800 times) [2024-04-26 09:46:03,770][49728] Signal inference workers to resume experience collection... (1800 times) [2024-04-26 09:46:03,812][49750] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-04-26 09:46:03,812][49750] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-04-26 09:46:06,503][49750] Updated weights for policy 0, policy_version 144581 (0.0030) [2024-04-26 09:46:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50096.2). Total num frames: 2368864256. Throughput: 0: 50054.5. Samples: 121645180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:07,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 09:46:09,694][49750] Updated weights for policy 0, policy_version 144591 (0.0032) [2024-04-26 09:46:12,063][49517] Fps is (10 sec: 49151.2, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2369077248. Throughput: 0: 49878.6. Samples: 121940400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:12,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 09:46:12,954][49750] Updated weights for policy 0, policy_version 144601 (0.0030) [2024-04-26 09:46:16,122][49750] Updated weights for policy 0, policy_version 144611 (0.0035) [2024-04-26 09:46:17,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2369323008. Throughput: 0: 49897.8. Samples: 122241900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 09:46:19,550][49750] Updated weights for policy 0, policy_version 144621 (0.0033) [2024-04-26 09:46:22,062][49517] Fps is (10 sec: 50791.3, 60 sec: 49698.4, 300 sec: 49985.1). Total num frames: 2369585152. Throughput: 0: 50041.5. Samples: 122388880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:22,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 09:46:22,633][49750] Updated weights for policy 0, policy_version 144631 (0.0030) [2024-04-26 09:46:26,005][49750] Updated weights for policy 0, policy_version 144641 (0.0036) [2024-04-26 09:46:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 49971.3, 300 sec: 50040.7). Total num frames: 2369847296. Throughput: 0: 49725.5. Samples: 122686680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 09:46:29,156][49750] Updated weights for policy 0, policy_version 144651 (0.0031) [2024-04-26 09:46:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 49929.6). Total num frames: 2370076672. Throughput: 0: 50014.3. Samples: 122994560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:32,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 09:46:32,523][49750] Updated weights for policy 0, policy_version 144661 (0.0034) [2024-04-26 09:46:35,712][49750] Updated weights for policy 0, policy_version 144671 (0.0032) [2024-04-26 09:46:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49425.2, 300 sec: 49985.1). Total num frames: 2370322432. Throughput: 0: 49881.8. Samples: 123135000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 09:46:39,135][49750] Updated weights for policy 0, policy_version 144681 (0.0029) [2024-04-26 09:46:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2370600960. Throughput: 0: 49790.6. Samples: 123433000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:42,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 09:46:42,181][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000144691_2370617344.pth... [2024-04-26 09:46:42,188][49750] Updated weights for policy 0, policy_version 144691 (0.0036) [2024-04-26 09:46:42,226][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000143958_2358607872.pth [2024-04-26 09:46:45,815][49750] Updated weights for policy 0, policy_version 144701 (0.0036) [2024-04-26 09:46:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 2370846720. Throughput: 0: 49951.1. Samples: 123736600. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:47,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 09:46:48,675][49750] Updated weights for policy 0, policy_version 144711 (0.0033) [2024-04-26 09:46:52,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.1, 300 sec: 49929.5). Total num frames: 2371092480. Throughput: 0: 49876.8. Samples: 123889640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:52,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 09:46:52,475][49750] Updated weights for policy 0, policy_version 144721 (0.0033) [2024-04-26 09:46:55,208][49750] Updated weights for policy 0, policy_version 144731 (0.0028) [2024-04-26 09:46:57,063][49517] Fps is (10 sec: 49151.3, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2371338240. Throughput: 0: 49808.0. Samples: 124181760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 09:46:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 09:46:58,881][49750] Updated weights for policy 0, policy_version 144741 (0.0033) [2024-04-26 09:47:01,826][49750] Updated weights for policy 0, policy_version 144751 (0.0030) [2024-04-26 09:47:02,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.1, 300 sec: 49985.1). Total num frames: 2371600384. Throughput: 0: 49771.4. Samples: 124481620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:02,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 09:47:05,294][49750] Updated weights for policy 0, policy_version 144761 (0.0028) [2024-04-26 09:47:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49425.0, 300 sec: 49929.6). Total num frames: 2371829760. Throughput: 0: 49944.3. Samples: 124636380. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:07,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 09:47:08,864][49750] Updated weights for policy 0, policy_version 144771 (0.0032) [2024-04-26 09:47:11,897][49750] Updated weights for policy 0, policy_version 144781 (0.0040) [2024-04-26 09:47:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2372091904. Throughput: 0: 49943.8. Samples: 124934160. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:12,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 09:47:15,623][49750] Updated weights for policy 0, policy_version 144791 (0.0037) [2024-04-26 09:47:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 2372321280. Throughput: 0: 49802.2. Samples: 125235660. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:17,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 09:47:18,546][49750] Updated weights for policy 0, policy_version 144801 (0.0036) [2024-04-26 09:47:19,147][49728] Signal inference workers to stop experience collection... (1850 times) [2024-04-26 09:47:19,165][49750] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-04-26 09:47:19,214][49728] Signal inference workers to resume experience collection... (1850 times) [2024-04-26 09:47:19,214][49750] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-04-26 09:47:22,063][49517] Fps is (10 sec: 45875.4, 60 sec: 49424.9, 300 sec: 49874.0). Total num frames: 2372550656. Throughput: 0: 49870.1. Samples: 125379160. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 09:47:22,386][49750] Updated weights for policy 0, policy_version 144811 (0.0027) [2024-04-26 09:47:25,024][49750] Updated weights for policy 0, policy_version 144821 (0.0030) [2024-04-26 09:47:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2372845568. Throughput: 0: 49952.8. Samples: 125680880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:27,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 09:47:28,887][49750] Updated weights for policy 0, policy_version 144831 (0.0031) [2024-04-26 09:47:31,497][49750] Updated weights for policy 0, policy_version 144841 (0.0030) [2024-04-26 09:47:32,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 2373091328. Throughput: 0: 49836.8. Samples: 125979260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:32,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 09:47:35,411][49750] Updated weights for policy 0, policy_version 144851 (0.0032) [2024-04-26 09:47:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.1, 300 sec: 49929.5). Total num frames: 2373320704. Throughput: 0: 49817.4. Samples: 126131420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:37,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 09:47:38,064][49750] Updated weights for policy 0, policy_version 144861 (0.0029) [2024-04-26 09:47:41,926][49750] Updated weights for policy 0, policy_version 144871 (0.0034) [2024-04-26 09:47:42,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 49929.6). Total num frames: 2373566464. Throughput: 0: 49853.0. Samples: 126425140. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:42,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 09:47:44,970][49750] Updated weights for policy 0, policy_version 144881 (0.0031) [2024-04-26 09:47:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 49971.1, 300 sec: 49929.5). Total num frames: 2373844992. Throughput: 0: 49873.8. Samples: 126725940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 09:47:48,478][49750] Updated weights for policy 0, policy_version 144891 (0.0035) [2024-04-26 09:47:51,448][49750] Updated weights for policy 0, policy_version 144901 (0.0030) [2024-04-26 09:47:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2374090752. Throughput: 0: 49928.4. Samples: 126883160. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:52,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 09:47:55,016][49750] Updated weights for policy 0, policy_version 144911 (0.0032) [2024-04-26 09:47:57,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49929.5). Total num frames: 2374320128. Throughput: 0: 50121.5. Samples: 127189620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-26 09:47:57,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 09:47:57,832][49750] Updated weights for policy 0, policy_version 144921 (0.0035) [2024-04-26 09:48:01,444][49750] Updated weights for policy 0, policy_version 144931 (0.0030) [2024-04-26 09:48:02,063][49517] Fps is (10 sec: 47513.3, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2374565888. Throughput: 0: 50032.4. Samples: 127487120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:02,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 09:48:04,324][49750] Updated weights for policy 0, policy_version 144941 (0.0031) [2024-04-26 09:48:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.2, 300 sec: 49985.1). 
Total num frames: 2374844416. Throughput: 0: 50133.4. Samples: 127635160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 09:48:07,819][49750] Updated weights for policy 0, policy_version 144951 (0.0028) [2024-04-26 09:48:10,864][49750] Updated weights for policy 0, policy_version 144961 (0.0038) [2024-04-26 09:48:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2375090176. Throughput: 0: 50134.2. Samples: 127936920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 09:48:14,448][49750] Updated weights for policy 0, policy_version 144971 (0.0036) [2024-04-26 09:48:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 49874.0). Total num frames: 2375335936. Throughput: 0: 50183.6. Samples: 128237520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 09:48:17,365][49750] Updated weights for policy 0, policy_version 144981 (0.0037) [2024-04-26 09:48:21,237][49750] Updated weights for policy 0, policy_version 144991 (0.0034) [2024-04-26 09:48:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 49985.1). Total num frames: 2375581696. Throughput: 0: 50036.9. Samples: 128383080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:22,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 09:48:23,942][49750] Updated weights for policy 0, policy_version 145001 (0.0029) [2024-04-26 09:48:27,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 2375827456. Throughput: 0: 50203.4. Samples: 128684300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 09:48:27,779][49750] Updated weights for policy 0, policy_version 145011 (0.0032) [2024-04-26 09:48:30,613][49750] Updated weights for policy 0, policy_version 145021 (0.0034) [2024-04-26 09:48:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2376105984. Throughput: 0: 50169.8. Samples: 128983580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:32,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 09:48:34,347][49750] Updated weights for policy 0, policy_version 145031 (0.0025) [2024-04-26 09:48:36,996][49750] Updated weights for policy 0, policy_version 145041 (0.0028) [2024-04-26 09:48:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.4, 300 sec: 49985.1). Total num frames: 2376351744. Throughput: 0: 50209.4. Samples: 129142580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 09:48:37,478][49728] Signal inference workers to stop experience collection... (1900 times) [2024-04-26 09:48:37,479][49728] Signal inference workers to resume experience collection... (1900 times) [2024-04-26 09:48:37,495][49750] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-04-26 09:48:37,495][49750] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-04-26 09:48:40,938][49750] Updated weights for policy 0, policy_version 145051 (0.0040) [2024-04-26 09:48:42,063][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 2376564736. Throughput: 0: 50162.6. Samples: 129446940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:42,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 09:48:42,160][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000145055_2376581120.pth... 
[2024-04-26 09:48:42,206][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000144324_2364604416.pth [2024-04-26 09:48:43,444][49750] Updated weights for policy 0, policy_version 145061 (0.0034) [2024-04-26 09:48:47,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49425.2, 300 sec: 49874.0). Total num frames: 2376810496. Throughput: 0: 49988.2. Samples: 129736580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:47,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 09:48:47,591][49750] Updated weights for policy 0, policy_version 145071 (0.0033) [2024-04-26 09:48:50,242][49750] Updated weights for policy 0, policy_version 145081 (0.0031) [2024-04-26 09:48:52,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50244.1, 300 sec: 49985.1). Total num frames: 2377105408. Throughput: 0: 50027.8. Samples: 129886420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 09:48:53,975][49750] Updated weights for policy 0, policy_version 145091 (0.0029) [2024-04-26 09:48:56,825][49750] Updated weights for policy 0, policy_version 145101 (0.0034) [2024-04-26 09:48:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 2377334784. Throughput: 0: 50094.3. Samples: 130191160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:48:57,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 09:49:00,433][49750] Updated weights for policy 0, policy_version 145111 (0.0034) [2024-04-26 09:49:02,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.2, 300 sec: 49874.0). Total num frames: 2377580544. Throughput: 0: 50100.7. Samples: 130492060. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 09:49:02,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 09:49:03,330][49750] Updated weights for policy 0, policy_version 145121 (0.0029) [2024-04-26 09:49:07,010][49750] Updated weights for policy 0, policy_version 145131 (0.0032) [2024-04-26 09:49:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 49929.6). Total num frames: 2377826304. Throughput: 0: 50110.2. Samples: 130638040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:07,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 09:49:09,669][49750] Updated weights for policy 0, policy_version 145141 (0.0028) [2024-04-26 09:49:12,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2378104832. Throughput: 0: 50184.9. Samples: 130942620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:12,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 09:49:13,388][49750] Updated weights for policy 0, policy_version 145151 (0.0035) [2024-04-26 09:49:16,171][49750] Updated weights for policy 0, policy_version 145161 (0.0033) [2024-04-26 09:49:17,062][49517] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2378334208. Throughput: 0: 50126.3. Samples: 131239260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 09:49:19,962][49750] Updated weights for policy 0, policy_version 145171 (0.0028) [2024-04-26 09:49:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2378596352. Throughput: 0: 50116.8. Samples: 131397840. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:22,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 09:49:22,811][49750] Updated weights for policy 0, policy_version 145181 (0.0034) [2024-04-26 09:49:26,609][49750] Updated weights for policy 0, policy_version 145191 (0.0030) [2024-04-26 09:49:27,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2378825728. Throughput: 0: 50027.9. Samples: 131698200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:27,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 09:49:29,336][49750] Updated weights for policy 0, policy_version 145201 (0.0033) [2024-04-26 09:49:32,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49425.2, 300 sec: 49874.0). Total num frames: 2379071488. Throughput: 0: 50014.6. Samples: 131987240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:32,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 09:49:33,276][49750] Updated weights for policy 0, policy_version 145211 (0.0036) [2024-04-26 09:49:35,768][49750] Updated weights for policy 0, policy_version 145221 (0.0033) [2024-04-26 09:49:37,062][49517] Fps is (10 sec: 50791.3, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2379333632. Throughput: 0: 50096.7. Samples: 132140760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:37,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 09:49:37,292][49728] Signal inference workers to stop experience collection... (1950 times) [2024-04-26 09:49:37,338][49750] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-04-26 09:49:37,355][49728] Signal inference workers to resume experience collection... 
(1950 times) [2024-04-26 09:49:37,363][49750] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-04-26 09:49:39,731][49750] Updated weights for policy 0, policy_version 145231 (0.0034) [2024-04-26 09:49:42,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50040.6). Total num frames: 2379595776. Throughput: 0: 50080.5. Samples: 132444780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:42,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 09:49:42,248][49750] Updated weights for policy 0, policy_version 145241 (0.0030) [2024-04-26 09:49:46,124][49750] Updated weights for policy 0, policy_version 145251 (0.0033) [2024-04-26 09:49:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 49929.6). Total num frames: 2379825152. Throughput: 0: 50106.0. Samples: 132746820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 09:49:48,774][49750] Updated weights for policy 0, policy_version 145261 (0.0034) [2024-04-26 09:49:52,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49425.2, 300 sec: 49929.6). Total num frames: 2380070912. Throughput: 0: 50031.5. Samples: 132889460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:52,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 09:49:52,741][49750] Updated weights for policy 0, policy_version 145271 (0.0025) [2024-04-26 09:49:55,379][49750] Updated weights for policy 0, policy_version 145281 (0.0026) [2024-04-26 09:49:57,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50517.3, 300 sec: 50040.6). Total num frames: 2380365824. Throughput: 0: 49804.2. Samples: 133183800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:49:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 09:49:59,489][49750] Updated weights for policy 0, policy_version 145291 (0.0036) [2024-04-26 09:50:01,976][49750] Updated weights for policy 0, policy_version 145301 (0.0033) [2024-04-26 09:50:02,063][49517] Fps is (10 sec: 54066.3, 60 sec: 50517.3, 300 sec: 50096.1). Total num frames: 2380611584. Throughput: 0: 49945.2. Samples: 133486800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:02,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 09:50:05,931][49750] Updated weights for policy 0, policy_version 145311 (0.0030) [2024-04-26 09:50:07,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 2380840960. Throughput: 0: 49829.0. Samples: 133640140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:07,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 09:50:08,564][49750] Updated weights for policy 0, policy_version 145321 (0.0041) [2024-04-26 09:50:12,063][49517] Fps is (10 sec: 45875.5, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2381070336. Throughput: 0: 49703.2. Samples: 133934840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 09:50:12,405][49750] Updated weights for policy 0, policy_version 145331 (0.0033) [2024-04-26 09:50:15,080][49750] Updated weights for policy 0, policy_version 145341 (0.0038) [2024-04-26 09:50:17,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 49929.6). Total num frames: 2381332480. Throughput: 0: 49991.5. Samples: 134236860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:17,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 09:50:18,948][49750] Updated weights for policy 0, policy_version 145351 (0.0029) [2024-04-26 09:50:21,729][49750] Updated weights for policy 0, policy_version 145361 (0.0028) [2024-04-26 09:50:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2381594624. Throughput: 0: 50012.0. Samples: 134391300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:22,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 09:50:25,620][49750] Updated weights for policy 0, policy_version 145371 (0.0032) [2024-04-26 09:50:27,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 2381856768. Throughput: 0: 50001.7. Samples: 134694860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:27,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 09:50:28,177][49750] Updated weights for policy 0, policy_version 145381 (0.0033) [2024-04-26 09:50:32,062][49517] Fps is (10 sec: 45875.4, 60 sec: 49698.2, 300 sec: 49818.5). Total num frames: 2382053376. Throughput: 0: 50109.8. Samples: 135001760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:32,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 09:50:32,214][49750] Updated weights for policy 0, policy_version 145391 (0.0032) [2024-04-26 09:50:34,732][49750] Updated weights for policy 0, policy_version 145401 (0.0035) [2024-04-26 09:50:37,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 49929.6). Total num frames: 2382331904. Throughput: 0: 49902.7. Samples: 135135080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:37,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 09:50:38,781][49750] Updated weights for policy 0, policy_version 145411 (0.0030) [2024-04-26 09:50:41,418][49750] Updated weights for policy 0, policy_version 145421 (0.0030) [2024-04-26 09:50:42,062][49517] Fps is (10 sec: 54066.9, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2382594048. Throughput: 0: 49955.9. Samples: 135431820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:42,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 09:50:42,178][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000145423_2382610432.pth... [2024-04-26 09:50:42,255][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000144691_2370617344.pth [2024-04-26 09:50:42,279][49728] Signal inference workers to stop experience collection... (2000 times) [2024-04-26 09:50:42,279][49728] Signal inference workers to resume experience collection... (2000 times) [2024-04-26 09:50:42,292][49750] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-04-26 09:50:42,292][49750] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-04-26 09:50:45,432][49750] Updated weights for policy 0, policy_version 145431 (0.0033) [2024-04-26 09:50:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.1, 300 sec: 50040.6). Total num frames: 2382839808. Throughput: 0: 50011.1. Samples: 135737300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:47,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 09:50:47,906][49750] Updated weights for policy 0, policy_version 145441 (0.0036) [2024-04-26 09:50:51,839][49750] Updated weights for policy 0, policy_version 145451 (0.0036) [2024-04-26 09:50:52,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 2383069184. Throughput: 0: 49837.3. Samples: 135882820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:52,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 09:50:54,605][49750] Updated weights for policy 0, policy_version 145461 (0.0029) [2024-04-26 09:50:57,063][49517] Fps is (10 sec: 49152.2, 60 sec: 49425.0, 300 sec: 49985.1). Total num frames: 2383331328. Throughput: 0: 49971.6. Samples: 136183560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:50:57,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 09:50:58,192][49750] Updated weights for policy 0, policy_version 145471 (0.0027) [2024-04-26 09:51:01,396][49750] Updated weights for policy 0, policy_version 145481 (0.0034) [2024-04-26 09:51:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 49698.2, 300 sec: 49929.5). Total num frames: 2383593472. Throughput: 0: 50002.5. Samples: 136486980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:51:02,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 09:51:04,777][49750] Updated weights for policy 0, policy_version 145491 (0.0030) [2024-04-26 09:51:07,063][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2383839232. Throughput: 0: 49974.1. Samples: 136640140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 09:51:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 09:51:07,814][49750] Updated weights for policy 0, policy_version 145501 (0.0033) [2024-04-26 09:51:11,333][49750] Updated weights for policy 0, policy_version 145511 (0.0034) [2024-04-26 09:51:12,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2384084992. Throughput: 0: 49971.4. Samples: 136943580. 
Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:12,063][49517] Avg episode reward: [(0, '0.447')] [2024-04-26 09:51:14,273][49750] Updated weights for policy 0, policy_version 145521 (0.0032) [2024-04-26 09:51:17,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 2384314368. Throughput: 0: 49763.1. Samples: 137241100. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:17,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 09:51:17,825][49750] Updated weights for policy 0, policy_version 145531 (0.0031) [2024-04-26 09:51:20,830][49750] Updated weights for policy 0, policy_version 145541 (0.0033) [2024-04-26 09:51:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2384592896. Throughput: 0: 49997.7. Samples: 137384980. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 09:51:24,389][49750] Updated weights for policy 0, policy_version 145551 (0.0030) [2024-04-26 09:51:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2384838656. Throughput: 0: 49953.8. Samples: 137679740. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:27,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 09:51:27,412][49750] Updated weights for policy 0, policy_version 145561 (0.0032) [2024-04-26 09:51:30,988][49750] Updated weights for policy 0, policy_version 145571 (0.0030) [2024-04-26 09:51:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50096.2). Total num frames: 2385100800. Throughput: 0: 49926.0. Samples: 137983960. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:32,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 09:51:34,027][49750] Updated weights for policy 0, policy_version 145581 (0.0033) [2024-04-26 09:51:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 49874.0). 
Total num frames: 2385313792. Throughput: 0: 50027.1. Samples: 138134040. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 09:51:37,681][49750] Updated weights for policy 0, policy_version 145591 (0.0032) [2024-04-26 09:51:40,394][49750] Updated weights for policy 0, policy_version 145601 (0.0033) [2024-04-26 09:51:42,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2385592320. Throughput: 0: 49957.4. Samples: 138431640. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:42,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 09:51:44,193][49750] Updated weights for policy 0, policy_version 145611 (0.0029) [2024-04-26 09:51:46,995][49750] Updated weights for policy 0, policy_version 145621 (0.0030) [2024-04-26 09:51:47,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2385854464. Throughput: 0: 49892.9. Samples: 138732160. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:47,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 09:51:50,727][49750] Updated weights for policy 0, policy_version 145631 (0.0032) [2024-04-26 09:51:52,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2386083840. Throughput: 0: 50033.7. Samples: 138891660. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:52,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 09:51:53,568][49750] Updated weights for policy 0, policy_version 145641 (0.0033) [2024-04-26 09:51:57,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.3, 300 sec: 49929.6). Total num frames: 2386329600. Throughput: 0: 49883.7. Samples: 139188340. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:51:57,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 09:51:57,085][49728] Signal inference workers to stop experience collection... 
(2050 times) [2024-04-26 09:51:57,085][49728] Signal inference workers to resume experience collection... (2050 times) [2024-04-26 09:51:57,102][49750] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-04-26 09:51:57,102][49750] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-04-26 09:51:57,226][49750] Updated weights for policy 0, policy_version 145651 (0.0035) [2024-04-26 09:52:00,188][49750] Updated weights for policy 0, policy_version 145661 (0.0035) [2024-04-26 09:52:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2386575360. Throughput: 0: 49821.4. Samples: 139483060. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:52:02,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 09:52:03,757][49750] Updated weights for policy 0, policy_version 145671 (0.0036) [2024-04-26 09:52:06,883][49750] Updated weights for policy 0, policy_version 145681 (0.0033) [2024-04-26 09:52:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2386837504. Throughput: 0: 50016.1. Samples: 139635700. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:52:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 09:52:10,254][49750] Updated weights for policy 0, policy_version 145691 (0.0032) [2024-04-26 09:52:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.4, 300 sec: 50040.6). Total num frames: 2387083264. Throughput: 0: 50172.0. Samples: 139937480. Policy #0 lag: (min: 2.0, avg: 9.4, max: 22.0) [2024-04-26 09:52:12,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 09:52:13,425][49750] Updated weights for policy 0, policy_version 145701 (0.0028) [2024-04-26 09:52:16,695][49750] Updated weights for policy 0, policy_version 145711 (0.0038) [2024-04-26 09:52:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2387329024. Throughput: 0: 50060.5. Samples: 140236680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:52:17,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 09:52:20,020][49750] Updated weights for policy 0, policy_version 145721 (0.0029) [2024-04-26 09:52:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 49929.6). Total num frames: 2387574784. Throughput: 0: 49978.2. Samples: 140383060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:52:22,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 09:52:23,315][49750] Updated weights for policy 0, policy_version 145731 (0.0032) [2024-04-26 09:52:26,507][49750] Updated weights for policy 0, policy_version 145741 (0.0036) [2024-04-26 09:52:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2387836928. Throughput: 0: 49988.5. Samples: 140681120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:52:27,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 09:52:29,906][49750] Updated weights for policy 0, policy_version 145751 (0.0034) [2024-04-26 09:52:32,063][49517] Fps is (10 sec: 52428.4, 60 sec: 49971.1, 300 sec: 50096.2). Total num frames: 2388099072. Throughput: 0: 49958.3. Samples: 140980280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:52:32,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 09:52:33,069][49750] Updated weights for policy 0, policy_version 145761 (0.0034) [2024-04-26 09:52:36,583][49750] Updated weights for policy 0, policy_version 145771 (0.0034) [2024-04-26 09:52:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50096.2). Total num frames: 2388344832. Throughput: 0: 49733.3. Samples: 141129660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:52:37,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 09:52:39,548][49750] Updated weights for policy 0, policy_version 145781 (0.0035) [2024-04-26 09:52:42,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49425.1, 300 sec: 49874.0). Total num frames: 2388557824. Throughput: 0: 49948.1. Samples: 141436000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:52:42,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 09:52:42,118][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000145787_2388574208.pth... [2024-04-26 09:52:42,166][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000145055_2376581120.pth [2024-04-26 09:52:43,067][49750] Updated weights for policy 0, policy_version 145791 (0.0031) [2024-04-26 09:52:46,268][49750] Updated weights for policy 0, policy_version 145801 (0.0032) [2024-04-26 09:52:47,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49425.0, 300 sec: 49929.5). Total num frames: 2388819968. Throughput: 0: 49931.0. Samples: 141729960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:52:47,071][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 09:52:49,553][49750] Updated weights for policy 0, policy_version 145811 (0.0036) [2024-04-26 09:52:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2389082112. Throughput: 0: 49895.9. Samples: 141881020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:52:52,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 09:52:52,676][49750] Updated weights for policy 0, policy_version 145821 (0.0025) [2024-04-26 09:52:56,228][49750] Updated weights for policy 0, policy_version 145831 (0.0038) [2024-04-26 09:52:57,063][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2389327872. Throughput: 0: 49982.1. Samples: 142186680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:52:57,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 09:52:59,105][49750] Updated weights for policy 0, policy_version 145841 (0.0030) [2024-04-26 09:53:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 49929.6). Total num frames: 2389573632. Throughput: 0: 49963.0. Samples: 142485020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:53:02,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 09:53:02,625][49750] Updated weights for policy 0, policy_version 145851 (0.0032) [2024-04-26 09:53:05,815][49750] Updated weights for policy 0, policy_version 145861 (0.0034) [2024-04-26 09:53:07,063][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.0, 300 sec: 49985.1). Total num frames: 2389835776. Throughput: 0: 50024.2. Samples: 142634160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:53:07,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 09:53:09,055][49750] Updated weights for policy 0, policy_version 145871 (0.0032) [2024-04-26 09:53:11,850][49728] Signal inference workers to stop experience collection... (2100 times) [2024-04-26 09:53:11,851][49728] Signal inference workers to resume experience collection... (2100 times) [2024-04-26 09:53:11,880][49750] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-04-26 09:53:11,881][49750] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-04-26 09:53:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2390081536. Throughput: 0: 50062.2. Samples: 142933920. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:53:12,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 09:53:12,677][49750] Updated weights for policy 0, policy_version 145881 (0.0037) [2024-04-26 09:53:15,633][49750] Updated weights for policy 0, policy_version 145891 (0.0033) [2024-04-26 09:53:17,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2390343680. Throughput: 0: 50026.7. Samples: 143231480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 09:53:17,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 09:53:19,301][49750] Updated weights for policy 0, policy_version 145901 (0.0027) [2024-04-26 09:53:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2390589440. Throughput: 0: 50178.8. Samples: 143387700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:53:22,063][49517] Avg episode reward: [(0, '0.445')] [2024-04-26 09:53:22,191][49750] Updated weights for policy 0, policy_version 145911 (0.0034) [2024-04-26 09:53:25,705][49750] Updated weights for policy 0, policy_version 145921 (0.0031) [2024-04-26 09:53:27,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49698.2, 300 sec: 49874.0). Total num frames: 2390818816. Throughput: 0: 50014.7. Samples: 143686660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:53:27,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 09:53:28,575][49750] Updated weights for policy 0, policy_version 145931 (0.0034) [2024-04-26 09:53:32,063][49517] Fps is (10 sec: 49151.2, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 2391080960. Throughput: 0: 50142.6. Samples: 143986380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:53:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 09:53:32,315][49750] Updated weights for policy 0, policy_version 145941 (0.0030) [2024-04-26 09:53:35,078][49750] Updated weights for policy 0, policy_version 145951 (0.0028) [2024-04-26 09:53:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2391343104. Throughput: 0: 50254.3. Samples: 144142460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:53:37,063][49517] Avg episode reward: [(0, '0.410')] [2024-04-26 09:53:38,900][49750] Updated weights for policy 0, policy_version 145961 (0.0029) [2024-04-26 09:53:41,830][49750] Updated weights for policy 0, policy_version 145971 (0.0042) [2024-04-26 09:53:42,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50096.1). Total num frames: 2391588864. Throughput: 0: 50071.6. Samples: 144439900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:53:42,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 09:53:45,436][49750] Updated weights for policy 0, policy_version 145981 (0.0033) [2024-04-26 09:53:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 49929.6). Total num frames: 2391834624. Throughput: 0: 50098.2. Samples: 144739440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:53:47,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 09:53:48,301][49750] Updated weights for policy 0, policy_version 145991 (0.0034) [2024-04-26 09:53:51,964][49750] Updated weights for policy 0, policy_version 146001 (0.0035) [2024-04-26 09:53:52,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2392080384. Throughput: 0: 49990.7. Samples: 144883740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:53:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 09:53:54,777][49750] Updated weights for policy 0, policy_version 146011 (0.0032) [2024-04-26 09:53:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2392326144. Throughput: 0: 49982.3. Samples: 145183120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:53:57,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 09:53:58,408][49750] Updated weights for policy 0, policy_version 146021 (0.0034) [2024-04-26 09:54:01,249][49750] Updated weights for policy 0, policy_version 146031 (0.0031) [2024-04-26 09:54:02,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2392588288. Throughput: 0: 49957.2. Samples: 145479560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:54:02,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 09:54:04,957][49750] Updated weights for policy 0, policy_version 146041 (0.0033) [2024-04-26 09:54:07,063][49517] Fps is (10 sec: 50786.2, 60 sec: 49970.7, 300 sec: 49929.4). Total num frames: 2392834048. Throughput: 0: 49967.1. Samples: 145636260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:54:07,064][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 09:54:07,866][49750] Updated weights for policy 0, policy_version 146051 (0.0026) [2024-04-26 09:54:11,519][49750] Updated weights for policy 0, policy_version 146061 (0.0042) [2024-04-26 09:54:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2393079808. Throughput: 0: 49989.2. Samples: 145936180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:54:12,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 09:54:14,254][49728] Signal inference workers to stop experience collection... 
(2150 times) [2024-04-26 09:54:14,254][49728] Signal inference workers to resume experience collection... (2150 times) [2024-04-26 09:54:14,282][49750] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-04-26 09:54:14,282][49750] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-04-26 09:54:14,385][49750] Updated weights for policy 0, policy_version 146071 (0.0035) [2024-04-26 09:54:17,063][49517] Fps is (10 sec: 49155.4, 60 sec: 49698.1, 300 sec: 49929.5). Total num frames: 2393325568. Throughput: 0: 49994.7. Samples: 146236140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 09:54:17,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 09:54:18,125][49750] Updated weights for policy 0, policy_version 146081 (0.0033) [2024-04-26 09:54:20,878][49750] Updated weights for policy 0, policy_version 146091 (0.0030) [2024-04-26 09:54:22,063][49517] Fps is (10 sec: 49152.2, 60 sec: 49698.0, 300 sec: 49985.1). Total num frames: 2393571328. Throughput: 0: 49864.8. Samples: 146386380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:54:22,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 09:54:24,745][49750] Updated weights for policy 0, policy_version 146101 (0.0034) [2024-04-26 09:54:27,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2393833472. Throughput: 0: 49979.7. Samples: 146688980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:54:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 09:54:27,449][49750] Updated weights for policy 0, policy_version 146111 (0.0028) [2024-04-26 09:54:31,146][49750] Updated weights for policy 0, policy_version 146121 (0.0037) [2024-04-26 09:54:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49698.3, 300 sec: 49929.5). Total num frames: 2394062848. Throughput: 0: 49924.9. Samples: 146986060. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:54:32,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 09:54:34,016][49750] Updated weights for policy 0, policy_version 146131 (0.0038) [2024-04-26 09:54:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2394341376. Throughput: 0: 49833.9. Samples: 147126260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:54:37,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 09:54:37,515][49750] Updated weights for policy 0, policy_version 146141 (0.0035) [2024-04-26 09:54:40,631][49750] Updated weights for policy 0, policy_version 146151 (0.0030) [2024-04-26 09:54:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2394570752. Throughput: 0: 49987.9. Samples: 147432580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:54:42,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 09:54:42,078][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000146154_2394587136.pth... [2024-04-26 09:54:42,127][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000145423_2382610432.pth [2024-04-26 09:54:44,145][49750] Updated weights for policy 0, policy_version 146161 (0.0030) [2024-04-26 09:54:47,034][49750] Updated weights for policy 0, policy_version 146171 (0.0028) [2024-04-26 09:54:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2394865664. Throughput: 0: 50254.9. Samples: 147741020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:54:47,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 09:54:50,625][49750] Updated weights for policy 0, policy_version 146181 (0.0029) [2024-04-26 09:54:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 49971.3, 300 sec: 49874.0). Total num frames: 2395078656. Throughput: 0: 50034.2. Samples: 147887760. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:54:52,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 09:54:53,582][49750] Updated weights for policy 0, policy_version 146191 (0.0026) [2024-04-26 09:54:57,061][49750] Updated weights for policy 0, policy_version 146201 (0.0035) [2024-04-26 09:54:57,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 49985.1). Total num frames: 2395357184. Throughput: 0: 50014.7. Samples: 148186840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:54:57,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 09:55:00,210][49750] Updated weights for policy 0, policy_version 146211 (0.0027) [2024-04-26 09:55:02,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.4, 300 sec: 50040.6). Total num frames: 2395602944. Throughput: 0: 50126.8. Samples: 148491840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:55:02,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 09:55:03,401][49750] Updated weights for policy 0, policy_version 146221 (0.0032) [2024-04-26 09:55:04,940][49728] Signal inference workers to stop experience collection... (2200 times) [2024-04-26 09:55:04,941][49728] Signal inference workers to resume experience collection... (2200 times) [2024-04-26 09:55:04,965][49750] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-04-26 09:55:04,965][49750] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-04-26 09:55:06,674][49750] Updated weights for policy 0, policy_version 146231 (0.0035) [2024-04-26 09:55:07,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.9, 300 sec: 50096.2). Total num frames: 2395848704. Throughput: 0: 50257.4. Samples: 148647960. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:55:07,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 09:55:10,004][49750] Updated weights for policy 0, policy_version 146241 (0.0031) [2024-04-26 09:55:12,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2396094464. Throughput: 0: 50091.5. Samples: 148943100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:55:12,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 09:55:13,067][49750] Updated weights for policy 0, policy_version 146251 (0.0032) [2024-04-26 09:55:16,754][49750] Updated weights for policy 0, policy_version 146261 (0.0034) [2024-04-26 09:55:17,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 49985.1). Total num frames: 2396340224. Throughput: 0: 50193.3. Samples: 149244760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:55:17,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 09:55:19,594][49750] Updated weights for policy 0, policy_version 146271 (0.0030) [2024-04-26 09:55:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 49929.6). Total num frames: 2396585984. Throughput: 0: 50247.2. Samples: 149387380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 09:55:22,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 09:55:23,318][49750] Updated weights for policy 0, policy_version 146281 (0.0027) [2024-04-26 09:55:26,219][49750] Updated weights for policy 0, policy_version 146291 (0.0033) [2024-04-26 09:55:27,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.1, 300 sec: 50151.7). Total num frames: 2396848128. Throughput: 0: 50233.7. Samples: 149693100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:55:27,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 09:55:30,102][49750] Updated weights for policy 0, policy_version 146301 (0.0035) [2024-04-26 09:55:32,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.2, 300 sec: 50040.6). Total num frames: 2397093888. Throughput: 0: 50170.9. Samples: 149998720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:55:32,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 09:55:32,850][49750] Updated weights for policy 0, policy_version 146311 (0.0027) [2024-04-26 09:55:36,751][49750] Updated weights for policy 0, policy_version 146321 (0.0035) [2024-04-26 09:55:37,062][49517] Fps is (10 sec: 47514.5, 60 sec: 49698.1, 300 sec: 49929.6). Total num frames: 2397323264. Throughput: 0: 50150.7. Samples: 150144540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:55:37,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 09:55:39,285][49750] Updated weights for policy 0, policy_version 146331 (0.0032) [2024-04-26 09:55:42,064][49517] Fps is (10 sec: 50781.7, 60 sec: 50515.8, 300 sec: 50040.3). Total num frames: 2397601792. Throughput: 0: 50225.6. Samples: 150447080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:55:42,065][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 09:55:43,275][49750] Updated weights for policy 0, policy_version 146341 (0.0033) [2024-04-26 09:55:45,811][49750] Updated weights for policy 0, policy_version 146351 (0.0041) [2024-04-26 09:55:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2397847552. Throughput: 0: 50107.6. Samples: 150746680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:55:47,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 09:55:49,805][49750] Updated weights for policy 0, policy_version 146361 (0.0031) [2024-04-26 09:55:52,063][49517] Fps is (10 sec: 49160.9, 60 sec: 50244.2, 300 sec: 50040.6). 
Total num frames: 2398093312. Throughput: 0: 49954.2. Samples: 150895900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:55:52,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 09:55:52,438][49750] Updated weights for policy 0, policy_version 146371 (0.0030) [2024-04-26 09:55:56,142][49750] Updated weights for policy 0, policy_version 146381 (0.0030) [2024-04-26 09:55:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49425.2, 300 sec: 49929.6). Total num frames: 2398322688. Throughput: 0: 49994.3. Samples: 151192840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:55:57,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 09:55:58,957][49750] Updated weights for policy 0, policy_version 146391 (0.0035) [2024-04-26 09:56:02,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.2, 300 sec: 50096.2). Total num frames: 2398617600. Throughput: 0: 49961.8. Samples: 151493040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:56:02,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 09:56:02,519][49750] Updated weights for policy 0, policy_version 146401 (0.0034) [2024-04-26 09:56:04,389][49728] Signal inference workers to stop experience collection... (2250 times) [2024-04-26 09:56:04,389][49728] Signal inference workers to resume experience collection... (2250 times) [2024-04-26 09:56:04,399][49750] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-04-26 09:56:04,410][49750] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-04-26 09:56:05,624][49750] Updated weights for policy 0, policy_version 146411 (0.0031) [2024-04-26 09:56:07,063][49517] Fps is (10 sec: 52428.1, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2398846976. Throughput: 0: 50380.3. Samples: 151654500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:56:07,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 09:56:09,069][49750] Updated weights for policy 0, policy_version 146421 (0.0030) [2024-04-26 09:56:12,058][49750] Updated weights for policy 0, policy_version 146431 (0.0035) [2024-04-26 09:56:12,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50207.3). Total num frames: 2399125504. Throughput: 0: 50132.3. Samples: 151949040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:56:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 09:56:15,965][49750] Updated weights for policy 0, policy_version 146441 (0.0035) [2024-04-26 09:56:17,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2399338496. Throughput: 0: 50013.9. Samples: 152249340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:56:17,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 09:56:18,842][49750] Updated weights for policy 0, policy_version 146451 (0.0029) [2024-04-26 09:56:22,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2399600640. Throughput: 0: 50021.7. Samples: 152395520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:56:22,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 09:56:22,892][49750] Updated weights for policy 0, policy_version 146461 (0.0033) [2024-04-26 09:56:25,585][49750] Updated weights for policy 0, policy_version 146471 (0.0031) [2024-04-26 09:56:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2399846400. Throughput: 0: 49997.9. Samples: 152696900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 09:56:27,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 09:56:29,305][49750] Updated weights for policy 0, policy_version 146481 (0.0036) [2024-04-26 09:56:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.4, 300 sec: 50096.2). 
Total num frames: 2400092160. Throughput: 0: 49855.1. Samples: 152990160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:56:32,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 09:56:32,098][49750] Updated weights for policy 0, policy_version 146491 (0.0031) [2024-04-26 09:56:35,706][49750] Updated weights for policy 0, policy_version 146501 (0.0034) [2024-04-26 09:56:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2400337920. Throughput: 0: 49962.3. Samples: 153144200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:56:37,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 09:56:38,710][49750] Updated weights for policy 0, policy_version 146511 (0.0042) [2024-04-26 09:56:42,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49426.7, 300 sec: 49874.0). Total num frames: 2400567296. Throughput: 0: 49900.9. Samples: 153438380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:56:42,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 09:56:42,137][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000146520_2400583680.pth... [2024-04-26 09:56:42,194][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000145787_2388574208.pth [2024-04-26 09:56:42,341][49750] Updated weights for policy 0, policy_version 146521 (0.0030) [2024-04-26 09:56:45,452][49750] Updated weights for policy 0, policy_version 146531 (0.0033) [2024-04-26 09:56:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.2, 300 sec: 50096.2). Total num frames: 2400862208. Throughput: 0: 49976.5. Samples: 153741980. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:56:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 09:56:48,888][49750] Updated weights for policy 0, policy_version 146541 (0.0030) [2024-04-26 09:56:51,913][49750] Updated weights for policy 0, policy_version 146551 (0.0030) [2024-04-26 09:56:52,063][49517] Fps is (10 sec: 52427.8, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2401091584. Throughput: 0: 49943.5. Samples: 153901960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:56:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 09:56:55,471][49750] Updated weights for policy 0, policy_version 146561 (0.0035) [2024-04-26 09:56:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50096.2). Total num frames: 2401353728. Throughput: 0: 50039.9. Samples: 154200840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:56:57,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 09:56:58,482][49750] Updated weights for policy 0, policy_version 146571 (0.0032) [2024-04-26 09:57:01,891][49750] Updated weights for policy 0, policy_version 146581 (0.0036) [2024-04-26 09:57:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2401583104. Throughput: 0: 50025.0. Samples: 154500460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:57:02,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 09:57:05,020][49750] Updated weights for policy 0, policy_version 146591 (0.0032) [2024-04-26 09:57:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2401845248. Throughput: 0: 49974.3. Samples: 154644360. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:57:07,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 09:57:08,402][49750] Updated weights for policy 0, policy_version 146601 (0.0035) [2024-04-26 09:57:11,440][49750] Updated weights for policy 0, policy_version 146611 (0.0032) [2024-04-26 09:57:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 49424.9, 300 sec: 50040.6). Total num frames: 2402091008. Throughput: 0: 49924.5. Samples: 154943500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:57:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 09:57:15,068][49750] Updated weights for policy 0, policy_version 146621 (0.0038) [2024-04-26 09:57:16,580][49728] Signal inference workers to stop experience collection... (2300 times) [2024-04-26 09:57:16,580][49728] Signal inference workers to resume experience collection... (2300 times) [2024-04-26 09:57:16,605][49750] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-04-26 09:57:16,605][49750] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-04-26 09:57:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2402353152. Throughput: 0: 50104.5. Samples: 155244860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:57:17,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 09:57:17,827][49750] Updated weights for policy 0, policy_version 146631 (0.0037) [2024-04-26 09:57:21,425][49750] Updated weights for policy 0, policy_version 146641 (0.0030) [2024-04-26 09:57:22,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2402582528. Throughput: 0: 49959.5. Samples: 155392380. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:57:22,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 09:57:24,315][49750] Updated weights for policy 0, policy_version 146651 (0.0030) [2024-04-26 09:57:27,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2402861056. Throughput: 0: 50238.9. Samples: 155699140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-26 09:57:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 09:57:27,961][49750] Updated weights for policy 0, policy_version 146661 (0.0034) [2024-04-26 09:57:31,047][49750] Updated weights for policy 0, policy_version 146671 (0.0035) [2024-04-26 09:57:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2403090432. Throughput: 0: 50108.4. Samples: 155996860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:57:32,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 09:57:34,618][49750] Updated weights for policy 0, policy_version 146681 (0.0034) [2024-04-26 09:57:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.1, 300 sec: 50151.7). Total num frames: 2403352576. Throughput: 0: 50030.6. Samples: 156153340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:57:37,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 09:57:37,584][49750] Updated weights for policy 0, policy_version 146691 (0.0030) [2024-04-26 09:57:41,057][49750] Updated weights for policy 0, policy_version 146701 (0.0036) [2024-04-26 09:57:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.1, 300 sec: 50040.6). Total num frames: 2403581952. Throughput: 0: 49957.7. Samples: 156448940. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:57:42,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 09:57:44,282][49750] Updated weights for policy 0, policy_version 146711 (0.0034) [2024-04-26 09:57:47,062][49517] Fps is (10 sec: 49153.0, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2403844096. Throughput: 0: 49946.3. Samples: 156748040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:57:47,063][49517] Avg episode reward: [(0, '0.476')] [2024-04-26 09:57:47,471][49750] Updated weights for policy 0, policy_version 146721 (0.0031) [2024-04-26 09:57:50,969][49750] Updated weights for policy 0, policy_version 146731 (0.0031) [2024-04-26 09:57:52,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2404106240. Throughput: 0: 49997.8. Samples: 156894260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:57:52,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 09:57:54,095][49750] Updated weights for policy 0, policy_version 146741 (0.0035) [2024-04-26 09:57:57,063][49517] Fps is (10 sec: 50789.1, 60 sec: 49971.0, 300 sec: 50096.1). Total num frames: 2404352000. Throughput: 0: 50118.1. Samples: 157198820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:57:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 09:57:57,361][49750] Updated weights for policy 0, policy_version 146751 (0.0033) [2024-04-26 09:58:00,873][49750] Updated weights for policy 0, policy_version 146761 (0.0035) [2024-04-26 09:58:02,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2404597760. Throughput: 0: 50099.0. Samples: 157499320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:58:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 09:58:03,759][49750] Updated weights for policy 0, policy_version 146771 (0.0031) [2024-04-26 09:58:07,062][49517] Fps is (10 sec: 47514.8, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2404827136. Throughput: 0: 50084.6. Samples: 157646180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:58:07,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 09:58:07,394][49750] Updated weights for policy 0, policy_version 146781 (0.0029) [2024-04-26 09:58:10,375][49750] Updated weights for policy 0, policy_version 146791 (0.0032) [2024-04-26 09:58:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2405089280. Throughput: 0: 49999.6. Samples: 157949120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:58:12,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 09:58:13,888][49750] Updated weights for policy 0, policy_version 146801 (0.0029) [2024-04-26 09:58:16,931][49750] Updated weights for policy 0, policy_version 146811 (0.0033) [2024-04-26 09:58:17,063][49517] Fps is (10 sec: 52427.8, 60 sec: 49971.0, 300 sec: 50040.6). Total num frames: 2405351424. Throughput: 0: 50008.3. Samples: 158247240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:58:17,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 09:58:20,483][49750] Updated weights for policy 0, policy_version 146821 (0.0030) [2024-04-26 09:58:22,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.5, 300 sec: 50151.7). Total num frames: 2405613568. Throughput: 0: 50106.5. Samples: 158408120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:58:22,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 09:58:23,362][49750] Updated weights for policy 0, policy_version 146831 (0.0030) [2024-04-26 09:58:23,992][49728] Signal inference workers to stop experience collection... (2350 times) [2024-04-26 09:58:24,022][49750] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-04-26 09:58:24,057][49728] Signal inference workers to resume experience collection... (2350 times) [2024-04-26 09:58:24,058][49750] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-04-26 09:58:27,001][49750] Updated weights for policy 0, policy_version 146841 (0.0029) [2024-04-26 09:58:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2405842944. Throughput: 0: 50082.8. Samples: 158702660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:58:27,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 09:58:29,854][49750] Updated weights for policy 0, policy_version 146851 (0.0032) [2024-04-26 09:58:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2406105088. Throughput: 0: 50128.0. Samples: 159003800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 09:58:32,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 09:58:33,534][49750] Updated weights for policy 0, policy_version 146861 (0.0030) [2024-04-26 09:58:36,508][49750] Updated weights for policy 0, policy_version 146871 (0.0025) [2024-04-26 09:58:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2406367232. Throughput: 0: 50323.9. Samples: 159158840. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:58:37,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 09:58:39,932][49750] Updated weights for policy 0, policy_version 146881 (0.0031) [2024-04-26 09:58:42,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 2406612992. Throughput: 0: 50175.7. Samples: 159456720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:58:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 09:58:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000146888_2406612992.pth... [2024-04-26 09:58:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000146154_2394587136.pth [2024-04-26 09:58:43,077][49750] Updated weights for policy 0, policy_version 146891 (0.0033) [2024-04-26 09:58:46,471][49750] Updated weights for policy 0, policy_version 146901 (0.0032) [2024-04-26 09:58:47,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49971.0, 300 sec: 50040.6). Total num frames: 2406842368. Throughput: 0: 50093.2. Samples: 159753520. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:58:47,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 09:58:49,565][49750] Updated weights for policy 0, policy_version 146911 (0.0029) [2024-04-26 09:58:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.1, 300 sec: 50096.1). Total num frames: 2407104512. Throughput: 0: 50122.5. Samples: 159901700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:58:52,063][49517] Avg episode reward: [(0, '0.444')] [2024-04-26 09:58:53,080][49750] Updated weights for policy 0, policy_version 146921 (0.0032) [2024-04-26 09:58:55,976][49750] Updated weights for policy 0, policy_version 146931 (0.0031) [2024-04-26 09:58:57,063][49517] Fps is (10 sec: 50791.0, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2407350272. Throughput: 0: 50002.2. Samples: 160199220. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:58:57,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 09:58:59,667][49750] Updated weights for policy 0, policy_version 146941 (0.0035) [2024-04-26 09:59:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.3, 300 sec: 50096.3). Total num frames: 2407612416. Throughput: 0: 49964.6. Samples: 160495640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:59:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 09:59:02,543][49750] Updated weights for policy 0, policy_version 146951 (0.0032) [2024-04-26 09:59:06,224][49750] Updated weights for policy 0, policy_version 146961 (0.0038) [2024-04-26 09:59:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50096.2). Total num frames: 2407858176. Throughput: 0: 50052.9. Samples: 160660500. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:59:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 09:59:09,085][49750] Updated weights for policy 0, policy_version 146971 (0.0037) [2024-04-26 09:59:12,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2408087552. Throughput: 0: 50064.5. Samples: 160955560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:59:12,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 09:59:12,971][49750] Updated weights for policy 0, policy_version 146981 (0.0036) [2024-04-26 09:59:15,632][49750] Updated weights for policy 0, policy_version 146991 (0.0030) [2024-04-26 09:59:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.4, 300 sec: 50096.2). Total num frames: 2408349696. Throughput: 0: 49976.9. Samples: 161252760. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:59:17,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 09:59:19,553][49750] Updated weights for policy 0, policy_version 147001 (0.0035) [2024-04-26 09:59:22,055][49750] Updated weights for policy 0, policy_version 147011 (0.0033) [2024-04-26 09:59:22,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2408628224. Throughput: 0: 50063.6. Samples: 161411700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:59:22,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 09:59:26,067][49750] Updated weights for policy 0, policy_version 147021 (0.0035) [2024-04-26 09:59:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2408857600. Throughput: 0: 50073.4. Samples: 161710020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:59:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 09:59:28,498][49750] Updated weights for policy 0, policy_version 147031 (0.0031) [2024-04-26 09:59:32,063][49517] Fps is (10 sec: 47513.0, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2409103360. Throughput: 0: 50051.7. Samples: 162005840. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:59:32,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 09:59:32,448][49750] Updated weights for policy 0, policy_version 147041 (0.0030) [2024-04-26 09:59:35,148][49750] Updated weights for policy 0, policy_version 147051 (0.0033) [2024-04-26 09:59:37,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49425.1, 300 sec: 50040.6). Total num frames: 2409332736. Throughput: 0: 50118.4. Samples: 162157020. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-04-26 09:59:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 09:59:38,936][49750] Updated weights for policy 0, policy_version 147061 (0.0029) [2024-04-26 09:59:41,865][49750] Updated weights for policy 0, policy_version 147071 (0.0034) [2024-04-26 09:59:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2409611264. Throughput: 0: 50100.1. Samples: 162453720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 09:59:42,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 09:59:43,637][49728] Signal inference workers to stop experience collection... (2400 times) [2024-04-26 09:59:43,676][49750] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-04-26 09:59:43,709][49728] Signal inference workers to resume experience collection... (2400 times) [2024-04-26 09:59:43,709][49750] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-04-26 09:59:45,437][49750] Updated weights for policy 0, policy_version 147081 (0.0029) [2024-04-26 09:59:47,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2409873408. Throughput: 0: 50198.5. Samples: 162754580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 09:59:47,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 09:59:48,589][49750] Updated weights for policy 0, policy_version 147091 (0.0033) [2024-04-26 09:59:51,926][49750] Updated weights for policy 0, policy_version 147101 (0.0042) [2024-04-26 09:59:52,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49971.1, 300 sec: 49985.1). Total num frames: 2410102784. Throughput: 0: 49850.4. Samples: 162903780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 09:59:52,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 09:59:54,970][49750] Updated weights for policy 0, policy_version 147111 (0.0034) [2024-04-26 09:59:57,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2410348544. Throughput: 0: 50008.8. Samples: 163205960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 09:59:57,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 09:59:58,365][49750] Updated weights for policy 0, policy_version 147121 (0.0029) [2024-04-26 10:00:01,570][49750] Updated weights for policy 0, policy_version 147131 (0.0031) [2024-04-26 10:00:02,063][49517] Fps is (10 sec: 49152.5, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2410594304. Throughput: 0: 49954.5. Samples: 163500720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 10:00:02,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 10:00:04,932][49750] Updated weights for policy 0, policy_version 147141 (0.0031) [2024-04-26 10:00:07,063][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2410856448. Throughput: 0: 49730.5. Samples: 163649580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 10:00:07,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 10:00:07,926][49750] Updated weights for policy 0, policy_version 147151 (0.0031) [2024-04-26 10:00:11,427][49750] Updated weights for policy 0, policy_version 147161 (0.0038) [2024-04-26 10:00:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2411102208. Throughput: 0: 49977.8. Samples: 163959020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 10:00:12,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 10:00:14,523][49750] Updated weights for policy 0, policy_version 147171 (0.0029) [2024-04-26 10:00:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 50040.6). 
Total num frames: 2411347968. Throughput: 0: 50075.2. Samples: 164259220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 10:00:17,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 10:00:18,274][49750] Updated weights for policy 0, policy_version 147181 (0.0030) [2024-04-26 10:00:21,021][49750] Updated weights for policy 0, policy_version 147191 (0.0038) [2024-04-26 10:00:22,063][49517] Fps is (10 sec: 50790.1, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 2411610112. Throughput: 0: 49848.7. Samples: 164400220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 10:00:22,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 10:00:24,724][49750] Updated weights for policy 0, policy_version 147201 (0.0034) [2024-04-26 10:00:27,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2411872256. Throughput: 0: 50146.6. Samples: 164710320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 10:00:27,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 10:00:27,073][49728] Saving new best policy, reward=0.668! [2024-04-26 10:00:27,424][49750] Updated weights for policy 0, policy_version 147211 (0.0034) [2024-04-26 10:00:31,259][49750] Updated weights for policy 0, policy_version 147221 (0.0032) [2024-04-26 10:00:32,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2412085248. Throughput: 0: 50050.7. Samples: 165006860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 10:00:32,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 10:00:33,819][49750] Updated weights for policy 0, policy_version 147231 (0.0033) [2024-04-26 10:00:37,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.1, 300 sec: 50040.9). Total num frames: 2412363776. Throughput: 0: 50016.4. Samples: 165154520. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 10:00:37,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 10:00:37,829][49750] Updated weights for policy 0, policy_version 147241 (0.0030) [2024-04-26 10:00:40,515][49750] Updated weights for policy 0, policy_version 147251 (0.0030) [2024-04-26 10:00:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2412593152. Throughput: 0: 49918.3. Samples: 165452280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:00:42,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 10:00:42,115][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000147254_2412609536.pth... [2024-04-26 10:00:42,168][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000146520_2400583680.pth [2024-04-26 10:00:44,220][49750] Updated weights for policy 0, policy_version 147261 (0.0035) [2024-04-26 10:00:47,063][49517] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2412855296. Throughput: 0: 50124.5. Samples: 165756320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:00:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 10:00:47,308][49750] Updated weights for policy 0, policy_version 147271 (0.0030) [2024-04-26 10:00:50,654][49750] Updated weights for policy 0, policy_version 147281 (0.0032) [2024-04-26 10:00:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.4, 300 sec: 50096.2). Total num frames: 2413101056. Throughput: 0: 50147.3. Samples: 165906200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:00:52,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 10:00:54,032][49750] Updated weights for policy 0, policy_version 147291 (0.0036) [2024-04-26 10:00:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 2413346816. Throughput: 0: 50107.1. Samples: 166213840. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:00:57,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 10:00:57,236][49750] Updated weights for policy 0, policy_version 147301 (0.0029) [2024-04-26 10:01:00,526][49750] Updated weights for policy 0, policy_version 147311 (0.0029) [2024-04-26 10:01:02,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2413592576. Throughput: 0: 50175.6. Samples: 166517120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:01:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 10:01:03,689][49728] Signal inference workers to stop experience collection... (2450 times) [2024-04-26 10:01:03,726][49750] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-04-26 10:01:03,759][49728] Signal inference workers to resume experience collection... (2450 times) [2024-04-26 10:01:03,760][49750] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-04-26 10:01:03,885][49750] Updated weights for policy 0, policy_version 147321 (0.0034) [2024-04-26 10:01:07,063][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 49929.5). Total num frames: 2413854720. Throughput: 0: 50307.1. Samples: 166664040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:01:07,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 10:01:07,145][49750] Updated weights for policy 0, policy_version 147331 (0.0033) [2024-04-26 10:01:10,264][49750] Updated weights for policy 0, policy_version 147341 (0.0027) [2024-04-26 10:01:12,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2414133248. Throughput: 0: 50032.5. Samples: 166961780. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:01:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 10:01:13,705][49750] Updated weights for policy 0, policy_version 147351 (0.0029) [2024-04-26 10:01:16,781][49750] Updated weights for policy 0, policy_version 147361 (0.0029) [2024-04-26 10:01:17,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2414362624. Throughput: 0: 50141.4. Samples: 167263220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:01:17,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 10:01:20,089][49750] Updated weights for policy 0, policy_version 147371 (0.0034) [2024-04-26 10:01:22,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2414608384. Throughput: 0: 50165.5. Samples: 167411960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:01:22,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 10:01:23,575][49750] Updated weights for policy 0, policy_version 147381 (0.0033) [2024-04-26 10:01:26,499][49750] Updated weights for policy 0, policy_version 147391 (0.0035) [2024-04-26 10:01:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2414854144. Throughput: 0: 50182.1. Samples: 167710480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:01:27,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 10:01:30,288][49750] Updated weights for policy 0, policy_version 147401 (0.0033) [2024-04-26 10:01:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 2415116288. Throughput: 0: 50054.8. Samples: 168008780. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:01:32,063][49517] Avg episode reward: [(0, '0.448')] [2024-04-26 10:01:33,268][49750] Updated weights for policy 0, policy_version 147411 (0.0032) [2024-04-26 10:01:36,745][49750] Updated weights for policy 0, policy_version 147421 (0.0033) [2024-04-26 10:01:37,063][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2415362048. Throughput: 0: 50179.4. Samples: 168164280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:01:37,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 10:01:39,986][49750] Updated weights for policy 0, policy_version 147431 (0.0031) [2024-04-26 10:01:42,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50040.6). Total num frames: 2415624192. Throughput: 0: 50124.5. Samples: 168469440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-26 10:01:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:01:43,119][49750] Updated weights for policy 0, policy_version 147441 (0.0029) [2024-04-26 10:01:46,336][49750] Updated weights for policy 0, policy_version 147451 (0.0037) [2024-04-26 10:01:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2415853568. Throughput: 0: 49920.9. Samples: 168763560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:01:47,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 10:01:49,698][49750] Updated weights for policy 0, policy_version 147461 (0.0036) [2024-04-26 10:01:52,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.2, 300 sec: 50096.1). Total num frames: 2416132096. Throughput: 0: 50099.5. Samples: 168918520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:01:52,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 10:01:53,011][49750] Updated weights for policy 0, policy_version 147471 (0.0040) [2024-04-26 10:01:56,173][49750] Updated weights for policy 0, policy_version 147481 (0.0036) [2024-04-26 10:01:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2416377856. Throughput: 0: 50116.9. Samples: 169217040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:01:57,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 10:01:59,360][49750] Updated weights for policy 0, policy_version 147491 (0.0035) [2024-04-26 10:02:02,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2416607232. Throughput: 0: 50021.8. Samples: 169514200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 10:02:02,698][49750] Updated weights for policy 0, policy_version 147501 (0.0032) [2024-04-26 10:02:06,261][49750] Updated weights for policy 0, policy_version 147511 (0.0030) [2024-04-26 10:02:07,063][49517] Fps is (10 sec: 47513.0, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2416852992. Throughput: 0: 50062.6. Samples: 169664780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 10:02:09,141][49750] Updated weights for policy 0, policy_version 147521 (0.0033) [2024-04-26 10:02:12,063][49517] Fps is (10 sec: 50789.3, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 2417115136. Throughput: 0: 50038.6. Samples: 169962220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 10:02:13,035][49750] Updated weights for policy 0, policy_version 147531 (0.0037) [2024-04-26 10:02:15,708][49750] Updated weights for policy 0, policy_version 147541 (0.0028) [2024-04-26 10:02:17,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2417377280. Throughput: 0: 50083.0. Samples: 170262520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:17,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:02:18,101][49728] Signal inference workers to stop experience collection... (2500 times) [2024-04-26 10:02:18,101][49728] Signal inference workers to resume experience collection... (2500 times) [2024-04-26 10:02:18,115][49750] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-04-26 10:02:18,138][49750] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-04-26 10:02:19,601][49750] Updated weights for policy 0, policy_version 147551 (0.0034) [2024-04-26 10:02:22,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2417623040. Throughput: 0: 50192.6. Samples: 170422940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:22,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 10:02:22,173][49750] Updated weights for policy 0, policy_version 147561 (0.0030) [2024-04-26 10:02:25,964][49750] Updated weights for policy 0, policy_version 147571 (0.0033) [2024-04-26 10:02:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2417868800. Throughput: 0: 49937.7. Samples: 170716640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:27,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 10:02:28,666][49750] Updated weights for policy 0, policy_version 147581 (0.0026) [2024-04-26 10:02:32,063][49517] Fps is (10 sec: 49151.2, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2418114560. Throughput: 0: 49970.1. Samples: 171012220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 10:02:32,374][49750] Updated weights for policy 0, policy_version 147591 (0.0029) [2024-04-26 10:02:35,417][49750] Updated weights for policy 0, policy_version 147601 (0.0028) [2024-04-26 10:02:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2418376704. Throughput: 0: 50029.0. Samples: 171169820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:37,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 10:02:38,898][49750] Updated weights for policy 0, policy_version 147611 (0.0034) [2024-04-26 10:02:42,015][49750] Updated weights for policy 0, policy_version 147621 (0.0033) [2024-04-26 10:02:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.1, 300 sec: 50096.1). Total num frames: 2418622464. Throughput: 0: 49973.2. Samples: 171465840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:42,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 10:02:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000147621_2418622464.pth... [2024-04-26 10:02:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000146888_2406612992.pth [2024-04-26 10:02:45,541][49750] Updated weights for policy 0, policy_version 147631 (0.0031) [2024-04-26 10:02:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 49985.1). Total num frames: 2418851840. Throughput: 0: 49985.3. Samples: 171763540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 10:02:47,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 10:02:48,639][49750] Updated weights for policy 0, policy_version 147641 (0.0031) [2024-04-26 10:02:52,017][49750] Updated weights for policy 0, policy_version 147651 (0.0034) [2024-04-26 10:02:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 2419113984. Throughput: 0: 50050.9. Samples: 171917080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:02:52,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 10:02:55,067][49750] Updated weights for policy 0, policy_version 147661 (0.0035) [2024-04-26 10:02:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2419376128. Throughput: 0: 49986.0. Samples: 172211580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:02:57,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 10:02:58,553][49750] Updated weights for policy 0, policy_version 147671 (0.0032) [2024-04-26 10:03:01,652][49750] Updated weights for policy 0, policy_version 147681 (0.0030) [2024-04-26 10:03:02,062][49517] Fps is (10 sec: 50792.0, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2419621888. Throughput: 0: 50051.8. Samples: 172514840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:02,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 10:03:05,069][49750] Updated weights for policy 0, policy_version 147691 (0.0030) [2024-04-26 10:03:07,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2419851264. Throughput: 0: 49843.0. Samples: 172665880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:07,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:03:08,312][49750] Updated weights for policy 0, policy_version 147701 (0.0029) [2024-04-26 10:03:11,613][49750] Updated weights for policy 0, policy_version 147711 (0.0029) [2024-04-26 10:03:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.4, 300 sec: 50040.7). Total num frames: 2420113408. Throughput: 0: 49912.1. Samples: 172962680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 10:03:14,651][49750] Updated weights for policy 0, policy_version 147721 (0.0030) [2024-04-26 10:03:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2420359168. Throughput: 0: 50007.1. Samples: 173262540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 10:03:18,162][49750] Updated weights for policy 0, policy_version 147731 (0.0035) [2024-04-26 10:03:21,246][49750] Updated weights for policy 0, policy_version 147741 (0.0031) [2024-04-26 10:03:22,063][49517] Fps is (10 sec: 50789.5, 60 sec: 49971.0, 300 sec: 50096.1). Total num frames: 2420621312. Throughput: 0: 49999.8. Samples: 173419820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:22,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 10:03:24,506][49750] Updated weights for policy 0, policy_version 147751 (0.0033) [2024-04-26 10:03:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2420867072. Throughput: 0: 50081.4. Samples: 173719500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 10:03:27,793][49750] Updated weights for policy 0, policy_version 147761 (0.0037) [2024-04-26 10:03:30,933][49750] Updated weights for policy 0, policy_version 147771 (0.0034) [2024-04-26 10:03:32,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2421112832. Throughput: 0: 50191.1. Samples: 174022140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:32,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 10:03:34,398][49750] Updated weights for policy 0, policy_version 147781 (0.0030) [2024-04-26 10:03:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2421374976. Throughput: 0: 49972.6. Samples: 174165840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 10:03:37,450][49750] Updated weights for policy 0, policy_version 147791 (0.0031) [2024-04-26 10:03:40,840][49750] Updated weights for policy 0, policy_version 147801 (0.0037) [2024-04-26 10:03:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2421620736. Throughput: 0: 50191.1. Samples: 174470180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:42,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 10:03:43,941][49750] Updated weights for policy 0, policy_version 147811 (0.0028) [2024-04-26 10:03:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50040.6). Total num frames: 2421866496. Throughput: 0: 50168.8. Samples: 174772440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 10:03:47,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 10:03:47,317][49750] Updated weights for policy 0, policy_version 147821 (0.0035) [2024-04-26 10:03:50,690][49750] Updated weights for policy 0, policy_version 147831 (0.0038) [2024-04-26 10:03:52,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2422112256. Throughput: 0: 50122.1. Samples: 174921380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:03:52,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 10:03:53,619][49728] Signal inference workers to stop experience collection... (2550 times) [2024-04-26 10:03:53,619][49728] Signal inference workers to resume experience collection... (2550 times) [2024-04-26 10:03:53,637][49750] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-04-26 10:03:53,637][49750] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-04-26 10:03:53,922][49750] Updated weights for policy 0, policy_version 147841 (0.0030) [2024-04-26 10:03:57,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2422374400. Throughput: 0: 50150.2. Samples: 175219440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:03:57,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 10:03:57,527][49750] Updated weights for policy 0, policy_version 147851 (0.0030) [2024-04-26 10:04:00,367][49750] Updated weights for policy 0, policy_version 147861 (0.0030) [2024-04-26 10:04:02,063][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.2, 300 sec: 50151.7). Total num frames: 2422652928. Throughput: 0: 50265.3. Samples: 175524480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:02,072][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 10:04:03,886][49750] Updated weights for policy 0, policy_version 147871 (0.0032) [2024-04-26 10:04:06,746][49750] Updated weights for policy 0, policy_version 147881 (0.0033) [2024-04-26 10:04:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2422882304. Throughput: 0: 50277.4. Samples: 175682300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:07,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 10:04:10,289][49750] Updated weights for policy 0, policy_version 147891 (0.0036) [2024-04-26 10:04:12,063][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.0, 300 sec: 50040.6). Total num frames: 2423111680. Throughput: 0: 50178.9. Samples: 175977560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:12,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:04:13,312][49750] Updated weights for policy 0, policy_version 147901 (0.0030) [2024-04-26 10:04:16,822][49750] Updated weights for policy 0, policy_version 147911 (0.0031) [2024-04-26 10:04:17,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2423373824. Throughput: 0: 50074.1. Samples: 176275480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:17,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 10:04:20,010][49750] Updated weights for policy 0, policy_version 147921 (0.0039) [2024-04-26 10:04:22,063][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2423619584. Throughput: 0: 50325.7. Samples: 176430500. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 10:04:23,243][49750] Updated weights for policy 0, policy_version 147931 (0.0028) [2024-04-26 10:04:26,454][49750] Updated weights for policy 0, policy_version 147941 (0.0033) [2024-04-26 10:04:27,063][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2423865344. Throughput: 0: 50291.5. Samples: 176733300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:04:30,106][49750] Updated weights for policy 0, policy_version 147951 (0.0030) [2024-04-26 10:04:32,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2424127488. Throughput: 0: 50332.4. Samples: 177037400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:32,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 10:04:32,874][49750] Updated weights for policy 0, policy_version 147961 (0.0035) [2024-04-26 10:04:36,687][49750] Updated weights for policy 0, policy_version 147971 (0.0031) [2024-04-26 10:04:37,063][49517] Fps is (10 sec: 50789.4, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2424373248. Throughput: 0: 50204.9. Samples: 177180600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:37,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 10:04:39,478][49750] Updated weights for policy 0, policy_version 147981 (0.0033) [2024-04-26 10:04:42,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.2, 300 sec: 50096.2). Total num frames: 2424651776. Throughput: 0: 50283.4. Samples: 177482200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:42,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:04:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000147989_2424651776.pth... 
[2024-04-26 10:04:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000147254_2412609536.pth [2024-04-26 10:04:43,091][49750] Updated weights for policy 0, policy_version 147991 (0.0030) [2024-04-26 10:04:43,632][49728] Signal inference workers to stop experience collection... (2600 times) [2024-04-26 10:04:43,671][49750] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-04-26 10:04:43,693][49728] Signal inference workers to resume experience collection... (2600 times) [2024-04-26 10:04:43,694][49750] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-04-26 10:04:46,068][49750] Updated weights for policy 0, policy_version 148001 (0.0030) [2024-04-26 10:04:47,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.2, 300 sec: 50096.2). Total num frames: 2424881152. Throughput: 0: 50177.8. Samples: 177782480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:47,063][49517] Avg episode reward: [(0, '0.448')] [2024-04-26 10:04:49,429][49750] Updated weights for policy 0, policy_version 148011 (0.0029) [2024-04-26 10:04:52,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.5, 300 sec: 50096.2). Total num frames: 2425126912. Throughput: 0: 49962.8. Samples: 177930620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:04:52,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 10:04:52,551][49750] Updated weights for policy 0, policy_version 148021 (0.0031) [2024-04-26 10:04:56,015][49750] Updated weights for policy 0, policy_version 148031 (0.0029) [2024-04-26 10:04:57,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2425389056. Throughput: 0: 50116.2. Samples: 178232780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:04:57,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 10:04:59,233][49750] Updated weights for policy 0, policy_version 148041 (0.0039) [2024-04-26 10:05:02,062][49517] Fps is (10 sec: 50790.0, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2425634816. Throughput: 0: 50176.1. Samples: 178533400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:02,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 10:05:02,673][49750] Updated weights for policy 0, policy_version 148051 (0.0034) [2024-04-26 10:05:05,980][49750] Updated weights for policy 0, policy_version 148061 (0.0032) [2024-04-26 10:05:07,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50244.1, 300 sec: 50151.7). Total num frames: 2425896960. Throughput: 0: 50202.1. Samples: 178689600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 10:05:09,076][49750] Updated weights for policy 0, policy_version 148071 (0.0027) [2024-04-26 10:05:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.5, 300 sec: 50151.7). Total num frames: 2426142720. Throughput: 0: 50205.8. Samples: 178992560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:12,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 10:05:12,586][49750] Updated weights for policy 0, policy_version 148081 (0.0030) [2024-04-26 10:05:15,507][49750] Updated weights for policy 0, policy_version 148091 (0.0034) [2024-04-26 10:05:17,063][49517] Fps is (10 sec: 45875.7, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2426355712. Throughput: 0: 49948.3. Samples: 179285080. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:17,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 10:05:18,990][49750] Updated weights for policy 0, policy_version 148101 (0.0034) [2024-04-26 10:05:22,018][49750] Updated weights for policy 0, policy_version 148111 (0.0028) [2024-04-26 10:05:22,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50096.1). Total num frames: 2426650624. Throughput: 0: 50046.7. Samples: 179432700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:22,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 10:05:25,666][49750] Updated weights for policy 0, policy_version 148121 (0.0034) [2024-04-26 10:05:27,063][49517] Fps is (10 sec: 54067.4, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 2426896384. Throughput: 0: 50173.8. Samples: 179740020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:05:28,659][49750] Updated weights for policy 0, policy_version 148131 (0.0031) [2024-04-26 10:05:32,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.2, 300 sec: 50040.7). Total num frames: 2427125760. Throughput: 0: 50100.2. Samples: 180036980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 10:05:32,146][49750] Updated weights for policy 0, policy_version 148141 (0.0035) [2024-04-26 10:05:35,210][49750] Updated weights for policy 0, policy_version 148151 (0.0035) [2024-04-26 10:05:37,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2427387904. Throughput: 0: 50134.0. Samples: 180186660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:37,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 10:05:38,530][49750] Updated weights for policy 0, policy_version 148161 (0.0037) [2024-04-26 10:05:41,831][49750] Updated weights for policy 0, policy_version 148171 (0.0033) [2024-04-26 10:05:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49698.3, 300 sec: 50096.2). Total num frames: 2427633664. Throughput: 0: 50143.6. Samples: 180489240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 10:05:45,066][49750] Updated weights for policy 0, policy_version 148181 (0.0035) [2024-04-26 10:05:47,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2427895808. Throughput: 0: 50060.0. Samples: 180786100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:47,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 10:05:48,488][49750] Updated weights for policy 0, policy_version 148191 (0.0032) [2024-04-26 10:05:51,634][49750] Updated weights for policy 0, policy_version 148201 (0.0030) [2024-04-26 10:05:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2428141568. Throughput: 0: 50078.1. Samples: 180943100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:52,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 10:05:54,979][49750] Updated weights for policy 0, policy_version 148211 (0.0031) [2024-04-26 10:05:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2428387328. Throughput: 0: 50022.7. Samples: 181243580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 10:05:57,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:05:58,117][49750] Updated weights for policy 0, policy_version 148221 (0.0034) [2024-04-26 10:06:01,358][49750] Updated weights for policy 0, policy_version 148231 (0.0033) [2024-04-26 10:06:02,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 2428616704. Throughput: 0: 50076.9. Samples: 181538540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:02,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 10:06:02,614][49728] Signal inference workers to stop experience collection... (2650 times) [2024-04-26 10:06:02,615][49728] Signal inference workers to resume experience collection... (2650 times) [2024-04-26 10:06:02,643][49750] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-04-26 10:06:02,643][49750] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-04-26 10:06:04,522][49750] Updated weights for policy 0, policy_version 148241 (0.0033) [2024-04-26 10:06:07,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2428911616. Throughput: 0: 50179.2. Samples: 181690760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:07,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 10:06:07,801][49750] Updated weights for policy 0, policy_version 148251 (0.0038) [2024-04-26 10:06:11,276][49750] Updated weights for policy 0, policy_version 148261 (0.0029) [2024-04-26 10:06:12,063][49517] Fps is (10 sec: 54067.7, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2429157376. Throughput: 0: 50087.6. Samples: 181993960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:12,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 10:06:14,343][49750] Updated weights for policy 0, policy_version 148271 (0.0034) [2024-04-26 10:06:17,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.2, 300 sec: 50096.1). Total num frames: 2429386752. Throughput: 0: 50245.1. Samples: 182298020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:17,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 10:06:17,745][49750] Updated weights for policy 0, policy_version 148281 (0.0034) [2024-04-26 10:06:20,953][49750] Updated weights for policy 0, policy_version 148291 (0.0035) [2024-04-26 10:06:22,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49425.2, 300 sec: 50040.6). Total num frames: 2429616128. Throughput: 0: 50091.8. Samples: 182440780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 10:06:24,301][49750] Updated weights for policy 0, policy_version 148301 (0.0035) [2024-04-26 10:06:27,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2429911040. Throughput: 0: 50080.2. Samples: 182742860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:27,063][49517] Avg episode reward: [(0, '0.448')] [2024-04-26 10:06:27,381][49750] Updated weights for policy 0, policy_version 148311 (0.0035) [2024-04-26 10:06:30,802][49750] Updated weights for policy 0, policy_version 148321 (0.0033) [2024-04-26 10:06:32,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2430156800. Throughput: 0: 50297.3. Samples: 183049480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:32,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 10:06:33,796][49750] Updated weights for policy 0, policy_version 148331 (0.0034) [2024-04-26 10:06:37,062][49517] Fps is (10 sec: 47514.8, 60 sec: 49971.4, 300 sec: 50040.6). 
Total num frames: 2430386176. Throughput: 0: 50122.2. Samples: 183198600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:37,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 10:06:37,227][49750] Updated weights for policy 0, policy_version 148341 (0.0031) [2024-04-26 10:06:40,317][49750] Updated weights for policy 0, policy_version 148351 (0.0041) [2024-04-26 10:06:42,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2430648320. Throughput: 0: 50105.6. Samples: 183498340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:42,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 10:06:42,177][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000148356_2430664704.pth... [2024-04-26 10:06:42,216][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000147621_2418622464.pth [2024-04-26 10:06:43,626][49750] Updated weights for policy 0, policy_version 148361 (0.0028) [2024-04-26 10:06:46,982][49750] Updated weights for policy 0, policy_version 148371 (0.0032) [2024-04-26 10:06:47,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50244.2, 300 sec: 50096.2). Total num frames: 2430910464. Throughput: 0: 50332.5. Samples: 183803500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:47,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 10:06:50,073][49750] Updated weights for policy 0, policy_version 148381 (0.0035) [2024-04-26 10:06:52,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.2, 300 sec: 50096.1). Total num frames: 2431156224. Throughput: 0: 50265.4. Samples: 183952700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:52,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 10:06:53,601][49750] Updated weights for policy 0, policy_version 148391 (0.0035) [2024-04-26 10:06:56,515][49750] Updated weights for policy 0, policy_version 148401 (0.0037) [2024-04-26 10:06:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2431418368. Throughput: 0: 50270.3. Samples: 184256120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 10:06:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 10:07:00,349][49750] Updated weights for policy 0, policy_version 148411 (0.0030) [2024-04-26 10:07:02,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2431647744. Throughput: 0: 50180.9. Samples: 184556160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:02,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 10:07:03,024][49750] Updated weights for policy 0, policy_version 148421 (0.0033) [2024-04-26 10:07:06,782][49750] Updated weights for policy 0, policy_version 148431 (0.0037) [2024-04-26 10:07:07,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 2431893504. Throughput: 0: 50315.9. Samples: 184705000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:07,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 10:07:09,525][49750] Updated weights for policy 0, policy_version 148441 (0.0032) [2024-04-26 10:07:12,063][49517] Fps is (10 sec: 52429.5, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2432172032. Throughput: 0: 50324.6. Samples: 185007460. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:12,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 10:07:13,147][49750] Updated weights for policy 0, policy_version 148451 (0.0032) [2024-04-26 10:07:16,166][49750] Updated weights for policy 0, policy_version 148461 (0.0031) [2024-04-26 10:07:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.5, 300 sec: 50096.2). Total num frames: 2432401408. Throughput: 0: 50083.5. Samples: 185303240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 10:07:19,692][49750] Updated weights for policy 0, policy_version 148471 (0.0036) [2024-04-26 10:07:22,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.3, 300 sec: 50096.2). Total num frames: 2432647168. Throughput: 0: 50209.7. Samples: 185458040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:22,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 10:07:22,506][49728] Signal inference workers to stop experience collection... (2700 times) [2024-04-26 10:07:22,548][49750] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-04-26 10:07:22,578][49728] Signal inference workers to resume experience collection... (2700 times) [2024-04-26 10:07:22,582][49750] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-04-26 10:07:22,713][49750] Updated weights for policy 0, policy_version 148481 (0.0033) [2024-04-26 10:07:26,292][49750] Updated weights for policy 0, policy_version 148491 (0.0032) [2024-04-26 10:07:27,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2432892928. Throughput: 0: 50195.5. Samples: 185757140. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:27,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 10:07:29,258][49750] Updated weights for policy 0, policy_version 148501 (0.0035) [2024-04-26 10:07:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2433171456. Throughput: 0: 50141.8. Samples: 186059880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:32,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:07:32,770][49750] Updated weights for policy 0, policy_version 148511 (0.0030) [2024-04-26 10:07:35,878][49750] Updated weights for policy 0, policy_version 148521 (0.0030) [2024-04-26 10:07:37,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.2, 300 sec: 50151.7). Total num frames: 2433417216. Throughput: 0: 50322.6. Samples: 186217220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:37,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 10:07:39,184][49750] Updated weights for policy 0, policy_version 148531 (0.0041) [2024-04-26 10:07:42,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2433662976. Throughput: 0: 50276.9. Samples: 186518580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:42,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 10:07:42,589][49750] Updated weights for policy 0, policy_version 148541 (0.0028) [2024-04-26 10:07:45,749][49750] Updated weights for policy 0, policy_version 148551 (0.0037) [2024-04-26 10:07:47,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 2433892352. Throughput: 0: 50244.1. Samples: 186817140. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:47,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 10:07:49,085][49750] Updated weights for policy 0, policy_version 148561 (0.0038) [2024-04-26 10:07:52,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 50096.1). Total num frames: 2434154496. Throughput: 0: 50267.1. Samples: 186967020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:52,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 10:07:52,352][49750] Updated weights for policy 0, policy_version 148571 (0.0031) [2024-04-26 10:07:55,675][49750] Updated weights for policy 0, policy_version 148581 (0.0031) [2024-04-26 10:07:57,062][49517] Fps is (10 sec: 52429.9, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2434416640. Throughput: 0: 50213.5. Samples: 187267060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:07:57,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 10:07:58,973][49750] Updated weights for policy 0, policy_version 148591 (0.0033) [2024-04-26 10:08:02,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2434662400. Throughput: 0: 50200.3. Samples: 187562260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-04-26 10:08:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 10:08:02,391][49750] Updated weights for policy 0, policy_version 148601 (0.0035) [2024-04-26 10:08:05,446][49750] Updated weights for policy 0, policy_version 148611 (0.0032) [2024-04-26 10:08:07,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2434908160. Throughput: 0: 50099.1. Samples: 187712500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 10:08:08,782][49750] Updated weights for policy 0, policy_version 148621 (0.0037) [2024-04-26 10:08:11,913][49750] Updated weights for policy 0, policy_version 148631 (0.0037) [2024-04-26 10:08:12,063][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2435170304. Throughput: 0: 50067.6. Samples: 188010180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:12,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:08:15,272][49750] Updated weights for policy 0, policy_version 148641 (0.0037) [2024-04-26 10:08:17,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2435432448. Throughput: 0: 50004.8. Samples: 188310100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:17,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 10:08:18,516][49750] Updated weights for policy 0, policy_version 148651 (0.0035) [2024-04-26 10:08:21,653][49750] Updated weights for policy 0, policy_version 148661 (0.0027) [2024-04-26 10:08:22,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2435661824. Throughput: 0: 50222.0. Samples: 188477200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:22,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 10:08:24,913][49750] Updated weights for policy 0, policy_version 148671 (0.0034) [2024-04-26 10:08:27,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.5, 300 sec: 50207.3). Total num frames: 2435923968. Throughput: 0: 50139.6. Samples: 188774860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 10:08:28,566][49750] Updated weights for policy 0, policy_version 148681 (0.0033) [2024-04-26 10:08:31,423][49750] Updated weights for policy 0, policy_version 148691 (0.0034) [2024-04-26 10:08:32,063][49517] Fps is (10 sec: 49151.3, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 2436153344. Throughput: 0: 50000.1. Samples: 189067140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 10:08:35,076][49750] Updated weights for policy 0, policy_version 148701 (0.0032) [2024-04-26 10:08:37,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2436431872. Throughput: 0: 50109.3. Samples: 189221940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 10:08:38,057][49750] Updated weights for policy 0, policy_version 148711 (0.0029) [2024-04-26 10:08:41,577][49750] Updated weights for policy 0, policy_version 148721 (0.0029) [2024-04-26 10:08:42,063][49517] Fps is (10 sec: 50789.2, 60 sec: 49970.9, 300 sec: 50151.7). Total num frames: 2436661248. Throughput: 0: 50169.8. Samples: 189524720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:42,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 10:08:42,172][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000148723_2436677632.pth... [2024-04-26 10:08:42,224][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000147989_2424651776.pth [2024-04-26 10:08:42,472][49728] Signal inference workers to stop experience collection... (2750 times) [2024-04-26 10:08:42,506][49750] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-04-26 10:08:42,544][49728] Signal inference workers to resume experience collection... 
(2750 times) [2024-04-26 10:08:42,545][49750] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-04-26 10:08:44,631][49750] Updated weights for policy 0, policy_version 148731 (0.0034) [2024-04-26 10:08:47,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2436907008. Throughput: 0: 50193.0. Samples: 189820940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:47,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 10:08:48,070][49750] Updated weights for policy 0, policy_version 148741 (0.0028) [2024-04-26 10:08:51,163][49750] Updated weights for policy 0, policy_version 148751 (0.0031) [2024-04-26 10:08:52,062][49517] Fps is (10 sec: 49153.9, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2437152768. Throughput: 0: 50077.4. Samples: 189965980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:52,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 10:08:54,671][49750] Updated weights for policy 0, policy_version 148761 (0.0030) [2024-04-26 10:08:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2437414912. Throughput: 0: 50141.4. Samples: 190266540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:08:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 10:08:57,877][49750] Updated weights for policy 0, policy_version 148771 (0.0034) [2024-04-26 10:09:01,038][49750] Updated weights for policy 0, policy_version 148781 (0.0032) [2024-04-26 10:09:02,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2437677056. Throughput: 0: 50233.1. Samples: 190570580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:09:02,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 10:09:04,751][49750] Updated weights for policy 0, policy_version 148791 (0.0031) [2024-04-26 10:09:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50207.3). 
Total num frames: 2437922816. Throughput: 0: 50095.8. Samples: 190731520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:09:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:09:07,529][49750] Updated weights for policy 0, policy_version 148801 (0.0029) [2024-04-26 10:09:11,214][49750] Updated weights for policy 0, policy_version 148811 (0.0031) [2024-04-26 10:09:12,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2438152192. Throughput: 0: 50010.6. Samples: 191025340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:12,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 10:09:14,069][49750] Updated weights for policy 0, policy_version 148821 (0.0033) [2024-04-26 10:09:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 2438414336. Throughput: 0: 50107.0. Samples: 191321960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:17,063][49517] Avg episode reward: [(0, '0.445')] [2024-04-26 10:09:17,633][49750] Updated weights for policy 0, policy_version 148831 (0.0031) [2024-04-26 10:09:20,573][49750] Updated weights for policy 0, policy_version 148841 (0.0034) [2024-04-26 10:09:22,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2438692864. Throughput: 0: 50090.7. Samples: 191476020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:22,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 10:09:24,202][49750] Updated weights for policy 0, policy_version 148851 (0.0033) [2024-04-26 10:09:27,014][49750] Updated weights for policy 0, policy_version 148861 (0.0030) [2024-04-26 10:09:27,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50244.1, 300 sec: 50207.2). Total num frames: 2438938624. Throughput: 0: 50114.0. Samples: 191779840. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:27,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 10:09:30,530][49750] Updated weights for policy 0, policy_version 148871 (0.0037) [2024-04-26 10:09:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2439168000. Throughput: 0: 50221.7. Samples: 192080920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 10:09:33,552][49750] Updated weights for policy 0, policy_version 148881 (0.0031) [2024-04-26 10:09:36,895][49750] Updated weights for policy 0, policy_version 148891 (0.0035) [2024-04-26 10:09:37,062][49517] Fps is (10 sec: 49153.0, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2439430144. Throughput: 0: 50226.2. Samples: 192226160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:37,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 10:09:40,022][49750] Updated weights for policy 0, policy_version 148901 (0.0036) [2024-04-26 10:09:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.6, 300 sec: 50151.7). Total num frames: 2439675904. Throughput: 0: 50315.6. Samples: 192530740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 10:09:43,402][49750] Updated weights for policy 0, policy_version 148911 (0.0032) [2024-04-26 10:09:46,537][49750] Updated weights for policy 0, policy_version 148921 (0.0039) [2024-04-26 10:09:47,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2439938048. Throughput: 0: 50235.5. Samples: 192831180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:47,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 10:09:49,946][49750] Updated weights for policy 0, policy_version 148931 (0.0032) [2024-04-26 10:09:51,591][49728] Signal inference workers to stop experience collection... (2800 times) [2024-04-26 10:09:51,644][49750] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-04-26 10:09:51,664][49728] Signal inference workers to resume experience collection... (2800 times) [2024-04-26 10:09:51,665][49750] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-04-26 10:09:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2440183808. Throughput: 0: 50031.7. Samples: 192982940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:52,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 10:09:52,991][49750] Updated weights for policy 0, policy_version 148941 (0.0032) [2024-04-26 10:09:56,468][49750] Updated weights for policy 0, policy_version 148951 (0.0033) [2024-04-26 10:09:57,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2440429568. Throughput: 0: 50264.5. Samples: 193287240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:09:57,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 10:09:59,570][49750] Updated weights for policy 0, policy_version 148961 (0.0030) [2024-04-26 10:10:02,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 50096.2). Total num frames: 2440675328. Throughput: 0: 50408.5. Samples: 193590340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:10:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 10:10:03,266][49750] Updated weights for policy 0, policy_version 148971 (0.0033) [2024-04-26 10:10:06,063][49750] Updated weights for policy 0, policy_version 148981 (0.0033) [2024-04-26 10:10:07,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.4, 300 sec: 50207.2). Total num frames: 2440953856. Throughput: 0: 50183.7. Samples: 193734280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:10:07,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 10:10:09,768][49750] Updated weights for policy 0, policy_version 148991 (0.0029) [2024-04-26 10:10:12,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2441199616. Throughput: 0: 50268.0. Samples: 194041900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 10:10:12,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 10:10:12,467][49750] Updated weights for policy 0, policy_version 149001 (0.0030) [2024-04-26 10:10:16,453][49750] Updated weights for policy 0, policy_version 149011 (0.0032) [2024-04-26 10:10:17,063][49517] Fps is (10 sec: 45874.1, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2441412608. Throughput: 0: 50213.2. Samples: 194340520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:10:17,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 10:10:19,000][49750] Updated weights for policy 0, policy_version 149021 (0.0035) [2024-04-26 10:10:22,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 2441674752. Throughput: 0: 49954.9. Samples: 194474140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:10:22,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 10:10:22,939][49750] Updated weights for policy 0, policy_version 149031 (0.0030) [2024-04-26 10:10:25,590][49750] Updated weights for policy 0, policy_version 149041 (0.0024) [2024-04-26 10:10:27,063][49517] Fps is (10 sec: 54067.6, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2441953280. Throughput: 0: 50029.1. Samples: 194782060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:10:27,063][49517] Avg episode reward: [(0, '0.431')] [2024-04-26 10:10:29,344][49750] Updated weights for policy 0, policy_version 149051 (0.0029) [2024-04-26 10:10:32,063][49517] Fps is (10 sec: 50787.8, 60 sec: 50243.8, 300 sec: 50151.6). Total num frames: 2442182656. Throughput: 0: 50061.6. Samples: 195083980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:10:32,064][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 10:10:32,289][49750] Updated weights for policy 0, policy_version 149061 (0.0033) [2024-04-26 10:10:35,902][49750] Updated weights for policy 0, policy_version 149071 (0.0030) [2024-04-26 10:10:37,063][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2442428416. Throughput: 0: 50065.2. Samples: 195235880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:10:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 10:10:38,737][49750] Updated weights for policy 0, policy_version 149081 (0.0032) [2024-04-26 10:10:42,063][49517] Fps is (10 sec: 49154.3, 60 sec: 49971.0, 300 sec: 50096.1). Total num frames: 2442674176. Throughput: 0: 49975.8. Samples: 195536160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:10:42,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 10:10:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000149089_2442674176.pth... 
[2024-04-26 10:10:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000148356_2430664704.pth [2024-04-26 10:10:42,609][49750] Updated weights for policy 0, policy_version 149091 (0.0032) [2024-04-26 10:10:45,148][49750] Updated weights for policy 0, policy_version 149101 (0.0033) [2024-04-26 10:10:47,062][49517] Fps is (10 sec: 50791.3, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2442936320. Throughput: 0: 49922.5. Samples: 195836840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:10:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 10:10:49,103][49750] Updated weights for policy 0, policy_version 149111 (0.0031) [2024-04-26 10:10:51,683][49750] Updated weights for policy 0, policy_version 149121 (0.0034) [2024-04-26 10:10:52,062][49517] Fps is (10 sec: 54068.3, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2443214848. Throughput: 0: 50126.7. Samples: 195989980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:10:52,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 10:10:55,687][49750] Updated weights for policy 0, policy_version 149131 (0.0030) [2024-04-26 10:10:55,707][49728] Signal inference workers to stop experience collection... (2850 times) [2024-04-26 10:10:55,711][49728] Signal inference workers to resume experience collection... (2850 times) [2024-04-26 10:10:55,738][49750] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-04-26 10:10:55,738][49750] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-04-26 10:10:57,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2443444224. Throughput: 0: 50042.3. Samples: 196293800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:10:57,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 10:10:58,328][49750] Updated weights for policy 0, policy_version 149141 (0.0031) [2024-04-26 10:11:02,063][49517] Fps is (10 sec: 45874.0, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2443673600. Throughput: 0: 50116.5. Samples: 196595760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:11:02,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 10:11:02,131][49750] Updated weights for policy 0, policy_version 149151 (0.0032) [2024-04-26 10:11:04,671][49750] Updated weights for policy 0, policy_version 149161 (0.0026) [2024-04-26 10:11:07,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49698.0, 300 sec: 50096.1). Total num frames: 2443935744. Throughput: 0: 50297.2. Samples: 196737520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:11:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 10:11:08,642][49750] Updated weights for policy 0, policy_version 149171 (0.0030) [2024-04-26 10:11:11,026][49750] Updated weights for policy 0, policy_version 149181 (0.0033) [2024-04-26 10:11:12,062][49517] Fps is (10 sec: 52430.0, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2444197888. Throughput: 0: 50180.7. Samples: 197040180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 10:11:12,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 10:11:15,205][49750] Updated weights for policy 0, policy_version 149191 (0.0027) [2024-04-26 10:11:17,063][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.6, 300 sec: 50373.8). Total num frames: 2444476416. Throughput: 0: 50262.4. Samples: 197345760. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:11:17,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 10:11:17,522][49750] Updated weights for policy 0, policy_version 149201 (0.0028) [2024-04-26 10:11:21,587][49750] Updated weights for policy 0, policy_version 149211 (0.0035) [2024-04-26 10:11:22,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.4, 300 sec: 50040.7). Total num frames: 2444673024. Throughput: 0: 50246.4. Samples: 197496960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:11:22,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 10:11:23,994][49750] Updated weights for policy 0, policy_version 149221 (0.0032) [2024-04-26 10:11:27,063][49517] Fps is (10 sec: 45875.0, 60 sec: 49698.2, 300 sec: 50096.1). Total num frames: 2444935168. Throughput: 0: 50365.8. Samples: 197802620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:11:27,063][49517] Avg episode reward: [(0, '0.464')] [2024-04-26 10:11:27,987][49750] Updated weights for policy 0, policy_version 149231 (0.0032) [2024-04-26 10:11:30,509][49750] Updated weights for policy 0, policy_version 149241 (0.0033) [2024-04-26 10:11:32,063][49517] Fps is (10 sec: 54065.9, 60 sec: 50517.7, 300 sec: 50262.7). Total num frames: 2445213696. Throughput: 0: 50312.2. Samples: 198100900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:11:32,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 10:11:34,608][49750] Updated weights for policy 0, policy_version 149251 (0.0034) [2024-04-26 10:11:37,062][49517] Fps is (10 sec: 54068.3, 60 sec: 50790.5, 300 sec: 50262.8). Total num frames: 2445475840. Throughput: 0: 50440.0. Samples: 198259780. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:11:37,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 10:11:37,091][49750] Updated weights for policy 0, policy_version 149261 (0.0033) [2024-04-26 10:11:41,220][49750] Updated weights for policy 0, policy_version 149271 (0.0030) [2024-04-26 10:11:42,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2445688832. Throughput: 0: 50448.4. Samples: 198563980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:11:42,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 10:11:43,533][49750] Updated weights for policy 0, policy_version 149281 (0.0035) [2024-04-26 10:11:47,062][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.1, 300 sec: 50096.2). Total num frames: 2445934592. Throughput: 0: 50362.0. Samples: 198862040. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:11:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 10:11:47,573][49750] Updated weights for policy 0, policy_version 149291 (0.0027) [2024-04-26 10:11:50,128][49750] Updated weights for policy 0, policy_version 149301 (0.0037) [2024-04-26 10:11:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2446213120. Throughput: 0: 50415.3. Samples: 199006200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:11:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:11:52,511][49728] Signal inference workers to stop experience collection... (2900 times) [2024-04-26 10:11:52,559][49750] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-04-26 10:11:52,623][49728] Signal inference workers to resume experience collection... 
(2900 times) [2024-04-26 10:11:52,623][49750] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-04-26 10:11:53,999][49750] Updated weights for policy 0, policy_version 149311 (0.0034) [2024-04-26 10:11:56,611][49750] Updated weights for policy 0, policy_version 149321 (0.0027) [2024-04-26 10:11:57,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2446475264. Throughput: 0: 50435.0. Samples: 199309760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:11:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 10:12:00,596][49750] Updated weights for policy 0, policy_version 149331 (0.0034) [2024-04-26 10:12:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50262.8). Total num frames: 2446721024. Throughput: 0: 50301.8. Samples: 199609340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:12:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 10:12:03,148][49750] Updated weights for policy 0, policy_version 149341 (0.0036) [2024-04-26 10:12:07,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2446950400. Throughput: 0: 50328.7. Samples: 199761760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:12:07,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 10:12:07,175][49750] Updated weights for policy 0, policy_version 149351 (0.0034) [2024-04-26 10:12:09,807][49750] Updated weights for policy 0, policy_version 149361 (0.0029) [2024-04-26 10:12:12,063][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2447196160. Throughput: 0: 50240.0. Samples: 200063420. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:12:12,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 10:12:13,835][49750] Updated weights for policy 0, policy_version 149371 (0.0029) [2024-04-26 10:12:16,159][49750] Updated weights for policy 0, policy_version 149381 (0.0034) [2024-04-26 10:12:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2447474688. Throughput: 0: 50141.1. Samples: 200357240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 24.0) [2024-04-26 10:12:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 10:12:20,270][49750] Updated weights for policy 0, policy_version 149391 (0.0031) [2024-04-26 10:12:22,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.4, 300 sec: 50318.3). Total num frames: 2447736832. Throughput: 0: 50278.1. Samples: 200522300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:12:22,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 10:12:22,564][49750] Updated weights for policy 0, policy_version 149401 (0.0028) [2024-04-26 10:12:26,792][49750] Updated weights for policy 0, policy_version 149411 (0.0033) [2024-04-26 10:12:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50151.7). Total num frames: 2447966208. Throughput: 0: 50148.1. Samples: 200820640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:12:27,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 10:12:29,498][49750] Updated weights for policy 0, policy_version 149421 (0.0037) [2024-04-26 10:12:32,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2448211968. Throughput: 0: 50274.7. Samples: 201124400. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:12:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 10:12:33,254][49750] Updated weights for policy 0, policy_version 149431 (0.0030) [2024-04-26 10:12:36,137][49750] Updated weights for policy 0, policy_version 149441 (0.0030) [2024-04-26 10:12:37,063][49517] Fps is (10 sec: 50789.5, 60 sec: 49971.0, 300 sec: 50207.2). Total num frames: 2448474112. Throughput: 0: 50308.4. Samples: 201270080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:12:37,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 10:12:39,668][49750] Updated weights for policy 0, policy_version 149451 (0.0030) [2024-04-26 10:12:42,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2448736256. Throughput: 0: 50310.7. Samples: 201573740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:12:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 10:12:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000149459_2448736256.pth... [2024-04-26 10:12:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000148723_2436677632.pth [2024-04-26 10:12:42,733][49750] Updated weights for policy 0, policy_version 149461 (0.0030) [2024-04-26 10:12:46,108][49750] Updated weights for policy 0, policy_version 149471 (0.0033) [2024-04-26 10:12:47,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50207.3). Total num frames: 2448965632. Throughput: 0: 50358.4. Samples: 201875460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:12:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:12:49,189][49750] Updated weights for policy 0, policy_version 149481 (0.0029) [2024-04-26 10:12:52,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2449211392. Throughput: 0: 50198.8. Samples: 202020700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:12:52,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 10:12:52,649][49750] Updated weights for policy 0, policy_version 149491 (0.0033) [2024-04-26 10:12:55,776][49750] Updated weights for policy 0, policy_version 149501 (0.0028) [2024-04-26 10:12:57,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49698.2, 300 sec: 50151.7). Total num frames: 2449457152. Throughput: 0: 50290.7. Samples: 202326500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:12:57,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 10:12:59,041][49728] Signal inference workers to stop experience collection... (2950 times) [2024-04-26 10:12:59,041][49728] Signal inference workers to resume experience collection... (2950 times) [2024-04-26 10:12:59,065][49750] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-04-26 10:12:59,065][49750] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-04-26 10:12:59,166][49750] Updated weights for policy 0, policy_version 149511 (0.0030) [2024-04-26 10:13:02,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2449735680. Throughput: 0: 50283.1. Samples: 202619980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:13:02,071][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 10:13:02,301][49750] Updated weights for policy 0, policy_version 149521 (0.0037) [2024-04-26 10:13:05,680][49750] Updated weights for policy 0, policy_version 149531 (0.0031) [2024-04-26 10:13:07,062][49517] Fps is (10 sec: 54067.7, 60 sec: 50790.5, 300 sec: 50262.8). Total num frames: 2449997824. Throughput: 0: 50183.6. Samples: 202780560. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:13:07,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 10:13:08,919][49750] Updated weights for policy 0, policy_version 149541 (0.0033) [2024-04-26 10:13:12,043][49750] Updated weights for policy 0, policy_version 149551 (0.0029) [2024-04-26 10:13:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50207.2). Total num frames: 2450243584. Throughput: 0: 50425.6. Samples: 203089800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:13:12,072][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 10:13:15,300][49750] Updated weights for policy 0, policy_version 149561 (0.0028) [2024-04-26 10:13:17,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 2450456576. Throughput: 0: 50284.5. Samples: 203387200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:13:17,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 10:13:18,760][49750] Updated weights for policy 0, policy_version 149571 (0.0036) [2024-04-26 10:13:21,621][49750] Updated weights for policy 0, policy_version 149581 (0.0027) [2024-04-26 10:13:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.3, 300 sec: 50207.2). Total num frames: 2450735104. Throughput: 0: 50405.5. Samples: 203538320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 10:13:22,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 10:13:25,217][49750] Updated weights for policy 0, policy_version 149591 (0.0030) [2024-04-26 10:13:27,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2450980864. Throughput: 0: 50292.0. Samples: 203836880. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:13:27,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 10:13:28,227][49750] Updated weights for policy 0, policy_version 149601 (0.0033) [2024-04-26 10:13:31,785][49750] Updated weights for policy 0, policy_version 149611 (0.0038) [2024-04-26 10:13:32,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.1, 300 sec: 50151.7). Total num frames: 2451226624. Throughput: 0: 50306.4. Samples: 204139260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:13:32,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 10:13:34,736][49750] Updated weights for policy 0, policy_version 149621 (0.0031) [2024-04-26 10:13:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2451472384. Throughput: 0: 50251.9. Samples: 204282040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:13:37,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 10:13:38,407][49750] Updated weights for policy 0, policy_version 149631 (0.0028) [2024-04-26 10:13:41,193][49750] Updated weights for policy 0, policy_version 149641 (0.0028) [2024-04-26 10:13:42,063][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2451734528. Throughput: 0: 50228.4. Samples: 204586780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:13:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 10:13:44,768][49750] Updated weights for policy 0, policy_version 149651 (0.0031) [2024-04-26 10:13:47,063][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2452013056. Throughput: 0: 50319.5. Samples: 204884360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:13:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 10:13:48,128][49750] Updated weights for policy 0, policy_version 149661 (0.0034) [2024-04-26 10:13:51,230][49750] Updated weights for policy 0, policy_version 149671 (0.0029) [2024-04-26 10:13:52,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2452242432. Throughput: 0: 50268.7. Samples: 205042660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:13:52,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 10:13:54,730][49750] Updated weights for policy 0, policy_version 149681 (0.0032) [2024-04-26 10:13:57,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50207.2). Total num frames: 2452488192. Throughput: 0: 50141.0. Samples: 205346140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:13:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 10:13:57,785][49750] Updated weights for policy 0, policy_version 149691 (0.0031) [2024-04-26 10:14:01,264][49750] Updated weights for policy 0, policy_version 149701 (0.0041) [2024-04-26 10:14:02,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49698.2, 300 sec: 50151.7). Total num frames: 2452717568. Throughput: 0: 50185.8. Samples: 205645560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:14:02,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 10:14:04,380][49750] Updated weights for policy 0, policy_version 149711 (0.0036) [2024-04-26 10:14:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2452996096. Throughput: 0: 50044.0. Samples: 205790300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:14:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 10:14:07,764][49750] Updated weights for policy 0, policy_version 149721 (0.0031) [2024-04-26 10:14:10,798][49750] Updated weights for policy 0, policy_version 149731 (0.0032) [2024-04-26 10:14:12,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2453258240. Throughput: 0: 50113.8. Samples: 206092000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:14:12,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 10:14:14,134][49750] Updated weights for policy 0, policy_version 149741 (0.0029) [2024-04-26 10:14:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2453487616. Throughput: 0: 50153.2. Samples: 206396140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:14:17,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 10:14:17,535][49750] Updated weights for policy 0, policy_version 149751 (0.0035) [2024-04-26 10:14:20,606][49750] Updated weights for policy 0, policy_version 149761 (0.0029) [2024-04-26 10:14:21,872][49728] Signal inference workers to stop experience collection... (3000 times) [2024-04-26 10:14:21,873][49728] Signal inference workers to resume experience collection... (3000 times) [2024-04-26 10:14:21,906][49750] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-04-26 10:14:21,906][49750] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-04-26 10:14:22,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2453733376. Throughput: 0: 50153.3. Samples: 206538940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 10:14:22,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 10:14:24,218][49750] Updated weights for policy 0, policy_version 149771 (0.0029) [2024-04-26 10:14:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2453995520. Throughput: 0: 50075.3. Samples: 206840160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:14:27,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 10:14:27,109][49750] Updated weights for policy 0, policy_version 149781 (0.0040) [2024-04-26 10:14:30,787][49750] Updated weights for policy 0, policy_version 149791 (0.0036) [2024-04-26 10:14:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2454257664. Throughput: 0: 50197.4. Samples: 207143240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:14:32,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 10:14:33,502][49750] Updated weights for policy 0, policy_version 149801 (0.0037) [2024-04-26 10:14:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50207.2). Total num frames: 2454487040. Throughput: 0: 50189.6. Samples: 207301180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:14:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:14:37,218][49750] Updated weights for policy 0, policy_version 149811 (0.0033) [2024-04-26 10:14:40,065][49750] Updated weights for policy 0, policy_version 149821 (0.0033) [2024-04-26 10:14:42,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2454732800. Throughput: 0: 50072.3. Samples: 207599400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:14:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 10:14:42,157][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000149826_2454749184.pth... 
[2024-04-26 10:14:42,197][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000149089_2442674176.pth [2024-04-26 10:14:43,589][49750] Updated weights for policy 0, policy_version 149831 (0.0030) [2024-04-26 10:14:46,776][49750] Updated weights for policy 0, policy_version 149841 (0.0029) [2024-04-26 10:14:47,063][49517] Fps is (10 sec: 50789.1, 60 sec: 49698.0, 300 sec: 50207.2). Total num frames: 2454994944. Throughput: 0: 50076.7. Samples: 207899020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:14:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 10:14:50,185][49750] Updated weights for policy 0, policy_version 149851 (0.0034) [2024-04-26 10:14:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2455257088. Throughput: 0: 50294.2. Samples: 208053540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:14:52,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 10:14:53,656][49750] Updated weights for policy 0, policy_version 149861 (0.0036) [2024-04-26 10:14:56,680][49750] Updated weights for policy 0, policy_version 149871 (0.0032) [2024-04-26 10:14:57,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2455519232. Throughput: 0: 50284.9. Samples: 208354820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:14:57,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 10:15:00,205][49750] Updated weights for policy 0, policy_version 149881 (0.0033) [2024-04-26 10:15:02,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2455732224. Throughput: 0: 50161.3. Samples: 208653400. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:15:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 10:15:03,158][49750] Updated weights for policy 0, policy_version 149891 (0.0031) [2024-04-26 10:15:06,708][49750] Updated weights for policy 0, policy_version 149901 (0.0035) [2024-04-26 10:15:07,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2455994368. Throughput: 0: 50127.6. Samples: 208794680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:15:07,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 10:15:09,488][49750] Updated weights for policy 0, policy_version 149911 (0.0029) [2024-04-26 10:15:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2456256512. Throughput: 0: 50203.9. Samples: 209099340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:15:12,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 10:15:13,184][49750] Updated weights for policy 0, policy_version 149921 (0.0031) [2024-04-26 10:15:16,183][49750] Updated weights for policy 0, policy_version 149931 (0.0033) [2024-04-26 10:15:17,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2456518656. Throughput: 0: 50122.6. Samples: 209398760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:15:17,064][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 10:15:19,683][49750] Updated weights for policy 0, policy_version 149941 (0.0034) [2024-04-26 10:15:22,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2456731648. Throughput: 0: 50079.9. Samples: 209554780. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:15:22,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 10:15:22,809][49750] Updated weights for policy 0, policy_version 149951 (0.0029) [2024-04-26 10:15:26,061][49750] Updated weights for policy 0, policy_version 149961 (0.0033) [2024-04-26 10:15:27,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.2, 300 sec: 50207.4). Total num frames: 2456993792. Throughput: 0: 50028.2. Samples: 209850660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 10:15:27,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 10:15:29,184][49750] Updated weights for policy 0, policy_version 149971 (0.0037) [2024-04-26 10:15:32,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50244.1, 300 sec: 50318.3). Total num frames: 2457272320. Throughput: 0: 49991.1. Samples: 210148620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:15:32,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 10:15:32,409][49750] Updated weights for policy 0, policy_version 149981 (0.0028) [2024-04-26 10:15:35,060][49728] Signal inference workers to stop experience collection... (3050 times) [2024-04-26 10:15:35,080][49750] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-04-26 10:15:35,131][49728] Signal inference workers to resume experience collection... (3050 times) [2024-04-26 10:15:35,132][49750] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-04-26 10:15:35,696][49750] Updated weights for policy 0, policy_version 149991 (0.0037) [2024-04-26 10:15:37,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.3, 300 sec: 50318.4). Total num frames: 2457518080. Throughput: 0: 50286.3. Samples: 210316420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:15:37,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 10:15:38,893][49750] Updated weights for policy 0, policy_version 150001 (0.0028) [2024-04-26 10:15:42,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50262.7). Total num frames: 2457763840. Throughput: 0: 50180.3. Samples: 210612940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:15:42,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 10:15:42,202][49750] Updated weights for policy 0, policy_version 150011 (0.0034) [2024-04-26 10:15:45,669][49750] Updated weights for policy 0, policy_version 150021 (0.0033) [2024-04-26 10:15:47,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.4, 300 sec: 50096.2). Total num frames: 2457993216. Throughput: 0: 50230.6. Samples: 210913780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:15:47,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 10:15:48,768][49750] Updated weights for policy 0, policy_version 150031 (0.0028) [2024-04-26 10:15:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2458255360. Throughput: 0: 50300.5. Samples: 211058200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:15:52,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:15:52,148][49750] Updated weights for policy 0, policy_version 150041 (0.0038) [2024-04-26 10:15:55,372][49750] Updated weights for policy 0, policy_version 150051 (0.0029) [2024-04-26 10:15:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2458517504. Throughput: 0: 50206.6. Samples: 211358640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:15:57,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 10:15:58,731][49750] Updated weights for policy 0, policy_version 150061 (0.0028) [2024-04-26 10:16:02,002][49750] Updated weights for policy 0, policy_version 150071 (0.0033) [2024-04-26 10:16:02,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2458763264. Throughput: 0: 50318.3. Samples: 211663080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:16:02,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 10:16:05,087][49750] Updated weights for policy 0, policy_version 150081 (0.0035) [2024-04-26 10:16:07,062][49517] Fps is (10 sec: 45875.7, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2458976256. Throughput: 0: 50173.8. Samples: 211812600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:16:07,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 10:16:08,364][49750] Updated weights for policy 0, policy_version 150091 (0.0034) [2024-04-26 10:16:11,493][49750] Updated weights for policy 0, policy_version 150101 (0.0030) [2024-04-26 10:16:12,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 50096.2). Total num frames: 2459254784. Throughput: 0: 50139.8. Samples: 212106960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:16:12,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 10:16:14,983][49750] Updated weights for policy 0, policy_version 150111 (0.0034) [2024-04-26 10:16:17,063][49517] Fps is (10 sec: 55705.3, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 2459533312. Throughput: 0: 50226.8. Samples: 212408820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:16:17,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 10:16:18,050][49750] Updated weights for policy 0, policy_version 150121 (0.0032) [2024-04-26 10:16:21,665][49750] Updated weights for policy 0, policy_version 150131 (0.0042) [2024-04-26 10:16:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.5, 300 sec: 50318.3). Total num frames: 2459779072. Throughput: 0: 50162.6. Samples: 212573740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:16:22,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 10:16:24,502][49750] Updated weights for policy 0, policy_version 150141 (0.0033) [2024-04-26 10:16:27,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2460008448. Throughput: 0: 50301.0. Samples: 212876480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:16:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 10:16:28,168][49750] Updated weights for policy 0, policy_version 150151 (0.0034) [2024-04-26 10:16:31,163][49750] Updated weights for policy 0, policy_version 150161 (0.0038) [2024-04-26 10:16:32,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49698.3, 300 sec: 50096.2). Total num frames: 2460254208. Throughput: 0: 50148.4. Samples: 213170460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:16:32,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 10:16:34,757][49750] Updated weights for policy 0, policy_version 150171 (0.0038) [2024-04-26 10:16:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.1, 300 sec: 50318.3). Total num frames: 2460532736. Throughput: 0: 50271.4. Samples: 213320420. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:16:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 10:16:37,758][49750] Updated weights for policy 0, policy_version 150181 (0.0033) [2024-04-26 10:16:41,207][49750] Updated weights for policy 0, policy_version 150191 (0.0037) [2024-04-26 10:16:42,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2460778496. Throughput: 0: 50280.5. Samples: 213621260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:16:42,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 10:16:42,167][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000150195_2460794880.pth... [2024-04-26 10:16:42,211][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000149459_2448736256.pth [2024-04-26 10:16:44,370][49750] Updated weights for policy 0, policy_version 150201 (0.0026) [2024-04-26 10:16:46,160][49728] Signal inference workers to stop experience collection... (3100 times) [2024-04-26 10:16:46,215][49750] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-04-26 10:16:46,221][49728] Signal inference workers to resume experience collection... (3100 times) [2024-04-26 10:16:46,228][49750] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-04-26 10:16:47,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2461007872. Throughput: 0: 50206.3. Samples: 213922360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:16:47,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 10:16:47,692][49750] Updated weights for policy 0, policy_version 150211 (0.0030) [2024-04-26 10:16:51,027][49750] Updated weights for policy 0, policy_version 150221 (0.0029) [2024-04-26 10:16:52,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49698.1, 300 sec: 50040.6). Total num frames: 2461237248. Throughput: 0: 49956.9. Samples: 214060660. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:16:52,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 10:16:54,154][49750] Updated weights for policy 0, policy_version 150231 (0.0024) [2024-04-26 10:16:57,063][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2461515776. Throughput: 0: 50157.4. Samples: 214364040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:16:57,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 10:16:57,662][49750] Updated weights for policy 0, policy_version 150241 (0.0034) [2024-04-26 10:17:00,546][49750] Updated weights for policy 0, policy_version 150251 (0.0035) [2024-04-26 10:17:02,062][49517] Fps is (10 sec: 55705.7, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2461794304. Throughput: 0: 50248.9. Samples: 214670020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:17:02,063][49517] Avg episode reward: [(0, '0.436')] [2024-04-26 10:17:04,066][49750] Updated weights for policy 0, policy_version 150261 (0.0036) [2024-04-26 10:17:07,055][49750] Updated weights for policy 0, policy_version 150271 (0.0036) [2024-04-26 10:17:07,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50318.4). Total num frames: 2462040064. Throughput: 0: 50135.6. Samples: 214829840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:17:07,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 10:17:10,778][49750] Updated weights for policy 0, policy_version 150281 (0.0028) [2024-04-26 10:17:12,062][49517] Fps is (10 sec: 44236.9, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2462236672. Throughput: 0: 50119.5. Samples: 215131860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:17:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 10:17:13,721][49750] Updated weights for policy 0, policy_version 150291 (0.0032) [2024-04-26 10:17:17,063][49517] Fps is (10 sec: 45874.2, 60 sec: 49425.0, 300 sec: 50040.6). 
Total num frames: 2462498816. Throughput: 0: 50083.8. Samples: 215424240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:17:17,063][49517] Avg episode reward: [(0, '0.436')] [2024-04-26 10:17:17,320][49750] Updated weights for policy 0, policy_version 150301 (0.0029) [2024-04-26 10:17:20,229][49750] Updated weights for policy 0, policy_version 150311 (0.0034) [2024-04-26 10:17:22,062][49517] Fps is (10 sec: 57343.9, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2462810112. Throughput: 0: 50144.1. Samples: 215576900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:17:22,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:17:23,813][49750] Updated weights for policy 0, policy_version 150321 (0.0035) [2024-04-26 10:17:26,554][49750] Updated weights for policy 0, policy_version 150331 (0.0037) [2024-04-26 10:17:27,063][49517] Fps is (10 sec: 54067.1, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2463039488. Throughput: 0: 50182.1. Samples: 215879460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:17:27,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 10:17:30,423][49750] Updated weights for policy 0, policy_version 150341 (0.0029) [2024-04-26 10:17:32,063][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2463268864. Throughput: 0: 50436.8. Samples: 216192020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:17:32,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 10:17:32,996][49750] Updated weights for policy 0, policy_version 150351 (0.0028) [2024-04-26 10:17:37,062][49517] Fps is (10 sec: 45875.7, 60 sec: 49425.1, 300 sec: 50040.6). Total num frames: 2463498240. Throughput: 0: 50302.2. Samples: 216324260. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 10:17:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 10:17:37,187][49750] Updated weights for policy 0, policy_version 150361 (0.0033) [2024-04-26 10:17:39,634][49750] Updated weights for policy 0, policy_version 150371 (0.0034) [2024-04-26 10:17:42,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50244.2, 300 sec: 50262.7). Total num frames: 2463793152. Throughput: 0: 50217.7. Samples: 216623840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:17:42,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 10:17:44,047][49750] Updated weights for policy 0, policy_version 150381 (0.0034) [2024-04-26 10:17:44,909][49728] Signal inference workers to stop experience collection... (3150 times) [2024-04-26 10:17:44,909][49728] Signal inference workers to resume experience collection... (3150 times) [2024-04-26 10:17:44,935][49750] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-04-26 10:17:44,935][49750] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-04-26 10:17:46,131][49750] Updated weights for policy 0, policy_version 150391 (0.0037) [2024-04-26 10:17:47,062][49517] Fps is (10 sec: 55705.7, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2464055296. Throughput: 0: 50035.6. Samples: 216921620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:17:47,063][49517] Avg episode reward: [(0, '0.453')] [2024-04-26 10:17:50,563][49750] Updated weights for policy 0, policy_version 150401 (0.0032) [2024-04-26 10:17:52,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 2464284672. Throughput: 0: 50261.5. Samples: 217091620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:17:52,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 10:17:52,624][49750] Updated weights for policy 0, policy_version 150411 (0.0031) [2024-04-26 10:17:57,062][49517] Fps is (10 sec: 42598.5, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2464481280. Throughput: 0: 50013.8. Samples: 217382480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:17:57,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 10:17:57,099][49750] Updated weights for policy 0, policy_version 150421 (0.0031) [2024-04-26 10:17:59,280][49750] Updated weights for policy 0, policy_version 150431 (0.0027) [2024-04-26 10:18:02,062][49517] Fps is (10 sec: 49153.1, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2464776192. Throughput: 0: 50010.0. Samples: 217674680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:18:02,063][49517] Avg episode reward: [(0, '0.432')] [2024-04-26 10:18:03,472][49750] Updated weights for policy 0, policy_version 150441 (0.0030) [2024-04-26 10:18:05,737][49750] Updated weights for policy 0, policy_version 150451 (0.0033) [2024-04-26 10:18:07,063][49517] Fps is (10 sec: 58982.0, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2465071104. Throughput: 0: 50188.4. Samples: 217835380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:18:07,063][49517] Avg episode reward: [(0, '0.452')] [2024-04-26 10:18:09,867][49750] Updated weights for policy 0, policy_version 150461 (0.0035) [2024-04-26 10:18:12,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50318.3). Total num frames: 2465300480. Throughput: 0: 50230.3. Samples: 218139820. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:18:12,063][49517] Avg episode reward: [(0, '0.469')] [2024-04-26 10:18:12,193][49750] Updated weights for policy 0, policy_version 150471 (0.0028) [2024-04-26 10:18:16,401][49750] Updated weights for policy 0, policy_version 150481 (0.0038) [2024-04-26 10:18:17,062][49517] Fps is (10 sec: 42598.9, 60 sec: 49971.4, 300 sec: 50040.6). Total num frames: 2465497088. Throughput: 0: 49867.7. Samples: 218436060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:18:17,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 10:18:18,690][49750] Updated weights for policy 0, policy_version 150491 (0.0030) [2024-04-26 10:18:22,062][49517] Fps is (10 sec: 44237.2, 60 sec: 48879.0, 300 sec: 50040.6). Total num frames: 2465742848. Throughput: 0: 49889.4. Samples: 218569280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:18:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 10:18:23,095][49750] Updated weights for policy 0, policy_version 150501 (0.0034) [2024-04-26 10:18:25,268][49750] Updated weights for policy 0, policy_version 150511 (0.0038) [2024-04-26 10:18:27,063][49517] Fps is (10 sec: 55704.9, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2466054144. Throughput: 0: 49948.5. Samples: 218871520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:18:27,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 10:18:29,443][49750] Updated weights for policy 0, policy_version 150521 (0.0026) [2024-04-26 10:18:31,668][49750] Updated weights for policy 0, policy_version 150531 (0.0029) [2024-04-26 10:18:32,062][49517] Fps is (10 sec: 57344.0, 60 sec: 50790.5, 300 sec: 50318.3). Total num frames: 2466316288. Throughput: 0: 50236.0. Samples: 219182240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:18:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 10:18:36,187][49750] Updated weights for policy 0, policy_version 150541 (0.0031) [2024-04-26 10:18:36,859][49728] Signal inference workers to stop experience collection... (3200 times) [2024-04-26 10:18:36,899][49750] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-04-26 10:18:36,962][49728] Signal inference workers to resume experience collection... (3200 times) [2024-04-26 10:18:36,962][49750] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-04-26 10:18:37,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2466512896. Throughput: 0: 50021.2. Samples: 219342560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-26 10:18:37,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 10:18:38,115][49750] Updated weights for policy 0, policy_version 150551 (0.0031) [2024-04-26 10:18:42,062][49517] Fps is (10 sec: 42598.5, 60 sec: 49152.2, 300 sec: 49929.6). Total num frames: 2466742272. Throughput: 0: 50112.5. Samples: 219637540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:18:42,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 10:18:42,083][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000150559_2466758656.pth... [2024-04-26 10:18:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000149826_2454749184.pth [2024-04-26 10:18:42,821][49750] Updated weights for policy 0, policy_version 150561 (0.0036) [2024-04-26 10:18:44,648][49750] Updated weights for policy 0, policy_version 150571 (0.0030) [2024-04-26 10:18:47,062][49517] Fps is (10 sec: 52428.6, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 2467037184. Throughput: 0: 50055.5. Samples: 219927180. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:18:47,063][49517] Avg episode reward: [(0, '0.450')] [2024-04-26 10:18:49,247][49750] Updated weights for policy 0, policy_version 150581 (0.0032) [2024-04-26 10:18:51,227][49750] Updated weights for policy 0, policy_version 150591 (0.0029) [2024-04-26 10:18:52,063][49517] Fps is (10 sec: 58981.1, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2467332096. Throughput: 0: 50279.9. Samples: 220097980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:18:52,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 10:18:55,724][49750] Updated weights for policy 0, policy_version 150601 (0.0031) [2024-04-26 10:18:57,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.3, 300 sec: 50262.8). Total num frames: 2467545088. Throughput: 0: 50214.6. Samples: 220399480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:18:57,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 10:18:57,784][49750] Updated weights for policy 0, policy_version 150611 (0.0036) [2024-04-26 10:19:02,062][49517] Fps is (10 sec: 40960.8, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2467741696. Throughput: 0: 50352.9. Samples: 220701940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:19:02,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 10:19:02,313][49750] Updated weights for policy 0, policy_version 150621 (0.0032) [2024-04-26 10:19:04,241][49750] Updated weights for policy 0, policy_version 150631 (0.0032) [2024-04-26 10:19:07,063][49517] Fps is (10 sec: 47513.8, 60 sec: 49152.0, 300 sec: 50040.6). Total num frames: 2468020224. Throughput: 0: 50295.0. Samples: 220832560. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:19:07,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 10:19:08,903][49750] Updated weights for policy 0, policy_version 150641 (0.0032) [2024-04-26 10:19:10,738][49750] Updated weights for policy 0, policy_version 150651 (0.0031) [2024-04-26 10:19:12,062][49517] Fps is (10 sec: 58982.1, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2468331520. Throughput: 0: 50284.5. Samples: 221134320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:19:12,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 10:19:15,345][49750] Updated weights for policy 0, policy_version 150661 (0.0032) [2024-04-26 10:19:17,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50262.8). Total num frames: 2468560896. Throughput: 0: 50160.8. Samples: 221439480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:19:17,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 10:19:17,393][49750] Updated weights for policy 0, policy_version 150671 (0.0033) [2024-04-26 10:19:21,859][49750] Updated weights for policy 0, policy_version 150681 (0.0037) [2024-04-26 10:19:22,063][49517] Fps is (10 sec: 42597.6, 60 sec: 50244.1, 300 sec: 50040.6). Total num frames: 2468757504. Throughput: 0: 49957.1. Samples: 221590640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:19:22,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:19:23,807][49750] Updated weights for policy 0, policy_version 150691 (0.0031) [2024-04-26 10:19:27,063][49517] Fps is (10 sec: 44236.6, 60 sec: 49152.0, 300 sec: 49985.1). Total num frames: 2469003264. Throughput: 0: 49914.9. Samples: 221883720. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:19:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 10:19:28,493][49750] Updated weights for policy 0, policy_version 150701 (0.0030) [2024-04-26 10:19:29,650][49728] Signal inference workers to stop experience collection... (3250 times) [2024-04-26 10:19:29,650][49728] Signal inference workers to resume experience collection... (3250 times) [2024-04-26 10:19:29,661][49750] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-04-26 10:19:29,672][49750] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-04-26 10:19:30,344][49750] Updated weights for policy 0, policy_version 150711 (0.0025) [2024-04-26 10:19:32,063][49517] Fps is (10 sec: 55706.1, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2469314560. Throughput: 0: 49983.9. Samples: 222176460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:19:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 10:19:34,950][49750] Updated weights for policy 0, policy_version 150721 (0.0029) [2024-04-26 10:19:36,886][49750] Updated weights for policy 0, policy_version 150731 (0.0031) [2024-04-26 10:19:37,062][49517] Fps is (10 sec: 57345.0, 60 sec: 51063.5, 300 sec: 50318.3). Total num frames: 2469576704. Throughput: 0: 50284.2. Samples: 222360760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:19:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 10:19:41,508][49750] Updated weights for policy 0, policy_version 150741 (0.0034) [2024-04-26 10:19:42,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50517.3, 300 sec: 50096.2). Total num frames: 2469773312. Throughput: 0: 50229.5. Samples: 222659800. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 10:19:42,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 10:19:43,403][49750] Updated weights for policy 0, policy_version 150751 (0.0034) [2024-04-26 10:19:47,063][49517] Fps is (10 sec: 42597.6, 60 sec: 49424.9, 300 sec: 49985.1). Total num frames: 2470002688. Throughput: 0: 49974.9. Samples: 222950820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:19:47,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 10:19:48,006][49750] Updated weights for policy 0, policy_version 150761 (0.0033) [2024-04-26 10:19:50,014][49750] Updated weights for policy 0, policy_version 150771 (0.0031) [2024-04-26 10:19:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 49425.2, 300 sec: 50096.2). Total num frames: 2470297600. Throughput: 0: 50201.0. Samples: 223091600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:19:52,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:19:54,634][49750] Updated weights for policy 0, policy_version 150781 (0.0030) [2024-04-26 10:19:56,479][49750] Updated weights for policy 0, policy_version 150791 (0.0031) [2024-04-26 10:19:57,063][49517] Fps is (10 sec: 58982.8, 60 sec: 50790.4, 300 sec: 50373.8). Total num frames: 2470592512. Throughput: 0: 50172.8. Samples: 223392100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:19:57,064][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 10:20:01,162][49750] Updated weights for policy 0, policy_version 150801 (0.0030) [2024-04-26 10:20:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50151.7). Total num frames: 2470789120. Throughput: 0: 50206.8. Samples: 223698780. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:02,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 10:20:02,973][49750] Updated weights for policy 0, policy_version 150811 (0.0037) [2024-04-26 10:20:07,062][49517] Fps is (10 sec: 39322.2, 60 sec: 49425.2, 300 sec: 49929.6). Total num frames: 2470985728. Throughput: 0: 49872.7. Samples: 223834900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:07,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 10:20:07,711][49750] Updated weights for policy 0, policy_version 150821 (0.0031) [2024-04-26 10:20:08,955][49728] Signal inference workers to stop experience collection... (3300 times) [2024-04-26 10:20:08,997][49750] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-04-26 10:20:09,028][49728] Signal inference workers to resume experience collection... (3300 times) [2024-04-26 10:20:09,028][49750] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-04-26 10:20:09,434][49750] Updated weights for policy 0, policy_version 150831 (0.0027) [2024-04-26 10:20:12,063][49517] Fps is (10 sec: 49151.2, 60 sec: 49151.9, 300 sec: 50040.6). Total num frames: 2471280640. Throughput: 0: 49947.6. Samples: 224131360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:12,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 10:20:14,224][49750] Updated weights for policy 0, policy_version 150841 (0.0032) [2024-04-26 10:20:15,989][49750] Updated weights for policy 0, policy_version 150851 (0.0038) [2024-04-26 10:20:17,062][49517] Fps is (10 sec: 60620.4, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2471591936. Throughput: 0: 50062.3. Samples: 224429260. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 10:20:20,846][49750] Updated weights for policy 0, policy_version 150861 (0.0031) [2024-04-26 10:20:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50207.2). Total num frames: 2471804928. Throughput: 0: 49860.4. Samples: 224604480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:22,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 10:20:22,482][49750] Updated weights for policy 0, policy_version 150871 (0.0027) [2024-04-26 10:20:27,063][49517] Fps is (10 sec: 40959.1, 60 sec: 49971.1, 300 sec: 49929.5). Total num frames: 2472001536. Throughput: 0: 49842.0. Samples: 224902700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:27,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 10:20:27,322][49750] Updated weights for policy 0, policy_version 150881 (0.0036) [2024-04-26 10:20:28,973][49750] Updated weights for policy 0, policy_version 150891 (0.0027) [2024-04-26 10:20:32,062][49517] Fps is (10 sec: 44236.7, 60 sec: 48879.0, 300 sec: 49929.5). Total num frames: 2472247296. Throughput: 0: 49954.8. Samples: 225198780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:32,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 10:20:33,717][49750] Updated weights for policy 0, policy_version 150901 (0.0037) [2024-04-26 10:20:35,505][49750] Updated weights for policy 0, policy_version 150911 (0.0026) [2024-04-26 10:20:37,062][49517] Fps is (10 sec: 57345.1, 60 sec: 49971.2, 300 sec: 50207.3). Total num frames: 2472574976. Throughput: 0: 50151.6. Samples: 225348420. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:37,063][49517] Avg episode reward: [(0, '0.469')] [2024-04-26 10:20:40,384][49750] Updated weights for policy 0, policy_version 150921 (0.0030) [2024-04-26 10:20:42,063][49517] Fps is (10 sec: 58981.9, 60 sec: 51063.3, 300 sec: 50318.3). Total num frames: 2472837120. Throughput: 0: 50233.7. Samples: 225652620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:42,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 10:20:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000150930_2472837120.pth... [2024-04-26 10:20:42,127][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000150195_2460794880.pth [2024-04-26 10:20:42,301][49750] Updated weights for policy 0, policy_version 150931 (0.0032) [2024-04-26 10:20:46,978][49750] Updated weights for policy 0, policy_version 150941 (0.0032) [2024-04-26 10:20:47,062][49517] Fps is (10 sec: 44237.0, 60 sec: 50244.4, 300 sec: 50040.6). Total num frames: 2473017344. Throughput: 0: 50229.3. Samples: 225959100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 27.0) [2024-04-26 10:20:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 10:20:48,717][49750] Updated weights for policy 0, policy_version 150951 (0.0038) [2024-04-26 10:20:49,478][49728] Signal inference workers to stop experience collection... (3350 times) [2024-04-26 10:20:49,479][49728] Signal inference workers to resume experience collection... (3350 times) [2024-04-26 10:20:49,508][49750] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-04-26 10:20:49,509][49750] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-04-26 10:20:52,062][49517] Fps is (10 sec: 40960.3, 60 sec: 49152.0, 300 sec: 49929.6). Total num frames: 2473246720. Throughput: 0: 49999.4. Samples: 226084880. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:20:52,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 10:20:53,367][49750] Updated weights for policy 0, policy_version 150961 (0.0036) [2024-04-26 10:20:55,596][49750] Updated weights for policy 0, policy_version 150971 (0.0031) [2024-04-26 10:20:57,063][49517] Fps is (10 sec: 54066.3, 60 sec: 49425.0, 300 sec: 50151.7). Total num frames: 2473558016. Throughput: 0: 50124.4. Samples: 226386960. Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:20:57,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 10:20:59,769][49750] Updated weights for policy 0, policy_version 150981 (0.0027) [2024-04-26 10:21:02,063][49517] Fps is (10 sec: 57343.7, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2473820160. Throughput: 0: 50151.0. Samples: 226686060. Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:02,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 10:21:02,178][49750] Updated weights for policy 0, policy_version 150991 (0.0031) [2024-04-26 10:21:06,314][49750] Updated weights for policy 0, policy_version 151001 (0.0033) [2024-04-26 10:21:07,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51063.4, 300 sec: 50151.7). Total num frames: 2474049536. Throughput: 0: 50092.5. Samples: 226858640. Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:07,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 10:21:08,598][49750] Updated weights for policy 0, policy_version 151011 (0.0031) [2024-04-26 10:21:12,062][49517] Fps is (10 sec: 44237.5, 60 sec: 49698.3, 300 sec: 49929.6). Total num frames: 2474262528. Throughput: 0: 50007.4. Samples: 227153020. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:12,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 10:21:12,786][49750] Updated weights for policy 0, policy_version 151021 (0.0029) [2024-04-26 10:21:14,962][49750] Updated weights for policy 0, policy_version 151031 (0.0037) [2024-04-26 10:21:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49151.9, 300 sec: 50040.6). Total num frames: 2474541056. Throughput: 0: 50056.0. Samples: 227451300. Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:17,072][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 10:21:19,242][49750] Updated weights for policy 0, policy_version 151041 (0.0030) [2024-04-26 10:21:21,567][49750] Updated weights for policy 0, policy_version 151051 (0.0036) [2024-04-26 10:21:22,063][49517] Fps is (10 sec: 57342.7, 60 sec: 50517.2, 300 sec: 50262.7). Total num frames: 2474835968. Throughput: 0: 50130.5. Samples: 227604300. Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:22,071][49517] Avg episode reward: [(0, '0.460')] [2024-04-26 10:21:25,666][49750] Updated weights for policy 0, policy_version 151061 (0.0033) [2024-04-26 10:21:27,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.8, 300 sec: 50262.8). Total num frames: 2475081728. Throughput: 0: 50233.6. Samples: 227913120. Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:27,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 10:21:28,111][49750] Updated weights for policy 0, policy_version 151071 (0.0034) [2024-04-26 10:21:32,063][49517] Fps is (10 sec: 45875.5, 60 sec: 50790.4, 300 sec: 50040.6). Total num frames: 2475294720. Throughput: 0: 50211.0. Samples: 228218600. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 10:21:32,193][49750] Updated weights for policy 0, policy_version 151081 (0.0031) [2024-04-26 10:21:34,533][49750] Updated weights for policy 0, policy_version 151091 (0.0034) [2024-04-26 10:21:37,062][49517] Fps is (10 sec: 42598.0, 60 sec: 48878.9, 300 sec: 49929.6). Total num frames: 2475507712. Throughput: 0: 50318.7. Samples: 228349220. Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:37,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 10:21:38,179][49728] Signal inference workers to stop experience collection... (3400 times) [2024-04-26 10:21:38,215][49750] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-04-26 10:21:38,237][49728] Signal inference workers to resume experience collection... (3400 times) [2024-04-26 10:21:38,238][49750] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-04-26 10:21:38,849][49750] Updated weights for policy 0, policy_version 151101 (0.0029) [2024-04-26 10:21:41,103][49750] Updated weights for policy 0, policy_version 151111 (0.0030) [2024-04-26 10:21:42,063][49517] Fps is (10 sec: 54067.3, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2475835392. Throughput: 0: 50235.2. Samples: 228647540. Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:42,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 10:21:45,397][49750] Updated weights for policy 0, policy_version 151121 (0.0028) [2024-04-26 10:21:47,063][49517] Fps is (10 sec: 57343.2, 60 sec: 51063.3, 300 sec: 50318.3). Total num frames: 2476081152. Throughput: 0: 50397.3. Samples: 228953940. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:47,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 10:21:47,766][49750] Updated weights for policy 0, policy_version 151131 (0.0034) [2024-04-26 10:21:51,860][49750] Updated weights for policy 0, policy_version 151141 (0.0036) [2024-04-26 10:21:52,063][49517] Fps is (10 sec: 47513.6, 60 sec: 51063.4, 300 sec: 50151.7). Total num frames: 2476310528. Throughput: 0: 50175.9. Samples: 229116560. Policy #0 lag: (min: 1.0, avg: 12.9, max: 20.0) [2024-04-26 10:21:52,063][49517] Avg episode reward: [(0, '0.446')] [2024-04-26 10:21:54,331][49750] Updated weights for policy 0, policy_version 151151 (0.0041) [2024-04-26 10:21:57,063][49517] Fps is (10 sec: 45875.4, 60 sec: 49698.2, 300 sec: 49985.1). Total num frames: 2476539904. Throughput: 0: 50114.1. Samples: 229408160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:21:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 10:21:58,367][49750] Updated weights for policy 0, policy_version 151161 (0.0037) [2024-04-26 10:22:00,989][49750] Updated weights for policy 0, policy_version 151171 (0.0032) [2024-04-26 10:22:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 49971.1, 300 sec: 50096.1). Total num frames: 2476818432. Throughput: 0: 50124.3. Samples: 229706900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 10:22:04,871][49750] Updated weights for policy 0, policy_version 151181 (0.0037) [2024-04-26 10:22:07,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2477080576. Throughput: 0: 50249.0. Samples: 229865500. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:07,072][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 10:22:07,439][49750] Updated weights for policy 0, policy_version 151191 (0.0032) [2024-04-26 10:22:11,283][49750] Updated weights for policy 0, policy_version 151201 (0.0038) [2024-04-26 10:22:12,063][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.3, 300 sec: 50262.8). Total num frames: 2477326336. Throughput: 0: 50182.5. Samples: 230171340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:12,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 10:22:14,038][49750] Updated weights for policy 0, policy_version 151211 (0.0027) [2024-04-26 10:22:17,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.3, 300 sec: 49929.6). Total num frames: 2477539328. Throughput: 0: 50046.3. Samples: 230470680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:17,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 10:22:17,791][49750] Updated weights for policy 0, policy_version 151221 (0.0029) [2024-04-26 10:22:20,535][49750] Updated weights for policy 0, policy_version 151231 (0.0029) [2024-04-26 10:22:22,063][49517] Fps is (10 sec: 45875.2, 60 sec: 49152.0, 300 sec: 49985.1). Total num frames: 2477785088. Throughput: 0: 50223.4. Samples: 230609280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:22,064][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 10:22:24,233][49750] Updated weights for policy 0, policy_version 151241 (0.0034) [2024-04-26 10:22:25,152][49728] Signal inference workers to stop experience collection... (3450 times) [2024-04-26 10:22:25,191][49750] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-04-26 10:22:25,256][49728] Signal inference workers to resume experience collection... 
(3450 times) [2024-04-26 10:22:25,256][49750] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-04-26 10:22:26,977][49750] Updated weights for policy 0, policy_version 151251 (0.0029) [2024-04-26 10:22:27,063][49517] Fps is (10 sec: 55704.8, 60 sec: 50244.1, 300 sec: 50262.8). Total num frames: 2478096384. Throughput: 0: 50237.3. Samples: 230908220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:27,064][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 10:22:30,710][49750] Updated weights for policy 0, policy_version 151261 (0.0038) [2024-04-26 10:22:32,062][49517] Fps is (10 sec: 55706.5, 60 sec: 50790.5, 300 sec: 50318.3). Total num frames: 2478342144. Throughput: 0: 50210.0. Samples: 231213380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:32,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 10:22:33,431][49750] Updated weights for policy 0, policy_version 151271 (0.0029) [2024-04-26 10:22:37,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50790.4, 300 sec: 50040.6). Total num frames: 2478555136. Throughput: 0: 50088.1. Samples: 231370520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 10:22:37,379][49750] Updated weights for policy 0, policy_version 151281 (0.0033) [2024-04-26 10:22:40,091][49750] Updated weights for policy 0, policy_version 151291 (0.0033) [2024-04-26 10:22:42,062][49517] Fps is (10 sec: 45874.8, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2478800896. Throughput: 0: 50199.6. Samples: 231667140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:42,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 10:22:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000151294_2478800896.pth... 
[2024-04-26 10:22:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000150559_2466758656.pth [2024-04-26 10:22:43,961][49750] Updated weights for policy 0, policy_version 151301 (0.0031) [2024-04-26 10:22:46,768][49750] Updated weights for policy 0, policy_version 151311 (0.0034) [2024-04-26 10:22:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2479079424. Throughput: 0: 50174.0. Samples: 231964720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:47,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 10:22:50,535][49750] Updated weights for policy 0, policy_version 151321 (0.0028) [2024-04-26 10:22:52,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2479325184. Throughput: 0: 50216.4. Samples: 232125240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:52,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 10:22:53,195][49750] Updated weights for policy 0, policy_version 151331 (0.0034) [2024-04-26 10:22:56,858][49750] Updated weights for policy 0, policy_version 151341 (0.0040) [2024-04-26 10:22:57,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2479570944. Throughput: 0: 50195.1. Samples: 232430120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-04-26 10:22:57,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 10:22:59,963][49750] Updated weights for policy 0, policy_version 151351 (0.0028) [2024-04-26 10:23:02,062][49517] Fps is (10 sec: 49153.2, 60 sec: 49971.4, 300 sec: 49985.1). Total num frames: 2479816704. Throughput: 0: 50224.1. Samples: 232730760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:23:03,322][49750] Updated weights for policy 0, policy_version 151361 (0.0032) [2024-04-26 10:23:06,917][49750] Updated weights for policy 0, policy_version 151371 (0.0040) [2024-04-26 10:23:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2480062464. Throughput: 0: 50272.2. Samples: 232871520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:07,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 10:23:09,801][49750] Updated weights for policy 0, policy_version 151381 (0.0031) [2024-04-26 10:23:12,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2480340992. Throughput: 0: 50377.9. Samples: 233175220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:12,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 10:23:13,473][49750] Updated weights for policy 0, policy_version 151391 (0.0035) [2024-04-26 10:23:16,294][49750] Updated weights for policy 0, policy_version 151401 (0.0035) [2024-04-26 10:23:17,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51063.3, 300 sec: 50373.8). Total num frames: 2480603136. Throughput: 0: 50294.4. Samples: 233476640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:17,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 10:23:19,985][49750] Updated weights for policy 0, policy_version 151411 (0.0032) [2024-04-26 10:23:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50096.2). Total num frames: 2480832512. Throughput: 0: 50381.4. Samples: 233637680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:22,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 10:23:22,768][49750] Updated weights for policy 0, policy_version 151421 (0.0025) [2024-04-26 10:23:26,534][49750] Updated weights for policy 0, policy_version 151431 (0.0029) [2024-04-26 10:23:27,062][49517] Fps is (10 sec: 44237.9, 60 sec: 49152.2, 300 sec: 49929.6). Total num frames: 2481045504. Throughput: 0: 50318.8. Samples: 233931480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:27,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 10:23:29,262][49750] Updated weights for policy 0, policy_version 151441 (0.0031) [2024-04-26 10:23:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2481340416. Throughput: 0: 50340.0. Samples: 234230020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 10:23:32,987][49750] Updated weights for policy 0, policy_version 151451 (0.0032) [2024-04-26 10:23:35,377][49728] Signal inference workers to stop experience collection... (3500 times) [2024-04-26 10:23:35,378][49728] Signal inference workers to resume experience collection... (3500 times) [2024-04-26 10:23:35,393][49750] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-04-26 10:23:35,393][49750] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-04-26 10:23:35,822][49750] Updated weights for policy 0, policy_version 151461 (0.0034) [2024-04-26 10:23:37,062][49517] Fps is (10 sec: 55705.8, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2481602560. Throughput: 0: 50292.3. Samples: 234388380. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 10:23:39,449][49750] Updated weights for policy 0, policy_version 151471 (0.0032) [2024-04-26 10:23:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2481831936. Throughput: 0: 50281.5. Samples: 234692780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:42,063][49517] Avg episode reward: [(0, '0.456')] [2024-04-26 10:23:42,461][49750] Updated weights for policy 0, policy_version 151481 (0.0032) [2024-04-26 10:23:45,804][49750] Updated weights for policy 0, policy_version 151491 (0.0033) [2024-04-26 10:23:47,063][49517] Fps is (10 sec: 45874.2, 60 sec: 49698.0, 300 sec: 49929.6). Total num frames: 2482061312. Throughput: 0: 50305.6. Samples: 234994520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:47,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 10:23:48,894][49750] Updated weights for policy 0, policy_version 151501 (0.0030) [2024-04-26 10:23:52,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2482339840. Throughput: 0: 50208.8. Samples: 235130920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:52,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 10:23:52,299][49750] Updated weights for policy 0, policy_version 151511 (0.0035) [2024-04-26 10:23:55,333][49750] Updated weights for policy 0, policy_version 151521 (0.0030) [2024-04-26 10:23:57,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2482601984. Throughput: 0: 50243.9. Samples: 235436200. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 10:23:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 10:23:58,745][49750] Updated weights for policy 0, policy_version 151531 (0.0033) [2024-04-26 10:24:01,959][49750] Updated weights for policy 0, policy_version 151541 (0.0030) [2024-04-26 10:24:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2482847744. Throughput: 0: 50339.8. Samples: 235741920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:02,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 10:24:05,566][49750] Updated weights for policy 0, policy_version 151551 (0.0023) [2024-04-26 10:24:07,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 49985.1). Total num frames: 2483077120. Throughput: 0: 50128.8. Samples: 235893480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 10:24:08,514][49750] Updated weights for policy 0, policy_version 151561 (0.0035) [2024-04-26 10:24:12,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49425.1, 300 sec: 49985.1). Total num frames: 2483306496. Throughput: 0: 50077.8. Samples: 236184980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 10:24:12,308][49750] Updated weights for policy 0, policy_version 151571 (0.0037) [2024-04-26 10:24:14,996][49750] Updated weights for policy 0, policy_version 151581 (0.0036) [2024-04-26 10:24:17,063][49517] Fps is (10 sec: 52428.6, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2483601408. Throughput: 0: 50075.5. Samples: 236483420. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:17,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 10:24:18,948][49750] Updated weights for policy 0, policy_version 151591 (0.0031) [2024-04-26 10:24:21,516][49750] Updated weights for policy 0, policy_version 151601 (0.0036) [2024-04-26 10:24:22,063][49517] Fps is (10 sec: 54065.8, 60 sec: 50244.0, 300 sec: 50318.3). Total num frames: 2483847168. Throughput: 0: 50137.9. Samples: 236644600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 10:24:25,404][49750] Updated weights for policy 0, policy_version 151611 (0.0030) [2024-04-26 10:24:27,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.3, 300 sec: 50040.6). Total num frames: 2484076544. Throughput: 0: 49990.6. Samples: 236942360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:27,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 10:24:27,976][49750] Updated weights for policy 0, policy_version 151621 (0.0031) [2024-04-26 10:24:31,783][49750] Updated weights for policy 0, policy_version 151631 (0.0028) [2024-04-26 10:24:32,063][49517] Fps is (10 sec: 47514.3, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2484322304. Throughput: 0: 50005.4. Samples: 237244760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:32,063][49517] Avg episode reward: [(0, '0.403')] [2024-04-26 10:24:34,480][49728] Signal inference workers to stop experience collection... (3550 times) [2024-04-26 10:24:34,505][49750] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-04-26 10:24:34,596][49728] Signal inference workers to resume experience collection... 
(3550 times) [2024-04-26 10:24:34,596][49750] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-04-26 10:24:34,749][49750] Updated weights for policy 0, policy_version 151641 (0.0027) [2024-04-26 10:24:37,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49698.0, 300 sec: 50207.2). Total num frames: 2484584448. Throughput: 0: 50249.4. Samples: 237392140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 10:24:38,286][49750] Updated weights for policy 0, policy_version 151651 (0.0026) [2024-04-26 10:24:41,149][49750] Updated weights for policy 0, policy_version 151661 (0.0032) [2024-04-26 10:24:42,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50517.1, 300 sec: 50373.8). Total num frames: 2484862976. Throughput: 0: 50159.0. Samples: 237693360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:42,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 10:24:42,200][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000151665_2484879360.pth... [2024-04-26 10:24:42,251][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000150930_2472837120.pth [2024-04-26 10:24:44,841][49750] Updated weights for policy 0, policy_version 151671 (0.0035) [2024-04-26 10:24:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2485075968. Throughput: 0: 49985.4. Samples: 237991260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:47,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 10:24:47,882][49750] Updated weights for policy 0, policy_version 151681 (0.0033) [2024-04-26 10:24:51,317][49750] Updated weights for policy 0, policy_version 151691 (0.0031) [2024-04-26 10:24:52,062][49517] Fps is (10 sec: 45876.4, 60 sec: 49698.2, 300 sec: 49929.6). Total num frames: 2485321728. Throughput: 0: 49909.4. Samples: 238139400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 10:24:54,371][49750] Updated weights for policy 0, policy_version 151701 (0.0039) [2024-04-26 10:24:57,063][49517] Fps is (10 sec: 52427.8, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2485600256. Throughput: 0: 50018.5. Samples: 238435820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:24:57,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 10:24:57,745][49750] Updated weights for policy 0, policy_version 151711 (0.0033) [2024-04-26 10:25:00,946][49750] Updated weights for policy 0, policy_version 151721 (0.0032) [2024-04-26 10:25:02,062][49517] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 2485846016. Throughput: 0: 50055.2. Samples: 238735900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:25:02,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 10:25:04,461][49750] Updated weights for policy 0, policy_version 151731 (0.0037) [2024-04-26 10:25:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2486091776. Throughput: 0: 49926.5. Samples: 238891280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:07,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 10:25:07,354][49750] Updated weights for policy 0, policy_version 151741 (0.0030) [2024-04-26 10:25:10,836][49750] Updated weights for policy 0, policy_version 151751 (0.0029) [2024-04-26 10:25:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 49985.1). Total num frames: 2486337536. Throughput: 0: 50040.5. Samples: 239194180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:12,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 10:25:13,687][49750] Updated weights for policy 0, policy_version 151761 (0.0032) [2024-04-26 10:25:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2486583296. Throughput: 0: 50079.2. Samples: 239498320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:17,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 10:25:17,422][49750] Updated weights for policy 0, policy_version 151771 (0.0032) [2024-04-26 10:25:20,188][49750] Updated weights for policy 0, policy_version 151781 (0.0026) [2024-04-26 10:25:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.4, 300 sec: 50318.4). Total num frames: 2486845440. Throughput: 0: 50102.3. Samples: 239646740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:22,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 10:25:23,851][49750] Updated weights for policy 0, policy_version 151791 (0.0029) [2024-04-26 10:25:26,849][49750] Updated weights for policy 0, policy_version 151801 (0.0029) [2024-04-26 10:25:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2487107584. Throughput: 0: 50083.0. Samples: 239947080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:27,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 10:25:30,500][49750] Updated weights for policy 0, policy_version 151811 (0.0029) [2024-04-26 10:25:31,595][49728] Signal inference workers to stop experience collection... (3600 times) [2024-04-26 10:25:31,596][49728] Signal inference workers to resume experience collection... 
(3600 times) [2024-04-26 10:25:31,611][49750] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-04-26 10:25:31,611][49750] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-04-26 10:25:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.4, 300 sec: 50040.6). Total num frames: 2487336960. Throughput: 0: 50294.2. Samples: 240254500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:32,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 10:25:33,535][49750] Updated weights for policy 0, policy_version 151821 (0.0042) [2024-04-26 10:25:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 49985.1). Total num frames: 2487582720. Throughput: 0: 50041.4. Samples: 240391260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:37,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 10:25:37,139][49750] Updated weights for policy 0, policy_version 151831 (0.0029) [2024-04-26 10:25:40,092][49750] Updated weights for policy 0, policy_version 151841 (0.0032) [2024-04-26 10:25:42,063][49517] Fps is (10 sec: 52427.8, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2487861248. Throughput: 0: 50222.7. Samples: 240695840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:42,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 10:25:43,566][49750] Updated weights for policy 0, policy_version 151851 (0.0039) [2024-04-26 10:25:46,705][49750] Updated weights for policy 0, policy_version 151861 (0.0028) [2024-04-26 10:25:47,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2488123392. Throughput: 0: 50378.6. Samples: 241002940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 10:25:50,008][49750] Updated weights for policy 0, policy_version 151871 (0.0030) [2024-04-26 10:25:52,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.2, 300 sec: 50096.2). Total num frames: 2488336384. Throughput: 0: 50122.2. Samples: 241146780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 10:25:53,102][49750] Updated weights for policy 0, policy_version 151881 (0.0027) [2024-04-26 10:25:56,697][49750] Updated weights for policy 0, policy_version 151891 (0.0032) [2024-04-26 10:25:57,062][49517] Fps is (10 sec: 45875.9, 60 sec: 49698.3, 300 sec: 50040.7). Total num frames: 2488582144. Throughput: 0: 50172.0. Samples: 241451920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:25:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 10:25:59,581][49750] Updated weights for policy 0, policy_version 151901 (0.0035) [2024-04-26 10:26:02,063][49517] Fps is (10 sec: 54066.9, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2488877056. Throughput: 0: 50157.7. Samples: 241755420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:26:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 10:26:03,162][49750] Updated weights for policy 0, policy_version 151911 (0.0031) [2024-04-26 10:26:06,136][49750] Updated weights for policy 0, policy_version 151921 (0.0029) [2024-04-26 10:26:07,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2489106432. Throughput: 0: 50217.7. Samples: 241906540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 10:26:07,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 10:26:09,729][49750] Updated weights for policy 0, policy_version 151931 (0.0034) [2024-04-26 10:26:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50262.8). 
Total num frames: 2489368576. Throughput: 0: 50276.4. Samples: 242209520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:12,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 10:26:12,671][49750] Updated weights for policy 0, policy_version 151941 (0.0027) [2024-04-26 10:26:16,485][49750] Updated weights for policy 0, policy_version 151951 (0.0031) [2024-04-26 10:26:17,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2489597952. Throughput: 0: 50238.0. Samples: 242515220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:17,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 10:26:19,070][49750] Updated weights for policy 0, policy_version 151961 (0.0024) [2024-04-26 10:26:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2489860096. Throughput: 0: 50250.7. Samples: 242652540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:22,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 10:26:23,039][49750] Updated weights for policy 0, policy_version 151971 (0.0038) [2024-04-26 10:26:25,633][49750] Updated weights for policy 0, policy_version 151981 (0.0031) [2024-04-26 10:26:27,063][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2490105856. Throughput: 0: 50281.8. Samples: 242958520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:27,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 10:26:29,539][49750] Updated weights for policy 0, policy_version 151991 (0.0031) [2024-04-26 10:26:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50373.9). Total num frames: 2490368000. Throughput: 0: 50257.4. Samples: 243264520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:32,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 10:26:32,143][49750] Updated weights for policy 0, policy_version 152001 (0.0037) [2024-04-26 10:26:35,985][49750] Updated weights for policy 0, policy_version 152011 (0.0032) [2024-04-26 10:26:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50096.2). Total num frames: 2490613760. Throughput: 0: 50357.7. Samples: 243412880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:37,063][49517] Avg episode reward: [(0, '0.435')] [2024-04-26 10:26:38,598][49750] Updated weights for policy 0, policy_version 152021 (0.0036) [2024-04-26 10:26:42,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 2490843136. Throughput: 0: 50217.9. Samples: 243711740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:42,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 10:26:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000152029_2490843136.pth... [2024-04-26 10:26:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000151294_2478800896.pth [2024-04-26 10:26:42,589][49750] Updated weights for policy 0, policy_version 152031 (0.0035) [2024-04-26 10:26:44,990][49728] Signal inference workers to stop experience collection... (3650 times) [2024-04-26 10:26:44,990][49728] Signal inference workers to resume experience collection... (3650 times) [2024-04-26 10:26:45,003][49750] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-04-26 10:26:45,005][49750] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-04-26 10:26:45,122][49750] Updated weights for policy 0, policy_version 152041 (0.0036) [2024-04-26 10:26:47,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2491138048. Throughput: 0: 50209.8. Samples: 244014860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:47,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 10:26:49,052][49750] Updated weights for policy 0, policy_version 152051 (0.0035) [2024-04-26 10:26:51,794][49750] Updated weights for policy 0, policy_version 152061 (0.0029) [2024-04-26 10:26:52,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2491367424. Throughput: 0: 50332.5. Samples: 244171500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:52,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 10:26:55,451][49750] Updated weights for policy 0, policy_version 152071 (0.0032) [2024-04-26 10:26:57,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.2, 300 sec: 50151.7). Total num frames: 2491613184. Throughput: 0: 50275.9. Samples: 244471940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:26:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 10:26:58,216][49750] Updated weights for policy 0, policy_version 152081 (0.0029) [2024-04-26 10:27:01,888][49750] Updated weights for policy 0, policy_version 152091 (0.0034) [2024-04-26 10:27:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2491858944. Throughput: 0: 50243.7. Samples: 244776180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:27:02,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 10:27:04,581][49750] Updated weights for policy 0, policy_version 152101 (0.0037) [2024-04-26 10:27:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2492121088. Throughput: 0: 50344.8. Samples: 244918060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:27:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:27:08,463][49750] Updated weights for policy 0, policy_version 152111 (0.0029) [2024-04-26 10:27:11,195][49750] Updated weights for policy 0, policy_version 152121 (0.0031) [2024-04-26 10:27:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2492383232. Throughput: 0: 50340.0. Samples: 245223820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 10:27:12,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 10:27:14,980][49750] Updated weights for policy 0, policy_version 152131 (0.0034) [2024-04-26 10:27:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2492612608. Throughput: 0: 50259.7. Samples: 245526200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:27:17,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 10:27:17,852][49750] Updated weights for policy 0, policy_version 152141 (0.0031) [2024-04-26 10:27:21,568][49750] Updated weights for policy 0, policy_version 152151 (0.0036) [2024-04-26 10:27:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2492874752. Throughput: 0: 50212.2. Samples: 245672420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:27:22,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:27:24,263][49750] Updated weights for policy 0, policy_version 152161 (0.0031) [2024-04-26 10:27:27,062][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.2, 300 sec: 50040.6). Total num frames: 2493104128. Throughput: 0: 50274.9. Samples: 245974100. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:27:27,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 10:27:28,155][49750] Updated weights for policy 0, policy_version 152171 (0.0033) [2024-04-26 10:27:30,624][49750] Updated weights for policy 0, policy_version 152181 (0.0031) [2024-04-26 10:27:32,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2493382656. Throughput: 0: 50173.3. Samples: 246272660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:27:32,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 10:27:34,587][49750] Updated weights for policy 0, policy_version 152191 (0.0037) [2024-04-26 10:27:37,062][49517] Fps is (10 sec: 54067.7, 60 sec: 50517.5, 300 sec: 50318.3). Total num frames: 2493644800. Throughput: 0: 50321.3. Samples: 246435960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:27:37,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 10:27:37,125][49750] Updated weights for policy 0, policy_version 152201 (0.0034) [2024-04-26 10:27:40,938][49750] Updated weights for policy 0, policy_version 152211 (0.0036) [2024-04-26 10:27:42,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2493874176. Throughput: 0: 50316.4. Samples: 246736180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:27:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 10:27:43,820][49750] Updated weights for policy 0, policy_version 152221 (0.0036) [2024-04-26 10:27:47,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 2494119936. Throughput: 0: 50229.6. Samples: 247036520. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:27:47,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 10:27:47,482][49750] Updated weights for policy 0, policy_version 152231 (0.0034) [2024-04-26 10:27:50,359][49750] Updated weights for policy 0, policy_version 152241 (0.0030) [2024-04-26 10:27:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.1, 300 sec: 50207.2). Total num frames: 2494382080. Throughput: 0: 50433.2. Samples: 247187560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:27:52,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 10:27:54,002][49750] Updated weights for policy 0, policy_version 152251 (0.0031) [2024-04-26 10:27:55,527][49728] Signal inference workers to stop experience collection... (3700 times) [2024-04-26 10:27:55,527][49728] Signal inference workers to resume experience collection... (3700 times) [2024-04-26 10:27:55,542][49750] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-04-26 10:27:55,542][49750] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-04-26 10:27:56,686][49750] Updated weights for policy 0, policy_version 152261 (0.0034) [2024-04-26 10:27:57,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2494644224. Throughput: 0: 50402.4. Samples: 247491920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:27:57,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 10:28:00,443][49750] Updated weights for policy 0, policy_version 152271 (0.0027) [2024-04-26 10:28:02,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2494873600. Throughput: 0: 50231.1. Samples: 247786600. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:28:02,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 10:28:03,214][49750] Updated weights for policy 0, policy_version 152281 (0.0031) [2024-04-26 10:28:06,840][49750] Updated weights for policy 0, policy_version 152291 (0.0043) [2024-04-26 10:28:07,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2495135744. Throughput: 0: 50311.8. Samples: 247936460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:28:07,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 10:28:09,815][49750] Updated weights for policy 0, policy_version 152301 (0.0034) [2024-04-26 10:28:12,062][49517] Fps is (10 sec: 49151.5, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2495365120. Throughput: 0: 50240.0. Samples: 248234900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:28:12,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 10:28:13,373][49750] Updated weights for policy 0, policy_version 152311 (0.0032) [2024-04-26 10:28:16,255][49750] Updated weights for policy 0, policy_version 152321 (0.0028) [2024-04-26 10:28:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2495643648. Throughput: 0: 50230.3. Samples: 248533020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 10:28:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 10:28:20,029][49750] Updated weights for policy 0, policy_version 152331 (0.0028) [2024-04-26 10:28:22,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2495905792. Throughput: 0: 50178.7. Samples: 248694000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:28:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 10:28:22,628][49750] Updated weights for policy 0, policy_version 152341 (0.0037) [2024-04-26 10:28:26,464][49750] Updated weights for policy 0, policy_version 152351 (0.0031) [2024-04-26 10:28:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2496135168. Throughput: 0: 50288.9. Samples: 248999180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:28:27,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 10:28:29,094][49750] Updated weights for policy 0, policy_version 152361 (0.0030) [2024-04-26 10:28:32,063][49517] Fps is (10 sec: 47513.0, 60 sec: 49971.2, 300 sec: 50096.1). Total num frames: 2496380928. Throughput: 0: 50381.4. Samples: 249303680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:28:32,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 10:28:33,030][49750] Updated weights for policy 0, policy_version 152371 (0.0030) [2024-04-26 10:28:35,649][49750] Updated weights for policy 0, policy_version 152381 (0.0036) [2024-04-26 10:28:37,062][49517] Fps is (10 sec: 50791.2, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2496643072. Throughput: 0: 50166.4. Samples: 249445040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:28:37,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 10:28:39,527][49750] Updated weights for policy 0, policy_version 152391 (0.0043) [2024-04-26 10:28:42,063][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2496921600. Throughput: 0: 50172.8. Samples: 249749700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:28:42,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 10:28:42,120][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000152401_2496937984.pth... 
[2024-04-26 10:28:42,124][49750] Updated weights for policy 0, policy_version 152401 (0.0030) [2024-04-26 10:28:42,169][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000151665_2484879360.pth [2024-04-26 10:28:46,069][49750] Updated weights for policy 0, policy_version 152411 (0.0033) [2024-04-26 10:28:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2497134592. Throughput: 0: 50199.8. Samples: 250045600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:28:47,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 10:28:48,662][49750] Updated weights for policy 0, policy_version 152421 (0.0031) [2024-04-26 10:28:52,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2497380352. Throughput: 0: 50088.6. Samples: 250190440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:28:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 10:28:52,553][49750] Updated weights for policy 0, policy_version 152431 (0.0027) [2024-04-26 10:28:55,211][49750] Updated weights for policy 0, policy_version 152441 (0.0034) [2024-04-26 10:28:57,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 2497626112. Throughput: 0: 50037.4. Samples: 250486580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:28:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:28:59,210][49750] Updated weights for policy 0, policy_version 152451 (0.0030) [2024-04-26 10:29:01,340][49728] Signal inference workers to stop experience collection... (3750 times) [2024-04-26 10:29:01,341][49728] Signal inference workers to resume experience collection... 
(3750 times) [2024-04-26 10:29:01,365][49750] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-04-26 10:29:01,365][49750] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-04-26 10:29:01,757][49750] Updated weights for policy 0, policy_version 152461 (0.0027) [2024-04-26 10:29:02,063][49517] Fps is (10 sec: 54065.9, 60 sec: 50790.1, 300 sec: 50318.3). Total num frames: 2497921024. Throughput: 0: 50127.7. Samples: 250788780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:29:02,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 10:29:05,840][49750] Updated weights for policy 0, policy_version 152471 (0.0032) [2024-04-26 10:29:07,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2498134016. Throughput: 0: 50147.1. Samples: 250950620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:29:07,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 10:29:08,255][49750] Updated weights for policy 0, policy_version 152481 (0.0035) [2024-04-26 10:29:12,063][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.1, 300 sec: 50096.1). Total num frames: 2498379776. Throughput: 0: 50043.0. Samples: 251251120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:29:12,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 10:29:12,419][49750] Updated weights for policy 0, policy_version 152491 (0.0034) [2024-04-26 10:29:14,882][49750] Updated weights for policy 0, policy_version 152501 (0.0036) [2024-04-26 10:29:17,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49698.1, 300 sec: 50096.2). Total num frames: 2498625536. Throughput: 0: 49942.2. Samples: 251551080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 10:29:17,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 10:29:19,054][49750] Updated weights for policy 0, policy_version 152511 (0.0028) [2024-04-26 10:29:21,310][49750] Updated weights for policy 0, policy_version 152521 (0.0035) [2024-04-26 10:29:22,063][49517] Fps is (10 sec: 54067.8, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2498920448. Throughput: 0: 50047.0. Samples: 251697160. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:29:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 10:29:25,525][49750] Updated weights for policy 0, policy_version 152531 (0.0034) [2024-04-26 10:29:27,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2499149824. Throughput: 0: 49984.1. Samples: 251998980. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:29:27,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 10:29:27,774][49750] Updated weights for policy 0, policy_version 152541 (0.0030) [2024-04-26 10:29:32,056][49750] Updated weights for policy 0, policy_version 152551 (0.0033) [2024-04-26 10:29:32,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.4, 300 sec: 50207.3). Total num frames: 2499395584. Throughput: 0: 50100.6. Samples: 252300120. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:29:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 10:29:34,453][49750] Updated weights for policy 0, policy_version 152561 (0.0033) [2024-04-26 10:29:37,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49698.0, 300 sec: 50040.6). Total num frames: 2499624960. Throughput: 0: 49807.9. Samples: 252431800. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:29:37,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 10:29:38,741][49750] Updated weights for policy 0, policy_version 152571 (0.0031) [2024-04-26 10:29:40,965][49750] Updated weights for policy 0, policy_version 152581 (0.0028) [2024-04-26 10:29:42,063][49517] Fps is (10 sec: 50789.4, 60 sec: 49698.1, 300 sec: 50262.7). Total num frames: 2499903488. Throughput: 0: 49984.3. Samples: 252735880. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:29:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 10:29:45,413][49750] Updated weights for policy 0, policy_version 152591 (0.0032) [2024-04-26 10:29:46,907][49728] Signal inference workers to stop experience collection... (3800 times) [2024-04-26 10:29:46,953][49750] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-04-26 10:29:46,972][49728] Signal inference workers to resume experience collection... (3800 times) [2024-04-26 10:29:46,975][49750] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-04-26 10:29:47,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2500165632. Throughput: 0: 50039.3. Samples: 253040540. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:29:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 10:29:47,375][49750] Updated weights for policy 0, policy_version 152601 (0.0028) [2024-04-26 10:29:51,657][49750] Updated weights for policy 0, policy_version 152611 (0.0027) [2024-04-26 10:29:52,062][49517] Fps is (10 sec: 47514.6, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2500378624. Throughput: 0: 49988.5. Samples: 253200100. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:29:52,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 10:29:53,951][49750] Updated weights for policy 0, policy_version 152621 (0.0025) [2024-04-26 10:29:57,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2500640768. Throughput: 0: 49944.3. Samples: 253498600. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:29:57,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 10:29:58,129][49750] Updated weights for policy 0, policy_version 152631 (0.0037) [2024-04-26 10:30:00,762][49750] Updated weights for policy 0, policy_version 152641 (0.0026) [2024-04-26 10:30:02,063][49517] Fps is (10 sec: 50789.3, 60 sec: 49425.1, 300 sec: 50151.7). Total num frames: 2500886528. Throughput: 0: 49843.9. Samples: 253794060. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:30:02,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 10:30:04,594][49750] Updated weights for policy 0, policy_version 152651 (0.0030) [2024-04-26 10:30:07,063][49517] Fps is (10 sec: 54064.6, 60 sec: 50790.0, 300 sec: 50318.2). Total num frames: 2501181440. Throughput: 0: 50086.7. Samples: 253951080. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:30:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 10:30:07,455][49750] Updated weights for policy 0, policy_version 152661 (0.0035) [2024-04-26 10:30:11,176][49750] Updated weights for policy 0, policy_version 152671 (0.0034) [2024-04-26 10:30:12,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2501410816. Throughput: 0: 50158.6. Samples: 254256120. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:30:12,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:30:13,891][49750] Updated weights for policy 0, policy_version 152681 (0.0027) [2024-04-26 10:30:17,062][49517] Fps is (10 sec: 45877.1, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2501640192. Throughput: 0: 50087.1. Samples: 254554040. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:30:17,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 10:30:17,825][49750] Updated weights for policy 0, policy_version 152691 (0.0030) [2024-04-26 10:30:20,346][49750] Updated weights for policy 0, policy_version 152701 (0.0031) [2024-04-26 10:30:22,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49425.1, 300 sec: 50096.1). Total num frames: 2501885952. Throughput: 0: 50320.0. Samples: 254696200. Policy #0 lag: (min: 0.0, avg: 13.4, max: 25.0) [2024-04-26 10:30:22,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 10:30:24,182][49750] Updated weights for policy 0, policy_version 152711 (0.0036) [2024-04-26 10:30:26,961][49750] Updated weights for policy 0, policy_version 152721 (0.0030) [2024-04-26 10:30:27,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2502180864. Throughput: 0: 50377.1. Samples: 255002840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:30:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 10:30:30,555][49750] Updated weights for policy 0, policy_version 152731 (0.0039) [2024-04-26 10:30:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2502393856. Throughput: 0: 50397.4. Samples: 255308420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:30:32,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:30:33,570][49750] Updated weights for policy 0, policy_version 152741 (0.0042) [2024-04-26 10:30:37,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2502656000. Throughput: 0: 50067.0. Samples: 255453120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:30:37,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 10:30:37,068][49750] Updated weights for policy 0, policy_version 152751 (0.0031) [2024-04-26 10:30:40,155][49750] Updated weights for policy 0, policy_version 152761 (0.0034) [2024-04-26 10:30:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 49698.2, 300 sec: 50040.6). Total num frames: 2502885376. Throughput: 0: 50070.9. Samples: 255751800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:30:42,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:30:42,069][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000152764_2502885376.pth... [2024-04-26 10:30:42,113][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000152029_2490843136.pth [2024-04-26 10:30:43,574][49750] Updated weights for policy 0, policy_version 152771 (0.0032) [2024-04-26 10:30:46,895][49750] Updated weights for policy 0, policy_version 152781 (0.0027) [2024-04-26 10:30:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2503163904. Throughput: 0: 50181.9. Samples: 256052240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:30:47,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 10:30:48,789][49728] Signal inference workers to stop experience collection... (3850 times) [2024-04-26 10:30:48,843][49750] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-04-26 10:30:48,853][49728] Signal inference workers to resume experience collection... 
(3850 times) [2024-04-26 10:30:48,858][49750] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-04-26 10:30:49,991][49750] Updated weights for policy 0, policy_version 152791 (0.0035) [2024-04-26 10:30:52,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.2, 300 sec: 50318.3). Total num frames: 2503426048. Throughput: 0: 50225.6. Samples: 256211220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:30:52,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 10:30:53,434][49750] Updated weights for policy 0, policy_version 152801 (0.0025) [2024-04-26 10:30:56,478][49750] Updated weights for policy 0, policy_version 152811 (0.0034) [2024-04-26 10:30:57,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.0, 300 sec: 50096.1). Total num frames: 2503655424. Throughput: 0: 50115.3. Samples: 256511320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:30:57,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 10:30:59,976][49750] Updated weights for policy 0, policy_version 152821 (0.0032) [2024-04-26 10:31:02,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50244.5, 300 sec: 50151.7). Total num frames: 2503901184. Throughput: 0: 50299.2. Samples: 256817500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:31:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 10:31:03,023][49750] Updated weights for policy 0, policy_version 152831 (0.0036) [2024-04-26 10:31:06,323][49750] Updated weights for policy 0, policy_version 152841 (0.0038) [2024-04-26 10:31:07,063][49517] Fps is (10 sec: 50791.2, 60 sec: 49698.4, 300 sec: 50151.7). Total num frames: 2504163328. Throughput: 0: 50417.3. Samples: 256964980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:31:07,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 10:31:09,521][49750] Updated weights for policy 0, policy_version 152851 (0.0029) [2024-04-26 10:31:12,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2504425472. Throughput: 0: 50237.6. Samples: 257263540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:31:12,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 10:31:12,847][49750] Updated weights for policy 0, policy_version 152861 (0.0030) [2024-04-26 10:31:16,225][49750] Updated weights for policy 0, policy_version 152871 (0.0032) [2024-04-26 10:31:17,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2504654848. Throughput: 0: 50104.3. Samples: 257563120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:31:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 10:31:19,407][49750] Updated weights for policy 0, policy_version 152881 (0.0026) [2024-04-26 10:31:22,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2504916992. Throughput: 0: 50151.9. Samples: 257709960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:31:22,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 10:31:22,608][49750] Updated weights for policy 0, policy_version 152891 (0.0026) [2024-04-26 10:31:25,955][49750] Updated weights for policy 0, policy_version 152901 (0.0034) [2024-04-26 10:31:27,062][49517] Fps is (10 sec: 50791.0, 60 sec: 49698.2, 300 sec: 50151.7). Total num frames: 2505162752. Throughput: 0: 50326.0. Samples: 258016460. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:31:27,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 10:31:29,232][49750] Updated weights for policy 0, policy_version 152911 (0.0034) [2024-04-26 10:31:32,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50207.3). Total num frames: 2505424896. Throughput: 0: 50247.7. Samples: 258313380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:31:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 10:31:32,659][49750] Updated weights for policy 0, policy_version 152921 (0.0030) [2024-04-26 10:31:34,692][49728] Signal inference workers to stop experience collection... (3900 times) [2024-04-26 10:31:34,692][49728] Signal inference workers to resume experience collection... (3900 times) [2024-04-26 10:31:34,716][49750] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-04-26 10:31:34,716][49750] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-04-26 10:31:35,781][49750] Updated weights for policy 0, policy_version 152931 (0.0033) [2024-04-26 10:31:37,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2505670656. Throughput: 0: 50302.4. Samples: 258474820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:31:37,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 10:31:39,022][49750] Updated weights for policy 0, policy_version 152941 (0.0028) [2024-04-26 10:31:42,063][49517] Fps is (10 sec: 50788.9, 60 sec: 50790.3, 300 sec: 50151.7). Total num frames: 2505932800. Throughput: 0: 50339.6. Samples: 258776600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:31:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 10:31:42,178][49750] Updated weights for policy 0, policy_version 152951 (0.0034) [2024-04-26 10:31:45,671][49750] Updated weights for policy 0, policy_version 152961 (0.0032) [2024-04-26 10:31:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2506162176. Throughput: 0: 50343.1. Samples: 259082940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:31:47,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 10:31:48,786][49750] Updated weights for policy 0, policy_version 152971 (0.0028) [2024-04-26 10:31:52,059][49750] Updated weights for policy 0, policy_version 152981 (0.0033) [2024-04-26 10:31:52,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2506440704. Throughput: 0: 50273.4. Samples: 259227280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:31:52,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 10:31:55,138][49750] Updated weights for policy 0, policy_version 152991 (0.0031) [2024-04-26 10:31:57,063][49517] Fps is (10 sec: 50788.1, 60 sec: 50244.1, 300 sec: 50207.2). Total num frames: 2506670080. Throughput: 0: 50317.0. Samples: 259527820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:31:57,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 10:31:58,759][49750] Updated weights for policy 0, policy_version 153001 (0.0031) [2024-04-26 10:32:01,668][49750] Updated weights for policy 0, policy_version 153011 (0.0037) [2024-04-26 10:32:02,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 2506932224. Throughput: 0: 50262.1. Samples: 259824920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:32:02,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:32:05,266][49750] Updated weights for policy 0, policy_version 153021 (0.0033) [2024-04-26 10:32:07,062][49517] Fps is (10 sec: 52431.7, 60 sec: 50517.5, 300 sec: 50207.3). Total num frames: 2507194368. Throughput: 0: 50416.2. Samples: 259978680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:32:07,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 10:32:08,197][49750] Updated weights for policy 0, policy_version 153031 (0.0028) [2024-04-26 10:32:11,786][49750] Updated weights for policy 0, policy_version 153041 (0.0033) [2024-04-26 10:32:12,063][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2507423744. Throughput: 0: 50339.8. Samples: 260281760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:32:12,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 10:32:14,783][49750] Updated weights for policy 0, policy_version 153051 (0.0034) [2024-04-26 10:32:17,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.4, 300 sec: 50207.2). Total num frames: 2507685888. Throughput: 0: 50442.6. Samples: 260583300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:32:17,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 10:32:18,429][49750] Updated weights for policy 0, policy_version 153061 (0.0030) [2024-04-26 10:32:21,351][49750] Updated weights for policy 0, policy_version 153071 (0.0031) [2024-04-26 10:32:22,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2507915264. Throughput: 0: 50307.5. Samples: 260738660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:32:22,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 10:32:25,108][49750] Updated weights for policy 0, policy_version 153081 (0.0029) [2024-04-26 10:32:27,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2508210176. Throughput: 0: 50279.8. Samples: 261039180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:32:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 10:32:27,713][49750] Updated weights for policy 0, policy_version 153091 (0.0035) [2024-04-26 10:32:31,662][49750] Updated weights for policy 0, policy_version 153101 (0.0030) [2024-04-26 10:32:32,063][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.1, 300 sec: 50096.1). Total num frames: 2508423168. Throughput: 0: 50119.4. Samples: 261338320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 10:32:32,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 10:32:34,127][49750] Updated weights for policy 0, policy_version 153111 (0.0029) [2024-04-26 10:32:37,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2508668928. Throughput: 0: 50127.6. Samples: 261483020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:32:37,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 10:32:38,206][49750] Updated weights for policy 0, policy_version 153121 (0.0027) [2024-04-26 10:32:38,533][49728] Signal inference workers to stop experience collection... (3950 times) [2024-04-26 10:32:38,534][49728] Signal inference workers to resume experience collection... 
(3950 times) [2024-04-26 10:32:38,562][49750] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-04-26 10:32:38,562][49750] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-04-26 10:32:40,729][49750] Updated weights for policy 0, policy_version 153131 (0.0032) [2024-04-26 10:32:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 49971.4, 300 sec: 50207.3). Total num frames: 2508931072. Throughput: 0: 50034.3. Samples: 261779340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:32:42,063][49517] Avg episode reward: [(0, '0.452')] [2024-04-26 10:32:42,102][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000153134_2508947456.pth... [2024-04-26 10:32:42,145][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000152401_2496937984.pth [2024-04-26 10:32:44,580][49750] Updated weights for policy 0, policy_version 153141 (0.0035) [2024-04-26 10:32:47,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 2509209600. Throughput: 0: 50160.5. Samples: 262082140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:32:47,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 10:32:47,266][49750] Updated weights for policy 0, policy_version 153151 (0.0033) [2024-04-26 10:32:51,253][49750] Updated weights for policy 0, policy_version 153161 (0.0031) [2024-04-26 10:32:52,063][49517] Fps is (10 sec: 50789.8, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2509438976. Throughput: 0: 50286.9. Samples: 262241600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:32:52,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 10:32:54,297][49750] Updated weights for policy 0, policy_version 153171 (0.0034) [2024-04-26 10:32:57,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.6, 300 sec: 50207.2). Total num frames: 2509684736. Throughput: 0: 50144.9. Samples: 262538280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:32:57,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 10:32:57,650][49750] Updated weights for policy 0, policy_version 153181 (0.0031) [2024-04-26 10:33:00,914][49750] Updated weights for policy 0, policy_version 153191 (0.0026) [2024-04-26 10:33:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.4, 300 sec: 50207.3). Total num frames: 2509946880. Throughput: 0: 50244.5. Samples: 262844300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:33:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 10:33:04,089][49750] Updated weights for policy 0, policy_version 153201 (0.0035) [2024-04-26 10:33:07,063][49517] Fps is (10 sec: 50790.2, 60 sec: 49971.0, 300 sec: 50262.8). Total num frames: 2510192640. Throughput: 0: 50108.8. Samples: 262993560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:33:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:33:07,293][49750] Updated weights for policy 0, policy_version 153211 (0.0038) [2024-04-26 10:33:10,516][49750] Updated weights for policy 0, policy_version 153221 (0.0033) [2024-04-26 10:33:12,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2510438400. Throughput: 0: 50188.4. Samples: 263297660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:33:12,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 10:33:13,800][49750] Updated weights for policy 0, policy_version 153231 (0.0024) [2024-04-26 10:33:16,902][49750] Updated weights for policy 0, policy_version 153241 (0.0038) [2024-04-26 10:33:17,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2510700544. Throughput: 0: 50262.7. Samples: 263600140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:33:17,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:33:20,359][49750] Updated weights for policy 0, policy_version 153251 (0.0035) [2024-04-26 10:33:22,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50207.3). Total num frames: 2510946304. Throughput: 0: 50228.0. Samples: 263743280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:33:22,063][49517] Avg episode reward: [(0, '0.451')] [2024-04-26 10:33:23,417][49750] Updated weights for policy 0, policy_version 153261 (0.0038) [2024-04-26 10:33:25,102][49728] Signal inference workers to stop experience collection... (4000 times) [2024-04-26 10:33:25,151][49750] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-04-26 10:33:25,171][49728] Signal inference workers to resume experience collection... (4000 times) [2024-04-26 10:33:25,172][49750] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-04-26 10:33:27,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49425.1, 300 sec: 50151.7). Total num frames: 2511175680. Throughput: 0: 50298.7. Samples: 264042780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:33:27,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 10:33:27,097][49750] Updated weights for policy 0, policy_version 153271 (0.0032) [2024-04-26 10:33:30,082][49750] Updated weights for policy 0, policy_version 153281 (0.0035) [2024-04-26 10:33:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50262.8). Total num frames: 2511470592. Throughput: 0: 50445.0. Samples: 264352160. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:33:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 10:33:33,503][49750] Updated weights for policy 0, policy_version 153291 (0.0033) [2024-04-26 10:33:36,528][49750] Updated weights for policy 0, policy_version 153301 (0.0033) [2024-04-26 10:33:37,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50040.6). Total num frames: 2511683584. Throughput: 0: 50080.5. Samples: 264495220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 10:33:37,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 10:33:40,015][49750] Updated weights for policy 0, policy_version 153311 (0.0031) [2024-04-26 10:33:42,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2511929344. Throughput: 0: 50256.1. Samples: 264799800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:33:42,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 10:33:42,898][49750] Updated weights for policy 0, policy_version 153321 (0.0029) [2024-04-26 10:33:46,433][49750] Updated weights for policy 0, policy_version 153331 (0.0031) [2024-04-26 10:33:47,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2512224256. Throughput: 0: 50218.2. Samples: 265104120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:33:47,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 10:33:49,525][49750] Updated weights for policy 0, policy_version 153341 (0.0029) [2024-04-26 10:33:52,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2512470016. Throughput: 0: 50238.4. Samples: 265254280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:33:52,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 10:33:52,905][49750] Updated weights for policy 0, policy_version 153351 (0.0029) [2024-04-26 10:33:56,171][49750] Updated weights for policy 0, policy_version 153361 (0.0032) [2024-04-26 10:33:57,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.4, 300 sec: 50096.2). Total num frames: 2512699392. Throughput: 0: 50282.7. Samples: 265560380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:33:57,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 10:33:59,425][49750] Updated weights for policy 0, policy_version 153371 (0.0032) [2024-04-26 10:34:02,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.1, 300 sec: 50262.8). Total num frames: 2512961536. Throughput: 0: 50327.9. Samples: 265864900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:34:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 10:34:02,576][49750] Updated weights for policy 0, policy_version 153381 (0.0028) [2024-04-26 10:34:05,867][49750] Updated weights for policy 0, policy_version 153391 (0.0029) [2024-04-26 10:34:07,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2513223680. Throughput: 0: 50348.4. Samples: 266008960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:34:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 10:34:09,001][49750] Updated weights for policy 0, policy_version 153401 (0.0033) [2024-04-26 10:34:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2513453056. Throughput: 0: 50558.6. Samples: 266317920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:34:12,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 10:34:12,366][49750] Updated weights for policy 0, policy_version 153411 (0.0029) [2024-04-26 10:34:12,557][49728] Signal inference workers to stop experience collection... (4050 times) [2024-04-26 10:34:12,558][49728] Signal inference workers to resume experience collection... (4050 times) [2024-04-26 10:34:12,589][49750] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-04-26 10:34:12,590][49750] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-04-26 10:34:15,433][49750] Updated weights for policy 0, policy_version 153421 (0.0029) [2024-04-26 10:34:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2513715200. Throughput: 0: 50379.6. Samples: 266619240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:34:17,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 10:34:18,752][49750] Updated weights for policy 0, policy_version 153431 (0.0033) [2024-04-26 10:34:22,019][49750] Updated weights for policy 0, policy_version 153441 (0.0033) [2024-04-26 10:34:22,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2513977344. Throughput: 0: 50330.6. Samples: 266760100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:34:22,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 10:34:25,293][49750] Updated weights for policy 0, policy_version 153451 (0.0036) [2024-04-26 10:34:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 2514223104. Throughput: 0: 50308.4. Samples: 267063680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:34:27,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 10:34:28,639][49750] Updated weights for policy 0, policy_version 153461 (0.0029) [2024-04-26 10:34:31,811][49750] Updated weights for policy 0, policy_version 153471 (0.0031) [2024-04-26 10:34:32,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2514485248. Throughput: 0: 50491.0. Samples: 267376220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:34:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 10:34:35,240][49750] Updated weights for policy 0, policy_version 153481 (0.0040) [2024-04-26 10:34:37,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50207.3). Total num frames: 2514714624. Throughput: 0: 50272.5. Samples: 267516540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 10:34:37,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 10:34:38,320][49750] Updated weights for policy 0, policy_version 153491 (0.0028) [2024-04-26 10:34:41,674][49750] Updated weights for policy 0, policy_version 153501 (0.0028) [2024-04-26 10:34:42,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.3, 300 sec: 50207.2). Total num frames: 2514976768. Throughput: 0: 50278.1. Samples: 267822900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:34:42,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 10:34:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000153502_2514976768.pth... [2024-04-26 10:34:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000152764_2502885376.pth [2024-04-26 10:34:44,753][49750] Updated weights for policy 0, policy_version 153511 (0.0031) [2024-04-26 10:34:47,062][49517] Fps is (10 sec: 50789.9, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2515222528. Throughput: 0: 50322.3. Samples: 268129400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:34:47,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 10:34:48,086][49750] Updated weights for policy 0, policy_version 153521 (0.0033) [2024-04-26 10:34:51,145][49750] Updated weights for policy 0, policy_version 153531 (0.0033) [2024-04-26 10:34:52,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.1, 300 sec: 50318.3). Total num frames: 2515484672. Throughput: 0: 50471.0. Samples: 268280160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:34:52,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:34:54,591][49750] Updated weights for policy 0, policy_version 153541 (0.0034) [2024-04-26 10:34:57,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2515714048. Throughput: 0: 50315.9. Samples: 268582140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:34:57,064][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:34:57,761][49750] Updated weights for policy 0, policy_version 153551 (0.0030) [2024-04-26 10:35:00,992][49750] Updated weights for policy 0, policy_version 153561 (0.0032) [2024-04-26 10:35:02,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2515959808. Throughput: 0: 50316.8. Samples: 268883500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:35:02,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 10:35:04,225][49750] Updated weights for policy 0, policy_version 153571 (0.0026) [2024-04-26 10:35:07,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2516238336. Throughput: 0: 50342.0. Samples: 269025480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:35:07,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 10:35:07,558][49750] Updated weights for policy 0, policy_version 153581 (0.0035) [2024-04-26 10:35:10,630][49750] Updated weights for policy 0, policy_version 153591 (0.0030) [2024-04-26 10:35:12,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2516484096. Throughput: 0: 50408.6. Samples: 269332060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:35:12,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 10:35:14,067][49750] Updated weights for policy 0, policy_version 153601 (0.0034) [2024-04-26 10:35:15,702][49728] Signal inference workers to stop experience collection... (4100 times) [2024-04-26 10:35:15,739][49750] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-04-26 10:35:15,773][49728] Signal inference workers to resume experience collection... (4100 times) [2024-04-26 10:35:15,773][49750] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-04-26 10:35:17,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50373.9). Total num frames: 2516746240. Throughput: 0: 50342.7. Samples: 269641640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:35:17,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 10:35:17,167][49750] Updated weights for policy 0, policy_version 153611 (0.0034) [2024-04-26 10:35:20,442][49750] Updated weights for policy 0, policy_version 153621 (0.0035) [2024-04-26 10:35:22,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2516959232. Throughput: 0: 50479.1. Samples: 269788100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:35:22,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 10:35:23,692][49750] Updated weights for policy 0, policy_version 153631 (0.0028) [2024-04-26 10:35:27,055][49750] Updated weights for policy 0, policy_version 153641 (0.0041) [2024-04-26 10:35:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 2517254144. Throughput: 0: 50389.0. Samples: 270090400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:35:27,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 10:35:30,137][49750] Updated weights for policy 0, policy_version 153651 (0.0031) [2024-04-26 10:35:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2517483520. Throughput: 0: 50262.7. Samples: 270391220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:35:32,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 10:35:33,568][49750] Updated weights for policy 0, policy_version 153661 (0.0033) [2024-04-26 10:35:36,509][49750] Updated weights for policy 0, policy_version 153671 (0.0029) [2024-04-26 10:35:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2517745664. Throughput: 0: 50429.1. Samples: 270549460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:35:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 10:35:40,069][49750] Updated weights for policy 0, policy_version 153681 (0.0030) [2024-04-26 10:35:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2517991424. Throughput: 0: 50363.5. Samples: 270848500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 10:35:42,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 10:35:43,200][49750] Updated weights for policy 0, policy_version 153691 (0.0032) [2024-04-26 10:35:46,781][49750] Updated weights for policy 0, policy_version 153701 (0.0032) [2024-04-26 10:35:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.4, 300 sec: 50207.3). Total num frames: 2518237184. Throughput: 0: 50281.5. Samples: 271146160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:35:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 10:35:49,766][49750] Updated weights for policy 0, policy_version 153711 (0.0026) [2024-04-26 10:35:52,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.4, 300 sec: 50318.4). Total num frames: 2518499328. Throughput: 0: 50399.5. Samples: 271293460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:35:52,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 10:35:53,353][49750] Updated weights for policy 0, policy_version 153721 (0.0031) [2024-04-26 10:35:56,229][49750] Updated weights for policy 0, policy_version 153731 (0.0030) [2024-04-26 10:35:57,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 2518777856. Throughput: 0: 50369.6. Samples: 271598700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:35:57,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 10:35:59,918][49750] Updated weights for policy 0, policy_version 153741 (0.0032) [2024-04-26 10:36:02,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2519007232. Throughput: 0: 50312.0. Samples: 271905680. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 10:36:02,822][49750] Updated weights for policy 0, policy_version 153751 (0.0028) [2024-04-26 10:36:06,379][49750] Updated weights for policy 0, policy_version 153761 (0.0033) [2024-04-26 10:36:07,063][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.1, 300 sec: 50207.3). Total num frames: 2519236608. Throughput: 0: 50264.8. Samples: 272050020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 10:36:09,410][49750] Updated weights for policy 0, policy_version 153771 (0.0027) [2024-04-26 10:36:12,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2519515136. Throughput: 0: 50193.3. Samples: 272349100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:12,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 10:36:12,853][49750] Updated weights for policy 0, policy_version 153781 (0.0033) [2024-04-26 10:36:15,833][49728] Signal inference workers to stop experience collection... (4150 times) [2024-04-26 10:36:15,834][49750] Updated weights for policy 0, policy_version 153791 (0.0037) [2024-04-26 10:36:15,837][49728] Signal inference workers to resume experience collection... (4150 times) [2024-04-26 10:36:15,854][49750] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-04-26 10:36:15,854][49750] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-04-26 10:36:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2519744512. Throughput: 0: 50206.3. Samples: 272650500. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:17,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 10:36:19,448][49750] Updated weights for policy 0, policy_version 153801 (0.0030) [2024-04-26 10:36:22,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50790.2, 300 sec: 50318.3). Total num frames: 2520006656. Throughput: 0: 50068.2. Samples: 272802540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:22,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 10:36:22,290][49750] Updated weights for policy 0, policy_version 153811 (0.0029) [2024-04-26 10:36:26,059][49750] Updated weights for policy 0, policy_version 153821 (0.0029) [2024-04-26 10:36:27,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2520236032. Throughput: 0: 50078.2. Samples: 273102020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:27,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 10:36:28,846][49750] Updated weights for policy 0, policy_version 153831 (0.0028) [2024-04-26 10:36:32,063][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2520514560. Throughput: 0: 50283.4. Samples: 273408920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:32,071][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 10:36:32,502][49750] Updated weights for policy 0, policy_version 153841 (0.0030) [2024-04-26 10:36:35,246][49750] Updated weights for policy 0, policy_version 153851 (0.0034) [2024-04-26 10:36:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2520760320. Throughput: 0: 50290.1. Samples: 273556520. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:37,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 10:36:38,882][49750] Updated weights for policy 0, policy_version 153861 (0.0043) [2024-04-26 10:36:41,618][49750] Updated weights for policy 0, policy_version 153871 (0.0030) [2024-04-26 10:36:42,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2521038848. Throughput: 0: 50353.3. Samples: 273864600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:42,063][49517] Avg episode reward: [(0, '0.463')] [2024-04-26 10:36:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000153872_2521038848.pth... [2024-04-26 10:36:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000153134_2508947456.pth [2024-04-26 10:36:45,490][49750] Updated weights for policy 0, policy_version 153881 (0.0038) [2024-04-26 10:36:47,062][49517] Fps is (10 sec: 47514.5, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2521235456. Throughput: 0: 50135.2. Samples: 274161760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 10:36:47,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 10:36:48,094][49750] Updated weights for policy 0, policy_version 153891 (0.0034) [2024-04-26 10:36:52,063][49517] Fps is (10 sec: 45874.8, 60 sec: 49971.0, 300 sec: 50262.8). Total num frames: 2521497600. Throughput: 0: 50170.0. Samples: 274307680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:36:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 10:36:52,215][49750] Updated weights for policy 0, policy_version 153901 (0.0033) [2024-04-26 10:36:54,652][49750] Updated weights for policy 0, policy_version 153911 (0.0024) [2024-04-26 10:36:57,063][49517] Fps is (10 sec: 52427.7, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 2521759744. Throughput: 0: 50116.8. Samples: 274604360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:36:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 10:36:58,885][49750] Updated weights for policy 0, policy_version 153921 (0.0032) [2024-04-26 10:37:01,256][49750] Updated weights for policy 0, policy_version 153931 (0.0030) [2024-04-26 10:37:02,063][49517] Fps is (10 sec: 52429.6, 60 sec: 50244.2, 300 sec: 50262.7). Total num frames: 2522021888. Throughput: 0: 49910.1. Samples: 274896460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:37:05,352][49750] Updated weights for policy 0, policy_version 153941 (0.0029) [2024-04-26 10:37:07,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50517.5, 300 sec: 50318.4). Total num frames: 2522267648. Throughput: 0: 50241.3. Samples: 275063380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 10:37:07,721][49750] Updated weights for policy 0, policy_version 153951 (0.0030) [2024-04-26 10:37:11,993][49750] Updated weights for policy 0, policy_version 153961 (0.0028) [2024-04-26 10:37:12,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2522497024. Throughput: 0: 50280.5. Samples: 275364640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 10:37:14,169][49750] Updated weights for policy 0, policy_version 153971 (0.0031) [2024-04-26 10:37:17,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2522759168. Throughput: 0: 50196.5. Samples: 275667760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:17,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 10:37:18,511][49750] Updated weights for policy 0, policy_version 153981 (0.0031) [2024-04-26 10:37:19,862][49728] Signal inference workers to stop experience collection... (4200 times) [2024-04-26 10:37:19,911][49750] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-04-26 10:37:19,931][49728] Signal inference workers to resume experience collection... (4200 times) [2024-04-26 10:37:19,933][49750] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-04-26 10:37:20,744][49750] Updated weights for policy 0, policy_version 153991 (0.0030) [2024-04-26 10:37:22,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2523037696. Throughput: 0: 50423.7. Samples: 275825580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:22,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 10:37:24,920][49750] Updated weights for policy 0, policy_version 154001 (0.0032) [2024-04-26 10:37:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2523283456. Throughput: 0: 50106.4. Samples: 276119380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 10:37:27,250][49750] Updated weights for policy 0, policy_version 154011 (0.0028) [2024-04-26 10:37:31,499][49750] Updated weights for policy 0, policy_version 154021 (0.0032) [2024-04-26 10:37:32,063][49517] Fps is (10 sec: 45874.0, 60 sec: 49698.0, 300 sec: 50262.7). Total num frames: 2523496448. Throughput: 0: 50194.7. Samples: 276420540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:32,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 10:37:33,600][49750] Updated weights for policy 0, policy_version 154031 (0.0026) [2024-04-26 10:37:37,063][49517] Fps is (10 sec: 47513.0, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2523758592. Throughput: 0: 50133.0. Samples: 276563660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:37,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 10:37:38,048][49750] Updated weights for policy 0, policy_version 154041 (0.0034) [2024-04-26 10:37:40,055][49750] Updated weights for policy 0, policy_version 154051 (0.0030) [2024-04-26 10:37:42,063][49517] Fps is (10 sec: 50790.8, 60 sec: 49425.1, 300 sec: 50151.7). Total num frames: 2524004352. Throughput: 0: 50074.2. Samples: 276857700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:42,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 10:37:44,466][49750] Updated weights for policy 0, policy_version 154061 (0.0030) [2024-04-26 10:37:46,584][49750] Updated weights for policy 0, policy_version 154071 (0.0033) [2024-04-26 10:37:47,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.3, 300 sec: 50373.9). Total num frames: 2524299264. Throughput: 0: 50231.5. Samples: 277156880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:47,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 10:37:51,131][49750] Updated weights for policy 0, policy_version 154081 (0.0034) [2024-04-26 10:37:52,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50244.5, 300 sec: 50262.8). Total num frames: 2524512256. Throughput: 0: 50233.7. Samples: 277323900. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 10:37:52,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 10:37:53,130][49750] Updated weights for policy 0, policy_version 154091 (0.0026) [2024-04-26 10:37:57,062][49517] Fps is (10 sec: 44237.4, 60 sec: 49698.2, 300 sec: 50151.7). Total num frames: 2524741632. Throughput: 0: 50262.7. Samples: 277626460. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:37:57,063][49517] Avg episode reward: [(0, '0.427')] [2024-04-26 10:37:57,723][49750] Updated weights for policy 0, policy_version 154101 (0.0031) [2024-04-26 10:37:59,598][49750] Updated weights for policy 0, policy_version 154111 (0.0025) [2024-04-26 10:38:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2525020160. Throughput: 0: 50159.4. Samples: 277924940. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:02,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 10:38:04,197][49750] Updated weights for policy 0, policy_version 154121 (0.0027) [2024-04-26 10:38:06,257][49750] Updated weights for policy 0, policy_version 154131 (0.0028) [2024-04-26 10:38:07,063][49517] Fps is (10 sec: 55705.0, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 2525298688. Throughput: 0: 50039.9. Samples: 278077380. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:07,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 10:38:10,621][49750] Updated weights for policy 0, policy_version 154141 (0.0034) [2024-04-26 10:38:11,709][49728] Signal inference workers to stop experience collection... (4250 times) [2024-04-26 10:38:11,766][49750] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-04-26 10:38:11,773][49728] Signal inference workers to resume experience collection... 
(4250 times) [2024-04-26 10:38:11,780][49750] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-04-26 10:38:12,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2525544448. Throughput: 0: 50340.4. Samples: 278384700. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 10:38:12,991][49750] Updated weights for policy 0, policy_version 154151 (0.0032) [2024-04-26 10:38:17,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2525757440. Throughput: 0: 50284.3. Samples: 278683320. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:17,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 10:38:17,122][49750] Updated weights for policy 0, policy_version 154161 (0.0032) [2024-04-26 10:38:19,770][49750] Updated weights for policy 0, policy_version 154171 (0.0028) [2024-04-26 10:38:22,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 2526019584. Throughput: 0: 50138.8. Samples: 278819900. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 10:38:23,604][49750] Updated weights for policy 0, policy_version 154181 (0.0032) [2024-04-26 10:38:26,172][49750] Updated weights for policy 0, policy_version 154191 (0.0032) [2024-04-26 10:38:27,063][49517] Fps is (10 sec: 52428.0, 60 sec: 49971.0, 300 sec: 50207.2). Total num frames: 2526281728. Throughput: 0: 50185.8. Samples: 279116060. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 10:38:30,149][49750] Updated weights for policy 0, policy_version 154201 (0.0035) [2024-04-26 10:38:32,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.7, 300 sec: 50429.4). Total num frames: 2526560256. Throughput: 0: 50379.3. Samples: 279423940. 
Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:32,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 10:38:32,540][49750] Updated weights for policy 0, policy_version 154211 (0.0027) [2024-04-26 10:38:36,614][49750] Updated weights for policy 0, policy_version 154221 (0.0033) [2024-04-26 10:38:37,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2526773248. Throughput: 0: 50158.2. Samples: 279581020. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:37,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 10:38:38,993][49750] Updated weights for policy 0, policy_version 154231 (0.0028) [2024-04-26 10:38:42,063][49517] Fps is (10 sec: 44235.5, 60 sec: 49971.1, 300 sec: 50096.1). Total num frames: 2527002624. Throughput: 0: 50240.2. Samples: 279887280. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:42,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 10:38:42,162][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000154237_2527019008.pth... [2024-04-26 10:38:42,204][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000153502_2514976768.pth [2024-04-26 10:38:43,042][49750] Updated weights for policy 0, policy_version 154241 (0.0033) [2024-04-26 10:38:45,395][49750] Updated weights for policy 0, policy_version 154251 (0.0034) [2024-04-26 10:38:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 49698.2, 300 sec: 50207.2). Total num frames: 2527281152. Throughput: 0: 50217.9. Samples: 280184740. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:47,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 10:38:49,614][49750] Updated weights for policy 0, policy_version 154261 (0.0030) [2024-04-26 10:38:51,796][49750] Updated weights for policy 0, policy_version 154271 (0.0031) [2024-04-26 10:38:52,063][49517] Fps is (10 sec: 57344.9, 60 sec: 51063.4, 300 sec: 50429.4). 
Total num frames: 2527576064. Throughput: 0: 50248.9. Samples: 280338580. Policy #0 lag: (min: 1.0, avg: 13.8, max: 21.0) [2024-04-26 10:38:52,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 10:38:56,270][49750] Updated weights for policy 0, policy_version 154281 (0.0033) [2024-04-26 10:38:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50207.3). Total num frames: 2527772672. Throughput: 0: 50232.0. Samples: 280645140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:38:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 10:38:58,614][49750] Updated weights for policy 0, policy_version 154291 (0.0026) [2024-04-26 10:39:00,156][49728] Signal inference workers to stop experience collection... (4300 times) [2024-04-26 10:39:00,157][49728] Signal inference workers to resume experience collection... (4300 times) [2024-04-26 10:39:00,190][49750] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-04-26 10:39:00,190][49750] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-04-26 10:39:02,063][49517] Fps is (10 sec: 42598.1, 60 sec: 49698.1, 300 sec: 50096.1). Total num frames: 2528002048. Throughput: 0: 50271.0. Samples: 280945520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:02,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 10:39:02,901][49750] Updated weights for policy 0, policy_version 154301 (0.0032) [2024-04-26 10:39:05,238][49750] Updated weights for policy 0, policy_version 154311 (0.0032) [2024-04-26 10:39:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49698.2, 300 sec: 50262.8). Total num frames: 2528280576. Throughput: 0: 50173.4. Samples: 281077700. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 10:39:09,234][49750] Updated weights for policy 0, policy_version 154321 (0.0033) [2024-04-26 10:39:11,705][49750] Updated weights for policy 0, policy_version 154331 (0.0030) [2024-04-26 10:39:12,062][49517] Fps is (10 sec: 55706.5, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2528559104. Throughput: 0: 50486.4. Samples: 281387940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:12,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 10:39:15,718][49750] Updated weights for policy 0, policy_version 154341 (0.0035) [2024-04-26 10:39:17,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 2528804864. Throughput: 0: 50511.3. Samples: 281696960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:17,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 10:39:18,384][49750] Updated weights for policy 0, policy_version 154351 (0.0036) [2024-04-26 10:39:22,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2529034240. Throughput: 0: 50368.3. Samples: 281847600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:22,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 10:39:22,282][49750] Updated weights for policy 0, policy_version 154361 (0.0034) [2024-04-26 10:39:24,964][49750] Updated weights for policy 0, policy_version 154371 (0.0033) [2024-04-26 10:39:27,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2529296384. Throughput: 0: 50250.0. Samples: 282148520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:27,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 10:39:28,744][49750] Updated weights for policy 0, policy_version 154381 (0.0038) [2024-04-26 10:39:31,756][49750] Updated weights for policy 0, policy_version 154391 (0.0038) [2024-04-26 10:39:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 2529542144. Throughput: 0: 50238.7. Samples: 282445480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:32,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 10:39:35,088][49750] Updated weights for policy 0, policy_version 154401 (0.0029) [2024-04-26 10:39:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2529820672. Throughput: 0: 50203.1. Samples: 282597720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 10:39:38,266][49750] Updated weights for policy 0, policy_version 154411 (0.0029) [2024-04-26 10:39:41,555][49750] Updated weights for policy 0, policy_version 154421 (0.0030) [2024-04-26 10:39:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.6, 300 sec: 50207.3). Total num frames: 2530033664. Throughput: 0: 50238.2. Samples: 282905860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:42,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 10:39:44,968][49750] Updated weights for policy 0, policy_version 154431 (0.0030) [2024-04-26 10:39:47,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.4, 300 sec: 50207.3). Total num frames: 2530295808. Throughput: 0: 50310.0. Samples: 283209460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:47,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 10:39:48,144][49750] Updated weights for policy 0, policy_version 154441 (0.0034) [2024-04-26 10:39:51,546][49750] Updated weights for policy 0, policy_version 154451 (0.0037) [2024-04-26 10:39:52,063][49517] Fps is (10 sec: 50788.8, 60 sec: 49424.9, 300 sec: 50262.7). Total num frames: 2530541568. Throughput: 0: 50410.3. Samples: 283346180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:52,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 10:39:54,705][49750] Updated weights for policy 0, policy_version 154461 (0.0030) [2024-04-26 10:39:57,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 2530836480. Throughput: 0: 50337.2. Samples: 283653120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 10:39:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:39:57,939][49750] Updated weights for policy 0, policy_version 154471 (0.0032) [2024-04-26 10:40:01,250][49750] Updated weights for policy 0, policy_version 154481 (0.0033) [2024-04-26 10:40:02,019][49728] Signal inference workers to stop experience collection... (4350 times) [2024-04-26 10:40:02,020][49728] Signal inference workers to resume experience collection... (4350 times) [2024-04-26 10:40:02,053][49750] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-04-26 10:40:02,053][49750] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-04-26 10:40:02,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50790.5, 300 sec: 50207.2). Total num frames: 2531049472. Throughput: 0: 50203.7. Samples: 283956120. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:02,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 10:40:04,317][49750] Updated weights for policy 0, policy_version 154491 (0.0042) [2024-04-26 10:40:07,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2531295232. Throughput: 0: 50077.0. Samples: 284101060. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:07,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 10:40:07,712][49750] Updated weights for policy 0, policy_version 154501 (0.0035) [2024-04-26 10:40:11,078][49750] Updated weights for policy 0, policy_version 154511 (0.0027) [2024-04-26 10:40:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2531557376. Throughput: 0: 50115.5. Samples: 284403720. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:12,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 10:40:14,154][49750] Updated weights for policy 0, policy_version 154521 (0.0031) [2024-04-26 10:40:17,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2531803136. Throughput: 0: 50080.8. Samples: 284699120. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:17,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 10:40:17,522][49750] Updated weights for policy 0, policy_version 154531 (0.0035) [2024-04-26 10:40:20,675][49750] Updated weights for policy 0, policy_version 154541 (0.0035) [2024-04-26 10:40:22,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2532065280. Throughput: 0: 50276.8. Samples: 284860180. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:22,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 10:40:23,920][49750] Updated weights for policy 0, policy_version 154551 (0.0028) [2024-04-26 10:40:27,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.3, 300 sec: 50262.8). 
Total num frames: 2532311040. Throughput: 0: 50134.2. Samples: 285161900. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:27,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 10:40:27,087][49750] Updated weights for policy 0, policy_version 154561 (0.0034) [2024-04-26 10:40:30,590][49750] Updated weights for policy 0, policy_version 154571 (0.0030) [2024-04-26 10:40:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2532540416. Throughput: 0: 50118.1. Samples: 285464780. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:32,063][49517] Avg episode reward: [(0, '0.464')] [2024-04-26 10:40:33,613][49750] Updated weights for policy 0, policy_version 154581 (0.0032) [2024-04-26 10:40:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2532802560. Throughput: 0: 50259.8. Samples: 285607860. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:37,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 10:40:37,134][49750] Updated weights for policy 0, policy_version 154591 (0.0030) [2024-04-26 10:40:40,321][49750] Updated weights for policy 0, policy_version 154601 (0.0035) [2024-04-26 10:40:42,063][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2533081088. Throughput: 0: 50097.7. Samples: 285907520. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:42,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 10:40:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000154607_2533081088.pth... [2024-04-26 10:40:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000153872_2521038848.pth [2024-04-26 10:40:43,635][49750] Updated weights for policy 0, policy_version 154611 (0.0034) [2024-04-26 10:40:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2533294080. 
Throughput: 0: 50075.6. Samples: 286209520. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:47,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 10:40:47,209][49750] Updated weights for policy 0, policy_version 154621 (0.0037) [2024-04-26 10:40:50,009][49750] Updated weights for policy 0, policy_version 154631 (0.0035) [2024-04-26 10:40:52,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.5, 300 sec: 50096.2). Total num frames: 2533556224. Throughput: 0: 50104.9. Samples: 286355780. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:52,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 10:40:53,727][49750] Updated weights for policy 0, policy_version 154641 (0.0033) [2024-04-26 10:40:56,596][49750] Updated weights for policy 0, policy_version 154651 (0.0031) [2024-04-26 10:40:57,063][49517] Fps is (10 sec: 50789.5, 60 sec: 49425.0, 300 sec: 50151.7). Total num frames: 2533801984. Throughput: 0: 50107.0. Samples: 286658540. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:40:57,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 10:41:00,119][49750] Updated weights for policy 0, policy_version 154661 (0.0029) [2024-04-26 10:41:01,074][49728] Signal inference workers to stop experience collection... (4400 times) [2024-04-26 10:41:01,075][49728] Signal inference workers to resume experience collection... (4400 times) [2024-04-26 10:41:01,102][49750] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-04-26 10:41:01,106][49750] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-04-26 10:41:02,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2534064128. Throughput: 0: 50136.1. Samples: 286955240. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-26 10:41:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:41:03,271][49750] Updated weights for policy 0, policy_version 154671 (0.0027) [2024-04-26 10:41:06,505][49750] Updated weights for policy 0, policy_version 154681 (0.0035) [2024-04-26 10:41:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2534309888. Throughput: 0: 50091.7. Samples: 287114300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:07,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 10:41:09,765][49750] Updated weights for policy 0, policy_version 154691 (0.0033) [2024-04-26 10:41:12,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2534555648. Throughput: 0: 50117.7. Samples: 287417200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 10:41:13,178][49750] Updated weights for policy 0, policy_version 154701 (0.0030) [2024-04-26 10:41:16,571][49750] Updated weights for policy 0, policy_version 154711 (0.0036) [2024-04-26 10:41:17,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2534817792. Throughput: 0: 50103.8. Samples: 287719460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:17,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 10:41:19,769][49750] Updated weights for policy 0, policy_version 154721 (0.0034) [2024-04-26 10:41:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2535079936. Throughput: 0: 50168.5. Samples: 287865440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:22,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 10:41:23,046][49750] Updated weights for policy 0, policy_version 154731 (0.0028) [2024-04-26 10:41:26,205][49750] Updated weights for policy 0, policy_version 154741 (0.0030) [2024-04-26 10:41:27,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2535325696. Throughput: 0: 50201.5. Samples: 288166580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 10:41:29,488][49750] Updated weights for policy 0, policy_version 154751 (0.0030) [2024-04-26 10:41:32,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2535571456. Throughput: 0: 50264.3. Samples: 288471420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:32,063][49517] Avg episode reward: [(0, '0.427')] [2024-04-26 10:41:32,597][49750] Updated weights for policy 0, policy_version 154761 (0.0034) [2024-04-26 10:41:35,986][49750] Updated weights for policy 0, policy_version 154771 (0.0030) [2024-04-26 10:41:37,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50096.2). Total num frames: 2535817216. Throughput: 0: 50304.8. Samples: 288619500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:37,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 10:41:39,148][49750] Updated weights for policy 0, policy_version 154781 (0.0031) [2024-04-26 10:41:42,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 50262.7). Total num frames: 2536062976. Throughput: 0: 50196.9. Samples: 288917400. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 10:41:42,704][49750] Updated weights for policy 0, policy_version 154791 (0.0035) [2024-04-26 10:41:45,809][49750] Updated weights for policy 0, policy_version 154801 (0.0027) [2024-04-26 10:41:47,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2536325120. Throughput: 0: 50275.5. Samples: 289217640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:47,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 10:41:49,089][49750] Updated weights for policy 0, policy_version 154811 (0.0033) [2024-04-26 10:41:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2536570880. Throughput: 0: 50141.8. Samples: 289370680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:52,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 10:41:52,202][49750] Updated weights for policy 0, policy_version 154821 (0.0030) [2024-04-26 10:41:55,518][49750] Updated weights for policy 0, policy_version 154831 (0.0031) [2024-04-26 10:41:57,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.5, 300 sec: 50151.7). Total num frames: 2536816640. Throughput: 0: 50142.4. Samples: 289673600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:41:57,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 10:41:58,598][49750] Updated weights for policy 0, policy_version 154841 (0.0029) [2024-04-26 10:42:02,013][49750] Updated weights for policy 0, policy_version 154851 (0.0028) [2024-04-26 10:42:02,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2537078784. Throughput: 0: 50138.8. Samples: 289975700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:42:02,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 10:42:05,192][49750] Updated weights for policy 0, policy_version 154861 (0.0031) [2024-04-26 10:42:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2537324544. Throughput: 0: 50197.9. Samples: 290124340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 10:42:07,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 10:42:08,600][49750] Updated weights for policy 0, policy_version 154871 (0.0027) [2024-04-26 10:42:08,621][49728] Signal inference workers to stop experience collection... (4450 times) [2024-04-26 10:42:08,626][49728] Signal inference workers to resume experience collection... (4450 times) [2024-04-26 10:42:08,642][49750] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-04-26 10:42:08,642][49750] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-04-26 10:42:11,896][49750] Updated weights for policy 0, policy_version 154881 (0.0036) [2024-04-26 10:42:12,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2537586688. Throughput: 0: 50309.6. Samples: 290430520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:12,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 10:42:15,109][49750] Updated weights for policy 0, policy_version 154891 (0.0031) [2024-04-26 10:42:17,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49971.3, 300 sec: 50096.1). Total num frames: 2537816064. Throughput: 0: 50098.2. Samples: 290725840. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:17,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:42:18,278][49750] Updated weights for policy 0, policy_version 154901 (0.0027) [2024-04-26 10:42:21,512][49750] Updated weights for policy 0, policy_version 154911 (0.0034) [2024-04-26 10:42:22,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2538078208. Throughput: 0: 50128.4. Samples: 290875280. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 10:42:24,912][49750] Updated weights for policy 0, policy_version 154921 (0.0037) [2024-04-26 10:42:27,063][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2538323968. Throughput: 0: 50148.1. Samples: 291174060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:27,063][49517] Avg episode reward: [(0, '0.469')] [2024-04-26 10:42:27,974][49750] Updated weights for policy 0, policy_version 154931 (0.0027) [2024-04-26 10:42:31,460][49750] Updated weights for policy 0, policy_version 154941 (0.0041) [2024-04-26 10:42:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2538602496. Throughput: 0: 50348.3. Samples: 291483320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:32,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 10:42:34,506][49750] Updated weights for policy 0, policy_version 154951 (0.0037) [2024-04-26 10:42:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50207.3). Total num frames: 2538815488. Throughput: 0: 50272.4. Samples: 291632940. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:37,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 10:42:37,888][49750] Updated weights for policy 0, policy_version 154961 (0.0028) [2024-04-26 10:42:41,004][49750] Updated weights for policy 0, policy_version 154971 (0.0032) [2024-04-26 10:42:42,062][49517] Fps is (10 sec: 49153.4, 60 sec: 50517.5, 300 sec: 50151.7). Total num frames: 2539094016. Throughput: 0: 50350.6. Samples: 291939380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:42,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 10:42:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000154974_2539094016.pth... [2024-04-26 10:42:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000154237_2527019008.pth [2024-04-26 10:42:44,341][49750] Updated weights for policy 0, policy_version 154981 (0.0025) [2024-04-26 10:42:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2539323392. Throughput: 0: 50190.3. Samples: 292234260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 10:42:47,508][49750] Updated weights for policy 0, policy_version 154991 (0.0032) [2024-04-26 10:42:50,893][49750] Updated weights for policy 0, policy_version 155001 (0.0029) [2024-04-26 10:42:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2539585536. Throughput: 0: 50317.6. Samples: 292388640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:52,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 10:42:53,942][49750] Updated weights for policy 0, policy_version 155011 (0.0036) [2024-04-26 10:42:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.1, 300 sec: 50207.2). Total num frames: 2539831296. Throughput: 0: 50220.5. Samples: 292690440. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:42:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 10:42:57,495][49750] Updated weights for policy 0, policy_version 155021 (0.0031) [2024-04-26 10:43:00,428][49750] Updated weights for policy 0, policy_version 155031 (0.0032) [2024-04-26 10:43:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2540077056. Throughput: 0: 50270.3. Samples: 292988000. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:43:02,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 10:43:03,885][49750] Updated weights for policy 0, policy_version 155041 (0.0030) [2024-04-26 10:43:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2540339200. Throughput: 0: 50282.8. Samples: 293138000. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:43:07,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 10:43:07,174][49750] Updated weights for policy 0, policy_version 155051 (0.0037) [2024-04-26 10:43:10,270][49750] Updated weights for policy 0, policy_version 155061 (0.0033) [2024-04-26 10:43:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2540601344. Throughput: 0: 50321.0. Samples: 293438500. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 10:43:12,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 10:43:13,574][49750] Updated weights for policy 0, policy_version 155071 (0.0034) [2024-04-26 10:43:16,699][49728] Signal inference workers to stop experience collection... (4500 times) [2024-04-26 10:43:16,699][49728] Signal inference workers to resume experience collection... 
(4500 times) [2024-04-26 10:43:16,710][49750] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-04-26 10:43:16,711][49750] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-04-26 10:43:16,854][49750] Updated weights for policy 0, policy_version 155081 (0.0032) [2024-04-26 10:43:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50318.3). Total num frames: 2540863488. Throughput: 0: 50402.0. Samples: 293751400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:43:17,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 10:43:20,027][49750] Updated weights for policy 0, policy_version 155091 (0.0040) [2024-04-26 10:43:22,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2541092864. Throughput: 0: 50267.4. Samples: 293894980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:43:22,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 10:43:23,350][49750] Updated weights for policy 0, policy_version 155101 (0.0039) [2024-04-26 10:43:26,413][49750] Updated weights for policy 0, policy_version 155111 (0.0032) [2024-04-26 10:43:27,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.2, 300 sec: 50151.7). Total num frames: 2541355008. Throughput: 0: 50150.8. Samples: 294196180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:43:27,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 10:43:29,813][49750] Updated weights for policy 0, policy_version 155121 (0.0036) [2024-04-26 10:43:32,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49971.2, 300 sec: 50262.7). Total num frames: 2541600768. Throughput: 0: 50371.8. Samples: 294501000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:43:32,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 10:43:32,837][49750] Updated weights for policy 0, policy_version 155131 (0.0035) [2024-04-26 10:43:36,264][49750] Updated weights for policy 0, policy_version 155141 (0.0036) [2024-04-26 10:43:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2541846528. Throughput: 0: 50315.1. Samples: 294652820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:43:37,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 10:43:39,358][49750] Updated weights for policy 0, policy_version 155151 (0.0028) [2024-04-26 10:43:42,062][49517] Fps is (10 sec: 49153.1, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2542092288. Throughput: 0: 50226.3. Samples: 294950620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:43:42,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 10:43:42,815][49750] Updated weights for policy 0, policy_version 155161 (0.0040) [2024-04-26 10:43:45,903][49750] Updated weights for policy 0, policy_version 155171 (0.0028) [2024-04-26 10:43:47,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50096.2). Total num frames: 2542354432. Throughput: 0: 50309.9. Samples: 295251940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:43:47,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 10:43:49,367][49750] Updated weights for policy 0, policy_version 155181 (0.0029) [2024-04-26 10:43:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2542616576. Throughput: 0: 50214.7. Samples: 295397660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:43:52,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 10:43:52,395][49750] Updated weights for policy 0, policy_version 155191 (0.0031) [2024-04-26 10:43:55,785][49750] Updated weights for policy 0, policy_version 155201 (0.0029) [2024-04-26 10:43:57,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2542829568. Throughput: 0: 50377.3. Samples: 295705480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:43:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 10:43:58,739][49750] Updated weights for policy 0, policy_version 155211 (0.0036) [2024-04-26 10:44:02,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50318.3). Total num frames: 2543124480. Throughput: 0: 50255.7. Samples: 296012900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:44:02,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 10:44:02,179][49750] Updated weights for policy 0, policy_version 155221 (0.0028) [2024-04-26 10:44:05,515][49750] Updated weights for policy 0, policy_version 155231 (0.0033) [2024-04-26 10:44:07,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2543337472. Throughput: 0: 50277.1. Samples: 296157440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:44:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 10:44:08,735][49750] Updated weights for policy 0, policy_version 155241 (0.0026) [2024-04-26 10:44:11,033][49728] Signal inference workers to stop experience collection... (4550 times) [2024-04-26 10:44:11,034][49728] Signal inference workers to resume experience collection... 
(4550 times) [2024-04-26 10:44:11,048][49750] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-04-26 10:44:11,070][49750] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-04-26 10:44:12,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2543616000. Throughput: 0: 50234.8. Samples: 296456740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:44:12,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 10:44:12,418][49750] Updated weights for policy 0, policy_version 155251 (0.0034) [2024-04-26 10:44:15,261][49750] Updated weights for policy 0, policy_version 155261 (0.0025) [2024-04-26 10:44:17,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2543878144. Throughput: 0: 50278.0. Samples: 296763500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:44:17,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 10:44:18,805][49750] Updated weights for policy 0, policy_version 155271 (0.0036) [2024-04-26 10:44:21,753][49750] Updated weights for policy 0, policy_version 155281 (0.0034) [2024-04-26 10:44:22,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2544123904. Throughput: 0: 50293.9. Samples: 296916040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:44:22,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 10:44:25,164][49750] Updated weights for policy 0, policy_version 155291 (0.0031) [2024-04-26 10:44:27,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.4, 300 sec: 50207.2). Total num frames: 2544353280. Throughput: 0: 50284.0. Samples: 297213400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:44:27,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 10:44:28,174][49750] Updated weights for policy 0, policy_version 155301 (0.0026) [2024-04-26 10:44:31,666][49750] Updated weights for policy 0, policy_version 155311 (0.0034) [2024-04-26 10:44:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.5, 300 sec: 50207.2). Total num frames: 2544631808. Throughput: 0: 50355.9. Samples: 297517960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:44:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 10:44:34,764][49750] Updated weights for policy 0, policy_version 155321 (0.0026) [2024-04-26 10:44:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2544877568. Throughput: 0: 50464.9. Samples: 297668580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:44:37,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 10:44:38,239][49750] Updated weights for policy 0, policy_version 155331 (0.0032) [2024-04-26 10:44:41,602][49750] Updated weights for policy 0, policy_version 155341 (0.0034) [2024-04-26 10:44:42,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2545106944. Throughput: 0: 50461.7. Samples: 297976260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:44:42,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 10:44:42,130][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000155342_2545123328.pth... [2024-04-26 10:44:42,179][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000154607_2533081088.pth [2024-04-26 10:44:44,725][49750] Updated weights for policy 0, policy_version 155351 (0.0029) [2024-04-26 10:44:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 50207.3). Total num frames: 2545352704. Throughput: 0: 50332.0. Samples: 298277840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:44:47,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 10:44:48,125][49750] Updated weights for policy 0, policy_version 155361 (0.0028) [2024-04-26 10:44:51,282][49750] Updated weights for policy 0, policy_version 155371 (0.0032) [2024-04-26 10:44:52,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49971.1, 300 sec: 50096.2). Total num frames: 2545614848. Throughput: 0: 50270.1. Samples: 298419600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:44:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 10:44:54,494][49750] Updated weights for policy 0, policy_version 155381 (0.0035) [2024-04-26 10:44:57,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50262.8). Total num frames: 2545876992. Throughput: 0: 50408.2. Samples: 298725100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:44:57,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 10:44:57,835][49750] Updated weights for policy 0, policy_version 155391 (0.0031) [2024-04-26 10:45:00,836][49750] Updated weights for policy 0, policy_version 155401 (0.0035) [2024-04-26 10:45:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2546122752. Throughput: 0: 50343.2. Samples: 299028940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:45:02,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 10:45:04,222][49750] Updated weights for policy 0, policy_version 155411 (0.0037) [2024-04-26 10:45:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2546384896. Throughput: 0: 50175.1. Samples: 299173920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:45:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 10:45:07,389][49750] Updated weights for policy 0, policy_version 155421 (0.0030) [2024-04-26 10:45:10,934][49750] Updated weights for policy 0, policy_version 155431 (0.0034) [2024-04-26 10:45:12,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2546630656. Throughput: 0: 50326.6. Samples: 299478100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:45:12,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 10:45:14,232][49750] Updated weights for policy 0, policy_version 155441 (0.0031) [2024-04-26 10:45:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50207.3). Total num frames: 2546876416. Throughput: 0: 50240.5. Samples: 299778780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 10:45:17,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 10:45:17,557][49750] Updated weights for policy 0, policy_version 155451 (0.0025) [2024-04-26 10:45:20,734][49750] Updated weights for policy 0, policy_version 155461 (0.0035) [2024-04-26 10:45:21,288][49728] Signal inference workers to stop experience collection... (4600 times) [2024-04-26 10:45:21,293][49728] Signal inference workers to resume experience collection... (4600 times) [2024-04-26 10:45:21,342][49750] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-04-26 10:45:21,342][49750] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-04-26 10:45:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2547154944. Throughput: 0: 50429.3. Samples: 299937900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:45:22,063][49517] Avg episode reward: [(0, '0.692')] [2024-04-26 10:45:22,072][49728] Saving new best policy, reward=0.692! 
[2024-04-26 10:45:24,088][49750] Updated weights for policy 0, policy_version 155471 (0.0037) [2024-04-26 10:45:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2547400704. Throughput: 0: 50219.2. Samples: 300236120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:45:27,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 10:45:27,064][49750] Updated weights for policy 0, policy_version 155481 (0.0030) [2024-04-26 10:45:30,741][49750] Updated weights for policy 0, policy_version 155491 (0.0030) [2024-04-26 10:45:32,063][49517] Fps is (10 sec: 45874.7, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2547613696. Throughput: 0: 50359.0. Samples: 300544000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:45:32,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 10:45:33,485][49750] Updated weights for policy 0, policy_version 155501 (0.0029) [2024-04-26 10:45:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2547875840. Throughput: 0: 50341.5. Samples: 300684960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:45:37,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 10:45:37,127][49750] Updated weights for policy 0, policy_version 155511 (0.0030) [2024-04-26 10:45:39,975][49750] Updated weights for policy 0, policy_version 155521 (0.0030) [2024-04-26 10:45:42,063][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.4, 300 sec: 50373.8). Total num frames: 2548154368. Throughput: 0: 50328.3. Samples: 300989880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:45:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 10:45:43,569][49750] Updated weights for policy 0, policy_version 155531 (0.0034) [2024-04-26 10:45:46,988][49750] Updated weights for policy 0, policy_version 155541 (0.0029) [2024-04-26 10:45:47,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.2, 300 sec: 50262.8). 
Total num frames: 2548383744. Throughput: 0: 50170.1. Samples: 301286600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:45:47,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 10:45:50,264][49750] Updated weights for policy 0, policy_version 155551 (0.0029) [2024-04-26 10:45:52,063][49517] Fps is (10 sec: 45874.8, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2548613120. Throughput: 0: 50322.0. Samples: 301438420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:45:52,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 10:45:53,476][49750] Updated weights for policy 0, policy_version 155561 (0.0030) [2024-04-26 10:45:56,866][49750] Updated weights for policy 0, policy_version 155571 (0.0030) [2024-04-26 10:45:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.2, 300 sec: 50207.3). Total num frames: 2548875264. Throughput: 0: 50236.6. Samples: 301738740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:45:57,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 10:45:59,973][49750] Updated weights for policy 0, policy_version 155581 (0.0027) [2024-04-26 10:46:02,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2549153792. Throughput: 0: 50131.5. Samples: 302034700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:46:02,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 10:46:03,250][49750] Updated weights for policy 0, policy_version 155591 (0.0031) [2024-04-26 10:46:06,435][49750] Updated weights for policy 0, policy_version 155601 (0.0031) [2024-04-26 10:46:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2549399552. Throughput: 0: 50284.5. Samples: 302200700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:46:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 10:46:09,815][49750] Updated weights for policy 0, policy_version 155611 (0.0031) [2024-04-26 10:46:11,844][49728] Signal inference workers to stop experience collection... (4650 times) [2024-04-26 10:46:11,848][49728] Signal inference workers to resume experience collection... (4650 times) [2024-04-26 10:46:11,873][49750] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-04-26 10:46:11,873][49750] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-04-26 10:46:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2549645312. Throughput: 0: 50253.3. Samples: 302497520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:46:12,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 10:46:12,793][49750] Updated weights for policy 0, policy_version 155621 (0.0028) [2024-04-26 10:46:16,246][49750] Updated weights for policy 0, policy_version 155631 (0.0040) [2024-04-26 10:46:17,062][49517] Fps is (10 sec: 47513.0, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2549874688. Throughput: 0: 50072.0. Samples: 302797240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:46:17,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 10:46:19,519][49750] Updated weights for policy 0, policy_version 155641 (0.0040) [2024-04-26 10:46:22,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2550136832. Throughput: 0: 50236.4. Samples: 302945600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 10:46:22,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 10:46:22,896][49750] Updated weights for policy 0, policy_version 155651 (0.0030) [2024-04-26 10:46:25,919][49750] Updated weights for policy 0, policy_version 155661 (0.0031) [2024-04-26 10:46:27,062][49517] Fps is (10 sec: 55705.8, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2550431744. Throughput: 0: 50223.7. Samples: 303249940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:46:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 10:46:29,480][49750] Updated weights for policy 0, policy_version 155671 (0.0027) [2024-04-26 10:46:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2550644736. Throughput: 0: 50343.6. Samples: 303552060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:46:32,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 10:46:32,271][49750] Updated weights for policy 0, policy_version 155681 (0.0031) [2024-04-26 10:46:35,827][49750] Updated weights for policy 0, policy_version 155691 (0.0028) [2024-04-26 10:46:37,062][49517] Fps is (10 sec: 44236.9, 60 sec: 49971.2, 300 sec: 50207.3). Total num frames: 2550874112. Throughput: 0: 50371.8. Samples: 303705140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:46:37,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 10:46:38,641][49750] Updated weights for policy 0, policy_version 155701 (0.0034) [2024-04-26 10:46:42,063][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2551152640. Throughput: 0: 50321.2. Samples: 304003200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:46:42,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 10:46:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000155710_2551152640.pth... 
[2024-04-26 10:46:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000154974_2539094016.pth [2024-04-26 10:46:42,304][49750] Updated weights for policy 0, policy_version 155711 (0.0034) [2024-04-26 10:46:45,207][49750] Updated weights for policy 0, policy_version 155721 (0.0026) [2024-04-26 10:46:47,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2551414784. Throughput: 0: 50260.5. Samples: 304296420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:46:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 10:46:48,961][49750] Updated weights for policy 0, policy_version 155731 (0.0036) [2024-04-26 10:46:51,743][49750] Updated weights for policy 0, policy_version 155741 (0.0035) [2024-04-26 10:46:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.6, 300 sec: 50318.3). Total num frames: 2551660544. Throughput: 0: 50209.7. Samples: 304460140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:46:52,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 10:46:55,491][49750] Updated weights for policy 0, policy_version 155751 (0.0033) [2024-04-26 10:46:57,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2551906304. Throughput: 0: 50378.1. Samples: 304764540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:46:57,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 10:46:58,153][49750] Updated weights for policy 0, policy_version 155761 (0.0034) [2024-04-26 10:47:01,957][49750] Updated weights for policy 0, policy_version 155771 (0.0032) [2024-04-26 10:47:02,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2552152064. Throughput: 0: 50468.1. Samples: 305068300. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:47:02,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 10:47:04,650][49750] Updated weights for policy 0, policy_version 155781 (0.0031) [2024-04-26 10:47:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2552414208. Throughput: 0: 50433.4. Samples: 305215100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:47:07,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 10:47:08,256][49750] Updated weights for policy 0, policy_version 155791 (0.0033) [2024-04-26 10:47:11,146][49750] Updated weights for policy 0, policy_version 155801 (0.0030) [2024-04-26 10:47:12,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2552692736. Throughput: 0: 50455.2. Samples: 305520420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:47:12,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 10:47:14,890][49750] Updated weights for policy 0, policy_version 155811 (0.0032) [2024-04-26 10:47:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2552922112. Throughput: 0: 50561.7. Samples: 305827340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:47:17,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 10:47:17,608][49728] Signal inference workers to stop experience collection... (4700 times) [2024-04-26 10:47:17,609][49728] Signal inference workers to resume experience collection... 
(4700 times) [2024-04-26 10:47:17,636][49750] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-04-26 10:47:17,637][49750] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-04-26 10:47:17,748][49750] Updated weights for policy 0, policy_version 155821 (0.0033) [2024-04-26 10:47:21,535][49750] Updated weights for policy 0, policy_version 155831 (0.0033) [2024-04-26 10:47:22,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2553167872. Throughput: 0: 50323.9. Samples: 305969720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:47:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 10:47:24,266][49750] Updated weights for policy 0, policy_version 155841 (0.0033) [2024-04-26 10:47:27,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 50207.3). Total num frames: 2553413632. Throughput: 0: 50355.1. Samples: 306269180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-26 10:47:27,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 10:47:27,968][49750] Updated weights for policy 0, policy_version 155851 (0.0036) [2024-04-26 10:47:30,880][49750] Updated weights for policy 0, policy_version 155861 (0.0033) [2024-04-26 10:47:32,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2553692160. Throughput: 0: 50433.8. Samples: 306565940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:47:32,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 10:47:34,345][49750] Updated weights for policy 0, policy_version 155871 (0.0029) [2024-04-26 10:47:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2553921536. Throughput: 0: 50311.0. Samples: 306724140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:47:37,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 10:47:37,314][49750] Updated weights for policy 0, policy_version 155881 (0.0038) [2024-04-26 10:47:40,876][49750] Updated weights for policy 0, policy_version 155891 (0.0033) [2024-04-26 10:47:42,062][49517] Fps is (10 sec: 44236.3, 60 sec: 49698.2, 300 sec: 50207.2). Total num frames: 2554134528. Throughput: 0: 50264.1. Samples: 307026420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:47:42,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 10:47:43,896][49750] Updated weights for policy 0, policy_version 155901 (0.0028) [2024-04-26 10:47:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2554413056. Throughput: 0: 50193.3. Samples: 307327000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:47:47,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 10:47:47,520][49750] Updated weights for policy 0, policy_version 155911 (0.0035) [2024-04-26 10:47:50,414][49750] Updated weights for policy 0, policy_version 155921 (0.0027) [2024-04-26 10:47:52,062][49517] Fps is (10 sec: 55705.4, 60 sec: 50517.2, 300 sec: 50373.9). Total num frames: 2554691584. Throughput: 0: 50266.6. Samples: 307477100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:47:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 10:47:53,895][49750] Updated weights for policy 0, policy_version 155931 (0.0029) [2024-04-26 10:47:56,874][49750] Updated weights for policy 0, policy_version 155941 (0.0028) [2024-04-26 10:47:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2554937344. Throughput: 0: 50265.7. Samples: 307782380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:47:57,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 10:48:00,310][49750] Updated weights for policy 0, policy_version 155951 (0.0030) [2024-04-26 10:48:02,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2555166720. Throughput: 0: 50220.1. Samples: 308087240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:48:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 10:48:03,394][49750] Updated weights for policy 0, policy_version 155961 (0.0032) [2024-04-26 10:48:06,837][49750] Updated weights for policy 0, policy_version 155971 (0.0028) [2024-04-26 10:48:07,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2555428864. Throughput: 0: 50136.0. Samples: 308225840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:48:07,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 10:48:09,889][49750] Updated weights for policy 0, policy_version 155981 (0.0034) [2024-04-26 10:48:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2555674624. Throughput: 0: 50248.1. Samples: 308530340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:48:12,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 10:48:13,339][49750] Updated weights for policy 0, policy_version 155991 (0.0033) [2024-04-26 10:48:16,400][49750] Updated weights for policy 0, policy_version 156001 (0.0036) [2024-04-26 10:48:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.4, 300 sec: 50318.4). Total num frames: 2555936768. Throughput: 0: 50252.0. Samples: 308827280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:48:17,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 10:48:18,983][49728] Signal inference workers to stop experience collection... 
(4750 times) [2024-04-26 10:48:19,019][49750] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-04-26 10:48:19,050][49728] Signal inference workers to resume experience collection... (4750 times) [2024-04-26 10:48:19,051][49750] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-04-26 10:48:20,049][49750] Updated weights for policy 0, policy_version 156011 (0.0031) [2024-04-26 10:48:22,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2556182528. Throughput: 0: 50201.2. Samples: 308983200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:48:22,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 10:48:22,965][49750] Updated weights for policy 0, policy_version 156021 (0.0032) [2024-04-26 10:48:26,481][49750] Updated weights for policy 0, policy_version 156031 (0.0027) [2024-04-26 10:48:27,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2556411904. Throughput: 0: 50255.6. Samples: 309287920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 10:48:27,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 10:48:29,549][49750] Updated weights for policy 0, policy_version 156041 (0.0031) [2024-04-26 10:48:32,062][49517] Fps is (10 sec: 50791.6, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2556690432. Throughput: 0: 50208.1. Samples: 309586360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:48:32,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 10:48:32,885][49750] Updated weights for policy 0, policy_version 156051 (0.0033) [2024-04-26 10:48:36,035][49750] Updated weights for policy 0, policy_version 156061 (0.0033) [2024-04-26 10:48:37,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2556952576. Throughput: 0: 50313.3. Samples: 309741200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:48:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 10:48:39,465][49750] Updated weights for policy 0, policy_version 156071 (0.0036) [2024-04-26 10:48:42,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2557181952. Throughput: 0: 50328.4. Samples: 310047160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:48:42,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 10:48:42,176][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000156079_2557198336.pth... [2024-04-26 10:48:42,219][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000155342_2545123328.pth [2024-04-26 10:48:42,535][49750] Updated weights for policy 0, policy_version 156081 (0.0033) [2024-04-26 10:48:45,948][49750] Updated weights for policy 0, policy_version 156091 (0.0035) [2024-04-26 10:48:47,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2557427712. Throughput: 0: 50166.6. Samples: 310344740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:48:47,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 10:48:49,157][49750] Updated weights for policy 0, policy_version 156101 (0.0029) [2024-04-26 10:48:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2557706240. Throughput: 0: 50288.1. Samples: 310488800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:48:52,063][49517] Avg episode reward: [(0, '0.428')] [2024-04-26 10:48:52,313][49750] Updated weights for policy 0, policy_version 156111 (0.0031) [2024-04-26 10:48:55,683][49750] Updated weights for policy 0, policy_version 156121 (0.0032) [2024-04-26 10:48:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2557935616. Throughput: 0: 50325.7. Samples: 310795000. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:48:57,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 10:48:59,233][49750] Updated weights for policy 0, policy_version 156131 (0.0028) [2024-04-26 10:49:02,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.1, 300 sec: 50373.8). Total num frames: 2558197760. Throughput: 0: 50496.2. Samples: 311099620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:49:02,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 10:49:02,158][49750] Updated weights for policy 0, policy_version 156141 (0.0030) [2024-04-26 10:49:05,585][49750] Updated weights for policy 0, policy_version 156151 (0.0029) [2024-04-26 10:49:07,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2558459904. Throughput: 0: 50184.1. Samples: 311241480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:49:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 10:49:08,622][49750] Updated weights for policy 0, policy_version 156161 (0.0037) [2024-04-26 10:49:12,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2558689280. Throughput: 0: 50194.7. Samples: 311546680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:49:12,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 10:49:12,126][49750] Updated weights for policy 0, policy_version 156171 (0.0031) [2024-04-26 10:49:15,135][49750] Updated weights for policy 0, policy_version 156181 (0.0029) [2024-04-26 10:49:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2558951424. Throughput: 0: 50339.5. Samples: 311851640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:49:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 10:49:18,677][49750] Updated weights for policy 0, policy_version 156191 (0.0029) [2024-04-26 10:49:21,653][49750] Updated weights for policy 0, policy_version 156201 (0.0032) [2024-04-26 10:49:22,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2559213568. Throughput: 0: 50300.3. Samples: 312004720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:49:22,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 10:49:25,252][49750] Updated weights for policy 0, policy_version 156211 (0.0033) [2024-04-26 10:49:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50207.3). Total num frames: 2559442944. Throughput: 0: 50093.0. Samples: 312301340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:49:27,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 10:49:28,292][49750] Updated weights for policy 0, policy_version 156221 (0.0034) [2024-04-26 10:49:31,650][49750] Updated weights for policy 0, policy_version 156231 (0.0036) [2024-04-26 10:49:32,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2559688704. Throughput: 0: 50106.7. Samples: 312599540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 10:49:32,063][49517] Avg episode reward: [(0, '0.390')] [2024-04-26 10:49:34,682][49750] Updated weights for policy 0, policy_version 156241 (0.0030) [2024-04-26 10:49:37,063][49517] Fps is (10 sec: 50789.6, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2559950848. Throughput: 0: 50214.1. Samples: 312748440. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:49:37,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 10:49:38,121][49750] Updated weights for policy 0, policy_version 156251 (0.0031) [2024-04-26 10:49:39,671][49728] Signal inference workers to stop experience collection... (4800 times) [2024-04-26 10:49:39,672][49728] Signal inference workers to resume experience collection... (4800 times) [2024-04-26 10:49:39,702][49750] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-04-26 10:49:39,702][49750] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-04-26 10:49:41,317][49750] Updated weights for policy 0, policy_version 156261 (0.0029) [2024-04-26 10:49:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2560196608. Throughput: 0: 50146.1. Samples: 313051580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:49:42,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 10:49:44,594][49750] Updated weights for policy 0, policy_version 156271 (0.0029) [2024-04-26 10:49:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2560442368. Throughput: 0: 49985.4. Samples: 313348960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:49:47,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 10:49:47,985][49750] Updated weights for policy 0, policy_version 156281 (0.0034) [2024-04-26 10:49:51,262][49750] Updated weights for policy 0, policy_version 156291 (0.0036) [2024-04-26 10:49:52,062][49517] Fps is (10 sec: 49152.9, 60 sec: 49698.2, 300 sec: 50207.2). Total num frames: 2560688128. Throughput: 0: 50167.7. Samples: 313499020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:49:52,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 10:49:54,526][49750] Updated weights for policy 0, policy_version 156301 (0.0031) [2024-04-26 10:49:57,062][49517] Fps is (10 sec: 49153.0, 60 sec: 49971.3, 300 sec: 50207.2). Total num frames: 2560933888. Throughput: 0: 50040.9. Samples: 313798520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:49:57,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 10:49:57,674][49750] Updated weights for policy 0, policy_version 156311 (0.0030) [2024-04-26 10:50:01,077][49750] Updated weights for policy 0, policy_version 156321 (0.0032) [2024-04-26 10:50:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2561212416. Throughput: 0: 49976.4. Samples: 314100580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:50:02,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 10:50:04,469][49750] Updated weights for policy 0, policy_version 156331 (0.0031) [2024-04-26 10:50:07,063][49517] Fps is (10 sec: 52427.9, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2561458176. Throughput: 0: 49994.3. Samples: 314254460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:50:07,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 10:50:07,520][49750] Updated weights for policy 0, policy_version 156341 (0.0029) [2024-04-26 10:50:10,903][49750] Updated weights for policy 0, policy_version 156351 (0.0033) [2024-04-26 10:50:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2561703936. Throughput: 0: 50077.6. Samples: 314554840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:50:12,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 10:50:14,104][49750] Updated weights for policy 0, policy_version 156361 (0.0033) [2024-04-26 10:50:17,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2561949696. Throughput: 0: 50251.2. Samples: 314860840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:50:17,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 10:50:17,308][49750] Updated weights for policy 0, policy_version 156371 (0.0029) [2024-04-26 10:50:20,532][49750] Updated weights for policy 0, policy_version 156381 (0.0034) [2024-04-26 10:50:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 50151.7). Total num frames: 2562195456. Throughput: 0: 50207.6. Samples: 315007780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:50:22,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 10:50:23,977][49750] Updated weights for policy 0, policy_version 156391 (0.0036) [2024-04-26 10:50:27,033][49750] Updated weights for policy 0, policy_version 156401 (0.0028) [2024-04-26 10:50:27,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2562473984. Throughput: 0: 50265.9. Samples: 315313540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:50:27,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 10:50:30,532][49750] Updated weights for policy 0, policy_version 156411 (0.0030) [2024-04-26 10:50:32,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 50262.7). Total num frames: 2562703360. Throughput: 0: 50428.4. Samples: 315618240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:50:32,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 10:50:33,504][49750] Updated weights for policy 0, policy_version 156421 (0.0034) [2024-04-26 10:50:36,878][49750] Updated weights for policy 0, policy_version 156431 (0.0031) [2024-04-26 10:50:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2562965504. Throughput: 0: 50139.4. Samples: 315755300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 10:50:37,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 10:50:40,075][49750] Updated weights for policy 0, policy_version 156441 (0.0035) [2024-04-26 10:50:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2563211264. Throughput: 0: 50258.0. Samples: 316060140. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:50:42,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 10:50:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000156446_2563211264.pth... [2024-04-26 10:50:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000155710_2551152640.pth [2024-04-26 10:50:43,271][49750] Updated weights for policy 0, policy_version 156451 (0.0034) [2024-04-26 10:50:46,632][49750] Updated weights for policy 0, policy_version 156461 (0.0026) [2024-04-26 10:50:47,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2563473408. Throughput: 0: 50287.8. Samples: 316363540. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:50:47,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 10:50:49,886][49750] Updated weights for policy 0, policy_version 156471 (0.0036) [2024-04-26 10:50:52,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2563702784. Throughput: 0: 50194.9. Samples: 316513220. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:50:52,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 10:50:53,132][49750] Updated weights for policy 0, policy_version 156481 (0.0031) [2024-04-26 10:50:56,508][49750] Updated weights for policy 0, policy_version 156491 (0.0031) [2024-04-26 10:50:57,063][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.1, 300 sec: 50151.7). Total num frames: 2563948544. Throughput: 0: 50213.8. Samples: 316814460. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:50:57,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 10:50:59,692][49750] Updated weights for policy 0, policy_version 156501 (0.0032) [2024-04-26 10:51:02,062][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2564210688. Throughput: 0: 50063.1. Samples: 317113680. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:51:02,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 10:51:02,837][49750] Updated weights for policy 0, policy_version 156511 (0.0033) [2024-04-26 10:51:06,173][49728] Signal inference workers to stop experience collection... (4850 times) [2024-04-26 10:51:06,221][49750] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-04-26 10:51:06,241][49728] Signal inference workers to resume experience collection... (4850 times) [2024-04-26 10:51:06,242][49750] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-04-26 10:51:06,249][49750] Updated weights for policy 0, policy_version 156521 (0.0027) [2024-04-26 10:51:07,062][49517] Fps is (10 sec: 50790.9, 60 sec: 49971.3, 300 sec: 50207.2). Total num frames: 2564456448. Throughput: 0: 50269.8. Samples: 317269920. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:51:07,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 10:51:09,277][49750] Updated weights for policy 0, policy_version 156531 (0.0031) [2024-04-26 10:51:12,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2564718592. Throughput: 0: 50263.4. Samples: 317575400. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:51:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:51:12,612][49750] Updated weights for policy 0, policy_version 156541 (0.0034) [2024-04-26 10:51:16,176][49750] Updated weights for policy 0, policy_version 156551 (0.0034) [2024-04-26 10:51:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2564947968. Throughput: 0: 50094.0. Samples: 317872460. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:51:17,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 10:51:19,120][49750] Updated weights for policy 0, policy_version 156561 (0.0031) [2024-04-26 10:51:22,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2565226496. Throughput: 0: 50196.4. Samples: 318014140. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:51:22,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 10:51:22,974][49750] Updated weights for policy 0, policy_version 156571 (0.0031) [2024-04-26 10:51:25,638][49750] Updated weights for policy 0, policy_version 156581 (0.0034) [2024-04-26 10:51:27,062][49517] Fps is (10 sec: 52428.5, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2565472256. Throughput: 0: 50229.4. Samples: 318320460. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:51:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:51:29,689][49750] Updated weights for policy 0, policy_version 156591 (0.0033) [2024-04-26 10:51:32,002][49750] Updated weights for policy 0, policy_version 156601 (0.0032) [2024-04-26 10:51:32,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2565750784. Throughput: 0: 50167.8. Samples: 318621080. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:51:32,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 10:51:36,037][49750] Updated weights for policy 0, policy_version 156611 (0.0029) [2024-04-26 10:51:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50207.3). Total num frames: 2565963776. Throughput: 0: 50318.5. Samples: 318777560. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:51:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 10:51:38,503][49750] Updated weights for policy 0, policy_version 156621 (0.0028) [2024-04-26 10:51:42,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2566225920. Throughput: 0: 50323.9. Samples: 319079040. Policy #0 lag: (min: 1.0, avg: 12.0, max: 24.0) [2024-04-26 10:51:42,063][49517] Avg episode reward: [(0, '0.460')] [2024-04-26 10:51:42,586][49750] Updated weights for policy 0, policy_version 156631 (0.0030) [2024-04-26 10:51:45,074][49750] Updated weights for policy 0, policy_version 156641 (0.0036) [2024-04-26 10:51:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.3, 300 sec: 50207.2). Total num frames: 2566471680. Throughput: 0: 50235.4. Samples: 319374280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:51:47,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 10:51:49,081][49750] Updated weights for policy 0, policy_version 156651 (0.0041) [2024-04-26 10:51:51,837][49750] Updated weights for policy 0, policy_version 156661 (0.0027) [2024-04-26 10:51:52,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.2, 300 sec: 50318.3). Total num frames: 2566750208. Throughput: 0: 50331.4. Samples: 319534840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:51:52,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 10:51:55,421][49750] Updated weights for policy 0, policy_version 156671 (0.0033) [2024-04-26 10:51:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2566979584. Throughput: 0: 50086.7. Samples: 319829300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:51:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 10:51:58,354][49750] Updated weights for policy 0, policy_version 156681 (0.0030) [2024-04-26 10:52:01,820][49750] Updated weights for policy 0, policy_version 156691 (0.0034) [2024-04-26 10:52:02,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2567225344. Throughput: 0: 50243.1. Samples: 320133400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 10:52:04,889][49750] Updated weights for policy 0, policy_version 156701 (0.0032) [2024-04-26 10:52:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2567487488. Throughput: 0: 50359.3. Samples: 320280300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 10:52:08,354][49750] Updated weights for policy 0, policy_version 156711 (0.0035) [2024-04-26 10:52:11,141][49728] Signal inference workers to stop experience collection... (4900 times) [2024-04-26 10:52:11,147][49728] Signal inference workers to resume experience collection... (4900 times) [2024-04-26 10:52:11,176][49750] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-04-26 10:52:11,177][49750] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-04-26 10:52:11,271][49750] Updated weights for policy 0, policy_version 156721 (0.0028) [2024-04-26 10:52:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2567733248. Throughput: 0: 50314.6. Samples: 320584620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:12,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 10:52:14,816][49750] Updated weights for policy 0, policy_version 156731 (0.0031) [2024-04-26 10:52:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2567995392. Throughput: 0: 50253.4. Samples: 320882480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 10:52:17,905][49750] Updated weights for policy 0, policy_version 156741 (0.0032) [2024-04-26 10:52:21,756][49750] Updated weights for policy 0, policy_version 156751 (0.0031) [2024-04-26 10:52:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2568224768. Throughput: 0: 50317.8. Samples: 321041860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:22,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 10:52:24,528][49750] Updated weights for policy 0, policy_version 156761 (0.0035) [2024-04-26 10:52:27,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2568486912. Throughput: 0: 50200.2. Samples: 321338040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:27,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 10:52:28,125][49750] Updated weights for policy 0, policy_version 156771 (0.0030) [2024-04-26 10:52:30,942][49750] Updated weights for policy 0, policy_version 156781 (0.0033) [2024-04-26 10:52:32,063][49517] Fps is (10 sec: 50789.9, 60 sec: 49698.0, 300 sec: 50207.2). Total num frames: 2568732672. Throughput: 0: 50296.4. Samples: 321637620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:32,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 10:52:34,697][49750] Updated weights for policy 0, policy_version 156791 (0.0031) [2024-04-26 10:52:37,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2568994816. Throughput: 0: 50321.9. Samples: 321799320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 10:52:37,434][49750] Updated weights for policy 0, policy_version 156801 (0.0030) [2024-04-26 10:52:41,361][49750] Updated weights for policy 0, policy_version 156811 (0.0032) [2024-04-26 10:52:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2569240576. Throughput: 0: 50268.8. Samples: 322091400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:42,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 10:52:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000156814_2569240576.pth... 
[2024-04-26 10:52:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000156079_2557198336.pth [2024-04-26 10:52:44,067][49750] Updated weights for policy 0, policy_version 156821 (0.0027) [2024-04-26 10:52:47,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2569486336. Throughput: 0: 50146.0. Samples: 322389980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 10:52:47,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 10:52:47,784][49750] Updated weights for policy 0, policy_version 156831 (0.0033) [2024-04-26 10:52:50,839][49750] Updated weights for policy 0, policy_version 156841 (0.0037) [2024-04-26 10:52:52,062][49517] Fps is (10 sec: 50791.4, 60 sec: 49971.4, 300 sec: 50207.2). Total num frames: 2569748480. Throughput: 0: 50140.9. Samples: 322536640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:52:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:52:54,163][49750] Updated weights for policy 0, policy_version 156851 (0.0044) [2024-04-26 10:52:57,062][49517] Fps is (10 sec: 49153.3, 60 sec: 49971.4, 300 sec: 50207.3). Total num frames: 2569977856. Throughput: 0: 50041.1. Samples: 322836460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:52:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 10:52:57,282][49750] Updated weights for policy 0, policy_version 156861 (0.0031) [2024-04-26 10:53:00,710][49750] Updated weights for policy 0, policy_version 156871 (0.0030) [2024-04-26 10:53:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2570256384. Throughput: 0: 50255.9. Samples: 323144000. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:02,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 10:53:04,076][49750] Updated weights for policy 0, policy_version 156881 (0.0028) [2024-04-26 10:53:07,063][49517] Fps is (10 sec: 50789.0, 60 sec: 49971.0, 300 sec: 50207.2). Total num frames: 2570485760. Throughput: 0: 50127.4. Samples: 323297600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 10:53:07,247][49750] Updated weights for policy 0, policy_version 156891 (0.0031) [2024-04-26 10:53:10,515][49750] Updated weights for policy 0, policy_version 156901 (0.0029) [2024-04-26 10:53:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.4, 300 sec: 50207.2). Total num frames: 2570747904. Throughput: 0: 50327.7. Samples: 323602780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 10:53:13,848][49750] Updated weights for policy 0, policy_version 156911 (0.0034) [2024-04-26 10:53:17,007][49750] Updated weights for policy 0, policy_version 156921 (0.0035) [2024-04-26 10:53:17,063][49517] Fps is (10 sec: 50790.8, 60 sec: 49971.1, 300 sec: 50207.3). Total num frames: 2570993664. Throughput: 0: 50251.6. Samples: 323898940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 10:53:20,429][49750] Updated weights for policy 0, policy_version 156931 (0.0032) [2024-04-26 10:53:22,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2571272192. Throughput: 0: 49922.7. Samples: 324045840. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:22,072][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 10:53:23,490][49750] Updated weights for policy 0, policy_version 156941 (0.0031) [2024-04-26 10:53:26,814][49750] Updated weights for policy 0, policy_version 156951 (0.0028) [2024-04-26 10:53:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2571485184. Throughput: 0: 50252.2. Samples: 324352740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 10:53:27,323][49728] Signal inference workers to stop experience collection... (4950 times) [2024-04-26 10:53:27,363][49750] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-04-26 10:53:27,396][49728] Signal inference workers to resume experience collection... (4950 times) [2024-04-26 10:53:27,396][49750] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-04-26 10:53:29,867][49750] Updated weights for policy 0, policy_version 156961 (0.0033) [2024-04-26 10:53:32,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2571747328. Throughput: 0: 50325.5. Samples: 324654620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:32,063][49517] Avg episode reward: [(0, '0.460')] [2024-04-26 10:53:33,349][49750] Updated weights for policy 0, policy_version 156971 (0.0028) [2024-04-26 10:53:36,360][49750] Updated weights for policy 0, policy_version 156981 (0.0031) [2024-04-26 10:53:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2571993088. Throughput: 0: 50222.2. Samples: 324796640. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:37,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 10:53:40,009][49750] Updated weights for policy 0, policy_version 156991 (0.0035) [2024-04-26 10:53:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2572255232. Throughput: 0: 50361.1. Samples: 325102720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:42,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 10:53:43,008][49750] Updated weights for policy 0, policy_version 157001 (0.0034) [2024-04-26 10:53:46,449][49750] Updated weights for policy 0, policy_version 157011 (0.0031) [2024-04-26 10:53:47,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.5, 300 sec: 50207.2). Total num frames: 2572517376. Throughput: 0: 50289.4. Samples: 325407020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 10:53:49,523][49750] Updated weights for policy 0, policy_version 157021 (0.0032) [2024-04-26 10:53:52,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2572763136. Throughput: 0: 50189.9. Samples: 325556140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 10:53:52,071][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 10:53:52,893][49750] Updated weights for policy 0, policy_version 157031 (0.0034) [2024-04-26 10:53:55,916][49750] Updated weights for policy 0, policy_version 157041 (0.0038) [2024-04-26 10:53:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50207.3). Total num frames: 2573008896. Throughput: 0: 50011.5. Samples: 325853300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:53:57,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 10:53:59,457][49750] Updated weights for policy 0, policy_version 157051 (0.0035) [2024-04-26 10:54:02,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.1, 300 sec: 50151.7). 
Total num frames: 2573254656. Throughput: 0: 50157.7. Samples: 326156040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:02,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 10:54:02,463][49750] Updated weights for policy 0, policy_version 157061 (0.0031) [2024-04-26 10:54:05,879][49750] Updated weights for policy 0, policy_version 157071 (0.0035) [2024-04-26 10:54:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.6, 300 sec: 50318.3). Total num frames: 2573533184. Throughput: 0: 50305.9. Samples: 326309600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 10:54:09,203][49750] Updated weights for policy 0, policy_version 157081 (0.0031) [2024-04-26 10:54:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2573746176. Throughput: 0: 50215.4. Samples: 326612440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:12,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 10:54:12,414][49750] Updated weights for policy 0, policy_version 157091 (0.0031) [2024-04-26 10:54:15,769][49750] Updated weights for policy 0, policy_version 157101 (0.0029) [2024-04-26 10:54:17,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2574008320. Throughput: 0: 50259.5. Samples: 326916300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:17,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 10:54:18,826][49750] Updated weights for policy 0, policy_version 157111 (0.0031) [2024-04-26 10:54:22,063][49517] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 50207.2). Total num frames: 2574254080. Throughput: 0: 50283.4. Samples: 327059400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 10:54:22,228][49750] Updated weights for policy 0, policy_version 157121 (0.0033) [2024-04-26 10:54:25,325][49750] Updated weights for policy 0, policy_version 157131 (0.0037) [2024-04-26 10:54:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2574516224. Throughput: 0: 50343.2. Samples: 327368160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:27,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 10:54:28,582][49750] Updated weights for policy 0, policy_version 157141 (0.0032) [2024-04-26 10:54:29,145][49728] Signal inference workers to stop experience collection... (5000 times) [2024-04-26 10:54:29,151][49728] Signal inference workers to resume experience collection... (5000 times) [2024-04-26 10:54:29,174][49750] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-04-26 10:54:29,175][49750] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-04-26 10:54:31,780][49750] Updated weights for policy 0, policy_version 157151 (0.0036) [2024-04-26 10:54:32,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2574778368. Throughput: 0: 50411.9. Samples: 327675560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:32,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 10:54:35,016][49750] Updated weights for policy 0, policy_version 157161 (0.0038) [2024-04-26 10:54:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50207.3). Total num frames: 2575007744. Throughput: 0: 50496.5. Samples: 327828480. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:37,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 10:54:38,187][49750] Updated weights for policy 0, policy_version 157171 (0.0030) [2024-04-26 10:54:41,689][49750] Updated weights for policy 0, policy_version 157181 (0.0036) [2024-04-26 10:54:42,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2575253504. Throughput: 0: 50556.5. Samples: 328128340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 10:54:42,101][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000157182_2575269888.pth... [2024-04-26 10:54:42,146][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000156446_2563211264.pth [2024-04-26 10:54:44,702][49750] Updated weights for policy 0, policy_version 157191 (0.0032) [2024-04-26 10:54:47,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2575532032. Throughput: 0: 50342.7. Samples: 328421460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 10:54:48,113][49750] Updated weights for policy 0, policy_version 157201 (0.0037) [2024-04-26 10:54:51,228][49750] Updated weights for policy 0, policy_version 157211 (0.0033) [2024-04-26 10:54:52,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 2575794176. Throughput: 0: 50423.1. Samples: 328578640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 10:54:52,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 10:54:54,693][49750] Updated weights for policy 0, policy_version 157221 (0.0028) [2024-04-26 10:54:57,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2576007168. Throughput: 0: 50360.8. Samples: 328878680. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:54:57,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 10:54:57,786][49750] Updated weights for policy 0, policy_version 157231 (0.0034) [2024-04-26 10:55:01,138][49750] Updated weights for policy 0, policy_version 157241 (0.0029) [2024-04-26 10:55:02,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2576285696. Throughput: 0: 50434.5. Samples: 329185860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 10:55:04,263][49750] Updated weights for policy 0, policy_version 157251 (0.0029) [2024-04-26 10:55:07,063][49517] Fps is (10 sec: 52428.9, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2576531456. Throughput: 0: 50399.9. Samples: 329327400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:07,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 10:55:07,739][49750] Updated weights for policy 0, policy_version 157261 (0.0035) [2024-04-26 10:55:10,754][49750] Updated weights for policy 0, policy_version 157271 (0.0031) [2024-04-26 10:55:12,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50373.8). Total num frames: 2576809984. Throughput: 0: 50408.8. Samples: 329636560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:12,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 10:55:14,266][49750] Updated weights for policy 0, policy_version 157281 (0.0033) [2024-04-26 10:55:17,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2577022976. Throughput: 0: 50323.2. Samples: 329940100. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:17,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 10:55:17,253][49750] Updated weights for policy 0, policy_version 157291 (0.0029) [2024-04-26 10:55:20,795][49750] Updated weights for policy 0, policy_version 157301 (0.0034) [2024-04-26 10:55:22,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50207.3). Total num frames: 2577285120. Throughput: 0: 50246.3. Samples: 330089560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:22,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 10:55:23,682][49750] Updated weights for policy 0, policy_version 157311 (0.0035) [2024-04-26 10:55:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2577530880. Throughput: 0: 50172.8. Samples: 330386120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:27,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 10:55:27,237][49750] Updated weights for policy 0, policy_version 157321 (0.0028) [2024-04-26 10:55:30,137][49750] Updated weights for policy 0, policy_version 157331 (0.0041) [2024-04-26 10:55:30,753][49728] Signal inference workers to stop experience collection... (5050 times) [2024-04-26 10:55:30,753][49728] Signal inference workers to resume experience collection... (5050 times) [2024-04-26 10:55:30,776][49750] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-04-26 10:55:30,776][49750] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-04-26 10:55:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2577793024. Throughput: 0: 50185.1. Samples: 330679780. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:32,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 10:55:33,956][49750] Updated weights for policy 0, policy_version 157341 (0.0035) [2024-04-26 10:55:36,757][49750] Updated weights for policy 0, policy_version 157351 (0.0032) [2024-04-26 10:55:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2578055168. Throughput: 0: 50435.0. Samples: 330848220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 10:55:40,353][49750] Updated weights for policy 0, policy_version 157361 (0.0032) [2024-04-26 10:55:42,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50207.3). Total num frames: 2578284544. Throughput: 0: 50394.3. Samples: 331146420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 10:55:43,330][49750] Updated weights for policy 0, policy_version 157371 (0.0029) [2024-04-26 10:55:46,940][49750] Updated weights for policy 0, policy_version 157381 (0.0030) [2024-04-26 10:55:47,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2578530304. Throughput: 0: 50318.3. Samples: 331450180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:55:49,739][49750] Updated weights for policy 0, policy_version 157391 (0.0028) [2024-04-26 10:55:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2578792448. Throughput: 0: 50381.4. Samples: 331594560. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:52,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 10:55:53,487][49750] Updated weights for policy 0, policy_version 157401 (0.0029) [2024-04-26 10:55:56,364][49750] Updated weights for policy 0, policy_version 157411 (0.0029) [2024-04-26 10:55:57,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.5, 300 sec: 50373.8). Total num frames: 2579070976. Throughput: 0: 50228.5. Samples: 331896840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 10:55:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 10:55:59,942][49750] Updated weights for policy 0, policy_version 157421 (0.0030) [2024-04-26 10:56:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2579283968. Throughput: 0: 50178.1. Samples: 332198120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:02,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 10:56:02,986][49750] Updated weights for policy 0, policy_version 157431 (0.0032) [2024-04-26 10:56:06,424][49750] Updated weights for policy 0, policy_version 157441 (0.0028) [2024-04-26 10:56:07,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2579529728. Throughput: 0: 50004.8. Samples: 332339780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 10:56:09,512][49750] Updated weights for policy 0, policy_version 157451 (0.0036) [2024-04-26 10:56:12,063][49517] Fps is (10 sec: 50790.3, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2579791872. Throughput: 0: 50149.3. Samples: 332642840. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:12,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 10:56:13,085][49750] Updated weights for policy 0, policy_version 157461 (0.0032) [2024-04-26 10:56:15,939][49750] Updated weights for policy 0, policy_version 157471 (0.0029) [2024-04-26 10:56:17,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2580054016. Throughput: 0: 50276.4. Samples: 332942220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:17,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 10:56:19,537][49750] Updated weights for policy 0, policy_version 157481 (0.0038) [2024-04-26 10:56:22,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.0, 300 sec: 50207.2). Total num frames: 2580283392. Throughput: 0: 50099.4. Samples: 333102700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:22,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 10:56:22,408][49750] Updated weights for policy 0, policy_version 157491 (0.0034) [2024-04-26 10:56:26,082][49750] Updated weights for policy 0, policy_version 157501 (0.0028) [2024-04-26 10:56:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2580545536. Throughput: 0: 50161.0. Samples: 333403660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:27,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 10:56:28,839][49750] Updated weights for policy 0, policy_version 157511 (0.0031) [2024-04-26 10:56:32,062][49517] Fps is (10 sec: 50791.6, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2580791296. Throughput: 0: 50139.6. Samples: 333706460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 10:56:32,635][49750] Updated weights for policy 0, policy_version 157521 (0.0030) [2024-04-26 10:56:34,847][49728] Signal inference workers to stop experience collection... (5100 times) [2024-04-26 10:56:34,848][49728] Signal inference workers to resume experience collection... (5100 times) [2024-04-26 10:56:34,860][49750] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-04-26 10:56:34,870][49750] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-04-26 10:56:35,296][49750] Updated weights for policy 0, policy_version 157531 (0.0030) [2024-04-26 10:56:37,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2581069824. Throughput: 0: 50313.3. Samples: 333858660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:37,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 10:56:39,169][49750] Updated weights for policy 0, policy_version 157541 (0.0030) [2024-04-26 10:56:41,867][49750] Updated weights for policy 0, policy_version 157551 (0.0027) [2024-04-26 10:56:42,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2581315584. Throughput: 0: 50185.4. Samples: 334155180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:42,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 10:56:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000157552_2581331968.pth... [2024-04-26 10:56:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000156814_2569240576.pth [2024-04-26 10:56:45,650][49750] Updated weights for policy 0, policy_version 157561 (0.0033) [2024-04-26 10:56:47,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2581544960. Throughput: 0: 50144.3. Samples: 334454620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:47,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 10:56:48,368][49750] Updated weights for policy 0, policy_version 157571 (0.0034) [2024-04-26 10:56:52,063][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2581790720. Throughput: 0: 50228.3. Samples: 334600060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:52,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 10:56:52,113][49750] Updated weights for policy 0, policy_version 157581 (0.0032) [2024-04-26 10:56:54,999][49750] Updated weights for policy 0, policy_version 157591 (0.0038) [2024-04-26 10:56:57,062][49517] Fps is (10 sec: 50791.1, 60 sec: 49698.2, 300 sec: 50262.8). Total num frames: 2582052864. Throughput: 0: 50109.4. Samples: 334897760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:56:57,063][49517] Avg episode reward: [(0, '0.445')] [2024-04-26 10:56:58,587][49750] Updated weights for policy 0, policy_version 157601 (0.0027) [2024-04-26 10:57:01,388][49750] Updated weights for policy 0, policy_version 157611 (0.0035) [2024-04-26 10:57:02,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2582331392. Throughput: 0: 50197.4. Samples: 335201100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 10:57:02,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 10:57:05,204][49750] Updated weights for policy 0, policy_version 157621 (0.0031) [2024-04-26 10:57:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2582544384. Throughput: 0: 50184.2. Samples: 335360980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:07,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 10:57:08,035][49750] Updated weights for policy 0, policy_version 157631 (0.0038) [2024-04-26 10:57:11,652][49750] Updated weights for policy 0, policy_version 157641 (0.0037) [2024-04-26 10:57:12,063][49517] Fps is (10 sec: 45874.3, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2582790144. Throughput: 0: 50118.0. Samples: 335658980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:12,063][49517] Avg episode reward: [(0, '0.446')] [2024-04-26 10:57:14,527][49750] Updated weights for policy 0, policy_version 157651 (0.0038) [2024-04-26 10:57:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2583052288. Throughput: 0: 50208.8. Samples: 335965860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:17,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 10:57:18,221][49750] Updated weights for policy 0, policy_version 157661 (0.0032) [2024-04-26 10:57:20,915][49750] Updated weights for policy 0, policy_version 157671 (0.0037) [2024-04-26 10:57:22,062][49517] Fps is (10 sec: 52430.0, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2583314432. Throughput: 0: 49931.3. Samples: 336105560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 10:57:24,939][49750] Updated weights for policy 0, policy_version 157681 (0.0030) [2024-04-26 10:57:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.4, 300 sec: 50318.4). Total num frames: 2583576576. Throughput: 0: 50241.0. Samples: 336416020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:27,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 10:57:27,396][49750] Updated weights for policy 0, policy_version 157691 (0.0033) [2024-04-26 10:57:31,523][49750] Updated weights for policy 0, policy_version 157701 (0.0030) [2024-04-26 10:57:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2583805952. Throughput: 0: 50393.1. Samples: 336722300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:32,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 10:57:33,134][49728] Signal inference workers to stop experience collection... (5150 times) [2024-04-26 10:57:33,135][49728] Signal inference workers to resume experience collection... (5150 times) [2024-04-26 10:57:33,166][49750] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-04-26 10:57:33,166][49750] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-04-26 10:57:33,938][49750] Updated weights for policy 0, policy_version 157711 (0.0034) [2024-04-26 10:57:37,062][49517] Fps is (10 sec: 47513.0, 60 sec: 49698.2, 300 sec: 50207.3). Total num frames: 2584051712. Throughput: 0: 50111.2. Samples: 336855060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:37,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 10:57:37,876][49750] Updated weights for policy 0, policy_version 157721 (0.0028) [2024-04-26 10:57:40,398][49750] Updated weights for policy 0, policy_version 157731 (0.0035) [2024-04-26 10:57:42,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2584313856. Throughput: 0: 50211.0. Samples: 337157260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:42,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 10:57:44,417][49750] Updated weights for policy 0, policy_version 157741 (0.0027) [2024-04-26 10:57:46,885][49750] Updated weights for policy 0, policy_version 157751 (0.0028) [2024-04-26 10:57:47,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51063.5, 300 sec: 50373.8). Total num frames: 2584608768. Throughput: 0: 50397.6. Samples: 337469000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:47,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 10:57:51,009][49750] Updated weights for policy 0, policy_version 157761 (0.0036) [2024-04-26 10:57:52,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2584805376. Throughput: 0: 50196.9. Samples: 337619840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:52,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 10:57:53,422][49750] Updated weights for policy 0, policy_version 157771 (0.0027) [2024-04-26 10:57:57,062][49517] Fps is (10 sec: 44237.4, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2585051136. Throughput: 0: 50303.3. Samples: 337922620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:57:57,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 10:57:57,420][49750] Updated weights for policy 0, policy_version 157781 (0.0030) [2024-04-26 10:58:00,023][49750] Updated weights for policy 0, policy_version 157791 (0.0028) [2024-04-26 10:58:02,063][49517] Fps is (10 sec: 50789.1, 60 sec: 49697.9, 300 sec: 50262.8). Total num frames: 2585313280. Throughput: 0: 50102.9. Samples: 338220500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:58:02,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 10:58:03,813][49750] Updated weights for policy 0, policy_version 157801 (0.0030) [2024-04-26 10:58:06,532][49750] Updated weights for policy 0, policy_version 157811 (0.0032) [2024-04-26 10:58:07,063][49517] Fps is (10 sec: 54066.2, 60 sec: 50790.2, 300 sec: 50318.3). Total num frames: 2585591808. Throughput: 0: 50382.4. Samples: 338372780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 10:58:07,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 10:58:10,312][49750] Updated weights for policy 0, policy_version 157821 (0.0039) [2024-04-26 10:58:12,063][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2585821184. Throughput: 0: 50170.1. Samples: 338673680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:12,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 10:58:12,949][49750] Updated weights for policy 0, policy_version 157831 (0.0036) [2024-04-26 10:58:16,932][49750] Updated weights for policy 0, policy_version 157841 (0.0031) [2024-04-26 10:58:17,062][49517] Fps is (10 sec: 47514.8, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2586066944. Throughput: 0: 50177.8. Samples: 338980300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:17,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 10:58:19,468][49750] Updated weights for policy 0, policy_version 157851 (0.0030) [2024-04-26 10:58:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2586312704. Throughput: 0: 50192.1. Samples: 339113700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:22,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 10:58:23,543][49750] Updated weights for policy 0, policy_version 157861 (0.0036) [2024-04-26 10:58:25,019][49728] Signal inference workers to stop experience collection... 
(5200 times) [2024-04-26 10:58:25,055][49750] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-04-26 10:58:25,092][49728] Signal inference workers to resume experience collection... (5200 times) [2024-04-26 10:58:25,092][49750] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-04-26 10:58:26,124][49750] Updated weights for policy 0, policy_version 157871 (0.0035) [2024-04-26 10:58:27,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2586591232. Throughput: 0: 50288.5. Samples: 339420240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:27,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 10:58:30,117][49750] Updated weights for policy 0, policy_version 157881 (0.0035) [2024-04-26 10:58:32,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2586853376. Throughput: 0: 50226.3. Samples: 339729180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:32,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 10:58:32,523][49750] Updated weights for policy 0, policy_version 157891 (0.0031) [2024-04-26 10:58:36,581][49750] Updated weights for policy 0, policy_version 157901 (0.0029) [2024-04-26 10:58:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2587066368. Throughput: 0: 50172.0. Samples: 339877580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:37,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 10:58:38,910][49750] Updated weights for policy 0, policy_version 157911 (0.0029) [2024-04-26 10:58:42,063][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2587312128. Throughput: 0: 50055.0. Samples: 340175100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:42,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 10:58:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000157917_2587312128.pth... [2024-04-26 10:58:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000157182_2575269888.pth [2024-04-26 10:58:43,060][49750] Updated weights for policy 0, policy_version 157921 (0.0032) [2024-04-26 10:58:45,603][49750] Updated weights for policy 0, policy_version 157931 (0.0028) [2024-04-26 10:58:47,062][49517] Fps is (10 sec: 52428.6, 60 sec: 49698.2, 300 sec: 50262.8). Total num frames: 2587590656. Throughput: 0: 50105.1. Samples: 340475220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:47,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 10:58:49,618][49750] Updated weights for policy 0, policy_version 157941 (0.0030) [2024-04-26 10:58:52,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2587852800. Throughput: 0: 50287.7. Samples: 340635720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:52,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 10:58:52,134][49750] Updated weights for policy 0, policy_version 157951 (0.0026) [2024-04-26 10:58:56,098][49750] Updated weights for policy 0, policy_version 157961 (0.0035) [2024-04-26 10:58:57,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.2, 300 sec: 50207.3). Total num frames: 2588065792. Throughput: 0: 50288.9. Samples: 340936680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:58:57,072][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 10:58:58,669][49750] Updated weights for policy 0, policy_version 157971 (0.0031) [2024-04-26 10:59:02,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2588327936. Throughput: 0: 50203.8. Samples: 341239480. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:59:02,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 10:59:02,618][49750] Updated weights for policy 0, policy_version 157981 (0.0034) [2024-04-26 10:59:05,100][49750] Updated weights for policy 0, policy_version 157991 (0.0039) [2024-04-26 10:59:07,063][49517] Fps is (10 sec: 50790.5, 60 sec: 49698.2, 300 sec: 50262.8). Total num frames: 2588573696. Throughput: 0: 50419.9. Samples: 341382600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:59:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 10:59:09,007][49750] Updated weights for policy 0, policy_version 158001 (0.0030) [2024-04-26 10:59:11,562][49750] Updated weights for policy 0, policy_version 158011 (0.0032) [2024-04-26 10:59:12,063][49517] Fps is (10 sec: 54066.9, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2588868608. Throughput: 0: 50402.5. Samples: 341688360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 10:59:12,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 10:59:15,528][49750] Updated weights for policy 0, policy_version 158021 (0.0029) [2024-04-26 10:59:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2589081600. Throughput: 0: 50228.9. Samples: 341989480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:59:17,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 10:59:18,135][49750] Updated weights for policy 0, policy_version 158031 (0.0034) [2024-04-26 10:59:18,161][49728] Signal inference workers to stop experience collection... (5250 times) [2024-04-26 10:59:18,161][49728] Signal inference workers to resume experience collection... 
(5250 times) [2024-04-26 10:59:18,174][49750] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-04-26 10:59:18,174][49750] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-04-26 10:59:22,063][49517] Fps is (10 sec: 45875.6, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2589327360. Throughput: 0: 50128.8. Samples: 342133380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:59:22,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 10:59:22,180][49750] Updated weights for policy 0, policy_version 158041 (0.0032) [2024-04-26 10:59:24,609][49750] Updated weights for policy 0, policy_version 158051 (0.0030) [2024-04-26 10:59:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 2589573120. Throughput: 0: 50162.3. Samples: 342432400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:59:27,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 10:59:28,814][49750] Updated weights for policy 0, policy_version 158061 (0.0033) [2024-04-26 10:59:31,033][49750] Updated weights for policy 0, policy_version 158071 (0.0031) [2024-04-26 10:59:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2589851648. Throughput: 0: 50053.0. Samples: 342727600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:59:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 10:59:35,410][49750] Updated weights for policy 0, policy_version 158081 (0.0030) [2024-04-26 10:59:37,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2590113792. Throughput: 0: 50353.7. Samples: 342901640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:59:37,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 10:59:37,540][49750] Updated weights for policy 0, policy_version 158091 (0.0027) [2024-04-26 10:59:41,800][49750] Updated weights for policy 0, policy_version 158101 (0.0032) [2024-04-26 10:59:42,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2590326784. Throughput: 0: 50347.2. Samples: 343202300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:59:42,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 10:59:44,041][49750] Updated weights for policy 0, policy_version 158111 (0.0036) [2024-04-26 10:59:47,062][49517] Fps is (10 sec: 45875.7, 60 sec: 49698.2, 300 sec: 50096.1). Total num frames: 2590572544. Throughput: 0: 50267.2. Samples: 343501500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:59:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 10:59:48,228][49750] Updated weights for policy 0, policy_version 158121 (0.0030) [2024-04-26 10:59:50,551][49750] Updated weights for policy 0, policy_version 158131 (0.0027) [2024-04-26 10:59:52,063][49517] Fps is (10 sec: 52428.1, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2590851072. Throughput: 0: 50431.1. Samples: 343652000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:59:52,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 10:59:54,822][49750] Updated weights for policy 0, policy_version 158141 (0.0030) [2024-04-26 10:59:57,038][49750] Updated weights for policy 0, policy_version 158151 (0.0037) [2024-04-26 10:59:57,062][49517] Fps is (10 sec: 57344.3, 60 sec: 51336.6, 300 sec: 50373.9). Total num frames: 2591145984. Throughput: 0: 50428.2. Samples: 343957620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 10:59:57,064][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 11:00:01,379][49750] Updated weights for policy 0, policy_version 158161 (0.0035) [2024-04-26 11:00:02,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2591342592. Throughput: 0: 50414.1. Samples: 344258120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 11:00:02,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 11:00:03,636][49750] Updated weights for policy 0, policy_version 158171 (0.0027) [2024-04-26 11:00:07,062][49517] Fps is (10 sec: 45875.2, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2591604736. Throughput: 0: 50357.0. Samples: 344399440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 11:00:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:00:07,834][49750] Updated weights for policy 0, policy_version 158181 (0.0030) [2024-04-26 11:00:10,140][49750] Updated weights for policy 0, policy_version 158191 (0.0026) [2024-04-26 11:00:12,062][49517] Fps is (10 sec: 49153.0, 60 sec: 49425.3, 300 sec: 50207.2). Total num frames: 2591834112. Throughput: 0: 50355.3. Samples: 344698380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 11:00:12,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:00:12,821][49728] Signal inference workers to stop experience collection... (5300 times) [2024-04-26 11:00:12,822][49728] Signal inference workers to resume experience collection... 
(5300 times) [2024-04-26 11:00:12,846][49750] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-04-26 11:00:12,846][49750] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-04-26 11:00:14,227][49750] Updated weights for policy 0, policy_version 158201 (0.0034) [2024-04-26 11:00:16,553][49750] Updated weights for policy 0, policy_version 158211 (0.0036) [2024-04-26 11:00:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2592129024. Throughput: 0: 50405.7. Samples: 344995860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 11:00:17,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:00:20,772][49750] Updated weights for policy 0, policy_version 158221 (0.0029) [2024-04-26 11:00:22,062][49517] Fps is (10 sec: 54066.5, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2592374784. Throughput: 0: 50280.6. Samples: 345164260. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:00:22,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 11:00:22,936][49750] Updated weights for policy 0, policy_version 158231 (0.0028) [2024-04-26 11:00:27,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2592587776. Throughput: 0: 50298.2. Samples: 345465720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:00:27,063][49517] Avg episode reward: [(0, '0.469')] [2024-04-26 11:00:27,417][49750] Updated weights for policy 0, policy_version 158241 (0.0027) [2024-04-26 11:00:29,521][49750] Updated weights for policy 0, policy_version 158251 (0.0030) [2024-04-26 11:00:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2592849920. Throughput: 0: 50501.0. Samples: 345774040. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:00:32,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 11:00:34,059][49750] Updated weights for policy 0, policy_version 158261 (0.0035) [2024-04-26 11:00:36,074][49750] Updated weights for policy 0, policy_version 158271 (0.0030) [2024-04-26 11:00:37,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2593128448. Throughput: 0: 50409.4. Samples: 345920420. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:00:37,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 11:00:40,621][49750] Updated weights for policy 0, policy_version 158281 (0.0036) [2024-04-26 11:00:42,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2593374208. Throughput: 0: 50345.1. Samples: 346223160. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:00:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 11:00:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000158288_2593390592.pth... [2024-04-26 11:00:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000157552_2581331968.pth [2024-04-26 11:00:42,562][49750] Updated weights for policy 0, policy_version 158291 (0.0030) [2024-04-26 11:00:47,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2593587200. Throughput: 0: 50328.9. Samples: 346522920. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:00:47,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 11:00:47,155][49750] Updated weights for policy 0, policy_version 158301 (0.0031) [2024-04-26 11:00:49,204][49750] Updated weights for policy 0, policy_version 158311 (0.0028) [2024-04-26 11:00:52,063][49517] Fps is (10 sec: 47512.1, 60 sec: 49970.9, 300 sec: 50096.1). Total num frames: 2593849344. Throughput: 0: 50243.5. Samples: 346660420. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:00:52,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:00:53,628][49750] Updated weights for policy 0, policy_version 158321 (0.0030) [2024-04-26 11:00:55,910][49750] Updated weights for policy 0, policy_version 158331 (0.0029) [2024-04-26 11:00:57,063][49517] Fps is (10 sec: 52428.8, 60 sec: 49425.0, 300 sec: 50262.8). Total num frames: 2594111488. Throughput: 0: 50290.5. Samples: 346961460. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:00:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:01:00,026][49750] Updated weights for policy 0, policy_version 158341 (0.0036) [2024-04-26 11:01:02,062][49517] Fps is (10 sec: 54069.7, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2594390016. Throughput: 0: 50356.5. Samples: 347261900. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:01:02,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 11:01:02,524][49750] Updated weights for policy 0, policy_version 158351 (0.0034) [2024-04-26 11:01:06,532][49750] Updated weights for policy 0, policy_version 158361 (0.0030) [2024-04-26 11:01:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.1, 300 sec: 50262.8). Total num frames: 2594619392. Throughput: 0: 50207.4. Samples: 347423600. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:01:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 11:01:08,985][49750] Updated weights for policy 0, policy_version 158371 (0.0042) [2024-04-26 11:01:10,799][49728] Signal inference workers to stop experience collection... (5350 times) [2024-04-26 11:01:10,846][49750] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-04-26 11:01:10,869][49728] Signal inference workers to resume experience collection... 
(5350 times) [2024-04-26 11:01:10,870][49750] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-04-26 11:01:12,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2594848768. Throughput: 0: 50319.9. Samples: 347730120. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:01:12,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 11:01:13,030][49750] Updated weights for policy 0, policy_version 158381 (0.0031) [2024-04-26 11:01:15,586][49750] Updated weights for policy 0, policy_version 158391 (0.0033) [2024-04-26 11:01:17,063][49517] Fps is (10 sec: 47513.9, 60 sec: 49425.0, 300 sec: 50207.3). Total num frames: 2595094528. Throughput: 0: 50031.8. Samples: 348025480. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-26 11:01:17,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 11:01:19,542][49750] Updated weights for policy 0, policy_version 158401 (0.0028) [2024-04-26 11:01:22,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2595389440. Throughput: 0: 50102.3. Samples: 348175020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:01:22,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 11:01:22,291][49750] Updated weights for policy 0, policy_version 158411 (0.0034) [2024-04-26 11:01:26,069][49750] Updated weights for policy 0, policy_version 158421 (0.0029) [2024-04-26 11:01:27,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2595635200. Throughput: 0: 50005.5. Samples: 348473400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:01:27,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 11:01:28,869][49750] Updated weights for policy 0, policy_version 158431 (0.0028) [2024-04-26 11:01:32,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2595864576. Throughput: 0: 50213.4. Samples: 348782520. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:01:32,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 11:01:32,578][49750] Updated weights for policy 0, policy_version 158441 (0.0034) [2024-04-26 11:01:35,418][49750] Updated weights for policy 0, policy_version 158451 (0.0038) [2024-04-26 11:01:37,063][49517] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 2596110336. Throughput: 0: 50121.3. Samples: 348915860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:01:37,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 11:01:39,101][49750] Updated weights for policy 0, policy_version 158461 (0.0029) [2024-04-26 11:01:41,985][49750] Updated weights for policy 0, policy_version 158471 (0.0031) [2024-04-26 11:01:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2596388864. Throughput: 0: 50135.0. Samples: 349217540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:01:42,064][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 11:01:45,545][49750] Updated weights for policy 0, policy_version 158481 (0.0029) [2024-04-26 11:01:47,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.5, 300 sec: 50373.9). Total num frames: 2596651008. Throughput: 0: 50318.7. Samples: 349526240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:01:47,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 11:01:48,442][49750] Updated weights for policy 0, policy_version 158491 (0.0037) [2024-04-26 11:01:51,889][49750] Updated weights for policy 0, policy_version 158501 (0.0032) [2024-04-26 11:01:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.6, 300 sec: 50262.7). Total num frames: 2596880384. Throughput: 0: 50266.6. Samples: 349685600. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:01:52,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 11:01:55,336][49750] Updated weights for policy 0, policy_version 158511 (0.0038) [2024-04-26 11:01:57,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2597126144. Throughput: 0: 50102.2. Samples: 349984720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:01:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:01:58,347][49750] Updated weights for policy 0, policy_version 158521 (0.0038) [2024-04-26 11:02:01,916][49750] Updated weights for policy 0, policy_version 158531 (0.0033) [2024-04-26 11:02:02,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.0, 300 sec: 50262.7). Total num frames: 2597371904. Throughput: 0: 50263.9. Samples: 350287360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:02:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 11:02:04,889][49750] Updated weights for policy 0, policy_version 158541 (0.0032) [2024-04-26 11:02:05,745][49728] Signal inference workers to stop experience collection... (5400 times) [2024-04-26 11:02:05,746][49728] Signal inference workers to resume experience collection... (5400 times) [2024-04-26 11:02:05,780][49750] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-04-26 11:02:05,781][49750] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-04-26 11:02:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2597650432. Throughput: 0: 50290.0. Samples: 350438080. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:02:07,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 11:02:08,707][49750] Updated weights for policy 0, policy_version 158551 (0.0038) [2024-04-26 11:02:11,383][49750] Updated weights for policy 0, policy_version 158561 (0.0026) [2024-04-26 11:02:12,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2597896192. Throughput: 0: 50399.5. Samples: 350741380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:02:12,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 11:02:15,376][49750] Updated weights for policy 0, policy_version 158571 (0.0033) [2024-04-26 11:02:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50207.2). Total num frames: 2598125568. Throughput: 0: 50179.1. Samples: 351040580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:02:17,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 11:02:17,784][49750] Updated weights for policy 0, policy_version 158581 (0.0028) [2024-04-26 11:02:21,713][49750] Updated weights for policy 0, policy_version 158591 (0.0030) [2024-04-26 11:02:22,063][49517] Fps is (10 sec: 47513.9, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 2598371328. Throughput: 0: 50205.4. Samples: 351175100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-26 11:02:22,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 11:02:24,332][49750] Updated weights for policy 0, policy_version 158601 (0.0031) [2024-04-26 11:02:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2598649856. Throughput: 0: 50276.2. Samples: 351479960. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:02:27,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:02:28,078][49750] Updated weights for policy 0, policy_version 158611 (0.0029) [2024-04-26 11:02:31,247][49750] Updated weights for policy 0, policy_version 158621 (0.0033) [2024-04-26 11:02:32,062][49517] Fps is (10 sec: 55706.2, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 2598928384. Throughput: 0: 50326.7. Samples: 351790940. Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:02:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 11:02:34,619][49750] Updated weights for policy 0, policy_version 158631 (0.0028) [2024-04-26 11:02:37,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 50207.3). Total num frames: 2599124992. Throughput: 0: 50217.6. Samples: 351945380. Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:02:37,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 11:02:37,691][49750] Updated weights for policy 0, policy_version 158641 (0.0029) [2024-04-26 11:02:41,131][49750] Updated weights for policy 0, policy_version 158651 (0.0036) [2024-04-26 11:02:42,062][49517] Fps is (10 sec: 44236.7, 60 sec: 49698.3, 300 sec: 50040.6). Total num frames: 2599370752. Throughput: 0: 50185.9. Samples: 352243080. Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:02:42,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 11:02:42,081][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000158654_2599387136.pth... [2024-04-26 11:02:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000157917_2587312128.pth [2024-04-26 11:02:44,173][49750] Updated weights for policy 0, policy_version 158661 (0.0032) [2024-04-26 11:02:47,062][49517] Fps is (10 sec: 50790.0, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 2599632896. Throughput: 0: 49904.2. Samples: 352533040. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:02:47,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 11:02:47,537][49750] Updated weights for policy 0, policy_version 158671 (0.0035) [2024-04-26 11:02:50,825][49750] Updated weights for policy 0, policy_version 158681 (0.0038) [2024-04-26 11:02:52,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 2599911424. Throughput: 0: 50114.7. Samples: 352693240. Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:02:52,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 11:02:53,957][49750] Updated weights for policy 0, policy_version 158691 (0.0039) [2024-04-26 11:02:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2600140800. Throughput: 0: 49981.0. Samples: 352990520. Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:02:57,064][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 11:02:57,193][49750] Updated weights for policy 0, policy_version 158701 (0.0029) [2024-04-26 11:03:00,696][49750] Updated weights for policy 0, policy_version 158711 (0.0029) [2024-04-26 11:03:02,062][49517] Fps is (10 sec: 45875.9, 60 sec: 49971.4, 300 sec: 50096.2). Total num frames: 2600370176. Throughput: 0: 50123.6. Samples: 353296140. Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:03:02,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 11:03:03,711][49750] Updated weights for policy 0, policy_version 158721 (0.0032) [2024-04-26 11:03:07,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49425.2, 300 sec: 50151.7). Total num frames: 2600615936. Throughput: 0: 50161.5. Samples: 353432360. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:03:07,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 11:03:07,257][49750] Updated weights for policy 0, policy_version 158731 (0.0030) [2024-04-26 11:03:08,125][49728] Signal inference workers to stop experience collection... (5450 times) [2024-04-26 11:03:08,128][49728] Signal inference workers to resume experience collection... (5450 times) [2024-04-26 11:03:08,153][49750] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-04-26 11:03:08,153][49750] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-04-26 11:03:10,195][49750] Updated weights for policy 0, policy_version 158741 (0.0029) [2024-04-26 11:03:12,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2600910848. Throughput: 0: 50261.7. Samples: 353741740. Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:03:12,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 11:03:14,043][49750] Updated weights for policy 0, policy_version 158751 (0.0033) [2024-04-26 11:03:16,547][49750] Updated weights for policy 0, policy_version 158761 (0.0031) [2024-04-26 11:03:17,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2601140224. Throughput: 0: 49967.8. Samples: 354039500. Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:03:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 11:03:20,664][49750] Updated weights for policy 0, policy_version 158771 (0.0027) [2024-04-26 11:03:22,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2601385984. Throughput: 0: 49915.5. Samples: 354191580. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:03:22,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 11:03:23,122][49750] Updated weights for policy 0, policy_version 158781 (0.0032) [2024-04-26 11:03:27,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49425.1, 300 sec: 50040.6). Total num frames: 2601615360. Throughput: 0: 49889.8. Samples: 354488120. Policy #0 lag: (min: 0.0, avg: 13.3, max: 24.0) [2024-04-26 11:03:27,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 11:03:27,260][49750] Updated weights for policy 0, policy_version 158791 (0.0035) [2024-04-26 11:03:30,027][49750] Updated weights for policy 0, policy_version 158801 (0.0028) [2024-04-26 11:03:32,063][49517] Fps is (10 sec: 50789.6, 60 sec: 49424.9, 300 sec: 50262.8). Total num frames: 2601893888. Throughput: 0: 49975.4. Samples: 354781940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:03:32,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 11:03:33,615][49750] Updated weights for policy 0, policy_version 158811 (0.0036) [2024-04-26 11:03:36,562][49750] Updated weights for policy 0, policy_version 158821 (0.0028) [2024-04-26 11:03:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2602139648. Throughput: 0: 50128.0. Samples: 354949000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:03:37,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 11:03:39,983][49750] Updated weights for policy 0, policy_version 158831 (0.0031) [2024-04-26 11:03:42,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 2602401792. Throughput: 0: 50180.0. Samples: 355248620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:03:42,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 11:03:43,103][49750] Updated weights for policy 0, policy_version 158841 (0.0030) [2024-04-26 11:03:46,445][49750] Updated weights for policy 0, policy_version 158851 (0.0031) [2024-04-26 11:03:47,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2602631168. Throughput: 0: 50024.9. Samples: 355547260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:03:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 11:03:49,576][49750] Updated weights for policy 0, policy_version 158861 (0.0037) [2024-04-26 11:03:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2602909696. Throughput: 0: 50367.9. Samples: 355698920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:03:52,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 11:03:53,045][49750] Updated weights for policy 0, policy_version 158871 (0.0029) [2024-04-26 11:03:56,214][49750] Updated weights for policy 0, policy_version 158881 (0.0029) [2024-04-26 11:03:57,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2603155456. Throughput: 0: 50063.0. Samples: 355994580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:03:57,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 11:03:59,420][49750] Updated weights for policy 0, policy_version 158891 (0.0032) [2024-04-26 11:04:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2603401216. Throughput: 0: 50114.4. Samples: 356294640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:04:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 11:04:02,668][49750] Updated weights for policy 0, policy_version 158901 (0.0029) [2024-04-26 11:04:05,977][49750] Updated weights for policy 0, policy_version 158911 (0.0030) [2024-04-26 11:04:07,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50096.2). Total num frames: 2603646976. Throughput: 0: 50056.9. Samples: 356444140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:04:07,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 11:04:09,198][49750] Updated weights for policy 0, policy_version 158921 (0.0028) [2024-04-26 11:04:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2603892736. Throughput: 0: 50127.9. Samples: 356743880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:04:12,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 11:04:12,850][49750] Updated weights for policy 0, policy_version 158931 (0.0032) [2024-04-26 11:04:15,684][49750] Updated weights for policy 0, policy_version 158941 (0.0035) [2024-04-26 11:04:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2604154880. Throughput: 0: 50284.1. Samples: 357044720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:04:17,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 11:04:19,527][49750] Updated weights for policy 0, policy_version 158951 (0.0028) [2024-04-26 11:04:22,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2604400640. Throughput: 0: 50196.6. Samples: 357207840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:04:22,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 11:04:22,197][49750] Updated weights for policy 0, policy_version 158961 (0.0040) [2024-04-26 11:04:25,930][49750] Updated weights for policy 0, policy_version 158971 (0.0038) [2024-04-26 11:04:27,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2604646400. Throughput: 0: 50023.7. Samples: 357499680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:04:27,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 11:04:28,755][49750] Updated weights for policy 0, policy_version 158981 (0.0031) [2024-04-26 11:04:31,302][49728] Signal inference workers to stop experience collection... (5500 times) [2024-04-26 11:04:31,365][49728] Signal inference workers to resume experience collection... (5500 times) [2024-04-26 11:04:31,366][49750] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-04-26 11:04:31,379][49750] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-04-26 11:04:32,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.3, 300 sec: 50096.2). Total num frames: 2604892160. Throughput: 0: 50118.5. Samples: 357802600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:04:32,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 11:04:32,577][49750] Updated weights for policy 0, policy_version 158991 (0.0031) [2024-04-26 11:04:35,368][49750] Updated weights for policy 0, policy_version 159001 (0.0032) [2024-04-26 11:04:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2605170688. Throughput: 0: 50051.9. Samples: 357951260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:04:37,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 11:04:39,035][49750] Updated weights for policy 0, policy_version 159011 (0.0038) [2024-04-26 11:04:41,805][49750] Updated weights for policy 0, policy_version 159021 (0.0029) [2024-04-26 11:04:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2605400064. Throughput: 0: 50226.4. Samples: 358254760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:04:42,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 11:04:42,172][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000159022_2605416448.pth... [2024-04-26 11:04:42,217][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000158288_2593390592.pth [2024-04-26 11:04:45,536][49750] Updated weights for policy 0, policy_version 159031 (0.0033) [2024-04-26 11:04:47,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.3, 300 sec: 50207.3). Total num frames: 2605662208. Throughput: 0: 50348.5. Samples: 358560320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:04:47,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 11:04:48,458][49750] Updated weights for policy 0, policy_version 159041 (0.0038) [2024-04-26 11:04:51,942][49750] Updated weights for policy 0, policy_version 159051 (0.0034) [2024-04-26 11:04:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 49698.1, 300 sec: 49985.1). Total num frames: 2605891584. Throughput: 0: 50198.1. Samples: 358703060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:04:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 11:04:54,884][49750] Updated weights for policy 0, policy_version 159061 (0.0031) [2024-04-26 11:04:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2606153728. Throughput: 0: 50322.8. Samples: 359008400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:04:57,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 11:04:58,378][49750] Updated weights for policy 0, policy_version 159071 (0.0033) [2024-04-26 11:05:01,367][49750] Updated weights for policy 0, policy_version 159081 (0.0038) [2024-04-26 11:05:02,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2606415872. Throughput: 0: 50316.0. Samples: 359308940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:05:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 11:05:04,785][49750] Updated weights for policy 0, policy_version 159091 (0.0036) [2024-04-26 11:05:07,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2606645248. Throughput: 0: 50186.2. Samples: 359466220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:05:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 11:05:07,827][49750] Updated weights for policy 0, policy_version 159101 (0.0030) [2024-04-26 11:05:11,353][49750] Updated weights for policy 0, policy_version 159111 (0.0033) [2024-04-26 11:05:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50096.1). Total num frames: 2606907392. Throughput: 0: 50377.6. Samples: 359766680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:05:12,063][49517] Avg episode reward: [(0, '0.476')] [2024-04-26 11:05:14,269][49750] Updated weights for policy 0, policy_version 159121 (0.0037) [2024-04-26 11:05:17,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2607169536. Throughput: 0: 50226.8. Samples: 360062800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:05:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 11:05:17,934][49750] Updated weights for policy 0, policy_version 159131 (0.0032) [2024-04-26 11:05:20,761][49750] Updated weights for policy 0, policy_version 159141 (0.0028) [2024-04-26 11:05:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2607431680. Throughput: 0: 50314.4. Samples: 360215400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:05:22,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 11:05:24,343][49750] Updated weights for policy 0, policy_version 159151 (0.0024) [2024-04-26 11:05:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2607677440. Throughput: 0: 50426.2. Samples: 360523940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:05:27,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:05:27,319][49750] Updated weights for policy 0, policy_version 159161 (0.0032) [2024-04-26 11:05:30,838][49750] Updated weights for policy 0, policy_version 159171 (0.0036) [2024-04-26 11:05:32,062][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.3, 300 sec: 50040.6). Total num frames: 2607890432. Throughput: 0: 50265.7. Samples: 360822280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:05:32,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 11:05:33,914][49750] Updated weights for policy 0, policy_version 159181 (0.0034) [2024-04-26 11:05:37,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49698.2, 300 sec: 50096.2). Total num frames: 2608152576. Throughput: 0: 50262.7. Samples: 360964880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 11:05:37,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 11:05:37,282][49750] Updated weights for policy 0, policy_version 159191 (0.0027) [2024-04-26 11:05:40,326][49750] Updated weights for policy 0, policy_version 159201 (0.0031) [2024-04-26 11:05:42,062][49517] Fps is (10 sec: 55706.2, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2608447488. Throughput: 0: 50176.5. Samples: 361266340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:05:42,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 11:05:43,729][49750] Updated weights for policy 0, policy_version 159211 (0.0029) [2024-04-26 11:05:44,374][49728] Signal inference workers to stop experience collection... (5550 times) [2024-04-26 11:05:44,379][49728] Signal inference workers to resume experience collection... (5550 times) [2024-04-26 11:05:44,407][49750] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-04-26 11:05:44,408][49750] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-04-26 11:05:46,996][49750] Updated weights for policy 0, policy_version 159221 (0.0034) [2024-04-26 11:05:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.2, 300 sec: 50262.9). Total num frames: 2608676864. Throughput: 0: 50368.0. Samples: 361575500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:05:47,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 11:05:50,201][49750] Updated weights for policy 0, policy_version 159231 (0.0035) [2024-04-26 11:05:52,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2608922624. Throughput: 0: 50175.4. Samples: 361724120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:05:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 11:05:53,441][49750] Updated weights for policy 0, policy_version 159241 (0.0038) [2024-04-26 11:05:56,770][49750] Updated weights for policy 0, policy_version 159251 (0.0034) [2024-04-26 11:05:57,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.1, 300 sec: 50096.1). Total num frames: 2609168384. Throughput: 0: 50138.7. Samples: 362022920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:05:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 11:05:59,914][49750] Updated weights for policy 0, policy_version 159261 (0.0033) [2024-04-26 11:06:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2609414144. Throughput: 0: 50225.4. Samples: 362322940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:06:02,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 11:06:03,177][49750] Updated weights for policy 0, policy_version 159271 (0.0029) [2024-04-26 11:06:06,424][49750] Updated weights for policy 0, policy_version 159281 (0.0036) [2024-04-26 11:06:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2609676288. Throughput: 0: 50220.0. Samples: 362475300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:06:07,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 11:06:09,951][49750] Updated weights for policy 0, policy_version 159291 (0.0036) [2024-04-26 11:06:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2609922048. Throughput: 0: 50038.7. Samples: 362775680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:06:12,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 11:06:12,903][49750] Updated weights for policy 0, policy_version 159301 (0.0027) [2024-04-26 11:06:16,433][49750] Updated weights for policy 0, policy_version 159311 (0.0033) [2024-04-26 11:06:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 2610167808. Throughput: 0: 50100.1. Samples: 363076780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:06:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:06:19,497][49750] Updated weights for policy 0, policy_version 159321 (0.0032) [2024-04-26 11:06:22,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 50096.2). Total num frames: 2610413568. Throughput: 0: 50181.8. Samples: 363223060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:06:22,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 11:06:23,043][49750] Updated weights for policy 0, policy_version 159331 (0.0031) [2024-04-26 11:06:25,904][49750] Updated weights for policy 0, policy_version 159341 (0.0033) [2024-04-26 11:06:27,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2610692096. Throughput: 0: 50229.1. Samples: 363526660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:06:27,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 11:06:29,591][49750] Updated weights for policy 0, policy_version 159351 (0.0032) [2024-04-26 11:06:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2610937856. Throughput: 0: 50073.4. Samples: 363828800. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:06:32,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 11:06:32,368][49750] Updated weights for policy 0, policy_version 159361 (0.0033) [2024-04-26 11:06:36,149][49750] Updated weights for policy 0, policy_version 159371 (0.0033) [2024-04-26 11:06:37,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.3, 300 sec: 50096.2). Total num frames: 2611167232. Throughput: 0: 50140.6. Samples: 363980440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:06:37,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 11:06:39,052][49750] Updated weights for policy 0, policy_version 159381 (0.0033) [2024-04-26 11:06:42,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 50096.2). Total num frames: 2611429376. Throughput: 0: 50125.4. Samples: 364278560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:06:42,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:06:42,113][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000159390_2611445760.pth... [2024-04-26 11:06:42,160][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000158654_2599387136.pth [2024-04-26 11:06:42,598][49750] Updated weights for policy 0, policy_version 159391 (0.0029) [2024-04-26 11:06:45,464][49750] Updated weights for policy 0, policy_version 159401 (0.0031) [2024-04-26 11:06:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2611675136. Throughput: 0: 50247.9. Samples: 364584100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:06:47,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 11:06:49,068][49750] Updated weights for policy 0, policy_version 159411 (0.0033) [2024-04-26 11:06:51,994][49750] Updated weights for policy 0, policy_version 159421 (0.0033) [2024-04-26 11:06:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.5, 300 sec: 50262.8). 
Total num frames: 2611953664. Throughput: 0: 50268.9. Samples: 364737400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:06:52,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 11:06:54,551][49728] Signal inference workers to stop experience collection... (5600 times) [2024-04-26 11:06:54,552][49728] Signal inference workers to resume experience collection... (5600 times) [2024-04-26 11:06:54,580][49750] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-04-26 11:06:54,581][49750] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-04-26 11:06:55,413][49750] Updated weights for policy 0, policy_version 159431 (0.0029) [2024-04-26 11:06:57,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2612183040. Throughput: 0: 50278.9. Samples: 365038240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:06:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 11:06:58,517][49750] Updated weights for policy 0, policy_version 159441 (0.0032) [2024-04-26 11:07:01,931][49750] Updated weights for policy 0, policy_version 159451 (0.0032) [2024-04-26 11:07:02,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2612445184. Throughput: 0: 50376.3. Samples: 365343720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:07:02,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 11:07:04,912][49750] Updated weights for policy 0, policy_version 159461 (0.0029) [2024-04-26 11:07:07,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2612690944. Throughput: 0: 50436.1. Samples: 365492680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:07:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 11:07:08,567][49750] Updated weights for policy 0, policy_version 159471 (0.0032) [2024-04-26 11:07:11,388][49750] Updated weights for policy 0, policy_version 159481 (0.0026) [2024-04-26 11:07:12,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2612953088. Throughput: 0: 50452.4. Samples: 365797020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:07:12,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 11:07:14,939][49750] Updated weights for policy 0, policy_version 159491 (0.0033) [2024-04-26 11:07:17,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2613198848. Throughput: 0: 50575.0. Samples: 366104680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:07:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 11:07:17,833][49750] Updated weights for policy 0, policy_version 159501 (0.0033) [2024-04-26 11:07:21,312][49750] Updated weights for policy 0, policy_version 159511 (0.0033) [2024-04-26 11:07:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50151.7). Total num frames: 2613444608. Throughput: 0: 50472.5. Samples: 366251700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:07:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 11:07:24,363][49750] Updated weights for policy 0, policy_version 159521 (0.0033) [2024-04-26 11:07:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 50040.6). Total num frames: 2613690368. Throughput: 0: 50385.1. Samples: 366545900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:07:27,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 11:07:27,932][49750] Updated weights for policy 0, policy_version 159531 (0.0030) [2024-04-26 11:07:30,966][49750] Updated weights for policy 0, policy_version 159541 (0.0030) [2024-04-26 11:07:32,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2613968896. Throughput: 0: 50256.0. Samples: 366845620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:07:32,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 11:07:34,757][49750] Updated weights for policy 0, policy_version 159551 (0.0028) [2024-04-26 11:07:37,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2614214656. Throughput: 0: 50456.4. Samples: 367007940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:07:37,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 11:07:37,380][49750] Updated weights for policy 0, policy_version 159561 (0.0038) [2024-04-26 11:07:41,169][49750] Updated weights for policy 0, policy_version 159571 (0.0029) [2024-04-26 11:07:42,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2614444032. Throughput: 0: 50514.8. Samples: 367311400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:07:42,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 11:07:43,775][49750] Updated weights for policy 0, policy_version 159581 (0.0033) [2024-04-26 11:07:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2614706176. Throughput: 0: 50356.5. Samples: 367609760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:07:47,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 11:07:47,645][49750] Updated weights for policy 0, policy_version 159591 (0.0037) [2024-04-26 11:07:50,320][49750] Updated weights for policy 0, policy_version 159601 (0.0027) [2024-04-26 11:07:52,063][49517] Fps is (10 sec: 50789.8, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2614951936. Throughput: 0: 50345.2. Samples: 367758220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:07:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:07:54,251][49750] Updated weights for policy 0, policy_version 159611 (0.0037) [2024-04-26 11:07:56,905][49750] Updated weights for policy 0, policy_version 159621 (0.0029) [2024-04-26 11:07:57,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.6, 300 sec: 50429.4). Total num frames: 2615246848. Throughput: 0: 50305.1. Samples: 368060740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:07:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 11:08:00,890][49750] Updated weights for policy 0, policy_version 159631 (0.0031) [2024-04-26 11:08:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2615459840. Throughput: 0: 50156.1. Samples: 368361700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:02,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 11:08:03,371][49750] Updated weights for policy 0, policy_version 159641 (0.0030) [2024-04-26 11:08:07,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2615705600. Throughput: 0: 50117.0. Samples: 368506960. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:07,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 11:08:07,323][49750] Updated weights for policy 0, policy_version 159651 (0.0034) [2024-04-26 11:08:09,810][49750] Updated weights for policy 0, policy_version 159661 (0.0034) [2024-04-26 11:08:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2615967744. Throughput: 0: 50222.9. Samples: 368805920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:12,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 11:08:13,681][49728] Signal inference workers to stop experience collection... (5650 times) [2024-04-26 11:08:13,701][49750] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-04-26 11:08:13,786][49728] Signal inference workers to resume experience collection... (5650 times) [2024-04-26 11:08:13,787][49750] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-04-26 11:08:13,919][49750] Updated weights for policy 0, policy_version 159671 (0.0036) [2024-04-26 11:08:16,384][49750] Updated weights for policy 0, policy_version 159681 (0.0029) [2024-04-26 11:08:17,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.5, 300 sec: 50318.3). Total num frames: 2616229888. Throughput: 0: 50199.7. Samples: 369104600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 11:08:20,302][49750] Updated weights for policy 0, policy_version 159691 (0.0036) [2024-04-26 11:08:22,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2616459264. Throughput: 0: 50310.2. Samples: 369271900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 11:08:22,976][49750] Updated weights for policy 0, policy_version 159701 (0.0034) [2024-04-26 11:08:26,886][49750] Updated weights for policy 0, policy_version 159711 (0.0029) [2024-04-26 11:08:27,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.4, 300 sec: 50207.3). Total num frames: 2616705024. Throughput: 0: 50203.9. Samples: 369570580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:27,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 11:08:29,361][49750] Updated weights for policy 0, policy_version 159721 (0.0031) [2024-04-26 11:08:32,063][49517] Fps is (10 sec: 49151.3, 60 sec: 49698.0, 300 sec: 50207.2). Total num frames: 2616950784. Throughput: 0: 50106.0. Samples: 369864540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:32,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 11:08:33,331][49750] Updated weights for policy 0, policy_version 159731 (0.0037) [2024-04-26 11:08:35,874][49750] Updated weights for policy 0, policy_version 159741 (0.0035) [2024-04-26 11:08:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.2, 300 sec: 50207.3). Total num frames: 2617212928. Throughput: 0: 50225.0. Samples: 370018340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:37,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 11:08:39,806][49750] Updated weights for policy 0, policy_version 159751 (0.0036) [2024-04-26 11:08:42,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.2, 300 sec: 50373.8). Total num frames: 2617491456. Throughput: 0: 50357.4. Samples: 370326840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:42,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 11:08:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000159759_2617491456.pth... 
[2024-04-26 11:08:42,134][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000159022_2605416448.pth [2024-04-26 11:08:42,548][49750] Updated weights for policy 0, policy_version 159761 (0.0030) [2024-04-26 11:08:46,253][49750] Updated weights for policy 0, policy_version 159771 (0.0031) [2024-04-26 11:08:47,063][49517] Fps is (10 sec: 49151.2, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2617704448. Throughput: 0: 50296.3. Samples: 370625040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 11:08:47,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 11:08:49,113][49750] Updated weights for policy 0, policy_version 159781 (0.0038) [2024-04-26 11:08:52,063][49517] Fps is (10 sec: 45876.0, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2617950208. Throughput: 0: 50231.4. Samples: 370767380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:08:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 11:08:52,869][49750] Updated weights for policy 0, policy_version 159791 (0.0028) [2024-04-26 11:08:55,553][49750] Updated weights for policy 0, policy_version 159801 (0.0038) [2024-04-26 11:08:57,063][49517] Fps is (10 sec: 52429.4, 60 sec: 49698.0, 300 sec: 50262.8). Total num frames: 2618228736. Throughput: 0: 50293.8. Samples: 371069140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:08:57,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 11:08:59,473][49750] Updated weights for policy 0, policy_version 159811 (0.0034) [2024-04-26 11:09:02,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2618490880. Throughput: 0: 50435.5. Samples: 371374200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:02,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 11:09:02,074][49750] Updated weights for policy 0, policy_version 159821 (0.0032) [2024-04-26 11:09:05,898][49750] Updated weights for policy 0, policy_version 159831 (0.0031) [2024-04-26 11:09:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2618720256. Throughput: 0: 50247.2. Samples: 371533020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:07,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 11:09:08,781][49750] Updated weights for policy 0, policy_version 159841 (0.0030) [2024-04-26 11:09:12,063][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2618966016. Throughput: 0: 50185.8. Samples: 371828940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:12,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 11:09:12,385][49750] Updated weights for policy 0, policy_version 159851 (0.0029) [2024-04-26 11:09:15,325][49750] Updated weights for policy 0, policy_version 159861 (0.0030) [2024-04-26 11:09:17,063][49517] Fps is (10 sec: 50789.5, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2619228160. Throughput: 0: 50285.9. Samples: 372127400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 11:09:19,042][49750] Updated weights for policy 0, policy_version 159871 (0.0036) [2024-04-26 11:09:21,695][49750] Updated weights for policy 0, policy_version 159881 (0.0035) [2024-04-26 11:09:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2619490304. Throughput: 0: 50359.5. Samples: 372284520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:22,071][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 11:09:25,678][49750] Updated weights for policy 0, policy_version 159891 (0.0036) [2024-04-26 11:09:26,335][49728] Signal inference workers to stop experience collection... (5700 times) [2024-04-26 11:09:26,395][49750] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-04-26 11:09:26,395][49728] Signal inference workers to resume experience collection... (5700 times) [2024-04-26 11:09:26,413][49750] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-04-26 11:09:27,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2619736064. Throughput: 0: 50182.0. Samples: 372585020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:27,072][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 11:09:28,254][49750] Updated weights for policy 0, policy_version 159901 (0.0035) [2024-04-26 11:09:32,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2619965440. Throughput: 0: 50179.6. Samples: 372883120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:32,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 11:09:32,083][49750] Updated weights for policy 0, policy_version 159911 (0.0034) [2024-04-26 11:09:35,006][49750] Updated weights for policy 0, policy_version 159921 (0.0031) [2024-04-26 11:09:37,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2620211200. Throughput: 0: 50197.5. Samples: 373026260. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:37,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 11:09:38,524][49750] Updated weights for policy 0, policy_version 159931 (0.0033) [2024-04-26 11:09:41,386][49750] Updated weights for policy 0, policy_version 159941 (0.0037) [2024-04-26 11:09:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 49971.4, 300 sec: 50262.8). Total num frames: 2620489728. Throughput: 0: 50176.9. Samples: 373327100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:42,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 11:09:45,359][49750] Updated weights for policy 0, policy_version 159951 (0.0033) [2024-04-26 11:09:47,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2620751872. Throughput: 0: 50043.6. Samples: 373626160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 11:09:47,818][49750] Updated weights for policy 0, policy_version 159961 (0.0030) [2024-04-26 11:09:51,730][49750] Updated weights for policy 0, policy_version 159971 (0.0030) [2024-04-26 11:09:52,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2620981248. Throughput: 0: 50144.8. Samples: 373789540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 11:09:52,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 11:09:54,411][49750] Updated weights for policy 0, policy_version 159981 (0.0029) [2024-04-26 11:09:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2621227008. Throughput: 0: 50264.0. Samples: 374090820. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:09:57,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 11:09:58,163][49750] Updated weights for policy 0, policy_version 159991 (0.0031) [2024-04-26 11:10:01,067][49750] Updated weights for policy 0, policy_version 160001 (0.0033) [2024-04-26 11:10:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 2621472768. Throughput: 0: 50114.8. Samples: 374382560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:02,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 11:10:04,678][49750] Updated weights for policy 0, policy_version 160011 (0.0034) [2024-04-26 11:10:07,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2621751296. Throughput: 0: 50181.5. Samples: 374542680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:07,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 11:10:07,458][49750] Updated weights for policy 0, policy_version 160021 (0.0029) [2024-04-26 11:10:11,225][49750] Updated weights for policy 0, policy_version 160031 (0.0030) [2024-04-26 11:10:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2621980672. Throughput: 0: 50185.8. Samples: 374843380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:12,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 11:10:14,221][49750] Updated weights for policy 0, policy_version 160041 (0.0028) [2024-04-26 11:10:17,062][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2622226432. Throughput: 0: 50120.6. Samples: 375138540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:10:17,687][49750] Updated weights for policy 0, policy_version 160051 (0.0037) [2024-04-26 11:10:20,620][49750] Updated weights for policy 0, policy_version 160061 (0.0029) [2024-04-26 11:10:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49698.2, 300 sec: 50151.7). Total num frames: 2622472192. Throughput: 0: 50173.3. Samples: 375284060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:22,063][49517] Avg episode reward: [(0, '0.436')] [2024-04-26 11:10:24,145][49750] Updated weights for policy 0, policy_version 160071 (0.0031) [2024-04-26 11:10:27,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2622750720. Throughput: 0: 50320.7. Samples: 375591540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:27,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 11:10:27,177][49750] Updated weights for policy 0, policy_version 160081 (0.0032) [2024-04-26 11:10:30,743][49750] Updated weights for policy 0, policy_version 160091 (0.0038) [2024-04-26 11:10:32,063][49517] Fps is (10 sec: 54066.0, 60 sec: 50790.4, 300 sec: 50373.8). Total num frames: 2623012864. Throughput: 0: 50410.5. Samples: 375894640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:32,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 11:10:33,964][49750] Updated weights for policy 0, policy_version 160101 (0.0032) [2024-04-26 11:10:37,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2623242240. Throughput: 0: 50158.2. Samples: 376046660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:37,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 11:10:37,181][49750] Updated weights for policy 0, policy_version 160111 (0.0031) [2024-04-26 11:10:40,511][49750] Updated weights for policy 0, policy_version 160121 (0.0031) [2024-04-26 11:10:42,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2623488000. Throughput: 0: 50203.5. Samples: 376349980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:42,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 11:10:42,152][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000160126_2623504384.pth... [2024-04-26 11:10:42,206][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000159390_2611445760.pth [2024-04-26 11:10:43,710][49750] Updated weights for policy 0, policy_version 160131 (0.0026) [2024-04-26 11:10:47,019][49750] Updated weights for policy 0, policy_version 160141 (0.0030) [2024-04-26 11:10:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2623750144. Throughput: 0: 50434.2. Samples: 376652100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 11:10:50,412][49750] Updated weights for policy 0, policy_version 160151 (0.0033) [2024-04-26 11:10:52,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2624012288. Throughput: 0: 50236.3. Samples: 376803320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 11:10:53,581][49750] Updated weights for policy 0, policy_version 160161 (0.0029) [2024-04-26 11:10:56,778][49750] Updated weights for policy 0, policy_version 160171 (0.0035) [2024-04-26 11:10:57,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50318.3). 
Total num frames: 2624258048. Throughput: 0: 50351.5. Samples: 377109200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 11:10:57,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 11:11:00,045][49750] Updated weights for policy 0, policy_version 160181 (0.0030) [2024-04-26 11:11:02,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2624487424. Throughput: 0: 50484.0. Samples: 377410320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:02,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 11:11:03,252][49750] Updated weights for policy 0, policy_version 160191 (0.0028) [2024-04-26 11:11:03,742][49728] Signal inference workers to stop experience collection... (5750 times) [2024-04-26 11:11:03,742][49728] Signal inference workers to resume experience collection... (5750 times) [2024-04-26 11:11:03,753][49750] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-04-26 11:11:03,753][49750] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-04-26 11:11:06,510][49750] Updated weights for policy 0, policy_version 160201 (0.0031) [2024-04-26 11:11:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2624749568. Throughput: 0: 50379.5. Samples: 377551140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 11:11:09,775][49750] Updated weights for policy 0, policy_version 160211 (0.0031) [2024-04-26 11:11:12,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2625028096. Throughput: 0: 50361.8. Samples: 377857820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:12,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 11:11:12,930][49750] Updated weights for policy 0, policy_version 160221 (0.0037) [2024-04-26 11:11:16,367][49750] Updated weights for policy 0, policy_version 160231 (0.0033) [2024-04-26 11:11:17,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2625273856. Throughput: 0: 50464.1. Samples: 378165520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:11:19,342][49750] Updated weights for policy 0, policy_version 160241 (0.0035) [2024-04-26 11:11:22,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.3, 300 sec: 50207.3). Total num frames: 2625503232. Throughput: 0: 50403.5. Samples: 378314820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:22,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:11:22,742][49750] Updated weights for policy 0, policy_version 160251 (0.0031) [2024-04-26 11:11:26,026][49750] Updated weights for policy 0, policy_version 160261 (0.0031) [2024-04-26 11:11:27,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2625748992. Throughput: 0: 50363.0. Samples: 378616320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 11:11:29,229][49750] Updated weights for policy 0, policy_version 160271 (0.0030) [2024-04-26 11:11:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2626011136. Throughput: 0: 50235.1. Samples: 378912680. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:32,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 11:11:32,640][49750] Updated weights for policy 0, policy_version 160281 (0.0032) [2024-04-26 11:11:35,831][49750] Updated weights for policy 0, policy_version 160291 (0.0032) [2024-04-26 11:11:37,063][49517] Fps is (10 sec: 54067.7, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2626289664. Throughput: 0: 50248.0. Samples: 379064480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:37,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 11:11:39,226][49750] Updated weights for policy 0, policy_version 160301 (0.0037) [2024-04-26 11:11:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2626519040. Throughput: 0: 50329.0. Samples: 379374000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:42,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 11:11:42,346][49750] Updated weights for policy 0, policy_version 160311 (0.0028) [2024-04-26 11:11:45,809][49750] Updated weights for policy 0, policy_version 160321 (0.0029) [2024-04-26 11:11:47,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2626748416. Throughput: 0: 50420.5. Samples: 379679240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:47,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:11:48,750][49750] Updated weights for policy 0, policy_version 160331 (0.0030) [2024-04-26 11:11:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2627010560. Throughput: 0: 50300.4. Samples: 379814660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:52,063][49517] Avg episode reward: [(0, '0.408')] [2024-04-26 11:11:52,224][49750] Updated weights for policy 0, policy_version 160341 (0.0039) [2024-04-26 11:11:55,219][49750] Updated weights for policy 0, policy_version 160351 (0.0026) [2024-04-26 11:11:57,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2627289088. Throughput: 0: 50294.4. Samples: 380121060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:11:57,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 11:11:58,630][49750] Updated weights for policy 0, policy_version 160361 (0.0028) [2024-04-26 11:12:01,296][49728] Signal inference workers to stop experience collection... (5800 times) [2024-04-26 11:12:01,297][49728] Signal inference workers to resume experience collection... (5800 times) [2024-04-26 11:12:01,331][49750] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-04-26 11:12:01,331][49750] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-04-26 11:12:01,838][49750] Updated weights for policy 0, policy_version 160371 (0.0028) [2024-04-26 11:12:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2627518464. Throughput: 0: 50297.1. Samples: 380428880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:12:02,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 11:12:05,008][49750] Updated weights for policy 0, policy_version 160381 (0.0042) [2024-04-26 11:12:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2627780608. Throughput: 0: 50251.7. Samples: 380576140. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:07,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 11:12:08,505][49750] Updated weights for policy 0, policy_version 160391 (0.0032) [2024-04-26 11:12:11,428][49750] Updated weights for policy 0, policy_version 160401 (0.0033) [2024-04-26 11:12:12,062][49517] Fps is (10 sec: 49151.3, 60 sec: 49698.3, 300 sec: 50207.3). Total num frames: 2628009984. Throughput: 0: 50273.5. Samples: 380878620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:12,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 11:12:14,986][49750] Updated weights for policy 0, policy_version 160411 (0.0032) [2024-04-26 11:12:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2628288512. Throughput: 0: 50204.9. Samples: 381171900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:17,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 11:12:17,937][49750] Updated weights for policy 0, policy_version 160421 (0.0034) [2024-04-26 11:12:21,394][49750] Updated weights for policy 0, policy_version 160431 (0.0029) [2024-04-26 11:12:22,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2628550656. Throughput: 0: 50331.1. Samples: 381329380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:22,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 11:12:24,329][49750] Updated weights for policy 0, policy_version 160441 (0.0036) [2024-04-26 11:12:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50207.2). Total num frames: 2628780032. Throughput: 0: 50295.5. Samples: 381637300. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:27,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 11:12:27,750][49750] Updated weights for policy 0, policy_version 160451 (0.0035) [2024-04-26 11:12:31,138][49750] Updated weights for policy 0, policy_version 160461 (0.0032) [2024-04-26 11:12:32,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2629042176. Throughput: 0: 50374.2. Samples: 381946080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:32,071][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 11:12:34,240][49750] Updated weights for policy 0, policy_version 160471 (0.0030) [2024-04-26 11:12:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2629287936. Throughput: 0: 50520.1. Samples: 382088060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 11:12:37,974][49750] Updated weights for policy 0, policy_version 160481 (0.0034) [2024-04-26 11:12:40,665][49750] Updated weights for policy 0, policy_version 160491 (0.0031) [2024-04-26 11:12:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2629566464. Throughput: 0: 50488.8. Samples: 382393060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:42,072][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:12:42,083][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000160496_2629566464.pth... [2024-04-26 11:12:42,132][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000159759_2617491456.pth [2024-04-26 11:12:44,374][49750] Updated weights for policy 0, policy_version 160501 (0.0028) [2024-04-26 11:12:47,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.2, 300 sec: 50318.3). Total num frames: 2629795840. Throughput: 0: 50477.0. Samples: 382700360. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:47,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 11:12:47,086][49750] Updated weights for policy 0, policy_version 160511 (0.0032) [2024-04-26 11:12:50,856][49750] Updated weights for policy 0, policy_version 160521 (0.0032) [2024-04-26 11:12:52,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.2, 300 sec: 50151.7). Total num frames: 2630041600. Throughput: 0: 50377.9. Samples: 382843160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:52,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 11:12:53,660][49750] Updated weights for policy 0, policy_version 160531 (0.0030) [2024-04-26 11:12:57,062][49517] Fps is (10 sec: 47514.4, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2630270976. Throughput: 0: 50355.1. Samples: 383144600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:12:57,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:12:57,199][49728] Signal inference workers to stop experience collection... (5850 times) [2024-04-26 11:12:57,199][49728] Signal inference workers to resume experience collection... (5850 times) [2024-04-26 11:12:57,211][49750] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-04-26 11:12:57,211][49750] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-04-26 11:12:57,330][49750] Updated weights for policy 0, policy_version 160541 (0.0034) [2024-04-26 11:13:00,294][49750] Updated weights for policy 0, policy_version 160551 (0.0030) [2024-04-26 11:13:02,063][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2630549504. Throughput: 0: 50343.5. Samples: 383437360. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:13:02,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 11:13:03,865][49750] Updated weights for policy 0, policy_version 160561 (0.0035) [2024-04-26 11:13:06,950][49750] Updated weights for policy 0, policy_version 160571 (0.0033) [2024-04-26 11:13:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2630795264. Throughput: 0: 50417.5. Samples: 383598160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 11:13:07,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 11:13:10,318][49750] Updated weights for policy 0, policy_version 160581 (0.0034) [2024-04-26 11:13:12,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.1, 300 sec: 50207.2). Total num frames: 2631041024. Throughput: 0: 50340.6. Samples: 383902640. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:12,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:13:13,426][49750] Updated weights for policy 0, policy_version 160591 (0.0034) [2024-04-26 11:13:16,715][49750] Updated weights for policy 0, policy_version 160601 (0.0036) [2024-04-26 11:13:17,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2631286784. Throughput: 0: 50222.5. Samples: 384206100. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:17,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 11:13:19,959][49750] Updated weights for policy 0, policy_version 160611 (0.0032) [2024-04-26 11:13:22,062][49517] Fps is (10 sec: 50791.9, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2631548928. Throughput: 0: 50384.8. Samples: 384355380. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:22,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 11:13:23,129][49750] Updated weights for policy 0, policy_version 160621 (0.0037) [2024-04-26 11:13:26,383][49750] Updated weights for policy 0, policy_version 160631 (0.0033) [2024-04-26 11:13:27,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2631827456. Throughput: 0: 50323.7. Samples: 384657620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:27,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:13:29,589][49750] Updated weights for policy 0, policy_version 160641 (0.0033) [2024-04-26 11:13:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2632056832. Throughput: 0: 50146.8. Samples: 384956960. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:32,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:13:32,947][49750] Updated weights for policy 0, policy_version 160651 (0.0032) [2024-04-26 11:13:36,312][49750] Updated weights for policy 0, policy_version 160661 (0.0035) [2024-04-26 11:13:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2632318976. Throughput: 0: 50186.1. Samples: 385101520. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 11:13:39,372][49750] Updated weights for policy 0, policy_version 160671 (0.0029) [2024-04-26 11:13:42,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49425.2, 300 sec: 50262.8). Total num frames: 2632531968. Throughput: 0: 50299.2. Samples: 385408060. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:42,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 11:13:42,829][49750] Updated weights for policy 0, policy_version 160681 (0.0032) [2024-04-26 11:13:45,829][49750] Updated weights for policy 0, policy_version 160691 (0.0035) [2024-04-26 11:13:46,208][49728] Signal inference workers to stop experience collection... (5900 times) [2024-04-26 11:13:46,208][49728] Signal inference workers to resume experience collection... (5900 times) [2024-04-26 11:13:46,221][49750] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-04-26 11:13:46,221][49750] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-04-26 11:13:47,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2632810496. Throughput: 0: 50315.2. Samples: 385701540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:47,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 11:13:49,397][49750] Updated weights for policy 0, policy_version 160701 (0.0040) [2024-04-26 11:13:52,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2633056256. Throughput: 0: 50276.7. Samples: 385860620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:52,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 11:13:52,399][49750] Updated weights for policy 0, policy_version 160711 (0.0031) [2024-04-26 11:13:55,956][49750] Updated weights for policy 0, policy_version 160721 (0.0029) [2024-04-26 11:13:57,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 2633302016. Throughput: 0: 50164.2. Samples: 386160020. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:13:57,068][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 11:13:58,846][49750] Updated weights for policy 0, policy_version 160731 (0.0026) [2024-04-26 11:14:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2633547776. Throughput: 0: 50208.1. Samples: 386465460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:14:02,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 11:14:02,289][49750] Updated weights for policy 0, policy_version 160741 (0.0029) [2024-04-26 11:14:05,327][49750] Updated weights for policy 0, policy_version 160751 (0.0033) [2024-04-26 11:14:07,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.2, 300 sec: 50373.9). Total num frames: 2633826304. Throughput: 0: 50349.7. Samples: 386621120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:14:07,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 11:14:08,758][49750] Updated weights for policy 0, policy_version 160761 (0.0033) [2024-04-26 11:14:11,903][49750] Updated weights for policy 0, policy_version 160771 (0.0032) [2024-04-26 11:14:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.6, 300 sec: 50318.3). Total num frames: 2634072064. Throughput: 0: 50339.6. Samples: 386922900. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-04-26 11:14:12,063][49517] Avg episode reward: [(0, '0.433')] [2024-04-26 11:14:15,374][49750] Updated weights for policy 0, policy_version 160781 (0.0031) [2024-04-26 11:14:17,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2634317824. Throughput: 0: 50266.2. Samples: 387218940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:14:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 11:14:18,344][49750] Updated weights for policy 0, policy_version 160791 (0.0033) [2024-04-26 11:14:21,695][49750] Updated weights for policy 0, policy_version 160801 (0.0031) [2024-04-26 11:14:22,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2634579968. Throughput: 0: 50345.6. Samples: 387367080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:14:22,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 11:14:24,693][49750] Updated weights for policy 0, policy_version 160811 (0.0031) [2024-04-26 11:14:27,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 2634809344. Throughput: 0: 50278.5. Samples: 387670600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:14:27,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 11:14:28,133][49750] Updated weights for policy 0, policy_version 160821 (0.0027) [2024-04-26 11:14:31,085][49750] Updated weights for policy 0, policy_version 160831 (0.0032) [2024-04-26 11:14:32,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2635104256. Throughput: 0: 50629.7. Samples: 387979880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:14:32,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 11:14:34,621][49750] Updated weights for policy 0, policy_version 160841 (0.0032) [2024-04-26 11:14:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2635317248. Throughput: 0: 50517.4. Samples: 388133900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:14:37,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 11:14:37,679][49750] Updated weights for policy 0, policy_version 160851 (0.0034) [2024-04-26 11:14:41,216][49750] Updated weights for policy 0, policy_version 160861 (0.0029) [2024-04-26 11:14:42,063][49517] Fps is (10 sec: 49151.1, 60 sec: 51063.2, 300 sec: 50318.3). Total num frames: 2635595776. Throughput: 0: 50640.8. Samples: 388438860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:14:42,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 11:14:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000160864_2635595776.pth... [2024-04-26 11:14:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000160126_2623504384.pth [2024-04-26 11:14:44,163][49750] Updated weights for policy 0, policy_version 160871 (0.0028) [2024-04-26 11:14:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2635792384. Throughput: 0: 50482.7. Samples: 388737180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:14:47,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 11:14:47,924][49750] Updated weights for policy 0, policy_version 160881 (0.0031) [2024-04-26 11:14:50,539][49750] Updated weights for policy 0, policy_version 160891 (0.0027) [2024-04-26 11:14:52,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 2636087296. Throughput: 0: 50238.9. Samples: 388881880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:14:52,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 11:14:53,719][49728] Signal inference workers to stop experience collection... (5950 times) [2024-04-26 11:14:53,759][49750] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-04-26 11:14:53,793][49728] Signal inference workers to resume experience collection... 
(5950 times) [2024-04-26 11:14:53,794][49750] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-04-26 11:14:54,539][49750] Updated weights for policy 0, policy_version 160901 (0.0032) [2024-04-26 11:14:57,062][49517] Fps is (10 sec: 55706.1, 60 sec: 50790.6, 300 sec: 50429.4). Total num frames: 2636349440. Throughput: 0: 50313.3. Samples: 389187000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:14:57,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 11:14:57,095][49750] Updated weights for policy 0, policy_version 160911 (0.0030) [2024-04-26 11:15:00,870][49750] Updated weights for policy 0, policy_version 160921 (0.0035) [2024-04-26 11:15:02,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2636595200. Throughput: 0: 50460.0. Samples: 389489640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:15:02,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:15:03,676][49750] Updated weights for policy 0, policy_version 160931 (0.0031) [2024-04-26 11:15:07,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2636824576. Throughput: 0: 50384.1. Samples: 389634360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:15:07,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 11:15:07,387][49750] Updated weights for policy 0, policy_version 160941 (0.0039) [2024-04-26 11:15:10,292][49750] Updated weights for policy 0, policy_version 160951 (0.0029) [2024-04-26 11:15:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2637086720. Throughput: 0: 50362.3. Samples: 389936900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 11:15:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 11:15:14,022][49750] Updated weights for policy 0, policy_version 160961 (0.0027) [2024-04-26 11:15:16,872][49750] Updated weights for policy 0, policy_version 160971 (0.0028) [2024-04-26 11:15:17,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2637365248. Throughput: 0: 50275.5. Samples: 390242280. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:15:17,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 11:15:20,442][49750] Updated weights for policy 0, policy_version 160981 (0.0028) [2024-04-26 11:15:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2637611008. Throughput: 0: 50354.7. Samples: 390399860. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:15:22,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 11:15:23,310][49750] Updated weights for policy 0, policy_version 160991 (0.0033) [2024-04-26 11:15:26,844][49750] Updated weights for policy 0, policy_version 161001 (0.0034) [2024-04-26 11:15:27,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2637840384. Throughput: 0: 50306.5. Samples: 390702640. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:15:27,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:15:29,774][49750] Updated weights for policy 0, policy_version 161011 (0.0029) [2024-04-26 11:15:32,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2638086144. Throughput: 0: 50339.2. Samples: 391002440. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:15:32,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 11:15:33,392][49750] Updated weights for policy 0, policy_version 161021 (0.0029) [2024-04-26 11:15:36,140][49750] Updated weights for policy 0, policy_version 161031 (0.0031) [2024-04-26 11:15:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2638348288. Throughput: 0: 50370.0. Samples: 391148520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:15:37,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 11:15:40,198][49750] Updated weights for policy 0, policy_version 161041 (0.0031) [2024-04-26 11:15:42,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2638626816. Throughput: 0: 50314.1. Samples: 391451140. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:15:42,071][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 11:15:42,743][49750] Updated weights for policy 0, policy_version 161051 (0.0030) [2024-04-26 11:15:46,614][49750] Updated weights for policy 0, policy_version 161061 (0.0032) [2024-04-26 11:15:47,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 2638839808. Throughput: 0: 50295.9. Samples: 391752960. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:15:47,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 11:15:47,656][49728] Signal inference workers to stop experience collection... (6000 times) [2024-04-26 11:15:47,656][49728] Signal inference workers to resume experience collection... 
(6000 times) [2024-04-26 11:15:47,689][49750] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-04-26 11:15:47,710][49750] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-04-26 11:15:49,183][49750] Updated weights for policy 0, policy_version 161071 (0.0036) [2024-04-26 11:15:52,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.4, 300 sec: 50262.8). Total num frames: 2639085568. Throughput: 0: 50184.0. Samples: 391892640. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:15:52,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 11:15:53,165][49750] Updated weights for policy 0, policy_version 161081 (0.0029) [2024-04-26 11:15:55,613][49750] Updated weights for policy 0, policy_version 161091 (0.0031) [2024-04-26 11:15:57,062][49517] Fps is (10 sec: 50791.1, 60 sec: 49971.1, 300 sec: 50373.9). Total num frames: 2639347712. Throughput: 0: 50305.8. Samples: 392200660. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:15:57,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 11:15:59,754][49750] Updated weights for policy 0, policy_version 161101 (0.0032) [2024-04-26 11:16:02,046][49750] Updated weights for policy 0, policy_version 161111 (0.0032) [2024-04-26 11:16:02,062][49517] Fps is (10 sec: 55705.6, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2639642624. Throughput: 0: 50120.1. Samples: 392497680. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:16:02,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 11:16:06,109][49750] Updated weights for policy 0, policy_version 161121 (0.0028) [2024-04-26 11:16:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2639855616. Throughput: 0: 50361.0. Samples: 392666100. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:16:07,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 11:16:08,571][49750] Updated weights for policy 0, policy_version 161131 (0.0028) [2024-04-26 11:16:12,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2640101376. Throughput: 0: 50310.6. Samples: 392966620. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:16:12,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 11:16:12,485][49750] Updated weights for policy 0, policy_version 161141 (0.0031) [2024-04-26 11:16:15,133][49750] Updated weights for policy 0, policy_version 161151 (0.0030) [2024-04-26 11:16:17,062][49517] Fps is (10 sec: 49151.5, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2640347136. Throughput: 0: 50181.3. Samples: 393260600. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-04-26 11:16:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 11:16:19,001][49750] Updated weights for policy 0, policy_version 161161 (0.0031) [2024-04-26 11:16:21,644][49750] Updated weights for policy 0, policy_version 161171 (0.0038) [2024-04-26 11:16:22,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2640625664. Throughput: 0: 50470.3. Samples: 393419680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:16:22,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 11:16:25,554][49750] Updated weights for policy 0, policy_version 161181 (0.0028) [2024-04-26 11:16:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2640871424. Throughput: 0: 50444.1. Samples: 393721120. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:16:27,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 11:16:28,186][49750] Updated weights for policy 0, policy_version 161191 (0.0031) [2024-04-26 11:16:32,049][49750] Updated weights for policy 0, policy_version 161201 (0.0028) [2024-04-26 11:16:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2641117184. Throughput: 0: 50434.4. Samples: 394022500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:16:32,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 11:16:34,662][49750] Updated weights for policy 0, policy_version 161211 (0.0029) [2024-04-26 11:16:37,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2641362944. Throughput: 0: 50464.9. Samples: 394163560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:16:37,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 11:16:38,501][49750] Updated weights for policy 0, policy_version 161221 (0.0028) [2024-04-26 11:16:41,154][49750] Updated weights for policy 0, policy_version 161231 (0.0033) [2024-04-26 11:16:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2641641472. Throughput: 0: 50334.3. Samples: 394465700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:16:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 11:16:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000161233_2641641472.pth... [2024-04-26 11:16:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000160496_2629566464.pth [2024-04-26 11:16:45,030][49750] Updated weights for policy 0, policy_version 161241 (0.0033) [2024-04-26 11:16:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2641887232. Throughput: 0: 50488.3. Samples: 394769660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:16:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:16:47,732][49750] Updated weights for policy 0, policy_version 161251 (0.0030) [2024-04-26 11:16:51,740][49750] Updated weights for policy 0, policy_version 161261 (0.0030) [2024-04-26 11:16:52,062][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2642100224. Throughput: 0: 50131.8. Samples: 394922040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:16:52,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:16:54,177][49750] Updated weights for policy 0, policy_version 161271 (0.0031) [2024-04-26 11:16:57,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2642362368. Throughput: 0: 50093.7. Samples: 395220840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:16:57,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 11:16:58,161][49750] Updated weights for policy 0, policy_version 161281 (0.0030) [2024-04-26 11:17:00,802][49750] Updated weights for policy 0, policy_version 161291 (0.0031) [2024-04-26 11:17:02,062][49517] Fps is (10 sec: 54067.6, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2642640896. Throughput: 0: 50185.4. Samples: 395518940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:17:02,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 11:17:04,722][49750] Updated weights for policy 0, policy_version 161301 (0.0037) [2024-04-26 11:17:05,363][49728] Signal inference workers to stop experience collection... (6050 times) [2024-04-26 11:17:05,413][49750] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-04-26 11:17:05,433][49728] Signal inference workers to resume experience collection... 
(6050 times) [2024-04-26 11:17:05,435][49750] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-04-26 11:17:07,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2642886656. Throughput: 0: 50240.0. Samples: 395680480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:17:07,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 11:17:07,437][49750] Updated weights for policy 0, policy_version 161311 (0.0034) [2024-04-26 11:17:11,086][49750] Updated weights for policy 0, policy_version 161321 (0.0030) [2024-04-26 11:17:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2643132416. Throughput: 0: 50231.5. Samples: 395981540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:17:12,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 11:17:13,928][49750] Updated weights for policy 0, policy_version 161331 (0.0040) [2024-04-26 11:17:17,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2643378176. Throughput: 0: 50227.5. Samples: 396282740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:17:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 11:17:17,560][49750] Updated weights for policy 0, policy_version 161341 (0.0033) [2024-04-26 11:17:20,314][49750] Updated weights for policy 0, policy_version 161351 (0.0035) [2024-04-26 11:17:22,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 2643607552. Throughput: 0: 50377.3. Samples: 396430540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 11:17:22,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 11:17:24,130][49750] Updated weights for policy 0, policy_version 161361 (0.0030) [2024-04-26 11:17:26,900][49750] Updated weights for policy 0, policy_version 161371 (0.0031) [2024-04-26 11:17:27,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2643902464. Throughput: 0: 50393.4. Samples: 396733400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:17:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 11:17:30,458][49750] Updated weights for policy 0, policy_version 161381 (0.0028) [2024-04-26 11:17:32,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2644148224. Throughput: 0: 50419.7. Samples: 397038540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:17:32,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 11:17:33,535][49750] Updated weights for policy 0, policy_version 161391 (0.0027) [2024-04-26 11:17:37,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2644377600. Throughput: 0: 50438.3. Samples: 397191760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:17:37,063][49517] Avg episode reward: [(0, '0.436')] [2024-04-26 11:17:37,073][49750] Updated weights for policy 0, policy_version 161401 (0.0029) [2024-04-26 11:17:39,940][49750] Updated weights for policy 0, policy_version 161411 (0.0029) [2024-04-26 11:17:42,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50318.4). Total num frames: 2644639744. Throughput: 0: 50371.7. Samples: 397487560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:17:42,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 11:17:43,699][49750] Updated weights for policy 0, policy_version 161421 (0.0030) [2024-04-26 11:17:46,314][49750] Updated weights for policy 0, policy_version 161431 (0.0034) [2024-04-26 11:17:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2644901888. Throughput: 0: 50370.2. Samples: 397785600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:17:47,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 11:17:50,255][49750] Updated weights for policy 0, policy_version 161441 (0.0033) [2024-04-26 11:17:52,062][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 2645164032. Throughput: 0: 50362.6. Samples: 397946800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:17:52,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 11:17:52,992][49750] Updated weights for policy 0, policy_version 161451 (0.0034) [2024-04-26 11:17:56,608][49750] Updated weights for policy 0, policy_version 161461 (0.0033) [2024-04-26 11:17:57,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2645393408. Throughput: 0: 50428.0. Samples: 398250800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:17:57,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 11:17:59,466][49750] Updated weights for policy 0, policy_version 161471 (0.0037) [2024-04-26 11:18:02,063][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2645639168. Throughput: 0: 50323.4. Samples: 398547300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:18:02,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 11:18:03,010][49750] Updated weights for policy 0, policy_version 161481 (0.0038) [2024-04-26 11:18:05,994][49750] Updated weights for policy 0, policy_version 161491 (0.0038) [2024-04-26 11:18:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50318.4). Total num frames: 2645884928. Throughput: 0: 50188.9. Samples: 398689040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:18:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 11:18:09,687][49750] Updated weights for policy 0, policy_version 161501 (0.0030) [2024-04-26 11:18:12,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2646163456. Throughput: 0: 50345.2. Samples: 398998940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:18:12,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 11:18:12,627][49750] Updated weights for policy 0, policy_version 161511 (0.0034) [2024-04-26 11:18:16,209][49750] Updated weights for policy 0, policy_version 161521 (0.0035) [2024-04-26 11:18:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2646392832. Throughput: 0: 50375.9. Samples: 399305460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:18:17,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 11:18:17,127][49728] Signal inference workers to stop experience collection... (6100 times) [2024-04-26 11:18:17,127][49728] Signal inference workers to resume experience collection... 
(6100 times) [2024-04-26 11:18:17,143][49750] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-04-26 11:18:17,143][49750] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-04-26 11:18:19,286][49750] Updated weights for policy 0, policy_version 161531 (0.0038) [2024-04-26 11:18:22,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2646638592. Throughput: 0: 50189.3. Samples: 399450280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:18:22,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 11:18:22,680][49750] Updated weights for policy 0, policy_version 161541 (0.0027) [2024-04-26 11:18:25,625][49750] Updated weights for policy 0, policy_version 161551 (0.0029) [2024-04-26 11:18:27,062][49517] Fps is (10 sec: 50791.2, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2646900736. Throughput: 0: 50313.8. Samples: 399751680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 11:18:27,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 11:18:29,143][49750] Updated weights for policy 0, policy_version 161561 (0.0031) [2024-04-26 11:18:32,026][49750] Updated weights for policy 0, policy_version 161571 (0.0031) [2024-04-26 11:18:32,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2647179264. Throughput: 0: 50409.8. Samples: 400054040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:18:32,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 11:18:35,599][49750] Updated weights for policy 0, policy_version 161581 (0.0030) [2024-04-26 11:18:37,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2647425024. Throughput: 0: 50276.8. Samples: 400209260. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:18:37,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 11:18:38,888][49750] Updated weights for policy 0, policy_version 161591 (0.0032) [2024-04-26 11:18:42,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2647638016. Throughput: 0: 50217.7. Samples: 400510600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:18:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 11:18:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000161600_2647654400.pth... [2024-04-26 11:18:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000160864_2635595776.pth [2024-04-26 11:18:42,292][49750] Updated weights for policy 0, policy_version 161601 (0.0033) [2024-04-26 11:18:45,517][49750] Updated weights for policy 0, policy_version 161611 (0.0031) [2024-04-26 11:18:47,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2647900160. Throughput: 0: 50440.2. Samples: 400817100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:18:47,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 11:18:48,787][49750] Updated weights for policy 0, policy_version 161621 (0.0030) [2024-04-26 11:18:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 49698.3, 300 sec: 50318.4). Total num frames: 2648145920. Throughput: 0: 50494.7. Samples: 400961300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:18:52,063][49517] Avg episode reward: [(0, '0.455')] [2024-04-26 11:18:52,101][49750] Updated weights for policy 0, policy_version 161631 (0.0029) [2024-04-26 11:18:55,261][49750] Updated weights for policy 0, policy_version 161641 (0.0029) [2024-04-26 11:18:57,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2648424448. Throughput: 0: 50299.1. Samples: 401262400. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:18:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 11:18:58,594][49750] Updated weights for policy 0, policy_version 161651 (0.0037) [2024-04-26 11:19:01,762][49750] Updated weights for policy 0, policy_version 161661 (0.0031) [2024-04-26 11:19:02,063][49517] Fps is (10 sec: 52427.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2648670208. Throughput: 0: 50203.4. Samples: 401564620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:19:02,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 11:19:05,059][49750] Updated weights for policy 0, policy_version 161671 (0.0030) [2024-04-26 11:19:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2648915968. Throughput: 0: 50324.5. Samples: 401714880. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:19:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 11:19:08,178][49750] Updated weights for policy 0, policy_version 161681 (0.0033) [2024-04-26 11:19:11,594][49750] Updated weights for policy 0, policy_version 161691 (0.0029) [2024-04-26 11:19:12,063][49517] Fps is (10 sec: 47514.2, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 2649145344. Throughput: 0: 50200.3. Samples: 402010700. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:19:12,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 11:19:14,768][49750] Updated weights for policy 0, policy_version 161701 (0.0035) [2024-04-26 11:19:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2649440256. Throughput: 0: 50279.0. Samples: 402316600. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:19:17,072][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 11:19:18,218][49750] Updated weights for policy 0, policy_version 161711 (0.0031) [2024-04-26 11:19:21,147][49750] Updated weights for policy 0, policy_version 161721 (0.0031) [2024-04-26 11:19:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2649669632. Throughput: 0: 50318.3. Samples: 402473580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:19:22,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:19:24,644][49750] Updated weights for policy 0, policy_version 161731 (0.0030) [2024-04-26 11:19:27,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2649915392. Throughput: 0: 50422.6. Samples: 402779620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:19:27,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 11:19:27,640][49750] Updated weights for policy 0, policy_version 161741 (0.0031) [2024-04-26 11:19:31,040][49750] Updated weights for policy 0, policy_version 161751 (0.0030) [2024-04-26 11:19:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 2650161152. Throughput: 0: 50418.6. Samples: 403085940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-26 11:19:32,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 11:19:34,046][49750] Updated weights for policy 0, policy_version 161761 (0.0033) [2024-04-26 11:19:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2650423296. Throughput: 0: 50330.6. Samples: 403226180. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:19:37,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:19:37,650][49750] Updated weights for policy 0, policy_version 161771 (0.0033) [2024-04-26 11:19:40,269][49728] Signal inference workers to stop experience collection... (6150 times) [2024-04-26 11:19:40,269][49728] Signal inference workers to resume experience collection... (6150 times) [2024-04-26 11:19:40,285][49750] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-04-26 11:19:40,286][49750] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-04-26 11:19:40,546][49750] Updated weights for policy 0, policy_version 161781 (0.0031) [2024-04-26 11:19:42,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 2650701824. Throughput: 0: 50408.8. Samples: 403530800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:19:42,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 11:19:44,216][49750] Updated weights for policy 0, policy_version 161791 (0.0035) [2024-04-26 11:19:47,055][49750] Updated weights for policy 0, policy_version 161801 (0.0032) [2024-04-26 11:19:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2650947584. Throughput: 0: 50470.9. Samples: 403835800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:19:47,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 11:19:50,842][49750] Updated weights for policy 0, policy_version 161811 (0.0033) [2024-04-26 11:19:52,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2651176960. Throughput: 0: 50344.3. Samples: 403980380. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:19:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 11:19:53,580][49750] Updated weights for policy 0, policy_version 161821 (0.0030) [2024-04-26 11:19:57,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2651422720. Throughput: 0: 50501.7. Samples: 404283280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:19:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 11:19:57,247][49750] Updated weights for policy 0, policy_version 161831 (0.0035) [2024-04-26 11:20:00,026][49750] Updated weights for policy 0, policy_version 161841 (0.0032) [2024-04-26 11:20:02,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2651701248. Throughput: 0: 50333.0. Samples: 404581580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:20:02,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 11:20:03,821][49750] Updated weights for policy 0, policy_version 161851 (0.0031) [2024-04-26 11:20:06,578][49750] Updated weights for policy 0, policy_version 161861 (0.0038) [2024-04-26 11:20:07,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2651947008. Throughput: 0: 50340.5. Samples: 404738900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:20:07,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 11:20:10,525][49750] Updated weights for policy 0, policy_version 161871 (0.0041) [2024-04-26 11:20:12,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2652176384. Throughput: 0: 50352.3. Samples: 405045480. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:20:12,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 11:20:13,193][49750] Updated weights for policy 0, policy_version 161881 (0.0037) [2024-04-26 11:20:17,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49425.2, 300 sec: 50151.7). Total num frames: 2652405760. Throughput: 0: 50218.3. Samples: 405345760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:20:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 11:20:17,155][49750] Updated weights for policy 0, policy_version 161891 (0.0030) [2024-04-26 11:20:19,747][49750] Updated weights for policy 0, policy_version 161901 (0.0032) [2024-04-26 11:20:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2652700672. Throughput: 0: 50301.3. Samples: 405489740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:20:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 11:20:23,558][49750] Updated weights for policy 0, policy_version 161911 (0.0032) [2024-04-26 11:20:26,246][49750] Updated weights for policy 0, policy_version 161921 (0.0030) [2024-04-26 11:20:27,062][49517] Fps is (10 sec: 54066.5, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2652946432. Throughput: 0: 50313.4. Samples: 405794900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:20:27,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 11:20:29,881][49750] Updated weights for policy 0, policy_version 161931 (0.0030) [2024-04-26 11:20:32,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2653175808. Throughput: 0: 50227.4. Samples: 406096040. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:20:32,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 11:20:32,824][49750] Updated weights for policy 0, policy_version 161941 (0.0030) [2024-04-26 11:20:36,486][49750] Updated weights for policy 0, policy_version 161951 (0.0030) [2024-04-26 11:20:37,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 2653437952. Throughput: 0: 50258.8. Samples: 406242020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 11:20:37,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:20:39,345][49750] Updated weights for policy 0, policy_version 161961 (0.0027) [2024-04-26 11:20:42,063][49517] Fps is (10 sec: 52428.9, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2653700096. Throughput: 0: 50100.0. Samples: 406537780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:20:42,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 11:20:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000161969_2653700096.pth... [2024-04-26 11:20:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000161233_2641641472.pth [2024-04-26 11:20:42,929][49750] Updated weights for policy 0, policy_version 161971 (0.0043) [2024-04-26 11:20:45,923][49750] Updated weights for policy 0, policy_version 161981 (0.0035) [2024-04-26 11:20:47,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2653978624. Throughput: 0: 50353.3. Samples: 406847480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:20:47,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 11:20:49,279][49750] Updated weights for policy 0, policy_version 161991 (0.0034) [2024-04-26 11:20:52,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2654191616. Throughput: 0: 50376.7. Samples: 407005860. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:20:52,071][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 11:20:52,317][49750] Updated weights for policy 0, policy_version 162001 (0.0034) [2024-04-26 11:20:55,865][49750] Updated weights for policy 0, policy_version 162011 (0.0028) [2024-04-26 11:20:57,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2654470144. Throughput: 0: 50352.5. Samples: 407311340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:20:57,072][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 11:20:58,821][49750] Updated weights for policy 0, policy_version 162021 (0.0031) [2024-04-26 11:21:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 2654683136. Throughput: 0: 50345.2. Samples: 407611300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:21:02,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 11:21:02,323][49750] Updated weights for policy 0, policy_version 162031 (0.0035) [2024-04-26 11:21:05,317][49750] Updated weights for policy 0, policy_version 162041 (0.0030) [2024-04-26 11:21:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2654978048. Throughput: 0: 50535.1. Samples: 407763820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:21:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 11:21:08,817][49750] Updated weights for policy 0, policy_version 162051 (0.0032) [2024-04-26 11:21:11,784][49750] Updated weights for policy 0, policy_version 162061 (0.0036) [2024-04-26 11:21:12,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.6, 300 sec: 50429.4). Total num frames: 2655223808. Throughput: 0: 50279.3. Samples: 408057460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:21:12,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:21:13,160][49728] Signal inference workers to stop experience collection... 
(6200 times) [2024-04-26 11:21:13,193][49750] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-04-26 11:21:13,227][49728] Signal inference workers to resume experience collection... (6200 times) [2024-04-26 11:21:13,227][49750] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-04-26 11:21:15,336][49750] Updated weights for policy 0, policy_version 162071 (0.0029) [2024-04-26 11:21:17,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 2655453184. Throughput: 0: 50305.4. Samples: 408359780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:21:17,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 11:21:18,171][49750] Updated weights for policy 0, policy_version 162081 (0.0034) [2024-04-26 11:21:21,620][49750] Updated weights for policy 0, policy_version 162091 (0.0029) [2024-04-26 11:21:22,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2655715328. Throughput: 0: 50467.1. Samples: 408513040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:21:22,063][49517] Avg episode reward: [(0, '0.456')] [2024-04-26 11:21:24,689][49750] Updated weights for policy 0, policy_version 162101 (0.0032) [2024-04-26 11:21:27,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2655961088. Throughput: 0: 50670.2. Samples: 408817940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:21:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 11:21:28,108][49750] Updated weights for policy 0, policy_version 162111 (0.0029) [2024-04-26 11:21:31,303][49750] Updated weights for policy 0, policy_version 162121 (0.0036) [2024-04-26 11:21:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 2656239616. Throughput: 0: 50581.7. Samples: 409123660. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:21:32,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 11:21:34,499][49750] Updated weights for policy 0, policy_version 162131 (0.0028) [2024-04-26 11:21:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2656452608. Throughput: 0: 50402.9. Samples: 409273980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:21:37,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 11:21:37,831][49750] Updated weights for policy 0, policy_version 162141 (0.0031) [2024-04-26 11:21:40,914][49750] Updated weights for policy 0, policy_version 162151 (0.0040) [2024-04-26 11:21:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2656747520. Throughput: 0: 50429.3. Samples: 409580660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 11:21:42,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:21:44,350][49750] Updated weights for policy 0, policy_version 162161 (0.0031) [2024-04-26 11:21:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 2656976896. Throughput: 0: 50397.9. Samples: 409879200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:21:47,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 11:21:47,490][49750] Updated weights for policy 0, policy_version 162171 (0.0029) [2024-04-26 11:21:50,958][49750] Updated weights for policy 0, policy_version 162181 (0.0032) [2024-04-26 11:21:52,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50484.9). Total num frames: 2657255424. Throughput: 0: 50408.7. Samples: 410032220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:21:52,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 11:21:53,859][49750] Updated weights for policy 0, policy_version 162191 (0.0033) [2024-04-26 11:21:57,063][49517] Fps is (10 sec: 49151.2, 60 sec: 49971.2, 300 sec: 50262.8). 
Total num frames: 2657468416. Throughput: 0: 50566.9. Samples: 410332980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:21:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 11:21:57,501][49750] Updated weights for policy 0, policy_version 162201 (0.0031) [2024-04-26 11:22:00,462][49750] Updated weights for policy 0, policy_version 162211 (0.0030) [2024-04-26 11:22:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.5, 300 sec: 50373.9). Total num frames: 2657746944. Throughput: 0: 50494.4. Samples: 410632020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:02,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:22:03,811][49750] Updated weights for policy 0, policy_version 162221 (0.0037) [2024-04-26 11:22:06,886][49750] Updated weights for policy 0, policy_version 162231 (0.0034) [2024-04-26 11:22:07,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2658009088. Throughput: 0: 50604.9. Samples: 410790260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:07,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 11:22:10,149][49750] Updated weights for policy 0, policy_version 162241 (0.0030) [2024-04-26 11:22:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2658238464. Throughput: 0: 50528.6. Samples: 411091720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:12,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 11:22:13,311][49750] Updated weights for policy 0, policy_version 162251 (0.0033) [2024-04-26 11:22:16,683][49750] Updated weights for policy 0, policy_version 162261 (0.0030) [2024-04-26 11:22:17,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2658484224. Throughput: 0: 50363.7. Samples: 411390020. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:17,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 11:22:19,795][49750] Updated weights for policy 0, policy_version 162271 (0.0027) [2024-04-26 11:22:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2658729984. Throughput: 0: 50449.7. Samples: 411544220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:22,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 11:22:22,209][49728] Signal inference workers to stop experience collection... (6250 times) [2024-04-26 11:22:22,210][49728] Signal inference workers to resume experience collection... (6250 times) [2024-04-26 11:22:22,234][49750] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-04-26 11:22:22,235][49750] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-04-26 11:22:23,183][49750] Updated weights for policy 0, policy_version 162281 (0.0038) [2024-04-26 11:22:26,223][49750] Updated weights for policy 0, policy_version 162291 (0.0036) [2024-04-26 11:22:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2659008512. Throughput: 0: 50385.1. Samples: 411847980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:27,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 11:22:29,729][49750] Updated weights for policy 0, policy_version 162301 (0.0034) [2024-04-26 11:22:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2659254272. Throughput: 0: 50428.0. Samples: 412148460. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:32,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 11:22:32,708][49750] Updated weights for policy 0, policy_version 162311 (0.0038) [2024-04-26 11:22:36,398][49750] Updated weights for policy 0, policy_version 162321 (0.0030) [2024-04-26 11:22:37,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2659483648. Throughput: 0: 50327.2. Samples: 412296940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 11:22:39,229][49750] Updated weights for policy 0, policy_version 162331 (0.0035) [2024-04-26 11:22:42,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2659745792. Throughput: 0: 50318.4. Samples: 412597300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:42,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 11:22:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000162339_2659762176.pth... [2024-04-26 11:22:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000161600_2647654400.pth [2024-04-26 11:22:43,080][49750] Updated weights for policy 0, policy_version 162341 (0.0027) [2024-04-26 11:22:45,750][49750] Updated weights for policy 0, policy_version 162351 (0.0042) [2024-04-26 11:22:47,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2660007936. Throughput: 0: 50391.0. Samples: 412899620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 11:22:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:22:49,631][49750] Updated weights for policy 0, policy_version 162361 (0.0032) [2024-04-26 11:22:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 2660253696. Throughput: 0: 50355.6. Samples: 413056260. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:22:52,063][49517] Avg episode reward: [(0, '0.464')] [2024-04-26 11:22:52,250][49750] Updated weights for policy 0, policy_version 162371 (0.0030) [2024-04-26 11:22:56,083][49750] Updated weights for policy 0, policy_version 162381 (0.0029) [2024-04-26 11:22:57,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2660483072. Throughput: 0: 50264.7. Samples: 413353640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:22:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 11:22:58,762][49750] Updated weights for policy 0, policy_version 162391 (0.0028) [2024-04-26 11:23:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2660745216. Throughput: 0: 50354.7. Samples: 413655980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 11:23:02,574][49750] Updated weights for policy 0, policy_version 162401 (0.0037) [2024-04-26 11:23:05,485][49750] Updated weights for policy 0, policy_version 162411 (0.0030) [2024-04-26 11:23:07,062][49517] Fps is (10 sec: 54068.5, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2661023744. Throughput: 0: 50379.2. Samples: 413811280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:07,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 11:23:08,995][49750] Updated weights for policy 0, policy_version 162421 (0.0029) [2024-04-26 11:23:12,010][49750] Updated weights for policy 0, policy_version 162431 (0.0027) [2024-04-26 11:23:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2661269504. Throughput: 0: 50319.1. Samples: 414112340. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 11:23:15,533][49750] Updated weights for policy 0, policy_version 162441 (0.0029) [2024-04-26 11:23:17,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2661498880. Throughput: 0: 50426.6. Samples: 414417660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:17,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 11:23:18,514][49750] Updated weights for policy 0, policy_version 162451 (0.0029) [2024-04-26 11:23:21,994][49750] Updated weights for policy 0, policy_version 162461 (0.0035) [2024-04-26 11:23:22,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2661761024. Throughput: 0: 50310.2. Samples: 414560900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:22,063][49517] Avg episode reward: [(0, '0.682')] [2024-04-26 11:23:25,139][49750] Updated weights for policy 0, policy_version 162471 (0.0029) [2024-04-26 11:23:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2662006784. Throughput: 0: 50332.3. Samples: 414862260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:27,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 11:23:28,524][49750] Updated weights for policy 0, policy_version 162481 (0.0041) [2024-04-26 11:23:31,688][49750] Updated weights for policy 0, policy_version 162491 (0.0038) [2024-04-26 11:23:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2662268928. Throughput: 0: 50301.9. Samples: 415163200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:32,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 11:23:35,056][49750] Updated weights for policy 0, policy_version 162501 (0.0031) [2024-04-26 11:23:36,247][49728] Signal inference workers to stop experience collection... 
(6300 times) [2024-04-26 11:23:36,250][49728] Signal inference workers to resume experience collection... (6300 times) [2024-04-26 11:23:36,274][49750] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-04-26 11:23:36,274][49750] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-04-26 11:23:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2662514688. Throughput: 0: 50206.6. Samples: 415315560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:37,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 11:23:38,139][49750] Updated weights for policy 0, policy_version 162511 (0.0033) [2024-04-26 11:23:41,540][49750] Updated weights for policy 0, policy_version 162521 (0.0033) [2024-04-26 11:23:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.1, 300 sec: 50373.8). Total num frames: 2662760448. Throughput: 0: 50394.3. Samples: 415621380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 11:23:44,577][49750] Updated weights for policy 0, policy_version 162531 (0.0040) [2024-04-26 11:23:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.3, 300 sec: 50373.8). Total num frames: 2663006208. Throughput: 0: 50353.7. Samples: 415921900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:47,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 11:23:48,151][49750] Updated weights for policy 0, policy_version 162541 (0.0032) [2024-04-26 11:23:51,109][49750] Updated weights for policy 0, policy_version 162551 (0.0034) [2024-04-26 11:23:52,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 2663284736. Throughput: 0: 50201.5. Samples: 416070360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 11:23:52,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 11:23:54,720][49750] Updated weights for policy 0, policy_version 162561 (0.0034) [2024-04-26 11:23:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.5, 300 sec: 50318.4). Total num frames: 2663514112. Throughput: 0: 50236.4. Samples: 416372980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:23:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:23:57,546][49750] Updated weights for policy 0, policy_version 162571 (0.0028) [2024-04-26 11:24:01,221][49750] Updated weights for policy 0, policy_version 162581 (0.0038) [2024-04-26 11:24:02,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2663776256. Throughput: 0: 50155.4. Samples: 416674660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:02,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 11:24:04,121][49750] Updated weights for policy 0, policy_version 162591 (0.0032) [2024-04-26 11:24:07,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 50373.9). Total num frames: 2664005632. Throughput: 0: 50210.3. Samples: 416820360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:07,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 11:24:07,719][49750] Updated weights for policy 0, policy_version 162601 (0.0037) [2024-04-26 11:24:10,757][49750] Updated weights for policy 0, policy_version 162611 (0.0036) [2024-04-26 11:24:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2664300544. Throughput: 0: 50341.0. Samples: 417127600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 11:24:14,145][49750] Updated weights for policy 0, policy_version 162621 (0.0029) [2024-04-26 11:24:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2664513536. Throughput: 0: 50312.4. Samples: 417427260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:17,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 11:24:17,379][49750] Updated weights for policy 0, policy_version 162631 (0.0029) [2024-04-26 11:24:20,739][49750] Updated weights for policy 0, policy_version 162641 (0.0035) [2024-04-26 11:24:22,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2664775680. Throughput: 0: 50300.0. Samples: 417579060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:22,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 11:24:23,926][49750] Updated weights for policy 0, policy_version 162651 (0.0033) [2024-04-26 11:24:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2665005056. Throughput: 0: 50140.2. Samples: 417877680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:27,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 11:24:27,211][49750] Updated weights for policy 0, policy_version 162661 (0.0032) [2024-04-26 11:24:30,406][49750] Updated weights for policy 0, policy_version 162671 (0.0033) [2024-04-26 11:24:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2665283584. Throughput: 0: 50265.7. Samples: 418183860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 11:24:33,669][49750] Updated weights for policy 0, policy_version 162681 (0.0030) [2024-04-26 11:24:36,856][49750] Updated weights for policy 0, policy_version 162691 (0.0033) [2024-04-26 11:24:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2665529344. Throughput: 0: 50237.1. Samples: 418331020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:37,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 11:24:40,191][49750] Updated weights for policy 0, policy_version 162701 (0.0030) [2024-04-26 11:24:42,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2665791488. Throughput: 0: 50313.6. Samples: 418637100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:42,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:24:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000162707_2665791488.pth... [2024-04-26 11:24:42,135][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000161969_2653700096.pth [2024-04-26 11:24:43,311][49750] Updated weights for policy 0, policy_version 162711 (0.0030) [2024-04-26 11:24:46,681][49750] Updated weights for policy 0, policy_version 162721 (0.0035) [2024-04-26 11:24:47,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2666037248. Throughput: 0: 50343.6. Samples: 418940120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:47,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 11:24:47,137][49728] Signal inference workers to stop experience collection... (6350 times) [2024-04-26 11:24:47,177][49750] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-04-26 11:24:47,240][49728] Signal inference workers to resume experience collection... 
(6350 times) [2024-04-26 11:24:47,240][49750] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-04-26 11:24:49,830][49750] Updated weights for policy 0, policy_version 162731 (0.0036) [2024-04-26 11:24:52,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2666266624. Throughput: 0: 50192.8. Samples: 419079040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:52,063][49517] Avg episode reward: [(0, '0.463')] [2024-04-26 11:24:53,437][49750] Updated weights for policy 0, policy_version 162741 (0.0035) [2024-04-26 11:24:56,404][49750] Updated weights for policy 0, policy_version 162751 (0.0027) [2024-04-26 11:24:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2666545152. Throughput: 0: 50120.8. Samples: 419383040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 11:24:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 11:24:59,861][49750] Updated weights for policy 0, policy_version 162761 (0.0037) [2024-04-26 11:25:02,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2666790912. Throughput: 0: 50124.0. Samples: 419682840. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:02,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 11:25:03,079][49750] Updated weights for policy 0, policy_version 162771 (0.0037) [2024-04-26 11:25:06,283][49750] Updated weights for policy 0, policy_version 162781 (0.0028) [2024-04-26 11:25:07,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2667020288. Throughput: 0: 50087.1. Samples: 419832980. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:07,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 11:25:09,632][49750] Updated weights for policy 0, policy_version 162791 (0.0028) [2024-04-26 11:25:12,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49425.0, 300 sec: 50373.8). Total num frames: 2667266048. Throughput: 0: 50221.6. Samples: 420137660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:12,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 11:25:12,798][49750] Updated weights for policy 0, policy_version 162801 (0.0030) [2024-04-26 11:25:16,091][49750] Updated weights for policy 0, policy_version 162811 (0.0029) [2024-04-26 11:25:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2667528192. Throughput: 0: 49985.5. Samples: 420433200. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:17,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 11:25:19,422][49750] Updated weights for policy 0, policy_version 162821 (0.0028) [2024-04-26 11:25:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2667790336. Throughput: 0: 50045.6. Samples: 420583080. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:22,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:25:22,651][49750] Updated weights for policy 0, policy_version 162831 (0.0030) [2024-04-26 11:25:25,861][49750] Updated weights for policy 0, policy_version 162841 (0.0036) [2024-04-26 11:25:27,063][49517] Fps is (10 sec: 52427.5, 60 sec: 50790.2, 300 sec: 50429.4). Total num frames: 2668052480. Throughput: 0: 50040.3. Samples: 420888920. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:27,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 11:25:29,042][49750] Updated weights for policy 0, policy_version 162851 (0.0030) [2024-04-26 11:25:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2668281856. Throughput: 0: 50036.5. Samples: 421191760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:32,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 11:25:32,315][49750] Updated weights for policy 0, policy_version 162861 (0.0032) [2024-04-26 11:25:35,595][49750] Updated weights for policy 0, policy_version 162871 (0.0027) [2024-04-26 11:25:37,062][49517] Fps is (10 sec: 45876.1, 60 sec: 49698.1, 300 sec: 50207.3). Total num frames: 2668511232. Throughput: 0: 50144.4. Samples: 421335540. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:37,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 11:25:38,774][49750] Updated weights for policy 0, policy_version 162881 (0.0031) [2024-04-26 11:25:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2668789760. Throughput: 0: 50035.2. Samples: 421634620. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:42,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 11:25:42,126][49750] Updated weights for policy 0, policy_version 162891 (0.0031) [2024-04-26 11:25:45,411][49750] Updated weights for policy 0, policy_version 162901 (0.0033) [2024-04-26 11:25:47,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2669051904. Throughput: 0: 50101.7. Samples: 421937420. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:47,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 11:25:48,577][49750] Updated weights for policy 0, policy_version 162911 (0.0031) [2024-04-26 11:25:51,969][49750] Updated weights for policy 0, policy_version 162921 (0.0036) [2024-04-26 11:25:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2669297664. Throughput: 0: 50261.9. Samples: 422094760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:52,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 11:25:54,988][49750] Updated weights for policy 0, policy_version 162931 (0.0036) [2024-04-26 11:25:57,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2669527040. Throughput: 0: 50113.4. Samples: 422392760. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 11:25:57,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 11:25:57,559][49728] Signal inference workers to stop experience collection... (6400 times) [2024-04-26 11:25:57,601][49750] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-04-26 11:25:57,634][49728] Signal inference workers to resume experience collection... (6400 times) [2024-04-26 11:25:57,634][49750] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-04-26 11:25:58,460][49750] Updated weights for policy 0, policy_version 162941 (0.0028) [2024-04-26 11:26:01,641][49750] Updated weights for policy 0, policy_version 162951 (0.0033) [2024-04-26 11:26:02,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2669789184. Throughput: 0: 50244.4. Samples: 422694200. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:02,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:26:04,971][49750] Updated weights for policy 0, policy_version 162961 (0.0033) [2024-04-26 11:26:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2670051328. Throughput: 0: 50271.7. Samples: 422845300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:07,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 11:26:08,299][49750] Updated weights for policy 0, policy_version 162971 (0.0035) [2024-04-26 11:26:11,483][49750] Updated weights for policy 0, policy_version 162981 (0.0029) [2024-04-26 11:26:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2670313472. Throughput: 0: 50294.3. Samples: 423152160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 11:26:14,666][49750] Updated weights for policy 0, policy_version 162991 (0.0029) [2024-04-26 11:26:17,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.1, 300 sec: 50262.8). Total num frames: 2670542848. Throughput: 0: 50346.0. Samples: 423457340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 11:26:17,896][49750] Updated weights for policy 0, policy_version 163001 (0.0034) [2024-04-26 11:26:21,019][49750] Updated weights for policy 0, policy_version 163011 (0.0032) [2024-04-26 11:26:22,063][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2670788608. Throughput: 0: 50290.6. Samples: 423598620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:22,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 11:26:24,354][49750] Updated weights for policy 0, policy_version 163021 (0.0029) [2024-04-26 11:26:27,063][49517] Fps is (10 sec: 52429.3, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2671067136. Throughput: 0: 50439.9. Samples: 423904420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 11:26:27,693][49750] Updated weights for policy 0, policy_version 163031 (0.0030) [2024-04-26 11:26:30,969][49750] Updated weights for policy 0, policy_version 163041 (0.0027) [2024-04-26 11:26:32,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2671329280. Throughput: 0: 50364.1. Samples: 424203800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 11:26:34,134][49750] Updated weights for policy 0, policy_version 163051 (0.0035) [2024-04-26 11:26:37,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50207.3). Total num frames: 2671558656. Throughput: 0: 50354.6. Samples: 424360720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:37,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 11:26:37,451][49750] Updated weights for policy 0, policy_version 163061 (0.0028) [2024-04-26 11:26:40,582][49750] Updated weights for policy 0, policy_version 163071 (0.0029) [2024-04-26 11:26:42,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2671788032. Throughput: 0: 50364.5. Samples: 424659160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:42,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 11:26:42,179][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000163074_2671804416.pth... 
[2024-04-26 11:26:42,224][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000162339_2659762176.pth [2024-04-26 11:26:43,974][49750] Updated weights for policy 0, policy_version 163081 (0.0028) [2024-04-26 11:26:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2672082944. Throughput: 0: 50361.4. Samples: 424960460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:47,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 11:26:47,067][49750] Updated weights for policy 0, policy_version 163091 (0.0028) [2024-04-26 11:26:50,413][49750] Updated weights for policy 0, policy_version 163101 (0.0030) [2024-04-26 11:26:52,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50244.1, 300 sec: 50318.3). Total num frames: 2672312320. Throughput: 0: 50384.6. Samples: 425112620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 11:26:53,654][49750] Updated weights for policy 0, policy_version 163111 (0.0029) [2024-04-26 11:26:57,026][49750] Updated weights for policy 0, policy_version 163121 (0.0036) [2024-04-26 11:26:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2672574464. Throughput: 0: 50283.3. Samples: 425414900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:26:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 11:27:00,322][49750] Updated weights for policy 0, policy_version 163131 (0.0038) [2024-04-26 11:27:02,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2672820224. Throughput: 0: 50267.6. Samples: 425719380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 11:27:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:27:03,471][49750] Updated weights for policy 0, policy_version 163141 (0.0033) [2024-04-26 11:27:06,837][49750] Updated weights for policy 0, policy_version 163151 (0.0033) [2024-04-26 11:27:07,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2673065984. Throughput: 0: 50316.1. Samples: 425862840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:07,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 11:27:10,053][49750] Updated weights for policy 0, policy_version 163161 (0.0030) [2024-04-26 11:27:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2673328128. Throughput: 0: 50301.7. Samples: 426168000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:12,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 11:27:13,169][49750] Updated weights for policy 0, policy_version 163171 (0.0031) [2024-04-26 11:27:14,438][49728] Signal inference workers to stop experience collection... (6450 times) [2024-04-26 11:27:14,439][49728] Signal inference workers to resume experience collection... (6450 times) [2024-04-26 11:27:14,467][49750] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-04-26 11:27:14,468][49750] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-04-26 11:27:16,548][49750] Updated weights for policy 0, policy_version 163181 (0.0031) [2024-04-26 11:27:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.5, 300 sec: 50318.3). Total num frames: 2673573888. Throughput: 0: 50367.6. Samples: 426470340. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 11:27:19,743][49750] Updated weights for policy 0, policy_version 163191 (0.0032) [2024-04-26 11:27:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50207.2). Total num frames: 2673819648. Throughput: 0: 50193.8. Samples: 426619440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:22,063][49517] Avg episode reward: [(0, '0.445')] [2024-04-26 11:27:23,021][49750] Updated weights for policy 0, policy_version 163201 (0.0032) [2024-04-26 11:27:26,369][49750] Updated weights for policy 0, policy_version 163211 (0.0032) [2024-04-26 11:27:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2674065408. Throughput: 0: 50321.3. Samples: 426923620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 11:27:29,624][49750] Updated weights for policy 0, policy_version 163221 (0.0029) [2024-04-26 11:27:32,063][49517] Fps is (10 sec: 49151.3, 60 sec: 49698.0, 300 sec: 50262.8). Total num frames: 2674311168. Throughput: 0: 50211.8. Samples: 427220000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:32,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 11:27:33,298][49750] Updated weights for policy 0, policy_version 163231 (0.0040) [2024-04-26 11:27:36,180][49750] Updated weights for policy 0, policy_version 163241 (0.0027) [2024-04-26 11:27:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2674573312. Throughput: 0: 50276.2. Samples: 427375040. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:37,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:27:39,753][49750] Updated weights for policy 0, policy_version 163251 (0.0035) [2024-04-26 11:27:42,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 2674835456. Throughput: 0: 50377.3. Samples: 427681880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:42,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 11:27:42,623][49750] Updated weights for policy 0, policy_version 163261 (0.0034) [2024-04-26 11:27:46,091][49750] Updated weights for policy 0, policy_version 163271 (0.0033) [2024-04-26 11:27:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2675064832. Throughput: 0: 50327.7. Samples: 427984120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:47,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 11:27:48,996][49750] Updated weights for policy 0, policy_version 163281 (0.0034) [2024-04-26 11:27:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2675326976. Throughput: 0: 50328.9. Samples: 428127640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:52,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 11:27:52,665][49750] Updated weights for policy 0, policy_version 163291 (0.0028) [2024-04-26 11:27:55,560][49750] Updated weights for policy 0, policy_version 163301 (0.0028) [2024-04-26 11:27:57,063][49517] Fps is (10 sec: 50789.4, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2675572736. Throughput: 0: 50361.8. Samples: 428434280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:27:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:27:59,077][49750] Updated weights for policy 0, policy_version 163311 (0.0035) [2024-04-26 11:28:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2675834880. Throughput: 0: 50341.1. Samples: 428735700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:28:02,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 11:28:02,145][49750] Updated weights for policy 0, policy_version 163321 (0.0034) [2024-04-26 11:28:05,603][49750] Updated weights for policy 0, policy_version 163331 (0.0024) [2024-04-26 11:28:07,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2676097024. Throughput: 0: 50612.3. Samples: 428897000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 11:28:07,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:28:08,552][49750] Updated weights for policy 0, policy_version 163341 (0.0034) [2024-04-26 11:28:12,062][49517] Fps is (10 sec: 49153.0, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2676326400. Throughput: 0: 50410.3. Samples: 429192080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:12,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 11:28:12,148][49750] Updated weights for policy 0, policy_version 163351 (0.0029) [2024-04-26 11:28:15,057][49750] Updated weights for policy 0, policy_version 163361 (0.0036) [2024-04-26 11:28:17,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2676588544. Throughput: 0: 50566.2. Samples: 429495480. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:17,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 11:28:18,499][49750] Updated weights for policy 0, policy_version 163371 (0.0032) [2024-04-26 11:28:21,479][49750] Updated weights for policy 0, policy_version 163381 (0.0038) [2024-04-26 11:28:21,498][49728] Signal inference workers to stop experience collection... (6500 times) [2024-04-26 11:28:21,503][49728] Signal inference workers to resume experience collection... (6500 times) [2024-04-26 11:28:21,530][49750] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-04-26 11:28:21,530][49750] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-04-26 11:28:22,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2676850688. Throughput: 0: 50503.1. Samples: 429647680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:22,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 11:28:24,890][49750] Updated weights for policy 0, policy_version 163391 (0.0029) [2024-04-26 11:28:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2677096448. Throughput: 0: 50361.0. Samples: 429948120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:27,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 11:28:28,017][49750] Updated weights for policy 0, policy_version 163401 (0.0034) [2024-04-26 11:28:31,378][49750] Updated weights for policy 0, policy_version 163411 (0.0030) [2024-04-26 11:28:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50318.3). Total num frames: 2677358592. Throughput: 0: 50231.5. Samples: 430244540. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:32,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 11:28:34,631][49750] Updated weights for policy 0, policy_version 163421 (0.0030) [2024-04-26 11:28:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2677587968. Throughput: 0: 50473.4. Samples: 430398940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:37,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 11:28:37,913][49750] Updated weights for policy 0, policy_version 163431 (0.0031) [2024-04-26 11:28:41,065][49750] Updated weights for policy 0, policy_version 163441 (0.0034) [2024-04-26 11:28:42,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2677833728. Throughput: 0: 50392.0. Samples: 430701920. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:42,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 11:28:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000163442_2677833728.pth... [2024-04-26 11:28:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000162707_2665791488.pth [2024-04-26 11:28:44,279][49750] Updated weights for policy 0, policy_version 163451 (0.0033) [2024-04-26 11:28:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50207.3). Total num frames: 2678095872. Throughput: 0: 50299.8. Samples: 430999180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:47,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 11:28:47,495][49750] Updated weights for policy 0, policy_version 163461 (0.0035) [2024-04-26 11:28:50,757][49750] Updated weights for policy 0, policy_version 163471 (0.0027) [2024-04-26 11:28:52,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2678358016. Throughput: 0: 50225.4. Samples: 431157140. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:52,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 11:28:53,958][49750] Updated weights for policy 0, policy_version 163481 (0.0029) [2024-04-26 11:28:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2678603776. Throughput: 0: 50491.1. Samples: 431464180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:28:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 11:28:57,246][49750] Updated weights for policy 0, policy_version 163491 (0.0029) [2024-04-26 11:29:00,462][49750] Updated weights for policy 0, policy_version 163501 (0.0030) [2024-04-26 11:29:02,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2678833152. Throughput: 0: 50252.1. Samples: 431756820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:29:02,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 11:29:04,023][49750] Updated weights for policy 0, policy_version 163511 (0.0031) [2024-04-26 11:29:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.4, 300 sec: 50207.3). Total num frames: 2679111680. Throughput: 0: 50110.3. Samples: 431902640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:29:07,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:29:07,089][49750] Updated weights for policy 0, policy_version 163521 (0.0031) [2024-04-26 11:29:10,512][49750] Updated weights for policy 0, policy_version 163531 (0.0027) [2024-04-26 11:29:12,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2679373824. Throughput: 0: 50188.4. Samples: 432206600. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 11:29:12,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 11:29:13,448][49750] Updated weights for policy 0, policy_version 163541 (0.0030) [2024-04-26 11:29:17,023][49750] Updated weights for policy 0, policy_version 163551 (0.0032) [2024-04-26 11:29:17,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2679619584. Throughput: 0: 50269.3. Samples: 432506660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:29:17,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 11:29:20,067][49750] Updated weights for policy 0, policy_version 163561 (0.0029) [2024-04-26 11:29:22,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 2679832576. Throughput: 0: 50220.8. Samples: 432658880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:29:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:29:22,294][49728] Signal inference workers to stop experience collection... (6550 times) [2024-04-26 11:29:22,295][49728] Signal inference workers to resume experience collection... (6550 times) [2024-04-26 11:29:22,322][49750] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-04-26 11:29:22,322][49750] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-04-26 11:29:23,650][49750] Updated weights for policy 0, policy_version 163571 (0.0026) [2024-04-26 11:29:26,611][49750] Updated weights for policy 0, policy_version 163581 (0.0030) [2024-04-26 11:29:27,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.1, 300 sec: 50262.8). Total num frames: 2680111104. Throughput: 0: 50200.9. Samples: 432960960. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:29:27,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 11:29:30,151][49750] Updated weights for policy 0, policy_version 163591 (0.0031) [2024-04-26 11:29:32,063][49517] Fps is (10 sec: 52427.7, 60 sec: 49971.0, 300 sec: 50262.7). Total num frames: 2680356864. Throughput: 0: 50145.5. Samples: 433255740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:29:32,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 11:29:33,331][49750] Updated weights for policy 0, policy_version 163601 (0.0034) [2024-04-26 11:29:36,494][49750] Updated weights for policy 0, policy_version 163611 (0.0031) [2024-04-26 11:29:37,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2680619008. Throughput: 0: 50376.8. Samples: 433424100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:29:37,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 11:29:39,702][49750] Updated weights for policy 0, policy_version 163621 (0.0028) [2024-04-26 11:29:42,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2680864768. Throughput: 0: 50191.0. Samples: 433722780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:29:42,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 11:29:43,148][49750] Updated weights for policy 0, policy_version 163631 (0.0030) [2024-04-26 11:29:46,176][49750] Updated weights for policy 0, policy_version 163641 (0.0031) [2024-04-26 11:29:47,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2681094144. Throughput: 0: 50231.6. Samples: 434017240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:29:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 11:29:49,657][49750] Updated weights for policy 0, policy_version 163651 (0.0028) [2024-04-26 11:29:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50318.3). 
Total num frames: 2681389056. Throughput: 0: 50424.3. Samples: 434171740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:29:52,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 11:29:52,706][49750] Updated weights for policy 0, policy_version 163661 (0.0033) [2024-04-26 11:29:56,161][49750] Updated weights for policy 0, policy_version 163671 (0.0033) [2024-04-26 11:29:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2681618432. Throughput: 0: 50258.2. Samples: 434468220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:29:57,063][49517] Avg episode reward: [(0, '0.443')] [2024-04-26 11:29:59,260][49750] Updated weights for policy 0, policy_version 163681 (0.0031) [2024-04-26 11:30:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2681864192. Throughput: 0: 50271.5. Samples: 434768880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:30:02,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 11:30:02,561][49750] Updated weights for policy 0, policy_version 163691 (0.0036) [2024-04-26 11:30:05,634][49750] Updated weights for policy 0, policy_version 163701 (0.0035) [2024-04-26 11:30:07,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49698.0, 300 sec: 50262.8). Total num frames: 2682093568. Throughput: 0: 50289.3. Samples: 434921900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:30:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 11:30:09,102][49750] Updated weights for policy 0, policy_version 163711 (0.0026) [2024-04-26 11:30:12,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2682388480. Throughput: 0: 50247.8. Samples: 435222100. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:30:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 11:30:12,151][49750] Updated weights for policy 0, policy_version 163721 (0.0036) [2024-04-26 11:30:14,877][49728] Signal inference workers to stop experience collection... (6600 times) [2024-04-26 11:30:14,923][49750] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-04-26 11:30:14,981][49728] Signal inference workers to resume experience collection... (6600 times) [2024-04-26 11:30:14,981][49750] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-04-26 11:30:15,789][49750] Updated weights for policy 0, policy_version 163731 (0.0030) [2024-04-26 11:30:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49698.1, 300 sec: 50207.3). Total num frames: 2682601472. Throughput: 0: 50279.8. Samples: 435518320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 11:30:17,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 11:30:19,090][49750] Updated weights for policy 0, policy_version 163741 (0.0043) [2024-04-26 11:30:22,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.3, 300 sec: 50262.8). Total num frames: 2682880000. Throughput: 0: 49967.1. Samples: 435672620. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:30:22,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 11:30:22,282][49750] Updated weights for policy 0, policy_version 163751 (0.0037) [2024-04-26 11:30:26,210][49750] Updated weights for policy 0, policy_version 163761 (0.0030) [2024-04-26 11:30:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2683109376. Throughput: 0: 49924.8. Samples: 435969400. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:30:27,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 11:30:28,693][49750] Updated weights for policy 0, policy_version 163771 (0.0031) [2024-04-26 11:30:32,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 2683371520. Throughput: 0: 50019.8. Samples: 436268140. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:30:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 11:30:32,821][49750] Updated weights for policy 0, policy_version 163781 (0.0028) [2024-04-26 11:30:35,295][49750] Updated weights for policy 0, policy_version 163791 (0.0038) [2024-04-26 11:30:37,062][49517] Fps is (10 sec: 50791.3, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2683617280. Throughput: 0: 49897.8. Samples: 436417140. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:30:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 11:30:39,439][49750] Updated weights for policy 0, policy_version 163801 (0.0040) [2024-04-26 11:30:41,822][49750] Updated weights for policy 0, policy_version 163811 (0.0029) [2024-04-26 11:30:42,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2683879424. Throughput: 0: 50054.7. Samples: 436720680. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:30:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 11:30:42,137][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000163812_2683895808.pth... [2024-04-26 11:30:42,179][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000163074_2671804416.pth [2024-04-26 11:30:45,885][49750] Updated weights for policy 0, policy_version 163821 (0.0035) [2024-04-26 11:30:47,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.2, 300 sec: 50262.7). Total num frames: 2684125184. Throughput: 0: 50195.0. Samples: 437027660. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:30:47,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 11:30:48,234][49750] Updated weights for policy 0, policy_version 163831 (0.0028) [2024-04-26 11:30:52,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49152.1, 300 sec: 50207.2). Total num frames: 2684338176. Throughput: 0: 50094.7. Samples: 437176160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:30:52,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 11:30:52,407][49750] Updated weights for policy 0, policy_version 163841 (0.0029) [2024-04-26 11:30:54,750][49750] Updated weights for policy 0, policy_version 163851 (0.0027) [2024-04-26 11:30:57,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2684649472. Throughput: 0: 50089.3. Samples: 437476120. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:30:57,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 11:30:58,821][49750] Updated weights for policy 0, policy_version 163861 (0.0029) [2024-04-26 11:31:01,417][49750] Updated weights for policy 0, policy_version 163871 (0.0029) [2024-04-26 11:31:02,062][49517] Fps is (10 sec: 52428.1, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2684862464. Throughput: 0: 50150.6. Samples: 437775100. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:31:02,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 11:31:05,276][49750] Updated weights for policy 0, policy_version 163881 (0.0033) [2024-04-26 11:31:07,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2685140992. Throughput: 0: 50216.1. Samples: 437932340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:31:07,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 11:31:08,146][49750] Updated weights for policy 0, policy_version 163891 (0.0032) [2024-04-26 11:31:09,447][49728] Signal inference workers to stop experience collection... 
(6650 times) [2024-04-26 11:31:09,456][49728] Signal inference workers to resume experience collection... (6650 times) [2024-04-26 11:31:09,502][49750] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-04-26 11:31:09,503][49750] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-04-26 11:31:11,616][49750] Updated weights for policy 0, policy_version 163901 (0.0030) [2024-04-26 11:31:12,063][49517] Fps is (10 sec: 50790.4, 60 sec: 49698.0, 300 sec: 50262.8). Total num frames: 2685370368. Throughput: 0: 50154.3. Samples: 438226340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:31:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 11:31:14,825][49750] Updated weights for policy 0, policy_version 163911 (0.0033) [2024-04-26 11:31:17,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2685648896. Throughput: 0: 50286.7. Samples: 438531040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:31:17,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 11:31:18,123][49750] Updated weights for policy 0, policy_version 163921 (0.0036) [2024-04-26 11:31:21,233][49750] Updated weights for policy 0, policy_version 163931 (0.0036) [2024-04-26 11:31:22,062][49517] Fps is (10 sec: 50791.1, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2685878272. Throughput: 0: 50360.0. Samples: 438683340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 11:31:22,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 11:31:24,743][49750] Updated weights for policy 0, policy_version 163941 (0.0027) [2024-04-26 11:31:27,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50262.8). Total num frames: 2686156800. Throughput: 0: 50283.5. Samples: 438983440. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:31:27,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 11:31:27,654][49750] Updated weights for policy 0, policy_version 163951 (0.0032) [2024-04-26 11:31:31,200][49750] Updated weights for policy 0, policy_version 163961 (0.0026) [2024-04-26 11:31:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2686386176. Throughput: 0: 50177.1. Samples: 439285620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:31:32,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 11:31:34,191][49750] Updated weights for policy 0, policy_version 163971 (0.0036) [2024-04-26 11:31:37,063][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2686615552. Throughput: 0: 50235.8. Samples: 439436780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:31:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 11:31:37,642][49750] Updated weights for policy 0, policy_version 163981 (0.0026) [2024-04-26 11:31:40,913][49750] Updated weights for policy 0, policy_version 163991 (0.0031) [2024-04-26 11:31:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2686877696. Throughput: 0: 50167.6. Samples: 439733660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:31:42,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 11:31:44,233][49750] Updated weights for policy 0, policy_version 164001 (0.0026) [2024-04-26 11:31:47,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2687139840. Throughput: 0: 50174.6. Samples: 440032960. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:31:47,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 11:31:47,309][49750] Updated weights for policy 0, policy_version 164011 (0.0038) [2024-04-26 11:31:50,671][49750] Updated weights for policy 0, policy_version 164021 (0.0032) [2024-04-26 11:31:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50262.8). Total num frames: 2687401984. Throughput: 0: 50130.2. Samples: 440188200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:31:52,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 11:31:54,079][49750] Updated weights for policy 0, policy_version 164031 (0.0033) [2024-04-26 11:31:57,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49698.0, 300 sec: 50207.2). Total num frames: 2687631360. Throughput: 0: 50208.4. Samples: 440485720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:31:57,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 11:31:57,152][49750] Updated weights for policy 0, policy_version 164041 (0.0027) [2024-04-26 11:32:00,425][49750] Updated weights for policy 0, policy_version 164051 (0.0029) [2024-04-26 11:32:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2687893504. Throughput: 0: 50171.2. Samples: 440788740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:32:02,063][49517] Avg episode reward: [(0, '0.688')] [2024-04-26 11:32:03,554][49750] Updated weights for policy 0, policy_version 164061 (0.0025) [2024-04-26 11:32:07,062][49750] Updated weights for policy 0, policy_version 164071 (0.0036) [2024-04-26 11:32:07,063][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2688139264. Throughput: 0: 50168.2. Samples: 440940920. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:32:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 11:32:10,104][49750] Updated weights for policy 0, policy_version 164081 (0.0030) [2024-04-26 11:32:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2688401408. Throughput: 0: 50197.4. Samples: 441242320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:32:12,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 11:32:13,513][49750] Updated weights for policy 0, policy_version 164091 (0.0034) [2024-04-26 11:32:16,577][49750] Updated weights for policy 0, policy_version 164101 (0.0030) [2024-04-26 11:32:17,063][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2688647168. Throughput: 0: 50163.8. Samples: 441543000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:32:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 11:32:20,016][49750] Updated weights for policy 0, policy_version 164111 (0.0032) [2024-04-26 11:32:22,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.2, 300 sec: 50207.3). Total num frames: 2688876544. Throughput: 0: 50170.0. Samples: 441694420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:32:22,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 11:32:23,404][49750] Updated weights for policy 0, policy_version 164121 (0.0031) [2024-04-26 11:32:26,532][49750] Updated weights for policy 0, policy_version 164131 (0.0032) [2024-04-26 11:32:27,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 2689138688. Throughput: 0: 50176.7. Samples: 441991620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 11:32:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 11:32:29,321][49728] Signal inference workers to stop experience collection... 
(6700 times) [2024-04-26 11:32:29,322][49728] Signal inference workers to resume experience collection... (6700 times) [2024-04-26 11:32:29,346][49750] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-04-26 11:32:29,346][49750] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-04-26 11:32:29,816][49750] Updated weights for policy 0, policy_version 164141 (0.0035) [2024-04-26 11:32:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2689400832. Throughput: 0: 50119.3. Samples: 442288320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:32:32,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 11:32:32,934][49750] Updated weights for policy 0, policy_version 164151 (0.0026) [2024-04-26 11:32:36,349][49750] Updated weights for policy 0, policy_version 164161 (0.0031) [2024-04-26 11:32:37,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2689662976. Throughput: 0: 50240.0. Samples: 442449000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:32:37,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:32:39,385][49750] Updated weights for policy 0, policy_version 164171 (0.0030) [2024-04-26 11:32:42,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49971.0, 300 sec: 50207.2). Total num frames: 2689875968. Throughput: 0: 50392.4. Samples: 442753380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:32:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 11:32:42,188][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000164178_2689892352.pth... 
[2024-04-26 11:32:42,236][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000163442_2677833728.pth [2024-04-26 11:32:42,822][49750] Updated weights for policy 0, policy_version 164181 (0.0031) [2024-04-26 11:32:45,868][49750] Updated weights for policy 0, policy_version 164191 (0.0034) [2024-04-26 11:32:47,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2690138112. Throughput: 0: 50273.3. Samples: 443051040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:32:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 11:32:49,449][49750] Updated weights for policy 0, policy_version 164201 (0.0031) [2024-04-26 11:32:52,063][49517] Fps is (10 sec: 52428.3, 60 sec: 49971.0, 300 sec: 50262.8). Total num frames: 2690400256. Throughput: 0: 50183.4. Samples: 443199180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:32:52,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 11:32:52,412][49750] Updated weights for policy 0, policy_version 164211 (0.0032) [2024-04-26 11:32:55,975][49750] Updated weights for policy 0, policy_version 164221 (0.0031) [2024-04-26 11:32:57,063][49517] Fps is (10 sec: 54066.0, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2690678784. Throughput: 0: 50252.2. Samples: 443503680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:32:57,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:32:59,352][49750] Updated weights for policy 0, policy_version 164231 (0.0036) [2024-04-26 11:33:02,063][49517] Fps is (10 sec: 49152.9, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2690891776. Throughput: 0: 50293.4. Samples: 443806200. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:33:02,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 11:33:02,410][49750] Updated weights for policy 0, policy_version 164241 (0.0034) [2024-04-26 11:33:06,041][49750] Updated weights for policy 0, policy_version 164251 (0.0030) [2024-04-26 11:33:07,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2691153920. Throughput: 0: 50178.5. Samples: 443952460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:33:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 11:33:08,971][49750] Updated weights for policy 0, policy_version 164261 (0.0029) [2024-04-26 11:33:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 2691383296. Throughput: 0: 50161.9. Samples: 444248900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:33:12,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 11:33:12,697][49750] Updated weights for policy 0, policy_version 164271 (0.0036) [2024-04-26 11:33:15,533][49750] Updated weights for policy 0, policy_version 164281 (0.0030) [2024-04-26 11:33:17,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50207.2). Total num frames: 2691661824. Throughput: 0: 50030.9. Samples: 444539720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:33:17,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 11:33:19,089][49750] Updated weights for policy 0, policy_version 164291 (0.0032) [2024-04-26 11:33:21,818][49750] Updated weights for policy 0, policy_version 164301 (0.0028) [2024-04-26 11:33:22,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 2691907584. Throughput: 0: 50244.3. Samples: 444710000. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:33:22,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 11:33:25,412][49750] Updated weights for policy 0, policy_version 164311 (0.0036) [2024-04-26 11:33:27,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 2692153344. Throughput: 0: 50186.0. Samples: 445011740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:33:27,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 11:33:28,310][49750] Updated weights for policy 0, policy_version 164321 (0.0027) [2024-04-26 11:33:30,778][49728] Signal inference workers to stop experience collection... (6750 times) [2024-04-26 11:33:30,779][49728] Signal inference workers to resume experience collection... (6750 times) [2024-04-26 11:33:30,792][49750] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-04-26 11:33:30,793][49750] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-04-26 11:33:31,970][49750] Updated weights for policy 0, policy_version 164331 (0.0026) [2024-04-26 11:33:32,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2692399104. Throughput: 0: 50326.0. Samples: 445315720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-04-26 11:33:32,064][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 11:33:35,014][49750] Updated weights for policy 0, policy_version 164341 (0.0030) [2024-04-26 11:33:37,063][49517] Fps is (10 sec: 50789.8, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2692661248. Throughput: 0: 50431.7. Samples: 445468600. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:33:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 11:33:38,424][49750] Updated weights for policy 0, policy_version 164351 (0.0031) [2024-04-26 11:33:41,540][49750] Updated weights for policy 0, policy_version 164361 (0.0031) [2024-04-26 11:33:42,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51063.6, 300 sec: 50318.3). Total num frames: 2692939776. Throughput: 0: 50387.8. Samples: 445771120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:33:42,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 11:33:44,810][49750] Updated weights for policy 0, policy_version 164371 (0.0037) [2024-04-26 11:33:47,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2693152768. Throughput: 0: 50276.2. Samples: 446068620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:33:47,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 11:33:48,029][49750] Updated weights for policy 0, policy_version 164381 (0.0034) [2024-04-26 11:33:51,202][49750] Updated weights for policy 0, policy_version 164391 (0.0036) [2024-04-26 11:33:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2693431296. Throughput: 0: 50306.7. Samples: 446216260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:33:52,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 11:33:54,375][49750] Updated weights for policy 0, policy_version 164401 (0.0030) [2024-04-26 11:33:57,062][49517] Fps is (10 sec: 50790.1, 60 sec: 49698.3, 300 sec: 50262.8). Total num frames: 2693660672. Throughput: 0: 50376.0. Samples: 446515820. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:33:57,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 11:33:57,777][49750] Updated weights for policy 0, policy_version 164411 (0.0028) [2024-04-26 11:34:00,818][49750] Updated weights for policy 0, policy_version 164421 (0.0033) [2024-04-26 11:34:02,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50262.7). Total num frames: 2693939200. Throughput: 0: 50606.8. Samples: 446817020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:34:02,063][49517] Avg episode reward: [(0, '0.691')] [2024-04-26 11:34:04,296][49750] Updated weights for policy 0, policy_version 164431 (0.0031) [2024-04-26 11:34:07,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.4, 300 sec: 50207.2). Total num frames: 2694184960. Throughput: 0: 50396.5. Samples: 446977840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:34:07,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 11:34:07,249][49750] Updated weights for policy 0, policy_version 164441 (0.0031) [2024-04-26 11:34:10,825][49750] Updated weights for policy 0, policy_version 164451 (0.0033) [2024-04-26 11:34:12,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.2, 300 sec: 50207.2). Total num frames: 2694430720. Throughput: 0: 50502.0. Samples: 447284340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:34:12,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:34:13,840][49750] Updated weights for policy 0, policy_version 164461 (0.0029) [2024-04-26 11:34:17,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2694660096. Throughput: 0: 50403.6. Samples: 447583880. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:34:17,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 11:34:17,412][49750] Updated weights for policy 0, policy_version 164471 (0.0030) [2024-04-26 11:34:20,280][49750] Updated weights for policy 0, policy_version 164481 (0.0031) [2024-04-26 11:34:22,063][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2694938624. Throughput: 0: 50261.0. Samples: 447730340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:34:22,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:34:23,877][49750] Updated weights for policy 0, policy_version 164491 (0.0026) [2024-04-26 11:34:26,776][49750] Updated weights for policy 0, policy_version 164501 (0.0032) [2024-04-26 11:34:27,063][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2695200768. Throughput: 0: 50194.5. Samples: 448029880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:34:27,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:34:30,520][49750] Updated weights for policy 0, policy_version 164511 (0.0033) [2024-04-26 11:34:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2695413760. Throughput: 0: 50332.8. Samples: 448333600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:34:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 11:34:33,337][49750] Updated weights for policy 0, policy_version 164521 (0.0038) [2024-04-26 11:34:36,888][49750] Updated weights for policy 0, policy_version 164531 (0.0037) [2024-04-26 11:34:37,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.4, 300 sec: 50207.3). Total num frames: 2695675904. Throughput: 0: 50203.3. Samples: 448475400. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 11:34:37,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 11:34:39,970][49750] Updated weights for policy 0, policy_version 164541 (0.0030) [2024-04-26 11:34:42,062][49517] Fps is (10 sec: 52429.0, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2695938048. Throughput: 0: 50297.3. Samples: 448779200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:34:42,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 11:34:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000164547_2695938048.pth... [2024-04-26 11:34:42,114][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000163812_2683895808.pth [2024-04-26 11:34:43,025][49728] Signal inference workers to stop experience collection... (6800 times) [2024-04-26 11:34:43,063][49750] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-04-26 11:34:43,084][49728] Signal inference workers to resume experience collection... (6800 times) [2024-04-26 11:34:43,085][49750] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-04-26 11:34:43,374][49750] Updated weights for policy 0, policy_version 164551 (0.0033) [2024-04-26 11:34:46,344][49750] Updated weights for policy 0, policy_version 164561 (0.0030) [2024-04-26 11:34:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50151.7). Total num frames: 2696183808. Throughput: 0: 50253.9. Samples: 449078440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:34:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 11:34:49,868][49750] Updated weights for policy 0, policy_version 164571 (0.0030) [2024-04-26 11:34:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.3, 300 sec: 50207.2). Total num frames: 2696429568. Throughput: 0: 50165.9. Samples: 449235300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:34:52,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 11:34:52,755][49750] Updated weights for policy 0, policy_version 164581 (0.0033) [2024-04-26 11:34:56,417][49750] Updated weights for policy 0, policy_version 164591 (0.0035) [2024-04-26 11:34:57,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2696691712. Throughput: 0: 49974.8. Samples: 449533200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:34:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:34:59,384][49750] Updated weights for policy 0, policy_version 164601 (0.0032) [2024-04-26 11:35:02,062][49517] Fps is (10 sec: 49151.7, 60 sec: 49698.2, 300 sec: 50262.8). Total num frames: 2696921088. Throughput: 0: 50117.4. Samples: 449839160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:35:02,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 11:35:02,835][49750] Updated weights for policy 0, policy_version 164611 (0.0033) [2024-04-26 11:35:05,959][49750] Updated weights for policy 0, policy_version 164621 (0.0031) [2024-04-26 11:35:07,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2697199616. Throughput: 0: 50194.6. Samples: 449989100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:35:07,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 11:35:09,530][49750] Updated weights for policy 0, policy_version 164631 (0.0032) [2024-04-26 11:35:12,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50244.5, 300 sec: 50318.3). Total num frames: 2697445376. Throughput: 0: 50177.6. Samples: 450287860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:35:12,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 11:35:12,307][49750] Updated weights for policy 0, policy_version 164641 (0.0031) [2024-04-26 11:35:15,973][49750] Updated weights for policy 0, policy_version 164651 (0.0030) [2024-04-26 11:35:17,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2697691136. Throughput: 0: 50214.6. Samples: 450593260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:35:17,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 11:35:18,700][49750] Updated weights for policy 0, policy_version 164661 (0.0028) [2024-04-26 11:35:22,063][49517] Fps is (10 sec: 49150.9, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2697936896. Throughput: 0: 50412.2. Samples: 450743960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:35:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 11:35:22,569][49750] Updated weights for policy 0, policy_version 164671 (0.0031) [2024-04-26 11:35:25,279][49750] Updated weights for policy 0, policy_version 164681 (0.0035) [2024-04-26 11:35:27,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2698182656. Throughput: 0: 50198.5. Samples: 451038140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:35:27,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 11:35:29,109][49750] Updated weights for policy 0, policy_version 164691 (0.0033) [2024-04-26 11:35:31,925][49750] Updated weights for policy 0, policy_version 164701 (0.0036) [2024-04-26 11:35:32,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2698461184. Throughput: 0: 50299.4. Samples: 451341920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:35:32,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 11:35:35,564][49750] Updated weights for policy 0, policy_version 164711 (0.0028) [2024-04-26 11:35:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2698706944. Throughput: 0: 50304.8. Samples: 451499020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:35:37,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 11:35:38,475][49750] Updated weights for policy 0, policy_version 164721 (0.0030) [2024-04-26 11:35:41,978][49750] Updated weights for policy 0, policy_version 164731 (0.0037) [2024-04-26 11:35:42,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2698952704. Throughput: 0: 50320.9. Samples: 451797640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:35:42,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 11:35:45,349][49750] Updated weights for policy 0, policy_version 164741 (0.0031) [2024-04-26 11:35:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2699198464. Throughput: 0: 50278.3. Samples: 452101680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:35:47,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 11:35:48,454][49750] Updated weights for policy 0, policy_version 164751 (0.0028) [2024-04-26 11:35:51,760][49750] Updated weights for policy 0, policy_version 164761 (0.0029) [2024-04-26 11:35:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2699444224. Throughput: 0: 50315.1. Samples: 452253280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:35:52,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 11:35:54,918][49750] Updated weights for policy 0, policy_version 164771 (0.0028) [2024-04-26 11:35:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50318.3). 
Total num frames: 2699706368. Throughput: 0: 50233.6. Samples: 452548380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:35:57,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 11:35:58,209][49750] Updated weights for policy 0, policy_version 164781 (0.0030) [2024-04-26 11:36:01,503][49750] Updated weights for policy 0, policy_version 164791 (0.0036) [2024-04-26 11:36:02,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50207.2). Total num frames: 2699952128. Throughput: 0: 50209.8. Samples: 452852700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:02,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 11:36:04,593][49750] Updated weights for policy 0, policy_version 164801 (0.0030) [2024-04-26 11:36:07,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2700197888. Throughput: 0: 50255.7. Samples: 453005460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:07,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 11:36:07,653][49728] Signal inference workers to stop experience collection... (6850 times) [2024-04-26 11:36:07,654][49728] Signal inference workers to resume experience collection... (6850 times) [2024-04-26 11:36:07,666][49750] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-04-26 11:36:07,686][49750] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-04-26 11:36:07,934][49750] Updated weights for policy 0, policy_version 164811 (0.0034) [2024-04-26 11:36:11,188][49750] Updated weights for policy 0, policy_version 164821 (0.0033) [2024-04-26 11:36:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.1, 300 sec: 50151.7). Total num frames: 2700443648. Throughput: 0: 50328.7. Samples: 453302920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 11:36:14,440][49750] Updated weights for policy 0, policy_version 164831 (0.0034) [2024-04-26 11:36:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2700705792. Throughput: 0: 50272.9. Samples: 453604200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 11:36:17,811][49750] Updated weights for policy 0, policy_version 164841 (0.0034) [2024-04-26 11:36:21,151][49750] Updated weights for policy 0, policy_version 164851 (0.0036) [2024-04-26 11:36:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.5, 300 sec: 50207.3). Total num frames: 2700967936. Throughput: 0: 50227.3. Samples: 453759240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:36:24,181][49750] Updated weights for policy 0, policy_version 164861 (0.0037) [2024-04-26 11:36:27,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2701213696. Throughput: 0: 50270.3. Samples: 454059800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:27,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 11:36:27,499][49750] Updated weights for policy 0, policy_version 164871 (0.0035) [2024-04-26 11:36:30,496][49750] Updated weights for policy 0, policy_version 164881 (0.0032) [2024-04-26 11:36:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2701475840. Throughput: 0: 50400.9. Samples: 454369720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:32,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 11:36:34,087][49750] Updated weights for policy 0, policy_version 164891 (0.0037) [2024-04-26 11:36:37,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.3, 300 sec: 50318.3). 
Total num frames: 2701721600. Throughput: 0: 50152.9. Samples: 454510160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 11:36:37,404][49750] Updated weights for policy 0, policy_version 164901 (0.0029) [2024-04-26 11:36:40,436][49750] Updated weights for policy 0, policy_version 164911 (0.0035) [2024-04-26 11:36:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2701983744. Throughput: 0: 50264.0. Samples: 454810260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:42,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 11:36:42,079][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000164916_2701983744.pth... [2024-04-26 11:36:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000164178_2689892352.pth [2024-04-26 11:36:44,002][49750] Updated weights for policy 0, policy_version 164921 (0.0030) [2024-04-26 11:36:46,940][49750] Updated weights for policy 0, policy_version 164931 (0.0035) [2024-04-26 11:36:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2702229504. Throughput: 0: 50320.0. Samples: 455117100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:36:47,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 11:36:50,327][49750] Updated weights for policy 0, policy_version 164941 (0.0035) [2024-04-26 11:36:52,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2702458880. Throughput: 0: 50322.7. Samples: 455269980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:36:52,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 11:36:53,423][49750] Updated weights for policy 0, policy_version 164951 (0.0036) [2024-04-26 11:36:57,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2702704640. 
Throughput: 0: 50379.8. Samples: 455570020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:36:57,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 11:36:57,405][49750] Updated weights for policy 0, policy_version 164961 (0.0028) [2024-04-26 11:37:00,018][49750] Updated weights for policy 0, policy_version 164971 (0.0027) [2024-04-26 11:37:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2702983168. Throughput: 0: 50275.1. Samples: 455866580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 11:37:03,742][49750] Updated weights for policy 0, policy_version 164981 (0.0037) [2024-04-26 11:37:06,501][49750] Updated weights for policy 0, policy_version 164991 (0.0029) [2024-04-26 11:37:07,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2703228928. Throughput: 0: 50293.3. Samples: 456022440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 11:37:10,189][49750] Updated weights for policy 0, policy_version 165001 (0.0030) [2024-04-26 11:37:11,791][49728] Signal inference workers to stop experience collection... (6900 times) [2024-04-26 11:37:11,791][49728] Signal inference workers to resume experience collection... (6900 times) [2024-04-26 11:37:11,805][49750] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-04-26 11:37:11,805][49750] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-04-26 11:37:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2703474688. Throughput: 0: 50524.4. Samples: 456333400. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:12,063][49517] Avg episode reward: [(0, '0.476')] [2024-04-26 11:37:13,018][49750] Updated weights for policy 0, policy_version 165011 (0.0035) [2024-04-26 11:37:16,691][49750] Updated weights for policy 0, policy_version 165021 (0.0033) [2024-04-26 11:37:17,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2703704064. Throughput: 0: 50267.6. Samples: 456631760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 11:37:19,459][49750] Updated weights for policy 0, policy_version 165031 (0.0033) [2024-04-26 11:37:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2703982592. Throughput: 0: 50377.8. Samples: 456777160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 11:37:23,028][49750] Updated weights for policy 0, policy_version 165041 (0.0031) [2024-04-26 11:37:26,044][49750] Updated weights for policy 0, policy_version 165051 (0.0037) [2024-04-26 11:37:27,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2704244736. Throughput: 0: 50520.7. Samples: 457083700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 11:37:29,434][49750] Updated weights for policy 0, policy_version 165061 (0.0031) [2024-04-26 11:37:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2704490496. Throughput: 0: 50341.2. Samples: 457382460. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:37:32,471][49750] Updated weights for policy 0, policy_version 165071 (0.0031) [2024-04-26 11:37:35,939][49750] Updated weights for policy 0, policy_version 165081 (0.0031) [2024-04-26 11:37:37,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2704719872. Throughput: 0: 50205.7. Samples: 457529240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:37,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 11:37:38,996][49750] Updated weights for policy 0, policy_version 165091 (0.0030) [2024-04-26 11:37:42,063][49517] Fps is (10 sec: 49151.0, 60 sec: 49970.9, 300 sec: 50318.3). Total num frames: 2704982016. Throughput: 0: 50359.8. Samples: 457836220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 11:37:42,821][49750] Updated weights for policy 0, policy_version 165101 (0.0032) [2024-04-26 11:37:45,431][49750] Updated weights for policy 0, policy_version 165111 (0.0027) [2024-04-26 11:37:47,063][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2705227776. Throughput: 0: 50548.0. Samples: 458141240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:47,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 11:37:49,431][49750] Updated weights for policy 0, policy_version 165121 (0.0037) [2024-04-26 11:37:51,882][49750] Updated weights for policy 0, policy_version 165131 (0.0030) [2024-04-26 11:37:52,062][49517] Fps is (10 sec: 52430.4, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2705506304. Throughput: 0: 50507.1. Samples: 458295260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:37:52,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 11:37:55,943][49750] Updated weights for policy 0, policy_version 165141 (0.0031) [2024-04-26 11:37:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2705752064. Throughput: 0: 50296.0. Samples: 458596720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:37:57,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 11:37:58,419][49750] Updated weights for policy 0, policy_version 165151 (0.0028) [2024-04-26 11:38:02,063][49517] Fps is (10 sec: 45874.5, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2705965056. Throughput: 0: 50253.1. Samples: 458893160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:02,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 11:38:02,744][49750] Updated weights for policy 0, policy_version 165161 (0.0033) [2024-04-26 11:38:04,927][49750] Updated weights for policy 0, policy_version 165171 (0.0028) [2024-04-26 11:38:07,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50244.1, 300 sec: 50373.8). Total num frames: 2706243584. Throughput: 0: 50351.8. Samples: 459043000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 11:38:09,092][49750] Updated weights for policy 0, policy_version 165181 (0.0034) [2024-04-26 11:38:11,501][49750] Updated weights for policy 0, policy_version 165191 (0.0034) [2024-04-26 11:38:12,062][49517] Fps is (10 sec: 54068.1, 60 sec: 50517.3, 300 sec: 50318.4). Total num frames: 2706505728. Throughput: 0: 50148.6. Samples: 459340380. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:12,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 11:38:15,720][49750] Updated weights for policy 0, policy_version 165201 (0.0033) [2024-04-26 11:38:16,751][49728] Signal inference workers to stop experience collection... (6950 times) [2024-04-26 11:38:16,752][49728] Signal inference workers to resume experience collection... (6950 times) [2024-04-26 11:38:16,784][49750] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-04-26 11:38:16,784][49750] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-04-26 11:38:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.1, 300 sec: 50318.3). Total num frames: 2706751488. Throughput: 0: 50342.4. Samples: 459647880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 11:38:18,009][49750] Updated weights for policy 0, policy_version 165211 (0.0031) [2024-04-26 11:38:22,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49698.2, 300 sec: 50207.2). Total num frames: 2706964480. Throughput: 0: 50293.0. Samples: 459792420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:22,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 11:38:22,086][49750] Updated weights for policy 0, policy_version 165221 (0.0035) [2024-04-26 11:38:24,519][49750] Updated weights for policy 0, policy_version 165231 (0.0029) [2024-04-26 11:38:27,062][49517] Fps is (10 sec: 47515.2, 60 sec: 49698.3, 300 sec: 50262.8). Total num frames: 2707226624. Throughput: 0: 50187.0. Samples: 460094620. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:27,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 11:38:28,456][49750] Updated weights for policy 0, policy_version 165241 (0.0031) [2024-04-26 11:38:31,082][49750] Updated weights for policy 0, policy_version 165251 (0.0032) [2024-04-26 11:38:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2707488768. Throughput: 0: 50054.8. Samples: 460393700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:32,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:38:35,050][49750] Updated weights for policy 0, policy_version 165261 (0.0029) [2024-04-26 11:38:37,063][49517] Fps is (10 sec: 54066.1, 60 sec: 50790.3, 300 sec: 50262.7). Total num frames: 2707767296. Throughput: 0: 50156.2. Samples: 460552300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:37,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 11:38:37,519][49750] Updated weights for policy 0, policy_version 165271 (0.0032) [2024-04-26 11:38:41,584][49750] Updated weights for policy 0, policy_version 165281 (0.0037) [2024-04-26 11:38:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.5, 300 sec: 50318.3). Total num frames: 2707996672. Throughput: 0: 50140.9. Samples: 460853060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:42,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 11:38:42,167][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000165284_2708013056.pth... [2024-04-26 11:38:42,214][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000164547_2695938048.pth [2024-04-26 11:38:43,910][49750] Updated weights for policy 0, policy_version 165291 (0.0035) [2024-04-26 11:38:47,063][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.2, 300 sec: 50151.7). Total num frames: 2708226048. Throughput: 0: 50267.6. Samples: 461155200. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:47,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 11:38:48,023][49750] Updated weights for policy 0, policy_version 165301 (0.0031) [2024-04-26 11:38:50,439][49750] Updated weights for policy 0, policy_version 165311 (0.0035) [2024-04-26 11:38:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2708504576. Throughput: 0: 50066.9. Samples: 461296000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 11:38:52,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 11:38:54,615][49750] Updated weights for policy 0, policy_version 165321 (0.0034) [2024-04-26 11:38:57,055][49750] Updated weights for policy 0, policy_version 165331 (0.0031) [2024-04-26 11:38:57,062][49517] Fps is (10 sec: 55706.0, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2708783104. Throughput: 0: 50158.1. Samples: 461597500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:38:57,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 11:39:01,216][49750] Updated weights for policy 0, policy_version 165341 (0.0032) [2024-04-26 11:39:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50262.8). Total num frames: 2709012480. Throughput: 0: 50241.2. Samples: 461908720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 11:39:03,689][49750] Updated weights for policy 0, policy_version 165351 (0.0030) [2024-04-26 11:39:07,062][49517] Fps is (10 sec: 44236.9, 60 sec: 49698.3, 300 sec: 50151.7). Total num frames: 2709225472. Throughput: 0: 50200.4. Samples: 462051440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:07,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 11:39:07,746][49750] Updated weights for policy 0, policy_version 165361 (0.0035) [2024-04-26 11:39:10,123][49750] Updated weights for policy 0, policy_version 165371 (0.0039) [2024-04-26 11:39:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2709504000. Throughput: 0: 50203.9. Samples: 462353800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:12,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 11:39:14,105][49750] Updated weights for policy 0, policy_version 165381 (0.0033) [2024-04-26 11:39:16,571][49750] Updated weights for policy 0, policy_version 165391 (0.0025) [2024-04-26 11:39:17,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50244.6, 300 sec: 50262.8). Total num frames: 2709766144. Throughput: 0: 50236.9. Samples: 462654360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 11:39:20,610][49750] Updated weights for policy 0, policy_version 165401 (0.0036) [2024-04-26 11:39:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50262.8). Total num frames: 2710028288. Throughput: 0: 50294.3. Samples: 462815540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:22,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 11:39:22,789][49728] Signal inference workers to stop experience collection... (7000 times) [2024-04-26 11:39:22,793][49728] Signal inference workers to resume experience collection... 
(7000 times) [2024-04-26 11:39:22,804][49750] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-04-26 11:39:22,804][49750] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-04-26 11:39:23,076][49750] Updated weights for policy 0, policy_version 165411 (0.0030) [2024-04-26 11:39:27,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2710241280. Throughput: 0: 50278.2. Samples: 463115580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:27,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 11:39:27,089][49750] Updated weights for policy 0, policy_version 165421 (0.0029) [2024-04-26 11:39:29,651][49750] Updated weights for policy 0, policy_version 165431 (0.0029) [2024-04-26 11:39:32,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2710503424. Throughput: 0: 50204.6. Samples: 463414400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:39:33,540][49750] Updated weights for policy 0, policy_version 165441 (0.0031) [2024-04-26 11:39:36,027][49750] Updated weights for policy 0, policy_version 165451 (0.0033) [2024-04-26 11:39:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2710765568. Throughput: 0: 50391.1. Samples: 463563600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:37,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 11:39:40,200][49750] Updated weights for policy 0, policy_version 165461 (0.0032) [2024-04-26 11:39:42,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2711044096. Throughput: 0: 50547.6. Samples: 463872140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 11:39:42,589][49750] Updated weights for policy 0, policy_version 165471 (0.0032) [2024-04-26 11:39:46,642][49750] Updated weights for policy 0, policy_version 165481 (0.0033) [2024-04-26 11:39:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2711257088. Throughput: 0: 50423.6. Samples: 464177780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 11:39:49,044][49750] Updated weights for policy 0, policy_version 165491 (0.0032) [2024-04-26 11:39:52,063][49517] Fps is (10 sec: 45874.8, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2711502848. Throughput: 0: 50444.3. Samples: 464321440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 11:39:53,043][49750] Updated weights for policy 0, policy_version 165501 (0.0033) [2024-04-26 11:39:55,546][49750] Updated weights for policy 0, policy_version 165511 (0.0027) [2024-04-26 11:39:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 2711781376. Throughput: 0: 50369.3. Samples: 464620420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 11:39:57,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:39:59,746][49750] Updated weights for policy 0, policy_version 165521 (0.0028) [2024-04-26 11:40:02,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2712043520. Throughput: 0: 50468.9. Samples: 464925460. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:02,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 11:40:02,173][49750] Updated weights for policy 0, policy_version 165531 (0.0034) [2024-04-26 11:40:06,274][49750] Updated weights for policy 0, policy_version 165541 (0.0030) [2024-04-26 11:40:07,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50318.3). Total num frames: 2712289280. Throughput: 0: 50383.1. Samples: 465082780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 11:40:08,890][49750] Updated weights for policy 0, policy_version 165551 (0.0032) [2024-04-26 11:40:12,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2712518656. Throughput: 0: 50387.6. Samples: 465383020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:12,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 11:40:12,707][49750] Updated weights for policy 0, policy_version 165561 (0.0033) [2024-04-26 11:40:15,450][49750] Updated weights for policy 0, policy_version 165571 (0.0028) [2024-04-26 11:40:17,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2712780800. Throughput: 0: 50326.3. Samples: 465679080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:17,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 11:40:19,119][49750] Updated weights for policy 0, policy_version 165581 (0.0038) [2024-04-26 11:40:19,421][49728] Signal inference workers to stop experience collection... (7050 times) [2024-04-26 11:40:19,477][49750] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-04-26 11:40:19,481][49728] Signal inference workers to resume experience collection... 
(7050 times) [2024-04-26 11:40:19,490][49750] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-04-26 11:40:21,913][49750] Updated weights for policy 0, policy_version 165591 (0.0033) [2024-04-26 11:40:22,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2713042944. Throughput: 0: 50400.1. Samples: 465831600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:22,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 11:40:25,580][49750] Updated weights for policy 0, policy_version 165601 (0.0041) [2024-04-26 11:40:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50262.8). Total num frames: 2713288704. Throughput: 0: 50308.5. Samples: 466136020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:27,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 11:40:28,504][49750] Updated weights for policy 0, policy_version 165611 (0.0033) [2024-04-26 11:40:32,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.1, 300 sec: 50207.2). Total num frames: 2713518080. Throughput: 0: 50295.4. Samples: 466441080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:32,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 11:40:32,152][49750] Updated weights for policy 0, policy_version 165621 (0.0032) [2024-04-26 11:40:34,902][49750] Updated weights for policy 0, policy_version 165631 (0.0032) [2024-04-26 11:40:37,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2713780224. Throughput: 0: 50327.6. Samples: 466586180. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 11:40:38,564][49750] Updated weights for policy 0, policy_version 165641 (0.0031) [2024-04-26 11:40:41,501][49750] Updated weights for policy 0, policy_version 165651 (0.0031) [2024-04-26 11:40:42,062][49517] Fps is (10 sec: 52429.3, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2714042368. Throughput: 0: 50361.4. Samples: 466886680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 11:40:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000165652_2714042368.pth... [2024-04-26 11:40:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000164916_2701983744.pth [2024-04-26 11:40:45,096][49750] Updated weights for policy 0, policy_version 165661 (0.0034) [2024-04-26 11:40:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2714288128. Throughput: 0: 50129.2. Samples: 467181280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:47,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 11:40:48,089][49750] Updated weights for policy 0, policy_version 165671 (0.0032) [2024-04-26 11:40:51,745][49750] Updated weights for policy 0, policy_version 165681 (0.0034) [2024-04-26 11:40:52,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2714533888. Throughput: 0: 50090.2. Samples: 467336840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:40:54,760][49750] Updated weights for policy 0, policy_version 165691 (0.0031) [2024-04-26 11:40:57,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2714779648. Throughput: 0: 50045.3. Samples: 467635060. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:40:57,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 11:40:58,221][49750] Updated weights for policy 0, policy_version 165701 (0.0036) [2024-04-26 11:41:01,258][49750] Updated weights for policy 0, policy_version 165711 (0.0026) [2024-04-26 11:41:02,063][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2715041792. Throughput: 0: 50218.4. Samples: 467938920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 11:41:02,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 11:41:04,631][49750] Updated weights for policy 0, policy_version 165721 (0.0032) [2024-04-26 11:41:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2715287552. Throughput: 0: 50301.4. Samples: 468095160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:07,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 11:41:07,664][49750] Updated weights for policy 0, policy_version 165731 (0.0030) [2024-04-26 11:41:11,273][49750] Updated weights for policy 0, policy_version 165741 (0.0030) [2024-04-26 11:41:12,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2715549696. Throughput: 0: 50241.4. Samples: 468396880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:12,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 11:41:14,327][49750] Updated weights for policy 0, policy_version 165751 (0.0031) [2024-04-26 11:41:17,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2715779072. Throughput: 0: 49949.9. Samples: 468688820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:17,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 11:41:17,747][49750] Updated weights for policy 0, policy_version 165761 (0.0032) [2024-04-26 11:41:20,231][49728] Signal inference workers to stop experience collection... (7100 times) [2024-04-26 11:41:20,235][49728] Signal inference workers to resume experience collection... (7100 times) [2024-04-26 11:41:20,243][49750] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-04-26 11:41:20,265][49750] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-04-26 11:41:20,993][49750] Updated weights for policy 0, policy_version 165771 (0.0035) [2024-04-26 11:41:22,063][49517] Fps is (10 sec: 49150.3, 60 sec: 49971.0, 300 sec: 50262.7). Total num frames: 2716041216. Throughput: 0: 50075.8. Samples: 468839600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:22,063][49517] Avg episode reward: [(0, '0.695')] [2024-04-26 11:41:22,070][49728] Saving new best policy, reward=0.695! [2024-04-26 11:41:24,216][49750] Updated weights for policy 0, policy_version 165781 (0.0032) [2024-04-26 11:41:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2716286976. Throughput: 0: 50001.8. Samples: 469136760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:27,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 11:41:27,717][49750] Updated weights for policy 0, policy_version 165791 (0.0029) [2024-04-26 11:41:30,675][49750] Updated weights for policy 0, policy_version 165801 (0.0030) [2024-04-26 11:41:32,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2716549120. Throughput: 0: 50122.2. Samples: 469436780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:32,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 11:41:34,233][49750] Updated weights for policy 0, policy_version 165811 (0.0029) [2024-04-26 11:41:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 2716794880. Throughput: 0: 50127.6. Samples: 469592580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 11:41:37,185][49750] Updated weights for policy 0, policy_version 165821 (0.0030) [2024-04-26 11:41:40,650][49750] Updated weights for policy 0, policy_version 165831 (0.0033) [2024-04-26 11:41:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2717057024. Throughput: 0: 50328.8. Samples: 469899860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:42,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 11:41:43,749][49750] Updated weights for policy 0, policy_version 165841 (0.0028) [2024-04-26 11:41:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2717286400. Throughput: 0: 50109.0. Samples: 470193820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 11:41:47,110][49750] Updated weights for policy 0, policy_version 165851 (0.0032) [2024-04-26 11:41:50,171][49750] Updated weights for policy 0, policy_version 165861 (0.0029) [2024-04-26 11:41:52,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2717532160. Throughput: 0: 50122.2. Samples: 470350660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 11:41:53,706][49750] Updated weights for policy 0, policy_version 165871 (0.0033) [2024-04-26 11:41:56,586][49750] Updated weights for policy 0, policy_version 165881 (0.0032) [2024-04-26 11:41:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2717810688. Throughput: 0: 50130.1. Samples: 470652740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:41:57,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:42:00,088][49750] Updated weights for policy 0, policy_version 165891 (0.0034) [2024-04-26 11:42:02,063][49517] Fps is (10 sec: 50789.4, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2718040064. Throughput: 0: 50257.6. Samples: 470950420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:42:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 11:42:03,212][49750] Updated weights for policy 0, policy_version 165901 (0.0029) [2024-04-26 11:42:06,730][49750] Updated weights for policy 0, policy_version 165911 (0.0023) [2024-04-26 11:42:07,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2718285824. Throughput: 0: 50200.8. Samples: 471098620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 11:42:07,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 11:42:09,717][49750] Updated weights for policy 0, policy_version 165921 (0.0029) [2024-04-26 11:42:11,706][49728] Signal inference workers to stop experience collection... (7150 times) [2024-04-26 11:42:11,707][49728] Signal inference workers to resume experience collection... 
(7150 times) [2024-04-26 11:42:11,735][49750] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-04-26 11:42:11,735][49750] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-04-26 11:42:12,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2718564352. Throughput: 0: 50284.0. Samples: 471399540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 11:42:13,406][49750] Updated weights for policy 0, policy_version 165931 (0.0030) [2024-04-26 11:42:16,123][49750] Updated weights for policy 0, policy_version 165941 (0.0035) [2024-04-26 11:42:17,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2718810112. Throughput: 0: 50319.6. Samples: 471701160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:17,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 11:42:19,798][49750] Updated weights for policy 0, policy_version 165951 (0.0030) [2024-04-26 11:42:22,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.4, 300 sec: 50151.7). Total num frames: 2719039488. Throughput: 0: 50397.9. Samples: 471860480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:22,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 11:42:22,563][49750] Updated weights for policy 0, policy_version 165961 (0.0029) [2024-04-26 11:42:26,158][49750] Updated weights for policy 0, policy_version 165971 (0.0030) [2024-04-26 11:42:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2719318016. Throughput: 0: 50224.4. Samples: 472159960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 11:42:29,089][49750] Updated weights for policy 0, policy_version 165981 (0.0032) [2024-04-26 11:42:32,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2719563776. Throughput: 0: 50415.1. Samples: 472462500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:32,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 11:42:32,722][49750] Updated weights for policy 0, policy_version 165991 (0.0028) [2024-04-26 11:42:35,559][49750] Updated weights for policy 0, policy_version 166001 (0.0029) [2024-04-26 11:42:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2719809536. Throughput: 0: 50360.9. Samples: 472616900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:37,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 11:42:39,237][49750] Updated weights for policy 0, policy_version 166011 (0.0033) [2024-04-26 11:42:42,034][49750] Updated weights for policy 0, policy_version 166021 (0.0032) [2024-04-26 11:42:42,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2720088064. Throughput: 0: 50351.9. Samples: 472918580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:42,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 11:42:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000166021_2720088064.pth... [2024-04-26 11:42:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000165284_2708013056.pth [2024-04-26 11:42:45,671][49750] Updated weights for policy 0, policy_version 166031 (0.0031) [2024-04-26 11:42:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50151.7). Total num frames: 2720301056. Throughput: 0: 50510.0. Samples: 473223360. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:47,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 11:42:48,465][49750] Updated weights for policy 0, policy_version 166041 (0.0025) [2024-04-26 11:42:52,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2720563200. Throughput: 0: 50330.6. Samples: 473363500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:52,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 11:42:52,167][49750] Updated weights for policy 0, policy_version 166051 (0.0028) [2024-04-26 11:42:55,005][49750] Updated weights for policy 0, policy_version 166061 (0.0028) [2024-04-26 11:42:57,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50244.1, 300 sec: 50373.8). Total num frames: 2720825344. Throughput: 0: 50342.9. Samples: 473664980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:42:57,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 11:42:58,636][49750] Updated weights for policy 0, policy_version 166071 (0.0037) [2024-04-26 11:43:01,506][49750] Updated weights for policy 0, policy_version 166081 (0.0033) [2024-04-26 11:43:02,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.6, 300 sec: 50318.4). Total num frames: 2721087488. Throughput: 0: 50334.8. Samples: 473966220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:43:02,063][49517] Avg episode reward: [(0, '0.439')] [2024-04-26 11:43:05,166][49750] Updated weights for policy 0, policy_version 166091 (0.0036) [2024-04-26 11:43:07,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 2721316864. Throughput: 0: 50319.8. Samples: 474124880. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:43:07,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:43:07,988][49750] Updated weights for policy 0, policy_version 166101 (0.0029) [2024-04-26 11:43:11,576][49750] Updated weights for policy 0, policy_version 166111 (0.0032) [2024-04-26 11:43:12,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2721579008. Throughput: 0: 50385.3. Samples: 474427300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 11:43:12,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 11:43:14,407][49750] Updated weights for policy 0, policy_version 166121 (0.0030) [2024-04-26 11:43:17,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2721824768. Throughput: 0: 50409.3. Samples: 474730920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:43:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 11:43:18,069][49750] Updated weights for policy 0, policy_version 166131 (0.0029) [2024-04-26 11:43:21,014][49750] Updated weights for policy 0, policy_version 166141 (0.0029) [2024-04-26 11:43:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2722086912. Throughput: 0: 50365.7. Samples: 474883360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:43:22,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 11:43:24,425][49728] Signal inference workers to stop experience collection... (7200 times) [2024-04-26 11:43:24,426][49728] Signal inference workers to resume experience collection... 
(7200 times) [2024-04-26 11:43:24,437][49750] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-04-26 11:43:24,437][49750] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-04-26 11:43:24,558][49750] Updated weights for policy 0, policy_version 166151 (0.0030) [2024-04-26 11:43:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2722332672. Throughput: 0: 50385.8. Samples: 475185940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:43:27,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 11:43:27,525][49750] Updated weights for policy 0, policy_version 166161 (0.0027) [2024-04-26 11:43:31,136][49750] Updated weights for policy 0, policy_version 166171 (0.0027) [2024-04-26 11:43:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 2722562048. Throughput: 0: 50240.0. Samples: 475484160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:43:32,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 11:43:34,077][49750] Updated weights for policy 0, policy_version 166181 (0.0036) [2024-04-26 11:43:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2722824192. Throughput: 0: 50362.6. Samples: 475629820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:43:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 11:43:37,578][49750] Updated weights for policy 0, policy_version 166191 (0.0034) [2024-04-26 11:43:40,875][49750] Updated weights for policy 0, policy_version 166201 (0.0037) [2024-04-26 11:43:42,063][49517] Fps is (10 sec: 52427.7, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 2723086336. Throughput: 0: 50340.0. Samples: 475930280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:43:42,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 11:43:44,248][49750] Updated weights for policy 0, policy_version 166211 (0.0033) [2024-04-26 11:43:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2723348480. Throughput: 0: 50480.7. Samples: 476237860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:43:47,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 11:43:47,362][49750] Updated weights for policy 0, policy_version 166221 (0.0029) [2024-04-26 11:43:50,949][49750] Updated weights for policy 0, policy_version 166231 (0.0032) [2024-04-26 11:43:52,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 2723594240. Throughput: 0: 50231.5. Samples: 476385300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:43:52,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 11:43:53,805][49750] Updated weights for policy 0, policy_version 166241 (0.0027) [2024-04-26 11:43:57,062][49517] Fps is (10 sec: 47514.6, 60 sec: 49971.4, 300 sec: 50207.3). Total num frames: 2723823616. Throughput: 0: 50268.6. Samples: 476689380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:43:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 11:43:57,403][49750] Updated weights for policy 0, policy_version 166251 (0.0037) [2024-04-26 11:44:00,300][49750] Updated weights for policy 0, policy_version 166261 (0.0036) [2024-04-26 11:44:02,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.0, 300 sec: 50373.8). Total num frames: 2724085760. Throughput: 0: 50358.5. Samples: 476997060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:44:02,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 11:44:03,726][49750] Updated weights for policy 0, policy_version 166271 (0.0032) [2024-04-26 11:44:06,829][49750] Updated weights for policy 0, policy_version 166281 (0.0029) [2024-04-26 11:44:07,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2724347904. Throughput: 0: 50321.8. Samples: 477147840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:44:07,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 11:44:10,032][49750] Updated weights for policy 0, policy_version 166291 (0.0028) [2024-04-26 11:44:12,062][49517] Fps is (10 sec: 49153.1, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 2724577280. Throughput: 0: 50274.7. Samples: 477448300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:44:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 11:44:13,300][49750] Updated weights for policy 0, policy_version 166301 (0.0027) [2024-04-26 11:44:16,840][49750] Updated weights for policy 0, policy_version 166311 (0.0035) [2024-04-26 11:44:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 2724855808. Throughput: 0: 50431.0. Samples: 477753560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 11:44:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 11:44:19,740][49750] Updated weights for policy 0, policy_version 166321 (0.0035) [2024-04-26 11:44:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2725101568. Throughput: 0: 50518.7. Samples: 477903160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:44:22,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 11:44:23,386][49750] Updated weights for policy 0, policy_version 166331 (0.0028) [2024-04-26 11:44:26,148][49750] Updated weights for policy 0, policy_version 166341 (0.0029) [2024-04-26 11:44:27,062][49517] Fps is (10 sec: 47514.5, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2725330944. Throughput: 0: 50435.8. Samples: 478199880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:44:27,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 11:44:29,779][49750] Updated weights for policy 0, policy_version 166351 (0.0036) [2024-04-26 11:44:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50373.9). Total num frames: 2725625856. Throughput: 0: 50458.8. Samples: 478508500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:44:32,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 11:44:32,701][49750] Updated weights for policy 0, policy_version 166361 (0.0033) [2024-04-26 11:44:36,358][49750] Updated weights for policy 0, policy_version 166371 (0.0032) [2024-04-26 11:44:37,062][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.2, 300 sec: 50151.7). Total num frames: 2725838848. Throughput: 0: 50469.0. Samples: 478656400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:44:37,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 11:44:39,254][49750] Updated weights for policy 0, policy_version 166381 (0.0032) [2024-04-26 11:44:42,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 2726117376. Throughput: 0: 50503.3. Samples: 478962040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:44:42,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 11:44:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000166389_2726117376.pth... 
[2024-04-26 11:44:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000165652_2714042368.pth [2024-04-26 11:44:42,684][49750] Updated weights for policy 0, policy_version 166391 (0.0036) [2024-04-26 11:44:46,340][49750] Updated weights for policy 0, policy_version 166401 (0.0042) [2024-04-26 11:44:46,969][49728] Signal inference workers to stop experience collection... (7250 times) [2024-04-26 11:44:46,969][49728] Signal inference workers to resume experience collection... (7250 times) [2024-04-26 11:44:46,994][49750] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-04-26 11:44:46,995][49750] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-04-26 11:44:47,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2726363136. Throughput: 0: 50442.1. Samples: 479266940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:44:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 11:44:49,192][49750] Updated weights for policy 0, policy_version 166411 (0.0030) [2024-04-26 11:44:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2726625280. Throughput: 0: 50420.0. Samples: 479416740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:44:52,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 11:44:52,887][49750] Updated weights for policy 0, policy_version 166421 (0.0034) [2024-04-26 11:44:55,695][49750] Updated weights for policy 0, policy_version 166431 (0.0029) [2024-04-26 11:44:57,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50207.2). Total num frames: 2726854656. Throughput: 0: 50441.7. Samples: 479718180. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:44:57,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 11:44:59,346][49750] Updated weights for policy 0, policy_version 166441 (0.0035) [2024-04-26 11:45:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2727116800. Throughput: 0: 50367.6. Samples: 480020100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:45:02,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 11:45:02,120][49750] Updated weights for policy 0, policy_version 166451 (0.0030) [2024-04-26 11:45:05,707][49750] Updated weights for policy 0, policy_version 166461 (0.0029) [2024-04-26 11:45:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 2727378944. Throughput: 0: 50210.9. Samples: 480162660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:45:07,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:45:08,656][49750] Updated weights for policy 0, policy_version 166471 (0.0035) [2024-04-26 11:45:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2727608320. Throughput: 0: 50462.0. Samples: 480470680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:45:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 11:45:12,218][49750] Updated weights for policy 0, policy_version 166481 (0.0029) [2024-04-26 11:45:15,390][49750] Updated weights for policy 0, policy_version 166491 (0.0028) [2024-04-26 11:45:17,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2727870464. Throughput: 0: 50343.6. Samples: 480773960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:45:17,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 11:45:18,825][49750] Updated weights for policy 0, policy_version 166501 (0.0035) [2024-04-26 11:45:21,936][49750] Updated weights for policy 0, policy_version 166511 (0.0031) [2024-04-26 11:45:22,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2728116224. Throughput: 0: 50258.8. Samples: 480918040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:45:22,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 11:45:25,131][49750] Updated weights for policy 0, policy_version 166521 (0.0031) [2024-04-26 11:45:27,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2728378368. Throughput: 0: 50313.0. Samples: 481226120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:45:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 11:45:28,258][49750] Updated weights for policy 0, policy_version 166531 (0.0035) [2024-04-26 11:45:31,549][49750] Updated weights for policy 0, policy_version 166541 (0.0044) [2024-04-26 11:45:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2728640512. Throughput: 0: 50328.4. Samples: 481531720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:45:32,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 11:45:34,772][49750] Updated weights for policy 0, policy_version 166551 (0.0027) [2024-04-26 11:45:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2728886272. Throughput: 0: 50396.0. Samples: 481684560. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:45:37,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 11:45:38,129][49750] Updated weights for policy 0, policy_version 166561 (0.0033) [2024-04-26 11:45:41,172][49750] Updated weights for policy 0, policy_version 166571 (0.0030) [2024-04-26 11:45:42,063][49517] Fps is (10 sec: 47512.8, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2729115648. Throughput: 0: 50273.7. Samples: 481980500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:45:42,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 11:45:44,767][49750] Updated weights for policy 0, policy_version 166581 (0.0030) [2024-04-26 11:45:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.2, 300 sec: 50373.9). Total num frames: 2729394176. Throughput: 0: 50214.2. Samples: 482279740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:45:47,071][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 11:45:47,647][49750] Updated weights for policy 0, policy_version 166591 (0.0031) [2024-04-26 11:45:48,777][49728] Signal inference workers to stop experience collection... (7300 times) [2024-04-26 11:45:48,799][49750] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-04-26 11:45:48,879][49728] Signal inference workers to resume experience collection... (7300 times) [2024-04-26 11:45:48,879][49750] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-04-26 11:45:51,341][49750] Updated weights for policy 0, policy_version 166601 (0.0036) [2024-04-26 11:45:52,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2729639936. Throughput: 0: 50387.6. Samples: 482430100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:45:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 11:45:54,134][49750] Updated weights for policy 0, policy_version 166611 (0.0037) [2024-04-26 11:45:57,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2729885696. Throughput: 0: 50282.7. Samples: 482733400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:45:57,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 11:45:57,890][49750] Updated weights for policy 0, policy_version 166621 (0.0029) [2024-04-26 11:46:00,771][49750] Updated weights for policy 0, policy_version 166631 (0.0031) [2024-04-26 11:46:02,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.1, 300 sec: 50318.3). Total num frames: 2730131456. Throughput: 0: 50258.8. Samples: 483035620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:46:02,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 11:46:04,285][49750] Updated weights for policy 0, policy_version 166641 (0.0030) [2024-04-26 11:46:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2730377216. Throughput: 0: 50366.6. Samples: 483184540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:46:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 11:46:07,347][49750] Updated weights for policy 0, policy_version 166651 (0.0028) [2024-04-26 11:46:10,840][49750] Updated weights for policy 0, policy_version 166661 (0.0027) [2024-04-26 11:46:12,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2730639360. Throughput: 0: 50268.8. Samples: 483488220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:46:12,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 11:46:14,057][49750] Updated weights for policy 0, policy_version 166671 (0.0034) [2024-04-26 11:46:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.1, 300 sec: 50318.3). 
Total num frames: 2730885120. Throughput: 0: 50119.8. Samples: 483787120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:46:17,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:46:17,318][49750] Updated weights for policy 0, policy_version 166681 (0.0027) [2024-04-26 11:46:20,448][49750] Updated weights for policy 0, policy_version 166691 (0.0029) [2024-04-26 11:46:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2731163648. Throughput: 0: 50246.7. Samples: 483945660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:46:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 11:46:23,673][49750] Updated weights for policy 0, policy_version 166701 (0.0030) [2024-04-26 11:46:26,799][49750] Updated weights for policy 0, policy_version 166711 (0.0031) [2024-04-26 11:46:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.1, 300 sec: 50318.3). Total num frames: 2731393024. Throughput: 0: 50296.8. Samples: 484243860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 11:46:27,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 11:46:30,182][49750] Updated weights for policy 0, policy_version 166721 (0.0034) [2024-04-26 11:46:32,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2731638784. Throughput: 0: 50275.1. Samples: 484542120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:46:32,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 11:46:33,282][49750] Updated weights for policy 0, policy_version 166731 (0.0030) [2024-04-26 11:46:36,750][49750] Updated weights for policy 0, policy_version 166741 (0.0027) [2024-04-26 11:46:37,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2731917312. Throughput: 0: 50341.9. Samples: 484695480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:46:37,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 11:46:39,822][49750] Updated weights for policy 0, policy_version 166751 (0.0032) [2024-04-26 11:46:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 2732146688. Throughput: 0: 50363.5. Samples: 484999760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:46:42,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 11:46:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000166757_2732146688.pth... [2024-04-26 11:46:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000166021_2720088064.pth [2024-04-26 11:46:43,266][49750] Updated weights for policy 0, policy_version 166761 (0.0035) [2024-04-26 11:46:46,441][49750] Updated weights for policy 0, policy_version 166771 (0.0030) [2024-04-26 11:46:47,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 2732392448. Throughput: 0: 50253.0. Samples: 485297000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:46:47,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 11:46:49,707][49750] Updated weights for policy 0, policy_version 166781 (0.0033) [2024-04-26 11:46:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2732638208. Throughput: 0: 50403.1. Samples: 485452680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:46:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 11:46:52,811][49728] Signal inference workers to stop experience collection... (7350 times) [2024-04-26 11:46:52,811][49728] Signal inference workers to resume experience collection... 
(7350 times) [2024-04-26 11:46:52,824][49750] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-04-26 11:46:52,846][49750] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-04-26 11:46:52,946][49750] Updated weights for policy 0, policy_version 166791 (0.0037) [2024-04-26 11:46:56,134][49750] Updated weights for policy 0, policy_version 166801 (0.0034) [2024-04-26 11:46:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2732900352. Throughput: 0: 50328.5. Samples: 485753000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:46:57,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 11:46:59,393][49750] Updated weights for policy 0, policy_version 166811 (0.0037) [2024-04-26 11:47:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.5, 300 sec: 50373.8). Total num frames: 2733146112. Throughput: 0: 50445.0. Samples: 486057140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:47:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 11:47:02,655][49750] Updated weights for policy 0, policy_version 166821 (0.0032) [2024-04-26 11:47:05,840][49750] Updated weights for policy 0, policy_version 166831 (0.0036) [2024-04-26 11:47:07,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2733408256. Throughput: 0: 50389.4. Samples: 486213180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:47:07,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 11:47:08,992][49750] Updated weights for policy 0, policy_version 166841 (0.0030) [2024-04-26 11:47:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2733637632. Throughput: 0: 50512.3. Samples: 486516900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:47:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 11:47:12,530][49750] Updated weights for policy 0, policy_version 166851 (0.0034) [2024-04-26 11:47:15,457][49750] Updated weights for policy 0, policy_version 166861 (0.0028) [2024-04-26 11:47:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2733899776. Throughput: 0: 50457.0. Samples: 486812680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:47:17,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 11:47:19,102][49750] Updated weights for policy 0, policy_version 166871 (0.0032) [2024-04-26 11:47:21,877][49750] Updated weights for policy 0, policy_version 166881 (0.0031) [2024-04-26 11:47:22,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2734178304. Throughput: 0: 50522.6. Samples: 486969000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:47:22,072][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 11:47:25,678][49750] Updated weights for policy 0, policy_version 166891 (0.0029) [2024-04-26 11:47:27,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 2734424064. Throughput: 0: 50618.4. Samples: 487277580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:47:27,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 11:47:28,398][49750] Updated weights for policy 0, policy_version 166901 (0.0041) [2024-04-26 11:47:32,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2734653440. Throughput: 0: 50668.4. Samples: 487577080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 11:47:32,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 11:47:32,132][49750] Updated weights for policy 0, policy_version 166911 (0.0034) [2024-04-26 11:47:34,928][49750] Updated weights for policy 0, policy_version 166921 (0.0034) [2024-04-26 11:47:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2734915584. Throughput: 0: 50393.0. Samples: 487720360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:47:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 11:47:38,684][49750] Updated weights for policy 0, policy_version 166931 (0.0029) [2024-04-26 11:47:41,515][49750] Updated weights for policy 0, policy_version 166941 (0.0029) [2024-04-26 11:47:42,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.2, 300 sec: 50429.3). Total num frames: 2735177728. Throughput: 0: 50515.3. Samples: 488026200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:47:42,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 11:47:45,000][49750] Updated weights for policy 0, policy_version 166951 (0.0029) [2024-04-26 11:47:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 2735423488. Throughput: 0: 50430.3. Samples: 488326500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:47:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 11:47:48,226][49750] Updated weights for policy 0, policy_version 166961 (0.0037) [2024-04-26 11:47:51,658][49750] Updated weights for policy 0, policy_version 166971 (0.0033) [2024-04-26 11:47:52,063][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2735685632. Throughput: 0: 50431.9. Samples: 488482620. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:47:52,065][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 11:47:54,659][49750] Updated weights for policy 0, policy_version 166981 (0.0035) [2024-04-26 11:47:57,063][49517] Fps is (10 sec: 49149.0, 60 sec: 50243.9, 300 sec: 50262.7). Total num frames: 2735915008. Throughput: 0: 50276.2. Samples: 488779360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:47:57,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 11:47:58,103][49750] Updated weights for policy 0, policy_version 166991 (0.0031) [2024-04-26 11:48:01,209][49750] Updated weights for policy 0, policy_version 167001 (0.0034) [2024-04-26 11:48:02,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2736177152. Throughput: 0: 50337.2. Samples: 489077860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:48:02,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 11:48:04,701][49750] Updated weights for policy 0, policy_version 167011 (0.0031) [2024-04-26 11:48:07,062][49517] Fps is (10 sec: 52431.6, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2736439296. Throughput: 0: 50421.8. Samples: 489237980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:48:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 11:48:07,594][49750] Updated weights for policy 0, policy_version 167021 (0.0035) [2024-04-26 11:48:11,184][49750] Updated weights for policy 0, policy_version 167031 (0.0034) [2024-04-26 11:48:11,756][49728] Signal inference workers to stop experience collection... (7400 times) [2024-04-26 11:48:11,757][49728] Signal inference workers to resume experience collection... 
(7400 times) [2024-04-26 11:48:11,783][49750] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-04-26 11:48:11,783][49750] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-04-26 11:48:12,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.3, 300 sec: 50429.4). Total num frames: 2736701440. Throughput: 0: 50249.6. Samples: 489538820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:48:12,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 11:48:14,032][49750] Updated weights for policy 0, policy_version 167041 (0.0033) [2024-04-26 11:48:17,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2736914432. Throughput: 0: 50299.0. Samples: 489840520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:48:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 11:48:17,728][49750] Updated weights for policy 0, policy_version 167051 (0.0031) [2024-04-26 11:48:20,509][49750] Updated weights for policy 0, policy_version 167061 (0.0031) [2024-04-26 11:48:22,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2737176576. Throughput: 0: 50385.2. Samples: 489987700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:48:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 11:48:24,248][49750] Updated weights for policy 0, policy_version 167071 (0.0036) [2024-04-26 11:48:27,044][49750] Updated weights for policy 0, policy_version 167081 (0.0036) [2024-04-26 11:48:27,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2737455104. Throughput: 0: 50176.7. Samples: 490284140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:48:27,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 11:48:30,763][49750] Updated weights for policy 0, policy_version 167091 (0.0031) [2024-04-26 11:48:32,063][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.6, 300 sec: 50429.4). Total num frames: 2737700864. Throughput: 0: 50258.2. Samples: 490588120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:48:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 11:48:33,465][49750] Updated weights for policy 0, policy_version 167101 (0.0030) [2024-04-26 11:48:37,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2737930240. Throughput: 0: 50226.2. Samples: 490742800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-26 11:48:37,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 11:48:37,200][49750] Updated weights for policy 0, policy_version 167111 (0.0033) [2024-04-26 11:48:39,872][49750] Updated weights for policy 0, policy_version 167121 (0.0030) [2024-04-26 11:48:42,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2738192384. Throughput: 0: 50353.7. Samples: 491045260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:48:42,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 11:48:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000167126_2738192384.pth... [2024-04-26 11:48:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000166389_2726117376.pth [2024-04-26 11:48:43,573][49750] Updated weights for policy 0, policy_version 167131 (0.0031) [2024-04-26 11:48:46,422][49750] Updated weights for policy 0, policy_version 167141 (0.0029) [2024-04-26 11:48:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.1, 300 sec: 50373.9). Total num frames: 2738454528. Throughput: 0: 50379.9. Samples: 491344960. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:48:47,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 11:48:50,123][49750] Updated weights for policy 0, policy_version 167151 (0.0035) [2024-04-26 11:48:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2738700288. Throughput: 0: 50363.9. Samples: 491504360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:48:52,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 11:48:52,993][49750] Updated weights for policy 0, policy_version 167161 (0.0028) [2024-04-26 11:48:56,664][49750] Updated weights for policy 0, policy_version 167171 (0.0033) [2024-04-26 11:48:57,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.8, 300 sec: 50429.4). Total num frames: 2738962432. Throughput: 0: 50441.8. Samples: 491808700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:48:57,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 11:48:59,656][49750] Updated weights for policy 0, policy_version 167181 (0.0038) [2024-04-26 11:49:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2739191808. Throughput: 0: 50434.1. Samples: 492110060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:49:02,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 11:49:03,023][49750] Updated weights for policy 0, policy_version 167191 (0.0027) [2024-04-26 11:49:06,060][49750] Updated weights for policy 0, policy_version 167201 (0.0029) [2024-04-26 11:49:07,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2739453952. Throughput: 0: 50518.9. Samples: 492261040. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:49:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 11:49:09,549][49750] Updated weights for policy 0, policy_version 167211 (0.0027) [2024-04-26 11:49:12,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2739716096. Throughput: 0: 50524.4. Samples: 492557740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:49:12,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 11:49:12,369][49750] Updated weights for policy 0, policy_version 167221 (0.0030) [2024-04-26 11:49:16,066][49750] Updated weights for policy 0, policy_version 167231 (0.0036) [2024-04-26 11:49:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2739961856. Throughput: 0: 50578.7. Samples: 492864160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:49:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 11:49:18,846][49750] Updated weights for policy 0, policy_version 167241 (0.0031) [2024-04-26 11:49:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2740207616. Throughput: 0: 50614.3. Samples: 493020440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:49:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 11:49:22,501][49750] Updated weights for policy 0, policy_version 167251 (0.0030) [2024-04-26 11:49:24,459][49728] Signal inference workers to stop experience collection... (7450 times) [2024-04-26 11:49:24,460][49728] Signal inference workers to resume experience collection... 
(7450 times) [2024-04-26 11:49:24,492][49750] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-04-26 11:49:24,492][49750] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-04-26 11:49:25,423][49750] Updated weights for policy 0, policy_version 167261 (0.0030) [2024-04-26 11:49:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2740453376. Throughput: 0: 50477.1. Samples: 493316720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:49:27,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 11:49:28,859][49750] Updated weights for policy 0, policy_version 167271 (0.0033) [2024-04-26 11:49:31,936][49750] Updated weights for policy 0, policy_version 167281 (0.0032) [2024-04-26 11:49:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 2740731904. Throughput: 0: 50672.9. Samples: 493625240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:49:32,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 11:49:35,347][49750] Updated weights for policy 0, policy_version 167291 (0.0028) [2024-04-26 11:49:37,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2740977664. Throughput: 0: 50542.1. Samples: 493778760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:49:37,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 11:49:38,412][49750] Updated weights for policy 0, policy_version 167301 (0.0034) [2024-04-26 11:49:41,836][49750] Updated weights for policy 0, policy_version 167311 (0.0029) [2024-04-26 11:49:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2741239808. Throughput: 0: 50537.9. Samples: 494082900. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 11:49:42,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 11:49:44,814][49750] Updated weights for policy 0, policy_version 167321 (0.0032) [2024-04-26 11:49:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2741469184. Throughput: 0: 50546.7. Samples: 494384660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:49:47,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:49:48,258][49750] Updated weights for policy 0, policy_version 167331 (0.0033) [2024-04-26 11:49:51,441][49750] Updated weights for policy 0, policy_version 167341 (0.0033) [2024-04-26 11:49:52,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2741747712. Throughput: 0: 50563.3. Samples: 494536400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:49:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 11:49:54,698][49750] Updated weights for policy 0, policy_version 167351 (0.0033) [2024-04-26 11:49:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2741977088. Throughput: 0: 50720.6. Samples: 494840160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:49:57,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 11:49:57,865][49750] Updated weights for policy 0, policy_version 167361 (0.0032) [2024-04-26 11:50:01,114][49750] Updated weights for policy 0, policy_version 167371 (0.0028) [2024-04-26 11:50:02,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2742239232. Throughput: 0: 50564.0. Samples: 495139540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:02,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 11:50:04,253][49750] Updated weights for policy 0, policy_version 167381 (0.0030) [2024-04-26 11:50:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50429.4). 
Total num frames: 2742484992. Throughput: 0: 50434.3. Samples: 495289980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:07,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 11:50:07,728][49750] Updated weights for policy 0, policy_version 167391 (0.0032) [2024-04-26 11:50:10,789][49750] Updated weights for policy 0, policy_version 167401 (0.0031) [2024-04-26 11:50:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2742747136. Throughput: 0: 50663.8. Samples: 495596600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:12,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 11:50:14,217][49750] Updated weights for policy 0, policy_version 167411 (0.0029) [2024-04-26 11:50:17,063][49517] Fps is (10 sec: 50788.9, 60 sec: 50517.1, 300 sec: 50429.3). Total num frames: 2742992896. Throughput: 0: 50579.9. Samples: 495901340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:17,063][49517] Avg episode reward: [(0, '0.451')] [2024-04-26 11:50:17,272][49750] Updated weights for policy 0, policy_version 167421 (0.0035) [2024-04-26 11:50:20,651][49750] Updated weights for policy 0, policy_version 167431 (0.0031) [2024-04-26 11:50:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2743238656. Throughput: 0: 50492.7. Samples: 496050920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:22,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 11:50:23,753][49750] Updated weights for policy 0, policy_version 167441 (0.0035) [2024-04-26 11:50:27,031][49750] Updated weights for policy 0, policy_version 167451 (0.0030) [2024-04-26 11:50:27,063][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 2743517184. Throughput: 0: 50414.6. Samples: 496351560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:27,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 11:50:30,166][49750] Updated weights for policy 0, policy_version 167461 (0.0031) [2024-04-26 11:50:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2743762944. Throughput: 0: 50517.8. Samples: 496657960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:50:33,548][49750] Updated weights for policy 0, policy_version 167471 (0.0030) [2024-04-26 11:50:36,813][49750] Updated weights for policy 0, policy_version 167481 (0.0032) [2024-04-26 11:50:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.6, 300 sec: 50540.5). Total num frames: 2744025088. Throughput: 0: 50458.9. Samples: 496807040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:37,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 11:50:40,180][49750] Updated weights for policy 0, policy_version 167491 (0.0034) [2024-04-26 11:50:42,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2744254464. Throughput: 0: 50419.9. Samples: 497109060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:42,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 11:50:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000167496_2744254464.pth... [2024-04-26 11:50:42,127][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000166757_2732146688.pth [2024-04-26 11:50:43,413][49750] Updated weights for policy 0, policy_version 167501 (0.0029) [2024-04-26 11:50:46,505][49750] Updated weights for policy 0, policy_version 167511 (0.0031) [2024-04-26 11:50:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2744516608. Throughput: 0: 50520.8. Samples: 497412980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 11:50:47,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 11:50:49,789][49750] Updated weights for policy 0, policy_version 167521 (0.0029) [2024-04-26 11:50:52,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 2744778752. Throughput: 0: 50478.5. Samples: 497561520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:50:52,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 11:50:52,946][49750] Updated weights for policy 0, policy_version 167531 (0.0034) [2024-04-26 11:50:56,113][49750] Updated weights for policy 0, policy_version 167541 (0.0033) [2024-04-26 11:50:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50485.0). Total num frames: 2745024512. Throughput: 0: 50524.6. Samples: 497870200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:50:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 11:50:57,369][49728] Signal inference workers to stop experience collection... (7500 times) [2024-04-26 11:50:57,407][49750] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-04-26 11:50:57,468][49728] Signal inference workers to resume experience collection... (7500 times) [2024-04-26 11:50:57,469][49750] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-04-26 11:50:59,430][49750] Updated weights for policy 0, policy_version 167551 (0.0034) [2024-04-26 11:51:02,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2745270272. Throughput: 0: 50515.0. Samples: 498174500. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:02,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 11:51:02,785][49750] Updated weights for policy 0, policy_version 167561 (0.0029) [2024-04-26 11:51:06,152][49750] Updated weights for policy 0, policy_version 167571 (0.0036) [2024-04-26 11:51:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2745516032. Throughput: 0: 50393.3. Samples: 498318620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:07,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 11:51:09,386][49750] Updated weights for policy 0, policy_version 167581 (0.0032) [2024-04-26 11:51:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2745761792. Throughput: 0: 50550.4. Samples: 498626320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:12,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 11:51:12,585][49750] Updated weights for policy 0, policy_version 167591 (0.0033) [2024-04-26 11:51:16,031][49750] Updated weights for policy 0, policy_version 167601 (0.0032) [2024-04-26 11:51:17,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.6, 300 sec: 50429.4). Total num frames: 2746040320. Throughput: 0: 50479.6. Samples: 498929540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:17,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 11:51:18,997][49750] Updated weights for policy 0, policy_version 167611 (0.0038) [2024-04-26 11:51:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2746269696. Throughput: 0: 50430.9. Samples: 499076440. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 11:51:22,384][49750] Updated weights for policy 0, policy_version 167621 (0.0031) [2024-04-26 11:51:25,500][49750] Updated weights for policy 0, policy_version 167631 (0.0034) [2024-04-26 11:51:27,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 2746515456. Throughput: 0: 50448.5. Samples: 499379240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 11:51:28,916][49750] Updated weights for policy 0, policy_version 167641 (0.0027) [2024-04-26 11:51:31,995][49750] Updated weights for policy 0, policy_version 167651 (0.0028) [2024-04-26 11:51:32,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2746793984. Throughput: 0: 50477.7. Samples: 499684480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:32,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 11:51:35,417][49750] Updated weights for policy 0, policy_version 167661 (0.0029) [2024-04-26 11:51:37,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.1, 300 sec: 50484.9). Total num frames: 2747039744. Throughput: 0: 50466.6. Samples: 499832520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:37,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:51:38,395][49750] Updated weights for policy 0, policy_version 167671 (0.0025) [2024-04-26 11:51:41,809][49750] Updated weights for policy 0, policy_version 167681 (0.0031) [2024-04-26 11:51:42,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2747285504. Throughput: 0: 50377.7. Samples: 500137200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:42,072][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:51:45,049][49750] Updated weights for policy 0, policy_version 167691 (0.0030) [2024-04-26 11:51:47,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 2747547648. Throughput: 0: 50261.1. Samples: 500436260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:47,071][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 11:51:48,385][49750] Updated weights for policy 0, policy_version 167701 (0.0030) [2024-04-26 11:51:51,579][49750] Updated weights for policy 0, policy_version 167711 (0.0029) [2024-04-26 11:51:52,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2747793408. Throughput: 0: 50524.7. Samples: 500592240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:51:52,072][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:51:53,932][49728] Signal inference workers to stop experience collection... (7550 times) [2024-04-26 11:51:53,974][49750] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-04-26 11:51:53,996][49728] Signal inference workers to resume experience collection... (7550 times) [2024-04-26 11:51:53,996][49750] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-04-26 11:51:54,919][49750] Updated weights for policy 0, policy_version 167721 (0.0029) [2024-04-26 11:51:57,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2748022784. Throughput: 0: 50476.0. Samples: 500897740. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:51:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 11:51:57,933][49750] Updated weights for policy 0, policy_version 167731 (0.0028) [2024-04-26 11:52:01,296][49750] Updated weights for policy 0, policy_version 167741 (0.0032) [2024-04-26 11:52:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2748284928. Throughput: 0: 50242.6. Samples: 501190460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:02,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 11:52:04,417][49750] Updated weights for policy 0, policy_version 167751 (0.0032) [2024-04-26 11:52:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2748547072. Throughput: 0: 50273.1. Samples: 501338720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:07,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 11:52:07,669][49750] Updated weights for policy 0, policy_version 167761 (0.0031) [2024-04-26 11:52:10,960][49750] Updated weights for policy 0, policy_version 167771 (0.0035) [2024-04-26 11:52:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2748792832. Throughput: 0: 50409.4. Samples: 501647660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 11:52:14,504][49750] Updated weights for policy 0, policy_version 167781 (0.0037) [2024-04-26 11:52:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2749054976. Throughput: 0: 50240.9. Samples: 501945320. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 11:52:17,610][49750] Updated weights for policy 0, policy_version 167791 (0.0033) [2024-04-26 11:52:21,152][49750] Updated weights for policy 0, policy_version 167801 (0.0034) [2024-04-26 11:52:22,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 2749284352. Throughput: 0: 50355.1. Samples: 502098500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 11:52:24,014][49750] Updated weights for policy 0, policy_version 167811 (0.0028) [2024-04-26 11:52:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50485.0). Total num frames: 2749546496. Throughput: 0: 50341.0. Samples: 502402540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 11:52:27,755][49750] Updated weights for policy 0, policy_version 167821 (0.0032) [2024-04-26 11:52:30,403][49750] Updated weights for policy 0, policy_version 167831 (0.0028) [2024-04-26 11:52:32,062][49517] Fps is (10 sec: 50791.3, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 2749792256. Throughput: 0: 50396.6. Samples: 502704100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 11:52:34,306][49750] Updated weights for policy 0, policy_version 167841 (0.0032) [2024-04-26 11:52:36,946][49750] Updated weights for policy 0, policy_version 167851 (0.0034) [2024-04-26 11:52:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 2750070784. Throughput: 0: 50385.8. Samples: 502859600. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 11:52:40,639][49750] Updated weights for policy 0, policy_version 167861 (0.0032) [2024-04-26 11:52:42,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2750300160. Throughput: 0: 50316.3. Samples: 503161980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:42,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 11:52:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000167865_2750300160.pth... [2024-04-26 11:52:42,135][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000167126_2738192384.pth [2024-04-26 11:52:43,536][49750] Updated weights for policy 0, policy_version 167871 (0.0035) [2024-04-26 11:52:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 2750545920. Throughput: 0: 50540.8. Samples: 503464800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:47,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 11:52:47,186][49750] Updated weights for policy 0, policy_version 167881 (0.0034) [2024-04-26 11:52:49,959][49750] Updated weights for policy 0, policy_version 167891 (0.0033) [2024-04-26 11:52:52,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2750824448. Throughput: 0: 50529.1. Samples: 503612540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 11:52:53,702][49750] Updated weights for policy 0, policy_version 167901 (0.0034) [2024-04-26 11:52:56,530][49750] Updated weights for policy 0, policy_version 167911 (0.0031) [2024-04-26 11:52:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50485.0). Total num frames: 2751070208. Throughput: 0: 50501.8. Samples: 503920240. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 11:52:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:53:00,220][49750] Updated weights for policy 0, policy_version 167921 (0.0029) [2024-04-26 11:53:02,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2751315968. Throughput: 0: 50704.5. Samples: 504227020. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:02,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 11:53:03,052][49750] Updated weights for policy 0, policy_version 167931 (0.0033) [2024-04-26 11:53:06,626][49750] Updated weights for policy 0, policy_version 167941 (0.0031) [2024-04-26 11:53:07,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50244.0, 300 sec: 50373.8). Total num frames: 2751561728. Throughput: 0: 50469.3. Samples: 504369620. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:07,063][49517] Avg episode reward: [(0, '0.476')] [2024-04-26 11:53:09,430][49750] Updated weights for policy 0, policy_version 167951 (0.0039) [2024-04-26 11:53:12,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2751823872. Throughput: 0: 50426.3. Samples: 504671720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 11:53:13,055][49750] Updated weights for policy 0, policy_version 167961 (0.0032) [2024-04-26 11:53:15,949][49750] Updated weights for policy 0, policy_version 167971 (0.0030) [2024-04-26 11:53:17,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2752086016. Throughput: 0: 50448.9. Samples: 504974300. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:17,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 11:53:19,535][49750] Updated weights for policy 0, policy_version 167981 (0.0042) [2024-04-26 11:53:20,810][49728] Signal inference workers to stop experience collection... (7600 times) [2024-04-26 11:53:20,810][49728] Signal inference workers to resume experience collection... (7600 times) [2024-04-26 11:53:20,824][49750] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-04-26 11:53:20,825][49750] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-04-26 11:53:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 2752315392. Throughput: 0: 50486.9. Samples: 505131500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:22,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 11:53:22,419][49750] Updated weights for policy 0, policy_version 167991 (0.0034) [2024-04-26 11:53:26,212][49750] Updated weights for policy 0, policy_version 168001 (0.0040) [2024-04-26 11:53:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2752577536. Throughput: 0: 50445.5. Samples: 505432020. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:27,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 11:53:28,947][49750] Updated weights for policy 0, policy_version 168011 (0.0028) [2024-04-26 11:53:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50485.0). Total num frames: 2752823296. Throughput: 0: 50309.9. Samples: 505728740. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:32,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 11:53:32,571][49750] Updated weights for policy 0, policy_version 168021 (0.0031) [2024-04-26 11:53:35,366][49750] Updated weights for policy 0, policy_version 168031 (0.0030) [2024-04-26 11:53:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2753101824. Throughput: 0: 50526.5. Samples: 505886220. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:37,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 11:53:38,914][49750] Updated weights for policy 0, policy_version 168041 (0.0031) [2024-04-26 11:53:41,838][49750] Updated weights for policy 0, policy_version 168051 (0.0032) [2024-04-26 11:53:42,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.5, 300 sec: 50485.0). Total num frames: 2753347584. Throughput: 0: 50469.7. Samples: 506191380. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:42,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 11:53:45,582][49750] Updated weights for policy 0, policy_version 168061 (0.0038) [2024-04-26 11:53:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2753576960. Throughput: 0: 50429.4. Samples: 506496340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 11:53:48,371][49750] Updated weights for policy 0, policy_version 168071 (0.0032) [2024-04-26 11:53:52,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 2753822720. Throughput: 0: 50441.5. Samples: 506639480. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:52,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 11:53:52,174][49750] Updated weights for policy 0, policy_version 168081 (0.0032) [2024-04-26 11:53:54,913][49750] Updated weights for policy 0, policy_version 168091 (0.0040) [2024-04-26 11:53:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 2754101248. Throughput: 0: 50466.5. Samples: 506942720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:53:57,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 11:53:58,745][49750] Updated weights for policy 0, policy_version 168101 (0.0035) [2024-04-26 11:54:01,286][49750] Updated weights for policy 0, policy_version 168111 (0.0029) [2024-04-26 11:54:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 2754347008. Throughput: 0: 50402.9. Samples: 507242440. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-26 11:54:02,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 11:54:05,322][49750] Updated weights for policy 0, policy_version 168121 (0.0032) [2024-04-26 11:54:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2754592768. Throughput: 0: 50397.6. Samples: 507399400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:07,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 11:54:07,769][49750] Updated weights for policy 0, policy_version 168131 (0.0031) [2024-04-26 11:54:11,628][49750] Updated weights for policy 0, policy_version 168141 (0.0032) [2024-04-26 11:54:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2754838528. Throughput: 0: 50430.2. Samples: 507701380. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:12,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 11:54:14,399][49750] Updated weights for policy 0, policy_version 168151 (0.0039) [2024-04-26 11:54:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2755100672. Throughput: 0: 50594.2. Samples: 508005480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 11:54:17,978][49750] Updated weights for policy 0, policy_version 168161 (0.0029) [2024-04-26 11:54:20,794][49750] Updated weights for policy 0, policy_version 168171 (0.0027) [2024-04-26 11:54:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 2755362816. Throughput: 0: 50555.5. Samples: 508161220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:22,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 11:54:24,343][49750] Updated weights for policy 0, policy_version 168181 (0.0032) [2024-04-26 11:54:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50485.0). Total num frames: 2755624960. Throughput: 0: 50406.2. Samples: 508459660. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 11:54:27,178][49750] Updated weights for policy 0, policy_version 168191 (0.0036) [2024-04-26 11:54:31,126][49750] Updated weights for policy 0, policy_version 168201 (0.0031) [2024-04-26 11:54:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2755854336. Throughput: 0: 50446.2. Samples: 508766420. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:32,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 11:54:33,640][49750] Updated weights for policy 0, policy_version 168211 (0.0032) [2024-04-26 11:54:36,470][49728] Signal inference workers to stop experience collection... (7650 times) [2024-04-26 11:54:36,470][49728] Signal inference workers to resume experience collection... (7650 times) [2024-04-26 11:54:36,498][49750] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-04-26 11:54:36,498][49750] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-04-26 11:54:37,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.1, 300 sec: 50373.9). Total num frames: 2756100096. Throughput: 0: 50500.4. Samples: 508912000. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:37,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 11:54:37,507][49750] Updated weights for policy 0, policy_version 168221 (0.0028) [2024-04-26 11:54:40,313][49750] Updated weights for policy 0, policy_version 168231 (0.0030) [2024-04-26 11:54:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2756378624. Throughput: 0: 50370.2. Samples: 509209380. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:54:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000168236_2756378624.pth... [2024-04-26 11:54:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000167496_2744254464.pth [2024-04-26 11:54:43,992][49750] Updated weights for policy 0, policy_version 168241 (0.0036) [2024-04-26 11:54:46,683][49750] Updated weights for policy 0, policy_version 168251 (0.0030) [2024-04-26 11:54:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2756624384. Throughput: 0: 50320.6. Samples: 509506860. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:47,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 11:54:50,528][49750] Updated weights for policy 0, policy_version 168261 (0.0038) [2024-04-26 11:54:52,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2756870144. Throughput: 0: 50430.2. Samples: 509668760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 11:54:53,336][49750] Updated weights for policy 0, policy_version 168271 (0.0032) [2024-04-26 11:54:57,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 2757099520. Throughput: 0: 50310.5. Samples: 509965360. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:54:57,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:54:57,113][49750] Updated weights for policy 0, policy_version 168281 (0.0040) [2024-04-26 11:54:59,935][49750] Updated weights for policy 0, policy_version 168291 (0.0032) [2024-04-26 11:55:02,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.5, 300 sec: 50484.9). Total num frames: 2757378048. Throughput: 0: 50165.3. Samples: 510262920. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:55:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 11:55:03,485][49750] Updated weights for policy 0, policy_version 168301 (0.0029) [2024-04-26 11:55:06,328][49750] Updated weights for policy 0, policy_version 168311 (0.0036) [2024-04-26 11:55:07,062][49517] Fps is (10 sec: 52430.0, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2757623808. Throughput: 0: 50258.4. Samples: 510422840. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-04-26 11:55:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 11:55:10,086][49750] Updated weights for policy 0, policy_version 168321 (0.0036) [2024-04-26 11:55:12,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2757869568. Throughput: 0: 50198.5. Samples: 510718600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:12,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 11:55:13,016][49750] Updated weights for policy 0, policy_version 168331 (0.0034) [2024-04-26 11:55:16,582][49750] Updated weights for policy 0, policy_version 168341 (0.0032) [2024-04-26 11:55:17,063][49517] Fps is (10 sec: 47512.8, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 2758098944. Throughput: 0: 49998.5. Samples: 511016360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:17,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 11:55:19,553][49750] Updated weights for policy 0, policy_version 168351 (0.0029) [2024-04-26 11:55:22,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2758361088. Throughput: 0: 50027.5. Samples: 511163240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:22,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 11:55:23,092][49750] Updated weights for policy 0, policy_version 168361 (0.0036) [2024-04-26 11:55:25,876][49750] Updated weights for policy 0, policy_version 168371 (0.0034) [2024-04-26 11:55:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 2758623232. Throughput: 0: 50180.6. Samples: 511467500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:27,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 11:55:29,584][49750] Updated weights for policy 0, policy_version 168381 (0.0030) [2024-04-26 11:55:32,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2758885376. Throughput: 0: 50199.6. Samples: 511765840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:32,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 11:55:32,333][49750] Updated weights for policy 0, policy_version 168391 (0.0037) [2024-04-26 11:55:36,225][49750] Updated weights for policy 0, policy_version 168401 (0.0038) [2024-04-26 11:55:37,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2759098368. Throughput: 0: 50085.9. Samples: 511922620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:37,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 11:55:38,785][49750] Updated weights for policy 0, policy_version 168411 (0.0025) [2024-04-26 11:55:42,062][49517] Fps is (10 sec: 47513.1, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2759360512. Throughput: 0: 50214.8. Samples: 512225020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:42,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 11:55:42,577][49750] Updated weights for policy 0, policy_version 168421 (0.0033) [2024-04-26 11:55:45,355][49750] Updated weights for policy 0, policy_version 168431 (0.0030) [2024-04-26 11:55:47,063][49517] Fps is (10 sec: 52428.2, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2759622656. Throughput: 0: 50289.3. Samples: 512525940. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:47,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:55:49,227][49750] Updated weights for policy 0, policy_version 168441 (0.0029) [2024-04-26 11:55:50,606][49728] Signal inference workers to stop experience collection... (7700 times) [2024-04-26 11:55:50,607][49728] Signal inference workers to resume experience collection... (7700 times) [2024-04-26 11:55:50,637][49750] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-04-26 11:55:50,637][49750] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-04-26 11:55:51,654][49750] Updated weights for policy 0, policy_version 168451 (0.0027) [2024-04-26 11:55:52,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2759901184. Throughput: 0: 50093.7. Samples: 512677060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 11:55:55,562][49750] Updated weights for policy 0, policy_version 168461 (0.0030) [2024-04-26 11:55:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2760130560. Throughput: 0: 50242.2. Samples: 512979500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:55:57,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 11:55:58,568][49750] Updated weights for policy 0, policy_version 168471 (0.0028) [2024-04-26 11:56:02,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 2760376320. Throughput: 0: 50376.5. Samples: 513283300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:56:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 11:56:02,169][49750] Updated weights for policy 0, policy_version 168481 (0.0032) [2024-04-26 11:56:04,925][49750] Updated weights for policy 0, policy_version 168491 (0.0034) [2024-04-26 11:56:07,062][49517] Fps is (10 sec: 49153.1, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2760622080. Throughput: 0: 50286.4. Samples: 513426120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:56:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 11:56:08,616][49750] Updated weights for policy 0, policy_version 168501 (0.0030) [2024-04-26 11:56:11,463][49750] Updated weights for policy 0, policy_version 168511 (0.0036) [2024-04-26 11:56:12,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2760884224. Throughput: 0: 50375.6. Samples: 513734420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 11:56:12,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 11:56:15,067][49750] Updated weights for policy 0, policy_version 168521 (0.0031) [2024-04-26 11:56:17,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2761146368. Throughput: 0: 50486.0. Samples: 514037720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:56:17,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 11:56:18,297][49750] Updated weights for policy 0, policy_version 168531 (0.0031) [2024-04-26 11:56:21,595][49750] Updated weights for policy 0, policy_version 168541 (0.0028) [2024-04-26 11:56:22,062][49517] Fps is (10 sec: 49153.5, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2761375744. Throughput: 0: 50336.9. Samples: 514187780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:56:22,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 11:56:24,629][49750] Updated weights for policy 0, policy_version 168551 (0.0026) [2024-04-26 11:56:27,063][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.0, 300 sec: 50262.8). Total num frames: 2761621504. Throughput: 0: 50477.6. Samples: 514496520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:56:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 11:56:27,983][49750] Updated weights for policy 0, policy_version 168561 (0.0030) [2024-04-26 11:56:31,236][49750] Updated weights for policy 0, policy_version 168571 (0.0033) [2024-04-26 11:56:32,063][49517] Fps is (10 sec: 50789.3, 60 sec: 49971.0, 300 sec: 50318.3). Total num frames: 2761883648. Throughput: 0: 50386.6. Samples: 514793340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:56:32,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 11:56:34,437][49750] Updated weights for policy 0, policy_version 168581 (0.0031) [2024-04-26 11:56:37,063][49517] Fps is (10 sec: 55705.6, 60 sec: 51336.3, 300 sec: 50484.9). Total num frames: 2762178560. Throughput: 0: 50338.0. Samples: 514942280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:56:37,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 11:56:37,860][49750] Updated weights for policy 0, policy_version 168591 (0.0031) [2024-04-26 11:56:40,993][49750] Updated weights for policy 0, policy_version 168601 (0.0032) [2024-04-26 11:56:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2762391552. Throughput: 0: 50577.9. Samples: 515255500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:56:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 11:56:42,121][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000168604_2762407936.pth... 
[2024-04-26 11:56:42,170][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000167865_2750300160.pth [2024-04-26 11:56:44,276][49750] Updated weights for policy 0, policy_version 168611 (0.0034) [2024-04-26 11:56:47,062][49517] Fps is (10 sec: 44237.6, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2762620928. Throughput: 0: 50450.3. Samples: 515553560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:56:47,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 11:56:47,813][49750] Updated weights for policy 0, policy_version 168621 (0.0028) [2024-04-26 11:56:50,663][49750] Updated weights for policy 0, policy_version 168631 (0.0031) [2024-04-26 11:56:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2762899456. Throughput: 0: 50398.6. Samples: 515694060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:56:52,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 11:56:54,253][49750] Updated weights for policy 0, policy_version 168641 (0.0029) [2024-04-26 11:56:54,729][49728] Signal inference workers to stop experience collection... (7750 times) [2024-04-26 11:56:54,765][49750] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-04-26 11:56:54,798][49728] Signal inference workers to resume experience collection... (7750 times) [2024-04-26 11:56:54,798][49750] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-04-26 11:56:57,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2763161600. Throughput: 0: 50222.1. Samples: 515994400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:56:57,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 11:56:57,230][49750] Updated weights for policy 0, policy_version 168651 (0.0033) [2024-04-26 11:57:00,776][49750] Updated weights for policy 0, policy_version 168661 (0.0040) [2024-04-26 11:57:02,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2763423744. Throughput: 0: 50434.2. Samples: 516307260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:57:02,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 11:57:03,762][49750] Updated weights for policy 0, policy_version 168671 (0.0034) [2024-04-26 11:57:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2763653120. Throughput: 0: 50488.4. Samples: 516459760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:57:07,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 11:57:07,178][49750] Updated weights for policy 0, policy_version 168681 (0.0030) [2024-04-26 11:57:10,144][49750] Updated weights for policy 0, policy_version 168691 (0.0033) [2024-04-26 11:57:12,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50244.5, 300 sec: 50318.3). Total num frames: 2763898880. Throughput: 0: 50288.2. Samples: 516759480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:57:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 11:57:13,778][49750] Updated weights for policy 0, policy_version 168701 (0.0033) [2024-04-26 11:57:16,934][49750] Updated weights for policy 0, policy_version 168711 (0.0034) [2024-04-26 11:57:17,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2764161024. Throughput: 0: 50415.1. Samples: 517062020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 11:57:17,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:57:20,228][49750] Updated weights for policy 0, policy_version 168721 (0.0034) [2024-04-26 11:57:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2764423168. Throughput: 0: 50593.6. Samples: 517218980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:57:22,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 11:57:23,556][49750] Updated weights for policy 0, policy_version 168731 (0.0040) [2024-04-26 11:57:26,731][49750] Updated weights for policy 0, policy_version 168741 (0.0034) [2024-04-26 11:57:27,063][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2764668928. Throughput: 0: 50301.7. Samples: 517519080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:57:27,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 11:57:29,896][49750] Updated weights for policy 0, policy_version 168751 (0.0030) [2024-04-26 11:57:32,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2764898304. Throughput: 0: 50485.7. Samples: 517825420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:57:32,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:57:33,109][49750] Updated weights for policy 0, policy_version 168761 (0.0029) [2024-04-26 11:57:36,465][49750] Updated weights for policy 0, policy_version 168771 (0.0030) [2024-04-26 11:57:37,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 2765160448. Throughput: 0: 50436.7. Samples: 517963720. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:57:37,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 11:57:39,514][49750] Updated weights for policy 0, policy_version 168781 (0.0028) [2024-04-26 11:57:42,062][49517] Fps is (10 sec: 54068.0, 60 sec: 50790.5, 300 sec: 50485.0). Total num frames: 2765438976. Throughput: 0: 50469.4. Samples: 518265520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:57:42,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 11:57:43,026][49750] Updated weights for policy 0, policy_version 168791 (0.0031) [2024-04-26 11:57:45,998][49750] Updated weights for policy 0, policy_version 168801 (0.0030) [2024-04-26 11:57:47,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50373.9). Total num frames: 2765684736. Throughput: 0: 50337.5. Samples: 518572440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:57:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 11:57:49,475][49750] Updated weights for policy 0, policy_version 168811 (0.0033) [2024-04-26 11:57:52,062][49517] Fps is (10 sec: 45874.8, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2765897728. Throughput: 0: 50309.2. Samples: 518723680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:57:52,063][49517] Avg episode reward: [(0, '0.434')] [2024-04-26 11:57:52,629][49750] Updated weights for policy 0, policy_version 168821 (0.0028) [2024-04-26 11:57:55,838][49750] Updated weights for policy 0, policy_version 168831 (0.0039) [2024-04-26 11:57:57,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2766159872. Throughput: 0: 50382.2. Samples: 519026680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:57:57,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 11:57:59,020][49750] Updated weights for policy 0, policy_version 168841 (0.0038) [2024-04-26 11:57:59,767][49728] Signal inference workers to stop experience collection... (7800 times) [2024-04-26 11:57:59,767][49728] Signal inference workers to resume experience collection... (7800 times) [2024-04-26 11:57:59,787][49750] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-04-26 11:57:59,787][49750] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-04-26 11:58:02,062][49517] Fps is (10 sec: 54067.7, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2766438400. Throughput: 0: 50275.0. Samples: 519324380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:58:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 11:58:02,351][49750] Updated weights for policy 0, policy_version 168851 (0.0034) [2024-04-26 11:58:05,574][49750] Updated weights for policy 0, policy_version 168861 (0.0029) [2024-04-26 11:58:07,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2766700544. Throughput: 0: 50486.6. Samples: 519490880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:58:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 11:58:08,851][49750] Updated weights for policy 0, policy_version 168871 (0.0035) [2024-04-26 11:58:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2766929920. Throughput: 0: 50452.2. Samples: 519789420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:58:12,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 11:58:12,082][49750] Updated weights for policy 0, policy_version 168881 (0.0031) [2024-04-26 11:58:15,411][49750] Updated weights for policy 0, policy_version 168891 (0.0031) [2024-04-26 11:58:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.5, 300 sec: 50373.9). Total num frames: 2767175680. Throughput: 0: 50281.1. Samples: 520088060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:58:17,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 11:58:18,571][49750] Updated weights for policy 0, policy_version 168901 (0.0033) [2024-04-26 11:58:21,962][49750] Updated weights for policy 0, policy_version 168911 (0.0032) [2024-04-26 11:58:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2767437824. Throughput: 0: 50403.7. Samples: 520231880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 11:58:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 11:58:25,121][49750] Updated weights for policy 0, policy_version 168921 (0.0038) [2024-04-26 11:58:27,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2767699968. Throughput: 0: 50503.0. Samples: 520538160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:58:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 11:58:28,507][49750] Updated weights for policy 0, policy_version 168931 (0.0034) [2024-04-26 11:58:31,560][49750] Updated weights for policy 0, policy_version 168941 (0.0027) [2024-04-26 11:58:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 2767945728. Throughput: 0: 50379.5. Samples: 520839520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:58:32,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 11:58:35,097][49750] Updated weights for policy 0, policy_version 168951 (0.0028) [2024-04-26 11:58:37,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2768175104. Throughput: 0: 50398.5. Samples: 520991620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:58:37,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 11:58:38,069][49750] Updated weights for policy 0, policy_version 168961 (0.0035) [2024-04-26 11:58:41,507][49750] Updated weights for policy 0, policy_version 168971 (0.0032) [2024-04-26 11:58:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 2768437248. Throughput: 0: 50296.4. Samples: 521290020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:58:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 11:58:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000168972_2768437248.pth... [2024-04-26 11:58:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000168236_2756378624.pth [2024-04-26 11:58:44,680][49750] Updated weights for policy 0, policy_version 168981 (0.0034) [2024-04-26 11:58:47,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2768699392. Throughput: 0: 50320.9. Samples: 521588820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:58:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 11:58:48,067][49750] Updated weights for policy 0, policy_version 168991 (0.0033) [2024-04-26 11:58:51,336][49750] Updated weights for policy 0, policy_version 169001 (0.0039) [2024-04-26 11:58:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50373.9). Total num frames: 2768961536. Throughput: 0: 50228.0. Samples: 521751140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:58:52,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 11:58:54,456][49750] Updated weights for policy 0, policy_version 169011 (0.0032) [2024-04-26 11:58:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2769207296. Throughput: 0: 50251.2. Samples: 522050720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:58:57,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 11:58:57,923][49750] Updated weights for policy 0, policy_version 169021 (0.0029) [2024-04-26 11:59:01,024][49750] Updated weights for policy 0, policy_version 169031 (0.0028) [2024-04-26 11:59:02,063][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2769436672. Throughput: 0: 50221.2. Samples: 522348020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:59:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 11:59:04,344][49750] Updated weights for policy 0, policy_version 169041 (0.0033) [2024-04-26 11:59:06,511][49728] Signal inference workers to stop experience collection... (7850 times) [2024-04-26 11:59:06,511][49728] Signal inference workers to resume experience collection... (7850 times) [2024-04-26 11:59:06,540][49750] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-04-26 11:59:06,540][49750] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-04-26 11:59:07,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2769715200. Throughput: 0: 50395.0. Samples: 522499660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:59:07,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 11:59:07,411][49750] Updated weights for policy 0, policy_version 169051 (0.0032) [2024-04-26 11:59:10,838][49750] Updated weights for policy 0, policy_version 169061 (0.0029) [2024-04-26 11:59:12,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2769944576. Throughput: 0: 50406.8. Samples: 522806460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:59:12,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 11:59:13,989][49750] Updated weights for policy 0, policy_version 169071 (0.0034) [2024-04-26 11:59:17,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2770190336. Throughput: 0: 50251.2. Samples: 523100820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:59:17,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 11:59:17,370][49750] Updated weights for policy 0, policy_version 169081 (0.0035) [2024-04-26 11:59:20,402][49750] Updated weights for policy 0, policy_version 169091 (0.0030) [2024-04-26 11:59:22,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2770452480. Throughput: 0: 50398.4. Samples: 523259540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:59:22,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 11:59:23,695][49750] Updated weights for policy 0, policy_version 169101 (0.0032) [2024-04-26 11:59:26,846][49750] Updated weights for policy 0, policy_version 169111 (0.0027) [2024-04-26 11:59:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2770714624. Throughput: 0: 50438.7. Samples: 523559760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 11:59:27,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 11:59:30,076][49750] Updated weights for policy 0, policy_version 169121 (0.0031) [2024-04-26 11:59:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2770960384. Throughput: 0: 50566.3. Samples: 523864300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 11:59:32,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 11:59:33,647][49750] Updated weights for policy 0, policy_version 169131 (0.0033) [2024-04-26 11:59:36,606][49750] Updated weights for policy 0, policy_version 169141 (0.0031) [2024-04-26 11:59:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.6, 300 sec: 50318.3). Total num frames: 2771222528. Throughput: 0: 50388.9. Samples: 524018640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 11:59:37,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 11:59:40,086][49750] Updated weights for policy 0, policy_version 169151 (0.0031) [2024-04-26 11:59:42,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2771468288. Throughput: 0: 50428.7. Samples: 524320020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 11:59:42,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 11:59:43,182][49750] Updated weights for policy 0, policy_version 169161 (0.0042) [2024-04-26 11:59:46,522][49750] Updated weights for policy 0, policy_version 169171 (0.0034) [2024-04-26 11:59:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2771697664. Throughput: 0: 50456.6. Samples: 524618560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 11:59:47,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 11:59:49,660][49750] Updated weights for policy 0, policy_version 169181 (0.0039) [2024-04-26 11:59:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 50373.9). 
Total num frames: 2771959808. Throughput: 0: 50396.5. Samples: 524767500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 11:59:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 11:59:52,905][49750] Updated weights for policy 0, policy_version 169191 (0.0027) [2024-04-26 11:59:56,213][49750] Updated weights for policy 0, policy_version 169201 (0.0033) [2024-04-26 11:59:57,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 2772238336. Throughput: 0: 50309.2. Samples: 525070380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 11:59:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 11:59:59,351][49750] Updated weights for policy 0, policy_version 169211 (0.0035) [2024-04-26 12:00:02,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2772467712. Throughput: 0: 50439.1. Samples: 525370580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 12:00:02,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:00:02,834][49750] Updated weights for policy 0, policy_version 169221 (0.0038) [2024-04-26 12:00:05,859][49750] Updated weights for policy 0, policy_version 169231 (0.0030) [2024-04-26 12:00:07,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.3, 300 sec: 50318.4). Total num frames: 2772713472. Throughput: 0: 50272.5. Samples: 525521800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 12:00:07,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 12:00:09,197][49750] Updated weights for policy 0, policy_version 169241 (0.0028) [2024-04-26 12:00:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2772959232. Throughput: 0: 50318.7. Samples: 525824100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 12:00:12,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 12:00:12,581][49750] Updated weights for policy 0, policy_version 169251 (0.0032) [2024-04-26 12:00:13,597][49728] Signal inference workers to stop experience collection... (7900 times) [2024-04-26 12:00:13,634][49750] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-04-26 12:00:13,670][49728] Signal inference workers to resume experience collection... (7900 times) [2024-04-26 12:00:13,670][49750] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-04-26 12:00:15,691][49750] Updated weights for policy 0, policy_version 169261 (0.0032) [2024-04-26 12:00:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2773221376. Throughput: 0: 50187.5. Samples: 526122740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 12:00:17,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 12:00:19,252][49750] Updated weights for policy 0, policy_version 169271 (0.0036) [2024-04-26 12:00:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2773483520. Throughput: 0: 50257.8. Samples: 526280240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 12:00:22,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 12:00:22,127][49750] Updated weights for policy 0, policy_version 169281 (0.0028) [2024-04-26 12:00:25,724][49750] Updated weights for policy 0, policy_version 169291 (0.0032) [2024-04-26 12:00:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2773745664. Throughput: 0: 50323.6. Samples: 526584580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 12:00:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 12:00:28,613][49750] Updated weights for policy 0, policy_version 169301 (0.0029) [2024-04-26 12:00:32,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2773958656. Throughput: 0: 50403.5. Samples: 526886720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 12:00:32,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 12:00:32,328][49750] Updated weights for policy 0, policy_version 169311 (0.0031) [2024-04-26 12:00:35,067][49750] Updated weights for policy 0, policy_version 169321 (0.0028) [2024-04-26 12:00:37,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.1, 300 sec: 50429.4). Total num frames: 2774237184. Throughput: 0: 50419.5. Samples: 527036380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:00:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 12:00:38,896][49750] Updated weights for policy 0, policy_version 169331 (0.0036) [2024-04-26 12:00:41,486][49750] Updated weights for policy 0, policy_version 169341 (0.0033) [2024-04-26 12:00:42,063][49517] Fps is (10 sec: 54066.1, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2774499328. Throughput: 0: 50260.3. Samples: 527332100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:00:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 12:00:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000169342_2774499328.pth... [2024-04-26 12:00:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000168604_2762407936.pth [2024-04-26 12:00:45,336][49750] Updated weights for policy 0, policy_version 169351 (0.0029) [2024-04-26 12:00:47,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2774745088. Throughput: 0: 50350.5. Samples: 527636360. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:00:47,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:00:47,974][49750] Updated weights for policy 0, policy_version 169361 (0.0028) [2024-04-26 12:00:51,840][49750] Updated weights for policy 0, policy_version 169371 (0.0035) [2024-04-26 12:00:52,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50244.4, 300 sec: 50318.4). Total num frames: 2774974464. Throughput: 0: 50331.5. Samples: 527786720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:00:52,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 12:00:54,538][49750] Updated weights for policy 0, policy_version 169381 (0.0034) [2024-04-26 12:00:57,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 2775236608. Throughput: 0: 50361.8. Samples: 528090380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:00:57,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 12:00:58,323][49750] Updated weights for policy 0, policy_version 169391 (0.0036) [2024-04-26 12:01:01,127][49750] Updated weights for policy 0, policy_version 169401 (0.0033) [2024-04-26 12:01:02,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2775498752. Throughput: 0: 50387.1. Samples: 528390160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:01:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 12:01:04,719][49750] Updated weights for policy 0, policy_version 169411 (0.0036) [2024-04-26 12:01:07,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2775744512. Throughput: 0: 50340.4. Samples: 528545560. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:01:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 12:01:07,553][49750] Updated weights for policy 0, policy_version 169421 (0.0037) [2024-04-26 12:01:11,175][49750] Updated weights for policy 0, policy_version 169431 (0.0039) [2024-04-26 12:01:12,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2776006656. Throughput: 0: 50305.7. Samples: 528848340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:01:12,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 12:01:14,188][49750] Updated weights for policy 0, policy_version 169441 (0.0034) [2024-04-26 12:01:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2776236032. Throughput: 0: 50337.3. Samples: 529151900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:01:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 12:01:17,865][49750] Updated weights for policy 0, policy_version 169451 (0.0032) [2024-04-26 12:01:20,624][49750] Updated weights for policy 0, policy_version 169461 (0.0036) [2024-04-26 12:01:22,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2776481792. Throughput: 0: 50233.0. Samples: 529296860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:01:22,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 12:01:24,325][49750] Updated weights for policy 0, policy_version 169471 (0.0033) [2024-04-26 12:01:27,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2776760320. Throughput: 0: 50198.4. Samples: 529591020. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:01:27,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 12:01:27,221][49750] Updated weights for policy 0, policy_version 169481 (0.0027) [2024-04-26 12:01:30,743][49750] Updated weights for policy 0, policy_version 169491 (0.0036) [2024-04-26 12:01:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 2777006080. Throughput: 0: 50208.5. Samples: 529895740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 12:01:32,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 12:01:33,620][49750] Updated weights for policy 0, policy_version 169501 (0.0029) [2024-04-26 12:01:36,641][49728] Signal inference workers to stop experience collection... (7950 times) [2024-04-26 12:01:36,692][49750] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-04-26 12:01:36,706][49728] Signal inference workers to resume experience collection... (7950 times) [2024-04-26 12:01:36,714][49750] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-04-26 12:01:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2777251840. Throughput: 0: 50366.2. Samples: 530053200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:01:37,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 12:01:37,301][49750] Updated weights for policy 0, policy_version 169511 (0.0032) [2024-04-26 12:01:40,224][49750] Updated weights for policy 0, policy_version 169521 (0.0033) [2024-04-26 12:01:42,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2777513984. Throughput: 0: 50178.1. Samples: 530348400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:01:42,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 12:01:43,811][49750] Updated weights for policy 0, policy_version 169531 (0.0038) [2024-04-26 12:01:46,825][49750] Updated weights for policy 0, policy_version 169541 (0.0031) [2024-04-26 12:01:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2777759744. Throughput: 0: 50399.6. Samples: 530658140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:01:47,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 12:01:50,330][49750] Updated weights for policy 0, policy_version 169551 (0.0033) [2024-04-26 12:01:52,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.1, 300 sec: 50318.3). Total num frames: 2778005504. Throughput: 0: 50444.7. Samples: 530815580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:01:52,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 12:01:53,271][49750] Updated weights for policy 0, policy_version 169561 (0.0028) [2024-04-26 12:01:56,671][49750] Updated weights for policy 0, policy_version 169571 (0.0030) [2024-04-26 12:01:57,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2778267648. Throughput: 0: 50368.9. Samples: 531114940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:01:57,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 12:01:59,758][49750] Updated weights for policy 0, policy_version 169581 (0.0034) [2024-04-26 12:02:02,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2778513408. Throughput: 0: 50303.9. Samples: 531415580. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:02:02,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 12:02:03,154][49750] Updated weights for policy 0, policy_version 169591 (0.0031) [2024-04-26 12:02:06,244][49750] Updated weights for policy 0, policy_version 169601 (0.0029) [2024-04-26 12:02:07,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2778759168. Throughput: 0: 50381.4. Samples: 531564020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:02:07,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 12:02:09,705][49750] Updated weights for policy 0, policy_version 169611 (0.0030) [2024-04-26 12:02:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2779004928. Throughput: 0: 50556.8. Samples: 531866080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:02:12,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 12:02:12,779][49750] Updated weights for policy 0, policy_version 169621 (0.0032) [2024-04-26 12:02:16,179][49750] Updated weights for policy 0, policy_version 169631 (0.0033) [2024-04-26 12:02:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 2779299840. Throughput: 0: 50513.8. Samples: 532168860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:02:17,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 12:02:19,246][49750] Updated weights for policy 0, policy_version 169641 (0.0028) [2024-04-26 12:02:22,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2779512832. Throughput: 0: 50415.5. Samples: 532321900. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:02:22,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 12:02:22,699][49750] Updated weights for policy 0, policy_version 169651 (0.0034) [2024-04-26 12:02:25,860][49750] Updated weights for policy 0, policy_version 169661 (0.0037) [2024-04-26 12:02:27,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2779774976. Throughput: 0: 50609.3. Samples: 532625820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:02:27,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 12:02:29,158][49750] Updated weights for policy 0, policy_version 169671 (0.0031) [2024-04-26 12:02:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2780020736. Throughput: 0: 50475.6. Samples: 532929540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:02:32,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 12:02:32,309][49750] Updated weights for policy 0, policy_version 169681 (0.0035) [2024-04-26 12:02:35,635][49750] Updated weights for policy 0, policy_version 169691 (0.0039) [2024-04-26 12:02:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2780266496. Throughput: 0: 50248.6. Samples: 533076760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 12:02:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 12:02:38,971][49750] Updated weights for policy 0, policy_version 169701 (0.0035) [2024-04-26 12:02:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2780512256. Throughput: 0: 50283.6. Samples: 533377700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:02:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 12:02:42,088][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000169710_2780528640.pth... 
[2024-04-26 12:02:42,135][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000168972_2768437248.pth [2024-04-26 12:02:42,309][49750] Updated weights for policy 0, policy_version 169711 (0.0033) [2024-04-26 12:02:45,474][49750] Updated weights for policy 0, policy_version 169721 (0.0040) [2024-04-26 12:02:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2780774400. Throughput: 0: 50236.5. Samples: 533676220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:02:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 12:02:48,954][49750] Updated weights for policy 0, policy_version 169731 (0.0029) [2024-04-26 12:02:52,048][49750] Updated weights for policy 0, policy_version 169741 (0.0033) [2024-04-26 12:02:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2781036544. Throughput: 0: 50179.1. Samples: 533822080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:02:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 12:02:55,322][49750] Updated weights for policy 0, policy_version 169751 (0.0031) [2024-04-26 12:02:57,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2781265920. Throughput: 0: 50204.2. Samples: 534125260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:02:57,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 12:02:58,562][49750] Updated weights for policy 0, policy_version 169761 (0.0032) [2024-04-26 12:02:59,869][49728] Signal inference workers to stop experience collection... (8000 times) [2024-04-26 12:02:59,874][49728] Signal inference workers to resume experience collection... 
(8000 times) [2024-04-26 12:02:59,901][49750] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-04-26 12:02:59,901][49750] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-04-26 12:03:01,648][49750] Updated weights for policy 0, policy_version 169771 (0.0032) [2024-04-26 12:03:02,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2781544448. Throughput: 0: 50321.7. Samples: 534433340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:03:02,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 12:03:05,074][49750] Updated weights for policy 0, policy_version 169781 (0.0026) [2024-04-26 12:03:07,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2781790208. Throughput: 0: 50286.7. Samples: 534584800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:03:07,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 12:03:08,134][49750] Updated weights for policy 0, policy_version 169791 (0.0030) [2024-04-26 12:03:11,536][49750] Updated weights for policy 0, policy_version 169801 (0.0028) [2024-04-26 12:03:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2782035968. Throughput: 0: 50254.7. Samples: 534887280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:03:12,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 12:03:14,677][49750] Updated weights for policy 0, policy_version 169811 (0.0026) [2024-04-26 12:03:17,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 2782281728. Throughput: 0: 50341.7. Samples: 535194920. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:03:17,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 12:03:18,222][49750] Updated weights for policy 0, policy_version 169821 (0.0028) [2024-04-26 12:03:21,041][49750] Updated weights for policy 0, policy_version 169831 (0.0040) [2024-04-26 12:03:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2782560256. Throughput: 0: 50381.4. Samples: 535343920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:03:22,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 12:03:24,716][49750] Updated weights for policy 0, policy_version 169841 (0.0031) [2024-04-26 12:03:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2782789632. Throughput: 0: 50404.4. Samples: 535645900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:03:27,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 12:03:27,482][49750] Updated weights for policy 0, policy_version 169851 (0.0031) [2024-04-26 12:03:31,178][49750] Updated weights for policy 0, policy_version 169861 (0.0037) [2024-04-26 12:03:32,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.1, 300 sec: 50373.9). Total num frames: 2783035392. Throughput: 0: 50515.9. Samples: 535949440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:03:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 12:03:34,065][49750] Updated weights for policy 0, policy_version 169871 (0.0032) [2024-04-26 12:03:37,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2783297536. Throughput: 0: 50423.6. Samples: 536091140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:03:37,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 12:03:37,922][49750] Updated weights for policy 0, policy_version 169881 (0.0031) [2024-04-26 12:03:40,581][49750] Updated weights for policy 0, policy_version 169891 (0.0037) [2024-04-26 12:03:42,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2783543296. Throughput: 0: 50439.8. Samples: 536395060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:03:42,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 12:03:44,526][49750] Updated weights for policy 0, policy_version 169901 (0.0034) [2024-04-26 12:03:47,016][49750] Updated weights for policy 0, policy_version 169911 (0.0030) [2024-04-26 12:03:47,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50373.8). Total num frames: 2783821824. Throughput: 0: 50420.0. Samples: 536702240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:03:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 12:03:50,867][49750] Updated weights for policy 0, policy_version 169921 (0.0029) [2024-04-26 12:03:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2784051200. Throughput: 0: 50377.7. Samples: 536851800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:03:52,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:03:53,563][49750] Updated weights for policy 0, policy_version 169931 (0.0034) [2024-04-26 12:03:57,062][49517] Fps is (10 sec: 45875.6, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2784280576. Throughput: 0: 50388.6. Samples: 537154760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:03:57,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:03:57,446][49750] Updated weights for policy 0, policy_version 169941 (0.0025) [2024-04-26 12:03:59,852][49728] Signal inference workers to stop experience collection... 
(8050 times) [2024-04-26 12:03:59,852][49728] Signal inference workers to resume experience collection... (8050 times) [2024-04-26 12:03:59,864][49750] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-04-26 12:03:59,885][49750] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-04-26 12:03:59,984][49750] Updated weights for policy 0, policy_version 169951 (0.0032) [2024-04-26 12:04:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2784559104. Throughput: 0: 50213.0. Samples: 537454500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:02,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 12:04:04,008][49750] Updated weights for policy 0, policy_version 169961 (0.0032) [2024-04-26 12:04:06,658][49750] Updated weights for policy 0, policy_version 169971 (0.0029) [2024-04-26 12:04:07,063][49517] Fps is (10 sec: 54066.2, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2784821248. Throughput: 0: 50295.4. Samples: 537607220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:07,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 12:04:10,326][49750] Updated weights for policy 0, policy_version 169981 (0.0025) [2024-04-26 12:04:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2785067008. Throughput: 0: 50418.7. Samples: 537914740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:12,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 12:04:13,125][49750] Updated weights for policy 0, policy_version 169991 (0.0034) [2024-04-26 12:04:16,752][49750] Updated weights for policy 0, policy_version 170001 (0.0034) [2024-04-26 12:04:17,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2785296384. Throughput: 0: 50378.0. Samples: 538216440. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:17,063][49517] Avg episode reward: [(0, '0.469')] [2024-04-26 12:04:19,685][49750] Updated weights for policy 0, policy_version 170011 (0.0032) [2024-04-26 12:04:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2785574912. Throughput: 0: 50429.4. Samples: 538360460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 12:04:23,465][49750] Updated weights for policy 0, policy_version 170021 (0.0034) [2024-04-26 12:04:26,054][49750] Updated weights for policy 0, policy_version 170031 (0.0025) [2024-04-26 12:04:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2785804288. Throughput: 0: 50335.3. Samples: 538660140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:27,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 12:04:29,914][49750] Updated weights for policy 0, policy_version 170041 (0.0032) [2024-04-26 12:04:32,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50318.3). Total num frames: 2786066432. Throughput: 0: 50158.3. Samples: 538959360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 12:04:32,729][49750] Updated weights for policy 0, policy_version 170051 (0.0032) [2024-04-26 12:04:36,465][49750] Updated weights for policy 0, policy_version 170061 (0.0035) [2024-04-26 12:04:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2786312192. Throughput: 0: 50345.5. Samples: 539117340. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:37,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 12:04:39,285][49750] Updated weights for policy 0, policy_version 170071 (0.0033) [2024-04-26 12:04:42,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 2786557952. Throughput: 0: 50365.7. Samples: 539421220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:42,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 12:04:42,162][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000170079_2786574336.pth... [2024-04-26 12:04:42,207][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000169342_2774499328.pth [2024-04-26 12:04:43,066][49750] Updated weights for policy 0, policy_version 170081 (0.0031) [2024-04-26 12:04:45,680][49750] Updated weights for policy 0, policy_version 170091 (0.0030) [2024-04-26 12:04:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2786803712. Throughput: 0: 50328.9. Samples: 539719300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-26 12:04:47,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 12:04:49,514][49750] Updated weights for policy 0, policy_version 170101 (0.0029) [2024-04-26 12:04:52,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2787082240. Throughput: 0: 50415.5. Samples: 539875920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:04:52,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 12:04:52,133][49750] Updated weights for policy 0, policy_version 170111 (0.0037) [2024-04-26 12:04:55,886][49750] Updated weights for policy 0, policy_version 170121 (0.0030) [2024-04-26 12:04:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2787328000. Throughput: 0: 50368.3. Samples: 540181320. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:04:57,072][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 12:04:58,902][49750] Updated weights for policy 0, policy_version 170131 (0.0032) [2024-04-26 12:05:02,062][49517] Fps is (10 sec: 47514.4, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2787557376. Throughput: 0: 50290.1. Samples: 540479500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:02,072][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 12:05:02,514][49750] Updated weights for policy 0, policy_version 170141 (0.0035) [2024-04-26 12:05:03,419][49728] Signal inference workers to stop experience collection... (8100 times) [2024-04-26 12:05:03,468][49750] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-04-26 12:05:03,484][49728] Signal inference workers to resume experience collection... (8100 times) [2024-04-26 12:05:03,492][49750] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-04-26 12:05:05,366][49750] Updated weights for policy 0, policy_version 170151 (0.0028) [2024-04-26 12:05:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.3, 300 sec: 50373.8). Total num frames: 2787819520. Throughput: 0: 50260.8. Samples: 540622200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:07,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:05:08,977][49750] Updated weights for policy 0, policy_version 170161 (0.0030) [2024-04-26 12:05:11,857][49750] Updated weights for policy 0, policy_version 170171 (0.0032) [2024-04-26 12:05:12,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2788081664. Throughput: 0: 50382.5. Samples: 540927360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:12,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 12:05:15,442][49750] Updated weights for policy 0, policy_version 170181 (0.0029) [2024-04-26 12:05:17,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2788343808. Throughput: 0: 50446.2. Samples: 541229440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 12:05:18,437][49750] Updated weights for policy 0, policy_version 170191 (0.0036) [2024-04-26 12:05:21,862][49750] Updated weights for policy 0, policy_version 170201 (0.0028) [2024-04-26 12:05:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2788573184. Throughput: 0: 50268.8. Samples: 541379440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:22,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 12:05:24,816][49750] Updated weights for policy 0, policy_version 170211 (0.0028) [2024-04-26 12:05:27,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2788835328. Throughput: 0: 50309.3. Samples: 541685140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:27,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 12:05:28,351][49750] Updated weights for policy 0, policy_version 170221 (0.0028) [2024-04-26 12:05:31,173][49750] Updated weights for policy 0, policy_version 170231 (0.0034) [2024-04-26 12:05:32,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.1, 300 sec: 50318.3). Total num frames: 2789081088. Throughput: 0: 50380.2. Samples: 541986420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:32,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 12:05:34,909][49750] Updated weights for policy 0, policy_version 170241 (0.0035) [2024-04-26 12:05:37,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2789326848. Throughput: 0: 50445.1. Samples: 542145940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:37,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 12:05:37,812][49750] Updated weights for policy 0, policy_version 170251 (0.0032) [2024-04-26 12:05:41,395][49750] Updated weights for policy 0, policy_version 170261 (0.0029) [2024-04-26 12:05:42,063][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2789605376. Throughput: 0: 50408.5. Samples: 542449700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:42,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 12:05:44,254][49750] Updated weights for policy 0, policy_version 170271 (0.0034) [2024-04-26 12:05:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2789818368. Throughput: 0: 50263.7. Samples: 542741360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 12:05:47,919][49750] Updated weights for policy 0, policy_version 170281 (0.0031) [2024-04-26 12:05:50,735][49750] Updated weights for policy 0, policy_version 170291 (0.0031) [2024-04-26 12:05:52,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.4, 300 sec: 50318.3). Total num frames: 2790080512. Throughput: 0: 50457.9. Samples: 542892800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:05:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:05:54,493][49750] Updated weights for policy 0, policy_version 170301 (0.0030) [2024-04-26 12:05:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2790342656. Throughput: 0: 50404.6. Samples: 543195560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:05:57,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 12:05:57,228][49750] Updated weights for policy 0, policy_version 170311 (0.0036) [2024-04-26 12:06:00,988][49750] Updated weights for policy 0, policy_version 170321 (0.0031) [2024-04-26 12:06:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2790588416. Throughput: 0: 50355.4. Samples: 543495440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:02,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 12:06:03,865][49750] Updated weights for policy 0, policy_version 170331 (0.0031) [2024-04-26 12:06:07,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 2790817792. Throughput: 0: 50354.3. Samples: 543645380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 12:06:07,406][49750] Updated weights for policy 0, policy_version 170341 (0.0031) [2024-04-26 12:06:07,994][49728] Signal inference workers to stop experience collection... (8150 times) [2024-04-26 12:06:08,035][49750] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-04-26 12:06:08,097][49728] Signal inference workers to resume experience collection... 
(8150 times) [2024-04-26 12:06:08,097][49750] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-04-26 12:06:10,441][49750] Updated weights for policy 0, policy_version 170351 (0.0037) [2024-04-26 12:06:12,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2791096320. Throughput: 0: 50174.8. Samples: 543943000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:12,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 12:06:13,877][49750] Updated weights for policy 0, policy_version 170361 (0.0037) [2024-04-26 12:06:16,888][49750] Updated weights for policy 0, policy_version 170371 (0.0032) [2024-04-26 12:06:17,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2791358464. Throughput: 0: 50306.9. Samples: 544250220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:17,063][49517] Avg episode reward: [(0, '0.414')] [2024-04-26 12:06:20,465][49750] Updated weights for policy 0, policy_version 170381 (0.0033) [2024-04-26 12:06:22,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2791604224. Throughput: 0: 50046.5. Samples: 544398040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 12:06:23,488][49750] Updated weights for policy 0, policy_version 170391 (0.0034) [2024-04-26 12:06:26,922][49750] Updated weights for policy 0, policy_version 170401 (0.0027) [2024-04-26 12:06:27,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2791849984. Throughput: 0: 50055.4. Samples: 544702200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:27,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 12:06:30,070][49750] Updated weights for policy 0, policy_version 170411 (0.0038) [2024-04-26 12:06:32,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2792095744. Throughput: 0: 50234.5. Samples: 545001920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:32,063][49517] Avg episode reward: [(0, '0.677')] [2024-04-26 12:06:33,403][49750] Updated weights for policy 0, policy_version 170421 (0.0038) [2024-04-26 12:06:36,573][49750] Updated weights for policy 0, policy_version 170431 (0.0039) [2024-04-26 12:06:37,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2792341504. Throughput: 0: 50152.3. Samples: 545149660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:37,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 12:06:39,965][49750] Updated weights for policy 0, policy_version 170441 (0.0030) [2024-04-26 12:06:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2792603648. Throughput: 0: 50240.3. Samples: 545456380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:42,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 12:06:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000170447_2792603648.pth... [2024-04-26 12:06:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000169710_2780528640.pth [2024-04-26 12:06:43,054][49750] Updated weights for policy 0, policy_version 170451 (0.0033) [2024-04-26 12:06:46,574][49750] Updated weights for policy 0, policy_version 170461 (0.0031) [2024-04-26 12:06:47,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50318.4). Total num frames: 2792849408. Throughput: 0: 50318.0. Samples: 545759740. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 12:06:49,697][49750] Updated weights for policy 0, policy_version 170471 (0.0029) [2024-04-26 12:06:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.1, 300 sec: 50262.8). Total num frames: 2793095168. Throughput: 0: 50262.0. Samples: 545907180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 12:06:53,224][49750] Updated weights for policy 0, policy_version 170481 (0.0034) [2024-04-26 12:06:56,309][49750] Updated weights for policy 0, policy_version 170491 (0.0031) [2024-04-26 12:06:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2793357312. Throughput: 0: 50286.3. Samples: 546205880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 12:06:57,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 12:06:59,878][49750] Updated weights for policy 0, policy_version 170501 (0.0033) [2024-04-26 12:07:02,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2793619456. Throughput: 0: 50139.0. Samples: 546506480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:02,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 12:07:02,767][49750] Updated weights for policy 0, policy_version 170511 (0.0035) [2024-04-26 12:07:06,288][49750] Updated weights for policy 0, policy_version 170521 (0.0031) [2024-04-26 12:07:07,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2793865216. Throughput: 0: 50195.2. Samples: 546656820. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:07,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:07:09,317][49750] Updated weights for policy 0, policy_version 170531 (0.0036) [2024-04-26 12:07:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.1, 300 sec: 50207.2). Total num frames: 2794110976. Throughput: 0: 50063.6. Samples: 546955060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:12,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 12:07:12,800][49750] Updated weights for policy 0, policy_version 170541 (0.0036) [2024-04-26 12:07:16,019][49750] Updated weights for policy 0, policy_version 170551 (0.0031) [2024-04-26 12:07:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2794373120. Throughput: 0: 50159.2. Samples: 547259080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 12:07:19,222][49750] Updated weights for policy 0, policy_version 170561 (0.0032) [2024-04-26 12:07:22,062][49517] Fps is (10 sec: 49153.0, 60 sec: 49971.4, 300 sec: 50262.8). Total num frames: 2794602496. Throughput: 0: 50332.1. Samples: 547414600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:22,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 12:07:22,720][49750] Updated weights for policy 0, policy_version 170571 (0.0037) [2024-04-26 12:07:25,837][49750] Updated weights for policy 0, policy_version 170581 (0.0031) [2024-04-26 12:07:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 2794881024. Throughput: 0: 50182.8. Samples: 547714600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:27,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 12:07:29,170][49750] Updated weights for policy 0, policy_version 170591 (0.0036) [2024-04-26 12:07:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2795094016. Throughput: 0: 50039.1. Samples: 548011500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:32,063][49517] Avg episode reward: [(0, '0.444')] [2024-04-26 12:07:32,279][49750] Updated weights for policy 0, policy_version 170601 (0.0037) [2024-04-26 12:07:35,686][49750] Updated weights for policy 0, policy_version 170611 (0.0028) [2024-04-26 12:07:36,321][49728] Signal inference workers to stop experience collection... (8200 times) [2024-04-26 12:07:36,321][49728] Signal inference workers to resume experience collection... (8200 times) [2024-04-26 12:07:36,339][49750] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-04-26 12:07:36,339][49750] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-04-26 12:07:37,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2795372544. Throughput: 0: 50236.5. Samples: 548167820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:07:38,639][49750] Updated weights for policy 0, policy_version 170621 (0.0031) [2024-04-26 12:07:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2795601920. Throughput: 0: 50203.8. Samples: 548465060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:42,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 12:07:42,106][49750] Updated weights for policy 0, policy_version 170631 (0.0031) [2024-04-26 12:07:45,288][49750] Updated weights for policy 0, policy_version 170641 (0.0036) [2024-04-26 12:07:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2795880448. Throughput: 0: 50058.3. Samples: 548759100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:47,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 12:07:48,590][49750] Updated weights for policy 0, policy_version 170651 (0.0035) [2024-04-26 12:07:51,968][49750] Updated weights for policy 0, policy_version 170661 (0.0037) [2024-04-26 12:07:52,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2796109824. Throughput: 0: 50206.2. Samples: 548916100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:07:55,002][49750] Updated weights for policy 0, policy_version 170671 (0.0027) [2024-04-26 12:07:57,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.1, 300 sec: 50262.8). Total num frames: 2796371968. Throughput: 0: 50504.0. Samples: 549227740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:07:57,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 12:07:58,493][49750] Updated weights for policy 0, policy_version 170681 (0.0034) [2024-04-26 12:08:01,430][49750] Updated weights for policy 0, policy_version 170691 (0.0030) [2024-04-26 12:08:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2796617728. Throughput: 0: 50304.7. Samples: 549522800. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 12:08:02,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:08:04,864][49750] Updated weights for policy 0, policy_version 170701 (0.0027) [2024-04-26 12:08:07,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2796863488. Throughput: 0: 50340.2. Samples: 549679920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 12:08:07,910][49750] Updated weights for policy 0, policy_version 170711 (0.0035) [2024-04-26 12:08:11,352][49750] Updated weights for policy 0, policy_version 170721 (0.0029) [2024-04-26 12:08:12,062][49517] Fps is (10 sec: 52430.3, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 2797142016. Throughput: 0: 50304.0. Samples: 549978280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 12:08:14,422][49750] Updated weights for policy 0, policy_version 170731 (0.0029) [2024-04-26 12:08:17,063][49517] Fps is (10 sec: 50790.8, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 2797371392. Throughput: 0: 50357.2. Samples: 550277580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:17,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 12:08:17,757][49750] Updated weights for policy 0, policy_version 170741 (0.0029) [2024-04-26 12:08:21,003][49750] Updated weights for policy 0, policy_version 170751 (0.0033) [2024-04-26 12:08:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2797633536. Throughput: 0: 50322.3. Samples: 550432320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:22,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 12:08:24,207][49750] Updated weights for policy 0, policy_version 170761 (0.0033) [2024-04-26 12:08:27,063][49517] Fps is (10 sec: 50789.8, 60 sec: 49971.0, 300 sec: 50318.3). Total num frames: 2797879296. Throughput: 0: 50402.5. Samples: 550733180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:27,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 12:08:27,629][49750] Updated weights for policy 0, policy_version 170771 (0.0033) [2024-04-26 12:08:30,633][49750] Updated weights for policy 0, policy_version 170781 (0.0028) [2024-04-26 12:08:32,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50373.8). Total num frames: 2798157824. Throughput: 0: 50498.7. Samples: 551031540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:32,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 12:08:33,926][49750] Updated weights for policy 0, policy_version 170791 (0.0033) [2024-04-26 12:08:37,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2798387200. Throughput: 0: 50542.3. Samples: 551190500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 12:08:37,124][49750] Updated weights for policy 0, policy_version 170801 (0.0035) [2024-04-26 12:08:37,163][49728] Signal inference workers to stop experience collection... (8250 times) [2024-04-26 12:08:37,163][49728] Signal inference workers to resume experience collection... 
(8250 times) [2024-04-26 12:08:37,187][49750] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-04-26 12:08:37,187][49750] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-04-26 12:08:40,274][49750] Updated weights for policy 0, policy_version 170811 (0.0031) [2024-04-26 12:08:42,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 2798632960. Throughput: 0: 50251.6. Samples: 551489060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:42,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 12:08:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000170815_2798632960.pth... [2024-04-26 12:08:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000170079_2786574336.pth [2024-04-26 12:08:43,670][49750] Updated weights for policy 0, policy_version 170821 (0.0033) [2024-04-26 12:08:46,872][49750] Updated weights for policy 0, policy_version 170831 (0.0033) [2024-04-26 12:08:47,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2798895104. Throughput: 0: 50483.2. Samples: 551794540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:47,064][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 12:08:50,216][49750] Updated weights for policy 0, policy_version 170841 (0.0038) [2024-04-26 12:08:52,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2799140864. Throughput: 0: 50402.7. Samples: 551948040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:52,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 12:08:53,450][49750] Updated weights for policy 0, policy_version 170851 (0.0042) [2024-04-26 12:08:56,747][49750] Updated weights for policy 0, policy_version 170861 (0.0034) [2024-04-26 12:08:57,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50318.3). 
Total num frames: 2799403008. Throughput: 0: 50460.3. Samples: 552249000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:08:57,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 12:08:59,856][49750] Updated weights for policy 0, policy_version 170871 (0.0026) [2024-04-26 12:09:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.4, 300 sec: 50207.3). Total num frames: 2799632384. Throughput: 0: 50365.0. Samples: 552544000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:09:02,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 12:09:03,269][49750] Updated weights for policy 0, policy_version 170881 (0.0030) [2024-04-26 12:09:06,228][49750] Updated weights for policy 0, policy_version 170891 (0.0025) [2024-04-26 12:09:07,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.5, 300 sec: 50262.8). Total num frames: 2799894528. Throughput: 0: 50218.8. Samples: 552692160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 12:09:07,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 12:09:09,637][49750] Updated weights for policy 0, policy_version 170901 (0.0029) [2024-04-26 12:09:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2800140288. Throughput: 0: 50315.4. Samples: 552997360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:12,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:09:12,785][49750] Updated weights for policy 0, policy_version 170911 (0.0041) [2024-04-26 12:09:16,082][49750] Updated weights for policy 0, policy_version 170921 (0.0032) [2024-04-26 12:09:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 2800402432. Throughput: 0: 50404.1. Samples: 553299720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:17,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 12:09:19,374][49750] Updated weights for policy 0, policy_version 170931 (0.0032) [2024-04-26 12:09:22,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2800648192. Throughput: 0: 50189.6. Samples: 553449040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:22,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 12:09:22,501][49750] Updated weights for policy 0, policy_version 170941 (0.0039) [2024-04-26 12:09:26,122][49750] Updated weights for policy 0, policy_version 170951 (0.0029) [2024-04-26 12:09:27,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2800910336. Throughput: 0: 50360.4. Samples: 553755280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:27,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 12:09:29,123][49750] Updated weights for policy 0, policy_version 170961 (0.0038) [2024-04-26 12:09:32,062][49517] Fps is (10 sec: 50791.6, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2801156096. Throughput: 0: 50291.7. Samples: 554057660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:32,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 12:09:32,736][49750] Updated weights for policy 0, policy_version 170971 (0.0033) [2024-04-26 12:09:35,520][49750] Updated weights for policy 0, policy_version 170981 (0.0034) [2024-04-26 12:09:37,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2801418240. Throughput: 0: 50302.7. Samples: 554211660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:37,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 12:09:39,143][49750] Updated weights for policy 0, policy_version 170991 (0.0042) [2024-04-26 12:09:41,980][49750] Updated weights for policy 0, policy_version 171001 (0.0029) [2024-04-26 12:09:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2801680384. Throughput: 0: 50340.0. Samples: 554514300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:42,063][49517] Avg episode reward: [(0, '0.440')] [2024-04-26 12:09:45,524][49750] Updated weights for policy 0, policy_version 171011 (0.0037) [2024-04-26 12:09:47,063][49517] Fps is (10 sec: 47510.8, 60 sec: 49970.7, 300 sec: 50207.2). Total num frames: 2801893376. Throughput: 0: 50386.8. Samples: 554811440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:47,064][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 12:09:48,040][49728] Signal inference workers to stop experience collection... (8300 times) [2024-04-26 12:09:48,041][49728] Signal inference workers to resume experience collection... (8300 times) [2024-04-26 12:09:48,069][49750] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-04-26 12:09:48,069][49750] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-04-26 12:09:48,452][49750] Updated weights for policy 0, policy_version 171021 (0.0029) [2024-04-26 12:09:51,985][49750] Updated weights for policy 0, policy_version 171031 (0.0034) [2024-04-26 12:09:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.5, 300 sec: 50318.3). Total num frames: 2802171904. Throughput: 0: 50359.0. Samples: 554958320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:52,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 12:09:55,094][49750] Updated weights for policy 0, policy_version 171041 (0.0028) [2024-04-26 12:09:57,062][49517] Fps is (10 sec: 52432.4, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2802417664. Throughput: 0: 50294.6. Samples: 555260620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:09:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 12:09:58,479][49750] Updated weights for policy 0, policy_version 171051 (0.0030) [2024-04-26 12:10:01,698][49750] Updated weights for policy 0, policy_version 171061 (0.0031) [2024-04-26 12:10:02,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2802679808. Throughput: 0: 50361.0. Samples: 555565960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:10:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:10:04,862][49750] Updated weights for policy 0, policy_version 171071 (0.0034) [2024-04-26 12:10:07,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2802909184. Throughput: 0: 50313.9. Samples: 555713160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:10:07,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 12:10:08,084][49750] Updated weights for policy 0, policy_version 171081 (0.0027) [2024-04-26 12:10:11,351][49750] Updated weights for policy 0, policy_version 171091 (0.0030) [2024-04-26 12:10:12,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 2803171328. Throughput: 0: 50315.6. Samples: 556019480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:10:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 12:10:14,681][49750] Updated weights for policy 0, policy_version 171101 (0.0037) [2024-04-26 12:10:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50318.3). 
Total num frames: 2803417088. Throughput: 0: 50248.0. Samples: 556318820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:10:17,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 12:10:17,863][49750] Updated weights for policy 0, policy_version 171111 (0.0032) [2024-04-26 12:10:21,136][49750] Updated weights for policy 0, policy_version 171121 (0.0034) [2024-04-26 12:10:22,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2803662848. Throughput: 0: 50386.2. Samples: 556479040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:10:22,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 12:10:24,340][49750] Updated weights for policy 0, policy_version 171131 (0.0034) [2024-04-26 12:10:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.4, 300 sec: 50318.4). Total num frames: 2803924992. Throughput: 0: 50255.7. Samples: 556775800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:10:27,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 12:10:27,636][49750] Updated weights for policy 0, policy_version 171141 (0.0033) [2024-04-26 12:10:31,006][49750] Updated weights for policy 0, policy_version 171151 (0.0029) [2024-04-26 12:10:32,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2804154368. Throughput: 0: 50369.1. Samples: 557078020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:10:32,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 12:10:34,176][49750] Updated weights for policy 0, policy_version 171161 (0.0033) [2024-04-26 12:10:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 2804432896. Throughput: 0: 50470.3. Samples: 557229480. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:10:37,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 12:10:37,574][49750] Updated weights for policy 0, policy_version 171171 (0.0028) [2024-04-26 12:10:40,786][49750] Updated weights for policy 0, policy_version 171181 (0.0037) [2024-04-26 12:10:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 49971.3, 300 sec: 50373.8). Total num frames: 2804678656. Throughput: 0: 50387.6. Samples: 557528060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:10:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 12:10:42,150][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000171185_2804695040.pth... [2024-04-26 12:10:42,198][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000170447_2792603648.pth [2024-04-26 12:10:43,932][49750] Updated weights for policy 0, policy_version 171191 (0.0033) [2024-04-26 12:10:47,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.9, 300 sec: 50318.3). Total num frames: 2804924416. Throughput: 0: 50387.8. Samples: 557833420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:10:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 12:10:47,381][49750] Updated weights for policy 0, policy_version 171201 (0.0033) [2024-04-26 12:10:50,404][49750] Updated weights for policy 0, policy_version 171211 (0.0037) [2024-04-26 12:10:52,063][49517] Fps is (10 sec: 49150.8, 60 sec: 49971.0, 300 sec: 50262.7). Total num frames: 2805170176. Throughput: 0: 50456.7. Samples: 557983720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:10:52,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 12:10:53,586][49728] Signal inference workers to stop experience collection... (8350 times) [2024-04-26 12:10:53,587][49728] Signal inference workers to resume experience collection... 
(8350 times) [2024-04-26 12:10:53,614][49750] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-04-26 12:10:53,615][49750] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-04-26 12:10:53,722][49750] Updated weights for policy 0, policy_version 171221 (0.0032) [2024-04-26 12:10:56,910][49750] Updated weights for policy 0, policy_version 171231 (0.0029) [2024-04-26 12:10:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2805448704. Throughput: 0: 50353.0. Samples: 558285360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:10:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 12:11:00,257][49750] Updated weights for policy 0, policy_version 171241 (0.0032) [2024-04-26 12:11:02,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2805694464. Throughput: 0: 50555.5. Samples: 558593820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:11:02,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 12:11:03,391][49750] Updated weights for policy 0, policy_version 171251 (0.0032) [2024-04-26 12:11:06,931][49750] Updated weights for policy 0, policy_version 171261 (0.0028) [2024-04-26 12:11:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2805940224. Throughput: 0: 50392.6. Samples: 558746700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:11:07,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 12:11:09,740][49750] Updated weights for policy 0, policy_version 171271 (0.0033) [2024-04-26 12:11:12,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 2806185984. Throughput: 0: 50471.9. Samples: 559047040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:11:12,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 12:11:13,406][49750] Updated weights for policy 0, policy_version 171281 (0.0034) [2024-04-26 12:11:16,281][49750] Updated weights for policy 0, policy_version 171291 (0.0029) [2024-04-26 12:11:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2806448128. Throughput: 0: 50312.5. Samples: 559342080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 12:11:17,063][49517] Avg episode reward: [(0, '0.476')] [2024-04-26 12:11:19,770][49750] Updated weights for policy 0, policy_version 171301 (0.0035) [2024-04-26 12:11:22,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2806710272. Throughput: 0: 50427.5. Samples: 559498720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:11:22,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 12:11:22,774][49750] Updated weights for policy 0, policy_version 171311 (0.0031) [2024-04-26 12:11:26,432][49750] Updated weights for policy 0, policy_version 171321 (0.0030) [2024-04-26 12:11:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2806939648. Throughput: 0: 50339.1. Samples: 559793320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:11:27,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 12:11:29,456][49750] Updated weights for policy 0, policy_version 171331 (0.0032) [2024-04-26 12:11:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2807201792. Throughput: 0: 50277.9. Samples: 560095920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:11:32,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 12:11:33,034][49750] Updated weights for policy 0, policy_version 171341 (0.0035) [2024-04-26 12:11:36,044][49750] Updated weights for policy 0, policy_version 171351 (0.0027) [2024-04-26 12:11:37,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2807431168. Throughput: 0: 50184.6. Samples: 560242020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:11:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 12:11:39,413][49750] Updated weights for policy 0, policy_version 171361 (0.0033) [2024-04-26 12:11:42,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2807709696. Throughput: 0: 50343.0. Samples: 560550800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:11:42,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:11:42,859][49750] Updated weights for policy 0, policy_version 171371 (0.0030) [2024-04-26 12:11:45,883][49750] Updated weights for policy 0, policy_version 171381 (0.0032) [2024-04-26 12:11:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2807955456. Throughput: 0: 50203.5. Samples: 560852980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:11:47,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 12:11:49,310][49750] Updated weights for policy 0, policy_version 171391 (0.0030) [2024-04-26 12:11:52,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 2808201216. Throughput: 0: 50174.9. Samples: 561004580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:11:52,063][49517] Avg episode reward: [(0, '0.390')] [2024-04-26 12:11:52,733][49750] Updated weights for policy 0, policy_version 171401 (0.0035) [2024-04-26 12:11:55,958][49750] Updated weights for policy 0, policy_version 171411 (0.0038) [2024-04-26 12:11:57,063][49517] Fps is (10 sec: 49151.3, 60 sec: 49971.0, 300 sec: 50262.8). Total num frames: 2808446976. Throughput: 0: 50110.5. Samples: 561302020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:11:57,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 12:11:57,971][49728] Signal inference workers to stop experience collection... (8400 times) [2024-04-26 12:11:57,976][49728] Signal inference workers to resume experience collection... (8400 times) [2024-04-26 12:11:58,005][49750] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-04-26 12:11:58,005][49750] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-04-26 12:11:59,086][49750] Updated weights for policy 0, policy_version 171421 (0.0030) [2024-04-26 12:12:02,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2808709120. Throughput: 0: 50185.7. Samples: 561600440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:12:02,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 12:12:02,410][49750] Updated weights for policy 0, policy_version 171431 (0.0029) [2024-04-26 12:12:05,532][49750] Updated weights for policy 0, policy_version 171441 (0.0039) [2024-04-26 12:12:07,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2808971264. Throughput: 0: 50290.7. Samples: 561761800. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:12:07,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 12:12:08,839][49750] Updated weights for policy 0, policy_version 171451 (0.0031) [2024-04-26 12:12:12,030][49750] Updated weights for policy 0, policy_version 171461 (0.0032) [2024-04-26 12:12:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2809217024. Throughput: 0: 50429.2. Samples: 562062640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:12:12,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 12:12:15,296][49750] Updated weights for policy 0, policy_version 171471 (0.0029) [2024-04-26 12:12:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2809462784. Throughput: 0: 50367.8. Samples: 562362480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:12:17,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 12:12:18,819][49750] Updated weights for policy 0, policy_version 171481 (0.0032) [2024-04-26 12:12:21,877][49750] Updated weights for policy 0, policy_version 171491 (0.0033) [2024-04-26 12:12:22,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 2809708544. Throughput: 0: 50425.3. Samples: 562511160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:12:22,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 12:12:25,221][49750] Updated weights for policy 0, policy_version 171501 (0.0033) [2024-04-26 12:12:27,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2809970688. Throughput: 0: 50274.7. Samples: 562813160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:12:27,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 12:12:28,265][49750] Updated weights for policy 0, policy_version 171511 (0.0032) [2024-04-26 12:12:31,569][49750] Updated weights for policy 0, policy_version 171521 (0.0031) [2024-04-26 12:12:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2810200064. Throughput: 0: 50303.6. Samples: 563116640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:12:32,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 12:12:34,648][49750] Updated weights for policy 0, policy_version 171531 (0.0031) [2024-04-26 12:12:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2810462208. Throughput: 0: 50277.9. Samples: 563267080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:12:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 12:12:37,916][49750] Updated weights for policy 0, policy_version 171541 (0.0030) [2024-04-26 12:12:41,198][49750] Updated weights for policy 0, policy_version 171551 (0.0030) [2024-04-26 12:12:42,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2810724352. Throughput: 0: 50215.7. Samples: 563561720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:12:42,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 12:12:42,069][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000171553_2810724352.pth... [2024-04-26 12:12:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000170815_2798632960.pth [2024-04-26 12:12:44,367][49750] Updated weights for policy 0, policy_version 171561 (0.0035) [2024-04-26 12:12:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2810986496. Throughput: 0: 50420.6. Samples: 563869360. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:12:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 12:12:47,805][49750] Updated weights for policy 0, policy_version 171571 (0.0029) [2024-04-26 12:12:51,202][49750] Updated weights for policy 0, policy_version 171581 (0.0031) [2024-04-26 12:12:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2811232256. Throughput: 0: 50333.8. Samples: 564026820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:12:52,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 12:12:54,373][49750] Updated weights for policy 0, policy_version 171591 (0.0030) [2024-04-26 12:12:57,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2811478016. Throughput: 0: 50259.5. Samples: 564324320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:12:57,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 12:12:57,763][49750] Updated weights for policy 0, policy_version 171601 (0.0034) [2024-04-26 12:13:00,716][49750] Updated weights for policy 0, policy_version 171611 (0.0031) [2024-04-26 12:13:02,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.1, 300 sec: 50373.8). Total num frames: 2811723776. Throughput: 0: 50441.2. Samples: 564632340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:13:02,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 12:13:04,207][49750] Updated weights for policy 0, policy_version 171621 (0.0041) [2024-04-26 12:13:07,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2811985920. Throughput: 0: 50565.1. Samples: 564786580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:13:07,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 12:13:07,097][49750] Updated weights for policy 0, policy_version 171631 (0.0032) [2024-04-26 12:13:10,580][49750] Updated weights for policy 0, policy_version 171641 (0.0036) [2024-04-26 12:13:12,062][49517] Fps is (10 sec: 50792.2, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2812231680. Throughput: 0: 50444.1. Samples: 565083140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:13:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 12:13:13,643][49750] Updated weights for policy 0, policy_version 171651 (0.0030) [2024-04-26 12:13:17,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2812477440. Throughput: 0: 50485.0. Samples: 565388460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:13:17,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 12:13:17,326][49750] Updated weights for policy 0, policy_version 171661 (0.0030) [2024-04-26 12:13:20,259][49750] Updated weights for policy 0, policy_version 171671 (0.0027) [2024-04-26 12:13:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2812739584. Throughput: 0: 50330.2. Samples: 565531940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:13:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 12:13:23,789][49750] Updated weights for policy 0, policy_version 171681 (0.0028) [2024-04-26 12:13:26,627][49750] Updated weights for policy 0, policy_version 171691 (0.0035) [2024-04-26 12:13:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 2812985344. Throughput: 0: 50551.2. Samples: 565836520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 12:13:27,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 12:13:30,335][49750] Updated weights for policy 0, policy_version 171701 (0.0031) [2024-04-26 12:13:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2813247488. Throughput: 0: 50588.4. Samples: 566145840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:13:32,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 12:13:32,995][49750] Updated weights for policy 0, policy_version 171711 (0.0032) [2024-04-26 12:13:34,541][49728] Signal inference workers to stop experience collection... (8450 times) [2024-04-26 12:13:34,542][49728] Signal inference workers to resume experience collection... (8450 times) [2024-04-26 12:13:34,569][49750] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-04-26 12:13:34,569][49750] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-04-26 12:13:36,782][49750] Updated weights for policy 0, policy_version 171721 (0.0034) [2024-04-26 12:13:37,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2813493248. Throughput: 0: 50417.7. Samples: 566295620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:13:37,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:13:39,862][49750] Updated weights for policy 0, policy_version 171731 (0.0032) [2024-04-26 12:13:42,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 2813755392. Throughput: 0: 50429.7. Samples: 566593660. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:13:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 12:13:43,148][49750] Updated weights for policy 0, policy_version 171741 (0.0035) [2024-04-26 12:13:46,588][49750] Updated weights for policy 0, policy_version 171751 (0.0032) [2024-04-26 12:13:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2814001152. Throughput: 0: 50440.6. Samples: 566902160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:13:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 12:13:49,695][49750] Updated weights for policy 0, policy_version 171761 (0.0034) [2024-04-26 12:13:52,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2814246912. Throughput: 0: 50290.6. Samples: 567049660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:13:52,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 12:13:52,977][49750] Updated weights for policy 0, policy_version 171771 (0.0034) [2024-04-26 12:13:56,239][49750] Updated weights for policy 0, policy_version 171781 (0.0031) [2024-04-26 12:13:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2814509056. Throughput: 0: 50474.6. Samples: 567354500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:13:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 12:13:59,341][49750] Updated weights for policy 0, policy_version 171791 (0.0035) [2024-04-26 12:14:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.5, 300 sec: 50373.8). Total num frames: 2814754816. Throughput: 0: 50295.9. Samples: 567651780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:14:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 12:14:02,728][49750] Updated weights for policy 0, policy_version 171801 (0.0034) [2024-04-26 12:14:05,945][49750] Updated weights for policy 0, policy_version 171811 (0.0035) [2024-04-26 12:14:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2815016960. Throughput: 0: 50398.0. Samples: 567799840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:14:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:14:09,122][49750] Updated weights for policy 0, policy_version 171821 (0.0038) [2024-04-26 12:14:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2815262720. Throughput: 0: 50350.8. Samples: 568102300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:14:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 12:14:12,518][49750] Updated weights for policy 0, policy_version 171831 (0.0028) [2024-04-26 12:14:15,723][49750] Updated weights for policy 0, policy_version 171841 (0.0036) [2024-04-26 12:14:17,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2815508480. Throughput: 0: 50289.3. Samples: 568408860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:14:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 12:14:18,893][49750] Updated weights for policy 0, policy_version 171851 (0.0039) [2024-04-26 12:14:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.4, 300 sec: 50318.4). Total num frames: 2815754240. Throughput: 0: 50326.4. Samples: 568560300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:14:22,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 12:14:22,184][49750] Updated weights for policy 0, policy_version 171861 (0.0029) [2024-04-26 12:14:25,412][49750] Updated weights for policy 0, policy_version 171871 (0.0033) [2024-04-26 12:14:27,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2816032768. Throughput: 0: 50381.6. Samples: 568860820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:14:27,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:14:28,579][49750] Updated weights for policy 0, policy_version 171881 (0.0027) [2024-04-26 12:14:31,972][49750] Updated weights for policy 0, policy_version 171891 (0.0032) [2024-04-26 12:14:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2816262144. Throughput: 0: 50361.0. Samples: 569168400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 12:14:32,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 12:14:35,087][49750] Updated weights for policy 0, policy_version 171901 (0.0033) [2024-04-26 12:14:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2816524288. Throughput: 0: 50441.7. Samples: 569319540. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:14:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 12:14:38,446][49750] Updated weights for policy 0, policy_version 171911 (0.0030) [2024-04-26 12:14:41,572][49750] Updated weights for policy 0, policy_version 171921 (0.0027) [2024-04-26 12:14:42,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.4, 300 sec: 50429.5). Total num frames: 2816770048. Throughput: 0: 50395.9. Samples: 569622320. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:14:42,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 12:14:42,163][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000171923_2816786432.pth... [2024-04-26 12:14:42,206][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000171185_2804695040.pth [2024-04-26 12:14:44,822][49750] Updated weights for policy 0, policy_version 171931 (0.0033) [2024-04-26 12:14:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2817015808. Throughput: 0: 50555.0. Samples: 569926760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:14:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 12:14:47,973][49750] Updated weights for policy 0, policy_version 171941 (0.0032) [2024-04-26 12:14:51,275][49750] Updated weights for policy 0, policy_version 171951 (0.0036) [2024-04-26 12:14:52,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2817294336. Throughput: 0: 50571.8. Samples: 570075580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:14:52,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 12:14:54,548][49750] Updated weights for policy 0, policy_version 171961 (0.0031) [2024-04-26 12:14:55,649][49728] Signal inference workers to stop experience collection... (8500 times) [2024-04-26 12:14:55,695][49750] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-04-26 12:14:55,708][49728] Signal inference workers to resume experience collection... (8500 times) [2024-04-26 12:14:55,715][49750] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-04-26 12:14:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 2817507328. Throughput: 0: 50592.0. Samples: 570378940. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:14:57,063][49517] Avg episode reward: [(0, '0.467')] [2024-04-26 12:14:57,864][49750] Updated weights for policy 0, policy_version 171971 (0.0035) [2024-04-26 12:15:01,141][49750] Updated weights for policy 0, policy_version 171981 (0.0032) [2024-04-26 12:15:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2817785856. Throughput: 0: 50460.9. Samples: 570679600. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:15:02,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 12:15:04,268][49750] Updated weights for policy 0, policy_version 171991 (0.0031) [2024-04-26 12:15:07,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2818031616. Throughput: 0: 50565.2. Samples: 570835740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:15:07,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 12:15:07,556][49750] Updated weights for policy 0, policy_version 172001 (0.0024) [2024-04-26 12:15:10,607][49750] Updated weights for policy 0, policy_version 172011 (0.0031) [2024-04-26 12:15:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2818293760. Throughput: 0: 50622.2. Samples: 571138820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:15:12,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 12:15:14,017][49750] Updated weights for policy 0, policy_version 172021 (0.0033) [2024-04-26 12:15:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2818539520. Throughput: 0: 50617.3. Samples: 571446180. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:15:17,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 12:15:17,097][49750] Updated weights for policy 0, policy_version 172031 (0.0027) [2024-04-26 12:15:20,506][49750] Updated weights for policy 0, policy_version 172041 (0.0029) [2024-04-26 12:15:22,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2818801664. Throughput: 0: 50636.4. Samples: 571598180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:15:22,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 12:15:23,773][49750] Updated weights for policy 0, policy_version 172051 (0.0030) [2024-04-26 12:15:27,010][49750] Updated weights for policy 0, policy_version 172061 (0.0033) [2024-04-26 12:15:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2819047424. Throughput: 0: 50513.4. Samples: 571895420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:15:27,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 12:15:30,108][49750] Updated weights for policy 0, policy_version 172071 (0.0031) [2024-04-26 12:15:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2819293184. Throughput: 0: 50442.8. Samples: 572196680. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:15:32,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 12:15:33,527][49750] Updated weights for policy 0, policy_version 172081 (0.0036) [2024-04-26 12:15:36,687][49750] Updated weights for policy 0, policy_version 172091 (0.0038) [2024-04-26 12:15:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2819555328. Throughput: 0: 50523.7. Samples: 572349140. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 12:15:37,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 12:15:39,886][49750] Updated weights for policy 0, policy_version 172101 (0.0029) [2024-04-26 12:15:42,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2819784704. Throughput: 0: 50461.7. Samples: 572649720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:15:42,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 12:15:43,127][49750] Updated weights for policy 0, policy_version 172111 (0.0033) [2024-04-26 12:15:46,278][49750] Updated weights for policy 0, policy_version 172121 (0.0031) [2024-04-26 12:15:47,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50485.0). Total num frames: 2820063232. Throughput: 0: 50509.8. Samples: 572952540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:15:47,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 12:15:49,726][49750] Updated weights for policy 0, policy_version 172131 (0.0035) [2024-04-26 12:15:52,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2820308992. Throughput: 0: 50512.4. Samples: 573108800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:15:52,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 12:15:53,011][49750] Updated weights for policy 0, policy_version 172141 (0.0031) [2024-04-26 12:15:56,121][49750] Updated weights for policy 0, policy_version 172151 (0.0031) [2024-04-26 12:15:57,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2820554752. Throughput: 0: 50569.7. Samples: 573414460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:15:57,064][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 12:15:59,409][49750] Updated weights for policy 0, policy_version 172161 (0.0031) [2024-04-26 12:16:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2820800512. Throughput: 0: 50338.7. Samples: 573711420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:16:02,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 12:16:02,578][49750] Updated weights for policy 0, policy_version 172171 (0.0034) [2024-04-26 12:16:05,961][49750] Updated weights for policy 0, policy_version 172181 (0.0035) [2024-04-26 12:16:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2821062656. Throughput: 0: 50382.3. Samples: 573865380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:16:07,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 12:16:08,580][49728] Signal inference workers to stop experience collection... (8550 times) [2024-04-26 12:16:08,580][49728] Signal inference workers to resume experience collection... (8550 times) [2024-04-26 12:16:08,591][49750] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-04-26 12:16:08,611][49750] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-04-26 12:16:09,115][49750] Updated weights for policy 0, policy_version 172191 (0.0034) [2024-04-26 12:16:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2821308416. Throughput: 0: 50384.5. Samples: 574162720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:16:12,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 12:16:12,448][49750] Updated weights for policy 0, policy_version 172201 (0.0032) [2024-04-26 12:16:15,643][49750] Updated weights for policy 0, policy_version 172211 (0.0037) [2024-04-26 12:16:17,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2821554176. Throughput: 0: 50298.3. Samples: 574460100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:16:17,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 12:16:19,154][49750] Updated weights for policy 0, policy_version 172221 (0.0037) [2024-04-26 12:16:22,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2821816320. Throughput: 0: 50328.2. Samples: 574613920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:16:22,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 12:16:22,190][49750] Updated weights for policy 0, policy_version 172231 (0.0037) [2024-04-26 12:16:25,723][49750] Updated weights for policy 0, policy_version 172241 (0.0028) [2024-04-26 12:16:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2822045696. Throughput: 0: 50334.2. Samples: 574914760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:16:27,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 12:16:28,673][49750] Updated weights for policy 0, policy_version 172251 (0.0030) [2024-04-26 12:16:32,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2822307840. Throughput: 0: 50257.0. Samples: 575214100. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:16:32,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 12:16:32,081][49750] Updated weights for policy 0, policy_version 172261 (0.0035) [2024-04-26 12:16:35,294][49750] Updated weights for policy 0, policy_version 172271 (0.0035) [2024-04-26 12:16:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2822569984. Throughput: 0: 50255.2. Samples: 575370280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:16:37,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 12:16:38,470][49750] Updated weights for policy 0, policy_version 172281 (0.0032) [2024-04-26 12:16:41,692][49750] Updated weights for policy 0, policy_version 172291 (0.0032) [2024-04-26 12:16:42,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 2822815744. Throughput: 0: 50229.7. Samples: 575674800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:16:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 12:16:42,155][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000172292_2822832128.pth... [2024-04-26 12:16:42,198][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000171553_2810724352.pth [2024-04-26 12:16:45,014][49750] Updated weights for policy 0, policy_version 172301 (0.0034) [2024-04-26 12:16:47,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2823061504. Throughput: 0: 50283.4. Samples: 575974180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:16:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 12:16:48,360][49750] Updated weights for policy 0, policy_version 172311 (0.0036) [2024-04-26 12:16:51,586][49750] Updated weights for policy 0, policy_version 172321 (0.0030) [2024-04-26 12:16:52,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50244.4, 300 sec: 50429.4). 
Total num frames: 2823323648. Throughput: 0: 50135.1. Samples: 576121460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:16:52,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 12:16:54,813][49750] Updated weights for policy 0, policy_version 172331 (0.0037) [2024-04-26 12:16:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2823569408. Throughput: 0: 50242.6. Samples: 576423640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:16:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 12:16:57,993][49750] Updated weights for policy 0, policy_version 172341 (0.0033) [2024-04-26 12:17:01,280][49750] Updated weights for policy 0, policy_version 172351 (0.0029) [2024-04-26 12:17:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2823815168. Throughput: 0: 50437.8. Samples: 576729800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 12:17:04,430][49750] Updated weights for policy 0, policy_version 172361 (0.0032) [2024-04-26 12:17:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2824077312. Throughput: 0: 50282.9. Samples: 576876640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:07,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 12:17:07,828][49750] Updated weights for policy 0, policy_version 172371 (0.0030) [2024-04-26 12:17:10,989][49750] Updated weights for policy 0, policy_version 172381 (0.0033) [2024-04-26 12:17:12,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2824339456. Throughput: 0: 50381.3. Samples: 577181920. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:12,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 12:17:14,315][49750] Updated weights for policy 0, policy_version 172391 (0.0032) [2024-04-26 12:17:17,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2824585216. Throughput: 0: 50431.8. Samples: 577483540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:17,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:17:17,589][49750] Updated weights for policy 0, policy_version 172401 (0.0032) [2024-04-26 12:17:20,791][49750] Updated weights for policy 0, policy_version 172411 (0.0030) [2024-04-26 12:17:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2824830976. Throughput: 0: 50281.4. Samples: 577632940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:22,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 12:17:24,085][49750] Updated weights for policy 0, policy_version 172421 (0.0035) [2024-04-26 12:17:27,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2825076736. Throughput: 0: 50298.0. Samples: 577938200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:27,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 12:17:27,244][49750] Updated weights for policy 0, policy_version 172431 (0.0035) [2024-04-26 12:17:31,017][49750] Updated weights for policy 0, policy_version 172441 (0.0028) [2024-04-26 12:17:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2825322496. Throughput: 0: 50231.2. Samples: 578234580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 12:17:33,108][49728] Signal inference workers to stop experience collection... 
(8600 times) [2024-04-26 12:17:33,108][49728] Signal inference workers to resume experience collection... (8600 times) [2024-04-26 12:17:33,140][49750] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-04-26 12:17:33,140][49750] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-04-26 12:17:33,903][49750] Updated weights for policy 0, policy_version 172451 (0.0036) [2024-04-26 12:17:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2825584640. Throughput: 0: 50274.6. Samples: 578383820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:37,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 12:17:37,433][49750] Updated weights for policy 0, policy_version 172461 (0.0030) [2024-04-26 12:17:40,374][49750] Updated weights for policy 0, policy_version 172471 (0.0028) [2024-04-26 12:17:42,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 2825846784. Throughput: 0: 50227.0. Samples: 578683860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:42,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 12:17:43,765][49750] Updated weights for policy 0, policy_version 172481 (0.0030) [2024-04-26 12:17:46,710][49750] Updated weights for policy 0, policy_version 172491 (0.0037) [2024-04-26 12:17:47,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2826092544. Throughput: 0: 50224.3. Samples: 578989900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:17:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 12:17:50,301][49750] Updated weights for policy 0, policy_version 172501 (0.0031) [2024-04-26 12:17:52,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.1, 300 sec: 50373.9). Total num frames: 2826338304. Throughput: 0: 50362.9. Samples: 579142980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:17:52,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 12:17:53,240][49750] Updated weights for policy 0, policy_version 172511 (0.0035) [2024-04-26 12:17:56,847][49750] Updated weights for policy 0, policy_version 172521 (0.0029) [2024-04-26 12:17:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2826584064. Throughput: 0: 50238.8. Samples: 579442660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:17:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 12:17:59,770][49750] Updated weights for policy 0, policy_version 172531 (0.0028) [2024-04-26 12:18:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2826846208. Throughput: 0: 50138.3. Samples: 579739760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:02,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 12:18:03,258][49750] Updated weights for policy 0, policy_version 172541 (0.0040) [2024-04-26 12:18:06,194][49750] Updated weights for policy 0, policy_version 172551 (0.0027) [2024-04-26 12:18:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2827108352. Throughput: 0: 50410.3. Samples: 579901400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:07,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 12:18:09,802][49750] Updated weights for policy 0, policy_version 172561 (0.0028) [2024-04-26 12:18:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2827354112. Throughput: 0: 50444.3. Samples: 580208200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:12,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 12:18:12,781][49750] Updated weights for policy 0, policy_version 172571 (0.0031) [2024-04-26 12:18:16,308][49750] Updated weights for policy 0, policy_version 172581 (0.0028) [2024-04-26 12:18:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2827599872. Throughput: 0: 50628.0. Samples: 580512840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 12:18:19,049][49750] Updated weights for policy 0, policy_version 172591 (0.0034) [2024-04-26 12:18:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2827862016. Throughput: 0: 50544.9. Samples: 580658340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:18:22,655][49750] Updated weights for policy 0, policy_version 172601 (0.0034) [2024-04-26 12:18:25,454][49750] Updated weights for policy 0, policy_version 172611 (0.0032) [2024-04-26 12:18:27,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.5, 300 sec: 50484.9). Total num frames: 2828140544. Throughput: 0: 50608.1. Samples: 580961220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:27,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 12:18:29,106][49750] Updated weights for policy 0, policy_version 172621 (0.0029) [2024-04-26 12:18:31,909][49750] Updated weights for policy 0, policy_version 172631 (0.0034) [2024-04-26 12:18:32,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 2828386304. Throughput: 0: 50683.1. Samples: 581270640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:32,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:18:35,760][49750] Updated weights for policy 0, policy_version 172641 (0.0034) [2024-04-26 12:18:37,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2828632064. Throughput: 0: 50741.8. Samples: 581426360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:37,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 12:18:38,228][49750] Updated weights for policy 0, policy_version 172651 (0.0035) [2024-04-26 12:18:42,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2828861440. Throughput: 0: 50714.2. Samples: 581724800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:42,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 12:18:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000172660_2828861440.pth... [2024-04-26 12:18:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000171923_2816786432.pth [2024-04-26 12:18:42,279][49750] Updated weights for policy 0, policy_version 172661 (0.0032) [2024-04-26 12:18:44,690][49750] Updated weights for policy 0, policy_version 172671 (0.0034) [2024-04-26 12:18:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2829123584. Throughput: 0: 50760.9. Samples: 582024000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 12:18:48,541][49750] Updated weights for policy 0, policy_version 172681 (0.0032) [2024-04-26 12:18:49,435][49728] Signal inference workers to stop experience collection... (8650 times) [2024-04-26 12:18:49,472][49750] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-04-26 12:18:49,503][49728] Signal inference workers to resume experience collection... 
(8650 times) [2024-04-26 12:18:49,506][49750] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-04-26 12:18:51,310][49750] Updated weights for policy 0, policy_version 172691 (0.0029) [2024-04-26 12:18:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2829385728. Throughput: 0: 50783.9. Samples: 582186680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 12:18:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:18:55,062][49750] Updated weights for policy 0, policy_version 172701 (0.0038) [2024-04-26 12:18:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2829631488. Throughput: 0: 50543.7. Samples: 582482660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:18:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 12:18:57,929][49750] Updated weights for policy 0, policy_version 172711 (0.0037) [2024-04-26 12:19:01,413][49750] Updated weights for policy 0, policy_version 172721 (0.0030) [2024-04-26 12:19:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2829860864. Throughput: 0: 50486.2. Samples: 582784720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:02,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 12:19:04,270][49750] Updated weights for policy 0, policy_version 172731 (0.0031) [2024-04-26 12:19:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.1, 300 sec: 50373.8). Total num frames: 2830123008. Throughput: 0: 50540.8. Samples: 582932680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 12:19:07,808][49750] Updated weights for policy 0, policy_version 172741 (0.0029) [2024-04-26 12:19:10,736][49750] Updated weights for policy 0, policy_version 172751 (0.0035) [2024-04-26 12:19:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50429.4). 
Total num frames: 2830385152. Throughput: 0: 50585.3. Samples: 583237560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:12,063][49517] Avg episode reward: [(0, '0.464')] [2024-04-26 12:19:14,254][49750] Updated weights for policy 0, policy_version 172761 (0.0030) [2024-04-26 12:19:17,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 2830663680. Throughput: 0: 50473.4. Samples: 583541940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:17,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 12:19:17,203][49750] Updated weights for policy 0, policy_version 172771 (0.0027) [2024-04-26 12:19:20,747][49750] Updated weights for policy 0, policy_version 172781 (0.0036) [2024-04-26 12:19:22,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2830876672. Throughput: 0: 50431.2. Samples: 583695760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:22,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:19:23,878][49750] Updated weights for policy 0, policy_version 172791 (0.0034) [2024-04-26 12:19:27,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.1, 300 sec: 50484.9). Total num frames: 2831155200. Throughput: 0: 50491.8. Samples: 583996940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 12:19:27,341][49750] Updated weights for policy 0, policy_version 172801 (0.0032) [2024-04-26 12:19:30,230][49750] Updated weights for policy 0, policy_version 172811 (0.0030) [2024-04-26 12:19:32,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2831400960. Throughput: 0: 50642.2. Samples: 584302900. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:32,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:19:34,065][49750] Updated weights for policy 0, policy_version 172821 (0.0036) [2024-04-26 12:19:36,655][49750] Updated weights for policy 0, policy_version 172831 (0.0030) [2024-04-26 12:19:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 2831663104. Throughput: 0: 50530.7. Samples: 584460560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 12:19:40,445][49750] Updated weights for policy 0, policy_version 172841 (0.0026) [2024-04-26 12:19:42,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 2831925248. Throughput: 0: 50503.8. Samples: 584755340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 12:19:43,097][49750] Updated weights for policy 0, policy_version 172851 (0.0027) [2024-04-26 12:19:47,061][49750] Updated weights for policy 0, policy_version 172861 (0.0028) [2024-04-26 12:19:47,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2832154624. Throughput: 0: 50571.0. Samples: 585060420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:47,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 12:19:49,576][49750] Updated weights for policy 0, policy_version 172871 (0.0032) [2024-04-26 12:19:52,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2832400384. Throughput: 0: 50513.3. Samples: 585205780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 12:19:53,554][49750] Updated weights for policy 0, policy_version 172881 (0.0035) [2024-04-26 12:19:55,679][49728] Signal inference workers to stop experience collection... (8700 times) [2024-04-26 12:19:55,718][49750] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-04-26 12:19:55,752][49728] Signal inference workers to resume experience collection... (8700 times) [2024-04-26 12:19:55,755][49750] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-04-26 12:19:56,021][49750] Updated weights for policy 0, policy_version 172891 (0.0028) [2024-04-26 12:19:57,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2832662528. Throughput: 0: 50477.3. Samples: 585509040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 12:19:57,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 12:20:00,203][49750] Updated weights for policy 0, policy_version 172901 (0.0034) [2024-04-26 12:20:02,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50484.9). Total num frames: 2832924672. Throughput: 0: 50392.5. Samples: 585809600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:02,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-26 12:20:02,487][49750] Updated weights for policy 0, policy_version 172911 (0.0025) [2024-04-26 12:20:06,542][49750] Updated weights for policy 0, policy_version 172921 (0.0029) [2024-04-26 12:20:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2833154048. Throughput: 0: 50545.3. Samples: 585970300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 12:20:08,950][49750] Updated weights for policy 0, policy_version 172931 (0.0033) [2024-04-26 12:20:12,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2833416192. Throughput: 0: 50448.0. Samples: 586267100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 12:20:13,082][49750] Updated weights for policy 0, policy_version 172941 (0.0037) [2024-04-26 12:20:15,558][49750] Updated weights for policy 0, policy_version 172951 (0.0030) [2024-04-26 12:20:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2833661952. Throughput: 0: 50468.6. Samples: 586573980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:17,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 12:20:19,677][49750] Updated weights for policy 0, policy_version 172961 (0.0040) [2024-04-26 12:20:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.3, 300 sec: 50484.9). Total num frames: 2833940480. Throughput: 0: 50445.1. Samples: 586730600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:22,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 12:20:22,582][49750] Updated weights for policy 0, policy_version 172971 (0.0031) [2024-04-26 12:20:26,094][49750] Updated weights for policy 0, policy_version 172981 (0.0030) [2024-04-26 12:20:27,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2834186240. Throughput: 0: 50475.0. Samples: 587026720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:27,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 12:20:29,159][49750] Updated weights for policy 0, policy_version 172991 (0.0033) [2024-04-26 12:20:32,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 2834415616. Throughput: 0: 50342.3. Samples: 587325820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:32,063][49517] Avg episode reward: [(0, '0.449')] [2024-04-26 12:20:32,505][49750] Updated weights for policy 0, policy_version 173001 (0.0029) [2024-04-26 12:20:35,512][49750] Updated weights for policy 0, policy_version 173011 (0.0031) [2024-04-26 12:20:37,062][49517] Fps is (10 sec: 47514.6, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2834661376. Throughput: 0: 50440.6. Samples: 587475600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 12:20:38,999][49750] Updated weights for policy 0, policy_version 173021 (0.0029) [2024-04-26 12:20:41,900][49750] Updated weights for policy 0, policy_version 173031 (0.0029) [2024-04-26 12:20:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2834939904. Throughput: 0: 50408.0. Samples: 587777400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:42,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 12:20:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000173031_2834939904.pth... [2024-04-26 12:20:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000172292_2822832128.pth [2024-04-26 12:20:45,657][49750] Updated weights for policy 0, policy_version 173041 (0.0031) [2024-04-26 12:20:47,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.5, 300 sec: 50484.9). Total num frames: 2835202048. Throughput: 0: 50349.8. Samples: 588075340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:47,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 12:20:48,471][49750] Updated weights for policy 0, policy_version 173051 (0.0036) [2024-04-26 12:20:51,940][49728] Signal inference workers to stop experience collection... (8750 times) [2024-04-26 12:20:51,940][49728] Signal inference workers to resume experience collection... (8750 times) [2024-04-26 12:20:51,967][49750] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-04-26 12:20:51,967][49750] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-04-26 12:20:52,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2835431424. Throughput: 0: 50430.3. Samples: 588239660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 12:20:52,067][49750] Updated weights for policy 0, policy_version 173061 (0.0030) [2024-04-26 12:20:55,142][49750] Updated weights for policy 0, policy_version 173071 (0.0033) [2024-04-26 12:20:57,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2835677184. Throughput: 0: 50413.1. Samples: 588535680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:20:57,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 12:20:58,709][49750] Updated weights for policy 0, policy_version 173081 (0.0036) [2024-04-26 12:21:01,738][49750] Updated weights for policy 0, policy_version 173091 (0.0030) [2024-04-26 12:21:02,063][49517] Fps is (10 sec: 49151.2, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 2835922944. Throughput: 0: 50241.2. Samples: 588834840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:21:02,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 12:21:05,093][49750] Updated weights for policy 0, policy_version 173101 (0.0035) [2024-04-26 12:21:07,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2836217856. Throughput: 0: 50305.6. Samples: 588994340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 12:21:07,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 12:21:08,257][49750] Updated weights for policy 0, policy_version 173111 (0.0033) [2024-04-26 12:21:11,494][49750] Updated weights for policy 0, policy_version 173121 (0.0029) [2024-04-26 12:21:12,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2836430848. Throughput: 0: 50366.8. Samples: 589293220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:12,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 12:21:14,729][49750] Updated weights for policy 0, policy_version 173131 (0.0030) [2024-04-26 12:21:17,063][49517] Fps is (10 sec: 45874.4, 60 sec: 50244.1, 300 sec: 50373.9). Total num frames: 2836676608. Throughput: 0: 50411.8. Samples: 589594360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:17,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 12:21:17,962][49750] Updated weights for policy 0, policy_version 173141 (0.0031) [2024-04-26 12:21:21,442][49750] Updated weights for policy 0, policy_version 173151 (0.0036) [2024-04-26 12:21:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 2836955136. Throughput: 0: 50349.6. Samples: 589741340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 12:21:24,442][49750] Updated weights for policy 0, policy_version 173161 (0.0036) [2024-04-26 12:21:27,062][49517] Fps is (10 sec: 50791.6, 60 sec: 49971.4, 300 sec: 50429.4). Total num frames: 2837184512. Throughput: 0: 50240.1. Samples: 590038200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:27,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 12:21:28,095][49750] Updated weights for policy 0, policy_version 173171 (0.0031) [2024-04-26 12:21:30,866][49750] Updated weights for policy 0, policy_version 173181 (0.0030) [2024-04-26 12:21:32,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2837463040. Throughput: 0: 50483.6. Samples: 590347100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:32,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 12:21:34,597][49750] Updated weights for policy 0, policy_version 173191 (0.0038) [2024-04-26 12:21:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2837692416. Throughput: 0: 50297.3. Samples: 590503040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:37,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:21:37,356][49750] Updated weights for policy 0, policy_version 173201 (0.0030) [2024-04-26 12:21:40,933][49750] Updated weights for policy 0, policy_version 173211 (0.0030) [2024-04-26 12:21:42,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 2837938176. Throughput: 0: 50430.6. Samples: 590805060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 12:21:43,881][49750] Updated weights for policy 0, policy_version 173221 (0.0037) [2024-04-26 12:21:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 2838183936. Throughput: 0: 50414.4. Samples: 591103480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:47,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 12:21:47,304][49750] Updated weights for policy 0, policy_version 173231 (0.0032) [2024-04-26 12:21:50,515][49750] Updated weights for policy 0, policy_version 173241 (0.0037) [2024-04-26 12:21:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2838462464. Throughput: 0: 50397.0. Samples: 591262200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 12:21:53,767][49750] Updated weights for policy 0, policy_version 173251 (0.0039) [2024-04-26 12:21:57,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2838691840. Throughput: 0: 50304.5. Samples: 591556920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:21:57,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 12:21:57,138][49750] Updated weights for policy 0, policy_version 173261 (0.0035) [2024-04-26 12:22:00,321][49750] Updated weights for policy 0, policy_version 173271 (0.0034) [2024-04-26 12:22:02,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2838953984. Throughput: 0: 50361.0. Samples: 591860600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:22:02,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 12:22:03,602][49750] Updated weights for policy 0, policy_version 173281 (0.0028) [2024-04-26 12:22:06,391][49728] Signal inference workers to stop experience collection... (8800 times) [2024-04-26 12:22:06,449][49750] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-04-26 12:22:06,449][49728] Signal inference workers to resume experience collection... (8800 times) [2024-04-26 12:22:06,465][49750] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-04-26 12:22:06,589][49750] Updated weights for policy 0, policy_version 173291 (0.0031) [2024-04-26 12:22:07,063][49517] Fps is (10 sec: 52428.7, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 2839216128. Throughput: 0: 50437.0. Samples: 592011000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:22:07,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 12:22:10,124][49750] Updated weights for policy 0, policy_version 173301 (0.0030) [2024-04-26 12:22:12,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2839461888. Throughput: 0: 50577.3. Samples: 592314180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 12:22:12,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 12:22:12,981][49750] Updated weights for policy 0, policy_version 173311 (0.0028) [2024-04-26 12:22:16,485][49750] Updated weights for policy 0, policy_version 173321 (0.0035) [2024-04-26 12:22:17,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2839707648. Throughput: 0: 50500.5. Samples: 592619620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:22:17,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 12:22:19,675][49750] Updated weights for policy 0, policy_version 173331 (0.0029) [2024-04-26 12:22:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.4, 300 sec: 50429.4). Total num frames: 2839953408. Throughput: 0: 50365.0. Samples: 592769460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:22:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 12:22:22,876][49750] Updated weights for policy 0, policy_version 173341 (0.0033) [2024-04-26 12:22:26,194][49750] Updated weights for policy 0, policy_version 173351 (0.0031) [2024-04-26 12:22:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 2840215552. Throughput: 0: 50509.7. Samples: 593078000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:22:27,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 12:22:29,557][49750] Updated weights for policy 0, policy_version 173361 (0.0032) [2024-04-26 12:22:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2840461312. Throughput: 0: 50435.1. Samples: 593373060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:22:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 12:22:32,601][49750] Updated weights for policy 0, policy_version 173371 (0.0029) [2024-04-26 12:22:36,084][49750] Updated weights for policy 0, policy_version 173381 (0.0036) [2024-04-26 12:22:37,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2840739840. Throughput: 0: 50149.1. Samples: 593518920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:22:37,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 12:22:39,186][49750] Updated weights for policy 0, policy_version 173391 (0.0027) [2024-04-26 12:22:42,063][49517] Fps is (10 sec: 52427.5, 60 sec: 50790.2, 300 sec: 50484.9). Total num frames: 2840985600. Throughput: 0: 50498.9. Samples: 593829380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:22:42,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 12:22:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000173400_2840985600.pth... [2024-04-26 12:22:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000172660_2828861440.pth [2024-04-26 12:22:42,400][49750] Updated weights for policy 0, policy_version 173401 (0.0034) [2024-04-26 12:22:45,610][49750] Updated weights for policy 0, policy_version 173411 (0.0032) [2024-04-26 12:22:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2841231360. Throughput: 0: 50540.1. Samples: 594134900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:22:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 12:22:48,949][49750] Updated weights for policy 0, policy_version 173421 (0.0027) [2024-04-26 12:22:52,062][49517] Fps is (10 sec: 49153.4, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2841477120. Throughput: 0: 50339.7. Samples: 594276280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:22:52,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 12:22:52,129][49750] Updated weights for policy 0, policy_version 173431 (0.0039) [2024-04-26 12:22:55,775][49750] Updated weights for policy 0, policy_version 173441 (0.0034) [2024-04-26 12:22:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2841739264. Throughput: 0: 50565.3. Samples: 594589620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:22:57,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 12:22:58,604][49750] Updated weights for policy 0, policy_version 173451 (0.0034) [2024-04-26 12:23:02,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2841968640. Throughput: 0: 50264.7. Samples: 594881540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:23:02,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 12:23:02,277][49750] Updated weights for policy 0, policy_version 173461 (0.0035) [2024-04-26 12:23:05,036][49750] Updated weights for policy 0, policy_version 173471 (0.0028) [2024-04-26 12:23:07,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2842230784. Throughput: 0: 50164.7. Samples: 595026880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:23:07,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 12:23:08,677][49750] Updated weights for policy 0, policy_version 173481 (0.0032) [2024-04-26 12:23:08,693][49728] Signal inference workers to stop experience collection... (8850 times) [2024-04-26 12:23:08,693][49728] Signal inference workers to resume experience collection... (8850 times) [2024-04-26 12:23:08,718][49750] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-04-26 12:23:08,718][49750] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-04-26 12:23:11,448][49750] Updated weights for policy 0, policy_version 173491 (0.0034) [2024-04-26 12:23:12,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2842476544. Throughput: 0: 50234.3. Samples: 595338540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:23:12,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:23:15,274][49750] Updated weights for policy 0, policy_version 173501 (0.0032) [2024-04-26 12:23:17,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2842722304. Throughput: 0: 50301.8. Samples: 595636640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 12:23:17,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 12:23:18,203][49750] Updated weights for policy 0, policy_version 173511 (0.0030) [2024-04-26 12:23:21,551][49750] Updated weights for policy 0, policy_version 173521 (0.0029) [2024-04-26 12:23:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2843000832. Throughput: 0: 50391.3. Samples: 595786520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:23:22,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 12:23:24,796][49750] Updated weights for policy 0, policy_version 173531 (0.0029) [2024-04-26 12:23:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2843246592. Throughput: 0: 50267.4. Samples: 596091400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:23:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 12:23:27,945][49750] Updated weights for policy 0, policy_version 173541 (0.0030) [2024-04-26 12:23:31,313][49750] Updated weights for policy 0, policy_version 173551 (0.0035) [2024-04-26 12:23:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2843492352. Throughput: 0: 50296.9. Samples: 596398260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:23:32,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 12:23:34,514][49750] Updated weights for policy 0, policy_version 173561 (0.0030) [2024-04-26 12:23:37,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2843754496. Throughput: 0: 50491.8. Samples: 596548420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:23:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 12:23:37,803][49750] Updated weights for policy 0, policy_version 173571 (0.0031) [2024-04-26 12:23:40,836][49750] Updated weights for policy 0, policy_version 173581 (0.0038) [2024-04-26 12:23:42,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.5, 300 sec: 50484.9). Total num frames: 2844016640. Throughput: 0: 50304.4. Samples: 596853320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:23:42,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 12:23:44,113][49750] Updated weights for policy 0, policy_version 173591 (0.0032) [2024-04-26 12:23:47,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2844262400. Throughput: 0: 50634.9. Samples: 597160100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:23:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 12:23:47,379][49750] Updated weights for policy 0, policy_version 173601 (0.0034) [2024-04-26 12:23:50,625][49750] Updated weights for policy 0, policy_version 173611 (0.0030) [2024-04-26 12:23:52,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.1, 300 sec: 50429.4). Total num frames: 2844508160. Throughput: 0: 50633.7. Samples: 597305400. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:23:52,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 12:23:54,001][49750] Updated weights for policy 0, policy_version 173621 (0.0029) [2024-04-26 12:23:57,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2844753920. Throughput: 0: 50344.8. Samples: 597604060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:23:57,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 12:23:57,190][49750] Updated weights for policy 0, policy_version 173631 (0.0033) [2024-04-26 12:24:00,511][49750] Updated weights for policy 0, policy_version 173641 (0.0027) [2024-04-26 12:24:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50484.9). Total num frames: 2845016064. Throughput: 0: 50478.6. Samples: 597908180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:24:02,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 12:24:03,533][49750] Updated weights for policy 0, policy_version 173651 (0.0031) [2024-04-26 12:24:06,977][49750] Updated weights for policy 0, policy_version 173661 (0.0034) [2024-04-26 12:24:07,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2845261824. Throughput: 0: 50555.4. Samples: 598061520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:24:07,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 12:24:09,871][49750] Updated weights for policy 0, policy_version 173671 (0.0034) [2024-04-26 12:24:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 2845507584. Throughput: 0: 50508.8. Samples: 598364300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:24:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 12:24:13,530][49750] Updated weights for policy 0, policy_version 173681 (0.0030) [2024-04-26 12:24:16,485][49750] Updated weights for policy 0, policy_version 173691 (0.0031) [2024-04-26 12:24:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2845769728. Throughput: 0: 50356.4. Samples: 598664300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:24:17,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:24:20,069][49750] Updated weights for policy 0, policy_version 173701 (0.0035) [2024-04-26 12:24:22,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2846015488. Throughput: 0: 50472.2. Samples: 598819660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 12:24:22,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 12:24:23,125][49750] Updated weights for policy 0, policy_version 173711 (0.0030) [2024-04-26 12:24:26,423][49750] Updated weights for policy 0, policy_version 173721 (0.0034) [2024-04-26 12:24:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2846277632. Throughput: 0: 50444.4. Samples: 599123320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:24:27,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 12:24:29,428][49750] Updated weights for policy 0, policy_version 173731 (0.0035) [2024-04-26 12:24:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2846507008. Throughput: 0: 50297.3. Samples: 599423480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:24:32,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 12:24:32,936][49750] Updated weights for policy 0, policy_version 173741 (0.0033) [2024-04-26 12:24:33,465][49728] Signal inference workers to stop experience collection... (8900 times) [2024-04-26 12:24:33,508][49750] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-04-26 12:24:33,572][49728] Signal inference workers to resume experience collection... (8900 times) [2024-04-26 12:24:33,572][49750] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-04-26 12:24:36,098][49750] Updated weights for policy 0, policy_version 173751 (0.0031) [2024-04-26 12:24:37,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2846785536. Throughput: 0: 50346.3. Samples: 599570980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:24:37,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 12:24:39,638][49750] Updated weights for policy 0, policy_version 173761 (0.0035) [2024-04-26 12:24:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2847031296. Throughput: 0: 50329.9. Samples: 599868900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:24:42,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 12:24:42,068][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000173769_2847031296.pth... [2024-04-26 12:24:42,132][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000173031_2834939904.pth [2024-04-26 12:24:42,673][49750] Updated weights for policy 0, policy_version 173771 (0.0025) [2024-04-26 12:24:45,960][49750] Updated weights for policy 0, policy_version 173781 (0.0032) [2024-04-26 12:24:47,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 2847309824. Throughput: 0: 50345.4. Samples: 600173720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:24:47,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 12:24:49,171][49750] Updated weights for policy 0, policy_version 173791 (0.0030) [2024-04-26 12:24:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2847522816. Throughput: 0: 50329.9. Samples: 600326360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:24:52,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 12:24:52,373][49750] Updated weights for policy 0, policy_version 173801 (0.0029) [2024-04-26 12:24:55,574][49750] Updated weights for policy 0, policy_version 173811 (0.0030) [2024-04-26 12:24:57,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2847768576. Throughput: 0: 50425.9. Samples: 600633460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:24:57,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:24:59,056][49750] Updated weights for policy 0, policy_version 173821 (0.0029) [2024-04-26 12:25:01,917][49750] Updated weights for policy 0, policy_version 173831 (0.0037) [2024-04-26 12:25:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2848047104. Throughput: 0: 50500.8. Samples: 600936840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:25:02,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 12:25:05,469][49750] Updated weights for policy 0, policy_version 173841 (0.0030) [2024-04-26 12:25:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2848292864. Throughput: 0: 50565.3. Samples: 601095100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:25:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 12:25:08,502][49750] Updated weights for policy 0, policy_version 173851 (0.0032) [2024-04-26 12:25:11,978][49750] Updated weights for policy 0, policy_version 173861 (0.0029) [2024-04-26 12:25:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2848538624. Throughput: 0: 50413.5. Samples: 601391920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:25:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 12:25:15,014][49750] Updated weights for policy 0, policy_version 173871 (0.0031) [2024-04-26 12:25:17,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2848784384. Throughput: 0: 50456.2. Samples: 601694020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:25:17,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 12:25:18,430][49750] Updated weights for policy 0, policy_version 173881 (0.0032) [2024-04-26 12:25:21,570][49750] Updated weights for policy 0, policy_version 173891 (0.0027) [2024-04-26 12:25:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2849046528. Throughput: 0: 50434.8. Samples: 601840540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:25:22,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 12:25:24,900][49750] Updated weights for policy 0, policy_version 173901 (0.0030) [2024-04-26 12:25:27,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2849292288. Throughput: 0: 50464.5. Samples: 602139800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 12:25:27,063][49517] Avg episode reward: [(0, '0.683')] [2024-04-26 12:25:27,919][49750] Updated weights for policy 0, policy_version 173911 (0.0031) [2024-04-26 12:25:31,408][49750] Updated weights for policy 0, policy_version 173921 (0.0033) [2024-04-26 12:25:32,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2849554432. Throughput: 0: 50490.6. Samples: 602445800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:25:32,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 12:25:34,641][49750] Updated weights for policy 0, policy_version 173931 (0.0032) [2024-04-26 12:25:37,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2849783808. Throughput: 0: 50572.0. Samples: 602602100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:25:37,063][49517] Avg episode reward: [(0, '0.453')] [2024-04-26 12:25:37,794][49750] Updated weights for policy 0, policy_version 173941 (0.0029) [2024-04-26 12:25:41,075][49750] Updated weights for policy 0, policy_version 173951 (0.0030) [2024-04-26 12:25:42,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49970.9, 300 sec: 50262.7). Total num frames: 2850029568. Throughput: 0: 50431.7. Samples: 602902900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:25:42,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 12:25:44,236][49750] Updated weights for policy 0, policy_version 173961 (0.0029) [2024-04-26 12:25:47,063][49517] Fps is (10 sec: 52428.4, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 2850308096. Throughput: 0: 50388.0. Samples: 603204300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:25:47,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:25:47,528][49750] Updated weights for policy 0, policy_version 173971 (0.0031) [2024-04-26 12:25:50,717][49750] Updated weights for policy 0, policy_version 173981 (0.0033) [2024-04-26 12:25:52,063][49517] Fps is (10 sec: 54068.2, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2850570240. Throughput: 0: 50400.3. Samples: 603363120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:25:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 12:25:54,057][49750] Updated weights for policy 0, policy_version 173991 (0.0034) [2024-04-26 12:25:56,223][49728] Signal inference workers to stop experience collection... (8950 times) [2024-04-26 12:25:56,227][49728] Signal inference workers to resume experience collection... (8950 times) [2024-04-26 12:25:56,258][49750] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-04-26 12:25:56,262][49750] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-04-26 12:25:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2850799616. Throughput: 0: 50372.5. Samples: 603658680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:25:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:25:57,417][49750] Updated weights for policy 0, policy_version 174001 (0.0034) [2024-04-26 12:26:00,518][49750] Updated weights for policy 0, policy_version 174011 (0.0026) [2024-04-26 12:26:02,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 2851045376. Throughput: 0: 50511.3. Samples: 603967020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:26:02,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 12:26:03,793][49750] Updated weights for policy 0, policy_version 174021 (0.0036) [2024-04-26 12:26:06,921][49750] Updated weights for policy 0, policy_version 174031 (0.0030) [2024-04-26 12:26:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50485.0). Total num frames: 2851323904. Throughput: 0: 50418.7. Samples: 604109380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:26:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 12:26:10,135][49750] Updated weights for policy 0, policy_version 174041 (0.0031) [2024-04-26 12:26:12,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50517.1, 300 sec: 50484.9). Total num frames: 2851569664. Throughput: 0: 50578.4. Samples: 604415840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:26:12,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 12:26:13,216][49750] Updated weights for policy 0, policy_version 174051 (0.0030) [2024-04-26 12:26:16,712][49750] Updated weights for policy 0, policy_version 174061 (0.0030) [2024-04-26 12:26:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2851831808. Throughput: 0: 50692.1. Samples: 604726940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:26:17,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 12:26:19,743][49750] Updated weights for policy 0, policy_version 174071 (0.0037) [2024-04-26 12:26:22,062][49517] Fps is (10 sec: 50791.9, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 2852077568. Throughput: 0: 50556.1. Samples: 604877120. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:26:22,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 12:26:23,191][49750] Updated weights for policy 0, policy_version 174081 (0.0039) [2024-04-26 12:26:26,274][49750] Updated weights for policy 0, policy_version 174091 (0.0030) [2024-04-26 12:26:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2852323328. Throughput: 0: 50579.9. Samples: 605178980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:26:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 12:26:29,576][49750] Updated weights for policy 0, policy_version 174101 (0.0029) [2024-04-26 12:26:32,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2852569088. Throughput: 0: 50629.0. Samples: 605482600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 12:26:32,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 12:26:32,699][49750] Updated weights for policy 0, policy_version 174111 (0.0041) [2024-04-26 12:26:36,024][49750] Updated weights for policy 0, policy_version 174121 (0.0035) [2024-04-26 12:26:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2852847616. Throughput: 0: 50597.5. Samples: 605640000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:26:37,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 12:26:39,656][49750] Updated weights for policy 0, policy_version 174131 (0.0030) [2024-04-26 12:26:42,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.6, 300 sec: 50540.4). Total num frames: 2853093376. Throughput: 0: 50675.8. Samples: 605939100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:26:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 12:26:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000174139_2853093376.pth... 
[2024-04-26 12:26:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000173400_2840985600.pth [2024-04-26 12:26:42,536][49750] Updated weights for policy 0, policy_version 174141 (0.0036) [2024-04-26 12:26:46,076][49750] Updated weights for policy 0, policy_version 174151 (0.0036) [2024-04-26 12:26:47,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 2853322752. Throughput: 0: 50595.1. Samples: 606243800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:26:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 12:26:49,172][49750] Updated weights for policy 0, policy_version 174161 (0.0039) [2024-04-26 12:26:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2853601280. Throughput: 0: 50526.7. Samples: 606383080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:26:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 12:26:52,612][49750] Updated weights for policy 0, policy_version 174171 (0.0027) [2024-04-26 12:26:55,575][49750] Updated weights for policy 0, policy_version 174181 (0.0040) [2024-04-26 12:26:57,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.2, 300 sec: 50484.9). Total num frames: 2853847040. Throughput: 0: 50534.2. Samples: 606689880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:26:57,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 12:26:59,196][49750] Updated weights for policy 0, policy_version 174191 (0.0032) [2024-04-26 12:27:01,916][49750] Updated weights for policy 0, policy_version 174201 (0.0028) [2024-04-26 12:27:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50485.0). Total num frames: 2854109184. Throughput: 0: 50373.9. Samples: 606993760. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:27:02,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 12:27:05,538][49750] Updated weights for policy 0, policy_version 174211 (0.0033) [2024-04-26 12:27:07,062][49517] Fps is (10 sec: 47514.9, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2854322176. Throughput: 0: 50375.5. Samples: 607144020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:27:07,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 12:27:08,309][49728] Signal inference workers to stop experience collection... (9000 times) [2024-04-26 12:27:08,310][49728] Signal inference workers to resume experience collection... (9000 times) [2024-04-26 12:27:08,323][49750] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-04-26 12:27:08,323][49750] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-04-26 12:27:08,442][49750] Updated weights for policy 0, policy_version 174221 (0.0031) [2024-04-26 12:27:12,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.5, 300 sec: 50429.4). Total num frames: 2854584320. Throughput: 0: 50485.9. Samples: 607450840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:27:12,063][49517] Avg episode reward: [(0, '0.703')] [2024-04-26 12:27:12,112][49728] Saving new best policy, reward=0.703! [2024-04-26 12:27:12,120][49750] Updated weights for policy 0, policy_version 174231 (0.0032) [2024-04-26 12:27:15,089][49750] Updated weights for policy 0, policy_version 174241 (0.0030) [2024-04-26 12:27:17,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2854846464. Throughput: 0: 50496.8. Samples: 607754960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:27:17,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 12:27:18,488][49750] Updated weights for policy 0, policy_version 174251 (0.0029) [2024-04-26 12:27:21,492][49750] Updated weights for policy 0, policy_version 174261 (0.0032) [2024-04-26 12:27:22,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 2855124992. Throughput: 0: 50379.6. Samples: 607907080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:27:22,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 12:27:25,116][49750] Updated weights for policy 0, policy_version 174271 (0.0029) [2024-04-26 12:27:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 2855354368. Throughput: 0: 50397.4. Samples: 608206980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:27:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 12:27:27,884][49750] Updated weights for policy 0, policy_version 174281 (0.0037) [2024-04-26 12:27:31,576][49750] Updated weights for policy 0, policy_version 174291 (0.0031) [2024-04-26 12:27:32,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50517.2, 300 sec: 50373.9). Total num frames: 2855600128. Throughput: 0: 50411.9. Samples: 608512340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:27:32,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 12:27:34,409][49750] Updated weights for policy 0, policy_version 174301 (0.0031) [2024-04-26 12:27:37,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50244.3, 300 sec: 50429.5). Total num frames: 2855862272. Throughput: 0: 50471.6. Samples: 608654300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 12:27:37,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 12:27:37,994][49750] Updated weights for policy 0, policy_version 174311 (0.0032) [2024-04-26 12:27:40,980][49750] Updated weights for policy 0, policy_version 174321 (0.0032) [2024-04-26 12:27:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 2856124416. Throughput: 0: 50426.4. Samples: 608959060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:27:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 12:27:44,586][49750] Updated weights for policy 0, policy_version 174331 (0.0032) [2024-04-26 12:27:47,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2856370176. Throughput: 0: 50471.9. Samples: 609265000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:27:47,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 12:27:47,391][49750] Updated weights for policy 0, policy_version 174341 (0.0028) [2024-04-26 12:27:51,274][49750] Updated weights for policy 0, policy_version 174351 (0.0035) [2024-04-26 12:27:52,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2856599552. Throughput: 0: 50347.1. Samples: 609409640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:27:52,063][49517] Avg episode reward: [(0, '0.443')] [2024-04-26 12:27:53,796][49750] Updated weights for policy 0, policy_version 174361 (0.0030) [2024-04-26 12:27:57,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.4, 300 sec: 50485.0). Total num frames: 2856861696. Throughput: 0: 50378.9. Samples: 609717900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:27:57,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 12:27:57,678][49750] Updated weights for policy 0, policy_version 174371 (0.0028) [2024-04-26 12:28:00,378][49750] Updated weights for policy 0, policy_version 174381 (0.0031) [2024-04-26 12:28:02,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2857140224. Throughput: 0: 50284.1. Samples: 610017740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:28:02,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 12:28:03,989][49750] Updated weights for policy 0, policy_version 174391 (0.0031) [2024-04-26 12:28:04,008][49728] Signal inference workers to stop experience collection... (9050 times) [2024-04-26 12:28:04,010][49728] Signal inference workers to resume experience collection... (9050 times) [2024-04-26 12:28:04,042][49750] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-04-26 12:28:04,043][49750] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-04-26 12:28:06,908][49750] Updated weights for policy 0, policy_version 174401 (0.0037) [2024-04-26 12:28:07,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.5, 300 sec: 50596.0). Total num frames: 2857402368. Throughput: 0: 50395.1. Samples: 610174860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:28:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 12:28:10,527][49750] Updated weights for policy 0, policy_version 174411 (0.0032) [2024-04-26 12:28:12,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.1, 300 sec: 50484.9). Total num frames: 2857615360. Throughput: 0: 50484.8. Samples: 610478800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:28:12,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 12:28:13,335][49750] Updated weights for policy 0, policy_version 174421 (0.0030) [2024-04-26 12:28:17,062][49517] Fps is (10 sec: 45875.2, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2857861120. Throughput: 0: 50331.3. Samples: 610777240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:28:17,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:28:17,093][49750] Updated weights for policy 0, policy_version 174431 (0.0031) [2024-04-26 12:28:19,813][49750] Updated weights for policy 0, policy_version 174441 (0.0034) [2024-04-26 12:28:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2858139648. Throughput: 0: 50525.7. Samples: 610927960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:28:22,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 12:28:23,357][49750] Updated weights for policy 0, policy_version 174451 (0.0031) [2024-04-26 12:28:26,262][49750] Updated weights for policy 0, policy_version 174461 (0.0032) [2024-04-26 12:28:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.5, 300 sec: 50485.0). Total num frames: 2858385408. Throughput: 0: 50586.0. Samples: 611235420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:28:27,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 12:28:29,896][49750] Updated weights for policy 0, policy_version 174471 (0.0036) [2024-04-26 12:28:32,063][49517] Fps is (10 sec: 49150.7, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2858631168. Throughput: 0: 50539.7. Samples: 611539300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:28:32,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 12:28:32,759][49750] Updated weights for policy 0, policy_version 174481 (0.0028) [2024-04-26 12:28:36,575][49750] Updated weights for policy 0, policy_version 174491 (0.0026) [2024-04-26 12:28:37,063][49517] Fps is (10 sec: 49147.0, 60 sec: 50243.4, 300 sec: 50373.7). Total num frames: 2858876928. Throughput: 0: 50646.5. Samples: 611688780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:28:37,064][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 12:28:39,248][49750] Updated weights for policy 0, policy_version 174501 (0.0034) [2024-04-26 12:28:42,062][49517] Fps is (10 sec: 49153.3, 60 sec: 49971.3, 300 sec: 50373.8). Total num frames: 2859122688. Throughput: 0: 50575.2. Samples: 611993780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 12:28:42,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 12:28:42,214][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000174509_2859155456.pth... [2024-04-26 12:28:42,259][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000173769_2847031296.pth [2024-04-26 12:28:43,046][49750] Updated weights for policy 0, policy_version 174511 (0.0032) [2024-04-26 12:28:45,680][49750] Updated weights for policy 0, policy_version 174521 (0.0032) [2024-04-26 12:28:47,062][49517] Fps is (10 sec: 52433.6, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 2859401216. Throughput: 0: 50687.9. Samples: 612298700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:28:47,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 12:28:49,462][49750] Updated weights for policy 0, policy_version 174531 (0.0029) [2024-04-26 12:28:52,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2859663360. Throughput: 0: 50646.7. Samples: 612453960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:28:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 12:28:52,164][49750] Updated weights for policy 0, policy_version 174541 (0.0033) [2024-04-26 12:28:55,990][49750] Updated weights for policy 0, policy_version 174551 (0.0031) [2024-04-26 12:28:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50485.0). Total num frames: 2859909120. Throughput: 0: 50644.2. Samples: 612757780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:28:57,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 12:28:58,675][49750] Updated weights for policy 0, policy_version 174561 (0.0031) [2024-04-26 12:29:02,062][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 2860138496. Throughput: 0: 50683.9. Samples: 613058020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:29:02,500][49750] Updated weights for policy 0, policy_version 174571 (0.0033) [2024-04-26 12:29:05,077][49750] Updated weights for policy 0, policy_version 174581 (0.0032) [2024-04-26 12:29:07,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50540.5). Total num frames: 2860417024. Throughput: 0: 50664.4. Samples: 613207860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:07,071][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 12:29:08,875][49750] Updated weights for policy 0, policy_version 174591 (0.0035) [2024-04-26 12:29:11,507][49750] Updated weights for policy 0, policy_version 174601 (0.0030) [2024-04-26 12:29:12,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.6, 300 sec: 50540.5). Total num frames: 2860679168. Throughput: 0: 50604.4. Samples: 613512620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:12,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 12:29:15,353][49750] Updated weights for policy 0, policy_version 174611 (0.0033) [2024-04-26 12:29:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2860924928. Throughput: 0: 50710.6. Samples: 613821260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:17,071][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 12:29:18,092][49750] Updated weights for policy 0, policy_version 174621 (0.0031) [2024-04-26 12:29:21,716][49750] Updated weights for policy 0, policy_version 174631 (0.0028) [2024-04-26 12:29:22,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 2861170688. Throughput: 0: 50668.2. Samples: 613968800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:22,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 12:29:24,559][49750] Updated weights for policy 0, policy_version 174641 (0.0033) [2024-04-26 12:29:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 2861432832. Throughput: 0: 50675.2. Samples: 614274160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:27,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 12:29:28,005][49728] Signal inference workers to stop experience collection... (9100 times) [2024-04-26 12:29:28,005][49728] Signal inference workers to resume experience collection... 
(9100 times) [2024-04-26 12:29:28,031][49750] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-04-26 12:29:28,031][49750] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-04-26 12:29:28,136][49750] Updated weights for policy 0, policy_version 174651 (0.0032) [2024-04-26 12:29:31,000][49750] Updated weights for policy 0, policy_version 174661 (0.0030) [2024-04-26 12:29:32,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2861662208. Throughput: 0: 50585.7. Samples: 614575060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:32,072][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 12:29:34,633][49750] Updated weights for policy 0, policy_version 174671 (0.0027) [2024-04-26 12:29:37,062][49517] Fps is (10 sec: 50789.8, 60 sec: 51064.2, 300 sec: 50540.5). Total num frames: 2861940736. Throughput: 0: 50605.3. Samples: 614731200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:37,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 12:29:37,405][49750] Updated weights for policy 0, policy_version 174681 (0.0035) [2024-04-26 12:29:41,164][49750] Updated weights for policy 0, policy_version 174691 (0.0032) [2024-04-26 12:29:42,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 2862186496. Throughput: 0: 50537.2. Samples: 615031960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:42,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 12:29:43,841][49750] Updated weights for policy 0, policy_version 174701 (0.0029) [2024-04-26 12:29:47,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 2862432256. Throughput: 0: 50560.3. Samples: 615333240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 12:29:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 12:29:47,621][49750] Updated weights for policy 0, policy_version 174711 (0.0031) [2024-04-26 12:29:50,395][49750] Updated weights for policy 0, policy_version 174721 (0.0036) [2024-04-26 12:29:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 2862678016. Throughput: 0: 50665.4. Samples: 615487800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:29:52,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 12:29:54,003][49750] Updated weights for policy 0, policy_version 174731 (0.0030) [2024-04-26 12:29:57,041][49750] Updated weights for policy 0, policy_version 174741 (0.0031) [2024-04-26 12:29:57,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 2862956544. Throughput: 0: 50653.2. Samples: 615792020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:29:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:30:00,370][49750] Updated weights for policy 0, policy_version 174751 (0.0030) [2024-04-26 12:30:02,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2863202304. Throughput: 0: 50538.1. Samples: 616095480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 12:30:03,491][49750] Updated weights for policy 0, policy_version 174761 (0.0037) [2024-04-26 12:30:06,784][49750] Updated weights for policy 0, policy_version 174771 (0.0036) [2024-04-26 12:30:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50540.4). Total num frames: 2863448064. Throughput: 0: 50684.2. Samples: 616249600. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:07,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 12:30:09,943][49750] Updated weights for policy 0, policy_version 174781 (0.0031) [2024-04-26 12:30:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 2863710208. Throughput: 0: 50649.1. Samples: 616553380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:12,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 12:30:13,238][49750] Updated weights for policy 0, policy_version 174791 (0.0029) [2024-04-26 12:30:16,378][49750] Updated weights for policy 0, policy_version 174801 (0.0032) [2024-04-26 12:30:17,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.1, 300 sec: 50540.4). Total num frames: 2863955968. Throughput: 0: 50542.0. Samples: 616849460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:17,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 12:30:19,672][49750] Updated weights for policy 0, policy_version 174811 (0.0032) [2024-04-26 12:30:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2864201728. Throughput: 0: 50507.6. Samples: 617004040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 12:30:22,829][49750] Updated weights for policy 0, policy_version 174821 (0.0030) [2024-04-26 12:30:26,016][49750] Updated weights for policy 0, policy_version 174831 (0.0035) [2024-04-26 12:30:27,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2864463872. Throughput: 0: 50683.6. Samples: 617312720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:27,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 12:30:29,478][49750] Updated weights for policy 0, policy_version 174841 (0.0032) [2024-04-26 12:30:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50596.0). 
Total num frames: 2864709632. Throughput: 0: 50624.6. Samples: 617611340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:32,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 12:30:33,023][49750] Updated weights for policy 0, policy_version 174851 (0.0035) [2024-04-26 12:30:36,136][49750] Updated weights for policy 0, policy_version 174861 (0.0033) [2024-04-26 12:30:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50596.1). Total num frames: 2864955392. Throughput: 0: 50734.7. Samples: 617770860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:37,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 12:30:37,096][49728] Signal inference workers to stop experience collection... (9150 times) [2024-04-26 12:30:37,097][49728] Signal inference workers to resume experience collection... (9150 times) [2024-04-26 12:30:37,126][49750] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-04-26 12:30:37,127][49750] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-04-26 12:30:39,423][49750] Updated weights for policy 0, policy_version 174871 (0.0027) [2024-04-26 12:30:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 2865217536. Throughput: 0: 50673.3. Samples: 618072320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:30:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000174879_2865217536.pth... [2024-04-26 12:30:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000174139_2853093376.pth [2024-04-26 12:30:42,664][49750] Updated weights for policy 0, policy_version 174881 (0.0033) [2024-04-26 12:30:45,778][49750] Updated weights for policy 0, policy_version 174891 (0.0031) [2024-04-26 12:30:47,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50484.9). 
Total num frames: 2865463296. Throughput: 0: 50652.4. Samples: 618374840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:47,072][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 12:30:49,082][49750] Updated weights for policy 0, policy_version 174901 (0.0028) [2024-04-26 12:30:52,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 2865725440. Throughput: 0: 50583.4. Samples: 618525840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 12:30:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:30:52,100][49750] Updated weights for policy 0, policy_version 174911 (0.0028) [2024-04-26 12:30:55,657][49750] Updated weights for policy 0, policy_version 174921 (0.0033) [2024-04-26 12:30:57,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 2865987584. Throughput: 0: 50512.9. Samples: 618826460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:30:57,072][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 12:30:58,563][49750] Updated weights for policy 0, policy_version 174931 (0.0025) [2024-04-26 12:31:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2866216960. Throughput: 0: 50724.7. Samples: 619132060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:31:02,093][49750] Updated weights for policy 0, policy_version 174941 (0.0031) [2024-04-26 12:31:05,534][49750] Updated weights for policy 0, policy_version 174951 (0.0033) [2024-04-26 12:31:07,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.5, 300 sec: 50485.0). Total num frames: 2866462720. Throughput: 0: 50492.9. Samples: 619276220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:07,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 12:31:08,789][49750] Updated weights for policy 0, policy_version 174961 (0.0036) [2024-04-26 12:31:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 2866708480. Throughput: 0: 50240.9. Samples: 619573560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:12,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 12:31:12,123][49750] Updated weights for policy 0, policy_version 174971 (0.0031) [2024-04-26 12:31:15,411][49750] Updated weights for policy 0, policy_version 174981 (0.0033) [2024-04-26 12:31:17,063][49517] Fps is (10 sec: 55704.6, 60 sec: 51063.6, 300 sec: 50651.5). Total num frames: 2867019776. Throughput: 0: 50336.3. Samples: 619876480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:17,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:31:18,491][49750] Updated weights for policy 0, policy_version 174991 (0.0034) [2024-04-26 12:31:21,645][49750] Updated weights for policy 0, policy_version 175001 (0.0030) [2024-04-26 12:31:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2867232768. Throughput: 0: 50369.8. Samples: 620037500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:22,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 12:31:24,795][49750] Updated weights for policy 0, policy_version 175011 (0.0036) [2024-04-26 12:31:27,063][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.2, 300 sec: 50540.5). Total num frames: 2867478528. Throughput: 0: 50438.7. Samples: 620342060. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:27,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 12:31:28,173][49750] Updated weights for policy 0, policy_version 175021 (0.0033) [2024-04-26 12:31:31,289][49750] Updated weights for policy 0, policy_version 175031 (0.0034) [2024-04-26 12:31:32,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2867724288. Throughput: 0: 50431.6. Samples: 620644260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 12:31:34,647][49750] Updated weights for policy 0, policy_version 175041 (0.0031) [2024-04-26 12:31:34,670][49728] Signal inference workers to stop experience collection... (9200 times) [2024-04-26 12:31:34,671][49728] Signal inference workers to resume experience collection... (9200 times) [2024-04-26 12:31:34,700][49750] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-04-26 12:31:34,700][49750] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-04-26 12:31:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 2868002816. Throughput: 0: 50540.4. Samples: 620800160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:37,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 12:31:37,794][49750] Updated weights for policy 0, policy_version 175051 (0.0035) [2024-04-26 12:31:41,085][49750] Updated weights for policy 0, policy_version 175061 (0.0032) [2024-04-26 12:31:42,063][49517] Fps is (10 sec: 55704.6, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 2868281344. Throughput: 0: 50782.1. Samples: 621111660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:42,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 12:31:44,385][49750] Updated weights for policy 0, policy_version 175071 (0.0032) [2024-04-26 12:31:47,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2868477952. Throughput: 0: 50555.5. Samples: 621407060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:47,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 12:31:47,474][49750] Updated weights for policy 0, policy_version 175081 (0.0030) [2024-04-26 12:31:50,894][49750] Updated weights for policy 0, policy_version 175091 (0.0030) [2024-04-26 12:31:52,062][49517] Fps is (10 sec: 44237.9, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2868723712. Throughput: 0: 50490.2. Samples: 621548280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 12:31:53,982][49750] Updated weights for policy 0, policy_version 175101 (0.0034) [2024-04-26 12:31:57,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2869002240. Throughput: 0: 50559.0. Samples: 621848720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:31:57,064][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 12:31:57,225][49750] Updated weights for policy 0, policy_version 175111 (0.0026) [2024-04-26 12:32:00,501][49750] Updated weights for policy 0, policy_version 175121 (0.0030) [2024-04-26 12:32:02,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 2869280768. Throughput: 0: 50575.6. Samples: 622152380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:02,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 12:32:03,663][49750] Updated weights for policy 0, policy_version 175131 (0.0028) [2024-04-26 12:32:06,907][49750] Updated weights for policy 0, policy_version 175141 (0.0033) [2024-04-26 12:32:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 2869510144. Throughput: 0: 50686.5. Samples: 622318400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:07,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 12:32:10,152][49750] Updated weights for policy 0, policy_version 175151 (0.0031) [2024-04-26 12:32:12,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50596.0). Total num frames: 2869772288. Throughput: 0: 50618.2. Samples: 622619880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:12,063][49517] Avg episode reward: [(0, '0.435')] [2024-04-26 12:32:13,505][49750] Updated weights for policy 0, policy_version 175161 (0.0034) [2024-04-26 12:32:16,874][49750] Updated weights for policy 0, policy_version 175171 (0.0029) [2024-04-26 12:32:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 50429.4). Total num frames: 2870001664. Throughput: 0: 50656.9. Samples: 622923820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 12:32:19,952][49750] Updated weights for policy 0, policy_version 175181 (0.0035) [2024-04-26 12:32:22,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.2, 300 sec: 50596.0). Total num frames: 2870280192. Throughput: 0: 50527.8. Samples: 623073920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 12:32:23,339][49750] Updated weights for policy 0, policy_version 175191 (0.0028) [2024-04-26 12:32:26,272][49750] Updated weights for policy 0, policy_version 175201 (0.0028) [2024-04-26 12:32:26,707][49728] Signal inference workers to stop experience collection... (9250 times) [2024-04-26 12:32:26,707][49728] Signal inference workers to resume experience collection... (9250 times) [2024-04-26 12:32:26,736][49750] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-04-26 12:32:26,736][49750] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-04-26 12:32:27,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 2870542336. Throughput: 0: 50391.8. Samples: 623379280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:27,063][49517] Avg episode reward: [(0, '0.444')] [2024-04-26 12:32:29,737][49750] Updated weights for policy 0, policy_version 175211 (0.0029) [2024-04-26 12:32:32,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50540.4). Total num frames: 2870771712. Throughput: 0: 50468.8. Samples: 623678160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:32,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 12:32:32,847][49750] Updated weights for policy 0, policy_version 175221 (0.0028) [2024-04-26 12:32:36,335][49750] Updated weights for policy 0, policy_version 175231 (0.0032) [2024-04-26 12:32:37,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2871017472. Throughput: 0: 50518.1. Samples: 623821600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:37,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 12:32:39,323][49750] Updated weights for policy 0, policy_version 175241 (0.0030) [2024-04-26 12:32:42,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49698.3, 300 sec: 50484.9). Total num frames: 2871263232. Throughput: 0: 50393.5. Samples: 624116420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:32:42,083][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000175249_2871279616.pth... [2024-04-26 12:32:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000174509_2859155456.pth [2024-04-26 12:32:42,924][49750] Updated weights for policy 0, policy_version 175251 (0.0030) [2024-04-26 12:32:45,683][49750] Updated weights for policy 0, policy_version 175261 (0.0031) [2024-04-26 12:32:47,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51609.5, 300 sec: 50762.6). Total num frames: 2871574528. Throughput: 0: 50538.7. Samples: 624426620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:47,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 12:32:49,399][49750] Updated weights for policy 0, policy_version 175271 (0.0033) [2024-04-26 12:32:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50596.0). Total num frames: 2871787520. Throughput: 0: 50489.8. Samples: 624590440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:52,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 12:32:52,090][49750] Updated weights for policy 0, policy_version 175281 (0.0034) [2024-04-26 12:32:55,869][49750] Updated weights for policy 0, policy_version 175291 (0.0035) [2024-04-26 12:32:57,062][49517] Fps is (10 sec: 44237.4, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2872016896. Throughput: 0: 50563.8. Samples: 624895240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:32:57,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 12:32:58,629][49750] Updated weights for policy 0, policy_version 175301 (0.0033) [2024-04-26 12:33:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 2872262656. Throughput: 0: 50455.2. Samples: 625194300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:33:02,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 12:33:02,272][49750] Updated weights for policy 0, policy_version 175311 (0.0029) [2024-04-26 12:33:05,163][49750] Updated weights for policy 0, policy_version 175321 (0.0033) [2024-04-26 12:33:07,063][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 2872557568. Throughput: 0: 50351.2. Samples: 625339720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:07,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 12:33:08,895][49750] Updated weights for policy 0, policy_version 175331 (0.0031) [2024-04-26 12:33:11,583][49750] Updated weights for policy 0, policy_version 175341 (0.0029) [2024-04-26 12:33:12,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 2872786944. Throughput: 0: 50299.0. Samples: 625642740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:12,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:33:15,348][49750] Updated weights for policy 0, policy_version 175351 (0.0031) [2024-04-26 12:33:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 2873032704. Throughput: 0: 50477.9. Samples: 625949660. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:17,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 12:33:17,881][49750] Updated weights for policy 0, policy_version 175361 (0.0040) [2024-04-26 12:33:21,715][49750] Updated weights for policy 0, policy_version 175371 (0.0028) [2024-04-26 12:33:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.4, 300 sec: 50540.5). Total num frames: 2873294848. Throughput: 0: 50342.7. Samples: 626087020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:22,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 12:33:24,255][49728] Signal inference workers to stop experience collection... (9300 times) [2024-04-26 12:33:24,255][49728] Signal inference workers to resume experience collection... (9300 times) [2024-04-26 12:33:24,283][49750] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-04-26 12:33:24,283][49750] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-04-26 12:33:24,385][49750] Updated weights for policy 0, policy_version 175381 (0.0036) [2024-04-26 12:33:27,063][49517] Fps is (10 sec: 50789.3, 60 sec: 49971.0, 300 sec: 50540.5). Total num frames: 2873540608. Throughput: 0: 50642.4. Samples: 626395340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 12:33:28,367][49750] Updated weights for policy 0, policy_version 175391 (0.0034) [2024-04-26 12:33:30,823][49750] Updated weights for policy 0, policy_version 175401 (0.0033) [2024-04-26 12:33:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50651.7). Total num frames: 2873819136. Throughput: 0: 50506.3. Samples: 626699400. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 12:33:34,920][49750] Updated weights for policy 0, policy_version 175411 (0.0030) [2024-04-26 12:33:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 2874064896. Throughput: 0: 50404.3. Samples: 626858640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:33:37,283][49750] Updated weights for policy 0, policy_version 175421 (0.0032) [2024-04-26 12:33:41,428][49750] Updated weights for policy 0, policy_version 175431 (0.0041) [2024-04-26 12:33:42,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2874294272. Throughput: 0: 50367.5. Samples: 627161780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:42,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 12:33:43,675][49750] Updated weights for policy 0, policy_version 175441 (0.0028) [2024-04-26 12:33:47,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49425.2, 300 sec: 50429.4). Total num frames: 2874540032. Throughput: 0: 50469.4. Samples: 627465420. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 12:33:48,015][49750] Updated weights for policy 0, policy_version 175451 (0.0031) [2024-04-26 12:33:50,301][49750] Updated weights for policy 0, policy_version 175461 (0.0034) [2024-04-26 12:33:52,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 2874834944. Throughput: 0: 50465.8. Samples: 627610680. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:52,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 12:33:54,542][49750] Updated weights for policy 0, policy_version 175471 (0.0028) [2024-04-26 12:33:56,946][49750] Updated weights for policy 0, policy_version 175481 (0.0030) [2024-04-26 12:33:57,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 2875080704. Throughput: 0: 50504.5. Samples: 627915440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:33:57,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 12:34:00,977][49750] Updated weights for policy 0, policy_version 175491 (0.0035) [2024-04-26 12:34:02,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2875293696. Throughput: 0: 50384.0. Samples: 628216940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:34:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:34:03,412][49750] Updated weights for policy 0, policy_version 175501 (0.0032) [2024-04-26 12:34:04,298][49728] Signal inference workers to stop experience collection... (9350 times) [2024-04-26 12:34:04,340][49750] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-04-26 12:34:04,373][49728] Signal inference workers to resume experience collection... (9350 times) [2024-04-26 12:34:04,374][49750] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-04-26 12:34:07,063][49517] Fps is (10 sec: 45874.8, 60 sec: 49698.1, 300 sec: 50373.8). Total num frames: 2875539456. Throughput: 0: 50602.5. Samples: 628364140. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 24.0) [2024-04-26 12:34:07,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 12:34:07,348][49750] Updated weights for policy 0, policy_version 175511 (0.0030) [2024-04-26 12:34:10,275][49750] Updated weights for policy 0, policy_version 175521 (0.0027) [2024-04-26 12:34:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 2875817984. Throughput: 0: 50617.1. Samples: 628673100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 12:34:13,705][49750] Updated weights for policy 0, policy_version 175531 (0.0029) [2024-04-26 12:34:16,820][49750] Updated weights for policy 0, policy_version 175541 (0.0031) [2024-04-26 12:34:17,063][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.3, 300 sec: 50540.4). Total num frames: 2876080128. Throughput: 0: 50525.2. Samples: 628973040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:17,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 12:34:20,256][49750] Updated weights for policy 0, policy_version 175551 (0.0032) [2024-04-26 12:34:22,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.5, 300 sec: 50596.0). Total num frames: 2876358656. Throughput: 0: 50552.2. Samples: 629133480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:22,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 12:34:23,277][49750] Updated weights for policy 0, policy_version 175561 (0.0036) [2024-04-26 12:34:26,750][49750] Updated weights for policy 0, policy_version 175571 (0.0033) [2024-04-26 12:34:27,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2876571648. Throughput: 0: 50486.2. Samples: 629433660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:27,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 12:34:29,784][49750] Updated weights for policy 0, policy_version 175581 (0.0037) [2024-04-26 12:34:32,063][49517] Fps is (10 sec: 44236.0, 60 sec: 49698.1, 300 sec: 50373.8). Total num frames: 2876801024. Throughput: 0: 50548.2. Samples: 629740100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:32,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 12:34:33,098][49750] Updated weights for policy 0, policy_version 175591 (0.0027) [2024-04-26 12:34:36,258][49750] Updated weights for policy 0, policy_version 175601 (0.0032) [2024-04-26 12:34:37,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2877095936. Throughput: 0: 50273.3. Samples: 629872980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:37,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 12:34:39,558][49750] Updated weights for policy 0, policy_version 175611 (0.0030) [2024-04-26 12:34:42,063][49517] Fps is (10 sec: 55705.5, 60 sec: 51063.4, 300 sec: 50596.0). Total num frames: 2877358080. Throughput: 0: 50500.8. Samples: 630187980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:42,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 12:34:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000175620_2877358080.pth... [2024-04-26 12:34:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000174879_2865217536.pth [2024-04-26 12:34:42,550][49750] Updated weights for policy 0, policy_version 175621 (0.0032) [2024-04-26 12:34:46,242][49750] Updated weights for policy 0, policy_version 175631 (0.0033) [2024-04-26 12:34:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 2877587456. Throughput: 0: 50484.0. Samples: 630488720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:47,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:34:48,878][49750] Updated weights for policy 0, policy_version 175641 (0.0028) [2024-04-26 12:34:52,062][49517] Fps is (10 sec: 45875.7, 60 sec: 49698.1, 300 sec: 50373.9). Total num frames: 2877816832. Throughput: 0: 50481.0. Samples: 630635780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:52,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 12:34:52,405][49728] Signal inference workers to stop experience collection... (9400 times) [2024-04-26 12:34:52,458][49750] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-04-26 12:34:52,517][49728] Signal inference workers to resume experience collection... (9400 times) [2024-04-26 12:34:52,517][49750] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-04-26 12:34:52,641][49750] Updated weights for policy 0, policy_version 175651 (0.0028) [2024-04-26 12:34:55,522][49750] Updated weights for policy 0, policy_version 175661 (0.0027) [2024-04-26 12:34:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2878078976. Throughput: 0: 50339.1. Samples: 630938360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:34:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 12:34:58,962][49750] Updated weights for policy 0, policy_version 175671 (0.0031) [2024-04-26 12:35:02,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50485.0). Total num frames: 2878341120. Throughput: 0: 50444.6. Samples: 631243040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 12:35:02,217][49750] Updated weights for policy 0, policy_version 175681 (0.0029) [2024-04-26 12:35:05,599][49750] Updated weights for policy 0, policy_version 175691 (0.0034) [2024-04-26 12:35:07,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51336.5, 300 sec: 50540.5). Total num frames: 2878619648. Throughput: 0: 50436.2. Samples: 631403120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:07,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 12:35:08,604][49750] Updated weights for policy 0, policy_version 175701 (0.0037) [2024-04-26 12:35:12,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2878816256. Throughput: 0: 50383.3. Samples: 631700900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:12,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 12:35:12,224][49750] Updated weights for policy 0, policy_version 175711 (0.0026) [2024-04-26 12:35:15,185][49750] Updated weights for policy 0, policy_version 175721 (0.0036) [2024-04-26 12:35:17,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 2879078400. Throughput: 0: 50267.2. Samples: 632002120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 12:35:18,532][49750] Updated weights for policy 0, policy_version 175731 (0.0028) [2024-04-26 12:35:21,616][49750] Updated weights for policy 0, policy_version 175741 (0.0030) [2024-04-26 12:35:22,062][49517] Fps is (10 sec: 54067.1, 60 sec: 49971.1, 300 sec: 50484.9). Total num frames: 2879356928. Throughput: 0: 50482.3. Samples: 632144680. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 12:35:24,923][49750] Updated weights for policy 0, policy_version 175751 (0.0029) [2024-04-26 12:35:27,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51063.5, 300 sec: 50596.0). Total num frames: 2879635456. Throughput: 0: 50319.1. Samples: 632452340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 12:35:28,073][49750] Updated weights for policy 0, policy_version 175761 (0.0033) [2024-04-26 12:35:31,593][49750] Updated weights for policy 0, policy_version 175771 (0.0029) [2024-04-26 12:35:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.6, 300 sec: 50540.5). Total num frames: 2879864832. Throughput: 0: 50526.3. Samples: 632762400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:32,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 12:35:34,562][49750] Updated weights for policy 0, policy_version 175781 (0.0033) [2024-04-26 12:35:37,062][49517] Fps is (10 sec: 44237.3, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 2880077824. Throughput: 0: 50512.9. Samples: 632908860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:37,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 12:35:38,130][49750] Updated weights for policy 0, policy_version 175791 (0.0035) [2024-04-26 12:35:40,900][49750] Updated weights for policy 0, policy_version 175801 (0.0031) [2024-04-26 12:35:42,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49698.2, 300 sec: 50429.4). Total num frames: 2880339968. Throughput: 0: 50438.2. Samples: 633208080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:42,063][49517] Avg episode reward: [(0, '0.451')] [2024-04-26 12:35:44,575][49728] Signal inference workers to stop experience collection... 
(9450 times) [2024-04-26 12:35:44,575][49728] Signal inference workers to resume experience collection... (9450 times) [2024-04-26 12:35:44,576][49750] Updated weights for policy 0, policy_version 175811 (0.0043) [2024-04-26 12:35:44,601][49750] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-04-26 12:35:44,607][49750] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-04-26 12:35:47,063][49517] Fps is (10 sec: 54066.3, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 2880618496. Throughput: 0: 50236.3. Samples: 633503680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:47,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 12:35:47,348][49750] Updated weights for policy 0, policy_version 175821 (0.0030) [2024-04-26 12:35:50,925][49750] Updated weights for policy 0, policy_version 175831 (0.0032) [2024-04-26 12:35:52,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 2880880640. Throughput: 0: 50578.2. Samples: 633679140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:52,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 12:35:53,809][49750] Updated weights for policy 0, policy_version 175841 (0.0029) [2024-04-26 12:35:57,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2881093632. Throughput: 0: 50495.6. Samples: 633973200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:35:57,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 12:35:57,482][49750] Updated weights for policy 0, policy_version 175851 (0.0029) [2024-04-26 12:36:00,337][49750] Updated weights for policy 0, policy_version 175861 (0.0034) [2024-04-26 12:36:02,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2881355776. Throughput: 0: 50551.5. Samples: 634276940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:36:02,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 12:36:03,887][49750] Updated weights for policy 0, policy_version 175871 (0.0030) [2024-04-26 12:36:06,740][49750] Updated weights for policy 0, policy_version 175881 (0.0028) [2024-04-26 12:36:07,062][49517] Fps is (10 sec: 55704.7, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 2881650688. Throughput: 0: 50687.5. Samples: 634425620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:36:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 12:36:10,234][49750] Updated weights for policy 0, policy_version 175891 (0.0034) [2024-04-26 12:36:12,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.4, 300 sec: 50429.4). Total num frames: 2881896448. Throughput: 0: 50545.3. Samples: 634726880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:36:12,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 12:36:13,213][49750] Updated weights for policy 0, policy_version 175901 (0.0033) [2024-04-26 12:36:16,646][49750] Updated weights for policy 0, policy_version 175911 (0.0025) [2024-04-26 12:36:17,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2882125824. Throughput: 0: 50421.7. Samples: 635031380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:36:17,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 12:36:19,696][49750] Updated weights for policy 0, policy_version 175921 (0.0031) [2024-04-26 12:36:22,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.3, 300 sec: 50485.0). Total num frames: 2882371584. Throughput: 0: 50514.7. Samples: 635182020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:36:22,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 12:36:23,175][49750] Updated weights for policy 0, policy_version 175931 (0.0034) [2024-04-26 12:36:26,275][49750] Updated weights for policy 0, policy_version 175941 (0.0029) [2024-04-26 12:36:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.2, 300 sec: 50484.9). Total num frames: 2882617344. Throughput: 0: 50566.2. Samples: 635483560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:36:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 12:36:29,086][49728] Signal inference workers to stop experience collection... (9500 times) [2024-04-26 12:36:29,091][49728] Signal inference workers to resume experience collection... (9500 times) [2024-04-26 12:36:29,120][49750] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-04-26 12:36:29,120][49750] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-04-26 12:36:29,656][49750] Updated weights for policy 0, policy_version 175951 (0.0030) [2024-04-26 12:36:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2882895872. Throughput: 0: 50608.6. Samples: 635781060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:36:32,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 12:36:32,662][49750] Updated weights for policy 0, policy_version 175961 (0.0029) [2024-04-26 12:36:36,094][49750] Updated weights for policy 0, policy_version 175971 (0.0032) [2024-04-26 12:36:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50318.4). Total num frames: 2883125248. Throughput: 0: 50361.9. Samples: 635945420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:36:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 12:36:39,060][49750] Updated weights for policy 0, policy_version 175981 (0.0040) [2024-04-26 12:36:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 2883387392. Throughput: 0: 50649.2. Samples: 636252420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:36:42,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 12:36:42,098][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000175989_2883403776.pth... [2024-04-26 12:36:42,148][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000175249_2871279616.pth [2024-04-26 12:36:42,702][49750] Updated weights for policy 0, policy_version 175991 (0.0031) [2024-04-26 12:36:45,568][49750] Updated weights for policy 0, policy_version 176001 (0.0030) [2024-04-26 12:36:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 2883633152. Throughput: 0: 50443.6. Samples: 636546900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:36:47,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 12:36:49,223][49750] Updated weights for policy 0, policy_version 176011 (0.0038) [2024-04-26 12:36:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.5, 300 sec: 50540.5). Total num frames: 2883911680. Throughput: 0: 50530.3. Samples: 636699480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:36:52,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 12:36:52,122][49750] Updated weights for policy 0, policy_version 176021 (0.0038) [2024-04-26 12:36:55,730][49750] Updated weights for policy 0, policy_version 176031 (0.0030) [2024-04-26 12:36:57,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.3, 300 sec: 50429.4). Total num frames: 2884157440. Throughput: 0: 50561.4. Samples: 637002140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:36:57,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 12:36:58,663][49750] Updated weights for policy 0, policy_version 176041 (0.0032) [2024-04-26 12:37:02,015][49750] Updated weights for policy 0, policy_version 176051 (0.0029) [2024-04-26 12:37:02,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2884419584. Throughput: 0: 50417.3. Samples: 637300160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:37:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 12:37:05,101][49750] Updated weights for policy 0, policy_version 176061 (0.0038) [2024-04-26 12:37:07,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 50373.9). Total num frames: 2884632576. Throughput: 0: 50599.9. Samples: 637459020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:37:07,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 12:37:08,384][49750] Updated weights for policy 0, policy_version 176071 (0.0033) [2024-04-26 12:37:11,744][49750] Updated weights for policy 0, policy_version 176081 (0.0033) [2024-04-26 12:37:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 2884911104. Throughput: 0: 50524.8. Samples: 637757180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:37:12,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 12:37:14,911][49750] Updated weights for policy 0, policy_version 176091 (0.0032) [2024-04-26 12:37:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2885156864. Throughput: 0: 50663.1. Samples: 638060900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:37:17,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 12:37:18,227][49750] Updated weights for policy 0, policy_version 176101 (0.0034) [2024-04-26 12:37:21,448][49750] Updated weights for policy 0, policy_version 176111 (0.0030) [2024-04-26 12:37:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2885419008. Throughput: 0: 50334.6. Samples: 638210480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 12:37:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:37:24,278][49728] Signal inference workers to stop experience collection... (9550 times) [2024-04-26 12:37:24,279][49728] Signal inference workers to resume experience collection... (9550 times) [2024-04-26 12:37:24,330][49750] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-04-26 12:37:24,330][49750] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-04-26 12:37:24,580][49750] Updated weights for policy 0, policy_version 176121 (0.0026) [2024-04-26 12:37:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2885681152. Throughput: 0: 50400.0. Samples: 638520420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:37:27,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-26 12:37:27,771][49750] Updated weights for policy 0, policy_version 176131 (0.0035) [2024-04-26 12:37:31,103][49750] Updated weights for policy 0, policy_version 176141 (0.0032) [2024-04-26 12:37:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2885910528. Throughput: 0: 50416.0. Samples: 638815620. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:37:32,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 12:37:34,367][49750] Updated weights for policy 0, policy_version 176151 (0.0033) [2024-04-26 12:37:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 2886172672. Throughput: 0: 50312.5. Samples: 638963540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:37:37,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-26 12:37:37,613][49750] Updated weights for policy 0, policy_version 176161 (0.0029) [2024-04-26 12:37:41,203][49750] Updated weights for policy 0, policy_version 176171 (0.0036) [2024-04-26 12:37:42,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 2886418432. Throughput: 0: 50285.4. Samples: 639264980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:37:42,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 12:37:44,159][49750] Updated weights for policy 0, policy_version 176181 (0.0041) [2024-04-26 12:37:47,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2886680576. Throughput: 0: 50386.1. Samples: 639567540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:37:47,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:37:47,735][49750] Updated weights for policy 0, policy_version 176191 (0.0029) [2024-04-26 12:37:50,621][49750] Updated weights for policy 0, policy_version 176201 (0.0031) [2024-04-26 12:37:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 50484.9). Total num frames: 2886909952. Throughput: 0: 50238.8. Samples: 639719760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:37:52,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 12:37:54,271][49750] Updated weights for policy 0, policy_version 176211 (0.0034) [2024-04-26 12:37:57,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.4, 300 sec: 50596.0). 
Total num frames: 2887188480. Throughput: 0: 50438.3. Samples: 640026900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:37:57,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 12:37:57,103][49750] Updated weights for policy 0, policy_version 176221 (0.0032) [2024-04-26 12:38:00,823][49750] Updated weights for policy 0, policy_version 176231 (0.0028) [2024-04-26 12:38:02,063][49517] Fps is (10 sec: 50789.3, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 2887417856. Throughput: 0: 50458.9. Samples: 640331560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:38:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:38:03,665][49750] Updated weights for policy 0, policy_version 176241 (0.0032) [2024-04-26 12:38:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2887680000. Throughput: 0: 50431.6. Samples: 640479900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:38:07,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 12:38:07,151][49750] Updated weights for policy 0, policy_version 176251 (0.0032) [2024-04-26 12:38:10,145][49750] Updated weights for policy 0, policy_version 176261 (0.0032) [2024-04-26 12:38:12,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2887942144. Throughput: 0: 50186.6. Samples: 640778820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:38:12,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 12:38:13,693][49750] Updated weights for policy 0, policy_version 176271 (0.0031) [2024-04-26 12:38:16,811][49750] Updated weights for policy 0, policy_version 176281 (0.0037) [2024-04-26 12:38:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2888187904. Throughput: 0: 50377.3. Samples: 641082600. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:38:17,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 12:38:20,279][49750] Updated weights for policy 0, policy_version 176291 (0.0028) [2024-04-26 12:38:22,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2888433664. Throughput: 0: 50347.8. Samples: 641229200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:38:22,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 12:38:23,271][49750] Updated weights for policy 0, policy_version 176301 (0.0030) [2024-04-26 12:38:26,664][49750] Updated weights for policy 0, policy_version 176311 (0.0030) [2024-04-26 12:38:27,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 2888679424. Throughput: 0: 50426.1. Samples: 641534160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 12:38:27,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 12:38:29,966][49750] Updated weights for policy 0, policy_version 176321 (0.0030) [2024-04-26 12:38:31,423][49728] Signal inference workers to stop experience collection... (9600 times) [2024-04-26 12:38:31,423][49728] Signal inference workers to resume experience collection... (9600 times) [2024-04-26 12:38:31,452][49750] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-04-26 12:38:31,452][49750] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-04-26 12:38:32,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2888941568. Throughput: 0: 50455.7. Samples: 641838040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:38:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 12:38:33,154][49750] Updated weights for policy 0, policy_version 176331 (0.0033) [2024-04-26 12:38:36,567][49750] Updated weights for policy 0, policy_version 176341 (0.0034) [2024-04-26 12:38:37,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2889187328. Throughput: 0: 50468.9. Samples: 641990860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:38:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 12:38:40,007][49750] Updated weights for policy 0, policy_version 176351 (0.0033) [2024-04-26 12:38:42,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50540.4). Total num frames: 2889449472. Throughput: 0: 50468.3. Samples: 642297980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:38:42,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 12:38:42,148][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000176359_2889465856.pth... [2024-04-26 12:38:42,193][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000175620_2877358080.pth [2024-04-26 12:38:42,998][49750] Updated weights for policy 0, policy_version 176361 (0.0036) [2024-04-26 12:38:46,441][49750] Updated weights for policy 0, policy_version 176371 (0.0029) [2024-04-26 12:38:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2889678848. Throughput: 0: 50397.5. Samples: 642599440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:38:47,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 12:38:49,533][49750] Updated weights for policy 0, policy_version 176381 (0.0031) [2024-04-26 12:38:52,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2889940992. Throughput: 0: 50458.3. Samples: 642750520. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:38:52,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 12:38:53,134][49750] Updated weights for policy 0, policy_version 176391 (0.0032) [2024-04-26 12:38:55,989][49750] Updated weights for policy 0, policy_version 176401 (0.0037) [2024-04-26 12:38:57,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50244.1, 300 sec: 50540.4). Total num frames: 2890203136. Throughput: 0: 50357.6. Samples: 643044920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:38:57,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 12:38:59,628][49750] Updated weights for policy 0, policy_version 176411 (0.0034) [2024-04-26 12:39:02,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2890448896. Throughput: 0: 50339.5. Samples: 643347880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:39:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 12:39:02,317][49750] Updated weights for policy 0, policy_version 176421 (0.0031) [2024-04-26 12:39:06,287][49750] Updated weights for policy 0, policy_version 176431 (0.0029) [2024-04-26 12:39:07,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2890694656. Throughput: 0: 50318.2. Samples: 643493520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:39:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 12:39:08,820][49750] Updated weights for policy 0, policy_version 176441 (0.0032) [2024-04-26 12:39:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2890940416. Throughput: 0: 50384.6. Samples: 643801460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:39:12,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 12:39:12,645][49750] Updated weights for policy 0, policy_version 176451 (0.0029) [2024-04-26 12:39:15,445][49750] Updated weights for policy 0, policy_version 176461 (0.0031) [2024-04-26 12:39:17,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 2891218944. Throughput: 0: 50377.8. Samples: 644105040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:39:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 12:39:19,078][49750] Updated weights for policy 0, policy_version 176471 (0.0032) [2024-04-26 12:39:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2891448320. Throughput: 0: 50464.9. Samples: 644261780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:39:22,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 12:39:22,088][49750] Updated weights for policy 0, policy_version 176481 (0.0035) [2024-04-26 12:39:25,586][49750] Updated weights for policy 0, policy_version 176491 (0.0028) [2024-04-26 12:39:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50540.5). Total num frames: 2891710464. Throughput: 0: 50450.0. Samples: 644568220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:39:27,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 12:39:28,425][49750] Updated weights for policy 0, policy_version 176501 (0.0030) [2024-04-26 12:39:32,002][49750] Updated weights for policy 0, policy_version 176511 (0.0028) [2024-04-26 12:39:32,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50244.1, 300 sec: 50373.8). Total num frames: 2891956224. Throughput: 0: 50438.4. Samples: 644869180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 12:39:32,064][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:39:34,899][49750] Updated weights for policy 0, policy_version 176521 (0.0032) [2024-04-26 12:39:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2892218368. Throughput: 0: 50276.9. Samples: 645012980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:39:37,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 12:39:38,451][49750] Updated weights for policy 0, policy_version 176531 (0.0029) [2024-04-26 12:39:41,295][49750] Updated weights for policy 0, policy_version 176541 (0.0039) [2024-04-26 12:39:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2892464128. Throughput: 0: 50501.5. Samples: 645317480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:39:42,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 12:39:45,006][49750] Updated weights for policy 0, policy_version 176551 (0.0028) [2024-04-26 12:39:47,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50596.0). Total num frames: 2892742656. Throughput: 0: 50520.5. Samples: 645621300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:39:47,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 12:39:47,686][49750] Updated weights for policy 0, policy_version 176561 (0.0028) [2024-04-26 12:39:51,580][49750] Updated weights for policy 0, policy_version 176571 (0.0032) [2024-04-26 12:39:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2892955648. Throughput: 0: 50571.7. Samples: 645769240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:39:52,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 12:39:53,276][49728] Signal inference workers to stop experience collection... 
(9650 times) [2024-04-26 12:39:53,324][49750] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-04-26 12:39:53,347][49728] Signal inference workers to resume experience collection... (9650 times) [2024-04-26 12:39:53,348][49750] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-04-26 12:39:54,139][49750] Updated weights for policy 0, policy_version 176581 (0.0036) [2024-04-26 12:39:57,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.4, 300 sec: 50373.9). Total num frames: 2893201408. Throughput: 0: 50523.6. Samples: 646075020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:39:57,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 12:39:58,098][49750] Updated weights for policy 0, policy_version 176591 (0.0036) [2024-04-26 12:40:00,621][49750] Updated weights for policy 0, policy_version 176601 (0.0029) [2024-04-26 12:40:02,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2893479936. Throughput: 0: 50446.6. Samples: 646375140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:40:02,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 12:40:04,550][49750] Updated weights for policy 0, policy_version 176611 (0.0030) [2024-04-26 12:40:07,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 2893742080. Throughput: 0: 50465.6. Samples: 646532740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:40:07,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 12:40:07,217][49750] Updated weights for policy 0, policy_version 176621 (0.0025) [2024-04-26 12:40:10,920][49750] Updated weights for policy 0, policy_version 176631 (0.0034) [2024-04-26 12:40:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2893971456. Throughput: 0: 50461.6. Samples: 646839000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:40:12,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:40:13,795][49750] Updated weights for policy 0, policy_version 176641 (0.0031) [2024-04-26 12:40:17,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 2894200832. Throughput: 0: 50341.0. Samples: 647134520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:40:17,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 12:40:17,547][49750] Updated weights for policy 0, policy_version 176651 (0.0030) [2024-04-26 12:40:20,419][49750] Updated weights for policy 0, policy_version 176661 (0.0029) [2024-04-26 12:40:22,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2894495744. Throughput: 0: 50339.9. Samples: 647278280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:40:22,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 12:40:24,150][49750] Updated weights for policy 0, policy_version 176671 (0.0036) [2024-04-26 12:40:26,909][49750] Updated weights for policy 0, policy_version 176681 (0.0032) [2024-04-26 12:40:27,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 2894741504. Throughput: 0: 50294.1. Samples: 647580720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:40:27,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 12:40:30,541][49750] Updated weights for policy 0, policy_version 176691 (0.0038) [2024-04-26 12:40:32,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50596.0). Total num frames: 2895003648. Throughput: 0: 50327.9. Samples: 647886060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:40:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 12:40:33,269][49750] Updated weights for policy 0, policy_version 176701 (0.0029) [2024-04-26 12:40:36,885][49750] Updated weights for policy 0, policy_version 176711 (0.0031) [2024-04-26 12:40:37,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2895233024. Throughput: 0: 50431.7. Samples: 648038660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:40:37,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:40:39,794][49750] Updated weights for policy 0, policy_version 176721 (0.0032) [2024-04-26 12:40:42,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2895478784. Throughput: 0: 50402.2. Samples: 648343120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 12:40:42,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 12:40:42,077][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000176726_2895478784.pth... [2024-04-26 12:40:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000175989_2883403776.pth [2024-04-26 12:40:43,349][49750] Updated weights for policy 0, policy_version 176731 (0.0026) [2024-04-26 12:40:46,273][49750] Updated weights for policy 0, policy_version 176741 (0.0030) [2024-04-26 12:40:47,062][49517] Fps is (10 sec: 50789.9, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2895740928. Throughput: 0: 50381.8. Samples: 648642320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:40:47,071][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 12:40:49,921][49750] Updated weights for policy 0, policy_version 176751 (0.0035) [2024-04-26 12:40:52,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.6, 300 sec: 50596.0). Total num frames: 2896019456. Throughput: 0: 50465.5. Samples: 648803680. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:40:52,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 12:40:52,690][49750] Updated weights for policy 0, policy_version 176761 (0.0035) [2024-04-26 12:40:56,351][49750] Updated weights for policy 0, policy_version 176771 (0.0037) [2024-04-26 12:40:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2896232448. Throughput: 0: 50437.9. Samples: 649108700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:40:57,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 12:40:59,214][49750] Updated weights for policy 0, policy_version 176781 (0.0030) [2024-04-26 12:41:02,063][49517] Fps is (10 sec: 44236.0, 60 sec: 49698.1, 300 sec: 50207.2). Total num frames: 2896461824. Throughput: 0: 50552.8. Samples: 649409400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:02,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 12:41:02,819][49750] Updated weights for policy 0, policy_version 176791 (0.0028) [2024-04-26 12:41:05,674][49750] Updated weights for policy 0, policy_version 176801 (0.0030) [2024-04-26 12:41:07,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2896756736. Throughput: 0: 50405.0. Samples: 649546500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 12:41:08,737][49728] Signal inference workers to stop experience collection... (9700 times) [2024-04-26 12:41:08,737][49728] Signal inference workers to resume experience collection... 
(9700 times) [2024-04-26 12:41:08,765][49750] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-04-26 12:41:08,765][49750] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-04-26 12:41:09,343][49750] Updated weights for policy 0, policy_version 176811 (0.0034) [2024-04-26 12:41:12,062][49517] Fps is (10 sec: 54067.7, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2897002496. Throughput: 0: 50560.5. Samples: 649855940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:12,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 12:41:12,280][49750] Updated weights for policy 0, policy_version 176821 (0.0032) [2024-04-26 12:41:15,729][49750] Updated weights for policy 0, policy_version 176831 (0.0031) [2024-04-26 12:41:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50484.9). Total num frames: 2897264640. Throughput: 0: 50600.5. Samples: 650163080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:17,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 12:41:18,817][49750] Updated weights for policy 0, policy_version 176841 (0.0029) [2024-04-26 12:41:22,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2897510400. Throughput: 0: 50350.0. Samples: 650304420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:22,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 12:41:22,132][49750] Updated weights for policy 0, policy_version 176851 (0.0030) [2024-04-26 12:41:25,283][49750] Updated weights for policy 0, policy_version 176861 (0.0031) [2024-04-26 12:41:27,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2897739776. Throughput: 0: 50280.4. Samples: 650605740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:27,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 12:41:28,711][49750] Updated weights for policy 0, policy_version 176871 (0.0028) [2024-04-26 12:41:31,956][49750] Updated weights for policy 0, policy_version 176881 (0.0034) [2024-04-26 12:41:32,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2898018304. Throughput: 0: 50402.2. Samples: 650910420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:32,072][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 12:41:35,310][49750] Updated weights for policy 0, policy_version 176891 (0.0030) [2024-04-26 12:41:37,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2898280448. Throughput: 0: 50365.7. Samples: 651070140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:37,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 12:41:38,364][49750] Updated weights for policy 0, policy_version 176901 (0.0034) [2024-04-26 12:41:41,869][49750] Updated weights for policy 0, policy_version 176911 (0.0040) [2024-04-26 12:41:42,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2898509824. Throughput: 0: 50332.0. Samples: 651373640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 12:41:44,891][49750] Updated weights for policy 0, policy_version 176921 (0.0030) [2024-04-26 12:41:47,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 2898755584. Throughput: 0: 50328.2. Samples: 651674160. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 12:41:47,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 12:41:48,293][49750] Updated weights for policy 0, policy_version 176931 (0.0039) [2024-04-26 12:41:51,378][49750] Updated weights for policy 0, policy_version 176941 (0.0036) [2024-04-26 12:41:52,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50244.1, 300 sec: 50429.4). Total num frames: 2899034112. Throughput: 0: 50432.7. Samples: 651815980. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:41:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 12:41:54,774][49750] Updated weights for policy 0, policy_version 176951 (0.0033) [2024-04-26 12:41:57,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2899279872. Throughput: 0: 50387.5. Samples: 652123380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:41:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 12:41:57,857][49750] Updated weights for policy 0, policy_version 176961 (0.0030) [2024-04-26 12:42:01,317][49750] Updated weights for policy 0, policy_version 176971 (0.0030) [2024-04-26 12:42:02,063][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.5, 300 sec: 50484.9). Total num frames: 2899525632. Throughput: 0: 50227.5. Samples: 652423320. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:02,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 12:42:04,389][49750] Updated weights for policy 0, policy_version 176981 (0.0033) [2024-04-26 12:42:07,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2899771392. Throughput: 0: 50355.3. Samples: 652570400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 12:42:07,999][49750] Updated weights for policy 0, policy_version 176991 (0.0036) [2024-04-26 12:42:10,577][49728] Signal inference workers to stop experience collection... 
(9750 times) [2024-04-26 12:42:10,578][49728] Signal inference workers to resume experience collection... (9750 times) [2024-04-26 12:42:10,605][49750] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-04-26 12:42:10,605][49750] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-04-26 12:42:10,929][49750] Updated weights for policy 0, policy_version 177001 (0.0032) [2024-04-26 12:42:12,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2900017152. Throughput: 0: 50528.8. Samples: 652879540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:12,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 12:42:14,549][49750] Updated weights for policy 0, policy_version 177011 (0.0030) [2024-04-26 12:42:17,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2900279296. Throughput: 0: 50289.8. Samples: 653173460. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:17,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 12:42:17,559][49750] Updated weights for policy 0, policy_version 177021 (0.0030) [2024-04-26 12:42:20,911][49750] Updated weights for policy 0, policy_version 177031 (0.0035) [2024-04-26 12:42:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 2900541440. Throughput: 0: 50201.3. Samples: 653329200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 12:42:23,980][49750] Updated weights for policy 0, policy_version 177041 (0.0026) [2024-04-26 12:42:27,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2900770816. Throughput: 0: 50267.0. Samples: 653635660. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:27,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 12:42:27,285][49750] Updated weights for policy 0, policy_version 177051 (0.0032) [2024-04-26 12:42:30,524][49750] Updated weights for policy 0, policy_version 177061 (0.0034) [2024-04-26 12:42:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2901016576. Throughput: 0: 50209.8. Samples: 653933600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:32,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 12:42:33,810][49750] Updated weights for policy 0, policy_version 177071 (0.0031) [2024-04-26 12:42:37,041][49750] Updated weights for policy 0, policy_version 177081 (0.0034) [2024-04-26 12:42:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2901295104. Throughput: 0: 50451.2. Samples: 654086280. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:37,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 12:42:40,326][49750] Updated weights for policy 0, policy_version 177091 (0.0030) [2024-04-26 12:42:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2901540864. Throughput: 0: 50205.4. Samples: 654382620. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:42,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 12:42:42,182][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000177097_2901557248.pth... [2024-04-26 12:42:42,222][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000176359_2889465856.pth [2024-04-26 12:42:43,620][49750] Updated weights for policy 0, policy_version 177101 (0.0032) [2024-04-26 12:42:47,025][49750] Updated weights for policy 0, policy_version 177111 (0.0029) [2024-04-26 12:42:47,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50429.4). 
Total num frames: 2901786624. Throughput: 0: 50276.1. Samples: 654685740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:47,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 12:42:50,113][49750] Updated weights for policy 0, policy_version 177121 (0.0028) [2024-04-26 12:42:52,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2902032384. Throughput: 0: 50463.4. Samples: 654841260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 12:42:52,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:42:53,435][49750] Updated weights for policy 0, policy_version 177131 (0.0031) [2024-04-26 12:42:56,525][49750] Updated weights for policy 0, policy_version 177141 (0.0039) [2024-04-26 12:42:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2902294528. Throughput: 0: 50196.2. Samples: 655138360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:42:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 12:42:59,880][49750] Updated weights for policy 0, policy_version 177151 (0.0030) [2024-04-26 12:43:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2902540288. Throughput: 0: 50340.0. Samples: 655438760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 12:43:03,158][49750] Updated weights for policy 0, policy_version 177161 (0.0029) [2024-04-26 12:43:06,389][49750] Updated weights for policy 0, policy_version 177171 (0.0040) [2024-04-26 12:43:07,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.1, 300 sec: 50373.8). Total num frames: 2902802432. Throughput: 0: 50359.8. Samples: 655595400. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 12:43:09,678][49750] Updated weights for policy 0, policy_version 177181 (0.0033) [2024-04-26 12:43:12,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2903048192. Throughput: 0: 50283.1. Samples: 655898400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 12:43:13,056][49750] Updated weights for policy 0, policy_version 177191 (0.0035) [2024-04-26 12:43:16,149][49750] Updated weights for policy 0, policy_version 177201 (0.0033) [2024-04-26 12:43:17,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2903293952. Throughput: 0: 50284.7. Samples: 656196420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:17,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 12:43:19,459][49750] Updated weights for policy 0, policy_version 177211 (0.0031) [2024-04-26 12:43:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 2903539712. Throughput: 0: 50205.9. Samples: 656345540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 12:43:22,528][49750] Updated weights for policy 0, policy_version 177221 (0.0034) [2024-04-26 12:43:25,786][49750] Updated weights for policy 0, policy_version 177231 (0.0033) [2024-04-26 12:43:27,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 2903801856. Throughput: 0: 50446.5. Samples: 656652720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:27,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 12:43:29,058][49750] Updated weights for policy 0, policy_version 177241 (0.0033) [2024-04-26 12:43:32,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50429.4). 
Total num frames: 2904064000. Throughput: 0: 50407.5. Samples: 656954080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:32,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 12:43:32,284][49750] Updated weights for policy 0, policy_version 177251 (0.0031) [2024-04-26 12:43:35,547][49750] Updated weights for policy 0, policy_version 177261 (0.0036) [2024-04-26 12:43:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 2904293376. Throughput: 0: 50459.9. Samples: 657111960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 12:43:38,621][49728] Signal inference workers to stop experience collection... (9800 times) [2024-04-26 12:43:38,622][49728] Signal inference workers to resume experience collection... (9800 times) [2024-04-26 12:43:38,638][49750] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-04-26 12:43:38,638][49750] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-04-26 12:43:38,749][49750] Updated weights for policy 0, policy_version 177271 (0.0030) [2024-04-26 12:43:42,000][49750] Updated weights for policy 0, policy_version 177281 (0.0035) [2024-04-26 12:43:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2904571904. Throughput: 0: 50496.9. Samples: 657410720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:42,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 12:43:45,422][49750] Updated weights for policy 0, policy_version 177291 (0.0029) [2024-04-26 12:43:47,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2904801280. Throughput: 0: 50434.8. Samples: 657708320. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:47,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 12:43:48,583][49750] Updated weights for policy 0, policy_version 177301 (0.0029) [2024-04-26 12:43:51,869][49750] Updated weights for policy 0, policy_version 177311 (0.0032) [2024-04-26 12:43:52,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2905063424. Throughput: 0: 50255.7. Samples: 657856900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 12:43:55,154][49750] Updated weights for policy 0, policy_version 177321 (0.0032) [2024-04-26 12:43:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2905309184. Throughput: 0: 50267.7. Samples: 658160440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 12:43:57,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 12:43:58,532][49750] Updated weights for policy 0, policy_version 177331 (0.0028) [2024-04-26 12:44:01,682][49750] Updated weights for policy 0, policy_version 177341 (0.0032) [2024-04-26 12:44:02,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.6, 300 sec: 50485.0). Total num frames: 2905587712. Throughput: 0: 50473.5. Samples: 658467720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:02,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 12:44:04,922][49750] Updated weights for policy 0, policy_version 177351 (0.0035) [2024-04-26 12:44:07,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2905784320. Throughput: 0: 50490.5. Samples: 658617620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:07,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 12:44:08,085][49750] Updated weights for policy 0, policy_version 177361 (0.0031) [2024-04-26 12:44:11,415][49750] Updated weights for policy 0, policy_version 177371 (0.0030) [2024-04-26 12:44:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2906079232. Throughput: 0: 50227.7. Samples: 658912960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 12:44:14,614][49750] Updated weights for policy 0, policy_version 177381 (0.0027) [2024-04-26 12:44:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2906308608. Throughput: 0: 50278.9. Samples: 659216640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:17,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 12:44:17,842][49750] Updated weights for policy 0, policy_version 177391 (0.0036) [2024-04-26 12:44:21,123][49750] Updated weights for policy 0, policy_version 177401 (0.0030) [2024-04-26 12:44:22,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2906587136. Throughput: 0: 50298.7. Samples: 659375400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:22,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 12:44:24,427][49750] Updated weights for policy 0, policy_version 177411 (0.0032) [2024-04-26 12:44:27,062][49517] Fps is (10 sec: 49153.2, 60 sec: 49971.3, 300 sec: 50318.4). Total num frames: 2906800128. Throughput: 0: 50204.0. Samples: 659669900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 12:44:27,636][49750] Updated weights for policy 0, policy_version 177421 (0.0029) [2024-04-26 12:44:31,019][49750] Updated weights for policy 0, policy_version 177431 (0.0032) [2024-04-26 12:44:32,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2907078656. Throughput: 0: 50383.8. Samples: 659975600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:32,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 12:44:33,805][49728] Signal inference workers to stop experience collection... (9850 times) [2024-04-26 12:44:33,849][49750] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-04-26 12:44:33,912][49728] Signal inference workers to resume experience collection... (9850 times) [2024-04-26 12:44:33,912][49750] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-04-26 12:44:34,058][49750] Updated weights for policy 0, policy_version 177441 (0.0026) [2024-04-26 12:44:37,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 2907324416. Throughput: 0: 50299.1. Samples: 660120360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:37,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 12:44:37,404][49750] Updated weights for policy 0, policy_version 177451 (0.0033) [2024-04-26 12:44:40,471][49750] Updated weights for policy 0, policy_version 177461 (0.0031) [2024-04-26 12:44:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2907586560. Throughput: 0: 50349.7. Samples: 660426180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:42,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 12:44:42,115][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000177466_2907602944.pth... 
[2024-04-26 12:44:42,163][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000176726_2895478784.pth [2024-04-26 12:44:43,943][49750] Updated weights for policy 0, policy_version 177471 (0.0033) [2024-04-26 12:44:46,956][49750] Updated weights for policy 0, policy_version 177481 (0.0026) [2024-04-26 12:44:47,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50485.0). Total num frames: 2907848704. Throughput: 0: 50525.4. Samples: 660741360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 12:44:50,447][49750] Updated weights for policy 0, policy_version 177491 (0.0036) [2024-04-26 12:44:52,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 2908061696. Throughput: 0: 50552.0. Samples: 660892460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:52,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 12:44:53,381][49750] Updated weights for policy 0, policy_version 177501 (0.0032) [2024-04-26 12:44:56,859][49750] Updated weights for policy 0, policy_version 177511 (0.0035) [2024-04-26 12:44:57,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 2908340224. Throughput: 0: 50521.2. Samples: 661186420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:44:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 12:44:59,757][49750] Updated weights for policy 0, policy_version 177521 (0.0027) [2024-04-26 12:45:02,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2908602368. Throughput: 0: 50472.2. Samples: 661487880. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 12:45:02,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 12:45:03,346][49750] Updated weights for policy 0, policy_version 177531 (0.0033) [2024-04-26 12:45:06,261][49750] Updated weights for policy 0, policy_version 177541 (0.0033) [2024-04-26 12:45:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50484.9). Total num frames: 2908864512. Throughput: 0: 50529.0. Samples: 661649200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:07,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 12:45:09,725][49750] Updated weights for policy 0, policy_version 177551 (0.0028) [2024-04-26 12:45:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2909093888. Throughput: 0: 50660.4. Samples: 661949620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:12,063][49517] Avg episode reward: [(0, '0.697')] [2024-04-26 12:45:12,868][49750] Updated weights for policy 0, policy_version 177561 (0.0030) [2024-04-26 12:45:16,144][49750] Updated weights for policy 0, policy_version 177571 (0.0031) [2024-04-26 12:45:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.6, 300 sec: 50318.3). Total num frames: 2909339648. Throughput: 0: 50558.5. Samples: 662250720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 12:45:19,333][49750] Updated weights for policy 0, policy_version 177581 (0.0029) [2024-04-26 12:45:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2909601792. Throughput: 0: 50589.0. Samples: 662396860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:22,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 12:45:22,704][49750] Updated weights for policy 0, policy_version 177591 (0.0027) [2024-04-26 12:45:25,747][49750] Updated weights for policy 0, policy_version 177601 (0.0037) [2024-04-26 12:45:27,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.5, 300 sec: 50429.4). Total num frames: 2909880320. Throughput: 0: 50640.9. Samples: 662705020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:27,063][49517] Avg episode reward: [(0, '0.428')] [2024-04-26 12:45:29,196][49750] Updated weights for policy 0, policy_version 177611 (0.0029) [2024-04-26 12:45:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50484.9). Total num frames: 2910126080. Throughput: 0: 50515.9. Samples: 663014580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:32,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 12:45:32,195][49750] Updated weights for policy 0, policy_version 177621 (0.0031) [2024-04-26 12:45:35,742][49750] Updated weights for policy 0, policy_version 177631 (0.0033) [2024-04-26 12:45:37,062][49517] Fps is (10 sec: 44237.2, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2910322688. Throughput: 0: 50481.9. Samples: 663164140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:37,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 12:45:38,663][49750] Updated weights for policy 0, policy_version 177641 (0.0029) [2024-04-26 12:45:42,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 2910601216. Throughput: 0: 50616.4. Samples: 663464160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:42,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 12:45:42,335][49750] Updated weights for policy 0, policy_version 177651 (0.0032) [2024-04-26 12:45:44,632][49728] Signal inference workers to stop experience collection... (9900 times) [2024-04-26 12:45:44,632][49728] Signal inference workers to resume experience collection... (9900 times) [2024-04-26 12:45:44,653][49750] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-04-26 12:45:44,653][49750] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-04-26 12:45:45,108][49750] Updated weights for policy 0, policy_version 177661 (0.0032) [2024-04-26 12:45:47,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2910863360. Throughput: 0: 50452.1. Samples: 663758220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 12:45:48,741][49750] Updated weights for policy 0, policy_version 177671 (0.0031) [2024-04-26 12:45:51,666][49750] Updated weights for policy 0, policy_version 177681 (0.0036) [2024-04-26 12:45:52,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.6, 300 sec: 50540.5). Total num frames: 2911141888. Throughput: 0: 50456.0. Samples: 663919720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:45:55,168][49750] Updated weights for policy 0, policy_version 177691 (0.0033) [2024-04-26 12:45:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50485.0). Total num frames: 2911354880. Throughput: 0: 50348.0. Samples: 664215280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:45:57,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 12:45:58,261][49750] Updated weights for policy 0, policy_version 177701 (0.0031) [2024-04-26 12:46:01,687][49750] Updated weights for policy 0, policy_version 177711 (0.0031) [2024-04-26 12:46:02,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 2911617024. Throughput: 0: 50365.2. Samples: 664517160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:46:02,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 12:46:04,748][49750] Updated weights for policy 0, policy_version 177721 (0.0031) [2024-04-26 12:46:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 2911862784. Throughput: 0: 50451.6. Samples: 664667180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 12:46:07,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 12:46:08,277][49750] Updated weights for policy 0, policy_version 177731 (0.0032) [2024-04-26 12:46:11,199][49750] Updated weights for policy 0, policy_version 177741 (0.0034) [2024-04-26 12:46:12,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2912141312. Throughput: 0: 50325.7. Samples: 664969680. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:12,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 12:46:14,622][49750] Updated weights for policy 0, policy_version 177751 (0.0031) [2024-04-26 12:46:17,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2912387072. Throughput: 0: 50281.4. Samples: 665277240. 
Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 12:46:17,723][49750] Updated weights for policy 0, policy_version 177761 (0.0040) [2024-04-26 12:46:21,104][49750] Updated weights for policy 0, policy_version 177771 (0.0033) [2024-04-26 12:46:22,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2912616448. Throughput: 0: 50347.9. Samples: 665429800. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:22,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 12:46:24,332][49750] Updated weights for policy 0, policy_version 177781 (0.0030) [2024-04-26 12:46:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 2912878592. Throughput: 0: 50172.7. Samples: 665721920. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:27,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 12:46:27,659][49750] Updated weights for policy 0, policy_version 177791 (0.0030) [2024-04-26 12:46:30,736][49750] Updated weights for policy 0, policy_version 177801 (0.0033) [2024-04-26 12:46:32,063][49517] Fps is (10 sec: 50789.9, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 2913124352. Throughput: 0: 50373.7. Samples: 666025040. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:32,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:46:34,463][49750] Updated weights for policy 0, policy_version 177811 (0.0031) [2024-04-26 12:46:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.5, 300 sec: 50484.9). Total num frames: 2913402880. Throughput: 0: 50236.9. Samples: 666180380. 
Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 12:46:37,096][49750] Updated weights for policy 0, policy_version 177821 (0.0036) [2024-04-26 12:46:40,952][49750] Updated weights for policy 0, policy_version 177831 (0.0027) [2024-04-26 12:46:42,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 2913615872. Throughput: 0: 50413.3. Samples: 666483880. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:42,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 12:46:42,147][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000177834_2913632256.pth... [2024-04-26 12:46:42,207][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000177097_2901557248.pth [2024-04-26 12:46:42,366][49728] Signal inference workers to stop experience collection... (9950 times) [2024-04-26 12:46:42,366][49728] Signal inference workers to resume experience collection... (9950 times) [2024-04-26 12:46:42,379][49750] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-04-26 12:46:42,379][49750] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-04-26 12:46:43,543][49750] Updated weights for policy 0, policy_version 177841 (0.0028) [2024-04-26 12:46:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 2913894400. Throughput: 0: 50504.1. Samples: 666789840. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:47,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 12:46:47,298][49750] Updated weights for policy 0, policy_version 177851 (0.0029) [2024-04-26 12:46:50,168][49750] Updated weights for policy 0, policy_version 177861 (0.0039) [2024-04-26 12:46:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 49425.2, 300 sec: 50262.8). Total num frames: 2914107392. Throughput: 0: 50344.9. Samples: 666932700. 
Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:52,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 12:46:54,079][49750] Updated weights for policy 0, policy_version 177871 (0.0029) [2024-04-26 12:46:56,603][49750] Updated weights for policy 0, policy_version 177881 (0.0042) [2024-04-26 12:46:57,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2914402304. Throughput: 0: 50392.9. Samples: 667237360. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:46:57,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 12:47:00,674][49750] Updated weights for policy 0, policy_version 177891 (0.0032) [2024-04-26 12:47:02,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2914648064. Throughput: 0: 50329.4. Samples: 667542060. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:47:02,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 12:47:03,423][49750] Updated weights for policy 0, policy_version 177901 (0.0032) [2024-04-26 12:47:07,039][49750] Updated weights for policy 0, policy_version 177911 (0.0032) [2024-04-26 12:47:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2914893824. Throughput: 0: 50209.8. Samples: 667689240. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:47:07,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 12:47:09,862][49750] Updated weights for policy 0, policy_version 177921 (0.0033) [2024-04-26 12:47:12,062][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2915139584. Throughput: 0: 50422.5. Samples: 667990940. 
Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-04-26 12:47:12,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 12:47:13,478][49750] Updated weights for policy 0, policy_version 177931 (0.0028) [2024-04-26 12:47:16,423][49750] Updated weights for policy 0, policy_version 177941 (0.0028) [2024-04-26 12:47:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2915401728. Throughput: 0: 50385.9. Samples: 668292400. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:47:17,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 12:47:19,849][49750] Updated weights for policy 0, policy_version 177951 (0.0029) [2024-04-26 12:47:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50485.0). Total num frames: 2915663872. Throughput: 0: 50468.6. Samples: 668451460. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:47:22,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 12:47:23,130][49750] Updated weights for policy 0, policy_version 177961 (0.0031) [2024-04-26 12:47:26,288][49750] Updated weights for policy 0, policy_version 177971 (0.0028) [2024-04-26 12:47:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2915893248. Throughput: 0: 50451.6. Samples: 668754200. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:47:27,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 12:47:29,531][49750] Updated weights for policy 0, policy_version 177981 (0.0036) [2024-04-26 12:47:32,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 2916139008. Throughput: 0: 50420.0. Samples: 669058740. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:47:32,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 12:47:32,789][49750] Updated weights for policy 0, policy_version 177991 (0.0037) [2024-04-26 12:47:33,654][49728] Signal inference workers to stop experience collection... 
(10000 times) [2024-04-26 12:47:33,693][49750] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-04-26 12:47:33,756][49728] Signal inference workers to resume experience collection... (10000 times) [2024-04-26 12:47:33,757][49750] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-04-26 12:47:35,949][49750] Updated weights for policy 0, policy_version 178001 (0.0030) [2024-04-26 12:47:37,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 2916384768. Throughput: 0: 50424.7. Samples: 669201820. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:47:37,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 12:47:39,277][49750] Updated weights for policy 0, policy_version 178011 (0.0028) [2024-04-26 12:47:42,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2916663296. Throughput: 0: 50396.7. Samples: 669505220. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:47:42,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 12:47:42,488][49750] Updated weights for policy 0, policy_version 178021 (0.0031) [2024-04-26 12:47:45,808][49750] Updated weights for policy 0, policy_version 178031 (0.0028) [2024-04-26 12:47:47,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2916925440. Throughput: 0: 50502.6. Samples: 669814680. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:47:47,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 12:47:49,132][49750] Updated weights for policy 0, policy_version 178041 (0.0033) [2024-04-26 12:47:52,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2917154816. Throughput: 0: 50480.4. Samples: 669960860. 
Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:47:52,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 12:47:52,326][49750] Updated weights for policy 0, policy_version 178051 (0.0037) [2024-04-26 12:47:55,631][49750] Updated weights for policy 0, policy_version 178061 (0.0032) [2024-04-26 12:47:57,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 2917384192. Throughput: 0: 50485.8. Samples: 670262800. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:47:57,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 12:47:59,162][49750] Updated weights for policy 0, policy_version 178071 (0.0028) [2024-04-26 12:48:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2917662720. Throughput: 0: 50589.7. Samples: 670568940. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:48:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 12:48:02,230][49750] Updated weights for policy 0, policy_version 178081 (0.0030) [2024-04-26 12:48:05,556][49750] Updated weights for policy 0, policy_version 178091 (0.0031) [2024-04-26 12:48:07,063][49517] Fps is (10 sec: 55705.2, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2917941248. Throughput: 0: 50468.3. Samples: 670722540. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:48:07,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 12:48:08,672][49750] Updated weights for policy 0, policy_version 178101 (0.0030) [2024-04-26 12:48:11,899][49750] Updated weights for policy 0, policy_version 178111 (0.0028) [2024-04-26 12:48:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2918170624. Throughput: 0: 50395.9. Samples: 671022020. 
Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:48:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 12:48:15,125][49750] Updated weights for policy 0, policy_version 178121 (0.0039) [2024-04-26 12:48:17,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2918416384. Throughput: 0: 50391.4. Samples: 671326360. Policy #0 lag: (min: 2.0, avg: 9.9, max: 22.0) [2024-04-26 12:48:17,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:48:18,254][49750] Updated weights for policy 0, policy_version 178131 (0.0026) [2024-04-26 12:48:21,580][49750] Updated weights for policy 0, policy_version 178141 (0.0029) [2024-04-26 12:48:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.1, 300 sec: 50373.9). Total num frames: 2918662144. Throughput: 0: 50411.6. Samples: 671470340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:48:22,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 12:48:24,724][49750] Updated weights for policy 0, policy_version 178151 (0.0024) [2024-04-26 12:48:27,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2918924288. Throughput: 0: 50580.1. Samples: 671781320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:48:27,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 12:48:28,174][49750] Updated weights for policy 0, policy_version 178161 (0.0032) [2024-04-26 12:48:31,315][49750] Updated weights for policy 0, policy_version 178171 (0.0031) [2024-04-26 12:48:32,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2919202816. Throughput: 0: 50431.2. Samples: 672084080. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:48:32,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 12:48:34,765][49750] Updated weights for policy 0, policy_version 178181 (0.0030) [2024-04-26 12:48:37,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.5, 300 sec: 50318.3). Total num frames: 2919415808. Throughput: 0: 50660.1. Samples: 672240560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:48:37,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 12:48:37,934][49750] Updated weights for policy 0, policy_version 178191 (0.0031) [2024-04-26 12:48:38,333][49728] Signal inference workers to stop experience collection... (10050 times) [2024-04-26 12:48:38,372][49750] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-04-26 12:48:38,406][49728] Signal inference workers to resume experience collection... (10050 times) [2024-04-26 12:48:38,406][49750] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-04-26 12:48:41,285][49750] Updated weights for policy 0, policy_version 178201 (0.0033) [2024-04-26 12:48:42,063][49517] Fps is (10 sec: 45874.6, 60 sec: 49971.3, 300 sec: 50373.8). Total num frames: 2919661568. Throughput: 0: 50511.0. Samples: 672535800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:48:42,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 12:48:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000178202_2919661568.pth... [2024-04-26 12:48:42,128][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000177466_2907602944.pth [2024-04-26 12:48:44,285][49750] Updated weights for policy 0, policy_version 178211 (0.0029) [2024-04-26 12:48:47,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2919940096. Throughput: 0: 50508.8. Samples: 672841840. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:48:47,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 12:48:47,705][49750] Updated weights for policy 0, policy_version 178221 (0.0035) [2024-04-26 12:48:50,847][49750] Updated weights for policy 0, policy_version 178231 (0.0029) [2024-04-26 12:48:52,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2920218624. Throughput: 0: 50391.2. Samples: 672990140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:48:52,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 12:48:54,259][49750] Updated weights for policy 0, policy_version 178241 (0.0031) [2024-04-26 12:48:57,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 2920431616. Throughput: 0: 50655.0. Samples: 673301500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:48:57,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 12:48:57,369][49750] Updated weights for policy 0, policy_version 178251 (0.0029) [2024-04-26 12:49:00,738][49750] Updated weights for policy 0, policy_version 178261 (0.0034) [2024-04-26 12:49:02,063][49517] Fps is (10 sec: 45874.1, 60 sec: 50244.1, 300 sec: 50484.9). Total num frames: 2920677376. Throughput: 0: 50548.7. Samples: 673601060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:49:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 12:49:03,773][49750] Updated weights for policy 0, policy_version 178271 (0.0032) [2024-04-26 12:49:07,062][49517] Fps is (10 sec: 49153.1, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2920923136. Throughput: 0: 50382.3. Samples: 673737540. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:49:07,063][49517] Avg episode reward: [(0, '0.443')] [2024-04-26 12:49:07,337][49750] Updated weights for policy 0, policy_version 178281 (0.0032) [2024-04-26 12:49:10,164][49750] Updated weights for policy 0, policy_version 178291 (0.0030) [2024-04-26 12:49:12,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 2921201664. Throughput: 0: 50241.9. Samples: 674042200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:49:12,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:49:13,743][49750] Updated weights for policy 0, policy_version 178301 (0.0034) [2024-04-26 12:49:16,633][49750] Updated weights for policy 0, policy_version 178311 (0.0031) [2024-04-26 12:49:17,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.6, 300 sec: 50429.4). Total num frames: 2921463808. Throughput: 0: 50340.9. Samples: 674349420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:49:17,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:49:20,379][49750] Updated weights for policy 0, policy_version 178321 (0.0034) [2024-04-26 12:49:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.3, 300 sec: 50540.4). Total num frames: 2921709568. Throughput: 0: 50306.8. Samples: 674504380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 12:49:22,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 12:49:23,201][49750] Updated weights for policy 0, policy_version 178331 (0.0037) [2024-04-26 12:49:26,991][49750] Updated weights for policy 0, policy_version 178341 (0.0034) [2024-04-26 12:49:27,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2921938944. Throughput: 0: 50397.9. Samples: 674803700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:49:27,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 12:49:29,562][49750] Updated weights for policy 0, policy_version 178351 (0.0031) [2024-04-26 12:49:32,062][49517] Fps is (10 sec: 49152.8, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 2922201088. Throughput: 0: 50261.4. Samples: 675103600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:49:32,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 12:49:33,546][49750] Updated weights for policy 0, policy_version 178361 (0.0033) [2024-04-26 12:49:36,102][49750] Updated weights for policy 0, policy_version 178371 (0.0033) [2024-04-26 12:49:37,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51063.3, 300 sec: 50484.9). Total num frames: 2922479616. Throughput: 0: 50434.5. Samples: 675259700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:49:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 12:49:37,510][49728] Signal inference workers to stop experience collection... (10100 times) [2024-04-26 12:49:37,511][49728] Signal inference workers to resume experience collection... (10100 times) [2024-04-26 12:49:37,540][49750] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-04-26 12:49:37,540][49750] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-04-26 12:49:40,108][49750] Updated weights for policy 0, policy_version 178381 (0.0030) [2024-04-26 12:49:42,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 2922725376. Throughput: 0: 50352.1. Samples: 675567340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:49:42,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 12:49:42,592][49750] Updated weights for policy 0, policy_version 178391 (0.0032) [2024-04-26 12:49:46,434][49750] Updated weights for policy 0, policy_version 178401 (0.0029) [2024-04-26 12:49:47,062][49517] Fps is (10 sec: 47514.8, 60 sec: 50244.4, 300 sec: 50485.0). Total num frames: 2922954752. Throughput: 0: 50388.4. Samples: 675868520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:49:47,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 12:49:49,110][49750] Updated weights for policy 0, policy_version 178411 (0.0038) [2024-04-26 12:49:52,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49698.1, 300 sec: 50373.9). Total num frames: 2923200512. Throughput: 0: 50561.7. Samples: 676012820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:49:52,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 12:49:52,916][49750] Updated weights for policy 0, policy_version 178421 (0.0032) [2024-04-26 12:49:55,705][49750] Updated weights for policy 0, policy_version 178431 (0.0034) [2024-04-26 12:49:57,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 2923462656. Throughput: 0: 50649.4. Samples: 676321420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:49:57,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 12:49:59,381][49750] Updated weights for policy 0, policy_version 178441 (0.0031) [2024-04-26 12:50:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.5, 300 sec: 50373.8). Total num frames: 2923724800. Throughput: 0: 50362.5. Samples: 676615740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:50:02,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 12:50:02,159][49750] Updated weights for policy 0, policy_version 178451 (0.0038) [2024-04-26 12:50:05,834][49750] Updated weights for policy 0, policy_version 178461 (0.0041) [2024-04-26 12:50:07,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 2923986944. Throughput: 0: 50589.0. Samples: 676780880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:50:07,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 12:50:08,459][49750] Updated weights for policy 0, policy_version 178471 (0.0031) [2024-04-26 12:50:12,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 2924199936. Throughput: 0: 50606.1. Samples: 677080980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:50:12,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 12:50:12,271][49750] Updated weights for policy 0, policy_version 178481 (0.0028) [2024-04-26 12:50:15,012][49750] Updated weights for policy 0, policy_version 178491 (0.0030) [2024-04-26 12:50:17,062][49517] Fps is (10 sec: 47514.4, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2924462080. Throughput: 0: 50604.6. Samples: 677380800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:50:17,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 12:50:18,812][49750] Updated weights for policy 0, policy_version 178501 (0.0029) [2024-04-26 12:50:21,573][49750] Updated weights for policy 0, policy_version 178511 (0.0027) [2024-04-26 12:50:22,062][49517] Fps is (10 sec: 55706.0, 60 sec: 50790.6, 300 sec: 50429.4). Total num frames: 2924756992. Throughput: 0: 50437.1. Samples: 677529360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:50:22,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 12:50:25,251][49750] Updated weights for policy 0, policy_version 178521 (0.0028) [2024-04-26 12:50:27,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 2925002752. Throughput: 0: 50417.0. Samples: 677836100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 12:50:27,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 12:50:27,883][49750] Updated weights for policy 0, policy_version 178531 (0.0037) [2024-04-26 12:50:31,601][49750] Updated weights for policy 0, policy_version 178541 (0.0031) [2024-04-26 12:50:32,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2925232128. Throughput: 0: 50564.3. Samples: 678143920. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:50:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 12:50:34,208][49750] Updated weights for policy 0, policy_version 178551 (0.0031) [2024-04-26 12:50:37,063][49517] Fps is (10 sec: 45875.0, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 2925461504. Throughput: 0: 50482.6. Samples: 678284540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:50:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 12:50:38,094][49750] Updated weights for policy 0, policy_version 178561 (0.0029) [2024-04-26 12:50:40,755][49750] Updated weights for policy 0, policy_version 178571 (0.0039) [2024-04-26 12:50:42,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2925756416. Throughput: 0: 50476.6. Samples: 678592880. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:50:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 12:50:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000178574_2925756416.pth... 
[2024-04-26 12:50:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000177834_2913632256.pth [2024-04-26 12:50:44,519][49750] Updated weights for policy 0, policy_version 178581 (0.0038) [2024-04-26 12:50:47,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 2926002176. Throughput: 0: 50659.6. Samples: 678895420. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:50:47,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 12:50:47,407][49750] Updated weights for policy 0, policy_version 178591 (0.0033) [2024-04-26 12:50:50,955][49750] Updated weights for policy 0, policy_version 178601 (0.0036) [2024-04-26 12:50:51,831][49728] Signal inference workers to stop experience collection... (10150 times) [2024-04-26 12:50:51,834][49728] Signal inference workers to resume experience collection... (10150 times) [2024-04-26 12:50:51,859][49750] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-04-26 12:50:51,859][49750] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-04-26 12:50:52,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 2926264320. Throughput: 0: 50514.2. Samples: 679054020. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:50:52,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 12:50:53,927][49750] Updated weights for policy 0, policy_version 178611 (0.0029) [2024-04-26 12:50:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2926493696. Throughput: 0: 50519.7. Samples: 679354360. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:50:57,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 12:50:57,454][49750] Updated weights for policy 0, policy_version 178621 (0.0035) [2024-04-26 12:51:00,274][49750] Updated weights for policy 0, policy_version 178631 (0.0031) [2024-04-26 12:51:02,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2926739456. Throughput: 0: 50587.9. Samples: 679657260. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:51:02,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 12:51:03,836][49750] Updated weights for policy 0, policy_version 178641 (0.0036) [2024-04-26 12:51:06,856][49750] Updated weights for policy 0, policy_version 178651 (0.0033) [2024-04-26 12:51:07,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2927017984. Throughput: 0: 50613.1. Samples: 679806960. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:51:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 12:51:10,236][49750] Updated weights for policy 0, policy_version 178661 (0.0030) [2024-04-26 12:51:12,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.5, 300 sec: 50484.9). Total num frames: 2927280128. Throughput: 0: 50479.0. Samples: 680107660. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:51:12,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 12:51:13,488][49750] Updated weights for policy 0, policy_version 178671 (0.0043) [2024-04-26 12:51:17,008][49750] Updated weights for policy 0, policy_version 178681 (0.0029) [2024-04-26 12:51:17,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.2, 300 sec: 50484.9). Total num frames: 2927509504. Throughput: 0: 50423.1. Samples: 680412960. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:51:17,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 12:51:19,889][49750] Updated weights for policy 0, policy_version 178691 (0.0031) [2024-04-26 12:51:22,063][49517] Fps is (10 sec: 45874.8, 60 sec: 49697.9, 300 sec: 50373.8). Total num frames: 2927738880. Throughput: 0: 50509.6. Samples: 680557480. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:51:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 12:51:23,604][49750] Updated weights for policy 0, policy_version 178701 (0.0036) [2024-04-26 12:51:26,283][49750] Updated weights for policy 0, policy_version 178711 (0.0032) [2024-04-26 12:51:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50485.0). Total num frames: 2928017408. Throughput: 0: 50470.8. Samples: 680864060. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:51:27,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 12:51:29,980][49750] Updated weights for policy 0, policy_version 178721 (0.0030) [2024-04-26 12:51:32,062][49517] Fps is (10 sec: 54068.5, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2928279552. Throughput: 0: 50486.3. Samples: 681167300. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:51:32,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 12:51:32,820][49750] Updated weights for policy 0, policy_version 178731 (0.0039) [2024-04-26 12:51:36,393][49750] Updated weights for policy 0, policy_version 178741 (0.0034) [2024-04-26 12:51:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2928508928. Throughput: 0: 50496.0. Samples: 681326340. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 12:51:37,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 12:51:39,254][49750] Updated weights for policy 0, policy_version 178751 (0.0028) [2024-04-26 12:51:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.3, 300 sec: 50429.4). 
Total num frames: 2928771072. Throughput: 0: 50551.8. Samples: 681629200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:51:42,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 12:51:42,704][49750] Updated weights for policy 0, policy_version 178761 (0.0036) [2024-04-26 12:51:45,800][49750] Updated weights for policy 0, policy_version 178771 (0.0030) [2024-04-26 12:51:47,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 2929033216. Throughput: 0: 50359.1. Samples: 681923420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:51:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 12:51:49,277][49750] Updated weights for policy 0, policy_version 178781 (0.0027) [2024-04-26 12:51:52,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2929278976. Throughput: 0: 50545.8. Samples: 682081520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:51:52,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 12:51:52,546][49750] Updated weights for policy 0, policy_version 178791 (0.0026) [2024-04-26 12:51:55,714][49750] Updated weights for policy 0, policy_version 178801 (0.0037) [2024-04-26 12:51:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2929557504. Throughput: 0: 50607.3. Samples: 682384980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:51:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 12:51:59,008][49750] Updated weights for policy 0, policy_version 178811 (0.0029) [2024-04-26 12:52:02,014][49750] Updated weights for policy 0, policy_version 178821 (0.0031) [2024-04-26 12:52:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.3, 300 sec: 50540.4). Total num frames: 2929803264. Throughput: 0: 50571.0. Samples: 682688660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:52:02,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 12:52:05,298][49750] Updated weights for policy 0, policy_version 178831 (0.0031) [2024-04-26 12:52:07,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2930049024. Throughput: 0: 50829.9. Samples: 682844820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:52:07,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 12:52:08,311][49750] Updated weights for policy 0, policy_version 178841 (0.0031) [2024-04-26 12:52:10,687][49728] Signal inference workers to stop experience collection... (10200 times) [2024-04-26 12:52:10,690][49728] Signal inference workers to resume experience collection... (10200 times) [2024-04-26 12:52:10,711][49750] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-04-26 12:52:10,711][49750] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-04-26 12:52:11,831][49750] Updated weights for policy 0, policy_version 178851 (0.0033) [2024-04-26 12:52:12,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50540.4). Total num frames: 2930311168. Throughput: 0: 50810.0. Samples: 683150520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:52:12,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 12:52:14,802][49750] Updated weights for policy 0, policy_version 178861 (0.0028) [2024-04-26 12:52:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50484.9). Total num frames: 2930556928. Throughput: 0: 50731.9. Samples: 683450240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:52:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 12:52:18,303][49750] Updated weights for policy 0, policy_version 178871 (0.0029) [2024-04-26 12:52:21,519][49750] Updated weights for policy 0, policy_version 178881 (0.0028) [2024-04-26 12:52:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51063.6, 300 sec: 50540.5). Total num frames: 2930802688. Throughput: 0: 50694.3. Samples: 683607580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:52:22,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 12:52:24,660][49750] Updated weights for policy 0, policy_version 178891 (0.0030) [2024-04-26 12:52:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2931048448. Throughput: 0: 50805.4. Samples: 683915440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:52:27,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:52:27,990][49750] Updated weights for policy 0, policy_version 178901 (0.0029) [2024-04-26 12:52:31,181][49750] Updated weights for policy 0, policy_version 178911 (0.0032) [2024-04-26 12:52:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 2931294208. Throughput: 0: 50819.6. Samples: 684210300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:52:32,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 12:52:34,435][49750] Updated weights for policy 0, policy_version 178921 (0.0033) [2024-04-26 12:52:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2931572736. Throughput: 0: 50801.4. Samples: 684367580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:52:37,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 12:52:37,686][49750] Updated weights for policy 0, policy_version 178931 (0.0030) [2024-04-26 12:52:40,793][49750] Updated weights for policy 0, policy_version 178941 (0.0034) [2024-04-26 12:52:42,063][49517] Fps is (10 sec: 52427.3, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2931818496. Throughput: 0: 50921.9. Samples: 684676480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 12:52:42,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 12:52:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000178944_2931818496.pth... [2024-04-26 12:52:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000178202_2919661568.pth [2024-04-26 12:52:43,994][49750] Updated weights for policy 0, policy_version 178951 (0.0029) [2024-04-26 12:52:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 2932080640. Throughput: 0: 50813.5. Samples: 684975260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:52:47,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 12:52:47,262][49750] Updated weights for policy 0, policy_version 178961 (0.0033) [2024-04-26 12:52:50,559][49750] Updated weights for policy 0, policy_version 178971 (0.0033) [2024-04-26 12:52:52,062][49517] Fps is (10 sec: 52430.1, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 2932342784. Throughput: 0: 50809.5. Samples: 685131240. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:52:52,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 12:52:53,654][49750] Updated weights for policy 0, policy_version 178981 (0.0038) [2024-04-26 12:52:57,063][49517] Fps is (10 sec: 47512.6, 60 sec: 49971.0, 300 sec: 50484.9). Total num frames: 2932555776. Throughput: 0: 50855.1. Samples: 685439000. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:52:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 12:52:57,235][49750] Updated weights for policy 0, policy_version 178991 (0.0033) [2024-04-26 12:53:00,263][49750] Updated weights for policy 0, policy_version 179001 (0.0030) [2024-04-26 12:53:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2932817920. Throughput: 0: 50774.7. Samples: 685735100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 12:53:03,800][49750] Updated weights for policy 0, policy_version 179011 (0.0031) [2024-04-26 12:53:06,637][49750] Updated weights for policy 0, policy_version 179021 (0.0031) [2024-04-26 12:53:07,063][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 2933096448. Throughput: 0: 50661.3. Samples: 685887340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:07,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 12:53:10,110][49750] Updated weights for policy 0, policy_version 179031 (0.0034) [2024-04-26 12:53:12,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.6, 300 sec: 50596.1). Total num frames: 2933342208. Throughput: 0: 50588.6. Samples: 686191920. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:12,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 12:53:12,968][49750] Updated weights for policy 0, policy_version 179041 (0.0033) [2024-04-26 12:53:16,495][49750] Updated weights for policy 0, policy_version 179051 (0.0037) [2024-04-26 12:53:17,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 2933587968. Throughput: 0: 50652.3. Samples: 686489660. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:17,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 12:53:19,702][49750] Updated weights for policy 0, policy_version 179061 (0.0033) [2024-04-26 12:53:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2933833728. Throughput: 0: 50442.3. Samples: 686637480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:22,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 12:53:23,094][49750] Updated weights for policy 0, policy_version 179071 (0.0033) [2024-04-26 12:53:26,214][49750] Updated weights for policy 0, policy_version 179081 (0.0027) [2024-04-26 12:53:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2934095872. Throughput: 0: 50429.5. Samples: 686945800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:27,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 12:53:29,411][49728] Signal inference workers to stop experience collection... (10250 times) [2024-04-26 12:53:29,441][49750] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-04-26 12:53:29,476][49728] Signal inference workers to resume experience collection... (10250 times) [2024-04-26 12:53:29,476][49750] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-04-26 12:53:29,611][49750] Updated weights for policy 0, policy_version 179091 (0.0036) [2024-04-26 12:53:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 2934341632. Throughput: 0: 50377.3. Samples: 687242240. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:32,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 12:53:32,567][49750] Updated weights for policy 0, policy_version 179101 (0.0031) [2024-04-26 12:53:35,931][49750] Updated weights for policy 0, policy_version 179111 (0.0039) [2024-04-26 12:53:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 2934603776. Throughput: 0: 50416.5. Samples: 687399980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 12:53:39,096][49750] Updated weights for policy 0, policy_version 179121 (0.0029) [2024-04-26 12:53:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2934849536. Throughput: 0: 50421.0. Samples: 687707940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:42,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 12:53:42,351][49750] Updated weights for policy 0, policy_version 179131 (0.0032) [2024-04-26 12:53:45,776][49750] Updated weights for policy 0, policy_version 179141 (0.0028) [2024-04-26 12:53:47,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 2935078912. Throughput: 0: 50468.8. Samples: 688006200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 12:53:47,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 12:53:48,865][49750] Updated weights for policy 0, policy_version 179151 (0.0028) [2024-04-26 12:53:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.3, 300 sec: 50596.1). Total num frames: 2935357440. Throughput: 0: 50267.2. Samples: 688149360. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:53:52,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 12:53:52,131][49750] Updated weights for policy 0, policy_version 179161 (0.0030) [2024-04-26 12:53:55,401][49750] Updated weights for policy 0, policy_version 179171 (0.0033) [2024-04-26 12:53:57,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 2935619584. Throughput: 0: 50176.3. Samples: 688449860. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:53:57,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 12:53:58,509][49750] Updated weights for policy 0, policy_version 179181 (0.0034) [2024-04-26 12:54:02,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 2935848960. Throughput: 0: 50512.9. Samples: 688762740. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:02,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 12:54:02,088][49750] Updated weights for policy 0, policy_version 179191 (0.0031) [2024-04-26 12:54:04,973][49750] Updated weights for policy 0, policy_version 179201 (0.0036) [2024-04-26 12:54:07,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.3, 300 sec: 50484.9). Total num frames: 2936094720. Throughput: 0: 50347.1. Samples: 688903100. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:07,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 12:54:08,634][49750] Updated weights for policy 0, policy_version 179211 (0.0034) [2024-04-26 12:54:11,485][49750] Updated weights for policy 0, policy_version 179221 (0.0033) [2024-04-26 12:54:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.1, 300 sec: 50484.9). Total num frames: 2936356864. Throughput: 0: 50386.6. Samples: 689213200. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:12,063][49517] Avg episode reward: [(0, '0.456')] [2024-04-26 12:54:14,999][49750] Updated weights for policy 0, policy_version 179231 (0.0037) [2024-04-26 12:54:17,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2936619008. Throughput: 0: 50546.3. Samples: 689516820. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:17,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 12:54:17,867][49750] Updated weights for policy 0, policy_version 179241 (0.0032) [2024-04-26 12:54:21,481][49750] Updated weights for policy 0, policy_version 179251 (0.0037) [2024-04-26 12:54:22,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 2936881152. Throughput: 0: 50596.2. Samples: 689676820. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 12:54:24,325][49750] Updated weights for policy 0, policy_version 179261 (0.0031) [2024-04-26 12:54:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 50540.5). Total num frames: 2937110528. Throughput: 0: 50547.3. Samples: 689982560. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:27,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 12:54:27,857][49750] Updated weights for policy 0, policy_version 179271 (0.0033) [2024-04-26 12:54:30,990][49750] Updated weights for policy 0, policy_version 179281 (0.0035) [2024-04-26 12:54:32,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2937356288. Throughput: 0: 50634.8. Samples: 690284760. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:32,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 12:54:34,187][49750] Updated weights for policy 0, policy_version 179291 (0.0034) [2024-04-26 12:54:35,556][49728] Signal inference workers to stop experience collection... 
(10300 times) [2024-04-26 12:54:35,557][49728] Signal inference workers to resume experience collection... (10300 times) [2024-04-26 12:54:35,585][49750] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-04-26 12:54:35,585][49750] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-04-26 12:54:37,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 2937651200. Throughput: 0: 50706.9. Samples: 690431180. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 12:54:37,999][49750] Updated weights for policy 0, policy_version 179301 (0.0030) [2024-04-26 12:54:40,585][49750] Updated weights for policy 0, policy_version 179311 (0.0038) [2024-04-26 12:54:42,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.5, 300 sec: 50596.0). Total num frames: 2937880576. Throughput: 0: 50712.1. Samples: 690731900. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:54:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000179315_2937896960.pth... [2024-04-26 12:54:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000178574_2925756416.pth [2024-04-26 12:54:44,440][49750] Updated weights for policy 0, policy_version 179321 (0.0034) [2024-04-26 12:54:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.5, 300 sec: 50651.5). Total num frames: 2938142720. Throughput: 0: 50615.1. Samples: 691040420. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:47,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 12:54:47,110][49750] Updated weights for policy 0, policy_version 179331 (0.0033) [2024-04-26 12:54:50,818][49750] Updated weights for policy 0, policy_version 179341 (0.0029) [2024-04-26 12:54:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50540.5). 
Total num frames: 2938372096. Throughput: 0: 50660.0. Samples: 691182800. Policy #0 lag: (min: 2.0, avg: 9.8, max: 23.0) [2024-04-26 12:54:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 12:54:53,706][49750] Updated weights for policy 0, policy_version 179351 (0.0028) [2024-04-26 12:54:57,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 2938634240. Throughput: 0: 50466.7. Samples: 691484200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:54:57,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 12:54:57,345][49750] Updated weights for policy 0, policy_version 179361 (0.0028) [2024-04-26 12:55:00,202][49750] Updated weights for policy 0, policy_version 179371 (0.0030) [2024-04-26 12:55:02,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.4, 300 sec: 50596.0). Total num frames: 2938912768. Throughput: 0: 50442.5. Samples: 691786740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:02,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 12:55:03,939][49750] Updated weights for policy 0, policy_version 179381 (0.0034) [2024-04-26 12:55:06,749][49750] Updated weights for policy 0, policy_version 179391 (0.0032) [2024-04-26 12:55:07,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 2939142144. Throughput: 0: 50551.1. Samples: 691951620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:07,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 12:55:10,275][49750] Updated weights for policy 0, policy_version 179401 (0.0035) [2024-04-26 12:55:12,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 2939387904. Throughput: 0: 50371.9. Samples: 692249300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 12:55:13,165][49750] Updated weights for policy 0, policy_version 179411 (0.0036) [2024-04-26 12:55:16,665][49750] Updated weights for policy 0, policy_version 179421 (0.0036) [2024-04-26 12:55:17,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.1, 300 sec: 50429.4). Total num frames: 2939633664. Throughput: 0: 50282.1. Samples: 692547460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:17,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 12:55:19,572][49750] Updated weights for policy 0, policy_version 179431 (0.0030) [2024-04-26 12:55:22,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50540.4). Total num frames: 2939912192. Throughput: 0: 50283.9. Samples: 692693960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 12:55:23,391][49750] Updated weights for policy 0, policy_version 179441 (0.0028) [2024-04-26 12:55:25,935][49750] Updated weights for policy 0, policy_version 179451 (0.0032) [2024-04-26 12:55:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2940141568. Throughput: 0: 50325.7. Samples: 692996560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:27,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 12:55:30,130][49750] Updated weights for policy 0, policy_version 179461 (0.0030) [2024-04-26 12:55:32,062][49517] Fps is (10 sec: 50791.9, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 2940420096. Throughput: 0: 50260.2. Samples: 693302120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:32,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 12:55:32,412][49750] Updated weights for policy 0, policy_version 179471 (0.0033) [2024-04-26 12:55:36,516][49750] Updated weights for policy 0, policy_version 179481 (0.0027) [2024-04-26 12:55:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49698.2, 300 sec: 50429.4). Total num frames: 2940633088. Throughput: 0: 50339.0. Samples: 693448060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:37,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 12:55:38,957][49750] Updated weights for policy 0, policy_version 179491 (0.0030) [2024-04-26 12:55:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 2940911616. Throughput: 0: 50405.8. Samples: 693752460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:42,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 12:55:43,128][49750] Updated weights for policy 0, policy_version 179501 (0.0029) [2024-04-26 12:55:44,412][49728] Signal inference workers to stop experience collection... (10350 times) [2024-04-26 12:55:44,418][49728] Signal inference workers to resume experience collection... (10350 times) [2024-04-26 12:55:44,445][49750] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-04-26 12:55:44,445][49750] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-04-26 12:55:45,506][49750] Updated weights for policy 0, policy_version 179511 (0.0030) [2024-04-26 12:55:47,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2941157376. Throughput: 0: 50382.6. Samples: 694053960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:47,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 12:55:49,477][49750] Updated weights for policy 0, policy_version 179521 (0.0030) [2024-04-26 12:55:52,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.2, 300 sec: 50596.0). Total num frames: 2941419520. Throughput: 0: 50111.5. Samples: 694206640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:52,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 12:55:52,261][49750] Updated weights for policy 0, policy_version 179531 (0.0033) [2024-04-26 12:55:55,996][49750] Updated weights for policy 0, policy_version 179541 (0.0031) [2024-04-26 12:55:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 2941648896. Throughput: 0: 50198.3. Samples: 694508220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 12:55:57,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 12:55:58,722][49750] Updated weights for policy 0, policy_version 179551 (0.0031) [2024-04-26 12:56:02,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49698.2, 300 sec: 50429.4). Total num frames: 2941894656. Throughput: 0: 50158.7. Samples: 694804600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:02,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 12:56:02,461][49750] Updated weights for policy 0, policy_version 179561 (0.0034) [2024-04-26 12:56:05,496][49750] Updated weights for policy 0, policy_version 179571 (0.0027) [2024-04-26 12:56:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 2942173184. Throughput: 0: 50308.2. Samples: 694957820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 12:56:08,794][49750] Updated weights for policy 0, policy_version 179581 (0.0032) [2024-04-26 12:56:11,852][49750] Updated weights for policy 0, policy_version 179591 (0.0037) [2024-04-26 12:56:12,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2942418944. Throughput: 0: 50281.3. Samples: 695259220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:12,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 12:56:15,208][49750] Updated weights for policy 0, policy_version 179601 (0.0034) [2024-04-26 12:56:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 2942681088. Throughput: 0: 50339.3. Samples: 695567400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 12:56:18,211][49750] Updated weights for policy 0, policy_version 179611 (0.0031) [2024-04-26 12:56:21,782][49750] Updated weights for policy 0, policy_version 179621 (0.0034) [2024-04-26 12:56:22,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50540.4). Total num frames: 2942926848. Throughput: 0: 50507.0. Samples: 695720880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:22,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 12:56:24,650][49750] Updated weights for policy 0, policy_version 179631 (0.0040) [2024-04-26 12:56:27,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2943172608. Throughput: 0: 50517.3. Samples: 696025740. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:27,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 12:56:28,260][49750] Updated weights for policy 0, policy_version 179641 (0.0026) [2024-04-26 12:56:31,213][49750] Updated weights for policy 0, policy_version 179651 (0.0029) [2024-04-26 12:56:32,063][49517] Fps is (10 sec: 47514.1, 60 sec: 49698.0, 300 sec: 50484.9). Total num frames: 2943401984. Throughput: 0: 50358.3. Samples: 696320080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:32,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 12:56:34,897][49750] Updated weights for policy 0, policy_version 179661 (0.0034) [2024-04-26 12:56:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 2943680512. Throughput: 0: 50477.9. Samples: 696478140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:37,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 12:56:37,540][49750] Updated weights for policy 0, policy_version 179671 (0.0028) [2024-04-26 12:56:40,679][49728] Signal inference workers to stop experience collection... (10400 times) [2024-04-26 12:56:40,707][49750] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-04-26 12:56:40,742][49728] Signal inference workers to resume experience collection... (10400 times) [2024-04-26 12:56:40,742][49750] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-04-26 12:56:41,397][49750] Updated weights for policy 0, policy_version 179681 (0.0033) [2024-04-26 12:56:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50244.4, 300 sec: 50484.9). Total num frames: 2943926272. Throughput: 0: 50532.9. Samples: 696782200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:42,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 12:56:42,226][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000179684_2943942656.pth... 
[2024-04-26 12:56:42,284][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000178944_2931818496.pth [2024-04-26 12:56:43,888][49750] Updated weights for policy 0, policy_version 179691 (0.0032) [2024-04-26 12:56:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 2944155648. Throughput: 0: 50552.9. Samples: 697079480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 12:56:47,730][49750] Updated weights for policy 0, policy_version 179701 (0.0037) [2024-04-26 12:56:50,471][49750] Updated weights for policy 0, policy_version 179711 (0.0032) [2024-04-26 12:56:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2944434176. Throughput: 0: 50501.8. Samples: 697230400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:52,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 12:56:54,137][49750] Updated weights for policy 0, policy_version 179721 (0.0029) [2024-04-26 12:56:57,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.4, 300 sec: 50485.0). Total num frames: 2944696320. Throughput: 0: 50606.3. Samples: 697536500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:56:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 12:56:57,108][49750] Updated weights for policy 0, policy_version 179731 (0.0032) [2024-04-26 12:57:00,628][49750] Updated weights for policy 0, policy_version 179741 (0.0038) [2024-04-26 12:57:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50485.0). Total num frames: 2944942080. Throughput: 0: 50401.5. Samples: 697835460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:57:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 12:57:03,889][49750] Updated weights for policy 0, policy_version 179751 (0.0033) [2024-04-26 12:57:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2945187840. Throughput: 0: 50596.3. Samples: 697997700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 12:57:07,117][49750] Updated weights for policy 0, policy_version 179761 (0.0034) [2024-04-26 12:57:10,694][49750] Updated weights for policy 0, policy_version 179771 (0.0034) [2024-04-26 12:57:12,063][49517] Fps is (10 sec: 50788.8, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 2945449984. Throughput: 0: 50485.2. Samples: 698297580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:12,064][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 12:57:13,487][49750] Updated weights for policy 0, policy_version 179781 (0.0029) [2024-04-26 12:57:17,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2945679360. Throughput: 0: 50595.5. Samples: 698596880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:17,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 12:57:17,495][49750] Updated weights for policy 0, policy_version 179791 (0.0030) [2024-04-26 12:57:19,988][49750] Updated weights for policy 0, policy_version 179801 (0.0026) [2024-04-26 12:57:22,063][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2945957888. Throughput: 0: 50616.8. Samples: 698755900. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:22,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 12:57:23,842][49750] Updated weights for policy 0, policy_version 179811 (0.0023) [2024-04-26 12:57:26,622][49750] Updated weights for policy 0, policy_version 179821 (0.0034) [2024-04-26 12:57:27,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2946203648. Throughput: 0: 50458.1. Samples: 699052820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:27,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:57:30,539][49750] Updated weights for policy 0, policy_version 179831 (0.0031) [2024-04-26 12:57:32,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2946449408. Throughput: 0: 50527.9. Samples: 699353240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 12:57:32,986][49750] Updated weights for policy 0, policy_version 179841 (0.0031) [2024-04-26 12:57:37,063][49517] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 2946662400. Throughput: 0: 50429.6. Samples: 699499740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:37,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 12:57:37,199][49750] Updated weights for policy 0, policy_version 179851 (0.0030) [2024-04-26 12:57:39,582][49750] Updated weights for policy 0, policy_version 179861 (0.0032) [2024-04-26 12:57:42,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2946973696. Throughput: 0: 50308.8. Samples: 699800400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:42,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 12:57:43,525][49750] Updated weights for policy 0, policy_version 179871 (0.0034) [2024-04-26 12:57:43,535][49728] Signal inference workers to stop experience collection... 
(10450 times) [2024-04-26 12:57:43,536][49728] Signal inference workers to resume experience collection... (10450 times) [2024-04-26 12:57:43,567][49750] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-04-26 12:57:43,567][49750] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-04-26 12:57:46,073][49750] Updated weights for policy 0, policy_version 179881 (0.0040) [2024-04-26 12:57:47,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 2947219456. Throughput: 0: 50559.9. Samples: 700110660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:47,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 12:57:49,962][49750] Updated weights for policy 0, policy_version 179891 (0.0034) [2024-04-26 12:57:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 2947465216. Throughput: 0: 50378.9. Samples: 700264760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:52,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 12:57:52,455][49750] Updated weights for policy 0, policy_version 179901 (0.0027) [2024-04-26 12:57:56,627][49750] Updated weights for policy 0, policy_version 179911 (0.0026) [2024-04-26 12:57:57,063][49517] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 50373.8). Total num frames: 2947678208. Throughput: 0: 50346.0. Samples: 700563140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:57:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 12:57:58,872][49750] Updated weights for policy 0, policy_version 179921 (0.0029) [2024-04-26 12:58:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2947956736. Throughput: 0: 50211.2. Samples: 700856380. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:58:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 12:58:03,264][49750] Updated weights for policy 0, policy_version 179931 (0.0031) [2024-04-26 12:58:05,351][49750] Updated weights for policy 0, policy_version 179941 (0.0038) [2024-04-26 12:58:07,062][49517] Fps is (10 sec: 54067.8, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2948218880. Throughput: 0: 50128.2. Samples: 701011660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 12:58:07,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:58:09,613][49750] Updated weights for policy 0, policy_version 179951 (0.0038) [2024-04-26 12:58:11,844][49750] Updated weights for policy 0, policy_version 179961 (0.0031) [2024-04-26 12:58:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.6, 300 sec: 50485.0). Total num frames: 2948481024. Throughput: 0: 50375.7. Samples: 701319720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:12,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 12:58:16,053][49750] Updated weights for policy 0, policy_version 179971 (0.0028) [2024-04-26 12:58:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2948710400. Throughput: 0: 50521.0. Samples: 701626680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:17,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 12:58:18,520][49750] Updated weights for policy 0, policy_version 179981 (0.0030) [2024-04-26 12:58:22,063][49517] Fps is (10 sec: 45874.5, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2948939776. Throughput: 0: 50309.3. Samples: 701763660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 12:58:22,583][49750] Updated weights for policy 0, policy_version 179991 (0.0036) [2024-04-26 12:58:25,017][49750] Updated weights for policy 0, policy_version 180001 (0.0032) [2024-04-26 12:58:27,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2949218304. Throughput: 0: 50363.7. Samples: 702066760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 12:58:29,290][49750] Updated weights for policy 0, policy_version 180011 (0.0031) [2024-04-26 12:58:31,453][49750] Updated weights for policy 0, policy_version 180021 (0.0028) [2024-04-26 12:58:32,062][49517] Fps is (10 sec: 54068.0, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2949480448. Throughput: 0: 50116.1. Samples: 702365880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:32,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:58:35,689][49750] Updated weights for policy 0, policy_version 180031 (0.0030) [2024-04-26 12:58:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.6, 300 sec: 50429.4). Total num frames: 2949726208. Throughput: 0: 50189.6. Samples: 702523280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:37,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 12:58:37,949][49750] Updated weights for policy 0, policy_version 180041 (0.0028) [2024-04-26 12:58:38,376][49728] Signal inference workers to stop experience collection... (10500 times) [2024-04-26 12:58:38,434][49750] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-04-26 12:58:38,434][49728] Signal inference workers to resume experience collection... 
(10500 times) [2024-04-26 12:58:38,448][49750] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-04-26 12:58:41,992][49750] Updated weights for policy 0, policy_version 180051 (0.0028) [2024-04-26 12:58:42,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49698.2, 300 sec: 50429.4). Total num frames: 2949955584. Throughput: 0: 50227.7. Samples: 702823380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:42,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 12:58:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000180051_2949955584.pth... [2024-04-26 12:58:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000179315_2937896960.pth [2024-04-26 12:58:44,464][49750] Updated weights for policy 0, policy_version 180061 (0.0027) [2024-04-26 12:58:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2950234112. Throughput: 0: 50404.5. Samples: 703124580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:58:48,579][49750] Updated weights for policy 0, policy_version 180071 (0.0029) [2024-04-26 12:58:50,997][49750] Updated weights for policy 0, policy_version 180081 (0.0032) [2024-04-26 12:58:52,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 2950496256. Throughput: 0: 50369.8. Samples: 703278300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:52,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 12:58:55,074][49750] Updated weights for policy 0, policy_version 180091 (0.0032) [2024-04-26 12:58:57,063][49517] Fps is (10 sec: 50789.1, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 2950742016. Throughput: 0: 50362.8. Samples: 703586060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:58:57,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 12:58:57,517][49750] Updated weights for policy 0, policy_version 180101 (0.0030) [2024-04-26 12:59:01,591][49750] Updated weights for policy 0, policy_version 180111 (0.0036) [2024-04-26 12:59:02,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2950987776. Throughput: 0: 50443.9. Samples: 703896660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:59:02,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 12:59:03,914][49750] Updated weights for policy 0, policy_version 180121 (0.0028) [2024-04-26 12:59:07,062][49517] Fps is (10 sec: 47514.5, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 2951217152. Throughput: 0: 50470.8. Samples: 704034840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:59:07,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 12:59:08,034][49750] Updated weights for policy 0, policy_version 180131 (0.0038) [2024-04-26 12:59:10,444][49750] Updated weights for policy 0, policy_version 180141 (0.0035) [2024-04-26 12:59:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2951512064. Throughput: 0: 50413.4. Samples: 704335360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 12:59:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 12:59:14,512][49750] Updated weights for policy 0, policy_version 180151 (0.0036) [2024-04-26 12:59:16,868][49750] Updated weights for policy 0, policy_version 180161 (0.0034) [2024-04-26 12:59:17,062][49517] Fps is (10 sec: 54067.7, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2951757824. Throughput: 0: 50586.2. Samples: 704642260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:59:17,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 12:59:21,004][49750] Updated weights for policy 0, policy_version 180171 (0.0025) [2024-04-26 12:59:22,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2951987200. Throughput: 0: 50500.5. Samples: 704795800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:59:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 12:59:23,398][49750] Updated weights for policy 0, policy_version 180181 (0.0033) [2024-04-26 12:59:27,063][49517] Fps is (10 sec: 45874.6, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 2952216576. Throughput: 0: 50485.2. Samples: 705095220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:59:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 12:59:27,448][49750] Updated weights for policy 0, policy_version 180191 (0.0029) [2024-04-26 12:59:29,985][49750] Updated weights for policy 0, policy_version 180201 (0.0029) [2024-04-26 12:59:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 2952495104. Throughput: 0: 50373.3. Samples: 705391380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:59:32,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 12:59:33,747][49750] Updated weights for policy 0, policy_version 180211 (0.0034) [2024-04-26 12:59:36,515][49750] Updated weights for policy 0, policy_version 180221 (0.0035) [2024-04-26 12:59:37,063][49517] Fps is (10 sec: 55705.2, 60 sec: 50790.2, 300 sec: 50484.9). Total num frames: 2952773632. Throughput: 0: 50476.7. Samples: 705549760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:59:37,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 12:59:40,107][49728] Signal inference workers to stop experience collection... 
(10550 times) [2024-04-26 12:59:40,107][49728] Signal inference workers to resume experience collection... (10550 times) [2024-04-26 12:59:40,132][49750] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-04-26 12:59:40,132][49750] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-04-26 12:59:40,233][49750] Updated weights for policy 0, policy_version 180231 (0.0029) [2024-04-26 12:59:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 2953003008. Throughput: 0: 50474.3. Samples: 705857400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:59:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 12:59:42,855][49750] Updated weights for policy 0, policy_version 180241 (0.0031) [2024-04-26 12:59:46,690][49750] Updated weights for policy 0, policy_version 180251 (0.0030) [2024-04-26 12:59:47,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2953248768. Throughput: 0: 50289.8. Samples: 706159700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:59:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 12:59:49,338][49750] Updated weights for policy 0, policy_version 180261 (0.0031) [2024-04-26 12:59:52,063][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.1, 300 sec: 50373.9). Total num frames: 2953494528. Throughput: 0: 50268.0. Samples: 706296900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:59:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 12:59:53,299][49750] Updated weights for policy 0, policy_version 180271 (0.0034) [2024-04-26 12:59:55,812][49750] Updated weights for policy 0, policy_version 180281 (0.0028) [2024-04-26 12:59:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 2953773056. Throughput: 0: 50618.2. Samples: 706613180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 12:59:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 12:59:59,693][49750] Updated weights for policy 0, policy_version 180291 (0.0030) [2024-04-26 13:00:02,062][49517] Fps is (10 sec: 54068.0, 60 sec: 50790.5, 300 sec: 50485.0). Total num frames: 2954035200. Throughput: 0: 50513.8. Samples: 706915380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:02,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 13:00:02,279][49750] Updated weights for policy 0, policy_version 180301 (0.0031) [2024-04-26 13:00:06,173][49750] Updated weights for policy 0, policy_version 180311 (0.0035) [2024-04-26 13:00:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2954264576. Throughput: 0: 50465.3. Samples: 707066740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:07,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 13:00:08,721][49750] Updated weights for policy 0, policy_version 180321 (0.0029) [2024-04-26 13:00:12,063][49517] Fps is (10 sec: 45874.4, 60 sec: 49698.0, 300 sec: 50373.9). Total num frames: 2954493952. Throughput: 0: 50421.8. Samples: 707364200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:12,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 13:00:12,592][49750] Updated weights for policy 0, policy_version 180331 (0.0032) [2024-04-26 13:00:15,295][49750] Updated weights for policy 0, policy_version 180341 (0.0032) [2024-04-26 13:00:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2954772480. Throughput: 0: 50628.9. Samples: 707669680. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:17,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 13:00:19,160][49750] Updated weights for policy 0, policy_version 180351 (0.0034) [2024-04-26 13:00:21,639][49750] Updated weights for policy 0, policy_version 180361 (0.0029) [2024-04-26 13:00:22,063][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2955034624. Throughput: 0: 50499.2. Samples: 707822220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:22,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 13:00:25,542][49750] Updated weights for policy 0, policy_version 180371 (0.0030) [2024-04-26 13:00:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50373.8). Total num frames: 2955280384. Throughput: 0: 50553.8. Samples: 708132320. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:27,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 13:00:28,244][49750] Updated weights for policy 0, policy_version 180381 (0.0033) [2024-04-26 13:00:31,972][49750] Updated weights for policy 0, policy_version 180391 (0.0033) [2024-04-26 13:00:32,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 2955526144. Throughput: 0: 50588.5. Samples: 708436180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 13:00:34,588][49750] Updated weights for policy 0, policy_version 180401 (0.0034) [2024-04-26 13:00:37,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 2955771904. Throughput: 0: 50657.3. Samples: 708576480. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:37,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 13:00:38,468][49750] Updated weights for policy 0, policy_version 180411 (0.0035) [2024-04-26 13:00:39,624][49728] Signal inference workers to stop experience collection... (10600 times) [2024-04-26 13:00:39,644][49750] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-04-26 13:00:39,691][49728] Signal inference workers to resume experience collection... (10600 times) [2024-04-26 13:00:39,691][49750] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-04-26 13:00:41,021][49750] Updated weights for policy 0, policy_version 180421 (0.0033) [2024-04-26 13:00:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2956034048. Throughput: 0: 50526.2. Samples: 708886860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 13:00:42,091][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000180423_2956050432.pth... [2024-04-26 13:00:42,140][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000179684_2943942656.pth [2024-04-26 13:00:44,825][49750] Updated weights for policy 0, policy_version 180431 (0.0029) [2024-04-26 13:00:47,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.5, 300 sec: 50485.0). Total num frames: 2956312576. Throughput: 0: 50542.5. Samples: 709189800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:47,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 13:00:47,472][49750] Updated weights for policy 0, policy_version 180441 (0.0030) [2024-04-26 13:00:51,297][49750] Updated weights for policy 0, policy_version 180451 (0.0035) [2024-04-26 13:00:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2956525568. Throughput: 0: 50565.3. Samples: 709342180. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 13:00:54,102][49750] Updated weights for policy 0, policy_version 180461 (0.0029) [2024-04-26 13:00:57,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2956771328. Throughput: 0: 50586.3. Samples: 709640580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:00:57,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 13:00:57,876][49750] Updated weights for policy 0, policy_version 180471 (0.0035) [2024-04-26 13:01:00,478][49750] Updated weights for policy 0, policy_version 180481 (0.0033) [2024-04-26 13:01:02,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50244.0, 300 sec: 50429.4). Total num frames: 2957049856. Throughput: 0: 50492.7. Samples: 709941860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:01:02,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 13:01:04,364][49750] Updated weights for policy 0, policy_version 180491 (0.0030) [2024-04-26 13:01:06,988][49750] Updated weights for policy 0, policy_version 180501 (0.0032) [2024-04-26 13:01:07,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 2957328384. Throughput: 0: 50578.8. Samples: 710098260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:01:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 13:01:10,766][49750] Updated weights for policy 0, policy_version 180511 (0.0041) [2024-04-26 13:01:12,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2957541376. Throughput: 0: 50580.9. Samples: 710408460. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:01:12,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 13:01:13,606][49750] Updated weights for policy 0, policy_version 180521 (0.0032) [2024-04-26 13:01:17,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 2957787136. Throughput: 0: 50537.4. Samples: 710710360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:01:17,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 13:01:17,245][49750] Updated weights for policy 0, policy_version 180531 (0.0030) [2024-04-26 13:01:20,269][49750] Updated weights for policy 0, policy_version 180541 (0.0031) [2024-04-26 13:01:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 2958065664. Throughput: 0: 50603.2. Samples: 710853620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-26 13:01:22,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 13:01:23,729][49750] Updated weights for policy 0, policy_version 180551 (0.0030) [2024-04-26 13:01:26,902][49750] Updated weights for policy 0, policy_version 180561 (0.0027) [2024-04-26 13:01:27,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2958311424. Throughput: 0: 50510.6. Samples: 711159840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:01:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 13:01:30,164][49750] Updated weights for policy 0, policy_version 180571 (0.0035) [2024-04-26 13:01:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2958573568. Throughput: 0: 50650.3. Samples: 711469060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:01:32,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 13:01:33,356][49750] Updated weights for policy 0, policy_version 180581 (0.0036) [2024-04-26 13:01:36,686][49750] Updated weights for policy 0, policy_version 180591 (0.0037) [2024-04-26 13:01:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50484.9). Total num frames: 2958819328. Throughput: 0: 50508.0. Samples: 711615040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:01:37,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 13:01:39,766][49750] Updated weights for policy 0, policy_version 180601 (0.0031) [2024-04-26 13:01:42,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 2959048704. Throughput: 0: 50559.0. Samples: 711915740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:01:42,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 13:01:43,244][49750] Updated weights for policy 0, policy_version 180611 (0.0029) [2024-04-26 13:01:44,227][49728] Signal inference workers to stop experience collection... (10650 times) [2024-04-26 13:01:44,227][49728] Signal inference workers to resume experience collection... (10650 times) [2024-04-26 13:01:44,244][49750] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-04-26 13:01:44,244][49750] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-04-26 13:01:46,300][49750] Updated weights for policy 0, policy_version 180621 (0.0036) [2024-04-26 13:01:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2959343616. Throughput: 0: 50568.5. Samples: 712217440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:01:47,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 13:01:49,690][49750] Updated weights for policy 0, policy_version 180631 (0.0035) [2024-04-26 13:01:52,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.3, 300 sec: 50484.9). Total num frames: 2959589376. Throughput: 0: 50660.2. Samples: 712377980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:01:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 13:01:52,806][49750] Updated weights for policy 0, policy_version 180641 (0.0031) [2024-04-26 13:01:56,209][49750] Updated weights for policy 0, policy_version 180651 (0.0029) [2024-04-26 13:01:57,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2959818752. Throughput: 0: 50486.3. Samples: 712680340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:01:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 13:01:59,210][49750] Updated weights for policy 0, policy_version 180661 (0.0032) [2024-04-26 13:02:02,062][49517] Fps is (10 sec: 45876.1, 60 sec: 49971.3, 300 sec: 50373.8). Total num frames: 2960048128. Throughput: 0: 50527.9. Samples: 712984120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:02:02,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 13:02:02,744][49750] Updated weights for policy 0, policy_version 180671 (0.0031) [2024-04-26 13:02:05,636][49750] Updated weights for policy 0, policy_version 180681 (0.0030) [2024-04-26 13:02:07,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2960359424. Throughput: 0: 50621.7. Samples: 713131600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:02:07,072][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:02:09,280][49750] Updated weights for policy 0, policy_version 180691 (0.0030) [2024-04-26 13:02:12,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 2960588800. Throughput: 0: 50407.2. Samples: 713428160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:02:12,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 13:02:12,154][49750] Updated weights for policy 0, policy_version 180701 (0.0032) [2024-04-26 13:02:15,712][49750] Updated weights for policy 0, policy_version 180711 (0.0028) [2024-04-26 13:02:17,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2960834560. Throughput: 0: 50331.4. Samples: 713733980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:02:17,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 13:02:18,744][49750] Updated weights for policy 0, policy_version 180721 (0.0033) [2024-04-26 13:02:22,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2961080320. Throughput: 0: 50262.3. Samples: 713876840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:02:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 13:02:22,146][49750] Updated weights for policy 0, policy_version 180731 (0.0026) [2024-04-26 13:02:25,133][49750] Updated weights for policy 0, policy_version 180741 (0.0032) [2024-04-26 13:02:27,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2961326080. Throughput: 0: 50406.8. Samples: 714184040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:02:27,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 13:02:28,721][49750] Updated weights for policy 0, policy_version 180751 (0.0031) [2024-04-26 13:02:31,724][49750] Updated weights for policy 0, policy_version 180761 (0.0031) [2024-04-26 13:02:32,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 2961604608. Throughput: 0: 50494.3. Samples: 714489680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:02:32,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 13:02:35,259][49750] Updated weights for policy 0, policy_version 180771 (0.0032) [2024-04-26 13:02:37,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2961850368. Throughput: 0: 50497.9. Samples: 714650380. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:02:37,072][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 13:02:38,115][49750] Updated weights for policy 0, policy_version 180781 (0.0030) [2024-04-26 13:02:41,830][49750] Updated weights for policy 0, policy_version 180791 (0.0030) [2024-04-26 13:02:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2962096128. Throughput: 0: 50387.0. Samples: 714947760. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:02:42,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 13:02:42,078][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000180792_2962096128.pth... [2024-04-26 13:02:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000180051_2949955584.pth [2024-04-26 13:02:44,631][49750] Updated weights for policy 0, policy_version 180801 (0.0033) [2024-04-26 13:02:47,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 2962325504. Throughput: 0: 50299.6. Samples: 715247600. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:02:47,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 13:02:48,225][49750] Updated weights for policy 0, policy_version 180811 (0.0030) [2024-04-26 13:02:51,101][49750] Updated weights for policy 0, policy_version 180821 (0.0040) [2024-04-26 13:02:52,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 2962620416. Throughput: 0: 50310.6. Samples: 715395580. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:02:52,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 13:02:54,773][49750] Updated weights for policy 0, policy_version 180831 (0.0030) [2024-04-26 13:02:56,931][49728] Signal inference workers to stop experience collection... (10700 times) [2024-04-26 13:02:56,932][49728] Signal inference workers to resume experience collection... (10700 times) [2024-04-26 13:02:56,949][49750] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-04-26 13:02:56,949][49750] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-04-26 13:02:57,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 2962866176. Throughput: 0: 50419.6. Samples: 715697040. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:02:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 13:02:57,488][49750] Updated weights for policy 0, policy_version 180841 (0.0031) [2024-04-26 13:03:01,347][49750] Updated weights for policy 0, policy_version 180851 (0.0041) [2024-04-26 13:03:02,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2963095552. Throughput: 0: 50362.6. Samples: 716000300. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:03:02,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 13:03:03,879][49750] Updated weights for policy 0, policy_version 180861 (0.0027) [2024-04-26 13:03:07,063][49517] Fps is (10 sec: 44235.5, 60 sec: 49151.9, 300 sec: 50262.7). Total num frames: 2963308544. Throughput: 0: 50328.6. Samples: 716141640. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:03:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 13:03:07,924][49750] Updated weights for policy 0, policy_version 180871 (0.0035) [2024-04-26 13:03:10,532][49750] Updated weights for policy 0, policy_version 180881 (0.0031) [2024-04-26 13:03:12,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2963619840. Throughput: 0: 50439.4. Samples: 716453820. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:03:12,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 13:03:14,254][49750] Updated weights for policy 0, policy_version 180891 (0.0035) [2024-04-26 13:03:16,894][49750] Updated weights for policy 0, policy_version 180901 (0.0033) [2024-04-26 13:03:17,062][49517] Fps is (10 sec: 57345.3, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 2963881984. Throughput: 0: 50403.6. Samples: 716757840. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:03:17,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 13:03:20,731][49750] Updated weights for policy 0, policy_version 180911 (0.0032) [2024-04-26 13:03:22,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2964094976. Throughput: 0: 50510.8. Samples: 716923360. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:03:22,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 13:03:23,289][49750] Updated weights for policy 0, policy_version 180921 (0.0028) [2024-04-26 13:03:27,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 2964340736. Throughput: 0: 50391.1. Samples: 717215360. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:03:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 13:03:27,359][49750] Updated weights for policy 0, policy_version 180931 (0.0037) [2024-04-26 13:03:29,919][49750] Updated weights for policy 0, policy_version 180941 (0.0030) [2024-04-26 13:03:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2964602880. Throughput: 0: 50352.8. Samples: 717513480. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:03:32,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 13:03:33,877][49750] Updated weights for policy 0, policy_version 180951 (0.0032) [2024-04-26 13:03:36,300][49750] Updated weights for policy 0, policy_version 180961 (0.0026) [2024-04-26 13:03:37,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 2964881408. Throughput: 0: 50561.9. Samples: 717670860. Policy #0 lag: (min: 1.0, avg: 12.8, max: 24.0) [2024-04-26 13:03:37,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 13:03:40,210][49750] Updated weights for policy 0, policy_version 180971 (0.0030) [2024-04-26 13:03:42,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2965127168. Throughput: 0: 50618.1. Samples: 717974860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:03:42,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 13:03:42,758][49750] Updated weights for policy 0, policy_version 180981 (0.0035) [2024-04-26 13:03:46,720][49750] Updated weights for policy 0, policy_version 180991 (0.0031) [2024-04-26 13:03:47,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 2965356544. Throughput: 0: 50577.4. Samples: 718276280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:03:47,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 13:03:49,300][49750] Updated weights for policy 0, policy_version 181001 (0.0034) [2024-04-26 13:03:52,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49425.2, 300 sec: 50318.4). Total num frames: 2965585920. Throughput: 0: 50552.3. Samples: 718416480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:03:52,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 13:03:53,384][49750] Updated weights for policy 0, policy_version 181011 (0.0030) [2024-04-26 13:03:53,996][49728] Signal inference workers to stop experience collection... (10750 times) [2024-04-26 13:03:53,996][49728] Signal inference workers to resume experience collection... (10750 times) [2024-04-26 13:03:54,028][49750] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-04-26 13:03:54,028][49750] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-04-26 13:03:55,719][49750] Updated weights for policy 0, policy_version 181021 (0.0030) [2024-04-26 13:03:57,063][49517] Fps is (10 sec: 54067.4, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 2965897216. Throughput: 0: 50300.4. Samples: 718717340. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:03:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 13:03:59,892][49750] Updated weights for policy 0, policy_version 181031 (0.0036) [2024-04-26 13:04:02,063][49517] Fps is (10 sec: 57342.5, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 2966159360. Throughput: 0: 50363.7. Samples: 719024220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:04:02,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 13:04:02,199][49750] Updated weights for policy 0, policy_version 181041 (0.0036) [2024-04-26 13:04:06,411][49750] Updated weights for policy 0, policy_version 181051 (0.0038) [2024-04-26 13:04:07,063][49517] Fps is (10 sec: 47513.4, 60 sec: 51063.5, 300 sec: 50373.8). Total num frames: 2966372352. Throughput: 0: 50338.9. Samples: 719188620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:04:07,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 13:04:08,558][49750] Updated weights for policy 0, policy_version 181061 (0.0030) [2024-04-26 13:04:12,063][49517] Fps is (10 sec: 45875.8, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 2966618112. Throughput: 0: 50431.9. Samples: 719484800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:04:12,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 13:04:12,968][49750] Updated weights for policy 0, policy_version 181071 (0.0027) [2024-04-26 13:04:15,271][49750] Updated weights for policy 0, policy_version 181081 (0.0028) [2024-04-26 13:04:17,062][49517] Fps is (10 sec: 50791.0, 60 sec: 49971.2, 300 sec: 50484.9). Total num frames: 2966880256. Throughput: 0: 50315.2. Samples: 719777660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:04:17,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 13:04:19,652][49750] Updated weights for policy 0, policy_version 181091 (0.0031) [2024-04-26 13:04:21,749][49750] Updated weights for policy 0, policy_version 181101 (0.0033) [2024-04-26 13:04:22,063][49517] Fps is (10 sec: 55705.3, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 2967175168. Throughput: 0: 50624.3. Samples: 719948960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:04:22,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 13:04:25,992][49750] Updated weights for policy 0, policy_version 181111 (0.0031) [2024-04-26 13:04:27,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 2967404544. Throughput: 0: 50480.9. Samples: 720246500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:04:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 13:04:28,411][49750] Updated weights for policy 0, policy_version 181121 (0.0031) [2024-04-26 13:04:32,063][49517] Fps is (10 sec: 44236.5, 60 sec: 50244.1, 300 sec: 50318.3). Total num frames: 2967617536. Throughput: 0: 50436.3. Samples: 720545920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:04:32,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 13:04:32,391][49750] Updated weights for policy 0, policy_version 181131 (0.0029) [2024-04-26 13:04:34,790][49750] Updated weights for policy 0, policy_version 181141 (0.0030) [2024-04-26 13:04:37,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 2967879680. Throughput: 0: 50513.8. Samples: 720689600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:04:37,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 13:04:38,944][49750] Updated weights for policy 0, policy_version 181151 (0.0037) [2024-04-26 13:04:41,294][49750] Updated weights for policy 0, policy_version 181161 (0.0030) [2024-04-26 13:04:42,062][49517] Fps is (10 sec: 54068.4, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2968158208. Throughput: 0: 50550.8. Samples: 720992120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:04:42,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 13:04:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000181162_2968158208.pth... [2024-04-26 13:04:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000180423_2956050432.pth [2024-04-26 13:04:45,438][49750] Updated weights for policy 0, policy_version 181171 (0.0029) [2024-04-26 13:04:47,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.6, 300 sec: 50596.0). Total num frames: 2968420352. Throughput: 0: 50446.5. Samples: 721294300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:04:47,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 13:04:47,641][49750] Updated weights for policy 0, policy_version 181181 (0.0031) [2024-04-26 13:04:51,648][49750] Updated weights for policy 0, policy_version 181191 (0.0028) [2024-04-26 13:04:52,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2968633344. Throughput: 0: 50495.3. Samples: 721460900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:04:52,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 13:04:53,433][49728] Signal inference workers to stop experience collection... (10800 times) [2024-04-26 13:04:53,434][49728] Signal inference workers to resume experience collection... 
(10800 times) [2024-04-26 13:04:53,455][49750] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-04-26 13:04:53,456][49750] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-04-26 13:04:54,149][49750] Updated weights for policy 0, policy_version 181201 (0.0032) [2024-04-26 13:04:57,062][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 2968895488. Throughput: 0: 50603.6. Samples: 721761960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:04:57,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 13:04:58,122][49750] Updated weights for policy 0, policy_version 181211 (0.0028) [2024-04-26 13:05:00,533][49750] Updated weights for policy 0, policy_version 181221 (0.0030) [2024-04-26 13:05:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49698.2, 300 sec: 50429.4). Total num frames: 2969141248. Throughput: 0: 50681.2. Samples: 722058320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 13:05:04,729][49750] Updated weights for policy 0, policy_version 181231 (0.0030) [2024-04-26 13:05:07,013][49750] Updated weights for policy 0, policy_version 181241 (0.0029) [2024-04-26 13:05:07,063][49517] Fps is (10 sec: 55705.4, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 2969452544. Throughput: 0: 50523.2. Samples: 722222500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 13:05:11,134][49750] Updated weights for policy 0, policy_version 181251 (0.0033) [2024-04-26 13:05:12,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2969665536. Throughput: 0: 50645.2. Samples: 722525540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:12,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 13:05:13,452][49750] Updated weights for policy 0, policy_version 181261 (0.0037) [2024-04-26 13:05:17,062][49517] Fps is (10 sec: 45876.1, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2969911296. Throughput: 0: 50621.2. Samples: 722823860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:17,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:05:17,702][49750] Updated weights for policy 0, policy_version 181271 (0.0037) [2024-04-26 13:05:20,072][49750] Updated weights for policy 0, policy_version 181281 (0.0031) [2024-04-26 13:05:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 49698.3, 300 sec: 50429.4). Total num frames: 2970157056. Throughput: 0: 50642.2. Samples: 722968500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:22,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 13:05:23,962][49750] Updated weights for policy 0, policy_version 181291 (0.0029) [2024-04-26 13:05:26,560][49750] Updated weights for policy 0, policy_version 181301 (0.0031) [2024-04-26 13:05:27,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2970435584. Throughput: 0: 50780.9. Samples: 723277260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 13:05:30,274][49750] Updated weights for policy 0, policy_version 181311 (0.0031) [2024-04-26 13:05:32,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.7, 300 sec: 50540.5). Total num frames: 2970681344. Throughput: 0: 50812.1. Samples: 723580840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:32,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 13:05:33,138][49750] Updated weights for policy 0, policy_version 181321 (0.0035) [2024-04-26 13:05:36,840][49750] Updated weights for policy 0, policy_version 181331 (0.0034) [2024-04-26 13:05:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2970943488. Throughput: 0: 50496.5. Samples: 723733240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:37,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 13:05:39,728][49750] Updated weights for policy 0, policy_version 181341 (0.0028) [2024-04-26 13:05:42,063][49517] Fps is (10 sec: 49150.7, 60 sec: 50244.1, 300 sec: 50373.8). Total num frames: 2971172864. Throughput: 0: 50664.3. Samples: 724041860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 13:05:43,278][49750] Updated weights for policy 0, policy_version 181351 (0.0037) [2024-04-26 13:05:46,076][49750] Updated weights for policy 0, policy_version 181361 (0.0039) [2024-04-26 13:05:47,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50484.9). Total num frames: 2971418624. Throughput: 0: 50665.4. Samples: 724338260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 13:05:47,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 13:05:49,648][49750] Updated weights for policy 0, policy_version 181371 (0.0031) [2024-04-26 13:05:52,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51336.5, 300 sec: 50651.6). Total num frames: 2971713536. Throughput: 0: 50418.8. Samples: 724491340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:05:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 13:05:52,607][49750] Updated weights for policy 0, policy_version 181381 (0.0035) [2024-04-26 13:05:55,924][49750] Updated weights for policy 0, policy_version 181391 (0.0027) [2024-04-26 13:05:57,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2971959296. Throughput: 0: 50541.5. Samples: 724799900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:05:57,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 13:05:57,534][49728] Signal inference workers to stop experience collection... (10850 times) [2024-04-26 13:05:57,534][49728] Signal inference workers to resume experience collection... (10850 times) [2024-04-26 13:05:57,563][49750] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-04-26 13:05:57,563][49750] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-04-26 13:05:59,218][49750] Updated weights for policy 0, policy_version 181401 (0.0037) [2024-04-26 13:06:02,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.5, 300 sec: 50373.9). Total num frames: 2972188672. Throughput: 0: 50721.2. Samples: 725106320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:02,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 13:06:02,701][49750] Updated weights for policy 0, policy_version 181411 (0.0036) [2024-04-26 13:06:05,632][49750] Updated weights for policy 0, policy_version 181421 (0.0033) [2024-04-26 13:06:07,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49698.1, 300 sec: 50484.9). Total num frames: 2972434432. Throughput: 0: 50562.9. Samples: 725243840. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:07,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 13:06:09,235][49750] Updated weights for policy 0, policy_version 181431 (0.0032) [2024-04-26 13:06:12,050][49750] Updated weights for policy 0, policy_version 181441 (0.0041) [2024-04-26 13:06:12,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 2972729344. Throughput: 0: 50613.4. Samples: 725554860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:12,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 13:06:15,608][49750] Updated weights for policy 0, policy_version 181451 (0.0032) [2024-04-26 13:06:17,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2972958720. Throughput: 0: 50615.9. Samples: 725858560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:17,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 13:06:18,584][49750] Updated weights for policy 0, policy_version 181461 (0.0029) [2024-04-26 13:06:22,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2973204480. Throughput: 0: 50551.9. Samples: 726008080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:22,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 13:06:22,147][49750] Updated weights for policy 0, policy_version 181471 (0.0031) [2024-04-26 13:06:25,043][49750] Updated weights for policy 0, policy_version 181481 (0.0034) [2024-04-26 13:06:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2973450240. Throughput: 0: 50433.1. Samples: 726311340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:27,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 13:06:28,586][49750] Updated weights for policy 0, policy_version 181491 (0.0042) [2024-04-26 13:06:31,612][49750] Updated weights for policy 0, policy_version 181501 (0.0032) [2024-04-26 13:06:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2973712384. Throughput: 0: 50560.6. Samples: 726613480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:32,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 13:06:35,028][49750] Updated weights for policy 0, policy_version 181511 (0.0033) [2024-04-26 13:06:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 2973974528. Throughput: 0: 50672.4. Samples: 726771600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 13:06:38,494][49750] Updated weights for policy 0, policy_version 181521 (0.0032) [2024-04-26 13:06:41,398][49750] Updated weights for policy 0, policy_version 181531 (0.0039) [2024-04-26 13:06:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.6, 300 sec: 50429.4). Total num frames: 2974220288. Throughput: 0: 50581.4. Samples: 727076060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:42,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 13:06:42,138][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000181533_2974236672.pth... [2024-04-26 13:06:42,182][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000180792_2962096128.pth [2024-04-26 13:06:44,861][49750] Updated weights for policy 0, policy_version 181541 (0.0030) [2024-04-26 13:06:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 2974466048. Throughput: 0: 50428.1. Samples: 727375580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:47,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 13:06:47,922][49750] Updated weights for policy 0, policy_version 181551 (0.0031) [2024-04-26 13:06:51,192][49750] Updated weights for policy 0, policy_version 181561 (0.0032) [2024-04-26 13:06:52,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.2, 300 sec: 50484.9). Total num frames: 2974711808. Throughput: 0: 50710.0. Samples: 727525780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:06:52,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 13:06:54,418][49750] Updated weights for policy 0, policy_version 181571 (0.0029) [2024-04-26 13:06:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 2974990336. Throughput: 0: 50509.2. Samples: 727827780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:06:57,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 13:06:57,751][49750] Updated weights for policy 0, policy_version 181581 (0.0033) [2024-04-26 13:07:00,841][49750] Updated weights for policy 0, policy_version 181591 (0.0030) [2024-04-26 13:07:02,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2975236096. Throughput: 0: 50529.5. Samples: 728132400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:02,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 13:07:04,336][49750] Updated weights for policy 0, policy_version 181601 (0.0030) [2024-04-26 13:07:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.6, 300 sec: 50484.9). Total num frames: 2975481856. Throughput: 0: 50706.3. Samples: 728289860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 13:07:07,337][49750] Updated weights for policy 0, policy_version 181611 (0.0031) [2024-04-26 13:07:10,652][49750] Updated weights for policy 0, policy_version 181621 (0.0030) [2024-04-26 13:07:12,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.2, 300 sec: 50540.5). Total num frames: 2975744000. Throughput: 0: 50626.6. Samples: 728589540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:12,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 13:07:13,997][49750] Updated weights for policy 0, policy_version 181631 (0.0031) [2024-04-26 13:07:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 2975989760. Throughput: 0: 50603.6. Samples: 728890640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 13:07:17,168][49750] Updated weights for policy 0, policy_version 181641 (0.0034) [2024-04-26 13:07:19,713][49728] Signal inference workers to stop experience collection... (10900 times) [2024-04-26 13:07:19,713][49728] Signal inference workers to resume experience collection... (10900 times) [2024-04-26 13:07:19,727][49750] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-04-26 13:07:19,728][49750] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-04-26 13:07:20,357][49750] Updated weights for policy 0, policy_version 181651 (0.0035) [2024-04-26 13:07:22,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 2976251904. Throughput: 0: 50633.3. Samples: 729050100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:22,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 13:07:23,741][49750] Updated weights for policy 0, policy_version 181661 (0.0034) [2024-04-26 13:07:26,805][49750] Updated weights for policy 0, policy_version 181671 (0.0034) [2024-04-26 13:07:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2976497664. Throughput: 0: 50527.1. Samples: 729349780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:27,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 13:07:30,264][49750] Updated weights for policy 0, policy_version 181681 (0.0031) [2024-04-26 13:07:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 2976759808. Throughput: 0: 50635.9. Samples: 729654200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:32,063][49517] Avg episode reward: [(0, '0.442')] [2024-04-26 13:07:33,116][49750] Updated weights for policy 0, policy_version 181691 (0.0028) [2024-04-26 13:07:36,815][49750] Updated weights for policy 0, policy_version 181701 (0.0034) [2024-04-26 13:07:37,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.2, 300 sec: 50540.4). Total num frames: 2977005568. Throughput: 0: 50731.8. Samples: 729808720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:37,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 13:07:39,630][49750] Updated weights for policy 0, policy_version 181711 (0.0033) [2024-04-26 13:07:42,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 2977267712. Throughput: 0: 50796.4. Samples: 730113620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:42,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 13:07:43,293][49750] Updated weights for policy 0, policy_version 181721 (0.0034) [2024-04-26 13:07:46,190][49750] Updated weights for policy 0, policy_version 181731 (0.0032) [2024-04-26 13:07:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2977513472. Throughput: 0: 50703.2. Samples: 730414040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:47,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 13:07:49,636][49750] Updated weights for policy 0, policy_version 181741 (0.0033) [2024-04-26 13:07:52,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2977759232. Throughput: 0: 50686.1. Samples: 730570740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:52,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 13:07:52,540][49750] Updated weights for policy 0, policy_version 181751 (0.0036) [2024-04-26 13:07:56,087][49750] Updated weights for policy 0, policy_version 181761 (0.0028) [2024-04-26 13:07:57,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 2978021376. Throughput: 0: 50731.6. Samples: 730872460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:07:57,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 13:07:59,233][49750] Updated weights for policy 0, policy_version 181771 (0.0034) [2024-04-26 13:08:02,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 2978267136. Throughput: 0: 50717.6. Samples: 731172940. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 13:08:02,418][49750] Updated weights for policy 0, policy_version 181781 (0.0033) [2024-04-26 13:08:05,579][49750] Updated weights for policy 0, policy_version 181791 (0.0033) [2024-04-26 13:08:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50596.0). Total num frames: 2978545664. Throughput: 0: 50590.6. Samples: 731326680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:07,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 13:08:08,841][49750] Updated weights for policy 0, policy_version 181801 (0.0033) [2024-04-26 13:08:12,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2978775040. Throughput: 0: 50707.9. Samples: 731631640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:12,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 13:08:12,142][49750] Updated weights for policy 0, policy_version 181811 (0.0030) [2024-04-26 13:08:15,319][49750] Updated weights for policy 0, policy_version 181821 (0.0032) [2024-04-26 13:08:17,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 2979037184. Throughput: 0: 50546.9. Samples: 731928820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:17,064][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 13:08:18,679][49750] Updated weights for policy 0, policy_version 181831 (0.0033) [2024-04-26 13:08:21,746][49750] Updated weights for policy 0, policy_version 181841 (0.0034) [2024-04-26 13:08:22,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 2979282944. Throughput: 0: 50628.1. Samples: 732086980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:22,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 13:08:25,226][49750] Updated weights for policy 0, policy_version 181851 (0.0035) [2024-04-26 13:08:27,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 2979528704. Throughput: 0: 50540.7. Samples: 732387940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:27,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 13:08:28,160][49750] Updated weights for policy 0, policy_version 181861 (0.0030) [2024-04-26 13:08:31,690][49750] Updated weights for policy 0, policy_version 181871 (0.0029) [2024-04-26 13:08:32,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2979790848. Throughput: 0: 50633.9. Samples: 732692560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:32,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 13:08:34,514][49750] Updated weights for policy 0, policy_version 181881 (0.0028) [2024-04-26 13:08:37,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 2980052992. Throughput: 0: 50389.3. Samples: 732838260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:37,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 13:08:38,030][49750] Updated weights for policy 0, policy_version 181891 (0.0034) [2024-04-26 13:08:40,982][49750] Updated weights for policy 0, policy_version 181901 (0.0026) [2024-04-26 13:08:42,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 2980315136. Throughput: 0: 50554.6. Samples: 733147420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 13:08:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000181904_2980315136.pth... 
[2024-04-26 13:08:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000181162_2968158208.pth [2024-04-26 13:08:44,547][49750] Updated weights for policy 0, policy_version 181911 (0.0027) [2024-04-26 13:08:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 2980544512. Throughput: 0: 50652.2. Samples: 733452280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:47,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 13:08:47,736][49750] Updated weights for policy 0, policy_version 181921 (0.0035) [2024-04-26 13:08:49,345][49728] Signal inference workers to stop experience collection... (10950 times) [2024-04-26 13:08:49,384][49750] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-04-26 13:08:49,446][49728] Signal inference workers to resume experience collection... (10950 times) [2024-04-26 13:08:49,446][49750] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-04-26 13:08:51,053][49750] Updated weights for policy 0, policy_version 181931 (0.0031) [2024-04-26 13:08:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 2980806656. Throughput: 0: 50547.2. Samples: 733601300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:52,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 13:08:54,224][49750] Updated weights for policy 0, policy_version 181941 (0.0031) [2024-04-26 13:08:57,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50485.0). Total num frames: 2981052416. Throughput: 0: 50482.2. Samples: 733903340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:08:57,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 13:08:57,510][49750] Updated weights for policy 0, policy_version 181951 (0.0035) [2024-04-26 13:09:00,647][49750] Updated weights for policy 0, policy_version 181961 (0.0029) [2024-04-26 13:09:02,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 2981298176. Throughput: 0: 50563.1. Samples: 734204160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:09:02,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:09:03,920][49750] Updated weights for policy 0, policy_version 181971 (0.0030) [2024-04-26 13:09:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 2981560320. Throughput: 0: 50420.6. Samples: 734355900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 13:09:07,075][49750] Updated weights for policy 0, policy_version 181981 (0.0028) [2024-04-26 13:09:10,432][49750] Updated weights for policy 0, policy_version 181991 (0.0037) [2024-04-26 13:09:12,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 2981806080. Throughput: 0: 50474.7. Samples: 734659300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:12,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:09:13,607][49750] Updated weights for policy 0, policy_version 182001 (0.0029) [2024-04-26 13:09:16,891][49750] Updated weights for policy 0, policy_version 182011 (0.0035) [2024-04-26 13:09:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.5, 300 sec: 50485.0). Total num frames: 2982068224. Throughput: 0: 50460.0. Samples: 734963260. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:17,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 13:09:20,102][49750] Updated weights for policy 0, policy_version 182021 (0.0031) [2024-04-26 13:09:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.5, 300 sec: 50540.5). Total num frames: 2982313984. Throughput: 0: 50580.2. Samples: 735114360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:22,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 13:09:23,389][49750] Updated weights for policy 0, policy_version 182031 (0.0027) [2024-04-26 13:09:26,658][49750] Updated weights for policy 0, policy_version 182041 (0.0034) [2024-04-26 13:09:27,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 2982576128. Throughput: 0: 50482.2. Samples: 735419120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:27,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 13:09:29,932][49750] Updated weights for policy 0, policy_version 182051 (0.0035) [2024-04-26 13:09:32,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 2982821888. Throughput: 0: 50567.0. Samples: 735727800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:32,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 13:09:33,019][49750] Updated weights for policy 0, policy_version 182061 (0.0036) [2024-04-26 13:09:36,349][49750] Updated weights for policy 0, policy_version 182071 (0.0030) [2024-04-26 13:09:37,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50540.4). Total num frames: 2983067648. Throughput: 0: 50496.3. Samples: 735873640. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:37,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 13:09:39,514][49750] Updated weights for policy 0, policy_version 182081 (0.0034) [2024-04-26 13:09:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.3, 300 sec: 50484.9). Total num frames: 2983313408. Throughput: 0: 50537.9. Samples: 736177540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:42,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 13:09:42,906][49750] Updated weights for policy 0, policy_version 182091 (0.0029) [2024-04-26 13:09:46,015][49750] Updated weights for policy 0, policy_version 182101 (0.0032) [2024-04-26 13:09:47,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 2983575552. Throughput: 0: 50457.4. Samples: 736474740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:47,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 13:09:48,469][49728] Signal inference workers to stop experience collection... (11000 times) [2024-04-26 13:09:48,513][49750] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-04-26 13:09:48,575][49728] Signal inference workers to resume experience collection... (11000 times) [2024-04-26 13:09:48,575][49750] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-04-26 13:09:49,392][49750] Updated weights for policy 0, policy_version 182111 (0.0034) [2024-04-26 13:09:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 2983821312. Throughput: 0: 50418.2. Samples: 736624720. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:52,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 13:09:52,551][49750] Updated weights for policy 0, policy_version 182121 (0.0033) [2024-04-26 13:09:55,883][49750] Updated weights for policy 0, policy_version 182131 (0.0030) [2024-04-26 13:09:57,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 2984083456. Throughput: 0: 50504.4. Samples: 736932000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:09:57,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 13:09:59,059][49750] Updated weights for policy 0, policy_version 182141 (0.0030) [2024-04-26 13:10:02,063][49517] Fps is (10 sec: 50786.5, 60 sec: 50516.8, 300 sec: 50429.3). Total num frames: 2984329216. Throughput: 0: 50552.0. Samples: 737238140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:10:02,064][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 13:10:02,268][49750] Updated weights for policy 0, policy_version 182151 (0.0031) [2024-04-26 13:10:05,556][49750] Updated weights for policy 0, policy_version 182161 (0.0030) [2024-04-26 13:10:07,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 2984591360. Throughput: 0: 50444.8. Samples: 737384380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:10:07,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 13:10:08,718][49750] Updated weights for policy 0, policy_version 182171 (0.0036) [2024-04-26 13:10:12,040][49750] Updated weights for policy 0, policy_version 182181 (0.0037) [2024-04-26 13:10:12,062][49517] Fps is (10 sec: 52433.2, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 2984853504. Throughput: 0: 50337.9. Samples: 737684320. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-26 13:10:12,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 13:10:15,319][49750] Updated weights for policy 0, policy_version 182191 (0.0030) [2024-04-26 13:10:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 2985099264. Throughput: 0: 50272.1. Samples: 737990040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:10:17,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 13:10:18,602][49750] Updated weights for policy 0, policy_version 182201 (0.0039) [2024-04-26 13:10:21,732][49750] Updated weights for policy 0, policy_version 182211 (0.0031) [2024-04-26 13:10:22,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 2985345024. Throughput: 0: 50483.2. Samples: 738145380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:10:22,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 13:10:24,965][49750] Updated weights for policy 0, policy_version 182221 (0.0034) [2024-04-26 13:10:27,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.2, 300 sec: 50540.4). Total num frames: 2985590784. Throughput: 0: 50584.2. Samples: 738453840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:10:27,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 13:10:28,181][49750] Updated weights for policy 0, policy_version 182231 (0.0030) [2024-04-26 13:10:31,470][49750] Updated weights for policy 0, policy_version 182241 (0.0036) [2024-04-26 13:10:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2985836544. Throughput: 0: 50561.0. Samples: 738749980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:10:32,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 13:10:34,808][49750] Updated weights for policy 0, policy_version 182251 (0.0028) [2024-04-26 13:10:37,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 2986098688. Throughput: 0: 50668.8. Samples: 738904820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:10:37,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 13:10:38,053][49750] Updated weights for policy 0, policy_version 182261 (0.0029) [2024-04-26 13:10:41,409][49750] Updated weights for policy 0, policy_version 182271 (0.0027) [2024-04-26 13:10:42,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 2986377216. Throughput: 0: 50538.0. Samples: 739206220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:10:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 13:10:42,198][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000182275_2986393600.pth... [2024-04-26 13:10:42,243][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000181533_2974236672.pth [2024-04-26 13:10:44,576][49750] Updated weights for policy 0, policy_version 182281 (0.0035) [2024-04-26 13:10:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 2986590208. Throughput: 0: 50499.5. Samples: 739510580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:10:47,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 13:10:48,013][49750] Updated weights for policy 0, policy_version 182291 (0.0031) [2024-04-26 13:10:51,046][49750] Updated weights for policy 0, policy_version 182301 (0.0031) [2024-04-26 13:10:52,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2986852352. Throughput: 0: 50392.1. Samples: 739652020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:10:52,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 13:10:54,498][49750] Updated weights for policy 0, policy_version 182311 (0.0038) [2024-04-26 13:10:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.2, 300 sec: 50540.5). Total num frames: 2987098112. Throughput: 0: 50487.5. Samples: 739956260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:10:57,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 13:10:57,588][49750] Updated weights for policy 0, policy_version 182321 (0.0031) [2024-04-26 13:11:00,896][49750] Updated weights for policy 0, policy_version 182331 (0.0038) [2024-04-26 13:11:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50518.0, 300 sec: 50596.1). Total num frames: 2987360256. Throughput: 0: 50412.0. Samples: 740258580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:11:02,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 13:11:04,150][49750] Updated weights for policy 0, policy_version 182341 (0.0028) [2024-04-26 13:11:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 2987606016. Throughput: 0: 50421.5. Samples: 740414340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:11:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 13:11:07,337][49750] Updated weights for policy 0, policy_version 182351 (0.0033) [2024-04-26 13:11:10,278][49728] Signal inference workers to stop experience collection... (11050 times) [2024-04-26 13:11:10,278][49728] Signal inference workers to resume experience collection... 
(11050 times) [2024-04-26 13:11:10,304][49750] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-04-26 13:11:10,304][49750] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-04-26 13:11:10,578][49750] Updated weights for policy 0, policy_version 182361 (0.0029) [2024-04-26 13:11:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.1, 300 sec: 50484.9). Total num frames: 2987851776. Throughput: 0: 50279.3. Samples: 740716400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:11:12,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 13:11:13,916][49750] Updated weights for policy 0, policy_version 182371 (0.0029) [2024-04-26 13:11:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 2988113920. Throughput: 0: 50599.2. Samples: 741026940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 13:11:17,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 13:11:17,122][49750] Updated weights for policy 0, policy_version 182381 (0.0031) [2024-04-26 13:11:20,405][49750] Updated weights for policy 0, policy_version 182391 (0.0034) [2024-04-26 13:11:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 2988376064. Throughput: 0: 50284.5. Samples: 741167620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:11:22,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 13:11:24,107][49750] Updated weights for policy 0, policy_version 182401 (0.0029) [2024-04-26 13:11:26,781][49750] Updated weights for policy 0, policy_version 182411 (0.0031) [2024-04-26 13:11:27,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.5, 300 sec: 50596.0). Total num frames: 2988638208. Throughput: 0: 50257.8. Samples: 741467820. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:11:27,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 13:11:30,517][49750] Updated weights for policy 0, policy_version 182421 (0.0028) [2024-04-26 13:11:32,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2988867584. Throughput: 0: 50254.1. Samples: 741772020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:11:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 13:11:33,181][49750] Updated weights for policy 0, policy_version 182431 (0.0035) [2024-04-26 13:11:36,855][49750] Updated weights for policy 0, policy_version 182441 (0.0035) [2024-04-26 13:11:37,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 2989113344. Throughput: 0: 50348.4. Samples: 741917700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:11:37,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 13:11:39,576][49750] Updated weights for policy 0, policy_version 182451 (0.0028) [2024-04-26 13:11:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49698.2, 300 sec: 50484.9). Total num frames: 2989359104. Throughput: 0: 50348.8. Samples: 742221960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:11:42,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 13:11:43,425][49750] Updated weights for policy 0, policy_version 182461 (0.0035) [2024-04-26 13:11:46,118][49750] Updated weights for policy 0, policy_version 182471 (0.0035) [2024-04-26 13:11:47,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 2989637632. Throughput: 0: 50275.0. Samples: 742520960. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:11:47,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 13:11:49,912][49750] Updated weights for policy 0, policy_version 182481 (0.0030) [2024-04-26 13:11:52,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 2989883392. Throughput: 0: 50392.7. Samples: 742682020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:11:52,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 13:11:52,644][49750] Updated weights for policy 0, policy_version 182491 (0.0031) [2024-04-26 13:11:56,187][49728] Signal inference workers to stop experience collection... (11100 times) [2024-04-26 13:11:56,187][49728] Signal inference workers to resume experience collection... (11100 times) [2024-04-26 13:11:56,216][49750] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-04-26 13:11:56,216][49750] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-04-26 13:11:56,316][49750] Updated weights for policy 0, policy_version 182501 (0.0036) [2024-04-26 13:11:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 2990145536. Throughput: 0: 50508.8. Samples: 742989300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:11:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 13:11:59,022][49750] Updated weights for policy 0, policy_version 182511 (0.0030) [2024-04-26 13:12:02,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 2990358528. Throughput: 0: 50316.3. Samples: 743291180. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:02,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 13:12:02,711][49750] Updated weights for policy 0, policy_version 182521 (0.0038) [2024-04-26 13:12:05,470][49750] Updated weights for policy 0, policy_version 182531 (0.0029) [2024-04-26 13:12:07,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 2990637056. Throughput: 0: 50555.0. Samples: 743442600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:07,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 13:12:09,316][49750] Updated weights for policy 0, policy_version 182541 (0.0039) [2024-04-26 13:12:11,942][49750] Updated weights for policy 0, policy_version 182551 (0.0031) [2024-04-26 13:12:12,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51063.4, 300 sec: 50596.0). Total num frames: 2990915584. Throughput: 0: 50527.2. Samples: 743741540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:12,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 13:12:15,906][49750] Updated weights for policy 0, policy_version 182561 (0.0032) [2024-04-26 13:12:17,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 2991128576. Throughput: 0: 50407.6. Samples: 744040360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:17,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 13:12:18,474][49750] Updated weights for policy 0, policy_version 182571 (0.0029) [2024-04-26 13:12:22,063][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 2991374336. Throughput: 0: 50431.5. Samples: 744187120. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:22,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 13:12:22,347][49750] Updated weights for policy 0, policy_version 182581 (0.0035) [2024-04-26 13:12:24,894][49750] Updated weights for policy 0, policy_version 182591 (0.0026) [2024-04-26 13:12:27,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49698.3, 300 sec: 50373.9). Total num frames: 2991620096. Throughput: 0: 50334.3. Samples: 744487000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:27,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 13:12:28,789][49750] Updated weights for policy 0, policy_version 182601 (0.0033) [2024-04-26 13:12:31,298][49750] Updated weights for policy 0, policy_version 182611 (0.0031) [2024-04-26 13:12:32,062][49517] Fps is (10 sec: 55706.0, 60 sec: 51063.5, 300 sec: 50596.0). Total num frames: 2991931392. Throughput: 0: 50441.4. Samples: 744790820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:32,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 13:12:35,341][49750] Updated weights for policy 0, policy_version 182621 (0.0029) [2024-04-26 13:12:37,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 2992144384. Throughput: 0: 50627.2. Samples: 744960240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:37,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 13:12:37,904][49750] Updated weights for policy 0, policy_version 182631 (0.0038) [2024-04-26 13:12:41,797][49750] Updated weights for policy 0, policy_version 182641 (0.0031) [2024-04-26 13:12:42,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50485.0). Total num frames: 2992406528. Throughput: 0: 50493.4. Samples: 745261500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:42,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 13:12:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000182642_2992406528.pth... [2024-04-26 13:12:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000181904_2980315136.pth [2024-04-26 13:12:44,421][49750] Updated weights for policy 0, policy_version 182651 (0.0034) [2024-04-26 13:12:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 2992619520. Throughput: 0: 50340.5. Samples: 745556500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:47,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 13:12:48,311][49750] Updated weights for policy 0, policy_version 182661 (0.0029) [2024-04-26 13:12:50,897][49750] Updated weights for policy 0, policy_version 182671 (0.0033) [2024-04-26 13:12:52,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 2992930816. Throughput: 0: 50379.1. Samples: 745709660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:52,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 13:12:54,772][49750] Updated weights for policy 0, policy_version 182681 (0.0029) [2024-04-26 13:12:56,068][49728] Signal inference workers to stop experience collection... (11150 times) [2024-04-26 13:12:56,068][49728] Signal inference workers to resume experience collection... (11150 times) [2024-04-26 13:12:56,096][49750] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-04-26 13:12:56,097][49750] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-04-26 13:12:57,063][49517] Fps is (10 sec: 55704.4, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 2993176576. Throughput: 0: 50503.4. Samples: 746014200. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:12:57,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 13:12:57,385][49750] Updated weights for policy 0, policy_version 182691 (0.0037) [2024-04-26 13:13:01,225][49750] Updated weights for policy 0, policy_version 182701 (0.0031) [2024-04-26 13:13:02,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 2993405952. Throughput: 0: 50449.4. Samples: 746310580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:13:02,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:13:03,780][49750] Updated weights for policy 0, policy_version 182711 (0.0034) [2024-04-26 13:13:07,063][49517] Fps is (10 sec: 45875.7, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 2993635328. Throughput: 0: 50493.3. Samples: 746459320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:13:07,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 13:13:07,730][49750] Updated weights for policy 0, policy_version 182721 (0.0033) [2024-04-26 13:13:10,331][49750] Updated weights for policy 0, policy_version 182731 (0.0034) [2024-04-26 13:13:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 2993913856. Throughput: 0: 50432.2. Samples: 746756460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:13:12,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 13:13:14,257][49750] Updated weights for policy 0, policy_version 182741 (0.0028) [2024-04-26 13:13:16,946][49750] Updated weights for policy 0, policy_version 182751 (0.0038) [2024-04-26 13:13:17,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 2994192384. Throughput: 0: 50393.3. Samples: 747058520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:13:17,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 13:13:20,679][49750] Updated weights for policy 0, policy_version 182761 (0.0031) [2024-04-26 13:13:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 2994421760. Throughput: 0: 50404.4. Samples: 747228440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:13:22,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 13:13:23,320][49750] Updated weights for policy 0, policy_version 182771 (0.0028) [2024-04-26 13:13:27,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 2994651136. Throughput: 0: 50310.2. Samples: 747525460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 13:13:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 13:13:27,298][49750] Updated weights for policy 0, policy_version 182781 (0.0035) [2024-04-26 13:13:29,743][49750] Updated weights for policy 0, policy_version 182791 (0.0027) [2024-04-26 13:13:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49425.1, 300 sec: 50318.3). Total num frames: 2994896896. Throughput: 0: 50292.4. Samples: 747819660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:13:32,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 13:13:33,894][49750] Updated weights for policy 0, policy_version 182801 (0.0036) [2024-04-26 13:13:36,358][49750] Updated weights for policy 0, policy_version 182811 (0.0031) [2024-04-26 13:13:37,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2995191808. Throughput: 0: 50382.7. Samples: 747976880. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:13:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 13:13:40,391][49750] Updated weights for policy 0, policy_version 182821 (0.0028) [2024-04-26 13:13:42,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2995437568. Throughput: 0: 50227.3. Samples: 748274420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:13:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 13:13:43,037][49750] Updated weights for policy 0, policy_version 182831 (0.0036) [2024-04-26 13:13:46,737][49750] Updated weights for policy 0, policy_version 182841 (0.0030) [2024-04-26 13:13:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 2995683328. Throughput: 0: 50515.0. Samples: 748583760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:13:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 13:13:49,491][49750] Updated weights for policy 0, policy_version 182851 (0.0034) [2024-04-26 13:13:52,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49698.1, 300 sec: 50373.9). Total num frames: 2995912704. Throughput: 0: 50316.5. Samples: 748723560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:13:52,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 13:13:53,164][49750] Updated weights for policy 0, policy_version 182861 (0.0025) [2024-04-26 13:13:55,966][49750] Updated weights for policy 0, policy_version 182871 (0.0030) [2024-04-26 13:13:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 2996174848. Throughput: 0: 50483.2. Samples: 749028200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:13:57,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 13:13:58,194][49728] Signal inference workers to stop experience collection... 
(11200 times) [2024-04-26 13:13:58,236][49750] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-04-26 13:13:58,270][49728] Signal inference workers to resume experience collection... (11200 times) [2024-04-26 13:13:58,270][49750] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-04-26 13:13:59,612][49750] Updated weights for policy 0, policy_version 182881 (0.0032) [2024-04-26 13:14:02,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 2996453376. Throughput: 0: 50460.3. Samples: 749329240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:14:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 13:14:02,634][49750] Updated weights for policy 0, policy_version 182891 (0.0032) [2024-04-26 13:14:06,149][49750] Updated weights for policy 0, policy_version 182901 (0.0039) [2024-04-26 13:14:07,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 2996699136. Throughput: 0: 50381.7. Samples: 749495620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:14:07,064][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 13:14:09,203][49750] Updated weights for policy 0, policy_version 182911 (0.0033) [2024-04-26 13:14:12,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 2996928512. Throughput: 0: 50468.0. Samples: 749796520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:14:12,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 13:14:12,517][49750] Updated weights for policy 0, policy_version 182921 (0.0031) [2024-04-26 13:14:15,586][49750] Updated weights for policy 0, policy_version 182931 (0.0037) [2024-04-26 13:14:17,063][49517] Fps is (10 sec: 45875.2, 60 sec: 49425.0, 300 sec: 50318.3). Total num frames: 2997157888. Throughput: 0: 50430.9. Samples: 750089060. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:14:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 13:14:19,031][49750] Updated weights for policy 0, policy_version 182941 (0.0030) [2024-04-26 13:14:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 2997452800. Throughput: 0: 50327.2. Samples: 750241600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:14:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 13:14:22,161][49750] Updated weights for policy 0, policy_version 182951 (0.0031) [2024-04-26 13:14:25,631][49750] Updated weights for policy 0, policy_version 182961 (0.0033) [2024-04-26 13:14:27,063][49517] Fps is (10 sec: 55705.8, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 2997714944. Throughput: 0: 50319.0. Samples: 750538780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:14:27,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 13:14:28,580][49750] Updated weights for policy 0, policy_version 182971 (0.0032) [2024-04-26 13:14:32,031][49750] Updated weights for policy 0, policy_version 182981 (0.0030) [2024-04-26 13:14:32,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 2997960704. Throughput: 0: 50335.1. Samples: 750848840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-26 13:14:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 13:14:35,081][49750] Updated weights for policy 0, policy_version 182991 (0.0030) [2024-04-26 13:14:37,062][49517] Fps is (10 sec: 45876.1, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 2998173696. Throughput: 0: 50448.6. Samples: 750993740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:14:37,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 13:14:38,648][49750] Updated weights for policy 0, policy_version 183001 (0.0029) [2024-04-26 13:14:41,492][49750] Updated weights for policy 0, policy_version 183011 (0.0031) [2024-04-26 13:14:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 2998468608. Throughput: 0: 50369.7. Samples: 751294840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:14:42,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 13:14:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000183012_2998468608.pth... [2024-04-26 13:14:42,112][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000182275_2986393600.pth [2024-04-26 13:14:45,348][49750] Updated weights for policy 0, policy_version 183021 (0.0034) [2024-04-26 13:14:47,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50517.5, 300 sec: 50484.9). Total num frames: 2998714368. Throughput: 0: 50361.1. Samples: 751595480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:14:47,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 13:14:48,038][49750] Updated weights for policy 0, policy_version 183031 (0.0034) [2024-04-26 13:14:51,777][49750] Updated weights for policy 0, policy_version 183041 (0.0033) [2024-04-26 13:14:52,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 2998960128. Throughput: 0: 50258.7. Samples: 751757260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:14:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 13:14:54,557][49750] Updated weights for policy 0, policy_version 183051 (0.0032) [2024-04-26 13:14:57,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.3, 300 sec: 50374.0). Total num frames: 2999189504. Throughput: 0: 50154.2. Samples: 752053460. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:14:57,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 13:14:58,110][49750] Updated weights for policy 0, policy_version 183061 (0.0032) [2024-04-26 13:15:01,143][49750] Updated weights for policy 0, policy_version 183071 (0.0031) [2024-04-26 13:15:02,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 2999435264. Throughput: 0: 50304.5. Samples: 752352760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:15:02,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 13:15:02,736][49728] Signal inference workers to stop experience collection... (11250 times) [2024-04-26 13:15:02,737][49728] Signal inference workers to resume experience collection... (11250 times) [2024-04-26 13:15:02,757][49750] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-04-26 13:15:02,757][49750] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-04-26 13:15:04,578][49750] Updated weights for policy 0, policy_version 183081 (0.0033) [2024-04-26 13:15:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 2999697408. Throughput: 0: 50339.9. Samples: 752506900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:15:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 13:15:07,628][49750] Updated weights for policy 0, policy_version 183091 (0.0032) [2024-04-26 13:15:10,976][49750] Updated weights for policy 0, policy_version 183101 (0.0030) [2024-04-26 13:15:12,062][49517] Fps is (10 sec: 54068.1, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 2999975936. Throughput: 0: 50436.6. Samples: 752808420. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:15:12,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 13:15:14,398][49750] Updated weights for policy 0, policy_version 183111 (0.0032) [2024-04-26 13:15:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 3000205312. Throughput: 0: 50305.8. Samples: 753112600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:15:17,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 13:15:17,536][49750] Updated weights for policy 0, policy_version 183121 (0.0032) [2024-04-26 13:15:21,015][49750] Updated weights for policy 0, policy_version 183131 (0.0032) [2024-04-26 13:15:22,063][49517] Fps is (10 sec: 45874.6, 60 sec: 49698.0, 300 sec: 50318.3). Total num frames: 3000434688. Throughput: 0: 50343.4. Samples: 753259200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:15:22,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 13:15:23,959][49750] Updated weights for policy 0, policy_version 183141 (0.0031) [2024-04-26 13:15:27,062][49517] Fps is (10 sec: 50790.8, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 3000713216. Throughput: 0: 50344.0. Samples: 753560320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:15:27,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 13:15:27,492][49750] Updated weights for policy 0, policy_version 183151 (0.0028) [2024-04-26 13:15:30,365][49750] Updated weights for policy 0, policy_version 183161 (0.0032) [2024-04-26 13:15:32,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3000975360. Throughput: 0: 50383.4. Samples: 753862740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:15:32,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 13:15:34,129][49750] Updated weights for policy 0, policy_version 183171 (0.0038) [2024-04-26 13:15:36,923][49750] Updated weights for policy 0, policy_version 183181 (0.0030) [2024-04-26 13:15:37,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.3, 300 sec: 50373.9). Total num frames: 3001237504. Throughput: 0: 50403.2. Samples: 754025400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:15:37,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 13:15:40,578][49750] Updated weights for policy 0, policy_version 183191 (0.0033) [2024-04-26 13:15:42,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49698.2, 300 sec: 50373.9). Total num frames: 3001450496. Throughput: 0: 50473.9. Samples: 754324780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 13:15:42,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 13:15:43,445][49750] Updated weights for policy 0, policy_version 183201 (0.0037) [2024-04-26 13:15:47,007][49750] Updated weights for policy 0, policy_version 183211 (0.0029) [2024-04-26 13:15:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3001729024. Throughput: 0: 50483.7. Samples: 754624520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:15:47,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 13:15:50,019][49750] Updated weights for policy 0, policy_version 183221 (0.0032) [2024-04-26 13:15:52,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3001974784. Throughput: 0: 50540.8. Samples: 754781240. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:15:52,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 13:15:53,445][49750] Updated weights for policy 0, policy_version 183231 (0.0034) [2024-04-26 13:15:56,614][49750] Updated weights for policy 0, policy_version 183241 (0.0029) [2024-04-26 13:15:57,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3002236928. Throughput: 0: 50592.0. Samples: 755085060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:15:57,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 13:15:59,928][49750] Updated weights for policy 0, policy_version 183251 (0.0029) [2024-04-26 13:16:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3002482688. Throughput: 0: 50599.2. Samples: 755389560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 13:16:03,192][49750] Updated weights for policy 0, policy_version 183261 (0.0032) [2024-04-26 13:16:06,487][49750] Updated weights for policy 0, policy_version 183271 (0.0032) [2024-04-26 13:16:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3002728448. Throughput: 0: 50372.9. Samples: 755525980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:07,063][49517] Avg episode reward: [(0, '0.394')] [2024-04-26 13:16:07,759][49728] Signal inference workers to stop experience collection... (11300 times) [2024-04-26 13:16:07,800][49750] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-04-26 13:16:07,817][49728] Signal inference workers to resume experience collection... 
(11300 times) [2024-04-26 13:16:07,826][49750] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-04-26 13:16:09,680][49750] Updated weights for policy 0, policy_version 183281 (0.0029) [2024-04-26 13:16:12,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 3002974208. Throughput: 0: 50396.8. Samples: 755828180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:12,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 13:16:12,981][49750] Updated weights for policy 0, policy_version 183291 (0.0033) [2024-04-26 13:16:16,174][49750] Updated weights for policy 0, policy_version 183301 (0.0033) [2024-04-26 13:16:17,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 3003252736. Throughput: 0: 50341.2. Samples: 756128100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:17,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 13:16:19,389][49750] Updated weights for policy 0, policy_version 183311 (0.0031) [2024-04-26 13:16:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50373.9). Total num frames: 3003498496. Throughput: 0: 50171.6. Samples: 756283120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:22,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 13:16:22,652][49750] Updated weights for policy 0, policy_version 183321 (0.0033) [2024-04-26 13:16:26,201][49750] Updated weights for policy 0, policy_version 183331 (0.0032) [2024-04-26 13:16:27,062][49517] Fps is (10 sec: 45875.7, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 3003711488. Throughput: 0: 50257.3. Samples: 756586360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:27,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 13:16:29,247][49750] Updated weights for policy 0, policy_version 183341 (0.0036) [2024-04-26 13:16:32,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50429.4). 
Total num frames: 3003990016. Throughput: 0: 50336.3. Samples: 756889660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 13:16:32,731][49750] Updated weights for policy 0, policy_version 183351 (0.0028) [2024-04-26 13:16:35,602][49750] Updated weights for policy 0, policy_version 183361 (0.0033) [2024-04-26 13:16:37,063][49517] Fps is (10 sec: 54066.7, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 3004252160. Throughput: 0: 50248.9. Samples: 757042440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:37,072][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 13:16:39,345][49750] Updated weights for policy 0, policy_version 183371 (0.0030) [2024-04-26 13:16:41,916][49750] Updated weights for policy 0, policy_version 183381 (0.0034) [2024-04-26 13:16:42,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.3, 300 sec: 50429.4). Total num frames: 3004514304. Throughput: 0: 50314.9. Samples: 757349240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:42,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 13:16:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000183381_3004514304.pth... [2024-04-26 13:16:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000182642_2992406528.pth [2024-04-26 13:16:45,697][49750] Updated weights for policy 0, policy_version 183391 (0.0029) [2024-04-26 13:16:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 3004727296. Throughput: 0: 50366.2. Samples: 757656040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-26 13:16:47,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 13:16:48,548][49750] Updated weights for policy 0, policy_version 183401 (0.0032) [2024-04-26 13:16:52,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3004989440. 
Throughput: 0: 50483.1. Samples: 757797720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:16:52,065][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 13:16:52,226][49750] Updated weights for policy 0, policy_version 183411 (0.0032) [2024-04-26 13:16:55,196][49750] Updated weights for policy 0, policy_version 183421 (0.0028) [2024-04-26 13:16:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 3005251584. Throughput: 0: 50349.0. Samples: 758093880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:16:57,071][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 13:16:58,617][49750] Updated weights for policy 0, policy_version 183431 (0.0032) [2024-04-26 13:17:01,619][49750] Updated weights for policy 0, policy_version 183441 (0.0032) [2024-04-26 13:17:02,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3005513728. Throughput: 0: 50536.0. Samples: 758402220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:02,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 13:17:05,029][49750] Updated weights for policy 0, policy_version 183451 (0.0035) [2024-04-26 13:17:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 3005775872. Throughput: 0: 50462.3. Samples: 758553920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:07,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 13:17:08,266][49750] Updated weights for policy 0, policy_version 183461 (0.0031) [2024-04-26 13:17:11,568][49750] Updated weights for policy 0, policy_version 183471 (0.0041) [2024-04-26 13:17:12,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3005988864. Throughput: 0: 50433.2. Samples: 758855860. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 13:17:13,053][49728] Signal inference workers to stop experience collection... (11350 times) [2024-04-26 13:17:13,054][49728] Signal inference workers to resume experience collection... (11350 times) [2024-04-26 13:17:13,070][49750] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-04-26 13:17:13,099][49750] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-04-26 13:17:14,727][49750] Updated weights for policy 0, policy_version 183481 (0.0038) [2024-04-26 13:17:17,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 50485.0). Total num frames: 3006267392. Throughput: 0: 50430.9. Samples: 759159040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:17,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 13:17:18,037][49750] Updated weights for policy 0, policy_version 183491 (0.0033) [2024-04-26 13:17:21,257][49750] Updated weights for policy 0, policy_version 183501 (0.0029) [2024-04-26 13:17:22,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3006529536. Throughput: 0: 50389.8. Samples: 759309980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:22,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 13:17:24,586][49750] Updated weights for policy 0, policy_version 183511 (0.0032) [2024-04-26 13:17:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50262.8). Total num frames: 3006758912. Throughput: 0: 50229.5. Samples: 759609560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 13:17:27,731][49750] Updated weights for policy 0, policy_version 183521 (0.0033) [2024-04-26 13:17:31,168][49750] Updated weights for policy 0, policy_version 183531 (0.0035) [2024-04-26 13:17:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 3007004672. Throughput: 0: 50156.1. Samples: 759913060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:32,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 13:17:34,179][49750] Updated weights for policy 0, policy_version 183541 (0.0029) [2024-04-26 13:17:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3007266816. Throughput: 0: 50303.2. Samples: 760061360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 13:17:38,033][49750] Updated weights for policy 0, policy_version 183551 (0.0030) [2024-04-26 13:17:40,661][49750] Updated weights for policy 0, policy_version 183561 (0.0040) [2024-04-26 13:17:42,063][49517] Fps is (10 sec: 49151.1, 60 sec: 49698.1, 300 sec: 50429.4). Total num frames: 3007496192. Throughput: 0: 50441.6. Samples: 760363760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:42,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 13:17:44,450][49750] Updated weights for policy 0, policy_version 183571 (0.0038) [2024-04-26 13:17:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 3007774720. Throughput: 0: 50205.9. Samples: 760661480. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:47,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 13:17:47,122][49750] Updated weights for policy 0, policy_version 183581 (0.0031) [2024-04-26 13:17:50,796][49750] Updated weights for policy 0, policy_version 183591 (0.0033) [2024-04-26 13:17:52,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 3008020480. Throughput: 0: 50333.8. Samples: 760818940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-26 13:17:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 13:17:53,554][49750] Updated weights for policy 0, policy_version 183601 (0.0028) [2024-04-26 13:17:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 3008249856. Throughput: 0: 50338.3. Samples: 761121080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:17:57,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 13:17:57,559][49750] Updated weights for policy 0, policy_version 183611 (0.0034) [2024-04-26 13:18:00,249][49750] Updated weights for policy 0, policy_version 183621 (0.0031) [2024-04-26 13:18:02,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 3008512000. Throughput: 0: 50244.2. Samples: 761420040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:02,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:18:04,021][49750] Updated weights for policy 0, policy_version 183631 (0.0035) [2024-04-26 13:18:06,783][49750] Updated weights for policy 0, policy_version 183641 (0.0030) [2024-04-26 13:18:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 49971.1, 300 sec: 50373.9). Total num frames: 3008774144. Throughput: 0: 50244.4. Samples: 761570980. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:07,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 13:18:10,729][49750] Updated weights for policy 0, policy_version 183651 (0.0031) [2024-04-26 13:18:12,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50318.3). Total num frames: 3009036288. Throughput: 0: 50326.2. Samples: 761874240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:12,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 13:18:13,326][49750] Updated weights for policy 0, policy_version 183661 (0.0036) [2024-04-26 13:18:17,052][49750] Updated weights for policy 0, policy_version 183671 (0.0029) [2024-04-26 13:18:17,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 3009265664. Throughput: 0: 50240.4. Samples: 762173880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:17,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 13:18:19,867][49750] Updated weights for policy 0, policy_version 183681 (0.0029) [2024-04-26 13:18:22,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 3009527808. Throughput: 0: 50264.4. Samples: 762323260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 13:18:23,405][49750] Updated weights for policy 0, policy_version 183691 (0.0037) [2024-04-26 13:18:26,208][49728] Signal inference workers to stop experience collection... (11400 times) [2024-04-26 13:18:26,251][49750] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-04-26 13:18:26,313][49728] Signal inference workers to resume experience collection... 
(11400 times) [2024-04-26 13:18:26,313][49750] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-04-26 13:18:26,445][49750] Updated weights for policy 0, policy_version 183701 (0.0031) [2024-04-26 13:18:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3009773568. Throughput: 0: 50323.2. Samples: 762628300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:27,072][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 13:18:29,813][49750] Updated weights for policy 0, policy_version 183711 (0.0026) [2024-04-26 13:18:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 3010019328. Throughput: 0: 50320.5. Samples: 762925900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:32,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 13:18:32,936][49750] Updated weights for policy 0, policy_version 183721 (0.0029) [2024-04-26 13:18:36,708][49750] Updated weights for policy 0, policy_version 183731 (0.0027) [2024-04-26 13:18:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3010281472. Throughput: 0: 50232.0. Samples: 763079380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:37,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 13:18:39,414][49750] Updated weights for policy 0, policy_version 183741 (0.0033) [2024-04-26 13:18:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 3010527232. Throughput: 0: 50341.3. Samples: 763386440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:42,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 13:18:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000183748_3010527232.pth... 
[2024-04-26 13:18:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000183012_2998468608.pth [2024-04-26 13:18:43,075][49750] Updated weights for policy 0, policy_version 183751 (0.0029) [2024-04-26 13:18:45,824][49750] Updated weights for policy 0, policy_version 183761 (0.0036) [2024-04-26 13:18:47,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 3010756608. Throughput: 0: 50220.9. Samples: 763679980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 13:18:49,472][49750] Updated weights for policy 0, policy_version 183771 (0.0035) [2024-04-26 13:18:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3011035136. Throughput: 0: 50326.8. Samples: 763835680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:52,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 13:18:52,285][49750] Updated weights for policy 0, policy_version 183781 (0.0033) [2024-04-26 13:18:55,914][49750] Updated weights for policy 0, policy_version 183791 (0.0031) [2024-04-26 13:18:57,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 3011297280. Throughput: 0: 50362.6. Samples: 764140560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 13:18:57,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 13:18:58,957][49750] Updated weights for policy 0, policy_version 183801 (0.0037) [2024-04-26 13:19:02,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.3, 300 sec: 50207.3). Total num frames: 3011510272. Throughput: 0: 50305.9. Samples: 764437640. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:02,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 13:19:02,677][49750] Updated weights for policy 0, policy_version 183811 (0.0026) [2024-04-26 13:19:05,498][49750] Updated weights for policy 0, policy_version 183821 (0.0033) [2024-04-26 13:19:07,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 3011788800. Throughput: 0: 50226.2. Samples: 764583440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 13:19:09,118][49750] Updated weights for policy 0, policy_version 183831 (0.0035) [2024-04-26 13:19:12,062][49517] Fps is (10 sec: 52428.5, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 3012034560. Throughput: 0: 50101.0. Samples: 764882840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:12,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 13:19:12,090][49750] Updated weights for policy 0, policy_version 183841 (0.0026) [2024-04-26 13:19:15,494][49750] Updated weights for policy 0, policy_version 183851 (0.0027) [2024-04-26 13:19:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 3012296704. Throughput: 0: 50326.6. Samples: 765190600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:17,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 13:19:18,488][49750] Updated weights for policy 0, policy_version 183861 (0.0029) [2024-04-26 13:19:21,981][49750] Updated weights for policy 0, policy_version 183871 (0.0031) [2024-04-26 13:19:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 3012542464. Throughput: 0: 50303.5. Samples: 765343040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:22,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 13:19:25,006][49750] Updated weights for policy 0, policy_version 183881 (0.0031) [2024-04-26 13:19:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 3012804608. Throughput: 0: 50056.4. Samples: 765638980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:27,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 13:19:28,584][49750] Updated weights for policy 0, policy_version 183891 (0.0030) [2024-04-26 13:19:31,472][49750] Updated weights for policy 0, policy_version 183901 (0.0030) [2024-04-26 13:19:32,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3013050368. Throughput: 0: 50297.9. Samples: 765943380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:32,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 13:19:35,003][49750] Updated weights for policy 0, policy_version 183911 (0.0031) [2024-04-26 13:19:37,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 3013279744. Throughput: 0: 50314.2. Samples: 766099820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:37,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 13:19:37,865][49750] Updated weights for policy 0, policy_version 183921 (0.0033) [2024-04-26 13:19:39,494][49728] Signal inference workers to stop experience collection... (11450 times) [2024-04-26 13:19:39,513][49750] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-04-26 13:19:39,605][49728] Signal inference workers to resume experience collection... 
(11450 times) [2024-04-26 13:19:39,605][49750] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-04-26 13:19:41,473][49750] Updated weights for policy 0, policy_version 183931 (0.0033) [2024-04-26 13:19:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.5, 300 sec: 50318.3). Total num frames: 3013558272. Throughput: 0: 50308.6. Samples: 766404440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 13:19:44,416][49750] Updated weights for policy 0, policy_version 183941 (0.0033) [2024-04-26 13:19:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 3013787648. Throughput: 0: 50330.2. Samples: 766702500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 13:19:48,238][49750] Updated weights for policy 0, policy_version 183951 (0.0031) [2024-04-26 13:19:50,907][49750] Updated weights for policy 0, policy_version 183961 (0.0035) [2024-04-26 13:19:52,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 3014049792. Throughput: 0: 50364.8. Samples: 766849860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 13:19:54,724][49750] Updated weights for policy 0, policy_version 183971 (0.0034) [2024-04-26 13:19:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.1, 300 sec: 50429.4). Total num frames: 3014311936. Throughput: 0: 50502.5. Samples: 767155460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:19:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:19:57,351][49750] Updated weights for policy 0, policy_version 183981 (0.0030) [2024-04-26 13:20:01,112][49750] Updated weights for policy 0, policy_version 183991 (0.0029) [2024-04-26 13:20:02,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 3014574080. Throughput: 0: 50374.8. Samples: 767457460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 13:20:02,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 13:20:03,857][49750] Updated weights for policy 0, policy_version 184001 (0.0030) [2024-04-26 13:20:07,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 3014803456. Throughput: 0: 50349.0. Samples: 767608740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 13:20:07,613][49750] Updated weights for policy 0, policy_version 184011 (0.0029) [2024-04-26 13:20:10,280][49750] Updated weights for policy 0, policy_version 184021 (0.0031) [2024-04-26 13:20:12,063][49517] Fps is (10 sec: 50788.2, 60 sec: 50790.1, 300 sec: 50429.4). Total num frames: 3015081984. Throughput: 0: 50376.6. Samples: 767905940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 13:20:14,013][49750] Updated weights for policy 0, policy_version 184031 (0.0029) [2024-04-26 13:20:16,742][49750] Updated weights for policy 0, policy_version 184041 (0.0032) [2024-04-26 13:20:17,063][49517] Fps is (10 sec: 54066.0, 60 sec: 50790.2, 300 sec: 50540.5). Total num frames: 3015344128. Throughput: 0: 50555.3. Samples: 768218380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:17,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 13:20:20,479][49750] Updated weights for policy 0, policy_version 184051 (0.0030) [2024-04-26 13:20:22,062][49517] Fps is (10 sec: 47515.4, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3015557120. Throughput: 0: 50503.1. Samples: 768372460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 13:20:23,267][49750] Updated weights for policy 0, policy_version 184061 (0.0040) [2024-04-26 13:20:26,903][49750] Updated weights for policy 0, policy_version 184071 (0.0033) [2024-04-26 13:20:27,063][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 3015835648. Throughput: 0: 50299.4. Samples: 768667920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:27,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 13:20:29,836][49750] Updated weights for policy 0, policy_version 184081 (0.0038) [2024-04-26 13:20:30,905][49728] Signal inference workers to stop experience collection... (11500 times) [2024-04-26 13:20:30,905][49728] Signal inference workers to resume experience collection... (11500 times) [2024-04-26 13:20:30,932][49750] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-04-26 13:20:30,932][49750] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-04-26 13:20:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 3016081408. Throughput: 0: 50445.7. Samples: 768972560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:32,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 13:20:33,325][49750] Updated weights for policy 0, policy_version 184091 (0.0032) [2024-04-26 13:20:36,331][49750] Updated weights for policy 0, policy_version 184101 (0.0033) [2024-04-26 13:20:37,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3016327168. Throughput: 0: 50561.8. Samples: 769125140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 13:20:39,830][49750] Updated weights for policy 0, policy_version 184111 (0.0032) [2024-04-26 13:20:42,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3016589312. Throughput: 0: 50640.7. Samples: 769434280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:42,063][49517] Avg episode reward: [(0, '0.476')] [2024-04-26 13:20:42,138][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000184119_3016605696.pth... [2024-04-26 13:20:42,184][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000183381_3004514304.pth [2024-04-26 13:20:42,852][49750] Updated weights for policy 0, policy_version 184121 (0.0032) [2024-04-26 13:20:46,332][49750] Updated weights for policy 0, policy_version 184131 (0.0036) [2024-04-26 13:20:47,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 3016818688. Throughput: 0: 50514.8. Samples: 769730640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:47,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 13:20:49,282][49750] Updated weights for policy 0, policy_version 184141 (0.0026) [2024-04-26 13:20:52,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 3017080832. Throughput: 0: 50548.8. Samples: 769883440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:52,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 13:20:52,749][49750] Updated weights for policy 0, policy_version 184151 (0.0031) [2024-04-26 13:20:55,919][49750] Updated weights for policy 0, policy_version 184161 (0.0031) [2024-04-26 13:20:57,063][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 3017342976. Throughput: 0: 50647.4. Samples: 770185060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:20:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 13:20:59,335][49750] Updated weights for policy 0, policy_version 184171 (0.0028) [2024-04-26 13:21:02,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 3017605120. Throughput: 0: 50438.8. Samples: 770488120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:21:02,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 13:21:02,319][49750] Updated weights for policy 0, policy_version 184181 (0.0031) [2024-04-26 13:21:05,794][49750] Updated weights for policy 0, policy_version 184191 (0.0038) [2024-04-26 13:21:07,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3017850880. Throughput: 0: 50527.9. Samples: 770646220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:21:07,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 13:21:08,767][49750] Updated weights for policy 0, policy_version 184201 (0.0033) [2024-04-26 13:21:12,063][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.4, 300 sec: 50262.8). Total num frames: 3018080256. Throughput: 0: 50554.2. Samples: 770942860. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 13:21:12,364][49750] Updated weights for policy 0, policy_version 184211 (0.0031) [2024-04-26 13:21:15,382][49750] Updated weights for policy 0, policy_version 184221 (0.0036) [2024-04-26 13:21:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 3018358784. Throughput: 0: 50527.7. Samples: 771246300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 13:21:18,719][49750] Updated weights for policy 0, policy_version 184231 (0.0031) [2024-04-26 13:21:22,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3018588160. Throughput: 0: 50613.4. Samples: 771402740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:22,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 13:21:22,159][49750] Updated weights for policy 0, policy_version 184241 (0.0031) [2024-04-26 13:21:25,153][49750] Updated weights for policy 0, policy_version 184251 (0.0035) [2024-04-26 13:21:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3018850304. Throughput: 0: 50429.2. Samples: 771703600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:27,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 13:21:28,528][49750] Updated weights for policy 0, policy_version 184261 (0.0036) [2024-04-26 13:21:31,614][49750] Updated weights for policy 0, policy_version 184271 (0.0028) [2024-04-26 13:21:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 3019096064. Throughput: 0: 50577.4. Samples: 772006620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:32,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 13:21:35,166][49750] Updated weights for policy 0, policy_version 184281 (0.0030) [2024-04-26 13:21:36,230][49728] Signal inference workers to stop experience collection... (11550 times) [2024-04-26 13:21:36,230][49728] Signal inference workers to resume experience collection... (11550 times) [2024-04-26 13:21:36,243][49750] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-04-26 13:21:36,263][49750] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-04-26 13:21:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.5, 300 sec: 50318.4). Total num frames: 3019358208. Throughput: 0: 50532.2. Samples: 772157380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:37,063][49517] Avg episode reward: [(0, '0.395')] [2024-04-26 13:21:38,070][49750] Updated weights for policy 0, policy_version 184291 (0.0031) [2024-04-26 13:21:41,714][49750] Updated weights for policy 0, policy_version 184301 (0.0032) [2024-04-26 13:21:42,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3019603968. Throughput: 0: 50625.4. Samples: 772463200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 13:21:44,425][49750] Updated weights for policy 0, policy_version 184311 (0.0036) [2024-04-26 13:21:47,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 3019866112. Throughput: 0: 50571.4. Samples: 772763840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:47,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 13:21:48,160][49750] Updated weights for policy 0, policy_version 184321 (0.0031) [2024-04-26 13:21:51,262][49750] Updated weights for policy 0, policy_version 184331 (0.0037) [2024-04-26 13:21:52,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 3020128256. Throughput: 0: 50478.7. Samples: 772917760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:52,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 13:21:54,729][49750] Updated weights for policy 0, policy_version 184341 (0.0029) [2024-04-26 13:21:57,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3020357632. Throughput: 0: 50538.3. Samples: 773217080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:21:57,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 13:21:57,799][49750] Updated weights for policy 0, policy_version 184351 (0.0028) [2024-04-26 13:22:01,063][49750] Updated weights for policy 0, policy_version 184361 (0.0039) [2024-04-26 13:22:02,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 3020619776. Throughput: 0: 50536.3. Samples: 773520440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:22:02,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 13:22:04,323][49750] Updated weights for policy 0, policy_version 184371 (0.0034) [2024-04-26 13:22:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3020865536. Throughput: 0: 50321.4. Samples: 773667200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:22:07,071][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 13:22:07,459][49750] Updated weights for policy 0, policy_version 184381 (0.0033) [2024-04-26 13:22:10,795][49750] Updated weights for policy 0, policy_version 184391 (0.0027) [2024-04-26 13:22:12,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50373.8). Total num frames: 3021127680. Throughput: 0: 50548.0. Samples: 773978260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:22:12,072][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 13:22:13,962][49750] Updated weights for policy 0, policy_version 184401 (0.0029) [2024-04-26 13:22:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 3021373440. Throughput: 0: 50508.0. Samples: 774279480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 13:22:17,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 13:22:17,136][49750] Updated weights for policy 0, policy_version 184411 (0.0032) [2024-04-26 13:22:20,581][49750] Updated weights for policy 0, policy_version 184421 (0.0038) [2024-04-26 13:22:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3021635584. Throughput: 0: 50539.9. Samples: 774431680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:22:22,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 13:22:23,489][49750] Updated weights for policy 0, policy_version 184431 (0.0037) [2024-04-26 13:22:27,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 3021864960. Throughput: 0: 50359.3. Samples: 774729360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:22:27,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 13:22:27,089][49750] Updated weights for policy 0, policy_version 184441 (0.0032) [2024-04-26 13:22:30,000][49750] Updated weights for policy 0, policy_version 184451 (0.0029) [2024-04-26 13:22:32,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 3022143488. Throughput: 0: 50466.3. Samples: 775034820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:22:32,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 13:22:33,487][49750] Updated weights for policy 0, policy_version 184461 (0.0037) [2024-04-26 13:22:36,430][49750] Updated weights for policy 0, policy_version 184471 (0.0029) [2024-04-26 13:22:37,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50485.0). Total num frames: 3022389248. Throughput: 0: 50455.7. Samples: 775188260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:22:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 13:22:39,921][49750] Updated weights for policy 0, policy_version 184481 (0.0032) [2024-04-26 13:22:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3022635008. Throughput: 0: 50567.1. Samples: 775492600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:22:42,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 13:22:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000184487_3022635008.pth... [2024-04-26 13:22:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000183748_3010527232.pth [2024-04-26 13:22:42,864][49750] Updated weights for policy 0, policy_version 184491 (0.0028) [2024-04-26 13:22:46,523][49750] Updated weights for policy 0, policy_version 184501 (0.0034) [2024-04-26 13:22:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.5, 300 sec: 50373.9). 
Total num frames: 3022880768. Throughput: 0: 50579.8. Samples: 775796520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:22:47,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:22:47,295][49728] Signal inference workers to stop experience collection... (11600 times) [2024-04-26 13:22:47,334][49750] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-04-26 13:22:47,399][49728] Signal inference workers to resume experience collection... (11600 times) [2024-04-26 13:22:47,399][49750] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-04-26 13:22:49,411][49750] Updated weights for policy 0, policy_version 184511 (0.0026) [2024-04-26 13:22:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.3, 300 sec: 50429.4). Total num frames: 3023126528. Throughput: 0: 50472.5. Samples: 775938460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:22:52,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 13:22:53,221][49750] Updated weights for policy 0, policy_version 184521 (0.0036) [2024-04-26 13:22:55,832][49750] Updated weights for policy 0, policy_version 184531 (0.0030) [2024-04-26 13:22:57,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 3023405056. Throughput: 0: 50373.7. Samples: 776245080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:22:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 13:22:59,660][49750] Updated weights for policy 0, policy_version 184541 (0.0036) [2024-04-26 13:23:02,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 3023650816. Throughput: 0: 50493.9. Samples: 776551700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:23:02,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 13:23:02,289][49750] Updated weights for policy 0, policy_version 184551 (0.0036) [2024-04-26 13:23:06,030][49750] Updated weights for policy 0, policy_version 184561 (0.0030) [2024-04-26 13:23:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 3023912960. Throughput: 0: 50346.7. Samples: 776697280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:23:07,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 13:23:08,803][49750] Updated weights for policy 0, policy_version 184571 (0.0031) [2024-04-26 13:23:12,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3024142336. Throughput: 0: 50511.4. Samples: 777002380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:23:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 13:23:12,675][49750] Updated weights for policy 0, policy_version 184581 (0.0029) [2024-04-26 13:23:15,278][49750] Updated weights for policy 0, policy_version 184591 (0.0032) [2024-04-26 13:23:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 3024404480. Throughput: 0: 50404.6. Samples: 777303020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:23:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 13:23:19,180][49750] Updated weights for policy 0, policy_version 184601 (0.0030) [2024-04-26 13:23:21,649][49750] Updated weights for policy 0, policy_version 184611 (0.0031) [2024-04-26 13:23:22,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 3024666624. Throughput: 0: 50400.7. Samples: 777456300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 13:23:22,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 13:23:25,607][49750] Updated weights for policy 0, policy_version 184621 (0.0035) [2024-04-26 13:23:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 3024912384. Throughput: 0: 50523.6. Samples: 777766160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:23:27,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 13:23:28,100][49750] Updated weights for policy 0, policy_version 184631 (0.0031) [2024-04-26 13:23:32,063][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 3025141760. Throughput: 0: 50555.4. Samples: 778071520. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:23:32,063][49517] Avg episode reward: [(0, '0.467')] [2024-04-26 13:23:32,135][49750] Updated weights for policy 0, policy_version 184641 (0.0034) [2024-04-26 13:23:34,743][49750] Updated weights for policy 0, policy_version 184651 (0.0031) [2024-04-26 13:23:37,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50517.1, 300 sec: 50484.9). Total num frames: 3025420288. Throughput: 0: 50547.7. Samples: 778213120. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:23:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 13:23:38,558][49750] Updated weights for policy 0, policy_version 184661 (0.0030) [2024-04-26 13:23:41,087][49750] Updated weights for policy 0, policy_version 184671 (0.0029) [2024-04-26 13:23:42,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3025666048. Throughput: 0: 50493.9. Samples: 778517300. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:23:42,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 13:23:45,070][49750] Updated weights for policy 0, policy_version 184681 (0.0032) [2024-04-26 13:23:46,299][49728] Signal inference workers to stop experience collection... (11650 times) [2024-04-26 13:23:46,299][49728] Signal inference workers to resume experience collection... (11650 times) [2024-04-26 13:23:46,327][49750] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-04-26 13:23:46,328][49750] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-04-26 13:23:47,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.2, 300 sec: 50484.9). Total num frames: 3025928192. Throughput: 0: 50311.8. Samples: 778815740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:23:47,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 13:23:47,553][49750] Updated weights for policy 0, policy_version 184691 (0.0035) [2024-04-26 13:23:51,757][49750] Updated weights for policy 0, policy_version 184701 (0.0034) [2024-04-26 13:23:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3026157568. Throughput: 0: 50452.9. Samples: 778967660. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:23:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 13:23:54,131][49750] Updated weights for policy 0, policy_version 184711 (0.0031) [2024-04-26 13:23:57,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 50484.9). Total num frames: 3026403328. Throughput: 0: 50469.7. Samples: 779273520. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:23:57,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 13:23:58,178][49750] Updated weights for policy 0, policy_version 184721 (0.0034) [2024-04-26 13:24:00,903][49750] Updated weights for policy 0, policy_version 184731 (0.0030) [2024-04-26 13:24:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.1, 300 sec: 50429.4). Total num frames: 3026665472. Throughput: 0: 50392.7. Samples: 779570700. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:24:02,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 13:24:04,552][49750] Updated weights for policy 0, policy_version 184741 (0.0028) [2024-04-26 13:24:07,063][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 3026944000. Throughput: 0: 50493.3. Samples: 779728500. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:24:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 13:24:07,437][49750] Updated weights for policy 0, policy_version 184751 (0.0032) [2024-04-26 13:24:11,133][49750] Updated weights for policy 0, policy_version 184761 (0.0036) [2024-04-26 13:24:12,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3027173376. Throughput: 0: 50428.4. Samples: 780035440. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:24:12,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:24:13,976][49750] Updated weights for policy 0, policy_version 184771 (0.0031) [2024-04-26 13:24:17,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3027419136. Throughput: 0: 50275.7. Samples: 780333920. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:24:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 13:24:17,675][49750] Updated weights for policy 0, policy_version 184781 (0.0030) [2024-04-26 13:24:20,432][49750] Updated weights for policy 0, policy_version 184791 (0.0040) [2024-04-26 13:24:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.5, 300 sec: 50485.0). Total num frames: 3027697664. Throughput: 0: 50411.4. Samples: 780481620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:24:22,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 13:24:24,166][49750] Updated weights for policy 0, policy_version 184801 (0.0029) [2024-04-26 13:24:26,866][49750] Updated weights for policy 0, policy_version 184811 (0.0029) [2024-04-26 13:24:27,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 3027943424. Throughput: 0: 50208.3. Samples: 780776680. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 13:24:27,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 13:24:30,590][49750] Updated weights for policy 0, policy_version 184821 (0.0033) [2024-04-26 13:24:32,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 3028189184. Throughput: 0: 50304.0. Samples: 781079420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:24:32,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 13:24:33,468][49750] Updated weights for policy 0, policy_version 184831 (0.0034) [2024-04-26 13:24:37,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.4, 300 sec: 50373.8). Total num frames: 3028418560. Throughput: 0: 50434.7. Samples: 781237220. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:24:37,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 13:24:37,193][49750] Updated weights for policy 0, policy_version 184841 (0.0030) [2024-04-26 13:24:39,892][49750] Updated weights for policy 0, policy_version 184851 (0.0030) [2024-04-26 13:24:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 3028680704. Throughput: 0: 50273.9. Samples: 781535840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:24:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 13:24:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000184856_3028680704.pth... [2024-04-26 13:24:42,144][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000184119_3016605696.pth [2024-04-26 13:24:43,740][49750] Updated weights for policy 0, policy_version 184861 (0.0032) [2024-04-26 13:24:45,412][49728] Signal inference workers to stop experience collection... (11700 times) [2024-04-26 13:24:45,412][49728] Signal inference workers to resume experience collection... (11700 times) [2024-04-26 13:24:45,443][49750] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-04-26 13:24:45,443][49750] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-04-26 13:24:46,314][49750] Updated weights for policy 0, policy_version 184871 (0.0028) [2024-04-26 13:24:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.4, 300 sec: 50485.0). Total num frames: 3028942848. Throughput: 0: 50236.6. Samples: 781831340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:24:47,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 13:24:50,138][49750] Updated weights for policy 0, policy_version 184881 (0.0026) [2024-04-26 13:24:52,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 3029204992. Throughput: 0: 50450.6. Samples: 781998780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:24:52,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 13:24:52,694][49750] Updated weights for policy 0, policy_version 184891 (0.0033) [2024-04-26 13:24:56,623][49750] Updated weights for policy 0, policy_version 184901 (0.0034) [2024-04-26 13:24:57,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 3029434368. Throughput: 0: 50171.9. Samples: 782293180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:24:57,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 13:24:59,335][49750] Updated weights for policy 0, policy_version 184911 (0.0041) [2024-04-26 13:25:02,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3029680128. Throughput: 0: 50133.6. Samples: 782589940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:25:02,064][49517] Avg episode reward: [(0, '0.420')] [2024-04-26 13:25:03,239][49750] Updated weights for policy 0, policy_version 184921 (0.0031) [2024-04-26 13:25:05,817][49750] Updated weights for policy 0, policy_version 184931 (0.0038) [2024-04-26 13:25:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49698.2, 300 sec: 50318.4). Total num frames: 3029925888. Throughput: 0: 50251.5. Samples: 782742940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:25:07,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 13:25:09,696][49750] Updated weights for policy 0, policy_version 184941 (0.0032) [2024-04-26 13:25:12,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3030204416. Throughput: 0: 50330.2. Samples: 783041540. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:25:12,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 13:25:12,508][49750] Updated weights for policy 0, policy_version 184951 (0.0036) [2024-04-26 13:25:16,089][49750] Updated weights for policy 0, policy_version 184961 (0.0034) [2024-04-26 13:25:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 3030450176. Throughput: 0: 50340.9. Samples: 783344760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:25:17,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 13:25:19,416][49750] Updated weights for policy 0, policy_version 184971 (0.0031) [2024-04-26 13:25:22,063][49517] Fps is (10 sec: 47513.9, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 3030679552. Throughput: 0: 50128.4. Samples: 783493000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:25:22,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 13:25:22,687][49750] Updated weights for policy 0, policy_version 184981 (0.0035) [2024-04-26 13:25:26,036][49750] Updated weights for policy 0, policy_version 184991 (0.0031) [2024-04-26 13:25:27,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 3030941696. Throughput: 0: 50253.7. Samples: 783797260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:25:27,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 13:25:29,248][49750] Updated weights for policy 0, policy_version 185001 (0.0034) [2024-04-26 13:25:32,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3031203840. Throughput: 0: 50391.4. Samples: 784098960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 13:25:32,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 13:25:32,481][49750] Updated weights for policy 0, policy_version 185011 (0.0033) [2024-04-26 13:25:35,723][49750] Updated weights for policy 0, policy_version 185021 (0.0029) [2024-04-26 13:25:37,062][49517] Fps is (10 sec: 52430.2, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3031465984. Throughput: 0: 50138.1. Samples: 784254980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:25:37,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 13:25:38,835][49750] Updated weights for policy 0, policy_version 185031 (0.0028) [2024-04-26 13:25:42,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 3031711744. Throughput: 0: 50215.7. Samples: 784552880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:25:42,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 13:25:42,068][49750] Updated weights for policy 0, policy_version 185041 (0.0035) [2024-04-26 13:25:45,313][49750] Updated weights for policy 0, policy_version 185051 (0.0033) [2024-04-26 13:25:47,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3031957504. Throughput: 0: 50349.8. Samples: 784855680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:25:47,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 13:25:48,412][49728] Signal inference workers to stop experience collection... (11750 times) [2024-04-26 13:25:48,436][49750] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-04-26 13:25:48,521][49728] Signal inference workers to resume experience collection... 
(11750 times) [2024-04-26 13:25:48,521][49750] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-04-26 13:25:48,642][49750] Updated weights for policy 0, policy_version 185061 (0.0028) [2024-04-26 13:25:51,628][49750] Updated weights for policy 0, policy_version 185071 (0.0035) [2024-04-26 13:25:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.4, 300 sec: 50373.9). Total num frames: 3032203264. Throughput: 0: 50217.0. Samples: 785002700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:25:52,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 13:25:55,275][49750] Updated weights for policy 0, policy_version 185081 (0.0035) [2024-04-26 13:25:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 3032465408. Throughput: 0: 50437.5. Samples: 785311220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:25:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 13:25:58,175][49750] Updated weights for policy 0, policy_version 185091 (0.0032) [2024-04-26 13:26:01,747][49750] Updated weights for policy 0, policy_version 185101 (0.0034) [2024-04-26 13:26:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3032711168. Throughput: 0: 50569.4. Samples: 785620380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:26:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 13:26:04,635][49750] Updated weights for policy 0, policy_version 185111 (0.0036) [2024-04-26 13:26:07,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3032956928. Throughput: 0: 50475.1. Samples: 785764380. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:26:07,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 13:26:08,176][49750] Updated weights for policy 0, policy_version 185121 (0.0029) [2024-04-26 13:26:11,051][49750] Updated weights for policy 0, policy_version 185131 (0.0032) [2024-04-26 13:26:12,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3033235456. Throughput: 0: 50512.0. Samples: 786070300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:26:12,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 13:26:14,768][49750] Updated weights for policy 0, policy_version 185141 (0.0030) [2024-04-26 13:26:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.4, 300 sec: 50373.9). Total num frames: 3033448448. Throughput: 0: 50273.1. Samples: 786361240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:26:17,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 13:26:17,718][49750] Updated weights for policy 0, policy_version 185151 (0.0028) [2024-04-26 13:26:21,315][49750] Updated weights for policy 0, policy_version 185161 (0.0028) [2024-04-26 13:26:22,063][49517] Fps is (10 sec: 45875.2, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 3033694208. Throughput: 0: 50255.3. Samples: 786516480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:26:22,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 13:26:24,313][49750] Updated weights for policy 0, policy_version 185171 (0.0034) [2024-04-26 13:26:27,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 3033956352. Throughput: 0: 50404.3. Samples: 786821080. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:26:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 13:26:27,772][49750] Updated weights for policy 0, policy_version 185181 (0.0033) [2024-04-26 13:26:30,827][49750] Updated weights for policy 0, policy_version 185191 (0.0026) [2024-04-26 13:26:32,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 3034218496. Throughput: 0: 50331.5. Samples: 787120600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:26:32,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 13:26:34,230][49750] Updated weights for policy 0, policy_version 185201 (0.0033) [2024-04-26 13:26:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3034480640. Throughput: 0: 50541.7. Samples: 787277080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:26:37,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 13:26:37,249][49750] Updated weights for policy 0, policy_version 185211 (0.0035) [2024-04-26 13:26:40,879][49750] Updated weights for policy 0, policy_version 185221 (0.0028) [2024-04-26 13:26:42,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3034742784. Throughput: 0: 50300.4. Samples: 787574740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:26:42,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 13:26:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000185226_3034742784.pth... [2024-04-26 13:26:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000184487_3022635008.pth [2024-04-26 13:26:43,819][49750] Updated weights for policy 0, policy_version 185231 (0.0027) [2024-04-26 13:26:47,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49971.1, 300 sec: 50262.8). Total num frames: 3034955776. Throughput: 0: 50258.6. Samples: 787882020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:26:47,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 13:26:47,451][49750] Updated weights for policy 0, policy_version 185241 (0.0028) [2024-04-26 13:26:50,276][49750] Updated weights for policy 0, policy_version 185251 (0.0029) [2024-04-26 13:26:52,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 3035234304. Throughput: 0: 50262.5. Samples: 788026200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:26:52,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 13:26:53,927][49750] Updated weights for policy 0, policy_version 185261 (0.0032) [2024-04-26 13:26:56,821][49750] Updated weights for policy 0, policy_version 185271 (0.0035) [2024-04-26 13:26:57,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3035480064. Throughput: 0: 50130.4. Samples: 788326160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:26:57,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 13:27:00,479][49750] Updated weights for policy 0, policy_version 185281 (0.0030) [2024-04-26 13:27:02,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3035742208. Throughput: 0: 50392.4. Samples: 788628900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:02,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 13:27:03,129][49750] Updated weights for policy 0, policy_version 185291 (0.0030) [2024-04-26 13:27:07,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 3035955200. Throughput: 0: 50437.9. Samples: 788786180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 13:27:07,108][49750] Updated weights for policy 0, policy_version 185301 (0.0028) [2024-04-26 13:27:07,630][49728] Signal inference workers to stop experience collection... (11800 times) [2024-04-26 13:27:07,631][49728] Signal inference workers to resume experience collection... (11800 times) [2024-04-26 13:27:07,660][49750] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-04-26 13:27:07,660][49750] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-04-26 13:27:09,479][49750] Updated weights for policy 0, policy_version 185311 (0.0030) [2024-04-26 13:27:12,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 3036233728. Throughput: 0: 50320.4. Samples: 789085500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:12,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 13:27:13,505][49750] Updated weights for policy 0, policy_version 185321 (0.0031) [2024-04-26 13:27:16,042][49750] Updated weights for policy 0, policy_version 185331 (0.0027) [2024-04-26 13:27:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 3036479488. Throughput: 0: 50339.7. Samples: 789385880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:17,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 13:27:19,884][49750] Updated weights for policy 0, policy_version 185341 (0.0029) [2024-04-26 13:27:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3036741632. Throughput: 0: 50486.7. Samples: 789548980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:22,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 13:27:22,789][49750] Updated weights for policy 0, policy_version 185351 (0.0029) [2024-04-26 13:27:26,465][49750] Updated weights for policy 0, policy_version 185361 (0.0037) [2024-04-26 13:27:27,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 3037003776. Throughput: 0: 50505.7. Samples: 789847500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:27,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 13:27:29,255][49750] Updated weights for policy 0, policy_version 185371 (0.0032) [2024-04-26 13:27:32,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3037233152. Throughput: 0: 50295.2. Samples: 790145300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:32,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 13:27:32,970][49750] Updated weights for policy 0, policy_version 185381 (0.0029) [2024-04-26 13:27:35,596][49750] Updated weights for policy 0, policy_version 185391 (0.0037) [2024-04-26 13:27:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3037511680. Throughput: 0: 50527.7. Samples: 790299940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:37,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 13:27:39,490][49750] Updated weights for policy 0, policy_version 185401 (0.0033) [2024-04-26 13:27:42,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.1, 300 sec: 50429.3). Total num frames: 3037757440. Throughput: 0: 50312.6. Samples: 790590240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:42,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 13:27:42,237][49750] Updated weights for policy 0, policy_version 185411 (0.0036) [2024-04-26 13:27:45,875][49750] Updated weights for policy 0, policy_version 185421 (0.0029) [2024-04-26 13:27:47,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.7, 300 sec: 50484.9). Total num frames: 3038019584. Throughput: 0: 50462.3. Samples: 790899700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 13:27:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 13:27:48,760][49750] Updated weights for policy 0, policy_version 185431 (0.0031) [2024-04-26 13:27:52,063][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 3038232576. Throughput: 0: 50475.9. Samples: 791057600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:27:52,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 13:27:52,468][49750] Updated weights for policy 0, policy_version 185441 (0.0029) [2024-04-26 13:27:55,173][49750] Updated weights for policy 0, policy_version 185451 (0.0035) [2024-04-26 13:27:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3038511104. Throughput: 0: 50442.4. Samples: 791355400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:27:57,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 13:27:58,945][49750] Updated weights for policy 0, policy_version 185461 (0.0028) [2024-04-26 13:28:01,477][49750] Updated weights for policy 0, policy_version 185471 (0.0029) [2024-04-26 13:28:02,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 3038773248. Throughput: 0: 50450.1. Samples: 791656140. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:02,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 13:28:05,296][49728] Signal inference workers to stop experience collection... (11850 times) [2024-04-26 13:28:05,328][49750] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-04-26 13:28:05,360][49728] Signal inference workers to resume experience collection... (11850 times) [2024-04-26 13:28:05,361][49750] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-04-26 13:28:05,362][49750] Updated weights for policy 0, policy_version 185481 (0.0031) [2024-04-26 13:28:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 3039019008. Throughput: 0: 50505.4. Samples: 791821720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:07,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 13:28:07,937][49750] Updated weights for policy 0, policy_version 185491 (0.0030) [2024-04-26 13:28:11,772][49750] Updated weights for policy 0, policy_version 185501 (0.0031) [2024-04-26 13:28:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 3039264768. Throughput: 0: 50618.3. Samples: 792125320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:12,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 13:28:14,385][49750] Updated weights for policy 0, policy_version 185511 (0.0040) [2024-04-26 13:28:17,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 3039510528. Throughput: 0: 50709.4. Samples: 792427220. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:17,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 13:28:18,287][49750] Updated weights for policy 0, policy_version 185521 (0.0030) [2024-04-26 13:28:20,941][49750] Updated weights for policy 0, policy_version 185531 (0.0031) [2024-04-26 13:28:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3039789056. Throughput: 0: 50516.2. Samples: 792573160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:22,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 13:28:24,732][49750] Updated weights for policy 0, policy_version 185541 (0.0036) [2024-04-26 13:28:27,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3040034816. Throughput: 0: 50697.1. Samples: 792871600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:27,063][49517] Avg episode reward: [(0, '0.452')] [2024-04-26 13:28:27,452][49750] Updated weights for policy 0, policy_version 185551 (0.0028) [2024-04-26 13:28:31,122][49750] Updated weights for policy 0, policy_version 185561 (0.0033) [2024-04-26 13:28:32,062][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 3040296960. Throughput: 0: 50699.9. Samples: 793181200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:32,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 13:28:33,764][49750] Updated weights for policy 0, policy_version 185571 (0.0034) [2024-04-26 13:28:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 3040526336. Throughput: 0: 50597.4. Samples: 793334480. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:37,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 13:28:37,554][49750] Updated weights for policy 0, policy_version 185581 (0.0031) [2024-04-26 13:28:40,284][49750] Updated weights for policy 0, policy_version 185591 (0.0032) [2024-04-26 13:28:42,063][49517] Fps is (10 sec: 47509.2, 60 sec: 50243.6, 300 sec: 50318.2). Total num frames: 3040772096. Throughput: 0: 50581.9. Samples: 793631640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:42,064][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 13:28:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000185594_3040772096.pth... [2024-04-26 13:28:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000184856_3028680704.pth [2024-04-26 13:28:44,092][49750] Updated weights for policy 0, policy_version 185601 (0.0035) [2024-04-26 13:28:46,873][49750] Updated weights for policy 0, policy_version 185611 (0.0035) [2024-04-26 13:28:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 3041050624. Throughput: 0: 50631.1. Samples: 793934540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:47,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 13:28:50,558][49750] Updated weights for policy 0, policy_version 185621 (0.0037) [2024-04-26 13:28:52,062][49517] Fps is (10 sec: 52434.1, 60 sec: 51063.6, 300 sec: 50485.0). Total num frames: 3041296384. Throughput: 0: 50484.0. Samples: 794093500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 13:28:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 13:28:53,240][49750] Updated weights for policy 0, policy_version 185631 (0.0032) [2024-04-26 13:28:56,992][49750] Updated weights for policy 0, policy_version 185641 (0.0027) [2024-04-26 13:28:57,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50429.4). 
Total num frames: 3041542144. Throughput: 0: 50508.3. Samples: 794398200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:28:57,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 13:28:59,666][49750] Updated weights for policy 0, policy_version 185651 (0.0036) [2024-04-26 13:29:02,062][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 3041771520. Throughput: 0: 50418.6. Samples: 794696060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:02,063][49517] Avg episode reward: [(0, '0.441')] [2024-04-26 13:29:03,584][49750] Updated weights for policy 0, policy_version 185661 (0.0027) [2024-04-26 13:29:06,298][49750] Updated weights for policy 0, policy_version 185671 (0.0032) [2024-04-26 13:29:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3042050048. Throughput: 0: 50516.8. Samples: 794846420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 13:29:10,055][49750] Updated weights for policy 0, policy_version 185681 (0.0029) [2024-04-26 13:29:12,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 3042312192. Throughput: 0: 50660.5. Samples: 795151320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:12,063][49517] Avg episode reward: [(0, '0.460')] [2024-04-26 13:29:12,773][49750] Updated weights for policy 0, policy_version 185691 (0.0033) [2024-04-26 13:29:16,314][49728] Signal inference workers to stop experience collection... (11900 times) [2024-04-26 13:29:16,356][49750] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-04-26 13:29:16,417][49728] Signal inference workers to resume experience collection... 
(11900 times) [2024-04-26 13:29:16,417][49750] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-04-26 13:29:16,564][49750] Updated weights for policy 0, policy_version 185701 (0.0027) [2024-04-26 13:29:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50373.8). Total num frames: 3042557952. Throughput: 0: 50493.8. Samples: 795453420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:17,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 13:29:19,273][49750] Updated weights for policy 0, policy_version 185711 (0.0029) [2024-04-26 13:29:22,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 3042787328. Throughput: 0: 50285.4. Samples: 795597320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:22,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 13:29:22,895][49750] Updated weights for policy 0, policy_version 185721 (0.0031) [2024-04-26 13:29:25,845][49750] Updated weights for policy 0, policy_version 185731 (0.0032) [2024-04-26 13:29:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3043065856. Throughput: 0: 50570.9. Samples: 795907280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 13:29:29,374][49750] Updated weights for policy 0, policy_version 185741 (0.0033) [2024-04-26 13:29:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 3043311616. Throughput: 0: 50523.6. Samples: 796208100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:32,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 13:29:32,413][49750] Updated weights for policy 0, policy_version 185751 (0.0031) [2024-04-26 13:29:35,922][49750] Updated weights for policy 0, policy_version 185761 (0.0033) [2024-04-26 13:29:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 3043590144. Throughput: 0: 50490.6. Samples: 796365580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:37,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 13:29:38,947][49750] Updated weights for policy 0, policy_version 185771 (0.0031) [2024-04-26 13:29:42,063][49517] Fps is (10 sec: 49150.5, 60 sec: 50517.9, 300 sec: 50373.8). Total num frames: 3043803136. Throughput: 0: 50556.3. Samples: 796673240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:42,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 13:29:42,494][49750] Updated weights for policy 0, policy_version 185781 (0.0032) [2024-04-26 13:29:45,347][49750] Updated weights for policy 0, policy_version 185791 (0.0036) [2024-04-26 13:29:47,063][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 3044048896. Throughput: 0: 50643.0. Samples: 796975000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 13:29:48,857][49750] Updated weights for policy 0, policy_version 185801 (0.0036) [2024-04-26 13:29:51,710][49750] Updated weights for policy 0, policy_version 185811 (0.0033) [2024-04-26 13:29:52,062][49517] Fps is (10 sec: 52430.4, 60 sec: 50517.3, 300 sec: 50485.0). Total num frames: 3044327424. Throughput: 0: 50514.2. Samples: 797119560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:52,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 13:29:55,409][49750] Updated weights for policy 0, policy_version 185821 (0.0030) [2024-04-26 13:29:57,063][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 3044589568. Throughput: 0: 50595.9. Samples: 797428140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 13:29:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 13:29:58,086][49750] Updated weights for policy 0, policy_version 185831 (0.0031) [2024-04-26 13:30:01,897][49750] Updated weights for policy 0, policy_version 185841 (0.0029) [2024-04-26 13:30:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 3044835328. Throughput: 0: 50816.9. Samples: 797740180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:02,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 13:30:04,573][49750] Updated weights for policy 0, policy_version 185851 (0.0033) [2024-04-26 13:30:07,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3045064704. Throughput: 0: 50658.7. Samples: 797876960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:07,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 13:30:08,436][49750] Updated weights for policy 0, policy_version 185861 (0.0029) [2024-04-26 13:30:10,922][49750] Updated weights for policy 0, policy_version 185871 (0.0035) [2024-04-26 13:30:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 3045343232. Throughput: 0: 50441.1. Samples: 798177140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:12,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 13:30:14,922][49750] Updated weights for policy 0, policy_version 185881 (0.0027) [2024-04-26 13:30:17,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3045605376. Throughput: 0: 50654.2. Samples: 798487540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:17,063][49517] Avg episode reward: [(0, '0.692')] [2024-04-26 13:30:17,305][49750] Updated weights for policy 0, policy_version 185891 (0.0033) [2024-04-26 13:30:21,385][49750] Updated weights for policy 0, policy_version 185901 (0.0027) [2024-04-26 13:30:22,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 3045851136. Throughput: 0: 50584.5. Samples: 798641880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:22,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 13:30:23,872][49750] Updated weights for policy 0, policy_version 185911 (0.0034) [2024-04-26 13:30:27,062][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 3046064128. Throughput: 0: 50548.4. Samples: 798947900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:27,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 13:30:27,872][49750] Updated weights for policy 0, policy_version 185921 (0.0032) [2024-04-26 13:30:30,757][49750] Updated weights for policy 0, policy_version 185931 (0.0033) [2024-04-26 13:30:32,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 3046326272. Throughput: 0: 50425.8. Samples: 799244160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 13:30:34,284][49750] Updated weights for policy 0, policy_version 185941 (0.0030) [2024-04-26 13:30:37,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50244.3, 300 sec: 50484.9). 
Total num frames: 3046604800. Throughput: 0: 50712.5. Samples: 799401620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:37,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 13:30:37,075][49750] Updated weights for policy 0, policy_version 185951 (0.0032) [2024-04-26 13:30:40,788][49750] Updated weights for policy 0, policy_version 185961 (0.0031) [2024-04-26 13:30:42,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.7, 300 sec: 50540.5). Total num frames: 3046866944. Throughput: 0: 50636.6. Samples: 799706780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:42,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 13:30:42,185][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000185967_3046883328.pth... [2024-04-26 13:30:42,236][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000185226_3034742784.pth [2024-04-26 13:30:43,429][49750] Updated weights for policy 0, policy_version 185971 (0.0031) [2024-04-26 13:30:45,068][49728] Signal inference workers to stop experience collection... (11950 times) [2024-04-26 13:30:45,069][49728] Signal inference workers to resume experience collection... (11950 times) [2024-04-26 13:30:45,096][49750] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-04-26 13:30:45,096][49750] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-04-26 13:30:47,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3047079936. Throughput: 0: 50424.0. Samples: 800009260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:47,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 13:30:47,301][49750] Updated weights for policy 0, policy_version 185981 (0.0036) [2024-04-26 13:30:49,925][49750] Updated weights for policy 0, policy_version 185991 (0.0029) [2024-04-26 13:30:52,063][49517] Fps is (10 sec: 45874.8, 60 sec: 49971.1, 300 sec: 50373.8). 
Total num frames: 3047325696. Throughput: 0: 50421.2. Samples: 800145920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:52,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 13:30:53,782][49750] Updated weights for policy 0, policy_version 186001 (0.0033) [2024-04-26 13:30:56,427][49750] Updated weights for policy 0, policy_version 186011 (0.0033) [2024-04-26 13:30:57,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50244.3, 300 sec: 50485.0). Total num frames: 3047604224. Throughput: 0: 50406.8. Samples: 800445440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:30:57,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 13:31:00,336][49750] Updated weights for policy 0, policy_version 186021 (0.0031) [2024-04-26 13:31:02,062][49517] Fps is (10 sec: 55706.2, 60 sec: 50790.5, 300 sec: 50596.0). Total num frames: 3047882752. Throughput: 0: 50320.0. Samples: 800751940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 13:31:02,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 13:31:02,734][49750] Updated weights for policy 0, policy_version 186031 (0.0032) [2024-04-26 13:31:06,802][49750] Updated weights for policy 0, policy_version 186041 (0.0027) [2024-04-26 13:31:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3048112128. Throughput: 0: 50611.8. Samples: 800919420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 13:31:09,072][49750] Updated weights for policy 0, policy_version 186051 (0.0031) [2024-04-26 13:31:12,063][49517] Fps is (10 sec: 45874.6, 60 sec: 49971.2, 300 sec: 50484.9). Total num frames: 3048341504. Throughput: 0: 50329.2. Samples: 801212720. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:12,063][49517] Avg episode reward: [(0, '0.457')] [2024-04-26 13:31:13,376][49750] Updated weights for policy 0, policy_version 186061 (0.0033) [2024-04-26 13:31:15,587][49750] Updated weights for policy 0, policy_version 186071 (0.0031) [2024-04-26 13:31:17,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.1, 300 sec: 50540.5). Total num frames: 3048603648. Throughput: 0: 50281.7. Samples: 801506840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:17,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 13:31:19,953][49750] Updated weights for policy 0, policy_version 186081 (0.0032) [2024-04-26 13:31:22,063][49517] Fps is (10 sec: 55705.5, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3048898560. Throughput: 0: 50444.7. Samples: 801671640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:22,072][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 13:31:22,528][49750] Updated weights for policy 0, policy_version 186091 (0.0037) [2024-04-26 13:31:26,360][49750] Updated weights for policy 0, policy_version 186101 (0.0032) [2024-04-26 13:31:27,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 3049127936. Throughput: 0: 50317.3. Samples: 801971060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:27,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 13:31:29,168][49750] Updated weights for policy 0, policy_version 186111 (0.0035) [2024-04-26 13:31:32,063][49517] Fps is (10 sec: 45875.2, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3049357312. Throughput: 0: 50374.2. Samples: 802276100. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:32,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 13:31:32,873][49750] Updated weights for policy 0, policy_version 186121 (0.0030) [2024-04-26 13:31:35,592][49750] Updated weights for policy 0, policy_version 186131 (0.0024) [2024-04-26 13:31:37,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 3049586688. Throughput: 0: 50338.4. Samples: 802411140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:37,063][49517] Avg episode reward: [(0, '0.431')] [2024-04-26 13:31:39,344][49750] Updated weights for policy 0, policy_version 186141 (0.0032) [2024-04-26 13:31:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 49971.2, 300 sec: 50540.5). Total num frames: 3049865216. Throughput: 0: 50439.6. Samples: 802715220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:42,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 13:31:42,099][49728] Signal inference workers to stop experience collection... (12000 times) [2024-04-26 13:31:42,100][49728] Signal inference workers to resume experience collection... (12000 times) [2024-04-26 13:31:42,112][49750] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-04-26 13:31:42,112][49750] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-04-26 13:31:42,232][49750] Updated weights for policy 0, policy_version 186151 (0.0026) [2024-04-26 13:31:45,871][49750] Updated weights for policy 0, policy_version 186161 (0.0029) [2024-04-26 13:31:47,062][49517] Fps is (10 sec: 55705.2, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 3050143744. Throughput: 0: 50450.6. Samples: 803022220. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 13:31:48,791][49750] Updated weights for policy 0, policy_version 186171 (0.0031) [2024-04-26 13:31:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 3050356736. Throughput: 0: 50143.8. Samples: 803175880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:52,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 13:31:52,303][49750] Updated weights for policy 0, policy_version 186181 (0.0034) [2024-04-26 13:31:55,210][49750] Updated weights for policy 0, policy_version 186191 (0.0032) [2024-04-26 13:31:57,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 3050602496. Throughput: 0: 50263.2. Samples: 803474560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:31:57,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 13:31:58,950][49750] Updated weights for policy 0, policy_version 186201 (0.0034) [2024-04-26 13:32:01,680][49750] Updated weights for policy 0, policy_version 186211 (0.0030) [2024-04-26 13:32:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 49971.1, 300 sec: 50596.0). Total num frames: 3050881024. Throughput: 0: 50399.2. Samples: 803774800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:32:02,072][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 13:32:05,435][49750] Updated weights for policy 0, policy_version 186221 (0.0036) [2024-04-26 13:32:07,062][49517] Fps is (10 sec: 55705.7, 60 sec: 50790.5, 300 sec: 50596.0). Total num frames: 3051159552. Throughput: 0: 50243.3. Samples: 803932580. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 13:32:07,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 13:32:08,114][49750] Updated weights for policy 0, policy_version 186231 (0.0032) [2024-04-26 13:32:12,060][49750] Updated weights for policy 0, policy_version 186241 (0.0041) [2024-04-26 13:32:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 3051372544. Throughput: 0: 50245.2. Samples: 804232100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:12,063][49517] Avg episode reward: [(0, '0.410')] [2024-04-26 13:32:14,556][49750] Updated weights for policy 0, policy_version 186251 (0.0026) [2024-04-26 13:32:17,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3051618304. Throughput: 0: 50308.4. Samples: 804539980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:17,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 13:32:18,447][49750] Updated weights for policy 0, policy_version 186261 (0.0030) [2024-04-26 13:32:21,131][49750] Updated weights for policy 0, policy_version 186271 (0.0032) [2024-04-26 13:32:22,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49425.1, 300 sec: 50373.8). Total num frames: 3051864064. Throughput: 0: 50285.2. Samples: 804673980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:22,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 13:32:24,917][49750] Updated weights for policy 0, policy_version 186281 (0.0032) [2024-04-26 13:32:27,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50244.2, 300 sec: 50540.5). Total num frames: 3052142592. Throughput: 0: 50313.7. Samples: 804979340. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 13:32:27,681][49750] Updated weights for policy 0, policy_version 186291 (0.0029) [2024-04-26 13:32:31,414][49750] Updated weights for policy 0, policy_version 186301 (0.0033) [2024-04-26 13:32:32,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3052388352. Throughput: 0: 50263.9. Samples: 805284100. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:32,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 13:32:33,921][49728] Signal inference workers to stop experience collection... (12050 times) [2024-04-26 13:32:33,973][49750] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-04-26 13:32:33,992][49728] Signal inference workers to resume experience collection... (12050 times) [2024-04-26 13:32:33,994][49750] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-04-26 13:32:34,125][49750] Updated weights for policy 0, policy_version 186311 (0.0035) [2024-04-26 13:32:37,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3052634112. Throughput: 0: 50078.0. Samples: 805429400. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:37,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 13:32:37,933][49750] Updated weights for policy 0, policy_version 186321 (0.0027) [2024-04-26 13:32:40,639][49750] Updated weights for policy 0, policy_version 186331 (0.0030) [2024-04-26 13:32:42,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 3052863488. Throughput: 0: 50217.8. Samples: 805734360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:42,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 13:32:42,211][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000186333_3052879872.pth... 
[2024-04-26 13:32:42,256][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000185594_3040772096.pth [2024-04-26 13:32:44,353][49750] Updated weights for policy 0, policy_version 186341 (0.0032) [2024-04-26 13:32:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.1, 300 sec: 50596.0). Total num frames: 3053158400. Throughput: 0: 50355.9. Samples: 806040820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:47,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 13:32:47,328][49750] Updated weights for policy 0, policy_version 186351 (0.0029) [2024-04-26 13:32:50,779][49750] Updated weights for policy 0, policy_version 186361 (0.0031) [2024-04-26 13:32:52,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 3053420544. Throughput: 0: 50283.1. Samples: 806195320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:52,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 13:32:54,089][49750] Updated weights for policy 0, policy_version 186371 (0.0033) [2024-04-26 13:32:57,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 3053633536. Throughput: 0: 50359.9. Samples: 806498300. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:32:57,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 13:32:57,405][49750] Updated weights for policy 0, policy_version 186381 (0.0033) [2024-04-26 13:33:00,620][49750] Updated weights for policy 0, policy_version 186391 (0.0033) [2024-04-26 13:33:02,063][49517] Fps is (10 sec: 45873.9, 60 sec: 49971.0, 300 sec: 50373.8). Total num frames: 3053879296. Throughput: 0: 50287.8. Samples: 806802940. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:33:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 13:33:03,923][49750] Updated weights for policy 0, policy_version 186401 (0.0036) [2024-04-26 13:33:07,063][49517] Fps is (10 sec: 50790.8, 60 sec: 49698.1, 300 sec: 50429.4). Total num frames: 3054141440. Throughput: 0: 50590.2. Samples: 806950540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:33:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 13:33:07,128][49750] Updated weights for policy 0, policy_version 186411 (0.0030) [2024-04-26 13:33:10,283][49750] Updated weights for policy 0, policy_version 186421 (0.0041) [2024-04-26 13:33:12,062][49517] Fps is (10 sec: 54068.3, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 3054419968. Throughput: 0: 50399.1. Samples: 807247300. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:33:12,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 13:33:13,745][49750] Updated weights for policy 0, policy_version 186431 (0.0027) [2024-04-26 13:33:16,693][49750] Updated weights for policy 0, policy_version 186441 (0.0032) [2024-04-26 13:33:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3054665728. Throughput: 0: 50469.8. Samples: 807555240. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-26 13:33:17,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 13:33:20,389][49750] Updated weights for policy 0, policy_version 186451 (0.0035) [2024-04-26 13:33:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3054911488. Throughput: 0: 50504.6. Samples: 807702100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:33:22,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 13:33:23,198][49750] Updated weights for policy 0, policy_version 186461 (0.0034) [2024-04-26 13:33:26,834][49750] Updated weights for policy 0, policy_version 186471 (0.0030) [2024-04-26 13:33:27,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 3055140864. Throughput: 0: 50416.4. Samples: 808003100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:33:27,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 13:33:29,730][49750] Updated weights for policy 0, policy_version 186481 (0.0039) [2024-04-26 13:33:32,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3055419392. Throughput: 0: 50318.3. Samples: 808305140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:33:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 13:33:33,512][49750] Updated weights for policy 0, policy_version 186491 (0.0037) [2024-04-26 13:33:35,749][49728] Signal inference workers to stop experience collection... (12100 times) [2024-04-26 13:33:35,750][49728] Signal inference workers to resume experience collection... (12100 times) [2024-04-26 13:33:35,778][49750] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-04-26 13:33:35,779][49750] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-04-26 13:33:36,029][49750] Updated weights for policy 0, policy_version 186501 (0.0031) [2024-04-26 13:33:37,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50429.6). Total num frames: 3055648768. Throughput: 0: 50319.5. Samples: 808459700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:33:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 13:33:39,975][49750] Updated weights for policy 0, policy_version 186511 (0.0033) [2024-04-26 13:33:42,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 3055910912. Throughput: 0: 50481.4. Samples: 808769960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:33:42,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 13:33:42,539][49750] Updated weights for policy 0, policy_version 186521 (0.0033) [2024-04-26 13:33:46,361][49750] Updated weights for policy 0, policy_version 186531 (0.0037) [2024-04-26 13:33:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 3056173056. Throughput: 0: 50576.7. Samples: 809078880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:33:47,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 13:33:49,129][49750] Updated weights for policy 0, policy_version 186541 (0.0032) [2024-04-26 13:33:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 3056418816. Throughput: 0: 50509.4. Samples: 809223460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:33:52,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 13:33:52,801][49750] Updated weights for policy 0, policy_version 186551 (0.0029) [2024-04-26 13:33:55,822][49750] Updated weights for policy 0, policy_version 186561 (0.0032) [2024-04-26 13:33:57,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50596.0). Total num frames: 3056697344. Throughput: 0: 50525.8. Samples: 809520960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:33:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:33:59,336][49750] Updated weights for policy 0, policy_version 186571 (0.0032) [2024-04-26 13:34:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.6, 300 sec: 50429.4). Total num frames: 3056926720. Throughput: 0: 50464.5. Samples: 809826140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:34:02,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 13:34:02,298][49750] Updated weights for policy 0, policy_version 186581 (0.0036) [2024-04-26 13:34:05,805][49750] Updated weights for policy 0, policy_version 186591 (0.0036) [2024-04-26 13:34:07,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3057188864. Throughput: 0: 50543.1. Samples: 809976540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:34:07,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 13:34:08,947][49750] Updated weights for policy 0, policy_version 186601 (0.0040) [2024-04-26 13:34:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 3057418240. Throughput: 0: 50449.8. Samples: 810273340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:34:12,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 13:34:12,218][49750] Updated weights for policy 0, policy_version 186611 (0.0028) [2024-04-26 13:34:15,555][49750] Updated weights for policy 0, policy_version 186621 (0.0028) [2024-04-26 13:34:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 3057680384. Throughput: 0: 50431.2. Samples: 810574540. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:34:17,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 13:34:18,707][49750] Updated weights for policy 0, policy_version 186631 (0.0029) [2024-04-26 13:34:21,921][49750] Updated weights for policy 0, policy_version 186641 (0.0039) [2024-04-26 13:34:22,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 3057926144. Throughput: 0: 50539.9. Samples: 810734000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 13:34:22,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 13:34:25,137][49750] Updated weights for policy 0, policy_version 186651 (0.0033) [2024-04-26 13:34:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.2, 300 sec: 50429.4). Total num frames: 3058188288. Throughput: 0: 50298.5. Samples: 811033400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:34:27,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 13:34:28,270][49750] Updated weights for policy 0, policy_version 186661 (0.0038) [2024-04-26 13:34:31,676][49750] Updated weights for policy 0, policy_version 186671 (0.0034) [2024-04-26 13:34:32,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 3058434048. Throughput: 0: 50314.5. Samples: 811343040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:34:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 13:34:34,869][49750] Updated weights for policy 0, policy_version 186681 (0.0032) [2024-04-26 13:34:37,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3058679808. Throughput: 0: 50411.6. Samples: 811491980. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:34:37,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 13:34:38,164][49750] Updated weights for policy 0, policy_version 186691 (0.0034) [2024-04-26 13:34:41,506][49750] Updated weights for policy 0, policy_version 186701 (0.0032) [2024-04-26 13:34:41,844][49728] Signal inference workers to stop experience collection... (12150 times) [2024-04-26 13:34:41,844][49728] Signal inference workers to resume experience collection... (12150 times) [2024-04-26 13:34:41,878][49750] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-04-26 13:34:41,878][49750] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-04-26 13:34:42,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 3058958336. Throughput: 0: 50331.5. Samples: 811785880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:34:42,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:34:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000186704_3058958336.pth... [2024-04-26 13:34:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000185967_3046883328.pth [2024-04-26 13:34:44,485][49750] Updated weights for policy 0, policy_version 186711 (0.0029) [2024-04-26 13:34:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3059204096. Throughput: 0: 50347.9. Samples: 812091800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:34:47,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 13:34:47,904][49750] Updated weights for policy 0, policy_version 186721 (0.0032) [2024-04-26 13:34:50,937][49750] Updated weights for policy 0, policy_version 186731 (0.0032) [2024-04-26 13:34:52,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3059449856. Throughput: 0: 50417.7. Samples: 812245340. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:34:52,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:34:54,348][49750] Updated weights for policy 0, policy_version 186741 (0.0034) [2024-04-26 13:34:57,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 3059679232. Throughput: 0: 50617.4. Samples: 812551120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:34:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 13:34:57,391][49750] Updated weights for policy 0, policy_version 186751 (0.0030) [2024-04-26 13:35:00,918][49750] Updated weights for policy 0, policy_version 186761 (0.0028) [2024-04-26 13:35:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 3059957760. Throughput: 0: 50526.4. Samples: 812848220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:35:02,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 13:35:03,940][49750] Updated weights for policy 0, policy_version 186771 (0.0028) [2024-04-26 13:35:07,062][49517] Fps is (10 sec: 50789.9, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 3060187136. Throughput: 0: 50465.8. Samples: 813004960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:35:07,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 13:35:07,306][49750] Updated weights for policy 0, policy_version 186781 (0.0029) [2024-04-26 13:35:10,377][49750] Updated weights for policy 0, policy_version 186791 (0.0036) [2024-04-26 13:35:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 3060465664. Throughput: 0: 50532.7. Samples: 813307360. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:35:12,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 13:35:13,831][49750] Updated weights for policy 0, policy_version 186801 (0.0037) [2024-04-26 13:35:16,759][49750] Updated weights for policy 0, policy_version 186811 (0.0030) [2024-04-26 13:35:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 3060711424. Throughput: 0: 50411.6. Samples: 813611560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:35:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 13:35:20,207][49750] Updated weights for policy 0, policy_version 186821 (0.0033) [2024-04-26 13:35:22,063][49517] Fps is (10 sec: 49150.5, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 3060957184. Throughput: 0: 50445.5. Samples: 813762040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:35:22,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 13:35:23,238][49750] Updated weights for policy 0, policy_version 186831 (0.0036) [2024-04-26 13:35:26,763][49750] Updated weights for policy 0, policy_version 186841 (0.0034) [2024-04-26 13:35:27,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50517.6, 300 sec: 50485.0). Total num frames: 3061219328. Throughput: 0: 50576.2. Samples: 814061800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 13:35:27,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 13:35:29,867][49750] Updated weights for policy 0, policy_version 186851 (0.0033) [2024-04-26 13:35:32,063][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 3061465088. Throughput: 0: 50411.9. Samples: 814360340. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:35:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 13:35:33,233][49750] Updated weights for policy 0, policy_version 186861 (0.0031) [2024-04-26 13:35:36,356][49750] Updated weights for policy 0, policy_version 186871 (0.0032) [2024-04-26 13:35:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 3061710848. Throughput: 0: 50526.8. Samples: 814519040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:35:37,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 13:35:39,689][49750] Updated weights for policy 0, policy_version 186881 (0.0029) [2024-04-26 13:35:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 3061972992. Throughput: 0: 50434.1. Samples: 814820660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:35:42,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 13:35:42,826][49750] Updated weights for policy 0, policy_version 186891 (0.0027) [2024-04-26 13:35:46,296][49750] Updated weights for policy 0, policy_version 186901 (0.0038) [2024-04-26 13:35:47,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3062235136. Throughput: 0: 50597.2. Samples: 815125100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:35:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 13:35:47,728][49728] Signal inference workers to stop experience collection... (12200 times) [2024-04-26 13:35:47,729][49728] Signal inference workers to resume experience collection... 
(12200 times) [2024-04-26 13:35:47,757][49750] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-04-26 13:35:47,757][49750] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-04-26 13:35:49,194][49750] Updated weights for policy 0, policy_version 186911 (0.0030) [2024-04-26 13:35:52,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3062480896. Throughput: 0: 50404.0. Samples: 815273140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:35:52,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 13:35:52,805][49750] Updated weights for policy 0, policy_version 186921 (0.0033) [2024-04-26 13:35:55,758][49750] Updated weights for policy 0, policy_version 186931 (0.0030) [2024-04-26 13:35:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 3062726656. Throughput: 0: 50403.6. Samples: 815575520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:35:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 13:35:59,211][49750] Updated weights for policy 0, policy_version 186941 (0.0030) [2024-04-26 13:36:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50373.9). Total num frames: 3062972416. Throughput: 0: 50341.0. Samples: 815876900. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:36:02,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 13:36:02,381][49750] Updated weights for policy 0, policy_version 186951 (0.0033) [2024-04-26 13:36:05,730][49750] Updated weights for policy 0, policy_version 186961 (0.0028) [2024-04-26 13:36:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3063218176. Throughput: 0: 50504.3. Samples: 816034720. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:36:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 13:36:08,755][49750] Updated weights for policy 0, policy_version 186971 (0.0033) [2024-04-26 13:36:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 3063463936. Throughput: 0: 50380.9. Samples: 816328940. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:36:12,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 13:36:12,241][49750] Updated weights for policy 0, policy_version 186981 (0.0033) [2024-04-26 13:36:15,398][49750] Updated weights for policy 0, policy_version 186991 (0.0037) [2024-04-26 13:36:17,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 3063742464. Throughput: 0: 50437.3. Samples: 816630020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:36:17,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 13:36:18,655][49750] Updated weights for policy 0, policy_version 187001 (0.0035) [2024-04-26 13:36:21,925][49750] Updated weights for policy 0, policy_version 187011 (0.0030) [2024-04-26 13:36:22,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.6, 300 sec: 50373.9). Total num frames: 3063988224. Throughput: 0: 50372.5. Samples: 816785800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:36:22,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 13:36:25,026][49750] Updated weights for policy 0, policy_version 187021 (0.0030) [2024-04-26 13:36:27,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.1, 300 sec: 50484.9). Total num frames: 3064250368. Throughput: 0: 50366.9. Samples: 817087180. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:36:27,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 13:36:28,368][49750] Updated weights for policy 0, policy_version 187031 (0.0031) [2024-04-26 13:36:31,485][49750] Updated weights for policy 0, policy_version 187041 (0.0033) [2024-04-26 13:36:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3064496128. Throughput: 0: 50404.4. Samples: 817393300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 13:36:32,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 13:36:34,793][49750] Updated weights for policy 0, policy_version 187051 (0.0032) [2024-04-26 13:36:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3064741888. Throughput: 0: 50509.3. Samples: 817546060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:36:37,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 13:36:37,969][49750] Updated weights for policy 0, policy_version 187061 (0.0032) [2024-04-26 13:36:41,316][49750] Updated weights for policy 0, policy_version 187071 (0.0041) [2024-04-26 13:36:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 3065004032. Throughput: 0: 50561.8. Samples: 817850800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:36:42,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 13:36:42,174][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000187074_3065020416.pth... [2024-04-26 13:36:42,218][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000186333_3052879872.pth [2024-04-26 13:36:44,512][49750] Updated weights for policy 0, policy_version 187081 (0.0028) [2024-04-26 13:36:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50540.4). Total num frames: 3065266176. Throughput: 0: 50445.7. Samples: 818146960. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:36:47,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 13:36:47,844][49728] Signal inference workers to stop experience collection... (12250 times) [2024-04-26 13:36:47,844][49750] Updated weights for policy 0, policy_version 187091 (0.0032) [2024-04-26 13:36:47,844][49728] Signal inference workers to resume experience collection... (12250 times) [2024-04-26 13:36:47,863][49750] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-04-26 13:36:47,863][49750] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-04-26 13:36:50,936][49750] Updated weights for policy 0, policy_version 187101 (0.0031) [2024-04-26 13:36:52,062][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 3065495552. Throughput: 0: 50497.3. Samples: 818307100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:36:52,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 13:36:54,181][49750] Updated weights for policy 0, policy_version 187111 (0.0035) [2024-04-26 13:36:57,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 3065724928. Throughput: 0: 50598.6. Samples: 818605880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:36:57,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 13:36:57,516][49750] Updated weights for policy 0, policy_version 187121 (0.0031) [2024-04-26 13:37:00,541][49750] Updated weights for policy 0, policy_version 187131 (0.0034) [2024-04-26 13:37:02,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 3066003456. Throughput: 0: 50522.0. Samples: 818903500. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:37:02,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 13:37:04,041][49750] Updated weights for policy 0, policy_version 187141 (0.0032) [2024-04-26 13:37:06,928][49750] Updated weights for policy 0, policy_version 187151 (0.0034) [2024-04-26 13:37:07,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 3066281984. Throughput: 0: 50649.7. Samples: 819065040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:37:07,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 13:37:10,575][49750] Updated weights for policy 0, policy_version 187161 (0.0033) [2024-04-26 13:37:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3066494976. Throughput: 0: 50603.8. Samples: 819364340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:37:12,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 13:37:13,299][49750] Updated weights for policy 0, policy_version 187171 (0.0035) [2024-04-26 13:37:16,949][49750] Updated weights for policy 0, policy_version 187181 (0.0033) [2024-04-26 13:37:17,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3066773504. Throughput: 0: 50594.0. Samples: 819670040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:37:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:37:19,848][49750] Updated weights for policy 0, policy_version 187191 (0.0030) [2024-04-26 13:37:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 3067019264. Throughput: 0: 50486.6. Samples: 819817960. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:37:22,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 13:37:23,354][49750] Updated weights for policy 0, policy_version 187201 (0.0030) [2024-04-26 13:37:26,622][49750] Updated weights for policy 0, policy_version 187211 (0.0030) [2024-04-26 13:37:27,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.5, 300 sec: 50484.9). Total num frames: 3067281408. Throughput: 0: 50543.0. Samples: 820125240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:37:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 13:37:29,837][49750] Updated weights for policy 0, policy_version 187221 (0.0033) [2024-04-26 13:37:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3067510784. Throughput: 0: 50692.8. Samples: 820428140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:37:32,063][49517] Avg episode reward: [(0, '0.451')] [2024-04-26 13:37:33,282][49750] Updated weights for policy 0, policy_version 187231 (0.0035) [2024-04-26 13:37:36,416][49750] Updated weights for policy 0, policy_version 187241 (0.0029) [2024-04-26 13:37:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3067772928. Throughput: 0: 50284.1. Samples: 820569880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-26 13:37:37,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 13:37:39,662][49750] Updated weights for policy 0, policy_version 187251 (0.0037) [2024-04-26 13:37:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 3068035072. Throughput: 0: 50509.7. Samples: 820878820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:37:42,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 13:37:42,969][49750] Updated weights for policy 0, policy_version 187261 (0.0029) [2024-04-26 13:37:45,014][49728] Signal inference workers to stop experience collection... 
(12300 times) [2024-04-26 13:37:45,056][49750] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-04-26 13:37:45,090][49728] Signal inference workers to resume experience collection... (12300 times) [2024-04-26 13:37:45,091][49750] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-04-26 13:37:46,188][49750] Updated weights for policy 0, policy_version 187271 (0.0036) [2024-04-26 13:37:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 50373.8). Total num frames: 3068280832. Throughput: 0: 50512.3. Samples: 821176560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:37:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 13:37:49,408][49750] Updated weights for policy 0, policy_version 187281 (0.0028) [2024-04-26 13:37:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 3068542976. Throughput: 0: 50292.0. Samples: 821328180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:37:52,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 13:37:52,600][49750] Updated weights for policy 0, policy_version 187291 (0.0024) [2024-04-26 13:37:55,932][49750] Updated weights for policy 0, policy_version 187301 (0.0037) [2024-04-26 13:37:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 3068788736. Throughput: 0: 50559.0. Samples: 821639500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:37:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 13:37:59,038][49750] Updated weights for policy 0, policy_version 187311 (0.0035) [2024-04-26 13:38:02,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3069018112. Throughput: 0: 50503.4. Samples: 821942680. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 13:38:02,533][49750] Updated weights for policy 0, policy_version 187321 (0.0028) [2024-04-26 13:38:05,570][49750] Updated weights for policy 0, policy_version 187331 (0.0030) [2024-04-26 13:38:07,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.1, 300 sec: 50373.9). Total num frames: 3069280256. Throughput: 0: 50465.3. Samples: 822088900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 13:38:08,930][49750] Updated weights for policy 0, policy_version 187341 (0.0030) [2024-04-26 13:38:11,974][49750] Updated weights for policy 0, policy_version 187351 (0.0029) [2024-04-26 13:38:12,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 3069558784. Throughput: 0: 50416.5. Samples: 822393980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:12,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 13:38:15,452][49750] Updated weights for policy 0, policy_version 187361 (0.0031) [2024-04-26 13:38:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 3069788160. Throughput: 0: 50447.7. Samples: 822698280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:17,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 13:38:18,604][49750] Updated weights for policy 0, policy_version 187371 (0.0033) [2024-04-26 13:38:21,900][49750] Updated weights for policy 0, policy_version 187381 (0.0030) [2024-04-26 13:38:22,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3070050304. Throughput: 0: 50470.1. Samples: 822841040. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:22,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 13:38:25,095][49750] Updated weights for policy 0, policy_version 187391 (0.0030) [2024-04-26 13:38:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3070296064. Throughput: 0: 50373.7. Samples: 823145640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:27,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 13:38:28,224][49750] Updated weights for policy 0, policy_version 187401 (0.0028) [2024-04-26 13:38:31,602][49750] Updated weights for policy 0, policy_version 187411 (0.0032) [2024-04-26 13:38:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.6, 300 sec: 50540.5). Total num frames: 3070558208. Throughput: 0: 50544.6. Samples: 823451060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:32,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 13:38:34,754][49750] Updated weights for policy 0, policy_version 187421 (0.0033) [2024-04-26 13:38:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3070787584. Throughput: 0: 50520.5. Samples: 823601600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:37,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 13:38:38,050][49750] Updated weights for policy 0, policy_version 187431 (0.0033) [2024-04-26 13:38:41,122][49750] Updated weights for policy 0, policy_version 187441 (0.0034) [2024-04-26 13:38:42,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3071049728. Throughput: 0: 50491.6. Samples: 823911620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 13:38:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000187443_3071066112.pth... 
[2024-04-26 13:38:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000186704_3058958336.pth [2024-04-26 13:38:44,614][49750] Updated weights for policy 0, policy_version 187451 (0.0031) [2024-04-26 13:38:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3071295488. Throughput: 0: 50400.9. Samples: 824210720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:38:47,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 13:38:47,630][49750] Updated weights for policy 0, policy_version 187461 (0.0029) [2024-04-26 13:38:50,926][49750] Updated weights for policy 0, policy_version 187471 (0.0037) [2024-04-26 13:38:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3071557632. Throughput: 0: 50575.7. Samples: 824364800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:38:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 13:38:54,204][49750] Updated weights for policy 0, policy_version 187481 (0.0031) [2024-04-26 13:38:57,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3071819776. Throughput: 0: 50366.0. Samples: 824660460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:38:57,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 13:38:57,327][49728] Signal inference workers to stop experience collection... (12350 times) [2024-04-26 13:38:57,327][49728] Signal inference workers to resume experience collection... 
(12350 times) [2024-04-26 13:38:57,357][49750] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-04-26 13:38:57,357][49750] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-04-26 13:38:57,460][49750] Updated weights for policy 0, policy_version 187491 (0.0029) [2024-04-26 13:39:00,652][49750] Updated weights for policy 0, policy_version 187501 (0.0030) [2024-04-26 13:39:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 3072065536. Throughput: 0: 50403.1. Samples: 824966420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:02,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 13:39:03,990][49750] Updated weights for policy 0, policy_version 187511 (0.0028) [2024-04-26 13:39:07,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 3072327680. Throughput: 0: 50439.2. Samples: 825110800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:07,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 13:39:07,329][49750] Updated weights for policy 0, policy_version 187521 (0.0035) [2024-04-26 13:39:10,528][49750] Updated weights for policy 0, policy_version 187531 (0.0034) [2024-04-26 13:39:12,063][49517] Fps is (10 sec: 49151.3, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 3072557056. Throughput: 0: 50331.5. Samples: 825410560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:12,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 13:39:14,254][49750] Updated weights for policy 0, policy_version 187541 (0.0035) [2024-04-26 13:39:16,988][49750] Updated weights for policy 0, policy_version 187551 (0.0031) [2024-04-26 13:39:17,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 3072835584. Throughput: 0: 50329.6. Samples: 825715900. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:17,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 13:39:20,728][49750] Updated weights for policy 0, policy_version 187561 (0.0037) [2024-04-26 13:39:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3073064960. Throughput: 0: 50337.7. Samples: 825866800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:22,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 13:39:23,383][49750] Updated weights for policy 0, policy_version 187571 (0.0033) [2024-04-26 13:39:27,062][49750] Updated weights for policy 0, policy_version 187581 (0.0032) [2024-04-26 13:39:27,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 3073327104. Throughput: 0: 50188.9. Samples: 826170120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:27,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 13:39:29,949][49750] Updated weights for policy 0, policy_version 187591 (0.0031) [2024-04-26 13:39:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 3073572864. Throughput: 0: 50324.0. Samples: 826475300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:32,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 13:39:33,620][49750] Updated weights for policy 0, policy_version 187601 (0.0030) [2024-04-26 13:39:36,398][49750] Updated weights for policy 0, policy_version 187611 (0.0029) [2024-04-26 13:39:37,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3073835008. Throughput: 0: 50249.6. Samples: 826626040. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:37,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 13:39:40,060][49750] Updated weights for policy 0, policy_version 187621 (0.0029) [2024-04-26 13:39:42,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.1, 300 sec: 50373.8). Total num frames: 3074064384. Throughput: 0: 50442.6. Samples: 826930380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:42,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 13:39:42,989][49750] Updated weights for policy 0, policy_version 187631 (0.0033) [2024-04-26 13:39:46,373][49750] Updated weights for policy 0, policy_version 187641 (0.0031) [2024-04-26 13:39:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3074326528. Throughput: 0: 50192.9. Samples: 827225100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 13:39:49,675][49750] Updated weights for policy 0, policy_version 187651 (0.0037) [2024-04-26 13:39:52,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 3074572288. Throughput: 0: 50399.1. Samples: 827378760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 13:39:52,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 13:39:52,734][49750] Updated weights for policy 0, policy_version 187661 (0.0031) [2024-04-26 13:39:56,049][49750] Updated weights for policy 0, policy_version 187671 (0.0030) [2024-04-26 13:39:57,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.3, 300 sec: 50373.8). Total num frames: 3074818048. Throughput: 0: 50504.1. Samples: 827683240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:39:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 13:39:59,330][49750] Updated weights for policy 0, policy_version 187681 (0.0030) [2024-04-26 13:40:02,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 3075080192. Throughput: 0: 50489.4. Samples: 827987920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:02,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 13:40:02,563][49750] Updated weights for policy 0, policy_version 187691 (0.0031) [2024-04-26 13:40:05,780][49750] Updated weights for policy 0, policy_version 187701 (0.0035) [2024-04-26 13:40:07,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49698.0, 300 sec: 50318.3). Total num frames: 3075309568. Throughput: 0: 50541.2. Samples: 828141160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:07,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 13:40:09,141][49750] Updated weights for policy 0, policy_version 187711 (0.0032) [2024-04-26 13:40:11,798][49728] Signal inference workers to stop experience collection... (12400 times) [2024-04-26 13:40:11,799][49728] Signal inference workers to resume experience collection... (12400 times) [2024-04-26 13:40:11,811][49750] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-04-26 13:40:11,833][49750] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-04-26 13:40:12,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 3075588096. Throughput: 0: 50373.9. Samples: 828436940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 13:40:12,229][49750] Updated weights for policy 0, policy_version 187721 (0.0030) [2024-04-26 13:40:15,509][49750] Updated weights for policy 0, policy_version 187731 (0.0034) [2024-04-26 13:40:17,063][49517] Fps is (10 sec: 52429.2, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 3075833856. Throughput: 0: 50279.9. Samples: 828737900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:17,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 13:40:19,165][49750] Updated weights for policy 0, policy_version 187741 (0.0034) [2024-04-26 13:40:21,901][49750] Updated weights for policy 0, policy_version 187751 (0.0028) [2024-04-26 13:40:22,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 3076112384. Throughput: 0: 50439.1. Samples: 828895800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:22,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 13:40:25,836][49750] Updated weights for policy 0, policy_version 187761 (0.0032) [2024-04-26 13:40:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3076341760. Throughput: 0: 50385.3. Samples: 829197720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:27,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 13:40:28,636][49750] Updated weights for policy 0, policy_version 187771 (0.0030) [2024-04-26 13:40:32,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.1, 300 sec: 50429.4). Total num frames: 3076587520. Throughput: 0: 50416.8. Samples: 829493860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:32,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 13:40:32,177][49750] Updated weights for policy 0, policy_version 187781 (0.0030) [2024-04-26 13:40:35,107][49750] Updated weights for policy 0, policy_version 187791 (0.0029) [2024-04-26 13:40:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3076866048. Throughput: 0: 50498.1. Samples: 829651180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 13:40:38,700][49750] Updated weights for policy 0, policy_version 187801 (0.0032) [2024-04-26 13:40:41,736][49750] Updated weights for policy 0, policy_version 187811 (0.0033) [2024-04-26 13:40:42,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3077111808. Throughput: 0: 50489.7. Samples: 829955280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:42,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 13:40:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000187812_3077111808.pth... [2024-04-26 13:40:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000187074_3065020416.pth [2024-04-26 13:40:45,303][49750] Updated weights for policy 0, policy_version 187821 (0.0030) [2024-04-26 13:40:47,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3077341184. Throughput: 0: 50367.2. Samples: 830254440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 13:40:48,103][49750] Updated weights for policy 0, policy_version 187831 (0.0032) [2024-04-26 13:40:51,640][49750] Updated weights for policy 0, policy_version 187841 (0.0032) [2024-04-26 13:40:52,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.2, 300 sec: 50373.8). 
Total num frames: 3077586944. Throughput: 0: 50287.7. Samples: 830404100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:52,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 13:40:54,590][49750] Updated weights for policy 0, policy_version 187851 (0.0032) [2024-04-26 13:40:57,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 3077865472. Throughput: 0: 50349.2. Samples: 830702660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 13:40:57,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 13:40:58,007][49750] Updated weights for policy 0, policy_version 187861 (0.0030) [2024-04-26 13:41:01,157][49750] Updated weights for policy 0, policy_version 187871 (0.0034) [2024-04-26 13:41:02,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 3078111232. Throughput: 0: 50311.7. Samples: 831001920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:02,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 13:41:04,674][49750] Updated weights for policy 0, policy_version 187881 (0.0032) [2024-04-26 13:41:07,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 3078356992. Throughput: 0: 50389.3. Samples: 831163320. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 13:41:07,539][49750] Updated weights for policy 0, policy_version 187891 (0.0028) [2024-04-26 13:41:09,861][49728] Signal inference workers to stop experience collection... (12450 times) [2024-04-26 13:41:09,866][49728] Signal inference workers to resume experience collection... 
(12450 times) [2024-04-26 13:41:09,893][49750] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-04-26 13:41:09,893][49750] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-04-26 13:41:11,230][49750] Updated weights for policy 0, policy_version 187901 (0.0029) [2024-04-26 13:41:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 3078619136. Throughput: 0: 50352.5. Samples: 831463580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:12,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 13:41:14,046][49750] Updated weights for policy 0, policy_version 187911 (0.0031) [2024-04-26 13:41:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3078864896. Throughput: 0: 50444.5. Samples: 831763860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:17,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 13:41:17,569][49750] Updated weights for policy 0, policy_version 187921 (0.0029) [2024-04-26 13:41:20,481][49750] Updated weights for policy 0, policy_version 187931 (0.0038) [2024-04-26 13:41:22,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 3079127040. Throughput: 0: 50496.6. Samples: 831923520. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:22,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 13:41:23,866][49750] Updated weights for policy 0, policy_version 187941 (0.0034) [2024-04-26 13:41:26,871][49750] Updated weights for policy 0, policy_version 187951 (0.0031) [2024-04-26 13:41:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.6, 300 sec: 50484.9). Total num frames: 3079389184. Throughput: 0: 50613.0. Samples: 832232860. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:27,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 13:41:30,496][49750] Updated weights for policy 0, policy_version 187961 (0.0030) [2024-04-26 13:41:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.4, 300 sec: 50373.9). Total num frames: 3079602176. Throughput: 0: 50589.8. Samples: 832530980. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:32,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 13:41:33,590][49750] Updated weights for policy 0, policy_version 187971 (0.0031) [2024-04-26 13:41:37,063][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 3079864320. Throughput: 0: 50596.4. Samples: 832680940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 13:41:37,161][49750] Updated weights for policy 0, policy_version 187981 (0.0027) [2024-04-26 13:41:40,141][49750] Updated weights for policy 0, policy_version 187991 (0.0031) [2024-04-26 13:41:42,062][49517] Fps is (10 sec: 55705.9, 60 sec: 50790.6, 300 sec: 50485.0). Total num frames: 3080159232. Throughput: 0: 50662.8. Samples: 832982480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 13:41:43,573][49750] Updated weights for policy 0, policy_version 188001 (0.0032) [2024-04-26 13:41:46,463][49750] Updated weights for policy 0, policy_version 188011 (0.0031) [2024-04-26 13:41:47,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50485.0). Total num frames: 3080388608. Throughput: 0: 50753.8. Samples: 833285840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:47,063][49517] Avg episode reward: [(0, '0.687')] [2024-04-26 13:41:50,039][49750] Updated weights for policy 0, policy_version 188021 (0.0031) [2024-04-26 13:41:52,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50790.4, 300 sec: 50540.5). 
Total num frames: 3080634368. Throughput: 0: 50630.7. Samples: 833441700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:52,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 13:41:52,932][49750] Updated weights for policy 0, policy_version 188031 (0.0039) [2024-04-26 13:41:56,441][49750] Updated weights for policy 0, policy_version 188041 (0.0031) [2024-04-26 13:41:57,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3080880128. Throughput: 0: 50727.2. Samples: 833746300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:41:57,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:41:59,899][49750] Updated weights for policy 0, policy_version 188051 (0.0033) [2024-04-26 13:42:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 3081125888. Throughput: 0: 50606.7. Samples: 834041160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:42:02,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 13:42:02,920][49750] Updated weights for policy 0, policy_version 188061 (0.0033) [2024-04-26 13:42:06,234][49750] Updated weights for policy 0, policy_version 188071 (0.0031) [2024-04-26 13:42:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 3081404416. Throughput: 0: 50620.4. Samples: 834201440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:07,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 13:42:07,188][49728] Signal inference workers to stop experience collection... (12500 times) [2024-04-26 13:42:07,189][49728] Signal inference workers to resume experience collection... 
(12500 times) [2024-04-26 13:42:07,224][49750] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-04-26 13:42:07,225][49750] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-04-26 13:42:09,435][49750] Updated weights for policy 0, policy_version 188081 (0.0029) [2024-04-26 13:42:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 3081650176. Throughput: 0: 50390.7. Samples: 834500440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:12,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 13:42:12,587][49750] Updated weights for policy 0, policy_version 188091 (0.0036) [2024-04-26 13:42:15,822][49750] Updated weights for policy 0, policy_version 188101 (0.0030) [2024-04-26 13:42:17,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3081895936. Throughput: 0: 50671.0. Samples: 834811180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:17,072][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 13:42:19,043][49750] Updated weights for policy 0, policy_version 188111 (0.0035) [2024-04-26 13:42:22,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 3082158080. Throughput: 0: 50534.2. Samples: 834954980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:22,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 13:42:22,120][49750] Updated weights for policy 0, policy_version 188121 (0.0037) [2024-04-26 13:42:25,462][49750] Updated weights for policy 0, policy_version 188131 (0.0031) [2024-04-26 13:42:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3082420224. Throughput: 0: 50560.9. Samples: 835257720. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:27,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 13:42:28,695][49750] Updated weights for policy 0, policy_version 188141 (0.0034) [2024-04-26 13:42:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3082649600. Throughput: 0: 50647.0. Samples: 835564960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 13:42:32,084][49750] Updated weights for policy 0, policy_version 188151 (0.0030) [2024-04-26 13:42:35,261][49750] Updated weights for policy 0, policy_version 188161 (0.0033) [2024-04-26 13:42:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3082911744. Throughput: 0: 50494.0. Samples: 835713920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:37,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 13:42:38,532][49750] Updated weights for policy 0, policy_version 188171 (0.0036) [2024-04-26 13:42:41,691][49750] Updated weights for policy 0, policy_version 188181 (0.0033) [2024-04-26 13:42:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 49971.0, 300 sec: 50429.4). Total num frames: 3083157504. Throughput: 0: 50397.7. Samples: 836014200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:42,072][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 13:42:42,081][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000188181_3083157504.pth... [2024-04-26 13:42:42,127][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000187443_3071066112.pth [2024-04-26 13:42:44,979][49750] Updated weights for policy 0, policy_version 188191 (0.0034) [2024-04-26 13:42:47,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 3083419648. Throughput: 0: 50474.6. Samples: 836312520. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:47,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 13:42:48,112][49750] Updated weights for policy 0, policy_version 188201 (0.0034) [2024-04-26 13:42:51,418][49750] Updated weights for policy 0, policy_version 188211 (0.0030) [2024-04-26 13:42:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3083665408. Throughput: 0: 50477.8. Samples: 836472940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:52,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 13:42:54,550][49750] Updated weights for policy 0, policy_version 188221 (0.0032) [2024-04-26 13:42:57,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3083911168. Throughput: 0: 50564.3. Samples: 836775840. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:42:57,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 13:42:57,887][49750] Updated weights for policy 0, policy_version 188231 (0.0032) [2024-04-26 13:43:01,100][49750] Updated weights for policy 0, policy_version 188241 (0.0029) [2024-04-26 13:43:02,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3084156928. Throughput: 0: 50311.2. Samples: 837075180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:43:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 13:43:04,346][49750] Updated weights for policy 0, policy_version 188251 (0.0031) [2024-04-26 13:43:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3084419072. Throughput: 0: 50368.1. Samples: 837221540. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:43:07,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 13:43:08,036][49750] Updated weights for policy 0, policy_version 188261 (0.0034) [2024-04-26 13:43:10,924][49750] Updated weights for policy 0, policy_version 188271 (0.0029) [2024-04-26 13:43:12,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 3084697600. Throughput: 0: 50449.7. Samples: 837527960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-26 13:43:12,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 13:43:14,385][49750] Updated weights for policy 0, policy_version 188281 (0.0033) [2024-04-26 13:43:15,636][49728] Signal inference workers to stop experience collection... (12550 times) [2024-04-26 13:43:15,641][49728] Signal inference workers to resume experience collection... (12550 times) [2024-04-26 13:43:15,674][49750] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-04-26 13:43:15,675][49750] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-04-26 13:43:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3084926976. Throughput: 0: 50398.7. Samples: 837832900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:43:17,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 13:43:17,303][49750] Updated weights for policy 0, policy_version 188291 (0.0029) [2024-04-26 13:43:20,700][49750] Updated weights for policy 0, policy_version 188301 (0.0032) [2024-04-26 13:43:22,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3085172736. Throughput: 0: 50368.7. Samples: 837980520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:43:22,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 13:43:23,827][49750] Updated weights for policy 0, policy_version 188311 (0.0034) [2024-04-26 13:43:27,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 3085418496. Throughput: 0: 50428.1. Samples: 838283460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:43:27,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 13:43:27,308][49750] Updated weights for policy 0, policy_version 188321 (0.0030) [2024-04-26 13:43:30,193][49750] Updated weights for policy 0, policy_version 188331 (0.0030) [2024-04-26 13:43:32,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.3, 300 sec: 50540.4). Total num frames: 3085697024. Throughput: 0: 50527.9. Samples: 838586280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:43:32,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 13:43:33,766][49750] Updated weights for policy 0, policy_version 188341 (0.0031) [2024-04-26 13:43:36,865][49750] Updated weights for policy 0, policy_version 188351 (0.0030) [2024-04-26 13:43:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3085942784. Throughput: 0: 50428.4. Samples: 838742220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:43:37,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 13:43:40,269][49750] Updated weights for policy 0, policy_version 188361 (0.0034) [2024-04-26 13:43:42,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.5, 300 sec: 50484.9). Total num frames: 3086188544. Throughput: 0: 50409.0. Samples: 839044240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:43:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 13:43:43,262][49750] Updated weights for policy 0, policy_version 188371 (0.0037) [2024-04-26 13:43:46,703][49750] Updated weights for policy 0, policy_version 188381 (0.0029) [2024-04-26 13:43:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3086434304. Throughput: 0: 50468.5. Samples: 839346260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:43:47,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 13:43:49,777][49750] Updated weights for policy 0, policy_version 188391 (0.0028) [2024-04-26 13:43:52,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3086696448. Throughput: 0: 50550.7. Samples: 839496320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:43:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 13:43:53,319][49750] Updated weights for policy 0, policy_version 188401 (0.0033) [2024-04-26 13:43:56,228][49750] Updated weights for policy 0, policy_version 188411 (0.0037) [2024-04-26 13:43:57,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3086942208. Throughput: 0: 50295.4. Samples: 839791260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:43:57,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 13:43:59,751][49750] Updated weights for policy 0, policy_version 188421 (0.0031) [2024-04-26 13:44:02,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50373.8). Total num frames: 3087187968. Throughput: 0: 50355.5. Samples: 840098900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:44:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 13:44:02,819][49750] Updated weights for policy 0, policy_version 188431 (0.0030) [2024-04-26 13:44:06,263][49750] Updated weights for policy 0, policy_version 188441 (0.0038) [2024-04-26 13:44:07,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3087433728. Throughput: 0: 50475.2. Samples: 840251900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:44:07,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 13:44:09,510][49750] Updated weights for policy 0, policy_version 188451 (0.0035) [2024-04-26 13:44:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 3087679488. Throughput: 0: 50322.2. Samples: 840547960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:44:12,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 13:44:12,871][49750] Updated weights for policy 0, policy_version 188461 (0.0031) [2024-04-26 13:44:16,039][49750] Updated weights for policy 0, policy_version 188471 (0.0037) [2024-04-26 13:44:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3087941632. Throughput: 0: 50320.2. Samples: 840850680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 13:44:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 13:44:19,305][49750] Updated weights for policy 0, policy_version 188481 (0.0035) [2024-04-26 13:44:22,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3088203776. Throughput: 0: 50196.8. Samples: 841001080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:44:22,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 13:44:22,410][49750] Updated weights for policy 0, policy_version 188491 (0.0030) [2024-04-26 13:44:25,964][49750] Updated weights for policy 0, policy_version 188501 (0.0041) [2024-04-26 13:44:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3088449536. Throughput: 0: 50227.0. Samples: 841304460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:44:27,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 13:44:28,938][49750] Updated weights for policy 0, policy_version 188511 (0.0040) [2024-04-26 13:44:32,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 3088695296. Throughput: 0: 50174.0. Samples: 841604100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:44:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 13:44:32,389][49750] Updated weights for policy 0, policy_version 188521 (0.0034) [2024-04-26 13:44:33,524][49728] Signal inference workers to stop experience collection... (12600 times) [2024-04-26 13:44:33,524][49728] Signal inference workers to resume experience collection... (12600 times) [2024-04-26 13:44:33,539][49750] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-04-26 13:44:33,552][49750] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-04-26 13:44:35,508][49750] Updated weights for policy 0, policy_version 188531 (0.0030) [2024-04-26 13:44:37,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50484.9). Total num frames: 3088957440. Throughput: 0: 50232.3. Samples: 841756780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:44:37,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 13:44:38,877][49750] Updated weights for policy 0, policy_version 188541 (0.0033) [2024-04-26 13:44:42,034][49750] Updated weights for policy 0, policy_version 188551 (0.0029) [2024-04-26 13:44:42,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.1, 300 sec: 50484.9). Total num frames: 3089219584. Throughput: 0: 50384.4. Samples: 842058560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:44:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 13:44:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000188551_3089219584.pth... [2024-04-26 13:44:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000187812_3077111808.pth [2024-04-26 13:44:45,546][49750] Updated weights for policy 0, policy_version 188561 (0.0043) [2024-04-26 13:44:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 3089465344. Throughput: 0: 50268.3. Samples: 842360980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:44:47,063][49517] Avg episode reward: [(0, '0.469')] [2024-04-26 13:44:48,559][49750] Updated weights for policy 0, policy_version 188571 (0.0032) [2024-04-26 13:44:52,063][49517] Fps is (10 sec: 45875.8, 60 sec: 49698.1, 300 sec: 50373.9). Total num frames: 3089678336. Throughput: 0: 50128.0. Samples: 842507660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:44:52,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 13:44:52,273][49750] Updated weights for policy 0, policy_version 188581 (0.0032) [2024-04-26 13:44:55,104][49750] Updated weights for policy 0, policy_version 188591 (0.0034) [2024-04-26 13:44:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 3089956864. Throughput: 0: 50285.9. Samples: 842810820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:44:57,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 13:44:58,639][49750] Updated weights for policy 0, policy_version 188601 (0.0030) [2024-04-26 13:45:01,559][49750] Updated weights for policy 0, policy_version 188611 (0.0031) [2024-04-26 13:45:02,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3090219008. Throughput: 0: 50368.9. Samples: 843117280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:45:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 13:45:05,162][49750] Updated weights for policy 0, policy_version 188621 (0.0035) [2024-04-26 13:45:07,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 3090481152. Throughput: 0: 50396.9. Samples: 843268940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:45:07,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 13:45:08,023][49750] Updated weights for policy 0, policy_version 188631 (0.0039) [2024-04-26 13:45:11,785][49750] Updated weights for policy 0, policy_version 188641 (0.0033) [2024-04-26 13:45:12,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3090710528. Throughput: 0: 50311.5. Samples: 843568480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:45:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 13:45:14,536][49750] Updated weights for policy 0, policy_version 188651 (0.0032) [2024-04-26 13:45:17,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50373.9). Total num frames: 3090972672. Throughput: 0: 50360.5. Samples: 843870320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:45:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 13:45:18,347][49750] Updated weights for policy 0, policy_version 188661 (0.0043) [2024-04-26 13:45:21,009][49750] Updated weights for policy 0, policy_version 188671 (0.0033) [2024-04-26 13:45:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 3091234816. Throughput: 0: 50376.6. Samples: 844023720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 13:45:22,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 13:45:24,824][49750] Updated weights for policy 0, policy_version 188681 (0.0038) [2024-04-26 13:45:27,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.4, 300 sec: 50485.0). Total num frames: 3091480576. Throughput: 0: 50354.0. Samples: 844324480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:45:27,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 13:45:27,721][49750] Updated weights for policy 0, policy_version 188691 (0.0034) [2024-04-26 13:45:31,266][49750] Updated weights for policy 0, policy_version 188701 (0.0034) [2024-04-26 13:45:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 3091726336. Throughput: 0: 50489.1. Samples: 844632980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:45:32,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 13:45:34,188][49750] Updated weights for policy 0, policy_version 188711 (0.0031) [2024-04-26 13:45:37,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.3, 300 sec: 50318.3). Total num frames: 3091955712. Throughput: 0: 50480.9. Samples: 844779300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:45:37,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 13:45:37,714][49750] Updated weights for policy 0, policy_version 188721 (0.0035) [2024-04-26 13:45:40,688][49750] Updated weights for policy 0, policy_version 188731 (0.0032) [2024-04-26 13:45:42,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.4, 300 sec: 50540.4). Total num frames: 3092250624. Throughput: 0: 50553.6. Samples: 845085740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:45:42,072][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 13:45:44,148][49750] Updated weights for policy 0, policy_version 188741 (0.0030) [2024-04-26 13:45:47,021][49750] Updated weights for policy 0, policy_version 188751 (0.0035) [2024-04-26 13:45:47,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3092496384. Throughput: 0: 50414.6. Samples: 845385940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:45:47,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 13:45:50,593][49750] Updated weights for policy 0, policy_version 188761 (0.0031) [2024-04-26 13:45:51,183][49728] Signal inference workers to stop experience collection... (12650 times) [2024-04-26 13:45:51,183][49728] Signal inference workers to resume experience collection... (12650 times) [2024-04-26 13:45:51,210][49750] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-04-26 13:45:51,211][49750] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-04-26 13:45:52,063][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 3092742144. Throughput: 0: 50579.6. Samples: 845545020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:45:52,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 13:45:53,638][49750] Updated weights for policy 0, policy_version 188771 (0.0036) [2024-04-26 13:45:56,974][49750] Updated weights for policy 0, policy_version 188781 (0.0034) [2024-04-26 13:45:57,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50429.4). Total num frames: 3092987904. Throughput: 0: 50579.1. Samples: 845844540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:45:57,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 13:46:00,135][49750] Updated weights for policy 0, policy_version 188791 (0.0025) [2024-04-26 13:46:02,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50484.9). Total num frames: 3093250048. Throughput: 0: 50432.5. Samples: 846139780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:46:02,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 13:46:03,561][49750] Updated weights for policy 0, policy_version 188801 (0.0035) [2024-04-26 13:46:06,471][49750] Updated weights for policy 0, policy_version 188811 (0.0035) [2024-04-26 13:46:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50484.9). Total num frames: 3093512192. Throughput: 0: 50412.9. Samples: 846292300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:46:07,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 13:46:10,046][49750] Updated weights for policy 0, policy_version 188821 (0.0038) [2024-04-26 13:46:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3093741568. Throughput: 0: 50612.8. Samples: 846602060. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:46:12,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 13:46:12,850][49750] Updated weights for policy 0, policy_version 188831 (0.0035) [2024-04-26 13:46:16,432][49750] Updated weights for policy 0, policy_version 188841 (0.0032) [2024-04-26 13:46:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 3094003712. Throughput: 0: 50424.0. Samples: 846902060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:46:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 13:46:19,475][49750] Updated weights for policy 0, policy_version 188851 (0.0034) [2024-04-26 13:46:22,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 3094233088. Throughput: 0: 50455.8. Samples: 847049820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:46:22,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 13:46:22,887][49750] Updated weights for policy 0, policy_version 188861 (0.0033) [2024-04-26 13:46:25,927][49750] Updated weights for policy 0, policy_version 188871 (0.0037) [2024-04-26 13:46:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3094528000. Throughput: 0: 50410.0. Samples: 847354180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 13:46:27,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 13:46:29,364][49750] Updated weights for policy 0, policy_version 188881 (0.0030) [2024-04-26 13:46:32,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3094740992. Throughput: 0: 50486.3. Samples: 847657820. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:46:32,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 13:46:32,461][49750] Updated weights for policy 0, policy_version 188891 (0.0038) [2024-04-26 13:46:35,874][49750] Updated weights for policy 0, policy_version 188901 (0.0035) [2024-04-26 13:46:37,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50517.3, 300 sec: 50262.7). Total num frames: 3094986752. Throughput: 0: 50185.8. Samples: 847803380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:46:37,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 13:46:38,963][49750] Updated weights for policy 0, policy_version 188911 (0.0036) [2024-04-26 13:46:42,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 3095248896. Throughput: 0: 50175.7. Samples: 848102440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:46:42,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 13:46:42,182][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000188920_3095265280.pth... [2024-04-26 13:46:42,226][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000188181_3083157504.pth [2024-04-26 13:46:42,381][49750] Updated weights for policy 0, policy_version 188921 (0.0035) [2024-04-26 13:46:45,436][49750] Updated weights for policy 0, policy_version 188931 (0.0030) [2024-04-26 13:46:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3095511040. Throughput: 0: 50418.3. Samples: 848408600. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:46:47,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 13:46:48,710][49750] Updated weights for policy 0, policy_version 188941 (0.0029) [2024-04-26 13:46:51,883][49750] Updated weights for policy 0, policy_version 188951 (0.0033) [2024-04-26 13:46:52,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.4, 300 sec: 50484.9). 
Total num frames: 3095773184. Throughput: 0: 50580.9. Samples: 848568440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:46:52,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 13:46:54,525][49728] Signal inference workers to stop experience collection... (12700 times) [2024-04-26 13:46:54,559][49750] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-04-26 13:46:54,630][49728] Signal inference workers to resume experience collection... (12700 times) [2024-04-26 13:46:54,630][49750] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-04-26 13:46:55,253][49750] Updated weights for policy 0, policy_version 188961 (0.0028) [2024-04-26 13:46:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 3096002560. Throughput: 0: 50298.8. Samples: 848865500. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:46:57,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 13:46:58,418][49750] Updated weights for policy 0, policy_version 188971 (0.0035) [2024-04-26 13:47:01,794][49750] Updated weights for policy 0, policy_version 188981 (0.0034) [2024-04-26 13:47:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3096281088. Throughput: 0: 50541.6. Samples: 849176440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:47:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 13:47:04,778][49750] Updated weights for policy 0, policy_version 188991 (0.0036) [2024-04-26 13:47:07,063][49517] Fps is (10 sec: 50789.6, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 3096510464. Throughput: 0: 50445.8. Samples: 849319880. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:47:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 13:47:08,378][49750] Updated weights for policy 0, policy_version 189001 (0.0030) [2024-04-26 13:47:11,246][49750] Updated weights for policy 0, policy_version 189011 (0.0032) [2024-04-26 13:47:12,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 3096788992. Throughput: 0: 50416.7. Samples: 849622940. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:47:12,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 13:47:14,864][49750] Updated weights for policy 0, policy_version 189021 (0.0038) [2024-04-26 13:47:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3097034752. Throughput: 0: 50434.1. Samples: 849927360. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:47:17,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 13:47:17,910][49750] Updated weights for policy 0, policy_version 189031 (0.0031) [2024-04-26 13:47:21,349][49750] Updated weights for policy 0, policy_version 189041 (0.0030) [2024-04-26 13:47:22,062][49517] Fps is (10 sec: 45875.6, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 3097247744. Throughput: 0: 50473.4. Samples: 850074680. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:47:22,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 13:47:24,269][49750] Updated weights for policy 0, policy_version 189051 (0.0033) [2024-04-26 13:47:27,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 3097526272. Throughput: 0: 50395.4. Samples: 850370240. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:47:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:47:27,904][49750] Updated weights for policy 0, policy_version 189061 (0.0031) [2024-04-26 13:47:30,763][49750] Updated weights for policy 0, policy_version 189071 (0.0032) [2024-04-26 13:47:32,063][49517] Fps is (10 sec: 55704.7, 60 sec: 51063.3, 300 sec: 50484.9). Total num frames: 3097804800. Throughput: 0: 50378.5. Samples: 850675640. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:47:32,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:47:34,291][49750] Updated weights for policy 0, policy_version 189081 (0.0030) [2024-04-26 13:47:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3098034176. Throughput: 0: 50335.7. Samples: 850833540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 13:47:37,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 13:47:37,303][49750] Updated weights for policy 0, policy_version 189091 (0.0028) [2024-04-26 13:47:40,760][49750] Updated weights for policy 0, policy_version 189101 (0.0035) [2024-04-26 13:47:42,062][49517] Fps is (10 sec: 45876.2, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3098263552. Throughput: 0: 50573.3. Samples: 851141300. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:47:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 13:47:43,656][49750] Updated weights for policy 0, policy_version 189111 (0.0040) [2024-04-26 13:47:47,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3098525696. Throughput: 0: 50265.5. Samples: 851438380. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:47:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 13:47:47,485][49750] Updated weights for policy 0, policy_version 189121 (0.0031) [2024-04-26 13:47:50,166][49750] Updated weights for policy 0, policy_version 189131 (0.0034) [2024-04-26 13:47:52,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3098804224. Throughput: 0: 50431.6. Samples: 851589300. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:47:52,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 13:47:53,981][49750] Updated weights for policy 0, policy_version 189141 (0.0033) [2024-04-26 13:47:56,707][49750] Updated weights for policy 0, policy_version 189151 (0.0032) [2024-04-26 13:47:57,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 3099066368. Throughput: 0: 50429.0. Samples: 851892240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:47:57,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 13:48:00,361][49750] Updated weights for policy 0, policy_version 189161 (0.0034) [2024-04-26 13:48:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3099295744. Throughput: 0: 50414.7. Samples: 852196020. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:48:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 13:48:03,135][49750] Updated weights for policy 0, policy_version 189171 (0.0028) [2024-04-26 13:48:07,063][49517] Fps is (10 sec: 45874.8, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 3099525120. Throughput: 0: 50368.4. Samples: 852341260. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:48:07,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 13:48:07,082][49750] Updated weights for policy 0, policy_version 189181 (0.0030) [2024-04-26 13:48:09,627][49750] Updated weights for policy 0, policy_version 189191 (0.0034) [2024-04-26 13:48:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 3099787264. Throughput: 0: 50428.1. Samples: 852639500. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:48:12,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 13:48:13,686][49750] Updated weights for policy 0, policy_version 189201 (0.0033) [2024-04-26 13:48:16,061][49750] Updated weights for policy 0, policy_version 189211 (0.0034) [2024-04-26 13:48:17,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3100049408. Throughput: 0: 50327.8. Samples: 852940380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:48:17,063][49517] Avg episode reward: [(0, '0.463')] [2024-04-26 13:48:20,047][49750] Updated weights for policy 0, policy_version 189221 (0.0034) [2024-04-26 13:48:22,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.4, 300 sec: 50484.9). Total num frames: 3100311552. Throughput: 0: 50431.3. Samples: 853102960. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:48:22,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 13:48:22,601][49750] Updated weights for policy 0, policy_version 189231 (0.0039) [2024-04-26 13:48:26,525][49750] Updated weights for policy 0, policy_version 189241 (0.0025) [2024-04-26 13:48:27,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3100540928. Throughput: 0: 50207.5. Samples: 853400640. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:48:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 13:48:28,242][49728] Signal inference workers to stop experience collection... (12750 times) [2024-04-26 13:48:28,243][49728] Signal inference workers to resume experience collection... (12750 times) [2024-04-26 13:48:28,277][49750] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-04-26 13:48:28,277][49750] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-04-26 13:48:29,240][49750] Updated weights for policy 0, policy_version 189251 (0.0032) [2024-04-26 13:48:32,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49698.3, 300 sec: 50318.3). Total num frames: 3100786688. Throughput: 0: 50203.5. Samples: 853697540. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:48:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 13:48:33,263][49750] Updated weights for policy 0, policy_version 189261 (0.0032) [2024-04-26 13:48:36,010][49750] Updated weights for policy 0, policy_version 189271 (0.0036) [2024-04-26 13:48:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3101065216. Throughput: 0: 50254.4. Samples: 853850740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:48:37,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 13:48:39,774][49750] Updated weights for policy 0, policy_version 189281 (0.0032) [2024-04-26 13:48:42,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3101310976. Throughput: 0: 50296.3. Samples: 854155580. Policy #0 lag: (min: 1.0, avg: 10.8, max: 19.0) [2024-04-26 13:48:42,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 13:48:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000189289_3101310976.pth... 
[2024-04-26 13:48:42,113][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000188551_3089219584.pth [2024-04-26 13:48:42,484][49750] Updated weights for policy 0, policy_version 189291 (0.0027) [2024-04-26 13:48:46,201][49750] Updated weights for policy 0, policy_version 189301 (0.0034) [2024-04-26 13:48:47,063][49517] Fps is (10 sec: 47512.3, 60 sec: 50244.1, 300 sec: 50318.3). Total num frames: 3101540352. Throughput: 0: 50115.0. Samples: 854451200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:48:47,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 13:48:49,041][49750] Updated weights for policy 0, policy_version 189311 (0.0037) [2024-04-26 13:48:52,063][49517] Fps is (10 sec: 45875.3, 60 sec: 49425.1, 300 sec: 50262.8). Total num frames: 3101769728. Throughput: 0: 50231.1. Samples: 854601660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:48:52,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 13:48:52,657][49750] Updated weights for policy 0, policy_version 189321 (0.0033) [2024-04-26 13:48:55,574][49750] Updated weights for policy 0, policy_version 189331 (0.0030) [2024-04-26 13:48:57,063][49517] Fps is (10 sec: 50790.8, 60 sec: 49698.0, 300 sec: 50373.8). Total num frames: 3102048256. Throughput: 0: 50257.2. Samples: 854901080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:48:57,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 13:48:59,142][49750] Updated weights for policy 0, policy_version 189341 (0.0031) [2024-04-26 13:49:02,033][49750] Updated weights for policy 0, policy_version 189351 (0.0032) [2024-04-26 13:49:02,062][49517] Fps is (10 sec: 55705.9, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3102326784. Throughput: 0: 50280.8. Samples: 855203020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 13:49:05,663][49750] Updated weights for policy 0, policy_version 189361 (0.0030) [2024-04-26 13:49:07,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 3102572544. Throughput: 0: 50192.4. Samples: 855361620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:07,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:49:08,672][49750] Updated weights for policy 0, policy_version 189371 (0.0030) [2024-04-26 13:49:12,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3102801920. Throughput: 0: 50238.7. Samples: 855661380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:12,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 13:49:12,089][49750] Updated weights for policy 0, policy_version 189381 (0.0028) [2024-04-26 13:49:15,293][49750] Updated weights for policy 0, policy_version 189391 (0.0028) [2024-04-26 13:49:17,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3103064064. Throughput: 0: 50415.6. Samples: 855966240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:17,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 13:49:18,543][49750] Updated weights for policy 0, policy_version 189401 (0.0030) [2024-04-26 13:49:21,622][49750] Updated weights for policy 0, policy_version 189411 (0.0028) [2024-04-26 13:49:22,063][49517] Fps is (10 sec: 50789.1, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 3103309824. Throughput: 0: 50496.1. Samples: 856123080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 13:49:25,172][49750] Updated weights for policy 0, policy_version 189421 (0.0035) [2024-04-26 13:49:27,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 3103588352. Throughput: 0: 50327.0. Samples: 856420300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:27,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 13:49:27,952][49750] Updated weights for policy 0, policy_version 189431 (0.0034) [2024-04-26 13:49:31,714][49750] Updated weights for policy 0, policy_version 189441 (0.0032) [2024-04-26 13:49:32,063][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 3103801344. Throughput: 0: 50571.7. Samples: 856726920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 13:49:34,690][49750] Updated weights for policy 0, policy_version 189451 (0.0029) [2024-04-26 13:49:37,063][49517] Fps is (10 sec: 45874.7, 60 sec: 49697.9, 300 sec: 50262.8). Total num frames: 3104047104. Throughput: 0: 50401.1. Samples: 856869720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:37,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 13:49:38,099][49750] Updated weights for policy 0, policy_version 189461 (0.0036) [2024-04-26 13:49:39,153][49728] Signal inference workers to stop experience collection... (12800 times) [2024-04-26 13:49:39,157][49728] Signal inference workers to resume experience collection... 
(12800 times) [2024-04-26 13:49:39,182][49750] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-04-26 13:49:39,182][49750] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-04-26 13:49:41,313][49750] Updated weights for policy 0, policy_version 189471 (0.0036) [2024-04-26 13:49:42,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3104325632. Throughput: 0: 50386.3. Samples: 857168460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:42,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 13:49:44,513][49750] Updated weights for policy 0, policy_version 189481 (0.0036) [2024-04-26 13:49:47,063][49517] Fps is (10 sec: 54068.0, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 3104587776. Throughput: 0: 50493.7. Samples: 857475240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 13:49:47,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 13:49:47,656][49750] Updated weights for policy 0, policy_version 189491 (0.0040) [2024-04-26 13:49:51,088][49750] Updated weights for policy 0, policy_version 189501 (0.0030) [2024-04-26 13:49:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 3104833536. Throughput: 0: 50529.4. Samples: 857635440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:49:52,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 13:49:53,977][49750] Updated weights for policy 0, policy_version 189511 (0.0036) [2024-04-26 13:49:57,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 3105079296. Throughput: 0: 50436.7. Samples: 857931040. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:49:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 13:49:57,638][49750] Updated weights for policy 0, policy_version 189521 (0.0029) [2024-04-26 13:50:00,691][49750] Updated weights for policy 0, policy_version 189531 (0.0028) [2024-04-26 13:50:02,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 3105325056. Throughput: 0: 50320.3. Samples: 858230660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:02,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 13:50:04,079][49750] Updated weights for policy 0, policy_version 189541 (0.0034) [2024-04-26 13:50:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3105587200. Throughput: 0: 50252.9. Samples: 858384460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 13:50:07,256][49750] Updated weights for policy 0, policy_version 189551 (0.0030) [2024-04-26 13:50:10,714][49750] Updated weights for policy 0, policy_version 189561 (0.0035) [2024-04-26 13:50:12,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3105849344. Throughput: 0: 50418.3. Samples: 858689120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:12,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 13:50:13,784][49750] Updated weights for policy 0, policy_version 189571 (0.0029) [2024-04-26 13:50:17,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3106095104. Throughput: 0: 50369.0. Samples: 858993520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:17,063][49517] Avg episode reward: [(0, '0.452')] [2024-04-26 13:50:17,068][49750] Updated weights for policy 0, policy_version 189581 (0.0031) [2024-04-26 13:50:20,148][49750] Updated weights for policy 0, policy_version 189591 (0.0031) [2024-04-26 13:50:22,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 3106324480. Throughput: 0: 50387.8. Samples: 859137160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:22,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 13:50:23,541][49750] Updated weights for policy 0, policy_version 189601 (0.0039) [2024-04-26 13:50:26,512][49750] Updated weights for policy 0, policy_version 189611 (0.0033) [2024-04-26 13:50:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 3106603008. Throughput: 0: 50492.5. Samples: 859440620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:27,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 13:50:29,920][49750] Updated weights for policy 0, policy_version 189621 (0.0035) [2024-04-26 13:50:32,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50540.5). Total num frames: 3106865152. Throughput: 0: 50378.8. Samples: 859742280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:32,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 13:50:33,068][49750] Updated weights for policy 0, policy_version 189631 (0.0034) [2024-04-26 13:50:36,419][49750] Updated weights for policy 0, policy_version 189641 (0.0036) [2024-04-26 13:50:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.7, 300 sec: 50318.3). Total num frames: 3107094528. Throughput: 0: 50298.4. Samples: 859898860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:37,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-26 13:50:39,571][49750] Updated weights for policy 0, policy_version 189651 (0.0034) [2024-04-26 13:50:42,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 3107340288. Throughput: 0: 50478.1. Samples: 860202560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:42,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 13:50:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000189657_3107340288.pth... [2024-04-26 13:50:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000188920_3095265280.pth [2024-04-26 13:50:43,054][49750] Updated weights for policy 0, policy_version 189661 (0.0031) [2024-04-26 13:50:46,031][49750] Updated weights for policy 0, policy_version 189671 (0.0034) [2024-04-26 13:50:47,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 3107586048. Throughput: 0: 50424.5. Samples: 860499760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:47,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 13:50:49,538][49750] Updated weights for policy 0, policy_version 189681 (0.0028) [2024-04-26 13:50:52,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3107864576. Throughput: 0: 50406.7. Samples: 860652760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 13:50:52,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 13:50:52,368][49750] Updated weights for policy 0, policy_version 189691 (0.0033) [2024-04-26 13:50:55,943][49750] Updated weights for policy 0, policy_version 189701 (0.0042) [2024-04-26 13:50:57,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 3108093952. Throughput: 0: 50309.5. Samples: 860953040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:50:57,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 13:50:58,872][49750] Updated weights for policy 0, policy_version 189711 (0.0028) [2024-04-26 13:51:00,159][49728] Signal inference workers to stop experience collection... (12850 times) [2024-04-26 13:51:00,164][49728] Signal inference workers to resume experience collection... (12850 times) [2024-04-26 13:51:00,211][49750] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-04-26 13:51:00,211][49750] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-04-26 13:51:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 3108356096. Throughput: 0: 50453.7. Samples: 861263940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:02,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 13:51:02,338][49750] Updated weights for policy 0, policy_version 189721 (0.0031) [2024-04-26 13:51:05,472][49750] Updated weights for policy 0, policy_version 189731 (0.0030) [2024-04-26 13:51:07,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3108618240. Throughput: 0: 50441.7. Samples: 861407040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:07,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 13:51:09,219][49750] Updated weights for policy 0, policy_version 189741 (0.0033) [2024-04-26 13:51:11,944][49750] Updated weights for policy 0, policy_version 189751 (0.0030) [2024-04-26 13:51:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3108880384. Throughput: 0: 50405.6. Samples: 861708880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:12,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-26 13:51:15,816][49750] Updated weights for policy 0, policy_version 189761 (0.0036) [2024-04-26 13:51:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50485.0). Total num frames: 3109126144. Throughput: 0: 50554.2. Samples: 862017220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:17,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 13:51:18,292][49750] Updated weights for policy 0, policy_version 189771 (0.0033) [2024-04-26 13:51:22,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.3, 300 sec: 50262.7). Total num frames: 3109355520. Throughput: 0: 50324.3. Samples: 862163460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:22,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 13:51:22,321][49750] Updated weights for policy 0, policy_version 189781 (0.0031) [2024-04-26 13:51:24,764][49750] Updated weights for policy 0, policy_version 189791 (0.0033) [2024-04-26 13:51:27,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.2, 300 sec: 50373.8). Total num frames: 3109601280. Throughput: 0: 50320.6. Samples: 862466980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:27,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 13:51:28,729][49750] Updated weights for policy 0, policy_version 189801 (0.0030) [2024-04-26 13:51:31,503][49750] Updated weights for policy 0, policy_version 189811 (0.0035) [2024-04-26 13:51:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 3109863424. Throughput: 0: 50285.9. Samples: 862762620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:32,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 13:51:35,089][49750] Updated weights for policy 0, policy_version 189821 (0.0032) [2024-04-26 13:51:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3110125568. Throughput: 0: 50404.2. Samples: 862920940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:37,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 13:51:38,520][49750] Updated weights for policy 0, policy_version 189831 (0.0031) [2024-04-26 13:51:41,786][49750] Updated weights for policy 0, policy_version 189841 (0.0032) [2024-04-26 13:51:42,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 3110354944. Throughput: 0: 50332.8. Samples: 863218020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:42,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 13:51:44,924][49750] Updated weights for policy 0, policy_version 189851 (0.0037) [2024-04-26 13:51:47,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 3110617088. Throughput: 0: 50178.9. Samples: 863522000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 13:51:48,392][49750] Updated weights for policy 0, policy_version 189861 (0.0039) [2024-04-26 13:51:51,352][49750] Updated weights for policy 0, policy_version 189871 (0.0031) [2024-04-26 13:51:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.3, 300 sec: 50373.9). Total num frames: 3110862848. Throughput: 0: 50199.7. Samples: 863666020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 13:51:54,810][49750] Updated weights for policy 0, policy_version 189881 (0.0028) [2024-04-26 13:51:57,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.4, 300 sec: 50373.9). Total num frames: 3111141376. Throughput: 0: 50263.8. Samples: 863970740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 13:51:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 13:51:57,755][49750] Updated weights for policy 0, policy_version 189891 (0.0032) [2024-04-26 13:52:01,199][49750] Updated weights for policy 0, policy_version 189901 (0.0036) [2024-04-26 13:52:02,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3111387136. Throughput: 0: 50146.7. Samples: 864273820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:02,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 13:52:04,242][49750] Updated weights for policy 0, policy_version 189911 (0.0034) [2024-04-26 13:52:07,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49698.3, 300 sec: 50207.3). Total num frames: 3111600128. Throughput: 0: 50105.6. Samples: 864418200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 13:52:07,782][49750] Updated weights for policy 0, policy_version 189921 (0.0029) [2024-04-26 13:52:10,745][49750] Updated weights for policy 0, policy_version 189931 (0.0033) [2024-04-26 13:52:12,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 3111862272. Throughput: 0: 50103.0. Samples: 864721620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 13:52:13,856][49728] Signal inference workers to stop experience collection... 
(12900 times) [2024-04-26 13:52:13,891][49750] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-04-26 13:52:13,925][49728] Signal inference workers to resume experience collection... (12900 times) [2024-04-26 13:52:13,926][49750] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-04-26 13:52:14,524][49750] Updated weights for policy 0, policy_version 189941 (0.0035) [2024-04-26 13:52:17,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50244.3, 300 sec: 50484.9). Total num frames: 3112140800. Throughput: 0: 50169.4. Samples: 865020240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:17,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 13:52:17,285][49750] Updated weights for policy 0, policy_version 189951 (0.0026) [2024-04-26 13:52:20,871][49750] Updated weights for policy 0, policy_version 189961 (0.0033) [2024-04-26 13:52:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.5, 300 sec: 50373.9). Total num frames: 3112386560. Throughput: 0: 50135.1. Samples: 865177020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:22,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 13:52:23,771][49750] Updated weights for policy 0, policy_version 189971 (0.0031) [2024-04-26 13:52:27,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 3112615936. Throughput: 0: 50159.6. Samples: 865475200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:27,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 13:52:27,309][49750] Updated weights for policy 0, policy_version 189981 (0.0029) [2024-04-26 13:52:30,161][49750] Updated weights for policy 0, policy_version 189991 (0.0026) [2024-04-26 13:52:32,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 3112861696. Throughput: 0: 50138.9. Samples: 865778240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:32,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 13:52:33,868][49750] Updated weights for policy 0, policy_version 190001 (0.0033) [2024-04-26 13:52:36,775][49750] Updated weights for policy 0, policy_version 190011 (0.0034) [2024-04-26 13:52:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3113140224. Throughput: 0: 50250.3. Samples: 865927280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:37,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 13:52:40,347][49750] Updated weights for policy 0, policy_version 190021 (0.0031) [2024-04-26 13:52:42,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 3113402368. Throughput: 0: 50216.0. Samples: 866230460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:42,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 13:52:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000190027_3113402368.pth... [2024-04-26 13:52:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000189289_3101310976.pth [2024-04-26 13:52:43,675][49750] Updated weights for policy 0, policy_version 190031 (0.0038) [2024-04-26 13:52:46,951][49750] Updated weights for policy 0, policy_version 190041 (0.0027) [2024-04-26 13:52:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.5, 300 sec: 50262.8). Total num frames: 3113631744. Throughput: 0: 50340.9. Samples: 866539160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:47,063][49517] Avg episode reward: [(0, '0.442')] [2024-04-26 13:52:50,076][49750] Updated weights for policy 0, policy_version 190051 (0.0032) [2024-04-26 13:52:52,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 3113877504. Throughput: 0: 50279.0. Samples: 866680760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 13:52:53,362][49750] Updated weights for policy 0, policy_version 190061 (0.0031) [2024-04-26 13:52:56,677][49750] Updated weights for policy 0, policy_version 190071 (0.0035) [2024-04-26 13:52:57,063][49517] Fps is (10 sec: 50789.5, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 3114139648. Throughput: 0: 50177.8. Samples: 866979620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:52:57,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 13:52:59,960][49750] Updated weights for policy 0, policy_version 190081 (0.0037) [2024-04-26 13:53:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 50373.9). Total num frames: 3114385408. Throughput: 0: 50182.6. Samples: 867278460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:53:02,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 13:53:03,376][49750] Updated weights for policy 0, policy_version 190091 (0.0036) [2024-04-26 13:53:06,558][49750] Updated weights for policy 0, policy_version 190101 (0.0028) [2024-04-26 13:53:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.3, 300 sec: 50373.9). Total num frames: 3114647552. Throughput: 0: 50306.2. Samples: 867440800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 13:53:07,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 13:53:09,795][49750] Updated weights for policy 0, policy_version 190111 (0.0029) [2024-04-26 13:53:12,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.3, 300 sec: 50207.2). Total num frames: 3114860544. Throughput: 0: 50233.3. Samples: 867735700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 13:53:12,898][49750] Updated weights for policy 0, policy_version 190121 (0.0037) [2024-04-26 13:53:16,142][49750] Updated weights for policy 0, policy_version 190131 (0.0034) [2024-04-26 13:53:17,063][49517] Fps is (10 sec: 49150.9, 60 sec: 49971.0, 300 sec: 50262.8). Total num frames: 3115139072. Throughput: 0: 50158.4. Samples: 868035380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:17,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 13:53:19,539][49750] Updated weights for policy 0, policy_version 190141 (0.0032) [2024-04-26 13:53:21,527][49728] Signal inference workers to stop experience collection... (12950 times) [2024-04-26 13:53:21,527][49728] Signal inference workers to resume experience collection... (12950 times) [2024-04-26 13:53:21,541][49750] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-04-26 13:53:21,541][49750] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-04-26 13:53:22,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 3115401216. Throughput: 0: 50267.9. Samples: 868189340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 13:53:22,798][49750] Updated weights for policy 0, policy_version 190151 (0.0029) [2024-04-26 13:53:26,022][49750] Updated weights for policy 0, policy_version 190161 (0.0030) [2024-04-26 13:53:27,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 3115646976. Throughput: 0: 50402.9. Samples: 868498600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:27,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 13:53:29,292][49750] Updated weights for policy 0, policy_version 190171 (0.0034) [2024-04-26 13:53:32,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50207.2). Total num frames: 3115876352. Throughput: 0: 50273.3. Samples: 868801460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:32,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 13:53:32,433][49750] Updated weights for policy 0, policy_version 190181 (0.0034) [2024-04-26 13:53:35,789][49750] Updated weights for policy 0, policy_version 190191 (0.0030) [2024-04-26 13:53:37,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49698.0, 300 sec: 50207.2). Total num frames: 3116122112. Throughput: 0: 50233.1. Samples: 868941260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:37,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 13:53:39,020][49750] Updated weights for policy 0, policy_version 190201 (0.0031) [2024-04-26 13:53:42,063][49517] Fps is (10 sec: 52427.9, 60 sec: 49971.1, 300 sec: 50373.9). Total num frames: 3116400640. Throughput: 0: 50337.8. Samples: 869244820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 13:53:42,111][49750] Updated weights for policy 0, policy_version 190211 (0.0035) [2024-04-26 13:53:45,497][49750] Updated weights for policy 0, policy_version 190221 (0.0029) [2024-04-26 13:53:47,062][49517] Fps is (10 sec: 54068.0, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3116662784. Throughput: 0: 50389.3. Samples: 869545980. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 13:53:48,620][49750] Updated weights for policy 0, policy_version 190231 (0.0034) [2024-04-26 13:53:51,966][49750] Updated weights for policy 0, policy_version 190241 (0.0030) [2024-04-26 13:53:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50373.9). Total num frames: 3116908544. Throughput: 0: 50336.0. Samples: 869705920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:52,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 13:53:55,115][49750] Updated weights for policy 0, policy_version 190251 (0.0030) [2024-04-26 13:53:57,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50207.2). Total num frames: 3117137920. Throughput: 0: 50439.1. Samples: 870005460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:53:57,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 13:53:58,389][49750] Updated weights for policy 0, policy_version 190261 (0.0036) [2024-04-26 13:54:01,710][49750] Updated weights for policy 0, policy_version 190271 (0.0034) [2024-04-26 13:54:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 3117400064. Throughput: 0: 50473.2. Samples: 870306660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:54:02,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 13:54:04,971][49750] Updated weights for policy 0, policy_version 190281 (0.0024) [2024-04-26 13:54:07,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3117678592. Throughput: 0: 50437.9. Samples: 870459040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:54:07,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 13:54:08,121][49750] Updated weights for policy 0, policy_version 190291 (0.0033) [2024-04-26 13:54:11,547][49750] Updated weights for policy 0, policy_version 190301 (0.0039) [2024-04-26 13:54:12,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50373.8). Total num frames: 3117924352. Throughput: 0: 50382.4. Samples: 870765800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 13:54:12,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 13:54:14,477][49750] Updated weights for policy 0, policy_version 190311 (0.0032) [2024-04-26 13:54:17,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.4, 300 sec: 50262.8). Total num frames: 3118137344. Throughput: 0: 50373.3. Samples: 871068260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:54:17,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 13:54:18,002][49750] Updated weights for policy 0, policy_version 190321 (0.0036) [2024-04-26 13:54:21,118][49750] Updated weights for policy 0, policy_version 190331 (0.0034) [2024-04-26 13:54:22,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50207.2). Total num frames: 3118399488. Throughput: 0: 50322.7. Samples: 871205780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:54:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 13:54:24,591][49750] Updated weights for policy 0, policy_version 190341 (0.0031) [2024-04-26 13:54:27,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 3118678016. Throughput: 0: 50451.3. Samples: 871515120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:54:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 13:54:27,324][49728] Signal inference workers to stop experience collection... 
(13000 times) [2024-04-26 13:54:27,368][49750] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-04-26 13:54:27,431][49728] Signal inference workers to resume experience collection... (13000 times) [2024-04-26 13:54:27,431][49750] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-04-26 13:54:27,556][49750] Updated weights for policy 0, policy_version 190351 (0.0032) [2024-04-26 13:54:31,155][49750] Updated weights for policy 0, policy_version 190361 (0.0035) [2024-04-26 13:54:32,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3118923776. Throughput: 0: 50349.3. Samples: 871811700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:54:32,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 13:54:33,969][49750] Updated weights for policy 0, policy_version 190371 (0.0029) [2024-04-26 13:54:37,062][49517] Fps is (10 sec: 45874.7, 60 sec: 50244.4, 300 sec: 50207.2). Total num frames: 3119136768. Throughput: 0: 50168.3. Samples: 871963500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:54:37,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 13:54:37,650][49750] Updated weights for policy 0, policy_version 190381 (0.0031) [2024-04-26 13:54:40,637][49750] Updated weights for policy 0, policy_version 190391 (0.0032) [2024-04-26 13:54:42,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 3119398912. Throughput: 0: 50140.8. Samples: 872261800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:54:42,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 13:54:42,104][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000190394_3119415296.pth... 
[2024-04-26 13:54:42,146][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000189657_3107340288.pth [2024-04-26 13:54:44,064][49750] Updated weights for policy 0, policy_version 190401 (0.0032) [2024-04-26 13:54:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 3119661056. Throughput: 0: 50040.7. Samples: 872558500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:54:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 13:54:47,384][49750] Updated weights for policy 0, policy_version 190411 (0.0036) [2024-04-26 13:54:50,512][49750] Updated weights for policy 0, policy_version 190421 (0.0027) [2024-04-26 13:54:52,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.2, 300 sec: 50373.9). Total num frames: 3119939584. Throughput: 0: 50414.5. Samples: 872727700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:54:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 13:54:54,178][49750] Updated weights for policy 0, policy_version 190431 (0.0039) [2024-04-26 13:54:57,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 3120168960. Throughput: 0: 50224.2. Samples: 873025900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:54:57,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 13:54:57,194][49750] Updated weights for policy 0, policy_version 190441 (0.0035) [2024-04-26 13:55:00,703][49750] Updated weights for policy 0, policy_version 190451 (0.0035) [2024-04-26 13:55:02,062][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.1, 300 sec: 50207.3). Total num frames: 3120398336. Throughput: 0: 50095.5. Samples: 873322560. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:55:02,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 13:55:03,796][49750] Updated weights for policy 0, policy_version 190461 (0.0032) [2024-04-26 13:55:07,062][49517] Fps is (10 sec: 49153.1, 60 sec: 49698.1, 300 sec: 50207.3). Total num frames: 3120660480. Throughput: 0: 50223.7. Samples: 873465840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:55:07,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 13:55:07,091][49750] Updated weights for policy 0, policy_version 190471 (0.0030) [2024-04-26 13:55:10,207][49750] Updated weights for policy 0, policy_version 190481 (0.0030) [2024-04-26 13:55:12,062][49517] Fps is (10 sec: 55705.8, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3120955392. Throughput: 0: 50127.1. Samples: 873770840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:55:12,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 13:55:13,451][49750] Updated weights for policy 0, policy_version 190491 (0.0032) [2024-04-26 13:55:16,664][49750] Updated weights for policy 0, policy_version 190501 (0.0033) [2024-04-26 13:55:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 3121168384. Throughput: 0: 50348.1. Samples: 874077360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 13:55:17,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 13:55:20,034][49750] Updated weights for policy 0, policy_version 190511 (0.0037) [2024-04-26 13:55:22,062][49517] Fps is (10 sec: 44237.0, 60 sec: 49971.3, 300 sec: 50151.7). Total num frames: 3121397760. Throughput: 0: 50150.3. Samples: 874220260. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:55:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 13:55:23,395][49750] Updated weights for policy 0, policy_version 190521 (0.0037) [2024-04-26 13:55:26,683][49750] Updated weights for policy 0, policy_version 190531 (0.0037) [2024-04-26 13:55:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.2, 300 sec: 50207.2). Total num frames: 3121676288. Throughput: 0: 50109.9. Samples: 874516740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:55:27,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:55:29,160][49728] Signal inference workers to stop experience collection... (13050 times) [2024-04-26 13:55:29,196][49750] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-04-26 13:55:29,230][49728] Signal inference workers to resume experience collection... (13050 times) [2024-04-26 13:55:29,230][49750] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-04-26 13:55:29,964][49750] Updated weights for policy 0, policy_version 190541 (0.0034) [2024-04-26 13:55:32,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3121938432. Throughput: 0: 50268.9. Samples: 874820600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:55:32,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 13:55:33,152][49750] Updated weights for policy 0, policy_version 190551 (0.0029) [2024-04-26 13:55:36,467][49750] Updated weights for policy 0, policy_version 190561 (0.0034) [2024-04-26 13:55:37,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 3122184192. Throughput: 0: 50045.6. Samples: 874979760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:55:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 13:55:39,590][49750] Updated weights for policy 0, policy_version 190571 (0.0035) [2024-04-26 13:55:42,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 3122413568. Throughput: 0: 49911.4. Samples: 875271900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:55:42,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 13:55:42,901][49750] Updated weights for policy 0, policy_version 190581 (0.0030) [2024-04-26 13:55:46,069][49750] Updated weights for policy 0, policy_version 190591 (0.0038) [2024-04-26 13:55:47,062][49517] Fps is (10 sec: 49153.4, 60 sec: 50244.3, 300 sec: 50207.3). Total num frames: 3122675712. Throughput: 0: 50076.5. Samples: 875576000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:55:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 13:55:49,477][49750] Updated weights for policy 0, policy_version 190601 (0.0032) [2024-04-26 13:55:52,062][49517] Fps is (10 sec: 50789.8, 60 sec: 49698.1, 300 sec: 50262.8). Total num frames: 3122921472. Throughput: 0: 50279.9. Samples: 875728440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:55:52,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 13:55:52,714][49750] Updated weights for policy 0, policy_version 190611 (0.0031) [2024-04-26 13:55:56,061][49750] Updated weights for policy 0, policy_version 190621 (0.0035) [2024-04-26 13:55:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.5, 300 sec: 50318.3). Total num frames: 3123200000. Throughput: 0: 50200.0. Samples: 876029840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:55:57,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 13:55:59,242][49750] Updated weights for policy 0, policy_version 190631 (0.0031) [2024-04-26 13:56:02,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50207.2). 
Total num frames: 3123429376. Throughput: 0: 50137.2. Samples: 876333540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:56:02,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 13:56:02,466][49750] Updated weights for policy 0, policy_version 190641 (0.0029) [2024-04-26 13:56:05,907][49750] Updated weights for policy 0, policy_version 190651 (0.0024) [2024-04-26 13:56:07,062][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.2, 300 sec: 50096.2). Total num frames: 3123658752. Throughput: 0: 50202.6. Samples: 876479380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:56:07,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 13:56:09,078][49750] Updated weights for policy 0, policy_version 190661 (0.0037) [2024-04-26 13:56:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 49425.1, 300 sec: 50151.7). Total num frames: 3123920896. Throughput: 0: 50164.0. Samples: 876774120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:56:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 13:56:12,263][49750] Updated weights for policy 0, policy_version 190671 (0.0032) [2024-04-26 13:56:15,620][49750] Updated weights for policy 0, policy_version 190681 (0.0033) [2024-04-26 13:56:17,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 3124199424. Throughput: 0: 50112.8. Samples: 877075680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:56:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 13:56:18,734][49750] Updated weights for policy 0, policy_version 190691 (0.0033) [2024-04-26 13:56:22,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 3124428800. Throughput: 0: 50084.1. Samples: 877233540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 13:56:22,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:56:22,155][49750] Updated weights for policy 0, policy_version 190701 (0.0028) [2024-04-26 13:56:25,217][49750] Updated weights for policy 0, policy_version 190711 (0.0031) [2024-04-26 13:56:27,062][49517] Fps is (10 sec: 45875.9, 60 sec: 49698.2, 300 sec: 50151.7). Total num frames: 3124658176. Throughput: 0: 50196.5. Samples: 877530740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:56:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 13:56:28,573][49750] Updated weights for policy 0, policy_version 190721 (0.0036) [2024-04-26 13:56:31,625][49750] Updated weights for policy 0, policy_version 190731 (0.0033) [2024-04-26 13:56:32,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 3124953088. Throughput: 0: 50366.7. Samples: 877842500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:56:32,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:56:34,970][49750] Updated weights for policy 0, policy_version 190741 (0.0039) [2024-04-26 13:56:37,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50244.5, 300 sec: 50318.3). Total num frames: 3125198848. Throughput: 0: 50428.6. Samples: 877997720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:56:37,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 13:56:38,015][49750] Updated weights for policy 0, policy_version 190751 (0.0032) [2024-04-26 13:56:41,504][49750] Updated weights for policy 0, policy_version 190761 (0.0038) [2024-04-26 13:56:42,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50262.8). Total num frames: 3125444608. Throughput: 0: 50382.2. Samples: 878297040. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:56:42,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 13:56:42,163][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000190763_3125460992.pth... [2024-04-26 13:56:42,213][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000190027_3113402368.pth [2024-04-26 13:56:44,573][49750] Updated weights for policy 0, policy_version 190771 (0.0038) [2024-04-26 13:56:45,913][49728] Signal inference workers to stop experience collection... (13100 times) [2024-04-26 13:56:45,913][49728] Signal inference workers to resume experience collection... (13100 times) [2024-04-26 13:56:45,929][49750] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-04-26 13:56:45,930][49750] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-04-26 13:56:47,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 3125706752. Throughput: 0: 50230.3. Samples: 878593900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:56:47,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 13:56:48,075][49750] Updated weights for policy 0, policy_version 190781 (0.0032) [2024-04-26 13:56:51,165][49750] Updated weights for policy 0, policy_version 190791 (0.0032) [2024-04-26 13:56:52,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.4, 300 sec: 50151.7). Total num frames: 3125936128. Throughput: 0: 50429.9. Samples: 878748720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:56:52,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 13:56:54,581][49750] Updated weights for policy 0, policy_version 190801 (0.0036) [2024-04-26 13:56:57,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49698.1, 300 sec: 50151.7). Total num frames: 3126181888. Throughput: 0: 50466.2. Samples: 879045100. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:56:57,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 13:56:57,737][49750] Updated weights for policy 0, policy_version 190811 (0.0033) [2024-04-26 13:57:00,973][49750] Updated weights for policy 0, policy_version 190821 (0.0029) [2024-04-26 13:57:02,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.4, 300 sec: 50373.8). Total num frames: 3126460416. Throughput: 0: 50472.5. Samples: 879346940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:57:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 13:57:04,231][49750] Updated weights for policy 0, policy_version 190831 (0.0030) [2024-04-26 13:57:07,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50318.3). Total num frames: 3126706176. Throughput: 0: 50415.2. Samples: 879502220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:57:07,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 13:57:07,363][49750] Updated weights for policy 0, policy_version 190841 (0.0031) [2024-04-26 13:57:10,672][49750] Updated weights for policy 0, policy_version 190851 (0.0035) [2024-04-26 13:57:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50207.2). Total num frames: 3126951936. Throughput: 0: 50435.1. Samples: 879800320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:57:12,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 13:57:13,870][49750] Updated weights for policy 0, policy_version 190861 (0.0035) [2024-04-26 13:57:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 3127214080. Throughput: 0: 50236.8. Samples: 880103160. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:57:17,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 13:57:17,139][49750] Updated weights for policy 0, policy_version 190871 (0.0030) [2024-04-26 13:57:20,595][49750] Updated weights for policy 0, policy_version 190881 (0.0037) [2024-04-26 13:57:22,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50318.3). Total num frames: 3127459840. Throughput: 0: 50152.8. Samples: 880254600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:57:22,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:57:23,757][49750] Updated weights for policy 0, policy_version 190891 (0.0031) [2024-04-26 13:57:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 3127705600. Throughput: 0: 50194.6. Samples: 880555800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:57:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 13:57:27,085][49750] Updated weights for policy 0, policy_version 190901 (0.0030) [2024-04-26 13:57:30,309][49750] Updated weights for policy 0, policy_version 190911 (0.0031) [2024-04-26 13:57:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 3127967744. Throughput: 0: 50421.0. Samples: 880862840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 13:57:32,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 13:57:33,644][49750] Updated weights for policy 0, policy_version 190921 (0.0038) [2024-04-26 13:57:36,700][49750] Updated weights for policy 0, policy_version 190931 (0.0034) [2024-04-26 13:57:37,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.1, 300 sec: 50207.2). Total num frames: 3128213504. Throughput: 0: 50461.0. Samples: 881019480. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:57:37,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 13:57:40,076][49750] Updated weights for policy 0, policy_version 190941 (0.0035) [2024-04-26 13:57:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 3128459264. Throughput: 0: 50415.2. Samples: 881313780. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:57:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 13:57:43,079][49750] Updated weights for policy 0, policy_version 190951 (0.0028) [2024-04-26 13:57:46,588][49750] Updated weights for policy 0, policy_version 190961 (0.0046) [2024-04-26 13:57:47,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3128721408. Throughput: 0: 50478.7. Samples: 881618480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:57:47,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 13:57:49,714][49750] Updated weights for policy 0, policy_version 190971 (0.0035) [2024-04-26 13:57:52,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 3128967168. Throughput: 0: 50383.4. Samples: 881769480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:57:52,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 13:57:53,053][49750] Updated weights for policy 0, policy_version 190981 (0.0029) [2024-04-26 13:57:56,692][49750] Updated weights for policy 0, policy_version 190991 (0.0036) [2024-04-26 13:57:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 3129212928. Throughput: 0: 50433.3. Samples: 882069820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:57:57,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 13:57:59,591][49750] Updated weights for policy 0, policy_version 191001 (0.0030) [2024-04-26 13:58:01,318][49728] Signal inference workers to stop experience collection... 
(13150 times) [2024-04-26 13:58:01,318][49728] Signal inference workers to resume experience collection... (13150 times) [2024-04-26 13:58:01,345][49750] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-04-26 13:58:01,345][49750] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-04-26 13:58:02,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 3129475072. Throughput: 0: 50520.5. Samples: 882376580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:58:02,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 13:58:03,131][49750] Updated weights for policy 0, policy_version 191011 (0.0030) [2024-04-26 13:58:06,085][49750] Updated weights for policy 0, policy_version 191021 (0.0029) [2024-04-26 13:58:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 50373.8). Total num frames: 3129720832. Throughput: 0: 50362.7. Samples: 882520920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:58:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 13:58:09,481][49750] Updated weights for policy 0, policy_version 191031 (0.0036) [2024-04-26 13:58:12,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.2, 300 sec: 50318.3). Total num frames: 3129982976. Throughput: 0: 50459.9. Samples: 882826500. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:58:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 13:58:12,509][49750] Updated weights for policy 0, policy_version 191041 (0.0029) [2024-04-26 13:58:16,203][49750] Updated weights for policy 0, policy_version 191051 (0.0033) [2024-04-26 13:58:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 3130228736. Throughput: 0: 50451.8. Samples: 883133180. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:58:17,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 13:58:19,072][49750] Updated weights for policy 0, policy_version 191061 (0.0036) [2024-04-26 13:58:22,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.4, 300 sec: 50262.8). Total num frames: 3130474496. Throughput: 0: 50226.5. Samples: 883279660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:58:22,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 13:58:22,680][49750] Updated weights for policy 0, policy_version 191071 (0.0035) [2024-04-26 13:58:25,817][49750] Updated weights for policy 0, policy_version 191081 (0.0036) [2024-04-26 13:58:27,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50318.3). Total num frames: 3130720256. Throughput: 0: 50442.5. Samples: 883583700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:58:27,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 13:58:29,069][49750] Updated weights for policy 0, policy_version 191091 (0.0031) [2024-04-26 13:58:32,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.2, 300 sec: 50318.3). Total num frames: 3130966016. Throughput: 0: 50415.1. Samples: 883887160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:58:32,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 13:58:32,228][49750] Updated weights for policy 0, policy_version 191101 (0.0035) [2024-04-26 13:58:35,563][49750] Updated weights for policy 0, policy_version 191111 (0.0033) [2024-04-26 13:58:37,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.5, 300 sec: 50318.3). Total num frames: 3131244544. Throughput: 0: 50184.2. Samples: 884027760. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 13:58:37,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 13:58:38,746][49750] Updated weights for policy 0, policy_version 191121 (0.0033) [2024-04-26 13:58:42,050][49750] Updated weights for policy 0, policy_version 191131 (0.0025) [2024-04-26 13:58:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 3131490304. Throughput: 0: 50322.9. Samples: 884334360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:58:42,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 13:58:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000191131_3131490304.pth... [2024-04-26 13:58:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000190394_3119415296.pth [2024-04-26 13:58:45,386][49750] Updated weights for policy 0, policy_version 191141 (0.0033) [2024-04-26 13:58:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50262.8). Total num frames: 3131736064. Throughput: 0: 50383.1. Samples: 884643820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:58:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 13:58:48,409][49750] Updated weights for policy 0, policy_version 191151 (0.0028) [2024-04-26 13:58:51,853][49750] Updated weights for policy 0, policy_version 191161 (0.0037) [2024-04-26 13:58:52,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 3131981824. Throughput: 0: 50271.2. Samples: 884783120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:58:52,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 13:58:55,020][49750] Updated weights for policy 0, policy_version 191171 (0.0030) [2024-04-26 13:58:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50262.8). Total num frames: 3132227584. Throughput: 0: 50265.9. Samples: 885088460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:58:57,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 13:58:58,142][49728] Signal inference workers to stop experience collection... (13200 times) [2024-04-26 13:58:58,142][49728] Signal inference workers to resume experience collection... (13200 times) [2024-04-26 13:58:58,157][49750] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-04-26 13:58:58,178][49750] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-04-26 13:58:58,273][49750] Updated weights for policy 0, policy_version 191181 (0.0030) [2024-04-26 13:59:01,556][49750] Updated weights for policy 0, policy_version 191191 (0.0037) [2024-04-26 13:59:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.1, 300 sec: 50207.2). Total num frames: 3132489728. Throughput: 0: 50259.6. Samples: 885394860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:59:02,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 13:59:04,663][49750] Updated weights for policy 0, policy_version 191201 (0.0029) [2024-04-26 13:59:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 3132751872. Throughput: 0: 50356.9. Samples: 885545720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:59:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 13:59:08,100][49750] Updated weights for policy 0, policy_version 191211 (0.0029) [2024-04-26 13:59:11,168][49750] Updated weights for policy 0, policy_version 191221 (0.0033) [2024-04-26 13:59:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 49971.4, 300 sec: 50318.3). Total num frames: 3132981248. Throughput: 0: 50388.2. Samples: 885851160. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:59:12,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 13:59:14,666][49750] Updated weights for policy 0, policy_version 191231 (0.0028) [2024-04-26 13:59:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50318.3). Total num frames: 3133243392. Throughput: 0: 50342.3. Samples: 886152560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:59:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 13:59:17,807][49750] Updated weights for policy 0, policy_version 191241 (0.0035) [2024-04-26 13:59:21,046][49750] Updated weights for policy 0, policy_version 191251 (0.0031) [2024-04-26 13:59:22,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 3133505536. Throughput: 0: 50434.1. Samples: 886297300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:59:22,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 13:59:24,245][49750] Updated weights for policy 0, policy_version 191261 (0.0030) [2024-04-26 13:59:27,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50262.8). Total num frames: 3133751296. Throughput: 0: 50305.4. Samples: 886598100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:59:27,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 13:59:27,462][49750] Updated weights for policy 0, policy_version 191271 (0.0031) [2024-04-26 13:59:31,142][49750] Updated weights for policy 0, policy_version 191281 (0.0028) [2024-04-26 13:59:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3133997056. Throughput: 0: 50203.9. Samples: 886903000. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:59:32,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 13:59:33,881][49750] Updated weights for policy 0, policy_version 191291 (0.0039) [2024-04-26 13:59:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.1, 300 sec: 50318.3). Total num frames: 3134242816. Throughput: 0: 50370.2. Samples: 887049780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:59:37,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 13:59:37,839][49750] Updated weights for policy 0, policy_version 191301 (0.0026) [2024-04-26 13:59:40,376][49750] Updated weights for policy 0, policy_version 191311 (0.0027) [2024-04-26 13:59:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3134504960. Throughput: 0: 50355.8. Samples: 887354480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 13:59:42,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 13:59:44,144][49750] Updated weights for policy 0, policy_version 191321 (0.0028) [2024-04-26 13:59:46,674][49750] Updated weights for policy 0, policy_version 191331 (0.0032) [2024-04-26 13:59:47,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50318.3). Total num frames: 3134783488. Throughput: 0: 50270.3. Samples: 887657020. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 13:59:47,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 13:59:50,402][49728] Signal inference workers to stop experience collection... (13250 times) [2024-04-26 13:59:50,402][49728] Signal inference workers to resume experience collection... 
(13250 times) [2024-04-26 13:59:50,415][49750] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-04-26 13:59:50,415][49750] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-04-26 13:59:50,540][49750] Updated weights for policy 0, policy_version 191341 (0.0035) [2024-04-26 13:59:52,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50517.4, 300 sec: 50318.4). Total num frames: 3135012864. Throughput: 0: 50533.4. Samples: 887819720. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 13:59:52,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 13:59:53,132][49750] Updated weights for policy 0, policy_version 191351 (0.0034) [2024-04-26 13:59:56,937][49750] Updated weights for policy 0, policy_version 191361 (0.0035) [2024-04-26 13:59:57,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.2, 300 sec: 50373.8). Total num frames: 3135258624. Throughput: 0: 50402.0. Samples: 888119260. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 13:59:57,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 13:59:59,715][49750] Updated weights for policy 0, policy_version 191371 (0.0032) [2024-04-26 14:00:02,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3135504384. Throughput: 0: 50311.0. Samples: 888416560. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:02,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 14:00:03,519][49750] Updated weights for policy 0, policy_version 191381 (0.0029) [2024-04-26 14:00:06,297][49750] Updated weights for policy 0, policy_version 191391 (0.0032) [2024-04-26 14:00:07,063][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.2, 300 sec: 50262.8). Total num frames: 3135782912. Throughput: 0: 50553.8. Samples: 888572220. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:07,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 14:00:10,002][49750] Updated weights for policy 0, policy_version 191401 (0.0036) [2024-04-26 14:00:12,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 3136045056. Throughput: 0: 50556.5. Samples: 888873140. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:12,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 14:00:12,655][49750] Updated weights for policy 0, policy_version 191411 (0.0039) [2024-04-26 14:00:16,567][49750] Updated weights for policy 0, policy_version 191421 (0.0031) [2024-04-26 14:00:17,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3136258048. Throughput: 0: 50522.4. Samples: 889176500. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:17,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:00:19,233][49750] Updated weights for policy 0, policy_version 191431 (0.0032) [2024-04-26 14:00:22,063][49517] Fps is (10 sec: 45873.9, 60 sec: 49971.0, 300 sec: 50262.7). Total num frames: 3136503808. Throughput: 0: 50469.5. Samples: 889320920. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:22,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 14:00:23,120][49750] Updated weights for policy 0, policy_version 191441 (0.0033) [2024-04-26 14:00:25,571][49750] Updated weights for policy 0, policy_version 191451 (0.0030) [2024-04-26 14:00:27,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50517.3, 300 sec: 50318.3). Total num frames: 3136782336. Throughput: 0: 50335.5. Samples: 889619580. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:27,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 14:00:29,505][49750] Updated weights for policy 0, policy_version 191461 (0.0031) [2024-04-26 14:00:32,003][49750] Updated weights for policy 0, policy_version 191471 (0.0026) [2024-04-26 14:00:32,062][49517] Fps is (10 sec: 55707.1, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 3137060864. Throughput: 0: 50343.2. Samples: 889922460. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 14:00:36,108][49750] Updated weights for policy 0, policy_version 191481 (0.0034) [2024-04-26 14:00:37,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50244.3, 300 sec: 50318.3). Total num frames: 3137257472. Throughput: 0: 50279.5. Samples: 890082300. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:37,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 14:00:38,526][49750] Updated weights for policy 0, policy_version 191491 (0.0029) [2024-04-26 14:00:42,062][49517] Fps is (10 sec: 44237.0, 60 sec: 49971.3, 300 sec: 50262.8). Total num frames: 3137503232. Throughput: 0: 50335.7. Samples: 890384360. Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:42,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 14:00:42,129][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000191499_3137519616.pth... [2024-04-26 14:00:42,172][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000190763_3125460992.pth [2024-04-26 14:00:42,491][49750] Updated weights for policy 0, policy_version 191501 (0.0031) [2024-04-26 14:00:45,105][49750] Updated weights for policy 0, policy_version 191511 (0.0028) [2024-04-26 14:00:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 49698.2, 300 sec: 50318.3). Total num frames: 3137765376. Throughput: 0: 50221.5. Samples: 890676520. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 20.0) [2024-04-26 14:00:47,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 14:00:49,071][49750] Updated weights for policy 0, policy_version 191521 (0.0030) [2024-04-26 14:00:51,496][49750] Updated weights for policy 0, policy_version 191531 (0.0033) [2024-04-26 14:00:52,063][49517] Fps is (10 sec: 55705.4, 60 sec: 50790.3, 300 sec: 50373.8). Total num frames: 3138060288. Throughput: 0: 50361.4. Samples: 890838480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:00:52,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 14:00:55,615][49750] Updated weights for policy 0, policy_version 191541 (0.0029) [2024-04-26 14:00:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.6, 300 sec: 50373.9). Total num frames: 3138289664. Throughput: 0: 50311.7. Samples: 891137160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:00:57,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 14:00:58,186][49750] Updated weights for policy 0, policy_version 191551 (0.0029) [2024-04-26 14:01:01,950][49750] Updated weights for policy 0, policy_version 191561 (0.0032) [2024-04-26 14:01:02,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3138535424. Throughput: 0: 50307.1. Samples: 891440320. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 14:01:04,782][49750] Updated weights for policy 0, policy_version 191571 (0.0035) [2024-04-26 14:01:06,836][49728] Signal inference workers to stop experience collection... (13300 times) [2024-04-26 14:01:06,836][49728] Signal inference workers to resume experience collection... 
(13300 times) [2024-04-26 14:01:06,870][49750] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-04-26 14:01:06,870][49750] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-04-26 14:01:07,063][49517] Fps is (10 sec: 49150.8, 60 sec: 49971.1, 300 sec: 50373.8). Total num frames: 3138781184. Throughput: 0: 50329.5. Samples: 891585740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 14:01:08,354][49750] Updated weights for policy 0, policy_version 191581 (0.0029) [2024-04-26 14:01:11,197][49750] Updated weights for policy 0, policy_version 191591 (0.0031) [2024-04-26 14:01:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50373.9). Total num frames: 3139059712. Throughput: 0: 50368.2. Samples: 891886140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:12,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 14:01:14,885][49750] Updated weights for policy 0, policy_version 191601 (0.0032) [2024-04-26 14:01:17,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.2, 300 sec: 50484.9). Total num frames: 3139321856. Throughput: 0: 50487.8. Samples: 892194420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:17,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:01:17,524][49750] Updated weights for policy 0, policy_version 191611 (0.0033) [2024-04-26 14:01:21,441][49750] Updated weights for policy 0, policy_version 191621 (0.0029) [2024-04-26 14:01:22,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.5, 300 sec: 50429.4). Total num frames: 3139534848. Throughput: 0: 50444.7. Samples: 892352320. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 14:01:23,996][49750] Updated weights for policy 0, policy_version 191631 (0.0031) [2024-04-26 14:01:27,063][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.2, 300 sec: 50262.8). Total num frames: 3139780608. Throughput: 0: 50368.3. Samples: 892650940. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:27,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 14:01:27,845][49750] Updated weights for policy 0, policy_version 191641 (0.0033) [2024-04-26 14:01:30,580][49750] Updated weights for policy 0, policy_version 191651 (0.0026) [2024-04-26 14:01:32,063][49517] Fps is (10 sec: 50790.3, 60 sec: 49698.1, 300 sec: 50318.3). Total num frames: 3140042752. Throughput: 0: 50500.7. Samples: 892949060. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:32,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 14:01:34,234][49750] Updated weights for policy 0, policy_version 191661 (0.0030) [2024-04-26 14:01:37,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51063.5, 300 sec: 50429.4). Total num frames: 3140321280. Throughput: 0: 50437.0. Samples: 893108140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:37,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 14:01:37,129][49750] Updated weights for policy 0, policy_version 191671 (0.0034) [2024-04-26 14:01:40,658][49750] Updated weights for policy 0, policy_version 191681 (0.0031) [2024-04-26 14:01:42,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.5, 300 sec: 50373.9). Total num frames: 3140567040. Throughput: 0: 50567.1. Samples: 893412680. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 14:01:43,411][49750] Updated weights for policy 0, policy_version 191691 (0.0028) [2024-04-26 14:01:47,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50373.9). Total num frames: 3140796416. Throughput: 0: 50558.2. Samples: 893715440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:47,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:01:47,616][49750] Updated weights for policy 0, policy_version 191701 (0.0028) [2024-04-26 14:01:49,785][49750] Updated weights for policy 0, policy_version 191711 (0.0038) [2024-04-26 14:01:52,062][49517] Fps is (10 sec: 49151.5, 60 sec: 49971.2, 300 sec: 50429.4). Total num frames: 3141058560. Throughput: 0: 50572.6. Samples: 893861500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:52,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 14:01:53,928][49750] Updated weights for policy 0, policy_version 191721 (0.0031) [2024-04-26 14:01:56,587][49750] Updated weights for policy 0, policy_version 191731 (0.0026) [2024-04-26 14:01:57,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3141337088. Throughput: 0: 50660.4. Samples: 894165860. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-04-26 14:01:57,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 14:02:00,321][49750] Updated weights for policy 0, policy_version 191741 (0.0029) [2024-04-26 14:02:02,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3141582848. Throughput: 0: 50592.6. Samples: 894471080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:02,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 14:02:02,921][49750] Updated weights for policy 0, policy_version 191751 (0.0031) [2024-04-26 14:02:06,798][49750] Updated weights for policy 0, policy_version 191761 (0.0031) [2024-04-26 14:02:07,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.6, 300 sec: 50373.9). Total num frames: 3141812224. Throughput: 0: 50574.4. Samples: 894628160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 14:02:09,359][49750] Updated weights for policy 0, policy_version 191771 (0.0036) [2024-04-26 14:02:12,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.1, 300 sec: 50373.8). Total num frames: 3142074368. Throughput: 0: 50737.2. Samples: 894934120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:12,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 14:02:13,110][49750] Updated weights for policy 0, policy_version 191781 (0.0031) [2024-04-26 14:02:13,578][49728] Signal inference workers to stop experience collection... (13350 times) [2024-04-26 14:02:13,579][49728] Signal inference workers to resume experience collection... (13350 times) [2024-04-26 14:02:13,604][49750] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-04-26 14:02:13,604][49750] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-04-26 14:02:16,029][49750] Updated weights for policy 0, policy_version 191791 (0.0040) [2024-04-26 14:02:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 49971.4, 300 sec: 50373.9). Total num frames: 3142320128. Throughput: 0: 50716.6. Samples: 895231300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:17,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 14:02:19,430][49750] Updated weights for policy 0, policy_version 191801 (0.0036) [2024-04-26 14:02:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.5, 300 sec: 50484.9). Total num frames: 3142598656. Throughput: 0: 50615.9. Samples: 895385860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:02:22,373][49750] Updated weights for policy 0, policy_version 191811 (0.0035) [2024-04-26 14:02:26,070][49750] Updated weights for policy 0, policy_version 191821 (0.0034) [2024-04-26 14:02:27,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.4, 300 sec: 50429.4). Total num frames: 3142844416. Throughput: 0: 50643.8. Samples: 895691660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 14:02:28,768][49750] Updated weights for policy 0, policy_version 191831 (0.0032) [2024-04-26 14:02:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50429.4). Total num frames: 3143090176. Throughput: 0: 50661.8. Samples: 895995220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:32,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 14:02:32,553][49750] Updated weights for policy 0, policy_version 191841 (0.0034) [2024-04-26 14:02:35,259][49750] Updated weights for policy 0, policy_version 191851 (0.0031) [2024-04-26 14:02:37,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.2, 300 sec: 50429.4). Total num frames: 3143335936. Throughput: 0: 50714.8. Samples: 896143660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:37,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:02:38,897][49750] Updated weights for policy 0, policy_version 191861 (0.0027) [2024-04-26 14:02:41,762][49750] Updated weights for policy 0, policy_version 191871 (0.0033) [2024-04-26 14:02:42,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 3143614464. Throughput: 0: 50633.2. Samples: 896444360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:42,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:02:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000191871_3143614464.pth... [2024-04-26 14:02:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000191131_3131490304.pth [2024-04-26 14:02:45,437][49750] Updated weights for policy 0, policy_version 191881 (0.0032) [2024-04-26 14:02:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50429.4). Total num frames: 3143843840. Throughput: 0: 50676.6. Samples: 896751520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:47,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 14:02:48,274][49750] Updated weights for policy 0, policy_version 191891 (0.0032) [2024-04-26 14:02:51,968][49750] Updated weights for policy 0, policy_version 191901 (0.0031) [2024-04-26 14:02:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.5, 300 sec: 50484.9). Total num frames: 3144105984. Throughput: 0: 50762.1. Samples: 896912460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:52,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 14:02:54,897][49750] Updated weights for policy 0, policy_version 191911 (0.0037) [2024-04-26 14:02:57,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50429.4). Total num frames: 3144351744. Throughput: 0: 50546.4. Samples: 897208700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:02:57,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 14:02:58,409][49750] Updated weights for policy 0, policy_version 191921 (0.0030) [2024-04-26 14:03:01,404][49750] Updated weights for policy 0, policy_version 191931 (0.0032) [2024-04-26 14:03:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50429.4). Total num frames: 3144597504. Throughput: 0: 50655.6. Samples: 897510800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 14:03:02,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 14:03:04,640][49750] Updated weights for policy 0, policy_version 191941 (0.0035) [2024-04-26 14:03:07,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.3, 300 sec: 50429.4). Total num frames: 3144859648. Throughput: 0: 50652.1. Samples: 897665200. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 14:03:07,837][49750] Updated weights for policy 0, policy_version 191951 (0.0031) [2024-04-26 14:03:11,008][49750] Updated weights for policy 0, policy_version 191961 (0.0030) [2024-04-26 14:03:12,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.6, 300 sec: 50485.0). Total num frames: 3145121792. Throughput: 0: 50682.8. Samples: 897972380. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 14:03:14,398][49750] Updated weights for policy 0, policy_version 191971 (0.0030) [2024-04-26 14:03:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50484.9). Total num frames: 3145367552. Throughput: 0: 50759.2. Samples: 898279380. 
Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:17,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:03:17,460][49750] Updated weights for policy 0, policy_version 191981 (0.0027) [2024-04-26 14:03:19,927][49728] Signal inference workers to stop experience collection... (13400 times) [2024-04-26 14:03:19,927][49728] Signal inference workers to resume experience collection... (13400 times) [2024-04-26 14:03:19,952][49750] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-04-26 14:03:19,953][49750] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-04-26 14:03:21,077][49750] Updated weights for policy 0, policy_version 191991 (0.0031) [2024-04-26 14:03:22,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50244.1, 300 sec: 50484.9). Total num frames: 3145613312. Throughput: 0: 50736.6. Samples: 898426820. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:22,064][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 14:03:23,997][49750] Updated weights for policy 0, policy_version 192001 (0.0036) [2024-04-26 14:03:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3145875456. Throughput: 0: 50800.9. Samples: 898730400. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:27,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 14:03:27,552][49750] Updated weights for policy 0, policy_version 192011 (0.0037) [2024-04-26 14:03:30,428][49750] Updated weights for policy 0, policy_version 192021 (0.0030) [2024-04-26 14:03:32,063][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50429.4). Total num frames: 3146121216. Throughput: 0: 50586.9. Samples: 899027940. 
Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:32,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 14:03:33,867][49750] Updated weights for policy 0, policy_version 192031 (0.0035) [2024-04-26 14:03:36,939][49750] Updated weights for policy 0, policy_version 192041 (0.0033) [2024-04-26 14:03:37,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50540.5). Total num frames: 3146399744. Throughput: 0: 50518.6. Samples: 899185800. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:37,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 14:03:40,172][49750] Updated weights for policy 0, policy_version 192051 (0.0033) [2024-04-26 14:03:42,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 50429.4). Total num frames: 3146612736. Throughput: 0: 50660.7. Samples: 899488440. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:42,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 14:03:43,383][49750] Updated weights for policy 0, policy_version 192061 (0.0034) [2024-04-26 14:03:46,765][49750] Updated weights for policy 0, policy_version 192071 (0.0026) [2024-04-26 14:03:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 3146891264. Throughput: 0: 50616.7. Samples: 899788560. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:47,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 14:03:50,207][49750] Updated weights for policy 0, policy_version 192081 (0.0034) [2024-04-26 14:03:52,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3147137024. Throughput: 0: 50565.8. Samples: 899940660. 
Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:52,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:03:53,307][49750] Updated weights for policy 0, policy_version 192091 (0.0027) [2024-04-26 14:03:56,593][49750] Updated weights for policy 0, policy_version 192101 (0.0028) [2024-04-26 14:03:57,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 3147399168. Throughput: 0: 50597.2. Samples: 900249260. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:03:57,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 14:03:59,837][49750] Updated weights for policy 0, policy_version 192111 (0.0030) [2024-04-26 14:04:02,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 3147644928. Throughput: 0: 50616.7. Samples: 900557140. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:04:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 14:04:02,856][49750] Updated weights for policy 0, policy_version 192121 (0.0026) [2024-04-26 14:04:06,132][49750] Updated weights for policy 0, policy_version 192131 (0.0031) [2024-04-26 14:04:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3147890688. Throughput: 0: 50607.3. Samples: 900704140. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 14:04:07,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 14:04:09,321][49750] Updated weights for policy 0, policy_version 192141 (0.0031) [2024-04-26 14:04:12,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3148152832. Throughput: 0: 50587.6. Samples: 901006840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:12,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 14:04:12,585][49750] Updated weights for policy 0, policy_version 192151 (0.0035) [2024-04-26 14:04:15,882][49750] Updated weights for policy 0, policy_version 192161 (0.0039) [2024-04-26 14:04:17,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 3148414976. Throughput: 0: 50672.6. Samples: 901308200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:17,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 14:04:19,087][49750] Updated weights for policy 0, policy_version 192171 (0.0034) [2024-04-26 14:04:22,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 3148660736. Throughput: 0: 50546.1. Samples: 901460380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:04:22,469][49750] Updated weights for policy 0, policy_version 192181 (0.0034) [2024-04-26 14:04:24,586][49728] Signal inference workers to stop experience collection... (13450 times) [2024-04-26 14:04:24,635][49750] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-04-26 14:04:24,651][49728] Signal inference workers to resume experience collection... (13450 times) [2024-04-26 14:04:24,658][49750] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-04-26 14:04:25,470][49750] Updated weights for policy 0, policy_version 192191 (0.0034) [2024-04-26 14:04:27,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3148906496. Throughput: 0: 50464.5. Samples: 901759340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:27,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 14:04:29,055][49750] Updated weights for policy 0, policy_version 192201 (0.0030) [2024-04-26 14:04:31,880][49750] Updated weights for policy 0, policy_version 192211 (0.0036) [2024-04-26 14:04:32,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50651.5). Total num frames: 3149185024. Throughput: 0: 50684.4. Samples: 902069360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:32,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 14:04:35,435][49750] Updated weights for policy 0, policy_version 192221 (0.0033) [2024-04-26 14:04:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3149430784. Throughput: 0: 50584.3. Samples: 902216960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:37,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 14:04:38,315][49750] Updated weights for policy 0, policy_version 192231 (0.0030) [2024-04-26 14:04:41,840][49750] Updated weights for policy 0, policy_version 192241 (0.0034) [2024-04-26 14:04:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.6, 300 sec: 50484.9). Total num frames: 3149676544. Throughput: 0: 50497.4. Samples: 902521640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:42,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 14:04:42,138][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000192242_3149692928.pth... [2024-04-26 14:04:42,180][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000191499_3137519616.pth [2024-04-26 14:04:44,819][49750] Updated weights for policy 0, policy_version 192251 (0.0031) [2024-04-26 14:04:47,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 3149922304. Throughput: 0: 50420.0. Samples: 902826040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:47,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:04:48,330][49750] Updated weights for policy 0, policy_version 192261 (0.0031) [2024-04-26 14:04:51,599][49750] Updated weights for policy 0, policy_version 192271 (0.0030) [2024-04-26 14:04:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 3150168064. Throughput: 0: 50451.0. Samples: 902974440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:52,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 14:04:54,910][49750] Updated weights for policy 0, policy_version 192281 (0.0032) [2024-04-26 14:04:57,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3150430208. Throughput: 0: 50553.8. Samples: 903281760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:04:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 14:04:58,574][49750] Updated weights for policy 0, policy_version 192291 (0.0032) [2024-04-26 14:05:01,306][49750] Updated weights for policy 0, policy_version 192301 (0.0036) [2024-04-26 14:05:02,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 3150692352. Throughput: 0: 50553.3. Samples: 903583100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:05:02,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 14:05:05,038][49750] Updated weights for policy 0, policy_version 192311 (0.0030) [2024-04-26 14:05:07,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50429.4). Total num frames: 3150921728. Throughput: 0: 50569.1. Samples: 903735980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:05:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 14:05:07,874][49750] Updated weights for policy 0, policy_version 192321 (0.0031) [2024-04-26 14:05:11,413][49750] Updated weights for policy 0, policy_version 192331 (0.0032) [2024-04-26 14:05:12,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3151183872. Throughput: 0: 50568.5. Samples: 904034920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:05:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 14:05:14,201][49750] Updated weights for policy 0, policy_version 192341 (0.0031) [2024-04-26 14:05:17,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3151462400. Throughput: 0: 50479.1. Samples: 904340920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 14:05:17,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 14:05:17,767][49750] Updated weights for policy 0, policy_version 192351 (0.0027) [2024-04-26 14:05:20,653][49750] Updated weights for policy 0, policy_version 192361 (0.0031) [2024-04-26 14:05:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.5, 300 sec: 50540.5). Total num frames: 3151691776. Throughput: 0: 50620.1. Samples: 904494860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:05:22,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 14:05:24,273][49750] Updated weights for policy 0, policy_version 192371 (0.0031) [2024-04-26 14:05:27,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50484.9). Total num frames: 3151953920. Throughput: 0: 50623.0. Samples: 904799680. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:05:27,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 14:05:27,133][49750] Updated weights for policy 0, policy_version 192381 (0.0034) [2024-04-26 14:05:30,690][49750] Updated weights for policy 0, policy_version 192391 (0.0036) [2024-04-26 14:05:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3152199680. Throughput: 0: 50520.5. Samples: 905099460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:05:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 14:05:33,631][49750] Updated weights for policy 0, policy_version 192401 (0.0029) [2024-04-26 14:05:37,007][49750] Updated weights for policy 0, policy_version 192411 (0.0032) [2024-04-26 14:05:37,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3152461824. Throughput: 0: 50758.2. Samples: 905258560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:05:37,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 14:05:39,939][49750] Updated weights for policy 0, policy_version 192421 (0.0034) [2024-04-26 14:05:42,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3152723968. Throughput: 0: 50710.5. Samples: 905563740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:05:42,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 14:05:43,425][49750] Updated weights for policy 0, policy_version 192431 (0.0029) [2024-04-26 14:05:46,220][49750] Updated weights for policy 0, policy_version 192441 (0.0032) [2024-04-26 14:05:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 3152969728. Throughput: 0: 50759.8. Samples: 905867300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:05:47,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 14:05:49,867][49750] Updated weights for policy 0, policy_version 192451 (0.0031) [2024-04-26 14:05:52,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3153215488. Throughput: 0: 50889.6. Samples: 906026020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:05:52,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 14:05:52,135][49728] Signal inference workers to stop experience collection... (13500 times) [2024-04-26 14:05:52,136][49728] Signal inference workers to resume experience collection... (13500 times) [2024-04-26 14:05:52,149][49750] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-04-26 14:05:52,170][49750] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-04-26 14:05:53,061][49750] Updated weights for policy 0, policy_version 192461 (0.0032) [2024-04-26 14:05:56,304][49750] Updated weights for policy 0, policy_version 192471 (0.0036) [2024-04-26 14:05:57,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3153494016. Throughput: 0: 51011.5. Samples: 906330440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:05:57,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 14:05:59,316][49750] Updated weights for policy 0, policy_version 192481 (0.0024) [2024-04-26 14:06:02,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3153739776. Throughput: 0: 51023.1. Samples: 906636960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:06:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 14:06:02,679][49750] Updated weights for policy 0, policy_version 192491 (0.0032) [2024-04-26 14:06:05,703][49750] Updated weights for policy 0, policy_version 192501 (0.0034) [2024-04-26 14:06:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51336.5, 300 sec: 50651.6). Total num frames: 3154001920. Throughput: 0: 51137.3. Samples: 906796040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:06:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 14:06:09,028][49750] Updated weights for policy 0, policy_version 192511 (0.0032) [2024-04-26 14:06:12,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.4, 300 sec: 50596.0). Total num frames: 3154247680. Throughput: 0: 51144.5. Samples: 907101180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:06:12,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 14:06:12,255][49750] Updated weights for policy 0, policy_version 192521 (0.0033) [2024-04-26 14:06:15,574][49750] Updated weights for policy 0, policy_version 192531 (0.0028) [2024-04-26 14:06:17,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3154477056. Throughput: 0: 51071.0. Samples: 907397660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:06:17,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 14:06:18,524][49750] Updated weights for policy 0, policy_version 192541 (0.0036) [2024-04-26 14:06:21,928][49750] Updated weights for policy 0, policy_version 192551 (0.0029) [2024-04-26 14:06:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3154755584. Throughput: 0: 50905.9. Samples: 907549320. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-26 14:06:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:06:25,299][49750] Updated weights for policy 0, policy_version 192561 (0.0030) [2024-04-26 14:06:27,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3155017728. Throughput: 0: 50956.0. Samples: 907856760. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:06:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 14:06:28,361][49750] Updated weights for policy 0, policy_version 192571 (0.0035) [2024-04-26 14:06:31,682][49750] Updated weights for policy 0, policy_version 192581 (0.0029) [2024-04-26 14:06:32,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 3155263488. Throughput: 0: 50995.6. Samples: 908162100. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:06:32,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 14:06:34,765][49750] Updated weights for policy 0, policy_version 192591 (0.0031) [2024-04-26 14:06:37,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3155525632. Throughput: 0: 50921.1. Samples: 908317460. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:06:37,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 14:06:37,995][49750] Updated weights for policy 0, policy_version 192601 (0.0033) [2024-04-26 14:06:41,132][49750] Updated weights for policy 0, policy_version 192611 (0.0032) [2024-04-26 14:06:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3155771392. Throughput: 0: 50858.2. Samples: 908619060. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:06:42,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 14:06:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000192613_3155771392.pth... 
[2024-04-26 14:06:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000191871_3143614464.pth [2024-04-26 14:06:44,384][49750] Updated weights for policy 0, policy_version 192621 (0.0030) [2024-04-26 14:06:47,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3156017152. Throughput: 0: 50754.9. Samples: 908920920. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:06:47,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 14:06:47,479][49750] Updated weights for policy 0, policy_version 192631 (0.0034) [2024-04-26 14:06:50,850][49750] Updated weights for policy 0, policy_version 192641 (0.0032) [2024-04-26 14:06:52,063][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 3156279296. Throughput: 0: 50907.1. Samples: 909086860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:06:52,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 14:06:53,974][49750] Updated weights for policy 0, policy_version 192651 (0.0034) [2024-04-26 14:06:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3156525056. Throughput: 0: 50686.0. Samples: 909382040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:06:57,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 14:06:57,311][49750] Updated weights for policy 0, policy_version 192661 (0.0033) [2024-04-26 14:06:58,076][49728] Signal inference workers to stop experience collection... (13550 times) [2024-04-26 14:06:58,076][49728] Signal inference workers to resume experience collection... 
(13550 times) [2024-04-26 14:06:58,101][49750] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-04-26 14:06:58,101][49750] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-04-26 14:07:00,511][49750] Updated weights for policy 0, policy_version 192671 (0.0032) [2024-04-26 14:07:02,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3156770816. Throughput: 0: 50888.0. Samples: 909687620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:07:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 14:07:03,686][49750] Updated weights for policy 0, policy_version 192681 (0.0037) [2024-04-26 14:07:07,049][49750] Updated weights for policy 0, policy_version 192691 (0.0029) [2024-04-26 14:07:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3157049344. Throughput: 0: 50911.6. Samples: 909840340. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:07:07,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:07:10,139][49750] Updated weights for policy 0, policy_version 192701 (0.0030) [2024-04-26 14:07:12,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3157295104. Throughput: 0: 50840.7. Samples: 910144580. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:07:12,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 14:07:13,350][49750] Updated weights for policy 0, policy_version 192711 (0.0032) [2024-04-26 14:07:16,806][49750] Updated weights for policy 0, policy_version 192721 (0.0033) [2024-04-26 14:07:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 3157540864. Throughput: 0: 50684.6. Samples: 910442900. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:07:17,063][49517] Avg episode reward: [(0, '0.430')] [2024-04-26 14:07:19,866][49750] Updated weights for policy 0, policy_version 192731 (0.0028) [2024-04-26 14:07:22,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3157786624. Throughput: 0: 50742.2. Samples: 910600860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:07:22,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 14:07:23,136][49750] Updated weights for policy 0, policy_version 192741 (0.0032) [2024-04-26 14:07:26,542][49750] Updated weights for policy 0, policy_version 192751 (0.0032) [2024-04-26 14:07:27,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3158065152. Throughput: 0: 50768.1. Samples: 910903620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-26 14:07:27,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 14:07:29,750][49750] Updated weights for policy 0, policy_version 192761 (0.0028) [2024-04-26 14:07:32,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3158294528. Throughput: 0: 50682.5. Samples: 911201640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:07:32,071][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 14:07:32,980][49750] Updated weights for policy 0, policy_version 192771 (0.0032) [2024-04-26 14:07:36,285][49750] Updated weights for policy 0, policy_version 192781 (0.0038) [2024-04-26 14:07:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3158573056. Throughput: 0: 50683.0. Samples: 911367600. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:07:37,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 14:07:39,342][49750] Updated weights for policy 0, policy_version 192791 (0.0030) [2024-04-26 14:07:42,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3158818816. Throughput: 0: 50909.6. Samples: 911672980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:07:42,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:07:42,749][49750] Updated weights for policy 0, policy_version 192801 (0.0031) [2024-04-26 14:07:45,750][49750] Updated weights for policy 0, policy_version 192811 (0.0031) [2024-04-26 14:07:47,063][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.1, 300 sec: 50596.0). Total num frames: 3159031808. Throughput: 0: 50727.1. Samples: 911970340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:07:47,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 14:07:49,051][49750] Updated weights for policy 0, policy_version 192821 (0.0025) [2024-04-26 14:07:50,052][49728] Signal inference workers to stop experience collection... (13600 times) [2024-04-26 14:07:50,053][49728] Signal inference workers to resume experience collection... (13600 times) [2024-04-26 14:07:50,080][49750] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-04-26 14:07:50,080][49750] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-04-26 14:07:52,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3159310336. Throughput: 0: 50624.9. Samples: 912118460. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:07:52,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 14:07:52,306][49750] Updated weights for policy 0, policy_version 192831 (0.0028) [2024-04-26 14:07:55,467][49750] Updated weights for policy 0, policy_version 192841 (0.0030) [2024-04-26 14:07:57,062][49517] Fps is (10 sec: 55706.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3159588864. Throughput: 0: 50637.3. Samples: 912423260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:07:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 14:07:58,724][49750] Updated weights for policy 0, policy_version 192851 (0.0031) [2024-04-26 14:08:02,002][49750] Updated weights for policy 0, policy_version 192861 (0.0033) [2024-04-26 14:08:02,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3159834624. Throughput: 0: 50775.8. Samples: 912727820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:08:02,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 14:08:05,089][49750] Updated weights for policy 0, policy_version 192871 (0.0029) [2024-04-26 14:08:07,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3160064000. Throughput: 0: 50680.4. Samples: 912881480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:08:07,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:08:08,446][49750] Updated weights for policy 0, policy_version 192881 (0.0025) [2024-04-26 14:08:11,685][49750] Updated weights for policy 0, policy_version 192891 (0.0030) [2024-04-26 14:08:12,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3160342528. Throughput: 0: 50646.2. Samples: 913182700. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:08:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:08:14,813][49750] Updated weights for policy 0, policy_version 192901 (0.0039) [2024-04-26 14:08:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50762.7). Total num frames: 3160588288. Throughput: 0: 50749.8. Samples: 913485380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:08:17,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 14:08:18,139][49750] Updated weights for policy 0, policy_version 192911 (0.0028) [2024-04-26 14:08:21,252][49750] Updated weights for policy 0, policy_version 192921 (0.0033) [2024-04-26 14:08:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 3160866816. Throughput: 0: 50792.9. Samples: 913653280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:08:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:08:24,531][49750] Updated weights for policy 0, policy_version 192931 (0.0033) [2024-04-26 14:08:27,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3161079808. Throughput: 0: 50668.5. Samples: 913953060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:08:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 14:08:27,665][49750] Updated weights for policy 0, policy_version 192941 (0.0033) [2024-04-26 14:08:30,827][49750] Updated weights for policy 0, policy_version 192951 (0.0037) [2024-04-26 14:08:32,063][49517] Fps is (10 sec: 45874.9, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3161325568. Throughput: 0: 50636.5. Samples: 914248980. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 14:08:32,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 14:08:34,128][49750] Updated weights for policy 0, policy_version 192961 (0.0038) [2024-04-26 14:08:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3161587712. Throughput: 0: 50734.8. Samples: 914401540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:08:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 14:08:37,465][49750] Updated weights for policy 0, policy_version 192971 (0.0032) [2024-04-26 14:08:40,502][49750] Updated weights for policy 0, policy_version 192981 (0.0027) [2024-04-26 14:08:42,063][49517] Fps is (10 sec: 54067.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3161866240. Throughput: 0: 50686.6. Samples: 914704160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:08:42,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:08:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000192985_3161866240.pth... [2024-04-26 14:08:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000192242_3149692928.pth [2024-04-26 14:08:43,953][49750] Updated weights for policy 0, policy_version 192991 (0.0029) [2024-04-26 14:08:46,498][49728] Signal inference workers to stop experience collection... (13650 times) [2024-04-26 14:08:46,498][49728] Signal inference workers to resume experience collection... (13650 times) [2024-04-26 14:08:46,509][49750] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-04-26 14:08:46,509][49750] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-04-26 14:08:46,921][49750] Updated weights for policy 0, policy_version 193001 (0.0027) [2024-04-26 14:08:47,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 3162128384. Throughput: 0: 50792.2. Samples: 915013460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:08:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 14:08:50,661][49750] Updated weights for policy 0, policy_version 193011 (0.0029) [2024-04-26 14:08:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3162357760. Throughput: 0: 50756.9. Samples: 915165540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:08:52,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 14:08:53,405][49750] Updated weights for policy 0, policy_version 193021 (0.0032) [2024-04-26 14:08:57,055][49750] Updated weights for policy 0, policy_version 193031 (0.0032) [2024-04-26 14:08:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3162619904. Throughput: 0: 50845.0. Samples: 915470720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:08:57,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 14:08:59,880][49750] Updated weights for policy 0, policy_version 193041 (0.0037) [2024-04-26 14:09:02,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3162882048. Throughput: 0: 50791.1. Samples: 915770980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:09:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 14:09:03,399][49750] Updated weights for policy 0, policy_version 193051 (0.0033) [2024-04-26 14:09:06,242][49750] Updated weights for policy 0, policy_version 193061 (0.0027) [2024-04-26 14:09:07,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 3163160576. Throughput: 0: 50625.7. Samples: 915931440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:09:07,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 14:09:10,020][49750] Updated weights for policy 0, policy_version 193071 (0.0035) [2024-04-26 14:09:12,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3163373568. Throughput: 0: 50814.6. Samples: 916239720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:09:12,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 14:09:12,721][49750] Updated weights for policy 0, policy_version 193081 (0.0031) [2024-04-26 14:09:16,321][49750] Updated weights for policy 0, policy_version 193091 (0.0032) [2024-04-26 14:09:17,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3163619328. Throughput: 0: 50896.2. Samples: 916539300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:09:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 14:09:19,109][49750] Updated weights for policy 0, policy_version 193101 (0.0042) [2024-04-26 14:09:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3163897856. Throughput: 0: 50825.4. Samples: 916688680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:09:22,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 14:09:22,668][49750] Updated weights for policy 0, policy_version 193111 (0.0028) [2024-04-26 14:09:25,558][49750] Updated weights for policy 0, policy_version 193121 (0.0034) [2024-04-26 14:09:27,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3164143616. Throughput: 0: 50864.8. Samples: 916993080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:09:27,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 14:09:29,114][49750] Updated weights for policy 0, policy_version 193131 (0.0034) [2024-04-26 14:09:31,933][49750] Updated weights for policy 0, policy_version 193141 (0.0033) [2024-04-26 14:09:32,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51609.6, 300 sec: 50818.1). Total num frames: 3164422144. Throughput: 0: 50870.9. Samples: 917302660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:09:32,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 14:09:35,373][49750] Updated weights for policy 0, policy_version 193151 (0.0029) [2024-04-26 14:09:37,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.7, 300 sec: 50762.7). Total num frames: 3164651520. Throughput: 0: 51027.6. Samples: 917461780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:09:37,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 14:09:38,223][49750] Updated weights for policy 0, policy_version 193161 (0.0035) [2024-04-26 14:09:42,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3164897280. Throughput: 0: 51001.8. Samples: 917765800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-26 14:09:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 14:09:42,146][49750] Updated weights for policy 0, policy_version 193171 (0.0030) [2024-04-26 14:09:44,758][49750] Updated weights for policy 0, policy_version 193181 (0.0034) [2024-04-26 14:09:47,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3165159424. Throughput: 0: 51015.0. Samples: 918066660. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:09:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 14:09:48,755][49750] Updated weights for policy 0, policy_version 193191 (0.0031) [2024-04-26 14:09:51,325][49750] Updated weights for policy 0, policy_version 193201 (0.0030) [2024-04-26 14:09:52,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3165437952. Throughput: 0: 50883.3. Samples: 918221180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:09:52,063][49517] Avg episode reward: [(0, '0.451')] [2024-04-26 14:09:55,123][49750] Updated weights for policy 0, policy_version 193211 (0.0032) [2024-04-26 14:09:57,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3165683712. Throughput: 0: 50939.8. Samples: 918532000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:09:57,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 14:09:57,690][49750] Updated weights for policy 0, policy_version 193221 (0.0028) [2024-04-26 14:10:01,641][49750] Updated weights for policy 0, policy_version 193231 (0.0039) [2024-04-26 14:10:02,062][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3165896704. Throughput: 0: 51069.7. Samples: 918837440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:02,063][49517] Avg episode reward: [(0, '0.398')] [2024-04-26 14:10:04,188][49750] Updated weights for policy 0, policy_version 193241 (0.0028) [2024-04-26 14:10:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 3166191616. Throughput: 0: 50748.7. Samples: 918972360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:07,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 14:10:08,111][49750] Updated weights for policy 0, policy_version 193251 (0.0034) [2024-04-26 14:10:10,574][49728] Signal inference workers to stop experience collection... 
(13700 times) [2024-04-26 14:10:10,575][49728] Signal inference workers to resume experience collection... (13700 times) [2024-04-26 14:10:10,591][49750] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-04-26 14:10:10,592][49750] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-04-26 14:10:10,715][49750] Updated weights for policy 0, policy_version 193261 (0.0031) [2024-04-26 14:10:12,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3166437376. Throughput: 0: 51001.9. Samples: 919288160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 14:10:14,457][49750] Updated weights for policy 0, policy_version 193271 (0.0036) [2024-04-26 14:10:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3166699520. Throughput: 0: 50871.3. Samples: 919591860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:17,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 14:10:17,207][49750] Updated weights for policy 0, policy_version 193281 (0.0031) [2024-04-26 14:10:20,869][49750] Updated weights for policy 0, policy_version 193291 (0.0032) [2024-04-26 14:10:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3166961664. Throughput: 0: 50670.1. Samples: 919741940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:22,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 14:10:23,505][49750] Updated weights for policy 0, policy_version 193301 (0.0031) [2024-04-26 14:10:27,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3167191040. Throughput: 0: 50786.9. Samples: 920051220. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 14:10:27,406][49750] Updated weights for policy 0, policy_version 193311 (0.0031) [2024-04-26 14:10:29,832][49750] Updated weights for policy 0, policy_version 193321 (0.0028) [2024-04-26 14:10:32,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3167453184. Throughput: 0: 50936.5. Samples: 920358800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:32,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 14:10:33,662][49750] Updated weights for policy 0, policy_version 193331 (0.0028) [2024-04-26 14:10:36,338][49750] Updated weights for policy 0, policy_version 193341 (0.0033) [2024-04-26 14:10:37,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3167715328. Throughput: 0: 50891.1. Samples: 920511280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:37,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 14:10:39,992][49750] Updated weights for policy 0, policy_version 193351 (0.0033) [2024-04-26 14:10:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3167961088. Throughput: 0: 50744.0. Samples: 920815480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:42,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 14:10:42,180][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000193358_3167977472.pth... [2024-04-26 14:10:42,224][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000192613_3155771392.pth [2024-04-26 14:10:42,706][49750] Updated weights for policy 0, policy_version 193361 (0.0032) [2024-04-26 14:10:46,467][49750] Updated weights for policy 0, policy_version 193371 (0.0029) [2024-04-26 14:10:47,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.5, 300 sec: 50818.2). 
Total num frames: 3168206848. Throughput: 0: 50761.4. Samples: 921121700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 14:10:47,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 14:10:49,171][49750] Updated weights for policy 0, policy_version 193381 (0.0029) [2024-04-26 14:10:52,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3168452608. Throughput: 0: 50894.5. Samples: 921262620. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:10:52,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-26 14:10:53,032][49750] Updated weights for policy 0, policy_version 193391 (0.0035) [2024-04-26 14:10:55,477][49750] Updated weights for policy 0, policy_version 193401 (0.0025) [2024-04-26 14:10:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3168714752. Throughput: 0: 50749.0. Samples: 921571860. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:10:57,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 14:10:59,428][49750] Updated weights for policy 0, policy_version 193411 (0.0037) [2024-04-26 14:11:01,968][49750] Updated weights for policy 0, policy_version 193421 (0.0024) [2024-04-26 14:11:02,063][49517] Fps is (10 sec: 55706.3, 60 sec: 51882.7, 300 sec: 50873.7). Total num frames: 3169009664. Throughput: 0: 50876.0. Samples: 921881280. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:02,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 14:11:05,962][49750] Updated weights for policy 0, policy_version 193431 (0.0028) [2024-04-26 14:11:07,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3169239040. Throughput: 0: 50942.2. Samples: 922034340. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 14:11:08,525][49750] Updated weights for policy 0, policy_version 193441 (0.0030) [2024-04-26 14:11:12,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3169468416. Throughput: 0: 50814.3. Samples: 922337860. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:12,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 14:11:12,359][49750] Updated weights for policy 0, policy_version 193451 (0.0032) [2024-04-26 14:11:14,842][49750] Updated weights for policy 0, policy_version 193461 (0.0032) [2024-04-26 14:11:16,527][49728] Signal inference workers to stop experience collection... (13750 times) [2024-04-26 14:11:16,527][49728] Signal inference workers to resume experience collection... (13750 times) [2024-04-26 14:11:16,557][49750] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-04-26 14:11:16,557][49750] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-04-26 14:11:17,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3169746944. Throughput: 0: 50839.1. Samples: 922646560. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 14:11:18,866][49750] Updated weights for policy 0, policy_version 193471 (0.0031) [2024-04-26 14:11:21,227][49750] Updated weights for policy 0, policy_version 193481 (0.0031) [2024-04-26 14:11:22,063][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3170009088. Throughput: 0: 50752.6. Samples: 922795160. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:11:25,343][49750] Updated weights for policy 0, policy_version 193491 (0.0029) [2024-04-26 14:11:27,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3170254848. Throughput: 0: 50904.5. Samples: 923106180. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 14:11:27,744][49750] Updated weights for policy 0, policy_version 193501 (0.0034) [2024-04-26 14:11:31,703][49750] Updated weights for policy 0, policy_version 193511 (0.0032) [2024-04-26 14:11:32,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3170500608. Throughput: 0: 50909.3. Samples: 923412620. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:32,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 14:11:34,248][49750] Updated weights for policy 0, policy_version 193521 (0.0031) [2024-04-26 14:11:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3170746368. Throughput: 0: 50884.2. Samples: 923552400. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 14:11:38,033][49750] Updated weights for policy 0, policy_version 193531 (0.0037) [2024-04-26 14:11:40,524][49750] Updated weights for policy 0, policy_version 193541 (0.0030) [2024-04-26 14:11:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3171008512. Throughput: 0: 50767.5. Samples: 923856400. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 14:11:44,581][49750] Updated weights for policy 0, policy_version 193551 (0.0027) [2024-04-26 14:11:46,960][49750] Updated weights for policy 0, policy_version 193561 (0.0031) [2024-04-26 14:11:47,063][49517] Fps is (10 sec: 55704.4, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 3171303424. Throughput: 0: 50735.0. Samples: 924164360. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:47,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 14:11:51,079][49750] Updated weights for policy 0, policy_version 193571 (0.0030) [2024-04-26 14:11:52,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3171516416. Throughput: 0: 50914.6. Samples: 924325500. Policy #0 lag: (min: 0.0, avg: 12.9, max: 22.0) [2024-04-26 14:11:52,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 14:11:53,733][49750] Updated weights for policy 0, policy_version 193581 (0.0028) [2024-04-26 14:11:57,063][49517] Fps is (10 sec: 45875.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3171762176. Throughput: 0: 50854.2. Samples: 924626300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:11:57,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 14:11:57,344][49750] Updated weights for policy 0, policy_version 193591 (0.0026) [2024-04-26 14:12:00,312][49750] Updated weights for policy 0, policy_version 193601 (0.0034) [2024-04-26 14:12:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 3172040704. Throughput: 0: 50899.0. Samples: 924937020. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:12:03,717][49750] Updated weights for policy 0, policy_version 193611 (0.0032) [2024-04-26 14:12:06,682][49750] Updated weights for policy 0, policy_version 193621 (0.0030) [2024-04-26 14:12:07,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3172286464. Throughput: 0: 51027.8. Samples: 925091400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:07,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 14:12:10,106][49750] Updated weights for policy 0, policy_version 193631 (0.0026) [2024-04-26 14:12:12,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3172548608. Throughput: 0: 51048.0. Samples: 925403340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 14:12:13,110][49750] Updated weights for policy 0, policy_version 193641 (0.0029) [2024-04-26 14:12:16,463][49750] Updated weights for policy 0, policy_version 193651 (0.0037) [2024-04-26 14:12:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3172794368. Throughput: 0: 50913.8. Samples: 925703740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:17,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 14:12:19,504][49750] Updated weights for policy 0, policy_version 193661 (0.0036) [2024-04-26 14:12:22,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3173023744. Throughput: 0: 51142.6. Samples: 925853820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:22,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 14:12:22,539][49728] Signal inference workers to stop experience collection... 
(13800 times) [2024-04-26 14:12:22,540][49728] Signal inference workers to resume experience collection... (13800 times) [2024-04-26 14:12:22,567][49750] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-04-26 14:12:22,568][49750] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-04-26 14:12:22,822][49750] Updated weights for policy 0, policy_version 193671 (0.0034) [2024-04-26 14:12:25,915][49750] Updated weights for policy 0, policy_version 193681 (0.0040) [2024-04-26 14:12:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3173302272. Throughput: 0: 51164.3. Samples: 926158800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:27,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 14:12:29,625][49750] Updated weights for policy 0, policy_version 193691 (0.0031) [2024-04-26 14:12:32,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3173564416. Throughput: 0: 50956.4. Samples: 926457400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:32,063][49517] Avg episode reward: [(0, '0.461')] [2024-04-26 14:12:32,265][49750] Updated weights for policy 0, policy_version 193701 (0.0028) [2024-04-26 14:12:36,174][49750] Updated weights for policy 0, policy_version 193711 (0.0034) [2024-04-26 14:12:37,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3173826560. Throughput: 0: 51055.8. Samples: 926623000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:12:38,831][49750] Updated weights for policy 0, policy_version 193721 (0.0027) [2024-04-26 14:12:42,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.3, 300 sec: 50929.3). Total num frames: 3174055936. Throughput: 0: 51022.7. Samples: 926922320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:42,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 14:12:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000193729_3174055936.pth... [2024-04-26 14:12:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000192985_3161866240.pth [2024-04-26 14:12:42,510][49750] Updated weights for policy 0, policy_version 193731 (0.0027) [2024-04-26 14:12:45,310][49750] Updated weights for policy 0, policy_version 193741 (0.0027) [2024-04-26 14:12:47,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 3174334464. Throughput: 0: 50790.3. Samples: 927222580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:47,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 14:12:49,025][49750] Updated weights for policy 0, policy_version 193751 (0.0029) [2024-04-26 14:12:51,627][49750] Updated weights for policy 0, policy_version 193761 (0.0034) [2024-04-26 14:12:52,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 3174596608. Throughput: 0: 50821.8. Samples: 927378380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:52,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 14:12:55,560][49750] Updated weights for policy 0, policy_version 193771 (0.0030) [2024-04-26 14:12:57,067][49517] Fps is (10 sec: 52404.1, 60 sec: 51605.5, 300 sec: 50928.4). Total num frames: 3174858752. Throughput: 0: 50803.4. Samples: 927689740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:12:57,068][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 14:12:58,080][49750] Updated weights for policy 0, policy_version 193781 (0.0033) [2024-04-26 14:13:01,835][49750] Updated weights for policy 0, policy_version 193791 (0.0030) [2024-04-26 14:13:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.5, 300 sec: 50873.7). 
Total num frames: 3175071744. Throughput: 0: 50898.8. Samples: 927994180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:13:02,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 14:13:04,569][49750] Updated weights for policy 0, policy_version 193801 (0.0029) [2024-04-26 14:13:07,062][49517] Fps is (10 sec: 47536.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3175333888. Throughput: 0: 50772.0. Samples: 928138560. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:07,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:13:08,198][49750] Updated weights for policy 0, policy_version 193811 (0.0036) [2024-04-26 14:13:11,207][49750] Updated weights for policy 0, policy_version 193821 (0.0029) [2024-04-26 14:13:12,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3175596032. Throughput: 0: 50680.1. Samples: 928439400. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:12,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 14:13:14,511][49750] Updated weights for policy 0, policy_version 193831 (0.0028) [2024-04-26 14:13:17,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3175874560. Throughput: 0: 50885.0. Samples: 928747220. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 14:13:17,513][49750] Updated weights for policy 0, policy_version 193841 (0.0027) [2024-04-26 14:13:20,887][49750] Updated weights for policy 0, policy_version 193851 (0.0026) [2024-04-26 14:13:22,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.4, 300 sec: 50929.3). Total num frames: 3176103936. Throughput: 0: 50756.7. Samples: 928907060. 
Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:22,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 14:13:23,868][49750] Updated weights for policy 0, policy_version 193861 (0.0028) [2024-04-26 14:13:27,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 3176349696. Throughput: 0: 50972.1. Samples: 929216060. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:27,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 14:13:27,498][49750] Updated weights for policy 0, policy_version 193871 (0.0036) [2024-04-26 14:13:28,045][49728] Signal inference workers to stop experience collection... (13850 times) [2024-04-26 14:13:28,050][49728] Signal inference workers to resume experience collection... (13850 times) [2024-04-26 14:13:28,079][49750] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-04-26 14:13:28,079][49750] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-04-26 14:13:30,138][49750] Updated weights for policy 0, policy_version 193881 (0.0034) [2024-04-26 14:13:32,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 3176611840. Throughput: 0: 51087.9. Samples: 929521540. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 14:13:34,031][49750] Updated weights for policy 0, policy_version 193891 (0.0036) [2024-04-26 14:13:36,610][49750] Updated weights for policy 0, policy_version 193901 (0.0032) [2024-04-26 14:13:37,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3176890368. Throughput: 0: 51012.3. Samples: 929673940. 
Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:37,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 14:13:40,440][49750] Updated weights for policy 0, policy_version 193911 (0.0028) [2024-04-26 14:13:42,062][49517] Fps is (10 sec: 54068.6, 60 sec: 51609.7, 300 sec: 50929.3). Total num frames: 3177152512. Throughput: 0: 50816.6. Samples: 929976240. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:42,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 14:13:43,136][49750] Updated weights for policy 0, policy_version 193921 (0.0029) [2024-04-26 14:13:46,926][49750] Updated weights for policy 0, policy_version 193931 (0.0035) [2024-04-26 14:13:47,063][49517] Fps is (10 sec: 47510.8, 60 sec: 50516.8, 300 sec: 50873.6). Total num frames: 3177365504. Throughput: 0: 50879.6. Samples: 930283800. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:47,064][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 14:13:49,492][49750] Updated weights for policy 0, policy_version 193941 (0.0028) [2024-04-26 14:13:52,062][49517] Fps is (10 sec: 45874.9, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3177611264. Throughput: 0: 50844.0. Samples: 930426540. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:52,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 14:13:53,352][49750] Updated weights for policy 0, policy_version 193951 (0.0030) [2024-04-26 14:13:55,904][49750] Updated weights for policy 0, policy_version 193961 (0.0028) [2024-04-26 14:13:57,062][49517] Fps is (10 sec: 52433.1, 60 sec: 50521.5, 300 sec: 50873.7). Total num frames: 3177889792. Throughput: 0: 50949.9. Samples: 930732140. 
Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:13:57,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 14:13:59,758][49750] Updated weights for policy 0, policy_version 193971 (0.0031) [2024-04-26 14:14:02,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3178151936. Throughput: 0: 50911.7. Samples: 931038240. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:14:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 14:14:02,469][49750] Updated weights for policy 0, policy_version 193981 (0.0034) [2024-04-26 14:14:05,987][49750] Updated weights for policy 0, policy_version 193991 (0.0029) [2024-04-26 14:14:07,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 3178414080. Throughput: 0: 50989.4. Samples: 931201580. Policy #0 lag: (min: 2.0, avg: 11.5, max: 22.0) [2024-04-26 14:14:07,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 14:14:08,831][49750] Updated weights for policy 0, policy_version 194001 (0.0032) [2024-04-26 14:14:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 3178643456. Throughput: 0: 51092.8. Samples: 931515240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:12,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 14:14:12,423][49750] Updated weights for policy 0, policy_version 194011 (0.0030) [2024-04-26 14:14:15,295][49750] Updated weights for policy 0, policy_version 194021 (0.0031) [2024-04-26 14:14:17,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 3178905600. Throughput: 0: 50936.0. Samples: 931813660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 14:14:18,901][49750] Updated weights for policy 0, policy_version 194031 (0.0029) [2024-04-26 14:14:21,618][49750] Updated weights for policy 0, policy_version 194041 (0.0026) [2024-04-26 14:14:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3179167744. Throughput: 0: 51061.9. Samples: 931971720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:22,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 14:14:24,963][49728] Signal inference workers to stop experience collection... (13900 times) [2024-04-26 14:14:24,967][49728] Signal inference workers to resume experience collection... (13900 times) [2024-04-26 14:14:24,995][49750] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-04-26 14:14:24,995][49750] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-04-26 14:14:25,250][49750] Updated weights for policy 0, policy_version 194051 (0.0033) [2024-04-26 14:14:27,062][49517] Fps is (10 sec: 55706.8, 60 sec: 51882.7, 300 sec: 50984.8). Total num frames: 3179462656. Throughput: 0: 51125.8. Samples: 932276900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:27,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 14:14:28,049][49750] Updated weights for policy 0, policy_version 194061 (0.0032) [2024-04-26 14:14:31,750][49750] Updated weights for policy 0, policy_version 194071 (0.0036) [2024-04-26 14:14:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 3179675648. Throughput: 0: 51204.8. Samples: 932587980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:32,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 14:14:34,566][49750] Updated weights for policy 0, policy_version 194081 (0.0031) [2024-04-26 14:14:37,062][49517] Fps is (10 sec: 44236.9, 60 sec: 50244.4, 300 sec: 50873.7). Total num frames: 3179905024. Throughput: 0: 51275.6. Samples: 932733940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 14:14:38,118][49750] Updated weights for policy 0, policy_version 194091 (0.0031) [2024-04-26 14:14:40,813][49750] Updated weights for policy 0, policy_version 194101 (0.0031) [2024-04-26 14:14:42,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50984.8). Total num frames: 3180199936. Throughput: 0: 51323.8. Samples: 933041720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:14:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000194104_3180199936.pth... [2024-04-26 14:14:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000193358_3167977472.pth [2024-04-26 14:14:44,535][49750] Updated weights for policy 0, policy_version 194111 (0.0030) [2024-04-26 14:14:47,063][49517] Fps is (10 sec: 55704.9, 60 sec: 51610.2, 300 sec: 50929.2). Total num frames: 3180462080. Throughput: 0: 51128.8. Samples: 933339040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:47,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 14:14:47,264][49750] Updated weights for policy 0, policy_version 194121 (0.0028) [2024-04-26 14:14:50,939][49750] Updated weights for policy 0, policy_version 194131 (0.0029) [2024-04-26 14:14:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 3180707840. Throughput: 0: 51087.9. Samples: 933500540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:52,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 14:14:53,929][49750] Updated weights for policy 0, policy_version 194141 (0.0027) [2024-04-26 14:14:57,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.2, 300 sec: 50984.8). Total num frames: 3180937216. Throughput: 0: 50808.4. Samples: 933801620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:14:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 14:14:57,386][49750] Updated weights for policy 0, policy_version 194151 (0.0027) [2024-04-26 14:15:00,429][49750] Updated weights for policy 0, policy_version 194161 (0.0031) [2024-04-26 14:15:02,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3181215744. Throughput: 0: 50976.9. Samples: 934107620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:15:02,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 14:15:03,769][49750] Updated weights for policy 0, policy_version 194171 (0.0032) [2024-04-26 14:15:07,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 3181461504. Throughput: 0: 50774.2. Samples: 934256560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:15:07,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 14:15:07,069][49750] Updated weights for policy 0, policy_version 194181 (0.0028) [2024-04-26 14:15:10,309][49750] Updated weights for policy 0, policy_version 194191 (0.0026) [2024-04-26 14:15:12,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 3181740032. Throughput: 0: 50810.6. Samples: 934563380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 14:15:12,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 14:15:13,380][49750] Updated weights for policy 0, policy_version 194201 (0.0030) [2024-04-26 14:15:16,672][49750] Updated weights for policy 0, policy_version 194211 (0.0034) [2024-04-26 14:15:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3181969408. Throughput: 0: 50787.9. Samples: 934873440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:15:17,064][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:15:20,005][49750] Updated weights for policy 0, policy_version 194221 (0.0029) [2024-04-26 14:15:22,063][49517] Fps is (10 sec: 45875.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3182198784. Throughput: 0: 50756.8. Samples: 935018000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:15:22,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 14:15:22,674][49728] Signal inference workers to stop experience collection... (13950 times) [2024-04-26 14:15:22,674][49728] Signal inference workers to resume experience collection... (13950 times) [2024-04-26 14:15:22,701][49750] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-04-26 14:15:22,701][49750] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-04-26 14:15:23,103][49750] Updated weights for policy 0, policy_version 194231 (0.0034) [2024-04-26 14:15:26,613][49750] Updated weights for policy 0, policy_version 194241 (0.0030) [2024-04-26 14:15:27,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50244.3, 300 sec: 50929.3). Total num frames: 3182477312. Throughput: 0: 50697.5. Samples: 935323100. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:15:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 14:15:29,468][49750] Updated weights for policy 0, policy_version 194251 (0.0038) [2024-04-26 14:15:32,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3182739456. Throughput: 0: 50731.3. Samples: 935621960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:15:32,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 14:15:32,978][49750] Updated weights for policy 0, policy_version 194261 (0.0029) [2024-04-26 14:15:36,062][49750] Updated weights for policy 0, policy_version 194271 (0.0028) [2024-04-26 14:15:37,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51882.7, 300 sec: 51040.3). Total num frames: 3183017984. Throughput: 0: 50803.7. Samples: 935786700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:15:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 14:15:39,257][49750] Updated weights for policy 0, policy_version 194281 (0.0031) [2024-04-26 14:15:42,063][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 3183230976. Throughput: 0: 50734.7. Samples: 936084680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:15:42,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 14:15:42,497][49750] Updated weights for policy 0, policy_version 194291 (0.0032) [2024-04-26 14:15:45,614][49750] Updated weights for policy 0, policy_version 194301 (0.0031) [2024-04-26 14:15:47,063][49517] Fps is (10 sec: 45874.5, 60 sec: 50244.2, 300 sec: 50929.2). Total num frames: 3183476736. Throughput: 0: 50786.7. Samples: 936393020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:15:47,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 14:15:48,777][49750] Updated weights for policy 0, policy_version 194311 (0.0035) [2024-04-26 14:15:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50929.2). 
Total num frames: 3183738880. Throughput: 0: 50808.0. Samples: 936542920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:15:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:15:52,202][49750] Updated weights for policy 0, policy_version 194321 (0.0028) [2024-04-26 14:15:55,241][49750] Updated weights for policy 0, policy_version 194331 (0.0029) [2024-04-26 14:15:57,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3184017408. Throughput: 0: 50909.2. Samples: 936854300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:15:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 14:15:58,527][49750] Updated weights for policy 0, policy_version 194341 (0.0026) [2024-04-26 14:16:01,740][49750] Updated weights for policy 0, policy_version 194351 (0.0028) [2024-04-26 14:16:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.6, 300 sec: 50929.3). Total num frames: 3184263168. Throughput: 0: 50868.2. Samples: 937162500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:16:02,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 14:16:04,787][49750] Updated weights for policy 0, policy_version 194361 (0.0028) [2024-04-26 14:16:07,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 3184508928. Throughput: 0: 50905.8. Samples: 937308760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:16:07,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 14:16:08,006][49750] Updated weights for policy 0, policy_version 194371 (0.0034) [2024-04-26 14:16:11,322][49750] Updated weights for policy 0, policy_version 194381 (0.0032) [2024-04-26 14:16:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 3184754688. Throughput: 0: 51019.6. Samples: 937618980. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:16:12,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 14:16:14,334][49750] Updated weights for policy 0, policy_version 194391 (0.0030) [2024-04-26 14:16:17,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3185033216. Throughput: 0: 51099.8. Samples: 937921440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:16:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 14:16:17,808][49750] Updated weights for policy 0, policy_version 194401 (0.0028) [2024-04-26 14:16:20,847][49750] Updated weights for policy 0, policy_version 194411 (0.0030) [2024-04-26 14:16:22,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 3185295360. Throughput: 0: 51051.5. Samples: 938084020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 14:16:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 14:16:24,123][49750] Updated weights for policy 0, policy_version 194421 (0.0030) [2024-04-26 14:16:27,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50929.3). Total num frames: 3185524736. Throughput: 0: 51263.1. Samples: 938391520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:16:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 14:16:27,292][49750] Updated weights for policy 0, policy_version 194431 (0.0032) [2024-04-26 14:16:30,393][49750] Updated weights for policy 0, policy_version 194441 (0.0033) [2024-04-26 14:16:32,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.6, 300 sec: 50929.2). Total num frames: 3185770496. Throughput: 0: 51168.2. Samples: 938695580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:16:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 14:16:33,280][49728] Signal inference workers to stop experience collection... 
(14000 times) [2024-04-26 14:16:33,281][49728] Signal inference workers to resume experience collection... (14000 times) [2024-04-26 14:16:33,294][49750] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-04-26 14:16:33,295][49750] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-04-26 14:16:33,567][49750] Updated weights for policy 0, policy_version 194451 (0.0034) [2024-04-26 14:16:36,942][49750] Updated weights for policy 0, policy_version 194461 (0.0033) [2024-04-26 14:16:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50984.8). Total num frames: 3186049024. Throughput: 0: 51025.0. Samples: 938839040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:16:37,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 14:16:40,071][49750] Updated weights for policy 0, policy_version 194471 (0.0033) [2024-04-26 14:16:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3186294784. Throughput: 0: 50878.0. Samples: 939143800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:16:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:16:42,112][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000194477_3186311168.pth... [2024-04-26 14:16:42,155][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000193729_3174055936.pth [2024-04-26 14:16:43,461][49750] Updated weights for policy 0, policy_version 194481 (0.0034) [2024-04-26 14:16:46,485][49750] Updated weights for policy 0, policy_version 194491 (0.0032) [2024-04-26 14:16:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 3186556928. Throughput: 0: 50852.4. Samples: 939450860. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:16:47,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 14:16:49,977][49750] Updated weights for policy 0, policy_version 194501 (0.0037) [2024-04-26 14:16:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 3186786304. Throughput: 0: 51090.7. Samples: 939607840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:16:52,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 14:16:53,069][49750] Updated weights for policy 0, policy_version 194511 (0.0030) [2024-04-26 14:16:56,341][49750] Updated weights for policy 0, policy_version 194521 (0.0031) [2024-04-26 14:16:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3187048448. Throughput: 0: 50947.9. Samples: 939911640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:16:57,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 14:16:59,382][49750] Updated weights for policy 0, policy_version 194531 (0.0031) [2024-04-26 14:17:02,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 3187326976. Throughput: 0: 50947.7. Samples: 940214080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:17:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 14:17:02,788][49750] Updated weights for policy 0, policy_version 194541 (0.0032) [2024-04-26 14:17:05,772][49750] Updated weights for policy 0, policy_version 194551 (0.0038) [2024-04-26 14:17:07,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 3187589120. Throughput: 0: 50868.5. Samples: 940373100. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:17:07,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 14:17:09,106][49750] Updated weights for policy 0, policy_version 194561 (0.0032) [2024-04-26 14:17:12,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3187802112. Throughput: 0: 50804.8. Samples: 940677740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:17:12,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 14:17:12,354][49750] Updated weights for policy 0, policy_version 194571 (0.0030) [2024-04-26 14:17:16,094][49750] Updated weights for policy 0, policy_version 194581 (0.0033) [2024-04-26 14:17:17,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.3, 300 sec: 50984.8). Total num frames: 3188064256. Throughput: 0: 50836.3. Samples: 940983220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:17:17,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 14:17:18,931][49750] Updated weights for policy 0, policy_version 194591 (0.0033) [2024-04-26 14:17:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 3188326400. Throughput: 0: 50887.6. Samples: 941128980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:17:22,070][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 14:17:22,645][49750] Updated weights for policy 0, policy_version 194601 (0.0029) [2024-04-26 14:17:25,254][49750] Updated weights for policy 0, policy_version 194611 (0.0032) [2024-04-26 14:17:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3188588544. Throughput: 0: 50896.0. Samples: 941434120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-26 14:17:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 14:17:29,012][49750] Updated weights for policy 0, policy_version 194621 (0.0028) [2024-04-26 14:17:31,577][49750] Updated weights for policy 0, policy_version 194631 (0.0034) [2024-04-26 14:17:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3188834304. Throughput: 0: 50805.4. Samples: 941737100. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:17:32,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 14:17:35,247][49750] Updated weights for policy 0, policy_version 194641 (0.0030) [2024-04-26 14:17:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 3189096448. Throughput: 0: 50801.8. Samples: 941893920. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:17:37,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:17:38,061][49750] Updated weights for policy 0, policy_version 194651 (0.0034) [2024-04-26 14:17:41,768][49750] Updated weights for policy 0, policy_version 194661 (0.0026) [2024-04-26 14:17:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3189342208. Throughput: 0: 50824.8. Samples: 942198760. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:17:42,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 14:17:44,651][49750] Updated weights for policy 0, policy_version 194671 (0.0037) [2024-04-26 14:17:45,504][49728] Signal inference workers to stop experience collection... (14050 times) [2024-04-26 14:17:45,562][49750] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-04-26 14:17:45,568][49728] Signal inference workers to resume experience collection... 
(14050 times) [2024-04-26 14:17:45,573][49750] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-04-26 14:17:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3189604352. Throughput: 0: 50925.8. Samples: 942505740. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:17:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 14:17:48,251][49750] Updated weights for policy 0, policy_version 194681 (0.0030) [2024-04-26 14:17:50,904][49750] Updated weights for policy 0, policy_version 194691 (0.0036) [2024-04-26 14:17:52,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.6, 300 sec: 50930.1). Total num frames: 3189882880. Throughput: 0: 50839.0. Samples: 942660860. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:17:52,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 14:17:54,515][49750] Updated weights for policy 0, policy_version 194701 (0.0037) [2024-04-26 14:17:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.5, 300 sec: 51040.3). Total num frames: 3190128640. Throughput: 0: 50927.7. Samples: 942969480. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:17:57,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:17:57,361][49750] Updated weights for policy 0, policy_version 194711 (0.0032) [2024-04-26 14:18:00,894][49750] Updated weights for policy 0, policy_version 194721 (0.0034) [2024-04-26 14:18:02,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 3190358016. Throughput: 0: 50979.7. Samples: 943277300. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:18:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 14:18:03,821][49750] Updated weights for policy 0, policy_version 194731 (0.0027) [2024-04-26 14:18:07,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 3190620160. Throughput: 0: 50952.8. Samples: 943421860. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:18:07,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 14:18:07,175][49750] Updated weights for policy 0, policy_version 194741 (0.0032) [2024-04-26 14:18:10,366][49750] Updated weights for policy 0, policy_version 194751 (0.0031) [2024-04-26 14:18:12,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 3190898688. Throughput: 0: 50953.6. Samples: 943727040. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:18:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 14:18:13,696][49750] Updated weights for policy 0, policy_version 194761 (0.0033) [2024-04-26 14:18:16,885][49750] Updated weights for policy 0, policy_version 194771 (0.0030) [2024-04-26 14:18:17,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3191128064. Throughput: 0: 50946.7. Samples: 944029700. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:18:17,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 14:18:20,042][49750] Updated weights for policy 0, policy_version 194781 (0.0029) [2024-04-26 14:18:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.3, 300 sec: 50984.8). Total num frames: 3191390208. Throughput: 0: 51103.3. Samples: 944193580. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:18:22,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 14:18:23,139][49750] Updated weights for policy 0, policy_version 194791 (0.0033) [2024-04-26 14:18:26,364][49750] Updated weights for policy 0, policy_version 194801 (0.0032) [2024-04-26 14:18:27,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 3191652352. Throughput: 0: 51097.8. Samples: 944498160. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:18:27,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 14:18:29,397][49750] Updated weights for policy 0, policy_version 194811 (0.0030) [2024-04-26 14:18:32,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3191881728. Throughput: 0: 50984.8. Samples: 944800060. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:18:32,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 14:18:32,834][49750] Updated weights for policy 0, policy_version 194821 (0.0026) [2024-04-26 14:18:35,815][49750] Updated weights for policy 0, policy_version 194831 (0.0031) [2024-04-26 14:18:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3192160256. Throughput: 0: 51119.3. Samples: 944961220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:18:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 14:18:39,367][49750] Updated weights for policy 0, policy_version 194841 (0.0035) [2024-04-26 14:18:42,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50984.9). Total num frames: 3192406016. Throughput: 0: 50946.6. Samples: 945262080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:18:42,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 14:18:42,176][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000194850_3192422400.pth... [2024-04-26 14:18:42,221][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000194104_3180199936.pth [2024-04-26 14:18:42,401][49750] Updated weights for policy 0, policy_version 194851 (0.0031) [2024-04-26 14:18:45,821][49750] Updated weights for policy 0, policy_version 194861 (0.0028) [2024-04-26 14:18:47,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.3, 300 sec: 50984.8). Total num frames: 3192651776. Throughput: 0: 50982.0. Samples: 945571500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:18:47,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 14:18:47,678][49728] Signal inference workers to stop experience collection... (14100 times) [2024-04-26 14:18:47,717][49750] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-04-26 14:18:47,748][49728] Signal inference workers to resume experience collection... (14100 times) [2024-04-26 14:18:47,750][49750] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-04-26 14:18:48,657][49750] Updated weights for policy 0, policy_version 194871 (0.0028) [2024-04-26 14:18:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50929.2). Total num frames: 3192913920. Throughput: 0: 51087.6. Samples: 945720800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:18:52,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:18:52,140][49750] Updated weights for policy 0, policy_version 194881 (0.0031) [2024-04-26 14:18:54,959][49750] Updated weights for policy 0, policy_version 194891 (0.0032) [2024-04-26 14:18:57,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3193159680. Throughput: 0: 51004.6. Samples: 946022240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:18:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 14:18:58,670][49750] Updated weights for policy 0, policy_version 194901 (0.0033) [2024-04-26 14:19:01,465][49750] Updated weights for policy 0, policy_version 194911 (0.0027) [2024-04-26 14:19:02,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 3193438208. Throughput: 0: 50878.2. Samples: 946319220. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:19:02,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 14:19:05,160][49750] Updated weights for policy 0, policy_version 194921 (0.0032) [2024-04-26 14:19:07,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 3193683968. Throughput: 0: 50810.5. Samples: 946480040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:19:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:19:08,231][49750] Updated weights for policy 0, policy_version 194931 (0.0031) [2024-04-26 14:19:11,583][49750] Updated weights for policy 0, policy_version 194941 (0.0028) [2024-04-26 14:19:12,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 3193929728. Throughput: 0: 50886.2. Samples: 946788040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:19:12,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 14:19:14,557][49750] Updated weights for policy 0, policy_version 194951 (0.0033) [2024-04-26 14:19:17,063][49517] Fps is (10 sec: 50789.3, 60 sec: 51063.2, 300 sec: 50929.2). Total num frames: 3194191872. Throughput: 0: 50901.6. Samples: 947090640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:19:17,064][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:19:17,984][49750] Updated weights for policy 0, policy_version 194961 (0.0032) [2024-04-26 14:19:20,845][49750] Updated weights for policy 0, policy_version 194971 (0.0031) [2024-04-26 14:19:22,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3194454016. Throughput: 0: 50842.1. Samples: 947249120. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:19:22,063][49517] Avg episode reward: [(0, '0.441')] [2024-04-26 14:19:24,413][49750] Updated weights for policy 0, policy_version 194981 (0.0037) [2024-04-26 14:19:27,063][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 3194716160. Throughput: 0: 50905.8. Samples: 947552840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:19:27,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 14:19:27,319][49750] Updated weights for policy 0, policy_version 194991 (0.0032) [2024-04-26 14:19:30,781][49750] Updated weights for policy 0, policy_version 195001 (0.0031) [2024-04-26 14:19:32,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 3194929152. Throughput: 0: 50669.2. Samples: 947851620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:19:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 14:19:33,874][49750] Updated weights for policy 0, policy_version 195011 (0.0027) [2024-04-26 14:19:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3195207680. Throughput: 0: 50518.7. Samples: 947994140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:19:37,064][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 14:19:37,355][49750] Updated weights for policy 0, policy_version 195021 (0.0032) [2024-04-26 14:19:38,493][49728] Signal inference workers to stop experience collection... (14150 times) [2024-04-26 14:19:38,500][49728] Signal inference workers to resume experience collection... 
(14150 times) [2024-04-26 14:19:38,522][49750] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-04-26 14:19:38,522][49750] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-04-26 14:19:40,381][49750] Updated weights for policy 0, policy_version 195031 (0.0034) [2024-04-26 14:19:42,062][49517] Fps is (10 sec: 54068.5, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3195469824. Throughput: 0: 50710.2. Samples: 948304200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 14:19:42,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:19:44,094][49750] Updated weights for policy 0, policy_version 195041 (0.0034) [2024-04-26 14:19:46,782][49750] Updated weights for policy 0, policy_version 195051 (0.0029) [2024-04-26 14:19:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3195715584. Throughput: 0: 50796.5. Samples: 948605060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:19:47,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 14:19:50,591][49750] Updated weights for policy 0, policy_version 195061 (0.0027) [2024-04-26 14:19:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 3195961344. Throughput: 0: 50767.6. Samples: 948764580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:19:52,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 14:19:53,178][49750] Updated weights for policy 0, policy_version 195071 (0.0032) [2024-04-26 14:19:57,008][49750] Updated weights for policy 0, policy_version 195081 (0.0037) [2024-04-26 14:19:57,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3196207104. Throughput: 0: 50673.3. Samples: 949068340. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:19:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 14:19:59,683][49750] Updated weights for policy 0, policy_version 195091 (0.0040) [2024-04-26 14:20:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3196452864. Throughput: 0: 50667.0. Samples: 949370640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:02,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 14:20:03,466][49750] Updated weights for policy 0, policy_version 195101 (0.0034) [2024-04-26 14:20:06,303][49750] Updated weights for policy 0, policy_version 195111 (0.0032) [2024-04-26 14:20:07,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3196747776. Throughput: 0: 50596.0. Samples: 949525940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:07,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 14:20:09,839][49750] Updated weights for policy 0, policy_version 195121 (0.0030) [2024-04-26 14:20:12,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3196993536. Throughput: 0: 50638.8. Samples: 949831580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:12,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 14:20:12,643][49750] Updated weights for policy 0, policy_version 195131 (0.0034) [2024-04-26 14:20:16,469][49750] Updated weights for policy 0, policy_version 195141 (0.0034) [2024-04-26 14:20:17,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50244.5, 300 sec: 50873.7). Total num frames: 3197206528. Throughput: 0: 50717.1. Samples: 950133880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:17,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 14:20:19,008][49750] Updated weights for policy 0, policy_version 195151 (0.0041) [2024-04-26 14:20:22,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50873.7). 
Total num frames: 3197485056. Throughput: 0: 50750.6. Samples: 950277920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:20:22,934][49750] Updated weights for policy 0, policy_version 195161 (0.0029) [2024-04-26 14:20:25,545][49750] Updated weights for policy 0, policy_version 195171 (0.0028) [2024-04-26 14:20:27,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3197730816. Throughput: 0: 50570.6. Samples: 950579880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 14:20:29,342][49750] Updated weights for policy 0, policy_version 195181 (0.0037) [2024-04-26 14:20:31,934][49750] Updated weights for policy 0, policy_version 195191 (0.0032) [2024-04-26 14:20:32,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.6, 300 sec: 50818.1). Total num frames: 3198009344. Throughput: 0: 50639.4. Samples: 950883840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:32,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 14:20:35,818][49750] Updated weights for policy 0, policy_version 195201 (0.0036) [2024-04-26 14:20:37,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3198238720. Throughput: 0: 50599.9. Samples: 951041580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:37,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-26 14:20:38,460][49750] Updated weights for policy 0, policy_version 195211 (0.0032) [2024-04-26 14:20:42,062][49517] Fps is (10 sec: 45876.0, 60 sec: 49971.2, 300 sec: 50818.2). Total num frames: 3198468096. Throughput: 0: 50532.6. Samples: 951342300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:42,063][49517] Avg episode reward: [(0, '0.399')] [2024-04-26 14:20:42,101][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000195220_3198484480.pth... [2024-04-26 14:20:42,148][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000194477_3186311168.pth [2024-04-26 14:20:42,574][49750] Updated weights for policy 0, policy_version 195221 (0.0031) [2024-04-26 14:20:44,908][49750] Updated weights for policy 0, policy_version 195231 (0.0033) [2024-04-26 14:20:46,758][49728] Signal inference workers to stop experience collection... (14200 times) [2024-04-26 14:20:46,759][49728] Signal inference workers to resume experience collection... (14200 times) [2024-04-26 14:20:46,791][49750] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-04-26 14:20:46,791][49750] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-04-26 14:20:47,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3198746624. Throughput: 0: 50565.3. Samples: 951646080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 14:20:47,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 14:20:48,896][49750] Updated weights for policy 0, policy_version 195241 (0.0037) [2024-04-26 14:20:51,504][49750] Updated weights for policy 0, policy_version 195251 (0.0032) [2024-04-26 14:20:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3198992384. Throughput: 0: 50551.2. Samples: 951800740. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:20:52,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 14:20:55,312][49750] Updated weights for policy 0, policy_version 195261 (0.0035) [2024-04-26 14:20:57,063][49517] Fps is (10 sec: 52427.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3199270912. Throughput: 0: 50559.3. Samples: 952106760. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:20:57,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 14:20:57,817][49750] Updated weights for policy 0, policy_version 195271 (0.0038) [2024-04-26 14:21:01,809][49750] Updated weights for policy 0, policy_version 195281 (0.0028) [2024-04-26 14:21:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3199500288. Throughput: 0: 50768.4. Samples: 952418460. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 14:21:04,511][49750] Updated weights for policy 0, policy_version 195291 (0.0028) [2024-04-26 14:21:07,062][49517] Fps is (10 sec: 47514.8, 60 sec: 49971.3, 300 sec: 50818.2). Total num frames: 3199746048. Throughput: 0: 50695.7. Samples: 952559220. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 14:21:08,303][49750] Updated weights for policy 0, policy_version 195301 (0.0037) [2024-04-26 14:21:11,077][49750] Updated weights for policy 0, policy_version 195311 (0.0027) [2024-04-26 14:21:12,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3200024576. Throughput: 0: 50703.0. Samples: 952861520. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 14:21:14,643][49750] Updated weights for policy 0, policy_version 195321 (0.0037) [2024-04-26 14:21:17,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3200270336. Throughput: 0: 50666.2. Samples: 953163820. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:17,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 14:21:17,570][49750] Updated weights for policy 0, policy_version 195331 (0.0039) [2024-04-26 14:21:21,189][49750] Updated weights for policy 0, policy_version 195341 (0.0029) [2024-04-26 14:21:22,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3200532480. Throughput: 0: 50724.0. Samples: 953324160. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:22,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 14:21:23,954][49750] Updated weights for policy 0, policy_version 195351 (0.0028) [2024-04-26 14:21:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3200761856. Throughput: 0: 50767.1. Samples: 953626820. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:27,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 14:21:27,708][49750] Updated weights for policy 0, policy_version 195361 (0.0030) [2024-04-26 14:21:30,268][49750] Updated weights for policy 0, policy_version 195371 (0.0035) [2024-04-26 14:21:32,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3201024000. Throughput: 0: 50780.3. Samples: 953931200. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:32,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 14:21:34,123][49750] Updated weights for policy 0, policy_version 195381 (0.0034) [2024-04-26 14:21:36,823][49750] Updated weights for policy 0, policy_version 195391 (0.0033) [2024-04-26 14:21:37,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3201286144. Throughput: 0: 50766.5. Samples: 954085240. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:37,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 14:21:40,620][49750] Updated weights for policy 0, policy_version 195401 (0.0029) [2024-04-26 14:21:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3201548288. Throughput: 0: 50895.4. Samples: 954397040. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:42,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 14:21:43,197][49750] Updated weights for policy 0, policy_version 195411 (0.0031) [2024-04-26 14:21:47,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3201761280. Throughput: 0: 50666.2. Samples: 954698440. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:47,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 14:21:47,136][49750] Updated weights for policy 0, policy_version 195421 (0.0030) [2024-04-26 14:21:49,500][49750] Updated weights for policy 0, policy_version 195431 (0.0028) [2024-04-26 14:21:50,262][49728] Signal inference workers to stop experience collection... (14250 times) [2024-04-26 14:21:50,308][49750] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-04-26 14:21:50,324][49728] Signal inference workers to resume experience collection... (14250 times) [2024-04-26 14:21:50,332][49750] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-04-26 14:21:52,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3202039808. Throughput: 0: 50672.3. Samples: 954839480. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-04-26 14:21:52,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 14:21:53,522][49750] Updated weights for policy 0, policy_version 195441 (0.0031) [2024-04-26 14:21:56,040][49750] Updated weights for policy 0, policy_version 195451 (0.0033) [2024-04-26 14:21:57,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3202301952. Throughput: 0: 50782.8. Samples: 955146740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:21:57,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:21:59,790][49750] Updated weights for policy 0, policy_version 195461 (0.0032) [2024-04-26 14:22:02,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3202564096. Throughput: 0: 50833.8. Samples: 955451340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 14:22:02,692][49750] Updated weights for policy 0, policy_version 195471 (0.0029) [2024-04-26 14:22:06,263][49750] Updated weights for policy 0, policy_version 195481 (0.0029) [2024-04-26 14:22:07,063][49517] Fps is (10 sec: 52427.4, 60 sec: 51336.3, 300 sec: 50929.2). Total num frames: 3202826240. Throughput: 0: 50755.8. Samples: 955608180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:07,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 14:22:09,020][49750] Updated weights for policy 0, policy_version 195491 (0.0030) [2024-04-26 14:22:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3203055616. Throughput: 0: 50926.6. Samples: 955918520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:12,063][49517] Avg episode reward: [(0, '0.697')] [2024-04-26 14:22:12,803][49750] Updated weights for policy 0, policy_version 195501 (0.0029) [2024-04-26 14:22:15,425][49750] Updated weights for policy 0, policy_version 195511 (0.0029) [2024-04-26 14:22:17,063][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3203301376. Throughput: 0: 50774.2. Samples: 956216040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:22:19,223][49750] Updated weights for policy 0, policy_version 195521 (0.0031) [2024-04-26 14:22:21,919][49750] Updated weights for policy 0, policy_version 195531 (0.0031) [2024-04-26 14:22:22,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3203579904. Throughput: 0: 50774.7. Samples: 956370100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 14:22:25,591][49750] Updated weights for policy 0, policy_version 195541 (0.0027) [2024-04-26 14:22:27,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3203842048. Throughput: 0: 50678.6. Samples: 956677580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 14:22:28,167][49750] Updated weights for policy 0, policy_version 195551 (0.0034) [2024-04-26 14:22:32,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3204055040. Throughput: 0: 50792.0. Samples: 956984080. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:32,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 14:22:32,074][49750] Updated weights for policy 0, policy_version 195561 (0.0036) [2024-04-26 14:22:34,470][49750] Updated weights for policy 0, policy_version 195571 (0.0031) [2024-04-26 14:22:37,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3204317184. Throughput: 0: 50671.3. Samples: 957119680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:37,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 14:22:38,595][49750] Updated weights for policy 0, policy_version 195581 (0.0033) [2024-04-26 14:22:40,933][49750] Updated weights for policy 0, policy_version 195591 (0.0031) [2024-04-26 14:22:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3204579328. Throughput: 0: 50633.1. Samples: 957425240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:42,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 14:22:42,153][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000195593_3204595712.pth... [2024-04-26 14:22:42,204][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000194850_3192422400.pth [2024-04-26 14:22:45,097][49750] Updated weights for policy 0, policy_version 195601 (0.0034) [2024-04-26 14:22:47,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51609.6, 300 sec: 50762.6). Total num frames: 3204857856. Throughput: 0: 50775.2. Samples: 957736220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:47,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 14:22:47,834][49750] Updated weights for policy 0, policy_version 195611 (0.0031) [2024-04-26 14:22:51,453][49750] Updated weights for policy 0, policy_version 195621 (0.0035) [2024-04-26 14:22:52,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50762.6). 
Total num frames: 3205103616. Throughput: 0: 50641.5. Samples: 957887040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:52,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 14:22:54,212][49750] Updated weights for policy 0, policy_version 195631 (0.0036) [2024-04-26 14:22:55,638][49728] Signal inference workers to stop experience collection... (14300 times) [2024-04-26 14:22:55,691][49750] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-04-26 14:22:55,695][49728] Signal inference workers to resume experience collection... (14300 times) [2024-04-26 14:22:55,703][49750] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-04-26 14:22:57,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3205332992. Throughput: 0: 50687.9. Samples: 958199480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:22:57,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 14:22:57,782][49750] Updated weights for policy 0, policy_version 195641 (0.0029) [2024-04-26 14:23:00,701][49750] Updated weights for policy 0, policy_version 195651 (0.0037) [2024-04-26 14:23:02,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3205595136. Throughput: 0: 50834.2. Samples: 958503580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:23:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:23:04,209][49750] Updated weights for policy 0, policy_version 195661 (0.0036) [2024-04-26 14:23:07,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.6, 300 sec: 50707.1). Total num frames: 3205857280. Throughput: 0: 50765.0. Samples: 958654520. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:23:07,097][49750] Updated weights for policy 0, policy_version 195671 (0.0035) [2024-04-26 14:23:10,671][49750] Updated weights for policy 0, policy_version 195681 (0.0029) [2024-04-26 14:23:12,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3206119424. Throughput: 0: 50759.5. Samples: 958961760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:12,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 14:23:13,440][49750] Updated weights for policy 0, policy_version 195691 (0.0029) [2024-04-26 14:23:17,032][49750] Updated weights for policy 0, policy_version 195701 (0.0029) [2024-04-26 14:23:17,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3206365184. Throughput: 0: 50651.5. Samples: 959263400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:17,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 14:23:19,854][49750] Updated weights for policy 0, policy_version 195711 (0.0029) [2024-04-26 14:23:22,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 3206594560. Throughput: 0: 50972.0. Samples: 959413420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:22,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 14:23:23,455][49750] Updated weights for policy 0, policy_version 195721 (0.0036) [2024-04-26 14:23:26,413][49750] Updated weights for policy 0, policy_version 195731 (0.0029) [2024-04-26 14:23:27,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3206856704. Throughput: 0: 50840.2. Samples: 959713040. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:27,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:23:29,991][49750] Updated weights for policy 0, policy_version 195741 (0.0032) [2024-04-26 14:23:32,062][49517] Fps is (10 sec: 55705.2, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 3207151616. Throughput: 0: 50771.1. Samples: 960020920. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:32,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 14:23:32,811][49750] Updated weights for policy 0, policy_version 195751 (0.0029) [2024-04-26 14:23:36,423][49750] Updated weights for policy 0, policy_version 195761 (0.0031) [2024-04-26 14:23:37,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3207380992. Throughput: 0: 50982.6. Samples: 960181260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:37,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 14:23:39,284][49750] Updated weights for policy 0, policy_version 195771 (0.0029) [2024-04-26 14:23:42,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3207610368. Throughput: 0: 50783.7. Samples: 960484740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 14:23:42,752][49750] Updated weights for policy 0, policy_version 195781 (0.0030) [2024-04-26 14:23:45,946][49750] Updated weights for policy 0, policy_version 195791 (0.0032) [2024-04-26 14:23:47,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3207888896. Throughput: 0: 50726.3. Samples: 960786260. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:47,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 14:23:49,168][49750] Updated weights for policy 0, policy_version 195801 (0.0029) [2024-04-26 14:23:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3208134656. Throughput: 0: 50750.2. Samples: 960938280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:52,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 14:23:52,394][49750] Updated weights for policy 0, policy_version 195811 (0.0029) [2024-04-26 14:23:55,615][49750] Updated weights for policy 0, policy_version 195821 (0.0031) [2024-04-26 14:23:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 3208413184. Throughput: 0: 50787.9. Samples: 961247220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:23:57,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 14:23:58,985][49750] Updated weights for policy 0, policy_version 195831 (0.0029) [2024-04-26 14:24:02,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3208658944. Throughput: 0: 50985.3. Samples: 961557740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:24:02,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 14:24:02,065][49750] Updated weights for policy 0, policy_version 195841 (0.0043) [2024-04-26 14:24:05,463][49750] Updated weights for policy 0, policy_version 195851 (0.0031) [2024-04-26 14:24:07,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3208888320. Throughput: 0: 50832.8. Samples: 961700900. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-26 14:24:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 14:24:08,445][49750] Updated weights for policy 0, policy_version 195861 (0.0030) [2024-04-26 14:24:11,890][49750] Updated weights for policy 0, policy_version 195871 (0.0028) [2024-04-26 14:24:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3209150464. Throughput: 0: 50887.5. Samples: 962002980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:12,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 14:24:14,863][49750] Updated weights for policy 0, policy_version 195881 (0.0029) [2024-04-26 14:24:17,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3209428992. Throughput: 0: 50750.7. Samples: 962304700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:17,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 14:24:18,232][49750] Updated weights for policy 0, policy_version 195891 (0.0033) [2024-04-26 14:24:18,983][49728] Signal inference workers to stop experience collection... (14350 times) [2024-04-26 14:24:19,027][49750] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-04-26 14:24:19,089][49728] Signal inference workers to resume experience collection... (14350 times) [2024-04-26 14:24:19,089][49750] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-04-26 14:24:21,397][49750] Updated weights for policy 0, policy_version 195901 (0.0026) [2024-04-26 14:24:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 3209674752. Throughput: 0: 50738.1. Samples: 962464460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:24:24,773][49750] Updated weights for policy 0, policy_version 195911 (0.0029) [2024-04-26 14:24:27,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3209887744. Throughput: 0: 50675.0. Samples: 962765120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:27,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 14:24:27,821][49750] Updated weights for policy 0, policy_version 195921 (0.0032) [2024-04-26 14:24:31,264][49750] Updated weights for policy 0, policy_version 195931 (0.0029) [2024-04-26 14:24:32,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3210182656. Throughput: 0: 50758.3. Samples: 963070380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:32,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 14:24:34,345][49750] Updated weights for policy 0, policy_version 195941 (0.0033) [2024-04-26 14:24:37,063][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3210428416. Throughput: 0: 50724.3. Samples: 963220880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:37,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 14:24:37,656][49750] Updated weights for policy 0, policy_version 195951 (0.0037) [2024-04-26 14:24:40,901][49750] Updated weights for policy 0, policy_version 195961 (0.0033) [2024-04-26 14:24:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 3210690560. Throughput: 0: 50644.5. Samples: 963526220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:42,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 14:24:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000195965_3210690560.pth... 
[2024-04-26 14:24:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000195220_3198484480.pth [2024-04-26 14:24:44,040][49750] Updated weights for policy 0, policy_version 195971 (0.0029) [2024-04-26 14:24:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3210936320. Throughput: 0: 50590.6. Samples: 963834320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:47,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:24:47,257][49750] Updated weights for policy 0, policy_version 195981 (0.0035) [2024-04-26 14:24:50,651][49750] Updated weights for policy 0, policy_version 195991 (0.0029) [2024-04-26 14:24:52,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3211165696. Throughput: 0: 50655.4. Samples: 963980400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:52,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:24:53,785][49750] Updated weights for policy 0, policy_version 196001 (0.0032) [2024-04-26 14:24:57,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.4, 300 sec: 50707.1). Total num frames: 3211411456. Throughput: 0: 50543.6. Samples: 964277440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:24:57,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:24:57,291][49750] Updated weights for policy 0, policy_version 196011 (0.0032) [2024-04-26 14:25:00,210][49750] Updated weights for policy 0, policy_version 196021 (0.0027) [2024-04-26 14:25:02,062][49517] Fps is (10 sec: 54067.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3211706368. Throughput: 0: 50598.6. Samples: 964581640. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:25:02,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 14:25:03,747][49750] Updated weights for policy 0, policy_version 196031 (0.0026) [2024-04-26 14:25:06,527][49750] Updated weights for policy 0, policy_version 196041 (0.0030) [2024-04-26 14:25:07,062][49517] Fps is (10 sec: 55705.9, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 3211968512. Throughput: 0: 50681.3. Samples: 964745120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:25:07,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 14:25:10,201][49750] Updated weights for policy 0, policy_version 196051 (0.0031) [2024-04-26 14:25:12,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3212181504. Throughput: 0: 50690.8. Samples: 965046200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 14:25:12,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 14:25:12,981][49750] Updated weights for policy 0, policy_version 196061 (0.0034) [2024-04-26 14:25:16,529][49750] Updated weights for policy 0, policy_version 196071 (0.0032) [2024-04-26 14:25:17,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3212443648. Throughput: 0: 50631.1. Samples: 965348780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:25:17,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 14:25:19,445][49750] Updated weights for policy 0, policy_version 196081 (0.0031) [2024-04-26 14:25:22,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3212705792. Throughput: 0: 50465.8. Samples: 965491840. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:25:22,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 14:25:23,151][49750] Updated weights for policy 0, policy_version 196091 (0.0035) [2024-04-26 14:25:25,480][49728] Signal inference workers to stop experience collection... (14400 times) [2024-04-26 14:25:25,529][49750] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-04-26 14:25:25,548][49728] Signal inference workers to resume experience collection... (14400 times) [2024-04-26 14:25:25,550][49750] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-04-26 14:25:25,811][49750] Updated weights for policy 0, policy_version 196101 (0.0028) [2024-04-26 14:25:27,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.7, 300 sec: 50707.1). Total num frames: 3212967936. Throughput: 0: 50639.3. Samples: 965804980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:25:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 14:25:29,480][49750] Updated weights for policy 0, policy_version 196111 (0.0029) [2024-04-26 14:25:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3213213696. Throughput: 0: 50648.5. Samples: 966113500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:25:32,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 14:25:32,379][49750] Updated weights for policy 0, policy_version 196121 (0.0028) [2024-04-26 14:25:35,993][49750] Updated weights for policy 0, policy_version 196131 (0.0029) [2024-04-26 14:25:37,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3213443072. Throughput: 0: 50555.6. Samples: 966255400. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:25:37,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 14:25:38,747][49750] Updated weights for policy 0, policy_version 196141 (0.0031) [2024-04-26 14:25:42,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50707.0). Total num frames: 3213705216. Throughput: 0: 50739.3. Samples: 966560720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:25:42,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 14:25:42,406][49750] Updated weights for policy 0, policy_version 196151 (0.0028) [2024-04-26 14:25:45,097][49750] Updated weights for policy 0, policy_version 196161 (0.0038) [2024-04-26 14:25:47,062][49517] Fps is (10 sec: 54067.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3213983744. Throughput: 0: 50740.0. Samples: 966864940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:25:47,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 14:25:48,879][49750] Updated weights for policy 0, policy_version 196171 (0.0033) [2024-04-26 14:25:51,470][49750] Updated weights for policy 0, policy_version 196181 (0.0027) [2024-04-26 14:25:52,063][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3214229504. Throughput: 0: 50652.8. Samples: 967024500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:25:52,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 14:25:55,447][49750] Updated weights for policy 0, policy_version 196191 (0.0036) [2024-04-26 14:25:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3214475264. Throughput: 0: 50688.5. Samples: 967327180. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:25:57,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 14:25:57,968][49750] Updated weights for policy 0, policy_version 196201 (0.0031) [2024-04-26 14:26:01,881][49750] Updated weights for policy 0, policy_version 196211 (0.0042) [2024-04-26 14:26:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3214721024. Throughput: 0: 50769.3. Samples: 967633400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:26:02,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:26:04,472][49750] Updated weights for policy 0, policy_version 196221 (0.0032) [2024-04-26 14:26:07,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3214983168. Throughput: 0: 50720.4. Samples: 967774260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:26:07,072][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 14:26:08,360][49750] Updated weights for policy 0, policy_version 196231 (0.0031) [2024-04-26 14:26:11,090][49750] Updated weights for policy 0, policy_version 196241 (0.0033) [2024-04-26 14:26:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3215245312. Throughput: 0: 50617.6. Samples: 968082780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:26:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 14:26:14,779][49750] Updated weights for policy 0, policy_version 196251 (0.0031) [2024-04-26 14:26:17,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3215507456. Throughput: 0: 50712.1. Samples: 968395540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 14:26:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 14:26:17,683][49750] Updated weights for policy 0, policy_version 196261 (0.0031) [2024-04-26 14:26:21,379][49750] Updated weights for policy 0, policy_version 196271 (0.0039) [2024-04-26 14:26:22,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3215720448. Throughput: 0: 50765.4. Samples: 968539840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:26:22,072][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 14:26:24,003][49728] Signal inference workers to stop experience collection... (14450 times) [2024-04-26 14:26:24,063][49750] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-04-26 14:26:24,067][49728] Signal inference workers to resume experience collection... (14450 times) [2024-04-26 14:26:24,077][49750] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-04-26 14:26:24,202][49750] Updated weights for policy 0, policy_version 196281 (0.0030) [2024-04-26 14:26:27,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3215982592. Throughput: 0: 50493.9. Samples: 968832940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:26:27,063][49517] Avg episode reward: [(0, '0.441')] [2024-04-26 14:26:27,939][49750] Updated weights for policy 0, policy_version 196291 (0.0032) [2024-04-26 14:26:30,741][49750] Updated weights for policy 0, policy_version 196301 (0.0026) [2024-04-26 14:26:32,063][49517] Fps is (10 sec: 55705.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3216277504. Throughput: 0: 50517.3. Samples: 969138220. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:26:32,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 14:26:34,347][49750] Updated weights for policy 0, policy_version 196311 (0.0032) [2024-04-26 14:26:37,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3216506880. Throughput: 0: 50513.3. Samples: 969297600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:26:37,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 14:26:37,213][49750] Updated weights for policy 0, policy_version 196321 (0.0031) [2024-04-26 14:26:40,840][49750] Updated weights for policy 0, policy_version 196331 (0.0029) [2024-04-26 14:26:42,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3216736256. Throughput: 0: 50632.4. Samples: 969605640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:26:42,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 14:26:42,160][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000196335_3216752640.pth... [2024-04-26 14:26:42,207][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000195593_3204595712.pth [2024-04-26 14:26:43,561][49750] Updated weights for policy 0, policy_version 196341 (0.0032) [2024-04-26 14:26:47,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 3216982016. Throughput: 0: 50517.9. Samples: 969906700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:26:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 14:26:47,322][49750] Updated weights for policy 0, policy_version 196351 (0.0036) [2024-04-26 14:26:50,036][49750] Updated weights for policy 0, policy_version 196361 (0.0035) [2024-04-26 14:26:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3217260544. Throughput: 0: 50694.3. Samples: 970055500. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:26:52,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 14:26:53,886][49750] Updated weights for policy 0, policy_version 196371 (0.0031) [2024-04-26 14:26:56,508][49750] Updated weights for policy 0, policy_version 196381 (0.0031) [2024-04-26 14:26:57,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3217539072. Throughput: 0: 50629.5. Samples: 970361100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:26:57,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 14:27:00,311][49750] Updated weights for policy 0, policy_version 196391 (0.0031) [2024-04-26 14:27:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50596.1). Total num frames: 3217752064. Throughput: 0: 50469.4. Samples: 970666660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:27:02,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 14:27:02,964][49750] Updated weights for policy 0, policy_version 196401 (0.0030) [2024-04-26 14:27:06,685][49750] Updated weights for policy 0, policy_version 196411 (0.0026) [2024-04-26 14:27:07,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3217997824. Throughput: 0: 50488.0. Samples: 970811800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:27:07,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 14:27:09,378][49750] Updated weights for policy 0, policy_version 196421 (0.0028) [2024-04-26 14:27:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3218259968. Throughput: 0: 50648.5. Samples: 971112120. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:27:12,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 14:27:13,201][49750] Updated weights for policy 0, policy_version 196431 (0.0030) [2024-04-26 14:27:15,874][49750] Updated weights for policy 0, policy_version 196441 (0.0027) [2024-04-26 14:27:17,062][49517] Fps is (10 sec: 55706.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3218554880. Throughput: 0: 50665.0. Samples: 971418140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:27:17,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 14:27:19,702][49750] Updated weights for policy 0, policy_version 196451 (0.0031) [2024-04-26 14:27:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 3218784256. Throughput: 0: 50898.3. Samples: 971588020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:27:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 14:27:22,274][49750] Updated weights for policy 0, policy_version 196461 (0.0034) [2024-04-26 14:27:26,235][49750] Updated weights for policy 0, policy_version 196471 (0.0029) [2024-04-26 14:27:27,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3219030016. Throughput: 0: 50771.1. Samples: 971890340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 14:27:27,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 14:27:28,592][49750] Updated weights for policy 0, policy_version 196481 (0.0028) [2024-04-26 14:27:32,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49698.2, 300 sec: 50651.5). Total num frames: 3219259392. Throughput: 0: 50788.8. Samples: 972192200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:27:32,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 14:27:32,387][49728] Signal inference workers to stop experience collection... 
(14500 times) [2024-04-26 14:27:32,388][49728] Signal inference workers to resume experience collection... (14500 times) [2024-04-26 14:27:32,416][49750] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-04-26 14:27:32,416][49750] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-04-26 14:27:32,516][49750] Updated weights for policy 0, policy_version 196491 (0.0032) [2024-04-26 14:27:35,101][49750] Updated weights for policy 0, policy_version 196501 (0.0030) [2024-04-26 14:27:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3219537920. Throughput: 0: 50842.7. Samples: 972343420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:27:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 14:27:38,969][49750] Updated weights for policy 0, policy_version 196511 (0.0031) [2024-04-26 14:27:41,534][49750] Updated weights for policy 0, policy_version 196521 (0.0028) [2024-04-26 14:27:42,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 3219816448. Throughput: 0: 50766.1. Samples: 972645580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:27:42,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 14:27:45,597][49750] Updated weights for policy 0, policy_version 196531 (0.0030) [2024-04-26 14:27:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3220029440. Throughput: 0: 50811.1. Samples: 972953160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:27:47,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 14:27:48,025][49750] Updated weights for policy 0, policy_version 196541 (0.0028) [2024-04-26 14:27:52,063][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3220275200. Throughput: 0: 50673.3. Samples: 973092100. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:27:52,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 14:27:52,114][49750] Updated weights for policy 0, policy_version 196551 (0.0038) [2024-04-26 14:27:54,543][49750] Updated weights for policy 0, policy_version 196561 (0.0031) [2024-04-26 14:27:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3220553728. Throughput: 0: 50788.4. Samples: 973397600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:27:57,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 14:27:58,479][49750] Updated weights for policy 0, policy_version 196571 (0.0036) [2024-04-26 14:28:00,875][49750] Updated weights for policy 0, policy_version 196581 (0.0036) [2024-04-26 14:28:02,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3220815872. Throughput: 0: 50731.4. Samples: 973701060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:28:02,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 14:28:04,811][49750] Updated weights for policy 0, policy_version 196591 (0.0029) [2024-04-26 14:28:07,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 3221078016. Throughput: 0: 50638.2. Samples: 973866740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:28:07,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:28:07,277][49750] Updated weights for policy 0, policy_version 196601 (0.0031) [2024-04-26 14:28:11,175][49750] Updated weights for policy 0, policy_version 196611 (0.0029) [2024-04-26 14:28:12,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3221307392. Throughput: 0: 50815.9. Samples: 974177060. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:28:12,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 14:28:13,729][49750] Updated weights for policy 0, policy_version 196621 (0.0030) [2024-04-26 14:28:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3221569536. Throughput: 0: 50918.3. Samples: 974483520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:28:17,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 14:28:17,695][49750] Updated weights for policy 0, policy_version 196631 (0.0033) [2024-04-26 14:28:20,101][49750] Updated weights for policy 0, policy_version 196641 (0.0028) [2024-04-26 14:28:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3221831680. Throughput: 0: 50630.2. Samples: 974621780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:28:22,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:28:24,235][49750] Updated weights for policy 0, policy_version 196651 (0.0033) [2024-04-26 14:28:26,568][49750] Updated weights for policy 0, policy_version 196661 (0.0034) [2024-04-26 14:28:27,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 3222110208. Throughput: 0: 50827.6. Samples: 974932820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:28:27,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 14:28:30,516][49750] Updated weights for policy 0, policy_version 196671 (0.0029) [2024-04-26 14:28:31,472][49728] Signal inference workers to stop experience collection... (14550 times) [2024-04-26 14:28:31,500][49750] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-04-26 14:28:31,572][49728] Signal inference workers to resume experience collection... 
(14550 times) [2024-04-26 14:28:31,572][49750] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-04-26 14:28:32,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51609.6, 300 sec: 50762.6). Total num frames: 3222355968. Throughput: 0: 50935.8. Samples: 975245280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-26 14:28:32,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 14:28:33,039][49750] Updated weights for policy 0, policy_version 196681 (0.0031) [2024-04-26 14:28:37,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3222568960. Throughput: 0: 51118.8. Samples: 975392440. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:28:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 14:28:37,141][49750] Updated weights for policy 0, policy_version 196691 (0.0031) [2024-04-26 14:28:39,456][49750] Updated weights for policy 0, policy_version 196701 (0.0035) [2024-04-26 14:28:42,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3222831104. Throughput: 0: 51074.3. Samples: 975695940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:28:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 14:28:42,182][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000196707_3222847488.pth... [2024-04-26 14:28:42,230][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000195965_3210690560.pth [2024-04-26 14:28:43,492][49750] Updated weights for policy 0, policy_version 196711 (0.0031) [2024-04-26 14:28:46,082][49750] Updated weights for policy 0, policy_version 196721 (0.0033) [2024-04-26 14:28:47,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 3223109632. Throughput: 0: 51085.4. Samples: 975999900. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:28:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 14:28:49,883][49750] Updated weights for policy 0, policy_version 196731 (0.0027) [2024-04-26 14:28:52,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.6, 300 sec: 50707.1). Total num frames: 3223371776. Throughput: 0: 51119.0. Samples: 976167100. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:28:52,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 14:28:52,338][49750] Updated weights for policy 0, policy_version 196741 (0.0028) [2024-04-26 14:28:56,391][49750] Updated weights for policy 0, policy_version 196751 (0.0027) [2024-04-26 14:28:57,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3223617536. Throughput: 0: 50843.0. Samples: 976465000. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:28:57,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 14:28:58,943][49750] Updated weights for policy 0, policy_version 196761 (0.0031) [2024-04-26 14:29:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 3223863296. Throughput: 0: 50746.3. Samples: 976767100. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:29:02,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 14:29:02,802][49750] Updated weights for policy 0, policy_version 196771 (0.0034) [2024-04-26 14:29:05,356][49750] Updated weights for policy 0, policy_version 196781 (0.0034) [2024-04-26 14:29:07,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3224125440. Throughput: 0: 50947.5. Samples: 976914420. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:29:07,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 14:29:09,301][49750] Updated weights for policy 0, policy_version 196791 (0.0031) [2024-04-26 14:29:11,837][49750] Updated weights for policy 0, policy_version 196801 (0.0030) [2024-04-26 14:29:12,063][49517] Fps is (10 sec: 52427.4, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 3224387584. Throughput: 0: 50794.1. Samples: 977218560. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:29:12,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:29:15,608][49750] Updated weights for policy 0, policy_version 196811 (0.0031) [2024-04-26 14:29:17,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3224633344. Throughput: 0: 50657.8. Samples: 977524880. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:29:17,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 14:29:18,260][49750] Updated weights for policy 0, policy_version 196821 (0.0030) [2024-04-26 14:29:22,038][49750] Updated weights for policy 0, policy_version 196831 (0.0033) [2024-04-26 14:29:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3224879104. Throughput: 0: 50766.2. Samples: 977676920. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:29:22,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 14:29:24,682][49750] Updated weights for policy 0, policy_version 196841 (0.0032) [2024-04-26 14:29:27,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3225124864. Throughput: 0: 50696.8. Samples: 977977300. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:29:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 14:29:28,516][49750] Updated weights for policy 0, policy_version 196851 (0.0033) [2024-04-26 14:29:31,089][49750] Updated weights for policy 0, policy_version 196861 (0.0038) [2024-04-26 14:29:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3225387008. Throughput: 0: 50644.5. Samples: 978278900. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:29:32,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 14:29:35,042][49750] Updated weights for policy 0, policy_version 196871 (0.0030) [2024-04-26 14:29:36,852][49728] Signal inference workers to stop experience collection... (14600 times) [2024-04-26 14:29:36,852][49728] Signal inference workers to resume experience collection... (14600 times) [2024-04-26 14:29:36,869][49750] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-04-26 14:29:36,869][49750] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-04-26 14:29:37,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 3225649152. Throughput: 0: 50596.4. Samples: 978443940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:29:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:29:37,597][49750] Updated weights for policy 0, policy_version 196881 (0.0029) [2024-04-26 14:29:41,395][49750] Updated weights for policy 0, policy_version 196891 (0.0032) [2024-04-26 14:29:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3225878528. Throughput: 0: 50673.5. Samples: 978745300. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-26 14:29:42,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 14:29:44,036][49750] Updated weights for policy 0, policy_version 196901 (0.0034) [2024-04-26 14:29:47,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3226124288. Throughput: 0: 50637.5. Samples: 979045800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:29:47,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 14:29:47,826][49750] Updated weights for policy 0, policy_version 196911 (0.0034) [2024-04-26 14:29:50,525][49750] Updated weights for policy 0, policy_version 196921 (0.0028) [2024-04-26 14:29:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3226402816. Throughput: 0: 50757.9. Samples: 979198520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:29:52,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:29:54,336][49750] Updated weights for policy 0, policy_version 196931 (0.0031) [2024-04-26 14:29:57,038][49750] Updated weights for policy 0, policy_version 196941 (0.0032) [2024-04-26 14:29:57,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3226681344. Throughput: 0: 50746.8. Samples: 979502160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:29:57,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 14:30:00,764][49750] Updated weights for policy 0, policy_version 196951 (0.0028) [2024-04-26 14:30:02,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.1, 300 sec: 50596.0). Total num frames: 3226894336. Throughput: 0: 50765.3. Samples: 979809320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:02,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 14:30:03,518][49750] Updated weights for policy 0, policy_version 196961 (0.0028) [2024-04-26 14:30:07,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3227156480. Throughput: 0: 50735.5. Samples: 979960020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:07,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:30:07,133][49750] Updated weights for policy 0, policy_version 196971 (0.0030) [2024-04-26 14:30:09,882][49750] Updated weights for policy 0, policy_version 196981 (0.0035) [2024-04-26 14:30:12,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3227402240. Throughput: 0: 50811.3. Samples: 980263800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:12,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 14:30:13,604][49750] Updated weights for policy 0, policy_version 196991 (0.0031) [2024-04-26 14:30:16,376][49750] Updated weights for policy 0, policy_version 197001 (0.0031) [2024-04-26 14:30:17,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3227664384. Throughput: 0: 50710.5. Samples: 980560880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:17,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 14:30:20,172][49750] Updated weights for policy 0, policy_version 197011 (0.0033) [2024-04-26 14:30:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3227926528. Throughput: 0: 50694.6. Samples: 980725200. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:22,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 14:30:22,740][49750] Updated weights for policy 0, policy_version 197021 (0.0028) [2024-04-26 14:30:26,600][49750] Updated weights for policy 0, policy_version 197031 (0.0032) [2024-04-26 14:30:27,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3228188672. Throughput: 0: 50740.8. Samples: 981028640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:27,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 14:30:29,126][49750] Updated weights for policy 0, policy_version 197041 (0.0032) [2024-04-26 14:30:32,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3228418048. Throughput: 0: 50917.1. Samples: 981337060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:32,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 14:30:33,006][49750] Updated weights for policy 0, policy_version 197051 (0.0030) [2024-04-26 14:30:35,727][49750] Updated weights for policy 0, policy_version 197061 (0.0030) [2024-04-26 14:30:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3228696576. Throughput: 0: 50882.3. Samples: 981488220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:37,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 14:30:39,237][49728] Signal inference workers to stop experience collection... (14650 times) [2024-04-26 14:30:39,238][49728] Signal inference workers to resume experience collection... 
(14650 times) [2024-04-26 14:30:39,267][49750] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-04-26 14:30:39,267][49750] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-04-26 14:30:39,368][49750] Updated weights for policy 0, policy_version 197071 (0.0038) [2024-04-26 14:30:42,062][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3228958720. Throughput: 0: 50959.6. Samples: 981795340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:30:42,146][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000197081_3228975104.pth... [2024-04-26 14:30:42,152][49750] Updated weights for policy 0, policy_version 197081 (0.0026) [2024-04-26 14:30:42,192][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000196335_3216752640.pth [2024-04-26 14:30:45,868][49750] Updated weights for policy 0, policy_version 197091 (0.0035) [2024-04-26 14:30:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.6, 300 sec: 50651.6). Total num frames: 3229171712. Throughput: 0: 50733.1. Samples: 982092300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 14:30:47,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 14:30:49,004][49750] Updated weights for policy 0, policy_version 197101 (0.0027) [2024-04-26 14:30:52,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3229433856. Throughput: 0: 50782.5. Samples: 982245240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:30:52,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 14:30:52,327][49750] Updated weights for policy 0, policy_version 197111 (0.0030) [2024-04-26 14:30:55,524][49750] Updated weights for policy 0, policy_version 197121 (0.0026) [2024-04-26 14:30:57,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50244.2, 300 sec: 50762.6). 
Total num frames: 3229696000. Throughput: 0: 50848.3. Samples: 982551980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:30:57,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 14:30:58,779][49750] Updated weights for policy 0, policy_version 197131 (0.0034) [2024-04-26 14:31:02,002][49750] Updated weights for policy 0, policy_version 197141 (0.0033) [2024-04-26 14:31:02,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3229958144. Throughput: 0: 51016.7. Samples: 982856620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:02,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 14:31:05,236][49750] Updated weights for policy 0, policy_version 197151 (0.0032) [2024-04-26 14:31:07,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3230220288. Throughput: 0: 50881.0. Samples: 983014840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:07,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 14:31:08,342][49750] Updated weights for policy 0, policy_version 197161 (0.0030) [2024-04-26 14:31:11,635][49750] Updated weights for policy 0, policy_version 197171 (0.0036) [2024-04-26 14:31:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3230466048. Throughput: 0: 50840.5. Samples: 983316460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:12,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 14:31:14,651][49750] Updated weights for policy 0, policy_version 197181 (0.0035) [2024-04-26 14:31:17,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 3230695424. Throughput: 0: 50771.1. Samples: 983621760. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:17,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:31:18,108][49750] Updated weights for policy 0, policy_version 197191 (0.0035) [2024-04-26 14:31:21,113][49750] Updated weights for policy 0, policy_version 197201 (0.0030) [2024-04-26 14:31:22,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3230957568. Throughput: 0: 50618.9. Samples: 983766080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 14:31:24,722][49750] Updated weights for policy 0, policy_version 197211 (0.0037) [2024-04-26 14:31:27,063][49517] Fps is (10 sec: 54066.1, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3231236096. Throughput: 0: 50465.7. Samples: 984066300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 14:31:27,567][49750] Updated weights for policy 0, policy_version 197221 (0.0041) [2024-04-26 14:31:31,237][49750] Updated weights for policy 0, policy_version 197231 (0.0033) [2024-04-26 14:31:32,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3231481856. Throughput: 0: 50805.7. Samples: 984378560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:32,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:31:33,947][49750] Updated weights for policy 0, policy_version 197241 (0.0038) [2024-04-26 14:31:37,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3231711232. Throughput: 0: 50621.6. Samples: 984523200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:37,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:31:37,632][49750] Updated weights for policy 0, policy_version 197251 (0.0030) [2024-04-26 14:31:40,268][49750] Updated weights for policy 0, policy_version 197261 (0.0029) [2024-04-26 14:31:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 50818.1). Total num frames: 3231973376. Throughput: 0: 50506.1. Samples: 984824760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 14:31:44,135][49750] Updated weights for policy 0, policy_version 197271 (0.0030) [2024-04-26 14:31:46,141][49728] Signal inference workers to stop experience collection... (14700 times) [2024-04-26 14:31:46,190][49750] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-04-26 14:31:46,254][49728] Signal inference workers to resume experience collection... (14700 times) [2024-04-26 14:31:46,254][49750] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-04-26 14:31:46,690][49750] Updated weights for policy 0, policy_version 197281 (0.0031) [2024-04-26 14:31:47,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3232251904. Throughput: 0: 50533.8. Samples: 985130640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:47,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 14:31:50,667][49750] Updated weights for policy 0, policy_version 197291 (0.0037) [2024-04-26 14:31:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50651.5). Total num frames: 3232481280. Throughput: 0: 50653.2. Samples: 985294240. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 14:31:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 14:31:53,467][49750] Updated weights for policy 0, policy_version 197301 (0.0027) [2024-04-26 14:31:57,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3232727040. Throughput: 0: 50651.1. Samples: 985595760. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:31:57,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 14:31:57,099][49750] Updated weights for policy 0, policy_version 197311 (0.0028) [2024-04-26 14:32:00,143][49750] Updated weights for policy 0, policy_version 197321 (0.0034) [2024-04-26 14:32:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3232989184. Throughput: 0: 50511.5. Samples: 985894780. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:32:03,570][49750] Updated weights for policy 0, policy_version 197331 (0.0029) [2024-04-26 14:32:06,508][49750] Updated weights for policy 0, policy_version 197341 (0.0037) [2024-04-26 14:32:07,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.1, 300 sec: 50818.2). Total num frames: 3233251328. Throughput: 0: 50595.2. Samples: 986042860. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 14:32:09,986][49750] Updated weights for policy 0, policy_version 197351 (0.0033) [2024-04-26 14:32:12,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3233529856. Throughput: 0: 50674.8. Samples: 986346660. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:12,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:32:13,074][49750] Updated weights for policy 0, policy_version 197361 (0.0034) [2024-04-26 14:32:16,369][49750] Updated weights for policy 0, policy_version 197371 (0.0037) [2024-04-26 14:32:17,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3233759232. Throughput: 0: 50525.8. Samples: 986652220. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:17,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 14:32:19,532][49750] Updated weights for policy 0, policy_version 197381 (0.0027) [2024-04-26 14:32:22,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3233988608. Throughput: 0: 50756.8. Samples: 986807260. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:22,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 14:32:22,775][49750] Updated weights for policy 0, policy_version 197391 (0.0027) [2024-04-26 14:32:25,800][49750] Updated weights for policy 0, policy_version 197401 (0.0025) [2024-04-26 14:32:27,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 3234250752. Throughput: 0: 50671.7. Samples: 987104980. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:27,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:32:29,271][49750] Updated weights for policy 0, policy_version 197411 (0.0023) [2024-04-26 14:32:32,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3234529280. Throughput: 0: 50609.3. Samples: 987408060. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:32,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 14:32:32,164][49750] Updated weights for policy 0, policy_version 197421 (0.0030) [2024-04-26 14:32:35,891][49750] Updated weights for policy 0, policy_version 197431 (0.0028) [2024-04-26 14:32:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3234775040. Throughput: 0: 50673.9. Samples: 987574560. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 14:32:38,627][49750] Updated weights for policy 0, policy_version 197441 (0.0029) [2024-04-26 14:32:42,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3235020800. Throughput: 0: 50650.7. Samples: 987875040. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:32:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000197450_3235020800.pth... [2024-04-26 14:32:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000196707_3222847488.pth [2024-04-26 14:32:42,446][49750] Updated weights for policy 0, policy_version 197451 (0.0031) [2024-04-26 14:32:45,120][49750] Updated weights for policy 0, policy_version 197461 (0.0024) [2024-04-26 14:32:47,063][49517] Fps is (10 sec: 47512.6, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 3235250176. Throughput: 0: 50595.9. Samples: 988171600. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:47,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 14:32:48,926][49750] Updated weights for policy 0, policy_version 197471 (0.0031) [2024-04-26 14:32:50,234][49728] Signal inference workers to stop experience collection... (14750 times) [2024-04-26 14:32:50,235][49728] Signal inference workers to resume experience collection... 
(14750 times) [2024-04-26 14:32:50,247][49750] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-04-26 14:32:50,247][49750] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-04-26 14:32:51,586][49750] Updated weights for policy 0, policy_version 197481 (0.0033) [2024-04-26 14:32:52,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3235545088. Throughput: 0: 50812.6. Samples: 988329420. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 14:32:55,278][49750] Updated weights for policy 0, policy_version 197491 (0.0033) [2024-04-26 14:32:57,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3235774464. Throughput: 0: 50797.9. Samples: 988632560. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:32:57,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:32:57,901][49750] Updated weights for policy 0, policy_version 197501 (0.0033) [2024-04-26 14:33:01,572][49750] Updated weights for policy 0, policy_version 197511 (0.0035) [2024-04-26 14:33:02,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 3236020224. Throughput: 0: 50776.2. Samples: 988937160. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-26 14:33:02,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 14:33:04,540][49750] Updated weights for policy 0, policy_version 197521 (0.0029) [2024-04-26 14:33:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3236282368. Throughput: 0: 50581.6. Samples: 989083440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:07,064][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 14:33:08,158][49750] Updated weights for policy 0, policy_version 197531 (0.0034) [2024-04-26 14:33:10,877][49750] Updated weights for policy 0, policy_version 197541 (0.0029) [2024-04-26 14:33:12,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3236544512. Throughput: 0: 50701.3. Samples: 989386540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:12,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 14:33:14,621][49750] Updated weights for policy 0, policy_version 197551 (0.0031) [2024-04-26 14:33:17,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3236823040. Throughput: 0: 50760.0. Samples: 989692260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:17,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 14:33:17,276][49750] Updated weights for policy 0, policy_version 197561 (0.0035) [2024-04-26 14:33:20,934][49750] Updated weights for policy 0, policy_version 197571 (0.0029) [2024-04-26 14:33:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 3237052416. Throughput: 0: 50639.9. Samples: 989853360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:22,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 14:33:23,677][49750] Updated weights for policy 0, policy_version 197581 (0.0034) [2024-04-26 14:33:27,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3237298176. Throughput: 0: 50731.1. Samples: 990157940. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:27,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 14:33:27,510][49750] Updated weights for policy 0, policy_version 197591 (0.0031) [2024-04-26 14:33:29,961][49750] Updated weights for policy 0, policy_version 197601 (0.0030) [2024-04-26 14:33:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3237543936. Throughput: 0: 50963.3. Samples: 990464940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:32,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 14:33:33,842][49750] Updated weights for policy 0, policy_version 197611 (0.0030) [2024-04-26 14:33:36,368][49750] Updated weights for policy 0, policy_version 197621 (0.0029) [2024-04-26 14:33:37,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 3237822464. Throughput: 0: 50782.1. Samples: 990614620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:37,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 14:33:40,467][49750] Updated weights for policy 0, policy_version 197631 (0.0031) [2024-04-26 14:33:42,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3238068224. Throughput: 0: 50822.1. Samples: 990919560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:33:42,793][49750] Updated weights for policy 0, policy_version 197641 (0.0034) [2024-04-26 14:33:46,942][49750] Updated weights for policy 0, policy_version 197651 (0.0030) [2024-04-26 14:33:47,063][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.5, 300 sec: 50651.5). Total num frames: 3238313984. Throughput: 0: 50842.8. Samples: 991225080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:47,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 14:33:49,081][49728] Signal inference workers to stop experience collection... (14800 times) [2024-04-26 14:33:49,081][49728] Signal inference workers to resume experience collection... (14800 times) [2024-04-26 14:33:49,110][49750] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-04-26 14:33:49,110][49750] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-04-26 14:33:49,374][49750] Updated weights for policy 0, policy_version 197661 (0.0035) [2024-04-26 14:33:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3238576128. Throughput: 0: 50664.0. Samples: 991363320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:52,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 14:33:53,284][49750] Updated weights for policy 0, policy_version 197671 (0.0029) [2024-04-26 14:33:55,842][49750] Updated weights for policy 0, policy_version 197681 (0.0033) [2024-04-26 14:33:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3238821888. Throughput: 0: 50777.8. Samples: 991671540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:33:57,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 14:33:59,661][49750] Updated weights for policy 0, policy_version 197691 (0.0031) [2024-04-26 14:34:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.7, 300 sec: 50762.7). Total num frames: 3239100416. Throughput: 0: 50842.2. Samples: 991980160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:34:02,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 14:34:02,487][49750] Updated weights for policy 0, policy_version 197701 (0.0034) [2024-04-26 14:34:06,042][49750] Updated weights for policy 0, policy_version 197711 (0.0031) [2024-04-26 14:34:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3239329792. Throughput: 0: 50637.4. Samples: 992132040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 14:34:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 14:34:09,513][49750] Updated weights for policy 0, policy_version 197721 (0.0030) [2024-04-26 14:34:12,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3239575552. Throughput: 0: 50752.9. Samples: 992441820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 14:34:12,440][49750] Updated weights for policy 0, policy_version 197731 (0.0027) [2024-04-26 14:34:16,033][49750] Updated weights for policy 0, policy_version 197741 (0.0032) [2024-04-26 14:34:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3239854080. Throughput: 0: 50834.8. Samples: 992752500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:17,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 14:34:18,848][49750] Updated weights for policy 0, policy_version 197751 (0.0031) [2024-04-26 14:34:22,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3240099840. Throughput: 0: 50911.5. Samples: 992905640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:22,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 14:34:22,352][49750] Updated weights for policy 0, policy_version 197761 (0.0031) [2024-04-26 14:34:25,399][49750] Updated weights for policy 0, policy_version 197771 (0.0030) [2024-04-26 14:34:27,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3240378368. Throughput: 0: 50770.2. Samples: 993204220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:27,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 14:34:28,707][49750] Updated weights for policy 0, policy_version 197781 (0.0031) [2024-04-26 14:34:32,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3240591360. Throughput: 0: 50822.8. Samples: 993512100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:32,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 14:34:32,102][49750] Updated weights for policy 0, policy_version 197791 (0.0030) [2024-04-26 14:34:35,189][49750] Updated weights for policy 0, policy_version 197801 (0.0029) [2024-04-26 14:34:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3240869888. Throughput: 0: 51068.9. Samples: 993661420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:37,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 14:34:38,488][49750] Updated weights for policy 0, policy_version 197811 (0.0039) [2024-04-26 14:34:41,673][49750] Updated weights for policy 0, policy_version 197821 (0.0043) [2024-04-26 14:34:42,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3241132032. Throughput: 0: 50914.2. Samples: 993962680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 14:34:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000197823_3241132032.pth... [2024-04-26 14:34:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000197081_3228975104.pth [2024-04-26 14:34:45,022][49750] Updated weights for policy 0, policy_version 197831 (0.0036) [2024-04-26 14:34:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3241377792. Throughput: 0: 50769.7. Samples: 994264800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 14:34:47,989][49750] Updated weights for policy 0, policy_version 197841 (0.0036) [2024-04-26 14:34:51,480][49750] Updated weights for policy 0, policy_version 197851 (0.0033) [2024-04-26 14:34:52,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3241607168. Throughput: 0: 50867.6. Samples: 994421080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:52,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 14:34:54,413][49750] Updated weights for policy 0, policy_version 197861 (0.0030) [2024-04-26 14:34:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3241869312. Throughput: 0: 50715.5. Samples: 994724020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:34:57,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:34:57,988][49728] Signal inference workers to stop experience collection... (14850 times) [2024-04-26 14:34:58,031][49750] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-04-26 14:34:58,093][49728] Signal inference workers to resume experience collection... 
(14850 times) [2024-04-26 14:34:58,093][49750] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-04-26 14:34:58,095][49750] Updated weights for policy 0, policy_version 197871 (0.0027) [2024-04-26 14:35:00,894][49750] Updated weights for policy 0, policy_version 197881 (0.0026) [2024-04-26 14:35:02,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3242131456. Throughput: 0: 50714.1. Samples: 995034640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:35:02,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 14:35:04,364][49750] Updated weights for policy 0, policy_version 197891 (0.0034) [2024-04-26 14:35:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3242377216. Throughput: 0: 50783.2. Samples: 995190880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:35:07,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 14:35:07,334][49750] Updated weights for policy 0, policy_version 197901 (0.0041) [2024-04-26 14:35:10,759][49750] Updated weights for policy 0, policy_version 197911 (0.0035) [2024-04-26 14:35:12,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 3242672128. Throughput: 0: 50997.3. Samples: 995499100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 14:35:12,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 14:35:13,662][49750] Updated weights for policy 0, policy_version 197921 (0.0029) [2024-04-26 14:35:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3242885120. Throughput: 0: 50937.8. Samples: 995804300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:35:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:35:17,076][49750] Updated weights for policy 0, policy_version 197931 (0.0032) [2024-04-26 14:35:20,136][49750] Updated weights for policy 0, policy_version 197941 (0.0027) [2024-04-26 14:35:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3243163648. Throughput: 0: 51033.8. Samples: 995957940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:35:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 14:35:23,552][49750] Updated weights for policy 0, policy_version 197951 (0.0029) [2024-04-26 14:35:26,546][49750] Updated weights for policy 0, policy_version 197961 (0.0033) [2024-04-26 14:35:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3243393024. Throughput: 0: 50942.7. Samples: 996255100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:35:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 14:35:30,093][49750] Updated weights for policy 0, policy_version 197971 (0.0033) [2024-04-26 14:35:32,063][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3243655168. Throughput: 0: 50850.2. Samples: 996553060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:35:32,067][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 14:35:32,982][49750] Updated weights for policy 0, policy_version 197981 (0.0034) [2024-04-26 14:35:36,526][49750] Updated weights for policy 0, policy_version 197991 (0.0031) [2024-04-26 14:35:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3243917312. Throughput: 0: 50972.8. Samples: 996714860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:35:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 14:35:37,130][49728] Signal inference workers to stop experience collection... 
(14900 times) [2024-04-26 14:35:37,168][49750] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-04-26 14:35:37,202][49728] Signal inference workers to resume experience collection... (14900 times) [2024-04-26 14:35:37,203][49750] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-04-26 14:35:39,570][49750] Updated weights for policy 0, policy_version 198001 (0.0028) [2024-04-26 14:35:42,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3244146688. Throughput: 0: 50921.4. Samples: 997015480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:35:42,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 14:35:42,909][49750] Updated weights for policy 0, policy_version 198011 (0.0033) [2024-04-26 14:35:45,894][49750] Updated weights for policy 0, policy_version 198021 (0.0026) [2024-04-26 14:35:47,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3244425216. Throughput: 0: 50843.2. Samples: 997322580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:35:47,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 14:35:49,517][49750] Updated weights for policy 0, policy_version 198031 (0.0036) [2024-04-26 14:35:52,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3244670976. Throughput: 0: 50839.1. Samples: 997478640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:35:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 14:35:52,359][49750] Updated weights for policy 0, policy_version 198041 (0.0032) [2024-04-26 14:35:55,844][49750] Updated weights for policy 0, policy_version 198051 (0.0031) [2024-04-26 14:35:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3244949504. Throughput: 0: 50729.5. Samples: 997781920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:35:57,063][49517] Avg episode reward: [(0, '0.446')] [2024-04-26 14:35:59,257][49750] Updated weights for policy 0, policy_version 198061 (0.0034) [2024-04-26 14:36:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3245178880. Throughput: 0: 50700.1. Samples: 998085800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:36:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 14:36:02,188][49750] Updated weights for policy 0, policy_version 198071 (0.0027) [2024-04-26 14:36:05,698][49750] Updated weights for policy 0, policy_version 198081 (0.0035) [2024-04-26 14:36:07,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3245441024. Throughput: 0: 50668.4. Samples: 998238020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:36:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:36:08,684][49750] Updated weights for policy 0, policy_version 198091 (0.0032) [2024-04-26 14:36:12,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49698.3, 300 sec: 50707.1). Total num frames: 3245654016. Throughput: 0: 50747.7. Samples: 998538740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:36:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 14:36:12,232][49750] Updated weights for policy 0, policy_version 198101 (0.0026) [2024-04-26 14:36:15,288][49750] Updated weights for policy 0, policy_version 198111 (0.0034) [2024-04-26 14:36:17,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3245948928. Throughput: 0: 50808.8. Samples: 998839460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:36:17,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 14:36:18,552][49750] Updated weights for policy 0, policy_version 198121 (0.0034) [2024-04-26 14:36:21,540][49750] Updated weights for policy 0, policy_version 198131 (0.0030) [2024-04-26 14:36:22,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3246194688. Throughput: 0: 50641.9. Samples: 998993740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 14:36:22,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 14:36:25,015][49750] Updated weights for policy 0, policy_version 198141 (0.0031) [2024-04-26 14:36:27,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3246440448. Throughput: 0: 50753.8. Samples: 999299400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:36:27,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 14:36:27,883][49750] Updated weights for policy 0, policy_version 198151 (0.0038) [2024-04-26 14:36:31,368][49750] Updated weights for policy 0, policy_version 198161 (0.0029) [2024-04-26 14:36:32,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3246718976. Throughput: 0: 50817.8. Samples: 999609380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:36:32,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 14:36:34,463][49750] Updated weights for policy 0, policy_version 198171 (0.0032) [2024-04-26 14:36:37,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3246948352. Throughput: 0: 50678.2. Samples: 999759160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:36:37,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 14:36:37,771][49750] Updated weights for policy 0, policy_version 198181 (0.0035) [2024-04-26 14:36:38,852][49728] Signal inference workers to stop experience collection... 
(14950 times) [2024-04-26 14:36:38,853][49728] Signal inference workers to resume experience collection... (14950 times) [2024-04-26 14:36:38,872][49750] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-04-26 14:36:38,872][49750] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-04-26 14:36:40,979][49750] Updated weights for policy 0, policy_version 198191 (0.0029) [2024-04-26 14:36:42,063][49517] Fps is (10 sec: 49151.0, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3247210496. Throughput: 0: 50796.7. Samples: 1000067780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:36:42,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 14:36:42,186][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000198195_3247226880.pth... [2024-04-26 14:36:42,243][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000197450_3235020800.pth [2024-04-26 14:36:44,226][49750] Updated weights for policy 0, policy_version 198201 (0.0034) [2024-04-26 14:36:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3247456256. Throughput: 0: 50734.4. Samples: 1000368860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:36:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 14:36:47,374][49750] Updated weights for policy 0, policy_version 198211 (0.0028) [2024-04-26 14:36:50,790][49750] Updated weights for policy 0, policy_version 198221 (0.0035) [2024-04-26 14:36:52,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3247734784. Throughput: 0: 50691.8. Samples: 1000519160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:36:52,063][49517] Avg episode reward: [(0, '0.441')] [2024-04-26 14:36:53,689][49750] Updated weights for policy 0, policy_version 198231 (0.0034) [2024-04-26 14:36:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3247947776. Throughput: 0: 50711.5. Samples: 1000820760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:36:57,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 14:36:57,198][49750] Updated weights for policy 0, policy_version 198241 (0.0030) [2024-04-26 14:37:00,149][49750] Updated weights for policy 0, policy_version 198251 (0.0035) [2024-04-26 14:37:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.3, 300 sec: 50762.7). Total num frames: 3248226304. Throughput: 0: 50761.5. Samples: 1001123720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:37:02,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 14:37:03,615][49750] Updated weights for policy 0, policy_version 198261 (0.0027) [2024-04-26 14:37:06,615][49750] Updated weights for policy 0, policy_version 198271 (0.0030) [2024-04-26 14:37:07,063][49517] Fps is (10 sec: 54065.9, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3248488448. Throughput: 0: 50772.7. Samples: 1001278520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:37:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 14:37:10,049][49750] Updated weights for policy 0, policy_version 198281 (0.0039) [2024-04-26 14:37:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3248734208. Throughput: 0: 50736.9. Samples: 1001582560. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:37:12,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 14:37:12,951][49750] Updated weights for policy 0, policy_version 198291 (0.0035) [2024-04-26 14:37:16,461][49750] Updated weights for policy 0, policy_version 198301 (0.0029) [2024-04-26 14:37:17,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3248979968. Throughput: 0: 50669.2. Samples: 1001889500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:37:17,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 14:37:19,572][49750] Updated weights for policy 0, policy_version 198311 (0.0031) [2024-04-26 14:37:22,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3249225728. Throughput: 0: 50710.1. Samples: 1002041120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:37:22,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 14:37:22,960][49750] Updated weights for policy 0, policy_version 198321 (0.0036) [2024-04-26 14:37:26,102][49750] Updated weights for policy 0, policy_version 198331 (0.0036) [2024-04-26 14:37:27,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 3249520640. Throughput: 0: 50498.7. Samples: 1002340220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:37:27,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 14:37:29,344][49750] Updated weights for policy 0, policy_version 198341 (0.0029) [2024-04-26 14:37:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3249750016. Throughput: 0: 50761.8. Samples: 1002653140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:37:32,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 14:37:32,394][49750] Updated weights for policy 0, policy_version 198351 (0.0029) [2024-04-26 14:37:35,706][49750] Updated weights for policy 0, policy_version 198361 (0.0034) [2024-04-26 14:37:37,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3249995776. Throughput: 0: 50816.7. Samples: 1002805900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:37:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 14:37:38,854][49750] Updated weights for policy 0, policy_version 198371 (0.0027) [2024-04-26 14:37:41,935][49728] Signal inference workers to stop experience collection... (15000 times) [2024-04-26 14:37:41,975][49750] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-04-26 14:37:42,011][49728] Signal inference workers to resume experience collection... (15000 times) [2024-04-26 14:37:42,011][49750] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-04-26 14:37:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3250257920. Throughput: 0: 50904.6. Samples: 1003111480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:37:42,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 14:37:42,138][49750] Updated weights for policy 0, policy_version 198381 (0.0031) [2024-04-26 14:37:45,327][49750] Updated weights for policy 0, policy_version 198391 (0.0028) [2024-04-26 14:37:47,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3250520064. Throughput: 0: 50830.7. Samples: 1003411100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:37:47,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 14:37:48,572][49750] Updated weights for policy 0, policy_version 198401 (0.0031) [2024-04-26 14:37:51,780][49750] Updated weights for policy 0, policy_version 198411 (0.0028) [2024-04-26 14:37:52,062][49517] Fps is (10 sec: 50791.9, 60 sec: 50517.6, 300 sec: 50818.2). Total num frames: 3250765824. Throughput: 0: 50671.0. Samples: 1003558700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:37:52,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 14:37:54,975][49750] Updated weights for policy 0, policy_version 198421 (0.0029) [2024-04-26 14:37:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3251011584. Throughput: 0: 50800.8. Samples: 1003868600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:37:57,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:37:58,201][49750] Updated weights for policy 0, policy_version 198431 (0.0031) [2024-04-26 14:38:01,343][49750] Updated weights for policy 0, policy_version 198441 (0.0028) [2024-04-26 14:38:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3251273728. Throughput: 0: 50740.5. Samples: 1004172820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:38:02,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 14:38:04,548][49750] Updated weights for policy 0, policy_version 198451 (0.0040) [2024-04-26 14:38:07,064][49517] Fps is (10 sec: 50784.1, 60 sec: 50516.4, 300 sec: 50762.4). Total num frames: 3251519488. Throughput: 0: 50825.8. Samples: 1004328340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:38:07,064][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:38:07,789][49750] Updated weights for policy 0, policy_version 198461 (0.0031) [2024-04-26 14:38:10,861][49750] Updated weights for policy 0, policy_version 198471 (0.0041) [2024-04-26 14:38:12,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3251798016. Throughput: 0: 50929.4. Samples: 1004632040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:38:12,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 14:38:14,305][49750] Updated weights for policy 0, policy_version 198481 (0.0029) [2024-04-26 14:38:17,062][49517] Fps is (10 sec: 50797.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3252027392. Throughput: 0: 50899.2. Samples: 1004943600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:38:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 14:38:17,456][49750] Updated weights for policy 0, policy_version 198491 (0.0030) [2024-04-26 14:38:20,583][49750] Updated weights for policy 0, policy_version 198501 (0.0041) [2024-04-26 14:38:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3252289536. Throughput: 0: 50784.8. Samples: 1005091220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:38:22,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 14:38:23,916][49750] Updated weights for policy 0, policy_version 198511 (0.0026) [2024-04-26 14:38:27,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 3252568064. Throughput: 0: 50838.0. Samples: 1005399180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:38:27,063][49750] Updated weights for policy 0, policy_version 198521 (0.0033) [2024-04-26 14:38:27,071][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 14:38:30,391][49750] Updated weights for policy 0, policy_version 198531 (0.0032) [2024-04-26 14:38:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3252781056. Throughput: 0: 50758.7. Samples: 1005695240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:38:32,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 14:38:33,446][49750] Updated weights for policy 0, policy_version 198541 (0.0031) [2024-04-26 14:38:36,686][49750] Updated weights for policy 0, policy_version 198551 (0.0031) [2024-04-26 14:38:37,063][49517] Fps is (10 sec: 50787.5, 60 sec: 51336.0, 300 sec: 50873.6). Total num frames: 3253075968. Throughput: 0: 51037.9. Samples: 1005855440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 14:38:37,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 14:38:40,002][49750] Updated weights for policy 0, policy_version 198561 (0.0037) [2024-04-26 14:38:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3253288960. Throughput: 0: 50822.7. Samples: 1006155620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:38:42,063][49517] Avg episode reward: [(0, '0.451')] [2024-04-26 14:38:42,081][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000198566_3253305344.pth... [2024-04-26 14:38:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000197823_3241132032.pth [2024-04-26 14:38:43,210][49750] Updated weights for policy 0, policy_version 198571 (0.0029) [2024-04-26 14:38:46,686][49750] Updated weights for policy 0, policy_version 198581 (0.0030) [2024-04-26 14:38:47,063][49517] Fps is (10 sec: 49154.1, 60 sec: 50790.3, 300 sec: 50818.2). 
Total num frames: 3253567488. Throughput: 0: 50820.8. Samples: 1006459760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:38:47,072][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:38:49,695][49750] Updated weights for policy 0, policy_version 198591 (0.0034) [2024-04-26 14:38:52,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3253813248. Throughput: 0: 50825.4. Samples: 1006615420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:38:52,071][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 14:38:52,967][49750] Updated weights for policy 0, policy_version 198601 (0.0039) [2024-04-26 14:38:55,601][49728] Signal inference workers to stop experience collection... (15050 times) [2024-04-26 14:38:55,601][49728] Signal inference workers to resume experience collection... (15050 times) [2024-04-26 14:38:55,632][49750] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-04-26 14:38:55,633][49750] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-04-26 14:38:56,217][49750] Updated weights for policy 0, policy_version 198611 (0.0038) [2024-04-26 14:38:57,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3254075392. Throughput: 0: 50905.9. Samples: 1006922800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:38:57,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:38:59,349][49750] Updated weights for policy 0, policy_version 198621 (0.0033) [2024-04-26 14:39:02,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3254337536. Throughput: 0: 50753.3. Samples: 1007227500. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:39:02,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 14:39:02,729][49750] Updated weights for policy 0, policy_version 198631 (0.0035) [2024-04-26 14:39:05,867][49750] Updated weights for policy 0, policy_version 198641 (0.0034) [2024-04-26 14:39:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50791.5, 300 sec: 50818.2). Total num frames: 3254566912. Throughput: 0: 50756.4. Samples: 1007375260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:39:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 14:39:09,028][49750] Updated weights for policy 0, policy_version 198651 (0.0033) [2024-04-26 14:39:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3254829056. Throughput: 0: 50727.1. Samples: 1007681900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:39:12,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 14:39:12,494][49750] Updated weights for policy 0, policy_version 198661 (0.0034) [2024-04-26 14:39:15,453][49750] Updated weights for policy 0, policy_version 198671 (0.0029) [2024-04-26 14:39:17,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3255091200. Throughput: 0: 50874.5. Samples: 1007984600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:39:17,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 14:39:19,063][49750] Updated weights for policy 0, policy_version 198681 (0.0030) [2024-04-26 14:39:21,962][49750] Updated weights for policy 0, policy_version 198691 (0.0026) [2024-04-26 14:39:22,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3255353344. Throughput: 0: 50747.2. Samples: 1008139040. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:39:22,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 14:39:25,590][49750] Updated weights for policy 0, policy_version 198701 (0.0032) [2024-04-26 14:39:27,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3255582720. Throughput: 0: 50902.7. Samples: 1008446240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:39:27,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 14:39:28,509][49750] Updated weights for policy 0, policy_version 198711 (0.0032) [2024-04-26 14:39:31,918][49750] Updated weights for policy 0, policy_version 198721 (0.0039) [2024-04-26 14:39:32,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3255844864. Throughput: 0: 50831.5. Samples: 1008747180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:39:32,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 14:39:34,860][49750] Updated weights for policy 0, policy_version 198731 (0.0026) [2024-04-26 14:39:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.7, 300 sec: 50762.6). Total num frames: 3256107008. Throughput: 0: 50667.5. Samples: 1008895460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:39:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 14:39:38,299][49750] Updated weights for policy 0, policy_version 198741 (0.0031) [2024-04-26 14:39:41,242][49750] Updated weights for policy 0, policy_version 198751 (0.0029) [2024-04-26 14:39:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3256352768. Throughput: 0: 50679.9. Samples: 1009203400. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 14:39:42,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 14:39:44,789][49750] Updated weights for policy 0, policy_version 198761 (0.0030) [2024-04-26 14:39:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3256614912. Throughput: 0: 50654.6. Samples: 1009506960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:39:47,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 14:39:47,847][49750] Updated weights for policy 0, policy_version 198771 (0.0033) [2024-04-26 14:39:51,420][49750] Updated weights for policy 0, policy_version 198781 (0.0027) [2024-04-26 14:39:52,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3256844288. Throughput: 0: 50772.5. Samples: 1009660020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:39:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:39:54,361][49750] Updated weights for policy 0, policy_version 198791 (0.0032) [2024-04-26 14:39:57,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3257122816. Throughput: 0: 50556.0. Samples: 1009956920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:39:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 14:39:57,895][49750] Updated weights for policy 0, policy_version 198801 (0.0029) [2024-04-26 14:40:00,720][49750] Updated weights for policy 0, policy_version 198811 (0.0028) [2024-04-26 14:40:02,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3257368576. Throughput: 0: 50696.9. Samples: 1010265960. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:02,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 14:40:04,423][49750] Updated weights for policy 0, policy_version 198821 (0.0030) [2024-04-26 14:40:05,454][49728] Signal inference workers to stop experience collection... (15100 times) [2024-04-26 14:40:05,488][49750] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-04-26 14:40:05,519][49728] Signal inference workers to resume experience collection... (15100 times) [2024-04-26 14:40:05,519][49750] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-04-26 14:40:07,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3257630720. Throughput: 0: 50688.3. Samples: 1010420020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:07,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 14:40:07,227][49750] Updated weights for policy 0, policy_version 198831 (0.0033) [2024-04-26 14:40:10,749][49750] Updated weights for policy 0, policy_version 198841 (0.0033) [2024-04-26 14:40:12,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3257876480. Throughput: 0: 50677.6. Samples: 1010726740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:12,063][49517] Avg episode reward: [(0, '0.463')] [2024-04-26 14:40:13,782][49750] Updated weights for policy 0, policy_version 198851 (0.0030) [2024-04-26 14:40:17,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 3258105856. Throughput: 0: 50760.2. Samples: 1011031380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:17,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 14:40:17,319][49750] Updated weights for policy 0, policy_version 198861 (0.0030) [2024-04-26 14:40:20,054][49750] Updated weights for policy 0, policy_version 198871 (0.0026) [2024-04-26 14:40:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3258368000. Throughput: 0: 50756.4. Samples: 1011179500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 14:40:23,672][49750] Updated weights for policy 0, policy_version 198881 (0.0030) [2024-04-26 14:40:26,413][49750] Updated weights for policy 0, policy_version 198891 (0.0037) [2024-04-26 14:40:27,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3258646528. Throughput: 0: 50742.9. Samples: 1011486840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:27,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 14:40:30,078][49750] Updated weights for policy 0, policy_version 198901 (0.0029) [2024-04-26 14:40:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3258892288. Throughput: 0: 50768.5. Samples: 1011791540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:32,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 14:40:32,900][49750] Updated weights for policy 0, policy_version 198911 (0.0031) [2024-04-26 14:40:36,460][49750] Updated weights for policy 0, policy_version 198921 (0.0032) [2024-04-26 14:40:37,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3259121664. Throughput: 0: 50741.7. Samples: 1011943400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:37,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 14:40:39,375][49750] Updated weights for policy 0, policy_version 198931 (0.0035) [2024-04-26 14:40:42,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3259383808. Throughput: 0: 50940.2. Samples: 1012249240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:42,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 14:40:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000198937_3259383808.pth... [2024-04-26 14:40:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000198195_3247226880.pth [2024-04-26 14:40:42,916][49750] Updated weights for policy 0, policy_version 198941 (0.0031) [2024-04-26 14:40:45,715][49750] Updated weights for policy 0, policy_version 198951 (0.0032) [2024-04-26 14:40:47,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3259629568. Throughput: 0: 50641.6. Samples: 1012544820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:40:47,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 14:40:49,849][49750] Updated weights for policy 0, policy_version 198961 (0.0028) [2024-04-26 14:40:52,062][49517] Fps is (10 sec: 54068.4, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3259924480. Throughput: 0: 50737.1. Samples: 1012703180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:40:52,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 14:40:52,115][49750] Updated weights for policy 0, policy_version 198971 (0.0030) [2024-04-26 14:40:56,269][49750] Updated weights for policy 0, policy_version 198981 (0.0028) [2024-04-26 14:40:57,062][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3260153856. Throughput: 0: 50638.7. Samples: 1013005480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:40:57,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 14:40:58,663][49750] Updated weights for policy 0, policy_version 198991 (0.0030) [2024-04-26 14:41:02,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3260399616. Throughput: 0: 50652.7. Samples: 1013310760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:02,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 14:41:02,504][49750] Updated weights for policy 0, policy_version 199001 (0.0030) [2024-04-26 14:41:03,144][49728] Signal inference workers to stop experience collection... (15150 times) [2024-04-26 14:41:03,186][49750] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-04-26 14:41:03,204][49728] Signal inference workers to resume experience collection... (15150 times) [2024-04-26 14:41:03,206][49750] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-04-26 14:41:04,982][49750] Updated weights for policy 0, policy_version 199011 (0.0026) [2024-04-26 14:41:07,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.4, 300 sec: 50762.6). Total num frames: 3260628992. Throughput: 0: 50593.4. Samples: 1013456200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:07,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 14:41:09,016][49750] Updated weights for policy 0, policy_version 199021 (0.0033) [2024-04-26 14:41:11,441][49750] Updated weights for policy 0, policy_version 199031 (0.0027) [2024-04-26 14:41:12,063][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3260940288. Throughput: 0: 50581.0. Samples: 1013762980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:12,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 14:41:15,619][49750] Updated weights for policy 0, policy_version 199041 (0.0030) [2024-04-26 14:41:17,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3261169664. Throughput: 0: 50583.2. Samples: 1014067780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:17,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 14:41:17,783][49750] Updated weights for policy 0, policy_version 199051 (0.0026) [2024-04-26 14:41:21,862][49750] Updated weights for policy 0, policy_version 199061 (0.0028) [2024-04-26 14:41:22,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3261415424. Throughput: 0: 50747.9. Samples: 1014227060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 14:41:24,227][49750] Updated weights for policy 0, policy_version 199071 (0.0035) [2024-04-26 14:41:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.5, 300 sec: 50651.6). Total num frames: 3261661184. Throughput: 0: 50621.6. Samples: 1014527200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:27,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 14:41:28,134][49750] Updated weights for policy 0, policy_version 199081 (0.0025) [2024-04-26 14:41:30,676][49750] Updated weights for policy 0, policy_version 199091 (0.0028) [2024-04-26 14:41:32,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3261923328. Throughput: 0: 50750.6. Samples: 1014828600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:32,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 14:41:34,653][49750] Updated weights for policy 0, policy_version 199101 (0.0028) [2024-04-26 14:41:37,062][49517] Fps is (10 sec: 57344.1, 60 sec: 51882.7, 300 sec: 50929.3). Total num frames: 3262234624. Throughput: 0: 50734.7. Samples: 1014986240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 14:41:37,065][49750] Updated weights for policy 0, policy_version 199111 (0.0030) [2024-04-26 14:41:41,134][49750] Updated weights for policy 0, policy_version 199121 (0.0028) [2024-04-26 14:41:42,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3262447616. Throughput: 0: 50844.0. Samples: 1015293460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:42,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 14:41:43,329][49750] Updated weights for policy 0, policy_version 199131 (0.0033) [2024-04-26 14:41:47,062][49517] Fps is (10 sec: 44236.7, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3262676992. Throughput: 0: 50825.1. Samples: 1015597880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:47,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 14:41:47,860][49750] Updated weights for policy 0, policy_version 199141 (0.0031) [2024-04-26 14:41:49,366][49728] Signal inference workers to stop experience collection... (15200 times) [2024-04-26 14:41:49,366][49728] Signal inference workers to resume experience collection... 
(15200 times) [2024-04-26 14:41:49,392][49750] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-04-26 14:41:49,392][49750] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-04-26 14:41:49,990][49750] Updated weights for policy 0, policy_version 199151 (0.0037) [2024-04-26 14:41:52,063][49517] Fps is (10 sec: 45874.5, 60 sec: 49698.0, 300 sec: 50707.1). Total num frames: 3262906368. Throughput: 0: 50636.2. Samples: 1015734840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:52,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:41:54,382][49750] Updated weights for policy 0, policy_version 199161 (0.0029) [2024-04-26 14:41:56,469][49750] Updated weights for policy 0, policy_version 199171 (0.0026) [2024-04-26 14:41:57,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3263217664. Throughput: 0: 50684.4. Samples: 1016043780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 14:41:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 14:42:00,772][49750] Updated weights for policy 0, policy_version 199181 (0.0024) [2024-04-26 14:42:02,063][49517] Fps is (10 sec: 57344.3, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3263479808. Throughput: 0: 50813.2. Samples: 1016354380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 14:42:02,831][49750] Updated weights for policy 0, policy_version 199191 (0.0034) [2024-04-26 14:42:07,062][49517] Fps is (10 sec: 47514.1, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3263692800. Throughput: 0: 50791.2. Samples: 1016512660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:07,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 14:42:07,278][49750] Updated weights for policy 0, policy_version 199201 (0.0038) [2024-04-26 14:42:09,726][49750] Updated weights for policy 0, policy_version 199211 (0.0033) [2024-04-26 14:42:12,062][49517] Fps is (10 sec: 44237.2, 60 sec: 49698.2, 300 sec: 50651.6). Total num frames: 3263922176. Throughput: 0: 50725.2. Samples: 1016809840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:42:13,627][49750] Updated weights for policy 0, policy_version 199221 (0.0033) [2024-04-26 14:42:16,049][49750] Updated weights for policy 0, policy_version 199231 (0.0032) [2024-04-26 14:42:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3264200704. Throughput: 0: 50547.4. Samples: 1017103240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:17,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 14:42:20,276][49750] Updated weights for policy 0, policy_version 199241 (0.0032) [2024-04-26 14:42:22,063][49517] Fps is (10 sec: 58981.5, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 3264512000. Throughput: 0: 50559.7. Samples: 1017261440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:42:22,596][49750] Updated weights for policy 0, policy_version 199251 (0.0030) [2024-04-26 14:42:26,829][49750] Updated weights for policy 0, policy_version 199261 (0.0037) [2024-04-26 14:42:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3264708608. Throughput: 0: 50543.1. Samples: 1017567900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:27,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 14:42:29,117][49750] Updated weights for policy 0, policy_version 199271 (0.0034) [2024-04-26 14:42:31,232][49728] Signal inference workers to stop experience collection... (15250 times) [2024-04-26 14:42:31,232][49728] Signal inference workers to resume experience collection... (15250 times) [2024-04-26 14:42:31,267][49750] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-04-26 14:42:31,267][49750] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-04-26 14:42:32,063][49517] Fps is (10 sec: 45875.6, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3264970752. Throughput: 0: 50630.9. Samples: 1017876280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 14:42:33,153][49750] Updated weights for policy 0, policy_version 199281 (0.0031) [2024-04-26 14:42:35,432][49750] Updated weights for policy 0, policy_version 199291 (0.0032) [2024-04-26 14:42:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49425.0, 300 sec: 50651.6). Total num frames: 3265200128. Throughput: 0: 50675.7. Samples: 1018015240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:37,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 14:42:39,562][49750] Updated weights for policy 0, policy_version 199301 (0.0033) [2024-04-26 14:42:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3265495040. Throughput: 0: 50636.2. Samples: 1018322400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:42,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 14:42:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000199310_3265495040.pth... 
[2024-04-26 14:42:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000198566_3253305344.pth [2024-04-26 14:42:42,314][49750] Updated weights for policy 0, policy_version 199311 (0.0028) [2024-04-26 14:42:46,123][49750] Updated weights for policy 0, policy_version 199321 (0.0029) [2024-04-26 14:42:47,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 3265757184. Throughput: 0: 50645.4. Samples: 1018633420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:47,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 14:42:48,810][49750] Updated weights for policy 0, policy_version 199331 (0.0032) [2024-04-26 14:42:52,062][49517] Fps is (10 sec: 47513.6, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3265970176. Throughput: 0: 50581.9. Samples: 1018788840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:52,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 14:42:52,496][49750] Updated weights for policy 0, policy_version 199341 (0.0031) [2024-04-26 14:42:55,237][49750] Updated weights for policy 0, policy_version 199351 (0.0032) [2024-04-26 14:42:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3266232320. Throughput: 0: 50565.3. Samples: 1019085280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:42:57,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 14:42:58,820][49750] Updated weights for policy 0, policy_version 199361 (0.0028) [2024-04-26 14:43:01,592][49750] Updated weights for policy 0, policy_version 199371 (0.0036) [2024-04-26 14:43:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.3, 300 sec: 50762.8). Total num frames: 3266494464. Throughput: 0: 50744.0. Samples: 1019386720. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-26 14:43:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 14:43:05,188][49750] Updated weights for policy 0, policy_version 199381 (0.0028) [2024-04-26 14:43:07,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3266772992. Throughput: 0: 50886.3. Samples: 1019551320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:43:08,647][49750] Updated weights for policy 0, policy_version 199391 (0.0028) [2024-04-26 14:43:11,653][49750] Updated weights for policy 0, policy_version 199401 (0.0031) [2024-04-26 14:43:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3267002368. Throughput: 0: 50721.2. Samples: 1019850360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 14:43:15,398][49750] Updated weights for policy 0, policy_version 199411 (0.0035) [2024-04-26 14:43:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3267248128. Throughput: 0: 50678.3. Samples: 1020156800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:17,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 14:43:18,285][49750] Updated weights for policy 0, policy_version 199421 (0.0028) [2024-04-26 14:43:21,743][49750] Updated weights for policy 0, policy_version 199431 (0.0029) [2024-04-26 14:43:22,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49425.2, 300 sec: 50540.5). Total num frames: 3267477504. Throughput: 0: 50811.6. Samples: 1020301760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:22,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 14:43:24,576][49750] Updated weights for policy 0, policy_version 199441 (0.0026) [2024-04-26 14:43:25,644][49728] Signal inference workers to stop experience collection... (15300 times) [2024-04-26 14:43:25,674][49750] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-04-26 14:43:25,706][49728] Signal inference workers to resume experience collection... (15300 times) [2024-04-26 14:43:25,706][49750] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-04-26 14:43:27,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3267788800. Throughput: 0: 50823.8. Samples: 1020609480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:27,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 14:43:28,026][49750] Updated weights for policy 0, policy_version 199451 (0.0025) [2024-04-26 14:43:30,931][49750] Updated weights for policy 0, policy_version 199461 (0.0036) [2024-04-26 14:43:32,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3268018176. Throughput: 0: 50763.2. Samples: 1020917760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:32,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 14:43:34,353][49750] Updated weights for policy 0, policy_version 199471 (0.0028) [2024-04-26 14:43:37,062][49517] Fps is (10 sec: 45876.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3268247552. Throughput: 0: 50732.0. Samples: 1021071780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:37,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:43:37,482][49750] Updated weights for policy 0, policy_version 199481 (0.0027) [2024-04-26 14:43:40,794][49750] Updated weights for policy 0, policy_version 199491 (0.0033) [2024-04-26 14:43:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3268509696. Throughput: 0: 50873.5. Samples: 1021374580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:42,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 14:43:43,939][49750] Updated weights for policy 0, policy_version 199501 (0.0032) [2024-04-26 14:43:47,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3268771840. Throughput: 0: 50938.3. Samples: 1021678940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:47,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 14:43:47,193][49750] Updated weights for policy 0, policy_version 199511 (0.0028) [2024-04-26 14:43:50,302][49750] Updated weights for policy 0, policy_version 199521 (0.0028) [2024-04-26 14:43:52,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3269050368. Throughput: 0: 50736.6. Samples: 1021834460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:52,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 14:43:53,653][49750] Updated weights for policy 0, policy_version 199531 (0.0037) [2024-04-26 14:43:56,799][49750] Updated weights for policy 0, policy_version 199541 (0.0031) [2024-04-26 14:43:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3269279744. Throughput: 0: 50809.5. Samples: 1022136780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:43:57,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 14:44:00,101][49750] Updated weights for policy 0, policy_version 199551 (0.0032) [2024-04-26 14:44:02,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3269541888. Throughput: 0: 50771.8. Samples: 1022441540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:44:02,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 14:44:03,132][49750] Updated weights for policy 0, policy_version 199561 (0.0032) [2024-04-26 14:44:06,561][49750] Updated weights for policy 0, policy_version 199571 (0.0034) [2024-04-26 14:44:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3269787648. Throughput: 0: 50843.6. Samples: 1022589720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-04-26 14:44:07,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 14:44:09,539][49750] Updated weights for policy 0, policy_version 199581 (0.0039) [2024-04-26 14:44:12,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3270049792. Throughput: 0: 50863.7. Samples: 1022898340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:12,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 14:44:12,924][49750] Updated weights for policy 0, policy_version 199591 (0.0035) [2024-04-26 14:44:16,105][49750] Updated weights for policy 0, policy_version 199601 (0.0028) [2024-04-26 14:44:17,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3270311936. Throughput: 0: 50814.9. Samples: 1023204440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:44:19,274][49750] Updated weights for policy 0, policy_version 199611 (0.0031) [2024-04-26 14:44:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3270541312. Throughput: 0: 50672.3. Samples: 1023352040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:22,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 14:44:22,811][49750] Updated weights for policy 0, policy_version 199621 (0.0031) [2024-04-26 14:44:25,853][49750] Updated weights for policy 0, policy_version 199631 (0.0028) [2024-04-26 14:44:27,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3270803456. Throughput: 0: 50761.7. Samples: 1023658860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:27,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 14:44:29,183][49750] Updated weights for policy 0, policy_version 199641 (0.0033) [2024-04-26 14:44:29,718][49728] Signal inference workers to stop experience collection... (15350 times) [2024-04-26 14:44:29,756][49750] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-04-26 14:44:29,788][49728] Signal inference workers to resume experience collection... (15350 times) [2024-04-26 14:44:29,788][49750] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-04-26 14:44:32,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3271049216. Throughput: 0: 50816.9. Samples: 1023965700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:32,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 14:44:32,360][49750] Updated weights for policy 0, policy_version 199651 (0.0032) [2024-04-26 14:44:35,729][49750] Updated weights for policy 0, policy_version 199661 (0.0032) [2024-04-26 14:44:37,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 3271327744. Throughput: 0: 50632.4. Samples: 1024112920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 14:44:38,675][49750] Updated weights for policy 0, policy_version 199671 (0.0036) [2024-04-26 14:44:42,043][49750] Updated weights for policy 0, policy_version 199681 (0.0027) [2024-04-26 14:44:42,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3271573504. Throughput: 0: 50563.8. Samples: 1024412160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:42,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 14:44:42,069][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000199681_3271573504.pth... [2024-04-26 14:44:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000198937_3259383808.pth [2024-04-26 14:44:45,077][49750] Updated weights for policy 0, policy_version 199691 (0.0033) [2024-04-26 14:44:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3271835648. Throughput: 0: 50688.1. Samples: 1024722500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 14:44:48,572][49750] Updated weights for policy 0, policy_version 199701 (0.0029) [2024-04-26 14:44:51,657][49750] Updated weights for policy 0, policy_version 199711 (0.0039) [2024-04-26 14:44:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50707.1). 
Total num frames: 3272081408. Throughput: 0: 50734.1. Samples: 1024872760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:52,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 14:44:55,060][49750] Updated weights for policy 0, policy_version 199721 (0.0036) [2024-04-26 14:44:57,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3272310784. Throughput: 0: 50679.6. Samples: 1025178920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:44:57,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 14:44:58,281][49750] Updated weights for policy 0, policy_version 199731 (0.0032) [2024-04-26 14:45:01,373][49750] Updated weights for policy 0, policy_version 199741 (0.0034) [2024-04-26 14:45:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3272572928. Throughput: 0: 50596.7. Samples: 1025481280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:45:02,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 14:45:04,887][49750] Updated weights for policy 0, policy_version 199751 (0.0030) [2024-04-26 14:45:07,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3272851456. Throughput: 0: 50762.2. Samples: 1025636340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:45:07,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 14:45:07,895][49750] Updated weights for policy 0, policy_version 199761 (0.0029) [2024-04-26 14:45:11,229][49750] Updated weights for policy 0, policy_version 199771 (0.0030) [2024-04-26 14:45:12,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3273080832. Throughput: 0: 50683.5. Samples: 1025939620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:45:12,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:45:14,215][49750] Updated weights for policy 0, policy_version 199781 (0.0030) [2024-04-26 14:45:17,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3273342976. Throughput: 0: 50681.6. Samples: 1026246380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 14:45:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 14:45:17,747][49750] Updated weights for policy 0, policy_version 199791 (0.0029) [2024-04-26 14:45:20,832][49750] Updated weights for policy 0, policy_version 199801 (0.0033) [2024-04-26 14:45:22,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50651.6). Total num frames: 3273588736. Throughput: 0: 50584.8. Samples: 1026389240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:45:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:45:24,282][49750] Updated weights for policy 0, policy_version 199811 (0.0030) [2024-04-26 14:45:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3273850880. Throughput: 0: 50678.2. Samples: 1026692680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:45:27,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 14:45:27,259][49750] Updated weights for policy 0, policy_version 199821 (0.0033) [2024-04-26 14:45:30,653][49750] Updated weights for policy 0, policy_version 199831 (0.0036) [2024-04-26 14:45:32,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3274113024. Throughput: 0: 50738.2. Samples: 1027005720. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:45:32,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:45:33,657][49750] Updated weights for policy 0, policy_version 199841 (0.0031) [2024-04-26 14:45:36,914][49750] Updated weights for policy 0, policy_version 199851 (0.0035) [2024-04-26 14:45:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3274358784. Throughput: 0: 50792.0. Samples: 1027158400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:45:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:45:40,264][49750] Updated weights for policy 0, policy_version 199861 (0.0032) [2024-04-26 14:45:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3274604544. Throughput: 0: 50722.1. Samples: 1027461420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:45:42,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 14:45:42,407][49728] Signal inference workers to stop experience collection... (15400 times) [2024-04-26 14:45:42,407][49728] Signal inference workers to resume experience collection... (15400 times) [2024-04-26 14:45:42,423][49750] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-04-26 14:45:42,423][49750] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-04-26 14:45:43,208][49750] Updated weights for policy 0, policy_version 199871 (0.0030) [2024-04-26 14:45:46,616][49750] Updated weights for policy 0, policy_version 199881 (0.0037) [2024-04-26 14:45:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 3274850304. Throughput: 0: 50874.6. Samples: 1027770640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:45:47,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 14:45:49,686][49750] Updated weights for policy 0, policy_version 199891 (0.0029) [2024-04-26 14:45:52,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3275128832. Throughput: 0: 50831.1. Samples: 1027923740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:45:52,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 14:45:53,053][49750] Updated weights for policy 0, policy_version 199901 (0.0039) [2024-04-26 14:45:56,140][49750] Updated weights for policy 0, policy_version 199911 (0.0030) [2024-04-26 14:45:57,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50762.7). Total num frames: 3275374592. Throughput: 0: 50815.6. Samples: 1028226320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:45:57,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:45:59,695][49750] Updated weights for policy 0, policy_version 199921 (0.0033) [2024-04-26 14:46:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3275620352. Throughput: 0: 50744.0. Samples: 1028529860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:46:02,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 14:46:02,507][49750] Updated weights for policy 0, policy_version 199931 (0.0030) [2024-04-26 14:46:06,131][49750] Updated weights for policy 0, policy_version 199941 (0.0029) [2024-04-26 14:46:07,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3275882496. Throughput: 0: 50920.1. Samples: 1028680640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:46:07,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 14:46:09,147][49750] Updated weights for policy 0, policy_version 199951 (0.0031) [2024-04-26 14:46:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3276144640. Throughput: 0: 50966.8. Samples: 1028986180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:46:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 14:46:12,573][49750] Updated weights for policy 0, policy_version 199961 (0.0030) [2024-04-26 14:46:15,561][49750] Updated weights for policy 0, policy_version 199971 (0.0029) [2024-04-26 14:46:17,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3276406784. Throughput: 0: 50780.0. Samples: 1029290820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:46:17,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 14:46:18,883][49750] Updated weights for policy 0, policy_version 199981 (0.0031) [2024-04-26 14:46:22,038][49750] Updated weights for policy 0, policy_version 199991 (0.0031) [2024-04-26 14:46:22,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 3276652544. Throughput: 0: 50857.3. Samples: 1029446980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-26 14:46:22,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 14:46:25,475][49750] Updated weights for policy 0, policy_version 200001 (0.0026) [2024-04-26 14:46:27,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3276914688. Throughput: 0: 50982.2. Samples: 1029755620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:46:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 14:46:28,416][49750] Updated weights for policy 0, policy_version 200011 (0.0033) [2024-04-26 14:46:31,975][49750] Updated weights for policy 0, policy_version 200021 (0.0027) [2024-04-26 14:46:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3277144064. Throughput: 0: 50947.5. Samples: 1030063280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:46:32,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 14:46:34,677][49750] Updated weights for policy 0, policy_version 200031 (0.0031) [2024-04-26 14:46:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3277406208. Throughput: 0: 50794.3. Samples: 1030209480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:46:37,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 14:46:38,253][49750] Updated weights for policy 0, policy_version 200041 (0.0023) [2024-04-26 14:46:40,795][49728] Signal inference workers to stop experience collection... (15450 times) [2024-04-26 14:46:40,846][49750] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-04-26 14:46:40,859][49728] Signal inference workers to resume experience collection... (15450 times) [2024-04-26 14:46:40,867][49750] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-04-26 14:46:40,987][49750] Updated weights for policy 0, policy_version 200051 (0.0029) [2024-04-26 14:46:42,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3277668352. Throughput: 0: 50841.3. Samples: 1030514180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:46:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 14:46:42,179][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000200054_3277684736.pth... 
[2024-04-26 14:46:42,227][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000199310_3265495040.pth [2024-04-26 14:46:44,675][49750] Updated weights for policy 0, policy_version 200061 (0.0031) [2024-04-26 14:46:47,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 3277930496. Throughput: 0: 50961.9. Samples: 1030823140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:46:47,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-26 14:46:47,453][49750] Updated weights for policy 0, policy_version 200071 (0.0031) [2024-04-26 14:46:51,186][49750] Updated weights for policy 0, policy_version 200081 (0.0035) [2024-04-26 14:46:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3278176256. Throughput: 0: 50852.5. Samples: 1030969000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:46:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 14:46:54,230][49750] Updated weights for policy 0, policy_version 200091 (0.0031) [2024-04-26 14:46:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50651.6). Total num frames: 3278422016. Throughput: 0: 50835.6. Samples: 1031273780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:46:57,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 14:46:57,663][49750] Updated weights for policy 0, policy_version 200101 (0.0032) [2024-04-26 14:47:00,633][49750] Updated weights for policy 0, policy_version 200111 (0.0036) [2024-04-26 14:47:02,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3278700544. Throughput: 0: 50976.0. Samples: 1031584740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:47:02,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 14:47:04,025][49750] Updated weights for policy 0, policy_version 200121 (0.0030) [2024-04-26 14:47:07,005][49750] Updated weights for policy 0, policy_version 200131 (0.0032) [2024-04-26 14:47:07,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3278946304. Throughput: 0: 50953.2. Samples: 1031739880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:47:07,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 14:47:10,449][49750] Updated weights for policy 0, policy_version 200141 (0.0028) [2024-04-26 14:47:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3279192064. Throughput: 0: 50801.8. Samples: 1032041700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:47:12,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 14:47:13,590][49750] Updated weights for policy 0, policy_version 200151 (0.0035) [2024-04-26 14:47:16,936][49750] Updated weights for policy 0, policy_version 200161 (0.0032) [2024-04-26 14:47:17,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3279454208. Throughput: 0: 50699.6. Samples: 1032344760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:47:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 14:47:20,106][49750] Updated weights for policy 0, policy_version 200171 (0.0032) [2024-04-26 14:47:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3279699968. Throughput: 0: 50768.0. Samples: 1032494040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:47:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 14:47:23,376][49750] Updated weights for policy 0, policy_version 200181 (0.0030) [2024-04-26 14:47:26,414][49750] Updated weights for policy 0, policy_version 200191 (0.0031) [2024-04-26 14:47:27,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3279929344. Throughput: 0: 50712.4. Samples: 1032796240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:47:27,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 14:47:29,777][49750] Updated weights for policy 0, policy_version 200201 (0.0030) [2024-04-26 14:47:32,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3280207872. Throughput: 0: 50498.5. Samples: 1033095580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 14:47:32,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 14:47:32,838][49750] Updated weights for policy 0, policy_version 200211 (0.0027) [2024-04-26 14:47:36,310][49750] Updated weights for policy 0, policy_version 200221 (0.0029) [2024-04-26 14:47:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3280453632. Throughput: 0: 50635.6. Samples: 1033247600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:47:37,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 14:47:39,395][49750] Updated weights for policy 0, policy_version 200231 (0.0031) [2024-04-26 14:47:42,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 3280699392. Throughput: 0: 50656.8. Samples: 1033553340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:47:42,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 14:47:42,453][49728] Signal inference workers to stop experience collection... 
(15500 times) [2024-04-26 14:47:42,453][49728] Signal inference workers to resume experience collection... (15500 times) [2024-04-26 14:47:42,472][49750] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-04-26 14:47:42,472][49750] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-04-26 14:47:42,791][49750] Updated weights for policy 0, policy_version 200241 (0.0034) [2024-04-26 14:47:45,911][49750] Updated weights for policy 0, policy_version 200251 (0.0038) [2024-04-26 14:47:47,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 3280961536. Throughput: 0: 50511.0. Samples: 1033857740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:47:47,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 14:47:49,370][49750] Updated weights for policy 0, policy_version 200261 (0.0030) [2024-04-26 14:47:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3281207296. Throughput: 0: 50414.0. Samples: 1034008500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:47:52,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 14:47:52,343][49750] Updated weights for policy 0, policy_version 200271 (0.0028) [2024-04-26 14:47:55,800][49750] Updated weights for policy 0, policy_version 200281 (0.0029) [2024-04-26 14:47:57,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3281469440. Throughput: 0: 50610.6. Samples: 1034319180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:47:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:47:58,753][49750] Updated weights for policy 0, policy_version 200291 (0.0032) [2024-04-26 14:48:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3281715200. Throughput: 0: 50742.5. Samples: 1034628180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:48:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 14:48:02,174][49750] Updated weights for policy 0, policy_version 200301 (0.0030) [2024-04-26 14:48:05,343][49750] Updated weights for policy 0, policy_version 200311 (0.0033) [2024-04-26 14:48:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 3281977344. Throughput: 0: 50645.4. Samples: 1034773080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:48:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 14:48:08,527][49750] Updated weights for policy 0, policy_version 200321 (0.0036) [2024-04-26 14:48:11,782][49750] Updated weights for policy 0, policy_version 200331 (0.0026) [2024-04-26 14:48:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3282223104. Throughput: 0: 50711.1. Samples: 1035078240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:48:12,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 14:48:14,975][49750] Updated weights for policy 0, policy_version 200341 (0.0035) [2024-04-26 14:48:17,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3282485248. Throughput: 0: 50677.0. Samples: 1035376040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:48:17,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 14:48:18,247][49750] Updated weights for policy 0, policy_version 200351 (0.0038) [2024-04-26 14:48:21,473][49750] Updated weights for policy 0, policy_version 200361 (0.0034) [2024-04-26 14:48:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3282747392. Throughput: 0: 50729.5. Samples: 1035530440. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:48:22,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 14:48:24,534][49750] Updated weights for policy 0, policy_version 200371 (0.0031) [2024-04-26 14:48:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3282976768. Throughput: 0: 50827.3. Samples: 1035840560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:48:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 14:48:27,920][49750] Updated weights for policy 0, policy_version 200381 (0.0033) [2024-04-26 14:48:31,052][49750] Updated weights for policy 0, policy_version 200391 (0.0033) [2024-04-26 14:48:32,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3283255296. Throughput: 0: 50862.4. Samples: 1036146540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:48:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 14:48:34,510][49750] Updated weights for policy 0, policy_version 200401 (0.0032) [2024-04-26 14:48:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3283501056. Throughput: 0: 50863.2. Samples: 1036297340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 14:48:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 14:48:37,562][49750] Updated weights for policy 0, policy_version 200411 (0.0031) [2024-04-26 14:48:40,887][49750] Updated weights for policy 0, policy_version 200421 (0.0028) [2024-04-26 14:48:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3283763200. Throughput: 0: 50703.2. Samples: 1036600820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:48:42,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 14:48:42,069][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000200425_3283763200.pth... 
[2024-04-26 14:48:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000199681_3271573504.pth [2024-04-26 14:48:43,803][49750] Updated weights for policy 0, policy_version 200431 (0.0032) [2024-04-26 14:48:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 3283992576. Throughput: 0: 50724.5. Samples: 1036910780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:48:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 14:48:47,379][49750] Updated weights for policy 0, policy_version 200441 (0.0033) [2024-04-26 14:48:50,353][49750] Updated weights for policy 0, policy_version 200451 (0.0035) [2024-04-26 14:48:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3284271104. Throughput: 0: 50837.7. Samples: 1037060780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:48:52,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 14:48:53,665][49750] Updated weights for policy 0, policy_version 200461 (0.0033) [2024-04-26 14:48:55,155][49728] Signal inference workers to stop experience collection... (15550 times) [2024-04-26 14:48:55,161][49728] Signal inference workers to resume experience collection... (15550 times) [2024-04-26 14:48:55,186][49750] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-04-26 14:48:55,187][49750] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-04-26 14:48:56,860][49750] Updated weights for policy 0, policy_version 200471 (0.0032) [2024-04-26 14:48:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3284516864. Throughput: 0: 50777.9. Samples: 1037363240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:48:57,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 14:48:59,920][49750] Updated weights for policy 0, policy_version 200481 (0.0032) [2024-04-26 14:49:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3284762624. Throughput: 0: 50982.7. Samples: 1037670260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:49:02,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 14:49:03,371][49750] Updated weights for policy 0, policy_version 200491 (0.0033) [2024-04-26 14:49:06,399][49750] Updated weights for policy 0, policy_version 200501 (0.0028) [2024-04-26 14:49:07,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3285024768. Throughput: 0: 50949.8. Samples: 1037823180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:49:07,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 14:49:09,705][49750] Updated weights for policy 0, policy_version 200511 (0.0030) [2024-04-26 14:49:12,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3285270528. Throughput: 0: 50794.6. Samples: 1038126320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:49:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:49:12,836][49750] Updated weights for policy 0, policy_version 200521 (0.0033) [2024-04-26 14:49:16,408][49750] Updated weights for policy 0, policy_version 200531 (0.0032) [2024-04-26 14:49:17,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3285549056. Throughput: 0: 50796.0. Samples: 1038432360. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:49:17,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 14:49:19,252][49750] Updated weights for policy 0, policy_version 200541 (0.0029) [2024-04-26 14:49:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3285794816. Throughput: 0: 50752.8. Samples: 1038581220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:49:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 14:49:23,016][49750] Updated weights for policy 0, policy_version 200551 (0.0032) [2024-04-26 14:49:26,046][49750] Updated weights for policy 0, policy_version 200561 (0.0031) [2024-04-26 14:49:27,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3286024192. Throughput: 0: 50808.5. Samples: 1038887200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:49:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 14:49:29,434][49750] Updated weights for policy 0, policy_version 200571 (0.0032) [2024-04-26 14:49:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3286302720. Throughput: 0: 50676.0. Samples: 1039191200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:49:32,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 14:49:32,370][49750] Updated weights for policy 0, policy_version 200581 (0.0037) [2024-04-26 14:49:36,113][49750] Updated weights for policy 0, policy_version 200591 (0.0038) [2024-04-26 14:49:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3286532096. Throughput: 0: 50616.0. Samples: 1039338500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:49:37,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 14:49:38,929][49750] Updated weights for policy 0, policy_version 200601 (0.0029) [2024-04-26 14:49:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3286794240. Throughput: 0: 50842.9. Samples: 1039651180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 14:49:42,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 14:49:42,370][49750] Updated weights for policy 0, policy_version 200611 (0.0028) [2024-04-26 14:49:45,437][49750] Updated weights for policy 0, policy_version 200621 (0.0027) [2024-04-26 14:49:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3287040000. Throughput: 0: 50836.2. Samples: 1039957900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:49:47,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 14:49:48,746][49750] Updated weights for policy 0, policy_version 200631 (0.0038) [2024-04-26 14:49:51,934][49750] Updated weights for policy 0, policy_version 200641 (0.0026) [2024-04-26 14:49:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3287302144. Throughput: 0: 50518.8. Samples: 1040096520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:49:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:49:55,345][49750] Updated weights for policy 0, policy_version 200651 (0.0029) [2024-04-26 14:49:57,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3287547904. Throughput: 0: 50652.6. Samples: 1040405680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:49:57,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 14:49:58,237][49750] Updated weights for policy 0, policy_version 200661 (0.0035) [2024-04-26 14:50:01,655][49750] Updated weights for policy 0, policy_version 200671 (0.0032) [2024-04-26 14:50:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3287810048. Throughput: 0: 50657.4. Samples: 1040711940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:50:04,636][49750] Updated weights for policy 0, policy_version 200681 (0.0027) [2024-04-26 14:50:07,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3288072192. Throughput: 0: 50724.0. Samples: 1040863800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:07,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 14:50:08,006][49750] Updated weights for policy 0, policy_version 200691 (0.0032) [2024-04-26 14:50:11,073][49750] Updated weights for policy 0, policy_version 200701 (0.0031) [2024-04-26 14:50:12,063][49517] Fps is (10 sec: 49150.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3288301568. Throughput: 0: 50616.2. Samples: 1041164940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:12,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 14:50:14,517][49750] Updated weights for policy 0, policy_version 200711 (0.0030) [2024-04-26 14:50:16,805][49728] Signal inference workers to stop experience collection... (15600 times) [2024-04-26 14:50:16,806][49728] Signal inference workers to resume experience collection... 
(15600 times) [2024-04-26 14:50:16,844][49750] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-04-26 14:50:16,845][49750] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-04-26 14:50:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3288580096. Throughput: 0: 50705.3. Samples: 1041472940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:17,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 14:50:17,549][49750] Updated weights for policy 0, policy_version 200721 (0.0034) [2024-04-26 14:50:20,982][49750] Updated weights for policy 0, policy_version 200731 (0.0032) [2024-04-26 14:50:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3288825856. Throughput: 0: 50820.0. Samples: 1041625400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:22,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 14:50:24,008][49750] Updated weights for policy 0, policy_version 200741 (0.0034) [2024-04-26 14:50:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3289071616. Throughput: 0: 50821.0. Samples: 1041938120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 14:50:27,453][49750] Updated weights for policy 0, policy_version 200751 (0.0029) [2024-04-26 14:50:30,490][49750] Updated weights for policy 0, policy_version 200761 (0.0029) [2024-04-26 14:50:32,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3289333760. Throughput: 0: 50836.9. Samples: 1042245560. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:32,063][49517] Avg episode reward: [(0, '0.450')] [2024-04-26 14:50:33,759][49750] Updated weights for policy 0, policy_version 200771 (0.0029) [2024-04-26 14:50:36,853][49750] Updated weights for policy 0, policy_version 200781 (0.0031) [2024-04-26 14:50:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3289595904. Throughput: 0: 50954.2. Samples: 1042389460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:37,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 14:50:40,249][49750] Updated weights for policy 0, policy_version 200791 (0.0027) [2024-04-26 14:50:42,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3289858048. Throughput: 0: 50830.6. Samples: 1042693060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:50:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000200797_3289858048.pth... [2024-04-26 14:50:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000200054_3277684736.pth [2024-04-26 14:50:43,310][49750] Updated weights for policy 0, policy_version 200801 (0.0027) [2024-04-26 14:50:46,672][49750] Updated weights for policy 0, policy_version 200811 (0.0033) [2024-04-26 14:50:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3290103808. Throughput: 0: 50980.7. Samples: 1043006080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:47,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 14:50:49,713][49750] Updated weights for policy 0, policy_version 200821 (0.0033) [2024-04-26 14:50:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3290349568. Throughput: 0: 50915.9. Samples: 1043155020. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 14:50:52,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 14:50:53,187][49750] Updated weights for policy 0, policy_version 200831 (0.0030) [2024-04-26 14:50:56,142][49750] Updated weights for policy 0, policy_version 200841 (0.0037) [2024-04-26 14:50:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3290595328. Throughput: 0: 50898.0. Samples: 1043455340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:50:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 14:50:59,458][49750] Updated weights for policy 0, policy_version 200851 (0.0030) [2024-04-26 14:51:02,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3290873856. Throughput: 0: 50892.5. Samples: 1043763100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:02,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 14:51:02,783][49750] Updated weights for policy 0, policy_version 200861 (0.0031) [2024-04-26 14:51:05,572][49728] Signal inference workers to stop experience collection... (15650 times) [2024-04-26 14:51:05,609][49750] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-04-26 14:51:05,627][49728] Signal inference workers to resume experience collection... (15650 times) [2024-04-26 14:51:05,636][49750] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-04-26 14:51:05,953][49750] Updated weights for policy 0, policy_version 200871 (0.0027) [2024-04-26 14:51:07,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3291136000. Throughput: 0: 50951.1. Samples: 1043918200. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:07,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 14:51:09,105][49750] Updated weights for policy 0, policy_version 200881 (0.0034) [2024-04-26 14:51:12,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.6, 300 sec: 50651.6). Total num frames: 3291348992. Throughput: 0: 50870.3. Samples: 1044227280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:51:12,482][49750] Updated weights for policy 0, policy_version 200891 (0.0034) [2024-04-26 14:51:15,560][49750] Updated weights for policy 0, policy_version 200901 (0.0030) [2024-04-26 14:51:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3291627520. Throughput: 0: 50881.9. Samples: 1044535240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:17,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 14:51:18,878][49750] Updated weights for policy 0, policy_version 200911 (0.0033) [2024-04-26 14:51:22,060][49750] Updated weights for policy 0, policy_version 200921 (0.0032) [2024-04-26 14:51:22,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3291889664. Throughput: 0: 50871.2. Samples: 1044678660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:22,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 14:51:25,325][49750] Updated weights for policy 0, policy_version 200931 (0.0032) [2024-04-26 14:51:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3292151808. Throughput: 0: 50942.2. Samples: 1044985460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:27,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 14:51:28,482][49750] Updated weights for policy 0, policy_version 200941 (0.0031) [2024-04-26 14:51:31,707][49750] Updated weights for policy 0, policy_version 200951 (0.0029) [2024-04-26 14:51:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 3292397568. Throughput: 0: 50909.5. Samples: 1045297000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:51:34,766][49750] Updated weights for policy 0, policy_version 200961 (0.0033) [2024-04-26 14:51:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3292643328. Throughput: 0: 50893.4. Samples: 1045445220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 14:51:38,305][49750] Updated weights for policy 0, policy_version 200971 (0.0031) [2024-04-26 14:51:41,143][49750] Updated weights for policy 0, policy_version 200981 (0.0033) [2024-04-26 14:51:42,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3292889088. Throughput: 0: 50978.6. Samples: 1045749380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:42,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 14:51:44,698][49750] Updated weights for policy 0, policy_version 200991 (0.0026) [2024-04-26 14:51:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3293151232. Throughput: 0: 50859.2. Samples: 1046051760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:47,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 14:51:47,597][49750] Updated weights for policy 0, policy_version 201001 (0.0031) [2024-04-26 14:51:51,029][49750] Updated weights for policy 0, policy_version 201011 (0.0034) [2024-04-26 14:51:52,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3293429760. Throughput: 0: 50947.0. Samples: 1046210820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 14:51:54,162][49750] Updated weights for policy 0, policy_version 201021 (0.0026) [2024-04-26 14:51:57,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3293642752. Throughput: 0: 50787.5. Samples: 1046512720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-04-26 14:51:57,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:51:57,454][49750] Updated weights for policy 0, policy_version 201031 (0.0031) [2024-04-26 14:52:00,727][49750] Updated weights for policy 0, policy_version 201041 (0.0029) [2024-04-26 14:52:02,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3293921280. Throughput: 0: 50755.0. Samples: 1046819220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:02,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 14:52:03,996][49750] Updated weights for policy 0, policy_version 201051 (0.0041) [2024-04-26 14:52:04,119][49728] Signal inference workers to stop experience collection... (15700 times) [2024-04-26 14:52:04,163][49750] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-04-26 14:52:04,185][49728] Signal inference workers to resume experience collection... 
(15700 times) [2024-04-26 14:52:04,186][49750] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-04-26 14:52:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3294167040. Throughput: 0: 50761.4. Samples: 1046962920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 14:52:07,108][49750] Updated weights for policy 0, policy_version 201061 (0.0030) [2024-04-26 14:52:10,393][49750] Updated weights for policy 0, policy_version 201071 (0.0036) [2024-04-26 14:52:12,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51609.5, 300 sec: 50818.1). Total num frames: 3294445568. Throughput: 0: 50833.7. Samples: 1047272980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:12,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 14:52:13,667][49750] Updated weights for policy 0, policy_version 201081 (0.0033) [2024-04-26 14:52:16,745][49750] Updated weights for policy 0, policy_version 201091 (0.0034) [2024-04-26 14:52:17,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3294691328. Throughput: 0: 50753.3. Samples: 1047580900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:17,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 14:52:20,185][49750] Updated weights for policy 0, policy_version 201101 (0.0036) [2024-04-26 14:52:22,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3294920704. Throughput: 0: 50796.0. Samples: 1047731040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 14:52:23,202][49750] Updated weights for policy 0, policy_version 201111 (0.0031) [2024-04-26 14:52:26,754][49750] Updated weights for policy 0, policy_version 201121 (0.0027) [2024-04-26 14:52:27,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3295182848. Throughput: 0: 50842.7. Samples: 1048037300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:27,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 14:52:29,716][49750] Updated weights for policy 0, policy_version 201131 (0.0030) [2024-04-26 14:52:32,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3295444992. Throughput: 0: 50743.0. Samples: 1048335200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:32,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 14:52:33,261][49750] Updated weights for policy 0, policy_version 201141 (0.0031) [2024-04-26 14:52:36,086][49750] Updated weights for policy 0, policy_version 201151 (0.0034) [2024-04-26 14:52:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3295707136. Throughput: 0: 50749.0. Samples: 1048494520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:37,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 14:52:39,594][49750] Updated weights for policy 0, policy_version 201161 (0.0033) [2024-04-26 14:52:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3295936512. Throughput: 0: 50732.8. Samples: 1048795700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:42,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 14:52:42,082][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000201169_3295952896.pth... 
[2024-04-26 14:52:42,132][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000200425_3283763200.pth [2024-04-26 14:52:42,463][49750] Updated weights for policy 0, policy_version 201171 (0.0029) [2024-04-26 14:52:46,017][49750] Updated weights for policy 0, policy_version 201181 (0.0031) [2024-04-26 14:52:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3296198656. Throughput: 0: 50670.0. Samples: 1049099360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:47,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 14:52:48,932][49750] Updated weights for policy 0, policy_version 201191 (0.0033) [2024-04-26 14:52:52,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3296444416. Throughput: 0: 50763.4. Samples: 1049247280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:52,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 14:52:52,432][49750] Updated weights for policy 0, policy_version 201201 (0.0031) [2024-04-26 14:52:55,516][49750] Updated weights for policy 0, policy_version 201211 (0.0030) [2024-04-26 14:52:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3296722944. Throughput: 0: 50676.6. Samples: 1049553420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:52:57,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 14:52:58,953][49750] Updated weights for policy 0, policy_version 201221 (0.0040) [2024-04-26 14:53:01,903][49750] Updated weights for policy 0, policy_version 201231 (0.0029) [2024-04-26 14:53:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3296968704. Throughput: 0: 50710.6. Samples: 1049862880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:53:02,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 14:53:05,439][49750] Updated weights for policy 0, policy_version 201241 (0.0030) [2024-04-26 14:53:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3297214464. Throughput: 0: 50581.9. Samples: 1050007220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 14:53:07,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 14:53:08,270][49750] Updated weights for policy 0, policy_version 201251 (0.0031) [2024-04-26 14:53:09,609][49728] Signal inference workers to stop experience collection... (15750 times) [2024-04-26 14:53:09,614][49728] Signal inference workers to resume experience collection... (15750 times) [2024-04-26 14:53:09,657][49750] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-04-26 14:53:09,657][49750] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-04-26 14:53:11,847][49750] Updated weights for policy 0, policy_version 201261 (0.0028) [2024-04-26 14:53:12,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3297460224. Throughput: 0: 50555.1. Samples: 1050312280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:12,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 14:53:14,663][49750] Updated weights for policy 0, policy_version 201271 (0.0035) [2024-04-26 14:53:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3297722368. Throughput: 0: 50672.1. Samples: 1050615440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:17,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 14:53:18,317][49750] Updated weights for policy 0, policy_version 201281 (0.0036) [2024-04-26 14:53:21,259][49750] Updated weights for policy 0, policy_version 201291 (0.0031) [2024-04-26 14:53:22,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3297984512. Throughput: 0: 50710.9. Samples: 1050776520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:53:24,839][49750] Updated weights for policy 0, policy_version 201301 (0.0039) [2024-04-26 14:53:27,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3298213888. Throughput: 0: 50796.9. Samples: 1051081560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:27,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 14:53:27,748][49750] Updated weights for policy 0, policy_version 201311 (0.0030) [2024-04-26 14:53:31,379][49750] Updated weights for policy 0, policy_version 201321 (0.0032) [2024-04-26 14:53:32,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3298459648. Throughput: 0: 50766.7. Samples: 1051383860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:32,063][49517] Avg episode reward: [(0, '0.683')] [2024-04-26 14:53:34,068][49750] Updated weights for policy 0, policy_version 201331 (0.0037) [2024-04-26 14:53:37,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3298738176. Throughput: 0: 50799.2. Samples: 1051533240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:37,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 14:53:37,805][49750] Updated weights for policy 0, policy_version 201341 (0.0026) [2024-04-26 14:53:40,467][49750] Updated weights for policy 0, policy_version 201351 (0.0029) [2024-04-26 14:53:42,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3299000320. Throughput: 0: 50629.2. Samples: 1051831740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 14:53:44,130][49750] Updated weights for policy 0, policy_version 201361 (0.0027) [2024-04-26 14:53:46,939][49750] Updated weights for policy 0, policy_version 201371 (0.0035) [2024-04-26 14:53:47,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3299262464. Throughput: 0: 50634.7. Samples: 1052141440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 14:53:50,677][49750] Updated weights for policy 0, policy_version 201381 (0.0030) [2024-04-26 14:53:52,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3299491840. Throughput: 0: 50882.9. Samples: 1052296960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:52,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 14:53:53,689][49750] Updated weights for policy 0, policy_version 201391 (0.0031) [2024-04-26 14:53:57,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3299721216. Throughput: 0: 50859.6. Samples: 1052600960. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:53:57,064][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 14:53:57,312][49750] Updated weights for policy 0, policy_version 201401 (0.0030) [2024-04-26 14:54:00,068][49750] Updated weights for policy 0, policy_version 201411 (0.0024) [2024-04-26 14:54:02,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3299999744. Throughput: 0: 50865.6. Samples: 1052904400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:54:02,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 14:54:03,597][49750] Updated weights for policy 0, policy_version 201421 (0.0037) [2024-04-26 14:54:06,502][49750] Updated weights for policy 0, policy_version 201431 (0.0030) [2024-04-26 14:54:07,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3300261888. Throughput: 0: 50593.0. Samples: 1053053200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:54:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 14:54:10,108][49750] Updated weights for policy 0, policy_version 201441 (0.0031) [2024-04-26 14:54:12,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3300507648. Throughput: 0: 50675.7. Samples: 1053361960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 14:54:12,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 14:54:12,936][49750] Updated weights for policy 0, policy_version 201451 (0.0034) [2024-04-26 14:54:16,660][49750] Updated weights for policy 0, policy_version 201461 (0.0035) [2024-04-26 14:54:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3300769792. Throughput: 0: 50731.8. Samples: 1053666800. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:54:17,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 14:54:19,363][49750] Updated weights for policy 0, policy_version 201471 (0.0033) [2024-04-26 14:54:22,062][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3300999168. Throughput: 0: 50615.9. Samples: 1053810960. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:54:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 14:54:23,093][49750] Updated weights for policy 0, policy_version 201481 (0.0034) [2024-04-26 14:54:25,918][49750] Updated weights for policy 0, policy_version 201491 (0.0033) [2024-04-26 14:54:27,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3301277696. Throughput: 0: 50798.2. Samples: 1054117660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:54:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 14:54:27,343][49728] Signal inference workers to stop experience collection... (15800 times) [2024-04-26 14:54:27,343][49728] Signal inference workers to resume experience collection... (15800 times) [2024-04-26 14:54:27,371][49750] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-04-26 14:54:27,371][49750] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-04-26 14:54:29,436][49750] Updated weights for policy 0, policy_version 201501 (0.0034) [2024-04-26 14:54:32,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3301523456. Throughput: 0: 50696.4. Samples: 1054422780. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:54:32,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 14:54:32,358][49750] Updated weights for policy 0, policy_version 201511 (0.0039) [2024-04-26 14:54:35,986][49750] Updated weights for policy 0, policy_version 201521 (0.0031) [2024-04-26 14:54:37,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3301769216. Throughput: 0: 50658.9. Samples: 1054576600. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:54:37,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 14:54:38,873][49750] Updated weights for policy 0, policy_version 201531 (0.0027) [2024-04-26 14:54:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 3302014976. Throughput: 0: 50489.5. Samples: 1054873000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:54:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 14:54:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000201539_3302014976.pth... [2024-04-26 14:54:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000200797_3289858048.pth [2024-04-26 14:54:42,441][49750] Updated weights for policy 0, policy_version 201541 (0.0034) [2024-04-26 14:54:45,401][49750] Updated weights for policy 0, policy_version 201551 (0.0030) [2024-04-26 14:54:47,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3302293504. Throughput: 0: 50627.7. Samples: 1055182640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:54:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 14:54:48,874][49750] Updated weights for policy 0, policy_version 201561 (0.0034) [2024-04-26 14:54:51,754][49750] Updated weights for policy 0, policy_version 201571 (0.0030) [2024-04-26 14:54:52,062][49517] Fps is (10 sec: 52430.0, 60 sec: 50790.5, 300 sec: 50818.2). 
Total num frames: 3302539264. Throughput: 0: 50640.5. Samples: 1055332020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:54:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 14:54:55,375][49750] Updated weights for policy 0, policy_version 201581 (0.0034) [2024-04-26 14:54:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3302785024. Throughput: 0: 50519.9. Samples: 1055635360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:54:57,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 14:54:58,195][49750] Updated weights for policy 0, policy_version 201591 (0.0029) [2024-04-26 14:55:01,833][49750] Updated weights for policy 0, policy_version 201601 (0.0033) [2024-04-26 14:55:02,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3303047168. Throughput: 0: 50666.2. Samples: 1055946780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:55:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 14:55:04,777][49750] Updated weights for policy 0, policy_version 201611 (0.0033) [2024-04-26 14:55:07,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3303292928. Throughput: 0: 50773.9. Samples: 1056095780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:55:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 14:55:08,322][49750] Updated weights for policy 0, policy_version 201621 (0.0029) [2024-04-26 14:55:11,185][49750] Updated weights for policy 0, policy_version 201631 (0.0034) [2024-04-26 14:55:12,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3303555072. Throughput: 0: 50604.0. Samples: 1056394840. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:55:12,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 14:55:14,758][49750] Updated weights for policy 0, policy_version 201641 (0.0030) [2024-04-26 14:55:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3303817216. Throughput: 0: 50712.9. Samples: 1056704860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 14:55:17,063][49517] Avg episode reward: [(0, '0.463')] [2024-04-26 14:55:17,514][49750] Updated weights for policy 0, policy_version 201651 (0.0027) [2024-04-26 14:55:21,078][49750] Updated weights for policy 0, policy_version 201661 (0.0027) [2024-04-26 14:55:22,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3304062976. Throughput: 0: 50716.5. Samples: 1056858840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:55:22,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 14:55:24,328][49750] Updated weights for policy 0, policy_version 201671 (0.0030) [2024-04-26 14:55:27,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3304292352. Throughput: 0: 50756.3. Samples: 1057157020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:55:27,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 14:55:27,694][49750] Updated weights for policy 0, policy_version 201681 (0.0041) [2024-04-26 14:55:30,652][49750] Updated weights for policy 0, policy_version 201691 (0.0028) [2024-04-26 14:55:32,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3304570880. Throughput: 0: 50680.4. Samples: 1057463260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:55:32,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:55:34,077][49750] Updated weights for policy 0, policy_version 201701 (0.0029) [2024-04-26 14:55:34,724][49728] Signal inference workers to stop experience collection... (15850 times) [2024-04-26 14:55:34,724][49728] Signal inference workers to resume experience collection... (15850 times) [2024-04-26 14:55:34,751][49750] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-04-26 14:55:34,752][49750] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-04-26 14:55:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3304816640. Throughput: 0: 50854.2. Samples: 1057620460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:55:37,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 14:55:37,116][49750] Updated weights for policy 0, policy_version 201711 (0.0032) [2024-04-26 14:55:40,520][49750] Updated weights for policy 0, policy_version 201721 (0.0036) [2024-04-26 14:55:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3305078784. Throughput: 0: 50889.8. Samples: 1057925400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:55:42,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 14:55:43,596][49750] Updated weights for policy 0, policy_version 201731 (0.0029) [2024-04-26 14:55:46,906][49750] Updated weights for policy 0, policy_version 201741 (0.0031) [2024-04-26 14:55:47,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3305324544. Throughput: 0: 50685.8. Samples: 1058227640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:55:47,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 14:55:49,933][49750] Updated weights for policy 0, policy_version 201751 (0.0031) [2024-04-26 14:55:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3305570304. Throughput: 0: 50710.7. Samples: 1058377760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:55:52,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 14:55:53,398][49750] Updated weights for policy 0, policy_version 201761 (0.0036) [2024-04-26 14:55:56,209][49750] Updated weights for policy 0, policy_version 201771 (0.0029) [2024-04-26 14:55:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3305832448. Throughput: 0: 50866.7. Samples: 1058683840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:55:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 14:55:59,911][49750] Updated weights for policy 0, policy_version 201781 (0.0034) [2024-04-26 14:56:02,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3306094592. Throughput: 0: 50701.4. Samples: 1058986420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:56:02,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 14:56:02,710][49750] Updated weights for policy 0, policy_version 201791 (0.0030) [2024-04-26 14:56:06,229][49750] Updated weights for policy 0, policy_version 201801 (0.0025) [2024-04-26 14:56:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3306340352. Throughput: 0: 50832.3. Samples: 1059146300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:56:07,063][49517] Avg episode reward: [(0, '0.457')] [2024-04-26 14:56:09,169][49750] Updated weights for policy 0, policy_version 201811 (0.0030) [2024-04-26 14:56:12,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3306586112. Throughput: 0: 50913.1. Samples: 1059448120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:56:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:56:12,582][49750] Updated weights for policy 0, policy_version 201821 (0.0031) [2024-04-26 14:56:15,742][49750] Updated weights for policy 0, policy_version 201831 (0.0031) [2024-04-26 14:56:17,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3306848256. Throughput: 0: 50854.5. Samples: 1059751720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:56:17,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:56:19,025][49750] Updated weights for policy 0, policy_version 201841 (0.0033) [2024-04-26 14:56:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3307110400. Throughput: 0: 50780.5. Samples: 1059905580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:56:22,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:56:22,300][49750] Updated weights for policy 0, policy_version 201851 (0.0034) [2024-04-26 14:56:25,492][49750] Updated weights for policy 0, policy_version 201861 (0.0029) [2024-04-26 14:56:27,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3307339776. Throughput: 0: 50801.3. Samples: 1060211460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 14:56:27,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 14:56:28,693][49750] Updated weights for policy 0, policy_version 201871 (0.0031) [2024-04-26 14:56:31,906][49750] Updated weights for policy 0, policy_version 201881 (0.0029) [2024-04-26 14:56:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3307634688. Throughput: 0: 50917.4. Samples: 1060518920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:56:32,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 14:56:35,197][49750] Updated weights for policy 0, policy_version 201891 (0.0026) [2024-04-26 14:56:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3307864064. Throughput: 0: 51003.1. Samples: 1060672900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:56:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 14:56:38,346][49750] Updated weights for policy 0, policy_version 201901 (0.0030) [2024-04-26 14:56:41,774][49750] Updated weights for policy 0, policy_version 201911 (0.0036) [2024-04-26 14:56:42,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3308126208. Throughput: 0: 50818.6. Samples: 1060970680. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:56:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 14:56:42,200][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000201913_3308142592.pth... [2024-04-26 14:56:42,249][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000201169_3295952896.pth [2024-04-26 14:56:43,319][49728] Signal inference workers to stop experience collection... 
(15900 times) [2024-04-26 14:56:43,362][49750] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-04-26 14:56:43,424][49728] Signal inference workers to resume experience collection... (15900 times) [2024-04-26 14:56:43,424][49750] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-04-26 14:56:44,733][49750] Updated weights for policy 0, policy_version 201921 (0.0030) [2024-04-26 14:56:47,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3308371968. Throughput: 0: 50882.1. Samples: 1061276120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:56:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 14:56:48,097][49750] Updated weights for policy 0, policy_version 201931 (0.0028) [2024-04-26 14:56:51,276][49750] Updated weights for policy 0, policy_version 201941 (0.0037) [2024-04-26 14:56:52,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3308617728. Throughput: 0: 50744.1. Samples: 1061429780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:56:52,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 14:56:54,359][49750] Updated weights for policy 0, policy_version 201951 (0.0030) [2024-04-26 14:56:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3308863488. Throughput: 0: 50750.9. Samples: 1061731900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:56:57,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 14:56:57,769][49750] Updated weights for policy 0, policy_version 201961 (0.0029) [2024-04-26 14:57:00,925][49750] Updated weights for policy 0, policy_version 201971 (0.0040) [2024-04-26 14:57:02,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3309158400. Throughput: 0: 50788.5. Samples: 1062037200. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:57:02,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 14:57:04,117][49750] Updated weights for policy 0, policy_version 201981 (0.0032) [2024-04-26 14:57:07,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3309387776. Throughput: 0: 50915.2. Samples: 1062196760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:57:07,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 14:57:07,408][49750] Updated weights for policy 0, policy_version 201991 (0.0030) [2024-04-26 14:57:10,756][49750] Updated weights for policy 0, policy_version 202001 (0.0028) [2024-04-26 14:57:12,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3309649920. Throughput: 0: 50890.3. Samples: 1062501520. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:57:12,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 14:57:13,852][49750] Updated weights for policy 0, policy_version 202011 (0.0028) [2024-04-26 14:57:17,064][49517] Fps is (10 sec: 50784.5, 60 sec: 50789.6, 300 sec: 50762.5). Total num frames: 3309895680. Throughput: 0: 50823.7. Samples: 1062806040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:57:17,064][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 14:57:17,098][49750] Updated weights for policy 0, policy_version 202021 (0.0025) [2024-04-26 14:57:20,388][49750] Updated weights for policy 0, policy_version 202031 (0.0029) [2024-04-26 14:57:22,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3310157824. Throughput: 0: 50890.4. Samples: 1062962980. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:57:22,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:57:23,539][49750] Updated weights for policy 0, policy_version 202041 (0.0039) [2024-04-26 14:57:26,843][49750] Updated weights for policy 0, policy_version 202051 (0.0031) [2024-04-26 14:57:27,062][49517] Fps is (10 sec: 50796.2, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3310403584. Throughput: 0: 50918.8. Samples: 1063262020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:57:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 14:57:30,054][49750] Updated weights for policy 0, policy_version 202061 (0.0031) [2024-04-26 14:57:32,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3310649344. Throughput: 0: 50899.1. Samples: 1063566580. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 14:57:32,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 14:57:33,234][49750] Updated weights for policy 0, policy_version 202071 (0.0030) [2024-04-26 14:57:36,509][49750] Updated weights for policy 0, policy_version 202081 (0.0029) [2024-04-26 14:57:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3310927872. Throughput: 0: 50875.0. Samples: 1063719160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:57:37,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 14:57:39,562][49750] Updated weights for policy 0, policy_version 202091 (0.0033) [2024-04-26 14:57:42,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3311157248. Throughput: 0: 50775.4. Samples: 1064016800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:57:42,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 14:57:42,824][49750] Updated weights for policy 0, policy_version 202101 (0.0031) [2024-04-26 14:57:46,100][49750] Updated weights for policy 0, policy_version 202111 (0.0027) [2024-04-26 14:57:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3311435776. Throughput: 0: 50801.4. Samples: 1064323260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:57:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 14:57:49,272][49750] Updated weights for policy 0, policy_version 202121 (0.0031) [2024-04-26 14:57:50,602][49728] Signal inference workers to stop experience collection... (15950 times) [2024-04-26 14:57:50,602][49728] Signal inference workers to resume experience collection... (15950 times) [2024-04-26 14:57:50,614][49750] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-04-26 14:57:50,632][49750] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-04-26 14:57:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3311681536. Throughput: 0: 50729.2. Samples: 1064479580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:57:52,072][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 14:57:52,583][49750] Updated weights for policy 0, policy_version 202131 (0.0032) [2024-04-26 14:57:55,859][49750] Updated weights for policy 0, policy_version 202141 (0.0034) [2024-04-26 14:57:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3311943680. Throughput: 0: 50803.6. Samples: 1064787680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:57:57,071][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 14:57:58,988][49750] Updated weights for policy 0, policy_version 202151 (0.0028) [2024-04-26 14:58:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3312189440. Throughput: 0: 50672.8. Samples: 1065086260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:58:02,071][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 14:58:02,313][49750] Updated weights for policy 0, policy_version 202161 (0.0031) [2024-04-26 14:58:05,385][49750] Updated weights for policy 0, policy_version 202171 (0.0027) [2024-04-26 14:58:07,064][49517] Fps is (10 sec: 50784.6, 60 sec: 51062.5, 300 sec: 50818.0). Total num frames: 3312451584. Throughput: 0: 50789.2. Samples: 1065248540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:58:07,064][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 14:58:08,617][49750] Updated weights for policy 0, policy_version 202181 (0.0027) [2024-04-26 14:58:11,917][49750] Updated weights for policy 0, policy_version 202191 (0.0029) [2024-04-26 14:58:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3312713728. Throughput: 0: 50884.8. Samples: 1065551840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:58:12,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 14:58:15,242][49750] Updated weights for policy 0, policy_version 202201 (0.0029) [2024-04-26 14:58:17,062][49517] Fps is (10 sec: 47518.8, 60 sec: 50518.2, 300 sec: 50651.6). Total num frames: 3312926720. Throughput: 0: 50756.8. Samples: 1065850640. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:58:17,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 14:58:18,373][49750] Updated weights for policy 0, policy_version 202211 (0.0030) [2024-04-26 14:58:21,629][49750] Updated weights for policy 0, policy_version 202221 (0.0032) [2024-04-26 14:58:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 3313205248. Throughput: 0: 50649.9. Samples: 1065998400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:58:22,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 14:58:24,848][49750] Updated weights for policy 0, policy_version 202231 (0.0029) [2024-04-26 14:58:27,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3313467392. Throughput: 0: 50936.0. Samples: 1066308920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:58:27,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 14:58:27,939][49750] Updated weights for policy 0, policy_version 202241 (0.0028) [2024-04-26 14:58:31,335][49750] Updated weights for policy 0, policy_version 202251 (0.0032) [2024-04-26 14:58:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3313696768. Throughput: 0: 50888.2. Samples: 1066613220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:58:32,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 14:58:34,245][49750] Updated weights for policy 0, policy_version 202261 (0.0032) [2024-04-26 14:58:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3313975296. Throughput: 0: 50753.4. Samples: 1066763480. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:58:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 14:58:37,673][49750] Updated weights for policy 0, policy_version 202271 (0.0030) [2024-04-26 14:58:40,788][49750] Updated weights for policy 0, policy_version 202281 (0.0031) [2024-04-26 14:58:42,062][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3314221056. Throughput: 0: 50749.3. Samples: 1067071400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 14:58:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 14:58:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000202284_3314221056.pth... [2024-04-26 14:58:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000201539_3302014976.pth [2024-04-26 14:58:44,100][49750] Updated weights for policy 0, policy_version 202291 (0.0027) [2024-04-26 14:58:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3314483200. Throughput: 0: 50810.7. Samples: 1067372740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:58:47,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 14:58:47,317][49750] Updated weights for policy 0, policy_version 202301 (0.0031) [2024-04-26 14:58:50,500][49750] Updated weights for policy 0, policy_version 202311 (0.0035) [2024-04-26 14:58:52,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3314728960. Throughput: 0: 50747.8. Samples: 1067532140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:58:52,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 14:58:53,791][49750] Updated weights for policy 0, policy_version 202321 (0.0030) [2024-04-26 14:58:54,632][49728] Signal inference workers to stop experience collection... (16000 times) [2024-04-26 14:58:54,633][49728] Signal inference workers to resume experience collection... 
(16000 times) [2024-04-26 14:58:54,663][49750] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-04-26 14:58:54,663][49750] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-04-26 14:58:56,895][49750] Updated weights for policy 0, policy_version 202331 (0.0029) [2024-04-26 14:58:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3314991104. Throughput: 0: 50776.1. Samples: 1067836760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:58:57,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 14:59:00,112][49750] Updated weights for policy 0, policy_version 202341 (0.0038) [2024-04-26 14:59:02,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3315220480. Throughput: 0: 50869.3. Samples: 1068139760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:02,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 14:59:03,293][49750] Updated weights for policy 0, policy_version 202351 (0.0032) [2024-04-26 14:59:06,401][49750] Updated weights for policy 0, policy_version 202361 (0.0031) [2024-04-26 14:59:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50518.3, 300 sec: 50762.6). Total num frames: 3315482624. Throughput: 0: 50953.8. Samples: 1068291320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:07,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:59:09,723][49750] Updated weights for policy 0, policy_version 202371 (0.0031) [2024-04-26 14:59:12,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3315744768. Throughput: 0: 50789.8. Samples: 1068594460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:12,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 14:59:12,847][49750] Updated weights for policy 0, policy_version 202381 (0.0030) [2024-04-26 14:59:16,355][49750] Updated weights for policy 0, policy_version 202391 (0.0038) [2024-04-26 14:59:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3315990528. Throughput: 0: 50719.4. Samples: 1068895600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:17,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 14:59:19,319][49750] Updated weights for policy 0, policy_version 202401 (0.0029) [2024-04-26 14:59:22,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3316236288. Throughput: 0: 50705.7. Samples: 1069045240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:22,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 14:59:22,902][49750] Updated weights for policy 0, policy_version 202411 (0.0035) [2024-04-26 14:59:25,607][49750] Updated weights for policy 0, policy_version 202421 (0.0037) [2024-04-26 14:59:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3316498432. Throughput: 0: 50625.3. Samples: 1069349540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:27,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 14:59:29,251][49750] Updated weights for policy 0, policy_version 202431 (0.0032) [2024-04-26 14:59:32,030][49750] Updated weights for policy 0, policy_version 202441 (0.0028) [2024-04-26 14:59:32,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 3316793344. Throughput: 0: 50671.0. Samples: 1069652940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:32,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 14:59:35,741][49750] Updated weights for policy 0, policy_version 202451 (0.0028) [2024-04-26 14:59:37,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3317006336. Throughput: 0: 50663.7. Samples: 1069812000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:37,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 14:59:38,545][49750] Updated weights for policy 0, policy_version 202461 (0.0029) [2024-04-26 14:59:42,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3317268480. Throughput: 0: 50664.4. Samples: 1070116660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 14:59:42,301][49750] Updated weights for policy 0, policy_version 202471 (0.0034) [2024-04-26 14:59:45,415][49750] Updated weights for policy 0, policy_version 202481 (0.0031) [2024-04-26 14:59:47,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3317530624. Throughput: 0: 50680.0. Samples: 1070420360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 14:59:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 14:59:48,604][49750] Updated weights for policy 0, policy_version 202491 (0.0031) [2024-04-26 14:59:52,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3317760000. Throughput: 0: 50841.8. Samples: 1070579200. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:59:52,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 14:59:52,165][49750] Updated weights for policy 0, policy_version 202501 (0.0038) [2024-04-26 14:59:54,909][49750] Updated weights for policy 0, policy_version 202511 (0.0040) [2024-04-26 14:59:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3318038528. Throughput: 0: 50817.3. Samples: 1070881240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 14:59:57,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 14:59:58,556][49750] Updated weights for policy 0, policy_version 202521 (0.0034) [2024-04-26 15:00:01,392][49750] Updated weights for policy 0, policy_version 202531 (0.0036) [2024-04-26 15:00:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3318267904. Throughput: 0: 50746.3. Samples: 1071179180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:02,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 15:00:02,414][49728] Signal inference workers to stop experience collection... (16050 times) [2024-04-26 15:00:02,456][49750] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-04-26 15:00:02,489][49728] Signal inference workers to resume experience collection... (16050 times) [2024-04-26 15:00:02,490][49750] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-04-26 15:00:04,964][49750] Updated weights for policy 0, policy_version 202541 (0.0034) [2024-04-26 15:00:07,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3318530048. Throughput: 0: 50969.7. Samples: 1071338880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 15:00:07,956][49750] Updated weights for policy 0, policy_version 202551 (0.0036) [2024-04-26 15:00:11,368][49750] Updated weights for policy 0, policy_version 202561 (0.0030) [2024-04-26 15:00:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3318792192. Throughput: 0: 50817.7. Samples: 1071636340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:12,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 15:00:14,882][49750] Updated weights for policy 0, policy_version 202571 (0.0032) [2024-04-26 15:00:17,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3319054336. Throughput: 0: 50723.0. Samples: 1071935480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 15:00:17,813][49750] Updated weights for policy 0, policy_version 202581 (0.0036) [2024-04-26 15:00:21,185][49750] Updated weights for policy 0, policy_version 202591 (0.0035) [2024-04-26 15:00:22,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3319300096. Throughput: 0: 50900.0. Samples: 1072102500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:22,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 15:00:24,133][49750] Updated weights for policy 0, policy_version 202601 (0.0033) [2024-04-26 15:00:27,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3319545856. Throughput: 0: 50902.6. Samples: 1072407280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:27,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 15:00:27,565][49750] Updated weights for policy 0, policy_version 202611 (0.0029) [2024-04-26 15:00:30,631][49750] Updated weights for policy 0, policy_version 202621 (0.0031) [2024-04-26 15:00:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3319808000. Throughput: 0: 50864.9. Samples: 1072709280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:32,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 15:00:33,977][49750] Updated weights for policy 0, policy_version 202631 (0.0038) [2024-04-26 15:00:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3320053760. Throughput: 0: 50744.9. Samples: 1072862720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:37,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 15:00:37,101][49750] Updated weights for policy 0, policy_version 202641 (0.0033) [2024-04-26 15:00:40,406][49750] Updated weights for policy 0, policy_version 202651 (0.0032) [2024-04-26 15:00:42,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3320332288. Throughput: 0: 50703.0. Samples: 1073162880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:42,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 15:00:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000202657_3320332288.pth... [2024-04-26 15:00:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000201913_3308142592.pth [2024-04-26 15:00:43,469][49750] Updated weights for policy 0, policy_version 202661 (0.0030) [2024-04-26 15:00:46,954][49750] Updated weights for policy 0, policy_version 202671 (0.0033) [2024-04-26 15:00:47,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50818.1). 
Total num frames: 3320561664. Throughput: 0: 50896.3. Samples: 1073469520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:47,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 15:00:49,997][49750] Updated weights for policy 0, policy_version 202681 (0.0039) [2024-04-26 15:00:52,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3320807424. Throughput: 0: 50754.4. Samples: 1073622820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 15:00:52,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 15:00:53,262][49750] Updated weights for policy 0, policy_version 202691 (0.0033) [2024-04-26 15:00:56,564][49750] Updated weights for policy 0, policy_version 202701 (0.0035) [2024-04-26 15:00:57,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3321085952. Throughput: 0: 50942.7. Samples: 1073928760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:00:57,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 15:00:59,521][49750] Updated weights for policy 0, policy_version 202711 (0.0035) [2024-04-26 15:01:02,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3321331712. Throughput: 0: 50957.7. Samples: 1074228580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 15:01:02,909][49750] Updated weights for policy 0, policy_version 202721 (0.0028) [2024-04-26 15:01:06,078][49750] Updated weights for policy 0, policy_version 202731 (0.0031) [2024-04-26 15:01:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3321593856. Throughput: 0: 50855.2. Samples: 1074390980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:01:09,307][49750] Updated weights for policy 0, policy_version 202741 (0.0033) [2024-04-26 15:01:11,680][49728] Signal inference workers to stop experience collection... (16100 times) [2024-04-26 15:01:11,681][49728] Signal inference workers to resume experience collection... (16100 times) [2024-04-26 15:01:11,710][49750] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-04-26 15:01:11,710][49750] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-04-26 15:01:12,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3321839616. Throughput: 0: 50832.5. Samples: 1074694740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:12,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 15:01:12,656][49750] Updated weights for policy 0, policy_version 202751 (0.0033) [2024-04-26 15:01:15,680][49750] Updated weights for policy 0, policy_version 202761 (0.0028) [2024-04-26 15:01:17,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3322085376. Throughput: 0: 50849.7. Samples: 1074997520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 15:01:18,882][49750] Updated weights for policy 0, policy_version 202771 (0.0032) [2024-04-26 15:01:21,991][49750] Updated weights for policy 0, policy_version 202781 (0.0033) [2024-04-26 15:01:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 3322363904. Throughput: 0: 50824.7. Samples: 1075149840. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:01:25,332][49750] Updated weights for policy 0, policy_version 202791 (0.0032) [2024-04-26 15:01:27,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 3322626048. Throughput: 0: 50911.4. Samples: 1075453900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:27,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 15:01:28,545][49750] Updated weights for policy 0, policy_version 202801 (0.0030) [2024-04-26 15:01:31,847][49750] Updated weights for policy 0, policy_version 202811 (0.0037) [2024-04-26 15:01:32,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3322855424. Throughput: 0: 50999.5. Samples: 1075764500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:32,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 15:01:34,926][49750] Updated weights for policy 0, policy_version 202821 (0.0035) [2024-04-26 15:01:37,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3323101184. Throughput: 0: 50814.7. Samples: 1075909480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 15:01:38,271][49750] Updated weights for policy 0, policy_version 202831 (0.0033) [2024-04-26 15:01:41,423][49750] Updated weights for policy 0, policy_version 202841 (0.0035) [2024-04-26 15:01:42,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3323363328. Throughput: 0: 50859.5. Samples: 1076217440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 15:01:44,601][49750] Updated weights for policy 0, policy_version 202851 (0.0035) [2024-04-26 15:01:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3323609088. Throughput: 0: 50947.8. Samples: 1076521220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:47,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 15:01:47,963][49750] Updated weights for policy 0, policy_version 202861 (0.0040) [2024-04-26 15:01:50,980][49750] Updated weights for policy 0, policy_version 202871 (0.0028) [2024-04-26 15:01:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3323871232. Throughput: 0: 50803.0. Samples: 1076677120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:52,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 15:01:54,371][49750] Updated weights for policy 0, policy_version 202881 (0.0033) [2024-04-26 15:01:57,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3324133376. Throughput: 0: 50797.3. Samples: 1076980620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:01:57,064][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 15:01:57,520][49750] Updated weights for policy 0, policy_version 202891 (0.0034) [2024-04-26 15:02:01,054][49750] Updated weights for policy 0, policy_version 202901 (0.0028) [2024-04-26 15:02:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3324379136. Throughput: 0: 50933.5. Samples: 1077289520. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-26 15:02:02,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 15:02:03,970][49750] Updated weights for policy 0, policy_version 202911 (0.0032) [2024-04-26 15:02:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3324624896. Throughput: 0: 50861.5. Samples: 1077438600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:07,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 15:02:07,545][49750] Updated weights for policy 0, policy_version 202921 (0.0033) [2024-04-26 15:02:10,261][49750] Updated weights for policy 0, policy_version 202931 (0.0030) [2024-04-26 15:02:12,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.5, 300 sec: 50929.4). Total num frames: 3324919808. Throughput: 0: 50840.5. Samples: 1077741720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:12,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 15:02:14,081][49750] Updated weights for policy 0, policy_version 202941 (0.0026) [2024-04-26 15:02:16,733][49750] Updated weights for policy 0, policy_version 202951 (0.0029) [2024-04-26 15:02:17,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3325149184. Throughput: 0: 50759.2. Samples: 1078048660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:17,063][49517] Avg episode reward: [(0, '0.448')] [2024-04-26 15:02:20,365][49750] Updated weights for policy 0, policy_version 202961 (0.0029) [2024-04-26 15:02:22,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3325394944. Throughput: 0: 50897.0. Samples: 1078199840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:22,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 15:02:23,510][49750] Updated weights for policy 0, policy_version 202971 (0.0028) [2024-04-26 15:02:26,833][49750] Updated weights for policy 0, policy_version 202981 (0.0037) [2024-04-26 15:02:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3325657088. Throughput: 0: 50739.1. Samples: 1078500700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:27,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 15:02:28,175][49728] Signal inference workers to stop experience collection... (16150 times) [2024-04-26 15:02:28,217][49750] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-04-26 15:02:28,280][49728] Signal inference workers to resume experience collection... (16150 times) [2024-04-26 15:02:28,281][49750] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-04-26 15:02:30,023][49750] Updated weights for policy 0, policy_version 202991 (0.0033) [2024-04-26 15:02:32,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3325935616. Throughput: 0: 50833.7. Samples: 1078808740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 15:02:33,331][49750] Updated weights for policy 0, policy_version 203001 (0.0032) [2024-04-26 15:02:36,448][49750] Updated weights for policy 0, policy_version 203011 (0.0032) [2024-04-26 15:02:37,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 3326181376. Throughput: 0: 50953.3. Samples: 1078970020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:37,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 15:02:39,625][49750] Updated weights for policy 0, policy_version 203021 (0.0029) [2024-04-26 15:02:42,063][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3326427136. Throughput: 0: 50853.3. Samples: 1079269020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:42,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 15:02:42,069][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000203029_3326427136.pth... [2024-04-26 15:02:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000202284_3314221056.pth [2024-04-26 15:02:42,888][49750] Updated weights for policy 0, policy_version 203031 (0.0028) [2024-04-26 15:02:45,906][49750] Updated weights for policy 0, policy_version 203041 (0.0032) [2024-04-26 15:02:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3326672896. Throughput: 0: 50736.5. Samples: 1079572660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:47,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 15:02:49,441][49750] Updated weights for policy 0, policy_version 203051 (0.0032) [2024-04-26 15:02:52,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3326918656. Throughput: 0: 50755.2. Samples: 1079722580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:52,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:02:52,816][49750] Updated weights for policy 0, policy_version 203061 (0.0032) [2024-04-26 15:02:55,873][49750] Updated weights for policy 0, policy_version 203071 (0.0034) [2024-04-26 15:02:57,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3327197184. Throughput: 0: 50828.0. Samples: 1080028980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:02:57,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 15:02:59,206][49750] Updated weights for policy 0, policy_version 203081 (0.0028) [2024-04-26 15:03:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50762.8). Total num frames: 3327426560. Throughput: 0: 50728.5. Samples: 1080331440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:03:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 15:03:02,140][49750] Updated weights for policy 0, policy_version 203091 (0.0030) [2024-04-26 15:03:05,498][49750] Updated weights for policy 0, policy_version 203101 (0.0028) [2024-04-26 15:03:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3327705088. Throughput: 0: 50951.5. Samples: 1080492660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:03:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 15:03:08,601][49750] Updated weights for policy 0, policy_version 203111 (0.0038) [2024-04-26 15:03:11,895][49750] Updated weights for policy 0, policy_version 203121 (0.0030) [2024-04-26 15:03:12,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 3327934464. Throughput: 0: 50881.7. Samples: 1080790380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:12,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 15:03:15,117][49750] Updated weights for policy 0, policy_version 203131 (0.0029) [2024-04-26 15:03:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3328196608. Throughput: 0: 50709.4. Samples: 1081090660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:17,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:03:18,468][49750] Updated weights for policy 0, policy_version 203141 (0.0035) [2024-04-26 15:03:21,565][49750] Updated weights for policy 0, policy_version 203151 (0.0031) [2024-04-26 15:03:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3328458752. Throughput: 0: 50730.7. Samples: 1081252900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:03:22,338][49728] Signal inference workers to stop experience collection... (16200 times) [2024-04-26 15:03:22,338][49728] Signal inference workers to resume experience collection... (16200 times) [2024-04-26 15:03:22,364][49750] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-04-26 15:03:22,365][49750] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-04-26 15:03:24,750][49750] Updated weights for policy 0, policy_version 203161 (0.0030) [2024-04-26 15:03:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3328688128. Throughput: 0: 50880.6. Samples: 1081558640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:27,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 15:03:28,144][49750] Updated weights for policy 0, policy_version 203171 (0.0027) [2024-04-26 15:03:31,221][49750] Updated weights for policy 0, policy_version 203181 (0.0027) [2024-04-26 15:03:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3328950272. Throughput: 0: 50867.0. Samples: 1081861680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 15:03:34,610][49750] Updated weights for policy 0, policy_version 203191 (0.0031) [2024-04-26 15:03:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3329212416. Throughput: 0: 50931.6. Samples: 1082014500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:37,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:03:37,723][49750] Updated weights for policy 0, policy_version 203201 (0.0031) [2024-04-26 15:03:40,924][49750] Updated weights for policy 0, policy_version 203211 (0.0029) [2024-04-26 15:03:42,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3329474560. Throughput: 0: 50713.4. Samples: 1082311080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 15:03:44,164][49750] Updated weights for policy 0, policy_version 203221 (0.0033) [2024-04-26 15:03:47,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3329703936. Throughput: 0: 50708.7. Samples: 1082613340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:47,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 15:03:47,549][49750] Updated weights for policy 0, policy_version 203231 (0.0032) [2024-04-26 15:03:50,594][49750] Updated weights for policy 0, policy_version 203241 (0.0032) [2024-04-26 15:03:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3329966080. Throughput: 0: 50457.0. Samples: 1082763220. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 15:03:53,943][49750] Updated weights for policy 0, policy_version 203251 (0.0037) [2024-04-26 15:03:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3330211840. Throughput: 0: 50580.9. Samples: 1083066520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:03:57,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 15:03:57,364][49750] Updated weights for policy 0, policy_version 203261 (0.0029) [2024-04-26 15:04:00,306][49750] Updated weights for policy 0, policy_version 203271 (0.0031) [2024-04-26 15:04:02,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3330490368. Throughput: 0: 50590.5. Samples: 1083367240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:04:02,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 15:04:03,769][49750] Updated weights for policy 0, policy_version 203281 (0.0036) [2024-04-26 15:04:06,805][49750] Updated weights for policy 0, policy_version 203291 (0.0031) [2024-04-26 15:04:07,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 3330736128. Throughput: 0: 50531.5. Samples: 1083526820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:04:07,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 15:04:10,133][49750] Updated weights for policy 0, policy_version 203301 (0.0030) [2024-04-26 15:04:12,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3330965504. Throughput: 0: 50502.6. Samples: 1083831260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:04:12,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 15:04:13,189][49750] Updated weights for policy 0, policy_version 203311 (0.0036) [2024-04-26 15:04:15,851][49728] Signal inference workers to stop experience collection... (16250 times) [2024-04-26 15:04:15,887][49750] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-04-26 15:04:15,922][49728] Signal inference workers to resume experience collection... (16250 times) [2024-04-26 15:04:15,923][49750] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-04-26 15:04:16,432][49750] Updated weights for policy 0, policy_version 203321 (0.0027) [2024-04-26 15:04:17,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3331227648. Throughput: 0: 50393.8. Samples: 1084129400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:04:17,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 15:04:19,731][49750] Updated weights for policy 0, policy_version 203331 (0.0030) [2024-04-26 15:04:22,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3331473408. Throughput: 0: 50536.7. Samples: 1084288660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:04:22,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 15:04:22,876][49750] Updated weights for policy 0, policy_version 203341 (0.0033) [2024-04-26 15:04:26,244][49750] Updated weights for policy 0, policy_version 203351 (0.0031) [2024-04-26 15:04:27,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3331751936. Throughput: 0: 50741.2. Samples: 1084594440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:04:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 15:04:29,517][49750] Updated weights for policy 0, policy_version 203361 (0.0029) [2024-04-26 15:04:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3331964928. Throughput: 0: 50717.1. Samples: 1084895600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:04:32,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:04:32,674][49750] Updated weights for policy 0, policy_version 203371 (0.0030) [2024-04-26 15:04:35,776][49750] Updated weights for policy 0, policy_version 203381 (0.0035) [2024-04-26 15:04:37,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3332243456. Throughput: 0: 50563.9. Samples: 1085038600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:04:37,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 15:04:39,105][49750] Updated weights for policy 0, policy_version 203391 (0.0028) [2024-04-26 15:04:42,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3332505600. Throughput: 0: 50700.6. Samples: 1085348040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:04:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 15:04:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000203401_3332521984.pth... [2024-04-26 15:04:42,079][49750] Updated weights for policy 0, policy_version 203401 (0.0036) [2024-04-26 15:04:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000202657_3320332288.pth [2024-04-26 15:04:45,493][49750] Updated weights for policy 0, policy_version 203411 (0.0028) [2024-04-26 15:04:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3332767744. Throughput: 0: 50791.7. Samples: 1085652860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:04:47,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 15:04:48,794][49750] Updated weights for policy 0, policy_version 203421 (0.0028) [2024-04-26 15:04:51,928][49750] Updated weights for policy 0, policy_version 203431 (0.0031) [2024-04-26 15:04:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3333013504. Throughput: 0: 50596.5. Samples: 1085803660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:04:52,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 15:04:55,228][49750] Updated weights for policy 0, policy_version 203441 (0.0031) [2024-04-26 15:04:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3333259264. Throughput: 0: 50684.9. Samples: 1086112080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:04:57,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 15:04:58,406][49750] Updated weights for policy 0, policy_version 203451 (0.0028) [2024-04-26 15:05:01,799][49750] Updated weights for policy 0, policy_version 203461 (0.0031) [2024-04-26 15:05:02,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3333505024. Throughput: 0: 50907.0. Samples: 1086420220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:05:02,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 15:05:04,829][49750] Updated weights for policy 0, policy_version 203471 (0.0033) [2024-04-26 15:05:07,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3333767168. Throughput: 0: 50678.1. Samples: 1086569180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:05:07,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 15:05:08,117][49750] Updated weights for policy 0, policy_version 203481 (0.0030) [2024-04-26 15:05:11,256][49750] Updated weights for policy 0, policy_version 203491 (0.0030) [2024-04-26 15:05:12,063][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3334045696. Throughput: 0: 50729.4. Samples: 1086877260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:05:12,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 15:05:14,597][49750] Updated weights for policy 0, policy_version 203501 (0.0041) [2024-04-26 15:05:17,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3334275072. Throughput: 0: 50826.1. Samples: 1087182780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:05:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 15:05:17,779][49750] Updated weights for policy 0, policy_version 203511 (0.0039) [2024-04-26 15:05:21,061][49750] Updated weights for policy 0, policy_version 203521 (0.0033) [2024-04-26 15:05:22,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3334520832. Throughput: 0: 50815.5. Samples: 1087325300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 15:05:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 15:05:24,231][49750] Updated weights for policy 0, policy_version 203531 (0.0028) [2024-04-26 15:05:27,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 3334782976. Throughput: 0: 50663.2. Samples: 1087627880. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:05:27,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 15:05:27,392][49750] Updated weights for policy 0, policy_version 203541 (0.0030) [2024-04-26 15:05:30,666][49750] Updated weights for policy 0, policy_version 203551 (0.0032) [2024-04-26 15:05:32,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3335028736. Throughput: 0: 50794.7. Samples: 1087938620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:05:32,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 15:05:33,746][49750] Updated weights for policy 0, policy_version 203561 (0.0027) [2024-04-26 15:05:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3335290880. Throughput: 0: 50847.2. Samples: 1088091780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:05:37,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 15:05:37,147][49750] Updated weights for policy 0, policy_version 203571 (0.0037) [2024-04-26 15:05:40,024][49728] Signal inference workers to stop experience collection... (16300 times) [2024-04-26 15:05:40,060][49750] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-04-26 15:05:40,096][49728] Signal inference workers to resume experience collection... (16300 times) [2024-04-26 15:05:40,096][49750] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-04-26 15:05:40,223][49750] Updated weights for policy 0, policy_version 203581 (0.0028) [2024-04-26 15:05:42,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3335536640. Throughput: 0: 50646.9. Samples: 1088391200. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:05:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 15:05:43,414][49750] Updated weights for policy 0, policy_version 203591 (0.0035) [2024-04-26 15:05:46,747][49750] Updated weights for policy 0, policy_version 203601 (0.0031) [2024-04-26 15:05:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3335798784. Throughput: 0: 50567.7. Samples: 1088695760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:05:47,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 15:05:49,873][49750] Updated weights for policy 0, policy_version 203611 (0.0038) [2024-04-26 15:05:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3336044544. Throughput: 0: 50596.0. Samples: 1088846000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:05:52,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 15:05:53,561][49750] Updated weights for policy 0, policy_version 203621 (0.0037) [2024-04-26 15:05:56,404][49750] Updated weights for policy 0, policy_version 203631 (0.0042) [2024-04-26 15:05:57,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3336306688. Throughput: 0: 50582.2. Samples: 1089153460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:05:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 15:05:59,952][49750] Updated weights for policy 0, policy_version 203641 (0.0032) [2024-04-26 15:06:02,062][49517] Fps is (10 sec: 52430.4, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 3336568832. Throughput: 0: 50568.2. Samples: 1089458340. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:06:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 15:06:02,968][49750] Updated weights for policy 0, policy_version 203651 (0.0032) [2024-04-26 15:06:06,343][49750] Updated weights for policy 0, policy_version 203661 (0.0037) [2024-04-26 15:06:07,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3336798208. Throughput: 0: 50666.8. Samples: 1089605300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:06:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:06:09,221][49750] Updated weights for policy 0, policy_version 203671 (0.0032) [2024-04-26 15:06:12,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3337076736. Throughput: 0: 50767.5. Samples: 1089912420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:06:12,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 15:06:12,814][49750] Updated weights for policy 0, policy_version 203681 (0.0029) [2024-04-26 15:06:15,613][49750] Updated weights for policy 0, policy_version 203691 (0.0029) [2024-04-26 15:06:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3337322496. Throughput: 0: 50688.0. Samples: 1090219580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:06:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 15:06:19,201][49750] Updated weights for policy 0, policy_version 203701 (0.0033) [2024-04-26 15:06:22,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3337568256. Throughput: 0: 50702.5. Samples: 1090373400. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:06:22,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 15:06:22,387][49750] Updated weights for policy 0, policy_version 203711 (0.0032) [2024-04-26 15:06:25,636][49750] Updated weights for policy 0, policy_version 203721 (0.0029) [2024-04-26 15:06:27,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3337814016. Throughput: 0: 50713.9. Samples: 1090673320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:06:27,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 15:06:28,819][49750] Updated weights for policy 0, policy_version 203731 (0.0026) [2024-04-26 15:06:31,983][49750] Updated weights for policy 0, policy_version 203741 (0.0037) [2024-04-26 15:06:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3338092544. Throughput: 0: 50666.7. Samples: 1090975760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-26 15:06:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 15:06:35,230][49750] Updated weights for policy 0, policy_version 203751 (0.0032) [2024-04-26 15:06:37,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3338338304. Throughput: 0: 50724.0. Samples: 1091128580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:06:37,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 15:06:38,471][49750] Updated weights for policy 0, policy_version 203761 (0.0028) [2024-04-26 15:06:41,712][49750] Updated weights for policy 0, policy_version 203771 (0.0030) [2024-04-26 15:06:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 3338600448. Throughput: 0: 50711.5. Samples: 1091435480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:06:42,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 15:06:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000203772_3338600448.pth... [2024-04-26 15:06:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000203029_3326427136.pth [2024-04-26 15:06:44,835][49750] Updated weights for policy 0, policy_version 203781 (0.0024) [2024-04-26 15:06:47,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3338846208. Throughput: 0: 50703.3. Samples: 1091740000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:06:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 15:06:48,157][49750] Updated weights for policy 0, policy_version 203791 (0.0031) [2024-04-26 15:06:51,274][49750] Updated weights for policy 0, policy_version 203801 (0.0033) [2024-04-26 15:06:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3339091968. Throughput: 0: 50855.3. Samples: 1091893800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:06:52,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 15:06:54,578][49750] Updated weights for policy 0, policy_version 203811 (0.0035) [2024-04-26 15:06:57,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3339337728. Throughput: 0: 50861.6. Samples: 1092201200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:06:57,064][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 15:06:58,116][49750] Updated weights for policy 0, policy_version 203821 (0.0038) [2024-04-26 15:07:00,104][49728] Signal inference workers to stop experience collection... (16350 times) [2024-04-26 15:07:00,104][49728] Signal inference workers to resume experience collection... 
(16350 times) [2024-04-26 15:07:00,147][49750] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-04-26 15:07:00,147][49750] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-04-26 15:07:00,885][49750] Updated weights for policy 0, policy_version 203831 (0.0030) [2024-04-26 15:07:02,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3339599872. Throughput: 0: 50724.9. Samples: 1092502200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:02,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 15:07:04,596][49750] Updated weights for policy 0, policy_version 203841 (0.0038) [2024-04-26 15:07:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 3339862016. Throughput: 0: 50801.4. Samples: 1092659460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:07,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 15:07:07,349][49750] Updated weights for policy 0, policy_version 203851 (0.0035) [2024-04-26 15:07:10,896][49750] Updated weights for policy 0, policy_version 203861 (0.0032) [2024-04-26 15:07:12,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3340107776. Throughput: 0: 50761.4. Samples: 1092957580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:12,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:07:13,919][49750] Updated weights for policy 0, policy_version 203871 (0.0034) [2024-04-26 15:07:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3340369920. Throughput: 0: 50868.3. Samples: 1093264840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:17,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:07:17,210][49750] Updated weights for policy 0, policy_version 203881 (0.0028) [2024-04-26 15:07:20,220][49750] Updated weights for policy 0, policy_version 203891 (0.0030) [2024-04-26 15:07:22,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3340615680. Throughput: 0: 50868.2. Samples: 1093417640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:22,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 15:07:23,791][49750] Updated weights for policy 0, policy_version 203901 (0.0028) [2024-04-26 15:07:26,763][49750] Updated weights for policy 0, policy_version 203911 (0.0034) [2024-04-26 15:07:27,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 3340877824. Throughput: 0: 50740.6. Samples: 1093718800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:27,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 15:07:30,347][49750] Updated weights for policy 0, policy_version 203921 (0.0032) [2024-04-26 15:07:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3341123584. Throughput: 0: 50701.5. Samples: 1094021560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:32,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 15:07:33,192][49750] Updated weights for policy 0, policy_version 203931 (0.0033) [2024-04-26 15:07:36,679][49750] Updated weights for policy 0, policy_version 203941 (0.0027) [2024-04-26 15:07:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3341385728. Throughput: 0: 50782.0. Samples: 1094178980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:37,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 15:07:39,682][49750] Updated weights for policy 0, policy_version 203951 (0.0028) [2024-04-26 15:07:42,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 3341615104. Throughput: 0: 50793.1. Samples: 1094486880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 15:07:43,027][49750] Updated weights for policy 0, policy_version 203961 (0.0034) [2024-04-26 15:07:46,063][49750] Updated weights for policy 0, policy_version 203971 (0.0037) [2024-04-26 15:07:47,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3341860864. Throughput: 0: 50682.1. Samples: 1094782900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:47,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 15:07:49,591][49750] Updated weights for policy 0, policy_version 203981 (0.0030) [2024-04-26 15:07:52,063][49517] Fps is (10 sec: 54065.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3342155776. Throughput: 0: 50589.6. Samples: 1094936000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 15:07:52,368][49750] Updated weights for policy 0, policy_version 203991 (0.0033) [2024-04-26 15:07:56,218][49750] Updated weights for policy 0, policy_version 204001 (0.0037) [2024-04-26 15:07:57,062][49517] Fps is (10 sec: 55705.9, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3342417920. Throughput: 0: 50961.8. Samples: 1095250860. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:07:57,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 15:07:58,776][49750] Updated weights for policy 0, policy_version 204011 (0.0030) [2024-04-26 15:08:02,062][49517] Fps is (10 sec: 47514.9, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3342630912. Throughput: 0: 50915.3. Samples: 1095556020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:08:02,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 15:08:02,750][49750] Updated weights for policy 0, policy_version 204021 (0.0034) [2024-04-26 15:08:05,166][49750] Updated weights for policy 0, policy_version 204031 (0.0031) [2024-04-26 15:08:07,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3342893056. Throughput: 0: 50636.8. Samples: 1095696300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:08:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 15:08:09,162][49750] Updated weights for policy 0, policy_version 204041 (0.0029) [2024-04-26 15:08:11,690][49750] Updated weights for policy 0, policy_version 204051 (0.0028) [2024-04-26 15:08:12,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3343171584. Throughput: 0: 50758.9. Samples: 1096002960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:08:12,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 15:08:15,539][49728] Signal inference workers to stop experience collection... (16400 times) [2024-04-26 15:08:15,596][49750] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-04-26 15:08:15,603][49728] Signal inference workers to resume experience collection... 
(16400 times) [2024-04-26 15:08:15,611][49750] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-04-26 15:08:15,615][49750] Updated weights for policy 0, policy_version 204061 (0.0029) [2024-04-26 15:08:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3343417344. Throughput: 0: 50762.6. Samples: 1096305880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:08:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 15:08:18,146][49750] Updated weights for policy 0, policy_version 204071 (0.0032) [2024-04-26 15:08:21,829][49750] Updated weights for policy 0, policy_version 204081 (0.0032) [2024-04-26 15:08:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3343679488. Throughput: 0: 50940.9. Samples: 1096471320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:08:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 15:08:24,810][49750] Updated weights for policy 0, policy_version 204091 (0.0032) [2024-04-26 15:08:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3343908864. Throughput: 0: 50904.4. Samples: 1096777580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:08:27,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:08:28,219][49750] Updated weights for policy 0, policy_version 204101 (0.0032) [2024-04-26 15:08:31,748][49750] Updated weights for policy 0, policy_version 204111 (0.0029) [2024-04-26 15:08:32,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 3344154624. Throughput: 0: 51041.8. Samples: 1097079780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:08:32,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 15:08:34,726][49750] Updated weights for policy 0, policy_version 204121 (0.0028) [2024-04-26 15:08:37,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3344449536. Throughput: 0: 51044.3. Samples: 1097232980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:08:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 15:08:38,652][49750] Updated weights for policy 0, policy_version 204131 (0.0036) [2024-04-26 15:08:41,091][49750] Updated weights for policy 0, policy_version 204141 (0.0027) [2024-04-26 15:08:42,063][49517] Fps is (10 sec: 55705.3, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 3344711680. Throughput: 0: 50880.8. Samples: 1097540500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:08:42,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 15:08:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000204145_3344711680.pth... [2024-04-26 15:08:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000203401_3332521984.pth [2024-04-26 15:08:44,945][49750] Updated weights for policy 0, policy_version 204151 (0.0032) [2024-04-26 15:08:47,062][49517] Fps is (10 sec: 47513.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3344924672. Throughput: 0: 50823.0. Samples: 1097843060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:08:47,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 15:08:47,591][49750] Updated weights for policy 0, policy_version 204161 (0.0032) [2024-04-26 15:08:51,248][49750] Updated weights for policy 0, policy_version 204171 (0.0034) [2024-04-26 15:08:52,062][49517] Fps is (10 sec: 44237.2, 60 sec: 49971.4, 300 sec: 50651.6). Total num frames: 3345154048. Throughput: 0: 50790.3. Samples: 1097981860. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:08:52,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 15:08:53,927][49750] Updated weights for policy 0, policy_version 204181 (0.0030) [2024-04-26 15:08:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3345448960. Throughput: 0: 50856.2. Samples: 1098291480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:08:57,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 15:08:57,657][49750] Updated weights for policy 0, policy_version 204191 (0.0031) [2024-04-26 15:09:00,535][49750] Updated weights for policy 0, policy_version 204201 (0.0031) [2024-04-26 15:09:02,063][49517] Fps is (10 sec: 57343.6, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 3345727488. Throughput: 0: 50915.5. Samples: 1098597080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:02,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 15:09:04,055][49750] Updated weights for policy 0, policy_version 204211 (0.0035) [2024-04-26 15:09:06,908][49750] Updated weights for policy 0, policy_version 204221 (0.0029) [2024-04-26 15:09:07,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3345956864. Throughput: 0: 50889.3. Samples: 1098761340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:07,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 15:09:10,357][49750] Updated weights for policy 0, policy_version 204231 (0.0029) [2024-04-26 15:09:12,062][49517] Fps is (10 sec: 45875.6, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3346186240. Throughput: 0: 50808.9. Samples: 1099063980. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 15:09:13,323][49750] Updated weights for policy 0, policy_version 204241 (0.0040) [2024-04-26 15:09:16,690][49750] Updated weights for policy 0, policy_version 204251 (0.0030) [2024-04-26 15:09:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3346448384. Throughput: 0: 50864.8. Samples: 1099368700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:17,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:09:19,838][49750] Updated weights for policy 0, policy_version 204261 (0.0029) [2024-04-26 15:09:19,910][49728] Signal inference workers to stop experience collection... (16450 times) [2024-04-26 15:09:19,958][49750] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-04-26 15:09:19,981][49728] Signal inference workers to resume experience collection... (16450 times) [2024-04-26 15:09:19,982][49750] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-04-26 15:09:22,062][49517] Fps is (10 sec: 55705.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3346743296. Throughput: 0: 50822.6. Samples: 1099520000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 15:09:23,154][49750] Updated weights for policy 0, policy_version 204271 (0.0029) [2024-04-26 15:09:26,253][49750] Updated weights for policy 0, policy_version 204281 (0.0032) [2024-04-26 15:09:27,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 3346989056. Throughput: 0: 50787.7. Samples: 1099825940. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:27,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 15:09:29,624][49750] Updated weights for policy 0, policy_version 204291 (0.0034) [2024-04-26 15:09:32,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3347202048. Throughput: 0: 50789.0. Samples: 1100128560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:32,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 15:09:32,633][49750] Updated weights for policy 0, policy_version 204301 (0.0035) [2024-04-26 15:09:36,339][49750] Updated weights for policy 0, policy_version 204311 (0.0034) [2024-04-26 15:09:37,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 3347447808. Throughput: 0: 50866.7. Samples: 1100270860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:37,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 15:09:39,093][49750] Updated weights for policy 0, policy_version 204321 (0.0027) [2024-04-26 15:09:42,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3347726336. Throughput: 0: 50692.3. Samples: 1100572640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:42,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 15:09:42,762][49750] Updated weights for policy 0, policy_version 204331 (0.0033) [2024-04-26 15:09:45,634][49750] Updated weights for policy 0, policy_version 204341 (0.0033) [2024-04-26 15:09:47,063][49517] Fps is (10 sec: 55704.5, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 3348004864. Throughput: 0: 50673.2. Samples: 1100877380. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 15:09:49,089][49750] Updated weights for policy 0, policy_version 204351 (0.0031) [2024-04-26 15:09:52,010][49750] Updated weights for policy 0, policy_version 204361 (0.0025) [2024-04-26 15:09:52,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51609.4, 300 sec: 50818.1). Total num frames: 3348250624. Throughput: 0: 50766.5. Samples: 1101045840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 15:09:52,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:09:55,677][49750] Updated weights for policy 0, policy_version 204371 (0.0033) [2024-04-26 15:09:57,063][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3348463616. Throughput: 0: 50664.7. Samples: 1101343900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:09:57,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 15:09:58,581][49750] Updated weights for policy 0, policy_version 204381 (0.0029) [2024-04-26 15:10:02,053][49750] Updated weights for policy 0, policy_version 204391 (0.0025) [2024-04-26 15:10:02,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3348742144. Throughput: 0: 50488.0. Samples: 1101640660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:02,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 15:10:05,010][49750] Updated weights for policy 0, policy_version 204401 (0.0032) [2024-04-26 15:10:07,063][49517] Fps is (10 sec: 55705.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3349020672. Throughput: 0: 50786.2. Samples: 1101805380. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:07,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 15:10:08,401][49750] Updated weights for policy 0, policy_version 204411 (0.0028) [2024-04-26 15:10:11,434][49750] Updated weights for policy 0, policy_version 204421 (0.0030) [2024-04-26 15:10:12,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 3349282816. Throughput: 0: 50636.4. Samples: 1102104580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:12,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 15:10:14,789][49750] Updated weights for policy 0, policy_version 204431 (0.0032) [2024-04-26 15:10:17,063][49517] Fps is (10 sec: 45875.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3349479424. Throughput: 0: 50624.7. Samples: 1102406680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:17,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 15:10:17,938][49750] Updated weights for policy 0, policy_version 204441 (0.0029) [2024-04-26 15:10:21,339][49750] Updated weights for policy 0, policy_version 204451 (0.0033) [2024-04-26 15:10:22,063][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3349741568. Throughput: 0: 50587.5. Samples: 1102547300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 15:10:24,359][49750] Updated weights for policy 0, policy_version 204461 (0.0033) [2024-04-26 15:10:27,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3350020096. Throughput: 0: 50728.9. Samples: 1102855440. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:27,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 15:10:27,918][49750] Updated weights for policy 0, policy_version 204471 (0.0030) [2024-04-26 15:10:28,762][49728] Signal inference workers to stop experience collection... (16500 times) [2024-04-26 15:10:28,762][49728] Signal inference workers to resume experience collection... (16500 times) [2024-04-26 15:10:28,775][49750] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-04-26 15:10:28,776][49750] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-04-26 15:10:30,865][49750] Updated weights for policy 0, policy_version 204481 (0.0032) [2024-04-26 15:10:32,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3350282240. Throughput: 0: 50643.4. Samples: 1103156320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:32,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 15:10:34,275][49750] Updated weights for policy 0, policy_version 204491 (0.0029) [2024-04-26 15:10:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.4, 300 sec: 50762.7). Total num frames: 3350511616. Throughput: 0: 50510.4. Samples: 1103318800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:37,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 15:10:37,248][49750] Updated weights for policy 0, policy_version 204501 (0.0029) [2024-04-26 15:10:40,935][49750] Updated weights for policy 0, policy_version 204511 (0.0036) [2024-04-26 15:10:42,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3350757376. Throughput: 0: 50628.8. Samples: 1103622200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:42,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 15:10:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000204514_3350757376.pth... 
[2024-04-26 15:10:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000203772_3338600448.pth [2024-04-26 15:10:43,696][49750] Updated weights for policy 0, policy_version 204521 (0.0031) [2024-04-26 15:10:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.4, 300 sec: 50707.1). Total num frames: 3351003136. Throughput: 0: 50598.4. Samples: 1103917580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:47,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 15:10:47,355][49750] Updated weights for policy 0, policy_version 204531 (0.0033) [2024-04-26 15:10:50,326][49750] Updated weights for policy 0, policy_version 204541 (0.0033) [2024-04-26 15:10:52,062][49517] Fps is (10 sec: 54068.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3351298048. Throughput: 0: 50576.0. Samples: 1104081300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:52,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 15:10:53,749][49750] Updated weights for policy 0, policy_version 204551 (0.0033) [2024-04-26 15:10:56,665][49750] Updated weights for policy 0, policy_version 204561 (0.0036) [2024-04-26 15:10:57,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.7, 300 sec: 50762.6). Total num frames: 3351543808. Throughput: 0: 50763.2. Samples: 1104388920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 15:10:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 15:11:00,060][49750] Updated weights for policy 0, policy_version 204571 (0.0027) [2024-04-26 15:11:02,063][49517] Fps is (10 sec: 45874.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3351756800. Throughput: 0: 50811.5. Samples: 1104693200. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:02,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 15:11:02,996][49750] Updated weights for policy 0, policy_version 204581 (0.0029) [2024-04-26 15:11:06,365][49750] Updated weights for policy 0, policy_version 204591 (0.0028) [2024-04-26 15:11:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3352035328. Throughput: 0: 50924.1. Samples: 1104838880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 15:11:09,435][49750] Updated weights for policy 0, policy_version 204601 (0.0028) [2024-04-26 15:11:12,062][49517] Fps is (10 sec: 54067.8, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3352297472. Throughput: 0: 50716.9. Samples: 1105137700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:12,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 15:11:12,814][49750] Updated weights for policy 0, policy_version 204611 (0.0032) [2024-04-26 15:11:15,611][49728] Signal inference workers to stop experience collection... (16550 times) [2024-04-26 15:11:15,615][49728] Signal inference workers to resume experience collection... (16550 times) [2024-04-26 15:11:15,637][49750] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-04-26 15:11:15,637][49750] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-04-26 15:11:15,897][49750] Updated weights for policy 0, policy_version 204621 (0.0027) [2024-04-26 15:11:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3352543232. Throughput: 0: 50914.2. Samples: 1105447460. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:17,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 15:11:19,483][49750] Updated weights for policy 0, policy_version 204631 (0.0036) [2024-04-26 15:11:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3352805376. Throughput: 0: 50749.4. Samples: 1105602520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:11:22,274][49750] Updated weights for policy 0, policy_version 204641 (0.0030) [2024-04-26 15:11:25,793][49750] Updated weights for policy 0, policy_version 204651 (0.0032) [2024-04-26 15:11:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3353034752. Throughput: 0: 50792.3. Samples: 1105907840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:27,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 15:11:28,653][49750] Updated weights for policy 0, policy_version 204661 (0.0032) [2024-04-26 15:11:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3353313280. Throughput: 0: 50905.8. Samples: 1106208340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:32,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 15:11:32,099][49750] Updated weights for policy 0, policy_version 204671 (0.0035) [2024-04-26 15:11:35,187][49750] Updated weights for policy 0, policy_version 204681 (0.0031) [2024-04-26 15:11:37,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3353575424. Throughput: 0: 50847.7. Samples: 1106369440. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 15:11:38,931][49750] Updated weights for policy 0, policy_version 204691 (0.0035) [2024-04-26 15:11:41,655][49750] Updated weights for policy 0, policy_version 204701 (0.0023) [2024-04-26 15:11:42,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3353837568. Throughput: 0: 50754.0. Samples: 1106672860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:42,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 15:11:45,472][49750] Updated weights for policy 0, policy_version 204711 (0.0034) [2024-04-26 15:11:47,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3354050560. Throughput: 0: 50781.5. Samples: 1106978360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:47,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 15:11:48,227][49750] Updated weights for policy 0, policy_version 204721 (0.0030) [2024-04-26 15:11:51,986][49750] Updated weights for policy 0, policy_version 204731 (0.0030) [2024-04-26 15:11:52,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 3354312704. Throughput: 0: 50652.4. Samples: 1107118240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:52,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 15:11:54,614][49750] Updated weights for policy 0, policy_version 204741 (0.0026) [2024-04-26 15:11:57,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3354607616. Throughput: 0: 50933.3. Samples: 1107429700. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:11:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 15:11:58,309][49750] Updated weights for policy 0, policy_version 204751 (0.0029) [2024-04-26 15:12:01,000][49750] Updated weights for policy 0, policy_version 204761 (0.0033) [2024-04-26 15:12:02,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 3354853376. Throughput: 0: 50767.6. Samples: 1107732000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:12:02,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 15:12:04,570][49750] Updated weights for policy 0, policy_version 204771 (0.0030) [2024-04-26 15:12:07,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3355099136. Throughput: 0: 50761.2. Samples: 1107886780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:12:07,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 15:12:07,810][49750] Updated weights for policy 0, policy_version 204781 (0.0028) [2024-04-26 15:12:11,066][49750] Updated weights for policy 0, policy_version 204791 (0.0035) [2024-04-26 15:12:12,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3355312128. Throughput: 0: 50702.2. Samples: 1108189440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:12,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 15:12:14,197][49750] Updated weights for policy 0, policy_version 204801 (0.0029) [2024-04-26 15:12:17,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3355590656. Throughput: 0: 50857.8. Samples: 1108496940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:17,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 15:12:17,600][49750] Updated weights for policy 0, policy_version 204811 (0.0027) [2024-04-26 15:12:19,279][49728] Signal inference workers to stop experience collection... (16600 times) [2024-04-26 15:12:19,279][49728] Signal inference workers to resume experience collection... (16600 times) [2024-04-26 15:12:19,300][49750] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-04-26 15:12:19,301][49750] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-04-26 15:12:20,640][49750] Updated weights for policy 0, policy_version 204821 (0.0032) [2024-04-26 15:12:22,063][49517] Fps is (10 sec: 55704.7, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3355869184. Throughput: 0: 50590.5. Samples: 1108646020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:22,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 15:12:24,074][49750] Updated weights for policy 0, policy_version 204831 (0.0027) [2024-04-26 15:12:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3356098560. Throughput: 0: 50842.0. Samples: 1108960740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:27,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 15:12:27,166][49750] Updated weights for policy 0, policy_version 204841 (0.0034) [2024-04-26 15:12:30,656][49750] Updated weights for policy 0, policy_version 204851 (0.0031) [2024-04-26 15:12:32,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3356344320. Throughput: 0: 50792.5. Samples: 1109264020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:32,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 15:12:33,591][49750] Updated weights for policy 0, policy_version 204861 (0.0030) [2024-04-26 15:12:37,052][49750] Updated weights for policy 0, policy_version 204871 (0.0031) [2024-04-26 15:12:37,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3356606464. Throughput: 0: 50844.4. Samples: 1109406240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:37,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 15:12:39,917][49750] Updated weights for policy 0, policy_version 204881 (0.0035) [2024-04-26 15:12:42,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3356868608. Throughput: 0: 50752.8. Samples: 1109713580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:42,063][49517] Avg episode reward: [(0, '0.387')] [2024-04-26 15:12:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000204887_3356868608.pth... [2024-04-26 15:12:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000204145_3344711680.pth [2024-04-26 15:12:43,656][49750] Updated weights for policy 0, policy_version 204891 (0.0028) [2024-04-26 15:12:46,426][49750] Updated weights for policy 0, policy_version 204901 (0.0025) [2024-04-26 15:12:47,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.5, 300 sec: 50762.7). Total num frames: 3357130752. Throughput: 0: 50841.3. Samples: 1110019860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:47,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 15:12:49,921][49750] Updated weights for policy 0, policy_version 204911 (0.0031) [2024-04-26 15:12:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3357392896. Throughput: 0: 50884.5. Samples: 1110176580. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:52,063][49517] Avg episode reward: [(0, '0.453')] [2024-04-26 15:12:52,904][49750] Updated weights for policy 0, policy_version 204921 (0.0029) [2024-04-26 15:12:56,333][49750] Updated weights for policy 0, policy_version 204931 (0.0032) [2024-04-26 15:12:57,063][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 3357605888. Throughput: 0: 51036.4. Samples: 1110486080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:12:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 15:12:59,338][49750] Updated weights for policy 0, policy_version 204941 (0.0036) [2024-04-26 15:13:02,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 3357868032. Throughput: 0: 50804.9. Samples: 1110783160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:13:02,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 15:13:02,817][49750] Updated weights for policy 0, policy_version 204951 (0.0031) [2024-04-26 15:13:05,798][49750] Updated weights for policy 0, policy_version 204961 (0.0024) [2024-04-26 15:13:07,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3358162944. Throughput: 0: 50802.0. Samples: 1110932100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:13:07,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 15:13:09,111][49750] Updated weights for policy 0, policy_version 204971 (0.0027) [2024-04-26 15:13:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3358392320. Throughput: 0: 50805.3. Samples: 1111246980. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:13:12,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 15:13:12,164][49750] Updated weights for policy 0, policy_version 204981 (0.0030) [2024-04-26 15:13:12,904][49728] Signal inference workers to stop experience collection... (16650 times) [2024-04-26 15:13:12,955][49750] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-04-26 15:13:12,968][49728] Signal inference workers to resume experience collection... (16650 times) [2024-04-26 15:13:12,976][49750] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-04-26 15:13:15,461][49750] Updated weights for policy 0, policy_version 204991 (0.0031) [2024-04-26 15:13:17,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3358621696. Throughput: 0: 50853.4. Samples: 1111552420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:13:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 15:13:18,623][49750] Updated weights for policy 0, policy_version 205001 (0.0027) [2024-04-26 15:13:21,945][49750] Updated weights for policy 0, policy_version 205011 (0.0037) [2024-04-26 15:13:22,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3358900224. Throughput: 0: 50804.4. Samples: 1111692440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:13:22,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:13:25,001][49750] Updated weights for policy 0, policy_version 205021 (0.0031) [2024-04-26 15:13:27,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3359145984. Throughput: 0: 50840.0. Samples: 1112001380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:13:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 15:13:28,911][49750] Updated weights for policy 0, policy_version 205031 (0.0035) [2024-04-26 15:13:31,421][49750] Updated weights for policy 0, policy_version 205041 (0.0025) [2024-04-26 15:13:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3359408128. Throughput: 0: 50767.7. Samples: 1112304400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:13:32,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 15:13:35,240][49750] Updated weights for policy 0, policy_version 205051 (0.0031) [2024-04-26 15:13:37,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3359686656. Throughput: 0: 50847.5. Samples: 1112464720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:13:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 15:13:37,817][49750] Updated weights for policy 0, policy_version 205061 (0.0036) [2024-04-26 15:13:41,684][49750] Updated weights for policy 0, policy_version 205071 (0.0031) [2024-04-26 15:13:42,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3359899648. Throughput: 0: 50856.0. Samples: 1112774600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:13:42,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 15:13:44,197][49750] Updated weights for policy 0, policy_version 205081 (0.0036) [2024-04-26 15:13:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3360161792. Throughput: 0: 50835.5. Samples: 1113070760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:13:47,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 15:13:48,351][49750] Updated weights for policy 0, policy_version 205091 (0.0032) [2024-04-26 15:13:50,635][49750] Updated weights for policy 0, policy_version 205101 (0.0035) [2024-04-26 15:13:52,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3360440320. Throughput: 0: 50777.2. Samples: 1113217080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:13:52,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 15:13:54,681][49750] Updated weights for policy 0, policy_version 205111 (0.0027) [2024-04-26 15:13:57,035][49750] Updated weights for policy 0, policy_version 205121 (0.0027) [2024-04-26 15:13:57,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51609.6, 300 sec: 50762.7). Total num frames: 3360702464. Throughput: 0: 50849.8. Samples: 1113535220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:13:57,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 15:14:01,069][49750] Updated weights for policy 0, policy_version 205131 (0.0030) [2024-04-26 15:14:01,915][49728] Signal inference workers to stop experience collection... (16700 times) [2024-04-26 15:14:01,915][49728] Signal inference workers to resume experience collection... (16700 times) [2024-04-26 15:14:01,959][49750] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-04-26 15:14:01,959][49750] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-04-26 15:14:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3360948224. Throughput: 0: 50958.5. Samples: 1113845560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:14:02,064][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 15:14:03,552][49750] Updated weights for policy 0, policy_version 205141 (0.0035) [2024-04-26 15:14:07,063][49517] Fps is (10 sec: 45874.6, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 3361161216. Throughput: 0: 50879.1. Samples: 1113982000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:14:07,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 15:14:07,650][49750] Updated weights for policy 0, policy_version 205151 (0.0029) [2024-04-26 15:14:09,989][49750] Updated weights for policy 0, policy_version 205161 (0.0029) [2024-04-26 15:14:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3361439744. Throughput: 0: 50773.0. Samples: 1114286160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:14:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 15:14:14,042][49750] Updated weights for policy 0, policy_version 205171 (0.0031) [2024-04-26 15:14:16,442][49750] Updated weights for policy 0, policy_version 205181 (0.0037) [2024-04-26 15:14:17,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51609.5, 300 sec: 50762.6). Total num frames: 3361718272. Throughput: 0: 50806.1. Samples: 1114590680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:14:17,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 15:14:20,402][49750] Updated weights for policy 0, policy_version 205191 (0.0036) [2024-04-26 15:14:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3361947648. Throughput: 0: 50781.7. Samples: 1114749900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 15:14:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 15:14:22,889][49750] Updated weights for policy 0, policy_version 205201 (0.0027) [2024-04-26 15:14:26,901][49750] Updated weights for policy 0, policy_version 205211 (0.0034) [2024-04-26 15:14:27,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3362177024. Throughput: 0: 50506.3. Samples: 1115047380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:14:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 15:14:29,422][49750] Updated weights for policy 0, policy_version 205221 (0.0030) [2024-04-26 15:14:32,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3362455552. Throughput: 0: 50833.7. Samples: 1115358280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:14:32,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 15:14:33,438][49750] Updated weights for policy 0, policy_version 205231 (0.0033) [2024-04-26 15:14:35,768][49750] Updated weights for policy 0, policy_version 205241 (0.0029) [2024-04-26 15:14:37,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3362717696. Throughput: 0: 50835.7. Samples: 1115504680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:14:37,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 15:14:39,904][49750] Updated weights for policy 0, policy_version 205251 (0.0031) [2024-04-26 15:14:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.5, 300 sec: 50762.7). Total num frames: 3362979840. Throughput: 0: 50599.9. Samples: 1115812220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:14:42,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 15:14:42,128][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000205261_3362996224.pth... 
[2024-04-26 15:14:42,134][49750] Updated weights for policy 0, policy_version 205261 (0.0030) [2024-04-26 15:14:42,174][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000204514_3350757376.pth [2024-04-26 15:14:46,208][49750] Updated weights for policy 0, policy_version 205271 (0.0044) [2024-04-26 15:14:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3363241984. Throughput: 0: 50650.6. Samples: 1116124840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:14:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 15:14:48,703][49750] Updated weights for policy 0, policy_version 205281 (0.0035) [2024-04-26 15:14:52,063][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 3363438592. Throughput: 0: 50799.1. Samples: 1116267960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:14:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 15:14:52,597][49750] Updated weights for policy 0, policy_version 205291 (0.0031) [2024-04-26 15:14:55,077][49750] Updated weights for policy 0, policy_version 205301 (0.0033) [2024-04-26 15:14:57,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3363733504. Throughput: 0: 50766.5. Samples: 1116570660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:14:57,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 15:14:58,928][49728] Signal inference workers to stop experience collection... (16750 times) [2024-04-26 15:14:58,931][49728] Signal inference workers to resume experience collection... 
(16750 times) [2024-04-26 15:14:58,959][49750] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-04-26 15:14:58,959][49750] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-04-26 15:14:59,063][49750] Updated weights for policy 0, policy_version 205311 (0.0031) [2024-04-26 15:15:01,511][49750] Updated weights for policy 0, policy_version 205321 (0.0030) [2024-04-26 15:15:02,063][49517] Fps is (10 sec: 55705.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3363995648. Throughput: 0: 50847.4. Samples: 1116878820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:15:02,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 15:15:05,459][49750] Updated weights for policy 0, policy_version 205331 (0.0036) [2024-04-26 15:15:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 3364225024. Throughput: 0: 50857.1. Samples: 1117038460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:15:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 15:15:07,941][49750] Updated weights for policy 0, policy_version 205341 (0.0029) [2024-04-26 15:15:11,700][49750] Updated weights for policy 0, policy_version 205351 (0.0034) [2024-04-26 15:15:12,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3364487168. Throughput: 0: 51048.9. Samples: 1117344580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:15:12,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 15:15:14,417][49750] Updated weights for policy 0, policy_version 205361 (0.0031) [2024-04-26 15:15:17,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3364732928. Throughput: 0: 50897.3. Samples: 1117648660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:15:17,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 15:15:18,059][49750] Updated weights for policy 0, policy_version 205371 (0.0029) [2024-04-26 15:15:20,888][49750] Updated weights for policy 0, policy_version 205381 (0.0036) [2024-04-26 15:15:22,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3365011456. Throughput: 0: 51125.2. Samples: 1117805320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:15:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 15:15:24,557][49750] Updated weights for policy 0, policy_version 205391 (0.0029) [2024-04-26 15:15:27,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3365257216. Throughput: 0: 51035.2. Samples: 1118108800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 15:15:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 15:15:27,407][49750] Updated weights for policy 0, policy_version 205401 (0.0028) [2024-04-26 15:15:30,985][49750] Updated weights for policy 0, policy_version 205411 (0.0030) [2024-04-26 15:15:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3365502976. Throughput: 0: 50849.4. Samples: 1118413060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:15:32,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 15:15:33,730][49750] Updated weights for policy 0, policy_version 205421 (0.0029) [2024-04-26 15:15:37,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3365748736. Throughput: 0: 51128.9. Samples: 1118568760. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:15:37,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 15:15:37,309][49750] Updated weights for policy 0, policy_version 205431 (0.0035) [2024-04-26 15:15:40,113][49750] Updated weights for policy 0, policy_version 205441 (0.0028) [2024-04-26 15:15:42,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 3366027264. Throughput: 0: 51168.6. Samples: 1118873240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:15:42,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 15:15:43,740][49750] Updated weights for policy 0, policy_version 205451 (0.0032) [2024-04-26 15:15:46,486][49750] Updated weights for policy 0, policy_version 205461 (0.0036) [2024-04-26 15:15:47,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3366305792. Throughput: 0: 51094.8. Samples: 1119178080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:15:47,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 15:15:50,094][49750] Updated weights for policy 0, policy_version 205471 (0.0028) [2024-04-26 15:15:52,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51882.6, 300 sec: 50873.7). Total num frames: 3366551552. Throughput: 0: 51198.9. Samples: 1119342420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:15:52,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 15:15:53,003][49750] Updated weights for policy 0, policy_version 205481 (0.0032) [2024-04-26 15:15:56,507][49750] Updated weights for policy 0, policy_version 205491 (0.0028) [2024-04-26 15:15:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 3366797312. Throughput: 0: 51193.2. Samples: 1119648280. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:15:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 15:15:59,520][49750] Updated weights for policy 0, policy_version 205501 (0.0029) [2024-04-26 15:16:02,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3367026688. Throughput: 0: 51179.3. Samples: 1119951720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:16:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 15:16:02,846][49750] Updated weights for policy 0, policy_version 205511 (0.0029) [2024-04-26 15:16:05,956][49750] Updated weights for policy 0, policy_version 205521 (0.0034) [2024-04-26 15:16:07,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3367305216. Throughput: 0: 51057.0. Samples: 1120102880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:16:07,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 15:16:09,180][49750] Updated weights for policy 0, policy_version 205531 (0.0031) [2024-04-26 15:16:09,498][49728] Signal inference workers to stop experience collection... (16800 times) [2024-04-26 15:16:09,503][49728] Signal inference workers to resume experience collection... (16800 times) [2024-04-26 15:16:09,527][49750] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-04-26 15:16:09,527][49750] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-04-26 15:16:12,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3367550976. Throughput: 0: 51043.9. Samples: 1120405780. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:16:12,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 15:16:12,275][49750] Updated weights for policy 0, policy_version 205541 (0.0037) [2024-04-26 15:16:15,677][49750] Updated weights for policy 0, policy_version 205551 (0.0028) [2024-04-26 15:16:17,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3367813120. Throughput: 0: 51038.1. Samples: 1120709780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:16:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 15:16:18,833][49750] Updated weights for policy 0, policy_version 205561 (0.0030) [2024-04-26 15:16:22,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 3368058880. Throughput: 0: 50993.5. Samples: 1120863460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:16:22,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:16:22,115][49750] Updated weights for policy 0, policy_version 205571 (0.0032) [2024-04-26 15:16:25,170][49750] Updated weights for policy 0, policy_version 205581 (0.0033) [2024-04-26 15:16:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3368304640. Throughput: 0: 50967.0. Samples: 1121166760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:16:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 15:16:28,569][49750] Updated weights for policy 0, policy_version 205591 (0.0038) [2024-04-26 15:16:31,493][49750] Updated weights for policy 0, policy_version 205601 (0.0026) [2024-04-26 15:16:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3368583168. Throughput: 0: 50957.4. Samples: 1121471160. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:16:32,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 15:16:34,992][49750] Updated weights for policy 0, policy_version 205611 (0.0035) [2024-04-26 15:16:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3368828928. Throughput: 0: 50915.1. Samples: 1121633600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-26 15:16:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:16:37,907][49750] Updated weights for policy 0, policy_version 205621 (0.0033) [2024-04-26 15:16:41,541][49750] Updated weights for policy 0, policy_version 205631 (0.0035) [2024-04-26 15:16:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 3369074688. Throughput: 0: 50816.1. Samples: 1121935000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:16:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 15:16:42,084][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000205633_3369091072.pth... [2024-04-26 15:16:42,144][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000204887_3356868608.pth [2024-04-26 15:16:44,347][49750] Updated weights for policy 0, policy_version 205641 (0.0035) [2024-04-26 15:16:47,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.3, 300 sec: 50818.2). Total num frames: 3369304064. Throughput: 0: 50831.1. Samples: 1122239120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:16:47,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 15:16:47,928][49750] Updated weights for policy 0, policy_version 205651 (0.0029) [2024-04-26 15:16:50,877][49750] Updated weights for policy 0, policy_version 205661 (0.0032) [2024-04-26 15:16:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3369566208. Throughput: 0: 50721.3. Samples: 1122385340. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:16:52,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 15:16:54,271][49750] Updated weights for policy 0, policy_version 205671 (0.0031) [2024-04-26 15:16:57,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3369861120. Throughput: 0: 50756.4. Samples: 1122689820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:16:57,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 15:16:57,305][49750] Updated weights for policy 0, policy_version 205681 (0.0032) [2024-04-26 15:17:00,768][49750] Updated weights for policy 0, policy_version 205691 (0.0027) [2024-04-26 15:17:02,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 3370106880. Throughput: 0: 50795.1. Samples: 1122995560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:17:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 15:17:03,805][49750] Updated weights for policy 0, policy_version 205701 (0.0032) [2024-04-26 15:17:07,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50929.2). Total num frames: 3370336256. Throughput: 0: 50746.2. Samples: 1123147040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:17:07,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 15:17:07,309][49750] Updated weights for policy 0, policy_version 205711 (0.0028) [2024-04-26 15:17:10,534][49750] Updated weights for policy 0, policy_version 205721 (0.0039) [2024-04-26 15:17:12,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3370598400. Throughput: 0: 50745.2. Samples: 1123450300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:17:12,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 15:17:13,736][49750] Updated weights for policy 0, policy_version 205731 (0.0038) [2024-04-26 15:17:15,787][49728] Signal inference workers to stop experience collection... (16850 times) [2024-04-26 15:17:15,788][49728] Signal inference workers to resume experience collection... (16850 times) [2024-04-26 15:17:15,822][49750] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-04-26 15:17:15,822][49750] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-04-26 15:17:16,832][49750] Updated weights for policy 0, policy_version 205741 (0.0031) [2024-04-26 15:17:17,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 3370860544. Throughput: 0: 50760.5. Samples: 1123755380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:17:17,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 15:17:20,128][49750] Updated weights for policy 0, policy_version 205751 (0.0034) [2024-04-26 15:17:22,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 3371122688. Throughput: 0: 50652.9. Samples: 1123912980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:17:22,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 15:17:23,368][49750] Updated weights for policy 0, policy_version 205761 (0.0036) [2024-04-26 15:17:26,583][49750] Updated weights for policy 0, policy_version 205771 (0.0029) [2024-04-26 15:17:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3371368448. Throughput: 0: 50751.1. Samples: 1124218800. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:17:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 15:17:29,994][49750] Updated weights for policy 0, policy_version 205781 (0.0031) [2024-04-26 15:17:32,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.1, 300 sec: 50818.2). Total num frames: 3371597824. Throughput: 0: 50776.2. Samples: 1124524060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:17:32,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 15:17:33,034][49750] Updated weights for policy 0, policy_version 205791 (0.0031) [2024-04-26 15:17:36,339][49750] Updated weights for policy 0, policy_version 205801 (0.0031) [2024-04-26 15:17:37,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3371843584. Throughput: 0: 50736.5. Samples: 1124668480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:17:37,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 15:17:39,497][49750] Updated weights for policy 0, policy_version 205811 (0.0031) [2024-04-26 15:17:42,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3372138496. Throughput: 0: 50765.7. Samples: 1124974280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-26 15:17:42,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 15:17:42,957][49750] Updated weights for policy 0, policy_version 205821 (0.0028) [2024-04-26 15:17:45,877][49750] Updated weights for policy 0, policy_version 205831 (0.0028) [2024-04-26 15:17:47,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3372384256. Throughput: 0: 50655.2. Samples: 1125275040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:17:47,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 15:17:49,518][49750] Updated weights for policy 0, policy_version 205841 (0.0034) [2024-04-26 15:17:52,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3372630016. Throughput: 0: 50927.2. Samples: 1125438780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:17:52,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:17:52,361][49750] Updated weights for policy 0, policy_version 205851 (0.0029) [2024-04-26 15:17:55,863][49750] Updated weights for policy 0, policy_version 205861 (0.0030) [2024-04-26 15:17:57,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50818.2). Total num frames: 3372859392. Throughput: 0: 50867.2. Samples: 1125739320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:17:57,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 15:17:58,807][49750] Updated weights for policy 0, policy_version 205871 (0.0030) [2024-04-26 15:18:02,062][49517] Fps is (10 sec: 50791.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3373137920. Throughput: 0: 50677.3. Samples: 1126035860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:02,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 15:18:02,272][49750] Updated weights for policy 0, policy_version 205881 (0.0032) [2024-04-26 15:18:05,268][49750] Updated weights for policy 0, policy_version 205891 (0.0034) [2024-04-26 15:18:07,063][49517] Fps is (10 sec: 55704.6, 60 sec: 51336.3, 300 sec: 50929.2). Total num frames: 3373416448. Throughput: 0: 50700.7. Samples: 1126194520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 15:18:08,914][49750] Updated weights for policy 0, policy_version 205901 (0.0039) [2024-04-26 15:18:11,628][49750] Updated weights for policy 0, policy_version 205911 (0.0032) [2024-04-26 15:18:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 3373662208. Throughput: 0: 50850.2. Samples: 1126507060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:12,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 15:18:15,461][49750] Updated weights for policy 0, policy_version 205921 (0.0028) [2024-04-26 15:18:17,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3373875200. Throughput: 0: 50724.1. Samples: 1126806640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 15:18:18,088][49750] Updated weights for policy 0, policy_version 205931 (0.0031) [2024-04-26 15:18:20,486][49728] Signal inference workers to stop experience collection... (16900 times) [2024-04-26 15:18:20,487][49728] Signal inference workers to resume experience collection... (16900 times) [2024-04-26 15:18:20,499][49750] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-04-26 15:18:20,500][49750] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-04-26 15:18:21,757][49750] Updated weights for policy 0, policy_version 205941 (0.0034) [2024-04-26 15:18:22,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3374137344. Throughput: 0: 50639.8. Samples: 1126947280. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:22,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 15:18:24,508][49750] Updated weights for policy 0, policy_version 205951 (0.0031) [2024-04-26 15:18:27,063][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3374415872. Throughput: 0: 50521.8. Samples: 1127247760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:27,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 15:18:28,184][49750] Updated weights for policy 0, policy_version 205961 (0.0034) [2024-04-26 15:18:30,908][49750] Updated weights for policy 0, policy_version 205971 (0.0037) [2024-04-26 15:18:32,062][49517] Fps is (10 sec: 55706.6, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 3374694400. Throughput: 0: 50616.1. Samples: 1127552760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:32,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 15:18:34,674][49750] Updated weights for policy 0, policy_version 205981 (0.0035) [2024-04-26 15:18:37,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 3374923776. Throughput: 0: 50714.9. Samples: 1127720940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:37,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 15:18:37,485][49750] Updated weights for policy 0, policy_version 205991 (0.0034) [2024-04-26 15:18:41,689][49750] Updated weights for policy 0, policy_version 206001 (0.0031) [2024-04-26 15:18:42,062][49517] Fps is (10 sec: 44236.7, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 3375136768. Throughput: 0: 50655.1. Samples: 1128018800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 15:18:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000206002_3375136768.pth... 
[2024-04-26 15:18:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000205261_3362996224.pth [2024-04-26 15:18:43,945][49750] Updated weights for policy 0, policy_version 206011 (0.0030) [2024-04-26 15:18:47,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3375415296. Throughput: 0: 50663.9. Samples: 1128315740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:47,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:18:48,131][49750] Updated weights for policy 0, policy_version 206021 (0.0030) [2024-04-26 15:18:50,315][49750] Updated weights for policy 0, policy_version 206031 (0.0027) [2024-04-26 15:18:52,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51063.6, 300 sec: 50818.1). Total num frames: 3375693824. Throughput: 0: 50678.3. Samples: 1128475040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 15:18:52,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:18:54,596][49750] Updated weights for policy 0, policy_version 206041 (0.0031) [2024-04-26 15:18:56,842][49750] Updated weights for policy 0, policy_version 206051 (0.0032) [2024-04-26 15:18:57,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 3375955968. Throughput: 0: 50613.7. Samples: 1128784680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:18:57,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 15:19:00,934][49750] Updated weights for policy 0, policy_version 206061 (0.0034) [2024-04-26 15:19:02,062][49517] Fps is (10 sec: 44237.1, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 3376136192. Throughput: 0: 50685.8. Samples: 1129087500. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:02,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 15:19:03,324][49750] Updated weights for policy 0, policy_version 206071 (0.0032) [2024-04-26 15:19:07,063][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 3376414720. Throughput: 0: 50533.8. Samples: 1129221300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 15:19:07,419][49750] Updated weights for policy 0, policy_version 206081 (0.0033) [2024-04-26 15:19:09,748][49728] Signal inference workers to stop experience collection... (16950 times) [2024-04-26 15:19:09,796][49750] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-04-26 15:19:09,811][49728] Signal inference workers to resume experience collection... (16950 times) [2024-04-26 15:19:09,814][49750] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-04-26 15:19:09,817][49750] Updated weights for policy 0, policy_version 206091 (0.0028) [2024-04-26 15:19:12,063][49517] Fps is (10 sec: 55705.2, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3376693248. Throughput: 0: 50584.0. Samples: 1129524040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:12,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 15:19:13,768][49750] Updated weights for policy 0, policy_version 206101 (0.0030) [2024-04-26 15:19:16,126][49750] Updated weights for policy 0, policy_version 206111 (0.0028) [2024-04-26 15:19:17,062][49517] Fps is (10 sec: 55706.7, 60 sec: 51609.7, 300 sec: 50929.3). Total num frames: 3376971776. Throughput: 0: 50654.7. Samples: 1129832220. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:17,063][49517] Avg episode reward: [(0, '0.467')] [2024-04-26 15:19:20,213][49750] Updated weights for policy 0, policy_version 206121 (0.0037) [2024-04-26 15:19:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 3377201152. Throughput: 0: 50652.1. Samples: 1130000280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:22,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 15:19:22,478][49750] Updated weights for policy 0, policy_version 206131 (0.0028) [2024-04-26 15:19:26,531][49750] Updated weights for policy 0, policy_version 206141 (0.0036) [2024-04-26 15:19:27,062][49517] Fps is (10 sec: 45875.2, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 3377430528. Throughput: 0: 50827.1. Samples: 1130306020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:27,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:19:28,808][49750] Updated weights for policy 0, policy_version 206151 (0.0032) [2024-04-26 15:19:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 3377692672. Throughput: 0: 50906.8. Samples: 1130606540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 15:19:32,947][49750] Updated weights for policy 0, policy_version 206161 (0.0032) [2024-04-26 15:19:35,347][49750] Updated weights for policy 0, policy_version 206171 (0.0035) [2024-04-26 15:19:37,063][49517] Fps is (10 sec: 54066.3, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3377971200. Throughput: 0: 50960.0. Samples: 1130768240. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:37,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 15:19:39,463][49750] Updated weights for policy 0, policy_version 206181 (0.0027) [2024-04-26 15:19:41,650][49750] Updated weights for policy 0, policy_version 206191 (0.0029) [2024-04-26 15:19:42,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51882.6, 300 sec: 50873.7). Total num frames: 3378249728. Throughput: 0: 50935.1. Samples: 1131076760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:42,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 15:19:45,733][49750] Updated weights for policy 0, policy_version 206201 (0.0028) [2024-04-26 15:19:47,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 3378446336. Throughput: 0: 51106.4. Samples: 1131387280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:47,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 15:19:47,966][49750] Updated weights for policy 0, policy_version 206211 (0.0025) [2024-04-26 15:19:51,999][49750] Updated weights for policy 0, policy_version 206221 (0.0029) [2024-04-26 15:19:52,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3378724864. Throughput: 0: 51295.3. Samples: 1131529580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 15:19:54,367][49750] Updated weights for policy 0, policy_version 206231 (0.0027) [2024-04-26 15:19:57,063][49517] Fps is (10 sec: 55704.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3379003392. Throughput: 0: 51216.4. Samples: 1131828780. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 15:19:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:19:58,395][49750] Updated weights for policy 0, policy_version 206241 (0.0030) [2024-04-26 15:20:00,804][49750] Updated weights for policy 0, policy_version 206251 (0.0026) [2024-04-26 15:20:02,062][49517] Fps is (10 sec: 54066.8, 60 sec: 52155.7, 300 sec: 50984.8). Total num frames: 3379265536. Throughput: 0: 51279.9. Samples: 1132139820. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:02,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 15:20:04,830][49750] Updated weights for policy 0, policy_version 206261 (0.0030) [2024-04-26 15:20:07,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51882.8, 300 sec: 50984.8). Total num frames: 3379527680. Throughput: 0: 51147.6. Samples: 1132301920. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:07,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 15:20:07,142][49750] Updated weights for policy 0, policy_version 206271 (0.0028) [2024-04-26 15:20:09,255][49728] Signal inference workers to stop experience collection... (17000 times) [2024-04-26 15:20:09,256][49728] Signal inference workers to resume experience collection... (17000 times) [2024-04-26 15:20:09,289][49750] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-04-26 15:20:09,290][49750] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-04-26 15:20:11,181][49750] Updated weights for policy 0, policy_version 206281 (0.0030) [2024-04-26 15:20:12,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3379740672. Throughput: 0: 51135.1. Samples: 1132607100. 
Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:12,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 15:20:13,445][49750] Updated weights for policy 0, policy_version 206291 (0.0033) [2024-04-26 15:20:17,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3380002816. Throughput: 0: 51351.0. Samples: 1132917340. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 15:20:17,755][49750] Updated weights for policy 0, policy_version 206301 (0.0033) [2024-04-26 15:20:20,001][49750] Updated weights for policy 0, policy_version 206311 (0.0037) [2024-04-26 15:20:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3380264960. Throughput: 0: 50920.6. Samples: 1133059660. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:22,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 15:20:24,287][49750] Updated weights for policy 0, policy_version 206321 (0.0030) [2024-04-26 15:20:26,435][49750] Updated weights for policy 0, policy_version 206331 (0.0033) [2024-04-26 15:20:27,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51882.6, 300 sec: 50984.8). Total num frames: 3380543488. Throughput: 0: 50947.1. Samples: 1133369380. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:27,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 15:20:30,786][49750] Updated weights for policy 0, policy_version 206341 (0.0029) [2024-04-26 15:20:32,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51609.5, 300 sec: 50984.8). Total num frames: 3380789248. Throughput: 0: 51026.0. Samples: 1133683460. 
Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:32,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 15:20:32,850][49750] Updated weights for policy 0, policy_version 206351 (0.0024) [2024-04-26 15:20:37,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 3381018624. Throughput: 0: 51172.1. Samples: 1133832320. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:37,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 15:20:37,068][49750] Updated weights for policy 0, policy_version 206361 (0.0031) [2024-04-26 15:20:39,484][49750] Updated weights for policy 0, policy_version 206371 (0.0032) [2024-04-26 15:20:42,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3381280768. Throughput: 0: 51303.3. Samples: 1134137420. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:42,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 15:20:42,150][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000206378_3381297152.pth... [2024-04-26 15:20:42,200][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000205633_3369091072.pth [2024-04-26 15:20:43,471][49750] Updated weights for policy 0, policy_version 206381 (0.0032) [2024-04-26 15:20:45,822][49750] Updated weights for policy 0, policy_version 206391 (0.0032) [2024-04-26 15:20:47,063][49517] Fps is (10 sec: 54065.9, 60 sec: 51882.4, 300 sec: 50873.7). Total num frames: 3381559296. Throughput: 0: 51213.7. Samples: 1134444440. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:47,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 15:20:49,844][49750] Updated weights for policy 0, policy_version 206401 (0.0026) [2024-04-26 15:20:52,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 3381821440. Throughput: 0: 51212.0. Samples: 1134606460. 
Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:52,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 15:20:52,215][49750] Updated weights for policy 0, policy_version 206411 (0.0032) [2024-04-26 15:20:56,135][49750] Updated weights for policy 0, policy_version 206421 (0.0033) [2024-04-26 15:20:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 3382067200. Throughput: 0: 51254.2. Samples: 1134913540. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:20:57,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 15:20:58,889][49750] Updated weights for policy 0, policy_version 206431 (0.0027) [2024-04-26 15:21:02,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3382296576. Throughput: 0: 51065.5. Samples: 1135215280. Policy #0 lag: (min: 1.0, avg: 8.0, max: 21.0) [2024-04-26 15:21:02,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:21:02,463][49750] Updated weights for policy 0, policy_version 206441 (0.0027) [2024-04-26 15:21:05,225][49750] Updated weights for policy 0, policy_version 206451 (0.0031) [2024-04-26 15:21:07,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 3382558720. Throughput: 0: 51083.9. Samples: 1135358440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:07,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 15:21:08,916][49750] Updated weights for policy 0, policy_version 206461 (0.0028) [2024-04-26 15:21:09,652][49728] Signal inference workers to stop experience collection... (17050 times) [2024-04-26 15:21:09,652][49728] Signal inference workers to resume experience collection... 
(17050 times) [2024-04-26 15:21:09,667][49750] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-04-26 15:21:09,668][49750] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-04-26 15:21:11,614][49750] Updated weights for policy 0, policy_version 206471 (0.0030) [2024-04-26 15:21:12,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 3382837248. Throughput: 0: 51008.1. Samples: 1135664740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:12,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 15:21:15,421][49750] Updated weights for policy 0, policy_version 206481 (0.0030) [2024-04-26 15:21:17,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 3383083008. Throughput: 0: 50663.1. Samples: 1135963300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:17,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 15:21:18,259][49750] Updated weights for policy 0, policy_version 206491 (0.0028) [2024-04-26 15:21:21,945][49750] Updated weights for policy 0, policy_version 206501 (0.0033) [2024-04-26 15:21:22,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3383312384. Throughput: 0: 50916.0. Samples: 1136123540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:22,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:21:24,674][49750] Updated weights for policy 0, policy_version 206511 (0.0028) [2024-04-26 15:21:27,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 3383574528. Throughput: 0: 51066.4. Samples: 1136435420. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:27,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 15:21:28,239][49750] Updated weights for policy 0, policy_version 206521 (0.0033) [2024-04-26 15:21:30,969][49750] Updated weights for policy 0, policy_version 206531 (0.0029) [2024-04-26 15:21:32,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3383820288. Throughput: 0: 50845.3. Samples: 1136732480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:32,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 15:21:34,774][49750] Updated weights for policy 0, policy_version 206541 (0.0035) [2024-04-26 15:21:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.3, 300 sec: 50929.2). Total num frames: 3384098816. Throughput: 0: 50874.4. Samples: 1136895820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:37,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 15:21:37,472][49750] Updated weights for policy 0, policy_version 206551 (0.0034) [2024-04-26 15:21:41,122][49750] Updated weights for policy 0, policy_version 206561 (0.0026) [2024-04-26 15:21:42,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51336.5, 300 sec: 51040.3). Total num frames: 3384360960. Throughput: 0: 50793.4. Samples: 1137199240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:42,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 15:21:44,014][49750] Updated weights for policy 0, policy_version 206571 (0.0027) [2024-04-26 15:21:47,062][49517] Fps is (10 sec: 49153.3, 60 sec: 50517.6, 300 sec: 50929.3). Total num frames: 3384590336. Throughput: 0: 50812.5. Samples: 1137501840. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:47,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 15:21:47,512][49750] Updated weights for policy 0, policy_version 206581 (0.0033) [2024-04-26 15:21:50,609][49750] Updated weights for policy 0, policy_version 206591 (0.0027) [2024-04-26 15:21:52,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 3384836096. Throughput: 0: 50934.1. Samples: 1137650480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:52,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 15:21:53,915][49750] Updated weights for policy 0, policy_version 206601 (0.0031) [2024-04-26 15:21:57,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3385098240. Throughput: 0: 50818.1. Samples: 1137951560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:21:57,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 15:21:57,083][49750] Updated weights for policy 0, policy_version 206611 (0.0035) [2024-04-26 15:22:00,391][49750] Updated weights for policy 0, policy_version 206621 (0.0034) [2024-04-26 15:22:02,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3385360384. Throughput: 0: 50898.6. Samples: 1138253740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:22:02,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 15:22:03,522][49750] Updated weights for policy 0, policy_version 206631 (0.0034) [2024-04-26 15:22:06,762][49750] Updated weights for policy 0, policy_version 206641 (0.0029) [2024-04-26 15:22:07,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3385622528. Throughput: 0: 50962.4. Samples: 1138416860. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:22:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:22:09,984][49750] Updated weights for policy 0, policy_version 206651 (0.0027) [2024-04-26 15:22:12,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50818.1). Total num frames: 3385851904. Throughput: 0: 50600.9. Samples: 1138712460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 15:22:12,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 15:22:13,127][49750] Updated weights for policy 0, policy_version 206661 (0.0029) [2024-04-26 15:22:16,427][49750] Updated weights for policy 0, policy_version 206671 (0.0037) [2024-04-26 15:22:17,062][49517] Fps is (10 sec: 49153.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3386114048. Throughput: 0: 50770.5. Samples: 1139017140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:22:17,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 15:22:19,629][49728] Signal inference workers to stop experience collection... (17100 times) [2024-04-26 15:22:19,629][49728] Signal inference workers to resume experience collection... (17100 times) [2024-04-26 15:22:19,636][49750] Updated weights for policy 0, policy_version 206681 (0.0031) [2024-04-26 15:22:19,643][49750] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-04-26 15:22:19,644][49750] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-04-26 15:22:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3386376192. Throughput: 0: 50588.2. Samples: 1139172280. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:22:22,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 15:22:22,776][49750] Updated weights for policy 0, policy_version 206691 (0.0028) [2024-04-26 15:22:25,961][49750] Updated weights for policy 0, policy_version 206701 (0.0031) [2024-04-26 15:22:27,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 3386638336. Throughput: 0: 50686.6. Samples: 1139480140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:22:27,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 15:22:29,195][49750] Updated weights for policy 0, policy_version 206711 (0.0028) [2024-04-26 15:22:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 3386867712. Throughput: 0: 50792.7. Samples: 1139787520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:22:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 15:22:32,444][49750] Updated weights for policy 0, policy_version 206721 (0.0032) [2024-04-26 15:22:35,618][49750] Updated weights for policy 0, policy_version 206731 (0.0030) [2024-04-26 15:22:37,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3387113472. Throughput: 0: 50783.2. Samples: 1139935720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:22:37,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 15:22:38,754][49750] Updated weights for policy 0, policy_version 206741 (0.0031) [2024-04-26 15:22:42,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3387392000. Throughput: 0: 50878.4. Samples: 1140241080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:22:42,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 15:22:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000206751_3387408384.pth... 
[2024-04-26 15:22:42,075][49750] Updated weights for policy 0, policy_version 206751 (0.0029) [2024-04-26 15:22:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000206002_3375136768.pth [2024-04-26 15:22:45,237][49750] Updated weights for policy 0, policy_version 206761 (0.0037) [2024-04-26 15:22:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3387637760. Throughput: 0: 50822.7. Samples: 1140540760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:22:47,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 15:22:48,463][49750] Updated weights for policy 0, policy_version 206771 (0.0028) [2024-04-26 15:22:51,706][49750] Updated weights for policy 0, policy_version 206781 (0.0030) [2024-04-26 15:22:52,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.6, 300 sec: 51040.3). Total num frames: 3387916288. Throughput: 0: 50714.8. Samples: 1140699020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:22:52,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 15:22:55,033][49750] Updated weights for policy 0, policy_version 206791 (0.0041) [2024-04-26 15:22:57,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3388129280. Throughput: 0: 50855.6. Samples: 1141000960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:22:57,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 15:22:58,134][49750] Updated weights for policy 0, policy_version 206801 (0.0029) [2024-04-26 15:23:01,434][49750] Updated weights for policy 0, policy_version 206811 (0.0030) [2024-04-26 15:23:02,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3388391424. Throughput: 0: 50807.8. Samples: 1141303500. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:23:02,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 15:23:04,532][49750] Updated weights for policy 0, policy_version 206821 (0.0030) [2024-04-26 15:23:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3388653568. Throughput: 0: 50726.6. Samples: 1141454980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:23:07,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 15:23:07,810][49750] Updated weights for policy 0, policy_version 206831 (0.0036) [2024-04-26 15:23:10,849][49750] Updated weights for policy 0, policy_version 206841 (0.0033) [2024-04-26 15:23:12,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.5, 300 sec: 51040.3). Total num frames: 3388932096. Throughput: 0: 50739.5. Samples: 1141763420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:23:12,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 15:23:14,428][49750] Updated weights for policy 0, policy_version 206851 (0.0033) [2024-04-26 15:23:17,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 3389177856. Throughput: 0: 50790.1. Samples: 1142073080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-26 15:23:17,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 15:23:17,312][49750] Updated weights for policy 0, policy_version 206861 (0.0030) [2024-04-26 15:23:20,980][49750] Updated weights for policy 0, policy_version 206871 (0.0026) [2024-04-26 15:23:22,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3389390848. Throughput: 0: 50743.6. Samples: 1142219180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:23:22,063][49517] Avg episode reward: [(0, '0.453')] [2024-04-26 15:23:23,543][49728] Signal inference workers to stop experience collection... 
(17150 times) [2024-04-26 15:23:23,543][49728] Signal inference workers to resume experience collection... (17150 times) [2024-04-26 15:23:23,565][49750] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-04-26 15:23:23,565][49750] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-04-26 15:23:23,804][49750] Updated weights for policy 0, policy_version 206881 (0.0028) [2024-04-26 15:23:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3389685760. Throughput: 0: 50800.8. Samples: 1142527120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:23:27,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 15:23:27,369][49750] Updated weights for policy 0, policy_version 206891 (0.0037) [2024-04-26 15:23:30,283][49750] Updated weights for policy 0, policy_version 206901 (0.0034) [2024-04-26 15:23:32,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3389931520. Throughput: 0: 50920.5. Samples: 1142832180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:23:32,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 15:23:33,733][49750] Updated weights for policy 0, policy_version 206911 (0.0028) [2024-04-26 15:23:36,778][49750] Updated weights for policy 0, policy_version 206921 (0.0031) [2024-04-26 15:23:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51609.6, 300 sec: 51095.9). Total num frames: 3390210048. Throughput: 0: 50968.5. Samples: 1142992600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:23:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 15:23:40,447][49750] Updated weights for policy 0, policy_version 206931 (0.0031) [2024-04-26 15:23:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 3390439424. Throughput: 0: 50895.1. Samples: 1143291240. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:23:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 15:23:43,072][49750] Updated weights for policy 0, policy_version 206941 (0.0028) [2024-04-26 15:23:46,874][49750] Updated weights for policy 0, policy_version 206951 (0.0027) [2024-04-26 15:23:47,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3390685184. Throughput: 0: 51017.4. Samples: 1143599280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:23:47,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 15:23:49,395][49750] Updated weights for policy 0, policy_version 206961 (0.0032) [2024-04-26 15:23:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3390947328. Throughput: 0: 50709.7. Samples: 1143736920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:23:52,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 15:23:53,276][49750] Updated weights for policy 0, policy_version 206971 (0.0029) [2024-04-26 15:23:55,953][49750] Updated weights for policy 0, policy_version 206981 (0.0028) [2024-04-26 15:23:57,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.6, 300 sec: 51040.3). Total num frames: 3391193088. Throughput: 0: 50766.4. Samples: 1144047900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:23:57,063][49517] Avg episode reward: [(0, '0.467')] [2024-04-26 15:23:59,883][49750] Updated weights for policy 0, policy_version 206991 (0.0028) [2024-04-26 15:24:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.7, 300 sec: 51040.4). Total num frames: 3391471616. Throughput: 0: 50716.1. Samples: 1144355300. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:24:02,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 15:24:02,405][49750] Updated weights for policy 0, policy_version 207001 (0.0033) [2024-04-26 15:24:06,207][49750] Updated weights for policy 0, policy_version 207011 (0.0029) [2024-04-26 15:24:07,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3391684608. Throughput: 0: 50954.2. Samples: 1144512120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:24:07,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-26 15:24:08,904][49750] Updated weights for policy 0, policy_version 207021 (0.0027) [2024-04-26 15:24:12,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3391963136. Throughput: 0: 50943.5. Samples: 1144819580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:24:12,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 15:24:12,640][49750] Updated weights for policy 0, policy_version 207031 (0.0030) [2024-04-26 15:24:15,297][49750] Updated weights for policy 0, policy_version 207041 (0.0036) [2024-04-26 15:24:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 3392225280. Throughput: 0: 50941.4. Samples: 1145124540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:24:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 15:24:19,015][49750] Updated weights for policy 0, policy_version 207051 (0.0029) [2024-04-26 15:24:21,766][49750] Updated weights for policy 0, policy_version 207061 (0.0030) [2024-04-26 15:24:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51609.6, 300 sec: 51040.3). Total num frames: 3392487424. Throughput: 0: 50959.6. Samples: 1145285780. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:24:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:24:25,368][49750] Updated weights for policy 0, policy_version 207071 (0.0034) [2024-04-26 15:24:26,913][49728] Signal inference workers to stop experience collection... (17200 times) [2024-04-26 15:24:26,953][49750] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-04-26 15:24:27,015][49728] Signal inference workers to resume experience collection... (17200 times) [2024-04-26 15:24:27,016][49750] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-04-26 15:24:27,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 51040.3). Total num frames: 3392749568. Throughput: 0: 51018.4. Samples: 1145587060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 15:24:27,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 15:24:28,363][49750] Updated weights for policy 0, policy_version 207081 (0.0026) [2024-04-26 15:24:31,780][49750] Updated weights for policy 0, policy_version 207091 (0.0031) [2024-04-26 15:24:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3392978944. Throughput: 0: 50965.9. Samples: 1145892740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:24:32,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 15:24:34,765][49750] Updated weights for policy 0, policy_version 207101 (0.0034) [2024-04-26 15:24:37,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3393241088. Throughput: 0: 51072.5. Samples: 1146035180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:24:37,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 15:24:38,371][49750] Updated weights for policy 0, policy_version 207111 (0.0029) [2024-04-26 15:24:41,096][49750] Updated weights for policy 0, policy_version 207121 (0.0028) [2024-04-26 15:24:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 3393486848. Throughput: 0: 50847.5. Samples: 1146336040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:24:42,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 15:24:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000207123_3393503232.pth... [2024-04-26 15:24:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000206378_3381297152.pth [2024-04-26 15:24:44,924][49750] Updated weights for policy 0, policy_version 207131 (0.0038) [2024-04-26 15:24:47,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 3393765376. Throughput: 0: 50868.3. Samples: 1146644380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:24:47,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 15:24:47,469][49750] Updated weights for policy 0, policy_version 207141 (0.0033) [2024-04-26 15:24:51,317][49750] Updated weights for policy 0, policy_version 207151 (0.0040) [2024-04-26 15:24:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3393978368. Throughput: 0: 50817.3. Samples: 1146798900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:24:52,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 15:24:54,057][49750] Updated weights for policy 0, policy_version 207161 (0.0031) [2024-04-26 15:24:57,063][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3394256896. Throughput: 0: 50779.5. Samples: 1147104660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:24:57,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 15:24:57,831][49750] Updated weights for policy 0, policy_version 207171 (0.0037) [2024-04-26 15:25:00,514][49750] Updated weights for policy 0, policy_version 207181 (0.0032) [2024-04-26 15:25:02,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3394502656. Throughput: 0: 50825.1. Samples: 1147411660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:25:02,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 15:25:04,586][49750] Updated weights for policy 0, policy_version 207191 (0.0035) [2024-04-26 15:25:07,050][49750] Updated weights for policy 0, policy_version 207201 (0.0022) [2024-04-26 15:25:07,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51609.5, 300 sec: 50984.8). Total num frames: 3394781184. Throughput: 0: 50804.3. Samples: 1147571980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:25:07,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 15:25:10,953][49750] Updated weights for policy 0, policy_version 207211 (0.0029) [2024-04-26 15:25:12,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3395026944. Throughput: 0: 50880.9. Samples: 1147876700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:25:12,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 15:25:13,425][49750] Updated weights for policy 0, policy_version 207221 (0.0030) [2024-04-26 15:25:17,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3395256320. Throughput: 0: 50695.5. Samples: 1148174040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:25:17,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 15:25:17,222][49750] Updated weights for policy 0, policy_version 207231 (0.0037) [2024-04-26 15:25:19,763][49750] Updated weights for policy 0, policy_version 207241 (0.0032) [2024-04-26 15:25:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3395518464. Throughput: 0: 50809.3. Samples: 1148321600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:25:22,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 15:25:23,674][49750] Updated weights for policy 0, policy_version 207251 (0.0031) [2024-04-26 15:25:26,457][49750] Updated weights for policy 0, policy_version 207261 (0.0023) [2024-04-26 15:25:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3395780608. Throughput: 0: 50864.8. Samples: 1148624960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:25:27,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:25:27,708][49728] Signal inference workers to stop experience collection... (17250 times) [2024-04-26 15:25:27,714][49728] Signal inference workers to resume experience collection... (17250 times) [2024-04-26 15:25:27,752][49750] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-04-26 15:25:27,752][49750] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-04-26 15:25:30,182][49750] Updated weights for policy 0, policy_version 207271 (0.0026) [2024-04-26 15:25:32,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 3396059136. Throughput: 0: 50873.7. Samples: 1148933700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:25:32,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 15:25:32,769][49750] Updated weights for policy 0, policy_version 207281 (0.0032) [2024-04-26 15:25:36,553][49750] Updated weights for policy 0, policy_version 207291 (0.0028) [2024-04-26 15:25:37,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3396304896. Throughput: 0: 50898.5. Samples: 1149089340. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:25:37,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 15:25:39,348][49750] Updated weights for policy 0, policy_version 207301 (0.0035) [2024-04-26 15:25:42,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.3, 300 sec: 50762.7). Total num frames: 3396534272. Throughput: 0: 50989.8. Samples: 1149399200. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:25:42,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 15:25:42,880][49750] Updated weights for policy 0, policy_version 207311 (0.0034) [2024-04-26 15:25:45,904][49750] Updated weights for policy 0, policy_version 207321 (0.0041) [2024-04-26 15:25:47,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3396796416. Throughput: 0: 51008.7. Samples: 1149707060. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:25:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 15:25:49,349][49750] Updated weights for policy 0, policy_version 207331 (0.0032) [2024-04-26 15:25:52,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3397058560. Throughput: 0: 50673.5. Samples: 1149852280. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:25:52,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 15:25:52,210][49750] Updated weights for policy 0, policy_version 207341 (0.0038) [2024-04-26 15:25:55,665][49750] Updated weights for policy 0, policy_version 207351 (0.0035) [2024-04-26 15:25:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3397304320. Throughput: 0: 50732.0. Samples: 1150159640. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:25:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 15:25:58,625][49750] Updated weights for policy 0, policy_version 207361 (0.0034) [2024-04-26 15:26:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3397550080. Throughput: 0: 50994.9. Samples: 1150468800. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:26:02,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 15:26:02,125][49750] Updated weights for policy 0, policy_version 207371 (0.0032) [2024-04-26 15:26:05,191][49750] Updated weights for policy 0, policy_version 207381 (0.0029) [2024-04-26 15:26:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3397812224. Throughput: 0: 50827.1. Samples: 1150608820. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:26:07,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 15:26:08,583][49750] Updated weights for policy 0, policy_version 207391 (0.0027) [2024-04-26 15:26:11,721][49750] Updated weights for policy 0, policy_version 207401 (0.0030) [2024-04-26 15:26:12,062][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3398057984. Throughput: 0: 50776.4. Samples: 1150909900. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:26:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 15:26:14,886][49750] Updated weights for policy 0, policy_version 207411 (0.0032) [2024-04-26 15:26:17,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 3398352896. Throughput: 0: 50851.6. Samples: 1151222020. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:26:17,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 15:26:18,152][49750] Updated weights for policy 0, policy_version 207421 (0.0034) [2024-04-26 15:26:21,387][49750] Updated weights for policy 0, policy_version 207431 (0.0028) [2024-04-26 15:26:22,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3398582272. Throughput: 0: 51001.0. Samples: 1151384380. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:26:22,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 15:26:24,470][49750] Updated weights for policy 0, policy_version 207441 (0.0029) [2024-04-26 15:26:27,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3398828032. Throughput: 0: 50888.6. Samples: 1151689180. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:26:27,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 15:26:27,662][49750] Updated weights for policy 0, policy_version 207451 (0.0031) [2024-04-26 15:26:31,033][49750] Updated weights for policy 0, policy_version 207461 (0.0027) [2024-04-26 15:26:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3399090176. Throughput: 0: 50851.6. Samples: 1151995380. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:26:32,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 15:26:34,100][49750] Updated weights for policy 0, policy_version 207471 (0.0036) [2024-04-26 15:26:35,692][49728] Signal inference workers to stop experience collection... 
(17300 times) [2024-04-26 15:26:35,741][49750] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-04-26 15:26:35,759][49728] Signal inference workers to resume experience collection... (17300 times) [2024-04-26 15:26:35,761][49750] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-04-26 15:26:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3399335936. Throughput: 0: 51029.4. Samples: 1152148600. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:26:37,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 15:26:37,828][49750] Updated weights for policy 0, policy_version 207481 (0.0037) [2024-04-26 15:26:40,533][49750] Updated weights for policy 0, policy_version 207491 (0.0030) [2024-04-26 15:26:42,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 3399630848. Throughput: 0: 50937.3. Samples: 1152451820. Policy #0 lag: (min: 1.0, avg: 8.4, max: 20.0) [2024-04-26 15:26:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 15:26:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000207497_3399630848.pth... [2024-04-26 15:26:42,127][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000206751_3387408384.pth [2024-04-26 15:26:44,063][49750] Updated weights for policy 0, policy_version 207501 (0.0029) [2024-04-26 15:26:46,795][49750] Updated weights for policy 0, policy_version 207511 (0.0034) [2024-04-26 15:26:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3399860224. Throughput: 0: 50865.7. Samples: 1152757760. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:26:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 15:26:50,277][49750] Updated weights for policy 0, policy_version 207521 (0.0027) [2024-04-26 15:26:52,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3400105984. Throughput: 0: 51128.8. Samples: 1152909620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:26:52,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 15:26:53,198][49750] Updated weights for policy 0, policy_version 207531 (0.0029) [2024-04-26 15:26:56,816][49750] Updated weights for policy 0, policy_version 207541 (0.0036) [2024-04-26 15:26:57,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3400351744. Throughput: 0: 51159.7. Samples: 1153212080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:26:57,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 15:26:59,638][49750] Updated weights for policy 0, policy_version 207551 (0.0028) [2024-04-26 15:27:02,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 3400630272. Throughput: 0: 50882.2. Samples: 1153511720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:02,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 15:27:03,501][49750] Updated weights for policy 0, policy_version 207561 (0.0027) [2024-04-26 15:27:06,318][49750] Updated weights for policy 0, policy_version 207571 (0.0032) [2024-04-26 15:27:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3400859648. Throughput: 0: 50919.7. Samples: 1153675760. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:07,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 15:27:09,878][49750] Updated weights for policy 0, policy_version 207581 (0.0035) [2024-04-26 15:27:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3401121792. Throughput: 0: 50949.8. Samples: 1153981920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:12,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 15:27:12,604][49750] Updated weights for policy 0, policy_version 207591 (0.0035) [2024-04-26 15:27:16,296][49750] Updated weights for policy 0, policy_version 207601 (0.0031) [2024-04-26 15:27:17,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3401367552. Throughput: 0: 50981.8. Samples: 1154289560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:17,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 15:27:18,965][49750] Updated weights for policy 0, policy_version 207611 (0.0025) [2024-04-26 15:27:22,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3401629696. Throughput: 0: 50813.7. Samples: 1154435220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:22,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 15:27:22,760][49750] Updated weights for policy 0, policy_version 207621 (0.0029) [2024-04-26 15:27:25,510][49750] Updated weights for policy 0, policy_version 207631 (0.0030) [2024-04-26 15:27:27,063][49517] Fps is (10 sec: 55704.9, 60 sec: 51609.4, 300 sec: 51040.3). Total num frames: 3401924608. Throughput: 0: 50891.8. Samples: 1154741960. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:27,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 15:27:29,087][49750] Updated weights for policy 0, policy_version 207641 (0.0032) [2024-04-26 15:27:31,974][49750] Updated weights for policy 0, policy_version 207651 (0.0031) [2024-04-26 15:27:32,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 3402153984. Throughput: 0: 50948.8. Samples: 1155050460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:32,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 15:27:35,404][49750] Updated weights for policy 0, policy_version 207661 (0.0028) [2024-04-26 15:27:37,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3402383360. Throughput: 0: 50848.2. Samples: 1155197780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:37,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 15:27:38,354][49750] Updated weights for policy 0, policy_version 207671 (0.0034) [2024-04-26 15:27:41,950][49750] Updated weights for policy 0, policy_version 207681 (0.0035) [2024-04-26 15:27:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 3402645504. Throughput: 0: 50921.2. Samples: 1155503540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:42,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 15:27:42,496][49728] Signal inference workers to stop experience collection... (17350 times) [2024-04-26 15:27:42,496][49728] Signal inference workers to resume experience collection... 
(17350 times) [2024-04-26 15:27:42,513][49750] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-04-26 15:27:42,513][49750] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-04-26 15:27:44,770][49750] Updated weights for policy 0, policy_version 207691 (0.0034) [2024-04-26 15:27:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3402907648. Throughput: 0: 50998.3. Samples: 1155806640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 15:27:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 15:27:48,500][49750] Updated weights for policy 0, policy_version 207701 (0.0030) [2024-04-26 15:27:51,213][49750] Updated weights for policy 0, policy_version 207711 (0.0024) [2024-04-26 15:27:52,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 3403169792. Throughput: 0: 50892.2. Samples: 1155965920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:27:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 15:27:55,097][49750] Updated weights for policy 0, policy_version 207721 (0.0033) [2024-04-26 15:27:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3403415552. Throughput: 0: 50801.3. Samples: 1156267980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:27:57,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 15:27:57,754][49750] Updated weights for policy 0, policy_version 207731 (0.0033) [2024-04-26 15:28:01,418][49750] Updated weights for policy 0, policy_version 207741 (0.0040) [2024-04-26 15:28:02,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3403661312. Throughput: 0: 50865.7. Samples: 1156578520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:02,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 15:28:04,095][49750] Updated weights for policy 0, policy_version 207751 (0.0032) [2024-04-26 15:28:07,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3403923456. Throughput: 0: 50792.3. Samples: 1156720880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:07,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 15:28:07,783][49750] Updated weights for policy 0, policy_version 207761 (0.0032) [2024-04-26 15:28:10,500][49750] Updated weights for policy 0, policy_version 207771 (0.0033) [2024-04-26 15:28:12,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 3404201984. Throughput: 0: 50770.7. Samples: 1157026640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:12,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 15:28:14,190][49750] Updated weights for policy 0, policy_version 207781 (0.0030) [2024-04-26 15:28:16,922][49750] Updated weights for policy 0, policy_version 207791 (0.0038) [2024-04-26 15:28:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.5, 300 sec: 51040.3). Total num frames: 3404447744. Throughput: 0: 50778.7. Samples: 1157335500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 15:28:20,613][49750] Updated weights for policy 0, policy_version 207801 (0.0030) [2024-04-26 15:28:22,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3404677120. Throughput: 0: 50807.1. Samples: 1157484100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:22,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 15:28:23,665][49750] Updated weights for policy 0, policy_version 207811 (0.0035) [2024-04-26 15:28:27,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.4, 300 sec: 50818.2). Total num frames: 3404922880. Throughput: 0: 50717.0. Samples: 1157785800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:27,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 15:28:27,082][49750] Updated weights for policy 0, policy_version 207821 (0.0034) [2024-04-26 15:28:29,998][49750] Updated weights for policy 0, policy_version 207831 (0.0033) [2024-04-26 15:28:32,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3405201408. Throughput: 0: 50754.1. Samples: 1158090580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:32,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 15:28:33,438][49750] Updated weights for policy 0, policy_version 207841 (0.0027) [2024-04-26 15:28:36,513][49750] Updated weights for policy 0, policy_version 207851 (0.0034) [2024-04-26 15:28:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3405430784. Throughput: 0: 50717.9. Samples: 1158248220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:37,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 15:28:39,889][49750] Updated weights for policy 0, policy_version 207861 (0.0033) [2024-04-26 15:28:42,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 3405709312. Throughput: 0: 50684.7. Samples: 1158548800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:42,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 15:28:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000207868_3405709312.pth... 
[2024-04-26 15:28:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000207123_3393503232.pth [2024-04-26 15:28:42,953][49750] Updated weights for policy 0, policy_version 207871 (0.0029) [2024-04-26 15:28:46,415][49750] Updated weights for policy 0, policy_version 207881 (0.0040) [2024-04-26 15:28:47,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3405955072. Throughput: 0: 50588.5. Samples: 1158855000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:47,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 15:28:48,147][49728] Signal inference workers to stop experience collection... (17400 times) [2024-04-26 15:28:48,148][49728] Signal inference workers to resume experience collection... (17400 times) [2024-04-26 15:28:48,178][49750] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-04-26 15:28:48,178][49750] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-04-26 15:28:49,269][49750] Updated weights for policy 0, policy_version 207891 (0.0035) [2024-04-26 15:28:52,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3406200832. Throughput: 0: 50737.2. Samples: 1159004060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:52,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 15:28:52,876][49750] Updated weights for policy 0, policy_version 207901 (0.0035) [2024-04-26 15:28:55,731][49750] Updated weights for policy 0, policy_version 207911 (0.0028) [2024-04-26 15:28:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3406462976. Throughput: 0: 50685.5. Samples: 1159307480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 15:28:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 15:28:59,374][49750] Updated weights for policy 0, policy_version 207921 (0.0032) [2024-04-26 15:29:02,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 3406725120. Throughput: 0: 50644.8. Samples: 1159614520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:02,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 15:29:02,200][49750] Updated weights for policy 0, policy_version 207931 (0.0027) [2024-04-26 15:29:05,695][49750] Updated weights for policy 0, policy_version 207941 (0.0028) [2024-04-26 15:29:07,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3406970880. Throughput: 0: 50835.5. Samples: 1159771700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:07,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 15:29:08,618][49750] Updated weights for policy 0, policy_version 207951 (0.0028) [2024-04-26 15:29:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3407216640. Throughput: 0: 50784.3. Samples: 1160071100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:12,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 15:29:12,197][49750] Updated weights for policy 0, policy_version 207961 (0.0035) [2024-04-26 15:29:14,948][49750] Updated weights for policy 0, policy_version 207971 (0.0033) [2024-04-26 15:29:17,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3407495168. Throughput: 0: 50801.5. Samples: 1160376640. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:17,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:29:18,616][49750] Updated weights for policy 0, policy_version 207981 (0.0031) [2024-04-26 15:29:21,552][49750] Updated weights for policy 0, policy_version 207991 (0.0037) [2024-04-26 15:29:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3407740928. Throughput: 0: 50826.5. Samples: 1160535420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:22,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 15:29:25,052][49750] Updated weights for policy 0, policy_version 208001 (0.0031) [2024-04-26 15:29:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3407986688. Throughput: 0: 50870.8. Samples: 1160837980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:27,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 15:29:28,167][49750] Updated weights for policy 0, policy_version 208011 (0.0036) [2024-04-26 15:29:31,356][49750] Updated weights for policy 0, policy_version 208021 (0.0026) [2024-04-26 15:29:32,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3408232448. Throughput: 0: 50744.1. Samples: 1161138480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:32,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 15:29:34,756][49750] Updated weights for policy 0, policy_version 208031 (0.0034) [2024-04-26 15:29:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3408478208. Throughput: 0: 50828.3. Samples: 1161291320. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 15:29:37,802][49750] Updated weights for policy 0, policy_version 208041 (0.0033) [2024-04-26 15:29:41,021][49750] Updated weights for policy 0, policy_version 208051 (0.0036) [2024-04-26 15:29:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 3408740352. Throughput: 0: 50923.6. Samples: 1161599040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:42,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:29:44,345][49750] Updated weights for policy 0, policy_version 208061 (0.0031) [2024-04-26 15:29:47,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 3409018880. Throughput: 0: 50909.8. Samples: 1161905460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 15:29:47,530][49750] Updated weights for policy 0, policy_version 208071 (0.0032) [2024-04-26 15:29:50,644][49750] Updated weights for policy 0, policy_version 208081 (0.0029) [2024-04-26 15:29:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.7, 300 sec: 50873.7). Total num frames: 3409264640. Throughput: 0: 50887.8. Samples: 1162061640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:52,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 15:29:53,906][49750] Updated weights for policy 0, policy_version 208091 (0.0031) [2024-04-26 15:29:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3409510400. Throughput: 0: 50974.2. Samples: 1162364940. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:29:57,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 15:29:57,094][49750] Updated weights for policy 0, policy_version 208101 (0.0030) [2024-04-26 15:30:00,173][49750] Updated weights for policy 0, policy_version 208111 (0.0030) [2024-04-26 15:30:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 3409756160. Throughput: 0: 50965.8. Samples: 1162670100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:30:02,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 15:30:03,590][49750] Updated weights for policy 0, policy_version 208121 (0.0028) [2024-04-26 15:30:06,424][49750] Updated weights for policy 0, policy_version 208131 (0.0031) [2024-04-26 15:30:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3410018304. Throughput: 0: 50759.5. Samples: 1162819600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:07,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 15:30:07,415][49728] Signal inference workers to stop experience collection... (17450 times) [2024-04-26 15:30:07,457][49750] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-04-26 15:30:07,519][49728] Signal inference workers to resume experience collection... (17450 times) [2024-04-26 15:30:07,520][49750] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-04-26 15:30:09,849][49750] Updated weights for policy 0, policy_version 208141 (0.0028) [2024-04-26 15:30:12,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3410280448. Throughput: 0: 50977.3. Samples: 1163131960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:12,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 15:30:13,173][49750] Updated weights for policy 0, policy_version 208151 (0.0027) [2024-04-26 15:30:16,242][49750] Updated weights for policy 0, policy_version 208161 (0.0032) [2024-04-26 15:30:17,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3410526208. Throughput: 0: 51021.0. Samples: 1163434420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:17,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 15:30:19,749][49750] Updated weights for policy 0, policy_version 208171 (0.0027) [2024-04-26 15:30:22,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3410788352. Throughput: 0: 50992.4. Samples: 1163585980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 15:30:22,694][49750] Updated weights for policy 0, policy_version 208181 (0.0035) [2024-04-26 15:30:26,200][49750] Updated weights for policy 0, policy_version 208191 (0.0029) [2024-04-26 15:30:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3411034112. Throughput: 0: 50908.4. Samples: 1163889920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:27,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 15:30:29,198][49750] Updated weights for policy 0, policy_version 208201 (0.0027) [2024-04-26 15:30:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3411296256. Throughput: 0: 50921.0. Samples: 1164196900. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:32,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 15:30:32,739][49750] Updated weights for policy 0, policy_version 208211 (0.0032) [2024-04-26 15:30:35,517][49750] Updated weights for policy 0, policy_version 208221 (0.0030) [2024-04-26 15:30:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 3411558400. Throughput: 0: 50926.6. Samples: 1164353340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:37,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 15:30:39,084][49750] Updated weights for policy 0, policy_version 208231 (0.0030) [2024-04-26 15:30:41,987][49750] Updated weights for policy 0, policy_version 208241 (0.0029) [2024-04-26 15:30:42,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 3411820544. Throughput: 0: 50981.4. Samples: 1164659100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 15:30:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000208241_3411820544.pth... [2024-04-26 15:30:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000207497_3399630848.pth [2024-04-26 15:30:45,529][49750] Updated weights for policy 0, policy_version 208251 (0.0032) [2024-04-26 15:30:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3412049920. Throughput: 0: 50847.1. Samples: 1164958220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:30:48,653][49750] Updated weights for policy 0, policy_version 208261 (0.0027) [2024-04-26 15:30:52,023][49750] Updated weights for policy 0, policy_version 208271 (0.0029) [2024-04-26 15:30:52,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.2, 300 sec: 50873.7). 
Total num frames: 3412312064. Throughput: 0: 50949.7. Samples: 1165112340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:52,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 15:30:54,977][49750] Updated weights for policy 0, policy_version 208281 (0.0030) [2024-04-26 15:30:57,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3412574208. Throughput: 0: 50737.0. Samples: 1165415120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:30:57,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 15:30:58,363][49750] Updated weights for policy 0, policy_version 208291 (0.0029) [2024-04-26 15:31:01,529][49750] Updated weights for policy 0, policy_version 208301 (0.0031) [2024-04-26 15:31:02,063][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3412819968. Throughput: 0: 50800.7. Samples: 1165720460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:31:02,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 15:31:04,912][49750] Updated weights for policy 0, policy_version 208311 (0.0029) [2024-04-26 15:31:07,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3413065728. Throughput: 0: 50890.6. Samples: 1165876060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:31:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 15:31:07,974][49750] Updated weights for policy 0, policy_version 208321 (0.0031) [2024-04-26 15:31:11,349][49750] Updated weights for policy 0, policy_version 208331 (0.0033) [2024-04-26 15:31:12,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3413327872. Throughput: 0: 50792.8. Samples: 1166175600. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-26 15:31:12,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 15:31:14,502][49750] Updated weights for policy 0, policy_version 208341 (0.0030) [2024-04-26 15:31:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3413573632. Throughput: 0: 50708.5. Samples: 1166478780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:31:17,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 15:31:17,823][49750] Updated weights for policy 0, policy_version 208351 (0.0031) [2024-04-26 15:31:20,835][49750] Updated weights for policy 0, policy_version 208361 (0.0034) [2024-04-26 15:31:22,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3413835776. Throughput: 0: 50706.2. Samples: 1166635120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:31:22,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 15:31:24,142][49750] Updated weights for policy 0, policy_version 208371 (0.0031) [2024-04-26 15:31:26,418][49728] Signal inference workers to stop experience collection... (17500 times) [2024-04-26 15:31:26,460][49750] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-04-26 15:31:26,521][49728] Signal inference workers to resume experience collection... (17500 times) [2024-04-26 15:31:26,522][49750] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-04-26 15:31:27,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3414097920. Throughput: 0: 50712.8. Samples: 1166941180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:31:27,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 15:31:27,199][49750] Updated weights for policy 0, policy_version 208381 (0.0028) [2024-04-26 15:31:30,554][49750] Updated weights for policy 0, policy_version 208391 (0.0026) [2024-04-26 15:31:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3414327296. Throughput: 0: 50846.7. Samples: 1167246320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:31:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 15:31:33,585][49750] Updated weights for policy 0, policy_version 208401 (0.0031) [2024-04-26 15:31:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3414589440. Throughput: 0: 50685.5. Samples: 1167393180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:31:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 15:31:37,178][49750] Updated weights for policy 0, policy_version 208411 (0.0030) [2024-04-26 15:31:40,114][49750] Updated weights for policy 0, policy_version 208421 (0.0033) [2024-04-26 15:31:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3414851584. Throughput: 0: 50729.8. Samples: 1167697960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:31:42,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 15:31:43,579][49750] Updated weights for policy 0, policy_version 208431 (0.0033) [2024-04-26 15:31:46,889][49750] Updated weights for policy 0, policy_version 208441 (0.0040) [2024-04-26 15:31:47,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3415113728. Throughput: 0: 50674.1. Samples: 1168000800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:31:47,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 15:31:49,966][49750] Updated weights for policy 0, policy_version 208451 (0.0035) [2024-04-26 15:31:52,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3415326720. Throughput: 0: 50660.5. Samples: 1168155780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:31:52,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:31:53,222][49750] Updated weights for policy 0, policy_version 208461 (0.0029) [2024-04-26 15:31:56,426][49750] Updated weights for policy 0, policy_version 208471 (0.0029) [2024-04-26 15:31:57,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3415605248. Throughput: 0: 50753.3. Samples: 1168459500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:31:57,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 15:31:59,576][49750] Updated weights for policy 0, policy_version 208481 (0.0031) [2024-04-26 15:32:02,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3415867392. Throughput: 0: 50751.8. Samples: 1168762620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:32:02,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 15:32:02,877][49750] Updated weights for policy 0, policy_version 208491 (0.0031) [2024-04-26 15:32:06,144][49750] Updated weights for policy 0, policy_version 208501 (0.0027) [2024-04-26 15:32:07,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3416129536. Throughput: 0: 50977.7. Samples: 1168929120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:32:07,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 15:32:09,351][49750] Updated weights for policy 0, policy_version 208511 (0.0032) [2024-04-26 15:32:12,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3416375296. Throughput: 0: 50872.5. Samples: 1169230440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:32:12,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 15:32:12,504][49750] Updated weights for policy 0, policy_version 208521 (0.0029) [2024-04-26 15:32:15,806][49750] Updated weights for policy 0, policy_version 208531 (0.0032) [2024-04-26 15:32:17,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3416621056. Throughput: 0: 50831.5. Samples: 1169533740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 15:32:17,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 15:32:18,776][49750] Updated weights for policy 0, policy_version 208541 (0.0032) [2024-04-26 15:32:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3416883200. Throughput: 0: 50884.1. Samples: 1169682960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:32:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 15:32:22,208][49750] Updated weights for policy 0, policy_version 208551 (0.0029) [2024-04-26 15:32:24,353][49728] Signal inference workers to stop experience collection... (17550 times) [2024-04-26 15:32:24,404][49750] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-04-26 15:32:24,425][49728] Signal inference workers to resume experience collection... 
(17550 times) [2024-04-26 15:32:24,426][49750] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-04-26 15:32:25,380][49750] Updated weights for policy 0, policy_version 208561 (0.0033) [2024-04-26 15:32:27,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3417145344. Throughput: 0: 50835.5. Samples: 1169985560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:32:27,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 15:32:28,742][49750] Updated weights for policy 0, policy_version 208571 (0.0030) [2024-04-26 15:32:31,811][49750] Updated weights for policy 0, policy_version 208581 (0.0030) [2024-04-26 15:32:32,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 3417407488. Throughput: 0: 50815.6. Samples: 1170287500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:32:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 15:32:35,215][49750] Updated weights for policy 0, policy_version 208591 (0.0034) [2024-04-26 15:32:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3417636864. Throughput: 0: 50884.0. Samples: 1170445560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:32:37,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 15:32:38,128][49750] Updated weights for policy 0, policy_version 208601 (0.0036) [2024-04-26 15:32:41,530][49750] Updated weights for policy 0, policy_version 208611 (0.0034) [2024-04-26 15:32:42,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3417882624. Throughput: 0: 50919.9. Samples: 1170750900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:32:42,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:32:42,151][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000208612_3417899008.pth... 
[2024-04-26 15:32:42,197][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000207868_3405709312.pth [2024-04-26 15:32:44,623][49750] Updated weights for policy 0, policy_version 208621 (0.0044) [2024-04-26 15:32:47,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3418144768. Throughput: 0: 50890.7. Samples: 1171052700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:32:47,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 15:32:47,910][49750] Updated weights for policy 0, policy_version 208631 (0.0033) [2024-04-26 15:32:51,065][49750] Updated weights for policy 0, policy_version 208641 (0.0032) [2024-04-26 15:32:52,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3418406912. Throughput: 0: 50716.4. Samples: 1171211360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:32:52,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 15:32:54,411][49750] Updated weights for policy 0, policy_version 208651 (0.0026) [2024-04-26 15:32:57,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3418669056. Throughput: 0: 50836.5. Samples: 1171518080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:32:57,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 15:32:57,489][49750] Updated weights for policy 0, policy_version 208661 (0.0026) [2024-04-26 15:33:00,934][49750] Updated weights for policy 0, policy_version 208671 (0.0036) [2024-04-26 15:33:02,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3418898432. Throughput: 0: 50728.4. Samples: 1171816520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:33:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 15:33:04,019][49750] Updated weights for policy 0, policy_version 208681 (0.0030) [2024-04-26 15:33:07,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3419176960. Throughput: 0: 50846.6. Samples: 1171971060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:33:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:33:07,263][49750] Updated weights for policy 0, policy_version 208691 (0.0030) [2024-04-26 15:33:10,394][49750] Updated weights for policy 0, policy_version 208701 (0.0038) [2024-04-26 15:33:12,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3419422720. Throughput: 0: 50694.5. Samples: 1172266820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:33:12,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 15:33:13,710][49750] Updated weights for policy 0, policy_version 208711 (0.0033) [2024-04-26 15:33:16,753][49750] Updated weights for policy 0, policy_version 208721 (0.0023) [2024-04-26 15:33:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3419684864. Throughput: 0: 50836.1. Samples: 1172575120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:33:17,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 15:33:20,089][49750] Updated weights for policy 0, policy_version 208731 (0.0030) [2024-04-26 15:33:22,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3419930624. Throughput: 0: 50759.1. Samples: 1172729720. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:33:22,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 15:33:23,200][49750] Updated weights for policy 0, policy_version 208741 (0.0030) [2024-04-26 15:33:26,587][49750] Updated weights for policy 0, policy_version 208751 (0.0028) [2024-04-26 15:33:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3420176384. Throughput: 0: 50913.6. Samples: 1173042000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 15:33:27,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 15:33:29,554][49728] Signal inference workers to stop experience collection... (17600 times) [2024-04-26 15:33:29,554][49728] Signal inference workers to resume experience collection... (17600 times) [2024-04-26 15:33:29,570][49750] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-04-26 15:33:29,570][49750] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-04-26 15:33:29,687][49750] Updated weights for policy 0, policy_version 208761 (0.0027) [2024-04-26 15:33:32,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 3420422144. Throughput: 0: 50851.7. Samples: 1173341020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:33:32,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 15:33:33,175][49750] Updated weights for policy 0, policy_version 208771 (0.0031) [2024-04-26 15:33:36,018][49750] Updated weights for policy 0, policy_version 208781 (0.0030) [2024-04-26 15:33:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3420700672. Throughput: 0: 50683.9. Samples: 1173492140. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:33:37,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 15:33:39,648][49750] Updated weights for policy 0, policy_version 208791 (0.0034) [2024-04-26 15:33:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3420946432. Throughput: 0: 50709.7. Samples: 1173800020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:33:42,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 15:33:42,433][49750] Updated weights for policy 0, policy_version 208801 (0.0034) [2024-04-26 15:33:46,266][49750] Updated weights for policy 0, policy_version 208811 (0.0032) [2024-04-26 15:33:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3421175808. Throughput: 0: 50715.7. Samples: 1174098720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:33:47,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 15:33:49,041][49750] Updated weights for policy 0, policy_version 208821 (0.0033) [2024-04-26 15:33:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3421437952. Throughput: 0: 50502.2. Samples: 1174243660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:33:52,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 15:33:52,588][49750] Updated weights for policy 0, policy_version 208831 (0.0035) [2024-04-26 15:33:55,675][49750] Updated weights for policy 0, policy_version 208841 (0.0035) [2024-04-26 15:33:57,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3421716480. Throughput: 0: 50846.7. Samples: 1174554920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:33:57,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 15:33:59,104][49750] Updated weights for policy 0, policy_version 208851 (0.0030) [2024-04-26 15:34:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3421962240. Throughput: 0: 50749.3. Samples: 1174858840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:34:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 15:34:02,239][49750] Updated weights for policy 0, policy_version 208861 (0.0035) [2024-04-26 15:34:05,565][49750] Updated weights for policy 0, policy_version 208871 (0.0033) [2024-04-26 15:34:07,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3422224384. Throughput: 0: 50747.5. Samples: 1175013360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:34:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 15:34:08,544][49750] Updated weights for policy 0, policy_version 208881 (0.0031) [2024-04-26 15:34:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3422453760. Throughput: 0: 50565.6. Samples: 1175317460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:34:12,063][49517] Avg episode reward: [(0, '0.438')] [2024-04-26 15:34:12,104][49750] Updated weights for policy 0, policy_version 208891 (0.0031) [2024-04-26 15:34:14,944][49750] Updated weights for policy 0, policy_version 208901 (0.0031) [2024-04-26 15:34:17,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3422715904. Throughput: 0: 50717.2. Samples: 1175623300. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:34:17,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 15:34:18,495][49750] Updated weights for policy 0, policy_version 208911 (0.0029) [2024-04-26 15:34:21,462][49750] Updated weights for policy 0, policy_version 208921 (0.0035) [2024-04-26 15:34:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3422978048. Throughput: 0: 50672.4. Samples: 1175772400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:34:22,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 15:34:24,924][49750] Updated weights for policy 0, policy_version 208931 (0.0034) [2024-04-26 15:34:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3423240192. Throughput: 0: 50611.1. Samples: 1176077520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:34:27,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 15:34:27,784][49750] Updated weights for policy 0, policy_version 208941 (0.0027) [2024-04-26 15:34:31,605][49750] Updated weights for policy 0, policy_version 208951 (0.0032) [2024-04-26 15:34:32,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 3423502336. Throughput: 0: 50989.3. Samples: 1176393240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 15:34:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:34:34,083][49750] Updated weights for policy 0, policy_version 208961 (0.0041) [2024-04-26 15:34:37,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3423715328. Throughput: 0: 50845.7. Samples: 1176531720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:34:37,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 15:34:38,098][49750] Updated weights for policy 0, policy_version 208971 (0.0029) [2024-04-26 15:34:40,482][49750] Updated weights for policy 0, policy_version 208981 (0.0032) [2024-04-26 15:34:42,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3423993856. Throughput: 0: 50701.7. Samples: 1176836500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:34:42,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 15:34:42,081][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000208984_3423993856.pth... [2024-04-26 15:34:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000208241_3411820544.pth [2024-04-26 15:34:44,538][49750] Updated weights for policy 0, policy_version 208991 (0.0029) [2024-04-26 15:34:46,964][49750] Updated weights for policy 0, policy_version 209001 (0.0030) [2024-04-26 15:34:47,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 3424272384. Throughput: 0: 50799.2. Samples: 1177144800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:34:47,071][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 15:34:50,884][49750] Updated weights for policy 0, policy_version 209011 (0.0037) [2024-04-26 15:34:52,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3424501760. Throughput: 0: 50745.9. Samples: 1177296920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:34:52,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 15:34:53,443][49728] Signal inference workers to stop experience collection... (17650 times) [2024-04-26 15:34:53,480][49750] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-04-26 15:34:53,513][49728] Signal inference workers to resume experience collection... 
(17650 times) [2024-04-26 15:34:53,513][49750] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-04-26 15:34:53,647][49750] Updated weights for policy 0, policy_version 209021 (0.0034) [2024-04-26 15:34:57,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3424731136. Throughput: 0: 50734.2. Samples: 1177600500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:34:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 15:34:57,459][49750] Updated weights for policy 0, policy_version 209031 (0.0029) [2024-04-26 15:35:00,174][49750] Updated weights for policy 0, policy_version 209041 (0.0031) [2024-04-26 15:35:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3424993280. Throughput: 0: 50608.2. Samples: 1177900660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:35:02,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 15:35:03,909][49750] Updated weights for policy 0, policy_version 209051 (0.0035) [2024-04-26 15:35:06,596][49750] Updated weights for policy 0, policy_version 209061 (0.0030) [2024-04-26 15:35:07,063][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3425271808. Throughput: 0: 50745.2. Samples: 1178055940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:35:07,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 15:35:10,262][49750] Updated weights for policy 0, policy_version 209071 (0.0029) [2024-04-26 15:35:12,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3425501184. Throughput: 0: 50748.4. Samples: 1178361200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:35:12,063][49517] Avg episode reward: [(0, '0.678')] [2024-04-26 15:35:13,186][49750] Updated weights for policy 0, policy_version 209081 (0.0032) [2024-04-26 15:35:16,612][49750] Updated weights for policy 0, policy_version 209091 (0.0029) [2024-04-26 15:35:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3425763328. Throughput: 0: 50479.0. Samples: 1178664800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:35:17,064][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 15:35:19,643][49750] Updated weights for policy 0, policy_version 209101 (0.0030) [2024-04-26 15:35:22,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3425992704. Throughput: 0: 50729.4. Samples: 1178814540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:35:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:35:23,193][49750] Updated weights for policy 0, policy_version 209111 (0.0033) [2024-04-26 15:35:25,997][49750] Updated weights for policy 0, policy_version 209121 (0.0028) [2024-04-26 15:35:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3426271232. Throughput: 0: 50588.5. Samples: 1179112980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:35:27,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 15:35:29,807][49750] Updated weights for policy 0, policy_version 209131 (0.0033) [2024-04-26 15:35:32,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3426533376. Throughput: 0: 50521.6. Samples: 1179418280. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:35:32,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 15:35:32,541][49750] Updated weights for policy 0, policy_version 209141 (0.0033) [2024-04-26 15:35:36,219][49750] Updated weights for policy 0, policy_version 209151 (0.0030) [2024-04-26 15:35:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3426779136. Throughput: 0: 50635.5. Samples: 1179575520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:35:37,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 15:35:39,028][49750] Updated weights for policy 0, policy_version 209161 (0.0029) [2024-04-26 15:35:42,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3427024896. Throughput: 0: 50733.8. Samples: 1179883520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 15:35:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 15:35:42,838][49750] Updated weights for policy 0, policy_version 209171 (0.0030) [2024-04-26 15:35:45,439][49750] Updated weights for policy 0, policy_version 209181 (0.0036) [2024-04-26 15:35:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3427287040. Throughput: 0: 50816.8. Samples: 1180187420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:35:47,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 15:35:49,271][49750] Updated weights for policy 0, policy_version 209191 (0.0031) [2024-04-26 15:35:51,904][49750] Updated weights for policy 0, policy_version 209201 (0.0032) [2024-04-26 15:35:52,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3427549184. Throughput: 0: 50776.9. Samples: 1180340900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:35:52,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:35:55,766][49750] Updated weights for policy 0, policy_version 209211 (0.0031) [2024-04-26 15:35:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3427794944. Throughput: 0: 50834.5. Samples: 1180648760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:35:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 15:35:58,427][49750] Updated weights for policy 0, policy_version 209221 (0.0031) [2024-04-26 15:36:02,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3428024320. Throughput: 0: 50849.9. Samples: 1180953040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:02,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:36:02,101][49750] Updated weights for policy 0, policy_version 209231 (0.0030) [2024-04-26 15:36:04,878][49750] Updated weights for policy 0, policy_version 209241 (0.0030) [2024-04-26 15:36:07,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3428302848. Throughput: 0: 50796.9. Samples: 1181100400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:07,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 15:36:08,485][49750] Updated weights for policy 0, policy_version 209251 (0.0030) [2024-04-26 15:36:10,244][49728] Signal inference workers to stop experience collection... (17700 times) [2024-04-26 15:36:10,286][49750] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-04-26 15:36:10,308][49728] Signal inference workers to resume experience collection... 
(17700 times) [2024-04-26 15:36:10,310][49750] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-04-26 15:36:11,268][49750] Updated weights for policy 0, policy_version 209261 (0.0027) [2024-04-26 15:36:12,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3428548608. Throughput: 0: 50960.1. Samples: 1181406180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:12,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 15:36:14,939][49750] Updated weights for policy 0, policy_version 209271 (0.0026) [2024-04-26 15:36:17,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3428827136. Throughput: 0: 50999.6. Samples: 1181713260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 15:36:17,675][49750] Updated weights for policy 0, policy_version 209281 (0.0032) [2024-04-26 15:36:21,465][49750] Updated weights for policy 0, policy_version 209291 (0.0029) [2024-04-26 15:36:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 50762.7). Total num frames: 3429072896. Throughput: 0: 50937.4. Samples: 1181867700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:22,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 15:36:24,146][49750] Updated weights for policy 0, policy_version 209301 (0.0041) [2024-04-26 15:36:27,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3429318656. Throughput: 0: 50816.9. Samples: 1182170280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:27,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:36:27,853][49750] Updated weights for policy 0, policy_version 209311 (0.0036) [2024-04-26 15:36:30,598][49750] Updated weights for policy 0, policy_version 209321 (0.0034) [2024-04-26 15:36:32,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3429564416. Throughput: 0: 50764.4. Samples: 1182471820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:32,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 15:36:34,392][49750] Updated weights for policy 0, policy_version 209331 (0.0033) [2024-04-26 15:36:37,059][49750] Updated weights for policy 0, policy_version 209341 (0.0030) [2024-04-26 15:36:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3429842944. Throughput: 0: 50881.0. Samples: 1182630540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:37,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 15:36:40,791][49750] Updated weights for policy 0, policy_version 209351 (0.0031) [2024-04-26 15:36:42,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3430088704. Throughput: 0: 50950.3. Samples: 1182941520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 15:36:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000209356_3430088704.pth... [2024-04-26 15:36:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000208612_3417899008.pth [2024-04-26 15:36:43,435][49750] Updated weights for policy 0, policy_version 209361 (0.0033) [2024-04-26 15:36:47,062][49517] Fps is (10 sec: 45874.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3430301696. Throughput: 0: 50825.3. Samples: 1183240180. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 15:36:47,063][49517] Avg episode reward: [(0, '0.682')] [2024-04-26 15:36:47,278][49750] Updated weights for policy 0, policy_version 209371 (0.0033) [2024-04-26 15:36:49,833][49750] Updated weights for policy 0, policy_version 209381 (0.0029) [2024-04-26 15:36:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3430596608. Throughput: 0: 50776.5. Samples: 1183385340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:36:52,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 15:36:53,851][49750] Updated weights for policy 0, policy_version 209391 (0.0028) [2024-04-26 15:36:56,291][49750] Updated weights for policy 0, policy_version 209401 (0.0027) [2024-04-26 15:36:57,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3430842368. Throughput: 0: 50739.1. Samples: 1183689440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:36:57,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 15:37:00,178][49750] Updated weights for policy 0, policy_version 209411 (0.0036) [2024-04-26 15:37:02,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3431088128. Throughput: 0: 50757.3. Samples: 1183997340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 15:37:02,270][49728] Signal inference workers to stop experience collection... (17750 times) [2024-04-26 15:37:02,305][49750] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-04-26 15:37:02,338][49728] Signal inference workers to resume experience collection... 
(17750 times) [2024-04-26 15:37:02,338][49750] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-04-26 15:37:02,773][49750] Updated weights for policy 0, policy_version 209421 (0.0027) [2024-04-26 15:37:06,738][49750] Updated weights for policy 0, policy_version 209431 (0.0036) [2024-04-26 15:37:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3431350272. Throughput: 0: 50542.2. Samples: 1184142100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:07,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 15:37:09,322][49750] Updated weights for policy 0, policy_version 209441 (0.0030) [2024-04-26 15:37:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3431579648. Throughput: 0: 50653.8. Samples: 1184449700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:12,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 15:37:13,053][49750] Updated weights for policy 0, policy_version 209451 (0.0028) [2024-04-26 15:37:15,771][49750] Updated weights for policy 0, policy_version 209461 (0.0027) [2024-04-26 15:37:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3431841792. Throughput: 0: 50690.0. Samples: 1184752860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 15:37:19,439][49750] Updated weights for policy 0, policy_version 209471 (0.0031) [2024-04-26 15:37:22,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3432120320. Throughput: 0: 50595.9. Samples: 1184907360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:22,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 15:37:22,229][49750] Updated weights for policy 0, policy_version 209481 (0.0031) [2024-04-26 15:37:25,840][49750] Updated weights for policy 0, policy_version 209491 (0.0028) [2024-04-26 15:37:27,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3432366080. Throughput: 0: 50546.7. Samples: 1185216120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:27,063][49517] Avg episode reward: [(0, '0.450')] [2024-04-26 15:37:28,605][49750] Updated weights for policy 0, policy_version 209501 (0.0030) [2024-04-26 15:37:32,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3432595456. Throughput: 0: 50612.9. Samples: 1185517760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:32,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 15:37:32,373][49750] Updated weights for policy 0, policy_version 209511 (0.0037) [2024-04-26 15:37:35,269][49750] Updated weights for policy 0, policy_version 209521 (0.0036) [2024-04-26 15:37:37,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3432873984. Throughput: 0: 50553.9. Samples: 1185660260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:37,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 15:37:39,016][49750] Updated weights for policy 0, policy_version 209531 (0.0033) [2024-04-26 15:37:41,863][49750] Updated weights for policy 0, policy_version 209541 (0.0032) [2024-04-26 15:37:42,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3433119744. Throughput: 0: 50591.6. Samples: 1185966060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:42,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 15:37:45,605][49750] Updated weights for policy 0, policy_version 209551 (0.0033) [2024-04-26 15:37:47,063][49517] Fps is (10 sec: 49151.3, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3433365504. Throughput: 0: 50524.1. Samples: 1186270920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 15:37:48,356][49750] Updated weights for policy 0, policy_version 209561 (0.0029) [2024-04-26 15:37:51,954][49750] Updated weights for policy 0, policy_version 209571 (0.0037) [2024-04-26 15:37:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3433611264. Throughput: 0: 50492.9. Samples: 1186414280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:52,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 15:37:54,806][49750] Updated weights for policy 0, policy_version 209581 (0.0029) [2024-04-26 15:37:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3433857024. Throughput: 0: 50519.5. Samples: 1186723080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:37:57,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 15:37:58,449][49750] Updated weights for policy 0, policy_version 209591 (0.0034) [2024-04-26 15:38:01,424][49750] Updated weights for policy 0, policy_version 209601 (0.0032) [2024-04-26 15:38:01,677][49728] Signal inference workers to stop experience collection... (17800 times) [2024-04-26 15:38:01,732][49750] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-04-26 15:38:01,745][49728] Signal inference workers to resume experience collection... 
(17800 times) [2024-04-26 15:38:01,753][49750] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-04-26 15:38:02,062][49517] Fps is (10 sec: 55706.0, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 3434168320. Throughput: 0: 50616.9. Samples: 1187030620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 15:38:04,901][49750] Updated weights for policy 0, policy_version 209611 (0.0031) [2024-04-26 15:38:07,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3434397696. Throughput: 0: 50756.5. Samples: 1187191400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:07,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 15:38:07,693][49750] Updated weights for policy 0, policy_version 209621 (0.0040) [2024-04-26 15:38:11,252][49750] Updated weights for policy 0, policy_version 209631 (0.0032) [2024-04-26 15:38:12,062][49517] Fps is (10 sec: 47512.9, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3434643456. Throughput: 0: 50535.6. Samples: 1187490220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:12,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 15:38:14,364][49750] Updated weights for policy 0, policy_version 209641 (0.0036) [2024-04-26 15:38:17,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.1, 300 sec: 50651.5). Total num frames: 3434872832. Throughput: 0: 50525.1. Samples: 1187791400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:17,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:38:17,740][49750] Updated weights for policy 0, policy_version 209651 (0.0031) [2024-04-26 15:38:20,893][49750] Updated weights for policy 0, policy_version 209661 (0.0032) [2024-04-26 15:38:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3435167744. Throughput: 0: 50753.3. Samples: 1187944160. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:38:24,121][49750] Updated weights for policy 0, policy_version 209671 (0.0036) [2024-04-26 15:38:27,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3435397120. Throughput: 0: 50675.5. Samples: 1188246460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:27,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 15:38:27,251][49750] Updated weights for policy 0, policy_version 209681 (0.0035) [2024-04-26 15:38:30,601][49750] Updated weights for policy 0, policy_version 209691 (0.0030) [2024-04-26 15:38:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3435659264. Throughput: 0: 50693.4. Samples: 1188552120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:32,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 15:38:33,551][49750] Updated weights for policy 0, policy_version 209701 (0.0029) [2024-04-26 15:38:36,988][49750] Updated weights for policy 0, policy_version 209711 (0.0036) [2024-04-26 15:38:37,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3435905024. Throughput: 0: 50827.9. Samples: 1188701540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:37,064][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:38:40,022][49750] Updated weights for policy 0, policy_version 209721 (0.0030) [2024-04-26 15:38:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3436150784. Throughput: 0: 50651.5. Samples: 1189002400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:42,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 15:38:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000209726_3436150784.pth... 
[2024-04-26 15:38:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000208984_3423993856.pth [2024-04-26 15:38:43,560][49750] Updated weights for policy 0, policy_version 209731 (0.0029) [2024-04-26 15:38:46,537][49750] Updated weights for policy 0, policy_version 209741 (0.0033) [2024-04-26 15:38:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3436429312. Throughput: 0: 50678.1. Samples: 1189311140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:47,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 15:38:50,039][49750] Updated weights for policy 0, policy_version 209751 (0.0028) [2024-04-26 15:38:52,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3436675072. Throughput: 0: 50751.4. Samples: 1189475220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:52,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 15:38:52,808][49750] Updated weights for policy 0, policy_version 209761 (0.0037) [2024-04-26 15:38:56,492][49750] Updated weights for policy 0, policy_version 209771 (0.0034) [2024-04-26 15:38:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3436920832. Throughput: 0: 50845.8. Samples: 1189778280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:38:57,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 15:38:58,974][49728] Signal inference workers to stop experience collection... (17850 times) [2024-04-26 15:38:58,975][49728] Signal inference workers to resume experience collection... 
(17850 times) [2024-04-26 15:38:58,992][49750] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-04-26 15:38:58,992][49750] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-04-26 15:38:59,112][49750] Updated weights for policy 0, policy_version 209781 (0.0031) [2024-04-26 15:39:02,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49698.0, 300 sec: 50596.0). Total num frames: 3437150208. Throughput: 0: 50903.3. Samples: 1190082040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:39:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 15:39:02,754][49750] Updated weights for policy 0, policy_version 209791 (0.0031) [2024-04-26 15:39:05,605][49750] Updated weights for policy 0, policy_version 209801 (0.0029) [2024-04-26 15:39:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3437428736. Throughput: 0: 50811.1. Samples: 1190230660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:39:07,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 15:39:09,277][49750] Updated weights for policy 0, policy_version 209811 (0.0039) [2024-04-26 15:39:11,963][49750] Updated weights for policy 0, policy_version 209821 (0.0026) [2024-04-26 15:39:12,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3437707264. Throughput: 0: 50824.3. Samples: 1190533560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 15:39:15,685][49750] Updated weights for policy 0, policy_version 209831 (0.0031) [2024-04-26 15:39:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3437936640. Throughput: 0: 50790.1. Samples: 1190837680. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:17,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 15:39:18,636][49750] Updated weights for policy 0, policy_version 209841 (0.0033) [2024-04-26 15:39:22,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3438182400. Throughput: 0: 50856.0. Samples: 1190990060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:22,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 15:39:22,179][49750] Updated weights for policy 0, policy_version 209851 (0.0028) [2024-04-26 15:39:25,061][49750] Updated weights for policy 0, policy_version 209861 (0.0032) [2024-04-26 15:39:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3438444544. Throughput: 0: 51011.2. Samples: 1191297900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:27,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 15:39:28,514][49750] Updated weights for policy 0, policy_version 209871 (0.0029) [2024-04-26 15:39:31,324][49750] Updated weights for policy 0, policy_version 209881 (0.0031) [2024-04-26 15:39:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3438706688. Throughput: 0: 50888.6. Samples: 1191601140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 15:39:34,860][49750] Updated weights for policy 0, policy_version 209891 (0.0033) [2024-04-26 15:39:37,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3438968832. Throughput: 0: 50756.9. Samples: 1191759280. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 15:39:37,617][49750] Updated weights for policy 0, policy_version 209901 (0.0034) [2024-04-26 15:39:41,272][49750] Updated weights for policy 0, policy_version 209911 (0.0035) [2024-04-26 15:39:42,064][49517] Fps is (10 sec: 50785.1, 60 sec: 51062.5, 300 sec: 50651.3). Total num frames: 3439214592. Throughput: 0: 50971.5. Samples: 1192072060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:42,064][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 15:39:44,110][49750] Updated weights for policy 0, policy_version 209921 (0.0031) [2024-04-26 15:39:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3439443968. Throughput: 0: 50842.7. Samples: 1192369960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:47,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 15:39:47,866][49750] Updated weights for policy 0, policy_version 209931 (0.0035) [2024-04-26 15:39:50,703][49750] Updated weights for policy 0, policy_version 209941 (0.0026) [2024-04-26 15:39:52,062][49517] Fps is (10 sec: 50796.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3439722496. Throughput: 0: 50875.1. Samples: 1192520040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:52,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 15:39:54,459][49750] Updated weights for policy 0, policy_version 209951 (0.0025) [2024-04-26 15:39:57,059][49750] Updated weights for policy 0, policy_version 209961 (0.0036) [2024-04-26 15:39:57,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3440001024. Throughput: 0: 50782.8. Samples: 1192818780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:39:57,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 15:40:00,806][49750] Updated weights for policy 0, policy_version 209971 (0.0036) [2024-04-26 15:40:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 3440230400. Throughput: 0: 50851.7. Samples: 1193126000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:40:02,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 15:40:03,432][49750] Updated weights for policy 0, policy_version 209981 (0.0030) [2024-04-26 15:40:07,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3440476160. Throughput: 0: 50721.7. Samples: 1193272540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:40:07,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 15:40:07,192][49750] Updated weights for policy 0, policy_version 209991 (0.0029) [2024-04-26 15:40:10,126][49750] Updated weights for policy 0, policy_version 210001 (0.0029) [2024-04-26 15:40:12,063][49517] Fps is (10 sec: 49150.6, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3440721920. Throughput: 0: 50752.2. Samples: 1193581760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 15:40:12,063][49517] Avg episode reward: [(0, '0.676')] [2024-04-26 15:40:13,670][49750] Updated weights for policy 0, policy_version 210011 (0.0039) [2024-04-26 15:40:14,712][49728] Signal inference workers to stop experience collection... (17900 times) [2024-04-26 15:40:14,712][49728] Signal inference workers to resume experience collection... 
(17900 times) [2024-04-26 15:40:14,732][49750] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-04-26 15:40:14,732][49750] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-04-26 15:40:16,491][49750] Updated weights for policy 0, policy_version 210021 (0.0036) [2024-04-26 15:40:17,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3441000448. Throughput: 0: 50742.9. Samples: 1193884560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:40:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 15:40:20,079][49750] Updated weights for policy 0, policy_version 210031 (0.0030) [2024-04-26 15:40:22,062][49517] Fps is (10 sec: 50791.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3441229824. Throughput: 0: 50749.5. Samples: 1194043000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:40:22,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 15:40:23,180][49750] Updated weights for policy 0, policy_version 210041 (0.0026) [2024-04-26 15:40:26,455][49750] Updated weights for policy 0, policy_version 210051 (0.0031) [2024-04-26 15:40:27,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3441491968. Throughput: 0: 50646.5. Samples: 1194351100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:40:27,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 15:40:29,459][49750] Updated weights for policy 0, policy_version 210061 (0.0030) [2024-04-26 15:40:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3441737728. Throughput: 0: 50928.5. Samples: 1194661740. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:40:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 15:40:32,792][49750] Updated weights for policy 0, policy_version 210071 (0.0030) [2024-04-26 15:40:35,827][49750] Updated weights for policy 0, policy_version 210081 (0.0027) [2024-04-26 15:40:37,063][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3441983488. Throughput: 0: 50689.7. Samples: 1194801080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:40:37,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 15:40:39,666][49750] Updated weights for policy 0, policy_version 210091 (0.0023) [2024-04-26 15:40:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50791.4, 300 sec: 50762.6). Total num frames: 3442262016. Throughput: 0: 50734.2. Samples: 1195101820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:40:42,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 15:40:42,251][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000210101_3442294784.pth... [2024-04-26 15:40:42,259][49750] Updated weights for policy 0, policy_version 210101 (0.0033) [2024-04-26 15:40:42,295][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000209356_3430088704.pth [2024-04-26 15:40:46,052][49750] Updated weights for policy 0, policy_version 210111 (0.0030) [2024-04-26 15:40:47,062][49517] Fps is (10 sec: 55706.5, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 3442540544. Throughput: 0: 50747.1. Samples: 1195409620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:40:47,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 15:40:48,552][49750] Updated weights for policy 0, policy_version 210121 (0.0030) [2024-04-26 15:40:52,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3442753536. Throughput: 0: 50934.6. Samples: 1195564600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:40:52,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 15:40:52,498][49750] Updated weights for policy 0, policy_version 210131 (0.0031) [2024-04-26 15:40:55,108][49750] Updated weights for policy 0, policy_version 210141 (0.0030) [2024-04-26 15:40:57,063][49517] Fps is (10 sec: 45874.1, 60 sec: 49971.0, 300 sec: 50762.6). Total num frames: 3442999296. Throughput: 0: 50765.0. Samples: 1195866180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:40:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 15:40:59,009][49750] Updated weights for policy 0, policy_version 210151 (0.0029) [2024-04-26 15:41:01,556][49750] Updated weights for policy 0, policy_version 210161 (0.0028) [2024-04-26 15:41:02,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3443277824. Throughput: 0: 50752.9. Samples: 1196168440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:41:02,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 15:41:05,282][49750] Updated weights for policy 0, policy_version 210171 (0.0025) [2024-04-26 15:41:07,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3443523584. Throughput: 0: 50896.5. Samples: 1196333340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:41:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 15:41:08,255][49750] Updated weights for policy 0, policy_version 210181 (0.0029) [2024-04-26 15:41:11,716][49750] Updated weights for policy 0, policy_version 210191 (0.0038) [2024-04-26 15:41:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.7, 300 sec: 50707.1). Total num frames: 3443785728. Throughput: 0: 50680.7. Samples: 1196631720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:41:12,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 15:41:13,366][49728] Signal inference workers to stop experience collection... (17950 times) [2024-04-26 15:41:13,367][49728] Signal inference workers to resume experience collection... (17950 times) [2024-04-26 15:41:13,380][49750] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-04-26 15:41:13,400][49750] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-04-26 15:41:14,659][49750] Updated weights for policy 0, policy_version 210201 (0.0028) [2024-04-26 15:41:17,062][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3444015104. Throughput: 0: 50571.9. Samples: 1196937480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:41:17,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:41:18,191][49750] Updated weights for policy 0, policy_version 210211 (0.0037) [2024-04-26 15:41:21,364][49750] Updated weights for policy 0, policy_version 210221 (0.0029) [2024-04-26 15:41:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3444277248. Throughput: 0: 50666.0. Samples: 1197081040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 15:41:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 15:41:24,614][49750] Updated weights for policy 0, policy_version 210231 (0.0033) [2024-04-26 15:41:27,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 3444555776. Throughput: 0: 50689.4. Samples: 1197382840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:41:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 15:41:27,837][49750] Updated weights for policy 0, policy_version 210241 (0.0029) [2024-04-26 15:41:30,984][49750] Updated weights for policy 0, policy_version 210251 (0.0032) [2024-04-26 15:41:32,063][49517] Fps is (10 sec: 52427.2, 60 sec: 51063.3, 300 sec: 50707.0). Total num frames: 3444801536. Throughput: 0: 50588.5. Samples: 1197686120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:41:32,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:41:34,171][49750] Updated weights for policy 0, policy_version 210261 (0.0030) [2024-04-26 15:41:37,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3445030912. Throughput: 0: 50690.4. Samples: 1197845660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:41:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 15:41:37,537][49750] Updated weights for policy 0, policy_version 210271 (0.0034) [2024-04-26 15:41:40,865][49750] Updated weights for policy 0, policy_version 210281 (0.0033) [2024-04-26 15:41:42,063][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 3445276672. Throughput: 0: 50757.4. Samples: 1198150260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:41:42,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 15:41:43,967][49750] Updated weights for policy 0, policy_version 210291 (0.0030) [2024-04-26 15:41:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 3445538816. Throughput: 0: 50604.0. Samples: 1198445620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:41:47,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 15:41:47,323][49750] Updated weights for policy 0, policy_version 210301 (0.0029) [2024-04-26 15:41:50,457][49750] Updated weights for policy 0, policy_version 210311 (0.0028) [2024-04-26 15:41:52,062][49517] Fps is (10 sec: 54068.4, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 3445817344. Throughput: 0: 50598.6. Samples: 1198610280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:41:52,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 15:41:53,782][49750] Updated weights for policy 0, policy_version 210321 (0.0026) [2024-04-26 15:41:56,875][49750] Updated weights for policy 0, policy_version 210331 (0.0031) [2024-04-26 15:41:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.7, 300 sec: 50762.7). Total num frames: 3446063104. Throughput: 0: 50854.3. Samples: 1198920160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:41:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 15:42:00,208][49750] Updated weights for policy 0, policy_version 210341 (0.0028) [2024-04-26 15:42:02,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3446292480. Throughput: 0: 50793.8. Samples: 1199223200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:42:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 15:42:03,392][49750] Updated weights for policy 0, policy_version 210351 (0.0031) [2024-04-26 15:42:06,741][49750] Updated weights for policy 0, policy_version 210361 (0.0042) [2024-04-26 15:42:07,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3446571008. Throughput: 0: 50699.0. Samples: 1199362500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:42:07,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:42:08,994][49728] Signal inference workers to stop experience collection... 
(18000 times) [2024-04-26 15:42:08,995][49728] Signal inference workers to resume experience collection... (18000 times) [2024-04-26 15:42:09,012][49750] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-04-26 15:42:09,012][49750] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-04-26 15:42:09,861][49750] Updated weights for policy 0, policy_version 210371 (0.0039) [2024-04-26 15:42:12,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3446833152. Throughput: 0: 50769.3. Samples: 1199667460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:42:12,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 15:42:13,268][49750] Updated weights for policy 0, policy_version 210381 (0.0033) [2024-04-26 15:42:16,317][49750] Updated weights for policy 0, policy_version 210391 (0.0033) [2024-04-26 15:42:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3447078912. Throughput: 0: 50763.2. Samples: 1199970460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:42:17,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 15:42:19,575][49750] Updated weights for policy 0, policy_version 210401 (0.0031) [2024-04-26 15:42:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3447324672. Throughput: 0: 50756.5. Samples: 1200129700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:42:22,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 15:42:22,711][49750] Updated weights for policy 0, policy_version 210411 (0.0031) [2024-04-26 15:42:26,062][49750] Updated weights for policy 0, policy_version 210421 (0.0029) [2024-04-26 15:42:27,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3447554048. Throughput: 0: 50770.4. Samples: 1200434920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:42:27,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 15:42:29,183][49750] Updated weights for policy 0, policy_version 210431 (0.0034) [2024-04-26 15:42:32,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.4, 300 sec: 50651.5). Total num frames: 3447816192. Throughput: 0: 50859.0. Samples: 1200734280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:42:32,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 15:42:32,638][49750] Updated weights for policy 0, policy_version 210441 (0.0038) [2024-04-26 15:42:35,598][49750] Updated weights for policy 0, policy_version 210451 (0.0032) [2024-04-26 15:42:37,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 3448111104. Throughput: 0: 50679.4. Samples: 1200890860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:42:37,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 15:42:39,057][49750] Updated weights for policy 0, policy_version 210461 (0.0041) [2024-04-26 15:42:42,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3448340480. Throughput: 0: 50554.2. Samples: 1201195100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:42:42,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 15:42:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000210471_3448356864.pth... [2024-04-26 15:42:42,071][49750] Updated weights for policy 0, policy_version 210471 (0.0033) [2024-04-26 15:42:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000209726_3436150784.pth [2024-04-26 15:42:45,392][49750] Updated weights for policy 0, policy_version 210481 (0.0029) [2024-04-26 15:42:47,062][49517] Fps is (10 sec: 44237.2, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3448553472. Throughput: 0: 50601.4. Samples: 1201500260. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:42:47,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:42:48,617][49750] Updated weights for policy 0, policy_version 210491 (0.0031) [2024-04-26 15:42:51,804][49750] Updated weights for policy 0, policy_version 210501 (0.0033) [2024-04-26 15:42:52,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3448848384. Throughput: 0: 50590.5. Samples: 1201639080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:42:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 15:42:55,097][49750] Updated weights for policy 0, policy_version 210511 (0.0033) [2024-04-26 15:42:57,063][49517] Fps is (10 sec: 55704.6, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 3449110528. Throughput: 0: 50525.6. Samples: 1201941120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:42:57,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 15:42:58,373][49750] Updated weights for policy 0, policy_version 210521 (0.0029) [2024-04-26 15:43:01,867][49750] Updated weights for policy 0, policy_version 210531 (0.0034) [2024-04-26 15:43:02,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3449356288. Throughput: 0: 50573.5. Samples: 1202246260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:43:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 15:43:02,098][49728] Signal inference workers to stop experience collection... (18050 times) [2024-04-26 15:43:02,131][49750] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-04-26 15:43:02,165][49728] Signal inference workers to resume experience collection... 
(18050 times) [2024-04-26 15:43:02,166][49750] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-04-26 15:43:04,716][49750] Updated weights for policy 0, policy_version 210541 (0.0027) [2024-04-26 15:43:07,063][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3449585664. Throughput: 0: 50474.1. Samples: 1202401040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:43:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 15:43:08,210][49750] Updated weights for policy 0, policy_version 210551 (0.0039) [2024-04-26 15:43:11,027][49750] Updated weights for policy 0, policy_version 210561 (0.0027) [2024-04-26 15:43:12,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3449847808. Throughput: 0: 50520.3. Samples: 1202708340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:43:12,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 15:43:14,534][49750] Updated weights for policy 0, policy_version 210571 (0.0028) [2024-04-26 15:43:17,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 3450109952. Throughput: 0: 50574.6. Samples: 1203010140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:43:17,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 15:43:17,452][49750] Updated weights for policy 0, policy_version 210581 (0.0033) [2024-04-26 15:43:20,889][49750] Updated weights for policy 0, policy_version 210591 (0.0029) [2024-04-26 15:43:22,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3450372096. Throughput: 0: 50496.0. Samples: 1203163180. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:43:22,063][49517] Avg episode reward: [(0, '0.453')] [2024-04-26 15:43:24,044][49750] Updated weights for policy 0, policy_version 210601 (0.0032) [2024-04-26 15:43:27,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.2, 300 sec: 50707.1). Total num frames: 3450617856. Throughput: 0: 50539.7. Samples: 1203469400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:43:27,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 15:43:27,399][49750] Updated weights for policy 0, policy_version 210611 (0.0032) [2024-04-26 15:43:30,756][49750] Updated weights for policy 0, policy_version 210621 (0.0029) [2024-04-26 15:43:32,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3450847232. Throughput: 0: 50606.6. Samples: 1203777560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:43:32,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 15:43:33,821][49750] Updated weights for policy 0, policy_version 210631 (0.0028) [2024-04-26 15:43:37,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3451125760. Throughput: 0: 50607.7. Samples: 1203916420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 15:43:37,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 15:43:37,317][49750] Updated weights for policy 0, policy_version 210641 (0.0028) [2024-04-26 15:43:40,273][49750] Updated weights for policy 0, policy_version 210651 (0.0028) [2024-04-26 15:43:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3451371520. Throughput: 0: 50638.4. Samples: 1204219840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:43:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 15:43:43,985][49750] Updated weights for policy 0, policy_version 210661 (0.0028) [2024-04-26 15:43:46,643][49750] Updated weights for policy 0, policy_version 210671 (0.0033) [2024-04-26 15:43:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51609.6, 300 sec: 50762.6). Total num frames: 3451650048. Throughput: 0: 50768.4. Samples: 1204530840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:43:47,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 15:43:50,531][49750] Updated weights for policy 0, policy_version 210681 (0.0034) [2024-04-26 15:43:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3451879424. Throughput: 0: 50577.9. Samples: 1204677040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:43:52,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 15:43:53,068][49750] Updated weights for policy 0, policy_version 210691 (0.0034) [2024-04-26 15:43:56,880][49750] Updated weights for policy 0, policy_version 210701 (0.0034) [2024-04-26 15:43:57,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3452125184. Throughput: 0: 50616.1. Samples: 1204986060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:43:57,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 15:43:59,490][49750] Updated weights for policy 0, policy_version 210711 (0.0029) [2024-04-26 15:44:02,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3452420096. Throughput: 0: 50827.1. Samples: 1205297360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:44:02,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-26 15:44:03,279][49750] Updated weights for policy 0, policy_version 210721 (0.0031) [2024-04-26 15:44:05,961][49750] Updated weights for policy 0, policy_version 210731 (0.0033) [2024-04-26 15:44:07,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 3452665856. Throughput: 0: 50884.8. Samples: 1205453000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:44:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 15:44:09,746][49750] Updated weights for policy 0, policy_version 210741 (0.0033) [2024-04-26 15:44:09,758][49728] Signal inference workers to stop experience collection... (18100 times) [2024-04-26 15:44:09,761][49728] Signal inference workers to resume experience collection... (18100 times) [2024-04-26 15:44:09,784][49750] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-04-26 15:44:09,784][49750] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-04-26 15:44:12,062][49517] Fps is (10 sec: 49153.0, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3452911616. Throughput: 0: 50891.5. Samples: 1205759500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:44:12,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 15:44:12,296][49750] Updated weights for policy 0, policy_version 210751 (0.0031) [2024-04-26 15:44:16,103][49750] Updated weights for policy 0, policy_version 210761 (0.0033) [2024-04-26 15:44:17,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3453157376. Throughput: 0: 50760.4. Samples: 1206061780. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:44:17,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 15:44:18,839][49750] Updated weights for policy 0, policy_version 210771 (0.0035) [2024-04-26 15:44:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3453403136. Throughput: 0: 50983.5. Samples: 1206210680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:44:22,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 15:44:22,570][49750] Updated weights for policy 0, policy_version 210781 (0.0036) [2024-04-26 15:44:25,263][49750] Updated weights for policy 0, policy_version 210791 (0.0033) [2024-04-26 15:44:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3453665280. Throughput: 0: 50907.5. Samples: 1206510680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:44:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:44:28,932][49750] Updated weights for policy 0, policy_version 210801 (0.0026) [2024-04-26 15:44:31,583][49750] Updated weights for policy 0, policy_version 210811 (0.0033) [2024-04-26 15:44:32,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51609.6, 300 sec: 50762.6). Total num frames: 3453943808. Throughput: 0: 50844.9. Samples: 1206818860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:44:32,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 15:44:35,512][49750] Updated weights for policy 0, policy_version 210821 (0.0028) [2024-04-26 15:44:37,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50707.3). Total num frames: 3454173184. Throughput: 0: 50939.1. Samples: 1206969300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:44:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 15:44:38,019][49750] Updated weights for policy 0, policy_version 210831 (0.0030) [2024-04-26 15:44:41,972][49750] Updated weights for policy 0, policy_version 210841 (0.0026) [2024-04-26 15:44:42,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3454418944. Throughput: 0: 50939.5. Samples: 1207278340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 15:44:42,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 15:44:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000210841_3454418944.pth... [2024-04-26 15:44:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000210101_3442294784.pth [2024-04-26 15:44:44,591][49750] Updated weights for policy 0, policy_version 210851 (0.0027) [2024-04-26 15:44:47,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3454697472. Throughput: 0: 50732.1. Samples: 1207580300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:44:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 15:44:48,342][49750] Updated weights for policy 0, policy_version 210861 (0.0035) [2024-04-26 15:44:51,009][49750] Updated weights for policy 0, policy_version 210871 (0.0027) [2024-04-26 15:44:52,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.3, 300 sec: 50651.5). Total num frames: 3454943232. Throughput: 0: 50826.7. Samples: 1207740200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:44:52,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 15:44:54,683][49750] Updated weights for policy 0, policy_version 210881 (0.0027) [2024-04-26 15:44:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3455188992. Throughput: 0: 50828.8. Samples: 1208046800. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:44:57,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:44:57,534][49750] Updated weights for policy 0, policy_version 210891 (0.0038) [2024-04-26 15:45:01,151][49750] Updated weights for policy 0, policy_version 210901 (0.0034) [2024-04-26 15:45:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 3455451136. Throughput: 0: 50899.6. Samples: 1208352260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:02,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 15:45:04,007][49750] Updated weights for policy 0, policy_version 210911 (0.0030) [2024-04-26 15:45:07,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3455713280. Throughput: 0: 50909.3. Samples: 1208501600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:07,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 15:45:07,599][49750] Updated weights for policy 0, policy_version 210921 (0.0030) [2024-04-26 15:45:10,325][49750] Updated weights for policy 0, policy_version 210931 (0.0034) [2024-04-26 15:45:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3455959040. Throughput: 0: 50901.9. Samples: 1208801260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 15:45:13,995][49750] Updated weights for policy 0, policy_version 210941 (0.0030) [2024-04-26 15:45:16,695][49750] Updated weights for policy 0, policy_version 210951 (0.0029) [2024-04-26 15:45:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3456221184. Throughput: 0: 50880.1. Samples: 1209108460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:17,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 15:45:20,541][49750] Updated weights for policy 0, policy_version 210961 (0.0040) [2024-04-26 15:45:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3456466944. Throughput: 0: 50950.5. Samples: 1209262080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:22,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 15:45:23,140][49750] Updated weights for policy 0, policy_version 210971 (0.0030) [2024-04-26 15:45:27,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3456696320. Throughput: 0: 50955.2. Samples: 1209571320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:27,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 15:45:27,148][49750] Updated weights for policy 0, policy_version 210981 (0.0035) [2024-04-26 15:45:29,690][49750] Updated weights for policy 0, policy_version 210991 (0.0032) [2024-04-26 15:45:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3456958464. Throughput: 0: 50996.1. Samples: 1209875120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:32,071][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 15:45:33,592][49750] Updated weights for policy 0, policy_version 211001 (0.0028) [2024-04-26 15:45:35,473][49728] Signal inference workers to stop experience collection... (18150 times) [2024-04-26 15:45:35,474][49728] Signal inference workers to resume experience collection... 
(18150 times) [2024-04-26 15:45:35,504][49750] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-04-26 15:45:35,504][49750] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-04-26 15:45:36,133][49750] Updated weights for policy 0, policy_version 211011 (0.0029) [2024-04-26 15:45:37,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3457236992. Throughput: 0: 50770.8. Samples: 1210024880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:37,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 15:45:40,123][49750] Updated weights for policy 0, policy_version 211021 (0.0028) [2024-04-26 15:45:42,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.6, 300 sec: 50651.5). Total num frames: 3457482752. Throughput: 0: 50770.7. Samples: 1210331480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:42,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:45:42,737][49750] Updated weights for policy 0, policy_version 211031 (0.0029) [2024-04-26 15:45:46,442][49750] Updated weights for policy 0, policy_version 211041 (0.0030) [2024-04-26 15:45:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3457744896. Throughput: 0: 50861.3. Samples: 1210641020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 15:45:49,262][49750] Updated weights for policy 0, policy_version 211051 (0.0034) [2024-04-26 15:45:52,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3457990656. Throughput: 0: 50816.8. Samples: 1210788360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 15:45:52,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:45:52,934][49750] Updated weights for policy 0, policy_version 211061 (0.0035) [2024-04-26 15:45:55,610][49750] Updated weights for policy 0, policy_version 211071 (0.0029) [2024-04-26 15:45:57,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3458252800. Throughput: 0: 50921.3. Samples: 1211092720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:45:57,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 15:45:59,309][49750] Updated weights for policy 0, policy_version 211081 (0.0038) [2024-04-26 15:46:02,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3458498560. Throughput: 0: 50827.1. Samples: 1211395680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:02,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 15:46:02,110][49750] Updated weights for policy 0, policy_version 211091 (0.0029) [2024-04-26 15:46:05,716][49750] Updated weights for policy 0, policy_version 211101 (0.0030) [2024-04-26 15:46:07,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3458727936. Throughput: 0: 50869.5. Samples: 1211551200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:07,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 15:46:08,534][49750] Updated weights for policy 0, policy_version 211111 (0.0032) [2024-04-26 15:46:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3458990080. Throughput: 0: 50699.5. Samples: 1211852800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:12,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 15:46:12,084][49750] Updated weights for policy 0, policy_version 211121 (0.0030) [2024-04-26 15:46:15,072][49750] Updated weights for policy 0, policy_version 211131 (0.0031) [2024-04-26 15:46:17,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3459252224. Throughput: 0: 50699.2. Samples: 1212156580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:17,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 15:46:18,576][49750] Updated weights for policy 0, policy_version 211141 (0.0029) [2024-04-26 15:46:21,481][49750] Updated weights for policy 0, policy_version 211151 (0.0033) [2024-04-26 15:46:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3459514368. Throughput: 0: 50636.7. Samples: 1212303540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:22,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 15:46:24,973][49750] Updated weights for policy 0, policy_version 211161 (0.0032) [2024-04-26 15:46:27,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3459760128. Throughput: 0: 50688.0. Samples: 1212612440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 15:46:27,856][49750] Updated weights for policy 0, policy_version 211171 (0.0032) [2024-04-26 15:46:31,390][49750] Updated weights for policy 0, policy_version 211181 (0.0029) [2024-04-26 15:46:32,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3460005888. Throughput: 0: 50533.6. Samples: 1212915040. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:32,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 15:46:34,316][49750] Updated weights for policy 0, policy_version 211191 (0.0033) [2024-04-26 15:46:37,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3460251648. Throughput: 0: 50577.4. Samples: 1213064340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:37,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 15:46:37,807][49750] Updated weights for policy 0, policy_version 211201 (0.0031) [2024-04-26 15:46:40,753][49750] Updated weights for policy 0, policy_version 211211 (0.0033) [2024-04-26 15:46:42,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3460530176. Throughput: 0: 50669.7. Samples: 1213372860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:42,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 15:46:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000211214_3460530176.pth... [2024-04-26 15:46:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000210471_3448356864.pth [2024-04-26 15:46:44,387][49750] Updated weights for policy 0, policy_version 211221 (0.0031) [2024-04-26 15:46:47,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3460775936. Throughput: 0: 50556.9. Samples: 1213670740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:47,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:46:47,277][49750] Updated weights for policy 0, policy_version 211231 (0.0032) [2024-04-26 15:46:50,694][49750] Updated weights for policy 0, policy_version 211241 (0.0031) [2024-04-26 15:46:52,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3461021696. Throughput: 0: 50655.1. Samples: 1213830680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:52,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 15:46:53,804][49750] Updated weights for policy 0, policy_version 211251 (0.0034) [2024-04-26 15:46:57,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3461267456. Throughput: 0: 50649.9. Samples: 1214132040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 15:46:57,071][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 15:46:57,237][49750] Updated weights for policy 0, policy_version 211261 (0.0031) [2024-04-26 15:47:00,207][49750] Updated weights for policy 0, policy_version 211271 (0.0033) [2024-04-26 15:47:00,854][49728] Signal inference workers to stop experience collection... (18200 times) [2024-04-26 15:47:00,905][49750] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-04-26 15:47:00,920][49728] Signal inference workers to resume experience collection... (18200 times) [2024-04-26 15:47:00,927][49750] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-04-26 15:47:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3461529600. Throughput: 0: 50656.8. Samples: 1214436140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:02,063][49517] Avg episode reward: [(0, '0.438')] [2024-04-26 15:47:03,696][49750] Updated weights for policy 0, policy_version 211281 (0.0031) [2024-04-26 15:47:06,599][49750] Updated weights for policy 0, policy_version 211291 (0.0035) [2024-04-26 15:47:07,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3461791744. Throughput: 0: 50910.3. Samples: 1214594500. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 15:47:10,212][49750] Updated weights for policy 0, policy_version 211301 (0.0028) [2024-04-26 15:47:12,063][49517] Fps is (10 sec: 52427.5, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3462053888. Throughput: 0: 50706.4. Samples: 1214894240. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:12,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 15:47:13,163][49750] Updated weights for policy 0, policy_version 211311 (0.0031) [2024-04-26 15:47:16,868][49750] Updated weights for policy 0, policy_version 211321 (0.0036) [2024-04-26 15:47:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3462283264. Throughput: 0: 50839.2. Samples: 1215202800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 15:47:19,873][49750] Updated weights for policy 0, policy_version 211331 (0.0037) [2024-04-26 15:47:22,062][49517] Fps is (10 sec: 49153.5, 60 sec: 50517.6, 300 sec: 50818.2). Total num frames: 3462545408. Throughput: 0: 50831.3. Samples: 1215351740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:22,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 15:47:23,241][49750] Updated weights for policy 0, policy_version 211341 (0.0030) [2024-04-26 15:47:26,324][49750] Updated weights for policy 0, policy_version 211351 (0.0035) [2024-04-26 15:47:27,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3462823936. Throughput: 0: 50757.8. Samples: 1215656960. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:27,071][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 15:47:29,697][49750] Updated weights for policy 0, policy_version 211361 (0.0029) [2024-04-26 15:47:32,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3463069696. Throughput: 0: 50873.7. Samples: 1215960060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:32,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 15:47:32,678][49750] Updated weights for policy 0, policy_version 211371 (0.0031) [2024-04-26 15:47:36,132][49750] Updated weights for policy 0, policy_version 211381 (0.0036) [2024-04-26 15:47:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3463315456. Throughput: 0: 50792.4. Samples: 1216116340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:37,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 15:47:39,210][49750] Updated weights for policy 0, policy_version 211391 (0.0034) [2024-04-26 15:47:42,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3463561216. Throughput: 0: 50844.1. Samples: 1216420020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:42,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 15:47:42,522][49750] Updated weights for policy 0, policy_version 211401 (0.0027) [2024-04-26 15:47:45,622][49750] Updated weights for policy 0, policy_version 211411 (0.0024) [2024-04-26 15:47:47,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.1, 300 sec: 50707.1). Total num frames: 3463806976. Throughput: 0: 50844.2. Samples: 1216724140. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 15:47:48,940][49750] Updated weights for policy 0, policy_version 211421 (0.0029) [2024-04-26 15:47:51,990][49750] Updated weights for policy 0, policy_version 211431 (0.0031) [2024-04-26 15:47:52,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3464085504. Throughput: 0: 50691.6. Samples: 1216875620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:52,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 15:47:55,506][49750] Updated weights for policy 0, policy_version 211441 (0.0027) [2024-04-26 15:47:57,062][49517] Fps is (10 sec: 52430.3, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3464331264. Throughput: 0: 50955.5. Samples: 1217187220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:47:57,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 15:47:58,342][49750] Updated weights for policy 0, policy_version 211451 (0.0030) [2024-04-26 15:48:01,949][49750] Updated weights for policy 0, policy_version 211461 (0.0028) [2024-04-26 15:48:02,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3464577024. Throughput: 0: 50874.8. Samples: 1217492160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:48:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 15:48:04,698][49750] Updated weights for policy 0, policy_version 211471 (0.0030) [2024-04-26 15:48:07,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3464839168. Throughput: 0: 50857.3. Samples: 1217640320. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 15:48:07,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:48:08,278][49750] Updated weights for policy 0, policy_version 211481 (0.0035) [2024-04-26 15:48:11,210][49750] Updated weights for policy 0, policy_version 211491 (0.0035) [2024-04-26 15:48:12,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 3465101312. Throughput: 0: 50876.5. Samples: 1217946400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:12,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 15:48:14,683][49750] Updated weights for policy 0, policy_version 211501 (0.0033) [2024-04-26 15:48:15,328][49728] Signal inference workers to stop experience collection... (18250 times) [2024-04-26 15:48:15,328][49728] Signal inference workers to resume experience collection... (18250 times) [2024-04-26 15:48:15,343][49750] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-04-26 15:48:15,343][49750] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-04-26 15:48:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3465347072. Throughput: 0: 50911.0. Samples: 1218251060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 15:48:17,664][49750] Updated weights for policy 0, policy_version 211511 (0.0035) [2024-04-26 15:48:21,098][49750] Updated weights for policy 0, policy_version 211521 (0.0033) [2024-04-26 15:48:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3465609216. Throughput: 0: 50710.6. Samples: 1218398320. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:22,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 15:48:24,201][49750] Updated weights for policy 0, policy_version 211531 (0.0035) [2024-04-26 15:48:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3465838592. Throughput: 0: 50766.1. Samples: 1218704500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:27,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 15:48:27,569][49750] Updated weights for policy 0, policy_version 211541 (0.0037) [2024-04-26 15:48:30,588][49750] Updated weights for policy 0, policy_version 211551 (0.0039) [2024-04-26 15:48:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3466117120. Throughput: 0: 50718.5. Samples: 1219006460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:32,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 15:48:34,086][49750] Updated weights for policy 0, policy_version 211561 (0.0035) [2024-04-26 15:48:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3466362880. Throughput: 0: 50795.7. Samples: 1219161420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:37,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 15:48:37,153][49750] Updated weights for policy 0, policy_version 211571 (0.0034) [2024-04-26 15:48:40,438][49750] Updated weights for policy 0, policy_version 211581 (0.0032) [2024-04-26 15:48:42,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3466608640. Throughput: 0: 50616.3. Samples: 1219464960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:42,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 15:48:42,187][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000211586_3466625024.pth... 
[2024-04-26 15:48:42,237][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000210841_3454418944.pth [2024-04-26 15:48:43,510][49750] Updated weights for policy 0, policy_version 211591 (0.0033) [2024-04-26 15:48:46,781][49750] Updated weights for policy 0, policy_version 211601 (0.0029) [2024-04-26 15:48:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 3466870784. Throughput: 0: 50741.3. Samples: 1219775520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:47,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 15:48:50,038][49750] Updated weights for policy 0, policy_version 211611 (0.0029) [2024-04-26 15:48:52,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3467116544. Throughput: 0: 50661.2. Samples: 1219920080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:52,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 15:48:53,232][49750] Updated weights for policy 0, policy_version 211621 (0.0027) [2024-04-26 15:48:56,390][49750] Updated weights for policy 0, policy_version 211631 (0.0036) [2024-04-26 15:48:57,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3467378688. Throughput: 0: 50736.3. Samples: 1220229540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:48:57,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 15:48:59,827][49750] Updated weights for policy 0, policy_version 211641 (0.0030) [2024-04-26 15:49:02,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3467640832. Throughput: 0: 50751.7. Samples: 1220534880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:49:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 15:49:02,860][49750] Updated weights for policy 0, policy_version 211651 (0.0031) [2024-04-26 15:49:06,197][49750] Updated weights for policy 0, policy_version 211661 (0.0031) [2024-04-26 15:49:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3467902976. Throughput: 0: 50747.1. Samples: 1220681940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:49:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 15:49:09,326][49750] Updated weights for policy 0, policy_version 211671 (0.0030) [2024-04-26 15:49:12,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3468132352. Throughput: 0: 50754.7. Samples: 1220988460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 15:49:12,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 15:49:12,807][49750] Updated weights for policy 0, policy_version 211681 (0.0029) [2024-04-26 15:49:15,811][49750] Updated weights for policy 0, policy_version 211691 (0.0037) [2024-04-26 15:49:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3468394496. Throughput: 0: 50735.5. Samples: 1221289560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:49:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 15:49:19,234][49750] Updated weights for policy 0, policy_version 211701 (0.0029) [2024-04-26 15:49:20,392][49728] Signal inference workers to stop experience collection... (18300 times) [2024-04-26 15:49:20,393][49728] Signal inference workers to resume experience collection... 
(18300 times) [2024-04-26 15:49:20,416][49750] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-04-26 15:49:20,417][49750] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-04-26 15:49:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3468656640. Throughput: 0: 50721.4. Samples: 1221443880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:49:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 15:49:22,086][49750] Updated weights for policy 0, policy_version 211711 (0.0028) [2024-04-26 15:49:25,516][49750] Updated weights for policy 0, policy_version 211721 (0.0036) [2024-04-26 15:49:27,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3468918784. Throughput: 0: 50925.2. Samples: 1221756600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:49:27,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 15:49:28,672][49750] Updated weights for policy 0, policy_version 211731 (0.0036) [2024-04-26 15:49:31,901][49750] Updated weights for policy 0, policy_version 211741 (0.0031) [2024-04-26 15:49:32,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3469164544. Throughput: 0: 50758.5. Samples: 1222059660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:49:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 15:49:35,052][49750] Updated weights for policy 0, policy_version 211751 (0.0029) [2024-04-26 15:49:37,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3469410304. Throughput: 0: 51109.9. Samples: 1222220020. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:49:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 15:49:38,299][49750] Updated weights for policy 0, policy_version 211761 (0.0029) [2024-04-26 15:49:41,469][49750] Updated weights for policy 0, policy_version 211771 (0.0033) [2024-04-26 15:49:42,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3469672448. Throughput: 0: 50878.1. Samples: 1222519060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:49:42,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 15:49:44,719][49750] Updated weights for policy 0, policy_version 211781 (0.0034) [2024-04-26 15:49:47,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3469918208. Throughput: 0: 50916.5. Samples: 1222826120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:49:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 15:49:47,897][49750] Updated weights for policy 0, policy_version 211791 (0.0031) [2024-04-26 15:49:51,011][49750] Updated weights for policy 0, policy_version 211801 (0.0032) [2024-04-26 15:49:52,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3470180352. Throughput: 0: 51190.3. Samples: 1222985500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:49:52,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 15:49:54,486][49750] Updated weights for policy 0, policy_version 211811 (0.0027) [2024-04-26 15:49:57,063][49517] Fps is (10 sec: 49150.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3470409728. Throughput: 0: 51007.4. Samples: 1223283800. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:49:57,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 15:49:57,530][49750] Updated weights for policy 0, policy_version 211821 (0.0031) [2024-04-26 15:50:00,858][49750] Updated weights for policy 0, policy_version 211831 (0.0033) [2024-04-26 15:50:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3470688256. Throughput: 0: 51054.6. Samples: 1223587020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:50:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 15:50:03,953][49750] Updated weights for policy 0, policy_version 211841 (0.0029) [2024-04-26 15:50:07,063][49517] Fps is (10 sec: 54067.6, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3470950400. Throughput: 0: 51117.6. Samples: 1223744180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:50:07,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 15:50:07,132][49750] Updated weights for policy 0, policy_version 211851 (0.0032) [2024-04-26 15:50:10,456][49750] Updated weights for policy 0, policy_version 211861 (0.0034) [2024-04-26 15:50:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3471196160. Throughput: 0: 50877.4. Samples: 1224046080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:50:12,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 15:50:13,611][49750] Updated weights for policy 0, policy_version 211871 (0.0030) [2024-04-26 15:50:16,803][49750] Updated weights for policy 0, policy_version 211881 (0.0031) [2024-04-26 15:50:17,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3471458304. Throughput: 0: 50992.0. Samples: 1224354300. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:50:17,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 15:50:20,061][49750] Updated weights for policy 0, policy_version 211891 (0.0034) [2024-04-26 15:50:20,225][49728] Signal inference workers to stop experience collection... (18350 times) [2024-04-26 15:50:20,225][49728] Signal inference workers to resume experience collection... (18350 times) [2024-04-26 15:50:20,251][49750] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-04-26 15:50:20,252][49750] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-04-26 15:50:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3471704064. Throughput: 0: 50989.8. Samples: 1224514560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 15:50:22,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 15:50:23,180][49750] Updated weights for policy 0, policy_version 211901 (0.0027) [2024-04-26 15:50:26,494][49750] Updated weights for policy 0, policy_version 211911 (0.0027) [2024-04-26 15:50:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3471982592. Throughput: 0: 51066.5. Samples: 1224817040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:50:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 15:50:29,730][49750] Updated weights for policy 0, policy_version 211921 (0.0027) [2024-04-26 15:50:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3472195584. Throughput: 0: 50990.9. Samples: 1225120720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:50:32,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 15:50:32,852][49750] Updated weights for policy 0, policy_version 211931 (0.0030) [2024-04-26 15:50:36,121][49750] Updated weights for policy 0, policy_version 211941 (0.0032) [2024-04-26 15:50:37,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3472457728. Throughput: 0: 50642.2. Samples: 1225264400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:50:37,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 15:50:39,368][49750] Updated weights for policy 0, policy_version 211951 (0.0028) [2024-04-26 15:50:42,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3472752640. Throughput: 0: 50984.0. Samples: 1225578080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:50:42,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 15:50:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000211960_3472752640.pth... [2024-04-26 15:50:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000211214_3460530176.pth [2024-04-26 15:50:42,513][49750] Updated weights for policy 0, policy_version 211961 (0.0028) [2024-04-26 15:50:45,827][49750] Updated weights for policy 0, policy_version 211971 (0.0030) [2024-04-26 15:50:47,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.2, 300 sec: 50818.2). Total num frames: 3472982016. Throughput: 0: 51109.2. Samples: 1225886940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:50:47,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 15:50:48,870][49750] Updated weights for policy 0, policy_version 211981 (0.0030) [2024-04-26 15:50:52,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3473227776. Throughput: 0: 50881.3. Samples: 1226033840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:50:52,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 15:50:52,278][49750] Updated weights for policy 0, policy_version 211991 (0.0031) [2024-04-26 15:50:55,404][49750] Updated weights for policy 0, policy_version 212001 (0.0029) [2024-04-26 15:50:57,062][49517] Fps is (10 sec: 49153.1, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3473473536. Throughput: 0: 50844.5. Samples: 1226334080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:50:57,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:50:58,800][49750] Updated weights for policy 0, policy_version 212011 (0.0036) [2024-04-26 15:51:01,832][49750] Updated weights for policy 0, policy_version 212021 (0.0028) [2024-04-26 15:51:02,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3473752064. Throughput: 0: 50816.2. Samples: 1226641020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:51:02,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 15:51:05,298][49750] Updated weights for policy 0, policy_version 212031 (0.0028) [2024-04-26 15:51:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3473997824. Throughput: 0: 50770.2. Samples: 1226799220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:51:07,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 15:51:08,252][49750] Updated weights for policy 0, policy_version 212041 (0.0036) [2024-04-26 15:51:11,683][49750] Updated weights for policy 0, policy_version 212051 (0.0031) [2024-04-26 15:51:12,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3474243584. Throughput: 0: 50807.8. Samples: 1227103400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:51:12,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 15:51:14,641][49750] Updated weights for policy 0, policy_version 212061 (0.0029) [2024-04-26 15:51:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3474489344. Throughput: 0: 50754.3. Samples: 1227404660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:51:17,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 15:51:18,147][49750] Updated weights for policy 0, policy_version 212071 (0.0029) [2024-04-26 15:51:18,902][49728] Signal inference workers to stop experience collection... (18400 times) [2024-04-26 15:51:18,903][49728] Signal inference workers to resume experience collection... (18400 times) [2024-04-26 15:51:18,915][49750] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-04-26 15:51:18,916][49750] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-04-26 15:51:21,165][49750] Updated weights for policy 0, policy_version 212081 (0.0031) [2024-04-26 15:51:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3474735104. Throughput: 0: 50865.8. Samples: 1227553360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:51:22,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 15:51:24,660][49750] Updated weights for policy 0, policy_version 212091 (0.0031) [2024-04-26 15:51:27,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 3475030016. Throughput: 0: 50654.4. Samples: 1227857520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 15:51:27,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 15:51:27,741][49750] Updated weights for policy 0, policy_version 212101 (0.0040) [2024-04-26 15:51:31,145][49750] Updated weights for policy 0, policy_version 212111 (0.0041) [2024-04-26 15:51:32,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 3475275776. Throughput: 0: 50722.7. Samples: 1228169460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:51:32,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 15:51:34,195][49750] Updated weights for policy 0, policy_version 212121 (0.0035) [2024-04-26 15:51:37,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3475505152. Throughput: 0: 50724.0. Samples: 1228316420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:51:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 15:51:37,571][49750] Updated weights for policy 0, policy_version 212131 (0.0030) [2024-04-26 15:51:40,624][49750] Updated weights for policy 0, policy_version 212141 (0.0031) [2024-04-26 15:51:42,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3475783680. Throughput: 0: 50775.9. Samples: 1228619000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:51:42,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 15:51:44,269][49750] Updated weights for policy 0, policy_version 212151 (0.0034) [2024-04-26 15:51:47,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3476029440. Throughput: 0: 50647.8. Samples: 1228920180. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:51:47,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 15:51:47,247][49750] Updated weights for policy 0, policy_version 212161 (0.0030) [2024-04-26 15:51:50,549][49750] Updated weights for policy 0, policy_version 212171 (0.0029) [2024-04-26 15:51:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3476275200. Throughput: 0: 50643.5. Samples: 1229078180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:51:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 15:51:53,790][49750] Updated weights for policy 0, policy_version 212181 (0.0033) [2024-04-26 15:51:57,031][49750] Updated weights for policy 0, policy_version 212191 (0.0032) [2024-04-26 15:51:57,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3476537344. Throughput: 0: 50664.0. Samples: 1229383280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:51:57,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 15:52:00,164][49750] Updated weights for policy 0, policy_version 212201 (0.0035) [2024-04-26 15:52:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.1, 300 sec: 50818.2). Total num frames: 3476783104. Throughput: 0: 50699.3. Samples: 1229686140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:52:02,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:52:03,403][49750] Updated weights for policy 0, policy_version 212211 (0.0028) [2024-04-26 15:52:06,746][49750] Updated weights for policy 0, policy_version 212221 (0.0030) [2024-04-26 15:52:07,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3477028864. Throughput: 0: 50933.4. Samples: 1229845360. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:52:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 15:52:09,771][49750] Updated weights for policy 0, policy_version 212231 (0.0040) [2024-04-26 15:52:12,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3477307392. Throughput: 0: 50915.1. Samples: 1230148700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:52:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:52:13,300][49750] Updated weights for policy 0, policy_version 212241 (0.0033) [2024-04-26 15:52:16,402][49750] Updated weights for policy 0, policy_version 212251 (0.0036) [2024-04-26 15:52:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3477553152. Throughput: 0: 50645.6. Samples: 1230448500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:52:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 15:52:17,725][49728] Signal inference workers to stop experience collection... (18450 times) [2024-04-26 15:52:17,758][49750] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-04-26 15:52:17,791][49728] Signal inference workers to resume experience collection... (18450 times) [2024-04-26 15:52:17,792][49750] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-04-26 15:52:19,773][49750] Updated weights for policy 0, policy_version 212261 (0.0030) [2024-04-26 15:52:22,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3477782528. Throughput: 0: 50878.6. Samples: 1230605960. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:52:22,072][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:52:22,805][49750] Updated weights for policy 0, policy_version 212271 (0.0032) [2024-04-26 15:52:26,111][49750] Updated weights for policy 0, policy_version 212281 (0.0028) [2024-04-26 15:52:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3478061056. Throughput: 0: 50883.1. Samples: 1230908740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:52:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 15:52:29,244][49750] Updated weights for policy 0, policy_version 212291 (0.0028) [2024-04-26 15:52:32,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 3478306816. Throughput: 0: 50781.8. Samples: 1231205360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:52:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 15:52:32,595][49750] Updated weights for policy 0, policy_version 212301 (0.0028) [2024-04-26 15:52:35,727][49750] Updated weights for policy 0, policy_version 212311 (0.0031) [2024-04-26 15:52:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3478568960. Throughput: 0: 50775.2. Samples: 1231363060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 15:52:37,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 15:52:38,890][49750] Updated weights for policy 0, policy_version 212321 (0.0029) [2024-04-26 15:52:42,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3478798336. Throughput: 0: 50783.7. Samples: 1231668540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:52:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:52:42,281][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000212331_3478831104.pth... 
[2024-04-26 15:52:42,282][49750] Updated weights for policy 0, policy_version 212331 (0.0033) [2024-04-26 15:52:42,328][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000211586_3466625024.pth [2024-04-26 15:52:45,342][49750] Updated weights for policy 0, policy_version 212341 (0.0036) [2024-04-26 15:52:47,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3479060480. Throughput: 0: 50828.9. Samples: 1231973440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:52:47,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 15:52:48,743][49750] Updated weights for policy 0, policy_version 212351 (0.0036) [2024-04-26 15:52:51,734][49750] Updated weights for policy 0, policy_version 212361 (0.0030) [2024-04-26 15:52:52,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3479339008. Throughput: 0: 50677.6. Samples: 1232125860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:52:52,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 15:52:55,132][49750] Updated weights for policy 0, policy_version 212371 (0.0032) [2024-04-26 15:52:57,062][49517] Fps is (10 sec: 52430.0, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 3479584768. Throughput: 0: 50692.5. Samples: 1232429860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:52:57,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:52:58,229][49750] Updated weights for policy 0, policy_version 212381 (0.0035) [2024-04-26 15:53:01,609][49750] Updated weights for policy 0, policy_version 212391 (0.0029) [2024-04-26 15:53:02,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3479830528. Throughput: 0: 50783.3. Samples: 1232733760. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 15:53:04,600][49750] Updated weights for policy 0, policy_version 212401 (0.0028) [2024-04-26 15:53:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3480076288. Throughput: 0: 50597.6. Samples: 1232882840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:07,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 15:53:08,065][49750] Updated weights for policy 0, policy_version 212411 (0.0033) [2024-04-26 15:53:11,014][49750] Updated weights for policy 0, policy_version 212421 (0.0029) [2024-04-26 15:53:12,063][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3480322048. Throughput: 0: 50789.2. Samples: 1233194260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:12,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 15:53:14,500][49750] Updated weights for policy 0, policy_version 212431 (0.0031) [2024-04-26 15:53:17,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3480600576. Throughput: 0: 50930.0. Samples: 1233497200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:17,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 15:53:17,435][49750] Updated weights for policy 0, policy_version 212441 (0.0029) [2024-04-26 15:53:20,817][49750] Updated weights for policy 0, policy_version 212451 (0.0026) [2024-04-26 15:53:22,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 3480862720. Throughput: 0: 50891.9. Samples: 1233653200. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:53:23,805][49750] Updated weights for policy 0, policy_version 212461 (0.0033) [2024-04-26 15:53:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3481092096. Throughput: 0: 50846.7. Samples: 1233956640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 15:53:27,317][49750] Updated weights for policy 0, policy_version 212471 (0.0028) [2024-04-26 15:53:29,713][49728] Signal inference workers to stop experience collection... (18500 times) [2024-04-26 15:53:29,762][49750] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-04-26 15:53:29,783][49728] Signal inference workers to resume experience collection... (18500 times) [2024-04-26 15:53:29,784][49750] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-04-26 15:53:30,198][49750] Updated weights for policy 0, policy_version 212481 (0.0034) [2024-04-26 15:53:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3481354240. Throughput: 0: 50914.3. Samples: 1234264580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 15:53:33,828][49750] Updated weights for policy 0, policy_version 212491 (0.0032) [2024-04-26 15:53:36,565][49750] Updated weights for policy 0, policy_version 212501 (0.0031) [2024-04-26 15:53:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3481616384. Throughput: 0: 50715.9. Samples: 1234408080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:37,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 15:53:40,154][49750] Updated weights for policy 0, policy_version 212511 (0.0029) [2024-04-26 15:53:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3481862144. Throughput: 0: 50769.6. Samples: 1234714500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 15:53:43,055][49750] Updated weights for policy 0, policy_version 212521 (0.0028) [2024-04-26 15:53:46,630][49750] Updated weights for policy 0, policy_version 212531 (0.0029) [2024-04-26 15:53:47,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.7, 300 sec: 50873.7). Total num frames: 3482124288. Throughput: 0: 50887.0. Samples: 1235023660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 15:53:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 15:53:49,671][49750] Updated weights for policy 0, policy_version 212541 (0.0036) [2024-04-26 15:53:52,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3482386432. Throughput: 0: 50789.9. Samples: 1235168400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:53:52,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 15:53:53,051][49750] Updated weights for policy 0, policy_version 212551 (0.0032) [2024-04-26 15:53:56,043][49750] Updated weights for policy 0, policy_version 212561 (0.0036) [2024-04-26 15:53:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3482615808. Throughput: 0: 50689.5. Samples: 1235475280. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:53:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 15:53:59,452][49750] Updated weights for policy 0, policy_version 212571 (0.0034) [2024-04-26 15:54:02,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3482877952. Throughput: 0: 50731.1. Samples: 1235780100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:02,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 15:54:02,868][49750] Updated weights for policy 0, policy_version 212581 (0.0027) [2024-04-26 15:54:05,836][49750] Updated weights for policy 0, policy_version 212591 (0.0031) [2024-04-26 15:54:07,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3483123712. Throughput: 0: 50577.8. Samples: 1235929200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 15:54:09,449][49750] Updated weights for policy 0, policy_version 212601 (0.0032) [2024-04-26 15:54:12,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3483385856. Throughput: 0: 50731.9. Samples: 1236239580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:12,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 15:54:12,264][49750] Updated weights for policy 0, policy_version 212611 (0.0027) [2024-04-26 15:54:16,094][49750] Updated weights for policy 0, policy_version 212621 (0.0031) [2024-04-26 15:54:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3483631616. Throughput: 0: 50629.8. Samples: 1236542920. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:17,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 15:54:18,658][49750] Updated weights for policy 0, policy_version 212631 (0.0033) [2024-04-26 15:54:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3483877376. Throughput: 0: 50746.4. Samples: 1236691660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:22,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 15:54:22,740][49750] Updated weights for policy 0, policy_version 212641 (0.0035) [2024-04-26 15:54:25,063][49750] Updated weights for policy 0, policy_version 212651 (0.0032) [2024-04-26 15:54:27,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3484155904. Throughput: 0: 50656.9. Samples: 1236994060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:27,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 15:54:29,208][49750] Updated weights for policy 0, policy_version 212661 (0.0032) [2024-04-26 15:54:31,645][49750] Updated weights for policy 0, policy_version 212671 (0.0028) [2024-04-26 15:54:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3484401664. Throughput: 0: 50553.2. Samples: 1237298560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:32,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 15:54:35,534][49750] Updated weights for policy 0, policy_version 212681 (0.0028) [2024-04-26 15:54:37,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3484663808. Throughput: 0: 50827.2. Samples: 1237455620. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:54:38,282][49750] Updated weights for policy 0, policy_version 212691 (0.0040) [2024-04-26 15:54:39,374][49728] Signal inference workers to stop experience collection... (18550 times) [2024-04-26 15:54:39,374][49728] Signal inference workers to resume experience collection... (18550 times) [2024-04-26 15:54:39,388][49750] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-04-26 15:54:39,389][49750] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-04-26 15:54:41,797][49750] Updated weights for policy 0, policy_version 212701 (0.0032) [2024-04-26 15:54:42,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3484893184. Throughput: 0: 50699.0. Samples: 1237756740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 15:54:42,083][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000212701_3484893184.pth... [2024-04-26 15:54:42,150][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000211960_3472752640.pth [2024-04-26 15:54:44,893][49750] Updated weights for policy 0, policy_version 212711 (0.0026) [2024-04-26 15:54:47,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3485155328. Throughput: 0: 50624.9. Samples: 1238058220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:47,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 15:54:48,221][49750] Updated weights for policy 0, policy_version 212721 (0.0028) [2024-04-26 15:54:51,495][49750] Updated weights for policy 0, policy_version 212731 (0.0031) [2024-04-26 15:54:52,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 3485417472. Throughput: 0: 50822.2. 
Samples: 1238216200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-26 15:54:52,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 15:54:54,683][49750] Updated weights for policy 0, policy_version 212741 (0.0035) [2024-04-26 15:54:57,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3485679616. Throughput: 0: 50664.9. Samples: 1238519500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:54:57,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 15:54:57,949][49750] Updated weights for policy 0, policy_version 212751 (0.0029) [2024-04-26 15:55:01,364][49750] Updated weights for policy 0, policy_version 212761 (0.0031) [2024-04-26 15:55:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3485908992. Throughput: 0: 50591.6. Samples: 1238819540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:02,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 15:55:04,612][49750] Updated weights for policy 0, policy_version 212771 (0.0033) [2024-04-26 15:55:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3486171136. Throughput: 0: 50706.7. Samples: 1238973460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:07,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 15:55:08,120][49750] Updated weights for policy 0, policy_version 212781 (0.0028) [2024-04-26 15:55:11,142][49750] Updated weights for policy 0, policy_version 212791 (0.0032) [2024-04-26 15:55:12,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3486416896. Throughput: 0: 50628.9. Samples: 1239272360. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:12,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 15:55:14,507][49750] Updated weights for policy 0, policy_version 212801 (0.0030) [2024-04-26 15:55:17,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3486662656. Throughput: 0: 50485.4. Samples: 1239570400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:17,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 15:55:17,384][49750] Updated weights for policy 0, policy_version 212811 (0.0035) [2024-04-26 15:55:20,874][49750] Updated weights for policy 0, policy_version 212821 (0.0030) [2024-04-26 15:55:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3486941184. Throughput: 0: 50535.2. Samples: 1239729700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:22,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 15:55:23,816][49750] Updated weights for policy 0, policy_version 212831 (0.0030) [2024-04-26 15:55:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3487170560. Throughput: 0: 50541.0. Samples: 1240031080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 15:55:27,326][49750] Updated weights for policy 0, policy_version 212841 (0.0030) [2024-04-26 15:55:30,389][49750] Updated weights for policy 0, policy_version 212851 (0.0030) [2024-04-26 15:55:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3487432704. Throughput: 0: 50583.4. Samples: 1240334480. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:32,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 15:55:33,826][49750] Updated weights for policy 0, policy_version 212861 (0.0031) [2024-04-26 15:55:36,873][49750] Updated weights for policy 0, policy_version 212871 (0.0032) [2024-04-26 15:55:37,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3487694848. Throughput: 0: 50562.5. Samples: 1240491520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:37,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 15:55:40,158][49750] Updated weights for policy 0, policy_version 212881 (0.0033) [2024-04-26 15:55:40,491][49728] Signal inference workers to stop experience collection... (18600 times) [2024-04-26 15:55:40,521][49750] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-04-26 15:55:40,552][49728] Signal inference workers to resume experience collection... (18600 times) [2024-04-26 15:55:40,552][49750] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-04-26 15:55:42,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3487956992. Throughput: 0: 50750.8. Samples: 1240803280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:42,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 15:55:43,145][49750] Updated weights for policy 0, policy_version 212891 (0.0034) [2024-04-26 15:55:46,640][49750] Updated weights for policy 0, policy_version 212901 (0.0030) [2024-04-26 15:55:47,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3488186368. Throughput: 0: 50820.0. Samples: 1241106440. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:47,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 15:55:49,519][49750] Updated weights for policy 0, policy_version 212911 (0.0030) [2024-04-26 15:55:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3488448512. Throughput: 0: 50648.9. Samples: 1241252660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 15:55:53,053][49750] Updated weights for policy 0, policy_version 212921 (0.0036) [2024-04-26 15:55:55,994][49750] Updated weights for policy 0, policy_version 212931 (0.0031) [2024-04-26 15:55:57,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3488710656. Throughput: 0: 50740.0. Samples: 1241555660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:55:57,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 15:55:59,508][49750] Updated weights for policy 0, policy_version 212941 (0.0032) [2024-04-26 15:56:02,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3488956416. Throughput: 0: 50819.4. Samples: 1241857280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 15:56:02,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 15:56:02,493][49750] Updated weights for policy 0, policy_version 212951 (0.0031) [2024-04-26 15:56:05,908][49750] Updated weights for policy 0, policy_version 212961 (0.0032) [2024-04-26 15:56:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3489202176. Throughput: 0: 50620.9. Samples: 1242007640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:07,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 15:56:08,857][49750] Updated weights for policy 0, policy_version 212971 (0.0037) [2024-04-26 15:56:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3489447936. Throughput: 0: 50723.0. Samples: 1242313620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 15:56:12,464][49750] Updated weights for policy 0, policy_version 212981 (0.0031) [2024-04-26 15:56:15,381][49750] Updated weights for policy 0, policy_version 212991 (0.0027) [2024-04-26 15:56:17,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3489726464. Throughput: 0: 50814.2. Samples: 1242621120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:17,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 15:56:18,833][49750] Updated weights for policy 0, policy_version 213001 (0.0028) [2024-04-26 15:56:21,996][49750] Updated weights for policy 0, policy_version 213011 (0.0035) [2024-04-26 15:56:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3489972224. Throughput: 0: 50593.4. Samples: 1242768220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:22,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 15:56:25,134][49750] Updated weights for policy 0, policy_version 213021 (0.0030) [2024-04-26 15:56:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3490217984. Throughput: 0: 50445.2. Samples: 1243073320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:27,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 15:56:28,405][49750] Updated weights for policy 0, policy_version 213031 (0.0029) [2024-04-26 15:56:31,563][49750] Updated weights for policy 0, policy_version 213041 (0.0033) [2024-04-26 15:56:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3490496512. Throughput: 0: 50522.6. Samples: 1243379960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 15:56:34,679][49750] Updated weights for policy 0, policy_version 213051 (0.0027) [2024-04-26 15:56:37,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3490725888. Throughput: 0: 50621.9. Samples: 1243530640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:37,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 15:56:38,026][49750] Updated weights for policy 0, policy_version 213061 (0.0029) [2024-04-26 15:56:41,163][49750] Updated weights for policy 0, policy_version 213071 (0.0036) [2024-04-26 15:56:42,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.1, 300 sec: 50651.6). Total num frames: 3490971648. Throughput: 0: 50713.3. Samples: 1243837760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:42,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 15:56:42,077][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000213072_3490971648.pth... [2024-04-26 15:56:42,135][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000212331_3478831104.pth [2024-04-26 15:56:44,362][49750] Updated weights for policy 0, policy_version 213081 (0.0027) [2024-04-26 15:56:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3491233792. Throughput: 0: 50741.1. Samples: 1244140620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:47,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 15:56:47,669][49750] Updated weights for policy 0, policy_version 213091 (0.0026) [2024-04-26 15:56:50,717][49750] Updated weights for policy 0, policy_version 213101 (0.0029) [2024-04-26 15:56:52,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3491495936. Throughput: 0: 50791.1. Samples: 1244293240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 15:56:53,499][49728] Signal inference workers to stop experience collection... (18650 times) [2024-04-26 15:56:53,499][49728] Signal inference workers to resume experience collection... (18650 times) [2024-04-26 15:56:53,514][49750] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-04-26 15:56:53,515][49750] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-04-26 15:56:54,113][49750] Updated weights for policy 0, policy_version 213111 (0.0028) [2024-04-26 15:56:57,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3491758080. Throughput: 0: 50892.1. Samples: 1244603760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:56:57,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:56:57,171][49750] Updated weights for policy 0, policy_version 213121 (0.0029) [2024-04-26 15:57:00,495][49750] Updated weights for policy 0, policy_version 213131 (0.0032) [2024-04-26 15:57:02,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3492003840. Throughput: 0: 50794.7. Samples: 1244906880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:57:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 15:57:03,642][49750] Updated weights for policy 0, policy_version 213141 (0.0038) [2024-04-26 15:57:06,971][49750] Updated weights for policy 0, policy_version 213151 (0.0034) [2024-04-26 15:57:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3492265984. Throughput: 0: 50838.7. Samples: 1245055960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 15:57:07,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 15:57:09,979][49750] Updated weights for policy 0, policy_version 213161 (0.0027) [2024-04-26 15:57:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3492495360. Throughput: 0: 50768.8. Samples: 1245357920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:12,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 15:57:13,438][49750] Updated weights for policy 0, policy_version 213171 (0.0029) [2024-04-26 15:57:16,317][49750] Updated weights for policy 0, policy_version 213181 (0.0028) [2024-04-26 15:57:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3492773888. Throughput: 0: 50551.0. Samples: 1245654760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:17,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 15:57:19,974][49750] Updated weights for policy 0, policy_version 213191 (0.0034) [2024-04-26 15:57:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3493019648. Throughput: 0: 50878.5. Samples: 1245820180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 15:57:22,718][49750] Updated weights for policy 0, policy_version 213201 (0.0042) [2024-04-26 15:57:26,418][49750] Updated weights for policy 0, policy_version 213211 (0.0029) [2024-04-26 15:57:27,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3493265408. Throughput: 0: 50798.7. Samples: 1246123700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 15:57:29,409][49750] Updated weights for policy 0, policy_version 213221 (0.0032) [2024-04-26 15:57:32,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3493511168. Throughput: 0: 50831.0. Samples: 1246428020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 15:57:32,712][49750] Updated weights for policy 0, policy_version 213231 (0.0027) [2024-04-26 15:57:35,634][49750] Updated weights for policy 0, policy_version 213241 (0.0034) [2024-04-26 15:57:37,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3493773312. Throughput: 0: 50779.7. Samples: 1246578320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:37,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 15:57:39,068][49750] Updated weights for policy 0, policy_version 213251 (0.0031) [2024-04-26 15:57:42,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3494051840. Throughput: 0: 50793.1. Samples: 1246889460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 15:57:42,174][49750] Updated weights for policy 0, policy_version 213261 (0.0035) [2024-04-26 15:57:45,509][49750] Updated weights for policy 0, policy_version 213271 (0.0031) [2024-04-26 15:57:47,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50651.6). Total num frames: 3494281216. Throughput: 0: 50761.9. Samples: 1247191160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:47,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 15:57:48,991][49750] Updated weights for policy 0, policy_version 213281 (0.0033) [2024-04-26 15:57:52,021][49750] Updated weights for policy 0, policy_version 213291 (0.0031) [2024-04-26 15:57:52,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3494559744. Throughput: 0: 50721.6. Samples: 1247338440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:52,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 15:57:55,443][49750] Updated weights for policy 0, policy_version 213301 (0.0031) [2024-04-26 15:57:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3494789120. Throughput: 0: 50773.8. Samples: 1247642740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:57:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 15:57:58,493][49750] Updated weights for policy 0, policy_version 213311 (0.0035) [2024-04-26 15:58:00,531][49728] Signal inference workers to stop experience collection... (18700 times) [2024-04-26 15:58:00,531][49728] Signal inference workers to resume experience collection... 
(18700 times) [2024-04-26 15:58:00,555][49750] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-04-26 15:58:00,555][49750] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-04-26 15:58:01,735][49750] Updated weights for policy 0, policy_version 213321 (0.0034) [2024-04-26 15:58:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3495051264. Throughput: 0: 50919.6. Samples: 1247946140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:58:02,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 15:58:05,124][49750] Updated weights for policy 0, policy_version 213331 (0.0032) [2024-04-26 15:58:07,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3495329792. Throughput: 0: 50642.7. Samples: 1248099100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:58:07,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 15:58:08,093][49750] Updated weights for policy 0, policy_version 213341 (0.0029) [2024-04-26 15:58:11,506][49750] Updated weights for policy 0, policy_version 213351 (0.0031) [2024-04-26 15:58:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50651.5). Total num frames: 3495542784. Throughput: 0: 50841.4. Samples: 1248411560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:58:12,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 15:58:14,672][49750] Updated weights for policy 0, policy_version 213361 (0.0029) [2024-04-26 15:58:17,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3495804928. Throughput: 0: 50914.2. Samples: 1248719160. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 15:58:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 15:58:18,416][49750] Updated weights for policy 0, policy_version 213371 (0.0034) [2024-04-26 15:58:21,146][49750] Updated weights for policy 0, policy_version 213381 (0.0026) [2024-04-26 15:58:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3496067072. Throughput: 0: 50740.8. Samples: 1248861660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:58:22,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 15:58:24,944][49750] Updated weights for policy 0, policy_version 213391 (0.0031) [2024-04-26 15:58:27,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3496329216. Throughput: 0: 50804.1. Samples: 1249175640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:58:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 15:58:27,415][49750] Updated weights for policy 0, policy_version 213401 (0.0032) [2024-04-26 15:58:31,225][49750] Updated weights for policy 0, policy_version 213411 (0.0029) [2024-04-26 15:58:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3496574976. Throughput: 0: 50963.5. Samples: 1249484520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:58:32,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 15:58:33,966][49750] Updated weights for policy 0, policy_version 213421 (0.0034) [2024-04-26 15:58:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.2, 300 sec: 50762.6). Total num frames: 3496837120. Throughput: 0: 50875.1. Samples: 1249627820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:58:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 15:58:37,499][49750] Updated weights for policy 0, policy_version 213431 (0.0043) [2024-04-26 15:58:40,450][49750] Updated weights for policy 0, policy_version 213441 (0.0031) [2024-04-26 15:58:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.5, 300 sec: 50651.5). Total num frames: 3497066496. Throughput: 0: 50890.3. Samples: 1249932800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:58:42,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 15:58:42,177][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000213445_3497082880.pth... [2024-04-26 15:58:42,219][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000212701_3484893184.pth [2024-04-26 15:58:44,030][49750] Updated weights for policy 0, policy_version 213451 (0.0032) [2024-04-26 15:58:46,932][49750] Updated weights for policy 0, policy_version 213461 (0.0026) [2024-04-26 15:58:47,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3497345024. Throughput: 0: 51031.6. Samples: 1250242560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:58:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 15:58:50,473][49750] Updated weights for policy 0, policy_version 213471 (0.0031) [2024-04-26 15:58:52,062][49517] Fps is (10 sec: 55705.2, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3497623552. Throughput: 0: 50964.0. Samples: 1250392480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:58:52,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 15:58:53,202][49750] Updated weights for policy 0, policy_version 213481 (0.0039) [2024-04-26 15:58:56,751][49750] Updated weights for policy 0, policy_version 213491 (0.0028) [2024-04-26 15:58:57,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.3, 300 sec: 50707.1). 
Total num frames: 3497836544. Throughput: 0: 50915.8. Samples: 1250702780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:58:57,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 15:58:57,209][49728] Signal inference workers to stop experience collection... (18750 times) [2024-04-26 15:58:57,254][49750] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-04-26 15:58:57,317][49728] Signal inference workers to resume experience collection... (18750 times) [2024-04-26 15:58:57,317][49750] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-04-26 15:58:59,609][49750] Updated weights for policy 0, policy_version 213501 (0.0030) [2024-04-26 15:59:02,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3498098688. Throughput: 0: 51068.9. Samples: 1251017260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:59:02,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 15:59:03,153][49750] Updated weights for policy 0, policy_version 213511 (0.0030) [2024-04-26 15:59:05,981][49750] Updated weights for policy 0, policy_version 213521 (0.0030) [2024-04-26 15:59:07,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3498344448. Throughput: 0: 51015.4. Samples: 1251157360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:59:07,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 15:59:09,741][49750] Updated weights for policy 0, policy_version 213531 (0.0030) [2024-04-26 15:59:12,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 3498622976. Throughput: 0: 50822.7. Samples: 1251462660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:59:12,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 15:59:12,373][49750] Updated weights for policy 0, policy_version 213541 (0.0027) [2024-04-26 15:59:16,120][49750] Updated weights for policy 0, policy_version 213551 (0.0030) [2024-04-26 15:59:17,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3498885120. Throughput: 0: 50788.8. Samples: 1251770020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:59:17,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 15:59:18,718][49750] Updated weights for policy 0, policy_version 213561 (0.0035) [2024-04-26 15:59:22,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3499114496. Throughput: 0: 50972.5. Samples: 1251921580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 15:59:22,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 15:59:22,547][49750] Updated weights for policy 0, policy_version 213571 (0.0030) [2024-04-26 15:59:25,159][49750] Updated weights for policy 0, policy_version 213581 (0.0028) [2024-04-26 15:59:27,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3499360256. Throughput: 0: 50845.9. Samples: 1252220880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 15:59:27,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 15:59:28,966][49750] Updated weights for policy 0, policy_version 213591 (0.0030) [2024-04-26 15:59:31,746][49750] Updated weights for policy 0, policy_version 213601 (0.0031) [2024-04-26 15:59:32,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 3499655168. Throughput: 0: 50834.5. Samples: 1252530120. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 15:59:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 15:59:35,431][49750] Updated weights for policy 0, policy_version 213611 (0.0037) [2024-04-26 15:59:37,063][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3499884544. Throughput: 0: 50896.3. Samples: 1252682820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 15:59:37,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 15:59:38,130][49750] Updated weights for policy 0, policy_version 213621 (0.0034) [2024-04-26 15:59:41,850][49750] Updated weights for policy 0, policy_version 213631 (0.0032) [2024-04-26 15:59:42,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3500146688. Throughput: 0: 50774.0. Samples: 1252987600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 15:59:42,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 15:59:44,548][49750] Updated weights for policy 0, policy_version 213641 (0.0032) [2024-04-26 15:59:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3500392448. Throughput: 0: 50573.2. Samples: 1253293060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 15:59:47,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 15:59:48,220][49750] Updated weights for policy 0, policy_version 213651 (0.0031) [2024-04-26 15:59:51,386][49750] Updated weights for policy 0, policy_version 213661 (0.0029) [2024-04-26 15:59:52,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50651.5). Total num frames: 3500621824. Throughput: 0: 50561.7. Samples: 1253432640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 15:59:52,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 15:59:54,690][49750] Updated weights for policy 0, policy_version 213671 (0.0033) [2024-04-26 15:59:57,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3500883968. Throughput: 0: 50613.2. Samples: 1253740260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 15:59:57,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 15:59:57,554][49728] Signal inference workers to stop experience collection... (18800 times) [2024-04-26 15:59:57,612][49750] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-04-26 15:59:57,618][49728] Signal inference workers to resume experience collection... (18800 times) [2024-04-26 15:59:57,626][49750] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-04-26 15:59:57,750][49750] Updated weights for policy 0, policy_version 213681 (0.0033) [2024-04-26 16:00:01,031][49750] Updated weights for policy 0, policy_version 213691 (0.0031) [2024-04-26 16:00:02,062][49517] Fps is (10 sec: 55706.2, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3501178880. Throughput: 0: 50557.0. Samples: 1254045080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:00:02,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 16:00:04,151][49750] Updated weights for policy 0, policy_version 213701 (0.0030) [2024-04-26 16:00:07,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3501408256. Throughput: 0: 50736.1. Samples: 1254204700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:00:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 16:00:07,485][49750] Updated weights for policy 0, policy_version 213711 (0.0030) [2024-04-26 16:00:10,597][49750] Updated weights for policy 0, policy_version 213721 (0.0031) [2024-04-26 16:00:12,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3501637632. Throughput: 0: 50848.3. Samples: 1254509040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:00:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 16:00:13,946][49750] Updated weights for policy 0, policy_version 213731 (0.0030) [2024-04-26 16:00:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3501916160. Throughput: 0: 50694.7. Samples: 1254811380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:00:17,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 16:00:17,240][49750] Updated weights for policy 0, policy_version 213741 (0.0027) [2024-04-26 16:00:20,270][49750] Updated weights for policy 0, policy_version 213751 (0.0029) [2024-04-26 16:00:22,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3502178304. Throughput: 0: 50899.5. Samples: 1254973300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:00:22,064][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 16:00:23,806][49750] Updated weights for policy 0, policy_version 213761 (0.0027) [2024-04-26 16:00:26,819][49750] Updated weights for policy 0, policy_version 213771 (0.0039) [2024-04-26 16:00:27,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 3502440448. Throughput: 0: 50820.8. Samples: 1255274540. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:00:27,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:00:30,078][49750] Updated weights for policy 0, policy_version 213781 (0.0033) [2024-04-26 16:00:32,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3502669824. Throughput: 0: 51017.9. Samples: 1255588860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:00:32,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:00:33,118][49750] Updated weights for policy 0, policy_version 213791 (0.0029) [2024-04-26 16:00:36,564][49750] Updated weights for policy 0, policy_version 213801 (0.0031) [2024-04-26 16:00:37,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3502915584. Throughput: 0: 51030.3. Samples: 1255729000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:00:37,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 16:00:39,610][49750] Updated weights for policy 0, policy_version 213811 (0.0032) [2024-04-26 16:00:42,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3503194112. Throughput: 0: 50847.1. Samples: 1256028380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:00:42,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 16:00:42,079][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000213818_3503194112.pth... [2024-04-26 16:00:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000213072_3490971648.pth [2024-04-26 16:00:43,413][49750] Updated weights for policy 0, policy_version 213821 (0.0029) [2024-04-26 16:00:45,889][49750] Updated weights for policy 0, policy_version 213831 (0.0030) [2024-04-26 16:00:47,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3503456256. Throughput: 0: 51038.7. Samples: 1256341820. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:00:47,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 16:00:49,622][49750] Updated weights for policy 0, policy_version 213841 (0.0029) [2024-04-26 16:00:52,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3503702016. Throughput: 0: 51081.3. Samples: 1256503360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:00:52,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 16:00:52,280][49750] Updated weights for policy 0, policy_version 213851 (0.0025) [2024-04-26 16:00:56,248][49750] Updated weights for policy 0, policy_version 213861 (0.0028) [2024-04-26 16:00:57,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3503931392. Throughput: 0: 51019.5. Samples: 1256804920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:00:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 16:00:58,527][49728] Signal inference workers to stop experience collection... (18850 times) [2024-04-26 16:00:58,528][49728] Signal inference workers to resume experience collection... (18850 times) [2024-04-26 16:00:58,542][49750] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-04-26 16:00:58,543][49750] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-04-26 16:00:58,662][49750] Updated weights for policy 0, policy_version 213871 (0.0034) [2024-04-26 16:01:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3504193536. Throughput: 0: 51034.3. Samples: 1257107920. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:01:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:01:02,827][49750] Updated weights for policy 0, policy_version 213881 (0.0034) [2024-04-26 16:01:05,112][49750] Updated weights for policy 0, policy_version 213891 (0.0031) [2024-04-26 16:01:07,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3504472064. Throughput: 0: 50845.1. Samples: 1257261320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:01:07,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 16:01:09,185][49750] Updated weights for policy 0, policy_version 213901 (0.0031) [2024-04-26 16:01:11,651][49750] Updated weights for policy 0, policy_version 213911 (0.0027) [2024-04-26 16:01:12,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3504717824. Throughput: 0: 50933.9. Samples: 1257566560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:01:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 16:01:15,674][49750] Updated weights for policy 0, policy_version 213921 (0.0028) [2024-04-26 16:01:17,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3504963584. Throughput: 0: 50736.9. Samples: 1257872020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:01:17,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 16:01:18,078][49750] Updated weights for policy 0, policy_version 213931 (0.0037) [2024-04-26 16:01:22,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3505192960. Throughput: 0: 50762.2. Samples: 1258013300. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:01:22,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 16:01:22,151][49750] Updated weights for policy 0, policy_version 213941 (0.0028) [2024-04-26 16:01:24,784][49750] Updated weights for policy 0, policy_version 213951 (0.0031) [2024-04-26 16:01:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3505471488. Throughput: 0: 50975.2. Samples: 1258322260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:01:27,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-26 16:01:28,489][49750] Updated weights for policy 0, policy_version 213961 (0.0033) [2024-04-26 16:01:31,366][49750] Updated weights for policy 0, policy_version 213971 (0.0034) [2024-04-26 16:01:32,062][49517] Fps is (10 sec: 55705.9, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 3505750016. Throughput: 0: 50774.6. Samples: 1258626680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:01:32,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 16:01:34,946][49750] Updated weights for policy 0, policy_version 213981 (0.0028) [2024-04-26 16:01:37,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3505979392. Throughput: 0: 50763.1. Samples: 1258787700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:01:37,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 16:01:37,766][49750] Updated weights for policy 0, policy_version 213991 (0.0033) [2024-04-26 16:01:41,298][49750] Updated weights for policy 0, policy_version 214001 (0.0031) [2024-04-26 16:01:42,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.5, 300 sec: 50818.1). Total num frames: 3506225152. Throughput: 0: 50702.6. Samples: 1259086540. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-26 16:01:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 16:01:44,144][49750] Updated weights for policy 0, policy_version 214011 (0.0030) [2024-04-26 16:01:47,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3506470912. Throughput: 0: 50775.5. Samples: 1259392820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:01:47,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 16:01:47,960][49750] Updated weights for policy 0, policy_version 214021 (0.0028) [2024-04-26 16:01:50,523][49750] Updated weights for policy 0, policy_version 214031 (0.0035) [2024-04-26 16:01:52,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3506765824. Throughput: 0: 50653.7. Samples: 1259540740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:01:52,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 16:01:54,293][49750] Updated weights for policy 0, policy_version 214041 (0.0027) [2024-04-26 16:01:55,339][49728] Signal inference workers to stop experience collection... (18900 times) [2024-04-26 16:01:55,340][49728] Signal inference workers to resume experience collection... (18900 times) [2024-04-26 16:01:55,378][49750] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-04-26 16:01:55,378][49750] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-04-26 16:01:57,025][49750] Updated weights for policy 0, policy_version 214051 (0.0032) [2024-04-26 16:01:57,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3507011584. Throughput: 0: 50662.0. Samples: 1259846360. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:01:57,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 16:02:00,690][49750] Updated weights for policy 0, policy_version 214061 (0.0040) [2024-04-26 16:02:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3507240960. Throughput: 0: 50797.7. Samples: 1260157920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:02,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 16:02:03,468][49750] Updated weights for policy 0, policy_version 214071 (0.0041) [2024-04-26 16:02:07,049][49750] Updated weights for policy 0, policy_version 214081 (0.0029) [2024-04-26 16:02:07,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 3507503104. Throughput: 0: 50820.0. Samples: 1260300200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:07,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 16:02:09,820][49750] Updated weights for policy 0, policy_version 214091 (0.0036) [2024-04-26 16:02:12,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3507765248. Throughput: 0: 50687.6. Samples: 1260603200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:12,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 16:02:13,478][49750] Updated weights for policy 0, policy_version 214101 (0.0032) [2024-04-26 16:02:16,332][49750] Updated weights for policy 0, policy_version 214111 (0.0030) [2024-04-26 16:02:17,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3508011008. Throughput: 0: 50751.6. Samples: 1260910500. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:17,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 16:02:19,871][49750] Updated weights for policy 0, policy_version 214121 (0.0031) [2024-04-26 16:02:22,062][49517] Fps is (10 sec: 49152.6, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3508256768. Throughput: 0: 50710.4. Samples: 1261069660. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:22,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 16:02:22,851][49750] Updated weights for policy 0, policy_version 214131 (0.0033) [2024-04-26 16:02:26,362][49750] Updated weights for policy 0, policy_version 214141 (0.0036) [2024-04-26 16:02:27,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3508502528. Throughput: 0: 50803.2. Samples: 1261372680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:27,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 16:02:29,384][49750] Updated weights for policy 0, policy_version 214151 (0.0032) [2024-04-26 16:02:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3508764672. Throughput: 0: 50653.4. Samples: 1261672220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:32,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 16:02:32,715][49750] Updated weights for policy 0, policy_version 214161 (0.0035) [2024-04-26 16:02:35,774][49750] Updated weights for policy 0, policy_version 214171 (0.0030) [2024-04-26 16:02:37,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3509043200. Throughput: 0: 50781.9. Samples: 1261825920. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:02:39,147][49750] Updated weights for policy 0, policy_version 214181 (0.0036) [2024-04-26 16:02:42,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3509288960. Throughput: 0: 50814.1. Samples: 1262133000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 16:02:42,170][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000214191_3509305344.pth... [2024-04-26 16:02:42,177][49750] Updated weights for policy 0, policy_version 214191 (0.0031) [2024-04-26 16:02:42,218][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000213445_3497082880.pth [2024-04-26 16:02:45,646][49750] Updated weights for policy 0, policy_version 214201 (0.0029) [2024-04-26 16:02:47,062][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3509534720. Throughput: 0: 50629.8. Samples: 1262436260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 16:02:47,071][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 16:02:48,684][49750] Updated weights for policy 0, policy_version 214211 (0.0035) [2024-04-26 16:02:52,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3509780480. Throughput: 0: 50661.9. Samples: 1262579980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:02:52,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 16:02:52,334][49750] Updated weights for policy 0, policy_version 214221 (0.0033) [2024-04-26 16:02:55,221][49750] Updated weights for policy 0, policy_version 214231 (0.0029) [2024-04-26 16:02:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3510042624. Throughput: 0: 50701.4. Samples: 1262884760. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:02:57,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 16:02:58,723][49750] Updated weights for policy 0, policy_version 214241 (0.0031) [2024-04-26 16:03:01,609][49750] Updated weights for policy 0, policy_version 214251 (0.0031) [2024-04-26 16:03:02,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3510304768. Throughput: 0: 50662.2. Samples: 1263190300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:02,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 16:03:05,171][49750] Updated weights for policy 0, policy_version 214261 (0.0029) [2024-04-26 16:03:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3510550528. Throughput: 0: 50706.1. Samples: 1263351440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:03:08,044][49750] Updated weights for policy 0, policy_version 214271 (0.0037) [2024-04-26 16:03:11,635][49750] Updated weights for policy 0, policy_version 214281 (0.0029) [2024-04-26 16:03:12,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3510779904. Throughput: 0: 50554.6. Samples: 1263647640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:12,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 16:03:12,795][49728] Signal inference workers to stop experience collection... (18950 times) [2024-04-26 16:03:12,837][49750] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-04-26 16:03:12,899][49728] Signal inference workers to resume experience collection... 
(18950 times) [2024-04-26 16:03:12,899][49750] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-04-26 16:03:14,804][49750] Updated weights for policy 0, policy_version 214291 (0.0030) [2024-04-26 16:03:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3511042048. Throughput: 0: 50638.5. Samples: 1263950960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:17,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 16:03:17,954][49750] Updated weights for policy 0, policy_version 214301 (0.0034) [2024-04-26 16:03:21,193][49750] Updated weights for policy 0, policy_version 214311 (0.0025) [2024-04-26 16:03:22,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3511320576. Throughput: 0: 50633.2. Samples: 1264104420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:22,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 16:03:24,319][49750] Updated weights for policy 0, policy_version 214321 (0.0031) [2024-04-26 16:03:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3511549952. Throughput: 0: 50617.3. Samples: 1264410780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:27,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 16:03:27,701][49750] Updated weights for policy 0, policy_version 214331 (0.0042) [2024-04-26 16:03:30,853][49750] Updated weights for policy 0, policy_version 214341 (0.0027) [2024-04-26 16:03:32,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3511812096. Throughput: 0: 50660.6. Samples: 1264715980. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:32,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 16:03:34,183][49750] Updated weights for policy 0, policy_version 214351 (0.0029) [2024-04-26 16:03:37,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3512057856. Throughput: 0: 50750.2. Samples: 1264863740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:03:37,388][49750] Updated weights for policy 0, policy_version 214361 (0.0035) [2024-04-26 16:03:40,629][49750] Updated weights for policy 0, policy_version 214371 (0.0031) [2024-04-26 16:03:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3512303616. Throughput: 0: 50905.8. Samples: 1265175520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:42,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-26 16:03:43,753][49750] Updated weights for policy 0, policy_version 214381 (0.0033) [2024-04-26 16:03:46,991][49750] Updated weights for policy 0, policy_version 214391 (0.0028) [2024-04-26 16:03:47,063][49517] Fps is (10 sec: 52427.3, 60 sec: 50790.2, 300 sec: 50707.0). Total num frames: 3512582144. Throughput: 0: 50842.8. Samples: 1265478240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:47,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 16:03:50,535][49750] Updated weights for policy 0, policy_version 214401 (0.0028) [2024-04-26 16:03:52,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3512844288. Throughput: 0: 50783.8. Samples: 1265636720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:52,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 16:03:53,435][49750] Updated weights for policy 0, policy_version 214411 (0.0029) [2024-04-26 16:03:56,912][49750] Updated weights for policy 0, policy_version 214421 (0.0031) [2024-04-26 16:03:57,062][49517] Fps is (10 sec: 49153.6, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3513073664. Throughput: 0: 50875.3. Samples: 1265937020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:03:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:04:00,012][49750] Updated weights for policy 0, policy_version 214431 (0.0029) [2024-04-26 16:04:02,062][49517] Fps is (10 sec: 49153.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3513335808. Throughput: 0: 50838.9. Samples: 1266238700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:04:03,208][49750] Updated weights for policy 0, policy_version 214441 (0.0030) [2024-04-26 16:04:06,360][49750] Updated weights for policy 0, policy_version 214451 (0.0029) [2024-04-26 16:04:07,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3513581568. Throughput: 0: 50686.3. Samples: 1266385300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:07,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 16:04:09,696][49750] Updated weights for policy 0, policy_version 214461 (0.0033) [2024-04-26 16:04:10,372][49728] Signal inference workers to stop experience collection... (19000 times) [2024-04-26 16:04:10,372][49728] Signal inference workers to resume experience collection... 
(19000 times) [2024-04-26 16:04:10,399][49750] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-04-26 16:04:10,399][49750] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-04-26 16:04:12,063][49517] Fps is (10 sec: 49150.4, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3513827328. Throughput: 0: 50626.2. Samples: 1266688960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:12,064][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 16:04:12,957][49750] Updated weights for policy 0, policy_version 214471 (0.0028) [2024-04-26 16:04:16,170][49750] Updated weights for policy 0, policy_version 214481 (0.0029) [2024-04-26 16:04:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3514105856. Throughput: 0: 50595.5. Samples: 1266992780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:17,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 16:04:19,548][49750] Updated weights for policy 0, policy_version 214491 (0.0031) [2024-04-26 16:04:22,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 3514335232. Throughput: 0: 50761.8. Samples: 1267148020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 16:04:22,752][49750] Updated weights for policy 0, policy_version 214501 (0.0028) [2024-04-26 16:04:25,919][49750] Updated weights for policy 0, policy_version 214511 (0.0036) [2024-04-26 16:04:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.6, 300 sec: 50651.6). Total num frames: 3514597376. Throughput: 0: 50659.6. Samples: 1267455200. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:27,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 16:04:29,158][49750] Updated weights for policy 0, policy_version 214521 (0.0031) [2024-04-26 16:04:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3514843136. Throughput: 0: 50619.8. Samples: 1267756120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:32,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 16:04:32,402][49750] Updated weights for policy 0, policy_version 214531 (0.0039) [2024-04-26 16:04:35,491][49750] Updated weights for policy 0, policy_version 214541 (0.0034) [2024-04-26 16:04:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3515121664. Throughput: 0: 50436.6. Samples: 1267906360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:37,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 16:04:38,903][49750] Updated weights for policy 0, policy_version 214551 (0.0031) [2024-04-26 16:04:41,861][49750] Updated weights for policy 0, policy_version 214561 (0.0033) [2024-04-26 16:04:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3515367424. Throughput: 0: 50633.3. Samples: 1268215520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:42,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 16:04:42,090][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000214562_3515383808.pth... [2024-04-26 16:04:42,133][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000213818_3503194112.pth [2024-04-26 16:04:45,462][49750] Updated weights for policy 0, policy_version 214571 (0.0033) [2024-04-26 16:04:47,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3515613184. Throughput: 0: 50599.9. Samples: 1268515700. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:47,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 16:04:48,407][49750] Updated weights for policy 0, policy_version 214581 (0.0033) [2024-04-26 16:04:51,795][49750] Updated weights for policy 0, policy_version 214591 (0.0027) [2024-04-26 16:04:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3515875328. Throughput: 0: 50466.2. Samples: 1268656280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:52,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 16:04:54,882][49750] Updated weights for policy 0, policy_version 214601 (0.0028) [2024-04-26 16:04:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3516104704. Throughput: 0: 50499.4. Samples: 1268961420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:04:57,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 16:04:58,231][49750] Updated weights for policy 0, policy_version 214611 (0.0040) [2024-04-26 16:05:01,216][49750] Updated weights for policy 0, policy_version 214621 (0.0037) [2024-04-26 16:05:02,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3516383232. Throughput: 0: 50542.1. Samples: 1269267180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:05:02,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:05:04,879][49750] Updated weights for policy 0, policy_version 214631 (0.0034) [2024-04-26 16:05:07,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3516612608. Throughput: 0: 50625.6. Samples: 1269426180. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 16:05:07,673][49750] Updated weights for policy 0, policy_version 214641 (0.0027) [2024-04-26 16:05:11,117][49750] Updated weights for policy 0, policy_version 214651 (0.0035) [2024-04-26 16:05:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3516874752. Throughput: 0: 50524.8. Samples: 1269728820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:12,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 16:05:14,158][49750] Updated weights for policy 0, policy_version 214661 (0.0029) [2024-04-26 16:05:14,793][49728] Signal inference workers to stop experience collection... (19050 times) [2024-04-26 16:05:14,793][49728] Signal inference workers to resume experience collection... (19050 times) [2024-04-26 16:05:14,817][49750] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-04-26 16:05:14,818][49750] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-04-26 16:05:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3517136896. Throughput: 0: 50677.8. Samples: 1270036620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:17,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 16:05:17,765][49750] Updated weights for policy 0, policy_version 214671 (0.0029) [2024-04-26 16:05:20,503][49750] Updated weights for policy 0, policy_version 214681 (0.0027) [2024-04-26 16:05:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3517382656. Throughput: 0: 50661.4. Samples: 1270186120. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 16:05:24,290][49750] Updated weights for policy 0, policy_version 214691 (0.0031) [2024-04-26 16:05:26,896][49750] Updated weights for policy 0, policy_version 214701 (0.0036) [2024-04-26 16:05:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3517661184. Throughput: 0: 50559.1. Samples: 1270490680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:27,063][49517] Avg episode reward: [(0, '0.471')] [2024-04-26 16:05:30,695][49750] Updated weights for policy 0, policy_version 214711 (0.0025) [2024-04-26 16:05:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3517874176. Throughput: 0: 50641.9. Samples: 1270794580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:32,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 16:05:33,389][49750] Updated weights for policy 0, policy_version 214721 (0.0029) [2024-04-26 16:05:37,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 3518136320. Throughput: 0: 50772.5. Samples: 1270941040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:37,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:05:37,123][49750] Updated weights for policy 0, policy_version 214731 (0.0032) [2024-04-26 16:05:39,855][49750] Updated weights for policy 0, policy_version 214741 (0.0029) [2024-04-26 16:05:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3518398464. Throughput: 0: 50643.1. Samples: 1271240360. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:42,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 16:05:43,530][49750] Updated weights for policy 0, policy_version 214751 (0.0027) [2024-04-26 16:05:46,493][49750] Updated weights for policy 0, policy_version 214761 (0.0028) [2024-04-26 16:05:47,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3518676992. Throughput: 0: 50649.0. Samples: 1271546380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:47,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:05:49,897][49750] Updated weights for policy 0, policy_version 214771 (0.0027) [2024-04-26 16:05:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3518889984. Throughput: 0: 50683.2. Samples: 1271706920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:52,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 16:05:53,101][49750] Updated weights for policy 0, policy_version 214781 (0.0032) [2024-04-26 16:05:56,385][49750] Updated weights for policy 0, policy_version 214791 (0.0036) [2024-04-26 16:05:57,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3519152128. Throughput: 0: 50629.7. Samples: 1272007160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:05:57,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:05:59,438][49750] Updated weights for policy 0, policy_version 214801 (0.0032) [2024-04-26 16:06:02,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3519414272. Throughput: 0: 50551.0. Samples: 1272311420. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:06:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 16:06:02,962][49750] Updated weights for policy 0, policy_version 214811 (0.0028) [2024-04-26 16:06:05,794][49750] Updated weights for policy 0, policy_version 214821 (0.0027) [2024-04-26 16:06:07,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50596.0). Total num frames: 3519643648. Throughput: 0: 50577.8. Samples: 1272462120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:06:07,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 16:06:09,345][49750] Updated weights for policy 0, policy_version 214831 (0.0034) [2024-04-26 16:06:12,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3519922176. Throughput: 0: 50550.7. Samples: 1272765460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 16:06:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 16:06:12,326][49750] Updated weights for policy 0, policy_version 214841 (0.0034) [2024-04-26 16:06:15,833][49750] Updated weights for policy 0, policy_version 214851 (0.0028) [2024-04-26 16:06:17,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3520167936. Throughput: 0: 50556.8. Samples: 1273069640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:06:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 16:06:18,970][49750] Updated weights for policy 0, policy_version 214861 (0.0030) [2024-04-26 16:06:20,550][49728] Signal inference workers to stop experience collection... (19100 times) [2024-04-26 16:06:20,550][49728] Signal inference workers to resume experience collection... 
(19100 times) [2024-04-26 16:06:20,562][49750] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-04-26 16:06:20,562][49750] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-04-26 16:06:22,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3520413696. Throughput: 0: 50568.3. Samples: 1273216620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:06:22,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 16:06:22,345][49750] Updated weights for policy 0, policy_version 214871 (0.0036) [2024-04-26 16:06:25,317][49750] Updated weights for policy 0, policy_version 214881 (0.0026) [2024-04-26 16:06:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.1, 300 sec: 50596.0). Total num frames: 3520675840. Throughput: 0: 50643.8. Samples: 1273519340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:06:27,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 16:06:29,184][49750] Updated weights for policy 0, policy_version 214891 (0.0033) [2024-04-26 16:06:31,748][49750] Updated weights for policy 0, policy_version 214901 (0.0034) [2024-04-26 16:06:32,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3520937984. Throughput: 0: 50665.9. Samples: 1273826340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:06:32,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 16:06:35,636][49750] Updated weights for policy 0, policy_version 214911 (0.0030) [2024-04-26 16:06:37,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3521200128. Throughput: 0: 50632.4. Samples: 1273985380. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:06:37,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 16:06:38,220][49750] Updated weights for policy 0, policy_version 214921 (0.0036) [2024-04-26 16:06:42,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3521413120. Throughput: 0: 50540.0. Samples: 1274281460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:06:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 16:06:42,130][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000214931_3521429504.pth... [2024-04-26 16:06:42,133][49750] Updated weights for policy 0, policy_version 214931 (0.0028) [2024-04-26 16:06:42,187][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000214191_3509305344.pth [2024-04-26 16:06:44,921][49750] Updated weights for policy 0, policy_version 214941 (0.0034) [2024-04-26 16:06:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 3521691648. Throughput: 0: 50540.9. Samples: 1274585760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:06:47,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 16:06:48,569][49750] Updated weights for policy 0, policy_version 214951 (0.0043) [2024-04-26 16:06:51,380][49750] Updated weights for policy 0, policy_version 214961 (0.0034) [2024-04-26 16:06:52,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3521937408. Throughput: 0: 50716.7. Samples: 1274744380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:06:52,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 16:06:54,920][49750] Updated weights for policy 0, policy_version 214971 (0.0028) [2024-04-26 16:06:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3522199552. Throughput: 0: 50663.8. Samples: 1275045340. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:06:57,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 16:06:57,757][49750] Updated weights for policy 0, policy_version 214981 (0.0030) [2024-04-26 16:07:01,278][49750] Updated weights for policy 0, policy_version 214991 (0.0030) [2024-04-26 16:07:02,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3522445312. Throughput: 0: 50735.6. Samples: 1275352740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:07:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 16:07:04,262][49750] Updated weights for policy 0, policy_version 215001 (0.0027) [2024-04-26 16:07:07,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 3522691072. Throughput: 0: 50756.9. Samples: 1275500680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:07:07,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 16:07:07,741][49750] Updated weights for policy 0, policy_version 215011 (0.0034) [2024-04-26 16:07:10,592][49750] Updated weights for policy 0, policy_version 215021 (0.0035) [2024-04-26 16:07:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3522953216. Throughput: 0: 50775.7. Samples: 1275804240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:07:12,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 16:07:14,276][49750] Updated weights for policy 0, policy_version 215031 (0.0029) [2024-04-26 16:07:17,002][49750] Updated weights for policy 0, policy_version 215041 (0.0034) [2024-04-26 16:07:17,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3523231744. Throughput: 0: 50738.9. Samples: 1276109600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:07:17,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 16:07:20,051][49728] Signal inference workers to stop experience collection... 
(19150 times) [2024-04-26 16:07:20,051][49728] Signal inference workers to resume experience collection... (19150 times) [2024-04-26 16:07:20,064][49750] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-04-26 16:07:20,065][49750] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-04-26 16:07:20,761][49750] Updated weights for policy 0, policy_version 215051 (0.0032) [2024-04-26 16:07:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3523477504. Throughput: 0: 50612.0. Samples: 1276262920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 16:07:22,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 16:07:23,508][49750] Updated weights for policy 0, policy_version 215061 (0.0032) [2024-04-26 16:07:27,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 3523706880. Throughput: 0: 50711.2. Samples: 1276563460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:07:27,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 16:07:27,116][49750] Updated weights for policy 0, policy_version 215071 (0.0034) [2024-04-26 16:07:30,074][49750] Updated weights for policy 0, policy_version 215081 (0.0026) [2024-04-26 16:07:32,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 3523969024. Throughput: 0: 50734.1. Samples: 1276868800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:07:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 16:07:33,472][49750] Updated weights for policy 0, policy_version 215091 (0.0030) [2024-04-26 16:07:36,612][49750] Updated weights for policy 0, policy_version 215101 (0.0030) [2024-04-26 16:07:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 3524214784. Throughput: 0: 50549.4. Samples: 1277019100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:07:37,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:07:40,068][49750] Updated weights for policy 0, policy_version 215111 (0.0032) [2024-04-26 16:07:42,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 3524493312. Throughput: 0: 50677.8. Samples: 1277325840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:07:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 16:07:43,412][49750] Updated weights for policy 0, policy_version 215121 (0.0030) [2024-04-26 16:07:46,484][49750] Updated weights for policy 0, policy_version 215131 (0.0031) [2024-04-26 16:07:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3524739072. Throughput: 0: 50744.9. Samples: 1277636260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:07:47,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 16:07:49,893][49750] Updated weights for policy 0, policy_version 215141 (0.0029) [2024-04-26 16:07:52,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3524984832. Throughput: 0: 50807.5. Samples: 1277787020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:07:52,072][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 16:07:52,897][49750] Updated weights for policy 0, policy_version 215151 (0.0032) [2024-04-26 16:07:56,402][49750] Updated weights for policy 0, policy_version 215161 (0.0027) [2024-04-26 16:07:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50596.0). Total num frames: 3525230592. Throughput: 0: 50796.5. Samples: 1278090080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:07:57,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 16:07:59,324][49750] Updated weights for policy 0, policy_version 215171 (0.0033) [2024-04-26 16:08:02,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3525509120. Throughput: 0: 50713.8. Samples: 1278391720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:08:02,072][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 16:08:02,744][49750] Updated weights for policy 0, policy_version 215181 (0.0033) [2024-04-26 16:08:05,683][49750] Updated weights for policy 0, policy_version 215191 (0.0029) [2024-04-26 16:08:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3525754880. Throughput: 0: 50827.7. Samples: 1278550160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:08:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:08:09,238][49750] Updated weights for policy 0, policy_version 215201 (0.0032) [2024-04-26 16:08:12,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3525984256. Throughput: 0: 50883.2. Samples: 1278853200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:08:12,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 16:08:12,267][49750] Updated weights for policy 0, policy_version 215211 (0.0039) [2024-04-26 16:08:15,653][49750] Updated weights for policy 0, policy_version 215221 (0.0033) [2024-04-26 16:08:17,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3526262784. Throughput: 0: 50801.9. Samples: 1279154880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:08:17,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 16:08:18,595][49750] Updated weights for policy 0, policy_version 215231 (0.0027) [2024-04-26 16:08:22,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3526492160. Throughput: 0: 50817.7. Samples: 1279305900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:08:22,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 16:08:22,090][49750] Updated weights for policy 0, policy_version 215241 (0.0035) [2024-04-26 16:08:25,135][49750] Updated weights for policy 0, policy_version 215251 (0.0028) [2024-04-26 16:08:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3526770688. Throughput: 0: 50674.7. Samples: 1279606200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 16:08:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 16:08:28,470][49750] Updated weights for policy 0, policy_version 215261 (0.0036) [2024-04-26 16:08:31,614][49750] Updated weights for policy 0, policy_version 215271 (0.0033) [2024-04-26 16:08:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3527016448. Throughput: 0: 50551.5. Samples: 1279911080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:08:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 16:08:34,898][49750] Updated weights for policy 0, policy_version 215281 (0.0036) [2024-04-26 16:08:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3527262208. Throughput: 0: 50620.6. Samples: 1280064940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:08:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 16:08:38,045][49750] Updated weights for policy 0, policy_version 215291 (0.0033) [2024-04-26 16:08:41,580][49750] Updated weights for policy 0, policy_version 215301 (0.0029) [2024-04-26 16:08:42,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50540.5). Total num frames: 3527491584. Throughput: 0: 50624.7. Samples: 1280368200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:08:42,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 16:08:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000215302_3527507968.pth... [2024-04-26 16:08:42,108][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000214562_3515383808.pth [2024-04-26 16:08:44,678][49750] Updated weights for policy 0, policy_version 215311 (0.0032) [2024-04-26 16:08:47,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 3527770112. Throughput: 0: 50488.8. Samples: 1280663720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:08:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 16:08:47,985][49750] Updated weights for policy 0, policy_version 215321 (0.0027) [2024-04-26 16:08:48,764][49728] Signal inference workers to stop experience collection... (19200 times) [2024-04-26 16:08:48,767][49728] Signal inference workers to resume experience collection... (19200 times) [2024-04-26 16:08:48,785][49750] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-04-26 16:08:48,785][49750] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-04-26 16:08:51,033][49750] Updated weights for policy 0, policy_version 215331 (0.0032) [2024-04-26 16:08:52,062][49517] Fps is (10 sec: 54068.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3528032256. Throughput: 0: 50497.3. Samples: 1280822540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:08:52,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 16:08:54,359][49750] Updated weights for policy 0, policy_version 215341 (0.0036) [2024-04-26 16:08:57,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3528278016. Throughput: 0: 50669.7. Samples: 1281133340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:08:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 16:08:57,618][49750] Updated weights for policy 0, policy_version 215351 (0.0028) [2024-04-26 16:09:00,882][49750] Updated weights for policy 0, policy_version 215361 (0.0036) [2024-04-26 16:09:02,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.3, 300 sec: 50596.0). Total num frames: 3528507392. Throughput: 0: 50653.9. Samples: 1281434300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:09:02,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 16:09:04,098][49750] Updated weights for policy 0, policy_version 215371 (0.0029) [2024-04-26 16:09:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3528785920. Throughput: 0: 50443.1. Samples: 1281575840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:09:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 16:09:07,393][49750] Updated weights for policy 0, policy_version 215381 (0.0027) [2024-04-26 16:09:10,598][49750] Updated weights for policy 0, policy_version 215391 (0.0032) [2024-04-26 16:09:12,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.3, 300 sec: 50651.5). Total num frames: 3529048064. Throughput: 0: 50755.9. Samples: 1281890220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:09:12,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:09:13,998][49750] Updated weights for policy 0, policy_version 215401 (0.0033) [2024-04-26 16:09:16,894][49750] Updated weights for policy 0, policy_version 215411 (0.0032) [2024-04-26 16:09:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3529293824. Throughput: 0: 50718.2. Samples: 1282193400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:09:17,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 16:09:20,583][49750] Updated weights for policy 0, policy_version 215421 (0.0033) [2024-04-26 16:09:22,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3529523200. Throughput: 0: 50588.0. Samples: 1282341400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:09:22,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 16:09:23,284][49750] Updated weights for policy 0, policy_version 215431 (0.0027) [2024-04-26 16:09:26,952][49750] Updated weights for policy 0, policy_version 215441 (0.0033) [2024-04-26 16:09:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3529785344. Throughput: 0: 50645.9. Samples: 1282647260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:09:27,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 16:09:29,775][49750] Updated weights for policy 0, policy_version 215451 (0.0033) [2024-04-26 16:09:32,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3530047488. Throughput: 0: 50741.9. Samples: 1282947100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:09:32,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-26 16:09:33,706][49750] Updated weights for policy 0, policy_version 215461 (0.0034) [2024-04-26 16:09:36,223][49750] Updated weights for policy 0, policy_version 215471 (0.0033) [2024-04-26 16:09:37,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3530326016. Throughput: 0: 50682.7. Samples: 1283103260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-26 16:09:37,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 16:09:40,178][49750] Updated weights for policy 0, policy_version 215481 (0.0029) [2024-04-26 16:09:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 3530555392. Throughput: 0: 50601.1. Samples: 1283410400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:09:42,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 16:09:42,631][49750] Updated weights for policy 0, policy_version 215491 (0.0031) [2024-04-26 16:09:46,603][49750] Updated weights for policy 0, policy_version 215501 (0.0030) [2024-04-26 16:09:47,062][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.4, 300 sec: 50540.5). Total num frames: 3530784768. Throughput: 0: 50773.3. Samples: 1283719100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:09:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 16:09:49,100][49750] Updated weights for policy 0, policy_version 215511 (0.0028) [2024-04-26 16:09:52,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.2, 300 sec: 50707.0). Total num frames: 3531063296. Throughput: 0: 50654.1. Samples: 1283855280. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:09:52,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 16:09:53,075][49750] Updated weights for policy 0, policy_version 215521 (0.0031) [2024-04-26 16:09:54,263][49728] Signal inference workers to stop experience collection... (19250 times) [2024-04-26 16:09:54,264][49728] Signal inference workers to resume experience collection... (19250 times) [2024-04-26 16:09:54,290][49750] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-04-26 16:09:54,290][49750] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-04-26 16:09:55,479][49750] Updated weights for policy 0, policy_version 215531 (0.0031) [2024-04-26 16:09:57,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3531325440. Throughput: 0: 50397.0. Samples: 1284158080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:09:57,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 16:09:59,543][49750] Updated weights for policy 0, policy_version 215541 (0.0032) [2024-04-26 16:10:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3531571200. Throughput: 0: 50504.9. Samples: 1284466120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:10:02,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 16:10:02,153][49750] Updated weights for policy 0, policy_version 215551 (0.0039) [2024-04-26 16:10:05,980][49750] Updated weights for policy 0, policy_version 215561 (0.0035) [2024-04-26 16:10:07,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 3531800576. Throughput: 0: 50504.9. Samples: 1284614120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:10:07,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 16:10:08,626][49750] Updated weights for policy 0, policy_version 215571 (0.0036) [2024-04-26 16:10:12,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 50540.5). Total num frames: 3532046336. Throughput: 0: 50456.4. Samples: 1284917800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:10:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 16:10:12,360][49750] Updated weights for policy 0, policy_version 215581 (0.0037) [2024-04-26 16:10:15,305][49750] Updated weights for policy 0, policy_version 215591 (0.0034) [2024-04-26 16:10:17,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3532341248. Throughput: 0: 50552.0. Samples: 1285221940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:10:17,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 16:10:18,784][49750] Updated weights for policy 0, policy_version 215601 (0.0030) [2024-04-26 16:10:21,630][49750] Updated weights for policy 0, policy_version 215611 (0.0028) [2024-04-26 16:10:22,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50596.0). Total num frames: 3532587008. Throughput: 0: 50585.2. Samples: 1285379600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:10:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:10:25,221][49750] Updated weights for policy 0, policy_version 215621 (0.0028) [2024-04-26 16:10:27,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3532832768. Throughput: 0: 50564.2. Samples: 1285685780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:10:27,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 16:10:28,232][49750] Updated weights for policy 0, policy_version 215631 (0.0031) [2024-04-26 16:10:31,799][49750] Updated weights for policy 0, policy_version 215641 (0.0033) [2024-04-26 16:10:32,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 3533062144. Throughput: 0: 50501.0. Samples: 1285991640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:10:32,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:10:34,665][49750] Updated weights for policy 0, policy_version 215651 (0.0030) [2024-04-26 16:10:37,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.1, 300 sec: 50651.5). Total num frames: 3533340672. Throughput: 0: 50664.0. Samples: 1286135160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:10:37,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 16:10:38,176][49750] Updated weights for policy 0, policy_version 215661 (0.0035) [2024-04-26 16:10:41,078][49750] Updated weights for policy 0, policy_version 215671 (0.0033) [2024-04-26 16:10:42,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.6, 300 sec: 50596.0). Total num frames: 3533602816. Throughput: 0: 50616.1. Samples: 1286435800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 16:10:42,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 16:10:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000215674_3533602816.pth... [2024-04-26 16:10:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000214931_3521429504.pth [2024-04-26 16:10:44,729][49750] Updated weights for policy 0, policy_version 215681 (0.0030) [2024-04-26 16:10:47,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3533832192. Throughput: 0: 50565.3. Samples: 1286741560. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:10:47,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 16:10:47,600][49750] Updated weights for policy 0, policy_version 215691 (0.0028) [2024-04-26 16:10:51,241][49750] Updated weights for policy 0, policy_version 215701 (0.0030) [2024-04-26 16:10:52,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.5, 300 sec: 50596.0). Total num frames: 3534077952. Throughput: 0: 50544.4. Samples: 1286888620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:10:52,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 16:10:54,258][49750] Updated weights for policy 0, policy_version 215711 (0.0030) [2024-04-26 16:10:57,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 3534340096. Throughput: 0: 50489.8. Samples: 1287189840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:10:57,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 16:10:57,624][49750] Updated weights for policy 0, policy_version 215721 (0.0032) [2024-04-26 16:11:00,695][49750] Updated weights for policy 0, policy_version 215731 (0.0032) [2024-04-26 16:11:02,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3534618624. Throughput: 0: 50501.8. Samples: 1287494520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:02,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:11:03,933][49750] Updated weights for policy 0, policy_version 215741 (0.0035) [2024-04-26 16:11:07,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3534831616. Throughput: 0: 50587.7. Samples: 1287656040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:07,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 16:11:07,209][49750] Updated weights for policy 0, policy_version 215751 (0.0031) [2024-04-26 16:11:10,418][49750] Updated weights for policy 0, policy_version 215761 (0.0028) [2024-04-26 16:11:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 3535110144. Throughput: 0: 50536.1. Samples: 1287959900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 16:11:13,689][49750] Updated weights for policy 0, policy_version 215771 (0.0029) [2024-04-26 16:11:16,919][49750] Updated weights for policy 0, policy_version 215781 (0.0035) [2024-04-26 16:11:17,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3535355904. Throughput: 0: 50425.7. Samples: 1288260800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:17,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 16:11:20,054][49750] Updated weights for policy 0, policy_version 215791 (0.0031) [2024-04-26 16:11:22,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3535618048. Throughput: 0: 50822.3. Samples: 1288422160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 16:11:23,240][49750] Updated weights for policy 0, policy_version 215801 (0.0027) [2024-04-26 16:11:26,424][49750] Updated weights for policy 0, policy_version 215811 (0.0032) [2024-04-26 16:11:27,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3535896576. Throughput: 0: 50876.4. Samples: 1288725240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:27,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 16:11:27,365][49728] Signal inference workers to stop experience collection... (19300 times) [2024-04-26 16:11:27,366][49728] Signal inference workers to resume experience collection... (19300 times) [2024-04-26 16:11:27,393][49750] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-04-26 16:11:27,394][49750] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-04-26 16:11:29,604][49750] Updated weights for policy 0, policy_version 215821 (0.0032) [2024-04-26 16:11:32,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 3536109568. Throughput: 0: 50624.2. Samples: 1289019640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:32,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 16:11:32,931][49750] Updated weights for policy 0, policy_version 215831 (0.0031) [2024-04-26 16:11:36,426][49750] Updated weights for policy 0, policy_version 215841 (0.0031) [2024-04-26 16:11:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3536388096. Throughput: 0: 50763.0. Samples: 1289172960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:37,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 16:11:39,278][49750] Updated weights for policy 0, policy_version 215851 (0.0029) [2024-04-26 16:11:42,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 3536633856. Throughput: 0: 50794.2. Samples: 1289475580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 16:11:42,985][49750] Updated weights for policy 0, policy_version 215861 (0.0030) [2024-04-26 16:11:45,707][49750] Updated weights for policy 0, policy_version 215871 (0.0033) [2024-04-26 16:11:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 3536912384. Throughput: 0: 50683.6. Samples: 1289775280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 16:11:49,371][49750] Updated weights for policy 0, policy_version 215881 (0.0029) [2024-04-26 16:11:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 3537125376. Throughput: 0: 50758.2. Samples: 1289940160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 16:11:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 16:11:52,225][49750] Updated weights for policy 0, policy_version 215891 (0.0032) [2024-04-26 16:11:55,858][49750] Updated weights for policy 0, policy_version 215901 (0.0030) [2024-04-26 16:11:57,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3537387520. Throughput: 0: 50667.9. Samples: 1290239960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:11:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 16:11:58,784][49750] Updated weights for policy 0, policy_version 215911 (0.0035) [2024-04-26 16:12:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3537633280. Throughput: 0: 50684.5. Samples: 1290541600. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 16:12:02,255][49750] Updated weights for policy 0, policy_version 215921 (0.0026) [2024-04-26 16:12:05,175][49750] Updated weights for policy 0, policy_version 215931 (0.0033) [2024-04-26 16:12:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 3537911808. Throughput: 0: 50634.8. Samples: 1290700720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:07,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 16:12:08,637][49750] Updated weights for policy 0, policy_version 215941 (0.0036) [2024-04-26 16:12:11,619][49750] Updated weights for policy 0, policy_version 215951 (0.0032) [2024-04-26 16:12:12,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 3538157568. Throughput: 0: 50620.4. Samples: 1291003160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:12,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 16:12:15,048][49750] Updated weights for policy 0, policy_version 215961 (0.0032) [2024-04-26 16:12:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50596.0). Total num frames: 3538403328. Throughput: 0: 50800.9. Samples: 1291305680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:17,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 16:12:18,008][49750] Updated weights for policy 0, policy_version 215971 (0.0029) [2024-04-26 16:12:21,301][49750] Updated weights for policy 0, policy_version 215981 (0.0028) [2024-04-26 16:12:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3538649088. Throughput: 0: 50777.9. Samples: 1291457960. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:12:24,441][49750] Updated weights for policy 0, policy_version 215991 (0.0037) [2024-04-26 16:12:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3538911232. Throughput: 0: 50825.4. Samples: 1291762720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:27,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:12:27,809][49750] Updated weights for policy 0, policy_version 216001 (0.0029) [2024-04-26 16:12:30,871][49750] Updated weights for policy 0, policy_version 216011 (0.0034) [2024-04-26 16:12:32,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3539173376. Throughput: 0: 50878.9. Samples: 1292064840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:32,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 16:12:34,430][49750] Updated weights for policy 0, policy_version 216021 (0.0029) [2024-04-26 16:12:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3539419136. Throughput: 0: 50678.6. Samples: 1292220700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:37,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 16:12:37,419][49750] Updated weights for policy 0, policy_version 216031 (0.0030) [2024-04-26 16:12:41,007][49750] Updated weights for policy 0, policy_version 216041 (0.0039) [2024-04-26 16:12:42,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3539664896. Throughput: 0: 50803.5. Samples: 1292526120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 16:12:42,124][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000216045_3539681280.pth... 
[2024-04-26 16:12:42,171][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000215302_3527507968.pth [2024-04-26 16:12:43,814][49750] Updated weights for policy 0, policy_version 216051 (0.0030) [2024-04-26 16:12:44,786][49728] Signal inference workers to stop experience collection... (19350 times) [2024-04-26 16:12:44,786][49728] Signal inference workers to resume experience collection... (19350 times) [2024-04-26 16:12:44,799][49750] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-04-26 16:12:44,802][49750] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-04-26 16:12:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 3539910656. Throughput: 0: 50728.0. Samples: 1292824360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:47,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 16:12:47,594][49750] Updated weights for policy 0, policy_version 216061 (0.0030) [2024-04-26 16:12:50,378][49750] Updated weights for policy 0, policy_version 216071 (0.0029) [2024-04-26 16:12:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3540172800. Throughput: 0: 50577.8. Samples: 1292976720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:52,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 16:12:53,991][49750] Updated weights for policy 0, policy_version 216081 (0.0027) [2024-04-26 16:12:56,952][49750] Updated weights for policy 0, policy_version 216091 (0.0037) [2024-04-26 16:12:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3540434944. Throughput: 0: 50511.2. Samples: 1293276160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:12:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 16:13:00,425][49750] Updated weights for policy 0, policy_version 216101 (0.0033) [2024-04-26 16:13:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 3540680704. Throughput: 0: 50602.6. Samples: 1293582800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-26 16:13:02,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 16:13:03,665][49750] Updated weights for policy 0, policy_version 216111 (0.0033) [2024-04-26 16:13:06,968][49750] Updated weights for policy 0, policy_version 216121 (0.0034) [2024-04-26 16:13:07,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3540926464. Throughput: 0: 50515.6. Samples: 1293731160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:07,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 16:13:10,217][49750] Updated weights for policy 0, policy_version 216131 (0.0029) [2024-04-26 16:13:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3541188608. Throughput: 0: 50536.8. Samples: 1294036880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:12,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 16:13:13,423][49750] Updated weights for policy 0, policy_version 216141 (0.0027) [2024-04-26 16:13:16,695][49750] Updated weights for policy 0, policy_version 216151 (0.0029) [2024-04-26 16:13:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3541434368. Throughput: 0: 50672.6. Samples: 1294345100. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:17,063][49517] Avg episode reward: [(0, '0.476')] [2024-04-26 16:13:19,709][49750] Updated weights for policy 0, policy_version 216161 (0.0034) [2024-04-26 16:13:22,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50540.4). Total num frames: 3541680128. Throughput: 0: 50539.5. Samples: 1294494980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:22,072][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 16:13:23,030][49750] Updated weights for policy 0, policy_version 216171 (0.0039) [2024-04-26 16:13:26,203][49750] Updated weights for policy 0, policy_version 216181 (0.0036) [2024-04-26 16:13:27,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3541958656. Throughput: 0: 50467.5. Samples: 1294797160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:27,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 16:13:29,427][49750] Updated weights for policy 0, policy_version 216191 (0.0031) [2024-04-26 16:13:32,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 3542188032. Throughput: 0: 50580.9. Samples: 1295100500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:32,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 16:13:32,729][49750] Updated weights for policy 0, policy_version 216201 (0.0028) [2024-04-26 16:13:35,863][49750] Updated weights for policy 0, policy_version 216211 (0.0029) [2024-04-26 16:13:37,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3542466560. Throughput: 0: 50649.6. Samples: 1295255960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:37,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 16:13:38,959][49728] Signal inference workers to stop experience collection... 
(19400 times) [2024-04-26 16:13:39,002][49750] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-04-26 16:13:39,021][49728] Signal inference workers to resume experience collection... (19400 times) [2024-04-26 16:13:39,022][49750] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-04-26 16:13:39,154][49750] Updated weights for policy 0, policy_version 216221 (0.0028) [2024-04-26 16:13:42,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3542695936. Throughput: 0: 50573.7. Samples: 1295551980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:42,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 16:13:42,245][49750] Updated weights for policy 0, policy_version 216231 (0.0030) [2024-04-26 16:13:45,649][49750] Updated weights for policy 0, policy_version 216241 (0.0029) [2024-04-26 16:13:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3542958080. Throughput: 0: 50580.4. Samples: 1295858920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 16:13:48,584][49750] Updated weights for policy 0, policy_version 216251 (0.0028) [2024-04-26 16:13:52,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3543220224. Throughput: 0: 50769.9. Samples: 1296015800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:52,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 16:13:52,067][49750] Updated weights for policy 0, policy_version 216261 (0.0029) [2024-04-26 16:13:55,147][49750] Updated weights for policy 0, policy_version 216271 (0.0031) [2024-04-26 16:13:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3543482368. Throughput: 0: 50640.8. Samples: 1296315720. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:13:57,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 16:13:58,639][49750] Updated weights for policy 0, policy_version 216281 (0.0036) [2024-04-26 16:14:01,745][49750] Updated weights for policy 0, policy_version 216291 (0.0031) [2024-04-26 16:14:02,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3543728128. Throughput: 0: 50630.5. Samples: 1296623480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:14:02,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 16:14:05,043][49750] Updated weights for policy 0, policy_version 216301 (0.0029) [2024-04-26 16:14:07,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3543957504. Throughput: 0: 50674.4. Samples: 1296775320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:14:07,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 16:14:08,196][49750] Updated weights for policy 0, policy_version 216311 (0.0030) [2024-04-26 16:14:11,490][49750] Updated weights for policy 0, policy_version 216321 (0.0026) [2024-04-26 16:14:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3544236032. Throughput: 0: 50590.2. Samples: 1297073720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:12,072][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:14:14,800][49750] Updated weights for policy 0, policy_version 216331 (0.0031) [2024-04-26 16:14:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3544465408. Throughput: 0: 50486.2. Samples: 1297372380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 16:14:17,998][49750] Updated weights for policy 0, policy_version 216341 (0.0030) [2024-04-26 16:14:21,210][49750] Updated weights for policy 0, policy_version 216351 (0.0031) [2024-04-26 16:14:22,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3544727552. Throughput: 0: 50629.8. Samples: 1297534300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:22,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 16:14:24,424][49750] Updated weights for policy 0, policy_version 216361 (0.0026) [2024-04-26 16:14:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 3544973312. Throughput: 0: 50668.0. Samples: 1297832040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 16:14:27,774][49750] Updated weights for policy 0, policy_version 216371 (0.0042) [2024-04-26 16:14:30,778][49750] Updated weights for policy 0, policy_version 216381 (0.0034) [2024-04-26 16:14:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50540.5). Total num frames: 3545235456. Throughput: 0: 50560.4. Samples: 1298134140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:32,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 16:14:34,090][49750] Updated weights for policy 0, policy_version 216391 (0.0029) [2024-04-26 16:14:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.4, 300 sec: 50596.1). Total num frames: 3545481216. Throughput: 0: 50552.3. Samples: 1298290660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:37,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 16:14:37,428][49750] Updated weights for policy 0, policy_version 216401 (0.0031) [2024-04-26 16:14:40,452][49750] Updated weights for policy 0, policy_version 216411 (0.0027) [2024-04-26 16:14:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3545743360. Throughput: 0: 50587.8. Samples: 1298592160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 16:14:42,170][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000216416_3545759744.pth... [2024-04-26 16:14:42,212][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000215674_3533602816.pth [2024-04-26 16:14:43,770][49750] Updated weights for policy 0, policy_version 216421 (0.0029) [2024-04-26 16:14:47,033][49750] Updated weights for policy 0, policy_version 216431 (0.0033) [2024-04-26 16:14:47,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3546005504. Throughput: 0: 50624.5. Samples: 1298901580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 16:14:49,850][49728] Signal inference workers to stop experience collection... (19450 times) [2024-04-26 16:14:49,893][49750] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-04-26 16:14:49,957][49728] Signal inference workers to resume experience collection... (19450 times) [2024-04-26 16:14:49,958][49750] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-04-26 16:14:50,236][49750] Updated weights for policy 0, policy_version 216441 (0.0028) [2024-04-26 16:14:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 3546251264. Throughput: 0: 50687.2. 
Samples: 1299056240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:52,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 16:14:53,575][49750] Updated weights for policy 0, policy_version 216451 (0.0029) [2024-04-26 16:14:56,682][49750] Updated weights for policy 0, policy_version 216461 (0.0027) [2024-04-26 16:14:57,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3546513408. Throughput: 0: 50650.8. Samples: 1299353000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:14:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 16:15:00,016][49750] Updated weights for policy 0, policy_version 216471 (0.0031) [2024-04-26 16:15:02,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3546759168. Throughput: 0: 50792.4. Samples: 1299658040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:15:02,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 16:15:03,180][49750] Updated weights for policy 0, policy_version 216481 (0.0032) [2024-04-26 16:15:06,444][49750] Updated weights for policy 0, policy_version 216491 (0.0028) [2024-04-26 16:15:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3547037696. Throughput: 0: 50772.0. Samples: 1299819040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:15:07,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 16:15:09,650][49750] Updated weights for policy 0, policy_version 216501 (0.0040) [2024-04-26 16:15:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 50540.5). Total num frames: 3547250688. Throughput: 0: 50784.6. Samples: 1300117340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:15:12,063][49517] Avg episode reward: [(0, '0.452')] [2024-04-26 16:15:13,003][49750] Updated weights for policy 0, policy_version 216511 (0.0038) [2024-04-26 16:15:16,031][49750] Updated weights for policy 0, policy_version 216521 (0.0031) [2024-04-26 16:15:17,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50517.2, 300 sec: 50540.5). Total num frames: 3547496448. Throughput: 0: 50663.1. Samples: 1300413980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 16:15:17,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 16:15:19,451][49750] Updated weights for policy 0, policy_version 216531 (0.0032) [2024-04-26 16:15:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3547758592. Throughput: 0: 50611.5. Samples: 1300568180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:15:22,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 16:15:22,488][49750] Updated weights for policy 0, policy_version 216541 (0.0036) [2024-04-26 16:15:25,947][49750] Updated weights for policy 0, policy_version 216551 (0.0029) [2024-04-26 16:15:27,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3548020736. Throughput: 0: 50753.8. Samples: 1300876080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:15:27,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 16:15:28,843][49750] Updated weights for policy 0, policy_version 216561 (0.0034) [2024-04-26 16:15:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3548266496. Throughput: 0: 50536.9. Samples: 1301175740. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:15:32,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:15:32,299][49750] Updated weights for policy 0, policy_version 216571 (0.0027) [2024-04-26 16:15:35,319][49750] Updated weights for policy 0, policy_version 216581 (0.0029) [2024-04-26 16:15:37,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 3548528640. Throughput: 0: 50596.8. Samples: 1301333100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:15:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 16:15:38,608][49750] Updated weights for policy 0, policy_version 216591 (0.0032) [2024-04-26 16:15:41,923][49750] Updated weights for policy 0, policy_version 216601 (0.0035) [2024-04-26 16:15:42,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3548807168. Throughput: 0: 50710.5. Samples: 1301634980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:15:42,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 16:15:45,211][49750] Updated weights for policy 0, policy_version 216611 (0.0032) [2024-04-26 16:15:47,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3549052928. Throughput: 0: 50728.7. Samples: 1301940840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:15:47,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 16:15:48,155][49728] Signal inference workers to stop experience collection... (19500 times) [2024-04-26 16:15:48,156][49728] Signal inference workers to resume experience collection... 
(19500 times) [2024-04-26 16:15:48,166][49750] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-04-26 16:15:48,184][49750] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-04-26 16:15:48,290][49750] Updated weights for policy 0, policy_version 216621 (0.0033) [2024-04-26 16:15:51,759][49750] Updated weights for policy 0, policy_version 216631 (0.0032) [2024-04-26 16:15:52,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3549282304. Throughput: 0: 50550.7. Samples: 1302093820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:15:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 16:15:54,762][49750] Updated weights for policy 0, policy_version 216641 (0.0031) [2024-04-26 16:15:57,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3549560832. Throughput: 0: 50716.3. Samples: 1302399580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:15:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:15:58,097][49750] Updated weights for policy 0, policy_version 216651 (0.0029) [2024-04-26 16:16:01,152][49750] Updated weights for policy 0, policy_version 216661 (0.0034) [2024-04-26 16:16:02,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3549773824. Throughput: 0: 50782.4. Samples: 1302699180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:16:02,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:16:04,425][49750] Updated weights for policy 0, policy_version 216671 (0.0031) [2024-04-26 16:16:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3550052352. Throughput: 0: 50636.5. Samples: 1302846820. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:16:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:16:07,496][49750] Updated weights for policy 0, policy_version 216681 (0.0030) [2024-04-26 16:16:11,071][49750] Updated weights for policy 0, policy_version 216691 (0.0027) [2024-04-26 16:16:12,063][49517] Fps is (10 sec: 54065.9, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3550314496. Throughput: 0: 50725.9. Samples: 1303158760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:16:12,064][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 16:16:14,010][49750] Updated weights for policy 0, policy_version 216701 (0.0026) [2024-04-26 16:16:17,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.5, 300 sec: 50596.0). Total num frames: 3550543872. Throughput: 0: 50790.7. Samples: 1303461320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:16:17,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 16:16:17,618][49750] Updated weights for policy 0, policy_version 216711 (0.0028) [2024-04-26 16:16:20,412][49750] Updated weights for policy 0, policy_version 216721 (0.0033) [2024-04-26 16:16:22,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50790.4, 300 sec: 50540.5). Total num frames: 3550806016. Throughput: 0: 50582.4. Samples: 1303609300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:16:22,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 16:16:24,331][49750] Updated weights for policy 0, policy_version 216731 (0.0034) [2024-04-26 16:16:26,829][49750] Updated weights for policy 0, policy_version 216741 (0.0030) [2024-04-26 16:16:27,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3551084544. Throughput: 0: 50697.8. Samples: 1303916380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 16:16:27,072][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 16:16:30,738][49750] Updated weights for policy 0, policy_version 216751 (0.0034) [2024-04-26 16:16:32,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 3551330304. Throughput: 0: 50633.0. Samples: 1304219320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:16:32,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 16:16:33,300][49750] Updated weights for policy 0, policy_version 216761 (0.0032) [2024-04-26 16:16:37,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3551559680. Throughput: 0: 50674.2. Samples: 1304374160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:16:37,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:16:37,174][49750] Updated weights for policy 0, policy_version 216771 (0.0029) [2024-04-26 16:16:39,709][49750] Updated weights for policy 0, policy_version 216781 (0.0029) [2024-04-26 16:16:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 3551821824. Throughput: 0: 50653.8. Samples: 1304679000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:16:42,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 16:16:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000216786_3551821824.pth... [2024-04-26 16:16:42,132][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000216045_3539681280.pth [2024-04-26 16:16:43,482][49750] Updated weights for policy 0, policy_version 216791 (0.0031) [2024-04-26 16:16:46,494][49750] Updated weights for policy 0, policy_version 216801 (0.0028) [2024-04-26 16:16:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3552067584. Throughput: 0: 50650.5. Samples: 1304978460. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:16:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:16:49,896][49750] Updated weights for policy 0, policy_version 216811 (0.0031) [2024-04-26 16:16:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3552329728. Throughput: 0: 50740.3. Samples: 1305130140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:16:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 16:16:52,988][49750] Updated weights for policy 0, policy_version 216821 (0.0038) [2024-04-26 16:16:56,369][49750] Updated weights for policy 0, policy_version 216831 (0.0032) [2024-04-26 16:16:57,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3552591872. Throughput: 0: 50661.6. Samples: 1305438520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:16:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:16:59,379][49750] Updated weights for policy 0, policy_version 216841 (0.0032) [2024-04-26 16:17:02,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.3, 300 sec: 50484.9). Total num frames: 3552804864. Throughput: 0: 50673.8. Samples: 1305741640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:17:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 16:17:02,164][49728] Signal inference workers to stop experience collection... (19550 times) [2024-04-26 16:17:02,211][49750] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-04-26 16:17:02,226][49728] Signal inference workers to resume experience collection... 
(19550 times) [2024-04-26 16:17:02,231][49750] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-04-26 16:17:02,666][49750] Updated weights for policy 0, policy_version 216851 (0.0034) [2024-04-26 16:17:05,672][49750] Updated weights for policy 0, policy_version 216861 (0.0028) [2024-04-26 16:17:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3553099776. Throughput: 0: 50655.1. Samples: 1305888780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:17:07,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 16:17:09,337][49750] Updated weights for policy 0, policy_version 216871 (0.0033) [2024-04-26 16:17:11,999][49750] Updated weights for policy 0, policy_version 216881 (0.0030) [2024-04-26 16:17:12,063][49517] Fps is (10 sec: 57343.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3553378304. Throughput: 0: 50664.4. Samples: 1306196280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:17:12,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 16:17:15,637][49750] Updated weights for policy 0, policy_version 216891 (0.0033) [2024-04-26 16:17:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3553607680. Throughput: 0: 50811.6. Samples: 1306505840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:17:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 16:17:18,468][49750] Updated weights for policy 0, policy_version 216901 (0.0032) [2024-04-26 16:17:22,062][49517] Fps is (10 sec: 45876.0, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3553837056. Throughput: 0: 50706.7. Samples: 1306655960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:17:22,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 16:17:22,199][49750] Updated weights for policy 0, policy_version 216911 (0.0031) [2024-04-26 16:17:24,973][49750] Updated weights for policy 0, policy_version 216921 (0.0037) [2024-04-26 16:17:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3554115584. Throughput: 0: 50646.8. Samples: 1306958100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:17:27,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 16:17:28,729][49750] Updated weights for policy 0, policy_version 216931 (0.0031) [2024-04-26 16:17:31,845][49750] Updated weights for policy 0, policy_version 216941 (0.0030) [2024-04-26 16:17:32,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3554361344. Throughput: 0: 50668.9. Samples: 1307258560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-26 16:17:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 16:17:35,129][49750] Updated weights for policy 0, policy_version 216951 (0.0032) [2024-04-26 16:17:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3554607104. Throughput: 0: 50788.1. Samples: 1307415600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:17:37,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 16:17:38,226][49750] Updated weights for policy 0, policy_version 216961 (0.0032) [2024-04-26 16:17:41,635][49750] Updated weights for policy 0, policy_version 216971 (0.0026) [2024-04-26 16:17:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3554869248. Throughput: 0: 50864.4. Samples: 1307727420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:17:42,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 16:17:44,486][49750] Updated weights for policy 0, policy_version 216981 (0.0033) [2024-04-26 16:17:47,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3555115008. Throughput: 0: 50766.9. Samples: 1308026160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:17:47,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 16:17:48,104][49750] Updated weights for policy 0, policy_version 216991 (0.0029) [2024-04-26 16:17:51,071][49750] Updated weights for policy 0, policy_version 217001 (0.0035) [2024-04-26 16:17:52,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3555377152. Throughput: 0: 50982.0. Samples: 1308182980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:17:52,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:17:54,481][49750] Updated weights for policy 0, policy_version 217011 (0.0031) [2024-04-26 16:17:57,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3555639296. Throughput: 0: 50674.8. Samples: 1308476640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:17:57,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 16:17:57,271][49728] Signal inference workers to stop experience collection... (19600 times) [2024-04-26 16:17:57,305][49750] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-04-26 16:17:57,340][49728] Signal inference workers to resume experience collection... 
(19600 times) [2024-04-26 16:17:57,341][49750] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-04-26 16:17:57,471][49750] Updated weights for policy 0, policy_version 217021 (0.0033) [2024-04-26 16:18:01,072][49750] Updated weights for policy 0, policy_version 217031 (0.0024) [2024-04-26 16:18:02,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 3555885056. Throughput: 0: 50505.7. Samples: 1308778600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:18:02,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:18:03,893][49750] Updated weights for policy 0, policy_version 217041 (0.0034) [2024-04-26 16:18:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3556130816. Throughput: 0: 50612.9. Samples: 1308933540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:18:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 16:18:07,544][49750] Updated weights for policy 0, policy_version 217051 (0.0030) [2024-04-26 16:18:10,213][49750] Updated weights for policy 0, policy_version 217061 (0.0027) [2024-04-26 16:18:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 3556376576. Throughput: 0: 50624.3. Samples: 1309236200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:18:12,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 16:18:13,816][49750] Updated weights for policy 0, policy_version 217071 (0.0027) [2024-04-26 16:18:16,599][49750] Updated weights for policy 0, policy_version 217081 (0.0031) [2024-04-26 16:18:17,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3556655104. Throughput: 0: 50645.0. Samples: 1309537580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:18:17,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 16:18:20,274][49750] Updated weights for policy 0, policy_version 217091 (0.0030) [2024-04-26 16:18:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 3556900864. Throughput: 0: 50825.3. Samples: 1309702740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:18:22,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 16:18:23,017][49750] Updated weights for policy 0, policy_version 217101 (0.0039) [2024-04-26 16:18:26,767][49750] Updated weights for policy 0, policy_version 217111 (0.0037) [2024-04-26 16:18:27,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3557146624. Throughput: 0: 50686.1. Samples: 1310008300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:18:27,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 16:18:29,474][49750] Updated weights for policy 0, policy_version 217121 (0.0027) [2024-04-26 16:18:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3557408768. Throughput: 0: 50789.1. Samples: 1310311660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:18:32,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 16:18:33,311][49750] Updated weights for policy 0, policy_version 217131 (0.0029) [2024-04-26 16:18:35,867][49750] Updated weights for policy 0, policy_version 217141 (0.0042) [2024-04-26 16:18:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3557654528. Throughput: 0: 50723.3. Samples: 1310465520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:18:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:18:39,618][49750] Updated weights for policy 0, policy_version 217151 (0.0037) [2024-04-26 16:18:42,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3557916672. Throughput: 0: 50915.5. Samples: 1310767840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:18:42,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 16:18:42,238][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000217160_3557949440.pth... [2024-04-26 16:18:42,281][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000216416_3545759744.pth [2024-04-26 16:18:42,411][49750] Updated weights for policy 0, policy_version 217161 (0.0031) [2024-04-26 16:18:45,942][49750] Updated weights for policy 0, policy_version 217171 (0.0030) [2024-04-26 16:18:47,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3558146048. Throughput: 0: 50882.2. Samples: 1311068300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:18:47,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:18:49,365][49750] Updated weights for policy 0, policy_version 217181 (0.0030) [2024-04-26 16:18:52,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3558424576. Throughput: 0: 50702.9. Samples: 1311215180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:18:52,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 16:18:52,517][49750] Updated weights for policy 0, policy_version 217191 (0.0033) [2024-04-26 16:18:55,709][49750] Updated weights for policy 0, policy_version 217201 (0.0027) [2024-04-26 16:18:57,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3558670336. Throughput: 0: 50716.1. Samples: 1311518420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:18:57,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 16:18:59,013][49750] Updated weights for policy 0, policy_version 217211 (0.0033) [2024-04-26 16:19:00,681][49728] Signal inference workers to stop experience collection... (19650 times) [2024-04-26 16:19:00,712][49750] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-04-26 16:19:00,749][49728] Signal inference workers to resume experience collection... (19650 times) [2024-04-26 16:19:00,749][49750] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-04-26 16:19:02,011][49750] Updated weights for policy 0, policy_version 217221 (0.0033) [2024-04-26 16:19:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3558948864. Throughput: 0: 50694.4. Samples: 1311818840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:02,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 16:19:05,360][49750] Updated weights for policy 0, policy_version 217231 (0.0031) [2024-04-26 16:19:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50651.6). Total num frames: 3559178240. Throughput: 0: 50638.5. Samples: 1311981480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 16:19:08,542][49750] Updated weights for policy 0, policy_version 217241 (0.0029) [2024-04-26 16:19:12,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3559424000. Throughput: 0: 50716.5. Samples: 1312290540. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:12,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 16:19:12,106][49750] Updated weights for policy 0, policy_version 217251 (0.0032) [2024-04-26 16:19:15,058][49750] Updated weights for policy 0, policy_version 217261 (0.0032) [2024-04-26 16:19:17,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3559669760. Throughput: 0: 50566.2. Samples: 1312587140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:17,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 16:19:18,850][49750] Updated weights for policy 0, policy_version 217271 (0.0036) [2024-04-26 16:19:21,459][49750] Updated weights for policy 0, policy_version 217281 (0.0032) [2024-04-26 16:19:22,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3559948288. Throughput: 0: 50544.8. Samples: 1312740040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 16:19:25,257][49750] Updated weights for policy 0, policy_version 217291 (0.0033) [2024-04-26 16:19:27,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3560194048. Throughput: 0: 50632.9. Samples: 1313046320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:27,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 16:19:27,773][49750] Updated weights for policy 0, policy_version 217301 (0.0028) [2024-04-26 16:19:31,613][49750] Updated weights for policy 0, policy_version 217311 (0.0029) [2024-04-26 16:19:32,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3560423424. Throughput: 0: 50732.1. Samples: 1313351240. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:32,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 16:19:34,395][49750] Updated weights for policy 0, policy_version 217321 (0.0030) [2024-04-26 16:19:37,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3560718336. Throughput: 0: 50641.4. Samples: 1313494040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 16:19:38,004][49750] Updated weights for policy 0, policy_version 217331 (0.0033) [2024-04-26 16:19:41,012][49750] Updated weights for policy 0, policy_version 217341 (0.0029) [2024-04-26 16:19:42,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 3560947712. Throughput: 0: 50733.2. Samples: 1313801420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 16:19:44,458][49750] Updated weights for policy 0, policy_version 217351 (0.0033) [2024-04-26 16:19:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3561209856. Throughput: 0: 50834.0. Samples: 1314106360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 16:19:47,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 16:19:47,458][49750] Updated weights for policy 0, policy_version 217361 (0.0033) [2024-04-26 16:19:50,829][49750] Updated weights for policy 0, policy_version 217371 (0.0028) [2024-04-26 16:19:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3561455616. Throughput: 0: 50578.8. Samples: 1314257520. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:19:52,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:19:54,091][49750] Updated weights for policy 0, policy_version 217381 (0.0030) [2024-04-26 16:19:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3561717760. Throughput: 0: 50482.7. Samples: 1314562260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:19:57,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 16:19:57,234][49750] Updated weights for policy 0, policy_version 217391 (0.0034) [2024-04-26 16:20:00,478][49750] Updated weights for policy 0, policy_version 217401 (0.0037) [2024-04-26 16:20:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 3561963520. Throughput: 0: 50675.0. Samples: 1314867520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 16:20:03,601][49750] Updated weights for policy 0, policy_version 217411 (0.0030) [2024-04-26 16:20:06,928][49750] Updated weights for policy 0, policy_version 217421 (0.0030) [2024-04-26 16:20:07,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3562225664. Throughput: 0: 50666.4. Samples: 1315020020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:07,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 16:20:10,069][49750] Updated weights for policy 0, policy_version 217431 (0.0029) [2024-04-26 16:20:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3562487808. Throughput: 0: 50678.4. Samples: 1315326860. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:12,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 16:20:13,324][49750] Updated weights for policy 0, policy_version 217441 (0.0028) [2024-04-26 16:20:16,612][49750] Updated weights for policy 0, policy_version 217451 (0.0032) [2024-04-26 16:20:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3562733568. Throughput: 0: 50660.2. Samples: 1315630940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:20:19,775][49750] Updated weights for policy 0, policy_version 217461 (0.0031) [2024-04-26 16:20:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3562979328. Throughput: 0: 50956.9. Samples: 1315787100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:20:23,058][49750] Updated weights for policy 0, policy_version 217471 (0.0028) [2024-04-26 16:20:26,161][49750] Updated weights for policy 0, policy_version 217481 (0.0040) [2024-04-26 16:20:27,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3563225088. Throughput: 0: 50875.3. Samples: 1316090800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:27,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:20:29,676][49750] Updated weights for policy 0, policy_version 217491 (0.0031) [2024-04-26 16:20:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3563487232. Throughput: 0: 50849.4. Samples: 1316394580. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:32,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 16:20:32,540][49750] Updated weights for policy 0, policy_version 217501 (0.0033) [2024-04-26 16:20:36,092][49750] Updated weights for policy 0, policy_version 217511 (0.0037) [2024-04-26 16:20:36,114][49728] Signal inference workers to stop experience collection... (19700 times) [2024-04-26 16:20:36,115][49728] Signal inference workers to resume experience collection... (19700 times) [2024-04-26 16:20:36,144][49750] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-04-26 16:20:36,144][49750] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-04-26 16:20:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 3563732992. Throughput: 0: 50667.5. Samples: 1316537560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 16:20:39,053][49750] Updated weights for policy 0, policy_version 217521 (0.0031) [2024-04-26 16:20:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3563995136. Throughput: 0: 50745.6. Samples: 1316845820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:42,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 16:20:42,132][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000217530_3564011520.pth... [2024-04-26 16:20:42,187][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000216786_3551821824.pth [2024-04-26 16:20:42,334][49750] Updated weights for policy 0, policy_version 217531 (0.0028) [2024-04-26 16:20:45,670][49750] Updated weights for policy 0, policy_version 217541 (0.0034) [2024-04-26 16:20:47,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3564273664. Throughput: 0: 50791.2. Samples: 1317153120. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:47,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 16:20:48,861][49750] Updated weights for policy 0, policy_version 217551 (0.0032) [2024-04-26 16:20:52,007][49750] Updated weights for policy 0, policy_version 217561 (0.0033) [2024-04-26 16:20:52,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3564519424. Throughput: 0: 50760.3. Samples: 1317304240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:52,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 16:20:55,275][49750] Updated weights for policy 0, policy_version 217571 (0.0038) [2024-04-26 16:20:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3564748800. Throughput: 0: 50683.8. Samples: 1317607620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 16:20:57,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 16:20:58,443][49750] Updated weights for policy 0, policy_version 217581 (0.0027) [2024-04-26 16:21:01,613][49750] Updated weights for policy 0, policy_version 217591 (0.0033) [2024-04-26 16:21:02,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3565027328. Throughput: 0: 50698.4. Samples: 1317912380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:02,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:21:04,886][49750] Updated weights for policy 0, policy_version 217601 (0.0030) [2024-04-26 16:21:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3565273088. Throughput: 0: 50743.2. Samples: 1318070540. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:07,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 16:21:07,941][49750] Updated weights for policy 0, policy_version 217611 (0.0034) [2024-04-26 16:21:11,359][49750] Updated weights for policy 0, policy_version 217621 (0.0030) [2024-04-26 16:21:12,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3565518848. Throughput: 0: 50780.4. Samples: 1318375920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:12,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 16:21:14,442][49750] Updated weights for policy 0, policy_version 217631 (0.0028) [2024-04-26 16:21:17,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3565764608. Throughput: 0: 50875.9. Samples: 1318684000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:17,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:21:17,668][49750] Updated weights for policy 0, policy_version 217641 (0.0029) [2024-04-26 16:21:20,836][49750] Updated weights for policy 0, policy_version 217651 (0.0029) [2024-04-26 16:21:22,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 3566010368. Throughput: 0: 50921.6. Samples: 1318829040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 16:21:24,133][49750] Updated weights for policy 0, policy_version 217661 (0.0030) [2024-04-26 16:21:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3566288896. Throughput: 0: 50809.0. Samples: 1319132220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:27,063][49517] Avg episode reward: [(0, '0.676')] [2024-04-26 16:21:27,455][49750] Updated weights for policy 0, policy_version 217671 (0.0032) [2024-04-26 16:21:30,573][49750] Updated weights for policy 0, policy_version 217681 (0.0029) [2024-04-26 16:21:32,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3566534656. Throughput: 0: 50730.7. Samples: 1319436000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:32,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 16:21:33,784][49750] Updated weights for policy 0, policy_version 217691 (0.0033) [2024-04-26 16:21:37,001][49750] Updated weights for policy 0, policy_version 217701 (0.0029) [2024-04-26 16:21:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3566813184. Throughput: 0: 50894.3. Samples: 1319594480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:37,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 16:21:40,305][49750] Updated weights for policy 0, policy_version 217711 (0.0036) [2024-04-26 16:21:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3567026176. Throughput: 0: 50706.3. Samples: 1319889400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:42,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 16:21:43,697][49750] Updated weights for policy 0, policy_version 217721 (0.0031) [2024-04-26 16:21:46,832][49750] Updated weights for policy 0, policy_version 217731 (0.0035) [2024-04-26 16:21:47,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3567304704. Throughput: 0: 50623.3. Samples: 1320190420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:21:50,433][49750] Updated weights for policy 0, policy_version 217741 (0.0037) [2024-04-26 16:21:52,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3567566848. Throughput: 0: 50633.3. Samples: 1320349040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:52,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 16:21:53,182][49750] Updated weights for policy 0, policy_version 217751 (0.0028) [2024-04-26 16:21:56,786][49750] Updated weights for policy 0, policy_version 217761 (0.0035) [2024-04-26 16:21:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3567812608. Throughput: 0: 50703.9. Samples: 1320657600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:21:57,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 16:21:58,655][49728] Signal inference workers to stop experience collection... (19750 times) [2024-04-26 16:21:58,656][49728] Signal inference workers to resume experience collection... (19750 times) [2024-04-26 16:21:58,687][49750] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-04-26 16:21:58,687][49750] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-04-26 16:21:59,703][49750] Updated weights for policy 0, policy_version 217771 (0.0033) [2024-04-26 16:22:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3568058368. Throughput: 0: 50737.9. Samples: 1320967200. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:22:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 16:22:03,348][49750] Updated weights for policy 0, policy_version 217781 (0.0028) [2024-04-26 16:22:06,174][49750] Updated weights for policy 0, policy_version 217791 (0.0038) [2024-04-26 16:22:07,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3568304128. Throughput: 0: 50686.4. Samples: 1321109920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 16:22:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 16:22:09,922][49750] Updated weights for policy 0, policy_version 217801 (0.0035) [2024-04-26 16:22:12,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3568582656. Throughput: 0: 50664.7. Samples: 1321412140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 16:22:13,156][49750] Updated weights for policy 0, policy_version 217811 (0.0027) [2024-04-26 16:22:16,365][49750] Updated weights for policy 0, policy_version 217821 (0.0033) [2024-04-26 16:22:17,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3568812032. Throughput: 0: 50814.0. Samples: 1321722640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 16:22:19,594][49750] Updated weights for policy 0, policy_version 217831 (0.0041) [2024-04-26 16:22:22,063][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3569074176. Throughput: 0: 50729.2. Samples: 1321877300. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:22,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 16:22:22,726][49750] Updated weights for policy 0, policy_version 217841 (0.0032) [2024-04-26 16:22:26,123][49750] Updated weights for policy 0, policy_version 217851 (0.0032) [2024-04-26 16:22:27,062][49517] Fps is (10 sec: 47514.7, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 3569287168. Throughput: 0: 50777.3. Samples: 1322174380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:27,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 16:22:29,287][49750] Updated weights for policy 0, policy_version 217861 (0.0033) [2024-04-26 16:22:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3569565696. Throughput: 0: 50736.0. Samples: 1322473540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:32,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 16:22:32,488][49750] Updated weights for policy 0, policy_version 217871 (0.0029) [2024-04-26 16:22:35,566][49750] Updated weights for policy 0, policy_version 217881 (0.0032) [2024-04-26 16:22:37,062][49517] Fps is (10 sec: 55705.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3569844224. Throughput: 0: 50769.3. Samples: 1322633660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 16:22:39,103][49750] Updated weights for policy 0, policy_version 217891 (0.0028) [2024-04-26 16:22:41,974][49750] Updated weights for policy 0, policy_version 217901 (0.0030) [2024-04-26 16:22:42,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50762.7). Total num frames: 3570089984. Throughput: 0: 50757.0. Samples: 1322941660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 16:22:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000217901_3570089984.pth... [2024-04-26 16:22:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000217160_3557949440.pth [2024-04-26 16:22:45,605][49750] Updated weights for policy 0, policy_version 217911 (0.0030) [2024-04-26 16:22:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3570335744. Throughput: 0: 50724.9. Samples: 1323249820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:47,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 16:22:48,614][49750] Updated weights for policy 0, policy_version 217921 (0.0033) [2024-04-26 16:22:52,063][49517] Fps is (10 sec: 47512.5, 60 sec: 49971.0, 300 sec: 50596.0). Total num frames: 3570565120. Throughput: 0: 50673.0. Samples: 1323390220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:22:52,099][49750] Updated weights for policy 0, policy_version 217931 (0.0034) [2024-04-26 16:22:52,818][49728] Signal inference workers to stop experience collection... (19800 times) [2024-04-26 16:22:52,819][49728] Signal inference workers to resume experience collection... (19800 times) [2024-04-26 16:22:52,852][49750] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-04-26 16:22:52,852][49750] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-04-26 16:22:55,122][49750] Updated weights for policy 0, policy_version 217941 (0.0029) [2024-04-26 16:22:57,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3570876416. Throughput: 0: 50796.1. Samples: 1323697960. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:22:57,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 16:22:58,428][49750] Updated weights for policy 0, policy_version 217951 (0.0034) [2024-04-26 16:23:01,463][49750] Updated weights for policy 0, policy_version 217961 (0.0031) [2024-04-26 16:23:02,063][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3571089408. Throughput: 0: 50629.4. Samples: 1324000960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:23:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 16:23:05,095][49750] Updated weights for policy 0, policy_version 217971 (0.0026) [2024-04-26 16:23:07,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3571351552. Throughput: 0: 50648.1. Samples: 1324156460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:23:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 16:23:08,067][49750] Updated weights for policy 0, policy_version 217981 (0.0034) [2024-04-26 16:23:11,646][49750] Updated weights for policy 0, policy_version 217991 (0.0031) [2024-04-26 16:23:12,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3571597312. Throughput: 0: 50657.6. Samples: 1324453980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 16:23:12,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:23:14,549][49750] Updated weights for policy 0, policy_version 218001 (0.0031) [2024-04-26 16:23:17,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.3, 300 sec: 50707.0). Total num frames: 3571859456. Throughput: 0: 50647.3. Samples: 1324752680. 
Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:23:17,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 16:23:18,117][49750] Updated weights for policy 0, policy_version 218011 (0.0028) [2024-04-26 16:23:20,913][49750] Updated weights for policy 0, policy_version 218021 (0.0028) [2024-04-26 16:23:22,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3572137984. Throughput: 0: 50483.5. Samples: 1324905420. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:23:22,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 16:23:24,603][49750] Updated weights for policy 0, policy_version 218031 (0.0029) [2024-04-26 16:23:27,062][49517] Fps is (10 sec: 49153.1, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 3572350976. Throughput: 0: 50443.1. Samples: 1325211600. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:23:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 16:23:27,220][49750] Updated weights for policy 0, policy_version 218041 (0.0032) [2024-04-26 16:23:30,909][49750] Updated weights for policy 0, policy_version 218051 (0.0029) [2024-04-26 16:23:32,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3572613120. Throughput: 0: 50539.6. Samples: 1325524100. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:23:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 16:23:33,623][49750] Updated weights for policy 0, policy_version 218061 (0.0026) [2024-04-26 16:23:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 3572842496. Throughput: 0: 50639.8. Samples: 1325669000. 
Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:23:37,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 16:23:37,327][49750] Updated weights for policy 0, policy_version 218071 (0.0029) [2024-04-26 16:23:39,910][49728] Signal inference workers to stop experience collection... (19850 times) [2024-04-26 16:23:39,910][49728] Signal inference workers to resume experience collection... (19850 times) [2024-04-26 16:23:39,920][49750] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-04-26 16:23:39,921][49750] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-04-26 16:23:40,045][49750] Updated weights for policy 0, policy_version 218081 (0.0032) [2024-04-26 16:23:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3573137408. Throughput: 0: 50629.3. Samples: 1325976280. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:23:42,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 16:23:43,811][49750] Updated weights for policy 0, policy_version 218091 (0.0025) [2024-04-26 16:23:46,655][49750] Updated weights for policy 0, policy_version 218101 (0.0031) [2024-04-26 16:23:47,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3573383168. Throughput: 0: 50690.6. Samples: 1326282040. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:23:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 16:23:50,223][49750] Updated weights for policy 0, policy_version 218111 (0.0032) [2024-04-26 16:23:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3573628928. Throughput: 0: 50703.5. Samples: 1326438120. 
Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:23:52,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 16:23:52,948][49750] Updated weights for policy 0, policy_version 218121 (0.0029) [2024-04-26 16:23:56,699][49750] Updated weights for policy 0, policy_version 218131 (0.0032) [2024-04-26 16:23:57,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.1, 300 sec: 50596.0). Total num frames: 3573874688. Throughput: 0: 50700.9. Samples: 1326735520. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:23:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 16:23:59,414][49750] Updated weights for policy 0, policy_version 218141 (0.0032) [2024-04-26 16:24:02,064][49517] Fps is (10 sec: 50782.3, 60 sec: 50789.1, 300 sec: 50706.8). Total num frames: 3574136832. Throughput: 0: 50738.9. Samples: 1327036000. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:24:02,065][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 16:24:03,257][49750] Updated weights for policy 0, policy_version 218151 (0.0030) [2024-04-26 16:24:05,936][49750] Updated weights for policy 0, policy_version 218161 (0.0033) [2024-04-26 16:24:07,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3574431744. Throughput: 0: 50829.8. Samples: 1327192760. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:24:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:24:09,523][49750] Updated weights for policy 0, policy_version 218171 (0.0029) [2024-04-26 16:24:12,063][49517] Fps is (10 sec: 50798.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3574644736. Throughput: 0: 50803.5. Samples: 1327497760. 
Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:24:12,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 16:24:12,358][49750] Updated weights for policy 0, policy_version 218181 (0.0029) [2024-04-26 16:24:15,921][49750] Updated weights for policy 0, policy_version 218191 (0.0035) [2024-04-26 16:24:17,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50517.6, 300 sec: 50651.6). Total num frames: 3574890496. Throughput: 0: 50651.1. Samples: 1327803400. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:24:17,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 16:24:18,889][49750] Updated weights for policy 0, policy_version 218201 (0.0029) [2024-04-26 16:24:22,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.1, 300 sec: 50651.5). Total num frames: 3575136256. Throughput: 0: 50621.3. Samples: 1327946960. Policy #0 lag: (min: 2.0, avg: 12.8, max: 22.0) [2024-04-26 16:24:22,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 16:24:22,479][49750] Updated weights for policy 0, policy_version 218211 (0.0030) [2024-04-26 16:24:25,245][49750] Updated weights for policy 0, policy_version 218221 (0.0035) [2024-04-26 16:24:27,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3575414784. Throughput: 0: 50650.0. Samples: 1328255540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:24:27,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 16:24:28,897][49750] Updated weights for policy 0, policy_version 218231 (0.0037) [2024-04-26 16:24:31,492][49750] Updated weights for policy 0, policy_version 218241 (0.0028) [2024-04-26 16:24:32,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3575676928. Throughput: 0: 50523.0. Samples: 1328555580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:24:32,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 16:24:35,274][49750] Updated weights for policy 0, policy_version 218251 (0.0032) [2024-04-26 16:24:37,063][49517] Fps is (10 sec: 50790.8, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3575922688. Throughput: 0: 50696.8. Samples: 1328719480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:24:37,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 16:24:37,988][49750] Updated weights for policy 0, policy_version 218261 (0.0036) [2024-04-26 16:24:41,687][49750] Updated weights for policy 0, policy_version 218271 (0.0027) [2024-04-26 16:24:42,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3576152064. Throughput: 0: 50824.0. Samples: 1329022600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:24:42,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 16:24:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000218271_3576152064.pth... [2024-04-26 16:24:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000217530_3564011520.pth [2024-04-26 16:24:44,383][49750] Updated weights for policy 0, policy_version 218281 (0.0030) [2024-04-26 16:24:47,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3576414208. Throughput: 0: 50783.4. Samples: 1329321180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:24:47,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 16:24:48,212][49750] Updated weights for policy 0, policy_version 218291 (0.0030) [2024-04-26 16:24:49,309][49728] Signal inference workers to stop experience collection... (19900 times) [2024-04-26 16:24:49,309][49728] Signal inference workers to resume experience collection... 
(19900 times) [2024-04-26 16:24:49,323][49750] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-04-26 16:24:49,323][49750] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-04-26 16:24:50,795][49750] Updated weights for policy 0, policy_version 218301 (0.0033) [2024-04-26 16:24:52,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3576692736. Throughput: 0: 50767.9. Samples: 1329477320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:24:52,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 16:24:54,709][49750] Updated weights for policy 0, policy_version 218311 (0.0032) [2024-04-26 16:24:57,062][49517] Fps is (10 sec: 54068.5, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 3576954880. Throughput: 0: 50884.1. Samples: 1329787540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:24:57,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 16:24:57,140][49750] Updated weights for policy 0, policy_version 218321 (0.0034) [2024-04-26 16:25:01,121][49750] Updated weights for policy 0, policy_version 218331 (0.0028) [2024-04-26 16:25:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50791.8, 300 sec: 50707.1). Total num frames: 3577184256. Throughput: 0: 50800.9. Samples: 1330089440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:02,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 16:25:03,990][49750] Updated weights for policy 0, policy_version 218341 (0.0030) [2024-04-26 16:25:07,063][49517] Fps is (10 sec: 47512.8, 60 sec: 49971.1, 300 sec: 50651.6). Total num frames: 3577430016. Throughput: 0: 50832.8. Samples: 1330234440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:07,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 16:25:07,568][49750] Updated weights for policy 0, policy_version 218351 (0.0028) [2024-04-26 16:25:10,433][49750] Updated weights for policy 0, policy_version 218361 (0.0031) [2024-04-26 16:25:12,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3577708544. Throughput: 0: 50734.8. Samples: 1330538600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:12,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 16:25:13,961][49750] Updated weights for policy 0, policy_version 218371 (0.0040) [2024-04-26 16:25:16,776][49750] Updated weights for policy 0, policy_version 218381 (0.0032) [2024-04-26 16:25:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3577954304. Throughput: 0: 50796.2. Samples: 1330841400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:17,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 16:25:20,530][49750] Updated weights for policy 0, policy_version 218391 (0.0028) [2024-04-26 16:25:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3578216448. Throughput: 0: 50847.6. Samples: 1331007620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:22,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 16:25:23,123][49750] Updated weights for policy 0, policy_version 218401 (0.0033) [2024-04-26 16:25:27,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.4, 300 sec: 50596.0). Total num frames: 3578413056. Throughput: 0: 50754.8. Samples: 1331306560. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 16:25:27,219][49750] Updated weights for policy 0, policy_version 218411 (0.0033) [2024-04-26 16:25:29,715][49750] Updated weights for policy 0, policy_version 218421 (0.0032) [2024-04-26 16:25:32,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3578691584. Throughput: 0: 50677.8. Samples: 1331601680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:32,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:25:33,722][49750] Updated weights for policy 0, policy_version 218431 (0.0035) [2024-04-26 16:25:36,229][49750] Updated weights for policy 0, policy_version 218441 (0.0034) [2024-04-26 16:25:37,063][49517] Fps is (10 sec: 55705.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3578970112. Throughput: 0: 50554.7. Samples: 1331752280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:37,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 16:25:40,100][49750] Updated weights for policy 0, policy_version 218451 (0.0028) [2024-04-26 16:25:42,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 3579232256. Throughput: 0: 50548.8. Samples: 1332062240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:42,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 16:25:42,709][49750] Updated weights for policy 0, policy_version 218461 (0.0034) [2024-04-26 16:25:46,453][49750] Updated weights for policy 0, policy_version 218471 (0.0030) [2024-04-26 16:25:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3579461632. Throughput: 0: 50625.3. Samples: 1332367580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 16:25:49,113][49750] Updated weights for policy 0, policy_version 218481 (0.0028) [2024-04-26 16:25:52,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3579707392. Throughput: 0: 50685.5. Samples: 1332515280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:25:52,881][49750] Updated weights for policy 0, policy_version 218491 (0.0032) [2024-04-26 16:25:55,646][49750] Updated weights for policy 0, policy_version 218501 (0.0031) [2024-04-26 16:25:57,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3579985920. Throughput: 0: 50666.7. Samples: 1332818600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:25:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 16:25:59,053][49728] Signal inference workers to stop experience collection... (19950 times) [2024-04-26 16:25:59,073][49750] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-04-26 16:25:59,119][49728] Signal inference workers to resume experience collection... (19950 times) [2024-04-26 16:25:59,120][49750] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-04-26 16:25:59,244][49750] Updated weights for policy 0, policy_version 218511 (0.0034) [2024-04-26 16:26:02,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3580231680. Throughput: 0: 50793.4. Samples: 1333127100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:26:02,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 16:26:02,103][49750] Updated weights for policy 0, policy_version 218521 (0.0028) [2024-04-26 16:26:05,689][49750] Updated weights for policy 0, policy_version 218531 (0.0030) [2024-04-26 16:26:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 3580493824. Throughput: 0: 50535.7. Samples: 1333281720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:26:07,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 16:26:08,644][49750] Updated weights for policy 0, policy_version 218541 (0.0031) [2024-04-26 16:26:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3580723200. Throughput: 0: 50683.1. Samples: 1333587300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:26:12,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 16:26:12,083][49750] Updated weights for policy 0, policy_version 218551 (0.0038) [2024-04-26 16:26:15,195][49750] Updated weights for policy 0, policy_version 218561 (0.0022) [2024-04-26 16:26:17,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3580985344. Throughput: 0: 50762.3. Samples: 1333885980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:26:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 16:26:18,557][49750] Updated weights for policy 0, policy_version 218571 (0.0037) [2024-04-26 16:26:21,623][49750] Updated weights for policy 0, policy_version 218581 (0.0029) [2024-04-26 16:26:22,063][49517] Fps is (10 sec: 52427.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3581247488. Throughput: 0: 50776.3. Samples: 1334037220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:26:22,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:26:25,012][49750] Updated weights for policy 0, policy_version 218591 (0.0029) [2024-04-26 16:26:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51336.3, 300 sec: 50707.1). Total num frames: 3581493248. Throughput: 0: 50718.5. Samples: 1334344580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:26:27,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 16:26:28,011][49750] Updated weights for policy 0, policy_version 218601 (0.0034) [2024-04-26 16:26:31,357][49750] Updated weights for policy 0, policy_version 218611 (0.0035) [2024-04-26 16:26:32,062][49517] Fps is (10 sec: 50791.6, 60 sec: 51063.7, 300 sec: 50651.6). Total num frames: 3581755392. Throughput: 0: 50689.5. Samples: 1334648600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:26:32,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 16:26:34,469][49750] Updated weights for policy 0, policy_version 218621 (0.0030) [2024-04-26 16:26:37,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3581984768. Throughput: 0: 50784.4. Samples: 1334800580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 16:26:37,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 16:26:37,741][49750] Updated weights for policy 0, policy_version 218631 (0.0024) [2024-04-26 16:26:40,901][49750] Updated weights for policy 0, policy_version 218641 (0.0031) [2024-04-26 16:26:42,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3582263296. Throughput: 0: 50910.0. Samples: 1335109560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:26:42,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 16:26:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000218644_3582263296.pth... 
[2024-04-26 16:26:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000217901_3570089984.pth [2024-04-26 16:26:44,284][49750] Updated weights for policy 0, policy_version 218651 (0.0025) [2024-04-26 16:26:47,063][49517] Fps is (10 sec: 52426.4, 60 sec: 50790.0, 300 sec: 50651.5). Total num frames: 3582509056. Throughput: 0: 50765.2. Samples: 1335411560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:26:47,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 16:26:47,519][49750] Updated weights for policy 0, policy_version 218661 (0.0033) [2024-04-26 16:26:50,588][49750] Updated weights for policy 0, policy_version 218671 (0.0034) [2024-04-26 16:26:52,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3582771200. Throughput: 0: 50728.1. Samples: 1335564500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:26:52,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 16:26:53,942][49750] Updated weights for policy 0, policy_version 218681 (0.0034) [2024-04-26 16:26:57,014][49750] Updated weights for policy 0, policy_version 218691 (0.0030) [2024-04-26 16:26:57,063][49517] Fps is (10 sec: 52430.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3583033344. Throughput: 0: 50752.7. Samples: 1335871180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:26:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 16:27:00,434][49750] Updated weights for policy 0, policy_version 218701 (0.0028) [2024-04-26 16:27:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3583262720. Throughput: 0: 50693.3. Samples: 1336167180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:02,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 16:27:03,524][49750] Updated weights for policy 0, policy_version 218711 (0.0039) [2024-04-26 16:27:06,904][49750] Updated weights for policy 0, policy_version 218721 (0.0034) [2024-04-26 16:27:07,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3583541248. Throughput: 0: 50739.2. Samples: 1336320480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:07,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 16:27:09,995][49750] Updated weights for policy 0, policy_version 218731 (0.0032) [2024-04-26 16:27:12,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3583770624. Throughput: 0: 50659.7. Samples: 1336624260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:12,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-26 16:27:13,231][49750] Updated weights for policy 0, policy_version 218741 (0.0036) [2024-04-26 16:27:16,299][49750] Updated weights for policy 0, policy_version 218751 (0.0035) [2024-04-26 16:27:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3584032768. Throughput: 0: 50608.8. Samples: 1336926000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:17,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 16:27:17,343][49728] Signal inference workers to stop experience collection... (20000 times) [2024-04-26 16:27:17,344][49728] Signal inference workers to resume experience collection... 
(20000 times) [2024-04-26 16:27:17,358][49750] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-04-26 16:27:17,358][49750] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-04-26 16:27:19,656][49750] Updated weights for policy 0, policy_version 218761 (0.0032) [2024-04-26 16:27:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3584278528. Throughput: 0: 50727.6. Samples: 1337083320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 16:27:22,741][49750] Updated weights for policy 0, policy_version 218771 (0.0032) [2024-04-26 16:27:26,171][49750] Updated weights for policy 0, policy_version 218781 (0.0030) [2024-04-26 16:27:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3584524288. Throughput: 0: 50665.4. Samples: 1337389500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:27,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 16:27:29,296][49750] Updated weights for policy 0, policy_version 218791 (0.0030) [2024-04-26 16:27:32,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3584802816. Throughput: 0: 50859.5. Samples: 1337700220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:32,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 16:27:32,602][49750] Updated weights for policy 0, policy_version 218801 (0.0028) [2024-04-26 16:27:35,717][49750] Updated weights for policy 0, policy_version 218811 (0.0037) [2024-04-26 16:27:37,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3585048576. Throughput: 0: 50810.8. Samples: 1337850980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:37,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 16:27:38,945][49750] Updated weights for policy 0, policy_version 218821 (0.0032) [2024-04-26 16:27:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3585310720. Throughput: 0: 50790.3. Samples: 1338156740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:42,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 16:27:42,210][49750] Updated weights for policy 0, policy_version 218831 (0.0031) [2024-04-26 16:27:45,576][49750] Updated weights for policy 0, policy_version 218841 (0.0030) [2024-04-26 16:27:47,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.6, 300 sec: 50762.6). Total num frames: 3585540096. Throughput: 0: 50915.9. Samples: 1338458400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 16:27:47,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 16:27:48,589][49750] Updated weights for policy 0, policy_version 218851 (0.0032) [2024-04-26 16:27:52,005][49750] Updated weights for policy 0, policy_version 218861 (0.0028) [2024-04-26 16:27:52,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3585818624. Throughput: 0: 50916.8. Samples: 1338611740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:27:52,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 16:27:54,999][49750] Updated weights for policy 0, policy_version 218871 (0.0036) [2024-04-26 16:27:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3586064384. Throughput: 0: 50764.1. Samples: 1338908640. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:27:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 16:27:58,450][49750] Updated weights for policy 0, policy_version 218881 (0.0032) [2024-04-26 16:28:01,371][49750] Updated weights for policy 0, policy_version 218891 (0.0027) [2024-04-26 16:28:02,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3586326528. Throughput: 0: 50787.5. Samples: 1339211440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:02,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 16:28:04,966][49750] Updated weights for policy 0, policy_version 218901 (0.0031) [2024-04-26 16:28:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3586572288. Throughput: 0: 50898.6. Samples: 1339373760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:07,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 16:28:07,806][49750] Updated weights for policy 0, policy_version 218911 (0.0032) [2024-04-26 16:28:11,389][49750] Updated weights for policy 0, policy_version 218921 (0.0033) [2024-04-26 16:28:12,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3586818048. Throughput: 0: 50970.1. Samples: 1339683160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:12,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 16:28:14,268][49750] Updated weights for policy 0, policy_version 218931 (0.0029) [2024-04-26 16:28:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 3587063808. Throughput: 0: 50757.3. Samples: 1339984300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:17,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 16:28:17,892][49728] Signal inference workers to stop experience collection... 
(20050 times) [2024-04-26 16:28:17,892][49728] Signal inference workers to resume experience collection... (20050 times) [2024-04-26 16:28:17,905][49750] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-04-26 16:28:17,924][49750] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-04-26 16:28:18,036][49750] Updated weights for policy 0, policy_version 218941 (0.0029) [2024-04-26 16:28:20,618][49750] Updated weights for policy 0, policy_version 218951 (0.0031) [2024-04-26 16:28:22,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3587325952. Throughput: 0: 50754.6. Samples: 1340134940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:22,063][49517] Avg episode reward: [(0, '0.678')] [2024-04-26 16:28:24,344][49750] Updated weights for policy 0, policy_version 218961 (0.0031) [2024-04-26 16:28:26,983][49750] Updated weights for policy 0, policy_version 218971 (0.0031) [2024-04-26 16:28:27,063][49517] Fps is (10 sec: 55705.4, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 3587620864. Throughput: 0: 50829.7. Samples: 1340444080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:27,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 16:28:30,873][49750] Updated weights for policy 0, policy_version 218981 (0.0029) [2024-04-26 16:28:32,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3587817472. Throughput: 0: 50885.1. Samples: 1340748220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 16:28:33,444][49750] Updated weights for policy 0, policy_version 218991 (0.0044) [2024-04-26 16:28:37,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3588096000. Throughput: 0: 50696.6. Samples: 1340893080. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:37,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 16:28:37,265][49750] Updated weights for policy 0, policy_version 219001 (0.0032) [2024-04-26 16:28:40,086][49750] Updated weights for policy 0, policy_version 219011 (0.0027) [2024-04-26 16:28:42,063][49517] Fps is (10 sec: 54066.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3588358144. Throughput: 0: 50851.9. Samples: 1341196980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 16:28:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000219016_3588358144.pth... [2024-04-26 16:28:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000218271_3576152064.pth [2024-04-26 16:28:43,928][49750] Updated weights for policy 0, policy_version 219021 (0.0033) [2024-04-26 16:28:46,570][49750] Updated weights for policy 0, policy_version 219031 (0.0028) [2024-04-26 16:28:47,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3588603904. Throughput: 0: 50871.1. Samples: 1341500640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:47,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:28:50,565][49750] Updated weights for policy 0, policy_version 219041 (0.0027) [2024-04-26 16:28:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3588866048. Throughput: 0: 50852.3. Samples: 1341662120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 16:28:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 16:28:52,855][49750] Updated weights for policy 0, policy_version 219051 (0.0028) [2024-04-26 16:28:56,907][49750] Updated weights for policy 0, policy_version 219061 (0.0034) [2024-04-26 16:28:57,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.2, 300 sec: 50707.3). 
Total num frames: 3589095424. Throughput: 0: 50817.3. Samples: 1341969940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:28:57,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 16:28:59,297][49750] Updated weights for policy 0, policy_version 219071 (0.0027) [2024-04-26 16:29:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3589357568. Throughput: 0: 50796.1. Samples: 1342270120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:02,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 16:29:03,349][49750] Updated weights for policy 0, policy_version 219081 (0.0029) [2024-04-26 16:29:05,751][49750] Updated weights for policy 0, policy_version 219091 (0.0029) [2024-04-26 16:29:07,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3589636096. Throughput: 0: 50856.5. Samples: 1342423480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:07,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 16:29:09,646][49750] Updated weights for policy 0, policy_version 219101 (0.0030) [2024-04-26 16:29:12,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3589898240. Throughput: 0: 50666.4. Samples: 1342724060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:12,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 16:29:12,284][49750] Updated weights for policy 0, policy_version 219111 (0.0031) [2024-04-26 16:29:15,954][49750] Updated weights for policy 0, policy_version 219121 (0.0032) [2024-04-26 16:29:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3590127616. Throughput: 0: 50743.5. Samples: 1343031680. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:17,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 16:29:18,787][49750] Updated weights for policy 0, policy_version 219131 (0.0032) [2024-04-26 16:29:22,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3590373376. Throughput: 0: 50848.7. Samples: 1343181280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:22,064][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:29:22,619][49750] Updated weights for policy 0, policy_version 219141 (0.0035) [2024-04-26 16:29:25,126][49750] Updated weights for policy 0, policy_version 219151 (0.0030) [2024-04-26 16:29:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.4, 300 sec: 50651.6). Total num frames: 3590619136. Throughput: 0: 50820.6. Samples: 1343483900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:27,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 16:29:28,889][49750] Updated weights for policy 0, policy_version 219161 (0.0029) [2024-04-26 16:29:31,235][49728] Signal inference workers to stop experience collection... (20100 times) [2024-04-26 16:29:31,238][49728] Signal inference workers to resume experience collection... (20100 times) [2024-04-26 16:29:31,265][49750] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-04-26 16:29:31,265][49750] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-04-26 16:29:31,570][49750] Updated weights for policy 0, policy_version 219171 (0.0031) [2024-04-26 16:29:32,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.4, 300 sec: 50818.2). Total num frames: 3590914048. Throughput: 0: 50747.0. Samples: 1343784260. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:32,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 16:29:35,416][49750] Updated weights for policy 0, policy_version 219181 (0.0034) [2024-04-26 16:29:37,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3591143424. Throughput: 0: 50713.0. Samples: 1343944200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:37,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 16:29:38,004][49750] Updated weights for policy 0, policy_version 219191 (0.0030) [2024-04-26 16:29:41,847][49750] Updated weights for policy 0, policy_version 219201 (0.0041) [2024-04-26 16:29:42,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3591389184. Throughput: 0: 50726.8. Samples: 1344252640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:42,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 16:29:44,465][49750] Updated weights for policy 0, policy_version 219211 (0.0033) [2024-04-26 16:29:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3591634944. Throughput: 0: 50842.8. Samples: 1344558040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 16:29:48,246][49750] Updated weights for policy 0, policy_version 219221 (0.0030) [2024-04-26 16:29:50,941][49750] Updated weights for policy 0, policy_version 219231 (0.0030) [2024-04-26 16:29:52,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3591913472. Throughput: 0: 50814.8. Samples: 1344710140. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:52,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 16:29:54,734][49750] Updated weights for policy 0, policy_version 219241 (0.0037) [2024-04-26 16:29:57,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 3592175616. Throughput: 0: 50941.4. Samples: 1345016420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:29:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 16:29:57,516][49750] Updated weights for policy 0, policy_version 219251 (0.0029) [2024-04-26 16:30:01,063][49750] Updated weights for policy 0, policy_version 219261 (0.0034) [2024-04-26 16:30:02,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3592388608. Throughput: 0: 50808.5. Samples: 1345318060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 16:30:02,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 16:30:04,161][49750] Updated weights for policy 0, policy_version 219271 (0.0032) [2024-04-26 16:30:07,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3592667136. Throughput: 0: 50750.8. Samples: 1345465060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 16:30:07,330][49750] Updated weights for policy 0, policy_version 219281 (0.0035) [2024-04-26 16:30:10,504][49750] Updated weights for policy 0, policy_version 219291 (0.0028) [2024-04-26 16:30:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3592912896. Throughput: 0: 50789.4. Samples: 1345769420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:12,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 16:30:13,753][49750] Updated weights for policy 0, policy_version 219301 (0.0029) [2024-04-26 16:30:16,844][49750] Updated weights for policy 0, policy_version 219311 (0.0028) [2024-04-26 16:30:17,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3593191424. Throughput: 0: 50907.4. Samples: 1346075080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:17,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 16:30:20,204][49750] Updated weights for policy 0, policy_version 219321 (0.0025) [2024-04-26 16:30:22,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3593420800. Throughput: 0: 50900.5. Samples: 1346234720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:22,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 16:30:23,331][49750] Updated weights for policy 0, policy_version 219331 (0.0030) [2024-04-26 16:30:26,825][49750] Updated weights for policy 0, policy_version 219341 (0.0031) [2024-04-26 16:30:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3593682944. Throughput: 0: 50769.8. Samples: 1346537280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:27,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 16:30:29,855][49750] Updated weights for policy 0, policy_version 219351 (0.0032) [2024-04-26 16:30:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3593928704. Throughput: 0: 50686.2. Samples: 1346838920. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 16:30:33,470][49750] Updated weights for policy 0, policy_version 219361 (0.0027) [2024-04-26 16:30:36,303][49750] Updated weights for policy 0, policy_version 219371 (0.0031) [2024-04-26 16:30:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3594190848. Throughput: 0: 50707.1. Samples: 1346991960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:37,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 16:30:39,868][49750] Updated weights for policy 0, policy_version 219381 (0.0032) [2024-04-26 16:30:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3594452992. Throughput: 0: 50745.2. Samples: 1347299960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:42,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 16:30:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000219388_3594452992.pth... [2024-04-26 16:30:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000218644_3582263296.pth [2024-04-26 16:30:42,728][49750] Updated weights for policy 0, policy_version 219391 (0.0028) [2024-04-26 16:30:46,214][49750] Updated weights for policy 0, policy_version 219401 (0.0032) [2024-04-26 16:30:47,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3594682368. Throughput: 0: 50679.5. Samples: 1347598640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:47,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 16:30:49,214][49750] Updated weights for policy 0, policy_version 219411 (0.0033) [2024-04-26 16:30:52,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3594944512. Throughput: 0: 50582.2. Samples: 1347741260. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:52,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 16:30:52,659][49750] Updated weights for policy 0, policy_version 219421 (0.0032) [2024-04-26 16:30:55,730][49750] Updated weights for policy 0, policy_version 219431 (0.0034) [2024-04-26 16:30:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3595206656. Throughput: 0: 50787.0. Samples: 1348054840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:30:57,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:30:59,136][49750] Updated weights for policy 0, policy_version 219441 (0.0032) [2024-04-26 16:31:02,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3595452416. Throughput: 0: 50841.3. Samples: 1348362940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:31:02,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 16:31:02,323][49750] Updated weights for policy 0, policy_version 219451 (0.0031) [2024-04-26 16:31:03,999][49728] Signal inference workers to stop experience collection... (20150 times) [2024-04-26 16:31:04,000][49728] Signal inference workers to resume experience collection... (20150 times) [2024-04-26 16:31:04,027][49750] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-04-26 16:31:04,027][49750] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-04-26 16:31:05,606][49750] Updated weights for policy 0, policy_version 219461 (0.0031) [2024-04-26 16:31:07,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3595714560. Throughput: 0: 50510.1. Samples: 1348507680. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:31:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 16:31:08,738][49750] Updated weights for policy 0, policy_version 219471 (0.0027) [2024-04-26 16:31:11,967][49750] Updated weights for policy 0, policy_version 219481 (0.0033) [2024-04-26 16:31:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3595976704. Throughput: 0: 50596.3. Samples: 1348814120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 16:31:12,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 16:31:15,339][49750] Updated weights for policy 0, policy_version 219491 (0.0032) [2024-04-26 16:31:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3596238848. Throughput: 0: 50616.4. Samples: 1349116660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:31:17,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 16:31:18,406][49750] Updated weights for policy 0, policy_version 219501 (0.0039) [2024-04-26 16:31:21,899][49750] Updated weights for policy 0, policy_version 219511 (0.0029) [2024-04-26 16:31:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3596468224. Throughput: 0: 50746.3. Samples: 1349275540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:31:22,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 16:31:24,965][49750] Updated weights for policy 0, policy_version 219521 (0.0035) [2024-04-26 16:31:27,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3596713984. Throughput: 0: 50645.5. Samples: 1349579000. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:31:27,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 16:31:28,192][49750] Updated weights for policy 0, policy_version 219531 (0.0037) [2024-04-26 16:31:31,306][49750] Updated weights for policy 0, policy_version 219541 (0.0031) [2024-04-26 16:31:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3596976128. Throughput: 0: 50755.5. Samples: 1349882640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:31:32,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 16:31:34,545][49750] Updated weights for policy 0, policy_version 219551 (0.0029) [2024-04-26 16:31:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3597238272. Throughput: 0: 50901.8. Samples: 1350031840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:31:37,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 16:31:37,945][49750] Updated weights for policy 0, policy_version 219561 (0.0026) [2024-04-26 16:31:40,893][49750] Updated weights for policy 0, policy_version 219571 (0.0033) [2024-04-26 16:31:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3597500416. Throughput: 0: 50824.9. Samples: 1350341960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:31:42,063][49517] Avg episode reward: [(0, '0.687')] [2024-04-26 16:31:44,413][49750] Updated weights for policy 0, policy_version 219581 (0.0030) [2024-04-26 16:31:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3597746176. Throughput: 0: 50709.8. Samples: 1350644880. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:31:47,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 16:31:47,397][49750] Updated weights for policy 0, policy_version 219591 (0.0042) [2024-04-26 16:31:50,883][49750] Updated weights for policy 0, policy_version 219601 (0.0034) [2024-04-26 16:31:52,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 3597975552. Throughput: 0: 50872.3. Samples: 1350796940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:31:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 16:31:53,682][49750] Updated weights for policy 0, policy_version 219611 (0.0030) [2024-04-26 16:31:57,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3598237696. Throughput: 0: 50791.9. Samples: 1351099760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:31:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 16:31:57,273][49750] Updated weights for policy 0, policy_version 219621 (0.0033) [2024-04-26 16:32:00,218][49750] Updated weights for policy 0, policy_version 219631 (0.0030) [2024-04-26 16:32:02,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3598516224. Throughput: 0: 50725.8. Samples: 1351399320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:32:02,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 16:32:03,728][49750] Updated weights for policy 0, policy_version 219641 (0.0030) [2024-04-26 16:32:06,896][49750] Updated weights for policy 0, policy_version 219651 (0.0031) [2024-04-26 16:32:07,063][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3598761984. Throughput: 0: 50916.4. Samples: 1351566780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:32:07,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 16:32:10,299][49750] Updated weights for policy 0, policy_version 219661 (0.0033) [2024-04-26 16:32:11,732][49728] Signal inference workers to stop experience collection... (20200 times) [2024-04-26 16:32:11,732][49728] Signal inference workers to resume experience collection... (20200 times) [2024-04-26 16:32:11,748][49750] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-04-26 16:32:11,748][49750] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-04-26 16:32:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3599007744. Throughput: 0: 50811.5. Samples: 1351865520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:32:12,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 16:32:13,304][49750] Updated weights for policy 0, policy_version 219671 (0.0029) [2024-04-26 16:32:16,786][49750] Updated weights for policy 0, policy_version 219681 (0.0037) [2024-04-26 16:32:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3599253504. Throughput: 0: 50853.9. Samples: 1352171060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 16:32:17,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 16:32:19,633][49750] Updated weights for policy 0, policy_version 219691 (0.0034) [2024-04-26 16:32:22,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.2, 300 sec: 50873.7). Total num frames: 3599532032. Throughput: 0: 50813.6. Samples: 1352318460. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:32:22,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 16:32:23,162][49750] Updated weights for policy 0, policy_version 219701 (0.0033) [2024-04-26 16:32:26,161][49750] Updated weights for policy 0, policy_version 219711 (0.0028) [2024-04-26 16:32:27,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 3599794176. Throughput: 0: 50581.7. Samples: 1352618140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:32:27,071][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 16:32:29,568][49750] Updated weights for policy 0, policy_version 219721 (0.0030) [2024-04-26 16:32:32,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3600039936. Throughput: 0: 50764.8. Samples: 1352929300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:32:32,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 16:32:32,576][49750] Updated weights for policy 0, policy_version 219731 (0.0030) [2024-04-26 16:32:36,111][49750] Updated weights for policy 0, policy_version 219741 (0.0033) [2024-04-26 16:32:37,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3600285696. Throughput: 0: 50820.6. Samples: 1353083860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:32:37,072][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 16:32:38,910][49750] Updated weights for policy 0, policy_version 219751 (0.0029) [2024-04-26 16:32:42,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 3600515072. Throughput: 0: 50766.9. Samples: 1353384260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:32:42,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 16:32:42,079][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000219758_3600515072.pth... 
[2024-04-26 16:32:42,137][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000219016_3588358144.pth [2024-04-26 16:32:42,900][49750] Updated weights for policy 0, policy_version 219761 (0.0030) [2024-04-26 16:32:45,407][49750] Updated weights for policy 0, policy_version 219771 (0.0032) [2024-04-26 16:32:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3600793600. Throughput: 0: 50812.4. Samples: 1353685880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:32:47,072][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 16:32:49,379][49750] Updated weights for policy 0, policy_version 219781 (0.0031) [2024-04-26 16:32:51,805][49750] Updated weights for policy 0, policy_version 219791 (0.0032) [2024-04-26 16:32:52,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51609.8, 300 sec: 50873.7). Total num frames: 3601072128. Throughput: 0: 50739.6. Samples: 1353850060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:32:52,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 16:32:55,736][49750] Updated weights for policy 0, policy_version 219801 (0.0030) [2024-04-26 16:32:57,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3601285120. Throughput: 0: 50940.3. Samples: 1354157840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:32:57,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 16:32:58,152][49750] Updated weights for policy 0, policy_version 219811 (0.0028) [2024-04-26 16:33:02,018][49750] Updated weights for policy 0, policy_version 219821 (0.0032) [2024-04-26 16:33:02,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3601547264. Throughput: 0: 50958.6. Samples: 1354464200. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:33:02,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:33:04,571][49750] Updated weights for policy 0, policy_version 219831 (0.0029) [2024-04-26 16:33:07,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3601809408. Throughput: 0: 50789.4. Samples: 1354603980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:33:07,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 16:33:08,511][49750] Updated weights for policy 0, policy_version 219841 (0.0031) [2024-04-26 16:33:10,391][49728] Signal inference workers to stop experience collection... (20250 times) [2024-04-26 16:33:10,427][49750] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-04-26 16:33:10,459][49728] Signal inference workers to resume experience collection... (20250 times) [2024-04-26 16:33:10,462][49750] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-04-26 16:33:11,128][49750] Updated weights for policy 0, policy_version 219851 (0.0027) [2024-04-26 16:33:12,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51609.5, 300 sec: 50984.8). Total num frames: 3602104320. Throughput: 0: 51057.3. Samples: 1354915720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:33:12,063][49517] Avg episode reward: [(0, '0.451')] [2024-04-26 16:33:14,926][49750] Updated weights for policy 0, policy_version 219861 (0.0038) [2024-04-26 16:33:17,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3602317312. Throughput: 0: 50879.6. Samples: 1355218880. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:33:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 16:33:17,476][49750] Updated weights for policy 0, policy_version 219871 (0.0032) [2024-04-26 16:33:21,382][49750] Updated weights for policy 0, policy_version 219881 (0.0030) [2024-04-26 16:33:22,063][49517] Fps is (10 sec: 45875.4, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3602563072. Throughput: 0: 50805.3. Samples: 1355370100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:33:22,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 16:33:23,913][49750] Updated weights for policy 0, policy_version 219891 (0.0033) [2024-04-26 16:33:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 3602808832. Throughput: 0: 51029.4. Samples: 1355680580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 16:33:27,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-26 16:33:27,662][49750] Updated weights for policy 0, policy_version 219901 (0.0029) [2024-04-26 16:33:30,243][49750] Updated weights for policy 0, policy_version 219911 (0.0028) [2024-04-26 16:33:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3603087360. Throughput: 0: 50934.7. Samples: 1355977940. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:33:32,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 16:33:34,127][49750] Updated weights for policy 0, policy_version 219921 (0.0034) [2024-04-26 16:33:36,594][49750] Updated weights for policy 0, policy_version 219931 (0.0027) [2024-04-26 16:33:37,062][49517] Fps is (10 sec: 55705.3, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3603365888. Throughput: 0: 50936.0. Samples: 1356142180. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:33:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 16:33:40,598][49750] Updated weights for policy 0, policy_version 219941 (0.0029) [2024-04-26 16:33:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3603595264. Throughput: 0: 50916.2. Samples: 1356449060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:33:42,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 16:33:42,961][49750] Updated weights for policy 0, policy_version 219951 (0.0037) [2024-04-26 16:33:47,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3603841024. Throughput: 0: 50883.2. Samples: 1356753940. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:33:47,066][49750] Updated weights for policy 0, policy_version 219961 (0.0030) [2024-04-26 16:33:47,071][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 16:33:49,397][49750] Updated weights for policy 0, policy_version 219971 (0.0031) [2024-04-26 16:33:52,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3604086784. Throughput: 0: 50708.5. Samples: 1356885860. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:33:52,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 16:33:53,582][49750] Updated weights for policy 0, policy_version 219981 (0.0027) [2024-04-26 16:33:55,808][49750] Updated weights for policy 0, policy_version 219991 (0.0026) [2024-04-26 16:33:57,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51609.7, 300 sec: 50929.3). Total num frames: 3604381696. Throughput: 0: 50769.1. Samples: 1357200320. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:33:57,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 16:33:59,936][49750] Updated weights for policy 0, policy_version 220001 (0.0032) [2024-04-26 16:34:02,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3604627456. Throughput: 0: 50908.4. Samples: 1357509760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:34:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 16:34:02,290][49750] Updated weights for policy 0, policy_version 220011 (0.0027) [2024-04-26 16:34:06,342][49750] Updated weights for policy 0, policy_version 220021 (0.0029) [2024-04-26 16:34:07,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3604840448. Throughput: 0: 50819.7. Samples: 1357656980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:34:07,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 16:34:08,874][49750] Updated weights for policy 0, policy_version 220031 (0.0033) [2024-04-26 16:34:12,062][49517] Fps is (10 sec: 45875.0, 60 sec: 49698.3, 300 sec: 50707.1). Total num frames: 3605086208. Throughput: 0: 50711.5. Samples: 1357962600. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:34:12,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 16:34:12,300][49728] Signal inference workers to stop experience collection... (20300 times) [2024-04-26 16:34:12,306][49728] Signal inference workers to resume experience collection... 
(20300 times) [2024-04-26 16:34:12,329][49750] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-04-26 16:34:12,329][49750] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-04-26 16:34:12,764][49750] Updated weights for policy 0, policy_version 220041 (0.0030) [2024-04-26 16:34:15,447][49750] Updated weights for policy 0, policy_version 220051 (0.0030) [2024-04-26 16:34:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3605381120. Throughput: 0: 50944.6. Samples: 1358270440. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:34:17,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 16:34:19,173][49750] Updated weights for policy 0, policy_version 220061 (0.0032) [2024-04-26 16:34:21,807][49750] Updated weights for policy 0, policy_version 220071 (0.0029) [2024-04-26 16:34:22,063][49517] Fps is (10 sec: 55704.9, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 3605643264. Throughput: 0: 50839.4. Samples: 1358429960. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:34:22,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 16:34:25,686][49750] Updated weights for policy 0, policy_version 220081 (0.0034) [2024-04-26 16:34:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3605889024. Throughput: 0: 50856.8. Samples: 1358737620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:34:27,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 16:34:28,092][49750] Updated weights for policy 0, policy_version 220091 (0.0028) [2024-04-26 16:34:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3606118400. Throughput: 0: 50831.9. Samples: 1359041380. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:34:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 16:34:32,183][49750] Updated weights for policy 0, policy_version 220101 (0.0034) [2024-04-26 16:34:34,576][49750] Updated weights for policy 0, policy_version 220111 (0.0030) [2024-04-26 16:34:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3606380544. Throughput: 0: 50973.8. Samples: 1359179680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 16:34:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 16:34:38,624][49750] Updated weights for policy 0, policy_version 220121 (0.0038) [2024-04-26 16:34:41,132][49750] Updated weights for policy 0, policy_version 220131 (0.0033) [2024-04-26 16:34:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3606642688. Throughput: 0: 50754.4. Samples: 1359484280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:34:42,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 16:34:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000220132_3606642688.pth... [2024-04-26 16:34:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000219388_3594452992.pth [2024-04-26 16:34:45,110][49750] Updated weights for policy 0, policy_version 220141 (0.0028) [2024-04-26 16:34:47,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3606921216. Throughput: 0: 50735.1. Samples: 1359792840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:34:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 16:34:47,519][49750] Updated weights for policy 0, policy_version 220151 (0.0031) [2024-04-26 16:34:51,712][49750] Updated weights for policy 0, policy_version 220161 (0.0030) [2024-04-26 16:34:52,063][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.4, 300 sec: 50762.6). 
Total num frames: 3607150592. Throughput: 0: 51089.6. Samples: 1359956020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:34:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 16:34:53,839][49750] Updated weights for policy 0, policy_version 220171 (0.0030) [2024-04-26 16:34:57,063][49517] Fps is (10 sec: 45874.4, 60 sec: 49971.0, 300 sec: 50818.1). Total num frames: 3607379968. Throughput: 0: 50852.3. Samples: 1360250960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:34:57,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 16:34:58,134][49750] Updated weights for policy 0, policy_version 220181 (0.0029) [2024-04-26 16:35:00,326][49750] Updated weights for policy 0, policy_version 220191 (0.0034) [2024-04-26 16:35:02,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3607658496. Throughput: 0: 50701.6. Samples: 1360552020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:35:02,064][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 16:35:04,615][49750] Updated weights for policy 0, policy_version 220201 (0.0030) [2024-04-26 16:35:07,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 3607920640. Throughput: 0: 50825.3. Samples: 1360717100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:35:07,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 16:35:07,066][49750] Updated weights for policy 0, policy_version 220211 (0.0029) [2024-04-26 16:35:11,059][49750] Updated weights for policy 0, policy_version 220221 (0.0033) [2024-04-26 16:35:11,350][49728] Signal inference workers to stop experience collection... (20350 times) [2024-04-26 16:35:11,351][49728] Signal inference workers to resume experience collection... 
(20350 times) [2024-04-26 16:35:11,365][49750] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-04-26 16:35:11,365][49750] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-04-26 16:35:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 3608166400. Throughput: 0: 50728.3. Samples: 1361020400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:35:12,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 16:35:13,524][49750] Updated weights for policy 0, policy_version 220231 (0.0029) [2024-04-26 16:35:17,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3608395776. Throughput: 0: 50803.5. Samples: 1361327540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:35:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 16:35:17,440][49750] Updated weights for policy 0, policy_version 220241 (0.0028) [2024-04-26 16:35:20,104][49750] Updated weights for policy 0, policy_version 220251 (0.0033) [2024-04-26 16:35:22,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 3608674304. Throughput: 0: 50777.6. Samples: 1361464680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:35:22,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 16:35:23,711][49750] Updated weights for policy 0, policy_version 220261 (0.0032) [2024-04-26 16:35:26,410][49750] Updated weights for policy 0, policy_version 220271 (0.0029) [2024-04-26 16:35:27,066][49517] Fps is (10 sec: 52410.6, 60 sec: 50514.4, 300 sec: 50817.6). Total num frames: 3608920064. Throughput: 0: 50755.3. Samples: 1361768440. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:35:27,067][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 16:35:30,109][49750] Updated weights for policy 0, policy_version 220281 (0.0030) [2024-04-26 16:35:32,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3609198592. Throughput: 0: 50743.4. Samples: 1362076300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:35:32,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 16:35:32,957][49750] Updated weights for policy 0, policy_version 220291 (0.0033) [2024-04-26 16:35:36,461][49750] Updated weights for policy 0, policy_version 220301 (0.0031) [2024-04-26 16:35:37,062][49517] Fps is (10 sec: 52447.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3609444352. Throughput: 0: 50799.6. Samples: 1362242000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:35:37,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 16:35:39,446][49750] Updated weights for policy 0, policy_version 220311 (0.0032) [2024-04-26 16:35:42,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3609673728. Throughput: 0: 51169.5. Samples: 1362553580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 16:35:42,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 16:35:42,709][49750] Updated weights for policy 0, policy_version 220321 (0.0029) [2024-04-26 16:35:45,778][49750] Updated weights for policy 0, policy_version 220331 (0.0032) [2024-04-26 16:35:47,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 3609952256. Throughput: 0: 51212.0. Samples: 1362856560. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:35:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 16:35:49,141][49750] Updated weights for policy 0, policy_version 220341 (0.0036) [2024-04-26 16:35:52,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3610214400. Throughput: 0: 50914.4. Samples: 1363008240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:35:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 16:35:52,097][49750] Updated weights for policy 0, policy_version 220351 (0.0030) [2024-04-26 16:35:55,646][49750] Updated weights for policy 0, policy_version 220361 (0.0029) [2024-04-26 16:35:57,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51609.8, 300 sec: 50929.3). Total num frames: 3610476544. Throughput: 0: 50998.1. Samples: 1363315300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:35:57,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:35:58,618][49750] Updated weights for policy 0, policy_version 220371 (0.0033) [2024-04-26 16:36:01,996][49750] Updated weights for policy 0, policy_version 220381 (0.0034) [2024-04-26 16:36:02,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3610722304. Throughput: 0: 51029.2. Samples: 1363623860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:02,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:36:04,943][49750] Updated weights for policy 0, policy_version 220391 (0.0033) [2024-04-26 16:36:07,062][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3610951680. Throughput: 0: 51190.3. Samples: 1363768240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:07,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 16:36:08,534][49750] Updated weights for policy 0, policy_version 220401 (0.0031) [2024-04-26 16:36:11,491][49750] Updated weights for policy 0, policy_version 220411 (0.0036) [2024-04-26 16:36:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3611230208. Throughput: 0: 51148.1. Samples: 1364069940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:12,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 16:36:14,997][49750] Updated weights for policy 0, policy_version 220421 (0.0033) [2024-04-26 16:36:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3611475968. Throughput: 0: 51108.6. Samples: 1364376180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:17,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 16:36:17,954][49750] Updated weights for policy 0, policy_version 220431 (0.0031) [2024-04-26 16:36:21,339][49750] Updated weights for policy 0, policy_version 220441 (0.0031) [2024-04-26 16:36:22,062][49517] Fps is (10 sec: 49153.5, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 3611721728. Throughput: 0: 50951.6. Samples: 1364534820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:22,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 16:36:24,529][49750] Updated weights for policy 0, policy_version 220451 (0.0027) [2024-04-26 16:36:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50793.3, 300 sec: 50818.2). Total num frames: 3611967488. Throughput: 0: 50716.8. Samples: 1364835840. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:27,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:36:27,780][49750] Updated weights for policy 0, policy_version 220461 (0.0031) [2024-04-26 16:36:31,328][49750] Updated weights for policy 0, policy_version 220471 (0.0028) [2024-04-26 16:36:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3612213248. Throughput: 0: 50805.0. Samples: 1365142780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:32,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 16:36:33,705][49728] Signal inference workers to stop experience collection... (20400 times) [2024-04-26 16:36:33,705][49728] Signal inference workers to resume experience collection... (20400 times) [2024-04-26 16:36:33,718][49750] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-04-26 16:36:33,718][49750] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-04-26 16:36:34,275][49750] Updated weights for policy 0, policy_version 220481 (0.0033) [2024-04-26 16:36:37,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3612491776. Throughput: 0: 50787.9. Samples: 1365293700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 16:36:37,666][49750] Updated weights for policy 0, policy_version 220491 (0.0038) [2024-04-26 16:36:40,629][49750] Updated weights for policy 0, policy_version 220501 (0.0031) [2024-04-26 16:36:42,062][49517] Fps is (10 sec: 55705.3, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 3612770304. Throughput: 0: 50620.8. Samples: 1365593240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:42,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 16:36:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000220506_3612770304.pth... 
[2024-04-26 16:36:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000219758_3600515072.pth [2024-04-26 16:36:43,976][49750] Updated weights for policy 0, policy_version 220511 (0.0031) [2024-04-26 16:36:46,906][49750] Updated weights for policy 0, policy_version 220521 (0.0036) [2024-04-26 16:36:47,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.6, 300 sec: 50984.8). Total num frames: 3613016064. Throughput: 0: 50585.2. Samples: 1365900180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:47,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 16:36:50,485][49750] Updated weights for policy 0, policy_version 220531 (0.0034) [2024-04-26 16:36:52,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3613229056. Throughput: 0: 50769.0. Samples: 1366052840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:36:52,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 16:36:53,697][49750] Updated weights for policy 0, policy_version 220541 (0.0032) [2024-04-26 16:36:56,967][49750] Updated weights for policy 0, policy_version 220551 (0.0029) [2024-04-26 16:36:57,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.1, 300 sec: 50818.2). Total num frames: 3613507584. Throughput: 0: 50688.1. Samples: 1366350900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:36:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 16:37:00,311][49750] Updated weights for policy 0, policy_version 220561 (0.0035) [2024-04-26 16:37:02,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3613769728. Throughput: 0: 50632.3. Samples: 1366654640. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 16:37:03,335][49750] Updated weights for policy 0, policy_version 220571 (0.0030) [2024-04-26 16:37:06,650][49750] Updated weights for policy 0, policy_version 220581 (0.0031) [2024-04-26 16:37:07,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3614015488. Throughput: 0: 50759.7. Samples: 1366819020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 16:37:09,853][49750] Updated weights for policy 0, policy_version 220591 (0.0034) [2024-04-26 16:37:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 3614261248. Throughput: 0: 50802.6. Samples: 1367121960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:12,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 16:37:12,970][49750] Updated weights for policy 0, policy_version 220601 (0.0032) [2024-04-26 16:37:16,171][49750] Updated weights for policy 0, policy_version 220611 (0.0026) [2024-04-26 16:37:17,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3614507008. Throughput: 0: 50725.7. Samples: 1367425440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:17,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 16:37:19,313][49750] Updated weights for policy 0, policy_version 220621 (0.0029) [2024-04-26 16:37:22,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3614769152. Throughput: 0: 50760.6. Samples: 1367577920. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:22,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 16:37:22,795][49750] Updated weights for policy 0, policy_version 220631 (0.0030) [2024-04-26 16:37:25,740][49750] Updated weights for policy 0, policy_version 220641 (0.0028) [2024-04-26 16:37:27,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3615047680. Throughput: 0: 50947.9. Samples: 1367885900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:37:29,369][49750] Updated weights for policy 0, policy_version 220651 (0.0028) [2024-04-26 16:37:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3615293440. Throughput: 0: 50850.7. Samples: 1368188460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:32,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 16:37:32,076][49750] Updated weights for policy 0, policy_version 220661 (0.0036) [2024-04-26 16:37:35,820][49750] Updated weights for policy 0, policy_version 220671 (0.0030) [2024-04-26 16:37:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.6, 300 sec: 50984.8). Total num frames: 3615555584. Throughput: 0: 50841.3. Samples: 1368340700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:37,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 16:37:38,445][49750] Updated weights for policy 0, policy_version 220681 (0.0033) [2024-04-26 16:37:42,062][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3615784960. Throughput: 0: 51024.1. Samples: 1368646980. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 16:37:42,201][49750] Updated weights for policy 0, policy_version 220691 (0.0032) [2024-04-26 16:37:44,917][49750] Updated weights for policy 0, policy_version 220701 (0.0029) [2024-04-26 16:37:45,628][49728] Signal inference workers to stop experience collection... (20450 times) [2024-04-26 16:37:45,666][49750] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-04-26 16:37:45,700][49728] Signal inference workers to resume experience collection... (20450 times) [2024-04-26 16:37:45,701][49750] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-04-26 16:37:47,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 3616063488. Throughput: 0: 50920.4. Samples: 1368946060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:47,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:37:48,726][49750] Updated weights for policy 0, policy_version 220711 (0.0027) [2024-04-26 16:37:51,449][49750] Updated weights for policy 0, policy_version 220721 (0.0042) [2024-04-26 16:37:52,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51609.4, 300 sec: 50984.8). Total num frames: 3616325632. Throughput: 0: 50785.8. Samples: 1369104380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:37:55,184][49750] Updated weights for policy 0, policy_version 220731 (0.0028) [2024-04-26 16:37:57,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3616555008. Throughput: 0: 50867.1. Samples: 1369410980. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 16:37:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 16:37:57,875][49750] Updated weights for policy 0, policy_version 220741 (0.0034) [2024-04-26 16:38:01,707][49750] Updated weights for policy 0, policy_version 220751 (0.0035) [2024-04-26 16:38:02,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3616817152. Throughput: 0: 51028.1. Samples: 1369721700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:02,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 16:38:04,210][49750] Updated weights for policy 0, policy_version 220761 (0.0034) [2024-04-26 16:38:07,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3617062912. Throughput: 0: 50749.3. Samples: 1369861640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 16:38:08,075][49750] Updated weights for policy 0, policy_version 220771 (0.0034) [2024-04-26 16:38:10,748][49750] Updated weights for policy 0, policy_version 220781 (0.0042) [2024-04-26 16:38:12,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3617325056. Throughput: 0: 50669.4. Samples: 1370166020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:12,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 16:38:14,482][49750] Updated weights for policy 0, policy_version 220791 (0.0030) [2024-04-26 16:38:17,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 3617587200. Throughput: 0: 50804.3. Samples: 1370474660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:17,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 16:38:17,150][49750] Updated weights for policy 0, policy_version 220801 (0.0037) [2024-04-26 16:38:20,956][49750] Updated weights for policy 0, policy_version 220811 (0.0030) [2024-04-26 16:38:22,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3617816576. Throughput: 0: 50804.6. Samples: 1370626920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:22,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 16:38:23,753][49750] Updated weights for policy 0, policy_version 220821 (0.0029) [2024-04-26 16:38:27,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3618062336. Throughput: 0: 50675.2. Samples: 1370927360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 16:38:27,424][49750] Updated weights for policy 0, policy_version 220831 (0.0033) [2024-04-26 16:38:30,066][49750] Updated weights for policy 0, policy_version 220841 (0.0027) [2024-04-26 16:38:32,063][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3618357248. Throughput: 0: 50869.8. Samples: 1371235200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 16:38:33,935][49750] Updated weights for policy 0, policy_version 220851 (0.0031) [2024-04-26 16:38:36,581][49750] Updated weights for policy 0, policy_version 220861 (0.0031) [2024-04-26 16:38:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3618586624. Throughput: 0: 50772.7. Samples: 1371389140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:37,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 16:38:40,345][49750] Updated weights for policy 0, policy_version 220871 (0.0031) [2024-04-26 16:38:42,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3618848768. Throughput: 0: 50771.6. Samples: 1371695700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 16:38:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000220877_3618848768.pth... [2024-04-26 16:38:42,113][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000220132_3606642688.pth [2024-04-26 16:38:43,093][49750] Updated weights for policy 0, policy_version 220881 (0.0036) [2024-04-26 16:38:46,833][49750] Updated weights for policy 0, policy_version 220891 (0.0036) [2024-04-26 16:38:47,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 3619078144. Throughput: 0: 50655.9. Samples: 1372001220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:38:49,683][49750] Updated weights for policy 0, policy_version 220901 (0.0039) [2024-04-26 16:38:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3619340288. Throughput: 0: 50716.8. Samples: 1372143900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:52,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 16:38:53,008][49728] Signal inference workers to stop experience collection... (20500 times) [2024-04-26 16:38:53,070][49750] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-04-26 16:38:53,070][49728] Signal inference workers to resume experience collection... 
(20500 times) [2024-04-26 16:38:53,087][49750] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-04-26 16:38:53,215][49750] Updated weights for policy 0, policy_version 220911 (0.0040) [2024-04-26 16:38:56,063][49750] Updated weights for policy 0, policy_version 220921 (0.0032) [2024-04-26 16:38:57,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3619618816. Throughput: 0: 50779.5. Samples: 1372451100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:38:57,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 16:38:59,517][49750] Updated weights for policy 0, policy_version 220931 (0.0028) [2024-04-26 16:39:02,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 3619848192. Throughput: 0: 50733.7. Samples: 1372757680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:39:02,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 16:39:02,435][49750] Updated weights for policy 0, policy_version 220941 (0.0029) [2024-04-26 16:39:05,944][49750] Updated weights for policy 0, policy_version 220951 (0.0033) [2024-04-26 16:39:07,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 3620110336. Throughput: 0: 50620.1. Samples: 1372904820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 16:39:07,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 16:39:09,188][49750] Updated weights for policy 0, policy_version 220961 (0.0036) [2024-04-26 16:39:12,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3620356096. Throughput: 0: 50651.4. Samples: 1373206680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:12,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 16:39:12,434][49750] Updated weights for policy 0, policy_version 220971 (0.0026) [2024-04-26 16:39:15,604][49750] Updated weights for policy 0, policy_version 220981 (0.0029) [2024-04-26 16:39:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3620634624. Throughput: 0: 50715.3. Samples: 1373517380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:17,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 16:39:18,940][49750] Updated weights for policy 0, policy_version 220991 (0.0029) [2024-04-26 16:39:21,952][49750] Updated weights for policy 0, policy_version 221001 (0.0032) [2024-04-26 16:39:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 3620880384. Throughput: 0: 50747.6. Samples: 1373672780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 16:39:25,596][49750] Updated weights for policy 0, policy_version 221011 (0.0039) [2024-04-26 16:39:27,062][49517] Fps is (10 sec: 50789.9, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 3621142528. Throughput: 0: 50836.4. Samples: 1373983340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 16:39:28,393][49750] Updated weights for policy 0, policy_version 221021 (0.0033) [2024-04-26 16:39:32,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 3621355520. Throughput: 0: 50768.5. Samples: 1374285800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:32,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 16:39:32,129][49750] Updated weights for policy 0, policy_version 221031 (0.0028) [2024-04-26 16:39:34,830][49750] Updated weights for policy 0, policy_version 221041 (0.0034) [2024-04-26 16:39:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3621634048. Throughput: 0: 50859.6. Samples: 1374432580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:37,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 16:39:38,513][49750] Updated weights for policy 0, policy_version 221051 (0.0035) [2024-04-26 16:39:41,316][49750] Updated weights for policy 0, policy_version 221061 (0.0029) [2024-04-26 16:39:42,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3621896192. Throughput: 0: 50775.5. Samples: 1374736000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:39:44,956][49750] Updated weights for policy 0, policy_version 221071 (0.0034) [2024-04-26 16:39:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3622141952. Throughput: 0: 50733.5. Samples: 1375040680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:47,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 16:39:47,690][49750] Updated weights for policy 0, policy_version 221081 (0.0031) [2024-04-26 16:39:51,468][49750] Updated weights for policy 0, policy_version 221091 (0.0032) [2024-04-26 16:39:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3622387712. Throughput: 0: 50713.3. Samples: 1375186920. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:52,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 16:39:54,298][49750] Updated weights for policy 0, policy_version 221101 (0.0032) [2024-04-26 16:39:57,063][49517] Fps is (10 sec: 49150.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3622633472. Throughput: 0: 50736.8. Samples: 1375489840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:39:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:39:57,859][49750] Updated weights for policy 0, policy_version 221111 (0.0038) [2024-04-26 16:40:00,785][49750] Updated weights for policy 0, policy_version 221121 (0.0033) [2024-04-26 16:40:01,370][49728] Signal inference workers to stop experience collection... (20550 times) [2024-04-26 16:40:01,424][49750] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-04-26 16:40:01,438][49728] Signal inference workers to resume experience collection... (20550 times) [2024-04-26 16:40:01,444][49750] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-04-26 16:40:02,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3622928384. Throughput: 0: 50656.8. Samples: 1375796940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:40:02,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 16:40:04,360][49750] Updated weights for policy 0, policy_version 221131 (0.0033) [2024-04-26 16:40:07,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3623157760. Throughput: 0: 50655.3. Samples: 1375952280. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:40:07,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 16:40:07,451][49750] Updated weights for policy 0, policy_version 221141 (0.0031) [2024-04-26 16:40:10,703][49750] Updated weights for policy 0, policy_version 221151 (0.0030) [2024-04-26 16:40:12,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3623403520. Throughput: 0: 50557.6. Samples: 1376258440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:40:12,064][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 16:40:13,801][49750] Updated weights for policy 0, policy_version 221161 (0.0042) [2024-04-26 16:40:17,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 3623649280. Throughput: 0: 50532.3. Samples: 1376559760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 16:40:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 16:40:17,207][49750] Updated weights for policy 0, policy_version 221171 (0.0036) [2024-04-26 16:40:20,350][49750] Updated weights for policy 0, policy_version 221181 (0.0033) [2024-04-26 16:40:22,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.3, 300 sec: 50818.8). Total num frames: 3623911424. Throughput: 0: 50556.5. Samples: 1376707620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:40:22,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 16:40:23,717][49750] Updated weights for policy 0, policy_version 221191 (0.0031) [2024-04-26 16:40:26,855][49750] Updated weights for policy 0, policy_version 221201 (0.0035) [2024-04-26 16:40:27,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3624173568. Throughput: 0: 50638.9. Samples: 1377014740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:40:27,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 16:40:30,290][49750] Updated weights for policy 0, policy_version 221211 (0.0032) [2024-04-26 16:40:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3624419328. Throughput: 0: 50599.5. Samples: 1377317660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:40:32,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 16:40:33,160][49750] Updated weights for policy 0, policy_version 221221 (0.0032) [2024-04-26 16:40:36,796][49750] Updated weights for policy 0, policy_version 221231 (0.0030) [2024-04-26 16:40:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3624665088. Throughput: 0: 50569.5. Samples: 1377462540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:40:37,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 16:40:39,483][49750] Updated weights for policy 0, policy_version 221241 (0.0035) [2024-04-26 16:40:42,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3624910848. Throughput: 0: 50713.9. Samples: 1377771960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:40:42,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 16:40:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000221247_3624910848.pth... [2024-04-26 16:40:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000220506_3612770304.pth [2024-04-26 16:40:43,045][49750] Updated weights for policy 0, policy_version 221251 (0.0032) [2024-04-26 16:40:45,946][49750] Updated weights for policy 0, policy_version 221261 (0.0033) [2024-04-26 16:40:47,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3625189376. Throughput: 0: 50649.5. Samples: 1378076160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:40:47,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 16:40:49,472][49750] Updated weights for policy 0, policy_version 221271 (0.0026) [2024-04-26 16:40:52,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50707.0). Total num frames: 3625435136. Throughput: 0: 50752.0. Samples: 1378236120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:40:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 16:40:52,404][49750] Updated weights for policy 0, policy_version 221281 (0.0028) [2024-04-26 16:40:55,945][49750] Updated weights for policy 0, policy_version 221291 (0.0030) [2024-04-26 16:40:57,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3625680896. Throughput: 0: 50782.8. Samples: 1378543660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:40:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 16:40:58,792][49750] Updated weights for policy 0, policy_version 221301 (0.0033) [2024-04-26 16:41:02,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3625943040. Throughput: 0: 50925.4. Samples: 1378851400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:41:02,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 16:41:02,354][49750] Updated weights for policy 0, policy_version 221311 (0.0026) [2024-04-26 16:41:05,143][49750] Updated weights for policy 0, policy_version 221321 (0.0030) [2024-04-26 16:41:07,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3626205184. Throughput: 0: 50965.7. Samples: 1379001080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:41:07,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 16:41:08,692][49750] Updated weights for policy 0, policy_version 221331 (0.0034) [2024-04-26 16:41:11,628][49750] Updated weights for policy 0, policy_version 221341 (0.0026) [2024-04-26 16:41:12,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 3626467328. Throughput: 0: 50976.2. Samples: 1379308680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:41:12,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 16:41:15,082][49750] Updated weights for policy 0, policy_version 221351 (0.0031) [2024-04-26 16:41:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3626680320. Throughput: 0: 50922.6. Samples: 1379609180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:41:17,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:41:18,157][49750] Updated weights for policy 0, policy_version 221361 (0.0031) [2024-04-26 16:41:20,864][49728] Signal inference workers to stop experience collection... (20600 times) [2024-04-26 16:41:20,880][49750] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-04-26 16:41:20,973][49728] Signal inference workers to resume experience collection... (20600 times) [2024-04-26 16:41:20,973][49750] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-04-26 16:41:21,546][49750] Updated weights for policy 0, policy_version 221371 (0.0028) [2024-04-26 16:41:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3626958848. Throughput: 0: 50877.7. Samples: 1379752040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:41:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 16:41:24,637][49750] Updated weights for policy 0, policy_version 221381 (0.0031) [2024-04-26 16:41:27,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3627204608. Throughput: 0: 50733.9. Samples: 1380054980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:41:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 16:41:28,092][49750] Updated weights for policy 0, policy_version 221391 (0.0030) [2024-04-26 16:41:30,999][49750] Updated weights for policy 0, policy_version 221401 (0.0033) [2024-04-26 16:41:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3627483136. Throughput: 0: 50687.5. Samples: 1380357100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:41:32,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 16:41:34,509][49750] Updated weights for policy 0, policy_version 221411 (0.0028) [2024-04-26 16:41:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50651.6). Total num frames: 3627712512. Throughput: 0: 50741.1. Samples: 1380519460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:41:37,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 16:41:37,476][49750] Updated weights for policy 0, policy_version 221421 (0.0025) [2024-04-26 16:41:40,979][49750] Updated weights for policy 0, policy_version 221431 (0.0030) [2024-04-26 16:41:42,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3627958272. Throughput: 0: 50660.9. Samples: 1380823400. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:41:42,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 16:41:43,953][49750] Updated weights for policy 0, policy_version 221441 (0.0029) [2024-04-26 16:41:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3628220416. Throughput: 0: 50659.3. Samples: 1381131060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:41:47,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 16:41:47,533][49750] Updated weights for policy 0, policy_version 221451 (0.0033) [2024-04-26 16:41:50,401][49750] Updated weights for policy 0, policy_version 221461 (0.0029) [2024-04-26 16:41:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 3628482560. Throughput: 0: 50716.6. Samples: 1381283320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:41:52,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 16:41:53,947][49750] Updated weights for policy 0, policy_version 221471 (0.0029) [2024-04-26 16:41:56,854][49750] Updated weights for policy 0, policy_version 221481 (0.0030) [2024-04-26 16:41:57,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3628761088. Throughput: 0: 50688.9. Samples: 1381589680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:41:57,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 16:42:00,321][49750] Updated weights for policy 0, policy_version 221491 (0.0032) [2024-04-26 16:42:02,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3628957696. Throughput: 0: 50559.8. Samples: 1381884380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:42:02,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 16:42:03,230][49750] Updated weights for policy 0, policy_version 221501 (0.0031) [2024-04-26 16:42:06,811][49750] Updated weights for policy 0, policy_version 221511 (0.0033) [2024-04-26 16:42:07,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3629236224. Throughput: 0: 50763.0. Samples: 1382036380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:42:07,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 16:42:09,678][49750] Updated weights for policy 0, policy_version 221521 (0.0032) [2024-04-26 16:42:12,063][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3629498368. Throughput: 0: 50727.4. Samples: 1382337720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:42:12,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 16:42:13,413][49750] Updated weights for policy 0, policy_version 221531 (0.0028) [2024-04-26 16:42:16,259][49750] Updated weights for policy 0, policy_version 221541 (0.0031) [2024-04-26 16:42:17,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3629760512. Throughput: 0: 50697.8. Samples: 1382638500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:42:17,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 16:42:19,746][49750] Updated weights for policy 0, policy_version 221551 (0.0028) [2024-04-26 16:42:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3629989888. Throughput: 0: 50770.6. Samples: 1382804140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:42:22,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 16:42:22,541][49750] Updated weights for policy 0, policy_version 221561 (0.0030) [2024-04-26 16:42:25,390][49728] Signal inference workers to stop experience collection... 
(20650 times) [2024-04-26 16:42:25,436][49750] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-04-26 16:42:25,498][49728] Signal inference workers to resume experience collection... (20650 times) [2024-04-26 16:42:25,498][49750] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-04-26 16:42:26,237][49750] Updated weights for policy 0, policy_version 221571 (0.0035) [2024-04-26 16:42:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3630252032. Throughput: 0: 50839.1. Samples: 1383111160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:42:27,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:42:29,002][49750] Updated weights for policy 0, policy_version 221581 (0.0029) [2024-04-26 16:42:32,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.1, 300 sec: 50651.5). Total num frames: 3630497792. Throughput: 0: 50626.0. Samples: 1383409240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 16:42:32,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 16:42:32,572][49750] Updated weights for policy 0, policy_version 221591 (0.0030) [2024-04-26 16:42:35,408][49750] Updated weights for policy 0, policy_version 221601 (0.0032) [2024-04-26 16:42:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3630759936. Throughput: 0: 50647.6. Samples: 1383562460. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:42:37,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 16:42:39,127][49750] Updated weights for policy 0, policy_version 221611 (0.0029) [2024-04-26 16:42:41,918][49750] Updated weights for policy 0, policy_version 221621 (0.0029) [2024-04-26 16:42:42,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3631038464. Throughput: 0: 50622.2. Samples: 1383867680. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:42:42,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 16:42:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000221621_3631038464.pth... [2024-04-26 16:42:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000220877_3618848768.pth [2024-04-26 16:42:45,527][49750] Updated weights for policy 0, policy_version 221631 (0.0029) [2024-04-26 16:42:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3631251456. Throughput: 0: 50922.9. Samples: 1384175900. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:42:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:42:48,424][49750] Updated weights for policy 0, policy_version 221641 (0.0028) [2024-04-26 16:42:51,901][49750] Updated weights for policy 0, policy_version 221651 (0.0032) [2024-04-26 16:42:52,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3631529984. Throughput: 0: 50770.3. Samples: 1384321040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:42:52,064][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 16:42:54,866][49750] Updated weights for policy 0, policy_version 221661 (0.0026) [2024-04-26 16:42:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3631775744. Throughput: 0: 50759.7. Samples: 1384621900. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:42:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:42:58,547][49750] Updated weights for policy 0, policy_version 221671 (0.0032) [2024-04-26 16:43:01,266][49750] Updated weights for policy 0, policy_version 221681 (0.0032) [2024-04-26 16:43:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3632021504. Throughput: 0: 50689.8. Samples: 1384919540. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:43:02,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 16:43:05,079][49750] Updated weights for policy 0, policy_version 221691 (0.0026) [2024-04-26 16:43:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3632283648. Throughput: 0: 50614.4. Samples: 1385081780. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:43:07,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 16:43:07,718][49750] Updated weights for policy 0, policy_version 221701 (0.0036) [2024-04-26 16:43:11,370][49750] Updated weights for policy 0, policy_version 221711 (0.0030) [2024-04-26 16:43:12,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 3632529408. Throughput: 0: 50693.8. Samples: 1385392380. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:43:12,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 16:43:14,230][49750] Updated weights for policy 0, policy_version 221721 (0.0027) [2024-04-26 16:43:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3632791552. Throughput: 0: 50836.7. Samples: 1385696880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:43:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 16:43:17,731][49750] Updated weights for policy 0, policy_version 221731 (0.0029) [2024-04-26 16:43:20,537][49750] Updated weights for policy 0, policy_version 221741 (0.0032) [2024-04-26 16:43:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3633053696. Throughput: 0: 50856.8. Samples: 1385851020. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:43:22,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 16:43:24,241][49750] Updated weights for policy 0, policy_version 221751 (0.0031) [2024-04-26 16:43:26,513][49728] Signal inference workers to stop experience collection... (20700 times) [2024-04-26 16:43:26,513][49728] Signal inference workers to resume experience collection... (20700 times) [2024-04-26 16:43:26,543][49750] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-04-26 16:43:26,544][49750] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-04-26 16:43:26,965][49750] Updated weights for policy 0, policy_version 221761 (0.0038) [2024-04-26 16:43:27,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3633332224. Throughput: 0: 50941.0. Samples: 1386160020. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:43:27,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 16:43:30,556][49750] Updated weights for policy 0, policy_version 221771 (0.0031) [2024-04-26 16:43:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3633561600. Throughput: 0: 50859.1. Samples: 1386464560. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:43:32,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 16:43:33,617][49750] Updated weights for policy 0, policy_version 221781 (0.0029) [2024-04-26 16:43:36,839][49750] Updated weights for policy 0, policy_version 221791 (0.0029) [2024-04-26 16:43:37,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3633823744. Throughput: 0: 50872.5. Samples: 1386610300. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:43:37,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 16:43:40,017][49750] Updated weights for policy 0, policy_version 221801 (0.0032) [2024-04-26 16:43:42,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3634069504. Throughput: 0: 51044.3. Samples: 1386918900. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-26 16:43:42,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 16:43:43,262][49750] Updated weights for policy 0, policy_version 221811 (0.0029) [2024-04-26 16:43:46,438][49750] Updated weights for policy 0, policy_version 221821 (0.0029) [2024-04-26 16:43:47,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 3634331648. Throughput: 0: 51086.0. Samples: 1387218420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:43:47,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 16:43:49,721][49750] Updated weights for policy 0, policy_version 221831 (0.0028) [2024-04-26 16:43:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3634561024. Throughput: 0: 50970.5. Samples: 1387375460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:43:52,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 16:43:53,152][49750] Updated weights for policy 0, policy_version 221841 (0.0028) [2024-04-26 16:43:56,031][49750] Updated weights for policy 0, policy_version 221851 (0.0032) [2024-04-26 16:43:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3634823168. Throughput: 0: 50849.9. Samples: 1387680620. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:43:57,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 16:43:59,550][49750] Updated weights for policy 0, policy_version 221861 (0.0026) [2024-04-26 16:44:02,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3635101696. Throughput: 0: 50891.0. Samples: 1387986980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:02,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 16:44:02,339][49750] Updated weights for policy 0, policy_version 221871 (0.0029) [2024-04-26 16:44:05,957][49750] Updated weights for policy 0, policy_version 221881 (0.0027) [2024-04-26 16:44:07,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50762.7). Total num frames: 3635331072. Throughput: 0: 50853.0. Samples: 1388139400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:07,063][49517] Avg episode reward: [(0, '0.446')] [2024-04-26 16:44:08,913][49750] Updated weights for policy 0, policy_version 221891 (0.0032) [2024-04-26 16:44:12,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 3635609600. Throughput: 0: 50793.4. Samples: 1388445720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:12,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-26 16:44:12,320][49750] Updated weights for policy 0, policy_version 221901 (0.0034) [2024-04-26 16:44:15,686][49750] Updated weights for policy 0, policy_version 221911 (0.0032) [2024-04-26 16:44:17,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3635855360. Throughput: 0: 50788.4. Samples: 1388750040. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:17,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 16:44:18,743][49750] Updated weights for policy 0, policy_version 221921 (0.0032) [2024-04-26 16:44:22,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3636101120. Throughput: 0: 50906.5. Samples: 1388901100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:22,064][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 16:44:22,360][49750] Updated weights for policy 0, policy_version 221931 (0.0034) [2024-04-26 16:44:23,086][49728] Signal inference workers to stop experience collection... (20750 times) [2024-04-26 16:44:23,116][49750] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-04-26 16:44:23,146][49728] Signal inference workers to resume experience collection... (20750 times) [2024-04-26 16:44:23,146][49750] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-04-26 16:44:25,258][49750] Updated weights for policy 0, policy_version 221941 (0.0029) [2024-04-26 16:44:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3636363264. Throughput: 0: 50809.4. Samples: 1389205320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:27,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 16:44:28,867][49750] Updated weights for policy 0, policy_version 221951 (0.0035) [2024-04-26 16:44:31,717][49750] Updated weights for policy 0, policy_version 221961 (0.0029) [2024-04-26 16:44:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3636609024. Throughput: 0: 50914.8. Samples: 1389509580. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:32,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 16:44:35,273][49750] Updated weights for policy 0, policy_version 221971 (0.0031) [2024-04-26 16:44:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3636871168. Throughput: 0: 50841.0. Samples: 1389663300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:37,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-26 16:44:38,172][49750] Updated weights for policy 0, policy_version 221981 (0.0034) [2024-04-26 16:44:41,685][49750] Updated weights for policy 0, policy_version 221991 (0.0032) [2024-04-26 16:44:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3637116928. Throughput: 0: 50867.0. Samples: 1389969640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:44:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000221992_3637116928.pth... [2024-04-26 16:44:42,114][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000221247_3624910848.pth [2024-04-26 16:44:44,626][49750] Updated weights for policy 0, policy_version 222001 (0.0027) [2024-04-26 16:44:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3637362688. Throughput: 0: 50783.2. Samples: 1390272220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:47,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 16:44:48,033][49750] Updated weights for policy 0, policy_version 222011 (0.0035) [2024-04-26 16:44:51,265][49750] Updated weights for policy 0, policy_version 222021 (0.0031) [2024-04-26 16:44:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3637624832. Throughput: 0: 50913.2. Samples: 1390430500. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 16:44:52,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 16:44:54,309][49750] Updated weights for policy 0, policy_version 222031 (0.0027) [2024-04-26 16:44:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3637886976. Throughput: 0: 50777.3. Samples: 1390730700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:44:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 16:44:57,910][49750] Updated weights for policy 0, policy_version 222041 (0.0030) [2024-04-26 16:45:00,791][49750] Updated weights for policy 0, policy_version 222051 (0.0032) [2024-04-26 16:45:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3638116352. Throughput: 0: 50716.9. Samples: 1391032300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:02,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 16:45:04,283][49750] Updated weights for policy 0, policy_version 222061 (0.0032) [2024-04-26 16:45:07,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3638394880. Throughput: 0: 50797.4. Samples: 1391186980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 16:45:07,250][49750] Updated weights for policy 0, policy_version 222071 (0.0031) [2024-04-26 16:45:10,724][49750] Updated weights for policy 0, policy_version 222081 (0.0037) [2024-04-26 16:45:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3638624256. Throughput: 0: 50815.2. Samples: 1391492000. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:12,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 16:45:13,553][49750] Updated weights for policy 0, policy_version 222091 (0.0035) [2024-04-26 16:45:17,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3638886400. Throughput: 0: 50611.0. Samples: 1391787080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 16:45:17,221][49750] Updated weights for policy 0, policy_version 222101 (0.0033) [2024-04-26 16:45:20,389][49750] Updated weights for policy 0, policy_version 222111 (0.0029) [2024-04-26 16:45:22,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3639132160. Throughput: 0: 50693.3. Samples: 1391944500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 16:45:23,544][49750] Updated weights for policy 0, policy_version 222121 (0.0035) [2024-04-26 16:45:26,698][49750] Updated weights for policy 0, policy_version 222131 (0.0036) [2024-04-26 16:45:27,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3639394304. Throughput: 0: 50594.6. Samples: 1392246400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:27,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:45:30,025][49750] Updated weights for policy 0, policy_version 222141 (0.0033) [2024-04-26 16:45:32,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3639640064. Throughput: 0: 50765.5. Samples: 1392556680. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:32,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 16:45:33,096][49750] Updated weights for policy 0, policy_version 222151 (0.0034) [2024-04-26 16:45:34,999][49728] Signal inference workers to stop experience collection... (20800 times) [2024-04-26 16:45:35,043][49750] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-04-26 16:45:35,106][49728] Signal inference workers to resume experience collection... (20800 times) [2024-04-26 16:45:35,106][49750] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-04-26 16:45:36,490][49750] Updated weights for policy 0, policy_version 222161 (0.0033) [2024-04-26 16:45:37,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3639902208. Throughput: 0: 50597.7. Samples: 1392707400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:37,063][49517] Avg episode reward: [(0, '0.694')] [2024-04-26 16:45:39,458][49750] Updated weights for policy 0, policy_version 222171 (0.0033) [2024-04-26 16:45:42,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3640164352. Throughput: 0: 50701.8. Samples: 1393012280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:42,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 16:45:42,821][49750] Updated weights for policy 0, policy_version 222181 (0.0030) [2024-04-26 16:45:45,868][49750] Updated weights for policy 0, policy_version 222191 (0.0031) [2024-04-26 16:45:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3640393728. Throughput: 0: 50720.0. Samples: 1393314700. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 16:45:49,204][49750] Updated weights for policy 0, policy_version 222201 (0.0032) [2024-04-26 16:45:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3640672256. Throughput: 0: 50569.5. Samples: 1393462600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:52,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 16:45:52,342][49750] Updated weights for policy 0, policy_version 222211 (0.0035) [2024-04-26 16:45:55,980][49750] Updated weights for policy 0, policy_version 222221 (0.0030) [2024-04-26 16:45:57,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3640918016. Throughput: 0: 50551.0. Samples: 1393766800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 16:45:57,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:45:58,846][49750] Updated weights for policy 0, policy_version 222231 (0.0032) [2024-04-26 16:46:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3641163776. Throughput: 0: 50904.2. Samples: 1394077760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:02,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 16:46:02,477][49750] Updated weights for policy 0, policy_version 222241 (0.0030) [2024-04-26 16:46:05,522][49750] Updated weights for policy 0, policy_version 222251 (0.0029) [2024-04-26 16:46:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3641425920. Throughput: 0: 50653.9. Samples: 1394223920. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 16:46:09,021][49750] Updated weights for policy 0, policy_version 222261 (0.0033) [2024-04-26 16:46:11,950][49750] Updated weights for policy 0, policy_version 222271 (0.0034) [2024-04-26 16:46:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3641688064. Throughput: 0: 50704.0. Samples: 1394528080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:12,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 16:46:15,452][49750] Updated weights for policy 0, policy_version 222281 (0.0029) [2024-04-26 16:46:17,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3641933824. Throughput: 0: 50685.4. Samples: 1394837520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:17,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 16:46:18,299][49750] Updated weights for policy 0, policy_version 222291 (0.0033) [2024-04-26 16:46:21,838][49750] Updated weights for policy 0, policy_version 222301 (0.0029) [2024-04-26 16:46:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3642179584. Throughput: 0: 50728.6. Samples: 1394990180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:22,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 16:46:24,881][49750] Updated weights for policy 0, policy_version 222311 (0.0030) [2024-04-26 16:46:27,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3642441728. Throughput: 0: 50763.9. Samples: 1395296660. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:27,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 16:46:28,326][49750] Updated weights for policy 0, policy_version 222321 (0.0038) [2024-04-26 16:46:31,616][49750] Updated weights for policy 0, policy_version 222331 (0.0030) [2024-04-26 16:46:32,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3642703872. Throughput: 0: 50987.9. Samples: 1395609160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:32,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 16:46:34,912][49750] Updated weights for policy 0, policy_version 222341 (0.0037) [2024-04-26 16:46:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3642966016. Throughput: 0: 50930.5. Samples: 1395754480. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:37,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 16:46:38,078][49750] Updated weights for policy 0, policy_version 222351 (0.0031) [2024-04-26 16:46:39,672][49728] Signal inference workers to stop experience collection... (20850 times) [2024-04-26 16:46:39,672][49728] Signal inference workers to resume experience collection... (20850 times) [2024-04-26 16:46:39,683][49750] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-04-26 16:46:39,708][49750] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-04-26 16:46:41,217][49750] Updated weights for policy 0, policy_version 222361 (0.0035) [2024-04-26 16:46:42,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3643195392. Throughput: 0: 50881.9. Samples: 1396056480. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:42,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 16:46:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000222364_3643211776.pth... 
[2024-04-26 16:46:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000221621_3631038464.pth [2024-04-26 16:46:44,416][49750] Updated weights for policy 0, policy_version 222371 (0.0029) [2024-04-26 16:46:47,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3643457536. Throughput: 0: 50519.3. Samples: 1396351140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:47,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 16:46:47,601][49750] Updated weights for policy 0, policy_version 222381 (0.0026) [2024-04-26 16:46:50,808][49750] Updated weights for policy 0, policy_version 222391 (0.0033) [2024-04-26 16:46:52,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 3643703296. Throughput: 0: 50795.8. Samples: 1396509740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:52,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 16:46:54,066][49750] Updated weights for policy 0, policy_version 222401 (0.0028) [2024-04-26 16:46:57,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3643949056. Throughput: 0: 50769.4. Samples: 1396812700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:46:57,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 16:46:57,293][49750] Updated weights for policy 0, policy_version 222411 (0.0034) [2024-04-26 16:47:00,547][49750] Updated weights for policy 0, policy_version 222421 (0.0032) [2024-04-26 16:47:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3644227584. Throughput: 0: 50782.8. Samples: 1397122740. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:47:02,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 16:47:03,728][49750] Updated weights for policy 0, policy_version 222431 (0.0035) [2024-04-26 16:47:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3644440576. Throughput: 0: 50902.3. Samples: 1397280780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 16:47:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 16:47:07,289][49750] Updated weights for policy 0, policy_version 222441 (0.0035) [2024-04-26 16:47:10,144][49750] Updated weights for policy 0, policy_version 222451 (0.0035) [2024-04-26 16:47:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3644735488. Throughput: 0: 50740.3. Samples: 1397579980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:12,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 16:47:13,685][49750] Updated weights for policy 0, policy_version 222461 (0.0031) [2024-04-26 16:47:16,622][49750] Updated weights for policy 0, policy_version 222471 (0.0029) [2024-04-26 16:47:17,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 3644981248. Throughput: 0: 50535.2. Samples: 1397883240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:17,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:47:20,052][49750] Updated weights for policy 0, policy_version 222481 (0.0035) [2024-04-26 16:47:22,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3645259776. Throughput: 0: 50712.1. Samples: 1398036520. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:22,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 16:47:22,983][49750] Updated weights for policy 0, policy_version 222491 (0.0034) [2024-04-26 16:47:26,388][49750] Updated weights for policy 0, policy_version 222501 (0.0031) [2024-04-26 16:47:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3645489152. Throughput: 0: 50792.4. Samples: 1398342140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 16:47:29,292][49750] Updated weights for policy 0, policy_version 222511 (0.0036) [2024-04-26 16:47:32,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3645734912. Throughput: 0: 50935.0. Samples: 1398643200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:32,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 16:47:32,531][49728] Signal inference workers to stop experience collection... (20900 times) [2024-04-26 16:47:32,579][49750] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-04-26 16:47:32,600][49728] Signal inference workers to resume experience collection... (20900 times) [2024-04-26 16:47:32,601][49750] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-04-26 16:47:32,745][49750] Updated weights for policy 0, policy_version 222521 (0.0034) [2024-04-26 16:47:35,953][49750] Updated weights for policy 0, policy_version 222531 (0.0034) [2024-04-26 16:47:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3645997056. Throughput: 0: 50753.5. Samples: 1398793640. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:37,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 16:47:39,161][49750] Updated weights for policy 0, policy_version 222541 (0.0030) [2024-04-26 16:47:42,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3646242816. Throughput: 0: 50825.3. Samples: 1399099840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:42,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 16:47:42,333][49750] Updated weights for policy 0, policy_version 222551 (0.0029) [2024-04-26 16:47:45,455][49750] Updated weights for policy 0, policy_version 222561 (0.0029) [2024-04-26 16:47:47,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3646537728. Throughput: 0: 50790.5. Samples: 1399408320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 16:47:48,675][49750] Updated weights for policy 0, policy_version 222571 (0.0028) [2024-04-26 16:47:51,851][49750] Updated weights for policy 0, policy_version 222581 (0.0029) [2024-04-26 16:47:52,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3646767104. Throughput: 0: 50830.7. Samples: 1399568160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:52,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 16:47:55,176][49750] Updated weights for policy 0, policy_version 222591 (0.0032) [2024-04-26 16:47:57,062][49517] Fps is (10 sec: 47514.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3647012864. Throughput: 0: 50783.7. Samples: 1399865240. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:47:57,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 16:47:58,321][49750] Updated weights for policy 0, policy_version 222601 (0.0027) [2024-04-26 16:48:01,687][49750] Updated weights for policy 0, policy_version 222611 (0.0027) [2024-04-26 16:48:02,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3647258624. Throughput: 0: 50701.6. Samples: 1400164820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:48:02,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 16:48:05,022][49750] Updated weights for policy 0, policy_version 222621 (0.0031) [2024-04-26 16:48:07,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 3647520768. Throughput: 0: 50686.1. Samples: 1400317400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:48:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:48:08,005][49750] Updated weights for policy 0, policy_version 222631 (0.0031) [2024-04-26 16:48:11,311][49750] Updated weights for policy 0, policy_version 222641 (0.0032) [2024-04-26 16:48:12,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3647766528. Throughput: 0: 50713.8. Samples: 1400624260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:48:12,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 16:48:14,423][49750] Updated weights for policy 0, policy_version 222651 (0.0031) [2024-04-26 16:48:17,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3648012288. Throughput: 0: 50704.4. Samples: 1400924900. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 16:48:17,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 16:48:17,908][49750] Updated weights for policy 0, policy_version 222661 (0.0033) [2024-04-26 16:48:20,888][49750] Updated weights for policy 0, policy_version 222671 (0.0033) [2024-04-26 16:48:22,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 50596.0). Total num frames: 3648258048. Throughput: 0: 50628.2. Samples: 1401071920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:48:22,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 16:48:24,296][49750] Updated weights for policy 0, policy_version 222681 (0.0029) [2024-04-26 16:48:27,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3648536576. Throughput: 0: 50628.9. Samples: 1401378140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:48:27,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 16:48:27,285][49750] Updated weights for policy 0, policy_version 222691 (0.0031) [2024-04-26 16:48:28,617][49728] Signal inference workers to stop experience collection... (20950 times) [2024-04-26 16:48:28,654][49750] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-04-26 16:48:28,676][49728] Signal inference workers to resume experience collection... (20950 times) [2024-04-26 16:48:28,678][49750] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-04-26 16:48:30,824][49750] Updated weights for policy 0, policy_version 222701 (0.0029) [2024-04-26 16:48:32,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3648798720. Throughput: 0: 50662.9. Samples: 1401688140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:48:32,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 16:48:34,143][49750] Updated weights for policy 0, policy_version 222711 (0.0030) [2024-04-26 16:48:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3649044480. Throughput: 0: 50477.7. Samples: 1401839660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:48:37,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 16:48:37,191][49750] Updated weights for policy 0, policy_version 222721 (0.0027) [2024-04-26 16:48:40,781][49750] Updated weights for policy 0, policy_version 222731 (0.0028) [2024-04-26 16:48:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3649290240. Throughput: 0: 50620.0. Samples: 1402143140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:48:42,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 16:48:42,140][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000222736_3649306624.pth... [2024-04-26 16:48:42,181][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000221992_3637116928.pth [2024-04-26 16:48:43,565][49750] Updated weights for policy 0, policy_version 222741 (0.0029) [2024-04-26 16:48:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.4, 300 sec: 50762.6). Total num frames: 3649536000. Throughput: 0: 50613.6. Samples: 1402442420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:48:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 16:48:47,147][49750] Updated weights for policy 0, policy_version 222751 (0.0031) [2024-04-26 16:48:49,974][49750] Updated weights for policy 0, policy_version 222761 (0.0028) [2024-04-26 16:48:52,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3649814528. Throughput: 0: 50568.9. Samples: 1402593000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:48:52,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 16:48:53,613][49750] Updated weights for policy 0, policy_version 222771 (0.0030) [2024-04-26 16:48:56,521][49750] Updated weights for policy 0, policy_version 222781 (0.0034) [2024-04-26 16:48:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3650043904. Throughput: 0: 50649.9. Samples: 1402903500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:48:57,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 16:49:00,062][49750] Updated weights for policy 0, policy_version 222791 (0.0033) [2024-04-26 16:49:02,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3650306048. Throughput: 0: 50771.4. Samples: 1403209620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:49:02,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 16:49:03,315][49750] Updated weights for policy 0, policy_version 222801 (0.0033) [2024-04-26 16:49:06,339][49750] Updated weights for policy 0, policy_version 222811 (0.0033) [2024-04-26 16:49:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 3650535424. Throughput: 0: 50686.9. Samples: 1403352820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:49:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:49:09,933][49750] Updated weights for policy 0, policy_version 222821 (0.0031) [2024-04-26 16:49:12,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3650813952. Throughput: 0: 50569.4. Samples: 1403653760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:49:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 16:49:12,761][49750] Updated weights for policy 0, policy_version 222831 (0.0030) [2024-04-26 16:49:16,390][49750] Updated weights for policy 0, policy_version 222841 (0.0041) [2024-04-26 16:49:17,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.4, 300 sec: 50762.7). Total num frames: 3651076096. Throughput: 0: 50596.4. Samples: 1403964980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:49:17,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 16:49:19,194][49750] Updated weights for policy 0, policy_version 222851 (0.0030) [2024-04-26 16:49:22,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3651321856. Throughput: 0: 50477.6. Samples: 1404111160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 16:49:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:49:22,780][49750] Updated weights for policy 0, policy_version 222861 (0.0032) [2024-04-26 16:49:25,494][49750] Updated weights for policy 0, policy_version 222871 (0.0028) [2024-04-26 16:49:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3651567616. Throughput: 0: 50507.5. Samples: 1404415980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:49:27,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 16:49:29,249][49750] Updated weights for policy 0, policy_version 222881 (0.0035) [2024-04-26 16:49:31,762][49750] Updated weights for policy 0, policy_version 222891 (0.0029) [2024-04-26 16:49:32,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3651846144. Throughput: 0: 50676.4. Samples: 1404722860. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:49:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 16:49:35,601][49750] Updated weights for policy 0, policy_version 222901 (0.0032) [2024-04-26 16:49:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3652091904. Throughput: 0: 50890.8. Samples: 1404883080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:49:37,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 16:49:37,684][49728] Signal inference workers to stop experience collection... (21000 times) [2024-04-26 16:49:37,732][49750] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-04-26 16:49:37,751][49728] Signal inference workers to resume experience collection... (21000 times) [2024-04-26 16:49:37,753][49750] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-04-26 16:49:38,652][49750] Updated weights for policy 0, policy_version 222911 (0.0032) [2024-04-26 16:49:41,955][49750] Updated weights for policy 0, policy_version 222921 (0.0030) [2024-04-26 16:49:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3652337664. Throughput: 0: 50800.4. Samples: 1405189520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:49:42,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 16:49:45,490][49750] Updated weights for policy 0, policy_version 222931 (0.0034) [2024-04-26 16:49:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3652583424. Throughput: 0: 50725.4. Samples: 1405492260. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:49:47,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 16:49:48,410][49750] Updated weights for policy 0, policy_version 222941 (0.0033) [2024-04-26 16:49:52,042][49750] Updated weights for policy 0, policy_version 222951 (0.0026) [2024-04-26 16:49:52,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3652829184. Throughput: 0: 50949.6. Samples: 1405645560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:49:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 16:49:54,748][49750] Updated weights for policy 0, policy_version 222961 (0.0033) [2024-04-26 16:49:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3653107712. Throughput: 0: 50915.9. Samples: 1405944980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:49:57,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 16:49:58,413][49750] Updated weights for policy 0, policy_version 222971 (0.0038) [2024-04-26 16:50:01,203][49750] Updated weights for policy 0, policy_version 222981 (0.0031) [2024-04-26 16:50:02,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3653369856. Throughput: 0: 50817.3. Samples: 1406251760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:50:02,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 16:50:04,945][49750] Updated weights for policy 0, policy_version 222991 (0.0033) [2024-04-26 16:50:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3653599232. Throughput: 0: 51249.0. Samples: 1406417360. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:50:07,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 16:50:07,710][49750] Updated weights for policy 0, policy_version 223001 (0.0033) [2024-04-26 16:50:11,309][49750] Updated weights for policy 0, policy_version 223011 (0.0035) [2024-04-26 16:50:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3653861376. Throughput: 0: 51055.4. Samples: 1406713480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:50:12,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 16:50:14,218][49750] Updated weights for policy 0, policy_version 223021 (0.0032) [2024-04-26 16:50:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3654107136. Throughput: 0: 50854.2. Samples: 1407011300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:50:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 16:50:17,752][49750] Updated weights for policy 0, policy_version 223031 (0.0033) [2024-04-26 16:50:20,616][49750] Updated weights for policy 0, policy_version 223041 (0.0034) [2024-04-26 16:50:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3654385664. Throughput: 0: 50910.1. Samples: 1407174040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:50:22,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 16:50:24,177][49750] Updated weights for policy 0, policy_version 223051 (0.0029) [2024-04-26 16:50:27,057][49750] Updated weights for policy 0, policy_version 223061 (0.0033) [2024-04-26 16:50:27,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3654631424. Throughput: 0: 50797.4. Samples: 1407475400. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:50:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 16:50:30,607][49750] Updated weights for policy 0, policy_version 223071 (0.0026) [2024-04-26 16:50:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3654860800. Throughput: 0: 50671.5. Samples: 1407772480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 16:50:32,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 16:50:33,549][49750] Updated weights for policy 0, policy_version 223081 (0.0033) [2024-04-26 16:50:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3655122944. Throughput: 0: 50535.3. Samples: 1407919640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:50:37,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 16:50:37,065][49750] Updated weights for policy 0, policy_version 223091 (0.0030) [2024-04-26 16:50:39,988][49750] Updated weights for policy 0, policy_version 223101 (0.0034) [2024-04-26 16:50:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3655385088. Throughput: 0: 50754.5. Samples: 1408228940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:50:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 16:50:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000223107_3655385088.pth... [2024-04-26 16:50:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000222364_3643211776.pth [2024-04-26 16:50:43,434][49750] Updated weights for policy 0, policy_version 223111 (0.0032) [2024-04-26 16:50:46,360][49750] Updated weights for policy 0, policy_version 223121 (0.0032) [2024-04-26 16:50:47,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3655647232. Throughput: 0: 50652.0. Samples: 1408531100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:50:47,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 16:50:49,975][49750] Updated weights for policy 0, policy_version 223131 (0.0036) [2024-04-26 16:50:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3655876608. Throughput: 0: 50616.4. Samples: 1408695100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:50:52,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:50:52,933][49750] Updated weights for policy 0, policy_version 223141 (0.0028) [2024-04-26 16:50:52,941][49728] Signal inference workers to stop experience collection... (21050 times) [2024-04-26 16:50:52,942][49728] Signal inference workers to resume experience collection... (21050 times) [2024-04-26 16:50:52,954][49750] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-04-26 16:50:52,973][49750] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-04-26 16:50:56,437][49750] Updated weights for policy 0, policy_version 223151 (0.0030) [2024-04-26 16:50:57,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3656122368. Throughput: 0: 50653.9. Samples: 1408992900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:50:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:50:59,275][49750] Updated weights for policy 0, policy_version 223161 (0.0032) [2024-04-26 16:51:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3656384512. Throughput: 0: 50825.3. Samples: 1409298440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:51:02,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 16:51:02,837][49750] Updated weights for policy 0, policy_version 223171 (0.0037) [2024-04-26 16:51:05,608][49750] Updated weights for policy 0, policy_version 223181 (0.0028) [2024-04-26 16:51:07,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3656679424. Throughput: 0: 50668.1. Samples: 1409454100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:51:07,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 16:51:09,197][49750] Updated weights for policy 0, policy_version 223191 (0.0033) [2024-04-26 16:51:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3656908800. Throughput: 0: 50775.5. Samples: 1409760300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:51:12,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:51:12,168][49750] Updated weights for policy 0, policy_version 223201 (0.0032) [2024-04-26 16:51:15,599][49750] Updated weights for policy 0, policy_version 223211 (0.0032) [2024-04-26 16:51:17,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3657138176. Throughput: 0: 50829.3. Samples: 1410059800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:51:17,064][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 16:51:18,671][49750] Updated weights for policy 0, policy_version 223221 (0.0029) [2024-04-26 16:51:22,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3657400320. Throughput: 0: 50869.5. Samples: 1410208780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:51:22,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 16:51:22,240][49750] Updated weights for policy 0, policy_version 223231 (0.0031) [2024-04-26 16:51:25,004][49750] Updated weights for policy 0, policy_version 223241 (0.0030) [2024-04-26 16:51:27,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3657678848. Throughput: 0: 50703.3. Samples: 1410510580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:51:27,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 16:51:28,714][49750] Updated weights for policy 0, policy_version 223251 (0.0032) [2024-04-26 16:51:31,385][49750] Updated weights for policy 0, policy_version 223261 (0.0031) [2024-04-26 16:51:32,063][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3657924608. Throughput: 0: 50737.7. Samples: 1410814300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:51:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 16:51:35,019][49750] Updated weights for policy 0, policy_version 223271 (0.0040) [2024-04-26 16:51:37,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3658170368. Throughput: 0: 50571.4. Samples: 1410970820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:51:37,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 16:51:37,899][49750] Updated weights for policy 0, policy_version 223281 (0.0033) [2024-04-26 16:51:41,426][49750] Updated weights for policy 0, policy_version 223291 (0.0030) [2024-04-26 16:51:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3658416128. Throughput: 0: 50800.8. Samples: 1411278940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:51:42,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 16:51:44,257][49750] Updated weights for policy 0, policy_version 223301 (0.0031) [2024-04-26 16:51:47,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3658661888. Throughput: 0: 50734.2. Samples: 1411581480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:51:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:51:48,074][49750] Updated weights for policy 0, policy_version 223311 (0.0029) [2024-04-26 16:51:50,785][49750] Updated weights for policy 0, policy_version 223321 (0.0029) [2024-04-26 16:51:52,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3658940416. Throughput: 0: 50588.1. Samples: 1411730560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:51:52,063][49517] Avg episode reward: [(0, '0.445')] [2024-04-26 16:51:54,597][49750] Updated weights for policy 0, policy_version 223331 (0.0030) [2024-04-26 16:51:57,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 3659202560. Throughput: 0: 50706.6. Samples: 1412042100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:51:57,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 16:51:57,124][49750] Updated weights for policy 0, policy_version 223341 (0.0032) [2024-04-26 16:51:58,167][49728] Signal inference workers to stop experience collection... (21100 times) [2024-04-26 16:51:58,189][49750] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-04-26 16:51:58,234][49728] Signal inference workers to resume experience collection... 
(21100 times) [2024-04-26 16:51:58,234][49750] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-04-26 16:52:00,951][49750] Updated weights for policy 0, policy_version 223351 (0.0027) [2024-04-26 16:52:02,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3659415552. Throughput: 0: 50809.3. Samples: 1412346220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:02,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 16:52:03,520][49750] Updated weights for policy 0, policy_version 223361 (0.0027) [2024-04-26 16:52:07,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 3659677696. Throughput: 0: 50798.9. Samples: 1412494720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:07,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 16:52:07,323][49750] Updated weights for policy 0, policy_version 223371 (0.0038) [2024-04-26 16:52:10,140][49750] Updated weights for policy 0, policy_version 223381 (0.0036) [2024-04-26 16:52:12,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3659956224. Throughput: 0: 50559.2. Samples: 1412785740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:12,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 16:52:13,968][49750] Updated weights for policy 0, policy_version 223391 (0.0028) [2024-04-26 16:52:16,655][49750] Updated weights for policy 0, policy_version 223401 (0.0031) [2024-04-26 16:52:17,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 3660201984. Throughput: 0: 50708.4. Samples: 1413096180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:17,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 16:52:20,362][49750] Updated weights for policy 0, policy_version 223411 (0.0032) [2024-04-26 16:52:22,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3660447744. Throughput: 0: 50682.8. Samples: 1413251540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:22,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 16:52:23,111][49750] Updated weights for policy 0, policy_version 223421 (0.0032) [2024-04-26 16:52:26,821][49750] Updated weights for policy 0, policy_version 223431 (0.0026) [2024-04-26 16:52:27,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3660693504. Throughput: 0: 50449.0. Samples: 1413549140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:27,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 16:52:29,610][49750] Updated weights for policy 0, policy_version 223441 (0.0030) [2024-04-26 16:52:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3660972032. Throughput: 0: 50491.2. Samples: 1413853580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 16:52:33,132][49750] Updated weights for policy 0, policy_version 223451 (0.0034) [2024-04-26 16:52:36,082][49750] Updated weights for policy 0, policy_version 223461 (0.0033) [2024-04-26 16:52:37,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3661201408. Throughput: 0: 50685.1. Samples: 1414011400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:37,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 16:52:39,725][49750] Updated weights for policy 0, policy_version 223471 (0.0034) [2024-04-26 16:52:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3661463552. Throughput: 0: 50545.5. Samples: 1414316640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:42,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 16:52:42,100][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000223479_3661479936.pth... [2024-04-26 16:52:42,158][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000222736_3649306624.pth [2024-04-26 16:52:42,724][49750] Updated weights for policy 0, policy_version 223481 (0.0031) [2024-04-26 16:52:46,204][49750] Updated weights for policy 0, policy_version 223491 (0.0027) [2024-04-26 16:52:47,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3661725696. Throughput: 0: 50626.6. Samples: 1414624420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 16:52:47,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 16:52:49,075][49750] Updated weights for policy 0, policy_version 223501 (0.0036) [2024-04-26 16:52:52,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.1, 300 sec: 50651.5). Total num frames: 3661955072. Throughput: 0: 50597.1. Samples: 1414771600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:52:52,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 16:52:52,629][49750] Updated weights for policy 0, policy_version 223511 (0.0029) [2024-04-26 16:52:55,477][49750] Updated weights for policy 0, policy_version 223521 (0.0030) [2024-04-26 16:52:57,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3662249984. Throughput: 0: 50902.1. Samples: 1415076340. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:52:57,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 16:52:59,206][49750] Updated weights for policy 0, policy_version 223531 (0.0033) [2024-04-26 16:53:02,060][49750] Updated weights for policy 0, policy_version 223541 (0.0032) [2024-04-26 16:53:02,063][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3662495744. Throughput: 0: 50838.6. Samples: 1415383920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:02,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 16:53:05,637][49750] Updated weights for policy 0, policy_version 223551 (0.0028) [2024-04-26 16:53:07,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3662757888. Throughput: 0: 50910.7. Samples: 1415542520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:53:08,413][49750] Updated weights for policy 0, policy_version 223561 (0.0030) [2024-04-26 16:53:09,314][49728] Signal inference workers to stop experience collection... (21150 times) [2024-04-26 16:53:09,314][49728] Signal inference workers to resume experience collection... (21150 times) [2024-04-26 16:53:09,326][49750] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-04-26 16:53:09,326][49750] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-04-26 16:53:11,937][49750] Updated weights for policy 0, policy_version 223571 (0.0030) [2024-04-26 16:53:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3662987264. Throughput: 0: 51088.7. Samples: 1415848140. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:12,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 16:53:14,846][49750] Updated weights for policy 0, policy_version 223581 (0.0036) [2024-04-26 16:53:17,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3663233024. Throughput: 0: 50911.5. Samples: 1416144600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 16:53:18,320][49750] Updated weights for policy 0, policy_version 223591 (0.0035) [2024-04-26 16:53:21,296][49750] Updated weights for policy 0, policy_version 223601 (0.0025) [2024-04-26 16:53:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3663511552. Throughput: 0: 50896.6. Samples: 1416301740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:22,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 16:53:24,815][49750] Updated weights for policy 0, policy_version 223611 (0.0029) [2024-04-26 16:53:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3663757312. Throughput: 0: 50945.7. Samples: 1416609200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 16:53:27,872][49750] Updated weights for policy 0, policy_version 223621 (0.0029) [2024-04-26 16:53:31,199][49750] Updated weights for policy 0, policy_version 223631 (0.0025) [2024-04-26 16:53:32,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3664035840. Throughput: 0: 51032.8. Samples: 1416920900. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:32,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 16:53:34,368][49750] Updated weights for policy 0, policy_version 223641 (0.0029) [2024-04-26 16:53:37,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3664248832. Throughput: 0: 51026.3. Samples: 1417067780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:53:37,556][49750] Updated weights for policy 0, policy_version 223651 (0.0034) [2024-04-26 16:53:40,682][49750] Updated weights for policy 0, policy_version 223661 (0.0028) [2024-04-26 16:53:42,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3664527360. Throughput: 0: 51116.5. Samples: 1417376580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 16:53:44,095][49750] Updated weights for policy 0, policy_version 223671 (0.0029) [2024-04-26 16:53:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3664773120. Throughput: 0: 50897.8. Samples: 1417674320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 16:53:47,132][49750] Updated weights for policy 0, policy_version 223681 (0.0032) [2024-04-26 16:53:50,635][49750] Updated weights for policy 0, policy_version 223691 (0.0031) [2024-04-26 16:53:52,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.6, 300 sec: 50818.1). Total num frames: 3665035264. Throughput: 0: 50898.1. Samples: 1417832940. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:52,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 16:53:53,570][49750] Updated weights for policy 0, policy_version 223701 (0.0031) [2024-04-26 16:53:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3665264640. Throughput: 0: 50787.6. Samples: 1418133580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 16:53:57,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 16:53:57,165][49750] Updated weights for policy 0, policy_version 223711 (0.0031) [2024-04-26 16:54:00,016][49750] Updated weights for policy 0, policy_version 223721 (0.0035) [2024-04-26 16:54:02,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3665510400. Throughput: 0: 50772.5. Samples: 1418429360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:02,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:54:03,650][49750] Updated weights for policy 0, policy_version 223731 (0.0025) [2024-04-26 16:54:03,870][49728] Signal inference workers to stop experience collection... (21200 times) [2024-04-26 16:54:03,871][49728] Signal inference workers to resume experience collection... (21200 times) [2024-04-26 16:54:03,893][49750] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-04-26 16:54:03,893][49750] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-04-26 16:54:06,399][49750] Updated weights for policy 0, policy_version 223741 (0.0029) [2024-04-26 16:54:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3665788928. Throughput: 0: 50627.5. Samples: 1418579980. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:07,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 16:54:10,021][49750] Updated weights for policy 0, policy_version 223751 (0.0034) [2024-04-26 16:54:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3666034688. Throughput: 0: 50761.3. Samples: 1418893460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:12,063][49517] Avg episode reward: [(0, '0.684')] [2024-04-26 16:54:12,709][49750] Updated weights for policy 0, policy_version 223761 (0.0025) [2024-04-26 16:54:16,517][49750] Updated weights for policy 0, policy_version 223771 (0.0039) [2024-04-26 16:54:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3666296832. Throughput: 0: 50634.4. Samples: 1419199440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:17,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 16:54:19,249][49750] Updated weights for policy 0, policy_version 223781 (0.0034) [2024-04-26 16:54:22,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3666542592. Throughput: 0: 50591.6. Samples: 1419344400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:22,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 16:54:22,910][49750] Updated weights for policy 0, policy_version 223791 (0.0026) [2024-04-26 16:54:25,627][49750] Updated weights for policy 0, policy_version 223801 (0.0032) [2024-04-26 16:54:27,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3666821120. Throughput: 0: 50512.1. Samples: 1419649620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:27,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:54:29,290][49750] Updated weights for policy 0, policy_version 223811 (0.0033) [2024-04-26 16:54:32,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3667066880. Throughput: 0: 50672.1. Samples: 1419954560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:32,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 16:54:32,173][49750] Updated weights for policy 0, policy_version 223821 (0.0030) [2024-04-26 16:54:35,747][49750] Updated weights for policy 0, policy_version 223831 (0.0033) [2024-04-26 16:54:37,062][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3667312640. Throughput: 0: 50685.9. Samples: 1420113800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:37,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 16:54:38,686][49750] Updated weights for policy 0, policy_version 223841 (0.0027) [2024-04-26 16:54:42,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3667558400. Throughput: 0: 50847.9. Samples: 1420421740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:42,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 16:54:42,156][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000223851_3667574784.pth... [2024-04-26 16:54:42,159][49750] Updated weights for policy 0, policy_version 223851 (0.0025) [2024-04-26 16:54:42,205][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000223107_3655385088.pth [2024-04-26 16:54:45,268][49750] Updated weights for policy 0, policy_version 223861 (0.0029) [2024-04-26 16:54:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3667804160. Throughput: 0: 50841.2. Samples: 1420717220. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 16:54:48,505][49750] Updated weights for policy 0, policy_version 223871 (0.0033) [2024-04-26 16:54:51,559][49750] Updated weights for policy 0, policy_version 223881 (0.0034) [2024-04-26 16:54:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3668066304. Throughput: 0: 50816.5. Samples: 1420866720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:52,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 16:54:54,980][49750] Updated weights for policy 0, policy_version 223891 (0.0027) [2024-04-26 16:54:57,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3668344832. Throughput: 0: 50712.4. Samples: 1421175520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:54:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 16:54:57,956][49750] Updated weights for policy 0, policy_version 223901 (0.0034) [2024-04-26 16:55:01,470][49750] Updated weights for policy 0, policy_version 223911 (0.0030) [2024-04-26 16:55:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 3668590592. Throughput: 0: 50637.1. Samples: 1421478120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:55:02,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 16:55:04,420][49750] Updated weights for policy 0, policy_version 223921 (0.0028) [2024-04-26 16:55:07,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3668819968. Throughput: 0: 50662.3. Samples: 1421624200. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 16:55:07,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 16:55:08,070][49750] Updated weights for policy 0, policy_version 223931 (0.0028) [2024-04-26 16:55:09,520][49728] Signal inference workers to stop experience collection... (21250 times) [2024-04-26 16:55:09,526][49728] Signal inference workers to resume experience collection... (21250 times) [2024-04-26 16:55:09,540][49750] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-04-26 16:55:09,540][49750] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-04-26 16:55:10,975][49750] Updated weights for policy 0, policy_version 223941 (0.0033) [2024-04-26 16:55:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3669082112. Throughput: 0: 50720.4. Samples: 1421932040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:12,064][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 16:55:14,357][49750] Updated weights for policy 0, policy_version 223951 (0.0034) [2024-04-26 16:55:17,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3669360640. Throughput: 0: 50640.9. Samples: 1422233400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 16:55:17,343][49750] Updated weights for policy 0, policy_version 223961 (0.0026) [2024-04-26 16:55:20,979][49750] Updated weights for policy 0, policy_version 223971 (0.0030) [2024-04-26 16:55:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3669606400. Throughput: 0: 50810.4. Samples: 1422400260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:22,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 16:55:23,959][49750] Updated weights for policy 0, policy_version 223981 (0.0029) [2024-04-26 16:55:27,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3669852160. Throughput: 0: 50667.6. Samples: 1422701780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:27,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 16:55:27,406][49750] Updated weights for policy 0, policy_version 223991 (0.0030) [2024-04-26 16:55:30,431][49750] Updated weights for policy 0, policy_version 224001 (0.0029) [2024-04-26 16:55:32,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3670081536. Throughput: 0: 50848.8. Samples: 1423005420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:32,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 16:55:33,730][49750] Updated weights for policy 0, policy_version 224011 (0.0026) [2024-04-26 16:55:36,987][49750] Updated weights for policy 0, policy_version 224021 (0.0037) [2024-04-26 16:55:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3670360064. Throughput: 0: 50797.4. Samples: 1423152600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 16:55:40,141][49750] Updated weights for policy 0, policy_version 224031 (0.0036) [2024-04-26 16:55:42,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3670622208. Throughput: 0: 50764.0. Samples: 1423459900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:42,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 16:55:43,387][49750] Updated weights for policy 0, policy_version 224041 (0.0030) [2024-04-26 16:55:46,684][49750] Updated weights for policy 0, policy_version 224051 (0.0034) [2024-04-26 16:55:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3670867968. Throughput: 0: 50874.0. Samples: 1423767440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:47,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 16:55:49,801][49750] Updated weights for policy 0, policy_version 224061 (0.0032) [2024-04-26 16:55:52,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3671113728. Throughput: 0: 51028.1. Samples: 1423920460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 16:55:53,182][49750] Updated weights for policy 0, policy_version 224071 (0.0030) [2024-04-26 16:55:56,362][49750] Updated weights for policy 0, policy_version 224081 (0.0030) [2024-04-26 16:55:57,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3671359488. Throughput: 0: 50868.4. Samples: 1424221120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:55:57,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 16:55:59,497][49750] Updated weights for policy 0, policy_version 224091 (0.0032) [2024-04-26 16:56:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3671621632. Throughput: 0: 50942.2. Samples: 1424525800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:56:02,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 16:56:02,676][49750] Updated weights for policy 0, policy_version 224101 (0.0031) [2024-04-26 16:56:06,021][49750] Updated weights for policy 0, policy_version 224111 (0.0030) [2024-04-26 16:56:07,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3671900160. Throughput: 0: 50767.9. Samples: 1424684820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:56:07,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 16:56:09,105][49750] Updated weights for policy 0, policy_version 224121 (0.0029) [2024-04-26 16:56:12,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3672129536. Throughput: 0: 50817.7. Samples: 1424988580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:56:12,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 16:56:12,556][49750] Updated weights for policy 0, policy_version 224131 (0.0031) [2024-04-26 16:56:15,657][49750] Updated weights for policy 0, policy_version 224141 (0.0030) [2024-04-26 16:56:17,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 3672375296. Throughput: 0: 50837.6. Samples: 1425293100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 16:56:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 16:56:19,109][49750] Updated weights for policy 0, policy_version 224151 (0.0030) [2024-04-26 16:56:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3672637440. Throughput: 0: 50789.7. Samples: 1425438140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:56:22,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 16:56:22,129][49750] Updated weights for policy 0, policy_version 224161 (0.0032) [2024-04-26 16:56:23,734][49728] Signal inference workers to stop experience collection... (21300 times) [2024-04-26 16:56:23,734][49728] Signal inference workers to resume experience collection... (21300 times) [2024-04-26 16:56:23,762][49750] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-04-26 16:56:23,762][49750] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-04-26 16:56:25,437][49750] Updated weights for policy 0, policy_version 224171 (0.0034) [2024-04-26 16:56:27,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3672899584. Throughput: 0: 50720.6. Samples: 1425742320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:56:27,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:56:28,491][49750] Updated weights for policy 0, policy_version 224181 (0.0031) [2024-04-26 16:56:31,758][49750] Updated weights for policy 0, policy_version 224191 (0.0040) [2024-04-26 16:56:32,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3673161728. Throughput: 0: 50791.0. Samples: 1426053040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:56:32,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 16:56:35,254][49750] Updated weights for policy 0, policy_version 224201 (0.0040) [2024-04-26 16:56:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3673407488. Throughput: 0: 50717.3. Samples: 1426202740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:56:37,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 16:56:38,274][49750] Updated weights for policy 0, policy_version 224211 (0.0032) [2024-04-26 16:56:42,036][49750] Updated weights for policy 0, policy_version 224221 (0.0043) [2024-04-26 16:56:42,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3673636864. Throughput: 0: 50849.3. Samples: 1426509340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:56:42,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 16:56:42,160][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000224222_3673653248.pth... [2024-04-26 16:56:42,203][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000223479_3661479936.pth [2024-04-26 16:56:44,754][49750] Updated weights for policy 0, policy_version 224231 (0.0031) [2024-04-26 16:56:47,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.1, 300 sec: 50707.1). Total num frames: 3673899008. Throughput: 0: 50714.0. Samples: 1426807940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:56:47,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 16:56:48,400][49750] Updated weights for policy 0, policy_version 224241 (0.0030) [2024-04-26 16:56:51,168][49750] Updated weights for policy 0, policy_version 224251 (0.0033) [2024-04-26 16:56:52,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3674177536. Throughput: 0: 50652.5. Samples: 1426964180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:56:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 16:56:54,704][49750] Updated weights for policy 0, policy_version 224261 (0.0030) [2024-04-26 16:56:57,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3674406912. Throughput: 0: 50809.4. Samples: 1427275000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:56:57,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 16:56:57,541][49750] Updated weights for policy 0, policy_version 224271 (0.0032) [2024-04-26 16:57:01,053][49750] Updated weights for policy 0, policy_version 224281 (0.0028) [2024-04-26 16:57:02,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3674652672. Throughput: 0: 50780.0. Samples: 1427578200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:57:02,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 16:57:04,063][49750] Updated weights for policy 0, policy_version 224291 (0.0029) [2024-04-26 16:57:07,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3674914816. Throughput: 0: 50649.1. Samples: 1427717360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:57:07,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 16:57:07,523][49750] Updated weights for policy 0, policy_version 224301 (0.0030) [2024-04-26 16:57:10,438][49750] Updated weights for policy 0, policy_version 224311 (0.0033) [2024-04-26 16:57:12,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3675193344. Throughput: 0: 50702.2. Samples: 1428023920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:57:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 16:57:13,997][49750] Updated weights for policy 0, policy_version 224321 (0.0032) [2024-04-26 16:57:16,902][49750] Updated weights for policy 0, policy_version 224331 (0.0029) [2024-04-26 16:57:17,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3675455488. Throughput: 0: 50709.4. Samples: 1428334960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:57:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 16:57:20,316][49750] Updated weights for policy 0, policy_version 224341 (0.0030) [2024-04-26 16:57:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3675684864. Throughput: 0: 50869.4. Samples: 1428491860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 16:57:22,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 16:57:23,211][49750] Updated weights for policy 0, policy_version 224351 (0.0033) [2024-04-26 16:57:26,628][49750] Updated weights for policy 0, policy_version 224361 (0.0037) [2024-04-26 16:57:27,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3675930624. Throughput: 0: 50788.1. Samples: 1428794800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:57:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 16:57:29,575][49750] Updated weights for policy 0, policy_version 224371 (0.0027) [2024-04-26 16:57:32,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3676176384. Throughput: 0: 50812.2. Samples: 1429094480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:57:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 16:57:33,370][49750] Updated weights for policy 0, policy_version 224381 (0.0029) [2024-04-26 16:57:35,404][49728] Signal inference workers to stop experience collection... (21350 times) [2024-04-26 16:57:35,404][49728] Signal inference workers to resume experience collection... 
(21350 times) [2024-04-26 16:57:35,418][49750] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-04-26 16:57:35,418][49750] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-04-26 16:57:36,060][49750] Updated weights for policy 0, policy_version 224391 (0.0029) [2024-04-26 16:57:37,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3676471296. Throughput: 0: 50652.8. Samples: 1429243560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:57:37,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 16:57:39,815][49750] Updated weights for policy 0, policy_version 224401 (0.0028) [2024-04-26 16:57:42,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3676717056. Throughput: 0: 50603.4. Samples: 1429552160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:57:42,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 16:57:42,409][49750] Updated weights for policy 0, policy_version 224411 (0.0034) [2024-04-26 16:57:46,293][49750] Updated weights for policy 0, policy_version 224421 (0.0029) [2024-04-26 16:57:47,063][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3676962816. Throughput: 0: 50727.8. Samples: 1429860960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:57:47,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 16:57:48,760][49750] Updated weights for policy 0, policy_version 224431 (0.0030) [2024-04-26 16:57:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3677208576. Throughput: 0: 50806.5. Samples: 1430003640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:57:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 16:57:52,714][49750] Updated weights for policy 0, policy_version 224441 (0.0035) [2024-04-26 16:57:55,161][49750] Updated weights for policy 0, policy_version 224451 (0.0027) [2024-04-26 16:57:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3677470720. Throughput: 0: 50816.4. Samples: 1430310660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:57:57,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-26 16:57:59,089][49750] Updated weights for policy 0, policy_version 224461 (0.0037) [2024-04-26 16:58:01,846][49750] Updated weights for policy 0, policy_version 224471 (0.0031) [2024-04-26 16:58:02,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3677732864. Throughput: 0: 50708.5. Samples: 1430616840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:58:02,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 16:58:05,494][49750] Updated weights for policy 0, policy_version 224481 (0.0039) [2024-04-26 16:58:07,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3677945856. Throughput: 0: 50774.4. Samples: 1430776720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:58:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 16:58:08,265][49750] Updated weights for policy 0, policy_version 224491 (0.0030) [2024-04-26 16:58:11,865][49750] Updated weights for policy 0, policy_version 224501 (0.0031) [2024-04-26 16:58:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3678224384. Throughput: 0: 50766.8. Samples: 1431079300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:58:12,063][49517] Avg episode reward: [(0, '0.438')] [2024-04-26 16:58:14,633][49750] Updated weights for policy 0, policy_version 224511 (0.0034) [2024-04-26 16:58:17,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3678470144. Throughput: 0: 50803.0. Samples: 1431380620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:58:17,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 16:58:18,296][49750] Updated weights for policy 0, policy_version 224521 (0.0030) [2024-04-26 16:58:21,130][49750] Updated weights for policy 0, policy_version 224531 (0.0030) [2024-04-26 16:58:22,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3678748672. Throughput: 0: 50899.6. Samples: 1431534040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:58:22,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 16:58:24,704][49750] Updated weights for policy 0, policy_version 224541 (0.0029) [2024-04-26 16:58:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3678994432. Throughput: 0: 50874.4. Samples: 1431841500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:58:27,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 16:58:27,575][49750] Updated weights for policy 0, policy_version 224551 (0.0031) [2024-04-26 16:58:31,156][49750] Updated weights for policy 0, policy_version 224561 (0.0033) [2024-04-26 16:58:32,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3679223808. Throughput: 0: 50583.1. Samples: 1432137200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 16:58:32,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 16:58:33,946][49750] Updated weights for policy 0, policy_version 224571 (0.0034) [2024-04-26 16:58:37,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3679485952. Throughput: 0: 50923.9. Samples: 1432295220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:58:37,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 16:58:37,588][49750] Updated weights for policy 0, policy_version 224581 (0.0026) [2024-04-26 16:58:39,940][49728] Signal inference workers to stop experience collection... (21400 times) [2024-04-26 16:58:39,983][49750] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-04-26 16:58:40,002][49728] Signal inference workers to resume experience collection... (21400 times) [2024-04-26 16:58:40,004][49750] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-04-26 16:58:40,267][49750] Updated weights for policy 0, policy_version 224591 (0.0028) [2024-04-26 16:58:42,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3679748096. Throughput: 0: 50693.7. Samples: 1432591880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:58:42,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 16:58:42,069][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000224594_3679748096.pth... [2024-04-26 16:58:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000223851_3667574784.pth [2024-04-26 16:58:44,005][49750] Updated weights for policy 0, policy_version 224601 (0.0031) [2024-04-26 16:58:46,771][49750] Updated weights for policy 0, policy_version 224611 (0.0031) [2024-04-26 16:58:47,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3680026624. Throughput: 0: 50645.3. 
Samples: 1432895880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:58:47,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 16:58:50,562][49750] Updated weights for policy 0, policy_version 224621 (0.0031) [2024-04-26 16:58:52,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3680272384. Throughput: 0: 50721.8. Samples: 1433059200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:58:52,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 16:58:53,134][49750] Updated weights for policy 0, policy_version 224631 (0.0026) [2024-04-26 16:58:57,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 3680501760. Throughput: 0: 50810.9. Samples: 1433365800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:58:57,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 16:58:57,173][49750] Updated weights for policy 0, policy_version 224641 (0.0035) [2024-04-26 16:58:59,664][49750] Updated weights for policy 0, policy_version 224651 (0.0031) [2024-04-26 16:59:02,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3680747520. Throughput: 0: 50852.0. Samples: 1433668960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:59:02,072][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 16:59:03,558][49750] Updated weights for policy 0, policy_version 224661 (0.0031) [2024-04-26 16:59:06,324][49750] Updated weights for policy 0, policy_version 224671 (0.0033) [2024-04-26 16:59:07,063][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3681026048. Throughput: 0: 50775.1. Samples: 1433818920. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:59:07,072][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 16:59:09,830][49750] Updated weights for policy 0, policy_version 224681 (0.0035) [2024-04-26 16:59:12,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3681288192. Throughput: 0: 50741.7. Samples: 1434124880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:59:12,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 16:59:13,078][49750] Updated weights for policy 0, policy_version 224691 (0.0030) [2024-04-26 16:59:16,437][49750] Updated weights for policy 0, policy_version 224701 (0.0027) [2024-04-26 16:59:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3681517568. Throughput: 0: 50792.1. Samples: 1434422840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:59:17,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 16:59:19,636][49750] Updated weights for policy 0, policy_version 224711 (0.0032) [2024-04-26 16:59:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3681779712. Throughput: 0: 50641.0. Samples: 1434574060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:59:22,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 16:59:23,067][49750] Updated weights for policy 0, policy_version 224721 (0.0029) [2024-04-26 16:59:26,328][49750] Updated weights for policy 0, policy_version 224731 (0.0026) [2024-04-26 16:59:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3682009088. Throughput: 0: 50564.6. Samples: 1434867280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:59:27,071][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 16:59:29,451][49750] Updated weights for policy 0, policy_version 224741 (0.0030) [2024-04-26 16:59:32,063][49517] Fps is (10 sec: 52427.4, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 3682304000. Throughput: 0: 50589.5. Samples: 1435172420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:59:32,072][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 16:59:32,713][49750] Updated weights for policy 0, policy_version 224751 (0.0029) [2024-04-26 16:59:35,946][49750] Updated weights for policy 0, policy_version 224761 (0.0028) [2024-04-26 16:59:37,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3682533376. Throughput: 0: 50667.9. Samples: 1435339260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:59:37,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 16:59:39,316][49750] Updated weights for policy 0, policy_version 224771 (0.0032) [2024-04-26 16:59:42,063][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3682779136. Throughput: 0: 50649.8. Samples: 1435645040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 16:59:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 16:59:42,449][49750] Updated weights for policy 0, policy_version 224781 (0.0035) [2024-04-26 16:59:45,722][49750] Updated weights for policy 0, policy_version 224791 (0.0035) [2024-04-26 16:59:47,062][49517] Fps is (10 sec: 47514.7, 60 sec: 49698.2, 300 sec: 50651.6). Total num frames: 3683008512. Throughput: 0: 50554.9. Samples: 1435943920. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 16:59:47,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 16:59:48,941][49750] Updated weights for policy 0, policy_version 224801 (0.0028) [2024-04-26 16:59:49,754][49728] Signal inference workers to stop experience collection... (21450 times) [2024-04-26 16:59:49,756][49728] Signal inference workers to resume experience collection... (21450 times) [2024-04-26 16:59:49,784][49750] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-04-26 16:59:49,784][49750] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-04-26 16:59:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3683287040. Throughput: 0: 50551.6. Samples: 1436093740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 16:59:52,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 16:59:52,233][49750] Updated weights for policy 0, policy_version 224811 (0.0031) [2024-04-26 16:59:55,361][49750] Updated weights for policy 0, policy_version 224821 (0.0035) [2024-04-26 16:59:57,062][49517] Fps is (10 sec: 55705.1, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3683565568. Throughput: 0: 50493.3. Samples: 1436397080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 16:59:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 16:59:58,647][49750] Updated weights for policy 0, policy_version 224831 (0.0035) [2024-04-26 17:00:01,720][49750] Updated weights for policy 0, policy_version 224841 (0.0032) [2024-04-26 17:00:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3683794944. Throughput: 0: 50675.1. Samples: 1436703220. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:02,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 17:00:05,089][49750] Updated weights for policy 0, policy_version 224851 (0.0030) [2024-04-26 17:00:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3684057088. Throughput: 0: 50461.8. Samples: 1436844840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 17:00:08,227][49750] Updated weights for policy 0, policy_version 224861 (0.0033) [2024-04-26 17:00:11,466][49750] Updated weights for policy 0, policy_version 224871 (0.0031) [2024-04-26 17:00:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3684302848. Throughput: 0: 50788.7. Samples: 1437152780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:12,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 17:00:14,573][49750] Updated weights for policy 0, policy_version 224881 (0.0028) [2024-04-26 17:00:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3684564992. Throughput: 0: 50770.1. Samples: 1437457060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:17,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 17:00:17,992][49750] Updated weights for policy 0, policy_version 224891 (0.0030) [2024-04-26 17:00:20,852][49750] Updated weights for policy 0, policy_version 224901 (0.0036) [2024-04-26 17:00:22,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.1, 300 sec: 50762.6). Total num frames: 3684827136. Throughput: 0: 50679.9. Samples: 1437619860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:22,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 17:00:24,466][49750] Updated weights for policy 0, policy_version 224911 (0.0033) [2024-04-26 17:00:27,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3685072896. Throughput: 0: 50706.7. Samples: 1437926840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:27,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 17:00:27,341][49750] Updated weights for policy 0, policy_version 224921 (0.0037) [2024-04-26 17:00:30,926][49750] Updated weights for policy 0, policy_version 224931 (0.0029) [2024-04-26 17:00:32,063][49517] Fps is (10 sec: 47514.4, 60 sec: 49971.3, 300 sec: 50651.5). Total num frames: 3685302272. Throughput: 0: 50862.1. Samples: 1438232720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:32,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 17:00:33,910][49750] Updated weights for policy 0, policy_version 224941 (0.0030) [2024-04-26 17:00:37,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3685564416. Throughput: 0: 50762.4. Samples: 1438378060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:37,064][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 17:00:37,287][49750] Updated weights for policy 0, policy_version 224951 (0.0033) [2024-04-26 17:00:40,187][49750] Updated weights for policy 0, policy_version 224961 (0.0036) [2024-04-26 17:00:42,062][49517] Fps is (10 sec: 55706.2, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3685859328. Throughput: 0: 50855.2. Samples: 1438685560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:42,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 17:00:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000224967_3685859328.pth... 
[2024-04-26 17:00:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000224222_3673653248.pth [2024-04-26 17:00:43,704][49750] Updated weights for policy 0, policy_version 224971 (0.0035) [2024-04-26 17:00:46,885][49750] Updated weights for policy 0, policy_version 224981 (0.0031) [2024-04-26 17:00:47,062][49517] Fps is (10 sec: 52430.2, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3686088704. Throughput: 0: 50914.6. Samples: 1438994380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 17:00:50,190][49750] Updated weights for policy 0, policy_version 224991 (0.0029) [2024-04-26 17:00:52,063][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3686350848. Throughput: 0: 51043.5. Samples: 1439141800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:00:52,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 17:00:53,384][49750] Updated weights for policy 0, policy_version 225001 (0.0035) [2024-04-26 17:00:54,597][49728] Signal inference workers to stop experience collection... (21500 times) [2024-04-26 17:00:54,598][49728] Signal inference workers to resume experience collection... (21500 times) [2024-04-26 17:00:54,610][49750] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-04-26 17:00:54,610][49750] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-04-26 17:00:56,678][49750] Updated weights for policy 0, policy_version 225011 (0.0034) [2024-04-26 17:00:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3686596608. Throughput: 0: 50941.0. Samples: 1439445120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:00:57,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 17:00:59,724][49750] Updated weights for policy 0, policy_version 225021 (0.0025) [2024-04-26 17:01:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3686858752. Throughput: 0: 50973.7. Samples: 1439750880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:02,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 17:01:02,903][49750] Updated weights for policy 0, policy_version 225031 (0.0029) [2024-04-26 17:01:05,987][49750] Updated weights for policy 0, policy_version 225041 (0.0030) [2024-04-26 17:01:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3687120896. Throughput: 0: 51077.2. Samples: 1439918320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 17:01:09,209][49750] Updated weights for policy 0, policy_version 225051 (0.0033) [2024-04-26 17:01:12,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 3687366656. Throughput: 0: 50956.4. Samples: 1440219880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:12,063][49517] Avg episode reward: [(0, '0.685')] [2024-04-26 17:01:12,359][49750] Updated weights for policy 0, policy_version 225061 (0.0034) [2024-04-26 17:01:15,754][49750] Updated weights for policy 0, policy_version 225071 (0.0033) [2024-04-26 17:01:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3687612416. Throughput: 0: 50866.7. Samples: 1440521720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:17,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 17:01:18,787][49750] Updated weights for policy 0, policy_version 225081 (0.0030) [2024-04-26 17:01:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3687858176. Throughput: 0: 50971.8. Samples: 1440671780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:01:22,383][49750] Updated weights for policy 0, policy_version 225091 (0.0033) [2024-04-26 17:01:25,248][49750] Updated weights for policy 0, policy_version 225101 (0.0034) [2024-04-26 17:01:27,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3688136704. Throughput: 0: 50875.9. Samples: 1440974980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:27,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 17:01:28,861][49750] Updated weights for policy 0, policy_version 225111 (0.0035) [2024-04-26 17:01:31,540][49750] Updated weights for policy 0, policy_version 225121 (0.0035) [2024-04-26 17:01:32,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 3688382464. Throughput: 0: 50764.0. Samples: 1441278760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 17:01:35,272][49750] Updated weights for policy 0, policy_version 225131 (0.0029) [2024-04-26 17:01:37,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.8, 300 sec: 50873.7). Total num frames: 3688644608. Throughput: 0: 51078.3. Samples: 1441440320. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:37,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 17:01:37,822][49750] Updated weights for policy 0, policy_version 225141 (0.0033) [2024-04-26 17:01:41,623][49750] Updated weights for policy 0, policy_version 225151 (0.0036) [2024-04-26 17:01:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 3688873984. Throughput: 0: 51024.5. Samples: 1441741220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:42,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 17:01:43,809][49728] Signal inference workers to stop experience collection... (21550 times) [2024-04-26 17:01:43,814][49728] Signal inference workers to resume experience collection... (21550 times) [2024-04-26 17:01:43,827][49750] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-04-26 17:01:43,828][49750] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-04-26 17:01:44,811][49750] Updated weights for policy 0, policy_version 225161 (0.0038) [2024-04-26 17:01:47,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3689136128. Throughput: 0: 50980.4. Samples: 1442045000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:47,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 17:01:48,003][49750] Updated weights for policy 0, policy_version 225171 (0.0033) [2024-04-26 17:01:51,297][49750] Updated weights for policy 0, policy_version 225181 (0.0030) [2024-04-26 17:01:52,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3689414656. Throughput: 0: 50656.8. Samples: 1442197880. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:52,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 17:01:54,545][49750] Updated weights for policy 0, policy_version 225191 (0.0030) [2024-04-26 17:01:57,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3689660416. Throughput: 0: 50863.6. Samples: 1442508740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:01:57,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 17:01:57,829][49750] Updated weights for policy 0, policy_version 225201 (0.0030) [2024-04-26 17:02:01,005][49750] Updated weights for policy 0, policy_version 225211 (0.0033) [2024-04-26 17:02:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3689906176. Throughput: 0: 50972.1. Samples: 1442815460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 17:02:04,117][49750] Updated weights for policy 0, policy_version 225221 (0.0031) [2024-04-26 17:02:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3690168320. Throughput: 0: 50973.0. Samples: 1442965560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:07,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 17:02:07,470][49750] Updated weights for policy 0, policy_version 225231 (0.0034) [2024-04-26 17:02:10,448][49750] Updated weights for policy 0, policy_version 225241 (0.0031) [2024-04-26 17:02:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3690414080. Throughput: 0: 50887.0. Samples: 1443264900. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:12,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 17:02:13,824][49750] Updated weights for policy 0, policy_version 225251 (0.0037) [2024-04-26 17:02:17,008][49750] Updated weights for policy 0, policy_version 225261 (0.0038) [2024-04-26 17:02:17,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3690676224. Throughput: 0: 50978.1. Samples: 1443572780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:17,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 17:02:20,589][49750] Updated weights for policy 0, policy_version 225271 (0.0035) [2024-04-26 17:02:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3690938368. Throughput: 0: 50935.1. Samples: 1443732400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:22,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 17:02:23,485][49750] Updated weights for policy 0, policy_version 225281 (0.0027) [2024-04-26 17:02:27,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3691151360. Throughput: 0: 50840.0. Samples: 1444029020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 17:02:27,141][49750] Updated weights for policy 0, policy_version 225291 (0.0032) [2024-04-26 17:02:29,843][49750] Updated weights for policy 0, policy_version 225301 (0.0028) [2024-04-26 17:02:32,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3691429888. Throughput: 0: 50833.7. Samples: 1444332520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 17:02:33,501][49750] Updated weights for policy 0, policy_version 225311 (0.0028) [2024-04-26 17:02:36,309][49750] Updated weights for policy 0, policy_version 225321 (0.0027) [2024-04-26 17:02:37,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3691692032. Throughput: 0: 50889.8. Samples: 1444487920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:37,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 17:02:39,871][49750] Updated weights for policy 0, policy_version 225331 (0.0030) [2024-04-26 17:02:42,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3691937792. Throughput: 0: 50781.5. Samples: 1444793900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 17:02:42,117][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000225339_3691954176.pth... [2024-04-26 17:02:42,163][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000224594_3679748096.pth [2024-04-26 17:02:42,842][49750] Updated weights for policy 0, policy_version 225341 (0.0033) [2024-04-26 17:02:46,229][49750] Updated weights for policy 0, policy_version 225351 (0.0029) [2024-04-26 17:02:47,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3692167168. Throughput: 0: 50568.4. Samples: 1445091040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 17:02:49,250][49750] Updated weights for policy 0, policy_version 225361 (0.0034) [2024-04-26 17:02:52,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3692445696. Throughput: 0: 50649.3. Samples: 1445244780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:52,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 17:02:52,658][49750] Updated weights for policy 0, policy_version 225371 (0.0029) [2024-04-26 17:02:55,834][49750] Updated weights for policy 0, policy_version 225381 (0.0033) [2024-04-26 17:02:56,379][49728] Signal inference workers to stop experience collection... (21600 times) [2024-04-26 17:02:56,415][49750] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-04-26 17:02:56,447][49728] Signal inference workers to resume experience collection... (21600 times) [2024-04-26 17:02:56,450][49750] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-04-26 17:02:57,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3692707840. Throughput: 0: 50905.9. Samples: 1445555660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:02:57,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 17:02:59,059][49750] Updated weights for policy 0, policy_version 225391 (0.0029) [2024-04-26 17:03:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3692953600. Throughput: 0: 50763.1. Samples: 1445857120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:03:02,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 17:03:02,332][49750] Updated weights for policy 0, policy_version 225401 (0.0035) [2024-04-26 17:03:05,343][49750] Updated weights for policy 0, policy_version 225411 (0.0034) [2024-04-26 17:03:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3693215744. Throughput: 0: 50517.9. Samples: 1446005700. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-26 17:03:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 17:03:08,637][49750] Updated weights for policy 0, policy_version 225421 (0.0031) [2024-04-26 17:03:11,648][49750] Updated weights for policy 0, policy_version 225431 (0.0030) [2024-04-26 17:03:12,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3693477888. Throughput: 0: 50856.8. Samples: 1446317580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:12,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 17:03:15,033][49750] Updated weights for policy 0, policy_version 225441 (0.0029) [2024-04-26 17:03:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3693723648. Throughput: 0: 50775.8. Samples: 1446617420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:17,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 17:03:18,100][49750] Updated weights for policy 0, policy_version 225451 (0.0031) [2024-04-26 17:03:21,421][49750] Updated weights for policy 0, policy_version 225461 (0.0032) [2024-04-26 17:03:22,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3693985792. Throughput: 0: 50763.5. Samples: 1446772280. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 17:03:24,720][49750] Updated weights for policy 0, policy_version 225471 (0.0039) [2024-04-26 17:03:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 3694231552. Throughput: 0: 50794.1. Samples: 1447079640. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:27,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 17:03:27,866][49750] Updated weights for policy 0, policy_version 225481 (0.0034) [2024-04-26 17:03:31,240][49750] Updated weights for policy 0, policy_version 225491 (0.0033) [2024-04-26 17:03:32,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3694477312. Throughput: 0: 51079.8. Samples: 1447389640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:32,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 17:03:34,178][49750] Updated weights for policy 0, policy_version 225501 (0.0030) [2024-04-26 17:03:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3694739456. Throughput: 0: 50923.9. Samples: 1447536360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 17:03:37,696][49750] Updated weights for policy 0, policy_version 225511 (0.0036) [2024-04-26 17:03:40,554][49750] Updated weights for policy 0, policy_version 225521 (0.0031) [2024-04-26 17:03:42,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3695001600. Throughput: 0: 50777.0. Samples: 1447840620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:42,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 17:03:44,180][49750] Updated weights for policy 0, policy_version 225531 (0.0035) [2024-04-26 17:03:46,895][49750] Updated weights for policy 0, policy_version 225541 (0.0031) [2024-04-26 17:03:47,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 3695263744. Throughput: 0: 50934.5. Samples: 1448149160. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 17:03:50,462][49750] Updated weights for policy 0, policy_version 225551 (0.0035) [2024-04-26 17:03:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3695493120. Throughput: 0: 50965.3. Samples: 1448299140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:52,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 17:03:53,650][49750] Updated weights for policy 0, policy_version 225561 (0.0034) [2024-04-26 17:03:56,725][49750] Updated weights for policy 0, policy_version 225571 (0.0035) [2024-04-26 17:03:57,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3695755264. Throughput: 0: 50747.1. Samples: 1448601200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:03:57,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 17:04:00,046][49750] Updated weights for policy 0, policy_version 225581 (0.0033) [2024-04-26 17:04:02,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3696017408. Throughput: 0: 50894.1. Samples: 1448907660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:04:02,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 17:04:03,177][49750] Updated weights for policy 0, policy_version 225591 (0.0033) [2024-04-26 17:04:06,430][49750] Updated weights for policy 0, policy_version 225601 (0.0033) [2024-04-26 17:04:07,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3696263168. Throughput: 0: 50970.6. Samples: 1449065960. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:04:07,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 17:04:09,656][49750] Updated weights for policy 0, policy_version 225611 (0.0028) [2024-04-26 17:04:12,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3696508928. Throughput: 0: 50986.8. Samples: 1449374040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:04:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:04:12,841][49750] Updated weights for policy 0, policy_version 225621 (0.0028) [2024-04-26 17:04:15,722][49728] Signal inference workers to stop experience collection... (21650 times) [2024-04-26 17:04:15,722][49728] Signal inference workers to resume experience collection... (21650 times) [2024-04-26 17:04:15,739][49750] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-04-26 17:04:15,739][49750] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-04-26 17:04:15,995][49750] Updated weights for policy 0, policy_version 225631 (0.0032) [2024-04-26 17:04:17,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3696771072. Throughput: 0: 50801.8. Samples: 1449675720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-04-26 17:04:17,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:04:19,261][49750] Updated weights for policy 0, policy_version 225641 (0.0035) [2024-04-26 17:04:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3697016832. Throughput: 0: 50770.4. Samples: 1449821020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:04:22,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 17:04:22,721][49750] Updated weights for policy 0, policy_version 225651 (0.0027) [2024-04-26 17:04:25,602][49750] Updated weights for policy 0, policy_version 225661 (0.0026) [2024-04-26 17:04:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3697295360. Throughput: 0: 50855.5. Samples: 1450129120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:04:27,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 17:04:29,289][49750] Updated weights for policy 0, policy_version 225671 (0.0029) [2024-04-26 17:04:32,034][49750] Updated weights for policy 0, policy_version 225681 (0.0031) [2024-04-26 17:04:32,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 3697557504. Throughput: 0: 50890.0. Samples: 1450439220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:04:32,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 17:04:35,911][49750] Updated weights for policy 0, policy_version 225691 (0.0034) [2024-04-26 17:04:37,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3697770496. Throughput: 0: 50863.2. Samples: 1450587980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:04:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 17:04:38,455][49750] Updated weights for policy 0, policy_version 225701 (0.0028) [2024-04-26 17:04:42,063][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.2, 300 sec: 50929.2). Total num frames: 3698032640. Throughput: 0: 50920.8. Samples: 1450892640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:04:42,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 17:04:42,197][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000225711_3698049024.pth... 
[2024-04-26 17:04:42,200][49750] Updated weights for policy 0, policy_version 225711 (0.0037) [2024-04-26 17:04:42,245][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000224967_3685859328.pth [2024-04-26 17:04:44,889][49750] Updated weights for policy 0, policy_version 225721 (0.0036) [2024-04-26 17:04:47,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 3698294784. Throughput: 0: 50791.6. Samples: 1451193280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:04:47,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 17:04:48,556][49750] Updated weights for policy 0, policy_version 225731 (0.0030) [2024-04-26 17:04:51,257][49750] Updated weights for policy 0, policy_version 225741 (0.0030) [2024-04-26 17:04:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3698556928. Throughput: 0: 51025.0. Samples: 1451362080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:04:52,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 17:04:54,948][49750] Updated weights for policy 0, policy_version 225751 (0.0028) [2024-04-26 17:04:57,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3698802688. Throughput: 0: 50800.6. Samples: 1451660080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:04:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 17:04:58,039][49750] Updated weights for policy 0, policy_version 225761 (0.0034) [2024-04-26 17:05:01,463][49750] Updated weights for policy 0, policy_version 225771 (0.0031) [2024-04-26 17:05:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3699048448. Throughput: 0: 50859.7. Samples: 1451964400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:05:02,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 17:05:04,441][49750] Updated weights for policy 0, policy_version 225781 (0.0036) [2024-04-26 17:05:07,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3699310592. Throughput: 0: 50884.0. Samples: 1452110800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:05:07,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 17:05:07,805][49750] Updated weights for policy 0, policy_version 225791 (0.0028) [2024-04-26 17:05:10,709][49750] Updated weights for policy 0, policy_version 225801 (0.0038) [2024-04-26 17:05:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3699572736. Throughput: 0: 50793.8. Samples: 1452414840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:05:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:05:14,189][49750] Updated weights for policy 0, policy_version 225811 (0.0028) [2024-04-26 17:05:17,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.6, 300 sec: 50873.8). Total num frames: 3699834880. Throughput: 0: 50878.4. Samples: 1452728740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:05:17,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 17:05:17,162][49750] Updated weights for policy 0, policy_version 225821 (0.0035) [2024-04-26 17:05:20,650][49750] Updated weights for policy 0, policy_version 225831 (0.0028) [2024-04-26 17:05:22,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3700080640. Throughput: 0: 50984.8. Samples: 1452882300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:05:22,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 17:05:22,888][49728] Signal inference workers to stop experience collection... 
(21700 times) [2024-04-26 17:05:22,888][49728] Signal inference workers to resume experience collection... (21700 times) [2024-04-26 17:05:22,903][49750] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-04-26 17:05:22,903][49750] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-04-26 17:05:23,521][49750] Updated weights for policy 0, policy_version 225841 (0.0033) [2024-04-26 17:05:27,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 3700310016. Throughput: 0: 50945.9. Samples: 1453185200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 17:05:27,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 17:05:27,252][49750] Updated weights for policy 0, policy_version 225851 (0.0028) [2024-04-26 17:05:29,859][49750] Updated weights for policy 0, policy_version 225861 (0.0035) [2024-04-26 17:05:32,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 3700588544. Throughput: 0: 50876.5. Samples: 1453482720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:05:32,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 17:05:33,899][49750] Updated weights for policy 0, policy_version 225871 (0.0032) [2024-04-26 17:05:36,437][49750] Updated weights for policy 0, policy_version 225881 (0.0026) [2024-04-26 17:05:37,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 3700850688. Throughput: 0: 50750.5. Samples: 1453645860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:05:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 17:05:40,229][49750] Updated weights for policy 0, policy_version 225891 (0.0035) [2024-04-26 17:05:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3701096448. Throughput: 0: 50877.9. Samples: 1453949580. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:05:42,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:05:42,837][49750] Updated weights for policy 0, policy_version 225901 (0.0033) [2024-04-26 17:05:46,785][49750] Updated weights for policy 0, policy_version 225911 (0.0031) [2024-04-26 17:05:47,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3701325824. Throughput: 0: 50915.3. Samples: 1454255600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:05:47,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:05:49,426][49750] Updated weights for policy 0, policy_version 225921 (0.0036) [2024-04-26 17:05:52,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3701604352. Throughput: 0: 50826.1. Samples: 1454397980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:05:52,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 17:05:53,221][49750] Updated weights for policy 0, policy_version 225931 (0.0034) [2024-04-26 17:05:56,099][49750] Updated weights for policy 0, policy_version 225941 (0.0030) [2024-04-26 17:05:57,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3701866496. Throughput: 0: 50883.4. Samples: 1454704600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:05:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 17:05:59,702][49750] Updated weights for policy 0, policy_version 225951 (0.0031) [2024-04-26 17:06:02,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3702112256. Throughput: 0: 50670.0. Samples: 1455008900. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:06:02,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 17:06:02,448][49750] Updated weights for policy 0, policy_version 225961 (0.0034) [2024-04-26 17:06:06,035][49750] Updated weights for policy 0, policy_version 225971 (0.0031) [2024-04-26 17:06:07,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3702374400. Throughput: 0: 50819.9. Samples: 1455169200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:06:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 17:06:08,804][49750] Updated weights for policy 0, policy_version 225981 (0.0028) [2024-04-26 17:06:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3702603776. Throughput: 0: 50802.1. Samples: 1455471300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:06:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 17:06:12,410][49750] Updated weights for policy 0, policy_version 225991 (0.0032) [2024-04-26 17:06:15,233][49750] Updated weights for policy 0, policy_version 226001 (0.0036) [2024-04-26 17:06:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3702865920. Throughput: 0: 50894.3. Samples: 1455772960. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:06:17,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 17:06:18,875][49750] Updated weights for policy 0, policy_version 226011 (0.0031) [2024-04-26 17:06:21,522][49750] Updated weights for policy 0, policy_version 226021 (0.0028) [2024-04-26 17:06:22,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3703144448. Throughput: 0: 50742.3. Samples: 1455929260. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:06:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 17:06:25,308][49750] Updated weights for policy 0, policy_version 226031 (0.0025) [2024-04-26 17:06:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3703390208. Throughput: 0: 50745.5. Samples: 1456233120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:06:27,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 17:06:28,051][49750] Updated weights for policy 0, policy_version 226041 (0.0032) [2024-04-26 17:06:31,370][49728] Signal inference workers to stop experience collection... (21750 times) [2024-04-26 17:06:31,371][49728] Signal inference workers to resume experience collection... (21750 times) [2024-04-26 17:06:31,389][49750] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-04-26 17:06:31,389][49750] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-04-26 17:06:31,643][49750] Updated weights for policy 0, policy_version 226051 (0.0034) [2024-04-26 17:06:32,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3703635968. Throughput: 0: 50917.1. Samples: 1456546860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 23.0) [2024-04-26 17:06:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 17:06:34,415][49750] Updated weights for policy 0, policy_version 226061 (0.0030) [2024-04-26 17:06:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 3703881728. Throughput: 0: 50904.1. Samples: 1456688660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:06:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 17:06:38,078][49750] Updated weights for policy 0, policy_version 226071 (0.0030) [2024-04-26 17:06:40,691][49750] Updated weights for policy 0, policy_version 226081 (0.0026) [2024-04-26 17:06:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3704160256. Throughput: 0: 50956.2. Samples: 1456997620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:06:42,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 17:06:42,138][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000226085_3704176640.pth... [2024-04-26 17:06:42,183][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000225339_3691954176.pth [2024-04-26 17:06:44,602][49750] Updated weights for policy 0, policy_version 226091 (0.0027) [2024-04-26 17:06:47,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.8, 300 sec: 50873.7). Total num frames: 3704422400. Throughput: 0: 50940.6. Samples: 1457301220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:06:47,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 17:06:47,337][49750] Updated weights for policy 0, policy_version 226101 (0.0032) [2024-04-26 17:06:50,925][49750] Updated weights for policy 0, policy_version 226111 (0.0028) [2024-04-26 17:06:52,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3704651776. Throughput: 0: 50934.6. Samples: 1457461260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:06:52,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 17:06:53,635][49750] Updated weights for policy 0, policy_version 226121 (0.0033) [2024-04-26 17:06:57,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 3704897536. Throughput: 0: 50981.3. Samples: 1457765460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:06:57,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 17:06:57,279][49750] Updated weights for policy 0, policy_version 226131 (0.0034) [2024-04-26 17:07:00,230][49750] Updated weights for policy 0, policy_version 226141 (0.0032) [2024-04-26 17:07:02,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3705159680. Throughput: 0: 51031.2. Samples: 1458069360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:07:02,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 17:07:03,721][49750] Updated weights for policy 0, policy_version 226151 (0.0035) [2024-04-26 17:07:06,664][49750] Updated weights for policy 0, policy_version 226161 (0.0033) [2024-04-26 17:07:07,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3705421824. Throughput: 0: 50992.6. Samples: 1458223920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:07:07,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 17:07:10,146][49750] Updated weights for policy 0, policy_version 226171 (0.0030) [2024-04-26 17:07:12,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.7, 300 sec: 50929.3). Total num frames: 3705700352. Throughput: 0: 51019.6. Samples: 1458529000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:07:12,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 17:07:12,897][49750] Updated weights for policy 0, policy_version 226181 (0.0028) [2024-04-26 17:07:16,414][49750] Updated weights for policy 0, policy_version 226191 (0.0031) [2024-04-26 17:07:17,063][49517] Fps is (10 sec: 50789.3, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3705929728. Throughput: 0: 50803.3. Samples: 1458833020. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:07:17,063][49517] Avg episode reward: [(0, '0.463')] [2024-04-26 17:07:19,251][49750] Updated weights for policy 0, policy_version 226201 (0.0036) [2024-04-26 17:07:22,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 3706175488. Throughput: 0: 51012.3. Samples: 1458984220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:07:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 17:07:22,749][49750] Updated weights for policy 0, policy_version 226211 (0.0028) [2024-04-26 17:07:25,728][49750] Updated weights for policy 0, policy_version 226221 (0.0026) [2024-04-26 17:07:27,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3706421248. Throughput: 0: 50964.3. Samples: 1459291020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:07:27,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 17:07:29,163][49750] Updated weights for policy 0, policy_version 226231 (0.0035) [2024-04-26 17:07:32,045][49750] Updated weights for policy 0, policy_version 226241 (0.0029) [2024-04-26 17:07:32,062][49517] Fps is (10 sec: 55706.4, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 3706732544. Throughput: 0: 50978.7. Samples: 1459595260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:07:32,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 17:07:35,749][49750] Updated weights for policy 0, policy_version 226251 (0.0034) [2024-04-26 17:07:37,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3706945536. Throughput: 0: 51006.0. Samples: 1459756520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:07:37,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 17:07:37,255][49728] Signal inference workers to stop experience collection... 
(21800 times) [2024-04-26 17:07:37,255][49728] Signal inference workers to resume experience collection... (21800 times) [2024-04-26 17:07:37,267][49750] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-04-26 17:07:37,267][49750] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-04-26 17:07:38,499][49750] Updated weights for policy 0, policy_version 226261 (0.0028) [2024-04-26 17:07:42,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 3707207680. Throughput: 0: 51142.8. Samples: 1460066880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 17:07:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 17:07:42,146][49750] Updated weights for policy 0, policy_version 226271 (0.0031) [2024-04-26 17:07:45,088][49750] Updated weights for policy 0, policy_version 226281 (0.0036) [2024-04-26 17:07:47,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3707453440. Throughput: 0: 51233.2. Samples: 1460374860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:07:47,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 17:07:48,499][49750] Updated weights for policy 0, policy_version 226291 (0.0032) [2024-04-26 17:07:51,578][49750] Updated weights for policy 0, policy_version 226301 (0.0035) [2024-04-26 17:07:52,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 3707731968. Throughput: 0: 51040.3. Samples: 1460520740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:07:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 17:07:54,980][49750] Updated weights for policy 0, policy_version 226311 (0.0032) [2024-04-26 17:07:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 3707977728. Throughput: 0: 50888.2. Samples: 1460818980. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:07:57,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 17:07:58,123][49750] Updated weights for policy 0, policy_version 226321 (0.0040) [2024-04-26 17:08:01,393][49750] Updated weights for policy 0, policy_version 226331 (0.0039) [2024-04-26 17:08:02,063][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3708223488. Throughput: 0: 50813.0. Samples: 1461119600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:02,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 17:08:04,640][49750] Updated weights for policy 0, policy_version 226341 (0.0025) [2024-04-26 17:08:07,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3708469248. Throughput: 0: 50983.7. Samples: 1461278480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:07,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 17:08:07,866][49750] Updated weights for policy 0, policy_version 226351 (0.0031) [2024-04-26 17:08:11,097][49750] Updated weights for policy 0, policy_version 226361 (0.0031) [2024-04-26 17:08:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3708715008. Throughput: 0: 50859.2. Samples: 1461579680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:12,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 17:08:14,266][49750] Updated weights for policy 0, policy_version 226371 (0.0034) [2024-04-26 17:08:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3708993536. Throughput: 0: 50800.8. Samples: 1461881300. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 17:08:17,558][49750] Updated weights for policy 0, policy_version 226381 (0.0032) [2024-04-26 17:08:20,706][49750] Updated weights for policy 0, policy_version 226391 (0.0034) [2024-04-26 17:08:22,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 3709255680. Throughput: 0: 50887.0. Samples: 1462046440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:22,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 17:08:23,976][49750] Updated weights for policy 0, policy_version 226401 (0.0031) [2024-04-26 17:08:27,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51336.7, 300 sec: 50929.3). Total num frames: 3709501440. Throughput: 0: 50709.8. Samples: 1462348820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:27,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 17:08:27,199][49750] Updated weights for policy 0, policy_version 226411 (0.0036) [2024-04-26 17:08:30,245][49750] Updated weights for policy 0, policy_version 226421 (0.0028) [2024-04-26 17:08:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.2, 300 sec: 50818.2). Total num frames: 3709730816. Throughput: 0: 50677.8. Samples: 1462655360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:32,071][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:08:33,669][49750] Updated weights for policy 0, policy_version 226431 (0.0031) [2024-04-26 17:08:36,739][49750] Updated weights for policy 0, policy_version 226441 (0.0031) [2024-04-26 17:08:37,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3710009344. Throughput: 0: 50697.5. Samples: 1462802120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:37,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-26 17:08:40,146][49750] Updated weights for policy 0, policy_version 226451 (0.0026) [2024-04-26 17:08:42,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3710271488. Throughput: 0: 50714.7. Samples: 1463101140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:42,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 17:08:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000226457_3710271488.pth... [2024-04-26 17:08:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000225711_3698049024.pth [2024-04-26 17:08:43,404][49750] Updated weights for policy 0, policy_version 226461 (0.0035) [2024-04-26 17:08:46,029][49728] Signal inference workers to stop experience collection... (21850 times) [2024-04-26 17:08:46,068][49750] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-04-26 17:08:46,101][49728] Signal inference workers to resume experience collection... (21850 times) [2024-04-26 17:08:46,102][49750] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-04-26 17:08:46,608][49750] Updated weights for policy 0, policy_version 226471 (0.0029) [2024-04-26 17:08:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3710517248. Throughput: 0: 50834.3. Samples: 1463407140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 17:08:49,823][49750] Updated weights for policy 0, policy_version 226481 (0.0031) [2024-04-26 17:08:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3710763008. Throughput: 0: 50756.7. Samples: 1463562540. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 17:08:52,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 17:08:52,989][49750] Updated weights for policy 0, policy_version 226491 (0.0031) [2024-04-26 17:08:56,266][49750] Updated weights for policy 0, policy_version 226501 (0.0031) [2024-04-26 17:08:57,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 3710992384. Throughput: 0: 50838.3. Samples: 1463867400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:08:57,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:08:59,450][49750] Updated weights for policy 0, policy_version 226511 (0.0024) [2024-04-26 17:09:02,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3711254528. Throughput: 0: 50791.9. Samples: 1464166940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:09:02,848][49750] Updated weights for policy 0, policy_version 226521 (0.0032) [2024-04-26 17:09:05,847][49750] Updated weights for policy 0, policy_version 226531 (0.0037) [2024-04-26 17:09:07,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 3711533056. Throughput: 0: 50598.2. Samples: 1464323360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:07,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 17:09:09,946][49750] Updated weights for policy 0, policy_version 226541 (0.0029) [2024-04-26 17:09:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3711778816. Throughput: 0: 50783.0. Samples: 1464634060. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:09:12,437][49750] Updated weights for policy 0, policy_version 226551 (0.0034) [2024-04-26 17:09:16,389][49750] Updated weights for policy 0, policy_version 226561 (0.0037) [2024-04-26 17:09:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3712024576. Throughput: 0: 50776.3. Samples: 1464940300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:17,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 17:09:18,901][49750] Updated weights for policy 0, policy_version 226571 (0.0026) [2024-04-26 17:09:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3712270336. Throughput: 0: 50750.7. Samples: 1465085900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:22,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 17:09:22,688][49750] Updated weights for policy 0, policy_version 226581 (0.0027) [2024-04-26 17:09:25,360][49750] Updated weights for policy 0, policy_version 226591 (0.0034) [2024-04-26 17:09:27,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3712548864. Throughput: 0: 50818.4. Samples: 1465387960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:27,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 17:09:28,988][49750] Updated weights for policy 0, policy_version 226601 (0.0032) [2024-04-26 17:09:31,733][49750] Updated weights for policy 0, policy_version 226611 (0.0032) [2024-04-26 17:09:32,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3712794624. Throughput: 0: 50852.0. Samples: 1465695480. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:32,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 17:09:35,547][49750] Updated weights for policy 0, policy_version 226621 (0.0031) [2024-04-26 17:09:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3713040384. Throughput: 0: 50818.9. Samples: 1465849380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:37,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 17:09:38,446][49750] Updated weights for policy 0, policy_version 226631 (0.0037) [2024-04-26 17:09:41,943][49750] Updated weights for policy 0, policy_version 226641 (0.0033) [2024-04-26 17:09:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3713286144. Throughput: 0: 50715.9. Samples: 1466149620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:42,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 17:09:44,804][49750] Updated weights for policy 0, policy_version 226651 (0.0030) [2024-04-26 17:09:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3713548288. Throughput: 0: 50675.3. Samples: 1466447320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:47,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 17:09:48,343][49750] Updated weights for policy 0, policy_version 226661 (0.0031) [2024-04-26 17:09:51,121][49728] Signal inference workers to stop experience collection... (21900 times) [2024-04-26 17:09:51,122][49728] Signal inference workers to resume experience collection... 
(21900 times) [2024-04-26 17:09:51,135][49750] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-04-26 17:09:51,136][49750] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-04-26 17:09:51,256][49750] Updated weights for policy 0, policy_version 226671 (0.0029) [2024-04-26 17:09:52,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3713826816. Throughput: 0: 50797.4. Samples: 1466609240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:52,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 17:09:54,750][49750] Updated weights for policy 0, policy_version 226681 (0.0032) [2024-04-26 17:09:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3714056192. Throughput: 0: 50666.7. Samples: 1466914060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:09:57,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:09:57,668][49750] Updated weights for policy 0, policy_version 226691 (0.0032) [2024-04-26 17:10:01,177][49750] Updated weights for policy 0, policy_version 226701 (0.0032) [2024-04-26 17:10:02,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3714301952. Throughput: 0: 50746.3. Samples: 1467223880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 17:10:03,974][49750] Updated weights for policy 0, policy_version 226711 (0.0028) [2024-04-26 17:10:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3714547712. Throughput: 0: 50703.6. Samples: 1467367560. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:07,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 17:10:07,824][49750] Updated weights for policy 0, policy_version 226721 (0.0027) [2024-04-26 17:10:10,471][49750] Updated weights for policy 0, policy_version 226731 (0.0030) [2024-04-26 17:10:12,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3714842624. Throughput: 0: 50765.3. Samples: 1467672400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:12,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 17:10:14,107][49750] Updated weights for policy 0, policy_version 226741 (0.0026) [2024-04-26 17:10:16,997][49750] Updated weights for policy 0, policy_version 226751 (0.0025) [2024-04-26 17:10:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3715088384. Throughput: 0: 50828.1. Samples: 1467982740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:17,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 17:10:20,419][49750] Updated weights for policy 0, policy_version 226761 (0.0029) [2024-04-26 17:10:22,063][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 3715334144. Throughput: 0: 50890.1. Samples: 1468139440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:22,063][49517] Avg episode reward: [(0, '0.454')] [2024-04-26 17:10:23,320][49750] Updated weights for policy 0, policy_version 226771 (0.0027) [2024-04-26 17:10:26,903][49750] Updated weights for policy 0, policy_version 226781 (0.0032) [2024-04-26 17:10:27,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3715579904. Throughput: 0: 50873.8. Samples: 1468438940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:10:29,806][49750] Updated weights for policy 0, policy_version 226791 (0.0034) [2024-04-26 17:10:32,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3715842048. Throughput: 0: 50965.6. Samples: 1468740780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:32,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 17:10:33,475][49750] Updated weights for policy 0, policy_version 226801 (0.0031) [2024-04-26 17:10:36,256][49750] Updated weights for policy 0, policy_version 226811 (0.0030) [2024-04-26 17:10:37,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 3716120576. Throughput: 0: 50888.5. Samples: 1468899220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 17:10:39,756][49750] Updated weights for policy 0, policy_version 226821 (0.0026) [2024-04-26 17:10:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3716349952. Throughput: 0: 50865.1. Samples: 1469203000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:42,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 17:10:42,198][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000226829_3716366336.pth... [2024-04-26 17:10:42,245][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000226085_3704176640.pth [2024-04-26 17:10:42,594][49750] Updated weights for policy 0, policy_version 226831 (0.0034) [2024-04-26 17:10:46,094][49750] Updated weights for policy 0, policy_version 226841 (0.0035) [2024-04-26 17:10:47,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3716612096. Throughput: 0: 50738.2. Samples: 1469507100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 17:10:49,036][49750] Updated weights for policy 0, policy_version 226851 (0.0035) [2024-04-26 17:10:52,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 3716841472. Throughput: 0: 50896.2. Samples: 1469657900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:52,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 17:10:52,648][49750] Updated weights for policy 0, policy_version 226861 (0.0029) [2024-04-26 17:10:53,773][49728] Signal inference workers to stop experience collection... (21950 times) [2024-04-26 17:10:53,826][49750] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-04-26 17:10:53,832][49728] Signal inference workers to resume experience collection... (21950 times) [2024-04-26 17:10:53,841][49750] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-04-26 17:10:55,577][49750] Updated weights for policy 0, policy_version 226871 (0.0025) [2024-04-26 17:10:57,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3717103616. Throughput: 0: 50892.8. Samples: 1469962580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:10:57,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 17:10:59,103][49750] Updated weights for policy 0, policy_version 226881 (0.0034) [2024-04-26 17:11:01,962][49750] Updated weights for policy 0, policy_version 226891 (0.0033) [2024-04-26 17:11:02,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3717382144. Throughput: 0: 50719.9. Samples: 1470265140. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:11:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 17:11:05,510][49750] Updated weights for policy 0, policy_version 226901 (0.0032) [2024-04-26 17:11:07,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.3, 300 sec: 50929.2). Total num frames: 3717627904. Throughput: 0: 50826.1. Samples: 1470426620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 17:11:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 17:11:08,343][49750] Updated weights for policy 0, policy_version 226911 (0.0038) [2024-04-26 17:11:11,924][49750] Updated weights for policy 0, policy_version 226921 (0.0035) [2024-04-26 17:11:12,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 3717873664. Throughput: 0: 50870.1. Samples: 1470728100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:12,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 17:11:14,901][49750] Updated weights for policy 0, policy_version 226931 (0.0030) [2024-04-26 17:11:17,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3718119424. Throughput: 0: 50991.2. Samples: 1471035380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:17,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 17:11:18,267][49750] Updated weights for policy 0, policy_version 226941 (0.0035) [2024-04-26 17:11:21,382][49750] Updated weights for policy 0, policy_version 226951 (0.0029) [2024-04-26 17:11:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3718397952. Throughput: 0: 50774.2. Samples: 1471184060. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:22,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 17:11:24,768][49750] Updated weights for policy 0, policy_version 226961 (0.0036) [2024-04-26 17:11:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3718627328. Throughput: 0: 50853.9. Samples: 1471491420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:27,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 17:11:27,794][49750] Updated weights for policy 0, policy_version 226971 (0.0030) [2024-04-26 17:11:31,089][49750] Updated weights for policy 0, policy_version 226981 (0.0034) [2024-04-26 17:11:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3718889472. Throughput: 0: 50875.6. Samples: 1471796500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:32,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-26 17:11:34,209][49750] Updated weights for policy 0, policy_version 226991 (0.0029) [2024-04-26 17:11:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3719151616. Throughput: 0: 50777.2. Samples: 1471942860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:37,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-26 17:11:37,458][49750] Updated weights for policy 0, policy_version 227001 (0.0033) [2024-04-26 17:11:40,649][49750] Updated weights for policy 0, policy_version 227011 (0.0032) [2024-04-26 17:11:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3719397376. Throughput: 0: 50885.5. Samples: 1472252420. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 17:11:43,875][49750] Updated weights for policy 0, policy_version 227021 (0.0032) [2024-04-26 17:11:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3719659520. Throughput: 0: 50918.3. Samples: 1472556460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:47,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 17:11:47,155][49750] Updated weights for policy 0, policy_version 227031 (0.0028) [2024-04-26 17:11:50,449][49750] Updated weights for policy 0, policy_version 227041 (0.0025) [2024-04-26 17:11:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.7, 300 sec: 50929.3). Total num frames: 3719921664. Throughput: 0: 50796.7. Samples: 1472712460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:52,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 17:11:53,534][49750] Updated weights for policy 0, policy_version 227051 (0.0035) [2024-04-26 17:11:56,893][49750] Updated weights for policy 0, policy_version 227061 (0.0027) [2024-04-26 17:11:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3720167424. Throughput: 0: 50824.6. Samples: 1473015200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:11:57,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 17:11:59,967][49750] Updated weights for policy 0, policy_version 227071 (0.0031) [2024-04-26 17:12:02,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3720396800. Throughput: 0: 50643.5. Samples: 1473314340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:12:02,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 17:12:03,563][49750] Updated weights for policy 0, policy_version 227081 (0.0034) [2024-04-26 17:12:06,418][49750] Updated weights for policy 0, policy_version 227091 (0.0025) [2024-04-26 17:12:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 3720691712. Throughput: 0: 50844.0. Samples: 1473472040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:12:07,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 17:12:10,095][49750] Updated weights for policy 0, policy_version 227101 (0.0030) [2024-04-26 17:12:10,995][49728] Signal inference workers to stop experience collection... (22000 times) [2024-04-26 17:12:10,995][49728] Signal inference workers to resume experience collection... (22000 times) [2024-04-26 17:12:11,014][49750] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-04-26 17:12:11,014][49750] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-04-26 17:12:12,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3720921088. Throughput: 0: 50775.5. Samples: 1473776320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:12:12,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 17:12:12,900][49750] Updated weights for policy 0, policy_version 227111 (0.0039) [2024-04-26 17:12:16,407][49750] Updated weights for policy 0, policy_version 227121 (0.0034) [2024-04-26 17:12:17,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3721166848. Throughput: 0: 50680.5. Samples: 1474077120. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 17:12:17,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 17:12:19,344][49750] Updated weights for policy 0, policy_version 227131 (0.0029) [2024-04-26 17:12:22,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3721412608. Throughput: 0: 50664.4. Samples: 1474222760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:12:22,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 17:12:22,799][49750] Updated weights for policy 0, policy_version 227141 (0.0033) [2024-04-26 17:12:25,818][49750] Updated weights for policy 0, policy_version 227151 (0.0034) [2024-04-26 17:12:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3721691136. Throughput: 0: 50653.0. Samples: 1474531800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:12:27,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 17:12:29,277][49750] Updated weights for policy 0, policy_version 227161 (0.0035) [2024-04-26 17:12:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3721936896. Throughput: 0: 50763.8. Samples: 1474840840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:12:32,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 17:12:32,307][49750] Updated weights for policy 0, policy_version 227171 (0.0036) [2024-04-26 17:12:35,806][49750] Updated weights for policy 0, policy_version 227181 (0.0031) [2024-04-26 17:12:37,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3722182656. Throughput: 0: 50711.8. Samples: 1474994500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:12:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 17:12:38,740][49750] Updated weights for policy 0, policy_version 227191 (0.0032) [2024-04-26 17:12:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3722444800. Throughput: 0: 50838.0. Samples: 1475302920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:12:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 17:12:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000227200_3722444800.pth... [2024-04-26 17:12:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000226457_3710271488.pth [2024-04-26 17:12:42,300][49750] Updated weights for policy 0, policy_version 227201 (0.0034) [2024-04-26 17:12:45,228][49750] Updated weights for policy 0, policy_version 227211 (0.0030) [2024-04-26 17:12:47,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3722690560. Throughput: 0: 50860.0. Samples: 1475603040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:12:47,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 17:12:48,657][49750] Updated weights for policy 0, policy_version 227221 (0.0032) [2024-04-26 17:12:51,641][49750] Updated weights for policy 0, policy_version 227231 (0.0037) [2024-04-26 17:12:52,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3722969088. Throughput: 0: 50772.4. Samples: 1475756800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:12:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 17:12:55,044][49750] Updated weights for policy 0, policy_version 227241 (0.0034) [2024-04-26 17:12:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3723214848. Throughput: 0: 50715.6. Samples: 1476058520. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:12:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 17:12:58,004][49750] Updated weights for policy 0, policy_version 227251 (0.0026) [2024-04-26 17:13:01,659][49750] Updated weights for policy 0, policy_version 227261 (0.0032) [2024-04-26 17:13:02,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3723444224. Throughput: 0: 50804.8. Samples: 1476363340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:13:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 17:13:04,489][49750] Updated weights for policy 0, policy_version 227271 (0.0029) [2024-04-26 17:13:07,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3723706368. Throughput: 0: 50809.7. Samples: 1476509200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:13:07,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 17:13:08,183][49750] Updated weights for policy 0, policy_version 227281 (0.0028) [2024-04-26 17:13:10,837][49750] Updated weights for policy 0, policy_version 227291 (0.0027) [2024-04-26 17:13:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3723968512. Throughput: 0: 50855.9. Samples: 1476820320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:13:12,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 17:13:14,568][49750] Updated weights for policy 0, policy_version 227301 (0.0034) [2024-04-26 17:13:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3724230656. Throughput: 0: 50641.9. Samples: 1477119720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:13:17,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 17:13:17,397][49750] Updated weights for policy 0, policy_version 227311 (0.0029) [2024-04-26 17:13:21,012][49750] Updated weights for policy 0, policy_version 227321 (0.0035) [2024-04-26 17:13:22,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3724443648. Throughput: 0: 50695.8. Samples: 1477275800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:13:22,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 17:13:23,804][49750] Updated weights for policy 0, policy_version 227331 (0.0033) [2024-04-26 17:13:27,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3724722176. Throughput: 0: 50544.1. Samples: 1477577400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 17:13:27,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 17:13:27,336][49750] Updated weights for policy 0, policy_version 227341 (0.0034) [2024-04-26 17:13:29,030][49728] Signal inference workers to stop experience collection... (22050 times) [2024-04-26 17:13:29,030][49728] Signal inference workers to resume experience collection... (22050 times) [2024-04-26 17:13:29,044][49750] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-04-26 17:13:29,045][49750] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-04-26 17:13:30,274][49750] Updated weights for policy 0, policy_version 227351 (0.0037) [2024-04-26 17:13:32,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3725000704. Throughput: 0: 50695.1. Samples: 1477884320. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:13:32,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 17:13:33,697][49750] Updated weights for policy 0, policy_version 227361 (0.0030) [2024-04-26 17:13:36,736][49750] Updated weights for policy 0, policy_version 227371 (0.0039) [2024-04-26 17:13:37,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 3725262848. Throughput: 0: 50846.7. Samples: 1478044900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:13:37,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 17:13:40,223][49750] Updated weights for policy 0, policy_version 227381 (0.0035) [2024-04-26 17:13:42,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3725475840. Throughput: 0: 50714.2. Samples: 1478340660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:13:42,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 17:13:43,177][49750] Updated weights for policy 0, policy_version 227391 (0.0029) [2024-04-26 17:13:46,747][49750] Updated weights for policy 0, policy_version 227401 (0.0038) [2024-04-26 17:13:47,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3725737984. Throughput: 0: 50619.6. Samples: 1478641220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:13:47,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 17:13:49,664][49750] Updated weights for policy 0, policy_version 227411 (0.0037) [2024-04-26 17:13:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3726000128. Throughput: 0: 50728.4. Samples: 1478791980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:13:52,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 17:13:53,118][49750] Updated weights for policy 0, policy_version 227421 (0.0030) [2024-04-26 17:13:56,154][49750] Updated weights for policy 0, policy_version 227431 (0.0036) [2024-04-26 17:13:57,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3726245888. Throughput: 0: 50707.5. Samples: 1479102160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:13:57,064][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 17:13:59,506][49750] Updated weights for policy 0, policy_version 227441 (0.0036) [2024-04-26 17:14:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3726508032. Throughput: 0: 50806.5. Samples: 1479406020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:14:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:14:02,541][49750] Updated weights for policy 0, policy_version 227451 (0.0029) [2024-04-26 17:14:05,881][49750] Updated weights for policy 0, policy_version 227461 (0.0034) [2024-04-26 17:14:07,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3726737408. Throughput: 0: 50788.8. Samples: 1479561300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:14:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 17:14:09,028][49750] Updated weights for policy 0, policy_version 227471 (0.0035) [2024-04-26 17:14:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 3727015936. Throughput: 0: 50791.7. Samples: 1479863040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:14:12,063][49517] Avg episode reward: [(0, '0.679')] [2024-04-26 17:14:12,360][49750] Updated weights for policy 0, policy_version 227481 (0.0033) [2024-04-26 17:14:15,457][49750] Updated weights for policy 0, policy_version 227491 (0.0029) [2024-04-26 17:14:17,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3727261696. Throughput: 0: 50631.6. Samples: 1480162740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:14:17,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 17:14:19,533][49750] Updated weights for policy 0, policy_version 227501 (0.0032) [2024-04-26 17:14:21,970][49750] Updated weights for policy 0, policy_version 227511 (0.0031) [2024-04-26 17:14:22,063][49517] Fps is (10 sec: 52429.8, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 3727540224. Throughput: 0: 50632.3. Samples: 1480323360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:14:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 17:14:26,202][49750] Updated weights for policy 0, policy_version 227521 (0.0029) [2024-04-26 17:14:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3727753216. Throughput: 0: 50661.5. Samples: 1480620420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:14:27,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:14:28,549][49750] Updated weights for policy 0, policy_version 227531 (0.0034) [2024-04-26 17:14:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3728015360. Throughput: 0: 50605.8. Samples: 1480918480. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:14:32,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 17:14:32,670][49750] Updated weights for policy 0, policy_version 227541 (0.0032) [2024-04-26 17:14:35,226][49750] Updated weights for policy 0, policy_version 227551 (0.0030) [2024-04-26 17:14:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.1, 300 sec: 50818.2). Total num frames: 3728277504. Throughput: 0: 50650.1. Samples: 1481071240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 17:14:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 17:14:39,139][49750] Updated weights for policy 0, policy_version 227561 (0.0035) [2024-04-26 17:14:39,539][49728] Signal inference workers to stop experience collection... (22100 times) [2024-04-26 17:14:39,560][49750] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-04-26 17:14:39,658][49728] Signal inference workers to resume experience collection... (22100 times) [2024-04-26 17:14:39,658][49750] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-04-26 17:14:41,644][49750] Updated weights for policy 0, policy_version 227571 (0.0029) [2024-04-26 17:14:42,063][49517] Fps is (10 sec: 52427.1, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3728539648. Throughput: 0: 50633.5. Samples: 1481380680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:14:42,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 17:14:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000227572_3728539648.pth... [2024-04-26 17:14:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000226829_3716366336.pth [2024-04-26 17:14:45,632][49750] Updated weights for policy 0, policy_version 227581 (0.0034) [2024-04-26 17:14:47,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3728785408. Throughput: 0: 50674.9. Samples: 1481686380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:14:47,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 17:14:48,182][49750] Updated weights for policy 0, policy_version 227591 (0.0031) [2024-04-26 17:14:51,919][49750] Updated weights for policy 0, policy_version 227601 (0.0035) [2024-04-26 17:14:52,063][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3729014784. Throughput: 0: 50520.4. Samples: 1481834720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:14:52,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 17:14:54,575][49750] Updated weights for policy 0, policy_version 227611 (0.0028) [2024-04-26 17:14:57,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3729309696. Throughput: 0: 50562.0. Samples: 1482138320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:14:57,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 17:14:58,294][49750] Updated weights for policy 0, policy_version 227621 (0.0030) [2024-04-26 17:15:00,988][49750] Updated weights for policy 0, policy_version 227631 (0.0030) [2024-04-26 17:15:02,062][49517] Fps is (10 sec: 54067.8, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3729555456. Throughput: 0: 50671.5. Samples: 1482442960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:15:02,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 17:15:04,938][49750] Updated weights for policy 0, policy_version 227641 (0.0039) [2024-04-26 17:15:07,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3729801216. Throughput: 0: 50723.5. Samples: 1482605920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:15:07,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 17:15:07,431][49750] Updated weights for policy 0, policy_version 227651 (0.0033) [2024-04-26 17:15:11,234][49750] Updated weights for policy 0, policy_version 227661 (0.0027) [2024-04-26 17:15:12,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3730046976. Throughput: 0: 50772.7. Samples: 1482905200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:15:12,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 17:15:13,795][49750] Updated weights for policy 0, policy_version 227671 (0.0033) [2024-04-26 17:15:17,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3730292736. Throughput: 0: 50769.6. Samples: 1483203120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:15:17,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 17:15:17,649][49750] Updated weights for policy 0, policy_version 227681 (0.0032) [2024-04-26 17:15:20,222][49750] Updated weights for policy 0, policy_version 227691 (0.0031) [2024-04-26 17:15:22,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3730571264. Throughput: 0: 50971.3. Samples: 1483364940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:15:22,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 17:15:24,037][49750] Updated weights for policy 0, policy_version 227701 (0.0034) [2024-04-26 17:15:26,656][49750] Updated weights for policy 0, policy_version 227711 (0.0036) [2024-04-26 17:15:27,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3730833408. Throughput: 0: 50982.1. Samples: 1483674860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:15:27,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 17:15:30,518][49750] Updated weights for policy 0, policy_version 227721 (0.0034) [2024-04-26 17:15:32,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3731046400. Throughput: 0: 50809.6. Samples: 1483972820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:15:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 17:15:33,163][49750] Updated weights for policy 0, policy_version 227731 (0.0029) [2024-04-26 17:15:37,002][49750] Updated weights for policy 0, policy_version 227741 (0.0035) [2024-04-26 17:15:37,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3731308544. Throughput: 0: 50688.8. Samples: 1484115720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:15:37,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 17:15:38,969][49728] Signal inference workers to stop experience collection... (22150 times) [2024-04-26 17:15:39,004][49750] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-04-26 17:15:39,029][49728] Signal inference workers to resume experience collection... (22150 times) [2024-04-26 17:15:39,029][49750] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-04-26 17:15:39,633][49750] Updated weights for policy 0, policy_version 227751 (0.0036) [2024-04-26 17:15:42,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 3731603456. Throughput: 0: 50766.9. Samples: 1484422840. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-26 17:15:42,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 17:15:43,397][49750] Updated weights for policy 0, policy_version 227761 (0.0036) [2024-04-26 17:15:46,011][49750] Updated weights for policy 0, policy_version 227771 (0.0030) [2024-04-26 17:15:47,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3731849216. Throughput: 0: 50766.6. Samples: 1484727460. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:15:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 17:15:49,707][49750] Updated weights for policy 0, policy_version 227781 (0.0031) [2024-04-26 17:15:52,062][49517] Fps is (10 sec: 47514.8, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3732078592. Throughput: 0: 50704.2. Samples: 1484887600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:15:52,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 17:15:52,781][49750] Updated weights for policy 0, policy_version 227791 (0.0029) [2024-04-26 17:15:56,228][49750] Updated weights for policy 0, policy_version 227801 (0.0030) [2024-04-26 17:15:57,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3732324352. Throughput: 0: 50734.4. Samples: 1485188240. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:15:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 17:15:59,310][49750] Updated weights for policy 0, policy_version 227811 (0.0033) [2024-04-26 17:16:02,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3732586496. Throughput: 0: 50790.1. Samples: 1485488680. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:02,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 17:16:02,590][49750] Updated weights for policy 0, policy_version 227821 (0.0034) [2024-04-26 17:16:05,675][49750] Updated weights for policy 0, policy_version 227831 (0.0032) [2024-04-26 17:16:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3732848640. Throughput: 0: 50706.7. Samples: 1485646740. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:07,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 17:16:09,093][49750] Updated weights for policy 0, policy_version 227841 (0.0032) [2024-04-26 17:16:12,055][49750] Updated weights for policy 0, policy_version 227851 (0.0036) [2024-04-26 17:16:12,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3733110784. Throughput: 0: 50645.3. Samples: 1485953900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:12,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 17:16:15,482][49750] Updated weights for policy 0, policy_version 227861 (0.0032) [2024-04-26 17:16:17,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.5, 300 sec: 50651.5). Total num frames: 3733340160. Throughput: 0: 50763.1. Samples: 1486257160. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:17,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 17:16:18,596][49750] Updated weights for policy 0, policy_version 227871 (0.0029) [2024-04-26 17:16:21,945][49750] Updated weights for policy 0, policy_version 227881 (0.0025) [2024-04-26 17:16:22,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3733602304. Throughput: 0: 50773.4. Samples: 1486400520. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:22,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 17:16:24,940][49750] Updated weights for policy 0, policy_version 227891 (0.0027) [2024-04-26 17:16:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3733864448. Throughput: 0: 50769.6. Samples: 1486707460. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:27,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 17:16:28,461][49750] Updated weights for policy 0, policy_version 227901 (0.0032) [2024-04-26 17:16:31,532][49750] Updated weights for policy 0, policy_version 227911 (0.0032) [2024-04-26 17:16:32,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 3734142976. Throughput: 0: 50817.0. Samples: 1487014220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:32,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 17:16:34,927][49750] Updated weights for policy 0, policy_version 227921 (0.0033) [2024-04-26 17:16:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3734355968. Throughput: 0: 50705.3. Samples: 1487169340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:37,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 17:16:37,869][49750] Updated weights for policy 0, policy_version 227931 (0.0033) [2024-04-26 17:16:41,166][49750] Updated weights for policy 0, policy_version 227941 (0.0031) [2024-04-26 17:16:42,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.4, 300 sec: 50651.6). Total num frames: 3734601728. Throughput: 0: 50820.5. Samples: 1487475160. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:42,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 17:16:42,143][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000227943_3734618112.pth... 
[2024-04-26 17:16:42,190][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000227200_3722444800.pth [2024-04-26 17:16:44,295][49750] Updated weights for policy 0, policy_version 227951 (0.0028) [2024-04-26 17:16:45,724][49728] Signal inference workers to stop experience collection... (22200 times) [2024-04-26 17:16:45,725][49728] Signal inference workers to resume experience collection... (22200 times) [2024-04-26 17:16:45,752][49750] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-04-26 17:16:45,753][49750] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-04-26 17:16:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3734880256. Throughput: 0: 50786.0. Samples: 1487774040. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:47,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 17:16:47,684][49750] Updated weights for policy 0, policy_version 227961 (0.0029) [2024-04-26 17:16:50,694][49750] Updated weights for policy 0, policy_version 227971 (0.0031) [2024-04-26 17:16:52,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3735142400. Throughput: 0: 50751.3. Samples: 1487930560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-04-26 17:16:52,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 17:16:54,212][49750] Updated weights for policy 0, policy_version 227981 (0.0028) [2024-04-26 17:16:57,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3735388160. Throughput: 0: 50682.0. Samples: 1488234600. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:16:57,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 17:16:57,410][49750] Updated weights for policy 0, policy_version 227991 (0.0032) [2024-04-26 17:17:00,548][49750] Updated weights for policy 0, policy_version 228001 (0.0029) [2024-04-26 17:17:02,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50651.5). Total num frames: 3735633920. Throughput: 0: 50767.5. Samples: 1488541700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:02,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 17:17:03,701][49750] Updated weights for policy 0, policy_version 228011 (0.0032) [2024-04-26 17:17:06,908][49750] Updated weights for policy 0, policy_version 228021 (0.0031) [2024-04-26 17:17:07,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3735896064. Throughput: 0: 51052.6. Samples: 1488697880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 17:17:10,112][49750] Updated weights for policy 0, policy_version 228031 (0.0036) [2024-04-26 17:17:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3736158208. Throughput: 0: 50883.9. Samples: 1488997240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:12,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 17:17:13,394][49750] Updated weights for policy 0, policy_version 228041 (0.0031) [2024-04-26 17:17:16,623][49750] Updated weights for policy 0, policy_version 228051 (0.0027) [2024-04-26 17:17:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3736403968. Throughput: 0: 50807.0. Samples: 1489300540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:17,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 17:17:19,829][49750] Updated weights for policy 0, policy_version 228061 (0.0034) [2024-04-26 17:17:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3736649728. Throughput: 0: 50794.6. Samples: 1489455100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:22,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 17:17:22,915][49750] Updated weights for policy 0, policy_version 228071 (0.0031) [2024-04-26 17:17:26,343][49750] Updated weights for policy 0, policy_version 228081 (0.0029) [2024-04-26 17:17:27,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3736911872. Throughput: 0: 50823.1. Samples: 1489762200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 17:17:29,295][49750] Updated weights for policy 0, policy_version 228091 (0.0033) [2024-04-26 17:17:32,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 3737157632. Throughput: 0: 50812.2. Samples: 1490060600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:32,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 17:17:32,697][49750] Updated weights for policy 0, policy_version 228101 (0.0037) [2024-04-26 17:17:35,833][49750] Updated weights for policy 0, policy_version 228111 (0.0033) [2024-04-26 17:17:37,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3737419776. Throughput: 0: 50839.3. Samples: 1490218320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:37,064][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 17:17:39,048][49750] Updated weights for policy 0, policy_version 228121 (0.0029) [2024-04-26 17:17:42,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3737665536. Throughput: 0: 50806.0. Samples: 1490520860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:42,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 17:17:42,426][49750] Updated weights for policy 0, policy_version 228131 (0.0034) [2024-04-26 17:17:45,663][49750] Updated weights for policy 0, policy_version 228141 (0.0034) [2024-04-26 17:17:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3737911296. Throughput: 0: 50786.4. Samples: 1490827080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:47,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 17:17:48,793][49750] Updated weights for policy 0, policy_version 228151 (0.0029) [2024-04-26 17:17:49,864][49728] Signal inference workers to stop experience collection... (22250 times) [2024-04-26 17:17:49,864][49728] Signal inference workers to resume experience collection... (22250 times) [2024-04-26 17:17:49,906][49750] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-04-26 17:17:49,907][49750] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-04-26 17:17:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3738173440. Throughput: 0: 50673.3. Samples: 1490978180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:52,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 17:17:52,140][49750] Updated weights for policy 0, policy_version 228161 (0.0032) [2024-04-26 17:17:55,386][49750] Updated weights for policy 0, policy_version 228171 (0.0029) [2024-04-26 17:17:57,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3738435584. Throughput: 0: 50692.7. Samples: 1491278420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:17:57,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:17:58,491][49750] Updated weights for policy 0, policy_version 228181 (0.0032) [2024-04-26 17:18:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3738664960. Throughput: 0: 50800.1. Samples: 1491586540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:18:02,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:18:02,140][49750] Updated weights for policy 0, policy_version 228191 (0.0031) [2024-04-26 17:18:05,039][49750] Updated weights for policy 0, policy_version 228201 (0.0029) [2024-04-26 17:18:07,063][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3738927104. Throughput: 0: 50678.7. Samples: 1491735640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:07,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 17:18:08,444][49750] Updated weights for policy 0, policy_version 228211 (0.0028) [2024-04-26 17:18:11,360][49750] Updated weights for policy 0, policy_version 228221 (0.0028) [2024-04-26 17:18:12,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3739189248. Throughput: 0: 50742.0. Samples: 1492045600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:12,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 17:18:14,887][49750] Updated weights for policy 0, policy_version 228231 (0.0034) [2024-04-26 17:18:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3739451392. Throughput: 0: 50826.7. Samples: 1492347800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:17,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 17:18:17,844][49750] Updated weights for policy 0, policy_version 228241 (0.0032) [2024-04-26 17:18:21,423][49750] Updated weights for policy 0, policy_version 228251 (0.0033) [2024-04-26 17:18:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3739713536. Throughput: 0: 50753.8. Samples: 1492502240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 17:18:24,326][49750] Updated weights for policy 0, policy_version 228261 (0.0036) [2024-04-26 17:18:27,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3739942912. Throughput: 0: 50768.9. Samples: 1492805460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 17:18:27,684][49750] Updated weights for policy 0, policy_version 228271 (0.0034) [2024-04-26 17:18:30,723][49750] Updated weights for policy 0, policy_version 228281 (0.0036) [2024-04-26 17:18:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.6, 300 sec: 50651.5). Total num frames: 3740205056. Throughput: 0: 50737.3. Samples: 1493110260. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:32,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 17:18:34,070][49750] Updated weights for policy 0, policy_version 228291 (0.0035) [2024-04-26 17:18:37,062][49750] Updated weights for policy 0, policy_version 228301 (0.0029) [2024-04-26 17:18:37,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3740483584. Throughput: 0: 50767.6. Samples: 1493262720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 17:18:40,580][49750] Updated weights for policy 0, policy_version 228311 (0.0030) [2024-04-26 17:18:42,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3740729344. Throughput: 0: 50908.1. Samples: 1493569280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:42,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 17:18:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000228316_3740729344.pth... [2024-04-26 17:18:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000227572_3728539648.pth [2024-04-26 17:18:43,381][49750] Updated weights for policy 0, policy_version 228321 (0.0030) [2024-04-26 17:18:47,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3740958720. Throughput: 0: 50893.5. Samples: 1493876760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:47,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 17:18:47,153][49750] Updated weights for policy 0, policy_version 228331 (0.0032) [2024-04-26 17:18:49,878][49750] Updated weights for policy 0, policy_version 228341 (0.0030) [2024-04-26 17:18:52,063][49517] Fps is (10 sec: 45874.5, 60 sec: 50244.1, 300 sec: 50651.5). Total num frames: 3741188096. Throughput: 0: 50851.4. Samples: 1494023960. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:52,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 17:18:53,429][49750] Updated weights for policy 0, policy_version 228351 (0.0034) [2024-04-26 17:18:56,347][49750] Updated weights for policy 0, policy_version 228361 (0.0031) [2024-04-26 17:18:57,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3741483008. Throughput: 0: 50700.9. Samples: 1494327140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:18:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 17:18:59,894][49750] Updated weights for policy 0, policy_version 228371 (0.0026) [2024-04-26 17:19:02,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3741728768. Throughput: 0: 50763.7. Samples: 1494632160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:19:02,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 17:19:02,663][49750] Updated weights for policy 0, policy_version 228381 (0.0030) [2024-04-26 17:19:06,365][49750] Updated weights for policy 0, policy_version 228391 (0.0030) [2024-04-26 17:19:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3741990912. Throughput: 0: 50861.0. Samples: 1494790980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:19:07,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 17:19:09,106][49750] Updated weights for policy 0, policy_version 228401 (0.0031) [2024-04-26 17:19:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3742220288. Throughput: 0: 50712.8. Samples: 1495087540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 17:19:12,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 17:19:12,813][49750] Updated weights for policy 0, policy_version 228411 (0.0033) [2024-04-26 17:19:15,715][49750] Updated weights for policy 0, policy_version 228421 (0.0034) [2024-04-26 17:19:16,755][49728] Signal inference workers to stop experience collection... (22300 times) [2024-04-26 17:19:16,755][49728] Signal inference workers to resume experience collection... (22300 times) [2024-04-26 17:19:16,788][49750] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-04-26 17:19:16,788][49750] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-04-26 17:19:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3742498816. Throughput: 0: 50788.4. Samples: 1495395740. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:19:17,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 17:19:19,181][49750] Updated weights for policy 0, policy_version 228431 (0.0029) [2024-04-26 17:19:22,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3742760960. Throughput: 0: 50760.4. Samples: 1495546940. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:19:22,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 17:19:22,217][49750] Updated weights for policy 0, policy_version 228441 (0.0038) [2024-04-26 17:19:25,661][49750] Updated weights for policy 0, policy_version 228451 (0.0036) [2024-04-26 17:19:27,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3743006720. Throughput: 0: 50742.8. Samples: 1495852700. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:19:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 17:19:28,881][49750] Updated weights for policy 0, policy_version 228461 (0.0034) [2024-04-26 17:19:32,062][49750] Updated weights for policy 0, policy_version 228471 (0.0028) [2024-04-26 17:19:32,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3743268864. Throughput: 0: 50758.2. Samples: 1496160880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:19:32,071][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 17:19:35,160][49750] Updated weights for policy 0, policy_version 228481 (0.0029) [2024-04-26 17:19:37,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3743498240. Throughput: 0: 50682.3. Samples: 1496304660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:19:37,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 17:19:38,557][49750] Updated weights for policy 0, policy_version 228491 (0.0037) [2024-04-26 17:19:41,664][49750] Updated weights for policy 0, policy_version 228501 (0.0041) [2024-04-26 17:19:42,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3743760384. Throughput: 0: 50750.4. Samples: 1496610900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:19:42,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 17:19:44,917][49750] Updated weights for policy 0, policy_version 228511 (0.0035) [2024-04-26 17:19:47,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3744022528. Throughput: 0: 50677.0. Samples: 1496912620. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:19:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 17:19:48,165][49750] Updated weights for policy 0, policy_version 228521 (0.0031) [2024-04-26 17:19:51,357][49750] Updated weights for policy 0, policy_version 228531 (0.0031) [2024-04-26 17:19:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51609.7, 300 sec: 50762.6). Total num frames: 3744284672. Throughput: 0: 50819.9. Samples: 1497077880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:19:52,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 17:19:54,633][49750] Updated weights for policy 0, policy_version 228541 (0.0037) [2024-04-26 17:19:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 3744497664. Throughput: 0: 50948.5. Samples: 1497380220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:19:57,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 17:19:57,829][49750] Updated weights for policy 0, policy_version 228551 (0.0032) [2024-04-26 17:20:00,868][49750] Updated weights for policy 0, policy_version 228561 (0.0034) [2024-04-26 17:20:02,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3744759808. Throughput: 0: 50768.8. Samples: 1497680340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:20:02,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 17:20:04,348][49750] Updated weights for policy 0, policy_version 228571 (0.0034) [2024-04-26 17:20:07,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3745054720. Throughput: 0: 50819.0. Samples: 1497833800. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:20:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 17:20:07,181][49750] Updated weights for policy 0, policy_version 228581 (0.0027) [2024-04-26 17:20:10,731][49750] Updated weights for policy 0, policy_version 228591 (0.0035) [2024-04-26 17:20:12,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3745300480. Throughput: 0: 50891.0. Samples: 1498142800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:20:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 17:20:13,664][49750] Updated weights for policy 0, policy_version 228601 (0.0031) [2024-04-26 17:20:17,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3745546240. Throughput: 0: 50897.2. Samples: 1498451240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 17:20:17,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 17:20:17,089][49750] Updated weights for policy 0, policy_version 228611 (0.0030) [2024-04-26 17:20:20,138][49750] Updated weights for policy 0, policy_version 228621 (0.0040) [2024-04-26 17:20:22,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50707.0). Total num frames: 3745792000. Throughput: 0: 50767.9. Samples: 1498589220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:20:22,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:20:23,613][49750] Updated weights for policy 0, policy_version 228631 (0.0035) [2024-04-26 17:20:26,672][49750] Updated weights for policy 0, policy_version 228641 (0.0026) [2024-04-26 17:20:27,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3746054144. Throughput: 0: 50809.2. Samples: 1498897320. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:20:27,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 17:20:28,031][49728] Signal inference workers to stop experience collection... (22350 times) [2024-04-26 17:20:28,067][49750] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-04-26 17:20:28,136][49728] Signal inference workers to resume experience collection... (22350 times) [2024-04-26 17:20:28,136][49750] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-04-26 17:20:30,161][49750] Updated weights for policy 0, policy_version 228651 (0.0025) [2024-04-26 17:20:32,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3746316288. Throughput: 0: 50946.6. Samples: 1499205220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:20:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 17:20:33,383][49750] Updated weights for policy 0, policy_version 228661 (0.0030) [2024-04-26 17:20:36,565][49750] Updated weights for policy 0, policy_version 228671 (0.0026) [2024-04-26 17:20:37,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51336.7, 300 sec: 50762.7). Total num frames: 3746578432. Throughput: 0: 50832.6. Samples: 1499365340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:20:37,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 17:20:40,195][49750] Updated weights for policy 0, policy_version 228681 (0.0031) [2024-04-26 17:20:42,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3746791424. Throughput: 0: 50823.6. Samples: 1499667280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:20:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 17:20:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000228687_3746807808.pth... 
[2024-04-26 17:20:42,111][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000227943_3734618112.pth [2024-04-26 17:20:42,900][49750] Updated weights for policy 0, policy_version 228691 (0.0029) [2024-04-26 17:20:46,509][49750] Updated weights for policy 0, policy_version 228701 (0.0034) [2024-04-26 17:20:47,063][49517] Fps is (10 sec: 45874.4, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3747037184. Throughput: 0: 50903.2. Samples: 1499970980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:20:47,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 17:20:49,455][49750] Updated weights for policy 0, policy_version 228711 (0.0035) [2024-04-26 17:20:52,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3747348480. Throughput: 0: 50752.5. Samples: 1500117660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:20:52,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 17:20:53,010][49750] Updated weights for policy 0, policy_version 228721 (0.0031) [2024-04-26 17:20:56,099][49750] Updated weights for policy 0, policy_version 228731 (0.0029) [2024-04-26 17:20:57,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 3747577856. Throughput: 0: 50652.4. Samples: 1500422160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:20:57,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 17:20:59,531][49750] Updated weights for policy 0, policy_version 228741 (0.0038) [2024-04-26 17:21:02,063][49517] Fps is (10 sec: 47513.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3747823616. Throughput: 0: 50652.7. Samples: 1500730620. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:21:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 17:21:02,452][49750] Updated weights for policy 0, policy_version 228751 (0.0030) [2024-04-26 17:21:05,877][49750] Updated weights for policy 0, policy_version 228761 (0.0031) [2024-04-26 17:21:07,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3748069376. Throughput: 0: 50834.0. Samples: 1500876740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:21:07,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 17:21:08,824][49750] Updated weights for policy 0, policy_version 228771 (0.0030) [2024-04-26 17:21:12,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3748331520. Throughput: 0: 50638.3. Samples: 1501176040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:21:12,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 17:21:12,238][49750] Updated weights for policy 0, policy_version 228781 (0.0040) [2024-04-26 17:21:15,414][49750] Updated weights for policy 0, policy_version 228791 (0.0030) [2024-04-26 17:21:17,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3748610048. Throughput: 0: 50633.5. Samples: 1501483720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:21:17,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 17:21:18,703][49750] Updated weights for policy 0, policy_version 228801 (0.0034) [2024-04-26 17:21:21,809][49750] Updated weights for policy 0, policy_version 228811 (0.0035) [2024-04-26 17:21:22,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3748839424. Throughput: 0: 50714.8. Samples: 1501647520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:21:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 17:21:25,307][49750] Updated weights for policy 0, policy_version 228821 (0.0026) [2024-04-26 17:21:27,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3749085184. Throughput: 0: 50659.5. Samples: 1501946960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 17:21:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 17:21:28,174][49750] Updated weights for policy 0, policy_version 228831 (0.0029) [2024-04-26 17:21:32,056][49750] Updated weights for policy 0, policy_version 228841 (0.0031) [2024-04-26 17:21:32,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3749330944. Throughput: 0: 50787.2. Samples: 1502256400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:21:32,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 17:21:34,577][49750] Updated weights for policy 0, policy_version 228851 (0.0025) [2024-04-26 17:21:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3749609472. Throughput: 0: 50722.8. Samples: 1502400180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:21:37,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 17:21:38,355][49750] Updated weights for policy 0, policy_version 228861 (0.0028) [2024-04-26 17:21:41,045][49750] Updated weights for policy 0, policy_version 228871 (0.0031) [2024-04-26 17:21:42,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3749871616. Throughput: 0: 50841.4. Samples: 1502710020. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:21:42,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 17:21:44,937][49750] Updated weights for policy 0, policy_version 228881 (0.0029) [2024-04-26 17:21:46,227][49728] Signal inference workers to stop experience collection... (22400 times) [2024-04-26 17:21:46,266][49750] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-04-26 17:21:46,327][49728] Signal inference workers to resume experience collection... (22400 times) [2024-04-26 17:21:46,327][49750] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-04-26 17:21:47,062][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3750100992. Throughput: 0: 50659.1. Samples: 1503010280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:21:47,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 17:21:47,520][49750] Updated weights for policy 0, policy_version 228891 (0.0028) [2024-04-26 17:21:51,417][49750] Updated weights for policy 0, policy_version 228901 (0.0033) [2024-04-26 17:21:52,063][49517] Fps is (10 sec: 45874.8, 60 sec: 49698.1, 300 sec: 50651.6). Total num frames: 3750330368. Throughput: 0: 50563.8. Samples: 1503152120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:21:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 17:21:53,922][49750] Updated weights for policy 0, policy_version 228911 (0.0021) [2024-04-26 17:21:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3750608896. Throughput: 0: 50659.1. Samples: 1503455700. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:21:57,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 17:21:57,803][49750] Updated weights for policy 0, policy_version 228921 (0.0033) [2024-04-26 17:22:00,432][49750] Updated weights for policy 0, policy_version 228931 (0.0026) [2024-04-26 17:22:02,062][49517] Fps is (10 sec: 55706.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3750887424. Throughput: 0: 50600.4. Samples: 1503760740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:22:02,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 17:22:04,131][49750] Updated weights for policy 0, policy_version 228941 (0.0030) [2024-04-26 17:22:06,998][49750] Updated weights for policy 0, policy_version 228951 (0.0034) [2024-04-26 17:22:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3751133184. Throughput: 0: 50684.6. Samples: 1503928320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:22:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 17:22:10,605][49750] Updated weights for policy 0, policy_version 228961 (0.0035) [2024-04-26 17:22:12,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3751346176. Throughput: 0: 50751.1. Samples: 1504230760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:22:12,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 17:22:13,444][49750] Updated weights for policy 0, policy_version 228971 (0.0035) [2024-04-26 17:22:17,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3751608320. Throughput: 0: 50524.4. Samples: 1504530000. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:22:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:22:17,120][49750] Updated weights for policy 0, policy_version 228981 (0.0028) [2024-04-26 17:22:19,935][49750] Updated weights for policy 0, policy_version 228991 (0.0028) [2024-04-26 17:22:22,063][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3751886848. Throughput: 0: 50607.9. Samples: 1504677540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:22:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 17:22:23,493][49750] Updated weights for policy 0, policy_version 229001 (0.0038) [2024-04-26 17:22:26,461][49750] Updated weights for policy 0, policy_version 229011 (0.0032) [2024-04-26 17:22:27,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3752148992. Throughput: 0: 50607.1. Samples: 1504987340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:22:27,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 17:22:30,004][49750] Updated weights for policy 0, policy_version 229021 (0.0037) [2024-04-26 17:22:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3752378368. Throughput: 0: 50742.8. Samples: 1505293700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:22:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 17:22:32,934][49750] Updated weights for policy 0, policy_version 229031 (0.0032) [2024-04-26 17:22:36,481][49750] Updated weights for policy 0, policy_version 229041 (0.0031) [2024-04-26 17:22:37,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3752624128. Throughput: 0: 50660.7. Samples: 1505431840. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:22:37,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 17:22:37,597][49728] Signal inference workers to stop experience collection... (22450 times) [2024-04-26 17:22:37,598][49728] Signal inference workers to resume experience collection... (22450 times) [2024-04-26 17:22:37,612][49750] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-04-26 17:22:37,612][49750] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-04-26 17:22:39,273][49750] Updated weights for policy 0, policy_version 229051 (0.0037) [2024-04-26 17:22:42,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3752886272. Throughput: 0: 50728.4. Samples: 1505738480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:22:42,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 17:22:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000229058_3752886272.pth... [2024-04-26 17:22:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000228316_3740729344.pth [2024-04-26 17:22:42,872][49750] Updated weights for policy 0, policy_version 229061 (0.0037) [2024-04-26 17:22:45,650][49750] Updated weights for policy 0, policy_version 229071 (0.0031) [2024-04-26 17:22:47,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3753164800. Throughput: 0: 50522.6. Samples: 1506034260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:22:47,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 17:22:49,366][49750] Updated weights for policy 0, policy_version 229081 (0.0029) [2024-04-26 17:22:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3753394176. Throughput: 0: 50420.1. Samples: 1506197220. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:22:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 17:22:52,200][49750] Updated weights for policy 0, policy_version 229091 (0.0038) [2024-04-26 17:22:56,040][49750] Updated weights for policy 0, policy_version 229101 (0.0034) [2024-04-26 17:22:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3753639936. Throughput: 0: 50613.3. Samples: 1506508360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:22:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 17:22:58,686][49750] Updated weights for policy 0, policy_version 229111 (0.0032) [2024-04-26 17:23:02,062][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3753885696. Throughput: 0: 50535.6. Samples: 1506804100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:02,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 17:23:02,436][49750] Updated weights for policy 0, policy_version 229121 (0.0029) [2024-04-26 17:23:05,160][49750] Updated weights for policy 0, policy_version 229131 (0.0029) [2024-04-26 17:23:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3754164224. Throughput: 0: 50583.2. Samples: 1506953780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:07,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 17:23:08,786][49750] Updated weights for policy 0, policy_version 229141 (0.0030) [2024-04-26 17:23:11,647][49750] Updated weights for policy 0, policy_version 229151 (0.0034) [2024-04-26 17:23:12,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3754426368. Throughput: 0: 50477.3. Samples: 1507258820. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:12,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 17:23:15,318][49750] Updated weights for policy 0, policy_version 229161 (0.0029) [2024-04-26 17:23:17,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3754655744. Throughput: 0: 50447.9. Samples: 1507563860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:17,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 17:23:18,217][49750] Updated weights for policy 0, policy_version 229171 (0.0039) [2024-04-26 17:23:21,769][49750] Updated weights for policy 0, policy_version 229181 (0.0028) [2024-04-26 17:23:22,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3754901504. Throughput: 0: 50704.4. Samples: 1507713540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:22,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 17:23:24,695][49750] Updated weights for policy 0, policy_version 229191 (0.0031) [2024-04-26 17:23:27,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3755163648. Throughput: 0: 50572.2. Samples: 1508014240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:27,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 17:23:28,115][49750] Updated weights for policy 0, policy_version 229201 (0.0033) [2024-04-26 17:23:31,035][49750] Updated weights for policy 0, policy_version 229211 (0.0026) [2024-04-26 17:23:32,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3755425792. Throughput: 0: 50689.6. Samples: 1508315300. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:32,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 17:23:34,606][49750] Updated weights for policy 0, policy_version 229221 (0.0036) [2024-04-26 17:23:37,062][49517] Fps is (10 sec: 50791.8, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3755671552. Throughput: 0: 50599.6. Samples: 1508474200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:37,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 17:23:37,560][49750] Updated weights for policy 0, policy_version 229231 (0.0034) [2024-04-26 17:23:41,355][49750] Updated weights for policy 0, policy_version 229241 (0.0029) [2024-04-26 17:23:42,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3755900928. Throughput: 0: 50494.1. Samples: 1508780600. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:42,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 17:23:44,141][49750] Updated weights for policy 0, policy_version 229251 (0.0032) [2024-04-26 17:23:44,381][49728] Signal inference workers to stop experience collection... (22500 times) [2024-04-26 17:23:44,381][49728] Signal inference workers to resume experience collection... (22500 times) [2024-04-26 17:23:44,393][49750] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-04-26 17:23:44,393][49750] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-04-26 17:23:47,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3756179456. Throughput: 0: 50643.5. Samples: 1509083060. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-26 17:23:47,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 17:23:47,954][49750] Updated weights for policy 0, policy_version 229261 (0.0030) [2024-04-26 17:23:50,631][49750] Updated weights for policy 0, policy_version 229271 (0.0028) [2024-04-26 17:23:52,063][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3756441600. Throughput: 0: 50578.4. Samples: 1509229820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:23:52,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 17:23:54,262][49750] Updated weights for policy 0, policy_version 229281 (0.0033) [2024-04-26 17:23:57,003][49750] Updated weights for policy 0, policy_version 229291 (0.0026) [2024-04-26 17:23:57,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3756703744. Throughput: 0: 50628.4. Samples: 1509537100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:23:57,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 17:24:00,841][49750] Updated weights for policy 0, policy_version 229301 (0.0031) [2024-04-26 17:24:02,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3756933120. Throughput: 0: 50672.1. Samples: 1509844100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:02,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 17:24:03,530][49750] Updated weights for policy 0, policy_version 229311 (0.0029) [2024-04-26 17:24:07,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3757178880. Throughput: 0: 50551.5. Samples: 1509988360. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:24:07,281][49750] Updated weights for policy 0, policy_version 229321 (0.0031) [2024-04-26 17:24:10,050][49750] Updated weights for policy 0, policy_version 229331 (0.0036) [2024-04-26 17:24:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3757441024. Throughput: 0: 50669.8. Samples: 1510294380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:12,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 17:24:13,587][49750] Updated weights for policy 0, policy_version 229341 (0.0030) [2024-04-26 17:24:16,503][49750] Updated weights for policy 0, policy_version 229351 (0.0032) [2024-04-26 17:24:17,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3757719552. Throughput: 0: 50738.2. Samples: 1510598520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:17,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 17:24:19,959][49750] Updated weights for policy 0, policy_version 229361 (0.0031) [2024-04-26 17:24:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3757965312. Throughput: 0: 50697.6. Samples: 1510755600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:22,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 17:24:22,834][49750] Updated weights for policy 0, policy_version 229371 (0.0039) [2024-04-26 17:24:26,282][49750] Updated weights for policy 0, policy_version 229381 (0.0036) [2024-04-26 17:24:27,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.6, 300 sec: 50651.6). Total num frames: 3758211072. Throughput: 0: 50777.4. Samples: 1511065580. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:27,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 17:24:29,344][49750] Updated weights for policy 0, policy_version 229391 (0.0029) [2024-04-26 17:24:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3758456832. Throughput: 0: 50813.3. Samples: 1511369660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:32,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 17:24:32,893][49750] Updated weights for policy 0, policy_version 229401 (0.0029) [2024-04-26 17:24:35,875][49750] Updated weights for policy 0, policy_version 229411 (0.0031) [2024-04-26 17:24:37,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3758718976. Throughput: 0: 50700.0. Samples: 1511511320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:37,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 17:24:39,511][49750] Updated weights for policy 0, policy_version 229421 (0.0036) [2024-04-26 17:24:42,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.3, 300 sec: 50651.5). Total num frames: 3758964736. Throughput: 0: 50521.5. Samples: 1511810580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 17:24:42,254][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000229431_3758997504.pth... [2024-04-26 17:24:42,262][49750] Updated weights for policy 0, policy_version 229431 (0.0029) [2024-04-26 17:24:42,303][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000228687_3746807808.pth [2024-04-26 17:24:45,882][49750] Updated weights for policy 0, policy_version 229441 (0.0036) [2024-04-26 17:24:47,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3759226880. Throughput: 0: 50604.4. Samples: 1512121300. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:47,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 17:24:48,709][49750] Updated weights for policy 0, policy_version 229451 (0.0033) [2024-04-26 17:24:52,063][49517] Fps is (10 sec: 50791.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3759472640. Throughput: 0: 50864.8. Samples: 1512277280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 17:24:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 17:24:52,317][49750] Updated weights for policy 0, policy_version 229461 (0.0033) [2024-04-26 17:24:55,191][49750] Updated weights for policy 0, policy_version 229471 (0.0033) [2024-04-26 17:24:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3759734784. Throughput: 0: 50785.0. Samples: 1512579700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:24:57,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 17:24:58,841][49750] Updated weights for policy 0, policy_version 229481 (0.0033) [2024-04-26 17:25:01,771][49750] Updated weights for policy 0, policy_version 229491 (0.0034) [2024-04-26 17:25:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3759980544. Throughput: 0: 50869.5. Samples: 1512887640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 17:25:04,087][49728] Signal inference workers to stop experience collection... (22550 times) [2024-04-26 17:25:04,087][49728] Signal inference workers to resume experience collection... 
(22550 times) [2024-04-26 17:25:04,115][49750] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-04-26 17:25:04,115][49750] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-04-26 17:25:05,157][49750] Updated weights for policy 0, policy_version 229501 (0.0028) [2024-04-26 17:25:07,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.3, 300 sec: 50651.5). Total num frames: 3760242688. Throughput: 0: 50700.3. Samples: 1513037120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:07,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 17:25:08,239][49750] Updated weights for policy 0, policy_version 229511 (0.0030) [2024-04-26 17:25:11,547][49750] Updated weights for policy 0, policy_version 229521 (0.0028) [2024-04-26 17:25:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50651.5). Total num frames: 3760488448. Throughput: 0: 50680.9. Samples: 1513346220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:12,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:25:14,673][49750] Updated weights for policy 0, policy_version 229531 (0.0030) [2024-04-26 17:25:17,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3760766976. Throughput: 0: 50666.7. Samples: 1513649660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 17:25:17,960][49750] Updated weights for policy 0, policy_version 229541 (0.0027) [2024-04-26 17:25:21,178][49750] Updated weights for policy 0, policy_version 229551 (0.0032) [2024-04-26 17:25:22,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3760996352. Throughput: 0: 50803.7. Samples: 1513797480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 17:25:24,465][49750] Updated weights for policy 0, policy_version 229561 (0.0031) [2024-04-26 17:25:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.3, 300 sec: 50651.6). Total num frames: 3761258496. Throughput: 0: 50953.6. Samples: 1514103480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:27,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 17:25:27,622][49750] Updated weights for policy 0, policy_version 229571 (0.0032) [2024-04-26 17:25:30,803][49750] Updated weights for policy 0, policy_version 229581 (0.0027) [2024-04-26 17:25:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.5, 300 sec: 50651.5). Total num frames: 3761520640. Throughput: 0: 50760.8. Samples: 1514405540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:32,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:25:34,020][49750] Updated weights for policy 0, policy_version 229591 (0.0030) [2024-04-26 17:25:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3761766400. Throughput: 0: 50760.0. Samples: 1514561480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:37,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 17:25:37,217][49750] Updated weights for policy 0, policy_version 229601 (0.0028) [2024-04-26 17:25:40,784][49750] Updated weights for policy 0, policy_version 229611 (0.0029) [2024-04-26 17:25:42,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.7, 300 sec: 50762.6). Total num frames: 3762012160. Throughput: 0: 50871.7. Samples: 1514868920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 17:25:43,555][49750] Updated weights for policy 0, policy_version 229621 (0.0033) [2024-04-26 17:25:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 3762257920. Throughput: 0: 50716.8. Samples: 1515169900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 17:25:47,176][49750] Updated weights for policy 0, policy_version 229631 (0.0036) [2024-04-26 17:25:50,066][49750] Updated weights for policy 0, policy_version 229641 (0.0031) [2024-04-26 17:25:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3762520064. Throughput: 0: 50796.1. Samples: 1515322940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 17:25:53,564][49750] Updated weights for policy 0, policy_version 229651 (0.0029) [2024-04-26 17:25:56,530][49750] Updated weights for policy 0, policy_version 229661 (0.0031) [2024-04-26 17:25:57,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3762782208. Throughput: 0: 50597.8. Samples: 1515623120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:25:57,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 17:25:59,903][49750] Updated weights for policy 0, policy_version 229671 (0.0028) [2024-04-26 17:26:02,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50790.2, 300 sec: 50707.0). Total num frames: 3763027968. Throughput: 0: 50701.1. Samples: 1515931220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:26:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 17:26:02,857][49750] Updated weights for policy 0, policy_version 229681 (0.0034) [2024-04-26 17:26:03,512][49728] Signal inference workers to stop experience collection... (22600 times) [2024-04-26 17:26:03,515][49728] Signal inference workers to resume experience collection... (22600 times) [2024-04-26 17:26:03,540][49750] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-04-26 17:26:03,541][49750] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-04-26 17:26:06,492][49750] Updated weights for policy 0, policy_version 229691 (0.0036) [2024-04-26 17:26:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 3763273728. Throughput: 0: 50850.5. Samples: 1516085760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 17:26:09,305][49750] Updated weights for policy 0, policy_version 229701 (0.0038) [2024-04-26 17:26:12,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.2, 300 sec: 50596.0). Total num frames: 3763535872. Throughput: 0: 50659.0. Samples: 1516383140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:12,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 17:26:12,961][49750] Updated weights for policy 0, policy_version 229711 (0.0030) [2024-04-26 17:26:15,803][49750] Updated weights for policy 0, policy_version 229721 (0.0033) [2024-04-26 17:26:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3763798016. Throughput: 0: 50665.4. Samples: 1516685480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 17:26:19,339][49750] Updated weights for policy 0, policy_version 229731 (0.0029) [2024-04-26 17:26:22,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3764043776. Throughput: 0: 50576.5. Samples: 1516837420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 17:26:22,277][49750] Updated weights for policy 0, policy_version 229741 (0.0029) [2024-04-26 17:26:25,792][49750] Updated weights for policy 0, policy_version 229751 (0.0037) [2024-04-26 17:26:27,063][49517] Fps is (10 sec: 49150.5, 60 sec: 50517.1, 300 sec: 50707.0). Total num frames: 3764289536. Throughput: 0: 50599.6. Samples: 1517145920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:27,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 17:26:28,628][49750] Updated weights for policy 0, policy_version 229761 (0.0033) [2024-04-26 17:26:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 3764535296. Throughput: 0: 50672.5. Samples: 1517450160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:32,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 17:26:32,273][49750] Updated weights for policy 0, policy_version 229771 (0.0026) [2024-04-26 17:26:35,438][49750] Updated weights for policy 0, policy_version 229781 (0.0033) [2024-04-26 17:26:37,062][49517] Fps is (10 sec: 50792.3, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3764797440. Throughput: 0: 50692.1. Samples: 1517604080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:37,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 17:26:38,858][49750] Updated weights for policy 0, policy_version 229791 (0.0030) [2024-04-26 17:26:41,805][49750] Updated weights for policy 0, policy_version 229801 (0.0032) [2024-04-26 17:26:42,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3765059584. Throughput: 0: 50871.4. Samples: 1517912340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:42,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:26:42,153][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000229802_3765075968.pth... [2024-04-26 17:26:42,205][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000229058_3752886272.pth [2024-04-26 17:26:45,383][49750] Updated weights for policy 0, policy_version 229811 (0.0029) [2024-04-26 17:26:47,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3765305344. Throughput: 0: 50683.7. Samples: 1518211980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:47,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 17:26:48,373][49750] Updated weights for policy 0, policy_version 229821 (0.0030) [2024-04-26 17:26:51,677][49750] Updated weights for policy 0, policy_version 229831 (0.0030) [2024-04-26 17:26:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3765567488. Throughput: 0: 50777.3. Samples: 1518370740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 17:26:54,675][49750] Updated weights for policy 0, policy_version 229841 (0.0029) [2024-04-26 17:26:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3765813248. Throughput: 0: 50856.7. Samples: 1518671680. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:26:57,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:26:58,294][49750] Updated weights for policy 0, policy_version 229851 (0.0028) [2024-04-26 17:27:01,044][49750] Updated weights for policy 0, policy_version 229861 (0.0033) [2024-04-26 17:27:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.6, 300 sec: 50651.6). Total num frames: 3766075392. Throughput: 0: 50785.8. Samples: 1518970840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:27:02,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 17:27:04,785][49750] Updated weights for policy 0, policy_version 229871 (0.0030) [2024-04-26 17:27:07,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3766337536. Throughput: 0: 51060.9. Samples: 1519135160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:27:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 17:27:07,587][49750] Updated weights for policy 0, policy_version 229881 (0.0024) [2024-04-26 17:27:11,142][49750] Updated weights for policy 0, policy_version 229891 (0.0035) [2024-04-26 17:27:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3766583296. Throughput: 0: 50894.4. Samples: 1519436160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 17:27:12,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 17:27:14,089][49750] Updated weights for policy 0, policy_version 229901 (0.0033) [2024-04-26 17:27:17,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 3766812672. Throughput: 0: 50760.3. Samples: 1519734380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:27:17,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 17:27:17,416][49728] Signal inference workers to stop experience collection... 
(22650 times) [2024-04-26 17:27:17,416][49728] Signal inference workers to resume experience collection... (22650 times) [2024-04-26 17:27:17,433][49750] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-04-26 17:27:17,433][49750] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-04-26 17:27:17,552][49750] Updated weights for policy 0, policy_version 229911 (0.0036) [2024-04-26 17:27:20,537][49750] Updated weights for policy 0, policy_version 229921 (0.0034) [2024-04-26 17:27:22,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3767091200. Throughput: 0: 50656.9. Samples: 1519883640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:27:22,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 17:27:24,053][49750] Updated weights for policy 0, policy_version 229931 (0.0035) [2024-04-26 17:27:26,930][49750] Updated weights for policy 0, policy_version 229941 (0.0029) [2024-04-26 17:27:27,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 3767353344. Throughput: 0: 50678.7. Samples: 1520192880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:27:27,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 17:27:30,365][49750] Updated weights for policy 0, policy_version 229951 (0.0034) [2024-04-26 17:27:32,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3767599104. Throughput: 0: 50874.6. Samples: 1520501340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:27:32,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 17:27:33,404][49750] Updated weights for policy 0, policy_version 229961 (0.0032) [2024-04-26 17:27:36,696][49750] Updated weights for policy 0, policy_version 229971 (0.0039) [2024-04-26 17:27:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3767844864. Throughput: 0: 50679.6. 
Samples: 1520651320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:27:37,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 17:27:39,902][49750] Updated weights for policy 0, policy_version 229981 (0.0026) [2024-04-26 17:27:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3768107008. Throughput: 0: 50861.2. Samples: 1520960440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:27:42,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 17:27:43,254][49750] Updated weights for policy 0, policy_version 229991 (0.0032) [2024-04-26 17:27:46,389][49750] Updated weights for policy 0, policy_version 230001 (0.0030) [2024-04-26 17:27:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3768369152. Throughput: 0: 50900.4. Samples: 1521261360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:27:47,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 17:27:49,865][49750] Updated weights for policy 0, policy_version 230011 (0.0032) [2024-04-26 17:27:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3768614912. Throughput: 0: 50823.4. Samples: 1521422220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:27:52,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 17:27:52,851][49750] Updated weights for policy 0, policy_version 230021 (0.0027) [2024-04-26 17:27:56,429][49750] Updated weights for policy 0, policy_version 230031 (0.0036) [2024-04-26 17:27:57,063][49517] Fps is (10 sec: 50789.3, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 3768877056. Throughput: 0: 50813.2. Samples: 1521722760. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:27:57,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 17:27:59,276][49750] Updated weights for policy 0, policy_version 230041 (0.0030) [2024-04-26 17:28:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3769106432. Throughput: 0: 50852.1. Samples: 1522022720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:28:02,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:28:02,820][49750] Updated weights for policy 0, policy_version 230051 (0.0031) [2024-04-26 17:28:05,816][49750] Updated weights for policy 0, policy_version 230061 (0.0034) [2024-04-26 17:28:07,062][49517] Fps is (10 sec: 49153.3, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3769368576. Throughput: 0: 50873.8. Samples: 1522172960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:28:07,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 17:28:09,147][49750] Updated weights for policy 0, policy_version 230071 (0.0027) [2024-04-26 17:28:12,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3769630720. Throughput: 0: 50819.6. Samples: 1522479760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:28:12,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 17:28:12,249][49750] Updated weights for policy 0, policy_version 230081 (0.0029) [2024-04-26 17:28:15,546][49750] Updated weights for policy 0, policy_version 230091 (0.0030) [2024-04-26 17:28:17,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3769892864. Throughput: 0: 50785.4. Samples: 1522786680. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:28:17,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 17:28:18,699][49750] Updated weights for policy 0, policy_version 230101 (0.0029) [2024-04-26 17:28:22,033][49750] Updated weights for policy 0, policy_version 230111 (0.0034) [2024-04-26 17:28:22,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3770138624. Throughput: 0: 50758.5. Samples: 1522935460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-04-26 17:28:22,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 17:28:22,640][49728] Signal inference workers to stop experience collection... (22700 times) [2024-04-26 17:28:22,641][49728] Signal inference workers to resume experience collection... (22700 times) [2024-04-26 17:28:22,656][49750] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-04-26 17:28:22,656][49750] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-04-26 17:28:25,172][49750] Updated weights for policy 0, policy_version 230121 (0.0036) [2024-04-26 17:28:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3770400768. Throughput: 0: 50689.8. Samples: 1523241480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:28:27,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 17:28:28,464][49750] Updated weights for policy 0, policy_version 230131 (0.0031) [2024-04-26 17:28:31,631][49750] Updated weights for policy 0, policy_version 230141 (0.0032) [2024-04-26 17:28:32,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3770646528. Throughput: 0: 50890.7. Samples: 1523551440. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:28:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 17:28:34,773][49750] Updated weights for policy 0, policy_version 230151 (0.0032) [2024-04-26 17:28:37,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3770908672. Throughput: 0: 50784.5. Samples: 1523707520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:28:37,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 17:28:37,972][49750] Updated weights for policy 0, policy_version 230161 (0.0027) [2024-04-26 17:28:41,092][49750] Updated weights for policy 0, policy_version 230171 (0.0026) [2024-04-26 17:28:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3771154432. Throughput: 0: 50824.1. Samples: 1524009840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:28:42,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 17:28:42,200][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000230174_3771170816.pth... [2024-04-26 17:28:42,249][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000229431_3758997504.pth [2024-04-26 17:28:44,540][49750] Updated weights for policy 0, policy_version 230181 (0.0029) [2024-04-26 17:28:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3771400192. Throughput: 0: 50915.6. Samples: 1524313920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:28:47,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 17:28:47,515][49750] Updated weights for policy 0, policy_version 230191 (0.0032) [2024-04-26 17:28:51,047][49750] Updated weights for policy 0, policy_version 230201 (0.0035) [2024-04-26 17:28:52,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3771662336. Throughput: 0: 50765.6. Samples: 1524457420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:28:52,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 17:28:54,019][49750] Updated weights for policy 0, policy_version 230211 (0.0027) [2024-04-26 17:28:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.6, 300 sec: 50762.6). Total num frames: 3771908096. Throughput: 0: 50690.0. Samples: 1524760800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:28:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 17:28:57,455][49750] Updated weights for policy 0, policy_version 230221 (0.0024) [2024-04-26 17:29:00,436][49750] Updated weights for policy 0, policy_version 230231 (0.0029) [2024-04-26 17:29:02,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 3772186624. Throughput: 0: 50696.7. Samples: 1525068040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:29:02,064][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 17:29:03,927][49750] Updated weights for policy 0, policy_version 230241 (0.0033) [2024-04-26 17:29:06,837][49750] Updated weights for policy 0, policy_version 230251 (0.0029) [2024-04-26 17:29:07,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3772432384. Throughput: 0: 50745.5. Samples: 1525219000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:29:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 17:29:10,432][49750] Updated weights for policy 0, policy_version 230261 (0.0033) [2024-04-26 17:29:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3772678144. Throughput: 0: 50716.8. Samples: 1525523740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:29:12,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 17:29:13,310][49750] Updated weights for policy 0, policy_version 230271 (0.0033) [2024-04-26 17:29:14,672][49728] Signal inference workers to stop experience collection... (22750 times) [2024-04-26 17:29:14,673][49728] Signal inference workers to resume experience collection... (22750 times) [2024-04-26 17:29:14,692][49750] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-04-26 17:29:14,692][49750] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-04-26 17:29:16,966][49750] Updated weights for policy 0, policy_version 230281 (0.0027) [2024-04-26 17:29:17,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3772923904. Throughput: 0: 50471.5. Samples: 1525822660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:29:17,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 17:29:20,259][49750] Updated weights for policy 0, policy_version 230291 (0.0028) [2024-04-26 17:29:22,063][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3773186048. Throughput: 0: 50372.9. Samples: 1525974300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:29:22,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 17:29:23,394][49750] Updated weights for policy 0, policy_version 230301 (0.0029) [2024-04-26 17:29:26,916][49750] Updated weights for policy 0, policy_version 230311 (0.0032) [2024-04-26 17:29:27,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3773415424. Throughput: 0: 50470.7. Samples: 1526281020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 17:29:27,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 17:29:29,884][49750] Updated weights for policy 0, policy_version 230321 (0.0031) [2024-04-26 17:29:32,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3773677568. Throughput: 0: 50362.2. Samples: 1526580220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:29:32,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 17:29:33,398][49750] Updated weights for policy 0, policy_version 230331 (0.0034) [2024-04-26 17:29:36,402][49750] Updated weights for policy 0, policy_version 230341 (0.0036) [2024-04-26 17:29:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3773923328. Throughput: 0: 50487.7. Samples: 1526729360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:29:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 17:29:39,724][49750] Updated weights for policy 0, policy_version 230351 (0.0030) [2024-04-26 17:29:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3774185472. Throughput: 0: 50694.1. Samples: 1527042040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:29:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 17:29:43,023][49750] Updated weights for policy 0, policy_version 230361 (0.0027) [2024-04-26 17:29:46,049][49750] Updated weights for policy 0, policy_version 230371 (0.0027) [2024-04-26 17:29:47,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3774447616. Throughput: 0: 50650.4. Samples: 1527347300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:29:47,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 17:29:49,387][49750] Updated weights for policy 0, policy_version 230381 (0.0035) [2024-04-26 17:29:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3774709760. Throughput: 0: 50548.4. Samples: 1527493680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:29:52,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 17:29:52,646][49750] Updated weights for policy 0, policy_version 230391 (0.0033) [2024-04-26 17:29:55,771][49750] Updated weights for policy 0, policy_version 230401 (0.0033) [2024-04-26 17:29:57,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3774939136. Throughput: 0: 50653.5. Samples: 1527803140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:29:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 17:29:59,226][49750] Updated weights for policy 0, policy_version 230411 (0.0030) [2024-04-26 17:30:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.4, 300 sec: 50651.6). Total num frames: 3775184896. Throughput: 0: 50678.2. Samples: 1528103180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:30:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 17:30:02,297][49750] Updated weights for policy 0, policy_version 230421 (0.0026) [2024-04-26 17:30:05,692][49750] Updated weights for policy 0, policy_version 230431 (0.0034) [2024-04-26 17:30:06,374][49728] Signal inference workers to stop experience collection... (22800 times) [2024-04-26 17:30:06,420][49750] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-04-26 17:30:06,481][49728] Signal inference workers to resume experience collection... 
(22800 times) [2024-04-26 17:30:06,482][49750] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-04-26 17:30:07,062][49517] Fps is (10 sec: 54067.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3775479808. Throughput: 0: 50709.9. Samples: 1528256240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:30:07,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 17:30:08,738][49750] Updated weights for policy 0, policy_version 230441 (0.0032) [2024-04-26 17:30:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 3775692800. Throughput: 0: 50682.7. Samples: 1528561740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:30:12,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 17:30:12,199][49750] Updated weights for policy 0, policy_version 230451 (0.0034) [2024-04-26 17:30:15,330][49750] Updated weights for policy 0, policy_version 230461 (0.0029) [2024-04-26 17:30:17,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3775954944. Throughput: 0: 50744.1. Samples: 1528863700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:30:17,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 17:30:18,511][49750] Updated weights for policy 0, policy_version 230471 (0.0033) [2024-04-26 17:30:21,683][49750] Updated weights for policy 0, policy_version 230481 (0.0026) [2024-04-26 17:30:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3776217088. Throughput: 0: 50693.3. Samples: 1529010560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:30:22,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:30:25,164][49750] Updated weights for policy 0, policy_version 230491 (0.0036) [2024-04-26 17:30:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3776462848. Throughput: 0: 50596.1. Samples: 1529318860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:30:27,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 17:30:28,077][49750] Updated weights for policy 0, policy_version 230501 (0.0031) [2024-04-26 17:30:31,660][49750] Updated weights for policy 0, policy_version 230511 (0.0033) [2024-04-26 17:30:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3776724992. Throughput: 0: 50814.2. Samples: 1529633940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:30:32,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 17:30:34,440][49750] Updated weights for policy 0, policy_version 230521 (0.0033) [2024-04-26 17:30:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3776970752. Throughput: 0: 50648.5. Samples: 1529772860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-26 17:30:37,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 17:30:38,067][49750] Updated weights for policy 0, policy_version 230531 (0.0030) [2024-04-26 17:30:41,024][49750] Updated weights for policy 0, policy_version 230541 (0.0032) [2024-04-26 17:30:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3777232896. Throughput: 0: 50634.8. Samples: 1530081700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:30:42,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 17:30:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000230544_3777232896.pth... [2024-04-26 17:30:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000229802_3765075968.pth [2024-04-26 17:30:44,382][49750] Updated weights for policy 0, policy_version 230551 (0.0032) [2024-04-26 17:30:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 3777462272. Throughput: 0: 50767.2. Samples: 1530387700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:30:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 17:30:47,456][49750] Updated weights for policy 0, policy_version 230561 (0.0033) [2024-04-26 17:30:50,775][49750] Updated weights for policy 0, policy_version 230571 (0.0029) [2024-04-26 17:30:52,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3777757184. Throughput: 0: 50653.7. Samples: 1530535660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:30:52,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-26 17:30:53,918][49750] Updated weights for policy 0, policy_version 230581 (0.0042) [2024-04-26 17:30:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3777986560. Throughput: 0: 50575.5. Samples: 1530837640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:30:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 17:30:57,330][49750] Updated weights for policy 0, policy_version 230591 (0.0035) [2024-04-26 17:31:00,493][49750] Updated weights for policy 0, policy_version 230601 (0.0030) [2024-04-26 17:31:02,063][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3778248704. Throughput: 0: 50660.3. Samples: 1531143420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 17:31:03,636][49750] Updated weights for policy 0, policy_version 230611 (0.0041) [2024-04-26 17:31:06,749][49750] Updated weights for policy 0, policy_version 230621 (0.0034) [2024-04-26 17:31:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3778494464. Throughput: 0: 50639.2. Samples: 1531289320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:07,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 17:31:10,058][49750] Updated weights for policy 0, policy_version 230631 (0.0030) [2024-04-26 17:31:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3778756608. Throughput: 0: 50742.0. Samples: 1531602260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:12,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:31:13,186][49750] Updated weights for policy 0, policy_version 230641 (0.0032) [2024-04-26 17:31:16,555][49728] Signal inference workers to stop experience collection... (22850 times) [2024-04-26 17:31:16,555][49728] Signal inference workers to resume experience collection... (22850 times) [2024-04-26 17:31:16,590][49750] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-04-26 17:31:16,590][49750] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-04-26 17:31:16,692][49750] Updated weights for policy 0, policy_version 230651 (0.0032) [2024-04-26 17:31:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3779002368. Throughput: 0: 50657.8. Samples: 1531913540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:17,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 17:31:19,517][49750] Updated weights for policy 0, policy_version 230661 (0.0031) [2024-04-26 17:31:22,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3779248128. Throughput: 0: 50768.0. Samples: 1532057420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:22,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 17:31:23,228][49750] Updated weights for policy 0, policy_version 230671 (0.0030) [2024-04-26 17:31:25,907][49750] Updated weights for policy 0, policy_version 230681 (0.0031) [2024-04-26 17:31:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3779510272. Throughput: 0: 50665.3. Samples: 1532361640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:27,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 17:31:29,578][49750] Updated weights for policy 0, policy_version 230691 (0.0031) [2024-04-26 17:31:32,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3779772416. Throughput: 0: 50731.9. Samples: 1532670640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:32,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 17:31:32,355][49750] Updated weights for policy 0, policy_version 230701 (0.0031) [2024-04-26 17:31:36,038][49750] Updated weights for policy 0, policy_version 230711 (0.0033) [2024-04-26 17:31:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3780034560. Throughput: 0: 50840.9. Samples: 1532823500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:37,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 17:31:38,942][49750] Updated weights for policy 0, policy_version 230721 (0.0034) [2024-04-26 17:31:42,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3780263936. Throughput: 0: 50901.9. Samples: 1533128220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 17:31:42,318][49750] Updated weights for policy 0, policy_version 230731 (0.0034) [2024-04-26 17:31:45,430][49750] Updated weights for policy 0, policy_version 230741 (0.0029) [2024-04-26 17:31:47,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.6, 300 sec: 50762.7). Total num frames: 3780542464. Throughput: 0: 50802.5. Samples: 1533429520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:31:47,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:31:48,772][49750] Updated weights for policy 0, policy_version 230751 (0.0037) [2024-04-26 17:31:51,921][49750] Updated weights for policy 0, policy_version 230761 (0.0028) [2024-04-26 17:31:52,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3780788224. Throughput: 0: 50965.9. Samples: 1533582800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:31:52,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 17:31:55,258][49750] Updated weights for policy 0, policy_version 230771 (0.0033) [2024-04-26 17:31:57,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3781033984. Throughput: 0: 50744.5. Samples: 1533885760. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:31:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 17:31:58,232][49750] Updated weights for policy 0, policy_version 230781 (0.0035) [2024-04-26 17:32:01,539][49750] Updated weights for policy 0, policy_version 230791 (0.0033) [2024-04-26 17:32:02,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3781296128. Throughput: 0: 50659.0. Samples: 1534193200. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:02,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 17:32:04,643][49750] Updated weights for policy 0, policy_version 230801 (0.0032) [2024-04-26 17:32:07,063][49517] Fps is (10 sec: 50787.8, 60 sec: 50789.8, 300 sec: 50707.0). Total num frames: 3781541888. Throughput: 0: 50858.4. Samples: 1534346080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:07,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 17:32:07,967][49750] Updated weights for policy 0, policy_version 230811 (0.0028) [2024-04-26 17:32:11,071][49750] Updated weights for policy 0, policy_version 230821 (0.0034) [2024-04-26 17:32:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3781787648. Throughput: 0: 50797.6. Samples: 1534647540. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:12,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 17:32:14,543][49750] Updated weights for policy 0, policy_version 230831 (0.0035) [2024-04-26 17:32:17,062][49517] Fps is (10 sec: 52432.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3782066176. Throughput: 0: 50666.4. Samples: 1534950620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:17,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 17:32:17,459][49750] Updated weights for policy 0, policy_version 230841 (0.0037) [2024-04-26 17:32:20,879][49750] Updated weights for policy 0, policy_version 230851 (0.0035) [2024-04-26 17:32:22,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3782295552. Throughput: 0: 50807.2. Samples: 1535109820. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:32:23,842][49750] Updated weights for policy 0, policy_version 230861 (0.0031) [2024-04-26 17:32:25,823][49728] Signal inference workers to stop experience collection... (22900 times) [2024-04-26 17:32:25,824][49728] Signal inference workers to resume experience collection... (22900 times) [2024-04-26 17:32:25,851][49750] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-04-26 17:32:25,851][49750] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-04-26 17:32:27,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3782557696. Throughput: 0: 50858.1. Samples: 1535416840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 17:32:27,347][49750] Updated weights for policy 0, policy_version 230871 (0.0030) [2024-04-26 17:32:30,307][49750] Updated weights for policy 0, policy_version 230881 (0.0032) [2024-04-26 17:32:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3782803456. Throughput: 0: 50885.7. Samples: 1535719380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:32,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 17:32:33,819][49750] Updated weights for policy 0, policy_version 230891 (0.0034) [2024-04-26 17:32:36,844][49750] Updated weights for policy 0, policy_version 230901 (0.0033) [2024-04-26 17:32:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3783081984. Throughput: 0: 50866.9. Samples: 1535871800. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:37,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 17:32:40,320][49750] Updated weights for policy 0, policy_version 230911 (0.0028) [2024-04-26 17:32:42,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3783327744. Throughput: 0: 50717.7. Samples: 1536168060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:42,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 17:32:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000230916_3783327744.pth... [2024-04-26 17:32:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000230174_3771170816.pth [2024-04-26 17:32:43,605][49750] Updated weights for policy 0, policy_version 230921 (0.0037) [2024-04-26 17:32:46,632][49750] Updated weights for policy 0, policy_version 230931 (0.0031) [2024-04-26 17:32:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3783573504. Throughput: 0: 50604.1. Samples: 1536470380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:47,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 17:32:50,093][49750] Updated weights for policy 0, policy_version 230941 (0.0034) [2024-04-26 17:32:52,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3783835648. Throughput: 0: 50618.9. Samples: 1536623900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:52,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 17:32:53,004][49750] Updated weights for policy 0, policy_version 230951 (0.0031) [2024-04-26 17:32:56,489][49750] Updated weights for policy 0, policy_version 230961 (0.0032) [2024-04-26 17:32:57,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3784065024. Throughput: 0: 50641.7. Samples: 1536926420. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 17:32:57,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 17:32:59,431][49750] Updated weights for policy 0, policy_version 230971 (0.0032) [2024-04-26 17:33:02,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3784359936. Throughput: 0: 50712.8. Samples: 1537232700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:02,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 17:33:03,096][49750] Updated weights for policy 0, policy_version 230981 (0.0033) [2024-04-26 17:33:05,951][49750] Updated weights for policy 0, policy_version 230991 (0.0031) [2024-04-26 17:33:07,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.9, 300 sec: 50651.6). Total num frames: 3784572928. Throughput: 0: 50546.6. Samples: 1537384420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:07,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 17:33:09,696][49750] Updated weights for policy 0, policy_version 231001 (0.0030) [2024-04-26 17:33:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3784851456. Throughput: 0: 50684.5. Samples: 1537697640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:12,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 17:33:12,734][49750] Updated weights for policy 0, policy_version 231011 (0.0032) [2024-04-26 17:33:16,097][49750] Updated weights for policy 0, policy_version 231021 (0.0036) [2024-04-26 17:33:17,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.1, 300 sec: 50651.6). Total num frames: 3785080832. Throughput: 0: 50699.3. Samples: 1538000860. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 17:33:19,667][49750] Updated weights for policy 0, policy_version 231031 (0.0029) [2024-04-26 17:33:19,969][49728] Signal inference workers to stop experience collection... (22950 times) [2024-04-26 17:33:20,019][49750] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-04-26 17:33:20,031][49728] Signal inference workers to resume experience collection... (22950 times) [2024-04-26 17:33:20,039][49750] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-04-26 17:33:22,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3785359360. Throughput: 0: 50727.6. Samples: 1538154540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:22,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 17:33:22,471][49750] Updated weights for policy 0, policy_version 231041 (0.0037) [2024-04-26 17:33:26,298][49750] Updated weights for policy 0, policy_version 231051 (0.0025) [2024-04-26 17:33:27,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3785605120. Throughput: 0: 50787.6. Samples: 1538453500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:27,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 17:33:29,169][49750] Updated weights for policy 0, policy_version 231061 (0.0037) [2024-04-26 17:33:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 3785834496. Throughput: 0: 50648.5. Samples: 1538749560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:32,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 17:33:32,638][49750] Updated weights for policy 0, policy_version 231071 (0.0027) [2024-04-26 17:33:35,630][49750] Updated weights for policy 0, policy_version 231081 (0.0040) [2024-04-26 17:33:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3786113024. Throughput: 0: 50515.9. Samples: 1538897120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 17:33:38,976][49750] Updated weights for policy 0, policy_version 231091 (0.0033) [2024-04-26 17:33:41,955][49750] Updated weights for policy 0, policy_version 231101 (0.0035) [2024-04-26 17:33:42,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3786358784. Throughput: 0: 50711.7. Samples: 1539208440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:33:45,386][49750] Updated weights for policy 0, policy_version 231111 (0.0028) [2024-04-26 17:33:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3786637312. Throughput: 0: 50837.7. Samples: 1539520400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:47,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 17:33:48,557][49750] Updated weights for policy 0, policy_version 231121 (0.0037) [2024-04-26 17:33:51,837][49750] Updated weights for policy 0, policy_version 231131 (0.0030) [2024-04-26 17:33:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3786866688. Throughput: 0: 50740.4. Samples: 1539667740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:52,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 17:33:55,155][49750] Updated weights for policy 0, policy_version 231141 (0.0033) [2024-04-26 17:33:57,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 3787145216. Throughput: 0: 50635.4. Samples: 1539976240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:33:57,063][49517] Avg episode reward: [(0, '0.721')] [2024-04-26 17:33:57,063][49728] Saving new best policy, reward=0.721! [2024-04-26 17:33:58,092][49750] Updated weights for policy 0, policy_version 231151 (0.0028) [2024-04-26 17:34:01,513][49750] Updated weights for policy 0, policy_version 231161 (0.0033) [2024-04-26 17:34:02,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3787374592. Throughput: 0: 50785.9. Samples: 1540286220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:34:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 17:34:04,409][49750] Updated weights for policy 0, policy_version 231171 (0.0034) [2024-04-26 17:34:07,062][49517] Fps is (10 sec: 49153.1, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3787636736. Throughput: 0: 50595.7. Samples: 1540431340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 17:34:07,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 17:34:07,921][49750] Updated weights for policy 0, policy_version 231181 (0.0026) [2024-04-26 17:34:10,957][49750] Updated weights for policy 0, policy_version 231191 (0.0030) [2024-04-26 17:34:12,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3787898880. Throughput: 0: 50798.4. Samples: 1540739420. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:12,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 17:34:14,277][49750] Updated weights for policy 0, policy_version 231201 (0.0030) [2024-04-26 17:34:17,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3788128256. Throughput: 0: 50885.2. Samples: 1541039400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:17,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 17:34:17,596][49750] Updated weights for policy 0, policy_version 231211 (0.0033) [2024-04-26 17:34:20,799][49750] Updated weights for policy 0, policy_version 231221 (0.0041) [2024-04-26 17:34:22,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3788406784. Throughput: 0: 50940.4. Samples: 1541189440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 17:34:24,040][49750] Updated weights for policy 0, policy_version 231231 (0.0031) [2024-04-26 17:34:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3788636160. Throughput: 0: 50841.3. Samples: 1541496300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:27,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 17:34:27,247][49750] Updated weights for policy 0, policy_version 231241 (0.0034) [2024-04-26 17:34:30,412][49750] Updated weights for policy 0, policy_version 231251 (0.0028) [2024-04-26 17:34:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3788898304. Throughput: 0: 50736.2. Samples: 1541803520. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 17:34:33,626][49750] Updated weights for policy 0, policy_version 231261 (0.0031) [2024-04-26 17:34:36,855][49750] Updated weights for policy 0, policy_version 231271 (0.0032) [2024-04-26 17:34:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3789160448. Throughput: 0: 50740.5. Samples: 1541951060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:37,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:34:39,961][49750] Updated weights for policy 0, policy_version 231281 (0.0029) [2024-04-26 17:34:42,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3789422592. Throughput: 0: 50714.8. Samples: 1542258400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:42,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:34:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000231288_3789422592.pth... [2024-04-26 17:34:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000230544_3777232896.pth [2024-04-26 17:34:43,246][49750] Updated weights for policy 0, policy_version 231291 (0.0032) [2024-04-26 17:34:43,272][49728] Signal inference workers to stop experience collection... (23000 times) [2024-04-26 17:34:43,272][49728] Signal inference workers to resume experience collection... (23000 times) [2024-04-26 17:34:43,297][49750] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-04-26 17:34:43,298][49750] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-04-26 17:34:46,549][49750] Updated weights for policy 0, policy_version 231301 (0.0036) [2024-04-26 17:34:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3789684736. Throughput: 0: 50777.5. Samples: 1542571200. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 17:34:49,542][49750] Updated weights for policy 0, policy_version 231311 (0.0028) [2024-04-26 17:34:52,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3789897728. Throughput: 0: 50800.0. Samples: 1542717340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:52,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 17:34:53,011][49750] Updated weights for policy 0, policy_version 231321 (0.0039) [2024-04-26 17:34:55,890][49750] Updated weights for policy 0, policy_version 231331 (0.0030) [2024-04-26 17:34:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.5, 300 sec: 50762.6). Total num frames: 3790159872. Throughput: 0: 50671.6. Samples: 1543019640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:34:57,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 17:34:59,492][49750] Updated weights for policy 0, policy_version 231341 (0.0033) [2024-04-26 17:35:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3790422016. Throughput: 0: 50608.5. Samples: 1543316780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:35:02,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 17:35:02,356][49750] Updated weights for policy 0, policy_version 231351 (0.0029) [2024-04-26 17:35:05,804][49750] Updated weights for policy 0, policy_version 231361 (0.0039) [2024-04-26 17:35:07,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3790684160. Throughput: 0: 50861.3. Samples: 1543478200. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:35:07,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 17:35:08,889][49750] Updated weights for policy 0, policy_version 231371 (0.0028) [2024-04-26 17:35:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3790913536. Throughput: 0: 50847.5. Samples: 1543784440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 17:35:12,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 17:35:12,260][49750] Updated weights for policy 0, policy_version 231381 (0.0031) [2024-04-26 17:35:15,268][49750] Updated weights for policy 0, policy_version 231391 (0.0038) [2024-04-26 17:35:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3791175680. Throughput: 0: 50786.7. Samples: 1544088920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:35:17,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 17:35:18,643][49750] Updated weights for policy 0, policy_version 231401 (0.0035) [2024-04-26 17:35:21,847][49750] Updated weights for policy 0, policy_version 231411 (0.0032) [2024-04-26 17:35:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3791437824. Throughput: 0: 50791.1. Samples: 1544236660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:35:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 17:35:24,998][49750] Updated weights for policy 0, policy_version 231421 (0.0031) [2024-04-26 17:35:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3791699968. Throughput: 0: 50850.7. Samples: 1544546680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:35:27,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 17:35:28,372][49750] Updated weights for policy 0, policy_version 231431 (0.0033) [2024-04-26 17:35:31,574][49750] Updated weights for policy 0, policy_version 231441 (0.0032) [2024-04-26 17:35:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3791945728. Throughput: 0: 50614.2. Samples: 1544848840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:35:32,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 17:35:34,674][49750] Updated weights for policy 0, policy_version 231451 (0.0033) [2024-04-26 17:35:37,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3792207872. Throughput: 0: 50771.1. Samples: 1545002040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:35:37,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 17:35:38,026][49750] Updated weights for policy 0, policy_version 231461 (0.0032) [2024-04-26 17:35:41,006][49750] Updated weights for policy 0, policy_version 231471 (0.0030) [2024-04-26 17:35:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3792437248. Throughput: 0: 50653.1. Samples: 1545299040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:35:42,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 17:35:44,622][49750] Updated weights for policy 0, policy_version 231481 (0.0039) [2024-04-26 17:35:47,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3792715776. Throughput: 0: 50887.6. Samples: 1545606720. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:35:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 17:35:47,579][49750] Updated weights for policy 0, policy_version 231491 (0.0038) [2024-04-26 17:35:50,956][49750] Updated weights for policy 0, policy_version 231501 (0.0027) [2024-04-26 17:35:52,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3792977920. Throughput: 0: 50775.2. Samples: 1545763080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:35:52,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 17:35:54,016][49750] Updated weights for policy 0, policy_version 231511 (0.0032) [2024-04-26 17:35:57,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3793207296. Throughput: 0: 50780.4. Samples: 1546069560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:35:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 17:35:57,388][49728] Signal inference workers to stop experience collection... (23050 times) [2024-04-26 17:35:57,388][49728] Signal inference workers to resume experience collection... (23050 times) [2024-04-26 17:35:57,414][49750] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-04-26 17:35:57,414][49750] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-04-26 17:35:57,517][49750] Updated weights for policy 0, policy_version 231521 (0.0030) [2024-04-26 17:36:00,322][49750] Updated weights for policy 0, policy_version 231531 (0.0028) [2024-04-26 17:36:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3793469440. Throughput: 0: 50792.1. Samples: 1546374560. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:36:02,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 17:36:03,861][49750] Updated weights for policy 0, policy_version 231541 (0.0031) [2024-04-26 17:36:06,625][49750] Updated weights for policy 0, policy_version 231551 (0.0032) [2024-04-26 17:36:07,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3793731584. Throughput: 0: 50900.7. Samples: 1546527200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:36:07,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 17:36:10,302][49750] Updated weights for policy 0, policy_version 231561 (0.0030) [2024-04-26 17:36:12,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3793993728. Throughput: 0: 50745.7. Samples: 1546830240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:36:12,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 17:36:13,215][49750] Updated weights for policy 0, policy_version 231571 (0.0032) [2024-04-26 17:36:16,716][49750] Updated weights for policy 0, policy_version 231581 (0.0030) [2024-04-26 17:36:17,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3794223104. Throughput: 0: 50901.0. Samples: 1547139400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:36:17,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 17:36:19,804][49750] Updated weights for policy 0, policy_version 231591 (0.0030) [2024-04-26 17:36:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3794485248. Throughput: 0: 50718.6. Samples: 1547284380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 17:36:22,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 17:36:23,090][49750] Updated weights for policy 0, policy_version 231601 (0.0030) [2024-04-26 17:36:26,385][49750] Updated weights for policy 0, policy_version 231611 (0.0032) [2024-04-26 17:36:27,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3794731008. Throughput: 0: 50898.8. Samples: 1547589480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:36:27,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:36:29,493][49750] Updated weights for policy 0, policy_version 231621 (0.0035) [2024-04-26 17:36:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3795009536. Throughput: 0: 50722.6. Samples: 1547889240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:36:32,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 17:36:33,047][49750] Updated weights for policy 0, policy_version 231631 (0.0033) [2024-04-26 17:36:35,912][49750] Updated weights for policy 0, policy_version 231641 (0.0034) [2024-04-26 17:36:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3795238912. Throughput: 0: 50815.6. Samples: 1548049780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:36:37,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 17:36:39,544][49750] Updated weights for policy 0, policy_version 231651 (0.0049) [2024-04-26 17:36:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3795501056. Throughput: 0: 50728.6. Samples: 1548352340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:36:42,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 17:36:42,079][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000231660_3795517440.pth... 
[2024-04-26 17:36:42,133][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000230916_3783327744.pth [2024-04-26 17:36:42,401][49750] Updated weights for policy 0, policy_version 231661 (0.0033) [2024-04-26 17:36:46,073][49750] Updated weights for policy 0, policy_version 231671 (0.0029) [2024-04-26 17:36:47,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3795746816. Throughput: 0: 50811.7. Samples: 1548661100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:36:47,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:36:48,800][49750] Updated weights for policy 0, policy_version 231681 (0.0037) [2024-04-26 17:36:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3796008960. Throughput: 0: 50665.5. Samples: 1548807140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:36:52,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 17:36:52,494][49750] Updated weights for policy 0, policy_version 231691 (0.0039) [2024-04-26 17:36:55,248][49750] Updated weights for policy 0, policy_version 231701 (0.0030) [2024-04-26 17:36:57,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3796271104. Throughput: 0: 50673.5. Samples: 1549110540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:36:57,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:36:58,763][49750] Updated weights for policy 0, policy_version 231711 (0.0028) [2024-04-26 17:37:01,928][49750] Updated weights for policy 0, policy_version 231721 (0.0031) [2024-04-26 17:37:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50762.7). Total num frames: 3796516864. Throughput: 0: 50701.6. Samples: 1549420960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:37:02,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:37:05,145][49750] Updated weights for policy 0, policy_version 231731 (0.0031) [2024-04-26 17:37:07,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3796762624. Throughput: 0: 50838.6. Samples: 1549572120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:37:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:37:08,419][49750] Updated weights for policy 0, policy_version 231741 (0.0029) [2024-04-26 17:37:10,257][49728] Signal inference workers to stop experience collection... (23100 times) [2024-04-26 17:37:10,257][49728] Signal inference workers to resume experience collection... (23100 times) [2024-04-26 17:37:10,271][49750] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-04-26 17:37:10,271][49750] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-04-26 17:37:11,495][49750] Updated weights for policy 0, policy_version 231751 (0.0030) [2024-04-26 17:37:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 3797008384. Throughput: 0: 50627.5. Samples: 1549867720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:37:12,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 17:37:14,933][49750] Updated weights for policy 0, policy_version 231761 (0.0032) [2024-04-26 17:37:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.8, 300 sec: 50873.7). Total num frames: 3797303296. Throughput: 0: 50894.4. Samples: 1550179480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:37:17,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 17:37:17,896][49750] Updated weights for policy 0, policy_version 231771 (0.0027) [2024-04-26 17:37:21,248][49750] Updated weights for policy 0, policy_version 231781 (0.0034) [2024-04-26 17:37:22,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3797549056. Throughput: 0: 50791.3. Samples: 1550335400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:37:22,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 17:37:24,238][49750] Updated weights for policy 0, policy_version 231791 (0.0038) [2024-04-26 17:37:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3797794816. Throughput: 0: 50900.5. Samples: 1550642860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:37:27,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 17:37:27,684][49750] Updated weights for policy 0, policy_version 231801 (0.0033) [2024-04-26 17:37:30,951][49750] Updated weights for policy 0, policy_version 231811 (0.0031) [2024-04-26 17:37:32,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3798056960. Throughput: 0: 50891.6. Samples: 1550951220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 17:37:32,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 17:37:34,192][49750] Updated weights for policy 0, policy_version 231821 (0.0027) [2024-04-26 17:37:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3798286336. Throughput: 0: 50860.9. Samples: 1551095880. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:37:37,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 17:37:37,629][49750] Updated weights for policy 0, policy_version 231831 (0.0029) [2024-04-26 17:37:40,489][49750] Updated weights for policy 0, policy_version 231841 (0.0037) [2024-04-26 17:37:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3798564864. Throughput: 0: 50960.8. Samples: 1551403780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:37:42,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 17:37:43,985][49750] Updated weights for policy 0, policy_version 231851 (0.0028) [2024-04-26 17:37:46,814][49750] Updated weights for policy 0, policy_version 231861 (0.0031) [2024-04-26 17:37:47,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 3798827008. Throughput: 0: 50764.9. Samples: 1551705380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:37:47,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 17:37:50,269][49750] Updated weights for policy 0, policy_version 231871 (0.0027) [2024-04-26 17:37:52,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3799040000. Throughput: 0: 50980.8. Samples: 1551866260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:37:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 17:37:53,196][49750] Updated weights for policy 0, policy_version 231881 (0.0031) [2024-04-26 17:37:56,806][49750] Updated weights for policy 0, policy_version 231891 (0.0041) [2024-04-26 17:37:57,063][49517] Fps is (10 sec: 49150.5, 60 sec: 50790.1, 300 sec: 50707.1). Total num frames: 3799318528. Throughput: 0: 51230.4. Samples: 1552173100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:37:57,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 17:37:59,684][49750] Updated weights for policy 0, policy_version 231901 (0.0032) [2024-04-26 17:38:02,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3799564288. Throughput: 0: 50914.9. Samples: 1552470660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:38:02,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 17:38:03,263][49750] Updated weights for policy 0, policy_version 231911 (0.0035) [2024-04-26 17:38:06,018][49750] Updated weights for policy 0, policy_version 231921 (0.0031) [2024-04-26 17:38:07,063][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 3799842816. Throughput: 0: 51003.6. Samples: 1552630560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:38:07,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 17:38:09,678][49750] Updated weights for policy 0, policy_version 231931 (0.0037) [2024-04-26 17:38:12,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3800088576. Throughput: 0: 50984.7. Samples: 1552937180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:38:12,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 17:38:12,308][49750] Updated weights for policy 0, policy_version 231941 (0.0034) [2024-04-26 17:38:16,024][49750] Updated weights for policy 0, policy_version 231951 (0.0032) [2024-04-26 17:38:17,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3800334336. Throughput: 0: 50884.4. Samples: 1553241020. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:38:17,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 17:38:18,711][49750] Updated weights for policy 0, policy_version 231961 (0.0035) [2024-04-26 17:38:22,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3800580096. Throughput: 0: 50897.6. Samples: 1553386280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:38:22,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 17:38:22,557][49750] Updated weights for policy 0, policy_version 231971 (0.0028) [2024-04-26 17:38:25,210][49750] Updated weights for policy 0, policy_version 231981 (0.0031) [2024-04-26 17:38:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3800842240. Throughput: 0: 50790.1. Samples: 1553689340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:38:27,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 17:38:28,974][49750] Updated weights for policy 0, policy_version 231991 (0.0036) [2024-04-26 17:38:29,001][49728] Signal inference workers to stop experience collection... (23150 times) [2024-04-26 17:38:29,001][49728] Signal inference workers to resume experience collection... (23150 times) [2024-04-26 17:38:29,014][49750] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-04-26 17:38:29,014][49750] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-04-26 17:38:31,616][49750] Updated weights for policy 0, policy_version 232001 (0.0030) [2024-04-26 17:38:32,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3801120768. Throughput: 0: 50920.3. Samples: 1553996800. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:38:32,072][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 17:38:35,513][49750] Updated weights for policy 0, policy_version 232011 (0.0027) [2024-04-26 17:38:37,062][49517] Fps is (10 sec: 50791.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3801350144. Throughput: 0: 50949.5. Samples: 1554158980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:38:37,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 17:38:38,050][49750] Updated weights for policy 0, policy_version 232021 (0.0034) [2024-04-26 17:38:41,885][49750] Updated weights for policy 0, policy_version 232031 (0.0028) [2024-04-26 17:38:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3801612288. Throughput: 0: 50838.9. Samples: 1554460840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 17:38:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 17:38:42,081][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000232032_3801612288.pth... [2024-04-26 17:38:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000231288_3789422592.pth [2024-04-26 17:38:44,494][49750] Updated weights for policy 0, policy_version 232041 (0.0028) [2024-04-26 17:38:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 3801841664. Throughput: 0: 50854.9. Samples: 1554759120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:38:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 17:38:48,206][49750] Updated weights for policy 0, policy_version 232051 (0.0033) [2024-04-26 17:38:50,823][49750] Updated weights for policy 0, policy_version 232061 (0.0026) [2024-04-26 17:38:52,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 3802136576. Throughput: 0: 50821.8. Samples: 1554917540. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:38:52,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 17:38:54,716][49750] Updated weights for policy 0, policy_version 232071 (0.0036) [2024-04-26 17:38:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.7, 300 sec: 50818.2). Total num frames: 3802365952. Throughput: 0: 50738.0. Samples: 1555220380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:38:57,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 17:38:57,442][49750] Updated weights for policy 0, policy_version 232081 (0.0035) [2024-04-26 17:39:01,181][49750] Updated weights for policy 0, policy_version 232091 (0.0027) [2024-04-26 17:39:02,063][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 3802628096. Throughput: 0: 50873.8. Samples: 1555530340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:02,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 17:39:03,874][49750] Updated weights for policy 0, policy_version 232101 (0.0030) [2024-04-26 17:39:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3802873856. Throughput: 0: 50867.2. Samples: 1555675300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:07,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 17:39:07,486][49750] Updated weights for policy 0, policy_version 232111 (0.0025) [2024-04-26 17:39:10,207][49750] Updated weights for policy 0, policy_version 232121 (0.0029) [2024-04-26 17:39:12,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3803152384. Throughput: 0: 50895.0. Samples: 1555979600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 17:39:13,883][49750] Updated weights for policy 0, policy_version 232131 (0.0031) [2024-04-26 17:39:16,795][49750] Updated weights for policy 0, policy_version 232141 (0.0029) [2024-04-26 17:39:17,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3803398144. Throughput: 0: 50879.5. Samples: 1556286380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:17,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 17:39:20,487][49750] Updated weights for policy 0, policy_version 232151 (0.0030) [2024-04-26 17:39:22,063][49517] Fps is (10 sec: 49150.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3803643904. Throughput: 0: 50804.6. Samples: 1556445200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:22,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 17:39:23,215][49750] Updated weights for policy 0, policy_version 232161 (0.0028) [2024-04-26 17:39:27,014][49750] Updated weights for policy 0, policy_version 232171 (0.0031) [2024-04-26 17:39:27,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3803889664. Throughput: 0: 50931.6. Samples: 1556752760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:27,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 17:39:29,686][49750] Updated weights for policy 0, policy_version 232181 (0.0032) [2024-04-26 17:39:32,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3804135424. Throughput: 0: 51064.7. Samples: 1557057040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:32,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 17:39:33,455][49750] Updated weights for policy 0, policy_version 232191 (0.0031) [2024-04-26 17:39:36,205][49750] Updated weights for policy 0, policy_version 232201 (0.0030) [2024-04-26 17:39:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3804413952. Throughput: 0: 50817.9. Samples: 1557204340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:37,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 17:39:38,849][49728] Signal inference workers to stop experience collection... (23200 times) [2024-04-26 17:39:38,849][49728] Signal inference workers to resume experience collection... (23200 times) [2024-04-26 17:39:38,879][49750] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-04-26 17:39:38,879][49750] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-04-26 17:39:39,916][49750] Updated weights for policy 0, policy_version 232211 (0.0031) [2024-04-26 17:39:42,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3804676096. Throughput: 0: 50830.9. Samples: 1557507780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:42,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 17:39:42,643][49750] Updated weights for policy 0, policy_version 232221 (0.0036) [2024-04-26 17:39:46,407][49750] Updated weights for policy 0, policy_version 232231 (0.0034) [2024-04-26 17:39:47,062][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3804905472. Throughput: 0: 50679.6. Samples: 1557810920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:47,071][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 17:39:49,086][49750] Updated weights for policy 0, policy_version 232241 (0.0030) [2024-04-26 17:39:52,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50818.1). Total num frames: 3805151232. Throughput: 0: 50659.1. Samples: 1557954960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 17:39:52,072][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 17:39:53,007][49750] Updated weights for policy 0, policy_version 232251 (0.0030) [2024-04-26 17:39:55,439][49750] Updated weights for policy 0, policy_version 232261 (0.0036) [2024-04-26 17:39:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3805413376. Throughput: 0: 50643.0. Samples: 1558258540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:39:57,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 17:39:59,361][49750] Updated weights for policy 0, policy_version 232271 (0.0031) [2024-04-26 17:40:02,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 3805675520. Throughput: 0: 50776.3. Samples: 1558571300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:02,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 17:40:02,118][49750] Updated weights for policy 0, policy_version 232281 (0.0028) [2024-04-26 17:40:05,872][49750] Updated weights for policy 0, policy_version 232291 (0.0034) [2024-04-26 17:40:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3805937664. Throughput: 0: 50703.3. Samples: 1558726840. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:07,072][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 17:40:08,595][49750] Updated weights for policy 0, policy_version 232301 (0.0031) [2024-04-26 17:40:12,062][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.1, 300 sec: 50818.2). Total num frames: 3806167040. Throughput: 0: 50565.7. Samples: 1559028220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 17:40:12,168][49750] Updated weights for policy 0, policy_version 232311 (0.0029) [2024-04-26 17:40:14,930][49750] Updated weights for policy 0, policy_version 232321 (0.0030) [2024-04-26 17:40:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3806429184. Throughput: 0: 50721.0. Samples: 1559339480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:17,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 17:40:18,571][49750] Updated weights for policy 0, policy_version 232331 (0.0025) [2024-04-26 17:40:21,252][49750] Updated weights for policy 0, policy_version 232341 (0.0030) [2024-04-26 17:40:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3806691328. Throughput: 0: 50729.7. Samples: 1559487180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:22,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:40:24,906][49750] Updated weights for policy 0, policy_version 232351 (0.0031) [2024-04-26 17:40:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3806953472. Throughput: 0: 50739.3. Samples: 1559791040. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 17:40:27,854][49750] Updated weights for policy 0, policy_version 232361 (0.0028) [2024-04-26 17:40:31,247][49750] Updated weights for policy 0, policy_version 232371 (0.0028) [2024-04-26 17:40:32,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3807215616. Throughput: 0: 50911.2. Samples: 1560101920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 17:40:34,252][49750] Updated weights for policy 0, policy_version 232381 (0.0026) [2024-04-26 17:40:37,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3807444992. Throughput: 0: 51013.8. Samples: 1560250580. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:37,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 17:40:37,774][49750] Updated weights for policy 0, policy_version 232391 (0.0030) [2024-04-26 17:40:40,656][49750] Updated weights for policy 0, policy_version 232401 (0.0029) [2024-04-26 17:40:42,062][49517] Fps is (10 sec: 47513.0, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3807690752. Throughput: 0: 51057.2. Samples: 1560556120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:42,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 17:40:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000232403_3807690752.pth... [2024-04-26 17:40:42,127][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000231660_3795517440.pth [2024-04-26 17:40:44,200][49750] Updated weights for policy 0, policy_version 232411 (0.0031) [2024-04-26 17:40:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3807969280. Throughput: 0: 50778.4. Samples: 1560856340. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:47,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 17:40:47,094][49750] Updated weights for policy 0, policy_version 232421 (0.0034) [2024-04-26 17:40:50,607][49750] Updated weights for policy 0, policy_version 232431 (0.0028) [2024-04-26 17:40:50,999][49728] Signal inference workers to stop experience collection... (23250 times) [2024-04-26 17:40:51,000][49728] Signal inference workers to resume experience collection... (23250 times) [2024-04-26 17:40:51,016][49750] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-04-26 17:40:51,016][49750] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-04-26 17:40:52,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3808215040. Throughput: 0: 50882.5. Samples: 1561016560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:52,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 17:40:53,457][49750] Updated weights for policy 0, policy_version 232441 (0.0030) [2024-04-26 17:40:57,007][49750] Updated weights for policy 0, policy_version 232451 (0.0029) [2024-04-26 17:40:57,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3808477184. Throughput: 0: 51020.7. Samples: 1561324160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:40:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 17:41:00,061][49750] Updated weights for policy 0, policy_version 232461 (0.0033) [2024-04-26 17:41:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.1, 300 sec: 50818.1). Total num frames: 3808722944. Throughput: 0: 50975.7. Samples: 1561633400. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:02,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 17:41:03,330][49750] Updated weights for policy 0, policy_version 232471 (0.0031) [2024-04-26 17:41:06,551][49750] Updated weights for policy 0, policy_version 232481 (0.0029) [2024-04-26 17:41:07,063][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3808985088. Throughput: 0: 50955.0. Samples: 1561780160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:07,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:41:09,656][49750] Updated weights for policy 0, policy_version 232491 (0.0031) [2024-04-26 17:41:12,062][49517] Fps is (10 sec: 50791.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3809230848. Throughput: 0: 50841.2. Samples: 1562078900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:12,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 17:41:13,006][49750] Updated weights for policy 0, policy_version 232501 (0.0033) [2024-04-26 17:41:16,104][49750] Updated weights for policy 0, policy_version 232511 (0.0033) [2024-04-26 17:41:17,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 3809509376. Throughput: 0: 50741.6. Samples: 1562385300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 17:41:19,494][49750] Updated weights for policy 0, policy_version 232521 (0.0027) [2024-04-26 17:41:22,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3809738752. Throughput: 0: 50919.5. Samples: 1562541960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:22,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 17:41:22,565][49750] Updated weights for policy 0, policy_version 232531 (0.0028) [2024-04-26 17:41:25,998][49750] Updated weights for policy 0, policy_version 232541 (0.0035) [2024-04-26 17:41:27,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3810000896. Throughput: 0: 50937.9. Samples: 1562848320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:41:28,810][49750] Updated weights for policy 0, policy_version 232551 (0.0029) [2024-04-26 17:41:32,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3810246656. Throughput: 0: 51058.8. Samples: 1563153980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:32,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 17:41:32,352][49750] Updated weights for policy 0, policy_version 232561 (0.0029) [2024-04-26 17:41:35,220][49750] Updated weights for policy 0, policy_version 232571 (0.0028) [2024-04-26 17:41:37,062][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 3810525184. Throughput: 0: 51108.1. Samples: 1563316420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 17:41:38,696][49750] Updated weights for policy 0, policy_version 232581 (0.0031) [2024-04-26 17:41:41,925][49750] Updated weights for policy 0, policy_version 232591 (0.0038) [2024-04-26 17:41:42,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 3810770944. Throughput: 0: 51093.1. Samples: 1563623340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 17:41:45,072][49750] Updated weights for policy 0, policy_version 232601 (0.0034) [2024-04-26 17:41:47,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3810983936. Throughput: 0: 50778.2. Samples: 1563918400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:47,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 17:41:48,548][49750] Updated weights for policy 0, policy_version 232611 (0.0028) [2024-04-26 17:41:51,620][49750] Updated weights for policy 0, policy_version 232621 (0.0029) [2024-04-26 17:41:52,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3811278848. Throughput: 0: 50819.6. Samples: 1564067040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:52,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 17:41:54,792][49750] Updated weights for policy 0, policy_version 232631 (0.0031) [2024-04-26 17:41:57,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3811524608. Throughput: 0: 51047.1. Samples: 1564376020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:41:57,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 17:41:58,260][49750] Updated weights for policy 0, policy_version 232641 (0.0033) [2024-04-26 17:42:01,349][49750] Updated weights for policy 0, policy_version 232651 (0.0031) [2024-04-26 17:42:02,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3811770368. Throughput: 0: 50900.3. Samples: 1564675820. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:42:02,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 17:42:04,825][49750] Updated weights for policy 0, policy_version 232661 (0.0032) [2024-04-26 17:42:07,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 3812032512. Throughput: 0: 50956.5. Samples: 1564835000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:42:07,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 17:42:07,128][49728] Signal inference workers to stop experience collection... (23300 times) [2024-04-26 17:42:07,133][49728] Signal inference workers to resume experience collection... (23300 times) [2024-04-26 17:42:07,159][49750] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-04-26 17:42:07,159][49750] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-04-26 17:42:07,716][49750] Updated weights for policy 0, policy_version 232671 (0.0029) [2024-04-26 17:42:11,215][49750] Updated weights for policy 0, policy_version 232681 (0.0033) [2024-04-26 17:42:12,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3812294656. Throughput: 0: 50858.9. Samples: 1565136980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 17:42:14,139][49750] Updated weights for policy 0, policy_version 232691 (0.0036) [2024-04-26 17:42:17,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 3812524032. Throughput: 0: 50913.8. Samples: 1565445100. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:17,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 17:42:17,726][49750] Updated weights for policy 0, policy_version 232701 (0.0030) [2024-04-26 17:42:20,437][49750] Updated weights for policy 0, policy_version 232711 (0.0033) [2024-04-26 17:42:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3812786176. Throughput: 0: 50548.5. Samples: 1565591100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 17:42:24,394][49750] Updated weights for policy 0, policy_version 232721 (0.0032) [2024-04-26 17:42:27,002][49750] Updated weights for policy 0, policy_version 232731 (0.0033) [2024-04-26 17:42:27,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3813064704. Throughput: 0: 50592.8. Samples: 1565900020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 17:42:30,748][49750] Updated weights for policy 0, policy_version 232741 (0.0027) [2024-04-26 17:42:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3813277696. Throughput: 0: 50736.0. Samples: 1566201520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 17:42:33,552][49750] Updated weights for policy 0, policy_version 232751 (0.0032) [2024-04-26 17:42:37,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3813539840. Throughput: 0: 50671.7. Samples: 1566347260. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 17:42:37,170][49750] Updated weights for policy 0, policy_version 232761 (0.0035) [2024-04-26 17:42:39,889][49750] Updated weights for policy 0, policy_version 232771 (0.0032) [2024-04-26 17:42:42,063][49517] Fps is (10 sec: 52427.2, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 3813801984. Throughput: 0: 50662.5. Samples: 1566655840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 17:42:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000232776_3813801984.pth... [2024-04-26 17:42:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000232032_3801612288.pth [2024-04-26 17:42:43,772][49750] Updated weights for policy 0, policy_version 232781 (0.0034) [2024-04-26 17:42:46,537][49750] Updated weights for policy 0, policy_version 232791 (0.0035) [2024-04-26 17:42:47,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 3814064128. Throughput: 0: 50762.7. Samples: 1566960140. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 17:42:50,217][49750] Updated weights for policy 0, policy_version 232801 (0.0030) [2024-04-26 17:42:52,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3814309888. Throughput: 0: 50672.9. Samples: 1567115280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:52,063][49517] Avg episode reward: [(0, '0.463')] [2024-04-26 17:42:52,938][49750] Updated weights for policy 0, policy_version 232811 (0.0033) [2024-04-26 17:42:56,539][49750] Updated weights for policy 0, policy_version 232821 (0.0031) [2024-04-26 17:42:57,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50873.7). 
Total num frames: 3814572032. Throughput: 0: 50805.5. Samples: 1567423220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:42:57,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 17:42:59,276][49750] Updated weights for policy 0, policy_version 232831 (0.0033) [2024-04-26 17:43:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 3814817792. Throughput: 0: 50665.2. Samples: 1567725040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:43:02,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 17:43:02,852][49750] Updated weights for policy 0, policy_version 232841 (0.0030) [2024-04-26 17:43:05,708][49750] Updated weights for policy 0, policy_version 232851 (0.0032) [2024-04-26 17:43:07,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3815079936. Throughput: 0: 50669.3. Samples: 1567871220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:43:07,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 17:43:09,503][49750] Updated weights for policy 0, policy_version 232861 (0.0038) [2024-04-26 17:43:12,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 3815342080. Throughput: 0: 50555.3. Samples: 1568175000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:43:12,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 17:43:12,188][49750] Updated weights for policy 0, policy_version 232871 (0.0030) [2024-04-26 17:43:16,010][49750] Updated weights for policy 0, policy_version 232881 (0.0030) [2024-04-26 17:43:16,598][49728] Signal inference workers to stop experience collection... (23350 times) [2024-04-26 17:43:16,598][49728] Signal inference workers to resume experience collection... 
(23350 times) [2024-04-26 17:43:16,625][49750] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-04-26 17:43:16,625][49750] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-04-26 17:43:17,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 3815604224. Throughput: 0: 50858.3. Samples: 1568490140. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-26 17:43:17,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 17:43:18,582][49750] Updated weights for policy 0, policy_version 232891 (0.0033) [2024-04-26 17:43:22,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3815817216. Throughput: 0: 50786.2. Samples: 1568632640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:43:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 17:43:22,486][49750] Updated weights for policy 0, policy_version 232901 (0.0028) [2024-04-26 17:43:24,995][49750] Updated weights for policy 0, policy_version 232911 (0.0030) [2024-04-26 17:43:27,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3816079360. Throughput: 0: 50701.6. Samples: 1568937400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:43:27,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 17:43:29,026][49750] Updated weights for policy 0, policy_version 232921 (0.0035) [2024-04-26 17:43:31,511][49750] Updated weights for policy 0, policy_version 232931 (0.0036) [2024-04-26 17:43:32,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 3816357888. Throughput: 0: 50625.8. Samples: 1569238300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:43:32,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 17:43:35,321][49750] Updated weights for policy 0, policy_version 232941 (0.0033) [2024-04-26 17:43:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3816603648. Throughput: 0: 50951.1. Samples: 1569408080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:43:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 17:43:38,043][49750] Updated weights for policy 0, policy_version 232951 (0.0030) [2024-04-26 17:43:41,720][49750] Updated weights for policy 0, policy_version 232961 (0.0029) [2024-04-26 17:43:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3816849408. Throughput: 0: 50751.3. Samples: 1569707040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:43:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 17:43:44,447][49750] Updated weights for policy 0, policy_version 232971 (0.0030) [2024-04-26 17:43:47,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3817095168. Throughput: 0: 50706.6. Samples: 1570006840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:43:47,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 17:43:48,024][49750] Updated weights for policy 0, policy_version 232981 (0.0034) [2024-04-26 17:43:50,957][49750] Updated weights for policy 0, policy_version 232991 (0.0032) [2024-04-26 17:43:52,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3817357312. Throughput: 0: 50889.8. Samples: 1570161260. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:43:52,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 17:43:54,415][49750] Updated weights for policy 0, policy_version 233001 (0.0033) [2024-04-26 17:43:57,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3817619456. Throughput: 0: 50988.7. Samples: 1570469500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:43:57,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:43:57,326][49750] Updated weights for policy 0, policy_version 233011 (0.0028) [2024-04-26 17:44:00,791][49750] Updated weights for policy 0, policy_version 233021 (0.0032) [2024-04-26 17:44:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3817881600. Throughput: 0: 50686.2. Samples: 1570771020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:44:02,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 17:44:03,927][49750] Updated weights for policy 0, policy_version 233031 (0.0029) [2024-04-26 17:44:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3818110976. Throughput: 0: 51004.7. Samples: 1570927860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:44:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 17:44:07,228][49750] Updated weights for policy 0, policy_version 233041 (0.0035) [2024-04-26 17:44:10,268][49750] Updated weights for policy 0, policy_version 233051 (0.0031) [2024-04-26 17:44:12,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 3818389504. Throughput: 0: 50964.7. Samples: 1571230820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:44:12,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 17:44:13,774][49750] Updated weights for policy 0, policy_version 233061 (0.0029) [2024-04-26 17:44:16,733][49750] Updated weights for policy 0, policy_version 233071 (0.0037) [2024-04-26 17:44:17,063][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 3818651648. Throughput: 0: 51105.8. Samples: 1571538060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:44:17,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 17:44:20,196][49750] Updated weights for policy 0, policy_version 233081 (0.0029) [2024-04-26 17:44:22,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 3818897408. Throughput: 0: 50869.6. Samples: 1571697220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:44:22,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 17:44:23,150][49750] Updated weights for policy 0, policy_version 233091 (0.0032) [2024-04-26 17:44:26,636][49750] Updated weights for policy 0, policy_version 233101 (0.0038) [2024-04-26 17:44:27,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3819143168. Throughput: 0: 51025.5. Samples: 1572003180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:44:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 17:44:29,465][49750] Updated weights for policy 0, policy_version 233111 (0.0027) [2024-04-26 17:44:32,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3819388928. Throughput: 0: 51002.2. Samples: 1572301940. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:44:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 17:44:33,135][49750] Updated weights for policy 0, policy_version 233121 (0.0029) [2024-04-26 17:44:35,584][49728] Signal inference workers to stop experience collection... (23400 times) [2024-04-26 17:44:35,584][49728] Signal inference workers to resume experience collection... (23400 times) [2024-04-26 17:44:35,610][49750] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-04-26 17:44:35,610][49750] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-04-26 17:44:35,907][49750] Updated weights for policy 0, policy_version 233131 (0.0030) [2024-04-26 17:44:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3819667456. Throughput: 0: 50931.4. Samples: 1572453180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:44:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 17:44:39,415][49750] Updated weights for policy 0, policy_version 233141 (0.0028) [2024-04-26 17:44:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3819913216. Throughput: 0: 50960.0. Samples: 1572762700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:44:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 17:44:42,187][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000233150_3819929600.pth... [2024-04-26 17:44:42,226][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000232403_3807690752.pth [2024-04-26 17:44:42,496][49750] Updated weights for policy 0, policy_version 233151 (0.0034) [2024-04-26 17:44:45,840][49750] Updated weights for policy 0, policy_version 233161 (0.0028) [2024-04-26 17:44:47,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 3820175360. Throughput: 0: 51067.0. 
Samples: 1573069040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:44:47,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:44:49,099][49750] Updated weights for policy 0, policy_version 233171 (0.0029) [2024-04-26 17:44:52,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3820421120. Throughput: 0: 50909.4. Samples: 1573218780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:44:52,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 17:44:52,262][49750] Updated weights for policy 0, policy_version 233181 (0.0032) [2024-04-26 17:44:55,722][49750] Updated weights for policy 0, policy_version 233191 (0.0030) [2024-04-26 17:44:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3820683264. Throughput: 0: 51160.1. Samples: 1573533020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:44:57,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 17:44:58,720][49750] Updated weights for policy 0, policy_version 233201 (0.0027) [2024-04-26 17:45:02,047][49750] Updated weights for policy 0, policy_version 233211 (0.0036) [2024-04-26 17:45:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3820929024. Throughput: 0: 50941.1. Samples: 1573830400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:45:02,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:45:05,004][49750] Updated weights for policy 0, policy_version 233221 (0.0034) [2024-04-26 17:45:07,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3821174784. Throughput: 0: 50747.7. Samples: 1573980860. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:45:07,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 17:45:08,355][49750] Updated weights for policy 0, policy_version 233231 (0.0027) [2024-04-26 17:45:11,514][49750] Updated weights for policy 0, policy_version 233241 (0.0031) [2024-04-26 17:45:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3821453312. Throughput: 0: 50813.8. Samples: 1574289800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:45:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 17:45:14,818][49750] Updated weights for policy 0, policy_version 233251 (0.0030) [2024-04-26 17:45:17,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3821682688. Throughput: 0: 51012.1. Samples: 1574597480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:45:17,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 17:45:18,102][49750] Updated weights for policy 0, policy_version 233261 (0.0033) [2024-04-26 17:45:21,249][49750] Updated weights for policy 0, policy_version 233271 (0.0034) [2024-04-26 17:45:22,063][49517] Fps is (10 sec: 50789.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3821961216. Throughput: 0: 50863.9. Samples: 1574742060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:45:22,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 17:45:24,445][49750] Updated weights for policy 0, policy_version 233281 (0.0031) [2024-04-26 17:45:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3822190592. Throughput: 0: 50865.4. Samples: 1575051640. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:45:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 17:45:27,663][49750] Updated weights for policy 0, policy_version 233291 (0.0031) [2024-04-26 17:45:30,982][49750] Updated weights for policy 0, policy_version 233301 (0.0030) [2024-04-26 17:45:32,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 3822469120. Throughput: 0: 50855.1. Samples: 1575357520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:45:32,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 17:45:34,055][49750] Updated weights for policy 0, policy_version 233311 (0.0029) [2024-04-26 17:45:37,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3822698496. Throughput: 0: 50924.5. Samples: 1575510380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 17:45:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 17:45:37,373][49750] Updated weights for policy 0, policy_version 233321 (0.0027) [2024-04-26 17:45:40,507][49750] Updated weights for policy 0, policy_version 233331 (0.0031) [2024-04-26 17:45:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3822960640. Throughput: 0: 50738.3. Samples: 1575816240. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:45:42,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 17:45:43,663][49750] Updated weights for policy 0, policy_version 233341 (0.0032) [2024-04-26 17:45:47,033][49750] Updated weights for policy 0, policy_version 233351 (0.0028) [2024-04-26 17:45:47,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3823222784. Throughput: 0: 50874.9. Samples: 1576119780. 
Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:45:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 17:45:50,129][49750] Updated weights for policy 0, policy_version 233361 (0.0031) [2024-04-26 17:45:52,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3823468544. Throughput: 0: 50962.6. Samples: 1576274180. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:45:52,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 17:45:53,576][49750] Updated weights for policy 0, policy_version 233371 (0.0035) [2024-04-26 17:45:56,508][49750] Updated weights for policy 0, policy_version 233381 (0.0030) [2024-04-26 17:45:57,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50873.8). Total num frames: 3823730688. Throughput: 0: 50754.2. Samples: 1576573740. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:45:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 17:45:59,994][49750] Updated weights for policy 0, policy_version 233391 (0.0032) [2024-04-26 17:46:02,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3823992832. Throughput: 0: 50676.9. Samples: 1576877940. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:02,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 17:46:02,955][49750] Updated weights for policy 0, policy_version 233401 (0.0033) [2024-04-26 17:46:06,444][49750] Updated weights for policy 0, policy_version 233411 (0.0035) [2024-04-26 17:46:07,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3824222208. Throughput: 0: 50825.0. Samples: 1577029180. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:07,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-26 17:46:07,392][49728] Signal inference workers to stop experience collection... 
(23450 times) [2024-04-26 17:46:07,392][49728] Signal inference workers to resume experience collection... (23450 times) [2024-04-26 17:46:07,423][49750] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-04-26 17:46:07,423][49750] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-04-26 17:46:09,309][49750] Updated weights for policy 0, policy_version 233421 (0.0031) [2024-04-26 17:46:12,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3824484352. Throughput: 0: 50731.4. Samples: 1577334560. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:12,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 17:46:12,993][49750] Updated weights for policy 0, policy_version 233431 (0.0030) [2024-04-26 17:46:15,906][49750] Updated weights for policy 0, policy_version 233441 (0.0031) [2024-04-26 17:46:17,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3824746496. Throughput: 0: 50713.8. Samples: 1577639640. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:17,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 17:46:19,535][49750] Updated weights for policy 0, policy_version 233451 (0.0033) [2024-04-26 17:46:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.6, 300 sec: 50818.2). Total num frames: 3824992256. Throughput: 0: 50881.8. Samples: 1577800060. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:22,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 17:46:22,425][49750] Updated weights for policy 0, policy_version 233461 (0.0032) [2024-04-26 17:46:25,950][49750] Updated weights for policy 0, policy_version 233471 (0.0029) [2024-04-26 17:46:27,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 3825270784. Throughput: 0: 50765.8. Samples: 1578100700. 
Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 17:46:28,974][49750] Updated weights for policy 0, policy_version 233481 (0.0040) [2024-04-26 17:46:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3825500160. Throughput: 0: 50695.6. Samples: 1578401080. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 17:46:32,232][49750] Updated weights for policy 0, policy_version 233491 (0.0034) [2024-04-26 17:46:35,553][49750] Updated weights for policy 0, policy_version 233501 (0.0034) [2024-04-26 17:46:37,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3825745920. Throughput: 0: 50679.6. Samples: 1578554760. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:37,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 17:46:38,720][49750] Updated weights for policy 0, policy_version 233511 (0.0036) [2024-04-26 17:46:41,898][49750] Updated weights for policy 0, policy_version 233521 (0.0027) [2024-04-26 17:46:42,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 3826008064. Throughput: 0: 50808.3. Samples: 1578860120. Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:42,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 17:46:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000233521_3826008064.pth... [2024-04-26 17:46:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000232776_3813801984.pth [2024-04-26 17:46:45,449][49750] Updated weights for policy 0, policy_version 233531 (0.0034) [2024-04-26 17:46:47,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 3826270208. Throughput: 0: 50843.6. Samples: 1579165900. 
Policy #0 lag: (min: 2.0, avg: 11.4, max: 25.0) [2024-04-26 17:46:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 17:46:48,359][49750] Updated weights for policy 0, policy_version 233541 (0.0033) [2024-04-26 17:46:51,791][49750] Updated weights for policy 0, policy_version 233551 (0.0030) [2024-04-26 17:46:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3826515968. Throughput: 0: 50876.1. Samples: 1579318600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:46:52,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 17:46:54,924][49750] Updated weights for policy 0, policy_version 233561 (0.0033) [2024-04-26 17:46:57,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3826778112. Throughput: 0: 50815.2. Samples: 1579621240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:46:57,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 17:46:58,173][49750] Updated weights for policy 0, policy_version 233571 (0.0034) [2024-04-26 17:47:01,261][49750] Updated weights for policy 0, policy_version 233581 (0.0030) [2024-04-26 17:47:02,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3827023872. Throughput: 0: 50851.4. Samples: 1579927960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:02,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 17:47:04,489][49750] Updated weights for policy 0, policy_version 233591 (0.0032) [2024-04-26 17:47:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3827286016. Throughput: 0: 50836.8. Samples: 1580087720. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:07,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 17:47:07,503][49750] Updated weights for policy 0, policy_version 233601 (0.0032) [2024-04-26 17:47:11,055][49750] Updated weights for policy 0, policy_version 233611 (0.0032) [2024-04-26 17:47:12,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 3827548160. Throughput: 0: 50990.5. Samples: 1580395280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:12,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 17:47:14,049][49750] Updated weights for policy 0, policy_version 233621 (0.0030) [2024-04-26 17:47:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3827777536. Throughput: 0: 51029.8. Samples: 1580697420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:17,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 17:47:17,422][49750] Updated weights for policy 0, policy_version 233631 (0.0028) [2024-04-26 17:47:20,578][49750] Updated weights for policy 0, policy_version 233641 (0.0035) [2024-04-26 17:47:22,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3828039680. Throughput: 0: 50848.4. Samples: 1580842940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:22,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 17:47:23,733][49750] Updated weights for policy 0, policy_version 233651 (0.0035) [2024-04-26 17:47:26,936][49750] Updated weights for policy 0, policy_version 233661 (0.0032) [2024-04-26 17:47:27,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50929.2). Total num frames: 3828301824. Throughput: 0: 50832.8. Samples: 1581147600. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:27,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 17:47:30,157][49750] Updated weights for policy 0, policy_version 233671 (0.0037) [2024-04-26 17:47:30,883][49728] Signal inference workers to stop experience collection... (23500 times) [2024-04-26 17:47:30,883][49728] Signal inference workers to resume experience collection... (23500 times) [2024-04-26 17:47:30,912][49750] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-04-26 17:47:30,912][49750] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-04-26 17:47:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3828547584. Throughput: 0: 50831.7. Samples: 1581453340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:32,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 17:47:33,365][49750] Updated weights for policy 0, policy_version 233681 (0.0031) [2024-04-26 17:47:36,556][49750] Updated weights for policy 0, policy_version 233691 (0.0030) [2024-04-26 17:47:37,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3828809728. Throughput: 0: 50899.0. Samples: 1581609060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:37,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 17:47:39,785][49750] Updated weights for policy 0, policy_version 233701 (0.0027) [2024-04-26 17:47:42,063][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3829055488. Throughput: 0: 50979.0. Samples: 1581915300. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:42,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 17:47:42,864][49750] Updated weights for policy 0, policy_version 233711 (0.0032) [2024-04-26 17:47:46,294][49750] Updated weights for policy 0, policy_version 233721 (0.0034) [2024-04-26 17:47:47,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3829317632. Throughput: 0: 50892.6. Samples: 1582218120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:47,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 17:47:49,295][49750] Updated weights for policy 0, policy_version 233731 (0.0033) [2024-04-26 17:47:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3829563392. Throughput: 0: 50689.3. Samples: 1582368740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 17:47:52,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 17:47:52,779][49750] Updated weights for policy 0, policy_version 233741 (0.0035) [2024-04-26 17:47:55,691][49750] Updated weights for policy 0, policy_version 233751 (0.0032) [2024-04-26 17:47:57,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 3829841920. Throughput: 0: 50748.0. Samples: 1582678940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:47:57,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 17:47:59,157][49750] Updated weights for policy 0, policy_version 233761 (0.0041) [2024-04-26 17:48:02,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3830071296. Throughput: 0: 50797.0. Samples: 1582983280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:02,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 17:48:02,250][49750] Updated weights for policy 0, policy_version 233771 (0.0036) [2024-04-26 17:48:05,471][49750] Updated weights for policy 0, policy_version 233781 (0.0029) [2024-04-26 17:48:07,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3830333440. Throughput: 0: 50969.8. Samples: 1583136580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:07,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 17:48:08,484][49750] Updated weights for policy 0, policy_version 233791 (0.0035) [2024-04-26 17:48:11,900][49750] Updated weights for policy 0, policy_version 233801 (0.0031) [2024-04-26 17:48:12,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3830595584. Throughput: 0: 50991.7. Samples: 1583442220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 17:48:14,985][49750] Updated weights for policy 0, policy_version 233811 (0.0033) [2024-04-26 17:48:17,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3830841344. Throughput: 0: 50874.9. Samples: 1583742700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 17:48:18,574][49750] Updated weights for policy 0, policy_version 233821 (0.0034) [2024-04-26 17:48:21,614][49750] Updated weights for policy 0, policy_version 233831 (0.0035) [2024-04-26 17:48:22,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 3831103488. Throughput: 0: 50916.0. Samples: 1583900280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 17:48:24,997][49750] Updated weights for policy 0, policy_version 233841 (0.0033) [2024-04-26 17:48:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 3831349248. Throughput: 0: 50908.6. Samples: 1584206180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:27,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 17:48:27,960][49750] Updated weights for policy 0, policy_version 233851 (0.0036) [2024-04-26 17:48:31,371][49750] Updated weights for policy 0, policy_version 233861 (0.0033) [2024-04-26 17:48:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3831611392. Throughput: 0: 50991.4. Samples: 1584512740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:32,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 17:48:32,724][49728] Signal inference workers to stop experience collection... (23550 times) [2024-04-26 17:48:32,767][49750] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-04-26 17:48:32,802][49728] Signal inference workers to resume experience collection... (23550 times) [2024-04-26 17:48:32,803][49750] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-04-26 17:48:34,209][49750] Updated weights for policy 0, policy_version 233871 (0.0033) [2024-04-26 17:48:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.6, 300 sec: 50873.8). Total num frames: 3831857152. Throughput: 0: 51018.0. Samples: 1584664540. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 17:48:37,690][49750] Updated weights for policy 0, policy_version 233881 (0.0029) [2024-04-26 17:48:40,695][49750] Updated weights for policy 0, policy_version 233891 (0.0039) [2024-04-26 17:48:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3832119296. Throughput: 0: 50926.8. Samples: 1584970640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:42,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 17:48:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000233895_3832135680.pth... [2024-04-26 17:48:42,113][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000233150_3819929600.pth [2024-04-26 17:48:44,080][49750] Updated weights for policy 0, policy_version 233901 (0.0031) [2024-04-26 17:48:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 3832381440. Throughput: 0: 50928.5. Samples: 1585275060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:47,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 17:48:47,187][49750] Updated weights for policy 0, policy_version 233911 (0.0030) [2024-04-26 17:48:50,546][49750] Updated weights for policy 0, policy_version 233921 (0.0035) [2024-04-26 17:48:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3832627200. Throughput: 0: 50974.8. Samples: 1585430440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:52,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 17:48:53,545][49750] Updated weights for policy 0, policy_version 233931 (0.0039) [2024-04-26 17:48:57,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 3832872960. Throughput: 0: 50918.6. Samples: 1585733560. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:48:57,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 17:48:57,135][49750] Updated weights for policy 0, policy_version 233941 (0.0033) [2024-04-26 17:48:59,829][49750] Updated weights for policy 0, policy_version 233951 (0.0031) [2024-04-26 17:49:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 3833135104. Throughput: 0: 50972.8. Samples: 1586036480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 17:49:02,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 17:49:03,474][49750] Updated weights for policy 0, policy_version 233961 (0.0030) [2024-04-26 17:49:06,323][49750] Updated weights for policy 0, policy_version 233971 (0.0028) [2024-04-26 17:49:07,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3833397248. Throughput: 0: 50949.2. Samples: 1586192980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:07,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 17:49:09,995][49750] Updated weights for policy 0, policy_version 233981 (0.0033) [2024-04-26 17:49:12,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3833643008. Throughput: 0: 50848.9. Samples: 1586494380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:12,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 17:49:12,860][49750] Updated weights for policy 0, policy_version 233991 (0.0032) [2024-04-26 17:49:16,269][49750] Updated weights for policy 0, policy_version 234001 (0.0030) [2024-04-26 17:49:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3833888768. Throughput: 0: 50739.7. Samples: 1586796020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:17,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 17:49:19,303][49750] Updated weights for policy 0, policy_version 234011 (0.0029) [2024-04-26 17:49:22,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3834134528. Throughput: 0: 50578.6. Samples: 1586940580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:22,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 17:49:22,956][49750] Updated weights for policy 0, policy_version 234021 (0.0030) [2024-04-26 17:49:26,058][49750] Updated weights for policy 0, policy_version 234031 (0.0031) [2024-04-26 17:49:27,062][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 3834413056. Throughput: 0: 50565.4. Samples: 1587246080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 17:49:29,326][49750] Updated weights for policy 0, policy_version 234041 (0.0026) [2024-04-26 17:49:32,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3834675200. Throughput: 0: 50703.5. Samples: 1587556720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:32,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 17:49:32,373][49750] Updated weights for policy 0, policy_version 234051 (0.0033) [2024-04-26 17:49:35,558][49728] Signal inference workers to stop experience collection... (23600 times) [2024-04-26 17:49:35,609][49750] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-04-26 17:49:35,627][49728] Signal inference workers to resume experience collection... 
(23600 times) [2024-04-26 17:49:35,634][49750] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-04-26 17:49:35,758][49750] Updated weights for policy 0, policy_version 234061 (0.0032) [2024-04-26 17:49:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3834904576. Throughput: 0: 50728.3. Samples: 1587713220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:37,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 17:49:38,668][49750] Updated weights for policy 0, policy_version 234071 (0.0032) [2024-04-26 17:49:42,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3835166720. Throughput: 0: 50782.3. Samples: 1588018760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:42,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 17:49:42,119][49750] Updated weights for policy 0, policy_version 234081 (0.0029) [2024-04-26 17:49:45,247][49750] Updated weights for policy 0, policy_version 234091 (0.0031) [2024-04-26 17:49:47,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.0, 300 sec: 50762.6). Total num frames: 3835396096. Throughput: 0: 50701.6. Samples: 1588318060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:47,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 17:49:48,529][49750] Updated weights for policy 0, policy_version 234101 (0.0029) [2024-04-26 17:49:51,677][49750] Updated weights for policy 0, policy_version 234111 (0.0033) [2024-04-26 17:49:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3835674624. Throughput: 0: 50775.9. Samples: 1588477900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:52,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 17:49:55,045][49750] Updated weights for policy 0, policy_version 234121 (0.0030) [2024-04-26 17:49:57,063][49517] Fps is (10 sec: 54068.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3835936768. Throughput: 0: 50828.7. Samples: 1588781680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:49:57,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 17:49:57,989][49750] Updated weights for policy 0, policy_version 234131 (0.0029) [2024-04-26 17:50:01,662][49750] Updated weights for policy 0, policy_version 234141 (0.0040) [2024-04-26 17:50:02,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3836182528. Throughput: 0: 50918.5. Samples: 1589087360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:50:02,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 17:50:04,336][49750] Updated weights for policy 0, policy_version 234151 (0.0032) [2024-04-26 17:50:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3836444672. Throughput: 0: 50828.5. Samples: 1589227860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:50:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 17:50:08,134][49750] Updated weights for policy 0, policy_version 234161 (0.0030) [2024-04-26 17:50:10,858][49750] Updated weights for policy 0, policy_version 234171 (0.0033) [2024-04-26 17:50:12,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3836674048. Throughput: 0: 50817.8. Samples: 1589532880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 17:50:12,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 17:50:14,574][49750] Updated weights for policy 0, policy_version 234181 (0.0031) [2024-04-26 17:50:17,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 3836968960. Throughput: 0: 50683.1. Samples: 1589837460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:50:17,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 17:50:17,366][49750] Updated weights for policy 0, policy_version 234191 (0.0036) [2024-04-26 17:50:21,006][49750] Updated weights for policy 0, policy_version 234201 (0.0027) [2024-04-26 17:50:22,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3837181952. Throughput: 0: 50837.8. Samples: 1590000920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:50:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 17:50:23,686][49750] Updated weights for policy 0, policy_version 234211 (0.0030) [2024-04-26 17:50:27,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3837444096. Throughput: 0: 50792.8. Samples: 1590304440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:50:27,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 17:50:27,525][49750] Updated weights for policy 0, policy_version 234221 (0.0032) [2024-04-26 17:50:30,234][49750] Updated weights for policy 0, policy_version 234231 (0.0027) [2024-04-26 17:50:32,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 3837706240. Throughput: 0: 51067.0. Samples: 1590616060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:50:32,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 17:50:33,922][49750] Updated weights for policy 0, policy_version 234241 (0.0035) [2024-04-26 17:50:36,682][49750] Updated weights for policy 0, policy_version 234251 (0.0029) [2024-04-26 17:50:37,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3837968384. Throughput: 0: 50781.6. Samples: 1590763080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:50:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 17:50:40,362][49750] Updated weights for policy 0, policy_version 234261 (0.0032) [2024-04-26 17:50:42,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3838214144. Throughput: 0: 50800.6. Samples: 1591067700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:50:42,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 17:50:42,089][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000234267_3838230528.pth... [2024-04-26 17:50:42,143][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000233521_3826008064.pth [2024-04-26 17:50:42,294][49728] Signal inference workers to stop experience collection... (23650 times) [2024-04-26 17:50:42,295][49728] Signal inference workers to resume experience collection... (23650 times) [2024-04-26 17:50:42,318][49750] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-04-26 17:50:42,319][49750] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-04-26 17:50:43,203][49750] Updated weights for policy 0, policy_version 234271 (0.0032) [2024-04-26 17:50:46,850][49750] Updated weights for policy 0, policy_version 234281 (0.0026) [2024-04-26 17:50:47,062][49517] Fps is (10 sec: 49152.6, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 3838459904. Throughput: 0: 50708.1. 
Samples: 1591369220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:50:47,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:50:49,642][49750] Updated weights for policy 0, policy_version 234291 (0.0032) [2024-04-26 17:50:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3838722048. Throughput: 0: 50941.3. Samples: 1591520220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:50:52,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 17:50:53,256][49750] Updated weights for policy 0, policy_version 234301 (0.0033) [2024-04-26 17:50:56,227][49750] Updated weights for policy 0, policy_version 234311 (0.0036) [2024-04-26 17:50:57,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3838967808. Throughput: 0: 50865.1. Samples: 1591821820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:50:57,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 17:50:59,605][49750] Updated weights for policy 0, policy_version 234321 (0.0031) [2024-04-26 17:51:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3839246336. Throughput: 0: 50836.8. Samples: 1592125120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:51:02,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 17:51:02,775][49750] Updated weights for policy 0, policy_version 234331 (0.0034) [2024-04-26 17:51:06,120][49750] Updated weights for policy 0, policy_version 234341 (0.0038) [2024-04-26 17:51:07,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3839492096. Throughput: 0: 50741.4. Samples: 1592284280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:51:07,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 17:51:09,169][49750] Updated weights for policy 0, policy_version 234351 (0.0034) [2024-04-26 17:51:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3839737856. Throughput: 0: 50802.7. Samples: 1592590560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:51:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 17:51:12,842][49750] Updated weights for policy 0, policy_version 234361 (0.0029) [2024-04-26 17:51:15,590][49750] Updated weights for policy 0, policy_version 234371 (0.0029) [2024-04-26 17:51:17,062][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 3839967232. Throughput: 0: 50603.9. Samples: 1592893240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:51:17,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 17:51:19,191][49750] Updated weights for policy 0, policy_version 234381 (0.0028) [2024-04-26 17:51:22,008][49750] Updated weights for policy 0, policy_version 234391 (0.0028) [2024-04-26 17:51:22,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3840262144. Throughput: 0: 50781.8. Samples: 1593048260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 17:51:22,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:51:25,471][49750] Updated weights for policy 0, policy_version 234401 (0.0035) [2024-04-26 17:51:27,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51063.7, 300 sec: 50873.7). Total num frames: 3840507904. Throughput: 0: 50656.5. Samples: 1593347240. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:51:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 17:51:28,587][49750] Updated weights for policy 0, policy_version 234411 (0.0029) [2024-04-26 17:51:32,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3840737280. Throughput: 0: 50782.5. Samples: 1593654440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:51:32,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 17:51:32,186][49750] Updated weights for policy 0, policy_version 234421 (0.0039) [2024-04-26 17:51:35,052][49750] Updated weights for policy 0, policy_version 234431 (0.0031) [2024-04-26 17:51:37,062][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3840999424. Throughput: 0: 50732.4. Samples: 1593803180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:51:37,063][49517] Avg episode reward: [(0, '0.463')] [2024-04-26 17:51:38,587][49750] Updated weights for policy 0, policy_version 234441 (0.0026) [2024-04-26 17:51:41,512][49750] Updated weights for policy 0, policy_version 234451 (0.0029) [2024-04-26 17:51:42,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3841261568. Throughput: 0: 50870.7. Samples: 1594111000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:51:42,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 17:51:44,995][49750] Updated weights for policy 0, policy_version 234461 (0.0029) [2024-04-26 17:51:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 3841523712. Throughput: 0: 50841.3. Samples: 1594412980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:51:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 17:51:48,058][49750] Updated weights for policy 0, policy_version 234471 (0.0032) [2024-04-26 17:51:51,302][49750] Updated weights for policy 0, policy_version 234481 (0.0029) [2024-04-26 17:51:52,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3841785856. Throughput: 0: 50833.3. Samples: 1594571780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:51:52,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 17:51:54,576][49750] Updated weights for policy 0, policy_version 234491 (0.0032) [2024-04-26 17:51:55,963][49728] Signal inference workers to stop experience collection... (23700 times) [2024-04-26 17:51:55,964][49728] Signal inference workers to resume experience collection... (23700 times) [2024-04-26 17:51:55,994][49750] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-04-26 17:51:55,994][49750] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-04-26 17:51:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3842031616. Throughput: 0: 50817.4. Samples: 1594877340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:51:57,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 17:51:57,659][49750] Updated weights for policy 0, policy_version 234501 (0.0029) [2024-04-26 17:52:00,906][49750] Updated weights for policy 0, policy_version 234511 (0.0036) [2024-04-26 17:52:02,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3842260992. Throughput: 0: 50811.1. Samples: 1595179740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:52:02,063][49517] Avg episode reward: [(0, '0.450')] [2024-04-26 17:52:04,102][49750] Updated weights for policy 0, policy_version 234521 (0.0035) [2024-04-26 17:52:07,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3842523136. Throughput: 0: 50650.1. Samples: 1595327520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:52:07,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 17:52:07,400][49750] Updated weights for policy 0, policy_version 234531 (0.0035) [2024-04-26 17:52:10,507][49750] Updated weights for policy 0, policy_version 234541 (0.0033) [2024-04-26 17:52:12,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3842801664. Throughput: 0: 50851.8. Samples: 1595635580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:52:12,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 17:52:13,858][49750] Updated weights for policy 0, policy_version 234551 (0.0033) [2024-04-26 17:52:16,980][49750] Updated weights for policy 0, policy_version 234561 (0.0033) [2024-04-26 17:52:17,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3843047424. Throughput: 0: 50776.7. Samples: 1595939380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:52:17,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:52:20,162][49750] Updated weights for policy 0, policy_version 234571 (0.0027) [2024-04-26 17:52:22,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3843276800. Throughput: 0: 50951.5. Samples: 1596096000. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:52:22,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 17:52:23,399][49750] Updated weights for policy 0, policy_version 234581 (0.0029) [2024-04-26 17:52:26,454][49750] Updated weights for policy 0, policy_version 234591 (0.0037) [2024-04-26 17:52:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3843538944. Throughput: 0: 50851.6. Samples: 1596399320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:52:27,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 17:52:29,918][49750] Updated weights for policy 0, policy_version 234601 (0.0033) [2024-04-26 17:52:32,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3843817472. Throughput: 0: 50812.5. Samples: 1596699540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 17:52:32,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 17:52:32,974][49750] Updated weights for policy 0, policy_version 234611 (0.0037) [2024-04-26 17:52:36,248][49750] Updated weights for policy 0, policy_version 234621 (0.0035) [2024-04-26 17:52:37,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3844063232. Throughput: 0: 50838.1. Samples: 1596859500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:52:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 17:52:39,576][49750] Updated weights for policy 0, policy_version 234631 (0.0034) [2024-04-26 17:52:42,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3844308992. Throughput: 0: 50812.5. Samples: 1597163900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:52:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 17:52:42,158][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000234639_3844325376.pth... 
[2024-04-26 17:52:42,196][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000233895_3832135680.pth [2024-04-26 17:52:42,758][49750] Updated weights for policy 0, policy_version 234641 (0.0029) [2024-04-26 17:52:46,094][49750] Updated weights for policy 0, policy_version 234651 (0.0027) [2024-04-26 17:52:47,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 3844538368. Throughput: 0: 50732.5. Samples: 1597462700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:52:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 17:52:49,206][49750] Updated weights for policy 0, policy_version 234661 (0.0031) [2024-04-26 17:52:52,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3844816896. Throughput: 0: 50748.5. Samples: 1597611200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:52:52,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 17:52:52,601][49750] Updated weights for policy 0, policy_version 234671 (0.0030) [2024-04-26 17:52:55,679][49750] Updated weights for policy 0, policy_version 234681 (0.0035) [2024-04-26 17:52:57,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3845079040. Throughput: 0: 50628.5. Samples: 1597913860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:52:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 17:52:59,281][49750] Updated weights for policy 0, policy_version 234691 (0.0032) [2024-04-26 17:53:02,063][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3845324800. Throughput: 0: 50636.3. Samples: 1598218020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:53:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 17:53:02,080][49750] Updated weights for policy 0, policy_version 234701 (0.0037) [2024-04-26 17:53:05,916][49750] Updated weights for policy 0, policy_version 234711 (0.0043) [2024-04-26 17:53:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3845570560. Throughput: 0: 50606.2. Samples: 1598373280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:53:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 17:53:08,441][49750] Updated weights for policy 0, policy_version 234721 (0.0037) [2024-04-26 17:53:12,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3845816320. Throughput: 0: 50677.3. Samples: 1598679800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:53:12,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 17:53:12,303][49750] Updated weights for policy 0, policy_version 234731 (0.0030) [2024-04-26 17:53:14,945][49750] Updated weights for policy 0, policy_version 234741 (0.0043) [2024-04-26 17:53:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 3846078464. Throughput: 0: 50629.9. Samples: 1598977880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:53:17,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 17:53:18,754][49750] Updated weights for policy 0, policy_version 234751 (0.0033) [2024-04-26 17:53:21,430][49750] Updated weights for policy 0, policy_version 234761 (0.0029) [2024-04-26 17:53:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3846340608. Throughput: 0: 50653.5. Samples: 1599138900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:53:22,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 17:53:25,188][49750] Updated weights for policy 0, policy_version 234771 (0.0032) [2024-04-26 17:53:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3846586368. Throughput: 0: 50553.3. Samples: 1599438800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:53:27,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 17:53:27,902][49750] Updated weights for policy 0, policy_version 234781 (0.0032) [2024-04-26 17:53:28,670][49728] Signal inference workers to stop experience collection... (23750 times) [2024-04-26 17:53:28,670][49728] Signal inference workers to resume experience collection... (23750 times) [2024-04-26 17:53:28,681][49750] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-04-26 17:53:28,681][49750] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-04-26 17:53:31,753][49750] Updated weights for policy 0, policy_version 234791 (0.0032) [2024-04-26 17:53:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3846832128. Throughput: 0: 50705.7. Samples: 1599744460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:53:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 17:53:34,510][49750] Updated weights for policy 0, policy_version 234801 (0.0031) [2024-04-26 17:53:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3847110656. Throughput: 0: 50626.7. Samples: 1599889400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:53:37,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 17:53:38,159][49750] Updated weights for policy 0, policy_version 234811 (0.0035) [2024-04-26 17:53:41,003][49750] Updated weights for policy 0, policy_version 234821 (0.0029) [2024-04-26 17:53:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3847340032. Throughput: 0: 50652.8. Samples: 1600193240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 17:53:42,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 17:53:44,544][49750] Updated weights for policy 0, policy_version 234831 (0.0030) [2024-04-26 17:53:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.2, 300 sec: 50762.6). Total num frames: 3847602176. Throughput: 0: 50619.4. Samples: 1600495900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:53:47,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 17:53:47,410][49750] Updated weights for policy 0, policy_version 234841 (0.0029) [2024-04-26 17:53:51,218][49750] Updated weights for policy 0, policy_version 234851 (0.0033) [2024-04-26 17:53:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3847831552. Throughput: 0: 50558.7. Samples: 1600648420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:53:52,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 17:53:53,814][49750] Updated weights for policy 0, policy_version 234861 (0.0033) [2024-04-26 17:53:57,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3848093696. Throughput: 0: 50404.4. Samples: 1600948000. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:53:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 17:53:57,721][49750] Updated weights for policy 0, policy_version 234871 (0.0030) [2024-04-26 17:54:00,273][49750] Updated weights for policy 0, policy_version 234881 (0.0032) [2024-04-26 17:54:02,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50707.0). Total num frames: 3848355840. Throughput: 0: 50606.0. Samples: 1601255160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 17:54:04,137][49750] Updated weights for policy 0, policy_version 234891 (0.0032) [2024-04-26 17:54:06,837][49750] Updated weights for policy 0, policy_version 234901 (0.0033) [2024-04-26 17:54:07,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3848634368. Throughput: 0: 50411.5. Samples: 1601407420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:07,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 17:54:10,645][49750] Updated weights for policy 0, policy_version 234911 (0.0032) [2024-04-26 17:54:12,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3848863744. Throughput: 0: 50612.8. Samples: 1601716380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:12,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 17:54:13,278][49750] Updated weights for policy 0, policy_version 234921 (0.0030) [2024-04-26 17:54:17,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3849093120. Throughput: 0: 50500.4. Samples: 1602016980. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:17,063][49517] Avg episode reward: [(0, '0.432')] [2024-04-26 17:54:17,088][49750] Updated weights for policy 0, policy_version 234931 (0.0032) [2024-04-26 17:54:19,686][49750] Updated weights for policy 0, policy_version 234941 (0.0029) [2024-04-26 17:54:22,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3849371648. Throughput: 0: 50475.5. Samples: 1602160800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:22,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 17:54:23,587][49750] Updated weights for policy 0, policy_version 234951 (0.0028) [2024-04-26 17:54:26,068][49750] Updated weights for policy 0, policy_version 234961 (0.0030) [2024-04-26 17:54:27,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3849633792. Throughput: 0: 50707.3. Samples: 1602475060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:27,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 17:54:29,919][49750] Updated weights for policy 0, policy_version 234971 (0.0033) [2024-04-26 17:54:32,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3849879552. Throughput: 0: 50714.8. Samples: 1602778060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 17:54:32,756][49750] Updated weights for policy 0, policy_version 234981 (0.0031) [2024-04-26 17:54:36,291][49750] Updated weights for policy 0, policy_version 234991 (0.0032) [2024-04-26 17:54:37,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49971.3, 300 sec: 50651.5). Total num frames: 3850108928. Throughput: 0: 50597.8. Samples: 1602925320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 17:54:39,328][49750] Updated weights for policy 0, policy_version 235001 (0.0030) [2024-04-26 17:54:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3850371072. Throughput: 0: 50606.4. Samples: 1603225280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:42,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 17:54:42,094][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000235009_3850387456.pth... [2024-04-26 17:54:42,142][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000234267_3838230528.pth [2024-04-26 17:54:42,810][49750] Updated weights for policy 0, policy_version 235011 (0.0036) [2024-04-26 17:54:45,689][49750] Updated weights for policy 0, policy_version 235021 (0.0032) [2024-04-26 17:54:47,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3850633216. Throughput: 0: 50665.4. Samples: 1603535100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:47,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:54:49,249][49750] Updated weights for policy 0, policy_version 235031 (0.0036) [2024-04-26 17:54:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3850895360. Throughput: 0: 50718.7. Samples: 1603689760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 17:54:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 17:54:52,185][49750] Updated weights for policy 0, policy_version 235041 (0.0032) [2024-04-26 17:54:55,653][49750] Updated weights for policy 0, policy_version 235051 (0.0036) [2024-04-26 17:54:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3851124736. Throughput: 0: 50551.7. Samples: 1603991200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:54:57,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 17:54:58,589][49750] Updated weights for policy 0, policy_version 235061 (0.0036) [2024-04-26 17:55:01,609][49728] Signal inference workers to stop experience collection... (23800 times) [2024-04-26 17:55:01,610][49728] Signal inference workers to resume experience collection... (23800 times) [2024-04-26 17:55:01,631][49750] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-04-26 17:55:01,631][49750] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-04-26 17:55:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3851386880. Throughput: 0: 50675.6. Samples: 1604297380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 17:55:02,096][49750] Updated weights for policy 0, policy_version 235071 (0.0034) [2024-04-26 17:55:05,070][49750] Updated weights for policy 0, policy_version 235081 (0.0031) [2024-04-26 17:55:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3851649024. Throughput: 0: 50714.0. Samples: 1604442920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:07,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 17:55:08,624][49750] Updated weights for policy 0, policy_version 235091 (0.0037) [2024-04-26 17:55:11,563][49750] Updated weights for policy 0, policy_version 235101 (0.0033) [2024-04-26 17:55:12,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3851911168. Throughput: 0: 50594.0. Samples: 1604751800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:12,072][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:55:15,228][49750] Updated weights for policy 0, policy_version 235111 (0.0029) [2024-04-26 17:55:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3852156928. Throughput: 0: 50643.2. Samples: 1605057000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:17,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 17:55:18,018][49750] Updated weights for policy 0, policy_version 235121 (0.0036) [2024-04-26 17:55:21,557][49750] Updated weights for policy 0, policy_version 235131 (0.0033) [2024-04-26 17:55:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3852402688. Throughput: 0: 50775.2. Samples: 1605210200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:22,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 17:55:24,312][49750] Updated weights for policy 0, policy_version 235141 (0.0041) [2024-04-26 17:55:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3852648448. Throughput: 0: 50746.7. Samples: 1605508880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:27,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 17:55:27,942][49750] Updated weights for policy 0, policy_version 235151 (0.0028) [2024-04-26 17:55:30,783][49750] Updated weights for policy 0, policy_version 235161 (0.0033) [2024-04-26 17:55:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3852926976. Throughput: 0: 50669.8. Samples: 1605815240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 17:55:34,440][49750] Updated weights for policy 0, policy_version 235171 (0.0034) [2024-04-26 17:55:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 3853172736. Throughput: 0: 50655.8. Samples: 1605969280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 17:55:37,437][49750] Updated weights for policy 0, policy_version 235181 (0.0035) [2024-04-26 17:55:40,934][49750] Updated weights for policy 0, policy_version 235191 (0.0038) [2024-04-26 17:55:42,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3853402112. Throughput: 0: 50664.4. Samples: 1606271100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:42,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 17:55:44,012][49750] Updated weights for policy 0, policy_version 235201 (0.0029) [2024-04-26 17:55:47,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3853680640. Throughput: 0: 50568.3. Samples: 1606572960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:47,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 17:55:47,304][49750] Updated weights for policy 0, policy_version 235211 (0.0029) [2024-04-26 17:55:50,410][49750] Updated weights for policy 0, policy_version 235221 (0.0033) [2024-04-26 17:55:52,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3853942784. Throughput: 0: 50786.6. Samples: 1606728320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:52,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 17:55:53,622][49750] Updated weights for policy 0, policy_version 235231 (0.0034) [2024-04-26 17:55:56,932][49750] Updated weights for policy 0, policy_version 235241 (0.0034) [2024-04-26 17:55:57,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 3854188544. Throughput: 0: 50786.3. Samples: 1607037180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:55:57,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 17:55:59,936][49750] Updated weights for policy 0, policy_version 235251 (0.0032) [2024-04-26 17:56:02,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 3854417920. Throughput: 0: 50712.9. Samples: 1607339080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-26 17:56:02,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 17:56:03,409][49750] Updated weights for policy 0, policy_version 235261 (0.0034) [2024-04-26 17:56:06,508][49750] Updated weights for policy 0, policy_version 235271 (0.0032) [2024-04-26 17:56:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3854696448. Throughput: 0: 50699.5. Samples: 1607491680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:07,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 17:56:09,724][49750] Updated weights for policy 0, policy_version 235281 (0.0029) [2024-04-26 17:56:12,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3854942208. Throughput: 0: 50845.0. Samples: 1607796900. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:12,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 17:56:13,212][49750] Updated weights for policy 0, policy_version 235291 (0.0032) [2024-04-26 17:56:16,149][49750] Updated weights for policy 0, policy_version 235301 (0.0030) [2024-04-26 17:56:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3855204352. Throughput: 0: 50703.6. Samples: 1608096900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:17,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 17:56:19,751][49750] Updated weights for policy 0, policy_version 235311 (0.0029) [2024-04-26 17:56:21,863][49728] Signal inference workers to stop experience collection... (23850 times) [2024-04-26 17:56:21,905][49750] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-04-26 17:56:21,970][49728] Signal inference workers to resume experience collection... (23850 times) [2024-04-26 17:56:21,970][49750] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-04-26 17:56:22,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 3855450112. Throughput: 0: 50781.0. Samples: 1608254420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:22,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 17:56:22,764][49750] Updated weights for policy 0, policy_version 235321 (0.0035) [2024-04-26 17:56:26,210][49750] Updated weights for policy 0, policy_version 235331 (0.0031) [2024-04-26 17:56:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3855695872. Throughput: 0: 50774.3. Samples: 1608555940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:27,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 17:56:29,149][49750] Updated weights for policy 0, policy_version 235341 (0.0026) [2024-04-26 17:56:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3855958016. Throughput: 0: 50815.6. Samples: 1608859660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:32,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 17:56:32,628][49750] Updated weights for policy 0, policy_version 235351 (0.0035) [2024-04-26 17:56:35,583][49750] Updated weights for policy 0, policy_version 235361 (0.0034) [2024-04-26 17:56:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3856220160. Throughput: 0: 50869.1. Samples: 1609017420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:37,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 17:56:38,948][49750] Updated weights for policy 0, policy_version 235371 (0.0034) [2024-04-26 17:56:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 3856465920. Throughput: 0: 50814.8. Samples: 1609323840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:42,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 17:56:42,139][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000235381_3856482304.pth... [2024-04-26 17:56:42,145][49750] Updated weights for policy 0, policy_version 235381 (0.0029) [2024-04-26 17:56:42,204][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000234639_3844325376.pth [2024-04-26 17:56:45,442][49750] Updated weights for policy 0, policy_version 235391 (0.0029) [2024-04-26 17:56:47,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 3856711680. Throughput: 0: 50811.3. Samples: 1609625580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:47,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 17:56:48,640][49750] Updated weights for policy 0, policy_version 235401 (0.0030) [2024-04-26 17:56:51,870][49750] Updated weights for policy 0, policy_version 235411 (0.0035) [2024-04-26 17:56:52,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 3856973824. Throughput: 0: 50623.3. Samples: 1609769740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:52,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 17:56:55,062][49750] Updated weights for policy 0, policy_version 235421 (0.0026) [2024-04-26 17:56:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3857235968. Throughput: 0: 50743.1. Samples: 1610080340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:56:57,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 17:56:58,294][49750] Updated weights for policy 0, policy_version 235431 (0.0033) [2024-04-26 17:57:01,525][49750] Updated weights for policy 0, policy_version 235441 (0.0031) [2024-04-26 17:57:02,062][49517] Fps is (10 sec: 50791.8, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3857481728. Throughput: 0: 50861.8. Samples: 1610385680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:57:02,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 17:57:04,651][49750] Updated weights for policy 0, policy_version 235451 (0.0039) [2024-04-26 17:57:07,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 3857727488. Throughput: 0: 50731.5. Samples: 1610537340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 17:57:07,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 17:57:07,895][49750] Updated weights for policy 0, policy_version 235461 (0.0032) [2024-04-26 17:57:11,114][49750] Updated weights for policy 0, policy_version 235471 (0.0030) [2024-04-26 17:57:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3857989632. Throughput: 0: 50818.7. Samples: 1610842780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:12,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 17:57:14,248][49750] Updated weights for policy 0, policy_version 235481 (0.0037) [2024-04-26 17:57:17,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3858235392. Throughput: 0: 50834.1. Samples: 1611147200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:17,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 17:57:17,706][49750] Updated weights for policy 0, policy_version 235491 (0.0029) [2024-04-26 17:57:20,657][49750] Updated weights for policy 0, policy_version 235501 (0.0039) [2024-04-26 17:57:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3858513920. Throughput: 0: 50737.3. Samples: 1611300600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:22,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 17:57:24,233][49750] Updated weights for policy 0, policy_version 235511 (0.0032) [2024-04-26 17:57:27,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 3858759680. Throughput: 0: 50749.4. Samples: 1611607560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:27,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 17:57:27,187][49750] Updated weights for policy 0, policy_version 235521 (0.0029) [2024-04-26 17:57:30,578][49750] Updated weights for policy 0, policy_version 235531 (0.0032) [2024-04-26 17:57:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 3859005440. Throughput: 0: 50883.5. Samples: 1611915340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:32,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 17:57:33,601][49750] Updated weights for policy 0, policy_version 235541 (0.0035) [2024-04-26 17:57:37,051][49750] Updated weights for policy 0, policy_version 235551 (0.0039) [2024-04-26 17:57:37,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3859267584. Throughput: 0: 50981.5. Samples: 1612063900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 17:57:39,728][49728] Signal inference workers to stop experience collection... (23900 times) [2024-04-26 17:57:39,728][49728] Signal inference workers to resume experience collection... (23900 times) [2024-04-26 17:57:39,753][49750] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-04-26 17:57:39,753][49750] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-04-26 17:57:40,132][49750] Updated weights for policy 0, policy_version 235561 (0.0032) [2024-04-26 17:57:42,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3859513344. Throughput: 0: 50697.6. Samples: 1612361740. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:42,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 17:57:43,486][49750] Updated weights for policy 0, policy_version 235571 (0.0030) [2024-04-26 17:57:46,687][49750] Updated weights for policy 0, policy_version 235581 (0.0031) [2024-04-26 17:57:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3859775488. Throughput: 0: 50754.1. Samples: 1612669620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:47,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 17:57:49,927][49750] Updated weights for policy 0, policy_version 235591 (0.0034) [2024-04-26 17:57:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.5, 300 sec: 50596.0). Total num frames: 3860004864. Throughput: 0: 50815.2. Samples: 1612824020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:52,063][49517] Avg episode reward: [(0, '0.464')] [2024-04-26 17:57:53,119][49750] Updated weights for policy 0, policy_version 235601 (0.0031) [2024-04-26 17:57:56,503][49750] Updated weights for policy 0, policy_version 235611 (0.0028) [2024-04-26 17:57:57,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.2, 300 sec: 50651.6). Total num frames: 3860267008. Throughput: 0: 50688.4. Samples: 1613123760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:57:57,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 17:57:59,567][49750] Updated weights for policy 0, policy_version 235621 (0.0032) [2024-04-26 17:58:02,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3860529152. Throughput: 0: 50670.8. Samples: 1613427380. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:58:02,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 17:58:02,824][49750] Updated weights for policy 0, policy_version 235631 (0.0030) [2024-04-26 17:58:06,118][49750] Updated weights for policy 0, policy_version 235641 (0.0036) [2024-04-26 17:58:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3860791296. Throughput: 0: 50701.6. Samples: 1613582180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:58:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 17:58:09,254][49750] Updated weights for policy 0, policy_version 235651 (0.0036) [2024-04-26 17:58:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3861037056. Throughput: 0: 50715.5. Samples: 1613889760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:58:12,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 17:58:12,398][49750] Updated weights for policy 0, policy_version 235661 (0.0039) [2024-04-26 17:58:15,662][49750] Updated weights for policy 0, policy_version 235671 (0.0033) [2024-04-26 17:58:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 3861299200. Throughput: 0: 50638.6. Samples: 1614194080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 17:58:17,063][49517] Avg episode reward: [(0, '0.680')] [2024-04-26 17:58:18,748][49750] Updated weights for policy 0, policy_version 235681 (0.0031) [2024-04-26 17:58:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3861544960. Throughput: 0: 50658.8. Samples: 1614343540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:58:22,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 17:58:22,167][49750] Updated weights for policy 0, policy_version 235691 (0.0032) [2024-04-26 17:58:25,247][49750] Updated weights for policy 0, policy_version 235701 (0.0027) [2024-04-26 17:58:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3861807104. Throughput: 0: 50825.3. Samples: 1614648880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:58:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 17:58:28,764][49750] Updated weights for policy 0, policy_version 235711 (0.0029) [2024-04-26 17:58:31,693][49750] Updated weights for policy 0, policy_version 235721 (0.0033) [2024-04-26 17:58:32,063][49517] Fps is (10 sec: 52427.4, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3862069248. Throughput: 0: 50696.3. Samples: 1614950960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:58:32,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 17:58:35,152][49750] Updated weights for policy 0, policy_version 235731 (0.0028) [2024-04-26 17:58:37,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3862298624. Throughput: 0: 50854.2. Samples: 1615112460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:58:37,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 17:58:37,992][49750] Updated weights for policy 0, policy_version 235741 (0.0027) [2024-04-26 17:58:41,501][49750] Updated weights for policy 0, policy_version 235751 (0.0028) [2024-04-26 17:58:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3862560768. Throughput: 0: 50902.9. Samples: 1615414400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:58:42,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 17:58:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000235752_3862560768.pth... [2024-04-26 17:58:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000235009_3850387456.pth [2024-04-26 17:58:44,563][49750] Updated weights for policy 0, policy_version 235761 (0.0036) [2024-04-26 17:58:47,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3862806528. Throughput: 0: 50798.1. Samples: 1615713300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:58:47,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 17:58:48,099][49750] Updated weights for policy 0, policy_version 235771 (0.0032) [2024-04-26 17:58:51,248][49750] Updated weights for policy 0, policy_version 235781 (0.0028) [2024-04-26 17:58:52,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3863085056. Throughput: 0: 50731.2. Samples: 1615865080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:58:52,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 17:58:54,626][49750] Updated weights for policy 0, policy_version 235791 (0.0034) [2024-04-26 17:58:54,631][49728] Signal inference workers to stop experience collection... (23950 times) [2024-04-26 17:58:54,632][49728] Signal inference workers to resume experience collection... (23950 times) [2024-04-26 17:58:54,670][49750] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-04-26 17:58:54,670][49750] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-04-26 17:58:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3863298048. Throughput: 0: 50687.6. Samples: 1616170700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:58:57,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 17:58:57,637][49750] Updated weights for policy 0, policy_version 235801 (0.0029) [2024-04-26 17:59:00,934][49750] Updated weights for policy 0, policy_version 235811 (0.0033) [2024-04-26 17:59:02,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 3863576576. Throughput: 0: 50672.4. Samples: 1616474340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:59:02,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 17:59:04,025][49750] Updated weights for policy 0, policy_version 235821 (0.0033) [2024-04-26 17:59:07,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3863838720. Throughput: 0: 50672.7. Samples: 1616623820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:59:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 17:59:07,264][49750] Updated weights for policy 0, policy_version 235831 (0.0035) [2024-04-26 17:59:10,510][49750] Updated weights for policy 0, policy_version 235841 (0.0024) [2024-04-26 17:59:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3864068096. Throughput: 0: 50684.8. Samples: 1616929700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:59:12,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 17:59:13,762][49750] Updated weights for policy 0, policy_version 235851 (0.0035) [2024-04-26 17:59:16,876][49750] Updated weights for policy 0, policy_version 235861 (0.0029) [2024-04-26 17:59:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3864346624. Throughput: 0: 50750.8. Samples: 1617234740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:59:17,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 17:59:20,323][49750] Updated weights for policy 0, policy_version 235871 (0.0031) [2024-04-26 17:59:22,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3864592384. Throughput: 0: 50668.9. Samples: 1617392560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:59:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 17:59:23,148][49750] Updated weights for policy 0, policy_version 235881 (0.0034) [2024-04-26 17:59:26,797][49750] Updated weights for policy 0, policy_version 235891 (0.0028) [2024-04-26 17:59:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3864838144. Throughput: 0: 50779.0. Samples: 1617699440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 17:59:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 17:59:29,631][49750] Updated weights for policy 0, policy_version 235901 (0.0035) [2024-04-26 17:59:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3865100288. Throughput: 0: 50893.0. Samples: 1618003480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 17:59:32,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 17:59:33,377][49750] Updated weights for policy 0, policy_version 235911 (0.0028) [2024-04-26 17:59:36,101][49750] Updated weights for policy 0, policy_version 235921 (0.0023) [2024-04-26 17:59:37,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3865362432. Throughput: 0: 50862.6. Samples: 1618153900. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 17:59:37,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 17:59:39,680][49750] Updated weights for policy 0, policy_version 235931 (0.0032) [2024-04-26 17:59:42,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3865624576. Throughput: 0: 50870.0. Samples: 1618459860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 17:59:42,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 17:59:42,435][49750] Updated weights for policy 0, policy_version 235941 (0.0033) [2024-04-26 17:59:46,019][49750] Updated weights for policy 0, policy_version 235951 (0.0032) [2024-04-26 17:59:47,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3865853952. Throughput: 0: 50862.4. Samples: 1618763140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 17:59:47,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 17:59:49,030][49750] Updated weights for policy 0, policy_version 235961 (0.0036) [2024-04-26 17:59:52,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3866116096. Throughput: 0: 50747.1. Samples: 1618907440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 17:59:52,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 17:59:52,515][49750] Updated weights for policy 0, policy_version 235971 (0.0033) [2024-04-26 17:59:55,609][49750] Updated weights for policy 0, policy_version 235981 (0.0025) [2024-04-26 17:59:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3866378240. Throughput: 0: 50891.2. Samples: 1619219800. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 17:59:57,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 17:59:59,046][49750] Updated weights for policy 0, policy_version 235991 (0.0035) [2024-04-26 18:00:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3866624000. Throughput: 0: 50781.5. Samples: 1619519900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 18:00:02,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 18:00:02,149][49750] Updated weights for policy 0, policy_version 236001 (0.0031) [2024-04-26 18:00:05,369][49750] Updated weights for policy 0, policy_version 236011 (0.0037) [2024-04-26 18:00:07,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3866869760. Throughput: 0: 50771.5. Samples: 1619677280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 18:00:07,072][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 18:00:08,510][49750] Updated weights for policy 0, policy_version 236021 (0.0037) [2024-04-26 18:00:09,911][49728] Signal inference workers to stop experience collection... (24000 times) [2024-04-26 18:00:09,912][49728] Signal inference workers to resume experience collection... (24000 times) [2024-04-26 18:00:09,939][49750] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-04-26 18:00:09,940][49750] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-04-26 18:00:11,786][49750] Updated weights for policy 0, policy_version 236031 (0.0031) [2024-04-26 18:00:12,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3867131904. Throughput: 0: 50684.9. Samples: 1619980260. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 18:00:12,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 18:00:15,105][49750] Updated weights for policy 0, policy_version 236041 (0.0028) [2024-04-26 18:00:17,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3867377664. Throughput: 0: 50675.9. Samples: 1620283900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 18:00:17,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 18:00:18,272][49750] Updated weights for policy 0, policy_version 236051 (0.0030) [2024-04-26 18:00:21,514][49750] Updated weights for policy 0, policy_version 236061 (0.0030) [2024-04-26 18:00:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3867639808. Throughput: 0: 50788.3. Samples: 1620439380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 18:00:22,072][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 18:00:24,838][49750] Updated weights for policy 0, policy_version 236071 (0.0029) [2024-04-26 18:00:27,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3867901952. Throughput: 0: 50645.8. Samples: 1620738920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 18:00:27,072][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 18:00:28,043][49750] Updated weights for policy 0, policy_version 236081 (0.0029) [2024-04-26 18:00:31,559][49750] Updated weights for policy 0, policy_version 236091 (0.0029) [2024-04-26 18:00:32,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3868147712. Throughput: 0: 50779.9. Samples: 1621048240. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 18:00:32,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 18:00:34,658][49750] Updated weights for policy 0, policy_version 236101 (0.0032) [2024-04-26 18:00:37,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3868377088. Throughput: 0: 50949.3. Samples: 1621200160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-26 18:00:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:00:37,942][49750] Updated weights for policy 0, policy_version 236111 (0.0033) [2024-04-26 18:00:41,055][49750] Updated weights for policy 0, policy_version 236121 (0.0036) [2024-04-26 18:00:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 3868655616. Throughput: 0: 50652.6. Samples: 1621499160. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:00:42,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 18:00:42,152][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000236125_3868672000.pth... [2024-04-26 18:00:42,196][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000235381_3856482304.pth [2024-04-26 18:00:44,354][49750] Updated weights for policy 0, policy_version 236131 (0.0032) [2024-04-26 18:00:47,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3868917760. Throughput: 0: 50737.1. Samples: 1621803080. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:00:47,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 18:00:47,468][49750] Updated weights for policy 0, policy_version 236141 (0.0029) [2024-04-26 18:00:50,762][49750] Updated weights for policy 0, policy_version 236151 (0.0034) [2024-04-26 18:00:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3869147136. Throughput: 0: 50685.1. Samples: 1621958100. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:00:52,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 18:00:53,823][49750] Updated weights for policy 0, policy_version 236161 (0.0034) [2024-04-26 18:00:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3869409280. Throughput: 0: 50731.2. Samples: 1622263160. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:00:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 18:00:57,104][49750] Updated weights for policy 0, policy_version 236171 (0.0027) [2024-04-26 18:01:00,509][49750] Updated weights for policy 0, policy_version 236181 (0.0032) [2024-04-26 18:01:02,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3869671424. Throughput: 0: 50787.3. Samples: 1622569320. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 18:01:03,547][49750] Updated weights for policy 0, policy_version 236191 (0.0035) [2024-04-26 18:01:06,874][49750] Updated weights for policy 0, policy_version 236201 (0.0030) [2024-04-26 18:01:07,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3869917184. Throughput: 0: 50949.8. Samples: 1622732120. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:07,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 18:01:10,039][49750] Updated weights for policy 0, policy_version 236211 (0.0036) [2024-04-26 18:01:11,156][49728] Signal inference workers to stop experience collection... (24050 times) [2024-04-26 18:01:11,157][49728] Signal inference workers to resume experience collection... 
(24050 times) [2024-04-26 18:01:11,184][49750] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-04-26 18:01:11,184][49750] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-04-26 18:01:12,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3870195712. Throughput: 0: 50979.1. Samples: 1623032980. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:12,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 18:01:13,205][49750] Updated weights for policy 0, policy_version 236221 (0.0030) [2024-04-26 18:01:16,471][49750] Updated weights for policy 0, policy_version 236231 (0.0031) [2024-04-26 18:01:17,063][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3870425088. Throughput: 0: 50766.2. Samples: 1623332720. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 18:01:19,597][49750] Updated weights for policy 0, policy_version 236241 (0.0031) [2024-04-26 18:01:22,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3870670848. Throughput: 0: 50862.1. Samples: 1623488960. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:22,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 18:01:22,984][49750] Updated weights for policy 0, policy_version 236251 (0.0031) [2024-04-26 18:01:26,087][49750] Updated weights for policy 0, policy_version 236261 (0.0036) [2024-04-26 18:01:27,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3870949376. Throughput: 0: 50872.3. Samples: 1623788420. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:27,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 18:01:29,572][49750] Updated weights for policy 0, policy_version 236271 (0.0032) [2024-04-26 18:01:32,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3871195136. Throughput: 0: 50728.4. Samples: 1624085860. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 18:01:32,530][49750] Updated weights for policy 0, policy_version 236281 (0.0033) [2024-04-26 18:01:36,237][49750] Updated weights for policy 0, policy_version 236291 (0.0033) [2024-04-26 18:01:37,063][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3871440896. Throughput: 0: 50896.8. Samples: 1624248460. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 18:01:38,990][49750] Updated weights for policy 0, policy_version 236301 (0.0027) [2024-04-26 18:01:42,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3871686656. Throughput: 0: 50664.8. Samples: 1624543080. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:42,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 18:01:42,647][49750] Updated weights for policy 0, policy_version 236311 (0.0036) [2024-04-26 18:01:45,425][49750] Updated weights for policy 0, policy_version 236321 (0.0030) [2024-04-26 18:01:47,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3871965184. Throughput: 0: 50572.7. Samples: 1624845100. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-04-26 18:01:47,063][49517] Avg episode reward: [(0, '0.458')] [2024-04-26 18:01:49,015][49750] Updated weights for policy 0, policy_version 236331 (0.0034) [2024-04-26 18:01:51,865][49750] Updated weights for policy 0, policy_version 236341 (0.0031) [2024-04-26 18:01:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3872210944. Throughput: 0: 50618.4. Samples: 1625009940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:01:52,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 18:01:55,394][49750] Updated weights for policy 0, policy_version 236351 (0.0028) [2024-04-26 18:01:57,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3872473088. Throughput: 0: 50662.8. Samples: 1625312800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:01:57,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 18:01:58,423][49750] Updated weights for policy 0, policy_version 236361 (0.0028) [2024-04-26 18:02:01,774][49750] Updated weights for policy 0, policy_version 236371 (0.0033) [2024-04-26 18:02:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3872702464. Throughput: 0: 50877.8. Samples: 1625622220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:02,071][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 18:02:04,864][49750] Updated weights for policy 0, policy_version 236381 (0.0032) [2024-04-26 18:02:07,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3872964608. Throughput: 0: 50798.5. Samples: 1625774880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 18:02:08,031][49750] Updated weights for policy 0, policy_version 236391 (0.0037) [2024-04-26 18:02:11,262][49750] Updated weights for policy 0, policy_version 236401 (0.0036) [2024-04-26 18:02:12,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3873243136. Throughput: 0: 50813.7. Samples: 1626075040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 18:02:13,042][49728] Signal inference workers to stop experience collection... (24100 times) [2024-04-26 18:02:13,078][49750] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-04-26 18:02:13,099][49728] Signal inference workers to resume experience collection... (24100 times) [2024-04-26 18:02:13,101][49750] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-04-26 18:02:14,354][49750] Updated weights for policy 0, policy_version 236411 (0.0037) [2024-04-26 18:02:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3873472512. Throughput: 0: 50855.3. Samples: 1626374340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 18:02:17,699][49750] Updated weights for policy 0, policy_version 236421 (0.0031) [2024-04-26 18:02:20,804][49750] Updated weights for policy 0, policy_version 236431 (0.0029) [2024-04-26 18:02:22,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3873718272. Throughput: 0: 50761.8. Samples: 1626532740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:22,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 18:02:24,122][49750] Updated weights for policy 0, policy_version 236441 (0.0038) [2024-04-26 18:02:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3873980416. Throughput: 0: 50943.1. Samples: 1626835520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:02:27,482][49750] Updated weights for policy 0, policy_version 236451 (0.0031) [2024-04-26 18:02:30,566][49750] Updated weights for policy 0, policy_version 236461 (0.0030) [2024-04-26 18:02:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3874226176. Throughput: 0: 50814.7. Samples: 1627131760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 18:02:34,298][49750] Updated weights for policy 0, policy_version 236471 (0.0036) [2024-04-26 18:02:36,932][49750] Updated weights for policy 0, policy_version 236481 (0.0031) [2024-04-26 18:02:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3874504704. Throughput: 0: 50824.6. Samples: 1627297040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:37,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 18:02:41,016][49750] Updated weights for policy 0, policy_version 236491 (0.0033) [2024-04-26 18:02:42,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3874750464. Throughput: 0: 50905.6. Samples: 1627603560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:42,064][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 18:02:42,128][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000236497_3874766848.pth... 
[2024-04-26 18:02:42,172][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000235752_3862560768.pth [2024-04-26 18:02:43,375][49750] Updated weights for policy 0, policy_version 236501 (0.0033) [2024-04-26 18:02:47,063][49517] Fps is (10 sec: 45874.4, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3874963456. Throughput: 0: 50633.6. Samples: 1627900740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:47,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 18:02:47,442][49750] Updated weights for policy 0, policy_version 236511 (0.0034) [2024-04-26 18:02:49,904][49750] Updated weights for policy 0, policy_version 236521 (0.0037) [2024-04-26 18:02:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3875241984. Throughput: 0: 50522.6. Samples: 1628048400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:52,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 18:02:53,742][49750] Updated weights for policy 0, policy_version 236531 (0.0038) [2024-04-26 18:02:56,243][49750] Updated weights for policy 0, policy_version 236541 (0.0032) [2024-04-26 18:02:57,062][49517] Fps is (10 sec: 57345.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3875536896. Throughput: 0: 50707.3. Samples: 1628356860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-26 18:02:57,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 18:03:00,182][49750] Updated weights for policy 0, policy_version 236551 (0.0033) [2024-04-26 18:03:02,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3875749888. Throughput: 0: 50748.3. Samples: 1628658020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:02,063][49517] Avg episode reward: [(0, '0.467')] [2024-04-26 18:03:02,157][49728] Signal inference workers to stop experience collection... 
(24150 times) [2024-04-26 18:03:02,219][49750] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-04-26 18:03:02,225][49728] Signal inference workers to resume experience collection... (24150 times) [2024-04-26 18:03:02,234][49750] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-04-26 18:03:02,770][49750] Updated weights for policy 0, policy_version 236561 (0.0028) [2024-04-26 18:03:06,671][49750] Updated weights for policy 0, policy_version 236571 (0.0032) [2024-04-26 18:03:07,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3876012032. Throughput: 0: 50656.4. Samples: 1628812280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:07,063][49517] Avg episode reward: [(0, '0.443')] [2024-04-26 18:03:09,250][49750] Updated weights for policy 0, policy_version 236581 (0.0031) [2024-04-26 18:03:12,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3876274176. Throughput: 0: 50709.8. Samples: 1629117460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:12,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 18:03:12,953][49750] Updated weights for policy 0, policy_version 236591 (0.0030) [2024-04-26 18:03:15,644][49750] Updated weights for policy 0, policy_version 236601 (0.0029) [2024-04-26 18:03:17,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3876519936. Throughput: 0: 50940.8. Samples: 1629424100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:17,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 18:03:19,279][49750] Updated weights for policy 0, policy_version 236611 (0.0042) [2024-04-26 18:03:21,954][49750] Updated weights for policy 0, policy_version 236621 (0.0028) [2024-04-26 18:03:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3876798464. Throughput: 0: 50731.0. 
Samples: 1629579940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:03:25,753][49750] Updated weights for policy 0, policy_version 236631 (0.0033) [2024-04-26 18:03:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3877027840. Throughput: 0: 50771.4. Samples: 1629888280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:27,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 18:03:28,284][49750] Updated weights for policy 0, policy_version 236641 (0.0035) [2024-04-26 18:03:32,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3877257216. Throughput: 0: 50855.3. Samples: 1630189220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:32,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 18:03:32,243][49750] Updated weights for policy 0, policy_version 236651 (0.0027) [2024-04-26 18:03:34,712][49750] Updated weights for policy 0, policy_version 236661 (0.0031) [2024-04-26 18:03:37,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3877552128. Throughput: 0: 50852.9. Samples: 1630336780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:37,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 18:03:38,647][49750] Updated weights for policy 0, policy_version 236671 (0.0030) [2024-04-26 18:03:41,292][49750] Updated weights for policy 0, policy_version 236681 (0.0029) [2024-04-26 18:03:42,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3877797888. Throughput: 0: 50805.3. Samples: 1630643100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:42,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:03:45,096][49750] Updated weights for policy 0, policy_version 236691 (0.0028) [2024-04-26 18:03:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51609.7, 300 sec: 50762.6). Total num frames: 3878060032. Throughput: 0: 51018.3. Samples: 1630953840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:47,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 18:03:47,644][49750] Updated weights for policy 0, policy_version 236701 (0.0036) [2024-04-26 18:03:51,504][49750] Updated weights for policy 0, policy_version 236711 (0.0031) [2024-04-26 18:03:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3878289408. Throughput: 0: 50836.1. Samples: 1631099900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:52,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 18:03:54,022][49750] Updated weights for policy 0, policy_version 236721 (0.0032) [2024-04-26 18:03:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3878551552. Throughput: 0: 50874.3. Samples: 1631406800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:03:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 18:03:58,014][49750] Updated weights for policy 0, policy_version 236731 (0.0038) [2024-04-26 18:04:00,701][49750] Updated weights for policy 0, policy_version 236741 (0.0031) [2024-04-26 18:04:02,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3878813696. Throughput: 0: 50727.8. Samples: 1631706840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:04:02,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:04:04,281][49728] Signal inference workers to stop experience collection... 
(24200 times) [2024-04-26 18:04:04,290][49728] Signal inference workers to resume experience collection... (24200 times) [2024-04-26 18:04:04,311][49750] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-04-26 18:04:04,311][49750] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-04-26 18:04:04,414][49750] Updated weights for policy 0, policy_version 236751 (0.0034) [2024-04-26 18:04:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3879059456. Throughput: 0: 50656.1. Samples: 1631859460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 18:04:07,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 18:04:07,304][49750] Updated weights for policy 0, policy_version 236761 (0.0034) [2024-04-26 18:04:10,859][49750] Updated weights for policy 0, policy_version 236771 (0.0034) [2024-04-26 18:04:12,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3879305216. Throughput: 0: 50672.7. Samples: 1632168540. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:12,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 18:04:13,919][49750] Updated weights for policy 0, policy_version 236781 (0.0035) [2024-04-26 18:04:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3879550976. Throughput: 0: 50679.4. Samples: 1632469800. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:17,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 18:04:17,337][49750] Updated weights for policy 0, policy_version 236791 (0.0030) [2024-04-26 18:04:20,427][49750] Updated weights for policy 0, policy_version 236801 (0.0036) [2024-04-26 18:04:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3879829504. Throughput: 0: 50704.5. Samples: 1632618480. 
Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:22,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 18:04:23,721][49750] Updated weights for policy 0, policy_version 236811 (0.0036) [2024-04-26 18:04:27,009][49750] Updated weights for policy 0, policy_version 236821 (0.0037) [2024-04-26 18:04:27,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3880075264. Throughput: 0: 50634.7. Samples: 1632921660. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:27,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 18:04:30,175][49750] Updated weights for policy 0, policy_version 236831 (0.0031) [2024-04-26 18:04:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 3880337408. Throughput: 0: 50548.9. Samples: 1633228540. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 18:04:33,408][49750] Updated weights for policy 0, policy_version 236841 (0.0029) [2024-04-26 18:04:36,625][49750] Updated weights for policy 0, policy_version 236851 (0.0034) [2024-04-26 18:04:37,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3880566784. Throughput: 0: 50627.0. Samples: 1633378120. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:37,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 18:04:39,818][49750] Updated weights for policy 0, policy_version 236861 (0.0028) [2024-04-26 18:04:42,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3880828928. Throughput: 0: 50511.1. Samples: 1633679800. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 18:04:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000236867_3880828928.pth... 
[2024-04-26 18:04:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000236125_3868672000.pth [2024-04-26 18:04:43,179][49750] Updated weights for policy 0, policy_version 236871 (0.0031) [2024-04-26 18:04:46,133][49750] Updated weights for policy 0, policy_version 236881 (0.0027) [2024-04-26 18:04:47,062][49517] Fps is (10 sec: 54067.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3881107456. Throughput: 0: 50708.4. Samples: 1633988720. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:47,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 18:04:49,956][49750] Updated weights for policy 0, policy_version 236891 (0.0027) [2024-04-26 18:04:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3881353216. Throughput: 0: 50653.8. Samples: 1634138880. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:52,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 18:04:52,766][49750] Updated weights for policy 0, policy_version 236901 (0.0029) [2024-04-26 18:04:56,399][49750] Updated weights for policy 0, policy_version 236911 (0.0034) [2024-04-26 18:04:57,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50517.2, 300 sec: 50707.0). Total num frames: 3881582592. Throughput: 0: 50672.7. Samples: 1634448820. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:04:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 18:04:57,946][49728] Signal inference workers to stop experience collection... (24250 times) [2024-04-26 18:04:57,975][49750] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-04-26 18:04:58,047][49728] Signal inference workers to resume experience collection... 
(24250 times) [2024-04-26 18:04:58,047][49750] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-04-26 18:04:59,211][49750] Updated weights for policy 0, policy_version 236921 (0.0026) [2024-04-26 18:05:02,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3881828352. Throughput: 0: 50658.2. Samples: 1634749420. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:05:02,064][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 18:05:02,704][49750] Updated weights for policy 0, policy_version 236931 (0.0034) [2024-04-26 18:05:05,498][49750] Updated weights for policy 0, policy_version 236941 (0.0032) [2024-04-26 18:05:07,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3882106880. Throughput: 0: 50679.5. Samples: 1634899060. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:05:07,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 18:05:08,952][49750] Updated weights for policy 0, policy_version 236951 (0.0030) [2024-04-26 18:05:12,039][49750] Updated weights for policy 0, policy_version 236961 (0.0032) [2024-04-26 18:05:12,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3882369024. Throughput: 0: 50803.1. Samples: 1635207800. Policy #0 lag: (min: 1.0, avg: 12.4, max: 24.0) [2024-04-26 18:05:12,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 18:05:15,368][49750] Updated weights for policy 0, policy_version 236971 (0.0030) [2024-04-26 18:05:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3882614784. Throughput: 0: 50940.8. Samples: 1635520880. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:05:17,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 18:05:18,477][49750] Updated weights for policy 0, policy_version 236981 (0.0028) [2024-04-26 18:05:21,909][49750] Updated weights for policy 0, policy_version 236991 (0.0032) [2024-04-26 18:05:22,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3882860544. Throughput: 0: 50989.9. Samples: 1635672660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:05:22,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 18:05:24,841][49750] Updated weights for policy 0, policy_version 237001 (0.0034) [2024-04-26 18:05:27,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3883122688. Throughput: 0: 51052.8. Samples: 1635977180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:05:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 18:05:28,236][49750] Updated weights for policy 0, policy_version 237011 (0.0031) [2024-04-26 18:05:31,164][49750] Updated weights for policy 0, policy_version 237021 (0.0036) [2024-04-26 18:05:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3883368448. Throughput: 0: 50921.8. Samples: 1636280200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:05:32,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 18:05:34,549][49750] Updated weights for policy 0, policy_version 237031 (0.0024) [2024-04-26 18:05:37,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3883646976. Throughput: 0: 51023.1. Samples: 1636434920. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:05:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 18:05:37,445][49750] Updated weights for policy 0, policy_version 237041 (0.0035) [2024-04-26 18:05:40,938][49750] Updated weights for policy 0, policy_version 237051 (0.0035) [2024-04-26 18:05:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3883892736. Throughput: 0: 50946.0. Samples: 1636741380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:05:42,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 18:05:43,826][49750] Updated weights for policy 0, policy_version 237061 (0.0034) [2024-04-26 18:05:47,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3884122112. Throughput: 0: 51143.8. Samples: 1637050880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:05:47,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 18:05:47,469][49750] Updated weights for policy 0, policy_version 237071 (0.0029) [2024-04-26 18:05:50,297][49750] Updated weights for policy 0, policy_version 237081 (0.0033) [2024-04-26 18:05:52,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3884384256. Throughput: 0: 51033.8. Samples: 1637195580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:05:52,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 18:05:54,054][49750] Updated weights for policy 0, policy_version 237091 (0.0030) [2024-04-26 18:05:56,557][49750] Updated weights for policy 0, policy_version 237101 (0.0028) [2024-04-26 18:05:57,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 3884662784. Throughput: 0: 50964.2. Samples: 1637501200. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:05:57,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 18:06:00,391][49750] Updated weights for policy 0, policy_version 237111 (0.0033) [2024-04-26 18:06:02,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 3884908544. Throughput: 0: 50929.4. Samples: 1637812700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:06:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 18:06:02,949][49750] Updated weights for policy 0, policy_version 237121 (0.0026) [2024-04-26 18:06:06,750][49750] Updated weights for policy 0, policy_version 237131 (0.0033) [2024-04-26 18:06:07,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3885154304. Throughput: 0: 50846.6. Samples: 1637960760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:06:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 18:06:09,930][49750] Updated weights for policy 0, policy_version 237141 (0.0030) [2024-04-26 18:06:12,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3885400064. Throughput: 0: 50758.6. Samples: 1638261320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:06:12,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 18:06:13,058][49728] Signal inference workers to stop experience collection... (24300 times) [2024-04-26 18:06:13,059][49728] Signal inference workers to resume experience collection... 
(24300 times) [2024-04-26 18:06:13,092][49750] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-04-26 18:06:13,092][49750] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-04-26 18:06:13,200][49750] Updated weights for policy 0, policy_version 237151 (0.0028) [2024-04-26 18:06:16,665][49750] Updated weights for policy 0, policy_version 237161 (0.0029) [2024-04-26 18:06:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3885662208. Throughput: 0: 50877.8. Samples: 1638569700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:06:17,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 18:06:19,532][49750] Updated weights for policy 0, policy_version 237171 (0.0030) [2024-04-26 18:06:22,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3885940736. Throughput: 0: 50730.6. Samples: 1638717800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-26 18:06:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:06:23,019][49750] Updated weights for policy 0, policy_version 237181 (0.0031) [2024-04-26 18:06:25,904][49750] Updated weights for policy 0, policy_version 237191 (0.0030) [2024-04-26 18:06:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3886153728. Throughput: 0: 50833.8. Samples: 1639028900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:06:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 18:06:29,522][49750] Updated weights for policy 0, policy_version 237201 (0.0030) [2024-04-26 18:06:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3886432256. Throughput: 0: 50726.2. Samples: 1639333560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:06:32,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 18:06:32,349][49750] Updated weights for policy 0, policy_version 237211 (0.0034) [2024-04-26 18:06:36,034][49750] Updated weights for policy 0, policy_version 237221 (0.0028) [2024-04-26 18:06:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3886661632. Throughput: 0: 50573.8. Samples: 1639471400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:06:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:06:38,879][49750] Updated weights for policy 0, policy_version 237231 (0.0031) [2024-04-26 18:06:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3886923776. Throughput: 0: 50710.5. Samples: 1639783160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:06:42,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 18:06:42,177][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000237240_3886940160.pth... [2024-04-26 18:06:42,233][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000236497_3874766848.pth [2024-04-26 18:06:42,385][49750] Updated weights for policy 0, policy_version 237241 (0.0029) [2024-04-26 18:06:45,194][49750] Updated weights for policy 0, policy_version 237251 (0.0029) [2024-04-26 18:06:47,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3887202304. Throughput: 0: 50607.1. Samples: 1640090020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:06:47,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 18:06:48,742][49750] Updated weights for policy 0, policy_version 237261 (0.0028) [2024-04-26 18:06:51,872][49750] Updated weights for policy 0, policy_version 237271 (0.0034) [2024-04-26 18:06:52,063][49517] Fps is (10 sec: 52427.5, 60 sec: 51063.3, 300 sec: 50762.6). 
Total num frames: 3887448064. Throughput: 0: 50773.2. Samples: 1640245560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:06:52,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 18:06:53,641][49728] Signal inference workers to stop experience collection... (24350 times) [2024-04-26 18:06:53,641][49728] Signal inference workers to resume experience collection... (24350 times) [2024-04-26 18:06:53,669][49750] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-04-26 18:06:53,693][49750] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-04-26 18:06:55,265][49750] Updated weights for policy 0, policy_version 237281 (0.0036) [2024-04-26 18:06:57,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3887677440. Throughput: 0: 50756.7. Samples: 1640545360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:06:57,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 18:06:58,619][49750] Updated weights for policy 0, policy_version 237291 (0.0031) [2024-04-26 18:07:01,797][49750] Updated weights for policy 0, policy_version 237301 (0.0037) [2024-04-26 18:07:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3887939584. Throughput: 0: 50601.3. Samples: 1640846760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:07:02,063][49517] Avg episode reward: [(0, '0.695')] [2024-04-26 18:07:04,947][49750] Updated weights for policy 0, policy_version 237311 (0.0036) [2024-04-26 18:07:07,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3888234496. Throughput: 0: 50762.3. Samples: 1641002100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:07:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 18:07:08,136][49750] Updated weights for policy 0, policy_version 237321 (0.0034) [2024-04-26 18:07:11,243][49750] Updated weights for policy 0, policy_version 237331 (0.0031) [2024-04-26 18:07:12,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3888447488. Throughput: 0: 50705.8. Samples: 1641310660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:07:12,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 18:07:14,766][49750] Updated weights for policy 0, policy_version 237341 (0.0035) [2024-04-26 18:07:17,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3888709632. Throughput: 0: 50727.6. Samples: 1641616300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:07:17,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 18:07:17,808][49750] Updated weights for policy 0, policy_version 237351 (0.0032) [2024-04-26 18:07:21,150][49750] Updated weights for policy 0, policy_version 237361 (0.0035) [2024-04-26 18:07:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3888955392. Throughput: 0: 50847.3. Samples: 1641759540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:07:22,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 18:07:24,258][49750] Updated weights for policy 0, policy_version 237371 (0.0030) [2024-04-26 18:07:27,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3889217536. Throughput: 0: 50743.8. Samples: 1642066640. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:07:27,072][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 18:07:27,706][49750] Updated weights for policy 0, policy_version 237381 (0.0031) [2024-04-26 18:07:30,608][49750] Updated weights for policy 0, policy_version 237391 (0.0030) [2024-04-26 18:07:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.1, 300 sec: 50762.6). Total num frames: 3889479680. Throughput: 0: 50709.4. Samples: 1642371960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 18:07:32,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 18:07:33,998][49750] Updated weights for policy 0, policy_version 237401 (0.0029) [2024-04-26 18:07:36,909][49750] Updated weights for policy 0, policy_version 237411 (0.0036) [2024-04-26 18:07:37,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3889741824. Throughput: 0: 50788.7. Samples: 1642531040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:07:37,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 18:07:40,372][49750] Updated weights for policy 0, policy_version 237421 (0.0026) [2024-04-26 18:07:42,062][49517] Fps is (10 sec: 47515.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3889954816. Throughput: 0: 50727.1. Samples: 1642828080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:07:42,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 18:07:43,497][49750] Updated weights for policy 0, policy_version 237431 (0.0031) [2024-04-26 18:07:46,746][49728] Signal inference workers to stop experience collection... (24400 times) [2024-04-26 18:07:46,746][49728] Signal inference workers to resume experience collection... 
(24400 times) [2024-04-26 18:07:46,766][49750] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-04-26 18:07:46,766][49750] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-04-26 18:07:46,884][49750] Updated weights for policy 0, policy_version 237441 (0.0028) [2024-04-26 18:07:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3890233344. Throughput: 0: 50821.5. Samples: 1643133720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:07:47,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 18:07:49,956][49750] Updated weights for policy 0, policy_version 237451 (0.0036) [2024-04-26 18:07:52,062][49517] Fps is (10 sec: 55705.3, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3890511872. Throughput: 0: 50761.7. Samples: 1643286380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:07:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 18:07:53,405][49750] Updated weights for policy 0, policy_version 237461 (0.0037) [2024-04-26 18:07:56,660][49750] Updated weights for policy 0, policy_version 237471 (0.0036) [2024-04-26 18:07:57,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3890741248. Throughput: 0: 50728.4. Samples: 1643593440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:07:57,072][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 18:07:59,773][49750] Updated weights for policy 0, policy_version 237481 (0.0033) [2024-04-26 18:08:02,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3890987008. Throughput: 0: 50551.9. Samples: 1643891140. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:08:02,072][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 18:08:03,009][49750] Updated weights for policy 0, policy_version 237491 (0.0029) [2024-04-26 18:08:06,145][49750] Updated weights for policy 0, policy_version 237501 (0.0034) [2024-04-26 18:08:07,063][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 3891232768. Throughput: 0: 50675.7. Samples: 1644039940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:08:07,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 18:08:09,546][49750] Updated weights for policy 0, policy_version 237511 (0.0031) [2024-04-26 18:08:12,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3891511296. Throughput: 0: 50805.9. Samples: 1644352900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:08:12,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 18:08:12,595][49750] Updated weights for policy 0, policy_version 237521 (0.0031) [2024-04-26 18:08:15,964][49750] Updated weights for policy 0, policy_version 237531 (0.0033) [2024-04-26 18:08:17,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3891757056. Throughput: 0: 50631.0. Samples: 1644650340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:08:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:08:19,191][49750] Updated weights for policy 0, policy_version 237541 (0.0041) [2024-04-26 18:08:22,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3892002816. Throughput: 0: 50511.4. Samples: 1644804060. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:08:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 18:08:22,339][49750] Updated weights for policy 0, policy_version 237551 (0.0031) [2024-04-26 18:08:26,046][49750] Updated weights for policy 0, policy_version 237561 (0.0032) [2024-04-26 18:08:27,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 3892248576. Throughput: 0: 50644.2. Samples: 1645107080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:08:27,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:08:28,613][49750] Updated weights for policy 0, policy_version 237571 (0.0036) [2024-04-26 18:08:32,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.6, 300 sec: 50707.1). Total num frames: 3892510720. Throughput: 0: 50674.2. Samples: 1645414060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:08:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 18:08:32,498][49750] Updated weights for policy 0, policy_version 237581 (0.0040) [2024-04-26 18:08:35,070][49750] Updated weights for policy 0, policy_version 237591 (0.0033) [2024-04-26 18:08:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3892772864. Throughput: 0: 50536.9. Samples: 1645560540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:08:37,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 18:08:39,010][49750] Updated weights for policy 0, policy_version 237601 (0.0033) [2024-04-26 18:08:41,636][49750] Updated weights for policy 0, policy_version 237611 (0.0026) [2024-04-26 18:08:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 3893035008. Throughput: 0: 50606.6. Samples: 1645870740. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 18:08:42,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 18:08:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000237612_3893035008.pth... [2024-04-26 18:08:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000236867_3880828928.pth [2024-04-26 18:08:45,349][49750] Updated weights for policy 0, policy_version 237621 (0.0031) [2024-04-26 18:08:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3893264384. Throughput: 0: 50848.1. Samples: 1646179300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:08:47,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:08:48,168][49750] Updated weights for policy 0, policy_version 237631 (0.0032) [2024-04-26 18:08:51,717][49750] Updated weights for policy 0, policy_version 237641 (0.0030) [2024-04-26 18:08:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3893526528. Throughput: 0: 50733.4. Samples: 1646322940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:08:52,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 18:08:54,888][49750] Updated weights for policy 0, policy_version 237651 (0.0024) [2024-04-26 18:08:55,484][49728] Signal inference workers to stop experience collection... (24450 times) [2024-04-26 18:08:55,531][49750] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-04-26 18:08:55,553][49728] Signal inference workers to resume experience collection... (24450 times) [2024-04-26 18:08:55,554][49750] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-04-26 18:08:57,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3893788672. Throughput: 0: 50721.4. Samples: 1646635360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:08:57,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 18:08:58,064][49750] Updated weights for policy 0, policy_version 237661 (0.0029) [2024-04-26 18:09:01,369][49750] Updated weights for policy 0, policy_version 237671 (0.0031) [2024-04-26 18:09:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3894018048. Throughput: 0: 50853.0. Samples: 1646938720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 18:09:04,377][49750] Updated weights for policy 0, policy_version 237681 (0.0030) [2024-04-26 18:09:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3894296576. Throughput: 0: 50709.6. Samples: 1647085980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:07,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 18:09:07,861][49750] Updated weights for policy 0, policy_version 237691 (0.0030) [2024-04-26 18:09:10,751][49750] Updated weights for policy 0, policy_version 237701 (0.0027) [2024-04-26 18:09:12,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3894542336. Throughput: 0: 50774.3. Samples: 1647391920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:12,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 18:09:14,235][49750] Updated weights for policy 0, policy_version 237711 (0.0034) [2024-04-26 18:09:17,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3894788096. Throughput: 0: 50789.3. Samples: 1647699580. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:17,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:09:17,374][49750] Updated weights for policy 0, policy_version 237721 (0.0030) [2024-04-26 18:09:20,755][49750] Updated weights for policy 0, policy_version 237731 (0.0031) [2024-04-26 18:09:22,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3895066624. Throughput: 0: 50658.8. Samples: 1647840180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:22,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 18:09:23,872][49750] Updated weights for policy 0, policy_version 237741 (0.0033) [2024-04-26 18:09:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3895296000. Throughput: 0: 50626.3. Samples: 1648148920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 18:09:27,255][49750] Updated weights for policy 0, policy_version 237751 (0.0036) [2024-04-26 18:09:30,304][49750] Updated weights for policy 0, policy_version 237761 (0.0039) [2024-04-26 18:09:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3895558144. Throughput: 0: 50687.6. Samples: 1648460240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:32,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 18:09:33,585][49750] Updated weights for policy 0, policy_version 237771 (0.0033) [2024-04-26 18:09:36,735][49750] Updated weights for policy 0, policy_version 237781 (0.0031) [2024-04-26 18:09:37,063][49517] Fps is (10 sec: 50788.8, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 3895803904. Throughput: 0: 50823.3. Samples: 1648610000. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 18:09:39,848][49750] Updated weights for policy 0, policy_version 237791 (0.0038) [2024-04-26 18:09:42,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3896066048. Throughput: 0: 50739.8. Samples: 1648918660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:42,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 18:09:43,071][49750] Updated weights for policy 0, policy_version 237801 (0.0038) [2024-04-26 18:09:46,415][49750] Updated weights for policy 0, policy_version 237811 (0.0031) [2024-04-26 18:09:47,062][49517] Fps is (10 sec: 50792.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3896311808. Throughput: 0: 50560.0. Samples: 1649213920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:09:49,432][49750] Updated weights for policy 0, policy_version 237821 (0.0030) [2024-04-26 18:09:52,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3896573952. Throughput: 0: 50577.7. Samples: 1649361980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 18:09:52,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 18:09:53,024][49750] Updated weights for policy 0, policy_version 237831 (0.0027) [2024-04-26 18:09:55,907][49750] Updated weights for policy 0, policy_version 237841 (0.0028) [2024-04-26 18:09:57,045][49728] Signal inference workers to stop experience collection... (24500 times) [2024-04-26 18:09:57,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 3896803328. Throughput: 0: 50629.5. Samples: 1649670240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:09:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 18:09:57,097][49750] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-04-26 18:09:57,115][49728] Signal inference workers to resume experience collection... (24500 times) [2024-04-26 18:09:57,116][49750] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-04-26 18:09:59,693][49750] Updated weights for policy 0, policy_version 237851 (0.0029) [2024-04-26 18:10:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3897081856. Throughput: 0: 50707.8. Samples: 1649981440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 18:10:02,276][49750] Updated weights for policy 0, policy_version 237861 (0.0026) [2024-04-26 18:10:06,178][49750] Updated weights for policy 0, policy_version 237871 (0.0031) [2024-04-26 18:10:07,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 3897327616. Throughput: 0: 50912.4. Samples: 1650131240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:07,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 18:10:08,567][49750] Updated weights for policy 0, policy_version 237881 (0.0033) [2024-04-26 18:10:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3897573376. Throughput: 0: 50791.5. Samples: 1650434540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:12,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 18:10:12,475][49750] Updated weights for policy 0, policy_version 237891 (0.0031) [2024-04-26 18:10:15,025][49750] Updated weights for policy 0, policy_version 237901 (0.0031) [2024-04-26 18:10:17,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3897868288. Throughput: 0: 50790.6. 
Samples: 1650745820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:17,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 18:10:18,779][49750] Updated weights for policy 0, policy_version 237911 (0.0029) [2024-04-26 18:10:21,786][49750] Updated weights for policy 0, policy_version 237921 (0.0029) [2024-04-26 18:10:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3898097664. Throughput: 0: 50842.1. Samples: 1650897880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:22,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:10:25,311][49750] Updated weights for policy 0, policy_version 237931 (0.0030) [2024-04-26 18:10:27,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3898343424. Throughput: 0: 50719.6. Samples: 1651201040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:27,063][49517] Avg episode reward: [(0, '0.683')] [2024-04-26 18:10:28,552][49750] Updated weights for policy 0, policy_version 237941 (0.0032) [2024-04-26 18:10:31,727][49750] Updated weights for policy 0, policy_version 237951 (0.0031) [2024-04-26 18:10:32,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.1, 300 sec: 50651.5). Total num frames: 3898589184. Throughput: 0: 50889.2. Samples: 1651503940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:32,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 18:10:35,063][49750] Updated weights for policy 0, policy_version 237961 (0.0030) [2024-04-26 18:10:37,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 3898851328. Throughput: 0: 50904.3. Samples: 1651652680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:10:38,035][49750] Updated weights for policy 0, policy_version 237971 (0.0035) [2024-04-26 18:10:41,398][49750] Updated weights for policy 0, policy_version 237981 (0.0030) [2024-04-26 18:10:42,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3899097088. Throughput: 0: 50793.2. Samples: 1651955940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:42,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 18:10:42,145][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000237983_3899113472.pth... [2024-04-26 18:10:42,196][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000237240_3886940160.pth [2024-04-26 18:10:44,431][49750] Updated weights for policy 0, policy_version 237991 (0.0037) [2024-04-26 18:10:47,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3899375616. Throughput: 0: 50578.3. Samples: 1652257460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:47,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 18:10:47,741][49750] Updated weights for policy 0, policy_version 238001 (0.0035) [2024-04-26 18:10:50,975][49750] Updated weights for policy 0, policy_version 238011 (0.0031) [2024-04-26 18:10:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3899604992. Throughput: 0: 50629.4. Samples: 1652409560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:52,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 18:10:54,270][49750] Updated weights for policy 0, policy_version 238021 (0.0030) [2024-04-26 18:10:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3899867136. Throughput: 0: 50693.5. Samples: 1652715740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:10:57,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 18:10:57,896][49750] Updated weights for policy 0, policy_version 238031 (0.0031) [2024-04-26 18:10:58,149][49728] Signal inference workers to stop experience collection... (24550 times) [2024-04-26 18:10:58,155][49728] Signal inference workers to resume experience collection... (24550 times) [2024-04-26 18:10:58,178][49750] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-04-26 18:10:58,179][49750] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-04-26 18:11:00,602][49750] Updated weights for policy 0, policy_version 238041 (0.0035) [2024-04-26 18:11:02,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3900129280. Throughput: 0: 50520.0. Samples: 1653019220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-26 18:11:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 18:11:04,440][49750] Updated weights for policy 0, policy_version 238051 (0.0031) [2024-04-26 18:11:07,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3900375040. Throughput: 0: 50685.8. Samples: 1653178740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 18:11:07,116][49750] Updated weights for policy 0, policy_version 238061 (0.0032) [2024-04-26 18:11:10,992][49750] Updated weights for policy 0, policy_version 238071 (0.0029) [2024-04-26 18:11:12,062][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3900637184. Throughput: 0: 50688.0. Samples: 1653482000. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 18:11:13,551][49750] Updated weights for policy 0, policy_version 238081 (0.0038) [2024-04-26 18:11:17,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 50596.0). Total num frames: 3900866560. Throughput: 0: 50502.2. Samples: 1653776540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:17,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 18:11:17,401][49750] Updated weights for policy 0, policy_version 238091 (0.0036) [2024-04-26 18:11:20,047][49750] Updated weights for policy 0, policy_version 238101 (0.0031) [2024-04-26 18:11:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3901128704. Throughput: 0: 50560.5. Samples: 1653927900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:22,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 18:11:23,861][49750] Updated weights for policy 0, policy_version 238111 (0.0030) [2024-04-26 18:11:26,541][49750] Updated weights for policy 0, policy_version 238121 (0.0033) [2024-04-26 18:11:27,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3901390848. Throughput: 0: 50600.5. Samples: 1654232960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:27,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 18:11:30,355][49750] Updated weights for policy 0, policy_version 238131 (0.0035) [2024-04-26 18:11:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3901636608. Throughput: 0: 50731.5. Samples: 1654540380. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 18:11:32,953][49750] Updated weights for policy 0, policy_version 238141 (0.0028) [2024-04-26 18:11:36,663][49750] Updated weights for policy 0, policy_version 238151 (0.0036) [2024-04-26 18:11:37,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3901882368. Throughput: 0: 50531.4. Samples: 1654683480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:37,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 18:11:39,530][49750] Updated weights for policy 0, policy_version 238161 (0.0028) [2024-04-26 18:11:42,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3902144512. Throughput: 0: 50644.4. Samples: 1654994740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:42,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 18:11:42,962][49750] Updated weights for policy 0, policy_version 238171 (0.0032) [2024-04-26 18:11:45,864][49750] Updated weights for policy 0, policy_version 238181 (0.0039) [2024-04-26 18:11:47,063][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3902423040. Throughput: 0: 50714.1. Samples: 1655301360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:47,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 18:11:49,419][49750] Updated weights for policy 0, policy_version 238191 (0.0039) [2024-04-26 18:11:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3902668800. Throughput: 0: 50656.6. Samples: 1655458280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:52,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 18:11:52,200][49750] Updated weights for policy 0, policy_version 238201 (0.0029) [2024-04-26 18:11:55,863][49750] Updated weights for policy 0, policy_version 238211 (0.0032) [2024-04-26 18:11:57,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3902914560. Throughput: 0: 50796.1. Samples: 1655767820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:11:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 18:11:58,685][49750] Updated weights for policy 0, policy_version 238221 (0.0031) [2024-04-26 18:12:02,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50540.5). Total num frames: 3903143936. Throughput: 0: 50948.1. Samples: 1656069200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:12:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 18:12:02,235][49750] Updated weights for policy 0, policy_version 238231 (0.0030) [2024-04-26 18:12:05,124][49750] Updated weights for policy 0, policy_version 238241 (0.0032) [2024-04-26 18:12:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3903422464. Throughput: 0: 50899.1. Samples: 1656218360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:12:07,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 18:12:08,593][49750] Updated weights for policy 0, policy_version 238251 (0.0035) [2024-04-26 18:12:11,481][49750] Updated weights for policy 0, policy_version 238261 (0.0026) [2024-04-26 18:12:12,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3903684608. Throughput: 0: 50864.1. Samples: 1656521840. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-26 18:12:12,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 18:12:15,315][49750] Updated weights for policy 0, policy_version 238271 (0.0028) [2024-04-26 18:12:17,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3903930368. Throughput: 0: 50804.2. Samples: 1656826560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:12:17,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 18:12:17,993][49750] Updated weights for policy 0, policy_version 238281 (0.0031) [2024-04-26 18:12:21,684][49750] Updated weights for policy 0, policy_version 238291 (0.0032) [2024-04-26 18:12:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3904176128. Throughput: 0: 51095.7. Samples: 1656982780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:12:22,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 18:12:24,424][49750] Updated weights for policy 0, policy_version 238301 (0.0030) [2024-04-26 18:12:27,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 3904421888. Throughput: 0: 50854.0. Samples: 1657283180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:12:27,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 18:12:27,261][49728] Signal inference workers to stop experience collection... (24600 times) [2024-04-26 18:12:27,305][49750] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-04-26 18:12:27,328][49728] Signal inference workers to resume experience collection... 
(24600 times) [2024-04-26 18:12:27,329][49750] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-04-26 18:12:28,000][49750] Updated weights for policy 0, policy_version 238311 (0.0033) [2024-04-26 18:12:30,858][49750] Updated weights for policy 0, policy_version 238321 (0.0038) [2024-04-26 18:12:32,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 3904700416. Throughput: 0: 50852.0. Samples: 1657589700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:12:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 18:12:34,494][49750] Updated weights for policy 0, policy_version 238331 (0.0040) [2024-04-26 18:12:37,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3904962560. Throughput: 0: 50911.8. Samples: 1657749320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:12:37,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 18:12:37,188][49750] Updated weights for policy 0, policy_version 238341 (0.0029) [2024-04-26 18:12:40,844][49750] Updated weights for policy 0, policy_version 238351 (0.0028) [2024-04-26 18:12:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3905208320. Throughput: 0: 50810.9. Samples: 1658054320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:12:42,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 18:12:42,194][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000238356_3905224704.pth... [2024-04-26 18:12:42,240][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000237612_3893035008.pth [2024-04-26 18:12:43,951][49750] Updated weights for policy 0, policy_version 238361 (0.0033) [2024-04-26 18:12:47,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3905454080. Throughput: 0: 51005.2. Samples: 1658364440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:12:47,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 18:12:47,093][49750] Updated weights for policy 0, policy_version 238371 (0.0026) [2024-04-26 18:12:50,465][49750] Updated weights for policy 0, policy_version 238381 (0.0031) [2024-04-26 18:12:52,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3905716224. Throughput: 0: 50971.3. Samples: 1658512060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:12:52,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 18:12:53,447][49750] Updated weights for policy 0, policy_version 238391 (0.0032) [2024-04-26 18:12:56,824][49750] Updated weights for policy 0, policy_version 238401 (0.0034) [2024-04-26 18:12:57,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3905961984. Throughput: 0: 50984.3. Samples: 1658816140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:12:57,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:12:59,926][49750] Updated weights for policy 0, policy_version 238411 (0.0033) [2024-04-26 18:13:02,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3906224128. Throughput: 0: 50982.6. Samples: 1659120780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:13:02,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 18:13:03,279][49750] Updated weights for policy 0, policy_version 238421 (0.0030) [2024-04-26 18:13:06,343][49750] Updated weights for policy 0, policy_version 238431 (0.0031) [2024-04-26 18:13:07,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3906486272. Throughput: 0: 51103.2. Samples: 1659282420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:13:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 18:13:09,633][49750] Updated weights for policy 0, policy_version 238441 (0.0033) [2024-04-26 18:13:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3906715648. Throughput: 0: 51249.4. Samples: 1659589400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:13:12,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 18:13:12,688][49750] Updated weights for policy 0, policy_version 238451 (0.0029) [2024-04-26 18:13:15,925][49750] Updated weights for policy 0, policy_version 238461 (0.0033) [2024-04-26 18:13:17,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3906977792. Throughput: 0: 51204.9. Samples: 1659893920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:13:17,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 18:13:19,204][49750] Updated weights for policy 0, policy_version 238471 (0.0033) [2024-04-26 18:13:22,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3907256320. Throughput: 0: 51071.6. Samples: 1660047540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-26 18:13:22,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 18:13:22,192][49750] Updated weights for policy 0, policy_version 238481 (0.0029) [2024-04-26 18:13:25,632][49750] Updated weights for policy 0, policy_version 238491 (0.0030) [2024-04-26 18:13:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 3907502080. Throughput: 0: 51145.9. Samples: 1660355880. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:13:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 18:13:28,686][49750] Updated weights for policy 0, policy_version 238501 (0.0029) [2024-04-26 18:13:31,951][49750] Updated weights for policy 0, policy_version 238511 (0.0030) [2024-04-26 18:13:32,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3907764224. Throughput: 0: 51104.8. Samples: 1660664160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:13:32,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 18:13:35,286][49750] Updated weights for policy 0, policy_version 238521 (0.0033) [2024-04-26 18:13:37,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 3908009984. Throughput: 0: 51083.5. Samples: 1660810820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:13:37,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 18:13:38,264][49750] Updated weights for policy 0, policy_version 238531 (0.0032) [2024-04-26 18:13:41,725][49750] Updated weights for policy 0, policy_version 238541 (0.0033) [2024-04-26 18:13:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3908272128. Throughput: 0: 51153.9. Samples: 1661118060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:13:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 18:13:44,639][49750] Updated weights for policy 0, policy_version 238551 (0.0036) [2024-04-26 18:13:45,526][49728] Signal inference workers to stop experience collection... (24650 times) [2024-04-26 18:13:45,526][49728] Signal inference workers to resume experience collection... 
(24650 times) [2024-04-26 18:13:45,554][49750] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-04-26 18:13:45,554][49750] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-04-26 18:13:47,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3908534272. Throughput: 0: 51152.8. Samples: 1661422660. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:13:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 18:13:48,118][49750] Updated weights for policy 0, policy_version 238561 (0.0030) [2024-04-26 18:13:51,107][49750] Updated weights for policy 0, policy_version 238571 (0.0034) [2024-04-26 18:13:52,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 3908796416. Throughput: 0: 51081.7. Samples: 1661581100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:13:52,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 18:13:54,592][49750] Updated weights for policy 0, policy_version 238581 (0.0034) [2024-04-26 18:13:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3909025792. Throughput: 0: 51158.3. Samples: 1661891520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:13:57,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 18:13:57,479][49750] Updated weights for policy 0, policy_version 238591 (0.0033) [2024-04-26 18:14:00,955][49750] Updated weights for policy 0, policy_version 238601 (0.0030) [2024-04-26 18:14:02,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3909255168. Throughput: 0: 51022.3. Samples: 1662189920. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:14:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 18:14:04,081][49750] Updated weights for policy 0, policy_version 238611 (0.0027) [2024-04-26 18:14:07,063][49517] Fps is (10 sec: 50788.9, 60 sec: 50790.1, 300 sec: 50818.1). Total num frames: 3909533696. Throughput: 0: 50927.2. Samples: 1662339280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:14:07,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 18:14:07,325][49750] Updated weights for policy 0, policy_version 238621 (0.0028) [2024-04-26 18:14:10,378][49750] Updated weights for policy 0, policy_version 238631 (0.0023) [2024-04-26 18:14:12,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51609.7, 300 sec: 50929.2). Total num frames: 3909812224. Throughput: 0: 50817.3. Samples: 1662642660. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:14:12,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 18:14:13,849][49750] Updated weights for policy 0, policy_version 238641 (0.0036) [2024-04-26 18:14:17,018][49750] Updated weights for policy 0, policy_version 238651 (0.0033) [2024-04-26 18:14:17,063][49517] Fps is (10 sec: 52429.8, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 3910057984. Throughput: 0: 50916.5. Samples: 1662955400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:14:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:14:20,276][49750] Updated weights for policy 0, policy_version 238661 (0.0032) [2024-04-26 18:14:22,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 3910287360. Throughput: 0: 50945.2. Samples: 1663103360. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:14:22,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 18:14:23,571][49750] Updated weights for policy 0, policy_version 238671 (0.0032) [2024-04-26 18:14:26,806][49750] Updated weights for policy 0, policy_version 238681 (0.0038) [2024-04-26 18:14:27,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3910549504. Throughput: 0: 50811.5. Samples: 1663404580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:14:27,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 18:14:30,002][49750] Updated weights for policy 0, policy_version 238691 (0.0032) [2024-04-26 18:14:32,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3910811648. Throughput: 0: 50702.2. Samples: 1663704260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-26 18:14:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 18:14:33,261][49750] Updated weights for policy 0, policy_version 238701 (0.0034) [2024-04-26 18:14:36,426][49750] Updated weights for policy 0, policy_version 238711 (0.0033) [2024-04-26 18:14:37,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 3911090176. Throughput: 0: 50727.9. Samples: 1663863860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:14:37,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 18:14:39,817][49750] Updated weights for policy 0, policy_version 238721 (0.0037) [2024-04-26 18:14:42,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 3911303168. Throughput: 0: 50664.2. Samples: 1664171420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:14:42,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 18:14:42,190][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000238728_3911319552.pth... 
[2024-04-26 18:14:42,234][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000237983_3899113472.pth [2024-04-26 18:14:42,821][49750] Updated weights for policy 0, policy_version 238731 (0.0031) [2024-04-26 18:14:46,294][49750] Updated weights for policy 0, policy_version 238741 (0.0031) [2024-04-26 18:14:47,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3911548928. Throughput: 0: 50764.9. Samples: 1664474340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:14:47,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 18:14:49,177][49750] Updated weights for policy 0, policy_version 238751 (0.0032) [2024-04-26 18:14:52,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 3911811072. Throughput: 0: 50635.0. Samples: 1664617840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:14:52,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 18:14:53,044][49750] Updated weights for policy 0, policy_version 238761 (0.0029) [2024-04-26 18:14:55,633][49750] Updated weights for policy 0, policy_version 238771 (0.0033) [2024-04-26 18:14:57,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3912089600. Throughput: 0: 50768.0. Samples: 1664927220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:14:57,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 18:14:59,383][49750] Updated weights for policy 0, policy_version 238781 (0.0032) [2024-04-26 18:15:02,055][49750] Updated weights for policy 0, policy_version 238791 (0.0028) [2024-04-26 18:15:02,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 3912351744. Throughput: 0: 50719.7. Samples: 1665237780. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:15:02,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 18:15:03,210][49728] Signal inference workers to stop experience collection... (24700 times) [2024-04-26 18:15:03,240][49750] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-04-26 18:15:03,276][49728] Signal inference workers to resume experience collection... (24700 times) [2024-04-26 18:15:03,276][49750] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-04-26 18:15:05,780][49750] Updated weights for policy 0, policy_version 238801 (0.0026) [2024-04-26 18:15:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.7, 300 sec: 50873.7). Total num frames: 3912581120. Throughput: 0: 50814.8. Samples: 1665390020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:15:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 18:15:08,533][49750] Updated weights for policy 0, policy_version 238811 (0.0030) [2024-04-26 18:15:12,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3912826880. Throughput: 0: 50739.6. Samples: 1665687860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:15:12,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:15:12,170][49750] Updated weights for policy 0, policy_version 238821 (0.0031) [2024-04-26 18:15:15,076][49750] Updated weights for policy 0, policy_version 238831 (0.0032) [2024-04-26 18:15:17,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3913105408. Throughput: 0: 50770.3. Samples: 1665988920. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:15:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 18:15:18,491][49750] Updated weights for policy 0, policy_version 238841 (0.0034) [2024-04-26 18:15:21,565][49750] Updated weights for policy 0, policy_version 238851 (0.0031) [2024-04-26 18:15:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3913351168. Throughput: 0: 50848.0. Samples: 1666152020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:15:22,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 18:15:24,941][49750] Updated weights for policy 0, policy_version 238861 (0.0032) [2024-04-26 18:15:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3913596928. Throughput: 0: 50655.8. Samples: 1666450920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:15:27,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 18:15:27,998][49750] Updated weights for policy 0, policy_version 238871 (0.0032) [2024-04-26 18:15:31,394][49750] Updated weights for policy 0, policy_version 238881 (0.0027) [2024-04-26 18:15:32,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3913842688. Throughput: 0: 50699.3. Samples: 1666755820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:15:32,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 18:15:34,377][49750] Updated weights for policy 0, policy_version 238891 (0.0032) [2024-04-26 18:15:37,063][49517] Fps is (10 sec: 49151.0, 60 sec: 49971.2, 300 sec: 50818.2). Total num frames: 3914088448. Throughput: 0: 50844.2. Samples: 1666905840. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-26 18:15:37,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 18:15:37,977][49750] Updated weights for policy 0, policy_version 238901 (0.0029) [2024-04-26 18:15:40,769][49750] Updated weights for policy 0, policy_version 238911 (0.0034) [2024-04-26 18:15:42,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 3914383360. Throughput: 0: 50766.7. Samples: 1667211720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:15:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 18:15:44,426][49750] Updated weights for policy 0, policy_version 238921 (0.0030) [2024-04-26 18:15:47,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3914612736. Throughput: 0: 50700.5. Samples: 1667519300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:15:47,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 18:15:47,306][49750] Updated weights for policy 0, policy_version 238931 (0.0026) [2024-04-26 18:15:50,821][49750] Updated weights for policy 0, policy_version 238941 (0.0029) [2024-04-26 18:15:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3914874880. Throughput: 0: 50785.7. Samples: 1667675380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:15:52,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 18:15:53,681][49750] Updated weights for policy 0, policy_version 238951 (0.0031) [2024-04-26 18:15:57,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3915120640. Throughput: 0: 50872.5. Samples: 1667977120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:15:57,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-26 18:15:57,218][49750] Updated weights for policy 0, policy_version 238961 (0.0029) [2024-04-26 18:16:00,013][49750] Updated weights for policy 0, policy_version 238971 (0.0027) [2024-04-26 18:16:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3915382784. Throughput: 0: 50860.1. Samples: 1668277620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:02,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 18:16:03,694][49750] Updated weights for policy 0, policy_version 238981 (0.0038) [2024-04-26 18:16:06,511][49750] Updated weights for policy 0, policy_version 238991 (0.0034) [2024-04-26 18:16:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3915644928. Throughput: 0: 50927.3. Samples: 1668443740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:07,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:16:10,089][49750] Updated weights for policy 0, policy_version 239001 (0.0031) [2024-04-26 18:16:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 3915890688. Throughput: 0: 51065.0. Samples: 1668748840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:12,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 18:16:12,993][49750] Updated weights for policy 0, policy_version 239011 (0.0029) [2024-04-26 18:16:16,413][49750] Updated weights for policy 0, policy_version 239021 (0.0034) [2024-04-26 18:16:17,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 3916120064. Throughput: 0: 50864.5. Samples: 1669044720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:17,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 18:16:19,389][49750] Updated weights for policy 0, policy_version 239031 (0.0043) [2024-04-26 18:16:22,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3916398592. Throughput: 0: 50855.3. Samples: 1669194320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:22,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 18:16:22,749][49750] Updated weights for policy 0, policy_version 239041 (0.0033) [2024-04-26 18:16:24,228][49728] Signal inference workers to stop experience collection... (24750 times) [2024-04-26 18:16:24,261][49750] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-04-26 18:16:24,296][49728] Signal inference workers to resume experience collection... (24750 times) [2024-04-26 18:16:24,296][49750] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-04-26 18:16:25,702][49750] Updated weights for policy 0, policy_version 239051 (0.0029) [2024-04-26 18:16:27,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 3916660736. Throughput: 0: 50895.4. Samples: 1669502020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:27,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 18:16:29,252][49750] Updated weights for policy 0, policy_version 239061 (0.0028) [2024-04-26 18:16:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.7, 300 sec: 50929.3). Total num frames: 3916906496. Throughput: 0: 50851.1. Samples: 1669807600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:32,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 18:16:32,204][49750] Updated weights for policy 0, policy_version 239071 (0.0028) [2024-04-26 18:16:35,737][49750] Updated weights for policy 0, policy_version 239081 (0.0029) [2024-04-26 18:16:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.7, 300 sec: 50873.7). Total num frames: 3917152256. Throughput: 0: 50872.5. Samples: 1669964640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:37,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 18:16:38,748][49750] Updated weights for policy 0, policy_version 239091 (0.0029) [2024-04-26 18:16:42,062][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3917414400. Throughput: 0: 51005.7. Samples: 1670272380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:42,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 18:16:42,088][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000239101_3917430784.pth... [2024-04-26 18:16:42,092][49750] Updated weights for policy 0, policy_version 239101 (0.0035) [2024-04-26 18:16:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000238356_3905224704.pth [2024-04-26 18:16:45,064][49750] Updated weights for policy 0, policy_version 239111 (0.0033) [2024-04-26 18:16:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3917660160. Throughput: 0: 50958.3. Samples: 1670570740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 18:16:47,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 18:16:48,967][49750] Updated weights for policy 0, policy_version 239121 (0.0028) [2024-04-26 18:16:51,426][49750] Updated weights for policy 0, policy_version 239131 (0.0028) [2024-04-26 18:16:52,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.4, 300 sec: 50984.8). 
Total num frames: 3917955072. Throughput: 0: 50811.8. Samples: 1670730280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:16:52,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:16:55,555][49750] Updated weights for policy 0, policy_version 239141 (0.0031) [2024-04-26 18:16:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 3918184448. Throughput: 0: 50817.1. Samples: 1671035620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:16:57,063][49517] Avg episode reward: [(0, '0.425')] [2024-04-26 18:16:58,007][49750] Updated weights for policy 0, policy_version 239151 (0.0029) [2024-04-26 18:17:02,055][49750] Updated weights for policy 0, policy_version 239161 (0.0030) [2024-04-26 18:17:02,063][49517] Fps is (10 sec: 45875.2, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3918413824. Throughput: 0: 50864.0. Samples: 1671333600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 18:17:04,478][49750] Updated weights for policy 0, policy_version 239171 (0.0029) [2024-04-26 18:17:07,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3918675968. Throughput: 0: 50828.8. Samples: 1671481620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 18:17:08,351][49750] Updated weights for policy 0, policy_version 239181 (0.0031) [2024-04-26 18:17:10,834][49750] Updated weights for policy 0, policy_version 239191 (0.0027) [2024-04-26 18:17:12,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 3918954496. Throughput: 0: 50928.6. Samples: 1671793800. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 18:17:14,702][49750] Updated weights for policy 0, policy_version 239201 (0.0036) [2024-04-26 18:17:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 3919200256. Throughput: 0: 50832.8. Samples: 1672095080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:17,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 18:17:17,318][49750] Updated weights for policy 0, policy_version 239211 (0.0029) [2024-04-26 18:17:21,115][49750] Updated weights for policy 0, policy_version 239221 (0.0033) [2024-04-26 18:17:21,891][49728] Signal inference workers to stop experience collection... (24800 times) [2024-04-26 18:17:21,926][49750] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-04-26 18:17:21,956][49728] Signal inference workers to resume experience collection... (24800 times) [2024-04-26 18:17:21,959][49750] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-04-26 18:17:22,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 3919429632. Throughput: 0: 50779.0. Samples: 1672249700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:22,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 18:17:23,742][49750] Updated weights for policy 0, policy_version 239231 (0.0027) [2024-04-26 18:17:27,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3919691776. Throughput: 0: 50783.1. Samples: 1672557620. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 18:17:27,446][49750] Updated weights for policy 0, policy_version 239241 (0.0028) [2024-04-26 18:17:30,242][49750] Updated weights for policy 0, policy_version 239251 (0.0028) [2024-04-26 18:17:32,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3919937536. Throughput: 0: 50966.5. Samples: 1672864240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:32,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 18:17:33,818][49750] Updated weights for policy 0, policy_version 239261 (0.0034) [2024-04-26 18:17:36,580][49750] Updated weights for policy 0, policy_version 239271 (0.0037) [2024-04-26 18:17:37,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 3920232448. Throughput: 0: 50794.2. Samples: 1673016020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:37,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 18:17:40,291][49750] Updated weights for policy 0, policy_version 239281 (0.0032) [2024-04-26 18:17:42,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3920478208. Throughput: 0: 50712.9. Samples: 1673317700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:42,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 18:17:42,976][49750] Updated weights for policy 0, policy_version 239291 (0.0030) [2024-04-26 18:17:46,794][49750] Updated weights for policy 0, policy_version 239301 (0.0027) [2024-04-26 18:17:47,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3920707584. Throughput: 0: 50909.3. Samples: 1673624520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:47,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 18:17:49,560][49750] Updated weights for policy 0, policy_version 239311 (0.0038) [2024-04-26 18:17:52,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.3, 300 sec: 50818.2). Total num frames: 3920953344. Throughput: 0: 50806.4. Samples: 1673767900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 18:17:53,390][49750] Updated weights for policy 0, policy_version 239321 (0.0031) [2024-04-26 18:17:56,068][49750] Updated weights for policy 0, policy_version 239331 (0.0031) [2024-04-26 18:17:57,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3921231872. Throughput: 0: 50608.3. Samples: 1674071180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 18:17:57,063][49517] Avg episode reward: [(0, '0.467')] [2024-04-26 18:17:59,717][49750] Updated weights for policy 0, policy_version 239341 (0.0034) [2024-04-26 18:18:02,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3921494016. Throughput: 0: 50718.5. Samples: 1674377420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:02,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 18:18:02,390][49750] Updated weights for policy 0, policy_version 239351 (0.0027) [2024-04-26 18:18:06,270][49750] Updated weights for policy 0, policy_version 239361 (0.0033) [2024-04-26 18:18:07,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3921739776. Throughput: 0: 50672.4. Samples: 1674529960. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:07,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 18:18:08,915][49750] Updated weights for policy 0, policy_version 239371 (0.0030) [2024-04-26 18:18:12,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.1, 300 sec: 50818.1). Total num frames: 3921969152. Throughput: 0: 50667.9. Samples: 1674837680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:12,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 18:18:12,745][49750] Updated weights for policy 0, policy_version 239381 (0.0036) [2024-04-26 18:18:15,508][49750] Updated weights for policy 0, policy_version 239391 (0.0026) [2024-04-26 18:18:17,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3922231296. Throughput: 0: 50594.4. Samples: 1675140980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:17,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 18:18:19,084][49750] Updated weights for policy 0, policy_version 239401 (0.0040) [2024-04-26 18:18:19,687][49728] Signal inference workers to stop experience collection... (24850 times) [2024-04-26 18:18:19,688][49728] Signal inference workers to resume experience collection... (24850 times) [2024-04-26 18:18:19,709][49750] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-04-26 18:18:19,714][49750] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-04-26 18:18:22,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3922493440. Throughput: 0: 50663.8. Samples: 1675295880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:22,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 18:18:22,113][49750] Updated weights for policy 0, policy_version 239411 (0.0029) [2024-04-26 18:18:25,388][49750] Updated weights for policy 0, policy_version 239421 (0.0031) [2024-04-26 18:18:27,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3922771968. Throughput: 0: 50730.1. Samples: 1675600560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 18:18:28,461][49750] Updated weights for policy 0, policy_version 239431 (0.0031) [2024-04-26 18:18:31,885][49750] Updated weights for policy 0, policy_version 239441 (0.0029) [2024-04-26 18:18:32,063][49517] Fps is (10 sec: 50789.3, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3923001344. Throughput: 0: 50798.6. Samples: 1675910460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 18:18:34,720][49750] Updated weights for policy 0, policy_version 239451 (0.0037) [2024-04-26 18:18:37,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3923247104. Throughput: 0: 50938.9. Samples: 1676060160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:37,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 18:18:38,325][49750] Updated weights for policy 0, policy_version 239461 (0.0031) [2024-04-26 18:18:41,128][49750] Updated weights for policy 0, policy_version 239471 (0.0029) [2024-04-26 18:18:42,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3923509248. Throughput: 0: 51053.4. Samples: 1676368580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:42,072][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 18:18:42,083][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000239472_3923509248.pth... [2024-04-26 18:18:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000238728_3911319552.pth [2024-04-26 18:18:44,687][49750] Updated weights for policy 0, policy_version 239481 (0.0029) [2024-04-26 18:18:47,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3923771392. Throughput: 0: 50849.9. Samples: 1676665660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:47,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 18:18:47,674][49750] Updated weights for policy 0, policy_version 239491 (0.0036) [2024-04-26 18:18:51,035][49750] Updated weights for policy 0, policy_version 239501 (0.0031) [2024-04-26 18:18:52,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 3924049920. Throughput: 0: 51082.3. Samples: 1676828660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 18:18:54,232][49750] Updated weights for policy 0, policy_version 239511 (0.0031) [2024-04-26 18:18:57,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3924246528. Throughput: 0: 50911.7. Samples: 1677128700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:18:57,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 18:18:57,825][49750] Updated weights for policy 0, policy_version 239521 (0.0029) [2024-04-26 18:19:00,540][49750] Updated weights for policy 0, policy_version 239531 (0.0028) [2024-04-26 18:19:02,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3924525056. Throughput: 0: 50932.3. Samples: 1677432940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:19:02,063][49517] Avg episode reward: [(0, '0.455')] [2024-04-26 18:19:04,127][49750] Updated weights for policy 0, policy_version 239541 (0.0033) [2024-04-26 18:19:06,953][49750] Updated weights for policy 0, policy_version 239551 (0.0028) [2024-04-26 18:19:07,062][49517] Fps is (10 sec: 55706.3, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3924803584. Throughput: 0: 50812.0. Samples: 1677582420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:19:07,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 18:19:10,573][49750] Updated weights for policy 0, policy_version 239561 (0.0034) [2024-04-26 18:19:12,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3925049344. Throughput: 0: 50929.4. Samples: 1677892380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:12,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 18:19:13,431][49750] Updated weights for policy 0, policy_version 239571 (0.0036) [2024-04-26 18:19:17,000][49750] Updated weights for policy 0, policy_version 239581 (0.0028) [2024-04-26 18:19:17,062][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3925295104. Throughput: 0: 50747.7. Samples: 1678194100. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:17,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 18:19:19,869][49750] Updated weights for policy 0, policy_version 239591 (0.0031) [2024-04-26 18:19:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3925540864. Throughput: 0: 50737.9. Samples: 1678343360. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:22,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 18:19:23,444][49750] Updated weights for policy 0, policy_version 239601 (0.0028) [2024-04-26 18:19:26,532][49750] Updated weights for policy 0, policy_version 239611 (0.0028) [2024-04-26 18:19:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3925803008. Throughput: 0: 50763.6. Samples: 1678652940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:27,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 18:19:29,734][49728] Signal inference workers to stop experience collection... (24900 times) [2024-04-26 18:19:29,734][49728] Signal inference workers to resume experience collection... (24900 times) [2024-04-26 18:19:29,763][49750] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-04-26 18:19:29,763][49750] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-04-26 18:19:29,889][49750] Updated weights for policy 0, policy_version 239621 (0.0034) [2024-04-26 18:19:32,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3926048768. Throughput: 0: 50803.6. Samples: 1678951820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:32,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 18:19:33,074][49750] Updated weights for policy 0, policy_version 239631 (0.0031) [2024-04-26 18:19:36,381][49750] Updated weights for policy 0, policy_version 239641 (0.0029) [2024-04-26 18:19:37,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 3926327296. Throughput: 0: 50701.2. Samples: 1679110220. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:37,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 18:19:39,421][49750] Updated weights for policy 0, policy_version 239651 (0.0031) [2024-04-26 18:19:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3926556672. Throughput: 0: 50852.5. Samples: 1679417060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:42,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 18:19:42,864][49750] Updated weights for policy 0, policy_version 239661 (0.0028) [2024-04-26 18:19:45,700][49750] Updated weights for policy 0, policy_version 239671 (0.0029) [2024-04-26 18:19:47,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3926802432. Throughput: 0: 50861.5. Samples: 1679721700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:47,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 18:19:49,301][49750] Updated weights for policy 0, policy_version 239681 (0.0029) [2024-04-26 18:19:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3927080960. Throughput: 0: 50730.7. Samples: 1679865300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:52,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 18:19:52,203][49750] Updated weights for policy 0, policy_version 239691 (0.0029) [2024-04-26 18:19:55,639][49750] Updated weights for policy 0, policy_version 239701 (0.0035) [2024-04-26 18:19:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.7, 300 sec: 50762.6). Total num frames: 3927326720. Throughput: 0: 50790.9. Samples: 1680177960. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:19:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:19:58,679][49750] Updated weights for policy 0, policy_version 239711 (0.0030) [2024-04-26 18:20:02,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3927572480. Throughput: 0: 50852.6. Samples: 1680482460. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:20:02,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:20:02,106][49750] Updated weights for policy 0, policy_version 239721 (0.0031) [2024-04-26 18:20:05,009][49750] Updated weights for policy 0, policy_version 239731 (0.0028) [2024-04-26 18:20:07,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 3927834624. Throughput: 0: 50824.8. Samples: 1680630480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:20:07,071][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 18:20:08,615][49750] Updated weights for policy 0, policy_version 239741 (0.0029) [2024-04-26 18:20:11,853][49750] Updated weights for policy 0, policy_version 239751 (0.0034) [2024-04-26 18:20:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3928080384. Throughput: 0: 50704.5. Samples: 1680934640. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:20:12,072][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 18:20:14,922][49750] Updated weights for policy 0, policy_version 239761 (0.0029) [2024-04-26 18:20:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3928342528. Throughput: 0: 50951.9. Samples: 1681244660. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-26 18:20:17,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 18:20:18,504][49750] Updated weights for policy 0, policy_version 239771 (0.0033) [2024-04-26 18:20:21,271][49750] Updated weights for policy 0, policy_version 239781 (0.0028) [2024-04-26 18:20:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3928604672. Throughput: 0: 50786.9. Samples: 1681395620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:20:22,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 18:20:24,912][49750] Updated weights for policy 0, policy_version 239791 (0.0036) [2024-04-26 18:20:27,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3928834048. Throughput: 0: 50840.0. Samples: 1681704860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:20:27,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 18:20:27,563][49728] Signal inference workers to stop experience collection... (24950 times) [2024-04-26 18:20:27,564][49728] Signal inference workers to resume experience collection... (24950 times) [2024-04-26 18:20:27,590][49750] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-04-26 18:20:27,590][49750] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-04-26 18:20:27,704][49750] Updated weights for policy 0, policy_version 239801 (0.0028) [2024-04-26 18:20:31,527][49750] Updated weights for policy 0, policy_version 239811 (0.0030) [2024-04-26 18:20:32,063][49517] Fps is (10 sec: 49150.5, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3929096192. Throughput: 0: 50864.6. Samples: 1682010620. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:20:32,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 18:20:34,196][49750] Updated weights for policy 0, policy_version 239821 (0.0034) [2024-04-26 18:20:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3929358336. Throughput: 0: 50775.8. Samples: 1682150220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:20:37,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:20:37,969][49750] Updated weights for policy 0, policy_version 239831 (0.0030) [2024-04-26 18:20:40,700][49750] Updated weights for policy 0, policy_version 239841 (0.0032) [2024-04-26 18:20:42,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3929620480. Throughput: 0: 50701.7. Samples: 1682459540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:20:42,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 18:20:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000239845_3929620480.pth... [2024-04-26 18:20:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000239101_3917430784.pth [2024-04-26 18:20:44,331][49750] Updated weights for policy 0, policy_version 239851 (0.0034) [2024-04-26 18:20:47,063][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3929866240. Throughput: 0: 50725.7. Samples: 1682765120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:20:47,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 18:20:47,103][49750] Updated weights for policy 0, policy_version 239861 (0.0032) [2024-04-26 18:20:50,716][49750] Updated weights for policy 0, policy_version 239871 (0.0031) [2024-04-26 18:20:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.1, 300 sec: 50818.1). Total num frames: 3930112000. Throughput: 0: 50740.3. Samples: 1682913800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:20:52,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 18:20:53,446][49750] Updated weights for policy 0, policy_version 239881 (0.0033) [2024-04-26 18:20:57,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3930357760. Throughput: 0: 50797.9. Samples: 1683220540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:20:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 18:20:57,100][49750] Updated weights for policy 0, policy_version 239891 (0.0029) [2024-04-26 18:21:00,008][49750] Updated weights for policy 0, policy_version 239901 (0.0040) [2024-04-26 18:21:02,063][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3930636288. Throughput: 0: 50777.9. Samples: 1683529660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:21:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 18:21:03,490][49750] Updated weights for policy 0, policy_version 239911 (0.0029) [2024-04-26 18:21:06,458][49750] Updated weights for policy 0, policy_version 239921 (0.0042) [2024-04-26 18:21:07,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3930882048. Throughput: 0: 50769.2. Samples: 1683680240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:21:07,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 18:21:09,873][49750] Updated weights for policy 0, policy_version 239931 (0.0031) [2024-04-26 18:21:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3931127808. Throughput: 0: 50608.0. Samples: 1683982220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:21:12,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 18:21:12,848][49750] Updated weights for policy 0, policy_version 239941 (0.0031) [2024-04-26 18:21:16,367][49750] Updated weights for policy 0, policy_version 239951 (0.0029) [2024-04-26 18:21:17,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3931406336. Throughput: 0: 50678.0. Samples: 1684291120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:21:17,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 18:21:19,238][49750] Updated weights for policy 0, policy_version 239961 (0.0028) [2024-04-26 18:21:22,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3931652096. Throughput: 0: 50937.5. Samples: 1684442400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:21:22,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 18:21:22,781][49750] Updated weights for policy 0, policy_version 239971 (0.0033) [2024-04-26 18:21:25,701][49750] Updated weights for policy 0, policy_version 239981 (0.0035) [2024-04-26 18:21:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3931897856. Throughput: 0: 50939.5. Samples: 1684751820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 18:21:27,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 18:21:29,236][49750] Updated weights for policy 0, policy_version 239991 (0.0033) [2024-04-26 18:21:32,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3932160000. Throughput: 0: 50987.9. Samples: 1685059580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:21:32,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 18:21:32,134][49750] Updated weights for policy 0, policy_version 240001 (0.0033) [2024-04-26 18:21:35,609][49750] Updated weights for policy 0, policy_version 240011 (0.0031) [2024-04-26 18:21:37,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3932389376. Throughput: 0: 50988.7. Samples: 1685208280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:21:37,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 18:21:38,567][49750] Updated weights for policy 0, policy_version 240021 (0.0032) [2024-04-26 18:21:41,619][49728] Signal inference workers to stop experience collection... (25000 times) [2024-04-26 18:21:41,619][49728] Signal inference workers to resume experience collection... (25000 times) [2024-04-26 18:21:41,631][49750] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-04-26 18:21:41,651][49750] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-04-26 18:21:41,900][49750] Updated weights for policy 0, policy_version 240031 (0.0032) [2024-04-26 18:21:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3932667904. Throughput: 0: 50940.6. Samples: 1685512880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:21:42,072][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:21:45,021][49750] Updated weights for policy 0, policy_version 240041 (0.0036) [2024-04-26 18:21:47,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3932913664. Throughput: 0: 50868.0. Samples: 1685818720. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:21:47,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 18:21:48,415][49750] Updated weights for policy 0, policy_version 240051 (0.0032) [2024-04-26 18:21:51,499][49750] Updated weights for policy 0, policy_version 240061 (0.0037) [2024-04-26 18:21:52,063][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3933175808. Throughput: 0: 51000.0. Samples: 1685975240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:21:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 18:21:54,855][49750] Updated weights for policy 0, policy_version 240071 (0.0033) [2024-04-26 18:21:57,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3933405184. Throughput: 0: 50971.1. Samples: 1686275920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:21:57,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 18:21:57,857][49750] Updated weights for policy 0, policy_version 240081 (0.0030) [2024-04-26 18:22:01,221][49750] Updated weights for policy 0, policy_version 240091 (0.0036) [2024-04-26 18:22:02,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3933683712. Throughput: 0: 50743.6. Samples: 1686574580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:22:02,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 18:22:04,201][49750] Updated weights for policy 0, policy_version 240101 (0.0039) [2024-04-26 18:22:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3933929472. Throughput: 0: 50779.9. Samples: 1686727500. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:22:07,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 18:22:07,539][49750] Updated weights for policy 0, policy_version 240111 (0.0029) [2024-04-26 18:22:10,644][49750] Updated weights for policy 0, policy_version 240121 (0.0034) [2024-04-26 18:22:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3934191616. Throughput: 0: 50743.5. Samples: 1687035280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:22:12,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 18:22:14,059][49750] Updated weights for policy 0, policy_version 240131 (0.0030) [2024-04-26 18:22:17,060][49750] Updated weights for policy 0, policy_version 240141 (0.0031) [2024-04-26 18:22:17,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 3934470144. Throughput: 0: 50830.6. Samples: 1687346960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:22:17,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 18:22:20,472][49750] Updated weights for policy 0, policy_version 240151 (0.0031) [2024-04-26 18:22:22,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3934699520. Throughput: 0: 50948.7. Samples: 1687500980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:22:22,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 18:22:23,404][49750] Updated weights for policy 0, policy_version 240161 (0.0028) [2024-04-26 18:22:26,888][49750] Updated weights for policy 0, policy_version 240171 (0.0035) [2024-04-26 18:22:27,062][49517] Fps is (10 sec: 49152.9, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 3934961664. Throughput: 0: 50928.7. Samples: 1687804660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:22:27,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 18:22:29,830][49750] Updated weights for policy 0, policy_version 240181 (0.0033) [2024-04-26 18:22:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3935191040. Throughput: 0: 50800.9. Samples: 1688104760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:22:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 18:22:32,743][49728] Signal inference workers to stop experience collection... (25050 times) [2024-04-26 18:22:32,744][49728] Signal inference workers to resume experience collection... (25050 times) [2024-04-26 18:22:32,770][49750] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-04-26 18:22:32,770][49750] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-04-26 18:22:33,368][49750] Updated weights for policy 0, policy_version 240191 (0.0028) [2024-04-26 18:22:36,272][49750] Updated weights for policy 0, policy_version 240201 (0.0031) [2024-04-26 18:22:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3935469568. Throughput: 0: 50655.2. Samples: 1688254720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 18:22:37,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:22:40,118][49750] Updated weights for policy 0, policy_version 240211 (0.0032) [2024-04-26 18:22:42,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3935715328. Throughput: 0: 50730.6. Samples: 1688558800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:22:42,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 18:22:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000240217_3935715328.pth... 
[2024-04-26 18:22:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000239472_3923509248.pth [2024-04-26 18:22:42,822][49750] Updated weights for policy 0, policy_version 240221 (0.0030) [2024-04-26 18:22:46,623][49750] Updated weights for policy 0, policy_version 240231 (0.0031) [2024-04-26 18:22:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3935977472. Throughput: 0: 50964.9. Samples: 1688868000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:22:47,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 18:22:49,251][49750] Updated weights for policy 0, policy_version 240241 (0.0031) [2024-04-26 18:22:52,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3936206848. Throughput: 0: 50724.5. Samples: 1689010100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:22:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 18:22:52,909][49750] Updated weights for policy 0, policy_version 240251 (0.0028) [2024-04-26 18:22:55,704][49750] Updated weights for policy 0, policy_version 240261 (0.0025) [2024-04-26 18:22:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3936468992. Throughput: 0: 50668.6. Samples: 1689315360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:22:57,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 18:22:59,209][49750] Updated weights for policy 0, policy_version 240271 (0.0031) [2024-04-26 18:23:02,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3936747520. Throughput: 0: 50708.1. Samples: 1689628820. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:02,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 18:23:02,289][49750] Updated weights for policy 0, policy_version 240281 (0.0029) [2024-04-26 18:23:05,688][49750] Updated weights for policy 0, policy_version 240291 (0.0031) [2024-04-26 18:23:07,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3936976896. Throughput: 0: 50762.0. Samples: 1689785260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:07,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 18:23:08,762][49750] Updated weights for policy 0, policy_version 240301 (0.0029) [2024-04-26 18:23:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3937239040. Throughput: 0: 50673.8. Samples: 1690084980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:12,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 18:23:12,107][49750] Updated weights for policy 0, policy_version 240311 (0.0033) [2024-04-26 18:23:15,349][49750] Updated weights for policy 0, policy_version 240321 (0.0029) [2024-04-26 18:23:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 3937484800. Throughput: 0: 50732.5. Samples: 1690387720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:17,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 18:23:18,582][49750] Updated weights for policy 0, policy_version 240331 (0.0029) [2024-04-26 18:23:21,807][49750] Updated weights for policy 0, policy_version 240341 (0.0031) [2024-04-26 18:23:22,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3937746944. Throughput: 0: 50782.5. Samples: 1690539940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:22,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:23:24,788][49728] Signal inference workers to stop experience collection... 
(25100 times) [2024-04-26 18:23:24,788][49728] Signal inference workers to resume experience collection... (25100 times) [2024-04-26 18:23:24,823][49750] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-04-26 18:23:24,823][49750] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-04-26 18:23:24,927][49750] Updated weights for policy 0, policy_version 240351 (0.0032) [2024-04-26 18:23:27,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3938009088. Throughput: 0: 50685.3. Samples: 1690839640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:27,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 18:23:28,084][49750] Updated weights for policy 0, policy_version 240361 (0.0031) [2024-04-26 18:23:31,414][49750] Updated weights for policy 0, policy_version 240371 (0.0035) [2024-04-26 18:23:32,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3938254848. Throughput: 0: 50589.7. Samples: 1691144540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:32,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 18:23:34,585][49750] Updated weights for policy 0, policy_version 240381 (0.0034) [2024-04-26 18:23:37,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3938484224. Throughput: 0: 50819.1. Samples: 1691296960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:37,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 18:23:37,815][49750] Updated weights for policy 0, policy_version 240391 (0.0033) [2024-04-26 18:23:40,897][49750] Updated weights for policy 0, policy_version 240401 (0.0033) [2024-04-26 18:23:42,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3938746368. Throughput: 0: 50873.8. Samples: 1691604680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:23:44,249][49750] Updated weights for policy 0, policy_version 240411 (0.0030) [2024-04-26 18:23:47,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3939024896. Throughput: 0: 50776.5. Samples: 1691913760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 18:23:47,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 18:23:47,672][49750] Updated weights for policy 0, policy_version 240421 (0.0029) [2024-04-26 18:23:50,812][49750] Updated weights for policy 0, policy_version 240431 (0.0037) [2024-04-26 18:23:52,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 3939270656. Throughput: 0: 50716.8. Samples: 1692067520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:23:52,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 18:23:54,109][49750] Updated weights for policy 0, policy_version 240441 (0.0033) [2024-04-26 18:23:57,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3939532800. Throughput: 0: 50698.6. Samples: 1692366420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:23:57,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 18:23:57,433][49750] Updated weights for policy 0, policy_version 240451 (0.0035) [2024-04-26 18:24:00,423][49750] Updated weights for policy 0, policy_version 240461 (0.0030) [2024-04-26 18:24:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3939778560. Throughput: 0: 50933.3. Samples: 1692679720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:02,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 18:24:03,781][49750] Updated weights for policy 0, policy_version 240471 (0.0035) [2024-04-26 18:24:06,713][49750] Updated weights for policy 0, policy_version 240481 (0.0031) [2024-04-26 18:24:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3940040704. Throughput: 0: 50871.8. Samples: 1692829160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:07,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 18:24:10,329][49750] Updated weights for policy 0, policy_version 240491 (0.0029) [2024-04-26 18:24:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3940270080. Throughput: 0: 50923.7. Samples: 1693131200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:12,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 18:24:13,489][49750] Updated weights for policy 0, policy_version 240501 (0.0033) [2024-04-26 18:24:16,823][49750] Updated weights for policy 0, policy_version 240511 (0.0035) [2024-04-26 18:24:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3940532224. Throughput: 0: 50766.0. Samples: 1693429000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:17,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 18:24:20,167][49750] Updated weights for policy 0, policy_version 240521 (0.0033) [2024-04-26 18:24:22,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3940777984. Throughput: 0: 50815.2. Samples: 1693583640. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 18:24:23,360][49750] Updated weights for policy 0, policy_version 240531 (0.0031) [2024-04-26 18:24:25,342][49728] Signal inference workers to stop experience collection... (25150 times) [2024-04-26 18:24:25,343][49728] Signal inference workers to resume experience collection... (25150 times) [2024-04-26 18:24:25,374][49750] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-04-26 18:24:25,374][49750] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-04-26 18:24:26,752][49750] Updated weights for policy 0, policy_version 240541 (0.0036) [2024-04-26 18:24:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3941023744. Throughput: 0: 50704.9. Samples: 1693886400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:27,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 18:24:29,793][49750] Updated weights for policy 0, policy_version 240551 (0.0043) [2024-04-26 18:24:32,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 3941302272. Throughput: 0: 50578.3. Samples: 1694189780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 18:24:33,068][49750] Updated weights for policy 0, policy_version 240561 (0.0031) [2024-04-26 18:24:36,264][49750] Updated weights for policy 0, policy_version 240571 (0.0033) [2024-04-26 18:24:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3941531648. Throughput: 0: 50662.4. Samples: 1694347320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 18:24:39,412][49750] Updated weights for policy 0, policy_version 240581 (0.0027) [2024-04-26 18:24:42,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 3941810176. Throughput: 0: 50735.6. Samples: 1694649520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:42,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 18:24:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000240589_3941810176.pth... [2024-04-26 18:24:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000239845_3929620480.pth [2024-04-26 18:24:42,743][49750] Updated weights for policy 0, policy_version 240591 (0.0032) [2024-04-26 18:24:45,890][49750] Updated weights for policy 0, policy_version 240601 (0.0033) [2024-04-26 18:24:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3942039552. Throughput: 0: 50472.9. Samples: 1694951000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:47,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 18:24:49,193][49750] Updated weights for policy 0, policy_version 240611 (0.0029) [2024-04-26 18:24:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3942301696. Throughput: 0: 50573.3. Samples: 1695104960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:52,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 18:24:52,559][49750] Updated weights for policy 0, policy_version 240621 (0.0045) [2024-04-26 18:24:55,825][49750] Updated weights for policy 0, policy_version 240631 (0.0031) [2024-04-26 18:24:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3942547456. Throughput: 0: 50591.9. Samples: 1695407840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-04-26 18:24:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 18:24:58,918][49750] Updated weights for policy 0, policy_version 240641 (0.0031) [2024-04-26 18:25:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3942809600. Throughput: 0: 50665.9. Samples: 1695708960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 18:25:02,167][49750] Updated weights for policy 0, policy_version 240651 (0.0035) [2024-04-26 18:25:05,413][49750] Updated weights for policy 0, policy_version 240661 (0.0036) [2024-04-26 18:25:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3943071744. Throughput: 0: 50600.5. Samples: 1695860660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:07,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 18:25:08,596][49750] Updated weights for policy 0, policy_version 240671 (0.0031) [2024-04-26 18:25:11,825][49750] Updated weights for policy 0, policy_version 240681 (0.0031) [2024-04-26 18:25:12,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3943317504. Throughput: 0: 50740.7. Samples: 1696169740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:12,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 18:25:15,222][49750] Updated weights for policy 0, policy_version 240691 (0.0037) [2024-04-26 18:25:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3943579648. Throughput: 0: 50799.5. Samples: 1696475760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:17,071][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 18:25:18,143][49750] Updated weights for policy 0, policy_version 240701 (0.0036) [2024-04-26 18:25:21,577][49750] Updated weights for policy 0, policy_version 240711 (0.0032) [2024-04-26 18:25:22,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3943825408. Throughput: 0: 50638.7. Samples: 1696626060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 18:25:24,610][49750] Updated weights for policy 0, policy_version 240721 (0.0029) [2024-04-26 18:25:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3944071168. Throughput: 0: 50771.1. Samples: 1696934220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:27,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 18:25:28,081][49750] Updated weights for policy 0, policy_version 240731 (0.0036) [2024-04-26 18:25:31,274][49750] Updated weights for policy 0, policy_version 240741 (0.0036) [2024-04-26 18:25:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3944349696. Throughput: 0: 50833.7. Samples: 1697238520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:32,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 18:25:34,432][49750] Updated weights for policy 0, policy_version 240751 (0.0029) [2024-04-26 18:25:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3944595456. Throughput: 0: 50723.5. Samples: 1697387520. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 18:25:37,604][49750] Updated weights for policy 0, policy_version 240761 (0.0026) [2024-04-26 18:25:40,817][49750] Updated weights for policy 0, policy_version 240771 (0.0037) [2024-04-26 18:25:42,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3944841216. Throughput: 0: 50825.7. Samples: 1697695000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:42,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 18:25:44,099][49750] Updated weights for policy 0, policy_version 240781 (0.0030) [2024-04-26 18:25:44,863][49728] Signal inference workers to stop experience collection... (25200 times) [2024-04-26 18:25:44,864][49728] Signal inference workers to resume experience collection... (25200 times) [2024-04-26 18:25:44,883][49750] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-04-26 18:25:44,883][49750] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-04-26 18:25:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3945086976. Throughput: 0: 50976.4. Samples: 1698002900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:47,063][49517] Avg episode reward: [(0, '0.477')] [2024-04-26 18:25:47,231][49750] Updated weights for policy 0, policy_version 240791 (0.0035) [2024-04-26 18:25:50,607][49750] Updated weights for policy 0, policy_version 240801 (0.0027) [2024-04-26 18:25:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3945349120. Throughput: 0: 50855.1. Samples: 1698149140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:52,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 18:25:53,576][49750] Updated weights for policy 0, policy_version 240811 (0.0029) [2024-04-26 18:25:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3945594880. Throughput: 0: 50803.8. Samples: 1698455900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:25:57,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 18:25:57,185][49750] Updated weights for policy 0, policy_version 240821 (0.0027) [2024-04-26 18:26:00,129][49750] Updated weights for policy 0, policy_version 240831 (0.0036) [2024-04-26 18:26:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3945857024. Throughput: 0: 50840.5. Samples: 1698763580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:26:02,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 18:26:03,590][49750] Updated weights for policy 0, policy_version 240841 (0.0031) [2024-04-26 18:26:06,551][49750] Updated weights for policy 0, policy_version 240851 (0.0032) [2024-04-26 18:26:07,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3946119168. Throughput: 0: 50813.8. Samples: 1698912680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 18:26:07,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 18:26:10,022][49750] Updated weights for policy 0, policy_version 240861 (0.0035) [2024-04-26 18:26:12,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3946364928. Throughput: 0: 50723.0. Samples: 1699216760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:12,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 18:26:12,864][49750] Updated weights for policy 0, policy_version 240871 (0.0032) [2024-04-26 18:26:16,432][49750] Updated weights for policy 0, policy_version 240881 (0.0030) [2024-04-26 18:26:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3946627072. Throughput: 0: 50803.7. Samples: 1699524680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:17,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 18:26:19,304][49750] Updated weights for policy 0, policy_version 240891 (0.0030) [2024-04-26 18:26:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3946889216. Throughput: 0: 50765.7. Samples: 1699671980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:22,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 18:26:22,741][49750] Updated weights for policy 0, policy_version 240901 (0.0033) [2024-04-26 18:26:26,039][49750] Updated weights for policy 0, policy_version 240911 (0.0032) [2024-04-26 18:26:27,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3947118592. Throughput: 0: 50773.4. Samples: 1699979800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:27,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 18:26:29,197][49750] Updated weights for policy 0, policy_version 240921 (0.0034) [2024-04-26 18:26:32,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 3947380736. Throughput: 0: 50756.7. Samples: 1700286960. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:32,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 18:26:32,317][49750] Updated weights for policy 0, policy_version 240931 (0.0033) [2024-04-26 18:26:35,778][49750] Updated weights for policy 0, policy_version 240941 (0.0030) [2024-04-26 18:26:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3947626496. Throughput: 0: 50696.5. Samples: 1700430480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:37,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 18:26:38,766][49750] Updated weights for policy 0, policy_version 240951 (0.0031) [2024-04-26 18:26:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3947888640. Throughput: 0: 50764.4. Samples: 1700740300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:42,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 18:26:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000240961_3947905024.pth... [2024-04-26 18:26:42,077][49750] Updated weights for policy 0, policy_version 240961 (0.0028) [2024-04-26 18:26:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000240217_3935715328.pth [2024-04-26 18:26:45,308][49750] Updated weights for policy 0, policy_version 240971 (0.0029) [2024-04-26 18:26:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 3948150784. Throughput: 0: 50829.3. Samples: 1701050900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 18:26:48,576][49750] Updated weights for policy 0, policy_version 240981 (0.0035) [2024-04-26 18:26:51,596][49750] Updated weights for policy 0, policy_version 240991 (0.0032) [2024-04-26 18:26:52,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50873.7). 
Total num frames: 3948412928. Throughput: 0: 50823.0. Samples: 1701199720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:52,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 18:26:53,031][49728] Signal inference workers to stop experience collection... (25250 times) [2024-04-26 18:26:53,032][49728] Signal inference workers to resume experience collection... (25250 times) [2024-04-26 18:26:53,046][49750] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-04-26 18:26:53,051][49750] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-04-26 18:26:55,010][49750] Updated weights for policy 0, policy_version 241001 (0.0034) [2024-04-26 18:26:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3948642304. Throughput: 0: 50746.8. Samples: 1701500360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:26:57,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 18:26:58,140][49750] Updated weights for policy 0, policy_version 241011 (0.0038) [2024-04-26 18:27:01,582][49750] Updated weights for policy 0, policy_version 241021 (0.0030) [2024-04-26 18:27:02,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3948904448. Throughput: 0: 50769.6. Samples: 1701809320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:27:02,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 18:27:04,553][49750] Updated weights for policy 0, policy_version 241031 (0.0037) [2024-04-26 18:27:07,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3949150208. Throughput: 0: 50796.1. Samples: 1701957800. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:27:07,063][49517] Avg episode reward: [(0, '0.680')] [2024-04-26 18:27:07,899][49750] Updated weights for policy 0, policy_version 241041 (0.0032) [2024-04-26 18:27:10,895][49750] Updated weights for policy 0, policy_version 241051 (0.0033) [2024-04-26 18:27:12,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3949412352. Throughput: 0: 50824.1. Samples: 1702266880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:27:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 18:27:14,294][49750] Updated weights for policy 0, policy_version 241061 (0.0029) [2024-04-26 18:27:17,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3949690880. Throughput: 0: 50758.8. Samples: 1702571100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 18:27:17,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 18:27:17,457][49750] Updated weights for policy 0, policy_version 241071 (0.0027) [2024-04-26 18:27:20,587][49750] Updated weights for policy 0, policy_version 241081 (0.0030) [2024-04-26 18:27:22,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3949920256. Throughput: 0: 50949.8. Samples: 1702723220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:27:22,063][49517] Avg episode reward: [(0, '0.453')] [2024-04-26 18:27:24,002][49750] Updated weights for policy 0, policy_version 241091 (0.0028) [2024-04-26 18:27:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3950182400. Throughput: 0: 50751.1. Samples: 1703024100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:27:27,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 18:27:27,095][49750] Updated weights for policy 0, policy_version 241101 (0.0031) [2024-04-26 18:27:30,543][49750] Updated weights for policy 0, policy_version 241111 (0.0039) [2024-04-26 18:27:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 3950444544. Throughput: 0: 50593.8. Samples: 1703327620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:27:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 18:27:33,795][49750] Updated weights for policy 0, policy_version 241121 (0.0031) [2024-04-26 18:27:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3950673920. Throughput: 0: 50845.8. Samples: 1703487780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:27:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 18:27:37,084][49750] Updated weights for policy 0, policy_version 241131 (0.0032) [2024-04-26 18:27:40,270][49750] Updated weights for policy 0, policy_version 241141 (0.0033) [2024-04-26 18:27:42,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3950936064. Throughput: 0: 50892.8. Samples: 1703790540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:27:42,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 18:27:43,530][49750] Updated weights for policy 0, policy_version 241151 (0.0033) [2024-04-26 18:27:46,572][49750] Updated weights for policy 0, policy_version 241161 (0.0033) [2024-04-26 18:27:47,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 3951181824. Throughput: 0: 50744.4. Samples: 1704092820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:27:47,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 18:27:49,922][49750] Updated weights for policy 0, policy_version 241171 (0.0033) [2024-04-26 18:27:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3951443968. Throughput: 0: 50858.2. Samples: 1704246420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:27:52,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 18:27:53,110][49750] Updated weights for policy 0, policy_version 241181 (0.0035) [2024-04-26 18:27:56,298][49750] Updated weights for policy 0, policy_version 241191 (0.0033) [2024-04-26 18:27:57,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 3951706112. Throughput: 0: 50697.2. Samples: 1704548260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:27:57,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 18:27:59,697][49750] Updated weights for policy 0, policy_version 241201 (0.0037) [2024-04-26 18:28:02,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3951968256. Throughput: 0: 50800.1. Samples: 1704857120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:28:02,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 18:28:02,717][49750] Updated weights for policy 0, policy_version 241211 (0.0032) [2024-04-26 18:28:06,316][49750] Updated weights for policy 0, policy_version 241221 (0.0029) [2024-04-26 18:28:07,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 3952181248. Throughput: 0: 50692.5. Samples: 1705004380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:28:07,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 18:28:07,107][49728] Signal inference workers to stop experience collection... 
(25300 times) [2024-04-26 18:28:07,108][49728] Signal inference workers to resume experience collection... (25300 times) [2024-04-26 18:28:07,132][49750] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-04-26 18:28:07,132][49750] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-04-26 18:28:09,256][49750] Updated weights for policy 0, policy_version 241231 (0.0024) [2024-04-26 18:28:12,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3952459776. Throughput: 0: 50807.0. Samples: 1705310420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:28:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:28:12,657][49750] Updated weights for policy 0, policy_version 241241 (0.0034) [2024-04-26 18:28:15,652][49750] Updated weights for policy 0, policy_version 241251 (0.0035) [2024-04-26 18:28:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3952705536. Throughput: 0: 50880.5. Samples: 1705617240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:28:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:28:19,266][49750] Updated weights for policy 0, policy_version 241261 (0.0030) [2024-04-26 18:28:22,041][49750] Updated weights for policy 0, policy_version 241271 (0.0027) [2024-04-26 18:28:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3952984064. Throughput: 0: 50750.3. Samples: 1705771540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:28:22,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 18:28:25,580][49750] Updated weights for policy 0, policy_version 241281 (0.0028) [2024-04-26 18:28:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3953213440. Throughput: 0: 50766.2. Samples: 1706075020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:28:27,063][49517] Avg episode reward: [(0, '0.443')] [2024-04-26 18:28:28,492][49750] Updated weights for policy 0, policy_version 241291 (0.0034) [2024-04-26 18:28:31,874][49750] Updated weights for policy 0, policy_version 241301 (0.0026) [2024-04-26 18:28:32,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3953475584. Throughput: 0: 50773.8. Samples: 1706377640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:28:32,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 18:28:35,010][49750] Updated weights for policy 0, policy_version 241311 (0.0030) [2024-04-26 18:28:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3953737728. Throughput: 0: 50752.4. Samples: 1706530280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:28:37,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 18:28:38,332][49750] Updated weights for policy 0, policy_version 241321 (0.0033) [2024-04-26 18:28:41,433][49750] Updated weights for policy 0, policy_version 241331 (0.0033) [2024-04-26 18:28:42,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3953983488. Throughput: 0: 50901.8. Samples: 1706838840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:28:42,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 18:28:42,082][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000241333_3953999872.pth... [2024-04-26 18:28:42,135][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000240589_3941810176.pth [2024-04-26 18:28:45,052][49750] Updated weights for policy 0, policy_version 241341 (0.0029) [2024-04-26 18:28:47,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 3954245632. Throughput: 0: 50829.6. Samples: 1707144440. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:28:47,063][49517] Avg episode reward: [(0, '0.697')] [2024-04-26 18:28:47,886][49750] Updated weights for policy 0, policy_version 241351 (0.0031) [2024-04-26 18:28:51,355][49750] Updated weights for policy 0, policy_version 241361 (0.0031) [2024-04-26 18:28:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3954491392. Throughput: 0: 50829.2. Samples: 1707291700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:28:52,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 18:28:54,215][49750] Updated weights for policy 0, policy_version 241371 (0.0033) [2024-04-26 18:28:57,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3954753536. Throughput: 0: 50876.8. Samples: 1707599880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:28:57,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 18:28:57,673][49750] Updated weights for policy 0, policy_version 241381 (0.0039) [2024-04-26 18:29:00,698][49750] Updated weights for policy 0, policy_version 241391 (0.0028) [2024-04-26 18:29:02,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3954999296. Throughput: 0: 50821.2. Samples: 1707904200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:29:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:29:04,281][49750] Updated weights for policy 0, policy_version 241401 (0.0034) [2024-04-26 18:29:07,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3955261440. Throughput: 0: 50854.3. Samples: 1708059980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:29:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:29:07,124][49750] Updated weights for policy 0, policy_version 241411 (0.0030) [2024-04-26 18:29:10,925][49750] Updated weights for policy 0, policy_version 241421 (0.0031) [2024-04-26 18:29:11,304][49728] Signal inference workers to stop experience collection... (25350 times) [2024-04-26 18:29:11,305][49728] Signal inference workers to resume experience collection... (25350 times) [2024-04-26 18:29:11,317][49750] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-04-26 18:29:11,317][49750] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-04-26 18:29:12,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 3955523584. Throughput: 0: 51012.8. Samples: 1708370600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:29:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 18:29:13,526][49750] Updated weights for policy 0, policy_version 241431 (0.0027) [2024-04-26 18:29:17,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3955752960. Throughput: 0: 50903.1. Samples: 1708668280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:29:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 18:29:17,276][49750] Updated weights for policy 0, policy_version 241441 (0.0037) [2024-04-26 18:29:20,080][49750] Updated weights for policy 0, policy_version 241451 (0.0031) [2024-04-26 18:29:22,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 3956031488. Throughput: 0: 50948.5. Samples: 1708822960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:29:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 18:29:23,629][49750] Updated weights for policy 0, policy_version 241461 (0.0035) [2024-04-26 18:29:26,568][49750] Updated weights for policy 0, policy_version 241471 (0.0029) [2024-04-26 18:29:27,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3956277248. Throughput: 0: 50752.1. Samples: 1709122680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:29:27,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 18:29:30,137][49750] Updated weights for policy 0, policy_version 241481 (0.0031) [2024-04-26 18:29:32,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3956523008. Throughput: 0: 50697.1. Samples: 1709425820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 18:29:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 18:29:33,119][49750] Updated weights for policy 0, policy_version 241491 (0.0031) [2024-04-26 18:29:36,564][49750] Updated weights for policy 0, policy_version 241501 (0.0036) [2024-04-26 18:29:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3956785152. Throughput: 0: 50826.7. Samples: 1709578900. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:29:37,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 18:29:39,507][49750] Updated weights for policy 0, policy_version 241511 (0.0031) [2024-04-26 18:29:42,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3957030912. Throughput: 0: 50755.7. Samples: 1709883880. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:29:42,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 18:29:42,901][49750] Updated weights for policy 0, policy_version 241521 (0.0029) [2024-04-26 18:29:46,041][49750] Updated weights for policy 0, policy_version 241531 (0.0030) [2024-04-26 18:29:47,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 3957293056. Throughput: 0: 50804.4. Samples: 1710190400. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:29:47,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 18:29:49,333][49750] Updated weights for policy 0, policy_version 241541 (0.0025) [2024-04-26 18:29:52,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3957538816. Throughput: 0: 50796.3. Samples: 1710345820. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:29:52,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 18:29:52,477][49750] Updated weights for policy 0, policy_version 241551 (0.0031) [2024-04-26 18:29:55,787][49750] Updated weights for policy 0, policy_version 241561 (0.0031) [2024-04-26 18:29:57,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3957817344. Throughput: 0: 50735.9. Samples: 1710653720. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:29:57,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 18:29:58,829][49750] Updated weights for policy 0, policy_version 241571 (0.0035) [2024-04-26 18:30:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3958030336. Throughput: 0: 50795.4. Samples: 1710954060. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:30:02,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 18:30:02,239][49750] Updated weights for policy 0, policy_version 241581 (0.0031) [2024-04-26 18:30:05,172][49750] Updated weights for policy 0, policy_version 241591 (0.0034) [2024-04-26 18:30:07,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3958308864. Throughput: 0: 50818.4. Samples: 1711109780. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:30:07,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 18:30:08,640][49750] Updated weights for policy 0, policy_version 241601 (0.0035) [2024-04-26 18:30:11,703][49750] Updated weights for policy 0, policy_version 241611 (0.0030) [2024-04-26 18:30:12,063][49517] Fps is (10 sec: 54066.2, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3958571008. Throughput: 0: 50875.4. Samples: 1711412080. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:30:12,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 18:30:15,139][49750] Updated weights for policy 0, policy_version 241621 (0.0039) [2024-04-26 18:30:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3958800384. Throughput: 0: 50855.3. Samples: 1711714300. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:30:17,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 18:30:18,204][49750] Updated weights for policy 0, policy_version 241631 (0.0032) [2024-04-26 18:30:21,583][49750] Updated weights for policy 0, policy_version 241641 (0.0029) [2024-04-26 18:30:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 3959078912. Throughput: 0: 50787.5. Samples: 1711864340. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:30:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 18:30:24,782][49750] Updated weights for policy 0, policy_version 241651 (0.0028) [2024-04-26 18:30:27,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3959324672. Throughput: 0: 50624.2. Samples: 1712161980. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:30:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 18:30:27,992][49750] Updated weights for policy 0, policy_version 241661 (0.0036) [2024-04-26 18:30:30,546][49728] Signal inference workers to stop experience collection... (25400 times) [2024-04-26 18:30:30,546][49728] Signal inference workers to resume experience collection... (25400 times) [2024-04-26 18:30:30,571][49750] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-04-26 18:30:30,571][49750] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-04-26 18:30:31,081][49750] Updated weights for policy 0, policy_version 241671 (0.0032) [2024-04-26 18:30:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3959586816. Throughput: 0: 50705.0. Samples: 1712472120. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:30:32,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 18:30:34,296][49750] Updated weights for policy 0, policy_version 241681 (0.0028) [2024-04-26 18:30:37,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3959816192. Throughput: 0: 50748.6. Samples: 1712629500. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:30:37,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 18:30:37,720][49750] Updated weights for policy 0, policy_version 241691 (0.0026) [2024-04-26 18:30:40,690][49750] Updated weights for policy 0, policy_version 241701 (0.0028) [2024-04-26 18:30:42,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3960094720. Throughput: 0: 50711.7. Samples: 1712935740. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-26 18:30:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 18:30:42,080][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000241705_3960094720.pth... [2024-04-26 18:30:42,137][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000240961_3947905024.pth [2024-04-26 18:30:44,199][49750] Updated weights for policy 0, policy_version 241711 (0.0030) [2024-04-26 18:30:47,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3960340480. Throughput: 0: 50853.2. Samples: 1713242460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:30:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 18:30:47,203][49750] Updated weights for policy 0, policy_version 241721 (0.0034) [2024-04-26 18:30:50,600][49750] Updated weights for policy 0, policy_version 241731 (0.0034) [2024-04-26 18:30:52,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3960586240. Throughput: 0: 50698.1. Samples: 1713391200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:30:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 18:30:53,850][49750] Updated weights for policy 0, policy_version 241741 (0.0030) [2024-04-26 18:30:57,048][49750] Updated weights for policy 0, policy_version 241751 (0.0032) [2024-04-26 18:30:57,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50818.1). 
Total num frames: 3960848384. Throughput: 0: 50779.0. Samples: 1713697140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:30:57,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 18:31:00,206][49750] Updated weights for policy 0, policy_version 241761 (0.0030) [2024-04-26 18:31:02,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3961094144. Throughput: 0: 50729.2. Samples: 1713997120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:02,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 18:31:03,497][49750] Updated weights for policy 0, policy_version 241771 (0.0035) [2024-04-26 18:31:06,458][49750] Updated weights for policy 0, policy_version 241781 (0.0028) [2024-04-26 18:31:07,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3961356288. Throughput: 0: 50867.6. Samples: 1714153380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:07,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 18:31:09,803][49750] Updated weights for policy 0, policy_version 241791 (0.0028) [2024-04-26 18:31:12,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3961602048. Throughput: 0: 50917.9. Samples: 1714453280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:12,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 18:31:13,116][49750] Updated weights for policy 0, policy_version 241801 (0.0030) [2024-04-26 18:31:16,229][49750] Updated weights for policy 0, policy_version 241811 (0.0029) [2024-04-26 18:31:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3961880576. Throughput: 0: 50709.8. Samples: 1714754060. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:17,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 18:31:19,627][49750] Updated weights for policy 0, policy_version 241821 (0.0027) [2024-04-26 18:31:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 3962093568. Throughput: 0: 50779.1. Samples: 1714914560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:22,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 18:31:22,609][49750] Updated weights for policy 0, policy_version 241831 (0.0029) [2024-04-26 18:31:25,931][49750] Updated weights for policy 0, policy_version 241841 (0.0031) [2024-04-26 18:31:27,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3962372096. Throughput: 0: 50765.4. Samples: 1715220180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:27,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 18:31:29,005][49750] Updated weights for policy 0, policy_version 241851 (0.0033) [2024-04-26 18:31:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 3962617856. Throughput: 0: 50664.0. Samples: 1715522340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 18:31:32,344][49750] Updated weights for policy 0, policy_version 241861 (0.0033) [2024-04-26 18:31:33,063][49728] Signal inference workers to stop experience collection... (25450 times) [2024-04-26 18:31:33,108][49750] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-04-26 18:31:33,128][49728] Signal inference workers to resume experience collection... 
(25450 times) [2024-04-26 18:31:33,129][49750] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-04-26 18:31:35,475][49750] Updated weights for policy 0, policy_version 241871 (0.0029) [2024-04-26 18:31:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3962880000. Throughput: 0: 50783.7. Samples: 1715676460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:37,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 18:31:38,934][49750] Updated weights for policy 0, policy_version 241881 (0.0032) [2024-04-26 18:31:41,988][49750] Updated weights for policy 0, policy_version 241891 (0.0023) [2024-04-26 18:31:42,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 3963142144. Throughput: 0: 50736.1. Samples: 1715980260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:42,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 18:31:45,388][49750] Updated weights for policy 0, policy_version 241901 (0.0028) [2024-04-26 18:31:47,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3963387904. Throughput: 0: 50737.7. Samples: 1716280320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 18:31:48,388][49750] Updated weights for policy 0, policy_version 241911 (0.0038) [2024-04-26 18:31:51,657][49750] Updated weights for policy 0, policy_version 241921 (0.0030) [2024-04-26 18:31:52,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3963650048. Throughput: 0: 50776.2. Samples: 1716438320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 18:31:52,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 18:31:54,860][49750] Updated weights for policy 0, policy_version 241931 (0.0031) [2024-04-26 18:31:57,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3963895808. Throughput: 0: 50786.2. Samples: 1716738660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:31:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 18:31:58,186][49750] Updated weights for policy 0, policy_version 241941 (0.0037) [2024-04-26 18:32:01,418][49750] Updated weights for policy 0, policy_version 241951 (0.0030) [2024-04-26 18:32:02,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3964141568. Throughput: 0: 50832.3. Samples: 1717041520. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 18:32:04,742][49750] Updated weights for policy 0, policy_version 241961 (0.0036) [2024-04-26 18:32:07,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3964403712. Throughput: 0: 50783.4. Samples: 1717199820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:07,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:32:07,843][49750] Updated weights for policy 0, policy_version 241971 (0.0030) [2024-04-26 18:32:11,144][49750] Updated weights for policy 0, policy_version 241981 (0.0033) [2024-04-26 18:32:12,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3964665856. Throughput: 0: 50729.6. Samples: 1717503020. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:12,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 18:32:14,202][49750] Updated weights for policy 0, policy_version 241991 (0.0031) [2024-04-26 18:32:17,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 3964895232. Throughput: 0: 50758.8. Samples: 1717806480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 18:32:17,631][49750] Updated weights for policy 0, policy_version 242001 (0.0034) [2024-04-26 18:32:20,614][49750] Updated weights for policy 0, policy_version 242011 (0.0029) [2024-04-26 18:32:22,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3965140992. Throughput: 0: 50778.9. Samples: 1717961520. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:22,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 18:32:24,115][49750] Updated weights for policy 0, policy_version 242021 (0.0031) [2024-04-26 18:32:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3965419520. Throughput: 0: 50762.0. Samples: 1718264540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:27,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:32:27,116][49750] Updated weights for policy 0, policy_version 242031 (0.0031) [2024-04-26 18:32:30,527][49750] Updated weights for policy 0, policy_version 242041 (0.0032) [2024-04-26 18:32:32,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3965665280. Throughput: 0: 50960.2. Samples: 1718573520. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:32,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 18:32:33,673][49750] Updated weights for policy 0, policy_version 242051 (0.0030) [2024-04-26 18:32:35,629][49728] Signal inference workers to stop experience collection... (25500 times) [2024-04-26 18:32:35,629][49728] Signal inference workers to resume experience collection... (25500 times) [2024-04-26 18:32:35,660][49750] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-04-26 18:32:35,660][49750] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-04-26 18:32:36,916][49750] Updated weights for policy 0, policy_version 242061 (0.0038) [2024-04-26 18:32:37,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 3965943808. Throughput: 0: 50889.1. Samples: 1718728320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:37,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 18:32:40,078][49750] Updated weights for policy 0, policy_version 242071 (0.0030) [2024-04-26 18:32:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3966173184. Throughput: 0: 50801.0. Samples: 1719024700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 18:32:42,193][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000242078_3966205952.pth... [2024-04-26 18:32:42,244][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000241333_3953999872.pth [2024-04-26 18:32:43,387][49750] Updated weights for policy 0, policy_version 242081 (0.0035) [2024-04-26 18:32:46,624][49750] Updated weights for policy 0, policy_version 242091 (0.0036) [2024-04-26 18:32:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3966435328. Throughput: 0: 50896.3. 
Samples: 1719331860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:47,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 18:32:49,844][49750] Updated weights for policy 0, policy_version 242101 (0.0028) [2024-04-26 18:32:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.5, 300 sec: 50707.1). Total num frames: 3966664704. Throughput: 0: 50710.8. Samples: 1719481800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:52,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 18:32:53,040][49750] Updated weights for policy 0, policy_version 242111 (0.0030) [2024-04-26 18:32:56,175][49750] Updated weights for policy 0, policy_version 242121 (0.0028) [2024-04-26 18:32:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 3966943232. Throughput: 0: 50850.4. Samples: 1719791280. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:32:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 18:32:59,515][49750] Updated weights for policy 0, policy_version 242131 (0.0029) [2024-04-26 18:33:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3967172608. Throughput: 0: 50724.9. Samples: 1720089100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-04-26 18:33:02,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 18:33:02,691][49750] Updated weights for policy 0, policy_version 242141 (0.0031) [2024-04-26 18:33:05,967][49750] Updated weights for policy 0, policy_version 242151 (0.0028) [2024-04-26 18:33:07,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3967418368. Throughput: 0: 50683.7. Samples: 1720242280. 
Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:07,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 18:33:09,071][49750] Updated weights for policy 0, policy_version 242161 (0.0038) [2024-04-26 18:33:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 3967696896. Throughput: 0: 50745.2. Samples: 1720548080. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:12,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 18:33:12,439][49750] Updated weights for policy 0, policy_version 242171 (0.0029) [2024-04-26 18:33:15,518][49750] Updated weights for policy 0, policy_version 242181 (0.0037) [2024-04-26 18:33:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3967959040. Throughput: 0: 50651.1. Samples: 1720852820. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 18:33:18,790][49750] Updated weights for policy 0, policy_version 242191 (0.0029) [2024-04-26 18:33:21,979][49750] Updated weights for policy 0, policy_version 242201 (0.0033) [2024-04-26 18:33:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 3968221184. Throughput: 0: 50661.7. Samples: 1721008100. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:22,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 18:33:25,334][49750] Updated weights for policy 0, policy_version 242211 (0.0030) [2024-04-26 18:33:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3968466944. Throughput: 0: 50744.8. Samples: 1721308220. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:27,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:33:27,869][49728] Signal inference workers to stop experience collection... 
(25550 times) [2024-04-26 18:33:27,869][49728] Signal inference workers to resume experience collection... (25550 times) [2024-04-26 18:33:27,904][49750] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-04-26 18:33:27,904][49750] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-04-26 18:33:28,481][49750] Updated weights for policy 0, policy_version 242221 (0.0033) [2024-04-26 18:33:31,767][49750] Updated weights for policy 0, policy_version 242231 (0.0031) [2024-04-26 18:33:32,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3968712704. Throughput: 0: 50758.6. Samples: 1721616000. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:32,064][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 18:33:34,804][49750] Updated weights for policy 0, policy_version 242241 (0.0029) [2024-04-26 18:33:37,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3968942080. Throughput: 0: 50695.5. Samples: 1721763100. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:37,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 18:33:38,473][49750] Updated weights for policy 0, policy_version 242251 (0.0028) [2024-04-26 18:33:41,245][49750] Updated weights for policy 0, policy_version 242261 (0.0029) [2024-04-26 18:33:42,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3969220608. Throughput: 0: 50645.4. Samples: 1722070320. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:33:44,799][49750] Updated weights for policy 0, policy_version 242271 (0.0033) [2024-04-26 18:33:47,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3969466368. Throughput: 0: 50760.8. Samples: 1722373340. 
Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:47,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 18:33:47,603][49750] Updated weights for policy 0, policy_version 242281 (0.0035) [2024-04-26 18:33:51,389][49750] Updated weights for policy 0, policy_version 242291 (0.0036) [2024-04-26 18:33:52,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3969728512. Throughput: 0: 50760.4. Samples: 1722526500. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:33:54,069][49750] Updated weights for policy 0, policy_version 242301 (0.0033) [2024-04-26 18:33:57,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3969957888. Throughput: 0: 50723.9. Samples: 1722830660. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:33:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 18:33:58,022][49750] Updated weights for policy 0, policy_version 242311 (0.0031) [2024-04-26 18:34:00,436][49750] Updated weights for policy 0, policy_version 242321 (0.0031) [2024-04-26 18:34:02,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3970252800. Throughput: 0: 50712.9. Samples: 1723134900. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:34:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 18:34:04,358][49750] Updated weights for policy 0, policy_version 242331 (0.0033) [2024-04-26 18:34:07,009][49750] Updated weights for policy 0, policy_version 242341 (0.0025) [2024-04-26 18:34:07,063][49517] Fps is (10 sec: 55705.7, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 3970514944. Throughput: 0: 50768.0. Samples: 1723292660. 
Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:34:07,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 18:34:10,735][49750] Updated weights for policy 0, policy_version 242351 (0.0033) [2024-04-26 18:34:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3970744320. Throughput: 0: 50850.6. Samples: 1723596500. Policy #0 lag: (min: 2.0, avg: 10.4, max: 22.0) [2024-04-26 18:34:12,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 18:34:13,572][49750] Updated weights for policy 0, policy_version 242361 (0.0029) [2024-04-26 18:34:17,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3970990080. Throughput: 0: 50770.0. Samples: 1723900640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:34:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 18:34:17,170][49750] Updated weights for policy 0, policy_version 242371 (0.0034) [2024-04-26 18:34:20,049][49750] Updated weights for policy 0, policy_version 242381 (0.0033) [2024-04-26 18:34:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3971252224. Throughput: 0: 50837.3. Samples: 1724050780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:34:22,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 18:34:23,472][49750] Updated weights for policy 0, policy_version 242391 (0.0029) [2024-04-26 18:34:26,380][49750] Updated weights for policy 0, policy_version 242401 (0.0030) [2024-04-26 18:34:27,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3971514368. Throughput: 0: 50947.0. Samples: 1724362940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:34:27,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 18:34:27,224][49728] Signal inference workers to stop experience collection... 
(25600 times) [2024-04-26 18:34:27,224][49728] Signal inference workers to resume experience collection... (25600 times) [2024-04-26 18:34:27,256][49750] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-04-26 18:34:27,256][49750] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-04-26 18:34:29,759][49750] Updated weights for policy 0, policy_version 242411 (0.0032) [2024-04-26 18:34:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 3971760128. Throughput: 0: 51001.0. Samples: 1724668380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:34:32,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:34:32,843][49750] Updated weights for policy 0, policy_version 242421 (0.0032) [2024-04-26 18:34:36,272][49750] Updated weights for policy 0, policy_version 242431 (0.0034) [2024-04-26 18:34:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 3972005888. Throughput: 0: 50972.0. Samples: 1724820240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:34:37,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 18:34:39,228][49750] Updated weights for policy 0, policy_version 242441 (0.0031) [2024-04-26 18:34:42,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3972251648. Throughput: 0: 50776.1. Samples: 1725115580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:34:42,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 18:34:42,135][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000242448_3972268032.pth... 
[2024-04-26 18:34:42,186][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000241705_3960094720.pth [2024-04-26 18:34:42,940][49750] Updated weights for policy 0, policy_version 242451 (0.0030) [2024-04-26 18:34:45,697][49750] Updated weights for policy 0, policy_version 242461 (0.0031) [2024-04-26 18:34:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 3972530176. Throughput: 0: 50713.3. Samples: 1725417000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:34:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:34:49,395][49750] Updated weights for policy 0, policy_version 242471 (0.0038) [2024-04-26 18:34:51,983][49750] Updated weights for policy 0, policy_version 242481 (0.0030) [2024-04-26 18:34:52,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3972808704. Throughput: 0: 50711.7. Samples: 1725574680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:34:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:34:55,905][49750] Updated weights for policy 0, policy_version 242491 (0.0027) [2024-04-26 18:34:57,062][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 3973021696. Throughput: 0: 50813.8. Samples: 1725883120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:34:57,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-26 18:34:58,474][49750] Updated weights for policy 0, policy_version 242501 (0.0036) [2024-04-26 18:35:02,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3973283840. Throughput: 0: 50889.6. Samples: 1726190680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:35:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:35:02,265][49750] Updated weights for policy 0, policy_version 242511 (0.0033) [2024-04-26 18:35:04,988][49750] Updated weights for policy 0, policy_version 242521 (0.0032) [2024-04-26 18:35:07,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3973529600. Throughput: 0: 50799.0. Samples: 1726336740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:35:07,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 18:35:08,762][49750] Updated weights for policy 0, policy_version 242531 (0.0034) [2024-04-26 18:35:11,478][49750] Updated weights for policy 0, policy_version 242541 (0.0037) [2024-04-26 18:35:12,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 3973824512. Throughput: 0: 50671.5. Samples: 1726643160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:35:12,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 18:35:15,257][49750] Updated weights for policy 0, policy_version 242551 (0.0032) [2024-04-26 18:35:17,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3974037504. Throughput: 0: 50745.6. Samples: 1726951940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:35:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 18:35:17,864][49750] Updated weights for policy 0, policy_version 242561 (0.0033) [2024-04-26 18:35:21,631][49750] Updated weights for policy 0, policy_version 242571 (0.0028) [2024-04-26 18:35:22,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3974283264. Throughput: 0: 50572.9. Samples: 1727096020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 18:35:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:35:24,225][49750] Updated weights for policy 0, policy_version 242581 (0.0029) [2024-04-26 18:35:27,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3974545408. Throughput: 0: 50815.6. Samples: 1727402280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:35:27,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 18:35:27,910][49750] Updated weights for policy 0, policy_version 242591 (0.0030) [2024-04-26 18:35:30,766][49750] Updated weights for policy 0, policy_version 242601 (0.0034) [2024-04-26 18:35:32,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 3974791168. Throughput: 0: 50746.5. Samples: 1727700600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:35:32,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 18:35:33,668][49728] Signal inference workers to stop experience collection... (25650 times) [2024-04-26 18:35:33,717][49750] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-04-26 18:35:33,735][49728] Signal inference workers to resume experience collection... (25650 times) [2024-04-26 18:35:33,737][49750] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-04-26 18:35:34,420][49750] Updated weights for policy 0, policy_version 242611 (0.0036) [2024-04-26 18:35:37,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3975086080. Throughput: 0: 50666.7. Samples: 1727854680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:35:37,072][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 18:35:37,238][49750] Updated weights for policy 0, policy_version 242621 (0.0024) [2024-04-26 18:35:40,897][49750] Updated weights for policy 0, policy_version 242631 (0.0035) [2024-04-26 18:35:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 3975299072. Throughput: 0: 50680.3. Samples: 1728163740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:35:42,072][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 18:35:43,609][49750] Updated weights for policy 0, policy_version 242641 (0.0028) [2024-04-26 18:35:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3975561216. Throughput: 0: 50645.0. Samples: 1728469700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:35:47,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 18:35:47,427][49750] Updated weights for policy 0, policy_version 242651 (0.0033) [2024-04-26 18:35:49,977][49750] Updated weights for policy 0, policy_version 242661 (0.0030) [2024-04-26 18:35:52,063][49517] Fps is (10 sec: 50790.9, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 3975806976. Throughput: 0: 50674.2. Samples: 1728617080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:35:52,071][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 18:35:54,357][49750] Updated weights for policy 0, policy_version 242671 (0.0035) [2024-04-26 18:35:56,546][49750] Updated weights for policy 0, policy_version 242681 (0.0029) [2024-04-26 18:35:57,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 3976101888. Throughput: 0: 50664.1. Samples: 1728923040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:35:57,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 18:36:00,691][49750] Updated weights for policy 0, policy_version 242691 (0.0036) [2024-04-26 18:36:02,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3976331264. Throughput: 0: 50647.9. Samples: 1729231100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:36:02,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 18:36:03,017][49750] Updated weights for policy 0, policy_version 242701 (0.0029) [2024-04-26 18:36:07,046][49750] Updated weights for policy 0, policy_version 242711 (0.0030) [2024-04-26 18:36:07,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 3976577024. Throughput: 0: 50720.8. Samples: 1729378460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:36:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:36:09,368][49750] Updated weights for policy 0, policy_version 242721 (0.0028) [2024-04-26 18:36:12,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3976839168. Throughput: 0: 50571.4. Samples: 1729678000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:36:12,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 18:36:13,582][49750] Updated weights for policy 0, policy_version 242731 (0.0034) [2024-04-26 18:36:15,974][49750] Updated weights for policy 0, policy_version 242741 (0.0032) [2024-04-26 18:36:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3977084928. Throughput: 0: 50571.7. Samples: 1729976320. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:36:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 18:36:20,076][49750] Updated weights for policy 0, policy_version 242751 (0.0031) [2024-04-26 18:36:22,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.3, 300 sec: 50818.1). Total num frames: 3977363456. Throughput: 0: 50774.4. Samples: 1730139540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:36:22,063][49517] Avg episode reward: [(0, '0.442')] [2024-04-26 18:36:22,431][49750] Updated weights for policy 0, policy_version 242761 (0.0030) [2024-04-26 18:36:26,531][49750] Updated weights for policy 0, policy_version 242771 (0.0038) [2024-04-26 18:36:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3977576448. Throughput: 0: 50606.0. Samples: 1730441000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:36:27,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:36:28,964][49750] Updated weights for policy 0, policy_version 242781 (0.0029) [2024-04-26 18:36:32,063][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3977854976. Throughput: 0: 50572.8. Samples: 1730745480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:36:32,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 18:36:32,883][49750] Updated weights for policy 0, policy_version 242791 (0.0029) [2024-04-26 18:36:34,383][49728] Signal inference workers to stop experience collection... (25700 times) [2024-04-26 18:36:34,387][49728] Signal inference workers to resume experience collection... 
(25700 times) [2024-04-26 18:36:34,413][49750] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-04-26 18:36:34,413][49750] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-04-26 18:36:35,270][49750] Updated weights for policy 0, policy_version 242801 (0.0036) [2024-04-26 18:36:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3978100736. Throughput: 0: 50707.7. Samples: 1730898920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:36:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 18:36:39,512][49750] Updated weights for policy 0, policy_version 242811 (0.0029) [2024-04-26 18:36:41,757][49750] Updated weights for policy 0, policy_version 242821 (0.0030) [2024-04-26 18:36:42,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 3978379264. Throughput: 0: 50740.5. Samples: 1731206360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:36:42,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 18:36:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000242821_3978379264.pth... [2024-04-26 18:36:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000242078_3966205952.pth [2024-04-26 18:36:45,931][49750] Updated weights for policy 0, policy_version 242831 (0.0031) [2024-04-26 18:36:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 3978608640. Throughput: 0: 50698.4. Samples: 1731512520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:36:47,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 18:36:48,304][49750] Updated weights for policy 0, policy_version 242841 (0.0032) [2024-04-26 18:36:52,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 3978854400. Throughput: 0: 50749.8. Samples: 1731662200. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:36:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 18:36:52,216][49750] Updated weights for policy 0, policy_version 242851 (0.0028) [2024-04-26 18:36:54,626][49750] Updated weights for policy 0, policy_version 242861 (0.0028) [2024-04-26 18:36:57,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 3979116544. Throughput: 0: 50819.1. Samples: 1731964860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:36:57,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 18:36:58,820][49750] Updated weights for policy 0, policy_version 242871 (0.0033) [2024-04-26 18:37:00,985][49750] Updated weights for policy 0, policy_version 242881 (0.0033) [2024-04-26 18:37:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3979362304. Throughput: 0: 50815.1. Samples: 1732263000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:37:02,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 18:37:05,425][49750] Updated weights for policy 0, policy_version 242891 (0.0030) [2024-04-26 18:37:07,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3979657216. Throughput: 0: 50889.2. Samples: 1732429540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:37:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 18:37:07,584][49750] Updated weights for policy 0, policy_version 242901 (0.0034) [2024-04-26 18:37:11,806][49750] Updated weights for policy 0, policy_version 242911 (0.0031) [2024-04-26 18:37:12,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3979853824. Throughput: 0: 50883.1. Samples: 1732730740. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:37:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 18:37:13,923][49750] Updated weights for policy 0, policy_version 242921 (0.0025) [2024-04-26 18:37:17,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3980115968. Throughput: 0: 50801.8. Samples: 1733031560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:37:17,064][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 18:37:18,146][49750] Updated weights for policy 0, policy_version 242931 (0.0031) [2024-04-26 18:37:20,665][49750] Updated weights for policy 0, policy_version 242941 (0.0029) [2024-04-26 18:37:22,063][49517] Fps is (10 sec: 54067.0, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 3980394496. Throughput: 0: 50720.9. Samples: 1733181360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:37:22,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 18:37:24,525][49750] Updated weights for policy 0, policy_version 242951 (0.0034) [2024-04-26 18:37:27,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3980656640. Throughput: 0: 50744.0. Samples: 1733489840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:37:27,072][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 18:37:27,526][49750] Updated weights for policy 0, policy_version 242961 (0.0029) [2024-04-26 18:37:31,070][49750] Updated weights for policy 0, policy_version 242971 (0.0030) [2024-04-26 18:37:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 3980886016. Throughput: 0: 50829.7. Samples: 1733799860. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:37:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:37:34,007][49750] Updated weights for policy 0, policy_version 242981 (0.0037) [2024-04-26 18:37:37,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 3981131776. Throughput: 0: 50738.8. Samples: 1733945440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:37:37,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 18:37:37,615][49750] Updated weights for policy 0, policy_version 242991 (0.0028) [2024-04-26 18:37:40,481][49750] Updated weights for policy 0, policy_version 243001 (0.0026) [2024-04-26 18:37:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3981393920. Throughput: 0: 50729.0. Samples: 1734247660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-26 18:37:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 18:37:43,974][49750] Updated weights for policy 0, policy_version 243011 (0.0031) [2024-04-26 18:37:44,972][49728] Signal inference workers to stop experience collection... (25750 times) [2024-04-26 18:37:45,012][49750] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-04-26 18:37:45,077][49728] Signal inference workers to resume experience collection... (25750 times) [2024-04-26 18:37:45,077][49750] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-04-26 18:37:46,810][49750] Updated weights for policy 0, policy_version 243021 (0.0026) [2024-04-26 18:37:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 3981656064. Throughput: 0: 50713.3. Samples: 1734545100. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:37:47,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 18:37:50,418][49750] Updated weights for policy 0, policy_version 243031 (0.0035) [2024-04-26 18:37:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3981918208. Throughput: 0: 50751.1. Samples: 1734713340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:37:52,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 18:37:53,128][49750] Updated weights for policy 0, policy_version 243041 (0.0035) [2024-04-26 18:37:57,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 3982131200. Throughput: 0: 50794.8. Samples: 1735016500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:37:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 18:37:57,091][49750] Updated weights for policy 0, policy_version 243051 (0.0028) [2024-04-26 18:37:59,523][49750] Updated weights for policy 0, policy_version 243061 (0.0033) [2024-04-26 18:38:02,062][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3982376960. Throughput: 0: 50722.2. Samples: 1735314060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 18:38:03,687][49750] Updated weights for policy 0, policy_version 243071 (0.0035) [2024-04-26 18:38:06,114][49750] Updated weights for policy 0, policy_version 243081 (0.0029) [2024-04-26 18:38:07,063][49517] Fps is (10 sec: 54066.1, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 3982671872. Throughput: 0: 50635.0. Samples: 1735459940. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 18:38:10,082][49750] Updated weights for policy 0, policy_version 243091 (0.0034) [2024-04-26 18:38:12,063][49517] Fps is (10 sec: 57343.9, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 3982950400. Throughput: 0: 50761.7. Samples: 1735774120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 18:38:12,397][49750] Updated weights for policy 0, policy_version 243101 (0.0031) [2024-04-26 18:38:16,610][49750] Updated weights for policy 0, policy_version 243111 (0.0033) [2024-04-26 18:38:17,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 3983163392. Throughput: 0: 50795.8. Samples: 1736085660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 18:38:18,755][49750] Updated weights for policy 0, policy_version 243121 (0.0029) [2024-04-26 18:38:22,063][49517] Fps is (10 sec: 44236.7, 60 sec: 49971.1, 300 sec: 50596.0). Total num frames: 3983392768. Throughput: 0: 50569.2. Samples: 1736221060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:22,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 18:38:23,170][49750] Updated weights for policy 0, policy_version 243131 (0.0035) [2024-04-26 18:38:25,505][49750] Updated weights for policy 0, policy_version 243141 (0.0028) [2024-04-26 18:38:26,039][49728] Signal inference workers to stop experience collection... (25800 times) [2024-04-26 18:38:26,091][49750] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-04-26 18:38:26,099][49728] Signal inference workers to resume experience collection... 
(25800 times) [2024-04-26 18:38:26,104][49750] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-04-26 18:38:27,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 3983671296. Throughput: 0: 50587.8. Samples: 1736524120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:38:29,476][49750] Updated weights for policy 0, policy_version 243151 (0.0028) [2024-04-26 18:38:32,052][49750] Updated weights for policy 0, policy_version 243161 (0.0029) [2024-04-26 18:38:32,062][49517] Fps is (10 sec: 55706.5, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 3983949824. Throughput: 0: 50761.8. Samples: 1736829380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:32,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 18:38:35,878][49750] Updated weights for policy 0, policy_version 243171 (0.0028) [2024-04-26 18:38:37,062][49517] Fps is (10 sec: 54068.5, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 3984211968. Throughput: 0: 50565.9. Samples: 1736988800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:37,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 18:38:38,543][49750] Updated weights for policy 0, policy_version 243181 (0.0028) [2024-04-26 18:38:42,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 3984408576. Throughput: 0: 50682.0. Samples: 1737297200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:42,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 18:38:42,196][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000243190_3984424960.pth... 
[2024-04-26 18:38:42,249][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000242448_3972268032.pth [2024-04-26 18:38:42,388][49750] Updated weights for policy 0, policy_version 243191 (0.0034) [2024-04-26 18:38:44,842][49750] Updated weights for policy 0, policy_version 243201 (0.0037) [2024-04-26 18:38:47,062][49517] Fps is (10 sec: 45874.7, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 3984670720. Throughput: 0: 50743.6. Samples: 1737597520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:47,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 18:38:48,802][49750] Updated weights for policy 0, policy_version 243211 (0.0028) [2024-04-26 18:38:51,213][49750] Updated weights for policy 0, policy_version 243221 (0.0029) [2024-04-26 18:38:52,062][49517] Fps is (10 sec: 55706.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 3984965632. Throughput: 0: 50750.8. Samples: 1737743720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 18:38:52,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 18:38:55,163][49750] Updated weights for policy 0, policy_version 243231 (0.0028) [2024-04-26 18:38:57,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51609.5, 300 sec: 50762.6). Total num frames: 3985227776. Throughput: 0: 50687.6. Samples: 1738055060. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:38:57,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 18:38:57,656][49750] Updated weights for policy 0, policy_version 243241 (0.0028) [2024-04-26 18:39:01,752][49750] Updated weights for policy 0, policy_version 243251 (0.0033) [2024-04-26 18:39:02,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50790.5, 300 sec: 50540.5). Total num frames: 3985424384. Throughput: 0: 50625.8. Samples: 1738363820. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:02,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 18:39:04,180][49750] Updated weights for policy 0, policy_version 243261 (0.0030) [2024-04-26 18:39:07,062][49517] Fps is (10 sec: 44237.1, 60 sec: 49971.4, 300 sec: 50596.0). Total num frames: 3985670144. Throughput: 0: 50606.0. Samples: 1738498320. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:07,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 18:39:08,297][49750] Updated weights for policy 0, policy_version 243271 (0.0029) [2024-04-26 18:39:10,502][49750] Updated weights for policy 0, policy_version 243281 (0.0032) [2024-04-26 18:39:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3985948672. Throughput: 0: 50672.6. Samples: 1738804380. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:12,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 18:39:14,640][49750] Updated weights for policy 0, policy_version 243291 (0.0031) [2024-04-26 18:39:15,596][49728] Signal inference workers to stop experience collection... (25850 times) [2024-04-26 18:39:15,651][49750] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-04-26 18:39:15,668][49728] Signal inference workers to resume experience collection... (25850 times) [2024-04-26 18:39:15,670][49750] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-04-26 18:39:17,026][49750] Updated weights for policy 0, policy_version 243301 (0.0033) [2024-04-26 18:39:17,063][49517] Fps is (10 sec: 57343.3, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 3986243584. Throughput: 0: 50612.3. Samples: 1739106940. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:17,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 18:39:21,165][49750] Updated weights for policy 0, policy_version 243311 (0.0033) [2024-04-26 18:39:22,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51609.7, 300 sec: 50762.6). Total num frames: 3986489344. Throughput: 0: 50779.0. Samples: 1739273860. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 18:39:23,412][49750] Updated weights for policy 0, policy_version 243321 (0.0028) [2024-04-26 18:39:27,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50517.5, 300 sec: 50651.5). Total num frames: 3986702336. Throughput: 0: 50647.2. Samples: 1739576320. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:27,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 18:39:27,640][49750] Updated weights for policy 0, policy_version 243331 (0.0034) [2024-04-26 18:39:29,845][49750] Updated weights for policy 0, policy_version 243341 (0.0029) [2024-04-26 18:39:32,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 3986964480. Throughput: 0: 50691.7. Samples: 1739878640. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:32,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 18:39:33,910][49750] Updated weights for policy 0, policy_version 243351 (0.0032) [2024-04-26 18:39:36,205][49750] Updated weights for policy 0, policy_version 243361 (0.0030) [2024-04-26 18:39:37,063][49517] Fps is (10 sec: 54067.1, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 3987243008. Throughput: 0: 50883.5. Samples: 1740033480. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:37,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:39:40,445][49750] Updated weights for policy 0, policy_version 243371 (0.0030) [2024-04-26 18:39:42,063][49517] Fps is (10 sec: 55704.2, 60 sec: 51882.6, 300 sec: 50818.1). Total num frames: 3987521536. Throughput: 0: 50784.7. Samples: 1740340380. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 18:39:42,686][49750] Updated weights for policy 0, policy_version 243381 (0.0032) [2024-04-26 18:39:46,920][49750] Updated weights for policy 0, policy_version 243391 (0.0029) [2024-04-26 18:39:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.4, 300 sec: 50596.0). Total num frames: 3987734528. Throughput: 0: 50757.6. Samples: 1740647920. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:47,063][49517] Avg episode reward: [(0, '0.683')] [2024-04-26 18:39:49,182][49750] Updated weights for policy 0, policy_version 243401 (0.0029) [2024-04-26 18:39:52,062][49517] Fps is (10 sec: 44238.0, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 3987963904. Throughput: 0: 50838.3. Samples: 1740786040. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 18:39:53,248][49750] Updated weights for policy 0, policy_version 243411 (0.0032) [2024-04-26 18:39:55,548][49750] Updated weights for policy 0, policy_version 243421 (0.0029) [2024-04-26 18:39:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 3988242432. Throughput: 0: 50752.0. Samples: 1741088220. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:39:57,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 18:39:59,793][49750] Updated weights for policy 0, policy_version 243431 (0.0027) [2024-04-26 18:40:01,912][49750] Updated weights for policy 0, policy_version 243441 (0.0030) [2024-04-26 18:40:02,062][49517] Fps is (10 sec: 57343.2, 60 sec: 51882.6, 300 sec: 50873.7). Total num frames: 3988537344. Throughput: 0: 50753.8. Samples: 1741390860. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-26 18:40:02,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 18:40:06,143][49750] Updated weights for policy 0, policy_version 243451 (0.0035) [2024-04-26 18:40:06,614][49728] Signal inference workers to stop experience collection... (25900 times) [2024-04-26 18:40:06,653][49750] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-04-26 18:40:06,686][49728] Signal inference workers to resume experience collection... (25900 times) [2024-04-26 18:40:06,687][49750] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-04-26 18:40:07,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51609.6, 300 sec: 50651.6). Total num frames: 3988766720. Throughput: 0: 50885.1. Samples: 1741563680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:07,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:40:08,380][49750] Updated weights for policy 0, policy_version 243461 (0.0027) [2024-04-26 18:40:12,062][49517] Fps is (10 sec: 44237.3, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 3988979712. Throughput: 0: 50784.6. Samples: 1741861620. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:12,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 18:40:12,582][49750] Updated weights for policy 0, policy_version 243471 (0.0029) [2024-04-26 18:40:15,024][49750] Updated weights for policy 0, policy_version 243481 (0.0028) [2024-04-26 18:40:17,062][49517] Fps is (10 sec: 47513.0, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 3989241856. Throughput: 0: 50686.1. Samples: 1742159520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:17,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 18:40:19,067][49750] Updated weights for policy 0, policy_version 243491 (0.0031) [2024-04-26 18:40:21,438][49750] Updated weights for policy 0, policy_version 243501 (0.0027) [2024-04-26 18:40:22,062][49517] Fps is (10 sec: 55705.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3989536768. Throughput: 0: 50610.2. Samples: 1742310940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 18:40:25,524][49750] Updated weights for policy 0, policy_version 243511 (0.0028) [2024-04-26 18:40:27,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3989782528. Throughput: 0: 50804.1. Samples: 1742626560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 18:40:27,868][49750] Updated weights for policy 0, policy_version 243521 (0.0034) [2024-04-26 18:40:31,952][49750] Updated weights for policy 0, policy_version 243531 (0.0033) [2024-04-26 18:40:32,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 3990011904. Throughput: 0: 50692.5. Samples: 1742929080. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:32,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 18:40:34,371][49750] Updated weights for policy 0, policy_version 243541 (0.0025) [2024-04-26 18:40:37,063][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 3990241280. Throughput: 0: 50704.3. Samples: 1743067740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 18:40:38,262][49750] Updated weights for policy 0, policy_version 243551 (0.0033) [2024-04-26 18:40:40,934][49750] Updated weights for policy 0, policy_version 243561 (0.0033) [2024-04-26 18:40:42,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 3990536192. Throughput: 0: 50855.6. Samples: 1743376720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:42,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 18:40:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000243563_3990536192.pth... [2024-04-26 18:40:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000242821_3978379264.pth [2024-04-26 18:40:44,714][49750] Updated weights for policy 0, policy_version 243571 (0.0031) [2024-04-26 18:40:47,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 3990798336. Throughput: 0: 51073.6. Samples: 1743689180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:47,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 18:40:47,341][49750] Updated weights for policy 0, policy_version 243581 (0.0026) [2024-04-26 18:40:51,112][49750] Updated weights for policy 0, policy_version 243591 (0.0031) [2024-04-26 18:40:52,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51609.5, 300 sec: 50707.1). Total num frames: 3991060480. Throughput: 0: 50796.7. Samples: 1743849540. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:52,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 18:40:53,932][49750] Updated weights for policy 0, policy_version 243601 (0.0037) [2024-04-26 18:40:57,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 3991289856. Throughput: 0: 50801.8. Samples: 1744147700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:40:57,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 18:40:57,614][49750] Updated weights for policy 0, policy_version 243611 (0.0028) [2024-04-26 18:41:00,288][49750] Updated weights for policy 0, policy_version 243621 (0.0029) [2024-04-26 18:41:02,063][49517] Fps is (10 sec: 45874.9, 60 sec: 49698.1, 300 sec: 50651.5). Total num frames: 3991519232. Throughput: 0: 50967.0. Samples: 1744453040. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:41:02,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 18:41:03,940][49750] Updated weights for policy 0, policy_version 243631 (0.0030) [2024-04-26 18:41:06,545][49728] Signal inference workers to stop experience collection... (25950 times) [2024-04-26 18:41:06,545][49728] Signal inference workers to resume experience collection... (25950 times) [2024-04-26 18:41:06,579][49750] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-04-26 18:41:06,579][49750] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-04-26 18:41:06,679][49750] Updated weights for policy 0, policy_version 243641 (0.0032) [2024-04-26 18:41:07,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 3991830528. Throughput: 0: 50809.3. Samples: 1744597360. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:41:07,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 18:41:10,297][49750] Updated weights for policy 0, policy_version 243651 (0.0031) [2024-04-26 18:41:12,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 3992059904. Throughput: 0: 50655.1. Samples: 1744906040. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 18:41:12,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 18:41:13,012][49750] Updated weights for policy 0, policy_version 243661 (0.0036) [2024-04-26 18:41:16,951][49750] Updated weights for policy 0, policy_version 243671 (0.0031) [2024-04-26 18:41:17,062][49517] Fps is (10 sec: 47514.3, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 3992305664. Throughput: 0: 50662.3. Samples: 1745208880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:41:17,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 18:41:19,655][49750] Updated weights for policy 0, policy_version 243681 (0.0036) [2024-04-26 18:41:22,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3992535040. Throughput: 0: 50753.0. Samples: 1745351620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:41:22,063][49517] Avg episode reward: [(0, '0.469')] [2024-04-26 18:41:23,403][49750] Updated weights for policy 0, policy_version 243691 (0.0031) [2024-04-26 18:41:26,091][49750] Updated weights for policy 0, policy_version 243701 (0.0034) [2024-04-26 18:41:27,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 3992813568. Throughput: 0: 50792.8. Samples: 1745662400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:41:27,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 18:41:29,660][49750] Updated weights for policy 0, policy_version 243711 (0.0031) [2024-04-26 18:41:32,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 3993092096. Throughput: 0: 50623.3. Samples: 1745967220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:41:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 18:41:32,506][49750] Updated weights for policy 0, policy_version 243721 (0.0032) [2024-04-26 18:41:36,048][49750] Updated weights for policy 0, policy_version 243731 (0.0027) [2024-04-26 18:41:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51609.5, 300 sec: 50707.1). Total num frames: 3993337856. Throughput: 0: 50593.7. Samples: 1746126260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:41:37,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 18:41:39,204][49750] Updated weights for policy 0, policy_version 243741 (0.0030) [2024-04-26 18:41:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3993583616. Throughput: 0: 50674.1. Samples: 1746428040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:41:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 18:41:42,682][49750] Updated weights for policy 0, policy_version 243751 (0.0033) [2024-04-26 18:41:45,875][49750] Updated weights for policy 0, policy_version 243761 (0.0032) [2024-04-26 18:41:47,062][49517] Fps is (10 sec: 47514.9, 60 sec: 50244.5, 300 sec: 50707.1). Total num frames: 3993812992. Throughput: 0: 50634.9. Samples: 1746731600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:41:47,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 18:41:49,074][49750] Updated weights for policy 0, policy_version 243771 (0.0029) [2024-04-26 18:41:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 3994091520. Throughput: 0: 50662.4. Samples: 1746877160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:41:52,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 18:41:52,381][49750] Updated weights for policy 0, policy_version 243781 (0.0042) [2024-04-26 18:41:55,426][49750] Updated weights for policy 0, policy_version 243791 (0.0026) [2024-04-26 18:41:57,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 3994337280. Throughput: 0: 50582.8. Samples: 1747182260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:41:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 18:41:58,856][49750] Updated weights for policy 0, policy_version 243801 (0.0032) [2024-04-26 18:42:01,831][49750] Updated weights for policy 0, policy_version 243811 (0.0035) [2024-04-26 18:42:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51336.7, 300 sec: 50651.6). Total num frames: 3994599424. Throughput: 0: 50699.5. Samples: 1747490360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:42:02,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:42:05,378][49750] Updated weights for policy 0, policy_version 243821 (0.0030) [2024-04-26 18:42:07,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49698.2, 300 sec: 50707.1). Total num frames: 3994812416. Throughput: 0: 50828.0. Samples: 1747638880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:42:07,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 18:42:08,193][49728] Signal inference workers to stop experience collection... 
(26000 times) [2024-04-26 18:42:08,194][49728] Signal inference workers to resume experience collection... (26000 times) [2024-04-26 18:42:08,230][49750] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-04-26 18:42:08,230][49750] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-04-26 18:42:08,327][49750] Updated weights for policy 0, policy_version 243831 (0.0037) [2024-04-26 18:42:11,682][49750] Updated weights for policy 0, policy_version 243841 (0.0031) [2024-04-26 18:42:12,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 3995090944. Throughput: 0: 50733.8. Samples: 1747945420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:42:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 18:42:14,802][49750] Updated weights for policy 0, policy_version 243851 (0.0029) [2024-04-26 18:42:17,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 3995369472. Throughput: 0: 50806.6. Samples: 1748253520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:42:17,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 18:42:18,273][49750] Updated weights for policy 0, policy_version 243861 (0.0032) [2024-04-26 18:42:21,172][49750] Updated weights for policy 0, policy_version 243871 (0.0029) [2024-04-26 18:42:22,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 3995615232. Throughput: 0: 50864.5. Samples: 1748415160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 18:42:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 18:42:24,865][49750] Updated weights for policy 0, policy_version 243881 (0.0036) [2024-04-26 18:42:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 3995860992. Throughput: 0: 50896.1. Samples: 1748718360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:42:27,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 18:42:27,578][49750] Updated weights for policy 0, policy_version 243891 (0.0031) [2024-04-26 18:42:31,141][49750] Updated weights for policy 0, policy_version 243901 (0.0029) [2024-04-26 18:42:32,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 3996090368. Throughput: 0: 50883.8. Samples: 1749021380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:42:32,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-26 18:42:33,920][49750] Updated weights for policy 0, policy_version 243911 (0.0029) [2024-04-26 18:42:37,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 3996368896. Throughput: 0: 50794.8. Samples: 1749162940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:42:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 18:42:37,444][49750] Updated weights for policy 0, policy_version 243921 (0.0032) [2024-04-26 18:42:40,404][49750] Updated weights for policy 0, policy_version 243931 (0.0031) [2024-04-26 18:42:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 3996614656. Throughput: 0: 50762.3. Samples: 1749466560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:42:42,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 18:42:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000243934_3996614656.pth... [2024-04-26 18:42:42,128][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000243190_3984424960.pth [2024-04-26 18:42:44,237][49750] Updated weights for policy 0, policy_version 243941 (0.0031) [2024-04-26 18:42:46,901][49750] Updated weights for policy 0, policy_version 243951 (0.0027) [2024-04-26 18:42:47,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51609.4, 300 sec: 50818.1). 
Total num frames: 3996909568. Throughput: 0: 50737.1. Samples: 1749773540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:42:47,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 18:42:50,647][49750] Updated weights for policy 0, policy_version 243961 (0.0034) [2024-04-26 18:42:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 3997122560. Throughput: 0: 50881.9. Samples: 1749928560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:42:52,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 18:42:53,183][49750] Updated weights for policy 0, policy_version 243971 (0.0036) [2024-04-26 18:42:57,063][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 3997368320. Throughput: 0: 50870.7. Samples: 1750234600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:42:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:42:57,535][49750] Updated weights for policy 0, policy_version 243981 (0.0030) [2024-04-26 18:42:59,497][49750] Updated weights for policy 0, policy_version 243991 (0.0030) [2024-04-26 18:43:02,063][49517] Fps is (10 sec: 52427.4, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 3997646848. Throughput: 0: 50854.1. Samples: 1750541960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:43:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 18:43:03,906][49750] Updated weights for policy 0, policy_version 244001 (0.0035) [2024-04-26 18:43:05,465][49728] Signal inference workers to stop experience collection... (26050 times) [2024-04-26 18:43:05,465][49728] Signal inference workers to resume experience collection... 
(26050 times) [2024-04-26 18:43:05,489][49750] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-04-26 18:43:05,489][49750] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-04-26 18:43:05,999][49750] Updated weights for policy 0, policy_version 244011 (0.0033) [2024-04-26 18:43:07,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50651.6). Total num frames: 3997892608. Throughput: 0: 50800.1. Samples: 1750701160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:43:07,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 18:43:10,172][49750] Updated weights for policy 0, policy_version 244021 (0.0033) [2024-04-26 18:43:12,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 3998154752. Throughput: 0: 50931.7. Samples: 1751010280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:43:12,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 18:43:12,478][49750] Updated weights for policy 0, policy_version 244031 (0.0034) [2024-04-26 18:43:16,763][49750] Updated weights for policy 0, policy_version 244041 (0.0037) [2024-04-26 18:43:17,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 3998384128. Throughput: 0: 50898.2. Samples: 1751311800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:43:17,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 18:43:19,102][49750] Updated weights for policy 0, policy_version 244051 (0.0028) [2024-04-26 18:43:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 3998662656. Throughput: 0: 50925.0. Samples: 1751454560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:43:22,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-26 18:43:23,279][49750] Updated weights for policy 0, policy_version 244061 (0.0026) [2024-04-26 18:43:25,598][49750] Updated weights for policy 0, policy_version 244071 (0.0031) [2024-04-26 18:43:27,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 3998924800. Throughput: 0: 50913.7. Samples: 1751757680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:43:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 18:43:29,581][49750] Updated weights for policy 0, policy_version 244081 (0.0034) [2024-04-26 18:43:32,062][49517] Fps is (10 sec: 50791.6, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 3999170560. Throughput: 0: 50832.3. Samples: 1752060980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 18:43:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 18:43:32,087][49750] Updated weights for policy 0, policy_version 244091 (0.0031) [2024-04-26 18:43:35,862][49750] Updated weights for policy 0, policy_version 244101 (0.0028) [2024-04-26 18:43:37,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 3999399936. Throughput: 0: 50936.3. Samples: 1752220700. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:43:37,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 18:43:38,441][49750] Updated weights for policy 0, policy_version 244111 (0.0030) [2024-04-26 18:43:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 3999662080. Throughput: 0: 50941.3. Samples: 1752526960. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:43:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 18:43:42,413][49750] Updated weights for policy 0, policy_version 244121 (0.0034) [2024-04-26 18:43:44,790][49750] Updated weights for policy 0, policy_version 244131 (0.0031) [2024-04-26 18:43:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.4, 300 sec: 50651.6). Total num frames: 3999907840. Throughput: 0: 50755.3. Samples: 1752825940. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:43:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 18:43:48,879][49750] Updated weights for policy 0, policy_version 244141 (0.0031) [2024-04-26 18:43:51,425][49750] Updated weights for policy 0, policy_version 244151 (0.0037) [2024-04-26 18:43:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4000186368. Throughput: 0: 50564.1. Samples: 1752976540. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:43:52,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 18:43:55,333][49750] Updated weights for policy 0, policy_version 244161 (0.0029) [2024-04-26 18:43:57,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.7, 300 sec: 50929.3). Total num frames: 4000448512. Throughput: 0: 50546.7. Samples: 1753284880. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:43:57,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 18:43:57,914][49750] Updated weights for policy 0, policy_version 244171 (0.0030) [2024-04-26 18:44:01,691][49750] Updated weights for policy 0, policy_version 244181 (0.0031) [2024-04-26 18:44:02,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4000677888. Throughput: 0: 50631.5. Samples: 1753590220. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:44:02,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 18:44:04,197][49750] Updated weights for policy 0, policy_version 244191 (0.0027) [2024-04-26 18:44:07,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4000923648. Throughput: 0: 50648.3. Samples: 1753733720. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:44:07,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 18:44:08,193][49750] Updated weights for policy 0, policy_version 244201 (0.0031) [2024-04-26 18:44:10,583][49750] Updated weights for policy 0, policy_version 244211 (0.0029) [2024-04-26 18:44:12,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4001202176. Throughput: 0: 50660.1. Samples: 1754037380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:44:12,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 18:44:14,501][49750] Updated weights for policy 0, policy_version 244221 (0.0036) [2024-04-26 18:44:17,063][49517] Fps is (10 sec: 52427.5, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4001447936. Throughput: 0: 50823.7. Samples: 1754348060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:44:17,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 18:44:17,380][49750] Updated weights for policy 0, policy_version 244231 (0.0032) [2024-04-26 18:44:17,592][49728] Signal inference workers to stop experience collection... (26100 times) [2024-04-26 18:44:17,593][49728] Signal inference workers to resume experience collection... 
(26100 times) [2024-04-26 18:44:17,606][49750] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-04-26 18:44:17,606][49750] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-04-26 18:44:20,906][49750] Updated weights for policy 0, policy_version 244241 (0.0032) [2024-04-26 18:44:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.6, 300 sec: 50818.2). Total num frames: 4001693696. Throughput: 0: 50743.7. Samples: 1754504160. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:44:22,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 18:44:23,734][49750] Updated weights for policy 0, policy_version 244251 (0.0034) [2024-04-26 18:44:27,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4001955840. Throughput: 0: 50652.0. Samples: 1754806300. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:44:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 18:44:27,209][49750] Updated weights for policy 0, policy_version 244261 (0.0029) [2024-04-26 18:44:30,104][49750] Updated weights for policy 0, policy_version 244271 (0.0038) [2024-04-26 18:44:32,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4002201600. Throughput: 0: 50859.9. Samples: 1755114640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:44:32,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 18:44:33,667][49750] Updated weights for policy 0, policy_version 244281 (0.0032) [2024-04-26 18:44:36,657][49750] Updated weights for policy 0, policy_version 244291 (0.0030) [2024-04-26 18:44:37,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51336.7, 300 sec: 50707.1). Total num frames: 4002480128. Throughput: 0: 50831.7. Samples: 1755263960. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:44:37,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 18:44:40,179][49750] Updated weights for policy 0, policy_version 244301 (0.0027) [2024-04-26 18:44:42,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4002709504. Throughput: 0: 50635.6. Samples: 1755563480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 18:44:42,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 18:44:42,091][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000244307_4002725888.pth... [2024-04-26 18:44:42,137][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000243563_3990536192.pth [2024-04-26 18:44:43,165][49750] Updated weights for policy 0, policy_version 244311 (0.0030) [2024-04-26 18:44:46,578][49750] Updated weights for policy 0, policy_version 244321 (0.0032) [2024-04-26 18:44:47,062][49517] Fps is (10 sec: 49150.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4002971648. Throughput: 0: 50704.4. Samples: 1755871920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:44:47,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 18:44:49,523][49750] Updated weights for policy 0, policy_version 244331 (0.0028) [2024-04-26 18:44:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4003217408. Throughput: 0: 50795.1. Samples: 1756019500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:44:52,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 18:44:53,168][49750] Updated weights for policy 0, policy_version 244341 (0.0029) [2024-04-26 18:44:55,949][49750] Updated weights for policy 0, policy_version 244351 (0.0029) [2024-04-26 18:44:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4003479552. Throughput: 0: 50730.6. Samples: 1756320260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:44:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 18:44:59,559][49750] Updated weights for policy 0, policy_version 244361 (0.0034) [2024-04-26 18:45:02,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4003741696. Throughput: 0: 50668.8. Samples: 1756628140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:02,062][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 18:45:02,413][49750] Updated weights for policy 0, policy_version 244371 (0.0028) [2024-04-26 18:45:05,986][49750] Updated weights for policy 0, policy_version 244381 (0.0038) [2024-04-26 18:45:07,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4003971072. Throughput: 0: 50723.8. Samples: 1756786740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:07,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 18:45:08,965][49750] Updated weights for policy 0, policy_version 244391 (0.0035) [2024-04-26 18:45:12,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4004233216. Throughput: 0: 50653.4. Samples: 1757085700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:12,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 18:45:12,599][49750] Updated weights for policy 0, policy_version 244401 (0.0038) [2024-04-26 18:45:15,517][49750] Updated weights for policy 0, policy_version 244411 (0.0034) [2024-04-26 18:45:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4004478976. Throughput: 0: 50520.6. Samples: 1757388060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:17,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 18:45:18,908][49750] Updated weights for policy 0, policy_version 244421 (0.0035) [2024-04-26 18:45:21,847][49750] Updated weights for policy 0, policy_version 244431 (0.0032) [2024-04-26 18:45:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50762.7). Total num frames: 4004757504. Throughput: 0: 50611.4. Samples: 1757541480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:22,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 18:45:25,229][49750] Updated weights for policy 0, policy_version 244441 (0.0030) [2024-04-26 18:45:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4004986880. Throughput: 0: 50610.2. Samples: 1757840940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:27,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 18:45:28,554][49750] Updated weights for policy 0, policy_version 244451 (0.0027) [2024-04-26 18:45:31,774][49750] Updated weights for policy 0, policy_version 244461 (0.0033) [2024-04-26 18:45:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4005249024. Throughput: 0: 50619.3. Samples: 1758149780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:32,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:45:34,845][49750] Updated weights for policy 0, policy_version 244471 (0.0031) [2024-04-26 18:45:37,062][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4005511168. Throughput: 0: 50697.2. Samples: 1758300880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:37,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 18:45:38,273][49750] Updated weights for policy 0, policy_version 244481 (0.0037) [2024-04-26 18:45:41,292][49750] Updated weights for policy 0, policy_version 244491 (0.0038) [2024-04-26 18:45:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4005756928. Throughput: 0: 50769.0. Samples: 1758604860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:42,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 18:45:43,147][49728] Signal inference workers to stop experience collection... (26150 times) [2024-04-26 18:45:43,184][49750] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-04-26 18:45:43,215][49728] Signal inference workers to resume experience collection... (26150 times) [2024-04-26 18:45:43,216][49750] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-04-26 18:45:44,603][49750] Updated weights for policy 0, policy_version 244501 (0.0029) [2024-04-26 18:45:47,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4006019072. Throughput: 0: 50708.0. Samples: 1758910000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:47,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 18:45:47,934][49750] Updated weights for policy 0, policy_version 244511 (0.0033) [2024-04-26 18:45:51,218][49750] Updated weights for policy 0, policy_version 244521 (0.0036) [2024-04-26 18:45:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4006248448. Throughput: 0: 50764.2. Samples: 1759071120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-26 18:45:52,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 18:45:54,255][49750] Updated weights for policy 0, policy_version 244531 (0.0033) [2024-04-26 18:45:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4006510592. Throughput: 0: 50694.3. Samples: 1759366940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:45:57,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 18:45:57,741][49750] Updated weights for policy 0, policy_version 244541 (0.0032) [2024-04-26 18:46:00,625][49750] Updated weights for policy 0, policy_version 244551 (0.0028) [2024-04-26 18:46:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 4006756352. Throughput: 0: 50672.9. Samples: 1759668340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:02,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 18:46:04,216][49750] Updated weights for policy 0, policy_version 244561 (0.0039) [2024-04-26 18:46:07,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 4007034880. Throughput: 0: 50697.8. Samples: 1759822880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:07,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:46:07,112][49750] Updated weights for policy 0, policy_version 244571 (0.0032) [2024-04-26 18:46:10,552][49750] Updated weights for policy 0, policy_version 244581 (0.0034) [2024-04-26 18:46:12,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4007280640. Throughput: 0: 50700.8. Samples: 1760122480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 18:46:13,633][49750] Updated weights for policy 0, policy_version 244591 (0.0034) [2024-04-26 18:46:16,944][49750] Updated weights for policy 0, policy_version 244601 (0.0037) [2024-04-26 18:46:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4007542784. Throughput: 0: 50638.2. Samples: 1760428500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:17,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 18:46:20,210][49750] Updated weights for policy 0, policy_version 244611 (0.0032) [2024-04-26 18:46:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4007772160. Throughput: 0: 50564.0. Samples: 1760576260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:46:23,613][49750] Updated weights for policy 0, policy_version 244621 (0.0027) [2024-04-26 18:46:26,700][49750] Updated weights for policy 0, policy_version 244631 (0.0027) [2024-04-26 18:46:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50651.6). Total num frames: 4008034304. Throughput: 0: 50692.8. Samples: 1760886040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:27,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 18:46:29,919][49750] Updated weights for policy 0, policy_version 244641 (0.0031) [2024-04-26 18:46:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4008296448. Throughput: 0: 50622.5. Samples: 1761188020. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:32,063][49517] Avg episode reward: [(0, '0.482')] [2024-04-26 18:46:33,268][49750] Updated weights for policy 0, policy_version 244651 (0.0035) [2024-04-26 18:46:36,234][49750] Updated weights for policy 0, policy_version 244661 (0.0034) [2024-04-26 18:46:37,065][49517] Fps is (10 sec: 50775.8, 60 sec: 50514.9, 300 sec: 50706.6). Total num frames: 4008542208. Throughput: 0: 50452.2. Samples: 1761341620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:37,066][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 18:46:39,526][49750] Updated weights for policy 0, policy_version 244671 (0.0034) [2024-04-26 18:46:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4008787968. Throughput: 0: 50872.8. Samples: 1761656220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 18:46:42,254][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000244679_4008820736.pth... [2024-04-26 18:46:42,298][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000243934_3996614656.pth [2024-04-26 18:46:42,798][49750] Updated weights for policy 0, policy_version 244681 (0.0029) [2024-04-26 18:46:46,028][49750] Updated weights for policy 0, policy_version 244691 (0.0033) [2024-04-26 18:46:47,062][49517] Fps is (10 sec: 49166.6, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4009033728. Throughput: 0: 50708.9. Samples: 1761950240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 18:46:49,480][49750] Updated weights for policy 0, policy_version 244701 (0.0030) [2024-04-26 18:46:52,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4009312256. Throughput: 0: 50682.7. Samples: 1762103600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 18:46:52,493][49750] Updated weights for policy 0, policy_version 244711 (0.0029) [2024-04-26 18:46:55,034][49728] Signal inference workers to stop experience collection... (26200 times) [2024-04-26 18:46:55,077][49750] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-04-26 18:46:55,098][49728] Signal inference workers to resume experience collection... (26200 times) [2024-04-26 18:46:55,099][49750] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-04-26 18:46:55,842][49750] Updated weights for policy 0, policy_version 244721 (0.0032) [2024-04-26 18:46:57,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4009574400. Throughput: 0: 50837.9. Samples: 1762410180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:46:57,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 18:46:58,791][49750] Updated weights for policy 0, policy_version 244731 (0.0029) [2024-04-26 18:47:02,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4009803776. Throughput: 0: 50932.0. Samples: 1762720440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 18:47:02,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 18:47:02,428][49750] Updated weights for policy 0, policy_version 244741 (0.0033) [2024-04-26 18:47:05,064][49750] Updated weights for policy 0, policy_version 244751 (0.0030) [2024-04-26 18:47:07,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4010049536. Throughput: 0: 50834.8. Samples: 1762863820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 18:47:09,023][49750] Updated weights for policy 0, policy_version 244761 (0.0030) [2024-04-26 18:47:11,486][49750] Updated weights for policy 0, policy_version 244771 (0.0030) [2024-04-26 18:47:12,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4010328064. Throughput: 0: 50657.9. Samples: 1763165640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 18:47:15,421][49750] Updated weights for policy 0, policy_version 244781 (0.0031) [2024-04-26 18:47:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4010590208. Throughput: 0: 50784.6. Samples: 1763473320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:17,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 18:47:18,330][49750] Updated weights for policy 0, policy_version 244791 (0.0032) [2024-04-26 18:47:21,723][49750] Updated weights for policy 0, policy_version 244801 (0.0032) [2024-04-26 18:47:22,062][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4010835968. Throughput: 0: 50985.9. Samples: 1763635840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:22,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 18:47:24,875][49750] Updated weights for policy 0, policy_version 244811 (0.0037) [2024-04-26 18:47:27,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4011081728. Throughput: 0: 50745.8. Samples: 1763939780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:27,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 18:47:28,190][49750] Updated weights for policy 0, policy_version 244821 (0.0037) [2024-04-26 18:47:31,431][49750] Updated weights for policy 0, policy_version 244831 (0.0031) [2024-04-26 18:47:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4011327488. Throughput: 0: 50957.7. Samples: 1764243340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:32,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 18:47:34,678][49750] Updated weights for policy 0, policy_version 244841 (0.0036) [2024-04-26 18:47:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51065.9, 300 sec: 50818.2). Total num frames: 4011606016. Throughput: 0: 50936.9. Samples: 1764395760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 18:47:37,949][49750] Updated weights for policy 0, policy_version 244851 (0.0026) [2024-04-26 18:47:40,992][49750] Updated weights for policy 0, policy_version 244861 (0.0031) [2024-04-26 18:47:42,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 4011868160. Throughput: 0: 50866.6. Samples: 1764699180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 18:47:44,264][49750] Updated weights for policy 0, policy_version 244871 (0.0033) [2024-04-26 18:47:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4012097536. Throughput: 0: 50829.8. Samples: 1765007780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:47,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 18:47:47,540][49750] Updated weights for policy 0, policy_version 244881 (0.0036) [2024-04-26 18:47:50,644][49750] Updated weights for policy 0, policy_version 244891 (0.0034) [2024-04-26 18:47:52,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4012343296. Throughput: 0: 50824.4. Samples: 1765150920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:52,063][49517] Avg episode reward: [(0, '0.681')] [2024-04-26 18:47:54,005][49750] Updated weights for policy 0, policy_version 244901 (0.0030) [2024-04-26 18:47:57,037][49750] Updated weights for policy 0, policy_version 244911 (0.0028) [2024-04-26 18:47:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.3, 300 sec: 50762.7). Total num frames: 4012621824. Throughput: 0: 50912.8. Samples: 1765456720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:47:57,071][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 18:48:00,419][49750] Updated weights for policy 0, policy_version 244921 (0.0039) [2024-04-26 18:48:02,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4012883968. Throughput: 0: 50820.7. Samples: 1765760260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:48:02,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 18:48:03,504][49750] Updated weights for policy 0, policy_version 244931 (0.0035) [2024-04-26 18:48:05,448][49728] Signal inference workers to stop experience collection... (26250 times) [2024-04-26 18:48:05,493][49750] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-04-26 18:48:05,554][49728] Signal inference workers to resume experience collection... 
(26250 times) [2024-04-26 18:48:05,554][49750] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-04-26 18:48:06,807][49750] Updated weights for policy 0, policy_version 244941 (0.0035) [2024-04-26 18:48:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4013129728. Throughput: 0: 50872.9. Samples: 1765925120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:48:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 18:48:09,808][49750] Updated weights for policy 0, policy_version 244951 (0.0033) [2024-04-26 18:48:12,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4013342720. Throughput: 0: 50796.9. Samples: 1766225640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 18:48:12,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 18:48:13,173][49750] Updated weights for policy 0, policy_version 244961 (0.0040) [2024-04-26 18:48:16,270][49750] Updated weights for policy 0, policy_version 244971 (0.0031) [2024-04-26 18:48:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4013621248. Throughput: 0: 50705.8. Samples: 1766525100. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:48:17,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 18:48:19,674][49750] Updated weights for policy 0, policy_version 244981 (0.0032) [2024-04-26 18:48:22,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4013899776. Throughput: 0: 50860.5. Samples: 1766684480. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:48:22,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 18:48:22,694][49750] Updated weights for policy 0, policy_version 244991 (0.0031) [2024-04-26 18:48:26,165][49750] Updated weights for policy 0, policy_version 245001 (0.0032) [2024-04-26 18:48:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4014145536. Throughput: 0: 50817.8. Samples: 1766985980. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:48:27,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 18:48:29,092][49750] Updated weights for policy 0, policy_version 245011 (0.0025) [2024-04-26 18:48:32,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4014374912. Throughput: 0: 50767.2. Samples: 1767292300. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:48:32,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 18:48:32,565][49750] Updated weights for policy 0, policy_version 245021 (0.0032) [2024-04-26 18:48:36,072][49750] Updated weights for policy 0, policy_version 245031 (0.0035) [2024-04-26 18:48:37,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4014620672. Throughput: 0: 50798.1. Samples: 1767436840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:48:37,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 18:48:38,947][49750] Updated weights for policy 0, policy_version 245041 (0.0029) [2024-04-26 18:48:42,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4014899200. Throughput: 0: 50691.1. Samples: 1767737820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:48:42,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 18:48:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000245050_4014899200.pth... 
[2024-04-26 18:48:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000244307_4002725888.pth [2024-04-26 18:48:42,326][49750] Updated weights for policy 0, policy_version 245051 (0.0031) [2024-04-26 18:48:45,455][49750] Updated weights for policy 0, policy_version 245061 (0.0030) [2024-04-26 18:48:47,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 4015161344. Throughput: 0: 50645.5. Samples: 1768039300. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:48:47,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 18:48:48,729][49750] Updated weights for policy 0, policy_version 245071 (0.0027) [2024-04-26 18:48:52,017][49750] Updated weights for policy 0, policy_version 245081 (0.0026) [2024-04-26 18:48:52,067][49517] Fps is (10 sec: 50767.4, 60 sec: 51059.6, 300 sec: 50706.3). Total num frames: 4015407104. Throughput: 0: 50622.9. Samples: 1768203380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:48:52,068][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 18:48:55,145][49750] Updated weights for policy 0, policy_version 245091 (0.0028) [2024-04-26 18:48:57,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4015620096. Throughput: 0: 50710.7. Samples: 1768507620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:48:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 18:48:58,424][49750] Updated weights for policy 0, policy_version 245101 (0.0030) [2024-04-26 18:49:01,437][49750] Updated weights for policy 0, policy_version 245111 (0.0033) [2024-04-26 18:49:02,062][49517] Fps is (10 sec: 49174.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4015898624. Throughput: 0: 50668.0. Samples: 1768805160. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:49:02,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 18:49:04,851][49750] Updated weights for policy 0, policy_version 245121 (0.0030) [2024-04-26 18:49:06,828][49728] Signal inference workers to stop experience collection... (26300 times) [2024-04-26 18:49:06,828][49728] Signal inference workers to resume experience collection... (26300 times) [2024-04-26 18:49:06,856][49750] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-04-26 18:49:06,856][49750] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-04-26 18:49:07,062][49517] Fps is (10 sec: 57343.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4016193536. Throughput: 0: 50634.1. Samples: 1768963020. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:49:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 18:49:07,714][49750] Updated weights for policy 0, policy_version 245131 (0.0039) [2024-04-26 18:49:11,299][49750] Updated weights for policy 0, policy_version 245141 (0.0031) [2024-04-26 18:49:12,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4016439296. Throughput: 0: 50855.5. Samples: 1769274480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:49:12,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 18:49:14,111][49750] Updated weights for policy 0, policy_version 245151 (0.0029) [2024-04-26 18:49:17,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4016652288. Throughput: 0: 50893.8. Samples: 1769582520. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:49:17,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 18:49:17,699][49750] Updated weights for policy 0, policy_version 245161 (0.0028) [2024-04-26 18:49:20,770][49750] Updated weights for policy 0, policy_version 245171 (0.0031) [2024-04-26 18:49:22,062][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 4016898048. Throughput: 0: 50745.5. Samples: 1769720380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 20.0) [2024-04-26 18:49:22,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 18:49:24,069][49750] Updated weights for policy 0, policy_version 245181 (0.0032) [2024-04-26 18:49:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4017176576. Throughput: 0: 50997.9. Samples: 1770032720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:49:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 18:49:27,229][49750] Updated weights for policy 0, policy_version 245191 (0.0035) [2024-04-26 18:49:30,507][49750] Updated weights for policy 0, policy_version 245201 (0.0030) [2024-04-26 18:49:32,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4017455104. Throughput: 0: 51005.2. Samples: 1770334540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:49:32,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:49:34,132][49750] Updated weights for policy 0, policy_version 245211 (0.0025) [2024-04-26 18:49:36,914][49750] Updated weights for policy 0, policy_version 245221 (0.0031) [2024-04-26 18:49:37,062][49517] Fps is (10 sec: 54066.4, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4017717248. Throughput: 0: 50891.3. Samples: 1770493260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:49:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:49:40,508][49750] Updated weights for policy 0, policy_version 245231 (0.0035) [2024-04-26 18:49:42,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4017930240. Throughput: 0: 51033.3. Samples: 1770804120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:49:42,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 18:49:43,226][49750] Updated weights for policy 0, policy_version 245241 (0.0027) [2024-04-26 18:49:46,863][49750] Updated weights for policy 0, policy_version 245251 (0.0031) [2024-04-26 18:49:47,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4018192384. Throughput: 0: 51071.6. Samples: 1771103380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:49:47,063][49517] Avg episode reward: [(0, '0.449')] [2024-04-26 18:49:49,733][49750] Updated weights for policy 0, policy_version 245261 (0.0030) [2024-04-26 18:49:52,062][49517] Fps is (10 sec: 55705.3, 60 sec: 51340.4, 300 sec: 50873.7). Total num frames: 4018487296. Throughput: 0: 51033.4. Samples: 1771259520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:49:52,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 18:49:53,310][49750] Updated weights for policy 0, policy_version 245271 (0.0029) [2024-04-26 18:49:56,118][49750] Updated weights for policy 0, policy_version 245281 (0.0029) [2024-04-26 18:49:57,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51609.5, 300 sec: 50762.6). Total num frames: 4018716672. Throughput: 0: 50988.8. Samples: 1771568980. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:49:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 18:49:59,633][49750] Updated weights for policy 0, policy_version 245291 (0.0032) [2024-04-26 18:50:02,062][49517] Fps is (10 sec: 45875.6, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4018946048. Throughput: 0: 51022.2. Samples: 1771878520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:50:02,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 18:50:02,592][49750] Updated weights for policy 0, policy_version 245301 (0.0027) [2024-04-26 18:50:06,022][49750] Updated weights for policy 0, policy_version 245311 (0.0033) [2024-04-26 18:50:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4019208192. Throughput: 0: 50956.3. Samples: 1772013420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:50:07,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 18:50:08,373][49728] Signal inference workers to stop experience collection... (26350 times) [2024-04-26 18:50:08,379][49728] Signal inference workers to resume experience collection... (26350 times) [2024-04-26 18:50:08,404][49750] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-04-26 18:50:08,404][49750] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-04-26 18:50:09,048][49750] Updated weights for policy 0, policy_version 245321 (0.0039) [2024-04-26 18:50:12,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4019486720. Throughput: 0: 50919.9. Samples: 1772324120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:50:12,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 18:50:12,274][49750] Updated weights for policy 0, policy_version 245331 (0.0036) [2024-04-26 18:50:15,460][49750] Updated weights for policy 0, policy_version 245341 (0.0034) [2024-04-26 18:50:17,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4019748864. Throughput: 0: 50962.6. Samples: 1772627860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:50:17,071][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 18:50:18,820][49750] Updated weights for policy 0, policy_version 245351 (0.0030) [2024-04-26 18:50:21,864][49750] Updated weights for policy 0, policy_version 245361 (0.0027) [2024-04-26 18:50:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51882.7, 300 sec: 50929.2). Total num frames: 4020011008. Throughput: 0: 50963.3. Samples: 1772786600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:50:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:50:25,387][49750] Updated weights for policy 0, policy_version 245371 (0.0029) [2024-04-26 18:50:27,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4020207616. Throughput: 0: 50761.8. Samples: 1773088400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:50:27,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:50:28,251][49750] Updated weights for policy 0, policy_version 245381 (0.0040) [2024-04-26 18:50:31,725][49750] Updated weights for policy 0, policy_version 245391 (0.0030) [2024-04-26 18:50:32,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4020486144. Throughput: 0: 50912.0. Samples: 1773394420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 18:50:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 18:50:34,638][49750] Updated weights for policy 0, policy_version 245401 (0.0032) [2024-04-26 18:50:37,063][49517] Fps is (10 sec: 55704.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4020764672. Throughput: 0: 50825.2. Samples: 1773546660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:50:37,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 18:50:38,076][49750] Updated weights for policy 0, policy_version 245411 (0.0031) [2024-04-26 18:50:41,016][49750] Updated weights for policy 0, policy_version 245421 (0.0030) [2024-04-26 18:50:42,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4021026816. Throughput: 0: 50745.3. Samples: 1773852520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:50:42,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 18:50:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000245424_4021026816.pth... [2024-04-26 18:50:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000244679_4008820736.pth [2024-04-26 18:50:44,597][49750] Updated weights for policy 0, policy_version 245431 (0.0035) [2024-04-26 18:50:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4021256192. Throughput: 0: 50768.0. Samples: 1774163080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:50:47,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 18:50:47,543][49750] Updated weights for policy 0, policy_version 245441 (0.0034) [2024-04-26 18:50:51,072][49750] Updated weights for policy 0, policy_version 245451 (0.0031) [2024-04-26 18:50:52,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4021485568. Throughput: 0: 50961.9. Samples: 1774306700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:50:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:50:53,967][49750] Updated weights for policy 0, policy_version 245461 (0.0033) [2024-04-26 18:50:57,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4021764096. Throughput: 0: 50782.2. Samples: 1774609320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:50:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 18:50:57,356][49750] Updated weights for policy 0, policy_version 245471 (0.0035) [2024-04-26 18:51:00,615][49750] Updated weights for policy 0, policy_version 245481 (0.0029) [2024-04-26 18:51:02,062][49517] Fps is (10 sec: 55705.0, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4022042624. Throughput: 0: 50808.0. Samples: 1774914220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:51:02,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 18:51:03,717][49750] Updated weights for policy 0, policy_version 245491 (0.0028) [2024-04-26 18:51:06,907][49750] Updated weights for policy 0, policy_version 245501 (0.0029) [2024-04-26 18:51:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4022288384. Throughput: 0: 50958.1. Samples: 1775079720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:51:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:51:10,298][49750] Updated weights for policy 0, policy_version 245511 (0.0030) [2024-04-26 18:51:12,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4022501376. Throughput: 0: 50935.1. Samples: 1775380480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:51:12,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 18:51:13,471][49750] Updated weights for policy 0, policy_version 245521 (0.0029) [2024-04-26 18:51:16,921][49750] Updated weights for policy 0, policy_version 245531 (0.0028) [2024-04-26 18:51:17,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4022779904. Throughput: 0: 50821.8. Samples: 1775681400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:51:17,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 18:51:19,962][49750] Updated weights for policy 0, policy_version 245541 (0.0025) [2024-04-26 18:51:22,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4023042048. Throughput: 0: 50768.2. Samples: 1775831220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:51:22,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 18:51:23,294][49750] Updated weights for policy 0, policy_version 245551 (0.0031) [2024-04-26 18:51:25,028][49728] Signal inference workers to stop experience collection... (26400 times) [2024-04-26 18:51:25,072][49750] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-04-26 18:51:25,093][49728] Signal inference workers to resume experience collection... (26400 times) [2024-04-26 18:51:25,093][49750] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-04-26 18:51:26,240][49750] Updated weights for policy 0, policy_version 245561 (0.0032) [2024-04-26 18:51:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4023304192. Throughput: 0: 50964.5. Samples: 1776145920. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:51:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 18:51:29,932][49750] Updated weights for policy 0, policy_version 245571 (0.0033) [2024-04-26 18:51:32,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50818.7). Total num frames: 4023533568. Throughput: 0: 50816.9. Samples: 1776449840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:51:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 18:51:32,589][49750] Updated weights for policy 0, policy_version 245581 (0.0030) [2024-04-26 18:51:36,366][49750] Updated weights for policy 0, policy_version 245591 (0.0032) [2024-04-26 18:51:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4023779328. Throughput: 0: 50832.4. Samples: 1776594160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:51:37,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 18:51:39,027][49750] Updated weights for policy 0, policy_version 245601 (0.0034) [2024-04-26 18:51:42,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4024041472. Throughput: 0: 50999.2. Samples: 1776904280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-26 18:51:42,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 18:51:42,890][49750] Updated weights for policy 0, policy_version 245611 (0.0031) [2024-04-26 18:51:45,518][49750] Updated weights for policy 0, policy_version 245621 (0.0031) [2024-04-26 18:51:47,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4024320000. Throughput: 0: 50817.4. Samples: 1777201000. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:51:47,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 18:51:49,417][49750] Updated weights for policy 0, policy_version 245631 (0.0032) [2024-04-26 18:51:51,822][49750] Updated weights for policy 0, policy_version 245641 (0.0032) [2024-04-26 18:51:52,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4024582144. Throughput: 0: 50847.2. Samples: 1777367840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:51:52,071][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 18:51:55,816][49750] Updated weights for policy 0, policy_version 245651 (0.0029) [2024-04-26 18:51:57,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4024778752. Throughput: 0: 50918.2. Samples: 1777671800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:51:57,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 18:51:58,296][49750] Updated weights for policy 0, policy_version 245661 (0.0037) [2024-04-26 18:52:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4025057280. Throughput: 0: 50989.4. Samples: 1777975920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 18:52:02,128][49750] Updated weights for policy 0, policy_version 245671 (0.0031) [2024-04-26 18:52:04,731][49750] Updated weights for policy 0, policy_version 245681 (0.0022) [2024-04-26 18:52:07,062][49517] Fps is (10 sec: 55705.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4025335808. Throughput: 0: 50921.3. Samples: 1778122680. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:07,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 18:52:08,560][49750] Updated weights for policy 0, policy_version 245691 (0.0033) [2024-04-26 18:52:11,307][49750] Updated weights for policy 0, policy_version 245701 (0.0030) [2024-04-26 18:52:12,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51882.6, 300 sec: 50929.2). Total num frames: 4025614336. Throughput: 0: 50848.0. Samples: 1778434080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:12,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 18:52:15,117][49750] Updated weights for policy 0, policy_version 245711 (0.0034) [2024-04-26 18:52:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4025843712. Throughput: 0: 50994.6. Samples: 1778744600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:17,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 18:52:17,725][49750] Updated weights for policy 0, policy_version 245721 (0.0034) [2024-04-26 18:52:21,429][49750] Updated weights for policy 0, policy_version 245731 (0.0033) [2024-04-26 18:52:22,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4026073088. Throughput: 0: 50958.3. Samples: 1778887280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:22,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 18:52:24,101][49750] Updated weights for policy 0, policy_version 245741 (0.0033) [2024-04-26 18:52:25,747][49728] Signal inference workers to stop experience collection... (26450 times) [2024-04-26 18:52:25,747][49728] Signal inference workers to resume experience collection... 
(26450 times) [2024-04-26 18:52:25,759][49750] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-04-26 18:52:25,760][49750] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-04-26 18:52:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4026351616. Throughput: 0: 50969.4. Samples: 1779197900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:27,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 18:52:27,678][49750] Updated weights for policy 0, policy_version 245751 (0.0034) [2024-04-26 18:52:30,474][49750] Updated weights for policy 0, policy_version 245761 (0.0030) [2024-04-26 18:52:32,062][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4026613760. Throughput: 0: 51051.5. Samples: 1779498320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:32,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 18:52:34,114][49750] Updated weights for policy 0, policy_version 245771 (0.0036) [2024-04-26 18:52:36,783][49750] Updated weights for policy 0, policy_version 245781 (0.0035) [2024-04-26 18:52:37,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51882.6, 300 sec: 50929.2). Total num frames: 4026892288. Throughput: 0: 50998.7. Samples: 1779662780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:37,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:52:40,949][49750] Updated weights for policy 0, policy_version 245791 (0.0032) [2024-04-26 18:52:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4027105280. Throughput: 0: 51064.0. Samples: 1779969680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:42,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 18:52:42,131][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000245796_4027121664.pth... 
[2024-04-26 18:52:42,175][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000245050_4014899200.pth [2024-04-26 18:52:43,259][49750] Updated weights for policy 0, policy_version 245801 (0.0035) [2024-04-26 18:52:47,063][49517] Fps is (10 sec: 44235.9, 60 sec: 50244.1, 300 sec: 50818.1). Total num frames: 4027334656. Throughput: 0: 51183.3. Samples: 1780279180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:47,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 18:52:47,536][49750] Updated weights for policy 0, policy_version 245811 (0.0032) [2024-04-26 18:52:49,725][49750] Updated weights for policy 0, policy_version 245821 (0.0031) [2024-04-26 18:52:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4027629568. Throughput: 0: 50912.4. Samples: 1780413740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 18:52:52,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 18:52:54,120][49750] Updated weights for policy 0, policy_version 245831 (0.0030) [2024-04-26 18:52:56,131][49750] Updated weights for policy 0, policy_version 245841 (0.0032) [2024-04-26 18:52:57,062][49517] Fps is (10 sec: 55706.3, 60 sec: 51882.6, 300 sec: 50873.7). Total num frames: 4027891712. Throughput: 0: 50771.0. Samples: 1780718780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:52:57,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 18:53:00,707][49750] Updated weights for policy 0, policy_version 245851 (0.0037) [2024-04-26 18:53:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4028153856. Throughput: 0: 50790.3. Samples: 1781030160. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 18:53:02,844][49750] Updated weights for policy 0, policy_version 245861 (0.0032) [2024-04-26 18:53:06,983][49750] Updated weights for policy 0, policy_version 245871 (0.0033) [2024-04-26 18:53:07,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4028350464. Throughput: 0: 50856.0. Samples: 1781175800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:07,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 18:53:09,413][49750] Updated weights for policy 0, policy_version 245881 (0.0034) [2024-04-26 18:53:12,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 4028628992. Throughput: 0: 50731.0. Samples: 1781480800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:12,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 18:53:13,269][49750] Updated weights for policy 0, policy_version 245891 (0.0040) [2024-04-26 18:53:15,786][49750] Updated weights for policy 0, policy_version 245901 (0.0034) [2024-04-26 18:53:17,062][49517] Fps is (10 sec: 55705.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4028907520. Throughput: 0: 50868.9. Samples: 1781787420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 18:53:19,826][49750] Updated weights for policy 0, policy_version 245911 (0.0035) [2024-04-26 18:53:21,363][49728] Signal inference workers to stop experience collection... (26500 times) [2024-04-26 18:53:21,363][49728] Signal inference workers to resume experience collection... 
(26500 times) [2024-04-26 18:53:21,389][49750] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-04-26 18:53:21,389][49750] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-04-26 18:53:22,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4029153280. Throughput: 0: 50843.9. Samples: 1781950760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:53:22,189][49750] Updated weights for policy 0, policy_version 245921 (0.0033) [2024-04-26 18:53:26,307][49750] Updated weights for policy 0, policy_version 245931 (0.0030) [2024-04-26 18:53:27,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4029382656. Throughput: 0: 50727.5. Samples: 1782252420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:27,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:53:28,724][49750] Updated weights for policy 0, policy_version 245941 (0.0040) [2024-04-26 18:53:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4029628416. Throughput: 0: 50549.5. Samples: 1782553900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:32,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 18:53:32,590][49750] Updated weights for policy 0, policy_version 245951 (0.0029) [2024-04-26 18:53:35,023][49750] Updated weights for policy 0, policy_version 245961 (0.0031) [2024-04-26 18:53:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4029906944. Throughput: 0: 50896.9. Samples: 1782704100. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:37,063][49517] Avg episode reward: [(0, '0.678')] [2024-04-26 18:53:38,864][49750] Updated weights for policy 0, policy_version 245971 (0.0028) [2024-04-26 18:53:41,365][49750] Updated weights for policy 0, policy_version 245981 (0.0028) [2024-04-26 18:53:42,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4030169088. Throughput: 0: 50978.8. Samples: 1783012820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 18:53:45,482][49750] Updated weights for policy 0, policy_version 245991 (0.0035) [2024-04-26 18:53:47,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51609.8, 300 sec: 50930.0). Total num frames: 4030431232. Throughput: 0: 50750.2. Samples: 1783313920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:47,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 18:53:48,027][49750] Updated weights for policy 0, policy_version 246001 (0.0031) [2024-04-26 18:53:51,879][49750] Updated weights for policy 0, policy_version 246011 (0.0034) [2024-04-26 18:53:52,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.3, 300 sec: 50984.7). Total num frames: 4030660608. Throughput: 0: 50913.6. Samples: 1783466920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:52,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 18:53:54,399][49750] Updated weights for policy 0, policy_version 246021 (0.0029) [2024-04-26 18:53:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 4030922752. Throughput: 0: 50896.4. Samples: 1783771140. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:53:57,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 18:53:58,235][49750] Updated weights for policy 0, policy_version 246031 (0.0034) [2024-04-26 18:54:00,713][49750] Updated weights for policy 0, policy_version 246041 (0.0036) [2024-04-26 18:54:02,062][49517] Fps is (10 sec: 54068.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4031201280. Throughput: 0: 50905.0. Samples: 1784078140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 18:54:02,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 18:54:04,584][49750] Updated weights for policy 0, policy_version 246051 (0.0036) [2024-04-26 18:54:07,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4031447040. Throughput: 0: 50972.4. Samples: 1784244520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:07,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 18:54:07,207][49750] Updated weights for policy 0, policy_version 246061 (0.0032) [2024-04-26 18:54:11,003][49750] Updated weights for policy 0, policy_version 246071 (0.0030) [2024-04-26 18:54:12,063][49517] Fps is (10 sec: 49150.9, 60 sec: 51063.3, 300 sec: 50984.7). Total num frames: 4031692800. Throughput: 0: 50979.3. Samples: 1784546500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:12,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 18:54:13,752][49750] Updated weights for policy 0, policy_version 246081 (0.0032) [2024-04-26 18:54:17,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.1, 300 sec: 50929.2). Total num frames: 4031922176. Throughput: 0: 50954.9. Samples: 1784846880. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:17,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 18:54:17,400][49750] Updated weights for policy 0, policy_version 246091 (0.0028) [2024-04-26 18:54:20,349][49750] Updated weights for policy 0, policy_version 246101 (0.0037) [2024-04-26 18:54:22,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4032200704. Throughput: 0: 50936.4. Samples: 1784996240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:22,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 18:54:23,727][49750] Updated weights for policy 0, policy_version 246111 (0.0035) [2024-04-26 18:54:26,734][49750] Updated weights for policy 0, policy_version 246121 (0.0030) [2024-04-26 18:54:27,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4032462848. Throughput: 0: 50903.0. Samples: 1785303460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:27,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:54:27,568][49728] Signal inference workers to stop experience collection... (26550 times) [2024-04-26 18:54:27,620][49750] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-04-26 18:54:27,641][49728] Signal inference workers to resume experience collection... (26550 times) [2024-04-26 18:54:27,642][49750] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-04-26 18:54:30,086][49750] Updated weights for policy 0, policy_version 246131 (0.0030) [2024-04-26 18:54:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4032708608. Throughput: 0: 51024.3. Samples: 1785610020. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:32,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 18:54:33,056][49750] Updated weights for policy 0, policy_version 246141 (0.0032) [2024-04-26 18:54:36,617][49750] Updated weights for policy 0, policy_version 246151 (0.0029) [2024-04-26 18:54:37,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 4032970752. Throughput: 0: 50876.1. Samples: 1785756340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:37,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 18:54:39,470][49750] Updated weights for policy 0, policy_version 246161 (0.0037) [2024-04-26 18:54:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4033200128. Throughput: 0: 50922.2. Samples: 1786062640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 18:54:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000246167_4033200128.pth... [2024-04-26 18:54:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000245424_4021026816.pth [2024-04-26 18:54:43,030][49750] Updated weights for policy 0, policy_version 246171 (0.0029) [2024-04-26 18:54:45,962][49750] Updated weights for policy 0, policy_version 246181 (0.0032) [2024-04-26 18:54:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4033478656. Throughput: 0: 50887.5. Samples: 1786368080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:47,063][49517] Avg episode reward: [(0, '0.687')] [2024-04-26 18:54:49,334][49750] Updated weights for policy 0, policy_version 246191 (0.0031) [2024-04-26 18:54:52,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4033724416. Throughput: 0: 50686.3. Samples: 1786525400. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 18:54:52,279][49750] Updated weights for policy 0, policy_version 246201 (0.0027) [2024-04-26 18:54:55,727][49750] Updated weights for policy 0, policy_version 246211 (0.0034) [2024-04-26 18:54:57,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 4033986560. Throughput: 0: 50853.0. Samples: 1786834880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:54:57,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 18:54:58,772][49750] Updated weights for policy 0, policy_version 246221 (0.0035) [2024-04-26 18:55:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 4034232320. Throughput: 0: 51067.3. Samples: 1787144900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:55:02,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 18:55:02,292][49750] Updated weights for policy 0, policy_version 246231 (0.0034) [2024-04-26 18:55:05,122][49750] Updated weights for policy 0, policy_version 246241 (0.0030) [2024-04-26 18:55:07,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4034494464. Throughput: 0: 51014.3. Samples: 1787291880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:55:07,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:55:08,563][49750] Updated weights for policy 0, policy_version 246251 (0.0040) [2024-04-26 18:55:11,566][49750] Updated weights for policy 0, policy_version 246261 (0.0029) [2024-04-26 18:55:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4034740224. Throughput: 0: 50997.3. Samples: 1787598340. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 18:55:12,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 18:55:14,877][49750] Updated weights for policy 0, policy_version 246271 (0.0041) [2024-04-26 18:55:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 4034985984. Throughput: 0: 50989.5. Samples: 1787904540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:55:17,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 18:55:18,218][49750] Updated weights for policy 0, policy_version 246281 (0.0029) [2024-04-26 18:55:21,481][49750] Updated weights for policy 0, policy_version 246291 (0.0036) [2024-04-26 18:55:22,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 4035248128. Throughput: 0: 51036.5. Samples: 1788052980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:55:22,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 18:55:24,682][49750] Updated weights for policy 0, policy_version 246301 (0.0034) [2024-04-26 18:55:27,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4035510272. Throughput: 0: 50861.3. Samples: 1788351400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:55:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 18:55:27,983][49750] Updated weights for policy 0, policy_version 246311 (0.0028) [2024-04-26 18:55:31,087][49750] Updated weights for policy 0, policy_version 246321 (0.0029) [2024-04-26 18:55:32,062][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4035772416. Throughput: 0: 50952.3. Samples: 1788660940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:55:32,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 18:55:34,374][49750] Updated weights for policy 0, policy_version 246331 (0.0032) [2024-04-26 18:55:36,219][49728] Signal inference workers to stop experience collection... (26600 times) [2024-04-26 18:55:36,219][49728] Signal inference workers to resume experience collection... (26600 times) [2024-04-26 18:55:36,235][49750] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-04-26 18:55:36,235][49750] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-04-26 18:55:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4036018176. Throughput: 0: 50879.5. Samples: 1788814980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:55:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 18:55:37,549][49750] Updated weights for policy 0, policy_version 246341 (0.0029) [2024-04-26 18:55:40,666][49750] Updated weights for policy 0, policy_version 246351 (0.0029) [2024-04-26 18:55:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4036263936. Throughput: 0: 50877.9. Samples: 1789124380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:55:42,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 18:55:44,038][49750] Updated weights for policy 0, policy_version 246361 (0.0028) [2024-04-26 18:55:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4036526080. Throughput: 0: 50671.6. Samples: 1789425120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:55:47,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 18:55:47,103][49750] Updated weights for policy 0, policy_version 246371 (0.0035) [2024-04-26 18:55:50,463][49750] Updated weights for policy 0, policy_version 246381 (0.0026) [2024-04-26 18:55:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4036788224. Throughput: 0: 50764.8. Samples: 1789576300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:55:52,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 18:55:53,575][49750] Updated weights for policy 0, policy_version 246391 (0.0027) [2024-04-26 18:55:56,748][49750] Updated weights for policy 0, policy_version 246401 (0.0032) [2024-04-26 18:55:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4037050368. Throughput: 0: 50881.0. Samples: 1789887980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:55:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 18:56:00,067][49750] Updated weights for policy 0, policy_version 246411 (0.0027) [2024-04-26 18:56:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4037296128. Throughput: 0: 51005.7. Samples: 1790199800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:56:02,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 18:56:03,325][49750] Updated weights for policy 0, policy_version 246421 (0.0031) [2024-04-26 18:56:06,424][49750] Updated weights for policy 0, policy_version 246431 (0.0035) [2024-04-26 18:56:07,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4037541888. Throughput: 0: 50877.7. Samples: 1790342480. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:56:07,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 18:56:09,682][49750] Updated weights for policy 0, policy_version 246441 (0.0033) [2024-04-26 18:56:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4037804032. Throughput: 0: 51106.2. Samples: 1790651180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:56:12,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 18:56:12,794][49750] Updated weights for policy 0, policy_version 246451 (0.0038) [2024-04-26 18:56:16,176][49750] Updated weights for policy 0, policy_version 246461 (0.0032) [2024-04-26 18:56:17,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4038066176. Throughput: 0: 50934.3. Samples: 1790952980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:56:17,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 18:56:19,295][49750] Updated weights for policy 0, policy_version 246471 (0.0029) [2024-04-26 18:56:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4038311936. Throughput: 0: 50942.2. Samples: 1791107380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 18:56:22,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 18:56:22,547][49750] Updated weights for policy 0, policy_version 246481 (0.0030) [2024-04-26 18:56:25,680][49750] Updated weights for policy 0, policy_version 246491 (0.0030) [2024-04-26 18:56:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4038557696. Throughput: 0: 50902.7. Samples: 1791415000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:56:27,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 18:56:29,002][49750] Updated weights for policy 0, policy_version 246501 (0.0030) [2024-04-26 18:56:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 4038819840. Throughput: 0: 51022.7. Samples: 1791721140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:56:32,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 18:56:32,199][49750] Updated weights for policy 0, policy_version 246511 (0.0039) [2024-04-26 18:56:35,549][49750] Updated weights for policy 0, policy_version 246521 (0.0032) [2024-04-26 18:56:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 4039081984. Throughput: 0: 50981.8. Samples: 1791870480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:56:37,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 18:56:38,608][49750] Updated weights for policy 0, policy_version 246531 (0.0029) [2024-04-26 18:56:41,924][49750] Updated weights for policy 0, policy_version 246541 (0.0033) [2024-04-26 18:56:42,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4039327744. Throughput: 0: 50895.9. Samples: 1792178300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:56:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 18:56:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000246541_4039327744.pth... [2024-04-26 18:56:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000245796_4027121664.pth [2024-04-26 18:56:45,018][49750] Updated weights for policy 0, policy_version 246551 (0.0028) [2024-04-26 18:56:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4039573504. Throughput: 0: 50661.9. Samples: 1792479580. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:56:47,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 18:56:47,306][49728] Signal inference workers to stop experience collection... (26650 times) [2024-04-26 18:56:47,340][49750] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-04-26 18:56:47,374][49728] Signal inference workers to resume experience collection... (26650 times) [2024-04-26 18:56:47,374][49750] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-04-26 18:56:48,279][49750] Updated weights for policy 0, policy_version 246561 (0.0028) [2024-04-26 18:56:51,404][49750] Updated weights for policy 0, policy_version 246571 (0.0030) [2024-04-26 18:56:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50984.8). Total num frames: 4039819264. Throughput: 0: 50869.6. Samples: 1792631620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:56:52,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 18:56:54,695][49750] Updated weights for policy 0, policy_version 246581 (0.0032) [2024-04-26 18:56:57,062][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.2, 300 sec: 50929.2). Total num frames: 4040081408. Throughput: 0: 50848.0. Samples: 1792939340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:56:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 18:56:58,101][49750] Updated weights for policy 0, policy_version 246591 (0.0032) [2024-04-26 18:57:01,279][49750] Updated weights for policy 0, policy_version 246601 (0.0028) [2024-04-26 18:57:02,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4040343552. Throughput: 0: 50871.2. Samples: 1793242180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:57:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 18:57:04,575][49750] Updated weights for policy 0, policy_version 246611 (0.0030) [2024-04-26 18:57:07,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4040605696. Throughput: 0: 50790.2. Samples: 1793392940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:57:07,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 18:57:07,872][49750] Updated weights for policy 0, policy_version 246621 (0.0025) [2024-04-26 18:57:11,036][49750] Updated weights for policy 0, policy_version 246631 (0.0031) [2024-04-26 18:57:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4040835072. Throughput: 0: 50725.8. Samples: 1793697660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:57:12,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:57:14,292][49750] Updated weights for policy 0, policy_version 246641 (0.0029) [2024-04-26 18:57:17,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50984.8). Total num frames: 4041113600. Throughput: 0: 50782.5. Samples: 1794006360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:57:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 18:57:17,379][49750] Updated weights for policy 0, policy_version 246651 (0.0035) [2024-04-26 18:57:20,611][49750] Updated weights for policy 0, policy_version 246661 (0.0035) [2024-04-26 18:57:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4041359360. Throughput: 0: 50836.1. Samples: 1794158100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:57:22,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 18:57:23,706][49750] Updated weights for policy 0, policy_version 246671 (0.0033) [2024-04-26 18:57:27,029][49750] Updated weights for policy 0, policy_version 246681 (0.0028) [2024-04-26 18:57:27,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4041621504. Throughput: 0: 50917.9. Samples: 1794469600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:57:27,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 18:57:30,144][49750] Updated weights for policy 0, policy_version 246691 (0.0034) [2024-04-26 18:57:32,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4041850880. Throughput: 0: 50883.3. Samples: 1794769340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 18:57:32,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 18:57:33,544][49750] Updated weights for policy 0, policy_version 246701 (0.0033) [2024-04-26 18:57:36,723][49750] Updated weights for policy 0, policy_version 246711 (0.0030) [2024-04-26 18:57:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4042129408. Throughput: 0: 50808.1. Samples: 1794917980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:57:37,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 18:57:39,797][49750] Updated weights for policy 0, policy_version 246721 (0.0037) [2024-04-26 18:57:42,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 4042375168. Throughput: 0: 50856.2. Samples: 1795227860. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:57:42,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 18:57:43,161][49750] Updated weights for policy 0, policy_version 246731 (0.0037) [2024-04-26 18:57:46,261][49750] Updated weights for policy 0, policy_version 246741 (0.0027) [2024-04-26 18:57:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4042637312. Throughput: 0: 50924.8. Samples: 1795533800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:57:47,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 18:57:49,526][49750] Updated weights for policy 0, policy_version 246751 (0.0032) [2024-04-26 18:57:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4042883072. Throughput: 0: 50998.8. Samples: 1795687880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:57:52,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 18:57:52,657][49750] Updated weights for policy 0, policy_version 246761 (0.0029) [2024-04-26 18:57:56,006][49750] Updated weights for policy 0, policy_version 246771 (0.0031) [2024-04-26 18:57:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4043145216. Throughput: 0: 50915.0. Samples: 1795988840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:57:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 18:57:58,998][49750] Updated weights for policy 0, policy_version 246781 (0.0025) [2024-04-26 18:58:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4043390976. Throughput: 0: 50820.6. Samples: 1796293280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:58:02,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 18:58:02,499][49750] Updated weights for policy 0, policy_version 246791 (0.0030) [2024-04-26 18:58:03,980][49728] Signal inference workers to stop experience collection... 
(26700 times) [2024-04-26 18:58:04,000][49750] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-04-26 18:58:04,089][49728] Signal inference workers to resume experience collection... (26700 times) [2024-04-26 18:58:04,089][49750] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-04-26 18:58:05,834][49750] Updated weights for policy 0, policy_version 246801 (0.0031) [2024-04-26 18:58:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4043653120. Throughput: 0: 50755.9. Samples: 1796442120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:58:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 18:58:08,914][49750] Updated weights for policy 0, policy_version 246811 (0.0033) [2024-04-26 18:58:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4043898880. Throughput: 0: 50764.4. Samples: 1796754000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:58:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 18:58:12,198][49750] Updated weights for policy 0, policy_version 246821 (0.0032) [2024-04-26 18:58:15,402][49750] Updated weights for policy 0, policy_version 246831 (0.0029) [2024-04-26 18:58:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4044144640. Throughput: 0: 50893.4. Samples: 1797059540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:58:17,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 18:58:18,614][49750] Updated weights for policy 0, policy_version 246841 (0.0029) [2024-04-26 18:58:21,757][49750] Updated weights for policy 0, policy_version 246851 (0.0037) [2024-04-26 18:58:22,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.2, 300 sec: 50929.2). Total num frames: 4044406784. Throughput: 0: 50747.3. Samples: 1797201620. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:58:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 18:58:24,963][49750] Updated weights for policy 0, policy_version 246861 (0.0025) [2024-04-26 18:58:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 4044652544. Throughput: 0: 50759.9. Samples: 1797512060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:58:27,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 18:58:28,140][49750] Updated weights for policy 0, policy_version 246871 (0.0028) [2024-04-26 18:58:31,606][49750] Updated weights for policy 0, policy_version 246881 (0.0033) [2024-04-26 18:58:32,062][49517] Fps is (10 sec: 50791.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4044914688. Throughput: 0: 50858.3. Samples: 1797822420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:58:32,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 18:58:34,632][49750] Updated weights for policy 0, policy_version 246891 (0.0031) [2024-04-26 18:58:37,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4045193216. Throughput: 0: 50772.9. Samples: 1797972660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:58:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 18:58:37,913][49750] Updated weights for policy 0, policy_version 246901 (0.0033) [2024-04-26 18:58:41,113][49750] Updated weights for policy 0, policy_version 246911 (0.0031) [2024-04-26 18:58:42,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4045438976. Throughput: 0: 50981.8. Samples: 1798283020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 18:58:42,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 18:58:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000246914_4045438976.pth... 
[2024-04-26 18:58:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000246167_4033200128.pth [2024-04-26 18:58:44,321][49750] Updated weights for policy 0, policy_version 246921 (0.0035) [2024-04-26 18:58:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4045684736. Throughput: 0: 50893.8. Samples: 1798583500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:58:47,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 18:58:47,499][49750] Updated weights for policy 0, policy_version 246931 (0.0026) [2024-04-26 18:58:50,692][49750] Updated weights for policy 0, policy_version 246941 (0.0029) [2024-04-26 18:58:52,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4045930496. Throughput: 0: 51045.7. Samples: 1798739180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:58:52,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 18:58:53,900][49750] Updated weights for policy 0, policy_version 246951 (0.0028) [2024-04-26 18:58:56,987][49750] Updated weights for policy 0, policy_version 246961 (0.0030) [2024-04-26 18:58:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4046209024. Throughput: 0: 50971.0. Samples: 1799047700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:58:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 18:59:00,415][49750] Updated weights for policy 0, policy_version 246971 (0.0034) [2024-04-26 18:59:02,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4046438400. Throughput: 0: 50936.8. Samples: 1799351700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 18:59:03,303][49750] Updated weights for policy 0, policy_version 246981 (0.0028) [2024-04-26 18:59:06,829][49750] Updated weights for policy 0, policy_version 246991 (0.0033) [2024-04-26 18:59:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4046700544. Throughput: 0: 51145.7. Samples: 1799503160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:07,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 18:59:09,954][49750] Updated weights for policy 0, policy_version 247001 (0.0029) [2024-04-26 18:59:12,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50929.3). Total num frames: 4046946304. Throughput: 0: 50923.9. Samples: 1799803640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 18:59:13,198][49750] Updated weights for policy 0, policy_version 247011 (0.0029) [2024-04-26 18:59:16,674][49750] Updated weights for policy 0, policy_version 247021 (0.0030) [2024-04-26 18:59:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4047208448. Throughput: 0: 50755.1. Samples: 1800106400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 18:59:19,598][49750] Updated weights for policy 0, policy_version 247031 (0.0027) [2024-04-26 18:59:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4047470592. Throughput: 0: 50834.1. Samples: 1800260200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:22,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 18:59:22,965][49750] Updated weights for policy 0, policy_version 247041 (0.0030) [2024-04-26 18:59:26,002][49750] Updated weights for policy 0, policy_version 247051 (0.0029) [2024-04-26 18:59:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4047699968. Throughput: 0: 50752.1. Samples: 1800566860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:27,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 18:59:28,287][49728] Signal inference workers to stop experience collection... (26750 times) [2024-04-26 18:59:28,326][49750] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-04-26 18:59:28,345][49728] Signal inference workers to resume experience collection... (26750 times) [2024-04-26 18:59:28,353][49750] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-04-26 18:59:29,450][49750] Updated weights for policy 0, policy_version 247061 (0.0028) [2024-04-26 18:59:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4047978496. Throughput: 0: 50823.4. Samples: 1800870560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:32,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 18:59:32,465][49750] Updated weights for policy 0, policy_version 247071 (0.0029) [2024-04-26 18:59:35,937][49750] Updated weights for policy 0, policy_version 247081 (0.0034) [2024-04-26 18:59:37,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.3, 300 sec: 50929.3). Total num frames: 4048224256. Throughput: 0: 50770.8. Samples: 1801023860. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:37,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 18:59:38,852][49750] Updated weights for policy 0, policy_version 247091 (0.0031) [2024-04-26 18:59:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4048486400. Throughput: 0: 50751.3. Samples: 1801331500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:42,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 18:59:42,255][49750] Updated weights for policy 0, policy_version 247101 (0.0029) [2024-04-26 18:59:45,417][49750] Updated weights for policy 0, policy_version 247111 (0.0027) [2024-04-26 18:59:47,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4048732160. Throughput: 0: 50781.0. Samples: 1801636840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:47,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 18:59:48,577][49750] Updated weights for policy 0, policy_version 247121 (0.0031) [2024-04-26 18:59:51,650][49750] Updated weights for policy 0, policy_version 247131 (0.0027) [2024-04-26 18:59:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4048994304. Throughput: 0: 50868.4. Samples: 1801792240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 18:59:52,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 18:59:55,185][49750] Updated weights for policy 0, policy_version 247141 (0.0034) [2024-04-26 18:59:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4049240064. Throughput: 0: 50900.2. Samples: 1802094140. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 18:59:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 18:59:58,219][49750] Updated weights for policy 0, policy_version 247151 (0.0028) [2024-04-26 19:00:01,720][49750] Updated weights for policy 0, policy_version 247161 (0.0029) [2024-04-26 19:00:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4049485824. Throughput: 0: 50820.1. Samples: 1802393300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:02,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 19:00:04,827][49750] Updated weights for policy 0, policy_version 247171 (0.0032) [2024-04-26 19:00:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4049764352. Throughput: 0: 50767.2. Samples: 1802544720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 19:00:08,118][49750] Updated weights for policy 0, policy_version 247181 (0.0040) [2024-04-26 19:00:11,196][49750] Updated weights for policy 0, policy_version 247191 (0.0030) [2024-04-26 19:00:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4049993728. Throughput: 0: 50838.6. Samples: 1802854600. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:12,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 19:00:14,774][49750] Updated weights for policy 0, policy_version 247201 (0.0031) [2024-04-26 19:00:17,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4050239488. Throughput: 0: 50831.3. Samples: 1803157960. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:17,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 19:00:17,631][49750] Updated weights for policy 0, policy_version 247211 (0.0035) [2024-04-26 19:00:21,083][49750] Updated weights for policy 0, policy_version 247221 (0.0028) [2024-04-26 19:00:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4050518016. Throughput: 0: 50681.8. Samples: 1803304540. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:22,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 19:00:24,079][49750] Updated weights for policy 0, policy_version 247231 (0.0032) [2024-04-26 19:00:27,062][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4050763776. Throughput: 0: 50687.0. Samples: 1803612420. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:27,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 19:00:27,604][49750] Updated weights for policy 0, policy_version 247241 (0.0031) [2024-04-26 19:00:28,315][49728] Signal inference workers to stop experience collection... (26800 times) [2024-04-26 19:00:28,357][49750] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-04-26 19:00:28,419][49728] Signal inference workers to resume experience collection... (26800 times) [2024-04-26 19:00:28,420][49750] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-04-26 19:00:30,370][49750] Updated weights for policy 0, policy_version 247251 (0.0027) [2024-04-26 19:00:32,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4051009536. Throughput: 0: 50710.3. Samples: 1803918800. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:32,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 19:00:34,141][49750] Updated weights for policy 0, policy_version 247261 (0.0030) [2024-04-26 19:00:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4051271680. Throughput: 0: 50619.0. Samples: 1804070100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:37,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 19:00:37,172][49750] Updated weights for policy 0, policy_version 247271 (0.0033) [2024-04-26 19:00:40,694][49750] Updated weights for policy 0, policy_version 247281 (0.0034) [2024-04-26 19:00:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4051517440. Throughput: 0: 50728.5. Samples: 1804376920. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:00:42,163][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000247286_4051533824.pth... [2024-04-26 19:00:42,213][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000246541_4039327744.pth [2024-04-26 19:00:43,599][49750] Updated weights for policy 0, policy_version 247291 (0.0030) [2024-04-26 19:00:47,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4051746816. Throughput: 0: 50731.0. Samples: 1804676200. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:47,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 19:00:47,248][49750] Updated weights for policy 0, policy_version 247301 (0.0030) [2024-04-26 19:00:49,973][49750] Updated weights for policy 0, policy_version 247311 (0.0029) [2024-04-26 19:00:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4052041728. Throughput: 0: 50635.5. Samples: 1804823320. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 19:00:53,700][49750] Updated weights for policy 0, policy_version 247321 (0.0029) [2024-04-26 19:00:56,306][49750] Updated weights for policy 0, policy_version 247331 (0.0034) [2024-04-26 19:00:57,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4052271104. Throughput: 0: 50481.7. Samples: 1805126280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:00:57,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 19:01:00,026][49750] Updated weights for policy 0, policy_version 247341 (0.0029) [2024-04-26 19:01:02,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4052533248. Throughput: 0: 50474.5. Samples: 1805429320. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-04-26 19:01:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 19:01:02,685][49750] Updated weights for policy 0, policy_version 247351 (0.0031) [2024-04-26 19:01:06,452][49750] Updated weights for policy 0, policy_version 247361 (0.0035) [2024-04-26 19:01:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4052779008. Throughput: 0: 50773.7. Samples: 1805589360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:07,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 19:01:09,649][49750] Updated weights for policy 0, policy_version 247371 (0.0029) [2024-04-26 19:01:12,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4053041152. Throughput: 0: 50709.3. Samples: 1805894340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:12,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 19:01:13,256][49750] Updated weights for policy 0, policy_version 247381 (0.0033) [2024-04-26 19:01:16,019][49750] Updated weights for policy 0, policy_version 247391 (0.0034) [2024-04-26 19:01:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4053286912. Throughput: 0: 50618.3. Samples: 1806196620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:17,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 19:01:19,777][49750] Updated weights for policy 0, policy_version 247401 (0.0040) [2024-04-26 19:01:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4053565440. Throughput: 0: 50664.1. Samples: 1806349980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 19:01:22,320][49750] Updated weights for policy 0, policy_version 247411 (0.0027) [2024-04-26 19:01:26,109][49750] Updated weights for policy 0, policy_version 247421 (0.0035) [2024-04-26 19:01:27,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4053811200. Throughput: 0: 50573.7. Samples: 1806652740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:27,071][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 19:01:28,613][49750] Updated weights for policy 0, policy_version 247431 (0.0028) [2024-04-26 19:01:32,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4054040576. Throughput: 0: 50739.5. Samples: 1806959480. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:01:32,411][49750] Updated weights for policy 0, policy_version 247441 (0.0035) [2024-04-26 19:01:35,142][49750] Updated weights for policy 0, policy_version 247451 (0.0029) [2024-04-26 19:01:37,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4054286336. Throughput: 0: 50790.3. Samples: 1807108880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 19:01:38,956][49750] Updated weights for policy 0, policy_version 247461 (0.0032) [2024-04-26 19:01:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4054548480. Throughput: 0: 50677.9. Samples: 1807406780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:42,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 19:01:42,106][49750] Updated weights for policy 0, policy_version 247471 (0.0031) [2024-04-26 19:01:45,560][49750] Updated weights for policy 0, policy_version 247481 (0.0034) [2024-04-26 19:01:47,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4054827008. Throughput: 0: 50724.1. Samples: 1807711900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:47,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 19:01:48,413][49750] Updated weights for policy 0, policy_version 247491 (0.0029) [2024-04-26 19:01:51,955][49750] Updated weights for policy 0, policy_version 247501 (0.0037) [2024-04-26 19:01:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4055056384. Throughput: 0: 50757.4. Samples: 1807873440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:52,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 19:01:52,375][49728] Signal inference workers to stop experience collection... 
(26850 times) [2024-04-26 19:01:52,375][49728] Signal inference workers to resume experience collection... (26850 times) [2024-04-26 19:01:52,402][49750] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-04-26 19:01:52,402][49750] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-04-26 19:01:54,756][49750] Updated weights for policy 0, policy_version 247511 (0.0031) [2024-04-26 19:01:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4055318528. Throughput: 0: 50746.7. Samples: 1808177940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:01:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 19:01:58,349][49750] Updated weights for policy 0, policy_version 247521 (0.0034) [2024-04-26 19:02:01,146][49750] Updated weights for policy 0, policy_version 247531 (0.0029) [2024-04-26 19:02:02,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4055564288. Throughput: 0: 50751.8. Samples: 1808480460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:02:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 19:02:04,731][49750] Updated weights for policy 0, policy_version 247541 (0.0032) [2024-04-26 19:02:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4055842816. Throughput: 0: 50828.0. Samples: 1808637240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:02:07,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 19:02:07,550][49750] Updated weights for policy 0, policy_version 247551 (0.0029) [2024-04-26 19:02:11,177][49750] Updated weights for policy 0, policy_version 247561 (0.0039) [2024-04-26 19:02:12,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4056088576. Throughput: 0: 50936.8. Samples: 1808944900. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 19:02:12,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 19:02:14,102][49750] Updated weights for policy 0, policy_version 247571 (0.0028) [2024-04-26 19:02:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4056334336. Throughput: 0: 50871.6. Samples: 1809248700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:02:17,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 19:02:17,686][49750] Updated weights for policy 0, policy_version 247581 (0.0036) [2024-04-26 19:02:20,424][49750] Updated weights for policy 0, policy_version 247591 (0.0032) [2024-04-26 19:02:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4056596480. Throughput: 0: 50532.8. Samples: 1809382860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:02:22,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 19:02:24,009][49750] Updated weights for policy 0, policy_version 247601 (0.0029) [2024-04-26 19:02:26,763][49750] Updated weights for policy 0, policy_version 247611 (0.0031) [2024-04-26 19:02:27,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4056858624. Throughput: 0: 50814.1. Samples: 1809693420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:02:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 19:02:30,374][49750] Updated weights for policy 0, policy_version 247621 (0.0033) [2024-04-26 19:02:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4057104384. Throughput: 0: 50836.0. Samples: 1809999520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:02:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 19:02:33,244][49750] Updated weights for policy 0, policy_version 247631 (0.0032) [2024-04-26 19:02:36,858][49750] Updated weights for policy 0, policy_version 247641 (0.0028) [2024-04-26 19:02:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4057350144. Throughput: 0: 50792.4. Samples: 1810159100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:02:37,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 19:02:39,670][49750] Updated weights for policy 0, policy_version 247651 (0.0029) [2024-04-26 19:02:42,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4057595904. Throughput: 0: 50703.5. Samples: 1810459600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:02:42,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 19:02:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000247656_4057595904.pth... [2024-04-26 19:02:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000246914_4045438976.pth [2024-04-26 19:02:43,237][49750] Updated weights for policy 0, policy_version 247661 (0.0029) [2024-04-26 19:02:46,373][49750] Updated weights for policy 0, policy_version 247671 (0.0031) [2024-04-26 19:02:47,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4057858048. Throughput: 0: 50792.6. Samples: 1810766120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:02:47,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:02:49,653][49750] Updated weights for policy 0, policy_version 247681 (0.0030) [2024-04-26 19:02:52,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4058136576. Throughput: 0: 50834.7. Samples: 1810924800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:02:52,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 19:02:52,704][49750] Updated weights for policy 0, policy_version 247691 (0.0030) [2024-04-26 19:02:56,142][49750] Updated weights for policy 0, policy_version 247701 (0.0035) [2024-04-26 19:02:57,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4058382336. Throughput: 0: 50849.1. Samples: 1811233100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:02:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 19:02:59,245][49750] Updated weights for policy 0, policy_version 247711 (0.0028) [2024-04-26 19:03:01,892][49728] Signal inference workers to stop experience collection... (26900 times) [2024-04-26 19:03:01,893][49728] Signal inference workers to resume experience collection... (26900 times) [2024-04-26 19:03:01,904][49750] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-04-26 19:03:01,904][49750] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-04-26 19:03:02,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4058611712. Throughput: 0: 50834.6. Samples: 1811536260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:03:02,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 19:03:02,648][49750] Updated weights for policy 0, policy_version 247721 (0.0034) [2024-04-26 19:03:05,678][49750] Updated weights for policy 0, policy_version 247731 (0.0028) [2024-04-26 19:03:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4058873856. Throughput: 0: 50838.2. Samples: 1811670580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:03:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 19:03:08,992][49750] Updated weights for policy 0, policy_version 247741 (0.0032) [2024-04-26 19:03:11,970][49750] Updated weights for policy 0, policy_version 247751 (0.0028) [2024-04-26 19:03:12,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4059152384. Throughput: 0: 50880.1. Samples: 1811983020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:03:12,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 19:03:15,385][49750] Updated weights for policy 0, policy_version 247761 (0.0033) [2024-04-26 19:03:17,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.6, 300 sec: 50873.8). Total num frames: 4059414528. Throughput: 0: 50895.2. Samples: 1812289800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:03:17,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 19:03:18,414][49750] Updated weights for policy 0, policy_version 247771 (0.0033) [2024-04-26 19:03:21,968][49750] Updated weights for policy 0, policy_version 247781 (0.0031) [2024-04-26 19:03:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4059643904. Throughput: 0: 50800.9. Samples: 1812445140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 19:03:22,072][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 19:03:24,892][49750] Updated weights for policy 0, policy_version 247791 (0.0026) [2024-04-26 19:03:27,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4059889664. Throughput: 0: 50912.5. Samples: 1812750660. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:03:27,071][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 19:03:28,562][49750] Updated weights for policy 0, policy_version 247801 (0.0035) [2024-04-26 19:03:31,436][49750] Updated weights for policy 0, policy_version 247811 (0.0030) [2024-04-26 19:03:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4060151808. Throughput: 0: 50774.6. Samples: 1813050980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:03:32,072][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 19:03:35,043][49750] Updated weights for policy 0, policy_version 247821 (0.0030) [2024-04-26 19:03:37,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4060430336. Throughput: 0: 50790.2. Samples: 1813210360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:03:37,072][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 19:03:37,739][49750] Updated weights for policy 0, policy_version 247831 (0.0037) [2024-04-26 19:03:41,365][49750] Updated weights for policy 0, policy_version 247841 (0.0030) [2024-04-26 19:03:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4060676096. Throughput: 0: 50779.4. Samples: 1813518180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:03:42,071][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 19:03:44,385][49750] Updated weights for policy 0, policy_version 247851 (0.0038) [2024-04-26 19:03:47,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4060889088. Throughput: 0: 50725.9. Samples: 1813818920. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:03:47,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 19:03:47,696][49750] Updated weights for policy 0, policy_version 247861 (0.0026) [2024-04-26 19:03:50,961][49750] Updated weights for policy 0, policy_version 247871 (0.0029) [2024-04-26 19:03:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4061167616. Throughput: 0: 50872.0. Samples: 1813959820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:03:52,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 19:03:54,280][49750] Updated weights for policy 0, policy_version 247881 (0.0031) [2024-04-26 19:03:57,063][49517] Fps is (10 sec: 54066.3, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 4061429760. Throughput: 0: 50566.9. Samples: 1814258540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:03:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:03:57,338][49750] Updated weights for policy 0, policy_version 247891 (0.0035) [2024-04-26 19:04:00,540][49728] Signal inference workers to stop experience collection... (26950 times) [2024-04-26 19:04:00,586][49750] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-04-26 19:04:00,650][49728] Signal inference workers to resume experience collection... (26950 times) [2024-04-26 19:04:00,651][49750] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-04-26 19:04:00,775][49750] Updated weights for policy 0, policy_version 247901 (0.0028) [2024-04-26 19:04:02,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4061708288. Throughput: 0: 50695.0. Samples: 1814571080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:04:02,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 19:04:03,833][49750] Updated weights for policy 0, policy_version 247911 (0.0029) [2024-04-26 19:04:07,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4061921280. Throughput: 0: 50809.3. Samples: 1814731560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:04:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 19:04:07,240][49750] Updated weights for policy 0, policy_version 247921 (0.0031) [2024-04-26 19:04:10,306][49750] Updated weights for policy 0, policy_version 247931 (0.0028) [2024-04-26 19:04:12,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4062167040. Throughput: 0: 50644.0. Samples: 1815029640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:04:12,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 19:04:13,603][49750] Updated weights for policy 0, policy_version 247941 (0.0033) [2024-04-26 19:04:16,734][49750] Updated weights for policy 0, policy_version 247951 (0.0037) [2024-04-26 19:04:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4062445568. Throughput: 0: 50694.3. Samples: 1815332220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:04:17,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 19:04:20,111][49750] Updated weights for policy 0, policy_version 247961 (0.0032) [2024-04-26 19:04:22,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4062707712. Throughput: 0: 50798.6. Samples: 1815496300. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:04:22,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 19:04:23,171][49750] Updated weights for policy 0, policy_version 247971 (0.0028) [2024-04-26 19:04:26,547][49750] Updated weights for policy 0, policy_version 247981 (0.0035) [2024-04-26 19:04:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4062937088. Throughput: 0: 50583.7. Samples: 1815794440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:04:27,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 19:04:29,833][49750] Updated weights for policy 0, policy_version 247991 (0.0032) [2024-04-26 19:04:32,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4063182848. Throughput: 0: 50658.1. Samples: 1816098540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 19:04:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 19:04:32,973][49750] Updated weights for policy 0, policy_version 248001 (0.0034) [2024-04-26 19:04:36,208][49750] Updated weights for policy 0, policy_version 248011 (0.0031) [2024-04-26 19:04:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4063428608. Throughput: 0: 50807.6. Samples: 1816246160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:04:37,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 19:04:39,340][49750] Updated weights for policy 0, policy_version 248021 (0.0038) [2024-04-26 19:04:42,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4063707136. Throughput: 0: 50811.8. Samples: 1816545060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:04:42,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 19:04:42,083][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000248029_4063707136.pth... 
[2024-04-26 19:04:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000247286_4051533824.pth [2024-04-26 19:04:42,659][49750] Updated weights for policy 0, policy_version 248031 (0.0033) [2024-04-26 19:04:45,952][49750] Updated weights for policy 0, policy_version 248041 (0.0030) [2024-04-26 19:04:47,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4063969280. Throughput: 0: 50683.5. Samples: 1816851840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:04:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 19:04:49,123][49750] Updated weights for policy 0, policy_version 248051 (0.0032) [2024-04-26 19:04:52,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4064215040. Throughput: 0: 50586.2. Samples: 1817007940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:04:52,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 19:04:52,465][49750] Updated weights for policy 0, policy_version 248061 (0.0030) [2024-04-26 19:04:55,526][49750] Updated weights for policy 0, policy_version 248071 (0.0040) [2024-04-26 19:04:57,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4064444416. Throughput: 0: 50778.8. Samples: 1817314680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:04:57,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 19:04:58,865][49750] Updated weights for policy 0, policy_version 248081 (0.0031) [2024-04-26 19:05:01,989][49750] Updated weights for policy 0, policy_version 248091 (0.0028) [2024-04-26 19:05:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4064722944. Throughput: 0: 50788.8. Samples: 1817617720. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:05:02,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 19:05:05,337][49750] Updated weights for policy 0, policy_version 248101 (0.0037) [2024-04-26 19:05:07,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4064985088. Throughput: 0: 50610.3. Samples: 1817773760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:05:07,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 19:05:08,313][49750] Updated weights for policy 0, policy_version 248111 (0.0028) [2024-04-26 19:05:11,829][49750] Updated weights for policy 0, policy_version 248121 (0.0035) [2024-04-26 19:05:12,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4065230848. Throughput: 0: 50729.6. Samples: 1818077280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:05:12,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 19:05:12,226][49728] Signal inference workers to stop experience collection... (27000 times) [2024-04-26 19:05:12,226][49728] Signal inference workers to resume experience collection... (27000 times) [2024-04-26 19:05:12,241][49750] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-04-26 19:05:12,241][49750] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-04-26 19:05:14,689][49750] Updated weights for policy 0, policy_version 248131 (0.0028) [2024-04-26 19:05:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4065476608. Throughput: 0: 50915.2. Samples: 1818389720. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:05:17,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 19:05:18,154][49750] Updated weights for policy 0, policy_version 248141 (0.0032) [2024-04-26 19:05:21,174][49750] Updated weights for policy 0, policy_version 248151 (0.0027) [2024-04-26 19:05:22,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4065738752. Throughput: 0: 50765.2. Samples: 1818530600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:05:22,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 19:05:24,660][49750] Updated weights for policy 0, policy_version 248161 (0.0030) [2024-04-26 19:05:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4065968128. Throughput: 0: 50880.0. Samples: 1818834660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:05:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 19:05:27,610][49750] Updated weights for policy 0, policy_version 248171 (0.0035) [2024-04-26 19:05:31,098][49750] Updated weights for policy 0, policy_version 248181 (0.0031) [2024-04-26 19:05:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4066246656. Throughput: 0: 50856.1. Samples: 1819140360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:05:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:05:34,095][49750] Updated weights for policy 0, policy_version 248191 (0.0035) [2024-04-26 19:05:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4066492416. Throughput: 0: 50772.1. Samples: 1819292680. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:05:37,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 19:05:37,458][49750] Updated weights for policy 0, policy_version 248201 (0.0033) [2024-04-26 19:05:40,570][49750] Updated weights for policy 0, policy_version 248211 (0.0030) [2024-04-26 19:05:42,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 4066738176. Throughput: 0: 50799.0. Samples: 1819600640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-26 19:05:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:05:43,905][49750] Updated weights for policy 0, policy_version 248221 (0.0034) [2024-04-26 19:05:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4067000320. Throughput: 0: 50778.3. Samples: 1819902740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:05:47,063][49517] Avg episode reward: [(0, '0.684')] [2024-04-26 19:05:47,101][49750] Updated weights for policy 0, policy_version 248231 (0.0030) [2024-04-26 19:05:50,304][49750] Updated weights for policy 0, policy_version 248241 (0.0032) [2024-04-26 19:05:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4067246080. Throughput: 0: 50664.1. Samples: 1820053640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:05:52,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 19:05:53,647][49750] Updated weights for policy 0, policy_version 248251 (0.0031) [2024-04-26 19:05:56,757][49750] Updated weights for policy 0, policy_version 248261 (0.0029) [2024-04-26 19:05:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4067508224. Throughput: 0: 50794.9. Samples: 1820363040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:05:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 19:06:00,044][49750] Updated weights for policy 0, policy_version 248271 (0.0032) [2024-04-26 19:06:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4067753984. Throughput: 0: 50512.0. Samples: 1820662760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:02,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:06:03,144][49750] Updated weights for policy 0, policy_version 248281 (0.0031) [2024-04-26 19:06:06,531][49750] Updated weights for policy 0, policy_version 248291 (0.0038) [2024-04-26 19:06:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4068016128. Throughput: 0: 50720.0. Samples: 1820813000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:07,071][49517] Avg episode reward: [(0, '0.678')] [2024-04-26 19:06:09,684][49750] Updated weights for policy 0, policy_version 248301 (0.0036) [2024-04-26 19:06:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4068245504. Throughput: 0: 50716.8. Samples: 1821116920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:12,071][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:06:12,899][49750] Updated weights for policy 0, policy_version 248311 (0.0033) [2024-04-26 19:06:16,212][49750] Updated weights for policy 0, policy_version 248321 (0.0033) [2024-04-26 19:06:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4068524032. Throughput: 0: 50607.2. Samples: 1821417680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:17,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 19:06:19,389][49750] Updated weights for policy 0, policy_version 248331 (0.0027) [2024-04-26 19:06:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4068769792. Throughput: 0: 50505.6. Samples: 1821565440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:22,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:06:22,791][49728] Signal inference workers to stop experience collection... (27050 times) [2024-04-26 19:06:22,833][49750] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-04-26 19:06:22,847][49728] Signal inference workers to resume experience collection... (27050 times) [2024-04-26 19:06:22,855][49750] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-04-26 19:06:22,858][49750] Updated weights for policy 0, policy_version 248341 (0.0033) [2024-04-26 19:06:25,779][49750] Updated weights for policy 0, policy_version 248351 (0.0031) [2024-04-26 19:06:27,062][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4069031936. Throughput: 0: 50590.7. Samples: 1821877220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:27,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 19:06:29,169][49750] Updated weights for policy 0, policy_version 248361 (0.0039) [2024-04-26 19:06:32,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4069277696. Throughput: 0: 50701.9. Samples: 1822184320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:32,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 19:06:32,216][49750] Updated weights for policy 0, policy_version 248371 (0.0030) [2024-04-26 19:06:35,538][49750] Updated weights for policy 0, policy_version 248381 (0.0035) [2024-04-26 19:06:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4069523456. Throughput: 0: 50623.5. Samples: 1822331700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:37,071][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 19:06:38,673][49750] Updated weights for policy 0, policy_version 248391 (0.0029) [2024-04-26 19:06:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4069785600. Throughput: 0: 50657.3. Samples: 1822642620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:42,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 19:06:42,092][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000248401_4069801984.pth... [2024-04-26 19:06:42,096][49750] Updated weights for policy 0, policy_version 248401 (0.0035) [2024-04-26 19:06:42,141][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000247656_4057595904.pth [2024-04-26 19:06:45,306][49750] Updated weights for policy 0, policy_version 248411 (0.0033) [2024-04-26 19:06:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4070031360. Throughput: 0: 50586.2. Samples: 1822939140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:47,072][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 19:06:48,518][49750] Updated weights for policy 0, policy_version 248421 (0.0031) [2024-04-26 19:06:51,780][49750] Updated weights for policy 0, policy_version 248431 (0.0036) [2024-04-26 19:06:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50762.6). 
Total num frames: 4070293504. Throughput: 0: 50788.9. Samples: 1823098500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 19:06:52,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 19:06:54,890][49750] Updated weights for policy 0, policy_version 248441 (0.0030) [2024-04-26 19:06:57,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4070539264. Throughput: 0: 50669.9. Samples: 1823397060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:06:57,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 19:06:58,222][49750] Updated weights for policy 0, policy_version 248451 (0.0030) [2024-04-26 19:07:01,413][49750] Updated weights for policy 0, policy_version 248461 (0.0035) [2024-04-26 19:07:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4070801408. Throughput: 0: 50673.2. Samples: 1823697980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:02,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 19:07:04,651][49750] Updated weights for policy 0, policy_version 248471 (0.0033) [2024-04-26 19:07:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4071063552. Throughput: 0: 50854.4. Samples: 1823853880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 19:07:07,920][49750] Updated weights for policy 0, policy_version 248481 (0.0033) [2024-04-26 19:07:11,194][49750] Updated weights for policy 0, policy_version 248491 (0.0032) [2024-04-26 19:07:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4071309312. Throughput: 0: 50664.3. Samples: 1824157120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:07:14,221][49750] Updated weights for policy 0, policy_version 248501 (0.0032) [2024-04-26 19:07:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4071571456. Throughput: 0: 50631.1. Samples: 1824462720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 19:07:17,518][49750] Updated weights for policy 0, policy_version 248511 (0.0030) [2024-04-26 19:07:20,856][49750] Updated weights for policy 0, policy_version 248521 (0.0037) [2024-04-26 19:07:22,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4071817216. Throughput: 0: 50724.4. Samples: 1824614300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 19:07:23,785][49750] Updated weights for policy 0, policy_version 248531 (0.0034) [2024-04-26 19:07:27,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4072062976. Throughput: 0: 50690.9. Samples: 1824923720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 19:07:27,278][49750] Updated weights for policy 0, policy_version 248541 (0.0031) [2024-04-26 19:07:28,224][49728] Signal inference workers to stop experience collection... (27100 times) [2024-04-26 19:07:28,224][49728] Signal inference workers to resume experience collection... 
(27100 times) [2024-04-26 19:07:28,236][49750] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-04-26 19:07:28,236][49750] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-04-26 19:07:30,183][49750] Updated weights for policy 0, policy_version 248551 (0.0029) [2024-04-26 19:07:32,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4072325120. Throughput: 0: 50872.6. Samples: 1825228400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:32,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 19:07:33,738][49750] Updated weights for policy 0, policy_version 248561 (0.0038) [2024-04-26 19:07:36,541][49750] Updated weights for policy 0, policy_version 248571 (0.0035) [2024-04-26 19:07:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4072587264. Throughput: 0: 50932.4. Samples: 1825390460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:37,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 19:07:40,215][49750] Updated weights for policy 0, policy_version 248581 (0.0032) [2024-04-26 19:07:42,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4072833024. Throughput: 0: 50924.6. Samples: 1825688680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 19:07:42,882][49750] Updated weights for policy 0, policy_version 248591 (0.0034) [2024-04-26 19:07:46,526][49750] Updated weights for policy 0, policy_version 248601 (0.0032) [2024-04-26 19:07:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 4073078784. Throughput: 0: 51000.9. Samples: 1825993020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:47,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 19:07:49,525][49750] Updated weights for policy 0, policy_version 248611 (0.0040) [2024-04-26 19:07:52,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4073357312. Throughput: 0: 50782.7. Samples: 1826139100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:52,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 19:07:52,950][49750] Updated weights for policy 0, policy_version 248621 (0.0032) [2024-04-26 19:07:56,178][49750] Updated weights for policy 0, policy_version 248631 (0.0031) [2024-04-26 19:07:57,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4073603072. Throughput: 0: 50889.1. Samples: 1826447120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:07:57,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 19:07:59,484][49750] Updated weights for policy 0, policy_version 248641 (0.0033) [2024-04-26 19:08:02,062][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4073865216. Throughput: 0: 50983.4. Samples: 1826756980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 19:08:02,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:08:02,520][49750] Updated weights for policy 0, policy_version 248651 (0.0030) [2024-04-26 19:08:05,969][49750] Updated weights for policy 0, policy_version 248661 (0.0038) [2024-04-26 19:08:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4074094592. Throughput: 0: 50793.1. Samples: 1826899980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 19:08:08,883][49750] Updated weights for policy 0, policy_version 248671 (0.0026) [2024-04-26 19:08:12,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4074373120. Throughput: 0: 50796.9. Samples: 1827209580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:12,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 19:08:12,285][49750] Updated weights for policy 0, policy_version 248681 (0.0035) [2024-04-26 19:08:15,417][49750] Updated weights for policy 0, policy_version 248691 (0.0033) [2024-04-26 19:08:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4074618880. Throughput: 0: 50860.0. Samples: 1827517100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:17,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 19:08:18,865][49750] Updated weights for policy 0, policy_version 248701 (0.0033) [2024-04-26 19:08:21,752][49750] Updated weights for policy 0, policy_version 248711 (0.0033) [2024-04-26 19:08:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4074881024. Throughput: 0: 50598.6. Samples: 1827667400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:22,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 19:08:24,307][49728] Signal inference workers to stop experience collection... (27150 times) [2024-04-26 19:08:24,360][49750] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-04-26 19:08:24,372][49728] Signal inference workers to resume experience collection... 
(27150 times) [2024-04-26 19:08:24,378][49750] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-04-26 19:08:25,464][49750] Updated weights for policy 0, policy_version 248721 (0.0032) [2024-04-26 19:08:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4075110400. Throughput: 0: 50629.1. Samples: 1827966980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:27,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 19:08:28,138][49750] Updated weights for policy 0, policy_version 248731 (0.0032) [2024-04-26 19:08:31,782][49750] Updated weights for policy 0, policy_version 248741 (0.0036) [2024-04-26 19:08:32,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4075372544. Throughput: 0: 50660.6. Samples: 1828272740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 19:08:34,510][49750] Updated weights for policy 0, policy_version 248751 (0.0027) [2024-04-26 19:08:37,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4075634688. Throughput: 0: 50651.1. Samples: 1828418400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:37,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 19:08:38,545][49750] Updated weights for policy 0, policy_version 248761 (0.0032) [2024-04-26 19:08:41,055][49750] Updated weights for policy 0, policy_version 248771 (0.0027) [2024-04-26 19:08:42,062][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4075880448. Throughput: 0: 50686.6. Samples: 1828728020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:42,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 19:08:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000248772_4075880448.pth... 
[2024-04-26 19:08:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000248029_4063707136.pth [2024-04-26 19:08:45,095][49750] Updated weights for policy 0, policy_version 248781 (0.0032) [2024-04-26 19:08:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4076158976. Throughput: 0: 50628.4. Samples: 1829035260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:47,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 19:08:47,561][49750] Updated weights for policy 0, policy_version 248791 (0.0029) [2024-04-26 19:08:51,683][49750] Updated weights for policy 0, policy_version 248801 (0.0032) [2024-04-26 19:08:52,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4076371968. Throughput: 0: 50835.1. Samples: 1829187560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:52,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 19:08:54,067][49750] Updated weights for policy 0, policy_version 248811 (0.0030) [2024-04-26 19:08:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4076650496. Throughput: 0: 50796.0. Samples: 1829495400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:08:57,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 19:08:58,018][49750] Updated weights for policy 0, policy_version 248821 (0.0029) [2024-04-26 19:09:00,448][49750] Updated weights for policy 0, policy_version 248831 (0.0031) [2024-04-26 19:09:02,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4076896256. Throughput: 0: 50798.6. Samples: 1829803040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:09:02,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 19:09:04,410][49750] Updated weights for policy 0, policy_version 248841 (0.0035) [2024-04-26 19:09:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4077158400. Throughput: 0: 50688.6. Samples: 1829948380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:09:07,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 19:09:07,267][49750] Updated weights for policy 0, policy_version 248851 (0.0029) [2024-04-26 19:09:10,730][49750] Updated weights for policy 0, policy_version 248861 (0.0027) [2024-04-26 19:09:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 4077387776. Throughput: 0: 50787.1. Samples: 1830252400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 19:09:12,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:09:13,831][49750] Updated weights for policy 0, policy_version 248871 (0.0035) [2024-04-26 19:09:17,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 4077633536. Throughput: 0: 50693.7. Samples: 1830553960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:09:17,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 19:09:17,249][49750] Updated weights for policy 0, policy_version 248881 (0.0030) [2024-04-26 19:09:20,157][49750] Updated weights for policy 0, policy_version 248891 (0.0037) [2024-04-26 19:09:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4077912064. Throughput: 0: 50660.5. Samples: 1830698120. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:09:22,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 19:09:23,725][49750] Updated weights for policy 0, policy_version 248901 (0.0031) [2024-04-26 19:09:26,493][49750] Updated weights for policy 0, policy_version 248911 (0.0029) [2024-04-26 19:09:27,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4078157824. Throughput: 0: 50659.6. Samples: 1831007700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:09:27,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 19:09:30,106][49750] Updated weights for policy 0, policy_version 248921 (0.0036) [2024-04-26 19:09:31,326][49728] Signal inference workers to stop experience collection... (27200 times) [2024-04-26 19:09:31,326][49728] Signal inference workers to resume experience collection... (27200 times) [2024-04-26 19:09:31,340][49750] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-04-26 19:09:31,341][49750] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-04-26 19:09:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4078436352. Throughput: 0: 50709.0. Samples: 1831317160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:09:32,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:09:33,013][49750] Updated weights for policy 0, policy_version 248931 (0.0035) [2024-04-26 19:09:36,549][49750] Updated weights for policy 0, policy_version 248941 (0.0031) [2024-04-26 19:09:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4078665728. Throughput: 0: 50710.1. Samples: 1831469520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:09:37,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 19:09:39,439][49750] Updated weights for policy 0, policy_version 248951 (0.0029) [2024-04-26 19:09:42,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 4078911488. Throughput: 0: 50642.1. Samples: 1831774300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:09:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 19:09:43,161][49750] Updated weights for policy 0, policy_version 248961 (0.0031) [2024-04-26 19:09:46,385][49750] Updated weights for policy 0, policy_version 248971 (0.0042) [2024-04-26 19:09:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4079157248. Throughput: 0: 50424.1. Samples: 1832072120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:09:47,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 19:09:49,575][49750] Updated weights for policy 0, policy_version 248981 (0.0030) [2024-04-26 19:09:52,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4079435776. Throughput: 0: 50668.8. Samples: 1832228480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:09:52,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 19:09:52,772][49750] Updated weights for policy 0, policy_version 248991 (0.0034) [2024-04-26 19:09:55,883][49750] Updated weights for policy 0, policy_version 249001 (0.0033) [2024-04-26 19:09:57,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4079681536. Throughput: 0: 50702.6. Samples: 1832534020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:09:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 19:09:59,032][49750] Updated weights for policy 0, policy_version 249011 (0.0035) [2024-04-26 19:10:02,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4079943680. Throughput: 0: 50829.2. Samples: 1832841280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:10:02,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 19:10:02,335][49750] Updated weights for policy 0, policy_version 249021 (0.0028) [2024-04-26 19:10:05,713][49750] Updated weights for policy 0, policy_version 249031 (0.0032) [2024-04-26 19:10:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4080189440. Throughput: 0: 50942.2. Samples: 1832990520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:10:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 19:10:08,813][49750] Updated weights for policy 0, policy_version 249041 (0.0032) [2024-04-26 19:10:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4080435200. Throughput: 0: 50778.6. Samples: 1833292740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:10:12,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 19:10:12,254][49750] Updated weights for policy 0, policy_version 249051 (0.0032) [2024-04-26 19:10:15,083][49750] Updated weights for policy 0, policy_version 249061 (0.0036) [2024-04-26 19:10:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4080697344. Throughput: 0: 50726.1. Samples: 1833599840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:10:17,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 19:10:18,594][49750] Updated weights for policy 0, policy_version 249071 (0.0035) [2024-04-26 19:10:21,545][49750] Updated weights for policy 0, policy_version 249081 (0.0034) [2024-04-26 19:10:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4080959488. Throughput: 0: 50889.7. Samples: 1833759560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 19:10:22,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 19:10:25,053][49750] Updated weights for policy 0, policy_version 249091 (0.0028) [2024-04-26 19:10:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4081205248. Throughput: 0: 50848.6. Samples: 1834062480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:10:27,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 19:10:27,979][49750] Updated weights for policy 0, policy_version 249101 (0.0034) [2024-04-26 19:10:31,613][49750] Updated weights for policy 0, policy_version 249111 (0.0035) [2024-04-26 19:10:32,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4081451008. Throughput: 0: 51074.9. Samples: 1834370500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:10:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 19:10:34,545][49750] Updated weights for policy 0, policy_version 249121 (0.0030) [2024-04-26 19:10:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4081729536. Throughput: 0: 50831.5. Samples: 1834515900. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:10:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 19:10:37,910][49750] Updated weights for policy 0, policy_version 249131 (0.0036) [2024-04-26 19:10:41,130][49750] Updated weights for policy 0, policy_version 249141 (0.0031) [2024-04-26 19:10:42,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4081975296. Throughput: 0: 50801.7. Samples: 1834820100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:10:42,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 19:10:42,112][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000249145_4081991680.pth... [2024-04-26 19:10:42,158][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000248401_4069801984.pth [2024-04-26 19:10:44,268][49750] Updated weights for policy 0, policy_version 249151 (0.0030) [2024-04-26 19:10:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4082221056. Throughput: 0: 50820.1. Samples: 1835128180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:10:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 19:10:47,483][49750] Updated weights for policy 0, policy_version 249161 (0.0034) [2024-04-26 19:10:50,813][49750] Updated weights for policy 0, policy_version 249171 (0.0034) [2024-04-26 19:10:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4082483200. Throughput: 0: 50914.2. Samples: 1835281660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:10:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 19:10:53,095][49728] Signal inference workers to stop experience collection... (27250 times) [2024-04-26 19:10:53,095][49728] Signal inference workers to resume experience collection... 
(27250 times) [2024-04-26 19:10:53,128][49750] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-04-26 19:10:53,128][49750] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-04-26 19:10:53,850][49750] Updated weights for policy 0, policy_version 249181 (0.0032) [2024-04-26 19:10:57,063][49517] Fps is (10 sec: 49150.3, 60 sec: 50517.1, 300 sec: 50707.0). Total num frames: 4082712576. Throughput: 0: 50774.9. Samples: 1835577620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:10:57,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 19:10:57,434][49750] Updated weights for policy 0, policy_version 249191 (0.0035) [2024-04-26 19:11:00,302][49750] Updated weights for policy 0, policy_version 249201 (0.0029) [2024-04-26 19:11:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4082991104. Throughput: 0: 50800.0. Samples: 1835885840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:11:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 19:11:03,722][49750] Updated weights for policy 0, policy_version 249211 (0.0034) [2024-04-26 19:11:06,699][49750] Updated weights for policy 0, policy_version 249221 (0.0034) [2024-04-26 19:11:07,062][49517] Fps is (10 sec: 54068.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4083253248. Throughput: 0: 50587.7. Samples: 1836036000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:11:07,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 19:11:10,077][49750] Updated weights for policy 0, policy_version 249231 (0.0032) [2024-04-26 19:11:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4083499008. Throughput: 0: 50867.1. Samples: 1836351500. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:11:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 19:11:12,998][49750] Updated weights for policy 0, policy_version 249241 (0.0036) [2024-04-26 19:11:16,796][49750] Updated weights for policy 0, policy_version 249251 (0.0033) [2024-04-26 19:11:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4083744768. Throughput: 0: 50782.3. Samples: 1836655700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:11:17,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:11:19,478][49750] Updated weights for policy 0, policy_version 249261 (0.0034) [2024-04-26 19:11:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4083990528. Throughput: 0: 50713.8. Samples: 1836798020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:11:22,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:11:23,151][49750] Updated weights for policy 0, policy_version 249271 (0.0029) [2024-04-26 19:11:26,106][49750] Updated weights for policy 0, policy_version 249281 (0.0030) [2024-04-26 19:11:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4084269056. Throughput: 0: 50918.4. Samples: 1837111420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:11:27,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 19:11:29,561][49750] Updated weights for policy 0, policy_version 249291 (0.0032) [2024-04-26 19:11:32,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4084498432. Throughput: 0: 50821.6. Samples: 1837415160. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:11:32,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 19:11:32,538][49750] Updated weights for policy 0, policy_version 249301 (0.0035) [2024-04-26 19:11:35,967][49750] Updated weights for policy 0, policy_version 249311 (0.0038) [2024-04-26 19:11:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4084776960. Throughput: 0: 50817.8. Samples: 1837568460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:11:37,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 19:11:38,937][49750] Updated weights for policy 0, policy_version 249321 (0.0030) [2024-04-26 19:11:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4085006336. Throughput: 0: 50891.8. Samples: 1837867740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:11:42,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 19:11:42,363][49750] Updated weights for policy 0, policy_version 249331 (0.0030) [2024-04-26 19:11:45,542][49750] Updated weights for policy 0, policy_version 249341 (0.0033) [2024-04-26 19:11:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4085268480. Throughput: 0: 50656.4. Samples: 1838165380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:11:47,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 19:11:48,918][49750] Updated weights for policy 0, policy_version 249351 (0.0034) [2024-04-26 19:11:51,990][49750] Updated weights for policy 0, policy_version 249361 (0.0040) [2024-04-26 19:11:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4085530624. Throughput: 0: 50815.5. Samples: 1838322700. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:11:52,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 19:11:55,253][49750] Updated weights for policy 0, policy_version 249371 (0.0034) [2024-04-26 19:11:56,539][49728] Signal inference workers to stop experience collection... (27300 times) [2024-04-26 19:11:56,540][49728] Signal inference workers to resume experience collection... (27300 times) [2024-04-26 19:11:56,571][49750] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-04-26 19:11:56,571][49750] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-04-26 19:11:57,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.8, 300 sec: 50818.2). Total num frames: 4085792768. Throughput: 0: 50640.9. Samples: 1838630340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:11:57,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 19:11:58,443][49750] Updated weights for policy 0, policy_version 249381 (0.0033) [2024-04-26 19:12:02,026][49750] Updated weights for policy 0, policy_version 249391 (0.0032) [2024-04-26 19:12:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4086022144. Throughput: 0: 50695.6. Samples: 1838937000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:12:02,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 19:12:04,871][49750] Updated weights for policy 0, policy_version 249401 (0.0029) [2024-04-26 19:12:07,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4086284288. Throughput: 0: 50827.5. Samples: 1839085260. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:12:07,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 19:12:08,353][49750] Updated weights for policy 0, policy_version 249411 (0.0033) [2024-04-26 19:12:11,359][49750] Updated weights for policy 0, policy_version 249421 (0.0031) [2024-04-26 19:12:12,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4086562816. Throughput: 0: 50538.7. Samples: 1839385660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:12:12,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 19:12:14,764][49750] Updated weights for policy 0, policy_version 249431 (0.0035) [2024-04-26 19:12:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4086792192. Throughput: 0: 50506.8. Samples: 1839687960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:12:17,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 19:12:17,816][49750] Updated weights for policy 0, policy_version 249441 (0.0032) [2024-04-26 19:12:21,155][49750] Updated weights for policy 0, policy_version 249451 (0.0034) [2024-04-26 19:12:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4087054336. Throughput: 0: 50708.8. Samples: 1839850360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:12:22,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 19:12:24,226][49750] Updated weights for policy 0, policy_version 249461 (0.0032) [2024-04-26 19:12:27,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4087300096. Throughput: 0: 50785.3. Samples: 1840153080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:12:27,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 19:12:27,676][49750] Updated weights for policy 0, policy_version 249471 (0.0027) [2024-04-26 19:12:30,689][49750] Updated weights for policy 0, policy_version 249481 (0.0028) [2024-04-26 19:12:32,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4087545856. Throughput: 0: 50773.4. Samples: 1840450180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:12:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 19:12:34,040][49750] Updated weights for policy 0, policy_version 249491 (0.0030) [2024-04-26 19:12:37,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4087808000. Throughput: 0: 50837.3. Samples: 1840610380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:12:37,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 19:12:37,116][49750] Updated weights for policy 0, policy_version 249501 (0.0030) [2024-04-26 19:12:40,537][49750] Updated weights for policy 0, policy_version 249511 (0.0025) [2024-04-26 19:12:42,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4088070144. Throughput: 0: 50772.5. Samples: 1840915100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-26 19:12:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 19:12:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000249517_4088086528.pth... [2024-04-26 19:12:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000248772_4075880448.pth [2024-04-26 19:12:43,523][49750] Updated weights for policy 0, policy_version 249521 (0.0031) [2024-04-26 19:12:46,849][49750] Updated weights for policy 0, policy_version 249531 (0.0029) [2024-04-26 19:12:47,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50707.1). 
Total num frames: 4088315904. Throughput: 0: 50730.0. Samples: 1841219860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:12:47,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 19:12:49,944][49750] Updated weights for policy 0, policy_version 249541 (0.0028) [2024-04-26 19:12:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4088561664. Throughput: 0: 50879.2. Samples: 1841374820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:12:52,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 19:12:53,305][49750] Updated weights for policy 0, policy_version 249551 (0.0030) [2024-04-26 19:12:56,400][49750] Updated weights for policy 0, policy_version 249561 (0.0036) [2024-04-26 19:12:57,062][49517] Fps is (10 sec: 52430.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4088840192. Throughput: 0: 50943.1. Samples: 1841678100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:12:57,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 19:13:00,058][49750] Updated weights for policy 0, policy_version 249571 (0.0030) [2024-04-26 19:13:00,854][49728] Signal inference workers to stop experience collection... (27350 times) [2024-04-26 19:13:00,856][49728] Signal inference workers to resume experience collection... (27350 times) [2024-04-26 19:13:00,873][49750] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-04-26 19:13:00,874][49750] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-04-26 19:13:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4089069568. Throughput: 0: 50725.8. Samples: 1841970620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:02,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 19:13:02,955][49750] Updated weights for policy 0, policy_version 249581 (0.0027) [2024-04-26 19:13:06,411][49750] Updated weights for policy 0, policy_version 249591 (0.0031) [2024-04-26 19:13:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4089331712. Throughput: 0: 50722.3. Samples: 1842132860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:07,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 19:13:09,400][49750] Updated weights for policy 0, policy_version 249601 (0.0033) [2024-04-26 19:13:12,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 50651.5). Total num frames: 4089561088. Throughput: 0: 50691.6. Samples: 1842434200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:13:12,712][49750] Updated weights for policy 0, policy_version 249611 (0.0029) [2024-04-26 19:13:15,738][49750] Updated weights for policy 0, policy_version 249621 (0.0028) [2024-04-26 19:13:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4089839616. Throughput: 0: 50918.4. Samples: 1842741500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:17,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 19:13:19,157][49750] Updated weights for policy 0, policy_version 249631 (0.0031) [2024-04-26 19:13:22,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4090085376. Throughput: 0: 50652.9. Samples: 1842889760. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:22,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 19:13:22,301][49750] Updated weights for policy 0, policy_version 249641 (0.0028) [2024-04-26 19:13:25,718][49750] Updated weights for policy 0, policy_version 249651 (0.0030) [2024-04-26 19:13:27,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4090363904. Throughput: 0: 50650.6. Samples: 1843194380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:27,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 19:13:28,798][49750] Updated weights for policy 0, policy_version 249661 (0.0028) [2024-04-26 19:13:32,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4090593280. Throughput: 0: 50723.6. Samples: 1843502420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:32,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 19:13:32,210][49750] Updated weights for policy 0, policy_version 249671 (0.0032) [2024-04-26 19:13:35,246][49750] Updated weights for policy 0, policy_version 249681 (0.0028) [2024-04-26 19:13:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4090855424. Throughput: 0: 50711.1. Samples: 1843656820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:13:38,559][49750] Updated weights for policy 0, policy_version 249691 (0.0031) [2024-04-26 19:13:41,595][49750] Updated weights for policy 0, policy_version 249701 (0.0033) [2024-04-26 19:13:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4091101184. Throughput: 0: 50601.3. Samples: 1843955160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:42,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 19:13:44,901][49750] Updated weights for policy 0, policy_version 249711 (0.0042) [2024-04-26 19:13:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4091363328. Throughput: 0: 50836.8. Samples: 1844258280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:47,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 19:13:48,048][49750] Updated weights for policy 0, policy_version 249721 (0.0029) [2024-04-26 19:13:51,409][49750] Updated weights for policy 0, policy_version 249731 (0.0030) [2024-04-26 19:13:52,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4091641856. Throughput: 0: 50698.7. Samples: 1844414300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:52,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 19:13:54,516][49750] Updated weights for policy 0, policy_version 249741 (0.0035) [2024-04-26 19:13:57,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.1, 300 sec: 50651.6). Total num frames: 4091838464. Throughput: 0: 50738.7. Samples: 1844717440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:13:57,063][49517] Avg episode reward: [(0, '0.449')] [2024-04-26 19:13:57,720][49728] Signal inference workers to stop experience collection... (27400 times) [2024-04-26 19:13:57,729][49728] Signal inference workers to resume experience collection... 
(27400 times) [2024-04-26 19:13:57,752][49750] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-04-26 19:13:57,752][49750] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-04-26 19:13:57,862][49750] Updated weights for policy 0, policy_version 249751 (0.0032) [2024-04-26 19:14:01,063][49750] Updated weights for policy 0, policy_version 249761 (0.0032) [2024-04-26 19:14:02,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4092116992. Throughput: 0: 50754.1. Samples: 1845025440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 19:14:04,262][49750] Updated weights for policy 0, policy_version 249771 (0.0031) [2024-04-26 19:14:07,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4092362752. Throughput: 0: 50685.9. Samples: 1845170620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:07,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 19:14:07,623][49750] Updated weights for policy 0, policy_version 249781 (0.0028) [2024-04-26 19:14:10,698][49750] Updated weights for policy 0, policy_version 249791 (0.0028) [2024-04-26 19:14:12,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51609.7, 300 sec: 50929.2). Total num frames: 4092657664. Throughput: 0: 50849.9. Samples: 1845482620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 19:14:14,133][49750] Updated weights for policy 0, policy_version 249801 (0.0031) [2024-04-26 19:14:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4092887040. Throughput: 0: 50850.8. Samples: 1845790700. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:17,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 19:14:17,115][49750] Updated weights for policy 0, policy_version 249811 (0.0032) [2024-04-26 19:14:20,669][49750] Updated weights for policy 0, policy_version 249821 (0.0029) [2024-04-26 19:14:22,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4093132800. Throughput: 0: 50724.5. Samples: 1845939420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 19:14:23,474][49750] Updated weights for policy 0, policy_version 249831 (0.0027) [2024-04-26 19:14:26,937][49750] Updated weights for policy 0, policy_version 249841 (0.0033) [2024-04-26 19:14:27,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4093394944. Throughput: 0: 50913.7. Samples: 1846246280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:27,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 19:14:29,872][49750] Updated weights for policy 0, policy_version 249851 (0.0038) [2024-04-26 19:14:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4093640704. Throughput: 0: 50820.6. Samples: 1846545200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:14:33,366][49750] Updated weights for policy 0, policy_version 249861 (0.0035) [2024-04-26 19:14:36,329][49750] Updated weights for policy 0, policy_version 249871 (0.0028) [2024-04-26 19:14:37,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4093935616. Throughput: 0: 51045.3. Samples: 1846711340. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:37,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 19:14:39,874][49750] Updated weights for policy 0, policy_version 249881 (0.0034) [2024-04-26 19:14:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4094148608. Throughput: 0: 51025.0. Samples: 1847013560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:42,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 19:14:42,120][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000249888_4094164992.pth... [2024-04-26 19:14:42,173][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000249145_4081991680.pth [2024-04-26 19:14:42,704][49750] Updated weights for policy 0, policy_version 249891 (0.0031) [2024-04-26 19:14:46,276][49750] Updated weights for policy 0, policy_version 249901 (0.0028) [2024-04-26 19:14:47,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4094394368. Throughput: 0: 50897.0. Samples: 1847315800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:14:49,076][49750] Updated weights for policy 0, policy_version 249911 (0.0035) [2024-04-26 19:14:52,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4094656512. Throughput: 0: 50838.7. Samples: 1847458360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:52,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 19:14:52,693][49750] Updated weights for policy 0, policy_version 249921 (0.0031) [2024-04-26 19:14:55,440][49750] Updated weights for policy 0, policy_version 249931 (0.0029) [2024-04-26 19:14:57,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51882.8, 300 sec: 50873.7). Total num frames: 4094951424. Throughput: 0: 50677.4. Samples: 1847763100. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:14:57,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 19:14:59,100][49750] Updated weights for policy 0, policy_version 249941 (0.0033) [2024-04-26 19:15:01,993][49750] Updated weights for policy 0, policy_version 249951 (0.0029) [2024-04-26 19:15:02,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4095197184. Throughput: 0: 50774.7. Samples: 1848075560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:15:02,071][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 19:15:02,457][49728] Signal inference workers to stop experience collection... (27450 times) [2024-04-26 19:15:02,457][49728] Signal inference workers to resume experience collection... (27450 times) [2024-04-26 19:15:02,487][49750] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-04-26 19:15:02,487][49750] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-04-26 19:15:05,618][49750] Updated weights for policy 0, policy_version 249961 (0.0029) [2024-04-26 19:15:07,063][49517] Fps is (10 sec: 45874.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4095410176. Throughput: 0: 50897.7. Samples: 1848229820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 19:15:07,072][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:15:08,554][49750] Updated weights for policy 0, policy_version 249971 (0.0029) [2024-04-26 19:15:12,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4095672320. Throughput: 0: 50660.5. Samples: 1848526000. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:12,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:15:12,227][49750] Updated weights for policy 0, policy_version 249981 (0.0031) [2024-04-26 19:15:14,894][49750] Updated weights for policy 0, policy_version 249991 (0.0027) [2024-04-26 19:15:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4095934464. Throughput: 0: 50768.8. Samples: 1848829800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:17,063][49517] Avg episode reward: [(0, '0.679')] [2024-04-26 19:15:18,557][49750] Updated weights for policy 0, policy_version 250001 (0.0037) [2024-04-26 19:15:21,374][49750] Updated weights for policy 0, policy_version 250011 (0.0036) [2024-04-26 19:15:22,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4096196608. Throughput: 0: 50626.2. Samples: 1848989520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:22,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 19:15:24,926][49750] Updated weights for policy 0, policy_version 250021 (0.0029) [2024-04-26 19:15:27,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4096442368. Throughput: 0: 50672.8. Samples: 1849293840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:27,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 19:15:27,942][49750] Updated weights for policy 0, policy_version 250031 (0.0035) [2024-04-26 19:15:31,539][49750] Updated weights for policy 0, policy_version 250041 (0.0030) [2024-04-26 19:15:32,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4096688128. Throughput: 0: 50606.3. Samples: 1849593080. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 19:15:34,292][49750] Updated weights for policy 0, policy_version 250051 (0.0028) [2024-04-26 19:15:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4096933888. Throughput: 0: 50649.8. Samples: 1849737600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 19:15:38,142][49750] Updated weights for policy 0, policy_version 250061 (0.0035) [2024-04-26 19:15:40,645][49750] Updated weights for policy 0, policy_version 250071 (0.0030) [2024-04-26 19:15:42,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4097228800. Throughput: 0: 50730.5. Samples: 1850045980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:42,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 19:15:44,473][49750] Updated weights for policy 0, policy_version 250081 (0.0032) [2024-04-26 19:15:47,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4097474560. Throughput: 0: 50628.9. Samples: 1850353860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 19:15:47,145][49750] Updated weights for policy 0, policy_version 250091 (0.0037) [2024-04-26 19:15:51,031][49750] Updated weights for policy 0, policy_version 250101 (0.0036) [2024-04-26 19:15:52,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4097687552. Throughput: 0: 50618.2. Samples: 1850507640. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:52,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 19:15:53,702][49750] Updated weights for policy 0, policy_version 250111 (0.0028) [2024-04-26 19:15:57,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4097949696. Throughput: 0: 50807.1. Samples: 1850812320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:15:57,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 19:15:57,526][49750] Updated weights for policy 0, policy_version 250121 (0.0035) [2024-04-26 19:16:00,081][49750] Updated weights for policy 0, policy_version 250131 (0.0030) [2024-04-26 19:16:02,063][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.1, 300 sec: 50651.5). Total num frames: 4098195456. Throughput: 0: 50683.5. Samples: 1851110560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:16:02,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 19:16:03,920][49750] Updated weights for policy 0, policy_version 250141 (0.0030) [2024-04-26 19:16:06,734][49750] Updated weights for policy 0, policy_version 250151 (0.0034) [2024-04-26 19:16:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4098490368. Throughput: 0: 50653.8. Samples: 1851268940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:16:07,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 19:16:07,091][49728] Signal inference workers to stop experience collection... (27500 times) [2024-04-26 19:16:07,132][49750] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-04-26 19:16:07,196][49728] Signal inference workers to resume experience collection... 
(27500 times) [2024-04-26 19:16:07,196][49750] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-04-26 19:16:10,434][49750] Updated weights for policy 0, policy_version 250161 (0.0030) [2024-04-26 19:16:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4098719744. Throughput: 0: 50660.4. Samples: 1851573560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:16:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 19:16:13,147][49750] Updated weights for policy 0, policy_version 250171 (0.0034) [2024-04-26 19:16:17,046][49750] Updated weights for policy 0, policy_version 250181 (0.0031) [2024-04-26 19:16:17,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4098965504. Throughput: 0: 50766.6. Samples: 1851877580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-04-26 19:16:17,071][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 19:16:19,651][49750] Updated weights for policy 0, policy_version 250191 (0.0033) [2024-04-26 19:16:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4099227648. Throughput: 0: 50806.7. Samples: 1852023900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:16:22,071][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 19:16:23,416][49750] Updated weights for policy 0, policy_version 250201 (0.0033) [2024-04-26 19:16:26,171][49750] Updated weights for policy 0, policy_version 250211 (0.0034) [2024-04-26 19:16:27,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4099489792. Throughput: 0: 50581.8. Samples: 1852322160. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:16:27,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-26 19:16:29,780][49750] Updated weights for policy 0, policy_version 250221 (0.0029) [2024-04-26 19:16:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4099735552. Throughput: 0: 50627.1. Samples: 1852632080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:16:32,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 19:16:32,493][49750] Updated weights for policy 0, policy_version 250231 (0.0032) [2024-04-26 19:16:36,309][49750] Updated weights for policy 0, policy_version 250241 (0.0031) [2024-04-26 19:16:37,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4099981312. Throughput: 0: 50829.4. Samples: 1852794960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:16:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 19:16:39,063][49750] Updated weights for policy 0, policy_version 250251 (0.0031) [2024-04-26 19:16:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4100243456. Throughput: 0: 50669.3. Samples: 1853092440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:16:42,072][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 19:16:42,083][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000250259_4100243456.pth... [2024-04-26 19:16:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000249517_4088086528.pth [2024-04-26 19:16:42,846][49750] Updated weights for policy 0, policy_version 250261 (0.0028) [2024-04-26 19:16:45,543][49750] Updated weights for policy 0, policy_version 250271 (0.0032) [2024-04-26 19:16:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4100489216. Throughput: 0: 50830.3. Samples: 1853397920. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:16:47,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 19:16:49,196][49750] Updated weights for policy 0, policy_version 250281 (0.0035) [2024-04-26 19:16:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4100751360. Throughput: 0: 50643.1. Samples: 1853547880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:16:52,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 19:16:52,111][49750] Updated weights for policy 0, policy_version 250291 (0.0030) [2024-04-26 19:16:55,574][49750] Updated weights for policy 0, policy_version 250301 (0.0032) [2024-04-26 19:16:57,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4101013504. Throughput: 0: 50729.9. Samples: 1853856400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:16:57,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 19:16:58,459][49750] Updated weights for policy 0, policy_version 250311 (0.0032) [2024-04-26 19:17:02,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4101242880. Throughput: 0: 50742.5. Samples: 1854161000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:17:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 19:17:02,090][49750] Updated weights for policy 0, policy_version 250321 (0.0030) [2024-04-26 19:17:04,897][49750] Updated weights for policy 0, policy_version 250331 (0.0035) [2024-04-26 19:17:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4101505024. Throughput: 0: 50790.3. Samples: 1854309460. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:17:07,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 19:17:08,465][49750] Updated weights for policy 0, policy_version 250341 (0.0033) [2024-04-26 19:17:11,454][49750] Updated weights for policy 0, policy_version 250351 (0.0032) [2024-04-26 19:17:12,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4101767168. Throughput: 0: 50794.6. Samples: 1854607920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:17:12,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 19:17:14,996][49750] Updated weights for policy 0, policy_version 250361 (0.0028) [2024-04-26 19:17:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4102029312. Throughput: 0: 50724.0. Samples: 1854914660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:17:17,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 19:17:17,935][49750] Updated weights for policy 0, policy_version 250371 (0.0029) [2024-04-26 19:17:21,488][49750] Updated weights for policy 0, policy_version 250381 (0.0031) [2024-04-26 19:17:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4102275072. Throughput: 0: 50651.9. Samples: 1855074300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:17:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 19:17:24,251][49750] Updated weights for policy 0, policy_version 250391 (0.0034) [2024-04-26 19:17:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4102520832. Throughput: 0: 50869.5. Samples: 1855381560. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:17:27,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 19:17:27,853][49750] Updated weights for policy 0, policy_version 250401 (0.0034) [2024-04-26 19:17:30,557][49750] Updated weights for policy 0, policy_version 250411 (0.0030) [2024-04-26 19:17:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4102782976. Throughput: 0: 50873.3. Samples: 1855687220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:17:32,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 19:17:34,159][49750] Updated weights for policy 0, policy_version 250421 (0.0031) [2024-04-26 19:17:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4103045120. Throughput: 0: 50813.0. Samples: 1855834460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:17:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 19:17:37,253][49750] Updated weights for policy 0, policy_version 250431 (0.0033) [2024-04-26 19:17:40,604][49750] Updated weights for policy 0, policy_version 250441 (0.0034) [2024-04-26 19:17:41,497][49728] Signal inference workers to stop experience collection... (27550 times) [2024-04-26 19:17:41,498][49728] Signal inference workers to resume experience collection... (27550 times) [2024-04-26 19:17:41,513][49750] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-04-26 19:17:41,513][49750] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-04-26 19:17:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4103307264. Throughput: 0: 50737.8. Samples: 1856139600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:17:42,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 19:17:43,764][49750] Updated weights for policy 0, policy_version 250451 (0.0029) [2024-04-26 19:17:47,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4103520256. Throughput: 0: 50690.3. Samples: 1856442060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:17:47,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 19:17:47,240][49750] Updated weights for policy 0, policy_version 250461 (0.0029) [2024-04-26 19:17:50,107][49750] Updated weights for policy 0, policy_version 250471 (0.0028) [2024-04-26 19:17:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4103798784. Throughput: 0: 50809.8. Samples: 1856595900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:17:52,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:17:53,616][49750] Updated weights for policy 0, policy_version 250481 (0.0029) [2024-04-26 19:17:56,662][49750] Updated weights for policy 0, policy_version 250491 (0.0032) [2024-04-26 19:17:57,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4104044544. Throughput: 0: 50851.1. Samples: 1856896220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:17:57,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 19:17:59,934][49750] Updated weights for policy 0, policy_version 250501 (0.0029) [2024-04-26 19:18:02,062][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4104306688. Throughput: 0: 50776.8. Samples: 1857199620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:18:02,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:18:03,072][49750] Updated weights for policy 0, policy_version 250511 (0.0030) [2024-04-26 19:18:06,354][49750] Updated weights for policy 0, policy_version 250521 (0.0042) [2024-04-26 19:18:07,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4104552448. Throughput: 0: 50767.5. Samples: 1857358840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:18:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 19:18:09,599][49750] Updated weights for policy 0, policy_version 250531 (0.0032) [2024-04-26 19:18:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4104798208. Throughput: 0: 50715.0. Samples: 1857663740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:18:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:18:12,939][49750] Updated weights for policy 0, policy_version 250541 (0.0030) [2024-04-26 19:18:16,344][49750] Updated weights for policy 0, policy_version 250551 (0.0043) [2024-04-26 19:18:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4105043968. Throughput: 0: 50615.1. Samples: 1857964900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:18:17,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:18:19,540][49750] Updated weights for policy 0, policy_version 250561 (0.0029) [2024-04-26 19:18:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4105306112. Throughput: 0: 50761.8. Samples: 1858118740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:18:22,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 19:18:22,809][49750] Updated weights for policy 0, policy_version 250571 (0.0036) [2024-04-26 19:18:26,118][49750] Updated weights for policy 0, policy_version 250581 (0.0035) [2024-04-26 19:18:27,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4105568256. Throughput: 0: 50577.6. Samples: 1858415600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:18:27,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 19:18:29,211][49750] Updated weights for policy 0, policy_version 250591 (0.0029) [2024-04-26 19:18:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4105814016. Throughput: 0: 50656.1. Samples: 1858721580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:18:32,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 19:18:32,459][49750] Updated weights for policy 0, policy_version 250601 (0.0030) [2024-04-26 19:18:35,504][49728] Signal inference workers to stop experience collection... (27600 times) [2024-04-26 19:18:35,504][49728] Signal inference workers to resume experience collection... (27600 times) [2024-04-26 19:18:35,536][49750] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-04-26 19:18:35,537][49750] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-04-26 19:18:35,635][49750] Updated weights for policy 0, policy_version 250611 (0.0030) [2024-04-26 19:18:37,062][49517] Fps is (10 sec: 50791.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4106076160. Throughput: 0: 50622.7. Samples: 1858873920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:18:37,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 19:18:38,710][49750] Updated weights for policy 0, policy_version 250621 (0.0033) [2024-04-26 19:18:42,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4106321920. Throughput: 0: 50754.8. Samples: 1859180180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:18:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 19:18:42,094][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000250631_4106338304.pth... [2024-04-26 19:18:42,098][49750] Updated weights for policy 0, policy_version 250631 (0.0031) [2024-04-26 19:18:42,142][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000249888_4094164992.pth [2024-04-26 19:18:45,317][49750] Updated weights for policy 0, policy_version 250641 (0.0032) [2024-04-26 19:18:47,062][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.5, 300 sec: 50651.5). Total num frames: 4106584064. Throughput: 0: 50695.1. Samples: 1859480900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:18:47,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 19:18:48,736][49750] Updated weights for policy 0, policy_version 250651 (0.0035) [2024-04-26 19:18:51,806][49750] Updated weights for policy 0, policy_version 250661 (0.0029) [2024-04-26 19:18:52,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4106846208. Throughput: 0: 50654.3. Samples: 1859638280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:18:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:18:55,288][49750] Updated weights for policy 0, policy_version 250671 (0.0029) [2024-04-26 19:18:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4107091968. Throughput: 0: 50709.8. Samples: 1859945680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:18:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:18:58,190][49750] Updated weights for policy 0, policy_version 250681 (0.0032) [2024-04-26 19:19:01,593][49750] Updated weights for policy 0, policy_version 250691 (0.0035) [2024-04-26 19:19:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4107337728. Throughput: 0: 50796.6. Samples: 1860250740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 19:19:04,639][49750] Updated weights for policy 0, policy_version 250701 (0.0032) [2024-04-26 19:19:07,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 4107599872. Throughput: 0: 50638.1. Samples: 1860397460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 19:19:07,982][49750] Updated weights for policy 0, policy_version 250711 (0.0031) [2024-04-26 19:19:11,078][49750] Updated weights for policy 0, policy_version 250721 (0.0040) [2024-04-26 19:19:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4107862016. Throughput: 0: 50783.8. Samples: 1860700860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:12,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 19:19:14,452][49750] Updated weights for policy 0, policy_version 250731 (0.0029) [2024-04-26 19:19:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4108107776. Throughput: 0: 50729.3. Samples: 1861004400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 19:19:17,634][49750] Updated weights for policy 0, policy_version 250741 (0.0031) [2024-04-26 19:19:21,036][49750] Updated weights for policy 0, policy_version 250751 (0.0031) [2024-04-26 19:19:22,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4108353536. Throughput: 0: 50740.7. Samples: 1861157260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:22,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:19:23,902][49750] Updated weights for policy 0, policy_version 250761 (0.0035) [2024-04-26 19:19:27,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4108599296. Throughput: 0: 50573.7. Samples: 1861456000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:27,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 19:19:27,443][49750] Updated weights for policy 0, policy_version 250771 (0.0035) [2024-04-26 19:19:30,208][49750] Updated weights for policy 0, policy_version 250781 (0.0031) [2024-04-26 19:19:32,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 4108877824. Throughput: 0: 50735.1. Samples: 1861763980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:32,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 19:19:33,888][49750] Updated weights for policy 0, policy_version 250791 (0.0029) [2024-04-26 19:19:36,693][49750] Updated weights for policy 0, policy_version 250801 (0.0035) [2024-04-26 19:19:37,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4109139968. Throughput: 0: 50681.3. Samples: 1861918940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:37,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:19:40,322][49750] Updated weights for policy 0, policy_version 250811 (0.0032) [2024-04-26 19:19:42,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4109369344. Throughput: 0: 50673.3. Samples: 1862225980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:42,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 19:19:43,166][49750] Updated weights for policy 0, policy_version 250821 (0.0027) [2024-04-26 19:19:46,683][49750] Updated weights for policy 0, policy_version 250831 (0.0034) [2024-04-26 19:19:47,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4109615104. Throughput: 0: 50657.7. Samples: 1862530340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 26.0) [2024-04-26 19:19:47,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 19:19:49,496][49750] Updated weights for policy 0, policy_version 250841 (0.0028) [2024-04-26 19:19:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 4109877248. Throughput: 0: 50636.9. Samples: 1862676120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:19:52,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 19:19:52,742][49728] Signal inference workers to stop experience collection... (27650 times) [2024-04-26 19:19:52,742][49728] Signal inference workers to resume experience collection... 
(27650 times) [2024-04-26 19:19:52,756][49750] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-04-26 19:19:52,756][49750] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-04-26 19:19:53,230][49750] Updated weights for policy 0, policy_version 250851 (0.0031) [2024-04-26 19:19:55,871][49750] Updated weights for policy 0, policy_version 250861 (0.0036) [2024-04-26 19:19:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4110139392. Throughput: 0: 50837.6. Samples: 1862988560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:19:57,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 19:19:59,656][49750] Updated weights for policy 0, policy_version 250871 (0.0038) [2024-04-26 19:20:02,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4110401536. Throughput: 0: 50991.9. Samples: 1863299040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:02,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:20:02,308][49750] Updated weights for policy 0, policy_version 250881 (0.0035) [2024-04-26 19:20:06,171][49750] Updated weights for policy 0, policy_version 250891 (0.0033) [2024-04-26 19:20:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4110647296. Throughput: 0: 50848.0. Samples: 1863445420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 19:20:08,851][49750] Updated weights for policy 0, policy_version 250901 (0.0029) [2024-04-26 19:20:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4110893056. Throughput: 0: 51032.1. Samples: 1863752440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:12,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 19:20:12,790][49750] Updated weights for policy 0, policy_version 250911 (0.0029) [2024-04-26 19:20:15,241][49750] Updated weights for policy 0, policy_version 250921 (0.0029) [2024-04-26 19:20:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4111155200. Throughput: 0: 50844.9. Samples: 1864052000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:20:19,050][49750] Updated weights for policy 0, policy_version 250931 (0.0031) [2024-04-26 19:20:21,839][49750] Updated weights for policy 0, policy_version 250941 (0.0038) [2024-04-26 19:20:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4111417344. Throughput: 0: 50914.8. Samples: 1864210100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:22,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:20:25,506][49750] Updated weights for policy 0, policy_version 250951 (0.0027) [2024-04-26 19:20:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4111663104. Throughput: 0: 50924.0. Samples: 1864517560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:20:28,295][49750] Updated weights for policy 0, policy_version 250961 (0.0032) [2024-04-26 19:20:32,030][49750] Updated weights for policy 0, policy_version 250971 (0.0031) [2024-04-26 19:20:32,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4111908864. Throughput: 0: 50836.8. Samples: 1864818000. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:32,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 19:20:34,645][49750] Updated weights for policy 0, policy_version 250981 (0.0031) [2024-04-26 19:20:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4112171008. Throughput: 0: 50868.5. Samples: 1864965200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:37,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 19:20:38,427][49750] Updated weights for policy 0, policy_version 250991 (0.0031) [2024-04-26 19:20:41,124][49750] Updated weights for policy 0, policy_version 251001 (0.0034) [2024-04-26 19:20:42,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4112433152. Throughput: 0: 50814.8. Samples: 1865275220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 19:20:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000251003_4112433152.pth... [2024-04-26 19:20:42,128][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000250259_4100243456.pth [2024-04-26 19:20:44,718][49750] Updated weights for policy 0, policy_version 251011 (0.0027) [2024-04-26 19:20:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4112695296. Throughput: 0: 50772.1. Samples: 1865583780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:47,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:20:47,586][49750] Updated weights for policy 0, policy_version 251021 (0.0027) [2024-04-26 19:20:51,357][49750] Updated weights for policy 0, policy_version 251031 (0.0033) [2024-04-26 19:20:52,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 4112941056. Throughput: 0: 50815.4. Samples: 1865732120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 19:20:54,181][49750] Updated weights for policy 0, policy_version 251041 (0.0033) [2024-04-26 19:20:57,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4113186816. Throughput: 0: 50802.6. Samples: 1866038560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:20:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 19:20:57,714][49750] Updated weights for policy 0, policy_version 251051 (0.0032) [2024-04-26 19:21:00,566][49750] Updated weights for policy 0, policy_version 251061 (0.0033) [2024-04-26 19:21:02,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4113448960. Throughput: 0: 50950.7. Samples: 1866344780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:02,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 19:21:04,089][49750] Updated weights for policy 0, policy_version 251071 (0.0031) [2024-04-26 19:21:06,951][49750] Updated weights for policy 0, policy_version 251081 (0.0033) [2024-04-26 19:21:07,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4113711104. Throughput: 0: 50839.3. Samples: 1866497880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 19:21:07,739][49728] Signal inference workers to stop experience collection... (27700 times) [2024-04-26 19:21:07,774][49750] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-04-26 19:21:07,798][49728] Signal inference workers to resume experience collection... 
(27700 times) [2024-04-26 19:21:07,799][49750] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-04-26 19:21:10,519][49750] Updated weights for policy 0, policy_version 251091 (0.0028) [2024-04-26 19:21:12,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4113940480. Throughput: 0: 50825.7. Samples: 1866804720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:12,071][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 19:21:13,490][49750] Updated weights for policy 0, policy_version 251101 (0.0032) [2024-04-26 19:21:16,855][49750] Updated weights for policy 0, policy_version 251111 (0.0030) [2024-04-26 19:21:17,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4114202624. Throughput: 0: 50847.6. Samples: 1867106140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:17,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 19:21:20,108][49750] Updated weights for policy 0, policy_version 251121 (0.0034) [2024-04-26 19:21:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4114448384. Throughput: 0: 50964.5. Samples: 1867258600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:22,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 19:21:23,302][49750] Updated weights for policy 0, policy_version 251131 (0.0033) [2024-04-26 19:21:26,715][49750] Updated weights for policy 0, policy_version 251141 (0.0034) [2024-04-26 19:21:27,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4114710528. Throughput: 0: 50800.1. Samples: 1867561220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:27,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 19:21:29,829][49750] Updated weights for policy 0, policy_version 251151 (0.0029) [2024-04-26 19:21:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4114956288. Throughput: 0: 50698.6. Samples: 1867865220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:32,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 19:21:33,013][49750] Updated weights for policy 0, policy_version 251161 (0.0030) [2024-04-26 19:21:36,282][49750] Updated weights for policy 0, policy_version 251171 (0.0029) [2024-04-26 19:21:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4115234816. Throughput: 0: 50766.5. Samples: 1868016600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:37,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 19:21:39,474][49750] Updated weights for policy 0, policy_version 251181 (0.0030) [2024-04-26 19:21:42,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4115480576. Throughput: 0: 50849.8. Samples: 1868326800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:42,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 19:21:42,621][49750] Updated weights for policy 0, policy_version 251191 (0.0028) [2024-04-26 19:21:45,897][49750] Updated weights for policy 0, policy_version 251201 (0.0039) [2024-04-26 19:21:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4115726336. Throughput: 0: 50711.2. Samples: 1868626780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 19:21:49,137][49750] Updated weights for policy 0, policy_version 251211 (0.0033) [2024-04-26 19:21:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4115972096. Throughput: 0: 50713.1. Samples: 1868779960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:52,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 19:21:52,324][49750] Updated weights for policy 0, policy_version 251221 (0.0034) [2024-04-26 19:21:55,500][49750] Updated weights for policy 0, policy_version 251231 (0.0029) [2024-04-26 19:21:57,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4116250624. Throughput: 0: 50787.2. Samples: 1869090140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:21:57,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 19:21:58,706][49750] Updated weights for policy 0, policy_version 251241 (0.0030) [2024-04-26 19:22:02,011][49750] Updated weights for policy 0, policy_version 251251 (0.0030) [2024-04-26 19:22:02,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4116496384. Throughput: 0: 50905.8. Samples: 1869396900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:22:02,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 19:22:05,244][49750] Updated weights for policy 0, policy_version 251261 (0.0030) [2024-04-26 19:22:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4116758528. Throughput: 0: 50895.4. Samples: 1869548900. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 19:22:07,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 19:22:08,457][49750] Updated weights for policy 0, policy_version 251271 (0.0031) [2024-04-26 19:22:11,643][49750] Updated weights for policy 0, policy_version 251281 (0.0028) [2024-04-26 19:22:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4117004288. Throughput: 0: 50825.9. Samples: 1869848400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:12,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 19:22:14,836][49750] Updated weights for policy 0, policy_version 251291 (0.0029) [2024-04-26 19:22:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4117250048. Throughput: 0: 50942.9. Samples: 1870157660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:17,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 19:22:18,042][49750] Updated weights for policy 0, policy_version 251301 (0.0034) [2024-04-26 19:22:21,289][49750] Updated weights for policy 0, policy_version 251311 (0.0036) [2024-04-26 19:22:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4117495808. Throughput: 0: 50820.8. Samples: 1870303540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:22,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 19:22:24,426][49750] Updated weights for policy 0, policy_version 251321 (0.0034) [2024-04-26 19:22:27,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4117757952. Throughput: 0: 50729.4. Samples: 1870609620. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:22:27,773][49750] Updated weights for policy 0, policy_version 251331 (0.0035) [2024-04-26 19:22:30,808][49750] Updated weights for policy 0, policy_version 251341 (0.0033) [2024-04-26 19:22:32,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4118036480. Throughput: 0: 50961.1. Samples: 1870920040. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 19:22:34,195][49750] Updated weights for policy 0, policy_version 251351 (0.0028) [2024-04-26 19:22:37,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4118282240. Throughput: 0: 50993.7. Samples: 1871074680. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 19:22:37,395][49750] Updated weights for policy 0, policy_version 251361 (0.0032) [2024-04-26 19:22:38,101][49728] Signal inference workers to stop experience collection... (27750 times) [2024-04-26 19:22:38,101][49728] Signal inference workers to resume experience collection... (27750 times) [2024-04-26 19:22:38,127][49750] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-04-26 19:22:38,127][49750] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-04-26 19:22:40,526][49750] Updated weights for policy 0, policy_version 251371 (0.0030) [2024-04-26 19:22:42,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4118511616. Throughput: 0: 50758.9. Samples: 1871374300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:42,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 19:22:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000251374_4118511616.pth... 
[2024-04-26 19:22:42,132][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000250631_4106338304.pth [2024-04-26 19:22:43,762][49750] Updated weights for policy 0, policy_version 251381 (0.0040) [2024-04-26 19:22:47,000][49750] Updated weights for policy 0, policy_version 251391 (0.0032) [2024-04-26 19:22:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4118790144. Throughput: 0: 50659.6. Samples: 1871676580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:47,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 19:22:50,255][49750] Updated weights for policy 0, policy_version 251401 (0.0038) [2024-04-26 19:22:52,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.3, 300 sec: 50873.7). Total num frames: 4119052288. Throughput: 0: 50783.9. Samples: 1871834180. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 19:22:53,507][49750] Updated weights for policy 0, policy_version 251411 (0.0033) [2024-04-26 19:22:56,655][49750] Updated weights for policy 0, policy_version 251421 (0.0029) [2024-04-26 19:22:57,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4119281664. Throughput: 0: 50906.1. Samples: 1872139160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:22:57,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 19:23:00,050][49750] Updated weights for policy 0, policy_version 251431 (0.0036) [2024-04-26 19:23:02,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4119527424. Throughput: 0: 50874.0. Samples: 1872446980. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:23:02,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:23:03,172][49750] Updated weights for policy 0, policy_version 251441 (0.0036) [2024-04-26 19:23:06,654][49750] Updated weights for policy 0, policy_version 251451 (0.0033) [2024-04-26 19:23:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4119773184. Throughput: 0: 50845.0. Samples: 1872591560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:23:07,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 19:23:09,603][49750] Updated weights for policy 0, policy_version 251461 (0.0028) [2024-04-26 19:23:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4120051712. Throughput: 0: 50892.9. Samples: 1872899800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:23:12,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 19:23:13,130][49750] Updated weights for policy 0, policy_version 251471 (0.0029) [2024-04-26 19:23:15,899][49750] Updated weights for policy 0, policy_version 251481 (0.0028) [2024-04-26 19:23:17,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4120313856. Throughput: 0: 50633.4. Samples: 1873198540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-26 19:23:17,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 19:23:19,452][49750] Updated weights for policy 0, policy_version 251491 (0.0031) [2024-04-26 19:23:22,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4120559616. Throughput: 0: 50770.6. Samples: 1873359360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:23:22,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 19:23:22,342][49750] Updated weights for policy 0, policy_version 251501 (0.0036) [2024-04-26 19:23:26,120][49750] Updated weights for policy 0, policy_version 251511 (0.0032) [2024-04-26 19:23:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4120805376. Throughput: 0: 50907.7. Samples: 1873665140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:23:27,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 19:23:28,800][49750] Updated weights for policy 0, policy_version 251521 (0.0029) [2024-04-26 19:23:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4121051136. Throughput: 0: 50966.6. Samples: 1873970080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:23:32,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 19:23:32,596][49750] Updated weights for policy 0, policy_version 251531 (0.0032) [2024-04-26 19:23:35,401][49750] Updated weights for policy 0, policy_version 251541 (0.0036) [2024-04-26 19:23:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4121329664. Throughput: 0: 50662.5. Samples: 1874113980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:23:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:23:38,999][49750] Updated weights for policy 0, policy_version 251551 (0.0034) [2024-04-26 19:23:41,723][49750] Updated weights for policy 0, policy_version 251561 (0.0030) [2024-04-26 19:23:42,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4121591808. Throughput: 0: 50833.3. Samples: 1874426660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:23:42,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 19:23:45,324][49750] Updated weights for policy 0, policy_version 251571 (0.0032) [2024-04-26 19:23:47,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4121821184. Throughput: 0: 50836.6. Samples: 1874734640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:23:47,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:23:48,035][49750] Updated weights for policy 0, policy_version 251581 (0.0039) [2024-04-26 19:23:51,748][49750] Updated weights for policy 0, policy_version 251591 (0.0029) [2024-04-26 19:23:52,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4122066944. Throughput: 0: 50745.6. Samples: 1874875120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:23:52,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 19:23:54,512][49750] Updated weights for policy 0, policy_version 251601 (0.0030) [2024-04-26 19:23:55,804][49728] Signal inference workers to stop experience collection... (27800 times) [2024-04-26 19:23:55,827][49750] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-04-26 19:23:55,910][49728] Signal inference workers to resume experience collection... (27800 times) [2024-04-26 19:23:55,910][49750] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-04-26 19:23:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4122329088. Throughput: 0: 50708.8. Samples: 1875181700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:23:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:23:58,317][49750] Updated weights for policy 0, policy_version 251611 (0.0030) [2024-04-26 19:24:00,975][49750] Updated weights for policy 0, policy_version 251621 (0.0028) [2024-04-26 19:24:02,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4122591232. Throughput: 0: 50756.1. Samples: 1875482560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:24:02,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 19:24:04,710][49750] Updated weights for policy 0, policy_version 251631 (0.0032) [2024-04-26 19:24:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4122853376. Throughput: 0: 50687.5. Samples: 1875640300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:24:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 19:24:07,454][49750] Updated weights for policy 0, policy_version 251641 (0.0037) [2024-04-26 19:24:11,157][49750] Updated weights for policy 0, policy_version 251651 (0.0029) [2024-04-26 19:24:12,062][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4123082752. Throughput: 0: 50832.4. Samples: 1875952600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:24:12,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 19:24:13,820][49750] Updated weights for policy 0, policy_version 251661 (0.0038) [2024-04-26 19:24:17,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 4123328512. Throughput: 0: 50724.1. Samples: 1876252660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:24:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 19:24:17,655][49750] Updated weights for policy 0, policy_version 251671 (0.0032) [2024-04-26 19:24:20,237][49750] Updated weights for policy 0, policy_version 251681 (0.0030) [2024-04-26 19:24:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4123607040. Throughput: 0: 50741.7. Samples: 1876397360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:24:22,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:24:24,225][49750] Updated weights for policy 0, policy_version 251691 (0.0028) [2024-04-26 19:24:26,609][49750] Updated weights for policy 0, policy_version 251701 (0.0035) [2024-04-26 19:24:27,062][49517] Fps is (10 sec: 55705.3, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4123885568. Throughput: 0: 50707.6. Samples: 1876708500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 19:24:27,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 19:24:30,588][49750] Updated weights for policy 0, policy_version 251711 (0.0036) [2024-04-26 19:24:32,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4124114944. Throughput: 0: 50750.3. Samples: 1877018400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:24:32,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 19:24:33,123][49750] Updated weights for policy 0, policy_version 251721 (0.0034) [2024-04-26 19:24:37,062][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4124344320. Throughput: 0: 50689.0. Samples: 1877156120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:24:37,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 19:24:37,152][49750] Updated weights for policy 0, policy_version 251731 (0.0030) [2024-04-26 19:24:39,642][49750] Updated weights for policy 0, policy_version 251741 (0.0033) [2024-04-26 19:24:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 4124606464. Throughput: 0: 50581.3. Samples: 1877457860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:24:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:24:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000251746_4124606464.pth... [2024-04-26 19:24:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000251003_4112433152.pth [2024-04-26 19:24:43,454][49750] Updated weights for policy 0, policy_version 251751 (0.0032) [2024-04-26 19:24:46,013][49750] Updated weights for policy 0, policy_version 251761 (0.0029) [2024-04-26 19:24:47,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.7, 300 sec: 50873.7). Total num frames: 4124884992. Throughput: 0: 50719.1. Samples: 1877764920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:24:47,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 19:24:49,805][49750] Updated weights for policy 0, policy_version 251771 (0.0028) [2024-04-26 19:24:52,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4125147136. Throughput: 0: 50879.4. Samples: 1877929860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:24:52,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 19:24:52,392][49750] Updated weights for policy 0, policy_version 251781 (0.0032) [2024-04-26 19:24:56,381][49750] Updated weights for policy 0, policy_version 251791 (0.0030) [2024-04-26 19:24:57,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50707.1). 
Total num frames: 4125360128. Throughput: 0: 50805.4. Samples: 1878238840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:24:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 19:24:57,186][49728] Signal inference workers to stop experience collection... (27850 times) [2024-04-26 19:24:57,232][49750] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-04-26 19:24:57,295][49728] Signal inference workers to resume experience collection... (27850 times) [2024-04-26 19:24:57,295][49750] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-04-26 19:24:58,894][49750] Updated weights for policy 0, policy_version 251801 (0.0026) [2024-04-26 19:25:02,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4125622272. Throughput: 0: 50970.1. Samples: 1878546320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:25:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 19:25:02,835][49750] Updated weights for policy 0, policy_version 251811 (0.0036) [2024-04-26 19:25:05,372][49750] Updated weights for policy 0, policy_version 251821 (0.0027) [2024-04-26 19:25:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4125884416. Throughput: 0: 50802.2. Samples: 1878683460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:25:07,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 19:25:09,201][49750] Updated weights for policy 0, policy_version 251831 (0.0035) [2024-04-26 19:25:11,669][49750] Updated weights for policy 0, policy_version 251841 (0.0035) [2024-04-26 19:25:12,062][49517] Fps is (10 sec: 55706.3, 60 sec: 51609.7, 300 sec: 50929.3). Total num frames: 4126179328. Throughput: 0: 50758.3. Samples: 1878992620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:25:12,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 19:25:15,865][49750] Updated weights for policy 0, policy_version 251851 (0.0032) [2024-04-26 19:25:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4126392320. Throughput: 0: 50681.0. Samples: 1879299040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:25:17,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:25:18,186][49750] Updated weights for policy 0, policy_version 251861 (0.0027) [2024-04-26 19:25:22,063][49517] Fps is (10 sec: 45874.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4126638080. Throughput: 0: 50802.5. Samples: 1879442240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:25:22,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 19:25:22,224][49750] Updated weights for policy 0, policy_version 251871 (0.0030) [2024-04-26 19:25:24,582][49750] Updated weights for policy 0, policy_version 251881 (0.0037) [2024-04-26 19:25:27,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49698.1, 300 sec: 50707.1). Total num frames: 4126867456. Throughput: 0: 50737.0. Samples: 1879741020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:25:27,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 19:25:28,762][49750] Updated weights for policy 0, policy_version 251891 (0.0033) [2024-04-26 19:25:31,048][49750] Updated weights for policy 0, policy_version 251901 (0.0031) [2024-04-26 19:25:32,062][49517] Fps is (10 sec: 55706.2, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4127195136. Throughput: 0: 50726.6. Samples: 1880047620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:25:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 19:25:35,178][49750] Updated weights for policy 0, policy_version 251911 (0.0034) [2024-04-26 19:25:37,062][49517] Fps is (10 sec: 55705.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4127424512. Throughput: 0: 50865.2. Samples: 1880218800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 19:25:37,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 19:25:37,482][49750] Updated weights for policy 0, policy_version 251921 (0.0034) [2024-04-26 19:25:41,670][49750] Updated weights for policy 0, policy_version 251931 (0.0032) [2024-04-26 19:25:42,063][49517] Fps is (10 sec: 45874.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4127653888. Throughput: 0: 50809.6. Samples: 1880525280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:25:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:25:43,895][49750] Updated weights for policy 0, policy_version 251941 (0.0031) [2024-04-26 19:25:47,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4127899648. Throughput: 0: 50710.8. Samples: 1880828300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:25:47,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 19:25:47,974][49750] Updated weights for policy 0, policy_version 251951 (0.0031) [2024-04-26 19:25:50,453][49750] Updated weights for policy 0, policy_version 251961 (0.0028) [2024-04-26 19:25:52,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4128178176. Throughput: 0: 50861.8. Samples: 1880972240. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:25:52,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 19:25:54,297][49750] Updated weights for policy 0, policy_version 251971 (0.0029) [2024-04-26 19:25:56,842][49750] Updated weights for policy 0, policy_version 251981 (0.0026) [2024-04-26 19:25:57,062][49517] Fps is (10 sec: 55705.2, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4128456704. Throughput: 0: 50877.2. Samples: 1881282100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:25:57,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 19:26:00,789][49750] Updated weights for policy 0, policy_version 251991 (0.0034) [2024-04-26 19:26:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4128686080. Throughput: 0: 50873.8. Samples: 1881588360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:02,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 19:26:02,539][49728] Signal inference workers to stop experience collection... (27900 times) [2024-04-26 19:26:02,577][49750] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-04-26 19:26:02,644][49728] Signal inference workers to resume experience collection... (27900 times) [2024-04-26 19:26:02,644][49750] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-04-26 19:26:03,386][49750] Updated weights for policy 0, policy_version 252001 (0.0035) [2024-04-26 19:26:07,062][49517] Fps is (10 sec: 45875.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4128915456. Throughput: 0: 50873.9. Samples: 1881731560. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:07,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 19:26:07,294][49750] Updated weights for policy 0, policy_version 252011 (0.0032) [2024-04-26 19:26:09,853][49750] Updated weights for policy 0, policy_version 252021 (0.0031) [2024-04-26 19:26:12,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49698.1, 300 sec: 50707.1). Total num frames: 4129161216. Throughput: 0: 51037.4. Samples: 1882037700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:12,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 19:26:13,714][49750] Updated weights for policy 0, policy_version 252031 (0.0031) [2024-04-26 19:26:16,335][49750] Updated weights for policy 0, policy_version 252041 (0.0032) [2024-04-26 19:26:17,062][49517] Fps is (10 sec: 55706.2, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4129472512. Throughput: 0: 50889.4. Samples: 1882337640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:17,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 19:26:20,312][49750] Updated weights for policy 0, policy_version 252051 (0.0039) [2024-04-26 19:26:22,063][49517] Fps is (10 sec: 55704.6, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4129718272. Throughput: 0: 50807.9. Samples: 1882505160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:22,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:26:22,714][49750] Updated weights for policy 0, policy_version 252061 (0.0039) [2024-04-26 19:26:26,730][49750] Updated weights for policy 0, policy_version 252071 (0.0032) [2024-04-26 19:26:27,063][49517] Fps is (10 sec: 47513.1, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4129947648. Throughput: 0: 50736.5. Samples: 1882808420. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 19:26:29,143][49750] Updated weights for policy 0, policy_version 252081 (0.0033) [2024-04-26 19:26:32,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4130193408. Throughput: 0: 50812.0. Samples: 1883114840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 19:26:33,176][49750] Updated weights for policy 0, policy_version 252091 (0.0034) [2024-04-26 19:26:35,677][49750] Updated weights for policy 0, policy_version 252101 (0.0027) [2024-04-26 19:26:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4130471936. Throughput: 0: 50784.5. Samples: 1883257540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:37,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 19:26:39,525][49750] Updated weights for policy 0, policy_version 252111 (0.0029) [2024-04-26 19:26:42,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4130734080. Throughput: 0: 50767.0. Samples: 1883566620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:42,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 19:26:42,105][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000252121_4130750464.pth... [2024-04-26 19:26:42,111][49750] Updated weights for policy 0, policy_version 252121 (0.0032) [2024-04-26 19:26:42,165][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000251374_4118511616.pth [2024-04-26 19:26:45,938][49750] Updated weights for policy 0, policy_version 252131 (0.0031) [2024-04-26 19:26:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4130963456. Throughput: 0: 50688.8. Samples: 1883869360. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-26 19:26:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 19:26:48,704][49750] Updated weights for policy 0, policy_version 252141 (0.0032) [2024-04-26 19:26:52,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4131225600. Throughput: 0: 50823.5. Samples: 1884018620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:26:52,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 19:26:52,311][49750] Updated weights for policy 0, policy_version 252151 (0.0030) [2024-04-26 19:26:55,076][49750] Updated weights for policy 0, policy_version 252161 (0.0030) [2024-04-26 19:26:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4131454976. Throughput: 0: 50662.2. Samples: 1884317500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:26:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 19:26:58,696][49750] Updated weights for policy 0, policy_version 252171 (0.0034) [2024-04-26 19:27:01,462][49750] Updated weights for policy 0, policy_version 252181 (0.0033) [2024-04-26 19:27:02,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4131749888. Throughput: 0: 50795.6. Samples: 1884623440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:02,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 19:27:05,179][49750] Updated weights for policy 0, policy_version 252191 (0.0034) [2024-04-26 19:27:07,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4131995648. Throughput: 0: 50743.3. Samples: 1884788600. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:07,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 19:27:07,787][49750] Updated weights for policy 0, policy_version 252201 (0.0032) [2024-04-26 19:27:11,612][49750] Updated weights for policy 0, policy_version 252211 (0.0029) [2024-04-26 19:27:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4132241408. Throughput: 0: 50733.4. Samples: 1885091420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:12,063][49517] Avg episode reward: [(0, '0.684')] [2024-04-26 19:27:14,258][49750] Updated weights for policy 0, policy_version 252221 (0.0034) [2024-04-26 19:27:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4132487168. Throughput: 0: 50732.5. Samples: 1885397800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:17,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 19:27:17,688][49728] Signal inference workers to stop experience collection... (27950 times) [2024-04-26 19:27:17,732][49750] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-04-26 19:27:17,793][49728] Signal inference workers to resume experience collection... (27950 times) [2024-04-26 19:27:17,793][49750] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-04-26 19:27:17,916][49750] Updated weights for policy 0, policy_version 252231 (0.0030) [2024-04-26 19:27:20,597][49750] Updated weights for policy 0, policy_version 252241 (0.0024) [2024-04-26 19:27:22,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4132732928. Throughput: 0: 51022.6. Samples: 1885553560. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 19:27:24,378][49750] Updated weights for policy 0, policy_version 252251 (0.0038) [2024-04-26 19:27:27,030][49750] Updated weights for policy 0, policy_version 252261 (0.0031) [2024-04-26 19:27:27,063][49517] Fps is (10 sec: 55705.4, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 4133044224. Throughput: 0: 50997.6. Samples: 1885861500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:27,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 19:27:30,873][49750] Updated weights for policy 0, policy_version 252271 (0.0031) [2024-04-26 19:27:32,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4133257216. Throughput: 0: 51066.2. Samples: 1886167340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:27:33,392][49750] Updated weights for policy 0, policy_version 252281 (0.0033) [2024-04-26 19:27:37,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4133519360. Throughput: 0: 51004.9. Samples: 1886313840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:37,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:27:37,200][49750] Updated weights for policy 0, policy_version 252291 (0.0036) [2024-04-26 19:27:39,937][49750] Updated weights for policy 0, policy_version 252301 (0.0030) [2024-04-26 19:27:42,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4133765120. Throughput: 0: 51010.5. Samples: 1886612980. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:42,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 19:27:43,490][49750] Updated weights for policy 0, policy_version 252311 (0.0035) [2024-04-26 19:27:46,400][49750] Updated weights for policy 0, policy_version 252321 (0.0035) [2024-04-26 19:27:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4134043648. Throughput: 0: 51044.8. Samples: 1886920460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:27:49,857][49750] Updated weights for policy 0, policy_version 252331 (0.0034) [2024-04-26 19:27:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4134273024. Throughput: 0: 50928.9. Samples: 1887080400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 19:27:52,742][49750] Updated weights for policy 0, policy_version 252341 (0.0027) [2024-04-26 19:27:56,298][49750] Updated weights for policy 0, policy_version 252351 (0.0031) [2024-04-26 19:27:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4134535168. Throughput: 0: 51062.2. Samples: 1887389220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-26 19:27:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:27:59,154][49750] Updated weights for policy 0, policy_version 252361 (0.0030) [2024-04-26 19:28:02,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 4134813696. Throughput: 0: 50991.9. Samples: 1887692440. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:02,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 19:28:02,668][49750] Updated weights for policy 0, policy_version 252371 (0.0034) [2024-04-26 19:28:05,747][49750] Updated weights for policy 0, policy_version 252381 (0.0028) [2024-04-26 19:28:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4135026688. Throughput: 0: 51056.0. Samples: 1887851080. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 19:28:09,065][49750] Updated weights for policy 0, policy_version 252391 (0.0037) [2024-04-26 19:28:11,591][49728] Signal inference workers to stop experience collection... (28000 times) [2024-04-26 19:28:11,594][49728] Signal inference workers to resume experience collection... (28000 times) [2024-04-26 19:28:11,612][49750] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-04-26 19:28:11,612][49750] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-04-26 19:28:12,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4135321600. Throughput: 0: 50877.2. Samples: 1888150980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:12,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 19:28:12,253][49750] Updated weights for policy 0, policy_version 252401 (0.0031) [2024-04-26 19:28:15,569][49750] Updated weights for policy 0, policy_version 252411 (0.0026) [2024-04-26 19:28:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 4135550976. Throughput: 0: 50899.6. Samples: 1888457820. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:17,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 19:28:18,608][49750] Updated weights for policy 0, policy_version 252421 (0.0038) [2024-04-26 19:28:22,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4135813120. Throughput: 0: 51073.3. Samples: 1888612140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:22,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 19:28:22,303][49750] Updated weights for policy 0, policy_version 252431 (0.0028) [2024-04-26 19:28:24,965][49750] Updated weights for policy 0, policy_version 252441 (0.0026) [2024-04-26 19:28:27,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 4136075264. Throughput: 0: 51090.8. Samples: 1888912060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:27,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 19:28:28,715][49750] Updated weights for policy 0, policy_version 252451 (0.0029) [2024-04-26 19:28:31,481][49750] Updated weights for policy 0, policy_version 252461 (0.0029) [2024-04-26 19:28:32,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4136321024. Throughput: 0: 50948.8. Samples: 1889213160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 19:28:35,257][49750] Updated weights for policy 0, policy_version 252471 (0.0040) [2024-04-26 19:28:37,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4136566784. Throughput: 0: 50890.0. Samples: 1889370460. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:37,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 19:28:38,010][49750] Updated weights for policy 0, policy_version 252481 (0.0026) [2024-04-26 19:28:41,502][49750] Updated weights for policy 0, policy_version 252491 (0.0036) [2024-04-26 19:28:42,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4136812544. Throughput: 0: 50879.1. Samples: 1889678780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:42,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 19:28:42,118][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000252492_4136828928.pth... [2024-04-26 19:28:42,164][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000251746_4124606464.pth [2024-04-26 19:28:44,627][49750] Updated weights for policy 0, policy_version 252501 (0.0033) [2024-04-26 19:28:47,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 4137107456. Throughput: 0: 50794.7. Samples: 1889978200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 19:28:47,818][49750] Updated weights for policy 0, policy_version 252511 (0.0029) [2024-04-26 19:28:51,231][49750] Updated weights for policy 0, policy_version 252521 (0.0028) [2024-04-26 19:28:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4137336832. Throughput: 0: 51023.2. Samples: 1890147120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:52,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 19:28:54,362][49750] Updated weights for policy 0, policy_version 252531 (0.0027) [2024-04-26 19:28:57,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4137598976. Throughput: 0: 51104.8. Samples: 1890450700. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:28:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 19:28:57,860][49750] Updated weights for policy 0, policy_version 252541 (0.0029) [2024-04-26 19:29:00,880][49750] Updated weights for policy 0, policy_version 252551 (0.0027) [2024-04-26 19:29:02,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4137844736. Throughput: 0: 51110.3. Samples: 1890757780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:29:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 19:29:04,270][49750] Updated weights for policy 0, policy_version 252561 (0.0029) [2024-04-26 19:29:05,057][49728] Signal inference workers to stop experience collection... (28050 times) [2024-04-26 19:29:05,058][49728] Signal inference workers to resume experience collection... (28050 times) [2024-04-26 19:29:05,084][49750] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-04-26 19:29:05,084][49750] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-04-26 19:29:07,062][49517] Fps is (10 sec: 50791.6, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4138106880. Throughput: 0: 51158.9. Samples: 1890914280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-26 19:29:07,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 19:29:07,157][49750] Updated weights for policy 0, policy_version 252571 (0.0031) [2024-04-26 19:29:10,784][49750] Updated weights for policy 0, policy_version 252581 (0.0030) [2024-04-26 19:29:12,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.4, 300 sec: 51040.3). Total num frames: 4138385408. Throughput: 0: 51222.9. Samples: 1891217100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:12,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 19:29:13,759][49750] Updated weights for policy 0, policy_version 252591 (0.0031) [2024-04-26 19:29:17,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4138598400. Throughput: 0: 51213.9. Samples: 1891517780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:17,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 19:29:17,092][49750] Updated weights for policy 0, policy_version 252601 (0.0031) [2024-04-26 19:29:20,231][49750] Updated weights for policy 0, policy_version 252611 (0.0028) [2024-04-26 19:29:22,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4138860544. Throughput: 0: 51154.5. Samples: 1891672400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:22,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 19:29:23,327][49750] Updated weights for policy 0, policy_version 252621 (0.0033) [2024-04-26 19:29:26,679][49750] Updated weights for policy 0, policy_version 252631 (0.0030) [2024-04-26 19:29:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4139106304. Throughput: 0: 50959.5. Samples: 1891971960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:27,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 19:29:29,787][49750] Updated weights for policy 0, policy_version 252641 (0.0027) [2024-04-26 19:29:32,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 4139384832. Throughput: 0: 51114.1. Samples: 1892278340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:29:33,345][49750] Updated weights for policy 0, policy_version 252651 (0.0027) [2024-04-26 19:29:36,299][49750] Updated weights for policy 0, policy_version 252661 (0.0030) [2024-04-26 19:29:37,062][49517] Fps is (10 sec: 57343.8, 60 sec: 51882.8, 300 sec: 51095.9). Total num frames: 4139679744. Throughput: 0: 50964.4. Samples: 1892440520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:37,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 19:29:39,963][49750] Updated weights for policy 0, policy_version 252671 (0.0033) [2024-04-26 19:29:42,062][49517] Fps is (10 sec: 49153.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4139876352. Throughput: 0: 51083.8. Samples: 1892749460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:42,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 19:29:42,707][49750] Updated weights for policy 0, policy_version 252681 (0.0033) [2024-04-26 19:29:46,324][49750] Updated weights for policy 0, policy_version 252691 (0.0037) [2024-04-26 19:29:47,063][49517] Fps is (10 sec: 44236.5, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4140122112. Throughput: 0: 50989.7. Samples: 1893052320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:47,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 19:29:49,053][49750] Updated weights for policy 0, policy_version 252701 (0.0032) [2024-04-26 19:29:52,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4140384256. Throughput: 0: 50580.7. Samples: 1893190420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:52,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 19:29:52,632][49750] Updated weights for policy 0, policy_version 252711 (0.0027) [2024-04-26 19:29:55,612][49750] Updated weights for policy 0, policy_version 252721 (0.0037) [2024-04-26 19:29:55,977][49728] Signal inference workers to stop experience collection... (28100 times) [2024-04-26 19:29:56,021][49750] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-04-26 19:29:56,037][49728] Signal inference workers to resume experience collection... (28100 times) [2024-04-26 19:29:56,043][49750] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-04-26 19:29:57,063][49517] Fps is (10 sec: 57344.1, 60 sec: 51609.6, 300 sec: 51095.9). Total num frames: 4140695552. Throughput: 0: 50961.0. Samples: 1893510340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:29:57,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 19:29:59,106][49750] Updated weights for policy 0, policy_version 252731 (0.0030) [2024-04-26 19:30:01,981][49750] Updated weights for policy 0, policy_version 252741 (0.0029) [2024-04-26 19:30:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 4140908544. Throughput: 0: 51184.4. Samples: 1893821080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:30:02,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 19:30:05,697][49750] Updated weights for policy 0, policy_version 252751 (0.0029) [2024-04-26 19:30:07,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4141154304. Throughput: 0: 50939.0. Samples: 1893964660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:30:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 19:30:08,283][49750] Updated weights for policy 0, policy_version 252761 (0.0031) [2024-04-26 19:30:12,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4141400064. Throughput: 0: 50869.3. Samples: 1894261080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:30:12,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 19:30:12,067][49750] Updated weights for policy 0, policy_version 252771 (0.0037) [2024-04-26 19:30:14,671][49750] Updated weights for policy 0, policy_version 252781 (0.0033) [2024-04-26 19:30:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4141662208. Throughput: 0: 50776.0. Samples: 1894563260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 19:30:17,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 19:30:18,471][49750] Updated weights for policy 0, policy_version 252791 (0.0029) [2024-04-26 19:30:21,200][49750] Updated weights for policy 0, policy_version 252801 (0.0033) [2024-04-26 19:30:22,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51609.6, 300 sec: 51151.4). Total num frames: 4141957120. Throughput: 0: 50877.8. Samples: 1894730020. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:30:22,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 19:30:24,730][49750] Updated weights for policy 0, policy_version 252811 (0.0026) [2024-04-26 19:30:27,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4142170112. Throughput: 0: 50810.2. Samples: 1895035920. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:30:27,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 19:30:27,545][49750] Updated weights for policy 0, policy_version 252821 (0.0029) [2024-04-26 19:30:31,361][49750] Updated weights for policy 0, policy_version 252831 (0.0026) [2024-04-26 19:30:32,062][49517] Fps is (10 sec: 44237.0, 60 sec: 50244.5, 300 sec: 50762.7). Total num frames: 4142399488. Throughput: 0: 50845.5. Samples: 1895340360. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:30:32,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 19:30:33,850][49750] Updated weights for policy 0, policy_version 252841 (0.0033) [2024-04-26 19:30:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 50929.3). Total num frames: 4142678016. Throughput: 0: 50790.7. Samples: 1895476000. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:30:37,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 19:30:37,951][49750] Updated weights for policy 0, policy_version 252851 (0.0033) [2024-04-26 19:30:40,321][49750] Updated weights for policy 0, policy_version 252861 (0.0035) [2024-04-26 19:30:42,062][49517] Fps is (10 sec: 57343.5, 60 sec: 51609.5, 300 sec: 51095.9). Total num frames: 4142972928. Throughput: 0: 50620.1. Samples: 1895788240. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:30:42,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 19:30:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000252867_4142972928.pth... [2024-04-26 19:30:42,134][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000252121_4130750464.pth [2024-04-26 19:30:44,277][49750] Updated weights for policy 0, policy_version 252871 (0.0029) [2024-04-26 19:30:46,707][49750] Updated weights for policy 0, policy_version 252881 (0.0031) [2024-04-26 19:30:47,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.7, 300 sec: 50984.8). 
Total num frames: 4143218688. Throughput: 0: 50607.1. Samples: 1896098400. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:30:47,071][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 19:30:50,872][49750] Updated weights for policy 0, policy_version 252891 (0.0025) [2024-04-26 19:30:52,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4143431680. Throughput: 0: 50844.9. Samples: 1896252680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:30:52,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 19:30:52,602][49728] Signal inference workers to stop experience collection... (28150 times) [2024-04-26 19:30:52,602][49728] Signal inference workers to resume experience collection... (28150 times) [2024-04-26 19:30:52,614][49750] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-04-26 19:30:52,615][49750] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-04-26 19:30:53,197][49750] Updated weights for policy 0, policy_version 252901 (0.0034) [2024-04-26 19:30:57,062][49517] Fps is (10 sec: 45874.9, 60 sec: 49698.2, 300 sec: 50818.2). Total num frames: 4143677440. Throughput: 0: 50960.5. Samples: 1896554300. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:30:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 19:30:57,467][49750] Updated weights for policy 0, policy_version 252911 (0.0029) [2024-04-26 19:30:59,837][49750] Updated weights for policy 0, policy_version 252921 (0.0032) [2024-04-26 19:31:02,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50984.8). Total num frames: 4143955968. Throughput: 0: 50745.0. Samples: 1896846780. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:31:02,072][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 19:31:03,856][49750] Updated weights for policy 0, policy_version 252931 (0.0036) [2024-04-26 19:31:06,301][49750] Updated weights for policy 0, policy_version 252941 (0.0034) [2024-04-26 19:31:07,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51336.5, 300 sec: 51095.8). Total num frames: 4144234496. Throughput: 0: 50729.2. Samples: 1897012840. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:31:07,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 19:31:10,204][49750] Updated weights for policy 0, policy_version 252951 (0.0028) [2024-04-26 19:31:12,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4144463872. Throughput: 0: 50835.1. Samples: 1897323500. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:31:12,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 19:31:12,680][49750] Updated weights for policy 0, policy_version 252961 (0.0032) [2024-04-26 19:31:16,691][49750] Updated weights for policy 0, policy_version 252971 (0.0031) [2024-04-26 19:31:17,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4144693248. Throughput: 0: 50832.3. Samples: 1897627820. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:31:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 19:31:18,991][49750] Updated weights for policy 0, policy_version 252981 (0.0032) [2024-04-26 19:31:22,062][49517] Fps is (10 sec: 49151.8, 60 sec: 49971.2, 300 sec: 50873.7). Total num frames: 4144955392. Throughput: 0: 50773.3. Samples: 1897760800. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:31:22,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 19:31:23,184][49750] Updated weights for policy 0, policy_version 252991 (0.0035) [2024-04-26 19:31:25,389][49750] Updated weights for policy 0, policy_version 253001 (0.0030) [2024-04-26 19:31:27,062][49517] Fps is (10 sec: 55706.0, 60 sec: 51336.5, 300 sec: 51040.3). Total num frames: 4145250304. Throughput: 0: 50675.6. Samples: 1898068640. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:31:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 19:31:29,535][49750] Updated weights for policy 0, policy_version 253011 (0.0032) [2024-04-26 19:31:31,787][49750] Updated weights for policy 0, policy_version 253021 (0.0036) [2024-04-26 19:31:32,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4145496064. Throughput: 0: 50639.1. Samples: 1898377160. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-26 19:31:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 19:31:35,926][49750] Updated weights for policy 0, policy_version 253031 (0.0028) [2024-04-26 19:31:37,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4145725440. Throughput: 0: 50811.2. Samples: 1898539180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:31:37,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 19:31:38,186][49750] Updated weights for policy 0, policy_version 253041 (0.0030) [2024-04-26 19:31:42,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49698.1, 300 sec: 50818.2). Total num frames: 4145954816. Throughput: 0: 50752.0. Samples: 1898838140. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:31:42,063][49517] Avg episode reward: [(0, '0.693')] [2024-04-26 19:31:42,338][49750] Updated weights for policy 0, policy_version 253051 (0.0029) [2024-04-26 19:31:44,597][49750] Updated weights for policy 0, policy_version 253061 (0.0032) [2024-04-26 19:31:47,062][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50929.2). Total num frames: 4146249728. Throughput: 0: 50945.7. Samples: 1899139340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:31:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 19:31:48,817][49750] Updated weights for policy 0, policy_version 253071 (0.0029) [2024-04-26 19:31:50,580][49728] Signal inference workers to stop experience collection... (28200 times) [2024-04-26 19:31:50,601][49750] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-04-26 19:31:50,652][49728] Signal inference workers to resume experience collection... (28200 times) [2024-04-26 19:31:50,654][49750] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-04-26 19:31:50,931][49750] Updated weights for policy 0, policy_version 253081 (0.0027) [2024-04-26 19:31:52,062][49517] Fps is (10 sec: 57344.3, 60 sec: 51609.7, 300 sec: 51095.9). Total num frames: 4146528256. Throughput: 0: 51000.6. Samples: 1899307860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:31:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 19:31:55,154][49750] Updated weights for policy 0, policy_version 253091 (0.0029) [2024-04-26 19:31:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4146757632. Throughput: 0: 50798.6. Samples: 1899609440. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:31:57,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 19:31:57,411][49750] Updated weights for policy 0, policy_version 253101 (0.0033) [2024-04-26 19:32:01,539][49750] Updated weights for policy 0, policy_version 253111 (0.0029) [2024-04-26 19:32:02,062][49517] Fps is (10 sec: 45874.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4146987008. Throughput: 0: 50900.4. Samples: 1899918340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:32:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 19:32:03,869][49750] Updated weights for policy 0, policy_version 253121 (0.0038) [2024-04-26 19:32:07,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 4147249152. Throughput: 0: 51034.4. Samples: 1900057360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:32:07,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:32:08,020][49750] Updated weights for policy 0, policy_version 253131 (0.0032) [2024-04-26 19:32:10,224][49750] Updated weights for policy 0, policy_version 253141 (0.0033) [2024-04-26 19:32:12,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4147527680. Throughput: 0: 50928.5. Samples: 1900360420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:32:12,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 19:32:14,303][49750] Updated weights for policy 0, policy_version 253151 (0.0036) [2024-04-26 19:32:16,789][49750] Updated weights for policy 0, policy_version 253161 (0.0032) [2024-04-26 19:32:17,062][49517] Fps is (10 sec: 54068.5, 60 sec: 51609.7, 300 sec: 51040.3). Total num frames: 4147789824. Throughput: 0: 50755.1. Samples: 1900661140. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:32:17,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 19:32:20,743][49750] Updated weights for policy 0, policy_version 253171 (0.0037) [2024-04-26 19:32:22,062][49517] Fps is (10 sec: 47512.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4148002816. Throughput: 0: 50752.7. Samples: 1900823060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:32:22,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 19:32:23,299][49750] Updated weights for policy 0, policy_version 253181 (0.0029) [2024-04-26 19:32:27,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 4148264960. Throughput: 0: 50886.2. Samples: 1901128020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:32:27,071][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:32:27,110][49750] Updated weights for policy 0, policy_version 253191 (0.0028) [2024-04-26 19:32:29,664][49750] Updated weights for policy 0, policy_version 253201 (0.0030) [2024-04-26 19:32:32,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4148527104. Throughput: 0: 50989.3. Samples: 1901433860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:32:32,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 19:32:33,711][49750] Updated weights for policy 0, policy_version 253211 (0.0029) [2024-04-26 19:32:36,002][49750] Updated weights for policy 0, policy_version 253221 (0.0030) [2024-04-26 19:32:37,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 4148805632. Throughput: 0: 50535.4. Samples: 1901581960. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:32:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:32:40,165][49750] Updated weights for policy 0, policy_version 253231 (0.0035) [2024-04-26 19:32:42,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51882.7, 300 sec: 50929.2). Total num frames: 4149067776. Throughput: 0: 50648.9. Samples: 1901888640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-26 19:32:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 19:32:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000253239_4149067776.pth... [2024-04-26 19:32:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000252492_4136828928.pth [2024-04-26 19:32:42,648][49750] Updated weights for policy 0, policy_version 253241 (0.0036) [2024-04-26 19:32:46,620][49750] Updated weights for policy 0, policy_version 253251 (0.0029) [2024-04-26 19:32:47,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4149280768. Throughput: 0: 50647.9. Samples: 1902197500. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:32:47,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 19:32:49,180][49750] Updated weights for policy 0, policy_version 253261 (0.0032) [2024-04-26 19:32:52,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.1, 300 sec: 50818.2). Total num frames: 4149526528. Throughput: 0: 50682.9. Samples: 1902338080. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:32:52,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 19:32:52,988][49750] Updated weights for policy 0, policy_version 253271 (0.0040) [2024-04-26 19:32:55,614][49750] Updated weights for policy 0, policy_version 253281 (0.0030) [2024-04-26 19:32:57,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4149805056. Throughput: 0: 50716.7. Samples: 1902642680. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:32:57,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 19:32:59,413][49750] Updated weights for policy 0, policy_version 253291 (0.0030) [2024-04-26 19:33:02,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4150067200. Throughput: 0: 50783.8. Samples: 1902946420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:02,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 19:33:02,084][49750] Updated weights for policy 0, policy_version 253301 (0.0035) [2024-04-26 19:33:05,836][49750] Updated weights for policy 0, policy_version 253311 (0.0032) [2024-04-26 19:33:05,867][49728] Signal inference workers to stop experience collection... (28250 times) [2024-04-26 19:33:05,867][49728] Signal inference workers to resume experience collection... (28250 times) [2024-04-26 19:33:05,879][49750] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-04-26 19:33:05,879][49750] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-04-26 19:33:07,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4150312960. Throughput: 0: 50837.7. Samples: 1903110760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:07,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 19:33:08,566][49750] Updated weights for policy 0, policy_version 253321 (0.0039) [2024-04-26 19:33:12,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4150558720. Throughput: 0: 50930.3. Samples: 1903419880. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 19:33:12,136][49750] Updated weights for policy 0, policy_version 253331 (0.0038) [2024-04-26 19:33:14,976][49750] Updated weights for policy 0, policy_version 253341 (0.0032) [2024-04-26 19:33:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4150788096. Throughput: 0: 50892.1. Samples: 1903724000. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:17,063][49517] Avg episode reward: [(0, '0.474')] [2024-04-26 19:33:18,473][49750] Updated weights for policy 0, policy_version 253351 (0.0038) [2024-04-26 19:33:21,328][49750] Updated weights for policy 0, policy_version 253361 (0.0037) [2024-04-26 19:33:22,062][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4151083008. Throughput: 0: 50867.5. Samples: 1903871000. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 19:33:24,918][49750] Updated weights for policy 0, policy_version 253371 (0.0034) [2024-04-26 19:33:27,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4151345152. Throughput: 0: 50800.0. Samples: 1904174640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:27,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 19:33:27,877][49750] Updated weights for policy 0, policy_version 253381 (0.0028) [2024-04-26 19:33:31,504][49750] Updated weights for policy 0, policy_version 253391 (0.0027) [2024-04-26 19:33:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4151574528. Throughput: 0: 50677.8. Samples: 1904478000. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:32,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 19:33:34,414][49750] Updated weights for policy 0, policy_version 253401 (0.0029) [2024-04-26 19:33:37,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.2, 300 sec: 50818.2). Total num frames: 4151803904. Throughput: 0: 50772.0. Samples: 1904622820. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:37,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:33:37,913][49750] Updated weights for policy 0, policy_version 253411 (0.0035) [2024-04-26 19:33:40,715][49750] Updated weights for policy 0, policy_version 253421 (0.0032) [2024-04-26 19:33:42,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4152066048. Throughput: 0: 50890.3. Samples: 1904932740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:42,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 19:33:44,288][49750] Updated weights for policy 0, policy_version 253431 (0.0029) [2024-04-26 19:33:47,052][49750] Updated weights for policy 0, policy_version 253441 (0.0034) [2024-04-26 19:33:47,062][49517] Fps is (10 sec: 57344.4, 60 sec: 51609.7, 300 sec: 50984.8). Total num frames: 4152377344. Throughput: 0: 50860.6. Samples: 1905235140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:47,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 19:33:50,904][49750] Updated weights for policy 0, policy_version 253451 (0.0029) [2024-04-26 19:33:52,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4152606720. Throughput: 0: 50969.8. Samples: 1905404400. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-26 19:33:52,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 19:33:53,585][49750] Updated weights for policy 0, policy_version 253461 (0.0032) [2024-04-26 19:33:57,062][49517] Fps is (10 sec: 44236.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4152819712. Throughput: 0: 50659.1. Samples: 1905699540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:33:57,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 19:33:57,404][49750] Updated weights for policy 0, policy_version 253471 (0.0031) [2024-04-26 19:34:00,093][49750] Updated weights for policy 0, policy_version 253481 (0.0028) [2024-04-26 19:34:02,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4153081856. Throughput: 0: 50624.5. Samples: 1906002100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:02,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 19:34:03,772][49728] Signal inference workers to stop experience collection... (28300 times) [2024-04-26 19:34:03,807][49750] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-04-26 19:34:03,843][49728] Signal inference workers to resume experience collection... (28300 times) [2024-04-26 19:34:03,843][49750] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-04-26 19:34:03,845][49750] Updated weights for policy 0, policy_version 253491 (0.0024) [2024-04-26 19:34:06,408][49750] Updated weights for policy 0, policy_version 253501 (0.0028) [2024-04-26 19:34:07,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4153360384. Throughput: 0: 50664.6. Samples: 1906150900. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:07,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 19:34:10,118][49750] Updated weights for policy 0, policy_version 253511 (0.0039) [2024-04-26 19:34:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4153606144. Throughput: 0: 50655.0. Samples: 1906454120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:12,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 19:34:12,789][49750] Updated weights for policy 0, policy_version 253521 (0.0029) [2024-04-26 19:34:16,609][49750] Updated weights for policy 0, policy_version 253531 (0.0028) [2024-04-26 19:34:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4153868288. Throughput: 0: 50685.0. Samples: 1906758820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:17,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 19:34:19,593][49750] Updated weights for policy 0, policy_version 253541 (0.0041) [2024-04-26 19:34:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4154097664. Throughput: 0: 50829.8. Samples: 1906910160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:22,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 19:34:23,007][49750] Updated weights for policy 0, policy_version 253551 (0.0033) [2024-04-26 19:34:26,184][49750] Updated weights for policy 0, policy_version 253561 (0.0033) [2024-04-26 19:34:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 4154359808. Throughput: 0: 50674.8. Samples: 1907213100. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:27,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 19:34:29,450][49750] Updated weights for policy 0, policy_version 253571 (0.0032) [2024-04-26 19:34:32,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4154638336. Throughput: 0: 50785.4. Samples: 1907520480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 19:34:32,730][49750] Updated weights for policy 0, policy_version 253581 (0.0033) [2024-04-26 19:34:35,894][49750] Updated weights for policy 0, policy_version 253591 (0.0032) [2024-04-26 19:34:37,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4154884096. Throughput: 0: 50624.1. Samples: 1907682480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:37,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 19:34:39,103][49750] Updated weights for policy 0, policy_version 253601 (0.0033) [2024-04-26 19:34:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4155129856. Throughput: 0: 50926.1. Samples: 1907991220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:42,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 19:34:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000253609_4155129856.pth... [2024-04-26 19:34:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000252867_4142972928.pth [2024-04-26 19:34:42,438][49750] Updated weights for policy 0, policy_version 253611 (0.0037) [2024-04-26 19:34:45,441][49750] Updated weights for policy 0, policy_version 253621 (0.0034) [2024-04-26 19:34:47,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49698.0, 300 sec: 50762.6). Total num frames: 4155359232. Throughput: 0: 50877.2. Samples: 1908291580. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:47,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 19:34:48,810][49750] Updated weights for policy 0, policy_version 253631 (0.0029) [2024-04-26 19:34:50,891][49728] Signal inference workers to stop experience collection... (28350 times) [2024-04-26 19:34:50,891][49728] Signal inference workers to resume experience collection... (28350 times) [2024-04-26 19:34:50,922][49750] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-04-26 19:34:50,922][49750] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-04-26 19:34:51,817][49750] Updated weights for policy 0, policy_version 253641 (0.0032) [2024-04-26 19:34:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4155654144. Throughput: 0: 50868.4. Samples: 1908439980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:52,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 19:34:55,259][49750] Updated weights for policy 0, policy_version 253651 (0.0032) [2024-04-26 19:34:57,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4155899904. Throughput: 0: 50840.2. Samples: 1908741920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:34:57,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 19:34:58,448][49750] Updated weights for policy 0, policy_version 253661 (0.0028) [2024-04-26 19:35:01,688][49750] Updated weights for policy 0, policy_version 253671 (0.0033) [2024-04-26 19:35:02,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4156162048. Throughput: 0: 50952.1. Samples: 1909051660. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-26 19:35:02,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 19:35:04,945][49750] Updated weights for policy 0, policy_version 253681 (0.0029) [2024-04-26 19:35:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4156407808. Throughput: 0: 50988.0. Samples: 1909204620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 19:35:08,124][49750] Updated weights for policy 0, policy_version 253691 (0.0035) [2024-04-26 19:35:11,230][49750] Updated weights for policy 0, policy_version 253701 (0.0026) [2024-04-26 19:35:12,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4156637184. Throughput: 0: 50834.2. Samples: 1909500640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:12,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 19:35:14,495][49750] Updated weights for policy 0, policy_version 253711 (0.0031) [2024-04-26 19:35:17,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4156932096. Throughput: 0: 50985.1. Samples: 1909814820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:17,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 19:35:17,722][49750] Updated weights for policy 0, policy_version 253721 (0.0030) [2024-04-26 19:35:20,860][49750] Updated weights for policy 0, policy_version 253731 (0.0032) [2024-04-26 19:35:22,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4157177856. Throughput: 0: 50869.7. Samples: 1909971620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:22,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 19:35:24,266][49750] Updated weights for policy 0, policy_version 253741 (0.0032) [2024-04-26 19:35:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4157423616. Throughput: 0: 50842.3. Samples: 1910279120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:27,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 19:35:27,358][49750] Updated weights for policy 0, policy_version 253751 (0.0030) [2024-04-26 19:35:30,875][49750] Updated weights for policy 0, policy_version 253761 (0.0030) [2024-04-26 19:35:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4157669376. Throughput: 0: 50969.9. Samples: 1910585220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:32,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 19:35:33,703][49750] Updated weights for policy 0, policy_version 253771 (0.0031) [2024-04-26 19:35:37,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4157931520. Throughput: 0: 51023.9. Samples: 1910736060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:37,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 19:35:37,286][49750] Updated weights for policy 0, policy_version 253781 (0.0029) [2024-04-26 19:35:40,144][49750] Updated weights for policy 0, policy_version 253791 (0.0029) [2024-04-26 19:35:42,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4158193664. Throughput: 0: 51040.4. Samples: 1911038740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:42,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 19:35:43,781][49750] Updated weights for policy 0, policy_version 253801 (0.0034) [2024-04-26 19:35:46,485][49750] Updated weights for policy 0, policy_version 253811 (0.0032) [2024-04-26 19:35:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51609.7, 300 sec: 50929.2). Total num frames: 4158455808. Throughput: 0: 50908.4. Samples: 1911342540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:47,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 19:35:50,187][49750] Updated weights for policy 0, policy_version 253821 (0.0036) [2024-04-26 19:35:52,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50790.2, 300 sec: 50929.2). Total num frames: 4158701568. Throughput: 0: 51048.7. Samples: 1911501820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:52,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 19:35:52,934][49750] Updated weights for policy 0, policy_version 253831 (0.0029) [2024-04-26 19:35:55,385][49728] Signal inference workers to stop experience collection... (28400 times) [2024-04-26 19:35:55,430][49750] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-04-26 19:35:55,444][49728] Signal inference workers to resume experience collection... (28400 times) [2024-04-26 19:35:55,453][49750] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-04-26 19:35:56,549][49750] Updated weights for policy 0, policy_version 253841 (0.0029) [2024-04-26 19:35:57,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4158963712. Throughput: 0: 51208.4. Samples: 1911805020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:35:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 19:35:59,627][49750] Updated weights for policy 0, policy_version 253851 (0.0032) [2024-04-26 19:36:02,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4159209472. Throughput: 0: 51107.7. Samples: 1912114660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:36:02,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:36:02,813][49750] Updated weights for policy 0, policy_version 253861 (0.0036) [2024-04-26 19:36:05,887][49750] Updated weights for policy 0, policy_version 253871 (0.0029) [2024-04-26 19:36:07,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4159455232. Throughput: 0: 50931.3. Samples: 1912263520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:36:07,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 19:36:09,309][49750] Updated weights for policy 0, policy_version 253881 (0.0034) [2024-04-26 19:36:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 4159733760. Throughput: 0: 50798.7. Samples: 1912565060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-26 19:36:12,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:36:12,246][49750] Updated weights for policy 0, policy_version 253891 (0.0034) [2024-04-26 19:36:15,846][49750] Updated weights for policy 0, policy_version 253901 (0.0033) [2024-04-26 19:36:17,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.7, 300 sec: 50984.8). Total num frames: 4159995904. Throughput: 0: 50874.8. Samples: 1912874580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:36:17,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 19:36:18,717][49750] Updated weights for policy 0, policy_version 253911 (0.0032) [2024-04-26 19:36:22,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4160225280. Throughput: 0: 51023.0. Samples: 1913032100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:36:22,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 19:36:22,160][49750] Updated weights for policy 0, policy_version 253921 (0.0029) [2024-04-26 19:36:25,180][49750] Updated weights for policy 0, policy_version 253931 (0.0028) [2024-04-26 19:36:27,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4160471040. Throughput: 0: 51021.0. Samples: 1913334680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:36:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:36:28,543][49750] Updated weights for policy 0, policy_version 253941 (0.0034) [2024-04-26 19:36:31,524][49750] Updated weights for policy 0, policy_version 253951 (0.0032) [2024-04-26 19:36:32,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4160749568. Throughput: 0: 50974.7. Samples: 1913636400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:36:32,063][49517] Avg episode reward: [(0, '0.707')] [2024-04-26 19:36:35,152][49750] Updated weights for policy 0, policy_version 253961 (0.0033) [2024-04-26 19:36:37,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.6, 300 sec: 51040.3). Total num frames: 4161011712. Throughput: 0: 51038.0. Samples: 1913798520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:36:37,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 19:36:38,102][49750] Updated weights for policy 0, policy_version 253971 (0.0033) [2024-04-26 19:36:41,526][49750] Updated weights for policy 0, policy_version 253981 (0.0032) [2024-04-26 19:36:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4161273856. Throughput: 0: 51202.3. Samples: 1914109120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:36:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 19:36:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000253984_4161273856.pth... [2024-04-26 19:36:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000253239_4149067776.pth [2024-04-26 19:36:44,418][49750] Updated weights for policy 0, policy_version 253991 (0.0029) [2024-04-26 19:36:47,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4161486848. Throughput: 0: 50867.5. Samples: 1914403700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:36:47,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 19:36:47,897][49750] Updated weights for policy 0, policy_version 254001 (0.0036) [2024-04-26 19:36:50,860][49750] Updated weights for policy 0, policy_version 254011 (0.0030) [2024-04-26 19:36:52,062][49517] Fps is (10 sec: 45875.0, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4161732608. Throughput: 0: 50872.8. Samples: 1914552800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:36:52,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 19:36:54,399][49750] Updated weights for policy 0, policy_version 254021 (0.0032) [2024-04-26 19:36:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4162011136. Throughput: 0: 50900.5. Samples: 1914855580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:36:57,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 19:36:57,286][49750] Updated weights for policy 0, policy_version 254031 (0.0028) [2024-04-26 19:37:00,839][49750] Updated weights for policy 0, policy_version 254041 (0.0032) [2024-04-26 19:37:02,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 4162289664. Throughput: 0: 50772.5. Samples: 1915159360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:37:02,072][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 19:37:03,881][49750] Updated weights for policy 0, policy_version 254051 (0.0028) [2024-04-26 19:37:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4162519040. Throughput: 0: 50833.2. Samples: 1915319580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:37:07,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 19:37:07,167][49750] Updated weights for policy 0, policy_version 254061 (0.0032) [2024-04-26 19:37:10,463][49750] Updated weights for policy 0, policy_version 254071 (0.0030) [2024-04-26 19:37:12,062][49517] Fps is (10 sec: 45876.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4162748416. Throughput: 0: 50816.4. Samples: 1915621420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:37:12,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 19:37:13,648][49750] Updated weights for policy 0, policy_version 254081 (0.0032) [2024-04-26 19:37:16,989][49750] Updated weights for policy 0, policy_version 254091 (0.0035) [2024-04-26 19:37:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50929.3). Total num frames: 4163026944. Throughput: 0: 50855.5. Samples: 1915924900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:37:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:37:19,483][49728] Signal inference workers to stop experience collection... (28450 times) [2024-04-26 19:37:19,516][49750] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-04-26 19:37:19,551][49728] Signal inference workers to resume experience collection... (28450 times) [2024-04-26 19:37:19,552][49750] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-04-26 19:37:20,141][49750] Updated weights for policy 0, policy_version 254101 (0.0028) [2024-04-26 19:37:22,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4163289088. Throughput: 0: 50812.4. Samples: 1916085080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-26 19:37:22,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:37:23,318][49750] Updated weights for policy 0, policy_version 254111 (0.0028) [2024-04-26 19:37:26,446][49750] Updated weights for policy 0, policy_version 254121 (0.0027) [2024-04-26 19:37:27,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.3, 300 sec: 50929.2). Total num frames: 4163551232. Throughput: 0: 50690.5. Samples: 1916390200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:37:27,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 19:37:29,641][49750] Updated weights for policy 0, policy_version 254131 (0.0029) [2024-04-26 19:37:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4163780608. Throughput: 0: 50989.8. Samples: 1916698240. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:37:32,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 19:37:32,786][49750] Updated weights for policy 0, policy_version 254141 (0.0037) [2024-04-26 19:37:36,253][49750] Updated weights for policy 0, policy_version 254151 (0.0033) [2024-04-26 19:37:37,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4164026368. Throughput: 0: 50712.9. Samples: 1916834880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:37:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 19:37:39,279][49750] Updated weights for policy 0, policy_version 254161 (0.0031) [2024-04-26 19:37:42,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.1, 300 sec: 50873.7). Total num frames: 4164288512. Throughput: 0: 50761.9. Samples: 1917139880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:37:42,063][49517] Avg episode reward: [(0, '0.680')] [2024-04-26 19:37:42,840][49750] Updated weights for policy 0, policy_version 254171 (0.0031) [2024-04-26 19:37:45,779][49750] Updated weights for policy 0, policy_version 254181 (0.0026) [2024-04-26 19:37:47,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4164567040. Throughput: 0: 50722.0. Samples: 1917441840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:37:47,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 19:37:49,431][49750] Updated weights for policy 0, policy_version 254191 (0.0028) [2024-04-26 19:37:52,049][49750] Updated weights for policy 0, policy_version 254201 (0.0029) [2024-04-26 19:37:52,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4164829184. Throughput: 0: 50755.7. Samples: 1917603600. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:37:52,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 19:37:55,786][49750] Updated weights for policy 0, policy_version 254211 (0.0029) [2024-04-26 19:37:57,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4165042176. Throughput: 0: 50911.9. Samples: 1917912460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:37:57,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 19:37:58,520][49750] Updated weights for policy 0, policy_version 254221 (0.0032) [2024-04-26 19:38:02,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4165287936. Throughput: 0: 50874.2. Samples: 1918214240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:38:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:38:02,381][49750] Updated weights for policy 0, policy_version 254231 (0.0027) [2024-04-26 19:38:05,076][49750] Updated weights for policy 0, policy_version 254241 (0.0027) [2024-04-26 19:38:07,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4165566464. Throughput: 0: 50596.6. Samples: 1918361920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:38:07,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 19:38:08,940][49750] Updated weights for policy 0, policy_version 254251 (0.0033) [2024-04-26 19:38:11,074][49728] Signal inference workers to stop experience collection... (28500 times) [2024-04-26 19:38:11,114][49750] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-04-26 19:38:11,133][49728] Signal inference workers to resume experience collection... 
(28500 times) [2024-04-26 19:38:11,141][49750] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-04-26 19:38:11,500][49750] Updated weights for policy 0, policy_version 254261 (0.0038) [2024-04-26 19:38:12,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4165828608. Throughput: 0: 50614.0. Samples: 1918667820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:38:12,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 19:38:15,318][49750] Updated weights for policy 0, policy_version 254271 (0.0033) [2024-04-26 19:38:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4166057984. Throughput: 0: 50540.1. Samples: 1918972540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:38:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 19:38:17,959][49750] Updated weights for policy 0, policy_version 254281 (0.0033) [2024-04-26 19:38:21,731][49750] Updated weights for policy 0, policy_version 254291 (0.0028) [2024-04-26 19:38:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4166320128. Throughput: 0: 50717.3. Samples: 1919117160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:38:22,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 19:38:24,445][49750] Updated weights for policy 0, policy_version 254301 (0.0032) [2024-04-26 19:38:27,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4166565888. Throughput: 0: 50723.3. Samples: 1919422420. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:38:27,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 19:38:28,398][49750] Updated weights for policy 0, policy_version 254311 (0.0029) [2024-04-26 19:38:30,754][49750] Updated weights for policy 0, policy_version 254321 (0.0031) [2024-04-26 19:38:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4166844416. Throughput: 0: 50678.6. Samples: 1919722380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:38:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:38:34,989][49750] Updated weights for policy 0, policy_version 254331 (0.0028) [2024-04-26 19:38:37,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4167106560. Throughput: 0: 50700.1. Samples: 1919885100. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:38:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 19:38:37,144][49750] Updated weights for policy 0, policy_version 254341 (0.0032) [2024-04-26 19:38:41,300][49750] Updated weights for policy 0, policy_version 254351 (0.0031) [2024-04-26 19:38:42,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.6, 300 sec: 50651.6). Total num frames: 4167319552. Throughput: 0: 50673.9. Samples: 1920192780. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:38:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 19:38:42,099][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000254354_4167335936.pth... [2024-04-26 19:38:42,142][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000253609_4155129856.pth [2024-04-26 19:38:43,655][49750] Updated weights for policy 0, policy_version 254361 (0.0038) [2024-04-26 19:38:47,063][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4167565312. Throughput: 0: 50635.6. Samples: 1920492840. 
Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:38:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 19:38:47,672][49750] Updated weights for policy 0, policy_version 254371 (0.0031) [2024-04-26 19:38:50,275][49750] Updated weights for policy 0, policy_version 254381 (0.0032) [2024-04-26 19:38:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.4, 300 sec: 50873.7). Total num frames: 4167827456. Throughput: 0: 50567.5. Samples: 1920637460. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:38:52,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 19:38:54,354][49750] Updated weights for policy 0, policy_version 254391 (0.0029) [2024-04-26 19:38:56,835][49750] Updated weights for policy 0, policy_version 254401 (0.0030) [2024-04-26 19:38:57,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 4168105984. Throughput: 0: 50565.8. Samples: 1920943280. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:38:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 19:39:00,711][49750] Updated weights for policy 0, policy_version 254411 (0.0032) [2024-04-26 19:39:02,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4168351744. Throughput: 0: 50645.8. Samples: 1921251600. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:39:02,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 19:39:03,460][49750] Updated weights for policy 0, policy_version 254421 (0.0033) [2024-04-26 19:39:03,464][49728] Signal inference workers to stop experience collection... (28550 times) [2024-04-26 19:39:03,464][49728] Signal inference workers to resume experience collection... 
(28550 times) [2024-04-26 19:39:03,481][49750] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-04-26 19:39:03,482][49750] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-04-26 19:39:07,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4168581120. Throughput: 0: 50706.3. Samples: 1921398940. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:39:07,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 19:39:07,184][49750] Updated weights for policy 0, policy_version 254431 (0.0034) [2024-04-26 19:39:09,907][49750] Updated weights for policy 0, policy_version 254441 (0.0033) [2024-04-26 19:39:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 4168859648. Throughput: 0: 50791.0. Samples: 1921708020. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:39:12,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 19:39:13,520][49750] Updated weights for policy 0, policy_version 254451 (0.0035) [2024-04-26 19:39:16,315][49750] Updated weights for policy 0, policy_version 254461 (0.0032) [2024-04-26 19:39:17,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4169105408. Throughput: 0: 50755.5. Samples: 1922006380. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:39:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 19:39:19,903][49750] Updated weights for policy 0, policy_version 254471 (0.0032) [2024-04-26 19:39:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4169383936. Throughput: 0: 50813.3. Samples: 1922171700. 
Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:39:22,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 19:39:22,636][49750] Updated weights for policy 0, policy_version 254481 (0.0036) [2024-04-26 19:39:26,299][49750] Updated weights for policy 0, policy_version 254491 (0.0031) [2024-04-26 19:39:27,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4169613312. Throughput: 0: 50797.4. Samples: 1922478660. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:39:27,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 19:39:29,097][49750] Updated weights for policy 0, policy_version 254501 (0.0031) [2024-04-26 19:39:32,062][49517] Fps is (10 sec: 45876.0, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4169842688. Throughput: 0: 50791.3. Samples: 1922778440. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:39:32,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 19:39:32,828][49750] Updated weights for policy 0, policy_version 254511 (0.0037) [2024-04-26 19:39:35,473][49750] Updated weights for policy 0, policy_version 254521 (0.0030) [2024-04-26 19:39:37,062][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4170137600. Throughput: 0: 50760.8. Samples: 1922921700. Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:39:37,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 19:39:39,300][49750] Updated weights for policy 0, policy_version 254531 (0.0033) [2024-04-26 19:39:42,005][49750] Updated weights for policy 0, policy_version 254541 (0.0036) [2024-04-26 19:39:42,062][49517] Fps is (10 sec: 55705.1, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4170399744. Throughput: 0: 50648.4. Samples: 1923222460. 
Policy #0 lag: (min: 2.0, avg: 9.1, max: 21.0) [2024-04-26 19:39:42,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 19:39:45,770][49750] Updated weights for policy 0, policy_version 254551 (0.0033) [2024-04-26 19:39:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4170629120. Throughput: 0: 50599.0. Samples: 1923528560. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:39:47,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:39:48,392][49750] Updated weights for policy 0, policy_version 254561 (0.0035) [2024-04-26 19:39:52,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4170874880. Throughput: 0: 50699.8. Samples: 1923680440. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:39:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:39:52,285][49750] Updated weights for policy 0, policy_version 254571 (0.0027) [2024-04-26 19:39:54,908][49750] Updated weights for policy 0, policy_version 254581 (0.0032) [2024-04-26 19:39:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4171120640. Throughput: 0: 50652.5. Samples: 1923987380. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:39:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 19:39:58,615][49728] Signal inference workers to stop experience collection... (28600 times) [2024-04-26 19:39:58,667][49750] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-04-26 19:39:58,688][49728] Signal inference workers to resume experience collection... 
(28600 times) [2024-04-26 19:39:58,688][49750] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-04-26 19:39:58,692][49750] Updated weights for policy 0, policy_version 254591 (0.0032) [2024-04-26 19:40:01,313][49750] Updated weights for policy 0, policy_version 254601 (0.0032) [2024-04-26 19:40:02,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4171399168. Throughput: 0: 50735.2. Samples: 1924289460. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 19:40:05,070][49750] Updated weights for policy 0, policy_version 254611 (0.0033) [2024-04-26 19:40:07,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4171644928. Throughput: 0: 50682.4. Samples: 1924452400. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:07,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 19:40:07,813][49750] Updated weights for policy 0, policy_version 254621 (0.0028) [2024-04-26 19:40:11,480][49750] Updated weights for policy 0, policy_version 254631 (0.0037) [2024-04-26 19:40:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4171923456. Throughput: 0: 50660.2. Samples: 1924758380. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:12,063][49517] Avg episode reward: [(0, '0.697')] [2024-04-26 19:40:14,194][49750] Updated weights for policy 0, policy_version 254641 (0.0031) [2024-04-26 19:40:17,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4172136448. Throughput: 0: 50698.0. Samples: 1925059860. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:17,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-26 19:40:17,983][49750] Updated weights for policy 0, policy_version 254651 (0.0033) [2024-04-26 19:40:20,596][49750] Updated weights for policy 0, policy_version 254661 (0.0028) [2024-04-26 19:40:22,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4172414976. Throughput: 0: 50751.5. Samples: 1925205520. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:22,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 19:40:24,273][49750] Updated weights for policy 0, policy_version 254671 (0.0029) [2024-04-26 19:40:27,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4172677120. Throughput: 0: 50831.2. Samples: 1925509860. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:27,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 19:40:27,168][49750] Updated weights for policy 0, policy_version 254681 (0.0031) [2024-04-26 19:40:30,839][49750] Updated weights for policy 0, policy_version 254691 (0.0033) [2024-04-26 19:40:32,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4172922880. Throughput: 0: 50603.9. Samples: 1925805740. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:32,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 19:40:33,732][49750] Updated weights for policy 0, policy_version 254701 (0.0034) [2024-04-26 19:40:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4173152256. Throughput: 0: 50732.3. Samples: 1925963380. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:37,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 19:40:37,285][49750] Updated weights for policy 0, policy_version 254711 (0.0035) [2024-04-26 19:40:40,067][49750] Updated weights for policy 0, policy_version 254721 (0.0032) [2024-04-26 19:40:42,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 4173398016. Throughput: 0: 50632.0. Samples: 1926265820. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:40:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000254724_4173398016.pth... [2024-04-26 19:40:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000253984_4161273856.pth [2024-04-26 19:40:43,727][49750] Updated weights for policy 0, policy_version 254731 (0.0029) [2024-04-26 19:40:45,052][49728] Signal inference workers to stop experience collection... (28650 times) [2024-04-26 19:40:45,104][49750] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-04-26 19:40:45,124][49728] Signal inference workers to resume experience collection... (28650 times) [2024-04-26 19:40:45,125][49750] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-04-26 19:40:46,564][49750] Updated weights for policy 0, policy_version 254741 (0.0030) [2024-04-26 19:40:47,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4173676544. Throughput: 0: 50679.9. Samples: 1926570060. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:47,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 19:40:50,089][49750] Updated weights for policy 0, policy_version 254751 (0.0029) [2024-04-26 19:40:52,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4173922304. Throughput: 0: 50625.4. Samples: 1926730540. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:52,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 19:40:53,078][49750] Updated weights for policy 0, policy_version 254761 (0.0033) [2024-04-26 19:40:56,592][49750] Updated weights for policy 0, policy_version 254771 (0.0028) [2024-04-26 19:40:57,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4174184448. Throughput: 0: 50606.8. Samples: 1927035680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:40:57,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 19:40:59,607][49750] Updated weights for policy 0, policy_version 254781 (0.0032) [2024-04-26 19:41:02,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4174413824. Throughput: 0: 50666.3. Samples: 1927339840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:41:02,991][49750] Updated weights for policy 0, policy_version 254791 (0.0032) [2024-04-26 19:41:06,139][49750] Updated weights for policy 0, policy_version 254801 (0.0028) [2024-04-26 19:41:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4174675968. Throughput: 0: 50699.7. Samples: 1927487000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 19:41:09,350][49750] Updated weights for policy 0, policy_version 254811 (0.0026) [2024-04-26 19:41:12,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 4174938112. Throughput: 0: 50605.2. Samples: 1927787100. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:12,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 19:41:12,787][49750] Updated weights for policy 0, policy_version 254821 (0.0031) [2024-04-26 19:41:15,733][49750] Updated weights for policy 0, policy_version 254831 (0.0033) [2024-04-26 19:41:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4175200256. Throughput: 0: 50841.4. Samples: 1928093600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:17,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 19:41:19,237][49750] Updated weights for policy 0, policy_version 254841 (0.0029) [2024-04-26 19:41:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4175462400. Throughput: 0: 50667.5. Samples: 1928243420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:22,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 19:41:22,180][49750] Updated weights for policy 0, policy_version 254851 (0.0033) [2024-04-26 19:41:25,684][49750] Updated weights for policy 0, policy_version 254861 (0.0038) [2024-04-26 19:41:27,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4175691776. Throughput: 0: 50690.3. Samples: 1928546880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 19:41:28,667][49750] Updated weights for policy 0, policy_version 254871 (0.0034) [2024-04-26 19:41:32,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4175953920. Throughput: 0: 50737.3. Samples: 1928853240. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:41:32,092][49750] Updated weights for policy 0, policy_version 254881 (0.0037) [2024-04-26 19:41:35,094][49750] Updated weights for policy 0, policy_version 254891 (0.0027) [2024-04-26 19:41:37,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 4176216064. Throughput: 0: 50645.2. Samples: 1929009580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:41:38,565][49750] Updated weights for policy 0, policy_version 254901 (0.0035) [2024-04-26 19:41:40,541][49728] Signal inference workers to stop experience collection... (28700 times) [2024-04-26 19:41:40,571][49750] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-04-26 19:41:40,605][49728] Signal inference workers to resume experience collection... (28700 times) [2024-04-26 19:41:40,606][49750] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-04-26 19:41:41,651][49750] Updated weights for policy 0, policy_version 254911 (0.0028) [2024-04-26 19:41:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4176461824. Throughput: 0: 50731.1. Samples: 1929318580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:42,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 19:41:44,899][49750] Updated weights for policy 0, policy_version 254921 (0.0027) [2024-04-26 19:41:47,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4176691200. Throughput: 0: 50621.3. Samples: 1929617800. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 19:41:48,277][49750] Updated weights for policy 0, policy_version 254931 (0.0033) [2024-04-26 19:41:51,326][49750] Updated weights for policy 0, policy_version 254941 (0.0026) [2024-04-26 19:41:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4176969728. Throughput: 0: 50706.6. Samples: 1929768800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:52,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 19:41:54,768][49750] Updated weights for policy 0, policy_version 254951 (0.0035) [2024-04-26 19:41:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 4177215488. Throughput: 0: 50883.2. Samples: 1930076840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:41:57,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 19:41:57,709][49750] Updated weights for policy 0, policy_version 254961 (0.0034) [2024-04-26 19:42:01,412][49750] Updated weights for policy 0, policy_version 254971 (0.0030) [2024-04-26 19:42:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4177461248. Throughput: 0: 50827.5. Samples: 1930380840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-26 19:42:02,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 19:42:04,407][49750] Updated weights for policy 0, policy_version 254981 (0.0031) [2024-04-26 19:42:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4177739776. Throughput: 0: 50780.0. Samples: 1930528520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:07,064][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 19:42:07,808][49750] Updated weights for policy 0, policy_version 254991 (0.0038) [2024-04-26 19:42:11,049][49750] Updated weights for policy 0, policy_version 255001 (0.0042) [2024-04-26 19:42:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4177985536. Throughput: 0: 50827.0. Samples: 1930834100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:12,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 19:42:14,188][49750] Updated weights for policy 0, policy_version 255011 (0.0038) [2024-04-26 19:42:17,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4178231296. Throughput: 0: 50799.7. Samples: 1931139220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:17,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:42:17,602][49750] Updated weights for policy 0, policy_version 255021 (0.0039) [2024-04-26 19:42:20,555][49750] Updated weights for policy 0, policy_version 255031 (0.0031) [2024-04-26 19:42:22,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4178493440. Throughput: 0: 50629.3. Samples: 1931287900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 19:42:24,084][49750] Updated weights for policy 0, policy_version 255041 (0.0037) [2024-04-26 19:42:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4178739200. Throughput: 0: 50562.2. Samples: 1931593880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 19:42:27,128][49750] Updated weights for policy 0, policy_version 255051 (0.0031) [2024-04-26 19:42:30,624][49750] Updated weights for policy 0, policy_version 255061 (0.0030) [2024-04-26 19:42:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 4179001344. Throughput: 0: 50708.5. Samples: 1931899680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:32,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 19:42:33,644][49750] Updated weights for policy 0, policy_version 255071 (0.0033) [2024-04-26 19:42:37,002][49750] Updated weights for policy 0, policy_version 255081 (0.0029) [2024-04-26 19:42:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4179247104. Throughput: 0: 50712.5. Samples: 1932050860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:42:40,119][49750] Updated weights for policy 0, policy_version 255091 (0.0032) [2024-04-26 19:42:42,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 4179492864. Throughput: 0: 50482.7. Samples: 1932348560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:42,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:42:42,174][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000255097_4179509248.pth... [2024-04-26 19:42:42,217][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000254354_4167335936.pth [2024-04-26 19:42:43,503][49750] Updated weights for policy 0, policy_version 255101 (0.0028) [2024-04-26 19:42:46,759][49750] Updated weights for policy 0, policy_version 255111 (0.0032) [2024-04-26 19:42:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50596.0). 
Total num frames: 4179755008. Throughput: 0: 50579.6. Samples: 1932656920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 19:42:49,834][49750] Updated weights for policy 0, policy_version 255121 (0.0031) [2024-04-26 19:42:51,535][49728] Signal inference workers to stop experience collection... (28750 times) [2024-04-26 19:42:51,536][49728] Signal inference workers to resume experience collection... (28750 times) [2024-04-26 19:42:51,563][49750] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-04-26 19:42:51,563][49750] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-04-26 19:42:52,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4180000768. Throughput: 0: 50712.2. Samples: 1932810580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:52,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:42:53,114][49750] Updated weights for policy 0, policy_version 255131 (0.0035) [2024-04-26 19:42:56,232][49750] Updated weights for policy 0, policy_version 255141 (0.0034) [2024-04-26 19:42:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4180262912. Throughput: 0: 50632.1. Samples: 1933112540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:42:57,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 19:42:59,538][49750] Updated weights for policy 0, policy_version 255151 (0.0029) [2024-04-26 19:43:02,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50707.0). Total num frames: 4180525056. Throughput: 0: 50680.8. Samples: 1933419860. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:43:02,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 19:43:02,586][49750] Updated weights for policy 0, policy_version 255161 (0.0033) [2024-04-26 19:43:06,039][49750] Updated weights for policy 0, policy_version 255171 (0.0037) [2024-04-26 19:43:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4180770816. Throughput: 0: 50642.7. Samples: 1933566820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:43:07,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 19:43:09,096][49750] Updated weights for policy 0, policy_version 255181 (0.0031) [2024-04-26 19:43:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4181016576. Throughput: 0: 50567.2. Samples: 1933869400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:43:12,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 19:43:12,498][49750] Updated weights for policy 0, policy_version 255191 (0.0029) [2024-04-26 19:43:15,620][49750] Updated weights for policy 0, policy_version 255201 (0.0028) [2024-04-26 19:43:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4181278720. Throughput: 0: 50529.8. Samples: 1934173520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 19:43:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 19:43:18,782][49750] Updated weights for policy 0, policy_version 255211 (0.0031) [2024-04-26 19:43:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4181524480. Throughput: 0: 50642.7. Samples: 1934329780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:43:22,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 19:43:22,270][49750] Updated weights for policy 0, policy_version 255221 (0.0029) [2024-04-26 19:43:25,152][49750] Updated weights for policy 0, policy_version 255231 (0.0032) [2024-04-26 19:43:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4181786624. Throughput: 0: 50874.7. Samples: 1934637920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:43:27,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 19:43:28,636][49750] Updated weights for policy 0, policy_version 255241 (0.0034) [2024-04-26 19:43:31,575][49750] Updated weights for policy 0, policy_version 255251 (0.0033) [2024-04-26 19:43:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 4182032384. Throughput: 0: 50717.9. Samples: 1934939220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:43:32,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 19:43:34,934][49750] Updated weights for policy 0, policy_version 255261 (0.0027) [2024-04-26 19:43:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4182294528. Throughput: 0: 50750.4. Samples: 1935094340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:43:37,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:43:38,138][49750] Updated weights for policy 0, policy_version 255271 (0.0030) [2024-04-26 19:43:41,345][49750] Updated weights for policy 0, policy_version 255281 (0.0028) [2024-04-26 19:43:42,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4182556672. Throughput: 0: 50888.4. Samples: 1935402520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:43:42,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:43:44,763][49750] Updated weights for policy 0, policy_version 255291 (0.0032) [2024-04-26 19:43:47,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4182802432. Throughput: 0: 50767.6. Samples: 1935704400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:43:47,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 19:43:47,772][49750] Updated weights for policy 0, policy_version 255301 (0.0029) [2024-04-26 19:43:51,354][49750] Updated weights for policy 0, policy_version 255311 (0.0033) [2024-04-26 19:43:52,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4183064576. Throughput: 0: 50957.2. Samples: 1935859900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:43:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 19:43:54,088][49750] Updated weights for policy 0, policy_version 255321 (0.0031) [2024-04-26 19:43:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4183310336. Throughput: 0: 50774.6. Samples: 1936154260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:43:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:43:57,827][49750] Updated weights for policy 0, policy_version 255331 (0.0030) [2024-04-26 19:44:00,417][49750] Updated weights for policy 0, policy_version 255341 (0.0034) [2024-04-26 19:44:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4183572480. Throughput: 0: 50711.1. Samples: 1936455520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:44:02,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 19:44:04,188][49750] Updated weights for policy 0, policy_version 255351 (0.0029) [2024-04-26 19:44:06,937][49750] Updated weights for policy 0, policy_version 255361 (0.0031) [2024-04-26 19:44:07,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4183834624. Throughput: 0: 50787.0. Samples: 1936615200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:44:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 19:44:10,524][49750] Updated weights for policy 0, policy_version 255371 (0.0029) [2024-04-26 19:44:12,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4184047616. Throughput: 0: 50768.5. Samples: 1936922500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:44:12,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 19:44:12,356][49728] Signal inference workers to stop experience collection... (28800 times) [2024-04-26 19:44:12,356][49728] Signal inference workers to resume experience collection... (28800 times) [2024-04-26 19:44:12,386][49750] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-04-26 19:44:12,386][49750] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-04-26 19:44:13,628][49750] Updated weights for policy 0, policy_version 255381 (0.0030) [2024-04-26 19:44:17,027][49750] Updated weights for policy 0, policy_version 255391 (0.0037) [2024-04-26 19:44:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4184326144. Throughput: 0: 50765.6. Samples: 1937223680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:44:17,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:44:20,132][49750] Updated weights for policy 0, policy_version 255401 (0.0029) [2024-04-26 19:44:22,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4184571904. Throughput: 0: 50685.0. Samples: 1937375160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:44:22,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 19:44:23,481][49750] Updated weights for policy 0, policy_version 255411 (0.0031) [2024-04-26 19:44:26,480][49750] Updated weights for policy 0, policy_version 255421 (0.0033) [2024-04-26 19:44:27,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4184834048. Throughput: 0: 50581.7. Samples: 1937678700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 19:44:27,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 19:44:29,832][49750] Updated weights for policy 0, policy_version 255431 (0.0029) [2024-04-26 19:44:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4185079808. Throughput: 0: 50640.2. Samples: 1937983200. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:44:32,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 19:44:32,997][49750] Updated weights for policy 0, policy_version 255441 (0.0030) [2024-04-26 19:44:36,250][49750] Updated weights for policy 0, policy_version 255451 (0.0040) [2024-04-26 19:44:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 4185325568. Throughput: 0: 50667.3. Samples: 1938139920. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:44:37,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 19:44:39,367][49750] Updated weights for policy 0, policy_version 255461 (0.0027) [2024-04-26 19:44:42,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4185587712. Throughput: 0: 50724.3. Samples: 1938436860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:44:42,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 19:44:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000255468_4185587712.pth... [2024-04-26 19:44:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000254724_4173398016.pth [2024-04-26 19:44:42,703][49750] Updated weights for policy 0, policy_version 255471 (0.0034) [2024-04-26 19:44:45,850][49750] Updated weights for policy 0, policy_version 255481 (0.0029) [2024-04-26 19:44:47,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4185849856. Throughput: 0: 50842.6. Samples: 1938743440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:44:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 19:44:49,284][49750] Updated weights for policy 0, policy_version 255491 (0.0030) [2024-04-26 19:44:52,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4186112000. Throughput: 0: 50739.2. Samples: 1938898460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:44:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 19:44:52,193][49750] Updated weights for policy 0, policy_version 255501 (0.0029) [2024-04-26 19:44:55,571][49750] Updated weights for policy 0, policy_version 255511 (0.0032) [2024-04-26 19:44:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4186341376. Throughput: 0: 50630.5. Samples: 1939200880. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:44:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:44:58,724][49750] Updated weights for policy 0, policy_version 255521 (0.0030) [2024-04-26 19:45:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4186603520. Throughput: 0: 50645.5. Samples: 1939502720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:45:02,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 19:45:02,204][49750] Updated weights for policy 0, policy_version 255531 (0.0029) [2024-04-26 19:45:05,138][49750] Updated weights for policy 0, policy_version 255541 (0.0030) [2024-04-26 19:45:07,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4186865664. Throughput: 0: 50641.7. Samples: 1939654040. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:45:07,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:45:08,657][49750] Updated weights for policy 0, policy_version 255551 (0.0035) [2024-04-26 19:45:11,857][49750] Updated weights for policy 0, policy_version 255561 (0.0035) [2024-04-26 19:45:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4187111424. Throughput: 0: 50758.5. Samples: 1939962840. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:45:12,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 19:45:15,030][49750] Updated weights for policy 0, policy_version 255571 (0.0031) [2024-04-26 19:45:16,991][49728] Signal inference workers to stop experience collection... (28850 times) [2024-04-26 19:45:17,019][49750] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-04-26 19:45:17,055][49728] Signal inference workers to resume experience collection... 
(28850 times) [2024-04-26 19:45:17,055][49750] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-04-26 19:45:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4187373568. Throughput: 0: 50731.4. Samples: 1940266120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:45:17,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 19:45:18,608][49750] Updated weights for policy 0, policy_version 255581 (0.0032) [2024-04-26 19:45:21,343][49750] Updated weights for policy 0, policy_version 255591 (0.0029) [2024-04-26 19:45:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 4187602944. Throughput: 0: 50682.2. Samples: 1940420620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:45:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 19:45:25,043][49750] Updated weights for policy 0, policy_version 255601 (0.0028) [2024-04-26 19:45:27,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4187881472. Throughput: 0: 50942.4. Samples: 1940729260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:45:27,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 19:45:27,802][49750] Updated weights for policy 0, policy_version 255611 (0.0029) [2024-04-26 19:45:31,428][49750] Updated weights for policy 0, policy_version 255621 (0.0036) [2024-04-26 19:45:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4188110848. Throughput: 0: 50828.1. Samples: 1941030700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:45:32,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 19:45:34,373][49750] Updated weights for policy 0, policy_version 255631 (0.0030) [2024-04-26 19:45:37,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4188389376. Throughput: 0: 50710.1. Samples: 1941180420. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-26 19:45:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 19:45:37,878][49750] Updated weights for policy 0, policy_version 255641 (0.0028) [2024-04-26 19:45:40,745][49750] Updated weights for policy 0, policy_version 255651 (0.0029) [2024-04-26 19:45:42,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4188635136. Throughput: 0: 50830.2. Samples: 1941488240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:45:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 19:45:44,399][49750] Updated weights for policy 0, policy_version 255661 (0.0030) [2024-04-26 19:45:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4188897280. Throughput: 0: 50982.2. Samples: 1941796920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:45:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 19:45:47,511][49750] Updated weights for policy 0, policy_version 255671 (0.0036) [2024-04-26 19:45:50,809][49750] Updated weights for policy 0, policy_version 255681 (0.0038) [2024-04-26 19:45:52,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4189159424. Throughput: 0: 50743.0. Samples: 1941937480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:45:52,063][49517] Avg episode reward: [(0, '0.476')] [2024-04-26 19:45:54,026][49750] Updated weights for policy 0, policy_version 255691 (0.0030) [2024-04-26 19:45:57,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4189372416. Throughput: 0: 50589.7. Samples: 1942239380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:45:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 19:45:57,234][49750] Updated weights for policy 0, policy_version 255701 (0.0036) [2024-04-26 19:46:00,320][49750] Updated weights for policy 0, policy_version 255711 (0.0037) [2024-04-26 19:46:02,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4189667328. Throughput: 0: 50729.7. Samples: 1942548960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:02,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:46:03,769][49750] Updated weights for policy 0, policy_version 255721 (0.0029) [2024-04-26 19:46:06,677][49750] Updated weights for policy 0, policy_version 255731 (0.0031) [2024-04-26 19:46:07,062][49517] Fps is (10 sec: 52430.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4189896704. Throughput: 0: 50787.6. Samples: 1942706060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:07,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 19:46:10,134][49750] Updated weights for policy 0, policy_version 255741 (0.0029) [2024-04-26 19:46:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4190158848. Throughput: 0: 50776.5. Samples: 1943014200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 19:46:13,202][49750] Updated weights for policy 0, policy_version 255751 (0.0035) [2024-04-26 19:46:16,655][49750] Updated weights for policy 0, policy_version 255761 (0.0031) [2024-04-26 19:46:17,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 4190404608. Throughput: 0: 50826.0. Samples: 1943317880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:17,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 19:46:19,742][49750] Updated weights for policy 0, policy_version 255771 (0.0030) [2024-04-26 19:46:22,062][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4190650368. Throughput: 0: 50778.2. Samples: 1943465440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:22,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 19:46:23,294][49750] Updated weights for policy 0, policy_version 255781 (0.0030) [2024-04-26 19:46:26,035][49750] Updated weights for policy 0, policy_version 255791 (0.0039) [2024-04-26 19:46:27,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4190928896. Throughput: 0: 50597.8. Samples: 1943765140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 19:46:29,710][49750] Updated weights for policy 0, policy_version 255801 (0.0036) [2024-04-26 19:46:31,886][49728] Signal inference workers to stop experience collection... (28900 times) [2024-04-26 19:46:31,887][49728] Signal inference workers to resume experience collection... (28900 times) [2024-04-26 19:46:31,898][49750] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-04-26 19:46:31,898][49750] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-04-26 19:46:32,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4191191040. Throughput: 0: 50601.4. Samples: 1944073980. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:32,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 19:46:32,339][49750] Updated weights for policy 0, policy_version 255811 (0.0030) [2024-04-26 19:46:36,269][49750] Updated weights for policy 0, policy_version 255821 (0.0028) [2024-04-26 19:46:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4191436800. Throughput: 0: 50774.8. Samples: 1944222340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:37,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 19:46:38,760][49750] Updated weights for policy 0, policy_version 255831 (0.0033) [2024-04-26 19:46:42,062][49517] Fps is (10 sec: 45874.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4191649792. Throughput: 0: 50790.0. Samples: 1944524920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:42,071][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 19:46:42,113][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000255839_4191666176.pth... [2024-04-26 19:46:42,159][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000255097_4179509248.pth [2024-04-26 19:46:42,882][49750] Updated weights for policy 0, policy_version 255841 (0.0028) [2024-04-26 19:46:45,496][49750] Updated weights for policy 0, policy_version 255851 (0.0027) [2024-04-26 19:46:47,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4191944704. Throughput: 0: 50567.3. Samples: 1944824480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 19:46:47,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 19:46:49,164][49750] Updated weights for policy 0, policy_version 255861 (0.0034) [2024-04-26 19:46:51,905][49750] Updated weights for policy 0, policy_version 255871 (0.0034) [2024-04-26 19:46:52,063][49517] Fps is (10 sec: 54066.9, 60 sec: 50517.4, 300 sec: 50762.6). 
Total num frames: 4192190464. Throughput: 0: 50726.5. Samples: 1944988760. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:46:52,071][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 19:46:55,599][49750] Updated weights for policy 0, policy_version 255881 (0.0038) [2024-04-26 19:46:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 4192436224. Throughput: 0: 50712.0. Samples: 1945296240. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:46:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 19:46:58,435][49750] Updated weights for policy 0, policy_version 255891 (0.0026) [2024-04-26 19:47:02,042][49750] Updated weights for policy 0, policy_version 255901 (0.0032) [2024-04-26 19:47:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 4192681984. Throughput: 0: 50807.2. Samples: 1945604200. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:02,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 19:47:04,935][49750] Updated weights for policy 0, policy_version 255911 (0.0031) [2024-04-26 19:47:07,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4192944128. Throughput: 0: 50763.0. Samples: 1945749780. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:07,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 19:47:08,358][49750] Updated weights for policy 0, policy_version 255921 (0.0030) [2024-04-26 19:47:11,289][49750] Updated weights for policy 0, policy_version 255931 (0.0030) [2024-04-26 19:47:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4193206272. Throughput: 0: 50735.5. Samples: 1946048240. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 19:47:14,748][49750] Updated weights for policy 0, policy_version 255941 (0.0028) [2024-04-26 19:47:17,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4193468416. Throughput: 0: 50673.7. Samples: 1946354300. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:17,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 19:47:17,606][49750] Updated weights for policy 0, policy_version 255951 (0.0031) [2024-04-26 19:47:21,305][49750] Updated weights for policy 0, policy_version 255961 (0.0029) [2024-04-26 19:47:22,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4193714176. Throughput: 0: 50800.4. Samples: 1946508360. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:22,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 19:47:24,043][49750] Updated weights for policy 0, policy_version 255971 (0.0025) [2024-04-26 19:47:27,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 4193927168. Throughput: 0: 50839.6. Samples: 1946812700. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:27,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 19:47:27,755][49750] Updated weights for policy 0, policy_version 255981 (0.0031) [2024-04-26 19:47:30,396][49750] Updated weights for policy 0, policy_version 255991 (0.0035) [2024-04-26 19:47:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4194238464. Throughput: 0: 50989.2. Samples: 1947119000. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:32,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 19:47:34,197][49750] Updated weights for policy 0, policy_version 256001 (0.0034) [2024-04-26 19:47:36,179][49728] Signal inference workers to stop experience collection... 
(28950 times) [2024-04-26 19:47:36,179][49728] Signal inference workers to resume experience collection... (28950 times) [2024-04-26 19:47:36,192][49750] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-04-26 19:47:36,192][49750] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-04-26 19:47:36,734][49750] Updated weights for policy 0, policy_version 256011 (0.0031) [2024-04-26 19:47:37,062][49517] Fps is (10 sec: 57343.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4194500608. Throughput: 0: 50966.3. Samples: 1947282240. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 19:47:40,695][49750] Updated weights for policy 0, policy_version 256021 (0.0030) [2024-04-26 19:47:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4194746368. Throughput: 0: 50852.9. Samples: 1947584620. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:42,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 19:47:43,136][49750] Updated weights for policy 0, policy_version 256031 (0.0033) [2024-04-26 19:47:47,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4194959360. Throughput: 0: 50930.8. Samples: 1947896080. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:47:47,168][49750] Updated weights for policy 0, policy_version 256041 (0.0027) [2024-04-26 19:47:49,633][49750] Updated weights for policy 0, policy_version 256051 (0.0025) [2024-04-26 19:47:52,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4195221504. Throughput: 0: 50586.8. Samples: 1948026180. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:52,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 19:47:53,620][49750] Updated weights for policy 0, policy_version 256061 (0.0034) [2024-04-26 19:47:56,013][49750] Updated weights for policy 0, policy_version 256071 (0.0033) [2024-04-26 19:47:57,062][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4195483648. Throughput: 0: 50750.2. Samples: 1948332000. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-04-26 19:47:57,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 19:48:00,018][49750] Updated weights for policy 0, policy_version 256081 (0.0037) [2024-04-26 19:48:02,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4195762176. Throughput: 0: 50805.0. Samples: 1948640520. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:02,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 19:48:02,347][49750] Updated weights for policy 0, policy_version 256091 (0.0030) [2024-04-26 19:48:06,490][49750] Updated weights for policy 0, policy_version 256101 (0.0031) [2024-04-26 19:48:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4195975168. Throughput: 0: 50811.7. Samples: 1948794880. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:07,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:48:08,683][49750] Updated weights for policy 0, policy_version 256111 (0.0035) [2024-04-26 19:48:12,062][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 4196220928. Throughput: 0: 50865.3. Samples: 1949101640. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 19:48:12,939][49750] Updated weights for policy 0, policy_version 256121 (0.0029) [2024-04-26 19:48:15,086][49750] Updated weights for policy 0, policy_version 256131 (0.0027) [2024-04-26 19:48:17,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4196499456. Throughput: 0: 50899.6. Samples: 1949409480. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:17,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:48:19,360][49750] Updated weights for policy 0, policy_version 256141 (0.0032) [2024-04-26 19:48:21,418][49750] Updated weights for policy 0, policy_version 256151 (0.0032) [2024-04-26 19:48:22,062][49517] Fps is (10 sec: 57344.1, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4196794368. Throughput: 0: 50678.7. Samples: 1949562780. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:22,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 19:48:25,922][49750] Updated weights for policy 0, policy_version 256161 (0.0029) [2024-04-26 19:48:27,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51609.5, 300 sec: 50818.1). Total num frames: 4197023744. Throughput: 0: 50814.6. Samples: 1949871280. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:27,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 19:48:27,881][49750] Updated weights for policy 0, policy_version 256171 (0.0027) [2024-04-26 19:48:32,063][49517] Fps is (10 sec: 44235.7, 60 sec: 49971.0, 300 sec: 50651.5). Total num frames: 4197236736. Throughput: 0: 50770.7. Samples: 1950180780. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:48:32,418][49750] Updated weights for policy 0, policy_version 256181 (0.0034) [2024-04-26 19:48:33,554][49728] Signal inference workers to stop experience collection... 
(29000 times) [2024-04-26 19:48:33,555][49728] Signal inference workers to resume experience collection... (29000 times) [2024-04-26 19:48:33,580][49750] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-04-26 19:48:33,580][49750] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-04-26 19:48:34,228][49750] Updated weights for policy 0, policy_version 256191 (0.0034) [2024-04-26 19:48:37,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 4197498880. Throughput: 0: 50669.8. Samples: 1950306320. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:48:38,974][49750] Updated weights for policy 0, policy_version 256201 (0.0033) [2024-04-26 19:48:40,865][49750] Updated weights for policy 0, policy_version 256211 (0.0028) [2024-04-26 19:48:42,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 4197777408. Throughput: 0: 50691.7. Samples: 1950613140. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:42,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:48:42,142][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000256213_4197793792.pth... [2024-04-26 19:48:42,191][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000255468_4185587712.pth [2024-04-26 19:48:45,397][49750] Updated weights for policy 0, policy_version 256221 (0.0030) [2024-04-26 19:48:47,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51609.4, 300 sec: 50818.2). Total num frames: 4198055936. Throughput: 0: 50644.7. Samples: 1950919540. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:47,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 19:48:47,567][49750] Updated weights for policy 0, policy_version 256231 (0.0030) [2024-04-26 19:48:51,797][49750] Updated weights for policy 0, policy_version 256241 (0.0036) [2024-04-26 19:48:52,062][49517] Fps is (10 sec: 49153.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4198268928. Throughput: 0: 50855.1. Samples: 1951083360. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 19:48:53,977][49750] Updated weights for policy 0, policy_version 256251 (0.0030) [2024-04-26 19:48:57,063][49517] Fps is (10 sec: 45875.1, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 4198514688. Throughput: 0: 50674.5. Samples: 1951382000. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:48:57,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 19:48:58,194][49750] Updated weights for policy 0, policy_version 256261 (0.0033) [2024-04-26 19:49:00,731][49750] Updated weights for policy 0, policy_version 256271 (0.0035) [2024-04-26 19:49:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4198793216. Throughput: 0: 50573.6. Samples: 1951685300. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:49:02,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 19:49:04,705][49750] Updated weights for policy 0, policy_version 256281 (0.0029) [2024-04-26 19:49:07,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4199055360. Throughput: 0: 50846.7. Samples: 1951850880. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-26 19:49:07,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 19:49:07,186][49750] Updated weights for policy 0, policy_version 256291 (0.0034) [2024-04-26 19:49:11,077][49750] Updated weights for policy 0, policy_version 256301 (0.0028) [2024-04-26 19:49:12,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4199317504. Throughput: 0: 50763.1. Samples: 1952155620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:12,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 19:49:13,546][49750] Updated weights for policy 0, policy_version 256311 (0.0028) [2024-04-26 19:49:17,062][49517] Fps is (10 sec: 45875.2, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4199514112. Throughput: 0: 50603.8. Samples: 1952457940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:17,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 19:49:17,401][49750] Updated weights for policy 0, policy_version 256321 (0.0026) [2024-04-26 19:49:20,229][49750] Updated weights for policy 0, policy_version 256331 (0.0028) [2024-04-26 19:49:22,062][49517] Fps is (10 sec: 45875.7, 60 sec: 49698.1, 300 sec: 50651.6). Total num frames: 4199776256. Throughput: 0: 50901.8. Samples: 1952596900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:22,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 19:49:23,766][49750] Updated weights for policy 0, policy_version 256341 (0.0027) [2024-04-26 19:49:24,204][49728] Signal inference workers to stop experience collection... (29050 times) [2024-04-26 19:49:24,248][49750] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-04-26 19:49:24,309][49728] Signal inference workers to resume experience collection... 
(29050 times) [2024-04-26 19:49:24,310][49750] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-04-26 19:49:26,695][49750] Updated weights for policy 0, policy_version 256351 (0.0031) [2024-04-26 19:49:27,062][49517] Fps is (10 sec: 55705.4, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4200071168. Throughput: 0: 50798.9. Samples: 1952899080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:27,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 19:49:30,269][49750] Updated weights for policy 0, policy_version 256361 (0.0031) [2024-04-26 19:49:32,062][49517] Fps is (10 sec: 55706.2, 60 sec: 51609.9, 300 sec: 50873.7). Total num frames: 4200333312. Throughput: 0: 50707.3. Samples: 1953201360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 19:49:33,019][49750] Updated weights for policy 0, policy_version 256371 (0.0033) [2024-04-26 19:49:36,857][49750] Updated weights for policy 0, policy_version 256381 (0.0036) [2024-04-26 19:49:37,063][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4200562688. Throughput: 0: 50865.7. Samples: 1953372320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:37,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 19:49:39,460][49750] Updated weights for policy 0, policy_version 256391 (0.0033) [2024-04-26 19:49:42,062][49517] Fps is (10 sec: 45874.9, 60 sec: 50244.5, 300 sec: 50651.6). Total num frames: 4200792064. Throughput: 0: 50749.1. Samples: 1953665700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:42,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 19:49:43,386][49750] Updated weights for policy 0, policy_version 256401 (0.0032) [2024-04-26 19:49:45,987][49750] Updated weights for policy 0, policy_version 256411 (0.0035) [2024-04-26 19:49:47,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4201070592. Throughput: 0: 50770.3. Samples: 1953969960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 19:49:49,759][49750] Updated weights for policy 0, policy_version 256421 (0.0034) [2024-04-26 19:49:52,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 4201332736. Throughput: 0: 50730.9. Samples: 1954133780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:49:52,529][49750] Updated weights for policy 0, policy_version 256431 (0.0029) [2024-04-26 19:49:56,140][49750] Updated weights for policy 0, policy_version 256441 (0.0030) [2024-04-26 19:49:57,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 4201611264. Throughput: 0: 50749.0. Samples: 1954439320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:49:57,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 19:49:58,908][49750] Updated weights for policy 0, policy_version 256451 (0.0031) [2024-04-26 19:50:02,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4201807872. Throughput: 0: 50829.4. Samples: 1954745260. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:50:02,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 19:50:02,501][49750] Updated weights for policy 0, policy_version 256461 (0.0035) [2024-04-26 19:50:05,417][49750] Updated weights for policy 0, policy_version 256471 (0.0032) [2024-04-26 19:50:07,062][49517] Fps is (10 sec: 45875.6, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4202070016. Throughput: 0: 50755.2. Samples: 1954880880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:50:07,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 19:50:08,910][49750] Updated weights for policy 0, policy_version 256481 (0.0028) [2024-04-26 19:50:11,975][49750] Updated weights for policy 0, policy_version 256491 (0.0029) [2024-04-26 19:50:12,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4202348544. Throughput: 0: 50789.9. Samples: 1955184620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:50:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 19:50:15,455][49750] Updated weights for policy 0, policy_version 256501 (0.0035) [2024-04-26 19:50:16,194][49728] Signal inference workers to stop experience collection... (29100 times) [2024-04-26 19:50:16,195][49728] Signal inference workers to resume experience collection... (29100 times) [2024-04-26 19:50:16,227][49750] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-04-26 19:50:16,228][49750] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-04-26 19:50:17,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4202610688. Throughput: 0: 50658.6. Samples: 1955481000. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 19:50:17,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 19:50:18,443][49750] Updated weights for policy 0, policy_version 256511 (0.0035) [2024-04-26 19:50:21,907][49750] Updated weights for policy 0, policy_version 256521 (0.0031) [2024-04-26 19:50:22,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4202840064. Throughput: 0: 50664.5. Samples: 1955652220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:50:22,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 19:50:24,820][49750] Updated weights for policy 0, policy_version 256531 (0.0030) [2024-04-26 19:50:27,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4203085824. Throughput: 0: 50796.4. Samples: 1955951540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:50:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:50:28,228][49750] Updated weights for policy 0, policy_version 256541 (0.0035) [2024-04-26 19:50:31,335][49750] Updated weights for policy 0, policy_version 256551 (0.0030) [2024-04-26 19:50:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4203347968. Throughput: 0: 50690.8. Samples: 1956251040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:50:32,063][49517] Avg episode reward: [(0, '0.685')] [2024-04-26 19:50:34,868][49750] Updated weights for policy 0, policy_version 256561 (0.0033) [2024-04-26 19:50:37,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4203626496. Throughput: 0: 50709.5. Samples: 1956415700. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:50:37,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 19:50:37,854][49750] Updated weights for policy 0, policy_version 256571 (0.0030) [2024-04-26 19:50:41,297][49750] Updated weights for policy 0, policy_version 256581 (0.0038) [2024-04-26 19:50:42,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4203888640. Throughput: 0: 50637.3. Samples: 1956718000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:50:42,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 19:50:42,174][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000256586_4203905024.pth... [2024-04-26 19:50:42,222][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000255839_4191666176.pth [2024-04-26 19:50:44,167][49750] Updated weights for policy 0, policy_version 256591 (0.0034) [2024-04-26 19:50:47,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.4, 300 sec: 50596.1). Total num frames: 4204085248. Throughput: 0: 50600.0. Samples: 1957022260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:50:47,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 19:50:47,617][49750] Updated weights for policy 0, policy_version 256601 (0.0034) [2024-04-26 19:50:50,437][49750] Updated weights for policy 0, policy_version 256611 (0.0029) [2024-04-26 19:50:52,063][49517] Fps is (10 sec: 45874.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4204347392. Throughput: 0: 50791.8. Samples: 1957166520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:50:52,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 19:50:53,961][49750] Updated weights for policy 0, policy_version 256621 (0.0030) [2024-04-26 19:50:56,995][49750] Updated weights for policy 0, policy_version 256631 (0.0033) [2024-04-26 19:50:57,063][49517] Fps is (10 sec: 55704.4, 60 sec: 50517.2, 300 sec: 50762.6). 
Total num frames: 4204642304. Throughput: 0: 50851.8. Samples: 1957472960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:50:57,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 19:51:00,460][49750] Updated weights for policy 0, policy_version 256641 (0.0037) [2024-04-26 19:51:02,063][49517] Fps is (10 sec: 55705.8, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4204904448. Throughput: 0: 50981.6. Samples: 1957775180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:51:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 19:51:03,793][49750] Updated weights for policy 0, policy_version 256651 (0.0031) [2024-04-26 19:51:06,776][49750] Updated weights for policy 0, policy_version 256661 (0.0028) [2024-04-26 19:51:07,062][49517] Fps is (10 sec: 49153.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4205133824. Throughput: 0: 50837.0. Samples: 1957939880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:51:07,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 19:51:10,177][49750] Updated weights for policy 0, policy_version 256671 (0.0035) [2024-04-26 19:51:12,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4205395968. Throughput: 0: 50948.5. Samples: 1958244220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:51:12,063][49517] Avg episode reward: [(0, '0.694')] [2024-04-26 19:51:13,153][49750] Updated weights for policy 0, policy_version 256681 (0.0034) [2024-04-26 19:51:16,771][49750] Updated weights for policy 0, policy_version 256691 (0.0035) [2024-04-26 19:51:17,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4205625344. Throughput: 0: 51044.8. Samples: 1958548060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:51:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 19:51:17,838][49728] Signal inference workers to stop experience collection... 
(29150 times) [2024-04-26 19:51:17,838][49728] Signal inference workers to resume experience collection... (29150 times) [2024-04-26 19:51:17,863][49750] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-04-26 19:51:17,864][49750] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-04-26 19:51:19,549][49750] Updated weights for policy 0, policy_version 256701 (0.0039) [2024-04-26 19:51:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4205903872. Throughput: 0: 50661.8. Samples: 1958695480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:51:22,063][49517] Avg episode reward: [(0, '0.475')] [2024-04-26 19:51:23,384][49750] Updated weights for policy 0, policy_version 256711 (0.0034) [2024-04-26 19:51:26,106][49750] Updated weights for policy 0, policy_version 256721 (0.0030) [2024-04-26 19:51:27,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51609.5, 300 sec: 50818.1). Total num frames: 4206182400. Throughput: 0: 50855.9. Samples: 1959006520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-04-26 19:51:27,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 19:51:29,814][49750] Updated weights for policy 0, policy_version 256731 (0.0030) [2024-04-26 19:51:32,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4206395392. Throughput: 0: 50770.6. Samples: 1959306940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:51:32,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 19:51:32,743][49750] Updated weights for policy 0, policy_version 256741 (0.0031) [2024-04-26 19:51:36,360][49750] Updated weights for policy 0, policy_version 256751 (0.0030) [2024-04-26 19:51:37,062][49517] Fps is (10 sec: 45876.1, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4206641152. Throughput: 0: 50724.7. Samples: 1959449120. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:51:37,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:51:39,160][49750] Updated weights for policy 0, policy_version 256761 (0.0036) [2024-04-26 19:51:42,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4206919680. Throughput: 0: 50659.4. Samples: 1959752620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:51:42,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 19:51:42,989][49750] Updated weights for policy 0, policy_version 256771 (0.0033) [2024-04-26 19:51:45,515][49750] Updated weights for policy 0, policy_version 256781 (0.0030) [2024-04-26 19:51:47,063][49517] Fps is (10 sec: 54065.8, 60 sec: 51609.4, 300 sec: 50818.2). Total num frames: 4207181824. Throughput: 0: 50748.4. Samples: 1960058860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:51:47,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 19:51:49,342][49750] Updated weights for policy 0, policy_version 256791 (0.0034) [2024-04-26 19:51:51,995][49750] Updated weights for policy 0, policy_version 256801 (0.0032) [2024-04-26 19:51:52,063][49517] Fps is (10 sec: 50788.9, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4207427584. Throughput: 0: 50713.5. Samples: 1960222000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:51:52,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 19:51:55,685][49750] Updated weights for policy 0, policy_version 256811 (0.0032) [2024-04-26 19:51:57,062][49517] Fps is (10 sec: 47514.9, 60 sec: 50244.5, 300 sec: 50762.7). Total num frames: 4207656960. Throughput: 0: 50695.2. Samples: 1960525500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:51:57,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 19:51:58,419][49750] Updated weights for policy 0, policy_version 256821 (0.0031) [2024-04-26 19:52:02,062][49517] Fps is (10 sec: 47515.1, 60 sec: 49971.4, 300 sec: 50707.1). Total num frames: 4207902720. Throughput: 0: 50677.5. Samples: 1960828540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:52:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 19:52:02,150][49750] Updated weights for policy 0, policy_version 256831 (0.0029) [2024-04-26 19:52:04,836][49750] Updated weights for policy 0, policy_version 256841 (0.0034) [2024-04-26 19:52:07,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4208181248. Throughput: 0: 50708.9. Samples: 1960977380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:52:07,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 19:52:08,668][49750] Updated weights for policy 0, policy_version 256851 (0.0032) [2024-04-26 19:52:11,351][49750] Updated weights for policy 0, policy_version 256861 (0.0036) [2024-04-26 19:52:12,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4208443392. Throughput: 0: 50660.7. Samples: 1961286240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:52:12,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 19:52:15,064][49750] Updated weights for policy 0, policy_version 256871 (0.0032) [2024-04-26 19:52:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4208672768. Throughput: 0: 50716.0. Samples: 1961589160. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:52:17,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 19:52:17,641][49750] Updated weights for policy 0, policy_version 256881 (0.0030) [2024-04-26 19:52:21,421][49750] Updated weights for policy 0, policy_version 256891 (0.0033) [2024-04-26 19:52:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4208934912. Throughput: 0: 50845.3. Samples: 1961737160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:52:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 19:52:24,084][49750] Updated weights for policy 0, policy_version 256901 (0.0030) [2024-04-26 19:52:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4209180672. Throughput: 0: 50815.4. Samples: 1962039320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:52:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 19:52:27,866][49750] Updated weights for policy 0, policy_version 256911 (0.0032) [2024-04-26 19:52:29,933][49728] Signal inference workers to stop experience collection... (29200 times) [2024-04-26 19:52:29,933][49728] Signal inference workers to resume experience collection... (29200 times) [2024-04-26 19:52:29,953][49750] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-04-26 19:52:29,953][49750] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-04-26 19:52:30,650][49750] Updated weights for policy 0, policy_version 256921 (0.0028) [2024-04-26 19:52:32,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4209475584. Throughput: 0: 50743.4. Samples: 1962342300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:52:32,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 19:52:34,299][49750] Updated weights for policy 0, policy_version 256931 (0.0029) [2024-04-26 19:52:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4209688576. Throughput: 0: 50771.3. Samples: 1962506700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 19:52:37,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 19:52:37,264][49750] Updated weights for policy 0, policy_version 256941 (0.0029) [2024-04-26 19:52:40,639][49750] Updated weights for policy 0, policy_version 256951 (0.0032) [2024-04-26 19:52:42,062][49517] Fps is (10 sec: 45875.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4209934336. Throughput: 0: 50732.8. Samples: 1962808480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:52:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 19:52:42,152][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000256955_4209950720.pth... [2024-04-26 19:52:42,195][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000256213_4197793792.pth [2024-04-26 19:52:43,797][49750] Updated weights for policy 0, policy_version 256961 (0.0037) [2024-04-26 19:52:47,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4210196480. Throughput: 0: 50626.5. Samples: 1963106740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:52:47,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 19:52:47,101][49750] Updated weights for policy 0, policy_version 256971 (0.0028) [2024-04-26 19:52:50,149][49750] Updated weights for policy 0, policy_version 256981 (0.0029) [2024-04-26 19:52:52,063][49517] Fps is (10 sec: 54066.1, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4210475008. Throughput: 0: 50858.5. Samples: 1963266020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:52:52,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 19:52:53,571][49750] Updated weights for policy 0, policy_version 256991 (0.0032) [2024-04-26 19:52:56,774][49750] Updated weights for policy 0, policy_version 257001 (0.0031) [2024-04-26 19:52:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4210720768. Throughput: 0: 50749.1. Samples: 1963569960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:52:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:52:59,885][49750] Updated weights for policy 0, policy_version 257011 (0.0038) [2024-04-26 19:53:02,062][49517] Fps is (10 sec: 49153.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4210966528. Throughput: 0: 50727.6. Samples: 1963871900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:02,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 19:53:03,043][49750] Updated weights for policy 0, policy_version 257021 (0.0033) [2024-04-26 19:53:06,246][49750] Updated weights for policy 0, policy_version 257031 (0.0038) [2024-04-26 19:53:07,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4211228672. Throughput: 0: 50805.7. Samples: 1964023420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:07,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 19:53:09,451][49750] Updated weights for policy 0, policy_version 257041 (0.0040) [2024-04-26 19:53:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4211458048. Throughput: 0: 50735.2. Samples: 1964322400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:12,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-26 19:53:12,879][49750] Updated weights for policy 0, policy_version 257051 (0.0035) [2024-04-26 19:53:15,853][49750] Updated weights for policy 0, policy_version 257061 (0.0032) [2024-04-26 19:53:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 4211736576. Throughput: 0: 50837.7. Samples: 1964630000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 19:53:19,184][49750] Updated weights for policy 0, policy_version 257071 (0.0030) [2024-04-26 19:53:22,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4211998720. Throughput: 0: 50668.5. Samples: 1964786780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 19:53:22,356][49750] Updated weights for policy 0, policy_version 257081 (0.0029) [2024-04-26 19:53:25,514][49750] Updated weights for policy 0, policy_version 257091 (0.0028) [2024-04-26 19:53:27,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4212228096. Throughput: 0: 50675.8. Samples: 1965088900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:27,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:53:28,688][49750] Updated weights for policy 0, policy_version 257101 (0.0032) [2024-04-26 19:53:31,914][49750] Updated weights for policy 0, policy_version 257111 (0.0036) [2024-04-26 19:53:32,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4212506624. Throughput: 0: 50850.1. Samples: 1965395000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:32,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:53:35,127][49750] Updated weights for policy 0, policy_version 257121 (0.0036) [2024-04-26 19:53:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4212752384. Throughput: 0: 50832.2. Samples: 1965553460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:37,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 19:53:38,344][49750] Updated weights for policy 0, policy_version 257131 (0.0031) [2024-04-26 19:53:40,379][49728] Signal inference workers to stop experience collection... (29250 times) [2024-04-26 19:53:40,379][49728] Signal inference workers to resume experience collection... (29250 times) [2024-04-26 19:53:40,406][49750] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-04-26 19:53:40,406][49750] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-04-26 19:53:41,933][49750] Updated weights for policy 0, policy_version 257141 (0.0032) [2024-04-26 19:53:42,062][49517] Fps is (10 sec: 49153.0, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 4212998144. Throughput: 0: 50797.1. Samples: 1965855820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:42,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 19:53:44,654][49750] Updated weights for policy 0, policy_version 257151 (0.0029) [2024-04-26 19:53:47,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4213243904. Throughput: 0: 50880.2. Samples: 1966161520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:47,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 19:53:48,219][49750] Updated weights for policy 0, policy_version 257161 (0.0033) [2024-04-26 19:53:51,006][49750] Updated weights for policy 0, policy_version 257171 (0.0029) [2024-04-26 19:53:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.6, 300 sec: 50873.8). Total num frames: 4213522432. Throughput: 0: 50727.3. Samples: 1966306140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 19:53:52,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 19:53:54,714][49750] Updated weights for policy 0, policy_version 257181 (0.0032) [2024-04-26 19:53:57,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4213768192. Throughput: 0: 50983.1. Samples: 1966616640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:53:57,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:53:57,392][49750] Updated weights for policy 0, policy_version 257191 (0.0028) [2024-04-26 19:54:01,184][49750] Updated weights for policy 0, policy_version 257201 (0.0031) [2024-04-26 19:54:02,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4213997568. Throughput: 0: 50868.5. Samples: 1966919080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:02,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 19:54:03,858][49750] Updated weights for policy 0, policy_version 257211 (0.0027) [2024-04-26 19:54:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4214276096. Throughput: 0: 50753.4. Samples: 1967070680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:07,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 19:54:07,475][49750] Updated weights for policy 0, policy_version 257221 (0.0034) [2024-04-26 19:54:10,636][49750] Updated weights for policy 0, policy_version 257231 (0.0037) [2024-04-26 19:54:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4214521856. Throughput: 0: 50882.4. Samples: 1967378600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:12,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 19:54:13,791][49750] Updated weights for policy 0, policy_version 257241 (0.0029) [2024-04-26 19:54:17,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4214767616. Throughput: 0: 50715.1. Samples: 1967677180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 19:54:17,233][49750] Updated weights for policy 0, policy_version 257251 (0.0030) [2024-04-26 19:54:20,497][49750] Updated weights for policy 0, policy_version 257261 (0.0031) [2024-04-26 19:54:22,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4215029760. Throughput: 0: 50700.3. Samples: 1967834980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:22,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 19:54:23,646][49750] Updated weights for policy 0, policy_version 257271 (0.0028) [2024-04-26 19:54:26,962][49750] Updated weights for policy 0, policy_version 257281 (0.0033) [2024-04-26 19:54:27,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4215291904. Throughput: 0: 50788.6. Samples: 1968141320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:27,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 19:54:30,073][49750] Updated weights for policy 0, policy_version 257291 (0.0029) [2024-04-26 19:54:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4215521280. Throughput: 0: 50891.7. Samples: 1968451640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 19:54:33,252][49750] Updated weights for policy 0, policy_version 257301 (0.0027) [2024-04-26 19:54:36,407][49750] Updated weights for policy 0, policy_version 257311 (0.0027) [2024-04-26 19:54:37,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4215799808. Throughput: 0: 50998.5. Samples: 1968601080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:37,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 19:54:39,828][49750] Updated weights for policy 0, policy_version 257321 (0.0031) [2024-04-26 19:54:42,063][49517] Fps is (10 sec: 55705.4, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4216078336. Throughput: 0: 50876.3. Samples: 1968906080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:42,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 19:54:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000257329_4216078336.pth... [2024-04-26 19:54:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000256586_4203905024.pth [2024-04-26 19:54:42,776][49750] Updated weights for policy 0, policy_version 257331 (0.0036) [2024-04-26 19:54:46,313][49750] Updated weights for policy 0, policy_version 257341 (0.0028) [2024-04-26 19:54:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4216291328. Throughput: 0: 50807.1. Samples: 1969205400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:47,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 19:54:47,191][49728] Signal inference workers to stop experience collection... (29300 times) [2024-04-26 19:54:47,195][49728] Signal inference workers to resume experience collection... (29300 times) [2024-04-26 19:54:47,214][49750] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-04-26 19:54:47,215][49750] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-04-26 19:54:49,230][49750] Updated weights for policy 0, policy_version 257351 (0.0030) [2024-04-26 19:54:52,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 4216537088. Throughput: 0: 50721.7. Samples: 1969353160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 19:54:53,049][49750] Updated weights for policy 0, policy_version 257361 (0.0030) [2024-04-26 19:54:55,525][49750] Updated weights for policy 0, policy_version 257371 (0.0027) [2024-04-26 19:54:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4216799232. Throughput: 0: 50560.9. Samples: 1969653840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:54:57,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 19:54:59,356][49750] Updated weights for policy 0, policy_version 257381 (0.0036) [2024-04-26 19:55:02,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4217077760. Throughput: 0: 50716.3. Samples: 1969959400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 19:55:02,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 19:55:02,067][49750] Updated weights for policy 0, policy_version 257391 (0.0032) [2024-04-26 19:55:05,823][49750] Updated weights for policy 0, policy_version 257401 (0.0029) [2024-04-26 19:55:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4217323520. Throughput: 0: 50692.2. Samples: 1970116120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:07,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 19:55:08,548][49750] Updated weights for policy 0, policy_version 257411 (0.0026) [2024-04-26 19:55:12,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4217569280. Throughput: 0: 50810.3. Samples: 1970427780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:12,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 19:55:12,229][49750] Updated weights for policy 0, policy_version 257421 (0.0032) [2024-04-26 19:55:15,041][49750] Updated weights for policy 0, policy_version 257431 (0.0027) [2024-04-26 19:55:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4217831424. Throughput: 0: 50605.4. Samples: 1970728880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:17,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 19:55:18,586][49750] Updated weights for policy 0, policy_version 257441 (0.0036) [2024-04-26 19:55:21,694][49750] Updated weights for policy 0, policy_version 257451 (0.0040) [2024-04-26 19:55:22,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4218077184. Throughput: 0: 50593.1. Samples: 1970877760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 19:55:24,979][49750] Updated weights for policy 0, policy_version 257461 (0.0034) [2024-04-26 19:55:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4218339328. Throughput: 0: 50755.2. Samples: 1971190060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:27,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:55:28,037][49750] Updated weights for policy 0, policy_version 257471 (0.0030) [2024-04-26 19:55:31,375][49750] Updated weights for policy 0, policy_version 257481 (0.0030) [2024-04-26 19:55:32,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4218601472. Throughput: 0: 50756.1. Samples: 1971489420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:32,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 19:55:34,614][49750] Updated weights for policy 0, policy_version 257491 (0.0040) [2024-04-26 19:55:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4218830848. Throughput: 0: 50881.3. Samples: 1971642820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 19:55:38,120][49750] Updated weights for policy 0, policy_version 257501 (0.0028) [2024-04-26 19:55:41,087][49750] Updated weights for policy 0, policy_version 257511 (0.0028) [2024-04-26 19:55:42,063][49517] Fps is (10 sec: 47512.1, 60 sec: 49971.1, 300 sec: 50818.1). Total num frames: 4219076608. Throughput: 0: 50918.8. Samples: 1971945200. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:42,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 19:55:44,511][49750] Updated weights for policy 0, policy_version 257521 (0.0038) [2024-04-26 19:55:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4219355136. Throughput: 0: 50936.7. Samples: 1972251560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:47,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:55:47,445][49750] Updated weights for policy 0, policy_version 257531 (0.0025) [2024-04-26 19:55:50,856][49750] Updated weights for policy 0, policy_version 257541 (0.0032) [2024-04-26 19:55:51,182][49728] Signal inference workers to stop experience collection... (29350 times) [2024-04-26 19:55:51,186][49728] Signal inference workers to resume experience collection... (29350 times) [2024-04-26 19:55:51,197][49750] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-04-26 19:55:51,198][49750] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-04-26 19:55:52,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4219617280. Throughput: 0: 51037.5. Samples: 1972412820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:52,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 19:55:53,755][49750] Updated weights for policy 0, policy_version 257551 (0.0030) [2024-04-26 19:55:57,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 4219846656. Throughput: 0: 50875.9. Samples: 1972717200. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:55:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 19:55:57,237][49750] Updated weights for policy 0, policy_version 257561 (0.0029) [2024-04-26 19:56:00,256][49750] Updated weights for policy 0, policy_version 257571 (0.0032) [2024-04-26 19:56:02,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 4220125184. Throughput: 0: 50943.4. Samples: 1973021340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:56:02,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 19:56:03,526][49750] Updated weights for policy 0, policy_version 257581 (0.0027) [2024-04-26 19:56:06,804][49750] Updated weights for policy 0, policy_version 257591 (0.0035) [2024-04-26 19:56:07,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4220370944. Throughput: 0: 50904.8. Samples: 1973168480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:56:07,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 19:56:09,902][49750] Updated weights for policy 0, policy_version 257601 (0.0033) [2024-04-26 19:56:12,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4220633088. Throughput: 0: 50864.7. Samples: 1973478980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 19:56:12,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 19:56:13,131][49750] Updated weights for policy 0, policy_version 257611 (0.0028) [2024-04-26 19:56:16,244][49750] Updated weights for policy 0, policy_version 257621 (0.0035) [2024-04-26 19:56:17,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4220911616. Throughput: 0: 50911.8. Samples: 1973780460. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:56:17,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 19:56:19,610][49750] Updated weights for policy 0, policy_version 257631 (0.0026) [2024-04-26 19:56:22,063][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4221140992. Throughput: 0: 51041.3. Samples: 1973939680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:56:22,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 19:56:22,687][49750] Updated weights for policy 0, policy_version 257641 (0.0031) [2024-04-26 19:56:26,523][49750] Updated weights for policy 0, policy_version 257651 (0.0029) [2024-04-26 19:56:27,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4221386752. Throughput: 0: 51261.9. Samples: 1974251980. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:56:27,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 19:56:28,954][49750] Updated weights for policy 0, policy_version 257661 (0.0038) [2024-04-26 19:56:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4221632512. Throughput: 0: 51104.5. Samples: 1974551260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:56:32,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 19:56:32,922][49750] Updated weights for policy 0, policy_version 257671 (0.0033) [2024-04-26 19:56:35,507][49750] Updated weights for policy 0, policy_version 257681 (0.0031) [2024-04-26 19:56:37,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4221927424. Throughput: 0: 50942.4. Samples: 1974705220. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:56:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 19:56:39,206][49750] Updated weights for policy 0, policy_version 257691 (0.0033) [2024-04-26 19:56:41,938][49750] Updated weights for policy 0, policy_version 257701 (0.0027) [2024-04-26 19:56:42,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 4222173184. Throughput: 0: 51057.4. Samples: 1975014780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:56:42,072][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 19:56:42,179][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000257702_4222189568.pth... [2024-04-26 19:56:42,239][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000256955_4209950720.pth [2024-04-26 19:56:45,645][49750] Updated weights for policy 0, policy_version 257711 (0.0035) [2024-04-26 19:56:47,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4222402560. Throughput: 0: 51052.5. Samples: 1975318700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:56:47,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 19:56:48,445][49750] Updated weights for policy 0, policy_version 257721 (0.0036) [2024-04-26 19:56:52,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4222648320. Throughput: 0: 50948.9. Samples: 1975461180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:56:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 19:56:52,186][49750] Updated weights for policy 0, policy_version 257731 (0.0032) [2024-04-26 19:56:54,932][49750] Updated weights for policy 0, policy_version 257741 (0.0033) [2024-04-26 19:56:57,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4222910464. Throughput: 0: 50735.5. Samples: 1975762080. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:56:57,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 19:56:58,507][49750] Updated weights for policy 0, policy_version 257751 (0.0033) [2024-04-26 19:57:01,252][49750] Updated weights for policy 0, policy_version 257761 (0.0030) [2024-04-26 19:57:02,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51336.7, 300 sec: 50929.3). Total num frames: 4223205376. Throughput: 0: 50905.4. Samples: 1976071200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:57:02,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 19:57:04,767][49728] Signal inference workers to stop experience collection... (29400 times) [2024-04-26 19:57:04,767][49728] Signal inference workers to resume experience collection... (29400 times) [2024-04-26 19:57:04,797][49750] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-04-26 19:57:04,797][49750] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-04-26 19:57:04,900][49750] Updated weights for policy 0, policy_version 257771 (0.0034) [2024-04-26 19:57:07,063][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4223451136. Throughput: 0: 51040.5. Samples: 1976236500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:57:07,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 19:57:07,521][49750] Updated weights for policy 0, policy_version 257781 (0.0028) [2024-04-26 19:57:11,620][49750] Updated weights for policy 0, policy_version 257791 (0.0032) [2024-04-26 19:57:12,063][49517] Fps is (10 sec: 47512.2, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4223680512. Throughput: 0: 50779.4. Samples: 1976537060. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:57:12,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 19:57:14,031][49750] Updated weights for policy 0, policy_version 257801 (0.0029) [2024-04-26 19:57:17,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4223909888. Throughput: 0: 50859.5. Samples: 1976839940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:57:17,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 19:57:18,031][49750] Updated weights for policy 0, policy_version 257811 (0.0040) [2024-04-26 19:57:20,616][49750] Updated weights for policy 0, policy_version 257821 (0.0027) [2024-04-26 19:57:22,062][49517] Fps is (10 sec: 50791.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4224188416. Throughput: 0: 50664.5. Samples: 1976985120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 19:57:22,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 19:57:24,441][49750] Updated weights for policy 0, policy_version 257831 (0.0034) [2024-04-26 19:57:26,928][49750] Updated weights for policy 0, policy_version 257841 (0.0030) [2024-04-26 19:57:27,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4224466944. Throughput: 0: 50715.8. Samples: 1977296980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:57:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 19:57:30,884][49750] Updated weights for policy 0, policy_version 257851 (0.0031) [2024-04-26 19:57:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4224696320. Throughput: 0: 50799.3. Samples: 1977604660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:57:32,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 19:57:33,393][49750] Updated weights for policy 0, policy_version 257861 (0.0031) [2024-04-26 19:57:37,063][49517] Fps is (10 sec: 45874.5, 60 sec: 49971.2, 300 sec: 50818.1). Total num frames: 4224925696. Throughput: 0: 50803.4. Samples: 1977747340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:57:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 19:57:37,365][49750] Updated weights for policy 0, policy_version 257871 (0.0029) [2024-04-26 19:57:39,833][49750] Updated weights for policy 0, policy_version 257881 (0.0034) [2024-04-26 19:57:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4225204224. Throughput: 0: 50988.1. Samples: 1978056540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:57:42,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 19:57:43,763][49750] Updated weights for policy 0, policy_version 257891 (0.0030) [2024-04-26 19:57:46,308][49750] Updated weights for policy 0, policy_version 257901 (0.0032) [2024-04-26 19:57:47,062][49517] Fps is (10 sec: 55706.6, 60 sec: 51336.7, 300 sec: 50873.8). Total num frames: 4225482752. Throughput: 0: 50813.0. Samples: 1978357780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:57:47,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 19:57:50,184][49750] Updated weights for policy 0, policy_version 257911 (0.0037) [2024-04-26 19:57:52,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51609.5, 300 sec: 50929.3). Total num frames: 4225744896. Throughput: 0: 50854.7. Samples: 1978524960. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:57:52,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 19:57:52,680][49750] Updated weights for policy 0, policy_version 257921 (0.0028) [2024-04-26 19:57:56,569][49750] Updated weights for policy 0, policy_version 257931 (0.0032) [2024-04-26 19:57:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4225957888. Throughput: 0: 50833.7. Samples: 1978824560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:57:57,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 19:57:59,144][49750] Updated weights for policy 0, policy_version 257941 (0.0034) [2024-04-26 19:58:02,063][49517] Fps is (10 sec: 45874.6, 60 sec: 49971.0, 300 sec: 50762.6). Total num frames: 4226203648. Throughput: 0: 50871.0. Samples: 1979129140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:58:02,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 19:58:03,045][49750] Updated weights for policy 0, policy_version 257951 (0.0033) [2024-04-26 19:58:05,459][49750] Updated weights for policy 0, policy_version 257961 (0.0027) [2024-04-26 19:58:07,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4226465792. Throughput: 0: 50873.3. Samples: 1979274420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:58:07,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 19:58:09,548][49750] Updated weights for policy 0, policy_version 257971 (0.0029) [2024-04-26 19:58:11,728][49728] Signal inference workers to stop experience collection... (29450 times) [2024-04-26 19:58:11,765][49750] InferenceWorker_p0-w0: stopping experience collection (29450 times) [2024-04-26 19:58:11,799][49728] Signal inference workers to resume experience collection... 
(29450 times) [2024-04-26 19:58:11,800][49750] InferenceWorker_p0-w0: resuming experience collection (29450 times) [2024-04-26 19:58:11,937][49750] Updated weights for policy 0, policy_version 257981 (0.0026) [2024-04-26 19:58:12,062][49517] Fps is (10 sec: 55706.3, 60 sec: 51336.7, 300 sec: 50929.2). Total num frames: 4226760704. Throughput: 0: 50692.3. Samples: 1979578140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:58:12,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 19:58:15,855][49750] Updated weights for policy 0, policy_version 257991 (0.0031) [2024-04-26 19:58:17,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 4227006464. Throughput: 0: 50876.8. Samples: 1979894120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:58:17,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 19:58:18,199][49750] Updated weights for policy 0, policy_version 258001 (0.0032) [2024-04-26 19:58:22,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4227235840. Throughput: 0: 51022.7. Samples: 1980043360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:58:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 19:58:22,174][49750] Updated weights for policy 0, policy_version 258011 (0.0031) [2024-04-26 19:58:24,603][49750] Updated weights for policy 0, policy_version 258021 (0.0031) [2024-04-26 19:58:27,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4227465216. Throughput: 0: 50915.7. Samples: 1980347740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:58:27,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 19:58:28,649][49750] Updated weights for policy 0, policy_version 258031 (0.0026) [2024-04-26 19:58:31,217][49750] Updated weights for policy 0, policy_version 258041 (0.0031) [2024-04-26 19:58:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4227760128. Throughput: 0: 50914.7. Samples: 1980648940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 19:58:32,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 19:58:35,134][49750] Updated weights for policy 0, policy_version 258051 (0.0032) [2024-04-26 19:58:37,063][49517] Fps is (10 sec: 57343.6, 60 sec: 51882.7, 300 sec: 50984.8). Total num frames: 4228038656. Throughput: 0: 51008.5. Samples: 1980820340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:58:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 19:58:37,548][49750] Updated weights for policy 0, policy_version 258061 (0.0033) [2024-04-26 19:58:41,525][49750] Updated weights for policy 0, policy_version 258071 (0.0032) [2024-04-26 19:58:42,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4228251648. Throughput: 0: 50992.7. Samples: 1981119240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:58:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 19:58:42,275][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000258074_4228284416.pth... [2024-04-26 19:58:42,321][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000257329_4216078336.pth [2024-04-26 19:58:43,961][49750] Updated weights for policy 0, policy_version 258081 (0.0028) [2024-04-26 19:58:47,063][49517] Fps is (10 sec: 45874.8, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4228497408. Throughput: 0: 51032.5. Samples: 1981425600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:58:47,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 19:58:47,892][49750] Updated weights for policy 0, policy_version 258091 (0.0035) [2024-04-26 19:58:50,505][49750] Updated weights for policy 0, policy_version 258101 (0.0032) [2024-04-26 19:58:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4228775936. Throughput: 0: 50824.9. Samples: 1981561540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:58:52,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 19:58:54,356][49750] Updated weights for policy 0, policy_version 258111 (0.0037) [2024-04-26 19:58:56,866][49750] Updated weights for policy 0, policy_version 258121 (0.0034) [2024-04-26 19:58:57,063][49517] Fps is (10 sec: 55705.7, 60 sec: 51609.5, 300 sec: 51040.3). Total num frames: 4229054464. Throughput: 0: 50947.9. Samples: 1981870800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:58:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 19:59:00,578][49728] Signal inference workers to stop experience collection... (29500 times) [2024-04-26 19:59:00,622][49750] InferenceWorker_p0-w0: stopping experience collection (29500 times) [2024-04-26 19:59:00,685][49728] Signal inference workers to resume experience collection... (29500 times) [2024-04-26 19:59:00,685][49750] InferenceWorker_p0-w0: resuming experience collection (29500 times) [2024-04-26 19:59:00,809][49750] Updated weights for policy 0, policy_version 258131 (0.0032) [2024-04-26 19:59:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4229283840. Throughput: 0: 50606.7. Samples: 1982171420. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:59:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 19:59:03,292][49750] Updated weights for policy 0, policy_version 258141 (0.0032) [2024-04-26 19:59:07,063][49517] Fps is (10 sec: 47513.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4229529600. Throughput: 0: 50752.5. Samples: 1982327220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:59:07,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 19:59:07,209][49750] Updated weights for policy 0, policy_version 258151 (0.0035) [2024-04-26 19:59:09,656][49750] Updated weights for policy 0, policy_version 258161 (0.0027) [2024-04-26 19:59:12,062][49517] Fps is (10 sec: 47513.0, 60 sec: 49971.2, 300 sec: 50818.2). Total num frames: 4229758976. Throughput: 0: 50714.1. Samples: 1982629880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:59:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 19:59:13,585][49750] Updated weights for policy 0, policy_version 258171 (0.0034) [2024-04-26 19:59:16,141][49750] Updated weights for policy 0, policy_version 258181 (0.0030) [2024-04-26 19:59:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4230037504. Throughput: 0: 50586.6. Samples: 1982925340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:59:17,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 19:59:20,084][49750] Updated weights for policy 0, policy_version 258191 (0.0034) [2024-04-26 19:59:22,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4230299648. Throughput: 0: 50582.0. Samples: 1983096540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:59:22,064][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 19:59:22,831][49750] Updated weights for policy 0, policy_version 258201 (0.0031) [2024-04-26 19:59:26,613][49750] Updated weights for policy 0, policy_version 258211 (0.0030) [2024-04-26 19:59:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4230545408. Throughput: 0: 50641.9. Samples: 1983398120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:59:27,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 19:59:29,870][49750] Updated weights for policy 0, policy_version 258221 (0.0028) [2024-04-26 19:59:32,062][49517] Fps is (10 sec: 45876.2, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4230758400. Throughput: 0: 50527.3. Samples: 1983699320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:59:32,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 19:59:33,033][49750] Updated weights for policy 0, policy_version 258231 (0.0030) [2024-04-26 19:59:36,227][49750] Updated weights for policy 0, policy_version 258241 (0.0037) [2024-04-26 19:59:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4231036928. Throughput: 0: 50685.8. Samples: 1983842400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:59:37,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 19:59:39,338][49750] Updated weights for policy 0, policy_version 258251 (0.0029) [2024-04-26 19:59:42,063][49517] Fps is (10 sec: 57343.3, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4231331840. Throughput: 0: 50642.2. Samples: 1984149700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 19:59:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 19:59:42,570][49750] Updated weights for policy 0, policy_version 258261 (0.0034) [2024-04-26 19:59:45,837][49750] Updated weights for policy 0, policy_version 258271 (0.0031) [2024-04-26 19:59:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4231561216. Throughput: 0: 50683.5. Samples: 1984452180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 19:59:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 19:59:49,193][49750] Updated weights for policy 0, policy_version 258281 (0.0032) [2024-04-26 19:59:51,720][49728] Signal inference workers to stop experience collection... (29550 times) [2024-04-26 19:59:51,721][49728] Signal inference workers to resume experience collection... (29550 times) [2024-04-26 19:59:51,747][49750] InferenceWorker_p0-w0: stopping experience collection (29550 times) [2024-04-26 19:59:51,748][49750] InferenceWorker_p0-w0: resuming experience collection (29550 times) [2024-04-26 19:59:52,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.2, 300 sec: 50929.2). Total num frames: 4231823360. Throughput: 0: 50704.3. Samples: 1984608920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 19:59:52,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 19:59:52,381][49750] Updated weights for policy 0, policy_version 258291 (0.0028) [2024-04-26 19:59:55,719][49750] Updated weights for policy 0, policy_version 258301 (0.0040) [2024-04-26 19:59:57,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49698.2, 300 sec: 50707.1). Total num frames: 4232036352. Throughput: 0: 50796.5. Samples: 1984915720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 19:59:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 19:59:58,844][49750] Updated weights for policy 0, policy_version 258311 (0.0029) [2024-04-26 20:00:02,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4232331264. Throughput: 0: 50880.1. Samples: 1985214940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 20:00:02,066][49750] Updated weights for policy 0, policy_version 258321 (0.0034) [2024-04-26 20:00:05,311][49750] Updated weights for policy 0, policy_version 258331 (0.0029) [2024-04-26 20:00:07,062][49517] Fps is (10 sec: 55705.2, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 4232593408. Throughput: 0: 50622.4. Samples: 1985374540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 20:00:08,599][49750] Updated weights for policy 0, policy_version 258341 (0.0031) [2024-04-26 20:00:11,910][49750] Updated weights for policy 0, policy_version 258351 (0.0031) [2024-04-26 20:00:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4232839168. Throughput: 0: 50712.9. Samples: 1985680200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:12,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 20:00:15,119][49750] Updated weights for policy 0, policy_version 258361 (0.0031) [2024-04-26 20:00:17,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4233068544. Throughput: 0: 50847.0. Samples: 1985987440. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:17,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 20:00:18,214][49750] Updated weights for policy 0, policy_version 258371 (0.0035) [2024-04-26 20:00:21,432][49750] Updated weights for policy 0, policy_version 258381 (0.0035) [2024-04-26 20:00:22,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4233314304. Throughput: 0: 50826.1. Samples: 1986129580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 20:00:24,574][49750] Updated weights for policy 0, policy_version 258391 (0.0036) [2024-04-26 20:00:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4233592832. Throughput: 0: 50707.6. Samples: 1986431540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:27,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:00:27,989][49750] Updated weights for policy 0, policy_version 258401 (0.0027) [2024-04-26 20:00:31,097][49750] Updated weights for policy 0, policy_version 258411 (0.0032) [2024-04-26 20:00:32,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4233854976. Throughput: 0: 50821.9. Samples: 1986739160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:32,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:00:34,837][49750] Updated weights for policy 0, policy_version 258421 (0.0027) [2024-04-26 20:00:37,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50873.8). Total num frames: 4234084352. Throughput: 0: 50898.9. Samples: 1986899360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:37,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 20:00:37,668][49750] Updated weights for policy 0, policy_version 258431 (0.0042) [2024-04-26 20:00:41,286][49750] Updated weights for policy 0, policy_version 258441 (0.0032) [2024-04-26 20:00:42,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4234330112. Throughput: 0: 50804.0. Samples: 1987201900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:00:42,110][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000258444_4234346496.pth... [2024-04-26 20:00:42,156][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000257702_4222189568.pth [2024-04-26 20:00:43,972][49750] Updated weights for policy 0, policy_version 258451 (0.0034) [2024-04-26 20:00:47,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4234592256. Throughput: 0: 50855.0. Samples: 1987503420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:00:47,905][49750] Updated weights for policy 0, policy_version 258461 (0.0029) [2024-04-26 20:00:50,373][49750] Updated weights for policy 0, policy_version 258471 (0.0031) [2024-04-26 20:00:52,063][49517] Fps is (10 sec: 55704.7, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4234887168. Throughput: 0: 50779.4. Samples: 1987659620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:52,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 20:00:54,232][49750] Updated weights for policy 0, policy_version 258481 (0.0032) [2024-04-26 20:00:56,806][49750] Updated weights for policy 0, policy_version 258491 (0.0028) [2024-04-26 20:00:57,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51609.6, 300 sec: 50873.7). 
Total num frames: 4235132928. Throughput: 0: 50940.8. Samples: 1987972540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:00:57,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 20:00:59,362][49728] Signal inference workers to stop experience collection... (29600 times) [2024-04-26 20:00:59,362][49728] Signal inference workers to resume experience collection... (29600 times) [2024-04-26 20:00:59,394][49750] InferenceWorker_p0-w0: stopping experience collection (29600 times) [2024-04-26 20:00:59,394][49750] InferenceWorker_p0-w0: resuming experience collection (29600 times) [2024-04-26 20:01:00,663][49750] Updated weights for policy 0, policy_version 258501 (0.0027) [2024-04-26 20:01:02,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 4235362304. Throughput: 0: 50764.8. Samples: 1988271860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 20:01:03,267][49750] Updated weights for policy 0, policy_version 258511 (0.0030) [2024-04-26 20:01:07,062][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4235591680. Throughput: 0: 50717.4. Samples: 1988411860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:07,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 20:01:07,186][49750] Updated weights for policy 0, policy_version 258521 (0.0037) [2024-04-26 20:01:09,706][49750] Updated weights for policy 0, policy_version 258531 (0.0032) [2024-04-26 20:01:12,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4235870208. Throughput: 0: 50796.8. Samples: 1988717400. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:12,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 20:01:13,594][49750] Updated weights for policy 0, policy_version 258541 (0.0038) [2024-04-26 20:01:16,132][49750] Updated weights for policy 0, policy_version 258551 (0.0028) [2024-04-26 20:01:17,062][49517] Fps is (10 sec: 55705.9, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4236148736. Throughput: 0: 50785.3. Samples: 1989024500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:17,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 20:01:19,947][49750] Updated weights for policy 0, policy_version 258561 (0.0032) [2024-04-26 20:01:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4236378112. Throughput: 0: 50985.3. Samples: 1989193700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:22,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 20:01:22,458][49750] Updated weights for policy 0, policy_version 258571 (0.0038) [2024-04-26 20:01:26,431][49750] Updated weights for policy 0, policy_version 258581 (0.0029) [2024-04-26 20:01:27,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4236623872. Throughput: 0: 50950.1. Samples: 1989494660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:01:28,911][49750] Updated weights for policy 0, policy_version 258591 (0.0035) [2024-04-26 20:01:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4236869632. Throughput: 0: 50976.9. Samples: 1989797380. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:32,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 20:01:32,941][49750] Updated weights for policy 0, policy_version 258601 (0.0030) [2024-04-26 20:01:35,310][49750] Updated weights for policy 0, policy_version 258611 (0.0029) [2024-04-26 20:01:37,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4237164544. Throughput: 0: 50820.5. Samples: 1989946540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:37,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 20:01:39,340][49750] Updated weights for policy 0, policy_version 258621 (0.0031) [2024-04-26 20:01:41,889][49750] Updated weights for policy 0, policy_version 258631 (0.0030) [2024-04-26 20:01:42,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4237426688. Throughput: 0: 50698.5. Samples: 1990253980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:42,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 20:01:45,769][49750] Updated weights for policy 0, policy_version 258641 (0.0025) [2024-04-26 20:01:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4237656064. Throughput: 0: 50869.5. Samples: 1990560980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 20:01:48,117][49728] Signal inference workers to stop experience collection... (29650 times) [2024-04-26 20:01:48,118][49728] Signal inference workers to resume experience collection... 
(29650 times) [2024-04-26 20:01:48,146][49750] InferenceWorker_p0-w0: stopping experience collection (29650 times) [2024-04-26 20:01:48,146][49750] InferenceWorker_p0-w0: resuming experience collection (29650 times) [2024-04-26 20:01:48,247][49750] Updated weights for policy 0, policy_version 258651 (0.0026) [2024-04-26 20:01:52,062][49517] Fps is (10 sec: 44237.5, 60 sec: 49698.3, 300 sec: 50707.1). Total num frames: 4237869056. Throughput: 0: 50754.3. Samples: 1990695800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:52,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 20:01:52,232][49750] Updated weights for policy 0, policy_version 258661 (0.0032) [2024-04-26 20:01:54,664][49750] Updated weights for policy 0, policy_version 258671 (0.0027) [2024-04-26 20:01:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4238163968. Throughput: 0: 50742.3. Samples: 1991000800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:01:57,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 20:01:58,735][49750] Updated weights for policy 0, policy_version 258681 (0.0036) [2024-04-26 20:02:01,088][49750] Updated weights for policy 0, policy_version 258691 (0.0026) [2024-04-26 20:02:02,062][49517] Fps is (10 sec: 57343.7, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4238442496. Throughput: 0: 50686.7. Samples: 1991305400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:02:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 20:02:05,266][49750] Updated weights for policy 0, policy_version 258701 (0.0031) [2024-04-26 20:02:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 4238655488. Throughput: 0: 50722.7. Samples: 1991476220. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 20:02:07,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 20:02:07,482][49750] Updated weights for policy 0, policy_version 258711 (0.0025) [2024-04-26 20:02:11,504][49750] Updated weights for policy 0, policy_version 258721 (0.0035) [2024-04-26 20:02:12,063][49517] Fps is (10 sec: 45874.3, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4238901248. Throughput: 0: 50825.6. Samples: 1991781820. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:12,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 20:02:13,901][49750] Updated weights for policy 0, policy_version 258731 (0.0029) [2024-04-26 20:02:17,062][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4239147008. Throughput: 0: 50770.2. Samples: 1992082040. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:17,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 20:02:17,902][49750] Updated weights for policy 0, policy_version 258741 (0.0033) [2024-04-26 20:02:20,350][49750] Updated weights for policy 0, policy_version 258751 (0.0032) [2024-04-26 20:02:22,063][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4239441920. Throughput: 0: 50728.4. Samples: 1992229320. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:22,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 20:02:24,517][49750] Updated weights for policy 0, policy_version 258761 (0.0031) [2024-04-26 20:02:26,706][49750] Updated weights for policy 0, policy_version 258771 (0.0030) [2024-04-26 20:02:27,062][49517] Fps is (10 sec: 57344.4, 60 sec: 51609.7, 300 sec: 50929.2). Total num frames: 4239720448. Throughput: 0: 50609.9. Samples: 1992531420. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:27,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 20:02:30,969][49750] Updated weights for policy 0, policy_version 258781 (0.0031) [2024-04-26 20:02:32,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4239933440. Throughput: 0: 50747.1. Samples: 1992844600. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 20:02:33,082][49750] Updated weights for policy 0, policy_version 258791 (0.0032) [2024-04-26 20:02:37,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4240179200. Throughput: 0: 50890.6. Samples: 1992985880. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:37,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 20:02:37,292][49750] Updated weights for policy 0, policy_version 258801 (0.0029) [2024-04-26 20:02:39,170][49728] Signal inference workers to stop experience collection... (29700 times) [2024-04-26 20:02:39,218][49750] InferenceWorker_p0-w0: stopping experience collection (29700 times) [2024-04-26 20:02:39,236][49728] Signal inference workers to resume experience collection... (29700 times) [2024-04-26 20:02:39,237][49750] InferenceWorker_p0-w0: resuming experience collection (29700 times) [2024-04-26 20:02:39,503][49750] Updated weights for policy 0, policy_version 258811 (0.0026) [2024-04-26 20:02:42,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4240441344. Throughput: 0: 50911.0. Samples: 1993291800. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:42,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 20:02:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000258816_4240441344.pth... 
[2024-04-26 20:02:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000258074_4228284416.pth [2024-04-26 20:02:43,789][49750] Updated weights for policy 0, policy_version 258821 (0.0031) [2024-04-26 20:02:46,173][49750] Updated weights for policy 0, policy_version 258831 (0.0032) [2024-04-26 20:02:47,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4240719872. Throughput: 0: 50734.5. Samples: 1993588460. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:47,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:02:50,351][49750] Updated weights for policy 0, policy_version 258841 (0.0032) [2024-04-26 20:02:52,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4240965632. Throughput: 0: 50831.8. Samples: 1993763660. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:52,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 20:02:52,532][49750] Updated weights for policy 0, policy_version 258851 (0.0031) [2024-04-26 20:02:56,664][49750] Updated weights for policy 0, policy_version 258861 (0.0027) [2024-04-26 20:02:57,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4241195008. Throughput: 0: 50790.9. Samples: 1994067400. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:02:57,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 20:02:59,054][49750] Updated weights for policy 0, policy_version 258871 (0.0027) [2024-04-26 20:03:02,063][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 4241440768. Throughput: 0: 50739.5. Samples: 1994365320. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:03:02,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 20:03:02,974][49750] Updated weights for policy 0, policy_version 258881 (0.0034) [2024-04-26 20:03:05,602][49750] Updated weights for policy 0, policy_version 258891 (0.0039) [2024-04-26 20:03:07,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4241735680. Throughput: 0: 50836.9. Samples: 1994516980. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:03:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 20:03:09,448][49750] Updated weights for policy 0, policy_version 258901 (0.0030) [2024-04-26 20:03:12,035][49750] Updated weights for policy 0, policy_version 258911 (0.0033) [2024-04-26 20:03:12,063][49517] Fps is (10 sec: 55705.3, 60 sec: 51609.6, 300 sec: 50818.1). Total num frames: 4241997824. Throughput: 0: 50902.0. Samples: 1994822020. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:03:12,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 20:03:15,882][49750] Updated weights for policy 0, policy_version 258921 (0.0033) [2024-04-26 20:03:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4242227200. Throughput: 0: 50771.5. Samples: 1995129320. Policy #0 lag: (min: 0.0, avg: 13.3, max: 26.0) [2024-04-26 20:03:17,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 20:03:18,334][49750] Updated weights for policy 0, policy_version 258931 (0.0031) [2024-04-26 20:03:22,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 50818.1). Total num frames: 4242456576. Throughput: 0: 50833.2. Samples: 1995273380. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:03:22,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 20:03:22,310][49750] Updated weights for policy 0, policy_version 258941 (0.0035) [2024-04-26 20:03:24,876][49750] Updated weights for policy 0, policy_version 258951 (0.0031) [2024-04-26 20:03:27,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49698.0, 300 sec: 50651.5). Total num frames: 4242702336. Throughput: 0: 50747.5. Samples: 1995575440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:03:27,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:03:28,660][49750] Updated weights for policy 0, policy_version 258961 (0.0031) [2024-04-26 20:03:31,507][49750] Updated weights for policy 0, policy_version 258971 (0.0029) [2024-04-26 20:03:32,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4242997248. Throughput: 0: 50896.9. Samples: 1995878820. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:03:32,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 20:03:35,130][49750] Updated weights for policy 0, policy_version 258981 (0.0030) [2024-04-26 20:03:37,062][49517] Fps is (10 sec: 54068.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4243243008. Throughput: 0: 50656.1. Samples: 1996043180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:03:37,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 20:03:37,871][49750] Updated weights for policy 0, policy_version 258991 (0.0032) [2024-04-26 20:03:41,648][49750] Updated weights for policy 0, policy_version 259001 (0.0033) [2024-04-26 20:03:42,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4243472384. Throughput: 0: 50630.0. Samples: 1996345760. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:03:42,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 20:03:42,215][49728] Signal inference workers to stop experience collection... (29750 times) [2024-04-26 20:03:42,254][49750] InferenceWorker_p0-w0: stopping experience collection (29750 times) [2024-04-26 20:03:42,287][49728] Signal inference workers to resume experience collection... (29750 times) [2024-04-26 20:03:42,288][49750] InferenceWorker_p0-w0: resuming experience collection (29750 times) [2024-04-26 20:03:44,157][49750] Updated weights for policy 0, policy_version 259011 (0.0028) [2024-04-26 20:03:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.4, 300 sec: 50651.6). Total num frames: 4243718144. Throughput: 0: 50803.7. Samples: 1996651480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:03:47,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 20:03:48,057][49750] Updated weights for policy 0, policy_version 259021 (0.0032) [2024-04-26 20:03:50,748][49750] Updated weights for policy 0, policy_version 259031 (0.0036) [2024-04-26 20:03:52,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4244013056. Throughput: 0: 50776.4. Samples: 1996801920. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:03:52,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 20:03:54,376][49750] Updated weights for policy 0, policy_version 259041 (0.0037) [2024-04-26 20:03:57,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4244258816. Throughput: 0: 50723.3. Samples: 1997104560. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:03:57,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 20:03:57,326][49750] Updated weights for policy 0, policy_version 259051 (0.0034) [2024-04-26 20:04:00,970][49750] Updated weights for policy 0, policy_version 259061 (0.0030) [2024-04-26 20:04:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4244504576. Throughput: 0: 50703.1. Samples: 1997410960. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:04:02,063][49517] Avg episode reward: [(0, '0.696')] [2024-04-26 20:04:03,951][49750] Updated weights for policy 0, policy_version 259071 (0.0036) [2024-04-26 20:04:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4244766720. Throughput: 0: 50734.3. Samples: 1997556420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:04:07,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 20:04:07,287][49750] Updated weights for policy 0, policy_version 259081 (0.0029) [2024-04-26 20:04:10,307][49750] Updated weights for policy 0, policy_version 259091 (0.0034) [2024-04-26 20:04:12,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4245012480. Throughput: 0: 50865.5. Samples: 1997864380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:04:12,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 20:04:13,591][49750] Updated weights for policy 0, policy_version 259101 (0.0031) [2024-04-26 20:04:16,660][49750] Updated weights for policy 0, policy_version 259111 (0.0029) [2024-04-26 20:04:17,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 4245291008. Throughput: 0: 50885.7. Samples: 1998168680. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:04:17,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 20:04:20,034][49750] Updated weights for policy 0, policy_version 259121 (0.0029) [2024-04-26 20:04:22,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4245536768. Throughput: 0: 50782.9. Samples: 1998328420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:04:22,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 20:04:23,250][49750] Updated weights for policy 0, policy_version 259131 (0.0037) [2024-04-26 20:04:26,480][49750] Updated weights for policy 0, policy_version 259141 (0.0029) [2024-04-26 20:04:27,062][49517] Fps is (10 sec: 49153.2, 60 sec: 51336.7, 300 sec: 50929.3). Total num frames: 4245782528. Throughput: 0: 50802.4. Samples: 1998631860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-26 20:04:27,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 20:04:29,768][49750] Updated weights for policy 0, policy_version 259151 (0.0033) [2024-04-26 20:04:32,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4246011904. Throughput: 0: 50752.4. Samples: 1998935340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:04:32,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 20:04:33,018][49750] Updated weights for policy 0, policy_version 259161 (0.0031) [2024-04-26 20:04:36,087][49750] Updated weights for policy 0, policy_version 259171 (0.0030) [2024-04-26 20:04:37,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4246290432. Throughput: 0: 50768.2. Samples: 1999086480. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:04:37,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 20:04:39,508][49750] Updated weights for policy 0, policy_version 259181 (0.0030) [2024-04-26 20:04:42,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4246552576. Throughput: 0: 50844.7. Samples: 1999392580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:04:42,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 20:04:42,069][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000259189_4246552576.pth... [2024-04-26 20:04:42,128][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000258444_4234346496.pth [2024-04-26 20:04:42,438][49750] Updated weights for policy 0, policy_version 259191 (0.0027) [2024-04-26 20:04:45,928][49750] Updated weights for policy 0, policy_version 259201 (0.0030) [2024-04-26 20:04:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.5, 300 sec: 50762.7). Total num frames: 4246798336. Throughput: 0: 50851.2. Samples: 1999699260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:04:47,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 20:04:48,946][49750] Updated weights for policy 0, policy_version 259211 (0.0030) [2024-04-26 20:04:52,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.6, 300 sec: 50929.3). Total num frames: 4247060480. Throughput: 0: 50892.5. Samples: 1999846580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:04:52,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 20:04:52,272][49750] Updated weights for policy 0, policy_version 259221 (0.0030) [2024-04-26 20:04:55,533][49750] Updated weights for policy 0, policy_version 259231 (0.0030) [2024-04-26 20:04:57,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4247306240. Throughput: 0: 50853.5. Samples: 2000152800. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:04:57,063][49517] Avg episode reward: [(0, '0.676')] [2024-04-26 20:04:58,678][49750] Updated weights for policy 0, policy_version 259241 (0.0029) [2024-04-26 20:05:01,900][49750] Updated weights for policy 0, policy_version 259251 (0.0030) [2024-04-26 20:05:02,062][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4247584768. Throughput: 0: 50916.6. Samples: 2000459920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:05:02,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 20:05:04,242][49728] Signal inference workers to stop experience collection... (29800 times) [2024-04-26 20:05:04,244][49728] Signal inference workers to resume experience collection... (29800 times) [2024-04-26 20:05:04,272][49750] InferenceWorker_p0-w0: stopping experience collection (29800 times) [2024-04-26 20:05:04,272][49750] InferenceWorker_p0-w0: resuming experience collection (29800 times) [2024-04-26 20:05:05,081][49750] Updated weights for policy 0, policy_version 259261 (0.0033) [2024-04-26 20:05:07,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4247814144. Throughput: 0: 50828.1. Samples: 2000615680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:05:07,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 20:05:08,430][49750] Updated weights for policy 0, policy_version 259271 (0.0034) [2024-04-26 20:05:11,459][49750] Updated weights for policy 0, policy_version 259281 (0.0031) [2024-04-26 20:05:12,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4248059904. Throughput: 0: 50806.5. Samples: 2000918160. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:05:12,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 20:05:15,020][49750] Updated weights for policy 0, policy_version 259291 (0.0035) [2024-04-26 20:05:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4248305664. Throughput: 0: 50703.0. Samples: 2001216980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:05:17,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 20:05:17,842][49750] Updated weights for policy 0, policy_version 259301 (0.0029) [2024-04-26 20:05:21,475][49750] Updated weights for policy 0, policy_version 259311 (0.0037) [2024-04-26 20:05:22,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4248567808. Throughput: 0: 50766.0. Samples: 2001370960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:05:22,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 20:05:24,390][49750] Updated weights for policy 0, policy_version 259321 (0.0029) [2024-04-26 20:05:27,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4248829952. Throughput: 0: 50804.5. Samples: 2001678780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:05:27,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 20:05:27,739][49750] Updated weights for policy 0, policy_version 259331 (0.0029) [2024-04-26 20:05:30,668][49750] Updated weights for policy 0, policy_version 259341 (0.0033) [2024-04-26 20:05:32,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4249092096. Throughput: 0: 50854.1. Samples: 2001987700. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:05:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 20:05:34,137][49750] Updated weights for policy 0, policy_version 259351 (0.0031) [2024-04-26 20:05:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4249354240. Throughput: 0: 50917.5. Samples: 2002137880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 20:05:37,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 20:05:37,199][49750] Updated weights for policy 0, policy_version 259361 (0.0029) [2024-04-26 20:05:40,585][49750] Updated weights for policy 0, policy_version 259371 (0.0027) [2024-04-26 20:05:42,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4249583616. Throughput: 0: 50791.0. Samples: 2002438380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:05:42,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 20:05:44,265][49750] Updated weights for policy 0, policy_version 259381 (0.0031) [2024-04-26 20:05:47,055][49750] Updated weights for policy 0, policy_version 259391 (0.0039) [2024-04-26 20:05:47,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4249862144. Throughput: 0: 50873.9. Samples: 2002749240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:05:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 20:05:50,601][49750] Updated weights for policy 0, policy_version 259401 (0.0031) [2024-04-26 20:05:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4250107904. Throughput: 0: 50768.9. Samples: 2002900280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:05:52,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 20:05:53,299][49750] Updated weights for policy 0, policy_version 259411 (0.0028) [2024-04-26 20:05:57,033][49750] Updated weights for policy 0, policy_version 259421 (0.0032) [2024-04-26 20:05:57,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4250353664. Throughput: 0: 50927.0. Samples: 2003209880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:05:57,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 20:05:59,643][49750] Updated weights for policy 0, policy_version 259431 (0.0037) [2024-04-26 20:06:02,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4250632192. Throughput: 0: 51084.9. Samples: 2003515800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:02,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 20:06:03,383][49750] Updated weights for policy 0, policy_version 259441 (0.0032) [2024-04-26 20:06:06,117][49750] Updated weights for policy 0, policy_version 259451 (0.0034) [2024-04-26 20:06:07,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4250861568. Throughput: 0: 51150.3. Samples: 2003672720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 20:06:09,828][49750] Updated weights for policy 0, policy_version 259461 (0.0028) [2024-04-26 20:06:12,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4251107328. Throughput: 0: 51041.3. Samples: 2003975640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:12,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 20:06:12,730][49750] Updated weights for policy 0, policy_version 259471 (0.0029) [2024-04-26 20:06:16,328][49750] Updated weights for policy 0, policy_version 259481 (0.0034) [2024-04-26 20:06:17,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4251369472. Throughput: 0: 51024.4. Samples: 2004283800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:17,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:06:19,083][49750] Updated weights for policy 0, policy_version 259491 (0.0029) [2024-04-26 20:06:22,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4251648000. Throughput: 0: 51056.1. Samples: 2004435400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:22,063][49517] Avg episode reward: [(0, '0.426')] [2024-04-26 20:06:22,749][49750] Updated weights for policy 0, policy_version 259501 (0.0029) [2024-04-26 20:06:25,653][49750] Updated weights for policy 0, policy_version 259511 (0.0029) [2024-04-26 20:06:27,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4251893760. Throughput: 0: 51096.6. Samples: 2004737740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:27,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 20:06:29,126][49750] Updated weights for policy 0, policy_version 259521 (0.0028) [2024-04-26 20:06:29,653][49728] Signal inference workers to stop experience collection... (29850 times) [2024-04-26 20:06:29,653][49728] Signal inference workers to resume experience collection... 
(29850 times) [2024-04-26 20:06:29,665][49750] InferenceWorker_p0-w0: stopping experience collection (29850 times) [2024-04-26 20:06:29,665][49750] InferenceWorker_p0-w0: resuming experience collection (29850 times) [2024-04-26 20:06:32,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4252139520. Throughput: 0: 51081.1. Samples: 2005047900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:32,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 20:06:32,133][49750] Updated weights for policy 0, policy_version 259531 (0.0029) [2024-04-26 20:06:35,592][49750] Updated weights for policy 0, policy_version 259541 (0.0029) [2024-04-26 20:06:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4252385280. Throughput: 0: 50931.1. Samples: 2005192180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:37,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-26 20:06:38,466][49750] Updated weights for policy 0, policy_version 259551 (0.0029) [2024-04-26 20:06:42,041][49750] Updated weights for policy 0, policy_version 259561 (0.0024) [2024-04-26 20:06:42,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.2, 300 sec: 50818.1). Total num frames: 4252647424. Throughput: 0: 50924.4. Samples: 2005501480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 20:06:42,191][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000259562_4252663808.pth... [2024-04-26 20:06:42,237][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000258816_4240441344.pth [2024-04-26 20:06:44,746][49750] Updated weights for policy 0, policy_version 259571 (0.0032) [2024-04-26 20:06:47,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 51040.3). Total num frames: 4252925952. Throughput: 0: 50865.7. Samples: 2005804760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 20:06:48,683][49750] Updated weights for policy 0, policy_version 259581 (0.0033) [2024-04-26 20:06:51,244][49750] Updated weights for policy 0, policy_version 259591 (0.0036) [2024-04-26 20:06:52,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4253155328. Throughput: 0: 50878.6. Samples: 2005962260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:06:52,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 20:06:55,149][49750] Updated weights for policy 0, policy_version 259601 (0.0031) [2024-04-26 20:06:57,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4253417472. Throughput: 0: 50904.4. Samples: 2006266340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:06:57,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 20:06:57,725][49750] Updated weights for policy 0, policy_version 259611 (0.0031) [2024-04-26 20:07:01,528][49750] Updated weights for policy 0, policy_version 259621 (0.0033) [2024-04-26 20:07:02,063][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4253646848. Throughput: 0: 50929.9. Samples: 2006575640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:02,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 20:07:04,027][49750] Updated weights for policy 0, policy_version 259631 (0.0028) [2024-04-26 20:07:07,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 4253925376. Throughput: 0: 50825.1. Samples: 2006722520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:07,071][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 20:07:07,946][49750] Updated weights for policy 0, policy_version 259641 (0.0030) [2024-04-26 20:07:10,599][49750] Updated weights for policy 0, policy_version 259651 (0.0035) [2024-04-26 20:07:12,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51609.7, 300 sec: 51040.3). Total num frames: 4254203904. Throughput: 0: 50859.4. Samples: 2007026400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:12,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 20:07:14,397][49750] Updated weights for policy 0, policy_version 259661 (0.0032) [2024-04-26 20:07:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4254433280. Throughput: 0: 50776.7. Samples: 2007332840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 20:07:17,327][49750] Updated weights for policy 0, policy_version 259671 (0.0035) [2024-04-26 20:07:20,900][49750] Updated weights for policy 0, policy_version 259681 (0.0030) [2024-04-26 20:07:21,787][49728] Signal inference workers to stop experience collection... (29900 times) [2024-04-26 20:07:21,824][49750] InferenceWorker_p0-w0: stopping experience collection (29900 times) [2024-04-26 20:07:21,847][49728] Signal inference workers to resume experience collection... (29900 times) [2024-04-26 20:07:21,848][49750] InferenceWorker_p0-w0: resuming experience collection (29900 times) [2024-04-26 20:07:22,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4254662656. Throughput: 0: 50809.4. Samples: 2007478600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:22,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:07:23,775][49750] Updated weights for policy 0, policy_version 259691 (0.0036) [2024-04-26 20:07:27,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4254924800. Throughput: 0: 50628.0. Samples: 2007779740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:27,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 20:07:27,235][49750] Updated weights for policy 0, policy_version 259701 (0.0028) [2024-04-26 20:07:30,057][49750] Updated weights for policy 0, policy_version 259711 (0.0030) [2024-04-26 20:07:32,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4255203328. Throughput: 0: 50837.0. Samples: 2008092420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:32,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 20:07:33,562][49750] Updated weights for policy 0, policy_version 259721 (0.0036) [2024-04-26 20:07:36,540][49750] Updated weights for policy 0, policy_version 259731 (0.0030) [2024-04-26 20:07:37,062][49517] Fps is (10 sec: 54068.4, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4255465472. Throughput: 0: 50998.9. Samples: 2008257200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 20:07:40,073][49750] Updated weights for policy 0, policy_version 259741 (0.0030) [2024-04-26 20:07:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4255694848. Throughput: 0: 50828.7. Samples: 2008553620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:42,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 20:07:42,951][49750] Updated weights for policy 0, policy_version 259751 (0.0031) [2024-04-26 20:07:46,578][49750] Updated weights for policy 0, policy_version 259761 (0.0029) [2024-04-26 20:07:47,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4255940608. Throughput: 0: 50779.4. Samples: 2008860720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:47,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 20:07:49,326][49750] Updated weights for policy 0, policy_version 259771 (0.0038) [2024-04-26 20:07:52,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 4256186368. Throughput: 0: 50666.1. Samples: 2009002500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 20:07:52,937][49750] Updated weights for policy 0, policy_version 259781 (0.0032) [2024-04-26 20:07:55,630][49750] Updated weights for policy 0, policy_version 259791 (0.0032) [2024-04-26 20:07:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4256464896. Throughput: 0: 50761.6. Samples: 2009310680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:07:57,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 20:07:59,313][49750] Updated weights for policy 0, policy_version 259801 (0.0028) [2024-04-26 20:08:02,044][49750] Updated weights for policy 0, policy_version 259811 (0.0027) [2024-04-26 20:08:02,063][49517] Fps is (10 sec: 55705.8, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4256743424. Throughput: 0: 50744.8. Samples: 2009616360. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 20:08:02,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 20:08:05,948][49750] Updated weights for policy 0, policy_version 259821 (0.0031) [2024-04-26 20:08:07,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4256956416. Throughput: 0: 50984.4. Samples: 2009772900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:07,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 20:08:08,598][49750] Updated weights for policy 0, policy_version 259831 (0.0032) [2024-04-26 20:08:12,063][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 4257202176. Throughput: 0: 50962.3. Samples: 2010073040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:12,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 20:08:12,453][49750] Updated weights for policy 0, policy_version 259841 (0.0036) [2024-04-26 20:08:14,937][49750] Updated weights for policy 0, policy_version 259851 (0.0029) [2024-04-26 20:08:17,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.2, 300 sec: 50929.2). Total num frames: 4257480704. Throughput: 0: 50635.8. Samples: 2010371040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:17,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 20:08:18,929][49750] Updated weights for policy 0, policy_version 259861 (0.0038) [2024-04-26 20:08:21,090][49728] Signal inference workers to stop experience collection... (29950 times) [2024-04-26 20:08:21,090][49728] Signal inference workers to resume experience collection... 
(29950 times) [2024-04-26 20:08:21,130][49750] InferenceWorker_p0-w0: stopping experience collection (29950 times) [2024-04-26 20:08:21,130][49750] InferenceWorker_p0-w0: resuming experience collection (29950 times) [2024-04-26 20:08:21,378][49750] Updated weights for policy 0, policy_version 259871 (0.0030) [2024-04-26 20:08:22,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51609.6, 300 sec: 51040.4). Total num frames: 4257759232. Throughput: 0: 50636.8. Samples: 2010535860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:22,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 20:08:25,309][49750] Updated weights for policy 0, policy_version 259881 (0.0030) [2024-04-26 20:08:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4257972224. Throughput: 0: 50815.4. Samples: 2010840320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 20:08:27,782][49750] Updated weights for policy 0, policy_version 259891 (0.0034) [2024-04-26 20:08:31,767][49750] Updated weights for policy 0, policy_version 259901 (0.0029) [2024-04-26 20:08:32,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4258217984. Throughput: 0: 50768.1. Samples: 2011145280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:32,063][49517] Avg episode reward: [(0, '0.678')] [2024-04-26 20:08:34,230][49750] Updated weights for policy 0, policy_version 259911 (0.0031) [2024-04-26 20:08:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 4258480128. Throughput: 0: 50737.8. Samples: 2011285700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:37,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 20:08:38,189][49750] Updated weights for policy 0, policy_version 259921 (0.0030) [2024-04-26 20:08:40,793][49750] Updated weights for policy 0, policy_version 259931 (0.0034) [2024-04-26 20:08:42,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.3, 300 sec: 50984.7). Total num frames: 4258758656. Throughput: 0: 50613.2. Samples: 2011588280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:42,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 20:08:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000259934_4258758656.pth... [2024-04-26 20:08:42,137][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000259189_4246552576.pth [2024-04-26 20:08:44,655][49750] Updated weights for policy 0, policy_version 259941 (0.0033) [2024-04-26 20:08:47,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4259020800. Throughput: 0: 50686.7. Samples: 2011897260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 20:08:47,183][49750] Updated weights for policy 0, policy_version 259951 (0.0029) [2024-04-26 20:08:51,130][49750] Updated weights for policy 0, policy_version 259961 (0.0033) [2024-04-26 20:08:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4259250176. Throughput: 0: 50758.3. Samples: 2012057020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:52,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 20:08:53,484][49750] Updated weights for policy 0, policy_version 259971 (0.0037) [2024-04-26 20:08:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4259495936. Throughput: 0: 50737.8. Samples: 2012356240. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:08:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 20:08:57,485][49750] Updated weights for policy 0, policy_version 259981 (0.0036) [2024-04-26 20:09:00,082][49750] Updated weights for policy 0, policy_version 259991 (0.0030) [2024-04-26 20:09:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4259758080. Throughput: 0: 50761.6. Samples: 2012655300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:09:02,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 20:09:03,977][49750] Updated weights for policy 0, policy_version 260001 (0.0032) [2024-04-26 20:09:06,503][49750] Updated weights for policy 0, policy_version 260011 (0.0027) [2024-04-26 20:09:07,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4260036608. Throughput: 0: 50676.4. Samples: 2012816300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:09:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 20:09:10,396][49750] Updated weights for policy 0, policy_version 260021 (0.0027) [2024-04-26 20:09:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4260282368. Throughput: 0: 50691.7. Samples: 2013121440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 20:09:12,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 20:09:12,931][49750] Updated weights for policy 0, policy_version 260031 (0.0031) [2024-04-26 20:09:16,885][49750] Updated weights for policy 0, policy_version 260041 (0.0033) [2024-04-26 20:09:17,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4260511744. Throughput: 0: 50747.6. Samples: 2013428920. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:09:17,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 20:09:19,394][49750] Updated weights for policy 0, policy_version 260051 (0.0029) [2024-04-26 20:09:22,063][49517] Fps is (10 sec: 47512.7, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 4260757504. Throughput: 0: 50755.9. Samples: 2013569720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:09:22,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 20:09:23,316][49750] Updated weights for policy 0, policy_version 260061 (0.0031) [2024-04-26 20:09:24,637][49728] Signal inference workers to stop experience collection... (30000 times) [2024-04-26 20:09:24,658][49750] InferenceWorker_p0-w0: stopping experience collection (30000 times) [2024-04-26 20:09:24,746][49728] Signal inference workers to resume experience collection... (30000 times) [2024-04-26 20:09:24,746][49750] InferenceWorker_p0-w0: resuming experience collection (30000 times) [2024-04-26 20:09:25,923][49750] Updated weights for policy 0, policy_version 260071 (0.0033) [2024-04-26 20:09:27,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4261052416. Throughput: 0: 50786.9. Samples: 2013873680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:09:27,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 20:09:29,665][49750] Updated weights for policy 0, policy_version 260081 (0.0033) [2024-04-26 20:09:32,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4261298176. Throughput: 0: 50756.5. Samples: 2014181300. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:09:32,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 20:09:32,337][49750] Updated weights for policy 0, policy_version 260091 (0.0029) [2024-04-26 20:09:35,937][49750] Updated weights for policy 0, policy_version 260101 (0.0027) [2024-04-26 20:09:37,063][49517] Fps is (10 sec: 47512.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4261527552. Throughput: 0: 50822.4. Samples: 2014344040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:09:37,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 20:09:38,827][49750] Updated weights for policy 0, policy_version 260111 (0.0030) [2024-04-26 20:09:42,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4261773312. Throughput: 0: 50871.5. Samples: 2014645460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:09:42,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 20:09:42,633][49750] Updated weights for policy 0, policy_version 260121 (0.0029) [2024-04-26 20:09:45,310][49750] Updated weights for policy 0, policy_version 260131 (0.0031) [2024-04-26 20:09:47,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4262035456. Throughput: 0: 50731.4. Samples: 2014938220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:09:47,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 20:09:49,205][49750] Updated weights for policy 0, policy_version 260141 (0.0028) [2024-04-26 20:09:51,705][49750] Updated weights for policy 0, policy_version 260151 (0.0029) [2024-04-26 20:09:52,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4262313984. Throughput: 0: 50787.0. Samples: 2015101720. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:09:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 20:09:55,517][49750] Updated weights for policy 0, policy_version 260161 (0.0036) [2024-04-26 20:09:57,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4262559744. Throughput: 0: 50647.9. Samples: 2015400600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:09:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 20:09:58,151][49750] Updated weights for policy 0, policy_version 260171 (0.0025) [2024-04-26 20:10:01,994][49750] Updated weights for policy 0, policy_version 260181 (0.0037) [2024-04-26 20:10:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4262805504. Throughput: 0: 50582.6. Samples: 2015705140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:10:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 20:10:04,592][49750] Updated weights for policy 0, policy_version 260191 (0.0027) [2024-04-26 20:10:07,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4263034880. Throughput: 0: 50584.5. Samples: 2015846020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:10:07,063][49517] Avg episode reward: [(0, '0.681')] [2024-04-26 20:10:08,601][49750] Updated weights for policy 0, policy_version 260201 (0.0032) [2024-04-26 20:10:11,139][49750] Updated weights for policy 0, policy_version 260211 (0.0025) [2024-04-26 20:10:12,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4263313408. Throughput: 0: 50689.8. Samples: 2016154720. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:10:12,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 20:10:15,321][49750] Updated weights for policy 0, policy_version 260221 (0.0030) [2024-04-26 20:10:15,778][49728] Signal inference workers to stop experience collection... (30050 times) [2024-04-26 20:10:15,818][49750] InferenceWorker_p0-w0: stopping experience collection (30050 times) [2024-04-26 20:10:15,853][49728] Signal inference workers to resume experience collection... (30050 times) [2024-04-26 20:10:15,858][49750] InferenceWorker_p0-w0: resuming experience collection (30050 times) [2024-04-26 20:10:17,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4263559168. Throughput: 0: 50501.4. Samples: 2016453860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:10:17,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 20:10:17,668][49750] Updated weights for policy 0, policy_version 260231 (0.0025) [2024-04-26 20:10:21,763][49750] Updated weights for policy 0, policy_version 260241 (0.0035) [2024-04-26 20:10:22,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4263804928. Throughput: 0: 50377.1. Samples: 2016611000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:10:22,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 20:10:24,191][49750] Updated weights for policy 0, policy_version 260251 (0.0032) [2024-04-26 20:10:27,062][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4264050688. Throughput: 0: 50362.3. Samples: 2016911760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:10:27,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 20:10:28,294][49750] Updated weights for policy 0, policy_version 260261 (0.0033) [2024-04-26 20:10:30,718][49750] Updated weights for policy 0, policy_version 260271 (0.0032) [2024-04-26 20:10:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4264312832. Throughput: 0: 50688.5. Samples: 2017219200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:10:32,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 20:10:34,606][49750] Updated weights for policy 0, policy_version 260281 (0.0033) [2024-04-26 20:10:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4264574976. Throughput: 0: 50474.0. Samples: 2017373040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:10:37,063][49517] Avg episode reward: [(0, '0.682')] [2024-04-26 20:10:37,344][49750] Updated weights for policy 0, policy_version 260291 (0.0037) [2024-04-26 20:10:41,037][49750] Updated weights for policy 0, policy_version 260301 (0.0034) [2024-04-26 20:10:42,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4264853504. Throughput: 0: 50565.8. Samples: 2017676060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:10:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 20:10:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000260306_4264853504.pth... [2024-04-26 20:10:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000259562_4252663808.pth [2024-04-26 20:10:44,163][49750] Updated weights for policy 0, policy_version 260311 (0.0029) [2024-04-26 20:10:47,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4265066496. Throughput: 0: 50552.0. Samples: 2017979980. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:10:47,063][49517] Avg episode reward: [(0, '0.680')] [2024-04-26 20:10:47,466][49750] Updated weights for policy 0, policy_version 260321 (0.0029) [2024-04-26 20:10:50,557][49750] Updated weights for policy 0, policy_version 260331 (0.0031) [2024-04-26 20:10:52,063][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4265312256. Throughput: 0: 50621.8. Samples: 2018124000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:10:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 20:10:53,929][49750] Updated weights for policy 0, policy_version 260341 (0.0033) [2024-04-26 20:10:56,984][49750] Updated weights for policy 0, policy_version 260351 (0.0033) [2024-04-26 20:10:57,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4265590784. Throughput: 0: 50524.2. Samples: 2018428320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:10:57,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:11:00,366][49750] Updated weights for policy 0, policy_version 260361 (0.0034) [2024-04-26 20:11:02,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4265836544. Throughput: 0: 50701.2. Samples: 2018735420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:11:02,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:11:03,432][49750] Updated weights for policy 0, policy_version 260371 (0.0032) [2024-04-26 20:11:06,693][49750] Updated weights for policy 0, policy_version 260381 (0.0025) [2024-04-26 20:11:07,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4266082304. Throughput: 0: 50620.7. Samples: 2018888940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:11:07,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 20:11:09,904][49750] Updated weights for policy 0, policy_version 260391 (0.0027) [2024-04-26 20:11:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4266328064. Throughput: 0: 50609.7. Samples: 2019189200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:11:12,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 20:11:13,142][49750] Updated weights for policy 0, policy_version 260401 (0.0028) [2024-04-26 20:11:16,505][49750] Updated weights for policy 0, policy_version 260411 (0.0031) [2024-04-26 20:11:17,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.2, 300 sec: 50651.6). Total num frames: 4266590208. Throughput: 0: 50573.3. Samples: 2019495000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:11:17,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 20:11:19,774][49750] Updated weights for policy 0, policy_version 260421 (0.0029) [2024-04-26 20:11:22,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50790.1, 300 sec: 50707.1). Total num frames: 4266852352. Throughput: 0: 50542.7. Samples: 2019647480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:11:22,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 20:11:22,861][49750] Updated weights for policy 0, policy_version 260431 (0.0034) [2024-04-26 20:11:26,150][49750] Updated weights for policy 0, policy_version 260441 (0.0031) [2024-04-26 20:11:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4267114496. Throughput: 0: 50736.4. Samples: 2019959200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:11:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 20:11:27,525][49728] Signal inference workers to stop experience collection... 
(30100 times) [2024-04-26 20:11:27,569][49750] InferenceWorker_p0-w0: stopping experience collection (30100 times) [2024-04-26 20:11:27,631][49728] Signal inference workers to resume experience collection... (30100 times) [2024-04-26 20:11:27,632][49750] InferenceWorker_p0-w0: resuming experience collection (30100 times) [2024-04-26 20:11:29,415][49750] Updated weights for policy 0, policy_version 260451 (0.0028) [2024-04-26 20:11:32,063][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4267360256. Throughput: 0: 50697.7. Samples: 2020261380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-26 20:11:32,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 20:11:32,779][49750] Updated weights for policy 0, policy_version 260461 (0.0029) [2024-04-26 20:11:35,911][49750] Updated weights for policy 0, policy_version 260471 (0.0032) [2024-04-26 20:11:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4267606016. Throughput: 0: 50675.7. Samples: 2020404400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:11:37,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 20:11:39,154][49750] Updated weights for policy 0, policy_version 260481 (0.0029) [2024-04-26 20:11:42,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 4267851776. Throughput: 0: 50643.7. Samples: 2020707280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:11:42,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 20:11:42,296][49750] Updated weights for policy 0, policy_version 260491 (0.0028) [2024-04-26 20:11:45,573][49750] Updated weights for policy 0, policy_version 260501 (0.0032) [2024-04-26 20:11:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4268130304. Throughput: 0: 50607.6. Samples: 2021012760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:11:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 20:11:48,807][49750] Updated weights for policy 0, policy_version 260511 (0.0037) [2024-04-26 20:11:52,032][49750] Updated weights for policy 0, policy_version 260521 (0.0032) [2024-04-26 20:11:52,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4268376064. Throughput: 0: 50700.0. Samples: 2021170440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:11:52,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:11:55,292][49750] Updated weights for policy 0, policy_version 260531 (0.0028) [2024-04-26 20:11:57,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4268589056. Throughput: 0: 50693.9. Samples: 2021470420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:11:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 20:11:58,572][49750] Updated weights for policy 0, policy_version 260541 (0.0030) [2024-04-26 20:12:01,726][49750] Updated weights for policy 0, policy_version 260551 (0.0030) [2024-04-26 20:12:02,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4268867584. Throughput: 0: 50682.2. Samples: 2021775700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:12:02,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 20:12:05,055][49750] Updated weights for policy 0, policy_version 260561 (0.0036) [2024-04-26 20:12:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.5, 300 sec: 50596.0). Total num frames: 4269129728. Throughput: 0: 50679.0. Samples: 2021928020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:12:07,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:12:08,013][49750] Updated weights for policy 0, policy_version 260571 (0.0031) [2024-04-26 20:12:11,406][49750] Updated weights for policy 0, policy_version 260581 (0.0032) [2024-04-26 20:12:12,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4269391872. Throughput: 0: 50620.1. Samples: 2022237100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:12:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 20:12:14,670][49750] Updated weights for policy 0, policy_version 260591 (0.0031) [2024-04-26 20:12:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4269621248. Throughput: 0: 50599.1. Samples: 2022538340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:12:17,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 20:12:17,890][49750] Updated weights for policy 0, policy_version 260601 (0.0032) [2024-04-26 20:12:21,082][49750] Updated weights for policy 0, policy_version 260611 (0.0030) [2024-04-26 20:12:22,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.6, 300 sec: 50651.6). Total num frames: 4269867008. Throughput: 0: 50683.1. Samples: 2022685140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:12:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 20:12:24,384][49750] Updated weights for policy 0, policy_version 260621 (0.0029) [2024-04-26 20:12:27,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4270145536. Throughput: 0: 50650.2. Samples: 2022986540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:12:27,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 20:12:27,400][49750] Updated weights for policy 0, policy_version 260631 (0.0034) [2024-04-26 20:12:30,877][49750] Updated weights for policy 0, policy_version 260641 (0.0027) [2024-04-26 20:12:32,063][49517] Fps is (10 sec: 52427.4, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 4270391296. Throughput: 0: 50615.7. Samples: 2023290480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:12:32,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 20:12:33,696][49750] Updated weights for policy 0, policy_version 260651 (0.0027) [2024-04-26 20:12:37,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4270637056. Throughput: 0: 50655.7. Samples: 2023449940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:12:37,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 20:12:37,258][49750] Updated weights for policy 0, policy_version 260661 (0.0032) [2024-04-26 20:12:40,126][49750] Updated weights for policy 0, policy_version 260671 (0.0032) [2024-04-26 20:12:42,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4270899200. Throughput: 0: 50657.5. Samples: 2023750020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:12:42,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 20:12:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000260675_4270899200.pth... [2024-04-26 20:12:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000259934_4258758656.pth [2024-04-26 20:12:43,732][49750] Updated weights for policy 0, policy_version 260681 (0.0032) [2024-04-26 20:12:46,747][49750] Updated weights for policy 0, policy_version 260691 (0.0033) [2024-04-26 20:12:47,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50762.6). 
Total num frames: 4271161344. Throughput: 0: 50587.1. Samples: 2024052120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:12:47,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 20:12:50,247][49750] Updated weights for policy 0, policy_version 260701 (0.0029) [2024-04-26 20:12:51,589][49728] Signal inference workers to stop experience collection... (30150 times) [2024-04-26 20:12:51,590][49728] Signal inference workers to resume experience collection... (30150 times) [2024-04-26 20:12:51,615][49750] InferenceWorker_p0-w0: stopping experience collection (30150 times) [2024-04-26 20:12:51,616][49750] InferenceWorker_p0-w0: resuming experience collection (30150 times) [2024-04-26 20:12:52,062][49517] Fps is (10 sec: 50792.0, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4271407104. Throughput: 0: 50766.3. Samples: 2024212500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:12:52,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 20:12:53,481][49750] Updated weights for policy 0, policy_version 260711 (0.0027) [2024-04-26 20:12:56,708][49750] Updated weights for policy 0, policy_version 260721 (0.0029) [2024-04-26 20:12:57,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51336.6, 300 sec: 50596.0). Total num frames: 4271669248. Throughput: 0: 50613.8. Samples: 2024514720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:12:57,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 20:12:59,730][49750] Updated weights for policy 0, policy_version 260731 (0.0030) [2024-04-26 20:13:02,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4271915008. Throughput: 0: 50788.4. Samples: 2024823820. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:02,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 20:13:03,126][49750] Updated weights for policy 0, policy_version 260741 (0.0029) [2024-04-26 20:13:06,146][49750] Updated weights for policy 0, policy_version 260751 (0.0029) [2024-04-26 20:13:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4272160768. Throughput: 0: 50783.6. Samples: 2024970400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 20:13:09,460][49750] Updated weights for policy 0, policy_version 260761 (0.0032) [2024-04-26 20:13:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4272439296. Throughput: 0: 50769.7. Samples: 2025271180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:12,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 20:13:12,931][49750] Updated weights for policy 0, policy_version 260771 (0.0040) [2024-04-26 20:13:16,133][49750] Updated weights for policy 0, policy_version 260781 (0.0027) [2024-04-26 20:13:17,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50596.0). Total num frames: 4272685056. Throughput: 0: 50774.0. Samples: 2025575300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 20:13:19,488][49750] Updated weights for policy 0, policy_version 260791 (0.0031) [2024-04-26 20:13:22,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4272930816. Throughput: 0: 50757.2. Samples: 2025734020. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:22,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 20:13:22,534][49750] Updated weights for policy 0, policy_version 260801 (0.0027) [2024-04-26 20:13:26,006][49750] Updated weights for policy 0, policy_version 260811 (0.0031) [2024-04-26 20:13:27,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4273176576. Throughput: 0: 50713.0. Samples: 2026032100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 20:13:29,089][49750] Updated weights for policy 0, policy_version 260821 (0.0029) [2024-04-26 20:13:32,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4273438720. Throughput: 0: 50733.5. Samples: 2026335120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:32,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 20:13:32,341][49750] Updated weights for policy 0, policy_version 260831 (0.0030) [2024-04-26 20:13:35,395][49750] Updated weights for policy 0, policy_version 260841 (0.0033) [2024-04-26 20:13:37,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 4273700864. Throughput: 0: 50750.8. Samples: 2026496300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:37,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 20:13:38,648][49750] Updated weights for policy 0, policy_version 260851 (0.0027) [2024-04-26 20:13:41,904][49750] Updated weights for policy 0, policy_version 260861 (0.0027) [2024-04-26 20:13:42,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.6, 300 sec: 50596.0). Total num frames: 4273946624. Throughput: 0: 50786.1. Samples: 2026800100. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 20:13:44,998][49750] Updated weights for policy 0, policy_version 260871 (0.0030) [2024-04-26 20:13:47,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 4274192384. Throughput: 0: 50600.5. Samples: 2027100840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:47,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 20:13:48,331][49750] Updated weights for policy 0, policy_version 260881 (0.0028) [2024-04-26 20:13:51,372][49750] Updated weights for policy 0, policy_version 260891 (0.0028) [2024-04-26 20:13:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4274454528. Throughput: 0: 50726.1. Samples: 2027253080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:52,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 20:13:54,681][49750] Updated weights for policy 0, policy_version 260901 (0.0028) [2024-04-26 20:13:57,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4274733056. Throughput: 0: 50951.5. Samples: 2027564000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-26 20:13:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 20:13:57,922][49750] Updated weights for policy 0, policy_version 260911 (0.0033) [2024-04-26 20:14:01,216][49750] Updated weights for policy 0, policy_version 260921 (0.0032) [2024-04-26 20:14:02,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 4274946048. Throughput: 0: 50817.2. Samples: 2027862080. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 20:14:04,324][49750] Updated weights for policy 0, policy_version 260931 (0.0031) [2024-04-26 20:14:07,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 4275208192. Throughput: 0: 50644.1. Samples: 2028013000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:07,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 20:14:07,689][49750] Updated weights for policy 0, policy_version 260941 (0.0032) [2024-04-26 20:14:08,975][49728] Signal inference workers to stop experience collection... (30200 times) [2024-04-26 20:14:09,012][49750] InferenceWorker_p0-w0: stopping experience collection (30200 times) [2024-04-26 20:14:09,082][49728] Signal inference workers to resume experience collection... (30200 times) [2024-04-26 20:14:09,082][49750] InferenceWorker_p0-w0: resuming experience collection (30200 times) [2024-04-26 20:14:10,850][49750] Updated weights for policy 0, policy_version 260951 (0.0029) [2024-04-26 20:14:12,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4275470336. Throughput: 0: 50862.0. Samples: 2028320880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:12,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 20:14:14,048][49750] Updated weights for policy 0, policy_version 260961 (0.0030) [2024-04-26 20:14:17,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.1, 300 sec: 50707.1). Total num frames: 4275716096. Throughput: 0: 50836.2. Samples: 2028622760. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:17,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 20:14:17,342][49750] Updated weights for policy 0, policy_version 260971 (0.0034) [2024-04-26 20:14:20,472][49750] Updated weights for policy 0, policy_version 260981 (0.0035) [2024-04-26 20:14:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.7, 300 sec: 50651.6). Total num frames: 4275994624. Throughput: 0: 50779.8. Samples: 2028781380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 20:14:23,643][49750] Updated weights for policy 0, policy_version 260991 (0.0030) [2024-04-26 20:14:26,836][49750] Updated weights for policy 0, policy_version 261001 (0.0025) [2024-04-26 20:14:27,063][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50651.5). Total num frames: 4276240384. Throughput: 0: 50895.5. Samples: 2029090400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 20:14:30,200][49750] Updated weights for policy 0, policy_version 261011 (0.0029) [2024-04-26 20:14:32,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4276469760. Throughput: 0: 50813.8. Samples: 2029387460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:32,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 20:14:33,443][49750] Updated weights for policy 0, policy_version 261021 (0.0030) [2024-04-26 20:14:36,735][49750] Updated weights for policy 0, policy_version 261031 (0.0033) [2024-04-26 20:14:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4276731904. Throughput: 0: 50799.5. Samples: 2029539060. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:37,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 20:14:39,804][49750] Updated weights for policy 0, policy_version 261041 (0.0032) [2024-04-26 20:14:42,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4277026816. Throughput: 0: 50731.5. Samples: 2029846920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:42,064][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 20:14:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000261049_4277026816.pth... [2024-04-26 20:14:42,127][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000260306_4264853504.pth [2024-04-26 20:14:43,054][49750] Updated weights for policy 0, policy_version 261051 (0.0033) [2024-04-26 20:14:46,384][49750] Updated weights for policy 0, policy_version 261061 (0.0032) [2024-04-26 20:14:47,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 4277256192. Throughput: 0: 50872.4. Samples: 2030151340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:47,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 20:14:49,435][49750] Updated weights for policy 0, policy_version 261071 (0.0029) [2024-04-26 20:14:52,063][49517] Fps is (10 sec: 45875.2, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 4277485568. Throughput: 0: 50805.7. Samples: 2030299260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 20:14:52,879][49750] Updated weights for policy 0, policy_version 261081 (0.0034) [2024-04-26 20:14:55,985][49750] Updated weights for policy 0, policy_version 261091 (0.0034) [2024-04-26 20:14:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4277764096. Throughput: 0: 50684.2. Samples: 2030601680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:14:57,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 20:14:59,273][49750] Updated weights for policy 0, policy_version 261101 (0.0030) [2024-04-26 20:15:02,063][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4278009856. Throughput: 0: 50748.7. Samples: 2030906440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:15:02,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 20:15:02,401][49750] Updated weights for policy 0, policy_version 261111 (0.0034) [2024-04-26 20:15:05,725][49750] Updated weights for policy 0, policy_version 261121 (0.0035) [2024-04-26 20:15:07,046][49728] Signal inference workers to stop experience collection... (30250 times) [2024-04-26 20:15:07,046][49728] Signal inference workers to resume experience collection... (30250 times) [2024-04-26 20:15:07,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4278288384. Throughput: 0: 50921.7. Samples: 2031072860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-26 20:15:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 20:15:07,074][49750] InferenceWorker_p0-w0: stopping experience collection (30250 times) [2024-04-26 20:15:07,074][49750] InferenceWorker_p0-w0: resuming experience collection (30250 times) [2024-04-26 20:15:08,978][49750] Updated weights for policy 0, policy_version 261131 (0.0034) [2024-04-26 20:15:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4278517760. Throughput: 0: 50837.8. Samples: 2031378100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 20:15:12,201][49750] Updated weights for policy 0, policy_version 261141 (0.0036) [2024-04-26 20:15:15,626][49750] Updated weights for policy 0, policy_version 261151 (0.0033) [2024-04-26 20:15:17,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4278763520. Throughput: 0: 50832.0. Samples: 2031674900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:17,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 20:15:18,713][49750] Updated weights for policy 0, policy_version 261161 (0.0028) [2024-04-26 20:15:22,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4279009280. Throughput: 0: 50730.7. Samples: 2031821940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 20:15:22,245][49750] Updated weights for policy 0, policy_version 261171 (0.0032) [2024-04-26 20:15:25,215][49750] Updated weights for policy 0, policy_version 261181 (0.0037) [2024-04-26 20:15:27,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4279287808. Throughput: 0: 50722.7. Samples: 2032129440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:27,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 20:15:28,655][49750] Updated weights for policy 0, policy_version 261191 (0.0032) [2024-04-26 20:15:31,632][49750] Updated weights for policy 0, policy_version 261201 (0.0033) [2024-04-26 20:15:32,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4279549952. Throughput: 0: 50818.3. Samples: 2032438160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:32,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-26 20:15:35,103][49750] Updated weights for policy 0, policy_version 261211 (0.0031) [2024-04-26 20:15:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.6, 300 sec: 50596.0). Total num frames: 4279779328. Throughput: 0: 50951.3. Samples: 2032592060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:37,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 20:15:38,018][49750] Updated weights for policy 0, policy_version 261221 (0.0033) [2024-04-26 20:15:41,555][49750] Updated weights for policy 0, policy_version 261231 (0.0029) [2024-04-26 20:15:42,062][49517] Fps is (10 sec: 45876.0, 60 sec: 49698.3, 300 sec: 50651.6). Total num frames: 4280008704. Throughput: 0: 50848.7. Samples: 2032889860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:42,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 20:15:44,453][49750] Updated weights for policy 0, policy_version 261241 (0.0028) [2024-04-26 20:15:47,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4280303616. Throughput: 0: 50705.2. Samples: 2033188180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 20:15:47,908][49750] Updated weights for policy 0, policy_version 261251 (0.0036) [2024-04-26 20:15:51,062][49750] Updated weights for policy 0, policy_version 261261 (0.0036) [2024-04-26 20:15:52,062][49517] Fps is (10 sec: 55706.0, 60 sec: 51336.7, 300 sec: 50762.7). Total num frames: 4280565760. Throughput: 0: 50668.5. Samples: 2033352940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:52,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 20:15:54,499][49750] Updated weights for policy 0, policy_version 261271 (0.0026) [2024-04-26 20:15:57,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4280795136. Throughput: 0: 50543.2. Samples: 2033652540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:15:57,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 20:15:57,433][49750] Updated weights for policy 0, policy_version 261281 (0.0035) [2024-04-26 20:16:00,807][49750] Updated weights for policy 0, policy_version 261291 (0.0031) [2024-04-26 20:16:02,063][49517] Fps is (10 sec: 47512.5, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4281040896. Throughput: 0: 50821.2. Samples: 2033961860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:16:02,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 20:16:03,971][49750] Updated weights for policy 0, policy_version 261301 (0.0037) [2024-04-26 20:16:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4281286656. Throughput: 0: 50765.1. Samples: 2034106360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:16:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 20:16:07,338][49750] Updated weights for policy 0, policy_version 261311 (0.0029) [2024-04-26 20:16:10,357][49750] Updated weights for policy 0, policy_version 261321 (0.0032) [2024-04-26 20:16:12,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4281581568. Throughput: 0: 50639.5. Samples: 2034408220. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:16:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 20:16:13,836][49750] Updated weights for policy 0, policy_version 261331 (0.0030) [2024-04-26 20:16:16,720][49750] Updated weights for policy 0, policy_version 261341 (0.0032) [2024-04-26 20:16:17,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.4, 300 sec: 50762.7). Total num frames: 4281827328. Throughput: 0: 50660.0. Samples: 2034717860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 20:16:17,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 20:16:20,346][49750] Updated weights for policy 0, policy_version 261351 (0.0041) [2024-04-26 20:16:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4282073088. Throughput: 0: 50674.1. Samples: 2034872400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:16:22,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 20:16:22,175][49728] Signal inference workers to stop experience collection... (30300 times) [2024-04-26 20:16:22,204][49750] InferenceWorker_p0-w0: stopping experience collection (30300 times) [2024-04-26 20:16:22,240][49728] Signal inference workers to resume experience collection... (30300 times) [2024-04-26 20:16:22,248][49750] InferenceWorker_p0-w0: resuming experience collection (30300 times) [2024-04-26 20:16:23,114][49750] Updated weights for policy 0, policy_version 261361 (0.0029) [2024-04-26 20:16:26,891][49750] Updated weights for policy 0, policy_version 261371 (0.0034) [2024-04-26 20:16:27,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4282302464. Throughput: 0: 50631.8. Samples: 2035168300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:16:27,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 20:16:29,546][49750] Updated weights for policy 0, policy_version 261381 (0.0035) [2024-04-26 20:16:32,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4282580992. Throughput: 0: 50662.7. Samples: 2035468000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:16:32,063][49517] Avg episode reward: [(0, '0.676')] [2024-04-26 20:16:33,437][49750] Updated weights for policy 0, policy_version 261391 (0.0033) [2024-04-26 20:16:35,904][49750] Updated weights for policy 0, policy_version 261401 (0.0024) [2024-04-26 20:16:37,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 4282843136. Throughput: 0: 50763.8. Samples: 2035637320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:16:37,065][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 20:16:39,771][49750] Updated weights for policy 0, policy_version 261411 (0.0032) [2024-04-26 20:16:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4283088896. Throughput: 0: 50765.3. Samples: 2035936980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:16:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 20:16:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000261419_4283088896.pth... [2024-04-26 20:16:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000260675_4270899200.pth [2024-04-26 20:16:42,500][49750] Updated weights for policy 0, policy_version 261421 (0.0028) [2024-04-26 20:16:46,062][49750] Updated weights for policy 0, policy_version 261431 (0.0031) [2024-04-26 20:16:47,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4283318272. Throughput: 0: 50740.5. Samples: 2036245180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:16:47,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 20:16:48,915][49750] Updated weights for policy 0, policy_version 261441 (0.0031) [2024-04-26 20:16:52,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50244.0, 300 sec: 50818.1). Total num frames: 4283580416. Throughput: 0: 50637.0. Samples: 2036385040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:16:52,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:16:52,568][49750] Updated weights for policy 0, policy_version 261451 (0.0032) [2024-04-26 20:16:55,484][49750] Updated weights for policy 0, policy_version 261461 (0.0036) [2024-04-26 20:16:57,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4283858944. Throughput: 0: 50738.4. Samples: 2036691440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:16:57,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 20:16:59,028][49750] Updated weights for policy 0, policy_version 261471 (0.0031) [2024-04-26 20:17:01,807][49750] Updated weights for policy 0, policy_version 261481 (0.0034) [2024-04-26 20:17:02,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4284104704. Throughput: 0: 50613.6. Samples: 2036995480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:17:02,072][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 20:17:05,340][49750] Updated weights for policy 0, policy_version 261491 (0.0026) [2024-04-26 20:17:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4284350464. Throughput: 0: 50673.8. Samples: 2037152720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:17:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 20:17:08,170][49750] Updated weights for policy 0, policy_version 261501 (0.0031) [2024-04-26 20:17:11,721][49750] Updated weights for policy 0, policy_version 261511 (0.0031) [2024-04-26 20:17:12,063][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4284596224. Throughput: 0: 50933.8. Samples: 2037460320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:17:12,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:17:14,690][49750] Updated weights for policy 0, policy_version 261521 (0.0034) [2024-04-26 20:17:17,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4284841984. Throughput: 0: 50876.0. Samples: 2037757420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:17:17,063][49517] Avg episode reward: [(0, '0.457')] [2024-04-26 20:17:18,482][49750] Updated weights for policy 0, policy_version 261531 (0.0038) [2024-04-26 20:17:21,012][49750] Updated weights for policy 0, policy_version 261541 (0.0025) [2024-04-26 20:17:22,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4285136896. Throughput: 0: 50658.7. Samples: 2037916960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:17:22,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 20:17:25,125][49750] Updated weights for policy 0, policy_version 261551 (0.0029) [2024-04-26 20:17:27,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4285382656. Throughput: 0: 50912.2. Samples: 2038228040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 20:17:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 20:17:27,458][49750] Updated weights for policy 0, policy_version 261561 (0.0028) [2024-04-26 20:17:28,339][49728] Signal inference workers to stop experience collection... (30350 times) [2024-04-26 20:17:28,339][49728] Signal inference workers to resume experience collection... (30350 times) [2024-04-26 20:17:28,367][49750] InferenceWorker_p0-w0: stopping experience collection (30350 times) [2024-04-26 20:17:28,367][49750] InferenceWorker_p0-w0: resuming experience collection (30350 times) [2024-04-26 20:17:31,467][49750] Updated weights for policy 0, policy_version 261571 (0.0033) [2024-04-26 20:17:32,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4285595648. Throughput: 0: 50761.5. Samples: 2038529440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:17:32,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 20:17:33,813][49750] Updated weights for policy 0, policy_version 261581 (0.0036) [2024-04-26 20:17:37,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4285874176. Throughput: 0: 50789.6. Samples: 2038670560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:17:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 20:17:37,961][49750] Updated weights for policy 0, policy_version 261591 (0.0038) [2024-04-26 20:17:40,165][49750] Updated weights for policy 0, policy_version 261601 (0.0032) [2024-04-26 20:17:42,063][49517] Fps is (10 sec: 55704.5, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 4286152704. Throughput: 0: 50801.1. Samples: 2038977500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:17:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 20:17:44,473][49750] Updated weights for policy 0, policy_version 261611 (0.0036) [2024-04-26 20:17:47,026][49750] Updated weights for policy 0, policy_version 261621 (0.0034) [2024-04-26 20:17:47,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4286398464. Throughput: 0: 50826.8. Samples: 2039282680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:17:47,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 20:17:50,791][49750] Updated weights for policy 0, policy_version 261631 (0.0027) [2024-04-26 20:17:52,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4286644224. Throughput: 0: 50931.2. Samples: 2039444620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:17:52,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 20:17:53,651][49750] Updated weights for policy 0, policy_version 261641 (0.0027) [2024-04-26 20:17:57,063][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4286873600. Throughput: 0: 50744.5. Samples: 2039743820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:17:57,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 20:17:57,112][49750] Updated weights for policy 0, policy_version 261651 (0.0037) [2024-04-26 20:18:00,005][49750] Updated weights for policy 0, policy_version 261661 (0.0028) [2024-04-26 20:18:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.6, 300 sec: 50762.6). Total num frames: 4287135744. Throughput: 0: 50897.1. Samples: 2040047780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:18:02,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 20:18:03,684][49750] Updated weights for policy 0, policy_version 261671 (0.0032) [2024-04-26 20:18:06,328][49750] Updated weights for policy 0, policy_version 261681 (0.0032) [2024-04-26 20:18:07,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4287414272. Throughput: 0: 50667.1. Samples: 2040196980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:18:07,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 20:18:10,177][49750] Updated weights for policy 0, policy_version 261691 (0.0033) [2024-04-26 20:18:12,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4287676416. Throughput: 0: 50654.4. Samples: 2040507480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:18:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 20:18:12,879][49750] Updated weights for policy 0, policy_version 261701 (0.0030) [2024-04-26 20:18:16,528][49750] Updated weights for policy 0, policy_version 261711 (0.0037) [2024-04-26 20:18:17,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 4287905792. Throughput: 0: 50704.0. Samples: 2040811120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:18:17,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 20:18:19,418][49750] Updated weights for policy 0, policy_version 261721 (0.0034) [2024-04-26 20:18:22,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4288135168. Throughput: 0: 50756.0. Samples: 2040954580. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:18:22,071][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:18:23,089][49750] Updated weights for policy 0, policy_version 261731 (0.0030) [2024-04-26 20:18:25,781][49750] Updated weights for policy 0, policy_version 261741 (0.0034) [2024-04-26 20:18:27,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4288413696. Throughput: 0: 50635.1. Samples: 2041256080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:18:27,072][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 20:18:29,584][49750] Updated weights for policy 0, policy_version 261751 (0.0032) [2024-04-26 20:18:32,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.5, 300 sec: 50762.7). Total num frames: 4288675840. Throughput: 0: 50558.3. Samples: 2041557800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:18:32,063][49517] Avg episode reward: [(0, '0.684')] [2024-04-26 20:18:32,105][49750] Updated weights for policy 0, policy_version 261761 (0.0031) [2024-04-26 20:18:36,031][49750] Updated weights for policy 0, policy_version 261771 (0.0030) [2024-04-26 20:18:37,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4288937984. Throughput: 0: 50649.3. Samples: 2041723840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:18:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 20:18:38,604][49750] Updated weights for policy 0, policy_version 261781 (0.0030) [2024-04-26 20:18:42,062][49517] Fps is (10 sec: 45874.9, 60 sec: 49698.2, 300 sec: 50651.5). Total num frames: 4289134592. Throughput: 0: 50763.5. Samples: 2042028180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 20:18:42,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 20:18:42,103][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000261789_4289150976.pth... 
[2024-04-26 20:18:42,156][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000261049_4277026816.pth [2024-04-26 20:18:42,478][49750] Updated weights for policy 0, policy_version 261791 (0.0029) [2024-04-26 20:18:45,055][49750] Updated weights for policy 0, policy_version 261801 (0.0029) [2024-04-26 20:18:47,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4289429504. Throughput: 0: 50698.1. Samples: 2042329200. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:18:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 20:18:48,445][49728] Signal inference workers to stop experience collection... (30400 times) [2024-04-26 20:18:48,445][49728] Signal inference workers to resume experience collection... (30400 times) [2024-04-26 20:18:48,474][49750] InferenceWorker_p0-w0: stopping experience collection (30400 times) [2024-04-26 20:18:48,474][49750] InferenceWorker_p0-w0: resuming experience collection (30400 times) [2024-04-26 20:18:48,744][49750] Updated weights for policy 0, policy_version 261811 (0.0033) [2024-04-26 20:18:51,516][49750] Updated weights for policy 0, policy_version 261821 (0.0033) [2024-04-26 20:18:52,062][49517] Fps is (10 sec: 55706.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4289691648. Throughput: 0: 50716.7. Samples: 2042479220. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:18:52,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 20:18:55,098][49750] Updated weights for policy 0, policy_version 261831 (0.0038) [2024-04-26 20:18:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4289953792. Throughput: 0: 50707.5. Samples: 2042789320. 
Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:18:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:18:57,865][49750] Updated weights for policy 0, policy_version 261841 (0.0036) [2024-04-26 20:19:01,564][49750] Updated weights for policy 0, policy_version 261851 (0.0035) [2024-04-26 20:19:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4290183168. Throughput: 0: 50782.7. Samples: 2043096340. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 20:19:04,409][49750] Updated weights for policy 0, policy_version 261861 (0.0029) [2024-04-26 20:19:07,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4290428928. Throughput: 0: 50749.3. Samples: 2043238300. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 20:19:07,906][49750] Updated weights for policy 0, policy_version 261871 (0.0033) [2024-04-26 20:19:10,930][49750] Updated weights for policy 0, policy_version 261881 (0.0037) [2024-04-26 20:19:12,062][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4290674688. Throughput: 0: 50953.4. Samples: 2043548980. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:12,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 20:19:14,235][49750] Updated weights for policy 0, policy_version 261891 (0.0029) [2024-04-26 20:19:17,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4290953216. Throughput: 0: 50869.4. Samples: 2043846920. 
Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:17,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 20:19:17,279][49750] Updated weights for policy 0, policy_version 261901 (0.0032) [2024-04-26 20:19:20,734][49750] Updated weights for policy 0, policy_version 261911 (0.0030) [2024-04-26 20:19:22,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4291231744. Throughput: 0: 50886.3. Samples: 2044013720. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:22,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 20:19:23,780][49750] Updated weights for policy 0, policy_version 261921 (0.0029) [2024-04-26 20:19:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4291444736. Throughput: 0: 50830.8. Samples: 2044315560. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:27,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:19:27,295][49750] Updated weights for policy 0, policy_version 261931 (0.0031) [2024-04-26 20:19:30,178][49750] Updated weights for policy 0, policy_version 261941 (0.0028) [2024-04-26 20:19:32,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4291690496. Throughput: 0: 50869.0. Samples: 2044618300. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 20:19:33,674][49750] Updated weights for policy 0, policy_version 261951 (0.0034) [2024-04-26 20:19:36,628][49750] Updated weights for policy 0, policy_version 261961 (0.0032) [2024-04-26 20:19:37,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4291969024. Throughput: 0: 50843.1. Samples: 2044767160. 
Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 20:19:40,087][49750] Updated weights for policy 0, policy_version 261971 (0.0036) [2024-04-26 20:19:42,062][49517] Fps is (10 sec: 54066.6, 60 sec: 51609.6, 300 sec: 50762.6). Total num frames: 4292231168. Throughput: 0: 50586.2. Samples: 2045065700. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 20:19:43,014][49750] Updated weights for policy 0, policy_version 261981 (0.0028) [2024-04-26 20:19:46,541][49750] Updated weights for policy 0, policy_version 261991 (0.0033) [2024-04-26 20:19:47,063][49517] Fps is (10 sec: 50788.8, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 4292476928. Throughput: 0: 50550.3. Samples: 2045371120. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:19:49,524][49750] Updated weights for policy 0, policy_version 262001 (0.0034) [2024-04-26 20:19:52,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4292706304. Throughput: 0: 50716.9. Samples: 2045520560. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-04-26 20:19:52,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:19:52,792][49728] Signal inference workers to stop experience collection... (30450 times) [2024-04-26 20:19:52,793][49728] Signal inference workers to resume experience collection... 
(30450 times) [2024-04-26 20:19:52,816][49750] InferenceWorker_p0-w0: stopping experience collection (30450 times) [2024-04-26 20:19:52,816][49750] InferenceWorker_p0-w0: resuming experience collection (30450 times) [2024-04-26 20:19:52,920][49750] Updated weights for policy 0, policy_version 262011 (0.0032) [2024-04-26 20:19:56,219][49750] Updated weights for policy 0, policy_version 262021 (0.0033) [2024-04-26 20:19:57,063][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4292968448. Throughput: 0: 50763.0. Samples: 2045833320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:19:57,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 20:19:59,259][49750] Updated weights for policy 0, policy_version 262031 (0.0026) [2024-04-26 20:20:02,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4293230592. Throughput: 0: 50785.2. Samples: 2046132260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:02,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 20:20:02,959][49750] Updated weights for policy 0, policy_version 262041 (0.0033) [2024-04-26 20:20:05,726][49750] Updated weights for policy 0, policy_version 262051 (0.0026) [2024-04-26 20:20:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4293492736. Throughput: 0: 50741.3. Samples: 2046297080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:07,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 20:20:09,724][49750] Updated weights for policy 0, policy_version 262061 (0.0031) [2024-04-26 20:20:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4293754880. Throughput: 0: 50716.6. Samples: 2046597820. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:12,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:20:12,124][49750] Updated weights for policy 0, policy_version 262071 (0.0029) [2024-04-26 20:20:16,162][49750] Updated weights for policy 0, policy_version 262081 (0.0027) [2024-04-26 20:20:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4293967872. Throughput: 0: 50761.3. Samples: 2046902560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:17,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 20:20:18,622][49750] Updated weights for policy 0, policy_version 262091 (0.0029) [2024-04-26 20:20:22,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4294246400. Throughput: 0: 50565.6. Samples: 2047042620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:22,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 20:20:22,536][49750] Updated weights for policy 0, policy_version 262101 (0.0033) [2024-04-26 20:20:25,163][49750] Updated weights for policy 0, policy_version 262111 (0.0028) [2024-04-26 20:20:27,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51063.2, 300 sec: 50707.1). Total num frames: 4294508544. Throughput: 0: 50565.6. Samples: 2047341160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 20:20:29,052][49750] Updated weights for policy 0, policy_version 262121 (0.0029) [2024-04-26 20:20:31,566][49750] Updated weights for policy 0, policy_version 262131 (0.0031) [2024-04-26 20:20:32,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4294770688. Throughput: 0: 50644.2. Samples: 2047650100. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:32,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 20:20:35,509][49750] Updated weights for policy 0, policy_version 262141 (0.0029) [2024-04-26 20:20:37,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4295000064. Throughput: 0: 50823.3. Samples: 2047807600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:20:38,122][49750] Updated weights for policy 0, policy_version 262151 (0.0028) [2024-04-26 20:20:41,909][49750] Updated weights for policy 0, policy_version 262161 (0.0024) [2024-04-26 20:20:42,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4295245824. Throughput: 0: 50534.9. Samples: 2048107380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:42,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 20:20:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000262161_4295245824.pth... [2024-04-26 20:20:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000261419_4283088896.pth [2024-04-26 20:20:44,472][49750] Updated weights for policy 0, policy_version 262171 (0.0028) [2024-04-26 20:20:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.5, 300 sec: 50651.5). Total num frames: 4295507968. Throughput: 0: 50492.9. Samples: 2048404440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 20:20:48,250][49750] Updated weights for policy 0, policy_version 262181 (0.0028) [2024-04-26 20:20:50,817][49750] Updated weights for policy 0, policy_version 262191 (0.0034) [2024-04-26 20:20:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4295753728. Throughput: 0: 50382.4. Samples: 2048564280. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:52,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 20:20:54,883][49750] Updated weights for policy 0, policy_version 262201 (0.0033) [2024-04-26 20:20:56,925][49728] Signal inference workers to stop experience collection... (30500 times) [2024-04-26 20:20:56,926][49728] Signal inference workers to resume experience collection... (30500 times) [2024-04-26 20:20:56,954][49750] InferenceWorker_p0-w0: stopping experience collection (30500 times) [2024-04-26 20:20:56,954][49750] InferenceWorker_p0-w0: resuming experience collection (30500 times) [2024-04-26 20:20:57,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4296048640. Throughput: 0: 50511.4. Samples: 2048870820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:20:57,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 20:20:57,262][49750] Updated weights for policy 0, policy_version 262211 (0.0029) [2024-04-26 20:21:01,417][49750] Updated weights for policy 0, policy_version 262221 (0.0028) [2024-04-26 20:21:02,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4296261632. Throughput: 0: 50532.8. Samples: 2049176540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-04-26 20:21:02,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 20:21:03,732][49750] Updated weights for policy 0, policy_version 262231 (0.0027) [2024-04-26 20:21:07,063][49517] Fps is (10 sec: 45871.1, 60 sec: 50243.7, 300 sec: 50595.9). Total num frames: 4296507392. Throughput: 0: 50588.2. Samples: 2049319120. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:07,064][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 20:21:08,036][49750] Updated weights for policy 0, policy_version 262241 (0.0035) [2024-04-26 20:21:10,243][49750] Updated weights for policy 0, policy_version 262251 (0.0040) [2024-04-26 20:21:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4296785920. Throughput: 0: 50611.4. Samples: 2049618660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:12,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 20:21:14,487][49750] Updated weights for policy 0, policy_version 262261 (0.0037) [2024-04-26 20:21:16,585][49750] Updated weights for policy 0, policy_version 262271 (0.0029) [2024-04-26 20:21:17,063][49517] Fps is (10 sec: 55709.4, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4297064448. Throughput: 0: 50506.7. Samples: 2049922900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:17,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 20:21:20,895][49750] Updated weights for policy 0, policy_version 262281 (0.0027) [2024-04-26 20:21:22,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4297293824. Throughput: 0: 50726.1. Samples: 2050090280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:22,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 20:21:23,047][49750] Updated weights for policy 0, policy_version 262291 (0.0032) [2024-04-26 20:21:27,062][49517] Fps is (10 sec: 44237.3, 60 sec: 49971.4, 300 sec: 50596.0). Total num frames: 4297506816. Throughput: 0: 50785.8. Samples: 2050392740. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 20:21:27,427][49750] Updated weights for policy 0, policy_version 262301 (0.0033) [2024-04-26 20:21:29,744][49750] Updated weights for policy 0, policy_version 262311 (0.0039) [2024-04-26 20:21:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4297785344. Throughput: 0: 50728.2. Samples: 2050687200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:32,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 20:21:33,798][49750] Updated weights for policy 0, policy_version 262321 (0.0033) [2024-04-26 20:21:36,062][49750] Updated weights for policy 0, policy_version 262331 (0.0028) [2024-04-26 20:21:37,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4298063872. Throughput: 0: 50697.4. Samples: 2050845660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:37,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 20:21:40,337][49750] Updated weights for policy 0, policy_version 262341 (0.0033) [2024-04-26 20:21:42,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4298326016. Throughput: 0: 50763.5. Samples: 2051155180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:42,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:21:42,423][49750] Updated weights for policy 0, policy_version 262351 (0.0034) [2024-04-26 20:21:46,642][49750] Updated weights for policy 0, policy_version 262361 (0.0035) [2024-04-26 20:21:47,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4298539008. Throughput: 0: 50785.4. Samples: 2051461880. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 20:21:48,983][49750] Updated weights for policy 0, policy_version 262371 (0.0033) [2024-04-26 20:21:52,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4298801152. Throughput: 0: 50744.4. Samples: 2051602580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:52,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 20:21:53,222][49750] Updated weights for policy 0, policy_version 262381 (0.0036) [2024-04-26 20:21:55,459][49750] Updated weights for policy 0, policy_version 262391 (0.0028) [2024-04-26 20:21:57,062][49517] Fps is (10 sec: 52428.1, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4299063296. Throughput: 0: 50674.5. Samples: 2051899020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:21:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 20:21:59,622][49750] Updated weights for policy 0, policy_version 262401 (0.0029) [2024-04-26 20:22:01,460][49728] Signal inference workers to stop experience collection... (30550 times) [2024-04-26 20:22:01,460][49728] Signal inference workers to resume experience collection... (30550 times) [2024-04-26 20:22:01,471][49750] InferenceWorker_p0-w0: stopping experience collection (30550 times) [2024-04-26 20:22:01,471][49750] InferenceWorker_p0-w0: resuming experience collection (30550 times) [2024-04-26 20:22:01,808][49750] Updated weights for policy 0, policy_version 262411 (0.0032) [2024-04-26 20:22:02,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4299341824. Throughput: 0: 50640.4. Samples: 2052201720. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:22:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 20:22:06,182][49750] Updated weights for policy 0, policy_version 262421 (0.0027) [2024-04-26 20:22:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51064.2, 300 sec: 50762.7). Total num frames: 4299571200. Throughput: 0: 50657.0. Samples: 2052369840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:22:07,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 20:22:08,186][49750] Updated weights for policy 0, policy_version 262431 (0.0030) [2024-04-26 20:22:12,062][49517] Fps is (10 sec: 44237.2, 60 sec: 49971.1, 300 sec: 50651.6). Total num frames: 4299784192. Throughput: 0: 50641.3. Samples: 2052671600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-26 20:22:12,071][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 20:22:12,657][49750] Updated weights for policy 0, policy_version 262441 (0.0031) [2024-04-26 20:22:14,728][49750] Updated weights for policy 0, policy_version 262451 (0.0027) [2024-04-26 20:22:17,063][49517] Fps is (10 sec: 47512.5, 60 sec: 49698.1, 300 sec: 50540.5). Total num frames: 4300046336. Throughput: 0: 50699.7. Samples: 2052968700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:22:17,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 20:22:19,191][49750] Updated weights for policy 0, policy_version 262461 (0.0030) [2024-04-26 20:22:21,365][49750] Updated weights for policy 0, policy_version 262471 (0.0026) [2024-04-26 20:22:22,062][49517] Fps is (10 sec: 55705.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4300341248. Throughput: 0: 50655.4. Samples: 2053125160. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:22:22,072][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:22:25,589][49750] Updated weights for policy 0, policy_version 262481 (0.0036) [2024-04-26 20:22:27,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4300587008. Throughput: 0: 50652.9. Samples: 2053434560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:22:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 20:22:27,773][49750] Updated weights for policy 0, policy_version 262491 (0.0028) [2024-04-26 20:22:31,957][49750] Updated weights for policy 0, policy_version 262501 (0.0034) [2024-04-26 20:22:32,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4300816384. Throughput: 0: 50578.1. Samples: 2053737900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:22:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 20:22:34,789][49750] Updated weights for policy 0, policy_version 262511 (0.0028) [2024-04-26 20:22:37,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.1, 300 sec: 50596.0). Total num frames: 4301078528. Throughput: 0: 50594.6. Samples: 2053879340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:22:37,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 20:22:38,500][49750] Updated weights for policy 0, policy_version 262521 (0.0039) [2024-04-26 20:22:41,165][49750] Updated weights for policy 0, policy_version 262531 (0.0033) [2024-04-26 20:22:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4301340672. Throughput: 0: 50694.3. Samples: 2054180260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:22:42,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 20:22:42,114][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000262534_4301357056.pth... 
[2024-04-26 20:22:42,163][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000261789_4289150976.pth [2024-04-26 20:22:44,814][49750] Updated weights for policy 0, policy_version 262541 (0.0034) [2024-04-26 20:22:47,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4301619200. Throughput: 0: 50807.6. Samples: 2054488060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:22:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 20:22:47,621][49750] Updated weights for policy 0, policy_version 262551 (0.0033) [2024-04-26 20:22:51,233][49750] Updated weights for policy 0, policy_version 262561 (0.0035) [2024-04-26 20:22:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4301848576. Throughput: 0: 50465.2. Samples: 2054640780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:22:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 20:22:54,103][49750] Updated weights for policy 0, policy_version 262571 (0.0034) [2024-04-26 20:22:57,063][49517] Fps is (10 sec: 45875.2, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4302077952. Throughput: 0: 50752.4. Samples: 2054955460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:22:57,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 20:22:57,555][49750] Updated weights for policy 0, policy_version 262581 (0.0031) [2024-04-26 20:22:58,376][49728] Signal inference workers to stop experience collection... (30600 times) [2024-04-26 20:22:58,378][49728] Signal inference workers to resume experience collection... 
(30600 times) [2024-04-26 20:22:58,418][49750] InferenceWorker_p0-w0: stopping experience collection (30600 times) [2024-04-26 20:22:58,418][49750] InferenceWorker_p0-w0: resuming experience collection (30600 times) [2024-04-26 20:23:00,393][49750] Updated weights for policy 0, policy_version 262591 (0.0035) [2024-04-26 20:23:02,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.3, 300 sec: 50596.0). Total num frames: 4302340096. Throughput: 0: 50885.0. Samples: 2055258520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:23:02,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 20:23:03,945][49750] Updated weights for policy 0, policy_version 262601 (0.0028) [2024-04-26 20:23:06,679][49750] Updated weights for policy 0, policy_version 262611 (0.0030) [2024-04-26 20:23:07,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4302618624. Throughput: 0: 50747.5. Samples: 2055408800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:23:07,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 20:23:10,601][49750] Updated weights for policy 0, policy_version 262621 (0.0035) [2024-04-26 20:23:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 4302864384. Throughput: 0: 50725.4. Samples: 2055717200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:23:12,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 20:23:13,080][49750] Updated weights for policy 0, policy_version 262631 (0.0032) [2024-04-26 20:23:17,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4303093760. Throughput: 0: 50728.1. Samples: 2056020660. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:23:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:23:17,126][49750] Updated weights for policy 0, policy_version 262641 (0.0030) [2024-04-26 20:23:19,525][49750] Updated weights for policy 0, policy_version 262651 (0.0029) [2024-04-26 20:23:22,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4303355904. Throughput: 0: 50714.7. Samples: 2056161500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 20:23:22,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 20:23:23,389][49750] Updated weights for policy 0, policy_version 262661 (0.0031) [2024-04-26 20:23:25,788][49750] Updated weights for policy 0, policy_version 262671 (0.0025) [2024-04-26 20:23:27,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4303618048. Throughput: 0: 50878.7. Samples: 2056469800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:23:27,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 20:23:29,793][49750] Updated weights for policy 0, policy_version 262681 (0.0030) [2024-04-26 20:23:32,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 4303896576. Throughput: 0: 50875.7. Samples: 2056777460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:23:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:23:32,190][49750] Updated weights for policy 0, policy_version 262691 (0.0032) [2024-04-26 20:23:36,145][49750] Updated weights for policy 0, policy_version 262701 (0.0035) [2024-04-26 20:23:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4304142336. Throughput: 0: 50978.2. Samples: 2056934800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:23:37,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 20:23:38,945][49750] Updated weights for policy 0, policy_version 262711 (0.0032) [2024-04-26 20:23:42,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4304388096. Throughput: 0: 50828.9. Samples: 2057242760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:23:42,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 20:23:42,592][49750] Updated weights for policy 0, policy_version 262721 (0.0027) [2024-04-26 20:23:45,791][49750] Updated weights for policy 0, policy_version 262731 (0.0032) [2024-04-26 20:23:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4304633856. Throughput: 0: 50880.6. Samples: 2057548140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:23:47,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 20:23:48,943][49750] Updated weights for policy 0, policy_version 262741 (0.0037) [2024-04-26 20:23:52,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 4304896000. Throughput: 0: 50865.2. Samples: 2057697740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:23:52,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:23:52,220][49750] Updated weights for policy 0, policy_version 262751 (0.0027) [2024-04-26 20:23:53,582][49728] Signal inference workers to stop experience collection... (30650 times) [2024-04-26 20:23:53,583][49728] Signal inference workers to resume experience collection... 
(30650 times) [2024-04-26 20:23:53,594][49750] InferenceWorker_p0-w0: stopping experience collection (30650 times) [2024-04-26 20:23:53,610][49750] InferenceWorker_p0-w0: resuming experience collection (30650 times) [2024-04-26 20:23:55,230][49750] Updated weights for policy 0, policy_version 262761 (0.0030) [2024-04-26 20:23:57,062][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4305158144. Throughput: 0: 50741.7. Samples: 2058000580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:23:57,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 20:23:58,524][49750] Updated weights for policy 0, policy_version 262771 (0.0033) [2024-04-26 20:24:01,671][49750] Updated weights for policy 0, policy_version 262781 (0.0032) [2024-04-26 20:24:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4305403904. Throughput: 0: 50791.0. Samples: 2058306260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:24:02,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 20:24:04,967][49750] Updated weights for policy 0, policy_version 262791 (0.0029) [2024-04-26 20:24:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4305649664. Throughput: 0: 51009.5. Samples: 2058456920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:24:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 20:24:08,260][49750] Updated weights for policy 0, policy_version 262801 (0.0028) [2024-04-26 20:24:11,546][49750] Updated weights for policy 0, policy_version 262811 (0.0028) [2024-04-26 20:24:12,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 4305895424. Throughput: 0: 50894.5. Samples: 2058760060. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:24:12,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 20:24:14,854][49750] Updated weights for policy 0, policy_version 262821 (0.0032) [2024-04-26 20:24:17,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.5, 300 sec: 50651.6). Total num frames: 4306173952. Throughput: 0: 50894.6. Samples: 2059067720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:24:17,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 20:24:17,976][49750] Updated weights for policy 0, policy_version 262831 (0.0032) [2024-04-26 20:24:21,332][49750] Updated weights for policy 0, policy_version 262841 (0.0032) [2024-04-26 20:24:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4306419712. Throughput: 0: 50901.4. Samples: 2059225360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:24:22,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 20:24:24,265][49750] Updated weights for policy 0, policy_version 262851 (0.0033) [2024-04-26 20:24:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4306681856. Throughput: 0: 50874.3. Samples: 2059532100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:24:27,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 20:24:27,762][49750] Updated weights for policy 0, policy_version 262861 (0.0033) [2024-04-26 20:24:30,749][49750] Updated weights for policy 0, policy_version 262871 (0.0036) [2024-04-26 20:24:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4306911232. Throughput: 0: 50922.2. Samples: 2059839640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:24:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 20:24:34,087][49750] Updated weights for policy 0, policy_version 262881 (0.0033) [2024-04-26 20:24:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4307189760. Throughput: 0: 50868.7. Samples: 2059986820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-26 20:24:37,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 20:24:37,173][49750] Updated weights for policy 0, policy_version 262891 (0.0032) [2024-04-26 20:24:40,382][49750] Updated weights for policy 0, policy_version 262901 (0.0034) [2024-04-26 20:24:42,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 4307451904. Throughput: 0: 50902.8. Samples: 2060291200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:24:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 20:24:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000262906_4307451904.pth... [2024-04-26 20:24:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000262161_4295245824.pth [2024-04-26 20:24:43,598][49750] Updated weights for policy 0, policy_version 262911 (0.0031) [2024-04-26 20:24:46,819][49750] Updated weights for policy 0, policy_version 262921 (0.0028) [2024-04-26 20:24:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4307697664. Throughput: 0: 50950.7. Samples: 2060599040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:24:47,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:24:50,646][49750] Updated weights for policy 0, policy_version 262931 (0.0034) [2024-04-26 20:24:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 4307959808. Throughput: 0: 51050.6. Samples: 2060754200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:24:52,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 20:24:53,312][49750] Updated weights for policy 0, policy_version 262941 (0.0031) [2024-04-26 20:24:56,881][49750] Updated weights for policy 0, policy_version 262951 (0.0031) [2024-04-26 20:24:57,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4308189184. Throughput: 0: 50947.0. Samples: 2061052680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:24:57,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 20:24:59,441][49728] Signal inference workers to stop experience collection... (30700 times) [2024-04-26 20:24:59,475][49750] InferenceWorker_p0-w0: stopping experience collection (30700 times) [2024-04-26 20:24:59,506][49728] Signal inference workers to resume experience collection... (30700 times) [2024-04-26 20:24:59,507][49750] InferenceWorker_p0-w0: resuming experience collection (30700 times) [2024-04-26 20:24:59,638][49750] Updated weights for policy 0, policy_version 262961 (0.0034) [2024-04-26 20:25:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4308451328. Throughput: 0: 50898.7. Samples: 2061358160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:25:03,325][49750] Updated weights for policy 0, policy_version 262971 (0.0035) [2024-04-26 20:25:05,969][49750] Updated weights for policy 0, policy_version 262981 (0.0033) [2024-04-26 20:25:07,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.5, 300 sec: 50762.7). Total num frames: 4308729856. Throughput: 0: 50914.2. Samples: 2061516500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:07,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 20:25:09,713][49750] Updated weights for policy 0, policy_version 262991 (0.0030) [2024-04-26 20:25:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4308959232. Throughput: 0: 50844.8. Samples: 2061820120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:12,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 20:25:12,458][49750] Updated weights for policy 0, policy_version 263001 (0.0025) [2024-04-26 20:25:15,993][49750] Updated weights for policy 0, policy_version 263011 (0.0034) [2024-04-26 20:25:17,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4309221376. Throughput: 0: 50784.7. Samples: 2062124960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:17,063][49517] Avg episode reward: [(0, '0.691')] [2024-04-26 20:25:18,879][49750] Updated weights for policy 0, policy_version 263021 (0.0035) [2024-04-26 20:25:22,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4309467136. Throughput: 0: 50868.5. Samples: 2062275900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:22,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 20:25:22,289][49750] Updated weights for policy 0, policy_version 263031 (0.0033) [2024-04-26 20:25:25,205][49750] Updated weights for policy 0, policy_version 263041 (0.0031) [2024-04-26 20:25:27,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4309745664. Throughput: 0: 50819.9. Samples: 2062578100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:27,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 20:25:28,752][49750] Updated weights for policy 0, policy_version 263051 (0.0032) [2024-04-26 20:25:31,533][49750] Updated weights for policy 0, policy_version 263061 (0.0028) [2024-04-26 20:25:32,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4309991424. Throughput: 0: 50759.9. Samples: 2062883240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:32,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 20:25:35,180][49750] Updated weights for policy 0, policy_version 263071 (0.0031) [2024-04-26 20:25:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4310237184. Throughput: 0: 50891.5. Samples: 2063044320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 20:25:38,038][49750] Updated weights for policy 0, policy_version 263081 (0.0026) [2024-04-26 20:25:41,660][49750] Updated weights for policy 0, policy_version 263091 (0.0033) [2024-04-26 20:25:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 4310499328. Throughput: 0: 51053.8. Samples: 2063350100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:42,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 20:25:44,487][49750] Updated weights for policy 0, policy_version 263101 (0.0030) [2024-04-26 20:25:47,063][49517] Fps is (10 sec: 49149.2, 60 sec: 50516.9, 300 sec: 50762.5). Total num frames: 4310728704. Throughput: 0: 50972.2. Samples: 2063651940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:25:47,064][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 20:25:48,070][49750] Updated weights for policy 0, policy_version 263111 (0.0032) [2024-04-26 20:25:50,842][49750] Updated weights for policy 0, policy_version 263121 (0.0031) [2024-04-26 20:25:52,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.1, 300 sec: 50651.5). Total num frames: 4310990848. Throughput: 0: 50734.0. Samples: 2063799540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:25:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 20:25:54,559][49750] Updated weights for policy 0, policy_version 263131 (0.0028) [2024-04-26 20:25:57,062][49517] Fps is (10 sec: 54070.1, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4311269376. Throughput: 0: 50756.5. Samples: 2064104160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:25:57,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 20:25:57,227][49750] Updated weights for policy 0, policy_version 263141 (0.0030) [2024-04-26 20:26:01,065][49750] Updated weights for policy 0, policy_version 263151 (0.0030) [2024-04-26 20:26:02,062][49517] Fps is (10 sec: 52430.4, 60 sec: 51063.5, 300 sec: 50873.9). Total num frames: 4311515136. Throughput: 0: 50915.3. Samples: 2064416140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:02,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 20:26:03,669][49750] Updated weights for policy 0, policy_version 263161 (0.0040) [2024-04-26 20:26:05,199][49728] Signal inference workers to stop experience collection... (30750 times) [2024-04-26 20:26:05,199][49728] Signal inference workers to resume experience collection... 
(30750 times) [2024-04-26 20:26:05,211][49750] InferenceWorker_p0-w0: stopping experience collection (30750 times) [2024-04-26 20:26:05,231][49750] InferenceWorker_p0-w0: resuming experience collection (30750 times) [2024-04-26 20:26:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4311760896. Throughput: 0: 50702.0. Samples: 2064557500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:07,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 20:26:07,376][49750] Updated weights for policy 0, policy_version 263171 (0.0031) [2024-04-26 20:26:10,280][49750] Updated weights for policy 0, policy_version 263181 (0.0034) [2024-04-26 20:26:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 4312006656. Throughput: 0: 50688.6. Samples: 2064859080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 20:26:13,876][49750] Updated weights for policy 0, policy_version 263191 (0.0033) [2024-04-26 20:26:16,965][49750] Updated weights for policy 0, policy_version 263201 (0.0031) [2024-04-26 20:26:17,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4312285184. Throughput: 0: 50733.9. Samples: 2065166260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:26:20,274][49750] Updated weights for policy 0, policy_version 263211 (0.0031) [2024-04-26 20:26:22,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4312530944. Throughput: 0: 50738.7. Samples: 2065327560. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:22,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 20:26:23,440][49750] Updated weights for policy 0, policy_version 263221 (0.0029) [2024-04-26 20:26:26,622][49750] Updated weights for policy 0, policy_version 263231 (0.0031) [2024-04-26 20:26:27,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4312776704. Throughput: 0: 50720.9. Samples: 2065632540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 20:26:29,919][49750] Updated weights for policy 0, policy_version 263241 (0.0032) [2024-04-26 20:26:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4313022464. Throughput: 0: 50886.4. Samples: 2065941800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:32,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 20:26:32,886][49750] Updated weights for policy 0, policy_version 263251 (0.0029) [2024-04-26 20:26:36,382][49750] Updated weights for policy 0, policy_version 263261 (0.0034) [2024-04-26 20:26:37,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4313268224. Throughput: 0: 50799.9. Samples: 2066085520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:37,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 20:26:39,412][49750] Updated weights for policy 0, policy_version 263271 (0.0029) [2024-04-26 20:26:42,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4313563136. Throughput: 0: 50885.4. Samples: 2066394000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:42,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:26:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000263279_4313563136.pth... 
[2024-04-26 20:26:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000262534_4301357056.pth [2024-04-26 20:26:42,678][49750] Updated weights for policy 0, policy_version 263281 (0.0037) [2024-04-26 20:26:46,143][49750] Updated weights for policy 0, policy_version 263291 (0.0038) [2024-04-26 20:26:47,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51337.0, 300 sec: 50873.7). Total num frames: 4313808896. Throughput: 0: 50651.0. Samples: 2066695440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:47,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 20:26:49,033][49750] Updated weights for policy 0, policy_version 263301 (0.0026) [2024-04-26 20:26:52,063][49517] Fps is (10 sec: 49151.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4314054656. Throughput: 0: 50947.5. Samples: 2066850140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:52,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:26:52,698][49750] Updated weights for policy 0, policy_version 263311 (0.0029) [2024-04-26 20:26:55,513][49750] Updated weights for policy 0, policy_version 263321 (0.0037) [2024-04-26 20:26:57,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 4314267648. Throughput: 0: 50941.3. Samples: 2067151440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-04-26 20:26:57,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 20:26:59,270][49750] Updated weights for policy 0, policy_version 263331 (0.0027) [2024-04-26 20:27:01,948][49750] Updated weights for policy 0, policy_version 263341 (0.0029) [2024-04-26 20:27:02,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4314578944. Throughput: 0: 50865.9. Samples: 2067455220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 20:27:05,530][49750] Updated weights for policy 0, policy_version 263351 (0.0038) [2024-04-26 20:27:07,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51063.6, 300 sec: 50984.8). Total num frames: 4314824704. Throughput: 0: 50709.4. Samples: 2067609480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:07,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 20:27:08,230][49750] Updated weights for policy 0, policy_version 263361 (0.0028) [2024-04-26 20:27:11,691][49728] Signal inference workers to stop experience collection... (30800 times) [2024-04-26 20:27:11,692][49728] Signal inference workers to resume experience collection... (30800 times) [2024-04-26 20:27:11,717][49750] InferenceWorker_p0-w0: stopping experience collection (30800 times) [2024-04-26 20:27:11,717][49750] InferenceWorker_p0-w0: resuming experience collection (30800 times) [2024-04-26 20:27:11,824][49750] Updated weights for policy 0, policy_version 263371 (0.0031) [2024-04-26 20:27:12,063][49517] Fps is (10 sec: 49151.0, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4315070464. Throughput: 0: 50825.3. Samples: 2067919680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:12,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 20:27:15,063][49750] Updated weights for policy 0, policy_version 263381 (0.0032) [2024-04-26 20:27:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4315332608. Throughput: 0: 50856.1. Samples: 2068230320. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:17,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 20:27:18,273][49750] Updated weights for policy 0, policy_version 263391 (0.0027) [2024-04-26 20:27:21,885][49750] Updated weights for policy 0, policy_version 263401 (0.0034) [2024-04-26 20:27:22,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4315561984. Throughput: 0: 50878.7. Samples: 2068375060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:22,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:27:24,705][49750] Updated weights for policy 0, policy_version 263411 (0.0031) [2024-04-26 20:27:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 4315840512. Throughput: 0: 50965.3. Samples: 2068687440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:27,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 20:27:28,221][49750] Updated weights for policy 0, policy_version 263421 (0.0026) [2024-04-26 20:27:30,979][49750] Updated weights for policy 0, policy_version 263431 (0.0031) [2024-04-26 20:27:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4316086272. Throughput: 0: 51021.4. Samples: 2068991400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:32,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 20:27:34,718][49750] Updated weights for policy 0, policy_version 263441 (0.0035) [2024-04-26 20:27:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4316348416. Throughput: 0: 51011.8. Samples: 2069145660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:37,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 20:27:37,310][49750] Updated weights for policy 0, policy_version 263451 (0.0029) [2024-04-26 20:27:41,236][49750] Updated weights for policy 0, policy_version 263461 (0.0033) [2024-04-26 20:27:42,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4316577792. Throughput: 0: 50994.6. Samples: 2069446200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:42,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 20:27:43,848][49750] Updated weights for policy 0, policy_version 263471 (0.0033) [2024-04-26 20:27:47,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4316839936. Throughput: 0: 50840.0. Samples: 2069743020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:47,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 20:27:47,707][49750] Updated weights for policy 0, policy_version 263481 (0.0031) [2024-04-26 20:27:50,859][49750] Updated weights for policy 0, policy_version 263491 (0.0038) [2024-04-26 20:27:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4317102080. Throughput: 0: 50912.3. Samples: 2069900540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:52,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 20:27:54,017][49750] Updated weights for policy 0, policy_version 263501 (0.0029) [2024-04-26 20:27:57,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4317347840. Throughput: 0: 50706.7. Samples: 2070201480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:27:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 20:27:57,222][49750] Updated weights for policy 0, policy_version 263511 (0.0026) [2024-04-26 20:28:00,309][49750] Updated weights for policy 0, policy_version 263521 (0.0029) [2024-04-26 20:28:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4317609984. Throughput: 0: 50665.8. Samples: 2070510280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:28:02,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 20:28:03,722][49750] Updated weights for policy 0, policy_version 263531 (0.0027) [2024-04-26 20:28:06,904][49750] Updated weights for policy 0, policy_version 263541 (0.0031) [2024-04-26 20:28:07,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4317855744. Throughput: 0: 50779.8. Samples: 2070660160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:28:07,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 20:28:07,693][49728] Signal inference workers to stop experience collection... (30850 times) [2024-04-26 20:28:07,735][49750] InferenceWorker_p0-w0: stopping experience collection (30850 times) [2024-04-26 20:28:07,797][49728] Signal inference workers to resume experience collection... (30850 times) [2024-04-26 20:28:07,798][49750] InferenceWorker_p0-w0: resuming experience collection (30850 times) [2024-04-26 20:28:10,162][49750] Updated weights for policy 0, policy_version 263551 (0.0030) [2024-04-26 20:28:12,062][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 4318117888. Throughput: 0: 50615.4. Samples: 2070965140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:12,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:28:13,354][49750] Updated weights for policy 0, policy_version 263561 (0.0031) [2024-04-26 20:28:16,604][49750] Updated weights for policy 0, policy_version 263571 (0.0027) [2024-04-26 20:28:17,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4318363648. Throughput: 0: 50738.7. Samples: 2071274640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:17,063][49517] Avg episode reward: [(0, '0.685')] [2024-04-26 20:28:19,872][49750] Updated weights for policy 0, policy_version 263581 (0.0035) [2024-04-26 20:28:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4318625792. Throughput: 0: 50635.9. Samples: 2071424280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:22,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 20:28:22,954][49750] Updated weights for policy 0, policy_version 263591 (0.0030) [2024-04-26 20:28:26,377][49750] Updated weights for policy 0, policy_version 263601 (0.0032) [2024-04-26 20:28:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4318887936. Throughput: 0: 50701.9. Samples: 2071727780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:27,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 20:28:29,261][49750] Updated weights for policy 0, policy_version 263611 (0.0035) [2024-04-26 20:28:32,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4319133696. Throughput: 0: 50907.9. Samples: 2072033880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:32,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 20:28:32,871][49750] Updated weights for policy 0, policy_version 263621 (0.0034) [2024-04-26 20:28:35,754][49750] Updated weights for policy 0, policy_version 263631 (0.0033) [2024-04-26 20:28:37,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 4319379456. Throughput: 0: 50726.2. Samples: 2072183220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:37,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 20:28:39,174][49750] Updated weights for policy 0, policy_version 263641 (0.0029) [2024-04-26 20:28:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4319641600. Throughput: 0: 50867.6. Samples: 2072490520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 20:28:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000263650_4319641600.pth... [2024-04-26 20:28:42,128][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000262906_4307451904.pth [2024-04-26 20:28:42,282][49750] Updated weights for policy 0, policy_version 263651 (0.0029) [2024-04-26 20:28:45,483][49750] Updated weights for policy 0, policy_version 263661 (0.0029) [2024-04-26 20:28:47,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4319887360. Throughput: 0: 50772.9. Samples: 2072795060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:47,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 20:28:48,621][49750] Updated weights for policy 0, policy_version 263671 (0.0029) [2024-04-26 20:28:51,900][49750] Updated weights for policy 0, policy_version 263681 (0.0030) [2024-04-26 20:28:52,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50818.2). 
Total num frames: 4320149504. Throughput: 0: 50747.6. Samples: 2072943800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:52,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 20:28:55,071][49750] Updated weights for policy 0, policy_version 263691 (0.0028) [2024-04-26 20:28:57,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4320411648. Throughput: 0: 50797.3. Samples: 2073251020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:28:57,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 20:28:58,386][49750] Updated weights for policy 0, policy_version 263701 (0.0028) [2024-04-26 20:29:01,449][49750] Updated weights for policy 0, policy_version 263711 (0.0030) [2024-04-26 20:29:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 4320657408. Throughput: 0: 50745.5. Samples: 2073558200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:29:02,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 20:29:04,711][49750] Updated weights for policy 0, policy_version 263721 (0.0031) [2024-04-26 20:29:07,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4320903168. Throughput: 0: 50927.2. Samples: 2073716000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:29:07,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 20:29:07,979][49750] Updated weights for policy 0, policy_version 263731 (0.0031) [2024-04-26 20:29:10,801][49728] Signal inference workers to stop experience collection... (30900 times) [2024-04-26 20:29:10,850][49750] InferenceWorker_p0-w0: stopping experience collection (30900 times) [2024-04-26 20:29:10,872][49728] Signal inference workers to resume experience collection... 
(30900 times) [2024-04-26 20:29:10,874][49750] InferenceWorker_p0-w0: resuming experience collection (30900 times) [2024-04-26 20:29:11,007][49750] Updated weights for policy 0, policy_version 263741 (0.0033) [2024-04-26 20:29:12,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4321181696. Throughput: 0: 51012.4. Samples: 2074023340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:29:12,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:29:14,327][49750] Updated weights for policy 0, policy_version 263751 (0.0035) [2024-04-26 20:29:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4321411072. Throughput: 0: 50877.5. Samples: 2074323360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:29:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 20:29:17,538][49750] Updated weights for policy 0, policy_version 263761 (0.0029) [2024-04-26 20:29:20,702][49750] Updated weights for policy 0, policy_version 263771 (0.0027) [2024-04-26 20:29:22,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4321656832. Throughput: 0: 50982.8. Samples: 2074477440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 20:29:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 20:29:23,948][49750] Updated weights for policy 0, policy_version 263781 (0.0036) [2024-04-26 20:29:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4321935360. Throughput: 0: 50896.6. Samples: 2074780860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:29:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 20:29:27,155][49750] Updated weights for policy 0, policy_version 263791 (0.0028) [2024-04-26 20:29:30,378][49750] Updated weights for policy 0, policy_version 263801 (0.0034) [2024-04-26 20:29:32,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4322181120. Throughput: 0: 50898.2. Samples: 2075085480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:29:32,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 20:29:33,582][49750] Updated weights for policy 0, policy_version 263811 (0.0028) [2024-04-26 20:29:36,842][49750] Updated weights for policy 0, policy_version 263821 (0.0032) [2024-04-26 20:29:37,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4322443264. Throughput: 0: 51081.8. Samples: 2075242480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:29:37,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 20:29:39,910][49750] Updated weights for policy 0, policy_version 263831 (0.0037) [2024-04-26 20:29:42,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4322689024. Throughput: 0: 50907.6. Samples: 2075541860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:29:42,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 20:29:43,288][49750] Updated weights for policy 0, policy_version 263841 (0.0033) [2024-04-26 20:29:46,573][49750] Updated weights for policy 0, policy_version 263851 (0.0029) [2024-04-26 20:29:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4322951168. Throughput: 0: 50973.5. Samples: 2075852000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:29:47,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 20:29:49,683][49750] Updated weights for policy 0, policy_version 263861 (0.0029) [2024-04-26 20:29:52,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4323196928. Throughput: 0: 50894.9. Samples: 2076006280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:29:52,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 20:29:53,053][49750] Updated weights for policy 0, policy_version 263871 (0.0031) [2024-04-26 20:29:56,194][49750] Updated weights for policy 0, policy_version 263881 (0.0033) [2024-04-26 20:29:57,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4323459072. Throughput: 0: 50803.5. Samples: 2076309500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:29:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 20:29:59,516][49750] Updated weights for policy 0, policy_version 263891 (0.0030) [2024-04-26 20:30:02,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4323721216. Throughput: 0: 50967.5. Samples: 2076616900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:30:02,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 20:30:02,761][49750] Updated weights for policy 0, policy_version 263901 (0.0030) [2024-04-26 20:30:05,925][49750] Updated weights for policy 0, policy_version 263911 (0.0035) [2024-04-26 20:30:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4323966976. Throughput: 0: 50939.0. Samples: 2076769700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:30:07,063][49517] Avg episode reward: [(0, '0.702')] [2024-04-26 20:30:08,151][49728] Signal inference workers to stop experience collection... 
(30950 times) [2024-04-26 20:30:08,151][49728] Signal inference workers to resume experience collection... (30950 times) [2024-04-26 20:30:08,165][49750] InferenceWorker_p0-w0: stopping experience collection (30950 times) [2024-04-26 20:30:08,165][49750] InferenceWorker_p0-w0: resuming experience collection (30950 times) [2024-04-26 20:30:09,108][49750] Updated weights for policy 0, policy_version 263921 (0.0035) [2024-04-26 20:30:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4324212736. Throughput: 0: 50864.8. Samples: 2077069780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:30:12,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 20:30:12,239][49750] Updated weights for policy 0, policy_version 263931 (0.0032) [2024-04-26 20:30:15,558][49750] Updated weights for policy 0, policy_version 263941 (0.0029) [2024-04-26 20:30:17,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4324474880. Throughput: 0: 50722.4. Samples: 2077368000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:30:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:30:18,763][49750] Updated weights for policy 0, policy_version 263951 (0.0031) [2024-04-26 20:30:22,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4324720640. Throughput: 0: 50760.8. Samples: 2077526720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:30:22,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 20:30:22,218][49750] Updated weights for policy 0, policy_version 263961 (0.0043) [2024-04-26 20:30:25,383][49750] Updated weights for policy 0, policy_version 263971 (0.0031) [2024-04-26 20:30:27,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4324966400. Throughput: 0: 50749.0. Samples: 2077825560. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:30:27,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 20:30:28,630][49750] Updated weights for policy 0, policy_version 263981 (0.0036) [2024-04-26 20:30:31,712][49750] Updated weights for policy 0, policy_version 263991 (0.0036) [2024-04-26 20:30:32,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4325228544. Throughput: 0: 50618.3. Samples: 2078129820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 20:30:32,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 20:30:34,904][49750] Updated weights for policy 0, policy_version 264001 (0.0033) [2024-04-26 20:30:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4325474304. Throughput: 0: 50594.0. Samples: 2078283000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:30:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 20:30:38,061][49750] Updated weights for policy 0, policy_version 264011 (0.0035) [2024-04-26 20:30:41,493][49750] Updated weights for policy 0, policy_version 264021 (0.0034) [2024-04-26 20:30:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50873.8). Total num frames: 4325736448. Throughput: 0: 50752.6. Samples: 2078593360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:30:42,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 20:30:42,106][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000264023_4325752832.pth... [2024-04-26 20:30:42,166][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000263279_4313563136.pth [2024-04-26 20:30:44,570][49750] Updated weights for policy 0, policy_version 264031 (0.0031) [2024-04-26 20:30:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4325982208. Throughput: 0: 50532.1. Samples: 2078890840. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:30:47,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 20:30:47,898][49750] Updated weights for policy 0, policy_version 264041 (0.0030) [2024-04-26 20:30:50,982][49750] Updated weights for policy 0, policy_version 264051 (0.0033) [2024-04-26 20:30:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4326260736. Throughput: 0: 50592.0. Samples: 2079046340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:30:52,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 20:30:54,323][49750] Updated weights for policy 0, policy_version 264061 (0.0029) [2024-04-26 20:30:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4326506496. Throughput: 0: 50782.7. Samples: 2079355000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:30:57,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 20:30:57,278][49750] Updated weights for policy 0, policy_version 264071 (0.0030) [2024-04-26 20:31:00,796][49750] Updated weights for policy 0, policy_version 264081 (0.0032) [2024-04-26 20:31:02,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4326768640. Throughput: 0: 51038.8. Samples: 2079664740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:31:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 20:31:03,716][49750] Updated weights for policy 0, policy_version 264091 (0.0028) [2024-04-26 20:31:07,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4327014400. Throughput: 0: 50847.2. Samples: 2079814840. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:31:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 20:31:07,165][49750] Updated weights for policy 0, policy_version 264101 (0.0032) [2024-04-26 20:31:10,428][49750] Updated weights for policy 0, policy_version 264111 (0.0034) [2024-04-26 20:31:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4327276544. Throughput: 0: 50933.3. Samples: 2080117560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:31:12,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 20:31:13,469][49750] Updated weights for policy 0, policy_version 264121 (0.0027) [2024-04-26 20:31:17,013][49750] Updated weights for policy 0, policy_version 264131 (0.0026) [2024-04-26 20:31:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.5, 300 sec: 50818.1). Total num frames: 4327522304. Throughput: 0: 50999.4. Samples: 2080424800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:31:17,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 20:31:17,827][49728] Signal inference workers to stop experience collection... (31000 times) [2024-04-26 20:31:17,851][49750] InferenceWorker_p0-w0: stopping experience collection (31000 times) [2024-04-26 20:31:17,935][49728] Signal inference workers to resume experience collection... (31000 times) [2024-04-26 20:31:17,935][49750] InferenceWorker_p0-w0: resuming experience collection (31000 times) [2024-04-26 20:31:20,120][49750] Updated weights for policy 0, policy_version 264141 (0.0030) [2024-04-26 20:31:22,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4327768064. Throughput: 0: 50952.9. Samples: 2080575880. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:31:22,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 20:31:23,282][49750] Updated weights for policy 0, policy_version 264151 (0.0035) [2024-04-26 20:31:26,379][49750] Updated weights for policy 0, policy_version 264161 (0.0030) [2024-04-26 20:31:27,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4328030208. Throughput: 0: 50855.5. Samples: 2080881860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:31:27,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 20:31:29,794][49750] Updated weights for policy 0, policy_version 264171 (0.0031) [2024-04-26 20:31:32,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4328292352. Throughput: 0: 51081.9. Samples: 2081189540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:31:32,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 20:31:32,669][49750] Updated weights for policy 0, policy_version 264181 (0.0028) [2024-04-26 20:31:36,287][49750] Updated weights for policy 0, policy_version 264191 (0.0033) [2024-04-26 20:31:37,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4328570880. Throughput: 0: 51172.8. Samples: 2081349120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:31:37,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 20:31:39,094][49750] Updated weights for policy 0, policy_version 264201 (0.0025) [2024-04-26 20:31:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4328800256. Throughput: 0: 51085.2. Samples: 2081653840. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:31:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:31:42,530][49750] Updated weights for policy 0, policy_version 264211 (0.0034) [2024-04-26 20:31:45,723][49750] Updated weights for policy 0, policy_version 264221 (0.0026) [2024-04-26 20:31:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4329046016. Throughput: 0: 51013.8. Samples: 2081960360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:31:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 20:31:48,859][49750] Updated weights for policy 0, policy_version 264231 (0.0032) [2024-04-26 20:31:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 4329308160. Throughput: 0: 50941.5. Samples: 2082107200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:31:52,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 20:31:52,168][49750] Updated weights for policy 0, policy_version 264241 (0.0032) [2024-04-26 20:31:55,205][49750] Updated weights for policy 0, policy_version 264251 (0.0028) [2024-04-26 20:31:57,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4329570304. Throughput: 0: 51130.1. Samples: 2082418420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:31:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:31:58,496][49750] Updated weights for policy 0, policy_version 264261 (0.0038) [2024-04-26 20:32:01,688][49750] Updated weights for policy 0, policy_version 264271 (0.0031) [2024-04-26 20:32:02,063][49517] Fps is (10 sec: 52427.6, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4329832448. Throughput: 0: 51058.2. Samples: 2082722420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:02,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 20:32:04,967][49750] Updated weights for policy 0, policy_version 264281 (0.0033) [2024-04-26 20:32:07,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4330045440. Throughput: 0: 51016.0. Samples: 2082871600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:07,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 20:32:07,967][49750] Updated weights for policy 0, policy_version 264291 (0.0034) [2024-04-26 20:32:11,305][49750] Updated weights for policy 0, policy_version 264301 (0.0030) [2024-04-26 20:32:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4330323968. Throughput: 0: 51050.2. Samples: 2083179120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:12,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 20:32:14,571][49750] Updated weights for policy 0, policy_version 264311 (0.0030) [2024-04-26 20:32:15,311][49728] Signal inference workers to stop experience collection... (31050 times) [2024-04-26 20:32:15,312][49728] Signal inference workers to resume experience collection... (31050 times) [2024-04-26 20:32:15,344][49750] InferenceWorker_p0-w0: stopping experience collection (31050 times) [2024-04-26 20:32:15,349][49750] InferenceWorker_p0-w0: resuming experience collection (31050 times) [2024-04-26 20:32:17,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4330586112. Throughput: 0: 50919.8. Samples: 2083480920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:17,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 20:32:17,883][49750] Updated weights for policy 0, policy_version 264321 (0.0028) [2024-04-26 20:32:20,956][49750] Updated weights for policy 0, policy_version 264331 (0.0032) [2024-04-26 20:32:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4330848256. Throughput: 0: 50969.0. Samples: 2083642720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:22,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 20:32:24,217][49750] Updated weights for policy 0, policy_version 264341 (0.0033) [2024-04-26 20:32:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4331094016. Throughput: 0: 50934.3. Samples: 2083945880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:27,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 20:32:27,280][49750] Updated weights for policy 0, policy_version 264351 (0.0030) [2024-04-26 20:32:30,713][49750] Updated weights for policy 0, policy_version 264361 (0.0032) [2024-04-26 20:32:32,063][49517] Fps is (10 sec: 45874.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4331307008. Throughput: 0: 50771.9. Samples: 2084245100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:32,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 20:32:33,736][49750] Updated weights for policy 0, policy_version 264371 (0.0031) [2024-04-26 20:32:37,042][49750] Updated weights for policy 0, policy_version 264381 (0.0031) [2024-04-26 20:32:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4331618304. Throughput: 0: 50638.5. Samples: 2084385940. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:37,072][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 20:32:40,267][49750] Updated weights for policy 0, policy_version 264391 (0.0022) [2024-04-26 20:32:42,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4331864064. Throughput: 0: 50540.0. Samples: 2084692720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:42,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 20:32:42,080][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000264396_4331864064.pth... [2024-04-26 20:32:42,143][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000263650_4319641600.pth [2024-04-26 20:32:43,662][49750] Updated weights for policy 0, policy_version 264401 (0.0033) [2024-04-26 20:32:46,701][49750] Updated weights for policy 0, policy_version 264411 (0.0031) [2024-04-26 20:32:47,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 4332126208. Throughput: 0: 50686.7. Samples: 2085003320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:47,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 20:32:50,078][49750] Updated weights for policy 0, policy_version 264421 (0.0032) [2024-04-26 20:32:52,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 4332322816. Throughput: 0: 50780.0. Samples: 2085156700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:52,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 20:32:53,119][49750] Updated weights for policy 0, policy_version 264431 (0.0028) [2024-04-26 20:32:56,638][49750] Updated weights for policy 0, policy_version 264441 (0.0033) [2024-04-26 20:32:57,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4332601344. Throughput: 0: 50652.7. Samples: 2085458500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:32:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 20:32:59,571][49750] Updated weights for policy 0, policy_version 264451 (0.0033) [2024-04-26 20:33:02,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 4332863488. Throughput: 0: 50670.7. Samples: 2085761100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 20:33:03,031][49750] Updated weights for policy 0, policy_version 264461 (0.0029) [2024-04-26 20:33:06,081][49750] Updated weights for policy 0, policy_version 264471 (0.0031) [2024-04-26 20:33:07,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4333142016. Throughput: 0: 50763.5. Samples: 2085927080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:07,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 20:33:09,593][49750] Updated weights for policy 0, policy_version 264481 (0.0034) [2024-04-26 20:33:12,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4333387776. Throughput: 0: 50785.3. Samples: 2086231220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:12,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 20:33:12,248][49728] Signal inference workers to stop experience collection... (31100 times) [2024-04-26 20:33:12,306][49750] InferenceWorker_p0-w0: stopping experience collection (31100 times) [2024-04-26 20:33:12,312][49728] Signal inference workers to resume experience collection... 
(31100 times) [2024-04-26 20:33:12,320][49750] InferenceWorker_p0-w0: resuming experience collection (31100 times) [2024-04-26 20:33:12,446][49750] Updated weights for policy 0, policy_version 264491 (0.0031) [2024-04-26 20:33:16,061][49750] Updated weights for policy 0, policy_version 264501 (0.0035) [2024-04-26 20:33:17,063][49517] Fps is (10 sec: 45874.2, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4333600768. Throughput: 0: 50831.9. Samples: 2086532540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:17,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 20:33:18,838][49750] Updated weights for policy 0, policy_version 264511 (0.0032) [2024-04-26 20:33:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4333879296. Throughput: 0: 50731.7. Samples: 2086668860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:22,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 20:33:22,432][49750] Updated weights for policy 0, policy_version 264521 (0.0028) [2024-04-26 20:33:25,350][49750] Updated weights for policy 0, policy_version 264531 (0.0030) [2024-04-26 20:33:27,062][49517] Fps is (10 sec: 55706.5, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4334157824. Throughput: 0: 50818.7. Samples: 2086979560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:27,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 20:33:28,844][49750] Updated weights for policy 0, policy_version 264541 (0.0032) [2024-04-26 20:33:31,790][49750] Updated weights for policy 0, policy_version 264551 (0.0029) [2024-04-26 20:33:32,063][49517] Fps is (10 sec: 52427.5, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4334403584. Throughput: 0: 50726.9. Samples: 2087286040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 20:33:35,284][49750] Updated weights for policy 0, policy_version 264561 (0.0031) [2024-04-26 20:33:37,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4334616576. Throughput: 0: 50707.9. Samples: 2087438560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 20:33:38,250][49750] Updated weights for policy 0, policy_version 264571 (0.0029) [2024-04-26 20:33:41,839][49750] Updated weights for policy 0, policy_version 264581 (0.0028) [2024-04-26 20:33:42,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4334895104. Throughput: 0: 50796.5. Samples: 2087744340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:33:44,775][49750] Updated weights for policy 0, policy_version 264591 (0.0030) [2024-04-26 20:33:47,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4335157248. Throughput: 0: 50763.9. Samples: 2088045480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:47,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 20:33:48,347][49750] Updated weights for policy 0, policy_version 264601 (0.0032) [2024-04-26 20:33:51,229][49750] Updated weights for policy 0, policy_version 264611 (0.0026) [2024-04-26 20:33:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4335403008. Throughput: 0: 50737.2. Samples: 2088210260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 20:33:55,036][49750] Updated weights for policy 0, policy_version 264621 (0.0031) [2024-04-26 20:33:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.6, 300 sec: 50873.8). Total num frames: 4335665152. Throughput: 0: 50733.0. Samples: 2088514200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:33:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 20:33:57,579][49750] Updated weights for policy 0, policy_version 264631 (0.0029) [2024-04-26 20:34:01,860][49750] Updated weights for policy 0, policy_version 264641 (0.0030) [2024-04-26 20:34:02,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4335878144. Throughput: 0: 50861.4. Samples: 2088821300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 20:34:02,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 20:34:03,867][49750] Updated weights for policy 0, policy_version 264651 (0.0040) [2024-04-26 20:34:05,753][49728] Signal inference workers to stop experience collection... (31150 times) [2024-04-26 20:34:05,753][49728] Signal inference workers to resume experience collection... (31150 times) [2024-04-26 20:34:05,766][49750] InferenceWorker_p0-w0: stopping experience collection (31150 times) [2024-04-26 20:34:05,766][49750] InferenceWorker_p0-w0: resuming experience collection (31150 times) [2024-04-26 20:34:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4336173056. Throughput: 0: 50969.8. Samples: 2088962500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 20:34:08,394][49750] Updated weights for policy 0, policy_version 264661 (0.0031) [2024-04-26 20:34:10,363][49750] Updated weights for policy 0, policy_version 264671 (0.0035) [2024-04-26 20:34:12,062][49517] Fps is (10 sec: 55706.4, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 4336435200. Throughput: 0: 50720.0. Samples: 2089261960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 20:34:14,633][49750] Updated weights for policy 0, policy_version 264681 (0.0028) [2024-04-26 20:34:16,857][49750] Updated weights for policy 0, policy_version 264691 (0.0032) [2024-04-26 20:34:17,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51609.8, 300 sec: 50984.8). Total num frames: 4336697344. Throughput: 0: 50825.7. Samples: 2089573180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:17,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 20:34:21,202][49750] Updated weights for policy 0, policy_version 264701 (0.0037) [2024-04-26 20:34:22,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4336943104. Throughput: 0: 51022.2. Samples: 2089734560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:22,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:34:23,464][49750] Updated weights for policy 0, policy_version 264711 (0.0034) [2024-04-26 20:34:27,063][49517] Fps is (10 sec: 45874.7, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 4337156096. Throughput: 0: 50986.2. Samples: 2090038720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 20:34:27,514][49750] Updated weights for policy 0, policy_version 264721 (0.0030) [2024-04-26 20:34:29,964][49750] Updated weights for policy 0, policy_version 264731 (0.0030) [2024-04-26 20:34:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4337451008. Throughput: 0: 51101.2. Samples: 2090345040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:32,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 20:34:33,941][49750] Updated weights for policy 0, policy_version 264741 (0.0033) [2024-04-26 20:34:36,203][49750] Updated weights for policy 0, policy_version 264751 (0.0028) [2024-04-26 20:34:37,063][49517] Fps is (10 sec: 55705.4, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4337713152. Throughput: 0: 50859.0. Samples: 2090498920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:37,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 20:34:40,206][49750] Updated weights for policy 0, policy_version 264761 (0.0033) [2024-04-26 20:34:42,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4337975296. Throughput: 0: 51038.2. Samples: 2090810920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 20:34:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000264769_4337975296.pth... [2024-04-26 20:34:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000264023_4325752832.pth [2024-04-26 20:34:42,637][49750] Updated weights for policy 0, policy_version 264771 (0.0027) [2024-04-26 20:34:46,516][49750] Updated weights for policy 0, policy_version 264781 (0.0030) [2024-04-26 20:34:47,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.5, 300 sec: 50929.3). 
Total num frames: 4338221056. Throughput: 0: 51119.7. Samples: 2091121680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:47,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 20:34:49,061][49750] Updated weights for policy 0, policy_version 264791 (0.0034) [2024-04-26 20:34:52,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4338450432. Throughput: 0: 51237.8. Samples: 2091268200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:52,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 20:34:52,880][49750] Updated weights for policy 0, policy_version 264801 (0.0031) [2024-04-26 20:34:55,423][49750] Updated weights for policy 0, policy_version 264811 (0.0024) [2024-04-26 20:34:57,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4338712576. Throughput: 0: 51259.6. Samples: 2091568640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:34:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 20:34:59,294][49750] Updated weights for policy 0, policy_version 264821 (0.0039) [2024-04-26 20:35:01,765][49750] Updated weights for policy 0, policy_version 264831 (0.0032) [2024-04-26 20:35:02,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51882.8, 300 sec: 50929.3). Total num frames: 4338991104. Throughput: 0: 51126.2. Samples: 2091873860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:35:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 20:35:05,683][49750] Updated weights for policy 0, policy_version 264841 (0.0032) [2024-04-26 20:35:07,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4339253248. Throughput: 0: 51197.8. Samples: 2092038460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:35:07,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 20:35:08,139][49750] Updated weights for policy 0, policy_version 264851 (0.0029) [2024-04-26 20:35:11,979][49750] Updated weights for policy 0, policy_version 264861 (0.0030) [2024-04-26 20:35:12,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4339482624. Throughput: 0: 51169.4. Samples: 2092341340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:35:12,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 20:35:14,782][49750] Updated weights for policy 0, policy_version 264871 (0.0029) [2024-04-26 20:35:17,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4339728384. Throughput: 0: 51074.0. Samples: 2092643360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 20:35:17,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 20:35:18,186][49728] Signal inference workers to stop experience collection... (31200 times) [2024-04-26 20:35:18,187][49728] Signal inference workers to resume experience collection... (31200 times) [2024-04-26 20:35:18,213][49750] InferenceWorker_p0-w0: stopping experience collection (31200 times) [2024-04-26 20:35:18,213][49750] InferenceWorker_p0-w0: resuming experience collection (31200 times) [2024-04-26 20:35:18,316][49750] Updated weights for policy 0, policy_version 264881 (0.0030) [2024-04-26 20:35:21,323][49750] Updated weights for policy 0, policy_version 264891 (0.0030) [2024-04-26 20:35:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4340006912. Throughput: 0: 51016.1. Samples: 2092794640. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:35:22,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 20:35:24,826][49750] Updated weights for policy 0, policy_version 264901 (0.0035) [2024-04-26 20:35:27,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51882.8, 300 sec: 50984.8). Total num frames: 4340269056. Throughput: 0: 50964.1. Samples: 2093104300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:35:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 20:35:27,765][49750] Updated weights for policy 0, policy_version 264911 (0.0030) [2024-04-26 20:35:31,259][49750] Updated weights for policy 0, policy_version 264921 (0.0034) [2024-04-26 20:35:32,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4340514816. Throughput: 0: 50776.8. Samples: 2093406640. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:35:32,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 20:35:34,069][49750] Updated weights for policy 0, policy_version 264931 (0.0029) [2024-04-26 20:35:37,062][49517] Fps is (10 sec: 45874.8, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4340727808. Throughput: 0: 50745.7. Samples: 2093551760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:35:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 20:35:37,561][49750] Updated weights for policy 0, policy_version 264941 (0.0036) [2024-04-26 20:35:40,445][49750] Updated weights for policy 0, policy_version 264951 (0.0032) [2024-04-26 20:35:42,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 4341006336. Throughput: 0: 50822.0. Samples: 2093855640. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:35:42,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 20:35:43,954][49750] Updated weights for policy 0, policy_version 264961 (0.0029) [2024-04-26 20:35:47,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4341268480. Throughput: 0: 50805.6. Samples: 2094160120. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:35:47,063][49517] Avg episode reward: [(0, '0.704')] [2024-04-26 20:35:47,097][49750] Updated weights for policy 0, policy_version 264971 (0.0037) [2024-04-26 20:35:50,413][49750] Updated weights for policy 0, policy_version 264981 (0.0032) [2024-04-26 20:35:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4341530624. Throughput: 0: 50813.4. Samples: 2094325060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:35:52,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 20:35:53,474][49750] Updated weights for policy 0, policy_version 264991 (0.0028) [2024-04-26 20:35:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4341760000. Throughput: 0: 50724.5. Samples: 2094623940. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:35:57,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 20:35:57,177][49750] Updated weights for policy 0, policy_version 265001 (0.0029) [2024-04-26 20:35:59,748][49750] Updated weights for policy 0, policy_version 265011 (0.0030) [2024-04-26 20:36:02,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4342022144. Throughput: 0: 50846.9. Samples: 2094931480. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:36:02,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 20:36:03,441][49750] Updated weights for policy 0, policy_version 265021 (0.0028) [2024-04-26 20:36:06,093][49750] Updated weights for policy 0, policy_version 265031 (0.0037) [2024-04-26 20:36:07,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4342284288. Throughput: 0: 50823.4. Samples: 2095081700. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:36:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:36:10,001][49750] Updated weights for policy 0, policy_version 265041 (0.0037) [2024-04-26 20:36:12,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4342546432. Throughput: 0: 50832.3. Samples: 2095391760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:36:12,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 20:36:12,553][49750] Updated weights for policy 0, policy_version 265051 (0.0032) [2024-04-26 20:36:16,382][49750] Updated weights for policy 0, policy_version 265061 (0.0029) [2024-04-26 20:36:17,063][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4342792192. Throughput: 0: 50832.5. Samples: 2095694100. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:36:17,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 20:36:18,959][49750] Updated weights for policy 0, policy_version 265071 (0.0036) [2024-04-26 20:36:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4343037952. Throughput: 0: 50844.7. Samples: 2095839780. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:36:22,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 20:36:22,831][49750] Updated weights for policy 0, policy_version 265081 (0.0031) [2024-04-26 20:36:24,882][49728] Signal inference workers to stop experience collection... 
(31250 times) [2024-04-26 20:36:24,882][49728] Signal inference workers to resume experience collection... (31250 times) [2024-04-26 20:36:24,893][49750] InferenceWorker_p0-w0: stopping experience collection (31250 times) [2024-04-26 20:36:24,893][49750] InferenceWorker_p0-w0: resuming experience collection (31250 times) [2024-04-26 20:36:25,545][49750] Updated weights for policy 0, policy_version 265091 (0.0029) [2024-04-26 20:36:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4343300096. Throughput: 0: 50900.5. Samples: 2096146160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-04-26 20:36:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 20:36:29,252][49750] Updated weights for policy 0, policy_version 265101 (0.0028) [2024-04-26 20:36:32,029][49750] Updated weights for policy 0, policy_version 265111 (0.0025) [2024-04-26 20:36:32,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4343578624. Throughput: 0: 50956.2. Samples: 2096453140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:36:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 20:36:35,822][49750] Updated weights for policy 0, policy_version 265121 (0.0037) [2024-04-26 20:36:37,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4343824384. Throughput: 0: 50738.1. Samples: 2096608280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:36:37,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 20:36:38,502][49750] Updated weights for policy 0, policy_version 265131 (0.0029) [2024-04-26 20:36:42,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4344053760. Throughput: 0: 50874.0. Samples: 2096913280. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:36:42,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 20:36:42,123][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000265141_4344070144.pth... [2024-04-26 20:36:42,127][49750] Updated weights for policy 0, policy_version 265141 (0.0028) [2024-04-26 20:36:42,168][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000264396_4331864064.pth [2024-04-26 20:36:45,008][49750] Updated weights for policy 0, policy_version 265151 (0.0029) [2024-04-26 20:36:47,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4344315904. Throughput: 0: 50912.2. Samples: 2097222520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:36:47,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 20:36:48,465][49750] Updated weights for policy 0, policy_version 265161 (0.0034) [2024-04-26 20:36:51,551][49750] Updated weights for policy 0, policy_version 265171 (0.0029) [2024-04-26 20:36:52,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4344578048. Throughput: 0: 50892.2. Samples: 2097371840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:36:52,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:36:54,969][49750] Updated weights for policy 0, policy_version 265181 (0.0035) [2024-04-26 20:36:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4344840192. Throughput: 0: 50843.2. Samples: 2097679700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:36:57,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:36:58,100][49750] Updated weights for policy 0, policy_version 265191 (0.0030) [2024-04-26 20:37:01,532][49750] Updated weights for policy 0, policy_version 265201 (0.0028) [2024-04-26 20:37:02,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.6, 300 sec: 50984.8). 
Total num frames: 4345085952. Throughput: 0: 50987.1. Samples: 2097988520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:37:02,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 20:37:04,374][49750] Updated weights for policy 0, policy_version 265211 (0.0026) [2024-04-26 20:37:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4345331712. Throughput: 0: 50960.6. Samples: 2098133000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:37:07,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:37:07,854][49750] Updated weights for policy 0, policy_version 265221 (0.0036) [2024-04-26 20:37:10,720][49750] Updated weights for policy 0, policy_version 265231 (0.0034) [2024-04-26 20:37:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4345577472. Throughput: 0: 50889.8. Samples: 2098436200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:37:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 20:37:14,268][49750] Updated weights for policy 0, policy_version 265241 (0.0030) [2024-04-26 20:37:17,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4345856000. Throughput: 0: 50893.3. Samples: 2098743340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:37:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:37:17,227][49750] Updated weights for policy 0, policy_version 265251 (0.0031) [2024-04-26 20:37:20,894][49750] Updated weights for policy 0, policy_version 265261 (0.0034) [2024-04-26 20:37:22,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4346118144. Throughput: 0: 50977.0. Samples: 2098902240. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:37:22,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 20:37:23,658][49750] Updated weights for policy 0, policy_version 265271 (0.0033) [2024-04-26 20:37:27,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4346347520. Throughput: 0: 50993.5. Samples: 2099207980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:37:27,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 20:37:27,113][49750] Updated weights for policy 0, policy_version 265281 (0.0032) [2024-04-26 20:37:30,011][49750] Updated weights for policy 0, policy_version 265291 (0.0029) [2024-04-26 20:37:32,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4346626048. Throughput: 0: 50777.6. Samples: 2099507520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:37:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:37:33,407][49750] Updated weights for policy 0, policy_version 265301 (0.0038) [2024-04-26 20:37:36,336][49750] Updated weights for policy 0, policy_version 265311 (0.0030) [2024-04-26 20:37:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4346871808. Throughput: 0: 51092.0. Samples: 2099670980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 20:37:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:37:37,701][49728] Signal inference workers to stop experience collection... (31300 times) [2024-04-26 20:37:37,704][49728] Signal inference workers to resume experience collection... 
(31300 times) [2024-04-26 20:37:37,732][49750] InferenceWorker_p0-w0: stopping experience collection (31300 times) [2024-04-26 20:37:37,733][49750] InferenceWorker_p0-w0: resuming experience collection (31300 times) [2024-04-26 20:37:39,870][49750] Updated weights for policy 0, policy_version 265321 (0.0028) [2024-04-26 20:37:42,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4347117568. Throughput: 0: 50960.3. Samples: 2099972920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:37:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 20:37:43,006][49750] Updated weights for policy 0, policy_version 265331 (0.0030) [2024-04-26 20:37:46,363][49750] Updated weights for policy 0, policy_version 265341 (0.0032) [2024-04-26 20:37:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 51040.3). Total num frames: 4347379712. Throughput: 0: 50816.1. Samples: 2100275240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:37:47,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 20:37:49,434][49750] Updated weights for policy 0, policy_version 265351 (0.0027) [2024-04-26 20:37:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4347609088. Throughput: 0: 50961.8. Samples: 2100426280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:37:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 20:37:52,652][49750] Updated weights for policy 0, policy_version 265361 (0.0031) [2024-04-26 20:37:55,687][49750] Updated weights for policy 0, policy_version 265371 (0.0029) [2024-04-26 20:37:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4347904000. Throughput: 0: 51121.8. Samples: 2100736680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:37:57,064][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 20:37:59,234][49750] Updated weights for policy 0, policy_version 265381 (0.0029) [2024-04-26 20:38:02,041][49750] Updated weights for policy 0, policy_version 265391 (0.0034) [2024-04-26 20:38:02,063][49517] Fps is (10 sec: 55705.3, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4348166144. Throughput: 0: 51066.6. Samples: 2101041340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:02,064][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 20:38:05,702][49750] Updated weights for policy 0, policy_version 265401 (0.0030) [2024-04-26 20:38:07,062][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4348395520. Throughput: 0: 51099.5. Samples: 2101201720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:07,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 20:38:08,529][49750] Updated weights for policy 0, policy_version 265411 (0.0027) [2024-04-26 20:38:12,062][49517] Fps is (10 sec: 47514.4, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4348641280. Throughput: 0: 50910.4. Samples: 2101498940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:12,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 20:38:12,168][49750] Updated weights for policy 0, policy_version 265421 (0.0027) [2024-04-26 20:38:15,125][49750] Updated weights for policy 0, policy_version 265431 (0.0034) [2024-04-26 20:38:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4348903424. Throughput: 0: 50828.9. Samples: 2101794820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:17,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 20:38:18,658][49750] Updated weights for policy 0, policy_version 265441 (0.0035) [2024-04-26 20:38:21,543][49750] Updated weights for policy 0, policy_version 265451 (0.0033) [2024-04-26 20:38:22,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4349165568. Throughput: 0: 50804.7. Samples: 2101957200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:22,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 20:38:25,014][49750] Updated weights for policy 0, policy_version 265461 (0.0030) [2024-04-26 20:38:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4349411328. Throughput: 0: 50876.0. Samples: 2102262340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:27,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 20:38:27,996][49750] Updated weights for policy 0, policy_version 265471 (0.0027) [2024-04-26 20:38:31,391][49750] Updated weights for policy 0, policy_version 265481 (0.0030) [2024-04-26 20:38:32,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.5, 300 sec: 50984.8). Total num frames: 4349657088. Throughput: 0: 50890.2. Samples: 2102565300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:32,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 20:38:34,383][49750] Updated weights for policy 0, policy_version 265491 (0.0032) [2024-04-26 20:38:37,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4349919232. Throughput: 0: 50888.8. Samples: 2102716280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:38:37,889][49750] Updated weights for policy 0, policy_version 265501 (0.0035) [2024-04-26 20:38:41,005][49750] Updated weights for policy 0, policy_version 265511 (0.0035) [2024-04-26 20:38:42,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4350181376. Throughput: 0: 50817.9. Samples: 2103023500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 20:38:42,103][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000265515_4350197760.pth... [2024-04-26 20:38:42,149][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000264769_4337975296.pth [2024-04-26 20:38:44,074][49728] Signal inference workers to stop experience collection... (31350 times) [2024-04-26 20:38:44,083][49728] Signal inference workers to resume experience collection... (31350 times) [2024-04-26 20:38:44,111][49750] InferenceWorker_p0-w0: stopping experience collection (31350 times) [2024-04-26 20:38:44,111][49750] InferenceWorker_p0-w0: resuming experience collection (31350 times) [2024-04-26 20:38:44,212][49750] Updated weights for policy 0, policy_version 265521 (0.0036) [2024-04-26 20:38:47,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4350427136. Throughput: 0: 50918.8. Samples: 2103332680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:38:47,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 20:38:47,347][49750] Updated weights for policy 0, policy_version 265531 (0.0029) [2024-04-26 20:38:50,602][49750] Updated weights for policy 0, policy_version 265541 (0.0032) [2024-04-26 20:38:52,062][49517] Fps is (10 sec: 49153.1, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4350672896. Throughput: 0: 50856.1. 
Samples: 2103490240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:38:52,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 20:38:53,661][49750] Updated weights for policy 0, policy_version 265551 (0.0028) [2024-04-26 20:38:56,976][49750] Updated weights for policy 0, policy_version 265561 (0.0029) [2024-04-26 20:38:57,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 51095.9). Total num frames: 4350951424. Throughput: 0: 51103.9. Samples: 2103798620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:38:57,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 20:39:00,111][49750] Updated weights for policy 0, policy_version 265571 (0.0035) [2024-04-26 20:39:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.4, 300 sec: 50873.7). Total num frames: 4351180800. Throughput: 0: 51170.4. Samples: 2104097480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 20:39:03,495][49750] Updated weights for policy 0, policy_version 265581 (0.0031) [2024-04-26 20:39:06,546][49750] Updated weights for policy 0, policy_version 265591 (0.0033) [2024-04-26 20:39:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 4351459328. Throughput: 0: 50974.1. Samples: 2104251020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:07,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-26 20:39:09,954][49750] Updated weights for policy 0, policy_version 265601 (0.0035) [2024-04-26 20:39:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4351705088. Throughput: 0: 51021.4. Samples: 2104558300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:12,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:39:12,986][49750] Updated weights for policy 0, policy_version 265611 (0.0029) [2024-04-26 20:39:16,329][49750] Updated weights for policy 0, policy_version 265621 (0.0035) [2024-04-26 20:39:17,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4351950848. Throughput: 0: 50946.8. Samples: 2104857900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:17,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 20:39:19,332][49750] Updated weights for policy 0, policy_version 265631 (0.0034) [2024-04-26 20:39:22,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.4, 300 sec: 51040.3). Total num frames: 4352212992. Throughput: 0: 50936.0. Samples: 2105008400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:22,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 20:39:22,750][49750] Updated weights for policy 0, policy_version 265641 (0.0031) [2024-04-26 20:39:25,830][49750] Updated weights for policy 0, policy_version 265651 (0.0029) [2024-04-26 20:39:27,063][49517] Fps is (10 sec: 52427.4, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4352475136. Throughput: 0: 50840.9. Samples: 2105311340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:27,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 20:39:29,257][49750] Updated weights for policy 0, policy_version 265661 (0.0029) [2024-04-26 20:39:32,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4352720896. Throughput: 0: 50803.5. Samples: 2105618840. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:32,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 20:39:32,313][49750] Updated weights for policy 0, policy_version 265671 (0.0028) [2024-04-26 20:39:35,891][49750] Updated weights for policy 0, policy_version 265681 (0.0031) [2024-04-26 20:39:37,062][49517] Fps is (10 sec: 49153.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4352966656. Throughput: 0: 50775.2. Samples: 2105775120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:37,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 20:39:38,658][49750] Updated weights for policy 0, policy_version 265691 (0.0037) [2024-04-26 20:39:42,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4353228800. Throughput: 0: 50655.5. Samples: 2106078120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:42,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 20:39:42,256][49750] Updated weights for policy 0, policy_version 265701 (0.0034) [2024-04-26 20:39:45,318][49750] Updated weights for policy 0, policy_version 265711 (0.0030) [2024-04-26 20:39:47,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.2, 300 sec: 50929.2). Total num frames: 4353474560. Throughput: 0: 50770.0. Samples: 2106382140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:47,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 20:39:48,706][49750] Updated weights for policy 0, policy_version 265721 (0.0028) [2024-04-26 20:39:51,759][49750] Updated weights for policy 0, policy_version 265731 (0.0032) [2024-04-26 20:39:52,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 4353753088. Throughput: 0: 50942.4. Samples: 2106543440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:52,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-26 20:39:55,124][49750] Updated weights for policy 0, policy_version 265741 (0.0027) [2024-04-26 20:39:57,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4353998848. Throughput: 0: 50822.3. Samples: 2106845300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:39:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 20:39:58,376][49750] Updated weights for policy 0, policy_version 265751 (0.0040) [2024-04-26 20:39:58,398][49728] Signal inference workers to stop experience collection... (31400 times) [2024-04-26 20:39:58,399][49728] Signal inference workers to resume experience collection... (31400 times) [2024-04-26 20:39:58,419][49750] InferenceWorker_p0-w0: stopping experience collection (31400 times) [2024-04-26 20:39:58,419][49750] InferenceWorker_p0-w0: resuming experience collection (31400 times) [2024-04-26 20:40:01,614][49750] Updated weights for policy 0, policy_version 265761 (0.0029) [2024-04-26 20:40:02,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4354228224. Throughput: 0: 50873.8. Samples: 2107147220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:40:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 20:40:04,742][49750] Updated weights for policy 0, policy_version 265771 (0.0034) [2024-04-26 20:40:07,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4354490368. Throughput: 0: 50867.2. Samples: 2107297420. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:07,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 20:40:08,253][49750] Updated weights for policy 0, policy_version 265781 (0.0030) [2024-04-26 20:40:11,201][49750] Updated weights for policy 0, policy_version 265791 (0.0036) [2024-04-26 20:40:12,063][49517] Fps is (10 sec: 54065.9, 60 sec: 51063.3, 300 sec: 50984.8). Total num frames: 4354768896. Throughput: 0: 50859.6. Samples: 2107600020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:12,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 20:40:14,631][49750] Updated weights for policy 0, policy_version 265801 (0.0039) [2024-04-26 20:40:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4355014656. Throughput: 0: 50583.1. Samples: 2107895080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:17,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 20:40:17,539][49750] Updated weights for policy 0, policy_version 265811 (0.0034) [2024-04-26 20:40:21,343][49750] Updated weights for policy 0, policy_version 265821 (0.0035) [2024-04-26 20:40:22,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4355244032. Throughput: 0: 50811.1. Samples: 2108061620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:22,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 20:40:24,095][49750] Updated weights for policy 0, policy_version 265831 (0.0031) [2024-04-26 20:40:27,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4355489792. Throughput: 0: 50748.8. Samples: 2108361820. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 20:40:27,767][49750] Updated weights for policy 0, policy_version 265841 (0.0037) [2024-04-26 20:40:30,580][49750] Updated weights for policy 0, policy_version 265851 (0.0028) [2024-04-26 20:40:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4355768320. Throughput: 0: 50725.5. Samples: 2108664780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:32,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 20:40:34,292][49750] Updated weights for policy 0, policy_version 265861 (0.0032) [2024-04-26 20:40:37,040][49750] Updated weights for policy 0, policy_version 265871 (0.0032) [2024-04-26 20:40:37,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4356030464. Throughput: 0: 50695.2. Samples: 2108824720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:37,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 20:40:40,576][49750] Updated weights for policy 0, policy_version 265881 (0.0030) [2024-04-26 20:40:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4356292608. Throughput: 0: 50806.6. Samples: 2109131600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:42,063][49517] Avg episode reward: [(0, '0.689')] [2024-04-26 20:40:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000265887_4356292608.pth... [2024-04-26 20:40:42,127][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000265141_4344070144.pth [2024-04-26 20:40:43,516][49750] Updated weights for policy 0, policy_version 265891 (0.0036) [2024-04-26 20:40:47,026][49750] Updated weights for policy 0, policy_version 265901 (0.0026) [2024-04-26 20:40:47,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.6, 300 sec: 50818.2). 
Total num frames: 4356521984. Throughput: 0: 50856.4. Samples: 2109435760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:47,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 20:40:49,965][49750] Updated weights for policy 0, policy_version 265911 (0.0039) [2024-04-26 20:40:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50929.2). Total num frames: 4356784128. Throughput: 0: 50824.0. Samples: 2109584500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:52,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 20:40:53,299][49750] Updated weights for policy 0, policy_version 265921 (0.0032) [2024-04-26 20:40:56,430][49750] Updated weights for policy 0, policy_version 265931 (0.0032) [2024-04-26 20:40:57,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4357029888. Throughput: 0: 50863.3. Samples: 2109888860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:40:57,063][49517] Avg episode reward: [(0, '0.696')] [2024-04-26 20:40:59,816][49750] Updated weights for policy 0, policy_version 265941 (0.0030) [2024-04-26 20:41:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50929.3). Total num frames: 4357308416. Throughput: 0: 50982.1. Samples: 2110189280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:41:02,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 20:41:02,849][49750] Updated weights for policy 0, policy_version 265951 (0.0029) [2024-04-26 20:41:06,155][49750] Updated weights for policy 0, policy_version 265961 (0.0028) [2024-04-26 20:41:06,233][49728] Signal inference workers to stop experience collection... (31450 times) [2024-04-26 20:41:06,277][49750] InferenceWorker_p0-w0: stopping experience collection (31450 times) [2024-04-26 20:41:06,296][49728] Signal inference workers to resume experience collection... 
(31450 times) [2024-04-26 20:41:06,298][49750] InferenceWorker_p0-w0: resuming experience collection (31450 times) [2024-04-26 20:41:07,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4357570560. Throughput: 0: 50881.3. Samples: 2110351280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:41:07,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 20:41:09,089][49750] Updated weights for policy 0, policy_version 265971 (0.0027) [2024-04-26 20:41:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4357799936. Throughput: 0: 51127.6. Samples: 2110662560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 20:41:12,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 20:41:12,459][49750] Updated weights for policy 0, policy_version 265981 (0.0031) [2024-04-26 20:41:15,450][49750] Updated weights for policy 0, policy_version 265991 (0.0036) [2024-04-26 20:41:17,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4358045696. Throughput: 0: 51304.0. Samples: 2110973460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:41:17,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 20:41:18,852][49750] Updated weights for policy 0, policy_version 266001 (0.0028) [2024-04-26 20:41:21,863][49750] Updated weights for policy 0, policy_version 266011 (0.0031) [2024-04-26 20:41:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4358324224. Throughput: 0: 50884.6. Samples: 2111114520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:41:22,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 20:41:25,265][49750] Updated weights for policy 0, policy_version 266021 (0.0034) [2024-04-26 20:41:27,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 4358586368. Throughput: 0: 50835.1. Samples: 2111419180. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:41:27,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 20:41:28,284][49750] Updated weights for policy 0, policy_version 266031 (0.0035) [2024-04-26 20:41:31,613][49750] Updated weights for policy 0, policy_version 266041 (0.0031) [2024-04-26 20:41:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4358832128. Throughput: 0: 50979.9. Samples: 2111729860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:41:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 20:41:34,869][49750] Updated weights for policy 0, policy_version 266051 (0.0033) [2024-04-26 20:41:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4359061504. Throughput: 0: 50950.7. Samples: 2111877280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:41:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 20:41:38,094][49750] Updated weights for policy 0, policy_version 266061 (0.0024) [2024-04-26 20:41:41,229][49750] Updated weights for policy 0, policy_version 266071 (0.0030) [2024-04-26 20:41:42,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4359323648. Throughput: 0: 51180.4. Samples: 2112191980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:41:42,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 20:41:44,358][49750] Updated weights for policy 0, policy_version 266081 (0.0031) [2024-04-26 20:41:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4359585792. Throughput: 0: 51180.1. Samples: 2112492380. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:41:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 20:41:47,703][49750] Updated weights for policy 0, policy_version 266091 (0.0035) [2024-04-26 20:41:50,825][49750] Updated weights for policy 0, policy_version 266101 (0.0027) [2024-04-26 20:41:52,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4359864320. Throughput: 0: 50973.1. Samples: 2112645080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:41:52,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 20:41:54,292][49750] Updated weights for policy 0, policy_version 266111 (0.0030) [2024-04-26 20:41:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 4360110080. Throughput: 0: 51034.6. Samples: 2112959120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:41:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 20:41:57,174][49750] Updated weights for policy 0, policy_version 266121 (0.0034) [2024-04-26 20:42:00,861][49750] Updated weights for policy 0, policy_version 266131 (0.0033) [2024-04-26 20:42:02,062][49517] Fps is (10 sec: 45876.3, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4360323072. Throughput: 0: 50853.4. Samples: 2113261860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:42:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 20:42:02,580][49728] Signal inference workers to stop experience collection... (31500 times) [2024-04-26 20:42:02,614][49750] InferenceWorker_p0-w0: stopping experience collection (31500 times) [2024-04-26 20:42:02,650][49728] Signal inference workers to resume experience collection... 
(31500 times) [2024-04-26 20:42:02,650][49750] InferenceWorker_p0-w0: resuming experience collection (31500 times) [2024-04-26 20:42:03,498][49750] Updated weights for policy 0, policy_version 266141 (0.0029) [2024-04-26 20:42:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 4360601600. Throughput: 0: 50883.4. Samples: 2113404280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:42:07,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 20:42:07,157][49750] Updated weights for policy 0, policy_version 266151 (0.0032) [2024-04-26 20:42:09,967][49750] Updated weights for policy 0, policy_version 266161 (0.0028) [2024-04-26 20:42:12,063][49517] Fps is (10 sec: 55704.1, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4360880128. Throughput: 0: 50952.3. Samples: 2113712040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:42:12,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 20:42:13,526][49750] Updated weights for policy 0, policy_version 266171 (0.0035) [2024-04-26 20:42:16,453][49750] Updated weights for policy 0, policy_version 266181 (0.0026) [2024-04-26 20:42:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4361125888. Throughput: 0: 50763.6. Samples: 2114014220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:42:17,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 20:42:20,093][49750] Updated weights for policy 0, policy_version 266191 (0.0033) [2024-04-26 20:42:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4361371648. Throughput: 0: 50892.5. Samples: 2114167440. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-26 20:42:22,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 20:42:23,016][49750] Updated weights for policy 0, policy_version 266201 (0.0032) [2024-04-26 20:42:26,491][49750] Updated weights for policy 0, policy_version 266211 (0.0028) [2024-04-26 20:42:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4361617408. Throughput: 0: 50738.2. Samples: 2114475200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:42:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 20:42:29,468][49750] Updated weights for policy 0, policy_version 266221 (0.0030) [2024-04-26 20:42:32,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4361895936. Throughput: 0: 50836.3. Samples: 2114780020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:42:32,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 20:42:32,802][49750] Updated weights for policy 0, policy_version 266231 (0.0027) [2024-04-26 20:42:35,910][49750] Updated weights for policy 0, policy_version 266241 (0.0033) [2024-04-26 20:42:37,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4362141696. Throughput: 0: 50906.7. Samples: 2114935880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:42:37,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 20:42:39,403][49750] Updated weights for policy 0, policy_version 266251 (0.0034) [2024-04-26 20:42:42,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4362403840. Throughput: 0: 50611.2. Samples: 2115236620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:42:42,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 20:42:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000266260_4362403840.pth... 
[2024-04-26 20:42:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000265515_4350197760.pth [2024-04-26 20:42:42,442][49750] Updated weights for policy 0, policy_version 266261 (0.0029) [2024-04-26 20:42:46,047][49750] Updated weights for policy 0, policy_version 266271 (0.0029) [2024-04-26 20:42:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4362633216. Throughput: 0: 50704.4. Samples: 2115543560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:42:47,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 20:42:48,818][49750] Updated weights for policy 0, policy_version 266281 (0.0037) [2024-04-26 20:42:52,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4362878976. Throughput: 0: 50856.0. Samples: 2115692800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:42:52,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 20:42:52,576][49750] Updated weights for policy 0, policy_version 266291 (0.0031) [2024-04-26 20:42:55,163][49750] Updated weights for policy 0, policy_version 266301 (0.0031) [2024-04-26 20:42:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4363157504. Throughput: 0: 50747.3. Samples: 2115995660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:42:57,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 20:42:59,159][49750] Updated weights for policy 0, policy_version 266311 (0.0033) [2024-04-26 20:43:01,451][49750] Updated weights for policy 0, policy_version 266321 (0.0027) [2024-04-26 20:43:02,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.4, 300 sec: 50929.2). Total num frames: 4363419648. Throughput: 0: 50734.6. Samples: 2116297280. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:43:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 20:43:05,726][49750] Updated weights for policy 0, policy_version 266331 (0.0032) [2024-04-26 20:43:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4363665408. Throughput: 0: 50915.1. Samples: 2116458620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:43:07,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 20:43:08,061][49750] Updated weights for policy 0, policy_version 266341 (0.0033) [2024-04-26 20:43:09,614][49728] Signal inference workers to stop experience collection... (31550 times) [2024-04-26 20:43:09,614][49728] Signal inference workers to resume experience collection... (31550 times) [2024-04-26 20:43:09,639][49750] InferenceWorker_p0-w0: stopping experience collection (31550 times) [2024-04-26 20:43:09,640][49750] InferenceWorker_p0-w0: resuming experience collection (31550 times) [2024-04-26 20:43:12,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49971.4, 300 sec: 50762.7). Total num frames: 4363878400. Throughput: 0: 50749.9. Samples: 2116758940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:43:12,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 20:43:12,091][49750] Updated weights for policy 0, policy_version 266351 (0.0031) [2024-04-26 20:43:14,509][49750] Updated weights for policy 0, policy_version 266361 (0.0033) [2024-04-26 20:43:17,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4364156928. Throughput: 0: 50627.6. Samples: 2117058260. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:43:17,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 20:43:18,414][49750] Updated weights for policy 0, policy_version 266371 (0.0030) [2024-04-26 20:43:21,112][49750] Updated weights for policy 0, policy_version 266381 (0.0032) [2024-04-26 20:43:22,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4364419072. Throughput: 0: 50745.5. Samples: 2117219420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:43:22,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:43:24,953][49750] Updated weights for policy 0, policy_version 266391 (0.0030) [2024-04-26 20:43:27,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4364681216. Throughput: 0: 50844.4. Samples: 2117524620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:43:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 20:43:27,662][49750] Updated weights for policy 0, policy_version 266401 (0.0027) [2024-04-26 20:43:31,492][49750] Updated weights for policy 0, policy_version 266411 (0.0033) [2024-04-26 20:43:32,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4364910592. Throughput: 0: 50808.8. Samples: 2117829960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 20:43:32,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 20:43:34,008][49750] Updated weights for policy 0, policy_version 266421 (0.0037) [2024-04-26 20:43:37,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4365172736. Throughput: 0: 50728.0. Samples: 2117975560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:43:37,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:43:37,868][49750] Updated weights for policy 0, policy_version 266431 (0.0029) [2024-04-26 20:43:40,598][49750] Updated weights for policy 0, policy_version 266441 (0.0030) [2024-04-26 20:43:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 4365418496. Throughput: 0: 50715.5. Samples: 2118277860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:43:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 20:43:44,208][49750] Updated weights for policy 0, policy_version 266451 (0.0047) [2024-04-26 20:43:46,903][49750] Updated weights for policy 0, policy_version 266461 (0.0028) [2024-04-26 20:43:47,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4365697024. Throughput: 0: 50765.9. Samples: 2118581740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:43:47,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 20:43:50,588][49750] Updated weights for policy 0, policy_version 266471 (0.0036) [2024-04-26 20:43:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4365942784. Throughput: 0: 50766.6. Samples: 2118743120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:43:52,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 20:43:53,388][49750] Updated weights for policy 0, policy_version 266481 (0.0033) [2024-04-26 20:43:57,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 4366172160. Throughput: 0: 50791.0. Samples: 2119044540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:43:57,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:43:57,130][49750] Updated weights for policy 0, policy_version 266491 (0.0035) [2024-04-26 20:43:59,739][49750] Updated weights for policy 0, policy_version 266501 (0.0030) [2024-04-26 20:44:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 4366450688. Throughput: 0: 50800.5. Samples: 2119344280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 20:44:03,601][49750] Updated weights for policy 0, policy_version 266511 (0.0025) [2024-04-26 20:44:06,129][49750] Updated weights for policy 0, policy_version 266521 (0.0033) [2024-04-26 20:44:07,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4366712832. Throughput: 0: 50738.2. Samples: 2119502640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:07,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 20:44:07,403][49728] Signal inference workers to stop experience collection... (31600 times) [2024-04-26 20:44:07,430][49750] InferenceWorker_p0-w0: stopping experience collection (31600 times) [2024-04-26 20:44:07,505][49728] Signal inference workers to resume experience collection... (31600 times) [2024-04-26 20:44:07,505][49750] InferenceWorker_p0-w0: resuming experience collection (31600 times) [2024-04-26 20:44:10,022][49750] Updated weights for policy 0, policy_version 266531 (0.0030) [2024-04-26 20:44:12,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51609.4, 300 sec: 50929.2). Total num frames: 4366974976. Throughput: 0: 50823.0. Samples: 2119811660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:12,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 20:44:12,578][49750] Updated weights for policy 0, policy_version 266541 (0.0031) [2024-04-26 20:44:16,412][49750] Updated weights for policy 0, policy_version 266551 (0.0032) [2024-04-26 20:44:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4367204352. Throughput: 0: 50685.7. Samples: 2120110820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:17,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 20:44:19,048][49750] Updated weights for policy 0, policy_version 266561 (0.0030) [2024-04-26 20:44:22,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4367450112. Throughput: 0: 50727.5. Samples: 2120258300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 20:44:22,871][49750] Updated weights for policy 0, policy_version 266571 (0.0029) [2024-04-26 20:44:25,616][49750] Updated weights for policy 0, policy_version 266581 (0.0028) [2024-04-26 20:44:27,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4367712256. Throughput: 0: 50753.7. Samples: 2120561780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:27,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:44:29,350][49750] Updated weights for policy 0, policy_version 266591 (0.0031) [2024-04-26 20:44:32,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4367974400. Throughput: 0: 50909.8. Samples: 2120872680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:32,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 20:44:32,155][49750] Updated weights for policy 0, policy_version 266601 (0.0031) [2024-04-26 20:44:35,775][49750] Updated weights for policy 0, policy_version 266611 (0.0031) [2024-04-26 20:44:37,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4368236544. Throughput: 0: 50637.3. Samples: 2121021800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:37,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 20:44:38,548][49750] Updated weights for policy 0, policy_version 266621 (0.0030) [2024-04-26 20:44:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4368465920. Throughput: 0: 50674.3. Samples: 2121324880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:42,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 20:44:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000266631_4368482304.pth... [2024-04-26 20:44:42,086][49750] Updated weights for policy 0, policy_version 266631 (0.0031) [2024-04-26 20:44:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000265887_4356292608.pth [2024-04-26 20:44:44,911][49750] Updated weights for policy 0, policy_version 266641 (0.0034) [2024-04-26 20:44:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4368728064. Throughput: 0: 50933.7. Samples: 2121636300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 20:44:47,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 20:44:48,643][49750] Updated weights for policy 0, policy_version 266651 (0.0037) [2024-04-26 20:44:51,328][49750] Updated weights for policy 0, policy_version 266661 (0.0035) [2024-04-26 20:44:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50818.2). 
Total num frames: 4368990208. Throughput: 0: 50686.2. Samples: 2121783520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:44:52,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 20:44:55,122][49750] Updated weights for policy 0, policy_version 266671 (0.0031) [2024-04-26 20:44:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4369235968. Throughput: 0: 50492.1. Samples: 2122083800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:44:57,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:44:57,771][49750] Updated weights for policy 0, policy_version 266681 (0.0034) [2024-04-26 20:45:01,563][49750] Updated weights for policy 0, policy_version 266691 (0.0029) [2024-04-26 20:45:02,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4369498112. Throughput: 0: 50819.9. Samples: 2122397720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:02,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 20:45:04,059][49750] Updated weights for policy 0, policy_version 266701 (0.0030) [2024-04-26 20:45:07,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4369727488. Throughput: 0: 50763.1. Samples: 2122542640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:07,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 20:45:07,916][49750] Updated weights for policy 0, policy_version 266711 (0.0025) [2024-04-26 20:45:10,018][49728] Signal inference workers to stop experience collection... (31650 times) [2024-04-26 20:45:10,061][49750] InferenceWorker_p0-w0: stopping experience collection (31650 times) [2024-04-26 20:45:10,078][49728] Signal inference workers to resume experience collection... 
(31650 times) [2024-04-26 20:45:10,083][49750] InferenceWorker_p0-w0: resuming experience collection (31650 times) [2024-04-26 20:45:10,497][49750] Updated weights for policy 0, policy_version 266721 (0.0029) [2024-04-26 20:45:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4369989632. Throughput: 0: 50697.8. Samples: 2122843180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:12,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:45:14,305][49750] Updated weights for policy 0, policy_version 266731 (0.0030) [2024-04-26 20:45:17,048][49750] Updated weights for policy 0, policy_version 266741 (0.0028) [2024-04-26 20:45:17,062][49517] Fps is (10 sec: 55706.3, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4370284544. Throughput: 0: 50726.2. Samples: 2123155360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:17,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 20:45:20,837][49750] Updated weights for policy 0, policy_version 266751 (0.0028) [2024-04-26 20:45:22,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.7, 300 sec: 50984.8). Total num frames: 4370530304. Throughput: 0: 50884.1. Samples: 2123311580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:22,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 20:45:23,624][49750] Updated weights for policy 0, policy_version 266761 (0.0033) [2024-04-26 20:45:27,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4370759680. Throughput: 0: 50978.0. Samples: 2123618900. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:27,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 20:45:27,159][49750] Updated weights for policy 0, policy_version 266771 (0.0035) [2024-04-26 20:45:30,564][49750] Updated weights for policy 0, policy_version 266781 (0.0031) [2024-04-26 20:45:32,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4371005440. Throughput: 0: 50813.1. Samples: 2123922880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:32,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:45:33,447][49750] Updated weights for policy 0, policy_version 266791 (0.0028) [2024-04-26 20:45:36,881][49750] Updated weights for policy 0, policy_version 266801 (0.0033) [2024-04-26 20:45:37,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4371267584. Throughput: 0: 50771.2. Samples: 2124068220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 20:45:40,018][49750] Updated weights for policy 0, policy_version 266811 (0.0031) [2024-04-26 20:45:42,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4371529728. Throughput: 0: 50914.6. Samples: 2124374960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 20:45:43,332][49750] Updated weights for policy 0, policy_version 266821 (0.0029) [2024-04-26 20:45:46,370][49750] Updated weights for policy 0, policy_version 266831 (0.0031) [2024-04-26 20:45:47,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4371791872. Throughput: 0: 50717.4. Samples: 2124680000. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:47,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 20:45:49,681][49750] Updated weights for policy 0, policy_version 266841 (0.0033) [2024-04-26 20:45:52,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.1, 300 sec: 50818.1). Total num frames: 4372021248. Throughput: 0: 50948.8. Samples: 2124835340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:52,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 20:45:52,696][49750] Updated weights for policy 0, policy_version 266851 (0.0039) [2024-04-26 20:45:56,117][49750] Updated weights for policy 0, policy_version 266861 (0.0032) [2024-04-26 20:45:57,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4372267008. Throughput: 0: 51006.7. Samples: 2125138480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 20:45:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 20:45:59,223][49750] Updated weights for policy 0, policy_version 266871 (0.0031) [2024-04-26 20:46:02,063][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4372545536. Throughput: 0: 50818.9. Samples: 2125442220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:02,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 20:46:02,511][49750] Updated weights for policy 0, policy_version 266881 (0.0036) [2024-04-26 20:46:05,788][49750] Updated weights for policy 0, policy_version 266891 (0.0036) [2024-04-26 20:46:07,063][49517] Fps is (10 sec: 55704.7, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 4372824064. Throughput: 0: 50865.6. Samples: 2125600540. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:07,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 20:46:09,089][49750] Updated weights for policy 0, policy_version 266901 (0.0032) [2024-04-26 20:46:12,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 4373037056. Throughput: 0: 50839.4. Samples: 2125906680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:12,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 20:46:12,332][49750] Updated weights for policy 0, policy_version 266911 (0.0031) [2024-04-26 20:46:15,540][49750] Updated weights for policy 0, policy_version 266921 (0.0026) [2024-04-26 20:46:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4373299200. Throughput: 0: 50853.7. Samples: 2126211300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:17,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 20:46:18,606][49750] Updated weights for policy 0, policy_version 266931 (0.0029) [2024-04-26 20:46:21,803][49750] Updated weights for policy 0, policy_version 266941 (0.0031) [2024-04-26 20:46:22,063][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4373561344. Throughput: 0: 50941.1. Samples: 2126360580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:22,072][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:46:24,835][49728] Signal inference workers to stop experience collection... (31700 times) [2024-04-26 20:46:24,836][49728] Signal inference workers to resume experience collection... 
(31700 times) [2024-04-26 20:46:24,847][49750] InferenceWorker_p0-w0: stopping experience collection (31700 times) [2024-04-26 20:46:24,847][49750] InferenceWorker_p0-w0: resuming experience collection (31700 times) [2024-04-26 20:46:24,972][49750] Updated weights for policy 0, policy_version 266951 (0.0031) [2024-04-26 20:46:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4373823488. Throughput: 0: 50867.2. Samples: 2126663980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:27,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 20:46:28,338][49750] Updated weights for policy 0, policy_version 266961 (0.0033) [2024-04-26 20:46:31,732][49750] Updated weights for policy 0, policy_version 266971 (0.0029) [2024-04-26 20:46:32,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4374069248. Throughput: 0: 50929.1. Samples: 2126971800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 20:46:35,048][49750] Updated weights for policy 0, policy_version 266981 (0.0032) [2024-04-26 20:46:37,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4374315008. Throughput: 0: 50883.7. Samples: 2127125100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:37,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 20:46:38,174][49750] Updated weights for policy 0, policy_version 266991 (0.0030) [2024-04-26 20:46:41,457][49750] Updated weights for policy 0, policy_version 267001 (0.0032) [2024-04-26 20:46:42,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4374560768. Throughput: 0: 50789.7. Samples: 2127424020. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:46:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000267002_4374560768.pth... [2024-04-26 20:46:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000266260_4362403840.pth [2024-04-26 20:46:44,470][49750] Updated weights for policy 0, policy_version 267011 (0.0030) [2024-04-26 20:46:47,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4374822912. Throughput: 0: 50823.3. Samples: 2127729260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:47,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 20:46:47,978][49750] Updated weights for policy 0, policy_version 267021 (0.0029) [2024-04-26 20:46:50,805][49750] Updated weights for policy 0, policy_version 267031 (0.0035) [2024-04-26 20:46:52,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4375101440. Throughput: 0: 50732.9. Samples: 2127883520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 20:46:54,338][49750] Updated weights for policy 0, policy_version 267041 (0.0036) [2024-04-26 20:46:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 4375347200. Throughput: 0: 50777.9. Samples: 2128191680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:46:57,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 20:46:57,255][49750] Updated weights for policy 0, policy_version 267051 (0.0036) [2024-04-26 20:47:00,809][49750] Updated weights for policy 0, policy_version 267061 (0.0035) [2024-04-26 20:47:02,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4375576576. Throughput: 0: 50814.6. Samples: 2128497960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:47:02,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 20:47:03,819][49750] Updated weights for policy 0, policy_version 267071 (0.0025) [2024-04-26 20:47:07,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4375838720. Throughput: 0: 50685.4. Samples: 2128641420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-26 20:47:07,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 20:47:07,259][49750] Updated weights for policy 0, policy_version 267081 (0.0033) [2024-04-26 20:47:10,156][49750] Updated weights for policy 0, policy_version 267091 (0.0026) [2024-04-26 20:47:12,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.6, 300 sec: 50818.1). Total num frames: 4376117248. Throughput: 0: 50827.8. Samples: 2128951240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:12,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 20:47:13,641][49750] Updated weights for policy 0, policy_version 267101 (0.0033) [2024-04-26 20:47:16,446][49750] Updated weights for policy 0, policy_version 267111 (0.0029) [2024-04-26 20:47:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4376346624. Throughput: 0: 50655.4. Samples: 2129251300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:17,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 20:47:20,123][49750] Updated weights for policy 0, policy_version 267121 (0.0031) [2024-04-26 20:47:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4376608768. Throughput: 0: 50788.9. Samples: 2129410600. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 20:47:22,911][49750] Updated weights for policy 0, policy_version 267131 (0.0030) [2024-04-26 20:47:26,600][49750] Updated weights for policy 0, policy_version 267141 (0.0028) [2024-04-26 20:47:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4376854528. Throughput: 0: 50931.5. Samples: 2129715940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:27,071][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 20:47:29,423][49750] Updated weights for policy 0, policy_version 267151 (0.0029) [2024-04-26 20:47:31,012][49728] Signal inference workers to stop experience collection... (31750 times) [2024-04-26 20:47:31,058][49750] InferenceWorker_p0-w0: stopping experience collection (31750 times) [2024-04-26 20:47:31,077][49728] Signal inference workers to resume experience collection... (31750 times) [2024-04-26 20:47:31,079][49750] InferenceWorker_p0-w0: resuming experience collection (31750 times) [2024-04-26 20:47:32,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4377100288. Throughput: 0: 50818.7. Samples: 2130016100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:32,071][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 20:47:33,070][49750] Updated weights for policy 0, policy_version 267161 (0.0030) [2024-04-26 20:47:35,711][49750] Updated weights for policy 0, policy_version 267171 (0.0034) [2024-04-26 20:47:37,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4377378816. Throughput: 0: 50750.2. Samples: 2130167280. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:37,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 20:47:39,661][49750] Updated weights for policy 0, policy_version 267181 (0.0030) [2024-04-26 20:47:42,038][49750] Updated weights for policy 0, policy_version 267191 (0.0030) [2024-04-26 20:47:42,063][49517] Fps is (10 sec: 55704.5, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4377657344. Throughput: 0: 50794.6. Samples: 2130477440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:42,071][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 20:47:46,139][49750] Updated weights for policy 0, policy_version 267201 (0.0031) [2024-04-26 20:47:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4377870336. Throughput: 0: 50745.4. Samples: 2130781500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:47,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 20:47:48,712][49750] Updated weights for policy 0, policy_version 267211 (0.0034) [2024-04-26 20:47:52,063][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4378116096. Throughput: 0: 50693.7. Samples: 2130922640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:52,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 20:47:52,641][49750] Updated weights for policy 0, policy_version 267221 (0.0030) [2024-04-26 20:47:55,260][49750] Updated weights for policy 0, policy_version 267231 (0.0031) [2024-04-26 20:47:57,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4378394624. Throughput: 0: 50656.2. Samples: 2131230760. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:47:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 20:47:59,000][49750] Updated weights for policy 0, policy_version 267241 (0.0029) [2024-04-26 20:48:01,687][49750] Updated weights for policy 0, policy_version 267251 (0.0029) [2024-04-26 20:48:02,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4378656768. Throughput: 0: 50859.6. Samples: 2131539980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:48:02,063][49517] Avg episode reward: [(0, '0.676')] [2024-04-26 20:48:05,551][49750] Updated weights for policy 0, policy_version 267261 (0.0039) [2024-04-26 20:48:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4378902528. Throughput: 0: 50825.0. Samples: 2131697720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:48:07,063][49517] Avg episode reward: [(0, '0.464')] [2024-04-26 20:48:08,404][49750] Updated weights for policy 0, policy_version 267271 (0.0030) [2024-04-26 20:48:11,860][49750] Updated weights for policy 0, policy_version 267281 (0.0035) [2024-04-26 20:48:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4379148288. Throughput: 0: 50669.0. Samples: 2131996040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:48:12,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 20:48:14,738][49750] Updated weights for policy 0, policy_version 267291 (0.0029) [2024-04-26 20:48:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4379394048. Throughput: 0: 50670.6. Samples: 2132296280. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:48:17,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 20:48:18,371][49750] Updated weights for policy 0, policy_version 267301 (0.0035) [2024-04-26 20:48:21,154][49750] Updated weights for policy 0, policy_version 267311 (0.0031) [2024-04-26 20:48:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4379639808. Throughput: 0: 50673.8. Samples: 2132447600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-26 20:48:22,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 20:48:24,878][49750] Updated weights for policy 0, policy_version 267321 (0.0032) [2024-04-26 20:48:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4379901952. Throughput: 0: 50616.6. Samples: 2132755180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:48:27,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 20:48:27,597][49750] Updated weights for policy 0, policy_version 267331 (0.0027) [2024-04-26 20:48:31,271][49750] Updated weights for policy 0, policy_version 267341 (0.0032) [2024-04-26 20:48:32,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4380164096. Throughput: 0: 50753.0. Samples: 2133065380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:48:32,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 20:48:34,031][49750] Updated weights for policy 0, policy_version 267351 (0.0032) [2024-04-26 20:48:37,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4380393472. Throughput: 0: 50806.7. Samples: 2133208940. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:48:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 20:48:37,648][49750] Updated weights for policy 0, policy_version 267361 (0.0036) [2024-04-26 20:48:40,353][49728] Signal inference workers to stop experience collection... (31800 times) [2024-04-26 20:48:40,353][49728] Signal inference workers to resume experience collection... (31800 times) [2024-04-26 20:48:40,385][49750] InferenceWorker_p0-w0: stopping experience collection (31800 times) [2024-04-26 20:48:40,385][49750] InferenceWorker_p0-w0: resuming experience collection (31800 times) [2024-04-26 20:48:40,513][49750] Updated weights for policy 0, policy_version 267371 (0.0031) [2024-04-26 20:48:42,063][49517] Fps is (10 sec: 49150.7, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4380655616. Throughput: 0: 50739.8. Samples: 2133514060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:48:42,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 20:48:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000267374_4380655616.pth... [2024-04-26 20:48:42,136][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000266631_4368482304.pth [2024-04-26 20:48:44,267][49750] Updated weights for policy 0, policy_version 267381 (0.0030) [2024-04-26 20:48:46,867][49750] Updated weights for policy 0, policy_version 267391 (0.0029) [2024-04-26 20:48:47,063][49517] Fps is (10 sec: 55705.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4380950528. Throughput: 0: 50639.5. Samples: 2133818760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:48:47,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 20:48:50,830][49750] Updated weights for policy 0, policy_version 267401 (0.0029) [2024-04-26 20:48:52,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4381196288. Throughput: 0: 50721.7. 
Samples: 2133980200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:48:52,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 20:48:53,245][49750] Updated weights for policy 0, policy_version 267411 (0.0032) [2024-04-26 20:48:57,062][49517] Fps is (10 sec: 44237.4, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4381392896. Throughput: 0: 50736.5. Samples: 2134279180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:48:57,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 20:48:57,291][49750] Updated weights for policy 0, policy_version 267421 (0.0033) [2024-04-26 20:48:59,676][49750] Updated weights for policy 0, policy_version 267431 (0.0031) [2024-04-26 20:49:02,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4381671424. Throughput: 0: 50735.4. Samples: 2134579380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:49:02,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 20:49:03,648][49750] Updated weights for policy 0, policy_version 267441 (0.0039) [2024-04-26 20:49:06,190][49750] Updated weights for policy 0, policy_version 267451 (0.0033) [2024-04-26 20:49:07,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4381933568. Throughput: 0: 50828.9. Samples: 2134734900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:49:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 20:49:10,169][49750] Updated weights for policy 0, policy_version 267461 (0.0028) [2024-04-26 20:49:12,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4382195712. Throughput: 0: 50831.1. Samples: 2135042580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:49:12,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 20:49:12,637][49750] Updated weights for policy 0, policy_version 267471 (0.0028) [2024-04-26 20:49:16,525][49750] Updated weights for policy 0, policy_version 267481 (0.0039) [2024-04-26 20:49:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4382441472. Throughput: 0: 50803.4. Samples: 2135351540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:49:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 20:49:19,183][49750] Updated weights for policy 0, policy_version 267491 (0.0029) [2024-04-26 20:49:22,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4382703616. Throughput: 0: 50842.2. Samples: 2135496840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:49:22,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:49:22,947][49750] Updated weights for policy 0, policy_version 267501 (0.0029) [2024-04-26 20:49:25,458][49750] Updated weights for policy 0, policy_version 267511 (0.0028) [2024-04-26 20:49:27,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4382949376. Throughput: 0: 50780.0. Samples: 2135799160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:49:27,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 20:49:29,349][49750] Updated weights for policy 0, policy_version 267521 (0.0030) [2024-04-26 20:49:31,784][49750] Updated weights for policy 0, policy_version 267531 (0.0036) [2024-04-26 20:49:32,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 4383227904. Throughput: 0: 50892.9. Samples: 2136108940. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 20:49:32,072][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 20:49:35,775][49750] Updated weights for policy 0, policy_version 267541 (0.0033) [2024-04-26 20:49:37,062][49517] Fps is (10 sec: 52430.1, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4383473664. Throughput: 0: 50740.5. Samples: 2136263520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:49:37,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 20:49:38,312][49750] Updated weights for policy 0, policy_version 267551 (0.0035) [2024-04-26 20:49:42,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4383703040. Throughput: 0: 50881.7. Samples: 2136568860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:49:42,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 20:49:42,219][49750] Updated weights for policy 0, policy_version 267561 (0.0030) [2024-04-26 20:49:44,835][49750] Updated weights for policy 0, policy_version 267571 (0.0036) [2024-04-26 20:49:47,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4383965184. Throughput: 0: 50933.0. Samples: 2136871360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:49:47,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 20:49:48,810][49750] Updated weights for policy 0, policy_version 267581 (0.0029) [2024-04-26 20:49:49,572][49728] Signal inference workers to stop experience collection... (31850 times) [2024-04-26 20:49:49,575][49728] Signal inference workers to resume experience collection... 
(31850 times) [2024-04-26 20:49:49,604][49750] InferenceWorker_p0-w0: stopping experience collection (31850 times) [2024-04-26 20:49:49,604][49750] InferenceWorker_p0-w0: resuming experience collection (31850 times) [2024-04-26 20:49:51,154][49750] Updated weights for policy 0, policy_version 267591 (0.0029) [2024-04-26 20:49:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4384227328. Throughput: 0: 50729.4. Samples: 2137017720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:49:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 20:49:55,130][49750] Updated weights for policy 0, policy_version 267601 (0.0035) [2024-04-26 20:49:57,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4384489472. Throughput: 0: 50803.9. Samples: 2137328760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:49:57,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 20:49:57,609][49750] Updated weights for policy 0, policy_version 267611 (0.0033) [2024-04-26 20:50:01,754][49750] Updated weights for policy 0, policy_version 267621 (0.0030) [2024-04-26 20:50:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4384735232. Throughput: 0: 50800.1. Samples: 2137637540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:50:02,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 20:50:04,166][49750] Updated weights for policy 0, policy_version 267631 (0.0030) [2024-04-26 20:50:07,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4384964608. Throughput: 0: 50620.9. Samples: 2137774780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:50:07,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 20:50:08,128][49750] Updated weights for policy 0, policy_version 267641 (0.0034) [2024-04-26 20:50:11,064][49750] Updated weights for policy 0, policy_version 267651 (0.0038) [2024-04-26 20:50:12,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4385243136. Throughput: 0: 50636.7. Samples: 2138077800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:50:12,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 20:50:14,598][49750] Updated weights for policy 0, policy_version 267661 (0.0032) [2024-04-26 20:50:17,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4385488896. Throughput: 0: 50579.1. Samples: 2138385000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:50:17,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 20:50:17,514][49750] Updated weights for policy 0, policy_version 267671 (0.0033) [2024-04-26 20:50:21,091][49750] Updated weights for policy 0, policy_version 267681 (0.0029) [2024-04-26 20:50:22,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4385751040. Throughput: 0: 50590.1. Samples: 2138540080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:50:22,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 20:50:23,995][49750] Updated weights for policy 0, policy_version 267691 (0.0030) [2024-04-26 20:50:27,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.6, 300 sec: 50762.6). Total num frames: 4385980416. Throughput: 0: 50553.4. Samples: 2138843760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:50:27,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 20:50:27,400][49750] Updated weights for policy 0, policy_version 267701 (0.0032) [2024-04-26 20:50:30,331][49750] Updated weights for policy 0, policy_version 267711 (0.0032) [2024-04-26 20:50:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4386242560. Throughput: 0: 50696.4. Samples: 2139152700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:50:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 20:50:33,719][49750] Updated weights for policy 0, policy_version 267721 (0.0034) [2024-04-26 20:50:36,813][49750] Updated weights for policy 0, policy_version 267731 (0.0034) [2024-04-26 20:50:37,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4386504704. Throughput: 0: 50743.9. Samples: 2139301200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:50:37,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:50:40,346][49750] Updated weights for policy 0, policy_version 267741 (0.0030) [2024-04-26 20:50:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4386766848. Throughput: 0: 50634.2. Samples: 2139607300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 20:50:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 20:50:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000267747_4386766848.pth... [2024-04-26 20:50:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000267002_4374560768.pth [2024-04-26 20:50:43,294][49750] Updated weights for policy 0, policy_version 267751 (0.0037) [2024-04-26 20:50:46,786][49750] Updated weights for policy 0, policy_version 267761 (0.0028) [2024-04-26 20:50:47,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). 
Total num frames: 4387012608. Throughput: 0: 50624.4. Samples: 2139915640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:50:47,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 20:50:49,638][49750] Updated weights for policy 0, policy_version 267771 (0.0031) [2024-04-26 20:50:52,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4387258368. Throughput: 0: 50756.1. Samples: 2140058800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:50:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 20:50:53,314][49750] Updated weights for policy 0, policy_version 267781 (0.0032) [2024-04-26 20:50:56,016][49750] Updated weights for policy 0, policy_version 267791 (0.0033) [2024-04-26 20:50:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4387504128. Throughput: 0: 50679.6. Samples: 2140358380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:50:57,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 20:50:59,330][49728] Signal inference workers to stop experience collection... (31900 times) [2024-04-26 20:50:59,331][49728] Signal inference workers to resume experience collection... (31900 times) [2024-04-26 20:50:59,357][49750] InferenceWorker_p0-w0: stopping experience collection (31900 times) [2024-04-26 20:50:59,357][49750] InferenceWorker_p0-w0: resuming experience collection (31900 times) [2024-04-26 20:50:59,874][49750] Updated weights for policy 0, policy_version 267801 (0.0031) [2024-04-26 20:51:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4387782656. Throughput: 0: 50684.5. Samples: 2140665800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:02,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 20:51:02,645][49750] Updated weights for policy 0, policy_version 267811 (0.0029) [2024-04-26 20:51:06,286][49750] Updated weights for policy 0, policy_version 267821 (0.0032) [2024-04-26 20:51:07,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4388028416. Throughput: 0: 50750.7. Samples: 2140823860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:07,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 20:51:09,173][49750] Updated weights for policy 0, policy_version 267831 (0.0033) [2024-04-26 20:51:12,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 4388241408. Throughput: 0: 50679.1. Samples: 2141124320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:12,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 20:51:12,612][49750] Updated weights for policy 0, policy_version 267841 (0.0027) [2024-04-26 20:51:15,857][49750] Updated weights for policy 0, policy_version 267851 (0.0033) [2024-04-26 20:51:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4388519936. Throughput: 0: 50521.4. Samples: 2141426160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 20:51:19,132][49750] Updated weights for policy 0, policy_version 267861 (0.0026) [2024-04-26 20:51:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4388765696. Throughput: 0: 50570.4. Samples: 2141576860. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 20:51:22,262][49750] Updated weights for policy 0, policy_version 267871 (0.0037) [2024-04-26 20:51:25,662][49750] Updated weights for policy 0, policy_version 267881 (0.0033) [2024-04-26 20:51:27,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4389027840. Throughput: 0: 50544.8. Samples: 2141881820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 20:51:28,896][49750] Updated weights for policy 0, policy_version 267891 (0.0030) [2024-04-26 20:51:32,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4389289984. Throughput: 0: 50508.1. Samples: 2142188500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:32,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 20:51:32,071][49750] Updated weights for policy 0, policy_version 267901 (0.0030) [2024-04-26 20:51:35,170][49750] Updated weights for policy 0, policy_version 267911 (0.0038) [2024-04-26 20:51:37,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4389535744. Throughput: 0: 50560.0. Samples: 2142334000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:37,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 20:51:38,387][49750] Updated weights for policy 0, policy_version 267921 (0.0035) [2024-04-26 20:51:41,672][49750] Updated weights for policy 0, policy_version 267931 (0.0034) [2024-04-26 20:51:42,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4389781504. Throughput: 0: 50602.0. Samples: 2142635480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 20:51:44,854][49750] Updated weights for policy 0, policy_version 267941 (0.0036) [2024-04-26 20:51:47,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4390060032. Throughput: 0: 50613.3. Samples: 2142943400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:47,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 20:51:48,042][49750] Updated weights for policy 0, policy_version 267951 (0.0032) [2024-04-26 20:51:51,314][49750] Updated weights for policy 0, policy_version 267961 (0.0033) [2024-04-26 20:51:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4390289408. Throughput: 0: 50687.7. Samples: 2143104800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 20:51:52,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:51:54,305][49750] Updated weights for policy 0, policy_version 267971 (0.0028) [2024-04-26 20:51:57,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4390535168. Throughput: 0: 50763.4. Samples: 2143408680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:51:57,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 20:51:57,837][49750] Updated weights for policy 0, policy_version 267981 (0.0031) [2024-04-26 20:52:00,801][49750] Updated weights for policy 0, policy_version 267991 (0.0029) [2024-04-26 20:52:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4390797312. Throughput: 0: 50670.1. Samples: 2143706320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:02,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 20:52:04,167][49750] Updated weights for policy 0, policy_version 268001 (0.0031) [2024-04-26 20:52:07,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4391059456. Throughput: 0: 50859.1. Samples: 2143865520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 20:52:07,227][49750] Updated weights for policy 0, policy_version 268011 (0.0035) [2024-04-26 20:52:10,622][49750] Updated weights for policy 0, policy_version 268021 (0.0034) [2024-04-26 20:52:12,013][49728] Signal inference workers to stop experience collection... (31950 times) [2024-04-26 20:52:12,014][49728] Signal inference workers to resume experience collection... (31950 times) [2024-04-26 20:52:12,029][49750] InferenceWorker_p0-w0: stopping experience collection (31950 times) [2024-04-26 20:52:12,029][49750] InferenceWorker_p0-w0: resuming experience collection (31950 times) [2024-04-26 20:52:12,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4391321600. Throughput: 0: 50834.0. Samples: 2144169340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:12,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 20:52:13,718][49750] Updated weights for policy 0, policy_version 268031 (0.0031) [2024-04-26 20:52:16,978][49750] Updated weights for policy 0, policy_version 268041 (0.0028) [2024-04-26 20:52:17,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4391583744. Throughput: 0: 50907.1. Samples: 2144479320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:17,063][49517] Avg episode reward: [(0, '0.479')] [2024-04-26 20:52:20,033][49750] Updated weights for policy 0, policy_version 268051 (0.0026) [2024-04-26 20:52:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4391829504. Throughput: 0: 50913.3. Samples: 2144625100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 20:52:23,454][49750] Updated weights for policy 0, policy_version 268061 (0.0032) [2024-04-26 20:52:26,602][49750] Updated weights for policy 0, policy_version 268071 (0.0028) [2024-04-26 20:52:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4392091648. Throughput: 0: 50956.5. Samples: 2144928520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:27,072][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 20:52:29,836][49750] Updated weights for policy 0, policy_version 268081 (0.0029) [2024-04-26 20:52:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4392337408. Throughput: 0: 51040.1. Samples: 2145240200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:32,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 20:52:33,204][49750] Updated weights for policy 0, policy_version 268091 (0.0031) [2024-04-26 20:52:36,371][49750] Updated weights for policy 0, policy_version 268101 (0.0034) [2024-04-26 20:52:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.3, 300 sec: 50540.5). Total num frames: 4392566784. Throughput: 0: 50789.2. Samples: 2145390320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 20:52:39,455][49750] Updated weights for policy 0, policy_version 268111 (0.0032) [2024-04-26 20:52:42,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4392828928. Throughput: 0: 50788.9. Samples: 2145694180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:42,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 20:52:42,126][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000268118_4392845312.pth... [2024-04-26 20:52:42,170][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000267374_4380655616.pth [2024-04-26 20:52:42,730][49750] Updated weights for policy 0, policy_version 268121 (0.0035) [2024-04-26 20:52:45,933][49750] Updated weights for policy 0, policy_version 268131 (0.0037) [2024-04-26 20:52:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4393074688. Throughput: 0: 50905.0. Samples: 2145997040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:47,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 20:52:49,214][49750] Updated weights for policy 0, policy_version 268141 (0.0028) [2024-04-26 20:52:52,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4393353216. Throughput: 0: 50701.6. Samples: 2146147100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:52,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 20:52:52,339][49750] Updated weights for policy 0, policy_version 268151 (0.0031) [2024-04-26 20:52:55,811][49750] Updated weights for policy 0, policy_version 268161 (0.0033) [2024-04-26 20:52:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 4393598976. Throughput: 0: 50666.6. Samples: 2146449340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:52:57,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 20:52:58,906][49750] Updated weights for policy 0, policy_version 268171 (0.0037) [2024-04-26 20:53:02,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4393861120. Throughput: 0: 50624.4. Samples: 2146757420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:53:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 20:53:02,234][49750] Updated weights for policy 0, policy_version 268181 (0.0035) [2024-04-26 20:53:05,246][49750] Updated weights for policy 0, policy_version 268191 (0.0033) [2024-04-26 20:53:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4394106880. Throughput: 0: 50668.4. Samples: 2146905180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-26 20:53:07,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 20:53:09,061][49750] Updated weights for policy 0, policy_version 268201 (0.0028) [2024-04-26 20:53:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4394352640. Throughput: 0: 50598.4. Samples: 2147205440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:12,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 20:53:12,253][49750] Updated weights for policy 0, policy_version 268211 (0.0034) [2024-04-26 20:53:15,486][49750] Updated weights for policy 0, policy_version 268221 (0.0035) [2024-04-26 20:53:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4394614784. Throughput: 0: 50582.3. Samples: 2147516400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:17,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 20:53:18,761][49750] Updated weights for policy 0, policy_version 268231 (0.0030) [2024-04-26 20:53:21,858][49750] Updated weights for policy 0, policy_version 268241 (0.0032) [2024-04-26 20:53:22,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4394860544. Throughput: 0: 50587.5. Samples: 2147666760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:22,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 20:53:22,669][49728] Signal inference workers to stop experience collection... (32000 times) [2024-04-26 20:53:22,671][49728] Signal inference workers to resume experience collection... (32000 times) [2024-04-26 20:53:22,698][49750] InferenceWorker_p0-w0: stopping experience collection (32000 times) [2024-04-26 20:53:22,698][49750] InferenceWorker_p0-w0: resuming experience collection (32000 times) [2024-04-26 20:53:25,309][49750] Updated weights for policy 0, policy_version 268251 (0.0038) [2024-04-26 20:53:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4395122688. Throughput: 0: 50663.7. Samples: 2147974040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:27,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 20:53:28,170][49750] Updated weights for policy 0, policy_version 268261 (0.0030) [2024-04-26 20:53:31,829][49750] Updated weights for policy 0, policy_version 268271 (0.0033) [2024-04-26 20:53:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4395368448. Throughput: 0: 50607.0. Samples: 2148274360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:32,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 20:53:34,545][49750] Updated weights for policy 0, policy_version 268281 (0.0034) [2024-04-26 20:53:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4395630592. Throughput: 0: 50571.6. Samples: 2148422820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 20:53:38,162][49750] Updated weights for policy 0, policy_version 268291 (0.0028) [2024-04-26 20:53:40,958][49750] Updated weights for policy 0, policy_version 268301 (0.0037) [2024-04-26 20:53:42,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 4395859968. Throughput: 0: 50605.4. Samples: 2148726580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:42,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 20:53:44,669][49750] Updated weights for policy 0, policy_version 268311 (0.0035) [2024-04-26 20:53:47,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 4396138496. Throughput: 0: 50638.2. Samples: 2149036140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:47,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 20:53:48,002][49750] Updated weights for policy 0, policy_version 268321 (0.0034) [2024-04-26 20:53:51,150][49750] Updated weights for policy 0, policy_version 268331 (0.0032) [2024-04-26 20:53:52,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4396400640. Throughput: 0: 50595.6. Samples: 2149181980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:52,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 20:53:54,513][49750] Updated weights for policy 0, policy_version 268341 (0.0033) [2024-04-26 20:53:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4396630016. Throughput: 0: 50710.1. Samples: 2149487400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:53:57,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 20:53:57,547][49750] Updated weights for policy 0, policy_version 268351 (0.0032) [2024-04-26 20:54:00,896][49750] Updated weights for policy 0, policy_version 268361 (0.0035) [2024-04-26 20:54:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4396892160. Throughput: 0: 50576.3. Samples: 2149792340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:54:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 20:54:04,029][49750] Updated weights for policy 0, policy_version 268371 (0.0033) [2024-04-26 20:54:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4397137920. Throughput: 0: 50662.3. Samples: 2149946560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:54:07,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 20:54:07,352][49750] Updated weights for policy 0, policy_version 268381 (0.0028) [2024-04-26 20:54:10,435][49750] Updated weights for policy 0, policy_version 268391 (0.0031) [2024-04-26 20:54:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4397400064. Throughput: 0: 50547.8. Samples: 2150248700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:54:12,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 20:54:13,640][49750] Updated weights for policy 0, policy_version 268401 (0.0034) [2024-04-26 20:54:17,033][49750] Updated weights for policy 0, policy_version 268411 (0.0028) [2024-04-26 20:54:17,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50651.6). Total num frames: 4397645824. Throughput: 0: 50668.5. Samples: 2150554440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 20:54:17,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 20:54:20,268][49750] Updated weights for policy 0, policy_version 268421 (0.0040) [2024-04-26 20:54:22,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4397907968. Throughput: 0: 50700.4. Samples: 2150704340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:54:22,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 20:54:23,326][49750] Updated weights for policy 0, policy_version 268431 (0.0030) [2024-04-26 20:54:26,637][49750] Updated weights for policy 0, policy_version 268441 (0.0034) [2024-04-26 20:54:27,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 4398153728. Throughput: 0: 50567.0. Samples: 2151002100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:54:27,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 20:54:27,369][49728] Signal inference workers to stop experience collection... (32050 times) [2024-04-26 20:54:27,370][49728] Signal inference workers to resume experience collection... 
(32050 times) [2024-04-26 20:54:27,393][49750] InferenceWorker_p0-w0: stopping experience collection (32050 times) [2024-04-26 20:54:27,393][49750] InferenceWorker_p0-w0: resuming experience collection (32050 times) [2024-04-26 20:54:29,643][49750] Updated weights for policy 0, policy_version 268451 (0.0028) [2024-04-26 20:54:32,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4398432256. Throughput: 0: 50647.7. Samples: 2151315280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:54:32,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 20:54:33,065][49750] Updated weights for policy 0, policy_version 268461 (0.0028) [2024-04-26 20:54:36,154][49750] Updated weights for policy 0, policy_version 268471 (0.0030) [2024-04-26 20:54:37,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4398678016. Throughput: 0: 50706.6. Samples: 2151463780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:54:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:54:39,398][49750] Updated weights for policy 0, policy_version 268481 (0.0030) [2024-04-26 20:54:42,063][49517] Fps is (10 sec: 45873.8, 60 sec: 50517.1, 300 sec: 50596.0). Total num frames: 4398891008. Throughput: 0: 50674.3. Samples: 2151767760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:54:42,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:54:42,133][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000268488_4398907392.pth... [2024-04-26 20:54:42,177][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000267747_4386766848.pth [2024-04-26 20:54:42,653][49750] Updated weights for policy 0, policy_version 268491 (0.0031) [2024-04-26 20:54:45,868][49750] Updated weights for policy 0, policy_version 268501 (0.0032) [2024-04-26 20:54:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.5, 300 sec: 50651.6). 
Total num frames: 4399169536. Throughput: 0: 50644.0. Samples: 2152071320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:54:47,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 20:54:49,290][49750] Updated weights for policy 0, policy_version 268511 (0.0040) [2024-04-26 20:54:52,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 4399415296. Throughput: 0: 50610.2. Samples: 2152224020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:54:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 20:54:52,433][49750] Updated weights for policy 0, policy_version 268521 (0.0033) [2024-04-26 20:54:55,578][49750] Updated weights for policy 0, policy_version 268531 (0.0030) [2024-04-26 20:54:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4399677440. Throughput: 0: 50618.5. Samples: 2152526520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:54:57,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 20:54:59,000][49750] Updated weights for policy 0, policy_version 268541 (0.0032) [2024-04-26 20:55:02,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4399939584. Throughput: 0: 50635.3. Samples: 2152833020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:55:02,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 20:55:02,068][49750] Updated weights for policy 0, policy_version 268551 (0.0034) [2024-04-26 20:55:05,365][49750] Updated weights for policy 0, policy_version 268561 (0.0030) [2024-04-26 20:55:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 4400185344. Throughput: 0: 50617.1. Samples: 2152982100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:55:07,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 20:55:08,544][49750] Updated weights for policy 0, policy_version 268571 (0.0032) [2024-04-26 20:55:11,736][49750] Updated weights for policy 0, policy_version 268581 (0.0033) [2024-04-26 20:55:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4400431104. Throughput: 0: 50661.6. Samples: 2153281860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:55:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 20:55:14,960][49750] Updated weights for policy 0, policy_version 268591 (0.0030) [2024-04-26 20:55:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 4400693248. Throughput: 0: 50467.1. Samples: 2153586300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:55:17,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 20:55:18,227][49750] Updated weights for policy 0, policy_version 268601 (0.0029) [2024-04-26 20:55:21,489][49750] Updated weights for policy 0, policy_version 268611 (0.0028) [2024-04-26 20:55:22,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4400939008. Throughput: 0: 50572.4. Samples: 2153739540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:55:22,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:55:24,816][49750] Updated weights for policy 0, policy_version 268621 (0.0033) [2024-04-26 20:55:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4401184768. Throughput: 0: 50691.7. Samples: 2154048880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 20:55:27,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 20:55:28,086][49750] Updated weights for policy 0, policy_version 268631 (0.0030) [2024-04-26 20:55:31,117][49750] Updated weights for policy 0, policy_version 268641 (0.0028) [2024-04-26 20:55:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4401446912. Throughput: 0: 50672.8. Samples: 2154351600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:55:32,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 20:55:34,441][49750] Updated weights for policy 0, policy_version 268651 (0.0028) [2024-04-26 20:55:37,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4401709056. Throughput: 0: 50627.2. Samples: 2154502240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:55:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 20:55:37,397][49750] Updated weights for policy 0, policy_version 268661 (0.0034) [2024-04-26 20:55:40,852][49750] Updated weights for policy 0, policy_version 268671 (0.0033) [2024-04-26 20:55:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.7, 300 sec: 50651.6). Total num frames: 4401954816. Throughput: 0: 50723.9. Samples: 2154809100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:55:42,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 20:55:43,925][49750] Updated weights for policy 0, policy_version 268681 (0.0028) [2024-04-26 20:55:46,530][49728] Signal inference workers to stop experience collection... (32100 times) [2024-04-26 20:55:46,580][49750] InferenceWorker_p0-w0: stopping experience collection (32100 times) [2024-04-26 20:55:46,602][49728] Signal inference workers to resume experience collection... 
(32100 times) [2024-04-26 20:55:46,603][49750] InferenceWorker_p0-w0: resuming experience collection (32100 times) [2024-04-26 20:55:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 4402200576. Throughput: 0: 50754.5. Samples: 2155116980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:55:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 20:55:47,350][49750] Updated weights for policy 0, policy_version 268691 (0.0030) [2024-04-26 20:55:50,491][49750] Updated weights for policy 0, policy_version 268701 (0.0033) [2024-04-26 20:55:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4402446336. Throughput: 0: 50747.9. Samples: 2155265760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:55:52,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 20:55:53,923][49750] Updated weights for policy 0, policy_version 268711 (0.0031) [2024-04-26 20:55:56,782][49750] Updated weights for policy 0, policy_version 268721 (0.0031) [2024-04-26 20:55:57,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4402724864. Throughput: 0: 50705.6. Samples: 2155563620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:55:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 20:56:00,296][49750] Updated weights for policy 0, policy_version 268731 (0.0038) [2024-04-26 20:56:02,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4402987008. Throughput: 0: 50739.5. Samples: 2155869580. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:56:02,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 20:56:03,232][49750] Updated weights for policy 0, policy_version 268741 (0.0030) [2024-04-26 20:56:06,638][49750] Updated weights for policy 0, policy_version 268751 (0.0029) [2024-04-26 20:56:07,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 4403232768. Throughput: 0: 50914.6. Samples: 2156030700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:56:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 20:56:09,653][49750] Updated weights for policy 0, policy_version 268761 (0.0030) [2024-04-26 20:56:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4403478528. Throughput: 0: 50820.5. Samples: 2156335800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:56:12,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 20:56:13,074][49750] Updated weights for policy 0, policy_version 268771 (0.0041) [2024-04-26 20:56:16,201][49750] Updated weights for policy 0, policy_version 268781 (0.0031) [2024-04-26 20:56:17,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4403724288. Throughput: 0: 50686.3. Samples: 2156632480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:56:17,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 20:56:19,708][49750] Updated weights for policy 0, policy_version 268791 (0.0035) [2024-04-26 20:56:22,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4403986432. Throughput: 0: 50797.6. Samples: 2156788140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:56:22,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 20:56:22,509][49750] Updated weights for policy 0, policy_version 268801 (0.0025) [2024-04-26 20:56:26,124][49750] Updated weights for policy 0, policy_version 268811 (0.0027) [2024-04-26 20:56:27,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4404248576. Throughput: 0: 50676.4. Samples: 2157089540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:56:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:56:29,348][49750] Updated weights for policy 0, policy_version 268821 (0.0027) [2024-04-26 20:56:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4404494336. Throughput: 0: 50595.2. Samples: 2157393760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:56:32,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 20:56:32,572][49750] Updated weights for policy 0, policy_version 268831 (0.0035) [2024-04-26 20:56:36,015][49750] Updated weights for policy 0, policy_version 268841 (0.0035) [2024-04-26 20:56:37,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4404740096. Throughput: 0: 50550.8. Samples: 2157540540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:56:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 20:56:39,110][49750] Updated weights for policy 0, policy_version 268851 (0.0035) [2024-04-26 20:56:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4405002240. Throughput: 0: 50672.1. Samples: 2157843860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 20:56:42,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:56:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000268860_4405002240.pth... 
[2024-04-26 20:56:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000268118_4392845312.pth [2024-04-26 20:56:42,516][49750] Updated weights for policy 0, policy_version 268861 (0.0033) [2024-04-26 20:56:45,627][49750] Updated weights for policy 0, policy_version 268871 (0.0031) [2024-04-26 20:56:47,062][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4405264384. Throughput: 0: 50652.5. Samples: 2158148940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:56:47,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 20:56:48,030][49728] Signal inference workers to stop experience collection... (32150 times) [2024-04-26 20:56:48,030][49728] Signal inference workers to resume experience collection... (32150 times) [2024-04-26 20:56:48,055][49750] InferenceWorker_p0-w0: stopping experience collection (32150 times) [2024-04-26 20:56:48,056][49750] InferenceWorker_p0-w0: resuming experience collection (32150 times) [2024-04-26 20:56:48,998][49750] Updated weights for policy 0, policy_version 268881 (0.0030) [2024-04-26 20:56:51,956][49750] Updated weights for policy 0, policy_version 268891 (0.0033) [2024-04-26 20:56:52,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4405510144. Throughput: 0: 50683.2. Samples: 2158311440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:56:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 20:56:55,486][49750] Updated weights for policy 0, policy_version 268901 (0.0034) [2024-04-26 20:56:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4405755904. Throughput: 0: 50759.2. Samples: 2158619960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:56:57,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 20:56:58,446][49750] Updated weights for policy 0, policy_version 268911 (0.0032) [2024-04-26 20:57:02,054][49750] Updated weights for policy 0, policy_version 268921 (0.0034) [2024-04-26 20:57:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 4406001664. Throughput: 0: 50874.6. Samples: 2158921840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:02,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 20:57:05,047][49750] Updated weights for policy 0, policy_version 268931 (0.0030) [2024-04-26 20:57:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4406263808. Throughput: 0: 50843.3. Samples: 2159076080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:07,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 20:57:08,305][49750] Updated weights for policy 0, policy_version 268941 (0.0032) [2024-04-26 20:57:11,342][49750] Updated weights for policy 0, policy_version 268951 (0.0036) [2024-04-26 20:57:12,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4406542336. Throughput: 0: 50908.5. Samples: 2159380420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:12,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 20:57:14,818][49750] Updated weights for policy 0, policy_version 268961 (0.0028) [2024-04-26 20:57:17,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4406771712. Throughput: 0: 50910.5. Samples: 2159684740. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:17,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 20:57:17,746][49750] Updated weights for policy 0, policy_version 268971 (0.0036) [2024-04-26 20:57:21,155][49750] Updated weights for policy 0, policy_version 268981 (0.0031) [2024-04-26 20:57:22,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4407033856. Throughput: 0: 50853.8. Samples: 2159828980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:57:24,262][49750] Updated weights for policy 0, policy_version 268991 (0.0031) [2024-04-26 20:57:27,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 4407279616. Throughput: 0: 50836.0. Samples: 2160131480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:27,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 20:57:27,715][49750] Updated weights for policy 0, policy_version 269001 (0.0030) [2024-04-26 20:57:30,738][49750] Updated weights for policy 0, policy_version 269011 (0.0039) [2024-04-26 20:57:32,062][49517] Fps is (10 sec: 52430.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4407558144. Throughput: 0: 50994.3. Samples: 2160443680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:32,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 20:57:34,195][49750] Updated weights for policy 0, policy_version 269021 (0.0025) [2024-04-26 20:57:36,998][49750] Updated weights for policy 0, policy_version 269031 (0.0032) [2024-04-26 20:57:37,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4407803904. Throughput: 0: 50719.6. Samples: 2160593820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:37,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 20:57:40,557][49750] Updated weights for policy 0, policy_version 269041 (0.0028) [2024-04-26 20:57:42,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4408049664. Throughput: 0: 50754.0. Samples: 2160903900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:42,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 20:57:43,404][49750] Updated weights for policy 0, policy_version 269051 (0.0029) [2024-04-26 20:57:46,888][49750] Updated weights for policy 0, policy_version 269061 (0.0031) [2024-04-26 20:57:47,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4408295424. Throughput: 0: 50952.6. Samples: 2161214700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:47,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 20:57:49,927][49750] Updated weights for policy 0, policy_version 269071 (0.0032) [2024-04-26 20:57:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 4408541184. Throughput: 0: 50705.2. Samples: 2161357820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 20:57:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 20:57:52,483][49728] Signal inference workers to stop experience collection... (32200 times) [2024-04-26 20:57:52,484][49728] Signal inference workers to resume experience collection... 
(32200 times) [2024-04-26 20:57:52,524][49750] InferenceWorker_p0-w0: stopping experience collection (32200 times) [2024-04-26 20:57:52,524][49750] InferenceWorker_p0-w0: resuming experience collection (32200 times) [2024-04-26 20:57:53,387][49750] Updated weights for policy 0, policy_version 269081 (0.0031) [2024-04-26 20:57:56,214][49750] Updated weights for policy 0, policy_version 269091 (0.0031) [2024-04-26 20:57:57,063][49517] Fps is (10 sec: 54065.3, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4408836096. Throughput: 0: 50690.0. Samples: 2161661480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:57:57,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 20:57:59,875][49750] Updated weights for policy 0, policy_version 269101 (0.0030) [2024-04-26 20:58:02,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4409065472. Throughput: 0: 50678.3. Samples: 2161965260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 20:58:02,532][49750] Updated weights for policy 0, policy_version 269111 (0.0033) [2024-04-26 20:58:06,140][49750] Updated weights for policy 0, policy_version 269121 (0.0030) [2024-04-26 20:58:07,062][49517] Fps is (10 sec: 47515.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4409311232. Throughput: 0: 50859.9. Samples: 2162117660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 20:58:09,010][49750] Updated weights for policy 0, policy_version 269131 (0.0033) [2024-04-26 20:58:12,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4409556992. Throughput: 0: 50829.9. Samples: 2162418820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:12,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 20:58:12,613][49750] Updated weights for policy 0, policy_version 269141 (0.0032) [2024-04-26 20:58:15,612][49750] Updated weights for policy 0, policy_version 269151 (0.0028) [2024-04-26 20:58:17,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4409851904. Throughput: 0: 50598.6. Samples: 2162720620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:17,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 20:58:18,977][49750] Updated weights for policy 0, policy_version 269161 (0.0030) [2024-04-26 20:58:22,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4410081280. Throughput: 0: 50725.0. Samples: 2162876440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:22,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 20:58:22,090][49750] Updated weights for policy 0, policy_version 269171 (0.0028) [2024-04-26 20:58:25,474][49750] Updated weights for policy 0, policy_version 269181 (0.0031) [2024-04-26 20:58:27,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4410327040. Throughput: 0: 50645.1. Samples: 2163182920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:27,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 20:58:28,425][49750] Updated weights for policy 0, policy_version 269191 (0.0031) [2024-04-26 20:58:31,783][49750] Updated weights for policy 0, policy_version 269201 (0.0027) [2024-04-26 20:58:32,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4410589184. Throughput: 0: 50578.4. Samples: 2163490740. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:32,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 20:58:34,763][49750] Updated weights for policy 0, policy_version 269211 (0.0026) [2024-04-26 20:58:37,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.1, 300 sec: 50707.0). Total num frames: 4410818560. Throughput: 0: 50755.8. Samples: 2163641840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:37,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 20:58:38,408][49750] Updated weights for policy 0, policy_version 269221 (0.0031) [2024-04-26 20:58:41,309][49750] Updated weights for policy 0, policy_version 269231 (0.0029) [2024-04-26 20:58:42,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4411129856. Throughput: 0: 50828.5. Samples: 2163948760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:42,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 20:58:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000269234_4411129856.pth... [2024-04-26 20:58:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000268488_4398907392.pth [2024-04-26 20:58:44,973][49750] Updated weights for policy 0, policy_version 269241 (0.0033) [2024-04-26 20:58:47,062][49517] Fps is (10 sec: 52430.0, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4411342848. Throughput: 0: 50768.1. Samples: 2164249820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:47,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 20:58:47,296][49728] Signal inference workers to stop experience collection... (32250 times) [2024-04-26 20:58:47,337][49750] InferenceWorker_p0-w0: stopping experience collection (32250 times) [2024-04-26 20:58:47,400][49728] Signal inference workers to resume experience collection... 
(32250 times) [2024-04-26 20:58:47,400][49750] InferenceWorker_p0-w0: resuming experience collection (32250 times) [2024-04-26 20:58:47,669][49750] Updated weights for policy 0, policy_version 269251 (0.0031) [2024-04-26 20:58:51,375][49750] Updated weights for policy 0, policy_version 269261 (0.0027) [2024-04-26 20:58:52,062][49517] Fps is (10 sec: 47514.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4411604992. Throughput: 0: 50864.8. Samples: 2164406580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 20:58:54,060][49750] Updated weights for policy 0, policy_version 269271 (0.0033) [2024-04-26 20:58:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4411850752. Throughput: 0: 50801.2. Samples: 2164704880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:58:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 20:58:57,831][49750] Updated weights for policy 0, policy_version 269281 (0.0030) [2024-04-26 20:59:00,561][49750] Updated weights for policy 0, policy_version 269291 (0.0037) [2024-04-26 20:59:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4412112896. Throughput: 0: 50660.0. Samples: 2165000320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-26 20:59:02,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 20:59:04,299][49750] Updated weights for policy 0, policy_version 269301 (0.0029) [2024-04-26 20:59:07,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4412375040. Throughput: 0: 50726.0. Samples: 2165159120. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:07,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 20:59:07,191][49750] Updated weights for policy 0, policy_version 269311 (0.0025) [2024-04-26 20:59:10,788][49750] Updated weights for policy 0, policy_version 269321 (0.0029) [2024-04-26 20:59:12,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4412604416. Throughput: 0: 50800.7. Samples: 2165468960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 20:59:13,529][49750] Updated weights for policy 0, policy_version 269331 (0.0029) [2024-04-26 20:59:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4412882944. Throughput: 0: 50588.5. Samples: 2165767220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:17,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 20:59:17,075][49750] Updated weights for policy 0, policy_version 269341 (0.0030) [2024-04-26 20:59:20,049][49750] Updated weights for policy 0, policy_version 269351 (0.0031) [2024-04-26 20:59:22,062][49517] Fps is (10 sec: 50791.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4413112320. Throughput: 0: 50700.4. Samples: 2165923340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:22,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 20:59:23,531][49750] Updated weights for policy 0, policy_version 269361 (0.0028) [2024-04-26 20:59:26,376][49750] Updated weights for policy 0, policy_version 269371 (0.0034) [2024-04-26 20:59:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4413390848. Throughput: 0: 50670.8. Samples: 2166228940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 20:59:30,046][49750] Updated weights for policy 0, policy_version 269381 (0.0035) [2024-04-26 20:59:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4413620224. Throughput: 0: 50754.4. Samples: 2166533760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 20:59:33,022][49750] Updated weights for policy 0, policy_version 269391 (0.0028) [2024-04-26 20:59:34,948][49728] Signal inference workers to stop experience collection... (32300 times) [2024-04-26 20:59:34,972][49750] InferenceWorker_p0-w0: stopping experience collection (32300 times) [2024-04-26 20:59:35,011][49728] Signal inference workers to resume experience collection... (32300 times) [2024-04-26 20:59:35,012][49750] InferenceWorker_p0-w0: resuming experience collection (32300 times) [2024-04-26 20:59:36,401][49750] Updated weights for policy 0, policy_version 269401 (0.0026) [2024-04-26 20:59:37,063][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4413882368. Throughput: 0: 50739.0. Samples: 2166689840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 20:59:39,483][49750] Updated weights for policy 0, policy_version 269411 (0.0030) [2024-04-26 20:59:42,063][49517] Fps is (10 sec: 50789.1, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4414128128. Throughput: 0: 50756.3. Samples: 2166988920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:42,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 20:59:43,060][49750] Updated weights for policy 0, policy_version 269421 (0.0032) [2024-04-26 20:59:45,867][49750] Updated weights for policy 0, policy_version 269431 (0.0031) [2024-04-26 20:59:47,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4414373888. Throughput: 0: 50723.0. Samples: 2167282860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:47,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 20:59:49,750][49750] Updated weights for policy 0, policy_version 269441 (0.0035) [2024-04-26 20:59:52,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4414668800. Throughput: 0: 50736.6. Samples: 2167442260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:52,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 20:59:52,401][49750] Updated weights for policy 0, policy_version 269451 (0.0029) [2024-04-26 20:59:56,550][49750] Updated weights for policy 0, policy_version 269461 (0.0033) [2024-04-26 20:59:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4414881792. Throughput: 0: 50513.4. Samples: 2167742060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 20:59:57,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 20:59:58,750][49750] Updated weights for policy 0, policy_version 269471 (0.0027) [2024-04-26 21:00:02,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4415143936. Throughput: 0: 50688.3. Samples: 2168048200. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:00:02,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 21:00:02,885][49750] Updated weights for policy 0, policy_version 269481 (0.0029) [2024-04-26 21:00:05,160][49750] Updated weights for policy 0, policy_version 269491 (0.0032) [2024-04-26 21:00:07,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4415389696. Throughput: 0: 50583.5. Samples: 2168199600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:00:07,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 21:00:09,437][49750] Updated weights for policy 0, policy_version 269501 (0.0029) [2024-04-26 21:00:11,858][49750] Updated weights for policy 0, policy_version 269511 (0.0035) [2024-04-26 21:00:12,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4415668224. Throughput: 0: 50563.2. Samples: 2168504280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:00:12,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 21:00:16,073][49750] Updated weights for policy 0, policy_version 269521 (0.0025) [2024-04-26 21:00:17,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4415913984. Throughput: 0: 50720.6. Samples: 2168816200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:00:17,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 21:00:18,502][49750] Updated weights for policy 0, policy_version 269531 (0.0030) [2024-04-26 21:00:22,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4416143360. Throughput: 0: 50507.1. Samples: 2168962660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:00:22,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 21:00:22,431][49750] Updated weights for policy 0, policy_version 269541 (0.0033) [2024-04-26 21:00:25,002][49750] Updated weights for policy 0, policy_version 269551 (0.0039) [2024-04-26 21:00:27,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4416421888. Throughput: 0: 50574.0. Samples: 2169264740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:00:27,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 21:00:28,842][49750] Updated weights for policy 0, policy_version 269561 (0.0037) [2024-04-26 21:00:31,520][49750] Updated weights for policy 0, policy_version 269571 (0.0033) [2024-04-26 21:00:32,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4416667648. Throughput: 0: 50744.4. Samples: 2169566360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:00:32,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 21:00:35,223][49750] Updated weights for policy 0, policy_version 269581 (0.0031) [2024-04-26 21:00:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4416946176. Throughput: 0: 50616.9. Samples: 2169720020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:00:37,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 21:00:37,877][49750] Updated weights for policy 0, policy_version 269591 (0.0032) [2024-04-26 21:00:40,369][49728] Signal inference workers to stop experience collection... (32350 times) [2024-04-26 21:00:40,410][49750] InferenceWorker_p0-w0: stopping experience collection (32350 times) [2024-04-26 21:00:40,473][49728] Signal inference workers to resume experience collection... 
(32350 times) [2024-04-26 21:00:40,473][49750] InferenceWorker_p0-w0: resuming experience collection (32350 times) [2024-04-26 21:00:41,585][49750] Updated weights for policy 0, policy_version 269601 (0.0030) [2024-04-26 21:00:42,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4417159168. Throughput: 0: 50807.7. Samples: 2170028400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:00:42,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 21:00:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000269603_4417175552.pth... [2024-04-26 21:00:42,128][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000268860_4405002240.pth [2024-04-26 21:00:44,281][49750] Updated weights for policy 0, policy_version 269611 (0.0030) [2024-04-26 21:00:47,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4417421312. Throughput: 0: 50610.0. Samples: 2170325640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:00:47,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 21:00:48,202][49750] Updated weights for policy 0, policy_version 269621 (0.0033) [2024-04-26 21:00:50,805][49750] Updated weights for policy 0, policy_version 269631 (0.0033) [2024-04-26 21:00:52,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.1, 300 sec: 50651.6). Total num frames: 4417667072. Throughput: 0: 50607.0. Samples: 2170476920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:00:52,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 21:00:54,632][49750] Updated weights for policy 0, policy_version 269641 (0.0028) [2024-04-26 21:00:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4417945600. Throughput: 0: 50661.7. Samples: 2170784060. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:00:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 21:00:57,304][49750] Updated weights for policy 0, policy_version 269651 (0.0030) [2024-04-26 21:01:01,117][49750] Updated weights for policy 0, policy_version 269661 (0.0032) [2024-04-26 21:01:02,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4418174976. Throughput: 0: 50602.2. Samples: 2171093300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:01:02,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 21:01:03,569][49750] Updated weights for policy 0, policy_version 269671 (0.0035) [2024-04-26 21:01:07,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4418420736. Throughput: 0: 50654.0. Samples: 2171242080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:01:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 21:01:07,479][49750] Updated weights for policy 0, policy_version 269681 (0.0031) [2024-04-26 21:01:09,999][49750] Updated weights for policy 0, policy_version 269691 (0.0030) [2024-04-26 21:01:12,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4418715648. Throughput: 0: 50666.6. Samples: 2171544740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:01:12,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 21:01:13,864][49750] Updated weights for policy 0, policy_version 269701 (0.0031) [2024-04-26 21:01:16,711][49750] Updated weights for policy 0, policy_version 269711 (0.0030) [2024-04-26 21:01:17,062][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4418961408. Throughput: 0: 50819.2. Samples: 2171853220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:01:17,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 21:01:20,225][49750] Updated weights for policy 0, policy_version 269721 (0.0026) [2024-04-26 21:01:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4419223552. Throughput: 0: 50793.3. Samples: 2172005720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:01:22,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 21:01:23,270][49750] Updated weights for policy 0, policy_version 269731 (0.0029) [2024-04-26 21:01:26,694][49750] Updated weights for policy 0, policy_version 269741 (0.0033) [2024-04-26 21:01:27,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4419452928. Throughput: 0: 50796.7. Samples: 2172314260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:01:27,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 21:01:29,551][49750] Updated weights for policy 0, policy_version 269751 (0.0029) [2024-04-26 21:01:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4419715072. Throughput: 0: 50837.4. Samples: 2172613320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:01:32,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 21:01:33,091][49750] Updated weights for policy 0, policy_version 269761 (0.0028) [2024-04-26 21:01:36,011][49750] Updated weights for policy 0, policy_version 269771 (0.0030) [2024-04-26 21:01:37,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4419960832. Throughput: 0: 50919.5. Samples: 2172768300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:01:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 21:01:37,145][49728] Signal inference workers to stop experience collection... 
(32400 times) [2024-04-26 21:01:37,176][49750] InferenceWorker_p0-w0: stopping experience collection (32400 times) [2024-04-26 21:01:37,211][49728] Signal inference workers to resume experience collection... (32400 times) [2024-04-26 21:01:37,211][49750] InferenceWorker_p0-w0: resuming experience collection (32400 times) [2024-04-26 21:01:39,409][49750] Updated weights for policy 0, policy_version 269781 (0.0031) [2024-04-26 21:01:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4420222976. Throughput: 0: 50931.6. Samples: 2173075980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:01:42,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 21:01:42,342][49750] Updated weights for policy 0, policy_version 269791 (0.0034) [2024-04-26 21:01:45,771][49750] Updated weights for policy 0, policy_version 269801 (0.0036) [2024-04-26 21:01:47,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4420485120. Throughput: 0: 50797.2. Samples: 2173379180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:01:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 21:01:48,925][49750] Updated weights for policy 0, policy_version 269811 (0.0035) [2024-04-26 21:01:52,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4420714496. Throughput: 0: 50868.7. Samples: 2173531180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:01:52,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 21:01:52,450][49750] Updated weights for policy 0, policy_version 269821 (0.0030) [2024-04-26 21:01:55,317][49750] Updated weights for policy 0, policy_version 269831 (0.0036) [2024-04-26 21:01:57,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4420993024. Throughput: 0: 50822.2. Samples: 2173831740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:01:57,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 21:01:59,095][49750] Updated weights for policy 0, policy_version 269841 (0.0027) [2024-04-26 21:02:01,704][49750] Updated weights for policy 0, policy_version 269851 (0.0032) [2024-04-26 21:02:02,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4421238784. Throughput: 0: 50821.4. Samples: 2174140180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:02:02,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 21:02:05,501][49750] Updated weights for policy 0, policy_version 269861 (0.0032) [2024-04-26 21:02:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4421500928. Throughput: 0: 50835.6. Samples: 2174293320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:02:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 21:02:08,180][49750] Updated weights for policy 0, policy_version 269871 (0.0031) [2024-04-26 21:02:11,910][49750] Updated weights for policy 0, policy_version 269881 (0.0039) [2024-04-26 21:02:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4421730304. Throughput: 0: 50713.6. Samples: 2174596360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:02:12,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 21:02:14,493][49750] Updated weights for policy 0, policy_version 269891 (0.0033) [2024-04-26 21:02:17,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4421992448. Throughput: 0: 50811.9. Samples: 2174899860. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:02:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 21:02:18,352][49750] Updated weights for policy 0, policy_version 269901 (0.0030) [2024-04-26 21:02:21,182][49750] Updated weights for policy 0, policy_version 269911 (0.0030) [2024-04-26 21:02:22,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4422270976. Throughput: 0: 50683.3. Samples: 2175049040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:02:22,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 21:02:24,643][49750] Updated weights for policy 0, policy_version 269921 (0.0028) [2024-04-26 21:02:27,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4422516736. Throughput: 0: 50767.6. Samples: 2175360520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:02:27,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 21:02:27,917][49750] Updated weights for policy 0, policy_version 269931 (0.0032) [2024-04-26 21:02:31,084][49750] Updated weights for policy 0, policy_version 269941 (0.0033) [2024-04-26 21:02:32,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4422778880. Throughput: 0: 50882.7. Samples: 2175668900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:02:32,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 21:02:34,214][49750] Updated weights for policy 0, policy_version 269951 (0.0028) [2024-04-26 21:02:37,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4423008256. Throughput: 0: 50886.2. Samples: 2175821060. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-26 21:02:37,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 21:02:37,419][49750] Updated weights for policy 0, policy_version 269961 (0.0029) [2024-04-26 21:02:40,565][49750] Updated weights for policy 0, policy_version 269971 (0.0033) [2024-04-26 21:02:42,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4423254016. Throughput: 0: 50941.8. Samples: 2176124120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:02:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 21:02:42,078][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000269975_4423270400.pth... [2024-04-26 21:02:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000269234_4411129856.pth [2024-04-26 21:02:43,946][49750] Updated weights for policy 0, policy_version 269981 (0.0029) [2024-04-26 21:02:46,976][49750] Updated weights for policy 0, policy_version 269991 (0.0035) [2024-04-26 21:02:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4423532544. Throughput: 0: 50809.8. Samples: 2176426620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:02:47,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 21:02:50,365][49750] Updated weights for policy 0, policy_version 270001 (0.0029) [2024-04-26 21:02:50,954][49728] Signal inference workers to stop experience collection... (32450 times) [2024-04-26 21:02:50,959][49728] Signal inference workers to resume experience collection... (32450 times) [2024-04-26 21:02:50,974][49750] InferenceWorker_p0-w0: stopping experience collection (32450 times) [2024-04-26 21:02:50,974][49750] InferenceWorker_p0-w0: resuming experience collection (32450 times) [2024-04-26 21:02:52,062][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4423794688. Throughput: 0: 50965.2. Samples: 2176586760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:02:52,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 21:02:53,433][49750] Updated weights for policy 0, policy_version 270011 (0.0027) [2024-04-26 21:02:56,963][49750] Updated weights for policy 0, policy_version 270021 (0.0031) [2024-04-26 21:02:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4424024064. Throughput: 0: 50865.2. Samples: 2176885300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:02:57,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 21:02:59,894][49750] Updated weights for policy 0, policy_version 270031 (0.0027) [2024-04-26 21:03:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4424286208. Throughput: 0: 50929.8. Samples: 2177191700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:02,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 21:03:03,464][49750] Updated weights for policy 0, policy_version 270041 (0.0026) [2024-04-26 21:03:06,291][49750] Updated weights for policy 0, policy_version 270051 (0.0031) [2024-04-26 21:03:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4424548352. Throughput: 0: 50950.2. Samples: 2177341800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 21:03:09,787][49750] Updated weights for policy 0, policy_version 270061 (0.0029) [2024-04-26 21:03:12,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 4424810496. Throughput: 0: 50729.0. Samples: 2177643320. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:12,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 21:03:12,730][49750] Updated weights for policy 0, policy_version 270071 (0.0039) [2024-04-26 21:03:16,111][49750] Updated weights for policy 0, policy_version 270081 (0.0032) [2024-04-26 21:03:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4425056256. Throughput: 0: 50866.9. Samples: 2177957900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:17,063][49517] Avg episode reward: [(0, '0.707')] [2024-04-26 21:03:19,212][49750] Updated weights for policy 0, policy_version 270091 (0.0031) [2024-04-26 21:03:22,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 4425302016. Throughput: 0: 50773.7. Samples: 2178105880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:22,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 21:03:22,582][49750] Updated weights for policy 0, policy_version 270101 (0.0030) [2024-04-26 21:03:25,875][49750] Updated weights for policy 0, policy_version 270111 (0.0031) [2024-04-26 21:03:27,062][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4425564160. Throughput: 0: 50923.9. Samples: 2178415700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:27,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 21:03:28,988][49750] Updated weights for policy 0, policy_version 270121 (0.0035) [2024-04-26 21:03:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4425793536. Throughput: 0: 50894.0. Samples: 2178716860. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:32,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 21:03:32,361][49750] Updated weights for policy 0, policy_version 270131 (0.0033) [2024-04-26 21:03:35,372][49750] Updated weights for policy 0, policy_version 270141 (0.0034) [2024-04-26 21:03:37,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 4426072064. Throughput: 0: 50651.1. Samples: 2178866060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:37,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 21:03:38,890][49750] Updated weights for policy 0, policy_version 270151 (0.0032) [2024-04-26 21:03:41,814][49750] Updated weights for policy 0, policy_version 270161 (0.0038) [2024-04-26 21:03:42,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4426334208. Throughput: 0: 50819.6. Samples: 2179172180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 21:03:45,398][49750] Updated weights for policy 0, policy_version 270171 (0.0028) [2024-04-26 21:03:47,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4426579968. Throughput: 0: 50764.9. Samples: 2179476120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:47,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 21:03:48,330][49750] Updated weights for policy 0, policy_version 270181 (0.0028) [2024-04-26 21:03:51,707][49750] Updated weights for policy 0, policy_version 270191 (0.0033) [2024-04-26 21:03:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4426825728. Throughput: 0: 50727.0. Samples: 2179624520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-26 21:03:52,071][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 21:03:54,690][49750] Updated weights for policy 0, policy_version 270201 (0.0037) [2024-04-26 21:03:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4427087872. Throughput: 0: 50823.2. Samples: 2179930360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:03:57,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 21:03:58,026][49750] Updated weights for policy 0, policy_version 270211 (0.0036) [2024-04-26 21:04:01,157][49750] Updated weights for policy 0, policy_version 270221 (0.0028) [2024-04-26 21:04:02,010][49728] Signal inference workers to stop experience collection... (32500 times) [2024-04-26 21:04:02,011][49728] Signal inference workers to resume experience collection... (32500 times) [2024-04-26 21:04:02,035][49750] InferenceWorker_p0-w0: stopping experience collection (32500 times) [2024-04-26 21:04:02,035][49750] InferenceWorker_p0-w0: resuming experience collection (32500 times) [2024-04-26 21:04:02,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4427350016. Throughput: 0: 50745.1. Samples: 2180241440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 21:04:04,530][49750] Updated weights for policy 0, policy_version 270231 (0.0028) [2024-04-26 21:04:07,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4427563008. Throughput: 0: 50968.9. Samples: 2180399480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:04:07,510][49750] Updated weights for policy 0, policy_version 270241 (0.0028) [2024-04-26 21:04:11,058][49750] Updated weights for policy 0, policy_version 270251 (0.0029) [2024-04-26 21:04:12,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4427857920. Throughput: 0: 50832.9. Samples: 2180703180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:12,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 21:04:13,940][49750] Updated weights for policy 0, policy_version 270261 (0.0030) [2024-04-26 21:04:17,062][49517] Fps is (10 sec: 54068.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4428103680. Throughput: 0: 50903.8. Samples: 2181007520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:17,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 21:04:17,333][49750] Updated weights for policy 0, policy_version 270271 (0.0028) [2024-04-26 21:04:20,295][49750] Updated weights for policy 0, policy_version 270281 (0.0035) [2024-04-26 21:04:22,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4428365824. Throughput: 0: 51026.6. Samples: 2181162260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 21:04:23,762][49750] Updated weights for policy 0, policy_version 270291 (0.0031) [2024-04-26 21:04:26,803][49750] Updated weights for policy 0, policy_version 270301 (0.0031) [2024-04-26 21:04:27,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4428627968. Throughput: 0: 50846.1. Samples: 2181460260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:27,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 21:04:30,363][49750] Updated weights for policy 0, policy_version 270311 (0.0035) [2024-04-26 21:04:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4428873728. Throughput: 0: 50759.5. Samples: 2181760300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:32,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 21:04:33,190][49750] Updated weights for policy 0, policy_version 270321 (0.0034) [2024-04-26 21:04:36,700][49750] Updated weights for policy 0, policy_version 270331 (0.0032) [2024-04-26 21:04:37,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4429119488. Throughput: 0: 50924.4. Samples: 2181916120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 21:04:39,538][49750] Updated weights for policy 0, policy_version 270341 (0.0031) [2024-04-26 21:04:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4429365248. Throughput: 0: 50860.0. Samples: 2182219060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:42,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 21:04:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000270347_4429365248.pth... [2024-04-26 21:04:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000269603_4417175552.pth [2024-04-26 21:04:43,081][49750] Updated weights for policy 0, policy_version 270351 (0.0028) [2024-04-26 21:04:45,961][49750] Updated weights for policy 0, policy_version 270361 (0.0030) [2024-04-26 21:04:47,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4429660160. Throughput: 0: 50825.9. Samples: 2182528600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 21:04:49,464][49750] Updated weights for policy 0, policy_version 270371 (0.0033) [2024-04-26 21:04:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4429873152. Throughput: 0: 50764.6. Samples: 2182683880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:52,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 21:04:52,493][49750] Updated weights for policy 0, policy_version 270381 (0.0033) [2024-04-26 21:04:55,900][49750] Updated weights for policy 0, policy_version 270391 (0.0026) [2024-04-26 21:04:57,063][49517] Fps is (10 sec: 47512.5, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 4430135296. Throughput: 0: 50909.6. Samples: 2182994120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:04:57,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 21:04:58,300][49728] Signal inference workers to stop experience collection... (32550 times) [2024-04-26 21:04:58,301][49728] Signal inference workers to resume experience collection... (32550 times) [2024-04-26 21:04:58,327][49750] InferenceWorker_p0-w0: stopping experience collection (32550 times) [2024-04-26 21:04:58,327][49750] InferenceWorker_p0-w0: resuming experience collection (32550 times) [2024-04-26 21:04:58,928][49750] Updated weights for policy 0, policy_version 270401 (0.0032) [2024-04-26 21:05:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4430381056. Throughput: 0: 50796.9. Samples: 2183293380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 21:05:02,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 21:05:02,436][49750] Updated weights for policy 0, policy_version 270411 (0.0028) [2024-04-26 21:05:05,510][49750] Updated weights for policy 0, policy_version 270421 (0.0032) [2024-04-26 21:05:07,063][49517] Fps is (10 sec: 52429.5, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 4430659584. Throughput: 0: 50816.9. Samples: 2183449020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:07,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 21:05:08,855][49750] Updated weights for policy 0, policy_version 270431 (0.0030) [2024-04-26 21:05:11,940][49750] Updated weights for policy 0, policy_version 270441 (0.0039) [2024-04-26 21:05:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4430905344. Throughput: 0: 50946.4. Samples: 2183752840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:12,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 21:05:15,195][49750] Updated weights for policy 0, policy_version 270451 (0.0034) [2024-04-26 21:05:17,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4431151104. Throughput: 0: 50951.4. Samples: 2184053120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 21:05:18,250][49750] Updated weights for policy 0, policy_version 270461 (0.0032) [2024-04-26 21:05:21,708][49750] Updated weights for policy 0, policy_version 270471 (0.0033) [2024-04-26 21:05:22,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4431413248. Throughput: 0: 50749.8. Samples: 2184199860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 21:05:24,742][49750] Updated weights for policy 0, policy_version 270481 (0.0033) [2024-04-26 21:05:27,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4431642624. Throughput: 0: 50783.0. Samples: 2184504300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 21:05:28,287][49750] Updated weights for policy 0, policy_version 270491 (0.0038) [2024-04-26 21:05:31,310][49750] Updated weights for policy 0, policy_version 270501 (0.0038) [2024-04-26 21:05:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4431937536. Throughput: 0: 50679.5. Samples: 2184809180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 21:05:34,689][49750] Updated weights for policy 0, policy_version 270511 (0.0029) [2024-04-26 21:05:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4432166912. Throughput: 0: 50718.1. Samples: 2184966200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:37,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 21:05:37,621][49750] Updated weights for policy 0, policy_version 270521 (0.0030) [2024-04-26 21:05:41,136][49750] Updated weights for policy 0, policy_version 270531 (0.0039) [2024-04-26 21:05:42,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4432412672. Throughput: 0: 50750.0. Samples: 2185277860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:42,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 21:05:44,168][49750] Updated weights for policy 0, policy_version 270541 (0.0033) [2024-04-26 21:05:47,063][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.1, 300 sec: 50818.2). Total num frames: 4432658432. Throughput: 0: 50727.9. Samples: 2185576140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 21:05:47,562][49750] Updated weights for policy 0, policy_version 270551 (0.0038) [2024-04-26 21:05:50,725][49750] Updated weights for policy 0, policy_version 270561 (0.0033) [2024-04-26 21:05:52,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4432953344. Throughput: 0: 50665.0. Samples: 2185728940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:52,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 21:05:53,875][49750] Updated weights for policy 0, policy_version 270571 (0.0028) [2024-04-26 21:05:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4433166336. Throughput: 0: 50689.3. Samples: 2186033860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:05:57,063][49517] Avg episode reward: [(0, '0.691')] [2024-04-26 21:05:57,223][49728] Signal inference workers to stop experience collection... (32600 times) [2024-04-26 21:05:57,260][49750] InferenceWorker_p0-w0: stopping experience collection (32600 times) [2024-04-26 21:05:57,296][49728] Signal inference workers to resume experience collection... 
(32600 times) [2024-04-26 21:05:57,296][49750] InferenceWorker_p0-w0: resuming experience collection (32600 times) [2024-04-26 21:05:57,298][49750] Updated weights for policy 0, policy_version 270581 (0.0029) [2024-04-26 21:06:00,364][49750] Updated weights for policy 0, policy_version 270591 (0.0033) [2024-04-26 21:06:02,063][49517] Fps is (10 sec: 45874.2, 60 sec: 50517.1, 300 sec: 50818.1). Total num frames: 4433412096. Throughput: 0: 50700.8. Samples: 2186334660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:06:02,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:06:03,728][49750] Updated weights for policy 0, policy_version 270601 (0.0031) [2024-04-26 21:06:06,888][49750] Updated weights for policy 0, policy_version 270611 (0.0026) [2024-04-26 21:06:07,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4433690624. Throughput: 0: 50683.1. Samples: 2186480600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:06:07,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 21:06:10,133][49750] Updated weights for policy 0, policy_version 270621 (0.0033) [2024-04-26 21:06:12,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4433936384. Throughput: 0: 50777.0. Samples: 2186789260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-26 21:06:12,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 21:06:13,228][49750] Updated weights for policy 0, policy_version 270631 (0.0043) [2024-04-26 21:06:16,502][49750] Updated weights for policy 0, policy_version 270641 (0.0028) [2024-04-26 21:06:17,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4434214912. Throughput: 0: 50762.1. Samples: 2187093480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:06:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 21:06:19,678][49750] Updated weights for policy 0, policy_version 270651 (0.0029) [2024-04-26 21:06:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4434460672. Throughput: 0: 50647.6. Samples: 2187245340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:06:22,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 21:06:22,895][49750] Updated weights for policy 0, policy_version 270661 (0.0029) [2024-04-26 21:06:26,221][49750] Updated weights for policy 0, policy_version 270671 (0.0033) [2024-04-26 21:06:27,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4434690048. Throughput: 0: 50594.2. Samples: 2187554600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:06:27,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 21:06:29,240][49750] Updated weights for policy 0, policy_version 270681 (0.0028) [2024-04-26 21:06:32,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 4434952192. Throughput: 0: 50810.6. Samples: 2187862620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:06:32,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 21:06:33,002][49750] Updated weights for policy 0, policy_version 270691 (0.0031) [2024-04-26 21:06:35,628][49750] Updated weights for policy 0, policy_version 270701 (0.0029) [2024-04-26 21:06:37,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4435214336. Throughput: 0: 50594.0. Samples: 2188005680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:06:37,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 21:06:39,473][49750] Updated weights for policy 0, policy_version 270711 (0.0033) [2024-04-26 21:06:42,048][49750] Updated weights for policy 0, policy_version 270721 (0.0033) [2024-04-26 21:06:42,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4435492864. Throughput: 0: 50672.0. Samples: 2188314100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:06:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 21:06:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000270721_4435492864.pth... [2024-04-26 21:06:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000269975_4423270400.pth [2024-04-26 21:06:46,091][49750] Updated weights for policy 0, policy_version 270731 (0.0028) [2024-04-26 21:06:47,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4435705856. Throughput: 0: 50793.6. Samples: 2188620360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:06:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 21:06:48,666][49750] Updated weights for policy 0, policy_version 270741 (0.0032) [2024-04-26 21:06:52,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4435968000. Throughput: 0: 50750.4. Samples: 2188764360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:06:52,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 21:06:52,594][49750] Updated weights for policy 0, policy_version 270751 (0.0030) [2024-04-26 21:06:55,261][49750] Updated weights for policy 0, policy_version 270761 (0.0031) [2024-04-26 21:06:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4436230144. Throughput: 0: 50710.2. Samples: 2189071220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:06:57,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 21:06:58,910][49750] Updated weights for policy 0, policy_version 270771 (0.0030) [2024-04-26 21:07:00,129][49728] Signal inference workers to stop experience collection... (32650 times) [2024-04-26 21:07:00,129][49728] Signal inference workers to resume experience collection... (32650 times) [2024-04-26 21:07:00,149][49750] InferenceWorker_p0-w0: stopping experience collection (32650 times) [2024-04-26 21:07:00,149][49750] InferenceWorker_p0-w0: resuming experience collection (32650 times) [2024-04-26 21:07:01,706][49750] Updated weights for policy 0, policy_version 270781 (0.0029) [2024-04-26 21:07:02,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51336.6, 300 sec: 50818.1). Total num frames: 4436492288. Throughput: 0: 50753.3. Samples: 2189377380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:07:02,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 21:07:05,430][49750] Updated weights for policy 0, policy_version 270791 (0.0032) [2024-04-26 21:07:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4436738048. Throughput: 0: 50802.8. Samples: 2189531460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:07:07,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 21:07:08,154][49750] Updated weights for policy 0, policy_version 270801 (0.0032) [2024-04-26 21:07:11,931][49750] Updated weights for policy 0, policy_version 270811 (0.0040) [2024-04-26 21:07:12,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.1, 300 sec: 50818.1). Total num frames: 4436983808. Throughput: 0: 50649.0. Samples: 2189833820. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:07:12,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 21:07:14,660][49750] Updated weights for policy 0, policy_version 270821 (0.0030) [2024-04-26 21:07:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4437245952. Throughput: 0: 50638.4. Samples: 2190141340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:07:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:07:18,201][49750] Updated weights for policy 0, policy_version 270831 (0.0031) [2024-04-26 21:07:21,033][49750] Updated weights for policy 0, policy_version 270841 (0.0030) [2024-04-26 21:07:22,062][49517] Fps is (10 sec: 49154.0, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4437475328. Throughput: 0: 50857.6. Samples: 2190294260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:07:22,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 21:07:24,581][49750] Updated weights for policy 0, policy_version 270851 (0.0038) [2024-04-26 21:07:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4437770240. Throughput: 0: 50868.6. Samples: 2190603180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-26 21:07:27,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 21:07:27,372][49750] Updated weights for policy 0, policy_version 270861 (0.0031) [2024-04-26 21:07:31,049][49750] Updated weights for policy 0, policy_version 270871 (0.0029) [2024-04-26 21:07:32,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4438016000. Throughput: 0: 50850.0. Samples: 2190908620. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:07:32,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 21:07:33,816][49750] Updated weights for policy 0, policy_version 270881 (0.0035) [2024-04-26 21:07:37,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 4438245376. Throughput: 0: 50967.0. Samples: 2191057880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:07:37,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 21:07:37,505][49750] Updated weights for policy 0, policy_version 270891 (0.0030) [2024-04-26 21:07:40,287][49750] Updated weights for policy 0, policy_version 270901 (0.0034) [2024-04-26 21:07:42,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4438491136. Throughput: 0: 50717.8. Samples: 2191353520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:07:42,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 21:07:43,889][49750] Updated weights for policy 0, policy_version 270911 (0.0032) [2024-04-26 21:07:46,573][49750] Updated weights for policy 0, policy_version 270921 (0.0031) [2024-04-26 21:07:47,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4438786048. Throughput: 0: 50731.2. Samples: 2191660280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:07:47,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 21:07:50,208][49750] Updated weights for policy 0, policy_version 270931 (0.0031) [2024-04-26 21:07:52,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4439031808. Throughput: 0: 50921.0. Samples: 2191822900. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:07:52,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:07:53,085][49750] Updated weights for policy 0, policy_version 270941 (0.0026) [2024-04-26 21:07:56,690][49750] Updated weights for policy 0, policy_version 270951 (0.0027) [2024-04-26 21:07:57,064][49517] Fps is (10 sec: 47508.4, 60 sec: 50516.4, 300 sec: 50762.4). Total num frames: 4439261184. Throughput: 0: 50851.5. Samples: 2192122180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:07:57,064][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 21:07:59,576][49750] Updated weights for policy 0, policy_version 270961 (0.0030) [2024-04-26 21:08:02,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4439523328. Throughput: 0: 50946.0. Samples: 2192433920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:08:02,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 21:08:03,175][49750] Updated weights for policy 0, policy_version 270971 (0.0027) [2024-04-26 21:08:03,831][49728] Signal inference workers to stop experience collection... (32700 times) [2024-04-26 21:08:03,871][49750] InferenceWorker_p0-w0: stopping experience collection (32700 times) [2024-04-26 21:08:03,933][49728] Signal inference workers to resume experience collection... (32700 times) [2024-04-26 21:08:03,933][49750] InferenceWorker_p0-w0: resuming experience collection (32700 times) [2024-04-26 21:08:06,148][49750] Updated weights for policy 0, policy_version 270981 (0.0028) [2024-04-26 21:08:07,062][49517] Fps is (10 sec: 52435.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4439785472. Throughput: 0: 50738.7. Samples: 2192577500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:08:07,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-26 21:08:09,428][49750] Updated weights for policy 0, policy_version 270991 (0.0031) [2024-04-26 21:08:12,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.8, 300 sec: 50818.2). Total num frames: 4440047616. Throughput: 0: 50806.6. Samples: 2192889480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:08:12,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 21:08:12,439][49750] Updated weights for policy 0, policy_version 271001 (0.0029) [2024-04-26 21:08:15,756][49750] Updated weights for policy 0, policy_version 271011 (0.0033) [2024-04-26 21:08:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4440293376. Throughput: 0: 50917.5. Samples: 2193199900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:08:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:08:18,765][49750] Updated weights for policy 0, policy_version 271021 (0.0029) [2024-04-26 21:08:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4440539136. Throughput: 0: 50779.3. Samples: 2193342940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:08:22,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 21:08:22,412][49750] Updated weights for policy 0, policy_version 271031 (0.0031) [2024-04-26 21:08:25,447][49750] Updated weights for policy 0, policy_version 271041 (0.0035) [2024-04-26 21:08:27,062][49517] Fps is (10 sec: 47513.3, 60 sec: 49971.1, 300 sec: 50762.7). Total num frames: 4440768512. Throughput: 0: 50872.9. Samples: 2193642800. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:08:27,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 21:08:29,074][49750] Updated weights for policy 0, policy_version 271051 (0.0029) [2024-04-26 21:08:31,952][49750] Updated weights for policy 0, policy_version 271061 (0.0029) [2024-04-26 21:08:32,062][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4441063424. Throughput: 0: 50851.1. Samples: 2193948580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:08:32,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 21:08:35,603][49750] Updated weights for policy 0, policy_version 271071 (0.0034) [2024-04-26 21:08:37,063][49517] Fps is (10 sec: 55704.4, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4441325568. Throughput: 0: 50815.2. Samples: 2194109600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 21:08:37,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 21:08:38,345][49750] Updated weights for policy 0, policy_version 271081 (0.0034) [2024-04-26 21:08:42,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4441538560. Throughput: 0: 50952.3. Samples: 2194414980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:08:42,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:08:42,084][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000271091_4441554944.pth... [2024-04-26 21:08:42,087][49750] Updated weights for policy 0, policy_version 271091 (0.0031) [2024-04-26 21:08:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000270347_4429365248.pth [2024-04-26 21:08:44,707][49750] Updated weights for policy 0, policy_version 271101 (0.0031) [2024-04-26 21:08:47,062][49517] Fps is (10 sec: 45876.3, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4441784320. Throughput: 0: 50740.7. Samples: 2194717240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:08:47,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 21:08:48,465][49750] Updated weights for policy 0, policy_version 271111 (0.0029) [2024-04-26 21:08:51,155][49750] Updated weights for policy 0, policy_version 271121 (0.0029) [2024-04-26 21:08:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4442062848. Throughput: 0: 50814.2. Samples: 2194864140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:08:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 21:08:54,892][49750] Updated weights for policy 0, policy_version 271131 (0.0030) [2024-04-26 21:08:57,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50791.4, 300 sec: 50707.1). Total num frames: 4442308608. Throughput: 0: 50508.0. Samples: 2195162340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:08:57,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 21:08:57,605][49750] Updated weights for policy 0, policy_version 271141 (0.0039) [2024-04-26 21:09:01,282][49750] Updated weights for policy 0, policy_version 271151 (0.0032) [2024-04-26 21:09:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 4442587136. Throughput: 0: 50549.2. Samples: 2195474620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:02,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 21:09:04,138][49750] Updated weights for policy 0, policy_version 271161 (0.0032) [2024-04-26 21:09:07,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4442816512. Throughput: 0: 50682.1. Samples: 2195623640. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:07,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 21:09:07,722][49750] Updated weights for policy 0, policy_version 271171 (0.0030) [2024-04-26 21:09:10,414][49750] Updated weights for policy 0, policy_version 271181 (0.0034) [2024-04-26 21:09:11,841][49728] Signal inference workers to stop experience collection... (32750 times) [2024-04-26 21:09:11,841][49728] Signal inference workers to resume experience collection... (32750 times) [2024-04-26 21:09:11,861][49750] InferenceWorker_p0-w0: stopping experience collection (32750 times) [2024-04-26 21:09:11,862][49750] InferenceWorker_p0-w0: resuming experience collection (32750 times) [2024-04-26 21:09:12,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 4443078656. Throughput: 0: 50714.0. Samples: 2195924940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:12,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 21:09:14,046][49750] Updated weights for policy 0, policy_version 271191 (0.0028) [2024-04-26 21:09:16,887][49750] Updated weights for policy 0, policy_version 271201 (0.0032) [2024-04-26 21:09:17,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 4443357184. Throughput: 0: 50809.3. Samples: 2196235000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:17,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 21:09:20,318][49750] Updated weights for policy 0, policy_version 271211 (0.0031) [2024-04-26 21:09:22,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.3, 300 sec: 50818.2). Total num frames: 4443619328. Throughput: 0: 50925.4. Samples: 2196401240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:22,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 21:09:23,302][49750] Updated weights for policy 0, policy_version 271221 (0.0031) [2024-04-26 21:09:26,615][49750] Updated weights for policy 0, policy_version 271231 (0.0036) [2024-04-26 21:09:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4443865088. Throughput: 0: 51018.7. Samples: 2196710820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:27,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 21:09:29,560][49750] Updated weights for policy 0, policy_version 271241 (0.0033) [2024-04-26 21:09:32,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4444078080. Throughput: 0: 51075.9. Samples: 2197015660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:32,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 21:09:33,426][49750] Updated weights for policy 0, policy_version 271251 (0.0026) [2024-04-26 21:09:35,923][49750] Updated weights for policy 0, policy_version 271261 (0.0030) [2024-04-26 21:09:37,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 4444356608. Throughput: 0: 50961.2. Samples: 2197157400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:37,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 21:09:39,715][49750] Updated weights for policy 0, policy_version 271271 (0.0025) [2024-04-26 21:09:42,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51609.6, 300 sec: 50762.6). Total num frames: 4444635136. Throughput: 0: 51285.2. Samples: 2197470180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:42,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 21:09:42,331][49750] Updated weights for policy 0, policy_version 271281 (0.0029) [2024-04-26 21:09:46,089][49750] Updated weights for policy 0, policy_version 271291 (0.0028) [2024-04-26 21:09:47,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4444864512. Throughput: 0: 51047.2. Samples: 2197771740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:09:47,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 21:09:48,732][49750] Updated weights for policy 0, policy_version 271301 (0.0029) [2024-04-26 21:09:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4445126656. Throughput: 0: 51233.3. Samples: 2197929140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:09:52,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 21:09:52,558][49750] Updated weights for policy 0, policy_version 271311 (0.0026) [2024-04-26 21:09:55,232][49750] Updated weights for policy 0, policy_version 271321 (0.0036) [2024-04-26 21:09:56,532][49728] Signal inference workers to stop experience collection... (32800 times) [2024-04-26 21:09:56,533][49728] Signal inference workers to resume experience collection... (32800 times) [2024-04-26 21:09:56,563][49750] InferenceWorker_p0-w0: stopping experience collection (32800 times) [2024-04-26 21:09:56,563][49750] InferenceWorker_p0-w0: resuming experience collection (32800 times) [2024-04-26 21:09:57,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4445372416. Throughput: 0: 51185.1. Samples: 2198228260. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:09:57,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 21:09:58,920][49750] Updated weights for policy 0, policy_version 271331 (0.0036) [2024-04-26 21:10:02,030][49750] Updated weights for policy 0, policy_version 271341 (0.0029) [2024-04-26 21:10:02,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4445650944. Throughput: 0: 51116.2. Samples: 2198535220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:02,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 21:10:05,171][49750] Updated weights for policy 0, policy_version 271351 (0.0029) [2024-04-26 21:10:07,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4445913088. Throughput: 0: 51037.0. Samples: 2198697900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:07,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 21:10:08,741][49750] Updated weights for policy 0, policy_version 271361 (0.0034) [2024-04-26 21:10:11,660][49750] Updated weights for policy 0, policy_version 271371 (0.0030) [2024-04-26 21:10:12,062][49517] Fps is (10 sec: 50789.9, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4446158848. Throughput: 0: 50922.6. Samples: 2199002340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:12,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 21:10:15,276][49750] Updated weights for policy 0, policy_version 271381 (0.0033) [2024-04-26 21:10:17,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4446404608. Throughput: 0: 50964.9. Samples: 2199309080. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:17,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 21:10:18,033][49750] Updated weights for policy 0, policy_version 271391 (0.0032) [2024-04-26 21:10:21,768][49750] Updated weights for policy 0, policy_version 271401 (0.0029) [2024-04-26 21:10:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 4446650368. Throughput: 0: 51099.3. Samples: 2199456860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:22,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 21:10:24,396][49750] Updated weights for policy 0, policy_version 271411 (0.0032) [2024-04-26 21:10:27,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4446945280. Throughput: 0: 50941.7. Samples: 2199762560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:27,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 21:10:28,108][49750] Updated weights for policy 0, policy_version 271421 (0.0034) [2024-04-26 21:10:30,833][49750] Updated weights for policy 0, policy_version 271431 (0.0034) [2024-04-26 21:10:32,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51882.6, 300 sec: 50929.3). Total num frames: 4447191040. Throughput: 0: 51086.5. Samples: 2200070640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:32,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:10:34,447][49750] Updated weights for policy 0, policy_version 271441 (0.0029) [2024-04-26 21:10:37,062][49517] Fps is (10 sec: 47514.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4447420416. Throughput: 0: 51081.5. Samples: 2200227800. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:37,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 21:10:37,596][49750] Updated weights for policy 0, policy_version 271451 (0.0032) [2024-04-26 21:10:40,847][49750] Updated weights for policy 0, policy_version 271461 (0.0033) [2024-04-26 21:10:42,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4447666176. Throughput: 0: 51179.8. Samples: 2200531360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:42,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 21:10:42,188][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000271465_4447682560.pth... [2024-04-26 21:10:42,238][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000270721_4435492864.pth [2024-04-26 21:10:44,148][49750] Updated weights for policy 0, policy_version 271471 (0.0028) [2024-04-26 21:10:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4447928320. Throughput: 0: 50933.8. Samples: 2200827240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:47,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 21:10:47,402][49750] Updated weights for policy 0, policy_version 271481 (0.0035) [2024-04-26 21:10:50,483][49750] Updated weights for policy 0, policy_version 271491 (0.0028) [2024-04-26 21:10:52,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4448206848. Throughput: 0: 51004.1. Samples: 2200993080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 21:10:53,858][49750] Updated weights for policy 0, policy_version 271501 (0.0029) [2024-04-26 21:10:56,981][49750] Updated weights for policy 0, policy_version 271511 (0.0030) [2024-04-26 21:10:57,063][49517] Fps is (10 sec: 50789.3, 60 sec: 51063.3, 300 sec: 50929.3). 
Total num frames: 4448436224. Throughput: 0: 50991.4. Samples: 2201296960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:10:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 21:11:00,141][49750] Updated weights for policy 0, policy_version 271521 (0.0029) [2024-04-26 21:11:02,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4448681984. Throughput: 0: 50928.1. Samples: 2201600840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 21:11:02,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 21:11:03,426][49750] Updated weights for policy 0, policy_version 271531 (0.0034) [2024-04-26 21:11:06,483][49750] Updated weights for policy 0, policy_version 271541 (0.0027) [2024-04-26 21:11:06,959][49728] Signal inference workers to stop experience collection... (32850 times) [2024-04-26 21:11:06,959][49728] Signal inference workers to resume experience collection... (32850 times) [2024-04-26 21:11:06,977][49750] InferenceWorker_p0-w0: stopping experience collection (32850 times) [2024-04-26 21:11:06,978][49750] InferenceWorker_p0-w0: resuming experience collection (32850 times) [2024-04-26 21:11:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4448944128. Throughput: 0: 50971.5. Samples: 2201750580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:07,063][49517] Avg episode reward: [(0, '0.683')] [2024-04-26 21:11:09,697][49750] Updated weights for policy 0, policy_version 271551 (0.0041) [2024-04-26 21:11:12,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4449222656. Throughput: 0: 50952.2. Samples: 2202055400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:12,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 21:11:13,085][49750] Updated weights for policy 0, policy_version 271561 (0.0032) [2024-04-26 21:11:16,020][49750] Updated weights for policy 0, policy_version 271571 (0.0031) [2024-04-26 21:11:17,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4449468416. Throughput: 0: 51009.3. Samples: 2202366060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:17,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 21:11:19,672][49750] Updated weights for policy 0, policy_version 271581 (0.0034) [2024-04-26 21:11:22,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4449730560. Throughput: 0: 51000.4. Samples: 2202522820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:22,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 21:11:22,489][49750] Updated weights for policy 0, policy_version 271591 (0.0034) [2024-04-26 21:11:26,160][49750] Updated weights for policy 0, policy_version 271601 (0.0028) [2024-04-26 21:11:27,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4449959936. Throughput: 0: 50869.0. Samples: 2202820460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:27,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 21:11:29,113][49750] Updated weights for policy 0, policy_version 271611 (0.0029) [2024-04-26 21:11:32,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4450205696. Throughput: 0: 51006.1. Samples: 2203122520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:32,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-26 21:11:32,479][49750] Updated weights for policy 0, policy_version 271621 (0.0029) [2024-04-26 21:11:35,465][49750] Updated weights for policy 0, policy_version 271631 (0.0033) [2024-04-26 21:11:37,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4450500608. Throughput: 0: 50795.6. Samples: 2203278880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:37,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:11:38,952][49750] Updated weights for policy 0, policy_version 271641 (0.0027) [2024-04-26 21:11:41,897][49750] Updated weights for policy 0, policy_version 271651 (0.0029) [2024-04-26 21:11:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4450729984. Throughput: 0: 50900.2. Samples: 2203587460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:42,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 21:11:45,343][49750] Updated weights for policy 0, policy_version 271661 (0.0031) [2024-04-26 21:11:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4450975744. Throughput: 0: 51032.9. Samples: 2203897320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:47,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 21:11:48,468][49750] Updated weights for policy 0, policy_version 271671 (0.0029) [2024-04-26 21:11:51,589][49750] Updated weights for policy 0, policy_version 271681 (0.0032) [2024-04-26 21:11:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4451254272. Throughput: 0: 50810.2. Samples: 2204037040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 21:11:53,656][49728] Signal inference workers to stop experience collection... (32900 times) [2024-04-26 21:11:53,702][49750] InferenceWorker_p0-w0: stopping experience collection (32900 times) [2024-04-26 21:11:53,729][49728] Signal inference workers to resume experience collection... (32900 times) [2024-04-26 21:11:53,730][49750] InferenceWorker_p0-w0: resuming experience collection (32900 times) [2024-04-26 21:11:54,734][49750] Updated weights for policy 0, policy_version 271691 (0.0031) [2024-04-26 21:11:57,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4451483648. Throughput: 0: 50890.5. Samples: 2204345480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:11:57,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 21:11:58,101][49750] Updated weights for policy 0, policy_version 271701 (0.0033) [2024-04-26 21:12:01,059][49750] Updated weights for policy 0, policy_version 271711 (0.0032) [2024-04-26 21:12:02,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4451762176. Throughput: 0: 50844.7. Samples: 2204654060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:12:02,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 21:12:04,432][49750] Updated weights for policy 0, policy_version 271721 (0.0028) [2024-04-26 21:12:07,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 4452007936. Throughput: 0: 50801.2. Samples: 2204808880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:12:07,072][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 21:12:07,473][49750] Updated weights for policy 0, policy_version 271731 (0.0032) [2024-04-26 21:12:10,794][49750] Updated weights for policy 0, policy_version 271741 (0.0031) [2024-04-26 21:12:12,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4452253696. Throughput: 0: 51087.6. Samples: 2205119400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-26 21:12:12,071][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 21:12:13,858][49750] Updated weights for policy 0, policy_version 271751 (0.0027) [2024-04-26 21:12:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 4452515840. Throughput: 0: 51159.1. Samples: 2205424680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:12:17,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 21:12:17,150][49750] Updated weights for policy 0, policy_version 271761 (0.0032) [2024-04-26 21:12:20,238][49750] Updated weights for policy 0, policy_version 271771 (0.0028) [2024-04-26 21:12:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4452777984. Throughput: 0: 50974.3. Samples: 2205572720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:12:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 21:12:23,512][49750] Updated weights for policy 0, policy_version 271781 (0.0033) [2024-04-26 21:12:26,652][49750] Updated weights for policy 0, policy_version 271791 (0.0033) [2024-04-26 21:12:27,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4453040128. Throughput: 0: 51000.4. Samples: 2205882480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:12:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 21:12:29,888][49750] Updated weights for policy 0, policy_version 271801 (0.0029) [2024-04-26 21:12:32,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4453285888. Throughput: 0: 50926.5. Samples: 2206189020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:12:32,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 21:12:33,009][49750] Updated weights for policy 0, policy_version 271811 (0.0032) [2024-04-26 21:12:36,300][49750] Updated weights for policy 0, policy_version 271821 (0.0033) [2024-04-26 21:12:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50984.8). Total num frames: 4453531648. Throughput: 0: 51091.5. Samples: 2206336160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:12:37,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 21:12:39,519][49750] Updated weights for policy 0, policy_version 271831 (0.0029) [2024-04-26 21:12:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4453793792. Throughput: 0: 50967.2. Samples: 2206639000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:12:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 21:12:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000271838_4453793792.pth... [2024-04-26 21:12:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000271091_4441554944.pth [2024-04-26 21:12:42,641][49750] Updated weights for policy 0, policy_version 271841 (0.0038) [2024-04-26 21:12:45,829][49750] Updated weights for policy 0, policy_version 271851 (0.0032) [2024-04-26 21:12:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4454055936. Throughput: 0: 50985.2. Samples: 2206948400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:12:47,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 21:12:49,131][49750] Updated weights for policy 0, policy_version 271861 (0.0033) [2024-04-26 21:12:52,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 51040.5). Total num frames: 4454318080. Throughput: 0: 50906.3. Samples: 2207099660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:12:52,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-26 21:12:52,294][49750] Updated weights for policy 0, policy_version 271871 (0.0030) [2024-04-26 21:12:55,708][49750] Updated weights for policy 0, policy_version 271881 (0.0029) [2024-04-26 21:12:57,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4454563840. Throughput: 0: 50809.2. Samples: 2207405820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:12:57,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 21:12:58,677][49750] Updated weights for policy 0, policy_version 271891 (0.0032) [2024-04-26 21:13:02,048][49750] Updated weights for policy 0, policy_version 271901 (0.0029) [2024-04-26 21:13:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4454825984. Throughput: 0: 50784.1. Samples: 2207709960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:13:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 21:13:05,057][49750] Updated weights for policy 0, policy_version 271911 (0.0036) [2024-04-26 21:13:07,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4455038976. Throughput: 0: 50939.9. Samples: 2207865020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:13:07,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 21:13:08,508][49750] Updated weights for policy 0, policy_version 271921 (0.0037) [2024-04-26 21:13:11,054][49728] Signal inference workers to stop experience collection... (32950 times) [2024-04-26 21:13:11,055][49728] Signal inference workers to resume experience collection... (32950 times) [2024-04-26 21:13:11,082][49750] InferenceWorker_p0-w0: stopping experience collection (32950 times) [2024-04-26 21:13:11,082][49750] InferenceWorker_p0-w0: resuming experience collection (32950 times) [2024-04-26 21:13:11,438][49750] Updated weights for policy 0, policy_version 271931 (0.0038) [2024-04-26 21:13:12,062][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4455317504. Throughput: 0: 50798.7. Samples: 2208168420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:13:12,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 21:13:14,975][49750] Updated weights for policy 0, policy_version 271941 (0.0032) [2024-04-26 21:13:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4455579648. Throughput: 0: 50870.3. Samples: 2208478180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:13:17,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 21:13:17,908][49750] Updated weights for policy 0, policy_version 271951 (0.0030) [2024-04-26 21:13:21,436][49750] Updated weights for policy 0, policy_version 271961 (0.0032) [2024-04-26 21:13:22,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.2, 300 sec: 51040.3). Total num frames: 4455825408. Throughput: 0: 50880.3. Samples: 2208625780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:13:22,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 21:13:24,321][49750] Updated weights for policy 0, policy_version 271971 (0.0034) [2024-04-26 21:13:27,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4456071168. Throughput: 0: 50827.9. Samples: 2208926260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 21:13:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 21:13:27,692][49750] Updated weights for policy 0, policy_version 271981 (0.0035) [2024-04-26 21:13:30,674][49750] Updated weights for policy 0, policy_version 271991 (0.0034) [2024-04-26 21:13:32,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4456333312. Throughput: 0: 50846.0. Samples: 2209236480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:13:32,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 21:13:34,081][49750] Updated weights for policy 0, policy_version 272001 (0.0038) [2024-04-26 21:13:36,994][49750] Updated weights for policy 0, policy_version 272011 (0.0030) [2024-04-26 21:13:37,063][49517] Fps is (10 sec: 55705.5, 60 sec: 51609.5, 300 sec: 51151.4). Total num frames: 4456628224. Throughput: 0: 50970.1. Samples: 2209393320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:13:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 21:13:40,561][49750] Updated weights for policy 0, policy_version 272021 (0.0029) [2024-04-26 21:13:42,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50790.5, 300 sec: 51040.3). Total num frames: 4456841216. Throughput: 0: 50915.8. Samples: 2209697020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:13:42,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 21:13:43,699][49750] Updated weights for policy 0, policy_version 272031 (0.0037) [2024-04-26 21:13:47,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4457103360. Throughput: 0: 51070.9. Samples: 2210008160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:13:47,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 21:13:47,181][49750] Updated weights for policy 0, policy_version 272041 (0.0029) [2024-04-26 21:13:50,505][49750] Updated weights for policy 0, policy_version 272051 (0.0037) [2024-04-26 21:13:52,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 51040.3). Total num frames: 4457365504. Throughput: 0: 50764.0. Samples: 2210149400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:13:52,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 21:13:53,917][49750] Updated weights for policy 0, policy_version 272061 (0.0031) [2024-04-26 21:13:56,848][49750] Updated weights for policy 0, policy_version 272071 (0.0033) [2024-04-26 21:13:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 4457611264. Throughput: 0: 50793.3. Samples: 2210454120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:13:57,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 21:14:00,353][49750] Updated weights for policy 0, policy_version 272081 (0.0030) [2024-04-26 21:14:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 51040.3). Total num frames: 4457873408. Throughput: 0: 50783.2. Samples: 2210763420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:14:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 21:14:03,378][49750] Updated weights for policy 0, policy_version 272091 (0.0029) [2024-04-26 21:14:06,735][49750] Updated weights for policy 0, policy_version 272101 (0.0038) [2024-04-26 21:14:07,062][49517] Fps is (10 sec: 49152.6, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 4458102784. Throughput: 0: 51048.3. Samples: 2210922940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:14:07,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 21:14:09,706][49750] Updated weights for policy 0, policy_version 272111 (0.0028) [2024-04-26 21:14:12,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4458364928. Throughput: 0: 51054.3. Samples: 2211223700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:14:12,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 21:14:12,124][49728] Signal inference workers to stop experience collection... (33000 times) [2024-04-26 21:14:12,124][49728] Signal inference workers to resume experience collection... (33000 times) [2024-04-26 21:14:12,139][49750] InferenceWorker_p0-w0: stopping experience collection (33000 times) [2024-04-26 21:14:12,139][49750] InferenceWorker_p0-w0: resuming experience collection (33000 times) [2024-04-26 21:14:13,208][49750] Updated weights for policy 0, policy_version 272121 (0.0029) [2024-04-26 21:14:16,212][49750] Updated weights for policy 0, policy_version 272131 (0.0025) [2024-04-26 21:14:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4458610688. Throughput: 0: 50812.2. Samples: 2211523020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:14:17,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 21:14:19,705][49750] Updated weights for policy 0, policy_version 272141 (0.0028) [2024-04-26 21:14:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4458889216. Throughput: 0: 50845.0. Samples: 2211681340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:14:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 21:14:22,697][49750] Updated weights for policy 0, policy_version 272151 (0.0034) [2024-04-26 21:14:26,287][49750] Updated weights for policy 0, policy_version 272161 (0.0029) [2024-04-26 21:14:27,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4459118592. Throughput: 0: 50737.6. Samples: 2211980220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:14:27,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 21:14:29,120][49750] Updated weights for policy 0, policy_version 272171 (0.0029) [2024-04-26 21:14:32,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4459380736. Throughput: 0: 50611.6. Samples: 2212285680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:14:32,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 21:14:32,726][49750] Updated weights for policy 0, policy_version 272181 (0.0033) [2024-04-26 21:14:35,462][49750] Updated weights for policy 0, policy_version 272191 (0.0039) [2024-04-26 21:14:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 49971.3, 300 sec: 50818.2). Total num frames: 4459626496. Throughput: 0: 50691.1. Samples: 2212430500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-04-26 21:14:37,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 21:14:39,217][49750] Updated weights for policy 0, policy_version 272201 (0.0032) [2024-04-26 21:14:41,880][49750] Updated weights for policy 0, policy_version 272211 (0.0029) [2024-04-26 21:14:42,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50984.7). Total num frames: 4459905024. Throughput: 0: 50643.4. Samples: 2212733080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:14:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 21:14:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000272211_4459905024.pth... [2024-04-26 21:14:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000271465_4447682560.pth [2024-04-26 21:14:45,706][49750] Updated weights for policy 0, policy_version 272221 (0.0031) [2024-04-26 21:14:47,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.6, 300 sec: 50984.8). Total num frames: 4460167168. Throughput: 0: 50645.3. Samples: 2213042460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:14:47,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 21:14:48,422][49750] Updated weights for policy 0, policy_version 272231 (0.0036) [2024-04-26 21:14:52,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 4460380160. Throughput: 0: 50556.3. Samples: 2213197980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:14:52,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 21:14:52,196][49750] Updated weights for policy 0, policy_version 272241 (0.0034) [2024-04-26 21:14:54,528][49728] Signal inference workers to stop experience collection... (33050 times) [2024-04-26 21:14:54,528][49728] Signal inference workers to resume experience collection... 
(33050 times) [2024-04-26 21:14:54,539][49750] InferenceWorker_p0-w0: stopping experience collection (33050 times) [2024-04-26 21:14:54,540][49750] InferenceWorker_p0-w0: resuming experience collection (33050 times) [2024-04-26 21:14:54,897][49750] Updated weights for policy 0, policy_version 272251 (0.0026) [2024-04-26 21:14:57,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4460658688. Throughput: 0: 50709.4. Samples: 2213505620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:14:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 21:14:58,587][49750] Updated weights for policy 0, policy_version 272261 (0.0032) [2024-04-26 21:15:01,635][49750] Updated weights for policy 0, policy_version 272271 (0.0031) [2024-04-26 21:15:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4460904448. Throughput: 0: 50956.0. Samples: 2213816040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 21:15:05,004][49750] Updated weights for policy 0, policy_version 272281 (0.0031) [2024-04-26 21:15:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4461166592. Throughput: 0: 50848.0. Samples: 2213969500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 21:15:08,010][49750] Updated weights for policy 0, policy_version 272291 (0.0029) [2024-04-26 21:15:11,399][49750] Updated weights for policy 0, policy_version 272301 (0.0035) [2024-04-26 21:15:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4461428736. Throughput: 0: 50932.6. Samples: 2214272180. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:12,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 21:15:14,306][49750] Updated weights for policy 0, policy_version 272311 (0.0029) [2024-04-26 21:15:17,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 4461658112. Throughput: 0: 50859.8. Samples: 2214574380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 21:15:17,722][49750] Updated weights for policy 0, policy_version 272321 (0.0030) [2024-04-26 21:15:20,894][49750] Updated weights for policy 0, policy_version 272331 (0.0032) [2024-04-26 21:15:22,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4461920256. Throughput: 0: 50941.7. Samples: 2214722880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:22,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 21:15:24,077][49750] Updated weights for policy 0, policy_version 272341 (0.0033) [2024-04-26 21:15:27,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4462182400. Throughput: 0: 50953.1. Samples: 2215025960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:27,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 21:15:27,342][49750] Updated weights for policy 0, policy_version 272351 (0.0032) [2024-04-26 21:15:30,519][49750] Updated weights for policy 0, policy_version 272361 (0.0028) [2024-04-26 21:15:32,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4462460928. Throughput: 0: 50805.7. Samples: 2215328720. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:32,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 21:15:33,649][49750] Updated weights for policy 0, policy_version 272371 (0.0033) [2024-04-26 21:15:37,054][49750] Updated weights for policy 0, policy_version 272381 (0.0035) [2024-04-26 21:15:37,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 4462690304. Throughput: 0: 51050.2. Samples: 2215495240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:37,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 21:15:40,102][49750] Updated weights for policy 0, policy_version 272391 (0.0029) [2024-04-26 21:15:42,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4462936064. Throughput: 0: 50807.0. Samples: 2215791940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:42,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 21:15:43,289][49750] Updated weights for policy 0, policy_version 272401 (0.0035) [2024-04-26 21:15:46,378][49750] Updated weights for policy 0, policy_version 272411 (0.0035) [2024-04-26 21:15:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 4463198208. Throughput: 0: 50672.8. Samples: 2216096320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-26 21:15:47,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 21:15:49,724][49750] Updated weights for policy 0, policy_version 272421 (0.0032) [2024-04-26 21:15:50,016][49728] Signal inference workers to stop experience collection... (33100 times) [2024-04-26 21:15:50,066][49750] InferenceWorker_p0-w0: stopping experience collection (33100 times) [2024-04-26 21:15:50,082][49728] Signal inference workers to resume experience collection... 
(33100 times) [2024-04-26 21:15:50,086][49750] InferenceWorker_p0-w0: resuming experience collection (33100 times) [2024-04-26 21:15:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.6, 300 sec: 50873.8). Total num frames: 4463443968. Throughput: 0: 50830.3. Samples: 2216256860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:15:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 21:15:52,739][49750] Updated weights for policy 0, policy_version 272431 (0.0028) [2024-04-26 21:15:56,077][49750] Updated weights for policy 0, policy_version 272441 (0.0029) [2024-04-26 21:15:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.3, 300 sec: 50984.8). Total num frames: 4463722496. Throughput: 0: 51008.6. Samples: 2216567580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:15:57,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 21:15:59,108][49750] Updated weights for policy 0, policy_version 272451 (0.0037) [2024-04-26 21:16:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4463951872. Throughput: 0: 51009.6. Samples: 2216869800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:02,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 21:16:02,576][49750] Updated weights for policy 0, policy_version 272461 (0.0035) [2024-04-26 21:16:05,656][49750] Updated weights for policy 0, policy_version 272471 (0.0032) [2024-04-26 21:16:07,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4464197632. Throughput: 0: 50975.8. Samples: 2217016800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 21:16:08,891][49750] Updated weights for policy 0, policy_version 272481 (0.0032) [2024-04-26 21:16:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4464459776. Throughput: 0: 50935.2. Samples: 2217318040. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:12,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 21:16:12,212][49750] Updated weights for policy 0, policy_version 272491 (0.0039) [2024-04-26 21:16:15,383][49750] Updated weights for policy 0, policy_version 272501 (0.0031) [2024-04-26 21:16:17,063][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4464721920. Throughput: 0: 50931.9. Samples: 2217620660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:17,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 21:16:18,739][49750] Updated weights for policy 0, policy_version 272511 (0.0026) [2024-04-26 21:16:21,667][49750] Updated weights for policy 0, policy_version 272521 (0.0030) [2024-04-26 21:16:22,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 4465000448. Throughput: 0: 50922.2. Samples: 2217786740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:22,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 21:16:25,117][49750] Updated weights for policy 0, policy_version 272531 (0.0035) [2024-04-26 21:16:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4465213440. Throughput: 0: 50911.8. Samples: 2218082960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 21:16:28,117][49750] Updated weights for policy 0, policy_version 272541 (0.0037) [2024-04-26 21:16:31,441][49750] Updated weights for policy 0, policy_version 272551 (0.0031) [2024-04-26 21:16:32,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4465475584. Throughput: 0: 50767.0. Samples: 2218380840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:32,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 21:16:34,444][49750] Updated weights for policy 0, policy_version 272561 (0.0026) [2024-04-26 21:16:37,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4465737728. Throughput: 0: 50510.1. Samples: 2218529820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:37,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 21:16:38,100][49750] Updated weights for policy 0, policy_version 272571 (0.0031) [2024-04-26 21:16:40,914][49750] Updated weights for policy 0, policy_version 272581 (0.0033) [2024-04-26 21:16:42,063][49517] Fps is (10 sec: 55705.6, 60 sec: 51609.6, 300 sec: 51040.3). Total num frames: 4466032640. Throughput: 0: 50619.1. Samples: 2218845440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:42,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 21:16:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000272585_4466032640.pth... [2024-04-26 21:16:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000271838_4453793792.pth [2024-04-26 21:16:44,412][49750] Updated weights for policy 0, policy_version 272591 (0.0033) [2024-04-26 21:16:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4466262016. Throughput: 0: 50823.3. Samples: 2219156860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 21:16:47,416][49750] Updated weights for policy 0, policy_version 272601 (0.0034) [2024-04-26 21:16:49,423][49728] Signal inference workers to stop experience collection... 
(33150 times) [2024-04-26 21:16:49,450][49750] InferenceWorker_p0-w0: stopping experience collection (33150 times) [2024-04-26 21:16:49,486][49728] Signal inference workers to resume experience collection... (33150 times) [2024-04-26 21:16:49,486][49750] InferenceWorker_p0-w0: resuming experience collection (33150 times) [2024-04-26 21:16:50,806][49750] Updated weights for policy 0, policy_version 272611 (0.0028) [2024-04-26 21:16:52,062][49517] Fps is (10 sec: 44237.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4466475008. Throughput: 0: 50867.3. Samples: 2219305820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:52,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 21:16:53,806][49750] Updated weights for policy 0, policy_version 272621 (0.0024) [2024-04-26 21:16:57,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4466769920. Throughput: 0: 50934.2. Samples: 2219610080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:16:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 21:16:57,161][49750] Updated weights for policy 0, policy_version 272631 (0.0029) [2024-04-26 21:17:00,160][49750] Updated weights for policy 0, policy_version 272641 (0.0027) [2024-04-26 21:17:02,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4467015680. Throughput: 0: 50807.1. Samples: 2219906980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 21:17:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 21:17:03,681][49750] Updated weights for policy 0, policy_version 272651 (0.0030) [2024-04-26 21:17:06,506][49750] Updated weights for policy 0, policy_version 272661 (0.0033) [2024-04-26 21:17:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51609.8, 300 sec: 50984.8). Total num frames: 4467294208. Throughput: 0: 50843.3. Samples: 2220074680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 21:17:10,544][49750] Updated weights for policy 0, policy_version 272671 (0.0031) [2024-04-26 21:17:12,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.2, 300 sec: 50873.7). Total num frames: 4467523584. Throughput: 0: 50963.7. Samples: 2220376340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:12,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 21:17:12,965][49750] Updated weights for policy 0, policy_version 272681 (0.0028) [2024-04-26 21:17:17,063][49517] Fps is (10 sec: 45874.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4467752960. Throughput: 0: 50969.3. Samples: 2220674460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:17,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 21:17:17,384][49750] Updated weights for policy 0, policy_version 272691 (0.0031) [2024-04-26 21:17:19,438][49750] Updated weights for policy 0, policy_version 272701 (0.0034) [2024-04-26 21:17:22,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4468031488. Throughput: 0: 50816.5. Samples: 2220816560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 21:17:23,806][49750] Updated weights for policy 0, policy_version 272711 (0.0033) [2024-04-26 21:17:25,928][49750] Updated weights for policy 0, policy_version 272721 (0.0032) [2024-04-26 21:17:27,062][49517] Fps is (10 sec: 55706.6, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4468310016. Throughput: 0: 50630.0. Samples: 2221123780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:27,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 21:17:30,567][49750] Updated weights for policy 0, policy_version 272731 (0.0029) [2024-04-26 21:17:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4468555776. Throughput: 0: 50644.6. Samples: 2221435860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:32,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 21:17:32,340][49750] Updated weights for policy 0, policy_version 272741 (0.0030) [2024-04-26 21:17:36,957][49750] Updated weights for policy 0, policy_version 272751 (0.0038) [2024-04-26 21:17:37,062][49517] Fps is (10 sec: 44236.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4468752384. Throughput: 0: 50664.8. Samples: 2221585740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 21:17:38,768][49750] Updated weights for policy 0, policy_version 272761 (0.0031) [2024-04-26 21:17:42,063][49517] Fps is (10 sec: 47513.1, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4469030912. Throughput: 0: 50613.1. Samples: 2221887680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:42,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 21:17:43,079][49728] Signal inference workers to stop experience collection... (33200 times) [2024-04-26 21:17:43,106][49750] InferenceWorker_p0-w0: stopping experience collection (33200 times) [2024-04-26 21:17:43,187][49728] Signal inference workers to resume experience collection... 
(33200 times) [2024-04-26 21:17:43,187][49750] InferenceWorker_p0-w0: resuming experience collection (33200 times) [2024-04-26 21:17:43,314][49750] Updated weights for policy 0, policy_version 272771 (0.0026) [2024-04-26 21:17:45,455][49750] Updated weights for policy 0, policy_version 272781 (0.0029) [2024-04-26 21:17:47,063][49517] Fps is (10 sec: 55705.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4469309440. Throughput: 0: 50686.2. Samples: 2222187860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 21:17:49,737][49750] Updated weights for policy 0, policy_version 272791 (0.0030) [2024-04-26 21:17:51,968][49750] Updated weights for policy 0, policy_version 272801 (0.0030) [2024-04-26 21:17:52,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.4, 300 sec: 50873.7). Total num frames: 4469571584. Throughput: 0: 50813.6. Samples: 2222361300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:52,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 21:17:56,186][49750] Updated weights for policy 0, policy_version 272811 (0.0034) [2024-04-26 21:17:57,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4469800960. Throughput: 0: 50709.9. Samples: 2222658280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:17:57,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:17:58,402][49750] Updated weights for policy 0, policy_version 272821 (0.0032) [2024-04-26 21:18:02,063][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 4470030336. Throughput: 0: 50754.7. Samples: 2222958420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:18:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 21:18:02,571][49750] Updated weights for policy 0, policy_version 272831 (0.0034) [2024-04-26 21:18:04,754][49750] Updated weights for policy 0, policy_version 272841 (0.0029) [2024-04-26 21:18:07,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.1, 300 sec: 50818.2). Total num frames: 4470308864. Throughput: 0: 50807.0. Samples: 2223102880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:18:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 21:18:08,958][49750] Updated weights for policy 0, policy_version 272851 (0.0034) [2024-04-26 21:18:11,245][49750] Updated weights for policy 0, policy_version 272861 (0.0033) [2024-04-26 21:18:12,063][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4470571008. Throughput: 0: 50740.7. Samples: 2223407120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 21:18:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 21:18:15,433][49750] Updated weights for policy 0, policy_version 272871 (0.0030) [2024-04-26 21:18:17,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4470849536. Throughput: 0: 50665.3. Samples: 2223715800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:18:17,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 21:18:17,737][49750] Updated weights for policy 0, policy_version 272881 (0.0031) [2024-04-26 21:18:21,876][49750] Updated weights for policy 0, policy_version 272891 (0.0031) [2024-04-26 21:18:22,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4471046144. Throughput: 0: 50781.7. Samples: 2223870920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:18:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 21:18:24,166][49750] Updated weights for policy 0, policy_version 272901 (0.0030) [2024-04-26 21:18:27,062][49517] Fps is (10 sec: 45875.7, 60 sec: 49971.2, 300 sec: 50762.7). Total num frames: 4471308288. Throughput: 0: 50753.9. Samples: 2224171600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:18:27,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:18:28,236][49750] Updated weights for policy 0, policy_version 272911 (0.0029) [2024-04-26 21:18:30,558][49750] Updated weights for policy 0, policy_version 272921 (0.0029) [2024-04-26 21:18:32,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4471586816. Throughput: 0: 50870.3. Samples: 2224477020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:18:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 21:18:34,718][49750] Updated weights for policy 0, policy_version 272931 (0.0025) [2024-04-26 21:18:37,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4471848960. Throughput: 0: 50768.1. Samples: 2224645860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:18:37,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 21:18:37,214][49750] Updated weights for policy 0, policy_version 272941 (0.0033) [2024-04-26 21:18:41,045][49750] Updated weights for policy 0, policy_version 272951 (0.0030) [2024-04-26 21:18:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 4472094720. Throughput: 0: 51007.7. Samples: 2224953620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:18:42,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 21:18:42,117][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000272956_4472111104.pth... 
[2024-04-26 21:18:42,179][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000272211_4459905024.pth [2024-04-26 21:18:43,679][49750] Updated weights for policy 0, policy_version 272961 (0.0026) [2024-04-26 21:18:47,063][49517] Fps is (10 sec: 45874.7, 60 sec: 49971.1, 300 sec: 50651.5). Total num frames: 4472307712. Throughput: 0: 50913.7. Samples: 2225249540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:18:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 21:18:47,540][49750] Updated weights for policy 0, policy_version 272971 (0.0035) [2024-04-26 21:18:48,223][49728] Signal inference workers to stop experience collection... (33250 times) [2024-04-26 21:18:48,224][49728] Signal inference workers to resume experience collection... (33250 times) [2024-04-26 21:18:48,242][49750] InferenceWorker_p0-w0: stopping experience collection (33250 times) [2024-04-26 21:18:48,242][49750] InferenceWorker_p0-w0: resuming experience collection (33250 times) [2024-04-26 21:18:50,064][49750] Updated weights for policy 0, policy_version 272981 (0.0034) [2024-04-26 21:18:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4472602624. Throughput: 0: 50812.6. Samples: 2225389440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:18:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 21:18:53,965][49750] Updated weights for policy 0, policy_version 272991 (0.0031) [2024-04-26 21:18:56,745][49750] Updated weights for policy 0, policy_version 273001 (0.0036) [2024-04-26 21:18:57,062][49517] Fps is (10 sec: 55707.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4472864768. Throughput: 0: 50902.0. Samples: 2225697700. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:18:57,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 21:19:00,507][49750] Updated weights for policy 0, policy_version 273011 (0.0030) [2024-04-26 21:19:02,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4473110528. Throughput: 0: 50801.3. Samples: 2226001860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:02,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 21:19:03,248][49750] Updated weights for policy 0, policy_version 273021 (0.0028) [2024-04-26 21:19:06,774][49750] Updated weights for policy 0, policy_version 273031 (0.0029) [2024-04-26 21:19:07,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4473356288. Throughput: 0: 50832.0. Samples: 2226158360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:07,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 21:19:09,604][49750] Updated weights for policy 0, policy_version 273041 (0.0032) [2024-04-26 21:19:12,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4473585664. Throughput: 0: 50932.3. Samples: 2226463560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 21:19:13,331][49750] Updated weights for policy 0, policy_version 273051 (0.0033) [2024-04-26 21:19:16,065][49750] Updated weights for policy 0, policy_version 273061 (0.0027) [2024-04-26 21:19:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4473864192. Throughput: 0: 50827.1. Samples: 2226764240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:17,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 21:19:19,765][49750] Updated weights for policy 0, policy_version 273071 (0.0033) [2024-04-26 21:19:22,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4474126336. Throughput: 0: 50635.1. Samples: 2226924440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:22,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 21:19:22,473][49750] Updated weights for policy 0, policy_version 273081 (0.0031) [2024-04-26 21:19:26,276][49750] Updated weights for policy 0, policy_version 273091 (0.0033) [2024-04-26 21:19:27,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4474388480. Throughput: 0: 50518.7. Samples: 2227226960. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:27,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 21:19:28,971][49750] Updated weights for policy 0, policy_version 273101 (0.0026) [2024-04-26 21:19:32,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4474617856. Throughput: 0: 50774.0. Samples: 2227534360. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:32,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 21:19:32,699][49750] Updated weights for policy 0, policy_version 273111 (0.0032) [2024-04-26 21:19:32,903][49728] Signal inference workers to stop experience collection... (33300 times) [2024-04-26 21:19:32,946][49750] InferenceWorker_p0-w0: stopping experience collection (33300 times) [2024-04-26 21:19:33,008][49728] Signal inference workers to resume experience collection... 
(33300 times) [2024-04-26 21:19:33,008][49750] InferenceWorker_p0-w0: resuming experience collection (33300 times) [2024-04-26 21:19:35,351][49750] Updated weights for policy 0, policy_version 273121 (0.0029) [2024-04-26 21:19:37,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4474880000. Throughput: 0: 50673.7. Samples: 2227669760. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:37,063][49517] Avg episode reward: [(0, '0.695')] [2024-04-26 21:19:39,010][49750] Updated weights for policy 0, policy_version 273131 (0.0035) [2024-04-26 21:19:41,761][49750] Updated weights for policy 0, policy_version 273141 (0.0030) [2024-04-26 21:19:42,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4475158528. Throughput: 0: 50695.1. Samples: 2227978980. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:42,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 21:19:45,584][49750] Updated weights for policy 0, policy_version 273151 (0.0029) [2024-04-26 21:19:47,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51609.8, 300 sec: 50929.3). Total num frames: 4475404288. Throughput: 0: 50756.7. Samples: 2228285900. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:47,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 21:19:48,192][49750] Updated weights for policy 0, policy_version 273161 (0.0034) [2024-04-26 21:19:51,965][49750] Updated weights for policy 0, policy_version 273171 (0.0032) [2024-04-26 21:19:52,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4475633664. Throughput: 0: 50710.8. Samples: 2228440340. 
Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 21:19:54,705][49750] Updated weights for policy 0, policy_version 273181 (0.0026) [2024-04-26 21:19:57,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4475879424. Throughput: 0: 50701.5. Samples: 2228745120. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:19:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 21:19:58,332][49750] Updated weights for policy 0, policy_version 273191 (0.0030) [2024-04-26 21:20:01,138][49750] Updated weights for policy 0, policy_version 273201 (0.0030) [2024-04-26 21:20:02,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4476141568. Throughput: 0: 50811.9. Samples: 2229050780. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:20:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 21:20:04,779][49750] Updated weights for policy 0, policy_version 273211 (0.0046) [2024-04-26 21:20:07,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4476420096. Throughput: 0: 50827.6. Samples: 2229211680. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:20:07,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 21:20:07,691][49750] Updated weights for policy 0, policy_version 273221 (0.0031) [2024-04-26 21:20:11,177][49750] Updated weights for policy 0, policy_version 273231 (0.0028) [2024-04-26 21:20:12,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4476665856. Throughput: 0: 50867.8. Samples: 2229516020. 
Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:20:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 21:20:14,205][49750] Updated weights for policy 0, policy_version 273241 (0.0031) [2024-04-26 21:20:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4476911616. Throughput: 0: 50725.7. Samples: 2229817020. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:20:17,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 21:20:17,520][49750] Updated weights for policy 0, policy_version 273251 (0.0033) [2024-04-26 21:20:20,559][49750] Updated weights for policy 0, policy_version 273261 (0.0038) [2024-04-26 21:20:22,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4477157376. Throughput: 0: 50975.9. Samples: 2229963680. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:20:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 21:20:23,903][49750] Updated weights for policy 0, policy_version 273271 (0.0031) [2024-04-26 21:20:25,631][49728] Signal inference workers to stop experience collection... (33350 times) [2024-04-26 21:20:25,632][49728] Signal inference workers to resume experience collection... (33350 times) [2024-04-26 21:20:25,665][49750] InferenceWorker_p0-w0: stopping experience collection (33350 times) [2024-04-26 21:20:25,665][49750] InferenceWorker_p0-w0: resuming experience collection (33350 times) [2024-04-26 21:20:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4477419520. Throughput: 0: 50934.8. Samples: 2230271040. 
Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:20:27,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 21:20:27,165][49750] Updated weights for policy 0, policy_version 273281 (0.0030) [2024-04-26 21:20:30,469][49750] Updated weights for policy 0, policy_version 273291 (0.0031) [2024-04-26 21:20:32,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4477698048. Throughput: 0: 50895.4. Samples: 2230576200. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:20:32,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 21:20:33,640][49750] Updated weights for policy 0, policy_version 273301 (0.0029) [2024-04-26 21:20:36,959][49750] Updated weights for policy 0, policy_version 273311 (0.0036) [2024-04-26 21:20:37,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4477927424. Throughput: 0: 50960.2. Samples: 2230733560. Policy #0 lag: (min: 2.0, avg: 9.5, max: 21.0) [2024-04-26 21:20:37,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 21:20:40,035][49750] Updated weights for policy 0, policy_version 273321 (0.0029) [2024-04-26 21:20:42,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4478173184. Throughput: 0: 51032.8. Samples: 2231041600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:20:42,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 21:20:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000273326_4478173184.pth... [2024-04-26 21:20:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000272585_4466032640.pth [2024-04-26 21:20:43,349][49750] Updated weights for policy 0, policy_version 273331 (0.0031) [2024-04-26 21:20:46,387][49750] Updated weights for policy 0, policy_version 273341 (0.0034) [2024-04-26 21:20:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.2, 300 sec: 50762.6). 
Total num frames: 4478418944. Throughput: 0: 50817.5. Samples: 2231337560. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:20:47,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 21:20:49,726][49750] Updated weights for policy 0, policy_version 273351 (0.0028) [2024-04-26 21:20:52,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4478713856. Throughput: 0: 50782.7. Samples: 2231496900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:20:52,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 21:20:52,840][49750] Updated weights for policy 0, policy_version 273361 (0.0029) [2024-04-26 21:20:56,133][49750] Updated weights for policy 0, policy_version 273371 (0.0029) [2024-04-26 21:20:57,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4478959616. Throughput: 0: 50832.6. Samples: 2231803480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:20:57,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 21:20:59,547][49750] Updated weights for policy 0, policy_version 273381 (0.0028) [2024-04-26 21:21:02,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4479188992. Throughput: 0: 50842.8. Samples: 2232104940. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:02,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 21:21:02,522][49750] Updated weights for policy 0, policy_version 273391 (0.0041) [2024-04-26 21:21:06,181][49750] Updated weights for policy 0, policy_version 273401 (0.0030) [2024-04-26 21:21:07,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4479467520. Throughput: 0: 50912.0. Samples: 2232254720. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:07,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 21:21:08,856][49750] Updated weights for policy 0, policy_version 273411 (0.0033) [2024-04-26 21:21:12,063][49517] Fps is (10 sec: 50789.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4479696896. Throughput: 0: 50862.8. Samples: 2232559880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:12,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 21:21:12,663][49750] Updated weights for policy 0, policy_version 273421 (0.0030) [2024-04-26 21:21:15,273][49750] Updated weights for policy 0, policy_version 273431 (0.0036) [2024-04-26 21:21:17,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4479975424. Throughput: 0: 50876.9. Samples: 2232865660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:17,063][49517] Avg episode reward: [(0, '0.469')] [2024-04-26 21:21:19,142][49750] Updated weights for policy 0, policy_version 273441 (0.0033) [2024-04-26 21:21:21,807][49750] Updated weights for policy 0, policy_version 273451 (0.0033) [2024-04-26 21:21:22,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4480237568. Throughput: 0: 50901.4. Samples: 2233024120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:22,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 21:21:25,443][49750] Updated weights for policy 0, policy_version 273461 (0.0031) [2024-04-26 21:21:27,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 4480466944. Throughput: 0: 50759.5. Samples: 2233325780. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:27,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 21:21:28,261][49750] Updated weights for policy 0, policy_version 273471 (0.0029) [2024-04-26 21:21:31,849][49750] Updated weights for policy 0, policy_version 273481 (0.0028) [2024-04-26 21:21:32,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4480712704. Throughput: 0: 51065.4. Samples: 2233635500. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:32,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 21:21:34,607][49750] Updated weights for policy 0, policy_version 273491 (0.0031) [2024-04-26 21:21:35,822][49728] Signal inference workers to stop experience collection... (33400 times) [2024-04-26 21:21:35,823][49728] Signal inference workers to resume experience collection... (33400 times) [2024-04-26 21:21:35,844][49750] InferenceWorker_p0-w0: stopping experience collection (33400 times) [2024-04-26 21:21:35,844][49750] InferenceWorker_p0-w0: resuming experience collection (33400 times) [2024-04-26 21:21:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 4480974848. Throughput: 0: 50804.8. Samples: 2233783120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 21:21:38,338][49750] Updated weights for policy 0, policy_version 273501 (0.0038) [2024-04-26 21:21:41,072][49750] Updated weights for policy 0, policy_version 273511 (0.0037) [2024-04-26 21:21:42,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4481253376. Throughput: 0: 50915.5. Samples: 2234094680. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:42,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 21:21:44,721][49750] Updated weights for policy 0, policy_version 273521 (0.0036) [2024-04-26 21:21:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4481499136. Throughput: 0: 50882.6. Samples: 2234394660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-04-26 21:21:47,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 21:21:47,637][49750] Updated weights for policy 0, policy_version 273531 (0.0035) [2024-04-26 21:21:51,102][49750] Updated weights for policy 0, policy_version 273541 (0.0030) [2024-04-26 21:21:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4481761280. Throughput: 0: 50835.3. Samples: 2234542300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:21:52,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 21:21:54,057][49750] Updated weights for policy 0, policy_version 273551 (0.0034) [2024-04-26 21:21:57,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4481974272. Throughput: 0: 50812.6. Samples: 2234846440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:21:57,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 21:21:57,608][49750] Updated weights for policy 0, policy_version 273561 (0.0030) [2024-04-26 21:22:00,458][49750] Updated weights for policy 0, policy_version 273571 (0.0030) [2024-04-26 21:22:02,063][49517] Fps is (10 sec: 49150.9, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4482252800. Throughput: 0: 50685.5. Samples: 2235146520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:02,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 21:22:04,139][49750] Updated weights for policy 0, policy_version 273581 (0.0033) [2024-04-26 21:22:06,790][49750] Updated weights for policy 0, policy_version 273591 (0.0031) [2024-04-26 21:22:07,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4482514944. Throughput: 0: 50668.8. Samples: 2235304220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:22:10,630][49750] Updated weights for policy 0, policy_version 273601 (0.0028) [2024-04-26 21:22:12,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4482744320. Throughput: 0: 50810.6. Samples: 2235612260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 21:22:13,278][49750] Updated weights for policy 0, policy_version 273611 (0.0033) [2024-04-26 21:22:17,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4483006464. Throughput: 0: 50692.7. Samples: 2235916680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:17,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 21:22:17,065][49750] Updated weights for policy 0, policy_version 273621 (0.0035) [2024-04-26 21:22:19,791][49750] Updated weights for policy 0, policy_version 273631 (0.0034) [2024-04-26 21:22:22,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4483252224. Throughput: 0: 50557.3. Samples: 2236058200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:22,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 21:22:23,465][49750] Updated weights for policy 0, policy_version 273641 (0.0028) [2024-04-26 21:22:26,210][49750] Updated weights for policy 0, policy_version 273651 (0.0033) [2024-04-26 21:22:27,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4483530752. Throughput: 0: 50504.9. Samples: 2236367400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:27,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 21:22:29,923][49750] Updated weights for policy 0, policy_version 273661 (0.0033) [2024-04-26 21:22:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4483760128. Throughput: 0: 50661.2. Samples: 2236674420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:32,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 21:22:32,671][49750] Updated weights for policy 0, policy_version 273671 (0.0033) [2024-04-26 21:22:36,315][49750] Updated weights for policy 0, policy_version 273681 (0.0030) [2024-04-26 21:22:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4484022272. Throughput: 0: 50565.2. Samples: 2236817740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:37,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 21:22:39,163][49750] Updated weights for policy 0, policy_version 273691 (0.0030) [2024-04-26 21:22:42,063][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.1, 300 sec: 50651.5). Total num frames: 4484251648. Throughput: 0: 50658.6. Samples: 2237126080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 21:22:42,207][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000273699_4484284416.pth... 
[2024-04-26 21:22:42,256][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000272956_4472111104.pth [2024-04-26 21:22:42,644][49750] Updated weights for policy 0, policy_version 273701 (0.0028) [2024-04-26 21:22:45,668][49750] Updated weights for policy 0, policy_version 273711 (0.0035) [2024-04-26 21:22:47,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4484530176. Throughput: 0: 50716.0. Samples: 2237428740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:47,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 21:22:49,048][49750] Updated weights for policy 0, policy_version 273721 (0.0035) [2024-04-26 21:22:50,475][49728] Signal inference workers to stop experience collection... (33450 times) [2024-04-26 21:22:50,476][49728] Signal inference workers to resume experience collection... (33450 times) [2024-04-26 21:22:50,490][49750] InferenceWorker_p0-w0: stopping experience collection (33450 times) [2024-04-26 21:22:50,491][49750] InferenceWorker_p0-w0: resuming experience collection (33450 times) [2024-04-26 21:22:52,062][49517] Fps is (10 sec: 54068.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4484792320. Throughput: 0: 50739.3. Samples: 2237587480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:52,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 21:22:52,114][49750] Updated weights for policy 0, policy_version 273731 (0.0031) [2024-04-26 21:22:55,562][49750] Updated weights for policy 0, policy_version 273741 (0.0034) [2024-04-26 21:22:57,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4485038080. Throughput: 0: 50594.0. Samples: 2237888980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 21:22:57,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 21:22:58,503][49750] Updated weights for policy 0, policy_version 273751 (0.0035) [2024-04-26 21:23:02,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4485283840. Throughput: 0: 50616.6. Samples: 2238194420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 21:23:02,198][49750] Updated weights for policy 0, policy_version 273761 (0.0034) [2024-04-26 21:23:04,862][49750] Updated weights for policy 0, policy_version 273771 (0.0033) [2024-04-26 21:23:07,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4485513216. Throughput: 0: 50633.0. Samples: 2238336680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 21:23:08,535][49750] Updated weights for policy 0, policy_version 273781 (0.0031) [2024-04-26 21:23:11,396][49750] Updated weights for policy 0, policy_version 273791 (0.0033) [2024-04-26 21:23:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4485808128. Throughput: 0: 50557.3. Samples: 2238642480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:12,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 21:23:15,365][49750] Updated weights for policy 0, policy_version 273801 (0.0029) [2024-04-26 21:23:17,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4486053888. Throughput: 0: 50534.8. Samples: 2238948480. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:17,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 21:23:17,824][49750] Updated weights for policy 0, policy_version 273811 (0.0033) [2024-04-26 21:23:21,876][49750] Updated weights for policy 0, policy_version 273821 (0.0031) [2024-04-26 21:23:22,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4486283264. Throughput: 0: 50739.7. Samples: 2239101020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:22,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 21:23:24,200][49750] Updated weights for policy 0, policy_version 273831 (0.0031) [2024-04-26 21:23:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4486545408. Throughput: 0: 50504.9. Samples: 2239398800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:27,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 21:23:28,349][49750] Updated weights for policy 0, policy_version 273841 (0.0031) [2024-04-26 21:23:30,535][49750] Updated weights for policy 0, policy_version 273851 (0.0031) [2024-04-26 21:23:32,063][49517] Fps is (10 sec: 54065.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4486823936. Throughput: 0: 50627.1. Samples: 2239706960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:32,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 21:23:34,749][49750] Updated weights for policy 0, policy_version 273861 (0.0036) [2024-04-26 21:23:37,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4487086080. Throughput: 0: 50606.6. Samples: 2239864780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:37,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 21:23:37,317][49750] Updated weights for policy 0, policy_version 273871 (0.0030) [2024-04-26 21:23:41,048][49750] Updated weights for policy 0, policy_version 273881 (0.0028) [2024-04-26 21:23:42,063][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4487315456. Throughput: 0: 50630.9. Samples: 2240167380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 21:23:44,126][49750] Updated weights for policy 0, policy_version 273891 (0.0028) [2024-04-26 21:23:47,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4487561216. Throughput: 0: 50643.6. Samples: 2240473380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:47,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 21:23:47,488][49750] Updated weights for policy 0, policy_version 273901 (0.0033) [2024-04-26 21:23:50,451][49750] Updated weights for policy 0, policy_version 273911 (0.0031) [2024-04-26 21:23:52,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4487823360. Throughput: 0: 50740.1. Samples: 2240619980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:52,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 21:23:54,010][49750] Updated weights for policy 0, policy_version 273921 (0.0032) [2024-04-26 21:23:56,831][49750] Updated weights for policy 0, policy_version 273931 (0.0034) [2024-04-26 21:23:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.3, 300 sec: 50762.7). Total num frames: 4488085504. Throughput: 0: 50734.7. Samples: 2240925540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:23:57,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 21:24:00,488][49750] Updated weights for policy 0, policy_version 273941 (0.0033) [2024-04-26 21:24:02,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4488331264. Throughput: 0: 50779.1. Samples: 2241233540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:24:02,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 21:24:02,091][49728] Signal inference workers to stop experience collection... (33500 times) [2024-04-26 21:24:02,091][49728] Signal inference workers to resume experience collection... (33500 times) [2024-04-26 21:24:02,106][49750] InferenceWorker_p0-w0: stopping experience collection (33500 times) [2024-04-26 21:24:02,106][49750] InferenceWorker_p0-w0: resuming experience collection (33500 times) [2024-04-26 21:24:03,214][49750] Updated weights for policy 0, policy_version 273951 (0.0028) [2024-04-26 21:24:06,954][49750] Updated weights for policy 0, policy_version 273961 (0.0032) [2024-04-26 21:24:07,062][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4488577024. Throughput: 0: 50783.8. Samples: 2241386300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:24:07,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 21:24:09,537][49750] Updated weights for policy 0, policy_version 273971 (0.0032) [2024-04-26 21:24:12,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4488822784. Throughput: 0: 50922.2. Samples: 2241690300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-04-26 21:24:12,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:24:13,382][49750] Updated weights for policy 0, policy_version 273981 (0.0031) [2024-04-26 21:24:15,866][49750] Updated weights for policy 0, policy_version 273991 (0.0035) [2024-04-26 21:24:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4489101312. Throughput: 0: 50783.8. Samples: 2241992220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:24:17,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 21:24:19,932][49750] Updated weights for policy 0, policy_version 274001 (0.0032) [2024-04-26 21:24:22,063][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4489363456. Throughput: 0: 50827.0. Samples: 2242152000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:24:22,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 21:24:22,647][49750] Updated weights for policy 0, policy_version 274011 (0.0031) [2024-04-26 21:24:26,260][49750] Updated weights for policy 0, policy_version 274021 (0.0033) [2024-04-26 21:24:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4489609216. Throughput: 0: 50944.6. Samples: 2242459880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:24:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:24:29,204][49750] Updated weights for policy 0, policy_version 274031 (0.0035) [2024-04-26 21:24:32,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4489838592. Throughput: 0: 50792.1. Samples: 2242759020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:24:32,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 21:24:32,621][49750] Updated weights for policy 0, policy_version 274041 (0.0032) [2024-04-26 21:24:35,524][49750] Updated weights for policy 0, policy_version 274051 (0.0031) [2024-04-26 21:24:37,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4490100736. Throughput: 0: 50821.3. Samples: 2242906940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:24:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 21:24:39,079][49750] Updated weights for policy 0, policy_version 274061 (0.0033) [2024-04-26 21:24:41,817][49750] Updated weights for policy 0, policy_version 274071 (0.0035) [2024-04-26 21:24:42,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 4490379264. Throughput: 0: 50878.7. Samples: 2243215080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:24:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 21:24:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000274071_4490379264.pth... [2024-04-26 21:24:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000273326_4478173184.pth [2024-04-26 21:24:45,487][49750] Updated weights for policy 0, policy_version 274081 (0.0028) [2024-04-26 21:24:47,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4490608640. Throughput: 0: 50704.0. Samples: 2243515220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:24:47,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 21:24:48,613][49750] Updated weights for policy 0, policy_version 274091 (0.0037) [2024-04-26 21:24:51,981][49750] Updated weights for policy 0, policy_version 274101 (0.0038) [2024-04-26 21:24:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50818.2). 
Total num frames: 4490870784. Throughput: 0: 50736.1. Samples: 2243669420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:24:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 21:24:54,906][49750] Updated weights for policy 0, policy_version 274111 (0.0035) [2024-04-26 21:24:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4491116544. Throughput: 0: 50852.1. Samples: 2243978640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:24:57,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 21:24:58,555][49750] Updated weights for policy 0, policy_version 274121 (0.0031) [2024-04-26 21:25:01,408][49750] Updated weights for policy 0, policy_version 274131 (0.0030) [2024-04-26 21:25:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4491378688. Throughput: 0: 50711.6. Samples: 2244274240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:25:02,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 21:25:04,919][49750] Updated weights for policy 0, policy_version 274141 (0.0033) [2024-04-26 21:25:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4491640832. Throughput: 0: 50708.5. Samples: 2244433880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:25:07,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 21:25:07,946][49750] Updated weights for policy 0, policy_version 274151 (0.0035) [2024-04-26 21:25:11,299][49750] Updated weights for policy 0, policy_version 274161 (0.0033) [2024-04-26 21:25:12,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 4491886592. Throughput: 0: 50690.3. Samples: 2244740940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:25:12,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 21:25:14,268][49750] Updated weights for policy 0, policy_version 274171 (0.0032) [2024-04-26 21:25:17,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4492115968. Throughput: 0: 50778.7. Samples: 2245044060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:25:17,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 21:25:17,857][49750] Updated weights for policy 0, policy_version 274181 (0.0032) [2024-04-26 21:25:20,648][49750] Updated weights for policy 0, policy_version 274191 (0.0039) [2024-04-26 21:25:22,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4492394496. Throughput: 0: 50690.9. Samples: 2245188040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-26 21:25:22,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 21:25:24,412][49750] Updated weights for policy 0, policy_version 274201 (0.0030) [2024-04-26 21:25:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4492640256. Throughput: 0: 50668.0. Samples: 2245495140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:25:27,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 21:25:27,266][49750] Updated weights for policy 0, policy_version 274211 (0.0028) [2024-04-26 21:25:30,908][49750] Updated weights for policy 0, policy_version 274221 (0.0027) [2024-04-26 21:25:32,060][49728] Signal inference workers to stop experience collection... (33550 times) [2024-04-26 21:25:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4492902400. Throughput: 0: 50823.6. Samples: 2245802280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:25:32,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 21:25:32,084][49750] InferenceWorker_p0-w0: stopping experience collection (33550 times) [2024-04-26 21:25:32,176][49728] Signal inference workers to resume experience collection... (33550 times) [2024-04-26 21:25:32,176][49750] InferenceWorker_p0-w0: resuming experience collection (33550 times) [2024-04-26 21:25:33,696][49750] Updated weights for policy 0, policy_version 274231 (0.0036) [2024-04-26 21:25:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4493131776. Throughput: 0: 50783.1. Samples: 2245954660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:25:37,063][49517] Avg episode reward: [(0, '0.686')] [2024-04-26 21:25:37,253][49750] Updated weights for policy 0, policy_version 274241 (0.0034) [2024-04-26 21:25:40,168][49750] Updated weights for policy 0, policy_version 274251 (0.0029) [2024-04-26 21:25:42,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4493393920. Throughput: 0: 50628.5. Samples: 2246256920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:25:42,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 21:25:43,773][49750] Updated weights for policy 0, policy_version 274261 (0.0033) [2024-04-26 21:25:46,654][49750] Updated weights for policy 0, policy_version 274271 (0.0035) [2024-04-26 21:25:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 4493656064. Throughput: 0: 50744.0. Samples: 2246557720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:25:47,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 21:25:50,176][49750] Updated weights for policy 0, policy_version 274281 (0.0030) [2024-04-26 21:25:52,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4493918208. Throughput: 0: 50672.7. 
Samples: 2246714160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:25:52,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 21:25:53,217][49750] Updated weights for policy 0, policy_version 274291 (0.0032) [2024-04-26 21:25:56,617][49750] Updated weights for policy 0, policy_version 274301 (0.0029) [2024-04-26 21:25:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4494180352. Throughput: 0: 50661.6. Samples: 2247020720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:25:57,064][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 21:25:59,663][49750] Updated weights for policy 0, policy_version 274311 (0.0035) [2024-04-26 21:26:02,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4494409728. Throughput: 0: 50679.6. Samples: 2247324640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:26:02,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 21:26:03,143][49750] Updated weights for policy 0, policy_version 274321 (0.0032) [2024-04-26 21:26:06,076][49750] Updated weights for policy 0, policy_version 274331 (0.0030) [2024-04-26 21:26:07,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4494655488. Throughput: 0: 50757.9. Samples: 2247472140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:26:07,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-26 21:26:09,556][49750] Updated weights for policy 0, policy_version 274341 (0.0028) [2024-04-26 21:26:12,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4494950400. Throughput: 0: 50762.1. Samples: 2247779440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:26:12,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 21:26:12,648][49750] Updated weights for policy 0, policy_version 274351 (0.0029) [2024-04-26 21:26:15,895][49750] Updated weights for policy 0, policy_version 274361 (0.0030) [2024-04-26 21:26:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 4495179776. Throughput: 0: 50725.8. Samples: 2248084940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:26:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 21:26:19,212][49750] Updated weights for policy 0, policy_version 274371 (0.0028) [2024-04-26 21:26:22,062][49517] Fps is (10 sec: 45875.6, 60 sec: 50244.5, 300 sec: 50651.6). Total num frames: 4495409152. Throughput: 0: 50624.9. Samples: 2248232780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:26:22,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 21:26:22,429][49750] Updated weights for policy 0, policy_version 274381 (0.0037) [2024-04-26 21:26:25,581][49750] Updated weights for policy 0, policy_version 274391 (0.0032) [2024-04-26 21:26:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4495687680. Throughput: 0: 50722.1. Samples: 2248539420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:26:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:26:29,110][49750] Updated weights for policy 0, policy_version 274401 (0.0029) [2024-04-26 21:26:32,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4495933440. Throughput: 0: 50710.9. Samples: 2248839720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-26 21:26:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:26:32,198][49750] Updated weights for policy 0, policy_version 274411 (0.0037) [2024-04-26 21:26:35,419][49750] Updated weights for policy 0, policy_version 274421 (0.0031) [2024-04-26 21:26:37,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4496211968. Throughput: 0: 50726.9. Samples: 2248996860. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:26:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 21:26:38,532][49750] Updated weights for policy 0, policy_version 274431 (0.0033) [2024-04-26 21:26:41,697][49750] Updated weights for policy 0, policy_version 274441 (0.0029) [2024-04-26 21:26:42,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4496441344. Throughput: 0: 50598.2. Samples: 2249297640. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:26:42,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 21:26:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000274441_4496441344.pth... [2024-04-26 21:26:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000273699_4484284416.pth [2024-04-26 21:26:44,907][49750] Updated weights for policy 0, policy_version 274451 (0.0034) [2024-04-26 21:26:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4496703488. Throughput: 0: 50651.1. Samples: 2249603940. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:26:47,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 21:26:48,065][49728] Signal inference workers to stop experience collection... (33600 times) [2024-04-26 21:26:48,066][49728] Signal inference workers to resume experience collection... 
(33600 times) [2024-04-26 21:26:48,098][49750] InferenceWorker_p0-w0: stopping experience collection (33600 times) [2024-04-26 21:26:48,099][49750] InferenceWorker_p0-w0: resuming experience collection (33600 times) [2024-04-26 21:26:48,210][49750] Updated weights for policy 0, policy_version 274461 (0.0035) [2024-04-26 21:26:51,457][49750] Updated weights for policy 0, policy_version 274471 (0.0030) [2024-04-26 21:26:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4496949248. Throughput: 0: 50619.0. Samples: 2249750000. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:26:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 21:26:54,709][49750] Updated weights for policy 0, policy_version 274481 (0.0037) [2024-04-26 21:26:57,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4497227776. Throughput: 0: 50763.1. Samples: 2250063780. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:26:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 21:26:57,956][49750] Updated weights for policy 0, policy_version 274491 (0.0030) [2024-04-26 21:27:01,516][49750] Updated weights for policy 0, policy_version 274501 (0.0029) [2024-04-26 21:27:02,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4497457152. Throughput: 0: 50838.3. Samples: 2250372660. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 21:27:04,581][49750] Updated weights for policy 0, policy_version 274511 (0.0031) [2024-04-26 21:27:07,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4497719296. Throughput: 0: 50924.7. Samples: 2250524400. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:07,064][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 21:27:07,842][49750] Updated weights for policy 0, policy_version 274521 (0.0037) [2024-04-26 21:27:11,141][49750] Updated weights for policy 0, policy_version 274531 (0.0032) [2024-04-26 21:27:12,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4497965056. Throughput: 0: 50749.4. Samples: 2250823140. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:12,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 21:27:14,163][49750] Updated weights for policy 0, policy_version 274541 (0.0030) [2024-04-26 21:27:17,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4498227200. Throughput: 0: 50792.5. Samples: 2251125380. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:17,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 21:27:17,607][49750] Updated weights for policy 0, policy_version 274551 (0.0036) [2024-04-26 21:27:20,754][49750] Updated weights for policy 0, policy_version 274561 (0.0033) [2024-04-26 21:27:22,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51609.5, 300 sec: 50762.6). Total num frames: 4498505728. Throughput: 0: 50837.2. Samples: 2251284540. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 21:27:23,945][49750] Updated weights for policy 0, policy_version 274571 (0.0027) [2024-04-26 21:27:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4498718720. Throughput: 0: 50792.1. Samples: 2251583280. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:27,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 21:27:27,408][49750] Updated weights for policy 0, policy_version 274581 (0.0040) [2024-04-26 21:27:30,428][49750] Updated weights for policy 0, policy_version 274591 (0.0026) [2024-04-26 21:27:32,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4498980864. Throughput: 0: 50764.5. Samples: 2251888340. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:32,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-26 21:27:33,754][49750] Updated weights for policy 0, policy_version 274601 (0.0028) [2024-04-26 21:27:37,011][49750] Updated weights for policy 0, policy_version 274611 (0.0031) [2024-04-26 21:27:37,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4499226624. Throughput: 0: 50755.2. Samples: 2252033980. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:37,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:27:40,321][49750] Updated weights for policy 0, policy_version 274621 (0.0029) [2024-04-26 21:27:42,062][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4499505152. Throughput: 0: 50600.4. Samples: 2252340800. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:42,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 21:27:43,659][49750] Updated weights for policy 0, policy_version 274631 (0.0034) [2024-04-26 21:27:45,651][49728] Signal inference workers to stop experience collection... (33650 times) [2024-04-26 21:27:45,651][49728] Signal inference workers to resume experience collection... 
(33650 times) [2024-04-26 21:27:45,680][49750] InferenceWorker_p0-w0: stopping experience collection (33650 times) [2024-04-26 21:27:45,681][49750] InferenceWorker_p0-w0: resuming experience collection (33650 times) [2024-04-26 21:27:46,761][49750] Updated weights for policy 0, policy_version 274641 (0.0034) [2024-04-26 21:27:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4499734528. Throughput: 0: 50658.6. Samples: 2252652300. Policy #0 lag: (min: 1.0, avg: 12.1, max: 24.0) [2024-04-26 21:27:47,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 21:27:49,943][49750] Updated weights for policy 0, policy_version 274651 (0.0036) [2024-04-26 21:27:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4499996672. Throughput: 0: 50649.4. Samples: 2252803620. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:27:52,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 21:27:53,162][49750] Updated weights for policy 0, policy_version 274661 (0.0027) [2024-04-26 21:27:56,268][49750] Updated weights for policy 0, policy_version 274671 (0.0031) [2024-04-26 21:27:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4500242432. Throughput: 0: 50618.7. Samples: 2253100980. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:27:57,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-26 21:27:59,486][49750] Updated weights for policy 0, policy_version 274681 (0.0034) [2024-04-26 21:28:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4500504576. Throughput: 0: 50669.9. Samples: 2253405520. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 21:28:02,832][49750] Updated weights for policy 0, policy_version 274691 (0.0032) [2024-04-26 21:28:06,016][49750] Updated weights for policy 0, policy_version 274701 (0.0036) [2024-04-26 21:28:07,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4500783104. Throughput: 0: 50660.4. Samples: 2253564260. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:07,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 21:28:09,292][49750] Updated weights for policy 0, policy_version 274711 (0.0027) [2024-04-26 21:28:12,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4500996096. Throughput: 0: 50594.5. Samples: 2253860040. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:28:12,486][49750] Updated weights for policy 0, policy_version 274721 (0.0032) [2024-04-26 21:28:15,782][49750] Updated weights for policy 0, policy_version 274731 (0.0028) [2024-04-26 21:28:17,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4501274624. Throughput: 0: 50701.1. Samples: 2254169900. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:28:18,805][49750] Updated weights for policy 0, policy_version 274741 (0.0026) [2024-04-26 21:28:22,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4501504000. Throughput: 0: 50679.5. Samples: 2254314560. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:22,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 21:28:22,232][49750] Updated weights for policy 0, policy_version 274751 (0.0034) [2024-04-26 21:28:25,193][49750] Updated weights for policy 0, policy_version 274761 (0.0036) [2024-04-26 21:28:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4501782528. Throughput: 0: 50751.1. Samples: 2254624600. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 21:28:28,646][49750] Updated weights for policy 0, policy_version 274771 (0.0033) [2024-04-26 21:28:31,599][49750] Updated weights for policy 0, policy_version 274781 (0.0032) [2024-04-26 21:28:32,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 4502028288. Throughput: 0: 50652.3. Samples: 2254931660. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 21:28:35,057][49750] Updated weights for policy 0, policy_version 274791 (0.0026) [2024-04-26 21:28:37,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4502274048. Throughput: 0: 50708.4. Samples: 2255085500. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 21:28:38,034][49750] Updated weights for policy 0, policy_version 274801 (0.0029) [2024-04-26 21:28:38,506][49728] Signal inference workers to stop experience collection... (33700 times) [2024-04-26 21:28:38,507][49728] Signal inference workers to resume experience collection... 
(33700 times) [2024-04-26 21:28:38,538][49750] InferenceWorker_p0-w0: stopping experience collection (33700 times) [2024-04-26 21:28:38,538][49750] InferenceWorker_p0-w0: resuming experience collection (33700 times) [2024-04-26 21:28:41,651][49750] Updated weights for policy 0, policy_version 274811 (0.0033) [2024-04-26 21:28:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4502536192. Throughput: 0: 50857.3. Samples: 2255389560. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:42,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 21:28:42,090][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000274814_4502552576.pth... [2024-04-26 21:28:42,135][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000274071_4490379264.pth [2024-04-26 21:28:44,577][49750] Updated weights for policy 0, policy_version 274821 (0.0030) [2024-04-26 21:28:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4502798336. Throughput: 0: 50778.7. Samples: 2255690560. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:47,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 21:28:47,931][49750] Updated weights for policy 0, policy_version 274831 (0.0030) [2024-04-26 21:28:50,982][49750] Updated weights for policy 0, policy_version 274841 (0.0035) [2024-04-26 21:28:52,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4503076864. Throughput: 0: 50767.2. Samples: 2255848780. Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 21:28:54,397][49750] Updated weights for policy 0, policy_version 274851 (0.0035) [2024-04-26 21:28:57,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4503289856. Throughput: 0: 50870.6. Samples: 2256149220. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 21.0) [2024-04-26 21:28:57,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 21:28:57,421][49750] Updated weights for policy 0, policy_version 274861 (0.0033) [2024-04-26 21:29:00,941][49750] Updated weights for policy 0, policy_version 274871 (0.0034) [2024-04-26 21:29:02,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4503535616. Throughput: 0: 50626.8. Samples: 2256448100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:02,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 21:29:03,863][49750] Updated weights for policy 0, policy_version 274881 (0.0032) [2024-04-26 21:29:07,063][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4503781376. Throughput: 0: 50731.4. Samples: 2256597480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 21:29:07,299][49750] Updated weights for policy 0, policy_version 274891 (0.0038) [2024-04-26 21:29:10,339][49750] Updated weights for policy 0, policy_version 274901 (0.0033) [2024-04-26 21:29:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4504059904. Throughput: 0: 50669.9. Samples: 2256904740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 21:29:13,841][49750] Updated weights for policy 0, policy_version 274911 (0.0036) [2024-04-26 21:29:16,801][49750] Updated weights for policy 0, policy_version 274921 (0.0033) [2024-04-26 21:29:17,062][49517] Fps is (10 sec: 52430.0, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4504305664. Throughput: 0: 50624.2. Samples: 2257209740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:17,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 21:29:20,303][49750] Updated weights for policy 0, policy_version 274931 (0.0031) [2024-04-26 21:29:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 4504551424. Throughput: 0: 50635.5. Samples: 2257364100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:22,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 21:29:23,271][49750] Updated weights for policy 0, policy_version 274941 (0.0028) [2024-04-26 21:29:26,700][49750] Updated weights for policy 0, policy_version 274951 (0.0029) [2024-04-26 21:29:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4504813568. Throughput: 0: 50585.8. Samples: 2257665920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:27,072][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 21:29:29,659][49750] Updated weights for policy 0, policy_version 274961 (0.0034) [2024-04-26 21:29:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4505059328. Throughput: 0: 50644.7. Samples: 2257969580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:32,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 21:29:33,059][49750] Updated weights for policy 0, policy_version 274971 (0.0028) [2024-04-26 21:29:36,078][49750] Updated weights for policy 0, policy_version 274981 (0.0034) [2024-04-26 21:29:37,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4505337856. Throughput: 0: 50560.4. Samples: 2258124000. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:37,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 21:29:39,520][49750] Updated weights for policy 0, policy_version 274991 (0.0037) [2024-04-26 21:29:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4505567232. Throughput: 0: 50554.4. Samples: 2258424160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:42,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 21:29:42,628][49750] Updated weights for policy 0, policy_version 275001 (0.0031) [2024-04-26 21:29:45,986][49750] Updated weights for policy 0, policy_version 275011 (0.0030) [2024-04-26 21:29:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4505829376. Throughput: 0: 50699.5. Samples: 2258729580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:47,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 21:29:49,083][49750] Updated weights for policy 0, policy_version 275021 (0.0029) [2024-04-26 21:29:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4506075136. Throughput: 0: 50683.3. Samples: 2258878220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:52,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 21:29:52,425][49750] Updated weights for policy 0, policy_version 275031 (0.0034) [2024-04-26 21:29:54,818][49728] Signal inference workers to stop experience collection... (33750 times) [2024-04-26 21:29:54,861][49750] InferenceWorker_p0-w0: stopping experience collection (33750 times) [2024-04-26 21:29:54,883][49728] Signal inference workers to resume experience collection... 
(33750 times) [2024-04-26 21:29:54,884][49750] InferenceWorker_p0-w0: resuming experience collection (33750 times) [2024-04-26 21:29:55,536][49750] Updated weights for policy 0, policy_version 275041 (0.0034) [2024-04-26 21:29:57,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4506353664. Throughput: 0: 50697.7. Samples: 2259186140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:29:57,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:29:58,839][49750] Updated weights for policy 0, policy_version 275051 (0.0031) [2024-04-26 21:30:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4506583040. Throughput: 0: 50707.9. Samples: 2259491600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:30:02,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 21:30:02,129][49750] Updated weights for policy 0, policy_version 275061 (0.0030) [2024-04-26 21:30:05,264][49750] Updated weights for policy 0, policy_version 275071 (0.0033) [2024-04-26 21:30:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4506845184. Throughput: 0: 50705.0. Samples: 2259645820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:30:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 21:30:08,470][49750] Updated weights for policy 0, policy_version 275081 (0.0028) [2024-04-26 21:30:11,696][49750] Updated weights for policy 0, policy_version 275091 (0.0032) [2024-04-26 21:30:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4507107328. Throughput: 0: 50683.0. Samples: 2259946660. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-26 21:30:12,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 21:30:15,081][49750] Updated weights for policy 0, policy_version 275101 (0.0037) [2024-04-26 21:30:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4507353088. Throughput: 0: 50628.5. Samples: 2260247860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:30:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:30:18,232][49750] Updated weights for policy 0, policy_version 275111 (0.0028) [2024-04-26 21:30:21,557][49750] Updated weights for policy 0, policy_version 275121 (0.0029) [2024-04-26 21:30:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4507598848. Throughput: 0: 50665.0. Samples: 2260403920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:30:22,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 21:30:24,678][49750] Updated weights for policy 0, policy_version 275131 (0.0029) [2024-04-26 21:30:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4507844608. Throughput: 0: 50907.5. Samples: 2260715000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:30:27,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 21:30:27,882][49750] Updated weights for policy 0, policy_version 275141 (0.0040) [2024-04-26 21:30:31,137][49750] Updated weights for policy 0, policy_version 275151 (0.0028) [2024-04-26 21:30:32,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4508106752. Throughput: 0: 50760.3. Samples: 2261013800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:30:32,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 21:30:34,274][49750] Updated weights for policy 0, policy_version 275161 (0.0037) [2024-04-26 21:30:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4508368896. Throughput: 0: 50728.4. Samples: 2261161000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:30:37,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 21:30:37,665][49750] Updated weights for policy 0, policy_version 275171 (0.0035) [2024-04-26 21:30:40,755][49750] Updated weights for policy 0, policy_version 275181 (0.0029) [2024-04-26 21:30:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4508631040. Throughput: 0: 50727.1. Samples: 2261468860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:30:42,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 21:30:42,109][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000275186_4508647424.pth... [2024-04-26 21:30:42,151][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000274441_4496441344.pth [2024-04-26 21:30:44,299][49750] Updated weights for policy 0, policy_version 275191 (0.0028) [2024-04-26 21:30:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4508860416. Throughput: 0: 50779.5. Samples: 2261776680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:30:47,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 21:30:47,500][49750] Updated weights for policy 0, policy_version 275201 (0.0034) [2024-04-26 21:30:50,885][49750] Updated weights for policy 0, policy_version 275211 (0.0029) [2024-04-26 21:30:52,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4509122560. Throughput: 0: 50566.7. Samples: 2261921320. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:30:52,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 21:30:54,125][49750] Updated weights for policy 0, policy_version 275221 (0.0030) [2024-04-26 21:30:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4509368320. Throughput: 0: 50458.7. Samples: 2262217300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:30:57,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 21:30:57,313][49750] Updated weights for policy 0, policy_version 275231 (0.0033) [2024-04-26 21:31:00,378][49750] Updated weights for policy 0, policy_version 275241 (0.0029) [2024-04-26 21:31:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4509630464. Throughput: 0: 50550.3. Samples: 2262522620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:31:02,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 21:31:03,614][49750] Updated weights for policy 0, policy_version 275251 (0.0038) [2024-04-26 21:31:06,659][49750] Updated weights for policy 0, policy_version 275261 (0.0031) [2024-04-26 21:31:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 4509876224. Throughput: 0: 50593.7. Samples: 2262680640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:31:07,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 21:31:09,911][49750] Updated weights for policy 0, policy_version 275271 (0.0034) [2024-04-26 21:31:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 4510121984. Throughput: 0: 50478.2. Samples: 2262986520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:31:12,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 21:31:13,444][49750] Updated weights for policy 0, policy_version 275281 (0.0040) [2024-04-26 21:31:16,362][49750] Updated weights for policy 0, policy_version 275291 (0.0032) [2024-04-26 21:31:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4510384128. Throughput: 0: 50495.2. Samples: 2263286080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:31:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 21:31:19,075][49728] Signal inference workers to stop experience collection... (33800 times) [2024-04-26 21:31:19,076][49728] Signal inference workers to resume experience collection... (33800 times) [2024-04-26 21:31:19,099][49750] InferenceWorker_p0-w0: stopping experience collection (33800 times) [2024-04-26 21:31:19,099][49750] InferenceWorker_p0-w0: resuming experience collection (33800 times) [2024-04-26 21:31:19,930][49750] Updated weights for policy 0, policy_version 275301 (0.0029) [2024-04-26 21:31:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4510646272. Throughput: 0: 50749.4. Samples: 2263444720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 21:31:22,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 21:31:22,806][49750] Updated weights for policy 0, policy_version 275311 (0.0027) [2024-04-26 21:31:26,234][49750] Updated weights for policy 0, policy_version 275321 (0.0034) [2024-04-26 21:31:27,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4510908416. Throughput: 0: 50686.6. Samples: 2263749760. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:31:27,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 21:31:29,125][49750] Updated weights for policy 0, policy_version 275331 (0.0033) [2024-04-26 21:31:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50651.5). Total num frames: 4511154176. Throughput: 0: 50673.8. Samples: 2264057000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:31:32,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 21:31:32,609][49750] Updated weights for policy 0, policy_version 275341 (0.0039) [2024-04-26 21:31:35,639][49750] Updated weights for policy 0, policy_version 275351 (0.0028) [2024-04-26 21:31:37,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4511399936. Throughput: 0: 50702.5. Samples: 2264202940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:31:37,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 21:31:39,207][49750] Updated weights for policy 0, policy_version 275361 (0.0035) [2024-04-26 21:31:42,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4511662080. Throughput: 0: 50897.8. Samples: 2264507700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:31:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 21:31:42,284][49750] Updated weights for policy 0, policy_version 275371 (0.0033) [2024-04-26 21:31:45,654][49750] Updated weights for policy 0, policy_version 275381 (0.0036) [2024-04-26 21:31:47,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4511924224. Throughput: 0: 50827.9. Samples: 2264809880. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:31:47,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 21:31:48,804][49750] Updated weights for policy 0, policy_version 275391 (0.0035) [2024-04-26 21:31:51,964][49750] Updated weights for policy 0, policy_version 275401 (0.0031) [2024-04-26 21:31:52,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4512169984. Throughput: 0: 50639.0. Samples: 2264959400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:31:52,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 21:31:55,099][49750] Updated weights for policy 0, policy_version 275411 (0.0035) [2024-04-26 21:31:57,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4512399360. Throughput: 0: 50734.2. Samples: 2265269560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:31:57,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 21:31:58,476][49750] Updated weights for policy 0, policy_version 275421 (0.0030) [2024-04-26 21:32:01,398][49750] Updated weights for policy 0, policy_version 275431 (0.0035) [2024-04-26 21:32:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4512677888. Throughput: 0: 50799.9. Samples: 2265572080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:32:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 21:32:04,905][49750] Updated weights for policy 0, policy_version 275441 (0.0033) [2024-04-26 21:32:07,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4512940032. Throughput: 0: 50800.9. Samples: 2265730760. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:32:07,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 21:32:07,806][49750] Updated weights for policy 0, policy_version 275451 (0.0030) [2024-04-26 21:32:11,489][49750] Updated weights for policy 0, policy_version 275461 (0.0024) [2024-04-26 21:32:12,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4513185792. Throughput: 0: 50855.0. Samples: 2266038240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:32:12,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-26 21:32:14,199][49750] Updated weights for policy 0, policy_version 275471 (0.0028) [2024-04-26 21:32:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 4513447936. Throughput: 0: 50878.3. Samples: 2266346520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:32:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:32:17,806][49750] Updated weights for policy 0, policy_version 275481 (0.0029) [2024-04-26 21:32:20,688][49750] Updated weights for policy 0, policy_version 275491 (0.0033) [2024-04-26 21:32:22,063][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4513693696. Throughput: 0: 50953.8. Samples: 2266495860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:32:22,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 21:32:24,109][49750] Updated weights for policy 0, policy_version 275501 (0.0038) [2024-04-26 21:32:26,993][49750] Updated weights for policy 0, policy_version 275511 (0.0026) [2024-04-26 21:32:27,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.6, 300 sec: 50818.1). Total num frames: 4513972224. Throughput: 0: 51060.5. Samples: 2266805420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:32:27,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 21:32:27,785][49728] Signal inference workers to stop experience collection... (33850 times) [2024-04-26 21:32:27,785][49728] Signal inference workers to resume experience collection... (33850 times) [2024-04-26 21:32:27,810][49750] InferenceWorker_p0-w0: stopping experience collection (33850 times) [2024-04-26 21:32:27,811][49750] InferenceWorker_p0-w0: resuming experience collection (33850 times) [2024-04-26 21:32:30,457][49750] Updated weights for policy 0, policy_version 275521 (0.0027) [2024-04-26 21:32:32,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4514217984. Throughput: 0: 51132.8. Samples: 2267110860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 21:32:32,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 21:32:33,477][49750] Updated weights for policy 0, policy_version 275531 (0.0033) [2024-04-26 21:32:36,948][49750] Updated weights for policy 0, policy_version 275541 (0.0035) [2024-04-26 21:32:37,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4514463744. Throughput: 0: 51111.5. Samples: 2267259420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:32:37,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 21:32:39,936][49750] Updated weights for policy 0, policy_version 275551 (0.0028) [2024-04-26 21:32:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4514725888. Throughput: 0: 51045.4. Samples: 2267566600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:32:42,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 21:32:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000275557_4514725888.pth... 
[2024-04-26 21:32:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000274814_4502552576.pth [2024-04-26 21:32:43,478][49750] Updated weights for policy 0, policy_version 275561 (0.0036) [2024-04-26 21:32:46,345][49750] Updated weights for policy 0, policy_version 275571 (0.0030) [2024-04-26 21:32:47,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4514971648. Throughput: 0: 51121.8. Samples: 2267872560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:32:47,064][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 21:32:49,745][49750] Updated weights for policy 0, policy_version 275581 (0.0030) [2024-04-26 21:32:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4515233792. Throughput: 0: 51062.7. Samples: 2268028580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:32:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 21:32:52,691][49750] Updated weights for policy 0, policy_version 275591 (0.0033) [2024-04-26 21:32:56,324][49750] Updated weights for policy 0, policy_version 275601 (0.0028) [2024-04-26 21:32:57,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4515495936. Throughput: 0: 51132.2. Samples: 2268339180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:32:57,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 21:32:58,971][49750] Updated weights for policy 0, policy_version 275611 (0.0034) [2024-04-26 21:33:02,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 4515708928. Throughput: 0: 50920.0. Samples: 2268637920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:02,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 21:33:02,644][49750] Updated weights for policy 0, policy_version 275621 (0.0030) [2024-04-26 21:33:05,339][49750] Updated weights for policy 0, policy_version 275631 (0.0030) [2024-04-26 21:33:07,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4515987456. Throughput: 0: 50936.0. Samples: 2268787980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:07,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 21:33:09,406][49750] Updated weights for policy 0, policy_version 275641 (0.0033) [2024-04-26 21:33:11,688][49750] Updated weights for policy 0, policy_version 275651 (0.0033) [2024-04-26 21:33:12,062][49517] Fps is (10 sec: 55705.1, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4516265984. Throughput: 0: 50831.1. Samples: 2269092820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 21:33:15,793][49750] Updated weights for policy 0, policy_version 275661 (0.0025) [2024-04-26 21:33:17,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4516511744. Throughput: 0: 50964.2. Samples: 2269404240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:17,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 21:33:18,431][49750] Updated weights for policy 0, policy_version 275671 (0.0031) [2024-04-26 21:33:22,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4516741120. Throughput: 0: 51129.9. Samples: 2269560260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 21:33:22,194][49750] Updated weights for policy 0, policy_version 275681 (0.0030) [2024-04-26 21:33:24,801][49750] Updated weights for policy 0, policy_version 275691 (0.0025) [2024-04-26 21:33:27,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4517019648. Throughput: 0: 51055.8. Samples: 2269864120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:27,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 21:33:27,782][49728] Signal inference workers to stop experience collection... (33900 times) [2024-04-26 21:33:27,818][49750] InferenceWorker_p0-w0: stopping experience collection (33900 times) [2024-04-26 21:33:27,852][49728] Signal inference workers to resume experience collection... (33900 times) [2024-04-26 21:33:27,852][49750] InferenceWorker_p0-w0: resuming experience collection (33900 times) [2024-04-26 21:33:28,454][49750] Updated weights for policy 0, policy_version 275701 (0.0028) [2024-04-26 21:33:31,348][49750] Updated weights for policy 0, policy_version 275711 (0.0030) [2024-04-26 21:33:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4517265408. Throughput: 0: 51030.1. Samples: 2270168920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:32,072][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 21:33:34,914][49750] Updated weights for policy 0, policy_version 275721 (0.0023) [2024-04-26 21:33:37,062][49517] Fps is (10 sec: 50791.7, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4517527552. Throughput: 0: 51035.5. Samples: 2270325180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:37,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:33:37,874][49750] Updated weights for policy 0, policy_version 275731 (0.0031) [2024-04-26 21:33:41,225][49750] Updated weights for policy 0, policy_version 275741 (0.0029) [2024-04-26 21:33:42,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4517789696. Throughput: 0: 50985.2. Samples: 2270633520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:42,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 21:33:44,307][49750] Updated weights for policy 0, policy_version 275751 (0.0030) [2024-04-26 21:33:47,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4518035456. Throughput: 0: 51164.8. Samples: 2270940340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 21:33:47,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 21:33:47,656][49750] Updated weights for policy 0, policy_version 275761 (0.0028) [2024-04-26 21:33:50,776][49750] Updated weights for policy 0, policy_version 275771 (0.0034) [2024-04-26 21:33:52,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4518297600. Throughput: 0: 51132.6. Samples: 2271088940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:33:52,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:33:54,062][49750] Updated weights for policy 0, policy_version 275781 (0.0034) [2024-04-26 21:33:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4518543360. Throughput: 0: 51099.1. Samples: 2271392280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:33:57,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 21:33:57,149][49750] Updated weights for policy 0, policy_version 275791 (0.0032) [2024-04-26 21:34:00,540][49750] Updated weights for policy 0, policy_version 275801 (0.0031) [2024-04-26 21:34:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51882.6, 300 sec: 50984.8). Total num frames: 4518821888. Throughput: 0: 50921.7. Samples: 2271695720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:02,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 21:34:03,459][49750] Updated weights for policy 0, policy_version 275811 (0.0030) [2024-04-26 21:34:06,843][49750] Updated weights for policy 0, policy_version 275821 (0.0030) [2024-04-26 21:34:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4519051264. Throughput: 0: 51071.6. Samples: 2271858480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:07,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 21:34:09,945][49750] Updated weights for policy 0, policy_version 275831 (0.0031) [2024-04-26 21:34:12,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4519297024. Throughput: 0: 50960.9. Samples: 2272157360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:34:13,333][49750] Updated weights for policy 0, policy_version 275841 (0.0025) [2024-04-26 21:34:16,481][49750] Updated weights for policy 0, policy_version 275851 (0.0033) [2024-04-26 21:34:17,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 4519559168. Throughput: 0: 50971.6. Samples: 2272462640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:17,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 21:34:19,808][49750] Updated weights for policy 0, policy_version 275861 (0.0032) [2024-04-26 21:34:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4519821312. Throughput: 0: 51010.6. Samples: 2272620660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:22,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 21:34:22,893][49750] Updated weights for policy 0, policy_version 275871 (0.0035) [2024-04-26 21:34:26,262][49750] Updated weights for policy 0, policy_version 275881 (0.0035) [2024-04-26 21:34:27,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 4520083456. Throughput: 0: 50949.9. Samples: 2272926260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 21:34:29,456][49750] Updated weights for policy 0, policy_version 275891 (0.0030) [2024-04-26 21:34:32,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4520312832. Throughput: 0: 50751.1. Samples: 2273224140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:32,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 21:34:32,786][49750] Updated weights for policy 0, policy_version 275901 (0.0030) [2024-04-26 21:34:35,889][49750] Updated weights for policy 0, policy_version 275911 (0.0028) [2024-04-26 21:34:37,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4520574976. Throughput: 0: 50839.0. Samples: 2273376700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:37,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 21:34:39,152][49750] Updated weights for policy 0, policy_version 275921 (0.0032) [2024-04-26 21:34:39,928][49728] Signal inference workers to stop experience collection... (33950 times) [2024-04-26 21:34:39,982][49750] InferenceWorker_p0-w0: stopping experience collection (33950 times) [2024-04-26 21:34:39,997][49728] Signal inference workers to resume experience collection... (33950 times) [2024-04-26 21:34:40,003][49750] InferenceWorker_p0-w0: resuming experience collection (33950 times) [2024-04-26 21:34:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4520820736. Throughput: 0: 50852.5. Samples: 2273680640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 21:34:42,149][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000275930_4520837120.pth... [2024-04-26 21:34:42,309][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000275186_4508647424.pth [2024-04-26 21:34:42,466][49750] Updated weights for policy 0, policy_version 275931 (0.0035) [2024-04-26 21:34:45,592][49750] Updated weights for policy 0, policy_version 275941 (0.0030) [2024-04-26 21:34:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4521099264. Throughput: 0: 50746.6. Samples: 2273979320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:47,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 21:34:48,923][49750] Updated weights for policy 0, policy_version 275951 (0.0033) [2024-04-26 21:34:52,017][49750] Updated weights for policy 0, policy_version 275961 (0.0033) [2024-04-26 21:34:52,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4521345024. Throughput: 0: 50722.6. 
Samples: 2274141000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:52,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 21:34:55,371][49750] Updated weights for policy 0, policy_version 275971 (0.0031) [2024-04-26 21:34:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4521590784. Throughput: 0: 50776.7. Samples: 2274442300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 21:34:57,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 21:34:58,564][49750] Updated weights for policy 0, policy_version 275981 (0.0029) [2024-04-26 21:35:01,675][49750] Updated weights for policy 0, policy_version 275991 (0.0028) [2024-04-26 21:35:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4521836544. Throughput: 0: 50855.3. Samples: 2274751120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:02,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 21:35:05,003][49750] Updated weights for policy 0, policy_version 276001 (0.0036) [2024-04-26 21:35:07,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4522115072. Throughput: 0: 50605.3. Samples: 2274897900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:07,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 21:35:08,006][49750] Updated weights for policy 0, policy_version 276011 (0.0035) [2024-04-26 21:35:11,419][49750] Updated weights for policy 0, policy_version 276021 (0.0030) [2024-04-26 21:35:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4522360832. Throughput: 0: 50664.0. Samples: 2275206140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:12,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 21:35:14,486][49750] Updated weights for policy 0, policy_version 276031 (0.0033) [2024-04-26 21:35:17,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4522590208. Throughput: 0: 50693.4. Samples: 2275505340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:17,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 21:35:18,088][49750] Updated weights for policy 0, policy_version 276041 (0.0035) [2024-04-26 21:35:20,941][49750] Updated weights for policy 0, policy_version 276051 (0.0029) [2024-04-26 21:35:22,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4522852352. Throughput: 0: 50624.3. Samples: 2275654800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:22,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 21:35:24,501][49750] Updated weights for policy 0, policy_version 276061 (0.0031) [2024-04-26 21:35:27,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4523114496. Throughput: 0: 50700.3. Samples: 2275962160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:27,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 21:35:27,341][49750] Updated weights for policy 0, policy_version 276071 (0.0036) [2024-04-26 21:35:30,935][49750] Updated weights for policy 0, policy_version 276081 (0.0029) [2024-04-26 21:35:32,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4523360256. Throughput: 0: 50779.5. Samples: 2276264400. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:32,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 21:35:34,224][49750] Updated weights for policy 0, policy_version 276091 (0.0037) [2024-04-26 21:35:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4523606016. Throughput: 0: 50608.6. Samples: 2276418380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:37,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 21:35:37,266][49750] Updated weights for policy 0, policy_version 276101 (0.0031) [2024-04-26 21:35:39,828][49728] Signal inference workers to stop experience collection... (34000 times) [2024-04-26 21:35:39,862][49750] InferenceWorker_p0-w0: stopping experience collection (34000 times) [2024-04-26 21:35:39,893][49728] Signal inference workers to resume experience collection... (34000 times) [2024-04-26 21:35:39,897][49750] InferenceWorker_p0-w0: resuming experience collection (34000 times) [2024-04-26 21:35:40,809][49750] Updated weights for policy 0, policy_version 276111 (0.0030) [2024-04-26 21:35:42,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4523868160. Throughput: 0: 50761.2. Samples: 2276726560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:42,063][49517] Avg episode reward: [(0, '0.457')] [2024-04-26 21:35:43,584][49750] Updated weights for policy 0, policy_version 276121 (0.0029) [2024-04-26 21:35:47,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50818.1). Total num frames: 4524113920. Throughput: 0: 50636.4. Samples: 2277029760. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:47,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 21:35:47,182][49750] Updated weights for policy 0, policy_version 276131 (0.0031) [2024-04-26 21:35:49,974][49750] Updated weights for policy 0, policy_version 276141 (0.0029) [2024-04-26 21:35:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4524392448. Throughput: 0: 50653.8. Samples: 2277177320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:52,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 21:35:53,729][49750] Updated weights for policy 0, policy_version 276151 (0.0028) [2024-04-26 21:35:56,261][49750] Updated weights for policy 0, policy_version 276161 (0.0029) [2024-04-26 21:35:57,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4524654592. Throughput: 0: 50687.0. Samples: 2277487060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:35:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 21:36:00,050][49750] Updated weights for policy 0, policy_version 276171 (0.0034) [2024-04-26 21:36:02,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4524900352. Throughput: 0: 50942.5. Samples: 2277797760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:36:02,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 21:36:02,749][49750] Updated weights for policy 0, policy_version 276181 (0.0032) [2024-04-26 21:36:06,387][49750] Updated weights for policy 0, policy_version 276191 (0.0037) [2024-04-26 21:36:07,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.4, 300 sec: 50873.7). Total num frames: 4525129728. Throughput: 0: 50878.0. Samples: 2277944300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:36:07,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 21:36:09,212][49750] Updated weights for policy 0, policy_version 276201 (0.0034) [2024-04-26 21:36:12,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4525408256. Throughput: 0: 50757.0. Samples: 2278246220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-26 21:36:12,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 21:36:13,046][49750] Updated weights for policy 0, policy_version 276211 (0.0032) [2024-04-26 21:36:15,726][49750] Updated weights for policy 0, policy_version 276221 (0.0033) [2024-04-26 21:36:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4525637632. Throughput: 0: 50854.0. Samples: 2278552820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:36:17,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 21:36:19,509][49750] Updated weights for policy 0, policy_version 276231 (0.0026) [2024-04-26 21:36:22,060][49750] Updated weights for policy 0, policy_version 276241 (0.0030) [2024-04-26 21:36:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4525932544. Throughput: 0: 50889.7. Samples: 2278708420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:36:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 21:36:25,766][49750] Updated weights for policy 0, policy_version 276251 (0.0029) [2024-04-26 21:36:27,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4526145536. Throughput: 0: 50733.3. Samples: 2279009560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:36:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 21:36:28,451][49750] Updated weights for policy 0, policy_version 276261 (0.0029) [2024-04-26 21:36:32,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4526407680. Throughput: 0: 50705.0. Samples: 2279311480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:36:32,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 21:36:32,098][49750] Updated weights for policy 0, policy_version 276271 (0.0039) [2024-04-26 21:36:34,951][49750] Updated weights for policy 0, policy_version 276281 (0.0028) [2024-04-26 21:36:37,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4526669824. Throughput: 0: 50860.8. Samples: 2279466060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:36:37,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 21:36:38,651][49750] Updated weights for policy 0, policy_version 276291 (0.0037) [2024-04-26 21:36:41,004][49728] Signal inference workers to stop experience collection... (34050 times) [2024-04-26 21:36:41,047][49750] InferenceWorker_p0-w0: stopping experience collection (34050 times) [2024-04-26 21:36:41,065][49728] Signal inference workers to resume experience collection... (34050 times) [2024-04-26 21:36:41,067][49750] InferenceWorker_p0-w0: resuming experience collection (34050 times) [2024-04-26 21:36:41,421][49750] Updated weights for policy 0, policy_version 276301 (0.0029) [2024-04-26 21:36:42,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4526915584. Throughput: 0: 50676.3. Samples: 2279767500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:36:42,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 21:36:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000276301_4526915584.pth... 
[2024-04-26 21:36:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000275557_4514725888.pth [2024-04-26 21:36:45,321][49750] Updated weights for policy 0, policy_version 276311 (0.0034) [2024-04-26 21:36:47,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4527177728. Throughput: 0: 50683.3. Samples: 2280078500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:36:47,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 21:36:47,743][49750] Updated weights for policy 0, policy_version 276321 (0.0032) [2024-04-26 21:36:51,546][49750] Updated weights for policy 0, policy_version 276331 (0.0032) [2024-04-26 21:36:52,063][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4527407104. Throughput: 0: 50751.9. Samples: 2280228140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:36:52,072][49517] Avg episode reward: [(0, '0.684')] [2024-04-26 21:36:54,227][49750] Updated weights for policy 0, policy_version 276341 (0.0028) [2024-04-26 21:36:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4527685632. Throughput: 0: 50824.8. Samples: 2280533340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:36:57,072][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 21:36:57,942][49750] Updated weights for policy 0, policy_version 276351 (0.0028) [2024-04-26 21:37:00,856][49750] Updated weights for policy 0, policy_version 276361 (0.0033) [2024-04-26 21:37:02,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4527931392. Throughput: 0: 50623.8. Samples: 2280830900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:37:02,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 21:37:04,542][49750] Updated weights for policy 0, policy_version 276371 (0.0030) [2024-04-26 21:37:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4528209920. Throughput: 0: 50618.8. Samples: 2280986260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:37:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 21:37:07,332][49750] Updated weights for policy 0, policy_version 276381 (0.0028) [2024-04-26 21:37:10,841][49750] Updated weights for policy 0, policy_version 276391 (0.0031) [2024-04-26 21:37:12,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4528439296. Throughput: 0: 50762.8. Samples: 2281293880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:37:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 21:37:14,287][49750] Updated weights for policy 0, policy_version 276401 (0.0027) [2024-04-26 21:37:17,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4528685056. Throughput: 0: 50867.0. Samples: 2281600500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:37:17,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 21:37:17,231][49750] Updated weights for policy 0, policy_version 276411 (0.0034) [2024-04-26 21:37:20,716][49750] Updated weights for policy 0, policy_version 276421 (0.0030) [2024-04-26 21:37:22,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4528963584. Throughput: 0: 50629.0. Samples: 2281744360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-26 21:37:22,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 21:37:23,699][49750] Updated weights for policy 0, policy_version 276431 (0.0032) [2024-04-26 21:37:27,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4529192960. Throughput: 0: 50793.2. Samples: 2282053180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:37:27,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 21:37:27,113][49750] Updated weights for policy 0, policy_version 276441 (0.0033) [2024-04-26 21:37:30,030][49750] Updated weights for policy 0, policy_version 276451 (0.0030) [2024-04-26 21:37:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4529455104. Throughput: 0: 50639.6. Samples: 2282357280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:37:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:37:33,573][49750] Updated weights for policy 0, policy_version 276461 (0.0028) [2024-04-26 21:37:36,373][49750] Updated weights for policy 0, policy_version 276471 (0.0027) [2024-04-26 21:37:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4529717248. Throughput: 0: 50740.2. Samples: 2282511440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:37:37,063][49517] Avg episode reward: [(0, '0.700')] [2024-04-26 21:37:38,702][49728] Signal inference workers to stop experience collection... (34100 times) [2024-04-26 21:37:38,703][49728] Signal inference workers to resume experience collection... 
(34100 times) [2024-04-26 21:37:38,735][49750] InferenceWorker_p0-w0: stopping experience collection (34100 times) [2024-04-26 21:37:38,735][49750] InferenceWorker_p0-w0: resuming experience collection (34100 times) [2024-04-26 21:37:39,992][49750] Updated weights for policy 0, policy_version 276481 (0.0033) [2024-04-26 21:37:42,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4529963008. Throughput: 0: 50879.8. Samples: 2282822940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:37:42,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 21:37:42,877][49750] Updated weights for policy 0, policy_version 276491 (0.0038) [2024-04-26 21:37:46,550][49750] Updated weights for policy 0, policy_version 276501 (0.0028) [2024-04-26 21:37:47,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4530225152. Throughput: 0: 51026.2. Samples: 2283127080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:37:47,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 21:37:49,483][49750] Updated weights for policy 0, policy_version 276511 (0.0035) [2024-04-26 21:37:52,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4530487296. Throughput: 0: 50824.0. Samples: 2283273340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:37:52,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 21:37:53,101][49750] Updated weights for policy 0, policy_version 276521 (0.0029) [2024-04-26 21:37:56,043][49750] Updated weights for policy 0, policy_version 276531 (0.0030) [2024-04-26 21:37:57,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4530733056. Throughput: 0: 50779.1. Samples: 2283578940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:37:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 21:37:59,531][49750] Updated weights for policy 0, policy_version 276541 (0.0031) [2024-04-26 21:38:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4530995200. Throughput: 0: 50745.5. Samples: 2283884040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:38:02,063][49517] Avg episode reward: [(0, '0.711')] [2024-04-26 21:38:02,523][49750] Updated weights for policy 0, policy_version 276551 (0.0037) [2024-04-26 21:38:06,154][49750] Updated weights for policy 0, policy_version 276561 (0.0034) [2024-04-26 21:38:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4531240960. Throughput: 0: 50879.2. Samples: 2284033920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:38:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 21:38:08,824][49750] Updated weights for policy 0, policy_version 276571 (0.0037) [2024-04-26 21:38:12,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4531470336. Throughput: 0: 50741.6. Samples: 2284336560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:38:12,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 21:38:12,461][49750] Updated weights for policy 0, policy_version 276581 (0.0028) [2024-04-26 21:38:15,099][49750] Updated weights for policy 0, policy_version 276591 (0.0034) [2024-04-26 21:38:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4531748864. Throughput: 0: 50700.8. Samples: 2284638820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:38:17,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 21:38:18,860][49750] Updated weights for policy 0, policy_version 276601 (0.0035) [2024-04-26 21:38:21,605][49750] Updated weights for policy 0, policy_version 276611 (0.0031) [2024-04-26 21:38:22,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4532011008. Throughput: 0: 50752.0. Samples: 2284795280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:38:22,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 21:38:25,369][49750] Updated weights for policy 0, policy_version 276621 (0.0031) [2024-04-26 21:38:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 4532256768. Throughput: 0: 50751.2. Samples: 2285106740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:38:27,063][49517] Avg episode reward: [(0, '0.683')] [2024-04-26 21:38:28,104][49750] Updated weights for policy 0, policy_version 276631 (0.0030) [2024-04-26 21:38:31,896][49750] Updated weights for policy 0, policy_version 276641 (0.0030) [2024-04-26 21:38:32,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4532502528. Throughput: 0: 50851.2. Samples: 2285415380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-26 21:38:32,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 21:38:34,397][49750] Updated weights for policy 0, policy_version 276651 (0.0033) [2024-04-26 21:38:37,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.0, 300 sec: 50707.1). Total num frames: 4532748288. Throughput: 0: 50630.9. Samples: 2285551740. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:38:37,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 21:38:38,248][49750] Updated weights for policy 0, policy_version 276661 (0.0029) [2024-04-26 21:38:40,867][49750] Updated weights for policy 0, policy_version 276671 (0.0028) [2024-04-26 21:38:42,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4533026816. Throughput: 0: 50699.5. Samples: 2285860420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:38:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 21:38:42,150][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000276675_4533043200.pth... [2024-04-26 21:38:42,194][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000275930_4520837120.pth [2024-04-26 21:38:44,647][49750] Updated weights for policy 0, policy_version 276681 (0.0038) [2024-04-26 21:38:47,062][49517] Fps is (10 sec: 54068.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4533288960. Throughput: 0: 50782.2. Samples: 2286169240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:38:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 21:38:47,265][49750] Updated weights for policy 0, policy_version 276691 (0.0030) [2024-04-26 21:38:51,060][49750] Updated weights for policy 0, policy_version 276701 (0.0028) [2024-04-26 21:38:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4533518336. Throughput: 0: 50874.1. Samples: 2286323260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:38:52,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:38:53,695][49750] Updated weights for policy 0, policy_version 276711 (0.0042) [2024-04-26 21:38:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4533764096. Throughput: 0: 50936.9. Samples: 2286628720. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:38:57,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 21:38:57,282][49728] Signal inference workers to stop experience collection... (34150 times) [2024-04-26 21:38:57,283][49728] Signal inference workers to resume experience collection... (34150 times) [2024-04-26 21:38:57,296][49750] InferenceWorker_p0-w0: stopping experience collection (34150 times) [2024-04-26 21:38:57,318][49750] InferenceWorker_p0-w0: resuming experience collection (34150 times) [2024-04-26 21:38:57,415][49750] Updated weights for policy 0, policy_version 276721 (0.0036) [2024-04-26 21:38:59,989][49750] Updated weights for policy 0, policy_version 276731 (0.0034) [2024-04-26 21:39:02,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4534026240. Throughput: 0: 50875.7. Samples: 2286928220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:02,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 21:39:03,903][49750] Updated weights for policy 0, policy_version 276741 (0.0038) [2024-04-26 21:39:06,371][49750] Updated weights for policy 0, policy_version 276751 (0.0035) [2024-04-26 21:39:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4534288384. Throughput: 0: 50990.1. Samples: 2287089840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 21:39:10,403][49750] Updated weights for policy 0, policy_version 276761 (0.0034) [2024-04-26 21:39:12,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4534550528. Throughput: 0: 50793.8. Samples: 2287392460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 21:39:12,861][49750] Updated weights for policy 0, policy_version 276771 (0.0030) [2024-04-26 21:39:16,810][49750] Updated weights for policy 0, policy_version 276781 (0.0026) [2024-04-26 21:39:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4534779904. Throughput: 0: 50792.6. Samples: 2287701040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:17,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 21:39:19,265][49750] Updated weights for policy 0, policy_version 276791 (0.0029) [2024-04-26 21:39:22,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4535025664. Throughput: 0: 50887.4. Samples: 2287841660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:22,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 21:39:23,107][49750] Updated weights for policy 0, policy_version 276801 (0.0036) [2024-04-26 21:39:25,523][49750] Updated weights for policy 0, policy_version 276811 (0.0029) [2024-04-26 21:39:27,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4535320576. Throughput: 0: 50769.0. Samples: 2288145020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 21:39:29,543][49750] Updated weights for policy 0, policy_version 276821 (0.0033) [2024-04-26 21:39:32,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4535582720. Throughput: 0: 50854.7. Samples: 2288457700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:32,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 21:39:32,346][49750] Updated weights for policy 0, policy_version 276831 (0.0030) [2024-04-26 21:39:36,109][49750] Updated weights for policy 0, policy_version 276841 (0.0030) [2024-04-26 21:39:37,062][49517] Fps is (10 sec: 49151.3, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4535812096. Throughput: 0: 50963.6. Samples: 2288616620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:37,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 21:39:38,655][49750] Updated weights for policy 0, policy_version 276851 (0.0028) [2024-04-26 21:39:42,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4536041472. Throughput: 0: 50905.9. Samples: 2288919480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:42,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 21:39:42,574][49750] Updated weights for policy 0, policy_version 276861 (0.0034) [2024-04-26 21:39:44,939][49750] Updated weights for policy 0, policy_version 276871 (0.0034) [2024-04-26 21:39:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4536336384. Throughput: 0: 50960.3. Samples: 2289221440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 21:39:47,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 21:39:49,068][49750] Updated weights for policy 0, policy_version 276881 (0.0029) [2024-04-26 21:39:50,770][49728] Signal inference workers to stop experience collection... (34200 times) [2024-04-26 21:39:50,775][49728] Signal inference workers to resume experience collection... 
(34200 times) [2024-04-26 21:39:50,801][49750] InferenceWorker_p0-w0: stopping experience collection (34200 times) [2024-04-26 21:39:50,802][49750] InferenceWorker_p0-w0: resuming experience collection (34200 times) [2024-04-26 21:39:51,301][49750] Updated weights for policy 0, policy_version 276891 (0.0029) [2024-04-26 21:39:52,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4536582144. Throughput: 0: 50945.5. Samples: 2289382380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:39:52,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 21:39:55,427][49750] Updated weights for policy 0, policy_version 276901 (0.0031) [2024-04-26 21:39:57,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4536844288. Throughput: 0: 50943.6. Samples: 2289684920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:39:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 21:39:57,911][49750] Updated weights for policy 0, policy_version 276911 (0.0031) [2024-04-26 21:40:01,875][49750] Updated weights for policy 0, policy_version 276921 (0.0030) [2024-04-26 21:40:02,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4537073664. Throughput: 0: 50978.6. Samples: 2289995080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:02,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 21:40:04,268][49750] Updated weights for policy 0, policy_version 276931 (0.0041) [2024-04-26 21:40:07,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4537319424. Throughput: 0: 50755.0. Samples: 2290125640. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:07,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 21:40:08,366][49750] Updated weights for policy 0, policy_version 276941 (0.0032) [2024-04-26 21:40:10,709][49750] Updated weights for policy 0, policy_version 276951 (0.0027) [2024-04-26 21:40:12,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4537597952. Throughput: 0: 50889.3. Samples: 2290435040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:40:14,997][49750] Updated weights for policy 0, policy_version 276961 (0.0030) [2024-04-26 21:40:17,063][49517] Fps is (10 sec: 55705.4, 60 sec: 51609.5, 300 sec: 50929.3). Total num frames: 4537876480. Throughput: 0: 50792.7. Samples: 2290743380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 21:40:17,304][49750] Updated weights for policy 0, policy_version 276971 (0.0031) [2024-04-26 21:40:21,328][49750] Updated weights for policy 0, policy_version 276981 (0.0034) [2024-04-26 21:40:22,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4538073088. Throughput: 0: 50723.1. Samples: 2290899160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:22,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 21:40:23,978][49750] Updated weights for policy 0, policy_version 276991 (0.0031) [2024-04-26 21:40:27,062][49517] Fps is (10 sec: 44237.3, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4538318848. Throughput: 0: 50761.7. Samples: 2291203760. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:27,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 21:40:27,695][49750] Updated weights for policy 0, policy_version 277001 (0.0032) [2024-04-26 21:40:30,395][49750] Updated weights for policy 0, policy_version 277011 (0.0031) [2024-04-26 21:40:32,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4538613760. Throughput: 0: 50704.9. Samples: 2291503160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:32,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 21:40:34,236][49750] Updated weights for policy 0, policy_version 277021 (0.0027) [2024-04-26 21:40:36,831][49750] Updated weights for policy 0, policy_version 277031 (0.0029) [2024-04-26 21:40:37,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4538875904. Throughput: 0: 50771.5. Samples: 2291667100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:37,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 21:40:40,662][49750] Updated weights for policy 0, policy_version 277041 (0.0044) [2024-04-26 21:40:42,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51609.3, 300 sec: 50929.2). Total num frames: 4539138048. Throughput: 0: 50896.1. Samples: 2291975260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:42,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 21:40:42,158][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000277048_4539154432.pth... [2024-04-26 21:40:42,205][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000276301_4526915584.pth [2024-04-26 21:40:43,312][49750] Updated weights for policy 0, policy_version 277051 (0.0039) [2024-04-26 21:40:47,043][49750] Updated weights for policy 0, policy_version 277061 (0.0029) [2024-04-26 21:40:47,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50762.6). 
Total num frames: 4539367424. Throughput: 0: 50858.1. Samples: 2292283700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 21:40:47,593][49728] Signal inference workers to stop experience collection... (34250 times) [2024-04-26 21:40:47,593][49728] Signal inference workers to resume experience collection... (34250 times) [2024-04-26 21:40:47,613][49750] InferenceWorker_p0-w0: stopping experience collection (34250 times) [2024-04-26 21:40:47,613][49750] InferenceWorker_p0-w0: resuming experience collection (34250 times) [2024-04-26 21:40:49,728][49750] Updated weights for policy 0, policy_version 277071 (0.0031) [2024-04-26 21:40:52,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4539613184. Throughput: 0: 51038.7. Samples: 2292422380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 21:40:53,587][49750] Updated weights for policy 0, policy_version 277081 (0.0028) [2024-04-26 21:40:56,180][49750] Updated weights for policy 0, policy_version 277091 (0.0033) [2024-04-26 21:40:57,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 4539891712. Throughput: 0: 50884.2. Samples: 2292724840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 21:40:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 21:41:00,096][49750] Updated weights for policy 0, policy_version 277101 (0.0030) [2024-04-26 21:41:02,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4540137472. Throughput: 0: 50743.7. Samples: 2293026840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 21:41:02,771][49750] Updated weights for policy 0, policy_version 277111 (0.0029) [2024-04-26 21:41:06,379][49750] Updated weights for policy 0, policy_version 277121 (0.0034) [2024-04-26 21:41:07,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4540399616. Throughput: 0: 50836.1. Samples: 2293186780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:07,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 21:41:09,074][49750] Updated weights for policy 0, policy_version 277131 (0.0028) [2024-04-26 21:41:12,063][49517] Fps is (10 sec: 45874.8, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4540596224. Throughput: 0: 50761.2. Samples: 2293488020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:12,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 21:41:12,792][49750] Updated weights for policy 0, policy_version 277141 (0.0033) [2024-04-26 21:41:15,538][49750] Updated weights for policy 0, policy_version 277151 (0.0028) [2024-04-26 21:41:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4540891136. Throughput: 0: 50915.2. Samples: 2293794340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:17,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-26 21:41:19,387][49750] Updated weights for policy 0, policy_version 277161 (0.0028) [2024-04-26 21:41:22,026][49750] Updated weights for policy 0, policy_version 277171 (0.0035) [2024-04-26 21:41:22,063][49517] Fps is (10 sec: 57344.1, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4541169664. Throughput: 0: 50654.6. Samples: 2293946560. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:22,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 21:41:25,805][49750] Updated weights for policy 0, policy_version 277181 (0.0031) [2024-04-26 21:41:27,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51882.6, 300 sec: 50929.2). Total num frames: 4541431808. Throughput: 0: 50679.7. Samples: 2294255840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:27,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 21:41:28,518][49750] Updated weights for policy 0, policy_version 277191 (0.0030) [2024-04-26 21:41:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4541644800. Throughput: 0: 50737.5. Samples: 2294566880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:32,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 21:41:32,197][49750] Updated weights for policy 0, policy_version 277201 (0.0028) [2024-04-26 21:41:34,818][49750] Updated weights for policy 0, policy_version 277211 (0.0029) [2024-04-26 21:41:37,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.2, 300 sec: 50762.7). Total num frames: 4541890560. Throughput: 0: 50710.2. Samples: 2294704340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 21:41:38,607][49728] Signal inference workers to stop experience collection... (34300 times) [2024-04-26 21:41:38,627][49750] InferenceWorker_p0-w0: stopping experience collection (34300 times) [2024-04-26 21:41:38,712][49728] Signal inference workers to resume experience collection... 
(34300 times) [2024-04-26 21:41:38,712][49750] InferenceWorker_p0-w0: resuming experience collection (34300 times) [2024-04-26 21:41:38,714][49750] Updated weights for policy 0, policy_version 277221 (0.0030) [2024-04-26 21:41:41,258][49750] Updated weights for policy 0, policy_version 277231 (0.0028) [2024-04-26 21:41:42,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 4542169088. Throughput: 0: 50887.1. Samples: 2295014760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 21:41:45,009][49750] Updated weights for policy 0, policy_version 277241 (0.0024) [2024-04-26 21:41:47,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4542431232. Throughput: 0: 50809.2. Samples: 2295313260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:47,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 21:41:47,610][49750] Updated weights for policy 0, policy_version 277251 (0.0033) [2024-04-26 21:41:51,309][49750] Updated weights for policy 0, policy_version 277261 (0.0035) [2024-04-26 21:41:52,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4542693376. Throughput: 0: 50892.8. Samples: 2295476960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 21:41:54,188][49750] Updated weights for policy 0, policy_version 277271 (0.0033) [2024-04-26 21:41:57,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49971.4, 300 sec: 50707.1). Total num frames: 4542889984. Throughput: 0: 51139.7. Samples: 2295789300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:41:57,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 21:41:57,756][49750] Updated weights for policy 0, policy_version 277281 (0.0030) [2024-04-26 21:42:00,653][49750] Updated weights for policy 0, policy_version 277291 (0.0028) [2024-04-26 21:42:02,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4543168512. Throughput: 0: 50909.8. Samples: 2296085280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:42:02,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 21:42:04,181][49750] Updated weights for policy 0, policy_version 277301 (0.0032) [2024-04-26 21:42:07,063][49517] Fps is (10 sec: 57343.8, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4543463424. Throughput: 0: 50965.9. Samples: 2296240020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:42:07,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 21:42:07,067][49750] Updated weights for policy 0, policy_version 277311 (0.0029) [2024-04-26 21:42:10,601][49750] Updated weights for policy 0, policy_version 277321 (0.0038) [2024-04-26 21:42:12,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51882.7, 300 sec: 50929.3). Total num frames: 4543709184. Throughput: 0: 50890.8. Samples: 2296545920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-26 21:42:12,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 21:42:13,731][49750] Updated weights for policy 0, policy_version 277331 (0.0033) [2024-04-26 21:42:17,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4543938560. Throughput: 0: 50869.7. Samples: 2296856020. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:42:17,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 21:42:17,073][49750] Updated weights for policy 0, policy_version 277341 (0.0035) [2024-04-26 21:42:20,110][49750] Updated weights for policy 0, policy_version 277351 (0.0026) [2024-04-26 21:42:22,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4544167936. Throughput: 0: 50890.4. Samples: 2296994400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:42:22,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 21:42:23,405][49728] Signal inference workers to stop experience collection... (34350 times) [2024-04-26 21:42:23,406][49728] Signal inference workers to resume experience collection... (34350 times) [2024-04-26 21:42:23,427][49750] InferenceWorker_p0-w0: stopping experience collection (34350 times) [2024-04-26 21:42:23,428][49750] InferenceWorker_p0-w0: resuming experience collection (34350 times) [2024-04-26 21:42:23,547][49750] Updated weights for policy 0, policy_version 277361 (0.0028) [2024-04-26 21:42:26,519][49750] Updated weights for policy 0, policy_version 277371 (0.0031) [2024-04-26 21:42:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4544446464. Throughput: 0: 50792.2. Samples: 2297300400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:42:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 21:42:29,904][49750] Updated weights for policy 0, policy_version 277381 (0.0034) [2024-04-26 21:42:32,062][49517] Fps is (10 sec: 55705.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4544724992. Throughput: 0: 50945.4. Samples: 2297605800. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:42:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 21:42:32,964][49750] Updated weights for policy 0, policy_version 277391 (0.0030) [2024-04-26 21:42:36,370][49750] Updated weights for policy 0, policy_version 277401 (0.0034) [2024-04-26 21:42:37,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4544987136. Throughput: 0: 50844.3. Samples: 2297764960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:42:37,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 21:42:39,477][49750] Updated weights for policy 0, policy_version 277411 (0.0027) [2024-04-26 21:42:42,062][49517] Fps is (10 sec: 45875.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4545183744. Throughput: 0: 50699.9. Samples: 2298070800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:42:42,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:42:42,099][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000277417_4545200128.pth... [2024-04-26 21:42:42,166][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000276675_4533043200.pth [2024-04-26 21:42:42,749][49750] Updated weights for policy 0, policy_version 277421 (0.0035) [2024-04-26 21:42:45,959][49750] Updated weights for policy 0, policy_version 277431 (0.0032) [2024-04-26 21:42:47,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4545462272. Throughput: 0: 50846.6. Samples: 2298373380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:42:47,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 21:42:49,172][49750] Updated weights for policy 0, policy_version 277441 (0.0035) [2024-04-26 21:42:52,063][49517] Fps is (10 sec: 54066.9, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4545724416. Throughput: 0: 50630.1. Samples: 2298518380. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:42:52,072][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 21:42:52,474][49750] Updated weights for policy 0, policy_version 277451 (0.0032) [2024-04-26 21:42:55,594][49750] Updated weights for policy 0, policy_version 277461 (0.0034) [2024-04-26 21:42:57,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51882.5, 300 sec: 50873.7). Total num frames: 4546002944. Throughput: 0: 50725.7. Samples: 2298828580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:42:57,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 21:42:58,866][49750] Updated weights for policy 0, policy_version 277471 (0.0038) [2024-04-26 21:43:01,988][49750] Updated weights for policy 0, policy_version 277481 (0.0034) [2024-04-26 21:43:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4546248704. Throughput: 0: 50728.8. Samples: 2299138820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:43:02,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 21:43:05,490][49750] Updated weights for policy 0, policy_version 277491 (0.0032) [2024-04-26 21:43:07,062][49517] Fps is (10 sec: 44237.9, 60 sec: 49698.2, 300 sec: 50762.7). Total num frames: 4546445312. Throughput: 0: 50865.4. Samples: 2299283340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:43:07,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 21:43:08,443][49750] Updated weights for policy 0, policy_version 277501 (0.0029) [2024-04-26 21:43:12,031][49750] Updated weights for policy 0, policy_version 277511 (0.0032) [2024-04-26 21:43:12,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4546740224. Throughput: 0: 50799.9. Samples: 2299586400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:43:12,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 21:43:14,885][49750] Updated weights for policy 0, policy_version 277521 (0.0034) [2024-04-26 21:43:17,062][49517] Fps is (10 sec: 55704.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4547002368. Throughput: 0: 50676.9. Samples: 2299886260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:43:17,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 21:43:18,561][49750] Updated weights for policy 0, policy_version 277531 (0.0034) [2024-04-26 21:43:21,371][49750] Updated weights for policy 0, policy_version 277541 (0.0033) [2024-04-26 21:43:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4547264512. Throughput: 0: 50928.1. Samples: 2300056720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 21:43:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 21:43:24,744][49728] Signal inference workers to stop experience collection... (34400 times) [2024-04-26 21:43:24,744][49728] Signal inference workers to resume experience collection... (34400 times) [2024-04-26 21:43:24,759][49750] InferenceWorker_p0-w0: stopping experience collection (34400 times) [2024-04-26 21:43:24,759][49750] InferenceWorker_p0-w0: resuming experience collection (34400 times) [2024-04-26 21:43:24,876][49750] Updated weights for policy 0, policy_version 277551 (0.0029) [2024-04-26 21:43:27,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4547477504. Throughput: 0: 50737.4. Samples: 2300353980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:43:27,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 21:43:27,847][49750] Updated weights for policy 0, policy_version 277561 (0.0030) [2024-04-26 21:43:31,378][49750] Updated weights for policy 0, policy_version 277571 (0.0031) [2024-04-26 21:43:32,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 4547739648. Throughput: 0: 50703.5. Samples: 2300655040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:43:32,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 21:43:34,336][49750] Updated weights for policy 0, policy_version 277581 (0.0032) [2024-04-26 21:43:37,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4548018176. Throughput: 0: 50739.6. Samples: 2300801660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:43:37,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 21:43:37,853][49750] Updated weights for policy 0, policy_version 277591 (0.0033) [2024-04-26 21:43:40,835][49750] Updated weights for policy 0, policy_version 277601 (0.0030) [2024-04-26 21:43:42,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4548280320. Throughput: 0: 50725.0. Samples: 2301111200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:43:42,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 21:43:44,187][49750] Updated weights for policy 0, policy_version 277611 (0.0030) [2024-04-26 21:43:47,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4548526080. Throughput: 0: 50679.8. Samples: 2301419400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:43:47,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 21:43:47,143][49750] Updated weights for policy 0, policy_version 277621 (0.0032) [2024-04-26 21:43:50,688][49750] Updated weights for policy 0, policy_version 277631 (0.0035) [2024-04-26 21:43:52,063][49517] Fps is (10 sec: 45874.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4548739072. Throughput: 0: 50781.5. Samples: 2301568520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:43:52,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 21:43:53,636][49750] Updated weights for policy 0, policy_version 277641 (0.0029) [2024-04-26 21:43:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4549017600. Throughput: 0: 50799.6. Samples: 2301872380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:43:57,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 21:43:57,234][49750] Updated weights for policy 0, policy_version 277651 (0.0032) [2024-04-26 21:44:00,029][49750] Updated weights for policy 0, policy_version 277661 (0.0030) [2024-04-26 21:44:02,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4549279744. Throughput: 0: 50822.6. Samples: 2302173280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:44:02,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 21:44:03,663][49750] Updated weights for policy 0, policy_version 277671 (0.0033) [2024-04-26 21:44:06,572][49750] Updated weights for policy 0, policy_version 277681 (0.0033) [2024-04-26 21:44:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4549541888. Throughput: 0: 50664.1. Samples: 2302336600. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:44:07,063][49517] Avg episode reward: [(0, '0.683')] [2024-04-26 21:44:10,057][49750] Updated weights for policy 0, policy_version 277691 (0.0034) [2024-04-26 21:44:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4549771264. Throughput: 0: 50789.0. Samples: 2302639480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:44:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 21:44:12,869][49750] Updated weights for policy 0, policy_version 277701 (0.0033) [2024-04-26 21:44:16,471][49750] Updated weights for policy 0, policy_version 277711 (0.0031) [2024-04-26 21:44:17,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.2, 300 sec: 50818.1). Total num frames: 4550017024. Throughput: 0: 50744.0. Samples: 2302938520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:44:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 21:44:19,266][49750] Updated weights for policy 0, policy_version 277721 (0.0024) [2024-04-26 21:44:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4550279168. Throughput: 0: 50807.2. Samples: 2303087980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:44:22,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 21:44:22,923][49750] Updated weights for policy 0, policy_version 277731 (0.0032) [2024-04-26 21:44:25,494][49728] Signal inference workers to stop experience collection... (34450 times) [2024-04-26 21:44:25,542][49750] InferenceWorker_p0-w0: stopping experience collection (34450 times) [2024-04-26 21:44:25,563][49728] Signal inference workers to resume experience collection... 
(34450 times) [2024-04-26 21:44:25,564][49750] InferenceWorker_p0-w0: resuming experience collection (34450 times) [2024-04-26 21:44:25,696][49750] Updated weights for policy 0, policy_version 277741 (0.0030) [2024-04-26 21:44:27,062][49517] Fps is (10 sec: 55705.9, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4550574080. Throughput: 0: 50777.3. Samples: 2303396180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:44:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 21:44:29,382][49750] Updated weights for policy 0, policy_version 277751 (0.0032) [2024-04-26 21:44:32,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4550803456. Throughput: 0: 50822.0. Samples: 2303706400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:44:32,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:44:32,261][49750] Updated weights for policy 0, policy_version 277761 (0.0026) [2024-04-26 21:44:35,745][49750] Updated weights for policy 0, policy_version 277771 (0.0034) [2024-04-26 21:44:37,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4551049216. Throughput: 0: 50914.2. Samples: 2303859660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 21:44:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 21:44:38,646][49750] Updated weights for policy 0, policy_version 277781 (0.0034) [2024-04-26 21:44:42,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4551311360. Throughput: 0: 50803.6. Samples: 2304158540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:44:42,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 21:44:42,160][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000277791_4551327744.pth... 
[2024-04-26 21:44:42,166][49750] Updated weights for policy 0, policy_version 277791 (0.0029) [2024-04-26 21:44:42,203][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000277048_4539154432.pth [2024-04-26 21:44:45,025][49750] Updated weights for policy 0, policy_version 277801 (0.0031) [2024-04-26 21:44:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4551573504. Throughput: 0: 50886.3. Samples: 2304463160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:44:47,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:44:48,790][49750] Updated weights for policy 0, policy_version 277811 (0.0034) [2024-04-26 21:44:51,453][49750] Updated weights for policy 0, policy_version 277821 (0.0031) [2024-04-26 21:44:52,062][49517] Fps is (10 sec: 50789.7, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4551819264. Throughput: 0: 50656.7. Samples: 2304616160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:44:52,063][49517] Avg episode reward: [(0, '0.711')] [2024-04-26 21:44:55,465][49750] Updated weights for policy 0, policy_version 277831 (0.0032) [2024-04-26 21:44:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4552065024. Throughput: 0: 50797.8. Samples: 2304925380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:44:57,063][49517] Avg episode reward: [(0, '0.678')] [2024-04-26 21:44:57,849][49750] Updated weights for policy 0, policy_version 277841 (0.0036) [2024-04-26 21:45:01,723][49750] Updated weights for policy 0, policy_version 277851 (0.0032) [2024-04-26 21:45:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4552310784. Throughput: 0: 50938.7. Samples: 2305230760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:02,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 21:45:04,228][49750] Updated weights for policy 0, policy_version 277861 (0.0032) [2024-04-26 21:45:07,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4552572928. Throughput: 0: 50767.4. Samples: 2305372520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 21:45:08,122][49750] Updated weights for policy 0, policy_version 277871 (0.0030) [2024-04-26 21:45:10,616][49750] Updated weights for policy 0, policy_version 277881 (0.0027) [2024-04-26 21:45:12,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.6, 300 sec: 50762.7). Total num frames: 4552851456. Throughput: 0: 50763.3. Samples: 2305680520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:12,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 21:45:14,510][49750] Updated weights for policy 0, policy_version 277891 (0.0038) [2024-04-26 21:45:17,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 4553113600. Throughput: 0: 50769.4. Samples: 2305991020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:17,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 21:45:17,528][49750] Updated weights for policy 0, policy_version 277901 (0.0030) [2024-04-26 21:45:21,017][49750] Updated weights for policy 0, policy_version 277911 (0.0030) [2024-04-26 21:45:22,062][49517] Fps is (10 sec: 47512.9, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4553326592. Throughput: 0: 50755.2. Samples: 2306143640. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:22,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 21:45:23,840][49750] Updated weights for policy 0, policy_version 277921 (0.0032) [2024-04-26 21:45:27,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4553588736. Throughput: 0: 50871.5. Samples: 2306447760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:27,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 21:45:27,332][49750] Updated weights for policy 0, policy_version 277931 (0.0027) [2024-04-26 21:45:30,275][49750] Updated weights for policy 0, policy_version 277941 (0.0028) [2024-04-26 21:45:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4553850880. Throughput: 0: 50971.1. Samples: 2306756860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 21:45:33,676][49750] Updated weights for policy 0, policy_version 277951 (0.0033) [2024-04-26 21:45:36,862][49750] Updated weights for policy 0, policy_version 277961 (0.0035) [2024-04-26 21:45:37,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4554113024. Throughput: 0: 50894.2. Samples: 2306906400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:37,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 21:45:40,101][49750] Updated weights for policy 0, policy_version 277971 (0.0034) [2024-04-26 21:45:42,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4554375168. Throughput: 0: 50679.7. Samples: 2307205980. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:42,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 21:45:43,331][49750] Updated weights for policy 0, policy_version 277981 (0.0027) [2024-04-26 21:45:46,683][49750] Updated weights for policy 0, policy_version 277991 (0.0032) [2024-04-26 21:45:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4554620928. Throughput: 0: 50775.0. Samples: 2307515640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-26 21:45:47,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 21:45:49,840][49750] Updated weights for policy 0, policy_version 278001 (0.0029) [2024-04-26 21:45:52,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4554850304. Throughput: 0: 50974.6. Samples: 2307666380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:45:52,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 21:45:53,027][49750] Updated weights for policy 0, policy_version 278011 (0.0035) [2024-04-26 21:45:54,537][49728] Signal inference workers to stop experience collection... (34500 times) [2024-04-26 21:45:54,537][49728] Signal inference workers to resume experience collection... (34500 times) [2024-04-26 21:45:54,551][49750] InferenceWorker_p0-w0: stopping experience collection (34500 times) [2024-04-26 21:45:54,551][49750] InferenceWorker_p0-w0: resuming experience collection (34500 times) [2024-04-26 21:45:56,156][49750] Updated weights for policy 0, policy_version 278021 (0.0031) [2024-04-26 21:45:57,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4555112448. Throughput: 0: 50800.3. Samples: 2307966540. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:45:57,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 21:45:59,440][49750] Updated weights for policy 0, policy_version 278031 (0.0027) [2024-04-26 21:46:02,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4555390976. Throughput: 0: 50706.7. Samples: 2308272820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:02,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 21:46:02,454][49750] Updated weights for policy 0, policy_version 278041 (0.0032) [2024-04-26 21:46:05,805][49750] Updated weights for policy 0, policy_version 278051 (0.0033) [2024-04-26 21:46:07,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4555620352. Throughput: 0: 50860.0. Samples: 2308432340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:07,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 21:46:09,145][49750] Updated weights for policy 0, policy_version 278061 (0.0032) [2024-04-26 21:46:12,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4555882496. Throughput: 0: 50919.0. Samples: 2308739120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 21:46:12,287][49750] Updated weights for policy 0, policy_version 278071 (0.0028) [2024-04-26 21:46:15,690][49750] Updated weights for policy 0, policy_version 278081 (0.0027) [2024-04-26 21:46:17,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4556144640. Throughput: 0: 50860.7. Samples: 2309045600. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:17,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 21:46:18,611][49750] Updated weights for policy 0, policy_version 278091 (0.0034) [2024-04-26 21:46:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4556406784. Throughput: 0: 50843.9. Samples: 2309194380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:22,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 21:46:22,064][49750] Updated weights for policy 0, policy_version 278101 (0.0039) [2024-04-26 21:46:24,954][49750] Updated weights for policy 0, policy_version 278111 (0.0030) [2024-04-26 21:46:27,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4556668928. Throughput: 0: 50996.2. Samples: 2309500800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 21:46:28,396][49750] Updated weights for policy 0, policy_version 278121 (0.0033) [2024-04-26 21:46:31,429][49750] Updated weights for policy 0, policy_version 278131 (0.0035) [2024-04-26 21:46:32,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4556914688. Throughput: 0: 50875.0. Samples: 2309805020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:32,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 21:46:34,902][49750] Updated weights for policy 0, policy_version 278141 (0.0029) [2024-04-26 21:46:37,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4557160448. Throughput: 0: 51001.4. Samples: 2309961440. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:37,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 21:46:37,889][49750] Updated weights for policy 0, policy_version 278151 (0.0029) [2024-04-26 21:46:41,362][49750] Updated weights for policy 0, policy_version 278161 (0.0026) [2024-04-26 21:46:42,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4557406208. Throughput: 0: 51119.6. Samples: 2310266920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:42,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 21:46:42,234][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000278164_4557438976.pth... [2024-04-26 21:46:42,279][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000277417_4545200128.pth [2024-04-26 21:46:44,240][49750] Updated weights for policy 0, policy_version 278171 (0.0032) [2024-04-26 21:46:47,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4557684736. Throughput: 0: 51093.6. Samples: 2310572040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:47,064][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 21:46:47,824][49750] Updated weights for policy 0, policy_version 278181 (0.0031) [2024-04-26 21:46:50,730][49750] Updated weights for policy 0, policy_version 278191 (0.0026) [2024-04-26 21:46:52,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.7, 300 sec: 50984.8). Total num frames: 4557930496. Throughput: 0: 51150.2. Samples: 2310734100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 21:46:54,172][49728] Signal inference workers to stop experience collection... 
(34550 times) [2024-04-26 21:46:54,211][49750] InferenceWorker_p0-w0: stopping experience collection (34550 times) [2024-04-26 21:46:54,245][49728] Signal inference workers to resume experience collection... (34550 times) [2024-04-26 21:46:54,246][49750] InferenceWorker_p0-w0: resuming experience collection (34550 times) [2024-04-26 21:46:54,248][49750] Updated weights for policy 0, policy_version 278201 (0.0035) [2024-04-26 21:46:57,040][49750] Updated weights for policy 0, policy_version 278211 (0.0034) [2024-04-26 21:46:57,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 4558209024. Throughput: 0: 51073.4. Samples: 2311037420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:46:57,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 21:47:00,564][49750] Updated weights for policy 0, policy_version 278221 (0.0026) [2024-04-26 21:47:02,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4558422016. Throughput: 0: 50969.4. Samples: 2311339220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 21:47:03,515][49750] Updated weights for policy 0, policy_version 278231 (0.0031) [2024-04-26 21:47:06,998][49750] Updated weights for policy 0, policy_version 278241 (0.0027) [2024-04-26 21:47:07,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4558700544. Throughput: 0: 51064.1. Samples: 2311492260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:07,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 21:47:09,983][49750] Updated weights for policy 0, policy_version 278251 (0.0035) [2024-04-26 21:47:12,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4558962688. Throughput: 0: 50893.7. Samples: 2311791020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 21:47:13,550][49750] Updated weights for policy 0, policy_version 278261 (0.0029) [2024-04-26 21:47:16,278][49750] Updated weights for policy 0, policy_version 278271 (0.0029) [2024-04-26 21:47:17,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4559208448. Throughput: 0: 50930.7. Samples: 2312096900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:17,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 21:47:20,038][49750] Updated weights for policy 0, policy_version 278281 (0.0033) [2024-04-26 21:47:22,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4559437824. Throughput: 0: 50922.5. Samples: 2312252940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 21:47:22,661][49750] Updated weights for policy 0, policy_version 278291 (0.0030) [2024-04-26 21:47:26,411][49750] Updated weights for policy 0, policy_version 278301 (0.0026) [2024-04-26 21:47:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4559699968. Throughput: 0: 50904.8. Samples: 2312557640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:27,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 21:47:29,162][49750] Updated weights for policy 0, policy_version 278311 (0.0032) [2024-04-26 21:47:32,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4559978496. Throughput: 0: 50835.7. Samples: 2312859640. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:32,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 21:47:32,797][49750] Updated weights for policy 0, policy_version 278321 (0.0031) [2024-04-26 21:47:35,929][49750] Updated weights for policy 0, policy_version 278331 (0.0032) [2024-04-26 21:47:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.6, 300 sec: 50929.3). Total num frames: 4560207872. Throughput: 0: 50784.0. Samples: 2313019380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:37,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 21:47:39,305][49750] Updated weights for policy 0, policy_version 278341 (0.0035) [2024-04-26 21:47:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4560486400. Throughput: 0: 50724.9. Samples: 2313320040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:42,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 21:47:42,472][49750] Updated weights for policy 0, policy_version 278351 (0.0026) [2024-04-26 21:47:45,804][49750] Updated weights for policy 0, policy_version 278361 (0.0027) [2024-04-26 21:47:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4560715776. Throughput: 0: 50866.8. Samples: 2313628220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:47,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 21:47:48,808][49750] Updated weights for policy 0, policy_version 278371 (0.0034) [2024-04-26 21:47:52,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4560977920. Throughput: 0: 50861.8. Samples: 2313781040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:52,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 21:47:52,230][49750] Updated weights for policy 0, policy_version 278381 (0.0030) [2024-04-26 21:47:55,149][49750] Updated weights for policy 0, policy_version 278391 (0.0037) [2024-04-26 21:47:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4561240064. Throughput: 0: 50969.9. Samples: 2314084660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:47:57,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 21:47:58,520][49750] Updated weights for policy 0, policy_version 278401 (0.0030) [2024-04-26 21:48:01,635][49750] Updated weights for policy 0, policy_version 278411 (0.0034) [2024-04-26 21:48:02,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4561485824. Throughput: 0: 50934.3. Samples: 2314388940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:48:02,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 21:48:04,894][49728] Signal inference workers to stop experience collection... (34600 times) [2024-04-26 21:48:04,894][49728] Signal inference workers to resume experience collection... (34600 times) [2024-04-26 21:48:04,933][49750] InferenceWorker_p0-w0: stopping experience collection (34600 times) [2024-04-26 21:48:04,933][49750] InferenceWorker_p0-w0: resuming experience collection (34600 times) [2024-04-26 21:48:05,032][49750] Updated weights for policy 0, policy_version 278421 (0.0038) [2024-04-26 21:48:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4561747968. Throughput: 0: 51045.3. Samples: 2314549980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:48:07,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 21:48:08,073][49750] Updated weights for policy 0, policy_version 278431 (0.0042) [2024-04-26 21:48:11,590][49750] Updated weights for policy 0, policy_version 278441 (0.0029) [2024-04-26 21:48:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4562010112. Throughput: 0: 51000.5. Samples: 2314852660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 21:48:12,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-26 21:48:14,602][49750] Updated weights for policy 0, policy_version 278451 (0.0034) [2024-04-26 21:48:17,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4562239488. Throughput: 0: 50970.2. Samples: 2315153300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:48:17,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 21:48:18,002][49750] Updated weights for policy 0, policy_version 278461 (0.0030) [2024-04-26 21:48:21,156][49750] Updated weights for policy 0, policy_version 278471 (0.0028) [2024-04-26 21:48:22,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 4562518016. Throughput: 0: 50883.4. Samples: 2315309140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:48:22,063][49517] Avg episode reward: [(0, '0.686')] [2024-04-26 21:48:24,363][49750] Updated weights for policy 0, policy_version 278481 (0.0028) [2024-04-26 21:48:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4562763776. Throughput: 0: 51143.1. Samples: 2315621480. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:48:27,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 21:48:27,490][49750] Updated weights for policy 0, policy_version 278491 (0.0040) [2024-04-26 21:48:30,772][49750] Updated weights for policy 0, policy_version 278501 (0.0030) [2024-04-26 21:48:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4563025920. Throughput: 0: 51147.9. Samples: 2315929880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:48:32,063][49517] Avg episode reward: [(0, '0.473')] [2024-04-26 21:48:33,992][49750] Updated weights for policy 0, policy_version 278511 (0.0036) [2024-04-26 21:48:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4563271680. Throughput: 0: 51059.1. Samples: 2316078700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:48:37,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 21:48:37,162][49750] Updated weights for policy 0, policy_version 278521 (0.0028) [2024-04-26 21:48:40,413][49750] Updated weights for policy 0, policy_version 278531 (0.0032) [2024-04-26 21:48:42,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4563550208. Throughput: 0: 51129.1. Samples: 2316385480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:48:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 21:48:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000278537_4563550208.pth... [2024-04-26 21:48:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000277791_4551327744.pth [2024-04-26 21:48:43,492][49750] Updated weights for policy 0, policy_version 278541 (0.0030) [2024-04-26 21:48:46,873][49750] Updated weights for policy 0, policy_version 278551 (0.0037) [2024-04-26 21:48:47,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.4, 300 sec: 50984.8). 
Total num frames: 4563779584. Throughput: 0: 51294.2. Samples: 2316697180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:48:47,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 21:48:50,040][49750] Updated weights for policy 0, policy_version 278561 (0.0036) [2024-04-26 21:48:52,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4564041728. Throughput: 0: 51149.1. Samples: 2316851700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:48:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 21:48:53,388][49750] Updated weights for policy 0, policy_version 278571 (0.0032) [2024-04-26 21:48:56,484][49750] Updated weights for policy 0, policy_version 278581 (0.0028) [2024-04-26 21:48:57,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4564287488. Throughput: 0: 51172.1. Samples: 2317155400. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:48:57,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 21:48:59,696][49750] Updated weights for policy 0, policy_version 278591 (0.0036) [2024-04-26 21:49:02,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4564533248. Throughput: 0: 51169.3. Samples: 2317455920. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:49:02,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 21:49:02,891][49750] Updated weights for policy 0, policy_version 278601 (0.0033) [2024-04-26 21:49:06,113][49750] Updated weights for policy 0, policy_version 278611 (0.0033) [2024-04-26 21:49:06,552][49728] Signal inference workers to stop experience collection... (34650 times) [2024-04-26 21:49:06,552][49728] Signal inference workers to resume experience collection... 
(34650 times) [2024-04-26 21:49:06,565][49750] InferenceWorker_p0-w0: stopping experience collection (34650 times) [2024-04-26 21:49:06,565][49750] InferenceWorker_p0-w0: resuming experience collection (34650 times) [2024-04-26 21:49:07,062][49517] Fps is (10 sec: 54066.4, 60 sec: 51336.5, 300 sec: 51040.3). Total num frames: 4564828160. Throughput: 0: 51107.7. Samples: 2317608980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:49:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 21:49:09,244][49750] Updated weights for policy 0, policy_version 278621 (0.0030) [2024-04-26 21:49:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50984.8). Total num frames: 4565057536. Throughput: 0: 50859.8. Samples: 2317910180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:49:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 21:49:12,481][49750] Updated weights for policy 0, policy_version 278631 (0.0028) [2024-04-26 21:49:15,854][49750] Updated weights for policy 0, policy_version 278641 (0.0033) [2024-04-26 21:49:17,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51609.5, 300 sec: 51040.3). Total num frames: 4565336064. Throughput: 0: 50921.2. Samples: 2318221340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:49:17,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 21:49:18,970][49750] Updated weights for policy 0, policy_version 278651 (0.0031) [2024-04-26 21:49:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4565549056. Throughput: 0: 50852.9. Samples: 2318367080. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-26 21:49:22,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-26 21:49:22,308][49750] Updated weights for policy 0, policy_version 278661 (0.0034) [2024-04-26 21:49:25,338][49750] Updated weights for policy 0, policy_version 278671 (0.0029) [2024-04-26 21:49:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4565843968. Throughput: 0: 51005.4. Samples: 2318680720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:49:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 21:49:28,737][49750] Updated weights for policy 0, policy_version 278681 (0.0027) [2024-04-26 21:49:31,750][49750] Updated weights for policy 0, policy_version 278691 (0.0028) [2024-04-26 21:49:32,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 4566089728. Throughput: 0: 50938.7. Samples: 2318989420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:49:32,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 21:49:35,060][49750] Updated weights for policy 0, policy_version 278701 (0.0032) [2024-04-26 21:49:37,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4566319104. Throughput: 0: 50857.4. Samples: 2319140280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:49:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 21:49:38,262][49750] Updated weights for policy 0, policy_version 278711 (0.0031) [2024-04-26 21:49:41,495][49750] Updated weights for policy 0, policy_version 278721 (0.0032) [2024-04-26 21:49:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4566597632. Throughput: 0: 50926.6. Samples: 2319447100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:49:42,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 21:49:44,643][49750] Updated weights for policy 0, policy_version 278731 (0.0030) [2024-04-26 21:49:47,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4566843392. Throughput: 0: 50939.0. Samples: 2319748180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:49:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 21:49:47,898][49750] Updated weights for policy 0, policy_version 278741 (0.0028) [2024-04-26 21:49:51,079][49750] Updated weights for policy 0, policy_version 278751 (0.0034) [2024-04-26 21:49:52,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.6, 300 sec: 51040.3). Total num frames: 4567121920. Throughput: 0: 50876.0. Samples: 2319898400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:49:52,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 21:49:54,491][49750] Updated weights for policy 0, policy_version 278761 (0.0039) [2024-04-26 21:49:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.3, 300 sec: 50929.3). Total num frames: 4567334912. Throughput: 0: 50919.8. Samples: 2320201560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:49:57,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 21:49:57,592][49750] Updated weights for policy 0, policy_version 278771 (0.0032) [2024-04-26 21:50:00,822][49750] Updated weights for policy 0, policy_version 278781 (0.0029) [2024-04-26 21:50:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4567613440. Throughput: 0: 50752.6. Samples: 2320505200. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:50:02,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:50:03,988][49750] Updated weights for policy 0, policy_version 278791 (0.0030) [2024-04-26 21:50:05,016][49728] Signal inference workers to stop experience collection... (34700 times) [2024-04-26 21:50:05,017][49728] Signal inference workers to resume experience collection... (34700 times) [2024-04-26 21:50:05,032][49750] InferenceWorker_p0-w0: stopping experience collection (34700 times) [2024-04-26 21:50:05,032][49750] InferenceWorker_p0-w0: resuming experience collection (34700 times) [2024-04-26 21:50:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4567859200. Throughput: 0: 50859.2. Samples: 2320655740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:50:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 21:50:07,148][49750] Updated weights for policy 0, policy_version 278801 (0.0032) [2024-04-26 21:50:10,402][49750] Updated weights for policy 0, policy_version 278811 (0.0029) [2024-04-26 21:50:12,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4568121344. Throughput: 0: 50740.5. Samples: 2320964040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:50:12,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 21:50:13,602][49750] Updated weights for policy 0, policy_version 278821 (0.0030) [2024-04-26 21:50:16,769][49750] Updated weights for policy 0, policy_version 278831 (0.0035) [2024-04-26 21:50:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 51040.3). Total num frames: 4568383488. Throughput: 0: 50775.2. Samples: 2321274300. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:50:17,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 21:50:20,022][49750] Updated weights for policy 0, policy_version 278841 (0.0039) [2024-04-26 21:50:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4568612864. Throughput: 0: 50776.4. Samples: 2321425220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:50:22,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 21:50:23,285][49750] Updated weights for policy 0, policy_version 278851 (0.0034) [2024-04-26 21:50:26,557][49750] Updated weights for policy 0, policy_version 278861 (0.0036) [2024-04-26 21:50:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 4568875008. Throughput: 0: 50738.1. Samples: 2321730320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:50:27,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 21:50:29,750][49750] Updated weights for policy 0, policy_version 278871 (0.0033) [2024-04-26 21:50:32,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4569120768. Throughput: 0: 50818.7. Samples: 2322035020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:50:32,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 21:50:32,852][49750] Updated weights for policy 0, policy_version 278881 (0.0030) [2024-04-26 21:50:36,096][49750] Updated weights for policy 0, policy_version 278891 (0.0034) [2024-04-26 21:50:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4569399296. Throughput: 0: 50921.0. Samples: 2322189840. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-26 21:50:37,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 21:50:39,142][49750] Updated weights for policy 0, policy_version 278901 (0.0028) [2024-04-26 21:50:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4569628672. Throughput: 0: 50937.3. Samples: 2322493740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:50:42,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 21:50:42,142][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000278909_4569645056.pth... [2024-04-26 21:50:42,183][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000278164_4557438976.pth [2024-04-26 21:50:42,548][49750] Updated weights for policy 0, policy_version 278911 (0.0033) [2024-04-26 21:50:45,531][49750] Updated weights for policy 0, policy_version 278921 (0.0028) [2024-04-26 21:50:47,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 51040.4). Total num frames: 4569907200. Throughput: 0: 50986.1. Samples: 2322799580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:50:47,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 21:50:48,997][49750] Updated weights for policy 0, policy_version 278931 (0.0031) [2024-04-26 21:50:51,946][49750] Updated weights for policy 0, policy_version 278941 (0.0032) [2024-04-26 21:50:52,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.5, 300 sec: 51040.3). Total num frames: 4570169344. Throughput: 0: 51024.9. Samples: 2322951860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:50:52,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 21:50:55,325][49750] Updated weights for policy 0, policy_version 278951 (0.0029) [2024-04-26 21:50:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4570415104. Throughput: 0: 51033.8. Samples: 2323260560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:50:57,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 21:50:58,496][49750] Updated weights for policy 0, policy_version 278961 (0.0030) [2024-04-26 21:51:01,729][49750] Updated weights for policy 0, policy_version 278971 (0.0033) [2024-04-26 21:51:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.3, 300 sec: 51040.3). Total num frames: 4570677248. Throughput: 0: 51024.3. Samples: 2323570400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 21:51:04,791][49750] Updated weights for policy 0, policy_version 278981 (0.0029) [2024-04-26 21:51:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4570906624. Throughput: 0: 50970.7. Samples: 2323718900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 21:51:08,174][49750] Updated weights for policy 0, policy_version 278991 (0.0030) [2024-04-26 21:51:11,300][49750] Updated weights for policy 0, policy_version 279001 (0.0029) [2024-04-26 21:51:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4571168768. Throughput: 0: 50942.3. Samples: 2324022720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:12,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 21:51:14,450][49750] Updated weights for policy 0, policy_version 279011 (0.0031) [2024-04-26 21:51:17,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4571430912. Throughput: 0: 51004.6. Samples: 2324330220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:17,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 21:51:17,670][49750] Updated weights for policy 0, policy_version 279021 (0.0033) [2024-04-26 21:51:20,876][49750] Updated weights for policy 0, policy_version 279031 (0.0033) [2024-04-26 21:51:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.7, 300 sec: 50929.2). Total num frames: 4571693056. Throughput: 0: 51092.9. Samples: 2324489020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:22,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 21:51:24,045][49728] Signal inference workers to stop experience collection... (34750 times) [2024-04-26 21:51:24,051][49728] Signal inference workers to resume experience collection... (34750 times) [2024-04-26 21:51:24,078][49750] InferenceWorker_p0-w0: stopping experience collection (34750 times) [2024-04-26 21:51:24,079][49750] InferenceWorker_p0-w0: resuming experience collection (34750 times) [2024-04-26 21:51:24,179][49750] Updated weights for policy 0, policy_version 279041 (0.0031) [2024-04-26 21:51:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4571922432. Throughput: 0: 51052.1. Samples: 2324791080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:27,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 21:51:27,355][49750] Updated weights for policy 0, policy_version 279051 (0.0032) [2024-04-26 21:51:30,697][49750] Updated weights for policy 0, policy_version 279061 (0.0034) [2024-04-26 21:51:32,063][49517] Fps is (10 sec: 49151.3, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4572184576. Throughput: 0: 51023.1. Samples: 2325095620. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 21:51:33,893][49750] Updated weights for policy 0, policy_version 279071 (0.0032) [2024-04-26 21:51:37,034][49750] Updated weights for policy 0, policy_version 279081 (0.0029) [2024-04-26 21:51:37,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.4, 300 sec: 51040.3). Total num frames: 4572463104. Throughput: 0: 50773.3. Samples: 2325236660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:37,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 21:51:40,284][49750] Updated weights for policy 0, policy_version 279091 (0.0027) [2024-04-26 21:51:42,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 4572725248. Throughput: 0: 50850.2. Samples: 2325548820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 21:51:43,515][49750] Updated weights for policy 0, policy_version 279101 (0.0031) [2024-04-26 21:51:46,680][49750] Updated weights for policy 0, policy_version 279111 (0.0033) [2024-04-26 21:51:47,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.6, 300 sec: 50984.8). Total num frames: 4572971008. Throughput: 0: 50779.3. Samples: 2325855460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:51:47,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 21:51:49,998][49750] Updated weights for policy 0, policy_version 279121 (0.0029) [2024-04-26 21:51:52,063][49517] Fps is (10 sec: 45874.5, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4573184000. Throughput: 0: 50811.9. Samples: 2326005440. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:51:52,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 21:51:53,107][49750] Updated weights for policy 0, policy_version 279131 (0.0032) [2024-04-26 21:51:56,569][49750] Updated weights for policy 0, policy_version 279141 (0.0028) [2024-04-26 21:51:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4573462528. Throughput: 0: 50821.3. Samples: 2326309680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:51:57,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 21:51:59,500][49750] Updated weights for policy 0, policy_version 279151 (0.0027) [2024-04-26 21:52:02,063][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4573708288. Throughput: 0: 50751.4. Samples: 2326614040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:02,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 21:52:02,944][49750] Updated weights for policy 0, policy_version 279161 (0.0038) [2024-04-26 21:52:05,992][49750] Updated weights for policy 0, policy_version 279171 (0.0028) [2024-04-26 21:52:07,063][49517] Fps is (10 sec: 50788.7, 60 sec: 51063.2, 300 sec: 50873.7). Total num frames: 4573970432. Throughput: 0: 50769.3. Samples: 2326773660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 21:52:09,387][49750] Updated weights for policy 0, policy_version 279181 (0.0031) [2024-04-26 21:52:12,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4574216192. Throughput: 0: 50822.2. Samples: 2327078080. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:12,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 21:52:12,430][49750] Updated weights for policy 0, policy_version 279191 (0.0027) [2024-04-26 21:52:16,037][49750] Updated weights for policy 0, policy_version 279201 (0.0032) [2024-04-26 21:52:17,062][49517] Fps is (10 sec: 47515.5, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4574445568. Throughput: 0: 50743.7. Samples: 2327379080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:17,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 21:52:18,597][49728] Signal inference workers to stop experience collection... (34800 times) [2024-04-26 21:52:18,632][49750] InferenceWorker_p0-w0: stopping experience collection (34800 times) [2024-04-26 21:52:18,667][49728] Signal inference workers to resume experience collection... (34800 times) [2024-04-26 21:52:18,668][49750] InferenceWorker_p0-w0: resuming experience collection (34800 times) [2024-04-26 21:52:18,793][49750] Updated weights for policy 0, policy_version 279211 (0.0025) [2024-04-26 21:52:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50929.3). Total num frames: 4574724096. Throughput: 0: 50772.1. Samples: 2327521400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:22,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 21:52:22,396][49750] Updated weights for policy 0, policy_version 279221 (0.0029) [2024-04-26 21:52:25,255][49750] Updated weights for policy 0, policy_version 279231 (0.0036) [2024-04-26 21:52:27,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4575002624. Throughput: 0: 50658.8. Samples: 2327828460. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:27,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 21:52:28,846][49750] Updated weights for policy 0, policy_version 279241 (0.0031) [2024-04-26 21:52:31,681][49750] Updated weights for policy 0, policy_version 279251 (0.0034) [2024-04-26 21:52:32,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51609.6, 300 sec: 51095.8). Total num frames: 4575281152. Throughput: 0: 50768.3. Samples: 2328140040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:32,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 21:52:35,337][49750] Updated weights for policy 0, policy_version 279261 (0.0030) [2024-04-26 21:52:37,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4575461376. Throughput: 0: 51047.0. Samples: 2328302540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:37,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 21:52:38,061][49750] Updated weights for policy 0, policy_version 279271 (0.0031) [2024-04-26 21:52:41,863][49750] Updated weights for policy 0, policy_version 279281 (0.0037) [2024-04-26 21:52:42,063][49517] Fps is (10 sec: 45874.7, 60 sec: 50244.1, 300 sec: 50929.2). Total num frames: 4575739904. Throughput: 0: 50875.8. Samples: 2328599100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:42,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 21:52:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000279281_4575739904.pth... [2024-04-26 21:52:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000278537_4563550208.pth [2024-04-26 21:52:44,586][49750] Updated weights for policy 0, policy_version 279291 (0.0038) [2024-04-26 21:52:47,063][49517] Fps is (10 sec: 54066.3, 60 sec: 50517.2, 300 sec: 50929.2). Total num frames: 4576002048. Throughput: 0: 50780.0. Samples: 2328899140. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 21:52:48,325][49750] Updated weights for policy 0, policy_version 279301 (0.0033) [2024-04-26 21:52:51,017][49750] Updated weights for policy 0, policy_version 279311 (0.0029) [2024-04-26 21:52:52,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.7, 300 sec: 50929.2). Total num frames: 4576264192. Throughput: 0: 50871.0. Samples: 2329062840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:52,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 21:52:54,632][49750] Updated weights for policy 0, policy_version 279321 (0.0030) [2024-04-26 21:52:57,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.6, 300 sec: 50984.8). Total num frames: 4576526336. Throughput: 0: 50992.5. Samples: 2329372740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:52:57,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 21:52:57,344][49750] Updated weights for policy 0, policy_version 279331 (0.0033) [2024-04-26 21:53:01,232][49750] Updated weights for policy 0, policy_version 279341 (0.0031) [2024-04-26 21:53:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4576755712. Throughput: 0: 51047.1. Samples: 2329676200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-26 21:53:02,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 21:53:03,730][49750] Updated weights for policy 0, policy_version 279351 (0.0032) [2024-04-26 21:53:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.7, 300 sec: 50873.7). Total num frames: 4577017856. Throughput: 0: 51051.1. Samples: 2329818700. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:07,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 21:53:07,782][49750] Updated weights for policy 0, policy_version 279361 (0.0029) [2024-04-26 21:53:10,157][49750] Updated weights for policy 0, policy_version 279371 (0.0029) [2024-04-26 21:53:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4577280000. Throughput: 0: 50801.3. Samples: 2330114520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 21:53:14,437][49750] Updated weights for policy 0, policy_version 279381 (0.0032) [2024-04-26 21:53:16,327][49728] Signal inference workers to stop experience collection... (34850 times) [2024-04-26 21:53:16,327][49728] Signal inference workers to resume experience collection... (34850 times) [2024-04-26 21:53:16,357][49750] InferenceWorker_p0-w0: stopping experience collection (34850 times) [2024-04-26 21:53:16,358][49750] InferenceWorker_p0-w0: resuming experience collection (34850 times) [2024-04-26 21:53:16,615][49750] Updated weights for policy 0, policy_version 279391 (0.0029) [2024-04-26 21:53:17,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51882.6, 300 sec: 50984.8). Total num frames: 4577558528. Throughput: 0: 50715.6. Samples: 2330422240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:17,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 21:53:20,882][49750] Updated weights for policy 0, policy_version 279401 (0.0027) [2024-04-26 21:53:22,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4577771520. Throughput: 0: 50757.7. Samples: 2330586640. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 21:53:22,998][49750] Updated weights for policy 0, policy_version 279411 (0.0032) [2024-04-26 21:53:27,063][49517] Fps is (10 sec: 44236.8, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 4578000896. Throughput: 0: 50917.9. Samples: 2330890400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:27,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 21:53:27,292][49750] Updated weights for policy 0, policy_version 279421 (0.0028) [2024-04-26 21:53:29,395][49750] Updated weights for policy 0, policy_version 279431 (0.0029) [2024-04-26 21:53:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 49971.3, 300 sec: 50873.7). Total num frames: 4578279424. Throughput: 0: 50902.8. Samples: 2331189760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 21:53:33,798][49750] Updated weights for policy 0, policy_version 279441 (0.0033) [2024-04-26 21:53:35,910][49750] Updated weights for policy 0, policy_version 279451 (0.0029) [2024-04-26 21:53:37,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4578557952. Throughput: 0: 50893.4. Samples: 2331353040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 21:53:40,262][49750] Updated weights for policy 0, policy_version 279461 (0.0039) [2024-04-26 21:53:42,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.7, 300 sec: 50984.8). Total num frames: 4578820096. Throughput: 0: 50697.2. Samples: 2331654120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:42,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 21:53:42,339][49750] Updated weights for policy 0, policy_version 279471 (0.0027) [2024-04-26 21:53:46,664][49750] Updated weights for policy 0, policy_version 279481 (0.0029) [2024-04-26 21:53:47,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4579033088. Throughput: 0: 50881.7. Samples: 2331965880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:47,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 21:53:48,677][49750] Updated weights for policy 0, policy_version 279491 (0.0027) [2024-04-26 21:53:52,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50244.2, 300 sec: 50818.1). Total num frames: 4579278848. Throughput: 0: 50682.9. Samples: 2332099440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:52,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 21:53:53,038][49750] Updated weights for policy 0, policy_version 279501 (0.0033) [2024-04-26 21:53:55,098][49750] Updated weights for policy 0, policy_version 279511 (0.0030) [2024-04-26 21:53:57,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50790.3, 300 sec: 50984.8). Total num frames: 4579573760. Throughput: 0: 50842.6. Samples: 2332402440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:53:57,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 21:53:59,616][49750] Updated weights for policy 0, policy_version 279521 (0.0029) [2024-04-26 21:53:59,619][49728] Signal inference workers to stop experience collection... (34900 times) [2024-04-26 21:53:59,619][49728] Signal inference workers to resume experience collection... 
(34900 times) [2024-04-26 21:53:59,647][49750] InferenceWorker_p0-w0: stopping experience collection (34900 times) [2024-04-26 21:53:59,647][49750] InferenceWorker_p0-w0: resuming experience collection (34900 times) [2024-04-26 21:54:01,568][49750] Updated weights for policy 0, policy_version 279531 (0.0031) [2024-04-26 21:54:02,063][49517] Fps is (10 sec: 55705.8, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4579835904. Throughput: 0: 50704.4. Samples: 2332703940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:54:02,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 21:54:06,010][49750] Updated weights for policy 0, policy_version 279541 (0.0029) [2024-04-26 21:54:07,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4580048896. Throughput: 0: 50681.4. Samples: 2332867300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:54:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 21:54:08,032][49750] Updated weights for policy 0, policy_version 279551 (0.0031) [2024-04-26 21:54:12,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4580294656. Throughput: 0: 50906.3. Samples: 2333181180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 21:54:12,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 21:54:12,355][49750] Updated weights for policy 0, policy_version 279561 (0.0031) [2024-04-26 21:54:14,301][49750] Updated weights for policy 0, policy_version 279571 (0.0030) [2024-04-26 21:54:17,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50244.2, 300 sec: 50929.2). Total num frames: 4580573184. Throughput: 0: 50952.3. Samples: 2333482620. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:54:17,071][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 21:54:18,624][49750] Updated weights for policy 0, policy_version 279581 (0.0033) [2024-04-26 21:54:20,579][49750] Updated weights for policy 0, policy_version 279591 (0.0029) [2024-04-26 21:54:22,062][49517] Fps is (10 sec: 55705.2, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4580851712. Throughput: 0: 50903.1. Samples: 2333643680. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:54:22,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 21:54:25,143][49750] Updated weights for policy 0, policy_version 279601 (0.0032) [2024-04-26 21:54:26,966][49750] Updated weights for policy 0, policy_version 279611 (0.0032) [2024-04-26 21:54:27,062][49517] Fps is (10 sec: 57345.0, 60 sec: 52428.9, 300 sec: 51040.3). Total num frames: 4581146624. Throughput: 0: 51093.0. Samples: 2333953300. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:54:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 21:54:31,582][49750] Updated weights for policy 0, policy_version 279621 (0.0035) [2024-04-26 21:54:32,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4581326848. Throughput: 0: 50980.4. Samples: 2334260000. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:54:32,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 21:54:33,483][49750] Updated weights for policy 0, policy_version 279631 (0.0028) [2024-04-26 21:54:37,063][49517] Fps is (10 sec: 42597.8, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4581572608. Throughput: 0: 51049.8. Samples: 2334396680. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:54:37,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 21:54:37,858][49750] Updated weights for policy 0, policy_version 279641 (0.0031) [2024-04-26 21:54:39,500][49728] Signal inference workers to stop experience collection... (34950 times) [2024-04-26 21:54:39,522][49750] InferenceWorker_p0-w0: stopping experience collection (34950 times) [2024-04-26 21:54:39,598][49728] Signal inference workers to resume experience collection... (34950 times) [2024-04-26 21:54:39,598][49750] InferenceWorker_p0-w0: resuming experience collection (34950 times) [2024-04-26 21:54:39,942][49750] Updated weights for policy 0, policy_version 279651 (0.0030) [2024-04-26 21:54:42,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4581867520. Throughput: 0: 51207.8. Samples: 2334706800. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:54:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 21:54:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000279655_4581867520.pth... [2024-04-26 21:54:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000278909_4569645056.pth [2024-04-26 21:54:44,265][49750] Updated weights for policy 0, policy_version 279661 (0.0030) [2024-04-26 21:54:46,372][49750] Updated weights for policy 0, policy_version 279671 (0.0031) [2024-04-26 21:54:47,062][49517] Fps is (10 sec: 55706.4, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 4582129664. Throughput: 0: 51195.3. Samples: 2335007720. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:54:47,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 21:54:50,617][49750] Updated weights for policy 0, policy_version 279681 (0.0029) [2024-04-26 21:54:52,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51882.7, 300 sec: 51040.3). Total num frames: 4582391808. Throughput: 0: 51358.5. 
Samples: 2335178440. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:54:52,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 21:54:52,746][49750] Updated weights for policy 0, policy_version 279691 (0.0032) [2024-04-26 21:54:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4582604800. Throughput: 0: 51060.0. Samples: 2335478880. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:54:57,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 21:54:57,102][49750] Updated weights for policy 0, policy_version 279701 (0.0033) [2024-04-26 21:54:59,174][49750] Updated weights for policy 0, policy_version 279711 (0.0031) [2024-04-26 21:55:02,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 4582866944. Throughput: 0: 51238.4. Samples: 2335788340. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:55:02,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 21:55:03,387][49750] Updated weights for policy 0, policy_version 279721 (0.0031) [2024-04-26 21:55:05,653][49750] Updated weights for policy 0, policy_version 279731 (0.0036) [2024-04-26 21:55:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 4583145472. Throughput: 0: 50886.7. Samples: 2335933580. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:55:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 21:55:09,725][49750] Updated weights for policy 0, policy_version 279741 (0.0034) [2024-04-26 21:55:12,063][49517] Fps is (10 sec: 55704.6, 60 sec: 52155.6, 300 sec: 50984.8). Total num frames: 4583424000. Throughput: 0: 50918.9. Samples: 2336244660. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:55:12,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 21:55:12,347][49750] Updated weights for policy 0, policy_version 279751 (0.0040) [2024-04-26 21:55:16,388][49750] Updated weights for policy 0, policy_version 279761 (0.0030) [2024-04-26 21:55:17,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4583653376. Throughput: 0: 50849.7. Samples: 2336548240. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:55:17,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 21:55:18,703][49750] Updated weights for policy 0, policy_version 279771 (0.0032) [2024-04-26 21:55:22,062][49517] Fps is (10 sec: 44237.4, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4583866368. Throughput: 0: 51008.6. Samples: 2336692060. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-26 21:55:22,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 21:55:22,866][49750] Updated weights for policy 0, policy_version 279781 (0.0032) [2024-04-26 21:55:25,143][49750] Updated weights for policy 0, policy_version 279791 (0.0028) [2024-04-26 21:55:27,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.0, 300 sec: 50929.2). Total num frames: 4584144896. Throughput: 0: 50921.8. Samples: 2336998280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:55:27,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 21:55:29,177][49750] Updated weights for policy 0, policy_version 279801 (0.0029) [2024-04-26 21:55:31,680][49750] Updated weights for policy 0, policy_version 279811 (0.0034) [2024-04-26 21:55:32,063][49517] Fps is (10 sec: 57343.7, 60 sec: 51882.7, 300 sec: 50984.8). Total num frames: 4584439808. Throughput: 0: 50983.4. Samples: 2337301980. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:55:32,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 21:55:35,477][49750] Updated weights for policy 0, policy_version 279821 (0.0033) [2024-04-26 21:55:37,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51882.7, 300 sec: 51040.3). Total num frames: 4584685568. Throughput: 0: 50988.5. Samples: 2337472920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:55:37,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 21:55:37,330][49728] Signal inference workers to stop experience collection... (35000 times) [2024-04-26 21:55:37,330][49728] Signal inference workers to resume experience collection... (35000 times) [2024-04-26 21:55:37,342][49750] InferenceWorker_p0-w0: stopping experience collection (35000 times) [2024-04-26 21:55:37,342][49750] InferenceWorker_p0-w0: resuming experience collection (35000 times) [2024-04-26 21:55:38,105][49750] Updated weights for policy 0, policy_version 279831 (0.0036) [2024-04-26 21:55:41,845][49750] Updated weights for policy 0, policy_version 279841 (0.0030) [2024-04-26 21:55:42,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4584914944. Throughput: 0: 51096.3. Samples: 2337778220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:55:42,064][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 21:55:44,491][49750] Updated weights for policy 0, policy_version 279851 (0.0033) [2024-04-26 21:55:47,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4585160704. Throughput: 0: 50881.3. Samples: 2338078000. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:55:47,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 21:55:48,232][49750] Updated weights for policy 0, policy_version 279861 (0.0033) [2024-04-26 21:55:51,116][49750] Updated weights for policy 0, policy_version 279871 (0.0037) [2024-04-26 21:55:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4585422848. Throughput: 0: 50835.4. Samples: 2338221180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:55:52,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 21:55:54,589][49750] Updated weights for policy 0, policy_version 279881 (0.0026) [2024-04-26 21:55:57,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4585701376. Throughput: 0: 50810.8. Samples: 2338531140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:55:57,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 21:55:57,590][49750] Updated weights for policy 0, policy_version 279891 (0.0037) [2024-04-26 21:56:01,086][49750] Updated weights for policy 0, policy_version 279901 (0.0029) [2024-04-26 21:56:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.3, 300 sec: 50984.8). Total num frames: 4585947136. Throughput: 0: 50886.6. Samples: 2338838140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:56:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 21:56:03,984][49750] Updated weights for policy 0, policy_version 279911 (0.0033) [2024-04-26 21:56:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4586192896. Throughput: 0: 51010.6. Samples: 2338987540. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:56:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 21:56:07,617][49750] Updated weights for policy 0, policy_version 279921 (0.0029) [2024-04-26 21:56:10,376][49750] Updated weights for policy 0, policy_version 279931 (0.0030) [2024-04-26 21:56:12,063][49517] Fps is (10 sec: 45875.1, 60 sec: 49698.1, 300 sec: 50762.6). Total num frames: 4586405888. Throughput: 0: 50815.9. Samples: 2339285000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:56:12,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 21:56:14,382][49750] Updated weights for policy 0, policy_version 279941 (0.0029) [2024-04-26 21:56:16,815][49750] Updated weights for policy 0, policy_version 279951 (0.0027) [2024-04-26 21:56:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4586717184. Throughput: 0: 50928.1. Samples: 2339593740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:56:17,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 21:56:20,711][49750] Updated weights for policy 0, policy_version 279961 (0.0030) [2024-04-26 21:56:22,062][49517] Fps is (10 sec: 55706.4, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 4586962944. Throughput: 0: 50757.8. Samples: 2339757020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:56:22,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 21:56:23,358][49750] Updated weights for policy 0, policy_version 279971 (0.0032) [2024-04-26 21:56:27,028][49750] Updated weights for policy 0, policy_version 279981 (0.0032) [2024-04-26 21:56:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4587208704. Throughput: 0: 50733.8. Samples: 2340061240. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:56:27,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 21:56:29,859][49750] Updated weights for policy 0, policy_version 279991 (0.0035) [2024-04-26 21:56:32,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4587438080. Throughput: 0: 50871.1. Samples: 2340367200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:56:32,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 21:56:33,631][49750] Updated weights for policy 0, policy_version 280001 (0.0027) [2024-04-26 21:56:36,300][49750] Updated weights for policy 0, policy_version 280011 (0.0029) [2024-04-26 21:56:37,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4587716608. Throughput: 0: 50754.2. Samples: 2340505120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 21:56:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 21:56:37,665][49728] Signal inference workers to stop experience collection... (35050 times) [2024-04-26 21:56:37,698][49750] InferenceWorker_p0-w0: stopping experience collection (35050 times) [2024-04-26 21:56:37,767][49728] Signal inference workers to resume experience collection... (35050 times) [2024-04-26 21:56:37,767][49750] InferenceWorker_p0-w0: resuming experience collection (35050 times) [2024-04-26 21:56:40,017][49750] Updated weights for policy 0, policy_version 280021 (0.0028) [2024-04-26 21:56:42,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4587978752. Throughput: 0: 50740.7. Samples: 2340814480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:56:42,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 21:56:42,082][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000280029_4587995136.pth... 
[2024-04-26 21:56:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000279281_4575739904.pth [2024-04-26 21:56:42,748][49750] Updated weights for policy 0, policy_version 280031 (0.0028) [2024-04-26 21:56:46,409][49750] Updated weights for policy 0, policy_version 280041 (0.0029) [2024-04-26 21:56:47,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.5, 300 sec: 51040.4). Total num frames: 4588240896. Throughput: 0: 50775.8. Samples: 2341123040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:56:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 21:56:49,310][49750] Updated weights for policy 0, policy_version 280051 (0.0031) [2024-04-26 21:56:52,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4588470272. Throughput: 0: 50769.0. Samples: 2341272140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:56:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 21:56:52,782][49750] Updated weights for policy 0, policy_version 280061 (0.0032) [2024-04-26 21:56:55,999][49750] Updated weights for policy 0, policy_version 280071 (0.0030) [2024-04-26 21:56:57,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50244.1, 300 sec: 50873.7). Total num frames: 4588716032. Throughput: 0: 50864.5. Samples: 2341573900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:56:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 21:56:59,159][49750] Updated weights for policy 0, policy_version 280081 (0.0032) [2024-04-26 21:57:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.5, 300 sec: 50873.8). Total num frames: 4588978176. Throughput: 0: 50737.7. Samples: 2341876940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:02,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 21:57:02,415][49750] Updated weights for policy 0, policy_version 280091 (0.0036) [2024-04-26 21:57:05,590][49750] Updated weights for policy 0, policy_version 280101 (0.0028) [2024-04-26 21:57:07,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 4589240320. Throughput: 0: 50765.9. Samples: 2342041480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:07,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 21:57:09,143][49750] Updated weights for policy 0, policy_version 280111 (0.0032) [2024-04-26 21:57:12,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51336.8, 300 sec: 50984.8). Total num frames: 4589486080. Throughput: 0: 50734.4. Samples: 2342344280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 21:57:12,080][49750] Updated weights for policy 0, policy_version 280121 (0.0029) [2024-04-26 21:57:15,667][49750] Updated weights for policy 0, policy_version 280131 (0.0035) [2024-04-26 21:57:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4589731840. Throughput: 0: 50751.2. Samples: 2342651000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:17,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 21:57:18,593][49750] Updated weights for policy 0, policy_version 280141 (0.0030) [2024-04-26 21:57:21,965][49750] Updated weights for policy 0, policy_version 280151 (0.0028) [2024-04-26 21:57:22,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4589993984. Throughput: 0: 50888.9. Samples: 2342795120. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:22,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 21:57:24,978][49750] Updated weights for policy 0, policy_version 280161 (0.0027) [2024-04-26 21:57:27,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4590256128. Throughput: 0: 50778.8. Samples: 2343099520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:27,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 21:57:28,386][49750] Updated weights for policy 0, policy_version 280171 (0.0022) [2024-04-26 21:57:31,411][49750] Updated weights for policy 0, policy_version 280181 (0.0029) [2024-04-26 21:57:32,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51609.4, 300 sec: 51095.8). Total num frames: 4590534656. Throughput: 0: 50682.0. Samples: 2343403740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 21:57:34,741][49750] Updated weights for policy 0, policy_version 280191 (0.0026) [2024-04-26 21:57:37,063][49517] Fps is (10 sec: 49150.6, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4590747648. Throughput: 0: 50939.2. Samples: 2343564420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:37,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 21:57:37,820][49750] Updated weights for policy 0, policy_version 280201 (0.0029) [2024-04-26 21:57:41,360][49750] Updated weights for policy 0, policy_version 280211 (0.0036) [2024-04-26 21:57:42,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 4591009792. Throughput: 0: 50824.2. Samples: 2343860980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 21:57:44,134][49728] Signal inference workers to stop experience collection... 
(35100 times) [2024-04-26 21:57:44,178][49750] InferenceWorker_p0-w0: stopping experience collection (35100 times) [2024-04-26 21:57:44,242][49728] Signal inference workers to resume experience collection... (35100 times) [2024-04-26 21:57:44,242][49750] InferenceWorker_p0-w0: resuming experience collection (35100 times) [2024-04-26 21:57:44,396][49750] Updated weights for policy 0, policy_version 280221 (0.0027) [2024-04-26 21:57:47,063][49517] Fps is (10 sec: 52429.7, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4591271936. Throughput: 0: 50812.3. Samples: 2344163500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 21:57:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 21:57:48,038][49750] Updated weights for policy 0, policy_version 280231 (0.0034) [2024-04-26 21:57:50,782][49750] Updated weights for policy 0, policy_version 280241 (0.0027) [2024-04-26 21:57:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4591517696. Throughput: 0: 50733.3. Samples: 2344324480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:57:52,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 21:57:54,315][49750] Updated weights for policy 0, policy_version 280251 (0.0031) [2024-04-26 21:57:57,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4591779840. Throughput: 0: 50844.2. Samples: 2344632280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:57:57,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 21:57:57,151][49750] Updated weights for policy 0, policy_version 280261 (0.0028) [2024-04-26 21:58:00,809][49750] Updated weights for policy 0, policy_version 280271 (0.0036) [2024-04-26 21:58:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4592009216. Throughput: 0: 50641.3. Samples: 2344929860. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:02,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 21:58:03,652][49750] Updated weights for policy 0, policy_version 280281 (0.0031) [2024-04-26 21:58:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4592271360. Throughput: 0: 50615.0. Samples: 2345072800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:07,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 21:58:07,238][49750] Updated weights for policy 0, policy_version 280291 (0.0036) [2024-04-26 21:58:10,134][49750] Updated weights for policy 0, policy_version 280301 (0.0036) [2024-04-26 21:58:12,063][49517] Fps is (10 sec: 54065.6, 60 sec: 51063.2, 300 sec: 50818.1). Total num frames: 4592549888. Throughput: 0: 50573.9. Samples: 2345375360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:12,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-26 21:58:13,556][49750] Updated weights for policy 0, policy_version 280311 (0.0031) [2024-04-26 21:58:16,599][49750] Updated weights for policy 0, policy_version 280321 (0.0033) [2024-04-26 21:58:17,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 4592795648. Throughput: 0: 50667.4. Samples: 2345683760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:17,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 21:58:20,103][49750] Updated weights for policy 0, policy_version 280331 (0.0032) [2024-04-26 21:58:22,062][49517] Fps is (10 sec: 47515.2, 60 sec: 50517.5, 300 sec: 50929.3). Total num frames: 4593025024. Throughput: 0: 50643.1. Samples: 2345843340. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:22,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 21:58:22,985][49750] Updated weights for policy 0, policy_version 280341 (0.0030) [2024-04-26 21:58:26,452][49750] Updated weights for policy 0, policy_version 280351 (0.0030) [2024-04-26 21:58:27,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4593287168. Throughput: 0: 50663.5. Samples: 2346140840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:27,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 21:58:29,385][49750] Updated weights for policy 0, policy_version 280361 (0.0032) [2024-04-26 21:58:32,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50244.4, 300 sec: 50818.1). Total num frames: 4593549312. Throughput: 0: 50800.0. Samples: 2346449500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:32,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 21:58:32,873][49750] Updated weights for policy 0, policy_version 280371 (0.0041) [2024-04-26 21:58:35,878][49750] Updated weights for policy 0, policy_version 280381 (0.0033) [2024-04-26 21:58:37,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4593811456. Throughput: 0: 50694.1. Samples: 2346605720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:37,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 21:58:39,385][49750] Updated weights for policy 0, policy_version 280391 (0.0035) [2024-04-26 21:58:42,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4594040832. Throughput: 0: 50618.4. Samples: 2346910100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:42,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 21:58:42,251][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000280400_4594073600.pth... 
[2024-04-26 21:58:42,301][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000279655_4581867520.pth [2024-04-26 21:58:42,438][49750] Updated weights for policy 0, policy_version 280401 (0.0032) [2024-04-26 21:58:45,834][49750] Updated weights for policy 0, policy_version 280411 (0.0034) [2024-04-26 21:58:47,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.3, 300 sec: 50818.2). Total num frames: 4594270208. Throughput: 0: 50572.0. Samples: 2347205600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:47,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 21:58:47,794][49728] Signal inference workers to stop experience collection... (35150 times) [2024-04-26 21:58:47,794][49728] Signal inference workers to resume experience collection... (35150 times) [2024-04-26 21:58:47,815][49750] InferenceWorker_p0-w0: stopping experience collection (35150 times) [2024-04-26 21:58:47,815][49750] InferenceWorker_p0-w0: resuming experience collection (35150 times) [2024-04-26 21:58:48,926][49750] Updated weights for policy 0, policy_version 280421 (0.0031) [2024-04-26 21:58:52,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4594565120. Throughput: 0: 50620.6. Samples: 2347350720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:52,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 21:58:52,098][49750] Updated weights for policy 0, policy_version 280431 (0.0032) [2024-04-26 21:58:55,326][49750] Updated weights for policy 0, policy_version 280441 (0.0031) [2024-04-26 21:58:57,062][49517] Fps is (10 sec: 55705.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4594827264. Throughput: 0: 50661.2. Samples: 2347655100. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:58:57,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 21:58:58,607][49750] Updated weights for policy 0, policy_version 280451 (0.0042) [2024-04-26 21:59:01,781][49750] Updated weights for policy 0, policy_version 280461 (0.0030) [2024-04-26 21:59:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 4595089408. Throughput: 0: 50745.2. Samples: 2347967300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-26 21:59:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:59:04,962][49750] Updated weights for policy 0, policy_version 280471 (0.0030) [2024-04-26 21:59:07,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4595286016. Throughput: 0: 50472.8. Samples: 2348114620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:07,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 21:59:08,284][49750] Updated weights for policy 0, policy_version 280481 (0.0029) [2024-04-26 21:59:11,264][49750] Updated weights for policy 0, policy_version 280491 (0.0032) [2024-04-26 21:59:12,063][49517] Fps is (10 sec: 47512.3, 60 sec: 50244.2, 300 sec: 50818.1). Total num frames: 4595564544. Throughput: 0: 50762.7. Samples: 2348425180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:12,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 21:59:14,690][49750] Updated weights for policy 0, policy_version 280501 (0.0029) [2024-04-26 21:59:17,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4595826688. Throughput: 0: 50588.6. Samples: 2348725980. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 21:59:17,782][49750] Updated weights for policy 0, policy_version 280511 (0.0029) [2024-04-26 21:59:21,058][49750] Updated weights for policy 0, policy_version 280521 (0.0028) [2024-04-26 21:59:22,063][49517] Fps is (10 sec: 54068.2, 60 sec: 51336.3, 300 sec: 50707.1). Total num frames: 4596105216. Throughput: 0: 50653.2. Samples: 2348885120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:22,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 21:59:24,334][49750] Updated weights for policy 0, policy_version 280531 (0.0029) [2024-04-26 21:59:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4596318208. Throughput: 0: 50633.8. Samples: 2349188620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:27,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 21:59:27,614][49750] Updated weights for policy 0, policy_version 280541 (0.0031) [2024-04-26 21:59:30,850][49750] Updated weights for policy 0, policy_version 280551 (0.0032) [2024-04-26 21:59:32,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4596580352. Throughput: 0: 50746.6. Samples: 2349489200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:32,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 21:59:33,229][49728] Signal inference workers to stop experience collection... (35200 times) [2024-04-26 21:59:33,290][49750] InferenceWorker_p0-w0: stopping experience collection (35200 times) [2024-04-26 21:59:33,292][49728] Signal inference workers to resume experience collection... 
(35200 times) [2024-04-26 21:59:33,305][49750] InferenceWorker_p0-w0: resuming experience collection (35200 times) [2024-04-26 21:59:33,988][49750] Updated weights for policy 0, policy_version 280561 (0.0026) [2024-04-26 21:59:37,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4596858880. Throughput: 0: 50843.1. Samples: 2349638660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:37,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 21:59:37,275][49750] Updated weights for policy 0, policy_version 280571 (0.0040) [2024-04-26 21:59:40,456][49750] Updated weights for policy 0, policy_version 280581 (0.0025) [2024-04-26 21:59:42,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4597088256. Throughput: 0: 50672.8. Samples: 2349935380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 21:59:44,280][49750] Updated weights for policy 0, policy_version 280591 (0.0031) [2024-04-26 21:59:46,939][49750] Updated weights for policy 0, policy_version 280601 (0.0035) [2024-04-26 21:59:47,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51609.5, 300 sec: 50762.6). Total num frames: 4597366784. Throughput: 0: 50726.2. Samples: 2350249980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:47,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 21:59:50,916][49750] Updated weights for policy 0, policy_version 280611 (0.0031) [2024-04-26 21:59:52,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4597563392. Throughput: 0: 50525.7. Samples: 2350388280. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:52,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 21:59:53,378][49750] Updated weights for policy 0, policy_version 280621 (0.0024) [2024-04-26 21:59:57,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4597841920. Throughput: 0: 50474.4. Samples: 2350696520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 21:59:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 21:59:57,211][49750] Updated weights for policy 0, policy_version 280631 (0.0030) [2024-04-26 21:59:59,932][49750] Updated weights for policy 0, policy_version 280641 (0.0039) [2024-04-26 22:00:02,062][49517] Fps is (10 sec: 55706.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4598120448. Throughput: 0: 50547.9. Samples: 2351000640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 22:00:02,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:00:03,791][49750] Updated weights for policy 0, policy_version 280651 (0.0030) [2024-04-26 22:00:06,403][49750] Updated weights for policy 0, policy_version 280661 (0.0027) [2024-04-26 22:00:07,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.4, 300 sec: 50651.5). Total num frames: 4598366208. Throughput: 0: 50606.7. Samples: 2351162420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 22:00:07,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 22:00:10,205][49750] Updated weights for policy 0, policy_version 280671 (0.0035) [2024-04-26 22:00:11,788][49728] Signal inference workers to stop experience collection... (35250 times) [2024-04-26 22:00:11,789][49728] Signal inference workers to resume experience collection... 
(35250 times) [2024-04-26 22:00:11,817][49750] InferenceWorker_p0-w0: stopping experience collection (35250 times) [2024-04-26 22:00:11,817][49750] InferenceWorker_p0-w0: resuming experience collection (35250 times) [2024-04-26 22:00:12,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.7, 300 sec: 50651.6). Total num frames: 4598595584. Throughput: 0: 50651.6. Samples: 2351467940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-04-26 22:00:12,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 22:00:12,864][49750] Updated weights for policy 0, policy_version 280681 (0.0031) [2024-04-26 22:00:16,634][49750] Updated weights for policy 0, policy_version 280691 (0.0033) [2024-04-26 22:00:17,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4598841344. Throughput: 0: 50589.7. Samples: 2351765740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:00:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:00:19,305][49750] Updated weights for policy 0, policy_version 280701 (0.0031) [2024-04-26 22:00:22,063][49517] Fps is (10 sec: 54066.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4599136256. Throughput: 0: 50425.2. Samples: 2351907800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:00:22,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 22:00:22,946][49750] Updated weights for policy 0, policy_version 280711 (0.0028) [2024-04-26 22:00:25,822][49750] Updated weights for policy 0, policy_version 280721 (0.0027) [2024-04-26 22:00:27,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 4599365632. Throughput: 0: 50696.9. Samples: 2352216740. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:00:27,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 22:00:29,508][49750] Updated weights for policy 0, policy_version 280731 (0.0036) [2024-04-26 22:00:32,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4599644160. Throughput: 0: 50590.7. Samples: 2352526560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:00:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 22:00:32,115][49750] Updated weights for policy 0, policy_version 280741 (0.0030) [2024-04-26 22:00:36,135][49750] Updated weights for policy 0, policy_version 280751 (0.0029) [2024-04-26 22:00:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 4599857152. Throughput: 0: 50744.9. Samples: 2352671800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:00:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 22:00:38,514][49750] Updated weights for policy 0, policy_version 280761 (0.0034) [2024-04-26 22:00:42,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4600119296. Throughput: 0: 50676.1. Samples: 2352976940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:00:42,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 22:00:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000280769_4600119296.pth... [2024-04-26 22:00:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000280029_4587995136.pth [2024-04-26 22:00:42,802][49750] Updated weights for policy 0, policy_version 280771 (0.0030) [2024-04-26 22:00:45,032][49750] Updated weights for policy 0, policy_version 280781 (0.0025) [2024-04-26 22:00:47,063][49517] Fps is (10 sec: 54066.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4600397824. Throughput: 0: 50725.7. Samples: 2353283300. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:00:47,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 22:00:49,368][49750] Updated weights for policy 0, policy_version 280791 (0.0032) [2024-04-26 22:00:51,563][49750] Updated weights for policy 0, policy_version 280801 (0.0026) [2024-04-26 22:00:52,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.6, 300 sec: 50651.5). Total num frames: 4600643584. Throughput: 0: 50758.9. Samples: 2353446560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:00:52,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 22:00:55,811][49750] Updated weights for policy 0, policy_version 280811 (0.0040) [2024-04-26 22:00:57,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 4600889344. Throughput: 0: 50677.6. Samples: 2353748440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:00:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:00:57,520][49728] Signal inference workers to stop experience collection... (35300 times) [2024-04-26 22:00:57,520][49728] Signal inference workers to resume experience collection... (35300 times) [2024-04-26 22:00:57,550][49750] InferenceWorker_p0-w0: stopping experience collection (35300 times) [2024-04-26 22:00:57,551][49750] InferenceWorker_p0-w0: resuming experience collection (35300 times) [2024-04-26 22:00:58,100][49750] Updated weights for policy 0, policy_version 280821 (0.0028) [2024-04-26 22:01:02,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 4601118720. Throughput: 0: 50807.7. Samples: 2354052080. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:01:02,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 22:01:02,176][49750] Updated weights for policy 0, policy_version 280831 (0.0026) [2024-04-26 22:01:05,013][49750] Updated weights for policy 0, policy_version 280841 (0.0029) [2024-04-26 22:01:07,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4601413632. Throughput: 0: 50932.6. Samples: 2354199760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:01:07,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 22:01:08,612][49750] Updated weights for policy 0, policy_version 280851 (0.0029) [2024-04-26 22:01:11,361][49750] Updated weights for policy 0, policy_version 280861 (0.0029) [2024-04-26 22:01:12,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4601675776. Throughput: 0: 50795.1. Samples: 2354502520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:01:12,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 22:01:15,055][49750] Updated weights for policy 0, policy_version 280871 (0.0037) [2024-04-26 22:01:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.7, 300 sec: 50707.1). Total num frames: 4601921536. Throughput: 0: 50760.6. Samples: 2354810780. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:01:17,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 22:01:17,743][49750] Updated weights for policy 0, policy_version 280881 (0.0033) [2024-04-26 22:01:21,379][49750] Updated weights for policy 0, policy_version 280891 (0.0031) [2024-04-26 22:01:22,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4602150912. Throughput: 0: 50830.7. Samples: 2354959180. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:01:22,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 22:01:24,142][49750] Updated weights for policy 0, policy_version 280901 (0.0029) [2024-04-26 22:01:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4602413056. Throughput: 0: 50875.8. Samples: 2355266340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-26 22:01:27,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 22:01:27,697][49750] Updated weights for policy 0, policy_version 280911 (0.0029) [2024-04-26 22:01:30,562][49750] Updated weights for policy 0, policy_version 280921 (0.0031) [2024-04-26 22:01:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4602675200. Throughput: 0: 50663.3. Samples: 2355563140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:01:32,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 22:01:34,284][49750] Updated weights for policy 0, policy_version 280931 (0.0033) [2024-04-26 22:01:36,965][49750] Updated weights for policy 0, policy_version 280941 (0.0031) [2024-04-26 22:01:37,063][49517] Fps is (10 sec: 52427.3, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 4602937344. Throughput: 0: 50698.9. Samples: 2355728020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:01:37,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 22:01:40,875][49750] Updated weights for policy 0, policy_version 280951 (0.0031) [2024-04-26 22:01:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.5, 300 sec: 50651.5). Total num frames: 4603183104. Throughput: 0: 50870.7. Samples: 2356037620. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:01:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 22:01:43,347][49750] Updated weights for policy 0, policy_version 280961 (0.0026) [2024-04-26 22:01:47,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49971.3, 300 sec: 50596.0). Total num frames: 4603396096. Throughput: 0: 50873.7. Samples: 2356341400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:01:47,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:01:47,444][49750] Updated weights for policy 0, policy_version 280971 (0.0030) [2024-04-26 22:01:49,780][49750] Updated weights for policy 0, policy_version 280981 (0.0028) [2024-04-26 22:01:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4603691008. Throughput: 0: 50771.6. Samples: 2356484480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:01:52,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 22:01:54,048][49750] Updated weights for policy 0, policy_version 280991 (0.0029) [2024-04-26 22:01:56,179][49750] Updated weights for policy 0, policy_version 281001 (0.0034) [2024-04-26 22:01:57,063][49517] Fps is (10 sec: 57343.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4603969536. Throughput: 0: 50869.7. Samples: 2356791660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:01:57,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 22:02:00,377][49750] Updated weights for policy 0, policy_version 281011 (0.0032) [2024-04-26 22:02:00,595][49728] Signal inference workers to stop experience collection... (35350 times) [2024-04-26 22:02:00,595][49728] Signal inference workers to resume experience collection... 
(35350 times) [2024-04-26 22:02:00,620][49750] InferenceWorker_p0-w0: stopping experience collection (35350 times) [2024-04-26 22:02:00,621][49750] InferenceWorker_p0-w0: resuming experience collection (35350 times) [2024-04-26 22:02:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 4604198912. Throughput: 0: 50780.7. Samples: 2357095920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:02:02,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 22:02:02,585][49750] Updated weights for policy 0, policy_version 281021 (0.0032) [2024-04-26 22:02:06,799][49750] Updated weights for policy 0, policy_version 281031 (0.0032) [2024-04-26 22:02:07,063][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4604428288. Throughput: 0: 50764.3. Samples: 2357243580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:02:07,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 22:02:09,237][49750] Updated weights for policy 0, policy_version 281041 (0.0032) [2024-04-26 22:02:12,062][49517] Fps is (10 sec: 47514.5, 60 sec: 49971.3, 300 sec: 50651.5). Total num frames: 4604674048. Throughput: 0: 50580.4. Samples: 2357542460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:02:12,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 22:02:13,298][49750] Updated weights for policy 0, policy_version 281051 (0.0032) [2024-04-26 22:02:15,644][49750] Updated weights for policy 0, policy_version 281061 (0.0030) [2024-04-26 22:02:17,063][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4604968960. Throughput: 0: 50731.3. Samples: 2357846060. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:02:17,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 22:02:19,797][49750] Updated weights for policy 0, policy_version 281071 (0.0038) [2024-04-26 22:02:22,054][49750] Updated weights for policy 0, policy_version 281081 (0.0034) [2024-04-26 22:02:22,063][49517] Fps is (10 sec: 55704.2, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4605231104. Throughput: 0: 50861.4. Samples: 2358016780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:02:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 22:02:26,232][49750] Updated weights for policy 0, policy_version 281091 (0.0029) [2024-04-26 22:02:27,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.2, 300 sec: 50596.0). Total num frames: 4605460480. Throughput: 0: 50657.7. Samples: 2358317220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:02:27,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 22:02:28,476][49750] Updated weights for policy 0, policy_version 281101 (0.0031) [2024-04-26 22:02:32,063][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.1, 300 sec: 50651.6). Total num frames: 4605689856. Throughput: 0: 50706.5. Samples: 2358623200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:02:32,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 22:02:32,622][49750] Updated weights for policy 0, policy_version 281111 (0.0032) [2024-04-26 22:02:34,777][49750] Updated weights for policy 0, policy_version 281121 (0.0028) [2024-04-26 22:02:37,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.4, 300 sec: 50651.5). Total num frames: 4605952000. Throughput: 0: 50563.6. Samples: 2358759840. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:02:37,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 22:02:39,107][49750] Updated weights for policy 0, policy_version 281131 (0.0038) [2024-04-26 22:02:41,280][49750] Updated weights for policy 0, policy_version 281141 (0.0030) [2024-04-26 22:02:42,063][49517] Fps is (10 sec: 55706.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4606246912. Throughput: 0: 50559.1. Samples: 2359066820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:02:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 22:02:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000281143_4606246912.pth... [2024-04-26 22:02:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000280400_4594073600.pth [2024-04-26 22:02:45,638][49750] Updated weights for policy 0, policy_version 281151 (0.0030) [2024-04-26 22:02:47,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51609.7, 300 sec: 50762.6). Total num frames: 4606492672. Throughput: 0: 50479.3. Samples: 2359367480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:02:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 22:02:47,691][49750] Updated weights for policy 0, policy_version 281161 (0.0029) [2024-04-26 22:02:52,062][49517] Fps is (10 sec: 44237.5, 60 sec: 49971.2, 300 sec: 50540.5). Total num frames: 4606689280. Throughput: 0: 50751.8. Samples: 2359527400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:02:52,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 22:02:52,120][49750] Updated weights for policy 0, policy_version 281171 (0.0034) [2024-04-26 22:02:53,996][49750] Updated weights for policy 0, policy_version 281181 (0.0027) [2024-04-26 22:02:57,063][49517] Fps is (10 sec: 45874.0, 60 sec: 49698.1, 300 sec: 50651.5). Total num frames: 4606951424. Throughput: 0: 50679.7. Samples: 2359823060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:02:57,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 22:02:58,597][49750] Updated weights for policy 0, policy_version 281191 (0.0032) [2024-04-26 22:03:00,613][49750] Updated weights for policy 0, policy_version 281201 (0.0031) [2024-04-26 22:03:02,062][49517] Fps is (10 sec: 55705.5, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4607246336. Throughput: 0: 50657.1. Samples: 2360125620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 22:03:04,521][49728] Signal inference workers to stop experience collection... (35400 times) [2024-04-26 22:03:04,521][49728] Signal inference workers to resume experience collection... (35400 times) [2024-04-26 22:03:04,547][49750] InferenceWorker_p0-w0: stopping experience collection (35400 times) [2024-04-26 22:03:04,551][49750] InferenceWorker_p0-w0: resuming experience collection (35400 times) [2024-04-26 22:03:04,931][49750] Updated weights for policy 0, policy_version 281211 (0.0031) [2024-04-26 22:03:07,062][49517] Fps is (10 sec: 55706.6, 60 sec: 51336.7, 300 sec: 50707.1). Total num frames: 4607508480. Throughput: 0: 50834.4. Samples: 2360304320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:07,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-26 22:03:07,111][49750] Updated weights for policy 0, policy_version 281221 (0.0028) [2024-04-26 22:03:11,453][49750] Updated weights for policy 0, policy_version 281231 (0.0029) [2024-04-26 22:03:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 4607737856. Throughput: 0: 50762.0. Samples: 2360601500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:12,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 22:03:13,493][49750] Updated weights for policy 0, policy_version 281241 (0.0031) [2024-04-26 22:03:17,063][49517] Fps is (10 sec: 44236.2, 60 sec: 49698.2, 300 sec: 50596.0). Total num frames: 4607950848. Throughput: 0: 50696.5. Samples: 2360904540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:17,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 22:03:17,919][49750] Updated weights for policy 0, policy_version 281251 (0.0029) [2024-04-26 22:03:20,074][49750] Updated weights for policy 0, policy_version 281261 (0.0027) [2024-04-26 22:03:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.5, 300 sec: 50707.1). Total num frames: 4608245760. Throughput: 0: 50617.0. Samples: 2361037600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:03:24,301][49750] Updated weights for policy 0, policy_version 281271 (0.0035) [2024-04-26 22:03:26,356][49750] Updated weights for policy 0, policy_version 281281 (0.0035) [2024-04-26 22:03:27,063][49517] Fps is (10 sec: 57344.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4608524288. Throughput: 0: 50676.5. Samples: 2361347260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 22:03:30,610][49750] Updated weights for policy 0, policy_version 281291 (0.0035) [2024-04-26 22:03:32,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51882.8, 300 sec: 50818.2). Total num frames: 4608802816. Throughput: 0: 50934.1. Samples: 2361659520. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:32,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 22:03:32,845][49750] Updated weights for policy 0, policy_version 281301 (0.0029) [2024-04-26 22:03:36,956][49750] Updated weights for policy 0, policy_version 281311 (0.0029) [2024-04-26 22:03:37,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4608999424. Throughput: 0: 50755.4. Samples: 2361811400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:37,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 22:03:39,325][49750] Updated weights for policy 0, policy_version 281321 (0.0029) [2024-04-26 22:03:42,063][49517] Fps is (10 sec: 44236.2, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4609245184. Throughput: 0: 50958.7. Samples: 2362116200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:42,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 22:03:43,587][49750] Updated weights for policy 0, policy_version 281331 (0.0034) [2024-04-26 22:03:43,796][49728] Signal inference workers to stop experience collection... (35450 times) [2024-04-26 22:03:43,837][49750] InferenceWorker_p0-w0: stopping experience collection (35450 times) [2024-04-26 22:03:43,856][49728] Signal inference workers to resume experience collection... (35450 times) [2024-04-26 22:03:43,861][49750] InferenceWorker_p0-w0: resuming experience collection (35450 times) [2024-04-26 22:03:45,581][49750] Updated weights for policy 0, policy_version 281341 (0.0039) [2024-04-26 22:03:47,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.1, 300 sec: 50707.1). Total num frames: 4609523712. Throughput: 0: 50955.8. Samples: 2362418640. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:47,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 22:03:49,966][49750] Updated weights for policy 0, policy_version 281351 (0.0032) [2024-04-26 22:03:51,994][49750] Updated weights for policy 0, policy_version 281361 (0.0026) [2024-04-26 22:03:52,063][49517] Fps is (10 sec: 57343.5, 60 sec: 52155.5, 300 sec: 50818.1). Total num frames: 4609818624. Throughput: 0: 50689.1. Samples: 2362585340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-26 22:03:52,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 22:03:56,258][49750] Updated weights for policy 0, policy_version 281371 (0.0031) [2024-04-26 22:03:57,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51609.8, 300 sec: 50707.1). Total num frames: 4610048000. Throughput: 0: 51015.1. Samples: 2362897180. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:03:57,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 22:03:58,472][49750] Updated weights for policy 0, policy_version 281381 (0.0028) [2024-04-26 22:04:02,063][49517] Fps is (10 sec: 42598.8, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4610244608. Throughput: 0: 50902.2. Samples: 2363195140. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:02,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 22:04:02,779][49750] Updated weights for policy 0, policy_version 281391 (0.0029) [2024-04-26 22:04:04,991][49750] Updated weights for policy 0, policy_version 281401 (0.0032) [2024-04-26 22:04:07,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4610523136. Throughput: 0: 50890.0. Samples: 2363327660. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 22:04:09,203][49750] Updated weights for policy 0, policy_version 281411 (0.0030) [2024-04-26 22:04:11,237][49750] Updated weights for policy 0, policy_version 281421 (0.0029) [2024-04-26 22:04:12,062][49517] Fps is (10 sec: 57344.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4610818048. Throughput: 0: 50856.6. Samples: 2363635800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:12,071][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 22:04:15,609][49750] Updated weights for policy 0, policy_version 281431 (0.0035) [2024-04-26 22:04:17,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51882.8, 300 sec: 50707.1). Total num frames: 4611063808. Throughput: 0: 50702.2. Samples: 2363941120. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:04:17,782][49750] Updated weights for policy 0, policy_version 281441 (0.0035) [2024-04-26 22:04:22,024][49750] Updated weights for policy 0, policy_version 281451 (0.0031) [2024-04-26 22:04:22,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4611293184. Throughput: 0: 50803.7. Samples: 2364097560. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:22,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 22:04:24,352][49750] Updated weights for policy 0, policy_version 281461 (0.0033) [2024-04-26 22:04:27,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4611522560. Throughput: 0: 50818.8. Samples: 2364403040. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:27,063][49517] Avg episode reward: [(0, '0.682')] [2024-04-26 22:04:28,392][49750] Updated weights for policy 0, policy_version 281471 (0.0029) [2024-04-26 22:04:28,403][49728] Signal inference workers to stop experience collection... (35500 times) [2024-04-26 22:04:28,403][49728] Signal inference workers to resume experience collection... (35500 times) [2024-04-26 22:04:28,424][49750] InferenceWorker_p0-w0: stopping experience collection (35500 times) [2024-04-26 22:04:28,424][49750] InferenceWorker_p0-w0: resuming experience collection (35500 times) [2024-04-26 22:04:30,664][49750] Updated weights for policy 0, policy_version 281481 (0.0033) [2024-04-26 22:04:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4611817472. Throughput: 0: 50841.8. Samples: 2364706520. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:32,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 22:04:34,919][49750] Updated weights for policy 0, policy_version 281491 (0.0033) [2024-04-26 22:04:37,063][49517] Fps is (10 sec: 55704.7, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4612079616. Throughput: 0: 50697.3. Samples: 2364866720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:37,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 22:04:37,282][49750] Updated weights for policy 0, policy_version 281501 (0.0032) [2024-04-26 22:04:41,302][49750] Updated weights for policy 0, policy_version 281511 (0.0027) [2024-04-26 22:04:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 4612325376. Throughput: 0: 50603.5. Samples: 2365174340. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:42,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 22:04:42,190][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000281515_4612341760.pth... 
[2024-04-26 22:04:42,243][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000280769_4600119296.pth [2024-04-26 22:04:43,872][49750] Updated weights for policy 0, policy_version 281521 (0.0032) [2024-04-26 22:04:47,062][49517] Fps is (10 sec: 45876.2, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 4612538368. Throughput: 0: 50856.6. Samples: 2365483680. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:47,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 22:04:47,727][49750] Updated weights for policy 0, policy_version 281531 (0.0040) [2024-04-26 22:04:50,344][49750] Updated weights for policy 0, policy_version 281541 (0.0039) [2024-04-26 22:04:52,062][49517] Fps is (10 sec: 47513.2, 60 sec: 49698.2, 300 sec: 50707.1). Total num frames: 4612800512. Throughput: 0: 50822.3. Samples: 2365614660. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 22:04:54,145][49750] Updated weights for policy 0, policy_version 281551 (0.0034) [2024-04-26 22:04:56,754][49750] Updated weights for policy 0, policy_version 281561 (0.0028) [2024-04-26 22:04:57,062][49517] Fps is (10 sec: 55705.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4613095424. Throughput: 0: 50702.7. Samples: 2365917420. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:04:57,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-26 22:05:00,694][49750] Updated weights for policy 0, policy_version 281571 (0.0034) [2024-04-26 22:05:02,063][49517] Fps is (10 sec: 55704.7, 60 sec: 51882.6, 300 sec: 50818.2). Total num frames: 4613357568. Throughput: 0: 50747.3. Samples: 2366224760. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-26 22:05:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 22:05:03,455][49750] Updated weights for policy 0, policy_version 281581 (0.0027) [2024-04-26 22:05:07,021][49750] Updated weights for policy 0, policy_version 281591 (0.0031) [2024-04-26 22:05:07,062][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4613586944. Throughput: 0: 50807.1. Samples: 2366383880. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:07,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 22:05:10,095][49750] Updated weights for policy 0, policy_version 281601 (0.0026) [2024-04-26 22:05:12,063][49517] Fps is (10 sec: 44237.2, 60 sec: 49698.0, 300 sec: 50707.1). Total num frames: 4613799936. Throughput: 0: 50560.3. Samples: 2366678260. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 22:05:13,498][49750] Updated weights for policy 0, policy_version 281611 (0.0037) [2024-04-26 22:05:16,588][49750] Updated weights for policy 0, policy_version 281621 (0.0038) [2024-04-26 22:05:17,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4614094848. Throughput: 0: 50583.6. Samples: 2366982780. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:17,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 22:05:20,047][49750] Updated weights for policy 0, policy_version 281631 (0.0033) [2024-04-26 22:05:22,062][49517] Fps is (10 sec: 55706.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4614356992. Throughput: 0: 50555.4. Samples: 2367141700. 
Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:22,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 22:05:23,137][49750] Updated weights for policy 0, policy_version 281641 (0.0031) [2024-04-26 22:05:25,345][49728] Signal inference workers to stop experience collection... (35550 times) [2024-04-26 22:05:25,345][49728] Signal inference workers to resume experience collection... (35550 times) [2024-04-26 22:05:25,356][49750] InferenceWorker_p0-w0: stopping experience collection (35550 times) [2024-04-26 22:05:25,377][49750] InferenceWorker_p0-w0: resuming experience collection (35550 times) [2024-04-26 22:05:26,421][49750] Updated weights for policy 0, policy_version 281651 (0.0031) [2024-04-26 22:05:27,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 4614602752. Throughput: 0: 50578.7. Samples: 2367450380. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 22:05:29,542][49750] Updated weights for policy 0, policy_version 281661 (0.0032) [2024-04-26 22:05:32,063][49517] Fps is (10 sec: 47512.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4614832128. Throughput: 0: 50525.5. Samples: 2367757340. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:32,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 22:05:32,760][49750] Updated weights for policy 0, policy_version 281671 (0.0031) [2024-04-26 22:05:36,152][49750] Updated weights for policy 0, policy_version 281681 (0.0030) [2024-04-26 22:05:37,063][49517] Fps is (10 sec: 47512.8, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4615077888. Throughput: 0: 50615.0. Samples: 2367892340. 
Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:37,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 22:05:39,135][49750] Updated weights for policy 0, policy_version 281691 (0.0032) [2024-04-26 22:05:42,063][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4615356416. Throughput: 0: 50558.1. Samples: 2368192540. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:42,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 22:05:42,722][49750] Updated weights for policy 0, policy_version 281701 (0.0025) [2024-04-26 22:05:45,619][49750] Updated weights for policy 0, policy_version 281711 (0.0029) [2024-04-26 22:05:47,062][49517] Fps is (10 sec: 55706.5, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4615634944. Throughput: 0: 50615.8. Samples: 2368502460. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:47,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 22:05:49,162][49750] Updated weights for policy 0, policy_version 281721 (0.0035) [2024-04-26 22:05:51,982][49750] Updated weights for policy 0, policy_version 281731 (0.0028) [2024-04-26 22:05:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4615880704. Throughput: 0: 50670.7. Samples: 2368664060. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 22:05:55,585][49750] Updated weights for policy 0, policy_version 281741 (0.0036) [2024-04-26 22:05:57,062][49517] Fps is (10 sec: 44237.0, 60 sec: 49698.2, 300 sec: 50707.1). Total num frames: 4616077312. Throughput: 0: 50852.7. Samples: 2368966620. 
Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:05:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 22:05:58,432][49750] Updated weights for policy 0, policy_version 281751 (0.0029) [2024-04-26 22:06:02,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.4, 300 sec: 50651.6). Total num frames: 4616355840. Throughput: 0: 50835.3. Samples: 2369270360. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:06:02,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 22:06:02,086][49750] Updated weights for policy 0, policy_version 281761 (0.0034) [2024-04-26 22:06:04,995][49750] Updated weights for policy 0, policy_version 281771 (0.0032) [2024-04-26 22:06:07,063][49517] Fps is (10 sec: 55704.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4616634368. Throughput: 0: 50638.4. Samples: 2369420440. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:06:07,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 22:06:08,571][49750] Updated weights for policy 0, policy_version 281781 (0.0031) [2024-04-26 22:06:11,359][49750] Updated weights for policy 0, policy_version 281791 (0.0035) [2024-04-26 22:06:12,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51609.7, 300 sec: 50762.6). Total num frames: 4616896512. Throughput: 0: 50696.4. Samples: 2369731720. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:06:12,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 22:06:14,957][49750] Updated weights for policy 0, policy_version 281801 (0.0031) [2024-04-26 22:06:17,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4617125888. Throughput: 0: 50736.7. Samples: 2370040480. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-04-26 22:06:17,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 22:06:17,214][49728] Signal inference workers to stop experience collection... 
(35600 times) [2024-04-26 22:06:17,214][49728] Signal inference workers to resume experience collection... (35600 times) [2024-04-26 22:06:17,241][49750] InferenceWorker_p0-w0: stopping experience collection (35600 times) [2024-04-26 22:06:17,241][49750] InferenceWorker_p0-w0: resuming experience collection (35600 times) [2024-04-26 22:06:17,850][49750] Updated weights for policy 0, policy_version 281811 (0.0033) [2024-04-26 22:06:21,665][49750] Updated weights for policy 0, policy_version 281821 (0.0033) [2024-04-26 22:06:22,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 4617355264. Throughput: 0: 50785.5. Samples: 2370177680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:06:22,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 22:06:24,138][49750] Updated weights for policy 0, policy_version 281831 (0.0030) [2024-04-26 22:06:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4617633792. Throughput: 0: 50954.3. Samples: 2370485480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:06:27,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 22:06:28,443][49750] Updated weights for policy 0, policy_version 281841 (0.0029) [2024-04-26 22:06:30,514][49750] Updated weights for policy 0, policy_version 281851 (0.0031) [2024-04-26 22:06:32,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51336.7, 300 sec: 50762.7). Total num frames: 4617912320. Throughput: 0: 50771.5. Samples: 2370787180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:06:32,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 22:06:34,879][49750] Updated weights for policy 0, policy_version 281861 (0.0033) [2024-04-26 22:06:37,058][49750] Updated weights for policy 0, policy_version 281871 (0.0030) [2024-04-26 22:06:37,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 4618174464. Throughput: 0: 50876.0. 
Samples: 2370953480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:06:37,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 22:06:41,428][49750] Updated weights for policy 0, policy_version 281881 (0.0036) [2024-04-26 22:06:42,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4618387456. Throughput: 0: 50895.5. Samples: 2371256920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:06:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 22:06:42,116][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000281885_4618403840.pth... [2024-04-26 22:06:42,159][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000281143_4606246912.pth [2024-04-26 22:06:43,377][49750] Updated weights for policy 0, policy_version 281891 (0.0037) [2024-04-26 22:06:47,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 4618633216. Throughput: 0: 50920.0. Samples: 2371561760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:06:47,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 22:06:47,911][49750] Updated weights for policy 0, policy_version 281901 (0.0027) [2024-04-26 22:06:49,918][49750] Updated weights for policy 0, policy_version 281911 (0.0035) [2024-04-26 22:06:52,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4618911744. Throughput: 0: 50707.2. Samples: 2371702260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:06:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 22:06:54,428][49750] Updated weights for policy 0, policy_version 281921 (0.0032) [2024-04-26 22:06:56,402][49750] Updated weights for policy 0, policy_version 281931 (0.0029) [2024-04-26 22:06:57,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51609.5, 300 sec: 50762.6). Total num frames: 4619173888. Throughput: 0: 50601.2. Samples: 2372008780. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:06:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 22:07:00,950][49750] Updated weights for policy 0, policy_version 281941 (0.0043) [2024-04-26 22:07:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4619403264. Throughput: 0: 50609.4. Samples: 2372317900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:07:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 22:07:02,753][49750] Updated weights for policy 0, policy_version 281951 (0.0027) [2024-04-26 22:07:07,063][49517] Fps is (10 sec: 44236.9, 60 sec: 49698.2, 300 sec: 50651.5). Total num frames: 4619616256. Throughput: 0: 50723.9. Samples: 2372460260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:07:07,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 22:07:07,312][49750] Updated weights for policy 0, policy_version 281961 (0.0032) [2024-04-26 22:07:09,316][49750] Updated weights for policy 0, policy_version 281971 (0.0033) [2024-04-26 22:07:12,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4619911168. Throughput: 0: 50584.3. Samples: 2372761780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:07:12,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 22:07:13,811][49750] Updated weights for policy 0, policy_version 281981 (0.0034) [2024-04-26 22:07:15,232][49728] Signal inference workers to stop experience collection... (35650 times) [2024-04-26 22:07:15,232][49728] Signal inference workers to resume experience collection... 
(35650 times) [2024-04-26 22:07:15,249][49750] InferenceWorker_p0-w0: stopping experience collection (35650 times) [2024-04-26 22:07:15,249][49750] InferenceWorker_p0-w0: resuming experience collection (35650 times) [2024-04-26 22:07:15,878][49750] Updated weights for policy 0, policy_version 281991 (0.0031) [2024-04-26 22:07:17,063][49517] Fps is (10 sec: 57344.1, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4620189696. Throughput: 0: 50696.8. Samples: 2373068540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:07:17,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 22:07:20,192][49750] Updated weights for policy 0, policy_version 282001 (0.0031) [2024-04-26 22:07:22,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4620451840. Throughput: 0: 50613.8. Samples: 2373231100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:07:22,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 22:07:22,161][49750] Updated weights for policy 0, policy_version 282011 (0.0033) [2024-04-26 22:07:26,605][49750] Updated weights for policy 0, policy_version 282021 (0.0028) [2024-04-26 22:07:27,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4620664832. Throughput: 0: 50677.3. Samples: 2373537400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-26 22:07:27,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 22:07:28,530][49750] Updated weights for policy 0, policy_version 282031 (0.0032) [2024-04-26 22:07:32,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4620926976. Throughput: 0: 50556.8. Samples: 2373836820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:07:32,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 22:07:33,001][49750] Updated weights for policy 0, policy_version 282041 (0.0028) [2024-04-26 22:07:35,037][49750] Updated weights for policy 0, policy_version 282051 (0.0030) [2024-04-26 22:07:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4621189120. Throughput: 0: 50628.5. Samples: 2373980540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:07:37,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 22:07:39,433][49750] Updated weights for policy 0, policy_version 282061 (0.0029) [2024-04-26 22:07:41,497][49750] Updated weights for policy 0, policy_version 282071 (0.0037) [2024-04-26 22:07:42,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4621467648. Throughput: 0: 50662.8. Samples: 2374288600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:07:42,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 22:07:45,792][49750] Updated weights for policy 0, policy_version 282081 (0.0033) [2024-04-26 22:07:47,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4621697024. Throughput: 0: 50742.7. Samples: 2374601320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:07:47,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 22:07:47,969][49750] Updated weights for policy 0, policy_version 282091 (0.0036) [2024-04-26 22:07:52,062][49517] Fps is (10 sec: 44236.9, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4621910016. Throughput: 0: 50787.7. Samples: 2374745700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:07:52,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 22:07:52,272][49750] Updated weights for policy 0, policy_version 282101 (0.0033) [2024-04-26 22:07:54,321][49750] Updated weights for policy 0, policy_version 282111 (0.0035) [2024-04-26 22:07:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4622188544. Throughput: 0: 50799.7. Samples: 2375047760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:07:57,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 22:07:58,606][49750] Updated weights for policy 0, policy_version 282121 (0.0030) [2024-04-26 22:08:00,836][49750] Updated weights for policy 0, policy_version 282131 (0.0034) [2024-04-26 22:08:02,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4622467072. Throughput: 0: 50701.7. Samples: 2375350120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:08:02,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 22:08:04,975][49750] Updated weights for policy 0, policy_version 282141 (0.0037) [2024-04-26 22:08:07,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51882.7, 300 sec: 50818.2). Total num frames: 4622729216. Throughput: 0: 50858.2. Samples: 2375519720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:08:07,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 22:08:07,457][49750] Updated weights for policy 0, policy_version 282151 (0.0029) [2024-04-26 22:08:11,301][49750] Updated weights for policy 0, policy_version 282161 (0.0031) [2024-04-26 22:08:12,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4622958592. Throughput: 0: 50827.2. Samples: 2375824620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:08:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 22:08:13,844][49750] Updated weights for policy 0, policy_version 282171 (0.0035) [2024-04-26 22:08:17,016][49728] Signal inference workers to stop experience collection... (35700 times) [2024-04-26 22:08:17,039][49750] InferenceWorker_p0-w0: stopping experience collection (35700 times) [2024-04-26 22:08:17,062][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 4623187968. Throughput: 0: 50988.5. Samples: 2376131300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:08:17,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 22:08:17,118][49728] Signal inference workers to resume experience collection... (35700 times) [2024-04-26 22:08:17,118][49750] InferenceWorker_p0-w0: resuming experience collection (35700 times) [2024-04-26 22:08:17,574][49750] Updated weights for policy 0, policy_version 282181 (0.0028) [2024-04-26 22:08:20,274][49750] Updated weights for policy 0, policy_version 282191 (0.0025) [2024-04-26 22:08:22,063][49517] Fps is (10 sec: 50788.6, 60 sec: 50244.0, 300 sec: 50651.5). Total num frames: 4623466496. Throughput: 0: 50841.1. Samples: 2376268400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:08:22,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 22:08:24,045][49750] Updated weights for policy 0, policy_version 282201 (0.0032) [2024-04-26 22:08:26,818][49750] Updated weights for policy 0, policy_version 282211 (0.0033) [2024-04-26 22:08:27,062][49517] Fps is (10 sec: 57344.0, 60 sec: 51609.5, 300 sec: 50707.1). Total num frames: 4623761408. Throughput: 0: 50950.2. Samples: 2376581360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:08:27,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 22:08:30,430][49750] Updated weights for policy 0, policy_version 282221 (0.0030) [2024-04-26 22:08:32,062][49517] Fps is (10 sec: 52430.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4623990784. Throughput: 0: 50805.7. Samples: 2376887580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:08:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 22:08:33,362][49750] Updated weights for policy 0, policy_version 282231 (0.0031) [2024-04-26 22:08:36,861][49750] Updated weights for policy 0, policy_version 282241 (0.0029) [2024-04-26 22:08:37,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4624236544. Throughput: 0: 50799.5. Samples: 2377031680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:08:37,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-26 22:08:39,740][49750] Updated weights for policy 0, policy_version 282251 (0.0032) [2024-04-26 22:08:42,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49698.2, 300 sec: 50596.0). Total num frames: 4624449536. Throughput: 0: 50938.2. Samples: 2377339980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-26 22:08:42,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 22:08:42,177][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000282255_4624465920.pth... [2024-04-26 22:08:42,222][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000281515_4612341760.pth [2024-04-26 22:08:43,305][49750] Updated weights for policy 0, policy_version 282261 (0.0031) [2024-04-26 22:08:46,142][49750] Updated weights for policy 0, policy_version 282271 (0.0028) [2024-04-26 22:08:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 4624760832. Throughput: 0: 50845.4. Samples: 2377638160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:08:47,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 22:08:49,762][49750] Updated weights for policy 0, policy_version 282281 (0.0038) [2024-04-26 22:08:52,062][49517] Fps is (10 sec: 57343.8, 60 sec: 51882.6, 300 sec: 50762.6). Total num frames: 4625022976. Throughput: 0: 50684.5. Samples: 2377800520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:08:52,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 22:08:52,706][49750] Updated weights for policy 0, policy_version 282291 (0.0033) [2024-04-26 22:08:56,150][49750] Updated weights for policy 0, policy_version 282301 (0.0028) [2024-04-26 22:08:57,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4625235968. Throughput: 0: 50759.4. Samples: 2378108800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:08:57,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 22:08:59,218][49750] Updated weights for policy 0, policy_version 282311 (0.0034) [2024-04-26 22:09:02,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4625498112. Throughput: 0: 50755.4. Samples: 2378415300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:02,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 22:09:02,706][49750] Updated weights for policy 0, policy_version 282321 (0.0033) [2024-04-26 22:09:05,624][49750] Updated weights for policy 0, policy_version 282331 (0.0030) [2024-04-26 22:09:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4625760256. Throughput: 0: 50802.5. Samples: 2378554500. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:07,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 22:09:09,243][49750] Updated weights for policy 0, policy_version 282341 (0.0031) [2024-04-26 22:09:10,755][49728] Signal inference workers to stop experience collection... (35750 times) [2024-04-26 22:09:10,755][49728] Signal inference workers to resume experience collection... (35750 times) [2024-04-26 22:09:10,783][49750] InferenceWorker_p0-w0: stopping experience collection (35750 times) [2024-04-26 22:09:10,806][49750] InferenceWorker_p0-w0: resuming experience collection (35750 times) [2024-04-26 22:09:12,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4626022400. Throughput: 0: 50731.9. Samples: 2378864300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:12,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 22:09:12,323][49750] Updated weights for policy 0, policy_version 282351 (0.0032) [2024-04-26 22:09:15,574][49750] Updated weights for policy 0, policy_version 282361 (0.0031) [2024-04-26 22:09:17,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4626268160. Throughput: 0: 50655.0. Samples: 2379167060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:17,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 22:09:18,774][49750] Updated weights for policy 0, policy_version 282371 (0.0038) [2024-04-26 22:09:22,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4626513920. Throughput: 0: 50743.5. Samples: 2379315140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:22,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:09:22,108][49750] Updated weights for policy 0, policy_version 282381 (0.0032) [2024-04-26 22:09:25,111][49750] Updated weights for policy 0, policy_version 282391 (0.0030) [2024-04-26 22:09:27,062][49517] Fps is (10 sec: 49152.8, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4626759680. Throughput: 0: 50787.2. Samples: 2379625400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 22:09:28,504][49750] Updated weights for policy 0, policy_version 282401 (0.0027) [2024-04-26 22:09:31,533][49750] Updated weights for policy 0, policy_version 282411 (0.0032) [2024-04-26 22:09:32,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4627021824. Throughput: 0: 50923.6. Samples: 2379929720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:32,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 22:09:34,898][49750] Updated weights for policy 0, policy_version 282421 (0.0030) [2024-04-26 22:09:37,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4627300352. Throughput: 0: 50783.9. Samples: 2380085800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:37,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 22:09:37,978][49750] Updated weights for policy 0, policy_version 282431 (0.0031) [2024-04-26 22:09:41,135][49750] Updated weights for policy 0, policy_version 282441 (0.0036) [2024-04-26 22:09:42,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4627546112. Throughput: 0: 50704.5. Samples: 2380390500. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:42,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 22:09:44,289][49750] Updated weights for policy 0, policy_version 282451 (0.0031) [2024-04-26 22:09:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4627808256. Throughput: 0: 50868.6. Samples: 2380704380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:47,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 22:09:47,600][49750] Updated weights for policy 0, policy_version 282461 (0.0033) [2024-04-26 22:09:50,795][49750] Updated weights for policy 0, policy_version 282471 (0.0031) [2024-04-26 22:09:52,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4628037632. Throughput: 0: 50948.6. Samples: 2380847180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-26 22:09:52,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 22:09:53,960][49750] Updated weights for policy 0, policy_version 282481 (0.0034) [2024-04-26 22:09:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51336.7, 300 sec: 50707.1). Total num frames: 4628316160. Throughput: 0: 51131.3. Samples: 2381165200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:09:57,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 22:09:57,096][49750] Updated weights for policy 0, policy_version 282491 (0.0030) [2024-04-26 22:10:00,361][49750] Updated weights for policy 0, policy_version 282501 (0.0030) [2024-04-26 22:10:02,062][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4628578304. Throughput: 0: 51288.9. Samples: 2381475060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 22:10:03,543][49750] Updated weights for policy 0, policy_version 282511 (0.0036) [2024-04-26 22:10:06,610][49750] Updated weights for policy 0, policy_version 282521 (0.0035) [2024-04-26 22:10:07,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4628840448. Throughput: 0: 51376.4. Samples: 2381627080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:07,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 22:10:09,998][49750] Updated weights for policy 0, policy_version 282531 (0.0030) [2024-04-26 22:10:12,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4629086208. Throughput: 0: 51220.7. Samples: 2381930340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:12,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 22:10:12,963][49750] Updated weights for policy 0, policy_version 282541 (0.0033) [2024-04-26 22:10:16,450][49750] Updated weights for policy 0, policy_version 282551 (0.0032) [2024-04-26 22:10:17,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4629315584. Throughput: 0: 51091.0. Samples: 2382228820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:17,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 22:10:19,506][49750] Updated weights for policy 0, policy_version 282561 (0.0034) [2024-04-26 22:10:22,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4629594112. Throughput: 0: 50938.7. Samples: 2382378040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:22,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 22:10:22,932][49750] Updated weights for policy 0, policy_version 282571 (0.0029) [2024-04-26 22:10:25,959][49750] Updated weights for policy 0, policy_version 282581 (0.0038) [2024-04-26 22:10:27,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4629823488. Throughput: 0: 51013.9. Samples: 2382686120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:27,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 22:10:27,621][49728] Signal inference workers to stop experience collection... (35800 times) [2024-04-26 22:10:27,649][49750] InferenceWorker_p0-w0: stopping experience collection (35800 times) [2024-04-26 22:10:27,686][49728] Signal inference workers to resume experience collection... (35800 times) [2024-04-26 22:10:27,686][49750] InferenceWorker_p0-w0: resuming experience collection (35800 times) [2024-04-26 22:10:29,327][49750] Updated weights for policy 0, policy_version 282591 (0.0028) [2024-04-26 22:10:32,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 4630118400. Throughput: 0: 50863.1. Samples: 2382993220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:32,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:10:32,287][49750] Updated weights for policy 0, policy_version 282601 (0.0034) [2024-04-26 22:10:35,882][49750] Updated weights for policy 0, policy_version 282611 (0.0033) [2024-04-26 22:10:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4630347776. Throughput: 0: 51139.9. Samples: 2383148480. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:37,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 22:10:38,603][49750] Updated weights for policy 0, policy_version 282621 (0.0032) [2024-04-26 22:10:42,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4630593536. Throughput: 0: 50866.5. Samples: 2383454200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:42,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 22:10:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000282630_4630609920.pth... [2024-04-26 22:10:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000281885_4618403840.pth [2024-04-26 22:10:42,500][49750] Updated weights for policy 0, policy_version 282631 (0.0033) [2024-04-26 22:10:45,141][49750] Updated weights for policy 0, policy_version 282641 (0.0042) [2024-04-26 22:10:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4630839296. Throughput: 0: 50695.7. Samples: 2383756360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:47,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 22:10:49,008][49750] Updated weights for policy 0, policy_version 282651 (0.0039) [2024-04-26 22:10:51,739][49750] Updated weights for policy 0, policy_version 282661 (0.0027) [2024-04-26 22:10:52,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 4631117824. Throughput: 0: 50607.6. Samples: 2383904420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:52,063][49517] Avg episode reward: [(0, '0.484')] [2024-04-26 22:10:55,598][49750] Updated weights for policy 0, policy_version 282671 (0.0031) [2024-04-26 22:10:57,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4631379968. Throughput: 0: 50819.0. Samples: 2384217200. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:10:57,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 22:10:58,413][49750] Updated weights for policy 0, policy_version 282681 (0.0029) [2024-04-26 22:11:01,996][49750] Updated weights for policy 0, policy_version 282691 (0.0031) [2024-04-26 22:11:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4631609344. Throughput: 0: 50914.7. Samples: 2384519980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:11:02,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 22:11:05,012][49750] Updated weights for policy 0, policy_version 282701 (0.0027) [2024-04-26 22:11:07,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4631871488. Throughput: 0: 50788.3. Samples: 2384663520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 22:11:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 22:11:08,342][49750] Updated weights for policy 0, policy_version 282711 (0.0029) [2024-04-26 22:11:11,484][49750] Updated weights for policy 0, policy_version 282721 (0.0031) [2024-04-26 22:11:12,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4632100864. Throughput: 0: 50712.6. Samples: 2384968200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 22:11:14,684][49750] Updated weights for policy 0, policy_version 282731 (0.0031) [2024-04-26 22:11:17,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4632395776. Throughput: 0: 50782.1. Samples: 2385278420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 22:11:17,796][49750] Updated weights for policy 0, policy_version 282741 (0.0036) [2024-04-26 22:11:21,113][49750] Updated weights for policy 0, policy_version 282751 (0.0036) [2024-04-26 22:11:22,063][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4632641536. Throughput: 0: 50702.6. Samples: 2385430100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:22,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 22:11:24,594][49750] Updated weights for policy 0, policy_version 282761 (0.0031) [2024-04-26 22:11:27,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4632887296. Throughput: 0: 50841.0. Samples: 2385742040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:27,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 22:11:27,600][49750] Updated weights for policy 0, policy_version 282771 (0.0031) [2024-04-26 22:11:31,123][49728] Signal inference workers to stop experience collection... (35850 times) [2024-04-26 22:11:31,165][49750] InferenceWorker_p0-w0: stopping experience collection (35850 times) [2024-04-26 22:11:31,189][49728] Signal inference workers to resume experience collection... (35850 times) [2024-04-26 22:11:31,190][49750] InferenceWorker_p0-w0: resuming experience collection (35850 times) [2024-04-26 22:11:31,193][49750] Updated weights for policy 0, policy_version 282781 (0.0033) [2024-04-26 22:11:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4633149440. Throughput: 0: 50899.3. Samples: 2386046840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:32,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 22:11:33,953][49750] Updated weights for policy 0, policy_version 282791 (0.0030) [2024-04-26 22:11:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4633395200. Throughput: 0: 50829.4. Samples: 2386191740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 22:11:37,497][49750] Updated weights for policy 0, policy_version 282801 (0.0032) [2024-04-26 22:11:40,294][49750] Updated weights for policy 0, policy_version 282811 (0.0034) [2024-04-26 22:11:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4633657344. Throughput: 0: 50623.6. Samples: 2386495260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:42,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 22:11:43,876][49750] Updated weights for policy 0, policy_version 282821 (0.0034) [2024-04-26 22:11:46,782][49750] Updated weights for policy 0, policy_version 282831 (0.0032) [2024-04-26 22:11:47,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4633919488. Throughput: 0: 50743.0. Samples: 2386803420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:47,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 22:11:50,415][49750] Updated weights for policy 0, policy_version 282841 (0.0033) [2024-04-26 22:11:52,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4634148864. Throughput: 0: 50862.3. Samples: 2386952320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:52,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 22:11:53,128][49750] Updated weights for policy 0, policy_version 282851 (0.0033) [2024-04-26 22:11:56,795][49750] Updated weights for policy 0, policy_version 282861 (0.0034) [2024-04-26 22:11:57,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50244.5, 300 sec: 50818.2). Total num frames: 4634394624. Throughput: 0: 50962.2. Samples: 2387261480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:11:57,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 22:11:59,485][49750] Updated weights for policy 0, policy_version 282871 (0.0029) [2024-04-26 22:12:02,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4634656768. Throughput: 0: 50776.5. Samples: 2387563360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:12:02,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 22:12:03,325][49750] Updated weights for policy 0, policy_version 282881 (0.0030) [2024-04-26 22:12:05,955][49750] Updated weights for policy 0, policy_version 282891 (0.0029) [2024-04-26 22:12:07,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4634918912. Throughput: 0: 50817.5. Samples: 2387716880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:12:07,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 22:12:09,708][49750] Updated weights for policy 0, policy_version 282901 (0.0034) [2024-04-26 22:12:12,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4635181056. Throughput: 0: 50736.7. Samples: 2388025200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:12:12,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:12:12,408][49750] Updated weights for policy 0, policy_version 282911 (0.0034) [2024-04-26 22:12:16,192][49750] Updated weights for policy 0, policy_version 282921 (0.0031) [2024-04-26 22:12:17,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4635426816. Throughput: 0: 50859.3. Samples: 2388335500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:12:17,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 22:12:18,736][49750] Updated weights for policy 0, policy_version 282931 (0.0034) [2024-04-26 22:12:22,063][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4635656192. Throughput: 0: 50918.7. Samples: 2388483080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:12:22,063][49517] Avg episode reward: [(0, '0.689')] [2024-04-26 22:12:22,531][49750] Updated weights for policy 0, policy_version 282941 (0.0033) [2024-04-26 22:12:25,142][49750] Updated weights for policy 0, policy_version 282951 (0.0029) [2024-04-26 22:12:27,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4635951104. Throughput: 0: 50795.3. Samples: 2388781040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:12:27,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 22:12:29,041][49750] Updated weights for policy 0, policy_version 282961 (0.0029) [2024-04-26 22:12:31,584][49750] Updated weights for policy 0, policy_version 282971 (0.0032) [2024-04-26 22:12:32,063][49517] Fps is (10 sec: 55704.9, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4636213248. Throughput: 0: 50836.8. Samples: 2389091080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:12:32,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 22:12:35,621][49750] Updated weights for policy 0, policy_version 282981 (0.0028) [2024-04-26 22:12:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4636442624. Throughput: 0: 51184.6. Samples: 2389255620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:12:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 22:12:37,851][49728] Signal inference workers to stop experience collection... (35900 times) [2024-04-26 22:12:37,901][49750] InferenceWorker_p0-w0: stopping experience collection (35900 times) [2024-04-26 22:12:37,919][49728] Signal inference workers to resume experience collection... (35900 times) [2024-04-26 22:12:37,927][49750] InferenceWorker_p0-w0: resuming experience collection (35900 times) [2024-04-26 22:12:38,050][49750] Updated weights for policy 0, policy_version 282991 (0.0031) [2024-04-26 22:12:41,988][49750] Updated weights for policy 0, policy_version 283001 (0.0033) [2024-04-26 22:12:42,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4636688384. Throughput: 0: 50912.9. Samples: 2389552580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:12:42,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 22:12:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000283001_4636688384.pth... [2024-04-26 22:12:42,133][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000282255_4624465920.pth [2024-04-26 22:12:44,509][49750] Updated weights for policy 0, policy_version 283011 (0.0032) [2024-04-26 22:12:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50984.8). Total num frames: 4636950528. Throughput: 0: 50834.8. Samples: 2389850920. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:12:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 22:12:48,526][49750] Updated weights for policy 0, policy_version 283021 (0.0031) [2024-04-26 22:12:50,933][49750] Updated weights for policy 0, policy_version 283031 (0.0029) [2024-04-26 22:12:52,062][49517] Fps is (10 sec: 54068.9, 60 sec: 51336.7, 300 sec: 50984.8). Total num frames: 4637229056. Throughput: 0: 50964.9. Samples: 2390010300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:12:52,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 22:12:54,994][49750] Updated weights for policy 0, policy_version 283041 (0.0027) [2024-04-26 22:12:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4637474816. Throughput: 0: 50790.8. Samples: 2390310780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:12:57,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 22:12:57,420][49750] Updated weights for policy 0, policy_version 283051 (0.0028) [2024-04-26 22:13:01,449][49750] Updated weights for policy 0, policy_version 283061 (0.0030) [2024-04-26 22:13:02,062][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4637704192. Throughput: 0: 50813.8. Samples: 2390622120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:13:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 22:13:04,012][49750] Updated weights for policy 0, policy_version 283071 (0.0035) [2024-04-26 22:13:07,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4637949952. Throughput: 0: 50698.4. Samples: 2390764500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:13:07,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 22:13:07,898][49750] Updated weights for policy 0, policy_version 283081 (0.0033) [2024-04-26 22:13:10,357][49750] Updated weights for policy 0, policy_version 283091 (0.0037) [2024-04-26 22:13:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 4638228480. Throughput: 0: 50808.8. Samples: 2391067440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:13:12,063][49517] Avg episode reward: [(0, '0.682')] [2024-04-26 22:13:14,280][49750] Updated weights for policy 0, policy_version 283101 (0.0031) [2024-04-26 22:13:16,759][49750] Updated weights for policy 0, policy_version 283111 (0.0029) [2024-04-26 22:13:17,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 4638490624. Throughput: 0: 50781.0. Samples: 2391376220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:13:17,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 22:13:20,575][49750] Updated weights for policy 0, policy_version 283121 (0.0029) [2024-04-26 22:13:22,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4638736384. Throughput: 0: 50806.1. Samples: 2391541900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:13:22,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 22:13:23,308][49750] Updated weights for policy 0, policy_version 283131 (0.0029) [2024-04-26 22:13:27,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4638965760. Throughput: 0: 50790.9. Samples: 2391838160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:13:27,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 22:13:27,108][49750] Updated weights for policy 0, policy_version 283141 (0.0025) [2024-04-26 22:13:29,833][49750] Updated weights for policy 0, policy_version 283151 (0.0032) [2024-04-26 22:13:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4639227904. Throughput: 0: 50749.8. Samples: 2392134660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-26 22:13:32,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 22:13:33,515][49750] Updated weights for policy 0, policy_version 283161 (0.0030) [2024-04-26 22:13:36,444][49750] Updated weights for policy 0, policy_version 283171 (0.0031) [2024-04-26 22:13:37,063][49517] Fps is (10 sec: 55705.3, 60 sec: 51336.4, 300 sec: 51095.8). Total num frames: 4639522816. Throughput: 0: 50714.1. Samples: 2392292440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:13:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 22:13:39,897][49750] Updated weights for policy 0, policy_version 283181 (0.0027) [2024-04-26 22:13:42,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.8, 300 sec: 50873.7). Total num frames: 4639768576. Throughput: 0: 51009.8. Samples: 2392606220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:13:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 22:13:42,723][49750] Updated weights for policy 0, policy_version 283191 (0.0033) [2024-04-26 22:13:46,322][49750] Updated weights for policy 0, policy_version 283201 (0.0032) [2024-04-26 22:13:47,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4639981568. Throughput: 0: 50767.2. Samples: 2392906640. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:13:47,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 22:13:47,257][49728] Signal inference workers to stop experience collection... (35950 times) [2024-04-26 22:13:47,283][49750] InferenceWorker_p0-w0: stopping experience collection (35950 times) [2024-04-26 22:13:47,369][49728] Signal inference workers to resume experience collection... (35950 times) [2024-04-26 22:13:47,369][49750] InferenceWorker_p0-w0: resuming experience collection (35950 times) [2024-04-26 22:13:49,078][49750] Updated weights for policy 0, policy_version 283211 (0.0028) [2024-04-26 22:13:52,062][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.1, 300 sec: 50818.2). Total num frames: 4640227328. Throughput: 0: 50824.7. Samples: 2393051620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:13:52,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 22:13:52,725][49750] Updated weights for policy 0, policy_version 283221 (0.0031) [2024-04-26 22:13:55,520][49750] Updated weights for policy 0, policy_version 283231 (0.0032) [2024-04-26 22:13:57,063][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4640522240. Throughput: 0: 50950.1. Samples: 2393360200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:13:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 22:13:59,108][49750] Updated weights for policy 0, policy_version 283241 (0.0029) [2024-04-26 22:14:02,009][49750] Updated weights for policy 0, policy_version 283251 (0.0032) [2024-04-26 22:14:02,063][49517] Fps is (10 sec: 55701.2, 60 sec: 51335.9, 300 sec: 50929.1). Total num frames: 4640784384. Throughput: 0: 50889.8. Samples: 2393666300. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:14:02,064][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 22:14:05,437][49750] Updated weights for policy 0, policy_version 283261 (0.0030) [2024-04-26 22:14:07,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51609.5, 300 sec: 50929.3). Total num frames: 4641046528. Throughput: 0: 50871.1. Samples: 2393831100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:14:07,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 22:14:08,495][49750] Updated weights for policy 0, policy_version 283271 (0.0028) [2024-04-26 22:14:11,976][49750] Updated weights for policy 0, policy_version 283281 (0.0029) [2024-04-26 22:14:12,062][49517] Fps is (10 sec: 49155.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4641275904. Throughput: 0: 50949.8. Samples: 2394130900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:14:12,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 22:14:14,808][49750] Updated weights for policy 0, policy_version 283291 (0.0032) [2024-04-26 22:14:17,062][49517] Fps is (10 sec: 45875.6, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4641505280. Throughput: 0: 51101.3. Samples: 2394434220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:14:17,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 22:14:18,429][49750] Updated weights for policy 0, policy_version 283301 (0.0031) [2024-04-26 22:14:21,166][49750] Updated weights for policy 0, policy_version 283311 (0.0029) [2024-04-26 22:14:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 4641783808. Throughput: 0: 50752.5. Samples: 2394576300. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:14:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:14:24,891][49750] Updated weights for policy 0, policy_version 283321 (0.0033) [2024-04-26 22:14:27,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 4642062336. Throughput: 0: 50629.3. Samples: 2394884540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:14:27,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 22:14:27,633][49750] Updated weights for policy 0, policy_version 283331 (0.0028) [2024-04-26 22:14:31,249][49750] Updated weights for policy 0, policy_version 283341 (0.0028) [2024-04-26 22:14:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4642291712. Throughput: 0: 50794.7. Samples: 2395192400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:14:32,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 22:14:34,142][49750] Updated weights for policy 0, policy_version 283351 (0.0030) [2024-04-26 22:14:37,062][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4642521088. Throughput: 0: 50840.1. Samples: 2395339420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:14:37,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 22:14:37,740][49750] Updated weights for policy 0, policy_version 283361 (0.0027) [2024-04-26 22:14:40,692][49750] Updated weights for policy 0, policy_version 283371 (0.0030) [2024-04-26 22:14:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4642799616. Throughput: 0: 50704.9. Samples: 2395641920. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-26 22:14:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 22:14:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000283374_4642799616.pth... 
[2024-04-26 22:14:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000282630_4630609920.pth [2024-04-26 22:14:44,162][49750] Updated weights for policy 0, policy_version 283381 (0.0028) [2024-04-26 22:14:46,388][49728] Signal inference workers to stop experience collection... (36000 times) [2024-04-26 22:14:46,388][49728] Signal inference workers to resume experience collection... (36000 times) [2024-04-26 22:14:46,408][49750] InferenceWorker_p0-w0: stopping experience collection (36000 times) [2024-04-26 22:14:46,408][49750] InferenceWorker_p0-w0: resuming experience collection (36000 times) [2024-04-26 22:14:47,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 4643061760. Throughput: 0: 50648.8. Samples: 2395945460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:14:47,072][49517] Avg episode reward: [(0, '0.686')] [2024-04-26 22:14:47,313][49750] Updated weights for policy 0, policy_version 283391 (0.0036) [2024-04-26 22:14:50,708][49750] Updated weights for policy 0, policy_version 283401 (0.0030) [2024-04-26 22:14:52,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4643323904. Throughput: 0: 50639.5. Samples: 2396109880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:14:52,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 22:14:53,945][49750] Updated weights for policy 0, policy_version 283411 (0.0028) [2024-04-26 22:14:57,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4643553280. Throughput: 0: 50704.1. Samples: 2396412580. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:14:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 22:14:57,136][49750] Updated weights for policy 0, policy_version 283421 (0.0031) [2024-04-26 22:15:00,470][49750] Updated weights for policy 0, policy_version 283431 (0.0036) [2024-04-26 22:15:02,062][49517] Fps is (10 sec: 45875.8, 60 sec: 49971.9, 300 sec: 50651.6). Total num frames: 4643782656. Throughput: 0: 50466.2. Samples: 2396705200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:02,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 22:15:03,567][49750] Updated weights for policy 0, policy_version 283441 (0.0033) [2024-04-26 22:15:06,817][49750] Updated weights for policy 0, policy_version 283451 (0.0032) [2024-04-26 22:15:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 4644061184. Throughput: 0: 50594.7. Samples: 2396853060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 22:15:10,060][49750] Updated weights for policy 0, policy_version 283461 (0.0028) [2024-04-26 22:15:12,063][49517] Fps is (10 sec: 54066.1, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4644323328. Throughput: 0: 50430.9. Samples: 2397153940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:12,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 22:15:13,336][49750] Updated weights for policy 0, policy_version 283471 (0.0037) [2024-04-26 22:15:16,487][49750] Updated weights for policy 0, policy_version 283481 (0.0029) [2024-04-26 22:15:17,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4644585472. Throughput: 0: 50457.3. Samples: 2397462980. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:15:19,961][49750] Updated weights for policy 0, policy_version 283491 (0.0033) [2024-04-26 22:15:22,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4644798464. Throughput: 0: 50613.7. Samples: 2397617040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:22,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:15:23,005][49750] Updated weights for policy 0, policy_version 283501 (0.0031) [2024-04-26 22:15:26,368][49750] Updated weights for policy 0, policy_version 283511 (0.0031) [2024-04-26 22:15:27,063][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.1, 300 sec: 50651.5). Total num frames: 4645060608. Throughput: 0: 50613.8. Samples: 2397919540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:27,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 22:15:29,464][49750] Updated weights for policy 0, policy_version 283521 (0.0033) [2024-04-26 22:15:32,063][49517] Fps is (10 sec: 54066.3, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 4645339136. Throughput: 0: 50552.3. Samples: 2398220320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 22:15:32,700][49750] Updated weights for policy 0, policy_version 283531 (0.0034) [2024-04-26 22:15:35,885][49750] Updated weights for policy 0, policy_version 283541 (0.0033) [2024-04-26 22:15:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4645584896. Throughput: 0: 50567.2. Samples: 2398385400. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:37,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 22:15:39,082][49750] Updated weights for policy 0, policy_version 283551 (0.0031) [2024-04-26 22:15:42,062][49517] Fps is (10 sec: 49153.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4645830656. Throughput: 0: 50607.5. Samples: 2398689920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:42,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 22:15:42,321][49750] Updated weights for policy 0, policy_version 283561 (0.0032) [2024-04-26 22:15:45,634][49750] Updated weights for policy 0, policy_version 283571 (0.0033) [2024-04-26 22:15:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4646076416. Throughput: 0: 50736.4. Samples: 2398988340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 22:15:48,685][49750] Updated weights for policy 0, policy_version 283581 (0.0035) [2024-04-26 22:15:52,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4646338560. Throughput: 0: 50754.0. Samples: 2399137000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:52,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-26 22:15:52,181][49750] Updated weights for policy 0, policy_version 283591 (0.0033) [2024-04-26 22:15:54,888][49728] Signal inference workers to stop experience collection... (36050 times) [2024-04-26 22:15:54,888][49728] Signal inference workers to resume experience collection... 
(36050 times) [2024-04-26 22:15:54,913][49750] InferenceWorker_p0-w0: stopping experience collection (36050 times) [2024-04-26 22:15:54,913][49750] InferenceWorker_p0-w0: resuming experience collection (36050 times) [2024-04-26 22:15:55,015][49750] Updated weights for policy 0, policy_version 283601 (0.0032) [2024-04-26 22:15:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 4646600704. Throughput: 0: 50826.3. Samples: 2399441120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:15:57,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 22:15:58,661][49750] Updated weights for policy 0, policy_version 283611 (0.0029) [2024-04-26 22:16:01,412][49750] Updated weights for policy 0, policy_version 283621 (0.0035) [2024-04-26 22:16:02,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4646862848. Throughput: 0: 50589.4. Samples: 2399739500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:02,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 22:16:05,098][49750] Updated weights for policy 0, policy_version 283631 (0.0026) [2024-04-26 22:16:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4647092224. Throughput: 0: 50848.5. Samples: 2399905220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 22:16:07,855][49750] Updated weights for policy 0, policy_version 283641 (0.0033) [2024-04-26 22:16:11,527][49750] Updated weights for policy 0, policy_version 283651 (0.0027) [2024-04-26 22:16:12,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4647337984. Throughput: 0: 50915.6. Samples: 2400210740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:12,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 22:16:14,436][49750] Updated weights for policy 0, policy_version 283661 (0.0031) [2024-04-26 22:16:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4647600128. Throughput: 0: 50782.5. Samples: 2400505520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:17,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 22:16:18,018][49750] Updated weights for policy 0, policy_version 283671 (0.0029) [2024-04-26 22:16:21,280][49750] Updated weights for policy 0, policy_version 283681 (0.0027) [2024-04-26 22:16:22,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4647878656. Throughput: 0: 50777.9. Samples: 2400670400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 22:16:24,524][49750] Updated weights for policy 0, policy_version 283691 (0.0036) [2024-04-26 22:16:27,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4648124416. Throughput: 0: 50835.3. Samples: 2400977520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:27,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 22:16:27,612][49750] Updated weights for policy 0, policy_version 283701 (0.0027) [2024-04-26 22:16:31,068][49750] Updated weights for policy 0, policy_version 283711 (0.0028) [2024-04-26 22:16:32,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4648370176. Throughput: 0: 51036.9. Samples: 2401285000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:32,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 22:16:34,040][49750] Updated weights for policy 0, policy_version 283721 (0.0038) [2024-04-26 22:16:37,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4648632320. Throughput: 0: 50903.2. Samples: 2401427640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:37,063][49517] Avg episode reward: [(0, '0.470')] [2024-04-26 22:16:37,730][49750] Updated weights for policy 0, policy_version 283731 (0.0033) [2024-04-26 22:16:40,552][49750] Updated weights for policy 0, policy_version 283741 (0.0030) [2024-04-26 22:16:42,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4648894464. Throughput: 0: 50973.7. Samples: 2401734940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:42,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 22:16:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000283746_4648894464.pth... [2024-04-26 22:16:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000283001_4636688384.pth [2024-04-26 22:16:44,133][49750] Updated weights for policy 0, policy_version 283751 (0.0029) [2024-04-26 22:16:46,869][49750] Updated weights for policy 0, policy_version 283761 (0.0034) [2024-04-26 22:16:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4649140224. Throughput: 0: 51096.0. Samples: 2402038820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 22:16:50,645][49750] Updated weights for policy 0, policy_version 283771 (0.0032) [2024-04-26 22:16:52,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4649402368. Throughput: 0: 50837.9. Samples: 2402192920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:52,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 22:16:53,159][49750] Updated weights for policy 0, policy_version 283781 (0.0030) [2024-04-26 22:16:57,062][49750] Updated weights for policy 0, policy_version 283791 (0.0030) [2024-04-26 22:16:57,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4649631744. Throughput: 0: 50931.9. Samples: 2402502680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:16:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:16:59,634][49750] Updated weights for policy 0, policy_version 283801 (0.0032) [2024-04-26 22:17:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4649893888. Throughput: 0: 51008.0. Samples: 2402800880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:17:02,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 22:17:03,534][49750] Updated weights for policy 0, policy_version 283811 (0.0032) [2024-04-26 22:17:04,163][49728] Signal inference workers to stop experience collection... (36100 times) [2024-04-26 22:17:04,204][49750] InferenceWorker_p0-w0: stopping experience collection (36100 times) [2024-04-26 22:17:04,222][49728] Signal inference workers to resume experience collection... (36100 times) [2024-04-26 22:17:04,223][49750] InferenceWorker_p0-w0: resuming experience collection (36100 times) [2024-04-26 22:17:06,140][49750] Updated weights for policy 0, policy_version 283821 (0.0029) [2024-04-26 22:17:07,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4650172416. Throughput: 0: 50762.1. Samples: 2402954700. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-26 22:17:07,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 22:17:10,002][49750] Updated weights for policy 0, policy_version 283831 (0.0032) [2024-04-26 22:17:12,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4650418176. Throughput: 0: 50783.6. Samples: 2403262780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 22:17:12,492][49750] Updated weights for policy 0, policy_version 283841 (0.0031) [2024-04-26 22:17:16,521][49750] Updated weights for policy 0, policy_version 283851 (0.0029) [2024-04-26 22:17:17,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4650631168. Throughput: 0: 50753.8. Samples: 2403568920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:17,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 22:17:18,981][49750] Updated weights for policy 0, policy_version 283861 (0.0034) [2024-04-26 22:17:22,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4650909696. Throughput: 0: 50828.9. Samples: 2403714940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:22,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 22:17:22,957][49750] Updated weights for policy 0, policy_version 283871 (0.0037) [2024-04-26 22:17:25,401][49750] Updated weights for policy 0, policy_version 283881 (0.0031) [2024-04-26 22:17:27,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4651171840. Throughput: 0: 50609.0. Samples: 2404012340. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:27,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 22:17:29,249][49750] Updated weights for policy 0, policy_version 283891 (0.0032) [2024-04-26 22:17:31,966][49750] Updated weights for policy 0, policy_version 283901 (0.0027) [2024-04-26 22:17:32,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4651433984. Throughput: 0: 50727.5. Samples: 2404321560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:32,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 22:17:35,624][49750] Updated weights for policy 0, policy_version 283911 (0.0038) [2024-04-26 22:17:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4651696128. Throughput: 0: 50805.6. Samples: 2404479180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:37,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 22:17:38,598][49750] Updated weights for policy 0, policy_version 283921 (0.0030) [2024-04-26 22:17:42,048][49750] Updated weights for policy 0, policy_version 283931 (0.0032) [2024-04-26 22:17:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4651925504. Throughput: 0: 50691.3. Samples: 2404783780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:42,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 22:17:44,933][49750] Updated weights for policy 0, policy_version 283941 (0.0028) [2024-04-26 22:17:47,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4652171264. Throughput: 0: 50833.7. Samples: 2405088400. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:47,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 22:17:48,622][49750] Updated weights for policy 0, policy_version 283951 (0.0036) [2024-04-26 22:17:51,247][49750] Updated weights for policy 0, policy_version 283961 (0.0029) [2024-04-26 22:17:52,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4652466176. Throughput: 0: 50715.5. Samples: 2405236900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:52,072][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 22:17:54,910][49750] Updated weights for policy 0, policy_version 283971 (0.0038) [2024-04-26 22:17:57,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4652711936. Throughput: 0: 50830.8. Samples: 2405550160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:17:57,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 22:17:57,732][49750] Updated weights for policy 0, policy_version 283981 (0.0033) [2024-04-26 22:18:01,444][49750] Updated weights for policy 0, policy_version 283991 (0.0033) [2024-04-26 22:18:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4652957696. Throughput: 0: 50827.0. Samples: 2405856140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:18:02,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 22:18:04,169][49750] Updated weights for policy 0, policy_version 284001 (0.0032) [2024-04-26 22:18:07,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4653187072. Throughput: 0: 50768.3. Samples: 2405999520. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:18:07,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 22:18:07,832][49750] Updated weights for policy 0, policy_version 284011 (0.0029) [2024-04-26 22:18:10,643][49750] Updated weights for policy 0, policy_version 284021 (0.0025) [2024-04-26 22:18:12,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4653465600. Throughput: 0: 51107.7. Samples: 2406312180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:18:12,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 22:18:14,038][49728] Signal inference workers to stop experience collection... (36150 times) [2024-04-26 22:18:14,038][49728] Signal inference workers to resume experience collection... (36150 times) [2024-04-26 22:18:14,050][49750] InferenceWorker_p0-w0: stopping experience collection (36150 times) [2024-04-26 22:18:14,070][49750] InferenceWorker_p0-w0: resuming experience collection (36150 times) [2024-04-26 22:18:14,168][49750] Updated weights for policy 0, policy_version 284031 (0.0034) [2024-04-26 22:18:17,005][49750] Updated weights for policy 0, policy_version 284041 (0.0032) [2024-04-26 22:18:17,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4653727744. Throughput: 0: 50875.9. Samples: 2406610980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:18:17,063][49517] Avg episode reward: [(0, '0.678')] [2024-04-26 22:18:20,669][49750] Updated weights for policy 0, policy_version 284051 (0.0031) [2024-04-26 22:18:22,063][49517] Fps is (10 sec: 50789.2, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4653973504. Throughput: 0: 50912.0. Samples: 2406770220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-04-26 22:18:22,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 22:18:23,491][49750] Updated weights for policy 0, policy_version 284061 (0.0034) [2024-04-26 22:18:27,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4654202880. Throughput: 0: 50840.0. Samples: 2407071580. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:18:27,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 22:18:27,125][49750] Updated weights for policy 0, policy_version 284071 (0.0035) [2024-04-26 22:18:29,871][49750] Updated weights for policy 0, policy_version 284081 (0.0027) [2024-04-26 22:18:32,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4654465024. Throughput: 0: 50814.2. Samples: 2407375040. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:18:32,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 22:18:33,421][49750] Updated weights for policy 0, policy_version 284091 (0.0030) [2024-04-26 22:18:36,396][49750] Updated weights for policy 0, policy_version 284101 (0.0032) [2024-04-26 22:18:37,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4654743552. Throughput: 0: 50889.4. Samples: 2407526920. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:18:37,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 22:18:39,858][49750] Updated weights for policy 0, policy_version 284111 (0.0039) [2024-04-26 22:18:42,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4655005696. Throughput: 0: 50780.5. Samples: 2407835280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:18:42,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 22:18:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000284119_4655005696.pth... 
[2024-04-26 22:18:42,113][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000283374_4642799616.pth [2024-04-26 22:18:42,805][49750] Updated weights for policy 0, policy_version 284121 (0.0033) [2024-04-26 22:18:46,305][49750] Updated weights for policy 0, policy_version 284131 (0.0036) [2024-04-26 22:18:47,063][49517] Fps is (10 sec: 49151.1, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4655235072. Throughput: 0: 50795.8. Samples: 2408141960. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:18:47,064][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 22:18:49,178][49750] Updated weights for policy 0, policy_version 284141 (0.0029) [2024-04-26 22:18:52,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4655480832. Throughput: 0: 50894.4. Samples: 2408289760. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:18:52,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 22:18:52,562][49750] Updated weights for policy 0, policy_version 284151 (0.0029) [2024-04-26 22:18:55,676][49750] Updated weights for policy 0, policy_version 284161 (0.0033) [2024-04-26 22:18:57,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50707.2). Total num frames: 4655742976. Throughput: 0: 50805.0. Samples: 2408598420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:18:57,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 22:18:58,852][49750] Updated weights for policy 0, policy_version 284171 (0.0028) [2024-04-26 22:19:01,968][49750] Updated weights for policy 0, policy_version 284181 (0.0034) [2024-04-26 22:19:02,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4656021504. Throughput: 0: 50936.0. Samples: 2408903100. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:19:02,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 22:19:05,307][49750] Updated weights for policy 0, policy_version 284191 (0.0040) [2024-04-26 22:19:07,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4656267264. Throughput: 0: 50953.3. Samples: 2409063120. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:19:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 22:19:08,483][49750] Updated weights for policy 0, policy_version 284201 (0.0030) [2024-04-26 22:19:11,968][49750] Updated weights for policy 0, policy_version 284211 (0.0031) [2024-04-26 22:19:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 4656513024. Throughput: 0: 50978.9. Samples: 2409365640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:19:12,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 22:19:14,794][49750] Updated weights for policy 0, policy_version 284221 (0.0038) [2024-04-26 22:19:17,063][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4656775168. Throughput: 0: 51100.4. Samples: 2409674560. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:19:17,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 22:19:18,313][49750] Updated weights for policy 0, policy_version 284231 (0.0027) [2024-04-26 22:19:21,353][49750] Updated weights for policy 0, policy_version 284241 (0.0033) [2024-04-26 22:19:22,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4657020928. Throughput: 0: 50979.9. Samples: 2409821020. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:19:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 22:19:24,727][49750] Updated weights for policy 0, policy_version 284251 (0.0032) [2024-04-26 22:19:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4657283072. Throughput: 0: 50972.1. Samples: 2410129020. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:19:27,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 22:19:27,653][49750] Updated weights for policy 0, policy_version 284261 (0.0036) [2024-04-26 22:19:31,161][49750] Updated weights for policy 0, policy_version 284271 (0.0031) [2024-04-26 22:19:31,802][49728] Signal inference workers to stop experience collection... (36200 times) [2024-04-26 22:19:31,821][49750] InferenceWorker_p0-w0: stopping experience collection (36200 times) [2024-04-26 22:19:31,866][49728] Signal inference workers to resume experience collection... (36200 times) [2024-04-26 22:19:31,866][49750] InferenceWorker_p0-w0: resuming experience collection (36200 times) [2024-04-26 22:19:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4657545216. Throughput: 0: 50948.6. Samples: 2410434640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-04-26 22:19:32,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 22:19:34,173][49750] Updated weights for policy 0, policy_version 284281 (0.0033) [2024-04-26 22:19:37,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4657790976. Throughput: 0: 50922.0. Samples: 2410581260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:19:37,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 22:19:37,501][49750] Updated weights for policy 0, policy_version 284291 (0.0031) [2024-04-26 22:19:40,678][49750] Updated weights for policy 0, policy_version 284301 (0.0032) [2024-04-26 22:19:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4658053120. Throughput: 0: 50884.7. Samples: 2410888220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:19:42,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:19:44,156][49750] Updated weights for policy 0, policy_version 284311 (0.0032) [2024-04-26 22:19:47,027][49750] Updated weights for policy 0, policy_version 284321 (0.0029) [2024-04-26 22:19:47,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4658315264. Throughput: 0: 50968.5. Samples: 2411196680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:19:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:19:50,497][49750] Updated weights for policy 0, policy_version 284331 (0.0035) [2024-04-26 22:19:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4658544640. Throughput: 0: 50829.1. Samples: 2411350420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:19:52,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 22:19:53,359][49750] Updated weights for policy 0, policy_version 284341 (0.0033) [2024-04-26 22:19:56,931][49750] Updated weights for policy 0, policy_version 284351 (0.0030) [2024-04-26 22:19:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4658806784. Throughput: 0: 50792.5. Samples: 2411651300. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:19:57,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 22:19:59,861][49750] Updated weights for policy 0, policy_version 284361 (0.0028) [2024-04-26 22:20:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4659052544. Throughput: 0: 50712.9. Samples: 2411956640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 22:20:03,271][49750] Updated weights for policy 0, policy_version 284371 (0.0028) [2024-04-26 22:20:06,511][49750] Updated weights for policy 0, policy_version 284381 (0.0031) [2024-04-26 22:20:07,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4659314688. Throughput: 0: 50855.6. Samples: 2412109520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:07,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 22:20:09,772][49750] Updated weights for policy 0, policy_version 284391 (0.0029) [2024-04-26 22:20:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4659560448. Throughput: 0: 50782.9. Samples: 2412414260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:12,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 22:20:12,953][49750] Updated weights for policy 0, policy_version 284401 (0.0033) [2024-04-26 22:20:16,257][49750] Updated weights for policy 0, policy_version 284411 (0.0037) [2024-04-26 22:20:17,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4659822592. Throughput: 0: 50825.7. Samples: 2412721800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:20:19,534][49750] Updated weights for policy 0, policy_version 284421 (0.0030) [2024-04-26 22:20:22,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4660084736. Throughput: 0: 50954.7. Samples: 2412874220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:22,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 22:20:22,567][49750] Updated weights for policy 0, policy_version 284431 (0.0031) [2024-04-26 22:20:26,076][49750] Updated weights for policy 0, policy_version 284441 (0.0038) [2024-04-26 22:20:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4660330496. Throughput: 0: 50868.8. Samples: 2413177320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:27,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 22:20:29,037][49750] Updated weights for policy 0, policy_version 284451 (0.0033) [2024-04-26 22:20:32,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4660576256. Throughput: 0: 50813.1. Samples: 2413483260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:32,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 22:20:32,481][49750] Updated weights for policy 0, policy_version 284461 (0.0027) [2024-04-26 22:20:35,675][49750] Updated weights for policy 0, policy_version 284471 (0.0029) [2024-04-26 22:20:37,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4660822016. Throughput: 0: 50727.2. Samples: 2413633140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:37,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 22:20:38,848][49750] Updated weights for policy 0, policy_version 284481 (0.0033) [2024-04-26 22:20:42,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.1, 300 sec: 50873.7). Total num frames: 4661084160. Throughput: 0: 50781.6. Samples: 2413936480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:42,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 22:20:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000284490_4661084160.pth... [2024-04-26 22:20:42,139][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000283746_4648894464.pth [2024-04-26 22:20:42,285][49750] Updated weights for policy 0, policy_version 284491 (0.0033) [2024-04-26 22:20:45,406][49750] Updated weights for policy 0, policy_version 284501 (0.0032) [2024-04-26 22:20:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 4661346304. Throughput: 0: 50783.2. Samples: 2414241880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 22:20:47,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 22:20:48,691][49750] Updated weights for policy 0, policy_version 284511 (0.0028) [2024-04-26 22:20:51,785][49750] Updated weights for policy 0, policy_version 284521 (0.0030) [2024-04-26 22:20:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4661592064. Throughput: 0: 50890.7. Samples: 2414399600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:20:52,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:20:52,813][49728] Signal inference workers to stop experience collection... (36250 times) [2024-04-26 22:20:52,814][49728] Signal inference workers to resume experience collection... 
(36250 times) [2024-04-26 22:20:52,839][49750] InferenceWorker_p0-w0: stopping experience collection (36250 times) [2024-04-26 22:20:52,840][49750] InferenceWorker_p0-w0: resuming experience collection (36250 times) [2024-04-26 22:20:55,004][49750] Updated weights for policy 0, policy_version 284531 (0.0034) [2024-04-26 22:20:57,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4661837824. Throughput: 0: 50894.5. Samples: 2414704500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:20:57,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 22:20:58,226][49750] Updated weights for policy 0, policy_version 284541 (0.0038) [2024-04-26 22:21:01,395][49750] Updated weights for policy 0, policy_version 284551 (0.0040) [2024-04-26 22:21:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4662083584. Throughput: 0: 50577.4. Samples: 2414997780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 22:21:04,686][49750] Updated weights for policy 0, policy_version 284561 (0.0026) [2024-04-26 22:21:07,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4662378496. Throughput: 0: 50714.2. Samples: 2415156360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:21:07,910][49750] Updated weights for policy 0, policy_version 284571 (0.0031) [2024-04-26 22:21:11,186][49750] Updated weights for policy 0, policy_version 284581 (0.0028) [2024-04-26 22:21:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4662607872. Throughput: 0: 50808.6. Samples: 2415463700. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:12,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-26 22:21:14,476][49750] Updated weights for policy 0, policy_version 284591 (0.0035) [2024-04-26 22:21:17,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4662853632. Throughput: 0: 50755.1. Samples: 2415767240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:17,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 22:21:17,537][49750] Updated weights for policy 0, policy_version 284601 (0.0033) [2024-04-26 22:21:21,257][49750] Updated weights for policy 0, policy_version 284611 (0.0028) [2024-04-26 22:21:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 4663099392. Throughput: 0: 50640.0. Samples: 2415911940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:22,063][49517] Avg episode reward: [(0, '0.683')] [2024-04-26 22:21:23,921][49750] Updated weights for policy 0, policy_version 284621 (0.0033) [2024-04-26 22:21:27,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4663377920. Throughput: 0: 50659.1. Samples: 2416216140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 22:21:27,630][49750] Updated weights for policy 0, policy_version 284631 (0.0030) [2024-04-26 22:21:30,429][49750] Updated weights for policy 0, policy_version 284641 (0.0029) [2024-04-26 22:21:32,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 4663656448. Throughput: 0: 50726.1. Samples: 2416524560. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:32,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:21:33,971][49750] Updated weights for policy 0, policy_version 284651 (0.0030) [2024-04-26 22:21:36,851][49750] Updated weights for policy 0, policy_version 284661 (0.0028) [2024-04-26 22:21:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4663902208. Throughput: 0: 50757.4. Samples: 2416683680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:37,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 22:21:40,401][49750] Updated weights for policy 0, policy_version 284671 (0.0032) [2024-04-26 22:21:42,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4664131584. Throughput: 0: 50647.3. Samples: 2416983640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:42,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 22:21:43,220][49750] Updated weights for policy 0, policy_version 284681 (0.0033) [2024-04-26 22:21:47,011][49750] Updated weights for policy 0, policy_version 284691 (0.0034) [2024-04-26 22:21:47,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4664377344. Throughput: 0: 50975.9. Samples: 2417291700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:47,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 22:21:49,836][49750] Updated weights for policy 0, policy_version 284701 (0.0029) [2024-04-26 22:21:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 4664655872. Throughput: 0: 50791.1. Samples: 2417441960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:52,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 22:21:53,316][49750] Updated weights for policy 0, policy_version 284711 (0.0038) [2024-04-26 22:21:56,264][49750] Updated weights for policy 0, policy_version 284721 (0.0031) [2024-04-26 22:21:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4664901632. Throughput: 0: 50758.1. Samples: 2417747820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 22:21:57,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 22:21:59,589][49750] Updated weights for policy 0, policy_version 284731 (0.0028) [2024-04-26 22:22:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4665147392. Throughput: 0: 50795.9. Samples: 2418053060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:02,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:22:02,674][49750] Updated weights for policy 0, policy_version 284741 (0.0033) [2024-04-26 22:22:03,628][49728] Signal inference workers to stop experience collection... (36300 times) [2024-04-26 22:22:03,668][49750] InferenceWorker_p0-w0: stopping experience collection (36300 times) [2024-04-26 22:22:03,701][49728] Signal inference workers to resume experience collection... (36300 times) [2024-04-26 22:22:03,702][49750] InferenceWorker_p0-w0: resuming experience collection (36300 times) [2024-04-26 22:22:06,046][49750] Updated weights for policy 0, policy_version 284751 (0.0030) [2024-04-26 22:22:07,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4665376768. Throughput: 0: 50826.3. Samples: 2418199120. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:07,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 22:22:09,054][49750] Updated weights for policy 0, policy_version 284761 (0.0024) [2024-04-26 22:22:12,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.2, 300 sec: 50929.2). Total num frames: 4665655296. Throughput: 0: 50948.9. Samples: 2418508840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 22:22:12,459][49750] Updated weights for policy 0, policy_version 284771 (0.0029) [2024-04-26 22:22:15,473][49750] Updated weights for policy 0, policy_version 284781 (0.0034) [2024-04-26 22:22:17,062][49517] Fps is (10 sec: 55705.3, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4665933824. Throughput: 0: 50833.9. Samples: 2418812080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:17,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 22:22:19,233][49750] Updated weights for policy 0, policy_version 284791 (0.0030) [2024-04-26 22:22:21,875][49750] Updated weights for policy 0, policy_version 284801 (0.0028) [2024-04-26 22:22:22,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4666179584. Throughput: 0: 50961.7. Samples: 2418976960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 22:22:25,828][49750] Updated weights for policy 0, policy_version 284811 (0.0034) [2024-04-26 22:22:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4666425344. Throughput: 0: 51029.1. Samples: 2419279940. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:27,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 22:22:28,232][49750] Updated weights for policy 0, policy_version 284821 (0.0031) [2024-04-26 22:22:32,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4666654720. Throughput: 0: 50809.3. Samples: 2419578120. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:32,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 22:22:32,142][49750] Updated weights for policy 0, policy_version 284831 (0.0028) [2024-04-26 22:22:34,628][49750] Updated weights for policy 0, policy_version 284841 (0.0031) [2024-04-26 22:22:37,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4666933248. Throughput: 0: 50785.7. Samples: 2419727320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:37,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:22:38,624][49750] Updated weights for policy 0, policy_version 284851 (0.0028) [2024-04-26 22:22:41,135][49750] Updated weights for policy 0, policy_version 284861 (0.0034) [2024-04-26 22:22:42,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4667211776. Throughput: 0: 50824.3. Samples: 2420034920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 22:22:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000284864_4667211776.pth... [2024-04-26 22:22:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000284119_4655005696.pth [2024-04-26 22:22:45,122][49750] Updated weights for policy 0, policy_version 284871 (0.0041) [2024-04-26 22:22:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4667441152. Throughput: 0: 50889.7. Samples: 2420343100. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:47,063][49517] Avg episode reward: [(0, '0.676')] [2024-04-26 22:22:47,682][49750] Updated weights for policy 0, policy_version 284881 (0.0034) [2024-04-26 22:22:51,592][49750] Updated weights for policy 0, policy_version 284891 (0.0032) [2024-04-26 22:22:52,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4667670528. Throughput: 0: 50937.2. Samples: 2420491300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 22:22:54,074][49750] Updated weights for policy 0, policy_version 284901 (0.0032) [2024-04-26 22:22:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4667949056. Throughput: 0: 50753.1. Samples: 2420792720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:22:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:22:57,906][49750] Updated weights for policy 0, policy_version 284911 (0.0028) [2024-04-26 22:23:00,504][49750] Updated weights for policy 0, policy_version 284921 (0.0028) [2024-04-26 22:23:02,063][49517] Fps is (10 sec: 55704.9, 60 sec: 51336.3, 300 sec: 50984.8). Total num frames: 4668227584. Throughput: 0: 50815.3. Samples: 2421098780. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:23:02,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 22:23:04,201][49750] Updated weights for policy 0, policy_version 284931 (0.0031) [2024-04-26 22:23:06,962][49750] Updated weights for policy 0, policy_version 284941 (0.0024) [2024-04-26 22:23:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4668473344. Throughput: 0: 50833.9. Samples: 2421264480. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:23:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 22:23:10,745][49750] Updated weights for policy 0, policy_version 284951 (0.0034) [2024-04-26 22:23:12,062][49517] Fps is (10 sec: 49153.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4668719104. Throughput: 0: 50994.3. Samples: 2421574680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-26 22:23:12,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:23:13,432][49750] Updated weights for policy 0, policy_version 284961 (0.0033) [2024-04-26 22:23:17,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4668948480. Throughput: 0: 50946.2. Samples: 2421870700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:23:17,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 22:23:17,134][49750] Updated weights for policy 0, policy_version 284971 (0.0031) [2024-04-26 22:23:18,133][49728] Signal inference workers to stop experience collection... (36350 times) [2024-04-26 22:23:18,153][49750] InferenceWorker_p0-w0: stopping experience collection (36350 times) [2024-04-26 22:23:18,203][49728] Signal inference workers to resume experience collection... (36350 times) [2024-04-26 22:23:18,210][49750] InferenceWorker_p0-w0: resuming experience collection (36350 times) [2024-04-26 22:23:20,005][49750] Updated weights for policy 0, policy_version 284981 (0.0030) [2024-04-26 22:23:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 4669210624. Throughput: 0: 50950.4. Samples: 2422020080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:23:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:23:23,586][49750] Updated weights for policy 0, policy_version 284991 (0.0031) [2024-04-26 22:23:26,298][49750] Updated weights for policy 0, policy_version 285001 (0.0031) [2024-04-26 22:23:27,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4669472768. Throughput: 0: 50904.5. Samples: 2422325620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:23:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 22:23:29,932][49750] Updated weights for policy 0, policy_version 285011 (0.0030) [2024-04-26 22:23:32,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4669734912. Throughput: 0: 50810.7. Samples: 2422629580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:23:32,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 22:23:32,652][49750] Updated weights for policy 0, policy_version 285021 (0.0029) [2024-04-26 22:23:36,345][49750] Updated weights for policy 0, policy_version 285031 (0.0035) [2024-04-26 22:23:37,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4669964288. Throughput: 0: 50782.1. Samples: 2422776500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:23:37,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 22:23:39,144][49750] Updated weights for policy 0, policy_version 285041 (0.0037) [2024-04-26 22:23:42,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4670226432. Throughput: 0: 50857.3. Samples: 2423081300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:23:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 22:23:42,882][49750] Updated weights for policy 0, policy_version 285051 (0.0034) [2024-04-26 22:23:45,612][49750] Updated weights for policy 0, policy_version 285061 (0.0035) [2024-04-26 22:23:47,063][49517] Fps is (10 sec: 55706.2, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4670521344. Throughput: 0: 50945.9. Samples: 2423391340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:23:47,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 22:23:49,163][49750] Updated weights for policy 0, policy_version 285071 (0.0030) [2024-04-26 22:23:51,955][49750] Updated weights for policy 0, policy_version 285081 (0.0036) [2024-04-26 22:23:52,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4670767104. Throughput: 0: 50715.5. Samples: 2423546680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:23:52,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 22:23:55,662][49750] Updated weights for policy 0, policy_version 285091 (0.0031) [2024-04-26 22:23:57,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4670996480. Throughput: 0: 50493.6. Samples: 2423846900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:23:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 22:23:58,328][49750] Updated weights for policy 0, policy_version 285101 (0.0030) [2024-04-26 22:24:02,059][49750] Updated weights for policy 0, policy_version 285111 (0.0028) [2024-04-26 22:24:02,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4671258624. Throughput: 0: 50753.3. Samples: 2424154600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:24:02,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 22:24:04,938][49750] Updated weights for policy 0, policy_version 285121 (0.0028) [2024-04-26 22:24:07,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 4671504384. Throughput: 0: 50705.6. Samples: 2424301840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:24:07,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 22:24:08,559][49750] Updated weights for policy 0, policy_version 285131 (0.0030) [2024-04-26 22:24:11,608][49750] Updated weights for policy 0, policy_version 285141 (0.0026) [2024-04-26 22:24:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4671766528. Throughput: 0: 50757.8. Samples: 2424609720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:24:12,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 22:24:15,289][49750] Updated weights for policy 0, policy_version 285151 (0.0031) [2024-04-26 22:24:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4672012288. Throughput: 0: 50551.1. Samples: 2424904380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:24:17,063][49517] Avg episode reward: [(0, '0.688')] [2024-04-26 22:24:17,998][49750] Updated weights for policy 0, policy_version 285161 (0.0028) [2024-04-26 22:24:21,904][49750] Updated weights for policy 0, policy_version 285171 (0.0032) [2024-04-26 22:24:22,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4672241664. Throughput: 0: 50710.9. Samples: 2425058480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-26 22:24:22,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 22:24:22,212][49728] Signal inference workers to stop experience collection... 
(36400 times) [2024-04-26 22:24:22,259][49750] InferenceWorker_p0-w0: stopping experience collection (36400 times) [2024-04-26 22:24:22,282][49728] Signal inference workers to resume experience collection... (36400 times) [2024-04-26 22:24:22,284][49750] InferenceWorker_p0-w0: resuming experience collection (36400 times) [2024-04-26 22:24:24,301][49750] Updated weights for policy 0, policy_version 285181 (0.0030) [2024-04-26 22:24:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4672503808. Throughput: 0: 50800.1. Samples: 2425367300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:24:27,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 22:24:28,196][49750] Updated weights for policy 0, policy_version 285191 (0.0032) [2024-04-26 22:24:30,739][49750] Updated weights for policy 0, policy_version 285201 (0.0036) [2024-04-26 22:24:32,063][49517] Fps is (10 sec: 54065.8, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 4672782336. Throughput: 0: 50699.0. Samples: 2425672800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:24:32,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 22:24:34,551][49750] Updated weights for policy 0, policy_version 285211 (0.0028) [2024-04-26 22:24:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4673028096. Throughput: 0: 50718.7. Samples: 2425829020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:24:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 22:24:37,237][49750] Updated weights for policy 0, policy_version 285221 (0.0030) [2024-04-26 22:24:40,937][49750] Updated weights for policy 0, policy_version 285231 (0.0031) [2024-04-26 22:24:42,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4673290240. Throughput: 0: 50841.0. Samples: 2426134740. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:24:42,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 22:24:42,125][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000285236_4673306624.pth... [2024-04-26 22:24:42,168][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000284490_4661084160.pth [2024-04-26 22:24:43,928][49750] Updated weights for policy 0, policy_version 285241 (0.0032) [2024-04-26 22:24:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4673519616. Throughput: 0: 50824.2. Samples: 2426441680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:24:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 22:24:47,644][49750] Updated weights for policy 0, policy_version 285251 (0.0030) [2024-04-26 22:24:50,613][49750] Updated weights for policy 0, policy_version 285261 (0.0030) [2024-04-26 22:24:52,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4673781760. Throughput: 0: 50653.5. Samples: 2426581240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:24:52,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-26 22:24:53,880][49750] Updated weights for policy 0, policy_version 285271 (0.0034) [2024-04-26 22:24:57,007][49750] Updated weights for policy 0, policy_version 285281 (0.0032) [2024-04-26 22:24:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4674043904. Throughput: 0: 50617.4. Samples: 2426887500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:24:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 22:25:00,291][49750] Updated weights for policy 0, policy_version 285291 (0.0030) [2024-04-26 22:25:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4674306048. Throughput: 0: 50920.8. Samples: 2427195820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:25:02,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 22:25:03,393][49750] Updated weights for policy 0, policy_version 285301 (0.0036) [2024-04-26 22:25:06,870][49750] Updated weights for policy 0, policy_version 285311 (0.0029) [2024-04-26 22:25:07,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4674535424. Throughput: 0: 50904.9. Samples: 2427349200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:25:07,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 22:25:09,917][49750] Updated weights for policy 0, policy_version 285321 (0.0033) [2024-04-26 22:25:12,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4674781184. Throughput: 0: 50788.9. Samples: 2427652800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:25:12,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 22:25:13,314][49750] Updated weights for policy 0, policy_version 285331 (0.0033) [2024-04-26 22:25:16,408][49750] Updated weights for policy 0, policy_version 285341 (0.0031) [2024-04-26 22:25:17,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4675043328. Throughput: 0: 50776.7. Samples: 2427957740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:25:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:25:19,834][49750] Updated weights for policy 0, policy_version 285351 (0.0029) [2024-04-26 22:25:20,696][49728] Signal inference workers to stop experience collection... (36450 times) [2024-04-26 22:25:20,696][49728] Signal inference workers to resume experience collection... 
(36450 times) [2024-04-26 22:25:20,718][49750] InferenceWorker_p0-w0: stopping experience collection (36450 times) [2024-04-26 22:25:20,718][49750] InferenceWorker_p0-w0: resuming experience collection (36450 times) [2024-04-26 22:25:22,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4675321856. Throughput: 0: 50901.2. Samples: 2428119580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:25:22,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 22:25:22,799][49750] Updated weights for policy 0, policy_version 285361 (0.0029) [2024-04-26 22:25:26,291][49750] Updated weights for policy 0, policy_version 285371 (0.0036) [2024-04-26 22:25:27,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4675584000. Throughput: 0: 50722.7. Samples: 2428417260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:25:27,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 22:25:29,193][49750] Updated weights for policy 0, policy_version 285381 (0.0038) [2024-04-26 22:25:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4675813376. Throughput: 0: 50731.7. Samples: 2428724620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:25:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 22:25:32,588][49750] Updated weights for policy 0, policy_version 285391 (0.0031) [2024-04-26 22:25:35,715][49750] Updated weights for policy 0, policy_version 285401 (0.0028) [2024-04-26 22:25:37,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4676075520. Throughput: 0: 51040.7. Samples: 2428878080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:25:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 22:25:38,934][49750] Updated weights for policy 0, policy_version 285411 (0.0030) [2024-04-26 22:25:42,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4676321280. Throughput: 0: 51017.4. Samples: 2429183280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:25:42,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 22:25:42,099][49750] Updated weights for policy 0, policy_version 285421 (0.0028) [2024-04-26 22:25:45,481][49750] Updated weights for policy 0, policy_version 285431 (0.0030) [2024-04-26 22:25:47,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4676599808. Throughput: 0: 50752.0. Samples: 2429479660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:25:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:25:48,488][49750] Updated weights for policy 0, policy_version 285441 (0.0033) [2024-04-26 22:25:51,900][49750] Updated weights for policy 0, policy_version 285451 (0.0032) [2024-04-26 22:25:52,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4676845568. Throughput: 0: 50922.0. Samples: 2429640700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:25:52,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 22:25:54,839][49750] Updated weights for policy 0, policy_version 285461 (0.0026) [2024-04-26 22:25:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4677091328. Throughput: 0: 51100.0. Samples: 2429952300. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:25:57,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 22:25:58,292][49750] Updated weights for policy 0, policy_version 285471 (0.0029) [2024-04-26 22:26:01,297][49750] Updated weights for policy 0, policy_version 285481 (0.0027) [2024-04-26 22:26:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4677337088. Throughput: 0: 50945.2. Samples: 2430250280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:02,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 22:26:04,610][49750] Updated weights for policy 0, policy_version 285491 (0.0028) [2024-04-26 22:26:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4677599232. Throughput: 0: 50903.3. Samples: 2430410220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:07,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 22:26:07,715][49750] Updated weights for policy 0, policy_version 285501 (0.0030) [2024-04-26 22:26:11,079][49750] Updated weights for policy 0, policy_version 285511 (0.0034) [2024-04-26 22:26:11,115][49728] Signal inference workers to stop experience collection... (36500 times) [2024-04-26 22:26:11,158][49750] InferenceWorker_p0-w0: stopping experience collection (36500 times) [2024-04-26 22:26:11,223][49728] Signal inference workers to resume experience collection... (36500 times) [2024-04-26 22:26:11,223][49750] InferenceWorker_p0-w0: resuming experience collection (36500 times) [2024-04-26 22:26:12,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 4677877760. Throughput: 0: 50932.4. Samples: 2430709220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 22:26:14,263][49750] Updated weights for policy 0, policy_version 285521 (0.0029) [2024-04-26 22:26:17,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4678090752. Throughput: 0: 50809.9. Samples: 2431011060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 22:26:17,499][49750] Updated weights for policy 0, policy_version 285531 (0.0027) [2024-04-26 22:26:20,937][49750] Updated weights for policy 0, policy_version 285541 (0.0034) [2024-04-26 22:26:22,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4678352896. Throughput: 0: 50900.6. Samples: 2431168600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:22,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 22:26:23,803][49750] Updated weights for policy 0, policy_version 285551 (0.0031) [2024-04-26 22:26:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4678615040. Throughput: 0: 50767.0. Samples: 2431467800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:27,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 22:26:27,538][49750] Updated weights for policy 0, policy_version 285561 (0.0035) [2024-04-26 22:26:30,150][49750] Updated weights for policy 0, policy_version 285571 (0.0036) [2024-04-26 22:26:32,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4678877184. Throughput: 0: 51001.0. Samples: 2431774700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:32,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 22:26:33,912][49750] Updated weights for policy 0, policy_version 285581 (0.0033) [2024-04-26 22:26:36,671][49750] Updated weights for policy 0, policy_version 285591 (0.0029) [2024-04-26 22:26:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4679139328. Throughput: 0: 50910.7. Samples: 2431931680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:37,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 22:26:40,272][49750] Updated weights for policy 0, policy_version 285601 (0.0031) [2024-04-26 22:26:42,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.2, 300 sec: 50873.7). Total num frames: 4679385088. Throughput: 0: 50676.6. Samples: 2432232760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:42,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 22:26:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000285607_4679385088.pth... [2024-04-26 22:26:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000284864_4667211776.pth [2024-04-26 22:26:43,160][49750] Updated weights for policy 0, policy_version 285611 (0.0029) [2024-04-26 22:26:46,709][49750] Updated weights for policy 0, policy_version 285621 (0.0030) [2024-04-26 22:26:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4679630848. Throughput: 0: 50864.0. Samples: 2432539160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 24.0) [2024-04-26 22:26:47,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 22:26:49,599][49750] Updated weights for policy 0, policy_version 285631 (0.0027) [2024-04-26 22:26:52,063][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4679876608. Throughput: 0: 50580.7. Samples: 2432686360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:26:52,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 22:26:53,279][49750] Updated weights for policy 0, policy_version 285641 (0.0029) [2024-04-26 22:26:55,928][49750] Updated weights for policy 0, policy_version 285651 (0.0028) [2024-04-26 22:26:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4680155136. Throughput: 0: 50722.2. Samples: 2432991720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:26:57,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 22:26:59,815][49750] Updated weights for policy 0, policy_version 285661 (0.0032) [2024-04-26 22:27:02,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4680400896. Throughput: 0: 50713.2. Samples: 2433293160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:02,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 22:27:02,506][49750] Updated weights for policy 0, policy_version 285671 (0.0040) [2024-04-26 22:27:06,343][49750] Updated weights for policy 0, policy_version 285681 (0.0035) [2024-04-26 22:27:07,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4680646656. Throughput: 0: 50589.8. Samples: 2433445140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:07,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 22:27:08,987][49750] Updated weights for policy 0, policy_version 285691 (0.0031) [2024-04-26 22:27:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4680892416. Throughput: 0: 50769.2. Samples: 2433752420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:12,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 22:27:12,682][49750] Updated weights for policy 0, policy_version 285701 (0.0034) [2024-04-26 22:27:15,441][49750] Updated weights for policy 0, policy_version 285711 (0.0029) [2024-04-26 22:27:17,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4681138176. Throughput: 0: 50500.8. Samples: 2434047240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:17,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-26 22:27:19,082][49750] Updated weights for policy 0, policy_version 285721 (0.0034) [2024-04-26 22:27:21,842][49750] Updated weights for policy 0, policy_version 285731 (0.0044) [2024-04-26 22:27:22,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4681416704. Throughput: 0: 50539.2. Samples: 2434205940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 22:27:25,726][49750] Updated weights for policy 0, policy_version 285741 (0.0030) [2024-04-26 22:27:25,755][49728] Signal inference workers to stop experience collection... (36550 times) [2024-04-26 22:27:25,756][49728] Signal inference workers to resume experience collection... (36550 times) [2024-04-26 22:27:25,774][49750] InferenceWorker_p0-w0: stopping experience collection (36550 times) [2024-04-26 22:27:25,774][49750] InferenceWorker_p0-w0: resuming experience collection (36550 times) [2024-04-26 22:27:27,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4681662464. Throughput: 0: 50714.1. Samples: 2434514880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:27,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 22:27:28,308][49750] Updated weights for policy 0, policy_version 285751 (0.0031) [2024-04-26 22:27:32,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4681891840. Throughput: 0: 50505.0. Samples: 2434811880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 22:27:32,225][49750] Updated weights for policy 0, policy_version 285761 (0.0027) [2024-04-26 22:27:34,852][49750] Updated weights for policy 0, policy_version 285771 (0.0032) [2024-04-26 22:27:37,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4682153984. Throughput: 0: 50482.2. Samples: 2434958060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:37,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 22:27:38,696][49750] Updated weights for policy 0, policy_version 285781 (0.0029) [2024-04-26 22:27:41,177][49750] Updated weights for policy 0, policy_version 285791 (0.0034) [2024-04-26 22:27:42,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.5, 300 sec: 50818.1). Total num frames: 4682432512. Throughput: 0: 50512.7. Samples: 2435264800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:42,072][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 22:27:44,991][49750] Updated weights for policy 0, policy_version 285801 (0.0031) [2024-04-26 22:27:47,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4682678272. Throughput: 0: 50570.3. Samples: 2435568820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:47,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 22:27:47,765][49750] Updated weights for policy 0, policy_version 285811 (0.0034) [2024-04-26 22:27:51,362][49750] Updated weights for policy 0, policy_version 285821 (0.0028) [2024-04-26 22:27:52,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4682940416. Throughput: 0: 50628.8. Samples: 2435723440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 22:27:54,186][49750] Updated weights for policy 0, policy_version 285831 (0.0033) [2024-04-26 22:27:57,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 4683153408. Throughput: 0: 50541.9. Samples: 2436026800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:27:57,063][49517] Avg episode reward: [(0, '0.684')] [2024-04-26 22:27:58,053][49750] Updated weights for policy 0, policy_version 285841 (0.0031) [2024-04-26 22:28:00,725][49750] Updated weights for policy 0, policy_version 285851 (0.0027) [2024-04-26 22:28:02,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4683431936. Throughput: 0: 50786.7. Samples: 2436332640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:28:02,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 22:28:04,289][49750] Updated weights for policy 0, policy_version 285861 (0.0027) [2024-04-26 22:28:07,063][49517] Fps is (10 sec: 54066.5, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4683694080. Throughput: 0: 50650.4. Samples: 2436485220. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:07,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 22:28:07,234][49750] Updated weights for policy 0, policy_version 285871 (0.0031) [2024-04-26 22:28:10,673][49750] Updated weights for policy 0, policy_version 285881 (0.0036) [2024-04-26 22:28:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4683939840. Throughput: 0: 50586.1. Samples: 2436791260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 22:28:13,563][49750] Updated weights for policy 0, policy_version 285891 (0.0028) [2024-04-26 22:28:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4684185600. Throughput: 0: 50852.8. Samples: 2437100260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:17,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 22:28:17,154][49750] Updated weights for policy 0, policy_version 285901 (0.0031) [2024-04-26 22:28:19,901][49750] Updated weights for policy 0, policy_version 285911 (0.0033) [2024-04-26 22:28:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4684447744. Throughput: 0: 50656.6. Samples: 2437237600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:22,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 22:28:23,633][49750] Updated weights for policy 0, policy_version 285921 (0.0040) [2024-04-26 22:28:26,532][49750] Updated weights for policy 0, policy_version 285931 (0.0030) [2024-04-26 22:28:27,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4684709888. Throughput: 0: 50654.4. Samples: 2437544240. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 22:28:29,993][49750] Updated weights for policy 0, policy_version 285941 (0.0029) [2024-04-26 22:28:32,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4684988416. Throughput: 0: 50685.4. Samples: 2437849660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:32,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 22:28:33,083][49750] Updated weights for policy 0, policy_version 285951 (0.0027) [2024-04-26 22:28:36,462][49750] Updated weights for policy 0, policy_version 285961 (0.0038) [2024-04-26 22:28:37,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4685201408. Throughput: 0: 50804.8. Samples: 2438009660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:37,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 22:28:39,405][49750] Updated weights for policy 0, policy_version 285971 (0.0028) [2024-04-26 22:28:40,454][49728] Signal inference workers to stop experience collection... (36600 times) [2024-04-26 22:28:40,484][49750] InferenceWorker_p0-w0: stopping experience collection (36600 times) [2024-04-26 22:28:40,571][49728] Signal inference workers to resume experience collection... (36600 times) [2024-04-26 22:28:40,571][49750] InferenceWorker_p0-w0: resuming experience collection (36600 times) [2024-04-26 22:28:42,062][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 4685447168. Throughput: 0: 50900.0. Samples: 2438317300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:42,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 22:28:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000285977_4685447168.pth... 
[2024-04-26 22:28:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000285236_4673306624.pth [2024-04-26 22:28:43,027][49750] Updated weights for policy 0, policy_version 285981 (0.0026) [2024-04-26 22:28:45,776][49750] Updated weights for policy 0, policy_version 285991 (0.0030) [2024-04-26 22:28:47,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4685709312. Throughput: 0: 50826.8. Samples: 2438619840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:47,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 22:28:49,365][49750] Updated weights for policy 0, policy_version 286001 (0.0028) [2024-04-26 22:28:52,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4685971456. Throughput: 0: 50671.8. Samples: 2438765440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:52,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 22:28:52,722][49750] Updated weights for policy 0, policy_version 286011 (0.0032) [2024-04-26 22:28:55,767][49750] Updated weights for policy 0, policy_version 286021 (0.0037) [2024-04-26 22:28:57,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50762.7). Total num frames: 4686233600. Throughput: 0: 50782.4. Samples: 2439076460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:28:57,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 22:28:59,105][49750] Updated weights for policy 0, policy_version 286031 (0.0031) [2024-04-26 22:29:02,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4686462976. Throughput: 0: 50741.3. Samples: 2439383620. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:29:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 22:29:02,322][49750] Updated weights for policy 0, policy_version 286041 (0.0036) [2024-04-26 22:29:05,429][49750] Updated weights for policy 0, policy_version 286051 (0.0034) [2024-04-26 22:29:07,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 4686708736. Throughput: 0: 50756.3. Samples: 2439521640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:29:07,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-26 22:29:08,716][49750] Updated weights for policy 0, policy_version 286061 (0.0029) [2024-04-26 22:29:11,985][49750] Updated weights for policy 0, policy_version 286071 (0.0029) [2024-04-26 22:29:12,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4686987264. Throughput: 0: 50654.2. Samples: 2439823680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-26 22:29:12,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 22:29:15,083][49750] Updated weights for policy 0, policy_version 286081 (0.0037) [2024-04-26 22:29:17,062][49517] Fps is (10 sec: 55706.4, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4687265792. Throughput: 0: 50716.0. Samples: 2440131880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:29:17,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 22:29:18,615][49750] Updated weights for policy 0, policy_version 286091 (0.0037) [2024-04-26 22:29:21,820][49750] Updated weights for policy 0, policy_version 286101 (0.0029) [2024-04-26 22:29:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4687495168. Throughput: 0: 50800.6. Samples: 2440295680. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:29:22,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 22:29:24,913][49750] Updated weights for policy 0, policy_version 286111 (0.0032) [2024-04-26 22:29:27,063][49517] Fps is (10 sec: 45874.5, 60 sec: 50244.1, 300 sec: 50651.6). Total num frames: 4687724544. Throughput: 0: 50723.8. Samples: 2440599880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:29:27,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 22:29:28,246][49750] Updated weights for policy 0, policy_version 286121 (0.0031) [2024-04-26 22:29:31,380][49750] Updated weights for policy 0, policy_version 286131 (0.0032) [2024-04-26 22:29:31,979][49728] Signal inference workers to stop experience collection... (36650 times) [2024-04-26 22:29:32,037][49750] InferenceWorker_p0-w0: stopping experience collection (36650 times) [2024-04-26 22:29:32,041][49728] Signal inference workers to resume experience collection... (36650 times) [2024-04-26 22:29:32,048][49750] InferenceWorker_p0-w0: resuming experience collection (36650 times) [2024-04-26 22:29:32,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4688003072. Throughput: 0: 50745.1. Samples: 2440903380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:29:32,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 22:29:34,495][49750] Updated weights for policy 0, policy_version 286141 (0.0033) [2024-04-26 22:29:37,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4688265216. Throughput: 0: 50841.7. Samples: 2441053320. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:29:37,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 22:29:38,107][49750] Updated weights for policy 0, policy_version 286151 (0.0031) [2024-04-26 22:29:40,909][49750] Updated weights for policy 0, policy_version 286161 (0.0030) [2024-04-26 22:29:42,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4688527360. Throughput: 0: 50816.7. Samples: 2441363220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:29:42,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 22:29:44,694][49750] Updated weights for policy 0, policy_version 286171 (0.0032) [2024-04-26 22:29:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4688756736. Throughput: 0: 50777.0. Samples: 2441668580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:29:47,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 22:29:47,329][49750] Updated weights for policy 0, policy_version 286181 (0.0026) [2024-04-26 22:29:51,064][49750] Updated weights for policy 0, policy_version 286191 (0.0031) [2024-04-26 22:29:52,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4689002496. Throughput: 0: 50883.1. Samples: 2441811380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:29:52,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 22:29:53,766][49750] Updated weights for policy 0, policy_version 286201 (0.0035) [2024-04-26 22:29:57,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.1, 300 sec: 50651.6). Total num frames: 4689248256. Throughput: 0: 50894.5. Samples: 2442113940. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:29:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 22:29:57,568][49750] Updated weights for policy 0, policy_version 286211 (0.0035) [2024-04-26 22:30:00,180][49750] Updated weights for policy 0, policy_version 286221 (0.0031) [2024-04-26 22:30:02,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4689543168. Throughput: 0: 50667.9. Samples: 2442411940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:30:02,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 22:30:03,969][49750] Updated weights for policy 0, policy_version 286231 (0.0030) [2024-04-26 22:30:06,567][49750] Updated weights for policy 0, policy_version 286241 (0.0031) [2024-04-26 22:30:07,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4689788928. Throughput: 0: 50682.3. Samples: 2442576380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:30:07,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 22:30:10,524][49750] Updated weights for policy 0, policy_version 286251 (0.0034) [2024-04-26 22:30:12,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4690001920. Throughput: 0: 50734.4. Samples: 2442882920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:30:12,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 22:30:12,991][49750] Updated weights for policy 0, policy_version 286261 (0.0029) [2024-04-26 22:30:16,933][49750] Updated weights for policy 0, policy_version 286271 (0.0029) [2024-04-26 22:30:17,063][49517] Fps is (10 sec: 47512.5, 60 sec: 49971.1, 300 sec: 50651.5). Total num frames: 4690264064. Throughput: 0: 50703.1. Samples: 2443185020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:30:17,064][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 22:30:19,293][49728] Signal inference workers to stop experience collection... (36700 times) [2024-04-26 22:30:19,327][49750] InferenceWorker_p0-w0: stopping experience collection (36700 times) [2024-04-26 22:30:19,361][49728] Signal inference workers to resume experience collection... (36700 times) [2024-04-26 22:30:19,361][49750] InferenceWorker_p0-w0: resuming experience collection (36700 times) [2024-04-26 22:30:19,505][49750] Updated weights for policy 0, policy_version 286281 (0.0041) [2024-04-26 22:30:22,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 4690526208. Throughput: 0: 50483.0. Samples: 2443325060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:30:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 22:30:23,373][49750] Updated weights for policy 0, policy_version 286291 (0.0036) [2024-04-26 22:30:25,995][49750] Updated weights for policy 0, policy_version 286301 (0.0025) [2024-04-26 22:30:27,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4690804736. Throughput: 0: 50611.7. Samples: 2443640740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-26 22:30:27,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-26 22:30:29,697][49750] Updated weights for policy 0, policy_version 286311 (0.0029) [2024-04-26 22:30:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4691034112. Throughput: 0: 50505.3. Samples: 2443941320. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:30:32,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 22:30:32,440][49750] Updated weights for policy 0, policy_version 286321 (0.0031) [2024-04-26 22:30:36,293][49750] Updated weights for policy 0, policy_version 286331 (0.0029) [2024-04-26 22:30:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4691296256. Throughput: 0: 50624.2. Samples: 2444089460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:30:37,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 22:30:38,805][49750] Updated weights for policy 0, policy_version 286341 (0.0029) [2024-04-26 22:30:42,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4691542016. Throughput: 0: 50687.5. Samples: 2444394880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:30:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 22:30:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000286349_4691542016.pth... [2024-04-26 22:30:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000285607_4679385088.pth [2024-04-26 22:30:42,632][49750] Updated weights for policy 0, policy_version 286351 (0.0028) [2024-04-26 22:30:45,187][49750] Updated weights for policy 0, policy_version 286361 (0.0027) [2024-04-26 22:30:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4691804160. Throughput: 0: 50757.5. Samples: 2444696020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:30:47,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 22:30:49,193][49750] Updated weights for policy 0, policy_version 286371 (0.0031) [2024-04-26 22:30:51,728][49750] Updated weights for policy 0, policy_version 286381 (0.0033) [2024-04-26 22:30:52,063][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50762.6). 
Total num frames: 4692066304. Throughput: 0: 50637.6. Samples: 2444855080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:30:52,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 22:30:55,583][49750] Updated weights for policy 0, policy_version 286391 (0.0030) [2024-04-26 22:30:57,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4692312064. Throughput: 0: 50658.9. Samples: 2445162580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:30:57,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 22:30:58,017][49750] Updated weights for policy 0, policy_version 286401 (0.0031) [2024-04-26 22:31:02,050][49750] Updated weights for policy 0, policy_version 286411 (0.0034) [2024-04-26 22:31:02,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4692557824. Throughput: 0: 50725.8. Samples: 2445467680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:31:02,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 22:31:04,573][49750] Updated weights for policy 0, policy_version 286421 (0.0035) [2024-04-26 22:31:07,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.1, 300 sec: 50596.0). Total num frames: 4692803584. Throughput: 0: 50752.8. Samples: 2445608940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:31:07,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 22:31:08,484][49750] Updated weights for policy 0, policy_version 286431 (0.0028) [2024-04-26 22:31:11,088][49750] Updated weights for policy 0, policy_version 286441 (0.0027) [2024-04-26 22:31:12,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4693082112. Throughput: 0: 50525.2. Samples: 2445914380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:31:12,063][49517] Avg episode reward: [(0, '0.690')] [2024-04-26 22:31:14,988][49750] Updated weights for policy 0, policy_version 286451 (0.0029) [2024-04-26 22:31:17,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4693311488. Throughput: 0: 50755.1. Samples: 2446225300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:31:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 22:31:17,676][49750] Updated weights for policy 0, policy_version 286461 (0.0033) [2024-04-26 22:31:21,319][49750] Updated weights for policy 0, policy_version 286471 (0.0029) [2024-04-26 22:31:22,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4693557248. Throughput: 0: 50741.3. Samples: 2446372820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:31:22,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 22:31:24,243][49750] Updated weights for policy 0, policy_version 286481 (0.0028) [2024-04-26 22:31:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 4693803008. Throughput: 0: 50625.1. Samples: 2446673000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:31:27,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 22:31:27,843][49750] Updated weights for policy 0, policy_version 286491 (0.0030) [2024-04-26 22:31:29,291][49728] Signal inference workers to stop experience collection... (36750 times) [2024-04-26 22:31:29,321][49750] InferenceWorker_p0-w0: stopping experience collection (36750 times) [2024-04-26 22:31:29,357][49728] Signal inference workers to resume experience collection... 
(36750 times) [2024-04-26 22:31:29,357][49750] InferenceWorker_p0-w0: resuming experience collection (36750 times) [2024-04-26 22:31:30,707][49750] Updated weights for policy 0, policy_version 286501 (0.0027) [2024-04-26 22:31:32,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 4694065152. Throughput: 0: 50808.2. Samples: 2446982400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:31:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 22:31:34,109][49750] Updated weights for policy 0, policy_version 286511 (0.0033) [2024-04-26 22:31:37,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4694343680. Throughput: 0: 50640.6. Samples: 2447133900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:31:37,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:31:37,075][49750] Updated weights for policy 0, policy_version 286521 (0.0037) [2024-04-26 22:31:40,577][49750] Updated weights for policy 0, policy_version 286531 (0.0033) [2024-04-26 22:31:42,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4694589440. Throughput: 0: 50505.6. Samples: 2447435340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-26 22:31:42,064][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 22:31:43,648][49750] Updated weights for policy 0, policy_version 286541 (0.0038) [2024-04-26 22:31:47,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4694835200. Throughput: 0: 50647.1. Samples: 2447746800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:31:47,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 22:31:47,093][49750] Updated weights for policy 0, policy_version 286551 (0.0028) [2024-04-26 22:31:50,064][49750] Updated weights for policy 0, policy_version 286561 (0.0028) [2024-04-26 22:31:52,063][49517] Fps is (10 sec: 47514.4, 60 sec: 49971.2, 300 sec: 50540.5). Total num frames: 4695064576. Throughput: 0: 50649.9. Samples: 2447888180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:31:52,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 22:31:53,686][49750] Updated weights for policy 0, policy_version 286571 (0.0031) [2024-04-26 22:31:56,583][49750] Updated weights for policy 0, policy_version 286581 (0.0034) [2024-04-26 22:31:57,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4695359488. Throughput: 0: 50665.5. Samples: 2448194320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:31:57,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 22:31:59,994][49750] Updated weights for policy 0, policy_version 286591 (0.0032) [2024-04-26 22:32:02,063][49517] Fps is (10 sec: 55705.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4695621632. Throughput: 0: 50503.9. Samples: 2448497980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:02,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 22:32:02,978][49750] Updated weights for policy 0, policy_version 286601 (0.0030) [2024-04-26 22:32:06,352][49750] Updated weights for policy 0, policy_version 286611 (0.0032) [2024-04-26 22:32:07,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4695851008. Throughput: 0: 50867.9. Samples: 2448661880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:07,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 22:32:09,427][49750] Updated weights for policy 0, policy_version 286621 (0.0027) [2024-04-26 22:32:12,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4696096768. Throughput: 0: 50864.4. Samples: 2448961900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:12,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 22:32:12,813][49750] Updated weights for policy 0, policy_version 286631 (0.0034) [2024-04-26 22:32:15,811][49750] Updated weights for policy 0, policy_version 286641 (0.0028) [2024-04-26 22:32:17,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 4696358912. Throughput: 0: 50570.2. Samples: 2449258060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:17,064][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:32:19,275][49750] Updated weights for policy 0, policy_version 286651 (0.0038) [2024-04-26 22:32:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4696604672. Throughput: 0: 50701.3. Samples: 2449415460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:22,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 22:32:22,415][49750] Updated weights for policy 0, policy_version 286661 (0.0033) [2024-04-26 22:32:25,749][49750] Updated weights for policy 0, policy_version 286671 (0.0031) [2024-04-26 22:32:27,062][49517] Fps is (10 sec: 52430.1, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4696883200. Throughput: 0: 50774.6. Samples: 2449720180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:27,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 22:32:28,886][49750] Updated weights for policy 0, policy_version 286681 (0.0033) [2024-04-26 22:32:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4697128960. Throughput: 0: 50636.6. Samples: 2450025440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:32,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 22:32:32,074][49750] Updated weights for policy 0, policy_version 286691 (0.0029) [2024-04-26 22:32:35,201][49750] Updated weights for policy 0, policy_version 286701 (0.0036) [2024-04-26 22:32:37,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 4697358336. Throughput: 0: 50947.6. Samples: 2450180820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:37,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:32:38,472][49750] Updated weights for policy 0, policy_version 286711 (0.0031) [2024-04-26 22:32:39,412][49728] Signal inference workers to stop experience collection... (36800 times) [2024-04-26 22:32:39,413][49728] Signal inference workers to resume experience collection... (36800 times) [2024-04-26 22:32:39,428][49750] InferenceWorker_p0-w0: stopping experience collection (36800 times) [2024-04-26 22:32:39,428][49750] InferenceWorker_p0-w0: resuming experience collection (36800 times) [2024-04-26 22:32:41,654][49750] Updated weights for policy 0, policy_version 286721 (0.0033) [2024-04-26 22:32:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 4697653248. Throughput: 0: 50782.2. Samples: 2450479520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:42,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 22:32:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000286722_4697653248.pth... 
[2024-04-26 22:32:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000285977_4685447168.pth [2024-04-26 22:32:44,927][49750] Updated weights for policy 0, policy_version 286731 (0.0030) [2024-04-26 22:32:47,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4697899008. Throughput: 0: 50785.4. Samples: 2450783320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:47,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 22:32:48,151][49750] Updated weights for policy 0, policy_version 286741 (0.0026) [2024-04-26 22:32:51,503][49750] Updated weights for policy 0, policy_version 286751 (0.0029) [2024-04-26 22:32:52,063][49517] Fps is (10 sec: 49150.9, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4698144768. Throughput: 0: 50735.8. Samples: 2450945000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 22:32:52,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 22:32:54,465][49750] Updated weights for policy 0, policy_version 286761 (0.0032) [2024-04-26 22:32:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4698390528. Throughput: 0: 50790.9. Samples: 2451247480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:32:57,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 22:32:58,020][49750] Updated weights for policy 0, policy_version 286771 (0.0029) [2024-04-26 22:33:00,791][49750] Updated weights for policy 0, policy_version 286781 (0.0031) [2024-04-26 22:33:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4698636288. Throughput: 0: 50840.2. Samples: 2451545860. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:02,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 22:33:04,337][49750] Updated weights for policy 0, policy_version 286791 (0.0035) [2024-04-26 22:33:07,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4698931200. Throughput: 0: 50789.4. Samples: 2451700980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 22:33:07,277][49750] Updated weights for policy 0, policy_version 286801 (0.0029) [2024-04-26 22:33:10,778][49750] Updated weights for policy 0, policy_version 286811 (0.0033) [2024-04-26 22:33:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4699160576. Throughput: 0: 50917.2. Samples: 2452011460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 22:33:13,703][49750] Updated weights for policy 0, policy_version 286821 (0.0038) [2024-04-26 22:33:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.7, 300 sec: 50762.6). Total num frames: 4699422720. Throughput: 0: 50994.3. Samples: 2452320180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:17,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 22:33:17,230][49750] Updated weights for policy 0, policy_version 286831 (0.0031) [2024-04-26 22:33:20,368][49750] Updated weights for policy 0, policy_version 286841 (0.0033) [2024-04-26 22:33:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4699668480. Throughput: 0: 50867.2. Samples: 2452469840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:22,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 22:33:23,656][49750] Updated weights for policy 0, policy_version 286851 (0.0028) [2024-04-26 22:33:27,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 4699914240. Throughput: 0: 51016.3. Samples: 2452775260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:27,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 22:33:27,180][49750] Updated weights for policy 0, policy_version 286861 (0.0035) [2024-04-26 22:33:30,032][49750] Updated weights for policy 0, policy_version 286871 (0.0033) [2024-04-26 22:33:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4700176384. Throughput: 0: 51065.4. Samples: 2453081260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:32,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:33:33,525][49750] Updated weights for policy 0, policy_version 286881 (0.0032) [2024-04-26 22:33:36,411][49750] Updated weights for policy 0, policy_version 286891 (0.0031) [2024-04-26 22:33:37,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4700438528. Throughput: 0: 50921.9. Samples: 2453236480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:37,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 22:33:39,954][49750] Updated weights for policy 0, policy_version 286901 (0.0028) [2024-04-26 22:33:42,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 4700684288. Throughput: 0: 51023.3. Samples: 2453543540. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 22:33:42,855][49750] Updated weights for policy 0, policy_version 286911 (0.0029) [2024-04-26 22:33:44,456][49728] Signal inference workers to stop experience collection... (36850 times) [2024-04-26 22:33:44,471][49750] InferenceWorker_p0-w0: stopping experience collection (36850 times) [2024-04-26 22:33:44,529][49728] Signal inference workers to resume experience collection... (36850 times) [2024-04-26 22:33:44,529][49750] InferenceWorker_p0-w0: resuming experience collection (36850 times) [2024-04-26 22:33:46,572][49750] Updated weights for policy 0, policy_version 286921 (0.0031) [2024-04-26 22:33:47,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4700913664. Throughput: 0: 51078.8. Samples: 2453844400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:47,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 22:33:49,235][49750] Updated weights for policy 0, policy_version 286931 (0.0034) [2024-04-26 22:33:52,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4701208576. Throughput: 0: 50688.4. Samples: 2453981960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 22:33:53,156][49750] Updated weights for policy 0, policy_version 286941 (0.0031) [2024-04-26 22:33:55,760][49750] Updated weights for policy 0, policy_version 286951 (0.0036) [2024-04-26 22:33:57,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4701454336. Throughput: 0: 50743.6. Samples: 2454294920. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:33:57,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 22:33:59,528][49750] Updated weights for policy 0, policy_version 286961 (0.0032) [2024-04-26 22:34:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4701716480. Throughput: 0: 50824.0. Samples: 2454607260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:34:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 22:34:02,176][49750] Updated weights for policy 0, policy_version 286971 (0.0027) [2024-04-26 22:34:05,878][49750] Updated weights for policy 0, policy_version 286981 (0.0029) [2024-04-26 22:34:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4701945856. Throughput: 0: 50605.4. Samples: 2454747080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-26 22:34:07,063][49517] Avg episode reward: [(0, '0.687')] [2024-04-26 22:34:08,725][49750] Updated weights for policy 0, policy_version 286991 (0.0026) [2024-04-26 22:34:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 4702208000. Throughput: 0: 50695.1. Samples: 2455056540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:12,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-26 22:34:12,490][49750] Updated weights for policy 0, policy_version 287001 (0.0029) [2024-04-26 22:34:15,303][49750] Updated weights for policy 0, policy_version 287011 (0.0039) [2024-04-26 22:34:17,062][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4702470144. Throughput: 0: 50585.7. Samples: 2455357620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:17,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 22:34:18,944][49750] Updated weights for policy 0, policy_version 287021 (0.0038) [2024-04-26 22:34:21,541][49750] Updated weights for policy 0, policy_version 287031 (0.0029) [2024-04-26 22:34:22,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4702715904. Throughput: 0: 50678.6. Samples: 2455517020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:22,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 22:34:25,338][49750] Updated weights for policy 0, policy_version 287041 (0.0035) [2024-04-26 22:34:27,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4702961664. Throughput: 0: 50538.2. Samples: 2455817760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:27,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 22:34:27,940][49750] Updated weights for policy 0, policy_version 287051 (0.0034) [2024-04-26 22:34:31,888][49750] Updated weights for policy 0, policy_version 287061 (0.0034) [2024-04-26 22:34:32,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4703207424. Throughput: 0: 50681.8. Samples: 2456125080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:32,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 22:34:34,484][49750] Updated weights for policy 0, policy_version 287071 (0.0032) [2024-04-26 22:34:37,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4703469568. Throughput: 0: 50726.5. Samples: 2456264660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 22:34:38,662][49750] Updated weights for policy 0, policy_version 287081 (0.0029) [2024-04-26 22:34:41,079][49750] Updated weights for policy 0, policy_version 287091 (0.0037) [2024-04-26 22:34:42,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4703715328. Throughput: 0: 50499.6. Samples: 2456567400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:42,064][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 22:34:42,132][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000287093_4703731712.pth... [2024-04-26 22:34:42,191][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000286349_4691542016.pth [2024-04-26 22:34:42,209][49728] Signal inference workers to stop experience collection... (36900 times) [2024-04-26 22:34:42,210][49728] Signal inference workers to resume experience collection... (36900 times) [2024-04-26 22:34:42,222][49750] InferenceWorker_p0-w0: stopping experience collection (36900 times) [2024-04-26 22:34:42,226][49750] InferenceWorker_p0-w0: resuming experience collection (36900 times) [2024-04-26 22:34:44,995][49750] Updated weights for policy 0, policy_version 287101 (0.0030) [2024-04-26 22:34:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.3, 300 sec: 50818.2). Total num frames: 4703993856. Throughput: 0: 50458.4. Samples: 2456877900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 22:34:47,462][49750] Updated weights for policy 0, policy_version 287111 (0.0032) [2024-04-26 22:34:51,311][49750] Updated weights for policy 0, policy_version 287121 (0.0032) [2024-04-26 22:34:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 4704223232. Throughput: 0: 50735.5. Samples: 2457030180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:52,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 22:34:54,005][49750] Updated weights for policy 0, policy_version 287131 (0.0037) [2024-04-26 22:34:57,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4704485376. Throughput: 0: 50564.1. Samples: 2457331920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:34:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:34:57,830][49750] Updated weights for policy 0, policy_version 287141 (0.0032) [2024-04-26 22:35:00,674][49750] Updated weights for policy 0, policy_version 287151 (0.0028) [2024-04-26 22:35:02,063][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4704763904. Throughput: 0: 50770.1. Samples: 2457642280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:35:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 22:35:04,411][49750] Updated weights for policy 0, policy_version 287161 (0.0039) [2024-04-26 22:35:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4704993280. Throughput: 0: 50665.5. Samples: 2457796960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:35:07,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 22:35:07,450][49750] Updated weights for policy 0, policy_version 287171 (0.0033) [2024-04-26 22:35:10,838][49750] Updated weights for policy 0, policy_version 287181 (0.0041) [2024-04-26 22:35:12,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4705255424. Throughput: 0: 50736.6. Samples: 2458100900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:35:12,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 22:35:13,806][49750] Updated weights for policy 0, policy_version 287191 (0.0034) [2024-04-26 22:35:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4705484800. Throughput: 0: 50599.8. Samples: 2458402080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-26 22:35:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 22:35:17,208][49750] Updated weights for policy 0, policy_version 287201 (0.0030) [2024-04-26 22:35:20,200][49750] Updated weights for policy 0, policy_version 287211 (0.0032) [2024-04-26 22:35:22,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4705763328. Throughput: 0: 50820.7. Samples: 2458551580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:35:22,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 22:35:23,524][49750] Updated weights for policy 0, policy_version 287221 (0.0035) [2024-04-26 22:35:26,690][49750] Updated weights for policy 0, policy_version 287231 (0.0031) [2024-04-26 22:35:27,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.6, 300 sec: 50707.1). Total num frames: 4705992704. Throughput: 0: 50889.0. Samples: 2458857400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:35:27,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 22:35:29,953][49750] Updated weights for policy 0, policy_version 287241 (0.0034) [2024-04-26 22:35:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4706254848. Throughput: 0: 50704.7. Samples: 2459159600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:35:32,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 22:35:33,089][49750] Updated weights for policy 0, policy_version 287251 (0.0029) [2024-04-26 22:35:36,520][49750] Updated weights for policy 0, policy_version 287261 (0.0032) [2024-04-26 22:35:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4706500608. Throughput: 0: 50754.7. Samples: 2459314140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:35:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 22:35:39,440][49750] Updated weights for policy 0, policy_version 287271 (0.0033) [2024-04-26 22:35:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 4706746368. Throughput: 0: 50714.7. Samples: 2459614080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:35:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 22:35:43,009][49750] Updated weights for policy 0, policy_version 287281 (0.0033) [2024-04-26 22:35:45,908][49750] Updated weights for policy 0, policy_version 287291 (0.0031) [2024-04-26 22:35:47,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.6, 300 sec: 50707.1). Total num frames: 4707024896. Throughput: 0: 50642.8. Samples: 2459921200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:35:47,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 22:35:49,377][49750] Updated weights for policy 0, policy_version 287301 (0.0034) [2024-04-26 22:35:52,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4707287040. Throughput: 0: 50617.7. Samples: 2460074760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:35:52,063][49517] Avg episode reward: [(0, '0.705')] [2024-04-26 22:35:52,427][49750] Updated weights for policy 0, policy_version 287311 (0.0032) [2024-04-26 22:35:55,894][49750] Updated weights for policy 0, policy_version 287321 (0.0034) [2024-04-26 22:35:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4707532800. Throughput: 0: 50702.8. Samples: 2460382520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:35:57,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 22:35:58,826][49750] Updated weights for policy 0, policy_version 287331 (0.0030) [2024-04-26 22:36:02,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4707762176. Throughput: 0: 50763.7. Samples: 2460686440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:36:02,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 22:36:02,431][49750] Updated weights for policy 0, policy_version 287341 (0.0030) [2024-04-26 22:36:05,356][49750] Updated weights for policy 0, policy_version 287351 (0.0032) [2024-04-26 22:36:07,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4708040704. Throughput: 0: 50805.2. Samples: 2460837820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:36:07,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 22:36:08,519][49728] Signal inference workers to stop experience collection... (36950 times) [2024-04-26 22:36:08,564][49750] InferenceWorker_p0-w0: stopping experience collection (36950 times) [2024-04-26 22:36:08,627][49728] Signal inference workers to resume experience collection... 
(36950 times) [2024-04-26 22:36:08,627][49750] InferenceWorker_p0-w0: resuming experience collection (36950 times) [2024-04-26 22:36:08,757][49750] Updated weights for policy 0, policy_version 287361 (0.0026) [2024-04-26 22:36:11,847][49750] Updated weights for policy 0, policy_version 287371 (0.0029) [2024-04-26 22:36:12,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4708286464. Throughput: 0: 50726.5. Samples: 2461140100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:36:12,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 22:36:15,083][49750] Updated weights for policy 0, policy_version 287381 (0.0032) [2024-04-26 22:36:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4708532224. Throughput: 0: 50680.5. Samples: 2461440220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:36:17,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 22:36:18,277][49750] Updated weights for policy 0, policy_version 287391 (0.0032) [2024-04-26 22:36:21,685][49750] Updated weights for policy 0, policy_version 287401 (0.0033) [2024-04-26 22:36:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 4708794368. Throughput: 0: 50780.3. Samples: 2461599260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:36:22,063][49517] Avg episode reward: [(0, '0.676')] [2024-04-26 22:36:24,704][49750] Updated weights for policy 0, policy_version 287411 (0.0037) [2024-04-26 22:36:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4709040128. Throughput: 0: 50754.1. Samples: 2461898020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:36:27,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 22:36:28,249][49750] Updated weights for policy 0, policy_version 287421 (0.0029) [2024-04-26 22:36:31,120][49750] Updated weights for policy 0, policy_version 287431 (0.0030) [2024-04-26 22:36:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4709302272. Throughput: 0: 50654.6. Samples: 2462200660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-26 22:36:32,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 22:36:34,576][49750] Updated weights for policy 0, policy_version 287441 (0.0033) [2024-04-26 22:36:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4709548032. Throughput: 0: 50737.0. Samples: 2462357920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:36:37,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 22:36:37,528][49750] Updated weights for policy 0, policy_version 287451 (0.0030) [2024-04-26 22:36:41,030][49750] Updated weights for policy 0, policy_version 287461 (0.0030) [2024-04-26 22:36:42,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4709826560. Throughput: 0: 50745.6. Samples: 2462666080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:36:42,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 22:36:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000287465_4709826560.pth... [2024-04-26 22:36:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000286722_4697653248.pth [2024-04-26 22:36:43,963][49750] Updated weights for policy 0, policy_version 287471 (0.0041) [2024-04-26 22:36:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4710039552. Throughput: 0: 50727.9. Samples: 2462969200. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:36:47,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 22:36:47,578][49750] Updated weights for policy 0, policy_version 287481 (0.0029) [2024-04-26 22:36:50,387][49750] Updated weights for policy 0, policy_version 287491 (0.0035) [2024-04-26 22:36:52,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4710301696. Throughput: 0: 50620.1. Samples: 2463115720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:36:52,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 22:36:54,031][49750] Updated weights for policy 0, policy_version 287501 (0.0030) [2024-04-26 22:36:56,723][49750] Updated weights for policy 0, policy_version 287511 (0.0031) [2024-04-26 22:36:57,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4710580224. Throughput: 0: 50691.9. Samples: 2463421240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:36:57,064][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 22:37:00,443][49750] Updated weights for policy 0, policy_version 287521 (0.0031) [2024-04-26 22:37:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4710809600. Throughput: 0: 50727.6. Samples: 2463722960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:37:02,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 22:37:03,397][49750] Updated weights for policy 0, policy_version 287531 (0.0030) [2024-04-26 22:37:06,821][49750] Updated weights for policy 0, policy_version 287541 (0.0029) [2024-04-26 22:37:07,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4711088128. Throughput: 0: 50470.3. Samples: 2463870420. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:37:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 22:37:09,845][49750] Updated weights for policy 0, policy_version 287551 (0.0027) [2024-04-26 22:37:12,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4711333888. Throughput: 0: 50584.6. Samples: 2464174320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:37:12,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 22:37:13,153][49750] Updated weights for policy 0, policy_version 287561 (0.0029) [2024-04-26 22:37:16,352][49750] Updated weights for policy 0, policy_version 287571 (0.0032) [2024-04-26 22:37:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4711596032. Throughput: 0: 50679.0. Samples: 2464481220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:37:17,072][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 22:37:19,655][49750] Updated weights for policy 0, policy_version 287581 (0.0028) [2024-04-26 22:37:22,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.2, 300 sec: 50707.0). Total num frames: 4711841792. Throughput: 0: 50680.2. Samples: 2464638540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:37:22,072][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 22:37:22,825][49750] Updated weights for policy 0, policy_version 287591 (0.0028) [2024-04-26 22:37:26,055][49750] Updated weights for policy 0, policy_version 287601 (0.0034) [2024-04-26 22:37:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4712103936. Throughput: 0: 50569.4. Samples: 2464941700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:37:27,071][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 22:37:29,124][49728] Signal inference workers to stop experience collection... 
(37000 times) [2024-04-26 22:37:29,177][49750] InferenceWorker_p0-w0: stopping experience collection (37000 times) [2024-04-26 22:37:29,191][49728] Signal inference workers to resume experience collection... (37000 times) [2024-04-26 22:37:29,200][49750] InferenceWorker_p0-w0: resuming experience collection (37000 times) [2024-04-26 22:37:29,319][49750] Updated weights for policy 0, policy_version 287611 (0.0029) [2024-04-26 22:37:32,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4712333312. Throughput: 0: 50628.9. Samples: 2465247500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:37:32,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 22:37:32,510][49750] Updated weights for policy 0, policy_version 287621 (0.0030) [2024-04-26 22:37:35,790][49750] Updated weights for policy 0, policy_version 287631 (0.0033) [2024-04-26 22:37:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 4712595456. Throughput: 0: 50711.5. Samples: 2465397740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:37:37,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 22:37:39,142][49750] Updated weights for policy 0, policy_version 287641 (0.0029) [2024-04-26 22:37:42,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4712857600. Throughput: 0: 50679.2. Samples: 2465701800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-26 22:37:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 22:37:42,132][49750] Updated weights for policy 0, policy_version 287651 (0.0028) [2024-04-26 22:37:45,495][49750] Updated weights for policy 0, policy_version 287661 (0.0026) [2024-04-26 22:37:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4713086976. Throughput: 0: 50651.8. Samples: 2466002300. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:37:47,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 22:37:48,673][49750] Updated weights for policy 0, policy_version 287671 (0.0034) [2024-04-26 22:37:52,013][49750] Updated weights for policy 0, policy_version 287681 (0.0036) [2024-04-26 22:37:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4713365504. Throughput: 0: 50769.3. Samples: 2466155040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:37:52,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 22:37:55,055][49750] Updated weights for policy 0, policy_version 287691 (0.0031) [2024-04-26 22:37:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4713611264. Throughput: 0: 50736.9. Samples: 2466457480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:37:57,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 22:37:58,399][49750] Updated weights for policy 0, policy_version 287701 (0.0031) [2024-04-26 22:38:01,433][49750] Updated weights for policy 0, policy_version 287711 (0.0030) [2024-04-26 22:38:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 4713873408. Throughput: 0: 50634.8. Samples: 2466759780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:02,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 22:38:04,914][49750] Updated weights for policy 0, policy_version 287721 (0.0029) [2024-04-26 22:38:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4714119168. Throughput: 0: 50660.7. Samples: 2466918260. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:07,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 22:38:07,917][49750] Updated weights for policy 0, policy_version 287731 (0.0036) [2024-04-26 22:38:11,319][49750] Updated weights for policy 0, policy_version 287741 (0.0032) [2024-04-26 22:38:12,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4714381312. Throughput: 0: 50796.9. Samples: 2467227560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:12,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 22:38:14,525][49750] Updated weights for policy 0, policy_version 287751 (0.0033) [2024-04-26 22:38:17,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4714627072. Throughput: 0: 50658.3. Samples: 2467527120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:17,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-26 22:38:17,799][49750] Updated weights for policy 0, policy_version 287761 (0.0027) [2024-04-26 22:38:20,927][49750] Updated weights for policy 0, policy_version 287771 (0.0028) [2024-04-26 22:38:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.6, 300 sec: 50707.1). Total num frames: 4714872832. Throughput: 0: 50789.0. Samples: 2467683240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:22,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 22:38:24,274][49750] Updated weights for policy 0, policy_version 287781 (0.0032) [2024-04-26 22:38:27,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4715151360. Throughput: 0: 50781.8. Samples: 2467986980. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:27,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 22:38:27,251][49750] Updated weights for policy 0, policy_version 287791 (0.0033) [2024-04-26 22:38:30,726][49750] Updated weights for policy 0, policy_version 287801 (0.0027) [2024-04-26 22:38:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4715380736. Throughput: 0: 50736.1. Samples: 2468285420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 22:38:33,769][49750] Updated weights for policy 0, policy_version 287811 (0.0026) [2024-04-26 22:38:37,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4715642880. Throughput: 0: 50726.5. Samples: 2468437740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:37,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 22:38:37,221][49750] Updated weights for policy 0, policy_version 287821 (0.0033) [2024-04-26 22:38:39,314][49728] Signal inference workers to stop experience collection... (37050 times) [2024-04-26 22:38:39,360][49750] InferenceWorker_p0-w0: stopping experience collection (37050 times) [2024-04-26 22:38:39,379][49728] Signal inference workers to resume experience collection... (37050 times) [2024-04-26 22:38:39,380][49750] InferenceWorker_p0-w0: resuming experience collection (37050 times) [2024-04-26 22:38:40,253][49750] Updated weights for policy 0, policy_version 287831 (0.0037) [2024-04-26 22:38:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4715872256. Throughput: 0: 50740.0. Samples: 2468740780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:42,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 22:38:42,159][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000287835_4715888640.pth... 
[2024-04-26 22:38:42,205][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000287093_4703731712.pth [2024-04-26 22:38:43,664][49750] Updated weights for policy 0, policy_version 287841 (0.0030) [2024-04-26 22:38:46,724][49750] Updated weights for policy 0, policy_version 287851 (0.0032) [2024-04-26 22:38:47,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 4716167168. Throughput: 0: 50699.5. Samples: 2469041260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 22:38:50,241][49750] Updated weights for policy 0, policy_version 287861 (0.0032) [2024-04-26 22:38:52,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4716396544. Throughput: 0: 50786.2. Samples: 2469203640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:52,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 22:38:53,082][49750] Updated weights for policy 0, policy_version 287871 (0.0026) [2024-04-26 22:38:56,718][49750] Updated weights for policy 0, policy_version 287881 (0.0028) [2024-04-26 22:38:57,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 4716642304. Throughput: 0: 50763.1. Samples: 2469511900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 22:38:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 22:38:59,559][49750] Updated weights for policy 0, policy_version 287891 (0.0032) [2024-04-26 22:39:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4716904448. Throughput: 0: 50739.1. Samples: 2469810380. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 22:39:03,148][49750] Updated weights for policy 0, policy_version 287901 (0.0024) [2024-04-26 22:39:05,984][49750] Updated weights for policy 0, policy_version 287911 (0.0031) [2024-04-26 22:39:07,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4717182976. Throughput: 0: 50807.3. Samples: 2469969580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:07,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 22:39:09,457][49750] Updated weights for policy 0, policy_version 287921 (0.0034) [2024-04-26 22:39:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4717428736. Throughput: 0: 50983.5. Samples: 2470281240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 22:39:12,467][49750] Updated weights for policy 0, policy_version 287931 (0.0029) [2024-04-26 22:39:15,808][49750] Updated weights for policy 0, policy_version 287941 (0.0030) [2024-04-26 22:39:17,062][49517] Fps is (10 sec: 47514.8, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4717658112. Throughput: 0: 51108.6. Samples: 2470585300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:17,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 22:39:18,912][49750] Updated weights for policy 0, policy_version 287951 (0.0031) [2024-04-26 22:39:22,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4717920256. Throughput: 0: 50935.1. Samples: 2470729820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:22,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 22:39:22,386][49750] Updated weights for policy 0, policy_version 287961 (0.0037) [2024-04-26 22:39:25,270][49750] Updated weights for policy 0, policy_version 287971 (0.0035) [2024-04-26 22:39:27,062][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4718182400. Throughput: 0: 50924.8. Samples: 2471032400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 22:39:28,801][49750] Updated weights for policy 0, policy_version 287981 (0.0035) [2024-04-26 22:39:31,787][49750] Updated weights for policy 0, policy_version 287991 (0.0036) [2024-04-26 22:39:32,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4718460928. Throughput: 0: 50935.9. Samples: 2471333380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:32,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-26 22:39:35,371][49750] Updated weights for policy 0, policy_version 288001 (0.0029) [2024-04-26 22:39:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4718706688. Throughput: 0: 50924.0. Samples: 2471495220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:37,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 22:39:38,169][49750] Updated weights for policy 0, policy_version 288011 (0.0039) [2024-04-26 22:39:42,063][49517] Fps is (10 sec: 45875.2, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 4718919680. Throughput: 0: 50796.8. Samples: 2471797760. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:42,072][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 22:39:42,202][49750] Updated weights for policy 0, policy_version 288021 (0.0031) [2024-04-26 22:39:44,584][49750] Updated weights for policy 0, policy_version 288031 (0.0028) [2024-04-26 22:39:47,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4719181824. Throughput: 0: 50877.7. Samples: 2472099880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:47,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 22:39:48,661][49750] Updated weights for policy 0, policy_version 288041 (0.0035) [2024-04-26 22:39:49,218][49728] Signal inference workers to stop experience collection... (37100 times) [2024-04-26 22:39:49,264][49750] InferenceWorker_p0-w0: stopping experience collection (37100 times) [2024-04-26 22:39:49,289][49728] Signal inference workers to resume experience collection... (37100 times) [2024-04-26 22:39:49,290][49750] InferenceWorker_p0-w0: resuming experience collection (37100 times) [2024-04-26 22:39:50,957][49750] Updated weights for policy 0, policy_version 288051 (0.0031) [2024-04-26 22:39:52,063][49517] Fps is (10 sec: 55705.6, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4719476736. Throughput: 0: 50668.0. Samples: 2472249640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:52,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 22:39:55,176][49750] Updated weights for policy 0, policy_version 288061 (0.0033) [2024-04-26 22:39:57,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4719722496. Throughput: 0: 50687.1. Samples: 2472562160. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:39:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 22:39:57,359][49750] Updated weights for policy 0, policy_version 288071 (0.0030) [2024-04-26 22:40:01,749][49750] Updated weights for policy 0, policy_version 288081 (0.0032) [2024-04-26 22:40:02,062][49517] Fps is (10 sec: 45876.8, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4719935488. Throughput: 0: 50799.2. Samples: 2472871260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:40:02,062][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 22:40:03,853][49750] Updated weights for policy 0, policy_version 288091 (0.0031) [2024-04-26 22:40:07,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4720214016. Throughput: 0: 50620.9. Samples: 2473007760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:40:07,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:40:08,071][49750] Updated weights for policy 0, policy_version 288101 (0.0031) [2024-04-26 22:40:10,206][49750] Updated weights for policy 0, policy_version 288111 (0.0030) [2024-04-26 22:40:12,062][49517] Fps is (10 sec: 52428.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4720459776. Throughput: 0: 50658.3. Samples: 2473312020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-26 22:40:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 22:40:14,503][49750] Updated weights for policy 0, policy_version 288121 (0.0031) [2024-04-26 22:40:16,769][49750] Updated weights for policy 0, policy_version 288131 (0.0034) [2024-04-26 22:40:17,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.3, 300 sec: 50762.6). Total num frames: 4720738304. Throughput: 0: 50738.7. Samples: 2473616620. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:40:17,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 22:40:20,928][49750] Updated weights for policy 0, policy_version 288141 (0.0030) [2024-04-26 22:40:22,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4720967680. Throughput: 0: 50621.6. Samples: 2473773200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:40:22,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 22:40:23,693][49750] Updated weights for policy 0, policy_version 288151 (0.0032) [2024-04-26 22:40:27,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4721213440. Throughput: 0: 50753.5. Samples: 2474081660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:40:27,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 22:40:27,431][49750] Updated weights for policy 0, policy_version 288161 (0.0034) [2024-04-26 22:40:30,093][49750] Updated weights for policy 0, policy_version 288171 (0.0037) [2024-04-26 22:40:32,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4721475584. Throughput: 0: 50777.8. Samples: 2474384880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:40:32,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 22:40:33,761][49750] Updated weights for policy 0, policy_version 288181 (0.0031) [2024-04-26 22:40:36,406][49750] Updated weights for policy 0, policy_version 288191 (0.0032) [2024-04-26 22:40:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4721737728. Throughput: 0: 50863.2. Samples: 2474538480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:40:37,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 22:40:40,277][49750] Updated weights for policy 0, policy_version 288201 (0.0028) [2024-04-26 22:40:42,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51609.6, 300 sec: 50818.1). Total num frames: 4722016256. Throughput: 0: 50634.2. Samples: 2474840700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:40:42,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 22:40:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000288209_4722016256.pth... [2024-04-26 22:40:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000287465_4709826560.pth [2024-04-26 22:40:42,931][49750] Updated weights for policy 0, policy_version 288211 (0.0028) [2024-04-26 22:40:46,806][49750] Updated weights for policy 0, policy_version 288221 (0.0029) [2024-04-26 22:40:47,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4722229248. Throughput: 0: 50624.2. Samples: 2475149360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:40:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 22:40:47,475][49728] Signal inference workers to stop experience collection... (37150 times) [2024-04-26 22:40:47,476][49728] Signal inference workers to resume experience collection... (37150 times) [2024-04-26 22:40:47,499][49750] InferenceWorker_p0-w0: stopping experience collection (37150 times) [2024-04-26 22:40:47,499][49750] InferenceWorker_p0-w0: resuming experience collection (37150 times) [2024-04-26 22:40:49,404][49750] Updated weights for policy 0, policy_version 288231 (0.0031) [2024-04-26 22:40:52,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4722491392. Throughput: 0: 50706.8. Samples: 2475289560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:40:52,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 22:40:53,257][49750] Updated weights for policy 0, policy_version 288241 (0.0033) [2024-04-26 22:40:55,775][49750] Updated weights for policy 0, policy_version 288251 (0.0031) [2024-04-26 22:40:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4722737152. Throughput: 0: 50781.1. Samples: 2475597180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:40:57,063][49517] Avg episode reward: [(0, '0.478')] [2024-04-26 22:40:59,638][49750] Updated weights for policy 0, policy_version 288261 (0.0037) [2024-04-26 22:41:02,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.3, 300 sec: 50762.6). Total num frames: 4723015680. Throughput: 0: 50695.7. Samples: 2475897920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:41:02,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 22:41:02,103][49750] Updated weights for policy 0, policy_version 288271 (0.0029) [2024-04-26 22:41:06,098][49750] Updated weights for policy 0, policy_version 288281 (0.0031) [2024-04-26 22:41:07,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4723277824. Throughput: 0: 50727.1. Samples: 2476055920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:41:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 22:41:08,610][49750] Updated weights for policy 0, policy_version 288291 (0.0031) [2024-04-26 22:41:12,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4723490816. Throughput: 0: 50616.0. Samples: 2476359380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:41:12,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 22:41:12,698][49750] Updated weights for policy 0, policy_version 288301 (0.0031) [2024-04-26 22:41:15,184][49750] Updated weights for policy 0, policy_version 288311 (0.0031) [2024-04-26 22:41:17,063][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 4723736576. Throughput: 0: 50593.6. Samples: 2476661600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:41:17,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 22:41:19,155][49750] Updated weights for policy 0, policy_version 288321 (0.0029) [2024-04-26 22:41:21,540][49750] Updated weights for policy 0, policy_version 288331 (0.0031) [2024-04-26 22:41:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4724015104. Throughput: 0: 50588.6. Samples: 2476814960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-26 22:41:22,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 22:41:25,507][49750] Updated weights for policy 0, policy_version 288341 (0.0034) [2024-04-26 22:41:27,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4724277248. Throughput: 0: 50669.5. Samples: 2477120820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:41:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 22:41:27,970][49750] Updated weights for policy 0, policy_version 288351 (0.0028) [2024-04-26 22:41:31,903][49750] Updated weights for policy 0, policy_version 288361 (0.0030) [2024-04-26 22:41:32,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4724506624. Throughput: 0: 50679.2. Samples: 2477429920. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:41:32,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 22:41:34,520][49750] Updated weights for policy 0, policy_version 288371 (0.0031) [2024-04-26 22:41:37,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4724768768. Throughput: 0: 50538.6. Samples: 2477563800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:41:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 22:41:38,437][49750] Updated weights for policy 0, policy_version 288381 (0.0029) [2024-04-26 22:41:41,162][49750] Updated weights for policy 0, policy_version 288391 (0.0032) [2024-04-26 22:41:42,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4725014528. Throughput: 0: 50476.9. Samples: 2477868640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:41:42,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 22:41:44,930][49750] Updated weights for policy 0, policy_version 288401 (0.0032) [2024-04-26 22:41:47,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4725293056. Throughput: 0: 50571.7. Samples: 2478173640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:41:47,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 22:41:47,781][49750] Updated weights for policy 0, policy_version 288411 (0.0032) [2024-04-26 22:41:51,308][49750] Updated weights for policy 0, policy_version 288421 (0.0034) [2024-04-26 22:41:52,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4725538816. Throughput: 0: 50679.7. Samples: 2478336500. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:41:52,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 22:41:54,246][49750] Updated weights for policy 0, policy_version 288431 (0.0031) [2024-04-26 22:41:57,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4725768192. Throughput: 0: 50663.5. Samples: 2478639240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:41:57,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 22:41:57,442][49728] Signal inference workers to stop experience collection... (37200 times) [2024-04-26 22:41:57,480][49750] InferenceWorker_p0-w0: stopping experience collection (37200 times) [2024-04-26 22:41:57,542][49728] Signal inference workers to resume experience collection... (37200 times) [2024-04-26 22:41:57,542][49750] InferenceWorker_p0-w0: resuming experience collection (37200 times) [2024-04-26 22:41:57,666][49750] Updated weights for policy 0, policy_version 288441 (0.0033) [2024-04-26 22:42:00,635][49750] Updated weights for policy 0, policy_version 288451 (0.0034) [2024-04-26 22:42:02,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4726030336. Throughput: 0: 50617.8. Samples: 2478939400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:42:02,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 22:42:04,216][49750] Updated weights for policy 0, policy_version 288461 (0.0033) [2024-04-26 22:42:07,045][49750] Updated weights for policy 0, policy_version 288471 (0.0036) [2024-04-26 22:42:07,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4726308864. Throughput: 0: 50557.3. Samples: 2479090040. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:42:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 22:42:10,555][49750] Updated weights for policy 0, policy_version 288481 (0.0032) [2024-04-26 22:42:12,063][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4726554624. Throughput: 0: 50609.3. Samples: 2479398240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:42:12,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 22:42:13,773][49750] Updated weights for policy 0, policy_version 288491 (0.0035) [2024-04-26 22:42:17,062][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4726800384. Throughput: 0: 50600.4. Samples: 2479706940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:42:17,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 22:42:17,064][49750] Updated weights for policy 0, policy_version 288501 (0.0031) [2024-04-26 22:42:20,212][49750] Updated weights for policy 0, policy_version 288511 (0.0036) [2024-04-26 22:42:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 4727046144. Throughput: 0: 50762.3. Samples: 2479848100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:42:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:42:23,406][49750] Updated weights for policy 0, policy_version 288521 (0.0027) [2024-04-26 22:42:26,693][49750] Updated weights for policy 0, policy_version 288531 (0.0032) [2024-04-26 22:42:27,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4727291904. Throughput: 0: 50871.6. Samples: 2480157860. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:42:27,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 22:42:29,786][49750] Updated weights for policy 0, policy_version 288541 (0.0036) [2024-04-26 22:42:32,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4727570432. Throughput: 0: 50918.5. Samples: 2480464980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:42:32,063][49517] Avg episode reward: [(0, '0.453')] [2024-04-26 22:42:33,122][49750] Updated weights for policy 0, policy_version 288551 (0.0028) [2024-04-26 22:42:36,175][49750] Updated weights for policy 0, policy_version 288561 (0.0044) [2024-04-26 22:42:37,062][49517] Fps is (10 sec: 55706.4, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4727848960. Throughput: 0: 50722.3. Samples: 2480619000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-26 22:42:37,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 22:42:39,696][49750] Updated weights for policy 0, policy_version 288571 (0.0038) [2024-04-26 22:42:42,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4728061952. Throughput: 0: 50752.7. Samples: 2480923120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:42:42,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 22:42:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000288578_4728061952.pth... [2024-04-26 22:42:42,135][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000287835_4715888640.pth [2024-04-26 22:42:42,714][49750] Updated weights for policy 0, policy_version 288581 (0.0038) [2024-04-26 22:42:46,185][49750] Updated weights for policy 0, policy_version 288591 (0.0029) [2024-04-26 22:42:47,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4728324096. Throughput: 0: 50949.8. Samples: 2481232140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:42:47,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 22:42:49,022][49750] Updated weights for policy 0, policy_version 288601 (0.0029) [2024-04-26 22:42:52,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4728569856. Throughput: 0: 50807.4. Samples: 2481376380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:42:52,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:42:52,671][49750] Updated weights for policy 0, policy_version 288611 (0.0030) [2024-04-26 22:42:55,537][49750] Updated weights for policy 0, policy_version 288621 (0.0025) [2024-04-26 22:42:57,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4728848384. Throughput: 0: 50860.0. Samples: 2481686940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:42:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:42:59,024][49750] Updated weights for policy 0, policy_version 288631 (0.0032) [2024-04-26 22:43:01,909][49750] Updated weights for policy 0, policy_version 288641 (0.0033) [2024-04-26 22:43:02,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4729094144. Throughput: 0: 50729.4. Samples: 2481989760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:02,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 22:43:05,371][49750] Updated weights for policy 0, policy_version 288651 (0.0033) [2024-04-26 22:43:07,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4729323520. Throughput: 0: 51112.5. Samples: 2482148160. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 22:43:08,318][49750] Updated weights for policy 0, policy_version 288661 (0.0028) [2024-04-26 22:43:12,016][49750] Updated weights for policy 0, policy_version 288671 (0.0030) [2024-04-26 22:43:12,067][49517] Fps is (10 sec: 49131.7, 60 sec: 50514.0, 300 sec: 50706.4). Total num frames: 4729585664. Throughput: 0: 50861.2. Samples: 2482446820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:12,067][49517] Avg episode reward: [(0, '0.680')] [2024-04-26 22:43:14,562][49728] Signal inference workers to stop experience collection... (37250 times) [2024-04-26 22:43:14,563][49728] Signal inference workers to resume experience collection... (37250 times) [2024-04-26 22:43:14,585][49750] InferenceWorker_p0-w0: stopping experience collection (37250 times) [2024-04-26 22:43:14,585][49750] InferenceWorker_p0-w0: resuming experience collection (37250 times) [2024-04-26 22:43:14,839][49750] Updated weights for policy 0, policy_version 288681 (0.0031) [2024-04-26 22:43:17,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4729847808. Throughput: 0: 50689.3. Samples: 2482746000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:43:18,483][49750] Updated weights for policy 0, policy_version 288691 (0.0027) [2024-04-26 22:43:21,335][49750] Updated weights for policy 0, policy_version 288701 (0.0030) [2024-04-26 22:43:22,062][49517] Fps is (10 sec: 52450.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4730109952. Throughput: 0: 50893.3. Samples: 2482909200. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:22,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 22:43:24,815][49750] Updated weights for policy 0, policy_version 288711 (0.0030) [2024-04-26 22:43:27,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4730355712. Throughput: 0: 50825.0. Samples: 2483210240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:27,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 22:43:27,667][49750] Updated weights for policy 0, policy_version 288721 (0.0032) [2024-04-26 22:43:31,335][49750] Updated weights for policy 0, policy_version 288731 (0.0028) [2024-04-26 22:43:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4730601472. Throughput: 0: 50779.0. Samples: 2483517180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:32,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 22:43:34,031][49750] Updated weights for policy 0, policy_version 288741 (0.0028) [2024-04-26 22:43:37,062][49517] Fps is (10 sec: 49152.5, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 4730847232. Throughput: 0: 50710.3. Samples: 2483658340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:37,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 22:43:37,843][49750] Updated weights for policy 0, policy_version 288751 (0.0028) [2024-04-26 22:43:40,431][49750] Updated weights for policy 0, policy_version 288761 (0.0030) [2024-04-26 22:43:42,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.7, 300 sec: 50762.6). Total num frames: 4731142144. Throughput: 0: 50568.5. Samples: 2483962520. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:42,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:43:44,324][49750] Updated weights for policy 0, policy_version 288771 (0.0037) [2024-04-26 22:43:46,942][49750] Updated weights for policy 0, policy_version 288781 (0.0036) [2024-04-26 22:43:47,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4731387904. Throughput: 0: 50782.6. Samples: 2484274980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-26 22:43:47,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 22:43:50,700][49750] Updated weights for policy 0, policy_version 288791 (0.0027) [2024-04-26 22:43:52,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4731617280. Throughput: 0: 50688.9. Samples: 2484429160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:43:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 22:43:53,312][49750] Updated weights for policy 0, policy_version 288801 (0.0031) [2024-04-26 22:43:57,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4731863040. Throughput: 0: 50743.7. Samples: 2484730080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:43:57,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 22:43:57,187][49750] Updated weights for policy 0, policy_version 288811 (0.0034) [2024-04-26 22:43:59,741][49750] Updated weights for policy 0, policy_version 288821 (0.0031) [2024-04-26 22:44:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4732125184. Throughput: 0: 50673.1. Samples: 2485026280. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:02,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 22:44:03,731][49750] Updated weights for policy 0, policy_version 288831 (0.0034) [2024-04-26 22:44:06,179][49750] Updated weights for policy 0, policy_version 288841 (0.0026) [2024-04-26 22:44:07,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4732403712. Throughput: 0: 50601.7. Samples: 2485186280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:07,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 22:44:10,030][49750] Updated weights for policy 0, policy_version 288851 (0.0029) [2024-04-26 22:44:12,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50793.9, 300 sec: 50762.6). Total num frames: 4732633088. Throughput: 0: 50758.4. Samples: 2485494360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:12,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 22:44:12,868][49750] Updated weights for policy 0, policy_version 288861 (0.0029) [2024-04-26 22:44:16,511][49750] Updated weights for policy 0, policy_version 288871 (0.0030) [2024-04-26 22:44:17,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4732878848. Throughput: 0: 50681.9. Samples: 2485797880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:17,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-26 22:44:19,259][49750] Updated weights for policy 0, policy_version 288881 (0.0034) [2024-04-26 22:44:22,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4733124608. Throughput: 0: 50743.2. Samples: 2485941780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:22,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 22:44:22,940][49728] Signal inference workers to stop experience collection... 
(37300 times) [2024-04-26 22:44:22,986][49750] InferenceWorker_p0-w0: stopping experience collection (37300 times) [2024-04-26 22:44:23,047][49728] Signal inference workers to resume experience collection... (37300 times) [2024-04-26 22:44:23,047][49750] InferenceWorker_p0-w0: resuming experience collection (37300 times) [2024-04-26 22:44:23,171][49750] Updated weights for policy 0, policy_version 288891 (0.0029) [2024-04-26 22:44:25,600][49750] Updated weights for policy 0, policy_version 288901 (0.0037) [2024-04-26 22:44:27,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 4733403136. Throughput: 0: 50670.7. Samples: 2486242700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:27,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-26 22:44:29,466][49750] Updated weights for policy 0, policy_version 288911 (0.0035) [2024-04-26 22:44:32,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4733665280. Throughput: 0: 50578.2. Samples: 2486551000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:32,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 22:44:32,102][49750] Updated weights for policy 0, policy_version 288921 (0.0030) [2024-04-26 22:44:35,902][49750] Updated weights for policy 0, policy_version 288931 (0.0030) [2024-04-26 22:44:37,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4733878272. Throughput: 0: 50680.7. Samples: 2486709800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:37,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 22:44:38,550][49750] Updated weights for policy 0, policy_version 288941 (0.0031) [2024-04-26 22:44:42,063][49517] Fps is (10 sec: 47512.9, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4734140416. Throughput: 0: 50650.5. Samples: 2487009360. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:42,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 22:44:42,186][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000288950_4734156800.pth... [2024-04-26 22:44:42,232][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000288209_4722016256.pth [2024-04-26 22:44:42,408][49750] Updated weights for policy 0, policy_version 288951 (0.0032) [2024-04-26 22:44:45,082][49750] Updated weights for policy 0, policy_version 288961 (0.0030) [2024-04-26 22:44:47,062][49517] Fps is (10 sec: 54068.4, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4734418944. Throughput: 0: 50868.8. Samples: 2487315380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 22:44:48,904][49750] Updated weights for policy 0, policy_version 288971 (0.0036) [2024-04-26 22:44:51,560][49750] Updated weights for policy 0, policy_version 288981 (0.0033) [2024-04-26 22:44:52,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4734681088. Throughput: 0: 50803.6. Samples: 2487472440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 22:44:55,316][49750] Updated weights for policy 0, policy_version 288991 (0.0034) [2024-04-26 22:44:57,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4734910464. Throughput: 0: 50611.5. Samples: 2487771880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:44:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 22:44:57,887][49750] Updated weights for policy 0, policy_version 289001 (0.0036) [2024-04-26 22:45:01,852][49750] Updated weights for policy 0, policy_version 289011 (0.0027) [2024-04-26 22:45:02,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.2, 300 sec: 50651.6). 
Total num frames: 4735156224. Throughput: 0: 50749.1. Samples: 2488081580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-26 22:45:02,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 22:45:04,420][49750] Updated weights for policy 0, policy_version 289021 (0.0036) [2024-04-26 22:45:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4735418368. Throughput: 0: 50572.4. Samples: 2488217540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:45:08,251][49750] Updated weights for policy 0, policy_version 289031 (0.0031) [2024-04-26 22:45:10,693][49750] Updated weights for policy 0, policy_version 289041 (0.0034) [2024-04-26 22:45:12,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4735696896. Throughput: 0: 50682.9. Samples: 2488523440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:12,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 22:45:14,695][49750] Updated weights for policy 0, policy_version 289051 (0.0028) [2024-04-26 22:45:17,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4735959040. Throughput: 0: 50731.8. Samples: 2488833940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:17,063][49517] Avg episode reward: [(0, '0.462')] [2024-04-26 22:45:17,119][49750] Updated weights for policy 0, policy_version 289061 (0.0027) [2024-04-26 22:45:21,215][49750] Updated weights for policy 0, policy_version 289071 (0.0027) [2024-04-26 22:45:22,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4736172032. Throughput: 0: 50529.6. Samples: 2488983620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:22,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 22:45:23,058][49728] Signal inference workers to stop experience collection... 
(37350 times) [2024-04-26 22:45:23,107][49750] InferenceWorker_p0-w0: stopping experience collection (37350 times) [2024-04-26 22:45:23,130][49728] Signal inference workers to resume experience collection... (37350 times) [2024-04-26 22:45:23,131][49750] InferenceWorker_p0-w0: resuming experience collection (37350 times) [2024-04-26 22:45:23,588][49750] Updated weights for policy 0, policy_version 289081 (0.0027) [2024-04-26 22:45:27,062][49517] Fps is (10 sec: 44237.5, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 4736401408. Throughput: 0: 50727.3. Samples: 2489292080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:27,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 22:45:27,651][49750] Updated weights for policy 0, policy_version 289091 (0.0031) [2024-04-26 22:45:30,049][49750] Updated weights for policy 0, policy_version 289101 (0.0028) [2024-04-26 22:45:32,063][49517] Fps is (10 sec: 52427.4, 60 sec: 50517.1, 300 sec: 50707.1). Total num frames: 4736696320. Throughput: 0: 50639.2. Samples: 2489594160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:45:34,255][49750] Updated weights for policy 0, policy_version 289111 (0.0034) [2024-04-26 22:45:36,551][49750] Updated weights for policy 0, policy_version 289121 (0.0032) [2024-04-26 22:45:37,063][49517] Fps is (10 sec: 57343.2, 60 sec: 51609.6, 300 sec: 50707.1). Total num frames: 4736974848. Throughput: 0: 50557.2. Samples: 2489747520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:37,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 22:45:40,707][49750] Updated weights for policy 0, policy_version 289131 (0.0027) [2024-04-26 22:45:42,062][49517] Fps is (10 sec: 50791.8, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 4737204224. Throughput: 0: 50834.8. Samples: 2490059440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:42,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-26 22:45:43,031][49750] Updated weights for policy 0, policy_version 289141 (0.0026) [2024-04-26 22:45:47,062][49517] Fps is (10 sec: 45875.7, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4737433600. Throughput: 0: 50661.3. Samples: 2490361340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:47,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 22:45:47,124][49750] Updated weights for policy 0, policy_version 289151 (0.0028) [2024-04-26 22:45:49,442][49750] Updated weights for policy 0, policy_version 289161 (0.0031) [2024-04-26 22:45:52,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4737695744. Throughput: 0: 50652.5. Samples: 2490496900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 22:45:53,596][49750] Updated weights for policy 0, policy_version 289171 (0.0028) [2024-04-26 22:45:56,083][49750] Updated weights for policy 0, policy_version 289181 (0.0030) [2024-04-26 22:45:57,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4737974272. Throughput: 0: 50706.4. Samples: 2490805220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:45:57,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 22:46:00,125][49750] Updated weights for policy 0, policy_version 289191 (0.0034) [2024-04-26 22:46:02,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4738236416. Throughput: 0: 50609.4. Samples: 2491111360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:46:02,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 22:46:02,496][49750] Updated weights for policy 0, policy_version 289201 (0.0030) [2024-04-26 22:46:06,633][49750] Updated weights for policy 0, policy_version 289211 (0.0033) [2024-04-26 22:46:07,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4738449408. Throughput: 0: 50665.7. Samples: 2491263580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:46:07,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 22:46:08,957][49750] Updated weights for policy 0, policy_version 289221 (0.0036) [2024-04-26 22:46:12,063][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4738695168. Throughput: 0: 50639.0. Samples: 2491570840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 22:46:12,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 22:46:12,985][49750] Updated weights for policy 0, policy_version 289231 (0.0029) [2024-04-26 22:46:14,387][49728] Signal inference workers to stop experience collection... (37400 times) [2024-04-26 22:46:14,435][49750] InferenceWorker_p0-w0: stopping experience collection (37400 times) [2024-04-26 22:46:14,456][49728] Signal inference workers to resume experience collection... (37400 times) [2024-04-26 22:46:14,457][49750] InferenceWorker_p0-w0: resuming experience collection (37400 times) [2024-04-26 22:46:15,411][49750] Updated weights for policy 0, policy_version 289241 (0.0031) [2024-04-26 22:46:17,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4738973696. Throughput: 0: 50766.8. Samples: 2491878660. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:46:17,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 22:46:19,360][49750] Updated weights for policy 0, policy_version 289251 (0.0028) [2024-04-26 22:46:21,946][49750] Updated weights for policy 0, policy_version 289261 (0.0028) [2024-04-26 22:46:22,063][49517] Fps is (10 sec: 55705.6, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4739252224. Throughput: 0: 50781.8. Samples: 2492032700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:46:22,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 22:46:25,863][49750] Updated weights for policy 0, policy_version 289271 (0.0030) [2024-04-26 22:46:27,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4739497984. Throughput: 0: 50728.8. Samples: 2492342240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:46:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 22:46:28,460][49750] Updated weights for policy 0, policy_version 289281 (0.0036) [2024-04-26 22:46:32,062][49517] Fps is (10 sec: 45876.0, 60 sec: 50244.5, 300 sec: 50651.6). Total num frames: 4739710976. Throughput: 0: 50767.7. Samples: 2492645880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:46:32,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-26 22:46:32,340][49750] Updated weights for policy 0, policy_version 289291 (0.0035) [2024-04-26 22:46:34,901][49750] Updated weights for policy 0, policy_version 289301 (0.0039) [2024-04-26 22:46:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4739989504. Throughput: 0: 50842.6. Samples: 2492784820. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:46:37,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 22:46:38,891][49750] Updated weights for policy 0, policy_version 289311 (0.0030) [2024-04-26 22:46:41,395][49750] Updated weights for policy 0, policy_version 289321 (0.0037) [2024-04-26 22:46:42,063][49517] Fps is (10 sec: 54066.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4740251648. Throughput: 0: 50692.4. Samples: 2493086380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:46:42,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 22:46:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000289322_4740251648.pth... [2024-04-26 22:46:42,135][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000288578_4728061952.pth [2024-04-26 22:46:45,271][49750] Updated weights for policy 0, policy_version 289331 (0.0028) [2024-04-26 22:46:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4740513792. Throughput: 0: 50780.1. Samples: 2493396460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:46:47,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 22:46:47,815][49750] Updated weights for policy 0, policy_version 289341 (0.0031) [2024-04-26 22:46:51,647][49750] Updated weights for policy 0, policy_version 289351 (0.0031) [2024-04-26 22:46:52,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4740743168. Throughput: 0: 50928.5. Samples: 2493555360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:46:52,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 22:46:54,344][49750] Updated weights for policy 0, policy_version 289361 (0.0028) [2024-04-26 22:46:57,063][49517] Fps is (10 sec: 45874.7, 60 sec: 49971.1, 300 sec: 50651.6). Total num frames: 4740972544. Throughput: 0: 50704.0. Samples: 2493852520. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:46:57,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 22:46:58,188][49750] Updated weights for policy 0, policy_version 289371 (0.0032) [2024-04-26 22:47:01,054][49750] Updated weights for policy 0, policy_version 289381 (0.0029) [2024-04-26 22:47:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4741267456. Throughput: 0: 50642.7. Samples: 2494157580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:47:02,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 22:47:04,712][49750] Updated weights for policy 0, policy_version 289391 (0.0035) [2024-04-26 22:47:07,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4741513216. Throughput: 0: 50828.2. Samples: 2494319960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:47:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 22:47:07,394][49750] Updated weights for policy 0, policy_version 289401 (0.0031) [2024-04-26 22:47:11,066][49750] Updated weights for policy 0, policy_version 289411 (0.0033) [2024-04-26 22:47:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4741758976. Throughput: 0: 50691.0. Samples: 2494623340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:47:12,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 22:47:12,850][49728] Signal inference workers to stop experience collection... (37450 times) [2024-04-26 22:47:12,850][49728] Signal inference workers to resume experience collection... 
(37450 times) [2024-04-26 22:47:12,878][49750] InferenceWorker_p0-w0: stopping experience collection (37450 times) [2024-04-26 22:47:12,878][49750] InferenceWorker_p0-w0: resuming experience collection (37450 times) [2024-04-26 22:47:14,048][49750] Updated weights for policy 0, policy_version 289421 (0.0033) [2024-04-26 22:47:17,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4742004736. Throughput: 0: 50587.3. Samples: 2494922320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:47:17,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:47:17,617][49750] Updated weights for policy 0, policy_version 289431 (0.0028) [2024-04-26 22:47:20,606][49750] Updated weights for policy 0, policy_version 289441 (0.0025) [2024-04-26 22:47:22,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4742283264. Throughput: 0: 50707.2. Samples: 2495066640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:47:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 22:47:24,116][49750] Updated weights for policy 0, policy_version 289451 (0.0039) [2024-04-26 22:47:27,003][49750] Updated weights for policy 0, policy_version 289461 (0.0030) [2024-04-26 22:47:27,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4742529024. Throughput: 0: 50629.3. Samples: 2495364700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-26 22:47:27,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 22:47:30,452][49750] Updated weights for policy 0, policy_version 289471 (0.0030) [2024-04-26 22:47:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.5, 300 sec: 50651.5). Total num frames: 4742791168. Throughput: 0: 50687.1. Samples: 2495677380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:47:32,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 22:47:33,389][49750] Updated weights for policy 0, policy_version 289481 (0.0030) [2024-04-26 22:47:36,843][49750] Updated weights for policy 0, policy_version 289491 (0.0035) [2024-04-26 22:47:37,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4743036928. Throughput: 0: 50506.1. Samples: 2495828140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:47:37,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 22:47:39,877][49750] Updated weights for policy 0, policy_version 289501 (0.0028) [2024-04-26 22:47:42,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4743266304. Throughput: 0: 50657.9. Samples: 2496132120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:47:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 22:47:43,425][49750] Updated weights for policy 0, policy_version 289511 (0.0026) [2024-04-26 22:47:46,399][49750] Updated weights for policy 0, policy_version 289521 (0.0032) [2024-04-26 22:47:47,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4743544832. Throughput: 0: 50571.2. Samples: 2496433280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:47:47,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 22:47:49,934][49750] Updated weights for policy 0, policy_version 289531 (0.0034) [2024-04-26 22:47:52,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4743790592. Throughput: 0: 50491.8. Samples: 2496592100. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:47:52,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 22:47:52,787][49750] Updated weights for policy 0, policy_version 289541 (0.0032) [2024-04-26 22:47:56,422][49750] Updated weights for policy 0, policy_version 289551 (0.0029) [2024-04-26 22:47:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 4744052736. Throughput: 0: 50537.0. Samples: 2496897500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:47:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 22:47:59,102][49750] Updated weights for policy 0, policy_version 289561 (0.0032) [2024-04-26 22:48:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4744282112. Throughput: 0: 50712.1. Samples: 2497204360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:48:02,071][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 22:48:02,794][49750] Updated weights for policy 0, policy_version 289571 (0.0031) [2024-04-26 22:48:05,562][49750] Updated weights for policy 0, policy_version 289581 (0.0035) [2024-04-26 22:48:07,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.1, 300 sec: 50707.8). Total num frames: 4744544256. Throughput: 0: 50710.4. Samples: 2497348620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:48:07,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 22:48:09,354][49750] Updated weights for policy 0, policy_version 289591 (0.0031) [2024-04-26 22:48:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4744806400. Throughput: 0: 50714.0. Samples: 2497646820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:48:12,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 22:48:12,248][49750] Updated weights for policy 0, policy_version 289601 (0.0030) [2024-04-26 22:48:15,638][49728] Signal inference workers to stop experience collection... 
(37500 times) [2024-04-26 22:48:15,676][49750] InferenceWorker_p0-w0: stopping experience collection (37500 times) [2024-04-26 22:48:15,745][49728] Signal inference workers to resume experience collection... (37500 times) [2024-04-26 22:48:15,746][49750] InferenceWorker_p0-w0: resuming experience collection (37500 times) [2024-04-26 22:48:15,899][49750] Updated weights for policy 0, policy_version 289611 (0.0027) [2024-04-26 22:48:17,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4745068544. Throughput: 0: 50593.7. Samples: 2497954100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:48:17,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 22:48:18,629][49750] Updated weights for policy 0, policy_version 289621 (0.0031) [2024-04-26 22:48:22,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.3, 300 sec: 50596.1). Total num frames: 4745281536. Throughput: 0: 50670.5. Samples: 2498108300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:48:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:48:22,274][49750] Updated weights for policy 0, policy_version 289631 (0.0033) [2024-04-26 22:48:25,009][49750] Updated weights for policy 0, policy_version 289641 (0.0033) [2024-04-26 22:48:27,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.3, 300 sec: 50707.0). Total num frames: 4745560064. Throughput: 0: 50697.9. Samples: 2498413540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:48:27,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 22:48:28,807][49750] Updated weights for policy 0, policy_version 289651 (0.0030) [2024-04-26 22:48:31,388][49750] Updated weights for policy 0, policy_version 289661 (0.0029) [2024-04-26 22:48:32,063][49517] Fps is (10 sec: 54065.7, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4745822208. Throughput: 0: 50730.9. Samples: 2498716180. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:48:32,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 22:48:35,088][49750] Updated weights for policy 0, policy_version 289671 (0.0032) [2024-04-26 22:48:37,062][49517] Fps is (10 sec: 52430.6, 60 sec: 50790.6, 300 sec: 50651.6). Total num frames: 4746084352. Throughput: 0: 50706.4. Samples: 2498873880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:48:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 22:48:37,856][49750] Updated weights for policy 0, policy_version 289681 (0.0033) [2024-04-26 22:48:41,577][49750] Updated weights for policy 0, policy_version 289691 (0.0032) [2024-04-26 22:48:42,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4746346496. Throughput: 0: 50756.9. Samples: 2499181560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-26 22:48:42,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 22:48:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000289694_4746346496.pth... [2024-04-26 22:48:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000288950_4734156800.pth [2024-04-26 22:48:44,390][49750] Updated weights for policy 0, policy_version 289701 (0.0032) [2024-04-26 22:48:47,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4746575872. Throughput: 0: 50731.7. Samples: 2499487280. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:48:47,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 22:48:48,121][49750] Updated weights for policy 0, policy_version 289711 (0.0035) [2024-04-26 22:48:50,723][49750] Updated weights for policy 0, policy_version 289721 (0.0042) [2024-04-26 22:48:52,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4746821632. Throughput: 0: 50628.0. Samples: 2499626880. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:48:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 22:48:54,559][49750] Updated weights for policy 0, policy_version 289731 (0.0030) [2024-04-26 22:48:57,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4747100160. Throughput: 0: 50714.2. Samples: 2499928960. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:48:57,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 22:48:57,121][49750] Updated weights for policy 0, policy_version 289741 (0.0033) [2024-04-26 22:49:00,868][49750] Updated weights for policy 0, policy_version 289751 (0.0031) [2024-04-26 22:49:02,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4747362304. Throughput: 0: 50747.6. Samples: 2500237740. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:02,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 22:49:03,628][49750] Updated weights for policy 0, policy_version 289761 (0.0036) [2024-04-26 22:49:07,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4747575296. Throughput: 0: 50845.7. Samples: 2500396360. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 22:49:07,434][49750] Updated weights for policy 0, policy_version 289771 (0.0030) [2024-04-26 22:49:10,040][49750] Updated weights for policy 0, policy_version 289781 (0.0038) [2024-04-26 22:49:12,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4747821056. Throughput: 0: 50614.5. Samples: 2500691180. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:12,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 22:49:13,975][49750] Updated weights for policy 0, policy_version 289791 (0.0036) [2024-04-26 22:49:15,772][49728] Signal inference workers to stop experience collection... (37550 times) [2024-04-26 22:49:15,773][49728] Signal inference workers to resume experience collection... (37550 times) [2024-04-26 22:49:15,785][49750] InferenceWorker_p0-w0: stopping experience collection (37550 times) [2024-04-26 22:49:15,809][49750] InferenceWorker_p0-w0: resuming experience collection (37550 times) [2024-04-26 22:49:16,556][49750] Updated weights for policy 0, policy_version 289801 (0.0034) [2024-04-26 22:49:17,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4748115968. Throughput: 0: 50756.2. Samples: 2501000200. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:17,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 22:49:20,364][49750] Updated weights for policy 0, policy_version 289811 (0.0032) [2024-04-26 22:49:22,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4748361728. Throughput: 0: 50836.9. Samples: 2501161540. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:22,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 22:49:22,950][49750] Updated weights for policy 0, policy_version 289821 (0.0033) [2024-04-26 22:49:26,717][49750] Updated weights for policy 0, policy_version 289831 (0.0029) [2024-04-26 22:49:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.6, 300 sec: 50651.5). Total num frames: 4748607488. Throughput: 0: 50708.0. Samples: 2501463420. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:27,072][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 22:49:29,320][49750] Updated weights for policy 0, policy_version 289841 (0.0031) [2024-04-26 22:49:32,063][49517] Fps is (10 sec: 49150.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4748853248. Throughput: 0: 50611.6. Samples: 2501764820. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:32,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 22:49:33,124][49750] Updated weights for policy 0, policy_version 289851 (0.0033) [2024-04-26 22:49:35,749][49750] Updated weights for policy 0, policy_version 289861 (0.0029) [2024-04-26 22:49:37,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4749099008. Throughput: 0: 50835.3. Samples: 2501914460. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:37,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 22:49:39,645][49750] Updated weights for policy 0, policy_version 289871 (0.0036) [2024-04-26 22:49:42,062][49517] Fps is (10 sec: 54068.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4749393920. Throughput: 0: 50854.6. Samples: 2502217420. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:42,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 22:49:42,209][49750] Updated weights for policy 0, policy_version 289881 (0.0033) [2024-04-26 22:49:46,049][49750] Updated weights for policy 0, policy_version 289891 (0.0034) [2024-04-26 22:49:47,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4749639680. Throughput: 0: 50899.6. Samples: 2502528220. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:47,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 22:49:48,771][49750] Updated weights for policy 0, policy_version 289901 (0.0033) [2024-04-26 22:49:52,062][49517] Fps is (10 sec: 45876.0, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 4749852672. Throughput: 0: 50781.4. Samples: 2502681520. Policy #0 lag: (min: 1.0, avg: 12.5, max: 21.0) [2024-04-26 22:49:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 22:49:52,486][49750] Updated weights for policy 0, policy_version 289911 (0.0032) [2024-04-26 22:49:55,075][49750] Updated weights for policy 0, policy_version 289921 (0.0030) [2024-04-26 22:49:57,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4750114816. Throughput: 0: 50822.6. Samples: 2502978200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:49:57,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 22:49:58,898][49750] Updated weights for policy 0, policy_version 289931 (0.0027) [2024-04-26 22:50:01,408][49750] Updated weights for policy 0, policy_version 289941 (0.0032) [2024-04-26 22:50:02,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4750393344. Throughput: 0: 50610.7. Samples: 2503277680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:02,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 22:50:05,360][49750] Updated weights for policy 0, policy_version 289951 (0.0029) [2024-04-26 22:50:07,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4750655488. Throughput: 0: 50820.3. Samples: 2503448460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:07,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 22:50:07,887][49750] Updated weights for policy 0, policy_version 289961 (0.0030) [2024-04-26 22:50:11,779][49750] Updated weights for policy 0, policy_version 289971 (0.0029) [2024-04-26 22:50:12,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.5, 300 sec: 50651.6). Total num frames: 4750901248. Throughput: 0: 50889.4. Samples: 2503753440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 22:50:12,737][49728] Signal inference workers to stop experience collection... (37600 times) [2024-04-26 22:50:12,737][49728] Signal inference workers to resume experience collection... (37600 times) [2024-04-26 22:50:12,774][49750] InferenceWorker_p0-w0: stopping experience collection (37600 times) [2024-04-26 22:50:12,774][49750] InferenceWorker_p0-w0: resuming experience collection (37600 times) [2024-04-26 22:50:14,452][49750] Updated weights for policy 0, policy_version 289981 (0.0030) [2024-04-26 22:50:17,062][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 4751114240. Throughput: 0: 50854.1. Samples: 2504053240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 22:50:18,248][49750] Updated weights for policy 0, policy_version 289991 (0.0031) [2024-04-26 22:50:20,795][49750] Updated weights for policy 0, policy_version 290001 (0.0032) [2024-04-26 22:50:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4751392768. Throughput: 0: 50642.3. Samples: 2504193360. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:22,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 22:50:24,699][49750] Updated weights for policy 0, policy_version 290011 (0.0036) [2024-04-26 22:50:27,062][49517] Fps is (10 sec: 57344.4, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4751687680. Throughput: 0: 50809.1. Samples: 2504503820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:27,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 22:50:27,088][49750] Updated weights for policy 0, policy_version 290021 (0.0033) [2024-04-26 22:50:31,147][49750] Updated weights for policy 0, policy_version 290031 (0.0030) [2024-04-26 22:50:32,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.7, 300 sec: 50651.6). Total num frames: 4751917056. Throughput: 0: 50714.6. Samples: 2504810380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 22:50:33,581][49750] Updated weights for policy 0, policy_version 290041 (0.0026) [2024-04-26 22:50:37,063][49517] Fps is (10 sec: 45874.5, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4752146432. Throughput: 0: 50743.4. Samples: 2504964980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:37,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 22:50:37,633][49750] Updated weights for policy 0, policy_version 290051 (0.0031) [2024-04-26 22:50:40,457][49750] Updated weights for policy 0, policy_version 290061 (0.0031) [2024-04-26 22:50:42,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4752392192. Throughput: 0: 50729.8. Samples: 2505261040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:42,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 22:50:42,139][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000290064_4752408576.pth... 
[2024-04-26 22:50:42,187][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000289322_4740251648.pth [2024-04-26 22:50:43,996][49750] Updated weights for policy 0, policy_version 290071 (0.0031) [2024-04-26 22:50:47,062][49517] Fps is (10 sec: 54068.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4752687104. Throughput: 0: 50745.9. Samples: 2505561240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 22:50:47,067][49750] Updated weights for policy 0, policy_version 290081 (0.0036) [2024-04-26 22:50:50,554][49750] Updated weights for policy 0, policy_version 290091 (0.0033) [2024-04-26 22:50:52,062][49517] Fps is (10 sec: 55706.0, 60 sec: 51609.6, 300 sec: 50762.7). Total num frames: 4752949248. Throughput: 0: 50764.2. Samples: 2505732840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:52,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 22:50:53,384][49750] Updated weights for policy 0, policy_version 290101 (0.0028) [2024-04-26 22:50:57,063][49517] Fps is (10 sec: 47512.3, 60 sec: 50790.3, 300 sec: 50596.0). Total num frames: 4753162240. Throughput: 0: 50719.3. Samples: 2506035820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:50:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 22:50:57,115][49750] Updated weights for policy 0, policy_version 290111 (0.0035) [2024-04-26 22:50:59,951][49750] Updated weights for policy 0, policy_version 290121 (0.0033) [2024-04-26 22:51:02,062][49517] Fps is (10 sec: 45874.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4753408000. Throughput: 0: 50764.9. Samples: 2506337660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:51:02,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-26 22:51:03,543][49750] Updated weights for policy 0, policy_version 290131 (0.0032) [2024-04-26 22:51:06,465][49750] Updated weights for policy 0, policy_version 290141 (0.0030) [2024-04-26 22:51:07,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4753686528. Throughput: 0: 50820.4. Samples: 2506480280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-26 22:51:07,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 22:51:09,900][49750] Updated weights for policy 0, policy_version 290151 (0.0035) [2024-04-26 22:51:12,063][49517] Fps is (10 sec: 55704.3, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 4753965056. Throughput: 0: 50750.3. Samples: 2506787600. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:12,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 22:51:12,883][49750] Updated weights for policy 0, policy_version 290161 (0.0037) [2024-04-26 22:51:16,326][49750] Updated weights for policy 0, policy_version 290171 (0.0031) [2024-04-26 22:51:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.5, 300 sec: 50596.0). Total num frames: 4754178048. Throughput: 0: 50647.2. Samples: 2507089500. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:17,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 22:51:19,356][49750] Updated weights for policy 0, policy_version 290181 (0.0029) [2024-04-26 22:51:22,063][49517] Fps is (10 sec: 45875.8, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 4754423808. Throughput: 0: 50544.9. Samples: 2507239500. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:22,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 22:51:22,258][49728] Signal inference workers to stop experience collection... 
(37650 times) [2024-04-26 22:51:22,297][49750] InferenceWorker_p0-w0: stopping experience collection (37650 times) [2024-04-26 22:51:22,359][49728] Signal inference workers to resume experience collection... (37650 times) [2024-04-26 22:51:22,359][49750] InferenceWorker_p0-w0: resuming experience collection (37650 times) [2024-04-26 22:51:22,815][49750] Updated weights for policy 0, policy_version 290191 (0.0031) [2024-04-26 22:51:25,772][49750] Updated weights for policy 0, policy_version 290201 (0.0037) [2024-04-26 22:51:27,063][49517] Fps is (10 sec: 50789.4, 60 sec: 49971.0, 300 sec: 50762.6). Total num frames: 4754685952. Throughput: 0: 50727.4. Samples: 2507543780. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:27,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 22:51:29,257][49750] Updated weights for policy 0, policy_version 290211 (0.0032) [2024-04-26 22:51:32,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4754964480. Throughput: 0: 50730.5. Samples: 2507844120. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:32,063][49517] Avg episode reward: [(0, '0.704')] [2024-04-26 22:51:32,175][49750] Updated weights for policy 0, policy_version 290221 (0.0032) [2024-04-26 22:51:35,619][49750] Updated weights for policy 0, policy_version 290231 (0.0031) [2024-04-26 22:51:37,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4755226624. Throughput: 0: 50563.9. Samples: 2508008220. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:37,063][49517] Avg episode reward: [(0, '0.690')] [2024-04-26 22:51:38,468][49750] Updated weights for policy 0, policy_version 290241 (0.0031) [2024-04-26 22:51:42,030][49750] Updated weights for policy 0, policy_version 290251 (0.0031) [2024-04-26 22:51:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 4755472384. Throughput: 0: 50779.4. 
Samples: 2508320880. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 22:51:44,806][49750] Updated weights for policy 0, policy_version 290261 (0.0030) [2024-04-26 22:51:47,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4755701760. Throughput: 0: 50904.5. Samples: 2508628360. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:47,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-26 22:51:48,568][49750] Updated weights for policy 0, policy_version 290271 (0.0030) [2024-04-26 22:51:51,383][49750] Updated weights for policy 0, policy_version 290281 (0.0029) [2024-04-26 22:51:52,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4755980288. Throughput: 0: 50858.2. Samples: 2508768900. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:52,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 22:51:54,911][49750] Updated weights for policy 0, policy_version 290291 (0.0028) [2024-04-26 22:51:57,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51609.8, 300 sec: 50818.2). Total num frames: 4756258816. Throughput: 0: 50890.5. Samples: 2509077660. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:51:57,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 22:51:57,901][49750] Updated weights for policy 0, policy_version 290301 (0.0025) [2024-04-26 22:52:01,202][49750] Updated weights for policy 0, policy_version 290311 (0.0030) [2024-04-26 22:52:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4756471808. Throughput: 0: 50923.9. Samples: 2509381080. 
Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:52:02,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 22:52:04,364][49750] Updated weights for policy 0, policy_version 290321 (0.0033) [2024-04-26 22:52:07,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4756733952. Throughput: 0: 50941.8. Samples: 2509531880. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:52:07,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 22:52:07,654][49750] Updated weights for policy 0, policy_version 290331 (0.0033) [2024-04-26 22:52:10,690][49750] Updated weights for policy 0, policy_version 290341 (0.0037) [2024-04-26 22:52:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4756979712. Throughput: 0: 50935.5. Samples: 2509835880. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:52:12,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 22:52:14,065][49750] Updated weights for policy 0, policy_version 290351 (0.0028) [2024-04-26 22:52:17,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4757258240. Throughput: 0: 50958.6. Samples: 2510137260. Policy #0 lag: (min: 2.0, avg: 9.9, max: 23.0) [2024-04-26 22:52:17,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 22:52:17,212][49750] Updated weights for policy 0, policy_version 290361 (0.0032) [2024-04-26 22:52:19,283][49728] Signal inference workers to stop experience collection... (37700 times) [2024-04-26 22:52:19,283][49728] Signal inference workers to resume experience collection... 
(37700 times) [2024-04-26 22:52:19,309][49750] InferenceWorker_p0-w0: stopping experience collection (37700 times) [2024-04-26 22:52:19,309][49750] InferenceWorker_p0-w0: resuming experience collection (37700 times) [2024-04-26 22:52:20,531][49750] Updated weights for policy 0, policy_version 290371 (0.0036) [2024-04-26 22:52:22,062][49517] Fps is (10 sec: 50791.7, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4757487616. Throughput: 0: 50907.2. Samples: 2510299040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:52:22,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 22:52:23,757][49750] Updated weights for policy 0, policy_version 290381 (0.0028) [2024-04-26 22:52:27,026][49750] Updated weights for policy 0, policy_version 290391 (0.0031) [2024-04-26 22:52:27,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4757766144. Throughput: 0: 50776.7. Samples: 2510605840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:52:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 22:52:30,060][49750] Updated weights for policy 0, policy_version 290401 (0.0029) [2024-04-26 22:52:32,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4757979136. Throughput: 0: 50779.8. Samples: 2510913460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:52:32,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 22:52:33,404][49750] Updated weights for policy 0, policy_version 290411 (0.0032) [2024-04-26 22:52:36,379][49750] Updated weights for policy 0, policy_version 290421 (0.0039) [2024-04-26 22:52:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4758257664. Throughput: 0: 50945.4. Samples: 2511061440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:52:37,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 22:52:39,868][49750] Updated weights for policy 0, policy_version 290431 (0.0032) [2024-04-26 22:52:42,062][49517] Fps is (10 sec: 57344.5, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4758552576. Throughput: 0: 50787.0. Samples: 2511363080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:52:42,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 22:52:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000290439_4758552576.pth... [2024-04-26 22:52:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000289694_4746346496.pth [2024-04-26 22:52:42,728][49750] Updated weights for policy 0, policy_version 290441 (0.0026) [2024-04-26 22:52:46,360][49750] Updated weights for policy 0, policy_version 290451 (0.0035) [2024-04-26 22:52:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4758765568. Throughput: 0: 50670.8. Samples: 2511661260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:52:47,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 22:52:49,184][49750] Updated weights for policy 0, policy_version 290461 (0.0028) [2024-04-26 22:52:52,063][49517] Fps is (10 sec: 45874.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4759011328. Throughput: 0: 50650.6. Samples: 2511811160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:52:52,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 22:52:52,939][49750] Updated weights for policy 0, policy_version 290471 (0.0027) [2024-04-26 22:52:56,136][49750] Updated weights for policy 0, policy_version 290481 (0.0031) [2024-04-26 22:52:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4759257088. Throughput: 0: 50721.5. Samples: 2512118340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:52:57,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 22:52:59,260][49750] Updated weights for policy 0, policy_version 290491 (0.0037) [2024-04-26 22:53:02,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4759552000. Throughput: 0: 50783.6. Samples: 2512422520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:53:02,063][49517] Avg episode reward: [(0, '0.709')] [2024-04-26 22:53:02,649][49750] Updated weights for policy 0, policy_version 290501 (0.0033) [2024-04-26 22:53:05,629][49750] Updated weights for policy 0, policy_version 290511 (0.0034) [2024-04-26 22:53:07,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4759781376. Throughput: 0: 50795.9. Samples: 2512584860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:53:07,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 22:53:07,111][49728] Signal inference workers to stop experience collection... (37750 times) [2024-04-26 22:53:07,111][49728] Signal inference workers to resume experience collection... (37750 times) [2024-04-26 22:53:07,127][49750] InferenceWorker_p0-w0: stopping experience collection (37750 times) [2024-04-26 22:53:07,127][49750] InferenceWorker_p0-w0: resuming experience collection (37750 times) [2024-04-26 22:53:09,100][49750] Updated weights for policy 0, policy_version 290521 (0.0033) [2024-04-26 22:53:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4760043520. Throughput: 0: 50777.4. Samples: 2512890820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:53:12,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 22:53:12,126][49750] Updated weights for policy 0, policy_version 290531 (0.0035) [2024-04-26 22:53:15,682][49750] Updated weights for policy 0, policy_version 290541 (0.0032) [2024-04-26 22:53:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.3, 300 sec: 50818.1). Total num frames: 4760272896. Throughput: 0: 50772.1. Samples: 2513198200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:53:17,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 22:53:18,673][49750] Updated weights for policy 0, policy_version 290551 (0.0031) [2024-04-26 22:53:22,024][49750] Updated weights for policy 0, policy_version 290561 (0.0036) [2024-04-26 22:53:22,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 4760551424. Throughput: 0: 50724.3. Samples: 2513344040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:53:22,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 22:53:25,072][49750] Updated weights for policy 0, policy_version 290571 (0.0031) [2024-04-26 22:53:27,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4760813568. Throughput: 0: 50801.3. Samples: 2513649140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:53:27,063][49517] Avg episode reward: [(0, '0.677')] [2024-04-26 22:53:28,354][49750] Updated weights for policy 0, policy_version 290581 (0.0028) [2024-04-26 22:53:31,410][49750] Updated weights for policy 0, policy_version 290591 (0.0030) [2024-04-26 22:53:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4761042944. Throughput: 0: 50754.6. Samples: 2513945220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 22:53:32,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 22:53:34,964][49750] Updated weights for policy 0, policy_version 290601 (0.0028) [2024-04-26 22:53:37,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4761305088. Throughput: 0: 50921.3. Samples: 2514102620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:53:37,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 22:53:37,902][49750] Updated weights for policy 0, policy_version 290611 (0.0028) [2024-04-26 22:53:41,450][49750] Updated weights for policy 0, policy_version 290621 (0.0029) [2024-04-26 22:53:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.1, 300 sec: 50707.1). Total num frames: 4761534464. Throughput: 0: 50640.0. Samples: 2514397140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:53:42,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 22:53:44,450][49750] Updated weights for policy 0, policy_version 290631 (0.0028) [2024-04-26 22:53:47,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4761829376. Throughput: 0: 50713.3. Samples: 2514704620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:53:47,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 22:53:47,845][49750] Updated weights for policy 0, policy_version 290641 (0.0029) [2024-04-26 22:53:51,243][49750] Updated weights for policy 0, policy_version 290651 (0.0033) [2024-04-26 22:53:52,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4762075136. Throughput: 0: 50766.0. Samples: 2514869340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:53:52,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 22:53:54,427][49750] Updated weights for policy 0, policy_version 290661 (0.0029) [2024-04-26 22:53:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4762337280. Throughput: 0: 50712.8. Samples: 2515172900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:53:57,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 22:53:57,730][49750] Updated weights for policy 0, policy_version 290671 (0.0028) [2024-04-26 22:54:00,998][49750] Updated weights for policy 0, policy_version 290681 (0.0025) [2024-04-26 22:54:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 4762566656. Throughput: 0: 50594.7. Samples: 2515474960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:02,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 22:54:04,016][49750] Updated weights for policy 0, policy_version 290691 (0.0038) [2024-04-26 22:54:07,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4762812416. Throughput: 0: 50721.8. Samples: 2515626520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:07,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 22:54:07,644][49750] Updated weights for policy 0, policy_version 290701 (0.0032) [2024-04-26 22:54:10,247][49750] Updated weights for policy 0, policy_version 290711 (0.0031) [2024-04-26 22:54:11,530][49728] Signal inference workers to stop experience collection... (37800 times) [2024-04-26 22:54:11,530][49728] Signal inference workers to resume experience collection... 
(37800 times) [2024-04-26 22:54:11,556][49750] InferenceWorker_p0-w0: stopping experience collection (37800 times) [2024-04-26 22:54:11,556][49750] InferenceWorker_p0-w0: resuming experience collection (37800 times) [2024-04-26 22:54:12,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4763107328. Throughput: 0: 50659.9. Samples: 2515928840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:12,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 22:54:13,928][49750] Updated weights for policy 0, policy_version 290721 (0.0029) [2024-04-26 22:54:16,738][49750] Updated weights for policy 0, policy_version 290731 (0.0030) [2024-04-26 22:54:17,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4763336704. Throughput: 0: 50891.9. Samples: 2516235360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:17,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 22:54:20,214][49750] Updated weights for policy 0, policy_version 290741 (0.0028) [2024-04-26 22:54:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4763598848. Throughput: 0: 50987.8. Samples: 2516397060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:22,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 22:54:23,080][49750] Updated weights for policy 0, policy_version 290751 (0.0027) [2024-04-26 22:54:26,942][49750] Updated weights for policy 0, policy_version 290761 (0.0034) [2024-04-26 22:54:27,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 4763828224. Throughput: 0: 51080.9. Samples: 2516695780. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:27,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 22:54:29,512][49750] Updated weights for policy 0, policy_version 290771 (0.0028) [2024-04-26 22:54:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4764106752. Throughput: 0: 50903.6. Samples: 2516995280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 22:54:33,454][49750] Updated weights for policy 0, policy_version 290781 (0.0034) [2024-04-26 22:54:36,122][49750] Updated weights for policy 0, policy_version 290791 (0.0028) [2024-04-26 22:54:37,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4764368896. Throughput: 0: 50803.3. Samples: 2517155480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 22:54:39,809][49750] Updated weights for policy 0, policy_version 290801 (0.0031) [2024-04-26 22:54:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4764614656. Throughput: 0: 50932.5. Samples: 2517464860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:42,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 22:54:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000290810_4764631040.pth... [2024-04-26 22:54:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000290064_4752408576.pth [2024-04-26 22:54:42,427][49750] Updated weights for policy 0, policy_version 290811 (0.0027) [2024-04-26 22:54:46,140][49750] Updated weights for policy 0, policy_version 290821 (0.0029) [2024-04-26 22:54:47,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4764844032. Throughput: 0: 50997.0. Samples: 2517769820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-26 22:54:47,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 22:54:49,091][49750] Updated weights for policy 0, policy_version 290831 (0.0037) [2024-04-26 22:54:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4765106176. Throughput: 0: 50840.8. Samples: 2517914360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:54:52,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 22:54:52,595][49750] Updated weights for policy 0, policy_version 290841 (0.0035) [2024-04-26 22:54:55,693][49750] Updated weights for policy 0, policy_version 290851 (0.0033) [2024-04-26 22:54:57,063][49517] Fps is (10 sec: 54066.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4765384704. Throughput: 0: 50892.1. Samples: 2518218980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:54:57,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 22:54:59,090][49750] Updated weights for policy 0, policy_version 290861 (0.0034) [2024-04-26 22:55:01,998][49750] Updated weights for policy 0, policy_version 290871 (0.0032) [2024-04-26 22:55:02,063][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4765630464. Throughput: 0: 50859.2. Samples: 2518524020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 22:55:05,404][49750] Updated weights for policy 0, policy_version 290881 (0.0043) [2024-04-26 22:55:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4765876224. Throughput: 0: 50727.6. Samples: 2518679800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:07,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 22:55:08,498][49750] Updated weights for policy 0, policy_version 290891 (0.0033) [2024-04-26 22:55:09,529][49728] Signal inference workers to stop experience collection... (37850 times) [2024-04-26 22:55:09,549][49750] InferenceWorker_p0-w0: stopping experience collection (37850 times) [2024-04-26 22:55:09,595][49728] Signal inference workers to resume experience collection... (37850 times) [2024-04-26 22:55:09,595][49750] InferenceWorker_p0-w0: resuming experience collection (37850 times) [2024-04-26 22:55:11,758][49750] Updated weights for policy 0, policy_version 290901 (0.0033) [2024-04-26 22:55:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4766121984. Throughput: 0: 50815.5. Samples: 2518982480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:12,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 22:55:14,958][49750] Updated weights for policy 0, policy_version 290911 (0.0025) [2024-04-26 22:55:17,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4766384128. Throughput: 0: 50856.8. Samples: 2519283840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:17,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 22:55:18,344][49750] Updated weights for policy 0, policy_version 290921 (0.0035) [2024-04-26 22:55:21,314][49750] Updated weights for policy 0, policy_version 290931 (0.0030) [2024-04-26 22:55:22,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4766662656. Throughput: 0: 50835.8. Samples: 2519443100. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:22,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-26 22:55:25,061][49750] Updated weights for policy 0, policy_version 290941 (0.0031) [2024-04-26 22:55:27,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4766908416. Throughput: 0: 50888.3. Samples: 2519754840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:27,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 22:55:27,596][49750] Updated weights for policy 0, policy_version 290951 (0.0042) [2024-04-26 22:55:31,456][49750] Updated weights for policy 0, policy_version 290961 (0.0028) [2024-04-26 22:55:32,063][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4767121408. Throughput: 0: 50895.3. Samples: 2520060120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:32,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 22:55:33,983][49750] Updated weights for policy 0, policy_version 290971 (0.0029) [2024-04-26 22:55:37,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4767399936. Throughput: 0: 50775.8. Samples: 2520199260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:37,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 22:55:37,802][49750] Updated weights for policy 0, policy_version 290981 (0.0029) [2024-04-26 22:55:40,535][49750] Updated weights for policy 0, policy_version 290991 (0.0036) [2024-04-26 22:55:42,063][49517] Fps is (10 sec: 55705.8, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4767678464. Throughput: 0: 50859.1. Samples: 2520507640. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:42,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 22:55:44,265][49750] Updated weights for policy 0, policy_version 291001 (0.0031) [2024-04-26 22:55:46,875][49750] Updated weights for policy 0, policy_version 291011 (0.0031) [2024-04-26 22:55:47,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4767924224. Throughput: 0: 50887.6. Samples: 2520813960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:47,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 22:55:50,635][49750] Updated weights for policy 0, policy_version 291021 (0.0030) [2024-04-26 22:55:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51336.7, 300 sec: 50929.3). Total num frames: 4768186368. Throughput: 0: 51031.0. Samples: 2520976200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:52,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 22:55:53,201][49750] Updated weights for policy 0, policy_version 291031 (0.0034) [2024-04-26 22:55:56,962][49750] Updated weights for policy 0, policy_version 291041 (0.0040) [2024-04-26 22:55:57,064][49517] Fps is (10 sec: 49143.9, 60 sec: 50516.0, 300 sec: 50873.4). Total num frames: 4768415744. Throughput: 0: 50947.5. Samples: 2521275200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-26 22:55:57,065][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 22:55:59,581][49750] Updated weights for policy 0, policy_version 291051 (0.0031) [2024-04-26 22:56:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4768677888. Throughput: 0: 51037.5. Samples: 2521580520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 22:56:03,219][49750] Updated weights for policy 0, policy_version 291061 (0.0034) [2024-04-26 22:56:06,087][49750] Updated weights for policy 0, policy_version 291071 (0.0038) [2024-04-26 22:56:07,063][49517] Fps is (10 sec: 52436.9, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4768940032. Throughput: 0: 50956.9. Samples: 2521736160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 22:56:09,735][49750] Updated weights for policy 0, policy_version 291081 (0.0029) [2024-04-26 22:56:12,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4769202176. Throughput: 0: 50878.8. Samples: 2522044380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 22:56:12,583][49750] Updated weights for policy 0, policy_version 291091 (0.0030) [2024-04-26 22:56:16,293][49750] Updated weights for policy 0, policy_version 291101 (0.0031) [2024-04-26 22:56:16,916][49728] Signal inference workers to stop experience collection... (37900 times) [2024-04-26 22:56:16,956][49750] InferenceWorker_p0-w0: stopping experience collection (37900 times) [2024-04-26 22:56:16,994][49728] Signal inference workers to resume experience collection... (37900 times) [2024-04-26 22:56:16,994][49750] InferenceWorker_p0-w0: resuming experience collection (37900 times) [2024-04-26 22:56:17,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4769431552. Throughput: 0: 50840.1. Samples: 2522347920. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:17,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 22:56:18,939][49750] Updated weights for policy 0, policy_version 291111 (0.0029) [2024-04-26 22:56:22,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4769677312. Throughput: 0: 50937.6. Samples: 2522491460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:22,072][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 22:56:22,802][49750] Updated weights for policy 0, policy_version 291121 (0.0034) [2024-04-26 22:56:25,345][49750] Updated weights for policy 0, policy_version 291131 (0.0027) [2024-04-26 22:56:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4769955840. Throughput: 0: 50842.7. Samples: 2522795560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:27,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 22:56:29,510][49750] Updated weights for policy 0, policy_version 291141 (0.0034) [2024-04-26 22:56:31,832][49750] Updated weights for policy 0, policy_version 291151 (0.0041) [2024-04-26 22:56:32,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4770217984. Throughput: 0: 50840.3. Samples: 2523101780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:32,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 22:56:35,783][49750] Updated weights for policy 0, policy_version 291161 (0.0031) [2024-04-26 22:56:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4770480128. Throughput: 0: 50822.2. Samples: 2523263200. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:37,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 22:56:38,230][49750] Updated weights for policy 0, policy_version 291171 (0.0032) [2024-04-26 22:56:42,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4770693120. Throughput: 0: 50979.2. Samples: 2523569180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:42,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 22:56:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000291180_4770693120.pth... [2024-04-26 22:56:42,134][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000290439_4758552576.pth [2024-04-26 22:56:42,301][49750] Updated weights for policy 0, policy_version 291181 (0.0029) [2024-04-26 22:56:44,826][49750] Updated weights for policy 0, policy_version 291191 (0.0027) [2024-04-26 22:56:47,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4770971648. Throughput: 0: 50796.3. Samples: 2523866360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:47,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 22:56:48,553][49750] Updated weights for policy 0, policy_version 291201 (0.0031) [2024-04-26 22:56:51,376][49750] Updated weights for policy 0, policy_version 291211 (0.0030) [2024-04-26 22:56:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4771217408. Throughput: 0: 50778.4. Samples: 2524021180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:52,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 22:56:54,984][49750] Updated weights for policy 0, policy_version 291221 (0.0026) [2024-04-26 22:56:57,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51337.8, 300 sec: 50929.2). Total num frames: 4771495936. Throughput: 0: 50769.3. Samples: 2524329000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:56:57,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 22:56:57,796][49750] Updated weights for policy 0, policy_version 291231 (0.0034) [2024-04-26 22:57:01,376][49750] Updated weights for policy 0, policy_version 291241 (0.0027) [2024-04-26 22:57:02,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4771741696. Throughput: 0: 50826.7. Samples: 2524635120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:57:02,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 22:57:04,249][49750] Updated weights for policy 0, policy_version 291251 (0.0034) [2024-04-26 22:57:07,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4771971072. Throughput: 0: 50965.9. Samples: 2524784920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:57:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 22:57:07,842][49750] Updated weights for policy 0, policy_version 291261 (0.0031) [2024-04-26 22:57:10,727][49750] Updated weights for policy 0, policy_version 291271 (0.0031) [2024-04-26 22:57:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4772233216. Throughput: 0: 50889.9. Samples: 2525085600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-26 22:57:12,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 22:57:14,093][49750] Updated weights for policy 0, policy_version 291281 (0.0032) [2024-04-26 22:57:17,003][49750] Updated weights for policy 0, policy_version 291291 (0.0034) [2024-04-26 22:57:17,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4772511744. Throughput: 0: 50821.5. Samples: 2525388740. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:57:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 22:57:20,599][49750] Updated weights for policy 0, policy_version 291301 (0.0037) [2024-04-26 22:57:22,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4772757504. Throughput: 0: 50903.5. Samples: 2525553860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:57:22,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 22:57:23,565][49750] Updated weights for policy 0, policy_version 291311 (0.0032) [2024-04-26 22:57:27,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4772986880. Throughput: 0: 50764.5. Samples: 2525853580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:57:27,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 22:57:27,083][49750] Updated weights for policy 0, policy_version 291321 (0.0038) [2024-04-26 22:57:30,090][49750] Updated weights for policy 0, policy_version 291331 (0.0036) [2024-04-26 22:57:32,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4773249024. Throughput: 0: 50840.8. Samples: 2526154200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:57:32,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 22:57:33,584][49750] Updated weights for policy 0, policy_version 291341 (0.0029) [2024-04-26 22:57:36,465][49750] Updated weights for policy 0, policy_version 291351 (0.0036) [2024-04-26 22:57:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4773511168. Throughput: 0: 50806.3. Samples: 2526307460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:57:37,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 22:57:39,980][49750] Updated weights for policy 0, policy_version 291361 (0.0031) [2024-04-26 22:57:42,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4773773312. Throughput: 0: 50746.4. Samples: 2526612580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:57:42,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 22:57:42,750][49750] Updated weights for policy 0, policy_version 291371 (0.0036) [2024-04-26 22:57:46,240][49750] Updated weights for policy 0, policy_version 291381 (0.0028) [2024-04-26 22:57:47,062][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4774019072. Throughput: 0: 50699.5. Samples: 2526916600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:57:47,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 22:57:49,520][49750] Updated weights for policy 0, policy_version 291391 (0.0030) [2024-04-26 22:57:52,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4774248448. Throughput: 0: 50786.7. Samples: 2527070320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:57:52,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 22:57:52,785][49750] Updated weights for policy 0, policy_version 291401 (0.0030) [2024-04-26 22:57:56,062][49750] Updated weights for policy 0, policy_version 291411 (0.0037) [2024-04-26 22:57:56,502][49728] Signal inference workers to stop experience collection... (37950 times) [2024-04-26 22:57:56,507][49728] Signal inference workers to resume experience collection... 
(37950 times) [2024-04-26 22:57:56,535][49750] InferenceWorker_p0-w0: stopping experience collection (37950 times) [2024-04-26 22:57:56,535][49750] InferenceWorker_p0-w0: resuming experience collection (37950 times) [2024-04-26 22:57:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4774526976. Throughput: 0: 50913.6. Samples: 2527376720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:57:57,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 22:57:59,077][49750] Updated weights for policy 0, policy_version 291421 (0.0027) [2024-04-26 22:58:02,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4774789120. Throughput: 0: 50817.3. Samples: 2527675520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:58:02,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 22:58:02,375][49750] Updated weights for policy 0, policy_version 291431 (0.0029) [2024-04-26 22:58:05,586][49750] Updated weights for policy 0, policy_version 291441 (0.0034) [2024-04-26 22:58:07,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4775051264. Throughput: 0: 50805.3. Samples: 2527840100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:58:07,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 22:58:08,832][49750] Updated weights for policy 0, policy_version 291451 (0.0030) [2024-04-26 22:58:11,972][49750] Updated weights for policy 0, policy_version 291461 (0.0037) [2024-04-26 22:58:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4775297024. Throughput: 0: 50930.9. Samples: 2528145480. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:58:12,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 22:58:15,348][49750] Updated weights for policy 0, policy_version 291471 (0.0031) [2024-04-26 22:58:17,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4775526400. Throughput: 0: 51053.0. Samples: 2528451580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:58:17,063][49517] Avg episode reward: [(0, '0.498')] [2024-04-26 22:58:18,415][49750] Updated weights for policy 0, policy_version 291481 (0.0030) [2024-04-26 22:58:21,598][49750] Updated weights for policy 0, policy_version 291491 (0.0031) [2024-04-26 22:58:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4775788544. Throughput: 0: 50763.0. Samples: 2528591800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-26 22:58:22,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 22:58:25,002][49750] Updated weights for policy 0, policy_version 291501 (0.0028) [2024-04-26 22:58:27,063][49517] Fps is (10 sec: 55705.4, 60 sec: 51609.5, 300 sec: 50984.8). Total num frames: 4776083456. Throughput: 0: 50925.2. Samples: 2528904220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:58:27,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 22:58:27,938][49750] Updated weights for policy 0, policy_version 291511 (0.0030) [2024-04-26 22:58:31,311][49750] Updated weights for policy 0, policy_version 291521 (0.0032) [2024-04-26 22:58:32,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4776329216. Throughput: 0: 50991.0. Samples: 2529211200. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:58:32,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 22:58:34,439][49750] Updated weights for policy 0, policy_version 291531 (0.0033) [2024-04-26 22:58:37,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4776558592. Throughput: 0: 51023.6. Samples: 2529366380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:58:37,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 22:58:37,599][49750] Updated weights for policy 0, policy_version 291541 (0.0032) [2024-04-26 22:58:40,754][49750] Updated weights for policy 0, policy_version 291551 (0.0031) [2024-04-26 22:58:42,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4776804352. Throughput: 0: 51092.5. Samples: 2529675880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:58:42,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 22:58:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000291554_4776820736.pth... [2024-04-26 22:58:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000290810_4764631040.pth [2024-04-26 22:58:43,961][49750] Updated weights for policy 0, policy_version 291561 (0.0029) [2024-04-26 22:58:47,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4777082880. Throughput: 0: 51307.6. Samples: 2529984360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:58:47,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 22:58:47,306][49750] Updated weights for policy 0, policy_version 291571 (0.0030) [2024-04-26 22:58:50,370][49750] Updated weights for policy 0, policy_version 291581 (0.0032) [2024-04-26 22:58:52,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 4777345024. Throughput: 0: 51120.3. Samples: 2530140500. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:58:52,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 22:58:53,930][49750] Updated weights for policy 0, policy_version 291591 (0.0027) [2024-04-26 22:58:56,806][49750] Updated weights for policy 0, policy_version 291601 (0.0031) [2024-04-26 22:58:57,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4777607168. Throughput: 0: 51057.4. Samples: 2530443060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:58:57,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 22:59:00,642][49750] Updated weights for policy 0, policy_version 291611 (0.0036) [2024-04-26 22:59:02,062][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 4777836544. Throughput: 0: 51082.6. Samples: 2530750300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:59:02,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 22:59:03,288][49750] Updated weights for policy 0, policy_version 291621 (0.0031) [2024-04-26 22:59:07,063][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4778065920. Throughput: 0: 51043.4. Samples: 2530888760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:59:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 22:59:07,196][49750] Updated weights for policy 0, policy_version 291631 (0.0031) [2024-04-26 22:59:07,608][49728] Signal inference workers to stop experience collection... (38000 times) [2024-04-26 22:59:07,608][49728] Signal inference workers to resume experience collection... 
(38000 times) [2024-04-26 22:59:07,620][49750] InferenceWorker_p0-w0: stopping experience collection (38000 times) [2024-04-26 22:59:07,621][49750] InferenceWorker_p0-w0: resuming experience collection (38000 times) [2024-04-26 22:59:09,711][49750] Updated weights for policy 0, policy_version 291641 (0.0033) [2024-04-26 22:59:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4778360832. Throughput: 0: 50924.5. Samples: 2531195820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:59:12,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 22:59:13,560][49750] Updated weights for policy 0, policy_version 291651 (0.0032) [2024-04-26 22:59:16,225][49750] Updated weights for policy 0, policy_version 291661 (0.0035) [2024-04-26 22:59:17,063][49517] Fps is (10 sec: 57344.2, 60 sec: 51882.6, 300 sec: 50984.8). Total num frames: 4778639360. Throughput: 0: 50922.3. Samples: 2531502700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:59:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 22:59:20,120][49750] Updated weights for policy 0, policy_version 291671 (0.0030) [2024-04-26 22:59:22,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4778852352. Throughput: 0: 51170.5. Samples: 2531669060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:59:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 22:59:22,549][49750] Updated weights for policy 0, policy_version 291681 (0.0033) [2024-04-26 22:59:26,506][49750] Updated weights for policy 0, policy_version 291691 (0.0032) [2024-04-26 22:59:27,062][49517] Fps is (10 sec: 44237.2, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4779081728. Throughput: 0: 50917.4. Samples: 2531967160. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:59:27,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 22:59:28,914][49750] Updated weights for policy 0, policy_version 291701 (0.0035) [2024-04-26 22:59:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4779360256. Throughput: 0: 50633.7. Samples: 2532262880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:59:32,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 22:59:32,971][49750] Updated weights for policy 0, policy_version 291711 (0.0029) [2024-04-26 22:59:35,420][49750] Updated weights for policy 0, policy_version 291721 (0.0035) [2024-04-26 22:59:37,062][49517] Fps is (10 sec: 55705.1, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 4779638784. Throughput: 0: 50789.6. Samples: 2532426040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 20.0) [2024-04-26 22:59:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 22:59:39,346][49750] Updated weights for policy 0, policy_version 291731 (0.0035) [2024-04-26 22:59:41,987][49750] Updated weights for policy 0, policy_version 291741 (0.0030) [2024-04-26 22:59:42,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4779884544. Throughput: 0: 50811.3. Samples: 2532729560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 22:59:42,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 22:59:45,731][49750] Updated weights for policy 0, policy_version 291751 (0.0026) [2024-04-26 22:59:47,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4780113920. Throughput: 0: 50711.5. Samples: 2533032320. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 22:59:47,063][49517] Avg episode reward: [(0, '0.687')] [2024-04-26 22:59:48,394][49750] Updated weights for policy 0, policy_version 291761 (0.0034) [2024-04-26 22:59:52,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.2, 300 sec: 50762.7). Total num frames: 4780359680. Throughput: 0: 50740.2. Samples: 2533172060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 22:59:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 22:59:52,148][49750] Updated weights for policy 0, policy_version 291771 (0.0026) [2024-04-26 22:59:54,735][49750] Updated weights for policy 0, policy_version 291781 (0.0033) [2024-04-26 22:59:57,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4780638208. Throughput: 0: 50742.3. Samples: 2533479220. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 22:59:57,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 22:59:58,838][49750] Updated weights for policy 0, policy_version 291791 (0.0031) [2024-04-26 23:00:01,217][49750] Updated weights for policy 0, policy_version 291801 (0.0029) [2024-04-26 23:00:02,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4780916736. Throughput: 0: 50621.3. Samples: 2533780660. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:02,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 23:00:05,302][49750] Updated weights for policy 0, policy_version 291811 (0.0028) [2024-04-26 23:00:07,062][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4781129728. Throughput: 0: 50647.2. Samples: 2533948180. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:07,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:00:07,743][49750] Updated weights for policy 0, policy_version 291821 (0.0029) [2024-04-26 23:00:11,829][49750] Updated weights for policy 0, policy_version 291831 (0.0035) [2024-04-26 23:00:12,062][49517] Fps is (10 sec: 44237.4, 60 sec: 49971.3, 300 sec: 50762.7). Total num frames: 4781359104. Throughput: 0: 50616.5. Samples: 2534244900. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:12,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 23:00:14,228][49750] Updated weights for policy 0, policy_version 291841 (0.0032) [2024-04-26 23:00:17,063][49517] Fps is (10 sec: 50789.7, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 4781637632. Throughput: 0: 50581.6. Samples: 2534539060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:17,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 23:00:18,274][49750] Updated weights for policy 0, policy_version 291851 (0.0030) [2024-04-26 23:00:20,661][49750] Updated weights for policy 0, policy_version 291861 (0.0027) [2024-04-26 23:00:21,710][49728] Signal inference workers to stop experience collection... (38050 times) [2024-04-26 23:00:21,711][49728] Signal inference workers to resume experience collection... (38050 times) [2024-04-26 23:00:21,727][49750] InferenceWorker_p0-w0: stopping experience collection (38050 times) [2024-04-26 23:00:21,727][49750] InferenceWorker_p0-w0: resuming experience collection (38050 times) [2024-04-26 23:00:22,063][49517] Fps is (10 sec: 57342.7, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 4781932544. Throughput: 0: 50668.7. Samples: 2534706140. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:22,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:00:24,595][49750] Updated weights for policy 0, policy_version 291871 (0.0035) [2024-04-26 23:00:27,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 4782161920. Throughput: 0: 50661.6. Samples: 2535009340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:27,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 23:00:27,224][49750] Updated weights for policy 0, policy_version 291881 (0.0028) [2024-04-26 23:00:30,923][49750] Updated weights for policy 0, policy_version 291891 (0.0033) [2024-04-26 23:00:32,062][49517] Fps is (10 sec: 45876.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4782391296. Throughput: 0: 50681.1. Samples: 2535312960. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:32,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 23:00:33,729][49750] Updated weights for policy 0, policy_version 291901 (0.0031) [2024-04-26 23:00:37,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4782653440. Throughput: 0: 50676.3. Samples: 2535452500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:37,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 23:00:37,310][49750] Updated weights for policy 0, policy_version 291911 (0.0035) [2024-04-26 23:00:40,074][49750] Updated weights for policy 0, policy_version 291921 (0.0029) [2024-04-26 23:00:42,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4782931968. Throughput: 0: 50620.3. Samples: 2535757140. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:42,063][49517] Avg episode reward: [(0, '0.711')] [2024-04-26 23:00:42,069][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000291927_4782931968.pth... 
[2024-04-26 23:00:42,113][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000291180_4770693120.pth [2024-04-26 23:00:43,829][49750] Updated weights for policy 0, policy_version 291931 (0.0028) [2024-04-26 23:00:46,521][49750] Updated weights for policy 0, policy_version 291941 (0.0036) [2024-04-26 23:00:47,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4783194112. Throughput: 0: 50753.7. Samples: 2536064580. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:47,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 23:00:50,247][49750] Updated weights for policy 0, policy_version 291951 (0.0031) [2024-04-26 23:00:52,063][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.4, 300 sec: 50874.0). Total num frames: 4783423488. Throughput: 0: 50599.1. Samples: 2536225140. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-26 23:00:52,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 23:00:52,893][49750] Updated weights for policy 0, policy_version 291961 (0.0034) [2024-04-26 23:00:56,772][49750] Updated weights for policy 0, policy_version 291971 (0.0037) [2024-04-26 23:00:57,062][49517] Fps is (10 sec: 45876.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4783652864. Throughput: 0: 50755.6. Samples: 2536528900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:00:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 23:00:59,426][49750] Updated weights for policy 0, policy_version 291981 (0.0028) [2024-04-26 23:01:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4783915008. Throughput: 0: 50911.7. Samples: 2536830080. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:02,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 23:01:03,169][49750] Updated weights for policy 0, policy_version 291991 (0.0033) [2024-04-26 23:01:05,794][49750] Updated weights for policy 0, policy_version 292001 (0.0029) [2024-04-26 23:01:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4784193536. Throughput: 0: 50761.2. Samples: 2536990380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:07,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 23:01:10,002][49750] Updated weights for policy 0, policy_version 292011 (0.0029) [2024-04-26 23:01:12,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4784455680. Throughput: 0: 50711.6. Samples: 2537291360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:12,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 23:01:12,213][49750] Updated weights for policy 0, policy_version 292021 (0.0030) [2024-04-26 23:01:16,438][49750] Updated weights for policy 0, policy_version 292031 (0.0027) [2024-04-26 23:01:17,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4784668672. Throughput: 0: 50923.5. Samples: 2537604520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 23:01:18,560][49750] Updated weights for policy 0, policy_version 292041 (0.0036) [2024-04-26 23:01:22,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4784930816. Throughput: 0: 50746.7. Samples: 2537736100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:22,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 23:01:22,849][49750] Updated weights for policy 0, policy_version 292051 (0.0032) [2024-04-26 23:01:23,802][49728] Signal inference workers to stop experience collection... 
(38100 times) [2024-04-26 23:01:23,849][49750] InferenceWorker_p0-w0: stopping experience collection (38100 times) [2024-04-26 23:01:23,869][49728] Signal inference workers to resume experience collection... (38100 times) [2024-04-26 23:01:23,870][49750] InferenceWorker_p0-w0: resuming experience collection (38100 times) [2024-04-26 23:01:25,006][49750] Updated weights for policy 0, policy_version 292061 (0.0028) [2024-04-26 23:01:27,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4785209344. Throughput: 0: 50826.7. Samples: 2538044340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 23:01:29,220][49750] Updated weights for policy 0, policy_version 292071 (0.0035) [2024-04-26 23:01:31,320][49750] Updated weights for policy 0, policy_version 292081 (0.0030) [2024-04-26 23:01:32,062][49517] Fps is (10 sec: 55706.2, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4785487872. Throughput: 0: 50836.2. Samples: 2538352200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:32,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 23:01:35,496][49750] Updated weights for policy 0, policy_version 292091 (0.0028) [2024-04-26 23:01:37,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 4785717248. Throughput: 0: 50984.5. Samples: 2538519440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:37,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 23:01:37,828][49750] Updated weights for policy 0, policy_version 292101 (0.0035) [2024-04-26 23:01:41,770][49750] Updated weights for policy 0, policy_version 292111 (0.0030) [2024-04-26 23:01:42,063][49517] Fps is (10 sec: 45874.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4785946624. Throughput: 0: 51131.3. Samples: 2538829820. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 23:01:44,168][49750] Updated weights for policy 0, policy_version 292121 (0.0033) [2024-04-26 23:01:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 4786208768. Throughput: 0: 51018.7. Samples: 2539125920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 23:01:48,268][49750] Updated weights for policy 0, policy_version 292131 (0.0030) [2024-04-26 23:01:50,691][49750] Updated weights for policy 0, policy_version 292141 (0.0034) [2024-04-26 23:01:52,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4786487296. Throughput: 0: 50930.9. Samples: 2539282280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:52,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 23:01:54,838][49750] Updated weights for policy 0, policy_version 292151 (0.0030) [2024-04-26 23:01:57,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4786749440. Throughput: 0: 50934.2. Samples: 2539583400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:01:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:01:57,185][49750] Updated weights for policy 0, policy_version 292161 (0.0023) [2024-04-26 23:02:01,271][49750] Updated weights for policy 0, policy_version 292171 (0.0034) [2024-04-26 23:02:02,062][49517] Fps is (10 sec: 45876.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4786946048. Throughput: 0: 50749.4. Samples: 2539888240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:02:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 23:02:03,481][49750] Updated weights for policy 0, policy_version 292181 (0.0026) [2024-04-26 23:02:07,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4787224576. Throughput: 0: 51027.6. Samples: 2540032340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:07,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 23:02:07,669][49750] Updated weights for policy 0, policy_version 292191 (0.0031) [2024-04-26 23:02:09,921][49750] Updated weights for policy 0, policy_version 292201 (0.0032) [2024-04-26 23:02:12,063][49517] Fps is (10 sec: 54066.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4787486720. Throughput: 0: 50775.4. Samples: 2540329240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:12,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 23:02:14,130][49750] Updated weights for policy 0, policy_version 292211 (0.0034) [2024-04-26 23:02:14,811][49728] Signal inference workers to stop experience collection... (38150 times) [2024-04-26 23:02:14,838][49750] InferenceWorker_p0-w0: stopping experience collection (38150 times) [2024-04-26 23:02:14,922][49728] Signal inference workers to resume experience collection... (38150 times) [2024-04-26 23:02:14,923][49750] InferenceWorker_p0-w0: resuming experience collection (38150 times) [2024-04-26 23:02:16,375][49750] Updated weights for policy 0, policy_version 292221 (0.0024) [2024-04-26 23:02:17,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4787765248. Throughput: 0: 50659.1. Samples: 2540631860. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:17,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 23:02:20,627][49750] Updated weights for policy 0, policy_version 292231 (0.0026) [2024-04-26 23:02:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4788011008. Throughput: 0: 50637.6. Samples: 2540798140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:22,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 23:02:22,878][49750] Updated weights for policy 0, policy_version 292241 (0.0032) [2024-04-26 23:02:27,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 4788224000. Throughput: 0: 50506.9. Samples: 2541102620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 23:02:27,107][49750] Updated weights for policy 0, policy_version 292251 (0.0030) [2024-04-26 23:02:29,363][49750] Updated weights for policy 0, policy_version 292261 (0.0028) [2024-04-26 23:02:32,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.2, 300 sec: 50818.1). Total num frames: 4788502528. Throughput: 0: 50739.9. Samples: 2541409220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:32,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 23:02:33,398][49750] Updated weights for policy 0, policy_version 292271 (0.0028) [2024-04-26 23:02:35,758][49750] Updated weights for policy 0, policy_version 292281 (0.0032) [2024-04-26 23:02:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4788748288. Throughput: 0: 50723.3. Samples: 2541564820. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 23:02:39,715][49750] Updated weights for policy 0, policy_version 292291 (0.0031) [2024-04-26 23:02:42,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4789043200. Throughput: 0: 50912.3. Samples: 2541874460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:42,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 23:02:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000292300_4789043200.pth... [2024-04-26 23:02:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000291554_4776820736.pth [2024-04-26 23:02:42,256][49750] Updated weights for policy 0, policy_version 292301 (0.0032) [2024-04-26 23:02:46,195][49750] Updated weights for policy 0, policy_version 292311 (0.0030) [2024-04-26 23:02:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4789256192. Throughput: 0: 50883.1. Samples: 2542177980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:02:48,829][49750] Updated weights for policy 0, policy_version 292321 (0.0029) [2024-04-26 23:02:52,063][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4789518336. Throughput: 0: 50949.6. Samples: 2542325080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:52,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 23:02:52,873][49750] Updated weights for policy 0, policy_version 292331 (0.0037) [2024-04-26 23:02:53,628][49728] Signal inference workers to stop experience collection... (38200 times) [2024-04-26 23:02:53,687][49750] InferenceWorker_p0-w0: stopping experience collection (38200 times) [2024-04-26 23:02:53,734][49728] Signal inference workers to resume experience collection... 
(38200 times) [2024-04-26 23:02:53,734][49750] InferenceWorker_p0-w0: resuming experience collection (38200 times) [2024-04-26 23:02:55,172][49750] Updated weights for policy 0, policy_version 292341 (0.0028) [2024-04-26 23:02:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4789780480. Throughput: 0: 50966.4. Samples: 2542622720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:02:57,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:02:59,244][49750] Updated weights for policy 0, policy_version 292351 (0.0029) [2024-04-26 23:03:01,618][49750] Updated weights for policy 0, policy_version 292361 (0.0030) [2024-04-26 23:03:02,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51609.4, 300 sec: 50818.2). Total num frames: 4790042624. Throughput: 0: 51022.0. Samples: 2542927860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:03:02,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 23:03:05,719][49750] Updated weights for policy 0, policy_version 292371 (0.0027) [2024-04-26 23:03:07,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4790304768. Throughput: 0: 50980.9. Samples: 2543092280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:03:07,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 23:03:08,006][49750] Updated weights for policy 0, policy_version 292381 (0.0029) [2024-04-26 23:03:12,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4790517760. Throughput: 0: 50994.6. Samples: 2543397380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:03:12,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:03:12,134][49750] Updated weights for policy 0, policy_version 292391 (0.0036) [2024-04-26 23:03:14,398][49750] Updated weights for policy 0, policy_version 292401 (0.0028) [2024-04-26 23:03:17,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 4790779904. Throughput: 0: 50883.2. Samples: 2543698960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-26 23:03:17,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 23:03:18,431][49750] Updated weights for policy 0, policy_version 292411 (0.0036) [2024-04-26 23:03:20,715][49750] Updated weights for policy 0, policy_version 292421 (0.0035) [2024-04-26 23:03:22,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4791042048. Throughput: 0: 50746.7. Samples: 2543848420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:03:22,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 23:03:24,806][49750] Updated weights for policy 0, policy_version 292431 (0.0032) [2024-04-26 23:03:27,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51882.7, 300 sec: 50873.7). Total num frames: 4791336960. Throughput: 0: 50920.4. Samples: 2544165860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:03:27,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 23:03:27,089][49750] Updated weights for policy 0, policy_version 292441 (0.0027) [2024-04-26 23:03:31,216][49750] Updated weights for policy 0, policy_version 292451 (0.0034) [2024-04-26 23:03:32,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4791566336. Throughput: 0: 50993.3. Samples: 2544472680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:03:32,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 23:03:33,493][49750] Updated weights for policy 0, policy_version 292461 (0.0030) [2024-04-26 23:03:37,063][49517] Fps is (10 sec: 45874.1, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 4791795712. Throughput: 0: 51007.1. Samples: 2544620400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:03:37,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 23:03:37,691][49750] Updated weights for policy 0, policy_version 292471 (0.0029) [2024-04-26 23:03:39,839][49750] Updated weights for policy 0, policy_version 292481 (0.0032) [2024-04-26 23:03:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.6, 300 sec: 50818.2). Total num frames: 4792074240. Throughput: 0: 51112.0. Samples: 2544922760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:03:42,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-26 23:03:43,926][49728] Signal inference workers to stop experience collection... (38250 times) [2024-04-26 23:03:43,931][49728] Signal inference workers to resume experience collection... (38250 times) [2024-04-26 23:03:43,948][49750] InferenceWorker_p0-w0: stopping experience collection (38250 times) [2024-04-26 23:03:43,948][49750] InferenceWorker_p0-w0: resuming experience collection (38250 times) [2024-04-26 23:03:44,062][49750] Updated weights for policy 0, policy_version 292491 (0.0028) [2024-04-26 23:03:46,698][49750] Updated weights for policy 0, policy_version 292501 (0.0038) [2024-04-26 23:03:47,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4792336384. Throughput: 0: 51126.4. Samples: 2545228540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:03:47,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 23:03:50,407][49750] Updated weights for policy 0, policy_version 292511 (0.0032) [2024-04-26 23:03:52,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4792598528. Throughput: 0: 51180.4. Samples: 2545395400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:03:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 23:03:53,175][49750] Updated weights for policy 0, policy_version 292521 (0.0026) [2024-04-26 23:03:56,897][49750] Updated weights for policy 0, policy_version 292531 (0.0032) [2024-04-26 23:03:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4792827904. Throughput: 0: 51009.9. Samples: 2545692820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:03:57,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 23:03:59,478][49750] Updated weights for policy 0, policy_version 292541 (0.0034) [2024-04-26 23:04:02,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.6, 300 sec: 50929.3). Total num frames: 4793090048. Throughput: 0: 51087.1. Samples: 2545997880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:04:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 23:04:03,418][49750] Updated weights for policy 0, policy_version 292551 (0.0032) [2024-04-26 23:04:05,710][49750] Updated weights for policy 0, policy_version 292561 (0.0027) [2024-04-26 23:04:07,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4793335808. Throughput: 0: 50982.5. Samples: 2546142640. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:04:07,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 23:04:09,703][49750] Updated weights for policy 0, policy_version 292571 (0.0032) [2024-04-26 23:04:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51609.7, 300 sec: 50762.6). Total num frames: 4793614336. Throughput: 0: 50808.0. Samples: 2546452220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:04:12,063][49517] Avg episode reward: [(0, '0.681')] [2024-04-26 23:04:12,490][49750] Updated weights for policy 0, policy_version 292581 (0.0030) [2024-04-26 23:04:15,990][49750] Updated weights for policy 0, policy_version 292591 (0.0028) [2024-04-26 23:04:17,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51882.6, 300 sec: 50984.8). Total num frames: 4793892864. Throughput: 0: 50899.9. Samples: 2546763180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:04:17,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 23:04:19,089][49750] Updated weights for policy 0, policy_version 292601 (0.0027) [2024-04-26 23:04:22,063][49517] Fps is (10 sec: 49150.9, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 4794105856. Throughput: 0: 51104.9. Samples: 2546920120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:04:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 23:04:22,638][49750] Updated weights for policy 0, policy_version 292611 (0.0034) [2024-04-26 23:04:25,611][49750] Updated weights for policy 0, policy_version 292621 (0.0031) [2024-04-26 23:04:27,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4794368000. Throughput: 0: 51096.7. Samples: 2547222120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:04:27,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 23:04:29,015][49750] Updated weights for policy 0, policy_version 292631 (0.0031) [2024-04-26 23:04:32,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4794613760. Throughput: 0: 50953.3. Samples: 2547521440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-26 23:04:32,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:04:32,164][49750] Updated weights for policy 0, policy_version 292641 (0.0030) [2024-04-26 23:04:35,370][49750] Updated weights for policy 0, policy_version 292651 (0.0032) [2024-04-26 23:04:37,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4794892288. Throughput: 0: 50897.3. Samples: 2547685780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:04:37,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 23:04:38,450][49750] Updated weights for policy 0, policy_version 292661 (0.0034) [2024-04-26 23:04:41,690][49750] Updated weights for policy 0, policy_version 292671 (0.0029) [2024-04-26 23:04:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 4795138048. Throughput: 0: 51088.4. Samples: 2547991800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:04:42,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 23:04:42,149][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000292673_4795154432.pth... [2024-04-26 23:04:42,201][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000291927_4782931968.pth [2024-04-26 23:04:42,311][49728] Signal inference workers to stop experience collection... 
(38300 times) [2024-04-26 23:04:42,335][49750] InferenceWorker_p0-w0: stopping experience collection (38300 times) [2024-04-26 23:04:42,381][49728] Signal inference workers to resume experience collection... (38300 times) [2024-04-26 23:04:42,381][49750] InferenceWorker_p0-w0: resuming experience collection (38300 times) [2024-04-26 23:04:44,702][49750] Updated weights for policy 0, policy_version 292681 (0.0028) [2024-04-26 23:04:47,063][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4795367424. Throughput: 0: 51085.7. Samples: 2548296740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:04:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 23:04:48,166][49750] Updated weights for policy 0, policy_version 292691 (0.0033) [2024-04-26 23:04:51,301][49750] Updated weights for policy 0, policy_version 292701 (0.0029) [2024-04-26 23:04:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4795629568. Throughput: 0: 51056.2. Samples: 2548440160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:04:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 23:04:54,521][49750] Updated weights for policy 0, policy_version 292711 (0.0032) [2024-04-26 23:04:57,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4795908096. Throughput: 0: 50867.4. Samples: 2548741260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:04:57,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 23:04:57,714][49750] Updated weights for policy 0, policy_version 292721 (0.0031) [2024-04-26 23:05:00,872][49750] Updated weights for policy 0, policy_version 292731 (0.0028) [2024-04-26 23:05:02,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 4796170240. Throughput: 0: 50911.5. Samples: 2549054200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:05:02,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 23:05:04,099][49750] Updated weights for policy 0, policy_version 292741 (0.0034) [2024-04-26 23:05:07,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.6, 300 sec: 50984.8). Total num frames: 4796399616. Throughput: 0: 50797.6. Samples: 2549206000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:05:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 23:05:07,216][49750] Updated weights for policy 0, policy_version 292751 (0.0032) [2024-04-26 23:05:10,599][49750] Updated weights for policy 0, policy_version 292761 (0.0033) [2024-04-26 23:05:12,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50929.3). Total num frames: 4796661760. Throughput: 0: 51040.0. Samples: 2549518920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:05:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 23:05:13,676][49750] Updated weights for policy 0, policy_version 292771 (0.0032) [2024-04-26 23:05:17,040][49750] Updated weights for policy 0, policy_version 292781 (0.0030) [2024-04-26 23:05:17,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4796923904. Throughput: 0: 51051.2. Samples: 2549818740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:05:17,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 23:05:20,169][49750] Updated weights for policy 0, policy_version 292791 (0.0040) [2024-04-26 23:05:22,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 4797202432. Throughput: 0: 50910.2. Samples: 2549976740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:05:22,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 23:05:23,517][49750] Updated weights for policy 0, policy_version 292801 (0.0028) [2024-04-26 23:05:26,562][49750] Updated weights for policy 0, policy_version 292811 (0.0035) [2024-04-26 23:05:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4797415424. Throughput: 0: 50812.7. Samples: 2550278380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:05:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 23:05:29,858][49750] Updated weights for policy 0, policy_version 292821 (0.0030) [2024-04-26 23:05:32,063][49517] Fps is (10 sec: 45875.3, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4797661184. Throughput: 0: 50851.4. Samples: 2550585060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:05:32,063][49517] Avg episode reward: [(0, '0.701')] [2024-04-26 23:05:32,920][49750] Updated weights for policy 0, policy_version 292831 (0.0036) [2024-04-26 23:05:36,306][49750] Updated weights for policy 0, policy_version 292841 (0.0041) [2024-04-26 23:05:37,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4797923328. Throughput: 0: 50848.0. Samples: 2550728320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:05:37,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 23:05:39,696][49750] Updated weights for policy 0, policy_version 292851 (0.0029) [2024-04-26 23:05:42,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4798201856. Throughput: 0: 50873.7. Samples: 2551030580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-26 23:05:42,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-26 23:05:42,830][49750] Updated weights for policy 0, policy_version 292861 (0.0032) [2024-04-26 23:05:46,103][49750] Updated weights for policy 0, policy_version 292871 (0.0032) [2024-04-26 23:05:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 4798447616. Throughput: 0: 50744.2. Samples: 2551337680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:05:47,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 23:05:49,034][49728] Signal inference workers to stop experience collection... (38350 times) [2024-04-26 23:05:49,084][49750] InferenceWorker_p0-w0: stopping experience collection (38350 times) [2024-04-26 23:05:49,098][49728] Signal inference workers to resume experience collection... (38350 times) [2024-04-26 23:05:49,106][49750] InferenceWorker_p0-w0: resuming experience collection (38350 times) [2024-04-26 23:05:49,229][49750] Updated weights for policy 0, policy_version 292881 (0.0034) [2024-04-26 23:05:52,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4798676992. Throughput: 0: 50705.3. Samples: 2551487740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:05:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 23:05:52,543][49750] Updated weights for policy 0, policy_version 292891 (0.0034) [2024-04-26 23:05:55,613][49750] Updated weights for policy 0, policy_version 292901 (0.0026) [2024-04-26 23:05:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 4798939136. Throughput: 0: 50618.3. Samples: 2551796740. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:05:57,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 23:05:59,027][49750] Updated weights for policy 0, policy_version 292911 (0.0036) [2024-04-26 23:06:01,977][49750] Updated weights for policy 0, policy_version 292921 (0.0031) [2024-04-26 23:06:02,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 4799217664. Throughput: 0: 50807.2. Samples: 2552105060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:02,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 23:06:05,286][49750] Updated weights for policy 0, policy_version 292931 (0.0029) [2024-04-26 23:06:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4799463424. Throughput: 0: 50754.5. Samples: 2552260680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:07,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 23:06:08,509][49750] Updated weights for policy 0, policy_version 292941 (0.0030) [2024-04-26 23:06:11,703][49750] Updated weights for policy 0, policy_version 292951 (0.0034) [2024-04-26 23:06:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.4, 300 sec: 51040.3). Total num frames: 4799725568. Throughput: 0: 50789.8. Samples: 2552563920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:12,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 23:06:14,847][49750] Updated weights for policy 0, policy_version 292961 (0.0034) [2024-04-26 23:06:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 4799954944. Throughput: 0: 50774.9. Samples: 2552869920. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:17,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 23:06:18,188][49750] Updated weights for policy 0, policy_version 292971 (0.0036) [2024-04-26 23:06:21,336][49750] Updated weights for policy 0, policy_version 292981 (0.0037) [2024-04-26 23:06:22,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 4800217088. Throughput: 0: 50845.6. Samples: 2553016380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:22,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 23:06:24,659][49750] Updated weights for policy 0, policy_version 292991 (0.0026) [2024-04-26 23:06:27,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4800479232. Throughput: 0: 50956.2. Samples: 2553323600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:06:27,709][49750] Updated weights for policy 0, policy_version 293001 (0.0033) [2024-04-26 23:06:31,088][49750] Updated weights for policy 0, policy_version 293011 (0.0036) [2024-04-26 23:06:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4800741376. Throughput: 0: 50900.3. Samples: 2553628200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:32,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-26 23:06:34,077][49750] Updated weights for policy 0, policy_version 293021 (0.0024) [2024-04-26 23:06:37,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.3, 300 sec: 50984.8). Total num frames: 4800987136. Throughput: 0: 51003.3. Samples: 2553782900. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:37,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 23:06:37,455][49750] Updated weights for policy 0, policy_version 293031 (0.0030) [2024-04-26 23:06:40,598][49750] Updated weights for policy 0, policy_version 293041 (0.0028) [2024-04-26 23:06:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50929.2). Total num frames: 4801232896. Throughput: 0: 50881.3. Samples: 2554086400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:42,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 23:06:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000293044_4801232896.pth... [2024-04-26 23:06:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000292300_4789043200.pth [2024-04-26 23:06:44,004][49750] Updated weights for policy 0, policy_version 293051 (0.0034) [2024-04-26 23:06:46,867][49750] Updated weights for policy 0, policy_version 293061 (0.0029) [2024-04-26 23:06:47,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 4801511424. Throughput: 0: 50899.9. Samples: 2554395560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 23:06:50,357][49750] Updated weights for policy 0, policy_version 293071 (0.0031) [2024-04-26 23:06:52,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4801757184. Throughput: 0: 50907.4. Samples: 2554551520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:52,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 23:06:53,661][49750] Updated weights for policy 0, policy_version 293081 (0.0031) [2024-04-26 23:06:56,689][49750] Updated weights for policy 0, policy_version 293091 (0.0036) [2024-04-26 23:06:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.4, 300 sec: 51040.3). 
Total num frames: 4802002944. Throughput: 0: 50853.4. Samples: 2554852320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-26 23:06:57,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 23:06:59,962][49750] Updated weights for policy 0, policy_version 293101 (0.0037) [2024-04-26 23:07:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 4802248704. Throughput: 0: 50842.1. Samples: 2555157820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:02,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 23:07:02,993][49728] Signal inference workers to stop experience collection... (38400 times) [2024-04-26 23:07:03,037][49750] InferenceWorker_p0-w0: stopping experience collection (38400 times) [2024-04-26 23:07:03,097][49728] Signal inference workers to resume experience collection... (38400 times) [2024-04-26 23:07:03,097][49750] InferenceWorker_p0-w0: resuming experience collection (38400 times) [2024-04-26 23:07:03,220][49750] Updated weights for policy 0, policy_version 293111 (0.0028) [2024-04-26 23:07:06,364][49750] Updated weights for policy 0, policy_version 293121 (0.0045) [2024-04-26 23:07:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50929.3). Total num frames: 4802510848. Throughput: 0: 50885.9. Samples: 2555306240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:07,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 23:07:09,742][49750] Updated weights for policy 0, policy_version 293131 (0.0031) [2024-04-26 23:07:12,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4802756608. Throughput: 0: 50880.3. Samples: 2555613220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 23:07:12,977][49750] Updated weights for policy 0, policy_version 293141 (0.0030) [2024-04-26 23:07:16,068][49750] Updated weights for policy 0, policy_version 293151 (0.0040) [2024-04-26 23:07:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4803018752. Throughput: 0: 50826.8. Samples: 2555915400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:17,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 23:07:19,502][49750] Updated weights for policy 0, policy_version 293161 (0.0032) [2024-04-26 23:07:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.5, 300 sec: 50929.2). Total num frames: 4803248128. Throughput: 0: 50735.8. Samples: 2556066000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 23:07:22,547][49750] Updated weights for policy 0, policy_version 293171 (0.0025) [2024-04-26 23:07:26,072][49750] Updated weights for policy 0, policy_version 293181 (0.0030) [2024-04-26 23:07:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4803526656. Throughput: 0: 50742.2. Samples: 2556369800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:27,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:07:29,115][49750] Updated weights for policy 0, policy_version 293191 (0.0038) [2024-04-26 23:07:32,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.2, 300 sec: 50929.2). Total num frames: 4803772416. Throughput: 0: 50756.7. Samples: 2556679620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 23:07:32,384][49750] Updated weights for policy 0, policy_version 293201 (0.0028) [2024-04-26 23:07:35,476][49750] Updated weights for policy 0, policy_version 293211 (0.0030) [2024-04-26 23:07:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4804034560. Throughput: 0: 50719.6. Samples: 2556833900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:37,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 23:07:38,758][49750] Updated weights for policy 0, policy_version 293221 (0.0036) [2024-04-26 23:07:41,806][49750] Updated weights for policy 0, policy_version 293231 (0.0030) [2024-04-26 23:07:42,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50984.7). Total num frames: 4804296704. Throughput: 0: 50751.9. Samples: 2557136160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:42,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 23:07:45,398][49750] Updated weights for policy 0, policy_version 293241 (0.0031) [2024-04-26 23:07:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 4804542464. Throughput: 0: 50711.2. Samples: 2557439820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:47,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 23:07:48,426][49750] Updated weights for policy 0, policy_version 293251 (0.0033) [2024-04-26 23:07:51,790][49750] Updated weights for policy 0, policy_version 293261 (0.0034) [2024-04-26 23:07:52,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 4804804608. Throughput: 0: 50757.0. Samples: 2557590300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:52,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 23:07:55,217][49750] Updated weights for policy 0, policy_version 293271 (0.0034) [2024-04-26 23:07:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50873.8). Total num frames: 4805050368. Throughput: 0: 50808.7. Samples: 2557899600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:07:57,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 23:07:58,060][49750] Updated weights for policy 0, policy_version 293281 (0.0036) [2024-04-26 23:08:01,756][49750] Updated weights for policy 0, policy_version 293291 (0.0040) [2024-04-26 23:08:02,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4805296128. Throughput: 0: 50799.9. Samples: 2558201400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:08:02,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 23:08:04,464][49750] Updated weights for policy 0, policy_version 293301 (0.0027) [2024-04-26 23:08:07,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 4805541888. Throughput: 0: 50714.2. Samples: 2558348140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-26 23:08:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 23:08:08,260][49750] Updated weights for policy 0, policy_version 293311 (0.0032) [2024-04-26 23:08:10,904][49750] Updated weights for policy 0, policy_version 293321 (0.0030) [2024-04-26 23:08:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4805804032. Throughput: 0: 50754.8. Samples: 2558653760. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:12,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 23:08:14,490][49750] Updated weights for policy 0, policy_version 293331 (0.0038) [2024-04-26 23:08:15,216][49728] Signal inference workers to stop experience collection... (38450 times) [2024-04-26 23:08:15,258][49750] InferenceWorker_p0-w0: stopping experience collection (38450 times) [2024-04-26 23:08:15,280][49728] Signal inference workers to resume experience collection... (38450 times) [2024-04-26 23:08:15,281][49750] InferenceWorker_p0-w0: resuming experience collection (38450 times) [2024-04-26 23:08:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4806066176. Throughput: 0: 50820.7. Samples: 2558966540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 23:08:17,436][49750] Updated weights for policy 0, policy_version 293341 (0.0031) [2024-04-26 23:08:20,797][49750] Updated weights for policy 0, policy_version 293351 (0.0035) [2024-04-26 23:08:22,063][49517] Fps is (10 sec: 52427.3, 60 sec: 51336.3, 300 sec: 50818.1). Total num frames: 4806328320. Throughput: 0: 50757.6. Samples: 2559118000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:22,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 23:08:23,884][49750] Updated weights for policy 0, policy_version 293361 (0.0034) [2024-04-26 23:08:27,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4806574080. Throughput: 0: 50818.7. Samples: 2559423000. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:27,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:08:27,308][49750] Updated weights for policy 0, policy_version 293371 (0.0032) [2024-04-26 23:08:30,372][49750] Updated weights for policy 0, policy_version 293381 (0.0027) [2024-04-26 23:08:32,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 4806819840. Throughput: 0: 50859.1. Samples: 2559728480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:32,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 23:08:33,818][49750] Updated weights for policy 0, policy_version 293391 (0.0034) [2024-04-26 23:08:36,820][49750] Updated weights for policy 0, policy_version 293401 (0.0032) [2024-04-26 23:08:37,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4807081984. Throughput: 0: 50724.9. Samples: 2559872920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:37,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-26 23:08:40,329][49750] Updated weights for policy 0, policy_version 293411 (0.0027) [2024-04-26 23:08:42,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4807327744. Throughput: 0: 50695.4. Samples: 2560180900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:42,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 23:08:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000293416_4807327744.pth... [2024-04-26 23:08:42,132][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000292673_4795154432.pth [2024-04-26 23:08:43,245][49750] Updated weights for policy 0, policy_version 293421 (0.0028) [2024-04-26 23:08:46,655][49750] Updated weights for policy 0, policy_version 293431 (0.0037) [2024-04-26 23:08:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50762.7). 
Total num frames: 4807573504. Throughput: 0: 50742.3. Samples: 2560484800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:47,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-26 23:08:49,557][49750] Updated weights for policy 0, policy_version 293441 (0.0028) [2024-04-26 23:08:52,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.1, 300 sec: 50873.7). Total num frames: 4807835648. Throughput: 0: 50804.7. Samples: 2560634360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 23:08:53,208][49750] Updated weights for policy 0, policy_version 293451 (0.0036) [2024-04-26 23:08:56,059][49750] Updated weights for policy 0, policy_version 293461 (0.0029) [2024-04-26 23:08:57,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4808081408. Throughput: 0: 50914.0. Samples: 2560944900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:08:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 23:08:59,779][49750] Updated weights for policy 0, policy_version 293471 (0.0032) [2024-04-26 23:09:02,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4808343552. Throughput: 0: 50735.8. Samples: 2561249660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:09:02,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 23:09:02,540][49750] Updated weights for policy 0, policy_version 293481 (0.0035) [2024-04-26 23:09:06,111][49750] Updated weights for policy 0, policy_version 293491 (0.0033) [2024-04-26 23:09:07,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4808605696. Throughput: 0: 50734.0. Samples: 2561401020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:09:07,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 23:09:09,076][49750] Updated weights for policy 0, policy_version 293501 (0.0030) [2024-04-26 23:09:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4808851456. Throughput: 0: 50774.1. Samples: 2561707840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:09:12,063][49517] Avg episode reward: [(0, '0.681')] [2024-04-26 23:09:12,436][49750] Updated weights for policy 0, policy_version 293511 (0.0031) [2024-04-26 23:09:15,570][49750] Updated weights for policy 0, policy_version 293521 (0.0037) [2024-04-26 23:09:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4809113600. Throughput: 0: 50787.5. Samples: 2562013920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:09:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 23:09:18,949][49750] Updated weights for policy 0, policy_version 293531 (0.0033) [2024-04-26 23:09:22,063][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4809359360. Throughput: 0: 50857.2. Samples: 2562161500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-26 23:09:22,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 23:09:22,137][49750] Updated weights for policy 0, policy_version 293541 (0.0031) [2024-04-26 23:09:25,356][49750] Updated weights for policy 0, policy_version 293551 (0.0032) [2024-04-26 23:09:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4809621504. Throughput: 0: 50789.5. Samples: 2562466420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:09:27,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:09:28,740][49750] Updated weights for policy 0, policy_version 293561 (0.0036) [2024-04-26 23:09:31,708][49750] Updated weights for policy 0, policy_version 293571 (0.0036) [2024-04-26 23:09:32,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4809883648. Throughput: 0: 50791.4. Samples: 2562770420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:09:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 23:09:35,137][49750] Updated weights for policy 0, policy_version 293581 (0.0026) [2024-04-26 23:09:35,595][49728] Signal inference workers to stop experience collection... (38500 times) [2024-04-26 23:09:35,595][49728] Signal inference workers to resume experience collection... (38500 times) [2024-04-26 23:09:35,622][49750] InferenceWorker_p0-w0: stopping experience collection (38500 times) [2024-04-26 23:09:35,622][49750] InferenceWorker_p0-w0: resuming experience collection (38500 times) [2024-04-26 23:09:37,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4810113024. Throughput: 0: 50844.6. Samples: 2562922360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:09:37,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 23:09:38,124][49750] Updated weights for policy 0, policy_version 293591 (0.0028) [2024-04-26 23:09:41,548][49750] Updated weights for policy 0, policy_version 293601 (0.0029) [2024-04-26 23:09:42,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4810375168. Throughput: 0: 50683.7. Samples: 2563225660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:09:42,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 23:09:44,663][49750] Updated weights for policy 0, policy_version 293611 (0.0033) [2024-04-26 23:09:47,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4810637312. Throughput: 0: 50721.8. Samples: 2563532140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:09:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 23:09:48,164][49750] Updated weights for policy 0, policy_version 293621 (0.0035) [2024-04-26 23:09:51,147][49750] Updated weights for policy 0, policy_version 293631 (0.0028) [2024-04-26 23:09:52,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4810899456. Throughput: 0: 50855.1. Samples: 2563689500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:09:52,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 23:09:54,639][49750] Updated weights for policy 0, policy_version 293641 (0.0034) [2024-04-26 23:09:57,063][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4811145216. Throughput: 0: 50737.1. Samples: 2563991000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:09:57,072][49517] Avg episode reward: [(0, '0.532')] [2024-04-26 23:09:57,471][49750] Updated weights for policy 0, policy_version 293651 (0.0029) [2024-04-26 23:10:01,140][49750] Updated weights for policy 0, policy_version 293661 (0.0034) [2024-04-26 23:10:02,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4811374592. Throughput: 0: 50589.3. Samples: 2564290440. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:10:02,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 23:10:03,854][49750] Updated weights for policy 0, policy_version 293671 (0.0031) [2024-04-26 23:10:07,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4811636736. Throughput: 0: 50764.9. Samples: 2564445920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:10:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:10:07,582][49750] Updated weights for policy 0, policy_version 293681 (0.0037) [2024-04-26 23:10:10,372][49750] Updated weights for policy 0, policy_version 293691 (0.0030) [2024-04-26 23:10:12,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4811898880. Throughput: 0: 50658.9. Samples: 2564746080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:10:12,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:10:14,143][49750] Updated weights for policy 0, policy_version 293701 (0.0037) [2024-04-26 23:10:16,863][49750] Updated weights for policy 0, policy_version 293711 (0.0033) [2024-04-26 23:10:17,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4812161024. Throughput: 0: 50692.1. Samples: 2565051560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:10:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 23:10:20,635][49750] Updated weights for policy 0, policy_version 293721 (0.0031) [2024-04-26 23:10:22,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4812406784. Throughput: 0: 50819.5. Samples: 2565209240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:10:22,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 23:10:23,296][49750] Updated weights for policy 0, policy_version 293731 (0.0033) [2024-04-26 23:10:27,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4812636160. Throughput: 0: 50750.1. Samples: 2565509420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:10:27,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 23:10:27,178][49750] Updated weights for policy 0, policy_version 293741 (0.0035) [2024-04-26 23:10:29,842][49750] Updated weights for policy 0, policy_version 293751 (0.0032) [2024-04-26 23:10:32,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4812914688. Throughput: 0: 50589.4. Samples: 2565808660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:10:32,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 23:10:33,557][49750] Updated weights for policy 0, policy_version 293761 (0.0031) [2024-04-26 23:10:36,272][49750] Updated weights for policy 0, policy_version 293771 (0.0030) [2024-04-26 23:10:37,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4813176832. Throughput: 0: 50593.1. Samples: 2565966180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-26 23:10:37,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-26 23:10:39,940][49750] Updated weights for policy 0, policy_version 293781 (0.0038) [2024-04-26 23:10:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4813422592. Throughput: 0: 50743.1. Samples: 2566274440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:10:42,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-26 23:10:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000293789_4813438976.pth... 
[2024-04-26 23:10:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000293044_4801232896.pth [2024-04-26 23:10:42,682][49750] Updated weights for policy 0, policy_version 293791 (0.0028) [2024-04-26 23:10:46,463][49750] Updated weights for policy 0, policy_version 293801 (0.0031) [2024-04-26 23:10:47,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4813684736. Throughput: 0: 50877.4. Samples: 2566579920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:10:47,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 23:10:48,956][49750] Updated weights for policy 0, policy_version 293811 (0.0035) [2024-04-26 23:10:52,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4813914112. Throughput: 0: 50682.5. Samples: 2566726640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:10:52,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 23:10:52,827][49750] Updated weights for policy 0, policy_version 293821 (0.0032) [2024-04-26 23:10:55,485][49750] Updated weights for policy 0, policy_version 293831 (0.0028) [2024-04-26 23:10:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4814192640. Throughput: 0: 50686.7. Samples: 2567026980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:10:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 23:10:59,249][49750] Updated weights for policy 0, policy_version 293841 (0.0029) [2024-04-26 23:11:02,001][49750] Updated weights for policy 0, policy_version 293851 (0.0033) [2024-04-26 23:11:02,063][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4814454784. Throughput: 0: 50774.2. Samples: 2567336400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:02,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 23:11:05,559][49750] Updated weights for policy 0, policy_version 293861 (0.0029) [2024-04-26 23:11:06,152][49728] Signal inference workers to stop experience collection... (38550 times) [2024-04-26 23:11:06,152][49728] Signal inference workers to resume experience collection... (38550 times) [2024-04-26 23:11:06,187][49750] InferenceWorker_p0-w0: stopping experience collection (38550 times) [2024-04-26 23:11:06,187][49750] InferenceWorker_p0-w0: resuming experience collection (38550 times) [2024-04-26 23:11:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 4814700544. Throughput: 0: 50772.2. Samples: 2567493980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 23:11:08,420][49750] Updated weights for policy 0, policy_version 293871 (0.0031) [2024-04-26 23:11:12,002][49750] Updated weights for policy 0, policy_version 293881 (0.0034) [2024-04-26 23:11:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4814946304. Throughput: 0: 50936.5. Samples: 2567801560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 23:11:14,734][49750] Updated weights for policy 0, policy_version 293891 (0.0028) [2024-04-26 23:11:17,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4815192064. Throughput: 0: 51026.3. Samples: 2568104840. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:17,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 23:11:18,507][49750] Updated weights for policy 0, policy_version 293901 (0.0032) [2024-04-26 23:11:21,046][49750] Updated weights for policy 0, policy_version 293911 (0.0032) [2024-04-26 23:11:22,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4815486976. Throughput: 0: 50943.5. Samples: 2568258640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:22,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 23:11:24,922][49750] Updated weights for policy 0, policy_version 293921 (0.0035) [2024-04-26 23:11:27,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 4815732736. Throughput: 0: 50929.8. Samples: 2568566280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:27,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 23:11:27,494][49750] Updated weights for policy 0, policy_version 293931 (0.0034) [2024-04-26 23:11:31,251][49750] Updated weights for policy 0, policy_version 293941 (0.0027) [2024-04-26 23:11:32,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4815962112. Throughput: 0: 50975.1. Samples: 2568873800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 23:11:33,929][49750] Updated weights for policy 0, policy_version 293951 (0.0035) [2024-04-26 23:11:37,063][49517] Fps is (10 sec: 45874.9, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4816191488. Throughput: 0: 50961.9. Samples: 2569019920. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:37,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-26 23:11:37,665][49750] Updated weights for policy 0, policy_version 293961 (0.0029) [2024-04-26 23:11:40,409][49750] Updated weights for policy 0, policy_version 293971 (0.0035) [2024-04-26 23:11:42,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4816502784. Throughput: 0: 51078.5. Samples: 2569325520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:42,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 23:11:44,036][49750] Updated weights for policy 0, policy_version 293981 (0.0028) [2024-04-26 23:11:46,792][49750] Updated weights for policy 0, policy_version 293991 (0.0029) [2024-04-26 23:11:47,063][49517] Fps is (10 sec: 55705.5, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4816748544. Throughput: 0: 51006.2. Samples: 2569631680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:11:47,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 23:11:50,389][49750] Updated weights for policy 0, policy_version 294001 (0.0031) [2024-04-26 23:11:52,063][49517] Fps is (10 sec: 49152.1, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4816994304. Throughput: 0: 51277.5. Samples: 2569801480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:11:52,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 23:11:53,250][49750] Updated weights for policy 0, policy_version 294011 (0.0031) [2024-04-26 23:11:56,828][49750] Updated weights for policy 0, policy_version 294021 (0.0034) [2024-04-26 23:11:57,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4817240064. Throughput: 0: 51056.1. Samples: 2570099080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:11:57,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 23:11:59,540][49750] Updated weights for policy 0, policy_version 294031 (0.0036) [2024-04-26 23:12:02,062][49517] Fps is (10 sec: 47514.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4817469440. Throughput: 0: 51027.1. Samples: 2570401060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:02,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:12:03,234][49750] Updated weights for policy 0, policy_version 294041 (0.0039) [2024-04-26 23:12:03,952][49728] Signal inference workers to stop experience collection... (38600 times) [2024-04-26 23:12:03,953][49728] Signal inference workers to resume experience collection... (38600 times) [2024-04-26 23:12:03,988][49750] InferenceWorker_p0-w0: stopping experience collection (38600 times) [2024-04-26 23:12:03,989][49750] InferenceWorker_p0-w0: resuming experience collection (38600 times) [2024-04-26 23:12:06,042][49750] Updated weights for policy 0, policy_version 294051 (0.0033) [2024-04-26 23:12:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4817747968. Throughput: 0: 50830.1. Samples: 2570546000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:07,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 23:12:09,640][49750] Updated weights for policy 0, policy_version 294061 (0.0029) [2024-04-26 23:12:12,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4818010112. Throughput: 0: 50913.7. Samples: 2570857400. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:12,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:12:12,405][49750] Updated weights for policy 0, policy_version 294071 (0.0033) [2024-04-26 23:12:16,098][49750] Updated weights for policy 0, policy_version 294081 (0.0034) [2024-04-26 23:12:17,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 4818272256. Throughput: 0: 50867.4. Samples: 2571162840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 23:12:18,801][49750] Updated weights for policy 0, policy_version 294091 (0.0023) [2024-04-26 23:12:22,062][49517] Fps is (10 sec: 47514.3, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4818485248. Throughput: 0: 50927.8. Samples: 2571311660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:22,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 23:12:22,562][49750] Updated weights for policy 0, policy_version 294101 (0.0027) [2024-04-26 23:12:25,269][49750] Updated weights for policy 0, policy_version 294111 (0.0031) [2024-04-26 23:12:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4818763776. Throughput: 0: 50805.6. Samples: 2571611760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 23:12:29,076][49750] Updated weights for policy 0, policy_version 294121 (0.0022) [2024-04-26 23:12:31,861][49750] Updated weights for policy 0, policy_version 294131 (0.0037) [2024-04-26 23:12:32,063][49517] Fps is (10 sec: 55704.5, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4819042304. Throughput: 0: 50750.7. Samples: 2571915460. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:12:35,553][49750] Updated weights for policy 0, policy_version 294141 (0.0028) [2024-04-26 23:12:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 4819288064. Throughput: 0: 50637.1. Samples: 2572080140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:37,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 23:12:38,395][49750] Updated weights for policy 0, policy_version 294151 (0.0034) [2024-04-26 23:12:42,030][49750] Updated weights for policy 0, policy_version 294161 (0.0038) [2024-04-26 23:12:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4819533824. Throughput: 0: 50781.9. Samples: 2572384280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 23:12:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000294161_4819533824.pth... [2024-04-26 23:12:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000293416_4807327744.pth [2024-04-26 23:12:45,017][49750] Updated weights for policy 0, policy_version 294171 (0.0031) [2024-04-26 23:12:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4819763200. Throughput: 0: 50713.3. Samples: 2572683160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:47,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 23:12:48,113][49728] Signal inference workers to stop experience collection... (38650 times) [2024-04-26 23:12:48,132][49750] InferenceWorker_p0-w0: stopping experience collection (38650 times) [2024-04-26 23:12:48,218][49728] Signal inference workers to resume experience collection... 
(38650 times) [2024-04-26 23:12:48,218][49750] InferenceWorker_p0-w0: resuming experience collection (38650 times) [2024-04-26 23:12:48,475][49750] Updated weights for policy 0, policy_version 294181 (0.0029) [2024-04-26 23:12:51,457][49750] Updated weights for policy 0, policy_version 294191 (0.0030) [2024-04-26 23:12:52,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4820025344. Throughput: 0: 50603.0. Samples: 2572823140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:52,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 23:12:54,887][49750] Updated weights for policy 0, policy_version 294201 (0.0029) [2024-04-26 23:12:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4820287488. Throughput: 0: 50590.7. Samples: 2573133980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:12:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 23:12:58,044][49750] Updated weights for policy 0, policy_version 294211 (0.0034) [2024-04-26 23:13:01,350][49750] Updated weights for policy 0, policy_version 294221 (0.0030) [2024-04-26 23:13:02,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4820566016. Throughput: 0: 50550.7. Samples: 2573437620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-04-26 23:13:02,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 23:13:04,700][49750] Updated weights for policy 0, policy_version 294231 (0.0036) [2024-04-26 23:13:07,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4820762624. Throughput: 0: 50702.6. Samples: 2573593280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:07,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 23:13:07,876][49750] Updated weights for policy 0, policy_version 294241 (0.0032) [2024-04-26 23:13:11,283][49750] Updated weights for policy 0, policy_version 294251 (0.0032) [2024-04-26 23:13:12,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4821041152. Throughput: 0: 50746.7. Samples: 2573895360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 23:13:14,302][49750] Updated weights for policy 0, policy_version 294261 (0.0030) [2024-04-26 23:13:17,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4821303296. Throughput: 0: 50772.6. Samples: 2574200220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:17,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 23:13:17,600][49750] Updated weights for policy 0, policy_version 294271 (0.0030) [2024-04-26 23:13:20,697][49750] Updated weights for policy 0, policy_version 294281 (0.0029) [2024-04-26 23:13:22,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 4821581824. Throughput: 0: 50716.8. Samples: 2574362400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:22,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 23:13:23,985][49750] Updated weights for policy 0, policy_version 294291 (0.0031) [2024-04-26 23:13:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4821811200. Throughput: 0: 50711.8. Samples: 2574666300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:27,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 23:13:27,171][49750] Updated weights for policy 0, policy_version 294301 (0.0039) [2024-04-26 23:13:30,461][49750] Updated weights for policy 0, policy_version 294311 (0.0030) [2024-04-26 23:13:32,063][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4822040576. Throughput: 0: 50708.3. Samples: 2574965040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:32,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:13:33,537][49750] Updated weights for policy 0, policy_version 294321 (0.0028) [2024-04-26 23:13:36,955][49750] Updated weights for policy 0, policy_version 294331 (0.0030) [2024-04-26 23:13:37,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4822319104. Throughput: 0: 50963.6. Samples: 2575116500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:37,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 23:13:40,029][49750] Updated weights for policy 0, policy_version 294341 (0.0029) [2024-04-26 23:13:40,164][49728] Signal inference workers to stop experience collection... (38700 times) [2024-04-26 23:13:40,165][49728] Signal inference workers to resume experience collection... (38700 times) [2024-04-26 23:13:40,183][49750] InferenceWorker_p0-w0: stopping experience collection (38700 times) [2024-04-26 23:13:40,183][49750] InferenceWorker_p0-w0: resuming experience collection (38700 times) [2024-04-26 23:13:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 4822564864. Throughput: 0: 50660.0. Samples: 2575413680. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:42,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 23:13:43,445][49750] Updated weights for policy 0, policy_version 294351 (0.0029) [2024-04-26 23:13:46,408][49750] Updated weights for policy 0, policy_version 294361 (0.0030) [2024-04-26 23:13:47,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 4822859776. Throughput: 0: 50827.6. Samples: 2575724860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:47,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 23:13:49,892][49750] Updated weights for policy 0, policy_version 294371 (0.0032) [2024-04-26 23:13:52,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4823072768. Throughput: 0: 50921.5. Samples: 2575884760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:52,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 23:13:52,855][49750] Updated weights for policy 0, policy_version 294381 (0.0036) [2024-04-26 23:13:56,306][49750] Updated weights for policy 0, policy_version 294391 (0.0029) [2024-04-26 23:13:57,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4823334912. Throughput: 0: 50891.9. Samples: 2576185500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:13:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:13:59,229][49750] Updated weights for policy 0, policy_version 294401 (0.0032) [2024-04-26 23:14:02,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4823580672. Throughput: 0: 50881.2. Samples: 2576489880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:14:02,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 23:14:02,824][49750] Updated weights for policy 0, policy_version 294411 (0.0028) [2024-04-26 23:14:05,745][49750] Updated weights for policy 0, policy_version 294421 (0.0032) [2024-04-26 23:14:07,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4823842816. Throughput: 0: 50826.5. Samples: 2576649600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:14:07,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 23:14:09,449][49750] Updated weights for policy 0, policy_version 294431 (0.0035) [2024-04-26 23:14:12,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4824104960. Throughput: 0: 50622.1. Samples: 2576944300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:14:12,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 23:14:12,225][49750] Updated weights for policy 0, policy_version 294441 (0.0041) [2024-04-26 23:14:15,954][49750] Updated weights for policy 0, policy_version 294451 (0.0035) [2024-04-26 23:14:17,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4824334336. Throughput: 0: 50751.2. Samples: 2577248840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:14:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 23:14:18,685][49750] Updated weights for policy 0, policy_version 294461 (0.0029) [2024-04-26 23:14:22,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4824580096. Throughput: 0: 50515.6. Samples: 2577389700. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:14:22,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-26 23:14:22,222][49750] Updated weights for policy 0, policy_version 294471 (0.0034) [2024-04-26 23:14:25,050][49750] Updated weights for policy 0, policy_version 294481 (0.0033) [2024-04-26 23:14:27,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4824858624. Throughput: 0: 50805.9. Samples: 2577699940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:14:27,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 23:14:28,661][49750] Updated weights for policy 0, policy_version 294491 (0.0029) [2024-04-26 23:14:31,493][49750] Updated weights for policy 0, policy_version 294501 (0.0031) [2024-04-26 23:14:32,063][49517] Fps is (10 sec: 55705.3, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 4825137152. Throughput: 0: 50716.4. Samples: 2578007100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:14:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 23:14:35,098][49750] Updated weights for policy 0, policy_version 294511 (0.0031) [2024-04-26 23:14:37,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4825350144. Throughput: 0: 50753.6. Samples: 2578168660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:14:37,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 23:14:37,944][49750] Updated weights for policy 0, policy_version 294521 (0.0031) [2024-04-26 23:14:40,837][49728] Signal inference workers to stop experience collection... (38750 times) [2024-04-26 23:14:40,879][49750] InferenceWorker_p0-w0: stopping experience collection (38750 times) [2024-04-26 23:14:40,942][49728] Signal inference workers to resume experience collection... 
(38750 times) [2024-04-26 23:14:40,943][49750] InferenceWorker_p0-w0: resuming experience collection (38750 times) [2024-04-26 23:14:41,460][49750] Updated weights for policy 0, policy_version 294531 (0.0028) [2024-04-26 23:14:42,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4825612288. Throughput: 0: 50805.7. Samples: 2578471760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:14:42,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 23:14:42,137][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000294533_4825628672.pth... [2024-04-26 23:14:42,184][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000293789_4813438976.pth [2024-04-26 23:14:44,344][49750] Updated weights for policy 0, policy_version 294541 (0.0030) [2024-04-26 23:14:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4825858048. Throughput: 0: 50793.4. Samples: 2578775580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:14:47,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 23:14:47,836][49750] Updated weights for policy 0, policy_version 294551 (0.0032) [2024-04-26 23:14:50,881][49750] Updated weights for policy 0, policy_version 294561 (0.0027) [2024-04-26 23:14:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 4826136576. Throughput: 0: 50718.0. Samples: 2578931900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:14:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:14:54,400][49750] Updated weights for policy 0, policy_version 294571 (0.0030) [2024-04-26 23:14:57,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4826382336. Throughput: 0: 50848.4. Samples: 2579232480. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:14:57,063][49517] Avg episode reward: [(0, '0.488')] [2024-04-26 23:14:57,216][49750] Updated weights for policy 0, policy_version 294581 (0.0038) [2024-04-26 23:15:00,800][49750] Updated weights for policy 0, policy_version 294591 (0.0029) [2024-04-26 23:15:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4826628096. Throughput: 0: 50880.0. Samples: 2579538440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:15:02,063][49517] Avg episode reward: [(0, '0.457')] [2024-04-26 23:15:03,628][49750] Updated weights for policy 0, policy_version 294601 (0.0030) [2024-04-26 23:15:07,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4826873856. Throughput: 0: 50944.5. Samples: 2579682200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:15:07,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 23:15:07,266][49750] Updated weights for policy 0, policy_version 294611 (0.0038) [2024-04-26 23:15:10,020][49750] Updated weights for policy 0, policy_version 294621 (0.0031) [2024-04-26 23:15:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4827152384. Throughput: 0: 50854.2. Samples: 2579988380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:15:12,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 23:15:13,679][49750] Updated weights for policy 0, policy_version 294631 (0.0028) [2024-04-26 23:15:16,434][49750] Updated weights for policy 0, policy_version 294641 (0.0036) [2024-04-26 23:15:17,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4827414528. Throughput: 0: 50584.5. Samples: 2580283400. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:15:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:15:20,371][49750] Updated weights for policy 0, policy_version 294651 (0.0031) [2024-04-26 23:15:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4827643904. Throughput: 0: 50770.3. Samples: 2580453320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:15:22,063][49517] Avg episode reward: [(0, '0.481')] [2024-04-26 23:15:22,955][49750] Updated weights for policy 0, policy_version 294661 (0.0033) [2024-04-26 23:15:26,746][49750] Updated weights for policy 0, policy_version 294671 (0.0042) [2024-04-26 23:15:27,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4827889664. Throughput: 0: 50780.6. Samples: 2580756880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 25.0) [2024-04-26 23:15:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 23:15:29,498][49750] Updated weights for policy 0, policy_version 294681 (0.0034) [2024-04-26 23:15:32,063][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4828135424. Throughput: 0: 50757.3. Samples: 2581059660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:15:32,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 23:15:33,272][49750] Updated weights for policy 0, policy_version 294691 (0.0035) [2024-04-26 23:15:35,409][49728] Signal inference workers to stop experience collection... (38800 times) [2024-04-26 23:15:35,409][49728] Signal inference workers to resume experience collection... 
(38800 times) [2024-04-26 23:15:35,462][49750] InferenceWorker_p0-w0: stopping experience collection (38800 times) [2024-04-26 23:15:35,462][49750] InferenceWorker_p0-w0: resuming experience collection (38800 times) [2024-04-26 23:15:35,839][49750] Updated weights for policy 0, policy_version 294701 (0.0026) [2024-04-26 23:15:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4828413952. Throughput: 0: 50665.7. Samples: 2581211860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:15:37,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 23:15:39,667][49750] Updated weights for policy 0, policy_version 294711 (0.0033) [2024-04-26 23:15:42,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4828676096. Throughput: 0: 50804.4. Samples: 2581518680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:15:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:15:42,550][49750] Updated weights for policy 0, policy_version 294721 (0.0031) [2024-04-26 23:15:46,176][49750] Updated weights for policy 0, policy_version 294731 (0.0036) [2024-04-26 23:15:47,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4828905472. Throughput: 0: 50545.2. Samples: 2581812980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:15:47,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 23:15:49,112][49750] Updated weights for policy 0, policy_version 294741 (0.0030) [2024-04-26 23:15:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4829167616. Throughput: 0: 50648.0. Samples: 2581961360. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:15:52,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 23:15:52,634][49750] Updated weights for policy 0, policy_version 294751 (0.0028) [2024-04-26 23:15:55,445][49750] Updated weights for policy 0, policy_version 294761 (0.0033) [2024-04-26 23:15:57,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4829413376. Throughput: 0: 50638.6. Samples: 2582267120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:15:57,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 23:15:59,005][49750] Updated weights for policy 0, policy_version 294771 (0.0032) [2024-04-26 23:16:01,954][49750] Updated weights for policy 0, policy_version 294781 (0.0034) [2024-04-26 23:16:02,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4829691904. Throughput: 0: 50922.2. Samples: 2582574900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:16:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 23:16:05,558][49750] Updated weights for policy 0, policy_version 294791 (0.0041) [2024-04-26 23:16:07,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4829937664. Throughput: 0: 50664.5. Samples: 2582733220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:16:07,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:16:08,350][49750] Updated weights for policy 0, policy_version 294801 (0.0030) [2024-04-26 23:16:11,953][49750] Updated weights for policy 0, policy_version 294811 (0.0032) [2024-04-26 23:16:12,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4830183424. Throughput: 0: 50722.5. Samples: 2583039400. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:16:12,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 23:16:14,883][49750] Updated weights for policy 0, policy_version 294821 (0.0034) [2024-04-26 23:16:17,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4830429184. Throughput: 0: 50647.1. Samples: 2583338780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:16:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 23:16:18,476][49750] Updated weights for policy 0, policy_version 294831 (0.0029) [2024-04-26 23:16:21,351][49750] Updated weights for policy 0, policy_version 294841 (0.0036) [2024-04-26 23:16:22,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4830691328. Throughput: 0: 50560.3. Samples: 2583487080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:16:22,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 23:16:24,887][49750] Updated weights for policy 0, policy_version 294851 (0.0033) [2024-04-26 23:16:27,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4830969856. Throughput: 0: 50619.5. Samples: 2583796560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:16:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 23:16:27,758][49750] Updated weights for policy 0, policy_version 294861 (0.0031) [2024-04-26 23:16:31,281][49750] Updated weights for policy 0, policy_version 294871 (0.0032) [2024-04-26 23:16:32,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4831182848. Throughput: 0: 50665.0. Samples: 2584092900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:16:32,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 23:16:34,434][49750] Updated weights for policy 0, policy_version 294881 (0.0033) [2024-04-26 23:16:37,063][49517] Fps is (10 sec: 45875.0, 60 sec: 50244.1, 300 sec: 50596.0). Total num frames: 4831428608. Throughput: 0: 50683.8. Samples: 2584242140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:16:37,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 23:16:37,796][49750] Updated weights for policy 0, policy_version 294891 (0.0043) [2024-04-26 23:16:40,767][49750] Updated weights for policy 0, policy_version 294901 (0.0035) [2024-04-26 23:16:42,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4831707136. Throughput: 0: 50727.8. Samples: 2584549880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-26 23:16:42,064][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 23:16:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000294904_4831707136.pth... [2024-04-26 23:16:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000294161_4819533824.pth [2024-04-26 23:16:44,289][49750] Updated weights for policy 0, policy_version 294911 (0.0033) [2024-04-26 23:16:47,063][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4831969280. Throughput: 0: 50616.4. Samples: 2584852640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:16:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 23:16:47,436][49750] Updated weights for policy 0, policy_version 294921 (0.0028) [2024-04-26 23:16:50,655][49750] Updated weights for policy 0, policy_version 294931 (0.0033) [2024-04-26 23:16:51,986][49728] Signal inference workers to stop experience collection... (38850 times) [2024-04-26 23:16:51,991][49728] Signal inference workers to resume experience collection... 
(38850 times) [2024-04-26 23:16:52,006][49750] InferenceWorker_p0-w0: stopping experience collection (38850 times) [2024-04-26 23:16:52,006][49750] InferenceWorker_p0-w0: resuming experience collection (38850 times) [2024-04-26 23:16:52,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4832215040. Throughput: 0: 50591.5. Samples: 2585009840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:16:52,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 23:16:54,004][49750] Updated weights for policy 0, policy_version 294941 (0.0031) [2024-04-26 23:16:57,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4832460800. Throughput: 0: 50494.5. Samples: 2585311640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:16:57,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 23:16:57,141][49750] Updated weights for policy 0, policy_version 294951 (0.0032) [2024-04-26 23:17:00,467][49750] Updated weights for policy 0, policy_version 294961 (0.0026) [2024-04-26 23:17:02,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4832706560. Throughput: 0: 50566.3. Samples: 2585614260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 23:17:03,734][49750] Updated weights for policy 0, policy_version 294971 (0.0029) [2024-04-26 23:17:06,943][49750] Updated weights for policy 0, policy_version 294981 (0.0033) [2024-04-26 23:17:07,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4832968704. Throughput: 0: 50616.5. Samples: 2585764820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:07,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 23:17:10,297][49750] Updated weights for policy 0, policy_version 294991 (0.0029) [2024-04-26 23:17:12,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4833230848. Throughput: 0: 50611.1. Samples: 2586074060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:12,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:17:13,358][49750] Updated weights for policy 0, policy_version 295001 (0.0032) [2024-04-26 23:17:16,762][49750] Updated weights for policy 0, policy_version 295011 (0.0034) [2024-04-26 23:17:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4833460224. Throughput: 0: 50881.4. Samples: 2586382560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:17,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 23:17:19,647][49750] Updated weights for policy 0, policy_version 295021 (0.0032) [2024-04-26 23:17:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4833722368. Throughput: 0: 50762.8. Samples: 2586526460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:22,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:17:23,192][49750] Updated weights for policy 0, policy_version 295031 (0.0026) [2024-04-26 23:17:26,184][49750] Updated weights for policy 0, policy_version 295041 (0.0037) [2024-04-26 23:17:27,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4833984512. Throughput: 0: 50705.0. Samples: 2586831600. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 23:17:29,673][49750] Updated weights for policy 0, policy_version 295051 (0.0028) [2024-04-26 23:17:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4834246656. Throughput: 0: 50770.8. Samples: 2587137320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:32,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 23:17:32,646][49750] Updated weights for policy 0, policy_version 295061 (0.0037) [2024-04-26 23:17:36,123][49750] Updated weights for policy 0, policy_version 295071 (0.0037) [2024-04-26 23:17:37,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4834508800. Throughput: 0: 50710.5. Samples: 2587291820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:37,064][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 23:17:39,302][49750] Updated weights for policy 0, policy_version 295081 (0.0032) [2024-04-26 23:17:42,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4834738176. Throughput: 0: 50888.3. Samples: 2587601620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:42,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 23:17:42,454][49750] Updated weights for policy 0, policy_version 295091 (0.0038) [2024-04-26 23:17:45,738][49750] Updated weights for policy 0, policy_version 295101 (0.0031) [2024-04-26 23:17:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4835000320. Throughput: 0: 50907.2. Samples: 2587905080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 23:17:48,790][49750] Updated weights for policy 0, policy_version 295111 (0.0030) [2024-04-26 23:17:52,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.1, 300 sec: 50707.1). Total num frames: 4835246080. Throughput: 0: 50863.9. Samples: 2588053700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 23:17:52,149][49750] Updated weights for policy 0, policy_version 295121 (0.0046) [2024-04-26 23:17:55,256][49750] Updated weights for policy 0, policy_version 295131 (0.0032) [2024-04-26 23:17:55,611][49728] Signal inference workers to stop experience collection... (38900 times) [2024-04-26 23:17:55,611][49728] Signal inference workers to resume experience collection... (38900 times) [2024-04-26 23:17:55,638][49750] InferenceWorker_p0-w0: stopping experience collection (38900 times) [2024-04-26 23:17:55,638][49750] InferenceWorker_p0-w0: resuming experience collection (38900 times) [2024-04-26 23:17:57,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 4835508224. Throughput: 0: 50819.2. Samples: 2588360920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-26 23:17:57,063][49517] Avg episode reward: [(0, '0.687')] [2024-04-26 23:17:58,785][49750] Updated weights for policy 0, policy_version 295141 (0.0031) [2024-04-26 23:18:01,640][49750] Updated weights for policy 0, policy_version 295151 (0.0030) [2024-04-26 23:18:02,062][49517] Fps is (10 sec: 50791.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4835753984. Throughput: 0: 50737.0. Samples: 2588665720. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:02,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-26 23:18:05,218][49750] Updated weights for policy 0, policy_version 295161 (0.0033) [2024-04-26 23:18:07,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4836016128. Throughput: 0: 50995.0. Samples: 2588821240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 23:18:08,136][49750] Updated weights for policy 0, policy_version 295171 (0.0040) [2024-04-26 23:18:11,563][49750] Updated weights for policy 0, policy_version 295181 (0.0036) [2024-04-26 23:18:12,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4836261888. Throughput: 0: 50812.9. Samples: 2589118180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:12,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 23:18:14,649][49750] Updated weights for policy 0, policy_version 295191 (0.0029) [2024-04-26 23:18:17,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 4836524032. Throughput: 0: 50804.3. Samples: 2589423520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:17,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 23:18:18,031][49750] Updated weights for policy 0, policy_version 295201 (0.0028) [2024-04-26 23:18:21,158][49750] Updated weights for policy 0, policy_version 295211 (0.0030) [2024-04-26 23:18:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4836786176. Throughput: 0: 50751.1. Samples: 2589575620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:22,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 23:18:24,446][49750] Updated weights for policy 0, policy_version 295221 (0.0029) [2024-04-26 23:18:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4837031936. Throughput: 0: 50625.3. Samples: 2589879760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 23:18:27,626][49750] Updated weights for policy 0, policy_version 295231 (0.0032) [2024-04-26 23:18:30,787][49750] Updated weights for policy 0, policy_version 295241 (0.0038) [2024-04-26 23:18:32,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4837294080. Throughput: 0: 50764.3. Samples: 2590189480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:32,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 23:18:34,004][49750] Updated weights for policy 0, policy_version 295251 (0.0032) [2024-04-26 23:18:37,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4837539840. Throughput: 0: 50868.3. Samples: 2590342760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:37,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-26 23:18:37,154][49750] Updated weights for policy 0, policy_version 295261 (0.0033) [2024-04-26 23:18:40,402][49750] Updated weights for policy 0, policy_version 295271 (0.0028) [2024-04-26 23:18:42,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 4837801984. Throughput: 0: 50762.6. Samples: 2590645240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:42,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 23:18:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000295276_4837801984.pth... 
[2024-04-26 23:18:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000294533_4825628672.pth [2024-04-26 23:18:43,711][49750] Updated weights for policy 0, policy_version 295281 (0.0034) [2024-04-26 23:18:46,742][49750] Updated weights for policy 0, policy_version 295291 (0.0029) [2024-04-26 23:18:47,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50762.7). Total num frames: 4838047744. Throughput: 0: 50816.8. Samples: 2590952480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:47,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 23:18:50,153][49750] Updated weights for policy 0, policy_version 295301 (0.0032) [2024-04-26 23:18:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4838309888. Throughput: 0: 50857.2. Samples: 2591109820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 23:18:53,165][49750] Updated weights for policy 0, policy_version 295311 (0.0038) [2024-04-26 23:18:56,718][49750] Updated weights for policy 0, policy_version 295321 (0.0035) [2024-04-26 23:18:57,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4838555648. Throughput: 0: 50928.9. Samples: 2591409980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:18:57,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 23:18:59,756][49750] Updated weights for policy 0, policy_version 295331 (0.0029) [2024-04-26 23:19:01,159][49728] Signal inference workers to stop experience collection... (38950 times) [2024-04-26 23:19:01,221][49750] InferenceWorker_p0-w0: stopping experience collection (38950 times) [2024-04-26 23:19:01,223][49728] Signal inference workers to resume experience collection... 
(38950 times) [2024-04-26 23:19:01,235][49750] InferenceWorker_p0-w0: resuming experience collection (38950 times) [2024-04-26 23:19:02,062][49517] Fps is (10 sec: 49153.3, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4838801408. Throughput: 0: 50815.7. Samples: 2591710220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:19:02,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 23:19:03,025][49750] Updated weights for policy 0, policy_version 295341 (0.0030) [2024-04-26 23:19:06,065][49750] Updated weights for policy 0, policy_version 295351 (0.0024) [2024-04-26 23:19:07,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4839079936. Throughput: 0: 50799.3. Samples: 2591861580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-26 23:19:07,063][49517] Avg episode reward: [(0, '0.677')] [2024-04-26 23:19:09,333][49750] Updated weights for policy 0, policy_version 295361 (0.0031) [2024-04-26 23:19:12,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4839325696. Throughput: 0: 50940.8. Samples: 2592172100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:12,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 23:19:12,748][49750] Updated weights for policy 0, policy_version 295371 (0.0032) [2024-04-26 23:19:15,948][49750] Updated weights for policy 0, policy_version 295381 (0.0033) [2024-04-26 23:19:17,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4839571456. Throughput: 0: 50885.3. Samples: 2592479320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:17,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 23:19:19,178][49750] Updated weights for policy 0, policy_version 295391 (0.0031) [2024-04-26 23:19:22,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4839833600. Throughput: 0: 50715.8. Samples: 2592624980. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:22,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 23:19:22,469][49750] Updated weights for policy 0, policy_version 295401 (0.0030) [2024-04-26 23:19:25,629][49750] Updated weights for policy 0, policy_version 295411 (0.0031) [2024-04-26 23:19:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 4840079360. Throughput: 0: 50845.5. Samples: 2592933280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:27,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 23:19:28,806][49750] Updated weights for policy 0, policy_version 295421 (0.0031) [2024-04-26 23:19:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4840325120. Throughput: 0: 50837.4. Samples: 2593240160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:32,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 23:19:32,210][49750] Updated weights for policy 0, policy_version 295431 (0.0029) [2024-04-26 23:19:35,188][49750] Updated weights for policy 0, policy_version 295441 (0.0028) [2024-04-26 23:19:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4840587264. Throughput: 0: 50587.8. Samples: 2593386260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:37,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 23:19:38,783][49750] Updated weights for policy 0, policy_version 295451 (0.0031) [2024-04-26 23:19:41,634][49750] Updated weights for policy 0, policy_version 295461 (0.0032) [2024-04-26 23:19:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4840833024. Throughput: 0: 50713.4. Samples: 2593692080. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:42,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 23:19:45,151][49750] Updated weights for policy 0, policy_version 295471 (0.0027) [2024-04-26 23:19:47,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4841095168. Throughput: 0: 50820.0. Samples: 2593997120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:47,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 23:19:48,268][49750] Updated weights for policy 0, policy_version 295481 (0.0028) [2024-04-26 23:19:51,490][49750] Updated weights for policy 0, policy_version 295491 (0.0030) [2024-04-26 23:19:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.6, 300 sec: 50762.7). Total num frames: 4841357312. Throughput: 0: 50796.9. Samples: 2594147440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 23:19:54,677][49750] Updated weights for policy 0, policy_version 295501 (0.0029) [2024-04-26 23:19:57,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4841586688. Throughput: 0: 50682.7. Samples: 2594452820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:19:57,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-26 23:19:58,032][49750] Updated weights for policy 0, policy_version 295511 (0.0034) [2024-04-26 23:20:01,179][49750] Updated weights for policy 0, policy_version 295521 (0.0038) [2024-04-26 23:20:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 4841865216. Throughput: 0: 50541.3. Samples: 2594753680. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:20:02,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 23:20:04,473][49750] Updated weights for policy 0, policy_version 295531 (0.0031) [2024-04-26 23:20:07,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4842110976. Throughput: 0: 50762.6. Samples: 2594909300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:20:07,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:20:07,738][49750] Updated weights for policy 0, policy_version 295541 (0.0029) [2024-04-26 23:20:08,395][49728] Signal inference workers to stop experience collection... (39000 times) [2024-04-26 23:20:08,395][49728] Signal inference workers to resume experience collection... (39000 times) [2024-04-26 23:20:08,430][49750] InferenceWorker_p0-w0: stopping experience collection (39000 times) [2024-04-26 23:20:08,430][49750] InferenceWorker_p0-w0: resuming experience collection (39000 times) [2024-04-26 23:20:10,805][49750] Updated weights for policy 0, policy_version 295551 (0.0031) [2024-04-26 23:20:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 4842356736. Throughput: 0: 50677.7. Samples: 2595213780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:20:12,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 23:20:14,194][49750] Updated weights for policy 0, policy_version 295561 (0.0028) [2024-04-26 23:20:17,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4842618880. Throughput: 0: 50590.1. Samples: 2595516720. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:20:17,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 23:20:17,200][49750] Updated weights for policy 0, policy_version 295571 (0.0028) [2024-04-26 23:20:20,489][49750] Updated weights for policy 0, policy_version 295581 (0.0029) [2024-04-26 23:20:22,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4842848256. Throughput: 0: 50658.1. Samples: 2595665880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:20:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 23:20:23,862][49750] Updated weights for policy 0, policy_version 295591 (0.0031) [2024-04-26 23:20:26,775][49750] Updated weights for policy 0, policy_version 295601 (0.0034) [2024-04-26 23:20:27,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4843126784. Throughput: 0: 50545.8. Samples: 2595966640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:20:27,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 23:20:30,339][49750] Updated weights for policy 0, policy_version 295611 (0.0031) [2024-04-26 23:20:32,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4843388928. Throughput: 0: 50584.9. Samples: 2596273440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:20:32,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 23:20:33,232][49750] Updated weights for policy 0, policy_version 295621 (0.0031) [2024-04-26 23:20:36,681][49750] Updated weights for policy 0, policy_version 295631 (0.0037) [2024-04-26 23:20:37,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 4843618304. Throughput: 0: 50741.6. Samples: 2596430820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:20:37,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:20:39,714][49750] Updated weights for policy 0, policy_version 295641 (0.0032) [2024-04-26 23:20:42,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4843880448. Throughput: 0: 50684.5. Samples: 2596733620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:20:42,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 23:20:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000295647_4843880448.pth... [2024-04-26 23:20:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000294904_4831707136.pth [2024-04-26 23:20:43,122][49750] Updated weights for policy 0, policy_version 295651 (0.0037) [2024-04-26 23:20:46,094][49750] Updated weights for policy 0, policy_version 295661 (0.0033) [2024-04-26 23:20:47,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4844142592. Throughput: 0: 50837.1. Samples: 2597041340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:20:47,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 23:20:49,605][49750] Updated weights for policy 0, policy_version 295671 (0.0030) [2024-04-26 23:20:52,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4844404736. Throughput: 0: 50820.4. Samples: 2597196220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:20:52,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 23:20:52,775][49750] Updated weights for policy 0, policy_version 295681 (0.0035) [2024-04-26 23:20:56,015][49750] Updated weights for policy 0, policy_version 295691 (0.0034) [2024-04-26 23:20:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.7, 300 sec: 50762.7). Total num frames: 4844666880. Throughput: 0: 50955.3. Samples: 2597506760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:20:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 23:20:59,075][49750] Updated weights for policy 0, policy_version 295701 (0.0038) [2024-04-26 23:21:02,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4844896256. Throughput: 0: 50937.5. Samples: 2597808900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:21:02,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 23:21:02,410][49750] Updated weights for policy 0, policy_version 295711 (0.0029) [2024-04-26 23:21:05,648][49750] Updated weights for policy 0, policy_version 295721 (0.0033) [2024-04-26 23:21:07,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4845142016. Throughput: 0: 50897.5. Samples: 2597956260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:21:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 23:21:08,800][49750] Updated weights for policy 0, policy_version 295731 (0.0031) [2024-04-26 23:21:09,302][49728] Signal inference workers to stop experience collection... (39050 times) [2024-04-26 23:21:09,303][49728] Signal inference workers to resume experience collection... (39050 times) [2024-04-26 23:21:09,315][49750] InferenceWorker_p0-w0: stopping experience collection (39050 times) [2024-04-26 23:21:09,315][49750] InferenceWorker_p0-w0: resuming experience collection (39050 times) [2024-04-26 23:21:12,050][49750] Updated weights for policy 0, policy_version 295741 (0.0033) [2024-04-26 23:21:12,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4845420544. Throughput: 0: 50965.6. Samples: 2598260100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:21:12,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:21:15,371][49750] Updated weights for policy 0, policy_version 295751 (0.0029) [2024-04-26 23:21:17,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4845682688. Throughput: 0: 50730.5. Samples: 2598556320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:21:17,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-26 23:21:18,472][49750] Updated weights for policy 0, policy_version 295761 (0.0029) [2024-04-26 23:21:21,918][49750] Updated weights for policy 0, policy_version 295771 (0.0029) [2024-04-26 23:21:22,062][49517] Fps is (10 sec: 49152.9, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 4845912064. Throughput: 0: 50996.7. Samples: 2598725660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:21:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:21:24,831][49750] Updated weights for policy 0, policy_version 295781 (0.0028) [2024-04-26 23:21:27,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4846157824. Throughput: 0: 50934.3. Samples: 2599025660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:21:27,063][49517] Avg episode reward: [(0, '0.486')] [2024-04-26 23:21:28,290][49750] Updated weights for policy 0, policy_version 295791 (0.0033) [2024-04-26 23:21:31,158][49750] Updated weights for policy 0, policy_version 295801 (0.0029) [2024-04-26 23:21:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4846419968. Throughput: 0: 50796.8. Samples: 2599327200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:21:32,063][49517] Avg episode reward: [(0, '0.722')] [2024-04-26 23:21:32,070][49728] Saving new best policy, reward=0.722! 
[2024-04-26 23:21:34,769][49750] Updated weights for policy 0, policy_version 295811 (0.0028) [2024-04-26 23:21:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 4846682112. Throughput: 0: 50931.8. Samples: 2599488140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-26 23:21:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 23:21:37,600][49750] Updated weights for policy 0, policy_version 295821 (0.0034) [2024-04-26 23:21:41,188][49750] Updated weights for policy 0, policy_version 295831 (0.0027) [2024-04-26 23:21:42,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4846960640. Throughput: 0: 50840.2. Samples: 2599794580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:21:42,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 23:21:44,059][49750] Updated weights for policy 0, policy_version 295841 (0.0031) [2024-04-26 23:21:47,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4847190016. Throughput: 0: 50823.8. Samples: 2600095980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:21:47,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 23:21:47,612][49750] Updated weights for policy 0, policy_version 295851 (0.0033) [2024-04-26 23:21:50,577][49750] Updated weights for policy 0, policy_version 295861 (0.0028) [2024-04-26 23:21:52,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4847435776. Throughput: 0: 50761.1. Samples: 2600240520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:21:52,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 23:21:53,937][49750] Updated weights for policy 0, policy_version 295871 (0.0027) [2024-04-26 23:21:57,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 4847697920. Throughput: 0: 50640.5. Samples: 2600538920. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:21:57,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 23:21:57,314][49750] Updated weights for policy 0, policy_version 295881 (0.0028) [2024-04-26 23:22:00,411][49750] Updated weights for policy 0, policy_version 295891 (0.0034) [2024-04-26 23:22:02,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4847960064. Throughput: 0: 50909.0. Samples: 2600847220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:02,072][49517] Avg episode reward: [(0, '0.494')] [2024-04-26 23:22:04,041][49750] Updated weights for policy 0, policy_version 295901 (0.0032) [2024-04-26 23:22:06,775][49750] Updated weights for policy 0, policy_version 295911 (0.0028) [2024-04-26 23:22:07,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4848222208. Throughput: 0: 50597.8. Samples: 2601002560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:07,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 23:22:10,436][49750] Updated weights for policy 0, policy_version 295921 (0.0037) [2024-04-26 23:22:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4848451584. Throughput: 0: 50728.5. Samples: 2601308440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:12,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 23:22:12,975][49728] Signal inference workers to stop experience collection... (39100 times) [2024-04-26 23:22:12,975][49728] Signal inference workers to resume experience collection... 
(39100 times) [2024-04-26 23:22:13,000][49750] InferenceWorker_p0-w0: stopping experience collection (39100 times) [2024-04-26 23:22:13,000][49750] InferenceWorker_p0-w0: resuming experience collection (39100 times) [2024-04-26 23:22:13,102][49750] Updated weights for policy 0, policy_version 295931 (0.0033) [2024-04-26 23:22:16,891][49750] Updated weights for policy 0, policy_version 295941 (0.0032) [2024-04-26 23:22:17,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4848697344. Throughput: 0: 50742.3. Samples: 2601610600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:17,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 23:22:19,605][49750] Updated weights for policy 0, policy_version 295951 (0.0031) [2024-04-26 23:22:22,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4848959488. Throughput: 0: 50450.9. Samples: 2601758440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:22,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-26 23:22:23,553][49750] Updated weights for policy 0, policy_version 295961 (0.0034) [2024-04-26 23:22:26,142][49750] Updated weights for policy 0, policy_version 295971 (0.0032) [2024-04-26 23:22:27,062][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4849238016. Throughput: 0: 50538.3. Samples: 2602068800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:27,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 23:22:29,994][49750] Updated weights for policy 0, policy_version 295981 (0.0028) [2024-04-26 23:22:32,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4849467392. Throughput: 0: 50686.5. Samples: 2602376880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:22:32,590][49750] Updated weights for policy 0, policy_version 295991 (0.0029) [2024-04-26 23:22:36,294][49750] Updated weights for policy 0, policy_version 296001 (0.0034) [2024-04-26 23:22:37,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4849713152. Throughput: 0: 50757.2. Samples: 2602524580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:37,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:22:39,050][49750] Updated weights for policy 0, policy_version 296011 (0.0029) [2024-04-26 23:22:42,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4849975296. Throughput: 0: 50827.2. Samples: 2602826140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 23:22:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000296019_4849975296.pth... [2024-04-26 23:22:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000295276_4837801984.pth [2024-04-26 23:22:42,600][49750] Updated weights for policy 0, policy_version 296021 (0.0026) [2024-04-26 23:22:45,421][49750] Updated weights for policy 0, policy_version 296031 (0.0031) [2024-04-26 23:22:47,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4850221056. Throughput: 0: 50830.6. Samples: 2603134600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-26 23:22:47,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 23:22:49,207][49750] Updated weights for policy 0, policy_version 296041 (0.0025) [2024-04-26 23:22:51,835][49750] Updated weights for policy 0, policy_version 296051 (0.0033) [2024-04-26 23:22:52,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.7, 300 sec: 50873.7). 
Total num frames: 4850515968. Throughput: 0: 50810.6. Samples: 2603289040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:22:52,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:22:55,518][49750] Updated weights for policy 0, policy_version 296061 (0.0029) [2024-04-26 23:22:57,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4850728960. Throughput: 0: 50790.9. Samples: 2603594040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:22:57,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 23:22:58,219][49750] Updated weights for policy 0, policy_version 296071 (0.0027) [2024-04-26 23:23:01,919][49750] Updated weights for policy 0, policy_version 296081 (0.0028) [2024-04-26 23:23:02,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4850991104. Throughput: 0: 50777.8. Samples: 2603895600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 23:23:04,740][49750] Updated weights for policy 0, policy_version 296091 (0.0027) [2024-04-26 23:23:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4851236864. Throughput: 0: 50793.0. Samples: 2604044120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:07,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 23:23:08,533][49750] Updated weights for policy 0, policy_version 296101 (0.0034) [2024-04-26 23:23:11,060][49750] Updated weights for policy 0, policy_version 296111 (0.0033) [2024-04-26 23:23:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4851515392. Throughput: 0: 50687.5. Samples: 2604349740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:12,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-26 23:23:14,926][49750] Updated weights for policy 0, policy_version 296121 (0.0027) [2024-04-26 23:23:17,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4851777536. Throughput: 0: 50675.9. Samples: 2604657280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:17,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 23:23:17,517][49750] Updated weights for policy 0, policy_version 296131 (0.0031) [2024-04-26 23:23:21,234][49750] Updated weights for policy 0, policy_version 296141 (0.0029) [2024-04-26 23:23:22,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4851990528. Throughput: 0: 50834.5. Samples: 2604812140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:22,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 23:23:23,945][49750] Updated weights for policy 0, policy_version 296151 (0.0031) [2024-04-26 23:23:27,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4852252672. Throughput: 0: 50873.3. Samples: 2605115440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 23:23:27,653][49750] Updated weights for policy 0, policy_version 296161 (0.0030) [2024-04-26 23:23:30,461][49750] Updated weights for policy 0, policy_version 296171 (0.0030) [2024-04-26 23:23:32,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4852514816. Throughput: 0: 50798.3. Samples: 2605420520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:32,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-26 23:23:34,108][49750] Updated weights for policy 0, policy_version 296181 (0.0029) [2024-04-26 23:23:36,940][49750] Updated weights for policy 0, policy_version 296191 (0.0027) [2024-04-26 23:23:37,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4852793344. Throughput: 0: 50798.4. Samples: 2605574960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:37,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 23:23:40,156][49728] Signal inference workers to stop experience collection... (39150 times) [2024-04-26 23:23:40,187][49750] InferenceWorker_p0-w0: stopping experience collection (39150 times) [2024-04-26 23:23:40,221][49728] Signal inference workers to resume experience collection... (39150 times) [2024-04-26 23:23:40,222][49750] InferenceWorker_p0-w0: resuming experience collection (39150 times) [2024-04-26 23:23:40,490][49750] Updated weights for policy 0, policy_version 296201 (0.0028) [2024-04-26 23:23:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4853022720. Throughput: 0: 50778.0. Samples: 2605879040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:23:43,352][49750] Updated weights for policy 0, policy_version 296211 (0.0030) [2024-04-26 23:23:46,829][49750] Updated weights for policy 0, policy_version 296221 (0.0034) [2024-04-26 23:23:47,062][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 4853284864. Throughput: 0: 50980.9. Samples: 2606189740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:47,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 23:23:49,747][49750] Updated weights for policy 0, policy_version 296231 (0.0030) [2024-04-26 23:23:52,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4853530624. Throughput: 0: 50848.8. Samples: 2606332320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:52,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 23:23:53,263][49750] Updated weights for policy 0, policy_version 296241 (0.0026) [2024-04-26 23:23:56,128][49750] Updated weights for policy 0, policy_version 296251 (0.0034) [2024-04-26 23:23:57,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4853792768. Throughput: 0: 50725.7. Samples: 2606632400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:23:57,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-26 23:23:59,806][49750] Updated weights for policy 0, policy_version 296261 (0.0032) [2024-04-26 23:24:02,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 4854071296. Throughput: 0: 50744.1. Samples: 2606940780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-26 23:24:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 23:24:02,597][49750] Updated weights for policy 0, policy_version 296271 (0.0039) [2024-04-26 23:24:06,326][49750] Updated weights for policy 0, policy_version 296281 (0.0027) [2024-04-26 23:24:07,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4854317056. Throughput: 0: 50944.1. Samples: 2607104620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:07,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 23:24:09,007][49750] Updated weights for policy 0, policy_version 296291 (0.0034) [2024-04-26 23:24:12,062][49517] Fps is (10 sec: 45876.4, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4854530048. Throughput: 0: 50932.1. Samples: 2607407380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:12,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 23:24:12,799][49750] Updated weights for policy 0, policy_version 296301 (0.0038) [2024-04-26 23:24:15,451][49750] Updated weights for policy 0, policy_version 296311 (0.0030) [2024-04-26 23:24:17,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4854792192. Throughput: 0: 50890.9. Samples: 2607710600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:17,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 23:24:19,176][49750] Updated weights for policy 0, policy_version 296321 (0.0033) [2024-04-26 23:24:21,809][49750] Updated weights for policy 0, policy_version 296331 (0.0028) [2024-04-26 23:24:22,062][49517] Fps is (10 sec: 55705.0, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 4855087104. Throughput: 0: 50853.6. Samples: 2607863380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:22,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 23:24:25,669][49750] Updated weights for policy 0, policy_version 296341 (0.0031) [2024-04-26 23:24:27,063][49517] Fps is (10 sec: 54066.0, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4855332864. Throughput: 0: 50913.2. Samples: 2608170140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:24:28,287][49750] Updated weights for policy 0, policy_version 296351 (0.0033) [2024-04-26 23:24:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4855562240. Throughput: 0: 50846.2. Samples: 2608477820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:24:32,121][49750] Updated weights for policy 0, policy_version 296361 (0.0031) [2024-04-26 23:24:34,896][49750] Updated weights for policy 0, policy_version 296371 (0.0035) [2024-04-26 23:24:37,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4855808000. Throughput: 0: 50921.9. Samples: 2608623800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 23:24:38,412][49750] Updated weights for policy 0, policy_version 296381 (0.0033) [2024-04-26 23:24:41,569][49750] Updated weights for policy 0, policy_version 296391 (0.0035) [2024-04-26 23:24:42,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4856086528. Throughput: 0: 51002.7. Samples: 2608927520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:42,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 23:24:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000296392_4856086528.pth... [2024-04-26 23:24:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000295647_4843880448.pth [2024-04-26 23:24:44,741][49750] Updated weights for policy 0, policy_version 296401 (0.0033) [2024-04-26 23:24:47,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4856348672. Throughput: 0: 50916.6. Samples: 2609232020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:47,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 23:24:48,044][49750] Updated weights for policy 0, policy_version 296411 (0.0029) [2024-04-26 23:24:51,164][49750] Updated weights for policy 0, policy_version 296421 (0.0031) [2024-04-26 23:24:51,922][49728] Signal inference workers to stop experience collection... (39200 times) [2024-04-26 23:24:51,979][49750] InferenceWorker_p0-w0: stopping experience collection (39200 times) [2024-04-26 23:24:51,998][49728] Signal inference workers to resume experience collection... (39200 times) [2024-04-26 23:24:52,000][49750] InferenceWorker_p0-w0: resuming experience collection (39200 times) [2024-04-26 23:24:52,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4856594432. Throughput: 0: 50970.7. Samples: 2609398300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:52,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 23:24:54,340][49750] Updated weights for policy 0, policy_version 296431 (0.0030) [2024-04-26 23:24:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4856840192. Throughput: 0: 51024.3. Samples: 2609703480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:24:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 23:24:57,585][49750] Updated weights for policy 0, policy_version 296441 (0.0029) [2024-04-26 23:25:00,757][49750] Updated weights for policy 0, policy_version 296451 (0.0029) [2024-04-26 23:25:02,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.4, 300 sec: 50707.1). Total num frames: 4857069568. Throughput: 0: 51067.9. Samples: 2610008660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:25:02,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 23:25:04,059][49750] Updated weights for policy 0, policy_version 296461 (0.0028) [2024-04-26 23:25:07,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4857364480. Throughput: 0: 50814.4. Samples: 2610150020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:25:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 23:25:07,226][49750] Updated weights for policy 0, policy_version 296471 (0.0033) [2024-04-26 23:25:10,461][49750] Updated weights for policy 0, policy_version 296481 (0.0033) [2024-04-26 23:25:12,062][49517] Fps is (10 sec: 57344.1, 60 sec: 51882.6, 300 sec: 50929.3). Total num frames: 4857643008. Throughput: 0: 50783.7. Samples: 2610455400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:25:12,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-26 23:25:13,578][49750] Updated weights for policy 0, policy_version 296491 (0.0030) [2024-04-26 23:25:16,822][49750] Updated weights for policy 0, policy_version 296501 (0.0032) [2024-04-26 23:25:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4857872384. Throughput: 0: 50855.6. Samples: 2610766320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:25:17,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-26 23:25:20,000][49750] Updated weights for policy 0, policy_version 296511 (0.0030) [2024-04-26 23:25:22,062][49517] Fps is (10 sec: 44236.4, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4858085376. Throughput: 0: 50817.7. Samples: 2610910600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:25:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-26 23:25:23,346][49750] Updated weights for policy 0, policy_version 296521 (0.0029) [2024-04-26 23:25:26,511][49750] Updated weights for policy 0, policy_version 296531 (0.0029) [2024-04-26 23:25:27,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4858363904. Throughput: 0: 50872.2. Samples: 2611216760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:25:27,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 23:25:29,772][49750] Updated weights for policy 0, policy_version 296541 (0.0029) [2024-04-26 23:25:32,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4858626048. Throughput: 0: 50832.8. Samples: 2611519500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:25:32,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 23:25:32,988][49750] Updated weights for policy 0, policy_version 296551 (0.0030) [2024-04-26 23:25:36,087][49750] Updated weights for policy 0, policy_version 296561 (0.0029) [2024-04-26 23:25:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4858888192. Throughput: 0: 50788.5. Samples: 2611683780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:25:37,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 23:25:39,802][49750] Updated weights for policy 0, policy_version 296571 (0.0032) [2024-04-26 23:25:42,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4859133952. Throughput: 0: 50726.7. Samples: 2611986180. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:25:42,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 23:25:42,606][49750] Updated weights for policy 0, policy_version 296581 (0.0029) [2024-04-26 23:25:46,588][49750] Updated weights for policy 0, policy_version 296591 (0.0036) [2024-04-26 23:25:47,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4859363328. Throughput: 0: 50748.9. Samples: 2612292360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:25:47,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:25:47,196][49728] Signal inference workers to stop experience collection... (39250 times) [2024-04-26 23:25:47,242][49750] InferenceWorker_p0-w0: stopping experience collection (39250 times) [2024-04-26 23:25:47,264][49728] Signal inference workers to resume experience collection... (39250 times) [2024-04-26 23:25:47,265][49750] InferenceWorker_p0-w0: resuming experience collection (39250 times) [2024-04-26 23:25:49,190][49750] Updated weights for policy 0, policy_version 296601 (0.0033) [2024-04-26 23:25:52,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4859641856. Throughput: 0: 50793.9. Samples: 2612435760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:25:52,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 23:25:53,020][49750] Updated weights for policy 0, policy_version 296611 (0.0028) [2024-04-26 23:25:55,509][49750] Updated weights for policy 0, policy_version 296621 (0.0030) [2024-04-26 23:25:57,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4859904000. Throughput: 0: 50722.6. Samples: 2612737920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:25:57,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 23:25:59,370][49750] Updated weights for policy 0, policy_version 296631 (0.0038) [2024-04-26 23:26:01,853][49750] Updated weights for policy 0, policy_version 296641 (0.0028) [2024-04-26 23:26:02,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4860166144. Throughput: 0: 50737.7. Samples: 2613049520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:26:02,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 23:26:05,673][49750] Updated weights for policy 0, policy_version 296651 (0.0029) [2024-04-26 23:26:07,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 4860362752. Throughput: 0: 50858.4. Samples: 2613199220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:26:07,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:26:08,635][49750] Updated weights for policy 0, policy_version 296661 (0.0030) [2024-04-26 23:26:12,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4860641280. Throughput: 0: 50767.0. Samples: 2613501280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:26:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 23:26:12,097][49750] Updated weights for policy 0, policy_version 296671 (0.0030) [2024-04-26 23:26:14,958][49750] Updated weights for policy 0, policy_version 296681 (0.0032) [2024-04-26 23:26:17,062][49517] Fps is (10 sec: 55705.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4860919808. Throughput: 0: 50848.6. Samples: 2613807680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:26:17,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 23:26:18,558][49750] Updated weights for policy 0, policy_version 296691 (0.0038) [2024-04-26 23:26:21,342][49750] Updated weights for policy 0, policy_version 296701 (0.0027) [2024-04-26 23:26:22,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4861181952. Throughput: 0: 50840.2. Samples: 2613971600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:26:22,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 23:26:24,898][49750] Updated weights for policy 0, policy_version 296711 (0.0034) [2024-04-26 23:26:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4861411328. Throughput: 0: 50853.7. Samples: 2614274600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-26 23:26:27,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:26:27,900][49750] Updated weights for policy 0, policy_version 296721 (0.0033) [2024-04-26 23:26:31,203][49750] Updated weights for policy 0, policy_version 296731 (0.0040) [2024-04-26 23:26:32,062][49517] Fps is (10 sec: 45876.1, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4861640704. Throughput: 0: 50624.5. Samples: 2614570460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:26:32,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 23:26:34,289][49750] Updated weights for policy 0, policy_version 296741 (0.0034) [2024-04-26 23:26:37,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4861919232. Throughput: 0: 50601.4. Samples: 2614712820. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:26:37,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-26 23:26:37,825][49750] Updated weights for policy 0, policy_version 296751 (0.0029) [2024-04-26 23:26:40,610][49750] Updated weights for policy 0, policy_version 296761 (0.0026) [2024-04-26 23:26:42,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4862181376. Throughput: 0: 50713.8. Samples: 2615020040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:26:42,063][49517] Avg episode reward: [(0, '0.437')] [2024-04-26 23:26:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000296764_4862181376.pth... [2024-04-26 23:26:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000296019_4849975296.pth [2024-04-26 23:26:44,450][49750] Updated weights for policy 0, policy_version 296771 (0.0034) [2024-04-26 23:26:45,326][49728] Signal inference workers to stop experience collection... (39300 times) [2024-04-26 23:26:45,360][49750] InferenceWorker_p0-w0: stopping experience collection (39300 times) [2024-04-26 23:26:45,395][49728] Signal inference workers to resume experience collection... (39300 times) [2024-04-26 23:26:45,395][49750] InferenceWorker_p0-w0: resuming experience collection (39300 times) [2024-04-26 23:26:47,016][49750] Updated weights for policy 0, policy_version 296781 (0.0027) [2024-04-26 23:26:47,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51609.5, 300 sec: 50929.3). Total num frames: 4862459904. Throughput: 0: 50697.8. Samples: 2615330920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:26:47,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 23:26:51,250][49750] Updated weights for policy 0, policy_version 296791 (0.0033) [2024-04-26 23:26:52,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4862656512. Throughput: 0: 50883.5. 
Samples: 2615488980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:26:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 23:26:53,531][49750] Updated weights for policy 0, policy_version 296801 (0.0031) [2024-04-26 23:26:57,062][49517] Fps is (10 sec: 45875.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4862918656. Throughput: 0: 50840.1. Samples: 2615789080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:26:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 23:26:57,532][49750] Updated weights for policy 0, policy_version 296811 (0.0029) [2024-04-26 23:26:59,852][49750] Updated weights for policy 0, policy_version 296821 (0.0044) [2024-04-26 23:27:02,063][49517] Fps is (10 sec: 54066.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4863197184. Throughput: 0: 50670.4. Samples: 2616087860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:27:02,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:27:03,867][49750] Updated weights for policy 0, policy_version 296831 (0.0032) [2024-04-26 23:27:06,371][49750] Updated weights for policy 0, policy_version 296841 (0.0036) [2024-04-26 23:27:07,062][49517] Fps is (10 sec: 57343.6, 60 sec: 52155.6, 300 sec: 50984.8). Total num frames: 4863492096. Throughput: 0: 50664.6. Samples: 2616251500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:27:07,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 23:27:10,422][49750] Updated weights for policy 0, policy_version 296851 (0.0038) [2024-04-26 23:27:12,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 4863705088. Throughput: 0: 50618.5. Samples: 2616552440. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:27:12,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-26 23:27:12,848][49750] Updated weights for policy 0, policy_version 296861 (0.0029) [2024-04-26 23:27:16,876][49750] Updated weights for policy 0, policy_version 296871 (0.0027) [2024-04-26 23:27:17,062][49517] Fps is (10 sec: 44237.0, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 4863934464. Throughput: 0: 50830.7. Samples: 2616857840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:27:17,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 23:27:19,243][49750] Updated weights for policy 0, policy_version 296881 (0.0040) [2024-04-26 23:27:22,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4864196608. Throughput: 0: 50740.4. Samples: 2616996140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:27:22,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 23:27:23,176][49750] Updated weights for policy 0, policy_version 296891 (0.0031) [2024-04-26 23:27:25,572][49750] Updated weights for policy 0, policy_version 296901 (0.0028) [2024-04-26 23:27:27,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4864475136. Throughput: 0: 50787.9. Samples: 2617305500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:27:27,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 23:27:29,517][49750] Updated weights for policy 0, policy_version 296911 (0.0039) [2024-04-26 23:27:31,936][49750] Updated weights for policy 0, policy_version 296921 (0.0030) [2024-04-26 23:27:32,063][49517] Fps is (10 sec: 55705.6, 60 sec: 51882.5, 300 sec: 50984.8). Total num frames: 4864753664. Throughput: 0: 50777.7. Samples: 2617615920. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:27:32,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 23:27:34,267][49728] Signal inference workers to stop experience collection... (39350 times) [2024-04-26 23:27:34,304][49750] InferenceWorker_p0-w0: stopping experience collection (39350 times) [2024-04-26 23:27:34,325][49728] Signal inference workers to resume experience collection... (39350 times) [2024-04-26 23:27:34,327][49750] InferenceWorker_p0-w0: resuming experience collection (39350 times) [2024-04-26 23:27:35,989][49750] Updated weights for policy 0, policy_version 296931 (0.0035) [2024-04-26 23:27:37,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4864950272. Throughput: 0: 50771.7. Samples: 2617773720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:27:37,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 23:27:38,411][49750] Updated weights for policy 0, policy_version 296941 (0.0032) [2024-04-26 23:27:42,062][49517] Fps is (10 sec: 45876.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4865212416. Throughput: 0: 50980.9. Samples: 2618083220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 25.0) [2024-04-26 23:27:42,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-26 23:27:42,428][49750] Updated weights for policy 0, policy_version 296951 (0.0030) [2024-04-26 23:27:44,746][49750] Updated weights for policy 0, policy_version 296961 (0.0031) [2024-04-26 23:27:47,062][49517] Fps is (10 sec: 50791.2, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 4865458176. Throughput: 0: 50889.0. Samples: 2618377860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:27:47,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-26 23:27:49,143][49750] Updated weights for policy 0, policy_version 296971 (0.0034) [2024-04-26 23:27:51,139][49750] Updated weights for policy 0, policy_version 296981 (0.0029) [2024-04-26 23:27:52,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51609.5, 300 sec: 50929.3). Total num frames: 4865753088. Throughput: 0: 50878.5. Samples: 2618541040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:27:52,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 23:27:55,497][49750] Updated weights for policy 0, policy_version 296991 (0.0031) [2024-04-26 23:27:57,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4865998848. Throughput: 0: 50896.2. Samples: 2618842760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:27:57,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 23:27:57,501][49750] Updated weights for policy 0, policy_version 297001 (0.0030) [2024-04-26 23:28:01,982][49750] Updated weights for policy 0, policy_version 297011 (0.0032) [2024-04-26 23:28:02,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4866228224. Throughput: 0: 50951.0. Samples: 2619150640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:02,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 23:28:04,165][49750] Updated weights for policy 0, policy_version 297021 (0.0029) [2024-04-26 23:28:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49971.2, 300 sec: 50762.7). Total num frames: 4866490368. Throughput: 0: 50852.6. Samples: 2619284500. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:07,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 23:28:08,470][49750] Updated weights for policy 0, policy_version 297031 (0.0034) [2024-04-26 23:28:10,413][49750] Updated weights for policy 0, policy_version 297041 (0.0036) [2024-04-26 23:28:12,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4866768896. Throughput: 0: 50756.0. Samples: 2619589520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:12,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 23:28:14,874][49750] Updated weights for policy 0, policy_version 297051 (0.0041) [2024-04-26 23:28:16,943][49750] Updated weights for policy 0, policy_version 297061 (0.0031) [2024-04-26 23:28:17,064][49517] Fps is (10 sec: 55697.7, 60 sec: 51881.4, 300 sec: 51040.1). Total num frames: 4867047424. Throughput: 0: 50738.6. Samples: 2619899220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:17,064][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 23:28:21,237][49750] Updated weights for policy 0, policy_version 297071 (0.0034) [2024-04-26 23:28:22,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4867244032. Throughput: 0: 50947.4. Samples: 2620066340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:22,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 23:28:22,107][49728] Signal inference workers to stop experience collection... (39400 times) [2024-04-26 23:28:22,107][49728] Signal inference workers to resume experience collection... 
(39400 times) [2024-04-26 23:28:22,124][49750] InferenceWorker_p0-w0: stopping experience collection (39400 times) [2024-04-26 23:28:22,125][49750] InferenceWorker_p0-w0: resuming experience collection (39400 times) [2024-04-26 23:28:23,493][49750] Updated weights for policy 0, policy_version 297081 (0.0033) [2024-04-26 23:28:27,063][49517] Fps is (10 sec: 45881.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4867506176. Throughput: 0: 50831.0. Samples: 2620370620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:27,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 23:28:27,608][49750] Updated weights for policy 0, policy_version 297091 (0.0036) [2024-04-26 23:28:29,771][49750] Updated weights for policy 0, policy_version 297101 (0.0032) [2024-04-26 23:28:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4867751936. Throughput: 0: 51082.3. Samples: 2620676560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:32,063][49517] Avg episode reward: [(0, '0.480')] [2024-04-26 23:28:33,893][49750] Updated weights for policy 0, policy_version 297111 (0.0028) [2024-04-26 23:28:36,102][49750] Updated weights for policy 0, policy_version 297121 (0.0029) [2024-04-26 23:28:37,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51609.8, 300 sec: 50929.3). Total num frames: 4868046848. Throughput: 0: 50950.4. Samples: 2620833800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 23:28:40,245][49750] Updated weights for policy 0, policy_version 297131 (0.0038) [2024-04-26 23:28:42,063][49517] Fps is (10 sec: 57342.9, 60 sec: 51882.5, 300 sec: 50984.8). Total num frames: 4868325376. Throughput: 0: 51050.0. Samples: 2621140020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:42,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 23:28:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000297139_4868325376.pth... [2024-04-26 23:28:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000296392_4856086528.pth [2024-04-26 23:28:42,529][49750] Updated weights for policy 0, policy_version 297141 (0.0026) [2024-04-26 23:28:46,690][49750] Updated weights for policy 0, policy_version 297151 (0.0030) [2024-04-26 23:28:47,062][49517] Fps is (10 sec: 47513.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4868521984. Throughput: 0: 51164.1. Samples: 2621453020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:47,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:28:48,934][49750] Updated weights for policy 0, policy_version 297161 (0.0038) [2024-04-26 23:28:52,063][49517] Fps is (10 sec: 45875.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4868784128. Throughput: 0: 51217.2. Samples: 2621589280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:28:53,304][49750] Updated weights for policy 0, policy_version 297171 (0.0028) [2024-04-26 23:28:55,266][49750] Updated weights for policy 0, policy_version 297181 (0.0034) [2024-04-26 23:28:57,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4869046272. Throughput: 0: 51237.9. Samples: 2621895220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-26 23:28:57,063][49517] Avg episode reward: [(0, '0.485')] [2024-04-26 23:28:59,673][49750] Updated weights for policy 0, policy_version 297191 (0.0029) [2024-04-26 23:29:01,613][49750] Updated weights for policy 0, policy_version 297201 (0.0032) [2024-04-26 23:29:02,062][49517] Fps is (10 sec: 55706.3, 60 sec: 51882.7, 300 sec: 50929.3). 
Total num frames: 4869341184. Throughput: 0: 51126.1. Samples: 2622199820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:02,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 23:29:06,187][49750] Updated weights for policy 0, policy_version 297211 (0.0031) [2024-04-26 23:29:07,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 4869554176. Throughput: 0: 51226.2. Samples: 2622371520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 23:29:08,209][49750] Updated weights for policy 0, policy_version 297221 (0.0035) [2024-04-26 23:29:12,062][49517] Fps is (10 sec: 45875.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4869799936. Throughput: 0: 51172.6. Samples: 2622673380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:12,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 23:29:12,628][49750] Updated weights for policy 0, policy_version 297231 (0.0039) [2024-04-26 23:29:12,838][49728] Signal inference workers to stop experience collection... (39450 times) [2024-04-26 23:29:12,838][49728] Signal inference workers to resume experience collection... (39450 times) [2024-04-26 23:29:12,852][49750] InferenceWorker_p0-w0: stopping experience collection (39450 times) [2024-04-26 23:29:12,858][49750] InferenceWorker_p0-w0: resuming experience collection (39450 times) [2024-04-26 23:29:14,905][49750] Updated weights for policy 0, policy_version 297241 (0.0028) [2024-04-26 23:29:17,062][49517] Fps is (10 sec: 49151.5, 60 sec: 49972.3, 300 sec: 50707.1). Total num frames: 4870045696. Throughput: 0: 51035.1. Samples: 2622973140. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 23:29:19,010][49750] Updated weights for policy 0, policy_version 297251 (0.0026) [2024-04-26 23:29:21,453][49750] Updated weights for policy 0, policy_version 297261 (0.0032) [2024-04-26 23:29:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4870324224. Throughput: 0: 50873.8. Samples: 2623123120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 23:29:25,342][49750] Updated weights for policy 0, policy_version 297271 (0.0037) [2024-04-26 23:29:27,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51609.5, 300 sec: 50984.8). Total num frames: 4870602752. Throughput: 0: 50914.7. Samples: 2623431180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:27,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 23:29:27,969][49750] Updated weights for policy 0, policy_version 297281 (0.0031) [2024-04-26 23:29:31,853][49750] Updated weights for policy 0, policy_version 297291 (0.0029) [2024-04-26 23:29:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4870832128. Throughput: 0: 50718.1. Samples: 2623735340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:29:34,331][49750] Updated weights for policy 0, policy_version 297301 (0.0031) [2024-04-26 23:29:37,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4871077888. Throughput: 0: 50825.9. Samples: 2623876440. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:37,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 23:29:38,219][49750] Updated weights for policy 0, policy_version 297311 (0.0028) [2024-04-26 23:29:40,704][49750] Updated weights for policy 0, policy_version 297321 (0.0030) [2024-04-26 23:29:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4871323648. Throughput: 0: 50772.9. Samples: 2624180000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:42,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 23:29:44,552][49750] Updated weights for policy 0, policy_version 297331 (0.0030) [2024-04-26 23:29:47,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 4871618560. Throughput: 0: 50772.8. Samples: 2624484600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:47,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:29:47,307][49750] Updated weights for policy 0, policy_version 297341 (0.0036) [2024-04-26 23:29:51,073][49750] Updated weights for policy 0, policy_version 297351 (0.0036) [2024-04-26 23:29:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4871847936. Throughput: 0: 50702.1. Samples: 2624653120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:52,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-26 23:29:53,936][49750] Updated weights for policy 0, policy_version 297361 (0.0030) [2024-04-26 23:29:57,063][49517] Fps is (10 sec: 45874.8, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4872077312. Throughput: 0: 50735.4. Samples: 2624956480. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:29:57,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-26 23:29:57,411][49750] Updated weights for policy 0, policy_version 297371 (0.0036) [2024-04-26 23:30:00,409][49750] Updated weights for policy 0, policy_version 297381 (0.0030) [2024-04-26 23:30:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4872339456. Throughput: 0: 50871.6. Samples: 2625262360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:30:02,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 23:30:03,899][49750] Updated weights for policy 0, policy_version 297391 (0.0031) [2024-04-26 23:30:06,695][49750] Updated weights for policy 0, policy_version 297401 (0.0031) [2024-04-26 23:30:07,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4872617984. Throughput: 0: 50900.7. Samples: 2625413660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:30:07,063][49517] Avg episode reward: [(0, '0.705')] [2024-04-26 23:30:10,238][49750] Updated weights for policy 0, policy_version 297411 (0.0027) [2024-04-26 23:30:12,063][49517] Fps is (10 sec: 54065.9, 60 sec: 51336.3, 300 sec: 50873.7). Total num frames: 4872880128. Throughput: 0: 50746.6. Samples: 2625714780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 23:30:13,308][49750] Updated weights for policy 0, policy_version 297421 (0.0027) [2024-04-26 23:30:16,641][49750] Updated weights for policy 0, policy_version 297431 (0.0030) [2024-04-26 23:30:17,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4873125888. Throughput: 0: 50896.9. Samples: 2626025700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:17,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 23:30:17,552][49728] Signal inference workers to stop experience collection... 
(39500 times) [2024-04-26 23:30:17,582][49750] InferenceWorker_p0-w0: stopping experience collection (39500 times) [2024-04-26 23:30:17,614][49728] Signal inference workers to resume experience collection... (39500 times) [2024-04-26 23:30:17,615][49750] InferenceWorker_p0-w0: resuming experience collection (39500 times) [2024-04-26 23:30:19,808][49750] Updated weights for policy 0, policy_version 297441 (0.0033) [2024-04-26 23:30:22,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4873371648. Throughput: 0: 51124.8. Samples: 2626177060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:22,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 23:30:23,076][49750] Updated weights for policy 0, policy_version 297451 (0.0030) [2024-04-26 23:30:26,290][49750] Updated weights for policy 0, policy_version 297461 (0.0030) [2024-04-26 23:30:27,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.3, 300 sec: 50762.7). Total num frames: 4873601024. Throughput: 0: 50990.7. Samples: 2626474580. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:27,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 23:30:29,436][49750] Updated weights for policy 0, policy_version 297471 (0.0030) [2024-04-26 23:30:32,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4873895936. Throughput: 0: 51100.4. Samples: 2626784120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:32,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 23:30:32,780][49750] Updated weights for policy 0, policy_version 297481 (0.0038) [2024-04-26 23:30:35,851][49750] Updated weights for policy 0, policy_version 297491 (0.0040) [2024-04-26 23:30:37,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4874158080. Throughput: 0: 50905.3. Samples: 2626943860. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:37,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-26 23:30:39,112][49750] Updated weights for policy 0, policy_version 297501 (0.0035) [2024-04-26 23:30:42,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4874387456. Throughput: 0: 50935.1. Samples: 2627248560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 23:30:42,141][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000297510_4874403840.pth... [2024-04-26 23:30:42,193][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000296764_4862181376.pth [2024-04-26 23:30:42,340][49750] Updated weights for policy 0, policy_version 297511 (0.0029) [2024-04-26 23:30:45,514][49750] Updated weights for policy 0, policy_version 297521 (0.0034) [2024-04-26 23:30:47,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4874649600. Throughput: 0: 50940.7. Samples: 2627554700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:47,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 23:30:48,809][49750] Updated weights for policy 0, policy_version 297531 (0.0030) [2024-04-26 23:30:51,936][49750] Updated weights for policy 0, policy_version 297541 (0.0034) [2024-04-26 23:30:52,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4874911744. Throughput: 0: 50805.3. Samples: 2627699900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:52,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:30:55,150][49750] Updated weights for policy 0, policy_version 297551 (0.0034) [2024-04-26 23:30:57,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4875157504. Throughput: 0: 50953.4. Samples: 2628007680. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:30:57,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 23:30:58,305][49750] Updated weights for policy 0, policy_version 297561 (0.0031) [2024-04-26 23:31:01,648][49750] Updated weights for policy 0, policy_version 297571 (0.0030) [2024-04-26 23:31:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51336.5, 300 sec: 51040.3). Total num frames: 4875419648. Throughput: 0: 50864.0. Samples: 2628314580. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:31:02,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:31:04,827][49750] Updated weights for policy 0, policy_version 297581 (0.0035) [2024-04-26 23:31:07,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4875665408. Throughput: 0: 50851.1. Samples: 2628465360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:31:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-26 23:31:07,961][49750] Updated weights for policy 0, policy_version 297591 (0.0029) [2024-04-26 23:31:11,543][49750] Updated weights for policy 0, policy_version 297601 (0.0038) [2024-04-26 23:31:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4875927552. Throughput: 0: 51150.5. Samples: 2628776360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:31:12,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 23:31:14,257][49750] Updated weights for policy 0, policy_version 297611 (0.0031) [2024-04-26 23:31:17,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4876156928. Throughput: 0: 51000.7. Samples: 2629079160. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:31:17,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 23:31:18,100][49750] Updated weights for policy 0, policy_version 297621 (0.0029) [2024-04-26 23:31:20,709][49750] Updated weights for policy 0, policy_version 297631 (0.0027) [2024-04-26 23:31:21,494][49728] Signal inference workers to stop experience collection... (39550 times) [2024-04-26 23:31:21,514][49750] InferenceWorker_p0-w0: stopping experience collection (39550 times) [2024-04-26 23:31:21,562][49728] Signal inference workers to resume experience collection... (39550 times) [2024-04-26 23:31:21,563][49750] InferenceWorker_p0-w0: resuming experience collection (39550 times) [2024-04-26 23:31:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4876451840. Throughput: 0: 50752.5. Samples: 2629227720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-26 23:31:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 23:31:24,564][49750] Updated weights for policy 0, policy_version 297641 (0.0033) [2024-04-26 23:31:27,063][49517] Fps is (10 sec: 52429.4, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 4876681216. Throughput: 0: 50842.8. Samples: 2629536480. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:31:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-26 23:31:27,320][49750] Updated weights for policy 0, policy_version 297651 (0.0033) [2024-04-26 23:31:31,073][49750] Updated weights for policy 0, policy_version 297661 (0.0030) [2024-04-26 23:31:32,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4876926976. Throughput: 0: 50802.7. Samples: 2629840820. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:31:32,064][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 23:31:33,665][49750] Updated weights for policy 0, policy_version 297671 (0.0034) [2024-04-26 23:31:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4877172736. Throughput: 0: 50663.7. Samples: 2629979760. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:31:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 23:31:37,402][49750] Updated weights for policy 0, policy_version 297681 (0.0028) [2024-04-26 23:31:40,070][49750] Updated weights for policy 0, policy_version 297691 (0.0030) [2024-04-26 23:31:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4877451264. Throughput: 0: 50760.1. Samples: 2630291880. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:31:42,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:31:43,931][49750] Updated weights for policy 0, policy_version 297701 (0.0030) [2024-04-26 23:31:46,584][49750] Updated weights for policy 0, policy_version 297711 (0.0032) [2024-04-26 23:31:47,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.5, 300 sec: 51040.3). Total num frames: 4877713408. Throughput: 0: 50677.3. Samples: 2630595060. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:31:47,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 23:31:50,346][49750] Updated weights for policy 0, policy_version 297721 (0.0031) [2024-04-26 23:31:52,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50929.2). Total num frames: 4877942784. Throughput: 0: 50880.9. Samples: 2630755000. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:31:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:31:53,178][49750] Updated weights for policy 0, policy_version 297731 (0.0026) [2024-04-26 23:31:56,726][49750] Updated weights for policy 0, policy_version 297741 (0.0029) [2024-04-26 23:31:57,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4878204928. Throughput: 0: 50704.0. Samples: 2631058040. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:31:57,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-26 23:31:59,510][49750] Updated weights for policy 0, policy_version 297751 (0.0029) [2024-04-26 23:32:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4878450688. Throughput: 0: 50770.4. Samples: 2631363820. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:32:02,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 23:32:03,149][49750] Updated weights for policy 0, policy_version 297761 (0.0030) [2024-04-26 23:32:05,804][49750] Updated weights for policy 0, policy_version 297771 (0.0030) [2024-04-26 23:32:07,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 4878729216. Throughput: 0: 50810.1. Samples: 2631514180. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:32:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 23:32:09,508][49750] Updated weights for policy 0, policy_version 297781 (0.0026) [2024-04-26 23:32:10,125][49728] Signal inference workers to stop experience collection... (39600 times) [2024-04-26 23:32:10,163][49750] InferenceWorker_p0-w0: stopping experience collection (39600 times) [2024-04-26 23:32:10,185][49728] Signal inference workers to resume experience collection... 
(39600 times) [2024-04-26 23:32:10,186][49750] InferenceWorker_p0-w0: resuming experience collection (39600 times) [2024-04-26 23:32:12,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.6, 300 sec: 51040.3). Total num frames: 4878991360. Throughput: 0: 50700.0. Samples: 2631817980. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:32:12,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-26 23:32:12,288][49750] Updated weights for policy 0, policy_version 297791 (0.0027) [2024-04-26 23:32:15,973][49750] Updated weights for policy 0, policy_version 297801 (0.0027) [2024-04-26 23:32:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 4879237120. Throughput: 0: 50767.7. Samples: 2632125360. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:32:17,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 23:32:18,773][49750] Updated weights for policy 0, policy_version 297811 (0.0027) [2024-04-26 23:32:22,062][49517] Fps is (10 sec: 45875.4, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 4879450112. Throughput: 0: 51036.4. Samples: 2632276400. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:32:22,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 23:32:22,468][49750] Updated weights for policy 0, policy_version 297821 (0.0032) [2024-04-26 23:32:25,118][49750] Updated weights for policy 0, policy_version 297831 (0.0032) [2024-04-26 23:32:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4879728640. Throughput: 0: 50733.0. Samples: 2632574860. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:32:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 23:32:28,958][49750] Updated weights for policy 0, policy_version 297841 (0.0035) [2024-04-26 23:32:31,455][49750] Updated weights for policy 0, policy_version 297851 (0.0032) [2024-04-26 23:32:32,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51336.7, 300 sec: 51040.4). Total num frames: 4880007168. Throughput: 0: 50758.3. Samples: 2632879180. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:32:32,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 23:32:35,327][49750] Updated weights for policy 0, policy_version 297861 (0.0032) [2024-04-26 23:32:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4880220160. Throughput: 0: 50955.2. Samples: 2633047980. Policy #0 lag: (min: 1.0, avg: 8.5, max: 20.0) [2024-04-26 23:32:37,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 23:32:38,033][49750] Updated weights for policy 0, policy_version 297871 (0.0031) [2024-04-26 23:32:41,658][49750] Updated weights for policy 0, policy_version 297881 (0.0029) [2024-04-26 23:32:42,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 4880498688. Throughput: 0: 50901.4. Samples: 2633348600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:32:42,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 23:32:42,085][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000297883_4880515072.pth... [2024-04-26 23:32:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000297139_4868325376.pth [2024-04-26 23:32:44,483][49750] Updated weights for policy 0, policy_version 297891 (0.0030) [2024-04-26 23:32:47,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4880744448. Throughput: 0: 50893.2. Samples: 2633654020. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:32:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:32:48,103][49750] Updated weights for policy 0, policy_version 297901 (0.0030) [2024-04-26 23:32:50,996][49750] Updated weights for policy 0, policy_version 297911 (0.0028) [2024-04-26 23:32:52,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 4881022976. Throughput: 0: 50953.5. Samples: 2633807080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:32:52,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-26 23:32:54,460][49750] Updated weights for policy 0, policy_version 297921 (0.0035) [2024-04-26 23:32:57,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 4881268736. Throughput: 0: 50820.0. Samples: 2634104880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:32:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 23:32:57,447][49750] Updated weights for policy 0, policy_version 297931 (0.0032) [2024-04-26 23:33:00,923][49750] Updated weights for policy 0, policy_version 297941 (0.0038) [2024-04-26 23:33:02,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4881514496. Throughput: 0: 50746.6. Samples: 2634408960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:02,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 23:33:03,841][49750] Updated weights for policy 0, policy_version 297951 (0.0032) [2024-04-26 23:33:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4881760256. Throughput: 0: 50716.0. Samples: 2634558620. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:07,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-26 23:33:07,307][49750] Updated weights for policy 0, policy_version 297961 (0.0034) [2024-04-26 23:33:10,373][49750] Updated weights for policy 0, policy_version 297971 (0.0030) [2024-04-26 23:33:12,062][49517] Fps is (10 sec: 47514.4, 60 sec: 49971.3, 300 sec: 50651.8). Total num frames: 4881989632. Throughput: 0: 50847.1. Samples: 2634862980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:12,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 23:33:13,850][49750] Updated weights for policy 0, policy_version 297981 (0.0027) [2024-04-26 23:33:16,732][49750] Updated weights for policy 0, policy_version 297991 (0.0034) [2024-04-26 23:33:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 4882284544. Throughput: 0: 50808.5. Samples: 2635165560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:17,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 23:33:20,321][49750] Updated weights for policy 0, policy_version 298001 (0.0033) [2024-04-26 23:33:22,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4882530304. Throughput: 0: 50600.4. Samples: 2635325000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:22,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 23:33:23,226][49750] Updated weights for policy 0, policy_version 298011 (0.0035) [2024-04-26 23:33:26,947][49750] Updated weights for policy 0, policy_version 298021 (0.0031) [2024-04-26 23:33:27,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4882776064. Throughput: 0: 50743.2. Samples: 2635632040. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:27,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 23:33:28,322][49728] Signal inference workers to stop experience collection... (39650 times) [2024-04-26 23:33:28,368][49750] InferenceWorker_p0-w0: stopping experience collection (39650 times) [2024-04-26 23:33:28,430][49728] Signal inference workers to resume experience collection... (39650 times) [2024-04-26 23:33:28,430][49750] InferenceWorker_p0-w0: resuming experience collection (39650 times) [2024-04-26 23:33:29,684][49750] Updated weights for policy 0, policy_version 298031 (0.0027) [2024-04-26 23:33:32,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4883021824. Throughput: 0: 50697.7. Samples: 2635935420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:32,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 23:33:33,315][49750] Updated weights for policy 0, policy_version 298041 (0.0034) [2024-04-26 23:33:36,159][49750] Updated weights for policy 0, policy_version 298051 (0.0037) [2024-04-26 23:33:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.5, 300 sec: 50762.7). Total num frames: 4883300352. Throughput: 0: 50651.1. Samples: 2636086380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:37,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 23:33:39,647][49750] Updated weights for policy 0, policy_version 298061 (0.0038) [2024-04-26 23:33:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4883529728. Throughput: 0: 50824.1. Samples: 2636391960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 23:33:42,706][49750] Updated weights for policy 0, policy_version 298071 (0.0028) [2024-04-26 23:33:46,124][49750] Updated weights for policy 0, policy_version 298081 (0.0034) [2024-04-26 23:33:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4883791872. Throughput: 0: 50753.5. Samples: 2636692860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:33:47,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 23:33:49,224][49750] Updated weights for policy 0, policy_version 298091 (0.0037) [2024-04-26 23:33:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4884054016. Throughput: 0: 50700.4. Samples: 2636840140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:33:52,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 23:33:52,625][49750] Updated weights for policy 0, policy_version 298101 (0.0031) [2024-04-26 23:33:55,763][49750] Updated weights for policy 0, policy_version 298111 (0.0034) [2024-04-26 23:33:57,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4884283392. Throughput: 0: 50736.3. Samples: 2637146120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:33:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 23:33:58,937][49750] Updated weights for policy 0, policy_version 298121 (0.0036) [2024-04-26 23:34:02,050][49750] Updated weights for policy 0, policy_version 298131 (0.0030) [2024-04-26 23:34:02,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4884578304. Throughput: 0: 50983.3. Samples: 2637459820. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 23:34:05,237][49750] Updated weights for policy 0, policy_version 298141 (0.0033) [2024-04-26 23:34:07,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4884807680. Throughput: 0: 50726.3. Samples: 2637607680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:07,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 23:34:08,630][49750] Updated weights for policy 0, policy_version 298151 (0.0028) [2024-04-26 23:34:11,866][49750] Updated weights for policy 0, policy_version 298161 (0.0027) [2024-04-26 23:34:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 51336.3, 300 sec: 50929.2). Total num frames: 4885069824. Throughput: 0: 50699.8. Samples: 2637913540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:12,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 23:34:15,112][49750] Updated weights for policy 0, policy_version 298171 (0.0031) [2024-04-26 23:34:17,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 4885299200. Throughput: 0: 50799.1. Samples: 2638221380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:17,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 23:34:18,465][49750] Updated weights for policy 0, policy_version 298181 (0.0028) [2024-04-26 23:34:21,535][49750] Updated weights for policy 0, policy_version 298191 (0.0035) [2024-04-26 23:34:22,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4885561344. Throughput: 0: 50605.4. Samples: 2638363620. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:22,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 23:34:24,770][49750] Updated weights for policy 0, policy_version 298201 (0.0028) [2024-04-26 23:34:27,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4885823488. Throughput: 0: 50645.6. Samples: 2638671020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:27,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 23:34:27,925][49750] Updated weights for policy 0, policy_version 298211 (0.0035) [2024-04-26 23:34:31,572][49750] Updated weights for policy 0, policy_version 298221 (0.0036) [2024-04-26 23:34:32,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4886085632. Throughput: 0: 50765.6. Samples: 2638977320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:32,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 23:34:34,546][49750] Updated weights for policy 0, policy_version 298231 (0.0036) [2024-04-26 23:34:37,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4886331392. Throughput: 0: 50767.2. Samples: 2639124660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:37,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-26 23:34:38,012][49750] Updated weights for policy 0, policy_version 298241 (0.0034) [2024-04-26 23:34:41,056][49750] Updated weights for policy 0, policy_version 298251 (0.0040) [2024-04-26 23:34:42,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4886577152. Throughput: 0: 50751.9. Samples: 2639429960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:42,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 23:34:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000298253_4886577152.pth... 
[2024-04-26 23:34:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000297510_4874403840.pth [2024-04-26 23:34:42,764][49728] Signal inference workers to stop experience collection... (39700 times) [2024-04-26 23:34:42,764][49728] Signal inference workers to resume experience collection... (39700 times) [2024-04-26 23:34:42,778][49750] InferenceWorker_p0-w0: stopping experience collection (39700 times) [2024-04-26 23:34:42,778][49750] InferenceWorker_p0-w0: resuming experience collection (39700 times) [2024-04-26 23:34:44,383][49750] Updated weights for policy 0, policy_version 298261 (0.0032) [2024-04-26 23:34:47,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4886839296. Throughput: 0: 50721.8. Samples: 2639742300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:47,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 23:34:47,420][49750] Updated weights for policy 0, policy_version 298271 (0.0028) [2024-04-26 23:34:50,684][49750] Updated weights for policy 0, policy_version 298281 (0.0032) [2024-04-26 23:34:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4887085056. Throughput: 0: 50689.2. Samples: 2639888700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:52,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-26 23:34:53,812][49750] Updated weights for policy 0, policy_version 298291 (0.0033) [2024-04-26 23:34:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4887347200. Throughput: 0: 50651.3. Samples: 2640192840. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:34:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 23:34:57,161][49750] Updated weights for policy 0, policy_version 298301 (0.0030) [2024-04-26 23:35:00,345][49750] Updated weights for policy 0, policy_version 298311 (0.0030) [2024-04-26 23:35:02,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4887609344. Throughput: 0: 50634.7. Samples: 2640499940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-26 23:35:02,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 23:35:03,521][49750] Updated weights for policy 0, policy_version 298321 (0.0029) [2024-04-26 23:35:06,877][49750] Updated weights for policy 0, policy_version 298331 (0.0027) [2024-04-26 23:35:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 4887855104. Throughput: 0: 50930.3. Samples: 2640655480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:07,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 23:35:09,832][49750] Updated weights for policy 0, policy_version 298341 (0.0028) [2024-04-26 23:35:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4888100864. Throughput: 0: 50898.3. Samples: 2640961440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:12,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-26 23:35:13,192][49750] Updated weights for policy 0, policy_version 298351 (0.0036) [2024-04-26 23:35:16,392][49750] Updated weights for policy 0, policy_version 298361 (0.0035) [2024-04-26 23:35:17,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4888379392. Throughput: 0: 50825.8. Samples: 2641264480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:17,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-26 23:35:19,776][49750] Updated weights for policy 0, policy_version 298371 (0.0036) [2024-04-26 23:35:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4888625152. Throughput: 0: 51051.9. Samples: 2641422000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:22,063][49517] Avg episode reward: [(0, '0.500')] [2024-04-26 23:35:22,709][49750] Updated weights for policy 0, policy_version 298381 (0.0036) [2024-04-26 23:35:26,411][49750] Updated weights for policy 0, policy_version 298391 (0.0033) [2024-04-26 23:35:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4888870912. Throughput: 0: 51073.9. Samples: 2641728280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:27,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 23:35:29,099][49750] Updated weights for policy 0, policy_version 298401 (0.0035) [2024-04-26 23:35:32,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4889116672. Throughput: 0: 50776.0. Samples: 2642027220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:32,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 23:35:32,724][49750] Updated weights for policy 0, policy_version 298411 (0.0038) [2024-04-26 23:35:35,690][49750] Updated weights for policy 0, policy_version 298421 (0.0030) [2024-04-26 23:35:37,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4889362432. Throughput: 0: 50888.3. Samples: 2642178680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:37,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-26 23:35:39,030][49750] Updated weights for policy 0, policy_version 298431 (0.0034) [2024-04-26 23:35:42,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4889640960. Throughput: 0: 50918.6. Samples: 2642484180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:42,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 23:35:42,115][49750] Updated weights for policy 0, policy_version 298441 (0.0036) [2024-04-26 23:35:44,968][49728] Signal inference workers to stop experience collection... (39750 times) [2024-04-26 23:35:45,001][49750] InferenceWorker_p0-w0: stopping experience collection (39750 times) [2024-04-26 23:35:45,036][49728] Signal inference workers to resume experience collection... (39750 times) [2024-04-26 23:35:45,036][49750] InferenceWorker_p0-w0: resuming experience collection (39750 times) [2024-04-26 23:35:45,477][49750] Updated weights for policy 0, policy_version 298451 (0.0022) [2024-04-26 23:35:47,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4889903104. Throughput: 0: 50863.2. Samples: 2642788780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:47,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 23:35:48,609][49750] Updated weights for policy 0, policy_version 298461 (0.0037) [2024-04-26 23:35:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 4890132480. Throughput: 0: 50935.1. Samples: 2642947560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:52,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:35:52,174][49750] Updated weights for policy 0, policy_version 298471 (0.0038) [2024-04-26 23:35:55,002][49750] Updated weights for policy 0, policy_version 298481 (0.0038) [2024-04-26 23:35:57,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4890378240. Throughput: 0: 50788.5. Samples: 2643246920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:35:57,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 23:35:58,505][49750] Updated weights for policy 0, policy_version 298491 (0.0028) [2024-04-26 23:36:01,389][49750] Updated weights for policy 0, policy_version 298501 (0.0031) [2024-04-26 23:36:02,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4890656768. Throughput: 0: 50839.0. Samples: 2643552240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:36:02,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 23:36:04,944][49750] Updated weights for policy 0, policy_version 298511 (0.0033) [2024-04-26 23:36:07,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4890935296. Throughput: 0: 50845.3. Samples: 2643710040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:36:07,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 23:36:07,848][49750] Updated weights for policy 0, policy_version 298521 (0.0032) [2024-04-26 23:36:11,380][49750] Updated weights for policy 0, policy_version 298531 (0.0029) [2024-04-26 23:36:12,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 4891181056. Throughput: 0: 50916.3. Samples: 2644019520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:36:12,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 23:36:14,274][49750] Updated weights for policy 0, policy_version 298541 (0.0040) [2024-04-26 23:36:17,063][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4891394048. Throughput: 0: 50960.8. Samples: 2644320460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-26 23:36:17,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-26 23:36:17,789][49750] Updated weights for policy 0, policy_version 298551 (0.0040) [2024-04-26 23:36:20,779][49750] Updated weights for policy 0, policy_version 298561 (0.0028) [2024-04-26 23:36:22,063][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4891639808. Throughput: 0: 50856.5. Samples: 2644467220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:36:22,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 23:36:24,146][49750] Updated weights for policy 0, policy_version 298571 (0.0039) [2024-04-26 23:36:27,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4891918336. Throughput: 0: 50816.0. Samples: 2644770900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:36:27,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:36:27,270][49750] Updated weights for policy 0, policy_version 298581 (0.0033) [2024-04-26 23:36:30,627][49750] Updated weights for policy 0, policy_version 298591 (0.0031) [2024-04-26 23:36:32,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 4892196864. Throughput: 0: 50740.6. Samples: 2645072120. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:36:32,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-26 23:36:33,822][49750] Updated weights for policy 0, policy_version 298601 (0.0029) [2024-04-26 23:36:37,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4892426240. Throughput: 0: 50696.7. Samples: 2645228920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:36:37,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 23:36:37,106][49750] Updated weights for policy 0, policy_version 298611 (0.0031) [2024-04-26 23:36:40,285][49750] Updated weights for policy 0, policy_version 298621 (0.0027) [2024-04-26 23:36:42,063][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4892655616. Throughput: 0: 50810.1. Samples: 2645533380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:36:42,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-26 23:36:42,145][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000298625_4892672000.pth... [2024-04-26 23:36:42,199][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000297883_4880515072.pth [2024-04-26 23:36:43,453][49750] Updated weights for policy 0, policy_version 298631 (0.0026) [2024-04-26 23:36:46,623][49750] Updated weights for policy 0, policy_version 298641 (0.0030) [2024-04-26 23:36:47,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 4892934144. Throughput: 0: 50830.7. Samples: 2645839620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:36:47,063][49517] Avg episode reward: [(0, '0.519')] [2024-04-26 23:36:49,867][49750] Updated weights for policy 0, policy_version 298651 (0.0032) [2024-04-26 23:36:50,820][49728] Signal inference workers to stop experience collection... 
(39800 times) [2024-04-26 23:36:50,863][49750] InferenceWorker_p0-w0: stopping experience collection (39800 times) [2024-04-26 23:36:50,878][49728] Signal inference workers to resume experience collection... (39800 times) [2024-04-26 23:36:50,886][49750] InferenceWorker_p0-w0: resuming experience collection (39800 times) [2024-04-26 23:36:52,063][49517] Fps is (10 sec: 55705.6, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4893212672. Throughput: 0: 50596.9. Samples: 2645986900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:36:52,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-26 23:36:53,052][49750] Updated weights for policy 0, policy_version 298661 (0.0031) [2024-04-26 23:36:56,413][49750] Updated weights for policy 0, policy_version 298671 (0.0033) [2024-04-26 23:36:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4893458432. Throughput: 0: 50641.0. Samples: 2646298360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:36:57,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 23:36:59,563][49750] Updated weights for policy 0, policy_version 298681 (0.0030) [2024-04-26 23:37:02,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4893687808. Throughput: 0: 50836.0. Samples: 2646608080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:37:02,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 23:37:02,902][49750] Updated weights for policy 0, policy_version 298691 (0.0030) [2024-04-26 23:37:06,118][49750] Updated weights for policy 0, policy_version 298701 (0.0034) [2024-04-26 23:37:07,062][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4893933568. Throughput: 0: 50668.6. Samples: 2646747300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:37:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 23:37:09,288][49750] Updated weights for policy 0, policy_version 298711 (0.0030) [2024-04-26 23:37:12,062][49517] Fps is (10 sec: 52430.0, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4894212096. Throughput: 0: 50740.6. Samples: 2647054220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:37:12,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 23:37:12,391][49750] Updated weights for policy 0, policy_version 298721 (0.0029) [2024-04-26 23:37:15,678][49750] Updated weights for policy 0, policy_version 298731 (0.0030) [2024-04-26 23:37:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4894457856. Throughput: 0: 50888.3. Samples: 2647362080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:37:17,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 23:37:18,733][49750] Updated weights for policy 0, policy_version 298741 (0.0030) [2024-04-26 23:37:22,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4894720000. Throughput: 0: 50877.1. Samples: 2647518380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:37:22,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 23:37:22,070][49750] Updated weights for policy 0, policy_version 298751 (0.0032) [2024-04-26 23:37:25,268][49750] Updated weights for policy 0, policy_version 298761 (0.0029) [2024-04-26 23:37:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4894965760. Throughput: 0: 50741.5. Samples: 2647816740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-26 23:37:27,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 23:37:28,573][49750] Updated weights for policy 0, policy_version 298771 (0.0030) [2024-04-26 23:37:31,811][49750] Updated weights for policy 0, policy_version 298781 (0.0039) [2024-04-26 23:37:32,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 4895227904. Throughput: 0: 50824.0. Samples: 2648126700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:37:32,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-26 23:37:35,005][49750] Updated weights for policy 0, policy_version 298791 (0.0029) [2024-04-26 23:37:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4895490048. Throughput: 0: 50863.1. Samples: 2648275740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:37:37,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 23:37:38,416][49750] Updated weights for policy 0, policy_version 298801 (0.0029) [2024-04-26 23:37:41,365][49750] Updated weights for policy 0, policy_version 298811 (0.0036) [2024-04-26 23:37:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 4895752192. Throughput: 0: 50843.1. Samples: 2648586300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:37:42,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 23:37:45,086][49750] Updated weights for policy 0, policy_version 298821 (0.0031) [2024-04-26 23:37:46,978][49728] Signal inference workers to stop experience collection... (39850 times) [2024-04-26 23:37:46,983][49728] Signal inference workers to resume experience collection... 
(39850 times) [2024-04-26 23:37:47,011][49750] InferenceWorker_p0-w0: stopping experience collection (39850 times) [2024-04-26 23:37:47,011][49750] InferenceWorker_p0-w0: resuming experience collection (39850 times) [2024-04-26 23:37:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4895997952. Throughput: 0: 50692.6. Samples: 2648889240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:37:47,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:37:47,638][49750] Updated weights for policy 0, policy_version 298831 (0.0035) [2024-04-26 23:37:51,421][49750] Updated weights for policy 0, policy_version 298841 (0.0033) [2024-04-26 23:37:52,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4896210944. Throughput: 0: 51039.1. Samples: 2649044060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:37:52,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 23:37:54,127][49750] Updated weights for policy 0, policy_version 298851 (0.0031) [2024-04-26 23:37:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4896489472. Throughput: 0: 50904.8. Samples: 2649344940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:37:57,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-26 23:37:58,011][49750] Updated weights for policy 0, policy_version 298861 (0.0033) [2024-04-26 23:38:00,649][49750] Updated weights for policy 0, policy_version 298871 (0.0028) [2024-04-26 23:38:02,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4896751616. Throughput: 0: 50745.1. Samples: 2649645620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:38:02,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 23:38:04,663][49750] Updated weights for policy 0, policy_version 298881 (0.0034) [2024-04-26 23:38:07,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4897013760. Throughput: 0: 50878.5. Samples: 2649807920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:38:07,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-26 23:38:07,082][49750] Updated weights for policy 0, policy_version 298891 (0.0032) [2024-04-26 23:38:11,089][49750] Updated weights for policy 0, policy_version 298901 (0.0034) [2024-04-26 23:38:12,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4897243136. Throughput: 0: 50766.2. Samples: 2650101220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:38:12,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 23:38:13,821][49750] Updated weights for policy 0, policy_version 298911 (0.0035) [2024-04-26 23:38:17,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4897505280. Throughput: 0: 50592.6. Samples: 2650403360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:38:17,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-26 23:38:17,664][49750] Updated weights for policy 0, policy_version 298921 (0.0031) [2024-04-26 23:38:20,157][49750] Updated weights for policy 0, policy_version 298931 (0.0031) [2024-04-26 23:38:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 4897767424. Throughput: 0: 50722.6. Samples: 2650558260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:38:22,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 23:38:24,050][49750] Updated weights for policy 0, policy_version 298941 (0.0031) [2024-04-26 23:38:26,507][49750] Updated weights for policy 0, policy_version 298951 (0.0031) [2024-04-26 23:38:27,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4898029568. Throughput: 0: 50616.8. Samples: 2650864060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:38:27,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:38:30,302][49750] Updated weights for policy 0, policy_version 298961 (0.0037) [2024-04-26 23:38:32,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4898275328. Throughput: 0: 50753.2. Samples: 2651173140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:38:32,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 23:38:32,925][49750] Updated weights for policy 0, policy_version 298971 (0.0031) [2024-04-26 23:38:36,804][49750] Updated weights for policy 0, policy_version 298981 (0.0028) [2024-04-26 23:38:37,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 4898504704. Throughput: 0: 50716.0. Samples: 2651326280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:38:37,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 23:38:39,422][49750] Updated weights for policy 0, policy_version 298991 (0.0031) [2024-04-26 23:38:42,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4898783232. Throughput: 0: 50873.8. Samples: 2651634260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-26 23:38:42,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-26 23:38:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000298998_4898783232.pth... 
[2024-04-26 23:38:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000298253_4886577152.pth [2024-04-26 23:38:43,261][49750] Updated weights for policy 0, policy_version 299001 (0.0029) [2024-04-26 23:38:45,708][49750] Updated weights for policy 0, policy_version 299011 (0.0032) [2024-04-26 23:38:47,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4899028992. Throughput: 0: 50782.0. Samples: 2651930800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:38:47,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:38:49,850][49750] Updated weights for policy 0, policy_version 299021 (0.0042) [2024-04-26 23:38:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 4899307520. Throughput: 0: 50818.2. Samples: 2652094740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:38:52,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 23:38:52,080][49750] Updated weights for policy 0, policy_version 299031 (0.0029) [2024-04-26 23:38:56,149][49750] Updated weights for policy 0, policy_version 299041 (0.0037) [2024-04-26 23:38:57,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4899536896. Throughput: 0: 50909.9. Samples: 2652392160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:38:57,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 23:38:57,991][49728] Signal inference workers to stop experience collection... (39900 times) [2024-04-26 23:38:57,992][49728] Signal inference workers to resume experience collection... 
(39900 times) [2024-04-26 23:38:58,024][49750] InferenceWorker_p0-w0: stopping experience collection (39900 times) [2024-04-26 23:38:58,024][49750] InferenceWorker_p0-w0: resuming experience collection (39900 times) [2024-04-26 23:38:58,451][49750] Updated weights for policy 0, policy_version 299051 (0.0034) [2024-04-26 23:39:02,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4899782656. Throughput: 0: 51064.1. Samples: 2652701240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:02,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 23:39:02,536][49750] Updated weights for policy 0, policy_version 299061 (0.0031) [2024-04-26 23:39:05,003][49750] Updated weights for policy 0, policy_version 299071 (0.0035) [2024-04-26 23:39:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4900061184. Throughput: 0: 50865.2. Samples: 2652847180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 23:39:08,970][49750] Updated weights for policy 0, policy_version 299081 (0.0030) [2024-04-26 23:39:11,674][49750] Updated weights for policy 0, policy_version 299091 (0.0038) [2024-04-26 23:39:12,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4900306944. Throughput: 0: 50824.5. Samples: 2653151160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 23:39:15,493][49750] Updated weights for policy 0, policy_version 299101 (0.0035) [2024-04-26 23:39:17,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4900552704. Throughput: 0: 50703.3. Samples: 2653454780. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:17,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 23:39:18,688][49750] Updated weights for policy 0, policy_version 299111 (0.0038) [2024-04-26 23:39:21,899][49750] Updated weights for policy 0, policy_version 299121 (0.0031) [2024-04-26 23:39:22,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4900798464. Throughput: 0: 50762.2. Samples: 2653610580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 23:39:25,014][49750] Updated weights for policy 0, policy_version 299131 (0.0029) [2024-04-26 23:39:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 4901060608. Throughput: 0: 50616.4. Samples: 2653912000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:27,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:39:28,207][49750] Updated weights for policy 0, policy_version 299141 (0.0029) [2024-04-26 23:39:31,272][49750] Updated weights for policy 0, policy_version 299151 (0.0028) [2024-04-26 23:39:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4901306368. Throughput: 0: 50759.9. Samples: 2654215000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:32,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:39:34,667][49750] Updated weights for policy 0, policy_version 299161 (0.0037) [2024-04-26 23:39:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 4901584896. Throughput: 0: 50537.3. Samples: 2654368920. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:37,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-26 23:39:37,744][49750] Updated weights for policy 0, policy_version 299171 (0.0037) [2024-04-26 23:39:41,196][49750] Updated weights for policy 0, policy_version 299181 (0.0032) [2024-04-26 23:39:42,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4901847040. Throughput: 0: 50798.5. Samples: 2654678100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 23:39:44,307][49750] Updated weights for policy 0, policy_version 299191 (0.0034) [2024-04-26 23:39:47,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4902076416. Throughput: 0: 50685.6. Samples: 2654982100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:47,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-26 23:39:47,669][49750] Updated weights for policy 0, policy_version 299201 (0.0032) [2024-04-26 23:39:50,646][49750] Updated weights for policy 0, policy_version 299211 (0.0028) [2024-04-26 23:39:52,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4902322176. Throughput: 0: 50596.6. Samples: 2655124040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:52,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 23:39:54,117][49750] Updated weights for policy 0, policy_version 299221 (0.0034) [2024-04-26 23:39:56,999][49750] Updated weights for policy 0, policy_version 299231 (0.0032) [2024-04-26 23:39:57,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 4902600704. Throughput: 0: 50720.4. Samples: 2655433580. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-26 23:39:57,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-26 23:40:00,560][49750] Updated weights for policy 0, policy_version 299241 (0.0031) [2024-04-26 23:40:02,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.3, 300 sec: 50873.7). Total num frames: 4902862848. Throughput: 0: 50627.8. Samples: 2655733040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:02,072][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 23:40:03,493][49750] Updated weights for policy 0, policy_version 299251 (0.0030) [2024-04-26 23:40:06,884][49728] Signal inference workers to stop experience collection... (39950 times) [2024-04-26 23:40:06,932][49750] InferenceWorker_p0-w0: stopping experience collection (39950 times) [2024-04-26 23:40:06,950][49728] Signal inference workers to resume experience collection... (39950 times) [2024-04-26 23:40:06,951][49750] InferenceWorker_p0-w0: resuming experience collection (39950 times) [2024-04-26 23:40:07,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4903075840. Throughput: 0: 50785.3. Samples: 2655895920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:07,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-26 23:40:07,085][49750] Updated weights for policy 0, policy_version 299261 (0.0029) [2024-04-26 23:40:09,897][49750] Updated weights for policy 0, policy_version 299271 (0.0030) [2024-04-26 23:40:12,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4903337984. Throughput: 0: 50870.2. Samples: 2656201160. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:12,072][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 23:40:13,519][49750] Updated weights for policy 0, policy_version 299281 (0.0030) [2024-04-26 23:40:16,232][49750] Updated weights for policy 0, policy_version 299291 (0.0036) [2024-04-26 23:40:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4903583744. Throughput: 0: 50791.3. Samples: 2656500600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:17,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 23:40:19,937][49750] Updated weights for policy 0, policy_version 299301 (0.0028) [2024-04-26 23:40:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4903862272. Throughput: 0: 50941.4. Samples: 2656661280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:22,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 23:40:22,602][49750] Updated weights for policy 0, policy_version 299311 (0.0029) [2024-04-26 23:40:26,292][49750] Updated weights for policy 0, policy_version 299321 (0.0037) [2024-04-26 23:40:27,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4904124416. Throughput: 0: 50758.6. Samples: 2656962240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:27,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-26 23:40:29,500][49750] Updated weights for policy 0, policy_version 299331 (0.0031) [2024-04-26 23:40:32,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4904353792. Throughput: 0: 50887.0. Samples: 2657272020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:32,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 23:40:32,814][49750] Updated weights for policy 0, policy_version 299341 (0.0031) [2024-04-26 23:40:36,037][49750] Updated weights for policy 0, policy_version 299351 (0.0035) [2024-04-26 23:40:37,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4904615936. Throughput: 0: 50943.2. Samples: 2657416480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:37,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 23:40:39,222][49750] Updated weights for policy 0, policy_version 299361 (0.0033) [2024-04-26 23:40:42,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4904878080. Throughput: 0: 50679.6. Samples: 2657714160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 23:40:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000299370_4904878080.pth... [2024-04-26 23:40:42,114][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000298625_4892672000.pth [2024-04-26 23:40:42,416][49750] Updated weights for policy 0, policy_version 299371 (0.0033) [2024-04-26 23:40:45,602][49750] Updated weights for policy 0, policy_version 299381 (0.0037) [2024-04-26 23:40:47,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4905156608. Throughput: 0: 50966.3. Samples: 2658026520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:47,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-26 23:40:48,808][49750] Updated weights for policy 0, policy_version 299391 (0.0032) [2024-04-26 23:40:51,981][49750] Updated weights for policy 0, policy_version 299401 (0.0030) [2024-04-26 23:40:52,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.6, 300 sec: 50873.7). 
Total num frames: 4905385984. Throughput: 0: 50876.0. Samples: 2658185340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:52,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 23:40:55,428][49750] Updated weights for policy 0, policy_version 299411 (0.0033) [2024-04-26 23:40:57,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 4905615360. Throughput: 0: 50742.3. Samples: 2658484560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:40:57,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 23:40:58,503][49750] Updated weights for policy 0, policy_version 299421 (0.0028) [2024-04-26 23:41:01,887][49750] Updated weights for policy 0, policy_version 299431 (0.0035) [2024-04-26 23:41:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4905877504. Throughput: 0: 50840.9. Samples: 2658788440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:41:02,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 23:41:04,946][49750] Updated weights for policy 0, policy_version 299441 (0.0028) [2024-04-26 23:41:05,230][49728] Signal inference workers to stop experience collection... (40000 times) [2024-04-26 23:41:05,232][49728] Signal inference workers to resume experience collection... (40000 times) [2024-04-26 23:41:05,275][49750] InferenceWorker_p0-w0: stopping experience collection (40000 times) [2024-04-26 23:41:05,275][49750] InferenceWorker_p0-w0: resuming experience collection (40000 times) [2024-04-26 23:41:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.6, 300 sec: 50762.7). Total num frames: 4906156032. Throughput: 0: 50810.8. Samples: 2658947760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:41:07,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 23:41:08,243][49750] Updated weights for policy 0, policy_version 299451 (0.0031) [2024-04-26 23:41:11,283][49750] Updated weights for policy 0, policy_version 299461 (0.0030) [2024-04-26 23:41:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4906401792. Throughput: 0: 50969.1. Samples: 2659255840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-26 23:41:12,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 23:41:14,974][49750] Updated weights for policy 0, policy_version 299471 (0.0033) [2024-04-26 23:41:17,063][49517] Fps is (10 sec: 47512.5, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 4906631168. Throughput: 0: 50701.8. Samples: 2659553600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:41:17,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 23:41:17,713][49750] Updated weights for policy 0, policy_version 299481 (0.0031) [2024-04-26 23:41:21,417][49750] Updated weights for policy 0, policy_version 299491 (0.0028) [2024-04-26 23:41:22,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4906893312. Throughput: 0: 50767.8. Samples: 2659701020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:41:22,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 23:41:24,219][49750] Updated weights for policy 0, policy_version 299501 (0.0029) [2024-04-26 23:41:27,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4907155456. Throughput: 0: 50925.3. Samples: 2660005800. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:41:27,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 23:41:27,722][49750] Updated weights for policy 0, policy_version 299511 (0.0029) [2024-04-26 23:41:30,597][49750] Updated weights for policy 0, policy_version 299521 (0.0024) [2024-04-26 23:41:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4907417600. Throughput: 0: 50736.5. Samples: 2660309660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:41:32,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 23:41:34,165][49750] Updated weights for policy 0, policy_version 299531 (0.0028) [2024-04-26 23:41:36,952][49750] Updated weights for policy 0, policy_version 299541 (0.0032) [2024-04-26 23:41:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 4907679744. Throughput: 0: 50816.5. Samples: 2660472080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:41:37,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 23:41:40,622][49750] Updated weights for policy 0, policy_version 299551 (0.0035) [2024-04-26 23:41:42,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4907909120. Throughput: 0: 51032.3. Samples: 2660781020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:41:42,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 23:41:43,433][49750] Updated weights for policy 0, policy_version 299561 (0.0027) [2024-04-26 23:41:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4908154880. Throughput: 0: 50898.3. Samples: 2661078860. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:41:47,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-26 23:41:47,123][49750] Updated weights for policy 0, policy_version 299571 (0.0029) [2024-04-26 23:41:49,850][49750] Updated weights for policy 0, policy_version 299581 (0.0028) [2024-04-26 23:41:52,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4908433408. Throughput: 0: 50769.2. Samples: 2661232380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:41:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 23:41:53,402][49750] Updated weights for policy 0, policy_version 299591 (0.0029) [2024-04-26 23:41:56,198][49728] Signal inference workers to stop experience collection... (40050 times) [2024-04-26 23:41:56,241][49750] InferenceWorker_p0-w0: stopping experience collection (40050 times) [2024-04-26 23:41:56,257][49728] Signal inference workers to resume experience collection... (40050 times) [2024-04-26 23:41:56,265][49750] InferenceWorker_p0-w0: resuming experience collection (40050 times) [2024-04-26 23:41:56,269][49750] Updated weights for policy 0, policy_version 299601 (0.0027) [2024-04-26 23:41:57,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50873.8). Total num frames: 4908695552. Throughput: 0: 50764.5. Samples: 2661540240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:41:57,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 23:41:59,791][49750] Updated weights for policy 0, policy_version 299611 (0.0037) [2024-04-26 23:42:02,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4908924928. Throughput: 0: 50846.1. Samples: 2661841660. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:42:02,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 23:42:02,597][49750] Updated weights for policy 0, policy_version 299621 (0.0036) [2024-04-26 23:42:06,318][49750] Updated weights for policy 0, policy_version 299631 (0.0029) [2024-04-26 23:42:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4909187072. Throughput: 0: 50807.0. Samples: 2661987340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:42:07,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 23:42:09,053][49750] Updated weights for policy 0, policy_version 299641 (0.0031) [2024-04-26 23:42:12,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4909432832. Throughput: 0: 50803.9. Samples: 2662291980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:42:12,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 23:42:12,765][49750] Updated weights for policy 0, policy_version 299651 (0.0040) [2024-04-26 23:42:15,563][49750] Updated weights for policy 0, policy_version 299661 (0.0029) [2024-04-26 23:42:17,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.6, 300 sec: 50818.1). Total num frames: 4909711360. Throughput: 0: 50752.4. Samples: 2662593520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:42:17,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 23:42:19,198][49750] Updated weights for policy 0, policy_version 299671 (0.0029) [2024-04-26 23:42:22,004][49750] Updated weights for policy 0, policy_version 299681 (0.0030) [2024-04-26 23:42:22,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 4909973504. Throughput: 0: 50583.1. Samples: 2662748320. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-26 23:42:22,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 23:42:25,671][49750] Updated weights for policy 0, policy_version 299691 (0.0032) [2024-04-26 23:42:27,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4910186496. Throughput: 0: 50456.3. Samples: 2663051560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:42:27,063][49517] Avg episode reward: [(0, '0.698')] [2024-04-26 23:42:28,496][49750] Updated weights for policy 0, policy_version 299701 (0.0040) [2024-04-26 23:42:32,038][49750] Updated weights for policy 0, policy_version 299711 (0.0038) [2024-04-26 23:42:32,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4910465024. Throughput: 0: 50634.9. Samples: 2663357440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:42:32,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 23:42:35,093][49750] Updated weights for policy 0, policy_version 299721 (0.0037) [2024-04-26 23:42:37,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4910710784. Throughput: 0: 50505.0. Samples: 2663505100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:42:37,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 23:42:38,365][49750] Updated weights for policy 0, policy_version 299731 (0.0029) [2024-04-26 23:42:41,366][49750] Updated weights for policy 0, policy_version 299741 (0.0037) [2024-04-26 23:42:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4910972928. Throughput: 0: 50563.7. Samples: 2663815620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:42:42,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 23:42:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000299742_4910972928.pth... 
[2024-04-26 23:42:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000298998_4898783232.pth [2024-04-26 23:42:45,102][49750] Updated weights for policy 0, policy_version 299751 (0.0033) [2024-04-26 23:42:47,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 4911202304. Throughput: 0: 50581.9. Samples: 2664117860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:42:47,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-26 23:42:47,949][49750] Updated weights for policy 0, policy_version 299761 (0.0031) [2024-04-26 23:42:51,545][49750] Updated weights for policy 0, policy_version 299771 (0.0026) [2024-04-26 23:42:52,063][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4911464448. Throughput: 0: 50640.4. Samples: 2664266160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:42:52,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 23:42:54,479][49750] Updated weights for policy 0, policy_version 299781 (0.0030) [2024-04-26 23:42:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4911710208. Throughput: 0: 50583.7. Samples: 2664568240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:42:57,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 23:42:57,963][49750] Updated weights for policy 0, policy_version 299791 (0.0030) [2024-04-26 23:43:00,870][49750] Updated weights for policy 0, policy_version 299801 (0.0030) [2024-04-26 23:43:02,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4911988736. Throughput: 0: 50632.6. Samples: 2664871980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:43:02,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 23:43:04,467][49750] Updated weights for policy 0, policy_version 299811 (0.0029) [2024-04-26 23:43:07,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4912250880. Throughput: 0: 50646.2. Samples: 2665027400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:43:07,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 23:43:07,239][49750] Updated weights for policy 0, policy_version 299821 (0.0033) [2024-04-26 23:43:10,968][49750] Updated weights for policy 0, policy_version 299831 (0.0031) [2024-04-26 23:43:12,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 4912463872. Throughput: 0: 50683.8. Samples: 2665332320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:43:12,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-26 23:43:13,943][49750] Updated weights for policy 0, policy_version 299841 (0.0032) [2024-04-26 23:43:13,958][49728] Signal inference workers to stop experience collection... (40100 times) [2024-04-26 23:43:13,958][49728] Signal inference workers to resume experience collection... (40100 times) [2024-04-26 23:43:13,982][49750] InferenceWorker_p0-w0: stopping experience collection (40100 times) [2024-04-26 23:43:13,982][49750] InferenceWorker_p0-w0: resuming experience collection (40100 times) [2024-04-26 23:43:17,063][49517] Fps is (10 sec: 47512.5, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4912726016. Throughput: 0: 50746.1. Samples: 2665641020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:43:17,063][49517] Avg episode reward: [(0, '0.686')] [2024-04-26 23:43:17,327][49750] Updated weights for policy 0, policy_version 299851 (0.0036) [2024-04-26 23:43:20,382][49750] Updated weights for policy 0, policy_version 299861 (0.0036) [2024-04-26 23:43:22,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4913004544. Throughput: 0: 50767.0. Samples: 2665789620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:43:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 23:43:23,877][49750] Updated weights for policy 0, policy_version 299871 (0.0039) [2024-04-26 23:43:26,693][49750] Updated weights for policy 0, policy_version 299881 (0.0029) [2024-04-26 23:43:27,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4913250304. Throughput: 0: 50708.5. Samples: 2666097500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:43:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 23:43:30,256][49750] Updated weights for policy 0, policy_version 299891 (0.0028) [2024-04-26 23:43:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 4913479680. Throughput: 0: 50761.0. Samples: 2666402100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:43:32,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 23:43:33,122][49750] Updated weights for policy 0, policy_version 299901 (0.0025) [2024-04-26 23:43:36,685][49750] Updated weights for policy 0, policy_version 299911 (0.0035) [2024-04-26 23:43:37,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4913741824. Throughput: 0: 50729.0. Samples: 2666548960. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:43:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 23:43:39,676][49750] Updated weights for policy 0, policy_version 299921 (0.0033) [2024-04-26 23:43:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4914003968. Throughput: 0: 50860.0. Samples: 2666856940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:43:42,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 23:43:43,157][49750] Updated weights for policy 0, policy_version 299931 (0.0033) [2024-04-26 23:43:46,223][49750] Updated weights for policy 0, policy_version 299941 (0.0030) [2024-04-26 23:43:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.6, 300 sec: 50651.6). Total num frames: 4914249728. Throughput: 0: 50835.2. Samples: 2667159560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:43:47,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 23:43:49,652][49750] Updated weights for policy 0, policy_version 299951 (0.0037) [2024-04-26 23:43:52,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 4914528256. Throughput: 0: 50811.9. Samples: 2667313940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:43:52,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:43:52,642][49750] Updated weights for policy 0, policy_version 299961 (0.0029) [2024-04-26 23:43:56,129][49750] Updated weights for policy 0, policy_version 299971 (0.0028) [2024-04-26 23:43:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4914774016. Throughput: 0: 50747.6. Samples: 2667615960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:43:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 23:43:59,050][49750] Updated weights for policy 0, policy_version 299981 (0.0035) [2024-04-26 23:44:02,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4915019776. Throughput: 0: 50745.8. Samples: 2667924580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:02,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-26 23:44:02,487][49750] Updated weights for policy 0, policy_version 299991 (0.0031) [2024-04-26 23:44:05,659][49750] Updated weights for policy 0, policy_version 300001 (0.0036) [2024-04-26 23:44:07,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4915281920. Throughput: 0: 50754.2. Samples: 2668073560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-26 23:44:09,116][49750] Updated weights for policy 0, policy_version 300011 (0.0032) [2024-04-26 23:44:12,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4915527680. Throughput: 0: 50771.1. Samples: 2668382200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:12,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-26 23:44:12,187][49750] Updated weights for policy 0, policy_version 300021 (0.0030) [2024-04-26 23:44:15,676][49750] Updated weights for policy 0, policy_version 300031 (0.0031) [2024-04-26 23:44:17,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 4915773440. Throughput: 0: 50722.8. Samples: 2668684620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:17,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-26 23:44:18,661][49750] Updated weights for policy 0, policy_version 300041 (0.0029) [2024-04-26 23:44:21,959][49750] Updated weights for policy 0, policy_version 300051 (0.0037) [2024-04-26 23:44:22,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4916035584. Throughput: 0: 50675.3. Samples: 2668829360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:22,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-26 23:44:25,209][49750] Updated weights for policy 0, policy_version 300061 (0.0031) [2024-04-26 23:44:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4916297728. Throughput: 0: 50657.5. Samples: 2669136520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:27,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 23:44:28,304][49750] Updated weights for policy 0, policy_version 300071 (0.0035) [2024-04-26 23:44:30,427][49728] Signal inference workers to stop experience collection... (40150 times) [2024-04-26 23:44:30,468][49750] InferenceWorker_p0-w0: stopping experience collection (40150 times) [2024-04-26 23:44:30,530][49728] Signal inference workers to resume experience collection... (40150 times) [2024-04-26 23:44:30,530][49750] InferenceWorker_p0-w0: resuming experience collection (40150 times) [2024-04-26 23:44:31,596][49750] Updated weights for policy 0, policy_version 300081 (0.0030) [2024-04-26 23:44:32,062][49517] Fps is (10 sec: 50791.7, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4916543488. Throughput: 0: 50854.7. Samples: 2669448020. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-26 23:44:34,818][49750] Updated weights for policy 0, policy_version 300091 (0.0036) [2024-04-26 23:44:37,063][49517] Fps is (10 sec: 50789.3, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4916805632. Throughput: 0: 50804.8. Samples: 2669600160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:37,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 23:44:38,112][49750] Updated weights for policy 0, policy_version 300101 (0.0038) [2024-04-26 23:44:41,325][49750] Updated weights for policy 0, policy_version 300111 (0.0030) [2024-04-26 23:44:42,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4917035008. Throughput: 0: 50665.8. Samples: 2669895920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:42,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-26 23:44:42,124][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000300113_4917051392.pth... [2024-04-26 23:44:42,183][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000299370_4904878080.pth [2024-04-26 23:44:44,523][49750] Updated weights for policy 0, policy_version 300121 (0.0030) [2024-04-26 23:44:47,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4917313536. Throughput: 0: 50715.8. Samples: 2670206780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:47,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-26 23:44:47,651][49750] Updated weights for policy 0, policy_version 300131 (0.0032) [2024-04-26 23:44:51,049][49750] Updated weights for policy 0, policy_version 300141 (0.0029) [2024-04-26 23:44:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4917559296. Throughput: 0: 50804.5. Samples: 2670359760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-26 23:44:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:44:54,050][49750] Updated weights for policy 0, policy_version 300151 (0.0031) [2024-04-26 23:44:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4917805056. Throughput: 0: 50771.4. Samples: 2670666900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:44:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 23:44:57,557][49750] Updated weights for policy 0, policy_version 300161 (0.0029) [2024-04-26 23:45:00,600][49750] Updated weights for policy 0, policy_version 300171 (0.0032) [2024-04-26 23:45:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4918067200. Throughput: 0: 50836.3. Samples: 2670972260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:02,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 23:45:03,823][49750] Updated weights for policy 0, policy_version 300181 (0.0036) [2024-04-26 23:45:07,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4918312960. Throughput: 0: 50886.7. Samples: 2671119260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:07,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 23:45:07,134][49750] Updated weights for policy 0, policy_version 300191 (0.0032) [2024-04-26 23:45:10,100][49750] Updated weights for policy 0, policy_version 300201 (0.0032) [2024-04-26 23:45:12,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 4918591488. Throughput: 0: 50884.4. Samples: 2671426320. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 23:45:13,533][49750] Updated weights for policy 0, policy_version 300211 (0.0031) [2024-04-26 23:45:16,641][49750] Updated weights for policy 0, policy_version 300221 (0.0032) [2024-04-26 23:45:17,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4918837248. Throughput: 0: 50751.0. Samples: 2671731820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 23:45:19,871][49750] Updated weights for policy 0, policy_version 300231 (0.0032) [2024-04-26 23:45:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4919083008. Throughput: 0: 50746.8. Samples: 2671883760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:22,067][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 23:45:23,223][49750] Updated weights for policy 0, policy_version 300241 (0.0040) [2024-04-26 23:45:26,390][49750] Updated weights for policy 0, policy_version 300251 (0.0031) [2024-04-26 23:45:27,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4919312384. Throughput: 0: 50880.7. Samples: 2672185560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:27,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 23:45:29,705][49750] Updated weights for policy 0, policy_version 300261 (0.0037) [2024-04-26 23:45:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4919590912. Throughput: 0: 50607.9. Samples: 2672484140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:32,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 23:45:32,889][49750] Updated weights for policy 0, policy_version 300271 (0.0032) [2024-04-26 23:45:36,165][49750] Updated weights for policy 0, policy_version 300281 (0.0037) [2024-04-26 23:45:37,062][49517] Fps is (10 sec: 54067.8, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4919853056. Throughput: 0: 50837.8. Samples: 2672647460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:37,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-26 23:45:39,225][49750] Updated weights for policy 0, policy_version 300291 (0.0031) [2024-04-26 23:45:42,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 4920098816. Throughput: 0: 50722.5. Samples: 2672949420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:42,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 23:45:42,570][49750] Updated weights for policy 0, policy_version 300301 (0.0030) [2024-04-26 23:45:45,525][49750] Updated weights for policy 0, policy_version 300311 (0.0033) [2024-04-26 23:45:47,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4920344576. Throughput: 0: 50732.5. Samples: 2673255220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:47,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 23:45:47,574][49728] Signal inference workers to stop experience collection... (40200 times) [2024-04-26 23:45:47,574][49728] Signal inference workers to resume experience collection... 
(40200 times) [2024-04-26 23:45:47,588][49750] InferenceWorker_p0-w0: stopping experience collection (40200 times) [2024-04-26 23:45:47,589][49750] InferenceWorker_p0-w0: resuming experience collection (40200 times) [2024-04-26 23:45:49,129][49750] Updated weights for policy 0, policy_version 300321 (0.0032) [2024-04-26 23:45:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4920606720. Throughput: 0: 50662.8. Samples: 2673399080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:52,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-26 23:45:52,149][49750] Updated weights for policy 0, policy_version 300331 (0.0029) [2024-04-26 23:45:55,594][49750] Updated weights for policy 0, policy_version 300341 (0.0029) [2024-04-26 23:45:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4920868864. Throughput: 0: 50633.4. Samples: 2673704820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:45:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-26 23:45:58,695][49750] Updated weights for policy 0, policy_version 300351 (0.0038) [2024-04-26 23:46:01,951][49750] Updated weights for policy 0, policy_version 300361 (0.0031) [2024-04-26 23:46:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4921114624. Throughput: 0: 50591.4. Samples: 2674008440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-26 23:46:02,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-26 23:46:05,795][49750] Updated weights for policy 0, policy_version 300371 (0.0035) [2024-04-26 23:46:07,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 4921344000. Throughput: 0: 50511.4. Samples: 2674156780. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:07,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 23:46:08,280][49750] Updated weights for policy 0, policy_version 300381 (0.0034) [2024-04-26 23:46:12,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 4921589760. Throughput: 0: 50572.9. Samples: 2674461340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:12,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 23:46:12,155][49750] Updated weights for policy 0, policy_version 300391 (0.0034) [2024-04-26 23:46:14,796][49750] Updated weights for policy 0, policy_version 300401 (0.0034) [2024-04-26 23:46:17,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4921868288. Throughput: 0: 50649.2. Samples: 2674763360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:46:18,707][49750] Updated weights for policy 0, policy_version 300411 (0.0034) [2024-04-26 23:46:21,257][49750] Updated weights for policy 0, policy_version 300421 (0.0034) [2024-04-26 23:46:22,063][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4922130432. Throughput: 0: 50647.9. Samples: 2674926620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:22,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 23:46:25,138][49750] Updated weights for policy 0, policy_version 300431 (0.0038) [2024-04-26 23:46:27,062][49517] Fps is (10 sec: 50791.6, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 4922376192. Throughput: 0: 50787.7. Samples: 2675234860. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:27,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:46:27,604][49750] Updated weights for policy 0, policy_version 300441 (0.0035) [2024-04-26 23:46:31,595][49750] Updated weights for policy 0, policy_version 300451 (0.0029) [2024-04-26 23:46:32,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 4922605568. Throughput: 0: 50685.8. Samples: 2675536080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:32,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-26 23:46:34,161][49750] Updated weights for policy 0, policy_version 300461 (0.0033) [2024-04-26 23:46:37,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4922884096. Throughput: 0: 50574.3. Samples: 2675674920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:37,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-26 23:46:38,123][49750] Updated weights for policy 0, policy_version 300471 (0.0036) [2024-04-26 23:46:40,743][49750] Updated weights for policy 0, policy_version 300481 (0.0029) [2024-04-26 23:46:42,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4923162624. Throughput: 0: 50633.1. Samples: 2675983320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:42,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 23:46:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000300486_4923162624.pth... [2024-04-26 23:46:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000299742_4910972928.pth [2024-04-26 23:46:44,473][49750] Updated weights for policy 0, policy_version 300491 (0.0033) [2024-04-26 23:46:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4923392000. Throughput: 0: 50694.4. Samples: 2676289680. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:47,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-26 23:46:47,180][49750] Updated weights for policy 0, policy_version 300501 (0.0029) [2024-04-26 23:46:50,951][49750] Updated weights for policy 0, policy_version 300511 (0.0031) [2024-04-26 23:46:52,062][49517] Fps is (10 sec: 45876.0, 60 sec: 50244.3, 300 sec: 50596.0). Total num frames: 4923621376. Throughput: 0: 50609.1. Samples: 2676434180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:52,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 23:46:53,550][49750] Updated weights for policy 0, policy_version 300521 (0.0038) [2024-04-26 23:46:57,062][49517] Fps is (10 sec: 47514.0, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 4923867136. Throughput: 0: 50558.4. Samples: 2676736460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:46:57,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 23:46:57,421][49750] Updated weights for policy 0, policy_version 300531 (0.0032) [2024-04-26 23:47:00,103][49750] Updated weights for policy 0, policy_version 300541 (0.0033) [2024-04-26 23:47:02,063][49517] Fps is (10 sec: 54066.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4924162048. Throughput: 0: 50685.8. Samples: 2677044220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:47:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-26 23:47:03,724][49750] Updated weights for policy 0, policy_version 300551 (0.0031) [2024-04-26 23:47:06,489][49750] Updated weights for policy 0, policy_version 300561 (0.0032) [2024-04-26 23:47:07,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4924407808. Throughput: 0: 50681.4. Samples: 2677207280. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:47:07,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 23:47:10,081][49750] Updated weights for policy 0, policy_version 300571 (0.0031) [2024-04-26 23:47:12,063][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 4924653568. Throughput: 0: 50507.3. Samples: 2677507700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:47:12,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-26 23:47:12,578][49728] Signal inference workers to stop experience collection... (40250 times) [2024-04-26 23:47:12,579][49728] Signal inference workers to resume experience collection... (40250 times) [2024-04-26 23:47:12,607][49750] InferenceWorker_p0-w0: stopping experience collection (40250 times) [2024-04-26 23:47:12,611][49750] InferenceWorker_p0-w0: resuming experience collection (40250 times) [2024-04-26 23:47:12,891][49750] Updated weights for policy 0, policy_version 300581 (0.0028) [2024-04-26 23:47:16,925][49750] Updated weights for policy 0, policy_version 300591 (0.0031) [2024-04-26 23:47:17,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 4924899328. Throughput: 0: 50624.0. Samples: 2677814160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-26 23:47:17,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:47:19,356][49750] Updated weights for policy 0, policy_version 300601 (0.0030) [2024-04-26 23:47:22,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4925161472. Throughput: 0: 50630.9. Samples: 2677953320. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:47:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-26 23:47:23,469][49750] Updated weights for policy 0, policy_version 300611 (0.0037) [2024-04-26 23:47:25,802][49750] Updated weights for policy 0, policy_version 300621 (0.0032) [2024-04-26 23:47:27,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4925440000. Throughput: 0: 50613.8. Samples: 2678260940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:47:27,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-26 23:47:29,856][49750] Updated weights for policy 0, policy_version 300631 (0.0034) [2024-04-26 23:47:32,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4925685760. Throughput: 0: 50612.3. Samples: 2678567240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:47:32,063][49517] Avg episode reward: [(0, '0.690')] [2024-04-26 23:47:32,398][49750] Updated weights for policy 0, policy_version 300641 (0.0030) [2024-04-26 23:47:36,305][49750] Updated weights for policy 0, policy_version 300651 (0.0031) [2024-04-26 23:47:37,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50244.3, 300 sec: 50596.1). Total num frames: 4925898752. Throughput: 0: 50680.5. Samples: 2678714800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:47:37,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:47:38,828][49750] Updated weights for policy 0, policy_version 300661 (0.0033) [2024-04-26 23:47:42,063][49517] Fps is (10 sec: 47513.7, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4926160896. Throughput: 0: 50741.1. Samples: 2679019820. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:47:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 23:47:42,675][49750] Updated weights for policy 0, policy_version 300671 (0.0032) [2024-04-26 23:47:45,364][49750] Updated weights for policy 0, policy_version 300681 (0.0029) [2024-04-26 23:47:47,062][49517] Fps is (10 sec: 55705.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4926455808. Throughput: 0: 50517.5. Samples: 2679317500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:47:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:47:49,178][49750] Updated weights for policy 0, policy_version 300691 (0.0029) [2024-04-26 23:47:51,745][49750] Updated weights for policy 0, policy_version 300701 (0.0029) [2024-04-26 23:47:52,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4926685184. Throughput: 0: 50761.4. Samples: 2679491540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:47:52,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 23:47:55,455][49750] Updated weights for policy 0, policy_version 300711 (0.0031) [2024-04-26 23:47:57,062][49517] Fps is (10 sec: 47513.6, 60 sec: 51063.4, 300 sec: 50651.5). Total num frames: 4926930944. Throughput: 0: 50731.7. Samples: 2679790620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:47:57,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-26 23:47:58,084][49750] Updated weights for policy 0, policy_version 300721 (0.0022) [2024-04-26 23:48:02,018][49750] Updated weights for policy 0, policy_version 300731 (0.0030) [2024-04-26 23:48:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 4927176704. Throughput: 0: 50806.3. Samples: 2680100440. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:48:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 23:48:04,473][49750] Updated weights for policy 0, policy_version 300741 (0.0024) [2024-04-26 23:48:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4927438848. Throughput: 0: 50831.3. Samples: 2680240720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:48:07,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:48:08,351][49750] Updated weights for policy 0, policy_version 300751 (0.0034) [2024-04-26 23:48:10,773][49750] Updated weights for policy 0, policy_version 300761 (0.0032) [2024-04-26 23:48:12,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4927717376. Throughput: 0: 50830.3. Samples: 2680548300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:48:12,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-26 23:48:14,723][49750] Updated weights for policy 0, policy_version 300771 (0.0036) [2024-04-26 23:48:15,832][49728] Signal inference workers to stop experience collection... (40300 times) [2024-04-26 23:48:15,832][49728] Signal inference workers to resume experience collection... (40300 times) [2024-04-26 23:48:15,847][49750] InferenceWorker_p0-w0: stopping experience collection (40300 times) [2024-04-26 23:48:15,848][49750] InferenceWorker_p0-w0: resuming experience collection (40300 times) [2024-04-26 23:48:17,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4927979520. Throughput: 0: 50756.1. Samples: 2680851260. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:48:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:48:17,242][49750] Updated weights for policy 0, policy_version 300781 (0.0035) [2024-04-26 23:48:21,299][49750] Updated weights for policy 0, policy_version 300791 (0.0030) [2024-04-26 23:48:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 4928208896. Throughput: 0: 51002.7. Samples: 2681009920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:48:22,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 23:48:23,678][49750] Updated weights for policy 0, policy_version 300801 (0.0032) [2024-04-26 23:48:27,063][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 4928438272. Throughput: 0: 50874.2. Samples: 2681309160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:48:27,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-26 23:48:27,656][49750] Updated weights for policy 0, policy_version 300811 (0.0034) [2024-04-26 23:48:30,055][49750] Updated weights for policy 0, policy_version 300821 (0.0025) [2024-04-26 23:48:32,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 4928733184. Throughput: 0: 51096.1. Samples: 2681616820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-26 23:48:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 23:48:34,059][49750] Updated weights for policy 0, policy_version 300831 (0.0032) [2024-04-26 23:48:36,500][49750] Updated weights for policy 0, policy_version 300841 (0.0027) [2024-04-26 23:48:37,062][49517] Fps is (10 sec: 55706.0, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4928995328. Throughput: 0: 50803.1. Samples: 2681777680. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:48:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-26 23:48:40,413][49750] Updated weights for policy 0, policy_version 300851 (0.0030) [2024-04-26 23:48:42,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4929241088. Throughput: 0: 50883.6. Samples: 2682080380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:48:42,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 23:48:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000300857_4929241088.pth... [2024-04-26 23:48:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000300113_4917051392.pth [2024-04-26 23:48:43,059][49750] Updated weights for policy 0, policy_version 300861 (0.0037) [2024-04-26 23:48:46,828][49750] Updated weights for policy 0, policy_version 300871 (0.0026) [2024-04-26 23:48:47,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4929470464. Throughput: 0: 50780.0. Samples: 2682385540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:48:47,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 23:48:49,850][49750] Updated weights for policy 0, policy_version 300881 (0.0028) [2024-04-26 23:48:52,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4929732608. Throughput: 0: 50827.5. Samples: 2682527960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:48:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-26 23:48:53,254][49750] Updated weights for policy 0, policy_version 300891 (0.0034) [2024-04-26 23:48:56,207][49750] Updated weights for policy 0, policy_version 300901 (0.0030) [2024-04-26 23:48:57,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4929978368. Throughput: 0: 50743.4. Samples: 2682831760. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:48:57,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-26 23:48:59,651][49750] Updated weights for policy 0, policy_version 300911 (0.0032) [2024-04-26 23:49:02,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4930273280. Throughput: 0: 50842.8. Samples: 2683139180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:02,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 23:49:02,647][49750] Updated weights for policy 0, policy_version 300921 (0.0035) [2024-04-26 23:49:06,058][49750] Updated weights for policy 0, policy_version 300931 (0.0033) [2024-04-26 23:49:07,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4930469888. Throughput: 0: 50826.7. Samples: 2683297120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:07,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 23:49:08,953][49750] Updated weights for policy 0, policy_version 300941 (0.0036) [2024-04-26 23:49:12,063][49517] Fps is (10 sec: 47512.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4930748416. Throughput: 0: 50962.6. Samples: 2683602480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:12,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 23:49:12,541][49750] Updated weights for policy 0, policy_version 300951 (0.0031) [2024-04-26 23:49:15,386][49750] Updated weights for policy 0, policy_version 300961 (0.0025) [2024-04-26 23:49:17,063][49517] Fps is (10 sec: 52427.5, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4930994176. Throughput: 0: 50914.5. Samples: 2683907980. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:17,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-26 23:49:19,084][49750] Updated weights for policy 0, policy_version 300971 (0.0035) [2024-04-26 23:49:21,874][49750] Updated weights for policy 0, policy_version 300981 (0.0032) [2024-04-26 23:49:22,063][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4931272704. Throughput: 0: 50635.5. Samples: 2684056280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:22,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 23:49:23,122][49728] Signal inference workers to stop experience collection... (40350 times) [2024-04-26 23:49:23,125][49728] Signal inference workers to resume experience collection... (40350 times) [2024-04-26 23:49:23,153][49750] InferenceWorker_p0-w0: stopping experience collection (40350 times) [2024-04-26 23:49:23,153][49750] InferenceWorker_p0-w0: resuming experience collection (40350 times) [2024-04-26 23:49:25,489][49750] Updated weights for policy 0, policy_version 300991 (0.0034) [2024-04-26 23:49:27,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 4931534848. Throughput: 0: 50726.3. Samples: 2684363060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-26 23:49:28,196][49750] Updated weights for policy 0, policy_version 301001 (0.0030) [2024-04-26 23:49:31,831][49750] Updated weights for policy 0, policy_version 301011 (0.0033) [2024-04-26 23:49:32,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4931764224. Throughput: 0: 50722.7. Samples: 2684668060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:32,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 23:49:34,570][49750] Updated weights for policy 0, policy_version 301021 (0.0036) [2024-04-26 23:49:37,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4931993600. Throughput: 0: 50716.1. Samples: 2684810180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:37,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 23:49:38,300][49750] Updated weights for policy 0, policy_version 301031 (0.0034) [2024-04-26 23:49:40,969][49750] Updated weights for policy 0, policy_version 301041 (0.0029) [2024-04-26 23:49:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4932272128. Throughput: 0: 50821.8. Samples: 2685118740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:42,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-26 23:49:44,821][49750] Updated weights for policy 0, policy_version 301051 (0.0042) [2024-04-26 23:49:47,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4932534272. Throughput: 0: 50760.0. Samples: 2685423380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-26 23:49:47,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:49:47,625][49750] Updated weights for policy 0, policy_version 301061 (0.0033) [2024-04-26 23:49:51,216][49750] Updated weights for policy 0, policy_version 301071 (0.0026) [2024-04-26 23:49:52,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4932763648. Throughput: 0: 50809.2. Samples: 2685583540. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:49:52,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 23:49:54,479][49750] Updated weights for policy 0, policy_version 301081 (0.0028) [2024-04-26 23:49:57,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4933042176. Throughput: 0: 50834.8. Samples: 2685890040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:49:57,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 23:49:57,646][49750] Updated weights for policy 0, policy_version 301091 (0.0030) [2024-04-26 23:50:01,174][49750] Updated weights for policy 0, policy_version 301101 (0.0034) [2024-04-26 23:50:02,063][49517] Fps is (10 sec: 49151.5, 60 sec: 49698.0, 300 sec: 50651.6). Total num frames: 4933255168. Throughput: 0: 50793.8. Samples: 2686193700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-26 23:50:03,970][49750] Updated weights for policy 0, policy_version 301111 (0.0032) [2024-04-26 23:50:07,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 4933550080. Throughput: 0: 50654.3. Samples: 2686335720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:07,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 23:50:07,738][49750] Updated weights for policy 0, policy_version 301121 (0.0031) [2024-04-26 23:50:10,359][49750] Updated weights for policy 0, policy_version 301131 (0.0033) [2024-04-26 23:50:11,600][49728] Signal inference workers to stop experience collection... (40400 times) [2024-04-26 23:50:11,600][49728] Signal inference workers to resume experience collection... 
(40400 times) [2024-04-26 23:50:11,622][49750] InferenceWorker_p0-w0: stopping experience collection (40400 times) [2024-04-26 23:50:11,623][49750] InferenceWorker_p0-w0: resuming experience collection (40400 times) [2024-04-26 23:50:12,062][49517] Fps is (10 sec: 55706.2, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4933812224. Throughput: 0: 50572.9. Samples: 2686638840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:12,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-26 23:50:14,069][49750] Updated weights for policy 0, policy_version 301141 (0.0029) [2024-04-26 23:50:16,888][49750] Updated weights for policy 0, policy_version 301151 (0.0027) [2024-04-26 23:50:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 4934074368. Throughput: 0: 50610.5. Samples: 2686945540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:17,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-26 23:50:20,353][49750] Updated weights for policy 0, policy_version 301161 (0.0029) [2024-04-26 23:50:22,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4934287360. Throughput: 0: 50762.4. Samples: 2687094500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:22,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 23:50:23,346][49750] Updated weights for policy 0, policy_version 301171 (0.0032) [2024-04-26 23:50:26,823][49750] Updated weights for policy 0, policy_version 301181 (0.0033) [2024-04-26 23:50:27,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4934549504. Throughput: 0: 50738.8. Samples: 2687401980. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:27,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-26 23:50:29,824][49750] Updated weights for policy 0, policy_version 301191 (0.0030) [2024-04-26 23:50:32,062][49517] Fps is (10 sec: 54068.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4934828032. Throughput: 0: 50668.5. Samples: 2687703460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 23:50:33,262][49750] Updated weights for policy 0, policy_version 301201 (0.0037) [2024-04-26 23:50:36,220][49750] Updated weights for policy 0, policy_version 301211 (0.0026) [2024-04-26 23:50:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4935057408. Throughput: 0: 50814.6. Samples: 2687870200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 23:50:39,771][49750] Updated weights for policy 0, policy_version 301221 (0.0032) [2024-04-26 23:50:42,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4935319552. Throughput: 0: 50711.5. Samples: 2688172060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:42,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-26 23:50:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000301228_4935319552.pth... [2024-04-26 23:50:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000300486_4923162624.pth [2024-04-26 23:50:42,624][49750] Updated weights for policy 0, policy_version 301231 (0.0032) [2024-04-26 23:50:46,274][49750] Updated weights for policy 0, policy_version 301241 (0.0029) [2024-04-26 23:50:47,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4935548928. Throughput: 0: 50529.4. Samples: 2688467520. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 23:50:49,109][49750] Updated weights for policy 0, policy_version 301251 (0.0027) [2024-04-26 23:50:52,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4935827456. Throughput: 0: 50635.8. Samples: 2688614340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:52,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-26 23:50:52,968][49750] Updated weights for policy 0, policy_version 301261 (0.0027) [2024-04-26 23:50:55,669][49750] Updated weights for policy 0, policy_version 301271 (0.0030) [2024-04-26 23:50:57,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4936073216. Throughput: 0: 50611.9. Samples: 2688916380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 27.0) [2024-04-26 23:50:57,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-26 23:50:59,770][49750] Updated weights for policy 0, policy_version 301281 (0.0033) [2024-04-26 23:51:02,011][49750] Updated weights for policy 0, policy_version 301291 (0.0030) [2024-04-26 23:51:02,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 4936351744. Throughput: 0: 50633.9. Samples: 2689224060. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:02,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 23:51:06,273][49750] Updated weights for policy 0, policy_version 301301 (0.0034) [2024-04-26 23:51:07,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4936564736. Throughput: 0: 50721.4. Samples: 2689376960. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:07,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-26 23:51:07,969][49728] Signal inference workers to stop experience collection... 
(40450 times) [2024-04-26 23:51:07,969][49728] Signal inference workers to resume experience collection... (40450 times) [2024-04-26 23:51:08,000][49750] InferenceWorker_p0-w0: stopping experience collection (40450 times) [2024-04-26 23:51:08,000][49750] InferenceWorker_p0-w0: resuming experience collection (40450 times) [2024-04-26 23:51:08,344][49750] Updated weights for policy 0, policy_version 301311 (0.0033) [2024-04-26 23:51:12,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4936810496. Throughput: 0: 50657.4. Samples: 2689681560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:12,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 23:51:12,628][49750] Updated weights for policy 0, policy_version 301321 (0.0027) [2024-04-26 23:51:14,881][49750] Updated weights for policy 0, policy_version 301331 (0.0028) [2024-04-26 23:51:17,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4937089024. Throughput: 0: 50606.4. Samples: 2689980760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:17,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-26 23:51:19,067][49750] Updated weights for policy 0, policy_version 301341 (0.0030) [2024-04-26 23:51:21,563][49750] Updated weights for policy 0, policy_version 301351 (0.0035) [2024-04-26 23:51:22,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4937351168. Throughput: 0: 50529.7. Samples: 2690144040. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:22,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-26 23:51:25,405][49750] Updated weights for policy 0, policy_version 301361 (0.0033) [2024-04-26 23:51:27,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4937596928. Throughput: 0: 50563.7. Samples: 2690447420. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:27,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 23:51:27,877][49750] Updated weights for policy 0, policy_version 301371 (0.0031) [2024-04-26 23:51:31,791][49750] Updated weights for policy 0, policy_version 301381 (0.0039) [2024-04-26 23:51:32,062][49517] Fps is (10 sec: 47514.5, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 4937826304. Throughput: 0: 50828.6. Samples: 2690754800. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:32,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-26 23:51:34,674][49750] Updated weights for policy 0, policy_version 301391 (0.0035) [2024-04-26 23:51:37,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4938121216. Throughput: 0: 50742.7. Samples: 2690897760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:37,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-26 23:51:38,203][49750] Updated weights for policy 0, policy_version 301401 (0.0026) [2024-04-26 23:51:41,045][49750] Updated weights for policy 0, policy_version 301411 (0.0031) [2024-04-26 23:51:42,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4938366976. Throughput: 0: 50725.9. Samples: 2691199040. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:42,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 23:51:44,539][49750] Updated weights for policy 0, policy_version 301421 (0.0028) [2024-04-26 23:51:47,062][49517] Fps is (10 sec: 49153.3, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4938612736. Throughput: 0: 50663.7. Samples: 2691503920. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:47,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 23:51:47,417][49750] Updated weights for policy 0, policy_version 301431 (0.0032) [2024-04-26 23:51:51,050][49750] Updated weights for policy 0, policy_version 301441 (0.0026) [2024-04-26 23:51:52,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.6, 300 sec: 50818.2). Total num frames: 4938858496. Throughput: 0: 50570.8. Samples: 2691652640. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:52,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 23:51:53,688][49750] Updated weights for policy 0, policy_version 301451 (0.0033) [2024-04-26 23:51:57,063][49517] Fps is (10 sec: 47512.2, 60 sec: 50244.2, 300 sec: 50596.0). Total num frames: 4939087872. Throughput: 0: 50656.2. Samples: 2691961100. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:51:57,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-26 23:51:57,570][49750] Updated weights for policy 0, policy_version 301461 (0.0034) [2024-04-26 23:52:00,083][49750] Updated weights for policy 0, policy_version 301471 (0.0035) [2024-04-26 23:52:02,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4939366400. Throughput: 0: 50745.5. Samples: 2692264300. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:52:02,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-26 23:52:04,313][49750] Updated weights for policy 0, policy_version 301481 (0.0034) [2024-04-26 23:52:06,672][49750] Updated weights for policy 0, policy_version 301491 (0.0032) [2024-04-26 23:52:07,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4939628544. Throughput: 0: 50669.3. Samples: 2692424160. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:52:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 23:52:10,752][49750] Updated weights for policy 0, policy_version 301501 (0.0035) [2024-04-26 23:52:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4939874304. Throughput: 0: 50751.4. Samples: 2692731240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-04-26 23:52:12,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 23:52:13,143][49750] Updated weights for policy 0, policy_version 301511 (0.0028) [2024-04-26 23:52:17,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.3, 300 sec: 50596.0). Total num frames: 4940087296. Throughput: 0: 50526.5. Samples: 2693028500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:52:17,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 23:52:17,345][49750] Updated weights for policy 0, policy_version 301521 (0.0028) [2024-04-26 23:52:17,538][49728] Signal inference workers to stop experience collection... (40500 times) [2024-04-26 23:52:17,538][49728] Signal inference workers to resume experience collection... (40500 times) [2024-04-26 23:52:17,565][49750] InferenceWorker_p0-w0: stopping experience collection (40500 times) [2024-04-26 23:52:17,565][49750] InferenceWorker_p0-w0: resuming experience collection (40500 times) [2024-04-26 23:52:19,608][49750] Updated weights for policy 0, policy_version 301531 (0.0033) [2024-04-26 23:52:22,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4940382208. Throughput: 0: 50588.2. Samples: 2693174220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:52:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:52:23,669][49750] Updated weights for policy 0, policy_version 301541 (0.0030) [2024-04-26 23:52:26,491][49750] Updated weights for policy 0, policy_version 301551 (0.0035) [2024-04-26 23:52:27,062][49517] Fps is (10 sec: 55706.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4940644352. Throughput: 0: 50687.2. Samples: 2693479960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:52:27,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-26 23:52:30,061][49750] Updated weights for policy 0, policy_version 301561 (0.0033) [2024-04-26 23:52:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4940890112. Throughput: 0: 50759.9. Samples: 2693788120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:52:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-26 23:52:33,017][49750] Updated weights for policy 0, policy_version 301571 (0.0036) [2024-04-26 23:52:36,442][49750] Updated weights for policy 0, policy_version 301581 (0.0033) [2024-04-26 23:52:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 4941135872. Throughput: 0: 50651.4. Samples: 2693931960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:52:37,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-26 23:52:39,458][49750] Updated weights for policy 0, policy_version 301591 (0.0035) [2024-04-26 23:52:42,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 4941398016. Throughput: 0: 50735.1. Samples: 2694244180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:52:42,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-26 23:52:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000301599_4941398016.pth... 
[2024-04-26 23:52:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000300857_4929241088.pth [2024-04-26 23:52:42,853][49750] Updated weights for policy 0, policy_version 301601 (0.0032) [2024-04-26 23:52:45,959][49750] Updated weights for policy 0, policy_version 301611 (0.0034) [2024-04-26 23:52:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4941660160. Throughput: 0: 50691.9. Samples: 2694545440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:52:47,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-26 23:52:49,382][49750] Updated weights for policy 0, policy_version 301621 (0.0043) [2024-04-26 23:52:52,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.1, 300 sec: 50707.1). Total num frames: 4941889536. Throughput: 0: 50615.9. Samples: 2694701880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:52:52,063][49517] Avg episode reward: [(0, '0.700')] [2024-04-26 23:52:52,497][49750] Updated weights for policy 0, policy_version 301631 (0.0030) [2024-04-26 23:52:55,717][49750] Updated weights for policy 0, policy_version 301641 (0.0032) [2024-04-26 23:52:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4942168064. Throughput: 0: 50500.5. Samples: 2695003760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:52:57,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-26 23:52:59,002][49750] Updated weights for policy 0, policy_version 301651 (0.0034) [2024-04-26 23:53:02,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4942397440. Throughput: 0: 50711.6. Samples: 2695310520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:53:02,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-26 23:53:02,084][49750] Updated weights for policy 0, policy_version 301661 (0.0041) [2024-04-26 23:53:05,442][49750] Updated weights for policy 0, policy_version 301671 (0.0037) [2024-04-26 23:53:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4942675968. Throughput: 0: 50768.4. Samples: 2695458800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:53:07,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 23:53:08,623][49750] Updated weights for policy 0, policy_version 301681 (0.0033) [2024-04-26 23:53:11,766][49750] Updated weights for policy 0, policy_version 301691 (0.0031) [2024-04-26 23:53:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 4942921728. Throughput: 0: 50681.8. Samples: 2695760640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:53:12,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 23:53:15,343][49750] Updated weights for policy 0, policy_version 301701 (0.0034) [2024-04-26 23:53:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4943167488. Throughput: 0: 50606.1. Samples: 2696065400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:53:17,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-26 23:53:18,277][49750] Updated weights for policy 0, policy_version 301711 (0.0028) [2024-04-26 23:53:21,633][49750] Updated weights for policy 0, policy_version 301721 (0.0037) [2024-04-26 23:53:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4943413248. Throughput: 0: 50731.1. Samples: 2696214860. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:53:22,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-26 23:53:24,763][49750] Updated weights for policy 0, policy_version 301731 (0.0033) [2024-04-26 23:53:27,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4943675392. Throughput: 0: 50623.0. Samples: 2696522200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-26 23:53:27,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-26 23:53:28,090][49750] Updated weights for policy 0, policy_version 301741 (0.0032) [2024-04-26 23:53:30,451][49728] Signal inference workers to stop experience collection... (40550 times) [2024-04-26 23:53:30,452][49728] Signal inference workers to resume experience collection... (40550 times) [2024-04-26 23:53:30,464][49750] InferenceWorker_p0-w0: stopping experience collection (40550 times) [2024-04-26 23:53:30,464][49750] InferenceWorker_p0-w0: resuming experience collection (40550 times) [2024-04-26 23:53:31,186][49750] Updated weights for policy 0, policy_version 301751 (0.0033) [2024-04-26 23:53:32,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 4943937536. Throughput: 0: 50740.4. Samples: 2696828760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:53:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 23:53:34,565][49750] Updated weights for policy 0, policy_version 301761 (0.0031) [2024-04-26 23:53:37,062][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 4944183296. Throughput: 0: 50791.7. Samples: 2696987500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:53:37,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-26 23:53:37,595][49750] Updated weights for policy 0, policy_version 301771 (0.0030) [2024-04-26 23:53:40,835][49750] Updated weights for policy 0, policy_version 301781 (0.0034) [2024-04-26 23:53:42,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4944445440. Throughput: 0: 50845.6. Samples: 2697291820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:53:42,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-26 23:53:44,197][49750] Updated weights for policy 0, policy_version 301791 (0.0029) [2024-04-26 23:53:47,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4944691200. Throughput: 0: 50696.3. Samples: 2697591860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:53:47,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-26 23:53:47,159][49750] Updated weights for policy 0, policy_version 301801 (0.0036) [2024-04-26 23:53:50,734][49750] Updated weights for policy 0, policy_version 301811 (0.0037) [2024-04-26 23:53:52,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4944953344. Throughput: 0: 50786.1. Samples: 2697744180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:53:52,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 23:53:53,696][49750] Updated weights for policy 0, policy_version 301821 (0.0036) [2024-04-26 23:53:57,050][49750] Updated weights for policy 0, policy_version 301831 (0.0027) [2024-04-26 23:53:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 4945199104. Throughput: 0: 50799.4. Samples: 2698046620. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:53:57,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-26 23:54:00,261][49750] Updated weights for policy 0, policy_version 301841 (0.0039) [2024-04-26 23:54:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4945444864. Throughput: 0: 50729.9. Samples: 2698348240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:54:02,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-26 23:54:03,430][49750] Updated weights for policy 0, policy_version 301851 (0.0035) [2024-04-26 23:54:06,654][49750] Updated weights for policy 0, policy_version 301861 (0.0031) [2024-04-26 23:54:07,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4945707008. Throughput: 0: 50690.3. Samples: 2698495920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:54:07,063][49517] Avg episode reward: [(0, '0.692')] [2024-04-26 23:54:09,894][49750] Updated weights for policy 0, policy_version 301871 (0.0024) [2024-04-26 23:54:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 4945952768. Throughput: 0: 50729.1. Samples: 2698805020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:54:12,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 23:54:13,012][49750] Updated weights for policy 0, policy_version 301881 (0.0032) [2024-04-26 23:54:16,368][49750] Updated weights for policy 0, policy_version 301891 (0.0030) [2024-04-26 23:54:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4946231296. Throughput: 0: 50687.5. Samples: 2699109700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:54:17,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 23:54:19,491][49750] Updated weights for policy 0, policy_version 301901 (0.0029) [2024-04-26 23:54:22,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.4, 300 sec: 50484.9). Total num frames: 4946427904. Throughput: 0: 50509.0. Samples: 2699260400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:54:22,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-26 23:54:22,755][49750] Updated weights for policy 0, policy_version 301911 (0.0034) [2024-04-26 23:54:26,005][49750] Updated weights for policy 0, policy_version 301921 (0.0033) [2024-04-26 23:54:27,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.1, 300 sec: 50651.5). Total num frames: 4946706432. Throughput: 0: 50633.8. Samples: 2699570340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:54:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-26 23:54:29,186][49750] Updated weights for policy 0, policy_version 301931 (0.0030) [2024-04-26 23:54:32,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4946968576. Throughput: 0: 50718.3. Samples: 2699874180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:54:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-26 23:54:32,443][49750] Updated weights for policy 0, policy_version 301941 (0.0032) [2024-04-26 23:54:35,163][49728] Signal inference workers to stop experience collection... (40600 times) [2024-04-26 23:54:35,163][49728] Signal inference workers to resume experience collection... 
(40600 times) [2024-04-26 23:54:35,194][49750] InferenceWorker_p0-w0: stopping experience collection (40600 times) [2024-04-26 23:54:35,194][49750] InferenceWorker_p0-w0: resuming experience collection (40600 times) [2024-04-26 23:54:35,687][49750] Updated weights for policy 0, policy_version 301951 (0.0035) [2024-04-26 23:54:37,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 4947247104. Throughput: 0: 50833.9. Samples: 2700031700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:54:37,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-26 23:54:38,773][49750] Updated weights for policy 0, policy_version 301961 (0.0034) [2024-04-26 23:54:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.5, 300 sec: 50651.5). Total num frames: 4947476480. Throughput: 0: 50850.9. Samples: 2700334900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-26 23:54:42,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 23:54:42,095][49750] Updated weights for policy 0, policy_version 301971 (0.0031) [2024-04-26 23:54:42,095][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000301971_4947492864.pth... [2024-04-26 23:54:42,147][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000301228_4935319552.pth [2024-04-26 23:54:45,072][49750] Updated weights for policy 0, policy_version 301981 (0.0029) [2024-04-26 23:54:47,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 4947705856. Throughput: 0: 50919.6. Samples: 2700639620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:54:47,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-26 23:54:48,505][49750] Updated weights for policy 0, policy_version 301991 (0.0030) [2024-04-26 23:54:51,568][49750] Updated weights for policy 0, policy_version 302001 (0.0036) [2024-04-26 23:54:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.6, 300 sec: 50707.1). 
Total num frames: 4948000768. Throughput: 0: 50813.0. Samples: 2700782500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:54:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-26 23:54:54,939][49750] Updated weights for policy 0, policy_version 302011 (0.0036) [2024-04-26 23:54:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4948230144. Throughput: 0: 50694.2. Samples: 2701086260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:54:57,071][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 23:54:58,144][49750] Updated weights for policy 0, policy_version 302021 (0.0032) [2024-04-26 23:55:01,484][49750] Updated weights for policy 0, policy_version 302031 (0.0024) [2024-04-26 23:55:02,063][49517] Fps is (10 sec: 52427.3, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4948525056. Throughput: 0: 50809.7. Samples: 2701396140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-26 23:55:04,609][49750] Updated weights for policy 0, policy_version 302041 (0.0031) [2024-04-26 23:55:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 4948738048. Throughput: 0: 50829.3. Samples: 2701547720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:07,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 23:55:07,785][49750] Updated weights for policy 0, policy_version 302051 (0.0030) [2024-04-26 23:55:11,029][49750] Updated weights for policy 0, policy_version 302061 (0.0035) [2024-04-26 23:55:12,062][49517] Fps is (10 sec: 45876.1, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 4948983808. Throughput: 0: 50705.5. Samples: 2701852080. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:12,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:55:14,234][49750] Updated weights for policy 0, policy_version 302071 (0.0028) [2024-04-26 23:55:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4949278720. Throughput: 0: 50853.4. Samples: 2702162580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:17,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 23:55:17,371][49750] Updated weights for policy 0, policy_version 302081 (0.0031) [2024-04-26 23:55:20,741][49750] Updated weights for policy 0, policy_version 302091 (0.0031) [2024-04-26 23:55:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 4949508096. Throughput: 0: 50866.6. Samples: 2702320700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:22,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 23:55:23,928][49750] Updated weights for policy 0, policy_version 302101 (0.0039) [2024-04-26 23:55:27,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.6, 300 sec: 50651.5). Total num frames: 4949770240. Throughput: 0: 50698.2. Samples: 2702616320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-26 23:55:27,315][49750] Updated weights for policy 0, policy_version 302111 (0.0033) [2024-04-26 23:55:30,547][49750] Updated weights for policy 0, policy_version 302121 (0.0027) [2024-04-26 23:55:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4949999616. Throughput: 0: 50669.3. Samples: 2702919740. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:32,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-26 23:55:33,641][49750] Updated weights for policy 0, policy_version 302131 (0.0030) [2024-04-26 23:55:36,909][49750] Updated weights for policy 0, policy_version 302141 (0.0027) [2024-04-26 23:55:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4950278144. Throughput: 0: 50556.3. Samples: 2703057540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:37,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-26 23:55:39,347][49728] Signal inference workers to stop experience collection... (40650 times) [2024-04-26 23:55:39,348][49728] Signal inference workers to resume experience collection... (40650 times) [2024-04-26 23:55:39,380][49750] InferenceWorker_p0-w0: stopping experience collection (40650 times) [2024-04-26 23:55:39,380][49750] InferenceWorker_p0-w0: resuming experience collection (40650 times) [2024-04-26 23:55:40,041][49750] Updated weights for policy 0, policy_version 302151 (0.0034) [2024-04-26 23:55:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4950523904. Throughput: 0: 50654.9. Samples: 2703365720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:42,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-26 23:55:43,182][49750] Updated weights for policy 0, policy_version 302161 (0.0032) [2024-04-26 23:55:46,600][49750] Updated weights for policy 0, policy_version 302171 (0.0029) [2024-04-26 23:55:47,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51609.5, 300 sec: 50762.7). Total num frames: 4950802432. Throughput: 0: 50667.7. Samples: 2703676180. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:47,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-26 23:55:49,563][49750] Updated weights for policy 0, policy_version 302181 (0.0033) [2024-04-26 23:55:52,062][49517] Fps is (10 sec: 47513.1, 60 sec: 49971.1, 300 sec: 50596.0). Total num frames: 4950999040. Throughput: 0: 50621.3. Samples: 2703825680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-26 23:55:52,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-26 23:55:53,013][49750] Updated weights for policy 0, policy_version 302191 (0.0027) [2024-04-26 23:55:56,132][49750] Updated weights for policy 0, policy_version 302201 (0.0029) [2024-04-26 23:55:57,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50790.5, 300 sec: 50596.0). Total num frames: 4951277568. Throughput: 0: 50664.0. Samples: 2704131960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:55:57,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-26 23:55:59,513][49750] Updated weights for policy 0, policy_version 302211 (0.0030) [2024-04-26 23:56:02,062][49517] Fps is (10 sec: 55705.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4951556096. Throughput: 0: 50621.7. Samples: 2704440560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:02,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-26 23:56:02,519][49750] Updated weights for policy 0, policy_version 302221 (0.0034) [2024-04-26 23:56:05,820][49750] Updated weights for policy 0, policy_version 302231 (0.0042) [2024-04-26 23:56:07,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4951801856. Throughput: 0: 50675.5. Samples: 2704601100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:07,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-26 23:56:08,908][49750] Updated weights for policy 0, policy_version 302241 (0.0029) [2024-04-26 23:56:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 4952047616. Throughput: 0: 50916.5. Samples: 2704907560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:12,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-26 23:56:12,353][49750] Updated weights for policy 0, policy_version 302251 (0.0033) [2024-04-26 23:56:15,357][49750] Updated weights for policy 0, policy_version 302261 (0.0034) [2024-04-26 23:56:17,063][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.1, 300 sec: 50596.0). Total num frames: 4952276992. Throughput: 0: 50901.3. Samples: 2705210300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 23:56:18,747][49750] Updated weights for policy 0, policy_version 302271 (0.0031) [2024-04-26 23:56:22,017][49750] Updated weights for policy 0, policy_version 302281 (0.0029) [2024-04-26 23:56:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4952571904. Throughput: 0: 50900.4. Samples: 2705348060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:22,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 23:56:25,211][49750] Updated weights for policy 0, policy_version 302291 (0.0035) [2024-04-26 23:56:27,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4952817664. Throughput: 0: 50736.4. Samples: 2705648860. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:27,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-26 23:56:28,393][49750] Updated weights for policy 0, policy_version 302301 (0.0029) [2024-04-26 23:56:31,090][49728] Signal inference workers to stop experience collection... (40700 times) [2024-04-26 23:56:31,090][49728] Signal inference workers to resume experience collection... (40700 times) [2024-04-26 23:56:31,107][49750] InferenceWorker_p0-w0: stopping experience collection (40700 times) [2024-04-26 23:56:31,107][49750] InferenceWorker_p0-w0: resuming experience collection (40700 times) [2024-04-26 23:56:31,538][49750] Updated weights for policy 0, policy_version 302311 (0.0036) [2024-04-26 23:56:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 4953079808. Throughput: 0: 50775.1. Samples: 2705961060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:32,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-26 23:56:35,230][49750] Updated weights for policy 0, policy_version 302321 (0.0028) [2024-04-26 23:56:37,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4953309184. Throughput: 0: 50935.7. Samples: 2706117780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-26 23:56:37,953][49750] Updated weights for policy 0, policy_version 302331 (0.0035) [2024-04-26 23:56:41,878][49750] Updated weights for policy 0, policy_version 302341 (0.0039) [2024-04-26 23:56:42,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.1, 300 sec: 50651.5). Total num frames: 4953554944. Throughput: 0: 50966.0. Samples: 2706425440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:42,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 23:56:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000302341_4953554944.pth... 
[2024-04-26 23:56:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000301599_4941398016.pth [2024-04-26 23:56:44,379][49750] Updated weights for policy 0, policy_version 302351 (0.0028) [2024-04-26 23:56:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4953833472. Throughput: 0: 50800.6. Samples: 2706726580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:47,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-26 23:56:48,238][49750] Updated weights for policy 0, policy_version 302361 (0.0036) [2024-04-26 23:56:50,770][49750] Updated weights for policy 0, policy_version 302371 (0.0036) [2024-04-26 23:56:52,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4954079232. Throughput: 0: 50763.1. Samples: 2706885440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:52,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-26 23:56:54,679][49750] Updated weights for policy 0, policy_version 302381 (0.0034) [2024-04-26 23:56:57,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4954341376. Throughput: 0: 50762.1. Samples: 2707191860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:56:57,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-26 23:56:57,237][49750] Updated weights for policy 0, policy_version 302391 (0.0030) [2024-04-26 23:57:01,399][49750] Updated weights for policy 0, policy_version 302401 (0.0033) [2024-04-26 23:57:02,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4954587136. Throughput: 0: 50892.5. Samples: 2707500460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:57:02,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:57:03,643][49750] Updated weights for policy 0, policy_version 302411 (0.0029) [2024-04-26 23:57:07,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4954832896. Throughput: 0: 51051.6. Samples: 2707645380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-26 23:57:07,063][49517] Avg episode reward: [(0, '0.695')] [2024-04-26 23:57:07,769][49750] Updated weights for policy 0, policy_version 302421 (0.0033) [2024-04-26 23:57:09,980][49750] Updated weights for policy 0, policy_version 302431 (0.0030) [2024-04-26 23:57:12,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 4955111424. Throughput: 0: 51005.3. Samples: 2707944100. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-26 23:57:14,206][49750] Updated weights for policy 0, policy_version 302441 (0.0026) [2024-04-26 23:57:16,469][49750] Updated weights for policy 0, policy_version 302451 (0.0029) [2024-04-26 23:57:17,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 4955373568. Throughput: 0: 50829.8. Samples: 2708248400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:17,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-26 23:57:20,725][49750] Updated weights for policy 0, policy_version 302461 (0.0029) [2024-04-26 23:57:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4955602944. Throughput: 0: 50764.7. Samples: 2708402200. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:22,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-26 23:57:23,036][49750] Updated weights for policy 0, policy_version 302471 (0.0041) [2024-04-26 23:57:27,058][49750] Updated weights for policy 0, policy_version 302481 (0.0035) [2024-04-26 23:57:27,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4955848704. Throughput: 0: 50749.1. Samples: 2708709140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:27,063][49517] Avg episode reward: [(0, '0.689')] [2024-04-26 23:57:28,347][49728] Signal inference workers to stop experience collection... (40750 times) [2024-04-26 23:57:28,381][49750] InferenceWorker_p0-w0: stopping experience collection (40750 times) [2024-04-26 23:57:28,450][49728] Signal inference workers to resume experience collection... (40750 times) [2024-04-26 23:57:28,450][49750] InferenceWorker_p0-w0: resuming experience collection (40750 times) [2024-04-26 23:57:29,427][49750] Updated weights for policy 0, policy_version 302491 (0.0026) [2024-04-26 23:57:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4956110848. Throughput: 0: 50745.2. Samples: 2709010120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-26 23:57:33,558][49750] Updated weights for policy 0, policy_version 302501 (0.0030) [2024-04-26 23:57:35,943][49750] Updated weights for policy 0, policy_version 302511 (0.0038) [2024-04-26 23:57:37,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50762.7). Total num frames: 4956372992. Throughput: 0: 50731.7. Samples: 2709168360. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:37,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-26 23:57:40,065][49750] Updated weights for policy 0, policy_version 302521 (0.0030) [2024-04-26 23:57:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 4956635136. Throughput: 0: 50581.7. Samples: 2709468040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-26 23:57:42,466][49750] Updated weights for policy 0, policy_version 302531 (0.0030) [2024-04-26 23:57:46,522][49750] Updated weights for policy 0, policy_version 302541 (0.0031) [2024-04-26 23:57:47,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4956848128. Throughput: 0: 50540.0. Samples: 2709774760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:47,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 23:57:48,856][49750] Updated weights for policy 0, policy_version 302551 (0.0031) [2024-04-26 23:57:52,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50244.4, 300 sec: 50596.0). Total num frames: 4957093888. Throughput: 0: 50428.0. Samples: 2709914640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-26 23:57:52,939][49750] Updated weights for policy 0, policy_version 302561 (0.0031) [2024-04-26 23:57:55,302][49750] Updated weights for policy 0, policy_version 302571 (0.0028) [2024-04-26 23:57:57,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4957372416. Throughput: 0: 50534.3. Samples: 2710218140. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:57:57,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-26 23:57:59,433][49750] Updated weights for policy 0, policy_version 302581 (0.0032) [2024-04-26 23:58:01,794][49750] Updated weights for policy 0, policy_version 302591 (0.0029) [2024-04-26 23:58:02,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4957650944. Throughput: 0: 50591.7. Samples: 2710525020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:58:02,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-26 23:58:06,004][49750] Updated weights for policy 0, policy_version 302601 (0.0031) [2024-04-26 23:58:07,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 4957863936. Throughput: 0: 50667.3. Samples: 2710682220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:58:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-26 23:58:08,227][49750] Updated weights for policy 0, policy_version 302611 (0.0032) [2024-04-26 23:58:12,062][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4958109696. Throughput: 0: 50515.6. Samples: 2710982340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:58:12,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-26 23:58:12,403][49750] Updated weights for policy 0, policy_version 302621 (0.0033) [2024-04-26 23:58:14,604][49750] Updated weights for policy 0, policy_version 302631 (0.0032) [2024-04-26 23:58:17,063][49517] Fps is (10 sec: 52427.4, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4958388224. Throughput: 0: 50680.7. Samples: 2711290760. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:58:17,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-26 23:58:18,930][49750] Updated weights for policy 0, policy_version 302641 (0.0030) [2024-04-26 23:58:20,613][49728] Signal inference workers to stop experience collection... (40800 times) [2024-04-26 23:58:20,617][49728] Signal inference workers to resume experience collection... (40800 times) [2024-04-26 23:58:20,641][49750] InferenceWorker_p0-w0: stopping experience collection (40800 times) [2024-04-26 23:58:20,641][49750] InferenceWorker_p0-w0: resuming experience collection (40800 times) [2024-04-26 23:58:21,203][49750] Updated weights for policy 0, policy_version 302651 (0.0042) [2024-04-26 23:58:22,062][49517] Fps is (10 sec: 55705.0, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4958666752. Throughput: 0: 50667.4. Samples: 2711448400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-26 23:58:22,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-26 23:58:25,453][49750] Updated weights for policy 0, policy_version 302661 (0.0029) [2024-04-26 23:58:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4958912512. Throughput: 0: 50692.1. Samples: 2711749180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:58:27,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-26 23:58:27,585][49750] Updated weights for policy 0, policy_version 302671 (0.0030) [2024-04-26 23:58:31,715][49750] Updated weights for policy 0, policy_version 302681 (0.0029) [2024-04-26 23:58:32,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 4959125504. Throughput: 0: 50692.8. Samples: 2712055940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:58:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-26 23:58:33,984][49750] Updated weights for policy 0, policy_version 302691 (0.0032) [2024-04-26 23:58:37,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 4959387648. Throughput: 0: 50555.2. Samples: 2712189620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:58:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-26 23:58:38,123][49750] Updated weights for policy 0, policy_version 302701 (0.0032) [2024-04-26 23:58:40,392][49750] Updated weights for policy 0, policy_version 302711 (0.0035) [2024-04-26 23:58:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 4959633408. Throughput: 0: 50602.2. Samples: 2712495240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:58:42,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-26 23:58:42,098][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000302713_4959649792.pth... [2024-04-26 23:58:42,144][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000301971_4947492864.pth [2024-04-26 23:58:44,716][49750] Updated weights for policy 0, policy_version 302721 (0.0030) [2024-04-26 23:58:46,938][49750] Updated weights for policy 0, policy_version 302731 (0.0032) [2024-04-26 23:58:47,062][49517] Fps is (10 sec: 55704.9, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4959944704. Throughput: 0: 50639.4. Samples: 2712803800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:58:47,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-26 23:58:51,204][49750] Updated weights for policy 0, policy_version 302741 (0.0034) [2024-04-26 23:58:52,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 4960157696. Throughput: 0: 50687.0. Samples: 2712963140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:58:52,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-26 23:58:53,361][49750] Updated weights for policy 0, policy_version 302751 (0.0030) [2024-04-26 23:58:57,063][49517] Fps is (10 sec: 44236.6, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 4960387072. Throughput: 0: 50784.8. Samples: 2713267660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:58:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-26 23:58:57,715][49750] Updated weights for policy 0, policy_version 302761 (0.0035) [2024-04-26 23:58:59,732][49750] Updated weights for policy 0, policy_version 302771 (0.0030) [2024-04-26 23:59:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4960665600. Throughput: 0: 50881.8. Samples: 2713580440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:59:02,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-26 23:59:04,086][49750] Updated weights for policy 0, policy_version 302781 (0.0032) [2024-04-26 23:59:06,329][49750] Updated weights for policy 0, policy_version 302791 (0.0032) [2024-04-26 23:59:07,063][49517] Fps is (10 sec: 55705.6, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4960944128. Throughput: 0: 50710.2. Samples: 2713730360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:59:07,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-26 23:59:10,422][49750] Updated weights for policy 0, policy_version 302801 (0.0038) [2024-04-26 23:59:12,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51609.6, 300 sec: 50762.7). Total num frames: 4961206272. Throughput: 0: 50779.2. Samples: 2714034240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:59:12,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-26 23:59:12,754][49750] Updated weights for policy 0, policy_version 302811 (0.0030) [2024-04-26 23:59:16,951][49750] Updated weights for policy 0, policy_version 302821 (0.0028) [2024-04-26 23:59:17,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4961419264. Throughput: 0: 50827.1. Samples: 2714343160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:59:17,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-26 23:59:19,318][49750] Updated weights for policy 0, policy_version 302831 (0.0034) [2024-04-26 23:59:22,062][49517] Fps is (10 sec: 45875.3, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 4961665024. Throughput: 0: 50783.5. Samples: 2714474880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:59:22,063][49517] Avg episode reward: [(0, '0.499')] [2024-04-26 23:59:23,485][49750] Updated weights for policy 0, policy_version 302841 (0.0030) [2024-04-26 23:59:25,931][49750] Updated weights for policy 0, policy_version 302851 (0.0034) [2024-04-26 23:59:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 4961927168. Throughput: 0: 50840.9. Samples: 2714783080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:59:27,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-26 23:59:27,867][49728] Signal inference workers to stop experience collection... (40850 times) [2024-04-26 23:59:27,867][49728] Signal inference workers to resume experience collection... 
(40850 times) [2024-04-26 23:59:27,896][49750] InferenceWorker_p0-w0: stopping experience collection (40850 times) [2024-04-26 23:59:27,896][49750] InferenceWorker_p0-w0: resuming experience collection (40850 times) [2024-04-26 23:59:29,827][49750] Updated weights for policy 0, policy_version 302861 (0.0036) [2024-04-26 23:59:32,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 4962205696. Throughput: 0: 50681.4. Samples: 2715084460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:59:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-26 23:59:32,327][49750] Updated weights for policy 0, policy_version 302871 (0.0035) [2024-04-26 23:59:36,206][49750] Updated weights for policy 0, policy_version 302881 (0.0029) [2024-04-26 23:59:37,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4962451456. Throughput: 0: 50825.3. Samples: 2715250280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-26 23:59:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-26 23:59:38,681][49750] Updated weights for policy 0, policy_version 302891 (0.0028) [2024-04-26 23:59:42,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 4962680832. Throughput: 0: 50779.4. Samples: 2715552740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:59:42,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-26 23:59:42,603][49750] Updated weights for policy 0, policy_version 302901 (0.0041) [2024-04-26 23:59:45,113][49750] Updated weights for policy 0, policy_version 302911 (0.0030) [2024-04-26 23:59:47,062][49517] Fps is (10 sec: 47514.2, 60 sec: 49698.2, 300 sec: 50596.0). Total num frames: 4962926592. Throughput: 0: 50584.7. Samples: 2715856740. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:59:47,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-26 23:59:49,148][49750] Updated weights for policy 0, policy_version 302921 (0.0027) [2024-04-26 23:59:51,660][49750] Updated weights for policy 0, policy_version 302931 (0.0034) [2024-04-26 23:59:52,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4963221504. Throughput: 0: 50566.6. Samples: 2716005860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:59:52,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-26 23:59:55,688][49750] Updated weights for policy 0, policy_version 302941 (0.0033) [2024-04-26 23:59:57,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51609.8, 300 sec: 50707.1). Total num frames: 4963483648. Throughput: 0: 50668.6. Samples: 2716314320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-26 23:59:57,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-26 23:59:58,265][49750] Updated weights for policy 0, policy_version 302951 (0.0038) [2024-04-27 00:00:02,049][49750] Updated weights for policy 0, policy_version 302961 (0.0031) [2024-04-27 00:00:02,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 4963713024. Throughput: 0: 50600.6. Samples: 2716620180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:02,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:00:04,623][49750] Updated weights for policy 0, policy_version 302971 (0.0034) [2024-04-27 00:00:07,063][49517] Fps is (10 sec: 47512.3, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4963958784. Throughput: 0: 50692.7. Samples: 2716756060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 00:00:08,576][49750] Updated weights for policy 0, policy_version 302981 (0.0027) [2024-04-27 00:00:11,146][49750] Updated weights for policy 0, policy_version 302991 (0.0039) [2024-04-27 00:00:12,062][49517] Fps is (10 sec: 49151.4, 60 sec: 49971.1, 300 sec: 50596.0). Total num frames: 4964204544. Throughput: 0: 50544.8. Samples: 2717057600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:12,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-27 00:00:14,957][49750] Updated weights for policy 0, policy_version 303001 (0.0032) [2024-04-27 00:00:17,063][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 4964499456. Throughput: 0: 50751.0. Samples: 2717368260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:17,063][49517] Avg episode reward: [(0, '0.692')] [2024-04-27 00:00:17,996][49750] Updated weights for policy 0, policy_version 303011 (0.0032) [2024-04-27 00:00:21,500][49750] Updated weights for policy 0, policy_version 303021 (0.0032) [2024-04-27 00:00:22,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.3, 300 sec: 50707.1). Total num frames: 4964728832. Throughput: 0: 50697.2. Samples: 2717531660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 00:00:24,640][49750] Updated weights for policy 0, policy_version 303031 (0.0034) [2024-04-27 00:00:25,880][49728] Signal inference workers to stop experience collection... (40900 times) [2024-04-27 00:00:25,930][49750] InferenceWorker_p0-w0: stopping experience collection (40900 times) [2024-04-27 00:00:25,946][49728] Signal inference workers to resume experience collection... 
(40900 times) [2024-04-27 00:00:25,954][49750] InferenceWorker_p0-w0: resuming experience collection (40900 times) [2024-04-27 00:00:27,063][49517] Fps is (10 sec: 45875.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4964958208. Throughput: 0: 50853.0. Samples: 2717841120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:27,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-27 00:00:28,034][49750] Updated weights for policy 0, policy_version 303041 (0.0031) [2024-04-27 00:00:30,931][49750] Updated weights for policy 0, policy_version 303051 (0.0029) [2024-04-27 00:00:32,062][49517] Fps is (10 sec: 47514.4, 60 sec: 49971.2, 300 sec: 50596.0). Total num frames: 4965203968. Throughput: 0: 50800.0. Samples: 2718142740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:32,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 00:00:34,360][49750] Updated weights for policy 0, policy_version 303061 (0.0028) [2024-04-27 00:00:37,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.2, 300 sec: 50707.0). Total num frames: 4965482496. Throughput: 0: 50773.2. Samples: 2718290660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:37,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 00:00:37,473][49750] Updated weights for policy 0, policy_version 303071 (0.0026) [2024-04-27 00:00:40,699][49750] Updated weights for policy 0, policy_version 303081 (0.0028) [2024-04-27 00:00:42,063][49517] Fps is (10 sec: 57343.0, 60 sec: 51609.6, 300 sec: 50762.6). Total num frames: 4965777408. Throughput: 0: 50813.0. Samples: 2718600920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:42,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 00:00:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000303087_4965777408.pth... 
[2024-04-27 00:00:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000302341_4953554944.pth [2024-04-27 00:00:43,861][49750] Updated weights for policy 0, policy_version 303091 (0.0036) [2024-04-27 00:00:47,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 4965990400. Throughput: 0: 50776.6. Samples: 2718905140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:00:47,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 00:00:47,191][49750] Updated weights for policy 0, policy_version 303101 (0.0031) [2024-04-27 00:00:50,144][49750] Updated weights for policy 0, policy_version 303111 (0.0027) [2024-04-27 00:00:52,062][49517] Fps is (10 sec: 47514.7, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 4966252544. Throughput: 0: 50964.7. Samples: 2719049460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:00:52,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-27 00:00:53,568][49750] Updated weights for policy 0, policy_version 303121 (0.0033) [2024-04-27 00:00:56,679][49750] Updated weights for policy 0, policy_version 303131 (0.0032) [2024-04-27 00:00:57,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.1, 300 sec: 50651.5). Total num frames: 4966498304. Throughput: 0: 50869.3. Samples: 2719346720. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:00:57,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-27 00:00:59,916][49750] Updated weights for policy 0, policy_version 303141 (0.0037) [2024-04-27 00:01:02,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4966793216. Throughput: 0: 50718.2. Samples: 2719650580. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:02,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-27 00:01:03,217][49750] Updated weights for policy 0, policy_version 303151 (0.0031) [2024-04-27 00:01:06,517][49750] Updated weights for policy 0, policy_version 303161 (0.0040) [2024-04-27 00:01:07,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4967022592. Throughput: 0: 50854.2. Samples: 2719820100. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 00:01:09,530][49750] Updated weights for policy 0, policy_version 303171 (0.0028) [2024-04-27 00:01:12,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4967251968. Throughput: 0: 50743.2. Samples: 2720124560. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:12,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 00:01:12,857][49750] Updated weights for policy 0, policy_version 303181 (0.0034) [2024-04-27 00:01:15,858][49750] Updated weights for policy 0, policy_version 303191 (0.0033) [2024-04-27 00:01:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4967530496. Throughput: 0: 50868.3. Samples: 2720431820. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:17,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-27 00:01:19,354][49750] Updated weights for policy 0, policy_version 303201 (0.0034) [2024-04-27 00:01:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4967776256. Throughput: 0: 50936.2. Samples: 2720582780. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:22,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 00:01:22,494][49750] Updated weights for policy 0, policy_version 303211 (0.0029) [2024-04-27 00:01:25,768][49750] Updated weights for policy 0, policy_version 303221 (0.0028) [2024-04-27 00:01:26,431][49728] Signal inference workers to stop experience collection... (40950 times) [2024-04-27 00:01:26,432][49728] Signal inference workers to resume experience collection... (40950 times) [2024-04-27 00:01:26,463][49750] InferenceWorker_p0-w0: stopping experience collection (40950 times) [2024-04-27 00:01:26,464][49750] InferenceWorker_p0-w0: resuming experience collection (40950 times) [2024-04-27 00:01:27,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51882.6, 300 sec: 50818.2). Total num frames: 4968071168. Throughput: 0: 50788.9. Samples: 2720886420. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:27,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 00:01:29,157][49750] Updated weights for policy 0, policy_version 303231 (0.0031) [2024-04-27 00:01:32,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4968284160. Throughput: 0: 50984.6. Samples: 2721199440. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 00:01:32,104][49750] Updated weights for policy 0, policy_version 303241 (0.0026) [2024-04-27 00:01:35,444][49750] Updated weights for policy 0, policy_version 303251 (0.0028) [2024-04-27 00:01:37,063][49517] Fps is (10 sec: 47513.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 4968546304. Throughput: 0: 51044.3. Samples: 2721346460. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:37,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 00:01:38,464][49750] Updated weights for policy 0, policy_version 303261 (0.0036) [2024-04-27 00:01:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 49971.4, 300 sec: 50651.5). Total num frames: 4968775680. Throughput: 0: 51173.5. Samples: 2721649520. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:42,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-27 00:01:42,074][49750] Updated weights for policy 0, policy_version 303271 (0.0027) [2024-04-27 00:01:44,974][49750] Updated weights for policy 0, policy_version 303281 (0.0038) [2024-04-27 00:01:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4969054208. Throughput: 0: 50964.9. Samples: 2721944000. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:47,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-27 00:01:48,556][49750] Updated weights for policy 0, policy_version 303291 (0.0030) [2024-04-27 00:01:51,424][49750] Updated weights for policy 0, policy_version 303301 (0.0033) [2024-04-27 00:01:52,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 4969316352. Throughput: 0: 50943.2. Samples: 2722112540. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:52,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:01:54,920][49750] Updated weights for policy 0, policy_version 303311 (0.0032) [2024-04-27 00:01:57,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4969545728. Throughput: 0: 50851.6. Samples: 2722412880. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:01:57,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:01:57,731][49750] Updated weights for policy 0, policy_version 303321 (0.0032) [2024-04-27 00:02:01,239][49750] Updated weights for policy 0, policy_version 303331 (0.0036) [2024-04-27 00:02:02,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 4969807872. Throughput: 0: 50841.7. Samples: 2722719700. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-27 00:02:02,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-27 00:02:04,244][49750] Updated weights for policy 0, policy_version 303341 (0.0036) [2024-04-27 00:02:07,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4970070016. Throughput: 0: 50792.4. Samples: 2722868440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:07,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-27 00:02:07,638][49750] Updated weights for policy 0, policy_version 303351 (0.0030) [2024-04-27 00:02:10,805][49750] Updated weights for policy 0, policy_version 303361 (0.0037) [2024-04-27 00:02:12,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51609.5, 300 sec: 50762.6). Total num frames: 4970348544. Throughput: 0: 51051.9. Samples: 2723183760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:12,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 00:02:13,965][49750] Updated weights for policy 0, policy_version 303371 (0.0029) [2024-04-27 00:02:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4970561536. Throughput: 0: 50903.6. Samples: 2723490100. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:17,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 00:02:17,253][49750] Updated weights for policy 0, policy_version 303381 (0.0033) [2024-04-27 00:02:20,405][49750] Updated weights for policy 0, policy_version 303391 (0.0032) [2024-04-27 00:02:22,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4970823680. Throughput: 0: 50885.7. Samples: 2723636320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 00:02:23,486][49728] Signal inference workers to stop experience collection... (41000 times) [2024-04-27 00:02:23,487][49728] Signal inference workers to resume experience collection... (41000 times) [2024-04-27 00:02:23,499][49750] InferenceWorker_p0-w0: stopping experience collection (41000 times) [2024-04-27 00:02:23,519][49750] InferenceWorker_p0-w0: resuming experience collection (41000 times) [2024-04-27 00:02:23,633][49750] Updated weights for policy 0, policy_version 303401 (0.0032) [2024-04-27 00:02:26,820][49750] Updated weights for policy 0, policy_version 303411 (0.0034) [2024-04-27 00:02:27,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 4971102208. Throughput: 0: 50757.3. Samples: 2723933600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:27,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 00:02:30,097][49750] Updated weights for policy 0, policy_version 303421 (0.0037) [2024-04-27 00:02:32,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4971347968. Throughput: 0: 50931.9. Samples: 2724235940. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:32,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 00:02:33,271][49750] Updated weights for policy 0, policy_version 303431 (0.0029) [2024-04-27 00:02:36,547][49750] Updated weights for policy 0, policy_version 303441 (0.0028) [2024-04-27 00:02:37,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 4971593728. Throughput: 0: 50728.3. Samples: 2724395320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 00:02:39,775][49750] Updated weights for policy 0, policy_version 303451 (0.0031) [2024-04-27 00:02:42,063][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 4971839488. Throughput: 0: 50882.1. Samples: 2724702580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:42,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-27 00:02:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000303457_4971839488.pth... [2024-04-27 00:02:42,134][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000302713_4959649792.pth [2024-04-27 00:02:43,030][49750] Updated weights for policy 0, policy_version 303461 (0.0036) [2024-04-27 00:02:46,138][49750] Updated weights for policy 0, policy_version 303471 (0.0033) [2024-04-27 00:02:47,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4972101632. Throughput: 0: 50807.6. Samples: 2725006040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 00:02:49,395][49750] Updated weights for policy 0, policy_version 303481 (0.0028) [2024-04-27 00:02:52,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4972363776. Throughput: 0: 50870.7. Samples: 2725157620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:52,063][49517] Avg episode reward: [(0, '0.676')] [2024-04-27 00:02:52,501][49750] Updated weights for policy 0, policy_version 303491 (0.0026) [2024-04-27 00:02:55,856][49750] Updated weights for policy 0, policy_version 303501 (0.0031) [2024-04-27 00:02:57,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 4972642304. Throughput: 0: 50709.0. Samples: 2725465660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:02:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:02:58,918][49750] Updated weights for policy 0, policy_version 303511 (0.0032) [2024-04-27 00:03:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50818.1). Total num frames: 4972855296. Throughput: 0: 50823.1. Samples: 2725777140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:03:02,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 00:03:02,301][49750] Updated weights for policy 0, policy_version 303521 (0.0034) [2024-04-27 00:03:05,428][49750] Updated weights for policy 0, policy_version 303531 (0.0037) [2024-04-27 00:03:07,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 4973101056. Throughput: 0: 50821.5. Samples: 2725923280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:03:07,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 00:03:08,682][49750] Updated weights for policy 0, policy_version 303541 (0.0040) [2024-04-27 00:03:11,683][49750] Updated weights for policy 0, policy_version 303551 (0.0034) [2024-04-27 00:03:12,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4973395968. Throughput: 0: 50970.6. Samples: 2726227280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:03:12,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 00:03:15,018][49750] Updated weights for policy 0, policy_version 303561 (0.0029) [2024-04-27 00:03:17,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4973641728. Throughput: 0: 50947.3. Samples: 2726528560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:03:17,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 00:03:18,093][49750] Updated weights for policy 0, policy_version 303571 (0.0036) [2024-04-27 00:03:21,437][49750] Updated weights for policy 0, policy_version 303581 (0.0034) [2024-04-27 00:03:22,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 4973903872. Throughput: 0: 50985.1. Samples: 2726689640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:03:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 00:03:24,585][49750] Updated weights for policy 0, policy_version 303591 (0.0026) [2024-04-27 00:03:27,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4974116864. Throughput: 0: 50831.7. Samples: 2726990000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:03:27,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 00:03:27,881][49750] Updated weights for policy 0, policy_version 303601 (0.0035) [2024-04-27 00:03:30,615][49728] Signal inference workers to stop experience collection... (41050 times) [2024-04-27 00:03:30,638][49750] InferenceWorker_p0-w0: stopping experience collection (41050 times) [2024-04-27 00:03:30,685][49728] Signal inference workers to resume experience collection... 
(41050 times) [2024-04-27 00:03:30,685][49750] InferenceWorker_p0-w0: resuming experience collection (41050 times) [2024-04-27 00:03:31,084][49750] Updated weights for policy 0, policy_version 303611 (0.0028) [2024-04-27 00:03:32,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4974395392. Throughput: 0: 50811.0. Samples: 2727292540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:03:32,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-27 00:03:34,310][49750] Updated weights for policy 0, policy_version 303621 (0.0034) [2024-04-27 00:03:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 4974641152. Throughput: 0: 50793.0. Samples: 2727443300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:03:37,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 00:03:37,780][49750] Updated weights for policy 0, policy_version 303631 (0.0032) [2024-04-27 00:03:40,982][49750] Updated weights for policy 0, policy_version 303641 (0.0032) [2024-04-27 00:03:42,063][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4974919680. Throughput: 0: 50802.2. Samples: 2727751760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:03:42,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 00:03:44,124][49750] Updated weights for policy 0, policy_version 303651 (0.0034) [2024-04-27 00:03:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4975149056. Throughput: 0: 50622.2. Samples: 2728055140. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:03:47,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-27 00:03:47,362][49750] Updated weights for policy 0, policy_version 303661 (0.0029) [2024-04-27 00:03:50,601][49750] Updated weights for policy 0, policy_version 303671 (0.0029) [2024-04-27 00:03:52,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4975378432. Throughput: 0: 50772.9. Samples: 2728208060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:03:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:03:53,716][49750] Updated weights for policy 0, policy_version 303681 (0.0029) [2024-04-27 00:03:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4975656960. Throughput: 0: 50722.7. Samples: 2728509800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:03:57,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-27 00:03:57,076][49750] Updated weights for policy 0, policy_version 303691 (0.0031) [2024-04-27 00:04:00,247][49750] Updated weights for policy 0, policy_version 303701 (0.0034) [2024-04-27 00:04:02,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4975919104. Throughput: 0: 50667.1. Samples: 2728808580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:04:02,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 00:04:03,503][49750] Updated weights for policy 0, policy_version 303711 (0.0033) [2024-04-27 00:04:06,805][49750] Updated weights for policy 0, policy_version 303721 (0.0028) [2024-04-27 00:04:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 4976181248. Throughput: 0: 50644.0. Samples: 2728968620. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:04:07,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 00:04:09,792][49750] Updated weights for policy 0, policy_version 303731 (0.0034) [2024-04-27 00:04:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 4976410624. Throughput: 0: 50724.0. Samples: 2729272580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:04:12,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-27 00:04:13,208][49750] Updated weights for policy 0, policy_version 303741 (0.0030) [2024-04-27 00:04:16,178][49750] Updated weights for policy 0, policy_version 303751 (0.0037) [2024-04-27 00:04:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 4976689152. Throughput: 0: 50609.2. Samples: 2729569940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:04:17,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-27 00:04:19,629][49750] Updated weights for policy 0, policy_version 303761 (0.0029) [2024-04-27 00:04:22,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4976934912. Throughput: 0: 50739.9. Samples: 2729726600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:04:22,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 00:04:22,642][49750] Updated weights for policy 0, policy_version 303771 (0.0031) [2024-04-27 00:04:26,022][49750] Updated weights for policy 0, policy_version 303781 (0.0031) [2024-04-27 00:04:27,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 4977164288. Throughput: 0: 50837.5. Samples: 2730039440. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:04:27,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 00:04:29,282][49750] Updated weights for policy 0, policy_version 303791 (0.0031) [2024-04-27 00:04:31,909][49728] Signal inference workers to stop experience collection... (41100 times) [2024-04-27 00:04:31,947][49750] InferenceWorker_p0-w0: stopping experience collection (41100 times) [2024-04-27 00:04:31,980][49728] Signal inference workers to resume experience collection... (41100 times) [2024-04-27 00:04:31,981][49750] InferenceWorker_p0-w0: resuming experience collection (41100 times) [2024-04-27 00:04:32,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4977426432. Throughput: 0: 50700.4. Samples: 2730336660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-27 00:04:32,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 00:04:32,469][49750] Updated weights for policy 0, policy_version 303801 (0.0033) [2024-04-27 00:04:35,578][49750] Updated weights for policy 0, policy_version 303811 (0.0029) [2024-04-27 00:04:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4977672192. Throughput: 0: 50763.6. Samples: 2730492420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:04:37,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 00:04:38,936][49750] Updated weights for policy 0, policy_version 303821 (0.0030) [2024-04-27 00:04:41,972][49750] Updated weights for policy 0, policy_version 303831 (0.0031) [2024-04-27 00:04:42,062][49517] Fps is (10 sec: 54068.0, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 4977967104. Throughput: 0: 50894.6. Samples: 2730800060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:04:42,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 00:04:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000303831_4977967104.pth... 
[2024-04-27 00:04:42,136][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000303087_4965777408.pth [2024-04-27 00:04:45,334][49750] Updated weights for policy 0, policy_version 303841 (0.0028) [2024-04-27 00:04:47,063][49517] Fps is (10 sec: 52425.3, 60 sec: 50790.0, 300 sec: 50762.6). Total num frames: 4978196480. Throughput: 0: 51084.7. Samples: 2731107420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:04:47,064][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 00:04:48,373][49750] Updated weights for policy 0, policy_version 303851 (0.0029) [2024-04-27 00:04:51,667][49750] Updated weights for policy 0, policy_version 303861 (0.0029) [2024-04-27 00:04:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51609.6, 300 sec: 50818.1). Total num frames: 4978475008. Throughput: 0: 50969.8. Samples: 2731262260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:04:52,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 00:04:54,868][49750] Updated weights for policy 0, policy_version 303871 (0.0030) [2024-04-27 00:04:57,063][49517] Fps is (10 sec: 50792.9, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 4978704384. Throughput: 0: 50823.4. Samples: 2731559640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:04:57,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-27 00:04:58,161][49750] Updated weights for policy 0, policy_version 303881 (0.0032) [2024-04-27 00:05:01,210][49750] Updated weights for policy 0, policy_version 303891 (0.0034) [2024-04-27 00:05:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 4978966528. Throughput: 0: 50914.2. Samples: 2731861080. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:05:02,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 00:05:04,777][49750] Updated weights for policy 0, policy_version 303901 (0.0028) [2024-04-27 00:05:07,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4979212288. Throughput: 0: 50948.1. Samples: 2732019260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:05:07,063][49517] Avg episode reward: [(0, '0.690')] [2024-04-27 00:05:07,590][49750] Updated weights for policy 0, policy_version 303911 (0.0033) [2024-04-27 00:05:11,241][49750] Updated weights for policy 0, policy_version 303921 (0.0032) [2024-04-27 00:05:12,063][49517] Fps is (10 sec: 49150.6, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 4979458048. Throughput: 0: 50813.5. Samples: 2732326060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:05:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 00:05:14,092][49750] Updated weights for policy 0, policy_version 303931 (0.0036) [2024-04-27 00:05:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4979720192. Throughput: 0: 50896.6. Samples: 2732627000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:05:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:05:17,579][49750] Updated weights for policy 0, policy_version 303941 (0.0030) [2024-04-27 00:05:20,530][49750] Updated weights for policy 0, policy_version 303951 (0.0029) [2024-04-27 00:05:22,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 4979982336. Throughput: 0: 50867.7. Samples: 2732781480. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:05:22,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 00:05:24,110][49750] Updated weights for policy 0, policy_version 303961 (0.0030) [2024-04-27 00:05:26,856][49750] Updated weights for policy 0, policy_version 303971 (0.0032) [2024-04-27 00:05:27,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51609.6, 300 sec: 51040.3). Total num frames: 4980260864. Throughput: 0: 50879.1. Samples: 2733089620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:05:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 00:05:30,615][49750] Updated weights for policy 0, policy_version 303981 (0.0033) [2024-04-27 00:05:32,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4980473856. Throughput: 0: 50760.6. Samples: 2733391620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:05:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 00:05:33,375][49750] Updated weights for policy 0, policy_version 303991 (0.0034) [2024-04-27 00:05:36,983][49750] Updated weights for policy 0, policy_version 304001 (0.0032) [2024-04-27 00:05:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51336.5, 300 sec: 50762.7). Total num frames: 4980752384. Throughput: 0: 50727.1. Samples: 2733544980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:05:37,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-27 00:05:40,147][49750] Updated weights for policy 0, policy_version 304011 (0.0033) [2024-04-27 00:05:42,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 4980998144. Throughput: 0: 50815.9. Samples: 2733846360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 00:05:42,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 00:05:43,141][49728] Signal inference workers to stop experience collection... 
(41150 times) [2024-04-27 00:05:43,185][49750] InferenceWorker_p0-w0: stopping experience collection (41150 times) [2024-04-27 00:05:43,197][49728] Signal inference workers to resume experience collection... (41150 times) [2024-04-27 00:05:43,206][49750] InferenceWorker_p0-w0: resuming experience collection (41150 times) [2024-04-27 00:05:43,329][49750] Updated weights for policy 0, policy_version 304021 (0.0031) [2024-04-27 00:05:46,528][49750] Updated weights for policy 0, policy_version 304031 (0.0029) [2024-04-27 00:05:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.9, 300 sec: 50818.2). Total num frames: 4981243904. Throughput: 0: 50852.4. Samples: 2734149440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:05:47,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 00:05:49,884][49750] Updated weights for policy 0, policy_version 304041 (0.0030) [2024-04-27 00:05:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4981506048. Throughput: 0: 50762.6. Samples: 2734303580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:05:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 00:05:52,878][49750] Updated weights for policy 0, policy_version 304051 (0.0031) [2024-04-27 00:05:56,309][49750] Updated weights for policy 0, policy_version 304061 (0.0028) [2024-04-27 00:05:57,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4981751808. Throughput: 0: 50671.4. Samples: 2734606260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:05:57,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 00:05:59,193][49750] Updated weights for policy 0, policy_version 304071 (0.0031) [2024-04-27 00:06:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4982013952. Throughput: 0: 50825.3. Samples: 2734914140. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:02,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 00:06:02,824][49750] Updated weights for policy 0, policy_version 304081 (0.0034) [2024-04-27 00:06:05,656][49750] Updated weights for policy 0, policy_version 304091 (0.0031) [2024-04-27 00:06:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4982243328. Throughput: 0: 50816.3. Samples: 2735068200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:07,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 00:06:09,181][49750] Updated weights for policy 0, policy_version 304101 (0.0035) [2024-04-27 00:06:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.8, 300 sec: 50873.7). Total num frames: 4982538240. Throughput: 0: 50724.1. Samples: 2735372200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:12,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 00:06:12,369][49750] Updated weights for policy 0, policy_version 304111 (0.0028) [2024-04-27 00:06:16,076][49750] Updated weights for policy 0, policy_version 304121 (0.0042) [2024-04-27 00:06:17,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4982767616. Throughput: 0: 50712.0. Samples: 2735673660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:17,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 00:06:18,940][49750] Updated weights for policy 0, policy_version 304131 (0.0031) [2024-04-27 00:06:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 4983029760. Throughput: 0: 50721.7. Samples: 2735827460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:22,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 00:06:22,589][49750] Updated weights for policy 0, policy_version 304141 (0.0030) [2024-04-27 00:06:25,402][49750] Updated weights for policy 0, policy_version 304151 (0.0029) [2024-04-27 00:06:27,062][49517] Fps is (10 sec: 49152.6, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 4983259136. Throughput: 0: 50627.3. Samples: 2736124580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:27,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 00:06:28,835][49750] Updated weights for policy 0, policy_version 304161 (0.0031) [2024-04-27 00:06:32,023][49750] Updated weights for policy 0, policy_version 304171 (0.0035) [2024-04-27 00:06:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4983537664. Throughput: 0: 50666.7. Samples: 2736429440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 00:06:35,128][49750] Updated weights for policy 0, policy_version 304181 (0.0038) [2024-04-27 00:06:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 4983783424. Throughput: 0: 50715.6. Samples: 2736585780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:37,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-27 00:06:38,404][49750] Updated weights for policy 0, policy_version 304191 (0.0032) [2024-04-27 00:06:41,500][49750] Updated weights for policy 0, policy_version 304201 (0.0025) [2024-04-27 00:06:42,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4984045568. Throughput: 0: 50821.8. Samples: 2736893240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 00:06:42,093][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000304203_4984061952.pth... [2024-04-27 00:06:42,137][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000303457_4971839488.pth [2024-04-27 00:06:45,038][49750] Updated weights for policy 0, policy_version 304211 (0.0035) [2024-04-27 00:06:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4984291328. Throughput: 0: 50678.7. Samples: 2737194680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:47,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 00:06:47,938][49750] Updated weights for policy 0, policy_version 304221 (0.0033) [2024-04-27 00:06:51,523][49750] Updated weights for policy 0, policy_version 304231 (0.0028) [2024-04-27 00:06:52,063][49517] Fps is (10 sec: 50789.2, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 4984553472. Throughput: 0: 50677.9. Samples: 2737348720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:52,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 00:06:54,241][49750] Updated weights for policy 0, policy_version 304241 (0.0028) [2024-04-27 00:06:57,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4984799232. Throughput: 0: 50669.2. Samples: 2737652320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:06:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 00:06:57,841][49750] Updated weights for policy 0, policy_version 304251 (0.0031) [2024-04-27 00:06:59,141][49728] Signal inference workers to stop experience collection... (41200 times) [2024-04-27 00:06:59,185][49750] InferenceWorker_p0-w0: stopping experience collection (41200 times) [2024-04-27 00:06:59,200][49728] Signal inference workers to resume experience collection... 
(41200 times) [2024-04-27 00:06:59,211][49750] InferenceWorker_p0-w0: resuming experience collection (41200 times) [2024-04-27 00:07:00,528][49750] Updated weights for policy 0, policy_version 304261 (0.0028) [2024-04-27 00:07:02,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4985061376. Throughput: 0: 50731.2. Samples: 2737956560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:02,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:07:04,206][49750] Updated weights for policy 0, policy_version 304271 (0.0031) [2024-04-27 00:07:07,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 4985323520. Throughput: 0: 50869.3. Samples: 2738116580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:07,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 00:07:07,072][49750] Updated weights for policy 0, policy_version 304281 (0.0029) [2024-04-27 00:07:10,597][49750] Updated weights for policy 0, policy_version 304291 (0.0034) [2024-04-27 00:07:12,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 4985552896. Throughput: 0: 50920.8. Samples: 2738416020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:12,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 00:07:13,535][49750] Updated weights for policy 0, policy_version 304301 (0.0036) [2024-04-27 00:07:16,921][49750] Updated weights for policy 0, policy_version 304311 (0.0031) [2024-04-27 00:07:17,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 4985831424. Throughput: 0: 50920.7. Samples: 2738720880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:17,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 00:07:20,180][49750] Updated weights for policy 0, policy_version 304321 (0.0034) [2024-04-27 00:07:22,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 4986077184. Throughput: 0: 50793.3. Samples: 2738871480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:22,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 00:07:23,451][49750] Updated weights for policy 0, policy_version 304331 (0.0029) [2024-04-27 00:07:26,488][49750] Updated weights for policy 0, policy_version 304341 (0.0028) [2024-04-27 00:07:27,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 4986339328. Throughput: 0: 50835.4. Samples: 2739180840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:27,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 00:07:29,915][49750] Updated weights for policy 0, policy_version 304351 (0.0029) [2024-04-27 00:07:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4986585088. Throughput: 0: 50975.6. Samples: 2739488580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 00:07:33,011][49750] Updated weights for policy 0, policy_version 304361 (0.0029) [2024-04-27 00:07:36,598][49750] Updated weights for policy 0, policy_version 304371 (0.0029) [2024-04-27 00:07:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4986830848. Throughput: 0: 50864.2. Samples: 2739637600. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:37,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 00:07:39,587][49750] Updated weights for policy 0, policy_version 304381 (0.0034) [2024-04-27 00:07:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 4987076608. Throughput: 0: 50727.2. Samples: 2739935040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:42,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-27 00:07:43,120][49750] Updated weights for policy 0, policy_version 304391 (0.0030) [2024-04-27 00:07:45,893][49750] Updated weights for policy 0, policy_version 304401 (0.0026) [2024-04-27 00:07:47,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4987338752. Throughput: 0: 50751.3. Samples: 2740240380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:47,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-27 00:07:49,566][49750] Updated weights for policy 0, policy_version 304411 (0.0030) [2024-04-27 00:07:52,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4987617280. Throughput: 0: 50759.6. Samples: 2740400760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 00:07:52,231][49750] Updated weights for policy 0, policy_version 304421 (0.0031) [2024-04-27 00:07:56,157][49750] Updated weights for policy 0, policy_version 304431 (0.0032) [2024-04-27 00:07:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4987830272. Throughput: 0: 50834.6. Samples: 2740703580. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:07:57,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 00:07:58,763][49750] Updated weights for policy 0, policy_version 304441 (0.0034) [2024-04-27 00:08:02,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4988092416. Throughput: 0: 50716.8. Samples: 2741003140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:08:02,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 00:08:02,596][49750] Updated weights for policy 0, policy_version 304451 (0.0031) [2024-04-27 00:08:05,322][49750] Updated weights for policy 0, policy_version 304461 (0.0031) [2024-04-27 00:08:07,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4988370944. Throughput: 0: 50675.5. Samples: 2741151880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:08:07,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 00:08:08,835][49750] Updated weights for policy 0, policy_version 304471 (0.0030) [2024-04-27 00:08:10,367][49728] Signal inference workers to stop experience collection... (41250 times) [2024-04-27 00:08:10,408][49750] InferenceWorker_p0-w0: stopping experience collection (41250 times) [2024-04-27 00:08:10,427][49728] Signal inference workers to resume experience collection... (41250 times) [2024-04-27 00:08:10,429][49750] InferenceWorker_p0-w0: resuming experience collection (41250 times) [2024-04-27 00:08:11,642][49750] Updated weights for policy 0, policy_version 304481 (0.0028) [2024-04-27 00:08:12,063][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 4988616704. Throughput: 0: 50773.8. Samples: 2741465660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 00:08:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:08:15,303][49750] Updated weights for policy 0, policy_version 304491 (0.0033) [2024-04-27 00:08:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 4988878848. Throughput: 0: 50900.9. Samples: 2741779120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:08:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 00:08:17,965][49750] Updated weights for policy 0, policy_version 304501 (0.0032) [2024-04-27 00:08:21,677][49750] Updated weights for policy 0, policy_version 304511 (0.0032) [2024-04-27 00:08:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4989108224. Throughput: 0: 50827.5. Samples: 2741924840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:08:22,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 00:08:24,497][49750] Updated weights for policy 0, policy_version 304521 (0.0035) [2024-04-27 00:08:27,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4989370368. Throughput: 0: 50931.4. Samples: 2742226960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:08:27,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 00:08:28,272][49750] Updated weights for policy 0, policy_version 304531 (0.0038) [2024-04-27 00:08:30,945][49750] Updated weights for policy 0, policy_version 304541 (0.0024) [2024-04-27 00:08:32,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 4989616128. Throughput: 0: 50771.8. Samples: 2742525100. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:08:32,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 00:08:34,541][49750] Updated weights for policy 0, policy_version 304551 (0.0031) [2024-04-27 00:08:37,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 4989894656. Throughput: 0: 50819.1. Samples: 2742687620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:08:37,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-27 00:08:37,310][49750] Updated weights for policy 0, policy_version 304561 (0.0031) [2024-04-27 00:08:40,968][49750] Updated weights for policy 0, policy_version 304571 (0.0039) [2024-04-27 00:08:42,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4990140416. Throughput: 0: 50827.6. Samples: 2742990820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:08:42,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 00:08:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000304574_4990140416.pth... [2024-04-27 00:08:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000303831_4977967104.pth [2024-04-27 00:08:43,654][49750] Updated weights for policy 0, policy_version 304581 (0.0033) [2024-04-27 00:08:47,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 4990386176. Throughput: 0: 50976.8. Samples: 2743297100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:08:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 00:08:47,446][49750] Updated weights for policy 0, policy_version 304591 (0.0030) [2024-04-27 00:08:50,212][49750] Updated weights for policy 0, policy_version 304601 (0.0029) [2024-04-27 00:08:52,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 4990648320. Throughput: 0: 50955.5. Samples: 2743444880. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:08:52,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 00:08:54,018][49750] Updated weights for policy 0, policy_version 304611 (0.0030) [2024-04-27 00:08:56,696][49750] Updated weights for policy 0, policy_version 304621 (0.0030) [2024-04-27 00:08:57,062][49517] Fps is (10 sec: 52430.4, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4990910464. Throughput: 0: 50765.5. Samples: 2743750100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:08:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 00:09:00,604][49750] Updated weights for policy 0, policy_version 304631 (0.0036) [2024-04-27 00:09:02,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4991156224. Throughput: 0: 50538.2. Samples: 2744053340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:09:02,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 00:09:03,162][49750] Updated weights for policy 0, policy_version 304641 (0.0029) [2024-04-27 00:09:06,882][49750] Updated weights for policy 0, policy_version 304651 (0.0027) [2024-04-27 00:09:07,063][49517] Fps is (10 sec: 49150.7, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 4991401984. Throughput: 0: 50554.1. Samples: 2744199780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:09:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 00:09:09,656][49750] Updated weights for policy 0, policy_version 304661 (0.0036) [2024-04-27 00:09:10,546][49728] Signal inference workers to stop experience collection... (41300 times) [2024-04-27 00:09:10,546][49728] Signal inference workers to resume experience collection... 
(41300 times) [2024-04-27 00:09:10,559][49750] InferenceWorker_p0-w0: stopping experience collection (41300 times) [2024-04-27 00:09:10,559][49750] InferenceWorker_p0-w0: resuming experience collection (41300 times) [2024-04-27 00:09:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4991647744. Throughput: 0: 50796.5. Samples: 2744512800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:09:12,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 00:09:13,350][49750] Updated weights for policy 0, policy_version 304671 (0.0035) [2024-04-27 00:09:16,086][49750] Updated weights for policy 0, policy_version 304681 (0.0026) [2024-04-27 00:09:17,062][49517] Fps is (10 sec: 52430.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4991926272. Throughput: 0: 50885.3. Samples: 2744814940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:09:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:09:19,773][49750] Updated weights for policy 0, policy_version 304691 (0.0031) [2024-04-27 00:09:22,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 4992188416. Throughput: 0: 50625.3. Samples: 2744965760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:09:22,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 00:09:22,885][49750] Updated weights for policy 0, policy_version 304701 (0.0030) [2024-04-27 00:09:26,028][49750] Updated weights for policy 0, policy_version 304711 (0.0029) [2024-04-27 00:09:27,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 4992401408. Throughput: 0: 50695.6. Samples: 2745272120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:09:27,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 00:09:29,396][49750] Updated weights for policy 0, policy_version 304721 (0.0034) [2024-04-27 00:09:32,062][49517] Fps is (10 sec: 49152.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4992679936. Throughput: 0: 50874.1. Samples: 2745586420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:09:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 00:09:32,345][49750] Updated weights for policy 0, policy_version 304731 (0.0036) [2024-04-27 00:09:35,840][49750] Updated weights for policy 0, policy_version 304741 (0.0033) [2024-04-27 00:09:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 4992925696. Throughput: 0: 50765.0. Samples: 2745729300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:09:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 00:09:39,189][49750] Updated weights for policy 0, policy_version 304751 (0.0030) [2024-04-27 00:09:42,063][49517] Fps is (10 sec: 49150.6, 60 sec: 50517.1, 300 sec: 50762.7). Total num frames: 4993171456. Throughput: 0: 50718.4. Samples: 2746032440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:09:42,063][49517] Avg episode reward: [(0, '0.703')] [2024-04-27 00:09:42,280][49750] Updated weights for policy 0, policy_version 304761 (0.0036) [2024-04-27 00:09:45,567][49750] Updated weights for policy 0, policy_version 304771 (0.0029) [2024-04-27 00:09:47,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 4993449984. Throughput: 0: 50896.8. Samples: 2746343700. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:09:47,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-27 00:09:48,751][49750] Updated weights for policy 0, policy_version 304781 (0.0037) [2024-04-27 00:09:51,904][49750] Updated weights for policy 0, policy_version 304791 (0.0028) [2024-04-27 00:09:52,063][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 4993695744. Throughput: 0: 50796.5. Samples: 2746485620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:09:52,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 00:09:55,201][49750] Updated weights for policy 0, policy_version 304801 (0.0031) [2024-04-27 00:09:57,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 4993925120. Throughput: 0: 50703.1. Samples: 2746794440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:09:57,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:09:58,296][49750] Updated weights for policy 0, policy_version 304811 (0.0025) [2024-04-27 00:10:01,554][49750] Updated weights for policy 0, policy_version 304821 (0.0031) [2024-04-27 00:10:02,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 4994220032. Throughput: 0: 50809.2. Samples: 2747101360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:10:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 00:10:04,787][49750] Updated weights for policy 0, policy_version 304831 (0.0037) [2024-04-27 00:10:05,937][49728] Signal inference workers to stop experience collection... (41350 times) [2024-04-27 00:10:05,988][49750] InferenceWorker_p0-w0: stopping experience collection (41350 times) [2024-04-27 00:10:06,005][49728] Signal inference workers to resume experience collection... 
(41350 times) [2024-04-27 00:10:06,007][49750] InferenceWorker_p0-w0: resuming experience collection (41350 times) [2024-04-27 00:10:07,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51336.7, 300 sec: 50929.3). Total num frames: 4994482176. Throughput: 0: 50895.2. Samples: 2747256040. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:10:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 00:10:08,019][49750] Updated weights for policy 0, policy_version 304841 (0.0028) [2024-04-27 00:10:11,276][49750] Updated weights for policy 0, policy_version 304851 (0.0033) [2024-04-27 00:10:12,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 4994695168. Throughput: 0: 50870.9. Samples: 2747561320. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:10:12,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 00:10:14,394][49750] Updated weights for policy 0, policy_version 304861 (0.0027) [2024-04-27 00:10:17,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4994973696. Throughput: 0: 50860.7. Samples: 2747875160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:10:17,073][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 00:10:17,550][49750] Updated weights for policy 0, policy_version 304871 (0.0031) [2024-04-27 00:10:20,837][49750] Updated weights for policy 0, policy_version 304881 (0.0030) [2024-04-27 00:10:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 4995219456. Throughput: 0: 50852.0. Samples: 2748017640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:10:22,071][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 00:10:23,928][49750] Updated weights for policy 0, policy_version 304891 (0.0032) [2024-04-27 00:10:27,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 4995481600. Throughput: 0: 50839.4. Samples: 2748320200. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:10:27,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 00:10:27,205][49750] Updated weights for policy 0, policy_version 304901 (0.0039) [2024-04-27 00:10:30,341][49750] Updated weights for policy 0, policy_version 304911 (0.0032) [2024-04-27 00:10:32,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 4995743744. Throughput: 0: 50767.6. Samples: 2748628240. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:10:32,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 00:10:33,632][49750] Updated weights for policy 0, policy_version 304921 (0.0044) [2024-04-27 00:10:36,844][49750] Updated weights for policy 0, policy_version 304931 (0.0033) [2024-04-27 00:10:37,062][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 4995989504. Throughput: 0: 51069.0. Samples: 2748783720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-27 00:10:37,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 00:10:40,068][49750] Updated weights for policy 0, policy_version 304941 (0.0034) [2024-04-27 00:10:42,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 4996218880. Throughput: 0: 50982.3. Samples: 2749088640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:10:42,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 00:10:42,105][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000304946_4996235264.pth... [2024-04-27 00:10:42,151][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000304203_4984061952.pth [2024-04-27 00:10:43,335][49750] Updated weights for policy 0, policy_version 304951 (0.0030) [2024-04-27 00:10:46,418][49750] Updated weights for policy 0, policy_version 304961 (0.0038) [2024-04-27 00:10:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50818.2). 
Total num frames: 4996497408. Throughput: 0: 50813.9. Samples: 2749387980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:10:47,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 00:10:49,592][49750] Updated weights for policy 0, policy_version 304971 (0.0032) [2024-04-27 00:10:52,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4996743168. Throughput: 0: 50801.8. Samples: 2749542120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:10:52,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-27 00:10:52,726][49750] Updated weights for policy 0, policy_version 304981 (0.0029) [2024-04-27 00:10:56,633][49750] Updated weights for policy 0, policy_version 304991 (0.0029) [2024-04-27 00:10:57,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 4997005312. Throughput: 0: 50824.5. Samples: 2749848420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:10:57,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 00:10:59,134][49728] Signal inference workers to stop experience collection... (41400 times) [2024-04-27 00:10:59,134][49728] Signal inference workers to resume experience collection... (41400 times) [2024-04-27 00:10:59,148][49750] InferenceWorker_p0-w0: stopping experience collection (41400 times) [2024-04-27 00:10:59,148][49750] InferenceWorker_p0-w0: resuming experience collection (41400 times) [2024-04-27 00:10:59,271][49750] Updated weights for policy 0, policy_version 305001 (0.0033) [2024-04-27 00:11:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 4997267456. Throughput: 0: 50705.4. Samples: 2750156900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:02,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 00:11:03,161][49750] Updated weights for policy 0, policy_version 305011 (0.0033) [2024-04-27 00:11:06,189][49750] Updated weights for policy 0, policy_version 305021 (0.0034) [2024-04-27 00:11:07,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 4997496832. Throughput: 0: 50815.0. Samples: 2750304320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:11:09,453][49750] Updated weights for policy 0, policy_version 305031 (0.0036) [2024-04-27 00:11:12,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 4997775360. Throughput: 0: 50959.1. Samples: 2750613360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:12,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 00:11:12,557][49750] Updated weights for policy 0, policy_version 305041 (0.0039) [2024-04-27 00:11:15,740][49750] Updated weights for policy 0, policy_version 305051 (0.0032) [2024-04-27 00:11:17,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 4998021120. Throughput: 0: 50856.1. Samples: 2750916760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:17,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-27 00:11:18,976][49750] Updated weights for policy 0, policy_version 305061 (0.0031) [2024-04-27 00:11:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 4998266880. Throughput: 0: 50868.5. Samples: 2751072800. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:22,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-27 00:11:22,086][49750] Updated weights for policy 0, policy_version 305071 (0.0029) [2024-04-27 00:11:25,507][49750] Updated weights for policy 0, policy_version 305081 (0.0034) [2024-04-27 00:11:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 4998529024. Throughput: 0: 50867.1. Samples: 2751377660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:27,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-27 00:11:28,536][49750] Updated weights for policy 0, policy_version 305091 (0.0030) [2024-04-27 00:11:31,846][49750] Updated weights for policy 0, policy_version 305101 (0.0031) [2024-04-27 00:11:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 4998774784. Throughput: 0: 51019.0. Samples: 2751683840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:32,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-27 00:11:34,905][49750] Updated weights for policy 0, policy_version 305111 (0.0031) [2024-04-27 00:11:37,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 4999036928. Throughput: 0: 50927.9. Samples: 2751833880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:37,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 00:11:38,174][49750] Updated weights for policy 0, policy_version 305121 (0.0033) [2024-04-27 00:11:41,213][49750] Updated weights for policy 0, policy_version 305131 (0.0028) [2024-04-27 00:11:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 4999282688. Throughput: 0: 50916.9. Samples: 2752139680. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:42,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-27 00:11:44,684][49750] Updated weights for policy 0, policy_version 305141 (0.0029) [2024-04-27 00:11:47,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 4999528448. Throughput: 0: 50749.0. Samples: 2752440600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:47,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-27 00:11:47,689][49750] Updated weights for policy 0, policy_version 305151 (0.0025) [2024-04-27 00:11:51,174][49750] Updated weights for policy 0, policy_version 305161 (0.0032) [2024-04-27 00:11:52,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 4999774208. Throughput: 0: 50829.3. Samples: 2752591640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:11:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 00:11:54,246][49750] Updated weights for policy 0, policy_version 305171 (0.0032) [2024-04-27 00:11:57,067][49517] Fps is (10 sec: 52406.8, 60 sec: 50786.9, 300 sec: 50817.4). Total num frames: 5000052736. Throughput: 0: 50771.7. Samples: 2752898300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:11:57,067][49517] Avg episode reward: [(0, '0.685')] [2024-04-27 00:11:57,523][49750] Updated weights for policy 0, policy_version 305181 (0.0032) [2024-04-27 00:12:00,616][49750] Updated weights for policy 0, policy_version 305191 (0.0036) [2024-04-27 00:12:02,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5000314880. Throughput: 0: 50773.4. Samples: 2753201560. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:02,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-27 00:12:04,153][49750] Updated weights for policy 0, policy_version 305201 (0.0029) [2024-04-27 00:12:07,062][49517] Fps is (10 sec: 50811.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5000560640. Throughput: 0: 50808.5. Samples: 2753359180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:07,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 00:12:07,094][49750] Updated weights for policy 0, policy_version 305211 (0.0033) [2024-04-27 00:12:10,708][49750] Updated weights for policy 0, policy_version 305221 (0.0029) [2024-04-27 00:12:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 5000806400. Throughput: 0: 50697.0. Samples: 2753659020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:12,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-27 00:12:12,698][49728] Signal inference workers to stop experience collection... (41450 times) [2024-04-27 00:12:12,698][49728] Signal inference workers to resume experience collection... (41450 times) [2024-04-27 00:12:12,711][49750] InferenceWorker_p0-w0: stopping experience collection (41450 times) [2024-04-27 00:12:12,712][49750] InferenceWorker_p0-w0: resuming experience collection (41450 times) [2024-04-27 00:12:13,701][49750] Updated weights for policy 0, policy_version 305231 (0.0032) [2024-04-27 00:12:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5001052160. Throughput: 0: 50684.1. Samples: 2753964620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:17,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 00:12:17,157][49750] Updated weights for policy 0, policy_version 305241 (0.0026) [2024-04-27 00:12:20,087][49750] Updated weights for policy 0, policy_version 305251 (0.0033) [2024-04-27 00:12:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5001314304. Throughput: 0: 50697.9. Samples: 2754115280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:22,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 00:12:23,497][49750] Updated weights for policy 0, policy_version 305261 (0.0029) [2024-04-27 00:12:26,510][49750] Updated weights for policy 0, policy_version 305271 (0.0039) [2024-04-27 00:12:27,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5001592832. Throughput: 0: 50881.2. Samples: 2754429340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:12:29,874][49750] Updated weights for policy 0, policy_version 305281 (0.0032) [2024-04-27 00:12:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5001822208. Throughput: 0: 50968.4. Samples: 2754734180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:32,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 00:12:33,054][49750] Updated weights for policy 0, policy_version 305291 (0.0030) [2024-04-27 00:12:36,510][49750] Updated weights for policy 0, policy_version 305301 (0.0031) [2024-04-27 00:12:37,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5002084352. Throughput: 0: 50800.4. Samples: 2754877660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:37,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 00:12:39,462][49750] Updated weights for policy 0, policy_version 305311 (0.0031) [2024-04-27 00:12:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5002330112. Throughput: 0: 50800.2. Samples: 2755184100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:42,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 00:12:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000305318_5002330112.pth... [2024-04-27 00:12:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000304574_4990140416.pth [2024-04-27 00:12:42,975][49750] Updated weights for policy 0, policy_version 305321 (0.0030) [2024-04-27 00:12:45,987][49750] Updated weights for policy 0, policy_version 305331 (0.0031) [2024-04-27 00:12:47,062][49517] Fps is (10 sec: 50791.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5002592256. Throughput: 0: 50812.9. Samples: 2755488140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:47,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 00:12:49,303][49750] Updated weights for policy 0, policy_version 305341 (0.0034) [2024-04-27 00:12:52,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5002854400. Throughput: 0: 50762.7. Samples: 2755643500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:52,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 00:12:52,455][49750] Updated weights for policy 0, policy_version 305351 (0.0034) [2024-04-27 00:12:55,755][49750] Updated weights for policy 0, policy_version 305361 (0.0036) [2024-04-27 00:12:57,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50520.8, 300 sec: 50818.2). Total num frames: 5003083776. Throughput: 0: 50926.1. Samples: 2755950700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:12:57,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-27 00:12:58,789][49750] Updated weights for policy 0, policy_version 305371 (0.0039) [2024-04-27 00:13:02,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5003329536. Throughput: 0: 50877.9. Samples: 2756254120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:13:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 00:13:02,277][49750] Updated weights for policy 0, policy_version 305381 (0.0030) [2024-04-27 00:13:05,334][49750] Updated weights for policy 0, policy_version 305391 (0.0029) [2024-04-27 00:13:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5003608064. Throughput: 0: 50773.7. Samples: 2756400100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 00:13:07,064][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 00:13:08,785][49750] Updated weights for policy 0, policy_version 305401 (0.0034) [2024-04-27 00:13:11,786][49750] Updated weights for policy 0, policy_version 305411 (0.0032) [2024-04-27 00:13:12,063][49517] Fps is (10 sec: 52427.4, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 5003853824. Throughput: 0: 50588.8. Samples: 2756705840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:12,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 00:13:15,105][49750] Updated weights for policy 0, policy_version 305421 (0.0028) [2024-04-27 00:13:17,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5004099584. Throughput: 0: 50643.6. Samples: 2757013140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:17,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 00:13:18,281][49750] Updated weights for policy 0, policy_version 305431 (0.0029) [2024-04-27 00:13:21,634][49728] Signal inference workers to stop experience collection... (41500 times) [2024-04-27 00:13:21,683][49750] InferenceWorker_p0-w0: stopping experience collection (41500 times) [2024-04-27 00:13:21,704][49728] Signal inference workers to resume experience collection... (41500 times) [2024-04-27 00:13:21,705][49750] InferenceWorker_p0-w0: resuming experience collection (41500 times) [2024-04-27 00:13:21,709][49750] Updated weights for policy 0, policy_version 305441 (0.0034) [2024-04-27 00:13:22,062][49517] Fps is (10 sec: 50791.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5004361728. Throughput: 0: 50656.2. Samples: 2757157180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:22,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 00:13:24,717][49750] Updated weights for policy 0, policy_version 305451 (0.0034) [2024-04-27 00:13:27,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 5004607488. Throughput: 0: 50629.0. Samples: 2757462400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 00:13:27,975][49750] Updated weights for policy 0, policy_version 305461 (0.0032) [2024-04-27 00:13:31,300][49750] Updated weights for policy 0, policy_version 305471 (0.0031) [2024-04-27 00:13:32,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5004886016. Throughput: 0: 50601.6. Samples: 2757765220. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:32,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 00:13:34,295][49750] Updated weights for policy 0, policy_version 305481 (0.0028) [2024-04-27 00:13:37,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5005115392. Throughput: 0: 50592.3. Samples: 2757920160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:37,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-27 00:13:37,578][49750] Updated weights for policy 0, policy_version 305491 (0.0031) [2024-04-27 00:13:41,068][49750] Updated weights for policy 0, policy_version 305501 (0.0040) [2024-04-27 00:13:42,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50873.8). Total num frames: 5005393920. Throughput: 0: 50646.3. Samples: 2758229780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:42,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 00:13:44,000][49750] Updated weights for policy 0, policy_version 305511 (0.0033) [2024-04-27 00:13:47,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 5005623296. Throughput: 0: 50682.7. Samples: 2758534840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:47,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 00:13:47,422][49750] Updated weights for policy 0, policy_version 305521 (0.0030) [2024-04-27 00:13:50,529][49750] Updated weights for policy 0, policy_version 305531 (0.0029) [2024-04-27 00:13:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5005885440. Throughput: 0: 50839.9. Samples: 2758687900. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:52,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-27 00:13:53,710][49750] Updated weights for policy 0, policy_version 305541 (0.0032) [2024-04-27 00:13:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5006147584. Throughput: 0: 50609.7. Samples: 2758983260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:13:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 00:13:57,069][49750] Updated weights for policy 0, policy_version 305551 (0.0034) [2024-04-27 00:14:00,196][49750] Updated weights for policy 0, policy_version 305561 (0.0033) [2024-04-27 00:14:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5006393344. Throughput: 0: 50617.6. Samples: 2759290940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:14:02,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 00:14:03,380][49750] Updated weights for policy 0, policy_version 305571 (0.0033) [2024-04-27 00:14:06,822][49750] Updated weights for policy 0, policy_version 305581 (0.0030) [2024-04-27 00:14:07,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5006639104. Throughput: 0: 50774.2. Samples: 2759442020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:14:07,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-27 00:14:09,810][49750] Updated weights for policy 0, policy_version 305591 (0.0027) [2024-04-27 00:14:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5006884864. Throughput: 0: 50847.5. Samples: 2759750540. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:14:12,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:14:13,276][49750] Updated weights for policy 0, policy_version 305601 (0.0032) [2024-04-27 00:14:16,601][49750] Updated weights for policy 0, policy_version 305611 (0.0028) [2024-04-27 00:14:17,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5007163392. Throughput: 0: 50929.3. Samples: 2760057040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:14:17,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 00:14:19,791][49750] Updated weights for policy 0, policy_version 305621 (0.0028) [2024-04-27 00:14:22,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5007409152. Throughput: 0: 50783.2. Samples: 2760205400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 00:14:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 00:14:22,927][49750] Updated weights for policy 0, policy_version 305631 (0.0031) [2024-04-27 00:14:26,047][49750] Updated weights for policy 0, policy_version 305641 (0.0029) [2024-04-27 00:14:27,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5007671296. Throughput: 0: 50773.4. Samples: 2760514580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:14:27,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 00:14:29,227][49750] Updated weights for policy 0, policy_version 305651 (0.0035) [2024-04-27 00:14:32,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 5007917056. Throughput: 0: 50755.3. Samples: 2760818840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:14:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 00:14:32,407][49750] Updated weights for policy 0, policy_version 305661 (0.0036) [2024-04-27 00:14:34,783][49728] Signal inference workers to stop experience collection... (41550 times) [2024-04-27 00:14:34,783][49728] Signal inference workers to resume experience collection... (41550 times) [2024-04-27 00:14:34,796][49750] InferenceWorker_p0-w0: stopping experience collection (41550 times) [2024-04-27 00:14:34,813][49750] InferenceWorker_p0-w0: resuming experience collection (41550 times) [2024-04-27 00:14:35,860][49750] Updated weights for policy 0, policy_version 305671 (0.0030) [2024-04-27 00:14:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.6, 300 sec: 50873.8). Total num frames: 5008179200. Throughput: 0: 50958.8. Samples: 2760981040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:14:37,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-27 00:14:38,853][49750] Updated weights for policy 0, policy_version 305681 (0.0029) [2024-04-27 00:14:42,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 5008424960. Throughput: 0: 51013.4. Samples: 2761278880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:14:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 00:14:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000305690_5008424960.pth... [2024-04-27 00:14:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000304946_4996235264.pth [2024-04-27 00:14:42,340][49750] Updated weights for policy 0, policy_version 305691 (0.0029) [2024-04-27 00:14:45,282][49750] Updated weights for policy 0, policy_version 305701 (0.0033) [2024-04-27 00:14:47,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5008670720. Throughput: 0: 50893.7. Samples: 2761581160. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:14:47,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 00:14:48,750][49750] Updated weights for policy 0, policy_version 305711 (0.0035) [2024-04-27 00:14:51,798][49750] Updated weights for policy 0, policy_version 305721 (0.0037) [2024-04-27 00:14:52,063][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5008949248. Throughput: 0: 50980.4. Samples: 2761736140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:14:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 00:14:55,089][49750] Updated weights for policy 0, policy_version 305731 (0.0032) [2024-04-27 00:14:57,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 5009195008. Throughput: 0: 50926.1. Samples: 2762042220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:14:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 00:14:58,154][49750] Updated weights for policy 0, policy_version 305741 (0.0027) [2024-04-27 00:15:01,389][49750] Updated weights for policy 0, policy_version 305751 (0.0028) [2024-04-27 00:15:02,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5009440768. Throughput: 0: 50848.6. Samples: 2762345220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:15:02,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 00:15:04,586][49750] Updated weights for policy 0, policy_version 305761 (0.0027) [2024-04-27 00:15:07,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5009686528. Throughput: 0: 50943.2. Samples: 2762497840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:15:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 00:15:07,863][49750] Updated weights for policy 0, policy_version 305771 (0.0034) [2024-04-27 00:15:11,132][49750] Updated weights for policy 0, policy_version 305781 (0.0028) [2024-04-27 00:15:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50762.7). Total num frames: 5009948672. Throughput: 0: 50860.5. Samples: 2762803300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:15:12,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 00:15:14,323][49750] Updated weights for policy 0, policy_version 305791 (0.0034) [2024-04-27 00:15:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5010210816. Throughput: 0: 50871.8. Samples: 2763108060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:15:17,067][49517] Avg episode reward: [(0, '0.682')] [2024-04-27 00:15:17,491][49750] Updated weights for policy 0, policy_version 305801 (0.0029) [2024-04-27 00:15:20,666][49750] Updated weights for policy 0, policy_version 305811 (0.0030) [2024-04-27 00:15:22,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5010472960. Throughput: 0: 50755.1. Samples: 2763265020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:15:22,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-27 00:15:23,846][49750] Updated weights for policy 0, policy_version 305821 (0.0028) [2024-04-27 00:15:27,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 5010718720. Throughput: 0: 50933.9. Samples: 2763570900. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:15:27,072][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 00:15:27,306][49750] Updated weights for policy 0, policy_version 305831 (0.0033) [2024-04-27 00:15:30,447][49750] Updated weights for policy 0, policy_version 305841 (0.0030) [2024-04-27 00:15:32,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5010948096. Throughput: 0: 50768.0. Samples: 2763865720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:15:32,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 00:15:33,875][49750] Updated weights for policy 0, policy_version 305851 (0.0033) [2024-04-27 00:15:36,903][49750] Updated weights for policy 0, policy_version 305861 (0.0027) [2024-04-27 00:15:37,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5011226624. Throughput: 0: 50751.5. Samples: 2764019960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 00:15:37,072][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 00:15:40,229][49750] Updated weights for policy 0, policy_version 305871 (0.0029) [2024-04-27 00:15:42,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.7, 300 sec: 50762.6). Total num frames: 5011472384. Throughput: 0: 50785.6. Samples: 2764327560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:15:42,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 00:15:43,275][49750] Updated weights for policy 0, policy_version 305881 (0.0050) [2024-04-27 00:15:46,590][49750] Updated weights for policy 0, policy_version 305891 (0.0032) [2024-04-27 00:15:47,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 5011718144. Throughput: 0: 50855.1. Samples: 2764633700. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:15:47,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:15:49,666][49750] Updated weights for policy 0, policy_version 305901 (0.0033) [2024-04-27 00:15:52,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5011963904. Throughput: 0: 50742.0. Samples: 2764781240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:15:52,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-27 00:15:53,152][49750] Updated weights for policy 0, policy_version 305911 (0.0028) [2024-04-27 00:15:53,759][49728] Signal inference workers to stop experience collection... (41600 times) [2024-04-27 00:15:53,800][49750] InferenceWorker_p0-w0: stopping experience collection (41600 times) [2024-04-27 00:15:53,834][49728] Signal inference workers to resume experience collection... (41600 times) [2024-04-27 00:15:53,835][49750] InferenceWorker_p0-w0: resuming experience collection (41600 times) [2024-04-27 00:15:56,321][49750] Updated weights for policy 0, policy_version 305921 (0.0027) [2024-04-27 00:15:57,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5012242432. Throughput: 0: 50699.5. Samples: 2765084780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:15:57,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-27 00:15:59,708][49750] Updated weights for policy 0, policy_version 305931 (0.0029) [2024-04-27 00:16:02,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5012488192. Throughput: 0: 50674.6. Samples: 2765388420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:02,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 00:16:02,901][49750] Updated weights for policy 0, policy_version 305941 (0.0034) [2024-04-27 00:16:06,005][49750] Updated weights for policy 0, policy_version 305951 (0.0029) [2024-04-27 00:16:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5012750336. Throughput: 0: 50596.8. Samples: 2765541880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 00:16:09,385][49750] Updated weights for policy 0, policy_version 305961 (0.0029) [2024-04-27 00:16:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5013012480. Throughput: 0: 50655.8. Samples: 2765850400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:12,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 00:16:12,311][49750] Updated weights for policy 0, policy_version 305971 (0.0028) [2024-04-27 00:16:15,921][49750] Updated weights for policy 0, policy_version 305981 (0.0031) [2024-04-27 00:16:17,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5013225472. Throughput: 0: 50862.9. Samples: 2766154540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:17,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 00:16:18,859][49750] Updated weights for policy 0, policy_version 305991 (0.0038) [2024-04-27 00:16:22,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5013487616. Throughput: 0: 50685.4. Samples: 2766300800. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:22,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-27 00:16:22,408][49750] Updated weights for policy 0, policy_version 306001 (0.0040) [2024-04-27 00:16:25,654][49750] Updated weights for policy 0, policy_version 306011 (0.0032) [2024-04-27 00:16:27,063][49517] Fps is (10 sec: 55704.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5013782528. Throughput: 0: 50590.9. Samples: 2766604160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:27,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-27 00:16:28,820][49750] Updated weights for policy 0, policy_version 306021 (0.0025) [2024-04-27 00:16:31,980][49750] Updated weights for policy 0, policy_version 306031 (0.0029) [2024-04-27 00:16:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5014011904. Throughput: 0: 50496.2. Samples: 2766906040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 00:16:35,089][49750] Updated weights for policy 0, policy_version 306041 (0.0032) [2024-04-27 00:16:37,063][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5014241280. Throughput: 0: 50555.6. Samples: 2767056240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:37,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-27 00:16:38,277][49750] Updated weights for policy 0, policy_version 306051 (0.0029) [2024-04-27 00:16:41,438][49750] Updated weights for policy 0, policy_version 306061 (0.0030) [2024-04-27 00:16:42,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5014519808. Throughput: 0: 50661.4. Samples: 2767364540. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:42,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 00:16:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000306062_5014519808.pth... [2024-04-27 00:16:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000305318_5002330112.pth [2024-04-27 00:16:44,713][49750] Updated weights for policy 0, policy_version 306071 (0.0034) [2024-04-27 00:16:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 5014765568. Throughput: 0: 50481.1. Samples: 2767660080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 00:16:47,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 00:16:48,061][49750] Updated weights for policy 0, policy_version 306081 (0.0030) [2024-04-27 00:16:51,197][49750] Updated weights for policy 0, policy_version 306091 (0.0034) [2024-04-27 00:16:52,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.6, 300 sec: 50763.3). Total num frames: 5015027712. Throughput: 0: 50750.7. Samples: 2767825660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:16:52,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 00:16:54,700][49750] Updated weights for policy 0, policy_version 306101 (0.0030) [2024-04-27 00:16:57,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5015273472. Throughput: 0: 50572.7. Samples: 2768126180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:16:57,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-27 00:16:57,549][49750] Updated weights for policy 0, policy_version 306111 (0.0029) [2024-04-27 00:17:01,004][49750] Updated weights for policy 0, policy_version 306121 (0.0038) [2024-04-27 00:17:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 5015535616. Throughput: 0: 50713.5. Samples: 2768436660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 00:17:03,597][49728] Signal inference workers to stop experience collection... (41650 times) [2024-04-27 00:17:03,647][49750] InferenceWorker_p0-w0: stopping experience collection (41650 times) [2024-04-27 00:17:03,663][49728] Signal inference workers to resume experience collection... (41650 times) [2024-04-27 00:17:03,670][49750] InferenceWorker_p0-w0: resuming experience collection (41650 times) [2024-04-27 00:17:04,009][49750] Updated weights for policy 0, policy_version 306131 (0.0031) [2024-04-27 00:17:07,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5015781376. Throughput: 0: 50632.4. Samples: 2768579260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:07,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:17:07,893][49750] Updated weights for policy 0, policy_version 306141 (0.0029) [2024-04-27 00:17:10,432][49750] Updated weights for policy 0, policy_version 306151 (0.0031) [2024-04-27 00:17:12,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5016059904. Throughput: 0: 50549.4. Samples: 2768878880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 00:17:14,368][49750] Updated weights for policy 0, policy_version 306161 (0.0028) [2024-04-27 00:17:16,869][49750] Updated weights for policy 0, policy_version 306171 (0.0032) [2024-04-27 00:17:17,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 5016305664. Throughput: 0: 50705.4. Samples: 2769187780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:17,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 00:17:20,908][49750] Updated weights for policy 0, policy_version 306181 (0.0032) [2024-04-27 00:17:22,063][49517] Fps is (10 sec: 45874.9, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 5016518656. Throughput: 0: 50711.5. Samples: 2769338260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 00:17:23,310][49750] Updated weights for policy 0, policy_version 306191 (0.0027) [2024-04-27 00:17:27,062][49517] Fps is (10 sec: 47514.1, 60 sec: 49971.3, 300 sec: 50707.1). Total num frames: 5016780800. Throughput: 0: 50641.3. Samples: 2769643400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:27,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 00:17:27,274][49750] Updated weights for policy 0, policy_version 306201 (0.0030) [2024-04-27 00:17:29,944][49750] Updated weights for policy 0, policy_version 306211 (0.0034) [2024-04-27 00:17:32,063][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5017059328. Throughput: 0: 50826.3. Samples: 2769947260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:32,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 00:17:33,593][49750] Updated weights for policy 0, policy_version 306221 (0.0035) [2024-04-27 00:17:36,355][49750] Updated weights for policy 0, policy_version 306231 (0.0032) [2024-04-27 00:17:37,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 5017337856. Throughput: 0: 50787.1. Samples: 2770111080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:37,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 00:17:40,000][49750] Updated weights for policy 0, policy_version 306241 (0.0031) [2024-04-27 00:17:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5017550848. Throughput: 0: 50839.6. Samples: 2770413960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 00:17:42,853][49750] Updated weights for policy 0, policy_version 306251 (0.0027) [2024-04-27 00:17:46,530][49750] Updated weights for policy 0, policy_version 306261 (0.0030) [2024-04-27 00:17:47,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 5017796608. Throughput: 0: 50792.3. Samples: 2770722300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:47,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:17:49,247][49750] Updated weights for policy 0, policy_version 306271 (0.0028) [2024-04-27 00:17:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5018058752. Throughput: 0: 50819.6. Samples: 2770866140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:52,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 00:17:52,961][49750] Updated weights for policy 0, policy_version 306281 (0.0031) [2024-04-27 00:17:55,668][49750] Updated weights for policy 0, policy_version 306291 (0.0034) [2024-04-27 00:17:57,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5018320896. Throughput: 0: 50995.6. Samples: 2771173680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:17:57,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 00:17:59,355][49750] Updated weights for policy 0, policy_version 306301 (0.0034) [2024-04-27 00:18:02,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5018599424. Throughput: 0: 50929.8. Samples: 2771479620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 00:18:02,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:18:02,063][49750] Updated weights for policy 0, policy_version 306311 (0.0029) [2024-04-27 00:18:06,019][49750] Updated weights for policy 0, policy_version 306321 (0.0032) [2024-04-27 00:18:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5018828800. Throughput: 0: 50963.1. Samples: 2771631600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:07,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-27 00:18:08,476][49750] Updated weights for policy 0, policy_version 306331 (0.0032) [2024-04-27 00:18:12,063][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 5019058176. Throughput: 0: 50833.2. Samples: 2771930900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:12,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 00:18:12,672][49750] Updated weights for policy 0, policy_version 306341 (0.0037) [2024-04-27 00:18:12,990][49728] Signal inference workers to stop experience collection... (41700 times) [2024-04-27 00:18:12,990][49728] Signal inference workers to resume experience collection... 
(41700 times) [2024-04-27 00:18:13,018][49750] InferenceWorker_p0-w0: stopping experience collection (41700 times) [2024-04-27 00:18:13,018][49750] InferenceWorker_p0-w0: resuming experience collection (41700 times) [2024-04-27 00:18:14,912][49750] Updated weights for policy 0, policy_version 306351 (0.0031) [2024-04-27 00:18:17,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5019353088. Throughput: 0: 50832.2. Samples: 2772234700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:18:19,260][49750] Updated weights for policy 0, policy_version 306361 (0.0034) [2024-04-27 00:18:21,347][49750] Updated weights for policy 0, policy_version 306371 (0.0034) [2024-04-27 00:18:22,062][49517] Fps is (10 sec: 55706.3, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 5019615232. Throughput: 0: 50810.7. Samples: 2772397560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:22,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 00:18:25,756][49750] Updated weights for policy 0, policy_version 306381 (0.0033) [2024-04-27 00:18:27,062][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 5019844608. Throughput: 0: 51044.0. Samples: 2772710940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:27,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-27 00:18:27,874][49750] Updated weights for policy 0, policy_version 306391 (0.0031) [2024-04-27 00:18:32,062][49517] Fps is (10 sec: 44236.8, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 5020057600. Throughput: 0: 50709.3. Samples: 2773004220. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 00:18:32,174][49750] Updated weights for policy 0, policy_version 306401 (0.0032) [2024-04-27 00:18:34,370][49750] Updated weights for policy 0, policy_version 306411 (0.0033) [2024-04-27 00:18:37,063][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 5020336128. Throughput: 0: 50553.8. Samples: 2773141060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:37,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:18:38,585][49750] Updated weights for policy 0, policy_version 306421 (0.0031) [2024-04-27 00:18:40,794][49750] Updated weights for policy 0, policy_version 306431 (0.0029) [2024-04-27 00:18:42,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5020614656. Throughput: 0: 50634.7. Samples: 2773452240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:42,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 00:18:42,117][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000306435_5020631040.pth... [2024-04-27 00:18:42,174][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000305690_5008424960.pth [2024-04-27 00:18:45,067][49750] Updated weights for policy 0, policy_version 306441 (0.0033) [2024-04-27 00:18:47,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 5020876800. Throughput: 0: 50666.2. Samples: 2773759600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:47,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 00:18:47,186][49750] Updated weights for policy 0, policy_version 306451 (0.0037) [2024-04-27 00:18:51,579][49750] Updated weights for policy 0, policy_version 306461 (0.0035) [2024-04-27 00:18:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50707.1). 
Total num frames: 5021106176. Throughput: 0: 50674.4. Samples: 2773911940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:52,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-27 00:18:53,524][49728] Signal inference workers to stop experience collection... (41750 times) [2024-04-27 00:18:53,525][49728] Signal inference workers to resume experience collection... (41750 times) [2024-04-27 00:18:53,555][49750] InferenceWorker_p0-w0: stopping experience collection (41750 times) [2024-04-27 00:18:53,555][49750] InferenceWorker_p0-w0: resuming experience collection (41750 times) [2024-04-27 00:18:53,655][49750] Updated weights for policy 0, policy_version 306471 (0.0034) [2024-04-27 00:18:57,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5021351936. Throughput: 0: 50524.5. Samples: 2774204500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:18:57,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 00:18:58,023][49750] Updated weights for policy 0, policy_version 306481 (0.0033) [2024-04-27 00:19:00,153][49750] Updated weights for policy 0, policy_version 306491 (0.0029) [2024-04-27 00:19:02,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5021614080. Throughput: 0: 50594.0. Samples: 2774511440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:19:02,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 00:19:04,344][49750] Updated weights for policy 0, policy_version 306501 (0.0031) [2024-04-27 00:19:06,526][49750] Updated weights for policy 0, policy_version 306511 (0.0027) [2024-04-27 00:19:07,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5021876224. Throughput: 0: 50593.6. Samples: 2774674280. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:19:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:19:10,752][49750] Updated weights for policy 0, policy_version 306521 (0.0028) [2024-04-27 00:19:12,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 5022121984. Throughput: 0: 50538.7. Samples: 2774985180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:19:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 00:19:12,924][49750] Updated weights for policy 0, policy_version 306531 (0.0025) [2024-04-27 00:19:17,062][49517] Fps is (10 sec: 45876.0, 60 sec: 49698.2, 300 sec: 50596.0). Total num frames: 5022334976. Throughput: 0: 50756.9. Samples: 2775288280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 00:19:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:19:17,349][49750] Updated weights for policy 0, policy_version 306541 (0.0030) [2024-04-27 00:19:19,504][49750] Updated weights for policy 0, policy_version 306551 (0.0032) [2024-04-27 00:19:22,063][49517] Fps is (10 sec: 49151.6, 60 sec: 49971.1, 300 sec: 50651.5). Total num frames: 5022613504. Throughput: 0: 50615.9. Samples: 2775418780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:19:22,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-27 00:19:23,763][49750] Updated weights for policy 0, policy_version 306561 (0.0033) [2024-04-27 00:19:26,015][49750] Updated weights for policy 0, policy_version 306571 (0.0031) [2024-04-27 00:19:27,062][49517] Fps is (10 sec: 57343.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5022908416. Throughput: 0: 50540.8. Samples: 2775726580. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:19:27,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 00:19:30,175][49750] Updated weights for policy 0, policy_version 306581 (0.0031) [2024-04-27 00:19:32,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51609.6, 300 sec: 50762.6). Total num frames: 5023154176. Throughput: 0: 50701.5. Samples: 2776041160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:19:32,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-27 00:19:32,393][49750] Updated weights for policy 0, policy_version 306591 (0.0034) [2024-04-27 00:19:36,539][49750] Updated weights for policy 0, policy_version 306601 (0.0035) [2024-04-27 00:19:37,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5023383552. Throughput: 0: 50595.0. Samples: 2776188720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:19:37,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 00:19:39,046][49750] Updated weights for policy 0, policy_version 306611 (0.0028) [2024-04-27 00:19:42,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5023629312. Throughput: 0: 50853.3. Samples: 2776492900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:19:42,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:19:42,977][49750] Updated weights for policy 0, policy_version 306621 (0.0032) [2024-04-27 00:19:45,435][49750] Updated weights for policy 0, policy_version 306631 (0.0029) [2024-04-27 00:19:47,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5023907840. Throughput: 0: 50683.7. Samples: 2776792200. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:19:47,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 00:19:49,300][49750] Updated weights for policy 0, policy_version 306641 (0.0030) [2024-04-27 00:19:51,766][49750] Updated weights for policy 0, policy_version 306651 (0.0033) [2024-04-27 00:19:52,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5024169984. Throughput: 0: 50692.5. Samples: 2776955440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:19:52,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-27 00:19:55,704][49750] Updated weights for policy 0, policy_version 306661 (0.0029) [2024-04-27 00:19:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5024399360. Throughput: 0: 50596.2. Samples: 2777262000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:19:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:19:58,208][49750] Updated weights for policy 0, policy_version 306671 (0.0031) [2024-04-27 00:20:02,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5024645120. Throughput: 0: 50647.5. Samples: 2777567420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:20:02,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-27 00:20:02,198][49750] Updated weights for policy 0, policy_version 306681 (0.0029) [2024-04-27 00:20:04,542][49750] Updated weights for policy 0, policy_version 306691 (0.0029) [2024-04-27 00:20:07,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5024907264. Throughput: 0: 50812.2. Samples: 2777705320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:20:07,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 00:20:08,414][49728] Signal inference workers to stop experience collection... 
(41800 times) [2024-04-27 00:20:08,418][49728] Signal inference workers to resume experience collection... (41800 times) [2024-04-27 00:20:08,433][49750] InferenceWorker_p0-w0: stopping experience collection (41800 times) [2024-04-27 00:20:08,433][49750] InferenceWorker_p0-w0: resuming experience collection (41800 times) [2024-04-27 00:20:08,556][49750] Updated weights for policy 0, policy_version 306701 (0.0036) [2024-04-27 00:20:10,966][49750] Updated weights for policy 0, policy_version 306711 (0.0033) [2024-04-27 00:20:12,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5025185792. Throughput: 0: 50900.4. Samples: 2778017100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:20:12,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 00:20:14,962][49750] Updated weights for policy 0, policy_version 306721 (0.0030) [2024-04-27 00:20:17,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51882.7, 300 sec: 50762.6). Total num frames: 5025447936. Throughput: 0: 50824.9. Samples: 2778328280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:20:17,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 00:20:17,489][49750] Updated weights for policy 0, policy_version 306731 (0.0028) [2024-04-27 00:20:21,542][49750] Updated weights for policy 0, policy_version 306741 (0.0031) [2024-04-27 00:20:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.6, 300 sec: 50762.7). Total num frames: 5025693696. Throughput: 0: 50962.7. Samples: 2778482040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:20:22,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 00:20:23,909][49750] Updated weights for policy 0, policy_version 306751 (0.0028) [2024-04-27 00:20:27,062][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 5025906688. Throughput: 0: 50782.7. Samples: 2778778120. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:20:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 00:20:27,881][49750] Updated weights for policy 0, policy_version 306761 (0.0040) [2024-04-27 00:20:30,434][49750] Updated weights for policy 0, policy_version 306771 (0.0036) [2024-04-27 00:20:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5026185216. Throughput: 0: 50820.5. Samples: 2779079120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-04-27 00:20:32,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 00:20:34,387][49750] Updated weights for policy 0, policy_version 306781 (0.0032) [2024-04-27 00:20:36,984][49750] Updated weights for policy 0, policy_version 306791 (0.0034) [2024-04-27 00:20:37,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5026463744. Throughput: 0: 50690.7. Samples: 2779236520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:20:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 00:20:40,704][49750] Updated weights for policy 0, policy_version 306801 (0.0026) [2024-04-27 00:20:42,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5026676736. Throughput: 0: 50738.5. Samples: 2779545240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:20:42,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 00:20:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000306804_5026676736.pth... [2024-04-27 00:20:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000306062_5014519808.pth [2024-04-27 00:20:43,547][49750] Updated weights for policy 0, policy_version 306811 (0.0031) [2024-04-27 00:20:47,048][49750] Updated weights for policy 0, policy_version 306821 (0.0028) [2024-04-27 00:20:47,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50818.2). 
Total num frames: 5026955264. Throughput: 0: 50778.1. Samples: 2779852440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:20:47,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:20:50,019][49750] Updated weights for policy 0, policy_version 306831 (0.0030) [2024-04-27 00:20:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5027201024. Throughput: 0: 50856.5. Samples: 2779993860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:20:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 00:20:53,606][49750] Updated weights for policy 0, policy_version 306841 (0.0033) [2024-04-27 00:20:56,350][49750] Updated weights for policy 0, policy_version 306851 (0.0033) [2024-04-27 00:20:57,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5027463168. Throughput: 0: 50736.5. Samples: 2780300240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:20:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 00:21:00,132][49750] Updated weights for policy 0, policy_version 306861 (0.0033) [2024-04-27 00:21:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 5027708928. Throughput: 0: 50599.5. Samples: 2780605260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:02,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 00:21:02,979][49750] Updated weights for policy 0, policy_version 306871 (0.0025) [2024-04-27 00:21:06,467][49750] Updated weights for policy 0, policy_version 306881 (0.0028) [2024-04-27 00:21:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 5027954688. Throughput: 0: 50462.7. Samples: 2780752860. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 00:21:09,525][49750] Updated weights for policy 0, policy_version 306891 (0.0034) [2024-04-27 00:21:12,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5028200448. Throughput: 0: 50704.4. Samples: 2781059820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:12,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 00:21:12,967][49750] Updated weights for policy 0, policy_version 306901 (0.0031) [2024-04-27 00:21:14,435][49728] Signal inference workers to stop experience collection... (41850 times) [2024-04-27 00:21:14,473][49750] InferenceWorker_p0-w0: stopping experience collection (41850 times) [2024-04-27 00:21:14,507][49728] Signal inference workers to resume experience collection... (41850 times) [2024-04-27 00:21:14,508][49750] InferenceWorker_p0-w0: resuming experience collection (41850 times) [2024-04-27 00:21:15,909][49750] Updated weights for policy 0, policy_version 306911 (0.0024) [2024-04-27 00:21:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.1, 300 sec: 50707.1). Total num frames: 5028446208. Throughput: 0: 50635.9. Samples: 2781357740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:17,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-27 00:21:19,456][49750] Updated weights for policy 0, policy_version 306921 (0.0032) [2024-04-27 00:21:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 5028724736. Throughput: 0: 50548.3. Samples: 2781511200. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:22,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 00:21:22,330][49750] Updated weights for policy 0, policy_version 306931 (0.0030) [2024-04-27 00:21:25,940][49750] Updated weights for policy 0, policy_version 306941 (0.0033) [2024-04-27 00:21:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 5028954112. Throughput: 0: 50541.2. Samples: 2781819600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 00:21:28,781][49750] Updated weights for policy 0, policy_version 306951 (0.0026) [2024-04-27 00:21:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5029216256. Throughput: 0: 50478.7. Samples: 2782123980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 00:21:32,291][49750] Updated weights for policy 0, policy_version 306961 (0.0031) [2024-04-27 00:21:35,566][49750] Updated weights for policy 0, policy_version 306971 (0.0033) [2024-04-27 00:21:37,062][49517] Fps is (10 sec: 50791.3, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 5029462016. Throughput: 0: 50642.2. Samples: 2782272760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:37,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 00:21:38,788][49750] Updated weights for policy 0, policy_version 306981 (0.0031) [2024-04-27 00:21:41,984][49750] Updated weights for policy 0, policy_version 306991 (0.0028) [2024-04-27 00:21:42,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 5029740544. Throughput: 0: 50561.0. Samples: 2782575480. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:42,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 00:21:45,391][49750] Updated weights for policy 0, policy_version 307001 (0.0032) [2024-04-27 00:21:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5029986304. Throughput: 0: 50528.0. Samples: 2782879020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:21:47,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 00:21:48,474][49750] Updated weights for policy 0, policy_version 307011 (0.0027) [2024-04-27 00:21:51,732][49750] Updated weights for policy 0, policy_version 307021 (0.0027) [2024-04-27 00:21:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5030232064. Throughput: 0: 50784.4. Samples: 2783038160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:21:52,063][49517] Avg episode reward: [(0, '0.466')] [2024-04-27 00:21:55,097][49750] Updated weights for policy 0, policy_version 307031 (0.0029) [2024-04-27 00:21:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 5030477824. Throughput: 0: 50628.0. Samples: 2783338080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:21:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 00:21:58,241][49750] Updated weights for policy 0, policy_version 307041 (0.0030) [2024-04-27 00:22:01,711][49750] Updated weights for policy 0, policy_version 307051 (0.0028) [2024-04-27 00:22:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.1, 300 sec: 50707.1). Total num frames: 5030739968. Throughput: 0: 50786.0. Samples: 2783643120. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:02,064][49517] Avg episode reward: [(0, '0.497')] [2024-04-27 00:22:04,580][49750] Updated weights for policy 0, policy_version 307061 (0.0034) [2024-04-27 00:22:07,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50651.5). Total num frames: 5031002112. Throughput: 0: 50697.8. Samples: 2783792600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:07,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-27 00:22:08,165][49750] Updated weights for policy 0, policy_version 307071 (0.0034) [2024-04-27 00:22:11,232][49750] Updated weights for policy 0, policy_version 307081 (0.0027) [2024-04-27 00:22:12,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 5031247872. Throughput: 0: 50566.8. Samples: 2784095100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:12,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 00:22:14,663][49750] Updated weights for policy 0, policy_version 307091 (0.0039) [2024-04-27 00:22:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5031510016. Throughput: 0: 50543.1. Samples: 2784398420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:17,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:22:17,743][49750] Updated weights for policy 0, policy_version 307101 (0.0031) [2024-04-27 00:22:20,973][49750] Updated weights for policy 0, policy_version 307111 (0.0029) [2024-04-27 00:22:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5031739392. Throughput: 0: 50648.2. Samples: 2784551940. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:22,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 00:22:24,102][49750] Updated weights for policy 0, policy_version 307121 (0.0034) [2024-04-27 00:22:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 5032001536. Throughput: 0: 50731.9. Samples: 2784858420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:27,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 00:22:27,145][49728] Signal inference workers to stop experience collection... (41900 times) [2024-04-27 00:22:27,145][49728] Signal inference workers to resume experience collection... (41900 times) [2024-04-27 00:22:27,159][49750] InferenceWorker_p0-w0: stopping experience collection (41900 times) [2024-04-27 00:22:27,159][49750] InferenceWorker_p0-w0: resuming experience collection (41900 times) [2024-04-27 00:22:27,283][49750] Updated weights for policy 0, policy_version 307131 (0.0032) [2024-04-27 00:22:30,713][49750] Updated weights for policy 0, policy_version 307141 (0.0034) [2024-04-27 00:22:32,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.4, 300 sec: 50540.5). Total num frames: 5032247296. Throughput: 0: 50713.3. Samples: 2785161120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 00:22:33,889][49750] Updated weights for policy 0, policy_version 307151 (0.0034) [2024-04-27 00:22:37,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5032509440. Throughput: 0: 50580.4. Samples: 2785314280. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:37,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 00:22:37,216][49750] Updated weights for policy 0, policy_version 307161 (0.0034) [2024-04-27 00:22:40,381][49750] Updated weights for policy 0, policy_version 307171 (0.0032) [2024-04-27 00:22:42,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5032755200. Throughput: 0: 50596.5. Samples: 2785614920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:42,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 00:22:42,088][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000307176_5032771584.pth... [2024-04-27 00:22:42,136][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000306435_5020631040.pth [2024-04-27 00:22:43,514][49750] Updated weights for policy 0, policy_version 307181 (0.0039) [2024-04-27 00:22:46,771][49750] Updated weights for policy 0, policy_version 307191 (0.0028) [2024-04-27 00:22:47,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5033017344. Throughput: 0: 50609.1. Samples: 2785920520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:47,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 00:22:49,913][49750] Updated weights for policy 0, policy_version 307201 (0.0030) [2024-04-27 00:22:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5033279488. Throughput: 0: 50580.6. Samples: 2786068720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:52,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-27 00:22:53,126][49750] Updated weights for policy 0, policy_version 307211 (0.0031) [2024-04-27 00:22:56,321][49750] Updated weights for policy 0, policy_version 307221 (0.0033) [2024-04-27 00:22:57,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50596.0). 
Total num frames: 5033525248. Throughput: 0: 50633.3. Samples: 2786373600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:22:57,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 00:22:59,594][49750] Updated weights for policy 0, policy_version 307231 (0.0034) [2024-04-27 00:23:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.6, 300 sec: 50651.6). Total num frames: 5033771008. Throughput: 0: 50627.6. Samples: 2786676660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-27 00:23:02,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-27 00:23:02,792][49750] Updated weights for policy 0, policy_version 307241 (0.0028) [2024-04-27 00:23:06,368][49750] Updated weights for policy 0, policy_version 307251 (0.0031) [2024-04-27 00:23:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 5034033152. Throughput: 0: 50588.6. Samples: 2786828420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:07,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 00:23:09,494][49750] Updated weights for policy 0, policy_version 307261 (0.0027) [2024-04-27 00:23:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 5034278912. Throughput: 0: 50691.2. Samples: 2787139520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:12,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-27 00:23:13,143][49750] Updated weights for policy 0, policy_version 307271 (0.0034) [2024-04-27 00:23:15,983][49750] Updated weights for policy 0, policy_version 307281 (0.0031) [2024-04-27 00:23:17,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.3, 300 sec: 50540.5). Total num frames: 5034524672. Throughput: 0: 50599.9. Samples: 2787438120. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:17,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 00:23:19,525][49750] Updated weights for policy 0, policy_version 307291 (0.0032) [2024-04-27 00:23:22,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 5034803200. Throughput: 0: 50709.7. Samples: 2787596220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:22,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 00:23:22,443][49750] Updated weights for policy 0, policy_version 307301 (0.0032) [2024-04-27 00:23:25,998][49750] Updated weights for policy 0, policy_version 307311 (0.0032) [2024-04-27 00:23:27,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5035048960. Throughput: 0: 50641.7. Samples: 2787893800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:27,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-27 00:23:28,749][49750] Updated weights for policy 0, policy_version 307321 (0.0030) [2024-04-27 00:23:32,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5035294720. Throughput: 0: 50620.1. Samples: 2788198420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:32,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 00:23:32,323][49750] Updated weights for policy 0, policy_version 307331 (0.0033) [2024-04-27 00:23:35,094][49750] Updated weights for policy 0, policy_version 307341 (0.0033) [2024-04-27 00:23:37,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.5, 300 sec: 50596.0). Total num frames: 5035540480. Throughput: 0: 50651.6. Samples: 2788348040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:37,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-27 00:23:38,615][49750] Updated weights for policy 0, policy_version 307351 (0.0032) [2024-04-27 00:23:41,797][49750] Updated weights for policy 0, policy_version 307361 (0.0029) [2024-04-27 00:23:42,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 5035819008. Throughput: 0: 50763.7. Samples: 2788657960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:42,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-27 00:23:45,320][49750] Updated weights for policy 0, policy_version 307371 (0.0031) [2024-04-27 00:23:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5036064768. Throughput: 0: 50598.7. Samples: 2788953600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:47,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-27 00:23:48,296][49750] Updated weights for policy 0, policy_version 307381 (0.0032) [2024-04-27 00:23:50,634][49728] Signal inference workers to stop experience collection... (41950 times) [2024-04-27 00:23:50,635][49728] Signal inference workers to resume experience collection... (41950 times) [2024-04-27 00:23:50,657][49750] InferenceWorker_p0-w0: stopping experience collection (41950 times) [2024-04-27 00:23:50,657][49750] InferenceWorker_p0-w0: resuming experience collection (41950 times) [2024-04-27 00:23:51,752][49750] Updated weights for policy 0, policy_version 307391 (0.0035) [2024-04-27 00:23:52,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 5036294144. Throughput: 0: 50725.8. Samples: 2789111080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:52,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:23:54,727][49750] Updated weights for policy 0, policy_version 307401 (0.0026) [2024-04-27 00:23:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 5036556288. Throughput: 0: 50631.1. Samples: 2789417920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:23:57,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 00:23:58,052][49750] Updated weights for policy 0, policy_version 307411 (0.0032) [2024-04-27 00:24:01,199][49750] Updated weights for policy 0, policy_version 307421 (0.0028) [2024-04-27 00:24:02,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 5036802048. Throughput: 0: 50737.3. Samples: 2789721300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:24:02,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-27 00:24:04,453][49750] Updated weights for policy 0, policy_version 307431 (0.0034) [2024-04-27 00:24:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5037080576. Throughput: 0: 50541.5. Samples: 2789870580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:24:07,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:24:07,577][49750] Updated weights for policy 0, policy_version 307441 (0.0033) [2024-04-27 00:24:10,930][49750] Updated weights for policy 0, policy_version 307451 (0.0028) [2024-04-27 00:24:12,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5037342720. Throughput: 0: 50850.4. Samples: 2790182060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 00:24:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 00:24:14,203][49750] Updated weights for policy 0, policy_version 307461 (0.0031) [2024-04-27 00:24:17,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5037572096. Throughput: 0: 50600.8. Samples: 2790475460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:24:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 00:24:17,473][49750] Updated weights for policy 0, policy_version 307471 (0.0032) [2024-04-27 00:24:20,801][49750] Updated weights for policy 0, policy_version 307481 (0.0033) [2024-04-27 00:24:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 5037850624. Throughput: 0: 50662.6. Samples: 2790627860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:24:22,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-27 00:24:23,962][49750] Updated weights for policy 0, policy_version 307491 (0.0031) [2024-04-27 00:24:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 5038080000. Throughput: 0: 50631.5. Samples: 2790936380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:24:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-27 00:24:27,170][49750] Updated weights for policy 0, policy_version 307501 (0.0030) [2024-04-27 00:24:30,307][49750] Updated weights for policy 0, policy_version 307511 (0.0031) [2024-04-27 00:24:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5038342144. Throughput: 0: 50830.1. Samples: 2791240960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:24:32,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:24:33,481][49750] Updated weights for policy 0, policy_version 307521 (0.0034) [2024-04-27 00:24:36,719][49750] Updated weights for policy 0, policy_version 307531 (0.0034) [2024-04-27 00:24:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5038604288. Throughput: 0: 50825.4. Samples: 2791398220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:24:37,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-27 00:24:39,948][49750] Updated weights for policy 0, policy_version 307541 (0.0030) [2024-04-27 00:24:42,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 5038850048. Throughput: 0: 50819.0. Samples: 2791704780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:24:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 00:24:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000307547_5038850048.pth... [2024-04-27 00:24:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000306804_5026676736.pth [2024-04-27 00:24:43,251][49750] Updated weights for policy 0, policy_version 307551 (0.0033) [2024-04-27 00:24:46,737][49750] Updated weights for policy 0, policy_version 307561 (0.0034) [2024-04-27 00:24:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 5039095808. Throughput: 0: 50772.4. Samples: 2792006060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:24:47,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 00:24:49,628][49750] Updated weights for policy 0, policy_version 307571 (0.0030) [2024-04-27 00:24:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 5039357952. Throughput: 0: 50830.2. Samples: 2792157940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:24:52,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 00:24:53,285][49750] Updated weights for policy 0, policy_version 307581 (0.0029) [2024-04-27 00:24:55,944][49750] Updated weights for policy 0, policy_version 307591 (0.0033) [2024-04-27 00:24:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5039603712. Throughput: 0: 50567.9. Samples: 2792457620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:24:57,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-27 00:24:59,641][49750] Updated weights for policy 0, policy_version 307601 (0.0032) [2024-04-27 00:25:02,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.3, 300 sec: 50707.0). Total num frames: 5039865856. Throughput: 0: 50851.3. Samples: 2792763780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:25:02,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 00:25:02,495][49750] Updated weights for policy 0, policy_version 307611 (0.0031) [2024-04-27 00:25:03,537][49728] Signal inference workers to stop experience collection... (42000 times) [2024-04-27 00:25:03,537][49728] Signal inference workers to resume experience collection... (42000 times) [2024-04-27 00:25:03,563][49750] InferenceWorker_p0-w0: stopping experience collection (42000 times) [2024-04-27 00:25:03,563][49750] InferenceWorker_p0-w0: resuming experience collection (42000 times) [2024-04-27 00:25:06,017][49750] Updated weights for policy 0, policy_version 307621 (0.0034) [2024-04-27 00:25:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 5040111616. Throughput: 0: 50794.1. Samples: 2792913600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:25:07,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 00:25:08,890][49750] Updated weights for policy 0, policy_version 307631 (0.0033) [2024-04-27 00:25:12,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 5040373760. Throughput: 0: 50761.3. Samples: 2793220640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:25:12,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 00:25:12,472][49750] Updated weights for policy 0, policy_version 307641 (0.0037) [2024-04-27 00:25:15,457][49750] Updated weights for policy 0, policy_version 307651 (0.0029) [2024-04-27 00:25:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 5040619520. Throughput: 0: 50595.1. Samples: 2793517740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:25:17,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-27 00:25:18,948][49750] Updated weights for policy 0, policy_version 307661 (0.0028) [2024-04-27 00:25:21,785][49750] Updated weights for policy 0, policy_version 307671 (0.0032) [2024-04-27 00:25:22,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 5040881664. Throughput: 0: 50655.4. Samples: 2793677720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:25:22,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 00:25:25,466][49750] Updated weights for policy 0, policy_version 307681 (0.0036) [2024-04-27 00:25:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 5041127424. Throughput: 0: 50677.0. Samples: 2793985240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:25:27,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-27 00:25:28,241][49750] Updated weights for policy 0, policy_version 307691 (0.0035) [2024-04-27 00:25:31,896][49750] Updated weights for policy 0, policy_version 307701 (0.0039) [2024-04-27 00:25:32,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50596.0). Total num frames: 5041389568. Throughput: 0: 50753.0. Samples: 2794289940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:25:32,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 00:25:34,714][49750] Updated weights for policy 0, policy_version 307711 (0.0031) [2024-04-27 00:25:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5041635328. Throughput: 0: 50665.0. Samples: 2794437860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:25:37,063][49517] Avg episode reward: [(0, '0.455')] [2024-04-27 00:25:38,260][49750] Updated weights for policy 0, policy_version 307721 (0.0034) [2024-04-27 00:25:41,170][49750] Updated weights for policy 0, policy_version 307731 (0.0036) [2024-04-27 00:25:42,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.4, 300 sec: 50651.5). Total num frames: 5041897472. Throughput: 0: 50718.2. Samples: 2794739940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:25:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 00:25:44,714][49750] Updated weights for policy 0, policy_version 307741 (0.0033) [2024-04-27 00:25:47,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 5042159616. Throughput: 0: 50636.1. Samples: 2795042400. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:25:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 00:25:47,579][49750] Updated weights for policy 0, policy_version 307751 (0.0033) [2024-04-27 00:25:51,190][49750] Updated weights for policy 0, policy_version 307761 (0.0035) [2024-04-27 00:25:52,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 5042405376. Throughput: 0: 50795.2. Samples: 2795199380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:25:52,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-27 00:25:54,202][49750] Updated weights for policy 0, policy_version 307771 (0.0029) [2024-04-27 00:25:57,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50596.0). Total num frames: 5042634752. Throughput: 0: 50822.7. Samples: 2795507660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:25:57,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-27 00:25:57,607][49750] Updated weights for policy 0, policy_version 307781 (0.0031) [2024-04-27 00:26:00,640][49750] Updated weights for policy 0, policy_version 307791 (0.0031) [2024-04-27 00:26:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.6, 300 sec: 50651.6). Total num frames: 5042896896. Throughput: 0: 50968.1. Samples: 2795811300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:26:02,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 00:26:04,011][49750] Updated weights for policy 0, policy_version 307801 (0.0033) [2024-04-27 00:26:06,918][49750] Updated weights for policy 0, policy_version 307811 (0.0032) [2024-04-27 00:26:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5043175424. Throughput: 0: 50764.1. Samples: 2795962100. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:26:07,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 00:26:10,515][49750] Updated weights for policy 0, policy_version 307821 (0.0032) [2024-04-27 00:26:12,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5043421184. Throughput: 0: 50815.9. Samples: 2796271960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:26:12,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-27 00:26:13,274][49750] Updated weights for policy 0, policy_version 307831 (0.0033) [2024-04-27 00:26:16,858][49750] Updated weights for policy 0, policy_version 307841 (0.0031) [2024-04-27 00:26:17,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 5043666944. Throughput: 0: 50831.0. Samples: 2796577340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:26:17,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 00:26:20,026][49750] Updated weights for policy 0, policy_version 307851 (0.0031) [2024-04-27 00:26:22,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5043929088. Throughput: 0: 50877.2. Samples: 2796727340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:26:22,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:26:23,209][49750] Updated weights for policy 0, policy_version 307861 (0.0035) [2024-04-27 00:26:26,501][49750] Updated weights for policy 0, policy_version 307871 (0.0032) [2024-04-27 00:26:27,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5044174848. Throughput: 0: 50837.0. Samples: 2797027600. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:26:27,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 00:26:29,673][49750] Updated weights for policy 0, policy_version 307881 (0.0031) [2024-04-27 00:26:30,812][49728] Signal inference workers to stop experience collection... (42050 times) [2024-04-27 00:26:30,855][49750] InferenceWorker_p0-w0: stopping experience collection (42050 times) [2024-04-27 00:26:30,869][49728] Signal inference workers to resume experience collection... (42050 times) [2024-04-27 00:26:30,877][49750] InferenceWorker_p0-w0: resuming experience collection (42050 times) [2024-04-27 00:26:32,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5044453376. Throughput: 0: 50875.6. Samples: 2797331800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:26:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 00:26:32,823][49750] Updated weights for policy 0, policy_version 307891 (0.0029) [2024-04-27 00:26:36,251][49750] Updated weights for policy 0, policy_version 307901 (0.0032) [2024-04-27 00:26:37,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 5044682752. Throughput: 0: 50758.9. Samples: 2797483540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:26:37,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-27 00:26:39,241][49750] Updated weights for policy 0, policy_version 307911 (0.0035) [2024-04-27 00:26:42,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 5044928512. Throughput: 0: 50832.9. Samples: 2797795140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-27 00:26:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 00:26:42,122][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000307919_5044944896.pth... 
[2024-04-27 00:26:42,172][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000307176_5032771584.pth [2024-04-27 00:26:42,815][49750] Updated weights for policy 0, policy_version 307921 (0.0036) [2024-04-27 00:26:45,673][49750] Updated weights for policy 0, policy_version 307931 (0.0025) [2024-04-27 00:26:47,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5045190656. Throughput: 0: 50890.9. Samples: 2798101400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:26:47,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 00:26:49,133][49750] Updated weights for policy 0, policy_version 307941 (0.0028) [2024-04-27 00:26:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5045436416. Throughput: 0: 50876.0. Samples: 2798251520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:26:52,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 00:26:52,270][49750] Updated weights for policy 0, policy_version 307951 (0.0028) [2024-04-27 00:26:55,563][49750] Updated weights for policy 0, policy_version 307961 (0.0032) [2024-04-27 00:26:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.5, 300 sec: 50762.7). Total num frames: 5045714944. Throughput: 0: 50688.4. Samples: 2798552940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:26:57,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 00:26:58,690][49750] Updated weights for policy 0, policy_version 307971 (0.0027) [2024-04-27 00:27:02,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50651.6). Total num frames: 5045944320. Throughput: 0: 50693.0. Samples: 2798858520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:02,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-27 00:27:02,147][49750] Updated weights for policy 0, policy_version 307981 (0.0030) [2024-04-27 00:27:04,987][49750] Updated weights for policy 0, policy_version 307991 (0.0031) [2024-04-27 00:27:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5046222848. Throughput: 0: 50692.1. Samples: 2799008480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:07,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 00:27:08,503][49750] Updated weights for policy 0, policy_version 308001 (0.0027) [2024-04-27 00:27:11,484][49750] Updated weights for policy 0, policy_version 308011 (0.0035) [2024-04-27 00:27:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 5046452224. Throughput: 0: 50685.8. Samples: 2799308460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:12,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 00:27:14,980][49750] Updated weights for policy 0, policy_version 308021 (0.0035) [2024-04-27 00:27:17,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5046747136. Throughput: 0: 50712.1. Samples: 2799613840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:17,063][49517] Avg episode reward: [(0, '0.727')] [2024-04-27 00:27:17,063][49728] Saving new best policy, reward=0.727! [2024-04-27 00:27:18,029][49750] Updated weights for policy 0, policy_version 308031 (0.0032) [2024-04-27 00:27:21,330][49750] Updated weights for policy 0, policy_version 308041 (0.0029) [2024-04-27 00:27:22,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5046976512. Throughput: 0: 50873.9. Samples: 2799772860. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:22,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-27 00:27:24,312][49750] Updated weights for policy 0, policy_version 308051 (0.0033) [2024-04-27 00:27:27,064][49517] Fps is (10 sec: 45866.6, 60 sec: 50515.8, 300 sec: 50706.8). Total num frames: 5047205888. Throughput: 0: 50856.1. Samples: 2800083760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:27,065][49517] Avg episode reward: [(0, '0.519')] [2024-04-27 00:27:27,775][49750] Updated weights for policy 0, policy_version 308061 (0.0030) [2024-04-27 00:27:30,962][49750] Updated weights for policy 0, policy_version 308071 (0.0030) [2024-04-27 00:27:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5047484416. Throughput: 0: 50852.2. Samples: 2800389740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 00:27:34,056][49750] Updated weights for policy 0, policy_version 308081 (0.0034) [2024-04-27 00:27:37,063][49517] Fps is (10 sec: 54076.4, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 5047746560. Throughput: 0: 50861.6. Samples: 2800540300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 00:27:37,261][49750] Updated weights for policy 0, policy_version 308091 (0.0030) [2024-04-27 00:27:40,533][49750] Updated weights for policy 0, policy_version 308101 (0.0029) [2024-04-27 00:27:41,889][49728] Signal inference workers to stop experience collection... (42100 times) [2024-04-27 00:27:41,889][49728] Signal inference workers to resume experience collection... 
(42100 times) [2024-04-27 00:27:41,923][49750] InferenceWorker_p0-w0: stopping experience collection (42100 times) [2024-04-27 00:27:41,923][49750] InferenceWorker_p0-w0: resuming experience collection (42100 times) [2024-04-27 00:27:42,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 5048008704. Throughput: 0: 50822.2. Samples: 2800839940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:42,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 00:27:43,572][49750] Updated weights for policy 0, policy_version 308111 (0.0042) [2024-04-27 00:27:46,892][49750] Updated weights for policy 0, policy_version 308121 (0.0031) [2024-04-27 00:27:47,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5048254464. Throughput: 0: 50767.4. Samples: 2801143060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 00:27:50,132][49750] Updated weights for policy 0, policy_version 308131 (0.0036) [2024-04-27 00:27:52,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5048483840. Throughput: 0: 50793.9. Samples: 2801294200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:52,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-27 00:27:53,420][49750] Updated weights for policy 0, policy_version 308141 (0.0035) [2024-04-27 00:27:57,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5048729600. Throughput: 0: 50826.7. Samples: 2801595660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 00:27:57,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 00:27:57,148][49750] Updated weights for policy 0, policy_version 308151 (0.0031) [2024-04-27 00:27:59,916][49750] Updated weights for policy 0, policy_version 308161 (0.0031) [2024-04-27 00:28:02,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5049008128. Throughput: 0: 50746.1. Samples: 2801897420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:02,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 00:28:03,754][49750] Updated weights for policy 0, policy_version 308171 (0.0031) [2024-04-27 00:28:06,306][49750] Updated weights for policy 0, policy_version 308181 (0.0029) [2024-04-27 00:28:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5049253888. Throughput: 0: 50776.5. Samples: 2802057800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:07,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-27 00:28:10,190][49750] Updated weights for policy 0, policy_version 308191 (0.0033) [2024-04-27 00:28:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5049499648. Throughput: 0: 50714.9. Samples: 2802365840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:12,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 00:28:12,945][49750] Updated weights for policy 0, policy_version 308201 (0.0032) [2024-04-27 00:28:16,604][49750] Updated weights for policy 0, policy_version 308211 (0.0036) [2024-04-27 00:28:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5049761792. Throughput: 0: 50747.5. Samples: 2802673380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:17,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 00:28:19,299][49750] Updated weights for policy 0, policy_version 308221 (0.0033) [2024-04-27 00:28:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5050023936. Throughput: 0: 50626.4. Samples: 2802818480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:22,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-27 00:28:22,852][49750] Updated weights for policy 0, policy_version 308231 (0.0031) [2024-04-27 00:28:25,715][49750] Updated weights for policy 0, policy_version 308241 (0.0029) [2024-04-27 00:28:27,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51338.1, 300 sec: 50818.1). Total num frames: 5050286080. Throughput: 0: 50814.3. Samples: 2803126580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:27,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-27 00:28:29,423][49750] Updated weights for policy 0, policy_version 308251 (0.0035) [2024-04-27 00:28:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5050531840. Throughput: 0: 50699.7. Samples: 2803424540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:32,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-27 00:28:32,147][49750] Updated weights for policy 0, policy_version 308261 (0.0039) [2024-04-27 00:28:35,953][49750] Updated weights for policy 0, policy_version 308271 (0.0034) [2024-04-27 00:28:37,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 5050761216. Throughput: 0: 50717.1. Samples: 2803576480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:37,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 00:28:38,654][49750] Updated weights for policy 0, policy_version 308281 (0.0035) [2024-04-27 00:28:42,062][49517] Fps is (10 sec: 47513.5, 60 sec: 49971.3, 300 sec: 50651.5). Total num frames: 5051006976. Throughput: 0: 50799.1. Samples: 2803881620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:28:42,174][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000308290_5051023360.pth... [2024-04-27 00:28:42,223][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000307547_5038850048.pth [2024-04-27 00:28:42,373][49750] Updated weights for policy 0, policy_version 308291 (0.0034) [2024-04-27 00:28:44,964][49750] Updated weights for policy 0, policy_version 308301 (0.0035) [2024-04-27 00:28:47,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 5051285504. Throughput: 0: 50840.9. Samples: 2804185260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:47,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-27 00:28:48,798][49750] Updated weights for policy 0, policy_version 308311 (0.0038) [2024-04-27 00:28:50,171][49728] Signal inference workers to stop experience collection... (42150 times) [2024-04-27 00:28:50,172][49728] Signal inference workers to resume experience collection... (42150 times) [2024-04-27 00:28:50,200][49750] InferenceWorker_p0-w0: stopping experience collection (42150 times) [2024-04-27 00:28:50,200][49750] InferenceWorker_p0-w0: resuming experience collection (42150 times) [2024-04-27 00:28:51,353][49750] Updated weights for policy 0, policy_version 308321 (0.0029) [2024-04-27 00:28:52,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5051547648. Throughput: 0: 50780.5. 
Samples: 2804342920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 00:28:55,221][49750] Updated weights for policy 0, policy_version 308331 (0.0030) [2024-04-27 00:28:57,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 5051793408. Throughput: 0: 50599.1. Samples: 2804642800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:28:57,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 00:28:57,902][49750] Updated weights for policy 0, policy_version 308341 (0.0033) [2024-04-27 00:29:01,807][49750] Updated weights for policy 0, policy_version 308351 (0.0038) [2024-04-27 00:29:02,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5052039168. Throughput: 0: 50632.3. Samples: 2804951840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:29:02,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 00:29:04,390][49750] Updated weights for policy 0, policy_version 308361 (0.0027) [2024-04-27 00:29:07,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5052301312. Throughput: 0: 50491.5. Samples: 2805090600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:29:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 00:29:08,238][49750] Updated weights for policy 0, policy_version 308371 (0.0034) [2024-04-27 00:29:10,787][49750] Updated weights for policy 0, policy_version 308381 (0.0030) [2024-04-27 00:29:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5052563456. Throughput: 0: 50535.6. Samples: 2805400680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 00:29:12,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-27 00:29:14,675][49750] Updated weights for policy 0, policy_version 308391 (0.0033) [2024-04-27 00:29:17,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5052825600. Throughput: 0: 50598.7. Samples: 2805701480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:29:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 00:29:17,146][49750] Updated weights for policy 0, policy_version 308401 (0.0035) [2024-04-27 00:29:21,146][49750] Updated weights for policy 0, policy_version 308411 (0.0041) [2024-04-27 00:29:22,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5053038592. Throughput: 0: 50650.8. Samples: 2805855760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:29:22,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 00:29:23,597][49750] Updated weights for policy 0, policy_version 308421 (0.0030) [2024-04-27 00:29:27,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5053300736. Throughput: 0: 50689.2. Samples: 2806162640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:29:27,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 00:29:27,558][49750] Updated weights for policy 0, policy_version 308431 (0.0028) [2024-04-27 00:29:30,048][49750] Updated weights for policy 0, policy_version 308441 (0.0033) [2024-04-27 00:29:32,062][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5053579264. Throughput: 0: 50799.6. Samples: 2806471240. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:29:32,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 00:29:33,807][49750] Updated weights for policy 0, policy_version 308451 (0.0029) [2024-04-27 00:29:36,392][49750] Updated weights for policy 0, policy_version 308461 (0.0032) [2024-04-27 00:29:37,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5053841408. Throughput: 0: 50852.3. Samples: 2806631280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:29:37,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 00:29:40,514][49750] Updated weights for policy 0, policy_version 308471 (0.0034) [2024-04-27 00:29:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5054087168. Throughput: 0: 50960.0. Samples: 2806936000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:29:42,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 00:29:42,741][49750] Updated weights for policy 0, policy_version 308481 (0.0021) [2024-04-27 00:29:47,016][49750] Updated weights for policy 0, policy_version 308491 (0.0040) [2024-04-27 00:29:47,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5054316544. Throughput: 0: 50815.2. Samples: 2807238520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:29:47,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 00:29:49,334][49750] Updated weights for policy 0, policy_version 308501 (0.0030) [2024-04-27 00:29:52,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5054578688. Throughput: 0: 50848.5. Samples: 2807378780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:29:52,071][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:29:53,358][49750] Updated weights for policy 0, policy_version 308511 (0.0044) [2024-04-27 00:29:55,867][49750] Updated weights for policy 0, policy_version 308521 (0.0031) [2024-04-27 00:29:57,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5054840832. Throughput: 0: 50628.8. Samples: 2807678980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:29:57,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:29:59,889][49750] Updated weights for policy 0, policy_version 308531 (0.0029) [2024-04-27 00:30:02,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5055102976. Throughput: 0: 50868.0. Samples: 2807990540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:30:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-27 00:30:02,411][49750] Updated weights for policy 0, policy_version 308541 (0.0034) [2024-04-27 00:30:06,471][49750] Updated weights for policy 0, policy_version 308551 (0.0029) [2024-04-27 00:30:07,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5055332352. Throughput: 0: 50897.8. Samples: 2808146160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:30:07,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 00:30:07,170][49728] Signal inference workers to stop experience collection... (42200 times) [2024-04-27 00:30:07,171][49728] Signal inference workers to resume experience collection... 
(42200 times) [2024-04-27 00:30:07,200][49750] InferenceWorker_p0-w0: stopping experience collection (42200 times) [2024-04-27 00:30:07,200][49750] InferenceWorker_p0-w0: resuming experience collection (42200 times) [2024-04-27 00:30:09,049][49750] Updated weights for policy 0, policy_version 308561 (0.0034) [2024-04-27 00:30:12,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5055578112. Throughput: 0: 50760.9. Samples: 2808446880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:30:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 00:30:12,767][49750] Updated weights for policy 0, policy_version 308571 (0.0034) [2024-04-27 00:30:15,484][49750] Updated weights for policy 0, policy_version 308581 (0.0030) [2024-04-27 00:30:17,062][49517] Fps is (10 sec: 54066.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5055873024. Throughput: 0: 50810.7. Samples: 2808757720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:30:17,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-27 00:30:19,136][49750] Updated weights for policy 0, policy_version 308591 (0.0029) [2024-04-27 00:30:21,811][49750] Updated weights for policy 0, policy_version 308601 (0.0028) [2024-04-27 00:30:22,062][49517] Fps is (10 sec: 55706.4, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 5056135168. Throughput: 0: 50755.7. Samples: 2808915280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 00:30:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 00:30:25,520][49750] Updated weights for policy 0, policy_version 308611 (0.0028) [2024-04-27 00:30:27,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5056380928. Throughput: 0: 50773.4. Samples: 2809220800. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:30:27,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-27 00:30:28,300][49750] Updated weights for policy 0, policy_version 308621 (0.0033) [2024-04-27 00:30:31,949][49750] Updated weights for policy 0, policy_version 308631 (0.0031) [2024-04-27 00:30:32,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5056610304. Throughput: 0: 50814.6. Samples: 2809525180. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:30:32,072][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 00:30:34,760][49750] Updated weights for policy 0, policy_version 308641 (0.0030) [2024-04-27 00:30:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 5056872448. Throughput: 0: 50946.3. Samples: 2809671360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:30:37,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-27 00:30:38,264][49750] Updated weights for policy 0, policy_version 308651 (0.0032) [2024-04-27 00:30:41,127][49750] Updated weights for policy 0, policy_version 308661 (0.0028) [2024-04-27 00:30:42,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5057118208. Throughput: 0: 51037.4. Samples: 2809975660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:30:42,072][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:30:42,081][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000308662_5057118208.pth... [2024-04-27 00:30:42,139][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000307919_5044944896.pth [2024-04-27 00:30:44,667][49750] Updated weights for policy 0, policy_version 308671 (0.0029) [2024-04-27 00:30:47,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 5057396736. Throughput: 0: 50793.6. Samples: 2810276260. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:30:47,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 00:30:47,433][49750] Updated weights for policy 0, policy_version 308681 (0.0027) [2024-04-27 00:30:51,181][49750] Updated weights for policy 0, policy_version 308691 (0.0031) [2024-04-27 00:30:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5057626112. Throughput: 0: 50877.6. Samples: 2810435660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:30:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-27 00:30:53,862][49750] Updated weights for policy 0, policy_version 308701 (0.0029) [2024-04-27 00:30:57,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 5057871872. Throughput: 0: 51026.4. Samples: 2810743060. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:30:57,063][49517] Avg episode reward: [(0, '0.706')] [2024-04-27 00:30:57,650][49750] Updated weights for policy 0, policy_version 308711 (0.0039) [2024-04-27 00:31:00,391][49750] Updated weights for policy 0, policy_version 308721 (0.0036) [2024-04-27 00:31:02,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5058150400. Throughput: 0: 50832.9. Samples: 2811045200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:31:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 00:31:04,047][49750] Updated weights for policy 0, policy_version 308731 (0.0032) [2024-04-27 00:31:06,748][49750] Updated weights for policy 0, policy_version 308741 (0.0035) [2024-04-27 00:31:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5058412544. Throughput: 0: 50783.1. Samples: 2811200520. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:31:07,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 00:31:10,459][49750] Updated weights for policy 0, policy_version 308751 (0.0028) [2024-04-27 00:31:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 5058674688. Throughput: 0: 50789.3. Samples: 2811506320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:31:12,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 00:31:12,750][49728] Signal inference workers to stop experience collection... (42250 times) [2024-04-27 00:31:12,758][49728] Signal inference workers to resume experience collection... (42250 times) [2024-04-27 00:31:12,776][49750] InferenceWorker_p0-w0: stopping experience collection (42250 times) [2024-04-27 00:31:12,777][49750] InferenceWorker_p0-w0: resuming experience collection (42250 times) [2024-04-27 00:31:13,173][49750] Updated weights for policy 0, policy_version 308761 (0.0026) [2024-04-27 00:31:16,925][49750] Updated weights for policy 0, policy_version 308771 (0.0034) [2024-04-27 00:31:17,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5058904064. Throughput: 0: 50864.5. Samples: 2811814080. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:31:17,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 00:31:19,650][49750] Updated weights for policy 0, policy_version 308781 (0.0030) [2024-04-27 00:31:22,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5059149824. Throughput: 0: 50841.2. Samples: 2811959220. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:31:22,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 00:31:23,251][49750] Updated weights for policy 0, policy_version 308791 (0.0040) [2024-04-27 00:31:26,255][49750] Updated weights for policy 0, policy_version 308801 (0.0032) [2024-04-27 00:31:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5059411968. Throughput: 0: 50943.9. Samples: 2812268140. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:31:27,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 00:31:29,517][49750] Updated weights for policy 0, policy_version 308811 (0.0030) [2024-04-27 00:31:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 5059657728. Throughput: 0: 51034.9. Samples: 2812572820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:31:32,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:31:32,737][49750] Updated weights for policy 0, policy_version 308821 (0.0032) [2024-04-27 00:31:36,006][49750] Updated weights for policy 0, policy_version 308831 (0.0032) [2024-04-27 00:31:37,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5059936256. Throughput: 0: 50932.6. Samples: 2812727620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 00:31:37,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-27 00:31:39,109][49750] Updated weights for policy 0, policy_version 308841 (0.0026) [2024-04-27 00:31:42,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5060182016. Throughput: 0: 50932.9. Samples: 2813035040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:31:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 00:31:42,615][49750] Updated weights for policy 0, policy_version 308851 (0.0034) [2024-04-27 00:31:45,510][49750] Updated weights for policy 0, policy_version 308861 (0.0031) [2024-04-27 00:31:47,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5060427776. Throughput: 0: 50867.0. Samples: 2813334220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:31:47,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-27 00:31:48,947][49750] Updated weights for policy 0, policy_version 308871 (0.0030) [2024-04-27 00:31:51,926][49750] Updated weights for policy 0, policy_version 308881 (0.0032) [2024-04-27 00:31:52,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5060706304. Throughput: 0: 50867.5. Samples: 2813489560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:31:52,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-27 00:31:55,310][49750] Updated weights for policy 0, policy_version 308891 (0.0030) [2024-04-27 00:31:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5060935680. Throughput: 0: 50768.9. Samples: 2813790920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:31:57,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 00:31:58,462][49750] Updated weights for policy 0, policy_version 308901 (0.0033) [2024-04-27 00:32:01,874][49750] Updated weights for policy 0, policy_version 308911 (0.0035) [2024-04-27 00:32:02,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 5061214208. Throughput: 0: 50832.8. Samples: 2814101560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-27 00:32:04,855][49750] Updated weights for policy 0, policy_version 308921 (0.0035) [2024-04-27 00:32:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5061443584. Throughput: 0: 50908.9. Samples: 2814250120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:07,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 00:32:08,285][49750] Updated weights for policy 0, policy_version 308931 (0.0032) [2024-04-27 00:32:11,369][49750] Updated weights for policy 0, policy_version 308941 (0.0030) [2024-04-27 00:32:12,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5061722112. Throughput: 0: 50847.8. Samples: 2814556280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:12,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:32:14,675][49750] Updated weights for policy 0, policy_version 308951 (0.0035) [2024-04-27 00:32:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5061951488. Throughput: 0: 50849.8. Samples: 2814861060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:17,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 00:32:18,005][49750] Updated weights for policy 0, policy_version 308961 (0.0030) [2024-04-27 00:32:21,044][49750] Updated weights for policy 0, policy_version 308971 (0.0031) [2024-04-27 00:32:22,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.4, 300 sec: 50818.5). Total num frames: 5062197248. Throughput: 0: 50797.7. Samples: 2815013520. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:22,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 00:32:24,518][49750] Updated weights for policy 0, policy_version 308981 (0.0030) [2024-04-27 00:32:27,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5062459392. Throughput: 0: 50691.0. Samples: 2815316140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:27,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 00:32:27,163][49728] Signal inference workers to stop experience collection... (42300 times) [2024-04-27 00:32:27,163][49728] Signal inference workers to resume experience collection... (42300 times) [2024-04-27 00:32:27,175][49750] InferenceWorker_p0-w0: stopping experience collection (42300 times) [2024-04-27 00:32:27,175][49750] InferenceWorker_p0-w0: resuming experience collection (42300 times) [2024-04-27 00:32:27,474][49750] Updated weights for policy 0, policy_version 308991 (0.0031) [2024-04-27 00:32:30,835][49750] Updated weights for policy 0, policy_version 309001 (0.0032) [2024-04-27 00:32:32,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 5062705152. Throughput: 0: 50835.9. Samples: 2815621840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:32,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-27 00:32:33,880][49750] Updated weights for policy 0, policy_version 309011 (0.0033) [2024-04-27 00:32:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5062983680. Throughput: 0: 50676.5. Samples: 2815770000. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:37,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 00:32:37,183][49750] Updated weights for policy 0, policy_version 309021 (0.0029) [2024-04-27 00:32:40,226][49750] Updated weights for policy 0, policy_version 309031 (0.0031) [2024-04-27 00:32:42,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5063213056. Throughput: 0: 50693.2. Samples: 2816072120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:42,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 00:32:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000309034_5063213056.pth... [2024-04-27 00:32:42,131][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000308290_5051023360.pth [2024-04-27 00:32:43,618][49750] Updated weights for policy 0, policy_version 309041 (0.0035) [2024-04-27 00:32:46,757][49750] Updated weights for policy 0, policy_version 309051 (0.0030) [2024-04-27 00:32:47,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5063507968. Throughput: 0: 50560.6. Samples: 2816376780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 00:32:50,123][49750] Updated weights for policy 0, policy_version 309061 (0.0039) [2024-04-27 00:32:52,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5063737344. Throughput: 0: 50670.8. Samples: 2816530300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 00:32:52,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 00:32:53,291][49750] Updated weights for policy 0, policy_version 309071 (0.0036) [2024-04-27 00:32:56,527][49750] Updated weights for policy 0, policy_version 309081 (0.0030) [2024-04-27 00:32:57,063][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.4, 300 sec: 50818.2). 
Total num frames: 5063999488. Throughput: 0: 50740.7. Samples: 2816839620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:32:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 00:32:59,843][49750] Updated weights for policy 0, policy_version 309091 (0.0031) [2024-04-27 00:33:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5064245248. Throughput: 0: 50828.4. Samples: 2817148340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 00:33:02,823][49750] Updated weights for policy 0, policy_version 309101 (0.0030) [2024-04-27 00:33:06,374][49750] Updated weights for policy 0, policy_version 309111 (0.0031) [2024-04-27 00:33:07,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5064491008. Throughput: 0: 50783.5. Samples: 2817298780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 00:33:09,249][49750] Updated weights for policy 0, policy_version 309121 (0.0030) [2024-04-27 00:33:12,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5064769536. Throughput: 0: 50890.6. Samples: 2817606220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 00:33:12,680][49750] Updated weights for policy 0, policy_version 309131 (0.0036) [2024-04-27 00:33:15,729][49750] Updated weights for policy 0, policy_version 309141 (0.0030) [2024-04-27 00:33:17,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5065015296. Throughput: 0: 50819.4. Samples: 2817908700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:17,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-27 00:33:19,275][49750] Updated weights for policy 0, policy_version 309151 (0.0030) [2024-04-27 00:33:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5065277440. Throughput: 0: 50899.0. Samples: 2818060460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 00:33:22,422][49750] Updated weights for policy 0, policy_version 309161 (0.0036) [2024-04-27 00:33:25,711][49750] Updated weights for policy 0, policy_version 309171 (0.0029) [2024-04-27 00:33:27,062][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5065490432. Throughput: 0: 50915.6. Samples: 2818363320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:27,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:33:28,985][49750] Updated weights for policy 0, policy_version 309181 (0.0029) [2024-04-27 00:33:29,645][49728] Signal inference workers to stop experience collection... (42350 times) [2024-04-27 00:33:29,646][49728] Signal inference workers to resume experience collection... (42350 times) [2024-04-27 00:33:29,661][49750] InferenceWorker_p0-w0: stopping experience collection (42350 times) [2024-04-27 00:33:29,661][49750] InferenceWorker_p0-w0: resuming experience collection (42350 times) [2024-04-27 00:33:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.7, 300 sec: 50873.7). Total num frames: 5065768960. Throughput: 0: 50800.9. Samples: 2818662820. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:32,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 00:33:32,116][49750] Updated weights for policy 0, policy_version 309191 (0.0033) [2024-04-27 00:33:35,591][49750] Updated weights for policy 0, policy_version 309201 (0.0033) [2024-04-27 00:33:37,063][49517] Fps is (10 sec: 54067.1, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5066031104. Throughput: 0: 50946.5. Samples: 2818822900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:37,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 00:33:38,744][49750] Updated weights for policy 0, policy_version 309211 (0.0032) [2024-04-27 00:33:42,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5066260480. Throughput: 0: 50843.1. Samples: 2819127560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 00:33:42,266][49750] Updated weights for policy 0, policy_version 309221 (0.0027) [2024-04-27 00:33:45,059][49750] Updated weights for policy 0, policy_version 309231 (0.0030) [2024-04-27 00:33:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5066522624. Throughput: 0: 50843.5. Samples: 2819436300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:47,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 00:33:48,678][49750] Updated weights for policy 0, policy_version 309241 (0.0035) [2024-04-27 00:33:51,467][49750] Updated weights for policy 0, policy_version 309251 (0.0026) [2024-04-27 00:33:52,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 5066784768. Throughput: 0: 50733.3. Samples: 2819581780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:52,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 00:33:54,977][49750] Updated weights for policy 0, policy_version 309261 (0.0028) [2024-04-27 00:33:57,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 5067046912. Throughput: 0: 50748.6. Samples: 2819889900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:33:57,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-27 00:33:58,017][49750] Updated weights for policy 0, policy_version 309271 (0.0031) [2024-04-27 00:34:01,402][49750] Updated weights for policy 0, policy_version 309281 (0.0032) [2024-04-27 00:34:02,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5067276288. Throughput: 0: 50619.9. Samples: 2820186600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:34:02,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 00:34:04,600][49750] Updated weights for policy 0, policy_version 309291 (0.0036) [2024-04-27 00:34:07,062][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5067554816. Throughput: 0: 50644.0. Samples: 2820339440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:34:07,063][49517] Avg episode reward: [(0, '0.680')] [2024-04-27 00:34:07,903][49750] Updated weights for policy 0, policy_version 309301 (0.0033) [2024-04-27 00:34:11,189][49750] Updated weights for policy 0, policy_version 309311 (0.0031) [2024-04-27 00:34:12,062][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 5067767808. Throughput: 0: 50645.4. Samples: 2820642360. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:12,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 00:34:14,234][49750] Updated weights for policy 0, policy_version 309321 (0.0031) [2024-04-27 00:34:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 5068046336. Throughput: 0: 50687.9. Samples: 2820943780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:17,072][49517] Avg episode reward: [(0, '0.559')] [2024-04-27 00:34:17,515][49750] Updated weights for policy 0, policy_version 309331 (0.0033) [2024-04-27 00:34:20,550][49750] Updated weights for policy 0, policy_version 309341 (0.0026) [2024-04-27 00:34:22,062][49517] Fps is (10 sec: 54067.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5068308480. Throughput: 0: 50585.4. Samples: 2821099240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:22,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-27 00:34:23,901][49750] Updated weights for policy 0, policy_version 309351 (0.0029) [2024-04-27 00:34:27,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5068554240. Throughput: 0: 50599.2. Samples: 2821404520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:27,072][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 00:34:27,074][49750] Updated weights for policy 0, policy_version 309361 (0.0031) [2024-04-27 00:34:30,416][49750] Updated weights for policy 0, policy_version 309371 (0.0030) [2024-04-27 00:34:32,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5068800000. Throughput: 0: 50689.7. Samples: 2821717340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:32,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-27 00:34:33,544][49750] Updated weights for policy 0, policy_version 309381 (0.0025) [2024-04-27 00:34:36,896][49750] Updated weights for policy 0, policy_version 309391 (0.0029) [2024-04-27 00:34:37,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5069062144. Throughput: 0: 50648.5. Samples: 2821860960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:37,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-27 00:34:40,122][49750] Updated weights for policy 0, policy_version 309401 (0.0035) [2024-04-27 00:34:42,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5069340672. Throughput: 0: 50560.6. Samples: 2822165140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:42,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 00:34:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000309408_5069340672.pth... [2024-04-27 00:34:42,139][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000308662_5057118208.pth [2024-04-27 00:34:43,190][49728] Signal inference workers to stop experience collection... (42400 times) [2024-04-27 00:34:43,190][49728] Signal inference workers to resume experience collection... (42400 times) [2024-04-27 00:34:43,212][49750] InferenceWorker_p0-w0: stopping experience collection (42400 times) [2024-04-27 00:34:43,212][49750] InferenceWorker_p0-w0: resuming experience collection (42400 times) [2024-04-27 00:34:43,334][49750] Updated weights for policy 0, policy_version 309411 (0.0035) [2024-04-27 00:34:46,446][49750] Updated weights for policy 0, policy_version 309421 (0.0034) [2024-04-27 00:34:47,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5069570048. Throughput: 0: 50627.9. 
Samples: 2822464860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:47,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 00:34:49,675][49750] Updated weights for policy 0, policy_version 309431 (0.0034) [2024-04-27 00:34:52,062][49517] Fps is (10 sec: 45876.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5069799424. Throughput: 0: 50753.9. Samples: 2822623360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:52,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 00:34:52,811][49750] Updated weights for policy 0, policy_version 309441 (0.0030) [2024-04-27 00:34:56,078][49750] Updated weights for policy 0, policy_version 309451 (0.0032) [2024-04-27 00:34:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5070061568. Throughput: 0: 50804.5. Samples: 2822928560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:34:57,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 00:34:59,309][49750] Updated weights for policy 0, policy_version 309461 (0.0037) [2024-04-27 00:35:02,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5070323712. Throughput: 0: 50877.5. Samples: 2823233260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:35:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 00:35:02,558][49750] Updated weights for policy 0, policy_version 309471 (0.0030) [2024-04-27 00:35:05,718][49750] Updated weights for policy 0, policy_version 309481 (0.0034) [2024-04-27 00:35:07,063][49517] Fps is (10 sec: 52427.5, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 5070585856. Throughput: 0: 50834.9. Samples: 2823386820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:35:07,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-27 00:35:09,176][49750] Updated weights for policy 0, policy_version 309491 (0.0035) [2024-04-27 00:35:12,063][49517] Fps is (10 sec: 52427.6, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 5070848000. Throughput: 0: 50841.7. Samples: 2823692400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:35:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 00:35:12,197][49750] Updated weights for policy 0, policy_version 309501 (0.0027) [2024-04-27 00:35:15,654][49750] Updated weights for policy 0, policy_version 309511 (0.0033) [2024-04-27 00:35:17,062][49517] Fps is (10 sec: 49153.6, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 5071077376. Throughput: 0: 50681.6. Samples: 2823998000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:35:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 00:35:18,675][49750] Updated weights for policy 0, policy_version 309521 (0.0035) [2024-04-27 00:35:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5071339520. Throughput: 0: 50775.3. Samples: 2824145840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 00:35:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-27 00:35:22,121][49750] Updated weights for policy 0, policy_version 309531 (0.0038) [2024-04-27 00:35:24,989][49750] Updated weights for policy 0, policy_version 309541 (0.0030) [2024-04-27 00:35:27,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5071601664. Throughput: 0: 50749.8. Samples: 2824448880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:35:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 00:35:28,692][49750] Updated weights for policy 0, policy_version 309551 (0.0033) [2024-04-27 00:35:31,515][49750] Updated weights for policy 0, policy_version 309561 (0.0031) [2024-04-27 00:35:32,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5071863808. Throughput: 0: 50926.3. Samples: 2824756540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:35:32,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-27 00:35:34,955][49750] Updated weights for policy 0, policy_version 309571 (0.0030) [2024-04-27 00:35:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 5072093184. Throughput: 0: 50877.3. Samples: 2824912840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:35:37,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-27 00:35:37,951][49750] Updated weights for policy 0, policy_version 309581 (0.0032) [2024-04-27 00:35:41,322][49750] Updated weights for policy 0, policy_version 309591 (0.0029) [2024-04-27 00:35:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5072355328. Throughput: 0: 50800.8. Samples: 2825214600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:35:42,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-27 00:35:44,355][49750] Updated weights for policy 0, policy_version 309601 (0.0034) [2024-04-27 00:35:47,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5072617472. Throughput: 0: 50768.3. Samples: 2825517840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:35:47,072][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 00:35:47,731][49750] Updated weights for policy 0, policy_version 309611 (0.0033) [2024-04-27 00:35:50,785][49750] Updated weights for policy 0, policy_version 309621 (0.0029) [2024-04-27 00:35:52,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5072863232. Throughput: 0: 50870.6. Samples: 2825675980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:35:52,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 00:35:54,154][49750] Updated weights for policy 0, policy_version 309631 (0.0034) [2024-04-27 00:35:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 5073141760. Throughput: 0: 50752.4. Samples: 2825976260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:35:57,072][49517] Avg episode reward: [(0, '0.648')] [2024-04-27 00:35:57,226][49750] Updated weights for policy 0, policy_version 309641 (0.0030) [2024-04-27 00:36:00,541][49750] Updated weights for policy 0, policy_version 309651 (0.0033) [2024-04-27 00:36:01,553][49728] Signal inference workers to stop experience collection... (42450 times) [2024-04-27 00:36:01,593][49750] InferenceWorker_p0-w0: stopping experience collection (42450 times) [2024-04-27 00:36:01,613][49728] Signal inference workers to resume experience collection... (42450 times) [2024-04-27 00:36:01,615][49750] InferenceWorker_p0-w0: resuming experience collection (42450 times) [2024-04-27 00:36:02,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5073371136. Throughput: 0: 50743.0. Samples: 2826281440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:36:02,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 00:36:03,570][49750] Updated weights for policy 0, policy_version 309661 (0.0036) [2024-04-27 00:36:07,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 5073633280. Throughput: 0: 50778.6. Samples: 2826430880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:36:07,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 00:36:07,282][49750] Updated weights for policy 0, policy_version 309671 (0.0035) [2024-04-27 00:36:09,895][49750] Updated weights for policy 0, policy_version 309681 (0.0028) [2024-04-27 00:36:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5073879040. Throughput: 0: 50841.8. Samples: 2826736760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:36:12,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-27 00:36:13,568][49750] Updated weights for policy 0, policy_version 309691 (0.0035) [2024-04-27 00:36:16,365][49750] Updated weights for policy 0, policy_version 309701 (0.0031) [2024-04-27 00:36:17,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5074157568. Throughput: 0: 50573.4. Samples: 2827032340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:36:17,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-27 00:36:20,326][49750] Updated weights for policy 0, policy_version 309711 (0.0031) [2024-04-27 00:36:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5074403328. Throughput: 0: 50722.1. Samples: 2827195340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:36:22,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 00:36:22,886][49750] Updated weights for policy 0, policy_version 309721 (0.0029) [2024-04-27 00:36:26,997][49750] Updated weights for policy 0, policy_version 309731 (0.0032) [2024-04-27 00:36:27,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5074632704. Throughput: 0: 50739.9. Samples: 2827497900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:36:27,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 00:36:29,388][49750] Updated weights for policy 0, policy_version 309741 (0.0030) [2024-04-27 00:36:32,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5074911232. Throughput: 0: 50629.4. Samples: 2827796160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:36:32,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 00:36:33,511][49750] Updated weights for policy 0, policy_version 309751 (0.0034) [2024-04-27 00:36:36,237][49750] Updated weights for policy 0, policy_version 309761 (0.0030) [2024-04-27 00:36:37,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5075156992. Throughput: 0: 50554.0. Samples: 2827950920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 00:36:37,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 00:36:39,807][49750] Updated weights for policy 0, policy_version 309771 (0.0037) [2024-04-27 00:36:42,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5075435520. Throughput: 0: 50663.3. Samples: 2828256100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:36:42,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 00:36:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000309780_5075435520.pth... 
[2024-04-27 00:36:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000309034_5063213056.pth [2024-04-27 00:36:42,695][49750] Updated weights for policy 0, policy_version 309781 (0.0032) [2024-04-27 00:36:46,355][49750] Updated weights for policy 0, policy_version 309791 (0.0032) [2024-04-27 00:36:47,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5075664896. Throughput: 0: 50777.9. Samples: 2828566440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:36:47,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 00:36:49,113][49750] Updated weights for policy 0, policy_version 309801 (0.0030) [2024-04-27 00:36:52,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 5075910656. Throughput: 0: 50758.1. Samples: 2828715000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:36:52,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-27 00:36:52,860][49750] Updated weights for policy 0, policy_version 309811 (0.0027) [2024-04-27 00:36:55,865][49750] Updated weights for policy 0, policy_version 309821 (0.0028) [2024-04-27 00:36:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.5, 300 sec: 50651.6). Total num frames: 5076156416. Throughput: 0: 50650.4. Samples: 2829016020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:36:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 00:36:59,280][49750] Updated weights for policy 0, policy_version 309831 (0.0029) [2024-04-27 00:37:02,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5076418560. Throughput: 0: 50723.5. Samples: 2829314900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:02,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 00:37:02,232][49750] Updated weights for policy 0, policy_version 309841 (0.0035) [2024-04-27 00:37:05,790][49750] Updated weights for policy 0, policy_version 309851 (0.0029) [2024-04-27 00:37:07,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5076697088. Throughput: 0: 50627.1. Samples: 2829473560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:07,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 00:37:08,603][49750] Updated weights for policy 0, policy_version 309861 (0.0036) [2024-04-27 00:37:09,664][49728] Signal inference workers to stop experience collection... (42500 times) [2024-04-27 00:37:09,707][49750] InferenceWorker_p0-w0: stopping experience collection (42500 times) [2024-04-27 00:37:09,768][49728] Signal inference workers to resume experience collection... (42500 times) [2024-04-27 00:37:09,768][49750] InferenceWorker_p0-w0: resuming experience collection (42500 times) [2024-04-27 00:37:12,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5076910080. Throughput: 0: 50646.3. Samples: 2829776980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:12,063][49517] Avg episode reward: [(0, '0.678')] [2024-04-27 00:37:12,173][49750] Updated weights for policy 0, policy_version 309871 (0.0030) [2024-04-27 00:37:15,013][49750] Updated weights for policy 0, policy_version 309881 (0.0032) [2024-04-27 00:37:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5077172224. Throughput: 0: 50844.9. Samples: 2830084180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 00:37:18,587][49750] Updated weights for policy 0, policy_version 309891 (0.0035) [2024-04-27 00:37:21,484][49750] Updated weights for policy 0, policy_version 309901 (0.0030) [2024-04-27 00:37:22,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5077417984. Throughput: 0: 50680.2. Samples: 2830231520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 00:37:25,024][49750] Updated weights for policy 0, policy_version 309911 (0.0040) [2024-04-27 00:37:27,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5077696512. Throughput: 0: 50788.5. Samples: 2830541580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 00:37:27,871][49750] Updated weights for policy 0, policy_version 309921 (0.0029) [2024-04-27 00:37:31,365][49750] Updated weights for policy 0, policy_version 309931 (0.0035) [2024-04-27 00:37:32,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5077942272. Throughput: 0: 50714.2. Samples: 2830848580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:32,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 00:37:34,184][49750] Updated weights for policy 0, policy_version 309941 (0.0034) [2024-04-27 00:37:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 5078188032. Throughput: 0: 50730.0. Samples: 2830997840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 00:37:37,872][49750] Updated weights for policy 0, policy_version 309951 (0.0028) [2024-04-27 00:37:40,589][49750] Updated weights for policy 0, policy_version 309961 (0.0031) [2024-04-27 00:37:42,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 5078450176. Throughput: 0: 50804.8. Samples: 2831302240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 00:37:44,362][49750] Updated weights for policy 0, policy_version 309971 (0.0032) [2024-04-27 00:37:47,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5078712320. Throughput: 0: 50904.0. Samples: 2831605580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-27 00:37:47,211][49750] Updated weights for policy 0, policy_version 309981 (0.0032) [2024-04-27 00:37:50,774][49750] Updated weights for policy 0, policy_version 309991 (0.0033) [2024-04-27 00:37:52,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5078974464. Throughput: 0: 50736.4. Samples: 2831756700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 00:37:52,063][49517] Avg episode reward: [(0, '0.701')] [2024-04-27 00:37:53,823][49750] Updated weights for policy 0, policy_version 310001 (0.0030) [2024-04-27 00:37:57,057][49750] Updated weights for policy 0, policy_version 310011 (0.0032) [2024-04-27 00:37:57,063][49517] Fps is (10 sec: 50787.3, 60 sec: 51062.9, 300 sec: 50762.5). Total num frames: 5079220224. Throughput: 0: 50755.8. Samples: 2832061020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:37:57,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-27 00:38:00,425][49750] Updated weights for policy 0, policy_version 310021 (0.0033) [2024-04-27 00:38:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 5079465984. Throughput: 0: 50846.2. Samples: 2832372260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:02,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-27 00:38:03,419][49750] Updated weights for policy 0, policy_version 310031 (0.0031) [2024-04-27 00:38:06,875][49750] Updated weights for policy 0, policy_version 310041 (0.0030) [2024-04-27 00:38:07,062][49517] Fps is (10 sec: 49155.4, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 5079711744. Throughput: 0: 50841.7. Samples: 2832519400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:07,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 00:38:09,967][49750] Updated weights for policy 0, policy_version 310051 (0.0033) [2024-04-27 00:38:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 5079973888. Throughput: 0: 50664.5. Samples: 2832821480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:12,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 00:38:13,200][49750] Updated weights for policy 0, policy_version 310061 (0.0036) [2024-04-27 00:38:16,376][49750] Updated weights for policy 0, policy_version 310071 (0.0029) [2024-04-27 00:38:17,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 5080252416. Throughput: 0: 50823.8. Samples: 2833135660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:17,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-27 00:38:19,611][49750] Updated weights for policy 0, policy_version 310081 (0.0030) [2024-04-27 00:38:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.2, 300 sec: 50818.1). Total num frames: 5080481792. Throughput: 0: 50976.7. Samples: 2833291800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:22,064][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 00:38:22,639][49750] Updated weights for policy 0, policy_version 310091 (0.0033) [2024-04-27 00:38:24,835][49728] Signal inference workers to stop experience collection... (42550 times) [2024-04-27 00:38:24,888][49750] InferenceWorker_p0-w0: stopping experience collection (42550 times) [2024-04-27 00:38:24,894][49728] Signal inference workers to resume experience collection... (42550 times) [2024-04-27 00:38:24,904][49750] InferenceWorker_p0-w0: resuming experience collection (42550 times) [2024-04-27 00:38:25,944][49750] Updated weights for policy 0, policy_version 310101 (0.0030) [2024-04-27 00:38:27,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5080727552. Throughput: 0: 51038.1. Samples: 2833598960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:27,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 00:38:29,027][49750] Updated weights for policy 0, policy_version 310111 (0.0035) [2024-04-27 00:38:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5080989696. Throughput: 0: 51052.9. Samples: 2833902960. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:32,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 00:38:32,347][49750] Updated weights for policy 0, policy_version 310121 (0.0032) [2024-04-27 00:38:35,424][49750] Updated weights for policy 0, policy_version 310131 (0.0030) [2024-04-27 00:38:37,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.3, 300 sec: 50873.7). Total num frames: 5081268224. Throughput: 0: 51126.5. Samples: 2834057400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:37,063][49517] Avg episode reward: [(0, '0.682')] [2024-04-27 00:38:38,848][49750] Updated weights for policy 0, policy_version 310141 (0.0037) [2024-04-27 00:38:41,747][49750] Updated weights for policy 0, policy_version 310151 (0.0029) [2024-04-27 00:38:42,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5081530368. Throughput: 0: 51196.3. Samples: 2834364820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 00:38:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000310152_5081530368.pth... [2024-04-27 00:38:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000309408_5069340672.pth [2024-04-27 00:38:45,308][49750] Updated weights for policy 0, policy_version 310161 (0.0032) [2024-04-27 00:38:47,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 5081759744. Throughput: 0: 51126.3. Samples: 2834672940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:47,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 00:38:48,093][49750] Updated weights for policy 0, policy_version 310171 (0.0031) [2024-04-27 00:38:51,654][49750] Updated weights for policy 0, policy_version 310181 (0.0032) [2024-04-27 00:38:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50762.6). 
Total num frames: 5082021888. Throughput: 0: 51072.9. Samples: 2834817680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:52,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 00:38:54,594][49750] Updated weights for policy 0, policy_version 310191 (0.0034) [2024-04-27 00:38:57,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50517.7, 300 sec: 50762.6). Total num frames: 5082251264. Throughput: 0: 51058.4. Samples: 2835119120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:38:57,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-27 00:38:58,149][49750] Updated weights for policy 0, policy_version 310201 (0.0036) [2024-04-27 00:39:01,142][49750] Updated weights for policy 0, policy_version 310211 (0.0031) [2024-04-27 00:39:02,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5082546176. Throughput: 0: 50868.2. Samples: 2835424720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 00:39:02,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-27 00:39:04,739][49750] Updated weights for policy 0, policy_version 310221 (0.0027) [2024-04-27 00:39:07,062][49517] Fps is (10 sec: 54068.5, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 5082791936. Throughput: 0: 50958.9. Samples: 2835584940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:07,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:39:07,447][49750] Updated weights for policy 0, policy_version 310231 (0.0035) [2024-04-27 00:39:11,159][49750] Updated weights for policy 0, policy_version 310241 (0.0030) [2024-04-27 00:39:12,063][49517] Fps is (10 sec: 49151.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5083037696. Throughput: 0: 50987.1. Samples: 2835893380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:12,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 00:39:13,773][49750] Updated weights for policy 0, policy_version 310251 (0.0038) [2024-04-27 00:39:17,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5083283456. Throughput: 0: 51059.0. Samples: 2836200620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:17,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 00:39:17,513][49750] Updated weights for policy 0, policy_version 310261 (0.0038) [2024-04-27 00:39:20,386][49750] Updated weights for policy 0, policy_version 310271 (0.0034) [2024-04-27 00:39:22,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 5083545600. Throughput: 0: 50872.7. Samples: 2836346660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:22,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 00:39:23,860][49750] Updated weights for policy 0, policy_version 310281 (0.0034) [2024-04-27 00:39:27,026][49750] Updated weights for policy 0, policy_version 310291 (0.0030) [2024-04-27 00:39:27,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 5083807744. Throughput: 0: 50725.4. Samples: 2836647460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:27,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 00:39:30,283][49750] Updated weights for policy 0, policy_version 310301 (0.0028) [2024-04-27 00:39:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5084053504. Throughput: 0: 50697.7. Samples: 2836954340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:32,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 00:39:33,302][49750] Updated weights for policy 0, policy_version 310311 (0.0040) [2024-04-27 00:39:36,654][49750] Updated weights for policy 0, policy_version 310321 (0.0036) [2024-04-27 00:39:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5084299264. Throughput: 0: 50803.6. Samples: 2837103840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:37,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:39:37,393][49728] Signal inference workers to stop experience collection... (42600 times) [2024-04-27 00:39:37,394][49728] Signal inference workers to resume experience collection... (42600 times) [2024-04-27 00:39:37,418][49750] InferenceWorker_p0-w0: stopping experience collection (42600 times) [2024-04-27 00:39:37,418][49750] InferenceWorker_p0-w0: resuming experience collection (42600 times) [2024-04-27 00:39:39,637][49750] Updated weights for policy 0, policy_version 310331 (0.0036) [2024-04-27 00:39:42,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5084545024. Throughput: 0: 50693.9. Samples: 2837400340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:42,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-27 00:39:43,133][49750] Updated weights for policy 0, policy_version 310341 (0.0034) [2024-04-27 00:39:46,218][49750] Updated weights for policy 0, policy_version 310351 (0.0027) [2024-04-27 00:39:47,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5084839936. Throughput: 0: 50768.0. Samples: 2837709280. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:47,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 00:39:49,621][49750] Updated weights for policy 0, policy_version 310361 (0.0032) [2024-04-27 00:39:52,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5085085696. Throughput: 0: 50634.5. Samples: 2837863500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:52,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 00:39:52,701][49750] Updated weights for policy 0, policy_version 310371 (0.0035) [2024-04-27 00:39:56,398][49750] Updated weights for policy 0, policy_version 310381 (0.0031) [2024-04-27 00:39:57,062][49517] Fps is (10 sec: 47513.4, 60 sec: 51063.6, 300 sec: 50818.1). Total num frames: 5085315072. Throughput: 0: 50760.1. Samples: 2838177580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:39:57,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 00:39:59,034][49750] Updated weights for policy 0, policy_version 310391 (0.0034) [2024-04-27 00:40:02,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.2, 300 sec: 50762.7). Total num frames: 5085560832. Throughput: 0: 50751.8. Samples: 2838484440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:40:02,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 00:40:02,896][49750] Updated weights for policy 0, policy_version 310401 (0.0037) [2024-04-27 00:40:05,334][49750] Updated weights for policy 0, policy_version 310411 (0.0039) [2024-04-27 00:40:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5085822976. Throughput: 0: 50754.5. Samples: 2838630620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:40:07,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 00:40:09,194][49750] Updated weights for policy 0, policy_version 310421 (0.0028) [2024-04-27 00:40:11,829][49750] Updated weights for policy 0, policy_version 310431 (0.0031) [2024-04-27 00:40:12,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5086117888. Throughput: 0: 50777.6. Samples: 2838932460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:40:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 00:40:15,508][49750] Updated weights for policy 0, policy_version 310441 (0.0028) [2024-04-27 00:40:17,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5086347264. Throughput: 0: 50871.1. Samples: 2839243540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 00:40:17,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-27 00:40:18,188][49750] Updated weights for policy 0, policy_version 310451 (0.0040) [2024-04-27 00:40:21,918][49750] Updated weights for policy 0, policy_version 310461 (0.0033) [2024-04-27 00:40:22,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5086593024. Throughput: 0: 50821.2. Samples: 2839390800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:40:22,064][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 00:40:24,629][49750] Updated weights for policy 0, policy_version 310471 (0.0029) [2024-04-27 00:40:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 5086838784. Throughput: 0: 50918.6. Samples: 2839691680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:40:27,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 00:40:28,379][49750] Updated weights for policy 0, policy_version 310481 (0.0030) [2024-04-27 00:40:31,066][49750] Updated weights for policy 0, policy_version 310491 (0.0032) [2024-04-27 00:40:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5087100928. Throughput: 0: 51013.8. Samples: 2840004900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:40:32,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 00:40:34,882][49750] Updated weights for policy 0, policy_version 310501 (0.0032) [2024-04-27 00:40:35,556][49728] Signal inference workers to stop experience collection... (42650 times) [2024-04-27 00:40:35,558][49728] Signal inference workers to resume experience collection... (42650 times) [2024-04-27 00:40:35,576][49750] InferenceWorker_p0-w0: stopping experience collection (42650 times) [2024-04-27 00:40:35,577][49750] InferenceWorker_p0-w0: resuming experience collection (42650 times) [2024-04-27 00:40:37,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5087363072. Throughput: 0: 51167.7. Samples: 2840166040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:40:37,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-27 00:40:37,474][49750] Updated weights for policy 0, policy_version 310511 (0.0035) [2024-04-27 00:40:41,281][49750] Updated weights for policy 0, policy_version 310521 (0.0029) [2024-04-27 00:40:42,063][49517] Fps is (10 sec: 52427.6, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5087625216. Throughput: 0: 50965.2. Samples: 2840471020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:40:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 00:40:42,189][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000310525_5087641600.pth... 
[2024-04-27 00:40:42,231][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000309780_5075435520.pth [2024-04-27 00:40:43,914][49750] Updated weights for policy 0, policy_version 310531 (0.0042) [2024-04-27 00:40:47,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50818.1). Total num frames: 5087854592. Throughput: 0: 50919.5. Samples: 2840775820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:40:47,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-27 00:40:47,697][49750] Updated weights for policy 0, policy_version 310541 (0.0027) [2024-04-27 00:40:50,387][49750] Updated weights for policy 0, policy_version 310551 (0.0029) [2024-04-27 00:40:52,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 5088116736. Throughput: 0: 50992.0. Samples: 2840925260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:40:52,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:40:54,097][49750] Updated weights for policy 0, policy_version 310561 (0.0028) [2024-04-27 00:40:56,864][49750] Updated weights for policy 0, policy_version 310571 (0.0029) [2024-04-27 00:40:57,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5088395264. Throughput: 0: 50977.4. Samples: 2841226440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:40:57,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 00:41:00,559][49750] Updated weights for policy 0, policy_version 310581 (0.0028) [2024-04-27 00:41:02,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5088641024. Throughput: 0: 50837.8. Samples: 2841531240. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:41:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 00:41:03,272][49750] Updated weights for policy 0, policy_version 310591 (0.0032) [2024-04-27 00:41:07,034][49750] Updated weights for policy 0, policy_version 310601 (0.0031) [2024-04-27 00:41:07,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5088886784. Throughput: 0: 50954.7. Samples: 2841683760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:41:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 00:41:09,701][49750] Updated weights for policy 0, policy_version 310611 (0.0029) [2024-04-27 00:41:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5089132544. Throughput: 0: 51097.0. Samples: 2841991040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:41:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 00:41:13,398][49750] Updated weights for policy 0, policy_version 310621 (0.0033) [2024-04-27 00:41:16,126][49750] Updated weights for policy 0, policy_version 310631 (0.0029) [2024-04-27 00:41:17,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5089411072. Throughput: 0: 50759.4. Samples: 2842289080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:41:17,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 00:41:19,773][49750] Updated weights for policy 0, policy_version 310641 (0.0028) [2024-04-27 00:41:22,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 5089656832. Throughput: 0: 50882.7. Samples: 2842455760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:41:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-27 00:41:22,698][49750] Updated weights for policy 0, policy_version 310651 (0.0032) [2024-04-27 00:41:26,271][49750] Updated weights for policy 0, policy_version 310661 (0.0028) [2024-04-27 00:41:27,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5089902592. Throughput: 0: 50949.1. Samples: 2842763720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:41:27,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-27 00:41:29,334][49750] Updated weights for policy 0, policy_version 310671 (0.0033) [2024-04-27 00:41:32,063][49517] Fps is (10 sec: 49150.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5090148352. Throughput: 0: 50885.3. Samples: 2843065660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 00:41:32,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 00:41:32,646][49750] Updated weights for policy 0, policy_version 310681 (0.0030) [2024-04-27 00:41:35,589][49750] Updated weights for policy 0, policy_version 310691 (0.0033) [2024-04-27 00:41:37,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5090410496. Throughput: 0: 50796.4. Samples: 2843211100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:41:37,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:41:39,122][49750] Updated weights for policy 0, policy_version 310701 (0.0038) [2024-04-27 00:41:41,985][49750] Updated weights for policy 0, policy_version 310711 (0.0031) [2024-04-27 00:41:42,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 5090689024. Throughput: 0: 50919.6. Samples: 2843517820. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:41:42,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 00:41:45,532][49750] Updated weights for policy 0, policy_version 310721 (0.0033) [2024-04-27 00:41:47,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5090934784. Throughput: 0: 50909.5. Samples: 2843822180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:41:47,064][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 00:41:48,491][49750] Updated weights for policy 0, policy_version 310731 (0.0031) [2024-04-27 00:41:50,756][49728] Signal inference workers to stop experience collection... (42700 times) [2024-04-27 00:41:50,802][49750] InferenceWorker_p0-w0: stopping experience collection (42700 times) [2024-04-27 00:41:50,864][49728] Signal inference workers to resume experience collection... (42700 times) [2024-04-27 00:41:50,865][49750] InferenceWorker_p0-w0: resuming experience collection (42700 times) [2024-04-27 00:41:51,858][49750] Updated weights for policy 0, policy_version 310741 (0.0030) [2024-04-27 00:41:52,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5091196928. Throughput: 0: 50970.2. Samples: 2843977420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:41:52,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-27 00:41:54,938][49750] Updated weights for policy 0, policy_version 310751 (0.0027) [2024-04-27 00:41:57,062][49517] Fps is (10 sec: 47514.8, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5091409920. Throughput: 0: 50823.6. Samples: 2844278100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:41:57,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 00:41:58,338][49750] Updated weights for policy 0, policy_version 310761 (0.0028) [2024-04-27 00:42:01,455][49750] Updated weights for policy 0, policy_version 310771 (0.0028) [2024-04-27 00:42:02,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5091704832. Throughput: 0: 50958.6. Samples: 2844582220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:02,063][49517] Avg episode reward: [(0, '0.699')] [2024-04-27 00:42:04,853][49750] Updated weights for policy 0, policy_version 310781 (0.0031) [2024-04-27 00:42:07,062][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5091950592. Throughput: 0: 50779.0. Samples: 2844740820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:07,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 00:42:08,020][49750] Updated weights for policy 0, policy_version 310791 (0.0037) [2024-04-27 00:42:11,172][49750] Updated weights for policy 0, policy_version 310801 (0.0031) [2024-04-27 00:42:12,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5092196352. Throughput: 0: 50812.4. Samples: 2845050280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:12,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:42:14,423][49750] Updated weights for policy 0, policy_version 310811 (0.0031) [2024-04-27 00:42:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 5092458496. Throughput: 0: 50808.6. Samples: 2845352040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 00:42:17,690][49750] Updated weights for policy 0, policy_version 310821 (0.0032) [2024-04-27 00:42:20,751][49750] Updated weights for policy 0, policy_version 310831 (0.0028) [2024-04-27 00:42:22,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5092704256. Throughput: 0: 50877.4. Samples: 2845500580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 00:42:24,167][49750] Updated weights for policy 0, policy_version 310841 (0.0028) [2024-04-27 00:42:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5092950016. Throughput: 0: 50886.2. Samples: 2845807700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 00:42:27,279][49750] Updated weights for policy 0, policy_version 310851 (0.0039) [2024-04-27 00:42:30,660][49750] Updated weights for policy 0, policy_version 310861 (0.0034) [2024-04-27 00:42:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5093212160. Throughput: 0: 51010.0. Samples: 2846117620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:32,063][49517] Avg episode reward: [(0, '0.716')] [2024-04-27 00:42:33,612][49750] Updated weights for policy 0, policy_version 310871 (0.0028) [2024-04-27 00:42:36,962][49750] Updated weights for policy 0, policy_version 310881 (0.0030) [2024-04-27 00:42:37,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 5093474304. Throughput: 0: 50923.6. Samples: 2846268980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-27 00:42:39,995][49750] Updated weights for policy 0, policy_version 310891 (0.0027) [2024-04-27 00:42:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5093720064. Throughput: 0: 51123.1. Samples: 2846578640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:42,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 00:42:42,170][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000310897_5093736448.pth... [2024-04-27 00:42:42,218][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000310152_5081530368.pth [2024-04-27 00:42:43,453][49750] Updated weights for policy 0, policy_version 310901 (0.0044) [2024-04-27 00:42:46,419][49750] Updated weights for policy 0, policy_version 310911 (0.0028) [2024-04-27 00:42:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 5093982208. Throughput: 0: 50958.4. Samples: 2846875340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:42:47,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:42:49,800][49750] Updated weights for policy 0, policy_version 310921 (0.0031) [2024-04-27 00:42:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50873.8). Total num frames: 5094227968. Throughput: 0: 50855.6. Samples: 2847029320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:42:52,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-27 00:42:52,832][49750] Updated weights for policy 0, policy_version 310931 (0.0034) [2024-04-27 00:42:56,246][49750] Updated weights for policy 0, policy_version 310941 (0.0027) [2024-04-27 00:42:57,062][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5094473728. Throughput: 0: 50777.7. Samples: 2847335280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:42:57,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 00:42:59,180][49750] Updated weights for policy 0, policy_version 310951 (0.0032) [2024-04-27 00:43:02,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 5094752256. Throughput: 0: 50922.9. Samples: 2847643580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 00:43:02,842][49750] Updated weights for policy 0, policy_version 310961 (0.0031) [2024-04-27 00:43:05,548][49750] Updated weights for policy 0, policy_version 310971 (0.0041) [2024-04-27 00:43:07,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5094998016. Throughput: 0: 50995.5. Samples: 2847795380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:07,063][49517] Avg episode reward: [(0, '0.679')] [2024-04-27 00:43:09,338][49750] Updated weights for policy 0, policy_version 310981 (0.0034) [2024-04-27 00:43:12,045][49750] Updated weights for policy 0, policy_version 310991 (0.0028) [2024-04-27 00:43:12,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.4, 300 sec: 50929.3). Total num frames: 5095276544. Throughput: 0: 50996.4. Samples: 2848102540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:12,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-27 00:43:15,851][49750] Updated weights for policy 0, policy_version 311001 (0.0033) [2024-04-27 00:43:16,638][49728] Signal inference workers to stop experience collection... (42750 times) [2024-04-27 00:43:16,638][49728] Signal inference workers to resume experience collection... 
(42750 times) [2024-04-27 00:43:16,667][49750] InferenceWorker_p0-w0: stopping experience collection (42750 times) [2024-04-27 00:43:16,667][49750] InferenceWorker_p0-w0: resuming experience collection (42750 times) [2024-04-27 00:43:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.3, 300 sec: 50929.3). Total num frames: 5095505920. Throughput: 0: 50987.1. Samples: 2848412040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:43:18,615][49750] Updated weights for policy 0, policy_version 311011 (0.0025) [2024-04-27 00:43:22,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 5095751680. Throughput: 0: 50899.1. Samples: 2848559440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:22,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 00:43:22,166][49750] Updated weights for policy 0, policy_version 311021 (0.0034) [2024-04-27 00:43:24,982][49750] Updated weights for policy 0, policy_version 311031 (0.0033) [2024-04-27 00:43:27,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51609.7, 300 sec: 51040.3). Total num frames: 5096046592. Throughput: 0: 50806.3. Samples: 2848864920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:27,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 00:43:28,458][49750] Updated weights for policy 0, policy_version 311041 (0.0036) [2024-04-27 00:43:31,742][49750] Updated weights for policy 0, policy_version 311051 (0.0037) [2024-04-27 00:43:32,063][49517] Fps is (10 sec: 52427.5, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5096275968. Throughput: 0: 50972.1. Samples: 2849169100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 00:43:34,965][49750] Updated weights for policy 0, policy_version 311061 (0.0035) [2024-04-27 00:43:37,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5096521728. Throughput: 0: 50971.4. Samples: 2849323040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:37,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-27 00:43:38,226][49750] Updated weights for policy 0, policy_version 311071 (0.0034) [2024-04-27 00:43:41,432][49750] Updated weights for policy 0, policy_version 311081 (0.0023) [2024-04-27 00:43:42,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5096767488. Throughput: 0: 50793.8. Samples: 2849621000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:42,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 00:43:44,591][49750] Updated weights for policy 0, policy_version 311091 (0.0032) [2024-04-27 00:43:47,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 5097046016. Throughput: 0: 50721.4. Samples: 2849926040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 00:43:47,784][49750] Updated weights for policy 0, policy_version 311101 (0.0027) [2024-04-27 00:43:51,009][49750] Updated weights for policy 0, policy_version 311111 (0.0030) [2024-04-27 00:43:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5097291776. Throughput: 0: 50908.6. Samples: 2850086260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:52,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 00:43:54,300][49750] Updated weights for policy 0, policy_version 311121 (0.0037) [2024-04-27 00:43:57,062][49517] Fps is (10 sec: 49153.1, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5097537536. Throughput: 0: 50861.6. Samples: 2850391300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:43:57,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 00:43:57,453][49750] Updated weights for policy 0, policy_version 311131 (0.0030) [2024-04-27 00:44:00,881][49750] Updated weights for policy 0, policy_version 311141 (0.0033) [2024-04-27 00:44:02,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5097783296. Throughput: 0: 50823.7. Samples: 2850699100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 00:44:02,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 00:44:03,765][49750] Updated weights for policy 0, policy_version 311151 (0.0028) [2024-04-27 00:44:07,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5098029056. Throughput: 0: 50862.2. Samples: 2850848240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 00:44:07,509][49750] Updated weights for policy 0, policy_version 311161 (0.0034) [2024-04-27 00:44:10,336][49750] Updated weights for policy 0, policy_version 311171 (0.0037) [2024-04-27 00:44:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.5, 300 sec: 50929.3). Total num frames: 5098307584. Throughput: 0: 50796.5. Samples: 2851150760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 00:44:13,973][49750] Updated weights for policy 0, policy_version 311181 (0.0028) [2024-04-27 00:44:16,797][49750] Updated weights for policy 0, policy_version 311191 (0.0039) [2024-04-27 00:44:17,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5098553344. Throughput: 0: 50876.7. Samples: 2851458540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:17,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 00:44:20,518][49750] Updated weights for policy 0, policy_version 311201 (0.0030) [2024-04-27 00:44:22,063][49517] Fps is (10 sec: 50789.3, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5098815488. Throughput: 0: 50939.1. Samples: 2851615300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 00:44:22,788][49728] Signal inference workers to stop experience collection... (42800 times) [2024-04-27 00:44:22,822][49750] InferenceWorker_p0-w0: stopping experience collection (42800 times) [2024-04-27 00:44:22,852][49728] Signal inference workers to resume experience collection... (42800 times) [2024-04-27 00:44:22,852][49750] InferenceWorker_p0-w0: resuming experience collection (42800 times) [2024-04-27 00:44:23,145][49750] Updated weights for policy 0, policy_version 311211 (0.0037) [2024-04-27 00:44:27,023][49750] Updated weights for policy 0, policy_version 311221 (0.0029) [2024-04-27 00:44:27,063][49517] Fps is (10 sec: 49151.2, 60 sec: 49971.0, 300 sec: 50818.1). Total num frames: 5099044864. Throughput: 0: 50887.4. Samples: 2851910940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:27,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 00:44:29,534][49750] Updated weights for policy 0, policy_version 311231 (0.0036) [2024-04-27 00:44:32,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 5099307008. Throughput: 0: 50796.5. Samples: 2852211880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:32,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 00:44:33,306][49750] Updated weights for policy 0, policy_version 311241 (0.0030) [2024-04-27 00:44:36,149][49750] Updated weights for policy 0, policy_version 311251 (0.0034) [2024-04-27 00:44:37,062][49517] Fps is (10 sec: 55706.5, 60 sec: 51336.6, 300 sec: 51040.3). Total num frames: 5099601920. Throughput: 0: 50626.1. Samples: 2852364440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:37,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:44:39,716][49750] Updated weights for policy 0, policy_version 311261 (0.0035) [2024-04-27 00:44:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5099814912. Throughput: 0: 50774.1. Samples: 2852676140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:42,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 00:44:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000311268_5099814912.pth... [2024-04-27 00:44:42,129][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000310525_5087641600.pth [2024-04-27 00:44:42,770][49750] Updated weights for policy 0, policy_version 311271 (0.0028) [2024-04-27 00:44:46,219][49750] Updated weights for policy 0, policy_version 311281 (0.0031) [2024-04-27 00:44:47,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5100077056. Throughput: 0: 50728.0. Samples: 2852981860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:47,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-27 00:44:49,092][49750] Updated weights for policy 0, policy_version 311291 (0.0032) [2024-04-27 00:44:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5100322816. Throughput: 0: 50540.0. Samples: 2853122540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:52,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-27 00:44:52,694][49750] Updated weights for policy 0, policy_version 311301 (0.0029) [2024-04-27 00:44:55,563][49750] Updated weights for policy 0, policy_version 311311 (0.0036) [2024-04-27 00:44:57,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.2, 300 sec: 50929.2). Total num frames: 5100584960. Throughput: 0: 50650.9. Samples: 2853430060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:44:57,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-27 00:44:59,169][49750] Updated weights for policy 0, policy_version 311321 (0.0031) [2024-04-27 00:45:02,055][49750] Updated weights for policy 0, policy_version 311331 (0.0026) [2024-04-27 00:45:02,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5100847104. Throughput: 0: 50559.1. Samples: 2853733700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:45:02,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 00:45:05,530][49750] Updated weights for policy 0, policy_version 311341 (0.0032) [2024-04-27 00:45:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5101092864. Throughput: 0: 50646.0. Samples: 2853894360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:45:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 00:45:08,344][49750] Updated weights for policy 0, policy_version 311351 (0.0033) [2024-04-27 00:45:12,062][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5101322240. Throughput: 0: 50725.9. Samples: 2854193600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:45:12,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 00:45:12,122][49750] Updated weights for policy 0, policy_version 311361 (0.0034) [2024-04-27 00:45:14,616][49750] Updated weights for policy 0, policy_version 311371 (0.0033) [2024-04-27 00:45:17,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5101584384. Throughput: 0: 50780.4. Samples: 2854497000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 00:45:17,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 00:45:18,450][49750] Updated weights for policy 0, policy_version 311381 (0.0030) [2024-04-27 00:45:21,057][49750] Updated weights for policy 0, policy_version 311391 (0.0035) [2024-04-27 00:45:22,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5101879296. Throughput: 0: 50847.4. Samples: 2854652580. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:45:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 00:45:24,439][49728] Signal inference workers to stop experience collection... (42850 times) [2024-04-27 00:45:24,440][49728] Signal inference workers to resume experience collection... 
(42850 times) [2024-04-27 00:45:24,451][49750] InferenceWorker_p0-w0: stopping experience collection (42850 times) [2024-04-27 00:45:24,451][49750] InferenceWorker_p0-w0: resuming experience collection (42850 times) [2024-04-27 00:45:24,715][49750] Updated weights for policy 0, policy_version 311401 (0.0031) [2024-04-27 00:45:27,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5102108672. Throughput: 0: 50775.2. Samples: 2854961020. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:45:27,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 00:45:27,485][49750] Updated weights for policy 0, policy_version 311411 (0.0033) [2024-04-27 00:45:31,313][49750] Updated weights for policy 0, policy_version 311421 (0.0029) [2024-04-27 00:45:32,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 5102354432. Throughput: 0: 50767.9. Samples: 2855266420. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:45:32,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 00:45:33,955][49750] Updated weights for policy 0, policy_version 311431 (0.0038) [2024-04-27 00:45:37,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49698.1, 300 sec: 50707.1). Total num frames: 5102583808. Throughput: 0: 50762.7. Samples: 2855406860. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:45:37,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-27 00:45:37,818][49750] Updated weights for policy 0, policy_version 311441 (0.0032) [2024-04-27 00:45:40,346][49750] Updated weights for policy 0, policy_version 311451 (0.0035) [2024-04-27 00:45:42,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5102878720. Throughput: 0: 50724.8. Samples: 2855712680. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:45:42,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-27 00:45:44,165][49750] Updated weights for policy 0, policy_version 311461 (0.0031) [2024-04-27 00:45:46,886][49750] Updated weights for policy 0, policy_version 311471 (0.0033) [2024-04-27 00:45:47,062][49517] Fps is (10 sec: 55705.7, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5103140864. Throughput: 0: 50800.0. Samples: 2856019700. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:45:47,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-27 00:45:50,493][49750] Updated weights for policy 0, policy_version 311481 (0.0031) [2024-04-27 00:45:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5103370240. Throughput: 0: 50812.5. Samples: 2856180920. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:45:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 00:45:53,290][49750] Updated weights for policy 0, policy_version 311491 (0.0031) [2024-04-27 00:45:56,979][49750] Updated weights for policy 0, policy_version 311501 (0.0034) [2024-04-27 00:45:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5103632384. Throughput: 0: 51034.3. Samples: 2856490140. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:45:57,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:45:59,604][49750] Updated weights for policy 0, policy_version 311511 (0.0035) [2024-04-27 00:46:02,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5103878144. Throughput: 0: 51089.9. Samples: 2856796040. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:46:02,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 00:46:03,444][49750] Updated weights for policy 0, policy_version 311521 (0.0043) [2024-04-27 00:46:05,979][49750] Updated weights for policy 0, policy_version 311531 (0.0037) [2024-04-27 00:46:07,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5104156672. Throughput: 0: 50900.2. Samples: 2856943080. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:46:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 00:46:09,990][49750] Updated weights for policy 0, policy_version 311541 (0.0036) [2024-04-27 00:46:12,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 5104418816. Throughput: 0: 50932.3. Samples: 2857252980. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:46:12,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 00:46:12,436][49750] Updated weights for policy 0, policy_version 311551 (0.0038) [2024-04-27 00:46:16,357][49750] Updated weights for policy 0, policy_version 311561 (0.0034) [2024-04-27 00:46:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5104664576. Throughput: 0: 50959.6. Samples: 2857559600. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:46:17,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:46:18,871][49750] Updated weights for policy 0, policy_version 311571 (0.0030) [2024-04-27 00:46:22,062][49517] Fps is (10 sec: 45875.7, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 5104877568. Throughput: 0: 51032.0. Samples: 2857703300. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:46:22,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 00:46:22,888][49750] Updated weights for policy 0, policy_version 311581 (0.0031) [2024-04-27 00:46:25,202][49750] Updated weights for policy 0, policy_version 311591 (0.0039) [2024-04-27 00:46:27,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5105156096. Throughput: 0: 50863.3. Samples: 2858001520. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:46:27,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-27 00:46:29,368][49750] Updated weights for policy 0, policy_version 311601 (0.0030) [2024-04-27 00:46:31,644][49750] Updated weights for policy 0, policy_version 311611 (0.0026) [2024-04-27 00:46:32,063][49517] Fps is (10 sec: 57343.2, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 5105451008. Throughput: 0: 50924.3. Samples: 2858311300. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 00:46:32,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 00:46:35,933][49750] Updated weights for policy 0, policy_version 311621 (0.0032) [2024-04-27 00:46:37,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51609.5, 300 sec: 50818.1). Total num frames: 5105680384. Throughput: 0: 51018.0. Samples: 2858476740. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:46:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 00:46:38,189][49750] Updated weights for policy 0, policy_version 311631 (0.0024) [2024-04-27 00:46:41,993][49728] Signal inference workers to stop experience collection... (42900 times) [2024-04-27 00:46:42,038][49750] InferenceWorker_p0-w0: stopping experience collection (42900 times) [2024-04-27 00:46:42,062][49517] Fps is (10 sec: 44237.4, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5105893376. Throughput: 0: 50928.4. Samples: 2858781920. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:46:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-27 00:46:42,097][49728] Signal inference workers to resume experience collection... (42900 times) [2024-04-27 00:46:42,098][49750] InferenceWorker_p0-w0: resuming experience collection (42900 times) [2024-04-27 00:46:42,247][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000311641_5105926144.pth... [2024-04-27 00:46:42,248][49750] Updated weights for policy 0, policy_version 311641 (0.0032) [2024-04-27 00:46:42,289][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000310897_5093736448.pth [2024-04-27 00:46:44,530][49750] Updated weights for policy 0, policy_version 311651 (0.0033) [2024-04-27 00:46:47,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5106171904. Throughput: 0: 50901.4. Samples: 2859086600. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:46:47,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-27 00:46:48,765][49750] Updated weights for policy 0, policy_version 311661 (0.0031) [2024-04-27 00:46:50,876][49750] Updated weights for policy 0, policy_version 311671 (0.0027) [2024-04-27 00:46:52,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5106450432. Throughput: 0: 50909.3. Samples: 2859234000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:46:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 00:46:55,215][49750] Updated weights for policy 0, policy_version 311681 (0.0031) [2024-04-27 00:46:57,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5106712576. Throughput: 0: 50861.4. Samples: 2859541740. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:46:57,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 00:46:57,408][49750] Updated weights for policy 0, policy_version 311691 (0.0037) [2024-04-27 00:47:01,651][49750] Updated weights for policy 0, policy_version 311701 (0.0031) [2024-04-27 00:47:02,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5106941952. Throughput: 0: 50911.6. Samples: 2859850620. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:47:02,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 00:47:03,912][49750] Updated weights for policy 0, policy_version 311711 (0.0027) [2024-04-27 00:47:07,063][49517] Fps is (10 sec: 45874.8, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 5107171328. Throughput: 0: 50748.7. Samples: 2859987000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:47:07,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:47:07,958][49750] Updated weights for policy 0, policy_version 311721 (0.0038) [2024-04-27 00:47:10,352][49750] Updated weights for policy 0, policy_version 311731 (0.0025) [2024-04-27 00:47:12,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5107433472. Throughput: 0: 50923.0. Samples: 2860293060. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:47:12,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 00:47:14,395][49750] Updated weights for policy 0, policy_version 311741 (0.0030) [2024-04-27 00:47:16,758][49750] Updated weights for policy 0, policy_version 311751 (0.0039) [2024-04-27 00:47:17,063][49517] Fps is (10 sec: 55705.9, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5107728384. Throughput: 0: 50743.1. Samples: 2860594740. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:47:17,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-27 00:47:20,986][49750] Updated weights for policy 0, policy_version 311761 (0.0029) [2024-04-27 00:47:22,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.5, 300 sec: 50929.2). Total num frames: 5107974144. Throughput: 0: 50824.1. Samples: 2860763820. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:47:22,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 00:47:23,196][49750] Updated weights for policy 0, policy_version 311771 (0.0035) [2024-04-27 00:47:27,062][49517] Fps is (10 sec: 44237.1, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5108170752. Throughput: 0: 50731.9. Samples: 2861064860. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:47:27,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 00:47:27,478][49750] Updated weights for policy 0, policy_version 311781 (0.0032) [2024-04-27 00:47:29,676][49750] Updated weights for policy 0, policy_version 311791 (0.0031) [2024-04-27 00:47:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 5108449280. Throughput: 0: 50740.3. Samples: 2861369920. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:47:32,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 00:47:33,759][49728] Signal inference workers to stop experience collection... (42950 times) [2024-04-27 00:47:33,765][49728] Signal inference workers to resume experience collection... 
(42950 times) [2024-04-27 00:47:33,780][49750] InferenceWorker_p0-w0: stopping experience collection (42950 times) [2024-04-27 00:47:33,781][49750] InferenceWorker_p0-w0: resuming experience collection (42950 times) [2024-04-27 00:47:33,916][49750] Updated weights for policy 0, policy_version 311801 (0.0033) [2024-04-27 00:47:36,207][49750] Updated weights for policy 0, policy_version 311811 (0.0035) [2024-04-27 00:47:37,063][49517] Fps is (10 sec: 55705.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5108727808. Throughput: 0: 50782.6. Samples: 2861519220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:47:37,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 00:47:40,357][49750] Updated weights for policy 0, policy_version 311821 (0.0033) [2024-04-27 00:47:42,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 5108989952. Throughput: 0: 50733.9. Samples: 2861824760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-04-27 00:47:42,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 00:47:42,680][49750] Updated weights for policy 0, policy_version 311831 (0.0034) [2024-04-27 00:47:46,790][49750] Updated weights for policy 0, policy_version 311841 (0.0029) [2024-04-27 00:47:47,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5109219328. Throughput: 0: 50655.2. Samples: 2862130100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:47:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 00:47:49,169][49750] Updated weights for policy 0, policy_version 311851 (0.0031) [2024-04-27 00:47:52,063][49517] Fps is (10 sec: 47512.4, 60 sec: 50244.1, 300 sec: 50818.1). Total num frames: 5109465088. Throughput: 0: 50708.8. Samples: 2862268900. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:47:52,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 00:47:53,251][49750] Updated weights for policy 0, policy_version 311861 (0.0032) [2024-04-27 00:47:55,966][49750] Updated weights for policy 0, policy_version 311871 (0.0034) [2024-04-27 00:47:57,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5109727232. Throughput: 0: 50684.9. Samples: 2862573880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:47:57,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 00:47:59,567][49750] Updated weights for policy 0, policy_version 311881 (0.0031) [2024-04-27 00:48:02,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5110005760. Throughput: 0: 50867.4. Samples: 2862883780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:02,063][49517] Avg episode reward: [(0, '0.468')] [2024-04-27 00:48:02,615][49750] Updated weights for policy 0, policy_version 311891 (0.0045) [2024-04-27 00:48:05,994][49750] Updated weights for policy 0, policy_version 311901 (0.0029) [2024-04-27 00:48:07,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 5110251520. Throughput: 0: 50659.5. Samples: 2863043500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:07,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-27 00:48:08,908][49750] Updated weights for policy 0, policy_version 311911 (0.0029) [2024-04-27 00:48:12,063][49517] Fps is (10 sec: 47514.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5110480896. Throughput: 0: 50687.5. Samples: 2863345800. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:12,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:48:12,504][49750] Updated weights for policy 0, policy_version 311921 (0.0025) [2024-04-27 00:48:15,397][49750] Updated weights for policy 0, policy_version 311931 (0.0032) [2024-04-27 00:48:17,063][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 5110726656. Throughput: 0: 50591.0. Samples: 2863646520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 00:48:18,829][49750] Updated weights for policy 0, policy_version 311941 (0.0034) [2024-04-27 00:48:21,932][49750] Updated weights for policy 0, policy_version 311951 (0.0030) [2024-04-27 00:48:22,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5111005184. Throughput: 0: 50592.4. Samples: 2863795880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:22,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 00:48:25,194][49750] Updated weights for policy 0, policy_version 311961 (0.0030) [2024-04-27 00:48:27,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 5111267328. Throughput: 0: 50625.3. Samples: 2864102900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:27,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 00:48:28,305][49750] Updated weights for policy 0, policy_version 311971 (0.0032) [2024-04-27 00:48:31,766][49750] Updated weights for policy 0, policy_version 311981 (0.0035) [2024-04-27 00:48:32,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5111513088. Throughput: 0: 50679.0. Samples: 2864410660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:32,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 00:48:34,567][49750] Updated weights for policy 0, policy_version 311991 (0.0028) [2024-04-27 00:48:37,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5111742464. Throughput: 0: 50805.8. Samples: 2864555160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:37,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-27 00:48:38,243][49750] Updated weights for policy 0, policy_version 312001 (0.0037) [2024-04-27 00:48:41,201][49750] Updated weights for policy 0, policy_version 312011 (0.0030) [2024-04-27 00:48:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5112020992. Throughput: 0: 50836.9. Samples: 2864861540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:42,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-27 00:48:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000312013_5112020992.pth... [2024-04-27 00:48:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000311268_5099814912.pth [2024-04-27 00:48:44,662][49750] Updated weights for policy 0, policy_version 312021 (0.0032) [2024-04-27 00:48:47,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5112266752. Throughput: 0: 50723.7. Samples: 2865166340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-27 00:48:47,865][49750] Updated weights for policy 0, policy_version 312031 (0.0033) [2024-04-27 00:48:51,042][49750] Updated weights for policy 0, policy_version 312041 (0.0035) [2024-04-27 00:48:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 5112545280. Throughput: 0: 50713.8. Samples: 2865325620. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:52,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 00:48:54,142][49750] Updated weights for policy 0, policy_version 312051 (0.0034) [2024-04-27 00:48:57,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 5112774656. Throughput: 0: 50780.8. Samples: 2865630940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 00:48:57,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-27 00:48:57,293][49728] Signal inference workers to stop experience collection... (43000 times) [2024-04-27 00:48:57,295][49728] Signal inference workers to resume experience collection... (43000 times) [2024-04-27 00:48:57,323][49750] InferenceWorker_p0-w0: stopping experience collection (43000 times) [2024-04-27 00:48:57,323][49750] InferenceWorker_p0-w0: resuming experience collection (43000 times) [2024-04-27 00:48:57,420][49750] Updated weights for policy 0, policy_version 312061 (0.0028) [2024-04-27 00:49:00,610][49750] Updated weights for policy 0, policy_version 312071 (0.0031) [2024-04-27 00:49:02,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 5113004032. Throughput: 0: 50793.9. Samples: 2865932240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 00:49:03,875][49750] Updated weights for policy 0, policy_version 312081 (0.0039) [2024-04-27 00:49:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5113282560. Throughput: 0: 50681.5. Samples: 2866076540. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:07,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 00:49:07,137][49750] Updated weights for policy 0, policy_version 312091 (0.0035) [2024-04-27 00:49:10,373][49750] Updated weights for policy 0, policy_version 312101 (0.0029) [2024-04-27 00:49:12,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 5113544704. Throughput: 0: 50651.5. Samples: 2866382220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:12,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-27 00:49:13,627][49750] Updated weights for policy 0, policy_version 312111 (0.0034) [2024-04-27 00:49:16,708][49750] Updated weights for policy 0, policy_version 312121 (0.0033) [2024-04-27 00:49:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5113806848. Throughput: 0: 50713.3. Samples: 2866692760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:17,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 00:49:20,083][49750] Updated weights for policy 0, policy_version 312131 (0.0029) [2024-04-27 00:49:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5114036224. Throughput: 0: 50846.3. Samples: 2866843240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:22,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 00:49:23,176][49750] Updated weights for policy 0, policy_version 312141 (0.0032) [2024-04-27 00:49:26,560][49750] Updated weights for policy 0, policy_version 312151 (0.0026) [2024-04-27 00:49:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5114298368. Throughput: 0: 50778.7. Samples: 2867146580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:27,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-27 00:49:29,620][49750] Updated weights for policy 0, policy_version 312161 (0.0035) [2024-04-27 00:49:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 5114544128. Throughput: 0: 50729.3. Samples: 2867449160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:32,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 00:49:32,874][49750] Updated weights for policy 0, policy_version 312171 (0.0037) [2024-04-27 00:49:35,950][49750] Updated weights for policy 0, policy_version 312181 (0.0030) [2024-04-27 00:49:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5114822656. Throughput: 0: 50771.9. Samples: 2867610360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:37,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:49:39,431][49750] Updated weights for policy 0, policy_version 312191 (0.0034) [2024-04-27 00:49:42,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 5115068416. Throughput: 0: 50802.0. Samples: 2867917040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:42,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 00:49:42,366][49750] Updated weights for policy 0, policy_version 312201 (0.0033) [2024-04-27 00:49:46,166][49750] Updated weights for policy 0, policy_version 312211 (0.0028) [2024-04-27 00:49:47,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 5115314176. Throughput: 0: 50824.8. Samples: 2868219360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:47,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 00:49:48,793][49750] Updated weights for policy 0, policy_version 312221 (0.0034) [2024-04-27 00:49:52,063][49517] Fps is (10 sec: 49153.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5115559936. Throughput: 0: 50857.2. Samples: 2868365120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:52,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:49:52,564][49750] Updated weights for policy 0, policy_version 312231 (0.0037) [2024-04-27 00:49:55,300][49750] Updated weights for policy 0, policy_version 312241 (0.0031) [2024-04-27 00:49:57,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5115822080. Throughput: 0: 50753.5. Samples: 2868666120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:49:57,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 00:49:59,103][49750] Updated weights for policy 0, policy_version 312251 (0.0032) [2024-04-27 00:50:01,653][49750] Updated weights for policy 0, policy_version 312261 (0.0032) [2024-04-27 00:50:02,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 5116100608. Throughput: 0: 50683.6. Samples: 2868973520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:50:02,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 00:50:05,757][49750] Updated weights for policy 0, policy_version 312271 (0.0036) [2024-04-27 00:50:07,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5116329984. Throughput: 0: 50865.4. Samples: 2869132180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:50:07,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-27 00:50:08,346][49750] Updated weights for policy 0, policy_version 312281 (0.0036) [2024-04-27 00:50:08,867][49728] Signal inference workers to stop experience collection... 
(43050 times) [2024-04-27 00:50:08,918][49750] InferenceWorker_p0-w0: stopping experience collection (43050 times) [2024-04-27 00:50:08,929][49728] Signal inference workers to resume experience collection... (43050 times) [2024-04-27 00:50:08,937][49750] InferenceWorker_p0-w0: resuming experience collection (43050 times) [2024-04-27 00:50:12,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.4, 300 sec: 50762.7). Total num frames: 5116559360. Throughput: 0: 50788.1. Samples: 2869432040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 00:50:12,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 00:50:12,116][49750] Updated weights for policy 0, policy_version 312291 (0.0029) [2024-04-27 00:50:14,851][49750] Updated weights for policy 0, policy_version 312301 (0.0033) [2024-04-27 00:50:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5116837888. Throughput: 0: 50744.4. Samples: 2869732660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:50:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 00:50:18,495][49750] Updated weights for policy 0, policy_version 312311 (0.0033) [2024-04-27 00:50:21,133][49750] Updated weights for policy 0, policy_version 312321 (0.0031) [2024-04-27 00:50:22,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5117100032. Throughput: 0: 50536.5. Samples: 2869884500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:50:22,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 00:50:25,026][49750] Updated weights for policy 0, policy_version 312331 (0.0027) [2024-04-27 00:50:27,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5117362176. Throughput: 0: 50579.4. Samples: 2870193100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:50:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 00:50:27,497][49750] Updated weights for policy 0, policy_version 312341 (0.0030) [2024-04-27 00:50:31,501][49750] Updated weights for policy 0, policy_version 312351 (0.0033) [2024-04-27 00:50:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5117591552. Throughput: 0: 50749.8. Samples: 2870503100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:50:32,063][49517] Avg episode reward: [(0, '0.501')] [2024-04-27 00:50:34,060][49750] Updated weights for policy 0, policy_version 312361 (0.0032) [2024-04-27 00:50:37,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 5117820928. Throughput: 0: 50573.9. Samples: 2870640940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:50:37,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-27 00:50:37,886][49750] Updated weights for policy 0, policy_version 312371 (0.0033) [2024-04-27 00:50:40,676][49750] Updated weights for policy 0, policy_version 312381 (0.0030) [2024-04-27 00:50:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.4, 300 sec: 50651.5). Total num frames: 5118083072. Throughput: 0: 50776.7. Samples: 2870951080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:50:42,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-27 00:50:42,150][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000312384_5118099456.pth... [2024-04-27 00:50:42,196][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000311641_5105926144.pth [2024-04-27 00:50:44,264][49750] Updated weights for policy 0, policy_version 312391 (0.0032) [2024-04-27 00:50:47,053][49750] Updated weights for policy 0, policy_version 312401 (0.0038) [2024-04-27 00:50:47,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51063.5, 300 sec: 50873.7). 
Total num frames: 5118377984. Throughput: 0: 50661.7. Samples: 2871253300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:50:47,068][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 00:50:50,704][49750] Updated weights for policy 0, policy_version 312411 (0.0030) [2024-04-27 00:50:52,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5118623744. Throughput: 0: 50657.4. Samples: 2871411760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:50:52,071][49517] Avg episode reward: [(0, '0.516')] [2024-04-27 00:50:53,452][49750] Updated weights for policy 0, policy_version 312421 (0.0036) [2024-04-27 00:50:57,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5118853120. Throughput: 0: 50703.5. Samples: 2871713700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:50:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 00:50:57,086][49750] Updated weights for policy 0, policy_version 312431 (0.0031) [2024-04-27 00:50:59,962][49750] Updated weights for policy 0, policy_version 312441 (0.0030) [2024-04-27 00:51:02,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49698.1, 300 sec: 50596.0). Total num frames: 5119082496. Throughput: 0: 50771.7. Samples: 2872017380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:51:02,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 00:51:03,540][49750] Updated weights for policy 0, policy_version 312451 (0.0031) [2024-04-27 00:51:06,448][49750] Updated weights for policy 0, policy_version 312461 (0.0033) [2024-04-27 00:51:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5119377408. Throughput: 0: 50778.7. Samples: 2872169540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:51:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 00:51:10,065][49750] Updated weights for policy 0, policy_version 312471 (0.0031) [2024-04-27 00:51:12,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 5119623168. Throughput: 0: 50441.9. Samples: 2872462980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:51:12,071][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 00:51:12,451][49728] Signal inference workers to stop experience collection... (43100 times) [2024-04-27 00:51:12,451][49728] Signal inference workers to resume experience collection... (43100 times) [2024-04-27 00:51:12,479][49750] InferenceWorker_p0-w0: stopping experience collection (43100 times) [2024-04-27 00:51:12,479][49750] InferenceWorker_p0-w0: resuming experience collection (43100 times) [2024-04-27 00:51:12,906][49750] Updated weights for policy 0, policy_version 312481 (0.0029) [2024-04-27 00:51:16,520][49750] Updated weights for policy 0, policy_version 312491 (0.0030) [2024-04-27 00:51:17,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 5119868928. Throughput: 0: 50335.1. Samples: 2872768180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:51:17,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-27 00:51:19,288][49750] Updated weights for policy 0, policy_version 312501 (0.0035) [2024-04-27 00:51:22,062][49517] Fps is (10 sec: 47513.8, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 5120098304. Throughput: 0: 50456.5. Samples: 2872911480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:51:22,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 00:51:23,117][49750] Updated weights for policy 0, policy_version 312511 (0.0035) [2024-04-27 00:51:25,904][49750] Updated weights for policy 0, policy_version 312521 (0.0027) [2024-04-27 00:51:27,063][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.2, 300 sec: 50540.5). Total num frames: 5120360448. Throughput: 0: 50441.4. Samples: 2873220940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 00:51:27,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-27 00:51:29,458][49750] Updated weights for policy 0, policy_version 312531 (0.0028) [2024-04-27 00:51:32,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5120638976. Throughput: 0: 50451.6. Samples: 2873523620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:51:32,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-27 00:51:32,325][49750] Updated weights for policy 0, policy_version 312541 (0.0036) [2024-04-27 00:51:35,813][49750] Updated weights for policy 0, policy_version 312551 (0.0030) [2024-04-27 00:51:37,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5120884736. Throughput: 0: 50456.0. Samples: 2873682280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:51:37,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 00:51:38,832][49750] Updated weights for policy 0, policy_version 312561 (0.0030) [2024-04-27 00:51:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 5121146880. Throughput: 0: 50647.1. Samples: 2873992820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:51:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 00:51:42,344][49750] Updated weights for policy 0, policy_version 312571 (0.0027) [2024-04-27 00:51:45,229][49750] Updated weights for policy 0, policy_version 312581 (0.0031) [2024-04-27 00:51:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.3, 300 sec: 50596.0). Total num frames: 5121376256. Throughput: 0: 50639.2. Samples: 2874296140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:51:47,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 00:51:48,892][49750] Updated weights for policy 0, policy_version 312591 (0.0031) [2024-04-27 00:51:51,654][49750] Updated weights for policy 0, policy_version 312601 (0.0031) [2024-04-27 00:51:52,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 5121654784. Throughput: 0: 50511.4. Samples: 2874442560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:51:52,072][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 00:51:55,486][49750] Updated weights for policy 0, policy_version 312611 (0.0028) [2024-04-27 00:51:57,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5121900544. Throughput: 0: 50683.1. Samples: 2874743720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:51:57,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 00:51:57,683][49728] Signal inference workers to stop experience collection... (43150 times) [2024-04-27 00:51:57,743][49750] InferenceWorker_p0-w0: stopping experience collection (43150 times) [2024-04-27 00:51:57,748][49728] Signal inference workers to resume experience collection... 
(43150 times) [2024-04-27 00:51:57,757][49750] InferenceWorker_p0-w0: resuming experience collection (43150 times) [2024-04-27 00:51:58,193][49750] Updated weights for policy 0, policy_version 312621 (0.0042) [2024-04-27 00:52:01,802][49750] Updated weights for policy 0, policy_version 312631 (0.0031) [2024-04-27 00:52:02,063][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5122146304. Throughput: 0: 50649.4. Samples: 2875047400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:52:02,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:52:04,548][49750] Updated weights for policy 0, policy_version 312641 (0.0032) [2024-04-27 00:52:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5122392064. Throughput: 0: 50736.9. Samples: 2875194640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:52:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 00:52:08,329][49750] Updated weights for policy 0, policy_version 312651 (0.0030) [2024-04-27 00:52:11,014][49750] Updated weights for policy 0, policy_version 312661 (0.0030) [2024-04-27 00:52:12,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 5122654208. Throughput: 0: 50747.2. Samples: 2875504560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:52:12,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 00:52:14,832][49750] Updated weights for policy 0, policy_version 312671 (0.0034) [2024-04-27 00:52:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.6, 300 sec: 50651.6). Total num frames: 5122916352. Throughput: 0: 50811.2. Samples: 2875810120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:52:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 00:52:17,472][49750] Updated weights for policy 0, policy_version 312681 (0.0035) [2024-04-27 00:52:21,365][49750] Updated weights for policy 0, policy_version 312691 (0.0027) [2024-04-27 00:52:22,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5123178496. Throughput: 0: 50805.2. Samples: 2875968520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:52:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 00:52:24,315][49750] Updated weights for policy 0, policy_version 312701 (0.0030) [2024-04-27 00:52:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 5123424256. Throughput: 0: 50648.5. Samples: 2876272000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:52:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 00:52:27,820][49750] Updated weights for policy 0, policy_version 312711 (0.0035) [2024-04-27 00:52:30,723][49750] Updated weights for policy 0, policy_version 312721 (0.0033) [2024-04-27 00:52:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 5123670016. Throughput: 0: 50798.5. Samples: 2876582080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:52:32,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 00:52:34,273][49750] Updated weights for policy 0, policy_version 312731 (0.0027) [2024-04-27 00:52:37,063][49517] Fps is (10 sec: 50789.3, 60 sec: 50790.2, 300 sec: 50651.5). Total num frames: 5123932160. Throughput: 0: 50767.0. Samples: 2876727080. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:52:37,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 00:52:37,220][49750] Updated weights for policy 0, policy_version 312741 (0.0030) [2024-04-27 00:52:40,635][49750] Updated weights for policy 0, policy_version 312751 (0.0030) [2024-04-27 00:52:42,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 5124194304. Throughput: 0: 51002.4. Samples: 2877038840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 00:52:42,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 00:52:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000312756_5124194304.pth... [2024-04-27 00:52:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000312013_5112020992.pth [2024-04-27 00:52:43,669][49750] Updated weights for policy 0, policy_version 312761 (0.0041) [2024-04-27 00:52:46,969][49750] Updated weights for policy 0, policy_version 312771 (0.0030) [2024-04-27 00:52:47,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.4, 300 sec: 50762.7). Total num frames: 5124440064. Throughput: 0: 50859.1. Samples: 2877336060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:52:47,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 00:52:50,379][49750] Updated weights for policy 0, policy_version 312781 (0.0032) [2024-04-27 00:52:52,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5124685824. Throughput: 0: 50904.8. Samples: 2877485360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:52:52,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 00:52:53,431][49750] Updated weights for policy 0, policy_version 312791 (0.0028) [2024-04-27 00:52:56,828][49750] Updated weights for policy 0, policy_version 312801 (0.0035) [2024-04-27 00:52:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50596.0). 
Total num frames: 5124931584. Throughput: 0: 50743.6. Samples: 2877788020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:52:57,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 00:52:59,907][49750] Updated weights for policy 0, policy_version 312811 (0.0032) [2024-04-27 00:53:02,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 5125226496. Throughput: 0: 50822.5. Samples: 2878097140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:53:03,504][49750] Updated weights for policy 0, policy_version 312821 (0.0033) [2024-04-27 00:53:05,615][49728] Signal inference workers to stop experience collection... (43200 times) [2024-04-27 00:53:05,650][49750] InferenceWorker_p0-w0: stopping experience collection (43200 times) [2024-04-27 00:53:05,683][49728] Signal inference workers to resume experience collection... (43200 times) [2024-04-27 00:53:05,684][49750] InferenceWorker_p0-w0: resuming experience collection (43200 times) [2024-04-27 00:53:06,342][49750] Updated weights for policy 0, policy_version 312831 (0.0031) [2024-04-27 00:53:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5125455872. Throughput: 0: 50794.8. Samples: 2878254280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-27 00:53:10,045][49750] Updated weights for policy 0, policy_version 312841 (0.0028) [2024-04-27 00:53:12,063][49517] Fps is (10 sec: 45874.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5125685248. Throughput: 0: 50776.6. Samples: 2878556960. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:12,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 00:53:12,691][49750] Updated weights for policy 0, policy_version 312851 (0.0028) [2024-04-27 00:53:16,399][49750] Updated weights for policy 0, policy_version 312861 (0.0029) [2024-04-27 00:53:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 5125947392. Throughput: 0: 50609.4. Samples: 2878859500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:17,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-27 00:53:19,270][49750] Updated weights for policy 0, policy_version 312871 (0.0035) [2024-04-27 00:53:22,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 5126209536. Throughput: 0: 50658.8. Samples: 2879006720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 00:53:22,744][49750] Updated weights for policy 0, policy_version 312881 (0.0033) [2024-04-27 00:53:25,726][49750] Updated weights for policy 0, policy_version 312891 (0.0031) [2024-04-27 00:53:27,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5126471680. Throughput: 0: 50602.0. Samples: 2879315920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:27,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 00:53:29,304][49750] Updated weights for policy 0, policy_version 312901 (0.0031) [2024-04-27 00:53:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5126717440. Throughput: 0: 50817.3. Samples: 2879622840. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:32,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 00:53:32,187][49750] Updated weights for policy 0, policy_version 312911 (0.0034) [2024-04-27 00:53:35,889][49750] Updated weights for policy 0, policy_version 312921 (0.0030) [2024-04-27 00:53:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 5126979584. Throughput: 0: 50673.5. Samples: 2879765660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 00:53:38,551][49750] Updated weights for policy 0, policy_version 312931 (0.0028) [2024-04-27 00:53:42,063][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.3, 300 sec: 50596.0). Total num frames: 5127192576. Throughput: 0: 50742.9. Samples: 2880071460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:42,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-27 00:53:42,281][49750] Updated weights for policy 0, policy_version 312941 (0.0031) [2024-04-27 00:53:44,954][49750] Updated weights for policy 0, policy_version 312951 (0.0034) [2024-04-27 00:53:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 5127487488. Throughput: 0: 50608.6. Samples: 2880374520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:47,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 00:53:48,638][49750] Updated weights for policy 0, policy_version 312961 (0.0034) [2024-04-27 00:53:51,503][49750] Updated weights for policy 0, policy_version 312971 (0.0032) [2024-04-27 00:53:52,063][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5127733248. Throughput: 0: 50590.6. Samples: 2880530860. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:52,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 00:53:55,084][49750] Updated weights for policy 0, policy_version 312981 (0.0034) [2024-04-27 00:53:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5127979008. Throughput: 0: 50646.5. Samples: 2880836040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 00:53:57,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 00:53:57,973][49750] Updated weights for policy 0, policy_version 312991 (0.0029) [2024-04-27 00:54:01,509][49750] Updated weights for policy 0, policy_version 313001 (0.0035) [2024-04-27 00:54:02,062][49517] Fps is (10 sec: 49152.2, 60 sec: 49971.3, 300 sec: 50651.5). Total num frames: 5128224768. Throughput: 0: 50754.1. Samples: 2881143440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 00:54:04,337][49750] Updated weights for policy 0, policy_version 313011 (0.0031) [2024-04-27 00:54:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 5128486912. Throughput: 0: 50693.7. Samples: 2881287940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 00:54:07,905][49750] Updated weights for policy 0, policy_version 313021 (0.0032) [2024-04-27 00:54:10,757][49750] Updated weights for policy 0, policy_version 313031 (0.0025) [2024-04-27 00:54:12,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 5128749056. Throughput: 0: 50678.7. Samples: 2881596460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:12,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 00:54:14,142][49728] Signal inference workers to stop experience collection... 
(43250 times) [2024-04-27 00:54:14,147][49728] Signal inference workers to resume experience collection... (43250 times) [2024-04-27 00:54:14,172][49750] InferenceWorker_p0-w0: stopping experience collection (43250 times) [2024-04-27 00:54:14,173][49750] InferenceWorker_p0-w0: resuming experience collection (43250 times) [2024-04-27 00:54:14,272][49750] Updated weights for policy 0, policy_version 313041 (0.0033) [2024-04-27 00:54:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 5128994816. Throughput: 0: 50766.2. Samples: 2881907320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:17,064][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 00:54:17,354][49750] Updated weights for policy 0, policy_version 313051 (0.0029) [2024-04-27 00:54:20,766][49750] Updated weights for policy 0, policy_version 313061 (0.0030) [2024-04-27 00:54:22,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5129256960. Throughput: 0: 50799.9. Samples: 2882051660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:22,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 00:54:23,963][49750] Updated weights for policy 0, policy_version 313071 (0.0038) [2024-04-27 00:54:27,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5129502720. Throughput: 0: 50613.0. Samples: 2882349040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:27,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-27 00:54:27,237][49750] Updated weights for policy 0, policy_version 313081 (0.0033) [2024-04-27 00:54:30,366][49750] Updated weights for policy 0, policy_version 313091 (0.0032) [2024-04-27 00:54:32,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 5129781248. Throughput: 0: 50683.5. Samples: 2882655280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:32,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 00:54:33,615][49750] Updated weights for policy 0, policy_version 313101 (0.0032) [2024-04-27 00:54:36,722][49750] Updated weights for policy 0, policy_version 313111 (0.0028) [2024-04-27 00:54:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 5130010624. Throughput: 0: 50615.3. Samples: 2882808540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:37,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 00:54:40,018][49750] Updated weights for policy 0, policy_version 313121 (0.0026) [2024-04-27 00:54:42,062][49517] Fps is (10 sec: 47513.5, 60 sec: 51063.6, 300 sec: 50651.6). Total num frames: 5130256384. Throughput: 0: 50690.6. Samples: 2883117120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:42,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 00:54:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000313126_5130256384.pth... [2024-04-27 00:54:42,115][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000312384_5118099456.pth [2024-04-27 00:54:43,281][49750] Updated weights for policy 0, policy_version 313131 (0.0028) [2024-04-27 00:54:46,501][49750] Updated weights for policy 0, policy_version 313141 (0.0034) [2024-04-27 00:54:47,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5130518528. Throughput: 0: 50585.8. Samples: 2883419800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 00:54:49,796][49750] Updated weights for policy 0, policy_version 313151 (0.0028) [2024-04-27 00:54:52,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.4, 300 sec: 50651.5). Total num frames: 5130764288. Throughput: 0: 50842.7. Samples: 2883575860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:52,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 00:54:52,881][49750] Updated weights for policy 0, policy_version 313161 (0.0033) [2024-04-27 00:54:56,259][49750] Updated weights for policy 0, policy_version 313171 (0.0032) [2024-04-27 00:54:57,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 5131042816. Throughput: 0: 50775.6. Samples: 2883881360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:54:57,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 00:54:59,228][49750] Updated weights for policy 0, policy_version 313181 (0.0032) [2024-04-27 00:55:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 5131272192. Throughput: 0: 50769.6. Samples: 2884191940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:55:02,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 00:55:02,668][49750] Updated weights for policy 0, policy_version 313191 (0.0036) [2024-04-27 00:55:05,738][49750] Updated weights for policy 0, policy_version 313201 (0.0028) [2024-04-27 00:55:07,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 5131550720. Throughput: 0: 50716.3. Samples: 2884333900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:55:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 00:55:09,018][49750] Updated weights for policy 0, policy_version 313211 (0.0031) [2024-04-27 00:55:12,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.3, 300 sec: 50651.6). Total num frames: 5131780096. Throughput: 0: 50894.2. Samples: 2884639280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 00:55:12,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-27 00:55:12,245][49750] Updated weights for policy 0, policy_version 313221 (0.0033) [2024-04-27 00:55:15,553][49750] Updated weights for policy 0, policy_version 313231 (0.0032) [2024-04-27 00:55:17,063][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 5132058624. Throughput: 0: 50776.3. Samples: 2884940220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:55:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 00:55:18,666][49750] Updated weights for policy 0, policy_version 313241 (0.0036) [2024-04-27 00:55:21,319][49728] Signal inference workers to stop experience collection... (43300 times) [2024-04-27 00:55:21,348][49750] InferenceWorker_p0-w0: stopping experience collection (43300 times) [2024-04-27 00:55:21,420][49728] Signal inference workers to resume experience collection... (43300 times) [2024-04-27 00:55:21,420][49750] InferenceWorker_p0-w0: resuming experience collection (43300 times) [2024-04-27 00:55:22,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.2, 300 sec: 50596.0). Total num frames: 5132288000. Throughput: 0: 50820.7. Samples: 2885095480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:55:22,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 00:55:22,093][49750] Updated weights for policy 0, policy_version 313251 (0.0033) [2024-04-27 00:55:25,170][49750] Updated weights for policy 0, policy_version 313261 (0.0035) [2024-04-27 00:55:27,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 5132533760. Throughput: 0: 50724.0. Samples: 2885399700. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:55:27,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:55:28,377][49750] Updated weights for policy 0, policy_version 313271 (0.0026) [2024-04-27 00:55:31,584][49750] Updated weights for policy 0, policy_version 313281 (0.0031) [2024-04-27 00:55:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5132812288. Throughput: 0: 50880.9. Samples: 2885709440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:55:32,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-27 00:55:34,747][49750] Updated weights for policy 0, policy_version 313291 (0.0036) [2024-04-27 00:55:37,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 5133074432. Throughput: 0: 50870.6. Samples: 2885865040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:55:37,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 00:55:37,974][49750] Updated weights for policy 0, policy_version 313301 (0.0033) [2024-04-27 00:55:41,243][49750] Updated weights for policy 0, policy_version 313311 (0.0028) [2024-04-27 00:55:42,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50707.1). Total num frames: 5133336576. Throughput: 0: 50879.0. Samples: 2886170920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:55:42,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 00:55:44,447][49750] Updated weights for policy 0, policy_version 313321 (0.0029) [2024-04-27 00:55:47,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.3, 300 sec: 50596.0). Total num frames: 5133549568. Throughput: 0: 50770.5. Samples: 2886476620. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:55:47,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 00:55:47,755][49750] Updated weights for policy 0, policy_version 313331 (0.0031) [2024-04-27 00:55:50,926][49750] Updated weights for policy 0, policy_version 313341 (0.0030) [2024-04-27 00:55:52,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5133828096. Throughput: 0: 50854.2. Samples: 2886622340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:55:52,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 00:55:54,190][49750] Updated weights for policy 0, policy_version 313351 (0.0028) [2024-04-27 00:55:57,062][49517] Fps is (10 sec: 54067.6, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5134090240. Throughput: 0: 50729.4. Samples: 2886922100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:55:57,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 00:55:57,328][49750] Updated weights for policy 0, policy_version 313361 (0.0032) [2024-04-27 00:56:00,759][49750] Updated weights for policy 0, policy_version 313371 (0.0030) [2024-04-27 00:56:02,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 5134352384. Throughput: 0: 50798.3. Samples: 2887226140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:56:02,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 00:56:03,749][49750] Updated weights for policy 0, policy_version 313381 (0.0030) [2024-04-27 00:56:07,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 5134565376. Throughput: 0: 50905.1. Samples: 2887386200. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:56:07,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:56:07,273][49750] Updated weights for policy 0, policy_version 313391 (0.0035) [2024-04-27 00:56:10,281][49750] Updated weights for policy 0, policy_version 313401 (0.0030) [2024-04-27 00:56:12,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5134827520. Throughput: 0: 50799.5. Samples: 2887685680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:56:12,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-27 00:56:13,756][49750] Updated weights for policy 0, policy_version 313411 (0.0032) [2024-04-27 00:56:16,654][49750] Updated weights for policy 0, policy_version 313421 (0.0033) [2024-04-27 00:56:17,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5135089664. Throughput: 0: 50809.9. Samples: 2887995880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:56:17,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 00:56:20,154][49750] Updated weights for policy 0, policy_version 313431 (0.0032) [2024-04-27 00:56:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5135351808. Throughput: 0: 50790.8. Samples: 2888150620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:56:22,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:56:23,161][49750] Updated weights for policy 0, policy_version 313441 (0.0031) [2024-04-27 00:56:26,545][49750] Updated weights for policy 0, policy_version 313451 (0.0033) [2024-04-27 00:56:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 5135597568. Throughput: 0: 50798.7. Samples: 2888456860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 00:56:27,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 00:56:29,565][49728] Signal inference workers to stop experience collection... 
(43350 times) [2024-04-27 00:56:29,608][49750] InferenceWorker_p0-w0: stopping experience collection (43350 times) [2024-04-27 00:56:29,625][49728] Signal inference workers to resume experience collection... (43350 times) [2024-04-27 00:56:29,633][49750] InferenceWorker_p0-w0: resuming experience collection (43350 times) [2024-04-27 00:56:29,754][49750] Updated weights for policy 0, policy_version 313461 (0.0028) [2024-04-27 00:56:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5135843328. Throughput: 0: 50702.4. Samples: 2888758220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:56:32,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 00:56:32,804][49750] Updated weights for policy 0, policy_version 313471 (0.0033) [2024-04-27 00:56:36,094][49750] Updated weights for policy 0, policy_version 313481 (0.0030) [2024-04-27 00:56:37,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 5136089088. Throughput: 0: 50657.4. Samples: 2888901920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:56:37,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 00:56:39,314][49750] Updated weights for policy 0, policy_version 313491 (0.0031) [2024-04-27 00:56:42,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5136367616. Throughput: 0: 50722.3. Samples: 2889204600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:56:42,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-27 00:56:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000313499_5136367616.pth... 
[2024-04-27 00:56:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000312756_5124194304.pth [2024-04-27 00:56:42,523][49750] Updated weights for policy 0, policy_version 313501 (0.0025) [2024-04-27 00:56:45,837][49750] Updated weights for policy 0, policy_version 313511 (0.0034) [2024-04-27 00:56:47,062][49517] Fps is (10 sec: 57344.9, 60 sec: 51882.8, 300 sec: 50873.7). Total num frames: 5136662528. Throughput: 0: 50936.6. Samples: 2889518280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:56:47,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 00:56:48,936][49750] Updated weights for policy 0, policy_version 313521 (0.0029) [2024-04-27 00:56:52,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5136875520. Throughput: 0: 50726.1. Samples: 2889668880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:56:52,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-27 00:56:52,169][49750] Updated weights for policy 0, policy_version 313531 (0.0031) [2024-04-27 00:56:55,275][49750] Updated weights for policy 0, policy_version 313541 (0.0028) [2024-04-27 00:56:57,062][49517] Fps is (10 sec: 44236.3, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5137104896. Throughput: 0: 50832.8. Samples: 2889973160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:56:57,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 00:56:58,681][49750] Updated weights for policy 0, policy_version 313551 (0.0031) [2024-04-27 00:57:01,835][49750] Updated weights for policy 0, policy_version 313561 (0.0035) [2024-04-27 00:57:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 5137383424. Throughput: 0: 50678.4. Samples: 2890276420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:57:02,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-27 00:57:05,228][49750] Updated weights for policy 0, policy_version 313571 (0.0035) [2024-04-27 00:57:07,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5137645568. Throughput: 0: 50687.1. Samples: 2890431540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:57:07,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 00:57:08,433][49750] Updated weights for policy 0, policy_version 313581 (0.0035) [2024-04-27 00:57:11,697][49750] Updated weights for policy 0, policy_version 313591 (0.0034) [2024-04-27 00:57:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5137874944. Throughput: 0: 50601.3. Samples: 2890733920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:57:12,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-27 00:57:14,879][49750] Updated weights for policy 0, policy_version 313601 (0.0037) [2024-04-27 00:57:17,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5138137088. Throughput: 0: 50754.1. Samples: 2891042160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:57:17,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-27 00:57:18,121][49750] Updated weights for policy 0, policy_version 313611 (0.0037) [2024-04-27 00:57:21,310][49750] Updated weights for policy 0, policy_version 313621 (0.0042) [2024-04-27 00:57:22,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5138382848. Throughput: 0: 50675.1. Samples: 2891182300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:57:22,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 00:57:24,500][49750] Updated weights for policy 0, policy_version 313631 (0.0031) [2024-04-27 00:57:27,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5138644992. Throughput: 0: 50775.6. Samples: 2891489500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:57:27,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 00:57:27,781][49750] Updated weights for policy 0, policy_version 313641 (0.0033) [2024-04-27 00:57:30,550][49728] Signal inference workers to stop experience collection... (43400 times) [2024-04-27 00:57:30,550][49728] Signal inference workers to resume experience collection... (43400 times) [2024-04-27 00:57:30,580][49750] InferenceWorker_p0-w0: stopping experience collection (43400 times) [2024-04-27 00:57:30,580][49750] InferenceWorker_p0-w0: resuming experience collection (43400 times) [2024-04-27 00:57:30,947][49750] Updated weights for policy 0, policy_version 313651 (0.0030) [2024-04-27 00:57:32,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5138907136. Throughput: 0: 50650.5. Samples: 2891797560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:57:32,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:57:34,350][49750] Updated weights for policy 0, policy_version 313661 (0.0031) [2024-04-27 00:57:37,062][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 5139152896. Throughput: 0: 50768.8. Samples: 2891953480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:57:37,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-27 00:57:37,482][49750] Updated weights for policy 0, policy_version 313671 (0.0032) [2024-04-27 00:57:40,754][49750] Updated weights for policy 0, policy_version 313681 (0.0032) [2024-04-27 00:57:42,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 5139382272. Throughput: 0: 50869.3. Samples: 2892262280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:57:42,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-27 00:57:43,897][49750] Updated weights for policy 0, policy_version 313691 (0.0031) [2024-04-27 00:57:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 49971.1, 300 sec: 50762.6). Total num frames: 5139660800. Throughput: 0: 50795.7. Samples: 2892562220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:57:47,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 00:57:47,309][49750] Updated weights for policy 0, policy_version 313701 (0.0032) [2024-04-27 00:57:50,216][49750] Updated weights for policy 0, policy_version 313711 (0.0032) [2024-04-27 00:57:52,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5139922944. Throughput: 0: 50789.6. Samples: 2892717080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:57:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 00:57:53,723][49750] Updated weights for policy 0, policy_version 313721 (0.0031) [2024-04-27 00:57:56,608][49750] Updated weights for policy 0, policy_version 313731 (0.0035) [2024-04-27 00:57:57,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50651.6). Total num frames: 5140168704. Throughput: 0: 50792.4. Samples: 2893019580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:57:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 00:58:00,172][49750] Updated weights for policy 0, policy_version 313741 (0.0038) [2024-04-27 00:58:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5140430848. Throughput: 0: 50825.5. Samples: 2893329320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:02,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 00:58:03,071][49750] Updated weights for policy 0, policy_version 313751 (0.0032) [2024-04-27 00:58:06,543][49750] Updated weights for policy 0, policy_version 313761 (0.0030) [2024-04-27 00:58:07,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5140676608. Throughput: 0: 50969.4. Samples: 2893475920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:07,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 00:58:09,482][49750] Updated weights for policy 0, policy_version 313771 (0.0028) [2024-04-27 00:58:12,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5140922368. Throughput: 0: 50837.2. Samples: 2893777180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:12,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-27 00:58:12,969][49750] Updated weights for policy 0, policy_version 313781 (0.0039) [2024-04-27 00:58:15,887][49750] Updated weights for policy 0, policy_version 313791 (0.0032) [2024-04-27 00:58:17,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5141184512. Throughput: 0: 50814.3. Samples: 2894084200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:17,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 00:58:19,642][49750] Updated weights for policy 0, policy_version 313801 (0.0033) [2024-04-27 00:58:22,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 5141463040. Throughput: 0: 50803.2. Samples: 2894239620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:22,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 00:58:22,477][49750] Updated weights for policy 0, policy_version 313811 (0.0033) [2024-04-27 00:58:26,007][49750] Updated weights for policy 0, policy_version 313821 (0.0039) [2024-04-27 00:58:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5141676032. Throughput: 0: 50713.8. Samples: 2894544400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:27,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 00:58:28,970][49750] Updated weights for policy 0, policy_version 313831 (0.0032) [2024-04-27 00:58:32,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5141938176. Throughput: 0: 50865.4. Samples: 2894851160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:32,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:58:32,416][49750] Updated weights for policy 0, policy_version 313841 (0.0031) [2024-04-27 00:58:35,301][49750] Updated weights for policy 0, policy_version 313851 (0.0035) [2024-04-27 00:58:37,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5142216704. Throughput: 0: 50614.4. Samples: 2894994720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:37,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-27 00:58:38,978][49750] Updated weights for policy 0, policy_version 313861 (0.0032) [2024-04-27 00:58:41,707][49750] Updated weights for policy 0, policy_version 313871 (0.0031) [2024-04-27 00:58:42,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 5142462464. Throughput: 0: 50693.7. Samples: 2895300800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:42,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 00:58:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000313871_5142462464.pth... [2024-04-27 00:58:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000313126_5130256384.pth [2024-04-27 00:58:43,213][49728] Signal inference workers to stop experience collection... (43450 times) [2024-04-27 00:58:43,233][49750] InferenceWorker_p0-w0: stopping experience collection (43450 times) [2024-04-27 00:58:43,317][49728] Signal inference workers to resume experience collection... (43450 times) [2024-04-27 00:58:43,317][49750] InferenceWorker_p0-w0: resuming experience collection (43450 times) [2024-04-27 00:58:45,761][49750] Updated weights for policy 0, policy_version 313881 (0.0027) [2024-04-27 00:58:47,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5142708224. Throughput: 0: 50628.3. Samples: 2895607580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:47,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 00:58:48,511][49750] Updated weights for policy 0, policy_version 313891 (0.0039) [2024-04-27 00:58:52,046][49750] Updated weights for policy 0, policy_version 313901 (0.0039) [2024-04-27 00:58:52,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5142953984. Throughput: 0: 50716.4. 
Samples: 2895758160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:52,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 00:58:54,913][49750] Updated weights for policy 0, policy_version 313911 (0.0025) [2024-04-27 00:58:57,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5143199744. Throughput: 0: 50815.2. Samples: 2896063860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 00:58:57,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 00:58:58,474][49750] Updated weights for policy 0, policy_version 313921 (0.0037) [2024-04-27 00:59:01,244][49750] Updated weights for policy 0, policy_version 313931 (0.0035) [2024-04-27 00:59:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.6, 300 sec: 50762.6). Total num frames: 5143461888. Throughput: 0: 50726.3. Samples: 2896366880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:02,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-27 00:59:04,927][49750] Updated weights for policy 0, policy_version 313941 (0.0032) [2024-04-27 00:59:07,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5143740416. Throughput: 0: 50828.8. Samples: 2896526920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:07,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 00:59:07,566][49750] Updated weights for policy 0, policy_version 313951 (0.0036) [2024-04-27 00:59:11,257][49750] Updated weights for policy 0, policy_version 313961 (0.0033) [2024-04-27 00:59:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 5143969792. Throughput: 0: 50836.5. Samples: 2896832040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:12,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 00:59:14,180][49750] Updated weights for policy 0, policy_version 313971 (0.0031) [2024-04-27 00:59:17,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5144215552. Throughput: 0: 50743.0. Samples: 2897134600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 00:59:17,685][49750] Updated weights for policy 0, policy_version 313981 (0.0031) [2024-04-27 00:59:20,696][49750] Updated weights for policy 0, policy_version 313991 (0.0034) [2024-04-27 00:59:22,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5144477696. Throughput: 0: 50691.1. Samples: 2897275820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:22,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 00:59:24,107][49750] Updated weights for policy 0, policy_version 314001 (0.0032) [2024-04-27 00:59:27,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50707.1). Total num frames: 5144739840. Throughput: 0: 50730.4. Samples: 2897583660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:27,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 00:59:27,169][49750] Updated weights for policy 0, policy_version 314011 (0.0031) [2024-04-27 00:59:30,580][49750] Updated weights for policy 0, policy_version 314021 (0.0032) [2024-04-27 00:59:32,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5145001984. Throughput: 0: 50748.9. Samples: 2897891280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:32,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 00:59:33,478][49750] Updated weights for policy 0, policy_version 314031 (0.0030) [2024-04-27 00:59:36,954][49750] Updated weights for policy 0, policy_version 314041 (0.0029) [2024-04-27 00:59:37,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5145247744. Throughput: 0: 50901.9. Samples: 2898048740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:37,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 00:59:39,875][49750] Updated weights for policy 0, policy_version 314051 (0.0032) [2024-04-27 00:59:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5145493504. Throughput: 0: 50709.1. Samples: 2898345780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:42,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 00:59:43,388][49750] Updated weights for policy 0, policy_version 314061 (0.0034) [2024-04-27 00:59:44,350][49728] Signal inference workers to stop experience collection... (43500 times) [2024-04-27 00:59:44,407][49750] InferenceWorker_p0-w0: stopping experience collection (43500 times) [2024-04-27 00:59:44,414][49728] Signal inference workers to resume experience collection... (43500 times) [2024-04-27 00:59:44,424][49750] InferenceWorker_p0-w0: resuming experience collection (43500 times) [2024-04-27 00:59:46,546][49750] Updated weights for policy 0, policy_version 314071 (0.0032) [2024-04-27 00:59:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5145755648. Throughput: 0: 50758.1. Samples: 2898651000. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:47,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-27 00:59:49,866][49750] Updated weights for policy 0, policy_version 314081 (0.0030) [2024-04-27 00:59:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5146017792. Throughput: 0: 50674.3. Samples: 2898807260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:52,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 00:59:52,959][49750] Updated weights for policy 0, policy_version 314091 (0.0035) [2024-04-27 00:59:56,766][49750] Updated weights for policy 0, policy_version 314101 (0.0030) [2024-04-27 00:59:57,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5146247168. Throughput: 0: 50629.8. Samples: 2899110380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 00:59:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 00:59:59,440][49750] Updated weights for policy 0, policy_version 314111 (0.0027) [2024-04-27 01:00:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5146509312. Throughput: 0: 50738.4. Samples: 2899417820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:00:02,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-27 01:00:03,152][49750] Updated weights for policy 0, policy_version 314121 (0.0033) [2024-04-27 01:00:05,776][49750] Updated weights for policy 0, policy_version 314131 (0.0035) [2024-04-27 01:00:07,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5146755072. Throughput: 0: 50731.4. Samples: 2899558740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:00:07,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 01:00:09,538][49750] Updated weights for policy 0, policy_version 314141 (0.0032) [2024-04-27 01:00:12,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5147033600. Throughput: 0: 50560.4. Samples: 2899858880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:00:12,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-27 01:00:12,307][49750] Updated weights for policy 0, policy_version 314151 (0.0031) [2024-04-27 01:00:15,932][49750] Updated weights for policy 0, policy_version 314161 (0.0034) [2024-04-27 01:00:17,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5147295744. Throughput: 0: 50690.7. Samples: 2900172360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:00:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 01:00:18,769][49750] Updated weights for policy 0, policy_version 314171 (0.0026) [2024-04-27 01:00:22,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5147508736. Throughput: 0: 50720.0. Samples: 2900331140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:00:22,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-27 01:00:22,422][49750] Updated weights for policy 0, policy_version 314181 (0.0034) [2024-04-27 01:00:25,192][49750] Updated weights for policy 0, policy_version 314191 (0.0030) [2024-04-27 01:00:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5147787264. Throughput: 0: 50825.9. Samples: 2900632940. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:00:27,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 01:00:28,848][49750] Updated weights for policy 0, policy_version 314201 (0.0028) [2024-04-27 01:00:31,675][49750] Updated weights for policy 0, policy_version 314211 (0.0036) [2024-04-27 01:00:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5148033024. Throughput: 0: 50840.0. Samples: 2900938800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:00:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:00:35,202][49750] Updated weights for policy 0, policy_version 314221 (0.0030) [2024-04-27 01:00:37,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5148311552. Throughput: 0: 50810.2. Samples: 2901093720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:00:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 01:00:38,024][49750] Updated weights for policy 0, policy_version 314231 (0.0034) [2024-04-27 01:00:41,602][49750] Updated weights for policy 0, policy_version 314241 (0.0039) [2024-04-27 01:00:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5148540928. Throughput: 0: 50749.2. Samples: 2901394100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:00:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 01:00:42,194][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000314243_5148557312.pth... [2024-04-27 01:00:42,197][49728] Signal inference workers to stop experience collection... (43550 times) [2024-04-27 01:00:42,215][49728] Signal inference workers to resume experience collection... 
(43550 times) [2024-04-27 01:00:42,225][49750] InferenceWorker_p0-w0: stopping experience collection (43550 times) [2024-04-27 01:00:42,235][49750] InferenceWorker_p0-w0: resuming experience collection (43550 times) [2024-04-27 01:00:42,251][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000313499_5136367616.pth [2024-04-27 01:00:44,476][49750] Updated weights for policy 0, policy_version 314251 (0.0032) [2024-04-27 01:00:47,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5148803072. Throughput: 0: 50718.8. Samples: 2901700180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:00:47,063][49517] Avg episode reward: [(0, '0.682')] [2024-04-27 01:00:47,930][49750] Updated weights for policy 0, policy_version 314261 (0.0031) [2024-04-27 01:00:51,435][49750] Updated weights for policy 0, policy_version 314271 (0.0035) [2024-04-27 01:00:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 5149032448. Throughput: 0: 50939.2. Samples: 2901851000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:00:52,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 01:00:54,384][49750] Updated weights for policy 0, policy_version 314281 (0.0030) [2024-04-27 01:00:57,062][49517] Fps is (10 sec: 50791.6, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 5149310976. Throughput: 0: 51026.7. Samples: 2902155080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:00:57,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 01:00:57,859][49750] Updated weights for policy 0, policy_version 314291 (0.0034) [2024-04-27 01:01:00,902][49750] Updated weights for policy 0, policy_version 314301 (0.0028) [2024-04-27 01:01:02,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5149589504. Throughput: 0: 50925.2. Samples: 2902464000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:01:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 01:01:04,305][49750] Updated weights for policy 0, policy_version 314311 (0.0037) [2024-04-27 01:01:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5149818880. Throughput: 0: 50865.3. Samples: 2902620080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:01:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 01:01:07,259][49750] Updated weights for policy 0, policy_version 314321 (0.0032) [2024-04-27 01:01:10,646][49750] Updated weights for policy 0, policy_version 314331 (0.0030) [2024-04-27 01:01:12,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5150064640. Throughput: 0: 50831.6. Samples: 2902920360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:01:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 01:01:13,628][49750] Updated weights for policy 0, policy_version 314341 (0.0033) [2024-04-27 01:01:17,008][49750] Updated weights for policy 0, policy_version 314351 (0.0035) [2024-04-27 01:01:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5150326784. Throughput: 0: 50776.0. Samples: 2903223720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:01:17,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 01:01:20,350][49750] Updated weights for policy 0, policy_version 314361 (0.0030) [2024-04-27 01:01:22,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5150588928. Throughput: 0: 50675.6. Samples: 2903374120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:01:22,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 01:01:23,497][49750] Updated weights for policy 0, policy_version 314371 (0.0034) [2024-04-27 01:01:26,844][49750] Updated weights for policy 0, policy_version 314381 (0.0029) [2024-04-27 01:01:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5150834688. Throughput: 0: 50826.1. Samples: 2903681280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 01:01:27,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 01:01:30,079][49750] Updated weights for policy 0, policy_version 314391 (0.0026) [2024-04-27 01:01:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5151096832. Throughput: 0: 50781.5. Samples: 2903985340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:01:32,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 01:01:33,336][49750] Updated weights for policy 0, policy_version 314401 (0.0042) [2024-04-27 01:01:36,562][49750] Updated weights for policy 0, policy_version 314411 (0.0031) [2024-04-27 01:01:37,062][49517] Fps is (10 sec: 47514.6, 60 sec: 49971.3, 300 sec: 50651.6). Total num frames: 5151309824. Throughput: 0: 50722.4. Samples: 2904133500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:01:37,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-27 01:01:39,675][49750] Updated weights for policy 0, policy_version 314421 (0.0034) [2024-04-27 01:01:42,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.3, 300 sec: 50651.5). Total num frames: 5151604736. Throughput: 0: 50775.7. Samples: 2904440000. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:01:42,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 01:01:43,004][49750] Updated weights for policy 0, policy_version 314431 (0.0033) [2024-04-27 01:01:46,121][49750] Updated weights for policy 0, policy_version 314441 (0.0033) [2024-04-27 01:01:47,062][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.6, 300 sec: 50762.6). Total num frames: 5151850496. Throughput: 0: 50692.1. Samples: 2904745140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:01:47,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 01:01:49,330][49750] Updated weights for policy 0, policy_version 314451 (0.0034) [2024-04-27 01:01:51,392][49728] Signal inference workers to stop experience collection... (43600 times) [2024-04-27 01:01:51,432][49750] InferenceWorker_p0-w0: stopping experience collection (43600 times) [2024-04-27 01:01:51,465][49728] Signal inference workers to resume experience collection... (43600 times) [2024-04-27 01:01:51,466][49750] InferenceWorker_p0-w0: resuming experience collection (43600 times) [2024-04-27 01:01:52,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5152096256. Throughput: 0: 50629.7. Samples: 2904898420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:01:52,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-27 01:01:52,582][49750] Updated weights for policy 0, policy_version 314461 (0.0029) [2024-04-27 01:01:56,044][49750] Updated weights for policy 0, policy_version 314471 (0.0029) [2024-04-27 01:01:57,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5152358400. Throughput: 0: 50769.8. Samples: 2905205000. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:01:57,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-27 01:01:59,151][49750] Updated weights for policy 0, policy_version 314481 (0.0033) [2024-04-27 01:02:02,063][49517] Fps is (10 sec: 49151.7, 60 sec: 49971.2, 300 sec: 50651.5). Total num frames: 5152587776. Throughput: 0: 50611.1. Samples: 2905501220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:02:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 01:02:02,566][49750] Updated weights for policy 0, policy_version 314491 (0.0029) [2024-04-27 01:02:05,607][49750] Updated weights for policy 0, policy_version 314501 (0.0030) [2024-04-27 01:02:07,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5152866304. Throughput: 0: 50666.2. Samples: 2905654100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:02:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 01:02:08,858][49750] Updated weights for policy 0, policy_version 314511 (0.0029) [2024-04-27 01:02:11,987][49750] Updated weights for policy 0, policy_version 314521 (0.0029) [2024-04-27 01:02:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5153112064. Throughput: 0: 50655.2. Samples: 2905960760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:02:12,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 01:02:15,320][49750] Updated weights for policy 0, policy_version 314531 (0.0033) [2024-04-27 01:02:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5153374208. Throughput: 0: 50658.8. Samples: 2906264980. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:02:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 01:02:18,466][49750] Updated weights for policy 0, policy_version 314541 (0.0028) [2024-04-27 01:02:21,747][49750] Updated weights for policy 0, policy_version 314551 (0.0029) [2024-04-27 01:02:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5153603584. Throughput: 0: 50725.3. Samples: 2906416140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:02:22,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 01:02:24,946][49750] Updated weights for policy 0, policy_version 314561 (0.0034) [2024-04-27 01:02:27,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5153865728. Throughput: 0: 50645.5. Samples: 2906719040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:02:27,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 01:02:28,200][49750] Updated weights for policy 0, policy_version 314571 (0.0036) [2024-04-27 01:02:31,350][49750] Updated weights for policy 0, policy_version 314581 (0.0036) [2024-04-27 01:02:32,063][49517] Fps is (10 sec: 54066.0, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5154144256. Throughput: 0: 50755.0. Samples: 2907029120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:02:32,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 01:02:34,773][49750] Updated weights for policy 0, policy_version 314591 (0.0031) [2024-04-27 01:02:37,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5154373632. Throughput: 0: 50767.6. Samples: 2907182960. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 01:02:37,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 01:02:37,704][49750] Updated weights for policy 0, policy_version 314601 (0.0033) [2024-04-27 01:02:41,085][49750] Updated weights for policy 0, policy_version 314611 (0.0028) [2024-04-27 01:02:42,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 5154635776. Throughput: 0: 50688.8. Samples: 2907486000. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:02:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 01:02:42,226][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000314616_5154668544.pth... [2024-04-27 01:02:42,269][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000313871_5142462464.pth [2024-04-27 01:02:44,278][49750] Updated weights for policy 0, policy_version 314621 (0.0024) [2024-04-27 01:02:47,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50651.6). Total num frames: 5154865152. Throughput: 0: 50961.8. Samples: 2907794500. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:02:47,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 01:02:47,429][49750] Updated weights for policy 0, policy_version 314631 (0.0026) [2024-04-27 01:02:50,700][49750] Updated weights for policy 0, policy_version 314641 (0.0030) [2024-04-27 01:02:52,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5155143680. Throughput: 0: 50776.0. Samples: 2907939020. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:02:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:02:53,964][49750] Updated weights for policy 0, policy_version 314651 (0.0034) [2024-04-27 01:02:56,989][49728] Signal inference workers to stop experience collection... 
(43650 times) [2024-04-27 01:02:57,043][49750] InferenceWorker_p0-w0: stopping experience collection (43650 times) [2024-04-27 01:02:57,053][49728] Signal inference workers to resume experience collection... (43650 times) [2024-04-27 01:02:57,062][49750] InferenceWorker_p0-w0: resuming experience collection (43650 times) [2024-04-27 01:02:57,062][49517] Fps is (10 sec: 54068.0, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5155405824. Throughput: 0: 50741.1. Samples: 2908244100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:02:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 01:02:57,065][49750] Updated weights for policy 0, policy_version 314661 (0.0031) [2024-04-27 01:03:00,563][49750] Updated weights for policy 0, policy_version 314671 (0.0034) [2024-04-27 01:03:02,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5155667968. Throughput: 0: 50592.3. Samples: 2908541640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:02,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-27 01:03:03,500][49750] Updated weights for policy 0, policy_version 314681 (0.0027) [2024-04-27 01:03:06,827][49750] Updated weights for policy 0, policy_version 314691 (0.0029) [2024-04-27 01:03:07,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5155897344. Throughput: 0: 50740.3. Samples: 2908699460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:07,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 01:03:09,957][49750] Updated weights for policy 0, policy_version 314701 (0.0029) [2024-04-27 01:03:12,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5156143104. Throughput: 0: 50791.6. Samples: 2909004660. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:12,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 01:03:13,197][49750] Updated weights for policy 0, policy_version 314711 (0.0030) [2024-04-27 01:03:16,420][49750] Updated weights for policy 0, policy_version 314721 (0.0031) [2024-04-27 01:03:17,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5156421632. Throughput: 0: 50736.6. Samples: 2909312260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:17,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 01:03:19,659][49750] Updated weights for policy 0, policy_version 314731 (0.0027) [2024-04-27 01:03:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5156651008. Throughput: 0: 50838.2. Samples: 2909470680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:22,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 01:03:22,710][49750] Updated weights for policy 0, policy_version 314741 (0.0036) [2024-04-27 01:03:26,020][49750] Updated weights for policy 0, policy_version 314751 (0.0025) [2024-04-27 01:03:27,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 5156945920. Throughput: 0: 51080.9. Samples: 2909784640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:27,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 01:03:29,134][49750] Updated weights for policy 0, policy_version 314761 (0.0033) [2024-04-27 01:03:32,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50244.5, 300 sec: 50651.6). Total num frames: 5157158912. Throughput: 0: 50931.3. Samples: 2910086400. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:03:32,449][49750] Updated weights for policy 0, policy_version 314771 (0.0031) [2024-04-27 01:03:35,621][49750] Updated weights for policy 0, policy_version 314781 (0.0035) [2024-04-27 01:03:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5157421056. Throughput: 0: 50882.8. Samples: 2910228740. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:37,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 01:03:38,924][49750] Updated weights for policy 0, policy_version 314791 (0.0029) [2024-04-27 01:03:41,985][49750] Updated weights for policy 0, policy_version 314801 (0.0029) [2024-04-27 01:03:42,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5157699584. Throughput: 0: 50819.0. Samples: 2910530960. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:03:44,834][49728] Signal inference workers to stop experience collection... (43700 times) [2024-04-27 01:03:44,835][49728] Signal inference workers to resume experience collection... (43700 times) [2024-04-27 01:03:44,871][49750] InferenceWorker_p0-w0: stopping experience collection (43700 times) [2024-04-27 01:03:44,871][49750] InferenceWorker_p0-w0: resuming experience collection (43700 times) [2024-04-27 01:03:45,428][49750] Updated weights for policy 0, policy_version 314811 (0.0033) [2024-04-27 01:03:47,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5157945344. Throughput: 0: 51064.8. Samples: 2910839560. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 01:03:48,412][49750] Updated weights for policy 0, policy_version 314821 (0.0037) [2024-04-27 01:03:51,727][49750] Updated weights for policy 0, policy_version 314831 (0.0039) [2024-04-27 01:03:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5158191104. Throughput: 0: 51060.6. Samples: 2910997180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 01:03:54,870][49750] Updated weights for policy 0, policy_version 314841 (0.0037) [2024-04-27 01:03:57,062][49517] Fps is (10 sec: 45876.2, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 5158404096. Throughput: 0: 50859.7. Samples: 2911293340. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 01:03:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 01:03:58,235][49750] Updated weights for policy 0, policy_version 314851 (0.0032) [2024-04-27 01:04:01,371][49750] Updated weights for policy 0, policy_version 314861 (0.0028) [2024-04-27 01:04:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5158699008. Throughput: 0: 50780.4. Samples: 2911597380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:02,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 01:04:04,650][49750] Updated weights for policy 0, policy_version 314871 (0.0030) [2024-04-27 01:04:07,063][49517] Fps is (10 sec: 55704.3, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 5158961152. Throughput: 0: 50751.8. Samples: 2911754520. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:07,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 01:04:07,746][49750] Updated weights for policy 0, policy_version 314881 (0.0032) [2024-04-27 01:04:11,107][49750] Updated weights for policy 0, policy_version 314891 (0.0039) [2024-04-27 01:04:12,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5159223296. Throughput: 0: 50619.9. Samples: 2912062540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:12,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-27 01:04:14,111][49750] Updated weights for policy 0, policy_version 314901 (0.0026) [2024-04-27 01:04:17,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5159469056. Throughput: 0: 50926.9. Samples: 2912378120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 01:04:17,397][49750] Updated weights for policy 0, policy_version 314911 (0.0034) [2024-04-27 01:04:20,601][49750] Updated weights for policy 0, policy_version 314921 (0.0037) [2024-04-27 01:04:22,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5159714816. Throughput: 0: 50790.0. Samples: 2912514300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:22,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 01:04:23,842][49750] Updated weights for policy 0, policy_version 314931 (0.0034) [2024-04-27 01:04:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 5159976960. Throughput: 0: 50977.5. Samples: 2912824960. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:27,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 01:04:27,155][49750] Updated weights for policy 0, policy_version 314941 (0.0041) [2024-04-27 01:04:30,373][49750] Updated weights for policy 0, policy_version 314951 (0.0032) [2024-04-27 01:04:32,062][49517] Fps is (10 sec: 50791.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5160222720. Throughput: 0: 50719.8. Samples: 2913121940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:32,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-27 01:04:33,655][49750] Updated weights for policy 0, policy_version 314961 (0.0032) [2024-04-27 01:04:36,777][49750] Updated weights for policy 0, policy_version 314971 (0.0038) [2024-04-27 01:04:37,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5160501248. Throughput: 0: 50811.1. Samples: 2913283680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:04:39,973][49750] Updated weights for policy 0, policy_version 314981 (0.0028) [2024-04-27 01:04:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5160714240. Throughput: 0: 50985.6. Samples: 2913587700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:42,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-27 01:04:42,064][49728] Signal inference workers to stop experience collection... (43750 times) [2024-04-27 01:04:42,065][49728] Signal inference workers to resume experience collection... (43750 times) [2024-04-27 01:04:42,086][49750] InferenceWorker_p0-w0: stopping experience collection (43750 times) [2024-04-27 01:04:42,086][49750] InferenceWorker_p0-w0: resuming experience collection (43750 times) [2024-04-27 01:04:42,213][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000314986_5160730624.pth... 
[2024-04-27 01:04:42,260][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000314243_5148557312.pth [2024-04-27 01:04:43,112][49750] Updated weights for policy 0, policy_version 314991 (0.0028) [2024-04-27 01:04:46,355][49750] Updated weights for policy 0, policy_version 315001 (0.0035) [2024-04-27 01:04:47,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5160976384. Throughput: 0: 50791.1. Samples: 2913882980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:47,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 01:04:49,632][49750] Updated weights for policy 0, policy_version 315011 (0.0027) [2024-04-27 01:04:52,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5161254912. Throughput: 0: 50676.0. Samples: 2914034940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:52,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-27 01:04:52,884][49750] Updated weights for policy 0, policy_version 315021 (0.0035) [2024-04-27 01:04:56,119][49750] Updated weights for policy 0, policy_version 315031 (0.0035) [2024-04-27 01:04:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 5161484288. Throughput: 0: 50741.0. Samples: 2914345880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:04:57,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 01:04:59,242][49750] Updated weights for policy 0, policy_version 315041 (0.0033) [2024-04-27 01:05:02,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5161746432. Throughput: 0: 50648.0. Samples: 2914657280. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:05:02,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 01:05:02,600][49750] Updated weights for policy 0, policy_version 315051 (0.0027) [2024-04-27 01:05:05,611][49750] Updated weights for policy 0, policy_version 315061 (0.0032) [2024-04-27 01:05:07,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5161992192. Throughput: 0: 50827.8. Samples: 2914801540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-04-27 01:05:07,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-27 01:05:08,964][49750] Updated weights for policy 0, policy_version 315071 (0.0028) [2024-04-27 01:05:11,956][49750] Updated weights for policy 0, policy_version 315081 (0.0031) [2024-04-27 01:05:12,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5162287104. Throughput: 0: 50767.8. Samples: 2915109500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:05:15,428][49750] Updated weights for policy 0, policy_version 315091 (0.0032) [2024-04-27 01:05:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5162516480. Throughput: 0: 51016.8. Samples: 2915417700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:17,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 01:05:18,376][49750] Updated weights for policy 0, policy_version 315101 (0.0038) [2024-04-27 01:05:21,744][49750] Updated weights for policy 0, policy_version 315111 (0.0029) [2024-04-27 01:05:22,062][49517] Fps is (10 sec: 49151.9, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5162778624. Throughput: 0: 50729.8. Samples: 2915566520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:22,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 01:05:24,803][49750] Updated weights for policy 0, policy_version 315121 (0.0037) [2024-04-27 01:05:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5163008000. Throughput: 0: 50717.3. Samples: 2915869980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:27,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 01:05:28,313][49750] Updated weights for policy 0, policy_version 315131 (0.0033) [2024-04-27 01:05:31,800][49750] Updated weights for policy 0, policy_version 315141 (0.0040) [2024-04-27 01:05:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5163270144. Throughput: 0: 50840.6. Samples: 2916170800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:32,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-27 01:05:34,849][49750] Updated weights for policy 0, policy_version 315151 (0.0032) [2024-04-27 01:05:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5163515904. Throughput: 0: 50866.8. Samples: 2916323940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-27 01:05:38,408][49750] Updated weights for policy 0, policy_version 315161 (0.0027) [2024-04-27 01:05:39,853][49728] Signal inference workers to stop experience collection... (43800 times) [2024-04-27 01:05:39,853][49728] Signal inference workers to resume experience collection... 
(43800 times) [2024-04-27 01:05:39,882][49750] InferenceWorker_p0-w0: stopping experience collection (43800 times) [2024-04-27 01:05:39,882][49750] InferenceWorker_p0-w0: resuming experience collection (43800 times) [2024-04-27 01:05:41,281][49750] Updated weights for policy 0, policy_version 315171 (0.0033) [2024-04-27 01:05:42,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5163761664. Throughput: 0: 50760.9. Samples: 2916630120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:42,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 01:05:44,705][49750] Updated weights for policy 0, policy_version 315181 (0.0033) [2024-04-27 01:05:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5164023808. Throughput: 0: 50506.2. Samples: 2916930060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:47,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 01:05:47,675][49750] Updated weights for policy 0, policy_version 315191 (0.0032) [2024-04-27 01:05:51,021][49750] Updated weights for policy 0, policy_version 315201 (0.0028) [2024-04-27 01:05:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50244.5, 300 sec: 50707.1). Total num frames: 5164269568. Throughput: 0: 50870.3. Samples: 2917090700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:52,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 01:05:54,230][49750] Updated weights for policy 0, policy_version 315211 (0.0037) [2024-04-27 01:05:57,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 5164548096. Throughput: 0: 50837.7. Samples: 2917397200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:05:57,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-27 01:05:57,519][49750] Updated weights for policy 0, policy_version 315221 (0.0028) [2024-04-27 01:06:00,684][49750] Updated weights for policy 0, policy_version 315231 (0.0028) [2024-04-27 01:06:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5164777472. Throughput: 0: 50592.5. Samples: 2917694360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:06:02,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 01:06:04,098][49750] Updated weights for policy 0, policy_version 315241 (0.0034) [2024-04-27 01:06:07,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 5165056000. Throughput: 0: 50588.3. Samples: 2917843000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:06:07,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 01:06:07,204][49750] Updated weights for policy 0, policy_version 315251 (0.0032) [2024-04-27 01:06:10,688][49750] Updated weights for policy 0, policy_version 315261 (0.0027) [2024-04-27 01:06:12,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5165318144. Throughput: 0: 50785.5. Samples: 2918155320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:06:12,063][49517] Avg episode reward: [(0, '0.670')] [2024-04-27 01:06:13,566][49750] Updated weights for policy 0, policy_version 315271 (0.0033) [2024-04-27 01:06:17,015][49750] Updated weights for policy 0, policy_version 315281 (0.0029) [2024-04-27 01:06:17,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5165563904. Throughput: 0: 50806.5. Samples: 2918457100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:06:17,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 01:06:20,033][49750] Updated weights for policy 0, policy_version 315291 (0.0035) [2024-04-27 01:06:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 5165809664. Throughput: 0: 50831.2. Samples: 2918611340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 01:06:22,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 01:06:23,326][49750] Updated weights for policy 0, policy_version 315301 (0.0031) [2024-04-27 01:06:26,453][49750] Updated weights for policy 0, policy_version 315311 (0.0031) [2024-04-27 01:06:27,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 5166071808. Throughput: 0: 50865.7. Samples: 2918919080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:06:27,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 01:06:29,932][49750] Updated weights for policy 0, policy_version 315321 (0.0034) [2024-04-27 01:06:32,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5166317568. Throughput: 0: 50869.7. Samples: 2919219200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:06:32,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 01:06:33,148][49750] Updated weights for policy 0, policy_version 315331 (0.0032) [2024-04-27 01:06:36,266][49750] Updated weights for policy 0, policy_version 315341 (0.0029) [2024-04-27 01:06:37,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5166579712. Throughput: 0: 50759.8. Samples: 2919374900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:06:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 01:06:38,779][49728] Signal inference workers to stop experience collection... 
(43850 times) [2024-04-27 01:06:38,780][49728] Signal inference workers to resume experience collection... (43850 times) [2024-04-27 01:06:38,804][49750] InferenceWorker_p0-w0: stopping experience collection (43850 times) [2024-04-27 01:06:38,804][49750] InferenceWorker_p0-w0: resuming experience collection (43850 times) [2024-04-27 01:06:39,506][49750] Updated weights for policy 0, policy_version 315351 (0.0037) [2024-04-27 01:06:42,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5166841856. Throughput: 0: 50760.4. Samples: 2919681420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:06:42,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:06:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000315359_5166841856.pth... [2024-04-27 01:06:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000314616_5154668544.pth [2024-04-27 01:06:42,731][49750] Updated weights for policy 0, policy_version 315361 (0.0041) [2024-04-27 01:06:45,819][49750] Updated weights for policy 0, policy_version 315371 (0.0033) [2024-04-27 01:06:47,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5167054848. Throughput: 0: 50919.0. Samples: 2919985720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:06:47,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:06:49,123][49750] Updated weights for policy 0, policy_version 315381 (0.0029) [2024-04-27 01:06:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 5167349760. Throughput: 0: 50847.6. Samples: 2920131140. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:06:52,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-27 01:06:52,341][49750] Updated weights for policy 0, policy_version 315391 (0.0030) [2024-04-27 01:06:55,602][49750] Updated weights for policy 0, policy_version 315401 (0.0025) [2024-04-27 01:06:57,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5167579136. Throughput: 0: 50592.9. Samples: 2920432000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:06:57,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 01:06:58,868][49750] Updated weights for policy 0, policy_version 315411 (0.0032) [2024-04-27 01:07:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5167841280. Throughput: 0: 50650.7. Samples: 2920736380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:07:02,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-27 01:07:02,233][49750] Updated weights for policy 0, policy_version 315421 (0.0037) [2024-04-27 01:07:05,259][49750] Updated weights for policy 0, policy_version 315431 (0.0039) [2024-04-27 01:07:07,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5168087040. Throughput: 0: 50696.3. Samples: 2920892680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:07:07,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-27 01:07:08,575][49750] Updated weights for policy 0, policy_version 315441 (0.0030) [2024-04-27 01:07:11,929][49750] Updated weights for policy 0, policy_version 315451 (0.0033) [2024-04-27 01:07:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5168349184. Throughput: 0: 50777.7. Samples: 2921204080. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:07:12,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 01:07:14,914][49750] Updated weights for policy 0, policy_version 315461 (0.0031) [2024-04-27 01:07:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5168594944. Throughput: 0: 50741.0. Samples: 2921502540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:07:17,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 01:07:18,464][49750] Updated weights for policy 0, policy_version 315471 (0.0030) [2024-04-27 01:07:21,480][49750] Updated weights for policy 0, policy_version 315481 (0.0042) [2024-04-27 01:07:22,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5168873472. Throughput: 0: 50741.3. Samples: 2921658260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:07:22,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 01:07:24,750][49750] Updated weights for policy 0, policy_version 315491 (0.0032) [2024-04-27 01:07:27,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5169119232. Throughput: 0: 50703.5. Samples: 2921963080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:07:27,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 01:07:27,879][49750] Updated weights for policy 0, policy_version 315501 (0.0029) [2024-04-27 01:07:31,114][49750] Updated weights for policy 0, policy_version 315511 (0.0031) [2024-04-27 01:07:32,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5169348608. Throughput: 0: 50711.0. Samples: 2922267720. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:07:32,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-27 01:07:34,155][49750] Updated weights for policy 0, policy_version 315521 (0.0027) [2024-04-27 01:07:37,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5169627136. Throughput: 0: 50758.7. Samples: 2922415280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:07:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 01:07:37,897][49750] Updated weights for policy 0, policy_version 315531 (0.0036) [2024-04-27 01:07:40,539][49750] Updated weights for policy 0, policy_version 315541 (0.0029) [2024-04-27 01:07:42,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5169872896. Throughput: 0: 50849.2. Samples: 2922720220. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:07:42,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 01:07:43,175][49728] Signal inference workers to stop experience collection... (43900 times) [2024-04-27 01:07:43,176][49728] Signal inference workers to resume experience collection... (43900 times) [2024-04-27 01:07:43,189][49750] InferenceWorker_p0-w0: stopping experience collection (43900 times) [2024-04-27 01:07:43,189][49750] InferenceWorker_p0-w0: resuming experience collection (43900 times) [2024-04-27 01:07:44,319][49750] Updated weights for policy 0, policy_version 315551 (0.0032) [2024-04-27 01:07:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5170135040. Throughput: 0: 50879.6. Samples: 2923025960. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:07:47,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-27 01:07:47,171][49750] Updated weights for policy 0, policy_version 315561 (0.0028) [2024-04-27 01:07:50,634][49750] Updated weights for policy 0, policy_version 315571 (0.0037) [2024-04-27 01:07:52,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5170397184. Throughput: 0: 50859.9. Samples: 2923181380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:07:52,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 01:07:53,683][49750] Updated weights for policy 0, policy_version 315581 (0.0028) [2024-04-27 01:07:56,968][49750] Updated weights for policy 0, policy_version 315591 (0.0033) [2024-04-27 01:07:57,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5170642944. Throughput: 0: 50811.6. Samples: 2923490600. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:07:57,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 01:07:59,999][49750] Updated weights for policy 0, policy_version 315601 (0.0031) [2024-04-27 01:08:02,063][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5170905088. Throughput: 0: 51039.0. Samples: 2923799300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:02,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-27 01:08:03,432][49750] Updated weights for policy 0, policy_version 315611 (0.0029) [2024-04-27 01:08:06,577][49750] Updated weights for policy 0, policy_version 315621 (0.0032) [2024-04-27 01:08:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5171150848. Throughput: 0: 50904.2. Samples: 2923948940. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:07,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 01:08:09,934][49750] Updated weights for policy 0, policy_version 315631 (0.0031) [2024-04-27 01:08:12,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5171412992. Throughput: 0: 50977.8. Samples: 2924257080. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 01:08:13,111][49750] Updated weights for policy 0, policy_version 315641 (0.0028) [2024-04-27 01:08:16,301][49750] Updated weights for policy 0, policy_version 315651 (0.0030) [2024-04-27 01:08:17,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5171658752. Throughput: 0: 50867.2. Samples: 2924556740. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 01:08:19,395][49750] Updated weights for policy 0, policy_version 315661 (0.0028) [2024-04-27 01:08:22,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5171920896. Throughput: 0: 50894.3. Samples: 2924705520. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:22,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-27 01:08:23,047][49750] Updated weights for policy 0, policy_version 315671 (0.0032) [2024-04-27 01:08:25,755][49750] Updated weights for policy 0, policy_version 315681 (0.0033) [2024-04-27 01:08:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5172166656. Throughput: 0: 51031.3. Samples: 2925016620. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:27,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 01:08:29,500][49750] Updated weights for policy 0, policy_version 315691 (0.0032) [2024-04-27 01:08:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5172428800. Throughput: 0: 50897.3. Samples: 2925316340. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:32,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-27 01:08:32,162][49750] Updated weights for policy 0, policy_version 315701 (0.0032) [2024-04-27 01:08:35,882][49750] Updated weights for policy 0, policy_version 315711 (0.0033) [2024-04-27 01:08:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5172674560. Throughput: 0: 50946.5. Samples: 2925473960. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:37,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 01:08:38,644][49750] Updated weights for policy 0, policy_version 315721 (0.0037) [2024-04-27 01:08:42,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5172920320. Throughput: 0: 50754.7. Samples: 2925774560. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:08:42,077][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000315730_5172920320.pth... [2024-04-27 01:08:42,145][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000314986_5160730624.pth [2024-04-27 01:08:42,307][49750] Updated weights for policy 0, policy_version 315731 (0.0031) [2024-04-27 01:08:45,014][49750] Updated weights for policy 0, policy_version 315741 (0.0029) [2024-04-27 01:08:47,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5173182464. Throughput: 0: 50743.5. Samples: 2926082760. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:47,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-27 01:08:48,738][49750] Updated weights for policy 0, policy_version 315751 (0.0034) [2024-04-27 01:08:51,412][49750] Updated weights for policy 0, policy_version 315761 (0.0029) [2024-04-27 01:08:52,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.5, 300 sec: 50929.2). Total num frames: 5173428224. Throughput: 0: 50700.0. Samples: 2926230440. Policy #0 lag: (min: 1.0, avg: 11.5, max: 24.0) [2024-04-27 01:08:52,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:08:55,181][49728] Signal inference workers to stop experience collection... (43950 times) [2024-04-27 01:08:55,182][49728] Signal inference workers to resume experience collection... (43950 times) [2024-04-27 01:08:55,205][49750] InferenceWorker_p0-w0: stopping experience collection (43950 times) [2024-04-27 01:08:55,205][49750] InferenceWorker_p0-w0: resuming experience collection (43950 times) [2024-04-27 01:08:55,331][49750] Updated weights for policy 0, policy_version 315771 (0.0034) [2024-04-27 01:08:57,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5173690368. Throughput: 0: 50723.2. Samples: 2926539620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:08:57,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-27 01:08:57,872][49750] Updated weights for policy 0, policy_version 315781 (0.0027) [2024-04-27 01:09:01,656][49750] Updated weights for policy 0, policy_version 315791 (0.0030) [2024-04-27 01:09:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 5173936128. Throughput: 0: 50893.9. Samples: 2926846960. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:02,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-27 01:09:04,393][49750] Updated weights for policy 0, policy_version 315801 (0.0028) [2024-04-27 01:09:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5174198272. Throughput: 0: 50714.2. Samples: 2926987660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:07,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 01:09:08,028][49750] Updated weights for policy 0, policy_version 315811 (0.0026) [2024-04-27 01:09:11,058][49750] Updated weights for policy 0, policy_version 315821 (0.0032) [2024-04-27 01:09:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 5174444032. Throughput: 0: 50707.2. Samples: 2927298440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:12,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-27 01:09:14,492][49750] Updated weights for policy 0, policy_version 315831 (0.0031) [2024-04-27 01:09:17,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5174722560. Throughput: 0: 50911.9. Samples: 2927607380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:17,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 01:09:17,607][49750] Updated weights for policy 0, policy_version 315841 (0.0028) [2024-04-27 01:09:20,962][49750] Updated weights for policy 0, policy_version 315851 (0.0030) [2024-04-27 01:09:22,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5174968320. Throughput: 0: 50830.6. Samples: 2927761340. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:22,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 01:09:24,025][49750] Updated weights for policy 0, policy_version 315861 (0.0036) [2024-04-27 01:09:27,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5175197696. Throughput: 0: 50872.1. Samples: 2928063800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 01:09:27,382][49750] Updated weights for policy 0, policy_version 315871 (0.0035) [2024-04-27 01:09:30,314][49750] Updated weights for policy 0, policy_version 315881 (0.0033) [2024-04-27 01:09:32,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 50651.6). Total num frames: 5175443456. Throughput: 0: 50885.1. Samples: 2928372580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:32,063][49517] Avg episode reward: [(0, '0.677')] [2024-04-27 01:09:33,791][49750] Updated weights for policy 0, policy_version 315891 (0.0030) [2024-04-27 01:09:36,661][49750] Updated weights for policy 0, policy_version 315901 (0.0039) [2024-04-27 01:09:37,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 5175721984. Throughput: 0: 50796.2. Samples: 2928516280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:37,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 01:09:40,186][49750] Updated weights for policy 0, policy_version 315911 (0.0031) [2024-04-27 01:09:42,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5175967744. Throughput: 0: 50782.3. Samples: 2928824820. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 01:09:43,058][49750] Updated weights for policy 0, policy_version 315921 (0.0037) [2024-04-27 01:09:46,683][49750] Updated weights for policy 0, policy_version 315931 (0.0037) [2024-04-27 01:09:47,062][49517] Fps is (10 sec: 52430.1, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5176246272. Throughput: 0: 50852.0. Samples: 2929135300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:47,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 01:09:49,627][49750] Updated weights for policy 0, policy_version 315941 (0.0025) [2024-04-27 01:09:52,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 5176459264. Throughput: 0: 50952.4. Samples: 2929280520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:52,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 01:09:53,113][49750] Updated weights for policy 0, policy_version 315951 (0.0027) [2024-04-27 01:09:53,129][49728] Signal inference workers to stop experience collection... (44000 times) [2024-04-27 01:09:53,129][49728] Signal inference workers to resume experience collection... (44000 times) [2024-04-27 01:09:53,154][49750] InferenceWorker_p0-w0: stopping experience collection (44000 times) [2024-04-27 01:09:53,155][49750] InferenceWorker_p0-w0: resuming experience collection (44000 times) [2024-04-27 01:09:56,038][49750] Updated weights for policy 0, policy_version 315961 (0.0031) [2024-04-27 01:09:57,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5176737792. Throughput: 0: 50872.9. Samples: 2929587720. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:09:57,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 01:09:59,507][49750] Updated weights for policy 0, policy_version 315971 (0.0036) [2024-04-27 01:10:02,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5176999936. Throughput: 0: 50762.3. Samples: 2929891680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:10:02,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 01:10:02,370][49750] Updated weights for policy 0, policy_version 315981 (0.0029) [2024-04-27 01:10:05,937][49750] Updated weights for policy 0, policy_version 315991 (0.0025) [2024-04-27 01:10:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5177278464. Throughput: 0: 50892.4. Samples: 2930051500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 01:10:07,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 01:10:08,666][49750] Updated weights for policy 0, policy_version 316001 (0.0026) [2024-04-27 01:10:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5177491456. Throughput: 0: 50898.8. Samples: 2930354240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:12,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:10:12,392][49750] Updated weights for policy 0, policy_version 316011 (0.0038) [2024-04-27 01:10:15,197][49750] Updated weights for policy 0, policy_version 316021 (0.0028) [2024-04-27 01:10:17,063][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5177737216. Throughput: 0: 50764.0. Samples: 2930656960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:17,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:10:18,712][49750] Updated weights for policy 0, policy_version 316031 (0.0031) [2024-04-27 01:10:22,037][49750] Updated weights for policy 0, policy_version 316041 (0.0030) [2024-04-27 01:10:22,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5178015744. Throughput: 0: 50804.7. Samples: 2930802480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:22,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-27 01:10:25,101][49750] Updated weights for policy 0, policy_version 316051 (0.0027) [2024-04-27 01:10:27,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5178277888. Throughput: 0: 50870.3. Samples: 2931113980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 01:10:28,544][49750] Updated weights for policy 0, policy_version 316061 (0.0032) [2024-04-27 01:10:31,631][49750] Updated weights for policy 0, policy_version 316071 (0.0035) [2024-04-27 01:10:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5178523648. Throughput: 0: 50905.3. Samples: 2931426040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:32,063][49517] Avg episode reward: [(0, '0.677')] [2024-04-27 01:10:34,940][49750] Updated weights for policy 0, policy_version 316081 (0.0033) [2024-04-27 01:10:37,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5178753024. Throughput: 0: 50951.6. Samples: 2931573340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 01:10:38,001][49750] Updated weights for policy 0, policy_version 316091 (0.0031) [2024-04-27 01:10:41,366][49750] Updated weights for policy 0, policy_version 316101 (0.0031) [2024-04-27 01:10:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5179031552. Throughput: 0: 50927.4. Samples: 2931879460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 01:10:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000316103_5179031552.pth... [2024-04-27 01:10:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000315359_5166841856.pth [2024-04-27 01:10:44,508][49750] Updated weights for policy 0, policy_version 316111 (0.0029) [2024-04-27 01:10:47,062][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 5179293696. Throughput: 0: 50892.2. Samples: 2932181820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:47,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-27 01:10:47,913][49750] Updated weights for policy 0, policy_version 316121 (0.0029) [2024-04-27 01:10:51,022][49750] Updated weights for policy 0, policy_version 316131 (0.0030) [2024-04-27 01:10:51,028][49728] Signal inference workers to stop experience collection... (44050 times) [2024-04-27 01:10:51,029][49728] Signal inference workers to resume experience collection... (44050 times) [2024-04-27 01:10:51,055][49750] InferenceWorker_p0-w0: stopping experience collection (44050 times) [2024-04-27 01:10:51,055][49750] InferenceWorker_p0-w0: resuming experience collection (44050 times) [2024-04-27 01:10:52,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 5179555840. Throughput: 0: 50963.6. Samples: 2932344860. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:52,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 01:10:54,239][49750] Updated weights for policy 0, policy_version 316141 (0.0032) [2024-04-27 01:10:57,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5179785216. Throughput: 0: 50926.5. Samples: 2932645940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:10:57,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-27 01:10:57,391][49750] Updated weights for policy 0, policy_version 316151 (0.0032) [2024-04-27 01:11:00,610][49750] Updated weights for policy 0, policy_version 316161 (0.0027) [2024-04-27 01:11:02,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5180030976. Throughput: 0: 50964.4. Samples: 2932950360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:11:02,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:11:03,701][49750] Updated weights for policy 0, policy_version 316171 (0.0035) [2024-04-27 01:11:07,037][49750] Updated weights for policy 0, policy_version 316181 (0.0037) [2024-04-27 01:11:07,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5180309504. Throughput: 0: 51096.9. Samples: 2933101840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:11:07,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 01:11:10,062][49750] Updated weights for policy 0, policy_version 316191 (0.0026) [2024-04-27 01:11:12,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5180571648. Throughput: 0: 50895.0. Samples: 2933404260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:11:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 01:11:13,498][49750] Updated weights for policy 0, policy_version 316201 (0.0036) [2024-04-27 01:11:16,563][49750] Updated weights for policy 0, policy_version 316211 (0.0038) [2024-04-27 01:11:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5180817408. Throughput: 0: 50886.7. Samples: 2933715940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:11:17,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-27 01:11:19,788][49750] Updated weights for policy 0, policy_version 316221 (0.0027) [2024-04-27 01:11:22,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5181063168. Throughput: 0: 51088.9. Samples: 2933872340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:11:22,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 01:11:22,842][49750] Updated weights for policy 0, policy_version 316231 (0.0031) [2024-04-27 01:11:26,288][49750] Updated weights for policy 0, policy_version 316241 (0.0030) [2024-04-27 01:11:27,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5181325312. Throughput: 0: 50963.5. Samples: 2934172820. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:11:27,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 01:11:29,277][49750] Updated weights for policy 0, policy_version 316251 (0.0035) [2024-04-27 01:11:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5181587456. Throughput: 0: 51003.3. Samples: 2934476980. 
Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:11:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 01:11:32,670][49750] Updated weights for policy 0, policy_version 316261 (0.0028) [2024-04-27 01:11:35,689][49750] Updated weights for policy 0, policy_version 316271 (0.0030) [2024-04-27 01:11:37,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 5181849600. Throughput: 0: 51062.6. Samples: 2934642680. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:11:37,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 01:11:39,087][49750] Updated weights for policy 0, policy_version 316281 (0.0032) [2024-04-27 01:11:42,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 5182095360. Throughput: 0: 51187.9. Samples: 2934949400. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:11:42,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 01:11:42,112][49750] Updated weights for policy 0, policy_version 316291 (0.0032) [2024-04-27 01:11:45,413][49750] Updated weights for policy 0, policy_version 316301 (0.0024) [2024-04-27 01:11:47,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 5182324736. Throughput: 0: 51128.7. Samples: 2935251160. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:11:47,063][49517] Avg episode reward: [(0, '0.523')] [2024-04-27 01:11:48,468][49750] Updated weights for policy 0, policy_version 316311 (0.0033) [2024-04-27 01:11:51,855][49750] Updated weights for policy 0, policy_version 316321 (0.0027) [2024-04-27 01:11:52,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 5182603264. Throughput: 0: 51029.8. Samples: 2935398180. 
Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:11:52,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 01:11:55,188][49750] Updated weights for policy 0, policy_version 316331 (0.0031) [2024-04-27 01:11:56,190][49728] Signal inference workers to stop experience collection... (44100 times) [2024-04-27 01:11:56,190][49728] Signal inference workers to resume experience collection... (44100 times) [2024-04-27 01:11:56,221][49750] InferenceWorker_p0-w0: stopping experience collection (44100 times) [2024-04-27 01:11:56,221][49750] InferenceWorker_p0-w0: resuming experience collection (44100 times) [2024-04-27 01:11:57,063][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5182865408. Throughput: 0: 51126.2. Samples: 2935704940. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:11:57,063][49517] Avg episode reward: [(0, '0.711')] [2024-04-27 01:11:58,232][49750] Updated weights for policy 0, policy_version 316341 (0.0039) [2024-04-27 01:12:01,573][49750] Updated weights for policy 0, policy_version 316351 (0.0035) [2024-04-27 01:12:02,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 5183127552. Throughput: 0: 51030.1. Samples: 2936012300. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:12:02,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 01:12:04,652][49750] Updated weights for policy 0, policy_version 316361 (0.0029) [2024-04-27 01:12:07,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5183356928. Throughput: 0: 50998.2. Samples: 2936167260. 
Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:12:07,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-27 01:12:07,887][49750] Updated weights for policy 0, policy_version 316371 (0.0029) [2024-04-27 01:12:11,042][49750] Updated weights for policy 0, policy_version 316381 (0.0032) [2024-04-27 01:12:12,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5183619072. Throughput: 0: 51128.0. Samples: 2936473580. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:12:12,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 01:12:14,299][49750] Updated weights for policy 0, policy_version 316391 (0.0030) [2024-04-27 01:12:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 5183864832. Throughput: 0: 51164.9. Samples: 2936779400. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:12:17,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 01:12:17,434][49750] Updated weights for policy 0, policy_version 316401 (0.0030) [2024-04-27 01:12:20,769][49750] Updated weights for policy 0, policy_version 316411 (0.0035) [2024-04-27 01:12:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 5184143360. Throughput: 0: 50864.0. Samples: 2936931560. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:12:22,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-27 01:12:24,080][49750] Updated weights for policy 0, policy_version 316421 (0.0029) [2024-04-27 01:12:27,063][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5184389120. Throughput: 0: 50701.4. Samples: 2937230960. 
Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:12:27,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-27 01:12:27,266][49750] Updated weights for policy 0, policy_version 316431 (0.0029) [2024-04-27 01:12:30,451][49750] Updated weights for policy 0, policy_version 316441 (0.0033) [2024-04-27 01:12:32,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5184618496. Throughput: 0: 50861.4. Samples: 2937539920. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:12:32,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 01:12:33,653][49750] Updated weights for policy 0, policy_version 316451 (0.0032) [2024-04-27 01:12:36,849][49750] Updated weights for policy 0, policy_version 316461 (0.0028) [2024-04-27 01:12:37,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 5184897024. Throughput: 0: 50731.1. Samples: 2937681080. Policy #0 lag: (min: 2.0, avg: 11.7, max: 20.0) [2024-04-27 01:12:37,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 01:12:40,070][49750] Updated weights for policy 0, policy_version 316471 (0.0030) [2024-04-27 01:12:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5185142784. Throughput: 0: 50777.4. Samples: 2937989920. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:12:42,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:12:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000316476_5185142784.pth... [2024-04-27 01:12:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000315730_5172920320.pth [2024-04-27 01:12:43,242][49750] Updated weights for policy 0, policy_version 316481 (0.0029) [2024-04-27 01:12:46,483][49750] Updated weights for policy 0, policy_version 316491 (0.0026) [2024-04-27 01:12:47,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51336.6, 300 sec: 50873.7). 
Total num frames: 5185404928. Throughput: 0: 50680.0. Samples: 2938292900. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:12:47,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 01:12:49,748][49750] Updated weights for policy 0, policy_version 316501 (0.0034) [2024-04-27 01:12:52,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5185634304. Throughput: 0: 50716.0. Samples: 2938449480. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:12:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 01:12:52,902][49750] Updated weights for policy 0, policy_version 316511 (0.0033) [2024-04-27 01:12:56,249][49750] Updated weights for policy 0, policy_version 316521 (0.0031) [2024-04-27 01:12:57,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5185896448. Throughput: 0: 50791.1. Samples: 2938759180. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:12:57,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 01:12:59,425][49750] Updated weights for policy 0, policy_version 316531 (0.0035) [2024-04-27 01:13:02,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.2, 300 sec: 50818.1). Total num frames: 5186142208. Throughput: 0: 50756.1. Samples: 2939063420. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:02,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-27 01:13:02,647][49750] Updated weights for policy 0, policy_version 316541 (0.0031) [2024-04-27 01:13:04,801][49728] Signal inference workers to stop experience collection... (44150 times) [2024-04-27 01:13:04,835][49750] InferenceWorker_p0-w0: stopping experience collection (44150 times) [2024-04-27 01:13:04,870][49728] Signal inference workers to resume experience collection... 
(44150 times) [2024-04-27 01:13:04,870][49750] InferenceWorker_p0-w0: resuming experience collection (44150 times) [2024-04-27 01:13:05,662][49750] Updated weights for policy 0, policy_version 316551 (0.0032) [2024-04-27 01:13:07,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5186420736. Throughput: 0: 50932.4. Samples: 2939223520. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:07,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 01:13:09,052][49750] Updated weights for policy 0, policy_version 316561 (0.0029) [2024-04-27 01:13:11,983][49750] Updated weights for policy 0, policy_version 316571 (0.0031) [2024-04-27 01:13:12,062][49517] Fps is (10 sec: 55706.0, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5186699264. Throughput: 0: 50894.7. Samples: 2939521220. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:12,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 01:13:15,582][49750] Updated weights for policy 0, policy_version 316581 (0.0030) [2024-04-27 01:13:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 5186912256. Throughput: 0: 50777.0. Samples: 2939824880. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:17,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-27 01:13:18,539][49750] Updated weights for policy 0, policy_version 316591 (0.0029) [2024-04-27 01:13:22,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5187174400. Throughput: 0: 50973.1. Samples: 2939974880. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:22,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-27 01:13:22,156][49750] Updated weights for policy 0, policy_version 316601 (0.0029) [2024-04-27 01:13:24,887][49750] Updated weights for policy 0, policy_version 316611 (0.0025) [2024-04-27 01:13:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5187420160. Throughput: 0: 50830.3. Samples: 2940277280. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:27,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-27 01:13:28,608][49750] Updated weights for policy 0, policy_version 316621 (0.0030) [2024-04-27 01:13:31,373][49750] Updated weights for policy 0, policy_version 316631 (0.0028) [2024-04-27 01:13:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5187698688. Throughput: 0: 50796.9. Samples: 2940578760. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 01:13:35,011][49750] Updated weights for policy 0, policy_version 316641 (0.0036) [2024-04-27 01:13:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5187944448. Throughput: 0: 50921.3. Samples: 2940740940. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 01:13:37,855][49750] Updated weights for policy 0, policy_version 316651 (0.0031) [2024-04-27 01:13:41,284][49750] Updated weights for policy 0, policy_version 316661 (0.0026) [2024-04-27 01:13:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5188190208. Throughput: 0: 50890.3. Samples: 2941049240. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:42,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 01:13:44,209][49750] Updated weights for policy 0, policy_version 316671 (0.0029) [2024-04-27 01:13:47,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5188435968. Throughput: 0: 50807.7. Samples: 2941349760. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:47,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 01:13:47,716][49750] Updated weights for policy 0, policy_version 316681 (0.0031) [2024-04-27 01:13:50,606][49750] Updated weights for policy 0, policy_version 316691 (0.0035) [2024-04-27 01:13:52,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5188698112. Throughput: 0: 50770.6. Samples: 2941508200. Policy #0 lag: (min: 1.0, avg: 8.8, max: 22.0) [2024-04-27 01:13:52,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 01:13:54,092][49750] Updated weights for policy 0, policy_version 316701 (0.0041) [2024-04-27 01:13:56,447][49728] Signal inference workers to stop experience collection... (44200 times) [2024-04-27 01:13:56,448][49728] Signal inference workers to resume experience collection... (44200 times) [2024-04-27 01:13:56,463][49750] InferenceWorker_p0-w0: stopping experience collection (44200 times) [2024-04-27 01:13:56,463][49750] InferenceWorker_p0-w0: resuming experience collection (44200 times) [2024-04-27 01:13:57,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5188976640. Throughput: 0: 50885.9. Samples: 2941811080. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:13:57,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-27 01:13:57,089][49750] Updated weights for policy 0, policy_version 316711 (0.0026) [2024-04-27 01:14:00,496][49750] Updated weights for policy 0, policy_version 316721 (0.0040) [2024-04-27 01:14:02,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5189189632. Throughput: 0: 50922.7. Samples: 2942116400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:02,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 01:14:03,604][49750] Updated weights for policy 0, policy_version 316731 (0.0032) [2024-04-27 01:14:06,975][49750] Updated weights for policy 0, policy_version 316741 (0.0028) [2024-04-27 01:14:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5189484544. Throughput: 0: 50873.5. Samples: 2942264180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 01:14:09,955][49750] Updated weights for policy 0, policy_version 316751 (0.0030) [2024-04-27 01:14:12,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 5189713920. Throughput: 0: 50840.3. Samples: 2942565100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:12,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 01:14:13,487][49750] Updated weights for policy 0, policy_version 316761 (0.0033) [2024-04-27 01:14:16,312][49750] Updated weights for policy 0, policy_version 316771 (0.0030) [2024-04-27 01:14:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5189992448. Throughput: 0: 50842.7. Samples: 2942866680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:17,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-27 01:14:20,162][49750] Updated weights for policy 0, policy_version 316781 (0.0033) [2024-04-27 01:14:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5190238208. Throughput: 0: 50866.2. Samples: 2943029920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:22,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 01:14:22,940][49750] Updated weights for policy 0, policy_version 316791 (0.0028) [2024-04-27 01:14:26,437][49750] Updated weights for policy 0, policy_version 316801 (0.0033) [2024-04-27 01:14:27,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5190467584. Throughput: 0: 50851.9. Samples: 2943337580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:27,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-27 01:14:29,433][49750] Updated weights for policy 0, policy_version 316811 (0.0032) [2024-04-27 01:14:32,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5190713344. Throughput: 0: 51007.6. Samples: 2943645100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:32,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 01:14:32,845][49750] Updated weights for policy 0, policy_version 316821 (0.0028) [2024-04-27 01:14:35,685][49750] Updated weights for policy 0, policy_version 316831 (0.0031) [2024-04-27 01:14:37,063][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5190991872. Throughput: 0: 50797.0. Samples: 2943794060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:37,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-27 01:14:39,455][49750] Updated weights for policy 0, policy_version 316841 (0.0038) [2024-04-27 01:14:42,062][49517] Fps is (10 sec: 55705.2, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5191270400. Throughput: 0: 50868.3. Samples: 2944100160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:42,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-27 01:14:42,135][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000316851_5191286784.pth... [2024-04-27 01:14:42,139][49750] Updated weights for policy 0, policy_version 316851 (0.0035) [2024-04-27 01:14:42,179][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000316103_5179031552.pth [2024-04-27 01:14:45,784][49750] Updated weights for policy 0, policy_version 316861 (0.0025) [2024-04-27 01:14:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 5191499776. Throughput: 0: 50778.1. Samples: 2944401420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:47,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-27 01:14:48,751][49750] Updated weights for policy 0, policy_version 316871 (0.0034) [2024-04-27 01:14:52,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 5191745536. Throughput: 0: 50782.7. Samples: 2944549400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:52,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-27 01:14:52,371][49750] Updated weights for policy 0, policy_version 316881 (0.0035) [2024-04-27 01:14:55,326][49750] Updated weights for policy 0, policy_version 316891 (0.0035) [2024-04-27 01:14:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5191991296. Throughput: 0: 50804.6. Samples: 2944851300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:14:57,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 01:14:59,003][49750] Updated weights for policy 0, policy_version 316901 (0.0039) [2024-04-27 01:15:01,606][49750] Updated weights for policy 0, policy_version 316911 (0.0033) [2024-04-27 01:15:02,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.4, 300 sec: 50818.1). Total num frames: 5192269824. Throughput: 0: 50898.6. Samples: 2945157120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:15:02,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 01:15:05,332][49750] Updated weights for policy 0, policy_version 316921 (0.0033) [2024-04-27 01:15:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 5192531968. Throughput: 0: 50790.0. Samples: 2945315460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:15:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 01:15:07,916][49750] Updated weights for policy 0, policy_version 316931 (0.0033) [2024-04-27 01:15:10,999][49728] Signal inference workers to stop experience collection... (44250 times) [2024-04-27 01:15:11,000][49728] Signal inference workers to resume experience collection... (44250 times) [2024-04-27 01:15:11,012][49750] InferenceWorker_p0-w0: stopping experience collection (44250 times) [2024-04-27 01:15:11,017][49750] InferenceWorker_p0-w0: resuming experience collection (44250 times) [2024-04-27 01:15:11,665][49750] Updated weights for policy 0, policy_version 316941 (0.0029) [2024-04-27 01:15:12,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5192777728. Throughput: 0: 50903.9. Samples: 2945628260. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 01:15:14,453][49750] Updated weights for policy 0, policy_version 316951 (0.0035) [2024-04-27 01:15:17,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5193007104. Throughput: 0: 50837.7. Samples: 2945932800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:17,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 01:15:18,172][49750] Updated weights for policy 0, policy_version 316961 (0.0032) [2024-04-27 01:15:21,027][49750] Updated weights for policy 0, policy_version 316971 (0.0033) [2024-04-27 01:15:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5193269248. Throughput: 0: 50742.8. Samples: 2946077480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:22,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 01:15:24,644][49750] Updated weights for policy 0, policy_version 316981 (0.0030) [2024-04-27 01:15:27,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 5193564160. Throughput: 0: 50787.1. Samples: 2946385580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:27,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:15:27,365][49750] Updated weights for policy 0, policy_version 316991 (0.0038) [2024-04-27 01:15:31,015][49750] Updated weights for policy 0, policy_version 317001 (0.0032) [2024-04-27 01:15:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5193777152. Throughput: 0: 50958.3. Samples: 2946694540. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:32,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-27 01:15:33,670][49750] Updated weights for policy 0, policy_version 317011 (0.0034) [2024-04-27 01:15:37,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5194039296. Throughput: 0: 50884.2. Samples: 2946839200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:37,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-27 01:15:37,336][49750] Updated weights for policy 0, policy_version 317021 (0.0031) [2024-04-27 01:15:40,247][49750] Updated weights for policy 0, policy_version 317031 (0.0031) [2024-04-27 01:15:42,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50818.1). Total num frames: 5194285056. Throughput: 0: 50883.0. Samples: 2947141040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:42,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-27 01:15:43,882][49750] Updated weights for policy 0, policy_version 317041 (0.0030) [2024-04-27 01:15:46,769][49750] Updated weights for policy 0, policy_version 317051 (0.0033) [2024-04-27 01:15:47,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5194563584. Throughput: 0: 50928.0. Samples: 2947448880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:47,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 01:15:50,411][49750] Updated weights for policy 0, policy_version 317061 (0.0029) [2024-04-27 01:15:52,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 5194809344. Throughput: 0: 50878.0. Samples: 2947604980. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:52,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-27 01:15:53,163][49750] Updated weights for policy 0, policy_version 317071 (0.0027) [2024-04-27 01:15:56,731][49750] Updated weights for policy 0, policy_version 317081 (0.0028) [2024-04-27 01:15:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5195055104. Throughput: 0: 50766.4. Samples: 2947912740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:15:57,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 01:15:59,442][49750] Updated weights for policy 0, policy_version 317091 (0.0030) [2024-04-27 01:16:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5195317248. Throughput: 0: 50893.5. Samples: 2948223020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:16:02,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-27 01:16:02,987][49750] Updated weights for policy 0, policy_version 317101 (0.0033) [2024-04-27 01:16:05,839][49750] Updated weights for policy 0, policy_version 317111 (0.0029) [2024-04-27 01:16:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5195563008. Throughput: 0: 50874.1. Samples: 2948366820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:16:07,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 01:16:09,893][49750] Updated weights for policy 0, policy_version 317121 (0.0037) [2024-04-27 01:16:12,063][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5195841536. Throughput: 0: 50777.2. Samples: 2948670560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:16:12,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-27 01:16:12,371][49750] Updated weights for policy 0, policy_version 317131 (0.0031) [2024-04-27 01:16:16,282][49750] Updated weights for policy 0, policy_version 317141 (0.0035) [2024-04-27 01:16:17,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5196087296. Throughput: 0: 50863.4. Samples: 2948983400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:16:17,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 01:16:17,271][49728] Signal inference workers to stop experience collection... (44300 times) [2024-04-27 01:16:17,271][49728] Signal inference workers to resume experience collection... (44300 times) [2024-04-27 01:16:17,296][49750] InferenceWorker_p0-w0: stopping experience collection (44300 times) [2024-04-27 01:16:17,296][49750] InferenceWorker_p0-w0: resuming experience collection (44300 times) [2024-04-27 01:16:18,704][49750] Updated weights for policy 0, policy_version 317151 (0.0032) [2024-04-27 01:16:22,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5196316672. Throughput: 0: 50959.8. Samples: 2949132380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 01:16:22,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 01:16:22,610][49750] Updated weights for policy 0, policy_version 317161 (0.0030) [2024-04-27 01:16:25,158][49750] Updated weights for policy 0, policy_version 317171 (0.0030) [2024-04-27 01:16:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5196578816. Throughput: 0: 50934.2. Samples: 2949433080. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:16:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 01:16:29,056][49750] Updated weights for policy 0, policy_version 317181 (0.0036) [2024-04-27 01:16:31,650][49750] Updated weights for policy 0, policy_version 317191 (0.0035) [2024-04-27 01:16:32,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5196857344. Throughput: 0: 50967.6. Samples: 2949742420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:16:32,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 01:16:35,489][49750] Updated weights for policy 0, policy_version 317201 (0.0032) [2024-04-27 01:16:37,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5197103104. Throughput: 0: 51060.0. Samples: 2949902680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:16:37,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 01:16:38,059][49750] Updated weights for policy 0, policy_version 317211 (0.0036) [2024-04-27 01:16:41,846][49750] Updated weights for policy 0, policy_version 317221 (0.0027) [2024-04-27 01:16:42,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5197348864. Throughput: 0: 50994.2. Samples: 2950207480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:16:42,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 01:16:42,100][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000317222_5197365248.pth... [2024-04-27 01:16:42,145][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000316476_5185142784.pth [2024-04-27 01:16:44,350][49750] Updated weights for policy 0, policy_version 317231 (0.0032) [2024-04-27 01:16:47,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5197611008. Throughput: 0: 50867.4. Samples: 2950512040. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:16:47,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-27 01:16:48,221][49750] Updated weights for policy 0, policy_version 317241 (0.0029) [2024-04-27 01:16:50,970][49750] Updated weights for policy 0, policy_version 317251 (0.0032) [2024-04-27 01:16:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5197856768. Throughput: 0: 51111.6. Samples: 2950666840. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:16:52,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 01:16:54,614][49750] Updated weights for policy 0, policy_version 317261 (0.0036) [2024-04-27 01:16:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5198135296. Throughput: 0: 51053.1. Samples: 2950967940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:16:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 01:16:57,692][49750] Updated weights for policy 0, policy_version 317271 (0.0037) [2024-04-27 01:17:01,115][49750] Updated weights for policy 0, policy_version 317281 (0.0036) [2024-04-27 01:17:02,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5198364672. Throughput: 0: 50808.9. Samples: 2951269800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:17:02,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-27 01:17:03,970][49750] Updated weights for policy 0, policy_version 317291 (0.0031) [2024-04-27 01:17:07,063][49517] Fps is (10 sec: 49151.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5198626816. Throughput: 0: 50943.8. Samples: 2951424860. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:17:07,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 01:17:07,469][49750] Updated weights for policy 0, policy_version 317301 (0.0034) [2024-04-27 01:17:10,509][49750] Updated weights for policy 0, policy_version 317311 (0.0036) [2024-04-27 01:17:12,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5198872576. Throughput: 0: 50987.9. Samples: 2951727540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:17:12,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 01:17:13,902][49750] Updated weights for policy 0, policy_version 317321 (0.0031) [2024-04-27 01:17:16,908][49750] Updated weights for policy 0, policy_version 317331 (0.0036) [2024-04-27 01:17:17,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5199151104. Throughput: 0: 50816.8. Samples: 2952029180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:17:17,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 01:17:20,289][49750] Updated weights for policy 0, policy_version 317341 (0.0028) [2024-04-27 01:17:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.3, 300 sec: 50873.7). Total num frames: 5199396864. Throughput: 0: 50842.2. Samples: 2952190580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:17:22,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-27 01:17:23,196][49750] Updated weights for policy 0, policy_version 317351 (0.0034) [2024-04-27 01:17:26,775][49750] Updated weights for policy 0, policy_version 317361 (0.0033) [2024-04-27 01:17:27,062][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5199642624. Throughput: 0: 50832.8. Samples: 2952494960. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:17:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 01:17:29,540][49750] Updated weights for policy 0, policy_version 317371 (0.0035) [2024-04-27 01:17:32,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5199888384. Throughput: 0: 50872.5. Samples: 2952801300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:17:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 01:17:33,312][49750] Updated weights for policy 0, policy_version 317381 (0.0032) [2024-04-27 01:17:35,744][49728] Signal inference workers to stop experience collection... (44350 times) [2024-04-27 01:17:35,745][49728] Signal inference workers to resume experience collection... (44350 times) [2024-04-27 01:17:35,762][49750] InferenceWorker_p0-w0: stopping experience collection (44350 times) [2024-04-27 01:17:35,763][49750] InferenceWorker_p0-w0: resuming experience collection (44350 times) [2024-04-27 01:17:36,015][49750] Updated weights for policy 0, policy_version 317391 (0.0029) [2024-04-27 01:17:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5200150528. Throughput: 0: 50855.1. Samples: 2952955320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-04-27 01:17:37,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 01:17:39,627][49750] Updated weights for policy 0, policy_version 317401 (0.0031) [2024-04-27 01:17:42,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5200396288. Throughput: 0: 51015.2. Samples: 2953263620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:17:42,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 01:17:42,509][49750] Updated weights for policy 0, policy_version 317411 (0.0034) [2024-04-27 01:17:45,974][49750] Updated weights for policy 0, policy_version 317421 (0.0033) [2024-04-27 01:17:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 5200658432. Throughput: 0: 50927.3. Samples: 2953561520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:17:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 01:17:49,049][49750] Updated weights for policy 0, policy_version 317431 (0.0032) [2024-04-27 01:17:52,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5200920576. Throughput: 0: 50890.4. Samples: 2953714920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:17:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 01:17:52,425][49750] Updated weights for policy 0, policy_version 317441 (0.0031) [2024-04-27 01:17:55,510][49750] Updated weights for policy 0, policy_version 317451 (0.0030) [2024-04-27 01:17:57,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.3, 300 sec: 50929.3). Total num frames: 5201166336. Throughput: 0: 50916.7. Samples: 2954018780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:17:57,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-27 01:17:58,894][49750] Updated weights for policy 0, policy_version 317461 (0.0030) [2024-04-27 01:18:01,913][49750] Updated weights for policy 0, policy_version 317471 (0.0034) [2024-04-27 01:18:02,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 5201444864. Throughput: 0: 51084.1. Samples: 2954327960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:02,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 01:18:05,225][49750] Updated weights for policy 0, policy_version 317481 (0.0033) [2024-04-27 01:18:07,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 5201690624. Throughput: 0: 51015.9. Samples: 2954486280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 01:18:08,272][49750] Updated weights for policy 0, policy_version 317491 (0.0038) [2024-04-27 01:18:11,632][49750] Updated weights for policy 0, policy_version 317501 (0.0033) [2024-04-27 01:18:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 5201936384. Throughput: 0: 51093.9. Samples: 2954794180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:12,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 01:18:14,854][49750] Updated weights for policy 0, policy_version 317511 (0.0028) [2024-04-27 01:18:17,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.6, 300 sec: 50929.3). Total num frames: 5202198528. Throughput: 0: 51160.0. Samples: 2955103500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 01:18:18,133][49750] Updated weights for policy 0, policy_version 317521 (0.0032) [2024-04-27 01:18:21,133][49750] Updated weights for policy 0, policy_version 317531 (0.0031) [2024-04-27 01:18:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.6, 300 sec: 50929.3). Total num frames: 5202444288. Throughput: 0: 51127.6. Samples: 2955256060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:22,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 01:18:24,519][49750] Updated weights for policy 0, policy_version 317541 (0.0035) [2024-04-27 01:18:27,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5202722816. Throughput: 0: 51130.1. Samples: 2955564480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:27,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 01:18:27,732][49750] Updated weights for policy 0, policy_version 317551 (0.0030) [2024-04-27 01:18:30,983][49750] Updated weights for policy 0, policy_version 317561 (0.0028) [2024-04-27 01:18:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5202952192. Throughput: 0: 51343.5. Samples: 2955871980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:32,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 01:18:34,182][49750] Updated weights for policy 0, policy_version 317571 (0.0034) [2024-04-27 01:18:37,063][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5203214336. Throughput: 0: 51270.1. Samples: 2956022080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 01:18:37,397][49750] Updated weights for policy 0, policy_version 317581 (0.0035) [2024-04-27 01:18:40,477][49750] Updated weights for policy 0, policy_version 317591 (0.0027) [2024-04-27 01:18:42,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51609.4, 300 sec: 51040.3). Total num frames: 5203492864. Throughput: 0: 51259.8. Samples: 2956325480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:42,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-27 01:18:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000317596_5203492864.pth... 
[2024-04-27 01:18:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000316851_5191286784.pth [2024-04-27 01:18:43,787][49750] Updated weights for policy 0, policy_version 317601 (0.0029) [2024-04-27 01:18:46,851][49750] Updated weights for policy 0, policy_version 317611 (0.0032) [2024-04-27 01:18:47,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5203738624. Throughput: 0: 51199.6. Samples: 2956631940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:47,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-27 01:18:50,166][49750] Updated weights for policy 0, policy_version 317621 (0.0029) [2024-04-27 01:18:51,405][49728] Signal inference workers to stop experience collection... (44400 times) [2024-04-27 01:18:51,436][49750] InferenceWorker_p0-w0: stopping experience collection (44400 times) [2024-04-27 01:18:51,514][49728] Signal inference workers to resume experience collection... (44400 times) [2024-04-27 01:18:51,514][49750] InferenceWorker_p0-w0: resuming experience collection (44400 times) [2024-04-27 01:18:52,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5204000768. Throughput: 0: 51167.3. Samples: 2956788820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 01:18:52,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 01:18:53,329][49750] Updated weights for policy 0, policy_version 317631 (0.0028) [2024-04-27 01:18:56,551][49750] Updated weights for policy 0, policy_version 317641 (0.0036) [2024-04-27 01:18:57,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51336.5, 300 sec: 51040.3). Total num frames: 5204246528. Throughput: 0: 51232.8. Samples: 2957099660. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:18:57,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-27 01:18:59,944][49750] Updated weights for policy 0, policy_version 317651 (0.0033) [2024-04-27 01:19:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5204492288. Throughput: 0: 51101.7. Samples: 2957403080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:02,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 01:19:03,029][49750] Updated weights for policy 0, policy_version 317661 (0.0029) [2024-04-27 01:19:06,344][49750] Updated weights for policy 0, policy_version 317671 (0.0029) [2024-04-27 01:19:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 5204754432. Throughput: 0: 50999.5. Samples: 2957551040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:07,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-27 01:19:09,434][49750] Updated weights for policy 0, policy_version 317681 (0.0029) [2024-04-27 01:19:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5205016576. Throughput: 0: 51071.9. Samples: 2957862720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:12,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 01:19:12,764][49750] Updated weights for policy 0, policy_version 317691 (0.0028) [2024-04-27 01:19:15,930][49750] Updated weights for policy 0, policy_version 317701 (0.0031) [2024-04-27 01:19:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5205278720. Throughput: 0: 51117.8. Samples: 2958172280. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:17,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 01:19:19,056][49750] Updated weights for policy 0, policy_version 317711 (0.0029) [2024-04-27 01:19:22,062][49517] Fps is (10 sec: 49153.3, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5205508096. Throughput: 0: 51182.1. Samples: 2958325260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:22,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 01:19:22,278][49750] Updated weights for policy 0, policy_version 317721 (0.0028) [2024-04-27 01:19:25,519][49750] Updated weights for policy 0, policy_version 317731 (0.0026) [2024-04-27 01:19:27,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 51040.3). Total num frames: 5205770240. Throughput: 0: 51184.3. Samples: 2958628760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:27,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 01:19:28,726][49750] Updated weights for policy 0, policy_version 317741 (0.0029) [2024-04-27 01:19:31,938][49750] Updated weights for policy 0, policy_version 317751 (0.0035) [2024-04-27 01:19:32,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5206032384. Throughput: 0: 51135.9. Samples: 2958933060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:32,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-27 01:19:35,171][49750] Updated weights for policy 0, policy_version 317761 (0.0029) [2024-04-27 01:19:37,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 5206294528. Throughput: 0: 50987.3. Samples: 2959083240. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:37,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:19:38,383][49750] Updated weights for policy 0, policy_version 317771 (0.0039) [2024-04-27 01:19:41,628][49750] Updated weights for policy 0, policy_version 317781 (0.0031) [2024-04-27 01:19:42,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 5206540288. Throughput: 0: 50892.4. Samples: 2959389820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 01:19:44,800][49750] Updated weights for policy 0, policy_version 317791 (0.0034) [2024-04-27 01:19:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 5206786048. Throughput: 0: 50900.1. Samples: 2959693580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:47,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 01:19:48,136][49750] Updated weights for policy 0, policy_version 317801 (0.0034) [2024-04-27 01:19:51,247][49750] Updated weights for policy 0, policy_version 317811 (0.0032) [2024-04-27 01:19:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.5, 300 sec: 51040.3). Total num frames: 5207048192. Throughput: 0: 50797.8. Samples: 2959836940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:52,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 01:19:52,904][49728] Signal inference workers to stop experience collection... (44450 times) [2024-04-27 01:19:52,909][49728] Signal inference workers to resume experience collection... 
(44450 times) [2024-04-27 01:19:52,938][49750] InferenceWorker_p0-w0: stopping experience collection (44450 times) [2024-04-27 01:19:52,939][49750] InferenceWorker_p0-w0: resuming experience collection (44450 times) [2024-04-27 01:19:54,595][49750] Updated weights for policy 0, policy_version 317821 (0.0028) [2024-04-27 01:19:57,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5207293952. Throughput: 0: 50629.4. Samples: 2960141040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:19:57,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 01:19:57,694][49750] Updated weights for policy 0, policy_version 317831 (0.0035) [2024-04-27 01:20:01,160][49750] Updated weights for policy 0, policy_version 317841 (0.0029) [2024-04-27 01:20:02,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51609.5, 300 sec: 51040.3). Total num frames: 5207588864. Throughput: 0: 50702.5. Samples: 2960453900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:20:02,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 01:20:04,133][49750] Updated weights for policy 0, policy_version 317851 (0.0030) [2024-04-27 01:20:07,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 5207801856. Throughput: 0: 50730.9. Samples: 2960608160. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 01:20:07,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 01:20:07,645][49750] Updated weights for policy 0, policy_version 317861 (0.0032) [2024-04-27 01:20:10,562][49750] Updated weights for policy 0, policy_version 317871 (0.0030) [2024-04-27 01:20:12,062][49517] Fps is (10 sec: 45876.3, 60 sec: 50517.5, 300 sec: 50984.8). Total num frames: 5208047616. Throughput: 0: 50705.8. Samples: 2960910520. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:12,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 01:20:14,074][49750] Updated weights for policy 0, policy_version 317881 (0.0037) [2024-04-27 01:20:17,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50517.4, 300 sec: 50984.8). Total num frames: 5208309760. Throughput: 0: 50613.0. Samples: 2961210640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:17,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 01:20:17,208][49750] Updated weights for policy 0, policy_version 317891 (0.0035) [2024-04-27 01:20:20,433][49750] Updated weights for policy 0, policy_version 317901 (0.0040) [2024-04-27 01:20:22,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5208571904. Throughput: 0: 50666.5. Samples: 2961363240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:22,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 01:20:23,520][49750] Updated weights for policy 0, policy_version 317911 (0.0037) [2024-04-27 01:20:26,981][49750] Updated weights for policy 0, policy_version 317921 (0.0030) [2024-04-27 01:20:27,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 5208817664. Throughput: 0: 50565.0. Samples: 2961665240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:27,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 01:20:30,183][49750] Updated weights for policy 0, policy_version 317931 (0.0033) [2024-04-27 01:20:32,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 5209079808. Throughput: 0: 50708.8. Samples: 2961975480. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:32,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-27 01:20:33,463][49750] Updated weights for policy 0, policy_version 317941 (0.0029) [2024-04-27 01:20:36,591][49750] Updated weights for policy 0, policy_version 317951 (0.0036) [2024-04-27 01:20:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50984.8). Total num frames: 5209325568. Throughput: 0: 50726.3. Samples: 2962119620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:37,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-27 01:20:39,818][49750] Updated weights for policy 0, policy_version 317961 (0.0028) [2024-04-27 01:20:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5209571328. Throughput: 0: 50657.7. Samples: 2962420640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:42,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-27 01:20:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000317967_5209571328.pth... [2024-04-27 01:20:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000317222_5197365248.pth [2024-04-27 01:20:42,830][49750] Updated weights for policy 0, policy_version 317971 (0.0027) [2024-04-27 01:20:44,654][49728] Signal inference workers to stop experience collection... (44500 times) [2024-04-27 01:20:44,654][49728] Signal inference workers to resume experience collection... (44500 times) [2024-04-27 01:20:44,670][49750] InferenceWorker_p0-w0: stopping experience collection (44500 times) [2024-04-27 01:20:44,670][49750] InferenceWorker_p0-w0: resuming experience collection (44500 times) [2024-04-27 01:20:46,089][49750] Updated weights for policy 0, policy_version 317981 (0.0032) [2024-04-27 01:20:47,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5209849856. Throughput: 0: 50513.1. Samples: 2962726980. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:47,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 01:20:49,105][49750] Updated weights for policy 0, policy_version 317991 (0.0036) [2024-04-27 01:20:52,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50244.4, 300 sec: 50873.7). Total num frames: 5210062848. Throughput: 0: 50553.5. Samples: 2962883060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:52,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 01:20:52,518][49750] Updated weights for policy 0, policy_version 318001 (0.0027) [2024-04-27 01:20:55,883][49750] Updated weights for policy 0, policy_version 318011 (0.0033) [2024-04-27 01:20:57,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 5210341376. Throughput: 0: 50674.2. Samples: 2963190860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:20:57,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-27 01:20:58,933][49750] Updated weights for policy 0, policy_version 318021 (0.0032) [2024-04-27 01:21:02,063][49517] Fps is (10 sec: 54066.0, 60 sec: 50244.3, 300 sec: 50984.8). Total num frames: 5210603520. Throughput: 0: 50863.3. Samples: 2963499500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:21:02,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 01:21:02,206][49750] Updated weights for policy 0, policy_version 318031 (0.0029) [2024-04-27 01:21:05,466][49750] Updated weights for policy 0, policy_version 318041 (0.0035) [2024-04-27 01:21:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5210849280. Throughput: 0: 50900.6. Samples: 2963653760. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:21:07,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 01:21:08,622][49750] Updated weights for policy 0, policy_version 318051 (0.0027) [2024-04-27 01:21:11,788][49750] Updated weights for policy 0, policy_version 318061 (0.0036) [2024-04-27 01:21:12,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 5211127808. Throughput: 0: 51014.1. Samples: 2963960880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:21:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:21:15,044][49750] Updated weights for policy 0, policy_version 318071 (0.0033) [2024-04-27 01:21:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50929.3). Total num frames: 5211340800. Throughput: 0: 51032.5. Samples: 2964271940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:21:17,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 01:21:18,082][49750] Updated weights for policy 0, policy_version 318081 (0.0030) [2024-04-27 01:21:21,433][49750] Updated weights for policy 0, policy_version 318091 (0.0026) [2024-04-27 01:21:22,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 5211619328. Throughput: 0: 51063.0. Samples: 2964417460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 01:21:22,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 01:21:24,654][49750] Updated weights for policy 0, policy_version 318101 (0.0032) [2024-04-27 01:21:27,062][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5211881472. Throughput: 0: 50979.7. Samples: 2964714720. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:21:27,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 01:21:27,846][49750] Updated weights for policy 0, policy_version 318111 (0.0034) [2024-04-27 01:21:31,107][49750] Updated weights for policy 0, policy_version 318121 (0.0030) [2024-04-27 01:21:32,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5212143616. Throughput: 0: 50995.1. Samples: 2965021760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:21:32,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 01:21:34,467][49750] Updated weights for policy 0, policy_version 318131 (0.0033) [2024-04-27 01:21:37,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5212389376. Throughput: 0: 50973.3. Samples: 2965176860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:21:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 01:21:37,532][49750] Updated weights for policy 0, policy_version 318141 (0.0028) [2024-04-27 01:21:40,902][49750] Updated weights for policy 0, policy_version 318151 (0.0029) [2024-04-27 01:21:42,063][49517] Fps is (10 sec: 45874.5, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 5212602368. Throughput: 0: 50874.0. Samples: 2965480200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:21:42,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 01:21:43,826][49750] Updated weights for policy 0, policy_version 318161 (0.0036) [2024-04-27 01:21:47,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 5212897280. Throughput: 0: 50794.4. Samples: 2965785240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:21:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 01:21:47,207][49750] Updated weights for policy 0, policy_version 318171 (0.0033) [2024-04-27 01:21:49,145][49728] Signal inference workers to stop experience collection... 
(44550 times) [2024-04-27 01:21:49,193][49750] InferenceWorker_p0-w0: stopping experience collection (44550 times) [2024-04-27 01:21:49,216][49728] Signal inference workers to resume experience collection... (44550 times) [2024-04-27 01:21:49,218][49750] InferenceWorker_p0-w0: resuming experience collection (44550 times) [2024-04-27 01:21:50,383][49750] Updated weights for policy 0, policy_version 318181 (0.0032) [2024-04-27 01:21:52,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5213143040. Throughput: 0: 50871.1. Samples: 2965942960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:21:52,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-27 01:21:53,505][49750] Updated weights for policy 0, policy_version 318191 (0.0034) [2024-04-27 01:21:56,949][49750] Updated weights for policy 0, policy_version 318201 (0.0035) [2024-04-27 01:21:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5213405184. Throughput: 0: 50825.9. Samples: 2966248040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:21:57,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-27 01:22:00,294][49750] Updated weights for policy 0, policy_version 318211 (0.0033) [2024-04-27 01:22:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.6, 300 sec: 50929.3). Total num frames: 5213650944. Throughput: 0: 50860.5. Samples: 2966560660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:22:02,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 01:22:03,366][49750] Updated weights for policy 0, policy_version 318221 (0.0032) [2024-04-27 01:22:06,917][49750] Updated weights for policy 0, policy_version 318231 (0.0036) [2024-04-27 01:22:07,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.3, 300 sec: 50929.3). Total num frames: 5213896704. Throughput: 0: 50860.5. Samples: 2966706180. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:22:07,063][49517] Avg episode reward: [(0, '0.706')] [2024-04-27 01:22:09,738][49750] Updated weights for policy 0, policy_version 318241 (0.0025) [2024-04-27 01:22:12,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5214158848. Throughput: 0: 50967.1. Samples: 2967008240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:22:12,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-27 01:22:13,231][49750] Updated weights for policy 0, policy_version 318251 (0.0037) [2024-04-27 01:22:16,280][49750] Updated weights for policy 0, policy_version 318261 (0.0033) [2024-04-27 01:22:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 5214420992. Throughput: 0: 50910.7. Samples: 2967312740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:22:17,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 01:22:19,550][49750] Updated weights for policy 0, policy_version 318271 (0.0038) [2024-04-27 01:22:22,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5214666752. Throughput: 0: 50832.8. Samples: 2967464340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:22:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 01:22:22,730][49750] Updated weights for policy 0, policy_version 318281 (0.0027) [2024-04-27 01:22:26,275][49750] Updated weights for policy 0, policy_version 318291 (0.0035) [2024-04-27 01:22:27,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 5214896128. Throughput: 0: 50827.7. Samples: 2967767440. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:22:27,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 01:22:29,173][49750] Updated weights for policy 0, policy_version 318301 (0.0030) [2024-04-27 01:22:32,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 5215174656. Throughput: 0: 50824.4. Samples: 2968072340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:22:32,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-27 01:22:32,892][49750] Updated weights for policy 0, policy_version 318311 (0.0032) [2024-04-27 01:22:35,644][49750] Updated weights for policy 0, policy_version 318321 (0.0031) [2024-04-27 01:22:36,432][49728] Signal inference workers to stop experience collection... (44600 times) [2024-04-27 01:22:36,432][49728] Signal inference workers to resume experience collection... (44600 times) [2024-04-27 01:22:36,465][49750] InferenceWorker_p0-w0: stopping experience collection (44600 times) [2024-04-27 01:22:36,465][49750] InferenceWorker_p0-w0: resuming experience collection (44600 times) [2024-04-27 01:22:37,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51063.4, 300 sec: 51040.3). Total num frames: 5215453184. Throughput: 0: 50745.2. Samples: 2968226500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 01:22:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 01:22:39,190][49750] Updated weights for policy 0, policy_version 318331 (0.0032) [2024-04-27 01:22:42,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5215682560. Throughput: 0: 50792.3. Samples: 2968533700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:22:42,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 01:22:42,105][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000318341_5215698944.pth... 
[2024-04-27 01:22:42,108][49750] Updated weights for policy 0, policy_version 318341 (0.0033) [2024-04-27 01:22:42,151][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000317596_5203492864.pth [2024-04-27 01:22:45,826][49750] Updated weights for policy 0, policy_version 318351 (0.0032) [2024-04-27 01:22:47,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5215944704. Throughput: 0: 50662.8. Samples: 2968840500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:22:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 01:22:48,620][49750] Updated weights for policy 0, policy_version 318361 (0.0028) [2024-04-27 01:22:52,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 5216174080. Throughput: 0: 50807.9. Samples: 2968992540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:22:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 01:22:52,280][49750] Updated weights for policy 0, policy_version 318371 (0.0029) [2024-04-27 01:22:54,908][49750] Updated weights for policy 0, policy_version 318381 (0.0031) [2024-04-27 01:22:57,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5216468992. Throughput: 0: 50872.5. Samples: 2969297500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:22:57,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 01:22:58,674][49750] Updated weights for policy 0, policy_version 318391 (0.0029) [2024-04-27 01:23:01,420][49750] Updated weights for policy 0, policy_version 318401 (0.0030) [2024-04-27 01:23:02,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5216698368. Throughput: 0: 50807.5. Samples: 2969599080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 01:23:05,014][49750] Updated weights for policy 0, policy_version 318411 (0.0033) [2024-04-27 01:23:07,062][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5216944128. Throughput: 0: 50902.8. Samples: 2969754960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:23:08,107][49750] Updated weights for policy 0, policy_version 318421 (0.0029) [2024-04-27 01:23:11,574][49750] Updated weights for policy 0, policy_version 318431 (0.0030) [2024-04-27 01:23:12,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5217206272. Throughput: 0: 50938.3. Samples: 2970059660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:12,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 01:23:14,410][49750] Updated weights for policy 0, policy_version 318441 (0.0030) [2024-04-27 01:23:17,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5217452032. Throughput: 0: 50891.1. Samples: 2970362440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:17,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 01:23:17,918][49750] Updated weights for policy 0, policy_version 318451 (0.0034) [2024-04-27 01:23:20,803][49750] Updated weights for policy 0, policy_version 318461 (0.0035) [2024-04-27 01:23:22,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5217730560. Throughput: 0: 50738.0. Samples: 2970509700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 01:23:24,433][49750] Updated weights for policy 0, policy_version 318471 (0.0033) [2024-04-27 01:23:27,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5217976320. Throughput: 0: 50744.7. Samples: 2970817220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 01:23:27,187][49750] Updated weights for policy 0, policy_version 318481 (0.0036) [2024-04-27 01:23:30,673][49728] Signal inference workers to stop experience collection... (44650 times) [2024-04-27 01:23:30,695][49750] InferenceWorker_p0-w0: stopping experience collection (44650 times) [2024-04-27 01:23:30,739][49728] Signal inference workers to resume experience collection... (44650 times) [2024-04-27 01:23:30,739][49750] InferenceWorker_p0-w0: resuming experience collection (44650 times) [2024-04-27 01:23:30,741][49750] Updated weights for policy 0, policy_version 318491 (0.0035) [2024-04-27 01:23:32,062][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5218238464. Throughput: 0: 50828.1. Samples: 2971127760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:23:33,738][49750] Updated weights for policy 0, policy_version 318501 (0.0028) [2024-04-27 01:23:36,977][49750] Updated weights for policy 0, policy_version 318511 (0.0031) [2024-04-27 01:23:37,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5218484224. Throughput: 0: 50970.2. Samples: 2971286200. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:37,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 01:23:40,236][49750] Updated weights for policy 0, policy_version 318521 (0.0032) [2024-04-27 01:23:42,064][49517] Fps is (10 sec: 50783.8, 60 sec: 51062.4, 300 sec: 50873.5). Total num frames: 5218746368. Throughput: 0: 50812.2. Samples: 2971584120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:42,064][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:23:43,490][49750] Updated weights for policy 0, policy_version 318531 (0.0029) [2024-04-27 01:23:46,565][49750] Updated weights for policy 0, policy_version 318541 (0.0033) [2024-04-27 01:23:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 5218975744. Throughput: 0: 50986.3. Samples: 2971893460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:47,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 01:23:50,030][49750] Updated weights for policy 0, policy_version 318551 (0.0032) [2024-04-27 01:23:52,062][49517] Fps is (10 sec: 49158.7, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5219237888. Throughput: 0: 50958.2. Samples: 2972048080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 01:23:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 01:23:53,068][49750] Updated weights for policy 0, policy_version 318561 (0.0035) [2024-04-27 01:23:56,316][49750] Updated weights for policy 0, policy_version 318571 (0.0030) [2024-04-27 01:23:57,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5219516416. Throughput: 0: 51001.7. Samples: 2972354740. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:23:57,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 01:23:59,618][49750] Updated weights for policy 0, policy_version 318581 (0.0032) [2024-04-27 01:24:02,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5219745792. Throughput: 0: 50867.6. Samples: 2972651480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:02,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:24:02,571][49750] Updated weights for policy 0, policy_version 318591 (0.0028) [2024-04-27 01:24:06,217][49750] Updated weights for policy 0, policy_version 318601 (0.0037) [2024-04-27 01:24:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5220007936. Throughput: 0: 51024.4. Samples: 2972805800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:07,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 01:24:09,072][49750] Updated weights for policy 0, policy_version 318611 (0.0025) [2024-04-27 01:24:12,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5220253696. Throughput: 0: 50890.0. Samples: 2973107260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:12,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-27 01:24:12,538][49750] Updated weights for policy 0, policy_version 318621 (0.0029) [2024-04-27 01:24:15,769][49728] Signal inference workers to stop experience collection... (44700 times) [2024-04-27 01:24:15,769][49728] Signal inference workers to resume experience collection... 
(44700 times) [2024-04-27 01:24:15,781][49750] Updated weights for policy 0, policy_version 318631 (0.0031) [2024-04-27 01:24:15,811][49750] InferenceWorker_p0-w0: stopping experience collection (44700 times) [2024-04-27 01:24:15,812][49750] InferenceWorker_p0-w0: resuming experience collection (44700 times) [2024-04-27 01:24:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5220532224. Throughput: 0: 50785.8. Samples: 2973413120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:17,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:24:18,934][49750] Updated weights for policy 0, policy_version 318641 (0.0035) [2024-04-27 01:24:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5220761600. Throughput: 0: 50733.5. Samples: 2973569200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:22,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:24:22,127][49750] Updated weights for policy 0, policy_version 318651 (0.0029) [2024-04-27 01:24:25,327][49750] Updated weights for policy 0, policy_version 318661 (0.0029) [2024-04-27 01:24:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 5221023744. Throughput: 0: 50931.3. Samples: 2973875960. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:27,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 01:24:28,574][49750] Updated weights for policy 0, policy_version 318671 (0.0039) [2024-04-27 01:24:31,653][49750] Updated weights for policy 0, policy_version 318681 (0.0030) [2024-04-27 01:24:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5221285888. Throughput: 0: 50825.1. Samples: 2974180600. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:32,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 01:24:35,033][49750] Updated weights for policy 0, policy_version 318691 (0.0030) [2024-04-27 01:24:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5221531648. Throughput: 0: 50811.9. Samples: 2974334620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:37,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-27 01:24:38,214][49750] Updated weights for policy 0, policy_version 318701 (0.0028) [2024-04-27 01:24:41,304][49750] Updated weights for policy 0, policy_version 318711 (0.0036) [2024-04-27 01:24:42,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50791.5, 300 sec: 50873.7). Total num frames: 5221793792. Throughput: 0: 50877.4. Samples: 2974644220. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:42,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 01:24:42,173][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000318714_5221810176.pth... [2024-04-27 01:24:42,218][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000317967_5209571328.pth [2024-04-27 01:24:44,544][49750] Updated weights for policy 0, policy_version 318721 (0.0029) [2024-04-27 01:24:47,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5222055936. Throughput: 0: 50885.2. Samples: 2974941320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:47,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-27 01:24:47,648][49750] Updated weights for policy 0, policy_version 318731 (0.0034) [2024-04-27 01:24:51,130][49750] Updated weights for policy 0, policy_version 318741 (0.0033) [2024-04-27 01:24:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5222301696. Throughput: 0: 51093.4. Samples: 2975105000. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:52,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 01:24:54,239][49750] Updated weights for policy 0, policy_version 318751 (0.0042) [2024-04-27 01:24:57,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5222547456. Throughput: 0: 50959.2. Samples: 2975400420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:24:57,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-27 01:24:57,692][49750] Updated weights for policy 0, policy_version 318761 (0.0030) [2024-04-27 01:25:00,789][49750] Updated weights for policy 0, policy_version 318771 (0.0033) [2024-04-27 01:25:02,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5222793216. Throughput: 0: 50815.0. Samples: 2975699800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 01:25:02,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-27 01:25:03,999][49750] Updated weights for policy 0, policy_version 318781 (0.0032) [2024-04-27 01:25:07,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5223055360. Throughput: 0: 50848.4. Samples: 2975857380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:07,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 01:25:07,085][49750] Updated weights for policy 0, policy_version 318791 (0.0031) [2024-04-27 01:25:10,338][49750] Updated weights for policy 0, policy_version 318801 (0.0036) [2024-04-27 01:25:12,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5223317504. Throughput: 0: 50791.3. Samples: 2976161580. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 01:25:13,400][49750] Updated weights for policy 0, policy_version 318811 (0.0037) [2024-04-27 01:25:16,762][49750] Updated weights for policy 0, policy_version 318821 (0.0031) [2024-04-27 01:25:17,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5223563264. Throughput: 0: 50756.2. Samples: 2976464620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 01:25:18,167][49728] Signal inference workers to stop experience collection... (44750 times) [2024-04-27 01:25:18,168][49728] Signal inference workers to resume experience collection... (44750 times) [2024-04-27 01:25:18,193][49750] InferenceWorker_p0-w0: stopping experience collection (44750 times) [2024-04-27 01:25:18,193][49750] InferenceWorker_p0-w0: resuming experience collection (44750 times) [2024-04-27 01:25:19,869][49750] Updated weights for policy 0, policy_version 318831 (0.0032) [2024-04-27 01:25:22,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5223825408. Throughput: 0: 50931.9. Samples: 2976626560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 01:25:23,153][49750] Updated weights for policy 0, policy_version 318841 (0.0036) [2024-04-27 01:25:26,390][49750] Updated weights for policy 0, policy_version 318851 (0.0031) [2024-04-27 01:25:27,062][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5224087552. Throughput: 0: 50881.3. Samples: 2976933880. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:27,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 01:25:29,553][49750] Updated weights for policy 0, policy_version 318861 (0.0033) [2024-04-27 01:25:32,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5224333312. Throughput: 0: 50926.1. Samples: 2977233000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:32,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 01:25:32,793][49750] Updated weights for policy 0, policy_version 318871 (0.0034) [2024-04-27 01:25:35,928][49750] Updated weights for policy 0, policy_version 318881 (0.0036) [2024-04-27 01:25:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5224579072. Throughput: 0: 50855.5. Samples: 2977393500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:37,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 01:25:39,293][49750] Updated weights for policy 0, policy_version 318891 (0.0033) [2024-04-27 01:25:42,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5224841216. Throughput: 0: 51132.2. Samples: 2977701380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:42,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-27 01:25:42,406][49750] Updated weights for policy 0, policy_version 318901 (0.0032) [2024-04-27 01:25:45,613][49750] Updated weights for policy 0, policy_version 318911 (0.0033) [2024-04-27 01:25:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.4, 300 sec: 50873.7). Total num frames: 5225070592. Throughput: 0: 51166.8. Samples: 2978002300. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:47,063][49517] Avg episode reward: [(0, '0.534')] [2024-04-27 01:25:48,834][49750] Updated weights for policy 0, policy_version 318921 (0.0039) [2024-04-27 01:25:51,953][49750] Updated weights for policy 0, policy_version 318931 (0.0031) [2024-04-27 01:25:52,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5225365504. Throughput: 0: 51044.9. Samples: 2978154400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:52,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-27 01:25:55,144][49750] Updated weights for policy 0, policy_version 318941 (0.0040) [2024-04-27 01:25:57,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5225611264. Throughput: 0: 51132.9. Samples: 2978462560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:25:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 01:25:58,293][49750] Updated weights for policy 0, policy_version 318951 (0.0039) [2024-04-27 01:26:01,513][49750] Updated weights for policy 0, policy_version 318961 (0.0037) [2024-04-27 01:26:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5225873408. Throughput: 0: 51227.0. Samples: 2978769840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:26:02,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-27 01:26:04,754][49750] Updated weights for policy 0, policy_version 318971 (0.0030) [2024-04-27 01:26:07,062][49517] Fps is (10 sec: 52430.1, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5226135552. Throughput: 0: 51093.5. Samples: 2978925760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:26:07,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 01:26:08,120][49750] Updated weights for policy 0, policy_version 318981 (0.0036) [2024-04-27 01:26:11,235][49750] Updated weights for policy 0, policy_version 318991 (0.0028) [2024-04-27 01:26:12,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5226364928. Throughput: 0: 50975.4. Samples: 2979227780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:26:12,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-27 01:26:14,588][49750] Updated weights for policy 0, policy_version 319001 (0.0030) [2024-04-27 01:26:17,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5226643456. Throughput: 0: 51178.7. Samples: 2979536040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 01:26:17,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 01:26:17,484][49750] Updated weights for policy 0, policy_version 319011 (0.0029) [2024-04-27 01:26:20,927][49750] Updated weights for policy 0, policy_version 319021 (0.0031) [2024-04-27 01:26:22,063][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5226905600. Throughput: 0: 50922.5. Samples: 2979685020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:26:22,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 01:26:23,423][49728] Signal inference workers to stop experience collection... (44800 times) [2024-04-27 01:26:23,423][49728] Signal inference workers to resume experience collection... 
(44800 times) [2024-04-27 01:26:23,449][49750] InferenceWorker_p0-w0: stopping experience collection (44800 times) [2024-04-27 01:26:23,449][49750] InferenceWorker_p0-w0: resuming experience collection (44800 times) [2024-04-27 01:26:23,959][49750] Updated weights for policy 0, policy_version 319031 (0.0030) [2024-04-27 01:26:27,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5227151360. Throughput: 0: 51031.8. Samples: 2979997800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:26:27,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:26:27,246][49750] Updated weights for policy 0, policy_version 319041 (0.0030) [2024-04-27 01:26:30,785][49750] Updated weights for policy 0, policy_version 319051 (0.0042) [2024-04-27 01:26:32,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51336.7, 300 sec: 50929.3). Total num frames: 5227413504. Throughput: 0: 51223.6. Samples: 2980307360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:26:32,063][49517] Avg episode reward: [(0, '0.469')] [2024-04-27 01:26:33,750][49750] Updated weights for policy 0, policy_version 319061 (0.0026) [2024-04-27 01:26:37,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 5227642880. Throughput: 0: 50962.2. Samples: 2980447700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:26:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 01:26:37,231][49750] Updated weights for policy 0, policy_version 319071 (0.0031) [2024-04-27 01:26:40,251][49750] Updated weights for policy 0, policy_version 319081 (0.0033) [2024-04-27 01:26:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5227905024. Throughput: 0: 51165.8. Samples: 2980765020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:26:42,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 01:26:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000319086_5227905024.pth... [2024-04-27 01:26:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000318341_5215698944.pth [2024-04-27 01:26:43,485][49750] Updated weights for policy 0, policy_version 319091 (0.0031) [2024-04-27 01:26:46,601][49750] Updated weights for policy 0, policy_version 319101 (0.0033) [2024-04-27 01:26:47,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51882.7, 300 sec: 50984.8). Total num frames: 5228183552. Throughput: 0: 51182.3. Samples: 2981073040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:26:47,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 01:26:49,759][49750] Updated weights for policy 0, policy_version 319111 (0.0030) [2024-04-27 01:26:52,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5228412928. Throughput: 0: 51055.9. Samples: 2981223280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:26:52,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 01:26:52,913][49750] Updated weights for policy 0, policy_version 319121 (0.0024) [2024-04-27 01:26:56,296][49750] Updated weights for policy 0, policy_version 319131 (0.0030) [2024-04-27 01:26:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 5228675072. Throughput: 0: 51158.4. Samples: 2981529900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:26:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 01:26:59,367][49750] Updated weights for policy 0, policy_version 319141 (0.0037) [2024-04-27 01:27:02,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 5228937216. Throughput: 0: 51130.3. Samples: 2981836900. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:27:02,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 01:27:02,810][49750] Updated weights for policy 0, policy_version 319151 (0.0033) [2024-04-27 01:27:05,824][49750] Updated weights for policy 0, policy_version 319161 (0.0030) [2024-04-27 01:27:07,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.3, 300 sec: 50984.8). Total num frames: 5229199360. Throughput: 0: 51029.4. Samples: 2981981340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:27:07,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 01:27:09,182][49750] Updated weights for policy 0, policy_version 319171 (0.0030) [2024-04-27 01:27:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5229445120. Throughput: 0: 50959.8. Samples: 2982291000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:27:12,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 01:27:12,322][49750] Updated weights for policy 0, policy_version 319181 (0.0036) [2024-04-27 01:27:15,507][49750] Updated weights for policy 0, policy_version 319191 (0.0028) [2024-04-27 01:27:17,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5229707264. Throughput: 0: 50841.2. Samples: 2982595220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:27:17,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-27 01:27:18,662][49750] Updated weights for policy 0, policy_version 319201 (0.0031) [2024-04-27 01:27:22,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50984.8). Total num frames: 5229936640. Throughput: 0: 51034.7. Samples: 2982744260. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:27:22,063][49517] Avg episode reward: [(0, '0.490')] [2024-04-27 01:27:22,076][49750] Updated weights for policy 0, policy_version 319211 (0.0038) [2024-04-27 01:27:25,037][49750] Updated weights for policy 0, policy_version 319221 (0.0031) [2024-04-27 01:27:27,063][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 5230166016. Throughput: 0: 50705.4. Samples: 2983046760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:27:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 01:27:28,604][49750] Updated weights for policy 0, policy_version 319231 (0.0030) [2024-04-27 01:27:31,495][49750] Updated weights for policy 0, policy_version 319241 (0.0032) [2024-04-27 01:27:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 5230460928. Throughput: 0: 50666.0. Samples: 2983353020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 01:27:32,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-27 01:27:34,921][49750] Updated weights for policy 0, policy_version 319251 (0.0030) [2024-04-27 01:27:35,499][49728] Signal inference workers to stop experience collection... (44850 times) [2024-04-27 01:27:35,500][49728] Signal inference workers to resume experience collection... (44850 times) [2024-04-27 01:27:35,529][49750] InferenceWorker_p0-w0: stopping experience collection (44850 times) [2024-04-27 01:27:35,529][49750] InferenceWorker_p0-w0: resuming experience collection (44850 times) [2024-04-27 01:27:37,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5230706688. Throughput: 0: 50966.6. Samples: 2983516780. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:27:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 01:27:37,943][49750] Updated weights for policy 0, policy_version 319261 (0.0035) [2024-04-27 01:27:41,344][49750] Updated weights for policy 0, policy_version 319271 (0.0032) [2024-04-27 01:27:42,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5230952448. Throughput: 0: 50836.0. Samples: 2983817520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:27:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-27 01:27:44,249][49750] Updated weights for policy 0, policy_version 319281 (0.0026) [2024-04-27 01:27:47,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.2, 300 sec: 50929.3). Total num frames: 5231198208. Throughput: 0: 50902.2. Samples: 2984127500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:27:47,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-27 01:27:47,762][49750] Updated weights for policy 0, policy_version 319291 (0.0027) [2024-04-27 01:27:50,650][49750] Updated weights for policy 0, policy_version 319301 (0.0037) [2024-04-27 01:27:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5231476736. Throughput: 0: 50844.6. Samples: 2984269340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:27:52,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 01:27:54,355][49750] Updated weights for policy 0, policy_version 319311 (0.0036) [2024-04-27 01:27:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5231722496. Throughput: 0: 50729.1. Samples: 2984573800. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:27:57,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 01:27:57,228][49750] Updated weights for policy 0, policy_version 319321 (0.0029) [2024-04-27 01:28:00,698][49750] Updated weights for policy 0, policy_version 319331 (0.0024) [2024-04-27 01:28:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50929.2). Total num frames: 5231968256. Throughput: 0: 50660.5. Samples: 2984874940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:02,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-27 01:28:03,577][49750] Updated weights for policy 0, policy_version 319341 (0.0029) [2024-04-27 01:28:07,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.4, 300 sec: 50929.2). Total num frames: 5232230400. Throughput: 0: 50871.1. Samples: 2985033460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:07,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 01:28:07,100][49750] Updated weights for policy 0, policy_version 319351 (0.0030) [2024-04-27 01:28:09,935][49750] Updated weights for policy 0, policy_version 319361 (0.0035) [2024-04-27 01:28:12,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50244.5, 300 sec: 50873.7). Total num frames: 5232459776. Throughput: 0: 50842.4. Samples: 2985334660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:12,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 01:28:13,790][49750] Updated weights for policy 0, policy_version 319371 (0.0034) [2024-04-27 01:28:16,390][49750] Updated weights for policy 0, policy_version 319381 (0.0040) [2024-04-27 01:28:17,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5232754688. Throughput: 0: 50640.8. Samples: 2985631860. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:17,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-27 01:28:20,130][49750] Updated weights for policy 0, policy_version 319391 (0.0032) [2024-04-27 01:28:22,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5233000448. Throughput: 0: 50603.3. Samples: 2985793920. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:22,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 01:28:22,892][49750] Updated weights for policy 0, policy_version 319401 (0.0034) [2024-04-27 01:28:26,501][49750] Updated weights for policy 0, policy_version 319411 (0.0033) [2024-04-27 01:28:27,063][49517] Fps is (10 sec: 49152.6, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5233246208. Throughput: 0: 50819.1. Samples: 2986104380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 01:28:29,263][49750] Updated weights for policy 0, policy_version 319421 (0.0030) [2024-04-27 01:28:32,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5233491968. Throughput: 0: 50677.8. Samples: 2986408000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:32,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 01:28:33,204][49750] Updated weights for policy 0, policy_version 319431 (0.0034) [2024-04-27 01:28:35,211][49728] Signal inference workers to stop experience collection... (44900 times) [2024-04-27 01:28:35,211][49728] Signal inference workers to resume experience collection... 
(44900 times) [2024-04-27 01:28:35,238][49750] InferenceWorker_p0-w0: stopping experience collection (44900 times) [2024-04-27 01:28:35,238][49750] InferenceWorker_p0-w0: resuming experience collection (44900 times) [2024-04-27 01:28:35,801][49750] Updated weights for policy 0, policy_version 319441 (0.0033) [2024-04-27 01:28:37,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50873.9). Total num frames: 5233754112. Throughput: 0: 50744.4. Samples: 2986552840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:37,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 01:28:39,660][49750] Updated weights for policy 0, policy_version 319451 (0.0028) [2024-04-27 01:28:42,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.5, 300 sec: 51040.3). Total num frames: 5234032640. Throughput: 0: 50799.5. Samples: 2986859780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 01:28:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000319460_5234032640.pth... [2024-04-27 01:28:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000318714_5221810176.pth [2024-04-27 01:28:42,476][49750] Updated weights for policy 0, policy_version 319461 (0.0027) [2024-04-27 01:28:45,980][49750] Updated weights for policy 0, policy_version 319471 (0.0030) [2024-04-27 01:28:47,062][49517] Fps is (10 sec: 50790.8, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 5234262016. Throughput: 0: 50893.4. Samples: 2987165140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 01:28:47,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 01:28:48,788][49750] Updated weights for policy 0, policy_version 319481 (0.0037) [2024-04-27 01:28:52,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5234507776. Throughput: 0: 50744.8. Samples: 2987316980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:28:52,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 01:28:52,329][49750] Updated weights for policy 0, policy_version 319491 (0.0032) [2024-04-27 01:28:55,109][49750] Updated weights for policy 0, policy_version 319501 (0.0028) [2024-04-27 01:28:57,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5234769920. Throughput: 0: 50835.9. Samples: 2987622280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:28:57,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-27 01:28:58,811][49750] Updated weights for policy 0, policy_version 319511 (0.0033) [2024-04-27 01:29:01,745][49750] Updated weights for policy 0, policy_version 319521 (0.0030) [2024-04-27 01:29:02,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5235032064. Throughput: 0: 51031.7. Samples: 2987928280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:02,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 01:29:05,162][49750] Updated weights for policy 0, policy_version 319531 (0.0027) [2024-04-27 01:29:07,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 5235294208. Throughput: 0: 50912.7. Samples: 2988085000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:07,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 01:29:08,186][49750] Updated weights for policy 0, policy_version 319541 (0.0030) [2024-04-27 01:29:11,465][49750] Updated weights for policy 0, policy_version 319551 (0.0030) [2024-04-27 01:29:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51336.3, 300 sec: 50873.7). Total num frames: 5235539968. Throughput: 0: 50781.6. Samples: 2988389560. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:29:14,461][49750] Updated weights for policy 0, policy_version 319561 (0.0035) [2024-04-27 01:29:17,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.4, 300 sec: 50929.2). Total num frames: 5235785728. Throughput: 0: 50762.6. Samples: 2988692320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:17,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 01:29:18,115][49750] Updated weights for policy 0, policy_version 319571 (0.0027) [2024-04-27 01:29:20,872][49750] Updated weights for policy 0, policy_version 319581 (0.0032) [2024-04-27 01:29:22,062][49517] Fps is (10 sec: 47514.8, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5236015104. Throughput: 0: 50860.6. Samples: 2988841560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:22,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 01:29:24,545][49750] Updated weights for policy 0, policy_version 319591 (0.0029) [2024-04-27 01:29:27,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5236326400. Throughput: 0: 50832.0. Samples: 2989147220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:27,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 01:29:27,333][49750] Updated weights for policy 0, policy_version 319601 (0.0029) [2024-04-27 01:29:31,027][49750] Updated weights for policy 0, policy_version 319611 (0.0034) [2024-04-27 01:29:32,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5236555776. Throughput: 0: 50827.0. Samples: 2989452360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:32,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 01:29:33,646][49750] Updated weights for policy 0, policy_version 319621 (0.0028) [2024-04-27 01:29:37,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5236801536. Throughput: 0: 50878.6. Samples: 2989606520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:37,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 01:29:37,535][49750] Updated weights for policy 0, policy_version 319631 (0.0030) [2024-04-27 01:29:40,599][49750] Updated weights for policy 0, policy_version 319641 (0.0034) [2024-04-27 01:29:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 5237047296. Throughput: 0: 50849.3. Samples: 2989910500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:29:43,071][49728] Signal inference workers to stop experience collection... (44950 times) [2024-04-27 01:29:43,071][49728] Signal inference workers to resume experience collection... (44950 times) [2024-04-27 01:29:43,107][49750] InferenceWorker_p0-w0: stopping experience collection (44950 times) [2024-04-27 01:29:43,107][49750] InferenceWorker_p0-w0: resuming experience collection (44950 times) [2024-04-27 01:29:43,910][49750] Updated weights for policy 0, policy_version 319651 (0.0032) [2024-04-27 01:29:47,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5237309440. Throughput: 0: 50609.4. Samples: 2990205700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:47,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-27 01:29:47,439][49750] Updated weights for policy 0, policy_version 319661 (0.0028) [2024-04-27 01:29:50,329][49750] Updated weights for policy 0, policy_version 319671 (0.0031) [2024-04-27 01:29:52,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5237587968. Throughput: 0: 50705.8. Samples: 2990366760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:52,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 01:29:53,928][49750] Updated weights for policy 0, policy_version 319681 (0.0035) [2024-04-27 01:29:56,768][49750] Updated weights for policy 0, policy_version 319691 (0.0030) [2024-04-27 01:29:57,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 5237833728. Throughput: 0: 50755.7. Samples: 2990673560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:29:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 01:30:00,231][49750] Updated weights for policy 0, policy_version 319701 (0.0030) [2024-04-27 01:30:02,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5238063104. Throughput: 0: 50916.1. Samples: 2990983540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 01:30:02,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 01:30:03,140][49750] Updated weights for policy 0, policy_version 319711 (0.0031) [2024-04-27 01:30:06,699][49750] Updated weights for policy 0, policy_version 319721 (0.0033) [2024-04-27 01:30:07,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5238308864. Throughput: 0: 50782.1. Samples: 2991126760. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:07,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-27 01:30:09,529][49750] Updated weights for policy 0, policy_version 319731 (0.0046) [2024-04-27 01:30:12,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5238603776. Throughput: 0: 50757.6. Samples: 2991431320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:12,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 01:30:13,285][49750] Updated weights for policy 0, policy_version 319741 (0.0028) [2024-04-27 01:30:15,993][49750] Updated weights for policy 0, policy_version 319751 (0.0037) [2024-04-27 01:30:17,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5238849536. Throughput: 0: 50725.6. Samples: 2991735020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:17,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 01:30:19,705][49750] Updated weights for policy 0, policy_version 319761 (0.0028) [2024-04-27 01:30:22,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5239095296. Throughput: 0: 50865.7. Samples: 2991895480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:22,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 01:30:22,378][49750] Updated weights for policy 0, policy_version 319771 (0.0031) [2024-04-27 01:30:26,039][49750] Updated weights for policy 0, policy_version 319781 (0.0030) [2024-04-27 01:30:27,063][49517] Fps is (10 sec: 47513.6, 60 sec: 49971.0, 300 sec: 50818.2). Total num frames: 5239324672. Throughput: 0: 50810.2. Samples: 2992196960. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:27,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:30:28,889][49750] Updated weights for policy 0, policy_version 319791 (0.0034) [2024-04-27 01:30:32,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5239586816. Throughput: 0: 50868.9. Samples: 2992494800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:32,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-27 01:30:32,433][49750] Updated weights for policy 0, policy_version 319801 (0.0030) [2024-04-27 01:30:35,219][49750] Updated weights for policy 0, policy_version 319811 (0.0026) [2024-04-27 01:30:37,063][49517] Fps is (10 sec: 54067.6, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5239865344. Throughput: 0: 50837.4. Samples: 2992654440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:37,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 01:30:39,046][49750] Updated weights for policy 0, policy_version 319821 (0.0028) [2024-04-27 01:30:41,732][49750] Updated weights for policy 0, policy_version 319831 (0.0038) [2024-04-27 01:30:42,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51336.6, 300 sec: 51040.3). Total num frames: 5240127488. Throughput: 0: 50777.4. Samples: 2992958540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:42,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 01:30:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000319832_5240127488.pth... [2024-04-27 01:30:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000319086_5227905024.pth [2024-04-27 01:30:45,785][49750] Updated weights for policy 0, policy_version 319841 (0.0036) [2024-04-27 01:30:47,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5240340480. Throughput: 0: 50657.3. Samples: 2993263120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-27 01:30:48,249][49750] Updated weights for policy 0, policy_version 319851 (0.0027) [2024-04-27 01:30:52,063][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 5240586240. Throughput: 0: 50631.6. Samples: 2993405180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:52,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 01:30:52,114][49750] Updated weights for policy 0, policy_version 319861 (0.0042) [2024-04-27 01:30:53,345][49728] Signal inference workers to stop experience collection... (45000 times) [2024-04-27 01:30:53,398][49750] InferenceWorker_p0-w0: stopping experience collection (45000 times) [2024-04-27 01:30:53,411][49728] Signal inference workers to resume experience collection... (45000 times) [2024-04-27 01:30:53,417][49750] InferenceWorker_p0-w0: resuming experience collection (45000 times) [2024-04-27 01:30:54,622][49750] Updated weights for policy 0, policy_version 319871 (0.0029) [2024-04-27 01:30:57,062][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5240881152. Throughput: 0: 50618.3. Samples: 2993709140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:30:57,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 01:30:58,450][49750] Updated weights for policy 0, policy_version 319881 (0.0031) [2024-04-27 01:31:01,078][49750] Updated weights for policy 0, policy_version 319891 (0.0029) [2024-04-27 01:31:02,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5241126912. Throughput: 0: 50648.7. Samples: 2994014200. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:31:02,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 01:31:05,137][49750] Updated weights for policy 0, policy_version 319901 (0.0035) [2024-04-27 01:31:07,063][49517] Fps is (10 sec: 49148.9, 60 sec: 51063.0, 300 sec: 50873.6). Total num frames: 5241372672. Throughput: 0: 50654.1. Samples: 2994174940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:31:07,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 01:31:07,498][49750] Updated weights for policy 0, policy_version 319911 (0.0037) [2024-04-27 01:31:11,596][49750] Updated weights for policy 0, policy_version 319921 (0.0033) [2024-04-27 01:31:12,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5241618432. Throughput: 0: 50632.6. Samples: 2994475420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:31:12,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-27 01:31:14,001][49750] Updated weights for policy 0, policy_version 319931 (0.0027) [2024-04-27 01:31:17,062][49517] Fps is (10 sec: 49155.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5241864192. Throughput: 0: 50695.5. Samples: 2994776100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 01:31:17,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 01:31:18,045][49750] Updated weights for policy 0, policy_version 319941 (0.0030) [2024-04-27 01:31:20,472][49750] Updated weights for policy 0, policy_version 319951 (0.0029) [2024-04-27 01:31:22,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5242142720. Throughput: 0: 50561.5. Samples: 2994929700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:31:22,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 01:31:24,496][49750] Updated weights for policy 0, policy_version 319961 (0.0030) [2024-04-27 01:31:26,964][49750] Updated weights for policy 0, policy_version 319971 (0.0033) [2024-04-27 01:31:27,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.8, 300 sec: 50818.2). Total num frames: 5242404864. Throughput: 0: 50660.6. Samples: 2995238260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:31:27,063][49517] Avg episode reward: [(0, '0.689')] [2024-04-27 01:31:31,079][49750] Updated weights for policy 0, policy_version 319981 (0.0031) [2024-04-27 01:31:32,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 5242634240. Throughput: 0: 50600.8. Samples: 2995540160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:31:32,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 01:31:33,401][49750] Updated weights for policy 0, policy_version 319991 (0.0038) [2024-04-27 01:31:37,063][49517] Fps is (10 sec: 45874.1, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 5242863616. Throughput: 0: 50660.8. Samples: 2995684920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:31:37,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 01:31:37,540][49750] Updated weights for policy 0, policy_version 320001 (0.0034) [2024-04-27 01:31:39,873][49750] Updated weights for policy 0, policy_version 320011 (0.0029) [2024-04-27 01:31:42,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5243158528. Throughput: 0: 50611.4. Samples: 2995986660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:31:42,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-27 01:31:43,847][49750] Updated weights for policy 0, policy_version 320021 (0.0028) [2024-04-27 01:31:46,359][49750] Updated weights for policy 0, policy_version 320031 (0.0029) [2024-04-27 01:31:47,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5243404288. Throughput: 0: 50536.0. Samples: 2996288320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:31:47,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 01:31:50,398][49750] Updated weights for policy 0, policy_version 320041 (0.0030) [2024-04-27 01:31:52,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5243633664. Throughput: 0: 50595.7. Samples: 2996451720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:31:52,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:31:52,239][49728] Signal inference workers to stop experience collection... (45050 times) [2024-04-27 01:31:52,282][49750] InferenceWorker_p0-w0: stopping experience collection (45050 times) [2024-04-27 01:31:52,310][49728] Signal inference workers to resume experience collection... (45050 times) [2024-04-27 01:31:52,311][49750] InferenceWorker_p0-w0: resuming experience collection (45050 times) [2024-04-27 01:31:52,743][49750] Updated weights for policy 0, policy_version 320051 (0.0026) [2024-04-27 01:31:56,911][49750] Updated weights for policy 0, policy_version 320061 (0.0033) [2024-04-27 01:31:57,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 50651.6). Total num frames: 5243879424. Throughput: 0: 50588.5. Samples: 2996751900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:31:57,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-27 01:31:59,150][49750] Updated weights for policy 0, policy_version 320071 (0.0029) [2024-04-27 01:32:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50244.3, 300 sec: 50651.6). Total num frames: 5244141568. Throughput: 0: 50514.2. Samples: 2997049240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:32:02,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-27 01:32:03,359][49750] Updated weights for policy 0, policy_version 320081 (0.0037) [2024-04-27 01:32:05,543][49750] Updated weights for policy 0, policy_version 320091 (0.0029) [2024-04-27 01:32:07,063][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.9, 300 sec: 50762.6). Total num frames: 5244420096. Throughput: 0: 50676.8. Samples: 2997210160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:32:07,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 01:32:09,720][49750] Updated weights for policy 0, policy_version 320101 (0.0034) [2024-04-27 01:32:12,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5244682240. Throughput: 0: 50619.3. Samples: 2997516140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:32:12,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-27 01:32:12,305][49750] Updated weights for policy 0, policy_version 320111 (0.0033) [2024-04-27 01:32:16,069][49750] Updated weights for policy 0, policy_version 320121 (0.0034) [2024-04-27 01:32:17,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 5244911616. Throughput: 0: 50701.7. Samples: 2997821740. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:32:17,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 01:32:18,841][49750] Updated weights for policy 0, policy_version 320131 (0.0028) [2024-04-27 01:32:22,063][49517] Fps is (10 sec: 45873.0, 60 sec: 49970.7, 300 sec: 50762.5). Total num frames: 5245140992. Throughput: 0: 50782.6. Samples: 2997970160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:32:22,064][49517] Avg episode reward: [(0, '0.559')] [2024-04-27 01:32:22,595][49750] Updated weights for policy 0, policy_version 320141 (0.0030) [2024-04-27 01:32:25,275][49750] Updated weights for policy 0, policy_version 320151 (0.0027) [2024-04-27 01:32:27,063][49517] Fps is (10 sec: 54066.9, 60 sec: 50790.1, 300 sec: 50818.2). Total num frames: 5245452288. Throughput: 0: 50743.4. Samples: 2998270120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:32:27,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:32:29,092][49750] Updated weights for policy 0, policy_version 320161 (0.0030) [2024-04-27 01:32:31,588][49750] Updated weights for policy 0, policy_version 320171 (0.0030) [2024-04-27 01:32:32,062][49517] Fps is (10 sec: 55708.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5245698048. Throughput: 0: 50899.5. Samples: 2998578800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 01:32:32,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 01:32:35,501][49750] Updated weights for policy 0, policy_version 320181 (0.0030) [2024-04-27 01:32:37,063][49517] Fps is (10 sec: 49152.2, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 5245943808. Throughput: 0: 50959.4. Samples: 2998744900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:32:37,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 01:32:37,911][49750] Updated weights for policy 0, policy_version 320191 (0.0032) [2024-04-27 01:32:41,802][49750] Updated weights for policy 0, policy_version 320201 (0.0029) [2024-04-27 01:32:42,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5246189568. Throughput: 0: 50891.2. Samples: 2999042000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:32:42,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-27 01:32:42,133][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000320203_5246205952.pth... [2024-04-27 01:32:42,176][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000319460_5234032640.pth [2024-04-27 01:32:44,403][49750] Updated weights for policy 0, policy_version 320211 (0.0034) [2024-04-27 01:32:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5246435328. Throughput: 0: 50944.8. Samples: 2999341760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:32:47,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-27 01:32:48,109][49728] Signal inference workers to stop experience collection... (45100 times) [2024-04-27 01:32:48,110][49728] Signal inference workers to resume experience collection... (45100 times) [2024-04-27 01:32:48,135][49750] InferenceWorker_p0-w0: stopping experience collection (45100 times) [2024-04-27 01:32:48,162][49750] InferenceWorker_p0-w0: resuming experience collection (45100 times) [2024-04-27 01:32:48,240][49750] Updated weights for policy 0, policy_version 320221 (0.0027) [2024-04-27 01:32:50,929][49750] Updated weights for policy 0, policy_version 320231 (0.0030) [2024-04-27 01:32:52,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5246713856. Throughput: 0: 50776.0. Samples: 2999495080. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:32:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 01:32:54,796][49750] Updated weights for policy 0, policy_version 320241 (0.0027) [2024-04-27 01:32:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5246959616. Throughput: 0: 50807.7. Samples: 2999802480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:32:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 01:32:57,340][49750] Updated weights for policy 0, policy_version 320251 (0.0034) [2024-04-27 01:33:01,161][49750] Updated weights for policy 0, policy_version 320261 (0.0032) [2024-04-27 01:33:02,063][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5247205376. Throughput: 0: 50730.8. Samples: 3000104620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 01:33:03,813][49750] Updated weights for policy 0, policy_version 320271 (0.0034) [2024-04-27 01:33:07,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5247434752. Throughput: 0: 50836.9. Samples: 3000257800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:07,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 01:33:07,592][49750] Updated weights for policy 0, policy_version 320281 (0.0033) [2024-04-27 01:33:10,345][49750] Updated weights for policy 0, policy_version 320291 (0.0027) [2024-04-27 01:33:12,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 5247729664. Throughput: 0: 50812.2. Samples: 3000556660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:12,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 01:33:14,042][49750] Updated weights for policy 0, policy_version 320301 (0.0028) [2024-04-27 01:33:16,942][49750] Updated weights for policy 0, policy_version 320311 (0.0028) [2024-04-27 01:33:17,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 5247975424. Throughput: 0: 50761.3. Samples: 3000863060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:17,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 01:33:20,591][49750] Updated weights for policy 0, policy_version 320321 (0.0033) [2024-04-27 01:33:22,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51610.0, 300 sec: 50818.2). Total num frames: 5248237568. Throughput: 0: 50594.7. Samples: 3001021660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:22,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 01:33:23,294][49750] Updated weights for policy 0, policy_version 320331 (0.0029) [2024-04-27 01:33:26,869][49750] Updated weights for policy 0, policy_version 320341 (0.0029) [2024-04-27 01:33:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5248466944. Throughput: 0: 50769.1. Samples: 3001326620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:27,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 01:33:29,645][49750] Updated weights for policy 0, policy_version 320351 (0.0033) [2024-04-27 01:33:32,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5248712704. Throughput: 0: 50835.5. Samples: 3001629360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:32,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 01:33:33,360][49750] Updated weights for policy 0, policy_version 320361 (0.0030) [2024-04-27 01:33:36,146][49750] Updated weights for policy 0, policy_version 320371 (0.0034) [2024-04-27 01:33:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5248991232. Throughput: 0: 50876.3. Samples: 3001784520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 01:33:39,952][49750] Updated weights for policy 0, policy_version 320381 (0.0032) [2024-04-27 01:33:42,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5249236992. Throughput: 0: 50747.5. Samples: 3002086120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:42,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-27 01:33:42,766][49750] Updated weights for policy 0, policy_version 320391 (0.0031) [2024-04-27 01:33:46,329][49750] Updated weights for policy 0, policy_version 320401 (0.0033) [2024-04-27 01:33:47,063][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5249499136. Throughput: 0: 50836.0. Samples: 3002392240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:33:47,063][49517] Avg episode reward: [(0, '0.717')] [2024-04-27 01:33:49,316][49750] Updated weights for policy 0, policy_version 320411 (0.0031) [2024-04-27 01:33:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5249728512. Throughput: 0: 50766.0. Samples: 3002542260. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:33:52,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-27 01:33:52,638][49750] Updated weights for policy 0, policy_version 320421 (0.0030) [2024-04-27 01:33:55,602][49750] Updated weights for policy 0, policy_version 320431 (0.0032) [2024-04-27 01:33:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5250007040. Throughput: 0: 50923.1. Samples: 3002848200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:33:57,064][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 01:33:59,061][49750] Updated weights for policy 0, policy_version 320441 (0.0029) [2024-04-27 01:34:00,276][49728] Signal inference workers to stop experience collection... (45150 times) [2024-04-27 01:34:00,276][49728] Signal inference workers to resume experience collection... (45150 times) [2024-04-27 01:34:00,289][49750] InferenceWorker_p0-w0: stopping experience collection (45150 times) [2024-04-27 01:34:00,289][49750] InferenceWorker_p0-w0: resuming experience collection (45150 times) [2024-04-27 01:34:02,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5250252800. Throughput: 0: 50937.0. Samples: 3003155220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:02,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 01:34:02,152][49750] Updated weights for policy 0, policy_version 320451 (0.0040) [2024-04-27 01:34:05,566][49750] Updated weights for policy 0, policy_version 320461 (0.0035) [2024-04-27 01:34:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51336.7, 300 sec: 50762.7). Total num frames: 5250514944. Throughput: 0: 50765.1. Samples: 3003306080. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:07,063][49517] Avg episode reward: [(0, '0.683')] [2024-04-27 01:34:08,682][49750] Updated weights for policy 0, policy_version 320471 (0.0039) [2024-04-27 01:34:12,025][49750] Updated weights for policy 0, policy_version 320481 (0.0039) [2024-04-27 01:34:12,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5250760704. Throughput: 0: 50771.6. Samples: 3003611340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:12,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 01:34:15,224][49750] Updated weights for policy 0, policy_version 320491 (0.0035) [2024-04-27 01:34:17,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 5251006464. Throughput: 0: 50897.4. Samples: 3003919740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:17,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 01:34:18,352][49750] Updated weights for policy 0, policy_version 320501 (0.0035) [2024-04-27 01:34:21,450][49750] Updated weights for policy 0, policy_version 320511 (0.0029) [2024-04-27 01:34:22,063][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5251284992. Throughput: 0: 50767.7. Samples: 3004069060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:22,063][49517] Avg episode reward: [(0, '0.726')] [2024-04-27 01:34:24,625][49750] Updated weights for policy 0, policy_version 320521 (0.0036) [2024-04-27 01:34:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5251514368. Throughput: 0: 50795.4. Samples: 3004371920. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:27,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 01:34:27,941][49750] Updated weights for policy 0, policy_version 320531 (0.0030) [2024-04-27 01:34:31,069][49750] Updated weights for policy 0, policy_version 320541 (0.0029) [2024-04-27 01:34:32,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5251776512. Throughput: 0: 50876.4. Samples: 3004681680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 01:34:34,316][49750] Updated weights for policy 0, policy_version 320551 (0.0029) [2024-04-27 01:34:37,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5252022272. Throughput: 0: 50839.8. Samples: 3004830060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:37,064][49517] Avg episode reward: [(0, '0.657')] [2024-04-27 01:34:37,601][49750] Updated weights for policy 0, policy_version 320561 (0.0031) [2024-04-27 01:34:40,752][49750] Updated weights for policy 0, policy_version 320571 (0.0031) [2024-04-27 01:34:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5252284416. Throughput: 0: 50925.8. Samples: 3005139860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 01:34:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000320574_5252284416.pth... [2024-04-27 01:34:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000319832_5240127488.pth [2024-04-27 01:34:43,951][49750] Updated weights for policy 0, policy_version 320581 (0.0026) [2024-04-27 01:34:47,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 5252530176. Throughput: 0: 50921.8. Samples: 3005446700. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:47,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-27 01:34:47,202][49750] Updated weights for policy 0, policy_version 320591 (0.0028) [2024-04-27 01:34:50,417][49750] Updated weights for policy 0, policy_version 320601 (0.0038) [2024-04-27 01:34:52,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 5252808704. Throughput: 0: 50959.0. Samples: 3005599240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:52,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-27 01:34:53,664][49750] Updated weights for policy 0, policy_version 320611 (0.0031) [2024-04-27 01:34:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5253038080. Throughput: 0: 50714.4. Samples: 3005893480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:34:57,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 01:34:57,099][49750] Updated weights for policy 0, policy_version 320621 (0.0031) [2024-04-27 01:35:00,256][49750] Updated weights for policy 0, policy_version 320631 (0.0027) [2024-04-27 01:35:02,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 5253300224. Throughput: 0: 50718.2. Samples: 3006202060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 01:35:02,063][49517] Avg episode reward: [(0, '0.661')] [2024-04-27 01:35:03,635][49750] Updated weights for policy 0, policy_version 320641 (0.0037) [2024-04-27 01:35:06,005][49728] Signal inference workers to stop experience collection... (45200 times) [2024-04-27 01:35:06,036][49750] InferenceWorker_p0-w0: stopping experience collection (45200 times) [2024-04-27 01:35:06,070][49728] Signal inference workers to resume experience collection... 
(45200 times) [2024-04-27 01:35:06,070][49750] InferenceWorker_p0-w0: resuming experience collection (45200 times) [2024-04-27 01:35:06,592][49750] Updated weights for policy 0, policy_version 320651 (0.0027) [2024-04-27 01:35:07,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5253562368. Throughput: 0: 50653.4. Samples: 3006348460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:07,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 01:35:10,066][49750] Updated weights for policy 0, policy_version 320661 (0.0032) [2024-04-27 01:35:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.5, 300 sec: 50651.6). Total num frames: 5253791744. Throughput: 0: 50738.9. Samples: 3006655160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:12,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-27 01:35:12,989][49750] Updated weights for policy 0, policy_version 320671 (0.0028) [2024-04-27 01:35:16,344][49750] Updated weights for policy 0, policy_version 320681 (0.0036) [2024-04-27 01:35:17,063][49517] Fps is (10 sec: 49150.7, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5254053888. Throughput: 0: 50550.9. Samples: 3006956480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:17,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 01:35:19,540][49750] Updated weights for policy 0, policy_version 320691 (0.0030) [2024-04-27 01:35:22,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5254316032. Throughput: 0: 50640.0. Samples: 3007108860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:22,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-27 01:35:23,027][49750] Updated weights for policy 0, policy_version 320701 (0.0027) [2024-04-27 01:35:25,984][49750] Updated weights for policy 0, policy_version 320711 (0.0028) [2024-04-27 01:35:27,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5254561792. Throughput: 0: 50624.9. Samples: 3007417980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:27,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 01:35:29,594][49750] Updated weights for policy 0, policy_version 320721 (0.0036) [2024-04-27 01:35:32,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5254823936. Throughput: 0: 50532.8. Samples: 3007720680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:32,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-27 01:35:32,410][49750] Updated weights for policy 0, policy_version 320731 (0.0029) [2024-04-27 01:35:35,948][49750] Updated weights for policy 0, policy_version 320741 (0.0033) [2024-04-27 01:35:37,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50651.6). Total num frames: 5255069696. Throughput: 0: 50484.9. Samples: 3007871060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:37,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-27 01:35:38,877][49750] Updated weights for policy 0, policy_version 320751 (0.0026) [2024-04-27 01:35:42,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 5255315456. Throughput: 0: 50823.8. Samples: 3008180560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:42,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 01:35:42,334][49750] Updated weights for policy 0, policy_version 320761 (0.0037) [2024-04-27 01:35:45,402][49750] Updated weights for policy 0, policy_version 320771 (0.0028) [2024-04-27 01:35:47,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5255577600. Throughput: 0: 50590.9. Samples: 3008478640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:47,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 01:35:48,829][49750] Updated weights for policy 0, policy_version 320781 (0.0032) [2024-04-27 01:35:51,811][49750] Updated weights for policy 0, policy_version 320791 (0.0031) [2024-04-27 01:35:52,063][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5255856128. Throughput: 0: 50635.4. Samples: 3008627060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:52,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-27 01:35:55,402][49750] Updated weights for policy 0, policy_version 320801 (0.0032) [2024-04-27 01:35:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 5256069120. Throughput: 0: 50689.3. Samples: 3008936180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:35:57,063][49517] Avg episode reward: [(0, '0.507')] [2024-04-27 01:35:58,214][49750] Updated weights for policy 0, policy_version 320811 (0.0032) [2024-04-27 01:36:01,798][49750] Updated weights for policy 0, policy_version 320821 (0.0032) [2024-04-27 01:36:02,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 5256347648. Throughput: 0: 50834.0. Samples: 3009244000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:36:02,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 01:36:04,835][49750] Updated weights for policy 0, policy_version 320831 (0.0032) [2024-04-27 01:36:07,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 5256593408. Throughput: 0: 50677.3. Samples: 3009389340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:36:07,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 01:36:08,304][49750] Updated weights for policy 0, policy_version 320841 (0.0037) [2024-04-27 01:36:11,273][49750] Updated weights for policy 0, policy_version 320851 (0.0030) [2024-04-27 01:36:12,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5256839168. Throughput: 0: 50581.3. Samples: 3009694140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:36:12,064][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:36:14,749][49750] Updated weights for policy 0, policy_version 320861 (0.0035) [2024-04-27 01:36:16,872][49728] Signal inference workers to stop experience collection... (45250 times) [2024-04-27 01:36:16,910][49750] InferenceWorker_p0-w0: stopping experience collection (45250 times) [2024-04-27 01:36:16,933][49728] Signal inference workers to resume experience collection... (45250 times) [2024-04-27 01:36:16,934][49750] InferenceWorker_p0-w0: resuming experience collection (45250 times) [2024-04-27 01:36:17,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 5257101312. Throughput: 0: 50801.9. Samples: 3010006760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 01:36:17,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 01:36:17,622][49750] Updated weights for policy 0, policy_version 320871 (0.0035) [2024-04-27 01:36:21,178][49750] Updated weights for policy 0, policy_version 320881 (0.0030) [2024-04-27 01:36:22,063][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5257363456. Throughput: 0: 50691.4. Samples: 3010152180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:36:22,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 01:36:24,236][49750] Updated weights for policy 0, policy_version 320891 (0.0030) [2024-04-27 01:36:27,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5257609216. Throughput: 0: 50554.2. Samples: 3010455500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:36:27,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 01:36:27,631][49750] Updated weights for policy 0, policy_version 320901 (0.0035) [2024-04-27 01:36:30,595][49750] Updated weights for policy 0, policy_version 320911 (0.0030) [2024-04-27 01:36:32,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5257854976. Throughput: 0: 50764.7. Samples: 3010763060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:36:32,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-27 01:36:34,143][49750] Updated weights for policy 0, policy_version 320921 (0.0040) [2024-04-27 01:36:36,959][49750] Updated weights for policy 0, policy_version 320931 (0.0027) [2024-04-27 01:36:37,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5258133504. Throughput: 0: 50767.1. Samples: 3010911580. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:36:37,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 01:36:40,512][49750] Updated weights for policy 0, policy_version 320941 (0.0032) [2024-04-27 01:36:42,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5258362880. Throughput: 0: 50728.9. Samples: 3011218980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:36:42,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-27 01:36:42,126][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000320946_5258379264.pth... [2024-04-27 01:36:42,175][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000320203_5246205952.pth [2024-04-27 01:36:43,336][49750] Updated weights for policy 0, policy_version 320951 (0.0033) [2024-04-27 01:36:46,868][49750] Updated weights for policy 0, policy_version 320961 (0.0030) [2024-04-27 01:36:47,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5258625024. Throughput: 0: 50634.7. Samples: 3011522560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:36:47,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 01:36:49,868][49750] Updated weights for policy 0, policy_version 320971 (0.0037) [2024-04-27 01:36:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5258870784. Throughput: 0: 50719.7. Samples: 3011671720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:36:52,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 01:36:53,390][49750] Updated weights for policy 0, policy_version 320981 (0.0032) [2024-04-27 01:36:56,564][49750] Updated weights for policy 0, policy_version 320991 (0.0031) [2024-04-27 01:36:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5259132928. Throughput: 0: 50801.4. Samples: 3011980200. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:36:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 01:36:59,912][49750] Updated weights for policy 0, policy_version 321001 (0.0039) [2024-04-27 01:37:02,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5259378688. Throughput: 0: 50620.4. Samples: 3012284680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:37:02,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 01:37:02,905][49750] Updated weights for policy 0, policy_version 321011 (0.0033) [2024-04-27 01:37:06,294][49750] Updated weights for policy 0, policy_version 321021 (0.0033) [2024-04-27 01:37:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 5259640832. Throughput: 0: 50818.4. Samples: 3012439000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:37:07,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 01:37:09,352][49750] Updated weights for policy 0, policy_version 321031 (0.0034) [2024-04-27 01:37:12,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5259870208. Throughput: 0: 50762.2. Samples: 3012739800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:37:12,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 01:37:12,642][49750] Updated weights for policy 0, policy_version 321041 (0.0028) [2024-04-27 01:37:14,739][49728] Signal inference workers to stop experience collection... (45300 times) [2024-04-27 01:37:14,739][49728] Signal inference workers to resume experience collection... 
(45300 times) [2024-04-27 01:37:14,760][49750] InferenceWorker_p0-w0: stopping experience collection (45300 times) [2024-04-27 01:37:14,760][49750] InferenceWorker_p0-w0: resuming experience collection (45300 times) [2024-04-27 01:37:15,923][49750] Updated weights for policy 0, policy_version 321051 (0.0038) [2024-04-27 01:37:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50873.8). Total num frames: 5260148736. Throughput: 0: 50695.3. Samples: 3013044340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:37:17,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 01:37:19,132][49750] Updated weights for policy 0, policy_version 321061 (0.0036) [2024-04-27 01:37:22,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50517.2, 300 sec: 50651.6). Total num frames: 5260394496. Throughput: 0: 50719.0. Samples: 3013193940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:37:22,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 01:37:22,292][49750] Updated weights for policy 0, policy_version 321071 (0.0033) [2024-04-27 01:37:25,751][49750] Updated weights for policy 0, policy_version 321081 (0.0030) [2024-04-27 01:37:27,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 5260640256. Throughput: 0: 50861.6. Samples: 3013507760. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:37:27,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-27 01:37:28,662][49750] Updated weights for policy 0, policy_version 321091 (0.0034) [2024-04-27 01:37:32,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5260902400. Throughput: 0: 50892.9. Samples: 3013812740. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 01:37:32,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 01:37:32,236][49750] Updated weights for policy 0, policy_version 321101 (0.0034) [2024-04-27 01:37:35,138][49750] Updated weights for policy 0, policy_version 321111 (0.0034) [2024-04-27 01:37:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 5261164544. Throughput: 0: 50913.8. Samples: 3013962840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:37:37,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 01:37:38,554][49750] Updated weights for policy 0, policy_version 321121 (0.0032) [2024-04-27 01:37:41,493][49750] Updated weights for policy 0, policy_version 321131 (0.0030) [2024-04-27 01:37:42,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5261426688. Throughput: 0: 50863.1. Samples: 3014269040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:37:42,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 01:37:44,880][49750] Updated weights for policy 0, policy_version 321141 (0.0030) [2024-04-27 01:37:47,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5261672448. Throughput: 0: 50936.8. Samples: 3014576840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:37:47,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-27 01:37:48,023][49750] Updated weights for policy 0, policy_version 321151 (0.0032) [2024-04-27 01:37:51,459][49750] Updated weights for policy 0, policy_version 321161 (0.0030) [2024-04-27 01:37:52,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5261918208. Throughput: 0: 50809.2. Samples: 3014725420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:37:52,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-27 01:37:54,448][49750] Updated weights for policy 0, policy_version 321171 (0.0028) [2024-04-27 01:37:57,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5262163968. Throughput: 0: 50774.8. Samples: 3015024660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:37:57,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-27 01:37:57,776][49750] Updated weights for policy 0, policy_version 321181 (0.0031) [2024-04-27 01:38:00,828][49750] Updated weights for policy 0, policy_version 321191 (0.0034) [2024-04-27 01:38:02,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5262442496. Throughput: 0: 50856.5. Samples: 3015332880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:02,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-27 01:38:04,228][49750] Updated weights for policy 0, policy_version 321201 (0.0026) [2024-04-27 01:38:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5262704640. Throughput: 0: 50958.9. Samples: 3015487080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:07,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 01:38:07,341][49750] Updated weights for policy 0, policy_version 321211 (0.0031) [2024-04-27 01:38:10,648][49750] Updated weights for policy 0, policy_version 321221 (0.0031) [2024-04-27 01:38:12,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 5262934016. Throughput: 0: 50752.6. Samples: 3015791620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:12,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 01:38:13,826][49750] Updated weights for policy 0, policy_version 321231 (0.0034) [2024-04-27 01:38:17,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 5263196160. Throughput: 0: 50705.2. Samples: 3016094480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:17,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 01:38:17,245][49750] Updated weights for policy 0, policy_version 321241 (0.0031) [2024-04-27 01:38:20,171][49750] Updated weights for policy 0, policy_version 321251 (0.0033) [2024-04-27 01:38:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5263425536. Throughput: 0: 50761.8. Samples: 3016247120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 01:38:23,077][49728] Signal inference workers to stop experience collection... (45350 times) [2024-04-27 01:38:23,132][49750] InferenceWorker_p0-w0: stopping experience collection (45350 times) [2024-04-27 01:38:23,136][49728] Signal inference workers to resume experience collection... (45350 times) [2024-04-27 01:38:23,145][49750] InferenceWorker_p0-w0: resuming experience collection (45350 times) [2024-04-27 01:38:23,667][49750] Updated weights for policy 0, policy_version 321261 (0.0035) [2024-04-27 01:38:26,693][49750] Updated weights for policy 0, policy_version 321271 (0.0029) [2024-04-27 01:38:27,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5263720448. Throughput: 0: 50782.0. Samples: 3016554240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:27,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-27 01:38:30,177][49750] Updated weights for policy 0, policy_version 321281 (0.0027) [2024-04-27 01:38:32,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5263949824. Throughput: 0: 50814.0. Samples: 3016863460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:32,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-27 01:38:33,135][49750] Updated weights for policy 0, policy_version 321291 (0.0029) [2024-04-27 01:38:36,600][49750] Updated weights for policy 0, policy_version 321301 (0.0035) [2024-04-27 01:38:37,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5264211968. Throughput: 0: 50793.3. Samples: 3017011120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:37,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 01:38:39,609][49750] Updated weights for policy 0, policy_version 321311 (0.0036) [2024-04-27 01:38:42,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5264457728. Throughput: 0: 50949.6. Samples: 3017317400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:42,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 01:38:42,134][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000321318_5264474112.pth... [2024-04-27 01:38:42,190][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000320574_5252284416.pth [2024-04-27 01:38:42,867][49750] Updated weights for policy 0, policy_version 321321 (0.0028) [2024-04-27 01:38:45,939][49750] Updated weights for policy 0, policy_version 321331 (0.0035) [2024-04-27 01:38:47,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5264703488. Throughput: 0: 50804.2. Samples: 3017619080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 01:38:47,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 01:38:49,399][49750] Updated weights for policy 0, policy_version 321341 (0.0033) [2024-04-27 01:38:52,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5264982016. Throughput: 0: 50809.7. Samples: 3017773520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:38:52,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 01:38:52,244][49750] Updated weights for policy 0, policy_version 321351 (0.0031) [2024-04-27 01:38:55,829][49750] Updated weights for policy 0, policy_version 321361 (0.0036) [2024-04-27 01:38:57,063][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5265227776. Throughput: 0: 50936.8. Samples: 3018083780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:38:57,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 01:38:58,973][49750] Updated weights for policy 0, policy_version 321371 (0.0027) [2024-04-27 01:39:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5265473536. Throughput: 0: 50886.0. Samples: 3018384340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:02,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 01:39:02,236][49750] Updated weights for policy 0, policy_version 321381 (0.0029) [2024-04-27 01:39:05,380][49750] Updated weights for policy 0, policy_version 321391 (0.0033) [2024-04-27 01:39:07,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5265719296. Throughput: 0: 50795.9. Samples: 3018532940. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:07,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 01:39:08,681][49750] Updated weights for policy 0, policy_version 321401 (0.0031) [2024-04-27 01:39:11,770][49750] Updated weights for policy 0, policy_version 321411 (0.0032) [2024-04-27 01:39:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5265997824. Throughput: 0: 50779.2. Samples: 3018839300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:12,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 01:39:15,113][49750] Updated weights for policy 0, policy_version 321421 (0.0030) [2024-04-27 01:39:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 5266243584. Throughput: 0: 50773.7. Samples: 3019148280. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:17,063][49517] Avg episode reward: [(0, '0.465')] [2024-04-27 01:39:18,252][49750] Updated weights for policy 0, policy_version 321431 (0.0029) [2024-04-27 01:39:21,550][49750] Updated weights for policy 0, policy_version 321441 (0.0034) [2024-04-27 01:39:22,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5266505728. Throughput: 0: 50905.4. Samples: 3019301860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:22,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 01:39:24,768][49750] Updated weights for policy 0, policy_version 321451 (0.0042) [2024-04-27 01:39:27,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5266751488. Throughput: 0: 50740.9. Samples: 3019600740. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:27,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 01:39:27,994][49750] Updated weights for policy 0, policy_version 321461 (0.0030) [2024-04-27 01:39:31,042][49750] Updated weights for policy 0, policy_version 321471 (0.0032) [2024-04-27 01:39:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 5267013632. Throughput: 0: 50773.4. Samples: 3019903880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:32,071][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 01:39:33,018][49728] Signal inference workers to stop experience collection... (45400 times) [2024-04-27 01:39:33,018][49728] Signal inference workers to resume experience collection... (45400 times) [2024-04-27 01:39:33,032][49750] InferenceWorker_p0-w0: stopping experience collection (45400 times) [2024-04-27 01:39:33,033][49750] InferenceWorker_p0-w0: resuming experience collection (45400 times) [2024-04-27 01:39:34,674][49750] Updated weights for policy 0, policy_version 321481 (0.0037) [2024-04-27 01:39:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5267275776. Throughput: 0: 50807.2. Samples: 3020059840. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:37,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 01:39:37,551][49750] Updated weights for policy 0, policy_version 321491 (0.0036) [2024-04-27 01:39:41,038][49750] Updated weights for policy 0, policy_version 321501 (0.0032) [2024-04-27 01:39:42,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5267505152. Throughput: 0: 50774.8. Samples: 3020368640. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:42,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-27 01:39:44,141][49750] Updated weights for policy 0, policy_version 321511 (0.0030) [2024-04-27 01:39:47,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 5267767296. Throughput: 0: 50848.4. Samples: 3020672520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:47,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-27 01:39:47,366][49750] Updated weights for policy 0, policy_version 321521 (0.0030) [2024-04-27 01:39:50,603][49750] Updated weights for policy 0, policy_version 321531 (0.0030) [2024-04-27 01:39:52,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5268029440. Throughput: 0: 50862.7. Samples: 3020821760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 01:39:53,818][49750] Updated weights for policy 0, policy_version 321541 (0.0035) [2024-04-27 01:39:57,056][49750] Updated weights for policy 0, policy_version 321551 (0.0028) [2024-04-27 01:39:57,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5268291584. Throughput: 0: 50802.3. Samples: 3021125400. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:39:57,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 01:40:00,265][49750] Updated weights for policy 0, policy_version 321561 (0.0034) [2024-04-27 01:40:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5268537344. Throughput: 0: 50788.8. Samples: 3021433780. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 01:40:02,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 01:40:03,518][49750] Updated weights for policy 0, policy_version 321571 (0.0030) [2024-04-27 01:40:06,597][49750] Updated weights for policy 0, policy_version 321581 (0.0041) [2024-04-27 01:40:07,063][49517] Fps is (10 sec: 49151.1, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 5268783104. Throughput: 0: 50901.2. Samples: 3021592420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 01:40:09,962][49750] Updated weights for policy 0, policy_version 321591 (0.0032) [2024-04-27 01:40:12,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5269028864. Throughput: 0: 50888.0. Samples: 3021890700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 01:40:13,056][49750] Updated weights for policy 0, policy_version 321601 (0.0026) [2024-04-27 01:40:16,569][49750] Updated weights for policy 0, policy_version 321611 (0.0032) [2024-04-27 01:40:17,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5269291008. Throughput: 0: 50941.4. Samples: 3022196240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:17,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 01:40:19,520][49750] Updated weights for policy 0, policy_version 321621 (0.0034) [2024-04-27 01:40:22,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5269536768. Throughput: 0: 50773.3. Samples: 3022344640. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:22,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 01:40:23,005][49750] Updated weights for policy 0, policy_version 321631 (0.0035) [2024-04-27 01:40:26,081][49750] Updated weights for policy 0, policy_version 321641 (0.0031) [2024-04-27 01:40:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5269815296. Throughput: 0: 50665.3. Samples: 3022648580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:27,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 01:40:29,383][49750] Updated weights for policy 0, policy_version 321651 (0.0026) [2024-04-27 01:40:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5270061056. Throughput: 0: 50819.9. Samples: 3022959420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:32,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 01:40:32,416][49750] Updated weights for policy 0, policy_version 321661 (0.0035) [2024-04-27 01:40:35,661][49750] Updated weights for policy 0, policy_version 321671 (0.0031) [2024-04-27 01:40:37,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5270306816. Throughput: 0: 50897.8. Samples: 3023112160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:37,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 01:40:39,017][49750] Updated weights for policy 0, policy_version 321681 (0.0040) [2024-04-27 01:40:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5270568960. Throughput: 0: 50823.3. Samples: 3023412460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:42,063][49517] Avg episode reward: [(0, '0.527')] [2024-04-27 01:40:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000321690_5270568960.pth... 
[2024-04-27 01:40:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000320946_5258379264.pth [2024-04-27 01:40:42,258][49750] Updated weights for policy 0, policy_version 321691 (0.0033) [2024-04-27 01:40:45,437][49750] Updated weights for policy 0, policy_version 321701 (0.0034) [2024-04-27 01:40:45,937][49728] Signal inference workers to stop experience collection... (45450 times) [2024-04-27 01:40:45,938][49728] Signal inference workers to resume experience collection... (45450 times) [2024-04-27 01:40:45,966][49750] InferenceWorker_p0-w0: stopping experience collection (45450 times) [2024-04-27 01:40:45,966][49750] InferenceWorker_p0-w0: resuming experience collection (45450 times) [2024-04-27 01:40:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 5270831104. Throughput: 0: 50765.4. Samples: 3023718220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:47,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 01:40:48,825][49750] Updated weights for policy 0, policy_version 321711 (0.0033) [2024-04-27 01:40:51,795][49750] Updated weights for policy 0, policy_version 321721 (0.0033) [2024-04-27 01:40:52,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5271076864. Throughput: 0: 50720.6. Samples: 3023874840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:52,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-27 01:40:55,246][49750] Updated weights for policy 0, policy_version 321731 (0.0029) [2024-04-27 01:40:57,063][49517] Fps is (10 sec: 49150.7, 60 sec: 50517.1, 300 sec: 50762.6). Total num frames: 5271322624. Throughput: 0: 50955.0. Samples: 3024183680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:40:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:40:58,069][49750] Updated weights for policy 0, policy_version 321741 (0.0031) [2024-04-27 01:41:01,569][49750] Updated weights for policy 0, policy_version 321751 (0.0032) [2024-04-27 01:41:02,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5271584768. Throughput: 0: 50866.8. Samples: 3024485240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:41:02,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 01:41:04,497][49750] Updated weights for policy 0, policy_version 321761 (0.0034) [2024-04-27 01:41:07,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5271830528. Throughput: 0: 50784.9. Samples: 3024629960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:41:07,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 01:41:08,258][49750] Updated weights for policy 0, policy_version 321771 (0.0031) [2024-04-27 01:41:10,939][49750] Updated weights for policy 0, policy_version 321781 (0.0026) [2024-04-27 01:41:12,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5272109056. Throughput: 0: 50883.5. Samples: 3024938340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:41:12,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-27 01:41:14,627][49750] Updated weights for policy 0, policy_version 321791 (0.0033) [2024-04-27 01:41:17,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 5272338432. Throughput: 0: 50844.1. Samples: 3025247400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 01:41:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 01:41:17,707][49750] Updated weights for policy 0, policy_version 321801 (0.0028) [2024-04-27 01:41:21,048][49750] Updated weights for policy 0, policy_version 321811 (0.0034) [2024-04-27 01:41:22,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5272584192. Throughput: 0: 50757.4. Samples: 3025396240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:41:22,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 01:41:24,051][49750] Updated weights for policy 0, policy_version 321821 (0.0030) [2024-04-27 01:41:27,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5272846336. Throughput: 0: 50823.7. Samples: 3025699520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:41:27,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 01:41:27,401][49750] Updated weights for policy 0, policy_version 321831 (0.0031) [2024-04-27 01:41:30,357][49750] Updated weights for policy 0, policy_version 321841 (0.0032) [2024-04-27 01:41:32,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5273108480. Throughput: 0: 50769.2. Samples: 3026002840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:41:32,063][49517] Avg episode reward: [(0, '0.472')] [2024-04-27 01:41:33,749][49750] Updated weights for policy 0, policy_version 321851 (0.0029) [2024-04-27 01:41:36,881][49750] Updated weights for policy 0, policy_version 321861 (0.0036) [2024-04-27 01:41:37,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5273370624. Throughput: 0: 50861.2. Samples: 3026163600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:41:37,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 01:41:40,368][49750] Updated weights for policy 0, policy_version 321871 (0.0032) [2024-04-27 01:41:42,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 5273600000. Throughput: 0: 50668.7. Samples: 3026463760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:41:42,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:41:43,407][49750] Updated weights for policy 0, policy_version 321881 (0.0037) [2024-04-27 01:41:46,740][49750] Updated weights for policy 0, policy_version 321891 (0.0032) [2024-04-27 01:41:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5273862144. Throughput: 0: 50830.6. Samples: 3026772620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:41:47,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:41:49,705][49750] Updated weights for policy 0, policy_version 321901 (0.0033) [2024-04-27 01:41:51,926][49728] Signal inference workers to stop experience collection... (45500 times) [2024-04-27 01:41:51,926][49728] Signal inference workers to resume experience collection... (45500 times) [2024-04-27 01:41:51,941][49750] InferenceWorker_p0-w0: stopping experience collection (45500 times) [2024-04-27 01:41:51,941][49750] InferenceWorker_p0-w0: resuming experience collection (45500 times) [2024-04-27 01:41:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5274124288. Throughput: 0: 50852.5. Samples: 3026918320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:41:52,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-27 01:41:53,278][49750] Updated weights for policy 0, policy_version 321911 (0.0032) [2024-04-27 01:41:56,214][49750] Updated weights for policy 0, policy_version 321921 (0.0027) [2024-04-27 01:41:57,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5274386432. Throughput: 0: 50767.9. Samples: 3027222900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:41:57,063][49517] Avg episode reward: [(0, '0.705')] [2024-04-27 01:41:59,847][49750] Updated weights for policy 0, policy_version 321931 (0.0032) [2024-04-27 01:42:02,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5274632192. Throughput: 0: 50830.1. Samples: 3027534760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:42:02,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 01:42:02,698][49750] Updated weights for policy 0, policy_version 321941 (0.0031) [2024-04-27 01:42:06,400][49750] Updated weights for policy 0, policy_version 321951 (0.0034) [2024-04-27 01:42:07,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5274861568. Throughput: 0: 50952.0. Samples: 3027689080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:42:07,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 01:42:09,010][49750] Updated weights for policy 0, policy_version 321961 (0.0030) [2024-04-27 01:42:12,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5275140096. Throughput: 0: 50851.7. Samples: 3027987840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:42:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 01:42:12,757][49750] Updated weights for policy 0, policy_version 321971 (0.0037) [2024-04-27 01:42:15,474][49750] Updated weights for policy 0, policy_version 321981 (0.0028) [2024-04-27 01:42:17,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5275385856. Throughput: 0: 50780.6. Samples: 3028287960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:42:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 01:42:19,098][49750] Updated weights for policy 0, policy_version 321991 (0.0031) [2024-04-27 01:42:21,818][49750] Updated weights for policy 0, policy_version 322001 (0.0030) [2024-04-27 01:42:22,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 5275664384. Throughput: 0: 50806.8. Samples: 3028449900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:42:22,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 01:42:25,596][49750] Updated weights for policy 0, policy_version 322011 (0.0030) [2024-04-27 01:42:27,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5275877376. Throughput: 0: 50897.6. Samples: 3028754160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:42:27,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 01:42:28,297][49750] Updated weights for policy 0, policy_version 322021 (0.0035) [2024-04-27 01:42:32,049][49750] Updated weights for policy 0, policy_version 322031 (0.0030) [2024-04-27 01:42:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5276155904. Throughput: 0: 50765.4. Samples: 3029057060. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 01:42:32,071][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 01:42:34,672][49750] Updated weights for policy 0, policy_version 322041 (0.0031) [2024-04-27 01:42:37,063][49517] Fps is (10 sec: 54067.0, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 5276418048. Throughput: 0: 50843.8. Samples: 3029206300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:42:37,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-27 01:42:38,364][49750] Updated weights for policy 0, policy_version 322051 (0.0032) [2024-04-27 01:42:41,181][49750] Updated weights for policy 0, policy_version 322061 (0.0033) [2024-04-27 01:42:42,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5276680192. Throughput: 0: 51057.9. Samples: 3029520500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:42:42,064][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 01:42:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000322063_5276680192.pth... [2024-04-27 01:42:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000321318_5264474112.pth [2024-04-27 01:42:44,679][49750] Updated weights for policy 0, policy_version 322071 (0.0035) [2024-04-27 01:42:47,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5276909568. Throughput: 0: 50930.6. Samples: 3029826640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:42:47,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 01:42:47,645][49750] Updated weights for policy 0, policy_version 322081 (0.0033) [2024-04-27 01:42:51,114][49750] Updated weights for policy 0, policy_version 322091 (0.0031) [2024-04-27 01:42:52,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5277155328. Throughput: 0: 50759.1. Samples: 3029973240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:42:52,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 01:42:54,028][49750] Updated weights for policy 0, policy_version 322101 (0.0034) [2024-04-27 01:42:57,063][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50818.1). Total num frames: 5277433856. Throughput: 0: 50939.4. Samples: 3030280120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:42:57,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-27 01:42:57,730][49750] Updated weights for policy 0, policy_version 322111 (0.0030) [2024-04-27 01:43:00,433][49750] Updated weights for policy 0, policy_version 322121 (0.0038) [2024-04-27 01:43:02,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5277696000. Throughput: 0: 51059.3. Samples: 3030585640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 01:43:03,608][49728] Signal inference workers to stop experience collection... (45550 times) [2024-04-27 01:43:03,608][49728] Signal inference workers to resume experience collection... (45550 times) [2024-04-27 01:43:03,618][49750] InferenceWorker_p0-w0: stopping experience collection (45550 times) [2024-04-27 01:43:03,618][49750] InferenceWorker_p0-w0: resuming experience collection (45550 times) [2024-04-27 01:43:04,001][49750] Updated weights for policy 0, policy_version 322131 (0.0033) [2024-04-27 01:43:06,755][49750] Updated weights for policy 0, policy_version 322141 (0.0040) [2024-04-27 01:43:07,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 5277958144. Throughput: 0: 50984.4. Samples: 3030744200. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:07,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 01:43:10,479][49750] Updated weights for policy 0, policy_version 322151 (0.0028) [2024-04-27 01:43:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 5278187520. Throughput: 0: 50911.5. Samples: 3031045180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:12,064][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 01:43:13,201][49750] Updated weights for policy 0, policy_version 322161 (0.0032) [2024-04-27 01:43:17,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5278433280. Throughput: 0: 50941.3. Samples: 3031349420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:17,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 01:43:17,112][49750] Updated weights for policy 0, policy_version 322171 (0.0032) [2024-04-27 01:43:19,761][49750] Updated weights for policy 0, policy_version 322181 (0.0032) [2024-04-27 01:43:22,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 5278695424. Throughput: 0: 50914.9. Samples: 3031497460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:22,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 01:43:23,627][49750] Updated weights for policy 0, policy_version 322191 (0.0034) [2024-04-27 01:43:26,321][49750] Updated weights for policy 0, policy_version 322201 (0.0031) [2024-04-27 01:43:27,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5278957568. Throughput: 0: 50801.4. Samples: 3031806560. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 01:43:29,995][49750] Updated weights for policy 0, policy_version 322211 (0.0032) [2024-04-27 01:43:32,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5279203328. Throughput: 0: 50738.0. Samples: 3032109840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:32,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-27 01:43:32,673][49750] Updated weights for policy 0, policy_version 322221 (0.0037) [2024-04-27 01:43:36,349][49750] Updated weights for policy 0, policy_version 322231 (0.0030) [2024-04-27 01:43:37,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5279449088. Throughput: 0: 50825.0. Samples: 3032260360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-27 01:43:39,177][49750] Updated weights for policy 0, policy_version 322241 (0.0029) [2024-04-27 01:43:42,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5279711232. Throughput: 0: 50766.6. Samples: 3032564620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:42,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 01:43:42,922][49750] Updated weights for policy 0, policy_version 322251 (0.0035) [2024-04-27 01:43:45,630][49750] Updated weights for policy 0, policy_version 322261 (0.0039) [2024-04-27 01:43:47,063][49517] Fps is (10 sec: 54065.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5279989760. Throughput: 0: 50771.2. Samples: 3032870340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 01:43:47,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-27 01:43:49,288][49750] Updated weights for policy 0, policy_version 322271 (0.0034) [2024-04-27 01:43:52,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5280235520. Throughput: 0: 50757.1. Samples: 3033028280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:43:52,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 01:43:52,193][49750] Updated weights for policy 0, policy_version 322281 (0.0030) [2024-04-27 01:43:55,612][49750] Updated weights for policy 0, policy_version 322291 (0.0031) [2024-04-27 01:43:57,062][49517] Fps is (10 sec: 50791.4, 60 sec: 51063.6, 300 sec: 50929.2). Total num frames: 5280497664. Throughput: 0: 50922.9. Samples: 3033336700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:43:57,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 01:43:58,565][49750] Updated weights for policy 0, policy_version 322301 (0.0027) [2024-04-27 01:44:02,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 5280727040. Throughput: 0: 50734.6. Samples: 3033632480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:02,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 01:44:02,207][49750] Updated weights for policy 0, policy_version 322311 (0.0031) [2024-04-27 01:44:04,860][49750] Updated weights for policy 0, policy_version 322321 (0.0027) [2024-04-27 01:44:07,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5280972800. Throughput: 0: 50825.7. Samples: 3033784620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:07,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-27 01:44:08,685][49750] Updated weights for policy 0, policy_version 322331 (0.0027) [2024-04-27 01:44:10,942][49728] Signal inference workers to stop experience collection... (45600 times) [2024-04-27 01:44:10,947][49728] Signal inference workers to resume experience collection... (45600 times) [2024-04-27 01:44:10,974][49750] InferenceWorker_p0-w0: stopping experience collection (45600 times) [2024-04-27 01:44:10,974][49750] InferenceWorker_p0-w0: resuming experience collection (45600 times) [2024-04-27 01:44:11,360][49750] Updated weights for policy 0, policy_version 322341 (0.0029) [2024-04-27 01:44:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5281251328. Throughput: 0: 50865.4. Samples: 3034095500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:12,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 01:44:15,303][49750] Updated weights for policy 0, policy_version 322351 (0.0032) [2024-04-27 01:44:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5281497088. Throughput: 0: 50760.9. Samples: 3034394080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 01:44:17,882][49750] Updated weights for policy 0, policy_version 322361 (0.0032) [2024-04-27 01:44:21,679][49750] Updated weights for policy 0, policy_version 322371 (0.0035) [2024-04-27 01:44:22,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 5281742848. Throughput: 0: 50967.2. Samples: 3034553900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:22,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 01:44:24,325][49750] Updated weights for policy 0, policy_version 322381 (0.0027) [2024-04-27 01:44:27,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5282004992. Throughput: 0: 50784.1. Samples: 3034849900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:27,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 01:44:28,282][49750] Updated weights for policy 0, policy_version 322391 (0.0037) [2024-04-27 01:44:30,625][49750] Updated weights for policy 0, policy_version 322401 (0.0034) [2024-04-27 01:44:32,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5282267136. Throughput: 0: 50797.5. Samples: 3035156220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 01:44:34,798][49750] Updated weights for policy 0, policy_version 322411 (0.0031) [2024-04-27 01:44:37,042][49750] Updated weights for policy 0, policy_version 322421 (0.0024) [2024-04-27 01:44:37,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51609.5, 300 sec: 50984.8). Total num frames: 5282545664. Throughput: 0: 50961.0. Samples: 3035321520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:37,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 01:44:41,139][49750] Updated weights for policy 0, policy_version 322431 (0.0039) [2024-04-27 01:44:42,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5282775040. Throughput: 0: 50704.8. Samples: 3035618420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:42,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 01:44:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000322435_5282775040.pth... 
[2024-04-27 01:44:42,114][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000321690_5270568960.pth [2024-04-27 01:44:43,603][49750] Updated weights for policy 0, policy_version 322441 (0.0029) [2024-04-27 01:44:47,063][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5283004416. Throughput: 0: 50857.7. Samples: 3035921080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:47,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:44:47,529][49750] Updated weights for policy 0, policy_version 322451 (0.0033) [2024-04-27 01:44:50,145][49750] Updated weights for policy 0, policy_version 322461 (0.0026) [2024-04-27 01:44:52,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5283250176. Throughput: 0: 50884.0. Samples: 3036074400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:52,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 01:44:53,964][49750] Updated weights for policy 0, policy_version 322471 (0.0032) [2024-04-27 01:44:56,587][49750] Updated weights for policy 0, policy_version 322481 (0.0034) [2024-04-27 01:44:57,062][49517] Fps is (10 sec: 54067.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5283545088. Throughput: 0: 50605.8. Samples: 3036372760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:44:57,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-27 01:45:00,311][49750] Updated weights for policy 0, policy_version 322491 (0.0029) [2024-04-27 01:45:02,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5283774464. Throughput: 0: 50695.6. Samples: 3036675380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 01:45:02,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 01:45:02,907][49750] Updated weights for policy 0, policy_version 322501 (0.0032) [2024-04-27 01:45:06,809][49750] Updated weights for policy 0, policy_version 322511 (0.0027) [2024-04-27 01:45:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5284036608. Throughput: 0: 50716.5. Samples: 3036836140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:07,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 01:45:09,433][49750] Updated weights for policy 0, policy_version 322521 (0.0028) [2024-04-27 01:45:12,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5284282368. Throughput: 0: 50880.9. Samples: 3037139540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:12,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-27 01:45:13,291][49750] Updated weights for policy 0, policy_version 322531 (0.0030) [2024-04-27 01:45:15,906][49750] Updated weights for policy 0, policy_version 322541 (0.0028) [2024-04-27 01:45:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5284528128. Throughput: 0: 50716.0. Samples: 3037438440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:17,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 01:45:19,405][49728] Signal inference workers to stop experience collection... (45650 times) [2024-04-27 01:45:19,406][49728] Signal inference workers to resume experience collection... 
(45650 times) [2024-04-27 01:45:19,418][49750] InferenceWorker_p0-w0: stopping experience collection (45650 times) [2024-04-27 01:45:19,418][49750] InferenceWorker_p0-w0: resuming experience collection (45650 times) [2024-04-27 01:45:19,715][49750] Updated weights for policy 0, policy_version 322551 (0.0032) [2024-04-27 01:45:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 5284806656. Throughput: 0: 50622.8. Samples: 3037599540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:22,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-27 01:45:22,384][49750] Updated weights for policy 0, policy_version 322561 (0.0032) [2024-04-27 01:45:26,104][49750] Updated weights for policy 0, policy_version 322571 (0.0028) [2024-04-27 01:45:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5285052416. Throughput: 0: 50834.3. Samples: 3037905960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:27,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 01:45:28,732][49750] Updated weights for policy 0, policy_version 322581 (0.0036) [2024-04-27 01:45:32,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5285298176. Throughput: 0: 50921.3. Samples: 3038212540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:32,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 01:45:32,442][49750] Updated weights for policy 0, policy_version 322591 (0.0036) [2024-04-27 01:45:35,254][49750] Updated weights for policy 0, policy_version 322601 (0.0029) [2024-04-27 01:45:37,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5285560320. Throughput: 0: 50736.9. Samples: 3038357560. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:37,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 01:45:39,005][49750] Updated weights for policy 0, policy_version 322611 (0.0038) [2024-04-27 01:45:41,913][49750] Updated weights for policy 0, policy_version 322621 (0.0033) [2024-04-27 01:45:42,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5285822464. Throughput: 0: 50947.1. Samples: 3038665380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:45:45,479][49750] Updated weights for policy 0, policy_version 322631 (0.0030) [2024-04-27 01:45:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5286084608. Throughput: 0: 50995.0. Samples: 3038970160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 01:45:48,302][49750] Updated weights for policy 0, policy_version 322641 (0.0033) [2024-04-27 01:45:51,877][49750] Updated weights for policy 0, policy_version 322651 (0.0029) [2024-04-27 01:45:52,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5286313984. Throughput: 0: 50827.1. Samples: 3039123360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 01:45:54,601][49750] Updated weights for policy 0, policy_version 322661 (0.0027) [2024-04-27 01:45:57,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5286576128. Throughput: 0: 50871.6. Samples: 3039428760. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:45:57,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-27 01:45:58,228][49750] Updated weights for policy 0, policy_version 322671 (0.0029) [2024-04-27 01:46:01,253][49750] Updated weights for policy 0, policy_version 322681 (0.0031) [2024-04-27 01:46:02,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5286821888. Throughput: 0: 50974.3. Samples: 3039732280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:46:02,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 01:46:04,813][49750] Updated weights for policy 0, policy_version 322691 (0.0027) [2024-04-27 01:46:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5287100416. Throughput: 0: 50918.0. Samples: 3039890860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:46:07,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 01:46:07,836][49750] Updated weights for policy 0, policy_version 322701 (0.0036) [2024-04-27 01:46:11,334][49750] Updated weights for policy 0, policy_version 322711 (0.0031) [2024-04-27 01:46:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5287346176. Throughput: 0: 50845.3. Samples: 3040194000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:46:12,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 01:46:14,188][49750] Updated weights for policy 0, policy_version 322721 (0.0032) [2024-04-27 01:46:17,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5287575552. Throughput: 0: 50887.5. Samples: 3040502480. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 01:46:17,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 01:46:17,806][49750] Updated weights for policy 0, policy_version 322731 (0.0035) [2024-04-27 01:46:20,526][49750] Updated weights for policy 0, policy_version 322741 (0.0032) [2024-04-27 01:46:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5287837696. Throughput: 0: 50596.1. Samples: 3040634380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:46:22,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 01:46:24,211][49750] Updated weights for policy 0, policy_version 322751 (0.0031) [2024-04-27 01:46:24,591][49728] Signal inference workers to stop experience collection... (45700 times) [2024-04-27 01:46:24,626][49750] InferenceWorker_p0-w0: stopping experience collection (45700 times) [2024-04-27 01:46:24,660][49728] Signal inference workers to resume experience collection... (45700 times) [2024-04-27 01:46:24,660][49750] InferenceWorker_p0-w0: resuming experience collection (45700 times) [2024-04-27 01:46:26,960][49750] Updated weights for policy 0, policy_version 322761 (0.0030) [2024-04-27 01:46:27,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5288116224. Throughput: 0: 50644.0. Samples: 3040944360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:46:27,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 01:46:30,799][49750] Updated weights for policy 0, policy_version 322771 (0.0029) [2024-04-27 01:46:32,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5288378368. Throughput: 0: 50755.2. Samples: 3041254140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:46:32,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 01:46:33,542][49750] Updated weights for policy 0, policy_version 322781 (0.0029) [2024-04-27 01:46:37,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5288591360. Throughput: 0: 50866.4. Samples: 3041412340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:46:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 01:46:37,120][49750] Updated weights for policy 0, policy_version 322791 (0.0036) [2024-04-27 01:46:39,975][49750] Updated weights for policy 0, policy_version 322801 (0.0027) [2024-04-27 01:46:42,063][49517] Fps is (10 sec: 47512.4, 60 sec: 50517.1, 300 sec: 50818.1). Total num frames: 5288853504. Throughput: 0: 50640.7. Samples: 3041707600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:46:42,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-27 01:46:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000322806_5288853504.pth... [2024-04-27 01:46:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000322063_5276680192.pth [2024-04-27 01:46:43,644][49750] Updated weights for policy 0, policy_version 322811 (0.0033) [2024-04-27 01:46:46,335][49750] Updated weights for policy 0, policy_version 322821 (0.0031) [2024-04-27 01:46:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5289099264. Throughput: 0: 50665.2. Samples: 3042012220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:46:47,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-27 01:46:49,930][49750] Updated weights for policy 0, policy_version 322831 (0.0030) [2024-04-27 01:46:52,062][49517] Fps is (10 sec: 52430.4, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5289377792. Throughput: 0: 50701.2. Samples: 3042172400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:46:52,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 01:46:52,957][49750] Updated weights for policy 0, policy_version 322841 (0.0031) [2024-04-27 01:46:56,321][49750] Updated weights for policy 0, policy_version 322851 (0.0032) [2024-04-27 01:46:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5289623552. Throughput: 0: 50827.1. Samples: 3042481220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:46:57,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 01:46:59,320][49750] Updated weights for policy 0, policy_version 322861 (0.0035) [2024-04-27 01:47:02,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5289869312. Throughput: 0: 50714.2. Samples: 3042784620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:47:02,072][49517] Avg episode reward: [(0, '0.641')] [2024-04-27 01:47:02,712][49750] Updated weights for policy 0, policy_version 322871 (0.0026) [2024-04-27 01:47:05,627][49750] Updated weights for policy 0, policy_version 322881 (0.0029) [2024-04-27 01:47:07,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50818.1). Total num frames: 5290131456. Throughput: 0: 50971.0. Samples: 3042928080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:47:07,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 01:47:09,152][49750] Updated weights for policy 0, policy_version 322891 (0.0029) [2024-04-27 01:47:12,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5290393600. Throughput: 0: 50752.4. Samples: 3043228220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:47:12,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 01:47:12,177][49750] Updated weights for policy 0, policy_version 322901 (0.0029) [2024-04-27 01:47:15,649][49750] Updated weights for policy 0, policy_version 322911 (0.0033) [2024-04-27 01:47:17,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 5290672128. Throughput: 0: 50781.3. Samples: 3043539300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:47:17,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 01:47:18,709][49750] Updated weights for policy 0, policy_version 322921 (0.0029) [2024-04-27 01:47:21,385][49728] Signal inference workers to stop experience collection... (45750 times) [2024-04-27 01:47:21,386][49728] Signal inference workers to resume experience collection... (45750 times) [2024-04-27 01:47:21,414][49750] InferenceWorker_p0-w0: stopping experience collection (45750 times) [2024-04-27 01:47:21,414][49750] InferenceWorker_p0-w0: resuming experience collection (45750 times) [2024-04-27 01:47:22,048][49750] Updated weights for policy 0, policy_version 322931 (0.0034) [2024-04-27 01:47:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5290901504. Throughput: 0: 50859.5. Samples: 3043701020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:47:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 01:47:25,108][49750] Updated weights for policy 0, policy_version 322941 (0.0028) [2024-04-27 01:47:27,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5291147264. Throughput: 0: 51018.9. Samples: 3044003440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:47:27,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:47:28,585][49750] Updated weights for policy 0, policy_version 322951 (0.0031) [2024-04-27 01:47:31,377][49750] Updated weights for policy 0, policy_version 322961 (0.0029) [2024-04-27 01:47:32,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.3, 300 sec: 50762.7). Total num frames: 5291393024. Throughput: 0: 50931.3. Samples: 3044304120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 01:47:32,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 01:47:34,945][49750] Updated weights for policy 0, policy_version 322971 (0.0038) [2024-04-27 01:47:37,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 5291687936. Throughput: 0: 50996.4. Samples: 3044467240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:47:37,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 01:47:38,025][49750] Updated weights for policy 0, policy_version 322981 (0.0028) [2024-04-27 01:47:41,414][49750] Updated weights for policy 0, policy_version 322991 (0.0031) [2024-04-27 01:47:42,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.8, 300 sec: 50929.3). Total num frames: 5291933696. Throughput: 0: 50885.5. Samples: 3044771060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:47:42,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-27 01:47:44,563][49750] Updated weights for policy 0, policy_version 323001 (0.0034) [2024-04-27 01:47:47,063][49517] Fps is (10 sec: 47512.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5292163072. Throughput: 0: 50895.1. Samples: 3045074900. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:47:47,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-27 01:47:47,890][49750] Updated weights for policy 0, policy_version 323011 (0.0028) [2024-04-27 01:47:50,996][49750] Updated weights for policy 0, policy_version 323021 (0.0033) [2024-04-27 01:47:52,062][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5292425216. Throughput: 0: 51009.4. Samples: 3045223500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:47:52,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:47:54,173][49750] Updated weights for policy 0, policy_version 323031 (0.0029) [2024-04-27 01:47:57,063][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5292670976. Throughput: 0: 50978.6. Samples: 3045522260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:47:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 01:47:57,340][49750] Updated weights for policy 0, policy_version 323041 (0.0035) [2024-04-27 01:48:00,648][49750] Updated weights for policy 0, policy_version 323051 (0.0036) [2024-04-27 01:48:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5292949504. Throughput: 0: 50801.7. Samples: 3045825380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:02,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-27 01:48:03,844][49750] Updated weights for policy 0, policy_version 323061 (0.0031) [2024-04-27 01:48:07,017][49750] Updated weights for policy 0, policy_version 323071 (0.0036) [2024-04-27 01:48:07,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5293195264. Throughput: 0: 50885.2. Samples: 3045990860. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:07,063][49517] Avg episode reward: [(0, '0.532')] [2024-04-27 01:48:10,280][49750] Updated weights for policy 0, policy_version 323081 (0.0034) [2024-04-27 01:48:12,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5293441024. Throughput: 0: 50892.0. Samples: 3046293580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:12,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 01:48:13,560][49750] Updated weights for policy 0, policy_version 323091 (0.0030) [2024-04-27 01:48:16,665][49750] Updated weights for policy 0, policy_version 323101 (0.0030) [2024-04-27 01:48:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 5293686784. Throughput: 0: 51026.1. Samples: 3046600300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:17,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 01:48:19,875][49750] Updated weights for policy 0, policy_version 323111 (0.0031) [2024-04-27 01:48:22,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5293965312. Throughput: 0: 50789.8. Samples: 3046752780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:22,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 01:48:23,002][49750] Updated weights for policy 0, policy_version 323121 (0.0033) [2024-04-27 01:48:26,290][49750] Updated weights for policy 0, policy_version 323131 (0.0028) [2024-04-27 01:48:27,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5294227456. Throughput: 0: 51008.3. Samples: 3047066440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:27,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 01:48:28,409][49728] Signal inference workers to stop experience collection... 
(45800 times) [2024-04-27 01:48:28,414][49728] Signal inference workers to resume experience collection... (45800 times) [2024-04-27 01:48:28,438][49750] InferenceWorker_p0-w0: stopping experience collection (45800 times) [2024-04-27 01:48:28,439][49750] InferenceWorker_p0-w0: resuming experience collection (45800 times) [2024-04-27 01:48:29,384][49750] Updated weights for policy 0, policy_version 323141 (0.0034) [2024-04-27 01:48:32,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5294456832. Throughput: 0: 51052.7. Samples: 3047372260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:32,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-27 01:48:32,733][49750] Updated weights for policy 0, policy_version 323151 (0.0032) [2024-04-27 01:48:35,736][49750] Updated weights for policy 0, policy_version 323161 (0.0029) [2024-04-27 01:48:37,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5294718976. Throughput: 0: 51050.8. Samples: 3047520780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:37,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-27 01:48:39,090][49750] Updated weights for policy 0, policy_version 323171 (0.0029) [2024-04-27 01:48:42,034][49750] Updated weights for policy 0, policy_version 323181 (0.0025) [2024-04-27 01:48:42,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5294997504. Throughput: 0: 51175.6. Samples: 3047825160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:42,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 01:48:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000323181_5294997504.pth... 
[2024-04-27 01:48:42,116][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000322435_5282775040.pth [2024-04-27 01:48:45,600][49750] Updated weights for policy 0, policy_version 323191 (0.0032) [2024-04-27 01:48:47,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 5295243264. Throughput: 0: 51130.8. Samples: 3048126260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 01:48:47,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 01:48:48,593][49750] Updated weights for policy 0, policy_version 323201 (0.0034) [2024-04-27 01:48:51,986][49750] Updated weights for policy 0, policy_version 323211 (0.0030) [2024-04-27 01:48:52,063][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 5295489024. Throughput: 0: 51015.1. Samples: 3048286540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:48:52,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 01:48:55,037][49750] Updated weights for policy 0, policy_version 323221 (0.0032) [2024-04-27 01:48:57,062][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5295734784. Throughput: 0: 51089.8. Samples: 3048592620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:48:57,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 01:48:58,345][49750] Updated weights for policy 0, policy_version 323231 (0.0034) [2024-04-27 01:49:01,914][49750] Updated weights for policy 0, policy_version 323241 (0.0032) [2024-04-27 01:49:02,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5295980544. Throughput: 0: 51080.4. Samples: 3048898920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:02,071][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 01:49:04,704][49750] Updated weights for policy 0, policy_version 323251 (0.0029) [2024-04-27 01:49:07,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 5296242688. Throughput: 0: 51006.8. Samples: 3049048080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 01:49:08,402][49750] Updated weights for policy 0, policy_version 323261 (0.0034) [2024-04-27 01:49:11,257][49750] Updated weights for policy 0, policy_version 323271 (0.0030) [2024-04-27 01:49:12,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5296521216. Throughput: 0: 50817.2. Samples: 3049353220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:12,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 01:49:14,891][49750] Updated weights for policy 0, policy_version 323281 (0.0033) [2024-04-27 01:49:17,062][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 5296766976. Throughput: 0: 50823.9. Samples: 3049659340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:17,072][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 01:49:17,573][49750] Updated weights for policy 0, policy_version 323291 (0.0031) [2024-04-27 01:49:21,161][49750] Updated weights for policy 0, policy_version 323301 (0.0029) [2024-04-27 01:49:22,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5296996352. Throughput: 0: 50796.4. Samples: 3049806620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:22,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 01:49:24,024][49750] Updated weights for policy 0, policy_version 323311 (0.0033) [2024-04-27 01:49:27,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5297258496. Throughput: 0: 50788.0. Samples: 3050110620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:27,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 01:49:27,601][49750] Updated weights for policy 0, policy_version 323321 (0.0034) [2024-04-27 01:49:30,409][49750] Updated weights for policy 0, policy_version 323331 (0.0033) [2024-04-27 01:49:32,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 5297537024. Throughput: 0: 50985.7. Samples: 3050420620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:32,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-27 01:49:34,105][49750] Updated weights for policy 0, policy_version 323341 (0.0035) [2024-04-27 01:49:36,841][49750] Updated weights for policy 0, policy_version 323351 (0.0036) [2024-04-27 01:49:37,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5297782784. Throughput: 0: 50869.3. Samples: 3050575660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:37,064][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 01:49:40,563][49750] Updated weights for policy 0, policy_version 323361 (0.0031) [2024-04-27 01:49:42,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 5298028544. Throughput: 0: 50941.0. Samples: 3050884960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:42,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 01:49:43,308][49750] Updated weights for policy 0, policy_version 323371 (0.0028) [2024-04-27 01:49:46,953][49750] Updated weights for policy 0, policy_version 323381 (0.0030) [2024-04-27 01:49:47,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50929.3). Total num frames: 5298274304. Throughput: 0: 50856.0. Samples: 3051187440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:47,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 01:49:49,738][49750] Updated weights for policy 0, policy_version 323391 (0.0029) [2024-04-27 01:49:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5298536448. Throughput: 0: 50787.0. Samples: 3051333500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:52,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 01:49:53,486][49750] Updated weights for policy 0, policy_version 323401 (0.0033) [2024-04-27 01:49:54,823][49728] Signal inference workers to stop experience collection... (45850 times) [2024-04-27 01:49:54,824][49728] Signal inference workers to resume experience collection... (45850 times) [2024-04-27 01:49:54,854][49750] InferenceWorker_p0-w0: stopping experience collection (45850 times) [2024-04-27 01:49:54,854][49750] InferenceWorker_p0-w0: resuming experience collection (45850 times) [2024-04-27 01:49:56,083][49750] Updated weights for policy 0, policy_version 323411 (0.0030) [2024-04-27 01:49:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5298798592. Throughput: 0: 50779.3. Samples: 3051638280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:49:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 01:50:00,182][49750] Updated weights for policy 0, policy_version 323421 (0.0030) [2024-04-27 01:50:02,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5299060736. Throughput: 0: 50699.0. Samples: 3051940800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 01:50:02,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 01:50:02,497][49750] Updated weights for policy 0, policy_version 323431 (0.0027) [2024-04-27 01:50:06,506][49750] Updated weights for policy 0, policy_version 323441 (0.0034) [2024-04-27 01:50:07,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5299290112. Throughput: 0: 50809.8. Samples: 3052093060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:07,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 01:50:08,834][49750] Updated weights for policy 0, policy_version 323451 (0.0032) [2024-04-27 01:50:12,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50244.4, 300 sec: 50873.7). Total num frames: 5299535872. Throughput: 0: 50844.1. Samples: 3052398600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:12,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 01:50:12,773][49750] Updated weights for policy 0, policy_version 323461 (0.0029) [2024-04-27 01:50:15,262][49750] Updated weights for policy 0, policy_version 323471 (0.0031) [2024-04-27 01:50:17,063][49517] Fps is (10 sec: 54065.6, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 5299830784. Throughput: 0: 50828.3. Samples: 3052707900. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:17,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 01:50:19,308][49750] Updated weights for policy 0, policy_version 323481 (0.0040) [2024-04-27 01:50:21,814][49750] Updated weights for policy 0, policy_version 323491 (0.0032) [2024-04-27 01:50:22,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 5300076544. Throughput: 0: 50906.5. Samples: 3052866440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:22,063][49517] Avg episode reward: [(0, '0.711')] [2024-04-27 01:50:25,715][49750] Updated weights for policy 0, policy_version 323501 (0.0028) [2024-04-27 01:50:27,062][49517] Fps is (10 sec: 49153.4, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 5300322304. Throughput: 0: 50886.3. Samples: 3053174840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:27,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 01:50:28,456][49750] Updated weights for policy 0, policy_version 323511 (0.0031) [2024-04-27 01:50:32,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 5300551680. Throughput: 0: 50929.4. Samples: 3053479260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:32,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 01:50:32,098][49750] Updated weights for policy 0, policy_version 323521 (0.0031) [2024-04-27 01:50:34,729][49750] Updated weights for policy 0, policy_version 323531 (0.0034) [2024-04-27 01:50:37,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5300830208. Throughput: 0: 50863.6. Samples: 3053622360. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:37,063][49517] Avg episode reward: [(0, '0.637')] [2024-04-27 01:50:38,513][49750] Updated weights for policy 0, policy_version 323541 (0.0036) [2024-04-27 01:50:41,321][49750] Updated weights for policy 0, policy_version 323551 (0.0028) [2024-04-27 01:50:42,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5301075968. Throughput: 0: 50916.0. Samples: 3053929500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:42,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-27 01:50:42,108][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000323553_5301092352.pth... [2024-04-27 01:50:42,158][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000322806_5288853504.pth [2024-04-27 01:50:44,885][49750] Updated weights for policy 0, policy_version 323561 (0.0026) [2024-04-27 01:50:47,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5301338112. Throughput: 0: 50898.4. Samples: 3054231220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:47,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-27 01:50:47,797][49750] Updated weights for policy 0, policy_version 323571 (0.0029) [2024-04-27 01:50:51,344][49750] Updated weights for policy 0, policy_version 323581 (0.0030) [2024-04-27 01:50:52,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5301583872. Throughput: 0: 50816.3. Samples: 3054379800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 01:50:54,118][49750] Updated weights for policy 0, policy_version 323591 (0.0030) [2024-04-27 01:50:57,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 5301813248. Throughput: 0: 50942.1. Samples: 3054691000. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:50:57,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-27 01:50:57,738][49750] Updated weights for policy 0, policy_version 323601 (0.0033) [2024-04-27 01:51:00,414][49750] Updated weights for policy 0, policy_version 323611 (0.0031) [2024-04-27 01:51:02,006][49728] Signal inference workers to stop experience collection... (45900 times) [2024-04-27 01:51:02,006][49728] Signal inference workers to resume experience collection... (45900 times) [2024-04-27 01:51:02,017][49750] InferenceWorker_p0-w0: stopping experience collection (45900 times) [2024-04-27 01:51:02,018][49750] InferenceWorker_p0-w0: resuming experience collection (45900 times) [2024-04-27 01:51:02,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5302108160. Throughput: 0: 50976.9. Samples: 3055001860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:51:02,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-27 01:51:04,107][49750] Updated weights for policy 0, policy_version 323621 (0.0027) [2024-04-27 01:51:06,864][49750] Updated weights for policy 0, policy_version 323631 (0.0030) [2024-04-27 01:51:07,062][49517] Fps is (10 sec: 55705.8, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5302370304. Throughput: 0: 50804.8. Samples: 3055152660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:51:07,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-27 01:51:10,765][49750] Updated weights for policy 0, policy_version 323641 (0.0035) [2024-04-27 01:51:12,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5302599680. Throughput: 0: 50862.1. Samples: 3055463640. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:51:12,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 01:51:13,311][49750] Updated weights for policy 0, policy_version 323651 (0.0029) [2024-04-27 01:51:17,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.4, 300 sec: 50818.2). Total num frames: 5302829056. Throughput: 0: 50836.0. Samples: 3055766880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:51:17,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 01:51:17,239][49750] Updated weights for policy 0, policy_version 323661 (0.0032) [2024-04-27 01:51:19,649][49750] Updated weights for policy 0, policy_version 323671 (0.0027) [2024-04-27 01:51:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5303107584. Throughput: 0: 50807.1. Samples: 3055908680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-04-27 01:51:22,063][49517] Avg episode reward: [(0, '0.483')] [2024-04-27 01:51:23,578][49750] Updated weights for policy 0, policy_version 323681 (0.0035) [2024-04-27 01:51:26,396][49750] Updated weights for policy 0, policy_version 323691 (0.0036) [2024-04-27 01:51:27,062][49517] Fps is (10 sec: 54066.7, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5303369728. Throughput: 0: 50786.7. Samples: 3056214900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:51:27,063][49517] Avg episode reward: [(0, '0.679')] [2024-04-27 01:51:29,937][49750] Updated weights for policy 0, policy_version 323701 (0.0028) [2024-04-27 01:51:32,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5303631872. Throughput: 0: 50947.6. Samples: 3056523860. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:51:32,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 01:51:33,200][49750] Updated weights for policy 0, policy_version 323711 (0.0036) [2024-04-27 01:51:36,370][49750] Updated weights for policy 0, policy_version 323721 (0.0030) [2024-04-27 01:51:37,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50929.3). Total num frames: 5303877632. Throughput: 0: 50914.6. Samples: 3056670960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:51:37,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-27 01:51:39,532][49750] Updated weights for policy 0, policy_version 323731 (0.0030) [2024-04-27 01:51:42,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5304107008. Throughput: 0: 50850.7. Samples: 3056979280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:51:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:51:42,764][49750] Updated weights for policy 0, policy_version 323741 (0.0037) [2024-04-27 01:51:45,809][49750] Updated weights for policy 0, policy_version 323751 (0.0032) [2024-04-27 01:51:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5304369152. Throughput: 0: 50726.0. Samples: 3057284520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:51:47,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 01:51:49,223][49750] Updated weights for policy 0, policy_version 323761 (0.0040) [2024-04-27 01:51:52,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5304647680. Throughput: 0: 50631.9. Samples: 3057431100. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:51:52,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 01:51:52,217][49750] Updated weights for policy 0, policy_version 323771 (0.0034) [2024-04-27 01:51:55,716][49750] Updated weights for policy 0, policy_version 323781 (0.0028) [2024-04-27 01:51:57,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51609.7, 300 sec: 50984.8). Total num frames: 5304909824. Throughput: 0: 50657.9. Samples: 3057743240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:51:57,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 01:51:58,681][49750] Updated weights for policy 0, policy_version 323791 (0.0033) [2024-04-27 01:52:02,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 5305139200. Throughput: 0: 50751.5. Samples: 3058050700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:52:02,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-27 01:52:02,148][49750] Updated weights for policy 0, policy_version 323801 (0.0040) [2024-04-27 01:52:05,060][49750] Updated weights for policy 0, policy_version 323811 (0.0028) [2024-04-27 01:52:07,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 5305368576. Throughput: 0: 50700.9. Samples: 3058190220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:52:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 01:52:08,656][49750] Updated weights for policy 0, policy_version 323821 (0.0029) [2024-04-27 01:52:09,272][49728] Signal inference workers to stop experience collection... (45950 times) [2024-04-27 01:52:09,308][49750] InferenceWorker_p0-w0: stopping experience collection (45950 times) [2024-04-27 01:52:09,341][49728] Signal inference workers to resume experience collection... 
(45950 times) [2024-04-27 01:52:09,342][49750] InferenceWorker_p0-w0: resuming experience collection (45950 times) [2024-04-27 01:52:11,345][49750] Updated weights for policy 0, policy_version 323831 (0.0031) [2024-04-27 01:52:12,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 5305663488. Throughput: 0: 50704.8. Samples: 3058496620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:52:12,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 01:52:15,008][49750] Updated weights for policy 0, policy_version 323841 (0.0032) [2024-04-27 01:52:17,062][49517] Fps is (10 sec: 55706.0, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 5305925632. Throughput: 0: 50655.6. Samples: 3058803360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:52:17,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-27 01:52:17,936][49750] Updated weights for policy 0, policy_version 323851 (0.0036) [2024-04-27 01:52:21,478][49750] Updated weights for policy 0, policy_version 323861 (0.0027) [2024-04-27 01:52:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5306155008. Throughput: 0: 50923.7. Samples: 3058962520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:52:22,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-27 01:52:24,520][49750] Updated weights for policy 0, policy_version 323871 (0.0034) [2024-04-27 01:52:27,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5306400768. Throughput: 0: 50869.0. Samples: 3059268380. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:52:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 01:52:27,817][49750] Updated weights for policy 0, policy_version 323881 (0.0027) [2024-04-27 01:52:30,869][49750] Updated weights for policy 0, policy_version 323891 (0.0032) [2024-04-27 01:52:32,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5306646528. Throughput: 0: 50800.0. Samples: 3059570520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:52:32,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 01:52:34,207][49750] Updated weights for policy 0, policy_version 323901 (0.0032) [2024-04-27 01:52:37,063][49517] Fps is (10 sec: 52428.2, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 5306925056. Throughput: 0: 50849.3. Samples: 3059719320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 01:52:37,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 01:52:37,433][49750] Updated weights for policy 0, policy_version 323911 (0.0032) [2024-04-27 01:52:40,667][49750] Updated weights for policy 0, policy_version 323921 (0.0033) [2024-04-27 01:52:42,062][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 5307187200. Throughput: 0: 50743.5. Samples: 3060026700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:52:42,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 01:52:42,173][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000323926_5307203584.pth... [2024-04-27 01:52:42,223][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000323181_5294997504.pth [2024-04-27 01:52:43,859][49750] Updated weights for policy 0, policy_version 323931 (0.0028) [2024-04-27 01:52:47,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5307416576. Throughput: 0: 50791.3. Samples: 3060336320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:52:47,063][49517] Avg episode reward: [(0, '0.571')] [2024-04-27 01:52:47,242][49750] Updated weights for policy 0, policy_version 323941 (0.0029) [2024-04-27 01:52:50,477][49750] Updated weights for policy 0, policy_version 323951 (0.0032) [2024-04-27 01:52:52,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 5307662336. Throughput: 0: 50656.2. Samples: 3060469760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:52:52,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:52:53,633][49750] Updated weights for policy 0, policy_version 323961 (0.0028) [2024-04-27 01:52:57,032][49750] Updated weights for policy 0, policy_version 323971 (0.0033) [2024-04-27 01:52:57,062][49517] Fps is (10 sec: 52430.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5307940864. Throughput: 0: 50659.8. Samples: 3060776300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:52:57,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 01:53:00,039][49750] Updated weights for policy 0, policy_version 323981 (0.0038) [2024-04-27 01:53:02,062][49517] Fps is (10 sec: 55706.4, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 5308219392. Throughput: 0: 50715.0. Samples: 3061085540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:02,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:53:03,462][49750] Updated weights for policy 0, policy_version 323991 (0.0031) [2024-04-27 01:53:06,081][49728] Signal inference workers to stop experience collection... (46000 times) [2024-04-27 01:53:06,081][49728] Signal inference workers to resume experience collection... 
(46000 times) [2024-04-27 01:53:06,118][49750] InferenceWorker_p0-w0: stopping experience collection (46000 times) [2024-04-27 01:53:06,118][49750] InferenceWorker_p0-w0: resuming experience collection (46000 times) [2024-04-27 01:53:06,525][49750] Updated weights for policy 0, policy_version 324001 (0.0036) [2024-04-27 01:53:07,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5308448768. Throughput: 0: 50855.0. Samples: 3061251000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:07,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-27 01:53:09,903][49750] Updated weights for policy 0, policy_version 324011 (0.0031) [2024-04-27 01:53:12,062][49517] Fps is (10 sec: 45875.1, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5308678144. Throughput: 0: 50745.3. Samples: 3061551920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:12,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 01:53:13,022][49750] Updated weights for policy 0, policy_version 324021 (0.0034) [2024-04-27 01:53:16,517][49750] Updated weights for policy 0, policy_version 324031 (0.0036) [2024-04-27 01:53:17,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 5308940288. Throughput: 0: 50724.7. Samples: 3061853140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:17,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 01:53:19,382][49750] Updated weights for policy 0, policy_version 324041 (0.0031) [2024-04-27 01:53:22,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5309218816. Throughput: 0: 50643.1. Samples: 3061998260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:22,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 01:53:22,909][49750] Updated weights for policy 0, policy_version 324051 (0.0032) [2024-04-27 01:53:25,797][49750] Updated weights for policy 0, policy_version 324061 (0.0027) [2024-04-27 01:53:27,062][49517] Fps is (10 sec: 52430.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5309464576. Throughput: 0: 50763.2. Samples: 3062311040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:27,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-27 01:53:29,545][49750] Updated weights for policy 0, policy_version 324071 (0.0030) [2024-04-27 01:53:32,063][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5309710336. Throughput: 0: 50816.5. Samples: 3062623060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:32,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 01:53:32,226][49750] Updated weights for policy 0, policy_version 324081 (0.0036) [2024-04-27 01:53:35,922][49750] Updated weights for policy 0, policy_version 324091 (0.0029) [2024-04-27 01:53:37,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5309956096. Throughput: 0: 50930.4. Samples: 3062761620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:37,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-27 01:53:38,657][49750] Updated weights for policy 0, policy_version 324101 (0.0035) [2024-04-27 01:53:42,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5310201856. Throughput: 0: 50836.3. Samples: 3063063940. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:42,063][49517] Avg episode reward: [(0, '0.681')] [2024-04-27 01:53:42,445][49750] Updated weights for policy 0, policy_version 324111 (0.0029) [2024-04-27 01:53:44,999][49750] Updated weights for policy 0, policy_version 324121 (0.0031) [2024-04-27 01:53:47,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 5310496768. Throughput: 0: 50801.3. Samples: 3063371600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:47,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 01:53:48,953][49750] Updated weights for policy 0, policy_version 324131 (0.0038) [2024-04-27 01:53:51,479][49750] Updated weights for policy 0, policy_version 324141 (0.0030) [2024-04-27 01:53:52,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 5310742528. Throughput: 0: 50840.6. Samples: 3063538820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 01:53:52,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 01:53:55,379][49750] Updated weights for policy 0, policy_version 324151 (0.0031) [2024-04-27 01:53:57,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5310971904. Throughput: 0: 50979.9. Samples: 3063846020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:53:57,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 01:53:57,911][49750] Updated weights for policy 0, policy_version 324161 (0.0028) [2024-04-27 01:54:01,784][49750] Updated weights for policy 0, policy_version 324171 (0.0028) [2024-04-27 01:54:02,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5311234048. Throughput: 0: 50962.0. Samples: 3064146420. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:02,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 01:54:03,670][49728] Signal inference workers to stop experience collection... (46050 times) [2024-04-27 01:54:03,670][49728] Signal inference workers to resume experience collection... (46050 times) [2024-04-27 01:54:03,683][49750] InferenceWorker_p0-w0: stopping experience collection (46050 times) [2024-04-27 01:54:03,683][49750] InferenceWorker_p0-w0: resuming experience collection (46050 times) [2024-04-27 01:54:04,261][49750] Updated weights for policy 0, policy_version 324181 (0.0031) [2024-04-27 01:54:07,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5311512576. Throughput: 0: 50948.0. Samples: 3064290920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 01:54:08,053][49750] Updated weights for policy 0, policy_version 324191 (0.0030) [2024-04-27 01:54:10,700][49750] Updated weights for policy 0, policy_version 324201 (0.0030) [2024-04-27 01:54:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5311758336. Throughput: 0: 50785.8. Samples: 3064596400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:12,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 01:54:14,596][49750] Updated weights for policy 0, policy_version 324211 (0.0031) [2024-04-27 01:54:17,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5312020480. Throughput: 0: 50724.1. Samples: 3064905640. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:17,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 01:54:17,286][49750] Updated weights for policy 0, policy_version 324221 (0.0027) [2024-04-27 01:54:20,960][49750] Updated weights for policy 0, policy_version 324231 (0.0029) [2024-04-27 01:54:22,062][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5312233472. Throughput: 0: 50970.2. Samples: 3065055280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:22,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 01:54:23,716][49750] Updated weights for policy 0, policy_version 324241 (0.0034) [2024-04-27 01:54:27,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5312495616. Throughput: 0: 50948.5. Samples: 3065356620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:27,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-27 01:54:27,373][49750] Updated weights for policy 0, policy_version 324251 (0.0032) [2024-04-27 01:54:29,983][49750] Updated weights for policy 0, policy_version 324261 (0.0025) [2024-04-27 01:54:32,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5312774144. Throughput: 0: 50882.0. Samples: 3065661300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 01:54:33,978][49750] Updated weights for policy 0, policy_version 324271 (0.0029) [2024-04-27 01:54:36,335][49750] Updated weights for policy 0, policy_version 324281 (0.0031) [2024-04-27 01:54:37,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5313036288. Throughput: 0: 50931.5. Samples: 3065830740. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:37,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 01:54:40,466][49750] Updated weights for policy 0, policy_version 324291 (0.0029) [2024-04-27 01:54:42,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5313282048. Throughput: 0: 50863.7. Samples: 3066134880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:42,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 01:54:42,090][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000324298_5313298432.pth... [2024-04-27 01:54:42,136][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000323553_5301092352.pth [2024-04-27 01:54:42,784][49750] Updated weights for policy 0, policy_version 324301 (0.0037) [2024-04-27 01:54:47,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 5313495040. Throughput: 0: 50984.8. Samples: 3066440740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:47,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 01:54:47,125][49750] Updated weights for policy 0, policy_version 324311 (0.0030) [2024-04-27 01:54:49,207][49750] Updated weights for policy 0, policy_version 324321 (0.0028) [2024-04-27 01:54:52,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 5313789952. Throughput: 0: 50843.4. Samples: 3066578880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:52,063][49517] Avg episode reward: [(0, '0.684')] [2024-04-27 01:54:53,736][49750] Updated weights for policy 0, policy_version 324331 (0.0029) [2024-04-27 01:54:55,947][49750] Updated weights for policy 0, policy_version 324341 (0.0028) [2024-04-27 01:54:57,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5314052096. Throughput: 0: 50858.6. Samples: 3066885040. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:54:57,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 01:55:00,009][49750] Updated weights for policy 0, policy_version 324351 (0.0039) [2024-04-27 01:55:02,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5314314240. Throughput: 0: 50896.4. Samples: 3067195980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:55:02,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 01:55:02,291][49750] Updated weights for policy 0, policy_version 324361 (0.0032) [2024-04-27 01:55:06,247][49750] Updated weights for policy 0, policy_version 324371 (0.0036) [2024-04-27 01:55:07,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5314527232. Throughput: 0: 50834.7. Samples: 3067342840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-27 01:55:07,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 01:55:07,766][49728] Signal inference workers to stop experience collection... (46100 times) [2024-04-27 01:55:07,767][49728] Signal inference workers to resume experience collection... (46100 times) [2024-04-27 01:55:07,785][49750] InferenceWorker_p0-w0: stopping experience collection (46100 times) [2024-04-27 01:55:07,785][49750] InferenceWorker_p0-w0: resuming experience collection (46100 times) [2024-04-27 01:55:08,637][49750] Updated weights for policy 0, policy_version 324381 (0.0029) [2024-04-27 01:55:12,063][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5314789376. Throughput: 0: 51070.2. Samples: 3067654780. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:12,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 01:55:12,738][49750] Updated weights for policy 0, policy_version 324391 (0.0030) [2024-04-27 01:55:15,079][49750] Updated weights for policy 0, policy_version 324401 (0.0031) [2024-04-27 01:55:17,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5315084288. Throughput: 0: 51067.1. Samples: 3067959320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 01:55:19,238][49750] Updated weights for policy 0, policy_version 324411 (0.0032) [2024-04-27 01:55:21,526][49750] Updated weights for policy 0, policy_version 324421 (0.0032) [2024-04-27 01:55:22,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51609.7, 300 sec: 50873.7). Total num frames: 5315330048. Throughput: 0: 51087.7. Samples: 3068129680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:22,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 01:55:25,582][49750] Updated weights for policy 0, policy_version 324431 (0.0034) [2024-04-27 01:55:27,063][49517] Fps is (10 sec: 49152.5, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5315575808. Throughput: 0: 51047.5. Samples: 3068432020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:27,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-27 01:55:27,882][49750] Updated weights for policy 0, policy_version 324441 (0.0032) [2024-04-27 01:55:32,034][49750] Updated weights for policy 0, policy_version 324451 (0.0036) [2024-04-27 01:55:32,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 5315805184. Throughput: 0: 51035.2. Samples: 3068737320. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:32,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 01:55:34,197][49750] Updated weights for policy 0, policy_version 324461 (0.0030) [2024-04-27 01:55:37,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5316083712. Throughput: 0: 50957.9. Samples: 3068871980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:37,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 01:55:38,685][49750] Updated weights for policy 0, policy_version 324471 (0.0035) [2024-04-27 01:55:40,665][49750] Updated weights for policy 0, policy_version 324481 (0.0043) [2024-04-27 01:55:42,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5316345856. Throughput: 0: 50999.5. Samples: 3069180020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:42,063][49517] Avg episode reward: [(0, '0.690')] [2024-04-27 01:55:44,944][49750] Updated weights for policy 0, policy_version 324491 (0.0034) [2024-04-27 01:55:47,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51882.7, 300 sec: 50929.3). Total num frames: 5316608000. Throughput: 0: 50954.9. Samples: 3069488940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-27 01:55:47,074][49750] Updated weights for policy 0, policy_version 324501 (0.0031) [2024-04-27 01:55:51,301][49750] Updated weights for policy 0, policy_version 324511 (0.0034) [2024-04-27 01:55:52,062][49517] Fps is (10 sec: 45875.5, 60 sec: 50244.5, 300 sec: 50818.2). Total num frames: 5316804608. Throughput: 0: 51033.9. Samples: 3069639360. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:52,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 01:55:53,616][49750] Updated weights for policy 0, policy_version 324521 (0.0027) [2024-04-27 01:55:57,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5317083136. Throughput: 0: 50830.6. Samples: 3069942160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:55:57,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 01:55:57,765][49750] Updated weights for policy 0, policy_version 324531 (0.0032) [2024-04-27 01:56:00,047][49750] Updated weights for policy 0, policy_version 324541 (0.0028) [2024-04-27 01:56:02,062][49517] Fps is (10 sec: 57343.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5317378048. Throughput: 0: 50942.8. Samples: 3070251740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:56:02,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-27 01:56:04,206][49750] Updated weights for policy 0, policy_version 324551 (0.0033) [2024-04-27 01:56:05,404][49728] Signal inference workers to stop experience collection... (46150 times) [2024-04-27 01:56:05,438][49750] InferenceWorker_p0-w0: stopping experience collection (46150 times) [2024-04-27 01:56:05,472][49728] Signal inference workers to resume experience collection... (46150 times) [2024-04-27 01:56:05,473][49750] InferenceWorker_p0-w0: resuming experience collection (46150 times) [2024-04-27 01:56:06,476][49750] Updated weights for policy 0, policy_version 324561 (0.0026) [2024-04-27 01:56:07,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51609.7, 300 sec: 50929.3). Total num frames: 5317623808. Throughput: 0: 50921.3. Samples: 3070421140. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:56:07,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 01:56:10,514][49750] Updated weights for policy 0, policy_version 324571 (0.0031) [2024-04-27 01:56:12,062][49517] Fps is (10 sec: 49152.0, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5317869568. Throughput: 0: 50882.7. Samples: 3070721740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:56:12,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 01:56:13,073][49750] Updated weights for policy 0, policy_version 324581 (0.0035) [2024-04-27 01:56:16,827][49750] Updated weights for policy 0, policy_version 324591 (0.0032) [2024-04-27 01:56:17,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5318098944. Throughput: 0: 50845.2. Samples: 3071025360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:56:17,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 01:56:19,518][49750] Updated weights for policy 0, policy_version 324601 (0.0033) [2024-04-27 01:56:22,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5318361088. Throughput: 0: 51037.8. Samples: 3071168680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 01:56:22,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-27 01:56:23,270][49750] Updated weights for policy 0, policy_version 324611 (0.0035) [2024-04-27 01:56:25,845][49750] Updated weights for policy 0, policy_version 324621 (0.0028) [2024-04-27 01:56:27,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5318639616. Throughput: 0: 51031.2. Samples: 3071476420. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:56:27,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 01:56:29,752][49750] Updated weights for policy 0, policy_version 324631 (0.0035) [2024-04-27 01:56:32,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5318885376. Throughput: 0: 50870.1. Samples: 3071778100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:56:32,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 01:56:32,409][49750] Updated weights for policy 0, policy_version 324641 (0.0029) [2024-04-27 01:56:36,113][49750] Updated weights for policy 0, policy_version 324651 (0.0033) [2024-04-27 01:56:37,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5319114752. Throughput: 0: 50934.9. Samples: 3071931440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:56:37,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 01:56:38,964][49750] Updated weights for policy 0, policy_version 324661 (0.0026) [2024-04-27 01:56:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5319376896. Throughput: 0: 50901.4. Samples: 3072232720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:56:42,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 01:56:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000324669_5319376896.pth... [2024-04-27 01:56:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000323926_5307203584.pth [2024-04-27 01:56:42,441][49750] Updated weights for policy 0, policy_version 324671 (0.0032) [2024-04-27 01:56:45,311][49750] Updated weights for policy 0, policy_version 324681 (0.0030) [2024-04-27 01:56:47,062][49517] Fps is (10 sec: 54068.0, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5319655424. Throughput: 0: 50735.6. Samples: 3072534840. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:56:47,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 01:56:48,987][49750] Updated weights for policy 0, policy_version 324691 (0.0037) [2024-04-27 01:56:51,648][49750] Updated weights for policy 0, policy_version 324701 (0.0029) [2024-04-27 01:56:52,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51882.6, 300 sec: 50873.7). Total num frames: 5319917568. Throughput: 0: 50768.3. Samples: 3072705720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:56:52,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 01:56:55,572][49750] Updated weights for policy 0, policy_version 324711 (0.0030) [2024-04-27 01:56:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5320146944. Throughput: 0: 50852.6. Samples: 3073010100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:56:57,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 01:56:58,138][49750] Updated weights for policy 0, policy_version 324721 (0.0031) [2024-04-27 01:57:01,150][49728] Signal inference workers to stop experience collection... (46200 times) [2024-04-27 01:57:01,151][49728] Signal inference workers to resume experience collection... (46200 times) [2024-04-27 01:57:01,166][49750] InferenceWorker_p0-w0: stopping experience collection (46200 times) [2024-04-27 01:57:01,166][49750] InferenceWorker_p0-w0: resuming experience collection (46200 times) [2024-04-27 01:57:02,062][49517] Fps is (10 sec: 45875.6, 60 sec: 49971.2, 300 sec: 50873.7). Total num frames: 5320376320. Throughput: 0: 50842.4. Samples: 3073313260. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:57:02,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-27 01:57:02,218][49750] Updated weights for policy 0, policy_version 324731 (0.0031) [2024-04-27 01:57:04,644][49750] Updated weights for policy 0, policy_version 324741 (0.0033) [2024-04-27 01:57:07,062][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5320654848. Throughput: 0: 50919.6. Samples: 3073460060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:57:07,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 01:57:08,715][49750] Updated weights for policy 0, policy_version 324751 (0.0031) [2024-04-27 01:57:10,979][49750] Updated weights for policy 0, policy_version 324761 (0.0029) [2024-04-27 01:57:12,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5320933376. Throughput: 0: 50893.8. Samples: 3073766640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:57:12,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-27 01:57:15,000][49750] Updated weights for policy 0, policy_version 324771 (0.0034) [2024-04-27 01:57:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5321162752. Throughput: 0: 51021.4. Samples: 3074074060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:57:17,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 01:57:17,424][49750] Updated weights for policy 0, policy_version 324781 (0.0033) [2024-04-27 01:57:21,445][49750] Updated weights for policy 0, policy_version 324791 (0.0038) [2024-04-27 01:57:22,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5321408512. Throughput: 0: 50878.4. Samples: 3074220960. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:57:22,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-27 01:57:23,891][49750] Updated weights for policy 0, policy_version 324801 (0.0034) [2024-04-27 01:57:27,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 5321654272. Throughput: 0: 50840.1. Samples: 3074520520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:57:27,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-27 01:57:27,853][49750] Updated weights for policy 0, policy_version 324811 (0.0033) [2024-04-27 01:57:30,396][49750] Updated weights for policy 0, policy_version 324821 (0.0034) [2024-04-27 01:57:32,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5321949184. Throughput: 0: 50796.3. Samples: 3074820680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:57:32,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-27 01:57:34,417][49750] Updated weights for policy 0, policy_version 324831 (0.0029) [2024-04-27 01:57:36,793][49750] Updated weights for policy 0, policy_version 324841 (0.0034) [2024-04-27 01:57:37,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 5322194944. Throughput: 0: 50687.2. Samples: 3074986640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 01:57:37,063][49517] Avg episode reward: [(0, '0.680')] [2024-04-27 01:57:40,720][49750] Updated weights for policy 0, policy_version 324851 (0.0041) [2024-04-27 01:57:42,063][49517] Fps is (10 sec: 49152.4, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5322440704. Throughput: 0: 50794.0. Samples: 3075295840. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:57:42,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 01:57:43,100][49750] Updated weights for policy 0, policy_version 324861 (0.0033) [2024-04-27 01:57:47,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 5322670080. Throughput: 0: 50760.9. Samples: 3075597500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:57:47,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 01:57:47,102][49750] Updated weights for policy 0, policy_version 324871 (0.0028) [2024-04-27 01:57:49,654][49750] Updated weights for policy 0, policy_version 324881 (0.0039) [2024-04-27 01:57:52,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5322948608. Throughput: 0: 50631.6. Samples: 3075738480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:57:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 01:57:53,641][49750] Updated weights for policy 0, policy_version 324891 (0.0031) [2024-04-27 01:57:56,343][49750] Updated weights for policy 0, policy_version 324901 (0.0030) [2024-04-27 01:57:57,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5323227136. Throughput: 0: 50692.0. Samples: 3076047780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:57:57,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 01:58:00,055][49750] Updated weights for policy 0, policy_version 324911 (0.0032) [2024-04-27 01:58:02,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51336.3, 300 sec: 50873.7). Total num frames: 5323456512. Throughput: 0: 50658.5. Samples: 3076353700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:02,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 01:58:02,799][49750] Updated weights for policy 0, policy_version 324921 (0.0033) [2024-04-27 01:58:06,422][49750] Updated weights for policy 0, policy_version 324931 (0.0030) [2024-04-27 01:58:07,062][49517] Fps is (10 sec: 45874.9, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5323685888. Throughput: 0: 50663.5. Samples: 3076500820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:07,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 01:58:07,216][49728] Signal inference workers to stop experience collection... (46250 times) [2024-04-27 01:58:07,223][49728] Signal inference workers to resume experience collection... (46250 times) [2024-04-27 01:58:07,237][49750] InferenceWorker_p0-w0: stopping experience collection (46250 times) [2024-04-27 01:58:07,238][49750] InferenceWorker_p0-w0: resuming experience collection (46250 times) [2024-04-27 01:58:09,282][49750] Updated weights for policy 0, policy_version 324941 (0.0029) [2024-04-27 01:58:12,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 5323948032. Throughput: 0: 50872.8. Samples: 3076809800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:12,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-27 01:58:12,857][49750] Updated weights for policy 0, policy_version 324951 (0.0032) [2024-04-27 01:58:15,731][49750] Updated weights for policy 0, policy_version 324961 (0.0031) [2024-04-27 01:58:17,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5324226560. Throughput: 0: 50839.2. Samples: 3077108440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:17,063][49517] Avg episode reward: [(0, '0.690')] [2024-04-27 01:58:19,240][49750] Updated weights for policy 0, policy_version 324971 (0.0031) [2024-04-27 01:58:22,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5324472320. Throughput: 0: 50794.5. Samples: 3077272400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:22,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-27 01:58:22,093][49750] Updated weights for policy 0, policy_version 324981 (0.0028) [2024-04-27 01:58:25,607][49750] Updated weights for policy 0, policy_version 324991 (0.0031) [2024-04-27 01:58:27,063][49517] Fps is (10 sec: 49151.6, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5324718080. Throughput: 0: 50669.7. Samples: 3077575980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:27,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 01:58:28,720][49750] Updated weights for policy 0, policy_version 325001 (0.0034) [2024-04-27 01:58:32,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.4, 300 sec: 50873.7). Total num frames: 5324963840. Throughput: 0: 50768.9. Samples: 3077882100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:32,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 01:58:32,070][49750] Updated weights for policy 0, policy_version 325011 (0.0033) [2024-04-27 01:58:34,995][49750] Updated weights for policy 0, policy_version 325021 (0.0027) [2024-04-27 01:58:37,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 5325242368. Throughput: 0: 50820.4. Samples: 3078025400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:37,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-27 01:58:38,522][49750] Updated weights for policy 0, policy_version 325031 (0.0030) [2024-04-27 01:58:41,536][49750] Updated weights for policy 0, policy_version 325041 (0.0029) [2024-04-27 01:58:42,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5325504512. Throughput: 0: 50829.7. Samples: 3078335120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:42,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-27 01:58:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000325043_5325504512.pth... [2024-04-27 01:58:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000324298_5313298432.pth [2024-04-27 01:58:45,013][49750] Updated weights for policy 0, policy_version 325051 (0.0028) [2024-04-27 01:58:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5325733888. Throughput: 0: 50872.3. Samples: 3078642940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 01:58:48,040][49750] Updated weights for policy 0, policy_version 325061 (0.0037) [2024-04-27 01:58:51,332][49750] Updated weights for policy 0, policy_version 325071 (0.0032) [2024-04-27 01:58:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5325996032. Throughput: 0: 50790.1. Samples: 3078786380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 01:58:52,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 01:58:54,419][49750] Updated weights for policy 0, policy_version 325081 (0.0035) [2024-04-27 01:58:57,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 5326241792. Throughput: 0: 50730.8. Samples: 3079092680. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:58:57,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-27 01:58:57,735][49750] Updated weights for policy 0, policy_version 325091 (0.0028) [2024-04-27 01:59:00,694][49750] Updated weights for policy 0, policy_version 325101 (0.0036) [2024-04-27 01:59:02,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5326520320. Throughput: 0: 50928.8. Samples: 3079400240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:02,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 01:59:04,422][49750] Updated weights for policy 0, policy_version 325111 (0.0032) [2024-04-27 01:59:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5326749696. Throughput: 0: 50763.6. Samples: 3079556760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:07,063][49517] Avg episode reward: [(0, '0.557')] [2024-04-27 01:59:07,344][49750] Updated weights for policy 0, policy_version 325121 (0.0028) [2024-04-27 01:59:10,879][49750] Updated weights for policy 0, policy_version 325131 (0.0028) [2024-04-27 01:59:12,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5326995456. Throughput: 0: 50914.2. Samples: 3079867120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:12,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 01:59:13,686][49750] Updated weights for policy 0, policy_version 325141 (0.0028) [2024-04-27 01:59:13,688][49728] Signal inference workers to stop experience collection... (46300 times) [2024-04-27 01:59:13,689][49728] Signal inference workers to resume experience collection... 
(46300 times) [2024-04-27 01:59:13,704][49750] InferenceWorker_p0-w0: stopping experience collection (46300 times) [2024-04-27 01:59:13,704][49750] InferenceWorker_p0-w0: resuming experience collection (46300 times) [2024-04-27 01:59:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 5327257600. Throughput: 0: 50761.2. Samples: 3080166360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:17,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-27 01:59:17,253][49750] Updated weights for policy 0, policy_version 325151 (0.0035) [2024-04-27 01:59:20,043][49750] Updated weights for policy 0, policy_version 325161 (0.0035) [2024-04-27 01:59:22,063][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5327503360. Throughput: 0: 50902.5. Samples: 3080316020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:22,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-27 01:59:23,563][49750] Updated weights for policy 0, policy_version 325171 (0.0028) [2024-04-27 01:59:26,616][49750] Updated weights for policy 0, policy_version 325181 (0.0029) [2024-04-27 01:59:27,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5327781888. Throughput: 0: 50888.6. Samples: 3080625100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:27,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 01:59:30,104][49750] Updated weights for policy 0, policy_version 325191 (0.0029) [2024-04-27 01:59:32,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5328027648. Throughput: 0: 50905.3. Samples: 3080933680. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:32,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 01:59:32,957][49750] Updated weights for policy 0, policy_version 325201 (0.0031) [2024-04-27 01:59:36,358][49750] Updated weights for policy 0, policy_version 325211 (0.0030) [2024-04-27 01:59:37,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5328273408. Throughput: 0: 51067.7. Samples: 3081084420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:37,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 01:59:39,439][49750] Updated weights for policy 0, policy_version 325221 (0.0033) [2024-04-27 01:59:42,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50517.3, 300 sec: 50984.8). Total num frames: 5328535552. Throughput: 0: 50876.7. Samples: 3081382140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 01:59:42,854][49750] Updated weights for policy 0, policy_version 325231 (0.0027) [2024-04-27 01:59:45,805][49750] Updated weights for policy 0, policy_version 325241 (0.0027) [2024-04-27 01:59:47,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50873.8). Total num frames: 5328797696. Throughput: 0: 50794.0. Samples: 3081685960. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:47,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 01:59:49,233][49750] Updated weights for policy 0, policy_version 325251 (0.0034) [2024-04-27 01:59:52,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5329043456. Throughput: 0: 50704.5. Samples: 3081838460. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:52,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 01:59:52,320][49750] Updated weights for policy 0, policy_version 325261 (0.0032) [2024-04-27 01:59:55,582][49750] Updated weights for policy 0, policy_version 325271 (0.0032) [2024-04-27 01:59:57,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5329272832. Throughput: 0: 50667.8. Samples: 3082147160. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 01:59:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 01:59:58,796][49750] Updated weights for policy 0, policy_version 325281 (0.0031) [2024-04-27 02:00:02,009][49750] Updated weights for policy 0, policy_version 325291 (0.0029) [2024-04-27 02:00:02,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 5329567744. Throughput: 0: 50919.4. Samples: 3082457740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 02:00:02,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 02:00:05,186][49750] Updated weights for policy 0, policy_version 325301 (0.0027) [2024-04-27 02:00:07,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5329813504. Throughput: 0: 50881.5. Samples: 3082605680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 02:00:07,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-27 02:00:08,700][49750] Updated weights for policy 0, policy_version 325311 (0.0034) [2024-04-27 02:00:11,486][49750] Updated weights for policy 0, policy_version 325321 (0.0031) [2024-04-27 02:00:12,063][49517] Fps is (10 sec: 50791.0, 60 sec: 51336.7, 300 sec: 50818.2). Total num frames: 5330075648. Throughput: 0: 50743.0. Samples: 3082908540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:12,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:00:14,960][49750] Updated weights for policy 0, policy_version 325331 (0.0031) [2024-04-27 02:00:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5330321408. Throughput: 0: 50785.3. Samples: 3083219020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:17,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-27 02:00:17,887][49750] Updated weights for policy 0, policy_version 325341 (0.0028) [2024-04-27 02:00:19,682][49728] Signal inference workers to stop experience collection... (46350 times) [2024-04-27 02:00:19,725][49750] InferenceWorker_p0-w0: stopping experience collection (46350 times) [2024-04-27 02:00:19,786][49728] Signal inference workers to resume experience collection... (46350 times) [2024-04-27 02:00:19,787][49750] InferenceWorker_p0-w0: resuming experience collection (46350 times) [2024-04-27 02:00:21,519][49750] Updated weights for policy 0, policy_version 325351 (0.0026) [2024-04-27 02:00:22,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5330550784. Throughput: 0: 50790.6. Samples: 3083370000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:22,063][49517] Avg episode reward: [(0, '0.658')] [2024-04-27 02:00:24,363][49750] Updated weights for policy 0, policy_version 325361 (0.0035) [2024-04-27 02:00:27,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50244.1, 300 sec: 50818.1). Total num frames: 5330796544. Throughput: 0: 50876.9. Samples: 3083671600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:27,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 02:00:28,046][49750] Updated weights for policy 0, policy_version 325371 (0.0032) [2024-04-27 02:00:30,770][49750] Updated weights for policy 0, policy_version 325381 (0.0030) [2024-04-27 02:00:32,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5331075072. Throughput: 0: 50887.1. Samples: 3083975880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:32,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 02:00:34,435][49750] Updated weights for policy 0, policy_version 325391 (0.0031) [2024-04-27 02:00:37,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5331337216. Throughput: 0: 50900.0. Samples: 3084128960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:37,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 02:00:37,230][49750] Updated weights for policy 0, policy_version 325401 (0.0033) [2024-04-27 02:00:40,858][49750] Updated weights for policy 0, policy_version 325411 (0.0031) [2024-04-27 02:00:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5331566592. Throughput: 0: 50852.5. Samples: 3084435520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:42,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 02:00:42,178][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000325414_5331582976.pth... [2024-04-27 02:00:42,223][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000324669_5319376896.pth [2024-04-27 02:00:43,746][49750] Updated weights for policy 0, policy_version 325421 (0.0032) [2024-04-27 02:00:47,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.2, 300 sec: 50929.2). Total num frames: 5331828736. Throughput: 0: 50727.6. Samples: 3084740480. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:47,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 02:00:47,436][49750] Updated weights for policy 0, policy_version 325431 (0.0026) [2024-04-27 02:00:50,140][49750] Updated weights for policy 0, policy_version 325441 (0.0026) [2024-04-27 02:00:52,063][49517] Fps is (10 sec: 54065.9, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 5332107264. Throughput: 0: 50670.4. Samples: 3084885860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 02:00:54,055][49750] Updated weights for policy 0, policy_version 325451 (0.0032) [2024-04-27 02:00:56,668][49750] Updated weights for policy 0, policy_version 325461 (0.0032) [2024-04-27 02:00:57,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 5332353024. Throughput: 0: 50725.3. Samples: 3085191180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:00:57,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-27 02:01:00,704][49750] Updated weights for policy 0, policy_version 325471 (0.0035) [2024-04-27 02:01:02,063][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5332582400. Throughput: 0: 50620.3. Samples: 3085496940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:01:02,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 02:01:03,020][49750] Updated weights for policy 0, policy_version 325481 (0.0034) [2024-04-27 02:01:07,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5332828160. Throughput: 0: 50555.7. Samples: 3085645000. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:01:07,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-27 02:01:07,157][49750] Updated weights for policy 0, policy_version 325491 (0.0033) [2024-04-27 02:01:08,751][49728] Signal inference workers to stop experience collection... (46400 times) [2024-04-27 02:01:08,752][49728] Signal inference workers to resume experience collection... (46400 times) [2024-04-27 02:01:08,768][49750] InferenceWorker_p0-w0: stopping experience collection (46400 times) [2024-04-27 02:01:08,769][49750] InferenceWorker_p0-w0: resuming experience collection (46400 times) [2024-04-27 02:01:09,367][49750] Updated weights for policy 0, policy_version 325501 (0.0032) [2024-04-27 02:01:12,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5333090304. Throughput: 0: 50601.1. Samples: 3085948640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:01:12,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 02:01:13,608][49750] Updated weights for policy 0, policy_version 325511 (0.0038) [2024-04-27 02:01:15,947][49750] Updated weights for policy 0, policy_version 325521 (0.0032) [2024-04-27 02:01:17,063][49517] Fps is (10 sec: 52427.5, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 5333352448. Throughput: 0: 50526.0. Samples: 3086249560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:01:17,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-27 02:01:20,124][49750] Updated weights for policy 0, policy_version 325531 (0.0033) [2024-04-27 02:01:22,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51336.5, 300 sec: 50818.1). Total num frames: 5333630976. Throughput: 0: 50742.0. Samples: 3086412360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 02:01:22,063][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 02:01:22,454][49750] Updated weights for policy 0, policy_version 325541 (0.0031) [2024-04-27 02:01:26,739][49750] Updated weights for policy 0, policy_version 325551 (0.0035) [2024-04-27 02:01:27,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5333843968. Throughput: 0: 50616.0. Samples: 3086713240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:01:27,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-27 02:01:28,780][49750] Updated weights for policy 0, policy_version 325561 (0.0033) [2024-04-27 02:01:32,063][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5334106112. Throughput: 0: 50603.6. Samples: 3087017640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:01:32,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 02:01:33,239][49750] Updated weights for policy 0, policy_version 325571 (0.0037) [2024-04-27 02:01:35,248][49750] Updated weights for policy 0, policy_version 325581 (0.0029) [2024-04-27 02:01:37,063][49517] Fps is (10 sec: 52427.6, 60 sec: 50517.1, 300 sec: 50818.1). Total num frames: 5334368256. Throughput: 0: 50645.4. Samples: 3087164900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:01:37,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 02:01:39,571][49750] Updated weights for policy 0, policy_version 325591 (0.0036) [2024-04-27 02:01:41,754][49750] Updated weights for policy 0, policy_version 325601 (0.0036) [2024-04-27 02:01:42,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5334646784. Throughput: 0: 50780.5. Samples: 3087476300. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:01:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 02:01:46,194][49750] Updated weights for policy 0, policy_version 325611 (0.0030) [2024-04-27 02:01:47,063][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5334876160. Throughput: 0: 50877.3. Samples: 3087786420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:01:47,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 02:01:48,554][49750] Updated weights for policy 0, policy_version 325621 (0.0028) [2024-04-27 02:01:52,062][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.4, 300 sec: 50707.1). Total num frames: 5335105536. Throughput: 0: 50655.0. Samples: 3087924480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:01:52,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 02:01:52,642][49750] Updated weights for policy 0, policy_version 325631 (0.0032) [2024-04-27 02:01:54,906][49750] Updated weights for policy 0, policy_version 325641 (0.0032) [2024-04-27 02:01:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5335384064. Throughput: 0: 50816.7. Samples: 3088235400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:01:57,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 02:01:59,075][49750] Updated weights for policy 0, policy_version 325651 (0.0034) [2024-04-27 02:02:01,214][49750] Updated weights for policy 0, policy_version 325661 (0.0033) [2024-04-27 02:02:02,063][49517] Fps is (10 sec: 54066.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5335646208. Throughput: 0: 50706.3. Samples: 3088531340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:02:02,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 02:02:05,482][49750] Updated weights for policy 0, policy_version 325671 (0.0032) [2024-04-27 02:02:07,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 5335908352. Throughput: 0: 50623.2. Samples: 3088690400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:02:07,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 02:02:07,762][49750] Updated weights for policy 0, policy_version 325681 (0.0031) [2024-04-27 02:02:12,014][49728] Signal inference workers to stop experience collection... (46450 times) [2024-04-27 02:02:12,061][49750] InferenceWorker_p0-w0: stopping experience collection (46450 times) [2024-04-27 02:02:12,063][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.2, 300 sec: 50651.5). Total num frames: 5336104960. Throughput: 0: 50774.1. Samples: 3088998080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:02:12,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 02:02:12,083][49728] Signal inference workers to resume experience collection... (46450 times) [2024-04-27 02:02:12,084][49750] InferenceWorker_p0-w0: resuming experience collection (46450 times) [2024-04-27 02:02:12,088][49750] Updated weights for policy 0, policy_version 325691 (0.0028) [2024-04-27 02:02:14,254][49750] Updated weights for policy 0, policy_version 325701 (0.0023) [2024-04-27 02:02:17,063][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5336383488. Throughput: 0: 50720.0. Samples: 3089300040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:02:17,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 02:02:18,435][49750] Updated weights for policy 0, policy_version 325711 (0.0026) [2024-04-27 02:02:20,514][49750] Updated weights for policy 0, policy_version 325721 (0.0039) [2024-04-27 02:02:22,062][49517] Fps is (10 sec: 55705.7, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5336662016. Throughput: 0: 50691.3. Samples: 3089446000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:02:22,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 02:02:24,849][49750] Updated weights for policy 0, policy_version 325731 (0.0030) [2024-04-27 02:02:26,732][49750] Updated weights for policy 0, policy_version 325741 (0.0034) [2024-04-27 02:02:27,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51609.4, 300 sec: 50818.2). Total num frames: 5336940544. Throughput: 0: 50775.3. Samples: 3089761200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:02:27,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 02:02:31,217][49750] Updated weights for policy 0, policy_version 325751 (0.0031) [2024-04-27 02:02:32,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5337169920. Throughput: 0: 50824.5. Samples: 3090073520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:02:32,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 02:02:33,352][49750] Updated weights for policy 0, policy_version 325761 (0.0034) [2024-04-27 02:02:37,063][49517] Fps is (10 sec: 45875.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5337399296. Throughput: 0: 50894.9. Samples: 3090214760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 02:02:37,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 02:02:37,709][49750] Updated weights for policy 0, policy_version 325771 (0.0030) [2024-04-27 02:02:40,124][49750] Updated weights for policy 0, policy_version 325781 (0.0031) [2024-04-27 02:02:42,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 5337677824. Throughput: 0: 50662.7. Samples: 3090515220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:02:42,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 02:02:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000325786_5337677824.pth... [2024-04-27 02:02:42,123][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000325043_5325504512.pth [2024-04-27 02:02:44,121][49750] Updated weights for policy 0, policy_version 325791 (0.0032) [2024-04-27 02:02:46,521][49750] Updated weights for policy 0, policy_version 325801 (0.0028) [2024-04-27 02:02:47,063][49517] Fps is (10 sec: 54068.0, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5337939968. Throughput: 0: 50906.3. Samples: 3090822120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:02:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:02:50,536][49750] Updated weights for policy 0, policy_version 325811 (0.0034) [2024-04-27 02:02:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51336.6, 300 sec: 50707.1). Total num frames: 5338185728. Throughput: 0: 50933.0. Samples: 3090982380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:02:52,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 02:02:52,966][49750] Updated weights for policy 0, policy_version 325821 (0.0031) [2024-04-27 02:02:56,886][49750] Updated weights for policy 0, policy_version 325831 (0.0031) [2024-04-27 02:02:57,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.4, 300 sec: 50707.1). 
Total num frames: 5338415104. Throughput: 0: 50825.3. Samples: 3091285220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:02:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 02:02:59,217][49750] Updated weights for policy 0, policy_version 325841 (0.0032) [2024-04-27 02:03:02,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5338677248. Throughput: 0: 50878.7. Samples: 3091589580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:02,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 02:03:03,203][49750] Updated weights for policy 0, policy_version 325851 (0.0033) [2024-04-27 02:03:05,778][49750] Updated weights for policy 0, policy_version 325861 (0.0037) [2024-04-27 02:03:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5338939392. Throughput: 0: 50997.3. Samples: 3091740880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:07,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-27 02:03:09,762][49750] Updated weights for policy 0, policy_version 325871 (0.0035) [2024-04-27 02:03:12,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51882.7, 300 sec: 50818.2). Total num frames: 5339217920. Throughput: 0: 50853.2. Samples: 3092049580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:12,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 02:03:12,241][49750] Updated weights for policy 0, policy_version 325881 (0.0034) [2024-04-27 02:03:16,075][49750] Updated weights for policy 0, policy_version 325891 (0.0028) [2024-04-27 02:03:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5339447296. Throughput: 0: 50634.1. Samples: 3092352060. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:17,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 02:03:18,562][49750] Updated weights for policy 0, policy_version 325901 (0.0028) [2024-04-27 02:03:22,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5339676672. Throughput: 0: 50796.7. Samples: 3092500600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:22,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 02:03:22,684][49750] Updated weights for policy 0, policy_version 325911 (0.0034) [2024-04-27 02:03:22,695][49728] Signal inference workers to stop experience collection... (46500 times) [2024-04-27 02:03:22,695][49728] Signal inference workers to resume experience collection... (46500 times) [2024-04-27 02:03:22,707][49750] InferenceWorker_p0-w0: stopping experience collection (46500 times) [2024-04-27 02:03:22,726][49750] InferenceWorker_p0-w0: resuming experience collection (46500 times) [2024-04-27 02:03:25,046][49750] Updated weights for policy 0, policy_version 325921 (0.0033) [2024-04-27 02:03:27,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 5339971584. Throughput: 0: 50810.4. Samples: 3092801680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:27,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 02:03:29,084][49750] Updated weights for policy 0, policy_version 325931 (0.0026) [2024-04-27 02:03:31,418][49750] Updated weights for policy 0, policy_version 325941 (0.0033) [2024-04-27 02:03:32,063][49517] Fps is (10 sec: 55704.4, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5340233728. Throughput: 0: 50625.7. Samples: 3093100280. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:32,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 02:03:35,510][49750] Updated weights for policy 0, policy_version 325951 (0.0031) [2024-04-27 02:03:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.7, 300 sec: 50707.1). Total num frames: 5340463104. Throughput: 0: 50847.1. Samples: 3093270500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 02:03:37,856][49750] Updated weights for policy 0, policy_version 325961 (0.0034) [2024-04-27 02:03:41,981][49750] Updated weights for policy 0, policy_version 325971 (0.0030) [2024-04-27 02:03:42,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5340708864. Throughput: 0: 50773.9. Samples: 3093570040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:42,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:03:44,657][49750] Updated weights for policy 0, policy_version 325981 (0.0030) [2024-04-27 02:03:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5340954624. Throughput: 0: 50649.9. Samples: 3093868820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:47,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 02:03:48,350][49750] Updated weights for policy 0, policy_version 325991 (0.0035) [2024-04-27 02:03:51,088][49750] Updated weights for policy 0, policy_version 326001 (0.0035) [2024-04-27 02:03:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5341216768. Throughput: 0: 50799.2. Samples: 3094026840. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:03:52,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-27 02:03:54,766][49750] Updated weights for policy 0, policy_version 326011 (0.0038) [2024-04-27 02:03:57,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.6, 300 sec: 50762.7). Total num frames: 5341495296. Throughput: 0: 50701.8. Samples: 3094331160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:03:57,063][49517] Avg episode reward: [(0, '0.660')] [2024-04-27 02:03:57,469][49750] Updated weights for policy 0, policy_version 326021 (0.0036) [2024-04-27 02:04:01,088][49750] Updated weights for policy 0, policy_version 326031 (0.0033) [2024-04-27 02:04:02,062][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5341724672. Throughput: 0: 50819.2. Samples: 3094638920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:02,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:04:03,924][49750] Updated weights for policy 0, policy_version 326041 (0.0040) [2024-04-27 02:04:07,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 5341970432. Throughput: 0: 50821.8. Samples: 3094787580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:07,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-27 02:04:07,662][49750] Updated weights for policy 0, policy_version 326051 (0.0034) [2024-04-27 02:04:10,330][49750] Updated weights for policy 0, policy_version 326061 (0.0028) [2024-04-27 02:04:12,063][49517] Fps is (10 sec: 50789.5, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 5342232576. Throughput: 0: 50714.0. Samples: 3095083820. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:12,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 02:04:14,075][49750] Updated weights for policy 0, policy_version 326071 (0.0032) [2024-04-27 02:04:15,322][49728] Signal inference workers to stop experience collection... (46550 times) [2024-04-27 02:04:15,327][49728] Signal inference workers to resume experience collection... (46550 times) [2024-04-27 02:04:15,341][49750] InferenceWorker_p0-w0: stopping experience collection (46550 times) [2024-04-27 02:04:15,341][49750] InferenceWorker_p0-w0: resuming experience collection (46550 times) [2024-04-27 02:04:16,850][49750] Updated weights for policy 0, policy_version 326081 (0.0025) [2024-04-27 02:04:17,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5342511104. Throughput: 0: 50965.9. Samples: 3095393740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:17,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-27 02:04:20,465][49750] Updated weights for policy 0, policy_version 326091 (0.0032) [2024-04-27 02:04:22,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 5342756864. Throughput: 0: 50758.5. Samples: 3095554640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:22,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 02:04:23,373][49750] Updated weights for policy 0, policy_version 326101 (0.0029) [2024-04-27 02:04:27,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.2, 300 sec: 50707.1). Total num frames: 5342986240. Throughput: 0: 50717.3. Samples: 3095852320. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:27,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-27 02:04:27,102][49750] Updated weights for policy 0, policy_version 326111 (0.0037) [2024-04-27 02:04:29,773][49750] Updated weights for policy 0, policy_version 326121 (0.0028) [2024-04-27 02:04:32,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 5343248384. Throughput: 0: 50903.9. Samples: 3096159500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:32,063][49517] Avg episode reward: [(0, '0.645')] [2024-04-27 02:04:33,745][49750] Updated weights for policy 0, policy_version 326131 (0.0032) [2024-04-27 02:04:36,250][49750] Updated weights for policy 0, policy_version 326141 (0.0033) [2024-04-27 02:04:37,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5343510528. Throughput: 0: 50657.7. Samples: 3096306440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:37,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 02:04:40,036][49750] Updated weights for policy 0, policy_version 326151 (0.0031) [2024-04-27 02:04:42,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5343772672. Throughput: 0: 50854.5. Samples: 3096619620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:42,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 02:04:42,076][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000326159_5343789056.pth... [2024-04-27 02:04:42,126][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000325414_5331582976.pth [2024-04-27 02:04:42,741][49750] Updated weights for policy 0, policy_version 326161 (0.0035) [2024-04-27 02:04:46,362][49750] Updated weights for policy 0, policy_version 326171 (0.0029) [2024-04-27 02:04:47,062][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.5, 300 sec: 50762.6). 
Total num frames: 5344018432. Throughput: 0: 50696.1. Samples: 3096920240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:47,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 02:04:49,164][49750] Updated weights for policy 0, policy_version 326181 (0.0031) [2024-04-27 02:04:52,063][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 5344247808. Throughput: 0: 50675.7. Samples: 3097068000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:52,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 02:04:52,857][49750] Updated weights for policy 0, policy_version 326191 (0.0036) [2024-04-27 02:04:55,741][49750] Updated weights for policy 0, policy_version 326201 (0.0033) [2024-04-27 02:04:57,062][49517] Fps is (10 sec: 50789.7, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5344526336. Throughput: 0: 50880.6. Samples: 3097373440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:04:57,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-27 02:04:59,356][49750] Updated weights for policy 0, policy_version 326211 (0.0039) [2024-04-27 02:05:02,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5344772096. Throughput: 0: 50637.9. Samples: 3097672440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:05:02,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 02:05:02,480][49750] Updated weights for policy 0, policy_version 326221 (0.0034) [2024-04-27 02:05:05,838][49750] Updated weights for policy 0, policy_version 326231 (0.0032) [2024-04-27 02:05:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.4, 300 sec: 50707.1). Total num frames: 5345034240. Throughput: 0: 50389.9. Samples: 3097822180. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:05:07,063][49517] Avg episode reward: [(0, '0.510')] [2024-04-27 02:05:08,896][49750] Updated weights for policy 0, policy_version 326241 (0.0031) [2024-04-27 02:05:12,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.6, 300 sec: 50707.1). Total num frames: 5345280000. Throughput: 0: 50618.3. Samples: 3098130140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:12,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 02:05:12,186][49750] Updated weights for policy 0, policy_version 326251 (0.0033) [2024-04-27 02:05:15,439][49750] Updated weights for policy 0, policy_version 326261 (0.0035) [2024-04-27 02:05:17,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5345542144. Throughput: 0: 50781.8. Samples: 3098444680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:17,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 02:05:18,496][49750] Updated weights for policy 0, policy_version 326271 (0.0028) [2024-04-27 02:05:21,717][49750] Updated weights for policy 0, policy_version 326281 (0.0037) [2024-04-27 02:05:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5345787904. Throughput: 0: 50649.5. Samples: 3098585660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:22,063][49517] Avg episode reward: [(0, '0.668')] [2024-04-27 02:05:25,067][49750] Updated weights for policy 0, policy_version 326291 (0.0030) [2024-04-27 02:05:26,013][49728] Signal inference workers to stop experience collection... (46600 times) [2024-04-27 02:05:26,013][49728] Signal inference workers to resume experience collection... 
(46600 times) [2024-04-27 02:05:26,040][49750] InferenceWorker_p0-w0: stopping experience collection (46600 times) [2024-04-27 02:05:26,040][49750] InferenceWorker_p0-w0: resuming experience collection (46600 times) [2024-04-27 02:05:27,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5346050048. Throughput: 0: 50409.4. Samples: 3098888040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:27,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 02:05:28,104][49750] Updated weights for policy 0, policy_version 326301 (0.0028) [2024-04-27 02:05:31,560][49750] Updated weights for policy 0, policy_version 326311 (0.0033) [2024-04-27 02:05:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5346312192. Throughput: 0: 50670.2. Samples: 3099200400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:32,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 02:05:34,636][49750] Updated weights for policy 0, policy_version 326321 (0.0032) [2024-04-27 02:05:37,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5346525184. Throughput: 0: 50801.1. Samples: 3099354040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:37,063][49517] Avg episode reward: [(0, '0.692')] [2024-04-27 02:05:37,901][49750] Updated weights for policy 0, policy_version 326331 (0.0027) [2024-04-27 02:05:41,158][49750] Updated weights for policy 0, policy_version 326341 (0.0031) [2024-04-27 02:05:42,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 5346803712. Throughput: 0: 50759.2. Samples: 3099657600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:42,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 02:05:44,401][49750] Updated weights for policy 0, policy_version 326351 (0.0030) [2024-04-27 02:05:47,063][49517] Fps is (10 sec: 54066.6, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5347065856. Throughput: 0: 50828.3. Samples: 3099959720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:47,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 02:05:47,499][49750] Updated weights for policy 0, policy_version 326361 (0.0028) [2024-04-27 02:05:50,959][49750] Updated weights for policy 0, policy_version 326371 (0.0026) [2024-04-27 02:05:52,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.7, 300 sec: 50762.6). Total num frames: 5347328000. Throughput: 0: 50882.2. Samples: 3100111880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:52,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 02:05:54,113][49750] Updated weights for policy 0, policy_version 326381 (0.0028) [2024-04-27 02:05:57,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5347557376. Throughput: 0: 50664.8. Samples: 3100410060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:05:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 02:05:57,331][49750] Updated weights for policy 0, policy_version 326391 (0.0030) [2024-04-27 02:06:00,540][49750] Updated weights for policy 0, policy_version 326401 (0.0040) [2024-04-27 02:06:02,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5347835904. Throughput: 0: 50536.8. Samples: 3100718840. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:06:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 02:06:03,696][49750] Updated weights for policy 0, policy_version 326411 (0.0030) [2024-04-27 02:06:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5348065280. Throughput: 0: 50754.7. Samples: 3100869620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:06:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 02:06:07,072][49750] Updated weights for policy 0, policy_version 326421 (0.0027) [2024-04-27 02:06:10,118][49750] Updated weights for policy 0, policy_version 326431 (0.0034) [2024-04-27 02:06:12,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5348327424. Throughput: 0: 50648.9. Samples: 3101167240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:06:12,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 02:06:13,794][49750] Updated weights for policy 0, policy_version 326441 (0.0033) [2024-04-27 02:06:16,585][49750] Updated weights for policy 0, policy_version 326451 (0.0025) [2024-04-27 02:06:17,063][49517] Fps is (10 sec: 52427.3, 60 sec: 50790.2, 300 sec: 50707.1). Total num frames: 5348589568. Throughput: 0: 50522.8. Samples: 3101473940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:06:17,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-27 02:06:19,482][49728] Signal inference workers to stop experience collection... (46650 times) [2024-04-27 02:06:19,482][49728] Signal inference workers to resume experience collection... 
(46650 times) [2024-04-27 02:06:19,494][49750] InferenceWorker_p0-w0: stopping experience collection (46650 times) [2024-04-27 02:06:19,497][49750] InferenceWorker_p0-w0: resuming experience collection (46650 times) [2024-04-27 02:06:20,258][49750] Updated weights for policy 0, policy_version 326461 (0.0034) [2024-04-27 02:06:22,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5348818944. Throughput: 0: 50683.6. Samples: 3101634800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:06:22,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 02:06:22,930][49750] Updated weights for policy 0, policy_version 326471 (0.0031) [2024-04-27 02:06:26,632][49750] Updated weights for policy 0, policy_version 326481 (0.0031) [2024-04-27 02:06:27,062][49517] Fps is (10 sec: 49153.5, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 5349081088. Throughput: 0: 50659.6. Samples: 3101937280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:06:27,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:06:29,391][49750] Updated weights for policy 0, policy_version 326491 (0.0026) [2024-04-27 02:06:32,063][49517] Fps is (10 sec: 50788.2, 60 sec: 50243.9, 300 sec: 50707.1). Total num frames: 5349326848. Throughput: 0: 50669.9. Samples: 3102239880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:06:32,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 02:06:33,010][49750] Updated weights for policy 0, policy_version 326501 (0.0028) [2024-04-27 02:06:35,888][49750] Updated weights for policy 0, policy_version 326511 (0.0028) [2024-04-27 02:06:37,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 5349605376. Throughput: 0: 50803.9. Samples: 3102398060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:06:37,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-27 02:06:39,360][49750] Updated weights for policy 0, policy_version 326521 (0.0030) [2024-04-27 02:06:42,062][49517] Fps is (10 sec: 52431.0, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5349851136. Throughput: 0: 50899.2. Samples: 3102700520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:06:42,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-27 02:06:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000326529_5349851136.pth... [2024-04-27 02:06:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000325786_5337677824.pth [2024-04-27 02:06:42,525][49750] Updated weights for policy 0, policy_version 326531 (0.0036) [2024-04-27 02:06:45,801][49750] Updated weights for policy 0, policy_version 326541 (0.0035) [2024-04-27 02:06:47,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5350080512. Throughput: 0: 50653.8. Samples: 3102998260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:06:47,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 02:06:48,937][49750] Updated weights for policy 0, policy_version 326551 (0.0027) [2024-04-27 02:06:52,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50707.1). Total num frames: 5350342656. Throughput: 0: 50679.1. Samples: 3103150180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:06:52,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 02:06:52,292][49750] Updated weights for policy 0, policy_version 326561 (0.0033) [2024-04-27 02:06:55,417][49750] Updated weights for policy 0, policy_version 326571 (0.0030) [2024-04-27 02:06:57,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5350604800. Throughput: 0: 50831.2. Samples: 3103454640. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:06:57,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-27 02:06:58,597][49750] Updated weights for policy 0, policy_version 326581 (0.0036) [2024-04-27 02:07:01,642][49750] Updated weights for policy 0, policy_version 326591 (0.0029) [2024-04-27 02:07:02,067][49517] Fps is (10 sec: 54044.3, 60 sec: 50786.9, 300 sec: 50761.9). Total num frames: 5350883328. Throughput: 0: 50862.2. Samples: 3103762940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:07:02,067][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 02:07:05,063][49750] Updated weights for policy 0, policy_version 326601 (0.0028) [2024-04-27 02:07:07,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 5351112704. Throughput: 0: 50855.3. Samples: 3103923300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:07:07,063][49517] Avg episode reward: [(0, '0.650')] [2024-04-27 02:07:08,194][49750] Updated weights for policy 0, policy_version 326611 (0.0035) [2024-04-27 02:07:11,674][49750] Updated weights for policy 0, policy_version 326621 (0.0025) [2024-04-27 02:07:12,062][49517] Fps is (10 sec: 49172.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5351374848. Throughput: 0: 50838.6. Samples: 3104225020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:07:12,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 02:07:14,527][49750] Updated weights for policy 0, policy_version 326631 (0.0028) [2024-04-27 02:07:17,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50651.5). Total num frames: 5351604224. Throughput: 0: 50694.0. Samples: 3104521100. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:07:17,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 02:07:18,078][49750] Updated weights for policy 0, policy_version 326641 (0.0029) [2024-04-27 02:07:20,970][49750] Updated weights for policy 0, policy_version 326651 (0.0034) [2024-04-27 02:07:22,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.4, 300 sec: 50651.6). Total num frames: 5351882752. Throughput: 0: 50737.8. Samples: 3104681260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:07:22,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 02:07:24,385][49750] Updated weights for policy 0, policy_version 326661 (0.0036) [2024-04-27 02:07:27,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5352144896. Throughput: 0: 50813.7. Samples: 3104987140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:07:27,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 02:07:27,238][49728] Signal inference workers to stop experience collection... (46700 times) [2024-04-27 02:07:27,239][49728] Signal inference workers to resume experience collection... (46700 times) [2024-04-27 02:07:27,255][49750] InferenceWorker_p0-w0: stopping experience collection (46700 times) [2024-04-27 02:07:27,255][49750] InferenceWorker_p0-w0: resuming experience collection (46700 times) [2024-04-27 02:07:27,366][49750] Updated weights for policy 0, policy_version 326671 (0.0040) [2024-04-27 02:07:30,959][49750] Updated weights for policy 0, policy_version 326681 (0.0031) [2024-04-27 02:07:32,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50517.6, 300 sec: 50707.1). Total num frames: 5352357888. Throughput: 0: 50832.9. Samples: 3105285740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:07:32,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 02:07:33,872][49750] Updated weights for policy 0, policy_version 326691 (0.0033) [2024-04-27 02:07:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5352636416. Throughput: 0: 50751.5. Samples: 3105434000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 02:07:37,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 02:07:37,530][49750] Updated weights for policy 0, policy_version 326701 (0.0038) [2024-04-27 02:07:40,309][49750] Updated weights for policy 0, policy_version 326711 (0.0028) [2024-04-27 02:07:42,063][49517] Fps is (10 sec: 54066.9, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5352898560. Throughput: 0: 50716.8. Samples: 3105736900. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:07:42,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 02:07:43,823][49750] Updated weights for policy 0, policy_version 326721 (0.0030) [2024-04-27 02:07:46,697][49750] Updated weights for policy 0, policy_version 326731 (0.0026) [2024-04-27 02:07:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 5353160704. Throughput: 0: 50751.0. Samples: 3106046520. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:07:47,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 02:07:50,380][49750] Updated weights for policy 0, policy_version 326741 (0.0035) [2024-04-27 02:07:52,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5353373696. Throughput: 0: 50581.5. Samples: 3106199460. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:07:52,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 02:07:53,115][49750] Updated weights for policy 0, policy_version 326751 (0.0037) [2024-04-27 02:07:56,874][49750] Updated weights for policy 0, policy_version 326761 (0.0037) [2024-04-27 02:07:57,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5353652224. Throughput: 0: 50638.0. Samples: 3106503740. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:07:57,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 02:07:59,567][49750] Updated weights for policy 0, policy_version 326771 (0.0030) [2024-04-27 02:08:02,063][49517] Fps is (10 sec: 54066.0, 60 sec: 50520.7, 300 sec: 50762.6). Total num frames: 5353914368. Throughput: 0: 51067.9. Samples: 3106819160. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:02,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-27 02:08:03,114][49750] Updated weights for policy 0, policy_version 326781 (0.0030) [2024-04-27 02:08:05,913][49750] Updated weights for policy 0, policy_version 326791 (0.0029) [2024-04-27 02:08:07,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.6, 300 sec: 50707.1). Total num frames: 5354176512. Throughput: 0: 50887.6. Samples: 3106971200. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:07,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 02:08:09,575][49750] Updated weights for policy 0, policy_version 326801 (0.0032) [2024-04-27 02:08:12,062][49517] Fps is (10 sec: 52430.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5354438656. Throughput: 0: 50973.0. Samples: 3107280920. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:12,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 02:08:12,398][49750] Updated weights for policy 0, policy_version 326811 (0.0035) [2024-04-27 02:08:15,949][49750] Updated weights for policy 0, policy_version 326821 (0.0029) [2024-04-27 02:08:17,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5354651648. Throughput: 0: 51049.8. Samples: 3107582980. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:17,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 02:08:18,407][49728] Signal inference workers to stop experience collection... (46750 times) [2024-04-27 02:08:18,407][49728] Signal inference workers to resume experience collection... (46750 times) [2024-04-27 02:08:18,421][49750] InferenceWorker_p0-w0: stopping experience collection (46750 times) [2024-04-27 02:08:18,421][49750] InferenceWorker_p0-w0: resuming experience collection (46750 times) [2024-04-27 02:08:18,952][49750] Updated weights for policy 0, policy_version 326831 (0.0032) [2024-04-27 02:08:22,063][49517] Fps is (10 sec: 47512.7, 60 sec: 50517.2, 300 sec: 50651.5). Total num frames: 5354913792. Throughput: 0: 50943.0. Samples: 3107726440. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:22,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 02:08:22,645][49750] Updated weights for policy 0, policy_version 326841 (0.0029) [2024-04-27 02:08:25,434][49750] Updated weights for policy 0, policy_version 326851 (0.0029) [2024-04-27 02:08:27,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50517.4, 300 sec: 50651.6). Total num frames: 5355175936. Throughput: 0: 50936.6. Samples: 3108029040. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:27,063][49517] Avg episode reward: [(0, '0.515')] [2024-04-27 02:08:29,050][49750] Updated weights for policy 0, policy_version 326861 (0.0032) [2024-04-27 02:08:31,825][49750] Updated weights for policy 0, policy_version 326871 (0.0030) [2024-04-27 02:08:32,062][49517] Fps is (10 sec: 54068.2, 60 sec: 51609.7, 300 sec: 50818.2). Total num frames: 5355454464. Throughput: 0: 50931.5. Samples: 3108338440. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:32,063][49517] Avg episode reward: [(0, '0.671')] [2024-04-27 02:08:35,313][49750] Updated weights for policy 0, policy_version 326881 (0.0029) [2024-04-27 02:08:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5355700224. Throughput: 0: 51025.9. Samples: 3108495620. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-27 02:08:38,261][49750] Updated weights for policy 0, policy_version 326891 (0.0029) [2024-04-27 02:08:41,775][49750] Updated weights for policy 0, policy_version 326901 (0.0029) [2024-04-27 02:08:42,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5355945984. Throughput: 0: 51069.3. Samples: 3108801860. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:42,063][49517] Avg episode reward: [(0, '0.678')] [2024-04-27 02:08:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000326901_5355945984.pth... [2024-04-27 02:08:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000326159_5343789056.pth [2024-04-27 02:08:44,693][49750] Updated weights for policy 0, policy_version 326911 (0.0032) [2024-04-27 02:08:47,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5356208128. Throughput: 0: 50806.9. Samples: 3109105460. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:47,063][49517] Avg episode reward: [(0, '0.698')] [2024-04-27 02:08:48,255][49750] Updated weights for policy 0, policy_version 326921 (0.0037) [2024-04-27 02:08:51,040][49750] Updated weights for policy 0, policy_version 326931 (0.0033) [2024-04-27 02:08:52,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51336.4, 300 sec: 50707.1). Total num frames: 5356453888. Throughput: 0: 51040.3. Samples: 3109268020. Policy #0 lag: (min: 1.0, avg: 12.5, max: 24.0) [2024-04-27 02:08:52,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 02:08:54,532][49750] Updated weights for policy 0, policy_version 326941 (0.0032) [2024-04-27 02:08:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5356732416. Throughput: 0: 51020.8. Samples: 3109576860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:08:57,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 02:08:57,950][49750] Updated weights for policy 0, policy_version 326951 (0.0032) [2024-04-27 02:09:00,804][49750] Updated weights for policy 0, policy_version 326961 (0.0029) [2024-04-27 02:09:02,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50818.1). Total num frames: 5356961792. Throughput: 0: 51051.0. Samples: 3109880280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:02,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-27 02:09:04,208][49750] Updated weights for policy 0, policy_version 326971 (0.0035) [2024-04-27 02:09:07,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5357223936. Throughput: 0: 51219.6. Samples: 3110031320. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:07,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 02:09:07,284][49750] Updated weights for policy 0, policy_version 326981 (0.0032) [2024-04-27 02:09:10,552][49750] Updated weights for policy 0, policy_version 326991 (0.0034) [2024-04-27 02:09:12,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5357469696. Throughput: 0: 51196.7. Samples: 3110332900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:12,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-27 02:09:13,854][49750] Updated weights for policy 0, policy_version 327001 (0.0034) [2024-04-27 02:09:16,991][49750] Updated weights for policy 0, policy_version 327011 (0.0029) [2024-04-27 02:09:17,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 5357748224. Throughput: 0: 50969.7. Samples: 3110632080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:17,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 02:09:20,248][49750] Updated weights for policy 0, policy_version 327021 (0.0030) [2024-04-27 02:09:22,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51609.8, 300 sec: 50929.3). Total num frames: 5358010368. Throughput: 0: 51086.6. Samples: 3110794520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:22,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-27 02:09:23,484][49750] Updated weights for policy 0, policy_version 327031 (0.0030) [2024-04-27 02:09:26,877][49750] Updated weights for policy 0, policy_version 327041 (0.0030) [2024-04-27 02:09:27,063][49517] Fps is (10 sec: 49151.8, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 5358239744. Throughput: 0: 51044.5. Samples: 3111098860. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:27,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 02:09:27,764][49728] Signal inference workers to stop experience collection... (46800 times) [2024-04-27 02:09:27,767][49728] Signal inference workers to resume experience collection... (46800 times) [2024-04-27 02:09:27,794][49750] InferenceWorker_p0-w0: stopping experience collection (46800 times) [2024-04-27 02:09:27,794][49750] InferenceWorker_p0-w0: resuming experience collection (46800 times) [2024-04-27 02:09:29,830][49750] Updated weights for policy 0, policy_version 327051 (0.0035) [2024-04-27 02:09:32,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.2, 300 sec: 50762.6). Total num frames: 5358485504. Throughput: 0: 50904.9. Samples: 3111396180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:32,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 02:09:33,220][49750] Updated weights for policy 0, policy_version 327061 (0.0039) [2024-04-27 02:09:36,093][49750] Updated weights for policy 0, policy_version 327071 (0.0034) [2024-04-27 02:09:37,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5358747648. Throughput: 0: 50846.0. Samples: 3111556080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:37,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 02:09:39,690][49750] Updated weights for policy 0, policy_version 327081 (0.0037) [2024-04-27 02:09:42,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 5359026176. Throughput: 0: 50872.0. Samples: 3111866100. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 02:09:42,790][49750] Updated weights for policy 0, policy_version 327091 (0.0038) [2024-04-27 02:09:46,023][49750] Updated weights for policy 0, policy_version 327101 (0.0035) [2024-04-27 02:09:47,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5359255552. Throughput: 0: 50870.4. Samples: 3112169440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:47,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 02:09:49,267][49750] Updated weights for policy 0, policy_version 327111 (0.0033) [2024-04-27 02:09:52,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5359517696. Throughput: 0: 50802.4. Samples: 3112317420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:52,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:09:52,327][49750] Updated weights for policy 0, policy_version 327121 (0.0032) [2024-04-27 02:09:55,912][49750] Updated weights for policy 0, policy_version 327131 (0.0039) [2024-04-27 02:09:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5359779840. Throughput: 0: 50927.7. Samples: 3112624640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:09:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 02:09:58,730][49750] Updated weights for policy 0, policy_version 327141 (0.0036) [2024-04-27 02:10:02,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 5360025600. Throughput: 0: 51094.1. Samples: 3112931320. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:10:02,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-27 02:10:02,172][49750] Updated weights for policy 0, policy_version 327151 (0.0032) [2024-04-27 02:10:05,092][49750] Updated weights for policy 0, policy_version 327161 (0.0035) [2024-04-27 02:10:07,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.7, 300 sec: 50929.2). Total num frames: 5360304128. Throughput: 0: 50916.4. Samples: 3113085760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:10:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:10:08,596][49750] Updated weights for policy 0, policy_version 327171 (0.0028) [2024-04-27 02:10:11,471][49750] Updated weights for policy 0, policy_version 327181 (0.0032) [2024-04-27 02:10:12,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5360549888. Throughput: 0: 51012.5. Samples: 3113394420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:12,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 02:10:14,883][49750] Updated weights for policy 0, policy_version 327191 (0.0035) [2024-04-27 02:10:17,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5360779264. Throughput: 0: 51343.8. Samples: 3113706640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 02:10:18,129][49750] Updated weights for policy 0, policy_version 327201 (0.0035) [2024-04-27 02:10:21,151][49750] Updated weights for policy 0, policy_version 327211 (0.0029) [2024-04-27 02:10:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5361041408. Throughput: 0: 50960.3. Samples: 3113849300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:22,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 02:10:24,543][49750] Updated weights for policy 0, policy_version 327221 (0.0026) [2024-04-27 02:10:27,063][49517] Fps is (10 sec: 54066.2, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5361319936. Throughput: 0: 50961.7. Samples: 3114159380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 02:10:27,577][49750] Updated weights for policy 0, policy_version 327231 (0.0032) [2024-04-27 02:10:30,829][49750] Updated weights for policy 0, policy_version 327241 (0.0034) [2024-04-27 02:10:32,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5361565696. Throughput: 0: 51124.8. Samples: 3114470060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:32,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-27 02:10:34,014][49750] Updated weights for policy 0, policy_version 327251 (0.0034) [2024-04-27 02:10:37,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5361827840. Throughput: 0: 51167.5. Samples: 3114619960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:37,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:10:37,204][49750] Updated weights for policy 0, policy_version 327261 (0.0029) [2024-04-27 02:10:40,362][49750] Updated weights for policy 0, policy_version 327271 (0.0033) [2024-04-27 02:10:42,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5362057216. Throughput: 0: 50987.5. Samples: 3114919080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:42,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 02:10:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000327274_5362057216.pth... 
[2024-04-27 02:10:42,118][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000326529_5349851136.pth [2024-04-27 02:10:42,378][49728] Signal inference workers to stop experience collection... (46850 times) [2024-04-27 02:10:42,381][49728] Signal inference workers to resume experience collection... (46850 times) [2024-04-27 02:10:42,417][49750] InferenceWorker_p0-w0: stopping experience collection (46850 times) [2024-04-27 02:10:42,417][49750] InferenceWorker_p0-w0: resuming experience collection (46850 times) [2024-04-27 02:10:43,723][49750] Updated weights for policy 0, policy_version 327281 (0.0032) [2024-04-27 02:10:46,665][49750] Updated weights for policy 0, policy_version 327291 (0.0035) [2024-04-27 02:10:47,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51336.4, 300 sec: 50873.7). Total num frames: 5362335744. Throughput: 0: 50968.0. Samples: 3115224880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 02:10:50,285][49750] Updated weights for policy 0, policy_version 327301 (0.0035) [2024-04-27 02:10:52,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 5362597888. Throughput: 0: 51023.5. Samples: 3115381820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:52,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 02:10:53,589][49750] Updated weights for policy 0, policy_version 327311 (0.0033) [2024-04-27 02:10:56,605][49750] Updated weights for policy 0, policy_version 327321 (0.0032) [2024-04-27 02:10:57,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5362827264. Throughput: 0: 50976.9. Samples: 3115688380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:10:57,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 02:11:00,046][49750] Updated weights for policy 0, policy_version 327331 (0.0031) [2024-04-27 02:11:02,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5363073024. Throughput: 0: 51032.3. Samples: 3116003100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:11:02,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 02:11:02,877][49750] Updated weights for policy 0, policy_version 327341 (0.0029) [2024-04-27 02:11:06,325][49750] Updated weights for policy 0, policy_version 327351 (0.0036) [2024-04-27 02:11:07,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 5363335168. Throughput: 0: 50975.3. Samples: 3116143200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:11:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 02:11:09,709][49750] Updated weights for policy 0, policy_version 327361 (0.0034) [2024-04-27 02:11:12,062][49517] Fps is (10 sec: 55706.1, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5363630080. Throughput: 0: 50931.7. Samples: 3116451300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:11:12,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 02:11:12,667][49750] Updated weights for policy 0, policy_version 327371 (0.0036) [2024-04-27 02:11:16,083][49750] Updated weights for policy 0, policy_version 327381 (0.0028) [2024-04-27 02:11:17,063][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 5363859456. Throughput: 0: 50867.1. Samples: 3116759080. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:11:17,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 02:11:19,066][49750] Updated weights for policy 0, policy_version 327391 (0.0037) [2024-04-27 02:11:22,063][49517] Fps is (10 sec: 47513.1, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5364105216. Throughput: 0: 50901.8. Samples: 3116910540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:11:22,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-27 02:11:22,551][49750] Updated weights for policy 0, policy_version 327401 (0.0029) [2024-04-27 02:11:25,526][49750] Updated weights for policy 0, policy_version 327411 (0.0034) [2024-04-27 02:11:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 5364350976. Throughput: 0: 50955.2. Samples: 3117212060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 02:11:27,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 02:11:28,936][49750] Updated weights for policy 0, policy_version 327421 (0.0031) [2024-04-27 02:11:31,831][49750] Updated weights for policy 0, policy_version 327431 (0.0031) [2024-04-27 02:11:32,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 5364629504. Throughput: 0: 50810.5. Samples: 3117511340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:11:32,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-27 02:11:35,391][49750] Updated weights for policy 0, policy_version 327441 (0.0037) [2024-04-27 02:11:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5364875264. Throughput: 0: 50870.3. Samples: 3117670980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:11:37,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 02:11:38,286][49750] Updated weights for policy 0, policy_version 327451 (0.0034) [2024-04-27 02:11:41,894][49750] Updated weights for policy 0, policy_version 327461 (0.0032) [2024-04-27 02:11:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.6, 300 sec: 50984.8). Total num frames: 5365121024. Throughput: 0: 50802.3. Samples: 3117974480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:11:42,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 02:11:43,614][49728] Signal inference workers to stop experience collection... (46900 times) [2024-04-27 02:11:43,615][49728] Signal inference workers to resume experience collection... (46900 times) [2024-04-27 02:11:43,628][49750] InferenceWorker_p0-w0: stopping experience collection (46900 times) [2024-04-27 02:11:43,628][49750] InferenceWorker_p0-w0: resuming experience collection (46900 times) [2024-04-27 02:11:44,814][49750] Updated weights for policy 0, policy_version 327471 (0.0031) [2024-04-27 02:11:47,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 5365383168. Throughput: 0: 50657.3. Samples: 3118282680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:11:47,072][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 02:11:48,269][49750] Updated weights for policy 0, policy_version 327481 (0.0029) [2024-04-27 02:11:51,405][49750] Updated weights for policy 0, policy_version 327491 (0.0030) [2024-04-27 02:11:52,063][49517] Fps is (10 sec: 50788.8, 60 sec: 50517.2, 300 sec: 50929.2). Total num frames: 5365628928. Throughput: 0: 50889.7. Samples: 3118433240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:11:52,072][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 02:11:54,718][49750] Updated weights for policy 0, policy_version 327501 (0.0028) [2024-04-27 02:11:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51336.6, 300 sec: 50930.0). Total num frames: 5365907456. Throughput: 0: 50770.2. Samples: 3118735960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:11:57,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 02:11:57,970][49750] Updated weights for policy 0, policy_version 327511 (0.0033) [2024-04-27 02:12:01,384][49750] Updated weights for policy 0, policy_version 327521 (0.0033) [2024-04-27 02:12:02,063][49517] Fps is (10 sec: 50791.1, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5366136832. Throughput: 0: 50679.1. Samples: 3119039640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:12:02,072][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 02:12:04,536][49750] Updated weights for policy 0, policy_version 327531 (0.0031) [2024-04-27 02:12:07,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5366382592. Throughput: 0: 50728.8. Samples: 3119193340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:12:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 02:12:07,687][49750] Updated weights for policy 0, policy_version 327541 (0.0030) [2024-04-27 02:12:10,804][49750] Updated weights for policy 0, policy_version 327551 (0.0033) [2024-04-27 02:12:12,062][49517] Fps is (10 sec: 49152.3, 60 sec: 49971.1, 300 sec: 50929.3). Total num frames: 5366628352. Throughput: 0: 50708.4. Samples: 3119493940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:12:12,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:12:14,096][49750] Updated weights for policy 0, policy_version 327561 (0.0029) [2024-04-27 02:12:17,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5366906880. Throughput: 0: 50733.6. Samples: 3119794360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:12:17,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 02:12:17,159][49750] Updated weights for policy 0, policy_version 327571 (0.0031) [2024-04-27 02:12:20,591][49750] Updated weights for policy 0, policy_version 327581 (0.0037) [2024-04-27 02:12:22,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5367169024. Throughput: 0: 50834.0. Samples: 3119958520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:12:22,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 02:12:23,603][49750] Updated weights for policy 0, policy_version 327591 (0.0032) [2024-04-27 02:12:27,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 5367398400. Throughput: 0: 50821.8. Samples: 3120261460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:12:27,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 02:12:27,140][49750] Updated weights for policy 0, policy_version 327601 (0.0030) [2024-04-27 02:12:30,057][49750] Updated weights for policy 0, policy_version 327611 (0.0037) [2024-04-27 02:12:32,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 5367660544. Throughput: 0: 50749.1. Samples: 3120566380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:12:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:12:33,566][49750] Updated weights for policy 0, policy_version 327621 (0.0031) [2024-04-27 02:12:36,358][49750] Updated weights for policy 0, policy_version 327631 (0.0036) [2024-04-27 02:12:37,062][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.4, 300 sec: 50929.3). Total num frames: 5367922688. Throughput: 0: 50751.4. Samples: 3120717040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:12:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:12:39,950][49750] Updated weights for policy 0, policy_version 327641 (0.0031) [2024-04-27 02:12:42,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 5368184832. Throughput: 0: 50848.3. Samples: 3121024140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 02:12:42,064][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 02:12:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000327648_5368184832.pth... [2024-04-27 02:12:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000326901_5355945984.pth [2024-04-27 02:12:43,141][49750] Updated weights for policy 0, policy_version 327651 (0.0030) [2024-04-27 02:12:46,445][49750] Updated weights for policy 0, policy_version 327661 (0.0033) [2024-04-27 02:12:47,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.4, 300 sec: 50984.8). Total num frames: 5368414208. Throughput: 0: 50883.7. Samples: 3121329400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:12:47,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:12:48,539][49728] Signal inference workers to stop experience collection... (46950 times) [2024-04-27 02:12:48,539][49728] Signal inference workers to resume experience collection... 
(46950 times) [2024-04-27 02:12:48,566][49750] InferenceWorker_p0-w0: stopping experience collection (46950 times) [2024-04-27 02:12:48,566][49750] InferenceWorker_p0-w0: resuming experience collection (46950 times) [2024-04-27 02:12:49,629][49750] Updated weights for policy 0, policy_version 327671 (0.0026) [2024-04-27 02:12:52,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 5368676352. Throughput: 0: 50879.5. Samples: 3121482920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:12:52,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-27 02:12:52,902][49750] Updated weights for policy 0, policy_version 327681 (0.0030) [2024-04-27 02:12:55,875][49750] Updated weights for policy 0, policy_version 327691 (0.0032) [2024-04-27 02:12:57,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 5368922112. Throughput: 0: 50965.8. Samples: 3121787400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:12:57,063][49517] Avg episode reward: [(0, '0.680')] [2024-04-27 02:12:59,134][49750] Updated weights for policy 0, policy_version 327701 (0.0029) [2024-04-27 02:13:02,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5369184256. Throughput: 0: 51061.9. Samples: 3122092140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:02,071][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 02:13:02,427][49750] Updated weights for policy 0, policy_version 327711 (0.0029) [2024-04-27 02:13:05,663][49750] Updated weights for policy 0, policy_version 327721 (0.0036) [2024-04-27 02:13:07,063][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5369462784. Throughput: 0: 50813.9. Samples: 3122245140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:13:08,804][49750] Updated weights for policy 0, policy_version 327731 (0.0031) [2024-04-27 02:13:12,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.4, 300 sec: 50984.8). Total num frames: 5369692160. Throughput: 0: 50867.8. Samples: 3122550520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:12,071][49517] Avg episode reward: [(0, '0.635')] [2024-04-27 02:13:12,179][49750] Updated weights for policy 0, policy_version 327741 (0.0033) [2024-04-27 02:13:15,295][49750] Updated weights for policy 0, policy_version 327751 (0.0032) [2024-04-27 02:13:17,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.6, 300 sec: 50984.8). Total num frames: 5369954304. Throughput: 0: 50887.2. Samples: 3122856300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:17,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 02:13:18,556][49750] Updated weights for policy 0, policy_version 327761 (0.0029) [2024-04-27 02:13:21,709][49750] Updated weights for policy 0, policy_version 327771 (0.0036) [2024-04-27 02:13:22,062][49517] Fps is (10 sec: 50791.1, 60 sec: 50517.5, 300 sec: 50929.2). Total num frames: 5370200064. Throughput: 0: 50823.2. Samples: 3123004080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:22,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 02:13:24,903][49750] Updated weights for policy 0, policy_version 327781 (0.0035) [2024-04-27 02:13:27,063][49517] Fps is (10 sec: 52427.8, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5370478592. Throughput: 0: 50867.2. Samples: 3123313160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:27,072][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 02:13:28,032][49750] Updated weights for policy 0, policy_version 327791 (0.0032) [2024-04-27 02:13:31,343][49750] Updated weights for policy 0, policy_version 327801 (0.0029) [2024-04-27 02:13:32,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5370724352. Throughput: 0: 50870.2. Samples: 3123618560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:32,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 02:13:34,491][49750] Updated weights for policy 0, policy_version 327811 (0.0034) [2024-04-27 02:13:37,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5370953728. Throughput: 0: 50875.3. Samples: 3123772300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:37,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-27 02:13:37,767][49750] Updated weights for policy 0, policy_version 327821 (0.0030) [2024-04-27 02:13:41,059][49750] Updated weights for policy 0, policy_version 327831 (0.0029) [2024-04-27 02:13:42,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 5371232256. Throughput: 0: 50869.7. Samples: 3124076540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:42,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-27 02:13:44,190][49750] Updated weights for policy 0, policy_version 327841 (0.0029) [2024-04-27 02:13:47,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5371478016. Throughput: 0: 50910.5. Samples: 3124383120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:47,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 02:13:47,534][49750] Updated weights for policy 0, policy_version 327851 (0.0030) [2024-04-27 02:13:50,619][49750] Updated weights for policy 0, policy_version 327861 (0.0028) [2024-04-27 02:13:52,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5371740160. Throughput: 0: 50915.5. Samples: 3124536340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:52,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 02:13:53,874][49750] Updated weights for policy 0, policy_version 327871 (0.0033) [2024-04-27 02:13:56,950][49750] Updated weights for policy 0, policy_version 327881 (0.0028) [2024-04-27 02:13:57,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5372002304. Throughput: 0: 50957.1. Samples: 3124843580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:13:57,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 02:14:00,301][49750] Updated weights for policy 0, policy_version 327891 (0.0030) [2024-04-27 02:14:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5372231680. Throughput: 0: 50886.1. Samples: 3125146180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:02,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 02:14:03,437][49750] Updated weights for policy 0, policy_version 327901 (0.0037) [2024-04-27 02:14:06,747][49750] Updated weights for policy 0, policy_version 327911 (0.0029) [2024-04-27 02:14:07,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50984.8). Total num frames: 5372510208. Throughput: 0: 50869.8. Samples: 3125293220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:07,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-27 02:14:09,347][49728] Signal inference workers to stop experience collection... (47000 times) [2024-04-27 02:14:09,347][49728] Signal inference workers to resume experience collection... (47000 times) [2024-04-27 02:14:09,364][49750] InferenceWorker_p0-w0: stopping experience collection (47000 times) [2024-04-27 02:14:09,364][49750] InferenceWorker_p0-w0: resuming experience collection (47000 times) [2024-04-27 02:14:09,847][49750] Updated weights for policy 0, policy_version 327921 (0.0037) [2024-04-27 02:14:12,063][49517] Fps is (10 sec: 52428.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5372755968. Throughput: 0: 50869.8. Samples: 3125602300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:12,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 02:14:13,094][49750] Updated weights for policy 0, policy_version 327931 (0.0031) [2024-04-27 02:14:16,292][49750] Updated weights for policy 0, policy_version 327941 (0.0030) [2024-04-27 02:14:17,063][49517] Fps is (10 sec: 49151.1, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 5373001728. Throughput: 0: 50832.3. Samples: 3125906020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 02:14:19,477][49750] Updated weights for policy 0, policy_version 327951 (0.0034) [2024-04-27 02:14:22,063][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5373263872. Throughput: 0: 50776.4. Samples: 3126057240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:22,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-27 02:14:22,790][49750] Updated weights for policy 0, policy_version 327961 (0.0036) [2024-04-27 02:14:25,838][49750] Updated weights for policy 0, policy_version 327971 (0.0033) [2024-04-27 02:14:27,063][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 5373509632. Throughput: 0: 50901.3. Samples: 3126367100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:27,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 02:14:29,454][49750] Updated weights for policy 0, policy_version 327981 (0.0031) [2024-04-27 02:14:32,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5373771776. Throughput: 0: 50861.8. Samples: 3126671900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:14:32,323][49750] Updated weights for policy 0, policy_version 327991 (0.0035) [2024-04-27 02:14:35,798][49750] Updated weights for policy 0, policy_version 328001 (0.0030) [2024-04-27 02:14:37,062][49517] Fps is (10 sec: 50791.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5374017536. Throughput: 0: 50921.5. Samples: 3126827800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:37,063][49517] Avg episode reward: [(0, '0.543')] [2024-04-27 02:14:38,725][49750] Updated weights for policy 0, policy_version 328011 (0.0030) [2024-04-27 02:14:42,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5374279680. Throughput: 0: 50883.3. Samples: 3127133340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:42,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 02:14:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000328020_5374279680.pth... 
[2024-04-27 02:14:42,121][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000327274_5362057216.pth [2024-04-27 02:14:42,353][49750] Updated weights for policy 0, policy_version 328021 (0.0030) [2024-04-27 02:14:45,105][49750] Updated weights for policy 0, policy_version 328031 (0.0029) [2024-04-27 02:14:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5374525440. Throughput: 0: 50965.8. Samples: 3127439640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:47,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 02:14:48,686][49750] Updated weights for policy 0, policy_version 328041 (0.0040) [2024-04-27 02:14:51,530][49750] Updated weights for policy 0, policy_version 328051 (0.0031) [2024-04-27 02:14:52,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5374803968. Throughput: 0: 51031.4. Samples: 3127589640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:14:55,227][49750] Updated weights for policy 0, policy_version 328061 (0.0026) [2024-04-27 02:14:57,063][49517] Fps is (10 sec: 50789.4, 60 sec: 50517.1, 300 sec: 50873.7). Total num frames: 5375033344. Throughput: 0: 50849.7. Samples: 3127890540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:14:57,063][49517] Avg episode reward: [(0, '0.514')] [2024-04-27 02:14:58,140][49750] Updated weights for policy 0, policy_version 328071 (0.0033) [2024-04-27 02:15:01,533][49750] Updated weights for policy 0, policy_version 328081 (0.0035) [2024-04-27 02:15:02,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5375311872. Throughput: 0: 50919.7. Samples: 3128197400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:15:02,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 02:15:04,595][49750] Updated weights for policy 0, policy_version 328091 (0.0030) [2024-04-27 02:15:07,063][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 5375557632. Throughput: 0: 50854.2. Samples: 3128345680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:15:07,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 02:15:08,080][49750] Updated weights for policy 0, policy_version 328101 (0.0029) [2024-04-27 02:15:11,095][49750] Updated weights for policy 0, policy_version 328111 (0.0028) [2024-04-27 02:15:12,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5375803392. Throughput: 0: 50830.1. Samples: 3128654460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:15:12,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-27 02:15:14,395][49750] Updated weights for policy 0, policy_version 328121 (0.0031) [2024-04-27 02:15:16,882][49728] Signal inference workers to stop experience collection... (47050 times) [2024-04-27 02:15:16,882][49728] Signal inference workers to resume experience collection... (47050 times) [2024-04-27 02:15:16,894][49750] InferenceWorker_p0-w0: stopping experience collection (47050 times) [2024-04-27 02:15:16,915][49750] InferenceWorker_p0-w0: resuming experience collection (47050 times) [2024-04-27 02:15:17,062][49517] Fps is (10 sec: 50791.7, 60 sec: 51063.7, 300 sec: 50929.3). Total num frames: 5376065536. Throughput: 0: 50893.5. Samples: 3128962100. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:15:17,063][49517] Avg episode reward: [(0, '0.521')] [2024-04-27 02:15:17,469][49750] Updated weights for policy 0, policy_version 328131 (0.0029) [2024-04-27 02:15:20,928][49750] Updated weights for policy 0, policy_version 328141 (0.0030) [2024-04-27 02:15:22,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5376311296. Throughput: 0: 50659.3. Samples: 3129107480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:15:22,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-27 02:15:23,881][49750] Updated weights for policy 0, policy_version 328151 (0.0028) [2024-04-27 02:15:27,062][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5376557056. Throughput: 0: 50855.7. Samples: 3129421840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:15:27,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 02:15:27,510][49750] Updated weights for policy 0, policy_version 328161 (0.0031) [2024-04-27 02:15:30,276][49750] Updated weights for policy 0, policy_version 328171 (0.0031) [2024-04-27 02:15:32,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5376835584. Throughput: 0: 50818.6. Samples: 3129726480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:15:32,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 02:15:33,832][49750] Updated weights for policy 0, policy_version 328181 (0.0035) [2024-04-27 02:15:37,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5377064960. Throughput: 0: 50818.4. Samples: 3129876460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:15:37,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 02:15:37,123][49750] Updated weights for policy 0, policy_version 328191 (0.0033) [2024-04-27 02:15:40,234][49750] Updated weights for policy 0, policy_version 328201 (0.0028) [2024-04-27 02:15:42,063][49517] Fps is (10 sec: 49150.9, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5377327104. Throughput: 0: 50874.6. Samples: 3130179900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:15:42,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 02:15:43,485][49750] Updated weights for policy 0, policy_version 328211 (0.0033) [2024-04-27 02:15:46,734][49750] Updated weights for policy 0, policy_version 328221 (0.0030) [2024-04-27 02:15:47,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5377589248. Throughput: 0: 50860.9. Samples: 3130486140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:15:47,063][49517] Avg episode reward: [(0, '0.537')] [2024-04-27 02:15:49,991][49750] Updated weights for policy 0, policy_version 328231 (0.0033) [2024-04-27 02:15:52,062][49517] Fps is (10 sec: 50791.5, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5377835008. Throughput: 0: 50863.7. Samples: 3130634540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:15:52,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:15:53,117][49750] Updated weights for policy 0, policy_version 328241 (0.0029) [2024-04-27 02:15:56,422][49750] Updated weights for policy 0, policy_version 328251 (0.0028) [2024-04-27 02:15:57,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5378080768. Throughput: 0: 50861.4. Samples: 3130943220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:15:57,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-27 02:15:59,420][49750] Updated weights for policy 0, policy_version 328261 (0.0039) [2024-04-27 02:16:02,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50244.2, 300 sec: 50818.2). Total num frames: 5378326528. Throughput: 0: 50705.5. Samples: 3131243860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:16:02,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-27 02:16:02,788][49750] Updated weights for policy 0, policy_version 328271 (0.0034) [2024-04-27 02:16:05,973][49750] Updated weights for policy 0, policy_version 328281 (0.0038) [2024-04-27 02:16:07,062][49517] Fps is (10 sec: 52429.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5378605056. Throughput: 0: 50861.6. Samples: 3131396240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:16:07,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-27 02:16:09,314][49750] Updated weights for policy 0, policy_version 328291 (0.0031) [2024-04-27 02:16:12,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 5378850816. Throughput: 0: 50741.0. Samples: 3131705180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:16:12,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-27 02:16:12,612][49750] Updated weights for policy 0, policy_version 328301 (0.0028) [2024-04-27 02:16:15,852][49750] Updated weights for policy 0, policy_version 328311 (0.0026) [2024-04-27 02:16:17,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50790.1, 300 sec: 50873.7). Total num frames: 5379112960. Throughput: 0: 50723.3. Samples: 3132009040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:16:17,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 02:16:18,238][49728] Signal inference workers to stop experience collection... 
(47100 times) [2024-04-27 02:16:18,259][49750] InferenceWorker_p0-w0: stopping experience collection (47100 times) [2024-04-27 02:16:18,344][49728] Signal inference workers to resume experience collection... (47100 times) [2024-04-27 02:16:18,344][49750] InferenceWorker_p0-w0: resuming experience collection (47100 times) [2024-04-27 02:16:18,993][49750] Updated weights for policy 0, policy_version 328321 (0.0030) [2024-04-27 02:16:22,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5379342336. Throughput: 0: 50568.8. Samples: 3132152060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:16:22,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 02:16:22,475][49750] Updated weights for policy 0, policy_version 328331 (0.0031) [2024-04-27 02:16:25,366][49750] Updated weights for policy 0, policy_version 328341 (0.0030) [2024-04-27 02:16:27,063][49517] Fps is (10 sec: 50790.7, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5379620864. Throughput: 0: 50676.5. Samples: 3132460340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 02:16:27,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 02:16:28,743][49750] Updated weights for policy 0, policy_version 328351 (0.0032) [2024-04-27 02:16:31,641][49750] Updated weights for policy 0, policy_version 328361 (0.0028) [2024-04-27 02:16:32,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5379883008. Throughput: 0: 50675.1. Samples: 3132766520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:16:32,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:16:35,056][49750] Updated weights for policy 0, policy_version 328371 (0.0032) [2024-04-27 02:16:37,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5380112384. Throughput: 0: 51001.2. Samples: 3132929600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:16:37,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-27 02:16:38,286][49750] Updated weights for policy 0, policy_version 328381 (0.0027) [2024-04-27 02:16:41,630][49750] Updated weights for policy 0, policy_version 328391 (0.0033) [2024-04-27 02:16:42,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5380374528. Throughput: 0: 50901.8. Samples: 3133233800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:16:42,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 02:16:42,136][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000328393_5380390912.pth... [2024-04-27 02:16:42,180][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000327648_5368184832.pth [2024-04-27 02:16:44,859][49750] Updated weights for policy 0, policy_version 328401 (0.0041) [2024-04-27 02:16:47,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 5380603904. Throughput: 0: 50771.9. Samples: 3133528600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:16:47,064][49517] Avg episode reward: [(0, '0.682')] [2024-04-27 02:16:48,142][49750] Updated weights for policy 0, policy_version 328411 (0.0030) [2024-04-27 02:16:51,335][49750] Updated weights for policy 0, policy_version 328421 (0.0029) [2024-04-27 02:16:52,063][49517] Fps is (10 sec: 52428.0, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5380898816. Throughput: 0: 50729.1. Samples: 3133679060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:16:52,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 02:16:54,472][49750] Updated weights for policy 0, policy_version 328431 (0.0026) [2024-04-27 02:16:57,062][49517] Fps is (10 sec: 54068.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5381144576. Throughput: 0: 50854.2. Samples: 3133993620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:16:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 02:16:57,629][49750] Updated weights for policy 0, policy_version 328441 (0.0035) [2024-04-27 02:17:00,944][49750] Updated weights for policy 0, policy_version 328451 (0.0031) [2024-04-27 02:17:02,062][49517] Fps is (10 sec: 49153.1, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5381390336. Throughput: 0: 50797.6. Samples: 3134294920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:17:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 02:17:03,959][49750] Updated weights for policy 0, policy_version 328461 (0.0031) [2024-04-27 02:17:07,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 5381636096. Throughput: 0: 50774.6. Samples: 3134436920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:17:07,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 02:17:07,444][49750] Updated weights for policy 0, policy_version 328471 (0.0032) [2024-04-27 02:17:10,491][49750] Updated weights for policy 0, policy_version 328481 (0.0029) [2024-04-27 02:17:12,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5381914624. Throughput: 0: 50830.8. Samples: 3134747720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:17:12,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-27 02:17:13,925][49750] Updated weights for policy 0, policy_version 328491 (0.0027) [2024-04-27 02:17:14,670][49728] Signal inference workers to stop experience collection... (47150 times) [2024-04-27 02:17:14,731][49750] InferenceWorker_p0-w0: stopping experience collection (47150 times) [2024-04-27 02:17:14,734][49728] Signal inference workers to resume experience collection... 
(47150 times) [2024-04-27 02:17:14,743][49750] InferenceWorker_p0-w0: resuming experience collection (47150 times) [2024-04-27 02:17:16,931][49750] Updated weights for policy 0, policy_version 328501 (0.0034) [2024-04-27 02:17:17,062][49517] Fps is (10 sec: 54067.8, 60 sec: 51063.7, 300 sec: 50873.7). Total num frames: 5382176768. Throughput: 0: 50796.5. Samples: 3135052360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:17:17,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 02:17:20,289][49750] Updated weights for policy 0, policy_version 328511 (0.0029) [2024-04-27 02:17:22,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5382406144. Throughput: 0: 50718.6. Samples: 3135211940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:17:22,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 02:17:23,294][49750] Updated weights for policy 0, policy_version 328521 (0.0034) [2024-04-27 02:17:26,799][49750] Updated weights for policy 0, policy_version 328531 (0.0032) [2024-04-27 02:17:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 5382668288. Throughput: 0: 50698.7. Samples: 3135515240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:17:27,063][49517] Avg episode reward: [(0, '0.508')] [2024-04-27 02:17:29,713][49750] Updated weights for policy 0, policy_version 328541 (0.0035) [2024-04-27 02:17:32,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.0, 300 sec: 50762.6). Total num frames: 5382897664. Throughput: 0: 50800.8. Samples: 3135814640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:17:32,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 02:17:33,234][49750] Updated weights for policy 0, policy_version 328551 (0.0032) [2024-04-27 02:17:36,164][49750] Updated weights for policy 0, policy_version 328561 (0.0038) [2024-04-27 02:17:37,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5383192576. Throughput: 0: 50880.7. Samples: 3135968680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:17:37,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 02:17:39,792][49750] Updated weights for policy 0, policy_version 328571 (0.0036) [2024-04-27 02:17:42,063][49517] Fps is (10 sec: 54067.7, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5383438336. Throughput: 0: 50694.4. Samples: 3136274880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-27 02:17:42,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 02:17:42,562][49750] Updated weights for policy 0, policy_version 328581 (0.0030) [2024-04-27 02:17:46,171][49750] Updated weights for policy 0, policy_version 328591 (0.0036) [2024-04-27 02:17:47,063][49517] Fps is (10 sec: 47513.1, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5383667712. Throughput: 0: 50802.6. Samples: 3136581040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:17:47,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 02:17:48,938][49750] Updated weights for policy 0, policy_version 328601 (0.0027) [2024-04-27 02:17:52,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5383929856. Throughput: 0: 50817.8. Samples: 3136723720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:17:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 02:17:52,659][49750] Updated weights for policy 0, policy_version 328611 (0.0036) [2024-04-27 02:17:55,284][49750] Updated weights for policy 0, policy_version 328621 (0.0035) [2024-04-27 02:17:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5384175616. Throughput: 0: 50592.6. Samples: 3137024380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:17:57,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 02:17:59,149][49750] Updated weights for policy 0, policy_version 328631 (0.0032) [2024-04-27 02:18:01,854][49750] Updated weights for policy 0, policy_version 328641 (0.0036) [2024-04-27 02:18:02,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5384470528. Throughput: 0: 50636.0. Samples: 3137330980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:02,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 02:18:05,566][49750] Updated weights for policy 0, policy_version 328651 (0.0026) [2024-04-27 02:18:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 5384667136. Throughput: 0: 50698.9. Samples: 3137493380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:07,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 02:18:08,309][49750] Updated weights for policy 0, policy_version 328661 (0.0035) [2024-04-27 02:18:12,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 5384929280. Throughput: 0: 50788.0. Samples: 3137800700. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:12,063][49517] Avg episode reward: [(0, '0.546')] [2024-04-27 02:18:12,109][49750] Updated weights for policy 0, policy_version 328671 (0.0030) [2024-04-27 02:18:13,305][49728] Signal inference workers to stop experience collection... (47200 times) [2024-04-27 02:18:13,306][49728] Signal inference workers to resume experience collection... (47200 times) [2024-04-27 02:18:13,317][49750] InferenceWorker_p0-w0: stopping experience collection (47200 times) [2024-04-27 02:18:13,317][49750] InferenceWorker_p0-w0: resuming experience collection (47200 times) [2024-04-27 02:18:14,612][49750] Updated weights for policy 0, policy_version 328681 (0.0036) [2024-04-27 02:18:17,062][49517] Fps is (10 sec: 50790.2, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 5385175040. Throughput: 0: 50857.1. Samples: 3138103200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 02:18:18,531][49750] Updated weights for policy 0, policy_version 328691 (0.0038) [2024-04-27 02:18:21,039][49750] Updated weights for policy 0, policy_version 328701 (0.0035) [2024-04-27 02:18:22,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.7, 300 sec: 50818.2). Total num frames: 5385469952. Throughput: 0: 50865.4. Samples: 3138257620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:22,063][49517] Avg episode reward: [(0, '0.647')] [2024-04-27 02:18:24,873][49750] Updated weights for policy 0, policy_version 328711 (0.0031) [2024-04-27 02:18:27,063][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5385715712. Throughput: 0: 50728.1. Samples: 3138557640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:27,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-27 02:18:27,557][49750] Updated weights for policy 0, policy_version 328721 (0.0031) [2024-04-27 02:18:31,495][49750] Updated weights for policy 0, policy_version 328731 (0.0035) [2024-04-27 02:18:32,062][49517] Fps is (10 sec: 47513.0, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 5385945088. Throughput: 0: 50754.7. Samples: 3138865000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:32,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 02:18:34,128][49750] Updated weights for policy 0, policy_version 328741 (0.0030) [2024-04-27 02:18:37,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 5386207232. Throughput: 0: 50816.0. Samples: 3139010440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:37,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:18:38,074][49750] Updated weights for policy 0, policy_version 328751 (0.0028) [2024-04-27 02:18:40,482][49750] Updated weights for policy 0, policy_version 328761 (0.0030) [2024-04-27 02:18:42,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5386469376. Throughput: 0: 50771.4. Samples: 3139309100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:42,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:18:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000328764_5386469376.pth... [2024-04-27 02:18:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000328020_5374279680.pth [2024-04-27 02:18:44,494][49750] Updated weights for policy 0, policy_version 328771 (0.0034) [2024-04-27 02:18:46,945][49750] Updated weights for policy 0, policy_version 328781 (0.0036) [2024-04-27 02:18:47,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51336.7, 300 sec: 50873.7). 
Total num frames: 5386747904. Throughput: 0: 50804.5. Samples: 3139617180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:47,063][49517] Avg episode reward: [(0, '0.680')] [2024-04-27 02:18:50,903][49750] Updated weights for policy 0, policy_version 328791 (0.0032) [2024-04-27 02:18:52,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5386977280. Throughput: 0: 50664.0. Samples: 3139773260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:52,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 02:18:53,389][49750] Updated weights for policy 0, policy_version 328801 (0.0036) [2024-04-27 02:18:57,062][49517] Fps is (10 sec: 45875.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5387206656. Throughput: 0: 50620.9. Samples: 3140078640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 02:18:57,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 02:18:57,396][49750] Updated weights for policy 0, policy_version 328811 (0.0027) [2024-04-27 02:18:59,872][49750] Updated weights for policy 0, policy_version 328821 (0.0030) [2024-04-27 02:19:02,062][49517] Fps is (10 sec: 49152.1, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 5387468800. Throughput: 0: 50614.3. Samples: 3140380840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:02,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 02:19:03,808][49750] Updated weights for policy 0, policy_version 328831 (0.0032) [2024-04-27 02:19:06,290][49750] Updated weights for policy 0, policy_version 328841 (0.0028) [2024-04-27 02:19:07,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51336.4, 300 sec: 50818.2). Total num frames: 5387747328. Throughput: 0: 50748.7. Samples: 3140541320. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:07,063][49517] Avg episode reward: [(0, '0.555')] [2024-04-27 02:19:10,248][49750] Updated weights for policy 0, policy_version 328851 (0.0030) [2024-04-27 02:19:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5387993088. Throughput: 0: 50794.8. Samples: 3140843400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:12,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-27 02:19:12,806][49750] Updated weights for policy 0, policy_version 328861 (0.0023) [2024-04-27 02:19:16,743][49750] Updated weights for policy 0, policy_version 328871 (0.0034) [2024-04-27 02:19:17,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5388222464. Throughput: 0: 50671.2. Samples: 3141145200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:17,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-27 02:19:19,230][49750] Updated weights for policy 0, policy_version 328881 (0.0028) [2024-04-27 02:19:22,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50244.2, 300 sec: 50762.7). Total num frames: 5388484608. Throughput: 0: 50611.7. Samples: 3141287960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:22,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-27 02:19:23,190][49750] Updated weights for policy 0, policy_version 328891 (0.0040) [2024-04-27 02:19:25,650][49750] Updated weights for policy 0, policy_version 328901 (0.0027) [2024-04-27 02:19:27,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5388746752. Throughput: 0: 50841.4. Samples: 3141596960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:27,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 02:19:29,331][49728] Signal inference workers to stop experience collection... 
(47250 times) [2024-04-27 02:19:29,374][49750] InferenceWorker_p0-w0: stopping experience collection (47250 times) [2024-04-27 02:19:29,405][49728] Signal inference workers to resume experience collection... (47250 times) [2024-04-27 02:19:29,405][49750] InferenceWorker_p0-w0: resuming experience collection (47250 times) [2024-04-27 02:19:29,545][49750] Updated weights for policy 0, policy_version 328911 (0.0028) [2024-04-27 02:19:32,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5389025280. Throughput: 0: 50883.9. Samples: 3141906960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:32,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 02:19:32,083][49750] Updated weights for policy 0, policy_version 328921 (0.0035) [2024-04-27 02:19:35,916][49750] Updated weights for policy 0, policy_version 328931 (0.0029) [2024-04-27 02:19:37,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 5389254656. Throughput: 0: 50856.9. Samples: 3142061820. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:37,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 02:19:38,692][49750] Updated weights for policy 0, policy_version 328941 (0.0029) [2024-04-27 02:19:42,063][49517] Fps is (10 sec: 47513.0, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5389500416. Throughput: 0: 50963.3. Samples: 3142372000. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:42,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 02:19:42,443][49750] Updated weights for policy 0, policy_version 328951 (0.0032) [2024-04-27 02:19:45,303][49750] Updated weights for policy 0, policy_version 328961 (0.0034) [2024-04-27 02:19:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50244.1, 300 sec: 50707.1). Total num frames: 5389762560. Throughput: 0: 50848.2. Samples: 3142669020. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:47,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-27 02:19:48,899][49750] Updated weights for policy 0, policy_version 328971 (0.0032) [2024-04-27 02:19:51,657][49750] Updated weights for policy 0, policy_version 328981 (0.0032) [2024-04-27 02:19:52,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5390041088. Throughput: 0: 50789.3. Samples: 3142826840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:52,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-27 02:19:55,335][49750] Updated weights for policy 0, policy_version 328991 (0.0033) [2024-04-27 02:19:57,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 5390286848. Throughput: 0: 50831.1. Samples: 3143130800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:19:57,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 02:19:58,159][49750] Updated weights for policy 0, policy_version 329001 (0.0031) [2024-04-27 02:20:01,719][49750] Updated weights for policy 0, policy_version 329011 (0.0028) [2024-04-27 02:20:02,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5390516224. Throughput: 0: 50917.7. Samples: 3143436500. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:20:02,063][49517] Avg episode reward: [(0, '0.642')] [2024-04-27 02:20:04,521][49750] Updated weights for policy 0, policy_version 329021 (0.0033) [2024-04-27 02:20:07,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5390761984. Throughput: 0: 50874.7. Samples: 3143577320. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:20:07,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-27 02:20:08,157][49750] Updated weights for policy 0, policy_version 329031 (0.0032) [2024-04-27 02:20:11,274][49750] Updated weights for policy 0, policy_version 329041 (0.0040) [2024-04-27 02:20:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5391040512. Throughput: 0: 50766.2. Samples: 3143881440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 02:20:12,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 02:20:14,753][49750] Updated weights for policy 0, policy_version 329051 (0.0031) [2024-04-27 02:20:17,063][49517] Fps is (10 sec: 55704.6, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 5391319040. Throughput: 0: 50721.7. Samples: 3144189440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:20:17,072][49517] Avg episode reward: [(0, '0.670')] [2024-04-27 02:20:17,605][49750] Updated weights for policy 0, policy_version 329061 (0.0028) [2024-04-27 02:20:21,138][49750] Updated weights for policy 0, policy_version 329071 (0.0042) [2024-04-27 02:20:22,063][49517] Fps is (10 sec: 50789.4, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5391548416. Throughput: 0: 50709.6. Samples: 3144343760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:20:22,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-27 02:20:23,868][49750] Updated weights for policy 0, policy_version 329081 (0.0038) [2024-04-27 02:20:27,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5391794176. Throughput: 0: 50770.4. Samples: 3144656660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:20:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 02:20:27,482][49750] Updated weights for policy 0, policy_version 329091 (0.0032) [2024-04-27 02:20:30,381][49750] Updated weights for policy 0, policy_version 329101 (0.0035) [2024-04-27 02:20:32,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5392039936. Throughput: 0: 50920.9. Samples: 3144960460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:20:32,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 02:20:33,946][49750] Updated weights for policy 0, policy_version 329111 (0.0032) [2024-04-27 02:20:36,761][49750] Updated weights for policy 0, policy_version 329121 (0.0030) [2024-04-27 02:20:37,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5392318464. Throughput: 0: 50721.9. Samples: 3145109320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:20:37,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 02:20:40,466][49750] Updated weights for policy 0, policy_version 329131 (0.0040) [2024-04-27 02:20:42,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 5392564224. Throughput: 0: 50818.7. Samples: 3145417640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:20:42,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-27 02:20:42,233][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000329138_5392596992.pth... [2024-04-27 02:20:42,276][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000328393_5380390912.pth [2024-04-27 02:20:42,903][49728] Signal inference workers to stop experience collection... 
(47300 times) [2024-04-27 02:20:42,942][49750] InferenceWorker_p0-w0: stopping experience collection (47300 times) [2024-04-27 02:20:42,962][49728] Signal inference workers to resume experience collection... (47300 times) [2024-04-27 02:20:42,963][49750] InferenceWorker_p0-w0: resuming experience collection (47300 times) [2024-04-27 02:20:43,099][49750] Updated weights for policy 0, policy_version 329141 (0.0031) [2024-04-27 02:20:46,859][49750] Updated weights for policy 0, policy_version 329151 (0.0036) [2024-04-27 02:20:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5392809984. Throughput: 0: 50807.6. Samples: 3145722840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:20:47,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 02:20:49,464][49750] Updated weights for policy 0, policy_version 329161 (0.0028) [2024-04-27 02:20:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5393055744. Throughput: 0: 50784.6. Samples: 3145862640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:20:52,063][49517] Avg episode reward: [(0, '0.666')] [2024-04-27 02:20:53,248][49750] Updated weights for policy 0, policy_version 329171 (0.0034) [2024-04-27 02:20:56,031][49750] Updated weights for policy 0, policy_version 329181 (0.0029) [2024-04-27 02:20:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5393317888. Throughput: 0: 50805.8. Samples: 3146167700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:20:57,063][49517] Avg episode reward: [(0, '0.595')] [2024-04-27 02:20:59,684][49750] Updated weights for policy 0, policy_version 329191 (0.0035) [2024-04-27 02:21:02,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5393596416. Throughput: 0: 50791.6. Samples: 3146475060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:21:02,063][49517] Avg episode reward: [(0, '0.535')] [2024-04-27 02:21:02,413][49750] Updated weights for policy 0, policy_version 329201 (0.0033) [2024-04-27 02:21:06,097][49750] Updated weights for policy 0, policy_version 329211 (0.0031) [2024-04-27 02:21:07,062][49517] Fps is (10 sec: 52428.9, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5393842176. Throughput: 0: 51062.0. Samples: 3146641540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:21:07,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 02:21:08,892][49750] Updated weights for policy 0, policy_version 329221 (0.0035) [2024-04-27 02:21:12,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5394087936. Throughput: 0: 50844.9. Samples: 3146944680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:21:12,064][49517] Avg episode reward: [(0, '0.615')] [2024-04-27 02:21:12,627][49750] Updated weights for policy 0, policy_version 329231 (0.0036) [2024-04-27 02:21:15,408][49750] Updated weights for policy 0, policy_version 329241 (0.0030) [2024-04-27 02:21:17,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.4, 300 sec: 50818.2). Total num frames: 5394333696. Throughput: 0: 50925.5. Samples: 3147252100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:21:17,063][49517] Avg episode reward: [(0, '0.636')] [2024-04-27 02:21:19,048][49750] Updated weights for policy 0, policy_version 329251 (0.0027) [2024-04-27 02:21:21,656][49750] Updated weights for policy 0, policy_version 329261 (0.0029) [2024-04-27 02:21:22,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5394612224. Throughput: 0: 50880.0. Samples: 3147398920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:21:22,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 02:21:25,407][49750] Updated weights for policy 0, policy_version 329271 (0.0035) [2024-04-27 02:21:27,063][49517] Fps is (10 sec: 55704.5, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 5394890752. Throughput: 0: 50810.1. Samples: 3147704100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 02:21:27,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 02:21:28,118][49750] Updated weights for policy 0, policy_version 329281 (0.0033) [2024-04-27 02:21:31,886][49750] Updated weights for policy 0, policy_version 329291 (0.0033) [2024-04-27 02:21:32,063][49517] Fps is (10 sec: 50790.1, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5395120128. Throughput: 0: 50954.5. Samples: 3148015800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:21:32,063][49517] Avg episode reward: [(0, '0.693')] [2024-04-27 02:21:34,584][49750] Updated weights for policy 0, policy_version 329301 (0.0035) [2024-04-27 02:21:37,063][49517] Fps is (10 sec: 45875.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5395349504. Throughput: 0: 50962.7. Samples: 3148155960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:21:37,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 02:21:38,281][49750] Updated weights for policy 0, policy_version 329311 (0.0029) [2024-04-27 02:21:41,229][49750] Updated weights for policy 0, policy_version 329321 (0.0034) [2024-04-27 02:21:42,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5395595264. Throughput: 0: 50979.6. Samples: 3148461780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:21:42,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 02:21:42,416][49728] Signal inference workers to stop experience collection... 
(47350 times) [2024-04-27 02:21:42,417][49728] Signal inference workers to resume experience collection... (47350 times) [2024-04-27 02:21:42,434][49750] InferenceWorker_p0-w0: stopping experience collection (47350 times) [2024-04-27 02:21:42,434][49750] InferenceWorker_p0-w0: resuming experience collection (47350 times) [2024-04-27 02:21:44,745][49750] Updated weights for policy 0, policy_version 329331 (0.0024) [2024-04-27 02:21:47,063][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5395890176. Throughput: 0: 50882.2. Samples: 3148764760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:21:47,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 02:21:47,958][49750] Updated weights for policy 0, policy_version 329341 (0.0029) [2024-04-27 02:21:51,132][49750] Updated weights for policy 0, policy_version 329351 (0.0035) [2024-04-27 02:21:52,062][49517] Fps is (10 sec: 54067.7, 60 sec: 51336.8, 300 sec: 50818.2). Total num frames: 5396135936. Throughput: 0: 50724.5. Samples: 3148924140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:21:52,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 02:21:54,691][49750] Updated weights for policy 0, policy_version 329361 (0.0026) [2024-04-27 02:21:57,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5396365312. Throughput: 0: 50937.8. Samples: 3149236880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:21:57,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 02:21:57,494][49750] Updated weights for policy 0, policy_version 329371 (0.0036) [2024-04-27 02:22:01,547][49750] Updated weights for policy 0, policy_version 329381 (0.0026) [2024-04-27 02:22:02,063][49517] Fps is (10 sec: 49151.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5396627456. Throughput: 0: 51032.3. Samples: 3149548560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:22:02,063][49517] Avg episode reward: [(0, '0.629')] [2024-04-27 02:22:03,915][49750] Updated weights for policy 0, policy_version 329391 (0.0036) [2024-04-27 02:22:07,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5396889600. Throughput: 0: 50856.1. Samples: 3149687440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:22:07,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 02:22:08,002][49750] Updated weights for policy 0, policy_version 329401 (0.0032) [2024-04-27 02:22:10,434][49750] Updated weights for policy 0, policy_version 329411 (0.0032) [2024-04-27 02:22:12,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5397168128. Throughput: 0: 50839.4. Samples: 3149991860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:22:12,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 02:22:14,242][49750] Updated weights for policy 0, policy_version 329421 (0.0029) [2024-04-27 02:22:16,937][49750] Updated weights for policy 0, policy_version 329431 (0.0025) [2024-04-27 02:22:17,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5397397504. Throughput: 0: 50813.5. Samples: 3150302400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:22:17,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 02:22:20,656][49750] Updated weights for policy 0, policy_version 329441 (0.0032) [2024-04-27 02:22:22,062][49517] Fps is (10 sec: 47513.7, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 5397643264. Throughput: 0: 51047.8. Samples: 3150453100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:22:22,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 02:22:23,390][49750] Updated weights for policy 0, policy_version 329451 (0.0028) [2024-04-27 02:22:27,062][49517] Fps is (10 sec: 47513.9, 60 sec: 49698.3, 300 sec: 50762.7). Total num frames: 5397872640. Throughput: 0: 50932.5. Samples: 3150753740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:22:27,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 02:22:27,146][49750] Updated weights for policy 0, policy_version 329461 (0.0033) [2024-04-27 02:22:29,863][49750] Updated weights for policy 0, policy_version 329471 (0.0037) [2024-04-27 02:22:29,907][49728] Signal inference workers to stop experience collection... (47400 times) [2024-04-27 02:22:29,907][49728] Signal inference workers to resume experience collection... (47400 times) [2024-04-27 02:22:29,930][49750] InferenceWorker_p0-w0: stopping experience collection (47400 times) [2024-04-27 02:22:29,930][49750] InferenceWorker_p0-w0: resuming experience collection (47400 times) [2024-04-27 02:22:32,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 5398183936. Throughput: 0: 50793.7. Samples: 3151050480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:22:32,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 02:22:33,431][49750] Updated weights for policy 0, policy_version 329481 (0.0029) [2024-04-27 02:22:36,268][49750] Updated weights for policy 0, policy_version 329491 (0.0031) [2024-04-27 02:22:37,062][49517] Fps is (10 sec: 55705.4, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5398429696. Throughput: 0: 50890.1. Samples: 3151214200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:22:37,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 02:22:39,858][49750] Updated weights for policy 0, policy_version 329501 (0.0026) [2024-04-27 02:22:42,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5398642688. Throughput: 0: 50776.4. Samples: 3151521820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 02:22:42,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 02:22:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000329507_5398642688.pth... [2024-04-27 02:22:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000328764_5386469376.pth [2024-04-27 02:22:42,720][49750] Updated weights for policy 0, policy_version 329511 (0.0024) [2024-04-27 02:22:46,431][49750] Updated weights for policy 0, policy_version 329521 (0.0027) [2024-04-27 02:22:47,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5398921216. Throughput: 0: 50739.3. Samples: 3151831820. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:22:47,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 02:22:49,135][49750] Updated weights for policy 0, policy_version 329531 (0.0027) [2024-04-27 02:22:52,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5399166976. Throughput: 0: 50772.8. Samples: 3151972220. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:22:52,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 02:22:52,900][49750] Updated weights for policy 0, policy_version 329541 (0.0036) [2024-04-27 02:22:55,587][49750] Updated weights for policy 0, policy_version 329551 (0.0040) [2024-04-27 02:22:57,062][49517] Fps is (10 sec: 54067.0, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 5399461888. Throughput: 0: 50866.6. Samples: 3152280860. 
Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:22:57,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 02:22:59,374][49750] Updated weights for policy 0, policy_version 329561 (0.0027) [2024-04-27 02:23:01,892][49750] Updated weights for policy 0, policy_version 329571 (0.0028) [2024-04-27 02:23:02,062][49517] Fps is (10 sec: 54067.6, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5399707648. Throughput: 0: 50836.0. Samples: 3152590020. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:02,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 02:23:05,743][49750] Updated weights for policy 0, policy_version 329581 (0.0026) [2024-04-27 02:23:07,062][49517] Fps is (10 sec: 47513.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5399937024. Throughput: 0: 50897.3. Samples: 3152743480. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:07,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 02:23:08,195][49728] Signal inference workers to stop experience collection... (47450 times) [2024-04-27 02:23:08,196][49728] Signal inference workers to resume experience collection... (47450 times) [2024-04-27 02:23:08,215][49750] InferenceWorker_p0-w0: stopping experience collection (47450 times) [2024-04-27 02:23:08,215][49750] InferenceWorker_p0-w0: resuming experience collection (47450 times) [2024-04-27 02:23:08,326][49750] Updated weights for policy 0, policy_version 329591 (0.0030) [2024-04-27 02:23:12,062][49517] Fps is (10 sec: 45875.5, 60 sec: 49971.2, 300 sec: 50818.2). Total num frames: 5400166400. Throughput: 0: 50848.0. Samples: 3153041900. 
Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:12,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 02:23:12,098][49750] Updated weights for policy 0, policy_version 329601 (0.0029) [2024-04-27 02:23:14,863][49750] Updated weights for policy 0, policy_version 329611 (0.0038) [2024-04-27 02:23:17,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5400444928. Throughput: 0: 50835.7. Samples: 3153338080. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:17,071][49517] Avg episode reward: [(0, '0.548')] [2024-04-27 02:23:18,557][49750] Updated weights for policy 0, policy_version 329621 (0.0031) [2024-04-27 02:23:21,219][49750] Updated weights for policy 0, policy_version 329631 (0.0027) [2024-04-27 02:23:22,063][49517] Fps is (10 sec: 57342.6, 60 sec: 51609.4, 300 sec: 50929.2). Total num frames: 5400739840. Throughput: 0: 50796.7. Samples: 3153500060. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:22,072][49517] Avg episode reward: [(0, '0.559')] [2024-04-27 02:23:25,133][49750] Updated weights for policy 0, policy_version 329641 (0.0035) [2024-04-27 02:23:27,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5400936448. Throughput: 0: 50805.2. Samples: 3153808060. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:27,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 02:23:27,692][49750] Updated weights for policy 0, policy_version 329651 (0.0035) [2024-04-27 02:23:31,573][49750] Updated weights for policy 0, policy_version 329661 (0.0029) [2024-04-27 02:23:32,063][49517] Fps is (10 sec: 45875.6, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5401198592. Throughput: 0: 50806.5. Samples: 3154118120. 
Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:32,064][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:23:34,054][49750] Updated weights for policy 0, policy_version 329671 (0.0029) [2024-04-27 02:23:37,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5401444352. Throughput: 0: 50755.6. Samples: 3154256220. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:37,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 02:23:37,848][49750] Updated weights for policy 0, policy_version 329681 (0.0034) [2024-04-27 02:23:40,548][49750] Updated weights for policy 0, policy_version 329691 (0.0030) [2024-04-27 02:23:42,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51609.5, 300 sec: 50818.1). Total num frames: 5401739264. Throughput: 0: 50781.1. Samples: 3154566020. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:42,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 02:23:44,254][49750] Updated weights for policy 0, policy_version 329701 (0.0029) [2024-04-27 02:23:47,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 5401968640. Throughput: 0: 50724.3. Samples: 3154872620. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:47,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 02:23:47,144][49750] Updated weights for policy 0, policy_version 329711 (0.0031) [2024-04-27 02:23:50,586][49750] Updated weights for policy 0, policy_version 329721 (0.0036) [2024-04-27 02:23:52,062][49517] Fps is (10 sec: 47514.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5402214400. Throughput: 0: 50669.2. Samples: 3155023600. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:52,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 02:23:53,503][49750] Updated weights for policy 0, policy_version 329731 (0.0030) [2024-04-27 02:23:53,693][49728] Signal inference workers to stop experience collection... 
(47500 times) [2024-04-27 02:23:53,694][49728] Signal inference workers to resume experience collection... (47500 times) [2024-04-27 02:23:53,722][49750] InferenceWorker_p0-w0: stopping experience collection (47500 times) [2024-04-27 02:23:53,722][49750] InferenceWorker_p0-w0: resuming experience collection (47500 times) [2024-04-27 02:23:56,988][49750] Updated weights for policy 0, policy_version 329741 (0.0036) [2024-04-27 02:23:57,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50244.1, 300 sec: 50873.7). Total num frames: 5402476544. Throughput: 0: 50827.3. Samples: 3155329140. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:23:57,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 02:23:59,902][49750] Updated weights for policy 0, policy_version 329751 (0.0026) [2024-04-27 02:24:02,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5402722304. Throughput: 0: 50868.9. Samples: 3155627180. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-04-27 02:24:02,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:24:03,600][49750] Updated weights for policy 0, policy_version 329761 (0.0032) [2024-04-27 02:24:06,417][49750] Updated weights for policy 0, policy_version 329771 (0.0028) [2024-04-27 02:24:07,062][49517] Fps is (10 sec: 54067.9, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5403017216. Throughput: 0: 50775.7. Samples: 3155784960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:07,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 02:24:10,186][49750] Updated weights for policy 0, policy_version 329781 (0.0038) [2024-04-27 02:24:12,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5403230208. Throughput: 0: 50745.5. Samples: 3156091600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:12,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 02:24:12,751][49750] Updated weights for policy 0, policy_version 329791 (0.0027) [2024-04-27 02:24:16,746][49750] Updated weights for policy 0, policy_version 329801 (0.0033) [2024-04-27 02:24:17,062][49517] Fps is (10 sec: 45875.3, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5403475968. Throughput: 0: 50749.0. Samples: 3156401820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:17,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 02:24:19,033][49750] Updated weights for policy 0, policy_version 329811 (0.0027) [2024-04-27 02:24:22,062][49517] Fps is (10 sec: 49152.0, 60 sec: 49698.3, 300 sec: 50762.6). Total num frames: 5403721728. Throughput: 0: 50757.4. Samples: 3156540300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:22,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-27 02:24:23,124][49750] Updated weights for policy 0, policy_version 329821 (0.0032) [2024-04-27 02:24:25,493][49750] Updated weights for policy 0, policy_version 329831 (0.0029) [2024-04-27 02:24:27,063][49517] Fps is (10 sec: 55705.1, 60 sec: 51609.6, 300 sec: 50873.7). Total num frames: 5404033024. Throughput: 0: 50693.4. Samples: 3156847220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:27,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 02:24:29,429][49750] Updated weights for policy 0, policy_version 329841 (0.0027) [2024-04-27 02:24:31,995][49750] Updated weights for policy 0, policy_version 329851 (0.0033) [2024-04-27 02:24:32,062][49517] Fps is (10 sec: 55705.2, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5404278784. Throughput: 0: 50792.1. Samples: 3157158260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:32,063][49517] Avg episode reward: [(0, '0.615')] [2024-04-27 02:24:35,921][49750] Updated weights for policy 0, policy_version 329861 (0.0030) [2024-04-27 02:24:37,063][49517] Fps is (10 sec: 45875.3, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5404491776. Throughput: 0: 50867.1. Samples: 3157312620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:37,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 02:24:38,438][49750] Updated weights for policy 0, policy_version 329871 (0.0030) [2024-04-27 02:24:42,062][49517] Fps is (10 sec: 45875.0, 60 sec: 49971.3, 300 sec: 50762.6). Total num frames: 5404737536. Throughput: 0: 50744.6. Samples: 3157612640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 02:24:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000329879_5404737536.pth... [2024-04-27 02:24:42,125][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000329138_5392596992.pth [2024-04-27 02:24:42,435][49750] Updated weights for policy 0, policy_version 329881 (0.0030) [2024-04-27 02:24:45,042][49750] Updated weights for policy 0, policy_version 329891 (0.0033) [2024-04-27 02:24:47,063][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5405016064. Throughput: 0: 50823.5. Samples: 3157914240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:47,063][49517] Avg episode reward: [(0, '0.656')] [2024-04-27 02:24:48,705][49750] Updated weights for policy 0, policy_version 329901 (0.0027) [2024-04-27 02:24:50,782][49728] Signal inference workers to stop experience collection... 
(47550 times) [2024-04-27 02:24:50,829][49750] InferenceWorker_p0-w0: stopping experience collection (47550 times) [2024-04-27 02:24:50,845][49728] Signal inference workers to resume experience collection... (47550 times) [2024-04-27 02:24:50,853][49750] InferenceWorker_p0-w0: resuming experience collection (47550 times) [2024-04-27 02:24:51,357][49750] Updated weights for policy 0, policy_version 329911 (0.0030) [2024-04-27 02:24:52,063][49517] Fps is (10 sec: 55704.6, 60 sec: 51336.3, 300 sec: 50873.7). Total num frames: 5405294592. Throughput: 0: 50820.7. Samples: 3158071900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:52,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 02:24:55,049][49750] Updated weights for policy 0, policy_version 329921 (0.0034) [2024-04-27 02:24:57,062][49517] Fps is (10 sec: 52429.4, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 5405540352. Throughput: 0: 50830.2. Samples: 3158378960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:24:57,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 02:24:57,656][49750] Updated weights for policy 0, policy_version 329931 (0.0029) [2024-04-27 02:25:01,621][49750] Updated weights for policy 0, policy_version 329941 (0.0034) [2024-04-27 02:25:02,063][49517] Fps is (10 sec: 49152.6, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5405786112. Throughput: 0: 50768.3. Samples: 3158686400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:25:02,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-27 02:25:04,254][49750] Updated weights for policy 0, policy_version 329951 (0.0031) [2024-04-27 02:25:07,062][49517] Fps is (10 sec: 47513.4, 60 sec: 49971.2, 300 sec: 50762.6). Total num frames: 5406015488. Throughput: 0: 50823.5. Samples: 3158827360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:25:07,067][49517] Avg episode reward: [(0, '0.513')] [2024-04-27 02:25:08,019][49750] Updated weights for policy 0, policy_version 329961 (0.0034) [2024-04-27 02:25:10,758][49750] Updated weights for policy 0, policy_version 329971 (0.0029) [2024-04-27 02:25:12,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5406294016. Throughput: 0: 50780.4. Samples: 3159132340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:25:12,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-27 02:25:14,390][49750] Updated weights for policy 0, policy_version 329981 (0.0033) [2024-04-27 02:25:17,000][49750] Updated weights for policy 0, policy_version 329991 (0.0028) [2024-04-27 02:25:17,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51609.5, 300 sec: 50929.3). Total num frames: 5406572544. Throughput: 0: 50687.5. Samples: 3159439200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 02:25:17,063][49517] Avg episode reward: [(0, '0.676')] [2024-04-27 02:25:20,764][49750] Updated weights for policy 0, policy_version 330001 (0.0036) [2024-04-27 02:25:22,062][49517] Fps is (10 sec: 49152.5, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5406785536. Throughput: 0: 50880.5. Samples: 3159602240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:25:22,063][49517] Avg episode reward: [(0, '0.672')] [2024-04-27 02:25:23,378][49750] Updated weights for policy 0, policy_version 330011 (0.0036) [2024-04-27 02:25:27,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50244.4, 300 sec: 50873.7). Total num frames: 5407047680. Throughput: 0: 50831.2. Samples: 3159900040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:25:27,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 02:25:27,263][49750] Updated weights for policy 0, policy_version 330021 (0.0029) [2024-04-27 02:25:29,961][49750] Updated weights for policy 0, policy_version 330031 (0.0035) [2024-04-27 02:25:32,063][49517] Fps is (10 sec: 52427.8, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 5407309824. Throughput: 0: 50943.8. Samples: 3160206720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:25:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:25:33,584][49750] Updated weights for policy 0, policy_version 330041 (0.0030) [2024-04-27 02:25:36,376][49750] Updated weights for policy 0, policy_version 330051 (0.0030) [2024-04-27 02:25:37,063][49517] Fps is (10 sec: 52427.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5407571968. Throughput: 0: 50995.2. Samples: 3160366680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:25:37,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:25:39,857][49750] Updated weights for policy 0, policy_version 330061 (0.0031) [2024-04-27 02:25:42,063][49517] Fps is (10 sec: 52429.5, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 5407834112. Throughput: 0: 51051.4. Samples: 3160676280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:25:42,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 02:25:42,933][49750] Updated weights for policy 0, policy_version 330071 (0.0028) [2024-04-27 02:25:45,899][49728] Signal inference workers to stop experience collection... (47600 times) [2024-04-27 02:25:45,899][49728] Signal inference workers to resume experience collection... 
(47600 times) [2024-04-27 02:25:45,913][49750] InferenceWorker_p0-w0: stopping experience collection (47600 times) [2024-04-27 02:25:45,913][49750] InferenceWorker_p0-w0: resuming experience collection (47600 times) [2024-04-27 02:25:46,229][49750] Updated weights for policy 0, policy_version 330081 (0.0030) [2024-04-27 02:25:47,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5408063488. Throughput: 0: 50845.5. Samples: 3160974440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:25:47,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-27 02:25:49,294][49750] Updated weights for policy 0, policy_version 330091 (0.0034) [2024-04-27 02:25:52,063][49517] Fps is (10 sec: 47513.1, 60 sec: 50244.3, 300 sec: 50818.1). Total num frames: 5408309248. Throughput: 0: 51096.2. Samples: 3161126700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:25:52,063][49517] Avg episode reward: [(0, '0.609')] [2024-04-27 02:25:52,659][49750] Updated weights for policy 0, policy_version 330101 (0.0028) [2024-04-27 02:25:55,613][49750] Updated weights for policy 0, policy_version 330111 (0.0031) [2024-04-27 02:25:57,062][49517] Fps is (10 sec: 52428.4, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5408587776. Throughput: 0: 51096.6. Samples: 3161431680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:25:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:25:59,126][49750] Updated weights for policy 0, policy_version 330121 (0.0030) [2024-04-27 02:26:02,062][49517] Fps is (10 sec: 54068.5, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5408849920. Throughput: 0: 51132.6. Samples: 3161740160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:26:02,063][49517] Avg episode reward: [(0, '0.679')] [2024-04-27 02:26:02,164][49750] Updated weights for policy 0, policy_version 330131 (0.0033) [2024-04-27 02:26:05,434][49750] Updated weights for policy 0, policy_version 330141 (0.0033) [2024-04-27 02:26:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51609.6, 300 sec: 50929.3). Total num frames: 5409112064. Throughput: 0: 51146.3. Samples: 3161903820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:26:07,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 02:26:08,804][49750] Updated weights for policy 0, policy_version 330151 (0.0028) [2024-04-27 02:26:11,909][49750] Updated weights for policy 0, policy_version 330161 (0.0027) [2024-04-27 02:26:12,063][49517] Fps is (10 sec: 50789.3, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5409357824. Throughput: 0: 51147.3. Samples: 3162201680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:26:12,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-27 02:26:15,271][49750] Updated weights for policy 0, policy_version 330171 (0.0026) [2024-04-27 02:26:17,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5409603584. Throughput: 0: 51147.3. Samples: 3162508340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:26:17,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:26:18,312][49750] Updated weights for policy 0, policy_version 330181 (0.0030) [2024-04-27 02:26:21,509][49750] Updated weights for policy 0, policy_version 330191 (0.0026) [2024-04-27 02:26:22,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51609.6, 300 sec: 50818.2). Total num frames: 5409882112. Throughput: 0: 50900.0. Samples: 3162657180. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:26:22,063][49517] Avg episode reward: [(0, '0.487')] [2024-04-27 02:26:24,731][49750] Updated weights for policy 0, policy_version 330201 (0.0030) [2024-04-27 02:26:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5410111488. Throughput: 0: 50906.3. Samples: 3162967060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:26:27,063][49517] Avg episode reward: [(0, '0.675')] [2024-04-27 02:26:27,899][49750] Updated weights for policy 0, policy_version 330211 (0.0033) [2024-04-27 02:26:31,107][49750] Updated weights for policy 0, policy_version 330221 (0.0029) [2024-04-27 02:26:32,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5410357248. Throughput: 0: 50977.2. Samples: 3163268420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:26:32,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 02:26:34,499][49750] Updated weights for policy 0, policy_version 330231 (0.0036) [2024-04-27 02:26:37,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 5410603008. Throughput: 0: 50934.9. Samples: 3163418760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:26:37,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 02:26:37,540][49750] Updated weights for policy 0, policy_version 330241 (0.0027) [2024-04-27 02:26:40,993][49750] Updated weights for policy 0, policy_version 330251 (0.0033) [2024-04-27 02:26:42,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5410881536. Throughput: 0: 50957.3. Samples: 3163724760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:26:42,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 02:26:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000330254_5410881536.pth... 
[2024-04-27 02:26:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000329507_5398642688.pth [2024-04-27 02:26:43,423][49728] Signal inference workers to stop experience collection... (47650 times) [2024-04-27 02:26:43,432][49728] Signal inference workers to resume experience collection... (47650 times) [2024-04-27 02:26:43,477][49750] InferenceWorker_p0-w0: stopping experience collection (47650 times) [2024-04-27 02:26:43,477][49750] InferenceWorker_p0-w0: resuming experience collection (47650 times) [2024-04-27 02:26:43,992][49750] Updated weights for policy 0, policy_version 330261 (0.0028) [2024-04-27 02:26:47,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5411127296. Throughput: 0: 50938.7. Samples: 3164032400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:26:47,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 02:26:47,413][49750] Updated weights for policy 0, policy_version 330271 (0.0030) [2024-04-27 02:26:50,385][49750] Updated weights for policy 0, policy_version 330281 (0.0032) [2024-04-27 02:26:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.7, 300 sec: 50929.2). Total num frames: 5411389440. Throughput: 0: 50751.9. Samples: 3164187660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:26:52,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 02:26:53,796][49750] Updated weights for policy 0, policy_version 330291 (0.0036) [2024-04-27 02:26:56,742][49750] Updated weights for policy 0, policy_version 330301 (0.0035) [2024-04-27 02:26:57,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5411651584. Throughput: 0: 51009.6. Samples: 3164497100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:26:57,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:27:00,219][49750] Updated weights for policy 0, policy_version 330311 (0.0032) [2024-04-27 02:27:02,063][49517] Fps is (10 sec: 47512.6, 60 sec: 50244.0, 300 sec: 50762.6). Total num frames: 5411864576. Throughput: 0: 50914.4. Samples: 3164799500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 02:27:03,232][49750] Updated weights for policy 0, policy_version 330321 (0.0037) [2024-04-27 02:27:06,623][49750] Updated weights for policy 0, policy_version 330331 (0.0036) [2024-04-27 02:27:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5412159488. Throughput: 0: 50754.8. Samples: 3164941140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:07,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 02:27:09,872][49750] Updated weights for policy 0, policy_version 330341 (0.0033) [2024-04-27 02:27:12,063][49517] Fps is (10 sec: 54067.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5412405248. Throughput: 0: 50751.5. Samples: 3165250880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:12,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-27 02:27:13,036][49750] Updated weights for policy 0, policy_version 330351 (0.0027) [2024-04-27 02:27:16,235][49750] Updated weights for policy 0, policy_version 330361 (0.0038) [2024-04-27 02:27:17,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5412651008. Throughput: 0: 50870.3. Samples: 3165557580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:17,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 02:27:19,609][49750] Updated weights for policy 0, policy_version 330371 (0.0027) [2024-04-27 02:27:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.5, 300 sec: 51040.3). Total num frames: 5412929536. Throughput: 0: 50962.7. Samples: 3165712080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:22,063][49517] Avg episode reward: [(0, '0.509')] [2024-04-27 02:27:22,594][49750] Updated weights for policy 0, policy_version 330381 (0.0038) [2024-04-27 02:27:26,088][49750] Updated weights for policy 0, policy_version 330391 (0.0035) [2024-04-27 02:27:27,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5413142528. Throughput: 0: 50821.8. Samples: 3166011740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:27,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 02:27:29,031][49750] Updated weights for policy 0, policy_version 330401 (0.0038) [2024-04-27 02:27:32,063][49517] Fps is (10 sec: 49151.4, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5413421056. Throughput: 0: 50866.5. Samples: 3166321400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:32,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 02:27:32,449][49750] Updated weights for policy 0, policy_version 330411 (0.0029) [2024-04-27 02:27:35,373][49750] Updated weights for policy 0, policy_version 330421 (0.0035) [2024-04-27 02:27:37,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5413666816. Throughput: 0: 50905.8. Samples: 3166478420. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:37,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-27 02:27:38,986][49750] Updated weights for policy 0, policy_version 330431 (0.0028) [2024-04-27 02:27:40,755][49728] Signal inference workers to stop experience collection... (47700 times) [2024-04-27 02:27:40,755][49728] Signal inference workers to resume experience collection... (47700 times) [2024-04-27 02:27:40,780][49750] InferenceWorker_p0-w0: stopping experience collection (47700 times) [2024-04-27 02:27:40,780][49750] InferenceWorker_p0-w0: resuming experience collection (47700 times) [2024-04-27 02:27:41,654][49750] Updated weights for policy 0, policy_version 330441 (0.0027) [2024-04-27 02:27:42,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 5413945344. Throughput: 0: 50892.2. Samples: 3166787260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:42,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 02:27:45,569][49750] Updated weights for policy 0, policy_version 330451 (0.0030) [2024-04-27 02:27:47,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5414158336. Throughput: 0: 50879.0. Samples: 3167089040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:47,063][49517] Avg episode reward: [(0, '0.495')] [2024-04-27 02:27:48,184][49750] Updated weights for policy 0, policy_version 330461 (0.0038) [2024-04-27 02:27:51,990][49750] Updated weights for policy 0, policy_version 330471 (0.0028) [2024-04-27 02:27:52,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5414436864. Throughput: 0: 50950.8. Samples: 3167233940. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:52,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 02:27:54,714][49750] Updated weights for policy 0, policy_version 330481 (0.0026) [2024-04-27 02:27:57,062][49517] Fps is (10 sec: 54066.8, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5414699008. Throughput: 0: 50795.2. Samples: 3167536660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:27:57,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 02:27:58,412][49750] Updated weights for policy 0, policy_version 330491 (0.0037) [2024-04-27 02:28:01,090][49750] Updated weights for policy 0, policy_version 330501 (0.0038) [2024-04-27 02:28:02,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 5414961152. Throughput: 0: 50789.5. Samples: 3167843120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:02,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 02:28:04,830][49750] Updated weights for policy 0, policy_version 330511 (0.0035) [2024-04-27 02:28:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.3, 300 sec: 50984.8). Total num frames: 5415206912. Throughput: 0: 50926.1. Samples: 3168003760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:07,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 02:28:07,843][49750] Updated weights for policy 0, policy_version 330521 (0.0032) [2024-04-27 02:28:11,325][49750] Updated weights for policy 0, policy_version 330531 (0.0032) [2024-04-27 02:28:12,063][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 5415436288. Throughput: 0: 50968.7. Samples: 3168305340. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 02:28:14,479][49750] Updated weights for policy 0, policy_version 330541 (0.0033) [2024-04-27 02:28:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50707.1). Total num frames: 5415698432. Throughput: 0: 50670.7. Samples: 3168601580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:17,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-27 02:28:17,889][49750] Updated weights for policy 0, policy_version 330551 (0.0038) [2024-04-27 02:28:20,831][49750] Updated weights for policy 0, policy_version 330561 (0.0030) [2024-04-27 02:28:22,062][49517] Fps is (10 sec: 52429.8, 60 sec: 50517.3, 300 sec: 50929.3). Total num frames: 5415960576. Throughput: 0: 50704.1. Samples: 3168760100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 02:28:24,309][49750] Updated weights for policy 0, policy_version 330571 (0.0031) [2024-04-27 02:28:27,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5416222720. Throughput: 0: 50701.0. Samples: 3169068800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:27,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 02:28:27,306][49750] Updated weights for policy 0, policy_version 330581 (0.0033) [2024-04-27 02:28:30,732][49750] Updated weights for policy 0, policy_version 330591 (0.0031) [2024-04-27 02:28:32,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5416452096. Throughput: 0: 50770.5. Samples: 3169373720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:32,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 02:28:33,632][49750] Updated weights for policy 0, policy_version 330601 (0.0033) [2024-04-27 02:28:37,062][49517] Fps is (10 sec: 47514.1, 60 sec: 50517.4, 300 sec: 50707.1). Total num frames: 5416697856. Throughput: 0: 50918.0. Samples: 3169525240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:37,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 02:28:37,252][49750] Updated weights for policy 0, policy_version 330611 (0.0030) [2024-04-27 02:28:38,047][49728] Signal inference workers to stop experience collection... (47750 times) [2024-04-27 02:28:38,048][49728] Signal inference workers to resume experience collection... (47750 times) [2024-04-27 02:28:38,088][49750] InferenceWorker_p0-w0: stopping experience collection (47750 times) [2024-04-27 02:28:38,088][49750] InferenceWorker_p0-w0: resuming experience collection (47750 times) [2024-04-27 02:28:40,064][49750] Updated weights for policy 0, policy_version 330621 (0.0028) [2024-04-27 02:28:42,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.5, 300 sec: 50929.3). Total num frames: 5416992768. Throughput: 0: 50924.0. Samples: 3169828240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:42,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 02:28:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000330627_5416992768.pth... [2024-04-27 02:28:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000329879_5404737536.pth [2024-04-27 02:28:43,690][49750] Updated weights for policy 0, policy_version 330631 (0.0033) [2024-04-27 02:28:46,445][49750] Updated weights for policy 0, policy_version 330641 (0.0030) [2024-04-27 02:28:47,063][49517] Fps is (10 sec: 54066.8, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5417238528. Throughput: 0: 50702.9. 
Samples: 3170124740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:47,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 02:28:50,192][49750] Updated weights for policy 0, policy_version 330651 (0.0029) [2024-04-27 02:28:52,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5417484288. Throughput: 0: 50885.0. Samples: 3170293580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:52,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 02:28:52,802][49750] Updated weights for policy 0, policy_version 330661 (0.0028) [2024-04-27 02:28:56,734][49750] Updated weights for policy 0, policy_version 330671 (0.0029) [2024-04-27 02:28:57,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5417713664. Throughput: 0: 50886.4. Samples: 3170595220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:28:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 02:28:59,242][49750] Updated weights for policy 0, policy_version 330681 (0.0030) [2024-04-27 02:29:02,063][49517] Fps is (10 sec: 49151.6, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5417975808. Throughput: 0: 50962.1. Samples: 3170894880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 02:29:02,063][49517] Avg episode reward: [(0, '0.619')] [2024-04-27 02:29:03,075][49750] Updated weights for policy 0, policy_version 330691 (0.0038) [2024-04-27 02:29:05,602][49750] Updated weights for policy 0, policy_version 330701 (0.0028) [2024-04-27 02:29:07,062][49517] Fps is (10 sec: 57344.3, 60 sec: 51336.7, 300 sec: 51040.3). Total num frames: 5418287104. Throughput: 0: 50909.3. Samples: 3171051020. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:07,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 02:29:09,375][49750] Updated weights for policy 0, policy_version 330711 (0.0030) [2024-04-27 02:29:12,063][49517] Fps is (10 sec: 54067.1, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5418516480. Throughput: 0: 50999.5. Samples: 3171363780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 02:29:12,177][49750] Updated weights for policy 0, policy_version 330721 (0.0031) [2024-04-27 02:29:15,897][49750] Updated weights for policy 0, policy_version 330731 (0.0034) [2024-04-27 02:29:17,063][49517] Fps is (10 sec: 45874.6, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5418745856. Throughput: 0: 50984.8. Samples: 3171668040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:17,063][49517] Avg episode reward: [(0, '0.653')] [2024-04-27 02:29:18,638][49750] Updated weights for policy 0, policy_version 330741 (0.0032) [2024-04-27 02:29:22,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5418991616. Throughput: 0: 50698.4. Samples: 3171806680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:22,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:29:22,481][49750] Updated weights for policy 0, policy_version 330751 (0.0035) [2024-04-27 02:29:25,067][49750] Updated weights for policy 0, policy_version 330761 (0.0033) [2024-04-27 02:29:27,063][49517] Fps is (10 sec: 54067.2, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5419286528. Throughput: 0: 50787.9. Samples: 3172113700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:27,064][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 02:29:28,812][49750] Updated weights for policy 0, policy_version 330771 (0.0032) [2024-04-27 02:29:31,385][49750] Updated weights for policy 0, policy_version 330781 (0.0027) [2024-04-27 02:29:32,062][49517] Fps is (10 sec: 55706.4, 60 sec: 51609.6, 300 sec: 51040.3). Total num frames: 5419548672. Throughput: 0: 51028.9. Samples: 3172421040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:32,063][49517] Avg episode reward: [(0, '0.525')] [2024-04-27 02:29:35,179][49750] Updated weights for policy 0, policy_version 330791 (0.0031) [2024-04-27 02:29:37,062][49517] Fps is (10 sec: 49152.7, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5419778048. Throughput: 0: 50872.5. Samples: 3172582840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:37,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 02:29:37,866][49750] Updated weights for policy 0, policy_version 330801 (0.0030) [2024-04-27 02:29:37,888][49728] Signal inference workers to stop experience collection... (47800 times) [2024-04-27 02:29:37,894][49728] Signal inference workers to resume experience collection... (47800 times) [2024-04-27 02:29:37,919][49750] InferenceWorker_p0-w0: stopping experience collection (47800 times) [2024-04-27 02:29:37,919][49750] InferenceWorker_p0-w0: resuming experience collection (47800 times) [2024-04-27 02:29:41,696][49750] Updated weights for policy 0, policy_version 330811 (0.0038) [2024-04-27 02:29:42,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5420023808. Throughput: 0: 50966.6. Samples: 3172888720. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:42,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 02:29:44,153][49750] Updated weights for policy 0, policy_version 330821 (0.0035) [2024-04-27 02:29:47,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 5420269568. Throughput: 0: 51109.9. Samples: 3173194820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:47,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 02:29:48,168][49750] Updated weights for policy 0, policy_version 330831 (0.0029) [2024-04-27 02:29:50,484][49750] Updated weights for policy 0, policy_version 330841 (0.0031) [2024-04-27 02:29:52,063][49517] Fps is (10 sec: 55705.0, 60 sec: 51609.4, 300 sec: 50984.7). Total num frames: 5420580864. Throughput: 0: 51049.9. Samples: 3173348280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:52,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 02:29:54,667][49750] Updated weights for policy 0, policy_version 330851 (0.0038) [2024-04-27 02:29:56,987][49750] Updated weights for policy 0, policy_version 330861 (0.0030) [2024-04-27 02:29:57,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51882.7, 300 sec: 50984.8). Total num frames: 5420826624. Throughput: 0: 50839.7. Samples: 3173651560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:29:57,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 02:30:01,114][49750] Updated weights for policy 0, policy_version 330871 (0.0027) [2024-04-27 02:30:02,062][49517] Fps is (10 sec: 45876.4, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 5421039616. Throughput: 0: 50861.9. Samples: 3173956820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:30:02,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-27 02:30:03,382][49750] Updated weights for policy 0, policy_version 330881 (0.0034) [2024-04-27 02:30:07,063][49517] Fps is (10 sec: 45874.9, 60 sec: 49971.1, 300 sec: 50818.2). Total num frames: 5421285376. Throughput: 0: 50965.9. Samples: 3174100140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:30:07,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 02:30:07,607][49750] Updated weights for policy 0, policy_version 330891 (0.0030) [2024-04-27 02:30:09,808][49750] Updated weights for policy 0, policy_version 330901 (0.0033) [2024-04-27 02:30:12,062][49517] Fps is (10 sec: 52428.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5421563904. Throughput: 0: 50990.8. Samples: 3174408280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:30:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 02:30:13,985][49750] Updated weights for policy 0, policy_version 330911 (0.0028) [2024-04-27 02:30:16,165][49750] Updated weights for policy 0, policy_version 330921 (0.0032) [2024-04-27 02:30:17,062][49517] Fps is (10 sec: 55706.0, 60 sec: 51609.7, 300 sec: 51040.3). Total num frames: 5421842432. Throughput: 0: 50957.8. Samples: 3174714140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-04-27 02:30:17,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-27 02:30:20,312][49750] Updated weights for policy 0, policy_version 330931 (0.0033) [2024-04-27 02:30:22,063][49517] Fps is (10 sec: 52427.3, 60 sec: 51609.6, 300 sec: 50984.7). Total num frames: 5422088192. Throughput: 0: 51125.0. Samples: 3174883480. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:30:22,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 02:30:22,755][49750] Updated weights for policy 0, policy_version 330941 (0.0030) [2024-04-27 02:30:26,766][49750] Updated weights for policy 0, policy_version 330951 (0.0035) [2024-04-27 02:30:27,063][49517] Fps is (10 sec: 47513.3, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5422317568. Throughput: 0: 50919.1. Samples: 3175180080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:30:27,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:30:29,230][49750] Updated weights for policy 0, policy_version 330961 (0.0031) [2024-04-27 02:30:32,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5422563328. Throughput: 0: 50778.6. Samples: 3175479860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:30:32,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 02:30:33,281][49750] Updated weights for policy 0, policy_version 330971 (0.0028) [2024-04-27 02:30:35,551][49750] Updated weights for policy 0, policy_version 330981 (0.0035) [2024-04-27 02:30:37,062][49517] Fps is (10 sec: 54068.0, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 5422858240. Throughput: 0: 50832.7. Samples: 3175635740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:30:37,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 02:30:39,601][49750] Updated weights for policy 0, policy_version 330991 (0.0031) [2024-04-27 02:30:40,083][49728] Signal inference workers to stop experience collection... (47850 times) [2024-04-27 02:30:40,083][49728] Signal inference workers to resume experience collection... 
(47850 times) [2024-04-27 02:30:40,112][49750] InferenceWorker_p0-w0: stopping experience collection (47850 times) [2024-04-27 02:30:40,112][49750] InferenceWorker_p0-w0: resuming experience collection (47850 times) [2024-04-27 02:30:41,929][49750] Updated weights for policy 0, policy_version 331001 (0.0029) [2024-04-27 02:30:42,062][49517] Fps is (10 sec: 55705.9, 60 sec: 51609.7, 300 sec: 51040.3). Total num frames: 5423120384. Throughput: 0: 51040.9. Samples: 3175948400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:30:42,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 02:30:42,070][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000331001_5423120384.pth... [2024-04-27 02:30:42,120][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000330254_5410881536.pth [2024-04-27 02:30:45,932][49750] Updated weights for policy 0, policy_version 331011 (0.0036) [2024-04-27 02:30:47,062][49517] Fps is (10 sec: 49151.6, 60 sec: 51336.5, 300 sec: 50984.8). Total num frames: 5423349760. Throughput: 0: 50923.9. Samples: 3176248400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:30:47,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 02:30:48,458][49750] Updated weights for policy 0, policy_version 331021 (0.0029) [2024-04-27 02:30:52,062][49517] Fps is (10 sec: 45875.1, 60 sec: 49971.4, 300 sec: 50818.2). Total num frames: 5423579136. Throughput: 0: 51095.6. Samples: 3176399440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:30:52,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 02:30:52,452][49750] Updated weights for policy 0, policy_version 331031 (0.0025) [2024-04-27 02:30:54,992][49750] Updated weights for policy 0, policy_version 331041 (0.0039) [2024-04-27 02:30:57,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50873.7). Total num frames: 5423857664. Throughput: 0: 51028.0. Samples: 3176704540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:30:57,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 02:30:58,978][49750] Updated weights for policy 0, policy_version 331051 (0.0034) [2024-04-27 02:31:01,298][49750] Updated weights for policy 0, policy_version 331061 (0.0026) [2024-04-27 02:31:02,062][49517] Fps is (10 sec: 54067.4, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5424119808. Throughput: 0: 50957.8. Samples: 3177007240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:31:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 02:31:05,316][49750] Updated weights for policy 0, policy_version 331071 (0.0030) [2024-04-27 02:31:07,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51882.7, 300 sec: 50984.8). Total num frames: 5424398336. Throughput: 0: 51026.5. Samples: 3177179660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:31:07,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 02:31:07,585][49750] Updated weights for policy 0, policy_version 331081 (0.0028) [2024-04-27 02:31:11,738][49750] Updated weights for policy 0, policy_version 331091 (0.0035) [2024-04-27 02:31:12,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5424611328. Throughput: 0: 51157.7. Samples: 3177482180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:31:12,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 02:31:14,270][49750] Updated weights for policy 0, policy_version 331101 (0.0027) [2024-04-27 02:31:17,063][49517] Fps is (10 sec: 45874.4, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 5424857088. Throughput: 0: 51075.9. Samples: 3177778280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:31:17,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 02:31:18,109][49750] Updated weights for policy 0, policy_version 331111 (0.0029) [2024-04-27 02:31:20,588][49750] Updated weights for policy 0, policy_version 331121 (0.0027) [2024-04-27 02:31:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.6, 300 sec: 50929.3). Total num frames: 5425135616. Throughput: 0: 50849.8. Samples: 3177923980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:31:22,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-27 02:31:24,542][49750] Updated weights for policy 0, policy_version 331131 (0.0032) [2024-04-27 02:31:27,062][49517] Fps is (10 sec: 54068.5, 60 sec: 51336.7, 300 sec: 50984.8). Total num frames: 5425397760. Throughput: 0: 50798.3. Samples: 3178234320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:31:27,063][49517] Avg episode reward: [(0, '0.545')] [2024-04-27 02:31:27,288][49750] Updated weights for policy 0, policy_version 331141 (0.0030) [2024-04-27 02:31:31,063][49750] Updated weights for policy 0, policy_version 331151 (0.0030) [2024-04-27 02:31:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51609.7, 300 sec: 51040.3). Total num frames: 5425659904. Throughput: 0: 51095.2. Samples: 3178547680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 02:31:32,063][49517] Avg episode reward: [(0, '0.673')] [2024-04-27 02:31:33,525][49750] Updated weights for policy 0, policy_version 331161 (0.0028) [2024-04-27 02:31:37,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5425872896. Throughput: 0: 51020.1. Samples: 3178695340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:31:37,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 02:31:37,345][49750] Updated weights for policy 0, policy_version 331171 (0.0039) [2024-04-27 02:31:39,955][49750] Updated weights for policy 0, policy_version 331181 (0.0031) [2024-04-27 02:31:42,063][49517] Fps is (10 sec: 47512.9, 60 sec: 50244.1, 300 sec: 50873.7). Total num frames: 5426135040. Throughput: 0: 51044.3. Samples: 3179001540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:31:42,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 02:31:43,815][49750] Updated weights for policy 0, policy_version 331191 (0.0036) [2024-04-27 02:31:43,827][49728] Signal inference workers to stop experience collection... (47900 times) [2024-04-27 02:31:43,827][49728] Signal inference workers to resume experience collection... (47900 times) [2024-04-27 02:31:43,846][49750] InferenceWorker_p0-w0: stopping experience collection (47900 times) [2024-04-27 02:31:43,846][49750] InferenceWorker_p0-w0: resuming experience collection (47900 times) [2024-04-27 02:31:46,393][49750] Updated weights for policy 0, policy_version 331201 (0.0028) [2024-04-27 02:31:47,063][49517] Fps is (10 sec: 54066.5, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5426413568. Throughput: 0: 50927.5. Samples: 3179298980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:31:47,063][49517] Avg episode reward: [(0, '0.667')] [2024-04-27 02:31:50,316][49750] Updated weights for policy 0, policy_version 331211 (0.0031) [2024-04-27 02:31:52,063][49517] Fps is (10 sec: 55705.4, 60 sec: 51882.5, 300 sec: 50984.7). Total num frames: 5426692096. Throughput: 0: 50822.0. Samples: 3179466660. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:31:52,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 02:31:52,678][49750] Updated weights for policy 0, policy_version 331221 (0.0029) [2024-04-27 02:31:56,685][49750] Updated weights for policy 0, policy_version 331231 (0.0030) [2024-04-27 02:31:57,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50790.4, 300 sec: 50984.8). Total num frames: 5426905088. Throughput: 0: 50965.0. Samples: 3179775600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:31:57,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 02:31:59,027][49750] Updated weights for policy 0, policy_version 331241 (0.0033) [2024-04-27 02:32:02,062][49517] Fps is (10 sec: 45876.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5427150848. Throughput: 0: 51096.2. Samples: 3180077600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:02,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 02:32:03,022][49750] Updated weights for policy 0, policy_version 331251 (0.0036) [2024-04-27 02:32:05,507][49750] Updated weights for policy 0, policy_version 331261 (0.0031) [2024-04-27 02:32:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 5427412992. Throughput: 0: 50919.4. Samples: 3180215360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:07,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 02:32:09,503][49750] Updated weights for policy 0, policy_version 331271 (0.0031) [2024-04-27 02:32:11,952][49750] Updated weights for policy 0, policy_version 331281 (0.0040) [2024-04-27 02:32:12,063][49517] Fps is (10 sec: 55704.8, 60 sec: 51609.6, 300 sec: 51040.3). Total num frames: 5427707904. Throughput: 0: 50736.2. Samples: 3180517460. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:12,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 02:32:16,019][49750] Updated weights for policy 0, policy_version 331291 (0.0037) [2024-04-27 02:32:17,063][49517] Fps is (10 sec: 54067.3, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 5427953664. Throughput: 0: 50687.4. Samples: 3180828620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:17,063][49517] Avg episode reward: [(0, '0.641')] [2024-04-27 02:32:18,565][49750] Updated weights for policy 0, policy_version 331301 (0.0027) [2024-04-27 02:32:22,062][49517] Fps is (10 sec: 45876.0, 60 sec: 50517.3, 300 sec: 50929.3). Total num frames: 5428166656. Throughput: 0: 50853.8. Samples: 3180983760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:22,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 02:32:22,416][49750] Updated weights for policy 0, policy_version 331311 (0.0033) [2024-04-27 02:32:25,057][49750] Updated weights for policy 0, policy_version 331321 (0.0029) [2024-04-27 02:32:27,062][49517] Fps is (10 sec: 47514.0, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5428428800. Throughput: 0: 50768.6. Samples: 3181286120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:27,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 02:32:28,771][49750] Updated weights for policy 0, policy_version 331331 (0.0032) [2024-04-27 02:32:31,497][49750] Updated weights for policy 0, policy_version 331341 (0.0033) [2024-04-27 02:32:32,063][49517] Fps is (10 sec: 54065.6, 60 sec: 50790.2, 300 sec: 50984.7). Total num frames: 5428707328. Throughput: 0: 50845.6. Samples: 3181587040. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:32,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:32:35,260][49750] Updated weights for policy 0, policy_version 331351 (0.0034) [2024-04-27 02:32:37,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51609.4, 300 sec: 50929.2). Total num frames: 5428969472. Throughput: 0: 50912.4. Samples: 3181757720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:37,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 02:32:37,986][49750] Updated weights for policy 0, policy_version 331361 (0.0034) [2024-04-27 02:32:41,137][49728] Signal inference workers to stop experience collection... (47950 times) [2024-04-27 02:32:41,175][49750] InferenceWorker_p0-w0: stopping experience collection (47950 times) [2024-04-27 02:32:41,202][49728] Signal inference workers to resume experience collection... (47950 times) [2024-04-27 02:32:41,204][49750] InferenceWorker_p0-w0: resuming experience collection (47950 times) [2024-04-27 02:32:41,648][49750] Updated weights for policy 0, policy_version 331371 (0.0029) [2024-04-27 02:32:42,062][49517] Fps is (10 sec: 50791.9, 60 sec: 51336.7, 300 sec: 51040.3). Total num frames: 5429215232. Throughput: 0: 50777.4. Samples: 3182060580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:32:42,103][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000331374_5429231616.pth... [2024-04-27 02:32:42,162][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000330627_5416992768.pth [2024-04-27 02:32:44,451][49750] Updated weights for policy 0, policy_version 331381 (0.0032) [2024-04-27 02:32:47,062][49517] Fps is (10 sec: 45875.8, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5429428224. Throughput: 0: 50775.0. Samples: 3182362480. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:47,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 02:32:48,042][49750] Updated weights for policy 0, policy_version 331391 (0.0031) [2024-04-27 02:32:50,853][49750] Updated weights for policy 0, policy_version 331401 (0.0030) [2024-04-27 02:32:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 5429706752. Throughput: 0: 50819.1. Samples: 3182502220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 02:32:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 02:32:54,497][49750] Updated weights for policy 0, policy_version 331411 (0.0037) [2024-04-27 02:32:57,063][49517] Fps is (10 sec: 55705.2, 60 sec: 51336.4, 300 sec: 50929.3). Total num frames: 5429985280. Throughput: 0: 50889.8. Samples: 3182807500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:32:57,063][49517] Avg episode reward: [(0, '0.575')] [2024-04-27 02:32:57,179][49750] Updated weights for policy 0, policy_version 331421 (0.0039) [2024-04-27 02:33:00,947][49750] Updated weights for policy 0, policy_version 331431 (0.0041) [2024-04-27 02:33:02,063][49517] Fps is (10 sec: 52428.8, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5430231040. Throughput: 0: 50715.1. Samples: 3183110800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:02,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-27 02:33:03,538][49750] Updated weights for policy 0, policy_version 331441 (0.0036) [2024-04-27 02:33:07,062][49517] Fps is (10 sec: 45875.9, 60 sec: 50517.5, 300 sec: 50873.7). Total num frames: 5430444032. Throughput: 0: 50817.3. Samples: 3183270540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:07,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 02:33:07,409][49750] Updated weights for policy 0, policy_version 331451 (0.0036) [2024-04-27 02:33:09,896][49750] Updated weights for policy 0, policy_version 331461 (0.0032) [2024-04-27 02:33:12,063][49517] Fps is (10 sec: 47513.2, 60 sec: 49971.2, 300 sec: 50873.7). Total num frames: 5430706176. Throughput: 0: 50825.6. Samples: 3183573280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:12,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 02:33:13,882][49750] Updated weights for policy 0, policy_version 331471 (0.0030) [2024-04-27 02:33:16,617][49750] Updated weights for policy 0, policy_version 331481 (0.0030) [2024-04-27 02:33:17,062][49517] Fps is (10 sec: 54067.0, 60 sec: 50517.4, 300 sec: 50929.2). Total num frames: 5430984704. Throughput: 0: 50809.6. Samples: 3183873460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:17,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 02:33:20,343][49750] Updated weights for policy 0, policy_version 331491 (0.0031) [2024-04-27 02:33:21,625][49728] Signal inference workers to stop experience collection... (48000 times) [2024-04-27 02:33:21,625][49728] Signal inference workers to resume experience collection... (48000 times) [2024-04-27 02:33:21,653][49750] InferenceWorker_p0-w0: stopping experience collection (48000 times) [2024-04-27 02:33:21,653][49750] InferenceWorker_p0-w0: resuming experience collection (48000 times) [2024-04-27 02:33:22,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 5431246848. Throughput: 0: 50729.1. Samples: 3184040520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:22,063][49517] Avg episode reward: [(0, '0.715')] [2024-04-27 02:33:23,091][49750] Updated weights for policy 0, policy_version 331501 (0.0027) [2024-04-27 02:33:26,775][49750] Updated weights for policy 0, policy_version 331511 (0.0037) [2024-04-27 02:33:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.5, 300 sec: 50984.8). Total num frames: 5431492608. Throughput: 0: 50823.9. Samples: 3184347660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:27,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-27 02:33:29,420][49750] Updated weights for policy 0, policy_version 331521 (0.0032) [2024-04-27 02:33:32,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50244.4, 300 sec: 50929.2). Total num frames: 5431721984. Throughput: 0: 50827.0. Samples: 3184649700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:32,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-27 02:33:33,199][49750] Updated weights for policy 0, policy_version 331531 (0.0032) [2024-04-27 02:33:35,738][49750] Updated weights for policy 0, policy_version 331541 (0.0035) [2024-04-27 02:33:37,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50244.5, 300 sec: 50818.2). Total num frames: 5431984128. Throughput: 0: 50786.9. Samples: 3184787620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 02:33:39,694][49750] Updated weights for policy 0, policy_version 331551 (0.0032) [2024-04-27 02:33:42,063][49517] Fps is (10 sec: 54067.4, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5432262656. Throughput: 0: 50739.6. Samples: 3185090780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:42,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:33:42,675][49750] Updated weights for policy 0, policy_version 331561 (0.0030) [2024-04-27 02:33:46,054][49750] Updated weights for policy 0, policy_version 331571 (0.0031) [2024-04-27 02:33:47,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51609.6, 300 sec: 50984.8). Total num frames: 5432524800. Throughput: 0: 50826.7. Samples: 3185398000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:47,063][49517] Avg episode reward: [(0, '0.664')] [2024-04-27 02:33:49,311][49750] Updated weights for policy 0, policy_version 331581 (0.0026) [2024-04-27 02:33:52,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.6, 300 sec: 50984.8). Total num frames: 5432754176. Throughput: 0: 50689.4. Samples: 3185551560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:52,063][49517] Avg episode reward: [(0, '0.659')] [2024-04-27 02:33:52,554][49750] Updated weights for policy 0, policy_version 331591 (0.0030) [2024-04-27 02:33:55,623][49750] Updated weights for policy 0, policy_version 331601 (0.0032) [2024-04-27 02:33:57,063][49517] Fps is (10 sec: 45874.6, 60 sec: 49971.2, 300 sec: 50873.7). Total num frames: 5432983552. Throughput: 0: 50778.7. Samples: 3185858320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:33:57,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 02:33:58,902][49750] Updated weights for policy 0, policy_version 331611 (0.0033) [2024-04-27 02:34:01,976][49750] Updated weights for policy 0, policy_version 331621 (0.0034) [2024-04-27 02:34:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5433278464. Throughput: 0: 50766.7. Samples: 3186157960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:34:02,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 02:34:05,333][49750] Updated weights for policy 0, policy_version 331631 (0.0029) [2024-04-27 02:34:07,063][49517] Fps is (10 sec: 55706.2, 60 sec: 51609.5, 300 sec: 50929.3). Total num frames: 5433540608. Throughput: 0: 50749.7. Samples: 3186324260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 02:34:07,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:34:08,389][49750] Updated weights for policy 0, policy_version 331641 (0.0035) [2024-04-27 02:34:11,704][49750] Updated weights for policy 0, policy_version 331651 (0.0027) [2024-04-27 02:34:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51336.6, 300 sec: 50984.8). Total num frames: 5433786368. Throughput: 0: 50792.4. Samples: 3186633320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:12,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:34:14,730][49750] Updated weights for policy 0, policy_version 331661 (0.0029) [2024-04-27 02:34:17,062][49517] Fps is (10 sec: 45875.4, 60 sec: 50244.3, 300 sec: 50873.7). Total num frames: 5433999360. Throughput: 0: 50873.5. Samples: 3186939000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:17,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 02:34:18,267][49750] Updated weights for policy 0, policy_version 331671 (0.0029) [2024-04-27 02:34:21,266][49750] Updated weights for policy 0, policy_version 331681 (0.0031) [2024-04-27 02:34:22,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5434277888. Throughput: 0: 50829.1. Samples: 3187074940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:22,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 02:34:22,768][49728] Signal inference workers to stop experience collection... 
(48050 times) [2024-04-27 02:34:22,768][49728] Signal inference workers to resume experience collection... (48050 times) [2024-04-27 02:34:22,792][49750] InferenceWorker_p0-w0: stopping experience collection (48050 times) [2024-04-27 02:34:22,792][49750] InferenceWorker_p0-w0: resuming experience collection (48050 times) [2024-04-27 02:34:24,747][49750] Updated weights for policy 0, policy_version 331691 (0.0034) [2024-04-27 02:34:27,062][49517] Fps is (10 sec: 55705.5, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5434556416. Throughput: 0: 50917.4. Samples: 3187382060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:27,063][49517] Avg episode reward: [(0, '0.496')] [2024-04-27 02:34:27,690][49750] Updated weights for policy 0, policy_version 331701 (0.0036) [2024-04-27 02:34:31,065][49750] Updated weights for policy 0, policy_version 331711 (0.0029) [2024-04-27 02:34:32,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50929.2). Total num frames: 5434802176. Throughput: 0: 50952.3. Samples: 3187690860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:32,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 02:34:34,074][49750] Updated weights for policy 0, policy_version 331721 (0.0031) [2024-04-27 02:34:37,062][49517] Fps is (10 sec: 49152.2, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5435047936. Throughput: 0: 50948.8. Samples: 3187844260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:37,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 02:34:37,539][49750] Updated weights for policy 0, policy_version 331731 (0.0041) [2024-04-27 02:34:40,412][49750] Updated weights for policy 0, policy_version 331741 (0.0036) [2024-04-27 02:34:42,063][49517] Fps is (10 sec: 47513.7, 60 sec: 50244.2, 300 sec: 50873.7). Total num frames: 5435277312. Throughput: 0: 50756.9. Samples: 3188142380. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:42,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 02:34:42,183][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000331744_5435293696.pth... [2024-04-27 02:34:42,229][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000331001_5423120384.pth [2024-04-27 02:34:43,899][49750] Updated weights for policy 0, policy_version 331751 (0.0029) [2024-04-27 02:34:46,870][49750] Updated weights for policy 0, policy_version 331761 (0.0026) [2024-04-27 02:34:47,062][49517] Fps is (10 sec: 52429.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5435572224. Throughput: 0: 50983.2. Samples: 3188452200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:47,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 02:34:50,481][49750] Updated weights for policy 0, policy_version 331771 (0.0032) [2024-04-27 02:34:52,063][49517] Fps is (10 sec: 54067.4, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 5435817984. Throughput: 0: 50802.6. Samples: 3188610380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:52,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 02:34:53,215][49750] Updated weights for policy 0, policy_version 331781 (0.0033) [2024-04-27 02:34:56,722][49750] Updated weights for policy 0, policy_version 331791 (0.0036) [2024-04-27 02:34:57,062][49517] Fps is (10 sec: 49151.8, 60 sec: 51336.7, 300 sec: 50929.2). Total num frames: 5436063744. Throughput: 0: 50802.4. Samples: 3188919420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:34:57,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 02:34:59,682][49750] Updated weights for policy 0, policy_version 331801 (0.0029) [2024-04-27 02:35:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.4, 300 sec: 50929.3). Total num frames: 5436309504. Throughput: 0: 50782.7. Samples: 3189224220. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:35:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 02:35:03,273][49750] Updated weights for policy 0, policy_version 331811 (0.0032) [2024-04-27 02:35:06,215][49750] Updated weights for policy 0, policy_version 331821 (0.0028) [2024-04-27 02:35:07,062][49517] Fps is (10 sec: 49152.0, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5436555264. Throughput: 0: 50918.4. Samples: 3189366260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:35:07,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:35:09,822][49750] Updated weights for policy 0, policy_version 331831 (0.0031) [2024-04-27 02:35:12,062][49517] Fps is (10 sec: 52428.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5436833792. Throughput: 0: 50933.4. Samples: 3189674060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:35:12,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:35:12,551][49750] Updated weights for policy 0, policy_version 331841 (0.0028) [2024-04-27 02:35:16,128][49750] Updated weights for policy 0, policy_version 331851 (0.0030) [2024-04-27 02:35:17,062][49517] Fps is (10 sec: 54067.2, 60 sec: 51609.6, 300 sec: 50873.8). Total num frames: 5437095936. Throughput: 0: 50843.3. Samples: 3189978800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:35:17,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 02:35:19,317][49750] Updated weights for policy 0, policy_version 331861 (0.0031) [2024-04-27 02:35:22,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5437325312. Throughput: 0: 50853.7. Samples: 3190132680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 02:35:22,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 02:35:22,484][49750] Updated weights for policy 0, policy_version 331871 (0.0029) [2024-04-27 02:35:22,974][49728] Signal inference workers to stop experience collection... (48100 times) [2024-04-27 02:35:23,027][49750] InferenceWorker_p0-w0: stopping experience collection (48100 times) [2024-04-27 02:35:23,045][49728] Signal inference workers to resume experience collection... (48100 times) [2024-04-27 02:35:23,046][49750] InferenceWorker_p0-w0: resuming experience collection (48100 times) [2024-04-27 02:35:25,829][49750] Updated weights for policy 0, policy_version 331881 (0.0034) [2024-04-27 02:35:27,062][49517] Fps is (10 sec: 45875.2, 60 sec: 49971.3, 300 sec: 50818.2). Total num frames: 5437554688. Throughput: 0: 50971.7. Samples: 3190436100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:35:27,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 02:35:28,935][49750] Updated weights for policy 0, policy_version 331891 (0.0032) [2024-04-27 02:35:32,062][49517] Fps is (10 sec: 52429.6, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 5437849600. Throughput: 0: 50958.7. Samples: 3190745340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:35:32,063][49517] Avg episode reward: [(0, '0.621')] [2024-04-27 02:35:32,375][49750] Updated weights for policy 0, policy_version 331901 (0.0028) [2024-04-27 02:35:35,473][49750] Updated weights for policy 0, policy_version 331911 (0.0030) [2024-04-27 02:35:37,062][49517] Fps is (10 sec: 54067.2, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5438095360. Throughput: 0: 50795.2. Samples: 3190896160. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:35:37,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:35:38,754][49750] Updated weights for policy 0, policy_version 331921 (0.0038) [2024-04-27 02:35:41,816][49750] Updated weights for policy 0, policy_version 331931 (0.0031) [2024-04-27 02:35:42,062][49517] Fps is (10 sec: 50789.9, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5438357504. Throughput: 0: 50772.8. Samples: 3191204200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:35:42,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 02:35:44,974][49750] Updated weights for policy 0, policy_version 331941 (0.0037) [2024-04-27 02:35:47,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.3, 300 sec: 50929.3). Total num frames: 5438603264. Throughput: 0: 50813.3. Samples: 3191510820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:35:47,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 02:35:48,061][49750] Updated weights for policy 0, policy_version 331951 (0.0030) [2024-04-27 02:35:51,538][49750] Updated weights for policy 0, policy_version 331961 (0.0034) [2024-04-27 02:35:52,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5438865408. Throughput: 0: 50978.7. Samples: 3191660300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:35:52,063][49517] Avg episode reward: [(0, '0.599')] [2024-04-27 02:35:54,733][49750] Updated weights for policy 0, policy_version 331971 (0.0031) [2024-04-27 02:35:57,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 5439111168. Throughput: 0: 50870.5. Samples: 3191963240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:35:57,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 02:35:58,181][49750] Updated weights for policy 0, policy_version 331981 (0.0035) [2024-04-27 02:36:01,026][49750] Updated weights for policy 0, policy_version 331991 (0.0036) [2024-04-27 02:36:02,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5439373312. Throughput: 0: 50904.7. Samples: 3192269520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:36:02,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 02:36:04,471][49750] Updated weights for policy 0, policy_version 332001 (0.0029) [2024-04-27 02:36:07,062][49517] Fps is (10 sec: 52429.5, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 5439635456. Throughput: 0: 50930.7. Samples: 3192424560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:36:07,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:36:07,449][49750] Updated weights for policy 0, policy_version 332011 (0.0031) [2024-04-27 02:36:10,872][49750] Updated weights for policy 0, policy_version 332021 (0.0035) [2024-04-27 02:36:12,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50517.2, 300 sec: 50873.7). Total num frames: 5439864832. Throughput: 0: 50913.6. Samples: 3192727220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:36:12,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 02:36:13,939][49750] Updated weights for policy 0, policy_version 332031 (0.0031) [2024-04-27 02:36:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5440126976. Throughput: 0: 50842.6. Samples: 3193033260. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:36:17,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 02:36:17,426][49750] Updated weights for policy 0, policy_version 332041 (0.0031) [2024-04-27 02:36:20,208][49750] Updated weights for policy 0, policy_version 332051 (0.0031) [2024-04-27 02:36:22,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5440389120. Throughput: 0: 50912.5. Samples: 3193187220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:36:22,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-27 02:36:24,014][49750] Updated weights for policy 0, policy_version 332061 (0.0034) [2024-04-27 02:36:26,858][49750] Updated weights for policy 0, policy_version 332071 (0.0033) [2024-04-27 02:36:27,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51609.5, 300 sec: 50818.2). Total num frames: 5440651264. Throughput: 0: 50826.2. Samples: 3193491380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:36:27,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 02:36:30,531][49750] Updated weights for policy 0, policy_version 332081 (0.0032) [2024-04-27 02:36:32,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5440880640. Throughput: 0: 50784.4. Samples: 3193796120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:36:32,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 02:36:33,243][49750] Updated weights for policy 0, policy_version 332091 (0.0032) [2024-04-27 02:36:37,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5441126400. Throughput: 0: 50728.0. Samples: 3193943060. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 02:36:37,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 02:36:37,100][49750] Updated weights for policy 0, policy_version 332101 (0.0031) [2024-04-27 02:36:39,799][49750] Updated weights for policy 0, policy_version 332111 (0.0033) [2024-04-27 02:36:42,063][49517] Fps is (10 sec: 52428.0, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5441404928. Throughput: 0: 50757.3. Samples: 3194247320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:36:42,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 02:36:42,073][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000332117_5441404928.pth... [2024-04-27 02:36:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000331374_5429231616.pth [2024-04-27 02:36:43,650][49750] Updated weights for policy 0, policy_version 332121 (0.0037) [2024-04-27 02:36:44,771][49728] Signal inference workers to stop experience collection... (48150 times) [2024-04-27 02:36:44,821][49750] InferenceWorker_p0-w0: stopping experience collection (48150 times) [2024-04-27 02:36:44,833][49728] Signal inference workers to resume experience collection... (48150 times) [2024-04-27 02:36:44,841][49750] InferenceWorker_p0-w0: resuming experience collection (48150 times) [2024-04-27 02:36:46,450][49750] Updated weights for policy 0, policy_version 332131 (0.0032) [2024-04-27 02:36:47,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5441667072. Throughput: 0: 50681.3. Samples: 3194550180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:36:47,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-27 02:36:50,037][49750] Updated weights for policy 0, policy_version 332141 (0.0035) [2024-04-27 02:36:52,063][49517] Fps is (10 sec: 50790.0, 60 sec: 50790.2, 300 sec: 50873.7). Total num frames: 5441912832. Throughput: 0: 50690.9. 
Samples: 3194705660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:36:52,063][49517] Avg episode reward: [(0, '0.663')] [2024-04-27 02:36:52,883][49750] Updated weights for policy 0, policy_version 332151 (0.0031) [2024-04-27 02:36:56,389][49750] Updated weights for policy 0, policy_version 332161 (0.0028) [2024-04-27 02:36:57,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5442158592. Throughput: 0: 50685.7. Samples: 3195008080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:36:57,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 02:36:59,707][49750] Updated weights for policy 0, policy_version 332171 (0.0030) [2024-04-27 02:37:02,063][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5442404352. Throughput: 0: 50665.2. Samples: 3195313200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:02,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 02:37:02,941][49750] Updated weights for policy 0, policy_version 332181 (0.0033) [2024-04-27 02:37:05,978][49750] Updated weights for policy 0, policy_version 332191 (0.0030) [2024-04-27 02:37:07,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5442666496. Throughput: 0: 50577.1. Samples: 3195463200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:07,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 02:37:09,423][49750] Updated weights for policy 0, policy_version 332201 (0.0033) [2024-04-27 02:37:12,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5442912256. Throughput: 0: 50714.3. Samples: 3195773520. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 02:37:12,269][49750] Updated weights for policy 0, policy_version 332211 (0.0035) [2024-04-27 02:37:15,740][49750] Updated weights for policy 0, policy_version 332221 (0.0032) [2024-04-27 02:37:17,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 5443190784. Throughput: 0: 50809.6. Samples: 3196082560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:17,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 02:37:18,650][49750] Updated weights for policy 0, policy_version 332231 (0.0032) [2024-04-27 02:37:22,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5443420160. Throughput: 0: 50749.3. Samples: 3196226780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:22,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-27 02:37:22,073][49750] Updated weights for policy 0, policy_version 332241 (0.0037) [2024-04-27 02:37:25,026][49750] Updated weights for policy 0, policy_version 332251 (0.0029) [2024-04-27 02:37:27,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50762.7). Total num frames: 5443682304. Throughput: 0: 50738.2. Samples: 3196530540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:27,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 02:37:28,711][49750] Updated weights for policy 0, policy_version 332261 (0.0034) [2024-04-27 02:37:31,365][49750] Updated weights for policy 0, policy_version 332271 (0.0029) [2024-04-27 02:37:32,063][49517] Fps is (10 sec: 52427.7, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5443944448. Throughput: 0: 50670.6. Samples: 3196830360. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:32,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 02:37:35,296][49750] Updated weights for policy 0, policy_version 332281 (0.0026) [2024-04-27 02:37:37,062][49517] Fps is (10 sec: 50791.5, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5444190208. Throughput: 0: 50778.1. Samples: 3196990660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:37,063][49517] Avg episode reward: [(0, '0.492')] [2024-04-27 02:37:38,038][49750] Updated weights for policy 0, policy_version 332291 (0.0030) [2024-04-27 02:37:41,560][49750] Updated weights for policy 0, policy_version 332301 (0.0031) [2024-04-27 02:37:42,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5444435968. Throughput: 0: 50902.2. Samples: 3197298680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:42,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 02:37:44,334][49750] Updated weights for policy 0, policy_version 332311 (0.0031) [2024-04-27 02:37:47,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5444698112. Throughput: 0: 50958.3. Samples: 3197606320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:47,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:37:47,910][49750] Updated weights for policy 0, policy_version 332321 (0.0029) [2024-04-27 02:37:50,790][49750] Updated weights for policy 0, policy_version 332331 (0.0027) [2024-04-27 02:37:52,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50790.7, 300 sec: 50762.7). Total num frames: 5444960256. Throughput: 0: 50918.5. Samples: 3197754520. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 02:37:52,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 02:37:54,520][49750] Updated weights for policy 0, policy_version 332341 (0.0033) [2024-04-27 02:37:57,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5445206016. Throughput: 0: 50772.0. Samples: 3198058260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:37:57,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 02:37:57,173][49728] Signal inference workers to stop experience collection... (48200 times) [2024-04-27 02:37:57,173][49728] Signal inference workers to resume experience collection... (48200 times) [2024-04-27 02:37:57,204][49750] InferenceWorker_p0-w0: stopping experience collection (48200 times) [2024-04-27 02:37:57,204][49750] InferenceWorker_p0-w0: resuming experience collection (48200 times) [2024-04-27 02:37:57,306][49750] Updated weights for policy 0, policy_version 332351 (0.0030) [2024-04-27 02:38:00,972][49750] Updated weights for policy 0, policy_version 332361 (0.0027) [2024-04-27 02:38:02,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5445451776. Throughput: 0: 50756.6. Samples: 3198366600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:02,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 02:38:03,733][49750] Updated weights for policy 0, policy_version 332371 (0.0031) [2024-04-27 02:38:07,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5445713920. Throughput: 0: 50914.6. Samples: 3198517940. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:07,063][49517] Avg episode reward: [(0, '0.459')] [2024-04-27 02:38:07,307][49750] Updated weights for policy 0, policy_version 332381 (0.0031) [2024-04-27 02:38:10,031][49750] Updated weights for policy 0, policy_version 332391 (0.0033) [2024-04-27 02:38:12,063][49517] Fps is (10 sec: 52427.4, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5445976064. Throughput: 0: 50851.4. Samples: 3198818860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:12,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 02:38:13,775][49750] Updated weights for policy 0, policy_version 332401 (0.0032) [2024-04-27 02:38:16,525][49750] Updated weights for policy 0, policy_version 332411 (0.0027) [2024-04-27 02:38:17,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 5446238208. Throughput: 0: 50990.4. Samples: 3199124920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:17,063][49517] Avg episode reward: [(0, '0.679')] [2024-04-27 02:38:20,179][49750] Updated weights for policy 0, policy_version 332421 (0.0030) [2024-04-27 02:38:22,062][49517] Fps is (10 sec: 52430.1, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5446500352. Throughput: 0: 50962.1. Samples: 3199283960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:22,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 02:38:23,047][49750] Updated weights for policy 0, policy_version 332431 (0.0034) [2024-04-27 02:38:26,606][49750] Updated weights for policy 0, policy_version 332441 (0.0029) [2024-04-27 02:38:27,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5446729728. Throughput: 0: 50925.9. Samples: 3199590340. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:27,063][49517] Avg episode reward: [(0, '0.524')] [2024-04-27 02:38:29,422][49750] Updated weights for policy 0, policy_version 332451 (0.0033) [2024-04-27 02:38:32,062][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5446991872. Throughput: 0: 50836.0. Samples: 3199893940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:32,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-27 02:38:33,012][49750] Updated weights for policy 0, policy_version 332461 (0.0038) [2024-04-27 02:38:35,751][49750] Updated weights for policy 0, policy_version 332471 (0.0035) [2024-04-27 02:38:37,062][49517] Fps is (10 sec: 52429.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5447254016. Throughput: 0: 50947.1. Samples: 3200047140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:37,063][49517] Avg episode reward: [(0, '0.626')] [2024-04-27 02:38:39,589][49750] Updated weights for policy 0, policy_version 332481 (0.0031) [2024-04-27 02:38:42,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5447499776. Throughput: 0: 50891.0. Samples: 3200348360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:42,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-27 02:38:42,075][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000332489_5447499776.pth... [2024-04-27 02:38:42,130][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000331744_5435293696.pth [2024-04-27 02:38:42,452][49750] Updated weights for policy 0, policy_version 332491 (0.0033) [2024-04-27 02:38:45,890][49750] Updated weights for policy 0, policy_version 332501 (0.0033) [2024-04-27 02:38:47,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5447745536. Throughput: 0: 50898.3. Samples: 3200657020. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:47,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 02:38:48,828][49750] Updated weights for policy 0, policy_version 332511 (0.0036) [2024-04-27 02:38:52,062][49517] Fps is (10 sec: 49153.2, 60 sec: 50517.4, 300 sec: 50873.8). Total num frames: 5447991296. Throughput: 0: 50906.4. Samples: 3200808720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:52,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-27 02:38:52,237][49750] Updated weights for policy 0, policy_version 332521 (0.0033) [2024-04-27 02:38:55,124][49750] Updated weights for policy 0, policy_version 332531 (0.0026) [2024-04-27 02:38:57,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5448269824. Throughput: 0: 50943.0. Samples: 3201111280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:38:57,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:38:58,520][49750] Updated weights for policy 0, policy_version 332541 (0.0029) [2024-04-27 02:39:01,604][49750] Updated weights for policy 0, policy_version 332551 (0.0031) [2024-04-27 02:39:02,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50762.7). Total num frames: 5448515584. Throughput: 0: 50972.5. Samples: 3201418680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:39:02,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 02:39:04,869][49750] Updated weights for policy 0, policy_version 332561 (0.0033) [2024-04-27 02:39:07,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5448777728. Throughput: 0: 50945.2. Samples: 3201576500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 02:39:07,063][49517] Avg episode reward: [(0, '0.540')] [2024-04-27 02:39:08,002][49750] Updated weights for policy 0, policy_version 332571 (0.0034) [2024-04-27 02:39:11,299][49750] Updated weights for policy 0, policy_version 332581 (0.0029) [2024-04-27 02:39:12,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.6, 300 sec: 50929.2). Total num frames: 5449023488. Throughput: 0: 50925.7. Samples: 3201882000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 02:39:14,381][49750] Updated weights for policy 0, policy_version 332591 (0.0029) [2024-04-27 02:39:17,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5449269248. Throughput: 0: 50993.8. Samples: 3202188660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:17,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-27 02:39:18,095][49750] Updated weights for policy 0, policy_version 332601 (0.0026) [2024-04-27 02:39:18,986][49728] Signal inference workers to stop experience collection... (48250 times) [2024-04-27 02:39:18,987][49728] Signal inference workers to resume experience collection... (48250 times) [2024-04-27 02:39:19,010][49750] InferenceWorker_p0-w0: stopping experience collection (48250 times) [2024-04-27 02:39:19,010][49750] InferenceWorker_p0-w0: resuming experience collection (48250 times) [2024-04-27 02:39:20,864][49750] Updated weights for policy 0, policy_version 332611 (0.0031) [2024-04-27 02:39:22,062][49517] Fps is (10 sec: 52429.7, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5449547776. Throughput: 0: 50773.0. Samples: 3202331920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:22,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 02:39:24,482][49750] Updated weights for policy 0, policy_version 332621 (0.0030) [2024-04-27 02:39:27,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51336.6, 300 sec: 50873.7). Total num frames: 5449809920. Throughput: 0: 50904.6. Samples: 3202639060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:27,063][49517] Avg episode reward: [(0, '0.644')] [2024-04-27 02:39:27,190][49750] Updated weights for policy 0, policy_version 332631 (0.0028) [2024-04-27 02:39:30,831][49750] Updated weights for policy 0, policy_version 332641 (0.0032) [2024-04-27 02:39:32,062][49517] Fps is (10 sec: 52428.3, 60 sec: 51336.6, 300 sec: 50929.2). Total num frames: 5450072064. Throughput: 0: 50853.7. Samples: 3202945440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:32,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:39:33,620][49750] Updated weights for policy 0, policy_version 332651 (0.0030) [2024-04-27 02:39:37,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5450285056. Throughput: 0: 51007.0. Samples: 3203104040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:37,063][49517] Avg episode reward: [(0, '0.698')] [2024-04-27 02:39:37,314][49750] Updated weights for policy 0, policy_version 332661 (0.0028) [2024-04-27 02:39:40,099][49750] Updated weights for policy 0, policy_version 332671 (0.0029) [2024-04-27 02:39:42,062][49517] Fps is (10 sec: 49151.5, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 5450563584. Throughput: 0: 51047.0. Samples: 3203408400. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:42,063][49517] Avg episode reward: [(0, '0.651')] [2024-04-27 02:39:43,618][49750] Updated weights for policy 0, policy_version 332681 (0.0034) [2024-04-27 02:39:46,706][49750] Updated weights for policy 0, policy_version 332691 (0.0034) [2024-04-27 02:39:47,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5450809344. Throughput: 0: 50969.2. Samples: 3203712300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:47,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 02:39:50,192][49750] Updated weights for policy 0, policy_version 332701 (0.0030) [2024-04-27 02:39:52,063][49517] Fps is (10 sec: 49151.7, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5451055104. Throughput: 0: 50780.8. Samples: 3203861640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:52,064][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 02:39:53,265][49750] Updated weights for policy 0, policy_version 332711 (0.0032) [2024-04-27 02:39:56,612][49750] Updated weights for policy 0, policy_version 332721 (0.0032) [2024-04-27 02:39:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50929.2). Total num frames: 5451333632. Throughput: 0: 50798.1. Samples: 3204167920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:39:57,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 02:39:59,683][49750] Updated weights for policy 0, policy_version 332731 (0.0033) [2024-04-27 02:40:02,062][49517] Fps is (10 sec: 50791.4, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5451563008. Throughput: 0: 50827.6. Samples: 3204475900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:40:02,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 02:40:02,955][49750] Updated weights for policy 0, policy_version 332741 (0.0031) [2024-04-27 02:40:06,245][49750] Updated weights for policy 0, policy_version 332751 (0.0031) [2024-04-27 02:40:07,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5451825152. Throughput: 0: 50947.5. Samples: 3204624560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:40:07,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-27 02:40:09,318][49750] Updated weights for policy 0, policy_version 332761 (0.0036) [2024-04-27 02:40:12,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5452087296. Throughput: 0: 50857.7. Samples: 3204927660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:40:12,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 02:40:12,849][49750] Updated weights for policy 0, policy_version 332771 (0.0030) [2024-04-27 02:40:15,871][49750] Updated weights for policy 0, policy_version 332781 (0.0037) [2024-04-27 02:40:17,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51336.6, 300 sec: 50929.3). Total num frames: 5452349440. Throughput: 0: 50728.9. Samples: 3205228240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:40:17,063][49517] Avg episode reward: [(0, '0.608')] [2024-04-27 02:40:19,260][49750] Updated weights for policy 0, policy_version 332791 (0.0045) [2024-04-27 02:40:22,062][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.3, 300 sec: 50929.2). Total num frames: 5452578816. Throughput: 0: 50806.7. Samples: 3205390340. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:40:22,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-27 02:40:22,291][49750] Updated weights for policy 0, policy_version 332801 (0.0029) [2024-04-27 02:40:25,668][49750] Updated weights for policy 0, policy_version 332811 (0.0031) [2024-04-27 02:40:27,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.3, 300 sec: 50762.6). Total num frames: 5452824576. Throughput: 0: 50789.9. Samples: 3205693940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:40:27,063][49517] Avg episode reward: [(0, '0.669')] [2024-04-27 02:40:28,373][49728] Signal inference workers to stop experience collection... (48300 times) [2024-04-27 02:40:28,416][49750] InferenceWorker_p0-w0: stopping experience collection (48300 times) [2024-04-27 02:40:28,479][49728] Signal inference workers to resume experience collection... (48300 times) [2024-04-27 02:40:28,479][49750] InferenceWorker_p0-w0: resuming experience collection (48300 times) [2024-04-27 02:40:28,605][49750] Updated weights for policy 0, policy_version 332821 (0.0030) [2024-04-27 02:40:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50244.3, 300 sec: 50818.2). Total num frames: 5453086720. Throughput: 0: 50687.1. Samples: 3205993220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:40:32,063][49517] Avg episode reward: [(0, '0.603')] [2024-04-27 02:40:32,264][49750] Updated weights for policy 0, policy_version 332831 (0.0036) [2024-04-27 02:40:35,099][49750] Updated weights for policy 0, policy_version 332841 (0.0035) [2024-04-27 02:40:37,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5453348864. Throughput: 0: 50929.5. Samples: 3206153460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:40:37,063][49517] Avg episode reward: [(0, '0.654')] [2024-04-27 02:40:38,570][49750] Updated weights for policy 0, policy_version 332851 (0.0039) [2024-04-27 02:40:41,514][49750] Updated weights for policy 0, policy_version 332861 (0.0028) [2024-04-27 02:40:42,063][49517] Fps is (10 sec: 54066.4, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5453627392. Throughput: 0: 50890.7. Samples: 3206458000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:40:42,063][49517] Avg episode reward: [(0, '0.513')] [2024-04-27 02:40:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000332863_5453627392.pth... [2024-04-27 02:40:42,117][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000332117_5441404928.pth [2024-04-27 02:40:45,028][49750] Updated weights for policy 0, policy_version 332871 (0.0034) [2024-04-27 02:40:47,063][49517] Fps is (10 sec: 49151.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5453840384. Throughput: 0: 50851.4. Samples: 3206764220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:40:47,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 02:40:47,930][49750] Updated weights for policy 0, policy_version 332881 (0.0028) [2024-04-27 02:40:51,793][49750] Updated weights for policy 0, policy_version 332891 (0.0032) [2024-04-27 02:40:52,062][49517] Fps is (10 sec: 47514.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5454102528. Throughput: 0: 50838.2. Samples: 3206912280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:40:52,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 02:40:54,315][49750] Updated weights for policy 0, policy_version 332901 (0.0035) [2024-04-27 02:40:57,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5454364672. Throughput: 0: 50724.5. Samples: 3207210260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:40:57,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-27 02:40:58,136][49750] Updated weights for policy 0, policy_version 332911 (0.0031) [2024-04-27 02:41:00,689][49750] Updated weights for policy 0, policy_version 332921 (0.0032) [2024-04-27 02:41:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50818.2). Total num frames: 5454626816. Throughput: 0: 50810.9. Samples: 3207514740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:41:02,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 02:41:04,424][49750] Updated weights for policy 0, policy_version 332931 (0.0033) [2024-04-27 02:41:07,062][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5454888960. Throughput: 0: 50783.1. Samples: 3207675580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:41:07,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 02:41:07,131][49750] Updated weights for policy 0, policy_version 332941 (0.0031) [2024-04-27 02:41:10,953][49750] Updated weights for policy 0, policy_version 332951 (0.0030) [2024-04-27 02:41:12,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5455118336. Throughput: 0: 50817.3. Samples: 3207980720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:41:12,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:41:13,559][49750] Updated weights for policy 0, policy_version 332961 (0.0036) [2024-04-27 02:41:17,063][49517] Fps is (10 sec: 47512.8, 60 sec: 50244.1, 300 sec: 50762.6). Total num frames: 5455364096. Throughput: 0: 50856.3. Samples: 3208281760. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:41:17,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 02:41:17,430][49750] Updated weights for policy 0, policy_version 332971 (0.0028) [2024-04-27 02:41:20,112][49750] Updated weights for policy 0, policy_version 332981 (0.0028) [2024-04-27 02:41:22,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5455626240. Throughput: 0: 50701.0. Samples: 3208435000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:41:22,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 02:41:23,755][49750] Updated weights for policy 0, policy_version 332991 (0.0036) [2024-04-27 02:41:26,616][49750] Updated weights for policy 0, policy_version 333001 (0.0038) [2024-04-27 02:41:27,062][49517] Fps is (10 sec: 54068.1, 60 sec: 51336.5, 300 sec: 50929.3). Total num frames: 5455904768. Throughput: 0: 50857.1. Samples: 3208746560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:41:27,063][49517] Avg episode reward: [(0, '0.502')] [2024-04-27 02:41:30,064][49750] Updated weights for policy 0, policy_version 333011 (0.0028) [2024-04-27 02:41:32,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5456134144. Throughput: 0: 50837.9. Samples: 3209051920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:41:32,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 02:41:32,983][49750] Updated weights for policy 0, policy_version 333021 (0.0030) [2024-04-27 02:41:34,618][49728] Signal inference workers to stop experience collection... (48350 times) [2024-04-27 02:41:34,649][49750] InferenceWorker_p0-w0: stopping experience collection (48350 times) [2024-04-27 02:41:34,685][49728] Signal inference workers to resume experience collection... 
(48350 times) [2024-04-27 02:41:34,685][49750] InferenceWorker_p0-w0: resuming experience collection (48350 times) [2024-04-27 02:41:36,509][49750] Updated weights for policy 0, policy_version 333031 (0.0031) [2024-04-27 02:41:37,062][49517] Fps is (10 sec: 49151.6, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5456396288. Throughput: 0: 50877.7. Samples: 3209201780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:41:37,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:41:39,449][49750] Updated weights for policy 0, policy_version 333041 (0.0031) [2024-04-27 02:41:42,063][49517] Fps is (10 sec: 52427.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5456658432. Throughput: 0: 50956.7. Samples: 3209503320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 02:41:42,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 02:41:43,078][49750] Updated weights for policy 0, policy_version 333051 (0.0030) [2024-04-27 02:41:46,072][49750] Updated weights for policy 0, policy_version 333061 (0.0031) [2024-04-27 02:41:47,062][49517] Fps is (10 sec: 50790.5, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5456904192. Throughput: 0: 50861.0. Samples: 3209803480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:41:47,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 02:41:49,398][49750] Updated weights for policy 0, policy_version 333071 (0.0036) [2024-04-27 02:41:52,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5457166336. Throughput: 0: 50758.6. Samples: 3209959720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:41:52,063][49517] Avg episode reward: [(0, '0.491')] [2024-04-27 02:41:52,364][49750] Updated weights for policy 0, policy_version 333081 (0.0032) [2024-04-27 02:41:55,825][49750] Updated weights for policy 0, policy_version 333091 (0.0035) [2024-04-27 02:41:57,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5457412096. Throughput: 0: 50797.2. Samples: 3210266600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:41:57,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 02:41:58,695][49750] Updated weights for policy 0, policy_version 333101 (0.0031) [2024-04-27 02:42:02,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5457657856. Throughput: 0: 50966.0. Samples: 3210575220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:02,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-27 02:42:02,390][49750] Updated weights for policy 0, policy_version 333111 (0.0029) [2024-04-27 02:42:05,135][49750] Updated weights for policy 0, policy_version 333121 (0.0033) [2024-04-27 02:42:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5457920000. Throughput: 0: 50793.2. Samples: 3210720700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:07,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 02:42:08,948][49750] Updated weights for policy 0, policy_version 333131 (0.0034) [2024-04-27 02:42:11,641][49750] Updated weights for policy 0, policy_version 333141 (0.0031) [2024-04-27 02:42:12,062][49517] Fps is (10 sec: 54066.9, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5458198528. Throughput: 0: 50715.6. Samples: 3211028760. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:12,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:42:15,354][49750] Updated weights for policy 0, policy_version 333151 (0.0034) [2024-04-27 02:42:17,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5458427904. Throughput: 0: 50800.3. Samples: 3211337940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:17,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 02:42:18,027][49750] Updated weights for policy 0, policy_version 333161 (0.0036) [2024-04-27 02:42:21,606][49750] Updated weights for policy 0, policy_version 333171 (0.0032) [2024-04-27 02:42:22,063][49517] Fps is (10 sec: 47512.5, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 5458673664. Throughput: 0: 50683.4. Samples: 3211482540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:22,063][49517] Avg episode reward: [(0, '0.573')] [2024-04-27 02:42:24,430][49750] Updated weights for policy 0, policy_version 333181 (0.0034) [2024-04-27 02:42:27,063][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5458935808. Throughput: 0: 50636.6. Samples: 3211781960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:27,063][49517] Avg episode reward: [(0, '0.516')] [2024-04-27 02:42:28,152][49750] Updated weights for policy 0, policy_version 333191 (0.0033) [2024-04-27 02:42:31,012][49750] Updated weights for policy 0, policy_version 333201 (0.0029) [2024-04-27 02:42:32,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5459197952. Throughput: 0: 50733.3. Samples: 3212086480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:32,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 02:42:32,580][49728] Signal inference workers to stop experience collection... 
(48400 times) [2024-04-27 02:42:32,628][49750] InferenceWorker_p0-w0: stopping experience collection (48400 times) [2024-04-27 02:42:32,648][49728] Signal inference workers to resume experience collection... (48400 times) [2024-04-27 02:42:32,649][49750] InferenceWorker_p0-w0: resuming experience collection (48400 times) [2024-04-27 02:42:34,644][49750] Updated weights for policy 0, policy_version 333211 (0.0034) [2024-04-27 02:42:37,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.4, 300 sec: 50929.3). Total num frames: 5459460096. Throughput: 0: 50806.6. Samples: 3212246020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:37,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 02:42:37,578][49750] Updated weights for policy 0, policy_version 333221 (0.0034) [2024-04-27 02:42:40,922][49750] Updated weights for policy 0, policy_version 333231 (0.0028) [2024-04-27 02:42:42,062][49517] Fps is (10 sec: 47513.5, 60 sec: 50244.4, 300 sec: 50762.6). Total num frames: 5459673088. Throughput: 0: 50729.8. Samples: 3212549440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:42,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 02:42:42,135][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000333233_5459689472.pth... [2024-04-27 02:42:42,178][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000332489_5447499776.pth [2024-04-27 02:42:43,875][49750] Updated weights for policy 0, policy_version 333241 (0.0032) [2024-04-27 02:42:47,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5459951616. Throughput: 0: 50657.9. Samples: 3212854840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:47,072][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 02:42:47,240][49750] Updated weights for policy 0, policy_version 333251 (0.0033) [2024-04-27 02:42:50,214][49750] Updated weights for policy 0, policy_version 333261 (0.0033) [2024-04-27 02:42:52,062][49517] Fps is (10 sec: 52429.2, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5460197376. Throughput: 0: 50856.1. Samples: 3213009220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:52,063][49517] Avg episode reward: [(0, '0.494')] [2024-04-27 02:42:53,642][49750] Updated weights for policy 0, policy_version 333271 (0.0030) [2024-04-27 02:42:56,697][49750] Updated weights for policy 0, policy_version 333281 (0.0032) [2024-04-27 02:42:57,062][49517] Fps is (10 sec: 52429.7, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5460475904. Throughput: 0: 50952.0. Samples: 3213321600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 02:42:57,063][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 02:43:00,153][49750] Updated weights for policy 0, policy_version 333291 (0.0028) [2024-04-27 02:43:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5460721664. Throughput: 0: 50785.4. Samples: 3213623280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:02,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 02:43:03,172][49750] Updated weights for policy 0, policy_version 333301 (0.0032) [2024-04-27 02:43:06,953][49750] Updated weights for policy 0, policy_version 333311 (0.0038) [2024-04-27 02:43:07,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5460967424. Throughput: 0: 50836.1. Samples: 3213770160. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:07,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 02:43:09,579][49750] Updated weights for policy 0, policy_version 333321 (0.0030) [2024-04-27 02:43:12,063][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.2, 300 sec: 50818.1). Total num frames: 5461229568. Throughput: 0: 51008.0. Samples: 3214077320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:12,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 02:43:13,539][49750] Updated weights for policy 0, policy_version 333331 (0.0035) [2024-04-27 02:43:15,938][49750] Updated weights for policy 0, policy_version 333341 (0.0032) [2024-04-27 02:43:17,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5461491712. Throughput: 0: 50846.2. Samples: 3214374560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:17,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 02:43:19,896][49750] Updated weights for policy 0, policy_version 333351 (0.0030) [2024-04-27 02:43:22,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5461721088. Throughput: 0: 50803.2. Samples: 3214532160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:22,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 02:43:22,544][49750] Updated weights for policy 0, policy_version 333361 (0.0026) [2024-04-27 02:43:26,172][49750] Updated weights for policy 0, policy_version 333371 (0.0037) [2024-04-27 02:43:27,063][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5461966848. Throughput: 0: 50886.1. Samples: 3214839320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:27,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 02:43:28,743][49728] Signal inference workers to stop experience collection... 
(48450 times) [2024-04-27 02:43:28,782][49750] InferenceWorker_p0-w0: stopping experience collection (48450 times) [2024-04-27 02:43:28,810][49728] Signal inference workers to resume experience collection... (48450 times) [2024-04-27 02:43:28,811][49750] InferenceWorker_p0-w0: resuming experience collection (48450 times) [2024-04-27 02:43:29,072][49750] Updated weights for policy 0, policy_version 333381 (0.0032) [2024-04-27 02:43:32,063][49517] Fps is (10 sec: 52428.6, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5462245376. Throughput: 0: 50861.4. Samples: 3215143600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:32,063][49517] Avg episode reward: [(0, '0.631')] [2024-04-27 02:43:32,549][49750] Updated weights for policy 0, policy_version 333391 (0.0031) [2024-04-27 02:43:35,567][49750] Updated weights for policy 0, policy_version 333401 (0.0032) [2024-04-27 02:43:37,062][49517] Fps is (10 sec: 52429.5, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5462491136. Throughput: 0: 50832.9. Samples: 3215296700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:37,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 02:43:38,953][49750] Updated weights for policy 0, policy_version 333411 (0.0031) [2024-04-27 02:43:42,009][49750] Updated weights for policy 0, policy_version 333421 (0.0029) [2024-04-27 02:43:42,063][49517] Fps is (10 sec: 52429.1, 60 sec: 51609.6, 300 sec: 50929.2). Total num frames: 5462769664. Throughput: 0: 50660.4. Samples: 3215601320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:42,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 02:43:45,390][49750] Updated weights for policy 0, policy_version 333431 (0.0033) [2024-04-27 02:43:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5462999040. Throughput: 0: 50785.3. Samples: 3215908620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:47,063][49517] Avg episode reward: [(0, '0.520')] [2024-04-27 02:43:48,481][49750] Updated weights for policy 0, policy_version 333441 (0.0037) [2024-04-27 02:43:51,670][49750] Updated weights for policy 0, policy_version 333451 (0.0032) [2024-04-27 02:43:52,064][49517] Fps is (10 sec: 49146.4, 60 sec: 51062.4, 300 sec: 50818.0). Total num frames: 5463261184. Throughput: 0: 50823.1. Samples: 3216057260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:52,064][49517] Avg episode reward: [(0, '0.642')] [2024-04-27 02:43:54,985][49750] Updated weights for policy 0, policy_version 333461 (0.0027) [2024-04-27 02:43:57,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5463506944. Throughput: 0: 50713.1. Samples: 3216359400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:43:57,063][49517] Avg episode reward: [(0, '0.517')] [2024-04-27 02:43:58,486][49750] Updated weights for policy 0, policy_version 333471 (0.0028) [2024-04-27 02:44:01,337][49750] Updated weights for policy 0, policy_version 333481 (0.0033) [2024-04-27 02:44:02,062][49517] Fps is (10 sec: 50796.5, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5463769088. Throughput: 0: 50784.0. Samples: 3216659840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:44:02,063][49517] Avg episode reward: [(0, '0.548')] [2024-04-27 02:44:04,937][49750] Updated weights for policy 0, policy_version 333491 (0.0027) [2024-04-27 02:44:07,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5464014848. Throughput: 0: 50748.1. Samples: 3216815820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:44:07,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 02:44:07,664][49750] Updated weights for policy 0, policy_version 333501 (0.0029) [2024-04-27 02:44:11,347][49750] Updated weights for policy 0, policy_version 333511 (0.0027) [2024-04-27 02:44:12,062][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.6, 300 sec: 50818.2). Total num frames: 5464260608. Throughput: 0: 50668.3. Samples: 3217119380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 02:44:12,062][49517] Avg episode reward: [(0, '0.524')] [2024-04-27 02:44:14,226][49750] Updated weights for policy 0, policy_version 333521 (0.0031) [2024-04-27 02:44:17,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50818.1). Total num frames: 5464539136. Throughput: 0: 50722.3. Samples: 3217426100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:44:17,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 02:44:17,893][49750] Updated weights for policy 0, policy_version 333531 (0.0035) [2024-04-27 02:44:20,724][49750] Updated weights for policy 0, policy_version 333541 (0.0029) [2024-04-27 02:44:22,063][49517] Fps is (10 sec: 52427.4, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5464784896. Throughput: 0: 50587.4. Samples: 3217573140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:44:22,063][49517] Avg episode reward: [(0, '0.616')] [2024-04-27 02:44:24,265][49750] Updated weights for policy 0, policy_version 333551 (0.0038) [2024-04-27 02:44:27,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51336.5, 300 sec: 50762.6). Total num frames: 5465047040. Throughput: 0: 50647.0. Samples: 3217880440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:44:27,063][49517] Avg episode reward: [(0, '0.584')] [2024-04-27 02:44:27,617][49750] Updated weights for policy 0, policy_version 333561 (0.0029) [2024-04-27 02:44:30,563][49750] Updated weights for policy 0, policy_version 333571 (0.0037) [2024-04-27 02:44:32,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5465276416. Throughput: 0: 50669.1. Samples: 3218188720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:44:32,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 02:44:34,142][49750] Updated weights for policy 0, policy_version 333581 (0.0030) [2024-04-27 02:44:37,001][49750] Updated weights for policy 0, policy_version 333591 (0.0034) [2024-04-27 02:44:37,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5465554944. Throughput: 0: 50619.1. Samples: 3218335060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:44:37,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 02:44:37,862][49728] Signal inference workers to stop experience collection... (48500 times) [2024-04-27 02:44:37,909][49750] InferenceWorker_p0-w0: stopping experience collection (48500 times) [2024-04-27 02:44:37,929][49728] Signal inference workers to resume experience collection... (48500 times) [2024-04-27 02:44:37,930][49750] InferenceWorker_p0-w0: resuming experience collection (48500 times) [2024-04-27 02:44:40,521][49750] Updated weights for policy 0, policy_version 333601 (0.0031) [2024-04-27 02:44:42,063][49517] Fps is (10 sec: 50789.1, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5465784320. Throughput: 0: 50701.9. Samples: 3218641000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:44:42,063][49517] Avg episode reward: [(0, '0.665')] [2024-04-27 02:44:42,198][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000333606_5465800704.pth... 
[2024-04-27 02:44:42,253][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000332863_5453627392.pth [2024-04-27 02:44:43,595][49750] Updated weights for policy 0, policy_version 333611 (0.0031) [2024-04-27 02:44:47,062][49517] Fps is (10 sec: 47513.6, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 5466030080. Throughput: 0: 50714.2. Samples: 3218941980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:44:47,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 02:44:47,101][49750] Updated weights for policy 0, policy_version 333621 (0.0035) [2024-04-27 02:44:49,870][49750] Updated weights for policy 0, policy_version 333631 (0.0027) [2024-04-27 02:44:52,062][49517] Fps is (10 sec: 50791.2, 60 sec: 50518.3, 300 sec: 50707.1). Total num frames: 5466292224. Throughput: 0: 50586.1. Samples: 3219092200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:44:52,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:44:53,630][49750] Updated weights for policy 0, policy_version 333641 (0.0038) [2024-04-27 02:44:56,411][49750] Updated weights for policy 0, policy_version 333651 (0.0032) [2024-04-27 02:44:57,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5466537984. Throughput: 0: 50683.5. Samples: 3219400140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:44:57,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 02:44:59,994][49750] Updated weights for policy 0, policy_version 333661 (0.0032) [2024-04-27 02:45:02,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5466800128. Throughput: 0: 50678.3. Samples: 3219706620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:45:02,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 02:45:03,440][49750] Updated weights for policy 0, policy_version 333671 (0.0030) [2024-04-27 02:45:06,265][49750] Updated weights for policy 0, policy_version 333681 (0.0032) [2024-04-27 02:45:07,062][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5467062272. Throughput: 0: 50734.3. Samples: 3219856180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:45:07,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-27 02:45:09,685][49750] Updated weights for policy 0, policy_version 333691 (0.0029) [2024-04-27 02:45:12,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5467308032. Throughput: 0: 50728.6. Samples: 3220163220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:45:12,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 02:45:12,643][49750] Updated weights for policy 0, policy_version 333701 (0.0033) [2024-04-27 02:45:16,113][49750] Updated weights for policy 0, policy_version 333711 (0.0036) [2024-04-27 02:45:17,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5467570176. Throughput: 0: 50763.1. Samples: 3220473060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:45:17,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 02:45:19,076][49750] Updated weights for policy 0, policy_version 333721 (0.0027) [2024-04-27 02:45:22,062][49517] Fps is (10 sec: 50790.4, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5467815936. Throughput: 0: 50902.3. Samples: 3220625660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:45:22,063][49517] Avg episode reward: [(0, '0.559')] [2024-04-27 02:45:22,502][49750] Updated weights for policy 0, policy_version 333731 (0.0026) [2024-04-27 02:45:25,444][49750] Updated weights for policy 0, policy_version 333741 (0.0029) [2024-04-27 02:45:27,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5468094464. Throughput: 0: 50777.1. Samples: 3220925960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 02:45:27,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 02:45:28,859][49750] Updated weights for policy 0, policy_version 333751 (0.0032) [2024-04-27 02:45:31,885][49750] Updated weights for policy 0, policy_version 333761 (0.0029) [2024-04-27 02:45:32,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5468340224. Throughput: 0: 51034.8. Samples: 3221238540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:45:32,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 02:45:35,328][49750] Updated weights for policy 0, policy_version 333771 (0.0030) [2024-04-27 02:45:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5468585984. Throughput: 0: 51032.8. Samples: 3221388680. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:45:37,063][49517] Avg episode reward: [(0, '0.648')] [2024-04-27 02:45:38,259][49750] Updated weights for policy 0, policy_version 333781 (0.0029) [2024-04-27 02:45:41,721][49750] Updated weights for policy 0, policy_version 333791 (0.0032) [2024-04-27 02:45:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5468848128. Throughput: 0: 50804.7. Samples: 3221686360. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:45:42,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 02:45:44,978][49750] Updated weights for policy 0, policy_version 333801 (0.0033) [2024-04-27 02:45:47,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5469110272. Throughput: 0: 50876.7. Samples: 3221996080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:45:47,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 02:45:48,053][49750] Updated weights for policy 0, policy_version 333811 (0.0027) [2024-04-27 02:45:51,524][49750] Updated weights for policy 0, policy_version 333821 (0.0029) [2024-04-27 02:45:52,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50790.5, 300 sec: 50762.6). Total num frames: 5469339648. Throughput: 0: 51017.0. Samples: 3222151940. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:45:52,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 02:45:54,474][49750] Updated weights for policy 0, policy_version 333831 (0.0032) [2024-04-27 02:45:57,063][49517] Fps is (10 sec: 49152.0, 60 sec: 51063.3, 300 sec: 50762.6). Total num frames: 5469601792. Throughput: 0: 50862.5. Samples: 3222452040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:45:57,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 02:45:57,849][49750] Updated weights for policy 0, policy_version 333841 (0.0032) [2024-04-27 02:46:00,939][49750] Updated weights for policy 0, policy_version 333851 (0.0028) [2024-04-27 02:46:02,062][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5469863936. Throughput: 0: 50725.3. Samples: 3222755700. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:46:02,063][49517] Avg episode reward: [(0, '0.544')] [2024-04-27 02:46:04,257][49750] Updated weights for policy 0, policy_version 333861 (0.0030) [2024-04-27 02:46:07,062][49517] Fps is (10 sec: 50791.3, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5470109696. Throughput: 0: 50893.0. Samples: 3222915840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:46:07,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 02:46:07,336][49750] Updated weights for policy 0, policy_version 333871 (0.0029) [2024-04-27 02:46:10,674][49750] Updated weights for policy 0, policy_version 333881 (0.0035) [2024-04-27 02:46:12,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5470371840. Throughput: 0: 51036.8. Samples: 3223222620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:46:12,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 02:46:13,673][49750] Updated weights for policy 0, policy_version 333891 (0.0031) [2024-04-27 02:46:15,095][49728] Signal inference workers to stop experience collection... (48550 times) [2024-04-27 02:46:15,099][49728] Signal inference workers to resume experience collection... (48550 times) [2024-04-27 02:46:15,125][49750] InferenceWorker_p0-w0: stopping experience collection (48550 times) [2024-04-27 02:46:15,125][49750] InferenceWorker_p0-w0: resuming experience collection (48550 times) [2024-04-27 02:46:17,062][49517] Fps is (10 sec: 52428.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5470633984. Throughput: 0: 50889.7. Samples: 3223528580. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:46:17,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 02:46:17,069][49750] Updated weights for policy 0, policy_version 333901 (0.0029) [2024-04-27 02:46:20,130][49750] Updated weights for policy 0, policy_version 333911 (0.0030) [2024-04-27 02:46:22,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.2, 300 sec: 50707.0). Total num frames: 5470863360. Throughput: 0: 50954.5. Samples: 3223681640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:46:22,063][49517] Avg episode reward: [(0, '0.662')] [2024-04-27 02:46:23,425][49750] Updated weights for policy 0, policy_version 333921 (0.0034) [2024-04-27 02:46:26,494][49750] Updated weights for policy 0, policy_version 333931 (0.0034) [2024-04-27 02:46:27,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5471141888. Throughput: 0: 50973.4. Samples: 3223980160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:46:27,063][49517] Avg episode reward: [(0, '0.563')] [2024-04-27 02:46:29,958][49750] Updated weights for policy 0, policy_version 333941 (0.0030) [2024-04-27 02:46:32,063][49517] Fps is (10 sec: 52429.2, 60 sec: 50790.2, 300 sec: 50818.2). Total num frames: 5471387648. Throughput: 0: 50889.7. Samples: 3224286120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:46:32,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 02:46:32,871][49750] Updated weights for policy 0, policy_version 333951 (0.0026) [2024-04-27 02:46:36,262][49750] Updated weights for policy 0, policy_version 333961 (0.0040) [2024-04-27 02:46:37,063][49517] Fps is (10 sec: 49151.7, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5471633408. Throughput: 0: 51032.2. Samples: 3224448400. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:46:37,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 02:46:39,267][49750] Updated weights for policy 0, policy_version 333971 (0.0027) [2024-04-27 02:46:42,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5471879168. Throughput: 0: 51100.4. Samples: 3224751560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 02:46:42,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:46:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000333977_5471879168.pth... [2024-04-27 02:46:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000333233_5459689472.pth [2024-04-27 02:46:42,746][49750] Updated weights for policy 0, policy_version 333981 (0.0028) [2024-04-27 02:46:45,754][49750] Updated weights for policy 0, policy_version 333991 (0.0032) [2024-04-27 02:46:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5472124928. Throughput: 0: 50912.5. Samples: 3225046760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:46:47,063][49517] Avg episode reward: [(0, '0.655')] [2024-04-27 02:46:49,263][49750] Updated weights for policy 0, policy_version 334001 (0.0031) [2024-04-27 02:46:52,062][49517] Fps is (10 sec: 52429.6, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5472403456. Throughput: 0: 50752.0. Samples: 3225199680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:46:52,063][49517] Avg episode reward: [(0, '0.627')] [2024-04-27 02:46:52,280][49750] Updated weights for policy 0, policy_version 334011 (0.0033) [2024-04-27 02:46:55,703][49750] Updated weights for policy 0, policy_version 334021 (0.0028) [2024-04-27 02:46:57,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5472665600. Throughput: 0: 50734.4. Samples: 3225505660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:46:57,063][49517] Avg episode reward: [(0, '0.549')] [2024-04-27 02:46:58,787][49750] Updated weights for policy 0, policy_version 334031 (0.0029) [2024-04-27 02:47:01,992][49750] Updated weights for policy 0, policy_version 334041 (0.0039) [2024-04-27 02:47:02,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5472927744. Throughput: 0: 50820.7. Samples: 3225815520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:02,063][49517] Avg episode reward: [(0, '0.652')] [2024-04-27 02:47:05,267][49750] Updated weights for policy 0, policy_version 334051 (0.0032) [2024-04-27 02:47:07,063][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5473173504. Throughput: 0: 50741.5. Samples: 3225965000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:07,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 02:47:08,592][49750] Updated weights for policy 0, policy_version 334061 (0.0026) [2024-04-27 02:47:11,634][49750] Updated weights for policy 0, policy_version 334071 (0.0030) [2024-04-27 02:47:12,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.6, 300 sec: 50818.2). Total num frames: 5473419264. Throughput: 0: 50779.3. Samples: 3226265220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:12,063][49517] Avg episode reward: [(0, '0.643')] [2024-04-27 02:47:14,319][49728] Signal inference workers to stop experience collection... (48600 times) [2024-04-27 02:47:14,371][49750] InferenceWorker_p0-w0: stopping experience collection (48600 times) [2024-04-27 02:47:14,377][49728] Signal inference workers to resume experience collection... 
(48600 times) [2024-04-27 02:47:14,386][49750] InferenceWorker_p0-w0: resuming experience collection (48600 times) [2024-04-27 02:47:15,212][49750] Updated weights for policy 0, policy_version 334081 (0.0027) [2024-04-27 02:47:17,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5473665024. Throughput: 0: 50741.8. Samples: 3226569500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:17,063][49517] Avg episode reward: [(0, '0.566')] [2024-04-27 02:47:18,156][49750] Updated weights for policy 0, policy_version 334091 (0.0031) [2024-04-27 02:47:21,523][49750] Updated weights for policy 0, policy_version 334101 (0.0029) [2024-04-27 02:47:22,063][49517] Fps is (10 sec: 50789.0, 60 sec: 51063.5, 300 sec: 50818.1). Total num frames: 5473927168. Throughput: 0: 50598.6. Samples: 3226725340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:22,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:47:24,577][49750] Updated weights for policy 0, policy_version 334111 (0.0034) [2024-04-27 02:47:27,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50244.4, 300 sec: 50707.1). Total num frames: 5474156544. Throughput: 0: 50648.2. Samples: 3227030720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:27,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 02:47:28,058][49750] Updated weights for policy 0, policy_version 334121 (0.0033) [2024-04-27 02:47:31,035][49750] Updated weights for policy 0, policy_version 334131 (0.0029) [2024-04-27 02:47:32,063][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5474418688. Throughput: 0: 50806.5. Samples: 3227333060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:32,063][49517] Avg episode reward: [(0, '0.539')] [2024-04-27 02:47:34,566][49750] Updated weights for policy 0, policy_version 334141 (0.0031) [2024-04-27 02:47:37,063][49517] Fps is (10 sec: 54066.3, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5474697216. Throughput: 0: 50872.3. Samples: 3227488940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:37,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 02:47:37,395][49750] Updated weights for policy 0, policy_version 334151 (0.0030) [2024-04-27 02:47:40,993][49750] Updated weights for policy 0, policy_version 334161 (0.0038) [2024-04-27 02:47:42,063][49517] Fps is (10 sec: 52428.3, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5474942976. Throughput: 0: 50842.3. Samples: 3227793580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:42,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-27 02:47:43,804][49750] Updated weights for policy 0, policy_version 334171 (0.0029) [2024-04-27 02:47:47,062][49517] Fps is (10 sec: 49152.8, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5475188736. Throughput: 0: 50828.2. Samples: 3228102780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:47,063][49517] Avg episode reward: [(0, '0.511')] [2024-04-27 02:47:47,380][49750] Updated weights for policy 0, policy_version 334181 (0.0035) [2024-04-27 02:47:50,386][49750] Updated weights for policy 0, policy_version 334191 (0.0028) [2024-04-27 02:47:52,063][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.1, 300 sec: 50707.1). Total num frames: 5475434496. Throughput: 0: 50747.8. Samples: 3228248660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:52,063][49517] Avg episode reward: [(0, '0.541')] [2024-04-27 02:47:53,788][49750] Updated weights for policy 0, policy_version 334201 (0.0033) [2024-04-27 02:47:56,793][49750] Updated weights for policy 0, policy_version 334211 (0.0027) [2024-04-27 02:47:57,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50818.2). Total num frames: 5475713024. Throughput: 0: 50833.1. Samples: 3228552720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:47:57,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 02:48:00,248][49750] Updated weights for policy 0, policy_version 334221 (0.0032) [2024-04-27 02:48:02,063][49517] Fps is (10 sec: 52429.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5475958784. Throughput: 0: 50790.6. Samples: 3228855080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 02:48:02,063][49517] Avg episode reward: [(0, '0.632')] [2024-04-27 02:48:03,679][49750] Updated weights for policy 0, policy_version 334231 (0.0033) [2024-04-27 02:48:06,634][49750] Updated weights for policy 0, policy_version 334241 (0.0042) [2024-04-27 02:48:07,062][49517] Fps is (10 sec: 50790.8, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5476220928. Throughput: 0: 50770.4. Samples: 3229010000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:07,063][49517] Avg episode reward: [(0, '0.512')] [2024-04-27 02:48:10,128][49750] Updated weights for policy 0, policy_version 334251 (0.0025) [2024-04-27 02:48:12,063][49517] Fps is (10 sec: 49152.4, 60 sec: 50517.2, 300 sec: 50707.1). Total num frames: 5476450304. Throughput: 0: 50941.2. Samples: 3229323080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:12,063][49517] Avg episode reward: [(0, '0.568')] [2024-04-27 02:48:12,991][49750] Updated weights for policy 0, policy_version 334261 (0.0037) [2024-04-27 02:48:16,424][49750] Updated weights for policy 0, policy_version 334271 (0.0032) [2024-04-27 02:48:17,063][49517] Fps is (10 sec: 50790.0, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5476728832. Throughput: 0: 50928.5. Samples: 3229624840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:17,063][49517] Avg episode reward: [(0, '0.518')] [2024-04-27 02:48:19,435][49750] Updated weights for policy 0, policy_version 334281 (0.0032) [2024-04-27 02:48:22,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 5476974592. Throughput: 0: 50792.6. Samples: 3229774600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:22,063][49517] Avg episode reward: [(0, '0.503')] [2024-04-27 02:48:22,855][49750] Updated weights for policy 0, policy_version 334291 (0.0034) [2024-04-27 02:48:25,828][49750] Updated weights for policy 0, policy_version 334301 (0.0034) [2024-04-27 02:48:27,062][49517] Fps is (10 sec: 50790.7, 60 sec: 51336.5, 300 sec: 50818.2). Total num frames: 5477236736. Throughput: 0: 50806.0. Samples: 3230079840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:27,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 02:48:29,330][49750] Updated weights for policy 0, policy_version 334311 (0.0027) [2024-04-27 02:48:32,062][49517] Fps is (10 sec: 50790.3, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5477482496. Throughput: 0: 50774.2. Samples: 3230387620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:32,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:48:32,426][49750] Updated weights for policy 0, policy_version 334321 (0.0034) [2024-04-27 02:48:35,831][49750] Updated weights for policy 0, policy_version 334331 (0.0033) [2024-04-27 02:48:35,855][49728] Signal inference workers to stop experience collection... (48650 times) [2024-04-27 02:48:35,855][49728] Signal inference workers to resume experience collection... (48650 times) [2024-04-27 02:48:35,881][49750] InferenceWorker_p0-w0: stopping experience collection (48650 times) [2024-04-27 02:48:35,881][49750] InferenceWorker_p0-w0: resuming experience collection (48650 times) [2024-04-27 02:48:37,063][49517] Fps is (10 sec: 49151.5, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5477728256. Throughput: 0: 50922.7. Samples: 3230540180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:37,063][49517] Avg episode reward: [(0, '0.522')] [2024-04-27 02:48:38,919][49750] Updated weights for policy 0, policy_version 334341 (0.0029) [2024-04-27 02:48:42,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5477990400. Throughput: 0: 50795.1. Samples: 3230838500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:42,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 02:48:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000334350_5477990400.pth... [2024-04-27 02:48:42,124][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000333606_5465800704.pth [2024-04-27 02:48:42,273][49750] Updated weights for policy 0, policy_version 334351 (0.0028) [2024-04-27 02:48:45,413][49750] Updated weights for policy 0, policy_version 334361 (0.0031) [2024-04-27 02:48:47,063][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.3, 300 sec: 50818.4). Total num frames: 5478252544. Throughput: 0: 50873.8. 
Samples: 3231144400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:47,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 02:48:48,625][49750] Updated weights for policy 0, policy_version 334371 (0.0031) [2024-04-27 02:48:51,837][49750] Updated weights for policy 0, policy_version 334381 (0.0030) [2024-04-27 02:48:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.7, 300 sec: 50873.7). Total num frames: 5478514688. Throughput: 0: 50863.5. Samples: 3231298860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:52,063][49517] Avg episode reward: [(0, '0.633')] [2024-04-27 02:48:55,008][49750] Updated weights for policy 0, policy_version 334391 (0.0034) [2024-04-27 02:48:57,062][49517] Fps is (10 sec: 49153.1, 60 sec: 50517.5, 300 sec: 50762.6). Total num frames: 5478744064. Throughput: 0: 50642.4. Samples: 3231601980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:48:57,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 02:48:58,223][49750] Updated weights for policy 0, policy_version 334401 (0.0030) [2024-04-27 02:49:01,494][49750] Updated weights for policy 0, policy_version 334411 (0.0027) [2024-04-27 02:49:02,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5479006208. Throughput: 0: 50745.8. Samples: 3231908400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:49:02,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 02:49:04,556][49750] Updated weights for policy 0, policy_version 334421 (0.0037) [2024-04-27 02:49:07,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 5479251968. Throughput: 0: 50882.1. Samples: 3232064300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:49:07,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 02:49:07,777][49750] Updated weights for policy 0, policy_version 334431 (0.0033) [2024-04-27 02:49:11,019][49750] Updated weights for policy 0, policy_version 334441 (0.0036) [2024-04-27 02:49:12,062][49517] Fps is (10 sec: 50790.9, 60 sec: 51063.6, 300 sec: 50762.6). Total num frames: 5479514112. Throughput: 0: 50968.1. Samples: 3232373400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:49:12,063][49517] Avg episode reward: [(0, '0.551')] [2024-04-27 02:49:14,240][49750] Updated weights for policy 0, policy_version 334451 (0.0032) [2024-04-27 02:49:17,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5479776256. Throughput: 0: 50804.9. Samples: 3232673840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 02:49:17,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 02:49:17,424][49750] Updated weights for policy 0, policy_version 334461 (0.0029) [2024-04-27 02:49:20,758][49750] Updated weights for policy 0, policy_version 334471 (0.0038) [2024-04-27 02:49:22,063][49517] Fps is (10 sec: 50789.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5480022016. Throughput: 0: 50831.1. Samples: 3232827580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:49:22,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-27 02:49:23,765][49750] Updated weights for policy 0, policy_version 334481 (0.0032) [2024-04-27 02:49:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5480284160. Throughput: 0: 51058.3. Samples: 3233136120. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:49:27,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 02:49:27,264][49750] Updated weights for policy 0, policy_version 334491 (0.0030) [2024-04-27 02:49:30,097][49750] Updated weights for policy 0, policy_version 334501 (0.0039) [2024-04-27 02:49:32,063][49517] Fps is (10 sec: 52429.0, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5480546304. Throughput: 0: 50987.2. Samples: 3233438820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:49:32,063][49517] Avg episode reward: [(0, '0.620')] [2024-04-27 02:49:33,530][49750] Updated weights for policy 0, policy_version 334511 (0.0037) [2024-04-27 02:49:36,637][49750] Updated weights for policy 0, policy_version 334521 (0.0036) [2024-04-27 02:49:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51336.7, 300 sec: 50929.3). Total num frames: 5480808448. Throughput: 0: 51149.9. Samples: 3233600600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:49:37,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 02:49:40,169][49750] Updated weights for policy 0, policy_version 334531 (0.0031) [2024-04-27 02:49:42,063][49517] Fps is (10 sec: 49151.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5481037824. Throughput: 0: 51042.0. Samples: 3233898880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:49:42,063][49517] Avg episode reward: [(0, '0.587')] [2024-04-27 02:49:43,143][49750] Updated weights for policy 0, policy_version 334541 (0.0029) [2024-04-27 02:49:46,508][49750] Updated weights for policy 0, policy_version 334551 (0.0032) [2024-04-27 02:49:47,062][49517] Fps is (10 sec: 47513.8, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5481283584. Throughput: 0: 50942.8. Samples: 3234200820. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:49:47,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 02:49:49,432][49750] Updated weights for policy 0, policy_version 334561 (0.0030) [2024-04-27 02:49:52,062][49517] Fps is (10 sec: 52429.3, 60 sec: 50790.4, 300 sec: 50929.2). Total num frames: 5481562112. Throughput: 0: 50742.7. Samples: 3234347720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:49:52,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 02:49:52,878][49750] Updated weights for policy 0, policy_version 334571 (0.0030) [2024-04-27 02:49:55,825][49750] Updated weights for policy 0, policy_version 334581 (0.0028) [2024-04-27 02:49:57,063][49517] Fps is (10 sec: 54066.1, 60 sec: 51336.4, 300 sec: 50929.2). Total num frames: 5481824256. Throughput: 0: 50895.0. Samples: 3234663680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:49:57,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 02:49:59,116][49728] Signal inference workers to stop experience collection... (48700 times) [2024-04-27 02:49:59,117][49728] Signal inference workers to resume experience collection... (48700 times) [2024-04-27 02:49:59,143][49750] InferenceWorker_p0-w0: stopping experience collection (48700 times) [2024-04-27 02:49:59,143][49750] InferenceWorker_p0-w0: resuming experience collection (48700 times) [2024-04-27 02:49:59,243][49750] Updated weights for policy 0, policy_version 334591 (0.0031) [2024-04-27 02:50:02,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5482070016. Throughput: 0: 50993.4. Samples: 3234968540. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:50:02,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 02:50:02,365][49750] Updated weights for policy 0, policy_version 334601 (0.0036) [2024-04-27 02:50:05,750][49750] Updated weights for policy 0, policy_version 334611 (0.0035) [2024-04-27 02:50:07,062][49517] Fps is (10 sec: 47514.4, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5482299392. Throughput: 0: 50776.2. Samples: 3235112500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:50:07,063][49517] Avg episode reward: [(0, '0.526')] [2024-04-27 02:50:08,936][49750] Updated weights for policy 0, policy_version 334621 (0.0034) [2024-04-27 02:50:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5482577920. Throughput: 0: 50818.6. Samples: 3235422960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:50:12,067][49517] Avg episode reward: [(0, '0.604')] [2024-04-27 02:50:12,295][49750] Updated weights for policy 0, policy_version 334631 (0.0034) [2024-04-27 02:50:15,408][49750] Updated weights for policy 0, policy_version 334641 (0.0032) [2024-04-27 02:50:17,062][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5482840064. Throughput: 0: 50938.3. Samples: 3235731040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:50:17,063][49517] Avg episode reward: [(0, '0.638')] [2024-04-27 02:50:18,662][49750] Updated weights for policy 0, policy_version 334651 (0.0034) [2024-04-27 02:50:21,843][49750] Updated weights for policy 0, policy_version 334661 (0.0034) [2024-04-27 02:50:22,063][49517] Fps is (10 sec: 50789.3, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5483085824. Throughput: 0: 50794.3. Samples: 3235886360. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:50:22,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 02:50:25,173][49750] Updated weights for policy 0, policy_version 334671 (0.0032) [2024-04-27 02:50:27,062][49517] Fps is (10 sec: 50790.4, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5483347968. Throughput: 0: 50852.1. Samples: 3236187220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:50:27,063][49517] Avg episode reward: [(0, '0.628')] [2024-04-27 02:50:28,564][49750] Updated weights for policy 0, policy_version 334681 (0.0037) [2024-04-27 02:50:31,908][49750] Updated weights for policy 0, policy_version 334691 (0.0030) [2024-04-27 02:50:32,063][49517] Fps is (10 sec: 49153.0, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5483577344. Throughput: 0: 50906.9. Samples: 3236491640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 02:50:32,071][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 02:50:34,983][49750] Updated weights for policy 0, policy_version 334701 (0.0032) [2024-04-27 02:50:37,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5483839488. Throughput: 0: 50923.1. Samples: 3236639260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:50:37,063][49517] Avg episode reward: [(0, '0.489')] [2024-04-27 02:50:38,231][49750] Updated weights for policy 0, policy_version 334711 (0.0028) [2024-04-27 02:50:41,304][49750] Updated weights for policy 0, policy_version 334721 (0.0026) [2024-04-27 02:50:42,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50762.7). Total num frames: 5484085248. Throughput: 0: 50727.7. Samples: 3236946420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:50:42,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 02:50:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000334723_5484101632.pth... 
[2024-04-27 02:50:42,119][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000333977_5471879168.pth [2024-04-27 02:50:44,516][49750] Updated weights for policy 0, policy_version 334731 (0.0035) [2024-04-27 02:50:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5484347392. Throughput: 0: 50682.5. Samples: 3237249260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:50:47,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 02:50:47,842][49750] Updated weights for policy 0, policy_version 334741 (0.0033) [2024-04-27 02:50:51,096][49750] Updated weights for policy 0, policy_version 334751 (0.0030) [2024-04-27 02:50:52,062][49517] Fps is (10 sec: 49151.7, 60 sec: 50244.2, 300 sec: 50762.6). Total num frames: 5484576768. Throughput: 0: 50768.3. Samples: 3237397080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:50:52,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-27 02:50:52,310][49728] Signal inference workers to stop experience collection... (48750 times) [2024-04-27 02:50:52,311][49728] Signal inference workers to resume experience collection... (48750 times) [2024-04-27 02:50:52,329][49750] InferenceWorker_p0-w0: stopping experience collection (48750 times) [2024-04-27 02:50:52,330][49750] InferenceWorker_p0-w0: resuming experience collection (48750 times) [2024-04-27 02:50:54,468][49750] Updated weights for policy 0, policy_version 334761 (0.0031) [2024-04-27 02:50:57,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50517.3, 300 sec: 50818.1). Total num frames: 5484855296. Throughput: 0: 50629.3. Samples: 3237701280. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:50:57,063][49517] Avg episode reward: [(0, '0.550')] [2024-04-27 02:50:57,609][49750] Updated weights for policy 0, policy_version 334771 (0.0034) [2024-04-27 02:51:01,010][49750] Updated weights for policy 0, policy_version 334781 (0.0041) [2024-04-27 02:51:02,062][49517] Fps is (10 sec: 55705.6, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5485133824. Throughput: 0: 50650.7. Samples: 3238010320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:02,063][49517] Avg episode reward: [(0, '0.605')] [2024-04-27 02:51:03,942][49750] Updated weights for policy 0, policy_version 334791 (0.0030) [2024-04-27 02:51:07,062][49517] Fps is (10 sec: 49152.9, 60 sec: 50790.4, 300 sec: 50762.7). Total num frames: 5485346816. Throughput: 0: 50735.9. Samples: 3238169460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:07,063][49517] Avg episode reward: [(0, '0.649')] [2024-04-27 02:51:07,344][49750] Updated weights for policy 0, policy_version 334801 (0.0030) [2024-04-27 02:51:10,243][49750] Updated weights for policy 0, policy_version 334811 (0.0029) [2024-04-27 02:51:12,063][49517] Fps is (10 sec: 50789.7, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5485641728. Throughput: 0: 50695.0. Samples: 3238468500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:12,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 02:51:13,750][49750] Updated weights for policy 0, policy_version 334821 (0.0034) [2024-04-27 02:51:16,814][49750] Updated weights for policy 0, policy_version 334831 (0.0028) [2024-04-27 02:51:17,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5485871104. Throughput: 0: 50728.8. Samples: 3238774440. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:17,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 02:51:20,172][49750] Updated weights for policy 0, policy_version 334841 (0.0029) [2024-04-27 02:51:22,062][49517] Fps is (10 sec: 47514.5, 60 sec: 50517.6, 300 sec: 50762.6). Total num frames: 5486116864. Throughput: 0: 50930.3. Samples: 3238931120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:22,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:51:23,479][49750] Updated weights for policy 0, policy_version 334851 (0.0031) [2024-04-27 02:51:26,651][49750] Updated weights for policy 0, policy_version 334861 (0.0031) [2024-04-27 02:51:27,063][49517] Fps is (10 sec: 50790.2, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5486379008. Throughput: 0: 50698.4. Samples: 3239227860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:27,063][49517] Avg episode reward: [(0, '0.572')] [2024-04-27 02:51:29,914][49750] Updated weights for policy 0, policy_version 334871 (0.0037) [2024-04-27 02:51:32,062][49517] Fps is (10 sec: 50789.9, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5486624768. Throughput: 0: 50636.4. Samples: 3239527900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:32,063][49517] Avg episode reward: [(0, '0.580')] [2024-04-27 02:51:32,967][49750] Updated weights for policy 0, policy_version 334881 (0.0033) [2024-04-27 02:51:36,714][49750] Updated weights for policy 0, policy_version 334891 (0.0028) [2024-04-27 02:51:37,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5486870528. Throughput: 0: 50768.8. Samples: 3239681680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:37,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 02:51:38,792][49728] Signal inference workers to stop experience collection... 
(48800 times) [2024-04-27 02:51:38,793][49728] Signal inference workers to resume experience collection... (48800 times) [2024-04-27 02:51:38,804][49750] InferenceWorker_p0-w0: stopping experience collection (48800 times) [2024-04-27 02:51:38,804][49750] InferenceWorker_p0-w0: resuming experience collection (48800 times) [2024-04-27 02:51:39,346][49750] Updated weights for policy 0, policy_version 334901 (0.0035) [2024-04-27 02:51:42,062][49517] Fps is (10 sec: 50790.9, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5487132672. Throughput: 0: 50710.8. Samples: 3239983260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:42,063][49517] Avg episode reward: [(0, '0.552')] [2024-04-27 02:51:43,133][49750] Updated weights for policy 0, policy_version 334911 (0.0028) [2024-04-27 02:51:45,960][49750] Updated weights for policy 0, policy_version 334921 (0.0031) [2024-04-27 02:51:47,062][49517] Fps is (10 sec: 54068.3, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5487411200. Throughput: 0: 50586.8. Samples: 3240286720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-04-27 02:51:47,063][49517] Avg episode reward: [(0, '0.622')] [2024-04-27 02:51:49,538][49750] Updated weights for policy 0, policy_version 334931 (0.0037) [2024-04-27 02:51:52,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51063.4, 300 sec: 50762.6). Total num frames: 5487640576. Throughput: 0: 50709.6. Samples: 3240451400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:51:52,063][49517] Avg episode reward: [(0, '0.602')] [2024-04-27 02:51:52,535][49750] Updated weights for policy 0, policy_version 334941 (0.0036) [2024-04-27 02:51:55,897][49750] Updated weights for policy 0, policy_version 334951 (0.0035) [2024-04-27 02:51:57,062][49517] Fps is (10 sec: 47513.4, 60 sec: 50517.5, 300 sec: 50707.1). Total num frames: 5487886336. Throughput: 0: 50789.5. Samples: 3240754020. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:51:57,063][49517] Avg episode reward: [(0, '0.613')] [2024-04-27 02:51:58,873][49750] Updated weights for policy 0, policy_version 334961 (0.0033) [2024-04-27 02:52:02,062][49517] Fps is (10 sec: 49152.8, 60 sec: 49971.2, 300 sec: 50707.1). Total num frames: 5488132096. Throughput: 0: 50635.7. Samples: 3241053040. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:02,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-27 02:52:02,269][49750] Updated weights for policy 0, policy_version 334971 (0.0033) [2024-04-27 02:52:05,365][49750] Updated weights for policy 0, policy_version 334981 (0.0029) [2024-04-27 02:52:07,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51063.4, 300 sec: 50818.1). Total num frames: 5488410624. Throughput: 0: 50629.7. Samples: 3241209460. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:07,063][49517] Avg episode reward: [(0, '0.596')] [2024-04-27 02:52:08,761][49750] Updated weights for policy 0, policy_version 334991 (0.0042) [2024-04-27 02:52:11,936][49750] Updated weights for policy 0, policy_version 335001 (0.0034) [2024-04-27 02:52:12,062][49517] Fps is (10 sec: 52428.9, 60 sec: 50244.5, 300 sec: 50818.2). Total num frames: 5488656384. Throughput: 0: 50702.5. Samples: 3241509460. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:12,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-27 02:52:15,194][49750] Updated weights for policy 0, policy_version 335011 (0.0028) [2024-04-27 02:52:17,062][49517] Fps is (10 sec: 49152.3, 60 sec: 50517.4, 300 sec: 50762.7). Total num frames: 5488902144. Throughput: 0: 50817.9. Samples: 3241814700. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:17,063][49517] Avg episode reward: [(0, '0.583')] [2024-04-27 02:52:18,399][49750] Updated weights for policy 0, policy_version 335021 (0.0030) [2024-04-27 02:52:21,772][49750] Updated weights for policy 0, policy_version 335031 (0.0028) [2024-04-27 02:52:22,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5489164288. Throughput: 0: 50607.3. Samples: 3241959000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:22,063][49517] Avg episode reward: [(0, '0.606')] [2024-04-27 02:52:24,686][49750] Updated weights for policy 0, policy_version 335041 (0.0040) [2024-04-27 02:52:27,062][49517] Fps is (10 sec: 50790.3, 60 sec: 50517.5, 300 sec: 50818.2). Total num frames: 5489410048. Throughput: 0: 50673.7. Samples: 3242263580. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:27,063][49517] Avg episode reward: [(0, '0.505')] [2024-04-27 02:52:28,258][49750] Updated weights for policy 0, policy_version 335051 (0.0037) [2024-04-27 02:52:31,121][49750] Updated weights for policy 0, policy_version 335061 (0.0031) [2024-04-27 02:52:32,062][49517] Fps is (10 sec: 52428.7, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5489688576. Throughput: 0: 50777.7. Samples: 3242571720. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:32,063][49517] Avg episode reward: [(0, '0.530')] [2024-04-27 02:52:34,654][49750] Updated weights for policy 0, policy_version 335071 (0.0031) [2024-04-27 02:52:36,914][49728] Signal inference workers to stop experience collection... (48850 times) [2024-04-27 02:52:36,953][49750] InferenceWorker_p0-w0: stopping experience collection (48850 times) [2024-04-27 02:52:37,015][49728] Signal inference workers to resume experience collection... 
(48850 times) [2024-04-27 02:52:37,015][49750] InferenceWorker_p0-w0: resuming experience collection (48850 times) [2024-04-27 02:52:37,062][49517] Fps is (10 sec: 52429.3, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5489934336. Throughput: 0: 50634.0. Samples: 3242729920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:37,063][49517] Avg episode reward: [(0, '0.630')] [2024-04-27 02:52:37,530][49750] Updated weights for policy 0, policy_version 335081 (0.0035) [2024-04-27 02:52:41,031][49750] Updated weights for policy 0, policy_version 335091 (0.0033) [2024-04-27 02:52:42,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50790.3, 300 sec: 50818.1). Total num frames: 5490180096. Throughput: 0: 50596.7. Samples: 3243030880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:42,063][49517] Avg episode reward: [(0, '0.528')] [2024-04-27 02:52:42,072][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000335094_5490180096.pth... [2024-04-27 02:52:42,114][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000334350_5477990400.pth [2024-04-27 02:52:43,990][49750] Updated weights for policy 0, policy_version 335101 (0.0029) [2024-04-27 02:52:47,063][49517] Fps is (10 sec: 49151.2, 60 sec: 50244.1, 300 sec: 50818.2). Total num frames: 5490425856. Throughput: 0: 50811.8. Samples: 3243339580. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:47,072][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 02:52:47,506][49750] Updated weights for policy 0, policy_version 335111 (0.0030) [2024-04-27 02:52:50,495][49750] Updated weights for policy 0, policy_version 335121 (0.0034) [2024-04-27 02:52:52,063][49517] Fps is (10 sec: 54067.0, 60 sec: 51336.5, 300 sec: 50873.7). Total num frames: 5490720768. Throughput: 0: 50779.8. Samples: 3243494560. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:52,063][49517] Avg episode reward: [(0, '0.504')] [2024-04-27 02:52:53,880][49750] Updated weights for policy 0, policy_version 335131 (0.0034) [2024-04-27 02:52:56,859][49750] Updated weights for policy 0, policy_version 335141 (0.0030) [2024-04-27 02:52:57,063][49517] Fps is (10 sec: 52428.9, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5490950144. Throughput: 0: 50947.9. Samples: 3243802120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:52:57,063][49517] Avg episode reward: [(0, '0.558')] [2024-04-27 02:53:00,160][49750] Updated weights for policy 0, policy_version 335151 (0.0033) [2024-04-27 02:53:02,062][49517] Fps is (10 sec: 44237.5, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 5491163136. Throughput: 0: 50870.6. Samples: 3244103880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 02:53:02,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-27 02:53:03,425][49750] Updated weights for policy 0, policy_version 335161 (0.0032) [2024-04-27 02:53:06,628][49750] Updated weights for policy 0, policy_version 335171 (0.0026) [2024-04-27 02:53:07,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5491458048. Throughput: 0: 50853.2. Samples: 3244247400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:07,063][49517] Avg episode reward: [(0, '0.531')] [2024-04-27 02:53:09,783][49750] Updated weights for policy 0, policy_version 335181 (0.0035) [2024-04-27 02:53:12,062][49517] Fps is (10 sec: 54067.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5491703808. Throughput: 0: 50984.5. Samples: 3244557880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:12,063][49517] Avg episode reward: [(0, '0.634')] [2024-04-27 02:53:13,065][49750] Updated weights for policy 0, policy_version 335191 (0.0028) [2024-04-27 02:53:16,100][49750] Updated weights for policy 0, policy_version 335201 (0.0036) [2024-04-27 02:53:17,062][49517] Fps is (10 sec: 50791.2, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5491965952. Throughput: 0: 50774.8. Samples: 3244856580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:17,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 02:53:19,508][49750] Updated weights for policy 0, policy_version 335211 (0.0031) [2024-04-27 02:53:22,062][49517] Fps is (10 sec: 49152.2, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5492195328. Throughput: 0: 50736.4. Samples: 3245013060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:22,063][49517] Avg episode reward: [(0, '0.554')] [2024-04-27 02:53:22,612][49750] Updated weights for policy 0, policy_version 335221 (0.0036) [2024-04-27 02:53:25,911][49750] Updated weights for policy 0, policy_version 335231 (0.0039) [2024-04-27 02:53:27,062][49517] Fps is (10 sec: 49151.3, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5492457472. Throughput: 0: 50834.4. Samples: 3245318420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:27,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 02:53:29,086][49750] Updated weights for policy 0, policy_version 335241 (0.0028) [2024-04-27 02:53:32,063][49517] Fps is (10 sec: 52427.9, 60 sec: 50517.2, 300 sec: 50818.2). Total num frames: 5492719616. Throughput: 0: 50673.3. Samples: 3245619880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:32,063][49517] Avg episode reward: [(0, '0.591')] [2024-04-27 02:53:32,504][49750] Updated weights for policy 0, policy_version 335251 (0.0031) [2024-04-27 02:53:35,449][49750] Updated weights for policy 0, policy_version 335261 (0.0028) [2024-04-27 02:53:37,062][49517] Fps is (10 sec: 54067.5, 60 sec: 51063.4, 300 sec: 50873.7). Total num frames: 5492998144. Throughput: 0: 50675.8. Samples: 3245774960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:37,063][49517] Avg episode reward: [(0, '0.590')] [2024-04-27 02:53:38,863][49750] Updated weights for policy 0, policy_version 335271 (0.0035) [2024-04-27 02:53:41,917][49750] Updated weights for policy 0, policy_version 335281 (0.0032) [2024-04-27 02:53:42,062][49517] Fps is (10 sec: 52429.8, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5493243904. Throughput: 0: 50793.0. Samples: 3246087800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:42,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 02:53:43,249][49728] Signal inference workers to stop experience collection... (48900 times) [2024-04-27 02:53:43,249][49728] Signal inference workers to resume experience collection... (48900 times) [2024-04-27 02:53:43,294][49750] InferenceWorker_p0-w0: stopping experience collection (48900 times) [2024-04-27 02:53:43,294][49750] InferenceWorker_p0-w0: resuming experience collection (48900 times) [2024-04-27 02:53:45,272][49750] Updated weights for policy 0, policy_version 335291 (0.0030) [2024-04-27 02:53:47,063][49517] Fps is (10 sec: 45874.5, 60 sec: 50517.3, 300 sec: 50651.5). Total num frames: 5493456896. Throughput: 0: 50843.4. Samples: 3246391840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:47,063][49517] Avg episode reward: [(0, '0.593')] [2024-04-27 02:53:48,423][49750] Updated weights for policy 0, policy_version 335301 (0.0030) [2024-04-27 02:53:51,645][49750] Updated weights for policy 0, policy_version 335311 (0.0034) [2024-04-27 02:53:52,063][49517] Fps is (10 sec: 49151.3, 60 sec: 50244.3, 300 sec: 50818.1). Total num frames: 5493735424. Throughput: 0: 50758.2. Samples: 3246531520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:52,063][49517] Avg episode reward: [(0, '0.562')] [2024-04-27 02:53:54,863][49750] Updated weights for policy 0, policy_version 335321 (0.0034) [2024-04-27 02:53:57,062][49517] Fps is (10 sec: 54068.2, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5493997568. Throughput: 0: 50697.4. Samples: 3246839260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:53:57,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 02:53:58,194][49750] Updated weights for policy 0, policy_version 335331 (0.0040) [2024-04-27 02:54:01,305][49750] Updated weights for policy 0, policy_version 335341 (0.0038) [2024-04-27 02:54:02,063][49517] Fps is (10 sec: 52428.4, 60 sec: 51609.5, 300 sec: 50873.7). Total num frames: 5494259712. Throughput: 0: 50857.0. Samples: 3247145160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:54:02,063][49517] Avg episode reward: [(0, '0.565')] [2024-04-27 02:54:04,529][49750] Updated weights for policy 0, policy_version 335351 (0.0031) [2024-04-27 02:54:07,062][49517] Fps is (10 sec: 49151.8, 60 sec: 50517.4, 300 sec: 50762.6). Total num frames: 5494489088. Throughput: 0: 50814.2. Samples: 3247299700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:54:07,063][49517] Avg episode reward: [(0, '0.556')] [2024-04-27 02:54:07,799][49750] Updated weights for policy 0, policy_version 335361 (0.0030) [2024-04-27 02:54:10,837][49750] Updated weights for policy 0, policy_version 335371 (0.0027) [2024-04-27 02:54:12,063][49517] Fps is (10 sec: 49152.1, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5494751232. Throughput: 0: 50811.9. Samples: 3247604960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:54:12,063][49517] Avg episode reward: [(0, '0.582')] [2024-04-27 02:54:14,257][49750] Updated weights for policy 0, policy_version 335381 (0.0031) [2024-04-27 02:54:17,063][49517] Fps is (10 sec: 54066.7, 60 sec: 51063.3, 300 sec: 50873.7). Total num frames: 5495029760. Throughput: 0: 50977.9. Samples: 3247913880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 02:54:17,063][49517] Avg episode reward: [(0, '0.493')] [2024-04-27 02:54:17,268][49750] Updated weights for policy 0, policy_version 335391 (0.0030) [2024-04-27 02:54:20,801][49750] Updated weights for policy 0, policy_version 335401 (0.0034) [2024-04-27 02:54:22,062][49517] Fps is (10 sec: 52429.9, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5495275520. Throughput: 0: 50897.8. Samples: 3248065360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:54:22,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 02:54:23,788][49750] Updated weights for policy 0, policy_version 335411 (0.0032) [2024-04-27 02:54:27,063][49517] Fps is (10 sec: 47513.2, 60 sec: 50790.3, 300 sec: 50707.1). Total num frames: 5495504896. Throughput: 0: 50823.3. Samples: 3248374860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:54:27,063][49517] Avg episode reward: [(0, '0.542')] [2024-04-27 02:54:27,227][49750] Updated weights for policy 0, policy_version 335421 (0.0029) [2024-04-27 02:54:30,345][49750] Updated weights for policy 0, policy_version 335431 (0.0031) [2024-04-27 02:54:32,062][49517] Fps is (10 sec: 49151.5, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5495767040. Throughput: 0: 50829.9. Samples: 3248679180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:54:32,063][49517] Avg episode reward: [(0, '0.611')] [2024-04-27 02:54:33,541][49750] Updated weights for policy 0, policy_version 335441 (0.0031) [2024-04-27 02:54:36,889][49750] Updated weights for policy 0, policy_version 335451 (0.0028) [2024-04-27 02:54:37,062][49517] Fps is (10 sec: 52429.9, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5496029184. Throughput: 0: 51065.9. Samples: 3248829480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:54:37,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:54:39,200][49728] Signal inference workers to stop experience collection... (48950 times) [2024-04-27 02:54:39,201][49728] Signal inference workers to resume experience collection... (48950 times) [2024-04-27 02:54:39,233][49750] InferenceWorker_p0-w0: stopping experience collection (48950 times) [2024-04-27 02:54:39,233][49750] InferenceWorker_p0-w0: resuming experience collection (48950 times) [2024-04-27 02:54:39,976][49750] Updated weights for policy 0, policy_version 335461 (0.0027) [2024-04-27 02:54:42,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50517.3, 300 sec: 50818.2). Total num frames: 5496274944. Throughput: 0: 50987.5. Samples: 3249133700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:54:42,063][49517] Avg episode reward: [(0, '0.460')] [2024-04-27 02:54:42,153][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000335467_5496291328.pth... 
[2024-04-27 02:54:42,209][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000334723_5484101632.pth [2024-04-27 02:54:43,388][49750] Updated weights for policy 0, policy_version 335471 (0.0028) [2024-04-27 02:54:46,487][49750] Updated weights for policy 0, policy_version 335481 (0.0037) [2024-04-27 02:54:47,063][49517] Fps is (10 sec: 50789.6, 60 sec: 51336.6, 300 sec: 50762.6). Total num frames: 5496537088. Throughput: 0: 50935.6. Samples: 3249437260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:54:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:54:49,619][49750] Updated weights for policy 0, policy_version 335491 (0.0028) [2024-04-27 02:54:52,062][49517] Fps is (10 sec: 50790.5, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5496782848. Throughput: 0: 50938.7. Samples: 3249591940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:54:52,063][49517] Avg episode reward: [(0, '0.678')] [2024-04-27 02:54:52,787][49750] Updated weights for policy 0, policy_version 335501 (0.0032) [2024-04-27 02:54:56,062][49750] Updated weights for policy 0, policy_version 335511 (0.0030) [2024-04-27 02:54:57,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50707.1). Total num frames: 5497028608. Throughput: 0: 50890.0. Samples: 3249895000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:54:57,063][49517] Avg episode reward: [(0, '0.538')] [2024-04-27 02:54:59,374][49750] Updated weights for policy 0, policy_version 335521 (0.0033) [2024-04-27 02:55:02,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5497307136. Throughput: 0: 50803.1. Samples: 3250200020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:55:02,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 02:55:02,495][49750] Updated weights for policy 0, policy_version 335531 (0.0027) [2024-04-27 02:55:06,119][49750] Updated weights for policy 0, policy_version 335541 (0.0036) [2024-04-27 02:55:07,062][49517] Fps is (10 sec: 54067.3, 60 sec: 51336.6, 300 sec: 50818.2). Total num frames: 5497569280. Throughput: 0: 50919.6. Samples: 3250356740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:55:07,063][49517] Avg episode reward: [(0, '0.646')] [2024-04-27 02:55:08,888][49750] Updated weights for policy 0, policy_version 335551 (0.0033) [2024-04-27 02:55:12,063][49517] Fps is (10 sec: 49152.0, 60 sec: 50790.5, 300 sec: 50707.1). Total num frames: 5497798656. Throughput: 0: 50858.8. Samples: 3250663500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:55:12,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 02:55:12,414][49750] Updated weights for policy 0, policy_version 335561 (0.0033) [2024-04-27 02:55:15,358][49750] Updated weights for policy 0, policy_version 335571 (0.0031) [2024-04-27 02:55:17,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.5, 300 sec: 50762.7). Total num frames: 5498060800. Throughput: 0: 50830.4. Samples: 3250966540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:55:17,063][49517] Avg episode reward: [(0, '0.674')] [2024-04-27 02:55:18,721][49750] Updated weights for policy 0, policy_version 335581 (0.0033) [2024-04-27 02:55:21,656][49750] Updated weights for policy 0, policy_version 335591 (0.0034) [2024-04-27 02:55:22,063][49517] Fps is (10 sec: 52428.3, 60 sec: 50790.2, 300 sec: 50762.6). Total num frames: 5498322944. Throughput: 0: 50839.3. Samples: 3251117260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:55:22,063][49517] Avg episode reward: [(0, '0.536')] [2024-04-27 02:55:25,120][49750] Updated weights for policy 0, policy_version 335601 (0.0027) [2024-04-27 02:55:27,062][49517] Fps is (10 sec: 50789.9, 60 sec: 51063.6, 300 sec: 50818.2). Total num frames: 5498568704. Throughput: 0: 50853.7. Samples: 3251422120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:55:27,063][49517] Avg episode reward: [(0, '0.529')] [2024-04-27 02:55:28,114][49750] Updated weights for policy 0, policy_version 335611 (0.0028) [2024-04-27 02:55:31,472][49750] Updated weights for policy 0, policy_version 335621 (0.0032) [2024-04-27 02:55:32,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51063.3, 300 sec: 50818.1). Total num frames: 5498830848. Throughput: 0: 50859.0. Samples: 3251725920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:55:32,063][49517] Avg episode reward: [(0, '0.588')] [2024-04-27 02:55:34,894][49750] Updated weights for policy 0, policy_version 335631 (0.0033) [2024-04-27 02:55:37,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 5499076608. Throughput: 0: 50877.1. Samples: 3251881420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 02:55:37,063][49517] Avg episode reward: [(0, '0.639')] [2024-04-27 02:55:37,879][49750] Updated weights for policy 0, policy_version 335641 (0.0030) [2024-04-27 02:55:41,218][49750] Updated weights for policy 0, policy_version 335651 (0.0030) [2024-04-27 02:55:42,062][49517] Fps is (10 sec: 49152.8, 60 sec: 50790.4, 300 sec: 50762.6). Total num frames: 5499322368. Throughput: 0: 50848.3. Samples: 3252183180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:55:42,063][49517] Avg episode reward: [(0, '0.623')] [2024-04-27 02:55:44,200][49750] Updated weights for policy 0, policy_version 335661 (0.0036) [2024-04-27 02:55:47,063][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.5, 300 sec: 50929.2). Total num frames: 5499600896. Throughput: 0: 51031.5. Samples: 3252496440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:55:47,063][49517] Avg episode reward: [(0, '0.578')] [2024-04-27 02:55:47,480][49750] Updated weights for policy 0, policy_version 335671 (0.0031) [2024-04-27 02:55:49,205][49728] Signal inference workers to stop experience collection... (49000 times) [2024-04-27 02:55:49,257][49750] InferenceWorker_p0-w0: stopping experience collection (49000 times) [2024-04-27 02:55:49,273][49728] Signal inference workers to resume experience collection... (49000 times) [2024-04-27 02:55:49,281][49750] InferenceWorker_p0-w0: resuming experience collection (49000 times) [2024-04-27 02:55:50,667][49750] Updated weights for policy 0, policy_version 335681 (0.0028) [2024-04-27 02:55:52,062][49517] Fps is (10 sec: 52428.8, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5499846656. Throughput: 0: 50905.2. Samples: 3252647480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:55:52,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 02:55:53,826][49750] Updated weights for policy 0, policy_version 335691 (0.0028) [2024-04-27 02:55:57,063][49517] Fps is (10 sec: 50790.2, 60 sec: 51336.4, 300 sec: 50762.6). Total num frames: 5500108800. Throughput: 0: 50975.5. Samples: 3252957400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:55:57,063][49517] Avg episode reward: [(0, '0.600')] [2024-04-27 02:55:57,346][49750] Updated weights for policy 0, policy_version 335701 (0.0032) [2024-04-27 02:56:00,352][49750] Updated weights for policy 0, policy_version 335711 (0.0034) [2024-04-27 02:56:02,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50517.4, 300 sec: 50818.2). Total num frames: 5500338176. Throughput: 0: 51002.6. Samples: 3253261660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:02,063][49517] Avg episode reward: [(0, '0.574')] [2024-04-27 02:56:03,711][49750] Updated weights for policy 0, policy_version 335721 (0.0028) [2024-04-27 02:56:06,674][49750] Updated weights for policy 0, policy_version 335731 (0.0027) [2024-04-27 02:56:07,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5500616704. Throughput: 0: 51015.7. Samples: 3253412960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:07,063][49517] Avg episode reward: [(0, '0.618')] [2024-04-27 02:56:10,412][49750] Updated weights for policy 0, policy_version 335741 (0.0034) [2024-04-27 02:56:12,063][49517] Fps is (10 sec: 52428.1, 60 sec: 51063.5, 300 sec: 50818.2). Total num frames: 5500862464. Throughput: 0: 51001.7. Samples: 3253717200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:12,063][49517] Avg episode reward: [(0, '0.570')] [2024-04-27 02:56:13,234][49750] Updated weights for policy 0, policy_version 335751 (0.0030) [2024-04-27 02:56:16,957][49750] Updated weights for policy 0, policy_version 335761 (0.0033) [2024-04-27 02:56:17,063][49517] Fps is (10 sec: 49151.8, 60 sec: 50790.2, 300 sec: 50818.1). Total num frames: 5501108224. Throughput: 0: 50959.2. Samples: 3254019080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:17,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 02:56:19,860][49750] Updated weights for policy 0, policy_version 335771 (0.0030) [2024-04-27 02:56:22,063][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5501370368. Throughput: 0: 50890.7. Samples: 3254171500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:22,063][49517] Avg episode reward: [(0, '0.625')] [2024-04-27 02:56:23,521][49750] Updated weights for policy 0, policy_version 335781 (0.0039) [2024-04-27 02:56:26,241][49750] Updated weights for policy 0, policy_version 335791 (0.0031) [2024-04-27 02:56:27,062][49517] Fps is (10 sec: 52430.2, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5501632512. Throughput: 0: 50949.1. Samples: 3254475880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:27,063][49517] Avg episode reward: [(0, '0.640')] [2024-04-27 02:56:29,800][49750] Updated weights for policy 0, policy_version 335801 (0.0028) [2024-04-27 02:56:32,062][49517] Fps is (10 sec: 52429.2, 60 sec: 51063.6, 300 sec: 50929.3). Total num frames: 5501894656. Throughput: 0: 50876.9. Samples: 3254785900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:32,063][49517] Avg episode reward: [(0, '0.597')] [2024-04-27 02:56:32,553][49750] Updated weights for policy 0, policy_version 335811 (0.0031) [2024-04-27 02:56:36,029][49750] Updated weights for policy 0, policy_version 335821 (0.0029) [2024-04-27 02:56:37,062][49517] Fps is (10 sec: 50789.8, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5502140416. Throughput: 0: 50963.2. Samples: 3254940820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:37,063][49517] Avg episode reward: [(0, '0.601')] [2024-04-27 02:56:38,933][49750] Updated weights for policy 0, policy_version 335831 (0.0030) [2024-04-27 02:56:42,062][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.5, 300 sec: 50762.6). Total num frames: 5502386176. Throughput: 0: 50898.3. Samples: 3255247820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:42,063][49517] Avg episode reward: [(0, '0.497')] [2024-04-27 02:56:42,071][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000335839_5502386176.pth... [2024-04-27 02:56:42,122][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000335094_5490180096.pth [2024-04-27 02:56:42,490][49750] Updated weights for policy 0, policy_version 335841 (0.0028) [2024-04-27 02:56:45,496][49750] Updated weights for policy 0, policy_version 335851 (0.0039) [2024-04-27 02:56:47,062][49517] Fps is (10 sec: 50790.1, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5502648320. Throughput: 0: 50877.7. Samples: 3255551160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:47,063][49517] Avg episode reward: [(0, '0.579')] [2024-04-27 02:56:48,921][49750] Updated weights for policy 0, policy_version 335861 (0.0031) [2024-04-27 02:56:51,915][49750] Updated weights for policy 0, policy_version 335871 (0.0030) [2024-04-27 02:56:52,063][49517] Fps is (10 sec: 52428.5, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5502910464. Throughput: 0: 50779.5. Samples: 3255698040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 02:56:52,063][49517] Avg episode reward: [(0, '0.547')] [2024-04-27 02:56:55,350][49750] Updated weights for policy 0, policy_version 335881 (0.0033) [2024-04-27 02:56:57,062][49517] Fps is (10 sec: 50790.7, 60 sec: 50790.5, 300 sec: 50929.2). Total num frames: 5503156224. Throughput: 0: 50870.4. Samples: 3256006360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:56:57,063][49517] Avg episode reward: [(0, '0.564')] [2024-04-27 02:56:58,244][49750] Updated weights for policy 0, policy_version 335891 (0.0040) [2024-04-27 02:57:01,883][49750] Updated weights for policy 0, policy_version 335901 (0.0029) [2024-04-27 02:57:02,063][49517] Fps is (10 sec: 49152.1, 60 sec: 51063.4, 300 sec: 50818.2). Total num frames: 5503401984. Throughput: 0: 51016.9. Samples: 3256314840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:02,063][49517] Avg episode reward: [(0, '0.610')] [2024-04-27 02:57:04,753][49750] Updated weights for policy 0, policy_version 335911 (0.0033) [2024-04-27 02:57:07,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5503664128. Throughput: 0: 50874.3. Samples: 3256460840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:07,063][49517] Avg episode reward: [(0, '0.577')] [2024-04-27 02:57:08,412][49750] Updated weights for policy 0, policy_version 335921 (0.0028) [2024-04-27 02:57:11,270][49750] Updated weights for policy 0, policy_version 335931 (0.0033) [2024-04-27 02:57:12,063][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5503909888. Throughput: 0: 50893.4. Samples: 3256766100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:12,063][49517] Avg episode reward: [(0, '0.560')] [2024-04-27 02:57:14,853][49750] Updated weights for policy 0, policy_version 335941 (0.0034) [2024-04-27 02:57:17,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5504172032. Throughput: 0: 50762.3. Samples: 3257070200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:17,063][49517] Avg episode reward: [(0, '0.607')] [2024-04-27 02:57:17,602][49750] Updated weights for policy 0, policy_version 335951 (0.0033) [2024-04-27 02:57:21,241][49750] Updated weights for policy 0, policy_version 335961 (0.0033) [2024-04-27 02:57:22,062][49517] Fps is (10 sec: 50791.6, 60 sec: 50790.6, 300 sec: 50873.7). Total num frames: 5504417792. Throughput: 0: 50682.2. Samples: 3257221520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:22,063][49517] Avg episode reward: [(0, '0.581')] [2024-04-27 02:57:24,012][49750] Updated weights for policy 0, policy_version 335971 (0.0028) [2024-04-27 02:57:27,062][49517] Fps is (10 sec: 49152.1, 60 sec: 50517.3, 300 sec: 50762.6). Total num frames: 5504663552. Throughput: 0: 50703.2. Samples: 3257529460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:27,063][49517] Avg episode reward: [(0, '0.589')] [2024-04-27 02:57:27,661][49750] Updated weights for policy 0, policy_version 335981 (0.0037) [2024-04-27 02:57:30,429][49728] Signal inference workers to stop experience collection... (49050 times) [2024-04-27 02:57:30,429][49728] Signal inference workers to resume experience collection... (49050 times) [2024-04-27 02:57:30,454][49750] InferenceWorker_p0-w0: stopping experience collection (49050 times) [2024-04-27 02:57:30,454][49750] InferenceWorker_p0-w0: resuming experience collection (49050 times) [2024-04-27 02:57:30,554][49750] Updated weights for policy 0, policy_version 335991 (0.0034) [2024-04-27 02:57:32,062][49517] Fps is (10 sec: 52428.5, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5504942080. Throughput: 0: 50647.6. Samples: 3257830300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:32,063][49517] Avg episode reward: [(0, '0.617')] [2024-04-27 02:57:34,077][49750] Updated weights for policy 0, policy_version 336001 (0.0028) [2024-04-27 02:57:37,063][49517] Fps is (10 sec: 52428.1, 60 sec: 50790.3, 300 sec: 50873.7). Total num frames: 5505187840. Throughput: 0: 50942.2. Samples: 3257990440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:37,063][49517] Avg episode reward: [(0, '0.598')] [2024-04-27 02:57:37,183][49750] Updated weights for policy 0, policy_version 336011 (0.0030) [2024-04-27 02:57:40,498][49750] Updated weights for policy 0, policy_version 336021 (0.0028) [2024-04-27 02:57:42,062][49517] Fps is (10 sec: 50790.6, 60 sec: 51063.5, 300 sec: 50929.3). Total num frames: 5505449984. Throughput: 0: 50866.2. Samples: 3258295340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:42,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 02:57:43,597][49750] Updated weights for policy 0, policy_version 336031 (0.0028) [2024-04-27 02:57:46,878][49750] Updated weights for policy 0, policy_version 336041 (0.0029) [2024-04-27 02:57:47,063][49517] Fps is (10 sec: 50790.3, 60 sec: 50790.3, 300 sec: 50762.6). Total num frames: 5505695744. Throughput: 0: 50866.6. Samples: 3258603840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:47,063][49517] Avg episode reward: [(0, '0.624')] [2024-04-27 02:57:50,044][49750] Updated weights for policy 0, policy_version 336051 (0.0037) [2024-04-27 02:57:52,062][49517] Fps is (10 sec: 50790.2, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5505957888. Throughput: 0: 50941.8. Samples: 3258753220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:52,063][49517] Avg episode reward: [(0, '0.614')] [2024-04-27 02:57:53,258][49750] Updated weights for policy 0, policy_version 336061 (0.0027) [2024-04-27 02:57:56,363][49750] Updated weights for policy 0, policy_version 336071 (0.0029) [2024-04-27 02:57:57,062][49517] Fps is (10 sec: 50790.6, 60 sec: 50790.3, 300 sec: 50984.8). Total num frames: 5506203648. Throughput: 0: 50997.9. Samples: 3259061000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:57:57,063][49517] Avg episode reward: [(0, '0.612')] [2024-04-27 02:57:59,744][49750] Updated weights for policy 0, policy_version 336081 (0.0033) [2024-04-27 02:58:02,062][49517] Fps is (10 sec: 50791.0, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5506465792. Throughput: 0: 50866.7. Samples: 3259359200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:58:02,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 02:58:02,761][49750] Updated weights for policy 0, policy_version 336091 (0.0029) [2024-04-27 02:58:06,204][49750] Updated weights for policy 0, policy_version 336101 (0.0037) [2024-04-27 02:58:07,062][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.5, 300 sec: 50873.7). Total num frames: 5506711552. Throughput: 0: 51036.9. Samples: 3259518180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 02:58:07,063][49517] Avg episode reward: [(0, '0.569')] [2024-04-27 02:58:09,142][49750] Updated weights for policy 0, policy_version 336111 (0.0025) [2024-04-27 02:58:12,063][49517] Fps is (10 sec: 50789.5, 60 sec: 51063.6, 300 sec: 50873.7). Total num frames: 5506973696. Throughput: 0: 50980.3. Samples: 3259823580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:12,063][49517] Avg episode reward: [(0, '0.506')] [2024-04-27 02:58:12,587][49750] Updated weights for policy 0, policy_version 336121 (0.0030) [2024-04-27 02:58:15,533][49750] Updated weights for policy 0, policy_version 336131 (0.0029) [2024-04-27 02:58:17,062][49517] Fps is (10 sec: 50789.8, 60 sec: 50790.3, 300 sec: 50929.2). Total num frames: 5507219456. Throughput: 0: 50965.7. Samples: 3260123760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:17,063][49517] Avg episode reward: [(0, '0.585')] [2024-04-27 02:58:19,022][49750] Updated weights for policy 0, policy_version 336141 (0.0041) [2024-04-27 02:58:21,937][49750] Updated weights for policy 0, policy_version 336151 (0.0029) [2024-04-27 02:58:22,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.4, 300 sec: 50984.8). Total num frames: 5507497984. Throughput: 0: 50927.5. Samples: 3260282180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:22,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 02:58:25,519][49750] Updated weights for policy 0, policy_version 336161 (0.0030) [2024-04-27 02:58:27,063][49517] Fps is (10 sec: 52428.6, 60 sec: 51336.4, 300 sec: 50929.3). Total num frames: 5507743744. Throughput: 0: 50851.0. Samples: 3260583640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:27,063][49517] Avg episode reward: [(0, '0.533')] [2024-04-27 02:58:28,446][49750] Updated weights for policy 0, policy_version 336171 (0.0030) [2024-04-27 02:58:32,020][49750] Updated weights for policy 0, policy_version 336181 (0.0029) [2024-04-27 02:58:32,062][49517] Fps is (10 sec: 49152.5, 60 sec: 50790.4, 300 sec: 50818.2). Total num frames: 5507989504. Throughput: 0: 50762.7. Samples: 3260888160. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:32,063][49517] Avg episode reward: [(0, '0.553')] [2024-04-27 02:58:34,883][49750] Updated weights for policy 0, policy_version 336191 (0.0033) [2024-04-27 02:58:37,062][49517] Fps is (10 sec: 49152.6, 60 sec: 50790.5, 300 sec: 50818.2). Total num frames: 5508235264. Throughput: 0: 50767.6. Samples: 3261037760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:37,063][49517] Avg episode reward: [(0, '0.586')] [2024-04-27 02:58:38,301][49750] Updated weights for policy 0, policy_version 336201 (0.0029) [2024-04-27 02:58:41,233][49750] Updated weights for policy 0, policy_version 336211 (0.0037) [2024-04-27 02:58:42,063][49517] Fps is (10 sec: 50789.6, 60 sec: 50790.2, 300 sec: 50984.8). Total num frames: 5508497408. Throughput: 0: 50830.1. Samples: 3261348360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:42,063][49517] Avg episode reward: [(0, '0.657')] [2024-04-27 02:58:42,074][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336212_5508497408.pth... [2024-04-27 02:58:42,145][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000335467_5496291328.pth [2024-04-27 02:58:43,874][49728] Signal inference workers to stop experience collection... (49100 times) [2024-04-27 02:58:43,895][49750] InferenceWorker_p0-w0: stopping experience collection (49100 times) [2024-04-27 02:58:43,943][49728] Signal inference workers to resume experience collection... (49100 times) [2024-04-27 02:58:43,943][49750] InferenceWorker_p0-w0: resuming experience collection (49100 times) [2024-04-27 02:58:44,658][49750] Updated weights for policy 0, policy_version 336221 (0.0028) [2024-04-27 02:58:47,063][49517] Fps is (10 sec: 52427.5, 60 sec: 51063.4, 300 sec: 50929.2). Total num frames: 5508759552. Throughput: 0: 50921.4. Samples: 3261650680. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:47,063][49517] Avg episode reward: [(0, '0.567')] [2024-04-27 02:58:47,718][49750] Updated weights for policy 0, policy_version 336231 (0.0029) [2024-04-27 02:58:51,121][49750] Updated weights for policy 0, policy_version 336241 (0.0037) [2024-04-27 02:58:52,063][49517] Fps is (10 sec: 50791.0, 60 sec: 50790.4, 300 sec: 50873.7). Total num frames: 5509005312. Throughput: 0: 50917.2. Samples: 3261809460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:52,063][49517] Avg episode reward: [(0, '0.594')] [2024-04-27 02:58:54,194][49750] Updated weights for policy 0, policy_version 336251 (0.0028) [2024-04-27 02:58:57,062][49517] Fps is (10 sec: 50791.6, 60 sec: 51063.5, 300 sec: 50873.7). Total num frames: 5509267456. Throughput: 0: 50872.6. Samples: 3262112840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:58:57,063][49517] Avg episode reward: [(0, '0.561')] [2024-04-27 02:58:57,805][49750] Updated weights for policy 0, policy_version 336261 (0.0031) [2024-04-27 02:59:00,599][49750] Updated weights for policy 0, policy_version 336271 (0.0029) [2024-04-27 02:59:02,062][49517] Fps is (10 sec: 49152.7, 60 sec: 50517.3, 300 sec: 50873.7). Total num frames: 5509496832. Throughput: 0: 50963.7. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:02,063][49517] Avg episode reward: [(0, '0.576')] [2024-04-27 02:59:07,063][49517] Fps is (10 sec: 31127.9, 60 sec: 47786.2, 300 sec: 50262.7). Total num frames: 5509578752. Throughput: 0: 47442.7. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:07,064][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:12,063][49517] Fps is (10 sec: 8191.7, 60 sec: 43417.5, 300 sec: 49318.6). Total num frames: 5509578752. Throughput: 0: 40743.8. Samples: 3262417120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:12,064][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:17,063][49517] Fps is (10 sec: 0.0, 60 sec: 39321.5, 300 sec: 48485.5). Total num frames: 5509578752. Throughput: 0: 33976.7. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:22,063][49517] Fps is (10 sec: 0.0, 60 sec: 34679.4, 300 sec: 47708.0). Total num frames: 5509578752. Throughput: 0: 30652.2. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:27,063][49517] Fps is (10 sec: 0.0, 60 sec: 30583.4, 300 sec: 46819.3). Total num frames: 5509578752. Throughput: 0: 23750.2. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:32,063][49517] Fps is (10 sec: 0.0, 60 sec: 26487.4, 300 sec: 45930.7). Total num frames: 5509578752. Throughput: 0: 17032.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:37,063][49517] Fps is (10 sec: 0.0, 60 sec: 22391.4, 300 sec: 45097.6). Total num frames: 5509578752. Throughput: 0: 13503.5. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:42,063][49517] Fps is (10 sec: 0.0, 60 sec: 18022.4, 300 sec: 44209.0). Total num frames: 5509578752. Throughput: 0: 6761.7. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:42,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:47,063][49517] Fps is (10 sec: 0.0, 60 sec: 13653.3, 300 sec: 43375.9). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:52,063][49517] Fps is (10 sec: 0.0, 60 sec: 9557.3, 300 sec: 42542.8). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 02:59:57,063][49517] Fps is (10 sec: 0.0, 60 sec: 5188.2, 300 sec: 41598.7). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 02:59:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:02,063][49517] Fps is (10 sec: 0.0, 60 sec: 1365.3, 300 sec: 40710.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:07,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 39932.5). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:12,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 39043.9). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:12,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:17,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 38155.3). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:22,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 37322.2). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:22,064][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:27,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 36433.6). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:32,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 35600.5). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:37,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 34767.4). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:42,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 33823.2). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:42,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:42,094][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336278_5509578752.pth... [2024-04-27 03:00:42,181][49728] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000335839_5502386176.pth [2024-04-27 03:00:47,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 32990.1). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:52,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 32101.5). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:00:57,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 31324.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:00:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:02,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 30379.8). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:07,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 29546.7). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:12,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 28713.6). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:12,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:17,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 27825.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:22,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 26936.4). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:27,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 26047.8). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:32,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 25214.7). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:37,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 24381.6). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:42,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 23493.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:42,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:47,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 22604.4). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:52,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 21771.3). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:01:57,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 20938.2). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:01:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:02,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 20049.6). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:07,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 19216.5). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:12,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 18327.8). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:12,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:17,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 17494.8). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:22,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 16661.7). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:27,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 15717.5). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:32,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 14884.4). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:37,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 13995.8). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:42,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 13162.7). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:42,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:42,093][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336278_5509578752.pth... [2024-04-27 03:02:47,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 12274.1). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:52,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 11441.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:02:57,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 10552.4). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:02:57,064][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:02,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 9719.3). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:07,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 8830.7). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:12,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 7997.6). Total num frames: 5509578752. 
Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:12,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:17,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 7053.4). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:22,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 6220.4). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:27,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 5387.3). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:32,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 4554.2). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:37,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 3665.6). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:42,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 2776.9). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:42,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:47,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 1943.9). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:47,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:52,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 1055.2). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:52,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:03:57,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 277.7). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:03:57,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:02,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:04:02,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:07,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:04:07,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:12,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:04:12,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:17,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:04:17,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:22,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:04:22,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:27,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:04:27,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:32,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:04:32,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:37,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:04:37,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:42,063][49517] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 5509578752. Throughput: 0: 0.0. Samples: 3262417120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 03:04:42,063][49517] Avg episode reward: [(0, '0.592')] [2024-04-27 03:04:42,094][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336278_5509578752.pth... [2024-04-27 03:04:42,095][49517] No heartbeat for components: InferenceWorker_p0-w0 (357 seconds) [2024-04-27 03:04:42,095][49517] Stopping training due to lack of heartbeats from [2024-04-27 03:04:42,097][49758] Stopping RolloutWorker_w9... [2024-04-27 03:04:42,097][49754] Stopping RolloutWorker_w5... [2024-04-27 03:04:42,097][49749] Stopping RolloutWorker_w1... [2024-04-27 03:04:42,097][49752] Stopping RolloutWorker_w3... [2024-04-27 03:04:42,097][49751] Stopping RolloutWorker_w2... [2024-04-27 03:04:42,097][49778] Stopping RolloutWorker_w30... [2024-04-27 03:04:42,097][49763] Stopping RolloutWorker_w15... 
[2024-04-27 03:04:42,097][49757] Stopping RolloutWorker_w8... [2024-04-27 03:04:42,097][49748] Stopping RolloutWorker_w0... [2024-04-27 03:04:42,097][49776] Stopping RolloutWorker_w27... [2024-04-27 03:04:42,097][49775] Stopping RolloutWorker_w25... [2024-04-27 03:04:42,097][49761] Stopping RolloutWorker_w13... [2024-04-27 03:04:42,097][49760] Stopping RolloutWorker_w11... [2024-04-27 03:04:42,098][49758] Loop rollout_proc9_evt_loop terminating... [2024-04-27 03:04:42,098][49754] Loop rollout_proc5_evt_loop terminating... [2024-04-27 03:04:42,097][49759] Stopping RolloutWorker_w10... [2024-04-27 03:04:42,097][49765] Stopping RolloutWorker_w17... [2024-04-27 03:04:42,097][49777] Stopping RolloutWorker_w28... [2024-04-27 03:04:42,098][49749] Loop rollout_proc1_evt_loop terminating... [2024-04-27 03:04:42,098][49751] Loop rollout_proc2_evt_loop terminating... [2024-04-27 03:04:42,097][49769] Stopping RolloutWorker_w20... [2024-04-27 03:04:42,098][49752] Loop rollout_proc3_evt_loop terminating... [2024-04-27 03:04:42,098][49763] Loop rollout_proc15_evt_loop terminating... [2024-04-27 03:04:42,097][49770] Stopping RolloutWorker_w21... [2024-04-27 03:04:42,098][49757] Loop rollout_proc8_evt_loop terminating... [2024-04-27 03:04:42,098][49748] Loop rollout_proc0_evt_loop terminating... [2024-04-27 03:04:42,097][49768] Stopping RolloutWorker_w19... [2024-04-27 03:04:42,097][49767] Stopping RolloutWorker_w18... [2024-04-27 03:04:42,097][49764] Stopping RolloutWorker_w14... [2024-04-27 03:04:42,098][49778] Loop rollout_proc30_evt_loop terminating... [2024-04-27 03:04:42,098][49776] Loop rollout_proc27_evt_loop terminating... [2024-04-27 03:04:42,098][49779] Stopping RolloutWorker_w31... [2024-04-27 03:04:42,097][49756] Stopping RolloutWorker_w7... [2024-04-27 03:04:42,098][49761] Loop rollout_proc13_evt_loop terminating... [2024-04-27 03:04:42,098][49760] Loop rollout_proc11_evt_loop terminating... [2024-04-27 03:04:42,097][49766] Stopping RolloutWorker_w16... 
[2024-04-27 03:04:42,098][49759] Loop rollout_proc10_evt_loop terminating... [2024-04-27 03:04:42,098][49775] Loop rollout_proc25_evt_loop terminating... [2024-04-27 03:04:42,097][49771] Stopping RolloutWorker_w23... [2024-04-27 03:04:42,097][49762] Stopping RolloutWorker_w12... [2024-04-27 03:04:42,098][49765] Loop rollout_proc17_evt_loop terminating... [2024-04-27 03:04:42,098][49777] Loop rollout_proc28_evt_loop terminating... [2024-04-27 03:04:42,098][49770] Loop rollout_proc21_evt_loop terminating... [2024-04-27 03:04:42,098][49764] Loop rollout_proc14_evt_loop terminating... [2024-04-27 03:04:42,098][49768] Loop rollout_proc19_evt_loop terminating... [2024-04-27 03:04:42,098][49767] Loop rollout_proc18_evt_loop terminating... [2024-04-27 03:04:42,098][49769] Loop rollout_proc20_evt_loop terminating... [2024-04-27 03:04:42,099][49756] Loop rollout_proc7_evt_loop terminating... [2024-04-27 03:04:42,097][49774] Stopping RolloutWorker_w26... [2024-04-27 03:04:42,099][49762] Loop rollout_proc12_evt_loop terminating... [2024-04-27 03:04:42,097][49780] Stopping RolloutWorker_w29... [2024-04-27 03:04:42,099][49766] Loop rollout_proc16_evt_loop terminating... [2024-04-27 03:04:42,097][49773] Stopping RolloutWorker_w22... [2024-04-27 03:04:42,098][49517] Component InferenceWorker_p0-w0 process died already! Don't wait for it. [2024-04-27 03:04:42,100][49774] Loop rollout_proc26_evt_loop terminating... [2024-04-27 03:04:42,099][49755] Stopping RolloutWorker_w6... [2024-04-27 03:04:42,099][49779] Loop rollout_proc31_evt_loop terminating... [2024-04-27 03:04:42,099][49771] Loop rollout_proc23_evt_loop terminating... [2024-04-27 03:04:42,100][49780] Loop rollout_proc29_evt_loop terminating... [2024-04-27 03:04:42,101][49755] Loop rollout_proc6_evt_loop terminating... [2024-04-27 03:04:42,100][49517] Component RolloutWorker_w3 stopped! [2024-04-27 03:04:42,097][49772] Stopping RolloutWorker_w24... 
[2024-04-27 03:04:42,105][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w1', 'RolloutWorker_w2', 'RolloutWorker_w4', 'RolloutWorker_w5', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,105][49772] Loop rollout_proc24_evt_loop terminating... [2024-04-27 03:04:42,105][49517] Component RolloutWorker_w5 stopped! [2024-04-27 03:04:42,105][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w1', 'RolloutWorker_w2', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,106][49517] Component RolloutWorker_w1 stopped! 
[2024-04-27 03:04:42,106][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w2', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,106][49517] Component RolloutWorker_w9 stopped! [2024-04-27 03:04:42,106][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w2', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,106][49517] Component RolloutWorker_w30 stopped! 
[2024-04-27 03:04:42,106][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w2', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,106][49517] Component RolloutWorker_w19 stopped! [2024-04-27 03:04:42,106][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w2', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,107][49517] Component RolloutWorker_w10 stopped! 
[2024-04-27 03:04:42,107][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w2', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,107][49517] Component RolloutWorker_w11 stopped! [2024-04-27 03:04:42,107][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w2', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,107][49517] Component RolloutWorker_w2 stopped! [2024-04-27 03:04:42,107][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,108][49517] Component RolloutWorker_w27 stopped! 
[2024-04-27 03:04:42,108][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,108][49517] Component RolloutWorker_w25 stopped! [2024-04-27 03:04:42,108][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,108][49517] Component RolloutWorker_w15 stopped! [2024-04-27 03:04:42,109][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,109][49517] Component RolloutWorker_w17 stopped! [2024-04-27 03:04:42,100][49773] Loop rollout_proc22_evt_loop terminating... [2024-04-27 03:04:42,103][49753] Stopping RolloutWorker_w4... [2024-04-27 03:04:42,104][49728] Stopping Batcher_0... 
[2024-04-27 03:04:42,109][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w16', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,110][49517] Component RolloutWorker_w16 stopped! [2024-04-27 03:04:42,110][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,110][49517] Component RolloutWorker_w7 stopped! [2024-04-27 03:04:42,110][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w8', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,110][49517] Component RolloutWorker_w8 stopped! 
[2024-04-27 03:04:42,111][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,111][49517] Component RolloutWorker_w14 stopped! [2024-04-27 03:04:42,111][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,111][49517] Component RolloutWorker_w23 stopped! [2024-04-27 03:04:42,111][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,112][49517] Component RolloutWorker_w0 stopped! [2024-04-27 03:04:42,112][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,112][49517] Component RolloutWorker_w29 stopped! 
[2024-04-27 03:04:42,115][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,115][49517] Component RolloutWorker_w13 stopped! [2024-04-27 03:04:42,115][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w12', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w24', 'RolloutWorker_w26', 'RolloutWorker_w28', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,116][49517] Component RolloutWorker_w26 stopped! [2024-04-27 03:04:42,116][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w12', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w24', 'RolloutWorker_w28', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,116][49517] Component RolloutWorker_w12 stopped! [2024-04-27 03:04:42,116][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w24', 'RolloutWorker_w28', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,116][49517] Component RolloutWorker_w22 stopped! [2024-04-27 03:04:42,116][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w24', 'RolloutWorker_w28', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,116][49517] Component RolloutWorker_w28 stopped! 
[2024-04-27 03:04:42,117][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w24', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,117][49517] Component RolloutWorker_w24 stopped! [2024-04-27 03:04:42,117][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,117][49517] Component RolloutWorker_w20 stopped! [2024-04-27 03:04:42,117][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w18', 'RolloutWorker_w21', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,117][49517] Component RolloutWorker_w18 stopped! [2024-04-27 03:04:42,117][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w21', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,118][49517] Component RolloutWorker_w21 stopped! [2024-04-27 03:04:42,118][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6', 'RolloutWorker_w31'] to stop... [2024-04-27 03:04:42,118][49517] Component RolloutWorker_w31 stopped! [2024-04-27 03:04:42,118][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4', 'RolloutWorker_w6'] to stop... [2024-04-27 03:04:42,118][49517] Component RolloutWorker_w6 stopped! [2024-04-27 03:04:42,118][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0', 'RolloutWorker_w4'] to stop... [2024-04-27 03:04:42,118][49517] Component RolloutWorker_w4 stopped! [2024-04-27 03:04:42,119][49517] Waiting for ['Batcher_0', 'LearnerWorker_p0'] to stop... [2024-04-27 03:04:42,119][49517] Component Batcher_0 stopped! [2024-04-27 03:04:42,119][49517] Waiting for ['LearnerWorker_p0'] to stop... [2024-04-27 03:04:42,163][49753] Loop rollout_proc4_evt_loop terminating... 
[2024-04-27 03:04:42,184][49728] Loop batcher_evt_loop terminating... [2024-04-27 03:04:42,296][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336278_5509578752.pth... [2024-04-27 03:04:42,388][49728] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336278_5509578752.pth... [2024-04-27 03:04:42,492][49728] Stopping LearnerWorker_p0... [2024-04-27 03:04:42,493][49728] Loop learner_proc0_evt_loop terminating... [2024-04-27 03:04:42,493][49517] Component LearnerWorker_p0 stopped! [2024-04-27 03:04:42,494][49517] Waiting for process learner_proc0 to stop... [2024-04-27 03:04:44,432][49517] Waiting for process inference_proc0-0 to join... [2024-04-27 03:04:44,432][49517] Waiting for process rollout_proc0 to join... [2024-04-27 03:04:44,432][49517] Waiting for process rollout_proc1 to join... [2024-04-27 03:04:44,432][49517] Waiting for process rollout_proc2 to join... [2024-04-27 03:04:44,433][49517] Waiting for process rollout_proc3 to join... [2024-04-27 03:04:44,433][49517] Waiting for process rollout_proc4 to join... [2024-04-27 03:04:44,589][49517] Waiting for process rollout_proc5 to join... [2024-04-27 03:04:44,590][49517] Waiting for process rollout_proc6 to join... [2024-04-27 03:04:44,590][49517] Waiting for process rollout_proc7 to join... [2024-04-27 03:04:44,590][49517] Waiting for process rollout_proc8 to join... [2024-04-27 03:04:44,590][49517] Waiting for process rollout_proc9 to join... [2024-04-27 03:04:44,591][49517] Waiting for process rollout_proc10 to join... [2024-04-27 03:04:44,591][49517] Waiting for process rollout_proc11 to join... [2024-04-27 03:04:44,591][49517] Waiting for process rollout_proc12 to join... [2024-04-27 03:04:44,591][49517] Waiting for process rollout_proc13 to join... [2024-04-27 03:04:44,591][49517] Waiting for process rollout_proc14 to join... [2024-04-27 03:04:44,592][49517] Waiting for process rollout_proc15 to join... 
[2024-04-27 03:04:44,592][49517] Waiting for process rollout_proc16 to join... [2024-04-27 03:04:44,592][49517] Waiting for process rollout_proc17 to join... [2024-04-27 03:04:44,592][49517] Waiting for process rollout_proc18 to join... [2024-04-27 03:04:44,592][49517] Waiting for process rollout_proc19 to join... [2024-04-27 03:04:44,592][49517] Waiting for process rollout_proc20 to join... [2024-04-27 03:04:44,593][49517] Waiting for process rollout_proc21 to join... [2024-04-27 03:04:44,593][49517] Waiting for process rollout_proc22 to join... [2024-04-27 03:04:44,593][49517] Waiting for process rollout_proc23 to join... [2024-04-27 03:04:44,593][49517] Waiting for process rollout_proc24 to join... [2024-04-27 03:04:44,593][49517] Waiting for process rollout_proc25 to join... [2024-04-27 03:04:44,593][49517] Waiting for process rollout_proc26 to join... [2024-04-27 03:04:44,594][49517] Waiting for process rollout_proc27 to join... [2024-04-27 03:04:44,594][49517] Waiting for process rollout_proc28 to join... [2024-04-27 03:04:44,594][49517] Waiting for process rollout_proc29 to join... [2024-04-27 03:04:44,594][49517] Waiting for process rollout_proc30 to join... [2024-04-27 03:04:44,594][49517] Waiting for process rollout_proc31 to join... 
[2024-04-27 03:04:44,595][49517] Batcher 0 profile tree view: batching: 8862.0877, releasing_batches: 372.1009 [2024-04-27 03:04:44,595][49517] Learner 0 profile tree view: misc: 0.7909, prepare_batch: 2627.2247 train: 23343.1086 epoch_init: 0.5188, minibatch_init: 0.5506, losses_postprocess: 62.9190, kl_divergence: 104.5310, after_optimizer: 11568.8476 calculate_losses: 10395.5645 losses_init: 0.3084, forward_head: 1565.2002, bptt_initial: 7670.2271, tail: 130.2271, advantages_returns: 26.3617, losses: 457.0765 bptt: 508.8324 bptt_forward_core: 502.7217 update: 1157.1097 clip: 121.2124 [2024-04-27 03:04:44,595][49517] RolloutWorker_w0 profile tree view: wait_for_trajectories: 4.1040, enqueue_policy_requests: 1624.2252, env_step: 22079.2053, overhead: 839.3966, complete_rollouts: 5.1924 save_policy_outputs: 3010.0739 split_output_tensors: 983.7541 [2024-04-27 03:04:44,595][49517] RolloutWorker_w31 profile tree view: wait_for_trajectories: 4.3000, enqueue_policy_requests: 1899.6848, env_step: 25894.9596, overhead: 1065.4807, complete_rollouts: 150.7558 save_policy_outputs: 3646.2464 split_output_tensors: 1100.2242 [2024-04-27 03:04:44,595][49517] Loop Runner_EvtLoop terminating... [2024-04-27 03:04:44,596][49517] Runner profile tree view: main_loop: 64799.5504 [2024-04-27 03:04:44,596][49517] Collected {0: 5509578752}, FPS: 50345.0 [2024-04-27 08:57:36,394][52031] Saving configuration to /workspace/metta/train_dir/p2.fa.clean/config.json... 
[2024-04-27 08:57:36,403][52031] Rollout worker 0 uses device cpu [2024-04-27 08:57:36,403][52031] Rollout worker 1 uses device cpu [2024-04-27 08:57:36,403][52031] Rollout worker 2 uses device cpu [2024-04-27 08:57:36,403][52031] Rollout worker 3 uses device cpu [2024-04-27 08:57:36,403][52031] Rollout worker 4 uses device cpu [2024-04-27 08:57:36,403][52031] Rollout worker 5 uses device cpu [2024-04-27 08:57:36,403][52031] Rollout worker 6 uses device cpu [2024-04-27 08:57:36,403][52031] Rollout worker 7 uses device cpu [2024-04-27 08:57:36,403][52031] Rollout worker 8 uses device cpu [2024-04-27 08:57:36,403][52031] Rollout worker 9 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 10 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 11 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 12 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 13 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 14 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 15 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 16 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 17 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 18 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 19 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 20 uses device cpu [2024-04-27 08:57:36,404][52031] Rollout worker 21 uses device cpu [2024-04-27 08:57:36,405][52031] Rollout worker 22 uses device cpu [2024-04-27 08:57:36,405][52031] Rollout worker 23 uses device cpu [2024-04-27 08:57:36,405][52031] Rollout worker 24 uses device cpu [2024-04-27 08:57:36,405][52031] Rollout worker 25 uses device cpu [2024-04-27 08:57:36,405][52031] Rollout worker 26 uses device cpu [2024-04-27 08:57:36,405][52031] Rollout worker 27 uses device cpu [2024-04-27 08:57:36,405][52031] Rollout worker 28 uses device cpu [2024-04-27 08:57:36,405][52031] Rollout worker 29 uses device cpu 
[2024-04-27 08:57:36,405][52031] Rollout worker 30 uses device cpu [2024-04-27 08:57:36,405][52031] Rollout worker 31 uses device cpu [2024-04-27 08:57:36,972][52031] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 08:57:36,972][52031] InferenceWorker_p0-w0: min num requests: 10 [2024-04-27 08:57:37,019][52031] Starting all processes... [2024-04-27 08:57:37,019][52031] Starting process learner_proc0 [2024-04-27 08:57:37,078][52031] Starting all processes... [2024-04-27 08:57:37,082][52031] Starting process inference_proc0-0 [2024-04-27 08:57:37,082][52031] Starting process rollout_proc0 [2024-04-27 08:57:37,082][52031] Starting process rollout_proc1 [2024-04-27 08:57:37,082][52031] Starting process rollout_proc2 [2024-04-27 08:57:37,082][52031] Starting process rollout_proc3 [2024-04-27 08:57:37,082][52031] Starting process rollout_proc4 [2024-04-27 08:57:37,082][52031] Starting process rollout_proc5 [2024-04-27 08:57:37,082][52031] Starting process rollout_proc6 [2024-04-27 08:57:37,082][52031] Starting process rollout_proc7 [2024-04-27 08:57:37,084][52031] Starting process rollout_proc8 [2024-04-27 08:57:37,084][52031] Starting process rollout_proc9 [2024-04-27 08:57:37,085][52031] Starting process rollout_proc10 [2024-04-27 08:57:37,085][52031] Starting process rollout_proc11 [2024-04-27 08:57:37,085][52031] Starting process rollout_proc12 [2024-04-27 08:57:37,085][52031] Starting process rollout_proc13 [2024-04-27 08:57:37,085][52031] Starting process rollout_proc14 [2024-04-27 08:57:37,085][52031] Starting process rollout_proc15 [2024-04-27 08:57:37,085][52031] Starting process rollout_proc16 [2024-04-27 08:57:37,086][52031] Starting process rollout_proc17 [2024-04-27 08:57:37,088][52031] Starting process rollout_proc18 [2024-04-27 08:57:37,089][52031] Starting process rollout_proc19 [2024-04-27 08:57:37,089][52031] Starting process rollout_proc20 [2024-04-27 08:57:37,098][52031] Starting process rollout_proc21 [2024-04-27 
08:57:37,098][52031] Starting process rollout_proc22 [2024-04-27 08:57:37,098][52031] Starting process rollout_proc23 [2024-04-27 08:57:37,098][52031] Starting process rollout_proc24 [2024-04-27 08:57:37,098][52031] Starting process rollout_proc25 [2024-04-27 08:57:37,100][52031] Starting process rollout_proc26 [2024-04-27 08:57:37,101][52031] Starting process rollout_proc27 [2024-04-27 08:57:37,104][52031] Starting process rollout_proc28 [2024-04-27 08:57:37,106][52031] Starting process rollout_proc29 [2024-04-27 08:57:37,109][52031] Starting process rollout_proc30 [2024-04-27 08:57:37,111][52031] Starting process rollout_proc31 [2024-04-27 08:57:40,526][52266] Worker 3 uses CPU cores [3] [2024-04-27 08:57:40,538][52267] Worker 4 uses CPU cores [4] [2024-04-27 08:57:40,585][52265] Worker 2 uses CPU cores [2] [2024-04-27 08:57:40,682][52242] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 08:57:40,682][52242] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-04-27 08:57:40,692][52242] Num visible devices: 1 [2024-04-27 08:57:40,694][52268] Worker 8 uses CPU cores [8] [2024-04-27 08:57:40,734][52264] Worker 1 uses CPU cores [1] [2024-04-27 08:57:40,739][52242] Starting seed is not provided [2024-04-27 08:57:40,739][52242] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 08:57:40,739][52242] Initializing actor-critic model on device cuda:0 [2024-04-27 08:57:40,740][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,750][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,750][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,750][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 
08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,751][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,752][52242] RunningMeanStd input shape: (1,) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,754][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] 
RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,755][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,756][52242] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:40,802][52287] Worker 25 uses CPU cores [25] [2024-04-27 08:57:40,805][52275] Worker 12 uses CPU cores [12] [2024-04-27 08:57:40,822][52272] Worker 6 uses CPU cores [6] [2024-04-27 08:57:40,841][52242] Created Actor Critic model with architecture: [2024-04-27 08:57:40,841][52242] PredictingActorCritic( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): FeatureEncoder( (_global_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (conf:agent:energy:initial): RunningMeanStdInPlace() (conf:agent:energy:max): RunningMeanStdInPlace() (conf:agent:energy:regen): RunningMeanStdInPlace() (conf:altar:cooldown): RunningMeanStdInPlace() (conf:altar:cost): RunningMeanStdInPlace() (conf:attack:damage): RunningMeanStdInPlace() (conf:attack:freeze_duration): RunningMeanStdInPlace() (conf:charger:cooldown): RunningMeanStdInPlace() (conf:charger:energy): RunningMeanStdInPlace() (conf:cost:attack): RunningMeanStdInPlace() (conf:cost:frozen): RunningMeanStdInPlace() (conf:cost:jump): RunningMeanStdInPlace() (conf:cost:move): RunningMeanStdInPlace() (conf:cost:move:predator): 
RunningMeanStdInPlace() (conf:cost:move:prey): RunningMeanStdInPlace() (conf:cost:rotate): RunningMeanStdInPlace() (conf:cost:shield): RunningMeanStdInPlace() (conf:cost:shield:upkeep): RunningMeanStdInPlace() (conf:generator:cooldown): RunningMeanStdInPlace() (conf:gift:energy): RunningMeanStdInPlace() (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() (last_reward): RunningMeanStdInPlace() ) ) (_grid_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (charger): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:extra_property:0): RunningMeanStdInPlace() (agent:extra_property:1): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv:1): RunningMeanStdInPlace() (agent:inv:2): RunningMeanStdInPlace() (agent:inv:3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (agent:species): RunningMeanStdInPlace() (altar:ready): RunningMeanStdInPlace() (charger:bonus): RunningMeanStdInPlace() (charger:input:1): RunningMeanStdInPlace() (charger:input:2): RunningMeanStdInPlace() (charger:input:3): RunningMeanStdInPlace() (charger:output): RunningMeanStdInPlace() (charger:ready): RunningMeanStdInPlace() (generator:ready): RunningMeanStdInPlace() (generator:resource): RunningMeanStdInPlace() ) ) (_global_encoder): FeatureListEncoder( (_embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (_grid_encoder): FeatureListEncoder( (_embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, 
bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (encoder_head): Sequential( (0): Linear(in_features=520, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): FeatureDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=17, bias=True) ) ) [2024-04-27 08:57:40,842][52292] Worker 30 uses CPU cores [30] [2024-04-27 08:57:40,862][52262] Worker 0 uses CPU cores [0] [2024-04-27 08:57:40,890][52282] Worker 18 uses CPU cores [18] [2024-04-27 08:57:40,898][52289] Worker 27 uses CPU cores [27] [2024-04-27 08:57:40,926][52285] Worker 21 uses CPU cores [21] [2024-04-27 08:57:40,946][52270] Worker 5 uses CPU cores [5] [2024-04-27 08:57:41,002][52274] Worker 11 uses CPU cores [11] [2024-04-27 08:57:41,014][52263] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 08:57:41,014][52263] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-04-27 08:57:41,017][52286] Worker 23 uses CPU cores [23] [2024-04-27 08:57:41,017][52276] Worker 13 uses CPU cores [13] [2024-04-27 08:57:41,023][52273] Worker 10 uses CPU cores [10] [2024-04-27 08:57:41,035][52291] Worker 28 uses CPU cores [28] [2024-04-27 08:57:41,039][52284] Worker 22 uses CPU cores [22] [2024-04-27 08:57:41,045][52263] Num visible devices: 1 [2024-04-27 08:57:41,045][52279] Worker 15 uses CPU cores [15] [2024-04-27 08:57:41,050][52281] Worker 17 uses CPU cores [17] [2024-04-27 08:57:41,058][52271] Worker 7 uses CPU cores [7] [2024-04-27 08:57:41,062][52280] Worker 19 uses CPU cores [19] [2024-04-27 08:57:41,089][52242] Using optimizer 
[2024-04-27 08:57:41,101][52278] Worker 16 uses CPU cores [16] [2024-04-27 08:57:41,107][52283] Worker 20 uses CPU cores [20] [2024-04-27 08:57:41,142][52294] Worker 31 uses CPU cores [31] [2024-04-27 08:57:41,143][52288] Worker 24 uses CPU cores [24] [2024-04-27 08:57:41,225][52242] Loading state from checkpoint /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336278_5509578752.pth... [2024-04-27 08:57:41,226][52293] Worker 29 uses CPU cores [29] [2024-04-27 08:57:41,247][52242] Loading model from checkpoint [2024-04-27 08:57:41,250][52242] Loaded experiment state at self.train_step=336278, self.env_steps=5509578752 [2024-04-27 08:57:41,250][52242] Initialized policy 0 weights for model version 336278 [2024-04-27 08:57:41,252][52242] LearnerWorker_p0 finished initialization! [2024-04-27 08:57:41,252][52242] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 08:57:41,305][52269] Worker 9 uses CPU cores [9] [2024-04-27 08:57:41,365][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,367][52290] Worker 26 uses CPU cores [26] [2024-04-27 08:57:41,369][52277] Worker 14 uses CPU cores [14] [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input 
shape: (1,) [2024-04-27 08:57:41,371][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (1,) [2024-04-27 08:57:41,372][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 
11) [2024-04-27 08:57:41,373][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,374][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,374][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,374][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,374][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,374][52263] RunningMeanStd input shape: (11, 11) [2024-04-27 08:57:41,434][52031] Inference worker 0-0 is ready! [2024-04-27 08:57:41,434][52031] All inference workers are ready! Signal rollout workers to start! [2024-04-27 08:57:42,003][52279] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,015][52268] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,075][52273] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,146][52264] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,152][52266] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,156][52276] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,158][52267] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,161][52265] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,163][52270] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,165][52275] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,166][52274] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,174][52262] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,179][52272] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,180][52271] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,194][52289] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,194][52291] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,197][52292] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,198][52287] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,200][52284] Decorrelating experience for 0 frames... 
[2024-04-27 08:57:42,202][52282] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,202][52286] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,203][52285] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,205][52280] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,207][52281] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,220][52278] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,237][52283] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,258][52288] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,275][52294] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,303][52269] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,337][52277] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,366][52293] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,389][52290] Decorrelating experience for 0 frames... [2024-04-27 08:57:42,662][52279] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,702][52268] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,755][52273] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,810][52264] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,818][52274] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,844][52266] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,848][52276] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,857][52265] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,858][52267] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,858][52270] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,859][52275] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,870][52262] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,884][52271] Decorrelating experience for 256 frames... 
[2024-04-27 08:57:42,884][52272] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,950][52291] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,956][52289] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,961][52284] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,963][52292] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,970][52287] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,971][52286] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,973][52285] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,975][52269] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,975][52278] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,977][52282] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,978][52280] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,980][52281] Decorrelating experience for 256 frames... [2024-04-27 08:57:42,995][52283] Decorrelating experience for 256 frames... [2024-04-27 08:57:43,000][52277] Decorrelating experience for 256 frames... [2024-04-27 08:57:43,008][52288] Decorrelating experience for 256 frames... [2024-04-27 08:57:43,035][52294] Decorrelating experience for 256 frames... [2024-04-27 08:57:43,095][52293] Decorrelating experience for 256 frames... [2024-04-27 08:57:43,116][52290] Decorrelating experience for 256 frames... [2024-04-27 08:57:44,107][52031] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 5509578752. Throughput: 0: nan. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-04-27 08:57:47,755][52279] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-04-27 08:57:47,771][52264] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-04-27 08:57:47,771][52266] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-04-27 08:57:47,785][52274] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-04-27 08:57:47,793][52268] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-04-27 08:57:47,807][52270] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-04-27 08:57:47,831][52275] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-04-27 08:57:47,838][52271] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-04-27 08:57:47,838][52276] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-04-27 08:57:47,850][52265] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-04-27 08:57:47,882][52273] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-04-27 08:57:47,882][52269] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-04-27 08:57:47,907][52242] Signal inference workers to stop experience collection... [2024-04-27 08:57:47,912][52263] InferenceWorker_p0-w0: stopping experience collection [2024-04-27 08:57:47,912][52285] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-04-27 08:57:48,377][52242] Signal inference workers to resume experience collection... 
[2024-04-27 08:57:48,378][52263] InferenceWorker_p0-w0: resuming experience collection [2024-04-27 08:57:48,400][52277] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-04-27 08:57:48,406][52278] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-04-27 08:57:48,410][52291] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-04-27 08:57:48,411][52292] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-04-27 08:57:48,666][52282] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-04-27 08:57:48,666][52284] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-04-27 08:57:48,710][52281] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-04-27 08:57:48,721][52287] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-04-27 08:57:48,721][52289] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-04-27 08:57:48,739][52288] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-04-27 08:57:48,828][52293] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-04-27 08:57:48,828][52280] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-04-27 08:57:48,835][52294] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-04-27 08:57:48,835][52286] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-04-27 08:57:48,866][52290] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-04-27 08:57:48,963][52283] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-04-27 08:57:49,106][52031] Fps is (10 sec: 19661.3, 60 sec: 19661.3, 300 sec: 19661.3). Total num frames: 5509677056. Throughput: 0: 56953.4. Samples: 284760. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:57:49,563][52267] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-04-27 08:57:49,684][52263] Updated weights for policy 0, policy_version 336288 (0.0019) [2024-04-27 08:57:49,695][52272] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-04-27 08:57:52,482][52264] Worker 1 awakens! [2024-04-27 08:57:54,107][52031] Fps is (10 sec: 16384.0, 60 sec: 16384.0, 300 sec: 16384.0). Total num frames: 5509742592. Throughput: 0: 33379.9. Samples: 333800. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:57:56,968][52031] Heartbeat connected on Batcher_0 [2024-04-27 08:57:56,970][52031] Heartbeat connected on LearnerWorker_p0 [2024-04-27 08:57:56,983][52031] Heartbeat connected on RolloutWorker_w1 [2024-04-27 08:57:56,983][52031] Heartbeat connected on RolloutWorker_w0 [2024-04-27 08:57:57,042][52031] Heartbeat connected on InferenceWorker_p0-w0 [2024-04-27 08:57:57,272][52265] Worker 2 awakens! [2024-04-27 08:57:57,281][52031] Heartbeat connected on RolloutWorker_w2 [2024-04-27 08:57:59,106][52031] Fps is (10 sec: 8191.9, 60 sec: 12015.0, 300 sec: 12015.0). Total num frames: 5509758976. Throughput: 0: 22674.7. Samples: 340120. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:01,905][52266] Worker 3 awakens! [2024-04-27 08:58:01,920][52031] Heartbeat connected on RolloutWorker_w3 [2024-04-27 08:58:04,107][52031] Fps is (10 sec: 3276.8, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 5509775360. Throughput: 0: 17930.0. Samples: 358600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:08,407][52267] Worker 4 awakens! [2024-04-27 08:58:08,417][52031] Heartbeat connected on RolloutWorker_w4 [2024-04-27 08:58:09,106][52031] Fps is (10 sec: 4915.2, 60 sec: 9175.1, 300 sec: 9175.1). Total num frames: 5509808128. Throughput: 0: 15340.1. Samples: 383500. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:09,107][52031] Avg episode reward: [(0, '0.270')] [2024-04-27 08:58:11,344][52270] Worker 5 awakens! [2024-04-27 08:58:11,350][52031] Heartbeat connected on RolloutWorker_w5 [2024-04-27 08:58:14,106][52031] Fps is (10 sec: 9830.5, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 5509873664. Throughput: 0: 14522.7. Samples: 435680. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:14,107][52031] Avg episode reward: [(0, '0.465')] [2024-04-27 08:58:14,511][52263] Updated weights for policy 0, policy_version 336298 (0.0019) [2024-04-27 08:58:17,918][52272] Worker 6 awakens! [2024-04-27 08:58:17,922][52031] Heartbeat connected on RolloutWorker_w6 [2024-04-27 08:58:19,106][52031] Fps is (10 sec: 18022.3, 60 sec: 11702.9, 300 sec: 11702.9). Total num frames: 5509988352. Throughput: 0: 15875.5. Samples: 555640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:19,107][52031] Avg episode reward: [(0, '0.483')] [2024-04-27 08:58:20,750][52271] Worker 7 awakens! [2024-04-27 08:58:20,756][52031] Heartbeat connected on RolloutWorker_w7 [2024-04-27 08:58:20,875][52263] Updated weights for policy 0, policy_version 336308 (0.0014) [2024-04-27 08:58:24,106][52031] Fps is (10 sec: 26214.2, 60 sec: 13926.4, 300 sec: 13926.4). Total num frames: 5510135808. Throughput: 0: 17745.0. Samples: 709800. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:24,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 08:58:25,391][52268] Worker 8 awakens! [2024-04-27 08:58:25,395][52031] Heartbeat connected on RolloutWorker_w8 [2024-04-27 08:58:27,544][52263] Updated weights for policy 0, policy_version 336318 (0.0014) [2024-04-27 08:58:29,106][52031] Fps is (10 sec: 27852.9, 60 sec: 15291.8, 300 sec: 15291.8). Total num frames: 5510266880. Throughput: 0: 17589.4. Samples: 791520. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 08:58:30,169][52269] Worker 9 awakens! [2024-04-27 08:58:30,174][52031] Heartbeat connected on RolloutWorker_w9 [2024-04-27 08:58:32,730][52263] Updated weights for policy 0, policy_version 336328 (0.0014) [2024-04-27 08:58:34,107][52031] Fps is (10 sec: 31129.5, 60 sec: 17367.0, 300 sec: 17367.0). Total num frames: 5510447104. Throughput: 0: 15361.7. Samples: 976040. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:34,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 08:58:34,789][52273] Worker 10 awakens! [2024-04-27 08:58:34,793][52031] Heartbeat connected on RolloutWorker_w10 [2024-04-27 08:58:37,243][52263] Updated weights for policy 0, policy_version 336338 (0.0016) [2024-04-27 08:58:39,107][52031] Fps is (10 sec: 37682.9, 60 sec: 19362.9, 300 sec: 19362.9). Total num frames: 5510643712. Throughput: 0: 19396.0. Samples: 1206620. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:39,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 08:58:39,446][52274] Worker 11 awakens! [2024-04-27 08:58:39,453][52031] Heartbeat connected on RolloutWorker_w11 [2024-04-27 08:58:41,234][52263] Updated weights for policy 0, policy_version 336348 (0.0015) [2024-04-27 08:58:44,107][52031] Fps is (10 sec: 39321.5, 60 sec: 21026.1, 300 sec: 21026.1). Total num frames: 5510840320. Throughput: 0: 21928.8. Samples: 1326920. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:44,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 08:58:44,180][52275] Worker 12 awakens! [2024-04-27 08:58:44,184][52031] Heartbeat connected on RolloutWorker_w12 [2024-04-27 08:58:44,702][52263] Updated weights for policy 0, policy_version 336358 (0.0018) [2024-04-27 08:58:48,875][52276] Worker 13 awakens! 
[2024-04-27 08:58:48,881][52031] Heartbeat connected on RolloutWorker_w13 [2024-04-27 08:58:48,953][52263] Updated weights for policy 0, policy_version 336368 (0.0019) [2024-04-27 08:58:49,107][52031] Fps is (10 sec: 40959.8, 60 sec: 22937.5, 300 sec: 22685.5). Total num frames: 5511053312. Throughput: 0: 27266.7. Samples: 1585600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:49,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 08:58:52,730][52263] Updated weights for policy 0, policy_version 336378 (0.0021) [2024-04-27 08:58:54,107][52031] Fps is (10 sec: 42598.5, 60 sec: 25395.2, 300 sec: 24107.9). Total num frames: 5511266304. Throughput: 0: 32479.9. Samples: 1845100. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:54,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 08:58:54,122][52277] Worker 14 awakens! [2024-04-27 08:58:54,127][52031] Heartbeat connected on RolloutWorker_w14 [2024-04-27 08:58:57,005][52263] Updated weights for policy 0, policy_version 336388 (0.0021) [2024-04-27 08:58:58,168][52279] Worker 15 awakens! [2024-04-27 08:58:58,173][52031] Heartbeat connected on RolloutWorker_w15 [2024-04-27 08:58:59,107][52031] Fps is (10 sec: 42598.5, 60 sec: 28672.0, 300 sec: 25340.6). Total num frames: 5511479296. Throughput: 0: 33883.5. Samples: 1960440. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:58:59,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 08:59:00,774][52263] Updated weights for policy 0, policy_version 336398 (0.0026) [2024-04-27 08:59:03,489][52278] Worker 16 awakens! [2024-04-27 08:59:03,498][52031] Heartbeat connected on RolloutWorker_w16 [2024-04-27 08:59:04,106][52031] Fps is (10 sec: 40960.5, 60 sec: 31675.8, 300 sec: 26214.4). Total num frames: 5511675904. Throughput: 0: 36600.9. Samples: 2202680. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 08:59:04,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 08:59:04,564][52263] Updated weights for policy 0, policy_version 336408 (0.0027) [2024-04-27 08:59:08,498][52281] Worker 17 awakens! [2024-04-27 08:59:08,507][52031] Heartbeat connected on RolloutWorker_w17 [2024-04-27 08:59:08,643][52263] Updated weights for policy 0, policy_version 336418 (0.0024) [2024-04-27 08:59:09,106][52031] Fps is (10 sec: 40960.3, 60 sec: 34679.4, 300 sec: 27178.2). Total num frames: 5511888896. Throughput: 0: 38947.6. Samples: 2462440. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:09,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 08:59:12,220][52263] Updated weights for policy 0, policy_version 336428 (0.0028) [2024-04-27 08:59:13,139][52282] Worker 18 awakens! [2024-04-27 08:59:13,147][52031] Heartbeat connected on RolloutWorker_w18 [2024-04-27 08:59:14,107][52031] Fps is (10 sec: 44236.4, 60 sec: 37410.1, 300 sec: 28216.9). Total num frames: 5512118272. Throughput: 0: 40103.9. Samples: 2596200. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:14,107][52031] Avg episode reward: [(0, '0.476')] [2024-04-27 08:59:15,770][52263] Updated weights for policy 0, policy_version 336438 (0.0025) [2024-04-27 08:59:17,913][52280] Worker 19 awakens! [2024-04-27 08:59:17,923][52031] Heartbeat connected on RolloutWorker_w19 [2024-04-27 08:59:19,080][52263] Updated weights for policy 0, policy_version 336448 (0.0025) [2024-04-27 08:59:19,107][52031] Fps is (10 sec: 47513.1, 60 sec: 39594.6, 300 sec: 29318.7). Total num frames: 5512364032. Throughput: 0: 42021.7. Samples: 2867020. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:19,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 08:59:22,744][52263] Updated weights for policy 0, policy_version 336458 (0.0024) [2024-04-27 08:59:22,813][52283] Worker 20 awakens! 
[2024-04-27 08:59:22,822][52031] Heartbeat connected on RolloutWorker_w20 [2024-04-27 08:59:24,107][52031] Fps is (10 sec: 45875.2, 60 sec: 40686.9, 300 sec: 29982.7). Total num frames: 5512577024. Throughput: 0: 43018.7. Samples: 3142460. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:24,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 08:59:26,365][52263] Updated weights for policy 0, policy_version 336468 (0.0027) [2024-04-27 08:59:26,439][52285] Worker 21 awakens! [2024-04-27 08:59:26,449][52031] Heartbeat connected on RolloutWorker_w21 [2024-04-27 08:59:29,107][52031] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 30895.5). Total num frames: 5512822784. Throughput: 0: 43577.3. Samples: 3287900. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:29,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 08:59:29,911][52263] Updated weights for policy 0, policy_version 336478 (0.0026) [2024-04-27 08:59:31,890][52284] Worker 22 awakens! [2024-04-27 08:59:31,900][52031] Heartbeat connected on RolloutWorker_w22 [2024-04-27 08:59:32,682][52263] Updated weights for policy 0, policy_version 336488 (0.0030) [2024-04-27 08:59:34,107][52031] Fps is (10 sec: 47513.2, 60 sec: 43417.5, 300 sec: 31576.4). Total num frames: 5513052160. Throughput: 0: 44139.1. Samples: 3571860. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:34,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 08:59:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336490_5513052160.pth... [2024-04-27 08:59:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336212_5508497408.pth [2024-04-27 08:59:36,431][52263] Updated weights for policy 0, policy_version 336498 (0.0024) [2024-04-27 08:59:36,748][52286] Worker 23 awakens! 
[2024-04-27 08:59:36,759][52031] Heartbeat connected on RolloutWorker_w23 [2024-04-27 08:59:39,107][52031] Fps is (10 sec: 45875.2, 60 sec: 43963.7, 300 sec: 32198.1). Total num frames: 5513281536. Throughput: 0: 45076.4. Samples: 3873540. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:39,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 08:59:39,962][52263] Updated weights for policy 0, policy_version 336508 (0.0026) [2024-04-27 08:59:41,339][52288] Worker 24 awakens! [2024-04-27 08:59:41,350][52031] Heartbeat connected on RolloutWorker_w24 [2024-04-27 08:59:42,845][52263] Updated weights for policy 0, policy_version 336518 (0.0024) [2024-04-27 08:59:44,107][52031] Fps is (10 sec: 49152.4, 60 sec: 45056.0, 300 sec: 33041.1). Total num frames: 5513543680. Throughput: 0: 45759.5. Samples: 4019620. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:44,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 08:59:46,006][52287] Worker 25 awakens! [2024-04-27 08:59:46,017][52031] Heartbeat connected on RolloutWorker_w25 [2024-04-27 08:59:46,230][52263] Updated weights for policy 0, policy_version 336528 (0.0028) [2024-04-27 08:59:49,107][52031] Fps is (10 sec: 52429.1, 60 sec: 45875.3, 300 sec: 33816.6). Total num frames: 5513805824. Throughput: 0: 46942.1. Samples: 4315080. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:49,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 08:59:49,838][52263] Updated weights for policy 0, policy_version 336538 (0.0026) [2024-04-27 08:59:50,789][52290] Worker 26 awakens! [2024-04-27 08:59:50,801][52031] Heartbeat connected on RolloutWorker_w26 [2024-04-27 08:59:51,110][52242] Signal inference workers to stop experience collection... (50 times) [2024-04-27 08:59:51,110][52242] Signal inference workers to resume experience collection... 
(50 times) [2024-04-27 08:59:51,121][52263] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-04-27 08:59:51,121][52263] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-04-27 08:59:52,428][52263] Updated weights for policy 0, policy_version 336548 (0.0028) [2024-04-27 08:59:54,107][52031] Fps is (10 sec: 52428.9, 60 sec: 46694.4, 300 sec: 34532.4). Total num frames: 5514067968. Throughput: 0: 48135.9. Samples: 4628560. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:54,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 08:59:55,384][52289] Worker 27 awakens! [2024-04-27 08:59:55,397][52031] Heartbeat connected on RolloutWorker_w27 [2024-04-27 08:59:56,206][52263] Updated weights for policy 0, policy_version 336558 (0.0034) [2024-04-27 08:59:58,756][52263] Updated weights for policy 0, policy_version 336568 (0.0038) [2024-04-27 08:59:59,107][52031] Fps is (10 sec: 52428.1, 60 sec: 47513.5, 300 sec: 35195.2). Total num frames: 5514330112. Throughput: 0: 48623.4. Samples: 4784260. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 08:59:59,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 08:59:59,730][52291] Worker 28 awakens! [2024-04-27 08:59:59,743][52031] Heartbeat connected on RolloutWorker_w28 [2024-04-27 09:00:02,288][52263] Updated weights for policy 0, policy_version 336578 (0.0030) [2024-04-27 09:00:04,107][52031] Fps is (10 sec: 52428.9, 60 sec: 48605.8, 300 sec: 35810.7). Total num frames: 5514592256. Throughput: 0: 49422.7. Samples: 5091040. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 09:00:04,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 09:00:04,866][52293] Worker 29 awakens! 
[2024-04-27 09:00:04,878][52031] Heartbeat connected on RolloutWorker_w29 [2024-04-27 09:00:05,879][52263] Updated weights for policy 0, policy_version 336588 (0.0026) [2024-04-27 09:00:08,272][52263] Updated weights for policy 0, policy_version 336598 (0.0034) [2024-04-27 09:00:09,099][52292] Worker 30 awakens! [2024-04-27 09:00:09,106][52031] Fps is (10 sec: 54068.4, 60 sec: 49698.2, 300 sec: 36496.8). Total num frames: 5514870784. Throughput: 0: 50147.7. Samples: 5399100. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 09:00:09,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:00:09,114][52031] Heartbeat connected on RolloutWorker_w30 [2024-04-27 09:00:12,211][52263] Updated weights for policy 0, policy_version 336608 (0.0028) [2024-04-27 09:00:14,107][52031] Fps is (10 sec: 50789.5, 60 sec: 49698.0, 300 sec: 36809.3). Total num frames: 5515100160. Throughput: 0: 50705.2. Samples: 5569640. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 09:00:14,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 09:00:14,230][52294] Worker 31 awakens! [2024-04-27 09:00:14,242][52031] Heartbeat connected on RolloutWorker_w31 [2024-04-27 09:00:14,393][52263] Updated weights for policy 0, policy_version 336618 (0.0039) [2024-04-27 09:00:18,330][52263] Updated weights for policy 0, policy_version 336628 (0.0029) [2024-04-27 09:00:19,106][52031] Fps is (10 sec: 49151.8, 60 sec: 49971.3, 300 sec: 37313.3). Total num frames: 5515362304. Throughput: 0: 51521.5. Samples: 5890320. Policy #0 lag: (min: 0.0, avg: 6.1, max: 13.0) [2024-04-27 09:00:19,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 09:00:20,400][52263] Updated weights for policy 0, policy_version 336638 (0.0032) [2024-04-27 09:00:24,107][52031] Fps is (10 sec: 50790.8, 60 sec: 50517.3, 300 sec: 37683.2). Total num frames: 5515608064. Throughput: 0: 51906.2. Samples: 6209320. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:00:24,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 09:00:24,364][52263] Updated weights for policy 0, policy_version 336648 (0.0029) [2024-04-27 09:00:26,480][52263] Updated weights for policy 0, policy_version 336658 (0.0038) [2024-04-27 09:00:29,107][52031] Fps is (10 sec: 55704.9, 60 sec: 51609.6, 300 sec: 38427.9). Total num frames: 5515919360. Throughput: 0: 52028.8. Samples: 6360920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:00:29,107][52031] Avg episode reward: [(0, '0.484')] [2024-04-27 09:00:30,377][52263] Updated weights for policy 0, policy_version 336668 (0.0031) [2024-04-27 09:00:32,507][52263] Updated weights for policy 0, policy_version 336678 (0.0033) [2024-04-27 09:00:34,108][52031] Fps is (10 sec: 57334.9, 60 sec: 52154.4, 300 sec: 38839.3). Total num frames: 5516181504. Throughput: 0: 52577.6. Samples: 6681160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:00:34,109][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 09:00:36,497][52263] Updated weights for policy 0, policy_version 336688 (0.0030) [2024-04-27 09:00:38,600][52263] Updated weights for policy 0, policy_version 336698 (0.0036) [2024-04-27 09:00:39,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53248.1, 300 sec: 39415.2). Total num frames: 5516476416. Throughput: 0: 52818.7. Samples: 7005400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:00:39,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 09:00:42,603][52263] Updated weights for policy 0, policy_version 336708 (0.0034) [2024-04-27 09:00:44,106][52031] Fps is (10 sec: 55715.1, 60 sec: 53248.1, 300 sec: 39776.7). Total num frames: 5516738560. Throughput: 0: 53272.2. Samples: 7181500. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:00:44,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 09:00:44,646][52263] Updated weights for policy 0, policy_version 336718 (0.0026) [2024-04-27 09:00:48,627][52263] Updated weights for policy 0, policy_version 336728 (0.0030) [2024-04-27 09:00:49,106][52031] Fps is (10 sec: 49152.1, 60 sec: 52701.9, 300 sec: 39941.6). Total num frames: 5516967936. Throughput: 0: 53626.3. Samples: 7504220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:00:49,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 09:00:50,694][52263] Updated weights for policy 0, policy_version 336738 (0.0031) [2024-04-27 09:00:51,912][52242] Signal inference workers to stop experience collection... (100 times) [2024-04-27 09:00:51,912][52242] Signal inference workers to resume experience collection... (100 times) [2024-04-27 09:00:51,924][52263] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-04-27 09:00:51,945][52263] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-04-27 09:00:54,107][52031] Fps is (10 sec: 49151.6, 60 sec: 52701.8, 300 sec: 40270.1). Total num frames: 5517230080. Throughput: 0: 53918.9. Samples: 7825460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:00:54,107][52031] Avg episode reward: [(0, '0.507')] [2024-04-27 09:00:54,751][52263] Updated weights for policy 0, policy_version 336748 (0.0028) [2024-04-27 09:00:56,710][52263] Updated weights for policy 0, policy_version 336758 (0.0026) [2024-04-27 09:00:59,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53248.2, 300 sec: 40750.0). Total num frames: 5517524992. Throughput: 0: 53403.4. Samples: 7972780. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:00:59,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 09:01:01,009][52263] Updated weights for policy 0, policy_version 336768 (0.0034) [2024-04-27 09:01:02,850][52263] Updated weights for policy 0, policy_version 336778 (0.0034) [2024-04-27 09:01:04,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53520.9, 300 sec: 41123.8). Total num frames: 5517803520. Throughput: 0: 53542.4. Samples: 8299740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:01:04,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 09:01:06,884][52263] Updated weights for policy 0, policy_version 336788 (0.0027) [2024-04-27 09:01:08,967][52263] Updated weights for policy 0, policy_version 336798 (0.0031) [2024-04-27 09:01:09,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.1, 300 sec: 41559.4). Total num frames: 5518098432. Throughput: 0: 53598.0. Samples: 8621220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:01:09,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:01:12,952][52263] Updated weights for policy 0, policy_version 336808 (0.0030) [2024-04-27 09:01:14,107][52031] Fps is (10 sec: 54067.7, 60 sec: 54067.3, 300 sec: 41740.2). Total num frames: 5518344192. Throughput: 0: 54020.4. Samples: 8791840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:01:14,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:01:15,276][52263] Updated weights for policy 0, policy_version 336818 (0.0026) [2024-04-27 09:01:18,954][52263] Updated weights for policy 0, policy_version 336828 (0.0026) [2024-04-27 09:01:19,107][52031] Fps is (10 sec: 49151.6, 60 sec: 53794.1, 300 sec: 41912.6). Total num frames: 5518589952. Throughput: 0: 53994.0. Samples: 9110800. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:01:19,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 09:01:21,225][52263] Updated weights for policy 0, policy_version 336838 (0.0033) [2024-04-27 09:01:24,107][52031] Fps is (10 sec: 50790.8, 60 sec: 54067.3, 300 sec: 42151.6). Total num frames: 5518852096. Throughput: 0: 53905.3. Samples: 9431140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:01:24,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 09:01:24,995][52263] Updated weights for policy 0, policy_version 336848 (0.0034) [2024-04-27 09:01:27,245][52263] Updated weights for policy 0, policy_version 336858 (0.0031) [2024-04-27 09:01:29,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 42452.8). Total num frames: 5519130624. Throughput: 0: 53548.4. Samples: 9591180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:01:29,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 09:01:31,303][52263] Updated weights for policy 0, policy_version 336868 (0.0029) [2024-04-27 09:01:33,421][52263] Updated weights for policy 0, policy_version 336878 (0.0036) [2024-04-27 09:01:34,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53795.6, 300 sec: 42740.9). Total num frames: 5519409152. Throughput: 0: 53411.5. Samples: 9907740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-04-27 09:01:34,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 09:01:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336878_5519409152.pth... [2024-04-27 09:01:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336278_5509578752.pth [2024-04-27 09:01:37,424][52263] Updated weights for policy 0, policy_version 336888 (0.0033) [2024-04-27 09:01:39,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.2, 300 sec: 43086.5). Total num frames: 5519704064. Throughput: 0: 53319.7. Samples: 10224840. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:01:39,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 09:01:39,569][52263] Updated weights for policy 0, policy_version 336898 (0.0038) [2024-04-27 09:01:43,439][52263] Updated weights for policy 0, policy_version 336908 (0.0026) [2024-04-27 09:01:44,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52974.9, 300 sec: 43076.3). Total num frames: 5519917056. Throughput: 0: 53732.9. Samples: 10390760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:01:44,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 09:01:45,689][52263] Updated weights for policy 0, policy_version 336918 (0.0035) [2024-04-27 09:01:49,106][52031] Fps is (10 sec: 47513.4, 60 sec: 53521.0, 300 sec: 43267.1). Total num frames: 5520179200. Throughput: 0: 53638.9. Samples: 10713480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:01:49,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 09:01:49,605][52263] Updated weights for policy 0, policy_version 336928 (0.0030) [2024-04-27 09:01:52,146][52263] Updated weights for policy 0, policy_version 336938 (0.0028) [2024-04-27 09:01:54,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.2, 300 sec: 43515.9). Total num frames: 5520457728. Throughput: 0: 53646.1. Samples: 11035300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:01:54,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 09:01:55,587][52263] Updated weights for policy 0, policy_version 336948 (0.0032) [2024-04-27 09:01:57,363][52242] Signal inference workers to stop experience collection... (150 times) [2024-04-27 09:01:57,368][52242] Signal inference workers to resume experience collection... 
(150 times) [2024-04-27 09:01:57,384][52263] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-04-27 09:01:57,385][52263] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-04-27 09:01:58,288][52263] Updated weights for policy 0, policy_version 336958 (0.0032) [2024-04-27 09:01:59,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.1, 300 sec: 43754.9). Total num frames: 5520736256. Throughput: 0: 53349.9. Samples: 11192580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:01:59,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 09:02:01,704][52263] Updated weights for policy 0, policy_version 336968 (0.0032) [2024-04-27 09:02:04,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.1, 300 sec: 43984.7). Total num frames: 5521014784. Throughput: 0: 53429.2. Samples: 11515120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:04,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 09:02:04,269][52263] Updated weights for policy 0, policy_version 336978 (0.0032) [2024-04-27 09:02:07,846][52263] Updated weights for policy 0, policy_version 336988 (0.0027) [2024-04-27 09:02:09,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52701.8, 300 sec: 44082.2). Total num frames: 5521260544. Throughput: 0: 53333.3. Samples: 11831140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:09,107][52031] Avg episode reward: [(0, '0.480')] [2024-04-27 09:02:11,210][52263] Updated weights for policy 0, policy_version 336998 (0.0033) [2024-04-27 09:02:14,107][52031] Fps is (10 sec: 50790.6, 60 sec: 52975.0, 300 sec: 44236.8). Total num frames: 5521522688. Throughput: 0: 53321.7. Samples: 11990660. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:14,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 09:02:14,154][52263] Updated weights for policy 0, policy_version 337008 (0.0030) [2024-04-27 09:02:17,440][52263] Updated weights for policy 0, policy_version 337018 (0.0028) [2024-04-27 09:02:19,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53248.1, 300 sec: 44385.8). Total num frames: 5521784832. Throughput: 0: 53411.3. Samples: 12311240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:19,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 09:02:20,256][52263] Updated weights for policy 0, policy_version 337028 (0.0026) [2024-04-27 09:02:23,469][52263] Updated weights for policy 0, policy_version 337038 (0.0030) [2024-04-27 09:02:24,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53247.9, 300 sec: 44529.4). Total num frames: 5522046976. Throughput: 0: 53424.7. Samples: 12628960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:24,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 09:02:26,544][52263] Updated weights for policy 0, policy_version 337048 (0.0031) [2024-04-27 09:02:29,107][52031] Fps is (10 sec: 55704.5, 60 sec: 53521.0, 300 sec: 44782.9). Total num frames: 5522341888. Throughput: 0: 53347.0. Samples: 12791380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:29,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 09:02:29,489][52263] Updated weights for policy 0, policy_version 337058 (0.0032) [2024-04-27 09:02:32,577][52263] Updated weights for policy 0, policy_version 337068 (0.0027) [2024-04-27 09:02:34,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53521.1, 300 sec: 44971.3). Total num frames: 5522620416. Throughput: 0: 53147.1. Samples: 13105100. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:34,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 09:02:35,799][52263] Updated weights for policy 0, policy_version 337078 (0.0029) [2024-04-27 09:02:38,891][52263] Updated weights for policy 0, policy_version 337088 (0.0040) [2024-04-27 09:02:39,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52428.7, 300 sec: 44986.6). Total num frames: 5522849792. Throughput: 0: 53016.9. Samples: 13421060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:39,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 09:02:42,134][52263] Updated weights for policy 0, policy_version 337098 (0.0028) [2024-04-27 09:02:44,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53521.1, 300 sec: 45597.5). Total num frames: 5523128320. Throughput: 0: 52988.5. Samples: 13577060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:44,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 09:02:45,345][52263] Updated weights for policy 0, policy_version 337108 (0.0029) [2024-04-27 09:02:48,501][52263] Updated weights for policy 0, policy_version 337118 (0.0037) [2024-04-27 09:02:49,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52975.0, 300 sec: 46152.9). Total num frames: 5523357696. Throughput: 0: 52767.7. Samples: 13889660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:02:49,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 09:02:51,638][52263] Updated weights for policy 0, policy_version 337128 (0.0028) [2024-04-27 09:02:54,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52974.9, 300 sec: 47041.5). Total num frames: 5523636224. Throughput: 0: 52786.2. Samples: 14206520. 
Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:02:54,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 09:02:54,492][52263] Updated weights for policy 0, policy_version 337138 (0.0026) [2024-04-27 09:02:57,852][52263] Updated weights for policy 0, policy_version 337148 (0.0030) [2024-04-27 09:02:57,980][52242] Signal inference workers to stop experience collection... (200 times) [2024-04-27 09:02:57,982][52242] Signal inference workers to resume experience collection... (200 times) [2024-04-27 09:02:58,000][52263] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-04-27 09:02:58,000][52263] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-04-27 09:02:59,107][52031] Fps is (10 sec: 58981.4, 60 sec: 53520.9, 300 sec: 48041.2). Total num frames: 5523947520. Throughput: 0: 52895.5. Samples: 14370960. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:02:59,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 09:03:00,612][52263] Updated weights for policy 0, policy_version 337158 (0.0030) [2024-04-27 09:03:03,946][52263] Updated weights for policy 0, policy_version 337168 (0.0035) [2024-04-27 09:03:04,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52702.0, 300 sec: 48707.7). Total num frames: 5524176896. Throughput: 0: 52891.0. Samples: 14691340. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:04,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 09:03:06,969][52263] Updated weights for policy 0, policy_version 337178 (0.0027) [2024-04-27 09:03:09,106][52031] Fps is (10 sec: 49152.9, 60 sec: 52975.0, 300 sec: 49374.2). Total num frames: 5524439040. Throughput: 0: 52790.4. Samples: 15004520. 
Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:09,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 09:03:10,107][52263] Updated weights for policy 0, policy_version 337188 (0.0031) [2024-04-27 09:03:13,454][52263] Updated weights for policy 0, policy_version 337198 (0.0034) [2024-04-27 09:03:14,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52702.0, 300 sec: 49818.5). Total num frames: 5524684800. Throughput: 0: 52467.2. Samples: 15152400. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:14,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 09:03:16,253][52263] Updated weights for policy 0, policy_version 337208 (0.0027) [2024-04-27 09:03:19,107][52031] Fps is (10 sec: 50789.5, 60 sec: 52701.7, 300 sec: 50207.2). Total num frames: 5524946944. Throughput: 0: 52585.2. Samples: 15471440. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:19,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 09:03:19,588][52263] Updated weights for policy 0, policy_version 337218 (0.0029) [2024-04-27 09:03:22,402][52263] Updated weights for policy 0, policy_version 337228 (0.0028) [2024-04-27 09:03:24,106][52031] Fps is (10 sec: 57343.9, 60 sec: 53521.2, 300 sec: 50818.2). Total num frames: 5525258240. Throughput: 0: 52663.6. Samples: 15790920. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:24,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 09:03:25,694][52263] Updated weights for policy 0, policy_version 337238 (0.0033) [2024-04-27 09:03:28,475][52263] Updated weights for policy 0, policy_version 337248 (0.0034) [2024-04-27 09:03:29,107][52031] Fps is (10 sec: 57344.2, 60 sec: 52974.9, 300 sec: 51095.9). Total num frames: 5525520384. Throughput: 0: 52982.1. Samples: 15961260. 
Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:29,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 09:03:31,753][52263] Updated weights for policy 0, policy_version 337258 (0.0028) [2024-04-27 09:03:34,106][52031] Fps is (10 sec: 47513.5, 60 sec: 51882.7, 300 sec: 51151.4). Total num frames: 5525733376. Throughput: 0: 53059.5. Samples: 16277340. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:34,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 09:03:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000337265_5525749760.pth... [2024-04-27 09:03:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336490_5513052160.pth [2024-04-27 09:03:34,664][52263] Updated weights for policy 0, policy_version 337268 (0.0025) [2024-04-27 09:03:37,848][52263] Updated weights for policy 0, policy_version 337278 (0.0036) [2024-04-27 09:03:39,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52975.0, 300 sec: 51484.7). Total num frames: 5526028288. Throughput: 0: 53148.2. Samples: 16598180. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:39,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 09:03:40,882][52263] Updated weights for policy 0, policy_version 337288 (0.0028) [2024-04-27 09:03:44,079][52263] Updated weights for policy 0, policy_version 337298 (0.0036) [2024-04-27 09:03:44,107][52031] Fps is (10 sec: 55704.9, 60 sec: 52701.8, 300 sec: 51651.3). Total num frames: 5526290432. Throughput: 0: 52859.1. Samples: 16749620. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:44,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 09:03:46,975][52263] Updated weights for policy 0, policy_version 337308 (0.0038) [2024-04-27 09:03:49,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 51817.9). Total num frames: 5526552576. Throughput: 0: 52753.3. Samples: 17065240. 
Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:49,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 09:03:50,286][52263] Updated weights for policy 0, policy_version 337318 (0.0029) [2024-04-27 09:03:53,110][52242] Signal inference workers to stop experience collection... (250 times) [2024-04-27 09:03:53,129][52263] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-04-27 09:03:53,172][52242] Signal inference workers to resume experience collection... (250 times) [2024-04-27 09:03:53,173][52263] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-04-27 09:03:53,175][52263] Updated weights for policy 0, policy_version 337328 (0.0029) [2024-04-27 09:03:54,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53521.2, 300 sec: 52095.6). Total num frames: 5526847488. Throughput: 0: 52775.1. Samples: 17379400. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:54,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 09:03:56,529][52263] Updated weights for policy 0, policy_version 337338 (0.0033) [2024-04-27 09:03:59,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52155.7, 300 sec: 52206.6). Total num frames: 5527076864. Throughput: 0: 53277.6. Samples: 17549900. Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:03:59,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 09:03:59,361][52263] Updated weights for policy 0, policy_version 337348 (0.0030) [2024-04-27 09:04:02,812][52263] Updated weights for policy 0, policy_version 337358 (0.0030) [2024-04-27 09:04:04,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52701.9, 300 sec: 52373.3). Total num frames: 5527339008. Throughput: 0: 53275.8. Samples: 17868840. 
Policy #0 lag: (min: 2.0, avg: 12.3, max: 22.0) [2024-04-27 09:04:04,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:04:05,535][52263] Updated weights for policy 0, policy_version 337368 (0.0028) [2024-04-27 09:04:08,904][52263] Updated weights for policy 0, policy_version 337378 (0.0028) [2024-04-27 09:04:09,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52701.9, 300 sec: 52484.4). Total num frames: 5527601152. Throughput: 0: 53080.5. Samples: 18179540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:09,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 09:04:11,612][52263] Updated weights for policy 0, policy_version 337388 (0.0027) [2024-04-27 09:04:14,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52974.8, 300 sec: 52539.9). Total num frames: 5527863296. Throughput: 0: 52714.7. Samples: 18333420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:14,108][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 09:04:15,072][52263] Updated weights for policy 0, policy_version 337398 (0.0033) [2024-04-27 09:04:17,859][52263] Updated weights for policy 0, policy_version 337408 (0.0036) [2024-04-27 09:04:19,106][52031] Fps is (10 sec: 57343.7, 60 sec: 53794.2, 300 sec: 52873.1). Total num frames: 5528174592. Throughput: 0: 52848.4. Samples: 18655520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:19,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 09:04:21,387][52263] Updated weights for policy 0, policy_version 337418 (0.0029) [2024-04-27 09:04:24,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5528403968. Throughput: 0: 52784.9. Samples: 18973500. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:24,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 09:04:24,264][52263] Updated weights for policy 0, policy_version 337428 (0.0033) [2024-04-27 09:04:27,512][52263] Updated weights for policy 0, policy_version 337438 (0.0026) [2024-04-27 09:04:29,106][52031] Fps is (10 sec: 47513.8, 60 sec: 52155.9, 300 sec: 52873.1). Total num frames: 5528649728. Throughput: 0: 52858.0. Samples: 19128220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:29,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 09:04:30,580][52263] Updated weights for policy 0, policy_version 337448 (0.0032) [2024-04-27 09:04:33,811][52263] Updated weights for policy 0, policy_version 337458 (0.0030) [2024-04-27 09:04:34,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.9, 300 sec: 53039.7). Total num frames: 5528928256. Throughput: 0: 52640.7. Samples: 19434080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:34,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 09:04:36,248][52242] Signal inference workers to stop experience collection... (300 times) [2024-04-27 09:04:36,248][52242] Signal inference workers to resume experience collection... (300 times) [2024-04-27 09:04:36,259][52263] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-04-27 09:04:36,259][52263] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-04-27 09:04:36,842][52263] Updated weights for policy 0, policy_version 337468 (0.0027) [2024-04-27 09:04:39,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52428.7, 300 sec: 52984.2). Total num frames: 5529174016. Throughput: 0: 52762.5. Samples: 19753720. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:39,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 09:04:39,797][52263] Updated weights for policy 0, policy_version 337478 (0.0031) [2024-04-27 09:04:42,902][52263] Updated weights for policy 0, policy_version 337488 (0.0035) [2024-04-27 09:04:44,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53248.1, 300 sec: 53150.8). Total num frames: 5529485312. Throughput: 0: 52698.8. Samples: 19921340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:44,107][52031] Avg episode reward: [(0, '0.677')] [2024-04-27 09:04:45,935][52263] Updated weights for policy 0, policy_version 337498 (0.0031) [2024-04-27 09:04:49,103][52263] Updated weights for policy 0, policy_version 337508 (0.0032) [2024-04-27 09:04:49,106][52031] Fps is (10 sec: 55706.0, 60 sec: 52974.9, 300 sec: 53095.3). Total num frames: 5529731072. Throughput: 0: 52548.0. Samples: 20233500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:49,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 09:04:52,102][52263] Updated weights for policy 0, policy_version 337518 (0.0028) [2024-04-27 09:04:54,107][52031] Fps is (10 sec: 47513.1, 60 sec: 51882.5, 300 sec: 52984.2). Total num frames: 5529960448. Throughput: 0: 52719.8. Samples: 20551940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:54,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 09:04:55,304][52263] Updated weights for policy 0, policy_version 337528 (0.0036) [2024-04-27 09:04:58,310][52263] Updated weights for policy 0, policy_version 337538 (0.0027) [2024-04-27 09:04:59,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52702.0, 300 sec: 53039.7). Total num frames: 5530238976. Throughput: 0: 52560.5. Samples: 20698640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:04:59,107][52031] Avg episode reward: [(0, '0.662')] [2024-04-27 09:05:01,441][52263] Updated weights for policy 0, policy_version 337548 (0.0034) [2024-04-27 09:05:04,106][52031] Fps is (10 sec: 55706.3, 60 sec: 52974.9, 300 sec: 53039.7). Total num frames: 5530517504. Throughput: 0: 52492.0. Samples: 21017660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:05:04,107][52031] Avg episode reward: [(0, '0.457')] [2024-04-27 09:05:04,378][52263] Updated weights for policy 0, policy_version 337558 (0.0030) [2024-04-27 09:05:07,596][52263] Updated weights for policy 0, policy_version 337568 (0.0032) [2024-04-27 09:05:09,106][52031] Fps is (10 sec: 54067.4, 60 sec: 52974.9, 300 sec: 53150.8). Total num frames: 5530779648. Throughput: 0: 52436.0. Samples: 21333120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:05:09,107][52031] Avg episode reward: [(0, '0.492')] [2024-04-27 09:05:11,124][52263] Updated weights for policy 0, policy_version 337578 (0.0028) [2024-04-27 09:05:13,830][52263] Updated weights for policy 0, policy_version 337588 (0.0028) [2024-04-27 09:05:14,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53248.0, 300 sec: 53206.3). Total num frames: 5531058176. Throughput: 0: 52697.2. Samples: 21499600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:05:14,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 09:05:17,178][52263] Updated weights for policy 0, policy_version 337598 (0.0027) [2024-04-27 09:05:19,106][52031] Fps is (10 sec: 49152.2, 60 sec: 51609.6, 300 sec: 53095.3). Total num frames: 5531271168. Throughput: 0: 52994.4. Samples: 21818820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:05:19,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 09:05:20,049][52263] Updated weights for policy 0, policy_version 337608 (0.0032) [2024-04-27 09:05:23,412][52263] Updated weights for policy 0, policy_version 337618 (0.0034) [2024-04-27 09:05:24,106][52031] Fps is (10 sec: 47514.1, 60 sec: 52155.7, 300 sec: 52928.7). Total num frames: 5531533312. Throughput: 0: 52918.8. Samples: 22135060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 09:05:24,107][52031] Avg episode reward: [(0, '0.458')] [2024-04-27 09:05:26,258][52263] Updated weights for policy 0, policy_version 337628 (0.0027) [2024-04-27 09:05:29,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53247.9, 300 sec: 53095.6). Total num frames: 5531844608. Throughput: 0: 52600.8. Samples: 22288380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:05:29,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 09:05:29,614][52263] Updated weights for policy 0, policy_version 337638 (0.0033) [2024-04-27 09:05:32,238][52242] Signal inference workers to stop experience collection... (350 times) [2024-04-27 09:05:32,276][52263] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-04-27 09:05:32,335][52242] Signal inference workers to resume experience collection... (350 times) [2024-04-27 09:05:32,335][52263] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-04-27 09:05:32,337][52263] Updated weights for policy 0, policy_version 337648 (0.0026) [2024-04-27 09:05:34,106][52031] Fps is (10 sec: 57343.8, 60 sec: 52975.0, 300 sec: 52984.2). Total num frames: 5532106752. Throughput: 0: 52709.3. Samples: 22605420. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:05:34,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 09:05:34,179][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000337654_5532123136.pth... 
[2024-04-27 09:05:34,220][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000336878_5519409152.pth [2024-04-27 09:05:35,647][52263] Updated weights for policy 0, policy_version 337658 (0.0027) [2024-04-27 09:05:38,433][52263] Updated weights for policy 0, policy_version 337668 (0.0029) [2024-04-27 09:05:39,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53039.7). Total num frames: 5532385280. Throughput: 0: 52696.8. Samples: 22923300. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:05:39,107][52031] Avg episode reward: [(0, '0.499')] [2024-04-27 09:05:41,739][52263] Updated weights for policy 0, policy_version 337678 (0.0031) [2024-04-27 09:05:44,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52155.8, 300 sec: 53039.7). Total num frames: 5532614656. Throughput: 0: 53152.6. Samples: 23090500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:05:44,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 09:05:44,660][52263] Updated weights for policy 0, policy_version 337688 (0.0034) [2024-04-27 09:05:48,060][52263] Updated weights for policy 0, policy_version 337698 (0.0028) [2024-04-27 09:05:49,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52428.7, 300 sec: 53039.7). Total num frames: 5532876800. Throughput: 0: 53056.7. Samples: 23405220. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:05:49,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 09:05:50,884][52263] Updated weights for policy 0, policy_version 337708 (0.0030) [2024-04-27 09:05:54,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53248.0, 300 sec: 52984.2). Total num frames: 5533155328. Throughput: 0: 53087.4. Samples: 23722060. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:05:54,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 09:05:54,283][52263] Updated weights for policy 0, policy_version 337718 (0.0031) [2024-04-27 09:05:57,053][52263] Updated weights for policy 0, policy_version 337728 (0.0041) [2024-04-27 09:05:59,107][52031] Fps is (10 sec: 57344.7, 60 sec: 53521.1, 300 sec: 53039.8). Total num frames: 5533450240. Throughput: 0: 53118.3. Samples: 23889920. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:05:59,107][52031] Avg episode reward: [(0, '0.472')] [2024-04-27 09:06:00,376][52263] Updated weights for policy 0, policy_version 337738 (0.0031) [2024-04-27 09:06:03,204][52263] Updated weights for policy 0, policy_version 337748 (0.0030) [2024-04-27 09:06:04,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53248.0, 300 sec: 52928.6). Total num frames: 5533712384. Throughput: 0: 53070.2. Samples: 24206980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:06:04,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 09:06:06,908][52263] Updated weights for policy 0, policy_version 337758 (0.0041) [2024-04-27 09:06:09,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53247.9, 300 sec: 52984.2). Total num frames: 5533974528. Throughput: 0: 53096.8. Samples: 24524420. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:06:09,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:06:09,378][52263] Updated weights for policy 0, policy_version 337768 (0.0034) [2024-04-27 09:06:13,010][52263] Updated weights for policy 0, policy_version 337778 (0.0025) [2024-04-27 09:06:14,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52701.9, 300 sec: 52984.2). Total num frames: 5534220288. Throughput: 0: 53067.6. Samples: 24676420. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:06:14,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 09:06:15,450][52263] Updated weights for policy 0, policy_version 337788 (0.0030) [2024-04-27 09:06:19,107][52031] Fps is (10 sec: 49151.9, 60 sec: 53247.9, 300 sec: 52928.6). Total num frames: 5534466048. Throughput: 0: 53116.8. Samples: 24995680. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:06:19,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 09:06:19,241][52263] Updated weights for policy 0, policy_version 337798 (0.0035) [2024-04-27 09:06:21,586][52263] Updated weights for policy 0, policy_version 337808 (0.0035) [2024-04-27 09:06:24,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53794.0, 300 sec: 52984.2). Total num frames: 5534760960. Throughput: 0: 53160.4. Samples: 25315520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:06:24,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 09:06:25,274][52263] Updated weights for policy 0, policy_version 337818 (0.0030) [2024-04-27 09:06:27,733][52263] Updated weights for policy 0, policy_version 337828 (0.0035) [2024-04-27 09:06:29,107][52031] Fps is (10 sec: 57344.4, 60 sec: 53248.0, 300 sec: 52984.2). Total num frames: 5535039488. Throughput: 0: 53208.7. Samples: 25484900. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:06:29,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 09:06:31,604][52263] Updated weights for policy 0, policy_version 337838 (0.0035) [2024-04-27 09:06:34,071][52263] Updated weights for policy 0, policy_version 337848 (0.0034) [2024-04-27 09:06:34,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53247.9, 300 sec: 52873.1). Total num frames: 5535301632. Throughput: 0: 53161.4. Samples: 25797480. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:06:34,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 09:06:37,918][52263] Updated weights for policy 0, policy_version 337858 (0.0035) [2024-04-27 09:06:39,107][52031] Fps is (10 sec: 50789.4, 60 sec: 52701.8, 300 sec: 52984.2). Total num frames: 5535547392. Throughput: 0: 53179.9. Samples: 26115160. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 09:06:39,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 09:06:40,203][52263] Updated weights for policy 0, policy_version 337868 (0.0034) [2024-04-27 09:06:43,962][52263] Updated weights for policy 0, policy_version 337878 (0.0027) [2024-04-27 09:06:44,106][52031] Fps is (10 sec: 49152.5, 60 sec: 52974.8, 300 sec: 52928.6). Total num frames: 5535793152. Throughput: 0: 52694.2. Samples: 26261160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:06:44,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 09:06:46,334][52263] Updated weights for policy 0, policy_version 337888 (0.0027) [2024-04-27 09:06:49,107][52031] Fps is (10 sec: 52429.6, 60 sec: 53248.1, 300 sec: 52928.7). Total num frames: 5536071680. Throughput: 0: 52714.1. Samples: 26579120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:06:49,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 09:06:50,002][52263] Updated weights for policy 0, policy_version 337898 (0.0035) [2024-04-27 09:06:50,821][52242] Signal inference workers to stop experience collection... (400 times) [2024-04-27 09:06:50,861][52263] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-04-27 09:06:50,874][52242] Signal inference workers to resume experience collection... 
(400 times) [2024-04-27 09:06:50,883][52263] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-04-27 09:06:52,660][52263] Updated weights for policy 0, policy_version 337908 (0.0031) [2024-04-27 09:06:54,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53521.2, 300 sec: 52984.2). Total num frames: 5536366592. Throughput: 0: 52682.8. Samples: 26895140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:06:54,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 09:06:56,363][52263] Updated weights for policy 0, policy_version 337918 (0.0031) [2024-04-27 09:06:58,835][52263] Updated weights for policy 0, policy_version 337928 (0.0026) [2024-04-27 09:06:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5536612352. Throughput: 0: 52950.3. Samples: 27059180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:06:59,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 09:07:02,510][52263] Updated weights for policy 0, policy_version 337938 (0.0024) [2024-04-27 09:07:04,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52428.8, 300 sec: 52873.1). Total num frames: 5536858112. Throughput: 0: 52954.8. Samples: 27378640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:04,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 09:07:04,932][52263] Updated weights for policy 0, policy_version 337948 (0.0033) [2024-04-27 09:07:08,736][52263] Updated weights for policy 0, policy_version 337958 (0.0031) [2024-04-27 09:07:09,106][52031] Fps is (10 sec: 49151.9, 60 sec: 52155.8, 300 sec: 52817.6). Total num frames: 5537103872. Throughput: 0: 52938.0. Samples: 27697720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:09,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 09:07:11,061][52263] Updated weights for policy 0, policy_version 337968 (0.0031) [2024-04-27 09:07:14,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52428.9, 300 sec: 52817.6). 
Total num frames: 5537366016. Throughput: 0: 52544.1. Samples: 27849380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:14,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 09:07:15,057][52263] Updated weights for policy 0, policy_version 337978 (0.0038) [2024-04-27 09:07:17,459][52263] Updated weights for policy 0, policy_version 337988 (0.0031) [2024-04-27 09:07:19,107][52031] Fps is (10 sec: 57343.6, 60 sec: 53521.1, 300 sec: 52984.2). Total num frames: 5537677312. Throughput: 0: 52525.4. Samples: 28161120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:19,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 09:07:21,209][52263] Updated weights for policy 0, policy_version 337998 (0.0034) [2024-04-27 09:07:23,557][52263] Updated weights for policy 0, policy_version 338008 (0.0032) [2024-04-27 09:07:24,107][52031] Fps is (10 sec: 57343.1, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5537939456. Throughput: 0: 52527.7. Samples: 28478900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:24,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 09:07:27,536][52263] Updated weights for policy 0, policy_version 338018 (0.0034) [2024-04-27 09:07:29,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5538185216. Throughput: 0: 53098.7. Samples: 28650600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:29,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 09:07:29,800][52263] Updated weights for policy 0, policy_version 338028 (0.0032) [2024-04-27 09:07:33,637][52263] Updated weights for policy 0, policy_version 338038 (0.0029) [2024-04-27 09:07:34,106][52031] Fps is (10 sec: 49152.5, 60 sec: 52155.8, 300 sec: 52817.6). Total num frames: 5538430976. Throughput: 0: 53014.3. Samples: 28964760. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:34,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 09:07:34,200][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000338040_5538447360.pth... [2024-04-27 09:07:34,245][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000337265_5525749760.pth [2024-04-27 09:07:36,169][52263] Updated weights for policy 0, policy_version 338048 (0.0033) [2024-04-27 09:07:39,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5538693120. Throughput: 0: 52998.5. Samples: 29280080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:39,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 09:07:39,905][52263] Updated weights for policy 0, policy_version 338058 (0.0031) [2024-04-27 09:07:42,281][52263] Updated weights for policy 0, policy_version 338068 (0.0032) [2024-04-27 09:07:44,107][52031] Fps is (10 sec: 54066.8, 60 sec: 52974.9, 300 sec: 52928.6). Total num frames: 5538971648. Throughput: 0: 52725.6. Samples: 29431840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:44,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:07:46,057][52263] Updated weights for policy 0, policy_version 338078 (0.0031) [2024-04-27 09:07:48,389][52263] Updated weights for policy 0, policy_version 338088 (0.0033) [2024-04-27 09:07:49,107][52031] Fps is (10 sec: 55705.9, 60 sec: 52974.9, 300 sec: 52928.7). Total num frames: 5539250176. Throughput: 0: 52665.3. Samples: 29748580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:49,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 09:07:52,353][52263] Updated weights for policy 0, policy_version 338098 (0.0036) [2024-04-27 09:07:54,107][52031] Fps is (10 sec: 54067.1, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5539512320. Throughput: 0: 52541.7. Samples: 30062100. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 09:07:54,108][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 09:07:54,726][52263] Updated weights for policy 0, policy_version 338108 (0.0028) [2024-04-27 09:07:58,581][52263] Updated weights for policy 0, policy_version 338118 (0.0033) [2024-04-27 09:07:59,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52155.7, 300 sec: 52762.0). Total num frames: 5539741696. Throughput: 0: 52711.5. Samples: 30221400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:07:59,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:08:00,937][52263] Updated weights for policy 0, policy_version 338128 (0.0027) [2024-04-27 09:08:04,107][52031] Fps is (10 sec: 49152.1, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5540003840. Throughput: 0: 52751.5. Samples: 30534940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:04,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 09:08:04,853][52263] Updated weights for policy 0, policy_version 338138 (0.0028) [2024-04-27 09:08:07,181][52263] Updated weights for policy 0, policy_version 338148 (0.0023) [2024-04-27 09:08:09,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5540265984. Throughput: 0: 52560.2. Samples: 30844100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:09,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 09:08:10,881][52242] Signal inference workers to stop experience collection... (450 times) [2024-04-27 09:08:10,917][52263] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-04-27 09:08:10,934][52242] Signal inference workers to resume experience collection... 
(450 times) [2024-04-27 09:08:10,935][52263] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-04-27 09:08:11,064][52263] Updated weights for policy 0, policy_version 338158 (0.0030) [2024-04-27 09:08:13,379][52263] Updated weights for policy 0, policy_version 338168 (0.0034) [2024-04-27 09:08:14,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53247.8, 300 sec: 52928.6). Total num frames: 5540560896. Throughput: 0: 52450.0. Samples: 31010860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:14,108][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 09:08:17,344][52263] Updated weights for policy 0, policy_version 338178 (0.0034) [2024-04-27 09:08:19,106][52031] Fps is (10 sec: 55705.5, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5540823040. Throughput: 0: 52507.6. Samples: 31327600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:19,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 09:08:19,542][52263] Updated weights for policy 0, policy_version 338188 (0.0033) [2024-04-27 09:08:23,486][52263] Updated weights for policy 0, policy_version 338198 (0.0029) [2024-04-27 09:08:24,106][52031] Fps is (10 sec: 50791.4, 60 sec: 52155.9, 300 sec: 52706.5). Total num frames: 5541068800. Throughput: 0: 52498.8. Samples: 31642520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:24,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 09:08:25,741][52263] Updated weights for policy 0, policy_version 338208 (0.0025) [2024-04-27 09:08:29,106][52031] Fps is (10 sec: 49151.8, 60 sec: 52155.7, 300 sec: 52817.6). Total num frames: 5541314560. Throughput: 0: 52481.0. Samples: 31793480. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:29,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 09:08:29,873][52263] Updated weights for policy 0, policy_version 338218 (0.0039) [2024-04-27 09:08:31,984][52263] Updated weights for policy 0, policy_version 338228 (0.0032) [2024-04-27 09:08:34,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5541593088. Throughput: 0: 52472.9. Samples: 32109860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:34,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 09:08:36,041][52263] Updated weights for policy 0, policy_version 338238 (0.0033) [2024-04-27 09:08:38,165][52263] Updated weights for policy 0, policy_version 338248 (0.0032) [2024-04-27 09:08:39,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53248.1, 300 sec: 52873.1). Total num frames: 5541888000. Throughput: 0: 52417.5. Samples: 32420880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:39,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 09:08:42,240][52263] Updated weights for policy 0, policy_version 338258 (0.0033) [2024-04-27 09:08:44,106][52031] Fps is (10 sec: 55705.9, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5542150144. Throughput: 0: 52685.3. Samples: 32592240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:44,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 09:08:44,384][52263] Updated weights for policy 0, policy_version 338268 (0.0031) [2024-04-27 09:08:48,513][52263] Updated weights for policy 0, policy_version 338278 (0.0030) [2024-04-27 09:08:49,107][52031] Fps is (10 sec: 50789.4, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5542395904. Throughput: 0: 52814.1. Samples: 32911580. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:49,116][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:08:50,503][52263] Updated weights for policy 0, policy_version 338288 (0.0033) [2024-04-27 09:08:54,107][52031] Fps is (10 sec: 47513.2, 60 sec: 51882.7, 300 sec: 52706.5). Total num frames: 5542625280. Throughput: 0: 52928.7. Samples: 33225900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:54,115][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 09:08:54,765][52263] Updated weights for policy 0, policy_version 338298 (0.0038) [2024-04-27 09:08:56,785][52263] Updated weights for policy 0, policy_version 338308 (0.0029) [2024-04-27 09:08:59,107][52031] Fps is (10 sec: 49152.2, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5542887424. Throughput: 0: 52403.1. Samples: 33369000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:08:59,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 09:09:00,877][52263] Updated weights for policy 0, policy_version 338318 (0.0030) [2024-04-27 09:09:02,090][52242] Signal inference workers to stop experience collection... (500 times) [2024-04-27 09:09:02,091][52242] Signal inference workers to resume experience collection... (500 times) [2024-04-27 09:09:02,122][52263] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-04-27 09:09:02,123][52263] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-04-27 09:09:03,191][52263] Updated weights for policy 0, policy_version 338328 (0.0032) [2024-04-27 09:09:04,106][52031] Fps is (10 sec: 55706.0, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5543182336. Throughput: 0: 52347.9. Samples: 33683260. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:09:04,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 09:09:07,033][52263] Updated weights for policy 0, policy_version 338338 (0.0029) [2024-04-27 09:09:09,106][52031] Fps is (10 sec: 58983.1, 60 sec: 53521.0, 300 sec: 52928.7). Total num frames: 5543477248. Throughput: 0: 52374.6. Samples: 33999380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 09:09:09,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 09:09:09,532][52263] Updated weights for policy 0, policy_version 338348 (0.0027) [2024-04-27 09:09:13,422][52263] Updated weights for policy 0, policy_version 338358 (0.0031) [2024-04-27 09:09:14,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52429.0, 300 sec: 52651.0). Total num frames: 5543706624. Throughput: 0: 52949.9. Samples: 34176220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:14,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 09:09:15,727][52263] Updated weights for policy 0, policy_version 338368 (0.0027) [2024-04-27 09:09:19,106][52031] Fps is (10 sec: 47513.8, 60 sec: 52155.7, 300 sec: 52706.5). Total num frames: 5543952384. Throughput: 0: 52938.8. Samples: 34492100. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:19,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 09:09:19,453][52263] Updated weights for policy 0, policy_version 338378 (0.0027) [2024-04-27 09:09:21,883][52263] Updated weights for policy 0, policy_version 338388 (0.0030) [2024-04-27 09:09:24,107][52031] Fps is (10 sec: 49151.2, 60 sec: 52155.6, 300 sec: 52706.5). Total num frames: 5544198144. Throughput: 0: 52977.2. Samples: 34804860. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:24,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 09:09:25,520][52263] Updated weights for policy 0, policy_version 338398 (0.0029) [2024-04-27 09:09:28,248][52263] Updated weights for policy 0, policy_version 338408 (0.0034) [2024-04-27 09:09:29,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5544509440. Throughput: 0: 52455.5. Samples: 34952740. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:29,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 09:09:31,927][52263] Updated weights for policy 0, policy_version 338418 (0.0032) [2024-04-27 09:09:34,107][52031] Fps is (10 sec: 57343.0, 60 sec: 52974.8, 300 sec: 52873.1). Total num frames: 5544771584. Throughput: 0: 52435.5. Samples: 35271180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:34,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 09:09:34,164][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000338427_5544787968.pth... [2024-04-27 09:09:34,212][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000337654_5532123136.pth [2024-04-27 09:09:34,392][52263] Updated weights for policy 0, policy_version 338428 (0.0035) [2024-04-27 09:09:38,156][52263] Updated weights for policy 0, policy_version 338438 (0.0030) [2024-04-27 09:09:39,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5545050112. Throughput: 0: 52563.3. Samples: 35591240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:39,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 09:09:40,593][52263] Updated weights for policy 0, policy_version 338448 (0.0031) [2024-04-27 09:09:44,107][52031] Fps is (10 sec: 50791.3, 60 sec: 52155.7, 300 sec: 52706.5). Total num frames: 5545279488. Throughput: 0: 52934.7. Samples: 35751060. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:44,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 09:09:44,197][52263] Updated weights for policy 0, policy_version 338458 (0.0030) [2024-04-27 09:09:46,889][52263] Updated weights for policy 0, policy_version 338468 (0.0032) [2024-04-27 09:09:49,106][52031] Fps is (10 sec: 45874.9, 60 sec: 51882.8, 300 sec: 52706.5). Total num frames: 5545508864. Throughput: 0: 52918.3. Samples: 36064580. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:49,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 09:09:50,321][52263] Updated weights for policy 0, policy_version 338478 (0.0037) [2024-04-27 09:09:53,121][52263] Updated weights for policy 0, policy_version 338488 (0.0032) [2024-04-27 09:09:54,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5545803776. Throughput: 0: 52920.1. Samples: 36380780. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:54,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 09:09:56,593][52263] Updated weights for policy 0, policy_version 338498 (0.0032) [2024-04-27 09:09:59,107][52031] Fps is (10 sec: 58981.7, 60 sec: 53521.1, 300 sec: 52817.5). Total num frames: 5546098688. Throughput: 0: 52570.4. Samples: 36541900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:09:59,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 09:09:59,268][52263] Updated weights for policy 0, policy_version 338508 (0.0032) [2024-04-27 09:10:02,790][52263] Updated weights for policy 0, policy_version 338518 (0.0034) [2024-04-27 09:10:04,107][52031] Fps is (10 sec: 55705.2, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5546360832. Throughput: 0: 52601.7. Samples: 36859180. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:10:04,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 09:10:05,483][52263] Updated weights for policy 0, policy_version 338528 (0.0030) [2024-04-27 09:10:07,866][52242] Signal inference workers to stop experience collection... (550 times) [2024-04-27 09:10:07,866][52242] Signal inference workers to resume experience collection... (550 times) [2024-04-27 09:10:07,896][52263] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-04-27 09:10:07,896][52263] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-04-27 09:10:08,899][52263] Updated weights for policy 0, policy_version 338538 (0.0030) [2024-04-27 09:10:09,107][52031] Fps is (10 sec: 50790.8, 60 sec: 52155.7, 300 sec: 52706.5). Total num frames: 5546606592. Throughput: 0: 52584.1. Samples: 37171140. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:10:09,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 09:10:11,655][52263] Updated weights for policy 0, policy_version 338548 (0.0036) [2024-04-27 09:10:14,106][52031] Fps is (10 sec: 47513.9, 60 sec: 52155.7, 300 sec: 52762.0). Total num frames: 5546835968. Throughput: 0: 52643.6. Samples: 37321700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:10:14,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 09:10:15,160][52263] Updated weights for policy 0, policy_version 338558 (0.0031) [2024-04-27 09:10:18,042][52263] Updated weights for policy 0, policy_version 338568 (0.0035) [2024-04-27 09:10:19,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52701.7, 300 sec: 52817.6). Total num frames: 5547114496. Throughput: 0: 52660.6. Samples: 37640900. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:10:19,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 09:10:21,466][52263] Updated weights for policy 0, policy_version 338578 (0.0030) [2024-04-27 09:10:24,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53248.2, 300 sec: 52706.5). Total num frames: 5547393024. Throughput: 0: 52636.0. Samples: 37959860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:10:24,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 09:10:24,334][52263] Updated weights for policy 0, policy_version 338588 (0.0027) [2024-04-27 09:10:27,601][52263] Updated weights for policy 0, policy_version 338598 (0.0031) [2024-04-27 09:10:29,106][52031] Fps is (10 sec: 55706.5, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5547671552. Throughput: 0: 52667.2. Samples: 38121080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:10:29,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 09:10:30,440][52263] Updated weights for policy 0, policy_version 338608 (0.0028) [2024-04-27 09:10:33,885][52263] Updated weights for policy 0, policy_version 338618 (0.0028) [2024-04-27 09:10:34,106][52031] Fps is (10 sec: 54066.7, 60 sec: 52702.1, 300 sec: 52706.5). Total num frames: 5547933696. Throughput: 0: 52680.0. Samples: 38435180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:10:34,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 09:10:36,763][52263] Updated weights for policy 0, policy_version 338628 (0.0035) [2024-04-27 09:10:39,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52155.7, 300 sec: 52762.0). Total num frames: 5548179456. Throughput: 0: 52732.9. Samples: 38753760. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:10:39,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 09:10:40,009][52263] Updated weights for policy 0, policy_version 338638 (0.0028) [2024-04-27 09:10:43,627][52263] Updated weights for policy 0, policy_version 338648 (0.0032) [2024-04-27 09:10:44,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5548425216. Throughput: 0: 52427.6. Samples: 38901140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:10:44,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 09:10:46,343][52263] Updated weights for policy 0, policy_version 338658 (0.0033) [2024-04-27 09:10:49,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 52706.5). Total num frames: 5548703744. Throughput: 0: 52349.9. Samples: 39214920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:10:49,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 09:10:49,822][52263] Updated weights for policy 0, policy_version 338668 (0.0034) [2024-04-27 09:10:52,601][52263] Updated weights for policy 0, policy_version 338678 (0.0028) [2024-04-27 09:10:54,107][52031] Fps is (10 sec: 57344.0, 60 sec: 53247.9, 300 sec: 52706.5). Total num frames: 5548998656. Throughput: 0: 52558.2. Samples: 39536260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:10:54,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 09:10:55,918][52263] Updated weights for policy 0, policy_version 338688 (0.0031) [2024-04-27 09:10:58,710][52263] Updated weights for policy 0, policy_version 338698 (0.0034) [2024-04-27 09:10:59,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52428.8, 300 sec: 52650.9). Total num frames: 5549244416. Throughput: 0: 52831.0. Samples: 39699100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:10:59,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 09:11:02,139][52263] Updated weights for policy 0, policy_version 338708 (0.0029) [2024-04-27 09:11:04,107][52031] Fps is (10 sec: 50789.1, 60 sec: 52428.6, 300 sec: 52650.9). Total num frames: 5549506560. Throughput: 0: 52852.7. Samples: 40019280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:11:04,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 09:11:05,065][52263] Updated weights for policy 0, policy_version 338718 (0.0029) [2024-04-27 09:11:08,455][52263] Updated weights for policy 0, policy_version 338728 (0.0034) [2024-04-27 09:11:09,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52428.8, 300 sec: 52651.0). Total num frames: 5549752320. Throughput: 0: 52725.2. Samples: 40332500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:11:09,107][52031] Avg episode reward: [(0, '0.480')] [2024-04-27 09:11:11,220][52263] Updated weights for policy 0, policy_version 338738 (0.0029) [2024-04-27 09:11:12,406][52242] Signal inference workers to stop experience collection... (600 times) [2024-04-27 09:11:12,407][52242] Signal inference workers to resume experience collection... (600 times) [2024-04-27 09:11:12,441][52263] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-04-27 09:11:12,442][52263] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-04-27 09:11:14,107][52031] Fps is (10 sec: 50791.6, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5550014464. Throughput: 0: 52538.6. Samples: 40485320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:11:14,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 09:11:14,707][52263] Updated weights for policy 0, policy_version 338748 (0.0041) [2024-04-27 09:11:17,380][52263] Updated weights for policy 0, policy_version 338758 (0.0031) [2024-04-27 09:11:19,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53248.0, 300 sec: 52706.5). Total num frames: 5550309376. Throughput: 0: 52636.3. Samples: 40803820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:11:19,107][52031] Avg episode reward: [(0, '0.685')] [2024-04-27 09:11:20,884][52263] Updated weights for policy 0, policy_version 338768 (0.0028) [2024-04-27 09:11:23,560][52263] Updated weights for policy 0, policy_version 338778 (0.0035) [2024-04-27 09:11:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52701.8, 300 sec: 52595.4). Total num frames: 5550555136. Throughput: 0: 52608.0. Samples: 41121120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:11:24,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 09:11:27,099][52263] Updated weights for policy 0, policy_version 338788 (0.0030) [2024-04-27 09:11:29,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52428.9, 300 sec: 52595.5). Total num frames: 5550817280. Throughput: 0: 53044.6. Samples: 41288140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:11:29,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 09:11:29,778][52263] Updated weights for policy 0, policy_version 338798 (0.0025) [2024-04-27 09:11:33,247][52263] Updated weights for policy 0, policy_version 338808 (0.0024) [2024-04-27 09:11:34,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5551095808. Throughput: 0: 53136.4. Samples: 41606060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:11:34,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 09:11:34,189][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000338813_5551112192.pth... [2024-04-27 09:11:34,241][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000338040_5538447360.pth [2024-04-27 09:11:35,892][52263] Updated weights for policy 0, policy_version 338818 (0.0038) [2024-04-27 09:11:39,106][52031] Fps is (10 sec: 49151.7, 60 sec: 52155.8, 300 sec: 52595.4). Total num frames: 5551308800. Throughput: 0: 52960.1. Samples: 41919460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:11:39,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 09:11:39,465][52263] Updated weights for policy 0, policy_version 338828 (0.0027) [2024-04-27 09:11:42,220][52263] Updated weights for policy 0, policy_version 338838 (0.0028) [2024-04-27 09:11:44,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 52706.5). Total num frames: 5551620096. Throughput: 0: 52874.7. Samples: 42078460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 09:11:44,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 09:11:45,670][52263] Updated weights for policy 0, policy_version 338848 (0.0033) [2024-04-27 09:11:48,473][52263] Updated weights for policy 0, policy_version 338858 (0.0030) [2024-04-27 09:11:49,106][52031] Fps is (10 sec: 55705.4, 60 sec: 52701.9, 300 sec: 52539.9). Total num frames: 5551865856. Throughput: 0: 52575.9. Samples: 42385180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:11:49,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 09:11:51,878][52263] Updated weights for policy 0, policy_version 338868 (0.0026) [2024-04-27 09:11:54,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.8, 300 sec: 52650.9). Total num frames: 5552144384. Throughput: 0: 52604.0. Samples: 42699680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:11:54,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:11:54,669][52263] Updated weights for policy 0, policy_version 338878 (0.0028) [2024-04-27 09:11:57,956][52263] Updated weights for policy 0, policy_version 338888 (0.0028) [2024-04-27 09:11:58,764][52242] Signal inference workers to stop experience collection... (650 times) [2024-04-27 09:11:58,766][52242] Signal inference workers to resume experience collection... (650 times) [2024-04-27 09:11:58,807][52263] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-04-27 09:11:58,807][52263] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-04-27 09:11:59,106][52031] Fps is (10 sec: 55705.4, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5552422912. Throughput: 0: 52740.0. Samples: 42858620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:11:59,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 09:12:00,899][52263] Updated weights for policy 0, policy_version 338898 (0.0036) [2024-04-27 09:12:04,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52429.1, 300 sec: 52706.5). Total num frames: 5552652288. Throughput: 0: 52624.2. Samples: 43171900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:04,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 09:12:04,185][52263] Updated weights for policy 0, policy_version 338908 (0.0028) [2024-04-27 09:12:06,970][52263] Updated weights for policy 0, policy_version 338918 (0.0035) [2024-04-27 09:12:09,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5552914432. Throughput: 0: 52652.9. Samples: 43490500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:09,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 09:12:10,372][52263] Updated weights for policy 0, policy_version 338928 (0.0025) [2024-04-27 09:12:13,187][52263] Updated weights for policy 0, policy_version 338938 (0.0026) [2024-04-27 09:12:14,107][52031] Fps is (10 sec: 54062.1, 60 sec: 52974.2, 300 sec: 52595.3). Total num frames: 5553192960. Throughput: 0: 52434.4. Samples: 43647740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:14,108][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 09:12:16,561][52263] Updated weights for policy 0, policy_version 338948 (0.0030) [2024-04-27 09:12:19,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52428.9, 300 sec: 52595.4). Total num frames: 5553455104. Throughput: 0: 52414.7. Samples: 43964720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:19,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 09:12:19,462][52263] Updated weights for policy 0, policy_version 338958 (0.0034) [2024-04-27 09:12:22,842][52263] Updated weights for policy 0, policy_version 338968 (0.0028) [2024-04-27 09:12:24,106][52031] Fps is (10 sec: 54071.7, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5553733632. Throughput: 0: 52526.1. Samples: 44283140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:24,116][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 09:12:25,575][52263] Updated weights for policy 0, policy_version 338978 (0.0031) [2024-04-27 09:12:28,979][52263] Updated weights for policy 0, policy_version 338988 (0.0027) [2024-04-27 09:12:29,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5553979392. Throughput: 0: 52542.7. Samples: 44442880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:29,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 09:12:31,772][52263] Updated weights for policy 0, policy_version 338998 (0.0031) [2024-04-27 09:12:34,107][52031] Fps is (10 sec: 49151.5, 60 sec: 52155.6, 300 sec: 52651.0). Total num frames: 5554225152. Throughput: 0: 52795.8. Samples: 44761000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:34,108][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 09:12:35,286][52263] Updated weights for policy 0, policy_version 339008 (0.0033) [2024-04-27 09:12:37,983][52263] Updated weights for policy 0, policy_version 339018 (0.0027) [2024-04-27 09:12:39,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 52706.5). Total num frames: 5554520064. Throughput: 0: 52863.6. Samples: 45078540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:39,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 09:12:41,362][52263] Updated weights for policy 0, policy_version 339028 (0.0032) [2024-04-27 09:12:44,107][52031] Fps is (10 sec: 55706.1, 60 sec: 52701.8, 300 sec: 52651.0). Total num frames: 5554782208. Throughput: 0: 52859.1. Samples: 45237280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:44,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 09:12:44,293][52263] Updated weights for policy 0, policy_version 339038 (0.0033) [2024-04-27 09:12:47,483][52263] Updated weights for policy 0, policy_version 339048 (0.0031) [2024-04-27 09:12:49,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52974.8, 300 sec: 52651.0). Total num frames: 5555044352. Throughput: 0: 52840.6. Samples: 45549740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:49,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 09:12:50,453][52263] Updated weights for policy 0, policy_version 339058 (0.0030) [2024-04-27 09:12:53,654][52263] Updated weights for policy 0, policy_version 339068 (0.0029) [2024-04-27 09:12:54,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5555306496. Throughput: 0: 52919.9. Samples: 45871900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:54,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 09:12:56,461][52242] Signal inference workers to stop experience collection... (700 times) [2024-04-27 09:12:56,499][52263] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-04-27 09:12:56,557][52242] Signal inference workers to resume experience collection... (700 times) [2024-04-27 09:12:56,558][52263] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-04-27 09:12:56,678][52263] Updated weights for policy 0, policy_version 339078 (0.0033) [2024-04-27 09:12:59,107][52031] Fps is (10 sec: 52429.1, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5555568640. Throughput: 0: 52903.6. Samples: 46028360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 09:12:59,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 09:12:59,873][52263] Updated weights for policy 0, policy_version 339088 (0.0027) [2024-04-27 09:13:02,885][52263] Updated weights for policy 0, policy_version 339098 (0.0029) [2024-04-27 09:13:04,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52974.8, 300 sec: 52762.0). Total num frames: 5555830784. Throughput: 0: 52891.9. Samples: 46344860. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:04,108][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 09:13:05,861][52263] Updated weights for policy 0, policy_version 339108 (0.0029) [2024-04-27 09:13:08,989][52263] Updated weights for policy 0, policy_version 339118 (0.0033) [2024-04-27 09:13:09,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.0, 300 sec: 52706.5). Total num frames: 5556109312. Throughput: 0: 52856.5. Samples: 46661680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:09,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 09:13:12,243][52263] Updated weights for policy 0, policy_version 339128 (0.0034) [2024-04-27 09:13:14,107][52031] Fps is (10 sec: 54067.3, 60 sec: 52975.6, 300 sec: 52706.5). Total num frames: 5556371456. Throughput: 0: 52955.0. Samples: 46825860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:14,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 09:13:15,068][52263] Updated weights for policy 0, policy_version 339138 (0.0028) [2024-04-27 09:13:18,521][52263] Updated weights for policy 0, policy_version 339148 (0.0033) [2024-04-27 09:13:19,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5556617216. Throughput: 0: 52868.1. Samples: 47140060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:19,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 09:13:21,517][52263] Updated weights for policy 0, policy_version 339158 (0.0038) [2024-04-27 09:13:24,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5556895744. Throughput: 0: 52801.7. Samples: 47454620. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:24,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 09:13:24,791][52263] Updated weights for policy 0, policy_version 339168 (0.0030) [2024-04-27 09:13:27,598][52263] Updated weights for policy 0, policy_version 339178 (0.0025) [2024-04-27 09:13:29,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5557157888. Throughput: 0: 52864.1. Samples: 47616160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:29,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 09:13:30,933][52263] Updated weights for policy 0, policy_version 339188 (0.0031) [2024-04-27 09:13:33,943][52263] Updated weights for policy 0, policy_version 339198 (0.0037) [2024-04-27 09:13:34,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53248.0, 300 sec: 52650.9). Total num frames: 5557420032. Throughput: 0: 52986.6. Samples: 47934140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:34,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 09:13:34,126][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000339199_5557436416.pth... [2024-04-27 09:13:34,169][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000338427_5544787968.pth [2024-04-27 09:13:37,010][52263] Updated weights for policy 0, policy_version 339208 (0.0031) [2024-04-27 09:13:39,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52701.9, 300 sec: 52651.0). Total num frames: 5557682176. Throughput: 0: 52835.6. Samples: 48249500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:39,107][52031] Avg episode reward: [(0, '0.490')] [2024-04-27 09:13:40,103][52263] Updated weights for policy 0, policy_version 339218 (0.0028) [2024-04-27 09:13:43,350][52263] Updated weights for policy 0, policy_version 339228 (0.0029) [2024-04-27 09:13:44,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52701.7, 300 sec: 52706.5). 
Total num frames: 5557944320. Throughput: 0: 52872.7. Samples: 48407640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:44,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 09:13:46,355][52263] Updated weights for policy 0, policy_version 339238 (0.0034) [2024-04-27 09:13:49,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5558206464. Throughput: 0: 52901.4. Samples: 48725420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:49,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 09:13:49,605][52263] Updated weights for policy 0, policy_version 339248 (0.0031) [2024-04-27 09:13:52,583][52263] Updated weights for policy 0, policy_version 339258 (0.0031) [2024-04-27 09:13:54,106][52031] Fps is (10 sec: 55707.5, 60 sec: 53248.1, 300 sec: 52928.7). Total num frames: 5558501376. Throughput: 0: 52939.7. Samples: 49043960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:54,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 09:13:55,891][52263] Updated weights for policy 0, policy_version 339268 (0.0030) [2024-04-27 09:13:58,654][52263] Updated weights for policy 0, policy_version 339278 (0.0032) [2024-04-27 09:13:59,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5558730752. Throughput: 0: 52731.2. Samples: 49198760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:13:59,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 09:14:02,135][52263] Updated weights for policy 0, policy_version 339288 (0.0030) [2024-04-27 09:14:04,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53248.0, 300 sec: 52706.5). Total num frames: 5559025664. Throughput: 0: 52951.5. Samples: 49522880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:14:04,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 09:14:04,662][52263] Updated weights for policy 0, policy_version 339298 (0.0031) [2024-04-27 09:14:08,333][52263] Updated weights for policy 0, policy_version 339308 (0.0036) [2024-04-27 09:14:09,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5559255040. Throughput: 0: 52935.0. Samples: 49836700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:14:09,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 09:14:11,003][52263] Updated weights for policy 0, policy_version 339318 (0.0028) [2024-04-27 09:14:14,106][52031] Fps is (10 sec: 49152.4, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5559517184. Throughput: 0: 52866.7. Samples: 49995160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:14:14,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 09:14:14,393][52263] Updated weights for policy 0, policy_version 339328 (0.0029) [2024-04-27 09:14:17,261][52263] Updated weights for policy 0, policy_version 339338 (0.0032) [2024-04-27 09:14:19,107][52031] Fps is (10 sec: 55706.1, 60 sec: 53248.0, 300 sec: 52928.7). Total num frames: 5559812096. Throughput: 0: 52931.7. Samples: 50316060. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:14:19,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 09:14:20,510][52263] Updated weights for policy 0, policy_version 339348 (0.0032) [2024-04-27 09:14:23,360][52263] Updated weights for policy 0, policy_version 339358 (0.0029) [2024-04-27 09:14:24,106][52031] Fps is (10 sec: 55705.6, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5560074240. Throughput: 0: 52869.0. Samples: 50628600. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:14:24,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 09:14:24,441][52242] Signal inference workers to stop experience collection... 
(750 times) [2024-04-27 09:14:24,474][52263] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-04-27 09:14:24,506][52242] Signal inference workers to resume experience collection... (750 times) [2024-04-27 09:14:24,507][52263] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-04-27 09:14:26,586][52263] Updated weights for policy 0, policy_version 339368 (0.0029) [2024-04-27 09:14:29,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52974.9, 300 sec: 52762.1). Total num frames: 5560336384. Throughput: 0: 52950.9. Samples: 50790420. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:14:29,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 09:14:29,692][52263] Updated weights for policy 0, policy_version 339378 (0.0030) [2024-04-27 09:14:32,749][52263] Updated weights for policy 0, policy_version 339388 (0.0033) [2024-04-27 09:14:34,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52702.0, 300 sec: 52651.0). Total num frames: 5560582144. Throughput: 0: 53048.0. Samples: 51112580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:14:34,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 09:14:35,840][52263] Updated weights for policy 0, policy_version 339398 (0.0028) [2024-04-27 09:14:39,005][52263] Updated weights for policy 0, policy_version 339408 (0.0028) [2024-04-27 09:14:39,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52974.8, 300 sec: 52817.6). Total num frames: 5560860672. Throughput: 0: 52993.4. Samples: 51428680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:14:39,108][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 09:14:42,080][52263] Updated weights for policy 0, policy_version 339418 (0.0031) [2024-04-27 09:14:44,106][52031] Fps is (10 sec: 54066.9, 60 sec: 52975.1, 300 sec: 52928.6). Total num frames: 5561122816. Throughput: 0: 53237.7. Samples: 51594460. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:14:44,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 09:14:45,263][52263] Updated weights for policy 0, policy_version 339428 (0.0029) [2024-04-27 09:14:48,251][52263] Updated weights for policy 0, policy_version 339438 (0.0029) [2024-04-27 09:14:49,106][52031] Fps is (10 sec: 54068.4, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5561401344. Throughput: 0: 53013.9. Samples: 51908500. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:14:49,107][52031] Avg episode reward: [(0, '0.499')] [2024-04-27 09:14:51,607][52263] Updated weights for policy 0, policy_version 339448 (0.0033) [2024-04-27 09:14:54,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5561647104. Throughput: 0: 52973.1. Samples: 52220480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:14:54,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 09:14:54,319][52263] Updated weights for policy 0, policy_version 339458 (0.0035) [2024-04-27 09:14:57,732][52263] Updated weights for policy 0, policy_version 339468 (0.0028) [2024-04-27 09:14:59,106][52031] Fps is (10 sec: 50790.3, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5561909248. Throughput: 0: 52898.6. Samples: 52375600. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:14:59,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 09:15:00,604][52263] Updated weights for policy 0, policy_version 339478 (0.0029) [2024-04-27 09:15:03,912][52263] Updated weights for policy 0, policy_version 339488 (0.0029) [2024-04-27 09:15:04,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5562171392. Throughput: 0: 52747.5. Samples: 52689700. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:15:04,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 09:15:06,771][52263] Updated weights for policy 0, policy_version 339498 (0.0035) [2024-04-27 09:15:09,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.1, 300 sec: 52928.7). Total num frames: 5562449920. Throughput: 0: 52785.8. Samples: 53003960. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:15:09,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 09:15:10,221][52263] Updated weights for policy 0, policy_version 339508 (0.0033) [2024-04-27 09:15:12,900][52263] Updated weights for policy 0, policy_version 339518 (0.0025) [2024-04-27 09:15:14,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53520.9, 300 sec: 52928.7). Total num frames: 5562728448. Throughput: 0: 53011.9. Samples: 53175960. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:15:14,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:15:16,362][52263] Updated weights for policy 0, policy_version 339528 (0.0032) [2024-04-27 09:15:19,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5562974208. Throughput: 0: 52971.1. Samples: 53496280. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:15:19,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 09:15:19,160][52263] Updated weights for policy 0, policy_version 339538 (0.0031) [2024-04-27 09:15:22,393][52263] Updated weights for policy 0, policy_version 339548 (0.0038) [2024-04-27 09:15:23,873][52242] Signal inference workers to stop experience collection... (800 times) [2024-04-27 09:15:23,874][52242] Signal inference workers to resume experience collection... 
(800 times) [2024-04-27 09:15:23,903][52263] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-04-27 09:15:23,903][52263] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-04-27 09:15:24,106][52031] Fps is (10 sec: 50791.2, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5563236352. Throughput: 0: 52958.5. Samples: 53811800. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:15:24,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 09:15:25,160][52263] Updated weights for policy 0, policy_version 339558 (0.0036) [2024-04-27 09:15:28,616][52263] Updated weights for policy 0, policy_version 339568 (0.0033) [2024-04-27 09:15:29,106][52031] Fps is (10 sec: 50790.0, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5563482112. Throughput: 0: 52579.5. Samples: 53960540. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:15:29,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 09:15:31,413][52263] Updated weights for policy 0, policy_version 339578 (0.0027) [2024-04-27 09:15:34,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53247.9, 300 sec: 52873.1). Total num frames: 5563777024. Throughput: 0: 52575.4. Samples: 54274400. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-04-27 09:15:34,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 09:15:34,114][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000339586_5563777024.pth... [2024-04-27 09:15:34,157][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000338813_5551112192.pth [2024-04-27 09:15:34,956][52263] Updated weights for policy 0, policy_version 339588 (0.0029) [2024-04-27 09:15:37,633][52263] Updated weights for policy 0, policy_version 339598 (0.0034) [2024-04-27 09:15:39,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53248.1, 300 sec: 52984.2). Total num frames: 5564055552. Throughput: 0: 52588.7. Samples: 54586980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:15:39,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 09:15:41,340][52263] Updated weights for policy 0, policy_version 339608 (0.0031) [2024-04-27 09:15:43,813][52263] Updated weights for policy 0, policy_version 339618 (0.0027) [2024-04-27 09:15:44,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 52928.6). Total num frames: 5564317696. Throughput: 0: 53017.7. Samples: 54761400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:15:44,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 09:15:47,640][52263] Updated weights for policy 0, policy_version 339628 (0.0025) [2024-04-27 09:15:49,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5564563456. Throughput: 0: 53177.0. Samples: 55082660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:15:49,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 09:15:49,831][52263] Updated weights for policy 0, policy_version 339638 (0.0026) [2024-04-27 09:15:53,839][52263] Updated weights for policy 0, policy_version 339648 (0.0030) [2024-04-27 09:15:54,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5564809216. Throughput: 0: 53237.8. Samples: 55399660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:15:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 09:15:56,036][52263] Updated weights for policy 0, policy_version 339658 (0.0031) [2024-04-27 09:15:59,107][52031] Fps is (10 sec: 52427.9, 60 sec: 52974.8, 300 sec: 52817.6). Total num frames: 5565087744. Throughput: 0: 52699.5. Samples: 55547440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:15:59,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 09:15:59,972][52263] Updated weights for policy 0, policy_version 339668 (0.0032) [2024-04-27 09:16:02,363][52263] Updated weights for policy 0, policy_version 339678 (0.0033) [2024-04-27 09:16:04,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53521.2, 300 sec: 52984.2). Total num frames: 5565382656. Throughput: 0: 52693.3. Samples: 55867480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:04,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 09:16:06,155][52263] Updated weights for policy 0, policy_version 339688 (0.0033) [2024-04-27 09:16:08,524][52263] Updated weights for policy 0, policy_version 339698 (0.0032) [2024-04-27 09:16:09,106][52031] Fps is (10 sec: 54068.2, 60 sec: 52974.9, 300 sec: 52928.7). Total num frames: 5565628416. Throughput: 0: 52793.8. Samples: 56187520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:09,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 09:16:12,187][52263] Updated weights for policy 0, policy_version 339708 (0.0031) [2024-04-27 09:16:14,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52702.1, 300 sec: 52817.6). Total num frames: 5565890560. Throughput: 0: 53075.7. Samples: 56348940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:14,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 09:16:14,569][52263] Updated weights for policy 0, policy_version 339718 (0.0027) [2024-04-27 09:16:18,188][52263] Updated weights for policy 0, policy_version 339728 (0.0030) [2024-04-27 09:16:19,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5566136320. Throughput: 0: 53158.9. Samples: 56666540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:19,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 09:16:20,809][52263] Updated weights for policy 0, policy_version 339738 (0.0028) [2024-04-27 09:16:24,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5566414848. Throughput: 0: 53338.7. Samples: 56987220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:24,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 09:16:24,286][52263] Updated weights for policy 0, policy_version 339748 (0.0031) [2024-04-27 09:16:25,362][52242] Signal inference workers to stop experience collection... (850 times) [2024-04-27 09:16:25,398][52263] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-04-27 09:16:25,457][52242] Signal inference workers to resume experience collection... (850 times) [2024-04-27 09:16:25,457][52263] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-04-27 09:16:27,043][52263] Updated weights for policy 0, policy_version 339758 (0.0029) [2024-04-27 09:16:29,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 52873.1). Total num frames: 5566693376. Throughput: 0: 52776.5. Samples: 57136340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:29,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 09:16:30,626][52263] Updated weights for policy 0, policy_version 339768 (0.0031) [2024-04-27 09:16:33,043][52263] Updated weights for policy 0, policy_version 339778 (0.0034) [2024-04-27 09:16:34,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52702.0, 300 sec: 52984.2). Total num frames: 5566939136. Throughput: 0: 52771.6. Samples: 57457380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:34,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 09:16:37,027][52263] Updated weights for policy 0, policy_version 339788 (0.0025) [2024-04-27 09:16:39,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52975.1, 300 sec: 52928.7). Total num frames: 5567234048. Throughput: 0: 52823.1. Samples: 57776700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:39,107][52031] Avg episode reward: [(0, '0.469')] [2024-04-27 09:16:39,157][52263] Updated weights for policy 0, policy_version 339798 (0.0039) [2024-04-27 09:16:43,254][52263] Updated weights for policy 0, policy_version 339808 (0.0030) [2024-04-27 09:16:44,107][52031] Fps is (10 sec: 54066.1, 60 sec: 52701.7, 300 sec: 52928.6). Total num frames: 5567479808. Throughput: 0: 53069.8. Samples: 57935580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:44,108][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 09:16:45,350][52263] Updated weights for policy 0, policy_version 339818 (0.0033) [2024-04-27 09:16:49,106][52031] Fps is (10 sec: 49152.2, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5567725568. Throughput: 0: 52895.6. Samples: 58247780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 09:16:49,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 09:16:49,386][52263] Updated weights for policy 0, policy_version 339828 (0.0034) [2024-04-27 09:16:51,763][52263] Updated weights for policy 0, policy_version 339838 (0.0031) [2024-04-27 09:16:54,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5568004096. Throughput: 0: 52860.3. Samples: 58566240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:16:54,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 09:16:55,430][52263] Updated weights for policy 0, policy_version 339848 (0.0027) [2024-04-27 09:16:58,298][52263] Updated weights for policy 0, policy_version 339858 (0.0027) [2024-04-27 09:16:59,106][52031] Fps is (10 sec: 54067.4, 60 sec: 52975.2, 300 sec: 52928.7). Total num frames: 5568266240. Throughput: 0: 52926.7. Samples: 58730640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:16:59,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 09:17:01,642][52263] Updated weights for policy 0, policy_version 339868 (0.0033) [2024-04-27 09:17:04,107][52031] Fps is (10 sec: 54067.1, 60 sec: 52701.7, 300 sec: 52984.2). Total num frames: 5568544768. Throughput: 0: 52873.1. Samples: 59045840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:04,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 09:17:04,349][52263] Updated weights for policy 0, policy_version 339878 (0.0035) [2024-04-27 09:17:07,808][52263] Updated weights for policy 0, policy_version 339888 (0.0027) [2024-04-27 09:17:09,107][52031] Fps is (10 sec: 54066.3, 60 sec: 52974.9, 300 sec: 52928.8). Total num frames: 5568806912. Throughput: 0: 52768.5. Samples: 59361800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:09,108][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 09:17:10,374][52263] Updated weights for policy 0, policy_version 339898 (0.0030) [2024-04-27 09:17:13,833][52263] Updated weights for policy 0, policy_version 339908 (0.0027) [2024-04-27 09:17:14,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52974.9, 300 sec: 52928.7). Total num frames: 5569069056. Throughput: 0: 53085.4. Samples: 59525180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:14,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 09:17:16,591][52263] Updated weights for policy 0, policy_version 339918 (0.0029) [2024-04-27 09:17:19,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5569314816. Throughput: 0: 53029.8. Samples: 59843720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:19,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 09:17:20,034][52263] Updated weights for policy 0, policy_version 339928 (0.0034) [2024-04-27 09:17:22,734][52263] Updated weights for policy 0, policy_version 339938 (0.0029) [2024-04-27 09:17:24,106][52031] Fps is (10 sec: 50790.3, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5569576960. Throughput: 0: 52878.6. Samples: 60156240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:24,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 09:17:26,069][52242] Signal inference workers to stop experience collection... (900 times) [2024-04-27 09:17:26,126][52263] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-04-27 09:17:26,128][52242] Signal inference workers to resume experience collection... (900 times) [2024-04-27 09:17:26,138][52263] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-04-27 09:17:26,249][52263] Updated weights for policy 0, policy_version 339948 (0.0031) [2024-04-27 09:17:28,851][52263] Updated weights for policy 0, policy_version 339958 (0.0032) [2024-04-27 09:17:29,106][52031] Fps is (10 sec: 55705.7, 60 sec: 52975.0, 300 sec: 53039.8). Total num frames: 5569871872. Throughput: 0: 52928.7. Samples: 60317360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:29,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 09:17:32,506][52263] Updated weights for policy 0, policy_version 339968 (0.0027) [2024-04-27 09:17:34,106][52031] Fps is (10 sec: 54067.3, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5570117632. Throughput: 0: 52956.4. Samples: 60630820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:34,107][52031] Avg episode reward: [(0, '0.486')] [2024-04-27 09:17:34,130][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000339974_5570134016.pth... [2024-04-27 09:17:34,172][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000339199_5557436416.pth [2024-04-27 09:17:35,128][52263] Updated weights for policy 0, policy_version 339978 (0.0034) [2024-04-27 09:17:38,567][52263] Updated weights for policy 0, policy_version 339988 (0.0030) [2024-04-27 09:17:39,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52428.7, 300 sec: 52873.1). Total num frames: 5570379776. Throughput: 0: 52881.4. Samples: 60945900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:39,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 09:17:41,623][52263] Updated weights for policy 0, policy_version 339998 (0.0027) [2024-04-27 09:17:44,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52702.0, 300 sec: 52873.1). Total num frames: 5570641920. Throughput: 0: 52734.0. Samples: 61103680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:44,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 09:17:44,692][52263] Updated weights for policy 0, policy_version 340008 (0.0031) [2024-04-27 09:17:48,131][52263] Updated weights for policy 0, policy_version 340018 (0.0034) [2024-04-27 09:17:49,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5570904064. Throughput: 0: 52809.5. Samples: 61422260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:49,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 09:17:50,826][52263] Updated weights for policy 0, policy_version 340028 (0.0030) [2024-04-27 09:17:54,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5571166208. Throughput: 0: 52745.8. Samples: 61735360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:54,116][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 09:17:54,247][52263] Updated weights for policy 0, policy_version 340038 (0.0030) [2024-04-27 09:17:57,038][52263] Updated weights for policy 0, policy_version 340048 (0.0030) [2024-04-27 09:17:59,107][52031] Fps is (10 sec: 54066.8, 60 sec: 52974.8, 300 sec: 52928.7). Total num frames: 5571444736. Throughput: 0: 52689.2. Samples: 61896200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:17:59,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 09:18:00,504][52263] Updated weights for policy 0, policy_version 340058 (0.0033) [2024-04-27 09:18:03,207][52263] Updated weights for policy 0, policy_version 340068 (0.0031) [2024-04-27 09:18:04,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52428.9, 300 sec: 52817.6). Total num frames: 5571690496. Throughput: 0: 52656.8. Samples: 62213280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 09:18:04,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 09:18:06,572][52263] Updated weights for policy 0, policy_version 340078 (0.0033) [2024-04-27 09:18:09,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52428.9, 300 sec: 52817.6). Total num frames: 5571952640. Throughput: 0: 52720.5. Samples: 62528660. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:09,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 09:18:09,581][52263] Updated weights for policy 0, policy_version 340088 (0.0031) [2024-04-27 09:18:12,946][52263] Updated weights for policy 0, policy_version 340098 (0.0033) [2024-04-27 09:18:14,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.7, 300 sec: 52873.1). Total num frames: 5572214784. Throughput: 0: 52513.7. Samples: 62680480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:14,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 09:18:15,894][52263] Updated weights for policy 0, policy_version 340108 (0.0027) [2024-04-27 09:18:19,006][52263] Updated weights for policy 0, policy_version 340118 (0.0028) [2024-04-27 09:18:19,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5572493312. Throughput: 0: 52582.2. Samples: 62997020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:19,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 09:18:22,104][52263] Updated weights for policy 0, policy_version 340128 (0.0035) [2024-04-27 09:18:24,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53247.9, 300 sec: 52928.6). Total num frames: 5572771840. Throughput: 0: 52704.5. Samples: 63317600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:24,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 09:18:25,198][52263] Updated weights for policy 0, policy_version 340138 (0.0030) [2024-04-27 09:18:28,303][52263] Updated weights for policy 0, policy_version 340148 (0.0029) [2024-04-27 09:18:29,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52428.7, 300 sec: 52873.1). Total num frames: 5573017600. Throughput: 0: 52773.3. Samples: 63478480. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:29,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 09:18:31,320][52263] Updated weights for policy 0, policy_version 340158 (0.0027) [2024-04-27 09:18:34,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5573279744. Throughput: 0: 52775.1. Samples: 63797140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:34,107][52031] Avg episode reward: [(0, '0.456')] [2024-04-27 09:18:34,376][52263] Updated weights for policy 0, policy_version 340168 (0.0035) [2024-04-27 09:18:37,624][52263] Updated weights for policy 0, policy_version 340178 (0.0029) [2024-04-27 09:18:39,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52975.0, 300 sec: 52928.7). Total num frames: 5573558272. Throughput: 0: 52885.4. Samples: 64115200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:39,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 09:18:40,532][52263] Updated weights for policy 0, policy_version 340188 (0.0034) [2024-04-27 09:18:44,053][52263] Updated weights for policy 0, policy_version 340198 (0.0038) [2024-04-27 09:18:44,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5573804032. Throughput: 0: 52880.9. Samples: 64275840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:44,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:18:44,609][52242] Signal inference workers to stop experience collection... (950 times) [2024-04-27 09:18:44,638][52263] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-04-27 09:18:44,701][52242] Signal inference workers to resume experience collection... 
(950 times) [2024-04-27 09:18:44,702][52263] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-04-27 09:18:46,788][52263] Updated weights for policy 0, policy_version 340208 (0.0029) [2024-04-27 09:18:49,106][52031] Fps is (10 sec: 50790.1, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5574066176. Throughput: 0: 52765.3. Samples: 64587720. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:49,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 09:18:50,206][52263] Updated weights for policy 0, policy_version 340218 (0.0032) [2024-04-27 09:18:52,952][52263] Updated weights for policy 0, policy_version 340228 (0.0028) [2024-04-27 09:18:54,107][52031] Fps is (10 sec: 54067.0, 60 sec: 52974.9, 300 sec: 52928.6). Total num frames: 5574344704. Throughput: 0: 52670.9. Samples: 64898860. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:54,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 09:18:56,386][52263] Updated weights for policy 0, policy_version 340238 (0.0032) [2024-04-27 09:18:59,106][52031] Fps is (10 sec: 54067.3, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5574606848. Throughput: 0: 52940.9. Samples: 65062820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:18:59,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 09:18:59,136][52263] Updated weights for policy 0, policy_version 340248 (0.0031) [2024-04-27 09:19:02,640][52263] Updated weights for policy 0, policy_version 340258 (0.0030) [2024-04-27 09:19:04,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 52928.7). Total num frames: 5574868992. Throughput: 0: 53025.6. Samples: 65383180. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:19:04,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 09:19:05,664][52263] Updated weights for policy 0, policy_version 340268 (0.0034) [2024-04-27 09:19:08,746][52263] Updated weights for policy 0, policy_version 340278 (0.0035) [2024-04-27 09:19:09,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5575114752. Throughput: 0: 52757.4. Samples: 65691680. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:19:09,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 09:19:11,954][52263] Updated weights for policy 0, policy_version 340288 (0.0038) [2024-04-27 09:19:14,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5575393280. Throughput: 0: 52769.4. Samples: 65853100. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:19:14,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 09:19:14,892][52263] Updated weights for policy 0, policy_version 340298 (0.0031) [2024-04-27 09:19:18,204][52263] Updated weights for policy 0, policy_version 340308 (0.0028) [2024-04-27 09:19:19,107][52031] Fps is (10 sec: 54066.5, 60 sec: 52701.7, 300 sec: 52817.6). Total num frames: 5575655424. Throughput: 0: 52677.2. Samples: 66167620. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-27 09:19:19,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 09:19:21,286][52263] Updated weights for policy 0, policy_version 340318 (0.0035) [2024-04-27 09:19:24,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5575917568. Throughput: 0: 52607.5. Samples: 66482540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:19:24,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 09:19:24,398][52263] Updated weights for policy 0, policy_version 340328 (0.0025) [2024-04-27 09:19:27,479][52263] Updated weights for policy 0, policy_version 340338 (0.0027) [2024-04-27 09:19:29,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5576179712. Throughput: 0: 52643.1. Samples: 66644780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:19:29,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 09:19:30,481][52263] Updated weights for policy 0, policy_version 340348 (0.0029) [2024-04-27 09:19:33,749][52263] Updated weights for policy 0, policy_version 340358 (0.0031) [2024-04-27 09:19:34,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5576441856. Throughput: 0: 52550.2. Samples: 66952480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:19:34,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 09:19:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000340359_5576441856.pth... [2024-04-27 09:19:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000339586_5563777024.pth [2024-04-27 09:19:36,793][52263] Updated weights for policy 0, policy_version 340368 (0.0036) [2024-04-27 09:19:39,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52428.7, 300 sec: 52817.6). Total num frames: 5576704000. Throughput: 0: 52693.8. Samples: 67270080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:19:39,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 09:19:39,970][52263] Updated weights for policy 0, policy_version 340378 (0.0033) [2024-04-27 09:19:43,066][52263] Updated weights for policy 0, policy_version 340388 (0.0031) [2024-04-27 09:19:44,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52701.9, 300 sec: 52762.0). 
Total num frames: 5576966144. Throughput: 0: 52524.0. Samples: 67426400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:19:44,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 09:19:46,267][52263] Updated weights for policy 0, policy_version 340398 (0.0033) [2024-04-27 09:19:49,068][52263] Updated weights for policy 0, policy_version 340408 (0.0029) [2024-04-27 09:19:49,107][52031] Fps is (10 sec: 54067.3, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5577244672. Throughput: 0: 52371.1. Samples: 67739880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:19:49,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 09:19:52,418][52263] Updated weights for policy 0, policy_version 340418 (0.0037) [2024-04-27 09:19:54,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.9, 300 sec: 52817.6). Total num frames: 5577490432. Throughput: 0: 52476.8. Samples: 68053140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:19:54,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 09:19:55,435][52263] Updated weights for policy 0, policy_version 340428 (0.0034) [2024-04-27 09:19:58,384][52263] Updated weights for policy 0, policy_version 340438 (0.0033) [2024-04-27 09:19:59,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5577768960. Throughput: 0: 52480.5. Samples: 68214720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:19:59,115][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 09:20:01,647][52263] Updated weights for policy 0, policy_version 340448 (0.0031) [2024-04-27 09:20:04,107][52031] Fps is (10 sec: 54066.6, 60 sec: 52701.8, 300 sec: 52817.5). Total num frames: 5578031104. Throughput: 0: 52605.7. Samples: 68534880. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:20:04,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 09:20:04,553][52263] Updated weights for policy 0, policy_version 340458 (0.0035) [2024-04-27 09:20:07,910][52263] Updated weights for policy 0, policy_version 340468 (0.0028) [2024-04-27 09:20:09,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5578276864. Throughput: 0: 52645.3. Samples: 68851580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:20:09,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 09:20:09,454][52242] Signal inference workers to stop experience collection... (1000 times) [2024-04-27 09:20:09,454][52242] Signal inference workers to resume experience collection... (1000 times) [2024-04-27 09:20:09,485][52263] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-04-27 09:20:09,486][52263] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-04-27 09:20:10,819][52263] Updated weights for policy 0, policy_version 340478 (0.0030) [2024-04-27 09:20:13,971][52263] Updated weights for policy 0, policy_version 340488 (0.0036) [2024-04-27 09:20:14,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52701.7, 300 sec: 52817.5). Total num frames: 5578555392. Throughput: 0: 52405.3. Samples: 69003020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:20:14,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 09:20:16,969][52263] Updated weights for policy 0, policy_version 340498 (0.0030) [2024-04-27 09:20:19,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5578801152. Throughput: 0: 52658.4. Samples: 69322100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:20:19,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 09:20:20,164][52263] Updated weights for policy 0, policy_version 340508 (0.0030) [2024-04-27 09:20:23,363][52263] Updated weights for policy 0, policy_version 340518 (0.0031) [2024-04-27 09:20:24,106][52031] Fps is (10 sec: 54068.1, 60 sec: 52975.0, 300 sec: 52928.7). Total num frames: 5579096064. Throughput: 0: 52618.4. Samples: 69637900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:20:24,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 09:20:26,380][52263] Updated weights for policy 0, policy_version 340528 (0.0031) [2024-04-27 09:20:29,106][52031] Fps is (10 sec: 54066.7, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5579341824. Throughput: 0: 52566.6. Samples: 69791900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:20:29,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 09:20:29,620][52263] Updated weights for policy 0, policy_version 340538 (0.0027) [2024-04-27 09:20:32,502][52263] Updated weights for policy 0, policy_version 340548 (0.0028) [2024-04-27 09:20:34,106][52031] Fps is (10 sec: 49151.7, 60 sec: 52428.9, 300 sec: 52651.0). Total num frames: 5579587584. Throughput: 0: 52636.1. Samples: 70108500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:20:34,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 09:20:35,662][52263] Updated weights for policy 0, policy_version 340558 (0.0035) [2024-04-27 09:20:38,701][52263] Updated weights for policy 0, policy_version 340568 (0.0032) [2024-04-27 09:20:39,107][52031] Fps is (10 sec: 54067.1, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5579882496. Throughput: 0: 52748.4. Samples: 70426820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:20:39,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 09:20:41,761][52263] Updated weights for policy 0, policy_version 340578 (0.0034) [2024-04-27 09:20:44,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5580111872. Throughput: 0: 52713.7. Samples: 70586840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:20:44,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 09:20:45,024][52263] Updated weights for policy 0, policy_version 340588 (0.0033) [2024-04-27 09:20:48,092][52263] Updated weights for policy 0, policy_version 340598 (0.0027) [2024-04-27 09:20:49,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52428.9, 300 sec: 52817.6). Total num frames: 5580390400. Throughput: 0: 52631.7. Samples: 70903300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:20:49,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 09:20:51,160][52263] Updated weights for policy 0, policy_version 340608 (0.0030) [2024-04-27 09:20:54,106][52031] Fps is (10 sec: 55706.3, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5580668928. Throughput: 0: 52673.4. Samples: 71221880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:20:54,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 09:20:54,218][52263] Updated weights for policy 0, policy_version 340618 (0.0030) [2024-04-27 09:20:57,438][52263] Updated weights for policy 0, policy_version 340628 (0.0027) [2024-04-27 09:20:59,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52428.8, 300 sec: 52651.0). Total num frames: 5580914688. Throughput: 0: 52918.9. Samples: 71384360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:20:59,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:21:00,262][52263] Updated weights for policy 0, policy_version 340638 (0.0030) [2024-04-27 09:21:03,585][52263] Updated weights for policy 0, policy_version 340648 (0.0030) [2024-04-27 09:21:04,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52428.9, 300 sec: 52706.5). Total num frames: 5581176832. Throughput: 0: 52953.2. Samples: 71705000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:04,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 09:21:06,400][52263] Updated weights for policy 0, policy_version 340658 (0.0035) [2024-04-27 09:21:09,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5581455360. Throughput: 0: 52954.3. Samples: 72020840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:09,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 09:21:09,738][52263] Updated weights for policy 0, policy_version 340668 (0.0030) [2024-04-27 09:21:12,555][52263] Updated weights for policy 0, policy_version 340678 (0.0028) [2024-04-27 09:21:14,107][52031] Fps is (10 sec: 54066.5, 60 sec: 52701.8, 300 sec: 52817.5). Total num frames: 5581717504. Throughput: 0: 53141.6. Samples: 72183280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:14,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 09:21:15,886][52263] Updated weights for policy 0, policy_version 340688 (0.0032) [2024-04-27 09:21:19,006][52263] Updated weights for policy 0, policy_version 340698 (0.0034) [2024-04-27 09:21:19,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5581996032. Throughput: 0: 53137.3. Samples: 72499680. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:19,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 09:21:22,067][52263] Updated weights for policy 0, policy_version 340708 (0.0032) [2024-04-27 09:21:24,106][52031] Fps is (10 sec: 52430.1, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5582241792. Throughput: 0: 53233.9. Samples: 72822340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:24,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 09:21:25,058][52263] Updated weights for policy 0, policy_version 340718 (0.0029) [2024-04-27 09:21:28,410][52263] Updated weights for policy 0, policy_version 340728 (0.0029) [2024-04-27 09:21:29,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5582503936. Throughput: 0: 53224.1. Samples: 72981920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:29,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 09:21:31,020][52263] Updated weights for policy 0, policy_version 340738 (0.0029) [2024-04-27 09:21:34,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53247.8, 300 sec: 52706.4). Total num frames: 5582782464. Throughput: 0: 53351.8. Samples: 73304140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:34,108][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 09:21:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000340746_5582782464.pth... [2024-04-27 09:21:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000339974_5570134016.pth [2024-04-27 09:21:34,472][52263] Updated weights for policy 0, policy_version 340748 (0.0035) [2024-04-27 09:21:37,251][52263] Updated weights for policy 0, policy_version 340758 (0.0029) [2024-04-27 09:21:39,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5583028224. Throughput: 0: 53212.4. Samples: 73616440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:39,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:21:40,706][52263] Updated weights for policy 0, policy_version 340768 (0.0030) [2024-04-27 09:21:43,327][52242] Signal inference workers to stop experience collection... (1050 times) [2024-04-27 09:21:43,327][52242] Signal inference workers to resume experience collection... (1050 times) [2024-04-27 09:21:43,340][52263] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-04-27 09:21:43,340][52263] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-04-27 09:21:43,450][52263] Updated weights for policy 0, policy_version 340778 (0.0030) [2024-04-27 09:21:44,106][52031] Fps is (10 sec: 54068.6, 60 sec: 53521.2, 300 sec: 52873.1). Total num frames: 5583323136. Throughput: 0: 53112.0. Samples: 73774400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:44,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 09:21:47,064][52263] Updated weights for policy 0, policy_version 340788 (0.0033) [2024-04-27 09:21:49,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52975.1, 300 sec: 52762.1). Total num frames: 5583568896. Throughput: 0: 53070.0. Samples: 74093140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:49,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 09:21:49,659][52263] Updated weights for policy 0, policy_version 340798 (0.0030) [2024-04-27 09:21:53,135][52263] Updated weights for policy 0, policy_version 340808 (0.0027) [2024-04-27 09:21:54,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5583831040. Throughput: 0: 53029.8. Samples: 74407180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:21:54,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 09:21:55,973][52263] Updated weights for policy 0, policy_version 340818 (0.0028) [2024-04-27 09:21:59,106][52031] Fps is (10 sec: 54066.4, 60 sec: 53247.9, 300 sec: 52762.0). Total num frames: 5584109568. Throughput: 0: 52841.1. Samples: 74561120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:21:59,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 09:21:59,250][52263] Updated weights for policy 0, policy_version 340828 (0.0027) [2024-04-27 09:22:02,368][52263] Updated weights for policy 0, policy_version 340838 (0.0037) [2024-04-27 09:22:04,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.0, 300 sec: 52762.0). Total num frames: 5584371712. Throughput: 0: 52764.4. Samples: 74874080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:04,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 09:22:05,449][52263] Updated weights for policy 0, policy_version 340848 (0.0032) [2024-04-27 09:22:08,605][52263] Updated weights for policy 0, policy_version 340858 (0.0033) [2024-04-27 09:22:09,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5584633856. Throughput: 0: 52707.5. Samples: 75194180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:09,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 09:22:11,700][52263] Updated weights for policy 0, policy_version 340868 (0.0030) [2024-04-27 09:22:14,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.2, 300 sec: 52873.1). Total num frames: 5584912384. Throughput: 0: 52757.4. Samples: 75356000. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:14,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 09:22:14,653][52263] Updated weights for policy 0, policy_version 340878 (0.0031) [2024-04-27 09:22:17,901][52263] Updated weights for policy 0, policy_version 340888 (0.0033) [2024-04-27 09:22:19,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5585158144. Throughput: 0: 52599.3. Samples: 75671100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:19,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 09:22:20,854][52263] Updated weights for policy 0, policy_version 340898 (0.0028) [2024-04-27 09:22:24,007][52263] Updated weights for policy 0, policy_version 340908 (0.0036) [2024-04-27 09:22:24,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 52762.0). Total num frames: 5585436672. Throughput: 0: 52846.1. Samples: 75994520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:24,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 09:22:27,093][52263] Updated weights for policy 0, policy_version 340918 (0.0029) [2024-04-27 09:22:29,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5585682432. Throughput: 0: 52654.1. Samples: 76143840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:29,116][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 09:22:30,301][52263] Updated weights for policy 0, policy_version 340928 (0.0028) [2024-04-27 09:22:33,472][52263] Updated weights for policy 0, policy_version 340938 (0.0031) [2024-04-27 09:22:34,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52702.0, 300 sec: 52762.0). Total num frames: 5585944576. Throughput: 0: 52573.1. Samples: 76458940. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:34,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 09:22:36,486][52263] Updated weights for policy 0, policy_version 340948 (0.0028) [2024-04-27 09:22:39,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.0, 300 sec: 52873.1). Total num frames: 5586239488. Throughput: 0: 52663.8. Samples: 76777060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:39,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 09:22:39,602][52263] Updated weights for policy 0, policy_version 340958 (0.0032) [2024-04-27 09:22:42,575][52263] Updated weights for policy 0, policy_version 340968 (0.0032) [2024-04-27 09:22:44,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5586485248. Throughput: 0: 52934.3. Samples: 76943160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:44,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:22:45,860][52263] Updated weights for policy 0, policy_version 340978 (0.0033) [2024-04-27 09:22:48,670][52263] Updated weights for policy 0, policy_version 340988 (0.0025) [2024-04-27 09:22:49,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52974.7, 300 sec: 52817.6). Total num frames: 5586747392. Throughput: 0: 52965.3. Samples: 77257520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:49,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 09:22:52,372][52263] Updated weights for policy 0, policy_version 340998 (0.0027) [2024-04-27 09:22:54,107][52031] Fps is (10 sec: 52427.7, 60 sec: 52974.7, 300 sec: 52762.0). Total num frames: 5587009536. Throughput: 0: 52896.2. Samples: 77574520. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:54,116][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 09:22:54,741][52263] Updated weights for policy 0, policy_version 341008 (0.0029) [2024-04-27 09:22:58,633][52263] Updated weights for policy 0, policy_version 341018 (0.0028) [2024-04-27 09:22:59,107][52031] Fps is (10 sec: 50790.6, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5587255296. Throughput: 0: 52756.3. Samples: 77730040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:22:59,115][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 09:23:00,470][52242] Signal inference workers to stop experience collection... (1100 times) [2024-04-27 09:23:00,470][52242] Signal inference workers to resume experience collection... (1100 times) [2024-04-27 09:23:00,485][52263] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-04-27 09:23:00,485][52263] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-04-27 09:23:00,875][52263] Updated weights for policy 0, policy_version 341028 (0.0027) [2024-04-27 09:23:04,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52701.8, 300 sec: 52817.5). Total num frames: 5587533824. Throughput: 0: 52869.2. Samples: 78050220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:23:04,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 09:23:04,604][52263] Updated weights for policy 0, policy_version 341038 (0.0032) [2024-04-27 09:23:07,175][52263] Updated weights for policy 0, policy_version 341048 (0.0028) [2024-04-27 09:23:09,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53248.0, 300 sec: 52928.7). Total num frames: 5587828736. Throughput: 0: 52793.9. Samples: 78370240. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 09:23:09,107][52031] Avg episode reward: [(0, '0.504')] [2024-04-27 09:23:10,744][52263] Updated weights for policy 0, policy_version 341058 (0.0032) [2024-04-27 09:23:13,721][52263] Updated weights for policy 0, policy_version 341068 (0.0024) [2024-04-27 09:23:14,107][52031] Fps is (10 sec: 54067.1, 60 sec: 52701.7, 300 sec: 52817.5). Total num frames: 5588074496. Throughput: 0: 53223.9. Samples: 78538920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:14,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 09:23:16,913][52263] Updated weights for policy 0, policy_version 341078 (0.0029) [2024-04-27 09:23:19,106][52031] Fps is (10 sec: 50790.3, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5588336640. Throughput: 0: 53182.3. Samples: 78852140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:19,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 09:23:20,135][52263] Updated weights for policy 0, policy_version 341088 (0.0027) [2024-04-27 09:23:23,048][52263] Updated weights for policy 0, policy_version 341098 (0.0032) [2024-04-27 09:23:24,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5588598784. Throughput: 0: 53138.7. Samples: 79168300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:24,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 09:23:26,203][52263] Updated weights for policy 0, policy_version 341108 (0.0029) [2024-04-27 09:23:29,106][52031] Fps is (10 sec: 50790.3, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5588844544. Throughput: 0: 52833.3. Samples: 79320660. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:29,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 09:23:29,375][52263] Updated weights for policy 0, policy_version 341118 (0.0035) [2024-04-27 09:23:32,220][52263] Updated weights for policy 0, policy_version 341128 (0.0026) [2024-04-27 09:23:34,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53247.9, 300 sec: 52817.5). Total num frames: 5589139456. Throughput: 0: 52933.7. Samples: 79639540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:34,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 09:23:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000341134_5589139456.pth... [2024-04-27 09:23:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000340359_5576441856.pth [2024-04-27 09:23:35,510][52263] Updated weights for policy 0, policy_version 341138 (0.0034) [2024-04-27 09:23:38,397][52263] Updated weights for policy 0, policy_version 341148 (0.0031) [2024-04-27 09:23:39,106][52031] Fps is (10 sec: 55706.0, 60 sec: 52702.0, 300 sec: 52873.1). Total num frames: 5589401600. Throughput: 0: 52895.8. Samples: 79954820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:39,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 09:23:42,277][52263] Updated weights for policy 0, policy_version 341158 (0.0033) [2024-04-27 09:23:44,106][52031] Fps is (10 sec: 52429.9, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5589663744. Throughput: 0: 53201.5. Samples: 80124100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:44,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 09:23:44,526][52263] Updated weights for policy 0, policy_version 341168 (0.0035) [2024-04-27 09:23:48,322][52263] Updated weights for policy 0, policy_version 341178 (0.0026) [2024-04-27 09:23:49,106][52031] Fps is (10 sec: 49151.7, 60 sec: 52428.9, 300 sec: 52706.5). 
Total num frames: 5589893120. Throughput: 0: 53104.5. Samples: 80439920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:49,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 09:23:50,531][52263] Updated weights for policy 0, policy_version 341188 (0.0033) [2024-04-27 09:23:54,107][52031] Fps is (10 sec: 49151.4, 60 sec: 52428.9, 300 sec: 52706.5). Total num frames: 5590155264. Throughput: 0: 52977.2. Samples: 80754220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:54,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 09:23:54,518][52263] Updated weights for policy 0, policy_version 341198 (0.0027) [2024-04-27 09:23:56,960][52263] Updated weights for policy 0, policy_version 341208 (0.0032) [2024-04-27 09:23:57,388][52242] Signal inference workers to stop experience collection... (1150 times) [2024-04-27 09:23:57,390][52242] Signal inference workers to resume experience collection... (1150 times) [2024-04-27 09:23:57,406][52263] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-04-27 09:23:57,406][52263] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-04-27 09:23:59,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5590450176. Throughput: 0: 52620.5. Samples: 80906840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:23:59,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 09:24:00,558][52263] Updated weights for policy 0, policy_version 341218 (0.0030) [2024-04-27 09:24:03,265][52263] Updated weights for policy 0, policy_version 341228 (0.0031) [2024-04-27 09:24:04,107][52031] Fps is (10 sec: 58981.9, 60 sec: 53521.0, 300 sec: 52984.2). Total num frames: 5590745088. Throughput: 0: 52711.0. Samples: 81224140. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:24:04,108][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 09:24:06,779][52263] Updated weights for policy 0, policy_version 341238 (0.0026) [2024-04-27 09:24:09,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5590974464. Throughput: 0: 52722.8. Samples: 81540820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:24:09,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 09:24:09,345][52263] Updated weights for policy 0, policy_version 341248 (0.0030) [2024-04-27 09:24:12,904][52263] Updated weights for policy 0, policy_version 341258 (0.0025) [2024-04-27 09:24:14,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5591236608. Throughput: 0: 52958.3. Samples: 81703780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:24:14,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:24:15,395][52263] Updated weights for policy 0, policy_version 341268 (0.0029) [2024-04-27 09:24:18,997][52263] Updated weights for policy 0, policy_version 341278 (0.0027) [2024-04-27 09:24:19,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5591498752. Throughput: 0: 53032.1. Samples: 82025980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:24:19,115][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 09:24:21,639][52263] Updated weights for policy 0, policy_version 341288 (0.0034) [2024-04-27 09:24:24,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52428.9, 300 sec: 52762.1). Total num frames: 5591744512. Throughput: 0: 53022.7. Samples: 82340840. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 09:24:24,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:24:25,136][52263] Updated weights for policy 0, policy_version 341298 (0.0032) [2024-04-27 09:24:27,773][52263] Updated weights for policy 0, policy_version 341308 (0.0033) [2024-04-27 09:24:29,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.1, 300 sec: 52928.7). Total num frames: 5592055808. Throughput: 0: 52784.8. Samples: 82499420. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:24:29,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 09:24:31,429][52263] Updated weights for policy 0, policy_version 341318 (0.0029) [2024-04-27 09:24:33,750][52263] Updated weights for policy 0, policy_version 341328 (0.0031) [2024-04-27 09:24:34,106][52031] Fps is (10 sec: 57343.8, 60 sec: 52975.1, 300 sec: 52928.7). Total num frames: 5592317952. Throughput: 0: 52866.7. Samples: 82818920. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:24:34,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 09:24:37,598][52263] Updated weights for policy 0, policy_version 341338 (0.0033) [2024-04-27 09:24:39,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52974.8, 300 sec: 52928.6). Total num frames: 5592580096. Throughput: 0: 52977.3. Samples: 83138200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:24:39,116][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 09:24:39,898][52263] Updated weights for policy 0, policy_version 341348 (0.0029) [2024-04-27 09:24:43,736][52263] Updated weights for policy 0, policy_version 341358 (0.0032) [2024-04-27 09:24:44,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52701.7, 300 sec: 52817.6). Total num frames: 5592825856. Throughput: 0: 52882.6. Samples: 83286560. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:24:44,116][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 09:24:46,181][52263] Updated weights for policy 0, policy_version 341368 (0.0037) [2024-04-27 09:24:49,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52974.8, 300 sec: 52817.6). Total num frames: 5593071616. Throughput: 0: 52808.0. Samples: 83600500. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:24:49,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 09:24:49,931][52263] Updated weights for policy 0, policy_version 341378 (0.0030) [2024-04-27 09:24:52,476][52263] Updated weights for policy 0, policy_version 341388 (0.0037) [2024-04-27 09:24:54,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.0, 300 sec: 52928.6). Total num frames: 5593382912. Throughput: 0: 52869.6. Samples: 83919960. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:24:54,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 09:24:56,018][52263] Updated weights for policy 0, policy_version 341398 (0.0034) [2024-04-27 09:24:58,729][52263] Updated weights for policy 0, policy_version 341408 (0.0035) [2024-04-27 09:24:59,107][52031] Fps is (10 sec: 57344.5, 60 sec: 53248.0, 300 sec: 52928.7). Total num frames: 5593645056. Throughput: 0: 53078.1. Samples: 84092300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:24:59,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 09:25:02,160][52263] Updated weights for policy 0, policy_version 341418 (0.0030) [2024-04-27 09:25:04,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52702.0, 300 sec: 52984.2). Total num frames: 5593907200. Throughput: 0: 53069.0. Samples: 84414080. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:25:04,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 09:25:04,749][52263] Updated weights for policy 0, policy_version 341428 (0.0031) [2024-04-27 09:25:07,683][52242] Signal inference workers to stop experience collection... 
(1200 times) [2024-04-27 09:25:07,736][52263] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-04-27 09:25:07,742][52242] Signal inference workers to resume experience collection... (1200 times) [2024-04-27 09:25:07,746][52263] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-04-27 09:25:08,483][52263] Updated weights for policy 0, policy_version 341438 (0.0028) [2024-04-27 09:25:09,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5594152960. Throughput: 0: 53093.8. Samples: 84730060. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:25:09,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 09:25:10,898][52263] Updated weights for policy 0, policy_version 341448 (0.0034) [2024-04-27 09:25:14,106][52031] Fps is (10 sec: 49151.7, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5594398720. Throughput: 0: 52846.6. Samples: 84877520. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:25:14,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 09:25:14,710][52263] Updated weights for policy 0, policy_version 341458 (0.0039) [2024-04-27 09:25:17,105][52263] Updated weights for policy 0, policy_version 341468 (0.0031) [2024-04-27 09:25:19,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5594677248. Throughput: 0: 52779.9. Samples: 85194020. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:25:19,107][52031] Avg episode reward: [(0, '0.436')] [2024-04-27 09:25:21,056][52263] Updated weights for policy 0, policy_version 341478 (0.0034) [2024-04-27 09:25:23,288][52263] Updated weights for policy 0, policy_version 341488 (0.0030) [2024-04-27 09:25:24,106][52031] Fps is (10 sec: 58982.9, 60 sec: 54067.2, 300 sec: 53039.7). Total num frames: 5594988544. Throughput: 0: 52791.8. Samples: 85513820. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:25:24,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 09:25:27,010][52263] Updated weights for policy 0, policy_version 341498 (0.0028) [2024-04-27 09:25:29,107][52031] Fps is (10 sec: 55705.7, 60 sec: 52974.9, 300 sec: 53039.7). Total num frames: 5595234304. Throughput: 0: 53377.0. Samples: 85688520. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:25:29,115][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 09:25:29,590][52263] Updated weights for policy 0, policy_version 341508 (0.0031) [2024-04-27 09:25:33,309][52263] Updated weights for policy 0, policy_version 341518 (0.0030) [2024-04-27 09:25:34,106][52031] Fps is (10 sec: 49151.9, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5595480064. Throughput: 0: 53319.3. Samples: 85999860. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:25:34,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 09:25:34,225][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000341522_5595496448.pth... [2024-04-27 09:25:34,277][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000340746_5582782464.pth [2024-04-27 09:25:35,713][52263] Updated weights for policy 0, policy_version 341528 (0.0031) [2024-04-27 09:25:39,106][52031] Fps is (10 sec: 49152.5, 60 sec: 52428.9, 300 sec: 52928.7). Total num frames: 5595725824. Throughput: 0: 53250.9. Samples: 86316240. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:25:39,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:25:39,463][52263] Updated weights for policy 0, policy_version 341538 (0.0034) [2024-04-27 09:25:41,890][52263] Updated weights for policy 0, policy_version 341548 (0.0033) [2024-04-27 09:25:44,106][52031] Fps is (10 sec: 52428.4, 60 sec: 52975.0, 300 sec: 52928.7). Total num frames: 5596004352. Throughput: 0: 52864.0. Samples: 86471180. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-04-27 09:25:44,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 09:25:45,519][52263] Updated weights for policy 0, policy_version 341558 (0.0033) [2024-04-27 09:25:48,101][52263] Updated weights for policy 0, policy_version 341568 (0.0028) [2024-04-27 09:25:49,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.2, 300 sec: 52928.6). Total num frames: 5596282880. Throughput: 0: 52883.5. Samples: 86793840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:25:49,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 09:25:51,680][52263] Updated weights for policy 0, policy_version 341578 (0.0032) [2024-04-27 09:25:54,106][52031] Fps is (10 sec: 55706.4, 60 sec: 52975.2, 300 sec: 53039.7). Total num frames: 5596561408. Throughput: 0: 52891.2. Samples: 87110160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:25:54,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 09:25:54,284][52263] Updated weights for policy 0, policy_version 341588 (0.0026) [2024-04-27 09:25:57,743][52263] Updated weights for policy 0, policy_version 341598 (0.0027) [2024-04-27 09:25:59,107][52031] Fps is (10 sec: 54067.1, 60 sec: 52974.9, 300 sec: 53039.7). Total num frames: 5596823552. Throughput: 0: 53194.2. Samples: 87271260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:25:59,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 09:26:00,486][52263] Updated weights for policy 0, policy_version 341608 (0.0030) [2024-04-27 09:26:04,107][52031] Fps is (10 sec: 49150.6, 60 sec: 52428.6, 300 sec: 52873.1). Total num frames: 5597052928. Throughput: 0: 53223.8. Samples: 87589100. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:04,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 09:26:04,227][52263] Updated weights for policy 0, policy_version 341618 (0.0034) [2024-04-27 09:26:06,557][52263] Updated weights for policy 0, policy_version 341628 (0.0037) [2024-04-27 09:26:09,107][52031] Fps is (10 sec: 49151.6, 60 sec: 52701.7, 300 sec: 52873.1). Total num frames: 5597315072. Throughput: 0: 53274.9. Samples: 87911200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:09,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 09:26:10,298][52263] Updated weights for policy 0, policy_version 341638 (0.0033) [2024-04-27 09:26:12,565][52242] Signal inference workers to stop experience collection... (1250 times) [2024-04-27 09:26:12,566][52242] Signal inference workers to resume experience collection... (1250 times) [2024-04-27 09:26:12,592][52263] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-04-27 09:26:12,592][52263] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-04-27 09:26:12,681][52263] Updated weights for policy 0, policy_version 341648 (0.0035) [2024-04-27 09:26:14,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.0, 300 sec: 52928.6). Total num frames: 5597609984. Throughput: 0: 52720.3. Samples: 88060940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:14,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 09:26:16,469][52263] Updated weights for policy 0, policy_version 341658 (0.0027) [2024-04-27 09:26:18,912][52263] Updated weights for policy 0, policy_version 341668 (0.0033) [2024-04-27 09:26:19,107][52031] Fps is (10 sec: 58982.8, 60 sec: 53794.1, 300 sec: 53095.2). Total num frames: 5597904896. Throughput: 0: 52875.4. Samples: 88379260. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:19,107][52031] Avg episode reward: [(0, '0.491')] [2024-04-27 09:26:22,654][52263] Updated weights for policy 0, policy_version 341678 (0.0030) [2024-04-27 09:26:24,107][52031] Fps is (10 sec: 54067.0, 60 sec: 52701.7, 300 sec: 53039.7). Total num frames: 5598150656. Throughput: 0: 52939.3. Samples: 88698520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:24,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 09:26:25,234][52263] Updated weights for policy 0, policy_version 341688 (0.0032) [2024-04-27 09:26:28,770][52263] Updated weights for policy 0, policy_version 341698 (0.0030) [2024-04-27 09:26:29,107][52031] Fps is (10 sec: 49152.1, 60 sec: 52701.8, 300 sec: 52928.7). Total num frames: 5598396416. Throughput: 0: 53103.5. Samples: 88860840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:29,108][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:26:31,401][52263] Updated weights for policy 0, policy_version 341708 (0.0032) [2024-04-27 09:26:34,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52974.7, 300 sec: 52984.2). Total num frames: 5598658560. Throughput: 0: 52910.1. Samples: 89174800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:34,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 09:26:34,917][52263] Updated weights for policy 0, policy_version 341718 (0.0030) [2024-04-27 09:26:37,557][52263] Updated weights for policy 0, policy_version 341728 (0.0033) [2024-04-27 09:26:39,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53247.9, 300 sec: 52873.1). Total num frames: 5598920704. Throughput: 0: 52880.2. Samples: 89489780. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:39,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 09:26:41,023][52263] Updated weights for policy 0, policy_version 341738 (0.0031) [2024-04-27 09:26:43,806][52263] Updated weights for policy 0, policy_version 341748 (0.0034) [2024-04-27 09:26:44,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.0, 300 sec: 52984.2). Total num frames: 5599199232. Throughput: 0: 53012.9. Samples: 89656840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:44,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:26:47,274][52263] Updated weights for policy 0, policy_version 341758 (0.0032) [2024-04-27 09:26:49,107][52031] Fps is (10 sec: 54067.3, 60 sec: 52974.9, 300 sec: 52984.2). Total num frames: 5599461376. Throughput: 0: 52962.3. Samples: 89972400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:49,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 09:26:50,032][52263] Updated weights for policy 0, policy_version 341768 (0.0034) [2024-04-27 09:26:53,474][52263] Updated weights for policy 0, policy_version 341778 (0.0033) [2024-04-27 09:26:54,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52701.6, 300 sec: 52928.6). Total num frames: 5599723520. Throughput: 0: 52924.0. Samples: 90292780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:54,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 09:26:56,052][52263] Updated weights for policy 0, policy_version 341788 (0.0029) [2024-04-27 09:26:59,106][52031] Fps is (10 sec: 52429.5, 60 sec: 52702.0, 300 sec: 52928.7). Total num frames: 5599985664. Throughput: 0: 52972.2. Samples: 90444680. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 09:26:59,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 09:26:59,463][52263] Updated weights for policy 0, policy_version 341798 (0.0029) [2024-04-27 09:27:02,090][52263] Updated weights for policy 0, policy_version 341808 (0.0029) [2024-04-27 09:27:04,106][52031] Fps is (10 sec: 50791.6, 60 sec: 52975.2, 300 sec: 52873.1). Total num frames: 5600231424. Throughput: 0: 52961.5. Samples: 90762520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:04,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 09:27:05,707][52263] Updated weights for policy 0, policy_version 341818 (0.0027) [2024-04-27 09:27:08,361][52263] Updated weights for policy 0, policy_version 341828 (0.0041) [2024-04-27 09:27:09,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.1, 300 sec: 52928.6). Total num frames: 5600526336. Throughput: 0: 52926.7. Samples: 91080220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:09,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 09:27:11,994][52263] Updated weights for policy 0, policy_version 341838 (0.0039) [2024-04-27 09:27:14,106][52031] Fps is (10 sec: 55705.4, 60 sec: 52975.1, 300 sec: 52984.2). Total num frames: 5600788480. Throughput: 0: 53044.1. Samples: 91247820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:14,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 09:27:14,592][52263] Updated weights for policy 0, policy_version 341848 (0.0028) [2024-04-27 09:27:18,095][52263] Updated weights for policy 0, policy_version 341858 (0.0033) [2024-04-27 09:27:19,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52428.9, 300 sec: 52928.7). Total num frames: 5601050624. Throughput: 0: 53175.8. Samples: 91567700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:19,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:27:20,605][52263] Updated weights for policy 0, policy_version 341868 (0.0029) [2024-04-27 09:27:24,092][52263] Updated weights for policy 0, policy_version 341878 (0.0027) [2024-04-27 09:27:24,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52975.1, 300 sec: 53039.7). Total num frames: 5601329152. Throughput: 0: 53272.6. Samples: 91887040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:24,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 09:27:26,840][52263] Updated weights for policy 0, policy_version 341888 (0.0033) [2024-04-27 09:27:29,106][52031] Fps is (10 sec: 52428.6, 60 sec: 52975.0, 300 sec: 52984.2). Total num frames: 5601574912. Throughput: 0: 52941.8. Samples: 92039220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:29,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 09:27:30,183][52263] Updated weights for policy 0, policy_version 341898 (0.0038) [2024-04-27 09:27:32,965][52263] Updated weights for policy 0, policy_version 341908 (0.0032) [2024-04-27 09:27:34,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5601837056. Throughput: 0: 53094.7. Samples: 92361660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:34,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 09:27:34,208][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000341910_5601853440.pth... [2024-04-27 09:27:34,253][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000341134_5589139456.pth [2024-04-27 09:27:36,369][52263] Updated weights for policy 0, policy_version 341918 (0.0036) [2024-04-27 09:27:38,604][52242] Signal inference workers to stop experience collection... 
(1300 times) [2024-04-27 09:27:38,648][52263] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-04-27 09:27:38,702][52242] Signal inference workers to resume experience collection... (1300 times) [2024-04-27 09:27:38,702][52263] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-04-27 09:27:39,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53521.0, 300 sec: 53039.7). Total num frames: 5602131968. Throughput: 0: 53022.2. Samples: 92678780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:39,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 09:27:39,200][52263] Updated weights for policy 0, policy_version 341928 (0.0025) [2024-04-27 09:27:42,816][52263] Updated weights for policy 0, policy_version 341938 (0.0038) [2024-04-27 09:27:44,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53248.1, 300 sec: 53039.8). Total num frames: 5602394112. Throughput: 0: 53297.8. Samples: 92843080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:44,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 09:27:45,193][52263] Updated weights for policy 0, policy_version 341948 (0.0033) [2024-04-27 09:27:48,992][52263] Updated weights for policy 0, policy_version 341958 (0.0035) [2024-04-27 09:27:49,107][52031] Fps is (10 sec: 50790.9, 60 sec: 52975.0, 300 sec: 52984.2). Total num frames: 5602639872. Throughput: 0: 53278.5. Samples: 93160060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:49,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 09:27:51,274][52263] Updated weights for policy 0, policy_version 341968 (0.0027) [2024-04-27 09:27:54,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52975.1, 300 sec: 53039.7). Total num frames: 5602902016. Throughput: 0: 53334.8. Samples: 93480280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:54,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 09:27:55,301][52263] Updated weights for policy 0, policy_version 341978 (0.0028) [2024-04-27 09:27:57,569][52263] Updated weights for policy 0, policy_version 341988 (0.0029) [2024-04-27 09:27:59,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52974.9, 300 sec: 52984.2). Total num frames: 5603164160. Throughput: 0: 53003.1. Samples: 93632960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:27:59,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 09:28:01,587][52263] Updated weights for policy 0, policy_version 341998 (0.0029) [2024-04-27 09:28:03,773][52263] Updated weights for policy 0, policy_version 342008 (0.0039) [2024-04-27 09:28:04,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.0, 300 sec: 52984.2). Total num frames: 5603459072. Throughput: 0: 53015.4. Samples: 93953400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:28:04,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 09:28:07,605][52263] Updated weights for policy 0, policy_version 342018 (0.0027) [2024-04-27 09:28:09,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53248.1, 300 sec: 53039.7). Total num frames: 5603721216. Throughput: 0: 53103.9. Samples: 94276720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:28:09,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 09:28:09,778][52263] Updated weights for policy 0, policy_version 342028 (0.0032) [2024-04-27 09:28:13,671][52263] Updated weights for policy 0, policy_version 342038 (0.0030) [2024-04-27 09:28:14,106][52031] Fps is (10 sec: 49152.6, 60 sec: 52701.8, 300 sec: 52928.7). Total num frames: 5603950592. Throughput: 0: 53204.9. Samples: 94433440. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:28:14,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 09:28:16,331][52263] Updated weights for policy 0, policy_version 342048 (0.0030) [2024-04-27 09:28:19,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.0, 300 sec: 53039.8). Total num frames: 5604245504. Throughput: 0: 53161.5. Samples: 94753920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:28:19,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 09:28:19,910][52263] Updated weights for policy 0, policy_version 342058 (0.0033) [2024-04-27 09:28:22,860][52263] Updated weights for policy 0, policy_version 342068 (0.0029) [2024-04-27 09:28:24,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52701.8, 300 sec: 53039.7). Total num frames: 5604491264. Throughput: 0: 53119.3. Samples: 95069140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:28:24,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 09:28:26,165][52263] Updated weights for policy 0, policy_version 342078 (0.0031) [2024-04-27 09:28:28,954][52263] Updated weights for policy 0, policy_version 342088 (0.0029) [2024-04-27 09:28:29,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.9, 300 sec: 52984.2). Total num frames: 5604769792. Throughput: 0: 52938.5. Samples: 95225320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:28:29,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 09:28:32,522][52263] Updated weights for policy 0, policy_version 342098 (0.0032) [2024-04-27 09:28:34,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.2, 300 sec: 53095.3). Total num frames: 5605064704. Throughput: 0: 53149.0. Samples: 95551760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:28:34,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 09:28:35,101][52263] Updated weights for policy 0, policy_version 342108 (0.0035) [2024-04-27 09:28:38,623][52263] Updated weights for policy 0, policy_version 342118 (0.0034) [2024-04-27 09:28:39,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52429.0, 300 sec: 52928.6). Total num frames: 5605277696. Throughput: 0: 53110.2. Samples: 95870240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:28:39,107][52031] Avg episode reward: [(0, '0.484')] [2024-04-27 09:28:41,285][52263] Updated weights for policy 0, policy_version 342128 (0.0030) [2024-04-27 09:28:44,106][52031] Fps is (10 sec: 47513.6, 60 sec: 52428.8, 300 sec: 53039.7). Total num frames: 5605539840. Throughput: 0: 53011.1. Samples: 96018460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:28:44,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 09:28:44,701][52263] Updated weights for policy 0, policy_version 342138 (0.0031) [2024-04-27 09:28:45,030][52242] Signal inference workers to stop experience collection... (1350 times) [2024-04-27 09:28:45,066][52263] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-04-27 09:28:45,124][52242] Signal inference workers to resume experience collection... (1350 times) [2024-04-27 09:28:45,124][52263] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-04-27 09:28:47,438][52263] Updated weights for policy 0, policy_version 342148 (0.0027) [2024-04-27 09:28:49,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52975.0, 300 sec: 53095.3). Total num frames: 5605818368. Throughput: 0: 52969.0. Samples: 96337000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:28:49,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 09:28:50,982][52263] Updated weights for policy 0, policy_version 342158 (0.0029) [2024-04-27 09:28:53,461][52263] Updated weights for policy 0, policy_version 342168 (0.0031) [2024-04-27 09:28:54,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53247.9, 300 sec: 53039.7). Total num frames: 5606096896. Throughput: 0: 52775.0. Samples: 96651600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:28:54,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:28:57,230][52263] Updated weights for policy 0, policy_version 342178 (0.0029) [2024-04-27 09:28:59,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.0, 300 sec: 52984.2). Total num frames: 5606375424. Throughput: 0: 53066.7. Samples: 96821440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:28:59,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 09:28:59,682][52263] Updated weights for policy 0, policy_version 342188 (0.0028) [2024-04-27 09:29:03,335][52263] Updated weights for policy 0, policy_version 342198 (0.0036) [2024-04-27 09:29:04,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52702.0, 300 sec: 53039.7). Total num frames: 5606621184. Throughput: 0: 52996.5. Samples: 97138760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:29:04,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 09:29:05,778][52263] Updated weights for policy 0, policy_version 342208 (0.0029) [2024-04-27 09:29:09,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52428.8, 300 sec: 52984.2). Total num frames: 5606866944. Throughput: 0: 52996.8. Samples: 97454000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:29:09,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 09:29:09,382][52263] Updated weights for policy 0, policy_version 342218 (0.0029) [2024-04-27 09:29:11,990][52263] Updated weights for policy 0, policy_version 342228 (0.0028) [2024-04-27 09:29:14,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53247.9, 300 sec: 53039.7). Total num frames: 5607145472. Throughput: 0: 52883.6. Samples: 97605080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:29:14,116][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 09:29:15,519][52263] Updated weights for policy 0, policy_version 342238 (0.0032) [2024-04-27 09:29:18,011][52263] Updated weights for policy 0, policy_version 342248 (0.0033) [2024-04-27 09:29:19,107][52031] Fps is (10 sec: 54067.2, 60 sec: 52701.8, 300 sec: 53095.2). Total num frames: 5607407616. Throughput: 0: 52799.0. Samples: 97927720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:29:19,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 09:29:21,745][52263] Updated weights for policy 0, policy_version 342258 (0.0030) [2024-04-27 09:29:24,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53520.9, 300 sec: 53039.7). Total num frames: 5607702528. Throughput: 0: 52771.4. Samples: 98244960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:29:24,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 09:29:24,284][52263] Updated weights for policy 0, policy_version 342268 (0.0030) [2024-04-27 09:29:27,833][52263] Updated weights for policy 0, policy_version 342278 (0.0034) [2024-04-27 09:29:29,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.1, 300 sec: 53039.7). Total num frames: 5607964672. Throughput: 0: 53139.9. Samples: 98409760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:29:29,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 09:29:30,533][52263] Updated weights for policy 0, policy_version 342288 (0.0030) [2024-04-27 09:29:33,867][52263] Updated weights for policy 0, policy_version 342298 (0.0028) [2024-04-27 09:29:34,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52428.7, 300 sec: 52984.2). Total num frames: 5608210432. Throughput: 0: 53083.4. Samples: 98725760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 09:29:34,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 09:29:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000342298_5608210432.pth... [2024-04-27 09:29:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000341522_5595496448.pth [2024-04-27 09:29:36,612][52263] Updated weights for policy 0, policy_version 342308 (0.0036) [2024-04-27 09:29:39,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52974.9, 300 sec: 52984.2). Total num frames: 5608456192. Throughput: 0: 53045.8. Samples: 99038660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:29:39,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 09:29:40,313][52263] Updated weights for policy 0, policy_version 342318 (0.0034) [2024-04-27 09:29:41,096][52242] Signal inference workers to stop experience collection... (1400 times) [2024-04-27 09:29:41,135][52263] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-04-27 09:29:41,165][52242] Signal inference workers to resume experience collection... (1400 times) [2024-04-27 09:29:41,170][52263] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-04-27 09:29:42,739][52263] Updated weights for policy 0, policy_version 342328 (0.0024) [2024-04-27 09:29:44,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.0, 300 sec: 53095.3). Total num frames: 5608734720. Throughput: 0: 52764.9. Samples: 99195860. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:29:44,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 09:29:46,597][52263] Updated weights for policy 0, policy_version 342338 (0.0030) [2024-04-27 09:29:49,013][52263] Updated weights for policy 0, policy_version 342348 (0.0031) [2024-04-27 09:29:49,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53521.1, 300 sec: 53039.8). Total num frames: 5609029632. Throughput: 0: 52700.8. Samples: 99510300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:29:49,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 09:29:52,608][52263] Updated weights for policy 0, policy_version 342358 (0.0031) [2024-04-27 09:29:54,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52975.0, 300 sec: 52984.2). Total num frames: 5609275392. Throughput: 0: 52786.3. Samples: 99829380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:29:54,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 09:29:55,318][52263] Updated weights for policy 0, policy_version 342368 (0.0030) [2024-04-27 09:29:58,752][52263] Updated weights for policy 0, policy_version 342378 (0.0031) [2024-04-27 09:29:59,106][52031] Fps is (10 sec: 49152.6, 60 sec: 52428.9, 300 sec: 52928.7). Total num frames: 5609521152. Throughput: 0: 52959.3. Samples: 99988240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:29:59,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 09:30:01,477][52263] Updated weights for policy 0, policy_version 342388 (0.0029) [2024-04-27 09:30:04,107][52031] Fps is (10 sec: 49151.7, 60 sec: 52428.7, 300 sec: 52928.6). Total num frames: 5609766912. Throughput: 0: 52908.4. Samples: 100308600. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:04,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 09:30:05,200][52263] Updated weights for policy 0, policy_version 342398 (0.0028) [2024-04-27 09:30:07,625][52263] Updated weights for policy 0, policy_version 342408 (0.0032) [2024-04-27 09:30:09,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53521.1, 300 sec: 53150.8). Total num frames: 5610078208. Throughput: 0: 52986.3. Samples: 100629340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:09,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 09:30:11,395][52263] Updated weights for policy 0, policy_version 342418 (0.0035) [2024-04-27 09:30:14,013][52263] Updated weights for policy 0, policy_version 342428 (0.0030) [2024-04-27 09:30:14,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53247.9, 300 sec: 53095.2). Total num frames: 5610340352. Throughput: 0: 52872.3. Samples: 100789020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:14,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 09:30:17,430][52263] Updated weights for policy 0, policy_version 342438 (0.0037) [2024-04-27 09:30:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 52928.6). Total num frames: 5610602496. Throughput: 0: 52888.9. Samples: 101105760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:19,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 09:30:20,078][52263] Updated weights for policy 0, policy_version 342448 (0.0032) [2024-04-27 09:30:23,644][52263] Updated weights for policy 0, policy_version 342458 (0.0028) [2024-04-27 09:30:24,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.8, 300 sec: 52984.2). Total num frames: 5610864640. Throughput: 0: 52998.1. Samples: 101423580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:24,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 09:30:26,431][52263] Updated weights for policy 0, policy_version 342468 (0.0031) [2024-04-27 09:30:29,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52155.8, 300 sec: 52928.7). Total num frames: 5611094016. Throughput: 0: 52843.6. Samples: 101573820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:29,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 09:30:29,897][52263] Updated weights for policy 0, policy_version 342478 (0.0030) [2024-04-27 09:30:32,460][52263] Updated weights for policy 0, policy_version 342488 (0.0026) [2024-04-27 09:30:34,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52701.9, 300 sec: 53039.7). Total num frames: 5611372544. Throughput: 0: 52937.3. Samples: 101892480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:34,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 09:30:36,050][52263] Updated weights for policy 0, policy_version 342498 (0.0036) [2024-04-27 09:30:38,715][52263] Updated weights for policy 0, policy_version 342508 (0.0033) [2024-04-27 09:30:39,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53248.0, 300 sec: 53039.7). Total num frames: 5611651072. Throughput: 0: 52912.4. Samples: 102210440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:39,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 09:30:39,532][52242] Signal inference workers to stop experience collection... (1450 times) [2024-04-27 09:30:39,574][52263] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-04-27 09:30:39,587][52242] Signal inference workers to resume experience collection... 
(1450 times) [2024-04-27 09:30:39,596][52263] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-04-27 09:30:42,232][52263] Updated weights for policy 0, policy_version 342518 (0.0034) [2024-04-27 09:30:44,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52975.0, 300 sec: 52984.2). Total num frames: 5611913216. Throughput: 0: 53073.3. Samples: 102376540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:44,107][52031] Avg episode reward: [(0, '0.686')] [2024-04-27 09:30:44,769][52263] Updated weights for policy 0, policy_version 342528 (0.0035) [2024-04-27 09:30:48,504][52263] Updated weights for policy 0, policy_version 342538 (0.0026) [2024-04-27 09:30:49,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52701.9, 300 sec: 52984.2). Total num frames: 5612191744. Throughput: 0: 53017.5. Samples: 102694380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 09:30:49,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 09:30:51,143][52263] Updated weights for policy 0, policy_version 342548 (0.0027) [2024-04-27 09:30:54,107][52031] Fps is (10 sec: 49151.5, 60 sec: 52155.7, 300 sec: 52817.6). Total num frames: 5612404736. Throughput: 0: 53001.8. Samples: 103014420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:30:54,116][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 09:30:54,647][52263] Updated weights for policy 0, policy_version 342558 (0.0028) [2024-04-27 09:30:57,407][52263] Updated weights for policy 0, policy_version 342568 (0.0026) [2024-04-27 09:30:59,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52974.7, 300 sec: 53039.7). Total num frames: 5612699648. Throughput: 0: 52825.0. Samples: 103166140. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:30:59,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 09:31:00,767][52263] Updated weights for policy 0, policy_version 342578 (0.0031) [2024-04-27 09:31:03,606][52263] Updated weights for policy 0, policy_version 342588 (0.0027) [2024-04-27 09:31:04,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53247.9, 300 sec: 53039.7). Total num frames: 5612961792. Throughput: 0: 52785.7. Samples: 103481120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:04,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 09:31:06,960][52263] Updated weights for policy 0, policy_version 342598 (0.0032) [2024-04-27 09:31:09,106][52031] Fps is (10 sec: 54068.2, 60 sec: 52702.0, 300 sec: 52984.2). Total num frames: 5613240320. Throughput: 0: 52775.8. Samples: 103798480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:09,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 09:31:09,657][52263] Updated weights for policy 0, policy_version 342608 (0.0029) [2024-04-27 09:31:13,146][52263] Updated weights for policy 0, policy_version 342618 (0.0028) [2024-04-27 09:31:14,107][52031] Fps is (10 sec: 55705.4, 60 sec: 52974.9, 300 sec: 52928.6). Total num frames: 5613518848. Throughput: 0: 53013.9. Samples: 103959460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:14,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 09:31:16,011][52263] Updated weights for policy 0, policy_version 342628 (0.0030) [2024-04-27 09:31:19,107][52031] Fps is (10 sec: 52427.9, 60 sec: 52701.9, 300 sec: 52928.7). Total num frames: 5613764608. Throughput: 0: 53010.6. Samples: 104277960. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:19,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 09:31:19,242][52263] Updated weights for policy 0, policy_version 342638 (0.0031) [2024-04-27 09:31:22,480][52263] Updated weights for policy 0, policy_version 342648 (0.0034) [2024-04-27 09:31:24,106][52031] Fps is (10 sec: 49153.0, 60 sec: 52428.9, 300 sec: 52928.7). Total num frames: 5614010368. Throughput: 0: 52939.6. Samples: 104592720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:24,107][52031] Avg episode reward: [(0, '0.468')] [2024-04-27 09:31:25,426][52263] Updated weights for policy 0, policy_version 342658 (0.0042) [2024-04-27 09:31:28,701][52263] Updated weights for policy 0, policy_version 342668 (0.0033) [2024-04-27 09:31:29,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53247.9, 300 sec: 52984.2). Total num frames: 5614288896. Throughput: 0: 52767.0. Samples: 104751060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:29,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 09:31:31,655][52263] Updated weights for policy 0, policy_version 342678 (0.0029) [2024-04-27 09:31:34,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52975.1, 300 sec: 52984.2). Total num frames: 5614551040. Throughput: 0: 52709.8. Samples: 105066320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:34,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 09:31:34,188][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000342686_5614567424.pth... [2024-04-27 09:31:34,243][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000341910_5601853440.pth [2024-04-27 09:31:34,858][52263] Updated weights for policy 0, policy_version 342688 (0.0034) [2024-04-27 09:31:37,726][52242] Signal inference workers to stop experience collection... (1500 times) [2024-04-27 09:31:37,727][52242] Signal inference workers to resume experience collection... 
(1500 times) [2024-04-27 09:31:37,756][52263] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-04-27 09:31:37,756][52263] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-04-27 09:31:37,857][52263] Updated weights for policy 0, policy_version 342698 (0.0026) [2024-04-27 09:31:39,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52701.9, 300 sec: 52928.7). Total num frames: 5614813184. Throughput: 0: 52562.3. Samples: 105379720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:39,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 09:31:41,001][52263] Updated weights for policy 0, policy_version 342708 (0.0031) [2024-04-27 09:31:44,026][52263] Updated weights for policy 0, policy_version 342718 (0.0033) [2024-04-27 09:31:44,106][52031] Fps is (10 sec: 54067.0, 60 sec: 52974.9, 300 sec: 52984.2). Total num frames: 5615091712. Throughput: 0: 52802.9. Samples: 105542260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:44,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 09:31:47,110][52263] Updated weights for policy 0, policy_version 342728 (0.0037) [2024-04-27 09:31:49,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52428.7, 300 sec: 52928.7). Total num frames: 5615337472. Throughput: 0: 52801.9. Samples: 105857200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:49,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 09:31:50,123][52263] Updated weights for policy 0, policy_version 342738 (0.0030) [2024-04-27 09:31:53,325][52263] Updated weights for policy 0, policy_version 342748 (0.0030) [2024-04-27 09:31:54,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53521.0, 300 sec: 52984.2). Total num frames: 5615616000. Throughput: 0: 52800.2. Samples: 106174500. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:54,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 09:31:56,302][52263] Updated weights for policy 0, policy_version 342758 (0.0026) [2024-04-27 09:31:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52975.1, 300 sec: 53039.7). Total num frames: 5615878144. Throughput: 0: 52696.3. Samples: 106330780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:31:59,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 09:31:59,542][52263] Updated weights for policy 0, policy_version 342768 (0.0032) [2024-04-27 09:32:02,573][52263] Updated weights for policy 0, policy_version 342778 (0.0030) [2024-04-27 09:32:04,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52975.0, 300 sec: 52928.7). Total num frames: 5616140288. Throughput: 0: 52639.5. Samples: 106646740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-27 09:32:04,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 09:32:05,886][52263] Updated weights for policy 0, policy_version 342788 (0.0029) [2024-04-27 09:32:08,708][52263] Updated weights for policy 0, policy_version 342798 (0.0028) [2024-04-27 09:32:09,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52974.8, 300 sec: 52984.2). Total num frames: 5616418816. Throughput: 0: 52728.3. Samples: 106965500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:09,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 09:32:12,001][52263] Updated weights for policy 0, policy_version 342808 (0.0032) [2024-04-27 09:32:14,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52428.9, 300 sec: 52928.6). Total num frames: 5616664576. Throughput: 0: 52845.3. Samples: 107129100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:14,116][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 09:32:14,908][52263] Updated weights for policy 0, policy_version 342818 (0.0028) [2024-04-27 09:32:18,089][52263] Updated weights for policy 0, policy_version 342828 (0.0031) [2024-04-27 09:32:19,107][52031] Fps is (10 sec: 50790.6, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5616926720. Throughput: 0: 52921.6. Samples: 107447800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:19,107][52031] Avg episode reward: [(0, '0.442')] [2024-04-27 09:32:21,127][52263] Updated weights for policy 0, policy_version 342838 (0.0034) [2024-04-27 09:32:24,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53247.9, 300 sec: 52984.2). Total num frames: 5617205248. Throughput: 0: 52979.4. Samples: 107763800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:24,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 09:32:24,334][52263] Updated weights for policy 0, policy_version 342848 (0.0031) [2024-04-27 09:32:27,282][52263] Updated weights for policy 0, policy_version 342858 (0.0029) [2024-04-27 09:32:29,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.9, 300 sec: 52928.7). Total num frames: 5617451008. Throughput: 0: 52772.4. Samples: 107917020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:29,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 09:32:30,638][52263] Updated weights for policy 0, policy_version 342868 (0.0028) [2024-04-27 09:32:33,550][52263] Updated weights for policy 0, policy_version 342878 (0.0026) [2024-04-27 09:32:34,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5617729536. Throughput: 0: 52887.2. Samples: 108237120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:34,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 09:32:36,721][52263] Updated weights for policy 0, policy_version 342888 (0.0037) [2024-04-27 09:32:39,106][52031] Fps is (10 sec: 54067.4, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5617991680. Throughput: 0: 52876.2. Samples: 108553920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:39,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:32:39,672][52263] Updated weights for policy 0, policy_version 342898 (0.0036) [2024-04-27 09:32:42,911][52263] Updated weights for policy 0, policy_version 342908 (0.0030) [2024-04-27 09:32:44,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.7, 300 sec: 52928.6). Total num frames: 5618253824. Throughput: 0: 52865.2. Samples: 108709720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:44,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 09:32:45,856][52263] Updated weights for policy 0, policy_version 342918 (0.0042) [2024-04-27 09:32:48,623][52242] Signal inference workers to stop experience collection... (1550 times) [2024-04-27 09:32:48,623][52242] Signal inference workers to resume experience collection... (1550 times) [2024-04-27 09:32:48,653][52263] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-04-27 09:32:48,653][52263] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-04-27 09:32:49,035][52263] Updated weights for policy 0, policy_version 342928 (0.0026) [2024-04-27 09:32:49,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53247.9, 300 sec: 52984.2). Total num frames: 5618532352. Throughput: 0: 52921.3. Samples: 109028200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:49,107][52031] Avg episode reward: [(0, '0.507')] [2024-04-27 09:32:52,140][52263] Updated weights for policy 0, policy_version 342938 (0.0028) [2024-04-27 09:32:54,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52428.9, 300 sec: 52873.1). Total num frames: 5618761728. Throughput: 0: 52904.9. Samples: 109346220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:54,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 09:32:55,405][52263] Updated weights for policy 0, policy_version 342948 (0.0029) [2024-04-27 09:32:58,383][52263] Updated weights for policy 0, policy_version 342958 (0.0030) [2024-04-27 09:32:59,107][52031] Fps is (10 sec: 50790.7, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5619040256. Throughput: 0: 52724.4. Samples: 109501700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:32:59,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 09:33:01,540][52263] Updated weights for policy 0, policy_version 342968 (0.0030) [2024-04-27 09:33:04,106][52031] Fps is (10 sec: 55705.9, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5619318784. Throughput: 0: 52658.7. Samples: 109817440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:33:04,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 09:33:04,394][52263] Updated weights for policy 0, policy_version 342978 (0.0028) [2024-04-27 09:33:07,737][52263] Updated weights for policy 0, policy_version 342988 (0.0028) [2024-04-27 09:33:09,107][52031] Fps is (10 sec: 50790.7, 60 sec: 52155.8, 300 sec: 52873.1). Total num frames: 5619548160. Throughput: 0: 52704.5. Samples: 110135500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:33:09,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 09:33:10,962][52263] Updated weights for policy 0, policy_version 342998 (0.0038) [2024-04-27 09:33:13,978][52263] Updated weights for policy 0, policy_version 343008 (0.0033) [2024-04-27 09:33:14,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5619843072. Throughput: 0: 52799.1. Samples: 110292980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:33:14,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 09:33:17,057][52263] Updated weights for policy 0, policy_version 343018 (0.0038) [2024-04-27 09:33:19,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52701.7, 300 sec: 52873.1). Total num frames: 5620088832. Throughput: 0: 52741.1. Samples: 110610480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:33:19,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 09:33:20,198][52263] Updated weights for policy 0, policy_version 343028 (0.0034) [2024-04-27 09:33:23,319][52263] Updated weights for policy 0, policy_version 343038 (0.0031) [2024-04-27 09:33:24,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5620350976. Throughput: 0: 52708.3. Samples: 110925800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 09:33:24,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 09:33:26,472][52263] Updated weights for policy 0, policy_version 343048 (0.0037) [2024-04-27 09:33:29,106][52031] Fps is (10 sec: 54068.3, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5620629504. Throughput: 0: 52691.2. Samples: 111080820. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:33:29,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 09:33:29,601][52263] Updated weights for policy 0, policy_version 343058 (0.0035) [2024-04-27 09:33:32,678][52263] Updated weights for policy 0, policy_version 343068 (0.0033) [2024-04-27 09:33:34,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52701.9, 300 sec: 52928.7). Total num frames: 5620891648. Throughput: 0: 52682.4. Samples: 111398900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:33:34,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 09:33:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000343072_5620891648.pth... [2024-04-27 09:33:34,173][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000342298_5608210432.pth [2024-04-27 09:33:35,910][52263] Updated weights for policy 0, policy_version 343078 (0.0031) [2024-04-27 09:33:38,889][52263] Updated weights for policy 0, policy_version 343088 (0.0030) [2024-04-27 09:33:39,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.9, 300 sec: 52928.7). Total num frames: 5621153792. Throughput: 0: 52646.0. Samples: 111715280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:33:39,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:33:42,113][52263] Updated weights for policy 0, policy_version 343098 (0.0030) [2024-04-27 09:33:44,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5621399552. Throughput: 0: 52552.9. Samples: 111866580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:33:44,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 09:33:45,084][52263] Updated weights for policy 0, policy_version 343108 (0.0028) [2024-04-27 09:33:48,241][52263] Updated weights for policy 0, policy_version 343118 (0.0037) [2024-04-27 09:33:49,106][52031] Fps is (10 sec: 50790.0, 60 sec: 52155.9, 300 sec: 52762.1). 
Total num frames: 5621661696. Throughput: 0: 52551.1. Samples: 112182240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:33:49,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 09:33:51,413][52263] Updated weights for policy 0, policy_version 343128 (0.0033) [2024-04-27 09:33:54,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5621956608. Throughput: 0: 52449.3. Samples: 112495720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:33:54,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 09:33:54,314][52263] Updated weights for policy 0, policy_version 343138 (0.0035) [2024-04-27 09:33:57,740][52263] Updated weights for policy 0, policy_version 343148 (0.0036) [2024-04-27 09:33:59,106][52031] Fps is (10 sec: 54067.4, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5622202368. Throughput: 0: 52654.3. Samples: 112662420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:33:59,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 09:34:00,635][52263] Updated weights for policy 0, policy_version 343158 (0.0030) [2024-04-27 09:34:04,001][52263] Updated weights for policy 0, policy_version 343168 (0.0031) [2024-04-27 09:34:04,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52428.9, 300 sec: 52873.1). Total num frames: 5622464512. Throughput: 0: 52509.2. Samples: 112973380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:34:04,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 09:34:07,260][52263] Updated weights for policy 0, policy_version 343178 (0.0029) [2024-04-27 09:34:09,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5622726656. Throughput: 0: 52587.7. Samples: 113292240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:34:09,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 09:34:10,322][52263] Updated weights for policy 0, policy_version 343188 (0.0028) [2024-04-27 09:34:11,615][52242] Signal inference workers to stop experience collection... (1600 times) [2024-04-27 09:34:11,616][52242] Signal inference workers to resume experience collection... (1600 times) [2024-04-27 09:34:11,642][52263] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-04-27 09:34:11,643][52263] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-04-27 09:34:13,264][52263] Updated weights for policy 0, policy_version 343198 (0.0031) [2024-04-27 09:34:14,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52155.8, 300 sec: 52762.1). Total num frames: 5622972416. Throughput: 0: 52724.5. Samples: 113453420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:34:14,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 09:34:16,457][52263] Updated weights for policy 0, policy_version 343208 (0.0030) [2024-04-27 09:34:19,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5623250944. Throughput: 0: 52629.6. Samples: 113767240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:34:19,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 09:34:19,509][52263] Updated weights for policy 0, policy_version 343218 (0.0031) [2024-04-27 09:34:22,623][52263] Updated weights for policy 0, policy_version 343228 (0.0031) [2024-04-27 09:34:24,107][52031] Fps is (10 sec: 55704.9, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5623529472. Throughput: 0: 52589.2. Samples: 114081800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:34:24,107][52031] Avg episode reward: [(0, '0.465')] [2024-04-27 09:34:25,814][52263] Updated weights for policy 0, policy_version 343238 (0.0026) [2024-04-27 09:34:28,968][52263] Updated weights for policy 0, policy_version 343248 (0.0029) [2024-04-27 09:34:29,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5623791616. Throughput: 0: 52860.5. Samples: 114245300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:34:29,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 09:34:31,906][52263] Updated weights for policy 0, policy_version 343258 (0.0035) [2024-04-27 09:34:34,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5624037376. Throughput: 0: 52791.6. Samples: 114557860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:34:34,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 09:34:35,148][52263] Updated weights for policy 0, policy_version 343268 (0.0033) [2024-04-27 09:34:37,998][52263] Updated weights for policy 0, policy_version 343278 (0.0030) [2024-04-27 09:34:39,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5624299520. Throughput: 0: 52885.4. Samples: 114875560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:34:39,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 09:34:41,310][52263] Updated weights for policy 0, policy_version 343288 (0.0033) [2024-04-27 09:34:44,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52975.0, 300 sec: 52706.5). Total num frames: 5624578048. Throughput: 0: 52579.5. Samples: 115028500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:34:44,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 09:34:44,282][52263] Updated weights for policy 0, policy_version 343298 (0.0036) [2024-04-27 09:34:47,533][52263] Updated weights for policy 0, policy_version 343308 (0.0031) [2024-04-27 09:34:49,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52975.0, 300 sec: 52762.1). Total num frames: 5624840192. Throughput: 0: 52689.4. Samples: 115344400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:34:49,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 09:34:50,594][52263] Updated weights for policy 0, policy_version 343318 (0.0037) [2024-04-27 09:34:53,761][52263] Updated weights for policy 0, policy_version 343328 (0.0025) [2024-04-27 09:34:54,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52155.7, 300 sec: 52762.0). Total num frames: 5625085952. Throughput: 0: 52596.7. Samples: 115659100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:34:54,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 09:34:56,947][52263] Updated weights for policy 0, policy_version 343338 (0.0030) [2024-04-27 09:34:59,107][52031] Fps is (10 sec: 50789.3, 60 sec: 52428.7, 300 sec: 52817.6). Total num frames: 5625348096. Throughput: 0: 52485.1. Samples: 115815260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:34:59,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 09:35:00,005][52263] Updated weights for policy 0, policy_version 343348 (0.0036) [2024-04-27 09:35:03,010][52263] Updated weights for policy 0, policy_version 343358 (0.0030) [2024-04-27 09:35:04,107][52031] Fps is (10 sec: 54067.5, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5625626624. Throughput: 0: 52519.2. Samples: 116130600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:04,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 09:35:06,278][52263] Updated weights for policy 0, policy_version 343368 (0.0034) [2024-04-27 09:35:09,106][52031] Fps is (10 sec: 54068.1, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5625888768. Throughput: 0: 52497.9. Samples: 116444200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:09,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 09:35:09,268][52263] Updated weights for policy 0, policy_version 343378 (0.0031) [2024-04-27 09:35:12,481][52263] Updated weights for policy 0, policy_version 343388 (0.0032) [2024-04-27 09:35:14,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5626150912. Throughput: 0: 52306.7. Samples: 116599100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:14,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 09:35:15,632][52263] Updated weights for policy 0, policy_version 343398 (0.0031) [2024-04-27 09:35:18,325][52242] Signal inference workers to stop experience collection... (1650 times) [2024-04-27 09:35:18,384][52263] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-04-27 09:35:18,384][52242] Signal inference workers to resume experience collection... (1650 times) [2024-04-27 09:35:18,400][52263] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-04-27 09:35:18,647][52263] Updated weights for policy 0, policy_version 343408 (0.0031) [2024-04-27 09:35:19,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52429.0, 300 sec: 52651.0). Total num frames: 5626396672. Throughput: 0: 52358.8. Samples: 116914000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:19,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 09:35:21,757][52263] Updated weights for policy 0, policy_version 343418 (0.0037) [2024-04-27 09:35:24,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52155.8, 300 sec: 52762.0). Total num frames: 5626658816. Throughput: 0: 52373.8. Samples: 117232380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:24,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 09:35:25,143][52263] Updated weights for policy 0, policy_version 343428 (0.0038) [2024-04-27 09:35:27,846][52263] Updated weights for policy 0, policy_version 343438 (0.0030) [2024-04-27 09:35:29,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52155.7, 300 sec: 52706.5). Total num frames: 5626920960. Throughput: 0: 52391.5. Samples: 117386120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 09:35:31,449][52263] Updated weights for policy 0, policy_version 343448 (0.0031) [2024-04-27 09:35:34,074][52263] Updated weights for policy 0, policy_version 343458 (0.0026) [2024-04-27 09:35:34,107][52031] Fps is (10 sec: 55705.2, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5627215872. Throughput: 0: 52382.9. Samples: 117701640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:34,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 09:35:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000343458_5627215872.pth... [2024-04-27 09:35:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000342686_5614567424.pth [2024-04-27 09:35:37,627][52263] Updated weights for policy 0, policy_version 343468 (0.0032) [2024-04-27 09:35:39,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5627461632. Throughput: 0: 52463.2. Samples: 118019940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:39,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 09:35:40,167][52263] Updated weights for policy 0, policy_version 343478 (0.0029) [2024-04-27 09:35:43,810][52263] Updated weights for policy 0, policy_version 343488 (0.0031) [2024-04-27 09:35:44,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52428.7, 300 sec: 52650.9). Total num frames: 5627723776. Throughput: 0: 52629.2. Samples: 118183580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:44,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 09:35:46,534][52263] Updated weights for policy 0, policy_version 343498 (0.0031) [2024-04-27 09:35:49,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5627985920. Throughput: 0: 52769.9. Samples: 118505240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:49,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 09:35:50,033][52263] Updated weights for policy 0, policy_version 343508 (0.0030) [2024-04-27 09:35:52,776][52263] Updated weights for policy 0, policy_version 343518 (0.0028) [2024-04-27 09:35:54,106][52031] Fps is (10 sec: 52430.0, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5628248064. Throughput: 0: 52848.4. Samples: 118822380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 09:35:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 09:35:56,155][52263] Updated weights for policy 0, policy_version 343528 (0.0030) [2024-04-27 09:35:59,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5628510208. Throughput: 0: 52810.3. Samples: 118975560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:35:59,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 09:35:59,208][52263] Updated weights for policy 0, policy_version 343538 (0.0028) [2024-04-27 09:36:02,214][52263] Updated weights for policy 0, policy_version 343548 (0.0033) [2024-04-27 09:36:04,106][52031] Fps is (10 sec: 54067.0, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5628788736. Throughput: 0: 52933.6. Samples: 119296020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:04,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 09:36:05,333][52263] Updated weights for policy 0, policy_version 343558 (0.0030) [2024-04-27 09:36:08,448][52263] Updated weights for policy 0, policy_version 343568 (0.0033) [2024-04-27 09:36:09,107][52031] Fps is (10 sec: 55704.8, 60 sec: 52974.8, 300 sec: 52706.5). Total num frames: 5629067264. Throughput: 0: 52998.6. Samples: 119617320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:09,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 09:36:11,440][52263] Updated weights for policy 0, policy_version 343578 (0.0030) [2024-04-27 09:36:14,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5629313024. Throughput: 0: 53064.5. Samples: 119774020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:14,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 09:36:14,584][52263] Updated weights for policy 0, policy_version 343588 (0.0027) [2024-04-27 09:36:17,609][52263] Updated weights for policy 0, policy_version 343598 (0.0035) [2024-04-27 09:36:19,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5629558784. Throughput: 0: 53139.7. Samples: 120092920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:19,107][52031] Avg episode reward: [(0, '0.487')] [2024-04-27 09:36:20,942][52263] Updated weights for policy 0, policy_version 343608 (0.0030) [2024-04-27 09:36:23,886][52263] Updated weights for policy 0, policy_version 343618 (0.0040) [2024-04-27 09:36:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.0, 300 sec: 52706.5). Total num frames: 5629837312. Throughput: 0: 52957.4. Samples: 120403020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:24,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 09:36:27,136][52263] Updated weights for policy 0, policy_version 343628 (0.0025) [2024-04-27 09:36:29,049][52242] Signal inference workers to stop experience collection... (1700 times) [2024-04-27 09:36:29,050][52242] Signal inference workers to resume experience collection... (1700 times) [2024-04-27 09:36:29,079][52263] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-04-27 09:36:29,080][52263] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-04-27 09:36:29,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53248.0, 300 sec: 52762.0). Total num frames: 5630115840. Throughput: 0: 52903.3. Samples: 120564220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:29,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 09:36:29,975][52263] Updated weights for policy 0, policy_version 343638 (0.0032) [2024-04-27 09:36:33,241][52263] Updated weights for policy 0, policy_version 343648 (0.0033) [2024-04-27 09:36:34,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5630377984. Throughput: 0: 52748.3. Samples: 120878920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:34,107][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 09:36:36,067][52263] Updated weights for policy 0, policy_version 343658 (0.0029) [2024-04-27 09:36:39,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52701.8, 300 sec: 52650.9). Total num frames: 5630623744. Throughput: 0: 52783.4. Samples: 121197640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:39,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 09:36:39,416][52263] Updated weights for policy 0, policy_version 343668 (0.0034) [2024-04-27 09:36:42,514][52263] Updated weights for policy 0, policy_version 343678 (0.0033) [2024-04-27 09:36:44,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52975.1, 300 sec: 52762.0). Total num frames: 5630902272. Throughput: 0: 52912.4. Samples: 121356620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:44,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 09:36:45,656][52263] Updated weights for policy 0, policy_version 343688 (0.0033) [2024-04-27 09:36:48,853][52263] Updated weights for policy 0, policy_version 343698 (0.0027) [2024-04-27 09:36:49,107][52031] Fps is (10 sec: 52429.1, 60 sec: 52701.8, 300 sec: 52651.0). Total num frames: 5631148032. Throughput: 0: 52767.1. Samples: 121670540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:49,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 09:36:51,871][52263] Updated weights for policy 0, policy_version 343708 (0.0028) [2024-04-27 09:36:54,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5631426560. Throughput: 0: 52646.7. Samples: 121986420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:54,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 09:36:55,114][52263] Updated weights for policy 0, policy_version 343718 (0.0030) [2024-04-27 09:36:57,964][52263] Updated weights for policy 0, policy_version 343728 (0.0029) [2024-04-27 09:36:59,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53521.0, 300 sec: 52817.6). Total num frames: 5631721472. Throughput: 0: 52827.5. Samples: 122151260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:36:59,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 09:37:01,144][52263] Updated weights for policy 0, policy_version 343738 (0.0032) [2024-04-27 09:37:04,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52701.8, 300 sec: 52651.0). Total num frames: 5631950848. Throughput: 0: 52858.1. Samples: 122471540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:37:04,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:37:04,197][52263] Updated weights for policy 0, policy_version 343748 (0.0029) [2024-04-27 09:37:07,350][52263] Updated weights for policy 0, policy_version 343758 (0.0030) [2024-04-27 09:37:09,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52428.9, 300 sec: 52706.5). Total num frames: 5632212992. Throughput: 0: 52910.3. Samples: 122783980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 09:37:09,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 09:37:10,607][52263] Updated weights for policy 0, policy_version 343768 (0.0031) [2024-04-27 09:37:13,422][52263] Updated weights for policy 0, policy_version 343778 (0.0029) [2024-04-27 09:37:14,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5632475136. Throughput: 0: 52640.0. Samples: 122933020. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:14,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 09:37:16,864][52263] Updated weights for policy 0, policy_version 343788 (0.0026) [2024-04-27 09:37:19,107][52031] Fps is (10 sec: 52427.9, 60 sec: 52974.8, 300 sec: 52650.9). Total num frames: 5632737280. Throughput: 0: 52867.9. Samples: 123257980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:19,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 09:37:19,599][52263] Updated weights for policy 0, policy_version 343798 (0.0029) [2024-04-27 09:37:22,901][52263] Updated weights for policy 0, policy_version 343808 (0.0037) [2024-04-27 09:37:24,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5633015808. Throughput: 0: 52809.9. Samples: 123574080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:24,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 09:37:25,762][52263] Updated weights for policy 0, policy_version 343818 (0.0033) [2024-04-27 09:37:29,003][52263] Updated weights for policy 0, policy_version 343828 (0.0033) [2024-04-27 09:37:29,106][52031] Fps is (10 sec: 54068.1, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5633277952. Throughput: 0: 52893.9. Samples: 123736840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:29,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 09:37:30,072][52242] Signal inference workers to stop experience collection... (1750 times) [2024-04-27 09:37:30,073][52242] Signal inference workers to resume experience collection... 
(1750 times) [2024-04-27 09:37:30,096][52263] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-04-27 09:37:30,096][52263] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-04-27 09:37:32,046][52263] Updated weights for policy 0, policy_version 343838 (0.0029) [2024-04-27 09:37:34,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5633540096. Throughput: 0: 53037.4. Samples: 124057220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:34,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 09:37:34,132][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000343845_5633556480.pth... [2024-04-27 09:37:34,188][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000343072_5620891648.pth [2024-04-27 09:37:35,101][52263] Updated weights for policy 0, policy_version 343848 (0.0036) [2024-04-27 09:37:38,435][52263] Updated weights for policy 0, policy_version 343858 (0.0030) [2024-04-27 09:37:39,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52975.1, 300 sec: 52706.5). Total num frames: 5633802240. Throughput: 0: 53077.9. Samples: 124374920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:39,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 09:37:41,246][52263] Updated weights for policy 0, policy_version 343868 (0.0035) [2024-04-27 09:37:44,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52701.9, 300 sec: 52651.0). Total num frames: 5634064384. Throughput: 0: 52834.2. Samples: 124528800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:44,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 09:37:44,647][52263] Updated weights for policy 0, policy_version 343878 (0.0026) [2024-04-27 09:37:47,270][52263] Updated weights for policy 0, policy_version 343888 (0.0033) [2024-04-27 09:37:49,107][52031] Fps is (10 sec: 55704.5, 60 sec: 53521.0, 300 sec: 52873.1). 
Total num frames: 5634359296. Throughput: 0: 52829.7. Samples: 124848880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:49,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 09:37:50,762][52263] Updated weights for policy 0, policy_version 343898 (0.0027) [2024-04-27 09:37:53,367][52263] Updated weights for policy 0, policy_version 343908 (0.0032) [2024-04-27 09:37:54,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5634621440. Throughput: 0: 52917.2. Samples: 125165260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:54,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 09:37:56,927][52263] Updated weights for policy 0, policy_version 343918 (0.0030) [2024-04-27 09:37:59,107][52031] Fps is (10 sec: 50790.3, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5634867200. Throughput: 0: 53311.4. Samples: 125332040. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:37:59,115][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 09:37:59,873][52263] Updated weights for policy 0, policy_version 343928 (0.0038) [2024-04-27 09:38:03,149][52263] Updated weights for policy 0, policy_version 343938 (0.0029) [2024-04-27 09:38:04,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5635129344. Throughput: 0: 53070.8. Samples: 125646160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:38:04,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 09:38:05,884][52263] Updated weights for policy 0, policy_version 343948 (0.0029) [2024-04-27 09:38:09,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5635391488. Throughput: 0: 53087.7. Samples: 125963020. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:38:09,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 09:38:09,161][52263] Updated weights for policy 0, policy_version 343958 (0.0032) [2024-04-27 09:38:11,914][52263] Updated weights for policy 0, policy_version 343968 (0.0034) [2024-04-27 09:38:14,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5635670016. Throughput: 0: 53055.0. Samples: 126124320. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:38:14,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:38:15,310][52263] Updated weights for policy 0, policy_version 343978 (0.0028) [2024-04-27 09:38:18,155][52263] Updated weights for policy 0, policy_version 343988 (0.0027) [2024-04-27 09:38:19,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.1, 300 sec: 52873.1). Total num frames: 5635948544. Throughput: 0: 52988.0. Samples: 126441680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:38:19,107][52031] Avg episode reward: [(0, '0.674')] [2024-04-27 09:38:21,680][52263] Updated weights for policy 0, policy_version 343998 (0.0036) [2024-04-27 09:38:24,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.9, 300 sec: 52817.5). Total num frames: 5636210688. Throughput: 0: 52975.7. Samples: 126758840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:38:24,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 09:38:24,329][52263] Updated weights for policy 0, policy_version 344008 (0.0030) [2024-04-27 09:38:27,820][52263] Updated weights for policy 0, policy_version 344018 (0.0034) [2024-04-27 09:38:29,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5636472832. Throughput: 0: 53057.3. Samples: 126916380. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 09:38:29,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 09:38:30,387][52263] Updated weights for policy 0, policy_version 344028 (0.0031) [2024-04-27 09:38:34,052][52263] Updated weights for policy 0, policy_version 344038 (0.0028) [2024-04-27 09:38:34,106][52031] Fps is (10 sec: 50791.2, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5636718592. Throughput: 0: 52997.1. Samples: 127233740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:38:34,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 09:38:36,403][52263] Updated weights for policy 0, policy_version 344048 (0.0030) [2024-04-27 09:38:39,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53247.8, 300 sec: 52873.1). Total num frames: 5636997120. Throughput: 0: 53131.5. Samples: 127556180. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:38:39,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 09:38:40,150][52263] Updated weights for policy 0, policy_version 344058 (0.0032) [2024-04-27 09:38:42,920][52263] Updated weights for policy 0, policy_version 344068 (0.0030) [2024-04-27 09:38:44,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5637259264. Throughput: 0: 53040.2. Samples: 127718840. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:38:44,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 09:38:46,279][52263] Updated weights for policy 0, policy_version 344078 (0.0030) [2024-04-27 09:38:48,404][52242] Signal inference workers to stop experience collection... (1800 times) [2024-04-27 09:38:48,404][52242] Signal inference workers to resume experience collection... 
(1800 times) [2024-04-27 09:38:48,433][52263] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-04-27 09:38:48,433][52263] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-04-27 09:38:48,995][52263] Updated weights for policy 0, policy_version 344088 (0.0026) [2024-04-27 09:38:49,106][52031] Fps is (10 sec: 54068.2, 60 sec: 52975.1, 300 sec: 52817.6). Total num frames: 5637537792. Throughput: 0: 53047.2. Samples: 128033280. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:38:49,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 09:38:52,518][52263] Updated weights for policy 0, policy_version 344098 (0.0032) [2024-04-27 09:38:54,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5637767168. Throughput: 0: 53004.8. Samples: 128348240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:38:54,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 09:38:55,348][52263] Updated weights for policy 0, policy_version 344108 (0.0031) [2024-04-27 09:38:58,783][52263] Updated weights for policy 0, policy_version 344118 (0.0031) [2024-04-27 09:38:59,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52975.1, 300 sec: 52817.6). Total num frames: 5638045696. Throughput: 0: 52821.0. Samples: 128501260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:38:59,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 09:39:01,470][52263] Updated weights for policy 0, policy_version 344128 (0.0030) [2024-04-27 09:39:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5638291456. Throughput: 0: 52940.1. Samples: 128823980. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:39:04,107][52031] Avg episode reward: [(0, '0.470')] [2024-04-27 09:39:04,953][52263] Updated weights for policy 0, policy_version 344138 (0.0031) [2024-04-27 09:39:07,770][52263] Updated weights for policy 0, policy_version 344148 (0.0031) [2024-04-27 09:39:09,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.9, 300 sec: 52928.6). Total num frames: 5638586368. Throughput: 0: 52771.2. Samples: 129133540. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:39:09,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 09:39:11,283][52263] Updated weights for policy 0, policy_version 344158 (0.0029) [2024-04-27 09:39:14,081][52263] Updated weights for policy 0, policy_version 344168 (0.0042) [2024-04-27 09:39:14,106][52031] Fps is (10 sec: 55705.7, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5638848512. Throughput: 0: 52756.1. Samples: 129290400. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:39:14,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 09:39:17,446][52263] Updated weights for policy 0, policy_version 344178 (0.0032) [2024-04-27 09:39:19,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52428.9, 300 sec: 52762.1). Total num frames: 5639094272. Throughput: 0: 52708.5. Samples: 129605620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:39:19,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 09:39:20,105][52263] Updated weights for policy 0, policy_version 344188 (0.0029) [2024-04-27 09:39:23,669][52263] Updated weights for policy 0, policy_version 344198 (0.0031) [2024-04-27 09:39:24,106][52031] Fps is (10 sec: 49151.9, 60 sec: 52155.9, 300 sec: 52706.5). Total num frames: 5639340032. Throughput: 0: 52697.1. Samples: 129927540. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:39:24,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 09:39:26,234][52263] Updated weights for policy 0, policy_version 344208 (0.0029) [2024-04-27 09:39:29,106][52031] Fps is (10 sec: 50790.1, 60 sec: 52155.8, 300 sec: 52762.0). Total num frames: 5639602176. Throughput: 0: 52367.1. Samples: 130075360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:39:29,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 09:39:30,031][52263] Updated weights for policy 0, policy_version 344218 (0.0031) [2024-04-27 09:39:32,879][52263] Updated weights for policy 0, policy_version 344228 (0.0032) [2024-04-27 09:39:34,107][52031] Fps is (10 sec: 57343.6, 60 sec: 53247.9, 300 sec: 52928.6). Total num frames: 5639913472. Throughput: 0: 52498.6. Samples: 130395720. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:39:34,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 09:39:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000344233_5639913472.pth... [2024-04-27 09:39:34,167][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000343458_5627215872.pth [2024-04-27 09:39:36,182][52263] Updated weights for policy 0, policy_version 344238 (0.0030) [2024-04-27 09:39:39,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5640142848. Throughput: 0: 52436.4. Samples: 130707880. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:39:39,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 09:39:39,178][52263] Updated weights for policy 0, policy_version 344248 (0.0033) [2024-04-27 09:39:42,293][52263] Updated weights for policy 0, policy_version 344258 (0.0032) [2024-04-27 09:39:43,667][52242] Signal inference workers to stop experience collection... (1850 times) [2024-04-27 09:39:43,668][52242] Signal inference workers to resume experience collection... 
(1850 times) [2024-04-27 09:39:43,694][52263] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-04-27 09:39:43,694][52263] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-04-27 09:39:44,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5640421376. Throughput: 0: 52738.6. Samples: 130874500. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 09:39:44,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 09:39:45,389][52263] Updated weights for policy 0, policy_version 344268 (0.0027) [2024-04-27 09:39:48,481][52263] Updated weights for policy 0, policy_version 344278 (0.0034) [2024-04-27 09:39:49,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52155.6, 300 sec: 52817.6). Total num frames: 5640667136. Throughput: 0: 52572.3. Samples: 131189740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:39:49,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 09:39:51,714][52263] Updated weights for policy 0, policy_version 344288 (0.0029) [2024-04-27 09:39:54,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5640929280. Throughput: 0: 52785.7. Samples: 131508900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:39:54,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 09:39:54,822][52263] Updated weights for policy 0, policy_version 344298 (0.0029) [2024-04-27 09:39:57,760][52263] Updated weights for policy 0, policy_version 344308 (0.0033) [2024-04-27 09:39:59,107][52031] Fps is (10 sec: 55705.7, 60 sec: 52974.8, 300 sec: 52873.1). Total num frames: 5641224192. Throughput: 0: 52653.2. Samples: 131659800. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:39:59,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 09:40:00,943][52263] Updated weights for policy 0, policy_version 344318 (0.0036) [2024-04-27 09:40:03,867][52263] Updated weights for policy 0, policy_version 344328 (0.0034) [2024-04-27 09:40:04,107][52031] Fps is (10 sec: 54067.4, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5641469952. Throughput: 0: 52667.4. Samples: 131975660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:04,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 09:40:07,330][52263] Updated weights for policy 0, policy_version 344338 (0.0035) [2024-04-27 09:40:09,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5641748480. Throughput: 0: 52547.6. Samples: 132292180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:09,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 09:40:10,054][52263] Updated weights for policy 0, policy_version 344348 (0.0035) [2024-04-27 09:40:13,528][52263] Updated weights for policy 0, policy_version 344358 (0.0033) [2024-04-27 09:40:14,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52428.7, 300 sec: 52873.1). Total num frames: 5641994240. Throughput: 0: 52858.6. Samples: 132454000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:14,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 09:40:16,341][52263] Updated weights for policy 0, policy_version 344368 (0.0036) [2024-04-27 09:40:19,106][52031] Fps is (10 sec: 47513.3, 60 sec: 52155.7, 300 sec: 52762.0). Total num frames: 5642223616. Throughput: 0: 52686.3. Samples: 132766600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:19,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 09:40:19,768][52263] Updated weights for policy 0, policy_version 344378 (0.0030) [2024-04-27 09:40:22,479][52263] Updated weights for policy 0, policy_version 344388 (0.0032) [2024-04-27 09:40:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 52928.7). Total num frames: 5642534912. Throughput: 0: 52712.9. Samples: 133079960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:24,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 09:40:25,946][52263] Updated weights for policy 0, policy_version 344398 (0.0027) [2024-04-27 09:40:29,023][52263] Updated weights for policy 0, policy_version 344408 (0.0030) [2024-04-27 09:40:29,106][52031] Fps is (10 sec: 55705.4, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5642780672. Throughput: 0: 52660.0. Samples: 133244200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:29,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 09:40:32,126][52263] Updated weights for policy 0, policy_version 344418 (0.0034) [2024-04-27 09:40:34,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52155.7, 300 sec: 52817.6). Total num frames: 5643042816. Throughput: 0: 52688.0. Samples: 133560700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:34,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 09:40:35,080][52263] Updated weights for policy 0, policy_version 344428 (0.0028) [2024-04-27 09:40:38,289][52263] Updated weights for policy 0, policy_version 344438 (0.0031) [2024-04-27 09:40:39,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52975.0, 300 sec: 52873.2). Total num frames: 5643321344. Throughput: 0: 52591.3. Samples: 133875500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:39,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 09:40:41,525][52263] Updated weights for policy 0, policy_version 344448 (0.0035) [2024-04-27 09:40:44,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52428.7, 300 sec: 52817.5). Total num frames: 5643567104. Throughput: 0: 52880.4. Samples: 134039420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:44,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 09:40:44,336][52263] Updated weights for policy 0, policy_version 344458 (0.0030) [2024-04-27 09:40:47,726][52263] Updated weights for policy 0, policy_version 344468 (0.0033) [2024-04-27 09:40:49,106][52031] Fps is (10 sec: 52428.6, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5643845632. Throughput: 0: 52784.5. Samples: 134350960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:49,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 09:40:50,153][52242] Signal inference workers to stop experience collection... (1900 times) [2024-04-27 09:40:50,206][52242] Signal inference workers to resume experience collection... (1900 times) [2024-04-27 09:40:50,206][52263] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-04-27 09:40:50,218][52263] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-04-27 09:40:50,662][52263] Updated weights for policy 0, policy_version 344478 (0.0030) [2024-04-27 09:40:54,030][52263] Updated weights for policy 0, policy_version 344488 (0.0034) [2024-04-27 09:40:54,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52701.8, 300 sec: 52817.5). Total num frames: 5644091392. Throughput: 0: 52851.3. Samples: 134670500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:54,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 09:40:56,766][52263] Updated weights for policy 0, policy_version 344498 (0.0031) [2024-04-27 09:40:59,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5644369920. Throughput: 0: 52823.1. Samples: 134831040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-04-27 09:40:59,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:41:00,132][52263] Updated weights for policy 0, policy_version 344508 (0.0031) [2024-04-27 09:41:02,933][52263] Updated weights for policy 0, policy_version 344518 (0.0030) [2024-04-27 09:41:04,107][52031] Fps is (10 sec: 54067.6, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5644632064. Throughput: 0: 52891.5. Samples: 135146720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:04,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 09:41:06,315][52263] Updated weights for policy 0, policy_version 344528 (0.0031) [2024-04-27 09:41:09,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5644894208. Throughput: 0: 52987.5. Samples: 135464400. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:09,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 09:41:09,208][52263] Updated weights for policy 0, policy_version 344538 (0.0028) [2024-04-27 09:41:12,502][52263] Updated weights for policy 0, policy_version 344548 (0.0029) [2024-04-27 09:41:14,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52702.0, 300 sec: 52873.1). Total num frames: 5645156352. Throughput: 0: 52836.6. Samples: 135621840. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:14,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 09:41:15,365][52263] Updated weights for policy 0, policy_version 344558 (0.0030) [2024-04-27 09:41:18,619][52263] Updated weights for policy 0, policy_version 344568 (0.0029) [2024-04-27 09:41:19,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 52928.7). Total num frames: 5645451264. Throughput: 0: 52765.9. Samples: 135935160. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:19,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:41:21,476][52263] Updated weights for policy 0, policy_version 344578 (0.0044) [2024-04-27 09:41:24,106][52031] Fps is (10 sec: 52428.3, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5645680640. Throughput: 0: 52937.2. Samples: 136257680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:24,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 09:41:24,706][52263] Updated weights for policy 0, policy_version 344588 (0.0035) [2024-04-27 09:41:27,603][52263] Updated weights for policy 0, policy_version 344598 (0.0032) [2024-04-27 09:41:29,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5645959168. Throughput: 0: 52765.9. Samples: 136413880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:29,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 09:41:30,880][52263] Updated weights for policy 0, policy_version 344608 (0.0031) [2024-04-27 09:41:33,786][52263] Updated weights for policy 0, policy_version 344618 (0.0028) [2024-04-27 09:41:34,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.0, 300 sec: 52928.7). Total num frames: 5646237696. Throughput: 0: 52931.5. Samples: 136732880. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:34,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 09:41:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000344619_5646237696.pth... [2024-04-27 09:41:34,167][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000343845_5633556480.pth [2024-04-27 09:41:37,101][52263] Updated weights for policy 0, policy_version 344628 (0.0031) [2024-04-27 09:41:39,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.7, 300 sec: 52817.6). Total num frames: 5646483456. Throughput: 0: 52930.2. Samples: 137052360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:39,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 09:41:39,885][52263] Updated weights for policy 0, policy_version 344638 (0.0027) [2024-04-27 09:41:43,181][52263] Updated weights for policy 0, policy_version 344648 (0.0037) [2024-04-27 09:41:44,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.2, 300 sec: 52928.7). Total num frames: 5646761984. Throughput: 0: 52919.7. Samples: 137212420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:44,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 09:41:45,994][52263] Updated weights for policy 0, policy_version 344658 (0.0032) [2024-04-27 09:41:49,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5647024128. Throughput: 0: 52893.0. Samples: 137526900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:49,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 09:41:49,305][52263] Updated weights for policy 0, policy_version 344668 (0.0030) [2024-04-27 09:41:51,743][52242] Signal inference workers to stop experience collection... (1950 times) [2024-04-27 09:41:51,785][52263] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-04-27 09:41:51,807][52242] Signal inference workers to resume experience collection... 
(1950 times) [2024-04-27 09:41:51,807][52263] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-04-27 09:41:52,289][52263] Updated weights for policy 0, policy_version 344678 (0.0028) [2024-04-27 09:41:54,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53248.1, 300 sec: 52762.0). Total num frames: 5647286272. Throughput: 0: 52926.1. Samples: 137846080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:54,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 09:41:55,440][52263] Updated weights for policy 0, policy_version 344688 (0.0030) [2024-04-27 09:41:58,691][52263] Updated weights for policy 0, policy_version 344698 (0.0027) [2024-04-27 09:41:59,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5647548416. Throughput: 0: 53039.4. Samples: 138008620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:41:59,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 09:42:01,698][52263] Updated weights for policy 0, policy_version 344708 (0.0036) [2024-04-27 09:42:04,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5647810560. Throughput: 0: 53028.6. Samples: 138321460. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:42:04,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 09:42:04,747][52263] Updated weights for policy 0, policy_version 344718 (0.0031) [2024-04-27 09:42:08,020][52263] Updated weights for policy 0, policy_version 344728 (0.0029) [2024-04-27 09:42:09,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5648072704. Throughput: 0: 52745.0. Samples: 138631200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:42:09,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 09:42:10,859][52263] Updated weights for policy 0, policy_version 344738 (0.0028) [2024-04-27 09:42:14,106][52031] Fps is (10 sec: 52429.9, 60 sec: 52974.9, 300 sec: 52873.1). 
Total num frames: 5648334848. Throughput: 0: 52697.4. Samples: 138785260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:42:14,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 09:42:14,214][52263] Updated weights for policy 0, policy_version 344748 (0.0028) [2024-04-27 09:42:17,172][52263] Updated weights for policy 0, policy_version 344758 (0.0034) [2024-04-27 09:42:19,107][52031] Fps is (10 sec: 50789.1, 60 sec: 52155.6, 300 sec: 52762.0). Total num frames: 5648580608. Throughput: 0: 52645.6. Samples: 139101940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 09:42:19,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 09:42:20,494][52263] Updated weights for policy 0, policy_version 344768 (0.0031) [2024-04-27 09:42:23,492][52263] Updated weights for policy 0, policy_version 344778 (0.0029) [2024-04-27 09:42:24,106][52031] Fps is (10 sec: 52428.5, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5648859136. Throughput: 0: 52485.0. Samples: 139414180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:42:24,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 09:42:26,671][52263] Updated weights for policy 0, policy_version 344788 (0.0033) [2024-04-27 09:42:29,107][52031] Fps is (10 sec: 54067.3, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5649121280. Throughput: 0: 52588.6. Samples: 139578920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:42:29,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 09:42:29,761][52263] Updated weights for policy 0, policy_version 344798 (0.0035) [2024-04-27 09:42:33,014][52263] Updated weights for policy 0, policy_version 344808 (0.0029) [2024-04-27 09:42:34,106][52031] Fps is (10 sec: 50790.3, 60 sec: 52155.7, 300 sec: 52762.0). Total num frames: 5649367040. Throughput: 0: 52547.5. Samples: 139891540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:42:34,107][52031] Avg episode reward: [(0, '0.486')] [2024-04-27 09:42:36,122][52263] Updated weights for policy 0, policy_version 344818 (0.0031) [2024-04-27 09:42:39,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52701.9, 300 sec: 52817.5). Total num frames: 5649645568. Throughput: 0: 52471.5. Samples: 140207300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:42:39,108][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 09:42:39,251][52263] Updated weights for policy 0, policy_version 344828 (0.0029) [2024-04-27 09:42:42,210][52263] Updated weights for policy 0, policy_version 344838 (0.0028) [2024-04-27 09:42:44,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52155.5, 300 sec: 52651.0). Total num frames: 5649891328. Throughput: 0: 52274.6. Samples: 140360980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:42:44,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 09:42:45,373][52263] Updated weights for policy 0, policy_version 344848 (0.0028) [2024-04-27 09:42:48,524][52263] Updated weights for policy 0, policy_version 344858 (0.0029) [2024-04-27 09:42:49,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5650169856. Throughput: 0: 52349.5. Samples: 140677180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:42:49,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 09:42:51,594][52263] Updated weights for policy 0, policy_version 344868 (0.0027) [2024-04-27 09:42:53,828][52242] Signal inference workers to stop experience collection... (2000 times) [2024-04-27 09:42:53,828][52242] Signal inference workers to resume experience collection... 
(2000 times) [2024-04-27 09:42:53,839][52263] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-04-27 09:42:53,839][52263] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-04-27 09:42:54,107][52031] Fps is (10 sec: 54067.1, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5650432000. Throughput: 0: 52479.7. Samples: 140992800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:42:54,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 09:42:54,699][52263] Updated weights for policy 0, policy_version 344878 (0.0038) [2024-04-27 09:42:58,012][52263] Updated weights for policy 0, policy_version 344888 (0.0035) [2024-04-27 09:42:59,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52155.8, 300 sec: 52706.5). Total num frames: 5650677760. Throughput: 0: 52664.0. Samples: 141155140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:42:59,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 09:43:00,949][52263] Updated weights for policy 0, policy_version 344898 (0.0028) [2024-04-27 09:43:04,099][52263] Updated weights for policy 0, policy_version 344908 (0.0028) [2024-04-27 09:43:04,107][52031] Fps is (10 sec: 54067.8, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5650972672. Throughput: 0: 52629.9. Samples: 141470280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:43:04,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 09:43:07,230][52263] Updated weights for policy 0, policy_version 344918 (0.0038) [2024-04-27 09:43:09,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52155.7, 300 sec: 52651.0). Total num frames: 5651202048. Throughput: 0: 52676.1. Samples: 141784600. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:43:09,107][52031] Avg episode reward: [(0, '0.678')] [2024-04-27 09:43:10,274][52263] Updated weights for policy 0, policy_version 344928 (0.0034) [2024-04-27 09:43:13,553][52263] Updated weights for policy 0, policy_version 344938 (0.0032) [2024-04-27 09:43:14,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5651496960. Throughput: 0: 52580.1. Samples: 141945020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:43:14,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 09:43:16,650][52263] Updated weights for policy 0, policy_version 344948 (0.0034) [2024-04-27 09:43:19,106][52031] Fps is (10 sec: 54066.8, 60 sec: 52702.0, 300 sec: 52651.0). Total num frames: 5651742720. Throughput: 0: 52599.6. Samples: 142258520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:43:19,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 09:43:19,666][52263] Updated weights for policy 0, policy_version 344958 (0.0028) [2024-04-27 09:43:22,777][52263] Updated weights for policy 0, policy_version 344968 (0.0030) [2024-04-27 09:43:24,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52428.6, 300 sec: 52650.9). Total num frames: 5652004864. Throughput: 0: 52658.6. Samples: 142576940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:43:24,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 09:43:25,803][52263] Updated weights for policy 0, policy_version 344978 (0.0034) [2024-04-27 09:43:28,944][52263] Updated weights for policy 0, policy_version 344988 (0.0028) [2024-04-27 09:43:29,106][52031] Fps is (10 sec: 54067.3, 60 sec: 52702.0, 300 sec: 52762.0). Total num frames: 5652283392. Throughput: 0: 52651.3. Samples: 142730280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:43:29,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 09:43:32,023][52263] Updated weights for policy 0, policy_version 344998 (0.0035) [2024-04-27 09:43:34,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52428.8, 300 sec: 52595.4). Total num frames: 5652512768. Throughput: 0: 52680.9. Samples: 143047820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 09:43:34,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 09:43:34,282][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000345004_5652545536.pth... [2024-04-27 09:43:34,335][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000344233_5639913472.pth [2024-04-27 09:43:35,035][52263] Updated weights for policy 0, policy_version 345008 (0.0029) [2024-04-27 09:43:38,272][52263] Updated weights for policy 0, policy_version 345018 (0.0038) [2024-04-27 09:43:39,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5652807680. Throughput: 0: 52667.6. Samples: 143362840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:43:39,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 09:43:41,355][52263] Updated weights for policy 0, policy_version 345028 (0.0029) [2024-04-27 09:43:44,106][52031] Fps is (10 sec: 55705.8, 60 sec: 52975.1, 300 sec: 52650.9). Total num frames: 5653069824. Throughput: 0: 52573.3. Samples: 143520940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:43:44,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 09:43:44,492][52263] Updated weights for policy 0, policy_version 345038 (0.0029) [2024-04-27 09:43:47,631][52263] Updated weights for policy 0, policy_version 345048 (0.0035) [2024-04-27 09:43:49,107][52031] Fps is (10 sec: 52429.1, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5653331968. Throughput: 0: 52681.8. Samples: 143840960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:43:49,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 09:43:50,801][52263] Updated weights for policy 0, policy_version 345058 (0.0029) [2024-04-27 09:43:53,818][52263] Updated weights for policy 0, policy_version 345068 (0.0027) [2024-04-27 09:43:54,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5653594112. Throughput: 0: 52695.0. Samples: 144155880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:43:54,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 09:43:57,026][52263] Updated weights for policy 0, policy_version 345078 (0.0034) [2024-04-27 09:43:59,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.8, 300 sec: 52762.0). Total num frames: 5653856256. Throughput: 0: 52612.3. Samples: 144312580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:43:59,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 09:44:00,071][52263] Updated weights for policy 0, policy_version 345088 (0.0029) [2024-04-27 09:44:01,589][52242] Signal inference workers to stop experience collection... (2050 times) [2024-04-27 09:44:01,589][52242] Signal inference workers to resume experience collection... (2050 times) [2024-04-27 09:44:01,603][52263] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-04-27 09:44:01,603][52263] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-04-27 09:44:03,242][52263] Updated weights for policy 0, policy_version 345098 (0.0029) [2024-04-27 09:44:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52428.9, 300 sec: 52651.0). Total num frames: 5654118400. Throughput: 0: 52689.8. Samples: 144629560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:04,107][52031] Avg episode reward: [(0, '0.500')] [2024-04-27 09:44:06,311][52263] Updated weights for policy 0, policy_version 345108 (0.0028) [2024-04-27 09:44:09,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52974.7, 300 sec: 52650.9). Total num frames: 5654380544. Throughput: 0: 52707.6. Samples: 144948780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:09,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 09:44:09,542][52263] Updated weights for policy 0, policy_version 345118 (0.0031) [2024-04-27 09:44:12,643][52263] Updated weights for policy 0, policy_version 345128 (0.0031) [2024-04-27 09:44:14,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5654659072. Throughput: 0: 52813.8. Samples: 145106900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:14,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 09:44:15,610][52263] Updated weights for policy 0, policy_version 345138 (0.0031) [2024-04-27 09:44:18,764][52263] Updated weights for policy 0, policy_version 345148 (0.0029) [2024-04-27 09:44:19,106][52031] Fps is (10 sec: 54068.4, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5654921216. Throughput: 0: 52898.8. Samples: 145428260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:19,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 09:44:21,863][52263] Updated weights for policy 0, policy_version 345158 (0.0029) [2024-04-27 09:44:24,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52975.0, 300 sec: 52817.5). Total num frames: 5655183360. Throughput: 0: 52860.9. Samples: 145741580. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:24,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 09:44:24,789][52263] Updated weights for policy 0, policy_version 345168 (0.0033) [2024-04-27 09:44:28,054][52263] Updated weights for policy 0, policy_version 345178 (0.0028) [2024-04-27 09:44:29,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52428.8, 300 sec: 52595.4). Total num frames: 5655429120. Throughput: 0: 52970.3. Samples: 145904600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:29,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 09:44:30,990][52263] Updated weights for policy 0, policy_version 345188 (0.0029) [2024-04-27 09:44:34,059][52263] Updated weights for policy 0, policy_version 345198 (0.0027) [2024-04-27 09:44:34,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 52817.6). Total num frames: 5655724032. Throughput: 0: 52916.3. Samples: 146222200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:34,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 09:44:37,361][52263] Updated weights for policy 0, policy_version 345208 (0.0030) [2024-04-27 09:44:39,106][52031] Fps is (10 sec: 55705.7, 60 sec: 52975.1, 300 sec: 52762.1). Total num frames: 5655986176. Throughput: 0: 52969.9. Samples: 146539520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:39,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 09:44:40,379][52263] Updated weights for policy 0, policy_version 345218 (0.0029) [2024-04-27 09:44:43,529][52263] Updated weights for policy 0, policy_version 345228 (0.0032) [2024-04-27 09:44:44,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52701.6, 300 sec: 52762.0). Total num frames: 5656231936. Throughput: 0: 52982.5. Samples: 146696800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:44,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 09:44:46,570][52263] Updated weights for policy 0, policy_version 345238 (0.0027) [2024-04-27 09:44:49,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5656510464. Throughput: 0: 52961.2. Samples: 147012820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:44:49,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:44:49,673][52263] Updated weights for policy 0, policy_version 345248 (0.0033) [2024-04-27 09:44:52,758][52263] Updated weights for policy 0, policy_version 345258 (0.0033) [2024-04-27 09:44:54,107][52031] Fps is (10 sec: 52429.6, 60 sec: 52701.8, 300 sec: 52651.0). Total num frames: 5656756224. Throughput: 0: 52918.8. Samples: 147330120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:44:54,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 09:44:55,997][52263] Updated weights for policy 0, policy_version 345268 (0.0034) [2024-04-27 09:44:58,975][52263] Updated weights for policy 0, policy_version 345278 (0.0030) [2024-04-27 09:44:59,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5657034752. Throughput: 0: 52797.7. Samples: 147482800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:44:59,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 09:45:02,198][52263] Updated weights for policy 0, policy_version 345288 (0.0036) [2024-04-27 09:45:04,107][52031] Fps is (10 sec: 54066.9, 60 sec: 52974.8, 300 sec: 52706.5). Total num frames: 5657296896. Throughput: 0: 52645.1. Samples: 147797300. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:04,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 09:45:05,477][52263] Updated weights for policy 0, policy_version 345298 (0.0034) [2024-04-27 09:45:08,568][52263] Updated weights for policy 0, policy_version 345308 (0.0034) [2024-04-27 09:45:09,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52975.1, 300 sec: 52762.0). Total num frames: 5657559040. Throughput: 0: 52747.7. Samples: 148115220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:09,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 09:45:11,787][52263] Updated weights for policy 0, policy_version 345318 (0.0027) [2024-04-27 09:45:14,106][52031] Fps is (10 sec: 50791.2, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5657804800. Throughput: 0: 52546.2. Samples: 148269180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:14,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 09:45:14,711][52263] Updated weights for policy 0, policy_version 345328 (0.0028) [2024-04-27 09:45:17,913][52263] Updated weights for policy 0, policy_version 345338 (0.0029) [2024-04-27 09:45:19,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52155.7, 300 sec: 52595.4). Total num frames: 5658050560. Throughput: 0: 52510.4. Samples: 148585160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:19,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 09:45:20,922][52263] Updated weights for policy 0, policy_version 345348 (0.0032) [2024-04-27 09:45:23,978][52263] Updated weights for policy 0, policy_version 345358 (0.0029) [2024-04-27 09:45:24,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5658345472. Throughput: 0: 52380.6. Samples: 148896660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:24,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 09:45:27,129][52263] Updated weights for policy 0, policy_version 345368 (0.0031) [2024-04-27 09:45:29,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52701.7, 300 sec: 52706.5). Total num frames: 5658591232. Throughput: 0: 52585.9. Samples: 149063160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:29,107][52031] Avg episode reward: [(0, '0.488')] [2024-04-27 09:45:30,256][52263] Updated weights for policy 0, policy_version 345378 (0.0029) [2024-04-27 09:45:33,309][52263] Updated weights for policy 0, policy_version 345388 (0.0029) [2024-04-27 09:45:34,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52428.9, 300 sec: 52706.5). Total num frames: 5658869760. Throughput: 0: 52549.8. Samples: 149377560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:34,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 09:45:34,194][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000345391_5658886144.pth... [2024-04-27 09:45:34,238][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000344619_5646237696.pth [2024-04-27 09:45:36,507][52263] Updated weights for policy 0, policy_version 345398 (0.0035) [2024-04-27 09:45:39,106][52031] Fps is (10 sec: 50791.5, 60 sec: 51882.6, 300 sec: 52651.0). Total num frames: 5659099136. Throughput: 0: 52386.4. Samples: 149687500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:39,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 09:45:39,135][52242] Signal inference workers to stop experience collection... (2100 times) [2024-04-27 09:45:39,137][52242] Signal inference workers to resume experience collection... 
(2100 times) [2024-04-27 09:45:39,150][52263] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-04-27 09:45:39,184][52263] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-04-27 09:45:39,630][52263] Updated weights for policy 0, policy_version 345408 (0.0028) [2024-04-27 09:45:42,667][52263] Updated weights for policy 0, policy_version 345418 (0.0027) [2024-04-27 09:45:44,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52429.1, 300 sec: 52651.0). Total num frames: 5659377664. Throughput: 0: 52428.1. Samples: 149842060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:44,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 09:45:45,864][52263] Updated weights for policy 0, policy_version 345428 (0.0027) [2024-04-27 09:45:48,734][52263] Updated weights for policy 0, policy_version 345438 (0.0032) [2024-04-27 09:45:49,106][52031] Fps is (10 sec: 55705.3, 60 sec: 52428.8, 300 sec: 52762.1). Total num frames: 5659656192. Throughput: 0: 52480.6. Samples: 150158920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:49,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 09:45:52,239][52263] Updated weights for policy 0, policy_version 345448 (0.0034) [2024-04-27 09:45:54,107][52031] Fps is (10 sec: 54066.2, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5659918336. Throughput: 0: 52468.7. Samples: 150476320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:54,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 09:45:54,991][52263] Updated weights for policy 0, policy_version 345458 (0.0037) [2024-04-27 09:45:58,399][52263] Updated weights for policy 0, policy_version 345468 (0.0030) [2024-04-27 09:45:59,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52701.9, 300 sec: 52762.1). Total num frames: 5660196864. Throughput: 0: 52668.5. Samples: 150639260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:45:59,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:46:01,246][52263] Updated weights for policy 0, policy_version 345478 (0.0035) [2024-04-27 09:46:04,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52428.9, 300 sec: 52706.5). Total num frames: 5660442624. Throughput: 0: 52701.7. Samples: 150956740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:46:04,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 09:46:04,455][52263] Updated weights for policy 0, policy_version 345488 (0.0029) [2024-04-27 09:46:07,405][52263] Updated weights for policy 0, policy_version 345498 (0.0032) [2024-04-27 09:46:09,107][52031] Fps is (10 sec: 49151.0, 60 sec: 52155.6, 300 sec: 52650.9). Total num frames: 5660688384. Throughput: 0: 52853.3. Samples: 151275060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 09:46:09,116][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 09:46:10,744][52263] Updated weights for policy 0, policy_version 345508 (0.0025) [2024-04-27 09:46:13,421][52263] Updated weights for policy 0, policy_version 345518 (0.0028) [2024-04-27 09:46:14,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52701.8, 300 sec: 52595.4). Total num frames: 5660966912. Throughput: 0: 52579.6. Samples: 151429240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:14,115][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 09:46:16,952][52263] Updated weights for policy 0, policy_version 345528 (0.0028) [2024-04-27 09:46:19,106][52031] Fps is (10 sec: 54068.1, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5661229056. Throughput: 0: 52608.0. Samples: 151744920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:19,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 09:46:19,918][52263] Updated weights for policy 0, policy_version 345538 (0.0030) [2024-04-27 09:46:23,112][52263] Updated weights for policy 0, policy_version 345548 (0.0031) [2024-04-27 09:46:24,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5661507584. Throughput: 0: 52651.9. Samples: 152056840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:24,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 09:46:26,344][52263] Updated weights for policy 0, policy_version 345558 (0.0032) [2024-04-27 09:46:29,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52702.0, 300 sec: 52595.4). Total num frames: 5661753344. Throughput: 0: 52756.8. Samples: 152216120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:29,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 09:46:29,332][52263] Updated weights for policy 0, policy_version 345568 (0.0030) [2024-04-27 09:46:32,610][52263] Updated weights for policy 0, policy_version 345578 (0.0027) [2024-04-27 09:46:34,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52155.8, 300 sec: 52595.4). Total num frames: 5661999104. Throughput: 0: 52810.7. Samples: 152535400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:34,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 09:46:35,509][52263] Updated weights for policy 0, policy_version 345588 (0.0031) [2024-04-27 09:46:38,732][52263] Updated weights for policy 0, policy_version 345598 (0.0030) [2024-04-27 09:46:39,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52974.8, 300 sec: 52595.4). Total num frames: 5662277632. Throughput: 0: 52705.4. Samples: 152848060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:39,116][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 09:46:40,348][52242] Signal inference workers to stop experience collection... (2150 times) [2024-04-27 09:46:40,372][52263] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-04-27 09:46:40,408][52242] Signal inference workers to resume experience collection... (2150 times) [2024-04-27 09:46:40,409][52263] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-04-27 09:46:41,813][52263] Updated weights for policy 0, policy_version 345608 (0.0028) [2024-04-27 09:46:44,107][52031] Fps is (10 sec: 55704.6, 60 sec: 52974.8, 300 sec: 52650.9). Total num frames: 5662556160. Throughput: 0: 52657.1. Samples: 153008840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:44,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 09:46:44,748][52263] Updated weights for policy 0, policy_version 345618 (0.0037) [2024-04-27 09:46:48,009][52263] Updated weights for policy 0, policy_version 345628 (0.0025) [2024-04-27 09:46:49,107][52031] Fps is (10 sec: 54067.2, 60 sec: 52701.8, 300 sec: 52651.0). Total num frames: 5662818304. Throughput: 0: 52668.0. Samples: 153326800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:49,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 09:46:51,247][52263] Updated weights for policy 0, policy_version 345638 (0.0028) [2024-04-27 09:46:54,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52702.0, 300 sec: 52651.0). Total num frames: 5663080448. Throughput: 0: 52658.4. Samples: 153644680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:54,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 09:46:54,177][52263] Updated weights for policy 0, policy_version 345648 (0.0032) [2024-04-27 09:46:57,445][52263] Updated weights for policy 0, policy_version 345658 (0.0029) [2024-04-27 09:46:59,107][52031] Fps is (10 sec: 50790.3, 60 sec: 52155.7, 300 sec: 52595.4). Total num frames: 5663326208. Throughput: 0: 52618.3. Samples: 153797060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:46:59,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:47:00,401][52263] Updated weights for policy 0, policy_version 345668 (0.0028) [2024-04-27 09:47:03,571][52263] Updated weights for policy 0, policy_version 345678 (0.0027) [2024-04-27 09:47:04,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52428.8, 300 sec: 52595.4). Total num frames: 5663588352. Throughput: 0: 52701.7. Samples: 154116500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:47:04,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 09:47:06,680][52263] Updated weights for policy 0, policy_version 345688 (0.0031) [2024-04-27 09:47:09,107][52031] Fps is (10 sec: 55700.5, 60 sec: 53247.3, 300 sec: 52706.3). Total num frames: 5663883264. Throughput: 0: 52792.2. Samples: 154432540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:47:09,108][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 09:47:09,592][52263] Updated weights for policy 0, policy_version 345698 (0.0026) [2024-04-27 09:47:12,951][52263] Updated weights for policy 0, policy_version 345708 (0.0036) [2024-04-27 09:47:14,106][52031] Fps is (10 sec: 55706.1, 60 sec: 52975.0, 300 sec: 52762.1). Total num frames: 5664145408. Throughput: 0: 52920.0. Samples: 154597520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:47:14,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 09:47:15,674][52263] Updated weights for policy 0, policy_version 345718 (0.0029) [2024-04-27 09:47:19,076][52263] Updated weights for policy 0, policy_version 345728 (0.0032) [2024-04-27 09:47:19,107][52031] Fps is (10 sec: 52433.8, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5664407552. Throughput: 0: 52928.8. Samples: 154917200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:47:19,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 09:47:22,132][52263] Updated weights for policy 0, policy_version 345738 (0.0028) [2024-04-27 09:47:24,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5664669696. Throughput: 0: 53020.4. Samples: 155233980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 09:47:24,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 09:47:25,385][52263] Updated weights for policy 0, policy_version 345748 (0.0032) [2024-04-27 09:47:28,370][52263] Updated weights for policy 0, policy_version 345758 (0.0029) [2024-04-27 09:47:29,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5664915456. Throughput: 0: 52757.8. Samples: 155382940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:47:29,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 09:47:31,438][52263] Updated weights for policy 0, policy_version 345768 (0.0031) [2024-04-27 09:47:34,106][52031] Fps is (10 sec: 52430.2, 60 sec: 53248.1, 300 sec: 52706.5). Total num frames: 5665193984. Throughput: 0: 52894.4. Samples: 155707040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:47:34,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 09:47:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000345776_5665193984.pth... 
[2024-04-27 09:47:34,177][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000345004_5652545536.pth [2024-04-27 09:47:34,825][52263] Updated weights for policy 0, policy_version 345778 (0.0030) [2024-04-27 09:47:37,534][52263] Updated weights for policy 0, policy_version 345788 (0.0031) [2024-04-27 09:47:38,610][52242] Signal inference workers to stop experience collection... (2200 times) [2024-04-27 09:47:38,639][52263] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-04-27 09:47:38,701][52242] Signal inference workers to resume experience collection... (2200 times) [2024-04-27 09:47:38,701][52263] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-04-27 09:47:39,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5665472512. Throughput: 0: 52856.7. Samples: 156023240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:47:39,116][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 09:47:40,908][52263] Updated weights for policy 0, policy_version 345798 (0.0029) [2024-04-27 09:47:43,752][52263] Updated weights for policy 0, policy_version 345808 (0.0036) [2024-04-27 09:47:44,107][52031] Fps is (10 sec: 54066.0, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5665734656. Throughput: 0: 53171.1. Samples: 156189760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:47:44,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 09:47:46,959][52263] Updated weights for policy 0, policy_version 345818 (0.0032) [2024-04-27 09:47:49,106][52031] Fps is (10 sec: 50791.2, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5665980416. Throughput: 0: 53122.4. Samples: 156507000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:47:49,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 09:47:50,149][52263] Updated weights for policy 0, policy_version 345828 (0.0030) [2024-04-27 09:47:53,085][52263] Updated weights for policy 0, policy_version 345838 (0.0035) [2024-04-27 09:47:54,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5666242560. Throughput: 0: 53087.8. Samples: 156821440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:47:54,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 09:47:56,282][52263] Updated weights for policy 0, policy_version 345848 (0.0030) [2024-04-27 09:47:59,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.0, 300 sec: 52651.0). Total num frames: 5666504704. Throughput: 0: 52841.0. Samples: 156975360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:47:59,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 09:47:59,293][52263] Updated weights for policy 0, policy_version 345858 (0.0033) [2024-04-27 09:48:02,401][52263] Updated weights for policy 0, policy_version 345868 (0.0030) [2024-04-27 09:48:04,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 52873.1). Total num frames: 5666799616. Throughput: 0: 52743.1. Samples: 157290640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:48:04,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 09:48:05,313][52263] Updated weights for policy 0, policy_version 345878 (0.0027) [2024-04-27 09:48:08,703][52263] Updated weights for policy 0, policy_version 345888 (0.0028) [2024-04-27 09:48:09,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52702.7, 300 sec: 52706.5). Total num frames: 5667045376. Throughput: 0: 52854.3. Samples: 157612420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:48:09,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 09:48:11,776][52263] Updated weights for policy 0, policy_version 345898 (0.0032) [2024-04-27 09:48:14,106][52031] Fps is (10 sec: 49152.5, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5667291136. Throughput: 0: 53018.9. Samples: 157768780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:48:14,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 09:48:14,776][52263] Updated weights for policy 0, policy_version 345908 (0.0029) [2024-04-27 09:48:17,920][52263] Updated weights for policy 0, policy_version 345918 (0.0029) [2024-04-27 09:48:19,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52701.9, 300 sec: 52762.1). Total num frames: 5667569664. Throughput: 0: 52967.9. Samples: 158090600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:48:19,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 09:48:20,866][52263] Updated weights for policy 0, policy_version 345928 (0.0029) [2024-04-27 09:48:24,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52702.1, 300 sec: 52706.5). Total num frames: 5667831808. Throughput: 0: 53054.5. Samples: 158410680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:48:24,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 09:48:24,138][52263] Updated weights for policy 0, policy_version 345938 (0.0030) [2024-04-27 09:48:27,108][52263] Updated weights for policy 0, policy_version 345948 (0.0032) [2024-04-27 09:48:29,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5668110336. Throughput: 0: 52818.6. Samples: 158566600. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:48:29,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 09:48:30,341][52263] Updated weights for policy 0, policy_version 345958 (0.0030) [2024-04-27 09:48:33,261][52263] Updated weights for policy 0, policy_version 345968 (0.0035) [2024-04-27 09:48:34,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5668388864. Throughput: 0: 52852.0. Samples: 158885340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:48:34,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 09:48:36,485][52263] Updated weights for policy 0, policy_version 345978 (0.0027) [2024-04-27 09:48:39,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52702.0, 300 sec: 52762.0). Total num frames: 5668634624. Throughput: 0: 52949.8. Samples: 159204180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 09:48:39,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 09:48:39,349][52263] Updated weights for policy 0, policy_version 345988 (0.0031) [2024-04-27 09:48:42,655][52263] Updated weights for policy 0, policy_version 345998 (0.0027) [2024-04-27 09:48:44,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5668896768. Throughput: 0: 52934.0. Samples: 159357400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:48:44,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 09:48:45,698][52263] Updated weights for policy 0, policy_version 346008 (0.0034) [2024-04-27 09:48:48,951][52263] Updated weights for policy 0, policy_version 346018 (0.0028) [2024-04-27 09:48:49,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5669158912. Throughput: 0: 52845.9. Samples: 159668700. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:48:49,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 09:48:52,185][52263] Updated weights for policy 0, policy_version 346028 (0.0033) [2024-04-27 09:48:54,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52974.9, 300 sec: 52762.1). Total num frames: 5669421056. Throughput: 0: 52719.6. Samples: 159984800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:48:54,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 09:48:55,002][52263] Updated weights for policy 0, policy_version 346038 (0.0038) [2024-04-27 09:48:58,460][52263] Updated weights for policy 0, policy_version 346048 (0.0035) [2024-04-27 09:48:59,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53247.8, 300 sec: 52817.5). Total num frames: 5669699584. Throughput: 0: 52820.2. Samples: 160145700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:48:59,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:49:01,076][52263] Updated weights for policy 0, policy_version 346058 (0.0030) [2024-04-27 09:49:04,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.8, 300 sec: 52762.1). Total num frames: 5669945344. Throughput: 0: 52760.8. Samples: 160464840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:04,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 09:49:04,700][52263] Updated weights for policy 0, policy_version 346068 (0.0033) [2024-04-27 09:49:07,406][52263] Updated weights for policy 0, policy_version 346078 (0.0028) [2024-04-27 09:49:08,233][52242] Signal inference workers to stop experience collection... (2250 times) [2024-04-27 09:49:08,234][52242] Signal inference workers to resume experience collection... 
(2250 times) [2024-04-27 09:49:08,255][52263] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-04-27 09:49:08,255][52263] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-04-27 09:49:09,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5670223872. Throughput: 0: 52675.4. Samples: 160781080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:09,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 09:49:10,748][52263] Updated weights for policy 0, policy_version 346088 (0.0028) [2024-04-27 09:49:13,706][52263] Updated weights for policy 0, policy_version 346098 (0.0029) [2024-04-27 09:49:14,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.8, 300 sec: 52706.5). Total num frames: 5670469632. Throughput: 0: 52685.8. Samples: 160937460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:14,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 09:49:16,763][52263] Updated weights for policy 0, policy_version 346108 (0.0034) [2024-04-27 09:49:19,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 52762.1). Total num frames: 5670748160. Throughput: 0: 52712.0. Samples: 161257380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:19,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 09:49:19,930][52263] Updated weights for policy 0, policy_version 346118 (0.0034) [2024-04-27 09:49:22,944][52263] Updated weights for policy 0, policy_version 346128 (0.0038) [2024-04-27 09:49:24,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53247.9, 300 sec: 52873.1). Total num frames: 5671026688. Throughput: 0: 52618.7. Samples: 161572020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 09:49:26,032][52263] Updated weights for policy 0, policy_version 346138 (0.0038) [2024-04-27 09:49:29,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5671272448. Throughput: 0: 52901.8. Samples: 161737980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:29,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 09:49:29,142][52263] Updated weights for policy 0, policy_version 346148 (0.0030) [2024-04-27 09:49:32,223][52263] Updated weights for policy 0, policy_version 346158 (0.0027) [2024-04-27 09:49:34,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5671534592. Throughput: 0: 52920.0. Samples: 162050100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:34,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 09:49:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000346163_5671534592.pth... [2024-04-27 09:49:34,181][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000345391_5658886144.pth [2024-04-27 09:49:35,497][52263] Updated weights for policy 0, policy_version 346168 (0.0030) [2024-04-27 09:49:38,523][52263] Updated weights for policy 0, policy_version 346178 (0.0032) [2024-04-27 09:49:39,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52701.7, 300 sec: 52762.1). Total num frames: 5671796736. Throughput: 0: 52917.6. Samples: 162366100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:39,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 09:49:41,842][52263] Updated weights for policy 0, policy_version 346188 (0.0033) [2024-04-27 09:49:44,107][52031] Fps is (10 sec: 54066.3, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5672075264. Throughput: 0: 52868.5. Samples: 162524780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:44,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 09:49:44,678][52263] Updated weights for policy 0, policy_version 346198 (0.0033) [2024-04-27 09:49:47,886][52263] Updated weights for policy 0, policy_version 346208 (0.0030) [2024-04-27 09:49:49,107][52031] Fps is (10 sec: 54067.6, 60 sec: 52974.8, 300 sec: 52817.6). Total num frames: 5672337408. Throughput: 0: 52879.1. Samples: 162844400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:49,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 09:49:50,733][52263] Updated weights for policy 0, policy_version 346218 (0.0030) [2024-04-27 09:49:54,107][52031] Fps is (10 sec: 50790.9, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5672583168. Throughput: 0: 52925.2. Samples: 163162720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:54,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 09:49:54,113][52263] Updated weights for policy 0, policy_version 346228 (0.0029) [2024-04-27 09:49:56,959][52263] Updated weights for policy 0, policy_version 346238 (0.0031) [2024-04-27 09:49:59,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52702.0, 300 sec: 52762.1). Total num frames: 5672861696. Throughput: 0: 53002.8. Samples: 163322580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 09:49:59,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 09:50:00,263][52263] Updated weights for policy 0, policy_version 346248 (0.0031) [2024-04-27 09:50:03,376][52263] Updated weights for policy 0, policy_version 346258 (0.0029) [2024-04-27 09:50:04,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53248.1, 300 sec: 52817.6). Total num frames: 5673140224. Throughput: 0: 52935.2. Samples: 163639460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:04,107][52031] Avg episode reward: [(0, '0.504')] [2024-04-27 09:50:06,333][52263] Updated weights for policy 0, policy_version 346268 (0.0026) [2024-04-27 09:50:09,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5673369600. Throughput: 0: 52924.0. Samples: 163953600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:09,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 09:50:09,638][52263] Updated weights for policy 0, policy_version 346278 (0.0030) [2024-04-27 09:50:12,504][52263] Updated weights for policy 0, policy_version 346288 (0.0031) [2024-04-27 09:50:14,106][52031] Fps is (10 sec: 50789.8, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5673648128. Throughput: 0: 52685.9. Samples: 164108840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:14,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 09:50:15,867][52263] Updated weights for policy 0, policy_version 346298 (0.0028) [2024-04-27 09:50:18,698][52263] Updated weights for policy 0, policy_version 346308 (0.0027) [2024-04-27 09:50:19,106][52031] Fps is (10 sec: 55705.5, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5673926656. Throughput: 0: 52834.3. Samples: 164427640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:19,107][52031] Avg episode reward: [(0, '0.474')] [2024-04-27 09:50:21,975][52263] Updated weights for policy 0, policy_version 346318 (0.0026) [2024-04-27 09:50:22,363][52242] Signal inference workers to stop experience collection... (2300 times) [2024-04-27 09:50:22,363][52242] Signal inference workers to resume experience collection... 
(2300 times) [2024-04-27 09:50:22,388][52263] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-04-27 09:50:22,389][52263] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-04-27 09:50:24,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5674172416. Throughput: 0: 52751.8. Samples: 164739920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:24,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 09:50:25,238][52263] Updated weights for policy 0, policy_version 346328 (0.0030) [2024-04-27 09:50:28,210][52263] Updated weights for policy 0, policy_version 346338 (0.0030) [2024-04-27 09:50:29,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5674450944. Throughput: 0: 52754.0. Samples: 164898700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:29,107][52031] Avg episode reward: [(0, '0.686')] [2024-04-27 09:50:31,471][52263] Updated weights for policy 0, policy_version 346348 (0.0032) [2024-04-27 09:50:34,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5674696704. Throughput: 0: 52607.6. Samples: 165211740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:34,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 09:50:34,397][52263] Updated weights for policy 0, policy_version 346358 (0.0036) [2024-04-27 09:50:37,600][52263] Updated weights for policy 0, policy_version 346368 (0.0036) [2024-04-27 09:50:39,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5674958848. Throughput: 0: 52537.8. Samples: 165526920. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:39,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 09:50:40,623][52263] Updated weights for policy 0, policy_version 346378 (0.0029) [2024-04-27 09:50:43,708][52263] Updated weights for policy 0, policy_version 346388 (0.0027) [2024-04-27 09:50:44,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5675220992. Throughput: 0: 52486.2. Samples: 165684460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:44,108][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 09:50:46,841][52263] Updated weights for policy 0, policy_version 346398 (0.0026) [2024-04-27 09:50:49,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5675483136. Throughput: 0: 52372.7. Samples: 165996240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:49,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:50:49,909][52263] Updated weights for policy 0, policy_version 346408 (0.0033) [2024-04-27 09:50:53,168][52263] Updated weights for policy 0, policy_version 346418 (0.0028) [2024-04-27 09:50:54,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5675761664. Throughput: 0: 52507.1. Samples: 166316420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:54,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 09:50:56,225][52263] Updated weights for policy 0, policy_version 346428 (0.0033) [2024-04-27 09:50:59,106][52031] Fps is (10 sec: 52429.5, 60 sec: 52428.9, 300 sec: 52762.1). Total num frames: 5676007424. Throughput: 0: 52617.0. Samples: 166476600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:50:59,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 09:50:59,275][52263] Updated weights for policy 0, policy_version 346438 (0.0030) [2024-04-27 09:51:02,339][52263] Updated weights for policy 0, policy_version 346448 (0.0029) [2024-04-27 09:51:04,106][52031] Fps is (10 sec: 50790.1, 60 sec: 52155.6, 300 sec: 52817.6). Total num frames: 5676269568. Throughput: 0: 52527.0. Samples: 166791360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:51:04,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 09:51:05,460][52263] Updated weights for policy 0, policy_version 346458 (0.0028) [2024-04-27 09:51:08,563][52263] Updated weights for policy 0, policy_version 346468 (0.0031) [2024-04-27 09:51:09,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5676548096. Throughput: 0: 52564.9. Samples: 167105340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:51:09,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 09:51:11,748][52263] Updated weights for policy 0, policy_version 346478 (0.0025) [2024-04-27 09:51:14,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52155.6, 300 sec: 52706.5). Total num frames: 5676777472. Throughput: 0: 52619.8. Samples: 167266600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 09:51:14,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 09:51:14,826][52263] Updated weights for policy 0, policy_version 346488 (0.0029) [2024-04-27 09:51:17,845][52263] Updated weights for policy 0, policy_version 346498 (0.0026) [2024-04-27 09:51:19,106][52031] Fps is (10 sec: 54067.0, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5677088768. Throughput: 0: 52619.6. Samples: 167579620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:51:19,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 09:51:20,718][52242] Signal inference workers to stop experience collection... (2350 times) [2024-04-27 09:51:20,719][52242] Signal inference workers to resume experience collection... (2350 times) [2024-04-27 09:51:20,730][52263] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-04-27 09:51:20,730][52263] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-04-27 09:51:21,256][52263] Updated weights for policy 0, policy_version 346508 (0.0028) [2024-04-27 09:51:24,031][52263] Updated weights for policy 0, policy_version 346518 (0.0026) [2024-04-27 09:51:24,107][52031] Fps is (10 sec: 57344.6, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5677350912. Throughput: 0: 52626.7. Samples: 167895120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:51:24,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 09:51:27,390][52263] Updated weights for policy 0, policy_version 346528 (0.0028) [2024-04-27 09:51:29,106][52031] Fps is (10 sec: 49152.1, 60 sec: 52155.7, 300 sec: 52817.6). Total num frames: 5677580288. Throughput: 0: 52789.9. Samples: 168060000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:51:29,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 09:51:30,209][52263] Updated weights for policy 0, policy_version 346538 (0.0039) [2024-04-27 09:51:33,613][52263] Updated weights for policy 0, policy_version 346548 (0.0031) [2024-04-27 09:51:34,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5677858816. Throughput: 0: 52817.7. Samples: 168373040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:51:34,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 09:51:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000346549_5677858816.pth... 
[2024-04-27 09:51:34,166][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000345776_5665193984.pth [2024-04-27 09:51:36,517][52263] Updated weights for policy 0, policy_version 346558 (0.0030) [2024-04-27 09:51:39,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5678104576. Throughput: 0: 52591.4. Samples: 168683040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:51:39,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 09:51:39,993][52263] Updated weights for policy 0, policy_version 346568 (0.0038) [2024-04-27 09:51:42,918][52263] Updated weights for policy 0, policy_version 346578 (0.0036) [2024-04-27 09:51:44,106][52031] Fps is (10 sec: 54068.0, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5678399488. Throughput: 0: 52608.4. Samples: 168843980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:51:44,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 09:51:46,102][52263] Updated weights for policy 0, policy_version 346588 (0.0031) [2024-04-27 09:51:49,106][52031] Fps is (10 sec: 54068.1, 60 sec: 52702.0, 300 sec: 52762.0). Total num frames: 5678645248. Throughput: 0: 52641.9. Samples: 169160240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:51:49,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 09:51:49,145][52263] Updated weights for policy 0, policy_version 346598 (0.0034) [2024-04-27 09:51:52,221][52263] Updated weights for policy 0, policy_version 346608 (0.0033) [2024-04-27 09:51:54,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52428.7, 300 sec: 52817.6). Total num frames: 5678907392. Throughput: 0: 52708.8. Samples: 169477240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:51:54,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 09:51:55,254][52263] Updated weights for policy 0, policy_version 346618 (0.0035) [2024-04-27 09:51:58,442][52263] Updated weights for policy 0, policy_version 346628 (0.0031) [2024-04-27 09:51:59,107][52031] Fps is (10 sec: 52427.6, 60 sec: 52701.7, 300 sec: 52817.6). Total num frames: 5679169536. Throughput: 0: 52566.7. Samples: 169632100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:51:59,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 09:52:01,355][52263] Updated weights for policy 0, policy_version 346638 (0.0038) [2024-04-27 09:52:04,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52428.7, 300 sec: 52651.1). Total num frames: 5679415296. Throughput: 0: 52543.4. Samples: 169944080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:52:04,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 09:52:04,643][52263] Updated weights for policy 0, policy_version 346648 (0.0027) [2024-04-27 09:52:07,636][52263] Updated weights for policy 0, policy_version 346658 (0.0029) [2024-04-27 09:52:09,107][52031] Fps is (10 sec: 52429.3, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5679693824. Throughput: 0: 52647.1. Samples: 170264240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:52:09,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 09:52:10,850][52263] Updated weights for policy 0, policy_version 346668 (0.0035) [2024-04-27 09:52:13,837][52263] Updated weights for policy 0, policy_version 346678 (0.0030) [2024-04-27 09:52:14,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.1, 300 sec: 52762.0). Total num frames: 5679972352. Throughput: 0: 52453.7. Samples: 170420420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:52:14,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 09:52:17,108][52263] Updated weights for policy 0, policy_version 346688 (0.0027) [2024-04-27 09:52:18,406][52242] Signal inference workers to stop experience collection... (2400 times) [2024-04-27 09:52:18,407][52242] Signal inference workers to resume experience collection... (2400 times) [2024-04-27 09:52:18,422][52263] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-04-27 09:52:18,422][52263] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-04-27 09:52:19,106][52031] Fps is (10 sec: 55706.0, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5680250880. Throughput: 0: 52519.7. Samples: 170736420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:52:19,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 09:52:19,874][52263] Updated weights for policy 0, policy_version 346698 (0.0028) [2024-04-27 09:52:23,307][52263] Updated weights for policy 0, policy_version 346708 (0.0027) [2024-04-27 09:52:24,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52155.7, 300 sec: 52762.0). Total num frames: 5680480256. Throughput: 0: 52684.0. Samples: 171053820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:52:24,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:52:26,071][52263] Updated weights for policy 0, policy_version 346718 (0.0036) [2024-04-27 09:52:29,107][52031] Fps is (10 sec: 50789.5, 60 sec: 52974.7, 300 sec: 52762.0). Total num frames: 5680758784. Throughput: 0: 52561.1. Samples: 171209240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 09:52:29,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 09:52:29,515][52263] Updated weights for policy 0, policy_version 346728 (0.0028) [2024-04-27 09:52:32,838][52263] Updated weights for policy 0, policy_version 346738 (0.0032) [2024-04-27 09:52:34,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52428.9, 300 sec: 52651.0). Total num frames: 5681004544. Throughput: 0: 52658.1. Samples: 171529860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:52:34,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 09:52:35,795][52263] Updated weights for policy 0, policy_version 346748 (0.0027) [2024-04-27 09:52:39,058][52263] Updated weights for policy 0, policy_version 346758 (0.0027) [2024-04-27 09:52:39,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5681283072. Throughput: 0: 52564.8. Samples: 171842660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:52:39,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 09:52:42,019][52263] Updated weights for policy 0, policy_version 346768 (0.0029) [2024-04-27 09:52:44,106][52031] Fps is (10 sec: 55706.4, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5681561600. Throughput: 0: 52696.8. Samples: 172003440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:52:44,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 09:52:45,428][52263] Updated weights for policy 0, policy_version 346778 (0.0031) [2024-04-27 09:52:48,240][52263] Updated weights for policy 0, policy_version 346788 (0.0032) [2024-04-27 09:52:49,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5681807360. Throughput: 0: 52845.0. Samples: 172322100. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:52:49,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 09:52:51,743][52263] Updated weights for policy 0, policy_version 346798 (0.0031) [2024-04-27 09:52:54,106][52031] Fps is (10 sec: 50790.1, 60 sec: 52702.0, 300 sec: 52762.0). Total num frames: 5682069504. Throughput: 0: 52773.9. Samples: 172639060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:52:54,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 09:52:54,538][52263] Updated weights for policy 0, policy_version 346808 (0.0028) [2024-04-27 09:52:57,811][52263] Updated weights for policy 0, policy_version 346818 (0.0029) [2024-04-27 09:52:59,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52702.0, 300 sec: 52651.0). Total num frames: 5682331648. Throughput: 0: 52724.1. Samples: 172793000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:52:59,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:53:00,820][52263] Updated weights for policy 0, policy_version 346828 (0.0029) [2024-04-27 09:53:04,049][52263] Updated weights for policy 0, policy_version 346838 (0.0027) [2024-04-27 09:53:04,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52975.0, 300 sec: 52706.5). Total num frames: 5682593792. Throughput: 0: 52774.3. Samples: 173111260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:04,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 09:53:07,061][52263] Updated weights for policy 0, policy_version 346848 (0.0026) [2024-04-27 09:53:09,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.1, 300 sec: 52873.1). Total num frames: 5682888704. Throughput: 0: 52711.8. Samples: 173425840. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:09,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 09:53:10,239][52263] Updated weights for policy 0, policy_version 346858 (0.0030) [2024-04-27 09:53:13,231][52263] Updated weights for policy 0, policy_version 346868 (0.0031) [2024-04-27 09:53:14,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5683134464. Throughput: 0: 52913.9. Samples: 173590360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:14,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 09:53:16,620][52263] Updated weights for policy 0, policy_version 346878 (0.0028) [2024-04-27 09:53:19,106][52031] Fps is (10 sec: 49151.9, 60 sec: 52155.8, 300 sec: 52706.5). Total num frames: 5683380224. Throughput: 0: 52726.3. Samples: 173902540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:19,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 09:53:19,397][52263] Updated weights for policy 0, policy_version 346888 (0.0033) [2024-04-27 09:53:23,072][52263] Updated weights for policy 0, policy_version 346898 (0.0037) [2024-04-27 09:53:24,107][52031] Fps is (10 sec: 52429.1, 60 sec: 52975.0, 300 sec: 52706.5). Total num frames: 5683658752. Throughput: 0: 52839.6. Samples: 174220440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:24,108][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 09:53:25,561][52263] Updated weights for policy 0, policy_version 346908 (0.0029) [2024-04-27 09:53:29,096][52263] Updated weights for policy 0, policy_version 346918 (0.0036) [2024-04-27 09:53:29,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52428.9, 300 sec: 52595.4). Total num frames: 5683904512. Throughput: 0: 52647.4. Samples: 174372580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:29,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 09:53:29,843][52242] Signal inference workers to stop experience collection... 
(2450 times) [2024-04-27 09:53:29,847][52242] Signal inference workers to resume experience collection... (2450 times) [2024-04-27 09:53:29,862][52263] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-04-27 09:53:29,862][52263] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-04-27 09:53:31,776][52263] Updated weights for policy 0, policy_version 346928 (0.0031) [2024-04-27 09:53:34,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 52762.0). Total num frames: 5684199424. Throughput: 0: 52523.0. Samples: 174685640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:34,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 09:53:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000346936_5684199424.pth... [2024-04-27 09:53:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000346163_5671534592.pth [2024-04-27 09:53:35,309][52263] Updated weights for policy 0, policy_version 346938 (0.0029) [2024-04-27 09:53:37,938][52263] Updated weights for policy 0, policy_version 346948 (0.0031) [2024-04-27 09:53:39,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5684445184. Throughput: 0: 52589.4. Samples: 175005580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:39,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 09:53:41,621][52263] Updated weights for policy 0, policy_version 346958 (0.0029) [2024-04-27 09:53:44,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52428.6, 300 sec: 52706.5). Total num frames: 5684707328. Throughput: 0: 52835.3. Samples: 175170600. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:44,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 09:53:44,332][52263] Updated weights for policy 0, policy_version 346968 (0.0031) [2024-04-27 09:53:47,856][52263] Updated weights for policy 0, policy_version 346978 (0.0032) [2024-04-27 09:53:49,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52974.9, 300 sec: 52762.1). Total num frames: 5684985856. Throughput: 0: 52817.4. Samples: 175488040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 09:53:49,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 09:53:50,461][52263] Updated weights for policy 0, policy_version 346988 (0.0035) [2024-04-27 09:53:53,984][52263] Updated weights for policy 0, policy_version 346998 (0.0028) [2024-04-27 09:53:54,107][52031] Fps is (10 sec: 50790.7, 60 sec: 52428.7, 300 sec: 52595.4). Total num frames: 5685215232. Throughput: 0: 52862.5. Samples: 175804660. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:53:54,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 09:53:56,690][52263] Updated weights for policy 0, policy_version 347008 (0.0030) [2024-04-27 09:53:59,106][52031] Fps is (10 sec: 49151.9, 60 sec: 52428.8, 300 sec: 52651.0). Total num frames: 5685477376. Throughput: 0: 52492.6. Samples: 175952520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:53:59,107][52031] Avg episode reward: [(0, '0.685')] [2024-04-27 09:54:00,063][52263] Updated weights for policy 0, policy_version 347018 (0.0028) [2024-04-27 09:54:02,799][52263] Updated weights for policy 0, policy_version 347028 (0.0029) [2024-04-27 09:54:04,106][52031] Fps is (10 sec: 55706.4, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5685772288. Throughput: 0: 52587.6. Samples: 176268980. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:04,107][52031] Avg episode reward: [(0, '0.659')] [2024-04-27 09:54:06,312][52263] Updated weights for policy 0, policy_version 347038 (0.0034) [2024-04-27 09:54:08,858][52263] Updated weights for policy 0, policy_version 347048 (0.0029) [2024-04-27 09:54:09,107][52031] Fps is (10 sec: 57343.2, 60 sec: 52701.7, 300 sec: 52817.6). Total num frames: 5686050816. Throughput: 0: 52555.5. Samples: 176585440. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:09,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 09:54:12,668][52263] Updated weights for policy 0, policy_version 347058 (0.0030) [2024-04-27 09:54:14,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52428.9, 300 sec: 52651.0). Total num frames: 5686280192. Throughput: 0: 52873.5. Samples: 176751880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:14,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:54:15,022][52263] Updated weights for policy 0, policy_version 347068 (0.0026) [2024-04-27 09:54:18,747][52263] Updated weights for policy 0, policy_version 347078 (0.0030) [2024-04-27 09:54:19,106][52031] Fps is (10 sec: 47514.2, 60 sec: 52428.8, 300 sec: 52539.9). Total num frames: 5686525952. Throughput: 0: 52897.9. Samples: 177066040. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:19,107][52031] Avg episode reward: [(0, '0.679')] [2024-04-27 09:54:21,251][52263] Updated weights for policy 0, policy_version 347088 (0.0033) [2024-04-27 09:54:24,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52428.8, 300 sec: 52651.0). Total num frames: 5686804480. Throughput: 0: 52798.5. Samples: 177381520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:24,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 09:54:24,846][52242] Signal inference workers to stop experience collection... 
(2500 times) [2024-04-27 09:54:24,896][52263] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-04-27 09:54:24,908][52242] Signal inference workers to resume experience collection... (2500 times) [2024-04-27 09:54:24,914][52263] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-04-27 09:54:25,020][52263] Updated weights for policy 0, policy_version 347098 (0.0030) [2024-04-27 09:54:27,594][52263] Updated weights for policy 0, policy_version 347108 (0.0027) [2024-04-27 09:54:29,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53247.9, 300 sec: 52762.0). Total num frames: 5687099392. Throughput: 0: 52668.9. Samples: 177540700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:29,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 09:54:31,136][52263] Updated weights for policy 0, policy_version 347118 (0.0029) [2024-04-27 09:54:33,630][52263] Updated weights for policy 0, policy_version 347128 (0.0035) [2024-04-27 09:54:34,107][52031] Fps is (10 sec: 55705.5, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5687361536. Throughput: 0: 52644.7. Samples: 177857060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:34,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 09:54:37,374][52263] Updated weights for policy 0, policy_version 347138 (0.0028) [2024-04-27 09:54:39,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52701.7, 300 sec: 52651.0). Total num frames: 5687607296. Throughput: 0: 52565.7. Samples: 178170120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:39,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 09:54:39,907][52263] Updated weights for policy 0, policy_version 347148 (0.0034) [2024-04-27 09:54:43,457][52263] Updated weights for policy 0, policy_version 347158 (0.0028) [2024-04-27 09:54:44,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52702.0, 300 sec: 52651.0). Total num frames: 5687869440. Throughput: 0: 52883.9. Samples: 178332300. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:44,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 09:54:46,092][52263] Updated weights for policy 0, policy_version 347168 (0.0031) [2024-04-27 09:54:49,106][52031] Fps is (10 sec: 50791.2, 60 sec: 52155.7, 300 sec: 52651.0). Total num frames: 5688115200. Throughput: 0: 52965.7. Samples: 178652440. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:49,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 09:54:49,793][52263] Updated weights for policy 0, policy_version 347178 (0.0027) [2024-04-27 09:54:52,240][52263] Updated weights for policy 0, policy_version 347188 (0.0029) [2024-04-27 09:54:54,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53247.9, 300 sec: 52706.5). Total num frames: 5688410112. Throughput: 0: 52856.4. Samples: 178963980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:54,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 09:54:55,921][52263] Updated weights for policy 0, policy_version 347198 (0.0034) [2024-04-27 09:54:58,319][52263] Updated weights for policy 0, policy_version 347208 (0.0028) [2024-04-27 09:54:59,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53521.1, 300 sec: 52706.5). Total num frames: 5688688640. Throughput: 0: 52866.7. Samples: 179130880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:54:59,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 09:55:02,122][52263] Updated weights for policy 0, policy_version 347218 (0.0028) [2024-04-27 09:55:04,107][52031] Fps is (10 sec: 52429.4, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5688934400. Throughput: 0: 52950.6. Samples: 179448820. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 09:55:04,115][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 09:55:04,561][52263] Updated weights for policy 0, policy_version 347228 (0.0033) [2024-04-27 09:55:08,273][52263] Updated weights for policy 0, policy_version 347238 (0.0028) [2024-04-27 09:55:09,106][52031] Fps is (10 sec: 49151.6, 60 sec: 52155.8, 300 sec: 52651.0). Total num frames: 5689180160. Throughput: 0: 52924.1. Samples: 179763100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:09,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 09:55:11,006][52263] Updated weights for policy 0, policy_version 347248 (0.0027) [2024-04-27 09:55:14,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52701.8, 300 sec: 52595.4). Total num frames: 5689442304. Throughput: 0: 52627.2. Samples: 179908920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:14,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 09:55:14,584][52263] Updated weights for policy 0, policy_version 347258 (0.0037) [2024-04-27 09:55:17,078][52263] Updated weights for policy 0, policy_version 347268 (0.0028) [2024-04-27 09:55:19,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.0, 300 sec: 52706.5). Total num frames: 5689720832. Throughput: 0: 52625.8. Samples: 180225220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:19,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 09:55:20,836][52263] Updated weights for policy 0, policy_version 347278 (0.0038) [2024-04-27 09:55:23,530][52263] Updated weights for policy 0, policy_version 347288 (0.0030) [2024-04-27 09:55:24,107][52031] Fps is (10 sec: 54067.0, 60 sec: 52974.9, 300 sec: 52650.9). Total num frames: 5689982976. Throughput: 0: 52823.6. Samples: 180547180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:24,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 09:55:27,080][52263] Updated weights for policy 0, policy_version 347298 (0.0030) [2024-04-27 09:55:29,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52702.0, 300 sec: 52762.1). Total num frames: 5690261504. Throughput: 0: 52792.5. Samples: 180707960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:29,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 09:55:29,750][52263] Updated weights for policy 0, policy_version 347308 (0.0032) [2024-04-27 09:55:33,301][52263] Updated weights for policy 0, policy_version 347318 (0.0033) [2024-04-27 09:55:34,107][52031] Fps is (10 sec: 54066.8, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5690523648. Throughput: 0: 52701.1. Samples: 181024000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:34,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:55:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000347322_5690523648.pth... [2024-04-27 09:55:34,167][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000346549_5677858816.pth [2024-04-27 09:55:35,818][52263] Updated weights for policy 0, policy_version 347328 (0.0029) [2024-04-27 09:55:39,107][52031] Fps is (10 sec: 49151.4, 60 sec: 52428.8, 300 sec: 52651.0). Total num frames: 5690753024. Throughput: 0: 52837.9. Samples: 181341680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:39,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 09:55:39,520][52263] Updated weights for policy 0, policy_version 347338 (0.0033) [2024-04-27 09:55:41,603][52242] Signal inference workers to stop experience collection... (2550 times) [2024-04-27 09:55:41,635][52263] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-04-27 09:55:41,664][52242] Signal inference workers to resume experience collection... 
(2550 times) [2024-04-27 09:55:41,670][52263] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-04-27 09:55:41,997][52263] Updated weights for policy 0, policy_version 347348 (0.0029) [2024-04-27 09:55:44,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5691031552. Throughput: 0: 52583.0. Samples: 181497120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:44,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 09:55:45,751][52263] Updated weights for policy 0, policy_version 347358 (0.0028) [2024-04-27 09:55:48,249][52263] Updated weights for policy 0, policy_version 347368 (0.0030) [2024-04-27 09:55:49,107][52031] Fps is (10 sec: 54067.2, 60 sec: 52974.8, 300 sec: 52650.9). Total num frames: 5691293696. Throughput: 0: 52505.7. Samples: 181811580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:49,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 09:55:52,046][52263] Updated weights for policy 0, policy_version 347378 (0.0028) [2024-04-27 09:55:54,107][52031] Fps is (10 sec: 54067.0, 60 sec: 52702.0, 300 sec: 52762.0). Total num frames: 5691572224. Throughput: 0: 52475.9. Samples: 182124520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:54,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 09:55:54,559][52263] Updated weights for policy 0, policy_version 347388 (0.0029) [2024-04-27 09:55:58,297][52263] Updated weights for policy 0, policy_version 347398 (0.0026) [2024-04-27 09:55:59,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52155.6, 300 sec: 52706.5). Total num frames: 5691817984. Throughput: 0: 52784.5. Samples: 182284220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:55:59,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 09:56:00,776][52263] Updated weights for policy 0, policy_version 347408 (0.0027) [2024-04-27 09:56:04,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52155.7, 300 sec: 52595.4). Total num frames: 5692063744. Throughput: 0: 52668.4. Samples: 182595300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:56:04,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:56:04,379][52263] Updated weights for policy 0, policy_version 347418 (0.0031) [2024-04-27 09:56:06,959][52263] Updated weights for policy 0, policy_version 347428 (0.0027) [2024-04-27 09:56:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5692358656. Throughput: 0: 52564.5. Samples: 182912580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:56:09,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 09:56:10,530][52263] Updated weights for policy 0, policy_version 347438 (0.0026) [2024-04-27 09:56:13,634][52263] Updated weights for policy 0, policy_version 347448 (0.0031) [2024-04-27 09:56:14,107][52031] Fps is (10 sec: 54067.5, 60 sec: 52701.9, 300 sec: 52595.4). Total num frames: 5692604416. Throughput: 0: 52411.5. Samples: 183066480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:56:14,107][52031] Avg episode reward: [(0, '0.691')] [2024-04-27 09:56:16,758][52263] Updated weights for policy 0, policy_version 347458 (0.0029) [2024-04-27 09:56:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52701.8, 300 sec: 52650.9). Total num frames: 5692882944. Throughput: 0: 52513.0. Samples: 183387080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 09:56:19,107][52031] Avg episode reward: [(0, '0.490')] [2024-04-27 09:56:19,671][52263] Updated weights for policy 0, policy_version 347468 (0.0033) [2024-04-27 09:56:22,832][52263] Updated weights for policy 0, policy_version 347478 (0.0030) [2024-04-27 09:56:24,106][52031] Fps is (10 sec: 55705.9, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5693161472. Throughput: 0: 52555.7. Samples: 183706680. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:56:24,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 09:56:25,897][52263] Updated weights for policy 0, policy_version 347488 (0.0029) [2024-04-27 09:56:29,039][52263] Updated weights for policy 0, policy_version 347498 (0.0029) [2024-04-27 09:56:29,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5693407232. Throughput: 0: 52740.9. Samples: 183870460. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:56:29,107][52031] Avg episode reward: [(0, '0.492')] [2024-04-27 09:56:32,154][52263] Updated weights for policy 0, policy_version 347508 (0.0032) [2024-04-27 09:56:34,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5693669376. Throughput: 0: 52797.8. Samples: 184187480. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:56:34,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 09:56:35,319][52263] Updated weights for policy 0, policy_version 347518 (0.0027) [2024-04-27 09:56:38,197][52263] Updated weights for policy 0, policy_version 347528 (0.0030) [2024-04-27 09:56:39,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52701.9, 300 sec: 52595.4). Total num frames: 5693915136. Throughput: 0: 52821.3. Samples: 184501480. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:56:39,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:56:40,710][52242] Signal inference workers to stop experience collection... (2600 times) [2024-04-27 09:56:40,711][52242] Signal inference workers to resume experience collection... (2600 times) [2024-04-27 09:56:40,741][52263] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-04-27 09:56:40,741][52263] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-04-27 09:56:41,517][52263] Updated weights for policy 0, policy_version 347538 (0.0040) [2024-04-27 09:56:44,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5694193664. Throughput: 0: 52855.1. Samples: 184662700. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:56:44,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 09:56:44,370][52263] Updated weights for policy 0, policy_version 347548 (0.0023) [2024-04-27 09:56:47,761][52263] Updated weights for policy 0, policy_version 347558 (0.0028) [2024-04-27 09:56:49,107][52031] Fps is (10 sec: 54067.0, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5694455808. Throughput: 0: 52881.3. Samples: 184974960. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:56:49,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 09:56:50,587][52263] Updated weights for policy 0, policy_version 347568 (0.0033) [2024-04-27 09:56:53,863][52263] Updated weights for policy 0, policy_version 347578 (0.0036) [2024-04-27 09:56:54,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52428.9, 300 sec: 52706.5). Total num frames: 5694717952. Throughput: 0: 52809.5. Samples: 185289000. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:56:54,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 09:56:56,938][52263] Updated weights for policy 0, policy_version 347588 (0.0034) [2024-04-27 09:56:59,107][52031] Fps is (10 sec: 54067.4, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5694996480. Throughput: 0: 52848.0. Samples: 185444640. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:56:59,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 09:57:00,151][52263] Updated weights for policy 0, policy_version 347598 (0.0033) [2024-04-27 09:57:03,290][52263] Updated weights for policy 0, policy_version 347608 (0.0031) [2024-04-27 09:57:04,106][52031] Fps is (10 sec: 52428.4, 60 sec: 52975.0, 300 sec: 52706.5). Total num frames: 5695242240. Throughput: 0: 52918.7. Samples: 185768420. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:57:04,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 09:57:06,129][52263] Updated weights for policy 0, policy_version 347618 (0.0035) [2024-04-27 09:57:09,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52428.7, 300 sec: 52650.9). Total num frames: 5695504384. Throughput: 0: 52774.4. Samples: 186081540. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:57:09,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:57:09,476][52263] Updated weights for policy 0, policy_version 347628 (0.0030) [2024-04-27 09:57:12,402][52263] Updated weights for policy 0, policy_version 347638 (0.0033) [2024-04-27 09:57:14,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52701.9, 300 sec: 52595.4). Total num frames: 5695766528. Throughput: 0: 52636.9. Samples: 186239120. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:57:14,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 09:57:15,611][52263] Updated weights for policy 0, policy_version 347648 (0.0029) [2024-04-27 09:57:18,426][52263] Updated weights for policy 0, policy_version 347658 (0.0028) [2024-04-27 09:57:19,106][52031] Fps is (10 sec: 57345.1, 60 sec: 53248.1, 300 sec: 52873.1). Total num frames: 5696077824. Throughput: 0: 52688.5. Samples: 186558460. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:57:19,107][52031] Avg episode reward: [(0, '0.499')] [2024-04-27 09:57:21,790][52263] Updated weights for policy 0, policy_version 347668 (0.0035) [2024-04-27 09:57:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52155.7, 300 sec: 52651.0). Total num frames: 5696290816. Throughput: 0: 52803.7. Samples: 186877640. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:57:24,107][52031] Avg episode reward: [(0, '0.510')] [2024-04-27 09:57:24,645][52263] Updated weights for policy 0, policy_version 347678 (0.0028) [2024-04-27 09:57:27,894][52263] Updated weights for policy 0, policy_version 347688 (0.0033) [2024-04-27 09:57:29,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5696569344. Throughput: 0: 52657.7. Samples: 187032300. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:57:29,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 09:57:31,039][52263] Updated weights for policy 0, policy_version 347698 (0.0033) [2024-04-27 09:57:33,972][52263] Updated weights for policy 0, policy_version 347708 (0.0032) [2024-04-27 09:57:34,107][52031] Fps is (10 sec: 55705.0, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5696847872. Throughput: 0: 52808.0. Samples: 187351320. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:57:34,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 09:57:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000347708_5696847872.pth... [2024-04-27 09:57:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000346936_5684199424.pth [2024-04-27 09:57:37,225][52263] Updated weights for policy 0, policy_version 347718 (0.0027) [2024-04-27 09:57:39,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.0, 300 sec: 52706.5). Total num frames: 5697110016. Throughput: 0: 52928.8. Samples: 187670800. Policy #0 lag: (min: 2.0, avg: 10.8, max: 22.0) [2024-04-27 09:57:39,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 09:57:40,180][52263] Updated weights for policy 0, policy_version 347728 (0.0029) [2024-04-27 09:57:43,295][52263] Updated weights for policy 0, policy_version 347738 (0.0034) [2024-04-27 09:57:44,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5697372160. Throughput: 0: 52905.4. Samples: 187825380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:57:44,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 09:57:46,399][52263] Updated weights for policy 0, policy_version 347748 (0.0032) [2024-04-27 09:57:49,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52975.1, 300 sec: 52762.0). Total num frames: 5697634304. Throughput: 0: 52840.5. Samples: 188146240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:57:49,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 09:57:49,514][52263] Updated weights for policy 0, policy_version 347758 (0.0026) [2024-04-27 09:57:52,584][52242] Signal inference workers to stop experience collection... (2650 times) [2024-04-27 09:57:52,585][52242] Signal inference workers to resume experience collection... 
(2650 times) [2024-04-27 09:57:52,630][52263] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-04-27 09:57:52,630][52263] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-04-27 09:57:52,711][52263] Updated weights for policy 0, policy_version 347768 (0.0025) [2024-04-27 09:57:54,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5697880064. Throughput: 0: 52889.2. Samples: 188461540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:57:54,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 09:57:55,744][52263] Updated weights for policy 0, policy_version 347778 (0.0037) [2024-04-27 09:57:59,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5698142208. Throughput: 0: 52771.9. Samples: 188613860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:57:59,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 09:57:59,407][52263] Updated weights for policy 0, policy_version 347788 (0.0026) [2024-04-27 09:58:02,064][52263] Updated weights for policy 0, policy_version 347798 (0.0036) [2024-04-27 09:58:04,107][52031] Fps is (10 sec: 54066.9, 60 sec: 52974.9, 300 sec: 52650.9). Total num frames: 5698420736. Throughput: 0: 52645.8. Samples: 188927520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:04,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 09:58:05,737][52263] Updated weights for policy 0, policy_version 347808 (0.0032) [2024-04-27 09:58:08,061][52263] Updated weights for policy 0, policy_version 347818 (0.0037) [2024-04-27 09:58:09,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.1, 300 sec: 52762.0). Total num frames: 5698699264. Throughput: 0: 52615.0. Samples: 189245320. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:09,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 09:58:11,788][52263] Updated weights for policy 0, policy_version 347828 (0.0027) [2024-04-27 09:58:14,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5698961408. Throughput: 0: 52907.1. Samples: 189413120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:14,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 09:58:14,189][52263] Updated weights for policy 0, policy_version 347838 (0.0038) [2024-04-27 09:58:17,862][52263] Updated weights for policy 0, policy_version 347848 (0.0035) [2024-04-27 09:58:19,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52155.7, 300 sec: 52706.5). Total num frames: 5699207168. Throughput: 0: 52808.1. Samples: 189727680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:19,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 09:58:20,438][52263] Updated weights for policy 0, policy_version 347858 (0.0030) [2024-04-27 09:58:24,100][52263] Updated weights for policy 0, policy_version 347868 (0.0031) [2024-04-27 09:58:24,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52974.8, 300 sec: 52762.0). Total num frames: 5699469312. Throughput: 0: 52685.7. Samples: 190041660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:24,108][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 09:58:26,870][52263] Updated weights for policy 0, policy_version 347878 (0.0033) [2024-04-27 09:58:29,107][52031] Fps is (10 sec: 54067.1, 60 sec: 52975.0, 300 sec: 52706.5). Total num frames: 5699747840. Throughput: 0: 52804.9. Samples: 190201600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:29,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 09:58:30,295][52263] Updated weights for policy 0, policy_version 347888 (0.0034) [2024-04-27 09:58:33,114][52263] Updated weights for policy 0, policy_version 347898 (0.0037) [2024-04-27 09:58:34,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5699993600. Throughput: 0: 52651.4. Samples: 190515560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:34,107][52031] Avg episode reward: [(0, '0.655')] [2024-04-27 09:58:36,452][52263] Updated weights for policy 0, policy_version 347908 (0.0033) [2024-04-27 09:58:39,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52701.9, 300 sec: 52762.1). Total num frames: 5700272128. Throughput: 0: 52694.2. Samples: 190832780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:39,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 09:58:39,298][52263] Updated weights for policy 0, policy_version 347918 (0.0027) [2024-04-27 09:58:42,644][52263] Updated weights for policy 0, policy_version 347928 (0.0034) [2024-04-27 09:58:44,106][52031] Fps is (10 sec: 55706.0, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5700550656. Throughput: 0: 52907.2. Samples: 190994680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:44,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 09:58:45,403][52263] Updated weights for policy 0, policy_version 347938 (0.0032) [2024-04-27 09:58:48,762][52263] Updated weights for policy 0, policy_version 347948 (0.0031) [2024-04-27 09:58:49,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5700796416. Throughput: 0: 52931.1. Samples: 191309420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:49,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 09:58:51,525][52263] Updated weights for policy 0, policy_version 347958 (0.0031) [2024-04-27 09:58:54,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52974.8, 300 sec: 52817.5). Total num frames: 5701058560. Throughput: 0: 52864.8. Samples: 191624240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 09:58:54,115][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 09:58:54,946][52263] Updated weights for policy 0, policy_version 347968 (0.0029) [2024-04-27 09:58:57,807][52263] Updated weights for policy 0, policy_version 347978 (0.0028) [2024-04-27 09:58:59,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 52762.0). Total num frames: 5701337088. Throughput: 0: 52684.4. Samples: 191783920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:58:59,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 09:59:01,007][52263] Updated weights for policy 0, policy_version 347988 (0.0027) [2024-04-27 09:59:04,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52701.8, 300 sec: 52650.9). Total num frames: 5701582848. Throughput: 0: 52789.2. Samples: 192103200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:04,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 09:59:04,130][52263] Updated weights for policy 0, policy_version 347998 (0.0035) [2024-04-27 09:59:05,133][52242] Signal inference workers to stop experience collection... (2700 times) [2024-04-27 09:59:05,166][52263] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-04-27 09:59:05,197][52242] Signal inference workers to resume experience collection... 
(2700 times) [2024-04-27 09:59:05,197][52263] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-04-27 09:59:07,228][52263] Updated weights for policy 0, policy_version 348008 (0.0027) [2024-04-27 09:59:09,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5701861376. Throughput: 0: 52848.5. Samples: 192419840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:09,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 09:59:10,462][52263] Updated weights for policy 0, policy_version 348018 (0.0028) [2024-04-27 09:59:13,345][52263] Updated weights for policy 0, policy_version 348028 (0.0033) [2024-04-27 09:59:14,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5702107136. Throughput: 0: 52774.2. Samples: 192576440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:14,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 09:59:16,832][52263] Updated weights for policy 0, policy_version 348038 (0.0029) [2024-04-27 09:59:19,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5702385664. Throughput: 0: 52856.0. Samples: 192894080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:19,108][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 09:59:19,553][52263] Updated weights for policy 0, policy_version 348048 (0.0037) [2024-04-27 09:59:23,203][52263] Updated weights for policy 0, policy_version 348058 (0.0028) [2024-04-27 09:59:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52975.1, 300 sec: 52706.5). Total num frames: 5702647808. Throughput: 0: 52929.8. Samples: 193214620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:24,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 09:59:25,613][52263] Updated weights for policy 0, policy_version 348068 (0.0031) [2024-04-27 09:59:29,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52428.7, 300 sec: 52651.0). Total num frames: 5702893568. Throughput: 0: 52791.4. Samples: 193370300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:29,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 09:59:29,296][52263] Updated weights for policy 0, policy_version 348078 (0.0032) [2024-04-27 09:59:31,795][52263] Updated weights for policy 0, policy_version 348088 (0.0026) [2024-04-27 09:59:34,106][52031] Fps is (10 sec: 52428.5, 60 sec: 52975.0, 300 sec: 52762.1). Total num frames: 5703172096. Throughput: 0: 52789.4. Samples: 193684940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:34,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 09:59:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000348094_5703172096.pth... [2024-04-27 09:59:34,172][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000347322_5690523648.pth [2024-04-27 09:59:35,469][52263] Updated weights for policy 0, policy_version 348098 (0.0029) [2024-04-27 09:59:38,097][52263] Updated weights for policy 0, policy_version 348108 (0.0042) [2024-04-27 09:59:39,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5703417856. Throughput: 0: 52678.3. Samples: 193994760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:39,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 09:59:41,661][52263] Updated weights for policy 0, policy_version 348118 (0.0035) [2024-04-27 09:59:44,107][52031] Fps is (10 sec: 54067.0, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5703712768. Throughput: 0: 52798.2. Samples: 194159840. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:44,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 09:59:44,213][52263] Updated weights for policy 0, policy_version 348128 (0.0025) [2024-04-27 09:59:47,796][52263] Updated weights for policy 0, policy_version 348138 (0.0029) [2024-04-27 09:59:49,106][52031] Fps is (10 sec: 55706.2, 60 sec: 52975.0, 300 sec: 52762.1). Total num frames: 5703974912. Throughput: 0: 52754.4. Samples: 194477140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:49,107][52031] Avg episode reward: [(0, '0.510')] [2024-04-27 09:59:50,409][52263] Updated weights for policy 0, policy_version 348148 (0.0029) [2024-04-27 09:59:54,061][52263] Updated weights for policy 0, policy_version 348158 (0.0035) [2024-04-27 09:59:54,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52701.9, 300 sec: 52650.9). Total num frames: 5704220672. Throughput: 0: 52758.6. Samples: 194793980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:54,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 09:59:56,521][52263] Updated weights for policy 0, policy_version 348168 (0.0029) [2024-04-27 09:59:59,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5704482816. Throughput: 0: 52653.8. Samples: 194945860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 09:59:59,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 10:00:00,293][52263] Updated weights for policy 0, policy_version 348178 (0.0033) [2024-04-27 10:00:02,610][52263] Updated weights for policy 0, policy_version 348188 (0.0030) [2024-04-27 10:00:04,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5704744960. Throughput: 0: 52653.8. Samples: 195263500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 10:00:04,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 10:00:06,434][52263] Updated weights for policy 0, policy_version 348198 (0.0031) [2024-04-27 10:00:08,740][52242] Signal inference workers to stop experience collection... (2750 times) [2024-04-27 10:00:08,746][52242] Signal inference workers to resume experience collection... (2750 times) [2024-04-27 10:00:08,760][52263] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-04-27 10:00:08,760][52263] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-04-27 10:00:08,870][52263] Updated weights for policy 0, policy_version 348208 (0.0037) [2024-04-27 10:00:09,107][52031] Fps is (10 sec: 55705.0, 60 sec: 52974.8, 300 sec: 52873.1). Total num frames: 5705039872. Throughput: 0: 52535.3. Samples: 195578720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 10:00:09,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 10:00:12,639][52263] Updated weights for policy 0, policy_version 348218 (0.0030) [2024-04-27 10:00:14,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53248.1, 300 sec: 52817.6). Total num frames: 5705302016. Throughput: 0: 52781.1. Samples: 195745440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:14,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 10:00:15,202][52263] Updated weights for policy 0, policy_version 348228 (0.0026) [2024-04-27 10:00:18,850][52263] Updated weights for policy 0, policy_version 348238 (0.0031) [2024-04-27 10:00:19,107][52031] Fps is (10 sec: 49152.5, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5705531392. Throughput: 0: 52857.3. Samples: 196063520. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:19,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 10:00:21,294][52263] Updated weights for policy 0, policy_version 348248 (0.0030) [2024-04-27 10:00:24,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.8, 300 sec: 52762.0). Total num frames: 5705826304. Throughput: 0: 53070.3. Samples: 196382920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:24,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 10:00:25,047][52263] Updated weights for policy 0, policy_version 348258 (0.0034) [2024-04-27 10:00:27,618][52263] Updated weights for policy 0, policy_version 348268 (0.0035) [2024-04-27 10:00:29,106][52031] Fps is (10 sec: 54067.3, 60 sec: 52975.0, 300 sec: 52706.5). Total num frames: 5706072064. Throughput: 0: 52821.3. Samples: 196536800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:29,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 10:00:31,079][52263] Updated weights for policy 0, policy_version 348278 (0.0029) [2024-04-27 10:00:33,825][52263] Updated weights for policy 0, policy_version 348288 (0.0037) [2024-04-27 10:00:34,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5706350592. Throughput: 0: 52772.0. Samples: 196851880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:34,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 10:00:37,361][52263] Updated weights for policy 0, policy_version 348298 (0.0030) [2024-04-27 10:00:39,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.0, 300 sec: 52873.1). Total num frames: 5706629120. Throughput: 0: 52835.6. Samples: 197171580. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:39,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 10:00:39,823][52263] Updated weights for policy 0, policy_version 348308 (0.0030) [2024-04-27 10:00:43,445][52263] Updated weights for policy 0, policy_version 348318 (0.0031) [2024-04-27 10:00:44,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5706874880. Throughput: 0: 53211.3. Samples: 197340360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:44,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 10:00:45,919][52263] Updated weights for policy 0, policy_version 348328 (0.0027) [2024-04-27 10:00:49,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5707137024. Throughput: 0: 53217.4. Samples: 197658280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:49,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 10:00:49,592][52263] Updated weights for policy 0, policy_version 348338 (0.0028) [2024-04-27 10:00:52,010][52263] Updated weights for policy 0, policy_version 348348 (0.0031) [2024-04-27 10:00:54,107][52031] Fps is (10 sec: 52427.6, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5707399168. Throughput: 0: 53236.0. Samples: 197974340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:00:55,844][52263] Updated weights for policy 0, policy_version 348358 (0.0029) [2024-04-27 10:00:58,602][52263] Updated weights for policy 0, policy_version 348368 (0.0026) [2024-04-27 10:00:59,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5707661312. Throughput: 0: 53091.9. Samples: 198134580. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:00:59,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 10:01:01,885][52263] Updated weights for policy 0, policy_version 348378 (0.0029) [2024-04-27 10:01:04,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.0, 300 sec: 52873.1). Total num frames: 5707956224. Throughput: 0: 53071.9. Samples: 198451760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:01:04,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 10:01:05,011][52263] Updated weights for policy 0, policy_version 348388 (0.0033) [2024-04-27 10:01:08,022][52263] Updated weights for policy 0, policy_version 348398 (0.0029) [2024-04-27 10:01:09,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52702.0, 300 sec: 52873.1). Total num frames: 5708201984. Throughput: 0: 52954.8. Samples: 198765880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:01:09,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 10:01:11,161][52263] Updated weights for policy 0, policy_version 348408 (0.0033) [2024-04-27 10:01:14,107][52031] Fps is (10 sec: 49151.9, 60 sec: 52428.6, 300 sec: 52762.0). Total num frames: 5708447744. Throughput: 0: 53049.2. Samples: 198924020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:01:14,107][52031] Avg episode reward: [(0, '0.488')] [2024-04-27 10:01:14,330][52263] Updated weights for policy 0, policy_version 348418 (0.0027) [2024-04-27 10:01:17,418][52242] Signal inference workers to stop experience collection... (2800 times) [2024-04-27 10:01:17,418][52242] Signal inference workers to resume experience collection... 
(2800 times) [2024-04-27 10:01:17,429][52263] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-04-27 10:01:17,430][52263] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-04-27 10:01:17,541][52263] Updated weights for policy 0, policy_version 348428 (0.0031) [2024-04-27 10:01:19,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 52762.0). Total num frames: 5708726272. Throughput: 0: 53066.3. Samples: 199239860. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:01:19,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:01:20,489][52263] Updated weights for policy 0, policy_version 348438 (0.0031) [2024-04-27 10:01:23,875][52263] Updated weights for policy 0, policy_version 348448 (0.0028) [2024-04-27 10:01:24,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5708972032. Throughput: 0: 53103.0. Samples: 199561220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:01:24,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 10:01:26,747][52263] Updated weights for policy 0, policy_version 348458 (0.0027) [2024-04-27 10:01:29,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5709250560. Throughput: 0: 52851.4. Samples: 199718680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-04-27 10:01:29,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:01:29,960][52263] Updated weights for policy 0, policy_version 348468 (0.0031) [2024-04-27 10:01:32,801][52263] Updated weights for policy 0, policy_version 348478 (0.0038) [2024-04-27 10:01:34,106][52031] Fps is (10 sec: 55706.7, 60 sec: 52975.0, 300 sec: 52928.7). Total num frames: 5709529088. Throughput: 0: 52788.5. Samples: 200033760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:01:34,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 10:01:34,113][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000348483_5709545472.pth... [2024-04-27 10:01:34,167][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000347708_5696847872.pth [2024-04-27 10:01:36,174][52263] Updated weights for policy 0, policy_version 348488 (0.0029) [2024-04-27 10:01:38,908][52263] Updated weights for policy 0, policy_version 348498 (0.0029) [2024-04-27 10:01:39,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52702.0, 300 sec: 52873.1). Total num frames: 5709791232. Throughput: 0: 52798.9. Samples: 200350280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:01:39,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 10:01:42,250][52263] Updated weights for policy 0, policy_version 348508 (0.0031) [2024-04-27 10:01:44,107][52031] Fps is (10 sec: 52427.7, 60 sec: 52974.7, 300 sec: 52873.1). Total num frames: 5710053376. Throughput: 0: 52772.3. Samples: 200509340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:01:44,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 10:01:45,129][52263] Updated weights for policy 0, policy_version 348518 (0.0027) [2024-04-27 10:01:48,527][52263] Updated weights for policy 0, policy_version 348528 (0.0028) [2024-04-27 10:01:49,107][52031] Fps is (10 sec: 49151.3, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5710282752. Throughput: 0: 52820.5. Samples: 200828680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:01:49,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 10:01:51,298][52263] Updated weights for policy 0, policy_version 348538 (0.0030) [2024-04-27 10:01:54,107][52031] Fps is (10 sec: 49152.3, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5710544896. Throughput: 0: 52729.6. Samples: 201138720. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:01:54,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 10:01:54,716][52263] Updated weights for policy 0, policy_version 348548 (0.0026) [2024-04-27 10:01:57,464][52263] Updated weights for policy 0, policy_version 348558 (0.0033) [2024-04-27 10:01:59,106][52031] Fps is (10 sec: 57344.5, 60 sec: 53248.0, 300 sec: 52928.7). Total num frames: 5710856192. Throughput: 0: 52710.4. Samples: 201295980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:01:59,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:02:00,909][52263] Updated weights for policy 0, policy_version 348568 (0.0033) [2024-04-27 10:02:03,668][52263] Updated weights for policy 0, policy_version 348578 (0.0030) [2024-04-27 10:02:04,107][52031] Fps is (10 sec: 57343.8, 60 sec: 52701.8, 300 sec: 52928.7). Total num frames: 5711118336. Throughput: 0: 52786.4. Samples: 201615260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:02:04,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 10:02:07,158][52263] Updated weights for policy 0, policy_version 348588 (0.0028) [2024-04-27 10:02:09,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5711364096. Throughput: 0: 52662.9. Samples: 201931040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:02:09,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 10:02:09,811][52263] Updated weights for policy 0, policy_version 348598 (0.0031) [2024-04-27 10:02:10,347][52242] Signal inference workers to stop experience collection... (2850 times) [2024-04-27 10:02:10,348][52242] Signal inference workers to resume experience collection... 
(2850 times) [2024-04-27 10:02:10,366][52263] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-04-27 10:02:10,366][52263] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-04-27 10:02:13,618][52263] Updated weights for policy 0, policy_version 348608 (0.0028) [2024-04-27 10:02:14,106][52031] Fps is (10 sec: 49153.0, 60 sec: 52702.0, 300 sec: 52651.0). Total num frames: 5711609856. Throughput: 0: 52585.0. Samples: 202085000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:02:14,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 10:02:16,062][52263] Updated weights for policy 0, policy_version 348618 (0.0031) [2024-04-27 10:02:19,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52428.7, 300 sec: 52817.5). Total num frames: 5711872000. Throughput: 0: 52546.9. Samples: 202398380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:02:19,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 10:02:19,738][52263] Updated weights for policy 0, policy_version 348628 (0.0031) [2024-04-27 10:02:22,239][52263] Updated weights for policy 0, policy_version 348638 (0.0037) [2024-04-27 10:02:24,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.2, 300 sec: 52873.1). Total num frames: 5712166912. Throughput: 0: 52573.8. Samples: 202716100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:02:24,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 10:02:25,940][52263] Updated weights for policy 0, policy_version 348648 (0.0026) [2024-04-27 10:02:28,348][52263] Updated weights for policy 0, policy_version 348658 (0.0027) [2024-04-27 10:02:29,107][52031] Fps is (10 sec: 55705.8, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5712429056. Throughput: 0: 52841.9. Samples: 202887220. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:02:29,108][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 10:02:32,168][52263] Updated weights for policy 0, policy_version 348668 (0.0031) [2024-04-27 10:02:34,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5712691200. Throughput: 0: 52663.2. Samples: 203198520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:02:34,115][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 10:02:34,886][52263] Updated weights for policy 0, policy_version 348678 (0.0032) [2024-04-27 10:02:38,513][52263] Updated weights for policy 0, policy_version 348688 (0.0040) [2024-04-27 10:02:39,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5712936960. Throughput: 0: 52809.0. Samples: 203515120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:02:39,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 10:02:41,088][52263] Updated weights for policy 0, policy_version 348698 (0.0032) [2024-04-27 10:02:44,106][52031] Fps is (10 sec: 49152.2, 60 sec: 52155.9, 300 sec: 52706.5). Total num frames: 5713182720. Throughput: 0: 52557.3. Samples: 203661060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 10:02:44,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 10:02:44,610][52263] Updated weights for policy 0, policy_version 348708 (0.0027) [2024-04-27 10:02:47,123][52263] Updated weights for policy 0, policy_version 348718 (0.0033) [2024-04-27 10:02:49,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5713477632. Throughput: 0: 52501.9. Samples: 203977840. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:02:49,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:02:50,667][52263] Updated weights for policy 0, policy_version 348728 (0.0028) [2024-04-27 10:02:53,300][52263] Updated weights for policy 0, policy_version 348738 (0.0031) [2024-04-27 10:02:54,107][52031] Fps is (10 sec: 58981.9, 60 sec: 53794.2, 300 sec: 52984.2). Total num frames: 5713772544. Throughput: 0: 52660.8. Samples: 204300780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:02:54,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 10:02:56,955][52263] Updated weights for policy 0, policy_version 348748 (0.0032) [2024-04-27 10:02:59,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5714018304. Throughput: 0: 52976.4. Samples: 204468940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:02:59,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 10:02:59,592][52263] Updated weights for policy 0, policy_version 348758 (0.0030) [2024-04-27 10:03:03,145][52263] Updated weights for policy 0, policy_version 348768 (0.0031) [2024-04-27 10:03:04,107][52031] Fps is (10 sec: 49152.1, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5714264064. Throughput: 0: 53062.7. Samples: 204786200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:04,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 10:03:05,648][52263] Updated weights for policy 0, policy_version 348778 (0.0027) [2024-04-27 10:03:09,107][52031] Fps is (10 sec: 49151.2, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5714509824. Throughput: 0: 53022.5. Samples: 205102120. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:09,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 10:03:09,394][52263] Updated weights for policy 0, policy_version 348788 (0.0035) [2024-04-27 10:03:10,987][52242] Signal inference workers to stop experience collection... (2900 times) [2024-04-27 10:03:11,025][52263] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-04-27 10:03:11,082][52242] Signal inference workers to resume experience collection... (2900 times) [2024-04-27 10:03:11,082][52263] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-04-27 10:03:11,865][52263] Updated weights for policy 0, policy_version 348798 (0.0029) [2024-04-27 10:03:14,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52974.8, 300 sec: 52817.6). Total num frames: 5714788352. Throughput: 0: 52494.2. Samples: 205249460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:14,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 10:03:15,442][52263] Updated weights for policy 0, policy_version 348808 (0.0029) [2024-04-27 10:03:17,983][52263] Updated weights for policy 0, policy_version 348818 (0.0033) [2024-04-27 10:03:19,107][52031] Fps is (10 sec: 57344.6, 60 sec: 53521.2, 300 sec: 52928.7). Total num frames: 5715083264. Throughput: 0: 52623.6. Samples: 205566580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:19,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:03:21,682][52263] Updated weights for policy 0, policy_version 348828 (0.0027) [2024-04-27 10:03:24,016][52263] Updated weights for policy 0, policy_version 348838 (0.0028) [2024-04-27 10:03:24,107][52031] Fps is (10 sec: 57343.8, 60 sec: 53247.8, 300 sec: 52928.6). Total num frames: 5715361792. Throughput: 0: 52882.1. Samples: 205894820. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:24,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 10:03:28,162][52263] Updated weights for policy 0, policy_version 348848 (0.0032) [2024-04-27 10:03:29,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5715591168. Throughput: 0: 53179.0. Samples: 206054120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:29,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 10:03:30,172][52263] Updated weights for policy 0, policy_version 348858 (0.0027) [2024-04-27 10:03:34,107][52031] Fps is (10 sec: 47513.1, 60 sec: 52428.6, 300 sec: 52762.0). Total num frames: 5715836928. Throughput: 0: 53231.4. Samples: 206373260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:34,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 10:03:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000348867_5715836928.pth... [2024-04-27 10:03:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000348094_5703172096.pth [2024-04-27 10:03:34,298][52263] Updated weights for policy 0, policy_version 348868 (0.0032) [2024-04-27 10:03:36,394][52263] Updated weights for policy 0, policy_version 348878 (0.0022) [2024-04-27 10:03:39,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5716115456. Throughput: 0: 53074.3. Samples: 206689120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:39,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 10:03:40,302][52263] Updated weights for policy 0, policy_version 348888 (0.0029) [2024-04-27 10:03:42,588][52263] Updated weights for policy 0, policy_version 348898 (0.0033) [2024-04-27 10:03:44,106][52031] Fps is (10 sec: 54068.5, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5716377600. Throughput: 0: 52881.3. Samples: 206848600. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:44,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 10:03:46,429][52263] Updated weights for policy 0, policy_version 348908 (0.0035) [2024-04-27 10:03:48,764][52263] Updated weights for policy 0, policy_version 348918 (0.0031) [2024-04-27 10:03:49,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 52928.7). Total num frames: 5716672512. Throughput: 0: 52913.8. Samples: 207167320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:49,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 10:03:52,737][52263] Updated weights for policy 0, policy_version 348928 (0.0031) [2024-04-27 10:03:54,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52428.9, 300 sec: 52817.6). Total num frames: 5716918272. Throughput: 0: 52889.5. Samples: 207482140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:54,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:03:54,975][52263] Updated weights for policy 0, policy_version 348938 (0.0030) [2024-04-27 10:03:58,923][52263] Updated weights for policy 0, policy_version 348948 (0.0036) [2024-04-27 10:03:59,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5717164032. Throughput: 0: 53182.8. Samples: 207642680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 10:03:59,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 10:04:01,115][52263] Updated weights for policy 0, policy_version 348958 (0.0037) [2024-04-27 10:04:04,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5717426176. Throughput: 0: 53137.7. Samples: 207957780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:04,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 10:04:05,143][52263] Updated weights for policy 0, policy_version 348968 (0.0027) [2024-04-27 10:04:06,881][52242] Signal inference workers to stop experience collection... (2950 times) [2024-04-27 10:04:06,881][52242] Signal inference workers to resume experience collection... (2950 times) [2024-04-27 10:04:06,897][52263] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-04-27 10:04:06,897][52263] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-04-27 10:04:07,397][52263] Updated weights for policy 0, policy_version 348978 (0.0032) [2024-04-27 10:04:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 52873.1). Total num frames: 5717704704. Throughput: 0: 52888.2. Samples: 208274780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:09,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 10:04:11,186][52263] Updated weights for policy 0, policy_version 348988 (0.0030) [2024-04-27 10:04:13,532][52263] Updated weights for policy 0, policy_version 348998 (0.0036) [2024-04-27 10:04:14,107][52031] Fps is (10 sec: 57343.3, 60 sec: 53521.0, 300 sec: 52928.6). Total num frames: 5717999616. Throughput: 0: 52919.4. Samples: 208435500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:14,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 10:04:17,469][52263] Updated weights for policy 0, policy_version 349008 (0.0030) [2024-04-27 10:04:19,106][52031] Fps is (10 sec: 54067.4, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5718245376. Throughput: 0: 52911.4. Samples: 208754260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:19,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 10:04:19,719][52263] Updated weights for policy 0, policy_version 349018 (0.0032) [2024-04-27 10:04:23,572][52263] Updated weights for policy 0, policy_version 349028 (0.0033) [2024-04-27 10:04:24,107][52031] Fps is (10 sec: 49152.4, 60 sec: 52155.8, 300 sec: 52873.1). Total num frames: 5718491136. Throughput: 0: 52928.7. Samples: 209070920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:24,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 10:04:26,039][52263] Updated weights for policy 0, policy_version 349038 (0.0028) [2024-04-27 10:04:29,106][52031] Fps is (10 sec: 49152.1, 60 sec: 52429.0, 300 sec: 52762.1). Total num frames: 5718736896. Throughput: 0: 52651.6. Samples: 209217920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:29,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 10:04:29,711][52263] Updated weights for policy 0, policy_version 349048 (0.0031) [2024-04-27 10:04:32,239][52263] Updated weights for policy 0, policy_version 349058 (0.0031) [2024-04-27 10:04:34,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.2, 300 sec: 52928.7). Total num frames: 5719031808. Throughput: 0: 52640.9. Samples: 209536160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:34,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 10:04:35,886][52263] Updated weights for policy 0, policy_version 349068 (0.0038) [2024-04-27 10:04:38,472][52263] Updated weights for policy 0, policy_version 349078 (0.0028) [2024-04-27 10:04:39,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53248.1, 300 sec: 52873.1). Total num frames: 5719310336. Throughput: 0: 52694.3. Samples: 209853380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:39,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 10:04:42,110][52263] Updated weights for policy 0, policy_version 349088 (0.0032) [2024-04-27 10:04:44,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.0, 300 sec: 52928.6). Total num frames: 5719588864. Throughput: 0: 53013.2. Samples: 210028280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:44,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 10:04:44,799][52263] Updated weights for policy 0, policy_version 349098 (0.0031) [2024-04-27 10:04:48,361][52263] Updated weights for policy 0, policy_version 349108 (0.0034) [2024-04-27 10:04:49,106][52031] Fps is (10 sec: 49151.8, 60 sec: 52155.8, 300 sec: 52817.6). Total num frames: 5719801856. Throughput: 0: 52885.5. Samples: 210337620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:49,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 10:04:51,127][52263] Updated weights for policy 0, policy_version 349118 (0.0026) [2024-04-27 10:04:54,107][52031] Fps is (10 sec: 47513.6, 60 sec: 52428.7, 300 sec: 52817.6). Total num frames: 5720064000. Throughput: 0: 52819.5. Samples: 210651660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:54,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 10:04:54,547][52263] Updated weights for policy 0, policy_version 349128 (0.0028) [2024-04-27 10:04:57,352][52263] Updated weights for policy 0, policy_version 349138 (0.0032) [2024-04-27 10:04:59,106][52031] Fps is (10 sec: 54066.8, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5720342528. Throughput: 0: 52456.6. Samples: 210796040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:04:59,107][52031] Avg episode reward: [(0, '0.670')] [2024-04-27 10:05:00,833][52263] Updated weights for policy 0, policy_version 349148 (0.0028) [2024-04-27 10:05:03,508][52263] Updated weights for policy 0, policy_version 349158 (0.0029) [2024-04-27 10:05:04,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5720621056. Throughput: 0: 52456.7. Samples: 211114820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:05:04,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 10:05:07,131][52263] Updated weights for policy 0, policy_version 349168 (0.0029) [2024-04-27 10:05:09,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5720899584. Throughput: 0: 52517.9. Samples: 211434220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:05:09,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 10:05:09,676][52263] Updated weights for policy 0, policy_version 349178 (0.0039) [2024-04-27 10:05:13,203][52263] Updated weights for policy 0, policy_version 349188 (0.0031) [2024-04-27 10:05:14,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.9, 300 sec: 52928.6). Total num frames: 5721145344. Throughput: 0: 52992.2. Samples: 211602580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:05:14,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 10:05:14,478][52242] Signal inference workers to stop experience collection... (3000 times) [2024-04-27 10:05:14,499][52263] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-04-27 10:05:14,537][52242] Signal inference workers to resume experience collection... 
(3000 times) [2024-04-27 10:05:14,538][52263] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-04-27 10:05:15,931][52263] Updated weights for policy 0, policy_version 349198 (0.0031) [2024-04-27 10:05:19,108][52031] Fps is (10 sec: 49147.0, 60 sec: 52427.8, 300 sec: 52761.9). Total num frames: 5721391104. Throughput: 0: 52879.7. Samples: 211915800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 10:05:19,108][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 10:05:19,405][52263] Updated weights for policy 0, policy_version 349208 (0.0035) [2024-04-27 10:05:22,421][52263] Updated weights for policy 0, policy_version 349218 (0.0030) [2024-04-27 10:05:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5721669632. Throughput: 0: 52736.4. Samples: 212226520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:05:24,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 10:05:25,596][52263] Updated weights for policy 0, policy_version 349228 (0.0034) [2024-04-27 10:05:28,696][52263] Updated weights for policy 0, policy_version 349238 (0.0031) [2024-04-27 10:05:29,106][52031] Fps is (10 sec: 54073.2, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5721931776. Throughput: 0: 52329.5. Samples: 212383100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:05:29,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 10:05:31,877][52263] Updated weights for policy 0, policy_version 349248 (0.0027) [2024-04-27 10:05:34,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5722193920. Throughput: 0: 52579.4. Samples: 212703700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:05:34,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 10:05:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000349255_5722193920.pth... 
[2024-04-27 10:05:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000348483_5709545472.pth [2024-04-27 10:05:34,863][52263] Updated weights for policy 0, policy_version 349258 (0.0027) [2024-04-27 10:05:38,004][52263] Updated weights for policy 0, policy_version 349268 (0.0037) [2024-04-27 10:05:39,106][52031] Fps is (10 sec: 52428.5, 60 sec: 52428.7, 300 sec: 52817.6). Total num frames: 5722456064. Throughput: 0: 52732.1. Samples: 213024600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:05:39,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 10:05:41,158][52263] Updated weights for policy 0, policy_version 349278 (0.0030) [2024-04-27 10:05:44,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52155.8, 300 sec: 52817.6). Total num frames: 5722718208. Throughput: 0: 52838.3. Samples: 213173760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:05:44,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 10:05:44,224][52263] Updated weights for policy 0, policy_version 349288 (0.0031) [2024-04-27 10:05:47,356][52263] Updated weights for policy 0, policy_version 349298 (0.0032) [2024-04-27 10:05:49,107][52031] Fps is (10 sec: 50789.5, 60 sec: 52701.7, 300 sec: 52762.0). Total num frames: 5722963968. Throughput: 0: 52772.8. Samples: 213489600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:05:49,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 10:05:50,498][52263] Updated weights for policy 0, policy_version 349308 (0.0034) [2024-04-27 10:05:53,576][52263] Updated weights for policy 0, policy_version 349318 (0.0033) [2024-04-27 10:05:54,107][52031] Fps is (10 sec: 52427.6, 60 sec: 52974.8, 300 sec: 52817.5). Total num frames: 5723242496. Throughput: 0: 52718.5. Samples: 213806560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:05:54,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 10:05:57,009][52263] Updated weights for policy 0, policy_version 349328 (0.0033) [2024-04-27 10:05:59,107][52031] Fps is (10 sec: 54067.8, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5723504640. Throughput: 0: 52563.6. Samples: 213967940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:05:59,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 10:05:59,830][52263] Updated weights for policy 0, policy_version 349338 (0.0031) [2024-04-27 10:06:03,174][52263] Updated weights for policy 0, policy_version 349348 (0.0033) [2024-04-27 10:06:04,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5723783168. Throughput: 0: 52682.9. Samples: 214286480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:06:04,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 10:06:05,931][52263] Updated weights for policy 0, policy_version 349358 (0.0028) [2024-04-27 10:06:09,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52155.7, 300 sec: 52817.6). Total num frames: 5724028928. Throughput: 0: 52791.0. Samples: 214602120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:06:09,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:06:09,333][52263] Updated weights for policy 0, policy_version 349368 (0.0039) [2024-04-27 10:06:12,055][52263] Updated weights for policy 0, policy_version 349378 (0.0028) [2024-04-27 10:06:14,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5724291072. Throughput: 0: 52779.5. Samples: 214758180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:06:14,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 10:06:15,398][52263] Updated weights for policy 0, policy_version 349388 (0.0030) [2024-04-27 10:06:18,604][52263] Updated weights for policy 0, policy_version 349398 (0.0032) [2024-04-27 10:06:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52702.7, 300 sec: 52817.6). Total num frames: 5724553216. Throughput: 0: 52639.1. Samples: 215072460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:06:19,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 10:06:20,769][52242] Signal inference workers to stop experience collection... (3050 times) [2024-04-27 10:06:20,772][52242] Signal inference workers to resume experience collection... (3050 times) [2024-04-27 10:06:20,818][52263] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-04-27 10:06:20,819][52263] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-04-27 10:06:21,719][52263] Updated weights for policy 0, policy_version 349408 (0.0026) [2024-04-27 10:06:24,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5724815360. Throughput: 0: 52485.3. Samples: 215386440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:06:24,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 10:06:25,048][52263] Updated weights for policy 0, policy_version 349418 (0.0027) [2024-04-27 10:06:27,896][52263] Updated weights for policy 0, policy_version 349428 (0.0039) [2024-04-27 10:06:29,106][52031] Fps is (10 sec: 55706.3, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5725110272. Throughput: 0: 52694.7. Samples: 215545020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:06:29,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 10:06:31,068][52263] Updated weights for policy 0, policy_version 349438 (0.0036) [2024-04-27 10:06:33,906][52263] Updated weights for policy 0, policy_version 349448 (0.0030) [2024-04-27 10:06:34,107][52031] Fps is (10 sec: 54067.0, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5725356032. Throughput: 0: 52835.7. Samples: 215867200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 10:06:34,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 10:06:37,317][52263] Updated weights for policy 0, policy_version 349458 (0.0032) [2024-04-27 10:06:39,106][52031] Fps is (10 sec: 47513.6, 60 sec: 52155.8, 300 sec: 52651.0). Total num frames: 5725585408. Throughput: 0: 52751.4. Samples: 216180360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:06:39,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 10:06:40,218][52263] Updated weights for policy 0, policy_version 349468 (0.0026) [2024-04-27 10:06:43,559][52263] Updated weights for policy 0, policy_version 349478 (0.0034) [2024-04-27 10:06:44,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5725880320. Throughput: 0: 52473.8. Samples: 216329260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:06:44,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 10:06:46,856][52263] Updated weights for policy 0, policy_version 349488 (0.0035) [2024-04-27 10:06:49,107][52031] Fps is (10 sec: 55705.2, 60 sec: 52975.1, 300 sec: 52873.1). Total num frames: 5726142464. Throughput: 0: 52427.1. Samples: 216645700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:06:49,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 10:06:49,679][52263] Updated weights for policy 0, policy_version 349498 (0.0028) [2024-04-27 10:06:52,872][52263] Updated weights for policy 0, policy_version 349508 (0.0035) [2024-04-27 10:06:54,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5726404608. Throughput: 0: 52542.5. Samples: 216966540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:06:54,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 10:06:55,730][52263] Updated weights for policy 0, policy_version 349518 (0.0030) [2024-04-27 10:06:58,842][52263] Updated weights for policy 0, policy_version 349528 (0.0033) [2024-04-27 10:06:59,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5726666752. Throughput: 0: 52471.6. Samples: 217119400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:06:59,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 10:07:01,954][52263] Updated weights for policy 0, policy_version 349538 (0.0028) [2024-04-27 10:07:04,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52155.6, 300 sec: 52706.5). Total num frames: 5726912512. Throughput: 0: 52593.7. Samples: 217439180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:04,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 10:07:05,199][52263] Updated weights for policy 0, policy_version 349548 (0.0029) [2024-04-27 10:07:08,123][52263] Updated weights for policy 0, policy_version 349558 (0.0030) [2024-04-27 10:07:09,107][52031] Fps is (10 sec: 54066.6, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5727207424. Throughput: 0: 52750.2. Samples: 217760200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:09,116][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 10:07:11,384][52263] Updated weights for policy 0, policy_version 349568 (0.0029) [2024-04-27 10:07:14,107][52031] Fps is (10 sec: 55705.4, 60 sec: 52974.8, 300 sec: 52873.1). Total num frames: 5727469568. Throughput: 0: 52773.9. Samples: 217919860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:14,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 10:07:14,258][52263] Updated weights for policy 0, policy_version 349578 (0.0035) [2024-04-27 10:07:17,514][52263] Updated weights for policy 0, policy_version 349588 (0.0028) [2024-04-27 10:07:18,027][52242] Signal inference workers to stop experience collection... (3100 times) [2024-04-27 10:07:18,027][52242] Signal inference workers to resume experience collection... (3100 times) [2024-04-27 10:07:18,057][52263] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-04-27 10:07:18,057][52263] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-04-27 10:07:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5727731712. Throughput: 0: 52650.2. Samples: 218236460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:19,107][52031] Avg episode reward: [(0, '0.679')] [2024-04-27 10:07:20,512][52263] Updated weights for policy 0, policy_version 349598 (0.0031) [2024-04-27 10:07:23,557][52263] Updated weights for policy 0, policy_version 349608 (0.0031) [2024-04-27 10:07:24,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5728010240. Throughput: 0: 52763.0. Samples: 218554700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:24,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 10:07:26,650][52263] Updated weights for policy 0, policy_version 349618 (0.0037) [2024-04-27 10:07:29,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52155.8, 300 sec: 52706.5). Total num frames: 5728239616. Throughput: 0: 52955.7. Samples: 218712260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:29,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 10:07:29,860][52263] Updated weights for policy 0, policy_version 349628 (0.0026) [2024-04-27 10:07:32,653][52263] Updated weights for policy 0, policy_version 349638 (0.0027) [2024-04-27 10:07:34,107][52031] Fps is (10 sec: 49151.2, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5728501760. Throughput: 0: 53019.8. Samples: 219031600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:34,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 10:07:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000349640_5728501760.pth... [2024-04-27 10:07:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000348867_5715836928.pth [2024-04-27 10:07:36,126][52263] Updated weights for policy 0, policy_version 349648 (0.0026) [2024-04-27 10:07:38,775][52263] Updated weights for policy 0, policy_version 349658 (0.0029) [2024-04-27 10:07:39,107][52031] Fps is (10 sec: 55704.0, 60 sec: 53520.8, 300 sec: 52928.6). Total num frames: 5728796672. Throughput: 0: 52886.2. Samples: 219346420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:39,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 10:07:42,341][52263] Updated weights for policy 0, policy_version 349668 (0.0033) [2024-04-27 10:07:44,107][52031] Fps is (10 sec: 55705.5, 60 sec: 52974.8, 300 sec: 52817.6). Total num frames: 5729058816. Throughput: 0: 53211.7. Samples: 219513940. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:44,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 10:07:45,027][52263] Updated weights for policy 0, policy_version 349678 (0.0031) [2024-04-27 10:07:48,418][52263] Updated weights for policy 0, policy_version 349688 (0.0027) [2024-04-27 10:07:49,106][52031] Fps is (10 sec: 54068.5, 60 sec: 53248.1, 300 sec: 52762.1). Total num frames: 5729337344. Throughput: 0: 53127.3. Samples: 219829900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:49,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 10:07:51,310][52263] Updated weights for policy 0, policy_version 349698 (0.0034) [2024-04-27 10:07:54,106][52031] Fps is (10 sec: 49152.9, 60 sec: 52429.0, 300 sec: 52650.9). Total num frames: 5729550336. Throughput: 0: 53152.5. Samples: 220152060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 10:07:54,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 10:07:54,568][52263] Updated weights for policy 0, policy_version 349708 (0.0039) [2024-04-27 10:07:57,635][52263] Updated weights for policy 0, policy_version 349718 (0.0031) [2024-04-27 10:07:59,106][52031] Fps is (10 sec: 47513.5, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5729812480. Throughput: 0: 52790.9. Samples: 220295440. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:07:59,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 10:08:00,947][52263] Updated weights for policy 0, policy_version 349728 (0.0040) [2024-04-27 10:08:03,858][52263] Updated weights for policy 0, policy_version 349738 (0.0031) [2024-04-27 10:08:04,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.2, 300 sec: 52873.1). Total num frames: 5730107392. Throughput: 0: 52764.1. Samples: 220610840. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:04,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 10:08:07,274][52263] Updated weights for policy 0, policy_version 349748 (0.0033) [2024-04-27 10:08:09,106][52031] Fps is (10 sec: 57344.1, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5730385920. Throughput: 0: 52575.2. Samples: 220920580. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:09,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 10:08:09,998][52263] Updated weights for policy 0, policy_version 349758 (0.0031) [2024-04-27 10:08:13,382][52263] Updated weights for policy 0, policy_version 349768 (0.0030) [2024-04-27 10:08:13,448][52242] Signal inference workers to stop experience collection... (3150 times) [2024-04-27 10:08:13,481][52263] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-04-27 10:08:13,510][52242] Signal inference workers to resume experience collection... (3150 times) [2024-04-27 10:08:13,515][52263] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-04-27 10:08:14,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5730648064. Throughput: 0: 52979.4. Samples: 221096340. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:14,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 10:08:16,123][52263] Updated weights for policy 0, policy_version 349778 (0.0029) [2024-04-27 10:08:19,106][52031] Fps is (10 sec: 47513.3, 60 sec: 52155.8, 300 sec: 52539.9). Total num frames: 5730861056. Throughput: 0: 52796.2. Samples: 221407420. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:19,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 10:08:19,497][52263] Updated weights for policy 0, policy_version 349788 (0.0028) [2024-04-27 10:08:22,279][52263] Updated weights for policy 0, policy_version 349798 (0.0030) [2024-04-27 10:08:24,106][52031] Fps is (10 sec: 45875.7, 60 sec: 51609.6, 300 sec: 52595.4). Total num frames: 5731106816. Throughput: 0: 52770.1. Samples: 221721060. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:24,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 10:08:25,781][52263] Updated weights for policy 0, policy_version 349808 (0.0034) [2024-04-27 10:08:28,492][52263] Updated weights for policy 0, policy_version 349818 (0.0036) [2024-04-27 10:08:29,107][52031] Fps is (10 sec: 55701.6, 60 sec: 52974.2, 300 sec: 52817.5). Total num frames: 5731418112. Throughput: 0: 52328.2. Samples: 221868740. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:29,108][52031] Avg episode reward: [(0, '0.488')] [2024-04-27 10:08:32,100][52263] Updated weights for policy 0, policy_version 349828 (0.0031) [2024-04-27 10:08:34,106][52031] Fps is (10 sec: 60620.6, 60 sec: 53521.2, 300 sec: 52873.1). Total num frames: 5731713024. Throughput: 0: 52308.8. Samples: 222183800. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:34,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 10:08:34,827][52263] Updated weights for policy 0, policy_version 349838 (0.0034) [2024-04-27 10:08:38,193][52263] Updated weights for policy 0, policy_version 349848 (0.0033) [2024-04-27 10:08:39,106][52031] Fps is (10 sec: 54071.4, 60 sec: 52702.1, 300 sec: 52817.6). Total num frames: 5731958784. Throughput: 0: 52220.5. Samples: 222501980. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:39,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 10:08:41,091][52263] Updated weights for policy 0, policy_version 349858 (0.0035) [2024-04-27 10:08:44,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52429.0, 300 sec: 52651.0). Total num frames: 5732204544. Throughput: 0: 52547.5. Samples: 222660080. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:44,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 10:08:44,368][52263] Updated weights for policy 0, policy_version 349868 (0.0034) [2024-04-27 10:08:47,477][52263] Updated weights for policy 0, policy_version 349878 (0.0030) [2024-04-27 10:08:49,107][52031] Fps is (10 sec: 47512.8, 60 sec: 51609.4, 300 sec: 52595.4). Total num frames: 5732433920. Throughput: 0: 52543.8. Samples: 222975320. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:49,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 10:08:50,708][52263] Updated weights for policy 0, policy_version 349888 (0.0030) [2024-04-27 10:08:53,662][52263] Updated weights for policy 0, policy_version 349898 (0.0030) [2024-04-27 10:08:54,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52974.8, 300 sec: 52762.0). Total num frames: 5732728832. Throughput: 0: 52767.8. Samples: 223295140. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:54,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 10:08:56,918][52263] Updated weights for policy 0, policy_version 349908 (0.0032) [2024-04-27 10:08:59,106][52031] Fps is (10 sec: 57345.2, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5733007360. Throughput: 0: 52510.8. Samples: 223459320. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:08:59,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 10:09:00,336][52263] Updated weights for policy 0, policy_version 349918 (0.0029) [2024-04-27 10:09:03,089][52263] Updated weights for policy 0, policy_version 349928 (0.0027) [2024-04-27 10:09:04,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53247.8, 300 sec: 52873.1). Total num frames: 5733302272. Throughput: 0: 52550.9. Samples: 223772220. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:09:04,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 10:09:06,678][52263] Updated weights for policy 0, policy_version 349938 (0.0037) [2024-04-27 10:09:09,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52155.7, 300 sec: 52595.5). Total num frames: 5733515264. Throughput: 0: 52675.6. Samples: 224091460. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-04-27 10:09:09,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 10:09:09,344][52263] Updated weights for policy 0, policy_version 349948 (0.0045) [2024-04-27 10:09:13,082][52263] Updated weights for policy 0, policy_version 349958 (0.0029) [2024-04-27 10:09:14,106][52031] Fps is (10 sec: 42599.0, 60 sec: 51336.6, 300 sec: 52484.3). Total num frames: 5733728256. Throughput: 0: 52537.3. Samples: 224232880. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:14,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 10:09:14,171][52242] Signal inference workers to stop experience collection... (3200 times) [2024-04-27 10:09:14,172][52242] Signal inference workers to resume experience collection... 
(3200 times) [2024-04-27 10:09:14,185][52263] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-04-27 10:09:14,185][52263] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-04-27 10:09:15,677][52263] Updated weights for policy 0, policy_version 349968 (0.0042) [2024-04-27 10:09:19,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52701.9, 300 sec: 52651.0). Total num frames: 5734023168. Throughput: 0: 52675.6. Samples: 224554200. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:19,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 10:09:19,297][52263] Updated weights for policy 0, policy_version 349978 (0.0033) [2024-04-27 10:09:21,906][52263] Updated weights for policy 0, policy_version 349988 (0.0034) [2024-04-27 10:09:24,106][52031] Fps is (10 sec: 60620.9, 60 sec: 53794.1, 300 sec: 52873.1). Total num frames: 5734334464. Throughput: 0: 52536.0. Samples: 224866100. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:24,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 10:09:25,474][52263] Updated weights for policy 0, policy_version 349998 (0.0035) [2024-04-27 10:09:28,163][52263] Updated weights for policy 0, policy_version 350008 (0.0032) [2024-04-27 10:09:29,107][52031] Fps is (10 sec: 57343.3, 60 sec: 52975.5, 300 sec: 52762.0). Total num frames: 5734596608. Throughput: 0: 52960.8. Samples: 225043320. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:29,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 10:09:31,801][52263] Updated weights for policy 0, policy_version 350018 (0.0030) [2024-04-27 10:09:34,107][52031] Fps is (10 sec: 49151.7, 60 sec: 51882.6, 300 sec: 52595.4). Total num frames: 5734825984. Throughput: 0: 52925.4. Samples: 225356960. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:34,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 10:09:34,287][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000350028_5734858752.pth... [2024-04-27 10:09:34,295][52263] Updated weights for policy 0, policy_version 350028 (0.0033) [2024-04-27 10:09:34,340][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000349255_5722193920.pth [2024-04-27 10:09:38,027][52263] Updated weights for policy 0, policy_version 350038 (0.0038) [2024-04-27 10:09:39,106][52031] Fps is (10 sec: 47514.0, 60 sec: 51882.7, 300 sec: 52484.4). Total num frames: 5735071744. Throughput: 0: 52753.5. Samples: 225669040. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:39,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 10:09:40,403][52263] Updated weights for policy 0, policy_version 350048 (0.0034) [2024-04-27 10:09:44,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52155.8, 300 sec: 52651.0). Total num frames: 5735333888. Throughput: 0: 52265.8. Samples: 225811280. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:44,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 10:09:44,181][52263] Updated weights for policy 0, policy_version 350058 (0.0031) [2024-04-27 10:09:46,651][52263] Updated weights for policy 0, policy_version 350068 (0.0027) [2024-04-27 10:09:49,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53248.1, 300 sec: 52762.0). Total num frames: 5735628800. Throughput: 0: 52362.8. Samples: 226128540. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:49,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 10:09:50,582][52263] Updated weights for policy 0, policy_version 350078 (0.0027) [2024-04-27 10:09:51,696][52242] Signal inference workers to stop experience collection... 
(3250 times) [2024-04-27 10:09:51,726][52263] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-04-27 10:09:51,750][52242] Signal inference workers to resume experience collection... (3250 times) [2024-04-27 10:09:51,750][52263] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-04-27 10:09:52,804][52263] Updated weights for policy 0, policy_version 350088 (0.0029) [2024-04-27 10:09:54,106][52031] Fps is (10 sec: 57344.0, 60 sec: 52975.1, 300 sec: 52762.0). Total num frames: 5735907328. Throughput: 0: 52214.7. Samples: 226441120. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:54,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 10:09:56,897][52263] Updated weights for policy 0, policy_version 350098 (0.0029) [2024-04-27 10:09:58,980][52263] Updated weights for policy 0, policy_version 350108 (0.0030) [2024-04-27 10:09:59,106][52031] Fps is (10 sec: 55705.9, 60 sec: 52974.9, 300 sec: 52762.1). Total num frames: 5736185856. Throughput: 0: 53076.5. Samples: 226621320. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:09:59,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 10:10:02,997][52263] Updated weights for policy 0, policy_version 350118 (0.0034) [2024-04-27 10:10:04,106][52031] Fps is (10 sec: 47513.6, 60 sec: 51336.7, 300 sec: 52484.4). Total num frames: 5736382464. Throughput: 0: 52802.2. Samples: 226930300. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:10:04,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 10:10:05,168][52263] Updated weights for policy 0, policy_version 350128 (0.0032) [2024-04-27 10:10:09,105][52263] Updated weights for policy 0, policy_version 350138 (0.0039) [2024-04-27 10:10:09,107][52031] Fps is (10 sec: 47513.2, 60 sec: 52428.7, 300 sec: 52595.4). Total num frames: 5736660992. Throughput: 0: 52794.6. Samples: 227241860. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:10:09,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 10:10:11,416][52263] Updated weights for policy 0, policy_version 350148 (0.0024) [2024-04-27 10:10:14,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.2, 300 sec: 52706.7). Total num frames: 5736939520. Throughput: 0: 52148.6. Samples: 227390000. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:10:14,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 10:10:15,366][52263] Updated weights for policy 0, policy_version 350158 (0.0029) [2024-04-27 10:10:17,630][52263] Updated weights for policy 0, policy_version 350168 (0.0035) [2024-04-27 10:10:19,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53247.8, 300 sec: 52706.5). Total num frames: 5737218048. Throughput: 0: 52328.0. Samples: 227711720. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:10:19,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 10:10:21,615][52263] Updated weights for policy 0, policy_version 350178 (0.0028) [2024-04-27 10:10:23,721][52263] Updated weights for policy 0, policy_version 350188 (0.0034) [2024-04-27 10:10:24,107][52031] Fps is (10 sec: 57342.6, 60 sec: 52974.8, 300 sec: 52817.5). Total num frames: 5737512960. Throughput: 0: 52436.2. Samples: 228028680. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-04-27 10:10:24,107][52031] Avg episode reward: [(0, '0.474')] [2024-04-27 10:10:27,767][52263] Updated weights for policy 0, policy_version 350198 (0.0034) [2024-04-27 10:10:29,106][52031] Fps is (10 sec: 50791.4, 60 sec: 52155.9, 300 sec: 52651.0). Total num frames: 5737725952. Throughput: 0: 52917.8. Samples: 228192580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:10:29,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 10:10:29,708][52242] Signal inference workers to stop experience collection... 
(3300 times) [2024-04-27 10:10:29,749][52263] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-04-27 10:10:29,777][52242] Signal inference workers to resume experience collection... (3300 times) [2024-04-27 10:10:29,783][52263] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-04-27 10:10:29,908][52263] Updated weights for policy 0, policy_version 350208 (0.0027) [2024-04-27 10:10:33,865][52263] Updated weights for policy 0, policy_version 350218 (0.0035) [2024-04-27 10:10:34,107][52031] Fps is (10 sec: 45875.6, 60 sec: 52428.8, 300 sec: 52595.4). Total num frames: 5737971712. Throughput: 0: 52903.5. Samples: 228509200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:10:34,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:10:36,088][52263] Updated weights for policy 0, policy_version 350228 (0.0034) [2024-04-27 10:10:39,107][52031] Fps is (10 sec: 50789.4, 60 sec: 52701.7, 300 sec: 52595.4). Total num frames: 5738233856. Throughput: 0: 52914.5. Samples: 228822280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:10:39,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 10:10:40,181][52263] Updated weights for policy 0, policy_version 350238 (0.0027) [2024-04-27 10:10:42,345][52263] Updated weights for policy 0, policy_version 350248 (0.0033) [2024-04-27 10:10:44,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53521.0, 300 sec: 52817.6). Total num frames: 5738545152. Throughput: 0: 52431.0. Samples: 228980720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:10:44,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:10:46,382][52263] Updated weights for policy 0, policy_version 350258 (0.0028) [2024-04-27 10:10:48,513][52263] Updated weights for policy 0, policy_version 350268 (0.0026) [2024-04-27 10:10:49,107][52031] Fps is (10 sec: 58982.7, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5738823680. Throughput: 0: 52675.0. Samples: 229300680. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:10:49,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 10:10:52,643][52263] Updated weights for policy 0, policy_version 350278 (0.0039) [2024-04-27 10:10:54,107][52031] Fps is (10 sec: 50790.3, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5739053056. Throughput: 0: 52815.1. Samples: 229618540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:10:54,107][52031] Avg episode reward: [(0, '0.674')] [2024-04-27 10:10:54,715][52263] Updated weights for policy 0, policy_version 350288 (0.0036) [2024-04-27 10:10:58,928][52263] Updated weights for policy 0, policy_version 350298 (0.0026) [2024-04-27 10:10:59,107][52031] Fps is (10 sec: 47513.0, 60 sec: 51882.5, 300 sec: 52595.4). Total num frames: 5739298816. Throughput: 0: 52756.1. Samples: 229764040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:10:59,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 10:11:01,028][52263] Updated weights for policy 0, policy_version 350308 (0.0029) [2024-04-27 10:11:04,106][52031] Fps is (10 sec: 49152.6, 60 sec: 52701.9, 300 sec: 52595.4). Total num frames: 5739544576. Throughput: 0: 52642.8. Samples: 230080640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:11:04,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 10:11:04,988][52263] Updated weights for policy 0, policy_version 350318 (0.0033) [2024-04-27 10:11:07,374][52263] Updated weights for policy 0, policy_version 350328 (0.0027) [2024-04-27 10:11:09,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53248.1, 300 sec: 52762.0). Total num frames: 5739855872. Throughput: 0: 52714.0. Samples: 230400800. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:11:09,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 10:11:11,029][52263] Updated weights for policy 0, policy_version 350338 (0.0027) [2024-04-27 10:11:13,449][52263] Updated weights for policy 0, policy_version 350348 (0.0024) [2024-04-27 10:11:14,106][52031] Fps is (10 sec: 57344.2, 60 sec: 52974.9, 300 sec: 52762.1). Total num frames: 5740118016. Throughput: 0: 52872.0. Samples: 230571820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:11:14,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 10:11:17,288][52263] Updated weights for policy 0, policy_version 350358 (0.0030) [2024-04-27 10:11:19,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52975.1, 300 sec: 52817.6). Total num frames: 5740396544. Throughput: 0: 52804.1. Samples: 230885380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:11:19,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 10:11:19,526][52263] Updated weights for policy 0, policy_version 350368 (0.0028) [2024-04-27 10:11:23,484][52263] Updated weights for policy 0, policy_version 350378 (0.0038) [2024-04-27 10:11:24,106][52031] Fps is (10 sec: 49151.8, 60 sec: 51609.8, 300 sec: 52539.9). Total num frames: 5740609536. Throughput: 0: 52879.7. Samples: 231201860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:11:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 10:11:24,266][52242] Signal inference workers to stop experience collection... (3350 times) [2024-04-27 10:11:24,300][52263] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-04-27 10:11:24,317][52242] Signal inference workers to resume experience collection... 
(3350 times) [2024-04-27 10:11:24,318][52263] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-04-27 10:11:25,634][52263] Updated weights for policy 0, policy_version 350388 (0.0027) [2024-04-27 10:11:29,106][52031] Fps is (10 sec: 45875.0, 60 sec: 52155.6, 300 sec: 52539.9). Total num frames: 5740855296. Throughput: 0: 52617.4. Samples: 231348500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:11:29,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 10:11:29,868][52263] Updated weights for policy 0, policy_version 350398 (0.0028) [2024-04-27 10:11:31,915][52263] Updated weights for policy 0, policy_version 350408 (0.0025) [2024-04-27 10:11:34,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5741150208. Throughput: 0: 52442.2. Samples: 231660580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:11:34,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 10:11:34,120][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000350412_5741150208.pth... [2024-04-27 10:11:34,181][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000349640_5728501760.pth [2024-04-27 10:11:36,216][52263] Updated weights for policy 0, policy_version 350418 (0.0030) [2024-04-27 10:11:38,212][52263] Updated weights for policy 0, policy_version 350428 (0.0025) [2024-04-27 10:11:39,106][52031] Fps is (10 sec: 58983.0, 60 sec: 53521.2, 300 sec: 52762.1). Total num frames: 5741445120. Throughput: 0: 52446.4. Samples: 231978620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:11:39,107][52031] Avg episode reward: [(0, '0.684')] [2024-04-27 10:11:42,321][52263] Updated weights for policy 0, policy_version 350438 (0.0030) [2024-04-27 10:11:44,107][52031] Fps is (10 sec: 55705.4, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5741707264. Throughput: 0: 53237.0. Samples: 232159700. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:11:44,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 10:11:44,389][52263] Updated weights for policy 0, policy_version 350448 (0.0027) [2024-04-27 10:11:48,423][52263] Updated weights for policy 0, policy_version 350458 (0.0031) [2024-04-27 10:11:49,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52155.7, 300 sec: 52706.5). Total num frames: 5741953024. Throughput: 0: 53165.3. Samples: 232473080. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:11:49,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 10:11:50,840][52263] Updated weights for policy 0, policy_version 350468 (0.0028) [2024-04-27 10:11:54,106][52031] Fps is (10 sec: 47514.0, 60 sec: 52155.8, 300 sec: 52595.4). Total num frames: 5742182400. Throughput: 0: 53084.8. Samples: 232789620. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:11:54,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 10:11:54,521][52263] Updated weights for policy 0, policy_version 350478 (0.0027) [2024-04-27 10:11:57,039][52263] Updated weights for policy 0, policy_version 350488 (0.0030) [2024-04-27 10:11:59,106][52031] Fps is (10 sec: 49152.5, 60 sec: 52429.0, 300 sec: 52651.0). Total num frames: 5742444544. Throughput: 0: 52459.5. Samples: 232932500. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:11:59,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 10:12:00,856][52263] Updated weights for policy 0, policy_version 350498 (0.0037) [2024-04-27 10:12:03,134][52263] Updated weights for policy 0, policy_version 350508 (0.0027) [2024-04-27 10:12:04,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53247.8, 300 sec: 52650.9). Total num frames: 5742739456. Throughput: 0: 52465.1. Samples: 233246320. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:04,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 10:12:06,491][52242] Signal inference workers to stop experience collection... 
(3400 times) [2024-04-27 10:12:06,539][52263] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-04-27 10:12:06,552][52242] Signal inference workers to resume experience collection... (3400 times) [2024-04-27 10:12:06,556][52263] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-04-27 10:12:07,039][52263] Updated weights for policy 0, policy_version 350518 (0.0031) [2024-04-27 10:12:09,106][52031] Fps is (10 sec: 57343.7, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5743017984. Throughput: 0: 52441.7. Samples: 233561740. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:09,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 10:12:09,354][52263] Updated weights for policy 0, policy_version 350528 (0.0028) [2024-04-27 10:12:13,280][52263] Updated weights for policy 0, policy_version 350538 (0.0028) [2024-04-27 10:12:14,107][52031] Fps is (10 sec: 55706.1, 60 sec: 52974.8, 300 sec: 52762.0). Total num frames: 5743296512. Throughput: 0: 52956.4. Samples: 233731540. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:14,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:12:15,518][52263] Updated weights for policy 0, policy_version 350548 (0.0029) [2024-04-27 10:12:19,106][52031] Fps is (10 sec: 47513.6, 60 sec: 51609.6, 300 sec: 52484.3). Total num frames: 5743493120. Throughput: 0: 53114.7. Samples: 234050740. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:19,107][52031] Avg episode reward: [(0, '0.454')] [2024-04-27 10:12:19,441][52263] Updated weights for policy 0, policy_version 350558 (0.0031) [2024-04-27 10:12:21,684][52263] Updated weights for policy 0, policy_version 350568 (0.0030) [2024-04-27 10:12:24,107][52031] Fps is (10 sec: 47513.7, 60 sec: 52701.8, 300 sec: 52650.9). Total num frames: 5743771648. Throughput: 0: 53033.6. Samples: 234365140. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:24,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 10:12:25,558][52263] Updated weights for policy 0, policy_version 350578 (0.0032) [2024-04-27 10:12:27,680][52263] Updated weights for policy 0, policy_version 350588 (0.0028) [2024-04-27 10:12:29,106][52031] Fps is (10 sec: 58982.4, 60 sec: 53794.1, 300 sec: 52817.6). Total num frames: 5744082944. Throughput: 0: 52416.6. Samples: 234518440. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:29,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:12:31,730][52263] Updated weights for policy 0, policy_version 350598 (0.0034) [2024-04-27 10:12:34,106][52031] Fps is (10 sec: 57344.5, 60 sec: 53248.1, 300 sec: 52706.5). Total num frames: 5744345088. Throughput: 0: 52510.3. Samples: 234836040. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:34,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 10:12:34,114][52263] Updated weights for policy 0, policy_version 350608 (0.0035) [2024-04-27 10:12:37,964][52263] Updated weights for policy 0, policy_version 350618 (0.0029) [2024-04-27 10:12:39,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52701.7, 300 sec: 52706.5). Total num frames: 5744607232. Throughput: 0: 52538.2. Samples: 235153840. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:39,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 10:12:40,463][52263] Updated weights for policy 0, policy_version 350628 (0.0028) [2024-04-27 10:12:44,106][52031] Fps is (10 sec: 49151.7, 60 sec: 52155.8, 300 sec: 52539.9). Total num frames: 5744836608. Throughput: 0: 52770.6. Samples: 235307180. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:44,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 10:12:44,161][52263] Updated weights for policy 0, policy_version 350638 (0.0026) [2024-04-27 10:12:46,675][52263] Updated weights for policy 0, policy_version 350648 (0.0033) [2024-04-27 10:12:49,106][52031] Fps is (10 sec: 49152.2, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5745098752. Throughput: 0: 52918.4. Samples: 235627640. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:49,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 10:12:50,302][52263] Updated weights for policy 0, policy_version 350658 (0.0029) [2024-04-27 10:12:52,894][52263] Updated weights for policy 0, policy_version 350668 (0.0030) [2024-04-27 10:12:54,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5745360896. Throughput: 0: 52917.7. Samples: 235943040. Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:54,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 10:12:56,543][52263] Updated weights for policy 0, policy_version 350678 (0.0035) [2024-04-27 10:12:58,190][52242] Signal inference workers to stop experience collection... (3450 times) [2024-04-27 10:12:58,224][52263] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-04-27 10:12:58,280][52242] Signal inference workers to resume experience collection... (3450 times) [2024-04-27 10:12:58,280][52263] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-04-27 10:12:59,057][52263] Updated weights for policy 0, policy_version 350688 (0.0032) [2024-04-27 10:12:59,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53794.0, 300 sec: 52762.0). Total num frames: 5745672192. Throughput: 0: 52667.1. Samples: 236101560. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 24.0) [2024-04-27 10:12:59,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 10:13:02,738][52263] Updated weights for policy 0, policy_version 350698 (0.0033) [2024-04-27 10:13:04,106][52031] Fps is (10 sec: 55706.1, 60 sec: 52975.1, 300 sec: 52651.0). Total num frames: 5745917952. Throughput: 0: 52692.0. Samples: 236421880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:04,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 10:13:05,354][52263] Updated weights for policy 0, policy_version 350708 (0.0028) [2024-04-27 10:13:08,985][52263] Updated weights for policy 0, policy_version 350718 (0.0034) [2024-04-27 10:13:09,107][52031] Fps is (10 sec: 49152.0, 60 sec: 52428.7, 300 sec: 52595.4). Total num frames: 5746163712. Throughput: 0: 52799.5. Samples: 236741120. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:09,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 10:13:11,560][52263] Updated weights for policy 0, policy_version 350728 (0.0028) [2024-04-27 10:13:14,106][52031] Fps is (10 sec: 49152.1, 60 sec: 51882.8, 300 sec: 52706.5). Total num frames: 5746409472. Throughput: 0: 52563.2. Samples: 236883780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:14,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 10:13:15,234][52263] Updated weights for policy 0, policy_version 350738 (0.0038) [2024-04-27 10:13:17,863][52263] Updated weights for policy 0, policy_version 350748 (0.0037) [2024-04-27 10:13:19,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5746688000. Throughput: 0: 52494.6. Samples: 237198300. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:19,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:13:21,703][52263] Updated weights for policy 0, policy_version 350758 (0.0036) [2024-04-27 10:13:24,104][52263] Updated weights for policy 0, policy_version 350768 (0.0036) [2024-04-27 10:13:24,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53521.1, 300 sec: 52762.2). Total num frames: 5746982912. Throughput: 0: 52385.4. Samples: 237511180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:24,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 10:13:27,784][52263] Updated weights for policy 0, policy_version 350778 (0.0027) [2024-04-27 10:13:29,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52428.6, 300 sec: 52595.4). Total num frames: 5747228672. Throughput: 0: 52647.8. Samples: 237676340. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:29,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 10:13:30,298][52263] Updated weights for policy 0, policy_version 350788 (0.0032) [2024-04-27 10:13:33,802][52263] Updated weights for policy 0, policy_version 350798 (0.0029) [2024-04-27 10:13:34,107][52031] Fps is (10 sec: 49151.1, 60 sec: 52155.6, 300 sec: 52595.4). Total num frames: 5747474432. Throughput: 0: 52572.7. Samples: 237993420. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:34,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 10:13:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000350798_5747474432.pth... [2024-04-27 10:13:34,159][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000350028_5734858752.pth [2024-04-27 10:13:36,645][52263] Updated weights for policy 0, policy_version 350808 (0.0032) [2024-04-27 10:13:39,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52155.6, 300 sec: 52650.9). Total num frames: 5747736576. Throughput: 0: 52515.0. Samples: 238306220. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:39,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 10:13:40,157][52263] Updated weights for policy 0, policy_version 350818 (0.0032) [2024-04-27 10:13:42,973][52263] Updated weights for policy 0, policy_version 350828 (0.0043) [2024-04-27 10:13:44,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52428.7, 300 sec: 52706.5). Total num frames: 5747982336. Throughput: 0: 52457.7. Samples: 238462160. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:44,116][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 10:13:46,507][52263] Updated weights for policy 0, policy_version 350838 (0.0032) [2024-04-27 10:13:49,107][52031] Fps is (10 sec: 54067.5, 60 sec: 52974.8, 300 sec: 52706.5). Total num frames: 5748277248. Throughput: 0: 52236.3. Samples: 238772520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:49,115][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 10:13:49,127][52263] Updated weights for policy 0, policy_version 350848 (0.0028) [2024-04-27 10:13:52,696][52263] Updated weights for policy 0, policy_version 350858 (0.0031) [2024-04-27 10:13:54,107][52031] Fps is (10 sec: 55706.4, 60 sec: 52975.0, 300 sec: 52650.9). Total num frames: 5748539392. Throughput: 0: 52221.8. Samples: 239091100. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:54,116][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 10:13:55,382][52263] Updated weights for policy 0, policy_version 350868 (0.0031) [2024-04-27 10:13:58,816][52263] Updated weights for policy 0, policy_version 350878 (0.0035) [2024-04-27 10:13:59,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52155.8, 300 sec: 52539.9). Total num frames: 5748801536. Throughput: 0: 52799.0. Samples: 239259740. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:13:59,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 10:13:59,219][52242] Signal inference workers to stop experience collection... 
(3500 times) [2024-04-27 10:13:59,220][52242] Signal inference workers to resume experience collection... (3500 times) [2024-04-27 10:13:59,259][52263] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-04-27 10:13:59,259][52263] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-04-27 10:14:01,493][52263] Updated weights for policy 0, policy_version 350888 (0.0033) [2024-04-27 10:14:04,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52155.6, 300 sec: 52650.9). Total num frames: 5749047296. Throughput: 0: 52737.3. Samples: 239571480. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:14:04,114][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 10:14:05,118][52263] Updated weights for policy 0, policy_version 350898 (0.0034) [2024-04-27 10:14:07,690][52263] Updated weights for policy 0, policy_version 350908 (0.0030) [2024-04-27 10:14:09,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5749309440. Throughput: 0: 52702.1. Samples: 239882780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:14:09,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 10:14:11,391][52263] Updated weights for policy 0, policy_version 350918 (0.0034) [2024-04-27 10:14:13,956][52263] Updated weights for policy 0, policy_version 350928 (0.0028) [2024-04-27 10:14:14,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53247.9, 300 sec: 52817.5). Total num frames: 5749604352. Throughput: 0: 52614.8. Samples: 240044000. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 10:14:14,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 10:14:17,474][52263] Updated weights for policy 0, policy_version 350938 (0.0037) [2024-04-27 10:14:19,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52428.8, 300 sec: 52539.9). Total num frames: 5749833728. Throughput: 0: 52598.0. Samples: 240360320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:14:19,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 10:14:20,249][52263] Updated weights for policy 0, policy_version 350948 (0.0031) [2024-04-27 10:14:23,633][52263] Updated weights for policy 0, policy_version 350958 (0.0029) [2024-04-27 10:14:24,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52155.7, 300 sec: 52595.4). Total num frames: 5750112256. Throughput: 0: 52664.6. Samples: 240676120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:14:24,107][52031] Avg episode reward: [(0, '0.673')] [2024-04-27 10:14:26,759][52263] Updated weights for policy 0, policy_version 350968 (0.0029) [2024-04-27 10:14:29,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52155.9, 300 sec: 52651.0). Total num frames: 5750358016. Throughput: 0: 52650.4. Samples: 240831420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:14:29,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 10:14:29,919][52263] Updated weights for policy 0, policy_version 350978 (0.0029) [2024-04-27 10:14:33,200][52263] Updated weights for policy 0, policy_version 350988 (0.0030) [2024-04-27 10:14:34,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5750636544. Throughput: 0: 52651.1. Samples: 241141820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:14:34,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 10:14:36,174][52263] Updated weights for policy 0, policy_version 350998 (0.0026) [2024-04-27 10:14:39,106][52031] Fps is (10 sec: 52428.5, 60 sec: 52428.9, 300 sec: 52706.5). Total num frames: 5750882304. Throughput: 0: 52554.2. Samples: 241456040. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:14:39,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 10:14:39,507][52263] Updated weights for policy 0, policy_version 351008 (0.0027) [2024-04-27 10:14:42,326][52263] Updated weights for policy 0, policy_version 351018 (0.0031) [2024-04-27 10:14:44,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52975.0, 300 sec: 52651.0). Total num frames: 5751160832. Throughput: 0: 52478.2. Samples: 241621260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:14:44,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 10:14:45,606][52263] Updated weights for policy 0, policy_version 351028 (0.0026) [2024-04-27 10:14:48,612][52263] Updated weights for policy 0, policy_version 351038 (0.0033) [2024-04-27 10:14:49,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52428.9, 300 sec: 52595.4). Total num frames: 5751422976. Throughput: 0: 52617.0. Samples: 241939240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:14:49,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:14:51,768][52263] Updated weights for policy 0, policy_version 351048 (0.0028) [2024-04-27 10:14:54,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52155.8, 300 sec: 52484.3). Total num frames: 5751668736. Throughput: 0: 52605.9. Samples: 242250040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:14:54,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 10:14:54,831][52263] Updated weights for policy 0, policy_version 351058 (0.0029) [2024-04-27 10:14:57,855][52263] Updated weights for policy 0, policy_version 351068 (0.0030) [2024-04-27 10:14:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5751963648. Throughput: 0: 52565.6. Samples: 242409440. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:14:59,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 10:15:01,079][52263] Updated weights for policy 0, policy_version 351078 (0.0035) [2024-04-27 10:15:04,015][52263] Updated weights for policy 0, policy_version 351088 (0.0027) [2024-04-27 10:15:04,107][52031] Fps is (10 sec: 55704.7, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5752225792. Throughput: 0: 52622.1. Samples: 242728320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:15:04,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 10:15:07,301][52263] Updated weights for policy 0, policy_version 351098 (0.0026) [2024-04-27 10:15:08,972][52242] Signal inference workers to stop experience collection... (3550 times) [2024-04-27 10:15:08,973][52242] Signal inference workers to resume experience collection... (3550 times) [2024-04-27 10:15:08,998][52263] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-04-27 10:15:08,998][52263] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-04-27 10:15:09,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5752487936. Throughput: 0: 52655.1. Samples: 243045600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:15:09,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 10:15:10,187][52263] Updated weights for policy 0, policy_version 351108 (0.0029) [2024-04-27 10:15:13,518][52263] Updated weights for policy 0, policy_version 351118 (0.0031) [2024-04-27 10:15:14,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52155.8, 300 sec: 52595.4). Total num frames: 5752733696. Throughput: 0: 52638.2. Samples: 243200140. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:15:14,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 10:15:16,320][52263] Updated weights for policy 0, policy_version 351128 (0.0030) [2024-04-27 10:15:19,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52701.8, 300 sec: 52484.4). Total num frames: 5752995840. Throughput: 0: 52673.5. Samples: 243512120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:15:19,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 10:15:19,869][52263] Updated weights for policy 0, policy_version 351138 (0.0032) [2024-04-27 10:15:22,674][52263] Updated weights for policy 0, policy_version 351148 (0.0028) [2024-04-27 10:15:24,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5753274368. Throughput: 0: 52899.6. Samples: 243836520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:15:24,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 10:15:25,945][52263] Updated weights for policy 0, policy_version 351158 (0.0033) [2024-04-27 10:15:28,849][52263] Updated weights for policy 0, policy_version 351168 (0.0033) [2024-04-27 10:15:29,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52974.9, 300 sec: 52762.1). Total num frames: 5753536512. Throughput: 0: 52754.3. Samples: 243995200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:15:29,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 10:15:32,052][52263] Updated weights for policy 0, policy_version 351178 (0.0034) [2024-04-27 10:15:34,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5753798656. Throughput: 0: 52690.9. Samples: 244310340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 10:15:34,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:15:34,219][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000351185_5753815040.pth... 
[2024-04-27 10:15:34,275][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000350412_5741150208.pth [2024-04-27 10:15:34,973][52263] Updated weights for policy 0, policy_version 351188 (0.0031) [2024-04-27 10:15:38,419][52263] Updated weights for policy 0, policy_version 351198 (0.0038) [2024-04-27 10:15:39,106][52031] Fps is (10 sec: 52428.6, 60 sec: 52975.0, 300 sec: 52595.4). Total num frames: 5754060800. Throughput: 0: 52758.6. Samples: 244624180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:15:39,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:15:41,040][52263] Updated weights for policy 0, policy_version 351208 (0.0030) [2024-04-27 10:15:44,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52428.8, 300 sec: 52484.3). Total num frames: 5754306560. Throughput: 0: 52684.7. Samples: 244780260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:15:44,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 10:15:44,750][52263] Updated weights for policy 0, policy_version 351218 (0.0029) [2024-04-27 10:15:47,234][52263] Updated weights for policy 0, policy_version 351228 (0.0035) [2024-04-27 10:15:49,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52701.8, 300 sec: 52651.0). Total num frames: 5754585088. Throughput: 0: 52532.1. Samples: 245092260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:15:49,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 10:15:51,076][52263] Updated weights for policy 0, policy_version 351238 (0.0029) [2024-04-27 10:15:53,520][52263] Updated weights for policy 0, policy_version 351248 (0.0037) [2024-04-27 10:15:54,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5754847232. Throughput: 0: 52481.0. Samples: 245407240. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:15:54,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 10:15:57,384][52263] Updated weights for policy 0, policy_version 351258 (0.0037) [2024-04-27 10:15:59,107][52031] Fps is (10 sec: 54067.3, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5755125760. Throughput: 0: 52760.9. Samples: 245574380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:15:59,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 10:15:59,671][52263] Updated weights for policy 0, policy_version 351268 (0.0025) [2024-04-27 10:16:03,535][52263] Updated weights for policy 0, policy_version 351278 (0.0032) [2024-04-27 10:16:04,106][52031] Fps is (10 sec: 50790.1, 60 sec: 52155.8, 300 sec: 52539.9). Total num frames: 5755355136. Throughput: 0: 52732.9. Samples: 245885100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:04,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 10:16:05,845][52263] Updated weights for policy 0, policy_version 351288 (0.0027) [2024-04-27 10:16:09,107][52031] Fps is (10 sec: 49151.6, 60 sec: 52155.7, 300 sec: 52539.8). Total num frames: 5755617280. Throughput: 0: 52556.3. Samples: 246201560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:09,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 10:16:09,831][52263] Updated weights for policy 0, policy_version 351298 (0.0027) [2024-04-27 10:16:12,550][52263] Updated weights for policy 0, policy_version 351308 (0.0031) [2024-04-27 10:16:14,107][52031] Fps is (10 sec: 54067.0, 60 sec: 52701.8, 300 sec: 52539.9). Total num frames: 5755895808. Throughput: 0: 52413.2. Samples: 246353800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:14,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:16:16,013][52263] Updated weights for policy 0, policy_version 351318 (0.0026) [2024-04-27 10:16:17,695][52242] Signal inference workers to stop experience collection... 
(3600 times) [2024-04-27 10:16:17,730][52263] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-04-27 10:16:17,760][52242] Signal inference workers to resume experience collection... (3600 times) [2024-04-27 10:16:17,760][52263] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-04-27 10:16:18,853][52263] Updated weights for policy 0, policy_version 351328 (0.0026) [2024-04-27 10:16:19,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5756157952. Throughput: 0: 52420.6. Samples: 246669260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:19,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:16:22,167][52263] Updated weights for policy 0, policy_version 351338 (0.0033) [2024-04-27 10:16:24,106][52031] Fps is (10 sec: 55706.0, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5756452864. Throughput: 0: 52449.3. Samples: 246984400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:24,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 10:16:25,161][52263] Updated weights for policy 0, policy_version 351348 (0.0028) [2024-04-27 10:16:28,299][52263] Updated weights for policy 0, policy_version 351358 (0.0032) [2024-04-27 10:16:29,106][52031] Fps is (10 sec: 54067.0, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5756698624. Throughput: 0: 52674.2. Samples: 247150600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:29,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 10:16:31,259][52263] Updated weights for policy 0, policy_version 351368 (0.0027) [2024-04-27 10:16:34,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52428.9, 300 sec: 52539.9). Total num frames: 5756944384. Throughput: 0: 52778.3. Samples: 247467280. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:34,107][52031] Avg episode reward: [(0, '0.473')] [2024-04-27 10:16:34,663][52263] Updated weights for policy 0, policy_version 351378 (0.0026) [2024-04-27 10:16:37,318][52263] Updated weights for policy 0, policy_version 351388 (0.0033) [2024-04-27 10:16:39,107][52031] Fps is (10 sec: 49151.3, 60 sec: 52155.6, 300 sec: 52484.3). Total num frames: 5757190144. Throughput: 0: 52756.2. Samples: 247781280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:39,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 10:16:40,865][52263] Updated weights for policy 0, policy_version 351398 (0.0035) [2024-04-27 10:16:43,583][52263] Updated weights for policy 0, policy_version 351408 (0.0031) [2024-04-27 10:16:44,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.9, 300 sec: 52595.4). Total num frames: 5757468672. Throughput: 0: 52565.8. Samples: 247939840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:44,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 10:16:47,066][52263] Updated weights for policy 0, policy_version 351418 (0.0029) [2024-04-27 10:16:49,107][52031] Fps is (10 sec: 57344.1, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5757763584. Throughput: 0: 52700.8. Samples: 248256640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 10:16:49,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 10:16:49,856][52263] Updated weights for policy 0, policy_version 351428 (0.0029) [2024-04-27 10:16:53,248][52263] Updated weights for policy 0, policy_version 351438 (0.0034) [2024-04-27 10:16:54,106][52031] Fps is (10 sec: 54066.9, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5758009344. Throughput: 0: 52709.9. Samples: 248573500. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:16:54,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:16:56,114][52263] Updated weights for policy 0, policy_version 351448 (0.0032) [2024-04-27 10:16:59,106][52031] Fps is (10 sec: 49152.7, 60 sec: 52155.8, 300 sec: 52595.5). Total num frames: 5758255104. Throughput: 0: 52845.9. Samples: 248731860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:16:59,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 10:16:59,344][52263] Updated weights for policy 0, policy_version 351458 (0.0035) [2024-04-27 10:17:02,201][52263] Updated weights for policy 0, policy_version 351468 (0.0034) [2024-04-27 10:17:04,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.9, 300 sec: 52595.4). Total num frames: 5758533632. Throughput: 0: 52825.2. Samples: 249046400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:04,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 10:17:05,555][52263] Updated weights for policy 0, policy_version 351478 (0.0031) [2024-04-27 10:17:08,757][52263] Updated weights for policy 0, policy_version 351488 (0.0027) [2024-04-27 10:17:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52975.0, 300 sec: 52539.9). Total num frames: 5758795776. Throughput: 0: 52872.5. Samples: 249363660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:09,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 10:17:11,753][52263] Updated weights for policy 0, policy_version 351498 (0.0030) [2024-04-27 10:17:14,107][52031] Fps is (10 sec: 54067.3, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5759074304. Throughput: 0: 52759.9. Samples: 249524800. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:14,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 10:17:14,811][52263] Updated weights for policy 0, policy_version 351508 (0.0037) [2024-04-27 10:17:17,832][52263] Updated weights for policy 0, policy_version 351518 (0.0028) [2024-04-27 10:17:19,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.0, 300 sec: 52817.6). Total num frames: 5759352832. Throughput: 0: 52661.8. Samples: 249837060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:19,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 10:17:20,871][52263] Updated weights for policy 0, policy_version 351528 (0.0033) [2024-04-27 10:17:23,830][52242] Signal inference workers to stop experience collection... (3650 times) [2024-04-27 10:17:23,863][52263] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-04-27 10:17:23,920][52242] Signal inference workers to resume experience collection... (3650 times) [2024-04-27 10:17:23,920][52263] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-04-27 10:17:24,042][52263] Updated weights for policy 0, policy_version 351538 (0.0026) [2024-04-27 10:17:24,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52428.7, 300 sec: 52595.4). Total num frames: 5759598592. Throughput: 0: 52796.9. Samples: 250157140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:24,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 10:17:27,122][52263] Updated weights for policy 0, policy_version 351548 (0.0037) [2024-04-27 10:17:29,107][52031] Fps is (10 sec: 49151.9, 60 sec: 52428.8, 300 sec: 52539.9). Total num frames: 5759844352. Throughput: 0: 52613.7. Samples: 250307460. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:29,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 10:17:30,324][52263] Updated weights for policy 0, policy_version 351558 (0.0029) [2024-04-27 10:17:33,464][52263] Updated weights for policy 0, policy_version 351568 (0.0028) [2024-04-27 10:17:34,107][52031] Fps is (10 sec: 50790.9, 60 sec: 52701.8, 300 sec: 52539.9). Total num frames: 5760106496. Throughput: 0: 52520.5. Samples: 250620060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:34,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 10:17:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000351569_5760106496.pth... [2024-04-27 10:17:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000350798_5747474432.pth [2024-04-27 10:17:36,477][52263] Updated weights for policy 0, policy_version 351578 (0.0029) [2024-04-27 10:17:39,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.1, 300 sec: 52706.5). Total num frames: 5760385024. Throughput: 0: 52613.4. Samples: 250941100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:39,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 10:17:39,509][52263] Updated weights for policy 0, policy_version 351588 (0.0030) [2024-04-27 10:17:42,699][52263] Updated weights for policy 0, policy_version 351598 (0.0033) [2024-04-27 10:17:44,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53247.9, 300 sec: 52762.0). Total num frames: 5760663552. Throughput: 0: 52810.1. Samples: 251108320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:44,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 10:17:45,762][52263] Updated weights for policy 0, policy_version 351608 (0.0031) [2024-04-27 10:17:48,816][52263] Updated weights for policy 0, policy_version 351618 (0.0032) [2024-04-27 10:17:49,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52428.9, 300 sec: 52706.5). 
Total num frames: 5760909312. Throughput: 0: 52849.0. Samples: 251424600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:49,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 10:17:52,068][52263] Updated weights for policy 0, policy_version 351628 (0.0029) [2024-04-27 10:17:54,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52701.8, 300 sec: 52539.9). Total num frames: 5761171456. Throughput: 0: 52844.3. Samples: 251741660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:54,107][52031] Avg episode reward: [(0, '0.673')] [2024-04-27 10:17:54,992][52263] Updated weights for policy 0, policy_version 351638 (0.0030) [2024-04-27 10:17:58,402][52263] Updated weights for policy 0, policy_version 351648 (0.0029) [2024-04-27 10:17:59,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52701.8, 300 sec: 52539.9). Total num frames: 5761417216. Throughput: 0: 52691.1. Samples: 251895900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:17:59,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 10:18:01,127][52263] Updated weights for policy 0, policy_version 351658 (0.0031) [2024-04-27 10:18:04,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52702.0, 300 sec: 52651.0). Total num frames: 5761695744. Throughput: 0: 52795.1. Samples: 252212840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:18:04,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 10:18:04,706][52263] Updated weights for policy 0, policy_version 351668 (0.0032) [2024-04-27 10:18:07,372][52263] Updated weights for policy 0, policy_version 351678 (0.0034) [2024-04-27 10:18:09,106][52031] Fps is (10 sec: 55705.8, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5761974272. Throughput: 0: 52671.3. Samples: 252527340. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 10:18:09,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 10:18:10,812][52263] Updated weights for policy 0, policy_version 351688 (0.0027) [2024-04-27 10:18:13,519][52263] Updated weights for policy 0, policy_version 351698 (0.0030) [2024-04-27 10:18:14,106][52031] Fps is (10 sec: 55706.0, 60 sec: 52975.1, 300 sec: 52762.1). Total num frames: 5762252800. Throughput: 0: 53147.2. Samples: 252699080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:14,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 10:18:16,944][52263] Updated weights for policy 0, policy_version 351708 (0.0033) [2024-04-27 10:18:19,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52155.7, 300 sec: 52539.9). Total num frames: 5762482176. Throughput: 0: 53258.2. Samples: 253016680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:19,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:18:19,593][52263] Updated weights for policy 0, policy_version 351718 (0.0032) [2024-04-27 10:18:23,016][52263] Updated weights for policy 0, policy_version 351728 (0.0035) [2024-04-27 10:18:24,106][52031] Fps is (10 sec: 47513.1, 60 sec: 52155.9, 300 sec: 52539.9). Total num frames: 5762727936. Throughput: 0: 53123.5. Samples: 253331660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:24,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 10:18:25,741][52263] Updated weights for policy 0, policy_version 351738 (0.0030) [2024-04-27 10:18:29,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52975.0, 300 sec: 52706.5). Total num frames: 5763022848. Throughput: 0: 52704.6. Samples: 253480020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:29,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:18:29,133][52263] Updated weights for policy 0, policy_version 351748 (0.0027) [2024-04-27 10:18:31,849][52263] Updated weights for policy 0, policy_version 351758 (0.0030) [2024-04-27 10:18:34,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53248.1, 300 sec: 52762.1). Total num frames: 5763301376. Throughput: 0: 52887.6. Samples: 253804540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:34,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 10:18:35,331][52263] Updated weights for policy 0, policy_version 351768 (0.0028) [2024-04-27 10:18:37,795][52242] Signal inference workers to stop experience collection... (3700 times) [2024-04-27 10:18:37,841][52263] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-04-27 10:18:37,858][52242] Signal inference workers to resume experience collection... (3700 times) [2024-04-27 10:18:37,859][52263] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-04-27 10:18:37,981][52263] Updated weights for policy 0, policy_version 351778 (0.0030) [2024-04-27 10:18:39,106][52031] Fps is (10 sec: 54067.3, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5763563520. Throughput: 0: 52882.4. Samples: 254121360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:39,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 10:18:41,677][52263] Updated weights for policy 0, policy_version 351788 (0.0030) [2024-04-27 10:18:44,107][52031] Fps is (10 sec: 54065.3, 60 sec: 52974.7, 300 sec: 52762.0). Total num frames: 5763842048. Throughput: 0: 53159.2. Samples: 254288080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:44,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:18:44,199][52263] Updated weights for policy 0, policy_version 351798 (0.0031) [2024-04-27 10:18:48,103][52263] Updated weights for policy 0, policy_version 351808 (0.0030) [2024-04-27 10:18:49,106][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5764087808. Throughput: 0: 53086.6. Samples: 254601740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:49,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 10:18:50,517][52263] Updated weights for policy 0, policy_version 351818 (0.0030) [2024-04-27 10:18:54,106][52031] Fps is (10 sec: 49153.6, 60 sec: 52701.9, 300 sec: 52651.0). Total num frames: 5764333568. Throughput: 0: 53114.3. Samples: 254917480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:54,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 10:18:54,145][52263] Updated weights for policy 0, policy_version 351828 (0.0031) [2024-04-27 10:18:56,759][52263] Updated weights for policy 0, policy_version 351838 (0.0029) [2024-04-27 10:18:59,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 52817.6). Total num frames: 5764628480. Throughput: 0: 52771.0. Samples: 255073780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:18:59,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 10:19:00,473][52263] Updated weights for policy 0, policy_version 351848 (0.0028) [2024-04-27 10:19:02,992][52263] Updated weights for policy 0, policy_version 351858 (0.0033) [2024-04-27 10:19:04,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5764890624. Throughput: 0: 52743.1. Samples: 255390120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:04,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 10:19:06,734][52263] Updated weights for policy 0, policy_version 351868 (0.0030) [2024-04-27 10:19:09,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5765152768. Throughput: 0: 52722.2. Samples: 255704160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:09,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:19:09,186][52263] Updated weights for policy 0, policy_version 351878 (0.0033) [2024-04-27 10:19:12,812][52263] Updated weights for policy 0, policy_version 351888 (0.0031) [2024-04-27 10:19:14,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5765414912. Throughput: 0: 52919.6. Samples: 255861400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:14,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 10:19:15,312][52263] Updated weights for policy 0, policy_version 351898 (0.0032) [2024-04-27 10:19:19,072][52263] Updated weights for policy 0, policy_version 351908 (0.0028) [2024-04-27 10:19:19,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52975.1, 300 sec: 52706.5). Total num frames: 5765660672. Throughput: 0: 52818.7. Samples: 256181380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:19,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 10:19:21,708][52263] Updated weights for policy 0, policy_version 351918 (0.0030) [2024-04-27 10:19:24,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53521.1, 300 sec: 52817.6). Total num frames: 5765939200. Throughput: 0: 52811.9. Samples: 256497900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:24,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 10:19:25,368][52263] Updated weights for policy 0, policy_version 351928 (0.0029) [2024-04-27 10:19:28,122][52263] Updated weights for policy 0, policy_version 351938 (0.0028) [2024-04-27 10:19:29,107][52031] Fps is (10 sec: 54066.2, 60 sec: 52974.8, 300 sec: 52762.0). Total num frames: 5766201344. Throughput: 0: 52602.9. Samples: 256655200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:29,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 10:19:31,755][52263] Updated weights for policy 0, policy_version 351948 (0.0030) [2024-04-27 10:19:34,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5766463488. Throughput: 0: 52602.6. Samples: 256968860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:34,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 10:19:34,181][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000351958_5766479872.pth... [2024-04-27 10:19:34,190][52263] Updated weights for policy 0, policy_version 351958 (0.0031) [2024-04-27 10:19:34,226][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000351185_5753815040.pth [2024-04-27 10:19:37,847][52263] Updated weights for policy 0, policy_version 351968 (0.0033) [2024-04-27 10:19:39,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52701.7, 300 sec: 52762.0). Total num frames: 5766725632. Throughput: 0: 52662.9. Samples: 257287320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:39,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:19:40,483][52263] Updated weights for policy 0, policy_version 351978 (0.0036) [2024-04-27 10:19:44,091][52263] Updated weights for policy 0, policy_version 351988 (0.0031) [2024-04-27 10:19:44,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52156.1, 300 sec: 52706.5). 
Total num frames: 5766971392. Throughput: 0: 52685.9. Samples: 257444640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:44,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 10:19:46,664][52263] Updated weights for policy 0, policy_version 351998 (0.0031) [2024-04-27 10:19:49,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5767249920. Throughput: 0: 52584.5. Samples: 257756420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:49,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 10:19:50,321][52263] Updated weights for policy 0, policy_version 352008 (0.0035) [2024-04-27 10:19:52,920][52263] Updated weights for policy 0, policy_version 352018 (0.0030) [2024-04-27 10:19:54,107][52031] Fps is (10 sec: 54065.6, 60 sec: 52974.7, 300 sec: 52706.4). Total num frames: 5767512064. Throughput: 0: 52679.8. Samples: 258074760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:54,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 10:19:56,406][52263] Updated weights for policy 0, policy_version 352028 (0.0029) [2024-04-27 10:19:56,927][52242] Signal inference workers to stop experience collection... (3750 times) [2024-04-27 10:19:56,927][52242] Signal inference workers to resume experience collection... (3750 times) [2024-04-27 10:19:56,941][52263] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-04-27 10:19:56,941][52263] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-04-27 10:19:59,097][52263] Updated weights for policy 0, policy_version 352038 (0.0032) [2024-04-27 10:19:59,106][52031] Fps is (10 sec: 54067.4, 60 sec: 52701.9, 300 sec: 52762.1). Total num frames: 5767790592. Throughput: 0: 52731.0. Samples: 258234300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:19:59,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 10:20:02,637][52263] Updated weights for policy 0, policy_version 352048 (0.0036) [2024-04-27 10:20:04,106][52031] Fps is (10 sec: 54069.0, 60 sec: 52702.0, 300 sec: 52762.1). Total num frames: 5768052736. Throughput: 0: 52645.3. Samples: 258550420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:20:04,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:20:05,447][52263] Updated weights for policy 0, policy_version 352058 (0.0032) [2024-04-27 10:20:08,827][52263] Updated weights for policy 0, policy_version 352068 (0.0032) [2024-04-27 10:20:09,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52155.8, 300 sec: 52706.5). Total num frames: 5768282112. Throughput: 0: 52591.1. Samples: 258864500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:20:09,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 10:20:11,667][52263] Updated weights for policy 0, policy_version 352078 (0.0028) [2024-04-27 10:20:14,107][52031] Fps is (10 sec: 50789.2, 60 sec: 52428.6, 300 sec: 52762.0). Total num frames: 5768560640. Throughput: 0: 52539.5. Samples: 259019480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:20:14,107][52031] Avg episode reward: [(0, '0.698')] [2024-04-27 10:20:15,056][52263] Updated weights for policy 0, policy_version 352088 (0.0031) [2024-04-27 10:20:17,901][52263] Updated weights for policy 0, policy_version 352098 (0.0035) [2024-04-27 10:20:19,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52428.8, 300 sec: 52651.0). Total num frames: 5768806400. Throughput: 0: 52539.3. Samples: 259333120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:20:19,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:20:21,364][52263] Updated weights for policy 0, policy_version 352108 (0.0032) [2024-04-27 10:20:24,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5769084928. Throughput: 0: 52362.7. Samples: 259643640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:20:24,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 10:20:24,244][52263] Updated weights for policy 0, policy_version 352118 (0.0028) [2024-04-27 10:20:27,663][52263] Updated weights for policy 0, policy_version 352128 (0.0028) [2024-04-27 10:20:29,107][52031] Fps is (10 sec: 55704.7, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5769363456. Throughput: 0: 52436.8. Samples: 259804300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:20:29,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 10:20:30,351][52263] Updated weights for policy 0, policy_version 352138 (0.0033) [2024-04-27 10:20:33,860][52263] Updated weights for policy 0, policy_version 352148 (0.0028) [2024-04-27 10:20:34,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52428.9, 300 sec: 52706.5). Total num frames: 5769609216. Throughput: 0: 52637.9. Samples: 260125120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:20:34,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 10:20:36,521][52263] Updated weights for policy 0, policy_version 352158 (0.0029) [2024-04-27 10:20:39,106][52031] Fps is (10 sec: 49152.6, 60 sec: 52155.9, 300 sec: 52706.5). Total num frames: 5769854976. Throughput: 0: 52501.2. Samples: 260437300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:20:39,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 10:20:40,058][52263] Updated weights for policy 0, policy_version 352168 (0.0032) [2024-04-27 10:20:43,143][52263] Updated weights for policy 0, policy_version 352178 (0.0033) [2024-04-27 10:20:44,106][52031] Fps is (10 sec: 49151.4, 60 sec: 52155.7, 300 sec: 52595.4). Total num frames: 5770100736. Throughput: 0: 52265.8. Samples: 260586260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 10:20:44,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 10:20:46,135][52263] Updated weights for policy 0, policy_version 352188 (0.0028) [2024-04-27 10:20:49,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5770395648. Throughput: 0: 52233.6. Samples: 260900940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:20:49,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 10:20:49,467][52263] Updated weights for policy 0, policy_version 352198 (0.0031) [2024-04-27 10:20:52,372][52263] Updated weights for policy 0, policy_version 352208 (0.0034) [2024-04-27 10:20:54,107][52031] Fps is (10 sec: 57343.2, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5770674176. Throughput: 0: 52209.2. Samples: 261213920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:20:54,108][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 10:20:55,586][52263] Updated weights for policy 0, policy_version 352218 (0.0031) [2024-04-27 10:20:58,621][52263] Updated weights for policy 0, policy_version 352228 (0.0030) [2024-04-27 10:20:59,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52155.8, 300 sec: 52762.0). Total num frames: 5770919936. Throughput: 0: 52619.3. Samples: 261387340. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:20:59,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 10:21:01,742][52263] Updated weights for policy 0, policy_version 352238 (0.0034) [2024-04-27 10:21:04,106][52031] Fps is (10 sec: 49152.8, 60 sec: 51882.6, 300 sec: 52706.5). Total num frames: 5771165696. Throughput: 0: 52523.9. Samples: 261696700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:04,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 10:21:04,692][52263] Updated weights for policy 0, policy_version 352248 (0.0032) [2024-04-27 10:21:08,022][52263] Updated weights for policy 0, policy_version 352258 (0.0032) [2024-04-27 10:21:09,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52428.9, 300 sec: 52651.0). Total num frames: 5771427840. Throughput: 0: 52586.4. Samples: 262010020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:09,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 10:21:11,118][52263] Updated weights for policy 0, policy_version 352268 (0.0028) [2024-04-27 10:21:14,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52155.9, 300 sec: 52650.9). Total num frames: 5771689984. Throughput: 0: 52464.1. Samples: 262165180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:14,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 10:21:14,263][52263] Updated weights for policy 0, policy_version 352278 (0.0034) [2024-04-27 10:21:16,599][52242] Signal inference workers to stop experience collection... (3800 times) [2024-04-27 10:21:16,599][52242] Signal inference workers to resume experience collection... 
(3800 times) [2024-04-27 10:21:16,623][52263] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-04-27 10:21:16,623][52263] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-04-27 10:21:17,373][52263] Updated weights for policy 0, policy_version 352288 (0.0032) [2024-04-27 10:21:19,106][52031] Fps is (10 sec: 55705.2, 60 sec: 52974.9, 300 sec: 52651.0). Total num frames: 5771984896. Throughput: 0: 52324.8. Samples: 262479740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:19,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 10:21:20,493][52263] Updated weights for policy 0, policy_version 352298 (0.0036) [2024-04-27 10:21:23,419][52263] Updated weights for policy 0, policy_version 352308 (0.0029) [2024-04-27 10:21:24,106][52031] Fps is (10 sec: 54067.4, 60 sec: 52428.9, 300 sec: 52651.0). Total num frames: 5772230656. Throughput: 0: 52383.5. Samples: 262794560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:24,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:21:26,740][52263] Updated weights for policy 0, policy_version 352318 (0.0025) [2024-04-27 10:21:29,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52155.8, 300 sec: 52706.5). Total num frames: 5772492800. Throughput: 0: 52692.0. Samples: 262957400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:29,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 10:21:29,542][52263] Updated weights for policy 0, policy_version 352328 (0.0037) [2024-04-27 10:21:32,834][52263] Updated weights for policy 0, policy_version 352338 (0.0031) [2024-04-27 10:21:34,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.7, 300 sec: 52762.1). Total num frames: 5772754944. Throughput: 0: 52720.1. Samples: 263273340. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:34,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 10:21:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000352341_5772754944.pth... [2024-04-27 10:21:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000351569_5760106496.pth [2024-04-27 10:21:35,944][52263] Updated weights for policy 0, policy_version 352348 (0.0030) [2024-04-27 10:21:39,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5773017088. Throughput: 0: 52724.8. Samples: 263586520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:39,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 10:21:39,329][52263] Updated weights for policy 0, policy_version 352358 (0.0027) [2024-04-27 10:21:42,173][52263] Updated weights for policy 0, policy_version 352368 (0.0026) [2024-04-27 10:21:44,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.9, 300 sec: 52651.0). Total num frames: 5773295616. Throughput: 0: 52453.6. Samples: 263747760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:44,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 10:21:45,453][52263] Updated weights for policy 0, policy_version 352378 (0.0025) [2024-04-27 10:21:48,470][52263] Updated weights for policy 0, policy_version 352388 (0.0031) [2024-04-27 10:21:49,107][52031] Fps is (10 sec: 55704.8, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5773574144. Throughput: 0: 52647.1. Samples: 264065820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:49,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 10:21:51,547][52263] Updated weights for policy 0, policy_version 352398 (0.0033) [2024-04-27 10:21:54,106][52031] Fps is (10 sec: 52429.5, 60 sec: 52429.0, 300 sec: 52762.0). Total num frames: 5773819904. Throughput: 0: 52779.0. Samples: 264385080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:54,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 10:21:54,622][52263] Updated weights for policy 0, policy_version 352408 (0.0033) [2024-04-27 10:21:57,694][52263] Updated weights for policy 0, policy_version 352418 (0.0028) [2024-04-27 10:21:59,106][52031] Fps is (10 sec: 49152.2, 60 sec: 52428.8, 300 sec: 52651.0). Total num frames: 5774065664. Throughput: 0: 52759.6. Samples: 264539360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-27 10:21:59,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 10:22:00,811][52263] Updated weights for policy 0, policy_version 352428 (0.0027) [2024-04-27 10:22:03,914][52263] Updated weights for policy 0, policy_version 352438 (0.0028) [2024-04-27 10:22:04,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5774344192. Throughput: 0: 52795.9. Samples: 264855560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:04,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 10:22:07,006][52263] Updated weights for policy 0, policy_version 352448 (0.0036) [2024-04-27 10:22:09,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52974.8, 300 sec: 52651.0). Total num frames: 5774606336. Throughput: 0: 52944.0. Samples: 265177040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:09,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 10:22:10,003][52263] Updated weights for policy 0, policy_version 352458 (0.0031) [2024-04-27 10:22:13,002][52263] Updated weights for policy 0, policy_version 352468 (0.0031) [2024-04-27 10:22:14,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.0, 300 sec: 52706.5). Total num frames: 5774901248. Throughput: 0: 53018.2. Samples: 265343220. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:14,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 10:22:16,125][52263] Updated weights for policy 0, policy_version 352478 (0.0031) [2024-04-27 10:22:19,105][52263] Updated weights for policy 0, policy_version 352488 (0.0033) [2024-04-27 10:22:19,107][52031] Fps is (10 sec: 55705.3, 60 sec: 52974.8, 300 sec: 52762.1). Total num frames: 5775163392. Throughput: 0: 53027.0. Samples: 265659560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:19,107][52031] Avg episode reward: [(0, '0.452')] [2024-04-27 10:22:22,121][52263] Updated weights for policy 0, policy_version 352498 (0.0028) [2024-04-27 10:22:24,107][52031] Fps is (10 sec: 49151.6, 60 sec: 52701.7, 300 sec: 52706.5). Total num frames: 5775392768. Throughput: 0: 53052.1. Samples: 265973880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:24,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 10:22:25,279][52242] Signal inference workers to stop experience collection... (3850 times) [2024-04-27 10:22:25,279][52242] Signal inference workers to resume experience collection... (3850 times) [2024-04-27 10:22:25,318][52263] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-04-27 10:22:25,318][52263] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-04-27 10:22:25,393][52263] Updated weights for policy 0, policy_version 352508 (0.0035) [2024-04-27 10:22:28,353][52263] Updated weights for policy 0, policy_version 352518 (0.0028) [2024-04-27 10:22:29,106][52031] Fps is (10 sec: 49152.6, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5775654912. Throughput: 0: 52928.2. Samples: 266129520. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:29,107][52031] Avg episode reward: [(0, '0.479')] [2024-04-27 10:22:31,717][52263] Updated weights for policy 0, policy_version 352528 (0.0029) [2024-04-27 10:22:34,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53247.9, 300 sec: 52762.0). Total num frames: 5775949824. Throughput: 0: 52772.8. Samples: 266440600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:34,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 10:22:34,931][52263] Updated weights for policy 0, policy_version 352538 (0.0034) [2024-04-27 10:22:37,776][52263] Updated weights for policy 0, policy_version 352548 (0.0034) [2024-04-27 10:22:39,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52974.9, 300 sec: 52651.0). Total num frames: 5776195584. Throughput: 0: 52676.1. Samples: 266755500. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:39,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 10:22:41,115][52263] Updated weights for policy 0, policy_version 352558 (0.0030) [2024-04-27 10:22:43,825][52263] Updated weights for policy 0, policy_version 352568 (0.0037) [2024-04-27 10:22:44,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5776474112. Throughput: 0: 53023.5. Samples: 266925420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:44,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 10:22:47,357][52263] Updated weights for policy 0, policy_version 352578 (0.0028) [2024-04-27 10:22:49,106][52031] Fps is (10 sec: 50789.9, 60 sec: 52155.8, 300 sec: 52651.0). Total num frames: 5776703488. Throughput: 0: 53023.7. Samples: 267241620. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:49,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 10:22:50,008][52263] Updated weights for policy 0, policy_version 352588 (0.0034) [2024-04-27 10:22:53,400][52263] Updated weights for policy 0, policy_version 352598 (0.0038) [2024-04-27 10:22:54,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5776982016. Throughput: 0: 52903.5. Samples: 267557700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:54,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 10:22:56,235][52263] Updated weights for policy 0, policy_version 352608 (0.0036) [2024-04-27 10:22:59,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.0, 300 sec: 52762.0). Total num frames: 5777260544. Throughput: 0: 52731.7. Samples: 267716140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:22:59,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 10:22:59,580][52263] Updated weights for policy 0, policy_version 352618 (0.0034) [2024-04-27 10:23:02,429][52263] Updated weights for policy 0, policy_version 352628 (0.0036) [2024-04-27 10:23:04,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53248.1, 300 sec: 52762.0). Total num frames: 5777539072. Throughput: 0: 52674.3. Samples: 268029900. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:23:04,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 10:23:05,657][52263] Updated weights for policy 0, policy_version 352638 (0.0032) [2024-04-27 10:23:08,731][52263] Updated weights for policy 0, policy_version 352648 (0.0030) [2024-04-27 10:23:09,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.1, 300 sec: 52706.5). Total num frames: 5777801216. Throughput: 0: 52880.2. Samples: 268353480. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:23:09,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:23:11,715][52263] Updated weights for policy 0, policy_version 352658 (0.0029) [2024-04-27 10:23:14,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5778046976. Throughput: 0: 52960.3. Samples: 268512740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-27 10:23:14,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 10:23:14,921][52263] Updated weights for policy 0, policy_version 352668 (0.0028) [2024-04-27 10:23:17,856][52263] Updated weights for policy 0, policy_version 352678 (0.0033) [2024-04-27 10:23:19,106][52031] Fps is (10 sec: 52428.5, 60 sec: 52701.9, 300 sec: 52873.1). Total num frames: 5778325504. Throughput: 0: 53092.5. Samples: 268829760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:23:19,116][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 10:23:21,025][52263] Updated weights for policy 0, policy_version 352688 (0.0034) [2024-04-27 10:23:24,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.2, 300 sec: 52817.6). Total num frames: 5778604032. Throughput: 0: 53221.6. Samples: 269150480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:23:24,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 10:23:24,112][52263] Updated weights for policy 0, policy_version 352698 (0.0031) [2024-04-27 10:23:27,359][52263] Updated weights for policy 0, policy_version 352708 (0.0029) [2024-04-27 10:23:28,266][52242] Signal inference workers to stop experience collection... (3900 times) [2024-04-27 10:23:28,271][52242] Signal inference workers to resume experience collection... 
(3900 times) [2024-04-27 10:23:28,300][52263] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-04-27 10:23:28,300][52263] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-04-27 10:23:29,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53247.9, 300 sec: 52706.5). Total num frames: 5778849792. Throughput: 0: 52949.8. Samples: 269308160. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:23:29,116][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 10:23:30,571][52263] Updated weights for policy 0, policy_version 352718 (0.0032) [2024-04-27 10:23:33,484][52263] Updated weights for policy 0, policy_version 352728 (0.0028) [2024-04-27 10:23:34,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5779111936. Throughput: 0: 53097.2. Samples: 269631000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:23:34,116][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 10:23:34,127][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000352729_5779111936.pth... [2024-04-27 10:23:34,190][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000351958_5766479872.pth [2024-04-27 10:23:36,718][52263] Updated weights for policy 0, policy_version 352738 (0.0026) [2024-04-27 10:23:39,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.8, 300 sec: 52706.5). Total num frames: 5779390464. Throughput: 0: 53156.8. Samples: 269949760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:23:39,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 10:23:39,617][52263] Updated weights for policy 0, policy_version 352748 (0.0028) [2024-04-27 10:23:42,908][52263] Updated weights for policy 0, policy_version 352758 (0.0041) [2024-04-27 10:23:44,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52428.9, 300 sec: 52651.0). Total num frames: 5779619840. Throughput: 0: 52989.4. Samples: 270100660. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:23:44,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 10:23:45,755][52263] Updated weights for policy 0, policy_version 352768 (0.0029) [2024-04-27 10:23:48,959][52263] Updated weights for policy 0, policy_version 352778 (0.0028) [2024-04-27 10:23:49,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.0, 300 sec: 52817.6). Total num frames: 5779914752. Throughput: 0: 53136.0. Samples: 270421020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:23:49,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 10:23:51,960][52263] Updated weights for policy 0, policy_version 352788 (0.0030) [2024-04-27 10:23:54,106][52031] Fps is (10 sec: 54066.7, 60 sec: 52975.0, 300 sec: 52651.0). Total num frames: 5780160512. Throughput: 0: 52931.9. Samples: 270735420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:23:54,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 10:23:55,095][52263] Updated weights for policy 0, policy_version 352798 (0.0028) [2024-04-27 10:23:58,282][52263] Updated weights for policy 0, policy_version 352808 (0.0029) [2024-04-27 10:23:59,106][52031] Fps is (10 sec: 50790.3, 60 sec: 52701.8, 300 sec: 52651.0). Total num frames: 5780422656. Throughput: 0: 53028.9. Samples: 270899040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:23:59,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 10:24:01,245][52263] Updated weights for policy 0, policy_version 352818 (0.0028) [2024-04-27 10:24:04,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5780701184. Throughput: 0: 53006.2. Samples: 271215040. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:24:04,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 10:24:04,404][52263] Updated weights for policy 0, policy_version 352828 (0.0032) [2024-04-27 10:24:07,518][52263] Updated weights for policy 0, policy_version 352838 (0.0026) [2024-04-27 10:24:09,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52428.7, 300 sec: 52650.9). Total num frames: 5780946944. Throughput: 0: 52843.0. Samples: 271528420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:24:09,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 10:24:10,739][52263] Updated weights for policy 0, policy_version 352848 (0.0031) [2024-04-27 10:24:13,628][52263] Updated weights for policy 0, policy_version 352858 (0.0030) [2024-04-27 10:24:14,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.0, 300 sec: 52817.5). Total num frames: 5781241856. Throughput: 0: 52836.4. Samples: 271685800. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:24:14,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 10:24:16,775][52263] Updated weights for policy 0, policy_version 352868 (0.0031) [2024-04-27 10:24:19,106][52031] Fps is (10 sec: 54068.0, 60 sec: 52701.9, 300 sec: 52706.5). Total num frames: 5781487616. Throughput: 0: 52725.9. Samples: 272003660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:24:19,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 10:24:19,848][52263] Updated weights for policy 0, policy_version 352878 (0.0030) [2024-04-27 10:24:22,950][52263] Updated weights for policy 0, policy_version 352888 (0.0031) [2024-04-27 10:24:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52701.9, 300 sec: 52762.1). Total num frames: 5781766144. Throughput: 0: 52670.4. Samples: 272319920. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:24:24,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 10:24:26,038][52242] Signal inference workers to stop experience collection... (3950 times) [2024-04-27 10:24:26,039][52242] Signal inference workers to resume experience collection... (3950 times) [2024-04-27 10:24:26,068][52263] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-04-27 10:24:26,074][52263] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-04-27 10:24:26,153][52263] Updated weights for policy 0, policy_version 352898 (0.0028) [2024-04-27 10:24:29,106][52031] Fps is (10 sec: 54067.3, 60 sec: 52975.0, 300 sec: 52762.1). Total num frames: 5782028288. Throughput: 0: 52799.6. Samples: 272476640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:24:29,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 10:24:29,137][52263] Updated weights for policy 0, policy_version 352908 (0.0030) [2024-04-27 10:24:32,464][52263] Updated weights for policy 0, policy_version 352918 (0.0031) [2024-04-27 10:24:34,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52428.9, 300 sec: 52651.0). Total num frames: 5782257664. Throughput: 0: 52633.4. Samples: 272789520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-27 10:24:34,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 10:24:35,573][52263] Updated weights for policy 0, policy_version 352928 (0.0032) [2024-04-27 10:24:38,598][52263] Updated weights for policy 0, policy_version 352938 (0.0030) [2024-04-27 10:24:39,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5782552576. Throughput: 0: 52668.9. Samples: 273105520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:24:39,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 10:24:41,704][52263] Updated weights for policy 0, policy_version 352948 (0.0033) [2024-04-27 10:24:44,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 52762.0). Total num frames: 5782814720. Throughput: 0: 52575.1. Samples: 273264920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:24:44,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 10:24:44,796][52263] Updated weights for policy 0, policy_version 352958 (0.0029) [2024-04-27 10:24:47,931][52263] Updated weights for policy 0, policy_version 352968 (0.0027) [2024-04-27 10:24:49,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52428.8, 300 sec: 52706.5). Total num frames: 5783060480. Throughput: 0: 52612.1. Samples: 273582580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:24:49,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:24:51,030][52263] Updated weights for policy 0, policy_version 352978 (0.0030) [2024-04-27 10:24:54,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5783339008. Throughput: 0: 52628.0. Samples: 273896680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:24:54,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 10:24:54,304][52263] Updated weights for policy 0, policy_version 352988 (0.0028) [2024-04-27 10:24:57,184][52263] Updated weights for policy 0, policy_version 352998 (0.0028) [2024-04-27 10:24:59,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.9, 300 sec: 52650.9). Total num frames: 5783584768. Throughput: 0: 52697.1. Samples: 274057160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:24:59,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 10:25:00,270][52263] Updated weights for policy 0, policy_version 353008 (0.0031) [2024-04-27 10:25:03,241][52263] Updated weights for policy 0, policy_version 353018 (0.0034) [2024-04-27 10:25:04,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5783863296. Throughput: 0: 52722.1. Samples: 274376160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:04,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 10:25:06,372][52263] Updated weights for policy 0, policy_version 353028 (0.0031) [2024-04-27 10:25:09,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.1, 300 sec: 52817.6). Total num frames: 5784141824. Throughput: 0: 52768.9. Samples: 274694520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:09,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 10:25:09,409][52263] Updated weights for policy 0, policy_version 353038 (0.0028) [2024-04-27 10:25:12,733][52263] Updated weights for policy 0, policy_version 353048 (0.0031) [2024-04-27 10:25:14,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.8, 300 sec: 52817.5). Total num frames: 5784387584. Throughput: 0: 52936.3. Samples: 274858780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:14,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 10:25:15,689][52263] Updated weights for policy 0, policy_version 353058 (0.0030) [2024-04-27 10:25:18,898][52263] Updated weights for policy 0, policy_version 353068 (0.0032) [2024-04-27 10:25:19,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52974.8, 300 sec: 52817.6). Total num frames: 5784666112. Throughput: 0: 53044.8. Samples: 275176540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:19,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 10:25:22,167][52263] Updated weights for policy 0, policy_version 353078 (0.0030) [2024-04-27 10:25:23,593][52242] Signal inference workers to stop experience collection... (4000 times) [2024-04-27 10:25:23,594][52242] Signal inference workers to resume experience collection... (4000 times) [2024-04-27 10:25:23,611][52263] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-04-27 10:25:23,612][52263] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-04-27 10:25:24,107][52031] Fps is (10 sec: 54066.9, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5784928256. Throughput: 0: 53070.2. Samples: 275493680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:24,107][52031] Avg episode reward: [(0, '0.507')] [2024-04-27 10:25:25,146][52263] Updated weights for policy 0, policy_version 353088 (0.0029) [2024-04-27 10:25:28,385][52263] Updated weights for policy 0, policy_version 353098 (0.0033) [2024-04-27 10:25:29,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52428.7, 300 sec: 52762.0). Total num frames: 5785174016. Throughput: 0: 52957.3. Samples: 275648000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:29,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 10:25:31,346][52263] Updated weights for policy 0, policy_version 353108 (0.0030) [2024-04-27 10:25:34,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53520.9, 300 sec: 52928.6). Total num frames: 5785468928. Throughput: 0: 52890.9. Samples: 275962680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:34,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 10:25:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000353117_5785468928.pth... 
[2024-04-27 10:25:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000352341_5772754944.pth [2024-04-27 10:25:34,554][52263] Updated weights for policy 0, policy_version 353118 (0.0035) [2024-04-27 10:25:37,727][52263] Updated weights for policy 0, policy_version 353128 (0.0032) [2024-04-27 10:25:39,106][52031] Fps is (10 sec: 55705.5, 60 sec: 52975.0, 300 sec: 52984.2). Total num frames: 5785731072. Throughput: 0: 52897.0. Samples: 276277040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:39,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 10:25:40,839][52263] Updated weights for policy 0, policy_version 353138 (0.0031) [2024-04-27 10:25:43,872][52263] Updated weights for policy 0, policy_version 353148 (0.0033) [2024-04-27 10:25:44,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5785976832. Throughput: 0: 53048.3. Samples: 276444340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:44,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 10:25:47,037][52263] Updated weights for policy 0, policy_version 353158 (0.0027) [2024-04-27 10:25:49,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52974.9, 300 sec: 52762.1). Total num frames: 5786238976. Throughput: 0: 52912.9. Samples: 276757240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-27 10:25:49,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 10:25:50,122][52263] Updated weights for policy 0, policy_version 353168 (0.0028) [2024-04-27 10:25:53,442][52263] Updated weights for policy 0, policy_version 353178 (0.0027) [2024-04-27 10:25:54,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52428.9, 300 sec: 52762.0). Total num frames: 5786484736. Throughput: 0: 52848.0. Samples: 277072680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:25:54,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 10:25:56,212][52263] Updated weights for policy 0, policy_version 353188 (0.0030) [2024-04-27 10:25:59,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.8, 300 sec: 52873.1). Total num frames: 5786763264. Throughput: 0: 52601.2. Samples: 277225840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:25:59,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 10:25:59,673][52263] Updated weights for policy 0, policy_version 353198 (0.0032) [2024-04-27 10:26:02,420][52263] Updated weights for policy 0, policy_version 353208 (0.0035) [2024-04-27 10:26:04,106][52031] Fps is (10 sec: 55705.9, 60 sec: 52975.0, 300 sec: 52928.6). Total num frames: 5787041792. Throughput: 0: 52572.2. Samples: 277542280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:04,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 10:26:06,010][52263] Updated weights for policy 0, policy_version 353218 (0.0040) [2024-04-27 10:26:08,670][52263] Updated weights for policy 0, policy_version 353228 (0.0040) [2024-04-27 10:26:09,106][52031] Fps is (10 sec: 54068.3, 60 sec: 52701.9, 300 sec: 52928.7). Total num frames: 5787303936. Throughput: 0: 52534.0. Samples: 277857700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:09,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 10:26:12,246][52263] Updated weights for policy 0, policy_version 353238 (0.0030) [2024-04-27 10:26:14,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52974.9, 300 sec: 52817.6). Total num frames: 5787566080. Throughput: 0: 52601.7. Samples: 278015080. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:14,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 10:26:15,006][52263] Updated weights for policy 0, policy_version 353248 (0.0030) [2024-04-27 10:26:18,621][52263] Updated weights for policy 0, policy_version 353258 (0.0033) [2024-04-27 10:26:19,107][52031] Fps is (10 sec: 49151.4, 60 sec: 52155.7, 300 sec: 52762.0). Total num frames: 5787795456. Throughput: 0: 52588.1. Samples: 278329140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:19,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 10:26:21,032][52263] Updated weights for policy 0, policy_version 353268 (0.0034) [2024-04-27 10:26:24,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5788073984. Throughput: 0: 52706.1. Samples: 278648820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:24,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 10:26:24,805][52263] Updated weights for policy 0, policy_version 353278 (0.0031) [2024-04-27 10:26:27,315][52263] Updated weights for policy 0, policy_version 353288 (0.0031) [2024-04-27 10:26:28,446][52242] Signal inference workers to stop experience collection... (4050 times) [2024-04-27 10:26:28,486][52263] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-04-27 10:26:28,544][52242] Signal inference workers to resume experience collection... (4050 times) [2024-04-27 10:26:28,544][52263] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-04-27 10:26:29,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53248.0, 300 sec: 52928.6). Total num frames: 5788368896. Throughput: 0: 52470.2. Samples: 278805500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:29,107][52031] Avg episode reward: [(0, '0.488')] [2024-04-27 10:26:30,996][52263] Updated weights for policy 0, policy_version 353298 (0.0029) [2024-04-27 10:26:33,641][52263] Updated weights for policy 0, policy_version 353308 (0.0033) [2024-04-27 10:26:34,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52428.9, 300 sec: 52873.1). Total num frames: 5788614656. Throughput: 0: 52563.1. Samples: 279122580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:34,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:26:37,265][52263] Updated weights for policy 0, policy_version 353318 (0.0031) [2024-04-27 10:26:39,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52428.7, 300 sec: 52817.6). Total num frames: 5788876800. Throughput: 0: 52566.4. Samples: 279438180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:39,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 10:26:39,879][52263] Updated weights for policy 0, policy_version 353328 (0.0033) [2024-04-27 10:26:43,334][52263] Updated weights for policy 0, policy_version 353338 (0.0033) [2024-04-27 10:26:44,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52702.0, 300 sec: 52762.1). Total num frames: 5789138944. Throughput: 0: 52539.8. Samples: 279590120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:44,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 10:26:45,993][52263] Updated weights for policy 0, policy_version 353348 (0.0032) [2024-04-27 10:26:49,107][52031] Fps is (10 sec: 50791.0, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5789384704. Throughput: 0: 52606.1. Samples: 279909560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:49,115][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 10:26:49,632][52263] Updated weights for policy 0, policy_version 353358 (0.0031) [2024-04-27 10:26:52,194][52263] Updated weights for policy 0, policy_version 353368 (0.0032) [2024-04-27 10:26:54,106][52031] Fps is (10 sec: 52428.2, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5789663232. Throughput: 0: 52537.2. Samples: 280221880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:54,115][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 10:26:55,909][52263] Updated weights for policy 0, policy_version 353378 (0.0037) [2024-04-27 10:26:58,377][52263] Updated weights for policy 0, policy_version 353388 (0.0029) [2024-04-27 10:26:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5789925376. Throughput: 0: 52769.4. Samples: 280389700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:26:59,107][52031] Avg episode reward: [(0, '0.668')] [2024-04-27 10:27:02,007][52263] Updated weights for policy 0, policy_version 353398 (0.0028) [2024-04-27 10:27:04,107][52031] Fps is (10 sec: 54067.4, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5790203904. Throughput: 0: 52877.0. Samples: 280708600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:04,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 10:27:04,480][52263] Updated weights for policy 0, policy_version 353408 (0.0033) [2024-04-27 10:27:08,133][52263] Updated weights for policy 0, policy_version 353418 (0.0033) [2024-04-27 10:27:09,107][52031] Fps is (10 sec: 54066.8, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5790466048. Throughput: 0: 52881.8. Samples: 281028500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:09,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 10:27:10,620][52263] Updated weights for policy 0, policy_version 353428 (0.0032) [2024-04-27 10:27:14,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52155.7, 300 sec: 52651.0). Total num frames: 5790695424. Throughput: 0: 52819.1. Samples: 281182360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:14,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 10:27:14,423][52263] Updated weights for policy 0, policy_version 353438 (0.0028) [2024-04-27 10:27:16,721][52263] Updated weights for policy 0, policy_version 353448 (0.0039) [2024-04-27 10:27:19,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5790973952. Throughput: 0: 52834.7. Samples: 281500140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:19,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 10:27:20,570][52263] Updated weights for policy 0, policy_version 353458 (0.0027) [2024-04-27 10:27:22,932][52263] Updated weights for policy 0, policy_version 353468 (0.0038) [2024-04-27 10:27:24,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53248.1, 300 sec: 52928.6). Total num frames: 5791268864. Throughput: 0: 52868.2. Samples: 281817240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:24,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 10:27:26,678][52263] Updated weights for policy 0, policy_version 353478 (0.0028) [2024-04-27 10:27:29,107][52031] Fps is (10 sec: 55705.0, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5791531008. Throughput: 0: 53198.0. Samples: 281984040. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:29,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 10:27:29,252][52263] Updated weights for policy 0, policy_version 353488 (0.0027) [2024-04-27 10:27:32,880][52263] Updated weights for policy 0, policy_version 353498 (0.0027) [2024-04-27 10:27:33,732][52242] Signal inference workers to stop experience collection... (4100 times) [2024-04-27 10:27:33,737][52242] Signal inference workers to resume experience collection... (4100 times) [2024-04-27 10:27:33,763][52263] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-04-27 10:27:33,764][52263] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-04-27 10:27:34,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5791793152. Throughput: 0: 53088.0. Samples: 282298520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:34,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 10:27:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000353503_5791793152.pth... [2024-04-27 10:27:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000352729_5779111936.pth [2024-04-27 10:27:35,421][52263] Updated weights for policy 0, policy_version 353508 (0.0030) [2024-04-27 10:27:38,975][52263] Updated weights for policy 0, policy_version 353518 (0.0037) [2024-04-27 10:27:39,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5792038912. Throughput: 0: 53212.8. Samples: 282616460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:39,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 10:27:41,431][52263] Updated weights for policy 0, policy_version 353528 (0.0035) [2024-04-27 10:27:44,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5792301056. Throughput: 0: 52680.5. Samples: 282760320. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:44,107][52031] Avg episode reward: [(0, '0.680')] [2024-04-27 10:27:45,177][52263] Updated weights for policy 0, policy_version 353538 (0.0030) [2024-04-27 10:27:47,614][52263] Updated weights for policy 0, policy_version 353548 (0.0026) [2024-04-27 10:27:49,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5792579584. Throughput: 0: 52698.2. Samples: 283080020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:49,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:27:51,397][52263] Updated weights for policy 0, policy_version 353558 (0.0033) [2024-04-27 10:27:54,029][52263] Updated weights for policy 0, policy_version 353568 (0.0030) [2024-04-27 10:27:54,107][52031] Fps is (10 sec: 55704.5, 60 sec: 53247.9, 300 sec: 52873.1). Total num frames: 5792858112. Throughput: 0: 52735.4. Samples: 283401600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:54,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 10:27:57,451][52263] Updated weights for policy 0, policy_version 353578 (0.0031) [2024-04-27 10:27:59,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5793120256. Throughput: 0: 53111.6. Samples: 283572380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:27:59,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 10:28:00,386][52263] Updated weights for policy 0, policy_version 353588 (0.0030) [2024-04-27 10:28:03,579][52263] Updated weights for policy 0, policy_version 353598 (0.0031) [2024-04-27 10:28:04,106][52031] Fps is (10 sec: 52430.0, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5793382400. Throughput: 0: 53096.1. Samples: 283889460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:28:04,107][52031] Avg episode reward: [(0, '0.673')] [2024-04-27 10:28:06,559][52263] Updated weights for policy 0, policy_version 353608 (0.0034) [2024-04-27 10:28:09,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52701.8, 300 sec: 52817.6). Total num frames: 5793628160. Throughput: 0: 53098.1. Samples: 284206660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:28:09,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 10:28:09,758][52263] Updated weights for policy 0, policy_version 353618 (0.0032) [2024-04-27 10:28:12,785][52263] Updated weights for policy 0, policy_version 353628 (0.0028) [2024-04-27 10:28:14,107][52031] Fps is (10 sec: 50789.4, 60 sec: 53247.9, 300 sec: 52762.0). Total num frames: 5793890304. Throughput: 0: 52736.4. Samples: 284357180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:28:14,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 10:28:16,003][52263] Updated weights for policy 0, policy_version 353638 (0.0031) [2024-04-27 10:28:18,998][52263] Updated weights for policy 0, policy_version 353648 (0.0028) [2024-04-27 10:28:19,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53248.1, 300 sec: 52762.0). Total num frames: 5794168832. Throughput: 0: 52844.1. Samples: 284676500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:28:19,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 10:28:22,115][52263] Updated weights for policy 0, policy_version 353658 (0.0030) [2024-04-27 10:28:24,107][52031] Fps is (10 sec: 57344.0, 60 sec: 53247.9, 300 sec: 52928.6). Total num frames: 5794463744. Throughput: 0: 52804.4. Samples: 284992660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 10:28:24,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 10:28:25,228][52263] Updated weights for policy 0, policy_version 353668 (0.0028) [2024-04-27 10:28:28,292][52263] Updated weights for policy 0, policy_version 353678 (0.0027) [2024-04-27 10:28:29,106][52031] Fps is (10 sec: 54066.8, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5794709504. Throughput: 0: 53296.8. Samples: 285158680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:28:29,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 10:28:31,346][52263] Updated weights for policy 0, policy_version 353688 (0.0030) [2024-04-27 10:28:34,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52701.9, 300 sec: 52762.1). Total num frames: 5794955264. Throughput: 0: 53279.2. Samples: 285477580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:28:34,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 10:28:34,365][52263] Updated weights for policy 0, policy_version 353698 (0.0034) [2024-04-27 10:28:36,672][52242] Signal inference workers to stop experience collection... (4150 times) [2024-04-27 10:28:36,673][52242] Signal inference workers to resume experience collection... (4150 times) [2024-04-27 10:28:36,700][52263] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-04-27 10:28:36,700][52263] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-04-27 10:28:37,510][52263] Updated weights for policy 0, policy_version 353708 (0.0029) [2024-04-27 10:28:39,106][52031] Fps is (10 sec: 49152.2, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5795201024. Throughput: 0: 53175.8. Samples: 285794500. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:28:39,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 10:28:40,632][52263] Updated weights for policy 0, policy_version 353718 (0.0034) [2024-04-27 10:28:43,695][52263] Updated weights for policy 0, policy_version 353728 (0.0027) [2024-04-27 10:28:44,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5795495936. Throughput: 0: 52693.3. Samples: 285943580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:28:44,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 10:28:46,677][52263] Updated weights for policy 0, policy_version 353738 (0.0031) [2024-04-27 10:28:49,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53247.9, 300 sec: 52928.6). Total num frames: 5795774464. Throughput: 0: 52803.4. Samples: 286265620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:28:49,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 10:28:49,934][52263] Updated weights for policy 0, policy_version 353748 (0.0031) [2024-04-27 10:28:52,858][52263] Updated weights for policy 0, policy_version 353758 (0.0030) [2024-04-27 10:28:54,107][52031] Fps is (10 sec: 54066.9, 60 sec: 52974.9, 300 sec: 52928.6). Total num frames: 5796036608. Throughput: 0: 52851.1. Samples: 286584960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:28:54,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 10:28:56,102][52263] Updated weights for policy 0, policy_version 353768 (0.0030) [2024-04-27 10:28:59,055][52263] Updated weights for policy 0, policy_version 353778 (0.0030) [2024-04-27 10:28:59,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5796298752. Throughput: 0: 53083.7. Samples: 286745940. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:28:59,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 10:29:02,377][52263] Updated weights for policy 0, policy_version 353788 (0.0028) [2024-04-27 10:29:04,106][52031] Fps is (10 sec: 47514.5, 60 sec: 52155.7, 300 sec: 52762.1). Total num frames: 5796511744. Throughput: 0: 52913.7. Samples: 287057620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:29:04,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 10:29:05,241][52263] Updated weights for policy 0, policy_version 353798 (0.0028) [2024-04-27 10:29:08,713][52263] Updated weights for policy 0, policy_version 353808 (0.0031) [2024-04-27 10:29:09,106][52031] Fps is (10 sec: 49151.8, 60 sec: 52702.0, 300 sec: 52706.5). Total num frames: 5796790272. Throughput: 0: 52961.0. Samples: 287375900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:29:09,107][52031] Avg episode reward: [(0, '0.691')] [2024-04-27 10:29:11,475][52263] Updated weights for policy 0, policy_version 353818 (0.0033) [2024-04-27 10:29:14,107][52031] Fps is (10 sec: 55705.2, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5797068800. Throughput: 0: 52647.5. Samples: 287527820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:29:14,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 10:29:15,068][52263] Updated weights for policy 0, policy_version 353828 (0.0034) [2024-04-27 10:29:17,644][52263] Updated weights for policy 0, policy_version 353838 (0.0027) [2024-04-27 10:29:19,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53247.9, 300 sec: 52873.1). Total num frames: 5797363712. Throughput: 0: 52659.4. Samples: 287847260. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:29:19,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 10:29:21,117][52263] Updated weights for policy 0, policy_version 353848 (0.0035) [2024-04-27 10:29:23,846][52263] Updated weights for policy 0, policy_version 353858 (0.0030) [2024-04-27 10:29:24,107][52031] Fps is (10 sec: 54066.9, 60 sec: 52428.8, 300 sec: 52817.5). Total num frames: 5797609472. Throughput: 0: 52705.2. Samples: 288166240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:29:24,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:29:27,244][52263] Updated weights for policy 0, policy_version 353868 (0.0024) [2024-04-27 10:29:29,107][52031] Fps is (10 sec: 49152.1, 60 sec: 52428.7, 300 sec: 52873.1). Total num frames: 5797855232. Throughput: 0: 52998.7. Samples: 288328520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:29:29,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 10:29:29,975][52263] Updated weights for policy 0, policy_version 353878 (0.0026) [2024-04-27 10:29:33,601][52263] Updated weights for policy 0, policy_version 353888 (0.0027) [2024-04-27 10:29:34,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52701.7, 300 sec: 52762.0). Total num frames: 5798117376. Throughput: 0: 52955.9. Samples: 288648640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:29:34,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:29:34,243][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000353890_5798133760.pth... [2024-04-27 10:29:34,283][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000353117_5785468928.pth [2024-04-27 10:29:35,093][52242] Signal inference workers to stop experience collection... (4200 times) [2024-04-27 10:29:35,093][52242] Signal inference workers to resume experience collection... 
(4200 times) [2024-04-27 10:29:35,116][52263] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-04-27 10:29:35,116][52263] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-04-27 10:29:36,116][52263] Updated weights for policy 0, policy_version 353898 (0.0024) [2024-04-27 10:29:39,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52975.0, 300 sec: 52762.1). Total num frames: 5798379520. Throughput: 0: 52801.7. Samples: 288961020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:29:39,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 10:29:39,730][52263] Updated weights for policy 0, policy_version 353908 (0.0029) [2024-04-27 10:29:42,381][52263] Updated weights for policy 0, policy_version 353918 (0.0028) [2024-04-27 10:29:44,107][52031] Fps is (10 sec: 55706.3, 60 sec: 52975.0, 300 sec: 52928.6). Total num frames: 5798674432. Throughput: 0: 52863.5. Samples: 289124800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:29:44,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 10:29:45,834][52263] Updated weights for policy 0, policy_version 353928 (0.0034) [2024-04-27 10:29:48,633][52263] Updated weights for policy 0, policy_version 353938 (0.0028) [2024-04-27 10:29:49,106][52031] Fps is (10 sec: 57343.5, 60 sec: 52975.0, 300 sec: 52928.7). Total num frames: 5798952960. Throughput: 0: 52932.0. Samples: 289439560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:29:49,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 10:29:52,062][52263] Updated weights for policy 0, policy_version 353948 (0.0030) [2024-04-27 10:29:54,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52155.8, 300 sec: 52817.6). Total num frames: 5799165952. Throughput: 0: 52832.0. Samples: 289753340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:29:54,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 10:29:54,947][52263] Updated weights for policy 0, policy_version 353958 (0.0039) [2024-04-27 10:29:58,434][52263] Updated weights for policy 0, policy_version 353968 (0.0033) [2024-04-27 10:29:59,107][52031] Fps is (10 sec: 47513.3, 60 sec: 52155.7, 300 sec: 52762.0). Total num frames: 5799428096. Throughput: 0: 52746.2. Samples: 289901400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:29:59,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 10:30:01,079][52263] Updated weights for policy 0, policy_version 353978 (0.0038) [2024-04-27 10:30:04,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53247.9, 300 sec: 52762.0). Total num frames: 5799706624. Throughput: 0: 52659.6. Samples: 290216940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:04,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 10:30:04,546][52263] Updated weights for policy 0, policy_version 353988 (0.0033) [2024-04-27 10:30:07,297][52263] Updated weights for policy 0, policy_version 353998 (0.0031) [2024-04-27 10:30:09,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5799985152. Throughput: 0: 52606.7. Samples: 290533540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:09,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:30:10,933][52263] Updated weights for policy 0, policy_version 354008 (0.0029) [2024-04-27 10:30:13,558][52263] Updated weights for policy 0, policy_version 354018 (0.0029) [2024-04-27 10:30:14,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5800263680. Throughput: 0: 52656.1. Samples: 290698040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:14,107][52031] Avg episode reward: [(0, '0.495')] [2024-04-27 10:30:16,990][52263] Updated weights for policy 0, policy_version 354028 (0.0034) [2024-04-27 10:30:19,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52155.8, 300 sec: 52762.0). Total num frames: 5800493056. Throughput: 0: 52634.4. Samples: 291017180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:19,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 10:30:19,764][52263] Updated weights for policy 0, policy_version 354038 (0.0028) [2024-04-27 10:30:23,415][52263] Updated weights for policy 0, policy_version 354048 (0.0030) [2024-04-27 10:30:24,106][52031] Fps is (10 sec: 47513.4, 60 sec: 52155.8, 300 sec: 52762.0). Total num frames: 5800738816. Throughput: 0: 52633.2. Samples: 291329520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:24,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 10:30:25,975][52263] Updated weights for policy 0, policy_version 354058 (0.0034) [2024-04-27 10:30:29,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52428.9, 300 sec: 52651.0). Total num frames: 5801000960. Throughput: 0: 52142.7. Samples: 291471220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:29,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 10:30:29,822][52263] Updated weights for policy 0, policy_version 354068 (0.0028) [2024-04-27 10:30:29,838][52242] Signal inference workers to stop experience collection... (4250 times) [2024-04-27 10:30:29,838][52242] Signal inference workers to resume experience collection... 
(4250 times) [2024-04-27 10:30:29,852][52263] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-04-27 10:30:29,852][52263] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-04-27 10:30:32,149][52263] Updated weights for policy 0, policy_version 354078 (0.0030) [2024-04-27 10:30:34,107][52031] Fps is (10 sec: 55705.4, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5801295872. Throughput: 0: 52239.5. Samples: 291790340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:34,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:30:36,011][52263] Updated weights for policy 0, policy_version 354088 (0.0029) [2024-04-27 10:30:38,269][52263] Updated weights for policy 0, policy_version 354098 (0.0028) [2024-04-27 10:30:39,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53247.9, 300 sec: 52873.1). Total num frames: 5801574400. Throughput: 0: 52352.9. Samples: 292109220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:39,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 10:30:42,135][52263] Updated weights for policy 0, policy_version 354108 (0.0030) [2024-04-27 10:30:44,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5801820160. Throughput: 0: 52897.8. Samples: 292281800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:44,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 10:30:44,541][52263] Updated weights for policy 0, policy_version 354118 (0.0033) [2024-04-27 10:30:48,120][52263] Updated weights for policy 0, policy_version 354128 (0.0028) [2024-04-27 10:30:49,107][52031] Fps is (10 sec: 49151.8, 60 sec: 51882.6, 300 sec: 52817.6). Total num frames: 5802065920. Throughput: 0: 53053.7. Samples: 292604360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:49,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 10:30:50,702][52263] Updated weights for policy 0, policy_version 354138 (0.0027) [2024-04-27 10:30:54,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52701.9, 300 sec: 52762.1). Total num frames: 5802328064. Throughput: 0: 52913.0. Samples: 292914620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:54,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 10:30:54,398][52263] Updated weights for policy 0, policy_version 354148 (0.0026) [2024-04-27 10:30:57,072][52263] Updated weights for policy 0, policy_version 354158 (0.0031) [2024-04-27 10:30:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5802606592. Throughput: 0: 52514.2. Samples: 293061180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 10:30:59,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 10:31:00,795][52263] Updated weights for policy 0, policy_version 354168 (0.0032) [2024-04-27 10:31:03,184][52263] Updated weights for policy 0, policy_version 354178 (0.0030) [2024-04-27 10:31:04,106][52031] Fps is (10 sec: 57343.9, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5802901504. Throughput: 0: 52500.9. Samples: 293379720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:04,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 10:31:06,971][52263] Updated weights for policy 0, policy_version 354188 (0.0032) [2024-04-27 10:31:09,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52702.0, 300 sec: 52817.6). Total num frames: 5803147264. Throughput: 0: 52668.1. Samples: 293699580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:09,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 10:31:09,268][52263] Updated weights for policy 0, policy_version 354198 (0.0029) [2024-04-27 10:31:13,216][52263] Updated weights for policy 0, policy_version 354208 (0.0029) [2024-04-27 10:31:14,107][52031] Fps is (10 sec: 47512.8, 60 sec: 51882.5, 300 sec: 52817.6). Total num frames: 5803376640. Throughput: 0: 53141.5. Samples: 293862600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:14,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:31:15,561][52263] Updated weights for policy 0, policy_version 354218 (0.0029) [2024-04-27 10:31:19,106][52031] Fps is (10 sec: 49151.6, 60 sec: 52428.8, 300 sec: 52762.1). Total num frames: 5803638784. Throughput: 0: 53023.2. Samples: 294176380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:19,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 10:31:19,439][52263] Updated weights for policy 0, policy_version 354228 (0.0034) [2024-04-27 10:31:21,893][52263] Updated weights for policy 0, policy_version 354238 (0.0031) [2024-04-27 10:31:24,106][52031] Fps is (10 sec: 54068.0, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5803917312. Throughput: 0: 52852.9. Samples: 294487600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:24,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 10:31:25,608][52263] Updated weights for policy 0, policy_version 354248 (0.0033) [2024-04-27 10:31:27,832][52242] Signal inference workers to stop experience collection... (4300 times) [2024-04-27 10:31:27,868][52263] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-04-27 10:31:27,897][52242] Signal inference workers to resume experience collection... 
(4300 times) [2024-04-27 10:31:27,898][52263] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-04-27 10:31:28,026][52263] Updated weights for policy 0, policy_version 354258 (0.0025) [2024-04-27 10:31:29,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5804195840. Throughput: 0: 52672.4. Samples: 294652060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:29,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 10:31:31,787][52263] Updated weights for policy 0, policy_version 354268 (0.0030) [2024-04-27 10:31:34,107][52031] Fps is (10 sec: 55705.5, 60 sec: 52974.9, 300 sec: 52873.1). Total num frames: 5804474368. Throughput: 0: 52629.8. Samples: 294972700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:34,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 10:31:34,151][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000354278_5804490752.pth... [2024-04-27 10:31:34,154][52263] Updated weights for policy 0, policy_version 354278 (0.0033) [2024-04-27 10:31:34,196][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000353503_5791793152.pth [2024-04-27 10:31:37,885][52263] Updated weights for policy 0, policy_version 354288 (0.0038) [2024-04-27 10:31:39,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5804720128. Throughput: 0: 52792.8. Samples: 295290300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:39,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 10:31:40,459][52263] Updated weights for policy 0, policy_version 354298 (0.0028) [2024-04-27 10:31:43,925][52263] Updated weights for policy 0, policy_version 354308 (0.0028) [2024-04-27 10:31:44,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52701.8, 300 sec: 52873.1). Total num frames: 5804982272. Throughput: 0: 52880.4. Samples: 295440800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:44,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 10:31:46,726][52263] Updated weights for policy 0, policy_version 354318 (0.0027) [2024-04-27 10:31:49,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52702.0, 300 sec: 52762.1). Total num frames: 5805228032. Throughput: 0: 52858.3. Samples: 295758340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:49,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 10:31:50,222][52263] Updated weights for policy 0, policy_version 354328 (0.0030) [2024-04-27 10:31:52,784][52263] Updated weights for policy 0, policy_version 354338 (0.0030) [2024-04-27 10:31:54,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.8, 300 sec: 52817.5). Total num frames: 5805506560. Throughput: 0: 52769.9. Samples: 296074240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:54,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 10:31:56,509][52263] Updated weights for policy 0, policy_version 354348 (0.0030) [2024-04-27 10:31:58,842][52263] Updated weights for policy 0, policy_version 354358 (0.0031) [2024-04-27 10:31:59,106][52031] Fps is (10 sec: 57343.9, 60 sec: 53248.1, 300 sec: 52873.1). Total num frames: 5805801472. Throughput: 0: 52820.3. Samples: 296239500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:31:59,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 10:32:02,568][52263] Updated weights for policy 0, policy_version 354368 (0.0032) [2024-04-27 10:32:04,107][52031] Fps is (10 sec: 54067.9, 60 sec: 52428.7, 300 sec: 52817.6). Total num frames: 5806047232. Throughput: 0: 52841.7. Samples: 296554260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:32:04,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 10:32:05,206][52263] Updated weights for policy 0, policy_version 354378 (0.0029) [2024-04-27 10:32:08,814][52263] Updated weights for policy 0, policy_version 354388 (0.0036) [2024-04-27 10:32:09,107][52031] Fps is (10 sec: 49151.6, 60 sec: 52428.7, 300 sec: 52873.1). Total num frames: 5806292992. Throughput: 0: 52880.9. Samples: 296867240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:32:09,107][52031] Avg episode reward: [(0, '0.497')] [2024-04-27 10:32:11,702][52263] Updated weights for policy 0, policy_version 354398 (0.0028) [2024-04-27 10:32:14,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5806555136. Throughput: 0: 52683.6. Samples: 297022820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:32:14,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 10:32:15,186][52263] Updated weights for policy 0, policy_version 354408 (0.0030) [2024-04-27 10:32:18,137][52263] Updated weights for policy 0, policy_version 354418 (0.0039) [2024-04-27 10:32:19,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.9, 300 sec: 52762.0). Total num frames: 5806833664. Throughput: 0: 52495.1. Samples: 297334980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 10:32:19,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 10:32:21,365][52263] Updated weights for policy 0, policy_version 354428 (0.0033) [2024-04-27 10:32:24,107][52031] Fps is (10 sec: 54066.9, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5807095808. Throughput: 0: 52535.9. Samples: 297654420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:32:24,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 10:32:24,179][52263] Updated weights for policy 0, policy_version 354438 (0.0029) [2024-04-27 10:32:27,514][52263] Updated weights for policy 0, policy_version 354448 (0.0033) [2024-04-27 10:32:29,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5807357952. Throughput: 0: 52793.0. Samples: 297816480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:32:29,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 10:32:30,343][52263] Updated weights for policy 0, policy_version 354458 (0.0031) [2024-04-27 10:32:33,596][52263] Updated weights for policy 0, policy_version 354468 (0.0032) [2024-04-27 10:32:34,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52428.8, 300 sec: 52817.6). Total num frames: 5807620096. Throughput: 0: 52639.5. Samples: 298127120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:32:34,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 10:32:36,885][52263] Updated weights for policy 0, policy_version 354478 (0.0027) [2024-04-27 10:32:39,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52428.8, 300 sec: 52762.0). Total num frames: 5807865856. Throughput: 0: 52697.1. Samples: 298445600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:32:39,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 10:32:39,801][52263] Updated weights for policy 0, policy_version 354488 (0.0030) [2024-04-27 10:32:42,951][52263] Updated weights for policy 0, policy_version 354498 (0.0031) [2024-04-27 10:32:44,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5808144384. Throughput: 0: 52516.3. Samples: 298602740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:32:44,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 10:32:45,474][52242] Signal inference workers to stop experience collection... (4350 times) [2024-04-27 10:32:45,476][52242] Signal inference workers to resume experience collection... (4350 times) [2024-04-27 10:32:45,489][52263] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-04-27 10:32:45,489][52263] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-04-27 10:32:46,095][52263] Updated weights for policy 0, policy_version 354508 (0.0029) [2024-04-27 10:32:48,958][52263] Updated weights for policy 0, policy_version 354518 (0.0032) [2024-04-27 10:32:49,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53247.8, 300 sec: 52762.0). Total num frames: 5808422912. Throughput: 0: 52627.0. Samples: 298922480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:32:49,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 10:32:52,224][52263] Updated weights for policy 0, policy_version 354528 (0.0030) [2024-04-27 10:32:54,107][52031] Fps is (10 sec: 54067.4, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5808685056. Throughput: 0: 52736.8. Samples: 299240400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:32:54,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 10:32:55,275][52263] Updated weights for policy 0, policy_version 354538 (0.0032) [2024-04-27 10:32:58,311][52263] Updated weights for policy 0, policy_version 354548 (0.0026) [2024-04-27 10:32:59,106][52031] Fps is (10 sec: 49152.4, 60 sec: 51882.6, 300 sec: 52650.9). Total num frames: 5808914432. Throughput: 0: 52910.7. Samples: 299403800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:32:59,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 10:33:01,570][52263] Updated weights for policy 0, policy_version 354558 (0.0030) [2024-04-27 10:33:04,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5809209344. Throughput: 0: 52938.3. Samples: 299717200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:33:04,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 10:33:04,832][52263] Updated weights for policy 0, policy_version 354568 (0.0032) [2024-04-27 10:33:07,760][52263] Updated weights for policy 0, policy_version 354578 (0.0029) [2024-04-27 10:33:09,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52701.9, 300 sec: 52762.1). Total num frames: 5809455104. Throughput: 0: 52799.3. Samples: 300030380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:33:09,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 10:33:11,126][52263] Updated weights for policy 0, policy_version 354588 (0.0028) [2024-04-27 10:33:13,875][52263] Updated weights for policy 0, policy_version 354598 (0.0031) [2024-04-27 10:33:14,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5809733632. Throughput: 0: 52793.4. Samples: 300192180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:33:14,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 10:33:17,242][52263] Updated weights for policy 0, policy_version 354608 (0.0030) [2024-04-27 10:33:19,106][52031] Fps is (10 sec: 54066.9, 60 sec: 52701.9, 300 sec: 52651.0). Total num frames: 5809995776. Throughput: 0: 52888.5. Samples: 300507100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:33:19,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 10:33:20,081][52263] Updated weights for policy 0, policy_version 354618 (0.0030) [2024-04-27 10:33:23,504][52263] Updated weights for policy 0, policy_version 354628 (0.0033) [2024-04-27 10:33:24,107][52031] Fps is (10 sec: 50789.5, 60 sec: 52428.8, 300 sec: 52650.9). Total num frames: 5810241536. Throughput: 0: 52747.4. Samples: 300819240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:33:24,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:33:26,455][52263] Updated weights for policy 0, policy_version 354638 (0.0029) [2024-04-27 10:33:29,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5810520064. Throughput: 0: 52658.2. Samples: 300972360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:33:29,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:33:29,843][52263] Updated weights for policy 0, policy_version 354648 (0.0032) [2024-04-27 10:33:32,927][52263] Updated weights for policy 0, policy_version 354658 (0.0039) [2024-04-27 10:33:34,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52701.9, 300 sec: 52817.6). Total num frames: 5810782208. Throughput: 0: 52645.0. Samples: 301291500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:33:34,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 10:33:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000354662_5810782208.pth... [2024-04-27 10:33:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000353890_5798133760.pth [2024-04-27 10:33:35,914][52263] Updated weights for policy 0, policy_version 354668 (0.0029) [2024-04-27 10:33:39,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52701.9, 300 sec: 52651.0). Total num frames: 5811027968. Throughput: 0: 52605.9. Samples: 301607660. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:33:39,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 10:33:39,146][52263] Updated weights for policy 0, policy_version 354678 (0.0028) [2024-04-27 10:33:42,405][52263] Updated weights for policy 0, policy_version 354688 (0.0028) [2024-04-27 10:33:44,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52701.9, 300 sec: 52651.0). Total num frames: 5811306496. Throughput: 0: 52462.6. Samples: 301764620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:33:44,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 10:33:45,317][52263] Updated weights for policy 0, policy_version 354698 (0.0031) [2024-04-27 10:33:48,684][52263] Updated weights for policy 0, policy_version 354708 (0.0028) [2024-04-27 10:33:49,106][52031] Fps is (10 sec: 52428.3, 60 sec: 52155.8, 300 sec: 52595.4). Total num frames: 5811552256. Throughput: 0: 52409.3. Samples: 302075620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:33:49,107][52031] Avg episode reward: [(0, '0.677')] [2024-04-27 10:33:51,454][52263] Updated weights for policy 0, policy_version 354718 (0.0027) [2024-04-27 10:33:54,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52428.8, 300 sec: 52650.9). Total num frames: 5811830784. Throughput: 0: 52539.0. Samples: 302394640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:33:54,107][52031] Avg episode reward: [(0, '0.680')] [2024-04-27 10:33:54,717][52263] Updated weights for policy 0, policy_version 354728 (0.0031) [2024-04-27 10:33:57,468][52263] Updated weights for policy 0, policy_version 354738 (0.0034) [2024-04-27 10:33:59,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5812109312. Throughput: 0: 52505.6. Samples: 302554940. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:33:59,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:34:01,148][52263] Updated weights for policy 0, policy_version 354748 (0.0031) [2024-04-27 10:34:03,675][52263] Updated weights for policy 0, policy_version 354758 (0.0029) [2024-04-27 10:34:04,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52701.7, 300 sec: 52817.5). Total num frames: 5812371456. Throughput: 0: 52560.3. Samples: 302872320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:04,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 10:34:07,386][52263] Updated weights for policy 0, policy_version 354768 (0.0032) [2024-04-27 10:34:08,337][52242] Signal inference workers to stop experience collection... (4400 times) [2024-04-27 10:34:08,376][52263] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-04-27 10:34:08,434][52242] Signal inference workers to resume experience collection... (4400 times) [2024-04-27 10:34:08,434][52263] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-04-27 10:34:09,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52974.8, 300 sec: 52762.0). Total num frames: 5812633600. Throughput: 0: 52669.9. Samples: 303189380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:09,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 10:34:10,048][52263] Updated weights for policy 0, policy_version 354778 (0.0029) [2024-04-27 10:34:13,503][52263] Updated weights for policy 0, policy_version 354788 (0.0032) [2024-04-27 10:34:14,106][52031] Fps is (10 sec: 49152.9, 60 sec: 52155.7, 300 sec: 52539.9). Total num frames: 5812862976. Throughput: 0: 52639.3. Samples: 303341120. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:14,107][52031] Avg episode reward: [(0, '0.488')] [2024-04-27 10:34:16,175][52263] Updated weights for policy 0, policy_version 354798 (0.0032) [2024-04-27 10:34:19,107][52031] Fps is (10 sec: 50790.3, 60 sec: 52428.7, 300 sec: 52651.0). Total num frames: 5813141504. Throughput: 0: 52576.4. Samples: 303657440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:19,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 10:34:19,589][52263] Updated weights for policy 0, policy_version 354808 (0.0032) [2024-04-27 10:34:22,680][52263] Updated weights for policy 0, policy_version 354818 (0.0028) [2024-04-27 10:34:24,106][52031] Fps is (10 sec: 55705.4, 60 sec: 52975.0, 300 sec: 52762.0). Total num frames: 5813420032. Throughput: 0: 52665.7. Samples: 303977620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:24,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 10:34:25,893][52263] Updated weights for policy 0, policy_version 354828 (0.0027) [2024-04-27 10:34:28,728][52263] Updated weights for policy 0, policy_version 354838 (0.0030) [2024-04-27 10:34:29,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52702.0, 300 sec: 52762.1). Total num frames: 5813682176. Throughput: 0: 52917.8. Samples: 304145920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:29,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:34:32,007][52263] Updated weights for policy 0, policy_version 354848 (0.0028) [2024-04-27 10:34:34,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52701.9, 300 sec: 52762.0). Total num frames: 5813944320. Throughput: 0: 53024.9. Samples: 304461740. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:34,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:34:34,704][52263] Updated weights for policy 0, policy_version 354858 (0.0030) [2024-04-27 10:34:38,320][52263] Updated weights for policy 0, policy_version 354868 (0.0029) [2024-04-27 10:34:39,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 52651.0). Total num frames: 5814206464. Throughput: 0: 52949.9. Samples: 304777380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:39,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 10:34:40,941][52263] Updated weights for policy 0, policy_version 354878 (0.0031) [2024-04-27 10:34:44,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52428.8, 300 sec: 52539.9). Total num frames: 5814452224. Throughput: 0: 52816.5. Samples: 304931680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:44,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 10:34:44,521][52263] Updated weights for policy 0, policy_version 354888 (0.0027) [2024-04-27 10:34:47,307][52263] Updated weights for policy 0, policy_version 354898 (0.0033) [2024-04-27 10:34:49,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5814747136. Throughput: 0: 52836.5. Samples: 305249960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:49,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 10:34:50,796][52263] Updated weights for policy 0, policy_version 354908 (0.0031) [2024-04-27 10:34:53,389][52263] Updated weights for policy 0, policy_version 354918 (0.0027) [2024-04-27 10:34:54,106][52031] Fps is (10 sec: 55705.9, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5815009280. Throughput: 0: 52815.1. Samples: 305566060. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-27 10:34:54,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 10:34:56,906][52263] Updated weights for policy 0, policy_version 354928 (0.0033) [2024-04-27 10:34:59,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52701.8, 300 sec: 52762.0). Total num frames: 5815271424. Throughput: 0: 53092.3. Samples: 305730280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:34:59,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 10:34:59,573][52263] Updated weights for policy 0, policy_version 354938 (0.0035) [2024-04-27 10:35:03,069][52263] Updated weights for policy 0, policy_version 354948 (0.0029) [2024-04-27 10:35:04,106][52031] Fps is (10 sec: 49151.8, 60 sec: 52155.8, 300 sec: 52595.4). Total num frames: 5815500800. Throughput: 0: 52970.2. Samples: 306041100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:04,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 10:35:05,875][52263] Updated weights for policy 0, policy_version 354958 (0.0027) [2024-04-27 10:35:08,165][52242] Signal inference workers to stop experience collection... (4450 times) [2024-04-27 10:35:08,210][52263] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-04-27 10:35:08,225][52242] Signal inference workers to resume experience collection... (4450 times) [2024-04-27 10:35:08,230][52263] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-04-27 10:35:09,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52428.8, 300 sec: 52595.4). Total num frames: 5815779328. Throughput: 0: 52904.0. Samples: 306358300. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:09,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 10:35:09,230][52263] Updated weights for policy 0, policy_version 354968 (0.0034) [2024-04-27 10:35:12,260][52263] Updated weights for policy 0, policy_version 354978 (0.0034) [2024-04-27 10:35:14,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53248.0, 300 sec: 52762.0). Total num frames: 5816057856. Throughput: 0: 52755.1. Samples: 306519900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:14,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 10:35:15,415][52263] Updated weights for policy 0, policy_version 354988 (0.0036) [2024-04-27 10:35:18,308][52263] Updated weights for policy 0, policy_version 354998 (0.0029) [2024-04-27 10:35:19,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.0, 300 sec: 52873.1). Total num frames: 5816336384. Throughput: 0: 52736.0. Samples: 306834860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:19,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 10:35:21,646][52263] Updated weights for policy 0, policy_version 355008 (0.0028) [2024-04-27 10:35:24,107][52031] Fps is (10 sec: 54066.3, 60 sec: 52974.8, 300 sec: 52873.1). Total num frames: 5816598528. Throughput: 0: 52730.5. Samples: 307150260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:24,107][52031] Avg episode reward: [(0, '0.681')] [2024-04-27 10:35:24,390][52263] Updated weights for policy 0, policy_version 355018 (0.0028) [2024-04-27 10:35:27,745][52263] Updated weights for policy 0, policy_version 355028 (0.0032) [2024-04-27 10:35:29,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5816860672. Throughput: 0: 52997.4. Samples: 307316560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:29,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 10:35:30,732][52263] Updated weights for policy 0, policy_version 355038 (0.0031) [2024-04-27 10:35:33,878][52263] Updated weights for policy 0, policy_version 355048 (0.0033) [2024-04-27 10:35:34,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52974.9, 300 sec: 52706.5). Total num frames: 5817122816. Throughput: 0: 53031.6. Samples: 307636380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:34,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:35:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000355049_5817122816.pth... [2024-04-27 10:35:34,166][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000354278_5804490752.pth [2024-04-27 10:35:36,863][52263] Updated weights for policy 0, policy_version 355058 (0.0031) [2024-04-27 10:35:39,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52701.8, 300 sec: 52706.5). Total num frames: 5817368576. Throughput: 0: 53041.3. Samples: 307952920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:39,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 10:35:40,007][52263] Updated weights for policy 0, policy_version 355068 (0.0030) [2024-04-27 10:35:42,857][52263] Updated weights for policy 0, policy_version 355078 (0.0026) [2024-04-27 10:35:44,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.1, 300 sec: 52928.7). Total num frames: 5817679872. Throughput: 0: 52898.7. Samples: 308110720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:44,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 10:35:46,161][52263] Updated weights for policy 0, policy_version 355088 (0.0028) [2024-04-27 10:35:48,868][52263] Updated weights for policy 0, policy_version 355098 (0.0027) [2024-04-27 10:35:49,106][52031] Fps is (10 sec: 55706.3, 60 sec: 52975.1, 300 sec: 52873.1). 
Total num frames: 5817925632. Throughput: 0: 53145.9. Samples: 308432660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:49,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 10:35:52,285][52263] Updated weights for policy 0, policy_version 355108 (0.0025) [2024-04-27 10:35:54,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52975.0, 300 sec: 52817.6). Total num frames: 5818187776. Throughput: 0: 53191.7. Samples: 308751920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:54,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 10:35:55,072][52263] Updated weights for policy 0, policy_version 355118 (0.0026) [2024-04-27 10:35:58,334][52263] Updated weights for policy 0, policy_version 355128 (0.0028) [2024-04-27 10:35:59,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.1, 300 sec: 52762.0). Total num frames: 5818466304. Throughput: 0: 53041.8. Samples: 308906780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:35:59,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 10:36:01,520][52263] Updated weights for policy 0, policy_version 355138 (0.0030) [2024-04-27 10:36:04,106][52031] Fps is (10 sec: 50789.9, 60 sec: 53248.1, 300 sec: 52706.5). Total num frames: 5818695680. Throughput: 0: 53154.3. Samples: 309226800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:36:04,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 10:36:04,472][52263] Updated weights for policy 0, policy_version 355148 (0.0030) [2024-04-27 10:36:07,526][52263] Updated weights for policy 0, policy_version 355158 (0.0031) [2024-04-27 10:36:09,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.1, 300 sec: 52873.2). Total num frames: 5818974208. Throughput: 0: 53323.3. Samples: 309549800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 10:36:09,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 10:36:10,606][52263] Updated weights for policy 0, policy_version 355168 (0.0033) [2024-04-27 10:36:13,619][52263] Updated weights for policy 0, policy_version 355178 (0.0030) [2024-04-27 10:36:14,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52975.0, 300 sec: 52873.1). Total num frames: 5819236352. Throughput: 0: 53097.9. Samples: 309705960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:14,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 10:36:16,617][52242] Signal inference workers to stop experience collection... (4500 times) [2024-04-27 10:36:16,667][52263] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-04-27 10:36:16,675][52242] Signal inference workers to resume experience collection... (4500 times) [2024-04-27 10:36:16,680][52263] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-04-27 10:36:16,812][52263] Updated weights for policy 0, policy_version 355188 (0.0032) [2024-04-27 10:36:19,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53248.0, 300 sec: 52928.6). Total num frames: 5819531264. Throughput: 0: 53092.9. Samples: 310025560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:19,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 10:36:19,781][52263] Updated weights for policy 0, policy_version 355198 (0.0033) [2024-04-27 10:36:22,960][52263] Updated weights for policy 0, policy_version 355208 (0.0033) [2024-04-27 10:36:24,107][52031] Fps is (10 sec: 57343.1, 60 sec: 53521.1, 300 sec: 52928.7). Total num frames: 5819809792. Throughput: 0: 53309.4. Samples: 310351840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:24,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:36:25,873][52263] Updated weights for policy 0, policy_version 355218 (0.0029) [2024-04-27 10:36:29,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52974.9, 300 sec: 52762.0). Total num frames: 5820039168. Throughput: 0: 53270.3. Samples: 310507880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:29,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 10:36:29,160][52263] Updated weights for policy 0, policy_version 355228 (0.0033) [2024-04-27 10:36:31,821][52263] Updated weights for policy 0, policy_version 355238 (0.0031) [2024-04-27 10:36:34,107][52031] Fps is (10 sec: 49149.8, 60 sec: 52974.5, 300 sec: 52817.5). Total num frames: 5820301312. Throughput: 0: 53178.0. Samples: 310825700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:34,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 10:36:35,213][52263] Updated weights for policy 0, policy_version 355248 (0.0030) [2024-04-27 10:36:37,869][52263] Updated weights for policy 0, policy_version 355258 (0.0035) [2024-04-27 10:36:39,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.1, 300 sec: 52817.6). Total num frames: 5820563456. Throughput: 0: 53225.7. Samples: 311147080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:39,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 10:36:41,246][52263] Updated weights for policy 0, policy_version 355268 (0.0024) [2024-04-27 10:36:44,107][52031] Fps is (10 sec: 54069.8, 60 sec: 52701.9, 300 sec: 52928.6). Total num frames: 5820841984. Throughput: 0: 53276.4. Samples: 311304220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:44,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 10:36:44,471][52263] Updated weights for policy 0, policy_version 355278 (0.0026) [2024-04-27 10:36:47,313][52263] Updated weights for policy 0, policy_version 355288 (0.0025) [2024-04-27 10:36:49,106][52031] Fps is (10 sec: 57343.9, 60 sec: 53521.1, 300 sec: 52984.2). Total num frames: 5821136896. Throughput: 0: 53273.4. Samples: 311624100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:49,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 10:36:50,852][52263] Updated weights for policy 0, policy_version 355298 (0.0031) [2024-04-27 10:36:53,400][52263] Updated weights for policy 0, policy_version 355308 (0.0026) [2024-04-27 10:36:54,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.9, 300 sec: 52817.6). Total num frames: 5821382656. Throughput: 0: 53248.4. Samples: 311945980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:54,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 10:36:57,062][52263] Updated weights for policy 0, policy_version 355318 (0.0029) [2024-04-27 10:36:59,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 52928.7). Total num frames: 5821661184. Throughput: 0: 53395.0. Samples: 312108740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:36:59,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 10:36:59,365][52263] Updated weights for policy 0, policy_version 355328 (0.0031) [2024-04-27 10:37:03,227][52263] Updated weights for policy 0, policy_version 355338 (0.0023) [2024-04-27 10:37:04,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 52928.6). Total num frames: 5821906944. Throughput: 0: 53476.9. Samples: 312432020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:37:04,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 10:37:05,609][52263] Updated weights for policy 0, policy_version 355348 (0.0027) [2024-04-27 10:37:09,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53248.0, 300 sec: 52928.7). Total num frames: 5822169088. Throughput: 0: 53245.0. Samples: 312747860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:37:09,107][52031] Avg episode reward: [(0, '0.475')] [2024-04-27 10:37:09,233][52263] Updated weights for policy 0, policy_version 355358 (0.0034) [2024-04-27 10:37:11,985][52263] Updated weights for policy 0, policy_version 355368 (0.0028) [2024-04-27 10:37:14,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.1, 300 sec: 52984.2). Total num frames: 5822464000. Throughput: 0: 53367.2. Samples: 312909400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:37:14,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 10:37:15,539][52263] Updated weights for policy 0, policy_version 355378 (0.0024) [2024-04-27 10:37:16,120][52242] Signal inference workers to stop experience collection... (4550 times) [2024-04-27 10:37:16,120][52242] Signal inference workers to resume experience collection... (4550 times) [2024-04-27 10:37:16,145][52263] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-04-27 10:37:16,145][52263] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-04-27 10:37:18,184][52263] Updated weights for policy 0, policy_version 355388 (0.0028) [2024-04-27 10:37:19,107][52031] Fps is (10 sec: 57343.3, 60 sec: 53521.0, 300 sec: 53039.7). Total num frames: 5822742528. Throughput: 0: 53460.0. Samples: 313231380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:37:19,107][52031] Avg episode reward: [(0, '0.472')] [2024-04-27 10:37:21,617][52263] Updated weights for policy 0, policy_version 355398 (0.0037) [2024-04-27 10:37:24,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 52984.2). Total num frames: 5822988288. Throughput: 0: 53423.4. Samples: 313551140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 10:37:24,116][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 10:37:24,345][52263] Updated weights for policy 0, policy_version 355408 (0.0026) [2024-04-27 10:37:27,801][52263] Updated weights for policy 0, policy_version 355418 (0.0028) [2024-04-27 10:37:29,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.0, 300 sec: 52984.2). Total num frames: 5823250432. Throughput: 0: 53452.3. Samples: 313709580. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:37:29,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 10:37:30,379][52263] Updated weights for policy 0, policy_version 355428 (0.0034) [2024-04-27 10:37:33,879][52263] Updated weights for policy 0, policy_version 355438 (0.0035) [2024-04-27 10:37:34,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.5, 300 sec: 52984.2). Total num frames: 5823496192. Throughput: 0: 53476.9. Samples: 314030560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:37:34,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 10:37:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000355438_5823496192.pth... [2024-04-27 10:37:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000354662_5810782208.pth [2024-04-27 10:37:36,460][52263] Updated weights for policy 0, policy_version 355448 (0.0030) [2024-04-27 10:37:39,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.0, 300 sec: 53039.7). Total num frames: 5823791104. Throughput: 0: 53379.5. Samples: 314348060. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:37:39,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 10:37:40,374][52263] Updated weights for policy 0, policy_version 355458 (0.0027) [2024-04-27 10:37:42,605][52263] Updated weights for policy 0, policy_version 355468 (0.0028) [2024-04-27 10:37:44,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.2, 300 sec: 53039.8). Total num frames: 5824069632. Throughput: 0: 53313.0. Samples: 314507820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:37:44,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 10:37:46,490][52263] Updated weights for policy 0, policy_version 355478 (0.0033) [2024-04-27 10:37:48,679][52263] Updated weights for policy 0, policy_version 355488 (0.0038) [2024-04-27 10:37:49,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53520.9, 300 sec: 53095.3). Total num frames: 5824348160. Throughput: 0: 53274.2. Samples: 314829360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:37:49,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 10:37:52,603][52263] Updated weights for policy 0, policy_version 355498 (0.0030) [2024-04-27 10:37:54,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53248.0, 300 sec: 53095.3). Total num frames: 5824577536. Throughput: 0: 53371.0. Samples: 315149560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:37:54,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 10:37:54,788][52263] Updated weights for policy 0, policy_version 355508 (0.0034) [2024-04-27 10:37:58,794][52263] Updated weights for policy 0, policy_version 355518 (0.0029) [2024-04-27 10:37:59,106][52031] Fps is (10 sec: 47514.4, 60 sec: 52701.9, 300 sec: 52928.7). Total num frames: 5824823296. Throughput: 0: 53286.7. Samples: 315307300. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:37:59,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 10:38:01,059][52263] Updated weights for policy 0, policy_version 355528 (0.0026) [2024-04-27 10:38:04,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53039.7). Total num frames: 5825101824. Throughput: 0: 53161.9. Samples: 315623660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:38:04,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 10:38:04,889][52263] Updated weights for policy 0, policy_version 355538 (0.0025) [2024-04-27 10:38:07,386][52263] Updated weights for policy 0, policy_version 355548 (0.0027) [2024-04-27 10:38:09,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53520.9, 300 sec: 53039.7). Total num frames: 5825380352. Throughput: 0: 53197.7. Samples: 315945040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:38:09,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 10:38:11,020][52263] Updated weights for policy 0, policy_version 355558 (0.0030) [2024-04-27 10:38:13,255][52263] Updated weights for policy 0, policy_version 355568 (0.0028) [2024-04-27 10:38:14,107][52031] Fps is (10 sec: 57343.8, 60 sec: 53521.0, 300 sec: 53150.8). Total num frames: 5825675264. Throughput: 0: 53379.7. Samples: 316111660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:38:14,107][52031] Avg episode reward: [(0, '0.709')] [2024-04-27 10:38:17,148][52263] Updated weights for policy 0, policy_version 355578 (0.0025) [2024-04-27 10:38:19,106][52031] Fps is (10 sec: 54068.2, 60 sec: 52975.1, 300 sec: 53150.8). Total num frames: 5825921024. Throughput: 0: 53441.3. Samples: 316435420. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:38:19,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 10:38:19,285][52263] Updated weights for policy 0, policy_version 355588 (0.0027) [2024-04-27 10:38:23,193][52263] Updated weights for policy 0, policy_version 355598 (0.0026) [2024-04-27 10:38:24,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53248.1, 300 sec: 53095.3). Total num frames: 5826183168. Throughput: 0: 53469.9. Samples: 316754200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:38:24,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 10:38:24,823][52242] Signal inference workers to stop experience collection... (4600 times) [2024-04-27 10:38:24,827][52242] Signal inference workers to resume experience collection... (4600 times) [2024-04-27 10:38:24,849][52263] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-04-27 10:38:24,849][52263] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-04-27 10:38:25,510][52263] Updated weights for policy 0, policy_version 355608 (0.0029) [2024-04-27 10:38:29,107][52031] Fps is (10 sec: 49151.5, 60 sec: 52701.9, 300 sec: 52984.2). Total num frames: 5826412544. Throughput: 0: 53156.7. Samples: 316899880. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:38:29,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 10:38:29,474][52263] Updated weights for policy 0, policy_version 355618 (0.0033) [2024-04-27 10:38:31,663][52263] Updated weights for policy 0, policy_version 355628 (0.0026) [2024-04-27 10:38:34,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53150.8). Total num frames: 5826707456. Throughput: 0: 53023.3. Samples: 317215400. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:38:34,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 10:38:35,660][52263] Updated weights for policy 0, policy_version 355638 (0.0036) [2024-04-27 10:38:37,687][52263] Updated weights for policy 0, policy_version 355648 (0.0027) [2024-04-27 10:38:39,106][52031] Fps is (10 sec: 58983.2, 60 sec: 53521.2, 300 sec: 53206.4). Total num frames: 5827002368. Throughput: 0: 52895.7. Samples: 317529860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:38:39,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 10:38:41,710][52263] Updated weights for policy 0, policy_version 355658 (0.0027) [2024-04-27 10:38:43,932][52263] Updated weights for policy 0, policy_version 355668 (0.0027) [2024-04-27 10:38:44,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5827280896. Throughput: 0: 53337.3. Samples: 317707480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-27 10:38:44,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 10:38:47,795][52263] Updated weights for policy 0, policy_version 355678 (0.0029) [2024-04-27 10:38:49,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52429.0, 300 sec: 53095.3). Total num frames: 5827493888. Throughput: 0: 53445.0. Samples: 318028680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:38:49,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 10:38:50,072][52263] Updated weights for policy 0, policy_version 355688 (0.0027) [2024-04-27 10:38:54,106][52031] Fps is (10 sec: 45875.2, 60 sec: 52702.0, 300 sec: 52984.2). Total num frames: 5827739648. Throughput: 0: 53350.9. Samples: 318345820. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:38:54,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 10:38:54,218][52263] Updated weights for policy 0, policy_version 355698 (0.0027) [2024-04-27 10:38:56,421][52263] Updated weights for policy 0, policy_version 355708 (0.0031) [2024-04-27 10:38:59,107][52031] Fps is (10 sec: 52427.5, 60 sec: 53247.8, 300 sec: 53039.7). Total num frames: 5828018176. Throughput: 0: 52851.5. Samples: 318489980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:38:59,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 10:39:00,305][52263] Updated weights for policy 0, policy_version 355718 (0.0028) [2024-04-27 10:39:02,389][52263] Updated weights for policy 0, policy_version 355728 (0.0035) [2024-04-27 10:39:04,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53248.0, 300 sec: 53095.3). Total num frames: 5828296704. Throughput: 0: 52668.4. Samples: 318805500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:04,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 10:39:06,357][52263] Updated weights for policy 0, policy_version 355738 (0.0032) [2024-04-27 10:39:08,254][52242] Signal inference workers to stop experience collection... (4650 times) [2024-04-27 10:39:08,306][52263] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-04-27 10:39:08,312][52242] Signal inference workers to resume experience collection... (4650 times) [2024-04-27 10:39:08,319][52263] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-04-27 10:39:08,434][52263] Updated weights for policy 0, policy_version 355748 (0.0031) [2024-04-27 10:39:09,106][52031] Fps is (10 sec: 57345.4, 60 sec: 53521.3, 300 sec: 53317.4). Total num frames: 5828591616. Throughput: 0: 52595.2. Samples: 319120980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:09,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 10:39:12,487][52263] Updated weights for policy 0, policy_version 355758 (0.0037) [2024-04-27 10:39:14,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52702.0, 300 sec: 53206.4). Total num frames: 5828837376. Throughput: 0: 53396.6. Samples: 319302720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:14,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 10:39:14,551][52263] Updated weights for policy 0, policy_version 355768 (0.0029) [2024-04-27 10:39:18,934][52263] Updated weights for policy 0, policy_version 355778 (0.0029) [2024-04-27 10:39:19,107][52031] Fps is (10 sec: 49151.3, 60 sec: 52701.8, 300 sec: 53095.3). Total num frames: 5829083136. Throughput: 0: 53411.0. Samples: 319618900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:19,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 10:39:20,805][52263] Updated weights for policy 0, policy_version 355788 (0.0028) [2024-04-27 10:39:24,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52701.8, 300 sec: 53095.3). Total num frames: 5829345280. Throughput: 0: 53483.4. Samples: 319936620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:24,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 10:39:25,170][52263] Updated weights for policy 0, policy_version 355798 (0.0032) [2024-04-27 10:39:26,946][52263] Updated weights for policy 0, policy_version 355808 (0.0033) [2024-04-27 10:39:29,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53206.3). Total num frames: 5829640192. Throughput: 0: 52843.4. Samples: 320085440. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:29,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 10:39:31,255][52263] Updated weights for policy 0, policy_version 355818 (0.0028) [2024-04-27 10:39:33,221][52263] Updated weights for policy 0, policy_version 355828 (0.0030) [2024-04-27 10:39:34,107][52031] Fps is (10 sec: 58982.0, 60 sec: 53794.0, 300 sec: 53317.4). Total num frames: 5829935104. Throughput: 0: 52926.3. Samples: 320410380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:34,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:39:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000355831_5829935104.pth... [2024-04-27 10:39:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000355049_5817122816.pth [2024-04-27 10:39:37,376][52263] Updated weights for policy 0, policy_version 355838 (0.0029) [2024-04-27 10:39:39,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5830197248. Throughput: 0: 53012.5. Samples: 320731380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:39,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:39:39,273][52263] Updated weights for policy 0, policy_version 355848 (0.0029) [2024-04-27 10:39:43,275][52263] Updated weights for policy 0, policy_version 355858 (0.0028) [2024-04-27 10:39:44,107][52031] Fps is (10 sec: 50791.1, 60 sec: 52701.8, 300 sec: 53206.4). Total num frames: 5830443008. Throughput: 0: 53402.8. Samples: 320893100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:44,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 10:39:45,378][52263] Updated weights for policy 0, policy_version 355868 (0.0032) [2024-04-27 10:39:49,106][52031] Fps is (10 sec: 47513.3, 60 sec: 52974.8, 300 sec: 53095.3). Total num frames: 5830672384. Throughput: 0: 53530.3. Samples: 321214360. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:49,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 10:39:49,383][52263] Updated weights for policy 0, policy_version 355878 (0.0030) [2024-04-27 10:39:51,591][52263] Updated weights for policy 0, policy_version 355888 (0.0034) [2024-04-27 10:39:54,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.0, 300 sec: 53206.4). Total num frames: 5830967296. Throughput: 0: 53752.7. Samples: 321539860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:54,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 10:39:55,456][52263] Updated weights for policy 0, policy_version 355898 (0.0031) [2024-04-27 10:39:56,632][52242] Signal inference workers to stop experience collection... (4700 times) [2024-04-27 10:39:56,632][52242] Signal inference workers to resume experience collection... (4700 times) [2024-04-27 10:39:56,656][52263] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-04-27 10:39:56,657][52263] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-04-27 10:39:57,775][52263] Updated weights for policy 0, policy_version 355908 (0.0030) [2024-04-27 10:39:59,107][52031] Fps is (10 sec: 57343.3, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 5831245824. Throughput: 0: 53236.7. Samples: 321698380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 10:39:59,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 10:40:01,513][52263] Updated weights for policy 0, policy_version 355918 (0.0029) [2024-04-27 10:40:03,887][52263] Updated weights for policy 0, policy_version 355928 (0.0033) [2024-04-27 10:40:04,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 5831524352. Throughput: 0: 53216.5. Samples: 322013640. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:04,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 10:40:07,866][52263] Updated weights for policy 0, policy_version 355938 (0.0029) [2024-04-27 10:40:09,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53247.8, 300 sec: 53317.4). Total num frames: 5831786496. Throughput: 0: 53325.3. Samples: 322336260. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:09,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:40:10,009][52263] Updated weights for policy 0, policy_version 355948 (0.0031) [2024-04-27 10:40:13,964][52263] Updated weights for policy 0, policy_version 355958 (0.0030) [2024-04-27 10:40:14,106][52031] Fps is (10 sec: 49152.6, 60 sec: 52975.0, 300 sec: 53150.8). Total num frames: 5832015872. Throughput: 0: 53528.7. Samples: 322494220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:14,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:40:16,211][52263] Updated weights for policy 0, policy_version 355968 (0.0031) [2024-04-27 10:40:19,107][52031] Fps is (10 sec: 49152.3, 60 sec: 53248.0, 300 sec: 53150.8). Total num frames: 5832278016. Throughput: 0: 53341.0. Samples: 322810720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:19,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 10:40:19,986][52263] Updated weights for policy 0, policy_version 355978 (0.0033) [2024-04-27 10:40:22,425][52263] Updated weights for policy 0, policy_version 355988 (0.0031) [2024-04-27 10:40:24,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.2, 300 sec: 53261.9). Total num frames: 5832572928. Throughput: 0: 53315.5. Samples: 323130580. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 10:40:26,317][52263] Updated weights for policy 0, policy_version 355998 (0.0033) [2024-04-27 10:40:28,604][52263] Updated weights for policy 0, policy_version 356008 (0.0031) [2024-04-27 10:40:29,107][52031] Fps is (10 sec: 57343.8, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5832851456. Throughput: 0: 53450.2. Samples: 323298360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:29,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 10:40:32,260][52263] Updated weights for policy 0, policy_version 356018 (0.0033) [2024-04-27 10:40:34,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5833129984. Throughput: 0: 53494.5. Samples: 323621620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:34,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 10:40:34,643][52263] Updated weights for policy 0, policy_version 356028 (0.0029) [2024-04-27 10:40:38,610][52263] Updated weights for policy 0, policy_version 356038 (0.0034) [2024-04-27 10:40:39,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52974.9, 300 sec: 53206.4). Total num frames: 5833375744. Throughput: 0: 53389.5. Samples: 323942380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:39,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 10:40:40,858][52263] Updated weights for policy 0, policy_version 356048 (0.0032) [2024-04-27 10:40:44,107][52031] Fps is (10 sec: 49152.2, 60 sec: 52974.9, 300 sec: 53206.3). Total num frames: 5833621504. Throughput: 0: 53164.5. Samples: 324090780. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:44,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 10:40:44,546][52263] Updated weights for policy 0, policy_version 356058 (0.0032) [2024-04-27 10:40:47,134][52263] Updated weights for policy 0, policy_version 356068 (0.0027) [2024-04-27 10:40:49,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.1, 300 sec: 53206.3). Total num frames: 5833883648. Throughput: 0: 53267.1. Samples: 324410660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:49,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 10:40:50,769][52263] Updated weights for policy 0, policy_version 356078 (0.0031) [2024-04-27 10:40:51,083][52242] Signal inference workers to stop experience collection... (4750 times) [2024-04-27 10:40:51,083][52242] Signal inference workers to resume experience collection... (4750 times) [2024-04-27 10:40:51,107][52263] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-04-27 10:40:51,107][52263] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-04-27 10:40:53,207][52263] Updated weights for policy 0, policy_version 356088 (0.0027) [2024-04-27 10:40:54,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 5834178560. Throughput: 0: 53201.4. Samples: 324730320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:40:56,953][52263] Updated weights for policy 0, policy_version 356098 (0.0029) [2024-04-27 10:40:59,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 5834457088. Throughput: 0: 53536.3. Samples: 324903360. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:40:59,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 10:40:59,280][52263] Updated weights for policy 0, policy_version 356108 (0.0033) [2024-04-27 10:41:03,108][52263] Updated weights for policy 0, policy_version 356118 (0.0029) [2024-04-27 10:41:04,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 5834735616. Throughput: 0: 53658.7. Samples: 325225360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:41:04,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:41:05,428][52263] Updated weights for policy 0, policy_version 356128 (0.0029) [2024-04-27 10:41:09,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 5834964992. Throughput: 0: 53650.6. Samples: 325544860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:41:09,108][52263] Updated weights for policy 0, policy_version 356138 (0.0037) [2024-04-27 10:41:09,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 10:41:11,589][52263] Updated weights for policy 0, policy_version 356148 (0.0030) [2024-04-27 10:41:14,106][52031] Fps is (10 sec: 49152.3, 60 sec: 53521.0, 300 sec: 53206.4). Total num frames: 5835227136. Throughput: 0: 53236.5. Samples: 325694000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:41:14,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 10:41:15,152][52263] Updated weights for policy 0, policy_version 356158 (0.0027) [2024-04-27 10:41:17,824][52263] Updated weights for policy 0, policy_version 356168 (0.0032) [2024-04-27 10:41:19,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.1, 300 sec: 53206.3). Total num frames: 5835505664. Throughput: 0: 53294.3. Samples: 326019860. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-04-27 10:41:19,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 10:41:21,366][52263] Updated weights for policy 0, policy_version 356178 (0.0032) [2024-04-27 10:41:24,005][52263] Updated weights for policy 0, policy_version 356188 (0.0029) [2024-04-27 10:41:24,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5835784192. Throughput: 0: 53185.4. Samples: 326335720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:41:24,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 10:41:27,375][52263] Updated weights for policy 0, policy_version 356198 (0.0029) [2024-04-27 10:41:29,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.1, 300 sec: 53428.6). Total num frames: 5836062720. Throughput: 0: 53678.3. Samples: 326506300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:41:29,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 10:41:29,992][52263] Updated weights for policy 0, policy_version 356208 (0.0042) [2024-04-27 10:41:33,539][52263] Updated weights for policy 0, policy_version 356218 (0.0033) [2024-04-27 10:41:34,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5836324864. Throughput: 0: 53741.2. Samples: 326829020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:41:34,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 10:41:34,225][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000356222_5836341248.pth... [2024-04-27 10:41:34,272][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000355438_5823496192.pth [2024-04-27 10:41:36,101][52263] Updated weights for policy 0, policy_version 356228 (0.0037) [2024-04-27 10:41:39,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 5836554240. Throughput: 0: 53750.8. Samples: 327149100. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:41:39,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 10:41:39,644][52263] Updated weights for policy 0, policy_version 356238 (0.0027) [2024-04-27 10:41:42,283][52263] Updated weights for policy 0, policy_version 356248 (0.0029) [2024-04-27 10:41:44,107][52031] Fps is (10 sec: 49151.8, 60 sec: 53248.0, 300 sec: 53150.8). Total num frames: 5836816384. Throughput: 0: 53176.8. Samples: 327296320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:41:44,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 10:41:45,992][52263] Updated weights for policy 0, policy_version 356258 (0.0034) [2024-04-27 10:41:46,080][52242] Signal inference workers to stop experience collection... (4800 times) [2024-04-27 10:41:46,138][52242] Signal inference workers to resume experience collection... (4800 times) [2024-04-27 10:41:46,138][52263] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-04-27 10:41:46,152][52263] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-04-27 10:41:48,329][52263] Updated weights for policy 0, policy_version 356268 (0.0034) [2024-04-27 10:41:49,107][52031] Fps is (10 sec: 57343.4, 60 sec: 54067.1, 300 sec: 53373.0). Total num frames: 5837127680. Throughput: 0: 53105.7. Samples: 327615120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:41:49,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 10:41:52,001][52263] Updated weights for policy 0, policy_version 356278 (0.0033) [2024-04-27 10:41:54,106][52031] Fps is (10 sec: 57344.7, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5837389824. Throughput: 0: 53272.1. Samples: 327942100. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:41:54,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 10:41:54,374][52263] Updated weights for policy 0, policy_version 356288 (0.0029) [2024-04-27 10:41:58,036][52263] Updated weights for policy 0, policy_version 356298 (0.0036) [2024-04-27 10:41:59,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5837668352. Throughput: 0: 53911.0. Samples: 328120000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:41:59,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 10:42:00,495][52263] Updated weights for policy 0, policy_version 356308 (0.0025) [2024-04-27 10:42:04,065][52263] Updated weights for policy 0, policy_version 356318 (0.0033) [2024-04-27 10:42:04,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52974.9, 300 sec: 53372.9). Total num frames: 5837914112. Throughput: 0: 53751.9. Samples: 328438700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:42:04,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 10:42:06,522][52263] Updated weights for policy 0, policy_version 356328 (0.0029) [2024-04-27 10:42:09,107][52031] Fps is (10 sec: 49152.0, 60 sec: 53247.9, 300 sec: 53206.3). Total num frames: 5838159872. Throughput: 0: 53873.1. Samples: 328760020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:42:09,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:42:10,162][52263] Updated weights for policy 0, policy_version 356338 (0.0033) [2024-04-27 10:42:12,549][52263] Updated weights for policy 0, policy_version 356348 (0.0029) [2024-04-27 10:42:14,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53206.3). Total num frames: 5838438400. Throughput: 0: 53372.4. Samples: 328908060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:42:14,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:42:16,214][52263] Updated weights for policy 0, policy_version 356358 (0.0030) [2024-04-27 10:42:18,659][52263] Updated weights for policy 0, policy_version 356368 (0.0028) [2024-04-27 10:42:19,106][52031] Fps is (10 sec: 57345.0, 60 sec: 53794.2, 300 sec: 53373.0). Total num frames: 5838733312. Throughput: 0: 53326.8. Samples: 329228720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:42:19,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 10:42:22,229][52263] Updated weights for policy 0, policy_version 356378 (0.0037) [2024-04-27 10:42:22,745][52242] Signal inference workers to stop experience collection... (4850 times) [2024-04-27 10:42:22,793][52263] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-04-27 10:42:22,808][52242] Signal inference workers to resume experience collection... (4850 times) [2024-04-27 10:42:22,814][52263] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-04-27 10:42:24,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 5838995456. Throughput: 0: 53354.1. Samples: 329550040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:42:24,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 10:42:24,746][52263] Updated weights for policy 0, policy_version 356388 (0.0024) [2024-04-27 10:42:28,362][52263] Updated weights for policy 0, policy_version 356398 (0.0027) [2024-04-27 10:42:29,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 5839290368. Throughput: 0: 54040.9. Samples: 329728160. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:42:29,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 10:42:30,867][52263] Updated weights for policy 0, policy_version 356408 (0.0028) [2024-04-27 10:42:34,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 5839503360. Throughput: 0: 54016.0. Samples: 330045840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-27 10:42:34,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 10:42:34,477][52263] Updated weights for policy 0, policy_version 356418 (0.0033) [2024-04-27 10:42:36,913][52263] Updated weights for policy 0, policy_version 356428 (0.0027) [2024-04-27 10:42:39,106][52031] Fps is (10 sec: 47514.2, 60 sec: 53521.0, 300 sec: 53206.3). Total num frames: 5839765504. Throughput: 0: 53791.1. Samples: 330362700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:42:39,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 10:42:40,555][52263] Updated weights for policy 0, policy_version 356438 (0.0027) [2024-04-27 10:42:43,087][52263] Updated weights for policy 0, policy_version 356448 (0.0027) [2024-04-27 10:42:44,106][52031] Fps is (10 sec: 55706.4, 60 sec: 54067.4, 300 sec: 53261.9). Total num frames: 5840060416. Throughput: 0: 53227.3. Samples: 330515220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:42:44,107][52031] Avg episode reward: [(0, '0.471')] [2024-04-27 10:42:46,775][52263] Updated weights for policy 0, policy_version 356458 (0.0041) [2024-04-27 10:42:49,106][52031] Fps is (10 sec: 58982.1, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 5840355328. Throughput: 0: 53147.2. Samples: 330830320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:42:49,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 10:42:49,870][52263] Updated weights for policy 0, policy_version 356468 (0.0031) [2024-04-27 10:42:52,843][52263] Updated weights for policy 0, policy_version 356478 (0.0034) [2024-04-27 10:42:54,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 5840601088. Throughput: 0: 53054.2. Samples: 331147460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:42:54,107][52031] Avg episode reward: [(0, '0.504')] [2024-04-27 10:42:55,841][52263] Updated weights for policy 0, policy_version 356488 (0.0031) [2024-04-27 10:42:59,048][52263] Updated weights for policy 0, policy_version 356498 (0.0028) [2024-04-27 10:42:59,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 5840863232. Throughput: 0: 53525.5. Samples: 331316700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:42:59,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:43:02,007][52263] Updated weights for policy 0, policy_version 356508 (0.0029) [2024-04-27 10:43:04,106][52031] Fps is (10 sec: 49152.9, 60 sec: 52975.1, 300 sec: 53261.9). Total num frames: 5841092608. Throughput: 0: 53432.9. Samples: 331633200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:04,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 10:43:05,018][52242] Signal inference workers to stop experience collection... (4900 times) [2024-04-27 10:43:05,018][52242] Signal inference workers to resume experience collection... 
(4900 times) [2024-04-27 10:43:05,048][52263] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-04-27 10:43:05,048][52263] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-04-27 10:43:05,140][52263] Updated weights for policy 0, policy_version 356518 (0.0028) [2024-04-27 10:43:08,161][52263] Updated weights for policy 0, policy_version 356528 (0.0032) [2024-04-27 10:43:09,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53521.1, 300 sec: 53206.4). Total num frames: 5841371136. Throughput: 0: 53357.8. Samples: 331951140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:09,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 10:43:11,442][52263] Updated weights for policy 0, policy_version 356538 (0.0029) [2024-04-27 10:43:14,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.3, 300 sec: 53373.0). Total num frames: 5841666048. Throughput: 0: 52885.5. Samples: 332108000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:14,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 10:43:14,134][52263] Updated weights for policy 0, policy_version 356548 (0.0026) [2024-04-27 10:43:17,532][52263] Updated weights for policy 0, policy_version 356558 (0.0029) [2024-04-27 10:43:19,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 5841928192. Throughput: 0: 52929.8. Samples: 332427680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:19,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 10:43:20,126][52263] Updated weights for policy 0, policy_version 356568 (0.0029) [2024-04-27 10:43:23,773][52263] Updated weights for policy 0, policy_version 356578 (0.0033) [2024-04-27 10:43:24,107][52031] Fps is (10 sec: 52427.6, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 5842190336. Throughput: 0: 53074.5. Samples: 332751060. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:24,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 10:43:26,202][52263] Updated weights for policy 0, policy_version 356588 (0.0034) [2024-04-27 10:43:29,106][52031] Fps is (10 sec: 49152.7, 60 sec: 52155.9, 300 sec: 53261.9). Total num frames: 5842419712. Throughput: 0: 53008.9. Samples: 332900620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:29,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 10:43:29,791][52263] Updated weights for policy 0, policy_version 356598 (0.0026) [2024-04-27 10:43:32,755][52263] Updated weights for policy 0, policy_version 356608 (0.0030) [2024-04-27 10:43:34,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52975.0, 300 sec: 53150.8). Total num frames: 5842681856. Throughput: 0: 53251.2. Samples: 333226620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:34,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 10:43:34,139][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000356610_5842698240.pth... [2024-04-27 10:43:34,185][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000355831_5829935104.pth [2024-04-27 10:43:36,009][52263] Updated weights for policy 0, policy_version 356618 (0.0035) [2024-04-27 10:43:39,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.1, 300 sec: 53206.3). Total num frames: 5842976768. Throughput: 0: 53341.1. Samples: 333547800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:39,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 10:43:39,348][52263] Updated weights for policy 0, policy_version 356628 (0.0029) [2024-04-27 10:43:42,108][52263] Updated weights for policy 0, policy_version 356638 (0.0026) [2024-04-27 10:43:44,106][52031] Fps is (10 sec: 58982.5, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 5843271680. Throughput: 0: 53253.3. Samples: 333713100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:44,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 10:43:45,445][52263] Updated weights for policy 0, policy_version 356648 (0.0029) [2024-04-27 10:43:48,239][52263] Updated weights for policy 0, policy_version 356658 (0.0029) [2024-04-27 10:43:49,106][52031] Fps is (10 sec: 55705.6, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 5843533824. Throughput: 0: 53359.6. Samples: 334034380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:49,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 10:43:51,594][52263] Updated weights for policy 0, policy_version 356668 (0.0028) [2024-04-27 10:43:54,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 5843795968. Throughput: 0: 53482.2. Samples: 334357840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 10:43:54,107][52031] Avg episode reward: [(0, '0.659')] [2024-04-27 10:43:54,288][52263] Updated weights for policy 0, policy_version 356678 (0.0040) [2024-04-27 10:43:55,692][52242] Signal inference workers to stop experience collection... (4950 times) [2024-04-27 10:43:55,692][52242] Signal inference workers to resume experience collection... (4950 times) [2024-04-27 10:43:55,703][52263] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-04-27 10:43:55,704][52263] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-04-27 10:43:57,757][52263] Updated weights for policy 0, policy_version 356688 (0.0034) [2024-04-27 10:43:59,106][52031] Fps is (10 sec: 47513.5, 60 sec: 52428.7, 300 sec: 53261.9). Total num frames: 5844008960. Throughput: 0: 53152.8. Samples: 334499880. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:43:59,107][52031] Avg episode reward: [(0, '0.670')] [2024-04-27 10:44:00,568][52263] Updated weights for policy 0, policy_version 356698 (0.0027) [2024-04-27 10:44:03,883][52263] Updated weights for policy 0, policy_version 356708 (0.0042) [2024-04-27 10:44:04,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.0, 300 sec: 53261.9). Total num frames: 5844303872. Throughput: 0: 53083.1. Samples: 334816420. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:04,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 10:44:06,740][52263] Updated weights for policy 0, policy_version 356718 (0.0029) [2024-04-27 10:44:09,106][52031] Fps is (10 sec: 58982.5, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 5844598784. Throughput: 0: 53044.7. Samples: 335138060. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:09,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 10:44:09,959][52263] Updated weights for policy 0, policy_version 356728 (0.0027) [2024-04-27 10:44:12,928][52263] Updated weights for policy 0, policy_version 356738 (0.0025) [2024-04-27 10:44:14,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 5844860928. Throughput: 0: 53607.3. Samples: 335312960. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:14,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 10:44:16,189][52263] Updated weights for policy 0, policy_version 356748 (0.0030) [2024-04-27 10:44:18,987][52263] Updated weights for policy 0, policy_version 356758 (0.0031) [2024-04-27 10:44:19,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 5845139456. Throughput: 0: 53442.1. Samples: 335631520. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:19,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 10:44:22,294][52263] Updated weights for policy 0, policy_version 356768 (0.0031) [2024-04-27 10:44:24,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52702.0, 300 sec: 53261.9). Total num frames: 5845352448. Throughput: 0: 53380.4. Samples: 335949920. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:24,107][52031] Avg episode reward: [(0, '0.681')] [2024-04-27 10:44:25,023][52263] Updated weights for policy 0, policy_version 356778 (0.0027) [2024-04-27 10:44:28,521][52263] Updated weights for policy 0, policy_version 356788 (0.0030) [2024-04-27 10:44:29,106][52031] Fps is (10 sec: 47514.0, 60 sec: 53248.0, 300 sec: 53150.8). Total num frames: 5845614592. Throughput: 0: 52954.2. Samples: 336096040. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:29,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 10:44:31,172][52263] Updated weights for policy 0, policy_version 356798 (0.0037) [2024-04-27 10:44:34,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.0, 300 sec: 53261.8). Total num frames: 5845909504. Throughput: 0: 52864.7. Samples: 336413300. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:34,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 10:44:35,203][52263] Updated weights for policy 0, policy_version 356808 (0.0027) [2024-04-27 10:44:37,421][52263] Updated weights for policy 0, policy_version 356818 (0.0026) [2024-04-27 10:44:39,107][52031] Fps is (10 sec: 58982.0, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 5846204416. Throughput: 0: 52784.5. Samples: 336733140. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:39,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 10:44:41,230][52263] Updated weights for policy 0, policy_version 356828 (0.0026) [2024-04-27 10:44:42,682][52242] Signal inference workers to stop experience collection... (5000 times) [2024-04-27 10:44:42,724][52263] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-04-27 10:44:42,783][52242] Signal inference workers to resume experience collection... (5000 times) [2024-04-27 10:44:42,784][52263] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-04-27 10:44:43,527][52263] Updated weights for policy 0, policy_version 356838 (0.0028) [2024-04-27 10:44:44,106][52031] Fps is (10 sec: 54068.3, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 5846450176. Throughput: 0: 53566.3. Samples: 336910360. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:44,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 10:44:47,415][52263] Updated weights for policy 0, policy_version 356848 (0.0031) [2024-04-27 10:44:49,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 5846712320. Throughput: 0: 53594.4. Samples: 337228160. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:49,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 10:44:49,637][52263] Updated weights for policy 0, policy_version 356858 (0.0030) [2024-04-27 10:44:53,385][52263] Updated weights for policy 0, policy_version 356868 (0.0028) [2024-04-27 10:44:54,107][52031] Fps is (10 sec: 49151.0, 60 sec: 52428.7, 300 sec: 53206.3). Total num frames: 5846941696. Throughput: 0: 53475.8. Samples: 337544480. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:54,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:44:55,848][52263] Updated weights for policy 0, policy_version 356878 (0.0032) [2024-04-27 10:44:59,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.1, 300 sec: 53206.4). Total num frames: 5847220224. Throughput: 0: 53028.3. Samples: 337699220. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:44:59,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 10:44:59,540][52263] Updated weights for policy 0, policy_version 356888 (0.0029) [2024-04-27 10:45:01,986][52263] Updated weights for policy 0, policy_version 356898 (0.0039) [2024-04-27 10:45:04,106][52031] Fps is (10 sec: 57344.7, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5847515136. Throughput: 0: 52994.3. Samples: 338016260. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:45:04,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 10:45:05,572][52263] Updated weights for policy 0, policy_version 356908 (0.0029) [2024-04-27 10:45:08,103][52263] Updated weights for policy 0, policy_version 356918 (0.0033) [2024-04-27 10:45:09,106][52031] Fps is (10 sec: 55705.6, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 5847777280. Throughput: 0: 52947.2. Samples: 338332540. Policy #0 lag: (min: 1.0, avg: 12.8, max: 21.0) [2024-04-27 10:45:09,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 10:45:11,753][52263] Updated weights for policy 0, policy_version 356928 (0.0028) [2024-04-27 10:45:14,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 5848055808. Throughput: 0: 53555.5. Samples: 338506040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:14,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 10:45:14,221][52263] Updated weights for policy 0, policy_version 356938 (0.0037) [2024-04-27 10:45:17,815][52263] Updated weights for policy 0, policy_version 356948 (0.0027) [2024-04-27 10:45:19,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52428.9, 300 sec: 53261.9). Total num frames: 5848285184. Throughput: 0: 53512.2. Samples: 338821340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:19,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 10:45:20,464][52263] Updated weights for policy 0, policy_version 356958 (0.0036) [2024-04-27 10:45:23,976][52263] Updated weights for policy 0, policy_version 356968 (0.0031) [2024-04-27 10:45:24,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 5848563712. Throughput: 0: 53545.4. Samples: 339142680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:24,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 10:45:26,569][52263] Updated weights for policy 0, policy_version 356978 (0.0030) [2024-04-27 10:45:29,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53206.4). Total num frames: 5848825856. Throughput: 0: 52905.8. Samples: 339291120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:29,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 10:45:30,156][52263] Updated weights for policy 0, policy_version 356988 (0.0030) [2024-04-27 10:45:32,668][52263] Updated weights for policy 0, policy_version 356998 (0.0029) [2024-04-27 10:45:33,384][52242] Signal inference workers to stop experience collection... (5050 times) [2024-04-27 10:45:33,388][52242] Signal inference workers to resume experience collection... 
(5050 times) [2024-04-27 10:45:33,411][52263] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-04-27 10:45:33,411][52263] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-04-27 10:45:34,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 5849120768. Throughput: 0: 53028.4. Samples: 339614440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:34,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 10:45:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000357002_5849120768.pth... [2024-04-27 10:45:34,175][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000356222_5836341248.pth [2024-04-27 10:45:36,384][52263] Updated weights for policy 0, policy_version 357008 (0.0033) [2024-04-27 10:45:38,858][52263] Updated weights for policy 0, policy_version 357018 (0.0030) [2024-04-27 10:45:39,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 5849399296. Throughput: 0: 53099.3. Samples: 339933940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:39,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 10:45:42,519][52263] Updated weights for policy 0, policy_version 357028 (0.0030) [2024-04-27 10:45:44,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5849645056. Throughput: 0: 53309.3. Samples: 340098140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:44,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 10:45:45,070][52263] Updated weights for policy 0, policy_version 357038 (0.0028) [2024-04-27 10:45:48,608][52263] Updated weights for policy 0, policy_version 357048 (0.0026) [2024-04-27 10:45:49,107][52031] Fps is (10 sec: 49151.4, 60 sec: 52974.8, 300 sec: 53261.9). Total num frames: 5849890816. Throughput: 0: 53317.3. Samples: 340415540. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:49,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 10:45:51,258][52263] Updated weights for policy 0, policy_version 357058 (0.0028) [2024-04-27 10:45:54,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53521.1, 300 sec: 53206.3). Total num frames: 5850152960. Throughput: 0: 53416.3. Samples: 340736280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:54,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 10:45:54,866][52263] Updated weights for policy 0, policy_version 357068 (0.0033) [2024-04-27 10:45:57,398][52263] Updated weights for policy 0, policy_version 357078 (0.0034) [2024-04-27 10:45:59,107][52031] Fps is (10 sec: 54062.7, 60 sec: 53520.2, 300 sec: 53206.2). Total num frames: 5850431488. Throughput: 0: 52970.6. Samples: 340889760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:45:59,108][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 10:46:01,126][52263] Updated weights for policy 0, policy_version 357088 (0.0027) [2024-04-27 10:46:03,412][52263] Updated weights for policy 0, policy_version 357098 (0.0028) [2024-04-27 10:46:04,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5850710016. Throughput: 0: 53007.5. Samples: 341206680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:46:04,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 10:46:07,248][52263] Updated weights for policy 0, policy_version 357108 (0.0029) [2024-04-27 10:46:09,107][52031] Fps is (10 sec: 55710.3, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5850988544. Throughput: 0: 52982.6. Samples: 341526900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:46:09,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 10:46:09,456][52263] Updated weights for policy 0, policy_version 357118 (0.0032) [2024-04-27 10:46:13,379][52263] Updated weights for policy 0, policy_version 357128 (0.0027) [2024-04-27 10:46:14,106][52031] Fps is (10 sec: 52428.6, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 5851234304. Throughput: 0: 53204.4. Samples: 341685320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:46:14,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:46:15,831][52263] Updated weights for policy 0, policy_version 357138 (0.0032) [2024-04-27 10:46:19,107][52031] Fps is (10 sec: 49151.3, 60 sec: 53247.8, 300 sec: 53206.3). Total num frames: 5851480064. Throughput: 0: 53148.2. Samples: 342006120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:46:19,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:46:19,408][52263] Updated weights for policy 0, policy_version 357148 (0.0026) [2024-04-27 10:46:21,835][52263] Updated weights for policy 0, policy_version 357158 (0.0028) [2024-04-27 10:46:24,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53206.3). Total num frames: 5851758592. Throughput: 0: 53187.9. Samples: 342327400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:46:24,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 10:46:25,468][52263] Updated weights for policy 0, policy_version 357168 (0.0033) [2024-04-27 10:46:28,242][52263] Updated weights for policy 0, policy_version 357178 (0.0031) [2024-04-27 10:46:29,106][52031] Fps is (10 sec: 55707.0, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 5852037120. Throughput: 0: 53108.5. Samples: 342488020. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-27 10:46:29,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 10:46:31,665][52263] Updated weights for policy 0, policy_version 357188 (0.0031) [2024-04-27 10:46:33,830][52242] Signal inference workers to stop experience collection... (5100 times) [2024-04-27 10:46:33,868][52263] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-04-27 10:46:33,897][52242] Signal inference workers to resume experience collection... (5100 times) [2024-04-27 10:46:33,902][52263] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-04-27 10:46:34,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 5852315648. Throughput: 0: 53090.7. Samples: 342804620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:46:34,116][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 10:46:34,733][52263] Updated weights for policy 0, policy_version 357198 (0.0028) [2024-04-27 10:46:37,832][52263] Updated weights for policy 0, policy_version 357208 (0.0026) [2024-04-27 10:46:39,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52974.8, 300 sec: 53428.5). Total num frames: 5852577792. Throughput: 0: 53125.8. Samples: 343126940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:46:39,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 10:46:40,777][52263] Updated weights for policy 0, policy_version 357218 (0.0029) [2024-04-27 10:46:44,053][52263] Updated weights for policy 0, policy_version 357228 (0.0037) [2024-04-27 10:46:44,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52974.8, 300 sec: 53206.3). Total num frames: 5852823552. Throughput: 0: 53105.4. Samples: 343279460. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:46:44,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 10:46:46,795][52263] Updated weights for policy 0, policy_version 357238 (0.0034) [2024-04-27 10:46:49,106][52031] Fps is (10 sec: 49152.7, 60 sec: 52975.0, 300 sec: 53150.8). Total num frames: 5853069312. Throughput: 0: 53238.3. Samples: 343602400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:46:49,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 10:46:50,337][52263] Updated weights for policy 0, policy_version 357248 (0.0031) [2024-04-27 10:46:52,999][52263] Updated weights for policy 0, policy_version 357258 (0.0024) [2024-04-27 10:46:54,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53206.3). Total num frames: 5853364224. Throughput: 0: 53130.1. Samples: 343917760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:46:54,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 10:46:56,611][52263] Updated weights for policy 0, policy_version 357268 (0.0029) [2024-04-27 10:46:59,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53248.7, 300 sec: 53261.9). Total num frames: 5853626368. Throughput: 0: 53375.8. Samples: 344087240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:46:59,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 10:46:59,250][52263] Updated weights for policy 0, policy_version 357278 (0.0026) [2024-04-27 10:47:02,698][52263] Updated weights for policy 0, policy_version 357288 (0.0026) [2024-04-27 10:47:04,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5853904896. Throughput: 0: 53368.7. Samples: 344407700. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:47:04,107][52031] Avg episode reward: [(0, '0.476')] [2024-04-27 10:47:05,228][52263] Updated weights for policy 0, policy_version 357298 (0.0026) [2024-04-27 10:47:08,698][52263] Updated weights for policy 0, policy_version 357308 (0.0025) [2024-04-27 10:47:09,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52701.9, 300 sec: 53261.9). Total num frames: 5854150656. Throughput: 0: 53341.4. Samples: 344727760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:47:09,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 10:47:11,398][52263] Updated weights for policy 0, policy_version 357318 (0.0030) [2024-04-27 10:47:14,106][52031] Fps is (10 sec: 49151.7, 60 sec: 52701.9, 300 sec: 53095.3). Total num frames: 5854396416. Throughput: 0: 53108.8. Samples: 344877920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:47:14,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 10:47:14,817][52263] Updated weights for policy 0, policy_version 357328 (0.0037) [2024-04-27 10:47:17,818][52263] Updated weights for policy 0, policy_version 357338 (0.0025) [2024-04-27 10:47:19,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.2, 300 sec: 53150.8). Total num frames: 5854674944. Throughput: 0: 53199.6. Samples: 345198600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:47:19,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 10:47:21,005][52263] Updated weights for policy 0, policy_version 357348 (0.0029) [2024-04-27 10:47:23,943][52263] Updated weights for policy 0, policy_version 357358 (0.0027) [2024-04-27 10:47:24,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 53095.3). Total num frames: 5854953472. Throughput: 0: 53120.5. Samples: 345517360. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:47:24,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 10:47:27,019][52263] Updated weights for policy 0, policy_version 357368 (0.0029) [2024-04-27 10:47:29,106][52031] Fps is (10 sec: 57343.6, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 5855248384. Throughput: 0: 53442.3. Samples: 345684360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:47:29,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 10:47:30,017][52263] Updated weights for policy 0, policy_version 357378 (0.0038) [2024-04-27 10:47:33,247][52263] Updated weights for policy 0, policy_version 357388 (0.0029) [2024-04-27 10:47:34,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 5855510528. Throughput: 0: 53459.6. Samples: 346008080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:47:34,107][52031] Avg episode reward: [(0, '0.668')] [2024-04-27 10:47:34,121][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000357392_5855510528.pth... [2024-04-27 10:47:34,182][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000356610_5842698240.pth [2024-04-27 10:47:36,232][52263] Updated weights for policy 0, policy_version 357398 (0.0033) [2024-04-27 10:47:39,107][52031] Fps is (10 sec: 49151.3, 60 sec: 52701.8, 300 sec: 53150.8). Total num frames: 5855739904. Throughput: 0: 53477.3. Samples: 346324240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:47:39,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 10:47:39,621][52263] Updated weights for policy 0, policy_version 357408 (0.0035) [2024-04-27 10:47:42,275][52263] Updated weights for policy 0, policy_version 357418 (0.0028) [2024-04-27 10:47:44,107][52031] Fps is (10 sec: 49151.0, 60 sec: 52974.9, 300 sec: 53039.7). Total num frames: 5856002048. Throughput: 0: 52891.2. Samples: 346467340. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 10:47:44,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 10:47:45,767][52263] Updated weights for policy 0, policy_version 357428 (0.0035) [2024-04-27 10:47:46,741][52242] Signal inference workers to stop experience collection... (5150 times) [2024-04-27 10:47:46,788][52263] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-04-27 10:47:46,799][52242] Signal inference workers to resume experience collection... (5150 times) [2024-04-27 10:47:46,807][52263] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-04-27 10:47:48,401][52263] Updated weights for policy 0, policy_version 357438 (0.0028) [2024-04-27 10:47:49,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53521.0, 300 sec: 53150.8). Total num frames: 5856280576. Throughput: 0: 52905.3. Samples: 346788440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:47:49,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 10:47:51,970][52263] Updated weights for policy 0, policy_version 357448 (0.0033) [2024-04-27 10:47:54,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53248.1, 300 sec: 53206.3). Total num frames: 5856559104. Throughput: 0: 52891.9. Samples: 347107900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:47:54,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 10:47:54,718][52263] Updated weights for policy 0, policy_version 357458 (0.0030) [2024-04-27 10:47:58,002][52263] Updated weights for policy 0, policy_version 357468 (0.0032) [2024-04-27 10:47:59,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53521.1, 300 sec: 53372.9). Total num frames: 5856837632. Throughput: 0: 53375.0. Samples: 347279800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:47:59,108][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 10:48:01,031][52263] Updated weights for policy 0, policy_version 357478 (0.0032) [2024-04-27 10:48:04,082][52263] Updated weights for policy 0, policy_version 357488 (0.0030) [2024-04-27 10:48:04,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.8, 300 sec: 53261.9). Total num frames: 5857083392. Throughput: 0: 53305.6. Samples: 347597360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:04,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 10:48:07,285][52263] Updated weights for policy 0, policy_version 357498 (0.0033) [2024-04-27 10:48:09,106][52031] Fps is (10 sec: 49153.0, 60 sec: 52975.0, 300 sec: 53095.3). Total num frames: 5857329152. Throughput: 0: 53336.6. Samples: 347917500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:09,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 10:48:10,173][52263] Updated weights for policy 0, policy_version 357508 (0.0038) [2024-04-27 10:48:13,287][52263] Updated weights for policy 0, policy_version 357518 (0.0035) [2024-04-27 10:48:14,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53247.9, 300 sec: 53095.3). Total num frames: 5857591296. Throughput: 0: 52887.9. Samples: 348064320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:14,107][52031] Avg episode reward: [(0, '0.495')] [2024-04-27 10:48:16,403][52263] Updated weights for policy 0, policy_version 357528 (0.0026) [2024-04-27 10:48:19,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.0, 300 sec: 53206.4). Total num frames: 5857886208. Throughput: 0: 52809.2. Samples: 348384500. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:19,108][52031] Avg episode reward: [(0, '0.498')] [2024-04-27 10:48:19,354][52263] Updated weights for policy 0, policy_version 357538 (0.0030) [2024-04-27 10:48:22,426][52263] Updated weights for policy 0, policy_version 357548 (0.0034) [2024-04-27 10:48:24,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 5858164736. Throughput: 0: 52903.5. Samples: 348704900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:24,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 10:48:25,705][52263] Updated weights for policy 0, policy_version 357558 (0.0026) [2024-04-27 10:48:28,725][52263] Updated weights for policy 0, policy_version 357568 (0.0030) [2024-04-27 10:48:29,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52701.8, 300 sec: 53317.4). Total num frames: 5858410496. Throughput: 0: 53380.5. Samples: 348869460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:29,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 10:48:31,768][52263] Updated weights for policy 0, policy_version 357578 (0.0029) [2024-04-27 10:48:34,106][52031] Fps is (10 sec: 49153.0, 60 sec: 52428.7, 300 sec: 53150.8). Total num frames: 5858656256. Throughput: 0: 53275.6. Samples: 349185840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:34,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 10:48:34,880][52263] Updated weights for policy 0, policy_version 357588 (0.0032) [2024-04-27 10:48:37,864][52263] Updated weights for policy 0, policy_version 357598 (0.0028) [2024-04-27 10:48:39,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.2, 300 sec: 53095.3). Total num frames: 5858934784. Throughput: 0: 53329.9. Samples: 349507740. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:39,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 10:48:40,902][52263] Updated weights for policy 0, policy_version 357608 (0.0025) [2024-04-27 10:48:44,031][52263] Updated weights for policy 0, policy_version 357618 (0.0033) [2024-04-27 10:48:44,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.1, 300 sec: 53150.8). Total num frames: 5859213312. Throughput: 0: 53085.8. Samples: 349668660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:44,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 10:48:47,010][52263] Updated weights for policy 0, policy_version 357628 (0.0033) [2024-04-27 10:48:49,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53247.8, 300 sec: 53150.8). Total num frames: 5859475456. Throughput: 0: 53013.3. Samples: 349982960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:49,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 10:48:50,165][52263] Updated weights for policy 0, policy_version 357638 (0.0026) [2024-04-27 10:48:53,171][52263] Updated weights for policy 0, policy_version 357648 (0.0031) [2024-04-27 10:48:54,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 5859753984. Throughput: 0: 52909.6. Samples: 350298440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:54,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:48:56,420][52263] Updated weights for policy 0, policy_version 357658 (0.0028) [2024-04-27 10:48:59,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52702.0, 300 sec: 53206.4). Total num frames: 5859999744. Throughput: 0: 53279.7. Samples: 350461900. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:48:59,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 10:48:59,283][52263] Updated weights for policy 0, policy_version 357668 (0.0030) [2024-04-27 10:49:02,544][52263] Updated weights for policy 0, policy_version 357678 (0.0027) [2024-04-27 10:49:04,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53150.8). Total num frames: 5860278272. Throughput: 0: 53392.3. Samples: 350787160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 10:49:04,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 10:49:05,328][52263] Updated weights for policy 0, policy_version 357688 (0.0029) [2024-04-27 10:49:06,276][52242] Signal inference workers to stop experience collection... (5200 times) [2024-04-27 10:49:06,318][52263] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-04-27 10:49:06,329][52242] Signal inference workers to resume experience collection... (5200 times) [2024-04-27 10:49:06,337][52263] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-04-27 10:49:08,629][52263] Updated weights for policy 0, policy_version 357698 (0.0027) [2024-04-27 10:49:09,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53520.9, 300 sec: 53150.8). Total num frames: 5860540416. Throughput: 0: 53361.0. Samples: 351106140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:09,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 10:49:11,597][52263] Updated weights for policy 0, policy_version 357708 (0.0032) [2024-04-27 10:49:14,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.1, 300 sec: 53095.3). Total num frames: 5860802560. Throughput: 0: 53223.6. Samples: 351264520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:14,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 10:49:14,865][52263] Updated weights for policy 0, policy_version 357718 (0.0028) [2024-04-27 10:49:17,750][52263] Updated weights for policy 0, policy_version 357728 (0.0028) [2024-04-27 10:49:19,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 5861081088. Throughput: 0: 53209.6. Samples: 351580280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:19,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 10:49:20,959][52263] Updated weights for policy 0, policy_version 357738 (0.0032) [2024-04-27 10:49:23,824][52263] Updated weights for policy 0, policy_version 357748 (0.0033) [2024-04-27 10:49:24,107][52031] Fps is (10 sec: 54066.8, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 5861343232. Throughput: 0: 53306.5. Samples: 351906540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:24,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:49:27,060][52263] Updated weights for policy 0, policy_version 357758 (0.0032) [2024-04-27 10:49:29,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.0, 300 sec: 53206.4). Total num frames: 5861605376. Throughput: 0: 53265.0. Samples: 352065580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:29,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 10:49:30,002][52263] Updated weights for policy 0, policy_version 357768 (0.0034) [2024-04-27 10:49:33,194][52263] Updated weights for policy 0, policy_version 357778 (0.0031) [2024-04-27 10:49:34,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 53095.3). Total num frames: 5861867520. Throughput: 0: 53339.8. Samples: 352383240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:34,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 10:49:34,184][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000357781_5861883904.pth... [2024-04-27 10:49:34,226][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000357002_5849120768.pth [2024-04-27 10:49:36,273][52263] Updated weights for policy 0, policy_version 357788 (0.0027) [2024-04-27 10:49:39,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53247.9, 300 sec: 53150.8). Total num frames: 5862129664. Throughput: 0: 53373.3. Samples: 352700240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:39,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 10:49:39,385][52263] Updated weights for policy 0, policy_version 357798 (0.0029) [2024-04-27 10:49:42,286][52263] Updated weights for policy 0, policy_version 357808 (0.0031) [2024-04-27 10:49:44,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53248.0, 300 sec: 53206.3). Total num frames: 5862408192. Throughput: 0: 53372.3. Samples: 352863660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:44,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 10:49:45,701][52263] Updated weights for policy 0, policy_version 357818 (0.0039) [2024-04-27 10:49:48,315][52263] Updated weights for policy 0, policy_version 357828 (0.0031) [2024-04-27 10:49:49,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 5862686720. Throughput: 0: 53201.8. Samples: 353181240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:49,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 10:49:51,724][52263] Updated weights for policy 0, policy_version 357838 (0.0032) [2024-04-27 10:49:54,107][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5862948864. Throughput: 0: 53387.3. Samples: 353508560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:54,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 10:49:54,558][52263] Updated weights for policy 0, policy_version 357848 (0.0029) [2024-04-27 10:49:57,838][52263] Updated weights for policy 0, policy_version 357858 (0.0036) [2024-04-27 10:49:59,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53248.1, 300 sec: 53150.8). Total num frames: 5863194624. Throughput: 0: 53290.8. Samples: 353662600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:49:59,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 10:50:00,793][52263] Updated weights for policy 0, policy_version 357868 (0.0035) [2024-04-27 10:50:03,930][52263] Updated weights for policy 0, policy_version 357878 (0.0028) [2024-04-27 10:50:04,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.1, 300 sec: 53206.3). Total num frames: 5863473152. Throughput: 0: 53287.7. Samples: 353978220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:50:04,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 10:50:07,069][52263] Updated weights for policy 0, policy_version 357888 (0.0030) [2024-04-27 10:50:09,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53248.0, 300 sec: 53150.8). Total num frames: 5863735296. Throughput: 0: 53111.6. Samples: 354296560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:50:09,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 10:50:10,137][52263] Updated weights for policy 0, policy_version 357898 (0.0026) [2024-04-27 10:50:13,082][52263] Updated weights for policy 0, policy_version 357908 (0.0034) [2024-04-27 10:50:14,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 5863997440. Throughput: 0: 53133.2. Samples: 354456580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:50:14,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 10:50:16,347][52263] Updated weights for policy 0, policy_version 357918 (0.0037) [2024-04-27 10:50:19,065][52263] Updated weights for policy 0, policy_version 357928 (0.0029) [2024-04-27 10:50:19,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5864292352. Throughput: 0: 53279.5. Samples: 354780820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 10:50:19,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 10:50:22,429][52263] Updated weights for policy 0, policy_version 357938 (0.0031) [2024-04-27 10:50:24,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5864554496. Throughput: 0: 53394.7. Samples: 355103000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:50:24,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 10:50:25,306][52263] Updated weights for policy 0, policy_version 357948 (0.0030) [2024-04-27 10:50:28,422][52242] Signal inference workers to stop experience collection... (5250 times) [2024-04-27 10:50:28,458][52263] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-04-27 10:50:28,490][52242] Signal inference workers to resume experience collection... (5250 times) [2024-04-27 10:50:28,491][52263] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-04-27 10:50:28,616][52263] Updated weights for policy 0, policy_version 357958 (0.0028) [2024-04-27 10:50:29,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.2, 300 sec: 53206.4). Total num frames: 5864816640. Throughput: 0: 53337.1. Samples: 355263820. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:50:29,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 10:50:31,490][52263] Updated weights for policy 0, policy_version 357968 (0.0029) [2024-04-27 10:50:34,106][52031] Fps is (10 sec: 49152.5, 60 sec: 52975.0, 300 sec: 53039.7). Total num frames: 5865046016. Throughput: 0: 53301.9. Samples: 355579820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:50:34,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 10:50:34,873][52263] Updated weights for policy 0, policy_version 357978 (0.0030) [2024-04-27 10:50:37,630][52263] Updated weights for policy 0, policy_version 357988 (0.0030) [2024-04-27 10:50:39,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.2, 300 sec: 53261.9). Total num frames: 5865357312. Throughput: 0: 53161.7. Samples: 355900840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:50:39,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 10:50:40,998][52263] Updated weights for policy 0, policy_version 357998 (0.0039) [2024-04-27 10:50:43,807][52263] Updated weights for policy 0, policy_version 358008 (0.0035) [2024-04-27 10:50:44,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 5865603072. Throughput: 0: 53278.5. Samples: 356060140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:50:44,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 10:50:47,002][52263] Updated weights for policy 0, policy_version 358018 (0.0034) [2024-04-27 10:50:49,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 5865865216. Throughput: 0: 53426.6. Samples: 356382420. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:50:49,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 10:50:49,927][52263] Updated weights for policy 0, policy_version 358028 (0.0026) [2024-04-27 10:50:53,009][52263] Updated weights for policy 0, policy_version 358038 (0.0028) [2024-04-27 10:50:54,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53247.9, 300 sec: 53262.0). Total num frames: 5866143744. Throughput: 0: 53579.6. Samples: 356707640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:50:54,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 10:50:56,196][52263] Updated weights for policy 0, policy_version 358048 (0.0023) [2024-04-27 10:50:59,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53206.4). Total num frames: 5866405888. Throughput: 0: 53653.1. Samples: 356870960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:50:59,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 10:50:59,216][52263] Updated weights for policy 0, policy_version 358058 (0.0023) [2024-04-27 10:51:02,196][52263] Updated weights for policy 0, policy_version 358068 (0.0031) [2024-04-27 10:51:04,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53206.4). Total num frames: 5866684416. Throughput: 0: 53551.2. Samples: 357190620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:51:04,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 10:51:05,358][52263] Updated weights for policy 0, policy_version 358078 (0.0028) [2024-04-27 10:51:08,224][52263] Updated weights for policy 0, policy_version 358088 (0.0032) [2024-04-27 10:51:09,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 5866946560. Throughput: 0: 53451.6. Samples: 357508320. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:51:09,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 10:51:11,383][52263] Updated weights for policy 0, policy_version 358098 (0.0035) [2024-04-27 10:51:14,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53317.5). Total num frames: 5867208704. Throughput: 0: 53447.4. Samples: 357668960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:51:14,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 10:51:14,374][52263] Updated weights for policy 0, policy_version 358108 (0.0030) [2024-04-27 10:51:17,512][52263] Updated weights for policy 0, policy_version 358118 (0.0033) [2024-04-27 10:51:19,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52701.9, 300 sec: 53206.3). Total num frames: 5867454464. Throughput: 0: 53526.1. Samples: 357988500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:51:19,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 10:51:20,385][52263] Updated weights for policy 0, policy_version 358128 (0.0032) [2024-04-27 10:51:23,749][52263] Updated weights for policy 0, policy_version 358138 (0.0032) [2024-04-27 10:51:24,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 5867749376. Throughput: 0: 53616.4. Samples: 358313580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:51:24,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 10:51:26,752][52263] Updated weights for policy 0, policy_version 358148 (0.0033) [2024-04-27 10:51:29,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53248.0, 300 sec: 53206.4). Total num frames: 5868011520. Throughput: 0: 53510.4. Samples: 358468100. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:51:29,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:51:29,738][52263] Updated weights for policy 0, policy_version 358158 (0.0032) [2024-04-27 10:51:32,761][52263] Updated weights for policy 0, policy_version 358168 (0.0029) [2024-04-27 10:51:34,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53794.1, 300 sec: 53206.4). Total num frames: 5868273664. Throughput: 0: 53627.6. Samples: 358795660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:51:34,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 10:51:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000358171_5868273664.pth... [2024-04-27 10:51:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000357392_5855510528.pth [2024-04-27 10:51:35,862][52263] Updated weights for policy 0, policy_version 358178 (0.0029) [2024-04-27 10:51:38,336][52242] Signal inference workers to stop experience collection... (5300 times) [2024-04-27 10:51:38,337][52242] Signal inference workers to resume experience collection... (5300 times) [2024-04-27 10:51:38,355][52263] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-04-27 10:51:38,355][52263] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-04-27 10:51:38,907][52263] Updated weights for policy 0, policy_version 358188 (0.0027) [2024-04-27 10:51:39,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 5868568576. Throughput: 0: 53472.8. Samples: 359113920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 10:51:39,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 10:51:41,933][52263] Updated weights for policy 0, policy_version 358198 (0.0029) [2024-04-27 10:51:44,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5868814336. Throughput: 0: 53412.8. Samples: 359274540. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:51:44,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 10:51:45,023][52263] Updated weights for policy 0, policy_version 358208 (0.0030) [2024-04-27 10:51:47,958][52263] Updated weights for policy 0, policy_version 358218 (0.0034) [2024-04-27 10:51:49,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53521.2, 300 sec: 53261.9). Total num frames: 5869076480. Throughput: 0: 53564.5. Samples: 359601020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:51:49,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 10:51:51,089][52263] Updated weights for policy 0, policy_version 358228 (0.0031) [2024-04-27 10:51:54,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53317.5). Total num frames: 5869355008. Throughput: 0: 53645.4. Samples: 359922360. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:51:54,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 10:51:54,136][52263] Updated weights for policy 0, policy_version 358238 (0.0024) [2024-04-27 10:51:57,142][52263] Updated weights for policy 0, policy_version 358248 (0.0033) [2024-04-27 10:51:59,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53793.9, 300 sec: 53317.4). Total num frames: 5869633536. Throughput: 0: 53631.4. Samples: 360082380. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:51:59,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 10:52:00,183][52263] Updated weights for policy 0, policy_version 358258 (0.0028) [2024-04-27 10:52:03,067][52263] Updated weights for policy 0, policy_version 358268 (0.0026) [2024-04-27 10:52:04,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5869879296. Throughput: 0: 53757.9. Samples: 360407600. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:04,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 10:52:06,371][52263] Updated weights for policy 0, policy_version 358278 (0.0035) [2024-04-27 10:52:09,106][52031] Fps is (10 sec: 54068.4, 60 sec: 53794.3, 300 sec: 53484.1). Total num frames: 5870174208. Throughput: 0: 53613.0. Samples: 360726160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:09,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 10:52:09,257][52263] Updated weights for policy 0, policy_version 358288 (0.0031) [2024-04-27 10:52:12,635][52263] Updated weights for policy 0, policy_version 358298 (0.0029) [2024-04-27 10:52:14,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5870419968. Throughput: 0: 53757.7. Samples: 360887200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:14,107][52031] Avg episode reward: [(0, '0.655')] [2024-04-27 10:52:15,540][52263] Updated weights for policy 0, policy_version 358308 (0.0030) [2024-04-27 10:52:18,588][52263] Updated weights for policy 0, policy_version 358318 (0.0026) [2024-04-27 10:52:19,106][52031] Fps is (10 sec: 52428.6, 60 sec: 54067.3, 300 sec: 53373.0). Total num frames: 5870698496. Throughput: 0: 53623.6. Samples: 361208720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:19,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 10:52:21,501][52263] Updated weights for policy 0, policy_version 358328 (0.0033) [2024-04-27 10:52:24,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.1, 300 sec: 53317.4). Total num frames: 5870977024. Throughput: 0: 53645.8. Samples: 361527980. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:24,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 10:52:24,583][52263] Updated weights for policy 0, policy_version 358338 (0.0028) [2024-04-27 10:52:27,453][52263] Updated weights for policy 0, policy_version 358348 (0.0032) [2024-04-27 10:52:29,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 5871222784. Throughput: 0: 53716.1. Samples: 361691760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:29,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 10:52:30,678][52263] Updated weights for policy 0, policy_version 358358 (0.0032) [2024-04-27 10:52:33,852][52263] Updated weights for policy 0, policy_version 358368 (0.0027) [2024-04-27 10:52:34,107][52031] Fps is (10 sec: 54066.9, 60 sec: 54067.1, 300 sec: 53484.0). Total num frames: 5871517696. Throughput: 0: 53651.3. Samples: 362015340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:34,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:52:36,933][52263] Updated weights for policy 0, policy_version 358378 (0.0030) [2024-04-27 10:52:39,106][52031] Fps is (10 sec: 54066.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5871763456. Throughput: 0: 53714.6. Samples: 362339520. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:39,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 10:52:39,973][52263] Updated weights for policy 0, policy_version 358388 (0.0027) [2024-04-27 10:52:43,050][52263] Updated weights for policy 0, policy_version 358398 (0.0038) [2024-04-27 10:52:44,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5872025600. Throughput: 0: 53562.4. Samples: 362492680. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:44,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:52:45,960][52263] Updated weights for policy 0, policy_version 358408 (0.0028) [2024-04-27 10:52:49,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.0, 300 sec: 53373.0). Total num frames: 5872304128. Throughput: 0: 53448.7. Samples: 362812800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:49,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 10:52:49,253][52263] Updated weights for policy 0, policy_version 358418 (0.0029) [2024-04-27 10:52:50,639][52242] Signal inference workers to stop experience collection... (5350 times) [2024-04-27 10:52:50,639][52242] Signal inference workers to resume experience collection... (5350 times) [2024-04-27 10:52:50,660][52263] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-04-27 10:52:50,661][52263] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-04-27 10:52:51,988][52263] Updated weights for policy 0, policy_version 358428 (0.0031) [2024-04-27 10:52:54,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.0, 300 sec: 53373.0). Total num frames: 5872582656. Throughput: 0: 53597.1. Samples: 363138040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 10:52:54,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 10:52:55,172][52263] Updated weights for policy 0, policy_version 358438 (0.0027) [2024-04-27 10:52:58,094][52263] Updated weights for policy 0, policy_version 358448 (0.0029) [2024-04-27 10:52:59,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 5872828416. Throughput: 0: 53653.4. Samples: 363301600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:52:59,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 10:53:01,220][52263] Updated weights for policy 0, policy_version 358458 (0.0027) [2024-04-27 10:53:04,106][52031] Fps is (10 sec: 54068.3, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 5873123328. Throughput: 0: 53664.9. Samples: 363623640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:04,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 10:53:04,272][52263] Updated weights for policy 0, policy_version 358468 (0.0029) [2024-04-27 10:53:07,394][52263] Updated weights for policy 0, policy_version 358478 (0.0031) [2024-04-27 10:53:09,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53247.9, 300 sec: 53484.1). Total num frames: 5873369088. Throughput: 0: 53661.9. Samples: 363942760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:09,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 10:53:10,292][52263] Updated weights for policy 0, policy_version 358488 (0.0038) [2024-04-27 10:53:13,634][52263] Updated weights for policy 0, policy_version 358498 (0.0034) [2024-04-27 10:53:14,106][52031] Fps is (10 sec: 50790.1, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5873631232. Throughput: 0: 53547.5. Samples: 364101400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:14,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 10:53:16,289][52263] Updated weights for policy 0, policy_version 358508 (0.0032) [2024-04-27 10:53:19,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 5873909760. Throughput: 0: 53529.0. Samples: 364424140. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:19,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 10:53:19,891][52263] Updated weights for policy 0, policy_version 358518 (0.0030) [2024-04-27 10:53:22,961][52263] Updated weights for policy 0, policy_version 358528 (0.0034) [2024-04-27 10:53:24,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 5874188288. Throughput: 0: 53383.2. Samples: 364741760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:24,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 10:53:26,182][52263] Updated weights for policy 0, policy_version 358538 (0.0035) [2024-04-27 10:53:28,947][52263] Updated weights for policy 0, policy_version 358548 (0.0029) [2024-04-27 10:53:29,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5874450432. Throughput: 0: 53735.6. Samples: 364910780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:29,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 10:53:32,213][52263] Updated weights for policy 0, policy_version 358558 (0.0028) [2024-04-27 10:53:34,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 5874712576. Throughput: 0: 53756.4. Samples: 365231840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:34,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 10:53:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000358564_5874712576.pth... [2024-04-27 10:53:34,173][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000357781_5861883904.pth [2024-04-27 10:53:35,121][52263] Updated weights for policy 0, policy_version 358568 (0.0029) [2024-04-27 10:53:38,392][52263] Updated weights for policy 0, policy_version 358578 (0.0027) [2024-04-27 10:53:39,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.1, 300 sec: 53428.5). 
Total num frames: 5874974720. Throughput: 0: 53617.9. Samples: 365550840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:39,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 10:53:41,354][52263] Updated weights for policy 0, policy_version 358588 (0.0037) [2024-04-27 10:53:44,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 5875253248. Throughput: 0: 53471.3. Samples: 365707820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:44,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 10:53:44,360][52263] Updated weights for policy 0, policy_version 358598 (0.0029) [2024-04-27 10:53:47,412][52263] Updated weights for policy 0, policy_version 358608 (0.0027) [2024-04-27 10:53:49,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 5875531776. Throughput: 0: 53439.4. Samples: 366028420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:49,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 10:53:50,624][52263] Updated weights for policy 0, policy_version 358618 (0.0024) [2024-04-27 10:53:53,395][52263] Updated weights for policy 0, policy_version 358628 (0.0029) [2024-04-27 10:53:54,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 5875793920. Throughput: 0: 53445.7. Samples: 366347820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:54,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 10:53:56,674][52263] Updated weights for policy 0, policy_version 358638 (0.0026) [2024-04-27 10:53:59,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53794.0, 300 sec: 53484.1). Total num frames: 5876056064. Throughput: 0: 53561.7. Samples: 366511680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:53:59,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 10:53:59,458][52263] Updated weights for policy 0, policy_version 358648 (0.0032) [2024-04-27 10:54:02,842][52263] Updated weights for policy 0, policy_version 358658 (0.0035) [2024-04-27 10:54:04,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 5876301824. Throughput: 0: 53522.9. Samples: 366832660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:54:04,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 10:54:05,594][52263] Updated weights for policy 0, policy_version 358668 (0.0026) [2024-04-27 10:54:08,870][52263] Updated weights for policy 0, policy_version 358678 (0.0026) [2024-04-27 10:54:09,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 5876580352. Throughput: 0: 53532.7. Samples: 367150740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:54:09,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 10:54:10,245][52242] Signal inference workers to stop experience collection... (5400 times) [2024-04-27 10:54:10,251][52242] Signal inference workers to resume experience collection... (5400 times) [2024-04-27 10:54:10,277][52263] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-04-27 10:54:10,277][52263] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-04-27 10:54:11,733][52263] Updated weights for policy 0, policy_version 358688 (0.0027) [2024-04-27 10:54:14,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 5876858880. Throughput: 0: 53353.7. Samples: 367311700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 10:54:14,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 10:54:14,952][52263] Updated weights for policy 0, policy_version 358698 (0.0026) [2024-04-27 10:54:17,981][52263] Updated weights for policy 0, policy_version 358708 (0.0029) [2024-04-27 10:54:19,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 5877121024. Throughput: 0: 53310.7. Samples: 367630820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:54:19,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 10:54:21,280][52263] Updated weights for policy 0, policy_version 358718 (0.0028) [2024-04-27 10:54:24,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 5877383168. Throughput: 0: 53240.0. Samples: 367946640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:54:24,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 10:54:24,252][52263] Updated weights for policy 0, policy_version 358728 (0.0034) [2024-04-27 10:54:27,362][52263] Updated weights for policy 0, policy_version 358738 (0.0036) [2024-04-27 10:54:29,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 5877661696. Throughput: 0: 53284.5. Samples: 368105620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:54:29,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 10:54:30,358][52263] Updated weights for policy 0, policy_version 358748 (0.0028) [2024-04-27 10:54:33,396][52263] Updated weights for policy 0, policy_version 358758 (0.0034) [2024-04-27 10:54:34,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 5877907456. Throughput: 0: 53309.5. Samples: 368427340. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:54:34,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 10:54:36,452][52263] Updated weights for policy 0, policy_version 358768 (0.0029) [2024-04-27 10:54:39,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 5878185984. Throughput: 0: 53261.4. Samples: 368744580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:54:39,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 10:54:39,793][52263] Updated weights for policy 0, policy_version 358778 (0.0027) [2024-04-27 10:54:42,463][52263] Updated weights for policy 0, policy_version 358788 (0.0030) [2024-04-27 10:54:44,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 5878448128. Throughput: 0: 53195.6. Samples: 368905480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:54:44,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 10:54:46,029][52263] Updated weights for policy 0, policy_version 358798 (0.0032) [2024-04-27 10:54:48,603][52263] Updated weights for policy 0, policy_version 358808 (0.0030) [2024-04-27 10:54:49,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 5878710272. Throughput: 0: 53064.7. Samples: 369220580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:54:49,116][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:54:52,296][52263] Updated weights for policy 0, policy_version 358818 (0.0029) [2024-04-27 10:54:54,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 5878988800. Throughput: 0: 53102.8. Samples: 369540360. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:54:54,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 10:54:54,719][52263] Updated weights for policy 0, policy_version 358828 (0.0035) [2024-04-27 10:54:58,261][52263] Updated weights for policy 0, policy_version 358838 (0.0031) [2024-04-27 10:54:59,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 5879234560. Throughput: 0: 53100.8. Samples: 369701240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:54:59,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 10:55:00,876][52263] Updated weights for policy 0, policy_version 358848 (0.0033) [2024-04-27 10:55:04,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 5879496704. Throughput: 0: 52987.6. Samples: 370015260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:55:04,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:55:04,442][52263] Updated weights for policy 0, policy_version 358858 (0.0034) [2024-04-27 10:55:07,629][52263] Updated weights for policy 0, policy_version 358868 (0.0030) [2024-04-27 10:55:09,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 5879775232. Throughput: 0: 53108.0. Samples: 370336500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:55:09,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 10:55:10,461][52263] Updated weights for policy 0, policy_version 358878 (0.0034) [2024-04-27 10:55:11,336][52242] Signal inference workers to stop experience collection... (5450 times) [2024-04-27 10:55:11,336][52242] Signal inference workers to resume experience collection... 
(5450 times) [2024-04-27 10:55:11,354][52263] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-04-27 10:55:11,354][52263] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-04-27 10:55:13,863][52263] Updated weights for policy 0, policy_version 358888 (0.0031) [2024-04-27 10:55:14,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 5880037376. Throughput: 0: 53182.9. Samples: 370498840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:55:14,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 10:55:16,656][52263] Updated weights for policy 0, policy_version 358898 (0.0030) [2024-04-27 10:55:19,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 5880315904. Throughput: 0: 53082.9. Samples: 370816080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:55:19,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 10:55:20,055][52263] Updated weights for policy 0, policy_version 358908 (0.0035) [2024-04-27 10:55:22,889][52263] Updated weights for policy 0, policy_version 358918 (0.0033) [2024-04-27 10:55:24,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5880578048. Throughput: 0: 53122.3. Samples: 371135080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:55:24,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 10:55:26,121][52263] Updated weights for policy 0, policy_version 358928 (0.0033) [2024-04-27 10:55:29,051][52263] Updated weights for policy 0, policy_version 358938 (0.0027) [2024-04-27 10:55:29,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 5880840192. Throughput: 0: 53162.2. Samples: 371297780. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 10:55:29,107][52031] Avg episode reward: [(0, '0.659')] [2024-04-27 10:55:32,339][52263] Updated weights for policy 0, policy_version 358948 (0.0027) [2024-04-27 10:55:34,106][52031] Fps is (10 sec: 50790.3, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 5881085952. Throughput: 0: 53174.3. Samples: 371613420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:55:34,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 10:55:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000358953_5881085952.pth... [2024-04-27 10:55:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000358171_5868273664.pth [2024-04-27 10:55:35,422][52263] Updated weights for policy 0, policy_version 358958 (0.0033) [2024-04-27 10:55:38,559][52263] Updated weights for policy 0, policy_version 358968 (0.0032) [2024-04-27 10:55:39,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52701.9, 300 sec: 53373.0). Total num frames: 5881348096. Throughput: 0: 53086.2. Samples: 371929240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:55:39,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 10:55:41,608][52263] Updated weights for policy 0, policy_version 358978 (0.0028) [2024-04-27 10:55:44,106][52031] Fps is (10 sec: 54067.3, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 5881626624. Throughput: 0: 53098.8. Samples: 372090680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:55:44,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 10:55:44,596][52263] Updated weights for policy 0, policy_version 358988 (0.0028) [2024-04-27 10:55:47,636][52263] Updated weights for policy 0, policy_version 358998 (0.0033) [2024-04-27 10:55:49,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5881905152. Throughput: 0: 53304.9. Samples: 372413980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:55:49,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 10:55:50,534][52263] Updated weights for policy 0, policy_version 359008 (0.0028) [2024-04-27 10:55:53,681][52263] Updated weights for policy 0, policy_version 359018 (0.0032) [2024-04-27 10:55:54,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 5882167296. Throughput: 0: 53342.2. Samples: 372736900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:55:54,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 10:55:56,876][52263] Updated weights for policy 0, policy_version 359028 (0.0029) [2024-04-27 10:55:59,107][52031] Fps is (10 sec: 50790.3, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 5882413056. Throughput: 0: 53259.4. Samples: 372895520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:55:59,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 10:55:59,784][52263] Updated weights for policy 0, policy_version 359038 (0.0027) [2024-04-27 10:56:03,020][52263] Updated weights for policy 0, policy_version 359048 (0.0026) [2024-04-27 10:56:04,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 5882691584. Throughput: 0: 53240.7. Samples: 373211900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:04,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 10:56:05,920][52263] Updated weights for policy 0, policy_version 359058 (0.0028) [2024-04-27 10:56:09,107][52031] Fps is (10 sec: 54066.1, 60 sec: 52974.7, 300 sec: 53372.9). Total num frames: 5882953728. Throughput: 0: 53202.8. Samples: 373529220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:09,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 10:56:09,195][52263] Updated weights for policy 0, policy_version 359068 (0.0029) [2024-04-27 10:56:12,008][52263] Updated weights for policy 0, policy_version 359078 (0.0033) [2024-04-27 10:56:14,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 5883232256. Throughput: 0: 53287.5. Samples: 373695720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:14,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 10:56:15,226][52263] Updated weights for policy 0, policy_version 359088 (0.0028) [2024-04-27 10:56:18,083][52242] Signal inference workers to stop experience collection... (5500 times) [2024-04-27 10:56:18,087][52242] Signal inference workers to resume experience collection... (5500 times) [2024-04-27 10:56:18,097][52263] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-04-27 10:56:18,116][52263] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-04-27 10:56:18,209][52263] Updated weights for policy 0, policy_version 359098 (0.0032) [2024-04-27 10:56:19,106][52031] Fps is (10 sec: 57346.0, 60 sec: 53521.3, 300 sec: 53484.1). Total num frames: 5883527168. Throughput: 0: 53442.3. Samples: 374018320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:19,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 10:56:21,424][52263] Updated weights for policy 0, policy_version 359108 (0.0031) [2024-04-27 10:56:24,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52974.9, 300 sec: 53372.9). Total num frames: 5883756544. Throughput: 0: 53459.5. Samples: 374334920. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:24,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 10:56:24,403][52263] Updated weights for policy 0, policy_version 359118 (0.0030) [2024-04-27 10:56:27,453][52263] Updated weights for policy 0, policy_version 359128 (0.0037) [2024-04-27 10:56:29,107][52031] Fps is (10 sec: 49150.9, 60 sec: 52974.8, 300 sec: 53372.9). Total num frames: 5884018688. Throughput: 0: 53321.1. Samples: 374490140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:29,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 10:56:30,517][52263] Updated weights for policy 0, policy_version 359138 (0.0027) [2024-04-27 10:56:33,502][52263] Updated weights for policy 0, policy_version 359148 (0.0026) [2024-04-27 10:56:34,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 5884313600. Throughput: 0: 53313.7. Samples: 374813100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:34,107][52031] Avg episode reward: [(0, '0.479')] [2024-04-27 10:56:36,639][52263] Updated weights for policy 0, policy_version 359158 (0.0026) [2024-04-27 10:56:39,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 5884559360. Throughput: 0: 53263.1. Samples: 375133740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:39,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 10:56:39,696][52263] Updated weights for policy 0, policy_version 359168 (0.0026) [2024-04-27 10:56:42,713][52263] Updated weights for policy 0, policy_version 359178 (0.0031) [2024-04-27 10:56:44,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 5884854272. Throughput: 0: 53411.5. Samples: 375299040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:44,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 10:56:45,910][52263] Updated weights for policy 0, policy_version 359188 (0.0029) [2024-04-27 10:56:48,947][52263] Updated weights for policy 0, policy_version 359198 (0.0033) [2024-04-27 10:56:49,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 5885116416. Throughput: 0: 53502.1. Samples: 375619500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 10:56:49,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 10:56:51,971][52263] Updated weights for policy 0, policy_version 359208 (0.0028) [2024-04-27 10:56:54,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5885362176. Throughput: 0: 53542.5. Samples: 375938620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:56:54,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 10:56:54,944][52263] Updated weights for policy 0, policy_version 359218 (0.0038) [2024-04-27 10:56:58,026][52263] Updated weights for policy 0, policy_version 359228 (0.0029) [2024-04-27 10:56:59,106][52031] Fps is (10 sec: 49152.4, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5885607936. Throughput: 0: 53220.1. Samples: 376090620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:56:59,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 10:57:01,101][52263] Updated weights for policy 0, policy_version 359238 (0.0033) [2024-04-27 10:57:04,051][52263] Updated weights for policy 0, policy_version 359248 (0.0030) [2024-04-27 10:57:04,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 5885919232. Throughput: 0: 53132.4. Samples: 376409280. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:04,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 10:57:07,392][52263] Updated weights for policy 0, policy_version 359258 (0.0027) [2024-04-27 10:57:09,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 5886181376. Throughput: 0: 53080.8. Samples: 376723560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:09,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 10:57:10,205][52263] Updated weights for policy 0, policy_version 359268 (0.0027) [2024-04-27 10:57:13,475][52263] Updated weights for policy 0, policy_version 359278 (0.0028) [2024-04-27 10:57:14,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5886443520. Throughput: 0: 53583.7. Samples: 376901400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:14,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 10:57:16,069][52242] Signal inference workers to stop experience collection... (5550 times) [2024-04-27 10:57:16,075][52242] Signal inference workers to resume experience collection... (5550 times) [2024-04-27 10:57:16,096][52263] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-04-27 10:57:16,096][52263] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-04-27 10:57:16,192][52263] Updated weights for policy 0, policy_version 359288 (0.0033) [2024-04-27 10:57:19,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52701.8, 300 sec: 53261.9). Total num frames: 5886689280. Throughput: 0: 53405.5. Samples: 377216340. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:19,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 10:57:19,640][52263] Updated weights for policy 0, policy_version 359298 (0.0031) [2024-04-27 10:57:22,301][52263] Updated weights for policy 0, policy_version 359308 (0.0030) [2024-04-27 10:57:24,107][52031] Fps is (10 sec: 49151.2, 60 sec: 52974.8, 300 sec: 53261.8). Total num frames: 5886935040. Throughput: 0: 53319.9. Samples: 377533140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:24,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 10:57:25,766][52263] Updated weights for policy 0, policy_version 359318 (0.0031) [2024-04-27 10:57:28,582][52263] Updated weights for policy 0, policy_version 359328 (0.0028) [2024-04-27 10:57:29,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.2, 300 sec: 53261.9). Total num frames: 5887229952. Throughput: 0: 53009.1. Samples: 377684440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:29,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 10:57:31,850][52263] Updated weights for policy 0, policy_version 359338 (0.0026) [2024-04-27 10:57:34,107][52031] Fps is (10 sec: 57344.7, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5887508480. Throughput: 0: 52896.0. Samples: 377999820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:34,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 10:57:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000359345_5887508480.pth... [2024-04-27 10:57:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000358564_5874712576.pth [2024-04-27 10:57:34,817][52263] Updated weights for policy 0, policy_version 359348 (0.0034) [2024-04-27 10:57:38,159][52263] Updated weights for policy 0, policy_version 359358 (0.0030) [2024-04-27 10:57:39,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.2, 300 sec: 53373.0). 
Total num frames: 5887770624. Throughput: 0: 52859.2. Samples: 378317280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:39,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 10:57:41,027][52263] Updated weights for policy 0, policy_version 359368 (0.0029) [2024-04-27 10:57:44,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 5888032768. Throughput: 0: 53087.0. Samples: 378479540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:44,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 10:57:44,348][52263] Updated weights for policy 0, policy_version 359378 (0.0028) [2024-04-27 10:57:46,988][52263] Updated weights for policy 0, policy_version 359388 (0.0028) [2024-04-27 10:57:49,106][52031] Fps is (10 sec: 47513.3, 60 sec: 52155.8, 300 sec: 53095.3). Total num frames: 5888245760. Throughput: 0: 53147.0. Samples: 378800900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:49,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 10:57:50,379][52263] Updated weights for policy 0, policy_version 359398 (0.0035) [2024-04-27 10:57:53,059][52263] Updated weights for policy 0, policy_version 359408 (0.0031) [2024-04-27 10:57:54,107][52031] Fps is (10 sec: 50789.9, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 5888540672. Throughput: 0: 53217.4. Samples: 379118340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:54,116][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 10:57:56,621][52263] Updated weights for policy 0, policy_version 359418 (0.0024) [2024-04-27 10:57:59,106][52031] Fps is (10 sec: 60621.0, 60 sec: 54067.2, 300 sec: 53317.4). Total num frames: 5888851968. Throughput: 0: 52867.6. Samples: 379280440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:57:59,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 10:57:59,247][52263] Updated weights for policy 0, policy_version 359428 (0.0023) [2024-04-27 10:58:02,726][52263] Updated weights for policy 0, policy_version 359438 (0.0044) [2024-04-27 10:58:04,107][52031] Fps is (10 sec: 57344.3, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 5889114112. Throughput: 0: 53016.3. Samples: 379602080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 10:58:04,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 10:58:05,425][52263] Updated weights for policy 0, policy_version 359448 (0.0026) [2024-04-27 10:58:08,851][52263] Updated weights for policy 0, policy_version 359458 (0.0035) [2024-04-27 10:58:09,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52975.1, 300 sec: 53317.4). Total num frames: 5889359872. Throughput: 0: 53119.3. Samples: 379923500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:09,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 10:58:11,972][52263] Updated weights for policy 0, policy_version 359468 (0.0033) [2024-04-27 10:58:14,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 5889622016. Throughput: 0: 53288.0. Samples: 380082400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:14,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 10:58:14,780][52263] Updated weights for policy 0, policy_version 359478 (0.0028) [2024-04-27 10:58:18,303][52263] Updated weights for policy 0, policy_version 359488 (0.0034) [2024-04-27 10:58:19,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52974.9, 300 sec: 53150.8). Total num frames: 5889867776. Throughput: 0: 53483.5. Samples: 380406580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:19,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 10:58:20,865][52263] Updated weights for policy 0, policy_version 359498 (0.0027) [2024-04-27 10:58:24,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.2, 300 sec: 53261.9). Total num frames: 5890162688. Throughput: 0: 53440.8. Samples: 380722120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 10:58:25,062][52263] Updated weights for policy 0, policy_version 359508 (0.0031) [2024-04-27 10:58:26,890][52242] Signal inference workers to stop experience collection... (5600 times) [2024-04-27 10:58:26,939][52263] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-04-27 10:58:26,949][52242] Signal inference workers to resume experience collection... (5600 times) [2024-04-27 10:58:26,956][52263] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-04-27 10:58:27,091][52263] Updated weights for policy 0, policy_version 359518 (0.0033) [2024-04-27 10:58:29,106][52031] Fps is (10 sec: 58983.1, 60 sec: 53794.2, 300 sec: 53373.0). Total num frames: 5890457600. Throughput: 0: 53565.0. Samples: 380889960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:29,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 10:58:31,244][52263] Updated weights for policy 0, policy_version 359528 (0.0032) [2024-04-27 10:58:33,086][52263] Updated weights for policy 0, policy_version 359538 (0.0030) [2024-04-27 10:58:34,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5890719744. Throughput: 0: 53635.6. Samples: 381214500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:34,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 10:58:37,251][52263] Updated weights for policy 0, policy_version 359548 (0.0028) [2024-04-27 10:58:39,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5890981888. Throughput: 0: 53727.6. Samples: 381536080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:39,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 10:58:39,436][52263] Updated weights for policy 0, policy_version 359558 (0.0031) [2024-04-27 10:58:43,108][52263] Updated weights for policy 0, policy_version 359568 (0.0029) [2024-04-27 10:58:44,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53247.9, 300 sec: 53206.3). Total num frames: 5891227648. Throughput: 0: 53471.0. Samples: 381686640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:44,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:58:45,635][52263] Updated weights for policy 0, policy_version 359578 (0.0030) [2024-04-27 10:58:49,106][52031] Fps is (10 sec: 49152.6, 60 sec: 53794.2, 300 sec: 53150.8). Total num frames: 5891473408. Throughput: 0: 53473.5. Samples: 382008380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:49,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 10:58:49,157][52263] Updated weights for policy 0, policy_version 359588 (0.0033) [2024-04-27 10:58:51,553][52263] Updated weights for policy 0, policy_version 359598 (0.0032) [2024-04-27 10:58:54,107][52031] Fps is (10 sec: 55705.5, 60 sec: 54067.2, 300 sec: 53317.4). Total num frames: 5891784704. Throughput: 0: 53537.2. Samples: 382332680. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:54,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 10:58:55,321][52263] Updated weights for policy 0, policy_version 359608 (0.0028) [2024-04-27 10:58:57,721][52263] Updated weights for policy 0, policy_version 359618 (0.0030) [2024-04-27 10:58:59,106][52031] Fps is (10 sec: 57343.6, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 5892046848. Throughput: 0: 53775.9. Samples: 382502320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:58:59,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 10:59:01,285][52263] Updated weights for policy 0, policy_version 359628 (0.0032) [2024-04-27 10:59:03,700][52263] Updated weights for policy 0, policy_version 359638 (0.0028) [2024-04-27 10:59:04,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 5892325376. Throughput: 0: 53693.0. Samples: 382822760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:59:04,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 10:59:07,385][52263] Updated weights for policy 0, policy_version 359648 (0.0025) [2024-04-27 10:59:09,108][52031] Fps is (10 sec: 54060.8, 60 sec: 53793.1, 300 sec: 53317.2). Total num frames: 5892587520. Throughput: 0: 53740.4. Samples: 383140500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:59:09,108][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 10:59:09,849][52263] Updated weights for policy 0, policy_version 359658 (0.0027) [2024-04-27 10:59:13,834][52263] Updated weights for policy 0, policy_version 359668 (0.0030) [2024-04-27 10:59:14,106][52031] Fps is (10 sec: 49151.8, 60 sec: 53248.0, 300 sec: 53206.4). Total num frames: 5892816896. Throughput: 0: 53519.9. Samples: 383298360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:59:14,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 10:59:15,973][52263] Updated weights for policy 0, policy_version 359678 (0.0030) [2024-04-27 10:59:19,106][52031] Fps is (10 sec: 50796.3, 60 sec: 53794.1, 300 sec: 53261.9). Total num frames: 5893095424. Throughput: 0: 53372.4. Samples: 383616260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:59:19,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 10:59:19,842][52263] Updated weights for policy 0, policy_version 359688 (0.0030) [2024-04-27 10:59:22,055][52263] Updated weights for policy 0, policy_version 359698 (0.0025) [2024-04-27 10:59:24,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.2, 300 sec: 53317.5). Total num frames: 5893390336. Throughput: 0: 53388.1. Samples: 383938540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 10:59:24,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 10:59:25,829][52263] Updated weights for policy 0, policy_version 359708 (0.0034) [2024-04-27 10:59:28,263][52263] Updated weights for policy 0, policy_version 359718 (0.0030) [2024-04-27 10:59:29,107][52031] Fps is (10 sec: 57343.9, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 5893668864. Throughput: 0: 53743.2. Samples: 384105080. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 10:59:29,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 10:59:31,931][52263] Updated weights for policy 0, policy_version 359728 (0.0027) [2024-04-27 10:59:34,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5893914624. Throughput: 0: 53623.5. Samples: 384421440. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 10:59:34,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 10:59:34,142][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000359737_5893931008.pth... 
[2024-04-27 10:59:34,185][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000358953_5881085952.pth [2024-04-27 10:59:34,404][52263] Updated weights for policy 0, policy_version 359738 (0.0031) [2024-04-27 10:59:37,965][52263] Updated weights for policy 0, policy_version 359748 (0.0031) [2024-04-27 10:59:39,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5894176768. Throughput: 0: 53629.1. Samples: 384745980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 10:59:39,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 10:59:40,531][52263] Updated weights for policy 0, policy_version 359758 (0.0034) [2024-04-27 10:59:41,504][52242] Signal inference workers to stop experience collection... (5650 times) [2024-04-27 10:59:41,504][52242] Signal inference workers to resume experience collection... (5650 times) [2024-04-27 10:59:41,517][52263] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-04-27 10:59:41,518][52263] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-04-27 10:59:43,975][52263] Updated weights for policy 0, policy_version 359768 (0.0033) [2024-04-27 10:59:44,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5894438912. Throughput: 0: 53431.5. Samples: 384906740. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 10:59:44,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 10:59:46,814][52263] Updated weights for policy 0, policy_version 359778 (0.0033) [2024-04-27 10:59:49,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.1, 300 sec: 53261.9). Total num frames: 5894701056. Throughput: 0: 53415.5. Samples: 385226460. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 10:59:49,107][52031] Avg episode reward: [(0, '0.677')] [2024-04-27 10:59:50,208][52263] Updated weights for policy 0, policy_version 359788 (0.0029) [2024-04-27 10:59:52,786][52263] Updated weights for policy 0, policy_version 359798 (0.0032) [2024-04-27 10:59:54,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 5894979584. Throughput: 0: 53387.1. Samples: 385542860. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 10:59:54,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 10:59:56,502][52263] Updated weights for policy 0, policy_version 359808 (0.0032) [2024-04-27 10:59:58,889][52263] Updated weights for policy 0, policy_version 359818 (0.0028) [2024-04-27 10:59:59,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5895258112. Throughput: 0: 53471.4. Samples: 385704580. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 10:59:59,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 11:00:02,593][52263] Updated weights for policy 0, policy_version 359828 (0.0030) [2024-04-27 11:00:04,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5895520256. Throughput: 0: 53499.2. Samples: 386023720. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 11:00:04,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 11:00:05,068][52263] Updated weights for policy 0, policy_version 359838 (0.0028) [2024-04-27 11:00:08,648][52263] Updated weights for policy 0, policy_version 359848 (0.0035) [2024-04-27 11:00:09,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52975.9, 300 sec: 53317.4). Total num frames: 5895766016. Throughput: 0: 53489.6. Samples: 386345580. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 11:00:09,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 11:00:11,177][52263] Updated weights for policy 0, policy_version 359858 (0.0039) [2024-04-27 11:00:14,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.0, 300 sec: 53261.9). Total num frames: 5896028160. Throughput: 0: 53153.4. Samples: 386496980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 11:00:14,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 11:00:14,922][52263] Updated weights for policy 0, policy_version 359868 (0.0028) [2024-04-27 11:00:17,804][52263] Updated weights for policy 0, policy_version 359878 (0.0030) [2024-04-27 11:00:19,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5896306688. Throughput: 0: 53243.0. Samples: 386817380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 11:00:19,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 11:00:20,954][52263] Updated weights for policy 0, policy_version 359888 (0.0031) [2024-04-27 11:00:23,994][52263] Updated weights for policy 0, policy_version 359898 (0.0033) [2024-04-27 11:00:24,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52974.8, 300 sec: 53317.4). Total num frames: 5896568832. Throughput: 0: 53154.0. Samples: 387137920. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 11:00:24,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 11:00:27,076][52263] Updated weights for policy 0, policy_version 359908 (0.0028) [2024-04-27 11:00:29,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 5896847360. Throughput: 0: 53133.8. Samples: 387297760. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 11:00:29,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 11:00:29,944][52263] Updated weights for policy 0, policy_version 359918 (0.0029) [2024-04-27 11:00:33,186][52263] Updated weights for policy 0, policy_version 359928 (0.0031) [2024-04-27 11:00:34,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 5897109504. Throughput: 0: 53218.0. Samples: 387621280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 11:00:34,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 11:00:36,232][52263] Updated weights for policy 0, policy_version 359938 (0.0026) [2024-04-27 11:00:39,107][52031] Fps is (10 sec: 50789.5, 60 sec: 52974.7, 300 sec: 53317.4). Total num frames: 5897355264. Throughput: 0: 53347.8. Samples: 387943520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 11:00:39,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 11:00:39,306][52263] Updated weights for policy 0, policy_version 359948 (0.0034) [2024-04-27 11:00:42,388][52263] Updated weights for policy 0, policy_version 359958 (0.0026) [2024-04-27 11:00:44,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5897633792. Throughput: 0: 53135.6. Samples: 388095680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:00:44,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 11:00:45,004][52242] Signal inference workers to stop experience collection... (5700 times) [2024-04-27 11:00:45,004][52242] Signal inference workers to resume experience collection... 
(5700 times) [2024-04-27 11:00:45,019][52263] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-04-27 11:00:45,019][52263] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-04-27 11:00:45,513][52263] Updated weights for policy 0, policy_version 359968 (0.0032) [2024-04-27 11:00:48,449][52263] Updated weights for policy 0, policy_version 359978 (0.0031) [2024-04-27 11:00:49,107][52031] Fps is (10 sec: 54067.8, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 5897895936. Throughput: 0: 53065.7. Samples: 388411680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:00:49,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:00:51,592][52263] Updated weights for policy 0, policy_version 359988 (0.0027) [2024-04-27 11:00:54,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5898174464. Throughput: 0: 53073.4. Samples: 388733880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:00:54,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 11:00:54,404][52263] Updated weights for policy 0, policy_version 359998 (0.0032) [2024-04-27 11:00:57,661][52263] Updated weights for policy 0, policy_version 360008 (0.0026) [2024-04-27 11:00:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 5898436608. Throughput: 0: 53379.1. Samples: 388899040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:00:59,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 11:01:00,701][52263] Updated weights for policy 0, policy_version 360018 (0.0034) [2024-04-27 11:01:03,711][52263] Updated weights for policy 0, policy_version 360028 (0.0031) [2024-04-27 11:01:04,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 5898715136. Throughput: 0: 53367.5. Samples: 389218920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:04,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:01:06,960][52263] Updated weights for policy 0, policy_version 360038 (0.0026) [2024-04-27 11:01:09,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5898960896. Throughput: 0: 53387.3. Samples: 389540340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:09,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 11:01:09,770][52263] Updated weights for policy 0, policy_version 360048 (0.0039) [2024-04-27 11:01:13,253][52263] Updated weights for policy 0, policy_version 360058 (0.0028) [2024-04-27 11:01:14,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53261.8). Total num frames: 5899239424. Throughput: 0: 53215.9. Samples: 389692480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:14,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 11:01:15,919][52263] Updated weights for policy 0, policy_version 360068 (0.0044) [2024-04-27 11:01:19,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 5899485184. Throughput: 0: 53178.2. Samples: 390014300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:19,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 11:01:19,303][52263] Updated weights for policy 0, policy_version 360078 (0.0031) [2024-04-27 11:01:22,030][52263] Updated weights for policy 0, policy_version 360088 (0.0032) [2024-04-27 11:01:24,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 5899780096. Throughput: 0: 53145.1. Samples: 390335040. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:24,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 11:01:25,353][52263] Updated weights for policy 0, policy_version 360098 (0.0030) [2024-04-27 11:01:28,169][52263] Updated weights for policy 0, policy_version 360108 (0.0029) [2024-04-27 11:01:29,107][52031] Fps is (10 sec: 57344.4, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 5900058624. Throughput: 0: 53425.3. Samples: 390499820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:29,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 11:01:31,641][52263] Updated weights for policy 0, policy_version 360118 (0.0027) [2024-04-27 11:01:34,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 5900304384. Throughput: 0: 53566.3. Samples: 390822160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:34,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 11:01:34,138][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000360127_5900320768.pth... [2024-04-27 11:01:34,191][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000359345_5887508480.pth [2024-04-27 11:01:34,320][52263] Updated weights for policy 0, policy_version 360128 (0.0028) [2024-04-27 11:01:37,906][52263] Updated weights for policy 0, policy_version 360138 (0.0029) [2024-04-27 11:01:39,106][52031] Fps is (10 sec: 49152.7, 60 sec: 53248.2, 300 sec: 53206.4). Total num frames: 5900550144. Throughput: 0: 53423.2. Samples: 391137920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:39,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 11:01:40,175][52242] Signal inference workers to stop experience collection... (5750 times) [2024-04-27 11:01:40,176][52242] Signal inference workers to resume experience collection... 
(5750 times) [2024-04-27 11:01:40,198][52263] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-04-27 11:01:40,198][52263] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-04-27 11:01:40,453][52263] Updated weights for policy 0, policy_version 360148 (0.0033) [2024-04-27 11:01:43,874][52263] Updated weights for policy 0, policy_version 360158 (0.0028) [2024-04-27 11:01:44,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 5900828672. Throughput: 0: 53219.1. Samples: 391293900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:44,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 11:01:46,583][52263] Updated weights for policy 0, policy_version 360168 (0.0027) [2024-04-27 11:01:49,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5901090816. Throughput: 0: 53193.0. Samples: 391612600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:49,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:01:50,136][52263] Updated weights for policy 0, policy_version 360178 (0.0036) [2024-04-27 11:01:52,779][52263] Updated weights for policy 0, policy_version 360188 (0.0035) [2024-04-27 11:01:54,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 5901385728. Throughput: 0: 53108.8. Samples: 391930240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:54,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 11:01:56,298][52263] Updated weights for policy 0, policy_version 360198 (0.0030) [2024-04-27 11:01:58,908][52263] Updated weights for policy 0, policy_version 360208 (0.0030) [2024-04-27 11:01:59,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5901647872. Throughput: 0: 53331.3. Samples: 392092380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:01:59,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 11:02:02,518][52263] Updated weights for policy 0, policy_version 360218 (0.0031) [2024-04-27 11:02:04,106][52031] Fps is (10 sec: 49152.4, 60 sec: 52702.0, 300 sec: 53206.4). Total num frames: 5901877248. Throughput: 0: 53311.7. Samples: 392413320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:04,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:02:05,048][52263] Updated weights for policy 0, policy_version 360228 (0.0031) [2024-04-27 11:02:08,692][52263] Updated weights for policy 0, policy_version 360238 (0.0031) [2024-04-27 11:02:09,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 5902155776. Throughput: 0: 53296.5. Samples: 392733380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:09,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:02:11,200][52263] Updated weights for policy 0, policy_version 360248 (0.0031) [2024-04-27 11:02:14,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 5902434304. Throughput: 0: 52915.1. Samples: 392881000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:14,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 11:02:14,816][52263] Updated weights for policy 0, policy_version 360258 (0.0027) [2024-04-27 11:02:17,485][52263] Updated weights for policy 0, policy_version 360268 (0.0035) [2024-04-27 11:02:19,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 5902696448. Throughput: 0: 52903.1. Samples: 393202800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:19,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 11:02:20,860][52263] Updated weights for policy 0, policy_version 360278 (0.0029) [2024-04-27 11:02:23,673][52263] Updated weights for policy 0, policy_version 360288 (0.0031) [2024-04-27 11:02:24,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.8, 300 sec: 53317.4). Total num frames: 5902958592. Throughput: 0: 53033.5. Samples: 393524440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:24,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 11:02:27,033][52263] Updated weights for policy 0, policy_version 360298 (0.0031) [2024-04-27 11:02:29,107][52031] Fps is (10 sec: 54066.5, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 5903237120. Throughput: 0: 53146.9. Samples: 393685520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:29,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 11:02:29,631][52263] Updated weights for policy 0, policy_version 360308 (0.0029) [2024-04-27 11:02:33,081][52263] Updated weights for policy 0, policy_version 360318 (0.0028) [2024-04-27 11:02:34,106][52031] Fps is (10 sec: 52430.0, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 5903482880. Throughput: 0: 53195.6. Samples: 394006400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:34,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 11:02:35,551][52263] Updated weights for policy 0, policy_version 360328 (0.0032) [2024-04-27 11:02:39,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5903761408. Throughput: 0: 53464.6. Samples: 394336140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:39,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 11:02:39,143][52263] Updated weights for policy 0, policy_version 360338 (0.0026) [2024-04-27 11:02:40,921][52242] Signal inference workers to stop experience collection... (5800 times) [2024-04-27 11:02:40,921][52242] Signal inference workers to resume experience collection... (5800 times) [2024-04-27 11:02:40,943][52263] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-04-27 11:02:40,943][52263] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-04-27 11:02:41,739][52263] Updated weights for policy 0, policy_version 360348 (0.0030) [2024-04-27 11:02:44,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 5904039936. Throughput: 0: 53166.1. Samples: 394484860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:44,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 11:02:45,316][52263] Updated weights for policy 0, policy_version 360358 (0.0027) [2024-04-27 11:02:47,917][52263] Updated weights for policy 0, policy_version 360368 (0.0032) [2024-04-27 11:02:49,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 5904285696. Throughput: 0: 53111.4. Samples: 394803340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:49,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 11:02:51,407][52263] Updated weights for policy 0, policy_version 360378 (0.0029) [2024-04-27 11:02:53,978][52263] Updated weights for policy 0, policy_version 360388 (0.0029) [2024-04-27 11:02:54,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.1, 300 sec: 53372.9). Total num frames: 5904596992. Throughput: 0: 53103.9. Samples: 395123060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:54,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 11:02:57,508][52263] Updated weights for policy 0, policy_version 360398 (0.0032) [2024-04-27 11:02:59,106][52031] Fps is (10 sec: 52429.9, 60 sec: 52701.9, 300 sec: 53206.4). Total num frames: 5904809984. Throughput: 0: 53422.9. Samples: 395285020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:02:59,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 11:03:00,593][52263] Updated weights for policy 0, policy_version 360408 (0.0024) [2024-04-27 11:03:03,578][52263] Updated weights for policy 0, policy_version 360418 (0.0031) [2024-04-27 11:03:04,106][52031] Fps is (10 sec: 49152.6, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5905088512. Throughput: 0: 53374.7. Samples: 395604660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:03:04,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 11:03:06,950][52263] Updated weights for policy 0, policy_version 360428 (0.0031) [2024-04-27 11:03:09,107][52031] Fps is (10 sec: 54062.9, 60 sec: 53247.4, 300 sec: 53317.3). Total num frames: 5905350656. Throughput: 0: 53199.8. Samples: 395918460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:03:09,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:03:09,923][52263] Updated weights for policy 0, policy_version 360438 (0.0032) [2024-04-27 11:03:13,419][52263] Updated weights for policy 0, policy_version 360448 (0.0031) [2024-04-27 11:03:14,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5905629184. Throughput: 0: 53379.2. Samples: 396087580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:03:14,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 11:03:15,944][52263] Updated weights for policy 0, policy_version 360458 (0.0032) [2024-04-27 11:03:19,107][52031] Fps is (10 sec: 52432.6, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 5905874944. Throughput: 0: 53361.7. Samples: 396407680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 11:03:19,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 11:03:19,511][52263] Updated weights for policy 0, policy_version 360468 (0.0031) [2024-04-27 11:03:22,007][52263] Updated weights for policy 0, policy_version 360478 (0.0038) [2024-04-27 11:03:24,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.3, 300 sec: 53261.9). Total num frames: 5906169856. Throughput: 0: 53097.8. Samples: 396725540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:03:24,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 11:03:25,605][52263] Updated weights for policy 0, policy_version 360488 (0.0026) [2024-04-27 11:03:28,441][52263] Updated weights for policy 0, policy_version 360498 (0.0032) [2024-04-27 11:03:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 52974.9, 300 sec: 53206.3). Total num frames: 5906415616. Throughput: 0: 53185.3. Samples: 396878200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:03:29,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 11:03:29,972][52242] Signal inference workers to stop experience collection... (5850 times) [2024-04-27 11:03:29,980][52242] Signal inference workers to resume experience collection... 
(5850 times) [2024-04-27 11:03:29,982][52263] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-04-27 11:03:29,995][52263] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-04-27 11:03:31,655][52263] Updated weights for policy 0, policy_version 360508 (0.0027) [2024-04-27 11:03:34,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53261.9). Total num frames: 5906694144. Throughput: 0: 53167.7. Samples: 397195880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:03:34,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 11:03:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000360516_5906694144.pth... [2024-04-27 11:03:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000359737_5893931008.pth [2024-04-27 11:03:34,671][52263] Updated weights for policy 0, policy_version 360518 (0.0025) [2024-04-27 11:03:37,828][52263] Updated weights for policy 0, policy_version 360528 (0.0032) [2024-04-27 11:03:39,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 5906956288. Throughput: 0: 53144.9. Samples: 397514580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:03:39,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 11:03:40,961][52263] Updated weights for policy 0, policy_version 360538 (0.0030) [2024-04-27 11:03:43,890][52263] Updated weights for policy 0, policy_version 360548 (0.0027) [2024-04-27 11:03:44,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 5907218432. Throughput: 0: 53373.3. Samples: 397686820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:03:44,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 11:03:47,021][52263] Updated weights for policy 0, policy_version 360558 (0.0028) [2024-04-27 11:03:49,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53261.9). 
Total num frames: 5907496960. Throughput: 0: 53374.5. Samples: 398006520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:03:49,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 11:03:50,048][52263] Updated weights for policy 0, policy_version 360568 (0.0024) [2024-04-27 11:03:53,301][52263] Updated weights for policy 0, policy_version 360578 (0.0032) [2024-04-27 11:03:54,106][52031] Fps is (10 sec: 54066.9, 60 sec: 52701.9, 300 sec: 53261.9). Total num frames: 5907759104. Throughput: 0: 53472.0. Samples: 398324660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:03:54,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 11:03:56,215][52263] Updated weights for policy 0, policy_version 360588 (0.0028) [2024-04-27 11:03:59,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.0, 300 sec: 53206.3). Total num frames: 5908021248. Throughput: 0: 53225.3. Samples: 398482720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:03:59,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 11:03:59,596][52263] Updated weights for policy 0, policy_version 360598 (0.0027) [2024-04-27 11:04:02,526][52263] Updated weights for policy 0, policy_version 360608 (0.0028) [2024-04-27 11:04:04,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53262.1). Total num frames: 5908299776. Throughput: 0: 53274.2. Samples: 398805020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:04:04,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 11:04:05,635][52263] Updated weights for policy 0, policy_version 360618 (0.0032) [2024-04-27 11:04:08,574][52263] Updated weights for policy 0, policy_version 360628 (0.0028) [2024-04-27 11:04:09,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.7, 300 sec: 53317.4). Total num frames: 5908545536. Throughput: 0: 53374.2. Samples: 399127380. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:04:09,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 11:04:11,625][52263] Updated weights for policy 0, policy_version 360638 (0.0028) [2024-04-27 11:04:14,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5908824064. Throughput: 0: 53544.5. Samples: 399287700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:04:14,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 11:04:14,616][52263] Updated weights for policy 0, policy_version 360648 (0.0035) [2024-04-27 11:04:17,828][52263] Updated weights for policy 0, policy_version 360658 (0.0037) [2024-04-27 11:04:19,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.1, 300 sec: 53261.9). Total num frames: 5909102592. Throughput: 0: 53582.6. Samples: 399607100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:04:19,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 11:04:20,784][52263] Updated weights for policy 0, policy_version 360668 (0.0026) [2024-04-27 11:04:24,025][52263] Updated weights for policy 0, policy_version 360678 (0.0029) [2024-04-27 11:04:24,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52974.8, 300 sec: 53150.8). Total num frames: 5909348352. Throughput: 0: 53550.2. Samples: 399924340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:04:24,107][52031] Avg episode reward: [(0, '0.662')] [2024-04-27 11:04:26,950][52263] Updated weights for policy 0, policy_version 360688 (0.0029) [2024-04-27 11:04:29,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.2, 300 sec: 53317.4). Total num frames: 5909643264. Throughput: 0: 53235.9. Samples: 400082440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:04:29,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 11:04:30,107][52263] Updated weights for policy 0, policy_version 360698 (0.0030) [2024-04-27 11:04:33,026][52263] Updated weights for policy 0, policy_version 360708 (0.0037) [2024-04-27 11:04:34,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.8, 300 sec: 53261.8). Total num frames: 5909889024. Throughput: 0: 53203.5. Samples: 400400680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:04:34,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 11:04:35,095][52242] Signal inference workers to stop experience collection... (5900 times) [2024-04-27 11:04:35,129][52263] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-04-27 11:04:35,155][52242] Signal inference workers to resume experience collection... (5900 times) [2024-04-27 11:04:35,158][52263] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-04-27 11:04:36,189][52263] Updated weights for policy 0, policy_version 360718 (0.0028) [2024-04-27 11:04:39,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 5910151168. Throughput: 0: 53279.1. Samples: 400722220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:04:39,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 11:04:39,456][52263] Updated weights for policy 0, policy_version 360728 (0.0028) [2024-04-27 11:04:42,366][52263] Updated weights for policy 0, policy_version 360738 (0.0025) [2024-04-27 11:04:44,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5910429696. Throughput: 0: 53249.3. Samples: 400878940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:04:44,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 11:04:45,435][52263] Updated weights for policy 0, policy_version 360748 (0.0030) [2024-04-27 11:04:48,424][52263] Updated weights for policy 0, policy_version 360758 (0.0027) [2024-04-27 11:04:49,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52975.1, 300 sec: 53206.4). Total num frames: 5910675456. Throughput: 0: 53127.2. Samples: 401195740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:04:49,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 11:04:51,756][52263] Updated weights for policy 0, policy_version 360768 (0.0025) [2024-04-27 11:04:54,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53248.1, 300 sec: 53206.4). Total num frames: 5910953984. Throughput: 0: 53210.2. Samples: 401521840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:04:54,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 11:04:54,460][52263] Updated weights for policy 0, policy_version 360778 (0.0028) [2024-04-27 11:04:57,842][52263] Updated weights for policy 0, policy_version 360788 (0.0029) [2024-04-27 11:04:59,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53248.0, 300 sec: 53206.3). Total num frames: 5911216128. Throughput: 0: 53225.3. Samples: 401682840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:04:59,107][52031] Avg episode reward: [(0, '0.699')] [2024-04-27 11:05:00,551][52263] Updated weights for policy 0, policy_version 360798 (0.0031) [2024-04-27 11:05:03,920][52263] Updated weights for policy 0, policy_version 360808 (0.0030) [2024-04-27 11:05:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.1, 300 sec: 53261.9). Total num frames: 5911478272. Throughput: 0: 53158.4. Samples: 401999220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:04,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 11:05:06,964][52263] Updated weights for policy 0, policy_version 360818 (0.0031) [2024-04-27 11:05:09,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5911756800. Throughput: 0: 53170.2. Samples: 402317000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:09,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 11:05:10,026][52263] Updated weights for policy 0, policy_version 360828 (0.0029) [2024-04-27 11:05:13,267][52263] Updated weights for policy 0, policy_version 360838 (0.0033) [2024-04-27 11:05:14,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.2, 300 sec: 53317.4). Total num frames: 5912035328. Throughput: 0: 53340.1. Samples: 402482740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:14,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 11:05:16,140][52263] Updated weights for policy 0, policy_version 360848 (0.0032) [2024-04-27 11:05:19,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 5912281088. Throughput: 0: 53331.7. Samples: 402800600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:19,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 11:05:19,314][52263] Updated weights for policy 0, policy_version 360858 (0.0028) [2024-04-27 11:05:22,345][52263] Updated weights for policy 0, policy_version 360868 (0.0029) [2024-04-27 11:05:24,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 5912559616. Throughput: 0: 53324.5. Samples: 403121820. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:24,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 11:05:25,246][52263] Updated weights for policy 0, policy_version 360878 (0.0025) [2024-04-27 11:05:28,493][52263] Updated weights for policy 0, policy_version 360888 (0.0030) [2024-04-27 11:05:29,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5912838144. Throughput: 0: 53451.1. Samples: 403284240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:29,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 11:05:31,472][52263] Updated weights for policy 0, policy_version 360898 (0.0027) [2024-04-27 11:05:34,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 5913067520. Throughput: 0: 53361.7. Samples: 403597020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:34,116][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 11:05:34,124][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000360905_5913067520.pth... [2024-04-27 11:05:34,181][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000360127_5900320768.pth [2024-04-27 11:05:34,552][52263] Updated weights for policy 0, policy_version 360908 (0.0031) [2024-04-27 11:05:37,621][52263] Updated weights for policy 0, policy_version 360918 (0.0030) [2024-04-27 11:05:39,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53248.1, 300 sec: 53261.9). Total num frames: 5913346048. Throughput: 0: 53220.9. Samples: 403916780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:39,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 11:05:40,633][52263] Updated weights for policy 0, policy_version 360928 (0.0034) [2024-04-27 11:05:43,594][52263] Updated weights for policy 0, policy_version 360938 (0.0028) [2024-04-27 11:05:44,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.0, 300 sec: 53317.4). 
Total num frames: 5913624576. Throughput: 0: 53417.7. Samples: 404086640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:44,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 11:05:46,845][52263] Updated weights for policy 0, policy_version 360948 (0.0028) [2024-04-27 11:05:49,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.0, 300 sec: 53261.9). Total num frames: 5913886720. Throughput: 0: 53451.5. Samples: 404404540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:49,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 11:05:49,752][52263] Updated weights for policy 0, policy_version 360958 (0.0027) [2024-04-27 11:05:50,260][52242] Signal inference workers to stop experience collection... (5950 times) [2024-04-27 11:05:50,279][52263] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-04-27 11:05:50,322][52242] Signal inference workers to resume experience collection... (5950 times) [2024-04-27 11:05:50,322][52263] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-04-27 11:05:52,908][52263] Updated weights for policy 0, policy_version 360968 (0.0026) [2024-04-27 11:05:54,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5914165248. Throughput: 0: 53463.7. Samples: 404722860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 11:05:54,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 11:05:55,920][52263] Updated weights for policy 0, policy_version 360978 (0.0028) [2024-04-27 11:05:58,903][52263] Updated weights for policy 0, policy_version 360988 (0.0027) [2024-04-27 11:05:59,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.1, 300 sec: 53317.4). Total num frames: 5914443776. Throughput: 0: 53282.1. Samples: 404880440. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:05:59,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 11:06:02,106][52263] Updated weights for policy 0, policy_version 360998 (0.0027) [2024-04-27 11:06:04,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53520.9, 300 sec: 53317.4). Total num frames: 5914689536. Throughput: 0: 53452.0. Samples: 405205940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:04,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 11:06:05,010][52263] Updated weights for policy 0, policy_version 361008 (0.0029) [2024-04-27 11:06:08,199][52263] Updated weights for policy 0, policy_version 361018 (0.0029) [2024-04-27 11:06:09,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 5914951680. Throughput: 0: 53445.6. Samples: 405526880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:09,107][52031] Avg episode reward: [(0, '0.499')] [2024-04-27 11:06:11,216][52263] Updated weights for policy 0, policy_version 361028 (0.0032) [2024-04-27 11:06:14,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 5915230208. Throughput: 0: 53236.1. Samples: 405679860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:14,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 11:06:14,213][52263] Updated weights for policy 0, policy_version 361038 (0.0031) [2024-04-27 11:06:17,311][52263] Updated weights for policy 0, policy_version 361048 (0.0026) [2024-04-27 11:06:19,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 5915492352. Throughput: 0: 53345.8. Samples: 405997580. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:19,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 11:06:20,267][52263] Updated weights for policy 0, policy_version 361058 (0.0025) [2024-04-27 11:06:23,320][52263] Updated weights for policy 0, policy_version 361068 (0.0027) [2024-04-27 11:06:24,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53247.9, 300 sec: 53206.3). Total num frames: 5915754496. Throughput: 0: 53410.0. Samples: 406320240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:24,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 11:06:26,496][52263] Updated weights for policy 0, policy_version 361078 (0.0028) [2024-04-27 11:06:29,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52975.1, 300 sec: 53261.9). Total num frames: 5916016640. Throughput: 0: 53226.4. Samples: 406481820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:29,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 11:06:29,543][52263] Updated weights for policy 0, policy_version 361088 (0.0030) [2024-04-27 11:06:32,719][52263] Updated weights for policy 0, policy_version 361098 (0.0028) [2024-04-27 11:06:34,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5916278784. Throughput: 0: 53287.5. Samples: 406802480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:34,107][52031] Avg episode reward: [(0, '0.670')] [2024-04-27 11:06:35,679][52263] Updated weights for policy 0, policy_version 361108 (0.0032) [2024-04-27 11:06:38,874][52263] Updated weights for policy 0, policy_version 361118 (0.0026) [2024-04-27 11:06:39,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53520.9, 300 sec: 53317.4). Total num frames: 5916557312. Throughput: 0: 53258.1. Samples: 407119480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:39,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 11:06:41,716][52263] Updated weights for policy 0, policy_version 361128 (0.0028) [2024-04-27 11:06:44,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5916819456. Throughput: 0: 53416.6. Samples: 407284180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:44,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 11:06:44,167][52242] Signal inference workers to stop experience collection... (6000 times) [2024-04-27 11:06:44,167][52242] Signal inference workers to resume experience collection... (6000 times) [2024-04-27 11:06:44,193][52263] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-04-27 11:06:44,193][52263] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-04-27 11:06:45,130][52263] Updated weights for policy 0, policy_version 361138 (0.0033) [2024-04-27 11:06:47,977][52263] Updated weights for policy 0, policy_version 361148 (0.0028) [2024-04-27 11:06:49,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.0, 300 sec: 53317.4). Total num frames: 5917114368. Throughput: 0: 53238.2. Samples: 407601660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:49,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 11:06:51,316][52263] Updated weights for policy 0, policy_version 361158 (0.0032) [2024-04-27 11:06:54,057][52263] Updated weights for policy 0, policy_version 361168 (0.0027) [2024-04-27 11:06:54,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5917376512. Throughput: 0: 53378.0. Samples: 407928880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:54,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 11:06:57,442][52263] Updated weights for policy 0, policy_version 361178 (0.0026) [2024-04-27 11:06:59,107][52031] Fps is (10 sec: 50790.6, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 5917622272. Throughput: 0: 53505.7. Samples: 408087620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:06:59,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 11:07:00,230][52263] Updated weights for policy 0, policy_version 361188 (0.0029) [2024-04-27 11:07:03,496][52263] Updated weights for policy 0, policy_version 361198 (0.0033) [2024-04-27 11:07:04,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5917884416. Throughput: 0: 53496.4. Samples: 408404920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:07:04,116][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 11:07:06,390][52263] Updated weights for policy 0, policy_version 361208 (0.0033) [2024-04-27 11:07:09,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5918162944. Throughput: 0: 53435.5. Samples: 408724840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-27 11:07:09,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 11:07:09,681][52263] Updated weights for policy 0, policy_version 361218 (0.0028) [2024-04-27 11:07:12,434][52263] Updated weights for policy 0, policy_version 361228 (0.0028) [2024-04-27 11:07:14,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5918425088. Throughput: 0: 53522.1. Samples: 408890320. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:14,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 11:07:15,860][52263] Updated weights for policy 0, policy_version 361238 (0.0035) [2024-04-27 11:07:18,494][52263] Updated weights for policy 0, policy_version 361248 (0.0033) [2024-04-27 11:07:19,106][52031] Fps is (10 sec: 57345.1, 60 sec: 54067.3, 300 sec: 53484.1). Total num frames: 5918736384. Throughput: 0: 53536.1. Samples: 409211600. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:19,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 11:07:22,050][52263] Updated weights for policy 0, policy_version 361258 (0.0034) [2024-04-27 11:07:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53317.5). Total num frames: 5918965760. Throughput: 0: 53689.9. Samples: 409535520. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:24,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:07:24,720][52263] Updated weights for policy 0, policy_version 361268 (0.0029) [2024-04-27 11:07:28,044][52263] Updated weights for policy 0, policy_version 361278 (0.0030) [2024-04-27 11:07:29,106][52031] Fps is (10 sec: 47513.5, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5919211520. Throughput: 0: 53401.7. Samples: 409687260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:29,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 11:07:30,674][52263] Updated weights for policy 0, policy_version 361288 (0.0034) [2024-04-27 11:07:34,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5919490048. Throughput: 0: 53433.9. Samples: 410006180. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:34,107][52031] Avg episode reward: [(0, '0.489')] [2024-04-27 11:07:34,212][52263] Updated weights for policy 0, policy_version 361298 (0.0027) [2024-04-27 11:07:34,215][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000361298_5919506432.pth... [2024-04-27 11:07:34,258][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000360516_5906694144.pth [2024-04-27 11:07:34,946][52242] Signal inference workers to stop experience collection... (6050 times) [2024-04-27 11:07:34,947][52242] Signal inference workers to resume experience collection... (6050 times) [2024-04-27 11:07:34,960][52263] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-04-27 11:07:34,960][52263] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-04-27 11:07:36,844][52263] Updated weights for policy 0, policy_version 361308 (0.0028) [2024-04-27 11:07:39,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 5919752192. Throughput: 0: 53267.4. Samples: 410325920. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:39,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 11:07:40,513][52263] Updated weights for policy 0, policy_version 361318 (0.0030) [2024-04-27 11:07:43,027][52263] Updated weights for policy 0, policy_version 361328 (0.0035) [2024-04-27 11:07:44,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 5920047104. Throughput: 0: 53456.0. Samples: 410493140. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:44,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 11:07:46,574][52263] Updated weights for policy 0, policy_version 361338 (0.0029) [2024-04-27 11:07:48,978][52263] Updated weights for policy 0, policy_version 361348 (0.0025) [2024-04-27 11:07:49,106][52031] Fps is (10 sec: 57345.0, 60 sec: 53521.2, 300 sec: 53317.4). Total num frames: 5920325632. Throughput: 0: 53629.4. Samples: 410818240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:49,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 11:07:52,593][52263] Updated weights for policy 0, policy_version 361358 (0.0028) [2024-04-27 11:07:54,107][52031] Fps is (10 sec: 49152.1, 60 sec: 52701.8, 300 sec: 53317.4). Total num frames: 5920538624. Throughput: 0: 53633.5. Samples: 411138340. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:54,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 11:07:55,162][52263] Updated weights for policy 0, policy_version 361368 (0.0032) [2024-04-27 11:07:58,755][52263] Updated weights for policy 0, policy_version 361378 (0.0025) [2024-04-27 11:07:59,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 5920833536. Throughput: 0: 53376.4. Samples: 411292260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:07:59,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 11:08:01,242][52263] Updated weights for policy 0, policy_version 361388 (0.0031) [2024-04-27 11:08:04,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.1, 300 sec: 53373.1). Total num frames: 5921095680. Throughput: 0: 53334.2. Samples: 411611640. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:08:04,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 11:08:04,952][52263] Updated weights for policy 0, policy_version 361398 (0.0030) [2024-04-27 11:08:07,400][52263] Updated weights for policy 0, policy_version 361408 (0.0035) [2024-04-27 11:08:09,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 5921374208. Throughput: 0: 53275.1. Samples: 411932900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:08:09,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 11:08:11,117][52263] Updated weights for policy 0, policy_version 361418 (0.0030) [2024-04-27 11:08:13,432][52263] Updated weights for policy 0, policy_version 361428 (0.0025) [2024-04-27 11:08:14,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.3, 300 sec: 53484.1). Total num frames: 5921652736. Throughput: 0: 53689.0. Samples: 412103260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:08:14,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 11:08:17,145][52263] Updated weights for policy 0, policy_version 361438 (0.0027) [2024-04-27 11:08:19,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 5921914880. Throughput: 0: 53730.3. Samples: 412424040. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:08:19,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:08:19,652][52263] Updated weights for policy 0, policy_version 361448 (0.0023) [2024-04-27 11:08:23,116][52263] Updated weights for policy 0, policy_version 361458 (0.0025) [2024-04-27 11:08:24,106][52031] Fps is (10 sec: 50790.1, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5922160640. Throughput: 0: 53760.6. Samples: 412745140. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:08:24,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 11:08:25,719][52263] Updated weights for policy 0, policy_version 361468 (0.0033) [2024-04-27 11:08:29,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 5922439168. Throughput: 0: 53424.6. Samples: 412897240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 11:08:29,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 11:08:29,302][52263] Updated weights for policy 0, policy_version 361478 (0.0027) [2024-04-27 11:08:31,669][52263] Updated weights for policy 0, policy_version 361488 (0.0031) [2024-04-27 11:08:34,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 5922717696. Throughput: 0: 53396.7. Samples: 413221100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:08:34,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 11:08:35,428][52263] Updated weights for policy 0, policy_version 361498 (0.0035) [2024-04-27 11:08:37,664][52242] Signal inference workers to stop experience collection... (6100 times) [2024-04-27 11:08:37,664][52242] Signal inference workers to resume experience collection... (6100 times) [2024-04-27 11:08:37,690][52263] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-04-27 11:08:37,690][52263] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-04-27 11:08:37,776][52263] Updated weights for policy 0, policy_version 361508 (0.0032) [2024-04-27 11:08:39,106][52031] Fps is (10 sec: 55705.5, 60 sec: 54067.3, 300 sec: 53484.0). Total num frames: 5922996224. Throughput: 0: 53394.7. Samples: 413541100. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:08:39,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 11:08:41,692][52263] Updated weights for policy 0, policy_version 361518 (0.0031) [2024-04-27 11:08:43,932][52263] Updated weights for policy 0, policy_version 361528 (0.0029) [2024-04-27 11:08:44,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 5923274752. Throughput: 0: 53747.6. Samples: 413710900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:08:44,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 11:08:47,656][52263] Updated weights for policy 0, policy_version 361538 (0.0035) [2024-04-27 11:08:49,106][52031] Fps is (10 sec: 50790.3, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 5923504128. Throughput: 0: 53788.8. Samples: 414032140. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:08:49,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 11:08:50,070][52263] Updated weights for policy 0, policy_version 361548 (0.0033) [2024-04-27 11:08:53,817][52263] Updated weights for policy 0, policy_version 361558 (0.0029) [2024-04-27 11:08:54,107][52031] Fps is (10 sec: 49151.9, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 5923766272. Throughput: 0: 53888.8. Samples: 414357900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:08:54,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 11:08:56,058][52263] Updated weights for policy 0, policy_version 361568 (0.0030) [2024-04-27 11:08:59,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 5924061184. Throughput: 0: 53246.0. Samples: 414499340. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:08:59,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 11:08:59,870][52263] Updated weights for policy 0, policy_version 361578 (0.0028) [2024-04-27 11:09:02,127][52263] Updated weights for policy 0, policy_version 361588 (0.0034) [2024-04-27 11:09:04,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 5924306944. Throughput: 0: 53234.1. Samples: 414819580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:09:04,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 11:09:05,941][52263] Updated weights for policy 0, policy_version 361598 (0.0030) [2024-04-27 11:09:08,284][52263] Updated weights for policy 0, policy_version 361608 (0.0029) [2024-04-27 11:09:09,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 5924601856. Throughput: 0: 53226.7. Samples: 415140340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:09:09,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 11:09:12,484][52263] Updated weights for policy 0, policy_version 361618 (0.0028) [2024-04-27 11:09:14,107][52031] Fps is (10 sec: 57343.2, 60 sec: 53793.9, 300 sec: 53484.0). Total num frames: 5924880384. Throughput: 0: 53844.3. Samples: 415320240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:09:14,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 11:09:14,372][52263] Updated weights for policy 0, policy_version 361628 (0.0031) [2024-04-27 11:09:18,432][52263] Updated weights for policy 0, policy_version 361638 (0.0031) [2024-04-27 11:09:19,107][52031] Fps is (10 sec: 49151.2, 60 sec: 52974.8, 300 sec: 53373.0). Total num frames: 5925093376. Throughput: 0: 53805.4. Samples: 415642340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:09:19,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 11:09:19,957][52242] Signal inference workers to stop experience collection... 
(6150 times) [2024-04-27 11:09:20,001][52263] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-04-27 11:09:20,016][52242] Signal inference workers to resume experience collection... (6150 times) [2024-04-27 11:09:20,021][52263] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-04-27 11:09:20,416][52263] Updated weights for policy 0, policy_version 361648 (0.0036) [2024-04-27 11:09:24,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53793.9, 300 sec: 53372.9). Total num frames: 5925388288. Throughput: 0: 53786.4. Samples: 415961500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:09:24,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 11:09:24,448][52263] Updated weights for policy 0, policy_version 361658 (0.0032) [2024-04-27 11:09:26,593][52263] Updated weights for policy 0, policy_version 361668 (0.0032) [2024-04-27 11:09:29,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5925650432. Throughput: 0: 53251.6. Samples: 416107220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:09:29,107][52031] Avg episode reward: [(0, '0.670')] [2024-04-27 11:09:30,516][52263] Updated weights for policy 0, policy_version 361678 (0.0036) [2024-04-27 11:09:32,682][52263] Updated weights for policy 0, policy_version 361688 (0.0038) [2024-04-27 11:09:34,107][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5925912576. Throughput: 0: 53352.4. Samples: 416433000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:09:34,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 11:09:34,167][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000361690_5925928960.pth... 
[2024-04-27 11:09:34,216][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000360905_5913067520.pth [2024-04-27 11:09:36,702][52263] Updated weights for policy 0, policy_version 361698 (0.0027) [2024-04-27 11:09:38,826][52263] Updated weights for policy 0, policy_version 361708 (0.0025) [2024-04-27 11:09:39,107][52031] Fps is (10 sec: 57344.0, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5926223872. Throughput: 0: 53191.6. Samples: 416751520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:09:39,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 11:09:42,905][52263] Updated weights for policy 0, policy_version 361718 (0.0036) [2024-04-27 11:09:44,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 5926453248. Throughput: 0: 53853.0. Samples: 416922720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 11:09:44,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 11:09:44,914][52263] Updated weights for policy 0, policy_version 361728 (0.0026) [2024-04-27 11:09:48,928][52263] Updated weights for policy 0, policy_version 361738 (0.0034) [2024-04-27 11:09:49,106][52031] Fps is (10 sec: 49152.7, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 5926715392. Throughput: 0: 53908.5. Samples: 417245460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:09:49,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 11:09:51,138][52263] Updated weights for policy 0, policy_version 361748 (0.0031) [2024-04-27 11:09:54,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 5926977536. Throughput: 0: 53803.5. Samples: 417561500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:09:54,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 11:09:55,026][52263] Updated weights for policy 0, policy_version 361758 (0.0027) [2024-04-27 11:09:57,341][52263] Updated weights for policy 0, policy_version 361768 (0.0029) [2024-04-27 11:09:59,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.2, 300 sec: 53484.0). Total num frames: 5927256064. Throughput: 0: 53283.3. Samples: 417717980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:09:59,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 11:10:01,112][52263] Updated weights for policy 0, policy_version 361778 (0.0029) [2024-04-27 11:10:03,582][52263] Updated weights for policy 0, policy_version 361788 (0.0029) [2024-04-27 11:10:04,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 5927534592. Throughput: 0: 53313.4. Samples: 418041440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:04,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 11:10:04,694][52242] Signal inference workers to stop experience collection... (6200 times) [2024-04-27 11:10:04,749][52242] Signal inference workers to resume experience collection... (6200 times) [2024-04-27 11:10:04,749][52263] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-04-27 11:10:04,772][52263] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-04-27 11:10:07,257][52263] Updated weights for policy 0, policy_version 361798 (0.0035) [2024-04-27 11:10:09,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 5927813120. Throughput: 0: 53401.7. Samples: 418364560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:09,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 11:10:09,696][52263] Updated weights for policy 0, policy_version 361808 (0.0024) [2024-04-27 11:10:13,441][52263] Updated weights for policy 0, policy_version 361818 (0.0028) [2024-04-27 11:10:14,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 5928042496. Throughput: 0: 53632.8. Samples: 418520700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:14,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 11:10:16,140][52263] Updated weights for policy 0, policy_version 361828 (0.0024) [2024-04-27 11:10:19,107][52031] Fps is (10 sec: 50789.6, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 5928321024. Throughput: 0: 53510.3. Samples: 418840960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:19,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 11:10:19,479][52263] Updated weights for policy 0, policy_version 361838 (0.0031) [2024-04-27 11:10:22,245][52263] Updated weights for policy 0, policy_version 361848 (0.0029) [2024-04-27 11:10:24,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 5928583168. Throughput: 0: 53434.8. Samples: 419156080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:24,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:10:26,186][52263] Updated weights for policy 0, policy_version 361858 (0.0025) [2024-04-27 11:10:28,413][52263] Updated weights for policy 0, policy_version 361868 (0.0033) [2024-04-27 11:10:29,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 5928878080. Throughput: 0: 53373.4. Samples: 419324520. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:29,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 11:10:32,371][52263] Updated weights for policy 0, policy_version 361878 (0.0030) [2024-04-27 11:10:34,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53794.1, 300 sec: 53539.5). Total num frames: 5929140224. Throughput: 0: 53353.9. Samples: 419646400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:34,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 11:10:34,503][52263] Updated weights for policy 0, policy_version 361888 (0.0028) [2024-04-27 11:10:38,423][52263] Updated weights for policy 0, policy_version 361898 (0.0032) [2024-04-27 11:10:39,106][52031] Fps is (10 sec: 50790.1, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 5929385984. Throughput: 0: 53328.4. Samples: 419961280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:39,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:10:40,977][52263] Updated weights for policy 0, policy_version 361908 (0.0032) [2024-04-27 11:10:44,106][52031] Fps is (10 sec: 49153.1, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 5929631744. Throughput: 0: 53147.1. Samples: 420109600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:44,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 11:10:44,692][52263] Updated weights for policy 0, policy_version 361918 (0.0027) [2024-04-27 11:10:46,985][52263] Updated weights for policy 0, policy_version 361928 (0.0024) [2024-04-27 11:10:49,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5929910272. Throughput: 0: 53070.4. Samples: 420429600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:49,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 11:10:50,886][52263] Updated weights for policy 0, policy_version 361938 (0.0028) [2024-04-27 11:10:53,085][52263] Updated weights for policy 0, policy_version 361948 (0.0025) [2024-04-27 11:10:54,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5930188800. Throughput: 0: 53029.7. Samples: 420750900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:54,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 11:10:56,919][52263] Updated weights for policy 0, policy_version 361958 (0.0034) [2024-04-27 11:10:57,171][52242] Signal inference workers to stop experience collection... (6250 times) [2024-04-27 11:10:57,171][52242] Signal inference workers to resume experience collection... (6250 times) [2024-04-27 11:10:57,190][52263] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-04-27 11:10:57,190][52263] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-04-27 11:10:59,106][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 5930467328. Throughput: 0: 53283.2. Samples: 420918440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:10:59,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 11:10:59,192][52263] Updated weights for policy 0, policy_version 361968 (0.0031) [2024-04-27 11:11:03,028][52263] Updated weights for policy 0, policy_version 361978 (0.0037) [2024-04-27 11:11:04,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 5930729472. Throughput: 0: 53214.4. Samples: 421235600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:11:04,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 11:11:05,495][52263] Updated weights for policy 0, policy_version 361988 (0.0029) [2024-04-27 11:11:09,107][52031] Fps is (10 sec: 49151.4, 60 sec: 52428.6, 300 sec: 53317.4). Total num frames: 5930958848. Throughput: 0: 53309.1. Samples: 421555000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:09,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 11:11:09,225][52263] Updated weights for policy 0, policy_version 361998 (0.0030) [2024-04-27 11:11:11,858][52263] Updated weights for policy 0, policy_version 362008 (0.0032) [2024-04-27 11:11:14,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 5931237376. Throughput: 0: 53000.3. Samples: 421709540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:14,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 11:11:15,310][52263] Updated weights for policy 0, policy_version 362018 (0.0033) [2024-04-27 11:11:17,887][52263] Updated weights for policy 0, policy_version 362028 (0.0028) [2024-04-27 11:11:19,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 5931515904. Throughput: 0: 52929.1. Samples: 422028200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:19,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 11:11:21,378][52263] Updated weights for policy 0, policy_version 362038 (0.0028) [2024-04-27 11:11:23,883][52263] Updated weights for policy 0, policy_version 362048 (0.0033) [2024-04-27 11:11:24,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 5931794432. Throughput: 0: 53099.9. Samples: 422350780. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:24,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 11:11:27,596][52263] Updated weights for policy 0, policy_version 362058 (0.0030) [2024-04-27 11:11:29,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 5932056576. Throughput: 0: 53489.7. Samples: 422516640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:29,115][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 11:11:30,170][52263] Updated weights for policy 0, policy_version 362068 (0.0026) [2024-04-27 11:11:33,753][52263] Updated weights for policy 0, policy_version 362078 (0.0034) [2024-04-27 11:11:34,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52975.1, 300 sec: 53428.5). Total num frames: 5932318720. Throughput: 0: 53439.5. Samples: 422834380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:34,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:11:34,249][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000362082_5932351488.pth... [2024-04-27 11:11:34,294][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000361298_5919506432.pth [2024-04-27 11:11:36,270][52263] Updated weights for policy 0, policy_version 362088 (0.0028) [2024-04-27 11:11:39,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 5932564480. Throughput: 0: 53403.5. Samples: 423154060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:39,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 11:11:39,754][52263] Updated weights for policy 0, policy_version 362098 (0.0031) [2024-04-27 11:11:42,308][52263] Updated weights for policy 0, policy_version 362108 (0.0028) [2024-04-27 11:11:44,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53317.5). Total num frames: 5932843008. Throughput: 0: 53040.5. Samples: 423305260. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:44,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 11:11:45,828][52263] Updated weights for policy 0, policy_version 362118 (0.0030) [2024-04-27 11:11:48,457][52263] Updated weights for policy 0, policy_version 362128 (0.0029) [2024-04-27 11:11:49,106][52031] Fps is (10 sec: 57343.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 5933137920. Throughput: 0: 53143.5. Samples: 423627060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:49,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 11:11:51,943][52242] Signal inference workers to stop experience collection... (6300 times) [2024-04-27 11:11:51,966][52263] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-04-27 11:11:52,008][52242] Signal inference workers to resume experience collection... (6300 times) [2024-04-27 11:11:52,008][52263] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-04-27 11:11:52,011][52263] Updated weights for policy 0, policy_version 362138 (0.0029) [2024-04-27 11:11:54,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 5933400064. Throughput: 0: 53247.8. Samples: 423951140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:54,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 11:11:54,475][52263] Updated weights for policy 0, policy_version 362148 (0.0032) [2024-04-27 11:11:58,076][52263] Updated weights for policy 0, policy_version 362158 (0.0037) [2024-04-27 11:11:59,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 5933662208. Throughput: 0: 53499.7. Samples: 424117020. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:11:59,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 11:12:00,705][52263] Updated weights for policy 0, policy_version 362168 (0.0034) [2024-04-27 11:12:04,081][52263] Updated weights for policy 0, policy_version 362178 (0.0027) [2024-04-27 11:12:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5933924352. Throughput: 0: 53580.1. Samples: 424439300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:12:04,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 11:12:06,858][52263] Updated weights for policy 0, policy_version 362188 (0.0037) [2024-04-27 11:12:09,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 5934186496. Throughput: 0: 53517.3. Samples: 424759060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:12:09,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 11:12:10,087][52263] Updated weights for policy 0, policy_version 362198 (0.0031) [2024-04-27 11:12:13,037][52263] Updated weights for policy 0, policy_version 362208 (0.0032) [2024-04-27 11:12:14,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53317.4). Total num frames: 5934465024. Throughput: 0: 53398.6. Samples: 424919580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:12:14,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 11:12:16,276][52263] Updated weights for policy 0, policy_version 362218 (0.0031) [2024-04-27 11:12:19,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5934727168. Throughput: 0: 53354.6. Samples: 425235340. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 11:12:19,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 11:12:19,121][52263] Updated weights for policy 0, policy_version 362228 (0.0031) [2024-04-27 11:12:22,280][52263] Updated weights for policy 0, policy_version 362238 (0.0033) [2024-04-27 11:12:24,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 5935005696. Throughput: 0: 53363.0. Samples: 425555400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:12:24,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 11:12:25,072][52263] Updated weights for policy 0, policy_version 362248 (0.0028) [2024-04-27 11:12:28,364][52263] Updated weights for policy 0, policy_version 362258 (0.0030) [2024-04-27 11:12:29,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5935284224. Throughput: 0: 53914.1. Samples: 425731400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:12:29,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 11:12:31,101][52263] Updated weights for policy 0, policy_version 362268 (0.0036) [2024-04-27 11:12:34,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 5935529984. Throughput: 0: 53872.9. Samples: 426051340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:12:34,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 11:12:34,471][52263] Updated weights for policy 0, policy_version 362278 (0.0031) [2024-04-27 11:12:37,184][52263] Updated weights for policy 0, policy_version 362288 (0.0028) [2024-04-27 11:12:39,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 5935792128. Throughput: 0: 53840.4. Samples: 426373960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:12:39,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 11:12:40,681][52263] Updated weights for policy 0, policy_version 362298 (0.0029) [2024-04-27 11:12:41,333][52242] Signal inference workers to stop experience collection... (6350 times) [2024-04-27 11:12:41,377][52263] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-04-27 11:12:41,406][52242] Signal inference workers to resume experience collection... (6350 times) [2024-04-27 11:12:41,410][52263] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-04-27 11:12:43,237][52263] Updated weights for policy 0, policy_version 362308 (0.0031) [2024-04-27 11:12:44,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 5936070656. Throughput: 0: 53580.4. Samples: 426528140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:12:44,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 11:12:46,686][52263] Updated weights for policy 0, policy_version 362318 (0.0032) [2024-04-27 11:12:49,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 5936365568. Throughput: 0: 53563.5. Samples: 426849660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:12:49,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 11:12:49,324][52263] Updated weights for policy 0, policy_version 362328 (0.0035) [2024-04-27 11:12:52,715][52263] Updated weights for policy 0, policy_version 362338 (0.0029) [2024-04-27 11:12:54,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 5936611328. Throughput: 0: 53576.2. Samples: 427169980. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:12:54,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 11:12:55,469][52263] Updated weights for policy 0, policy_version 362348 (0.0026) [2024-04-27 11:12:58,782][52263] Updated weights for policy 0, policy_version 362358 (0.0029) [2024-04-27 11:12:59,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5936889856. Throughput: 0: 53783.6. Samples: 427339840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:12:59,116][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 11:13:02,058][52263] Updated weights for policy 0, policy_version 362368 (0.0027) [2024-04-27 11:13:04,107][52031] Fps is (10 sec: 50789.4, 60 sec: 53247.8, 300 sec: 53372.9). Total num frames: 5937119232. Throughput: 0: 53895.9. Samples: 427660660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:13:04,115][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 11:13:05,010][52263] Updated weights for policy 0, policy_version 362378 (0.0035) [2024-04-27 11:13:08,314][52263] Updated weights for policy 0, policy_version 362388 (0.0032) [2024-04-27 11:13:09,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 5937397760. Throughput: 0: 53884.3. Samples: 427980200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:13:09,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 11:13:11,137][52263] Updated weights for policy 0, policy_version 362398 (0.0031) [2024-04-27 11:13:14,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5937676288. Throughput: 0: 53311.6. Samples: 428130420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:13:14,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 11:13:14,330][52263] Updated weights for policy 0, policy_version 362408 (0.0026) [2024-04-27 11:13:17,268][52263] Updated weights for policy 0, policy_version 362418 (0.0030) [2024-04-27 11:13:19,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5937954816. Throughput: 0: 53312.3. Samples: 428450400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:13:19,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 11:13:20,340][52263] Updated weights for policy 0, policy_version 362428 (0.0034) [2024-04-27 11:13:23,342][52263] Updated weights for policy 0, policy_version 362438 (0.0029) [2024-04-27 11:13:24,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 5938216960. Throughput: 0: 53361.3. Samples: 428775220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:13:24,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 11:13:26,478][52263] Updated weights for policy 0, policy_version 362448 (0.0041) [2024-04-27 11:13:29,106][52031] Fps is (10 sec: 50791.5, 60 sec: 52975.1, 300 sec: 53373.0). Total num frames: 5938462720. Throughput: 0: 53520.6. Samples: 428936560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:13:29,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 11:13:29,439][52263] Updated weights for policy 0, policy_version 362458 (0.0026) [2024-04-27 11:13:32,753][52263] Updated weights for policy 0, policy_version 362468 (0.0031) [2024-04-27 11:13:34,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5938724864. Throughput: 0: 53443.1. Samples: 429254600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:13:34,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 11:13:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000362471_5938724864.pth... [2024-04-27 11:13:34,166][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000361690_5925928960.pth [2024-04-27 11:13:35,543][52263] Updated weights for policy 0, policy_version 362478 (0.0030) [2024-04-27 11:13:38,768][52263] Updated weights for policy 0, policy_version 362488 (0.0031) [2024-04-27 11:13:39,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5939003392. Throughput: 0: 53337.3. Samples: 429570160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 11:13:39,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 11:13:41,658][52263] Updated weights for policy 0, policy_version 362498 (0.0028) [2024-04-27 11:13:44,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5939298304. Throughput: 0: 53202.6. Samples: 429733960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:13:44,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 11:13:45,183][52263] Updated weights for policy 0, policy_version 362508 (0.0035) [2024-04-27 11:13:47,798][52263] Updated weights for policy 0, policy_version 362518 (0.0029) [2024-04-27 11:13:49,106][52031] Fps is (10 sec: 54067.0, 60 sec: 52974.9, 300 sec: 53484.1). Total num frames: 5939544064. Throughput: 0: 53144.6. Samples: 430052160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:13:49,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 11:13:51,349][52263] Updated weights for policy 0, policy_version 362528 (0.0031) [2024-04-27 11:13:53,544][52242] Signal inference workers to stop experience collection... (6400 times) [2024-04-27 11:13:53,551][52242] Signal inference workers to resume experience collection... 
(6400 times) [2024-04-27 11:13:53,566][52263] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-04-27 11:13:53,566][52263] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-04-27 11:13:53,791][52263] Updated weights for policy 0, policy_version 362538 (0.0025) [2024-04-27 11:13:54,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 5939822592. Throughput: 0: 53203.6. Samples: 430374360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:13:54,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 11:13:57,484][52263] Updated weights for policy 0, policy_version 362548 (0.0031) [2024-04-27 11:13:59,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52701.9, 300 sec: 53373.0). Total num frames: 5940051968. Throughput: 0: 53312.6. Samples: 430529480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:13:59,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 11:14:00,036][52263] Updated weights for policy 0, policy_version 362558 (0.0035) [2024-04-27 11:14:03,744][52263] Updated weights for policy 0, policy_version 362568 (0.0029) [2024-04-27 11:14:04,107][52031] Fps is (10 sec: 49151.7, 60 sec: 53248.0, 300 sec: 53261.8). Total num frames: 5940314112. Throughput: 0: 53218.2. Samples: 430845220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:04,107][52031] Avg episode reward: [(0, '0.484')] [2024-04-27 11:14:06,094][52263] Updated weights for policy 0, policy_version 362578 (0.0026) [2024-04-27 11:14:09,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.2, 300 sec: 53317.5). Total num frames: 5940609024. Throughput: 0: 53095.6. Samples: 431164520. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:09,107][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 11:14:09,894][52263] Updated weights for policy 0, policy_version 362588 (0.0032) [2024-04-27 11:14:12,208][52263] Updated weights for policy 0, policy_version 362598 (0.0029) [2024-04-27 11:14:14,107][52031] Fps is (10 sec: 58982.5, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 5940903936. Throughput: 0: 53452.2. Samples: 431341920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:14,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 11:14:15,908][52263] Updated weights for policy 0, policy_version 362608 (0.0036) [2024-04-27 11:14:18,499][52263] Updated weights for policy 0, policy_version 362618 (0.0030) [2024-04-27 11:14:19,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 5941149696. Throughput: 0: 53485.8. Samples: 431661460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:19,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 11:14:21,951][52263] Updated weights for policy 0, policy_version 362628 (0.0035) [2024-04-27 11:14:24,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 5941428224. Throughput: 0: 53489.7. Samples: 431977200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:24,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 11:14:24,572][52263] Updated weights for policy 0, policy_version 362638 (0.0035) [2024-04-27 11:14:28,152][52263] Updated weights for policy 0, policy_version 362648 (0.0033) [2024-04-27 11:14:29,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5941657600. Throughput: 0: 53285.0. Samples: 432131780. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:29,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 11:14:30,751][52263] Updated weights for policy 0, policy_version 362658 (0.0029) [2024-04-27 11:14:34,107][52031] Fps is (10 sec: 49151.7, 60 sec: 53247.9, 300 sec: 53206.3). Total num frames: 5941919744. Throughput: 0: 53288.3. Samples: 432450140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:34,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 11:14:34,635][52263] Updated weights for policy 0, policy_version 362668 (0.0037) [2024-04-27 11:14:36,776][52263] Updated weights for policy 0, policy_version 362678 (0.0029) [2024-04-27 11:14:39,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 5942214656. Throughput: 0: 53217.0. Samples: 432769120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:39,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 11:14:40,588][52263] Updated weights for policy 0, policy_version 362688 (0.0039) [2024-04-27 11:14:42,875][52263] Updated weights for policy 0, policy_version 362698 (0.0029) [2024-04-27 11:14:44,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 5942493184. Throughput: 0: 53625.6. Samples: 432942640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:44,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 11:14:46,922][52263] Updated weights for policy 0, policy_version 362708 (0.0029) [2024-04-27 11:14:49,051][52263] Updated weights for policy 0, policy_version 362718 (0.0038) [2024-04-27 11:14:49,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.0, 300 sec: 53539.5). Total num frames: 5942771712. Throughput: 0: 53762.7. Samples: 433264540. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:49,108][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 11:14:53,129][52263] Updated weights for policy 0, policy_version 362728 (0.0032) [2024-04-27 11:14:53,583][52242] Signal inference workers to stop experience collection... (6450 times) [2024-04-27 11:14:53,629][52263] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-04-27 11:14:53,641][52242] Signal inference workers to resume experience collection... (6450 times) [2024-04-27 11:14:53,647][52263] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-04-27 11:14:54,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 5943001088. Throughput: 0: 53870.2. Samples: 433588680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 11:14:54,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 11:14:55,222][52263] Updated weights for policy 0, policy_version 362738 (0.0027) [2024-04-27 11:14:59,107][52031] Fps is (10 sec: 47513.7, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 5943246848. Throughput: 0: 53248.0. Samples: 433738080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:14:59,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 11:14:59,212][52263] Updated weights for policy 0, policy_version 362748 (0.0034) [2024-04-27 11:15:01,294][52263] Updated weights for policy 0, policy_version 362758 (0.0032) [2024-04-27 11:15:04,107][52031] Fps is (10 sec: 55705.0, 60 sec: 54067.3, 300 sec: 53372.9). Total num frames: 5943558144. Throughput: 0: 53261.2. Samples: 434058220. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:04,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 11:15:05,243][52263] Updated weights for policy 0, policy_version 362768 (0.0030) [2024-04-27 11:15:07,640][52263] Updated weights for policy 0, policy_version 362778 (0.0036) [2024-04-27 11:15:09,106][52031] Fps is (10 sec: 58983.0, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5943836672. Throughput: 0: 53378.2. Samples: 434379220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:09,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 11:15:11,240][52263] Updated weights for policy 0, policy_version 362788 (0.0032) [2024-04-27 11:15:13,698][52263] Updated weights for policy 0, policy_version 362798 (0.0034) [2024-04-27 11:15:14,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 5944082432. Throughput: 0: 53689.1. Samples: 434547800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:14,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 11:15:17,348][52263] Updated weights for policy 0, policy_version 362808 (0.0028) [2024-04-27 11:15:19,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 5944377344. Throughput: 0: 53724.4. Samples: 434867740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:19,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 11:15:19,925][52263] Updated weights for policy 0, policy_version 362818 (0.0031) [2024-04-27 11:15:23,593][52263] Updated weights for policy 0, policy_version 362828 (0.0035) [2024-04-27 11:15:24,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 5944606720. Throughput: 0: 53862.1. Samples: 435192920. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:24,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 11:15:25,901][52263] Updated weights for policy 0, policy_version 362838 (0.0036) [2024-04-27 11:15:29,107][52031] Fps is (10 sec: 49152.1, 60 sec: 53520.9, 300 sec: 53317.4). Total num frames: 5944868864. Throughput: 0: 53308.9. Samples: 435341540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:29,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 11:15:29,683][52263] Updated weights for policy 0, policy_version 362848 (0.0031) [2024-04-27 11:15:31,903][52263] Updated weights for policy 0, policy_version 362858 (0.0027) [2024-04-27 11:15:34,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 5945147392. Throughput: 0: 53329.1. Samples: 435664340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:34,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 11:15:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000362864_5945163776.pth... [2024-04-27 11:15:34,161][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000362082_5932351488.pth [2024-04-27 11:15:35,747][52263] Updated weights for policy 0, policy_version 362868 (0.0028) [2024-04-27 11:15:38,250][52263] Updated weights for policy 0, policy_version 362878 (0.0031) [2024-04-27 11:15:39,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 5945425920. Throughput: 0: 53190.2. Samples: 435982240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:39,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 11:15:41,846][52263] Updated weights for policy 0, policy_version 362888 (0.0031) [2024-04-27 11:15:44,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 5945688064. Throughput: 0: 53588.1. Samples: 436149540. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:44,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 11:15:44,556][52263] Updated weights for policy 0, policy_version 362898 (0.0027) [2024-04-27 11:15:48,083][52263] Updated weights for policy 0, policy_version 362908 (0.0036) [2024-04-27 11:15:49,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 5945966592. Throughput: 0: 53607.6. Samples: 436470560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:49,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 11:15:50,613][52263] Updated weights for policy 0, policy_version 362918 (0.0032) [2024-04-27 11:15:54,081][52263] Updated weights for policy 0, policy_version 362928 (0.0031) [2024-04-27 11:15:54,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 5946212352. Throughput: 0: 53548.5. Samples: 436788900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:54,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 11:15:56,974][52263] Updated weights for policy 0, policy_version 362938 (0.0028) [2024-04-27 11:15:59,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53794.3, 300 sec: 53373.0). Total num frames: 5946474496. Throughput: 0: 53262.4. Samples: 436944600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:15:59,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 11:16:00,062][52263] Updated weights for policy 0, policy_version 362948 (0.0030) [2024-04-27 11:16:03,017][52263] Updated weights for policy 0, policy_version 362958 (0.0031) [2024-04-27 11:16:04,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 5946753024. Throughput: 0: 53137.9. Samples: 437258940. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:16:04,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 11:16:06,361][52263] Updated weights for policy 0, policy_version 362968 (0.0025) [2024-04-27 11:16:06,785][52242] Signal inference workers to stop experience collection... (6500 times) [2024-04-27 11:16:06,785][52242] Signal inference workers to resume experience collection... (6500 times) [2024-04-27 11:16:06,816][52263] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-04-27 11:16:06,817][52263] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-04-27 11:16:09,106][52031] Fps is (10 sec: 54067.0, 60 sec: 52974.9, 300 sec: 53484.1). Total num frames: 5947015168. Throughput: 0: 53017.4. Samples: 437578700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:16:09,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 11:16:09,214][52263] Updated weights for policy 0, policy_version 362978 (0.0031) [2024-04-27 11:16:12,554][52263] Updated weights for policy 0, policy_version 362988 (0.0033) [2024-04-27 11:16:14,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5947310080. Throughput: 0: 53419.1. Samples: 437745400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 11:16:14,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 11:16:15,319][52263] Updated weights for policy 0, policy_version 362998 (0.0026) [2024-04-27 11:16:18,778][52263] Updated weights for policy 0, policy_version 363008 (0.0036) [2024-04-27 11:16:19,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52702.0, 300 sec: 53373.0). Total num frames: 5947539456. Throughput: 0: 53322.7. Samples: 438063860. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:16:19,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 11:16:21,401][52263] Updated weights for policy 0, policy_version 363018 (0.0035) [2024-04-27 11:16:24,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5947817984. Throughput: 0: 53300.7. Samples: 438380780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:16:24,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 11:16:24,861][52263] Updated weights for policy 0, policy_version 363028 (0.0030) [2024-04-27 11:16:27,649][52263] Updated weights for policy 0, policy_version 363038 (0.0035) [2024-04-27 11:16:29,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 5948080128. Throughput: 0: 53064.8. Samples: 438537460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:16:29,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 11:16:31,029][52263] Updated weights for policy 0, policy_version 363048 (0.0026) [2024-04-27 11:16:33,736][52263] Updated weights for policy 0, policy_version 363058 (0.0030) [2024-04-27 11:16:34,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 5948358656. Throughput: 0: 53037.4. Samples: 438857240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:16:34,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 11:16:37,159][52263] Updated weights for policy 0, policy_version 363068 (0.0032) [2024-04-27 11:16:39,107][52031] Fps is (10 sec: 52429.1, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 5948604416. Throughput: 0: 53044.8. Samples: 439175920. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:16:39,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 11:16:39,798][52263] Updated weights for policy 0, policy_version 363078 (0.0032) [2024-04-27 11:16:43,305][52263] Updated weights for policy 0, policy_version 363088 (0.0028) [2024-04-27 11:16:44,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5948882944. Throughput: 0: 53421.3. Samples: 439348560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:16:44,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 11:16:46,054][52263] Updated weights for policy 0, policy_version 363098 (0.0030) [2024-04-27 11:16:49,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52701.9, 300 sec: 53317.4). Total num frames: 5949128704. Throughput: 0: 53455.2. Samples: 439664420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:16:49,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 11:16:49,313][52263] Updated weights for policy 0, policy_version 363108 (0.0024) [2024-04-27 11:16:52,502][52263] Updated weights for policy 0, policy_version 363118 (0.0030) [2024-04-27 11:16:54,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5949407232. Throughput: 0: 53460.5. Samples: 439984420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:16:54,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 11:16:55,335][52263] Updated weights for policy 0, policy_version 363128 (0.0030) [2024-04-27 11:16:56,407][52242] Signal inference workers to stop experience collection... (6550 times) [2024-04-27 11:16:56,440][52263] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-04-27 11:16:56,469][52242] Signal inference workers to resume experience collection... 
(6550 times) [2024-04-27 11:16:56,469][52263] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-04-27 11:16:58,450][52263] Updated weights for policy 0, policy_version 363138 (0.0031) [2024-04-27 11:16:59,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 5949685760. Throughput: 0: 53121.7. Samples: 440135880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:16:59,108][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 11:17:01,396][52263] Updated weights for policy 0, policy_version 363148 (0.0029) [2024-04-27 11:17:04,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5949947904. Throughput: 0: 53265.8. Samples: 440460820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:17:04,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 11:17:04,382][52263] Updated weights for policy 0, policy_version 363158 (0.0030) [2024-04-27 11:17:07,396][52263] Updated weights for policy 0, policy_version 363168 (0.0032) [2024-04-27 11:17:09,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 5950226432. Throughput: 0: 53413.9. Samples: 440784400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:17:09,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 11:17:10,546][52263] Updated weights for policy 0, policy_version 363178 (0.0030) [2024-04-27 11:17:13,575][52263] Updated weights for policy 0, policy_version 363188 (0.0031) [2024-04-27 11:17:14,107][52031] Fps is (10 sec: 54066.5, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 5950488576. Throughput: 0: 53563.6. Samples: 440947820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:17:14,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 11:17:16,762][52263] Updated weights for policy 0, policy_version 363198 (0.0029) [2024-04-27 11:17:19,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 5950750720. Throughput: 0: 53493.9. Samples: 441264480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:17:19,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 11:17:19,720][52263] Updated weights for policy 0, policy_version 363208 (0.0033) [2024-04-27 11:17:22,844][52263] Updated weights for policy 0, policy_version 363218 (0.0029) [2024-04-27 11:17:24,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5951012864. Throughput: 0: 53629.0. Samples: 441589220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:17:24,107][52031] Avg episode reward: [(0, '0.503')] [2024-04-27 11:17:25,734][52263] Updated weights for policy 0, policy_version 363228 (0.0033) [2024-04-27 11:17:28,879][52263] Updated weights for policy 0, policy_version 363238 (0.0037) [2024-04-27 11:17:29,107][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 5951291392. Throughput: 0: 53364.8. Samples: 441749980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:17:29,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 11:17:31,823][52263] Updated weights for policy 0, policy_version 363248 (0.0028) [2024-04-27 11:17:34,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53247.8, 300 sec: 53428.5). Total num frames: 5951553536. Throughput: 0: 53449.1. Samples: 442069640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 11:17:34,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 11:17:34,162][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000363255_5951569920.pth... 
[2024-04-27 11:17:34,208][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000362471_5938724864.pth [2024-04-27 11:17:35,136][52263] Updated weights for policy 0, policy_version 363258 (0.0032) [2024-04-27 11:17:37,923][52263] Updated weights for policy 0, policy_version 363268 (0.0028) [2024-04-27 11:17:39,106][52031] Fps is (10 sec: 55706.1, 60 sec: 54067.3, 300 sec: 53484.1). Total num frames: 5951848448. Throughput: 0: 53369.8. Samples: 442386060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:17:39,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 11:17:41,260][52263] Updated weights for policy 0, policy_version 363278 (0.0031) [2024-04-27 11:17:44,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53520.9, 300 sec: 53317.4). Total num frames: 5952094208. Throughput: 0: 53687.1. Samples: 442551800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:17:44,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 11:17:44,190][52263] Updated weights for policy 0, policy_version 363288 (0.0031) [2024-04-27 11:17:47,342][52263] Updated weights for policy 0, policy_version 363298 (0.0031) [2024-04-27 11:17:49,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 5952356352. Throughput: 0: 53507.0. Samples: 442868640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:17:49,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 11:17:50,256][52263] Updated weights for policy 0, policy_version 363308 (0.0029) [2024-04-27 11:17:53,471][52263] Updated weights for policy 0, policy_version 363318 (0.0030) [2024-04-27 11:17:54,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5952618496. Throughput: 0: 53470.2. Samples: 443190560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:17:54,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 11:17:56,257][52263] Updated weights for policy 0, policy_version 363328 (0.0029) [2024-04-27 11:17:59,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 5952897024. Throughput: 0: 53357.5. Samples: 443348900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:17:59,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 11:17:59,797][52263] Updated weights for policy 0, policy_version 363338 (0.0029) [2024-04-27 11:18:02,827][52263] Updated weights for policy 0, policy_version 363348 (0.0030) [2024-04-27 11:18:04,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53793.9, 300 sec: 53484.0). Total num frames: 5953175552. Throughput: 0: 53447.1. Samples: 443669600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:04,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 11:18:05,849][52263] Updated weights for policy 0, policy_version 363358 (0.0028) [2024-04-27 11:18:09,103][52263] Updated weights for policy 0, policy_version 363368 (0.0026) [2024-04-27 11:18:09,106][52031] Fps is (10 sec: 52428.2, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5953421312. Throughput: 0: 53292.8. Samples: 443987400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:09,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 11:18:09,609][52242] Signal inference workers to stop experience collection... (6600 times) [2024-04-27 11:18:09,644][52263] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-04-27 11:18:09,662][52242] Signal inference workers to resume experience collection... 
(6600 times) [2024-04-27 11:18:09,662][52263] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-04-27 11:18:12,013][52263] Updated weights for policy 0, policy_version 363378 (0.0035) [2024-04-27 11:18:14,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5953699840. Throughput: 0: 53320.1. Samples: 444149380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:14,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 11:18:15,135][52263] Updated weights for policy 0, policy_version 363388 (0.0033) [2024-04-27 11:18:18,082][52263] Updated weights for policy 0, policy_version 363398 (0.0029) [2024-04-27 11:18:19,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5953945600. Throughput: 0: 53388.5. Samples: 444472120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:19,116][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 11:18:21,170][52263] Updated weights for policy 0, policy_version 363408 (0.0033) [2024-04-27 11:18:24,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5954224128. Throughput: 0: 53471.0. Samples: 444792260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:24,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 11:18:24,225][52263] Updated weights for policy 0, policy_version 363418 (0.0034) [2024-04-27 11:18:27,371][52263] Updated weights for policy 0, policy_version 363428 (0.0030) [2024-04-27 11:18:29,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 5954486272. Throughput: 0: 53158.0. Samples: 444943900. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:29,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 11:18:30,738][52263] Updated weights for policy 0, policy_version 363438 (0.0030) [2024-04-27 11:18:33,601][52263] Updated weights for policy 0, policy_version 363448 (0.0037) [2024-04-27 11:18:34,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 5954764800. Throughput: 0: 53193.0. Samples: 445262320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:34,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 11:18:36,739][52263] Updated weights for policy 0, policy_version 363458 (0.0032) [2024-04-27 11:18:39,107][52031] Fps is (10 sec: 54066.8, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 5955026944. Throughput: 0: 53077.3. Samples: 445579040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:39,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 11:18:39,682][52263] Updated weights for policy 0, policy_version 363468 (0.0031) [2024-04-27 11:18:42,706][52263] Updated weights for policy 0, policy_version 363478 (0.0027) [2024-04-27 11:18:44,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 5955289088. Throughput: 0: 53186.5. Samples: 445742300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:44,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 11:18:45,823][52263] Updated weights for policy 0, policy_version 363488 (0.0030) [2024-04-27 11:18:48,843][52263] Updated weights for policy 0, policy_version 363498 (0.0027) [2024-04-27 11:18:49,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5955551232. Throughput: 0: 53180.3. Samples: 446062700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:18:49,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 11:18:51,915][52263] Updated weights for policy 0, policy_version 363508 (0.0030) [2024-04-27 11:18:54,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5955813376. Throughput: 0: 53225.4. Samples: 446382540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:18:54,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 11:18:55,246][52263] Updated weights for policy 0, policy_version 363518 (0.0026) [2024-04-27 11:18:58,050][52263] Updated weights for policy 0, policy_version 363528 (0.0025) [2024-04-27 11:18:59,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 5956091904. Throughput: 0: 53120.1. Samples: 446539780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:18:59,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 11:19:01,496][52263] Updated weights for policy 0, policy_version 363538 (0.0027) [2024-04-27 11:19:04,069][52263] Updated weights for policy 0, policy_version 363548 (0.0031) [2024-04-27 11:19:04,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 5956370432. Throughput: 0: 53164.5. Samples: 446864520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:04,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 11:19:07,731][52263] Updated weights for policy 0, policy_version 363558 (0.0031) [2024-04-27 11:19:09,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 53261.9). Total num frames: 5956616192. Throughput: 0: 53154.0. Samples: 447184180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:09,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 11:19:10,110][52263] Updated weights for policy 0, policy_version 363568 (0.0032) [2024-04-27 11:19:13,707][52263] Updated weights for policy 0, policy_version 363578 (0.0028) [2024-04-27 11:19:14,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 5956894720. Throughput: 0: 53539.6. Samples: 447353180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:14,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 11:19:15,727][52242] Signal inference workers to stop experience collection... (6650 times) [2024-04-27 11:19:15,754][52263] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-04-27 11:19:15,828][52242] Signal inference workers to resume experience collection... (6650 times) [2024-04-27 11:19:15,829][52263] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-04-27 11:19:16,349][52263] Updated weights for policy 0, policy_version 363588 (0.0029) [2024-04-27 11:19:19,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.2, 300 sec: 53317.4). Total num frames: 5957156864. Throughput: 0: 53532.0. Samples: 447671260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:19,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 11:19:19,773][52263] Updated weights for policy 0, policy_version 363598 (0.0029) [2024-04-27 11:19:22,354][52263] Updated weights for policy 0, policy_version 363608 (0.0027) [2024-04-27 11:19:24,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 5957435392. Throughput: 0: 53640.0. Samples: 447992840. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:24,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 11:19:25,834][52263] Updated weights for policy 0, policy_version 363618 (0.0031) [2024-04-27 11:19:28,445][52263] Updated weights for policy 0, policy_version 363628 (0.0035) [2024-04-27 11:19:29,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5957713920. Throughput: 0: 53584.1. Samples: 448153580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:29,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 11:19:31,808][52263] Updated weights for policy 0, policy_version 363638 (0.0035) [2024-04-27 11:19:34,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5957959680. Throughput: 0: 53617.9. Samples: 448475500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:34,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 11:19:34,191][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000363646_5957976064.pth... [2024-04-27 11:19:34,235][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000362864_5945163776.pth [2024-04-27 11:19:34,517][52263] Updated weights for policy 0, policy_version 363648 (0.0026) [2024-04-27 11:19:37,984][52263] Updated weights for policy 0, policy_version 363658 (0.0035) [2024-04-27 11:19:39,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5958238208. Throughput: 0: 53665.7. Samples: 448797500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:39,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 11:19:40,806][52263] Updated weights for policy 0, policy_version 363668 (0.0030) [2024-04-27 11:19:44,039][52263] Updated weights for policy 0, policy_version 363678 (0.0029) [2024-04-27 11:19:44,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53317.5). 
Total num frames: 5958500352. Throughput: 0: 53830.2. Samples: 448962140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:44,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 11:19:46,761][52263] Updated weights for policy 0, policy_version 363688 (0.0028) [2024-04-27 11:19:49,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 5958778880. Throughput: 0: 53769.4. Samples: 449284140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:49,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 11:19:50,163][52263] Updated weights for policy 0, policy_version 363698 (0.0028) [2024-04-27 11:19:52,924][52263] Updated weights for policy 0, policy_version 363708 (0.0033) [2024-04-27 11:19:54,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 5959041024. Throughput: 0: 53679.5. Samples: 449599760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:54,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 11:19:56,187][52263] Updated weights for policy 0, policy_version 363718 (0.0028) [2024-04-27 11:19:59,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 5959303168. Throughput: 0: 53491.3. Samples: 449760300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:19:59,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 11:19:59,199][52263] Updated weights for policy 0, policy_version 363728 (0.0032) [2024-04-27 11:20:02,317][52263] Updated weights for policy 0, policy_version 363738 (0.0039) [2024-04-27 11:20:04,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5959565312. Throughput: 0: 53460.0. Samples: 450076960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:20:04,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 11:20:05,346][52263] Updated weights for policy 0, policy_version 363748 (0.0027) [2024-04-27 11:20:08,423][52263] Updated weights for policy 0, policy_version 363758 (0.0031) [2024-04-27 11:20:09,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 5959843840. Throughput: 0: 53466.6. Samples: 450398840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 11:20:09,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 11:20:11,425][52263] Updated weights for policy 0, policy_version 363768 (0.0035) [2024-04-27 11:20:14,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5960105984. Throughput: 0: 53423.6. Samples: 450557640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:14,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 11:20:14,500][52263] Updated weights for policy 0, policy_version 363778 (0.0028) [2024-04-27 11:20:17,605][52263] Updated weights for policy 0, policy_version 363788 (0.0031) [2024-04-27 11:20:19,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5960368128. Throughput: 0: 53465.6. Samples: 450881460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:19,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:20:20,564][52263] Updated weights for policy 0, policy_version 363798 (0.0029) [2024-04-27 11:20:23,841][52263] Updated weights for policy 0, policy_version 363808 (0.0036) [2024-04-27 11:20:24,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 5960646656. Throughput: 0: 53413.8. Samples: 451201120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:24,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 11:20:26,586][52242] Signal inference workers to stop experience collection... 
(6700 times) [2024-04-27 11:20:26,587][52242] Signal inference workers to resume experience collection... (6700 times) [2024-04-27 11:20:26,614][52263] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-04-27 11:20:26,615][52263] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-04-27 11:20:26,718][52263] Updated weights for policy 0, policy_version 363818 (0.0030) [2024-04-27 11:20:29,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 5960892416. Throughput: 0: 53276.8. Samples: 451359600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:29,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 11:20:30,061][52263] Updated weights for policy 0, policy_version 363828 (0.0031) [2024-04-27 11:20:32,834][52263] Updated weights for policy 0, policy_version 363838 (0.0026) [2024-04-27 11:20:34,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 5961170944. Throughput: 0: 53166.6. Samples: 451676640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:34,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 11:20:36,082][52263] Updated weights for policy 0, policy_version 363848 (0.0037) [2024-04-27 11:20:38,929][52263] Updated weights for policy 0, policy_version 363858 (0.0033) [2024-04-27 11:20:39,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5961449472. Throughput: 0: 53445.2. Samples: 452004800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:39,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 11:20:42,137][52263] Updated weights for policy 0, policy_version 363868 (0.0026) [2024-04-27 11:20:44,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52974.8, 300 sec: 53261.9). Total num frames: 5961678848. Throughput: 0: 53409.4. Samples: 452163720. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:44,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 11:20:44,997][52263] Updated weights for policy 0, policy_version 363878 (0.0035) [2024-04-27 11:20:48,350][52263] Updated weights for policy 0, policy_version 363888 (0.0029) [2024-04-27 11:20:49,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 5961973760. Throughput: 0: 53450.7. Samples: 452482240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:49,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 11:20:51,172][52263] Updated weights for policy 0, policy_version 363898 (0.0031) [2024-04-27 11:20:54,107][52031] Fps is (10 sec: 54066.9, 60 sec: 52974.8, 300 sec: 53372.9). Total num frames: 5962219520. Throughput: 0: 53413.3. Samples: 452802440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:54,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 11:20:54,686][52263] Updated weights for policy 0, policy_version 363908 (0.0032) [2024-04-27 11:20:57,377][52263] Updated weights for policy 0, policy_version 363918 (0.0030) [2024-04-27 11:20:59,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52975.1, 300 sec: 53317.4). Total num frames: 5962481664. Throughput: 0: 53363.1. Samples: 452958980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:20:59,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 11:21:00,670][52263] Updated weights for policy 0, policy_version 363928 (0.0029) [2024-04-27 11:21:03,400][52263] Updated weights for policy 0, policy_version 363938 (0.0031) [2024-04-27 11:21:04,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 5962776576. Throughput: 0: 53302.6. Samples: 453280080. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:21:04,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 11:21:06,879][52263] Updated weights for policy 0, policy_version 363948 (0.0026) [2024-04-27 11:21:09,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5963038720. Throughput: 0: 53322.3. Samples: 453600620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:21:09,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 11:21:09,649][52263] Updated weights for policy 0, policy_version 363958 (0.0029) [2024-04-27 11:21:13,135][52263] Updated weights for policy 0, policy_version 363968 (0.0036) [2024-04-27 11:21:14,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5963300864. Throughput: 0: 53330.2. Samples: 453759460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:21:14,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 11:21:15,817][52263] Updated weights for policy 0, policy_version 363978 (0.0028) [2024-04-27 11:21:19,101][52263] Updated weights for policy 0, policy_version 363988 (0.0033) [2024-04-27 11:21:19,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 5963579392. Throughput: 0: 53431.3. Samples: 454081040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:21:19,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 11:21:21,952][52263] Updated weights for policy 0, policy_version 363998 (0.0030) [2024-04-27 11:21:24,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 5963825152. Throughput: 0: 53177.4. Samples: 454397780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 11:21:24,107][52031] Avg episode reward: [(0, '0.496')] [2024-04-27 11:21:25,239][52263] Updated weights for policy 0, policy_version 364008 (0.0027) [2024-04-27 11:21:27,985][52242] Signal inference workers to stop experience collection... 
(6750 times) [2024-04-27 11:21:27,985][52242] Signal inference workers to resume experience collection... (6750 times) [2024-04-27 11:21:28,006][52263] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-04-27 11:21:28,006][52263] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-04-27 11:21:28,117][52263] Updated weights for policy 0, policy_version 364018 (0.0028) [2024-04-27 11:21:29,107][52031] Fps is (10 sec: 52427.3, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 5964103680. Throughput: 0: 53297.1. Samples: 454562100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:21:29,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:21:31,262][52263] Updated weights for policy 0, policy_version 364028 (0.0028) [2024-04-27 11:21:34,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 5964365824. Throughput: 0: 53297.1. Samples: 454880620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:21:34,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 11:21:34,189][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000364037_5964382208.pth... [2024-04-27 11:21:34,248][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000363255_5951569920.pth [2024-04-27 11:21:34,368][52263] Updated weights for policy 0, policy_version 364038 (0.0031) [2024-04-27 11:21:37,468][52263] Updated weights for policy 0, policy_version 364048 (0.0032) [2024-04-27 11:21:39,107][52031] Fps is (10 sec: 52429.8, 60 sec: 52975.0, 300 sec: 53372.9). Total num frames: 5964627968. Throughput: 0: 53213.9. Samples: 455197060. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:21:39,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 11:21:40,434][52263] Updated weights for policy 0, policy_version 364058 (0.0026) [2024-04-27 11:21:43,602][52263] Updated weights for policy 0, policy_version 364068 (0.0026) [2024-04-27 11:21:44,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 5964906496. Throughput: 0: 53327.8. Samples: 455358740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:21:44,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 11:21:46,537][52263] Updated weights for policy 0, policy_version 364078 (0.0031) [2024-04-27 11:21:49,107][52031] Fps is (10 sec: 52427.9, 60 sec: 52974.7, 300 sec: 53372.9). Total num frames: 5965152256. Throughput: 0: 53335.4. Samples: 455680180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:21:49,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 11:21:49,746][52263] Updated weights for policy 0, policy_version 364088 (0.0026) [2024-04-27 11:21:52,748][52263] Updated weights for policy 0, policy_version 364098 (0.0030) [2024-04-27 11:21:54,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 5965430784. Throughput: 0: 53259.4. Samples: 455997300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:21:54,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 11:21:56,167][52263] Updated weights for policy 0, policy_version 364108 (0.0035) [2024-04-27 11:21:59,046][52263] Updated weights for policy 0, policy_version 364118 (0.0027) [2024-04-27 11:21:59,107][52031] Fps is (10 sec: 55706.5, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 5965709312. Throughput: 0: 53341.7. Samples: 456159840. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:21:59,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 11:22:02,338][52263] Updated weights for policy 0, policy_version 364128 (0.0031) [2024-04-27 11:22:04,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 5965971456. Throughput: 0: 53170.2. Samples: 456473700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:22:04,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 11:22:05,090][52263] Updated weights for policy 0, policy_version 364138 (0.0031) [2024-04-27 11:22:08,597][52263] Updated weights for policy 0, policy_version 364148 (0.0028) [2024-04-27 11:22:09,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 5966233600. Throughput: 0: 53264.9. Samples: 456794700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:22:09,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:22:11,317][52263] Updated weights for policy 0, policy_version 364158 (0.0034) [2024-04-27 11:22:14,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 5966495744. Throughput: 0: 53041.5. Samples: 456948960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:22:14,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 11:22:14,651][52263] Updated weights for policy 0, policy_version 364168 (0.0030) [2024-04-27 11:22:17,642][52263] Updated weights for policy 0, policy_version 364178 (0.0028) [2024-04-27 11:22:19,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52701.8, 300 sec: 53317.4). Total num frames: 5966741504. Throughput: 0: 53134.4. Samples: 457271660. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:22:19,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 11:22:20,852][52263] Updated weights for policy 0, policy_version 364188 (0.0029) [2024-04-27 11:22:23,809][52263] Updated weights for policy 0, policy_version 364198 (0.0029) [2024-04-27 11:22:24,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 5967020032. Throughput: 0: 53139.5. Samples: 457588340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:22:24,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 11:22:26,914][52263] Updated weights for policy 0, policy_version 364208 (0.0025) [2024-04-27 11:22:29,106][52031] Fps is (10 sec: 54067.4, 60 sec: 52975.2, 300 sec: 53317.5). Total num frames: 5967282176. Throughput: 0: 53079.8. Samples: 457747320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:22:29,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 11:22:30,031][52263] Updated weights for policy 0, policy_version 364218 (0.0031) [2024-04-27 11:22:32,935][52263] Updated weights for policy 0, policy_version 364228 (0.0030) [2024-04-27 11:22:34,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 5967560704. Throughput: 0: 53021.0. Samples: 458066120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:22:34,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 11:22:36,115][52263] Updated weights for policy 0, policy_version 364238 (0.0031) [2024-04-27 11:22:39,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53317.5). Total num frames: 5967822848. Throughput: 0: 53157.5. Samples: 458389380. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:22:39,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 11:22:39,184][52263] Updated weights for policy 0, policy_version 364248 (0.0028) [2024-04-27 11:22:42,351][52263] Updated weights for policy 0, policy_version 364258 (0.0030) [2024-04-27 11:22:44,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52975.1, 300 sec: 53317.4). Total num frames: 5968084992. Throughput: 0: 53058.8. Samples: 458547480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-27 11:22:44,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 11:22:45,481][52263] Updated weights for policy 0, policy_version 364268 (0.0033) [2024-04-27 11:22:48,457][52263] Updated weights for policy 0, policy_version 364278 (0.0023) [2024-04-27 11:22:49,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5968347136. Throughput: 0: 52992.2. Samples: 458858360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:22:49,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 11:22:51,558][52242] Signal inference workers to stop experience collection... (6800 times) [2024-04-27 11:22:51,559][52242] Signal inference workers to resume experience collection... (6800 times) [2024-04-27 11:22:51,571][52263] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-04-27 11:22:51,571][52263] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-04-27 11:22:51,681][52263] Updated weights for policy 0, policy_version 364288 (0.0027) [2024-04-27 11:22:54,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5968625664. Throughput: 0: 53017.7. Samples: 459180500. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:22:54,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 11:22:54,678][52263] Updated weights for policy 0, policy_version 364298 (0.0027) [2024-04-27 11:22:57,740][52263] Updated weights for policy 0, policy_version 364308 (0.0029) [2024-04-27 11:22:59,107][52031] Fps is (10 sec: 54067.4, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 5968887808. Throughput: 0: 53267.6. Samples: 459346000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:22:59,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 11:23:00,700][52263] Updated weights for policy 0, policy_version 364318 (0.0029) [2024-04-27 11:23:03,866][52263] Updated weights for policy 0, policy_version 364328 (0.0030) [2024-04-27 11:23:04,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 5969166336. Throughput: 0: 53148.8. Samples: 459663360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:04,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 11:23:07,014][52263] Updated weights for policy 0, policy_version 364338 (0.0031) [2024-04-27 11:23:09,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 5969412096. Throughput: 0: 53177.5. Samples: 459981320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:09,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:23:09,963][52263] Updated weights for policy 0, policy_version 364348 (0.0025) [2024-04-27 11:23:13,256][52263] Updated weights for policy 0, policy_version 364358 (0.0035) [2024-04-27 11:23:14,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 5969674240. Throughput: 0: 53135.0. Samples: 460138400. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:14,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 11:23:16,122][52263] Updated weights for policy 0, policy_version 364368 (0.0032) [2024-04-27 11:23:19,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 5969936384. Throughput: 0: 53173.4. Samples: 460458920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:19,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 11:23:19,503][52263] Updated weights for policy 0, policy_version 364378 (0.0028) [2024-04-27 11:23:22,213][52263] Updated weights for policy 0, policy_version 364388 (0.0028) [2024-04-27 11:23:24,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5970214912. Throughput: 0: 53095.9. Samples: 460778700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:24,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 11:23:25,693][52263] Updated weights for policy 0, policy_version 364398 (0.0026) [2024-04-27 11:23:28,469][52263] Updated weights for policy 0, policy_version 364408 (0.0026) [2024-04-27 11:23:29,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 5970477056. Throughput: 0: 53211.6. Samples: 460942000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:29,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 11:23:31,702][52263] Updated weights for policy 0, policy_version 364418 (0.0031) [2024-04-27 11:23:34,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5970755584. Throughput: 0: 53321.9. Samples: 461257840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:34,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:23:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000364426_5970755584.pth... 
[2024-04-27 11:23:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000363646_5957976064.pth [2024-04-27 11:23:34,872][52263] Updated weights for policy 0, policy_version 364428 (0.0034) [2024-04-27 11:23:37,792][52263] Updated weights for policy 0, policy_version 364438 (0.0025) [2024-04-27 11:23:39,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 5971001344. Throughput: 0: 53258.3. Samples: 461577120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:39,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:23:41,129][52263] Updated weights for policy 0, policy_version 364448 (0.0031) [2024-04-27 11:23:43,881][52263] Updated weights for policy 0, policy_version 364458 (0.0031) [2024-04-27 11:23:44,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 5971279872. Throughput: 0: 53207.5. Samples: 461740340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:44,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 11:23:47,177][52263] Updated weights for policy 0, policy_version 364468 (0.0029) [2024-04-27 11:23:49,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5971542016. Throughput: 0: 53288.0. Samples: 462061320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:49,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 11:23:50,014][52263] Updated weights for policy 0, policy_version 364478 (0.0032) [2024-04-27 11:23:53,091][52242] Signal inference workers to stop experience collection... (6850 times) [2024-04-27 11:23:53,095][52242] Signal inference workers to resume experience collection... 
(6850 times) [2024-04-27 11:23:53,123][52263] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-04-27 11:23:53,123][52263] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-04-27 11:23:53,214][52263] Updated weights for policy 0, policy_version 364488 (0.0028) [2024-04-27 11:23:54,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5971820544. Throughput: 0: 53278.2. Samples: 462378840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:54,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 11:23:56,241][52263] Updated weights for policy 0, policy_version 364498 (0.0027) [2024-04-27 11:23:59,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52975.0, 300 sec: 53206.3). Total num frames: 5972066304. Throughput: 0: 53271.0. Samples: 462535600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 11:23:59,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 11:23:59,385][52263] Updated weights for policy 0, policy_version 364508 (0.0028) [2024-04-27 11:24:02,499][52263] Updated weights for policy 0, policy_version 364518 (0.0030) [2024-04-27 11:24:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 5972344832. Throughput: 0: 53311.7. Samples: 462857940. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:04,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 11:24:05,750][52263] Updated weights for policy 0, policy_version 364528 (0.0028) [2024-04-27 11:24:08,541][52263] Updated weights for policy 0, policy_version 364538 (0.0032) [2024-04-27 11:24:09,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 5972606976. Throughput: 0: 53204.5. Samples: 463172900. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:09,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 11:24:11,876][52263] Updated weights for policy 0, policy_version 364548 (0.0037) [2024-04-27 11:24:14,106][52031] Fps is (10 sec: 50790.6, 60 sec: 52975.0, 300 sec: 53206.3). Total num frames: 5972852736. Throughput: 0: 53180.9. Samples: 463335140. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:14,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 11:24:14,710][52263] Updated weights for policy 0, policy_version 364558 (0.0033) [2024-04-27 11:24:17,924][52263] Updated weights for policy 0, policy_version 364568 (0.0031) [2024-04-27 11:24:19,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53206.4). Total num frames: 5973131264. Throughput: 0: 53234.7. Samples: 463653400. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:19,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 11:24:20,785][52263] Updated weights for policy 0, policy_version 364578 (0.0028) [2024-04-27 11:24:24,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52975.0, 300 sec: 53150.8). Total num frames: 5973393408. Throughput: 0: 53279.7. Samples: 463974700. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:24,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:24:24,147][52263] Updated weights for policy 0, policy_version 364588 (0.0036) [2024-04-27 11:24:26,908][52263] Updated weights for policy 0, policy_version 364598 (0.0028) [2024-04-27 11:24:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 5973671936. Throughput: 0: 53138.3. Samples: 464131560. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:29,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 11:24:30,372][52263] Updated weights for policy 0, policy_version 364608 (0.0025) [2024-04-27 11:24:32,994][52263] Updated weights for policy 0, policy_version 364618 (0.0030) [2024-04-27 11:24:34,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53248.1, 300 sec: 53261.9). Total num frames: 5973950464. Throughput: 0: 53138.8. Samples: 464452560. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:34,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 11:24:36,431][52263] Updated weights for policy 0, policy_version 364628 (0.0031) [2024-04-27 11:24:39,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.1, 300 sec: 53206.3). Total num frames: 5974196224. Throughput: 0: 53124.9. Samples: 464769460. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:39,107][52031] Avg episode reward: [(0, '0.468')] [2024-04-27 11:24:39,398][52263] Updated weights for policy 0, policy_version 364638 (0.0032) [2024-04-27 11:24:42,448][52263] Updated weights for policy 0, policy_version 364648 (0.0033) [2024-04-27 11:24:44,106][52031] Fps is (10 sec: 49151.6, 60 sec: 52702.0, 300 sec: 53095.3). Total num frames: 5974441984. Throughput: 0: 53042.7. Samples: 464922520. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:44,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 11:24:45,502][52263] Updated weights for policy 0, policy_version 364658 (0.0029) [2024-04-27 11:24:48,677][52263] Updated weights for policy 0, policy_version 364668 (0.0034) [2024-04-27 11:24:49,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.1, 300 sec: 53206.4). Total num frames: 5974736896. Throughput: 0: 52997.3. Samples: 465242820. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:49,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 11:24:51,640][52263] Updated weights for policy 0, policy_version 364678 (0.0037) [2024-04-27 11:24:52,816][52242] Signal inference workers to stop experience collection... (6900 times) [2024-04-27 11:24:52,817][52242] Signal inference workers to resume experience collection... (6900 times) [2024-04-27 11:24:52,835][52263] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-04-27 11:24:52,836][52263] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-04-27 11:24:54,107][52031] Fps is (10 sec: 55705.0, 60 sec: 52974.8, 300 sec: 53206.3). Total num frames: 5974999040. Throughput: 0: 53050.1. Samples: 465560160. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:54,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 11:24:54,915][52263] Updated weights for policy 0, policy_version 364688 (0.0029) [2024-04-27 11:24:57,771][52263] Updated weights for policy 0, policy_version 364698 (0.0033) [2024-04-27 11:24:59,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.2, 300 sec: 53317.4). Total num frames: 5975293952. Throughput: 0: 53206.6. Samples: 465729440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:24:59,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 11:25:00,827][52263] Updated weights for policy 0, policy_version 364708 (0.0040) [2024-04-27 11:25:03,921][52263] Updated weights for policy 0, policy_version 364718 (0.0032) [2024-04-27 11:25:04,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53247.9, 300 sec: 53206.4). Total num frames: 5975539712. Throughput: 0: 53261.6. Samples: 466050180. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:25:04,111][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 11:25:06,844][52263] Updated weights for policy 0, policy_version 364728 (0.0031) [2024-04-27 11:25:09,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.1, 300 sec: 53206.3). Total num frames: 5975801856. Throughput: 0: 53271.1. Samples: 466371900. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:25:09,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 11:25:10,126][52263] Updated weights for policy 0, policy_version 364738 (0.0034) [2024-04-27 11:25:13,123][52263] Updated weights for policy 0, policy_version 364748 (0.0026) [2024-04-27 11:25:14,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.0, 300 sec: 53206.4). Total num frames: 5976064000. Throughput: 0: 53096.0. Samples: 466520880. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:25:14,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 11:25:16,130][52263] Updated weights for policy 0, policy_version 364758 (0.0025) [2024-04-27 11:25:19,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53206.4). Total num frames: 5976342528. Throughput: 0: 53235.1. Samples: 466848140. Policy #0 lag: (min: 1.0, avg: 9.1, max: 19.0) [2024-04-27 11:25:19,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 11:25:19,161][52263] Updated weights for policy 0, policy_version 364768 (0.0030) [2024-04-27 11:25:22,122][52263] Updated weights for policy 0, policy_version 364778 (0.0031) [2024-04-27 11:25:24,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.1, 300 sec: 53317.4). Total num frames: 5976621056. Throughput: 0: 53298.6. Samples: 467167900. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:25:24,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 11:25:25,261][52263] Updated weights for policy 0, policy_version 364788 (0.0030) [2024-04-27 11:25:28,285][52263] Updated weights for policy 0, policy_version 364798 (0.0029) [2024-04-27 11:25:29,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53794.0, 300 sec: 53317.4). Total num frames: 5976899584. Throughput: 0: 53699.4. Samples: 467339000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:25:29,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:25:31,433][52263] Updated weights for policy 0, policy_version 364808 (0.0028) [2024-04-27 11:25:34,107][52031] Fps is (10 sec: 50790.3, 60 sec: 52974.9, 300 sec: 53150.8). Total num frames: 5977128960. Throughput: 0: 53691.1. Samples: 467658920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:25:34,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 11:25:34,259][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000364817_5977161728.pth... [2024-04-27 11:25:34,303][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000364037_5964382208.pth [2024-04-27 11:25:34,477][52263] Updated weights for policy 0, policy_version 364818 (0.0033) [2024-04-27 11:25:37,489][52263] Updated weights for policy 0, policy_version 364828 (0.0029) [2024-04-27 11:25:39,106][52031] Fps is (10 sec: 49153.0, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 5977391104. Throughput: 0: 53754.9. Samples: 467979120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:25:39,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 11:25:40,628][52263] Updated weights for policy 0, policy_version 364838 (0.0026) [2024-04-27 11:25:43,617][52263] Updated weights for policy 0, policy_version 364848 (0.0034) [2024-04-27 11:25:44,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.0, 300 sec: 53206.3). 
Total num frames: 5977669632. Throughput: 0: 53279.8. Samples: 468127040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:25:44,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 11:25:46,680][52263] Updated weights for policy 0, policy_version 364858 (0.0028) [2024-04-27 11:25:47,404][52242] Signal inference workers to stop experience collection... (6950 times) [2024-04-27 11:25:47,434][52263] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-04-27 11:25:47,464][52242] Signal inference workers to resume experience collection... (6950 times) [2024-04-27 11:25:47,465][52263] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-04-27 11:25:49,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 5977931776. Throughput: 0: 53385.5. Samples: 468452520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:25:49,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:25:49,754][52263] Updated weights for policy 0, policy_version 364868 (0.0037) [2024-04-27 11:25:52,859][52263] Updated weights for policy 0, policy_version 364878 (0.0031) [2024-04-27 11:25:54,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53521.2, 300 sec: 53317.4). Total num frames: 5978210304. Throughput: 0: 53287.5. Samples: 468769840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:25:54,107][52031] Avg episode reward: [(0, '0.674')] [2024-04-27 11:25:56,315][52263] Updated weights for policy 0, policy_version 364888 (0.0037) [2024-04-27 11:25:58,914][52263] Updated weights for policy 0, policy_version 364898 (0.0035) [2024-04-27 11:25:59,107][52031] Fps is (10 sec: 57343.2, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5978505216. Throughput: 0: 53488.0. Samples: 468927840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:25:59,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 11:26:02,595][52263] Updated weights for policy 0, policy_version 364908 (0.0029) [2024-04-27 11:26:04,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.1, 300 sec: 53206.4). Total num frames: 5978734592. Throughput: 0: 53550.2. Samples: 469257900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:04,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 11:26:04,987][52263] Updated weights for policy 0, policy_version 364918 (0.0033) [2024-04-27 11:26:08,808][52263] Updated weights for policy 0, policy_version 364928 (0.0034) [2024-04-27 11:26:09,107][52031] Fps is (10 sec: 49152.0, 60 sec: 53247.9, 300 sec: 53206.3). Total num frames: 5978996736. Throughput: 0: 53573.2. Samples: 469578700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:09,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 11:26:11,160][52263] Updated weights for policy 0, policy_version 364938 (0.0033) [2024-04-27 11:26:14,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53520.9, 300 sec: 53206.3). Total num frames: 5979275264. Throughput: 0: 52960.8. Samples: 469722240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:14,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 11:26:14,939][52263] Updated weights for policy 0, policy_version 364948 (0.0032) [2024-04-27 11:26:17,326][52263] Updated weights for policy 0, policy_version 364958 (0.0030) [2024-04-27 11:26:19,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5979553792. Throughput: 0: 53008.0. Samples: 470044280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:19,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 11:26:21,075][52263] Updated weights for policy 0, policy_version 364968 (0.0034) [2024-04-27 11:26:23,461][52263] Updated weights for policy 0, policy_version 364978 (0.0036) [2024-04-27 11:26:24,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52974.8, 300 sec: 53206.4). Total num frames: 5979799552. Throughput: 0: 52888.2. Samples: 470359100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:24,107][52031] Avg episode reward: [(0, '0.487')] [2024-04-27 11:26:27,264][52263] Updated weights for policy 0, policy_version 364988 (0.0035) [2024-04-27 11:26:29,107][52031] Fps is (10 sec: 52427.7, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 5980078080. Throughput: 0: 53343.5. Samples: 470527500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:29,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 11:26:29,644][52263] Updated weights for policy 0, policy_version 364998 (0.0037) [2024-04-27 11:26:33,384][52263] Updated weights for policy 0, policy_version 365008 (0.0040) [2024-04-27 11:26:34,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.0, 300 sec: 53206.3). Total num frames: 5980323840. Throughput: 0: 53236.8. Samples: 470848180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:34,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 11:26:35,900][52263] Updated weights for policy 0, policy_version 365018 (0.0024) [2024-04-27 11:26:39,107][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.0, 300 sec: 53206.4). Total num frames: 5980602368. Throughput: 0: 53274.1. Samples: 471167180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:39,107][52031] Avg episode reward: [(0, '0.678')] [2024-04-27 11:26:39,427][52263] Updated weights for policy 0, policy_version 365028 (0.0031) [2024-04-27 11:26:41,936][52263] Updated weights for policy 0, policy_version 365038 (0.0032) [2024-04-27 11:26:44,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.1, 300 sec: 53261.9). Total num frames: 5980864512. Throughput: 0: 53187.1. Samples: 471321260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:44,107][52031] Avg episode reward: [(0, '0.496')] [2024-04-27 11:26:45,644][52263] Updated weights for policy 0, policy_version 365048 (0.0034) [2024-04-27 11:26:48,086][52263] Updated weights for policy 0, policy_version 365058 (0.0030) [2024-04-27 11:26:49,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52974.9, 300 sec: 53150.8). Total num frames: 5981110272. Throughput: 0: 52960.0. Samples: 471641100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:49,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 11:26:49,620][52242] Signal inference workers to stop experience collection... (7000 times) [2024-04-27 11:26:49,622][52242] Signal inference workers to resume experience collection... (7000 times) [2024-04-27 11:26:49,652][52263] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-04-27 11:26:49,652][52263] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-04-27 11:26:51,875][52263] Updated weights for policy 0, policy_version 365068 (0.0033) [2024-04-27 11:26:54,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53520.9, 300 sec: 53261.9). Total num frames: 5981421568. Throughput: 0: 52839.0. Samples: 471956460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:54,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 11:26:54,337][52263] Updated weights for policy 0, policy_version 365078 (0.0029) [2024-04-27 11:26:57,925][52263] Updated weights for policy 0, policy_version 365088 (0.0028) [2024-04-27 11:26:59,106][52031] Fps is (10 sec: 55705.4, 60 sec: 52701.9, 300 sec: 53206.3). Total num frames: 5981667328. Throughput: 0: 53419.4. Samples: 472126100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:26:59,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 11:27:00,345][52263] Updated weights for policy 0, policy_version 365098 (0.0024) [2024-04-27 11:27:04,107][52031] Fps is (10 sec: 49152.4, 60 sec: 52974.8, 300 sec: 53150.8). Total num frames: 5981913088. Throughput: 0: 53332.8. Samples: 472444260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:04,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 11:27:04,156][52263] Updated weights for policy 0, policy_version 365108 (0.0027) [2024-04-27 11:27:06,719][52263] Updated weights for policy 0, policy_version 365118 (0.0030) [2024-04-27 11:27:09,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53248.0, 300 sec: 53206.3). Total num frames: 5982191616. Throughput: 0: 53460.0. Samples: 472764800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:09,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 11:27:10,259][52263] Updated weights for policy 0, policy_version 365128 (0.0027) [2024-04-27 11:27:13,255][52263] Updated weights for policy 0, policy_version 365138 (0.0029) [2024-04-27 11:27:14,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5982470144. Throughput: 0: 53278.8. Samples: 472925040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:14,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 11:27:16,376][52263] Updated weights for policy 0, policy_version 365148 (0.0025) [2024-04-27 11:27:19,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52701.9, 300 sec: 53206.4). Total num frames: 5982715904. Throughput: 0: 53222.3. Samples: 473243180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:19,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 11:27:19,324][52263] Updated weights for policy 0, policy_version 365158 (0.0036) [2024-04-27 11:27:22,398][52263] Updated weights for policy 0, policy_version 365168 (0.0030) [2024-04-27 11:27:24,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.2, 300 sec: 53317.4). Total num frames: 5983010816. Throughput: 0: 53374.7. Samples: 473569040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:24,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 11:27:25,464][52263] Updated weights for policy 0, policy_version 365178 (0.0024) [2024-04-27 11:27:28,619][52263] Updated weights for policy 0, policy_version 365188 (0.0028) [2024-04-27 11:27:29,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52975.1, 300 sec: 53206.4). Total num frames: 5983256576. Throughput: 0: 53477.0. Samples: 473727720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:29,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 11:27:31,548][52263] Updated weights for policy 0, policy_version 365198 (0.0026) [2024-04-27 11:27:34,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.1, 300 sec: 53206.3). Total num frames: 5983518720. Throughput: 0: 53472.0. Samples: 474047340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:34,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 11:27:34,132][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000365206_5983535104.pth... 
[2024-04-27 11:27:34,184][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000364426_5970755584.pth [2024-04-27 11:27:34,702][52263] Updated weights for policy 0, policy_version 365208 (0.0030) [2024-04-27 11:27:37,744][52263] Updated weights for policy 0, policy_version 365218 (0.0029) [2024-04-27 11:27:39,106][52031] Fps is (10 sec: 52428.6, 60 sec: 52975.0, 300 sec: 53206.3). Total num frames: 5983780864. Throughput: 0: 53488.2. Samples: 474363420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:39,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 11:27:40,894][52263] Updated weights for policy 0, policy_version 365228 (0.0031) [2024-04-27 11:27:42,097][52242] Signal inference workers to stop experience collection... (7050 times) [2024-04-27 11:27:42,131][52263] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-04-27 11:27:42,159][52242] Signal inference workers to resume experience collection... (7050 times) [2024-04-27 11:27:42,162][52263] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-04-27 11:27:43,796][52263] Updated weights for policy 0, policy_version 365238 (0.0035) [2024-04-27 11:27:44,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.2, 300 sec: 53317.5). Total num frames: 5984075776. Throughput: 0: 53395.6. Samples: 474528900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:44,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 11:27:47,094][52263] Updated weights for policy 0, policy_version 365248 (0.0028) [2024-04-27 11:27:49,107][52031] Fps is (10 sec: 57343.9, 60 sec: 54067.1, 300 sec: 53317.4). Total num frames: 5984354304. Throughput: 0: 53470.3. Samples: 474850420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:49,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 11:27:49,894][52263] Updated weights for policy 0, policy_version 365258 (0.0025) [2024-04-27 11:27:53,177][52263] Updated weights for policy 0, policy_version 365268 (0.0028) [2024-04-27 11:27:54,107][52031] Fps is (10 sec: 52427.9, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 5984600064. Throughput: 0: 53396.9. Samples: 475167660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:27:54,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 11:27:55,971][52263] Updated weights for policy 0, policy_version 365278 (0.0028) [2024-04-27 11:27:59,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.0, 300 sec: 53206.3). Total num frames: 5984862208. Throughput: 0: 53366.2. Samples: 475326520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:27:59,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 11:27:59,240][52263] Updated weights for policy 0, policy_version 365288 (0.0030) [2024-04-27 11:28:02,138][52263] Updated weights for policy 0, policy_version 365298 (0.0028) [2024-04-27 11:28:04,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.1, 300 sec: 53317.4). Total num frames: 5985140736. Throughput: 0: 53419.4. Samples: 475647060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:04,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 11:28:05,257][52263] Updated weights for policy 0, policy_version 365308 (0.0027) [2024-04-27 11:28:08,310][52263] Updated weights for policy 0, policy_version 365318 (0.0030) [2024-04-27 11:28:09,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.2, 300 sec: 53372.9). Total num frames: 5985419264. Throughput: 0: 53373.3. Samples: 475970840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:09,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 11:28:11,257][52263] Updated weights for policy 0, policy_version 365328 (0.0027) [2024-04-27 11:28:14,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5985665024. Throughput: 0: 53391.0. Samples: 476130320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:14,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 11:28:14,285][52263] Updated weights for policy 0, policy_version 365338 (0.0029) [2024-04-27 11:28:17,393][52263] Updated weights for policy 0, policy_version 365348 (0.0030) [2024-04-27 11:28:19,107][52031] Fps is (10 sec: 54067.3, 60 sec: 54067.1, 300 sec: 53373.0). Total num frames: 5985959936. Throughput: 0: 53401.6. Samples: 476450420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:19,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 11:28:20,303][52263] Updated weights for policy 0, policy_version 365358 (0.0027) [2024-04-27 11:28:23,633][52263] Updated weights for policy 0, policy_version 365368 (0.0033) [2024-04-27 11:28:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5986205696. Throughput: 0: 53564.1. Samples: 476773800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:24,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 11:28:26,406][52263] Updated weights for policy 0, policy_version 365378 (0.0029) [2024-04-27 11:28:29,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53521.0, 300 sec: 53261.9). Total num frames: 5986467840. Throughput: 0: 53271.9. Samples: 476926140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 11:28:29,836][52263] Updated weights for policy 0, policy_version 365388 (0.0048) [2024-04-27 11:28:32,560][52263] Updated weights for policy 0, policy_version 365398 (0.0025) [2024-04-27 11:28:34,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 5986746368. Throughput: 0: 53405.4. Samples: 477253660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:34,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 11:28:35,956][52263] Updated weights for policy 0, policy_version 365408 (0.0037) [2024-04-27 11:28:36,924][52242] Signal inference workers to stop experience collection... (7100 times) [2024-04-27 11:28:36,975][52263] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-04-27 11:28:36,983][52242] Signal inference workers to resume experience collection... (7100 times) [2024-04-27 11:28:36,995][52263] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-04-27 11:28:38,610][52263] Updated weights for policy 0, policy_version 365418 (0.0029) [2024-04-27 11:28:39,106][52031] Fps is (10 sec: 55706.0, 60 sec: 54067.2, 300 sec: 53373.0). Total num frames: 5987024896. Throughput: 0: 53287.7. Samples: 477565600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:39,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 11:28:42,034][52263] Updated weights for policy 0, policy_version 365428 (0.0036) [2024-04-27 11:28:44,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 5987270656. Throughput: 0: 53502.2. Samples: 477734120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:44,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 11:28:44,781][52263] Updated weights for policy 0, policy_version 365438 (0.0027) [2024-04-27 11:28:48,079][52263] Updated weights for policy 0, policy_version 365448 (0.0027) [2024-04-27 11:28:49,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 5987549184. Throughput: 0: 53545.9. Samples: 478056620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:49,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 11:28:50,994][52263] Updated weights for policy 0, policy_version 365458 (0.0028) [2024-04-27 11:28:54,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 5987811328. Throughput: 0: 53551.2. Samples: 478380640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:28:54,201][52263] Updated weights for policy 0, policy_version 365468 (0.0027) [2024-04-27 11:28:57,136][52263] Updated weights for policy 0, policy_version 365478 (0.0033) [2024-04-27 11:28:59,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 5988073472. Throughput: 0: 53534.7. Samples: 478539380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:28:59,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:29:00,144][52263] Updated weights for policy 0, policy_version 365488 (0.0035) [2024-04-27 11:29:03,270][52263] Updated weights for policy 0, policy_version 365498 (0.0035) [2024-04-27 11:29:04,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 5988352000. Throughput: 0: 53489.9. Samples: 478857460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:29:04,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 11:29:06,499][52263] Updated weights for policy 0, policy_version 365508 (0.0031) [2024-04-27 11:29:09,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 5988630528. Throughput: 0: 53451.5. Samples: 479179120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:29:09,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 11:29:09,535][52263] Updated weights for policy 0, policy_version 365518 (0.0037) [2024-04-27 11:29:12,887][52263] Updated weights for policy 0, policy_version 365528 (0.0034) [2024-04-27 11:29:14,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 5988892672. Throughput: 0: 53799.1. Samples: 479347100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:29:14,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 11:29:15,558][52263] Updated weights for policy 0, policy_version 365538 (0.0027) [2024-04-27 11:29:19,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52702.0, 300 sec: 53317.4). Total num frames: 5989122048. Throughput: 0: 53640.0. Samples: 479667460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:29:19,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 11:29:19,127][52263] Updated weights for policy 0, policy_version 365548 (0.0033) [2024-04-27 11:29:21,462][52263] Updated weights for policy 0, policy_version 365558 (0.0034) [2024-04-27 11:29:24,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53520.8, 300 sec: 53372.9). Total num frames: 5989416960. Throughput: 0: 53770.8. Samples: 479985300. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:29:24,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 11:29:25,159][52263] Updated weights for policy 0, policy_version 365568 (0.0036) [2024-04-27 11:29:27,724][52263] Updated weights for policy 0, policy_version 365578 (0.0027) [2024-04-27 11:29:29,106][52031] Fps is (10 sec: 58982.7, 60 sec: 54067.3, 300 sec: 53428.5). Total num frames: 5989711872. Throughput: 0: 53596.2. Samples: 480145940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:29:29,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 11:29:31,130][52263] Updated weights for policy 0, policy_version 365588 (0.0031) [2024-04-27 11:29:33,890][52263] Updated weights for policy 0, policy_version 365598 (0.0027) [2024-04-27 11:29:34,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 5989974016. Throughput: 0: 53517.3. Samples: 480464900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:29:34,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 11:29:34,199][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000365600_5989990400.pth... [2024-04-27 11:29:34,242][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000364817_5977161728.pth [2024-04-27 11:29:37,150][52242] Signal inference workers to stop experience collection... (7150 times) [2024-04-27 11:29:37,192][52263] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-04-27 11:29:37,250][52242] Signal inference workers to resume experience collection... (7150 times) [2024-04-27 11:29:37,250][52263] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-04-27 11:29:37,374][52263] Updated weights for policy 0, policy_version 365608 (0.0029) [2024-04-27 11:29:39,107][52031] Fps is (10 sec: 50789.6, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 5990219776. Throughput: 0: 53438.6. Samples: 480785380. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:29:39,107][52031] Avg episode reward: [(0, '0.686')] [2024-04-27 11:29:39,952][52263] Updated weights for policy 0, policy_version 365618 (0.0033) [2024-04-27 11:29:43,501][52263] Updated weights for policy 0, policy_version 365628 (0.0025) [2024-04-27 11:29:44,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53521.1, 300 sec: 53372.9). Total num frames: 5990481920. Throughput: 0: 53318.1. Samples: 480938700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:29:44,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 11:29:45,922][52263] Updated weights for policy 0, policy_version 365638 (0.0029) [2024-04-27 11:29:49,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 5990744064. Throughput: 0: 53497.6. Samples: 481264860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:29:49,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 11:29:49,428][52263] Updated weights for policy 0, policy_version 365648 (0.0026) [2024-04-27 11:29:52,192][52263] Updated weights for policy 0, policy_version 365658 (0.0031) [2024-04-27 11:29:54,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 5991022592. Throughput: 0: 53436.9. Samples: 481583780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:29:54,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 11:29:55,581][52263] Updated weights for policy 0, policy_version 365668 (0.0035) [2024-04-27 11:29:58,268][52263] Updated weights for policy 0, policy_version 365678 (0.0029) [2024-04-27 11:29:59,106][52031] Fps is (10 sec: 57344.9, 60 sec: 54067.2, 300 sec: 53484.1). Total num frames: 5991317504. Throughput: 0: 53340.1. Samples: 481747400. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:29:59,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 11:30:01,844][52263] Updated weights for policy 0, policy_version 365688 (0.0029) [2024-04-27 11:30:04,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 5991579648. Throughput: 0: 53331.0. Samples: 482067360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:30:04,107][52031] Avg episode reward: [(0, '0.496')] [2024-04-27 11:30:04,292][52263] Updated weights for policy 0, policy_version 365698 (0.0032) [2024-04-27 11:30:07,949][52263] Updated weights for policy 0, policy_version 365708 (0.0028) [2024-04-27 11:30:09,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 5991841792. Throughput: 0: 53680.2. Samples: 482400900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:30:09,107][52031] Avg episode reward: [(0, '0.499')] [2024-04-27 11:30:10,271][52263] Updated weights for policy 0, policy_version 365718 (0.0028) [2024-04-27 11:30:14,103][52263] Updated weights for policy 0, policy_version 365728 (0.0028) [2024-04-27 11:30:14,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 5992087552. Throughput: 0: 53377.1. Samples: 482547920. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:30:14,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 11:30:16,455][52263] Updated weights for policy 0, policy_version 365738 (0.0028) [2024-04-27 11:30:19,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53794.1, 300 sec: 53317.4). Total num frames: 5992349696. Throughput: 0: 53358.7. Samples: 482866040. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:30:19,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 11:30:20,203][52263] Updated weights for policy 0, policy_version 365748 (0.0029) [2024-04-27 11:30:22,783][52263] Updated weights for policy 0, policy_version 365758 (0.0028) [2024-04-27 11:30:24,106][52031] Fps is (10 sec: 57344.7, 60 sec: 54067.4, 300 sec: 53428.5). Total num frames: 5992660992. Throughput: 0: 53353.8. Samples: 483186300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:30:24,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 11:30:26,147][52263] Updated weights for policy 0, policy_version 365768 (0.0030) [2024-04-27 11:30:28,741][52263] Updated weights for policy 0, policy_version 365778 (0.0034) [2024-04-27 11:30:29,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 5992923136. Throughput: 0: 53750.4. Samples: 483357460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 11:30:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 11:30:32,226][52263] Updated weights for policy 0, policy_version 365788 (0.0030) [2024-04-27 11:30:34,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 5993168896. Throughput: 0: 53756.4. Samples: 483683900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:30:34,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 11:30:34,697][52242] Signal inference workers to stop experience collection... (7200 times) [2024-04-27 11:30:34,699][52242] Signal inference workers to resume experience collection... 
(7200 times) [2024-04-27 11:30:34,721][52263] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-04-27 11:30:34,721][52263] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-04-27 11:30:34,819][52263] Updated weights for policy 0, policy_version 365798 (0.0028) [2024-04-27 11:30:38,358][52263] Updated weights for policy 0, policy_version 365808 (0.0030) [2024-04-27 11:30:39,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 5993431040. Throughput: 0: 53817.9. Samples: 484005580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:30:39,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 11:30:40,864][52263] Updated weights for policy 0, policy_version 365818 (0.0027) [2024-04-27 11:30:44,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5993693184. Throughput: 0: 53634.4. Samples: 484160960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:30:44,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 11:30:44,318][52263] Updated weights for policy 0, policy_version 365828 (0.0028) [2024-04-27 11:30:46,936][52263] Updated weights for policy 0, policy_version 365838 (0.0029) [2024-04-27 11:30:49,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 5993938944. Throughput: 0: 53647.2. Samples: 484481480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:30:49,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 11:30:50,542][52263] Updated weights for policy 0, policy_version 365848 (0.0028) [2024-04-27 11:30:52,935][52263] Updated weights for policy 0, policy_version 365858 (0.0032) [2024-04-27 11:30:54,107][52031] Fps is (10 sec: 55706.5, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 5994250240. Throughput: 0: 53302.7. Samples: 484799520. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:30:54,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 11:30:56,677][52263] Updated weights for policy 0, policy_version 365868 (0.0028) [2024-04-27 11:30:59,106][52031] Fps is (10 sec: 58982.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 5994528768. Throughput: 0: 53924.6. Samples: 484974520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:30:59,107][52031] Avg episode reward: [(0, '0.499')] [2024-04-27 11:30:59,176][52263] Updated weights for policy 0, policy_version 365878 (0.0029) [2024-04-27 11:31:02,742][52263] Updated weights for policy 0, policy_version 365888 (0.0038) [2024-04-27 11:31:04,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 5994774528. Throughput: 0: 53884.8. Samples: 485290860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:04,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 11:31:05,395][52263] Updated weights for policy 0, policy_version 365898 (0.0029) [2024-04-27 11:31:08,710][52263] Updated weights for policy 0, policy_version 365908 (0.0026) [2024-04-27 11:31:09,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.1, 300 sec: 53428.6). Total num frames: 5995036672. Throughput: 0: 53817.4. Samples: 485608080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:09,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 11:31:11,372][52263] Updated weights for policy 0, policy_version 365918 (0.0030) [2024-04-27 11:31:14,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.2, 300 sec: 53317.4). Total num frames: 5995282432. Throughput: 0: 53392.0. Samples: 485760100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:14,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 11:31:14,839][52263] Updated weights for policy 0, policy_version 365928 (0.0036) [2024-04-27 11:31:17,491][52263] Updated weights for policy 0, policy_version 365938 (0.0027) [2024-04-27 11:31:19,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 5995577344. Throughput: 0: 53304.2. Samples: 486082580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:19,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 11:31:21,012][52263] Updated weights for policy 0, policy_version 365948 (0.0035) [2024-04-27 11:31:23,030][52242] Signal inference workers to stop experience collection... (7250 times) [2024-04-27 11:31:23,085][52263] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-04-27 11:31:23,092][52242] Signal inference workers to resume experience collection... (7250 times) [2024-04-27 11:31:23,095][52263] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-04-27 11:31:23,715][52263] Updated weights for policy 0, policy_version 365958 (0.0028) [2024-04-27 11:31:24,106][52031] Fps is (10 sec: 58981.9, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 5995872256. Throughput: 0: 53321.2. Samples: 486405040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:24,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 11:31:27,086][52263] Updated weights for policy 0, policy_version 365968 (0.0029) [2024-04-27 11:31:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53247.8, 300 sec: 53539.6). Total num frames: 5996118016. Throughput: 0: 53635.6. Samples: 486574560. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:29,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:31:29,868][52263] Updated weights for policy 0, policy_version 365978 (0.0030) [2024-04-27 11:31:33,401][52263] Updated weights for policy 0, policy_version 365988 (0.0035) [2024-04-27 11:31:34,106][52031] Fps is (10 sec: 49152.4, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 5996363776. Throughput: 0: 53672.5. Samples: 486896740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:34,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 11:31:34,133][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000365990_5996380160.pth... [2024-04-27 11:31:34,180][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000365206_5983535104.pth [2024-04-27 11:31:35,922][52263] Updated weights for policy 0, policy_version 365998 (0.0029) [2024-04-27 11:31:39,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 5996625920. Throughput: 0: 53718.3. Samples: 487216840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:39,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 11:31:39,631][52263] Updated weights for policy 0, policy_version 366008 (0.0037) [2024-04-27 11:31:41,944][52263] Updated weights for policy 0, policy_version 366018 (0.0035) [2024-04-27 11:31:44,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52975.2, 300 sec: 53428.5). Total num frames: 5996871680. Throughput: 0: 52959.6. Samples: 487357700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:44,107][52031] Avg episode reward: [(0, '0.465')] [2024-04-27 11:31:45,980][52263] Updated weights for policy 0, policy_version 366028 (0.0029) [2024-04-27 11:31:48,082][52263] Updated weights for policy 0, policy_version 366038 (0.0031) [2024-04-27 11:31:49,106][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.2, 300 sec: 53428.5). 
Total num frames: 5997182976. Throughput: 0: 53219.7. Samples: 487685740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 11:31:49,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 11:31:51,945][52263] Updated weights for policy 0, policy_version 366048 (0.0035) [2024-04-27 11:31:54,107][52031] Fps is (10 sec: 60620.3, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 5997477888. Throughput: 0: 53327.9. Samples: 488007840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:31:54,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 11:31:54,263][52263] Updated weights for policy 0, policy_version 366058 (0.0027) [2024-04-27 11:31:57,862][52263] Updated weights for policy 0, policy_version 366068 (0.0027) [2024-04-27 11:31:59,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 5997723648. Throughput: 0: 53862.2. Samples: 488183900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:31:59,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 11:32:00,266][52263] Updated weights for policy 0, policy_version 366078 (0.0029) [2024-04-27 11:32:03,942][52263] Updated weights for policy 0, policy_version 366088 (0.0026) [2024-04-27 11:32:04,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 5997985792. Throughput: 0: 53857.0. Samples: 488506140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:04,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 11:32:06,532][52263] Updated weights for policy 0, policy_version 366098 (0.0034) [2024-04-27 11:32:09,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 5998247936. Throughput: 0: 53806.6. Samples: 488826340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:09,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 11:32:10,074][52263] Updated weights for policy 0, policy_version 366108 (0.0033) [2024-04-27 11:32:12,714][52263] Updated weights for policy 0, policy_version 366118 (0.0025) [2024-04-27 11:32:14,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 5998493696. Throughput: 0: 53351.2. Samples: 488975360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:14,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 11:32:15,723][52242] Signal inference workers to stop experience collection... (7300 times) [2024-04-27 11:32:15,723][52242] Signal inference workers to resume experience collection... (7300 times) [2024-04-27 11:32:15,741][52263] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-04-27 11:32:15,741][52263] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-04-27 11:32:16,107][52263] Updated weights for policy 0, policy_version 366128 (0.0029) [2024-04-27 11:32:18,877][52263] Updated weights for policy 0, policy_version 366138 (0.0035) [2024-04-27 11:32:19,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 5998804992. Throughput: 0: 53259.5. Samples: 489293420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:19,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 11:32:22,372][52263] Updated weights for policy 0, policy_version 366148 (0.0029) [2024-04-27 11:32:24,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 5999067136. Throughput: 0: 53311.1. Samples: 489615840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:24,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 11:32:24,929][52263] Updated weights for policy 0, policy_version 366158 (0.0029) [2024-04-27 11:32:28,293][52263] Updated weights for policy 0, policy_version 366168 (0.0027) [2024-04-27 11:32:29,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 5999329280. Throughput: 0: 54148.2. Samples: 489794380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:29,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 11:32:30,814][52263] Updated weights for policy 0, policy_version 366178 (0.0034) [2024-04-27 11:32:34,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 5999575040. Throughput: 0: 53984.7. Samples: 490115060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:34,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 11:32:34,650][52263] Updated weights for policy 0, policy_version 366188 (0.0035) [2024-04-27 11:32:36,999][52263] Updated weights for policy 0, policy_version 366198 (0.0025) [2024-04-27 11:32:39,107][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 5999837184. Throughput: 0: 53832.0. Samples: 490430280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:39,107][52031] Avg episode reward: [(0, '0.506')] [2024-04-27 11:32:40,891][52263] Updated weights for policy 0, policy_version 366208 (0.0031) [2024-04-27 11:32:43,408][52263] Updated weights for policy 0, policy_version 366218 (0.0027) [2024-04-27 11:32:44,107][52031] Fps is (10 sec: 55706.2, 60 sec: 54340.2, 300 sec: 53484.0). Total num frames: 6000132096. Throughput: 0: 53348.8. Samples: 490584600. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:44,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 11:32:46,981][52263] Updated weights for policy 0, policy_version 366228 (0.0033) [2024-04-27 11:32:49,107][52031] Fps is (10 sec: 57343.6, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6000410624. Throughput: 0: 53261.1. Samples: 490902900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:49,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 11:32:49,601][52263] Updated weights for policy 0, policy_version 366238 (0.0027) [2024-04-27 11:32:53,089][52263] Updated weights for policy 0, policy_version 366248 (0.0029) [2024-04-27 11:32:54,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6000672768. Throughput: 0: 53287.6. Samples: 491224280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:54,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 11:32:55,745][52263] Updated weights for policy 0, policy_version 366258 (0.0030) [2024-04-27 11:32:59,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6000918528. Throughput: 0: 53707.9. Samples: 491392220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:32:59,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 11:32:59,173][52263] Updated weights for policy 0, policy_version 366268 (0.0031) [2024-04-27 11:33:01,891][52263] Updated weights for policy 0, policy_version 366278 (0.0030) [2024-04-27 11:33:04,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53247.8, 300 sec: 53428.5). Total num frames: 6001180672. Throughput: 0: 53719.4. Samples: 491710800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:33:04,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 11:33:05,210][52263] Updated weights for policy 0, policy_version 366288 (0.0025) [2024-04-27 11:33:07,881][52263] Updated weights for policy 0, policy_version 366298 (0.0029) [2024-04-27 11:33:09,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6001442816. Throughput: 0: 53599.5. Samples: 492027820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:33:09,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 11:33:11,373][52263] Updated weights for policy 0, policy_version 366308 (0.0023) [2024-04-27 11:33:13,909][52263] Updated weights for policy 0, policy_version 366318 (0.0031) [2024-04-27 11:33:14,106][52031] Fps is (10 sec: 57344.8, 60 sec: 54340.3, 300 sec: 53539.6). Total num frames: 6001754112. Throughput: 0: 53204.1. Samples: 492188560. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:14,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 11:33:17,535][52263] Updated weights for policy 0, policy_version 366328 (0.0033) [2024-04-27 11:33:19,028][52242] Signal inference workers to stop experience collection... (7350 times) [2024-04-27 11:33:19,052][52263] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-04-27 11:33:19,093][52242] Signal inference workers to resume experience collection... (7350 times) [2024-04-27 11:33:19,093][52263] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-04-27 11:33:19,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6002016256. Throughput: 0: 53259.6. Samples: 492511740. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:19,107][52031] Avg episode reward: [(0, '0.447')] [2024-04-27 11:33:20,092][52263] Updated weights for policy 0, policy_version 366338 (0.0025) [2024-04-27 11:33:23,803][52263] Updated weights for policy 0, policy_version 366348 (0.0032) [2024-04-27 11:33:24,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6002262016. Throughput: 0: 53299.6. Samples: 492828760. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:24,107][52031] Avg episode reward: [(0, '0.480')] [2024-04-27 11:33:26,366][52263] Updated weights for policy 0, policy_version 366358 (0.0031) [2024-04-27 11:33:29,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53248.2, 300 sec: 53484.0). Total num frames: 6002524160. Throughput: 0: 53270.7. Samples: 492981780. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:29,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:33:30,189][52263] Updated weights for policy 0, policy_version 366368 (0.0035) [2024-04-27 11:33:32,644][52263] Updated weights for policy 0, policy_version 366378 (0.0031) [2024-04-27 11:33:34,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6002786304. Throughput: 0: 53307.5. Samples: 493301740. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:34,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 11:33:34,213][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000366382_6002802688.pth... [2024-04-27 11:33:34,275][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000365600_5989990400.pth [2024-04-27 11:33:36,194][52263] Updated weights for policy 0, policy_version 366388 (0.0037) [2024-04-27 11:33:39,078][52263] Updated weights for policy 0, policy_version 366398 (0.0030) [2024-04-27 11:33:39,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.1, 300 sec: 53539.6). 
Total num frames: 6003064832. Throughput: 0: 53338.6. Samples: 493624520. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:39,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 11:33:42,243][52263] Updated weights for policy 0, policy_version 366408 (0.0031) [2024-04-27 11:33:44,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53520.9, 300 sec: 53539.5). Total num frames: 6003343360. Throughput: 0: 53158.2. Samples: 493784340. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:44,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 11:33:45,148][52263] Updated weights for policy 0, policy_version 366418 (0.0030) [2024-04-27 11:33:48,371][52263] Updated weights for policy 0, policy_version 366428 (0.0025) [2024-04-27 11:33:49,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6003605504. Throughput: 0: 53280.1. Samples: 494108400. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:49,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 11:33:51,698][52263] Updated weights for policy 0, policy_version 366438 (0.0029) [2024-04-27 11:33:54,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6003867648. Throughput: 0: 53257.3. Samples: 494424400. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:54,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 11:33:54,369][52263] Updated weights for policy 0, policy_version 366448 (0.0028) [2024-04-27 11:33:57,943][52263] Updated weights for policy 0, policy_version 366458 (0.0033) [2024-04-27 11:33:59,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6004146176. Throughput: 0: 53049.8. Samples: 494575800. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:33:59,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 11:34:00,404][52263] Updated weights for policy 0, policy_version 366468 (0.0035) [2024-04-27 11:34:04,053][52263] Updated weights for policy 0, policy_version 366478 (0.0027) [2024-04-27 11:34:04,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6004375552. Throughput: 0: 53061.8. Samples: 494899520. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:34:04,108][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 11:34:06,697][52263] Updated weights for policy 0, policy_version 366488 (0.0028) [2024-04-27 11:34:09,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6004670464. Throughput: 0: 53235.1. Samples: 495224340. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:34:09,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 11:34:10,070][52263] Updated weights for policy 0, policy_version 366498 (0.0025) [2024-04-27 11:34:12,889][52263] Updated weights for policy 0, policy_version 366508 (0.0027) [2024-04-27 11:34:14,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52701.9, 300 sec: 53539.6). Total num frames: 6004916224. Throughput: 0: 53402.6. Samples: 495384900. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:34:14,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 11:34:14,111][52242] Signal inference workers to stop experience collection... (7400 times) [2024-04-27 11:34:14,113][52242] Signal inference workers to resume experience collection... 
(7400 times) [2024-04-27 11:34:14,121][52263] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-04-27 11:34:14,133][52263] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-04-27 11:34:16,069][52263] Updated weights for policy 0, policy_version 366518 (0.0024) [2024-04-27 11:34:19,023][52263] Updated weights for policy 0, policy_version 366528 (0.0032) [2024-04-27 11:34:19,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6005194752. Throughput: 0: 53385.5. Samples: 495704080. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:34:19,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:34:22,267][52263] Updated weights for policy 0, policy_version 366538 (0.0026) [2024-04-27 11:34:24,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6005473280. Throughput: 0: 53270.6. Samples: 496021700. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 11:34:24,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 11:34:25,105][52263] Updated weights for policy 0, policy_version 366548 (0.0037) [2024-04-27 11:34:28,408][52263] Updated weights for policy 0, policy_version 366558 (0.0031) [2024-04-27 11:34:29,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53520.8, 300 sec: 53428.5). Total num frames: 6005735424. Throughput: 0: 53492.4. Samples: 496191500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:34:29,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 11:34:31,141][52263] Updated weights for policy 0, policy_version 366568 (0.0027) [2024-04-27 11:34:34,107][52031] Fps is (10 sec: 50790.8, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6005981184. Throughput: 0: 53327.1. Samples: 496508120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:34:34,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 11:34:34,514][52263] Updated weights for policy 0, policy_version 366578 (0.0029) [2024-04-27 11:34:37,349][52263] Updated weights for policy 0, policy_version 366588 (0.0033) [2024-04-27 11:34:39,106][52031] Fps is (10 sec: 52430.1, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6006259712. Throughput: 0: 53410.8. Samples: 496827880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:34:39,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 11:34:40,787][52263] Updated weights for policy 0, policy_version 366598 (0.0031) [2024-04-27 11:34:43,340][52263] Updated weights for policy 0, policy_version 366608 (0.0027) [2024-04-27 11:34:44,107][52031] Fps is (10 sec: 54066.9, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6006521856. Throughput: 0: 53444.4. Samples: 496980800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:34:44,115][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 11:34:46,828][52263] Updated weights for policy 0, policy_version 366618 (0.0035) [2024-04-27 11:34:49,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6006800384. Throughput: 0: 53397.2. Samples: 497302400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:34:49,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 11:34:49,477][52263] Updated weights for policy 0, policy_version 366628 (0.0030) [2024-04-27 11:34:52,881][52263] Updated weights for policy 0, policy_version 366638 (0.0023) [2024-04-27 11:34:54,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6007078912. Throughput: 0: 53318.7. Samples: 497623680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:34:54,116][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 11:34:55,727][52263] Updated weights for policy 0, policy_version 366648 (0.0029) [2024-04-27 11:34:58,958][52263] Updated weights for policy 0, policy_version 366658 (0.0028) [2024-04-27 11:34:59,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6007324672. Throughput: 0: 53348.5. Samples: 497785580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:34:59,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 11:35:01,873][52263] Updated weights for policy 0, policy_version 366668 (0.0030) [2024-04-27 11:35:04,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6007603200. Throughput: 0: 53396.8. Samples: 498106940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:35:04,116][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 11:35:05,138][52263] Updated weights for policy 0, policy_version 366678 (0.0033) [2024-04-27 11:35:08,041][52263] Updated weights for policy 0, policy_version 366688 (0.0035) [2024-04-27 11:35:09,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6007848960. Throughput: 0: 53459.2. Samples: 498427360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:35:09,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 11:35:11,273][52263] Updated weights for policy 0, policy_version 366698 (0.0029) [2024-04-27 11:35:14,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6008127488. Throughput: 0: 53212.2. Samples: 498586040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:35:14,107][52031] Avg episode reward: [(0, '0.475')] [2024-04-27 11:35:14,272][52263] Updated weights for policy 0, policy_version 366708 (0.0027) [2024-04-27 11:35:17,387][52263] Updated weights for policy 0, policy_version 366718 (0.0027) [2024-04-27 11:35:19,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6008406016. Throughput: 0: 53234.3. Samples: 498903660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:35:19,107][52031] Avg episode reward: [(0, '0.668')] [2024-04-27 11:35:20,299][52263] Updated weights for policy 0, policy_version 366728 (0.0030) [2024-04-27 11:35:22,971][52242] Signal inference workers to stop experience collection... (7450 times) [2024-04-27 11:35:22,971][52242] Signal inference workers to resume experience collection... (7450 times) [2024-04-27 11:35:22,984][52263] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-04-27 11:35:22,984][52263] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-04-27 11:35:23,508][52263] Updated weights for policy 0, policy_version 366738 (0.0027) [2024-04-27 11:35:24,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.1, 300 sec: 53372.9). Total num frames: 6008668160. Throughput: 0: 53231.9. Samples: 499223320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:35:24,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 11:35:26,456][52263] Updated weights for policy 0, policy_version 366748 (0.0028) [2024-04-27 11:35:29,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52975.2, 300 sec: 53373.0). Total num frames: 6008913920. Throughput: 0: 53306.0. Samples: 499379560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:35:29,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 11:35:29,846][52263] Updated weights for policy 0, policy_version 366758 (0.0028) [2024-04-27 11:35:32,627][52263] Updated weights for policy 0, policy_version 366768 (0.0032) [2024-04-27 11:35:34,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6009176064. Throughput: 0: 53302.8. Samples: 499701020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:35:34,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 11:35:34,138][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000366772_6009192448.pth... [2024-04-27 11:35:34,187][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000365990_5996380160.pth [2024-04-27 11:35:35,906][52263] Updated weights for policy 0, policy_version 366778 (0.0031) [2024-04-27 11:35:38,674][52263] Updated weights for policy 0, policy_version 366788 (0.0032) [2024-04-27 11:35:39,107][52031] Fps is (10 sec: 55704.5, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6009470976. Throughput: 0: 53318.5. Samples: 500023020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:35:39,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 11:35:41,901][52263] Updated weights for policy 0, policy_version 366798 (0.0027) [2024-04-27 11:35:44,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6009716736. Throughput: 0: 53276.7. Samples: 500183040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 11:35:44,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 11:35:44,886][52263] Updated weights for policy 0, policy_version 366808 (0.0027) [2024-04-27 11:35:48,008][52263] Updated weights for policy 0, policy_version 366818 (0.0037) [2024-04-27 11:35:49,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53373.0). 
Total num frames: 6009995264. Throughput: 0: 53179.5. Samples: 500500020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:35:49,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 11:35:51,274][52263] Updated weights for policy 0, policy_version 366828 (0.0030) [2024-04-27 11:35:54,107][52031] Fps is (10 sec: 54066.7, 60 sec: 52974.8, 300 sec: 53317.4). Total num frames: 6010257408. Throughput: 0: 53112.7. Samples: 500817440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:35:54,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 11:35:54,122][52263] Updated weights for policy 0, policy_version 366838 (0.0032) [2024-04-27 11:35:57,489][52263] Updated weights for policy 0, policy_version 366848 (0.0030) [2024-04-27 11:35:59,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6010503168. Throughput: 0: 53368.1. Samples: 500987600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:35:59,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 11:36:00,304][52263] Updated weights for policy 0, policy_version 366858 (0.0028) [2024-04-27 11:36:03,543][52263] Updated weights for policy 0, policy_version 366868 (0.0030) [2024-04-27 11:36:04,106][52031] Fps is (10 sec: 50791.2, 60 sec: 52701.9, 300 sec: 53317.4). Total num frames: 6010765312. Throughput: 0: 53432.4. Samples: 501308120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:04,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 11:36:06,295][52263] Updated weights for policy 0, policy_version 366878 (0.0034) [2024-04-27 11:36:09,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6011060224. Throughput: 0: 53424.5. Samples: 501627420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:09,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 11:36:09,519][52263] Updated weights for policy 0, policy_version 366888 (0.0029) [2024-04-27 11:36:12,254][52263] Updated weights for policy 0, policy_version 366898 (0.0028) [2024-04-27 11:36:14,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6011338752. Throughput: 0: 53603.4. Samples: 501791720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:14,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 11:36:15,509][52263] Updated weights for policy 0, policy_version 366908 (0.0032) [2024-04-27 11:36:18,605][52263] Updated weights for policy 0, policy_version 366918 (0.0032) [2024-04-27 11:36:19,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6011600896. Throughput: 0: 53480.6. Samples: 502107640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:19,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 11:36:21,717][52263] Updated weights for policy 0, policy_version 366928 (0.0030) [2024-04-27 11:36:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6011863040. Throughput: 0: 53435.3. Samples: 502427600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:24,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 11:36:24,713][52263] Updated weights for policy 0, policy_version 366938 (0.0032) [2024-04-27 11:36:27,910][52263] Updated weights for policy 0, policy_version 366948 (0.0027) [2024-04-27 11:36:29,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6012108800. Throughput: 0: 53445.9. Samples: 502588100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:29,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 11:36:30,682][52263] Updated weights for policy 0, policy_version 366958 (0.0035) [2024-04-27 11:36:34,107][52031] Fps is (10 sec: 52427.3, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6012387328. Throughput: 0: 53606.9. Samples: 502912340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:34,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 11:36:34,152][52263] Updated weights for policy 0, policy_version 366968 (0.0031) [2024-04-27 11:36:36,753][52263] Updated weights for policy 0, policy_version 366978 (0.0026) [2024-04-27 11:36:39,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6012665856. Throughput: 0: 53664.1. Samples: 503232320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:39,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 11:36:40,259][52242] Signal inference workers to stop experience collection... (7500 times) [2024-04-27 11:36:40,305][52263] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-04-27 11:36:40,318][52242] Signal inference workers to resume experience collection... (7500 times) [2024-04-27 11:36:40,324][52263] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-04-27 11:36:40,436][52263] Updated weights for policy 0, policy_version 366988 (0.0030) [2024-04-27 11:36:42,859][52263] Updated weights for policy 0, policy_version 366998 (0.0029) [2024-04-27 11:36:44,106][52031] Fps is (10 sec: 55707.3, 60 sec: 53794.3, 300 sec: 53428.5). Total num frames: 6012944384. Throughput: 0: 53443.2. Samples: 503392540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:44,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 11:36:46,809][52263] Updated weights for policy 0, policy_version 367008 (0.0034) [2024-04-27 11:36:49,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.2, 300 sec: 53317.4). Total num frames: 6013206528. Throughput: 0: 53421.5. Samples: 503712080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:49,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 11:36:49,158][52263] Updated weights for policy 0, policy_version 367018 (0.0024) [2024-04-27 11:36:52,811][52263] Updated weights for policy 0, policy_version 367028 (0.0031) [2024-04-27 11:36:54,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6013452288. Throughput: 0: 53479.9. Samples: 504034020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:54,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 11:36:55,233][52263] Updated weights for policy 0, policy_version 367038 (0.0031) [2024-04-27 11:36:58,788][52263] Updated weights for policy 0, policy_version 367048 (0.0029) [2024-04-27 11:36:59,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 6013714432. Throughput: 0: 53290.6. Samples: 504189800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:36:59,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 11:37:01,336][52263] Updated weights for policy 0, policy_version 367058 (0.0034) [2024-04-27 11:37:04,106][52031] Fps is (10 sec: 55706.3, 60 sec: 54067.3, 300 sec: 53428.5). Total num frames: 6014009344. Throughput: 0: 53277.8. Samples: 504505140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 11:37:04,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 11:37:05,078][52263] Updated weights for policy 0, policy_version 367068 (0.0029) [2024-04-27 11:37:07,560][52263] Updated weights for policy 0, policy_version 367078 (0.0032) [2024-04-27 11:37:09,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6014255104. Throughput: 0: 53272.7. Samples: 504824880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:09,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 11:37:11,356][52263] Updated weights for policy 0, policy_version 367088 (0.0031) [2024-04-27 11:37:13,613][52263] Updated weights for policy 0, policy_version 367098 (0.0035) [2024-04-27 11:37:14,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6014550016. Throughput: 0: 53401.3. Samples: 504991160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:14,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 11:37:17,456][52263] Updated weights for policy 0, policy_version 367108 (0.0030) [2024-04-27 11:37:19,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 6014779392. Throughput: 0: 53324.4. Samples: 505311920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:19,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 11:37:19,722][52263] Updated weights for policy 0, policy_version 367118 (0.0025) [2024-04-27 11:37:23,457][52263] Updated weights for policy 0, policy_version 367128 (0.0027) [2024-04-27 11:37:24,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6015057920. Throughput: 0: 53287.6. Samples: 505630260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:24,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 11:37:24,622][52242] Signal inference workers to stop experience collection... (7550 times) [2024-04-27 11:37:24,622][52242] Signal inference workers to resume experience collection... (7550 times) [2024-04-27 11:37:24,649][52263] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-04-27 11:37:24,649][52263] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-04-27 11:37:25,828][52263] Updated weights for policy 0, policy_version 367138 (0.0029) [2024-04-27 11:37:29,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6015320064. Throughput: 0: 53123.4. Samples: 505783100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:29,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 11:37:29,567][52263] Updated weights for policy 0, policy_version 367148 (0.0038) [2024-04-27 11:37:32,077][52263] Updated weights for policy 0, policy_version 367158 (0.0025) [2024-04-27 11:37:34,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 6015582208. Throughput: 0: 53128.8. Samples: 506102880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:34,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 11:37:34,156][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000367163_6015598592.pth... [2024-04-27 11:37:34,203][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000366382_6002802688.pth [2024-04-27 11:37:35,645][52263] Updated weights for policy 0, policy_version 367168 (0.0030) [2024-04-27 11:37:38,379][52263] Updated weights for policy 0, policy_version 367178 (0.0028) [2024-04-27 11:37:39,107][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6015893504. Throughput: 0: 53094.3. Samples: 506423260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:39,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 11:37:41,696][52263] Updated weights for policy 0, policy_version 367188 (0.0033) [2024-04-27 11:37:44,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6016139264. Throughput: 0: 53389.4. Samples: 506592320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:44,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 11:37:44,588][52263] Updated weights for policy 0, policy_version 367198 (0.0025) [2024-04-27 11:37:47,785][52263] Updated weights for policy 0, policy_version 367208 (0.0034) [2024-04-27 11:37:49,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52974.8, 300 sec: 53261.9). Total num frames: 6016385024. Throughput: 0: 53471.0. Samples: 506911340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:49,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 11:37:50,674][52263] Updated weights for policy 0, policy_version 367218 (0.0030) [2024-04-27 11:37:53,806][52263] Updated weights for policy 0, policy_version 367228 (0.0027) [2024-04-27 11:37:54,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6016679936. Throughput: 0: 53448.4. Samples: 507230060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:54,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 11:37:56,836][52263] Updated weights for policy 0, policy_version 367238 (0.0028) [2024-04-27 11:37:59,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6016925696. Throughput: 0: 53365.5. Samples: 507392600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:37:59,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 11:37:59,976][52263] Updated weights for policy 0, policy_version 367248 (0.0026) [2024-04-27 11:38:02,838][52263] Updated weights for policy 0, policy_version 367258 (0.0030) [2024-04-27 11:38:04,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6017204224. Throughput: 0: 53396.7. Samples: 507714780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:38:04,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 11:38:06,013][52263] Updated weights for policy 0, policy_version 367268 (0.0035) [2024-04-27 11:38:08,944][52263] Updated weights for policy 0, policy_version 367278 (0.0033) [2024-04-27 11:38:09,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.2, 300 sec: 53317.4). Total num frames: 6017482752. Throughput: 0: 53478.6. Samples: 508036800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:38:09,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 11:38:12,223][52263] Updated weights for policy 0, policy_version 367288 (0.0037) [2024-04-27 11:38:14,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6017744896. Throughput: 0: 53753.8. Samples: 508202020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:38:14,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:38:15,029][52263] Updated weights for policy 0, policy_version 367298 (0.0030) [2024-04-27 11:38:18,380][52263] Updated weights for policy 0, policy_version 367308 (0.0030) [2024-04-27 11:38:19,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53520.9, 300 sec: 53317.4). Total num frames: 6017990656. Throughput: 0: 53658.1. Samples: 508517500. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 11:38:19,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 11:38:21,254][52263] Updated weights for policy 0, policy_version 367318 (0.0032) [2024-04-27 11:38:24,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6018269184. Throughput: 0: 53620.3. Samples: 508836180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:38:24,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 11:38:24,342][52263] Updated weights for policy 0, policy_version 367328 (0.0031) [2024-04-27 11:38:27,381][52263] Updated weights for policy 0, policy_version 367338 (0.0032) [2024-04-27 11:38:29,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6018547712. Throughput: 0: 53461.2. Samples: 508998080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:38:29,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 11:38:30,453][52263] Updated weights for policy 0, policy_version 367348 (0.0030) [2024-04-27 11:38:33,563][52263] Updated weights for policy 0, policy_version 367358 (0.0029) [2024-04-27 11:38:34,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6018809856. Throughput: 0: 53520.5. Samples: 509319760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:38:34,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 11:38:34,213][52242] Signal inference workers to stop experience collection... (7600 times) [2024-04-27 11:38:34,213][52242] Signal inference workers to resume experience collection... 
(7600 times) [2024-04-27 11:38:34,239][52263] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-04-27 11:38:34,239][52263] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-04-27 11:38:36,569][52263] Updated weights for policy 0, policy_version 367368 (0.0029) [2024-04-27 11:38:39,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6019072000. Throughput: 0: 53533.0. Samples: 509639040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:38:39,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 11:38:39,610][52263] Updated weights for policy 0, policy_version 367378 (0.0029) [2024-04-27 11:38:42,607][52263] Updated weights for policy 0, policy_version 367388 (0.0032) [2024-04-27 11:38:44,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6019334144. Throughput: 0: 53547.9. Samples: 509802260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:38:44,116][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 11:38:45,664][52263] Updated weights for policy 0, policy_version 367398 (0.0025) [2024-04-27 11:38:48,818][52263] Updated weights for policy 0, policy_version 367408 (0.0029) [2024-04-27 11:38:49,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6019612672. Throughput: 0: 53484.4. Samples: 510121580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:38:49,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 11:38:51,774][52263] Updated weights for policy 0, policy_version 367418 (0.0031) [2024-04-27 11:38:54,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.1, 300 sec: 53372.9). Total num frames: 6019891200. Throughput: 0: 53556.9. Samples: 510446860. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:38:54,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 11:38:54,771][52263] Updated weights for policy 0, policy_version 367428 (0.0025) [2024-04-27 11:38:57,978][52263] Updated weights for policy 0, policy_version 367438 (0.0030) [2024-04-27 11:38:59,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6020153344. Throughput: 0: 53465.0. Samples: 510607940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:38:59,107][52031] Avg episode reward: [(0, '0.686')] [2024-04-27 11:39:01,138][52263] Updated weights for policy 0, policy_version 367448 (0.0028) [2024-04-27 11:39:03,925][52263] Updated weights for policy 0, policy_version 367458 (0.0031) [2024-04-27 11:39:04,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6020431872. Throughput: 0: 53677.9. Samples: 510933000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:39:04,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 11:39:07,248][52263] Updated weights for policy 0, policy_version 367468 (0.0034) [2024-04-27 11:39:09,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6020677632. Throughput: 0: 53818.6. Samples: 511258020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:39:09,107][52031] Avg episode reward: [(0, '0.665')] [2024-04-27 11:39:10,018][52263] Updated weights for policy 0, policy_version 367478 (0.0028) [2024-04-27 11:39:13,156][52263] Updated weights for policy 0, policy_version 367488 (0.0025) [2024-04-27 11:39:14,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6020939776. Throughput: 0: 53515.7. Samples: 511406280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:39:14,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 11:39:16,266][52263] Updated weights for policy 0, policy_version 367498 (0.0031) [2024-04-27 11:39:19,106][52031] Fps is (10 sec: 55706.8, 60 sec: 54067.4, 300 sec: 53428.5). Total num frames: 6021234688. Throughput: 0: 53616.1. Samples: 511732480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:39:19,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:39:19,375][52263] Updated weights for policy 0, policy_version 367508 (0.0031) [2024-04-27 11:39:22,333][52263] Updated weights for policy 0, policy_version 367518 (0.0028) [2024-04-27 11:39:24,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6021496832. Throughput: 0: 53649.4. Samples: 512053260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:39:24,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 11:39:25,269][52263] Updated weights for policy 0, policy_version 367528 (0.0028) [2024-04-27 11:39:28,446][52263] Updated weights for policy 0, policy_version 367538 (0.0027) [2024-04-27 11:39:29,107][52031] Fps is (10 sec: 52427.6, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6021758976. Throughput: 0: 53803.4. Samples: 512223420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:39:29,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 11:39:31,250][52263] Updated weights for policy 0, policy_version 367548 (0.0028) [2024-04-27 11:39:34,107][52031] Fps is (10 sec: 50789.4, 60 sec: 53247.8, 300 sec: 53372.9). Total num frames: 6022004736. Throughput: 0: 53817.2. Samples: 512543360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:39:34,116][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 11:39:34,339][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000367556_6022037504.pth... 
[2024-04-27 11:39:34,400][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000366772_6009192448.pth [2024-04-27 11:39:34,643][52263] Updated weights for policy 0, policy_version 367558 (0.0029) [2024-04-27 11:39:37,949][52263] Updated weights for policy 0, policy_version 367568 (0.0029) [2024-04-27 11:39:39,106][52031] Fps is (10 sec: 52430.1, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6022283264. Throughput: 0: 53672.2. Samples: 512862100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 11:39:39,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 11:39:40,758][52263] Updated weights for policy 0, policy_version 367578 (0.0027) [2024-04-27 11:39:42,672][52242] Signal inference workers to stop experience collection... (7650 times) [2024-04-27 11:39:42,672][52242] Signal inference workers to resume experience collection... (7650 times) [2024-04-27 11:39:42,697][52263] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-04-27 11:39:42,697][52263] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-04-27 11:39:44,106][52031] Fps is (10 sec: 54068.5, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6022545408. Throughput: 0: 53588.0. Samples: 513019400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:39:44,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 11:39:44,145][52263] Updated weights for policy 0, policy_version 367588 (0.0030) [2024-04-27 11:39:46,977][52263] Updated weights for policy 0, policy_version 367598 (0.0033) [2024-04-27 11:39:49,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6022840320. Throughput: 0: 53454.8. Samples: 513338460. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:39:49,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 11:39:50,471][52263] Updated weights for policy 0, policy_version 367608 (0.0034) [2024-04-27 11:39:52,994][52263] Updated weights for policy 0, policy_version 367618 (0.0027) [2024-04-27 11:39:54,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6023102464. Throughput: 0: 53230.5. Samples: 513653380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:39:54,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 11:39:56,758][52263] Updated weights for policy 0, policy_version 367628 (0.0036) [2024-04-27 11:39:59,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6023364608. Throughput: 0: 53735.7. Samples: 513824380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:39:59,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 11:39:59,234][52263] Updated weights for policy 0, policy_version 367638 (0.0027) [2024-04-27 11:40:02,841][52263] Updated weights for policy 0, policy_version 367648 (0.0028) [2024-04-27 11:40:04,107][52031] Fps is (10 sec: 49151.4, 60 sec: 52701.9, 300 sec: 53373.0). Total num frames: 6023593984. Throughput: 0: 53516.3. Samples: 514140720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:04,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 11:40:05,390][52263] Updated weights for policy 0, policy_version 367658 (0.0023) [2024-04-27 11:40:08,872][52263] Updated weights for policy 0, policy_version 367668 (0.0028) [2024-04-27 11:40:09,106][52031] Fps is (10 sec: 50789.8, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6023872512. Throughput: 0: 53504.9. Samples: 514460980. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:09,107][52031] Avg episode reward: [(0, '0.476')] [2024-04-27 11:40:11,601][52263] Updated weights for policy 0, policy_version 367678 (0.0028) [2024-04-27 11:40:14,107][52031] Fps is (10 sec: 57343.6, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6024167424. Throughput: 0: 53112.0. Samples: 514613460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:14,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:40:15,089][52263] Updated weights for policy 0, policy_version 367688 (0.0032) [2024-04-27 11:40:17,792][52263] Updated weights for policy 0, policy_version 367698 (0.0031) [2024-04-27 11:40:19,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 6024413184. Throughput: 0: 53074.1. Samples: 514931680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:19,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 11:40:21,213][52263] Updated weights for policy 0, policy_version 367708 (0.0031) [2024-04-27 11:40:23,869][52263] Updated weights for policy 0, policy_version 367718 (0.0031) [2024-04-27 11:40:24,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6024708096. Throughput: 0: 53050.5. Samples: 515249380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:24,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 11:40:27,380][52263] Updated weights for policy 0, policy_version 367728 (0.0033) [2024-04-27 11:40:29,107][52031] Fps is (10 sec: 52427.6, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6024937472. Throughput: 0: 53089.1. Samples: 515408420. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:29,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:40:29,898][52263] Updated weights for policy 0, policy_version 367738 (0.0029) [2024-04-27 11:40:33,729][52263] Updated weights for policy 0, policy_version 367748 (0.0034) [2024-04-27 11:40:34,107][52031] Fps is (10 sec: 49152.0, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6025199616. Throughput: 0: 53156.7. Samples: 515730520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:34,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 11:40:35,986][52263] Updated weights for policy 0, policy_version 367758 (0.0031) [2024-04-27 11:40:37,643][52242] Signal inference workers to stop experience collection... (7700 times) [2024-04-27 11:40:37,649][52242] Signal inference workers to resume experience collection... (7700 times) [2024-04-27 11:40:37,664][52263] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-04-27 11:40:37,664][52263] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-04-27 11:40:39,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52974.8, 300 sec: 53373.0). Total num frames: 6025461760. Throughput: 0: 53205.6. Samples: 516047640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:39,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 11:40:39,882][52263] Updated weights for policy 0, policy_version 367768 (0.0034) [2024-04-27 11:40:42,207][52263] Updated weights for policy 0, policy_version 367778 (0.0031) [2024-04-27 11:40:44,106][52031] Fps is (10 sec: 57344.7, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6025773056. Throughput: 0: 53023.0. Samples: 516210420. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:44,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 11:40:46,361][52263] Updated weights for policy 0, policy_version 367788 (0.0026) [2024-04-27 11:40:48,292][52263] Updated weights for policy 0, policy_version 367798 (0.0036) [2024-04-27 11:40:49,107][52031] Fps is (10 sec: 57344.2, 60 sec: 53247.9, 300 sec: 53484.1). Total num frames: 6026035200. Throughput: 0: 53114.2. Samples: 516530860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:49,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 11:40:52,343][52263] Updated weights for policy 0, policy_version 367808 (0.0031) [2024-04-27 11:40:54,107][52031] Fps is (10 sec: 50789.4, 60 sec: 52974.7, 300 sec: 53484.0). Total num frames: 6026280960. Throughput: 0: 53241.6. Samples: 516856860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-04-27 11:40:54,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 11:40:54,461][52263] Updated weights for policy 0, policy_version 367818 (0.0027) [2024-04-27 11:40:58,381][52263] Updated weights for policy 0, policy_version 367828 (0.0027) [2024-04-27 11:40:59,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52701.8, 300 sec: 53428.5). Total num frames: 6026526720. Throughput: 0: 53077.1. Samples: 517001920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:40:59,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 11:41:00,515][52263] Updated weights for policy 0, policy_version 367838 (0.0028) [2024-04-27 11:41:04,106][52031] Fps is (10 sec: 50791.5, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6026788864. Throughput: 0: 53076.9. Samples: 517320140. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:04,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 11:41:04,532][52263] Updated weights for policy 0, policy_version 367848 (0.0028) [2024-04-27 11:41:06,726][52263] Updated weights for policy 0, policy_version 367858 (0.0030) [2024-04-27 11:41:09,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6027083776. Throughput: 0: 53130.4. Samples: 517640240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:09,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 11:41:10,774][52263] Updated weights for policy 0, policy_version 367868 (0.0029) [2024-04-27 11:41:12,781][52263] Updated weights for policy 0, policy_version 367878 (0.0025) [2024-04-27 11:41:14,107][52031] Fps is (10 sec: 57342.3, 60 sec: 53247.9, 300 sec: 53428.4). Total num frames: 6027362304. Throughput: 0: 53443.9. Samples: 517813400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:14,107][52031] Avg episode reward: [(0, '0.443')] [2024-04-27 11:41:16,857][52263] Updated weights for policy 0, policy_version 367888 (0.0034) [2024-04-27 11:41:18,914][52263] Updated weights for policy 0, policy_version 367898 (0.0025) [2024-04-27 11:41:19,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6027640832. Throughput: 0: 53475.6. Samples: 518136920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:19,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 11:41:22,839][52263] Updated weights for policy 0, policy_version 367908 (0.0031) [2024-04-27 11:41:23,608][52242] Signal inference workers to stop experience collection... (7750 times) [2024-04-27 11:41:23,609][52242] Signal inference workers to resume experience collection... 
(7750 times) [2024-04-27 11:41:23,631][52263] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-04-27 11:41:23,631][52263] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-04-27 11:41:24,106][52031] Fps is (10 sec: 50792.2, 60 sec: 52702.0, 300 sec: 53428.5). Total num frames: 6027870208. Throughput: 0: 53628.2. Samples: 518460900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:24,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 11:41:24,951][52263] Updated weights for policy 0, policy_version 367918 (0.0027) [2024-04-27 11:41:29,037][52263] Updated weights for policy 0, policy_version 367928 (0.0026) [2024-04-27 11:41:29,107][52031] Fps is (10 sec: 49151.3, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6028132352. Throughput: 0: 53115.8. Samples: 518600640. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:29,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 11:41:31,182][52263] Updated weights for policy 0, policy_version 367938 (0.0028) [2024-04-27 11:41:34,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6028394496. Throughput: 0: 53159.1. Samples: 518923020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:34,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 11:41:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000367944_6028394496.pth... [2024-04-27 11:41:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000367163_6015598592.pth [2024-04-27 11:41:35,224][52263] Updated weights for policy 0, policy_version 367948 (0.0035) [2024-04-27 11:41:37,372][52263] Updated weights for policy 0, policy_version 367958 (0.0029) [2024-04-27 11:41:39,107][52031] Fps is (10 sec: 58982.9, 60 sec: 54340.3, 300 sec: 53484.0). Total num frames: 6028722176. Throughput: 0: 52980.5. Samples: 519240980. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:39,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 11:41:41,229][52263] Updated weights for policy 0, policy_version 367968 (0.0032) [2024-04-27 11:41:43,484][52263] Updated weights for policy 0, policy_version 367978 (0.0028) [2024-04-27 11:41:44,107][52031] Fps is (10 sec: 57344.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6028967936. Throughput: 0: 53669.6. Samples: 519417060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:44,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 11:41:47,433][52263] Updated weights for policy 0, policy_version 367988 (0.0031) [2024-04-27 11:41:49,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6029230080. Throughput: 0: 53623.9. Samples: 519733220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:49,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:41:49,756][52263] Updated weights for policy 0, policy_version 367998 (0.0031) [2024-04-27 11:41:53,637][52263] Updated weights for policy 0, policy_version 368008 (0.0032) [2024-04-27 11:41:54,106][52031] Fps is (10 sec: 49152.4, 60 sec: 52975.1, 300 sec: 53373.0). Total num frames: 6029459456. Throughput: 0: 53675.5. Samples: 520055640. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:54,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 11:41:55,772][52263] Updated weights for policy 0, policy_version 368018 (0.0031) [2024-04-27 11:41:59,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6029737984. Throughput: 0: 53207.4. Samples: 520207720. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:41:59,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 11:41:59,544][52263] Updated weights for policy 0, policy_version 368028 (0.0033) [2024-04-27 11:42:01,776][52263] Updated weights for policy 0, policy_version 368038 (0.0039) [2024-04-27 11:42:04,107][52031] Fps is (10 sec: 57343.5, 60 sec: 54067.0, 300 sec: 53484.0). Total num frames: 6030032896. Throughput: 0: 53169.2. Samples: 520529540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:42:04,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 11:42:05,805][52263] Updated weights for policy 0, policy_version 368048 (0.0026) [2024-04-27 11:42:07,290][52242] Signal inference workers to stop experience collection... (7800 times) [2024-04-27 11:42:07,291][52242] Signal inference workers to resume experience collection... (7800 times) [2024-04-27 11:42:07,304][52263] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-04-27 11:42:07,305][52263] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-04-27 11:42:07,939][52263] Updated weights for policy 0, policy_version 368058 (0.0022) [2024-04-27 11:42:09,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6030311424. Throughput: 0: 52978.1. Samples: 520844920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:42:09,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 11:42:11,917][52263] Updated weights for policy 0, policy_version 368068 (0.0032) [2024-04-27 11:42:14,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6030589952. Throughput: 0: 53696.6. Samples: 521016980. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 11:42:14,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 11:42:14,109][52263] Updated weights for policy 0, policy_version 368078 (0.0025) [2024-04-27 11:42:18,075][52263] Updated weights for policy 0, policy_version 368088 (0.0034) [2024-04-27 11:42:19,106][52031] Fps is (10 sec: 47513.9, 60 sec: 52428.9, 300 sec: 53317.4). Total num frames: 6030786560. Throughput: 0: 53614.4. Samples: 521335660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:42:19,107][52031] Avg episode reward: [(0, '0.481')] [2024-04-27 11:42:20,167][52263] Updated weights for policy 0, policy_version 368098 (0.0034) [2024-04-27 11:42:24,106][52031] Fps is (10 sec: 47514.0, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6031065088. Throughput: 0: 53759.7. Samples: 521660160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:42:24,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 11:42:24,112][52263] Updated weights for policy 0, policy_version 368108 (0.0028) [2024-04-27 11:42:26,304][52263] Updated weights for policy 0, policy_version 368118 (0.0025) [2024-04-27 11:42:29,107][52031] Fps is (10 sec: 55704.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6031343616. Throughput: 0: 53181.3. Samples: 521810220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:42:29,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 11:42:30,060][52263] Updated weights for policy 0, policy_version 368128 (0.0029) [2024-04-27 11:42:32,550][52263] Updated weights for policy 0, policy_version 368138 (0.0028) [2024-04-27 11:42:34,107][52031] Fps is (10 sec: 58981.1, 60 sec: 54340.2, 300 sec: 53428.5). Total num frames: 6031654912. Throughput: 0: 53344.3. Samples: 522133720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:42:34,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 11:42:35,995][52263] Updated weights for policy 0, policy_version 368148 (0.0027) [2024-04-27 11:42:38,804][52263] Updated weights for policy 0, policy_version 368158 (0.0029) [2024-04-27 11:42:39,106][52031] Fps is (10 sec: 57345.0, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6031917056. Throughput: 0: 53375.6. Samples: 522457540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:42:39,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 11:42:42,060][52263] Updated weights for policy 0, policy_version 368168 (0.0032) [2024-04-27 11:42:44,106][52031] Fps is (10 sec: 50791.5, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6032162816. Throughput: 0: 53597.4. Samples: 522619600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:42:44,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 11:42:44,991][52263] Updated weights for policy 0, policy_version 368178 (0.0024) [2024-04-27 11:42:48,202][52263] Updated weights for policy 0, policy_version 368188 (0.0029) [2024-04-27 11:42:49,106][52031] Fps is (10 sec: 47513.4, 60 sec: 52701.9, 300 sec: 53261.9). Total num frames: 6032392192. Throughput: 0: 53698.8. Samples: 522945980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:42:49,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 11:42:50,979][52263] Updated weights for policy 0, policy_version 368198 (0.0035) [2024-04-27 11:42:54,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6032687104. Throughput: 0: 53776.9. Samples: 523264880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:42:54,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 11:42:54,260][52263] Updated weights for policy 0, policy_version 368208 (0.0028) [2024-04-27 11:42:55,559][52242] Signal inference workers to stop experience collection... 
(7850 times) [2024-04-27 11:42:55,589][52263] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-04-27 11:42:55,619][52242] Signal inference workers to resume experience collection... (7850 times) [2024-04-27 11:42:55,624][52263] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-04-27 11:42:57,002][52263] Updated weights for policy 0, policy_version 368218 (0.0029) [2024-04-27 11:42:59,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6032949248. Throughput: 0: 53419.2. Samples: 523420840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:42:59,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 11:43:00,162][52263] Updated weights for policy 0, policy_version 368228 (0.0028) [2024-04-27 11:43:03,207][52263] Updated weights for policy 0, policy_version 368238 (0.0027) [2024-04-27 11:43:04,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6033244160. Throughput: 0: 53485.6. Samples: 523742520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:43:04,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 11:43:06,421][52263] Updated weights for policy 0, policy_version 368248 (0.0029) [2024-04-27 11:43:09,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6033506304. Throughput: 0: 53478.1. Samples: 524066680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:43:09,107][52031] Avg episode reward: [(0, '0.690')] [2024-04-27 11:43:09,406][52263] Updated weights for policy 0, policy_version 368258 (0.0029) [2024-04-27 11:43:13,082][52263] Updated weights for policy 0, policy_version 368268 (0.0029) [2024-04-27 11:43:14,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6033784832. Throughput: 0: 53913.9. Samples: 524236340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:43:14,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 11:43:15,462][52263] Updated weights for policy 0, policy_version 368278 (0.0025) [2024-04-27 11:43:19,048][52263] Updated weights for policy 0, policy_version 368288 (0.0032) [2024-04-27 11:43:19,106][52031] Fps is (10 sec: 52428.7, 60 sec: 54067.1, 300 sec: 53428.5). Total num frames: 6034030592. Throughput: 0: 53853.5. Samples: 524557120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:43:19,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 11:43:21,632][52263] Updated weights for policy 0, policy_version 368298 (0.0030) [2024-04-27 11:43:24,106][52031] Fps is (10 sec: 54067.3, 60 sec: 54340.2, 300 sec: 53484.1). Total num frames: 6034325504. Throughput: 0: 53706.6. Samples: 524874340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:43:24,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:43:25,126][52263] Updated weights for policy 0, policy_version 368308 (0.0027) [2024-04-27 11:43:27,644][52263] Updated weights for policy 0, policy_version 368318 (0.0029) [2024-04-27 11:43:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6034571264. Throughput: 0: 53601.1. Samples: 525031660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:43:29,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 11:43:31,196][52263] Updated weights for policy 0, policy_version 368328 (0.0035) [2024-04-27 11:43:33,893][52263] Updated weights for policy 0, policy_version 368338 (0.0030) [2024-04-27 11:43:34,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6034849792. Throughput: 0: 53428.1. Samples: 525350240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 11:43:34,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 11:43:34,111][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000368339_6034866176.pth... [2024-04-27 11:43:34,161][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000367556_6022037504.pth [2024-04-27 11:43:37,060][52263] Updated weights for policy 0, policy_version 368348 (0.0027) [2024-04-27 11:43:39,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6035111936. Throughput: 0: 53542.5. Samples: 525674300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:43:39,107][52031] Avg episode reward: [(0, '0.499')] [2024-04-27 11:43:40,003][52263] Updated weights for policy 0, policy_version 368358 (0.0029) [2024-04-27 11:43:43,517][52263] Updated weights for policy 0, policy_version 368368 (0.0025) [2024-04-27 11:43:44,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6035374080. Throughput: 0: 53754.5. Samples: 525839800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:43:44,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 11:43:46,002][52242] Signal inference workers to stop experience collection... (7900 times) [2024-04-27 11:43:46,002][52242] Signal inference workers to resume experience collection... (7900 times) [2024-04-27 11:43:46,027][52263] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-04-27 11:43:46,028][52263] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-04-27 11:43:46,125][52263] Updated weights for policy 0, policy_version 368378 (0.0037) [2024-04-27 11:43:49,107][52031] Fps is (10 sec: 54067.3, 60 sec: 54340.2, 300 sec: 53428.5). Total num frames: 6035652608. Throughput: 0: 53814.3. Samples: 526164160. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:43:49,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 11:43:49,494][52263] Updated weights for policy 0, policy_version 368388 (0.0035) [2024-04-27 11:43:52,278][52263] Updated weights for policy 0, policy_version 368398 (0.0027) [2024-04-27 11:43:54,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53793.9, 300 sec: 53428.5). Total num frames: 6035914752. Throughput: 0: 53626.0. Samples: 526479860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:43:54,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 11:43:55,688][52263] Updated weights for policy 0, policy_version 368408 (0.0029) [2024-04-27 11:43:58,455][52263] Updated weights for policy 0, policy_version 368418 (0.0028) [2024-04-27 11:43:59,107][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.1, 300 sec: 53428.5). Total num frames: 6036193280. Throughput: 0: 53443.1. Samples: 526641280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:43:59,107][52031] Avg episode reward: [(0, '0.668')] [2024-04-27 11:44:01,922][52263] Updated weights for policy 0, policy_version 368428 (0.0029) [2024-04-27 11:44:04,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6036439040. Throughput: 0: 53486.1. Samples: 526964000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:04,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 11:44:04,694][52263] Updated weights for policy 0, policy_version 368438 (0.0029) [2024-04-27 11:44:08,070][52263] Updated weights for policy 0, policy_version 368448 (0.0031) [2024-04-27 11:44:09,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6036733952. Throughput: 0: 53604.0. Samples: 527286520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:09,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 11:44:10,947][52263] Updated weights for policy 0, policy_version 368458 (0.0033) [2024-04-27 11:44:14,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6036963328. Throughput: 0: 53529.4. Samples: 527440480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:14,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 11:44:14,312][52263] Updated weights for policy 0, policy_version 368468 (0.0030) [2024-04-27 11:44:16,997][52263] Updated weights for policy 0, policy_version 368478 (0.0031) [2024-04-27 11:44:19,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6037241856. Throughput: 0: 53575.4. Samples: 527761140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:19,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 11:44:20,308][52263] Updated weights for policy 0, policy_version 368488 (0.0030) [2024-04-27 11:44:23,075][52263] Updated weights for policy 0, policy_version 368498 (0.0033) [2024-04-27 11:44:24,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6037520384. Throughput: 0: 53443.9. Samples: 528079280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:24,107][52031] Avg episode reward: [(0, '0.655')] [2024-04-27 11:44:26,399][52263] Updated weights for policy 0, policy_version 368508 (0.0027) [2024-04-27 11:44:29,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6037766144. Throughput: 0: 53507.3. Samples: 528247620. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:29,107][52031] Avg episode reward: [(0, '0.678')] [2024-04-27 11:44:29,268][52263] Updated weights for policy 0, policy_version 368518 (0.0028) [2024-04-27 11:44:32,573][52263] Updated weights for policy 0, policy_version 368528 (0.0040) [2024-04-27 11:44:34,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6038044672. Throughput: 0: 53430.3. Samples: 528568520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:34,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 11:44:35,476][52263] Updated weights for policy 0, policy_version 368538 (0.0027) [2024-04-27 11:44:38,756][52263] Updated weights for policy 0, policy_version 368548 (0.0026) [2024-04-27 11:44:39,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6038306816. Throughput: 0: 53554.7. Samples: 528889820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:39,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 11:44:41,519][52263] Updated weights for policy 0, policy_version 368558 (0.0029) [2024-04-27 11:44:42,215][52242] Signal inference workers to stop experience collection... (7950 times) [2024-04-27 11:44:42,216][52242] Signal inference workers to resume experience collection... (7950 times) [2024-04-27 11:44:42,232][52263] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-04-27 11:44:42,232][52263] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-04-27 11:44:44,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6038585344. Throughput: 0: 53351.9. Samples: 529042120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:44,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 11:44:44,910][52263] Updated weights for policy 0, policy_version 368568 (0.0023) [2024-04-27 11:44:47,706][52263] Updated weights for policy 0, policy_version 368578 (0.0033) [2024-04-27 11:44:49,107][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6038863872. Throughput: 0: 53429.4. Samples: 529368320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 11:44:49,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 11:44:50,880][52263] Updated weights for policy 0, policy_version 368588 (0.0030) [2024-04-27 11:44:53,857][52263] Updated weights for policy 0, policy_version 368598 (0.0029) [2024-04-27 11:44:54,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6039126016. Throughput: 0: 53400.7. Samples: 529689560. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:44:54,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 11:44:56,977][52263] Updated weights for policy 0, policy_version 368608 (0.0031) [2024-04-27 11:44:59,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6039371776. Throughput: 0: 53501.8. Samples: 529848060. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:44:59,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 11:44:59,926][52263] Updated weights for policy 0, policy_version 368618 (0.0030) [2024-04-27 11:45:03,127][52263] Updated weights for policy 0, policy_version 368628 (0.0029) [2024-04-27 11:45:04,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6039650304. Throughput: 0: 53375.4. Samples: 530163040. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:04,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 11:45:06,057][52263] Updated weights for policy 0, policy_version 368638 (0.0033) [2024-04-27 11:45:09,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52701.9, 300 sec: 53317.5). Total num frames: 6039896064. Throughput: 0: 53446.0. Samples: 530484340. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:09,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 11:45:09,262][52263] Updated weights for policy 0, policy_version 368648 (0.0028) [2024-04-27 11:45:12,311][52263] Updated weights for policy 0, policy_version 368658 (0.0030) [2024-04-27 11:45:14,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6040190976. Throughput: 0: 53333.1. Samples: 530647620. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:14,107][52031] Avg episode reward: [(0, '0.475')] [2024-04-27 11:45:15,351][52263] Updated weights for policy 0, policy_version 368668 (0.0035) [2024-04-27 11:45:18,439][52263] Updated weights for policy 0, policy_version 368678 (0.0031) [2024-04-27 11:45:19,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6040453120. Throughput: 0: 53276.5. Samples: 530965960. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:19,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 11:45:21,569][52263] Updated weights for policy 0, policy_version 368688 (0.0027) [2024-04-27 11:45:24,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52702.0, 300 sec: 53373.0). Total num frames: 6040682496. Throughput: 0: 53186.0. Samples: 531283180. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:24,107][52031] Avg episode reward: [(0, '0.463')] [2024-04-27 11:45:24,587][52263] Updated weights for policy 0, policy_version 368698 (0.0025) [2024-04-27 11:45:27,624][52263] Updated weights for policy 0, policy_version 368708 (0.0029) [2024-04-27 11:45:29,106][52031] Fps is (10 sec: 50790.1, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6040961024. Throughput: 0: 53206.0. Samples: 531436380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:29,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 11:45:30,353][52242] Signal inference workers to stop experience collection... (8000 times) [2024-04-27 11:45:30,398][52263] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-04-27 11:45:30,414][52242] Signal inference workers to resume experience collection... (8000 times) [2024-04-27 11:45:30,417][52263] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-04-27 11:45:30,685][52263] Updated weights for policy 0, policy_version 368718 (0.0027) [2024-04-27 11:45:33,809][52263] Updated weights for policy 0, policy_version 368728 (0.0037) [2024-04-27 11:45:34,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6041255936. Throughput: 0: 53107.5. Samples: 531758160. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:34,116][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 11:45:34,126][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000368729_6041255936.pth... [2024-04-27 11:45:34,176][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000367944_6028394496.pth [2024-04-27 11:45:36,852][52263] Updated weights for policy 0, policy_version 368738 (0.0040) [2024-04-27 11:45:39,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.2, 300 sec: 53317.4). Total num frames: 6041501696. Throughput: 0: 53038.5. Samples: 532076280. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:39,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 11:45:39,903][52263] Updated weights for policy 0, policy_version 368748 (0.0028) [2024-04-27 11:45:42,990][52263] Updated weights for policy 0, policy_version 368758 (0.0031) [2024-04-27 11:45:44,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6041796608. Throughput: 0: 53054.7. Samples: 532235520. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:44,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 11:45:46,162][52263] Updated weights for policy 0, policy_version 368768 (0.0032) [2024-04-27 11:45:49,107][52031] Fps is (10 sec: 54066.0, 60 sec: 52974.8, 300 sec: 53428.5). Total num frames: 6042042368. Throughput: 0: 53243.6. Samples: 532559000. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:49,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 11:45:49,294][52263] Updated weights for policy 0, policy_version 368778 (0.0030) [2024-04-27 11:45:52,186][52263] Updated weights for policy 0, policy_version 368788 (0.0032) [2024-04-27 11:45:54,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52975.2, 300 sec: 53484.1). Total num frames: 6042304512. Throughput: 0: 53199.2. Samples: 532878300. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:54,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 11:45:55,400][52263] Updated weights for policy 0, policy_version 368798 (0.0025) [2024-04-27 11:45:58,395][52263] Updated weights for policy 0, policy_version 368808 (0.0034) [2024-04-27 11:45:59,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6042566656. Throughput: 0: 52827.7. Samples: 533024860. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:45:59,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 11:46:01,594][52263] Updated weights for policy 0, policy_version 368818 (0.0039) [2024-04-27 11:46:04,107][52031] Fps is (10 sec: 52427.2, 60 sec: 52974.9, 300 sec: 53372.9). Total num frames: 6042828800. Throughput: 0: 52967.7. Samples: 533349520. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:46:04,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 11:46:04,611][52263] Updated weights for policy 0, policy_version 368828 (0.0027) [2024-04-27 11:46:07,867][52263] Updated weights for policy 0, policy_version 368838 (0.0030) [2024-04-27 11:46:09,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6043123712. Throughput: 0: 53015.5. Samples: 533668880. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 11:46:09,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:46:10,589][52263] Updated weights for policy 0, policy_version 368848 (0.0029) [2024-04-27 11:46:13,842][52263] Updated weights for policy 0, policy_version 368858 (0.0032) [2024-04-27 11:46:14,107][52031] Fps is (10 sec: 55706.3, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6043385856. Throughput: 0: 53377.2. Samples: 533838360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:14,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 11:46:16,637][52263] Updated weights for policy 0, policy_version 368868 (0.0028) [2024-04-27 11:46:19,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52974.7, 300 sec: 53428.4). Total num frames: 6043631616. Throughput: 0: 53234.5. Samples: 534153720. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:19,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 11:46:19,903][52263] Updated weights for policy 0, policy_version 368878 (0.0028) [2024-04-27 11:46:22,886][52263] Updated weights for policy 0, policy_version 368888 (0.0034) [2024-04-27 11:46:24,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.0, 300 sec: 53484.1). Total num frames: 6043910144. Throughput: 0: 53240.7. Samples: 534472120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:24,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 11:46:26,087][52263] Updated weights for policy 0, policy_version 368898 (0.0027) [2024-04-27 11:46:28,991][52263] Updated weights for policy 0, policy_version 368908 (0.0026) [2024-04-27 11:46:29,107][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6044188672. Throughput: 0: 53237.4. Samples: 534631200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:29,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 11:46:32,270][52263] Updated weights for policy 0, policy_version 368918 (0.0026) [2024-04-27 11:46:34,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6044450816. Throughput: 0: 53166.7. Samples: 534951500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:34,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 11:46:35,166][52263] Updated weights for policy 0, policy_version 368928 (0.0030) [2024-04-27 11:46:38,420][52263] Updated weights for policy 0, policy_version 368938 (0.0033) [2024-04-27 11:46:39,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53520.9, 300 sec: 53373.0). Total num frames: 6044712960. Throughput: 0: 53194.4. Samples: 535272060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:39,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 11:46:40,646][52242] Signal inference workers to stop experience collection... 
(8050 times) [2024-04-27 11:46:40,647][52242] Signal inference workers to resume experience collection... (8050 times) [2024-04-27 11:46:40,670][52263] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-04-27 11:46:40,670][52263] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-04-27 11:46:41,100][52263] Updated weights for policy 0, policy_version 368948 (0.0027) [2024-04-27 11:46:44,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 6044975104. Throughput: 0: 53524.8. Samples: 535433480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:44,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 11:46:44,377][52263] Updated weights for policy 0, policy_version 368958 (0.0026) [2024-04-27 11:46:47,323][52263] Updated weights for policy 0, policy_version 368968 (0.0024) [2024-04-27 11:46:49,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6045237248. Throughput: 0: 53436.3. Samples: 535754140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:49,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 11:46:50,443][52263] Updated weights for policy 0, policy_version 368978 (0.0031) [2024-04-27 11:46:53,749][52263] Updated weights for policy 0, policy_version 368988 (0.0035) [2024-04-27 11:46:54,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6045499392. Throughput: 0: 53489.0. Samples: 536075880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:54,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 11:46:56,732][52263] Updated weights for policy 0, policy_version 368998 (0.0032) [2024-04-27 11:46:59,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6045777920. Throughput: 0: 53334.9. Samples: 536238420. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:46:59,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 11:46:59,791][52263] Updated weights for policy 0, policy_version 369008 (0.0030) [2024-04-27 11:47:02,879][52263] Updated weights for policy 0, policy_version 369018 (0.0035) [2024-04-27 11:47:04,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.3, 300 sec: 53317.4). Total num frames: 6046040064. Throughput: 0: 53416.2. Samples: 536557440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:47:04,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 11:47:05,721][52263] Updated weights for policy 0, policy_version 369028 (0.0024) [2024-04-27 11:47:08,923][52263] Updated weights for policy 0, policy_version 369038 (0.0025) [2024-04-27 11:47:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6046318592. Throughput: 0: 53438.9. Samples: 536876860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:47:09,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 11:47:11,822][52263] Updated weights for policy 0, policy_version 369048 (0.0026) [2024-04-27 11:47:14,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6046580736. Throughput: 0: 53378.2. Samples: 537033220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:47:14,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 11:47:15,072][52263] Updated weights for policy 0, policy_version 369058 (0.0031) [2024-04-27 11:47:18,139][52263] Updated weights for policy 0, policy_version 369068 (0.0036) [2024-04-27 11:47:19,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6046859264. Throughput: 0: 53439.2. Samples: 537356260. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:47:19,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 11:47:21,240][52263] Updated weights for policy 0, policy_version 369078 (0.0034) [2024-04-27 11:47:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6047121408. Throughput: 0: 53370.4. Samples: 537673720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:47:24,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 11:47:24,138][52263] Updated weights for policy 0, policy_version 369088 (0.0034) [2024-04-27 11:47:27,570][52263] Updated weights for policy 0, policy_version 369098 (0.0031) [2024-04-27 11:47:29,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.1, 300 sec: 53317.5). Total num frames: 6047383552. Throughput: 0: 53329.5. Samples: 537833300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 11:47:29,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 11:47:30,174][52263] Updated weights for policy 0, policy_version 369108 (0.0027) [2024-04-27 11:47:33,554][52263] Updated weights for policy 0, policy_version 369118 (0.0028) [2024-04-27 11:47:34,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 6047629312. Throughput: 0: 53259.5. Samples: 538150820. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:47:34,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 11:47:34,139][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000369119_6047645696.pth... [2024-04-27 11:47:34,189][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000368339_6034866176.pth [2024-04-27 11:47:36,902][52263] Updated weights for policy 0, policy_version 369128 (0.0032) [2024-04-27 11:47:39,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6047924224. Throughput: 0: 53335.9. Samples: 538476000. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:47:39,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:47:39,552][52263] Updated weights for policy 0, policy_version 369138 (0.0027) [2024-04-27 11:47:42,976][52263] Updated weights for policy 0, policy_version 369148 (0.0027) [2024-04-27 11:47:43,485][52242] Signal inference workers to stop experience collection... (8100 times) [2024-04-27 11:47:43,485][52242] Signal inference workers to resume experience collection... (8100 times) [2024-04-27 11:47:43,511][52263] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-04-27 11:47:43,515][52263] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-04-27 11:47:44,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6048186368. Throughput: 0: 53394.2. Samples: 538641160. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:47:44,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 11:47:45,940][52263] Updated weights for policy 0, policy_version 369158 (0.0032) [2024-04-27 11:47:49,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53247.8, 300 sec: 53372.9). Total num frames: 6048432128. Throughput: 0: 53346.1. Samples: 538958020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:47:49,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 11:47:49,125][52263] Updated weights for policy 0, policy_version 369168 (0.0032) [2024-04-27 11:47:52,218][52263] Updated weights for policy 0, policy_version 369178 (0.0031) [2024-04-27 11:47:54,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6048727040. Throughput: 0: 53351.5. Samples: 539277680. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:47:54,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:47:55,180][52263] Updated weights for policy 0, policy_version 369188 (0.0028) [2024-04-27 11:47:58,384][52263] Updated weights for policy 0, policy_version 369198 (0.0031) [2024-04-27 11:47:59,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6048972800. Throughput: 0: 53433.0. Samples: 539437700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:47:59,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 11:48:01,179][52263] Updated weights for policy 0, policy_version 369208 (0.0029) [2024-04-27 11:48:04,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6049234944. Throughput: 0: 53373.8. Samples: 539758080. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:48:04,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 11:48:04,601][52263] Updated weights for policy 0, policy_version 369218 (0.0031) [2024-04-27 11:48:07,164][52263] Updated weights for policy 0, policy_version 369228 (0.0027) [2024-04-27 11:48:09,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6049513472. Throughput: 0: 53392.8. Samples: 540076400. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:48:09,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 11:48:10,698][52263] Updated weights for policy 0, policy_version 369238 (0.0029) [2024-04-27 11:48:13,342][52263] Updated weights for policy 0, policy_version 369248 (0.0028) [2024-04-27 11:48:14,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6049775616. Throughput: 0: 53547.4. Samples: 540242940. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:48:14,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 11:48:16,989][52263] Updated weights for policy 0, policy_version 369258 (0.0036) [2024-04-27 11:48:19,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6050054144. Throughput: 0: 53579.1. Samples: 540561880. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:48:19,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 11:48:19,412][52263] Updated weights for policy 0, policy_version 369268 (0.0031) [2024-04-27 11:48:22,940][52263] Updated weights for policy 0, policy_version 369278 (0.0032) [2024-04-27 11:48:24,107][52031] Fps is (10 sec: 52429.1, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6050299904. Throughput: 0: 53595.0. Samples: 540887780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:48:24,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 11:48:25,535][52263] Updated weights for policy 0, policy_version 369288 (0.0035) [2024-04-27 11:48:29,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 6050562048. Throughput: 0: 53388.9. Samples: 541043660. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:48:29,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 11:48:29,166][52263] Updated weights for policy 0, policy_version 369298 (0.0035) [2024-04-27 11:48:31,448][52263] Updated weights for policy 0, policy_version 369308 (0.0032) [2024-04-27 11:48:34,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.0, 300 sec: 53373.0). Total num frames: 6050856960. Throughput: 0: 53424.4. Samples: 541362120. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:48:34,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 11:48:35,331][52263] Updated weights for policy 0, policy_version 369318 (0.0027) [2024-04-27 11:48:37,582][52263] Updated weights for policy 0, policy_version 369328 (0.0028) [2024-04-27 11:48:39,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6051119104. Throughput: 0: 53432.0. Samples: 541682120. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:48:39,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 11:48:41,387][52263] Updated weights for policy 0, policy_version 369338 (0.0028) [2024-04-27 11:48:43,723][52263] Updated weights for policy 0, policy_version 369348 (0.0026) [2024-04-27 11:48:44,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6051397632. Throughput: 0: 53501.7. Samples: 541845280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-04-27 11:48:44,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 11:48:47,369][52263] Updated weights for policy 0, policy_version 369358 (0.0038) [2024-04-27 11:48:49,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.3, 300 sec: 53373.0). Total num frames: 6051659776. Throughput: 0: 53538.4. Samples: 542167300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:48:49,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 11:48:49,718][52263] Updated weights for policy 0, policy_version 369368 (0.0028) [2024-04-27 11:48:53,322][52242] Signal inference workers to stop experience collection... (8150 times) [2024-04-27 11:48:53,332][52263] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-04-27 11:48:53,419][52242] Signal inference workers to resume experience collection... 
(8150 times) [2024-04-27 11:48:53,419][52263] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-04-27 11:48:53,537][52263] Updated weights for policy 0, policy_version 369378 (0.0030) [2024-04-27 11:48:54,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 6051905536. Throughput: 0: 53675.3. Samples: 542491780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:48:54,107][52031] Avg episode reward: [(0, '0.659')] [2024-04-27 11:48:56,046][52263] Updated weights for policy 0, policy_version 369388 (0.0036) [2024-04-27 11:48:59,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6052184064. Throughput: 0: 53407.8. Samples: 542646280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:48:59,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:48:59,644][52263] Updated weights for policy 0, policy_version 369398 (0.0027) [2024-04-27 11:49:02,112][52263] Updated weights for policy 0, policy_version 369408 (0.0031) [2024-04-27 11:49:04,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.0, 300 sec: 53261.9). Total num frames: 6052446208. Throughput: 0: 53454.1. Samples: 542967320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:04,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 11:49:05,799][52263] Updated weights for policy 0, policy_version 369418 (0.0031) [2024-04-27 11:49:08,440][52263] Updated weights for policy 0, policy_version 369428 (0.0028) [2024-04-27 11:49:09,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6052724736. Throughput: 0: 53339.7. Samples: 543288060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:09,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 11:49:11,755][52263] Updated weights for policy 0, policy_version 369438 (0.0036) [2024-04-27 11:49:14,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.2, 300 sec: 53428.5). 
Total num frames: 6053003264. Throughput: 0: 53591.4. Samples: 543455280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:14,116][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 11:49:14,622][52263] Updated weights for policy 0, policy_version 369448 (0.0028) [2024-04-27 11:49:17,827][52263] Updated weights for policy 0, policy_version 369458 (0.0031) [2024-04-27 11:49:19,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6053281792. Throughput: 0: 53727.2. Samples: 543779840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:19,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 11:49:20,746][52263] Updated weights for policy 0, policy_version 369468 (0.0029) [2024-04-27 11:49:24,035][52263] Updated weights for policy 0, policy_version 369478 (0.0030) [2024-04-27 11:49:24,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6053527552. Throughput: 0: 53696.8. Samples: 544098480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:24,116][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 11:49:26,896][52263] Updated weights for policy 0, policy_version 369488 (0.0027) [2024-04-27 11:49:29,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.0, 300 sec: 53372.9). Total num frames: 6053789696. Throughput: 0: 53539.1. Samples: 544254540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:29,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 11:49:30,311][52263] Updated weights for policy 0, policy_version 369498 (0.0026) [2024-04-27 11:49:32,926][52263] Updated weights for policy 0, policy_version 369508 (0.0027) [2024-04-27 11:49:34,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6054051840. Throughput: 0: 53520.7. Samples: 544575740. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:34,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 11:49:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000369510_6054051840.pth... [2024-04-27 11:49:34,175][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000368729_6041255936.pth [2024-04-27 11:49:36,285][52263] Updated weights for policy 0, policy_version 369518 (0.0026) [2024-04-27 11:49:38,971][52263] Updated weights for policy 0, policy_version 369528 (0.0029) [2024-04-27 11:49:39,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6054346752. Throughput: 0: 53388.3. Samples: 544894260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:39,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 11:49:42,535][52263] Updated weights for policy 0, policy_version 369538 (0.0035) [2024-04-27 11:49:44,096][52242] Signal inference workers to stop experience collection... (8200 times) [2024-04-27 11:49:44,105][52242] Signal inference workers to resume experience collection... (8200 times) [2024-04-27 11:49:44,106][52031] Fps is (10 sec: 57345.1, 60 sec: 53794.3, 300 sec: 53428.5). Total num frames: 6054625280. Throughput: 0: 53752.1. Samples: 545065120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:44,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 11:49:44,118][52263] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-04-27 11:49:44,118][52263] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-04-27 11:49:45,061][52263] Updated weights for policy 0, policy_version 369548 (0.0029) [2024-04-27 11:49:48,718][52263] Updated weights for policy 0, policy_version 369558 (0.0030) [2024-04-27 11:49:49,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6054871040. Throughput: 0: 53674.7. Samples: 545382680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:49,116][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 11:49:51,208][52263] Updated weights for policy 0, policy_version 369568 (0.0029) [2024-04-27 11:49:54,106][52031] Fps is (10 sec: 50789.8, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6055133184. Throughput: 0: 53766.2. Samples: 545707540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:54,115][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:49:54,690][52263] Updated weights for policy 0, policy_version 369578 (0.0028) [2024-04-27 11:49:57,191][52263] Updated weights for policy 0, policy_version 369588 (0.0029) [2024-04-27 11:49:59,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6055411712. Throughput: 0: 53453.5. Samples: 545860680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:49:59,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 11:50:00,686][52263] Updated weights for policy 0, policy_version 369598 (0.0027) [2024-04-27 11:50:03,758][52263] Updated weights for policy 0, policy_version 369608 (0.0035) [2024-04-27 11:50:04,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6055673856. Throughput: 0: 53335.9. Samples: 546179960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 11:50:04,116][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 11:50:06,780][52263] Updated weights for policy 0, policy_version 369618 (0.0028) [2024-04-27 11:50:09,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6055952384. Throughput: 0: 53345.7. Samples: 546499040. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:09,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 11:50:09,861][52263] Updated weights for policy 0, policy_version 369628 (0.0029) [2024-04-27 11:50:13,016][52263] Updated weights for policy 0, policy_version 369638 (0.0038) [2024-04-27 11:50:14,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6056230912. Throughput: 0: 53620.9. Samples: 546667480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:14,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 11:50:15,927][52263] Updated weights for policy 0, policy_version 369648 (0.0030) [2024-04-27 11:50:19,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6056460288. Throughput: 0: 53664.9. Samples: 546990660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:19,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 11:50:19,181][52263] Updated weights for policy 0, policy_version 369658 (0.0033) [2024-04-27 11:50:21,885][52263] Updated weights for policy 0, policy_version 369668 (0.0035) [2024-04-27 11:50:24,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6056738816. Throughput: 0: 53769.9. Samples: 547313900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:24,107][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 11:50:25,161][52263] Updated weights for policy 0, policy_version 369678 (0.0035) [2024-04-27 11:50:27,882][52263] Updated weights for policy 0, policy_version 369688 (0.0030) [2024-04-27 11:50:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6057000960. Throughput: 0: 53351.9. Samples: 547465960. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:29,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 11:50:31,184][52263] Updated weights for policy 0, policy_version 369698 (0.0026) [2024-04-27 11:50:34,047][52263] Updated weights for policy 0, policy_version 369708 (0.0029) [2024-04-27 11:50:34,107][52031] Fps is (10 sec: 55704.7, 60 sec: 54067.2, 300 sec: 53539.5). Total num frames: 6057295872. Throughput: 0: 53569.7. Samples: 547793320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:34,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 11:50:37,402][52263] Updated weights for policy 0, policy_version 369718 (0.0027) [2024-04-27 11:50:39,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6057558016. Throughput: 0: 53428.9. Samples: 548111840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:39,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 11:50:40,343][52263] Updated weights for policy 0, policy_version 369728 (0.0028) [2024-04-27 11:50:43,520][52263] Updated weights for policy 0, policy_version 369738 (0.0032) [2024-04-27 11:50:43,653][52242] Signal inference workers to stop experience collection... (8250 times) [2024-04-27 11:50:43,653][52242] Signal inference workers to resume experience collection... (8250 times) [2024-04-27 11:50:43,680][52263] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-04-27 11:50:43,680][52263] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-04-27 11:50:44,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53247.8, 300 sec: 53484.1). Total num frames: 6057820160. Throughput: 0: 53788.8. Samples: 548281180. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:44,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 11:50:46,634][52263] Updated weights for policy 0, policy_version 369748 (0.0026) [2024-04-27 11:50:49,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6058082304. Throughput: 0: 53876.6. Samples: 548604400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:49,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 11:50:49,516][52263] Updated weights for policy 0, policy_version 369758 (0.0032) [2024-04-27 11:50:52,974][52263] Updated weights for policy 0, policy_version 369768 (0.0027) [2024-04-27 11:50:54,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6058360832. Throughput: 0: 53983.2. Samples: 548928280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:54,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 11:50:55,715][52263] Updated weights for policy 0, policy_version 369778 (0.0031) [2024-04-27 11:50:58,935][52263] Updated weights for policy 0, policy_version 369788 (0.0026) [2024-04-27 11:50:59,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53247.8, 300 sec: 53484.1). Total num frames: 6058606592. Throughput: 0: 53496.3. Samples: 549074820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:50:59,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 11:51:01,756][52263] Updated weights for policy 0, policy_version 369798 (0.0028) [2024-04-27 11:51:04,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6058901504. Throughput: 0: 53443.0. Samples: 549395600. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:51:04,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 11:51:04,960][52263] Updated weights for policy 0, policy_version 369808 (0.0031) [2024-04-27 11:51:07,960][52263] Updated weights for policy 0, policy_version 369818 (0.0033) [2024-04-27 11:51:09,107][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6059180032. Throughput: 0: 53306.5. Samples: 549712700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:51:09,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 11:51:11,166][52263] Updated weights for policy 0, policy_version 369828 (0.0029) [2024-04-27 11:51:14,106][52031] Fps is (10 sec: 50791.6, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6059409408. Throughput: 0: 53741.8. Samples: 549884340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:51:14,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 11:51:14,214][52263] Updated weights for policy 0, policy_version 369838 (0.0024) [2024-04-27 11:51:17,374][52263] Updated weights for policy 0, policy_version 369848 (0.0026) [2024-04-27 11:51:19,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6059687936. Throughput: 0: 53554.0. Samples: 550203240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:51:19,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 11:51:20,255][52263] Updated weights for policy 0, policy_version 369858 (0.0028) [2024-04-27 11:51:23,379][52263] Updated weights for policy 0, policy_version 369868 (0.0027) [2024-04-27 11:51:24,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6059933696. Throughput: 0: 53504.3. Samples: 550519540. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-04-27 11:51:24,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 11:51:26,407][52263] Updated weights for policy 0, policy_version 369878 (0.0028) [2024-04-27 11:51:29,107][52031] Fps is (10 sec: 50789.6, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6060195840. Throughput: 0: 53263.1. Samples: 550678020. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:51:29,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 11:51:29,419][52263] Updated weights for policy 0, policy_version 369888 (0.0028) [2024-04-27 11:51:32,475][52263] Updated weights for policy 0, policy_version 369898 (0.0023) [2024-04-27 11:51:34,107][52031] Fps is (10 sec: 57344.2, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6060507136. Throughput: 0: 53211.0. Samples: 550998900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:51:34,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 11:51:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000369904_6060507136.pth... [2024-04-27 11:51:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000369119_6047645696.pth [2024-04-27 11:51:35,547][52263] Updated weights for policy 0, policy_version 369908 (0.0032) [2024-04-27 11:51:38,593][52263] Updated weights for policy 0, policy_version 369918 (0.0036) [2024-04-27 11:51:39,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6060769280. Throughput: 0: 53242.3. Samples: 551324180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:51:39,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 11:51:42,090][52242] Signal inference workers to stop experience collection... (8300 times) [2024-04-27 11:51:42,125][52263] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-04-27 11:51:42,154][52242] Signal inference workers to resume experience collection... 
(8300 times) [2024-04-27 11:51:42,159][52263] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-04-27 11:51:42,162][52263] Updated weights for policy 0, policy_version 369928 (0.0027) [2024-04-27 11:51:44,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6061015040. Throughput: 0: 53563.8. Samples: 551485180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:51:44,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 11:51:44,693][52263] Updated weights for policy 0, policy_version 369938 (0.0037) [2024-04-27 11:51:48,190][52263] Updated weights for policy 0, policy_version 369948 (0.0028) [2024-04-27 11:51:49,106][52031] Fps is (10 sec: 49151.7, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6061260800. Throughput: 0: 53337.1. Samples: 551795760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:51:49,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 11:51:50,912][52263] Updated weights for policy 0, policy_version 369958 (0.0030) [2024-04-27 11:51:54,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6061539328. Throughput: 0: 53464.6. Samples: 552118600. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:51:54,107][52031] Avg episode reward: [(0, '0.503')] [2024-04-27 11:51:54,164][52263] Updated weights for policy 0, policy_version 369968 (0.0029) [2024-04-27 11:51:57,111][52263] Updated weights for policy 0, policy_version 369978 (0.0036) [2024-04-27 11:51:59,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6061817856. Throughput: 0: 53239.8. Samples: 552280140. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:51:59,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 11:52:00,331][52263] Updated weights for policy 0, policy_version 369988 (0.0030) [2024-04-27 11:52:03,205][52263] Updated weights for policy 0, policy_version 369998 (0.0034) [2024-04-27 11:52:04,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6062096384. Throughput: 0: 53248.7. Samples: 552599440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:52:04,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 11:52:06,418][52263] Updated weights for policy 0, policy_version 370008 (0.0029) [2024-04-27 11:52:09,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 6062342144. Throughput: 0: 53210.3. Samples: 552914000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:52:09,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 11:52:09,424][52263] Updated weights for policy 0, policy_version 370018 (0.0031) [2024-04-27 11:52:12,412][52263] Updated weights for policy 0, policy_version 370028 (0.0023) [2024-04-27 11:52:14,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6062620672. Throughput: 0: 53205.4. Samples: 553072260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:52:14,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 11:52:15,687][52263] Updated weights for policy 0, policy_version 370038 (0.0030) [2024-04-27 11:52:18,859][52263] Updated weights for policy 0, policy_version 370048 (0.0033) [2024-04-27 11:52:19,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.8, 300 sec: 53428.5). Total num frames: 6062882816. Throughput: 0: 53088.9. Samples: 553387900. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:52:19,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 11:52:21,692][52263] Updated weights for policy 0, policy_version 370058 (0.0031) [2024-04-27 11:52:24,107][52031] Fps is (10 sec: 50789.4, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6063128576. Throughput: 0: 53034.0. Samples: 553710720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:52:24,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 11:52:24,952][52263] Updated weights for policy 0, policy_version 370068 (0.0032) [2024-04-27 11:52:27,695][52263] Updated weights for policy 0, policy_version 370078 (0.0028) [2024-04-27 11:52:29,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6063423488. Throughput: 0: 52997.4. Samples: 553870060. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:52:29,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 11:52:31,126][52263] Updated weights for policy 0, policy_version 370088 (0.0028) [2024-04-27 11:52:33,897][52263] Updated weights for policy 0, policy_version 370098 (0.0027) [2024-04-27 11:52:34,106][52031] Fps is (10 sec: 55706.9, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6063685632. Throughput: 0: 53264.1. Samples: 554192640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:52:34,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 11:52:37,262][52263] Updated weights for policy 0, policy_version 370108 (0.0032) [2024-04-27 11:52:39,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52701.9, 300 sec: 53373.0). Total num frames: 6063931392. Throughput: 0: 53139.6. Samples: 554509880. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 11:52:39,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 11:52:40,098][52263] Updated weights for policy 0, policy_version 370118 (0.0031) [2024-04-27 11:52:43,367][52263] Updated weights for policy 0, policy_version 370128 (0.0036) [2024-04-27 11:52:44,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6064209920. Throughput: 0: 53089.5. Samples: 554669160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:52:44,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 11:52:46,098][52263] Updated weights for policy 0, policy_version 370138 (0.0033) [2024-04-27 11:52:49,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6064488448. Throughput: 0: 53144.2. Samples: 554990920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:52:49,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 11:52:49,306][52263] Updated weights for policy 0, policy_version 370148 (0.0035) [2024-04-27 11:52:52,223][52242] Signal inference workers to stop experience collection... (8350 times) [2024-04-27 11:52:52,257][52263] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-04-27 11:52:52,285][52242] Signal inference workers to resume experience collection... (8350 times) [2024-04-27 11:52:52,286][52263] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-04-27 11:52:52,291][52263] Updated weights for policy 0, policy_version 370158 (0.0033) [2024-04-27 11:52:54,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6064750592. Throughput: 0: 53327.6. Samples: 555313740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:52:54,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 11:52:55,361][52263] Updated weights for policy 0, policy_version 370168 (0.0030) [2024-04-27 11:52:58,535][52263] Updated weights for policy 0, policy_version 370178 (0.0026) [2024-04-27 11:52:59,107][52031] Fps is (10 sec: 52427.6, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6065012736. Throughput: 0: 53311.3. Samples: 555471280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:52:59,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 11:53:01,517][52263] Updated weights for policy 0, policy_version 370188 (0.0035) [2024-04-27 11:53:04,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6065291264. Throughput: 0: 53287.6. Samples: 555785840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:04,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:53:04,666][52263] Updated weights for policy 0, policy_version 370198 (0.0030) [2024-04-27 11:53:07,791][52263] Updated weights for policy 0, policy_version 370208 (0.0027) [2024-04-27 11:53:09,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6065537024. Throughput: 0: 53193.5. Samples: 556104420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:09,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 11:53:10,673][52263] Updated weights for policy 0, policy_version 370218 (0.0028) [2024-04-27 11:53:13,788][52263] Updated weights for policy 0, policy_version 370228 (0.0024) [2024-04-27 11:53:14,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6065815552. Throughput: 0: 53433.3. Samples: 556274560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:14,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 11:53:16,743][52263] Updated weights for policy 0, policy_version 370238 (0.0028) [2024-04-27 11:53:19,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6066077696. Throughput: 0: 53371.5. Samples: 556594360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:19,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 11:53:19,954][52263] Updated weights for policy 0, policy_version 370248 (0.0030) [2024-04-27 11:53:22,886][52263] Updated weights for policy 0, policy_version 370258 (0.0029) [2024-04-27 11:53:24,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6066356224. Throughput: 0: 53489.8. Samples: 556916920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:24,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:53:26,315][52263] Updated weights for policy 0, policy_version 370268 (0.0031) [2024-04-27 11:53:29,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6066618368. Throughput: 0: 53369.2. Samples: 557070780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:29,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 11:53:29,166][52263] Updated weights for policy 0, policy_version 370278 (0.0029) [2024-04-27 11:53:32,310][52263] Updated weights for policy 0, policy_version 370288 (0.0026) [2024-04-27 11:53:34,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6066896896. Throughput: 0: 53311.8. Samples: 557389960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:34,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 11:53:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000370294_6066896896.pth... 
[2024-04-27 11:53:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000369510_6054051840.pth [2024-04-27 11:53:35,351][52263] Updated weights for policy 0, policy_version 370298 (0.0029) [2024-04-27 11:53:38,256][52263] Updated weights for policy 0, policy_version 370308 (0.0033) [2024-04-27 11:53:39,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53520.9, 300 sec: 53373.0). Total num frames: 6067142656. Throughput: 0: 53333.2. Samples: 557713740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:39,107][52031] Avg episode reward: [(0, '0.683')] [2024-04-27 11:53:41,568][52263] Updated weights for policy 0, policy_version 370318 (0.0043) [2024-04-27 11:53:44,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6067421184. Throughput: 0: 53477.1. Samples: 557877740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:44,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 11:53:44,454][52263] Updated weights for policy 0, policy_version 370328 (0.0031) [2024-04-27 11:53:46,063][52242] Signal inference workers to stop experience collection... (8400 times) [2024-04-27 11:53:46,063][52242] Signal inference workers to resume experience collection... (8400 times) [2024-04-27 11:53:46,080][52263] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-04-27 11:53:46,080][52263] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-04-27 11:53:47,782][52263] Updated weights for policy 0, policy_version 370338 (0.0030) [2024-04-27 11:53:49,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52974.8, 300 sec: 53428.5). Total num frames: 6067666944. Throughput: 0: 53673.3. Samples: 558201140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:49,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 11:53:50,621][52263] Updated weights for policy 0, policy_version 370348 (0.0030) [2024-04-27 11:53:53,948][52263] Updated weights for policy 0, policy_version 370358 (0.0029) [2024-04-27 11:53:54,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6067945472. Throughput: 0: 53686.1. Samples: 558520300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:54,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 11:53:56,658][52263] Updated weights for policy 0, policy_version 370368 (0.0027) [2024-04-27 11:53:59,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.3, 300 sec: 53484.1). Total num frames: 6068224000. Throughput: 0: 53400.5. Samples: 558677580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 11:53:59,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 11:54:00,140][52263] Updated weights for policy 0, policy_version 370378 (0.0030) [2024-04-27 11:54:02,691][52263] Updated weights for policy 0, policy_version 370388 (0.0028) [2024-04-27 11:54:04,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.8, 300 sec: 53372.9). Total num frames: 6068469760. Throughput: 0: 53339.4. Samples: 558994640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:04,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 11:54:06,288][52263] Updated weights for policy 0, policy_version 370398 (0.0032) [2024-04-27 11:54:09,040][52263] Updated weights for policy 0, policy_version 370408 (0.0031) [2024-04-27 11:54:09,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6068764672. Throughput: 0: 53453.1. Samples: 559322320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:09,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 11:54:12,414][52263] Updated weights for policy 0, policy_version 370418 (0.0028) [2024-04-27 11:54:14,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6069026816. Throughput: 0: 53431.1. Samples: 559475180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:14,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:54:15,124][52263] Updated weights for policy 0, policy_version 370428 (0.0034) [2024-04-27 11:54:18,390][52263] Updated weights for policy 0, policy_version 370438 (0.0031) [2024-04-27 11:54:19,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6069288960. Throughput: 0: 53554.2. Samples: 559799900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:19,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 11:54:21,111][52263] Updated weights for policy 0, policy_version 370448 (0.0028) [2024-04-27 11:54:24,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6069551104. Throughput: 0: 53513.5. Samples: 560121840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:24,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 11:54:24,640][52263] Updated weights for policy 0, policy_version 370458 (0.0029) [2024-04-27 11:54:27,220][52263] Updated weights for policy 0, policy_version 370468 (0.0029) [2024-04-27 11:54:29,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6069829632. Throughput: 0: 53382.6. Samples: 560279960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:29,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 11:54:30,700][52263] Updated weights for policy 0, policy_version 370478 (0.0032) [2024-04-27 11:54:33,459][52263] Updated weights for policy 0, policy_version 370488 (0.0033) [2024-04-27 11:54:34,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6070091776. Throughput: 0: 53261.3. Samples: 560597900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:34,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 11:54:36,998][52263] Updated weights for policy 0, policy_version 370498 (0.0033) [2024-04-27 11:54:39,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6070353920. Throughput: 0: 53336.5. Samples: 560920440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:39,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 11:54:39,594][52263] Updated weights for policy 0, policy_version 370508 (0.0033) [2024-04-27 11:54:41,140][52242] Signal inference workers to stop experience collection... (8450 times) [2024-04-27 11:54:41,174][52263] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-04-27 11:54:41,202][52242] Signal inference workers to resume experience collection... (8450 times) [2024-04-27 11:54:41,203][52263] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-04-27 11:54:43,209][52263] Updated weights for policy 0, policy_version 370518 (0.0034) [2024-04-27 11:54:44,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6070616064. Throughput: 0: 53216.4. Samples: 561072320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:44,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 11:54:45,616][52263] Updated weights for policy 0, policy_version 370528 (0.0025) [2024-04-27 11:54:49,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6070861824. Throughput: 0: 53461.4. Samples: 561400400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:49,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 11:54:49,255][52263] Updated weights for policy 0, policy_version 370538 (0.0029) [2024-04-27 11:54:51,656][52263] Updated weights for policy 0, policy_version 370548 (0.0027) [2024-04-27 11:54:54,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53372.9). Total num frames: 6071156736. Throughput: 0: 53220.1. Samples: 561717220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:54,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 11:54:55,335][52263] Updated weights for policy 0, policy_version 370558 (0.0030) [2024-04-27 11:54:58,209][52263] Updated weights for policy 0, policy_version 370568 (0.0025) [2024-04-27 11:54:59,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6071418880. Throughput: 0: 53338.3. Samples: 561875400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:54:59,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 11:55:01,628][52263] Updated weights for policy 0, policy_version 370578 (0.0030) [2024-04-27 11:55:04,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.3, 300 sec: 53373.0). Total num frames: 6071697408. Throughput: 0: 53274.9. Samples: 562197260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:55:04,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 11:55:04,206][52263] Updated weights for policy 0, policy_version 370588 (0.0029) [2024-04-27 11:55:07,924][52263] Updated weights for policy 0, policy_version 370598 (0.0035) [2024-04-27 11:55:09,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6071959552. Throughput: 0: 53289.3. Samples: 562519860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:55:09,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 11:55:10,146][52263] Updated weights for policy 0, policy_version 370608 (0.0025) [2024-04-27 11:55:13,903][52263] Updated weights for policy 0, policy_version 370618 (0.0027) [2024-04-27 11:55:14,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6072221696. Throughput: 0: 53488.1. Samples: 562686920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:55:14,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 11:55:16,340][52263] Updated weights for policy 0, policy_version 370628 (0.0029) [2024-04-27 11:55:19,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6072467456. Throughput: 0: 53503.2. Samples: 563005540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 11:55:19,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 11:55:19,936][52263] Updated weights for policy 0, policy_version 370638 (0.0033) [2024-04-27 11:55:22,541][52263] Updated weights for policy 0, policy_version 370648 (0.0026) [2024-04-27 11:55:24,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6072778752. Throughput: 0: 53402.7. Samples: 563323560. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:55:24,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 11:55:26,207][52263] Updated weights for policy 0, policy_version 370658 (0.0027) [2024-04-27 11:55:28,525][52263] Updated weights for policy 0, policy_version 370668 (0.0029) [2024-04-27 11:55:29,107][52031] Fps is (10 sec: 57344.2, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6073040896. Throughput: 0: 53656.0. Samples: 563486840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:55:29,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 11:55:32,574][52263] Updated weights for policy 0, policy_version 370678 (0.0030) [2024-04-27 11:55:34,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6073303040. Throughput: 0: 53564.2. Samples: 563810780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:55:34,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 11:55:34,131][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000370686_6073319424.pth... [2024-04-27 11:55:34,183][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000369904_6060507136.pth [2024-04-27 11:55:34,585][52263] Updated weights for policy 0, policy_version 370688 (0.0032) [2024-04-27 11:55:36,358][52242] Signal inference workers to stop experience collection... (8500 times) [2024-04-27 11:55:36,358][52242] Signal inference workers to resume experience collection... (8500 times) [2024-04-27 11:55:36,388][52263] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-04-27 11:55:36,388][52263] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-04-27 11:55:38,604][52263] Updated weights for policy 0, policy_version 370698 (0.0027) [2024-04-27 11:55:39,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52975.1, 300 sec: 53261.9). Total num frames: 6073532416. Throughput: 0: 53546.9. Samples: 564126820. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:55:39,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 11:55:41,062][52263] Updated weights for policy 0, policy_version 370708 (0.0033) [2024-04-27 11:55:44,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6073810944. Throughput: 0: 53336.7. Samples: 564275560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:55:44,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 11:55:44,719][52263] Updated weights for policy 0, policy_version 370718 (0.0031) [2024-04-27 11:55:47,323][52263] Updated weights for policy 0, policy_version 370728 (0.0030) [2024-04-27 11:55:49,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.2, 300 sec: 53317.4). Total num frames: 6074089472. Throughput: 0: 53354.6. Samples: 564598220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:55:49,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 11:55:50,758][52263] Updated weights for policy 0, policy_version 370738 (0.0027) [2024-04-27 11:55:53,260][52263] Updated weights for policy 0, policy_version 370748 (0.0025) [2024-04-27 11:55:54,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6074335232. Throughput: 0: 53332.0. Samples: 564919800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:55:54,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 11:55:56,870][52263] Updated weights for policy 0, policy_version 370758 (0.0028) [2024-04-27 11:55:59,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.0, 300 sec: 53373.0). Total num frames: 6074646528. Throughput: 0: 53393.6. Samples: 565089640. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:55:59,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 11:55:59,328][52263] Updated weights for policy 0, policy_version 370768 (0.0027) [2024-04-27 11:56:02,881][52263] Updated weights for policy 0, policy_version 370778 (0.0032) [2024-04-27 11:56:04,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 6074892288. Throughput: 0: 53503.2. Samples: 565413180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:56:04,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 11:56:05,358][52263] Updated weights for policy 0, policy_version 370788 (0.0029) [2024-04-27 11:56:08,977][52263] Updated weights for policy 0, policy_version 370798 (0.0028) [2024-04-27 11:56:09,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6075154432. Throughput: 0: 53580.4. Samples: 565734680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:56:09,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 11:56:11,339][52263] Updated weights for policy 0, policy_version 370808 (0.0030) [2024-04-27 11:56:14,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6075432960. Throughput: 0: 53264.4. Samples: 565883740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:56:14,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 11:56:15,159][52263] Updated weights for policy 0, policy_version 370818 (0.0030) [2024-04-27 11:56:17,392][52263] Updated weights for policy 0, policy_version 370828 (0.0034) [2024-04-27 11:56:19,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6075695104. Throughput: 0: 53115.5. Samples: 566200980. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:56:19,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 11:56:21,128][52263] Updated weights for policy 0, policy_version 370838 (0.0026) [2024-04-27 11:56:23,708][52263] Updated weights for policy 0, policy_version 370848 (0.0030) [2024-04-27 11:56:24,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6075973632. Throughput: 0: 53379.9. Samples: 566528920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:56:24,107][52031] Avg episode reward: [(0, '0.477')] [2024-04-27 11:56:27,204][52263] Updated weights for policy 0, policy_version 370858 (0.0034) [2024-04-27 11:56:29,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6076235776. Throughput: 0: 53804.6. Samples: 566696760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:56:29,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 11:56:29,829][52263] Updated weights for policy 0, policy_version 370868 (0.0041) [2024-04-27 11:56:33,247][52263] Updated weights for policy 0, policy_version 370878 (0.0036) [2024-04-27 11:56:34,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6076514304. Throughput: 0: 53756.5. Samples: 567017260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 11:56:34,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 11:56:35,999][52263] Updated weights for policy 0, policy_version 370888 (0.0033) [2024-04-27 11:56:39,106][52031] Fps is (10 sec: 54067.4, 60 sec: 54067.2, 300 sec: 53428.5). Total num frames: 6076776448. Throughput: 0: 53661.5. Samples: 567334560. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:56:39,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 11:56:39,321][52263] Updated weights for policy 0, policy_version 370898 (0.0031) [2024-04-27 11:56:42,499][52263] Updated weights for policy 0, policy_version 370908 (0.0030) [2024-04-27 11:56:44,106][52031] Fps is (10 sec: 54067.2, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6077054976. Throughput: 0: 53514.3. Samples: 567497780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:56:44,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 11:56:45,336][52263] Updated weights for policy 0, policy_version 370918 (0.0028) [2024-04-27 11:56:48,614][52263] Updated weights for policy 0, policy_version 370928 (0.0026) [2024-04-27 11:56:49,106][52031] Fps is (10 sec: 52428.1, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6077300736. Throughput: 0: 53528.8. Samples: 567821980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:56:49,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 11:56:51,568][52263] Updated weights for policy 0, policy_version 370938 (0.0029) [2024-04-27 11:56:52,930][52242] Signal inference workers to stop experience collection... (8550 times) [2024-04-27 11:56:52,960][52263] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-04-27 11:56:52,993][52242] Signal inference workers to resume experience collection... (8550 times) [2024-04-27 11:56:52,993][52263] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-04-27 11:56:54,107][52031] Fps is (10 sec: 52428.3, 60 sec: 54067.1, 300 sec: 53428.5). Total num frames: 6077579264. Throughput: 0: 53439.0. Samples: 568139440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:56:54,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 11:56:54,783][52263] Updated weights for policy 0, policy_version 370948 (0.0028) [2024-04-27 11:56:57,665][52263] Updated weights for policy 0, policy_version 370958 (0.0027) [2024-04-27 11:56:59,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6077841408. Throughput: 0: 53662.3. Samples: 568298540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:56:59,107][52031] Avg episode reward: [(0, '0.455')] [2024-04-27 11:57:00,805][52263] Updated weights for policy 0, policy_version 370968 (0.0029) [2024-04-27 11:57:03,652][52263] Updated weights for policy 0, policy_version 370978 (0.0031) [2024-04-27 11:57:04,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6078119936. Throughput: 0: 53773.7. Samples: 568620800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:04,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 11:57:06,749][52263] Updated weights for policy 0, policy_version 370988 (0.0027) [2024-04-27 11:57:09,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 6078365696. Throughput: 0: 53497.1. Samples: 568936300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:09,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 11:57:10,116][52263] Updated weights for policy 0, policy_version 370998 (0.0027) [2024-04-27 11:57:13,042][52263] Updated weights for policy 0, policy_version 371008 (0.0026) [2024-04-27 11:57:14,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6078644224. Throughput: 0: 53316.8. Samples: 569096020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:14,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 11:57:16,258][52263] Updated weights for policy 0, policy_version 371018 (0.0030) [2024-04-27 11:57:19,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6078906368. Throughput: 0: 53289.8. Samples: 569415300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:19,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 11:57:19,203][52263] Updated weights for policy 0, policy_version 371028 (0.0027) [2024-04-27 11:57:22,229][52263] Updated weights for policy 0, policy_version 371038 (0.0031) [2024-04-27 11:57:24,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6079168512. Throughput: 0: 53505.2. Samples: 569742300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:24,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 11:57:25,396][52263] Updated weights for policy 0, policy_version 371048 (0.0029) [2024-04-27 11:57:28,493][52263] Updated weights for policy 0, policy_version 371058 (0.0027) [2024-04-27 11:57:29,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6079430656. Throughput: 0: 53424.0. Samples: 569901860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:29,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 11:57:31,513][52263] Updated weights for policy 0, policy_version 371068 (0.0030) [2024-04-27 11:57:34,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6079709184. Throughput: 0: 53378.6. Samples: 570224020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:34,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 11:57:34,188][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000371077_6079725568.pth... 
[2024-04-27 11:57:34,241][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000370294_6066896896.pth [2024-04-27 11:57:34,801][52263] Updated weights for policy 0, policy_version 371079 (0.0036) [2024-04-27 11:57:37,910][52263] Updated weights for policy 0, policy_version 371089 (0.0025) [2024-04-27 11:57:39,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6079987712. Throughput: 0: 53383.3. Samples: 570541680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:39,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 11:57:41,086][52263] Updated weights for policy 0, policy_version 371099 (0.0024) [2024-04-27 11:57:43,932][52263] Updated weights for policy 0, policy_version 371109 (0.0026) [2024-04-27 11:57:44,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6080249856. Throughput: 0: 53382.6. Samples: 570700760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:44,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 11:57:47,009][52263] Updated weights for policy 0, policy_version 371119 (0.0027) [2024-04-27 11:57:49,107][52031] Fps is (10 sec: 50789.6, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6080495616. Throughput: 0: 53378.2. Samples: 571022820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:49,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 11:57:50,171][52263] Updated weights for policy 0, policy_version 371129 (0.0035) [2024-04-27 11:57:53,045][52263] Updated weights for policy 0, policy_version 371139 (0.0025) [2024-04-27 11:57:54,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6080774144. Throughput: 0: 53431.6. Samples: 571340720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 11:57:54,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 11:57:55,382][52242] Signal inference workers to stop experience collection... (8600 times) [2024-04-27 11:57:55,417][52263] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-04-27 11:57:55,446][52242] Signal inference workers to resume experience collection... (8600 times) [2024-04-27 11:57:55,447][52263] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-04-27 11:57:56,148][52263] Updated weights for policy 0, policy_version 371149 (0.0030) [2024-04-27 11:57:59,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6081036288. Throughput: 0: 53534.8. Samples: 571505080. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:57:59,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 11:57:59,241][52263] Updated weights for policy 0, policy_version 371159 (0.0032) [2024-04-27 11:58:02,533][52263] Updated weights for policy 0, policy_version 371169 (0.0035) [2024-04-27 11:58:04,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6081331200. Throughput: 0: 53602.1. Samples: 571827400. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:04,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 11:58:05,574][52263] Updated weights for policy 0, policy_version 371179 (0.0028) [2024-04-27 11:58:08,699][52263] Updated weights for policy 0, policy_version 371189 (0.0029) [2024-04-27 11:58:09,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.3, 300 sec: 53428.5). Total num frames: 6081576960. Throughput: 0: 53330.7. Samples: 572142180. 
Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:09,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 11:58:11,719][52263] Updated weights for policy 0, policy_version 371199 (0.0034) [2024-04-27 11:58:14,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6081839104. Throughput: 0: 53320.3. Samples: 572301280. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:14,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 11:58:14,793][52263] Updated weights for policy 0, policy_version 371209 (0.0029) [2024-04-27 11:58:17,721][52263] Updated weights for policy 0, policy_version 371219 (0.0032) [2024-04-27 11:58:19,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6082101248. Throughput: 0: 53260.8. Samples: 572620760. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:19,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 11:58:20,874][52263] Updated weights for policy 0, policy_version 371229 (0.0030) [2024-04-27 11:58:23,954][52263] Updated weights for policy 0, policy_version 371239 (0.0036) [2024-04-27 11:58:24,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6082379776. Throughput: 0: 53333.7. Samples: 572941700. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:24,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 11:58:26,939][52263] Updated weights for policy 0, policy_version 371249 (0.0029) [2024-04-27 11:58:29,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6082641920. Throughput: 0: 53502.8. Samples: 573108380. 
Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:29,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 11:58:30,272][52263] Updated weights for policy 0, policy_version 371259 (0.0032) [2024-04-27 11:58:33,072][52263] Updated weights for policy 0, policy_version 371269 (0.0029) [2024-04-27 11:58:34,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6082936832. Throughput: 0: 53388.0. Samples: 573425280. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:34,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 11:58:36,267][52263] Updated weights for policy 0, policy_version 371279 (0.0032) [2024-04-27 11:58:39,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53247.8, 300 sec: 53428.5). Total num frames: 6083182592. Throughput: 0: 53447.6. Samples: 573745860. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:39,107][52031] Avg episode reward: [(0, '0.676')] [2024-04-27 11:58:39,294][52263] Updated weights for policy 0, policy_version 371289 (0.0034) [2024-04-27 11:58:42,239][52263] Updated weights for policy 0, policy_version 371299 (0.0028) [2024-04-27 11:58:44,107][52031] Fps is (10 sec: 49152.1, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6083428352. Throughput: 0: 53279.4. Samples: 573902660. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:44,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 11:58:45,301][52263] Updated weights for policy 0, policy_version 371309 (0.0030) [2024-04-27 11:58:48,376][52263] Updated weights for policy 0, policy_version 371319 (0.0030) [2024-04-27 11:58:49,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6083706880. Throughput: 0: 53293.1. Samples: 574225580. 
Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:49,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 11:58:51,713][52263] Updated weights for policy 0, policy_version 371329 (0.0031) [2024-04-27 11:58:54,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6083985408. Throughput: 0: 53347.8. Samples: 574542840. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:54,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 11:58:54,894][52263] Updated weights for policy 0, policy_version 371339 (0.0025) [2024-04-27 11:58:57,843][52263] Updated weights for policy 0, policy_version 371349 (0.0029) [2024-04-27 11:58:59,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53520.9, 300 sec: 53484.1). Total num frames: 6084247552. Throughput: 0: 53405.8. Samples: 574704540. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:58:59,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 11:58:59,537][52242] Signal inference workers to stop experience collection... (8650 times) [2024-04-27 11:58:59,537][52242] Signal inference workers to resume experience collection... (8650 times) [2024-04-27 11:58:59,567][52263] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-04-27 11:58:59,567][52263] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-04-27 11:59:00,991][52263] Updated weights for policy 0, policy_version 371359 (0.0034) [2024-04-27 11:59:03,811][52263] Updated weights for policy 0, policy_version 371369 (0.0028) [2024-04-27 11:59:04,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6084526080. Throughput: 0: 53549.7. Samples: 575030500. 
Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:59:04,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 11:59:07,243][52263] Updated weights for policy 0, policy_version 371379 (0.0030) [2024-04-27 11:59:09,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6084788224. Throughput: 0: 53507.9. Samples: 575349560. Policy #0 lag: (min: 2.0, avg: 9.4, max: 21.0) [2024-04-27 11:59:09,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 11:59:09,862][52263] Updated weights for policy 0, policy_version 371389 (0.0028) [2024-04-27 11:59:13,407][52263] Updated weights for policy 0, policy_version 371399 (0.0034) [2024-04-27 11:59:14,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6085033984. Throughput: 0: 53248.8. Samples: 575504580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:14,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 11:59:16,004][52263] Updated weights for policy 0, policy_version 371409 (0.0028) [2024-04-27 11:59:19,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6085312512. Throughput: 0: 53196.4. Samples: 575819120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:19,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 11:59:19,719][52263] Updated weights for policy 0, policy_version 371419 (0.0031) [2024-04-27 11:59:22,258][52263] Updated weights for policy 0, policy_version 371429 (0.0027) [2024-04-27 11:59:24,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6085591040. Throughput: 0: 53135.6. Samples: 576136960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:24,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 11:59:25,864][52263] Updated weights for policy 0, policy_version 371439 (0.0033) [2024-04-27 11:59:28,355][52263] Updated weights for policy 0, policy_version 371449 (0.0025) [2024-04-27 11:59:29,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6085853184. Throughput: 0: 53504.9. Samples: 576310380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:29,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 11:59:31,840][52263] Updated weights for policy 0, policy_version 371459 (0.0026) [2024-04-27 11:59:34,106][52031] Fps is (10 sec: 50791.2, 60 sec: 52701.9, 300 sec: 53373.0). Total num frames: 6086098944. Throughput: 0: 53402.6. Samples: 576628700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:34,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 11:59:34,178][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000371467_6086115328.pth... [2024-04-27 11:59:34,238][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000370686_6073319424.pth [2024-04-27 11:59:34,512][52263] Updated weights for policy 0, policy_version 371469 (0.0031) [2024-04-27 11:59:37,808][52263] Updated weights for policy 0, policy_version 371479 (0.0031) [2024-04-27 11:59:39,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52975.1, 300 sec: 53373.0). Total num frames: 6086361088. Throughput: 0: 53413.5. Samples: 576946440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:39,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 11:59:40,654][52263] Updated weights for policy 0, policy_version 371489 (0.0029) [2024-04-27 11:59:43,914][52263] Updated weights for policy 0, policy_version 371499 (0.0028) [2024-04-27 11:59:44,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.1, 300 sec: 53484.1). 
Total num frames: 6086639616. Throughput: 0: 53282.2. Samples: 577102240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:44,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 11:59:46,946][52263] Updated weights for policy 0, policy_version 371509 (0.0030) [2024-04-27 11:59:49,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6086918144. Throughput: 0: 53168.2. Samples: 577423060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:49,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 11:59:50,269][52263] Updated weights for policy 0, policy_version 371519 (0.0031) [2024-04-27 11:59:53,092][52263] Updated weights for policy 0, policy_version 371529 (0.0028) [2024-04-27 11:59:54,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6087180288. Throughput: 0: 53161.8. Samples: 577741840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:54,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 11:59:56,407][52263] Updated weights for policy 0, policy_version 371539 (0.0033) [2024-04-27 11:59:59,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6087458816. Throughput: 0: 53358.7. Samples: 577905720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 11:59:59,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 11:59:59,108][52263] Updated weights for policy 0, policy_version 371549 (0.0035) [2024-04-27 12:00:02,416][52263] Updated weights for policy 0, policy_version 371559 (0.0026) [2024-04-27 12:00:04,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6087704576. Throughput: 0: 53484.5. Samples: 578225920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 12:00:04,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 12:00:05,348][52263] Updated weights for policy 0, policy_version 371569 (0.0026) [2024-04-27 12:00:08,486][52263] Updated weights for policy 0, policy_version 371579 (0.0028) [2024-04-27 12:00:09,106][52031] Fps is (10 sec: 49152.1, 60 sec: 52701.9, 300 sec: 53317.4). Total num frames: 6087950336. Throughput: 0: 53493.0. Samples: 578544140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 12:00:09,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 12:00:11,528][52263] Updated weights for policy 0, policy_version 371589 (0.0034) [2024-04-27 12:00:14,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6088245248. Throughput: 0: 53182.3. Samples: 578703580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 12:00:14,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 12:00:14,678][52263] Updated weights for policy 0, policy_version 371599 (0.0030) [2024-04-27 12:00:17,681][52263] Updated weights for policy 0, policy_version 371609 (0.0028) [2024-04-27 12:00:18,577][52242] Signal inference workers to stop experience collection... (8700 times) [2024-04-27 12:00:18,577][52242] Signal inference workers to resume experience collection... (8700 times) [2024-04-27 12:00:18,595][52263] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-04-27 12:00:18,595][52263] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-04-27 12:00:19,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6088523776. Throughput: 0: 53188.9. Samples: 579022200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 12:00:19,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 12:00:20,793][52263] Updated weights for policy 0, policy_version 371619 (0.0025) [2024-04-27 12:00:23,869][52263] Updated weights for policy 0, policy_version 371629 (0.0027) [2024-04-27 12:00:24,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6088769536. Throughput: 0: 53294.0. Samples: 579344680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 12:00:24,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 12:00:26,804][52263] Updated weights for policy 0, policy_version 371639 (0.0027) [2024-04-27 12:00:29,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6089048064. Throughput: 0: 53365.8. Samples: 579503700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 12:00:29,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 12:00:29,973][52263] Updated weights for policy 0, policy_version 371649 (0.0028) [2024-04-27 12:00:32,988][52263] Updated weights for policy 0, policy_version 371659 (0.0032) [2024-04-27 12:00:34,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6089326592. Throughput: 0: 53278.6. Samples: 579820600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:00:34,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 12:00:36,161][52263] Updated weights for policy 0, policy_version 371669 (0.0037) [2024-04-27 12:00:39,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6089572352. Throughput: 0: 53330.2. Samples: 580141700. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:00:39,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 12:00:39,230][52263] Updated weights for policy 0, policy_version 371679 (0.0030) [2024-04-27 12:00:42,321][52263] Updated weights for policy 0, policy_version 371689 (0.0026) [2024-04-27 12:00:44,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6089850880. Throughput: 0: 53233.7. Samples: 580301240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:00:44,108][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 12:00:45,582][52263] Updated weights for policy 0, policy_version 371699 (0.0031) [2024-04-27 12:00:48,357][52263] Updated weights for policy 0, policy_version 371709 (0.0028) [2024-04-27 12:00:49,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6090096640. Throughput: 0: 53216.5. Samples: 580620660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:00:49,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 12:00:51,857][52263] Updated weights for policy 0, policy_version 371719 (0.0026) [2024-04-27 12:00:54,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 6090391552. Throughput: 0: 53263.8. Samples: 580941020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:00:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:00:54,465][52263] Updated weights for policy 0, policy_version 371729 (0.0030) [2024-04-27 12:00:57,815][52263] Updated weights for policy 0, policy_version 371739 (0.0037) [2024-04-27 12:00:59,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6090637312. Throughput: 0: 53174.3. Samples: 581096420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:00:59,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 12:01:00,437][52263] Updated weights for policy 0, policy_version 371749 (0.0031) [2024-04-27 12:01:03,777][52263] Updated weights for policy 0, policy_version 371759 (0.0030) [2024-04-27 12:01:04,106][52031] Fps is (10 sec: 50791.5, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6090899456. Throughput: 0: 53127.1. Samples: 581412920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:04,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 12:01:06,767][52263] Updated weights for policy 0, policy_version 371769 (0.0026) [2024-04-27 12:01:09,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.2, 300 sec: 53373.0). Total num frames: 6091177984. Throughput: 0: 53133.1. Samples: 581735660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:09,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 12:01:09,855][52263] Updated weights for policy 0, policy_version 371779 (0.0028) [2024-04-27 12:01:13,411][52263] Updated weights for policy 0, policy_version 371789 (0.0028) [2024-04-27 12:01:14,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6091440128. Throughput: 0: 53283.9. Samples: 581901480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:14,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 12:01:15,947][52263] Updated weights for policy 0, policy_version 371799 (0.0027) [2024-04-27 12:01:19,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52974.8, 300 sec: 53317.4). Total num frames: 6091702272. Throughput: 0: 53259.5. Samples: 582217280. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:19,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 12:01:19,524][52263] Updated weights for policy 0, policy_version 371809 (0.0028) [2024-04-27 12:01:20,403][52242] Signal inference workers to stop experience collection... (8750 times) [2024-04-27 12:01:20,436][52263] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-04-27 12:01:20,462][52242] Signal inference workers to resume experience collection... (8750 times) [2024-04-27 12:01:20,463][52263] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-04-27 12:01:21,940][52263] Updated weights for policy 0, policy_version 371819 (0.0032) [2024-04-27 12:01:24,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.2, 300 sec: 53317.4). Total num frames: 6091964416. Throughput: 0: 53183.7. Samples: 582534960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:24,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 12:01:25,598][52263] Updated weights for policy 0, policy_version 371829 (0.0027) [2024-04-27 12:01:28,258][52263] Updated weights for policy 0, policy_version 371839 (0.0031) [2024-04-27 12:01:29,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6092242944. Throughput: 0: 53270.1. Samples: 582698400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:29,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 12:01:31,820][52263] Updated weights for policy 0, policy_version 371849 (0.0034) [2024-04-27 12:01:34,106][52031] Fps is (10 sec: 54066.7, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6092505088. Throughput: 0: 53258.2. Samples: 583017280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:34,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 12:01:34,194][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000371858_6092521472.pth... 
[2024-04-27 12:01:34,243][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000371077_6079725568.pth [2024-04-27 12:01:34,603][52263] Updated weights for policy 0, policy_version 371859 (0.0025) [2024-04-27 12:01:37,980][52263] Updated weights for policy 0, policy_version 371869 (0.0030) [2024-04-27 12:01:39,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 6092800000. Throughput: 0: 53352.1. Samples: 583341860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:39,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 12:01:40,840][52263] Updated weights for policy 0, policy_version 371879 (0.0033) [2024-04-27 12:01:43,970][52263] Updated weights for policy 0, policy_version 371889 (0.0032) [2024-04-27 12:01:44,107][52031] Fps is (10 sec: 52427.6, 60 sec: 52974.8, 300 sec: 53317.4). Total num frames: 6093029376. Throughput: 0: 53397.3. Samples: 583499320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:44,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 12:01:46,793][52263] Updated weights for policy 0, policy_version 371899 (0.0029) [2024-04-27 12:01:49,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6093307904. Throughput: 0: 53416.0. Samples: 583816640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:01:49,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 12:01:50,244][52263] Updated weights for policy 0, policy_version 371909 (0.0030) [2024-04-27 12:01:52,931][52263] Updated weights for policy 0, policy_version 371919 (0.0029) [2024-04-27 12:01:54,106][52031] Fps is (10 sec: 54068.8, 60 sec: 52975.1, 300 sec: 53317.4). Total num frames: 6093570048. Throughput: 0: 53413.8. Samples: 584139280. 
Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:01:54,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 12:01:56,425][52263] Updated weights for policy 0, policy_version 371929 (0.0026) [2024-04-27 12:01:59,056][52263] Updated weights for policy 0, policy_version 371939 (0.0023) [2024-04-27 12:01:59,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53520.9, 300 sec: 53317.4). Total num frames: 6093848576. Throughput: 0: 53424.4. Samples: 584305580. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:01:59,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:02:02,556][52263] Updated weights for policy 0, policy_version 371949 (0.0033) [2024-04-27 12:02:04,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6094127104. Throughput: 0: 53610.8. Samples: 584629760. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:04,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 12:02:05,092][52263] Updated weights for policy 0, policy_version 371959 (0.0028) [2024-04-27 12:02:08,690][52263] Updated weights for policy 0, policy_version 371969 (0.0031) [2024-04-27 12:02:08,892][52242] Signal inference workers to stop experience collection... (8800 times) [2024-04-27 12:02:08,897][52242] Signal inference workers to resume experience collection... (8800 times) [2024-04-27 12:02:08,924][52263] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-04-27 12:02:08,924][52263] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-04-27 12:02:09,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6094372864. Throughput: 0: 53615.9. Samples: 584947680. 
Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:09,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 12:02:11,337][52263] Updated weights for policy 0, policy_version 371979 (0.0033) [2024-04-27 12:02:14,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52975.0, 300 sec: 53261.9). Total num frames: 6094618624. Throughput: 0: 53324.2. Samples: 585097980. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:14,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 12:02:14,763][52263] Updated weights for policy 0, policy_version 371989 (0.0030) [2024-04-27 12:02:17,467][52263] Updated weights for policy 0, policy_version 371999 (0.0029) [2024-04-27 12:02:19,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53372.9). Total num frames: 6094913536. Throughput: 0: 53327.9. Samples: 585417040. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:19,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 12:02:20,878][52263] Updated weights for policy 0, policy_version 372009 (0.0029) [2024-04-27 12:02:23,756][52263] Updated weights for policy 0, policy_version 372019 (0.0027) [2024-04-27 12:02:24,107][52031] Fps is (10 sec: 57342.8, 60 sec: 53793.9, 300 sec: 53428.5). Total num frames: 6095192064. Throughput: 0: 53277.2. Samples: 585739340. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:24,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 12:02:26,959][52263] Updated weights for policy 0, policy_version 372029 (0.0026) [2024-04-27 12:02:29,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53248.2, 300 sec: 53317.4). Total num frames: 6095437824. Throughput: 0: 53555.9. Samples: 585909320. 
Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:29,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:02:29,760][52263] Updated weights for policy 0, policy_version 372039 (0.0025) [2024-04-27 12:02:33,061][52263] Updated weights for policy 0, policy_version 372049 (0.0035) [2024-04-27 12:02:34,106][52031] Fps is (10 sec: 54068.4, 60 sec: 53794.2, 300 sec: 53373.0). Total num frames: 6095732736. Throughput: 0: 53482.7. Samples: 586223360. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:34,107][52031] Avg episode reward: [(0, '0.489')] [2024-04-27 12:02:36,070][52263] Updated weights for policy 0, policy_version 372059 (0.0028) [2024-04-27 12:02:39,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52701.9, 300 sec: 53261.9). Total num frames: 6095962112. Throughput: 0: 53449.7. Samples: 586544520. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:39,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 12:02:39,296][52263] Updated weights for policy 0, policy_version 372069 (0.0031) [2024-04-27 12:02:42,105][52263] Updated weights for policy 0, policy_version 372079 (0.0033) [2024-04-27 12:02:44,106][52031] Fps is (10 sec: 49152.3, 60 sec: 53248.3, 300 sec: 53317.5). Total num frames: 6096224256. Throughput: 0: 53161.6. Samples: 586697840. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:44,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 12:02:45,403][52263] Updated weights for policy 0, policy_version 372089 (0.0033) [2024-04-27 12:02:48,300][52263] Updated weights for policy 0, policy_version 372099 (0.0026) [2024-04-27 12:02:49,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6096502784. Throughput: 0: 53022.1. Samples: 587015760. 
Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:49,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 12:02:51,486][52263] Updated weights for policy 0, policy_version 372109 (0.0030) [2024-04-27 12:02:54,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6096781312. Throughput: 0: 53076.9. Samples: 587336140. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:54,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 12:02:54,254][52263] Updated weights for policy 0, policy_version 372119 (0.0030) [2024-04-27 12:02:57,670][52263] Updated weights for policy 0, policy_version 372129 (0.0033) [2024-04-27 12:02:59,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 6097043456. Throughput: 0: 53473.2. Samples: 587504280. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:02:59,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 12:03:00,442][52263] Updated weights for policy 0, policy_version 372139 (0.0030) [2024-04-27 12:03:03,808][52263] Updated weights for policy 0, policy_version 372149 (0.0030) [2024-04-27 12:03:04,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.8, 300 sec: 53317.4). Total num frames: 6097305600. Throughput: 0: 53462.7. Samples: 587822860. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:03:04,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 12:03:06,511][52263] Updated weights for policy 0, policy_version 372159 (0.0032) [2024-04-27 12:03:09,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52701.9, 300 sec: 53206.4). Total num frames: 6097534976. Throughput: 0: 53360.2. Samples: 588140540. Policy #0 lag: (min: 2.0, avg: 10.7, max: 23.0) [2024-04-27 12:03:09,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 12:03:09,159][52242] Signal inference workers to stop experience collection... 
(8850 times) [2024-04-27 12:03:09,159][52242] Signal inference workers to resume experience collection... (8850 times) [2024-04-27 12:03:09,176][52263] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-04-27 12:03:09,176][52263] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-04-27 12:03:09,924][52263] Updated weights for policy 0, policy_version 372169 (0.0033) [2024-04-27 12:03:12,624][52263] Updated weights for policy 0, policy_version 372179 (0.0033) [2024-04-27 12:03:14,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 6097829888. Throughput: 0: 53103.9. Samples: 588299000. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:14,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 12:03:15,948][52263] Updated weights for policy 0, policy_version 372189 (0.0030) [2024-04-27 12:03:18,547][52263] Updated weights for policy 0, policy_version 372199 (0.0028) [2024-04-27 12:03:19,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6098108416. Throughput: 0: 53260.8. Samples: 588620100. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:19,107][52031] Avg episode reward: [(0, '0.459')] [2024-04-27 12:03:22,147][52263] Updated weights for policy 0, policy_version 372209 (0.0027) [2024-04-27 12:03:24,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.1, 300 sec: 53372.9). Total num frames: 6098386944. Throughput: 0: 53223.9. Samples: 588939600. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:24,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:03:24,753][52263] Updated weights for policy 0, policy_version 372219 (0.0032) [2024-04-27 12:03:28,286][52263] Updated weights for policy 0, policy_version 372229 (0.0031) [2024-04-27 12:03:29,106][52031] Fps is (10 sec: 57344.2, 60 sec: 54067.2, 300 sec: 53373.0). Total num frames: 6098681856. Throughput: 0: 53470.6. Samples: 589104020. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:29,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 12:03:30,856][52263] Updated weights for policy 0, policy_version 372239 (0.0032) [2024-04-27 12:03:34,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52975.0, 300 sec: 53317.5). Total num frames: 6098911232. Throughput: 0: 53532.2. Samples: 589424700. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:34,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 12:03:34,114][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000372248_6098911232.pth... [2024-04-27 12:03:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000371467_6086115328.pth [2024-04-27 12:03:34,313][52263] Updated weights for policy 0, policy_version 372249 (0.0033) [2024-04-27 12:03:36,867][52263] Updated weights for policy 0, policy_version 372259 (0.0033) [2024-04-27 12:03:39,107][52031] Fps is (10 sec: 47513.2, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6099156992. Throughput: 0: 53465.8. Samples: 589742100. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:39,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 12:03:40,539][52263] Updated weights for policy 0, policy_version 372269 (0.0025) [2024-04-27 12:03:43,017][52263] Updated weights for policy 0, policy_version 372279 (0.0032) [2024-04-27 12:03:44,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 6099435520. Throughput: 0: 53159.7. Samples: 589896460. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:44,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 12:03:46,688][52263] Updated weights for policy 0, policy_version 372289 (0.0027) [2024-04-27 12:03:49,106][52031] Fps is (10 sec: 57345.2, 60 sec: 53794.3, 300 sec: 53373.0). Total num frames: 6099730432. Throughput: 0: 53185.2. Samples: 590216180. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:49,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 12:03:49,288][52263] Updated weights for policy 0, policy_version 372299 (0.0031) [2024-04-27 12:03:52,970][52263] Updated weights for policy 0, policy_version 372309 (0.0032) [2024-04-27 12:03:54,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6099976192. Throughput: 0: 53176.5. Samples: 590533480. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:03:55,510][52263] Updated weights for policy 0, policy_version 372319 (0.0026) [2024-04-27 12:03:58,961][52263] Updated weights for policy 0, policy_version 372329 (0.0034) [2024-04-27 12:03:59,046][52242] Signal inference workers to stop experience collection... (8900 times) [2024-04-27 12:03:59,079][52263] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-04-27 12:03:59,106][52031] Fps is (10 sec: 50789.8, 60 sec: 53248.1, 300 sec: 53261.9). Total num frames: 6100238336. Throughput: 0: 53331.2. Samples: 590698900. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:03:59,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 12:03:59,108][52242] Signal inference workers to resume experience collection... (8900 times) [2024-04-27 12:03:59,114][52263] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-04-27 12:04:01,643][52263] Updated weights for policy 0, policy_version 372339 (0.0025) [2024-04-27 12:04:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.2, 300 sec: 53261.9). Total num frames: 6100500480. Throughput: 0: 53421.4. Samples: 591024060. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:04:04,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 12:04:05,003][52263] Updated weights for policy 0, policy_version 372349 (0.0031) [2024-04-27 12:04:07,670][52263] Updated weights for policy 0, policy_version 372359 (0.0025) [2024-04-27 12:04:09,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.1, 300 sec: 53317.4). Total num frames: 6100762624. Throughput: 0: 53427.2. Samples: 591343820. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:04:09,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:04:11,192][52263] Updated weights for policy 0, policy_version 372369 (0.0028) [2024-04-27 12:04:13,766][52263] Updated weights for policy 0, policy_version 372379 (0.0026) [2024-04-27 12:04:14,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6101057536. Throughput: 0: 53243.4. Samples: 591499980. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:04:14,107][52031] Avg episode reward: [(0, '0.484')] [2024-04-27 12:04:17,214][52263] Updated weights for policy 0, policy_version 372389 (0.0031) [2024-04-27 12:04:19,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 6101319680. Throughput: 0: 53340.7. Samples: 591825040. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:04:19,107][52031] Avg episode reward: [(0, '0.476')] [2024-04-27 12:04:19,785][52263] Updated weights for policy 0, policy_version 372399 (0.0025) [2024-04-27 12:04:23,461][52263] Updated weights for policy 0, policy_version 372409 (0.0029) [2024-04-27 12:04:24,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6101598208. Throughput: 0: 53419.6. Samples: 592145980. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-04-27 12:04:24,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 12:04:25,854][52263] Updated weights for policy 0, policy_version 372419 (0.0028) [2024-04-27 12:04:29,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52701.8, 300 sec: 53373.0). Total num frames: 6101843968. Throughput: 0: 53537.3. Samples: 592305640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:04:29,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 12:04:29,459][52263] Updated weights for policy 0, policy_version 372429 (0.0024) [2024-04-27 12:04:32,557][52263] Updated weights for policy 0, policy_version 372439 (0.0028) [2024-04-27 12:04:34,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53247.8, 300 sec: 53372.9). Total num frames: 6102106112. Throughput: 0: 53577.5. Samples: 592627180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:04:34,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 12:04:35,527][52263] Updated weights for policy 0, policy_version 372449 (0.0028) [2024-04-27 12:04:38,718][52263] Updated weights for policy 0, policy_version 372459 (0.0028) [2024-04-27 12:04:39,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6102368256. Throughput: 0: 53563.5. Samples: 592943840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:04:39,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 12:04:41,639][52263] Updated weights for policy 0, policy_version 372469 (0.0033) [2024-04-27 12:04:44,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.0, 300 sec: 53372.9). Total num frames: 6102663168. Throughput: 0: 53438.5. Samples: 593103640. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:04:44,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:04:44,695][52263] Updated weights for policy 0, policy_version 372479 (0.0029) [2024-04-27 12:04:47,849][52263] Updated weights for policy 0, policy_version 372489 (0.0032) [2024-04-27 12:04:49,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6102925312. Throughput: 0: 53360.4. Samples: 593425280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:04:49,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 12:04:50,867][52263] Updated weights for policy 0, policy_version 372499 (0.0027) [2024-04-27 12:04:54,019][52263] Updated weights for policy 0, policy_version 372509 (0.0031) [2024-04-27 12:04:54,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53520.9, 300 sec: 53317.4). Total num frames: 6103187456. Throughput: 0: 53407.9. Samples: 593747180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:04:54,107][52031] Avg episode reward: [(0, '0.482')] [2024-04-27 12:04:57,040][52263] Updated weights for policy 0, policy_version 372519 (0.0032) [2024-04-27 12:04:59,106][52031] Fps is (10 sec: 50790.0, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6103433216. Throughput: 0: 53396.6. Samples: 593902820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:04:59,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 12:05:00,358][52263] Updated weights for policy 0, policy_version 372529 (0.0037) [2024-04-27 12:05:03,284][52263] Updated weights for policy 0, policy_version 372539 (0.0028) [2024-04-27 12:05:04,106][52031] Fps is (10 sec: 49152.7, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6103678976. Throughput: 0: 53284.1. Samples: 594222820. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:05:04,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 12:05:06,334][52263] Updated weights for policy 0, policy_version 372549 (0.0032) [2024-04-27 12:05:09,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.2, 300 sec: 53373.0). Total num frames: 6103990272. Throughput: 0: 53226.3. Samples: 594541160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:05:09,107][52031] Avg episode reward: [(0, '0.486')] [2024-04-27 12:05:09,287][52263] Updated weights for policy 0, policy_version 372559 (0.0032) [2024-04-27 12:05:11,662][52242] Signal inference workers to stop experience collection... (8950 times) [2024-04-27 12:05:11,668][52242] Signal inference workers to resume experience collection... (8950 times) [2024-04-27 12:05:11,677][52263] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-04-27 12:05:11,705][52263] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-04-27 12:05:12,452][52263] Updated weights for policy 0, policy_version 372569 (0.0036) [2024-04-27 12:05:14,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6104252416. Throughput: 0: 53469.9. Samples: 594711780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:05:14,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 12:05:15,295][52263] Updated weights for policy 0, policy_version 372579 (0.0031) [2024-04-27 12:05:18,631][52263] Updated weights for policy 0, policy_version 372589 (0.0035) [2024-04-27 12:05:19,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6104547328. Throughput: 0: 53372.4. Samples: 595028940. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:05:19,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 12:05:21,583][52263] Updated weights for policy 0, policy_version 372599 (0.0030) [2024-04-27 12:05:24,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6104776704. Throughput: 0: 53364.4. Samples: 595345240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:05:24,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 12:05:24,787][52263] Updated weights for policy 0, policy_version 372609 (0.0027) [2024-04-27 12:05:27,796][52263] Updated weights for policy 0, policy_version 372619 (0.0037) [2024-04-27 12:05:29,106][52031] Fps is (10 sec: 47514.2, 60 sec: 52975.0, 300 sec: 53206.4). Total num frames: 6105022464. Throughput: 0: 53328.6. Samples: 595503420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:05:29,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 12:05:30,891][52263] Updated weights for policy 0, policy_version 372629 (0.0032) [2024-04-27 12:05:33,965][52263] Updated weights for policy 0, policy_version 372639 (0.0029) [2024-04-27 12:05:34,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6105317376. Throughput: 0: 53389.3. Samples: 595827800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:05:34,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 12:05:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000372639_6105317376.pth... [2024-04-27 12:05:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000371858_6092521472.pth [2024-04-27 12:05:36,909][52263] Updated weights for policy 0, policy_version 372649 (0.0029) [2024-04-27 12:05:39,106][52031] Fps is (10 sec: 57343.9, 60 sec: 53794.2, 300 sec: 53373.0). Total num frames: 6105595904. Throughput: 0: 53377.1. Samples: 596149140. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:05:39,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 12:05:40,007][52263] Updated weights for policy 0, policy_version 372659 (0.0027) [2024-04-27 12:05:42,995][52263] Updated weights for policy 0, policy_version 372669 (0.0029) [2024-04-27 12:05:44,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6105874432. Throughput: 0: 53588.8. Samples: 596314320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:05:44,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 12:05:46,202][52263] Updated weights for policy 0, policy_version 372679 (0.0032) [2024-04-27 12:05:49,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53317.5). Total num frames: 6106120192. Throughput: 0: 53478.7. Samples: 596629360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:05:49,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 12:05:49,125][52263] Updated weights for policy 0, policy_version 372689 (0.0029) [2024-04-27 12:05:52,377][52263] Updated weights for policy 0, policy_version 372699 (0.0035) [2024-04-27 12:05:54,107][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.1, 300 sec: 53372.9). Total num frames: 6106382336. Throughput: 0: 53659.5. Samples: 596955840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:05:54,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 12:05:55,179][52263] Updated weights for policy 0, policy_version 372709 (0.0029) [2024-04-27 12:05:58,537][52263] Updated weights for policy 0, policy_version 372719 (0.0028) [2024-04-27 12:05:59,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6106628096. Throughput: 0: 53251.5. Samples: 597108100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:05:59,115][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:06:01,296][52263] Updated weights for policy 0, policy_version 372729 (0.0032) [2024-04-27 12:06:04,106][52031] Fps is (10 sec: 54067.4, 60 sec: 54067.2, 300 sec: 53373.0). Total num frames: 6106923008. Throughput: 0: 53342.8. Samples: 597429360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:04,107][52031] Avg episode reward: [(0, '0.476')] [2024-04-27 12:06:04,631][52263] Updated weights for policy 0, policy_version 372739 (0.0034) [2024-04-27 12:06:07,584][52263] Updated weights for policy 0, policy_version 372749 (0.0031) [2024-04-27 12:06:09,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6107201536. Throughput: 0: 53454.8. Samples: 597750700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:09,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 12:06:11,096][52263] Updated weights for policy 0, policy_version 372759 (0.0035) [2024-04-27 12:06:13,635][52263] Updated weights for policy 0, policy_version 372769 (0.0024) [2024-04-27 12:06:14,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6107463680. Throughput: 0: 53648.7. Samples: 597917620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:14,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 12:06:17,264][52263] Updated weights for policy 0, policy_version 372779 (0.0025) [2024-04-27 12:06:19,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6107725824. Throughput: 0: 53667.6. Samples: 598242840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:19,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 12:06:19,603][52263] Updated weights for policy 0, policy_version 372789 (0.0029) [2024-04-27 12:06:23,451][52263] Updated weights for policy 0, policy_version 372799 (0.0034) [2024-04-27 12:06:24,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6108004352. Throughput: 0: 53701.6. Samples: 598565720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:24,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 12:06:26,092][52263] Updated weights for policy 0, policy_version 372809 (0.0033) [2024-04-27 12:06:29,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6108250112. Throughput: 0: 53347.7. Samples: 598714960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:29,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 12:06:29,422][52263] Updated weights for policy 0, policy_version 372819 (0.0026) [2024-04-27 12:06:32,074][52263] Updated weights for policy 0, policy_version 372829 (0.0030) [2024-04-27 12:06:34,106][52031] Fps is (10 sec: 50791.5, 60 sec: 53248.1, 300 sec: 53261.9). Total num frames: 6108512256. Throughput: 0: 53477.8. Samples: 599035860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:34,107][52031] Avg episode reward: [(0, '0.498')] [2024-04-27 12:06:34,229][52242] Signal inference workers to stop experience collection... (9000 times) [2024-04-27 12:06:34,230][52242] Signal inference workers to resume experience collection... 
(9000 times) [2024-04-27 12:06:34,273][52263] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-04-27 12:06:34,273][52263] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-04-27 12:06:35,416][52263] Updated weights for policy 0, policy_version 372839 (0.0031) [2024-04-27 12:06:38,253][52263] Updated weights for policy 0, policy_version 372849 (0.0037) [2024-04-27 12:06:39,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.0, 300 sec: 53428.6). Total num frames: 6108790784. Throughput: 0: 53385.0. Samples: 599358160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:39,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 12:06:41,629][52263] Updated weights for policy 0, policy_version 372859 (0.0032) [2024-04-27 12:06:44,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6109069312. Throughput: 0: 53677.0. Samples: 599523560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:44,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 12:06:44,387][52263] Updated weights for policy 0, policy_version 372869 (0.0032) [2024-04-27 12:06:47,672][52263] Updated weights for policy 0, policy_version 372879 (0.0030) [2024-04-27 12:06:49,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6109331456. Throughput: 0: 53780.9. Samples: 599849500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:49,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 12:06:50,319][52263] Updated weights for policy 0, policy_version 372889 (0.0032) [2024-04-27 12:06:53,835][52263] Updated weights for policy 0, policy_version 372899 (0.0030) [2024-04-27 12:06:54,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6109593600. Throughput: 0: 53811.2. Samples: 600172200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:54,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 12:06:56,391][52263] Updated weights for policy 0, policy_version 372909 (0.0042) [2024-04-27 12:06:59,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.1, 300 sec: 53317.4). Total num frames: 6109855744. Throughput: 0: 53447.1. Samples: 600322740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:06:59,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 12:06:59,944][52263] Updated weights for policy 0, policy_version 372919 (0.0026) [2024-04-27 12:07:02,603][52263] Updated weights for policy 0, policy_version 372929 (0.0030) [2024-04-27 12:07:04,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6110134272. Throughput: 0: 53408.5. Samples: 600646220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:07:04,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 12:07:06,104][52263] Updated weights for policy 0, policy_version 372939 (0.0029) [2024-04-27 12:07:08,716][52263] Updated weights for policy 0, policy_version 372949 (0.0027) [2024-04-27 12:07:09,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6110412800. Throughput: 0: 53329.6. Samples: 600965540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:09,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:07:12,112][52263] Updated weights for policy 0, policy_version 372959 (0.0030) [2024-04-27 12:07:14,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6110674944. Throughput: 0: 53791.1. Samples: 601135560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:14,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 12:07:15,106][52263] Updated weights for policy 0, policy_version 372969 (0.0031) [2024-04-27 12:07:18,316][52263] Updated weights for policy 0, policy_version 372979 (0.0029) [2024-04-27 12:07:19,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6110920704. Throughput: 0: 53738.9. Samples: 601454120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:19,107][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 12:07:21,395][52263] Updated weights for policy 0, policy_version 372989 (0.0032) [2024-04-27 12:07:24,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6111199232. Throughput: 0: 53585.3. Samples: 601769500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:24,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 12:07:24,397][52263] Updated weights for policy 0, policy_version 372999 (0.0033) [2024-04-27 12:07:27,331][52263] Updated weights for policy 0, policy_version 373009 (0.0029) [2024-04-27 12:07:29,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 6111477760. Throughput: 0: 53428.8. Samples: 601927860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:29,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 12:07:30,433][52263] Updated weights for policy 0, policy_version 373019 (0.0028) [2024-04-27 12:07:33,245][52263] Updated weights for policy 0, policy_version 373029 (0.0029) [2024-04-27 12:07:34,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6111739904. Throughput: 0: 53357.2. Samples: 602250580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:34,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:07:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000373031_6111739904.pth... [2024-04-27 12:07:34,179][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000372248_6098911232.pth [2024-04-27 12:07:36,598][52263] Updated weights for policy 0, policy_version 373039 (0.0032) [2024-04-27 12:07:36,935][52242] Signal inference workers to stop experience collection... (9050 times) [2024-04-27 12:07:36,935][52242] Signal inference workers to resume experience collection... (9050 times) [2024-04-27 12:07:36,960][52263] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-04-27 12:07:36,961][52263] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-04-27 12:07:39,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6112018432. Throughput: 0: 53381.2. Samples: 602574360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:39,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 12:07:39,378][52263] Updated weights for policy 0, policy_version 373049 (0.0030) [2024-04-27 12:07:42,706][52263] Updated weights for policy 0, policy_version 373059 (0.0031) [2024-04-27 12:07:44,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6112296960. Throughput: 0: 53746.9. Samples: 602741340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:44,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 12:07:45,575][52263] Updated weights for policy 0, policy_version 373069 (0.0031) [2024-04-27 12:07:48,764][52263] Updated weights for policy 0, policy_version 373079 (0.0026) [2024-04-27 12:07:49,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6112542720. Throughput: 0: 53607.7. Samples: 603058580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:49,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 12:07:51,522][52263] Updated weights for policy 0, policy_version 373089 (0.0029) [2024-04-27 12:07:54,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6112821248. Throughput: 0: 53624.8. Samples: 603378660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:54,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 12:07:54,862][52263] Updated weights for policy 0, policy_version 373099 (0.0032) [2024-04-27 12:07:57,614][52263] Updated weights for policy 0, policy_version 373109 (0.0028) [2024-04-27 12:07:59,106][52031] Fps is (10 sec: 55706.4, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6113099776. Throughput: 0: 53464.0. Samples: 603541440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:07:59,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 12:08:00,906][52263] Updated weights for policy 0, policy_version 373119 (0.0029) [2024-04-27 12:08:04,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6113329152. Throughput: 0: 53538.3. Samples: 603863340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:08:04,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 12:08:04,172][52263] Updated weights for policy 0, policy_version 373129 (0.0030) [2024-04-27 12:08:06,972][52263] Updated weights for policy 0, policy_version 373139 (0.0032) [2024-04-27 12:08:09,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53247.9, 300 sec: 53484.1). Total num frames: 6113607680. Throughput: 0: 53561.8. Samples: 604179780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:08:09,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 12:08:10,333][52263] Updated weights for policy 0, policy_version 373149 (0.0029) [2024-04-27 12:08:13,054][52263] Updated weights for policy 0, policy_version 373159 (0.0035) [2024-04-27 12:08:14,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6113886208. Throughput: 0: 53576.5. Samples: 604338800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:08:14,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 12:08:16,446][52263] Updated weights for policy 0, policy_version 373169 (0.0028) [2024-04-27 12:08:19,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6114148352. Throughput: 0: 53445.5. Samples: 604655620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:08:19,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 12:08:19,116][52263] Updated weights for policy 0, policy_version 373179 (0.0030) [2024-04-27 12:08:22,387][52263] Updated weights for policy 0, policy_version 373189 (0.0026) [2024-04-27 12:08:24,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.2, 300 sec: 53373.0). Total num frames: 6114426880. Throughput: 0: 53417.1. Samples: 604978120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:08:24,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 12:08:25,173][52263] Updated weights for policy 0, policy_version 373199 (0.0039) [2024-04-27 12:08:28,357][52263] Updated weights for policy 0, policy_version 373209 (0.0032) [2024-04-27 12:08:29,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6114689024. Throughput: 0: 53248.2. Samples: 605137520. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:08:29,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 12:08:31,540][52263] Updated weights for policy 0, policy_version 373219 (0.0032) [2024-04-27 12:08:34,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6114951168. Throughput: 0: 53406.4. Samples: 605461860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:08:34,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 12:08:34,406][52263] Updated weights for policy 0, policy_version 373229 (0.0026) [2024-04-27 12:08:35,333][52242] Signal inference workers to stop experience collection... (9100 times) [2024-04-27 12:08:35,389][52263] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-04-27 12:08:35,389][52242] Signal inference workers to resume experience collection... (9100 times) [2024-04-27 12:08:35,404][52263] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-04-27 12:08:37,685][52263] Updated weights for policy 0, policy_version 373239 (0.0032) [2024-04-27 12:08:39,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6115213312. Throughput: 0: 53481.2. Samples: 605785320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:08:39,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 12:08:40,632][52263] Updated weights for policy 0, policy_version 373249 (0.0031) [2024-04-27 12:08:43,834][52263] Updated weights for policy 0, policy_version 373259 (0.0028) [2024-04-27 12:08:44,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52974.8, 300 sec: 53372.9). Total num frames: 6115475456. Throughput: 0: 53277.7. Samples: 605938940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:08:44,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 12:08:46,791][52263] Updated weights for policy 0, policy_version 373269 (0.0033) [2024-04-27 12:08:49,107][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6115770368. Throughput: 0: 53317.8. Samples: 606262640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:08:49,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 12:08:49,802][52263] Updated weights for policy 0, policy_version 373279 (0.0027) [2024-04-27 12:08:52,912][52263] Updated weights for policy 0, policy_version 373289 (0.0030) [2024-04-27 12:08:54,107][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6116048896. Throughput: 0: 53629.7. Samples: 606593120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:08:54,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 12:08:56,035][52263] Updated weights for policy 0, policy_version 373299 (0.0028) [2024-04-27 12:08:59,107][52031] Fps is (10 sec: 50790.3, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6116278272. Throughput: 0: 53578.1. Samples: 606749820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:08:59,107][52031] Avg episode reward: [(0, '0.500')] [2024-04-27 12:08:59,286][52263] Updated weights for policy 0, policy_version 373309 (0.0031) [2024-04-27 12:09:02,055][52263] Updated weights for policy 0, policy_version 373319 (0.0030) [2024-04-27 12:09:04,106][52031] Fps is (10 sec: 49152.3, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6116540416. Throughput: 0: 53681.3. Samples: 607071280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:09:04,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 12:09:05,404][52263] Updated weights for policy 0, policy_version 373329 (0.0033) [2024-04-27 12:09:08,418][52263] Updated weights for policy 0, policy_version 373339 (0.0032) [2024-04-27 12:09:09,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6116818944. Throughput: 0: 53553.3. Samples: 607388020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:09:09,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:09:11,480][52263] Updated weights for policy 0, policy_version 373349 (0.0029) [2024-04-27 12:09:14,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6117081088. Throughput: 0: 53622.3. Samples: 607550520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:09:14,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 12:09:14,740][52263] Updated weights for policy 0, policy_version 373359 (0.0028) [2024-04-27 12:09:17,557][52263] Updated weights for policy 0, policy_version 373369 (0.0031) [2024-04-27 12:09:19,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6117376000. Throughput: 0: 53556.8. Samples: 607871920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:09:19,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 12:09:20,944][52263] Updated weights for policy 0, policy_version 373379 (0.0027) [2024-04-27 12:09:23,864][52263] Updated weights for policy 0, policy_version 373389 (0.0030) [2024-04-27 12:09:24,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6117621760. Throughput: 0: 53391.2. Samples: 608187920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:09:24,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:09:27,055][52263] Updated weights for policy 0, policy_version 373399 (0.0031) [2024-04-27 12:09:29,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6117883904. Throughput: 0: 53561.0. Samples: 608349180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:09:29,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:09:29,951][52263] Updated weights for policy 0, policy_version 373409 (0.0029) [2024-04-27 12:09:33,199][52263] Updated weights for policy 0, policy_version 373419 (0.0030) [2024-04-27 12:09:34,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6118162432. Throughput: 0: 53494.2. Samples: 608669880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:09:34,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 12:09:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000373423_6118162432.pth... [2024-04-27 12:09:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000372639_6105317376.pth [2024-04-27 12:09:35,703][52242] Signal inference workers to stop experience collection... (9150 times) [2024-04-27 12:09:35,703][52242] Signal inference workers to resume experience collection... (9150 times) [2024-04-27 12:09:35,715][52263] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-04-27 12:09:35,715][52263] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-04-27 12:09:36,086][52263] Updated weights for policy 0, policy_version 373429 (0.0034) [2024-04-27 12:09:39,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6118408192. Throughput: 0: 53217.0. Samples: 608987880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:09:39,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 12:09:39,210][52263] Updated weights for policy 0, policy_version 373439 (0.0031) [2024-04-27 12:09:42,217][52263] Updated weights for policy 0, policy_version 373449 (0.0028) [2024-04-27 12:09:44,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6118703104. Throughput: 0: 53414.3. Samples: 609153460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:09:44,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 12:09:45,403][52263] Updated weights for policy 0, policy_version 373459 (0.0029) [2024-04-27 12:09:48,371][52263] Updated weights for policy 0, policy_version 373469 (0.0032) [2024-04-27 12:09:49,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6118965248. Throughput: 0: 53253.8. Samples: 609467700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:09:49,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 12:09:51,477][52263] Updated weights for policy 0, policy_version 373479 (0.0027) [2024-04-27 12:09:54,107][52031] Fps is (10 sec: 49151.2, 60 sec: 52428.7, 300 sec: 53428.5). Total num frames: 6119194624. Throughput: 0: 53237.5. Samples: 609783720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:09:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:09:54,412][52263] Updated weights for policy 0, policy_version 373489 (0.0030) [2024-04-27 12:09:57,609][52263] Updated weights for policy 0, policy_version 373499 (0.0024) [2024-04-27 12:09:59,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6119505920. Throughput: 0: 53095.2. Samples: 609939800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:09:59,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:10:00,661][52263] Updated weights for policy 0, policy_version 373509 (0.0028) [2024-04-27 12:10:03,961][52263] Updated weights for policy 0, policy_version 373519 (0.0035) [2024-04-27 12:10:04,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6119735296. Throughput: 0: 53174.3. Samples: 610264760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:04,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 12:10:06,630][52263] Updated weights for policy 0, policy_version 373529 (0.0032) [2024-04-27 12:10:09,107][52031] Fps is (10 sec: 50789.2, 60 sec: 53247.8, 300 sec: 53428.5). Total num frames: 6120013824. Throughput: 0: 53204.3. Samples: 610582120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:09,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 12:10:10,049][52263] Updated weights for policy 0, policy_version 373539 (0.0025) [2024-04-27 12:10:12,783][52263] Updated weights for policy 0, policy_version 373549 (0.0031) [2024-04-27 12:10:14,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6120292352. Throughput: 0: 53253.8. Samples: 610745600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:14,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 12:10:16,155][52263] Updated weights for policy 0, policy_version 373559 (0.0030) [2024-04-27 12:10:18,972][52263] Updated weights for policy 0, policy_version 373569 (0.0032) [2024-04-27 12:10:19,107][52031] Fps is (10 sec: 54067.7, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6120554496. Throughput: 0: 53168.4. Samples: 611062460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:19,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:10:22,215][52263] Updated weights for policy 0, policy_version 373579 (0.0032) [2024-04-27 12:10:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6120816640. Throughput: 0: 53281.8. Samples: 611385560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:24,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 12:10:25,044][52263] Updated weights for policy 0, policy_version 373589 (0.0030) [2024-04-27 12:10:28,354][52263] Updated weights for policy 0, policy_version 373599 (0.0030) [2024-04-27 12:10:29,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6121078784. Throughput: 0: 53035.6. Samples: 611540060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:29,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 12:10:31,085][52263] Updated weights for policy 0, policy_version 373609 (0.0031) [2024-04-27 12:10:34,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52974.9, 300 sec: 53372.9). Total num frames: 6121340928. Throughput: 0: 53190.5. Samples: 611861280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:34,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:10:34,561][52263] Updated weights for policy 0, policy_version 373619 (0.0028) [2024-04-27 12:10:37,334][52263] Updated weights for policy 0, policy_version 373629 (0.0031) [2024-04-27 12:10:39,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6121619456. Throughput: 0: 53283.3. Samples: 612181460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:39,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:10:40,289][52242] Signal inference workers to stop experience collection... 
(9200 times) [2024-04-27 12:10:40,323][52263] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-04-27 12:10:40,350][52242] Signal inference workers to resume experience collection... (9200 times) [2024-04-27 12:10:40,354][52263] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-04-27 12:10:40,749][52263] Updated weights for policy 0, policy_version 373639 (0.0031) [2024-04-27 12:10:43,331][52263] Updated weights for policy 0, policy_version 373649 (0.0029) [2024-04-27 12:10:44,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6121881600. Throughput: 0: 53322.7. Samples: 612339320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:44,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 12:10:46,752][52263] Updated weights for policy 0, policy_version 373659 (0.0029) [2024-04-27 12:10:49,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6122160128. Throughput: 0: 53330.7. Samples: 612664640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:49,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 12:10:49,466][52263] Updated weights for policy 0, policy_version 373669 (0.0029) [2024-04-27 12:10:52,772][52263] Updated weights for policy 0, policy_version 373679 (0.0030) [2024-04-27 12:10:54,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6122405888. Throughput: 0: 53387.3. Samples: 612984540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:54,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 12:10:55,464][52263] Updated weights for policy 0, policy_version 373689 (0.0027) [2024-04-27 12:10:58,993][52263] Updated weights for policy 0, policy_version 373699 (0.0037) [2024-04-27 12:10:59,106][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6122684416. Throughput: 0: 53327.1. Samples: 613145320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:10:59,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:11:01,773][52263] Updated weights for policy 0, policy_version 373709 (0.0034) [2024-04-27 12:11:04,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6122930176. Throughput: 0: 53373.5. Samples: 613464260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:04,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:11:05,147][52263] Updated weights for policy 0, policy_version 373719 (0.0032) [2024-04-27 12:11:07,895][52263] Updated weights for policy 0, policy_version 373729 (0.0027) [2024-04-27 12:11:09,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 6123208704. Throughput: 0: 53414.7. Samples: 613789220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:09,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:11:11,181][52263] Updated weights for policy 0, policy_version 373739 (0.0028) [2024-04-27 12:11:13,842][52263] Updated weights for policy 0, policy_version 373749 (0.0032) [2024-04-27 12:11:14,107][52031] Fps is (10 sec: 57342.6, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6123503616. Throughput: 0: 53462.8. Samples: 613945900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:14,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 12:11:17,149][52263] Updated weights for policy 0, policy_version 373759 (0.0028) [2024-04-27 12:11:19,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6123765760. Throughput: 0: 53444.0. Samples: 614266260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:19,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 12:11:19,875][52263] Updated weights for policy 0, policy_version 373769 (0.0027) [2024-04-27 12:11:23,280][52263] Updated weights for policy 0, policy_version 373779 (0.0036) [2024-04-27 12:11:24,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6124027904. Throughput: 0: 53403.3. Samples: 614584620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:24,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 12:11:26,103][52263] Updated weights for policy 0, policy_version 373789 (0.0029) [2024-04-27 12:11:29,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53247.8, 300 sec: 53428.5). Total num frames: 6124273664. Throughput: 0: 53439.4. Samples: 614744100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:29,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 12:11:29,561][52263] Updated weights for policy 0, policy_version 373799 (0.0030) [2024-04-27 12:11:32,115][52263] Updated weights for policy 0, policy_version 373809 (0.0032) [2024-04-27 12:11:34,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6124552192. Throughput: 0: 53316.8. Samples: 615063900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:34,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 12:11:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000373813_6124552192.pth... [2024-04-27 12:11:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000373031_6111739904.pth [2024-04-27 12:11:34,466][52242] Signal inference workers to stop experience collection... (9250 times) [2024-04-27 12:11:34,467][52242] Signal inference workers to resume experience collection... 
(9250 times) [2024-04-27 12:11:34,481][52263] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-04-27 12:11:34,482][52263] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-04-27 12:11:35,605][52263] Updated weights for policy 0, policy_version 373819 (0.0031) [2024-04-27 12:11:38,163][52263] Updated weights for policy 0, policy_version 373829 (0.0027) [2024-04-27 12:11:39,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6124814336. Throughput: 0: 53306.2. Samples: 615383320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:39,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 12:11:41,675][52263] Updated weights for policy 0, policy_version 373839 (0.0029) [2024-04-27 12:11:44,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6125109248. Throughput: 0: 53406.9. Samples: 615548640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:44,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:11:44,356][52263] Updated weights for policy 0, policy_version 373849 (0.0030) [2024-04-27 12:11:47,860][52263] Updated weights for policy 0, policy_version 373859 (0.0026) [2024-04-27 12:11:49,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6125355008. Throughput: 0: 53470.7. Samples: 615870440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:49,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 12:11:51,287][52263] Updated weights for policy 0, policy_version 373869 (0.0031) [2024-04-27 12:11:53,943][52263] Updated weights for policy 0, policy_version 373879 (0.0027) [2024-04-27 12:11:54,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6125633536. Throughput: 0: 53378.7. Samples: 616191260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:54,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:11:57,394][52263] Updated weights for policy 0, policy_version 373889 (0.0026) [2024-04-27 12:11:59,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6125879296. Throughput: 0: 53277.6. Samples: 616343380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:11:59,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 12:12:00,166][52263] Updated weights for policy 0, policy_version 373899 (0.0028) [2024-04-27 12:12:03,345][52263] Updated weights for policy 0, policy_version 373909 (0.0030) [2024-04-27 12:12:04,107][52031] Fps is (10 sec: 50789.3, 60 sec: 53520.9, 300 sec: 53317.4). Total num frames: 6126141440. Throughput: 0: 53364.3. Samples: 616667660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:12:04,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 12:12:06,431][52263] Updated weights for policy 0, policy_version 373919 (0.0038) [2024-04-27 12:12:09,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6126436352. Throughput: 0: 53395.8. Samples: 616987420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:12:09,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 12:12:09,343][52263] Updated weights for policy 0, policy_version 373929 (0.0027) [2024-04-27 12:12:12,420][52263] Updated weights for policy 0, policy_version 373939 (0.0030) [2024-04-27 12:12:14,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6126698496. Throughput: 0: 53695.2. Samples: 617160380. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-27 12:12:14,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 12:12:15,487][52263] Updated weights for policy 0, policy_version 373949 (0.0027) [2024-04-27 12:12:18,587][52263] Updated weights for policy 0, policy_version 373959 (0.0032) [2024-04-27 12:12:19,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6126977024. Throughput: 0: 53747.9. Samples: 617482560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:12:19,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 12:12:21,634][52263] Updated weights for policy 0, policy_version 373969 (0.0028) [2024-04-27 12:12:24,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.3, 300 sec: 53484.1). Total num frames: 6127255552. Throughput: 0: 53832.9. Samples: 617805800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:12:24,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 12:12:24,605][52263] Updated weights for policy 0, policy_version 373979 (0.0029) [2024-04-27 12:12:27,801][52263] Updated weights for policy 0, policy_version 373989 (0.0028) [2024-04-27 12:12:29,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6127501312. Throughput: 0: 53522.7. Samples: 617957160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:12:29,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 12:12:30,717][52242] Signal inference workers to stop experience collection... (9300 times) [2024-04-27 12:12:30,749][52263] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-04-27 12:12:30,779][52242] Signal inference workers to resume experience collection... 
(9300 times) [2024-04-27 12:12:30,783][52263] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-04-27 12:12:30,785][52263] Updated weights for policy 0, policy_version 373999 (0.0027) [2024-04-27 12:12:33,862][52263] Updated weights for policy 0, policy_version 374009 (0.0030) [2024-04-27 12:12:34,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6127763456. Throughput: 0: 53589.3. Samples: 618281960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:12:34,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 12:12:36,856][52263] Updated weights for policy 0, policy_version 374019 (0.0030) [2024-04-27 12:12:39,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 6128041984. Throughput: 0: 53585.2. Samples: 618602600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:12:39,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 12:12:39,878][52263] Updated weights for policy 0, policy_version 374029 (0.0033) [2024-04-27 12:12:42,769][52263] Updated weights for policy 0, policy_version 374039 (0.0024) [2024-04-27 12:12:44,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6128336896. Throughput: 0: 53948.8. Samples: 618771080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:12:44,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 12:12:46,292][52263] Updated weights for policy 0, policy_version 374049 (0.0027) [2024-04-27 12:12:48,926][52263] Updated weights for policy 0, policy_version 374059 (0.0028) [2024-04-27 12:12:49,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6128582656. Throughput: 0: 53957.6. Samples: 619095740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:12:49,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 12:12:52,486][52263] Updated weights for policy 0, policy_version 374069 (0.0028) [2024-04-27 12:12:54,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6128844800. Throughput: 0: 54017.8. Samples: 619418220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:12:54,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 12:12:54,979][52263] Updated weights for policy 0, policy_version 374079 (0.0026) [2024-04-27 12:12:58,546][52263] Updated weights for policy 0, policy_version 374089 (0.0026) [2024-04-27 12:12:59,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6129106944. Throughput: 0: 53533.2. Samples: 619569380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:12:59,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 12:13:01,196][52263] Updated weights for policy 0, policy_version 374099 (0.0027) [2024-04-27 12:13:04,107][52031] Fps is (10 sec: 54066.6, 60 sec: 54067.3, 300 sec: 53484.0). Total num frames: 6129385472. Throughput: 0: 53452.9. Samples: 619887940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:13:04,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 12:13:04,479][52263] Updated weights for policy 0, policy_version 374109 (0.0033) [2024-04-27 12:13:07,378][52263] Updated weights for policy 0, policy_version 374119 (0.0031) [2024-04-27 12:13:09,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6129647616. Throughput: 0: 53488.9. Samples: 620212800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:13:09,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 12:13:10,414][52263] Updated weights for policy 0, policy_version 374129 (0.0026) [2024-04-27 12:13:13,322][52263] Updated weights for policy 0, policy_version 374139 (0.0030) [2024-04-27 12:13:14,107][52031] Fps is (10 sec: 55705.6, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6129942528. Throughput: 0: 53887.5. Samples: 620382100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:13:14,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 12:13:16,819][52263] Updated weights for policy 0, policy_version 374149 (0.0034) [2024-04-27 12:13:19,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6130188288. Throughput: 0: 53789.3. Samples: 620702480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:13:19,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:13:19,438][52263] Updated weights for policy 0, policy_version 374159 (0.0033) [2024-04-27 12:13:22,870][52263] Updated weights for policy 0, policy_version 374169 (0.0026) [2024-04-27 12:13:24,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6130450432. Throughput: 0: 53758.3. Samples: 621021720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:13:24,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 12:13:25,566][52263] Updated weights for policy 0, policy_version 374179 (0.0029) [2024-04-27 12:13:29,088][52263] Updated weights for policy 0, policy_version 374189 (0.0031) [2024-04-27 12:13:29,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6130712576. Throughput: 0: 53475.1. Samples: 621177460. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:13:29,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 12:13:31,023][52242] Signal inference workers to stop experience collection... (9350 times) [2024-04-27 12:13:31,065][52263] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-04-27 12:13:31,076][52242] Signal inference workers to resume experience collection... (9350 times) [2024-04-27 12:13:31,082][52263] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-04-27 12:13:31,558][52263] Updated weights for policy 0, policy_version 374199 (0.0033) [2024-04-27 12:13:34,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6130991104. Throughput: 0: 53446.4. Samples: 621500840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 12:13:34,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 12:13:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000374206_6130991104.pth... [2024-04-27 12:13:34,167][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000373423_6118162432.pth [2024-04-27 12:13:35,066][52263] Updated weights for policy 0, policy_version 374209 (0.0026) [2024-04-27 12:13:37,679][52263] Updated weights for policy 0, policy_version 374219 (0.0031) [2024-04-27 12:13:39,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6131269632. Throughput: 0: 53407.0. Samples: 621821540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:13:39,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 12:13:41,278][52263] Updated weights for policy 0, policy_version 374229 (0.0032) [2024-04-27 12:13:43,937][52263] Updated weights for policy 0, policy_version 374239 (0.0031) [2024-04-27 12:13:44,107][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6131548160. Throughput: 0: 53750.7. Samples: 621988160. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:13:44,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 12:13:47,507][52263] Updated weights for policy 0, policy_version 374249 (0.0030) [2024-04-27 12:13:49,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6131793920. Throughput: 0: 53894.7. Samples: 622313200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:13:49,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 12:13:50,022][52263] Updated weights for policy 0, policy_version 374259 (0.0027) [2024-04-27 12:13:53,624][52263] Updated weights for policy 0, policy_version 374269 (0.0031) [2024-04-27 12:13:54,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6132056064. Throughput: 0: 53791.0. Samples: 622633400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:13:54,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 12:13:56,016][52263] Updated weights for policy 0, policy_version 374279 (0.0030) [2024-04-27 12:13:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6132334592. Throughput: 0: 53481.5. Samples: 622788760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:13:59,107][52031] Avg episode reward: [(0, '0.688')] [2024-04-27 12:13:59,628][52263] Updated weights for policy 0, policy_version 374289 (0.0034) [2024-04-27 12:14:02,092][52263] Updated weights for policy 0, policy_version 374299 (0.0026) [2024-04-27 12:14:04,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6132596736. Throughput: 0: 53427.9. Samples: 623106740. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:04,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 12:14:05,693][52263] Updated weights for policy 0, policy_version 374309 (0.0036) [2024-04-27 12:14:08,253][52263] Updated weights for policy 0, policy_version 374319 (0.0029) [2024-04-27 12:14:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6132875264. Throughput: 0: 53336.9. Samples: 623421880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:09,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 12:14:11,983][52263] Updated weights for policy 0, policy_version 374329 (0.0034) [2024-04-27 12:14:14,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52975.1, 300 sec: 53373.0). Total num frames: 6133121024. Throughput: 0: 53584.3. Samples: 623588740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:14,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 12:14:14,397][52263] Updated weights for policy 0, policy_version 374339 (0.0026) [2024-04-27 12:14:18,253][52263] Updated weights for policy 0, policy_version 374349 (0.0029) [2024-04-27 12:14:19,106][52031] Fps is (10 sec: 49152.2, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6133366784. Throughput: 0: 53434.5. Samples: 623905380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:19,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:14:20,435][52263] Updated weights for policy 0, policy_version 374359 (0.0035) [2024-04-27 12:14:24,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6133645312. Throughput: 0: 53370.2. Samples: 624223200. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:24,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:14:24,345][52263] Updated weights for policy 0, policy_version 374369 (0.0038) [2024-04-27 12:14:24,657][52242] Signal inference workers to stop experience collection... (9400 times) [2024-04-27 12:14:24,704][52263] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-04-27 12:14:24,717][52242] Signal inference workers to resume experience collection... (9400 times) [2024-04-27 12:14:24,726][52263] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-04-27 12:14:26,695][52263] Updated weights for policy 0, policy_version 374379 (0.0026) [2024-04-27 12:14:29,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6133923840. Throughput: 0: 53137.0. Samples: 624379320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:29,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 12:14:30,500][52263] Updated weights for policy 0, policy_version 374389 (0.0030) [2024-04-27 12:14:33,173][52263] Updated weights for policy 0, policy_version 374399 (0.0033) [2024-04-27 12:14:34,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6134202368. Throughput: 0: 53008.4. Samples: 624698580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:34,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 12:14:36,503][52263] Updated weights for policy 0, policy_version 374409 (0.0037) [2024-04-27 12:14:39,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6134464512. Throughput: 0: 53014.8. Samples: 625019060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:39,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 12:14:39,329][52263] Updated weights for policy 0, policy_version 374419 (0.0030) [2024-04-27 12:14:42,513][52263] Updated weights for policy 0, policy_version 374429 (0.0033) [2024-04-27 12:14:44,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6134743040. Throughput: 0: 53082.2. Samples: 625177460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:44,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 12:14:45,609][52263] Updated weights for policy 0, policy_version 374439 (0.0032) [2024-04-27 12:14:48,658][52263] Updated weights for policy 0, policy_version 374449 (0.0026) [2024-04-27 12:14:49,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6134988800. Throughput: 0: 53069.7. Samples: 625494880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:49,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 12:14:51,924][52263] Updated weights for policy 0, policy_version 374459 (0.0030) [2024-04-27 12:14:54,106][52031] Fps is (10 sec: 49151.9, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6135234560. Throughput: 0: 53304.0. Samples: 625820560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 12:14:54,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 12:14:54,789][52263] Updated weights for policy 0, policy_version 374469 (0.0028) [2024-04-27 12:14:57,934][52263] Updated weights for policy 0, policy_version 374479 (0.0035) [2024-04-27 12:14:59,106][52031] Fps is (10 sec: 52430.2, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6135513088. Throughput: 0: 53028.0. Samples: 625975000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:14:59,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 12:15:01,208][52263] Updated weights for policy 0, policy_version 374489 (0.0027) [2024-04-27 12:15:04,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6135775232. Throughput: 0: 53119.3. Samples: 626295760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:04,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 12:15:04,140][52263] Updated weights for policy 0, policy_version 374499 (0.0035) [2024-04-27 12:15:07,256][52263] Updated weights for policy 0, policy_version 374509 (0.0027) [2024-04-27 12:15:09,107][52031] Fps is (10 sec: 55704.1, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6136070144. Throughput: 0: 53197.7. Samples: 626617100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:09,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 12:15:10,143][52263] Updated weights for policy 0, policy_version 374519 (0.0031) [2024-04-27 12:15:13,329][52263] Updated weights for policy 0, policy_version 374529 (0.0033) [2024-04-27 12:15:14,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6136332288. Throughput: 0: 53303.0. Samples: 626777960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:14,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 12:15:16,300][52263] Updated weights for policy 0, policy_version 374539 (0.0029) [2024-04-27 12:15:19,107][52031] Fps is (10 sec: 50790.8, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6136578048. Throughput: 0: 53320.5. Samples: 627098000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:19,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 12:15:19,340][52263] Updated weights for policy 0, policy_version 374549 (0.0029) [2024-04-27 12:15:22,527][52263] Updated weights for policy 0, policy_version 374559 (0.0028) [2024-04-27 12:15:24,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6136856576. Throughput: 0: 53322.6. Samples: 627418580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:24,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 12:15:25,412][52263] Updated weights for policy 0, policy_version 374569 (0.0032) [2024-04-27 12:15:26,441][52242] Signal inference workers to stop experience collection... (9450 times) [2024-04-27 12:15:26,441][52242] Signal inference workers to resume experience collection... (9450 times) [2024-04-27 12:15:26,454][52263] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-04-27 12:15:26,454][52263] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-04-27 12:15:28,847][52263] Updated weights for policy 0, policy_version 374579 (0.0028) [2024-04-27 12:15:29,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6137102336. Throughput: 0: 53399.9. Samples: 627580460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:29,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 12:15:31,529][52263] Updated weights for policy 0, policy_version 374589 (0.0037) [2024-04-27 12:15:34,107][52031] Fps is (10 sec: 52428.6, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6137380864. Throughput: 0: 53477.9. Samples: 627901380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:34,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 12:15:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000374596_6137380864.pth... 
[2024-04-27 12:15:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000373813_6124552192.pth [2024-04-27 12:15:35,088][52263] Updated weights for policy 0, policy_version 374599 (0.0033) [2024-04-27 12:15:37,667][52263] Updated weights for policy 0, policy_version 374609 (0.0028) [2024-04-27 12:15:39,107][52031] Fps is (10 sec: 57344.0, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6137675776. Throughput: 0: 53315.9. Samples: 628219780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:39,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 12:15:41,323][52263] Updated weights for policy 0, policy_version 374619 (0.0028) [2024-04-27 12:15:43,748][52263] Updated weights for policy 0, policy_version 374629 (0.0029) [2024-04-27 12:15:44,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6137937920. Throughput: 0: 53628.7. Samples: 628388300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:44,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 12:15:47,265][52263] Updated weights for policy 0, policy_version 374639 (0.0033) [2024-04-27 12:15:49,107][52031] Fps is (10 sec: 49151.1, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6138167296. Throughput: 0: 53543.0. Samples: 628705200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:49,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 12:15:49,950][52263] Updated weights for policy 0, policy_version 374649 (0.0026) [2024-04-27 12:15:53,177][52263] Updated weights for policy 0, policy_version 374659 (0.0029) [2024-04-27 12:15:54,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6138462208. Throughput: 0: 53616.1. Samples: 629029820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:54,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 12:15:56,024][52263] Updated weights for policy 0, policy_version 374669 (0.0028) [2024-04-27 12:15:59,107][52031] Fps is (10 sec: 55706.6, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6138724352. Throughput: 0: 53592.9. Samples: 629189640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:15:59,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 12:15:59,209][52263] Updated weights for policy 0, policy_version 374679 (0.0028) [2024-04-27 12:16:02,180][52263] Updated weights for policy 0, policy_version 374689 (0.0031) [2024-04-27 12:16:04,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6139002880. Throughput: 0: 53602.3. Samples: 629510100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:16:04,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 12:16:05,263][52263] Updated weights for policy 0, policy_version 374699 (0.0029) [2024-04-27 12:16:08,112][52263] Updated weights for policy 0, policy_version 374709 (0.0029) [2024-04-27 12:16:09,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6139265024. Throughput: 0: 53621.0. Samples: 629831520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 12:16:09,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 12:16:11,506][52263] Updated weights for policy 0, policy_version 374719 (0.0030) [2024-04-27 12:16:14,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6139543552. Throughput: 0: 53629.8. Samples: 629993800. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:14,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 12:16:14,286][52263] Updated weights for policy 0, policy_version 374729 (0.0027) [2024-04-27 12:16:17,563][52263] Updated weights for policy 0, policy_version 374739 (0.0031) [2024-04-27 12:16:19,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6139789312. Throughput: 0: 53755.0. Samples: 630320360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:19,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 12:16:20,255][52263] Updated weights for policy 0, policy_version 374749 (0.0028) [2024-04-27 12:16:23,797][52263] Updated weights for policy 0, policy_version 374759 (0.0024) [2024-04-27 12:16:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6140067840. Throughput: 0: 53775.2. Samples: 630639660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:24,107][52031] Avg episode reward: [(0, '0.479')] [2024-04-27 12:16:26,428][52263] Updated weights for policy 0, policy_version 374769 (0.0029) [2024-04-27 12:16:27,298][52242] Signal inference workers to stop experience collection... (9500 times) [2024-04-27 12:16:27,298][52242] Signal inference workers to resume experience collection... (9500 times) [2024-04-27 12:16:27,320][52263] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-04-27 12:16:27,320][52263] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-04-27 12:16:29,106][52031] Fps is (10 sec: 55706.6, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6140346368. Throughput: 0: 53434.8. Samples: 630792860. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:29,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 12:16:29,719][52263] Updated weights for policy 0, policy_version 374779 (0.0025) [2024-04-27 12:16:32,830][52263] Updated weights for policy 0, policy_version 374789 (0.0028) [2024-04-27 12:16:34,107][52031] Fps is (10 sec: 55705.2, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6140624896. Throughput: 0: 53587.8. Samples: 631116640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:34,107][52031] Avg episode reward: [(0, '0.491')] [2024-04-27 12:16:35,691][52263] Updated weights for policy 0, policy_version 374799 (0.0029) [2024-04-27 12:16:39,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6140870656. Throughput: 0: 53525.7. Samples: 631438480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:39,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 12:16:39,108][52263] Updated weights for policy 0, policy_version 374809 (0.0030) [2024-04-27 12:16:41,936][52263] Updated weights for policy 0, policy_version 374819 (0.0027) [2024-04-27 12:16:44,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6141132800. Throughput: 0: 53453.8. Samples: 631595060. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:44,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 12:16:45,033][52263] Updated weights for policy 0, policy_version 374829 (0.0026) [2024-04-27 12:16:48,067][52263] Updated weights for policy 0, policy_version 374839 (0.0032) [2024-04-27 12:16:49,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.3, 300 sec: 53428.5). Total num frames: 6141394944. Throughput: 0: 53477.2. Samples: 631916580. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:49,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 12:16:51,009][52263] Updated weights for policy 0, policy_version 374849 (0.0027) [2024-04-27 12:16:54,037][52263] Updated weights for policy 0, policy_version 374859 (0.0026) [2024-04-27 12:16:54,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6141689856. Throughput: 0: 53471.3. Samples: 632237740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:54,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 12:16:57,272][52263] Updated weights for policy 0, policy_version 374869 (0.0027) [2024-04-27 12:16:59,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6141935616. Throughput: 0: 53465.7. Samples: 632399760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:16:59,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 12:17:00,191][52263] Updated weights for policy 0, policy_version 374879 (0.0029) [2024-04-27 12:17:03,361][52263] Updated weights for policy 0, policy_version 374889 (0.0032) [2024-04-27 12:17:04,107][52031] Fps is (10 sec: 54064.7, 60 sec: 53793.6, 300 sec: 53539.5). Total num frames: 6142230528. Throughput: 0: 53377.2. Samples: 632722360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:17:04,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:17:06,572][52263] Updated weights for policy 0, policy_version 374899 (0.0034) [2024-04-27 12:17:09,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6142492672. Throughput: 0: 53459.1. Samples: 633045320. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:17:09,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 12:17:09,339][52263] Updated weights for policy 0, policy_version 374909 (0.0028) [2024-04-27 12:17:12,725][52263] Updated weights for policy 0, policy_version 374919 (0.0027) [2024-04-27 12:17:14,107][52031] Fps is (10 sec: 52431.6, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6142754816. Throughput: 0: 53640.8. Samples: 633206700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:17:14,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 12:17:15,508][52263] Updated weights for policy 0, policy_version 374929 (0.0032) [2024-04-27 12:17:18,829][52263] Updated weights for policy 0, policy_version 374939 (0.0031) [2024-04-27 12:17:19,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6143000576. Throughput: 0: 53566.7. Samples: 633527140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:17:19,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:17:21,599][52263] Updated weights for policy 0, policy_version 374949 (0.0028) [2024-04-27 12:17:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6143295488. Throughput: 0: 53549.1. Samples: 633848180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:17:24,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 12:17:24,833][52263] Updated weights for policy 0, policy_version 374959 (0.0030) [2024-04-27 12:17:27,740][52263] Updated weights for policy 0, policy_version 374969 (0.0026) [2024-04-27 12:17:28,009][52242] Signal inference workers to stop experience collection... (9550 times) [2024-04-27 12:17:28,012][52242] Signal inference workers to resume experience collection... 
(9550 times) [2024-04-27 12:17:28,042][52263] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-04-27 12:17:28,042][52263] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-04-27 12:17:29,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6143557632. Throughput: 0: 53722.2. Samples: 634012560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-04-27 12:17:29,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 12:17:30,843][52263] Updated weights for policy 0, policy_version 374979 (0.0032) [2024-04-27 12:17:33,886][52263] Updated weights for policy 0, policy_version 374989 (0.0028) [2024-04-27 12:17:34,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6143819776. Throughput: 0: 53751.2. Samples: 634335380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:17:34,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 12:17:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000374989_6143819776.pth... [2024-04-27 12:17:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000374206_6130991104.pth [2024-04-27 12:17:36,980][52263] Updated weights for policy 0, policy_version 374999 (0.0029) [2024-04-27 12:17:39,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6144098304. Throughput: 0: 53699.7. Samples: 634654220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:17:39,107][52031] Avg episode reward: [(0, '0.453')] [2024-04-27 12:17:39,921][52263] Updated weights for policy 0, policy_version 375009 (0.0028) [2024-04-27 12:17:43,236][52263] Updated weights for policy 0, policy_version 375019 (0.0029) [2024-04-27 12:17:44,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6144344064. Throughput: 0: 53678.3. Samples: 634815280. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:17:44,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 12:17:46,035][52263] Updated weights for policy 0, policy_version 375029 (0.0034) [2024-04-27 12:17:49,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6144622592. Throughput: 0: 53581.0. Samples: 635133480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:17:49,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 12:17:49,363][52263] Updated weights for policy 0, policy_version 375039 (0.0030) [2024-04-27 12:17:52,194][52263] Updated weights for policy 0, policy_version 375049 (0.0030) [2024-04-27 12:17:54,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6144901120. Throughput: 0: 53544.8. Samples: 635454840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:17:54,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 12:17:55,398][52263] Updated weights for policy 0, policy_version 375059 (0.0035) [2024-04-27 12:17:58,279][52263] Updated weights for policy 0, policy_version 375069 (0.0028) [2024-04-27 12:17:59,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53794.3, 300 sec: 53484.1). Total num frames: 6145163264. Throughput: 0: 53617.5. Samples: 635619480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:17:59,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 12:18:01,524][52263] Updated weights for policy 0, policy_version 375079 (0.0028) [2024-04-27 12:18:04,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.7, 300 sec: 53539.6). Total num frames: 6145441792. Throughput: 0: 53630.3. Samples: 635940500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:04,115][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 12:18:04,635][52263] Updated weights for policy 0, policy_version 375089 (0.0026) [2024-04-27 12:18:07,820][52263] Updated weights for policy 0, policy_version 375099 (0.0025) [2024-04-27 12:18:09,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6145687552. Throughput: 0: 53602.1. Samples: 636260280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:09,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 12:18:10,697][52263] Updated weights for policy 0, policy_version 375109 (0.0027) [2024-04-27 12:18:13,853][52263] Updated weights for policy 0, policy_version 375119 (0.0031) [2024-04-27 12:18:14,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6145966080. Throughput: 0: 53447.1. Samples: 636417680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:14,116][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 12:18:16,679][52263] Updated weights for policy 0, policy_version 375129 (0.0035) [2024-04-27 12:18:19,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6146228224. Throughput: 0: 53450.1. Samples: 636740640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:19,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 12:18:19,809][52263] Updated weights for policy 0, policy_version 375139 (0.0028) [2024-04-27 12:18:22,837][52263] Updated weights for policy 0, policy_version 375149 (0.0030) [2024-04-27 12:18:24,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6146506752. Throughput: 0: 53467.4. Samples: 637060260. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:24,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 12:18:25,967][52263] Updated weights for policy 0, policy_version 375159 (0.0038) [2024-04-27 12:18:29,055][52263] Updated weights for policy 0, policy_version 375169 (0.0029) [2024-04-27 12:18:29,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6146768896. Throughput: 0: 53462.9. Samples: 637221120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:29,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 12:18:32,116][52263] Updated weights for policy 0, policy_version 375179 (0.0032) [2024-04-27 12:18:34,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6147031040. Throughput: 0: 53576.9. Samples: 637544440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:34,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:18:35,093][52263] Updated weights for policy 0, policy_version 375189 (0.0033) [2024-04-27 12:18:38,135][52263] Updated weights for policy 0, policy_version 375199 (0.0023) [2024-04-27 12:18:39,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6147293184. Throughput: 0: 53503.1. Samples: 637862480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:39,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 12:18:41,131][52263] Updated weights for policy 0, policy_version 375209 (0.0029) [2024-04-27 12:18:44,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6147571712. Throughput: 0: 53561.1. Samples: 638029740. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:44,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 12:18:44,272][52263] Updated weights for policy 0, policy_version 375219 (0.0027) [2024-04-27 12:18:47,312][52263] Updated weights for policy 0, policy_version 375229 (0.0027) [2024-04-27 12:18:48,128][52242] Signal inference workers to stop experience collection... (9600 times) [2024-04-27 12:18:48,173][52263] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-04-27 12:18:48,183][52242] Signal inference workers to resume experience collection... (9600 times) [2024-04-27 12:18:48,190][52263] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-04-27 12:18:49,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6147833856. Throughput: 0: 53508.2. Samples: 638348380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:18:49,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 12:18:50,379][52263] Updated weights for policy 0, policy_version 375239 (0.0031) [2024-04-27 12:18:53,503][52263] Updated weights for policy 0, policy_version 375249 (0.0032) [2024-04-27 12:18:54,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6148112384. Throughput: 0: 53529.5. Samples: 638669100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:18:54,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 12:18:56,519][52263] Updated weights for policy 0, policy_version 375259 (0.0030) [2024-04-27 12:18:59,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6148390912. Throughput: 0: 53637.9. Samples: 638831380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:18:59,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 12:18:59,425][52263] Updated weights for policy 0, policy_version 375269 (0.0027) [2024-04-27 12:19:02,616][52263] Updated weights for policy 0, policy_version 375279 (0.0027) [2024-04-27 12:19:04,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6148636672. Throughput: 0: 53611.3. Samples: 639153140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:04,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 12:19:05,399][52263] Updated weights for policy 0, policy_version 375289 (0.0027) [2024-04-27 12:19:08,783][52263] Updated weights for policy 0, policy_version 375299 (0.0036) [2024-04-27 12:19:09,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6148915200. Throughput: 0: 53635.7. Samples: 639473860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:09,107][52031] Avg episode reward: [(0, '0.678')] [2024-04-27 12:19:11,679][52263] Updated weights for policy 0, policy_version 375309 (0.0034) [2024-04-27 12:19:14,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6149177344. Throughput: 0: 53555.6. Samples: 639631120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:14,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 12:19:14,949][52263] Updated weights for policy 0, policy_version 375319 (0.0033) [2024-04-27 12:19:18,004][52263] Updated weights for policy 0, policy_version 375329 (0.0029) [2024-04-27 12:19:19,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6149455872. Throughput: 0: 53537.5. Samples: 639953620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:19,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 12:19:21,173][52263] Updated weights for policy 0, policy_version 375339 (0.0031) [2024-04-27 12:19:24,023][52263] Updated weights for policy 0, policy_version 375349 (0.0028) [2024-04-27 12:19:24,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6149718016. Throughput: 0: 53530.6. Samples: 640271360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:24,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 12:19:27,162][52263] Updated weights for policy 0, policy_version 375359 (0.0032) [2024-04-27 12:19:29,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6149980160. Throughput: 0: 53408.9. Samples: 640433140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:29,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 12:19:30,159][52263] Updated weights for policy 0, policy_version 375369 (0.0031) [2024-04-27 12:19:33,345][52263] Updated weights for policy 0, policy_version 375379 (0.0031) [2024-04-27 12:19:34,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6150242304. Throughput: 0: 53460.2. Samples: 640754080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:34,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 12:19:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000375381_6150242304.pth... [2024-04-27 12:19:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000374596_6137380864.pth [2024-04-27 12:19:36,329][52263] Updated weights for policy 0, policy_version 375389 (0.0036) [2024-04-27 12:19:39,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6150504448. Throughput: 0: 53437.2. Samples: 641073780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:39,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 12:19:39,617][52263] Updated weights for policy 0, policy_version 375399 (0.0026) [2024-04-27 12:19:42,588][52263] Updated weights for policy 0, policy_version 375409 (0.0029) [2024-04-27 12:19:44,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6150766592. Throughput: 0: 53358.7. Samples: 641232520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:44,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 12:19:45,808][52263] Updated weights for policy 0, policy_version 375419 (0.0031) [2024-04-27 12:19:48,644][52263] Updated weights for policy 0, policy_version 375429 (0.0026) [2024-04-27 12:19:49,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6151028736. Throughput: 0: 53302.0. Samples: 641551740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:49,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 12:19:51,802][52263] Updated weights for policy 0, policy_version 375439 (0.0027) [2024-04-27 12:19:54,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6151323648. Throughput: 0: 53427.1. Samples: 641878080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:54,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 12:19:54,911][52263] Updated weights for policy 0, policy_version 375449 (0.0030) [2024-04-27 12:19:57,747][52263] Updated weights for policy 0, policy_version 375459 (0.0027) [2024-04-27 12:19:59,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6151585792. Throughput: 0: 53356.0. Samples: 642032140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:19:59,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 12:20:01,195][52263] Updated weights for policy 0, policy_version 375469 (0.0032) [2024-04-27 12:20:03,992][52263] Updated weights for policy 0, policy_version 375479 (0.0024) [2024-04-27 12:20:04,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53520.9, 300 sec: 53484.1). Total num frames: 6151847936. Throughput: 0: 53355.0. Samples: 642354600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:20:04,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 12:20:07,443][52263] Updated weights for policy 0, policy_version 375489 (0.0026) [2024-04-27 12:20:09,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6152126464. Throughput: 0: 53369.9. Samples: 642673000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:20:09,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 12:20:10,250][52263] Updated weights for policy 0, policy_version 375499 (0.0033) [2024-04-27 12:20:13,570][52263] Updated weights for policy 0, policy_version 375509 (0.0030) [2024-04-27 12:20:14,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6152355840. Throughput: 0: 53305.9. Samples: 642831900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:14,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 12:20:16,337][52263] Updated weights for policy 0, policy_version 375519 (0.0027) [2024-04-27 12:20:19,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6152650752. Throughput: 0: 53322.5. Samples: 643153600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:19,107][52031] Avg episode reward: [(0, '0.679')] [2024-04-27 12:20:19,509][52263] Updated weights for policy 0, policy_version 375529 (0.0031) [2024-04-27 12:20:20,224][52242] Signal inference workers to stop experience collection... (9650 times) [2024-04-27 12:20:20,266][52263] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-04-27 12:20:20,280][52242] Signal inference workers to resume experience collection... (9650 times) [2024-04-27 12:20:20,283][52263] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-04-27 12:20:22,644][52263] Updated weights for policy 0, policy_version 375539 (0.0033) [2024-04-27 12:20:24,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6152912896. Throughput: 0: 53341.4. Samples: 643474140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:24,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 12:20:25,769][52263] Updated weights for policy 0, policy_version 375549 (0.0031) [2024-04-27 12:20:28,702][52263] Updated weights for policy 0, policy_version 375559 (0.0025) [2024-04-27 12:20:29,107][52031] Fps is (10 sec: 50790.8, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6153158656. Throughput: 0: 53323.0. Samples: 643632060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:29,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 12:20:31,912][52263] Updated weights for policy 0, policy_version 375569 (0.0029) [2024-04-27 12:20:34,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6153437184. Throughput: 0: 53368.5. Samples: 643953320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:34,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 12:20:34,651][52263] Updated weights for policy 0, policy_version 375579 (0.0032) [2024-04-27 12:20:38,066][52263] Updated weights for policy 0, policy_version 375589 (0.0030) [2024-04-27 12:20:39,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6153715712. Throughput: 0: 53261.5. Samples: 644274840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:39,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 12:20:40,868][52263] Updated weights for policy 0, policy_version 375599 (0.0035) [2024-04-27 12:20:44,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53247.8, 300 sec: 53539.6). Total num frames: 6153961472. Throughput: 0: 53379.1. Samples: 644434200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:44,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 12:20:44,235][52263] Updated weights for policy 0, policy_version 375609 (0.0028) [2024-04-27 12:20:47,042][52263] Updated weights for policy 0, policy_version 375619 (0.0032) [2024-04-27 12:20:49,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6154240000. Throughput: 0: 53366.4. Samples: 644756080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:49,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 12:20:50,199][52263] Updated weights for policy 0, policy_version 375629 (0.0031) [2024-04-27 12:20:53,022][52263] Updated weights for policy 0, policy_version 375639 (0.0033) [2024-04-27 12:20:54,106][52031] Fps is (10 sec: 54068.0, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6154502144. Throughput: 0: 53431.6. Samples: 645077420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:54,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 12:20:56,427][52263] Updated weights for policy 0, policy_version 375649 (0.0030) [2024-04-27 12:20:59,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6154780672. Throughput: 0: 53460.9. Samples: 645237640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:20:59,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:20:59,291][52263] Updated weights for policy 0, policy_version 375659 (0.0026) [2024-04-27 12:21:02,388][52263] Updated weights for policy 0, policy_version 375669 (0.0028) [2024-04-27 12:21:04,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6155059200. Throughput: 0: 53398.3. Samples: 645556520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:21:04,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 12:21:05,369][52263] Updated weights for policy 0, policy_version 375679 (0.0042) [2024-04-27 12:21:08,498][52263] Updated weights for policy 0, policy_version 375689 (0.0031) [2024-04-27 12:21:09,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6155321344. Throughput: 0: 53462.2. Samples: 645879940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:21:09,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 12:21:11,598][52263] Updated weights for policy 0, policy_version 375699 (0.0033) [2024-04-27 12:21:14,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6155567104. Throughput: 0: 53515.7. Samples: 646040260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:21:14,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:21:14,526][52263] Updated weights for policy 0, policy_version 375709 (0.0034) [2024-04-27 12:21:17,915][52263] Updated weights for policy 0, policy_version 375719 (0.0031) [2024-04-27 12:21:19,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6155845632. Throughput: 0: 53467.7. Samples: 646359360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:21:19,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 12:21:20,562][52263] Updated weights for policy 0, policy_version 375729 (0.0026) [2024-04-27 12:21:23,971][52263] Updated weights for policy 0, policy_version 375739 (0.0026) [2024-04-27 12:21:24,106][52031] Fps is (10 sec: 54066.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6156107776. Throughput: 0: 53408.4. Samples: 646678220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 12:21:24,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 12:21:26,718][52263] Updated weights for policy 0, policy_version 375749 (0.0029) [2024-04-27 12:21:29,106][52031] Fps is (10 sec: 55705.6, 60 sec: 54067.3, 300 sec: 53484.1). Total num frames: 6156402688. Throughput: 0: 53485.0. Samples: 646841020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:21:29,107][52031] Avg episode reward: [(0, '0.481')] [2024-04-27 12:21:29,921][52263] Updated weights for policy 0, policy_version 375759 (0.0032) [2024-04-27 12:21:32,888][52263] Updated weights for policy 0, policy_version 375769 (0.0035) [2024-04-27 12:21:34,107][52031] Fps is (10 sec: 55704.3, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6156664832. Throughput: 0: 53418.3. Samples: 647159920. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:21:34,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 12:21:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000375773_6156664832.pth... [2024-04-27 12:21:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000374989_6143819776.pth [2024-04-27 12:21:36,153][52263] Updated weights for policy 0, policy_version 375779 (0.0030) [2024-04-27 12:21:39,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6156926976. Throughput: 0: 53508.2. Samples: 647485280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:21:39,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 12:21:39,113][52263] Updated weights for policy 0, policy_version 375789 (0.0029) [2024-04-27 12:21:40,903][52242] Signal inference workers to stop experience collection... (9700 times) [2024-04-27 12:21:40,905][52242] Signal inference workers to resume experience collection... (9700 times) [2024-04-27 12:21:40,923][52263] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-04-27 12:21:40,923][52263] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-04-27 12:21:42,339][52263] Updated weights for policy 0, policy_version 375799 (0.0029) [2024-04-27 12:21:44,106][52031] Fps is (10 sec: 50792.0, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6157172736. Throughput: 0: 53360.6. Samples: 647638860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:21:44,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 12:21:45,190][52263] Updated weights for policy 0, policy_version 375809 (0.0029) [2024-04-27 12:21:48,268][52263] Updated weights for policy 0, policy_version 375819 (0.0029) [2024-04-27 12:21:49,106][52031] Fps is (10 sec: 52428.0, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6157451264. Throughput: 0: 53465.0. Samples: 647962440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:21:49,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 12:21:51,239][52263] Updated weights for policy 0, policy_version 375829 (0.0035) [2024-04-27 12:21:54,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6157729792. Throughput: 0: 53443.4. Samples: 648284900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:21:54,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 12:21:54,423][52263] Updated weights for policy 0, policy_version 375839 (0.0026) [2024-04-27 12:21:57,296][52263] Updated weights for policy 0, policy_version 375849 (0.0029) [2024-04-27 12:21:59,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.2, 300 sec: 53428.6). Total num frames: 6157991936. Throughput: 0: 53555.9. Samples: 648450280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:21:59,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:22:00,422][52263] Updated weights for policy 0, policy_version 375859 (0.0028) [2024-04-27 12:22:03,451][52263] Updated weights for policy 0, policy_version 375869 (0.0033) [2024-04-27 12:22:04,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6158270464. Throughput: 0: 53668.0. Samples: 648774420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:22:04,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 12:22:06,761][52263] Updated weights for policy 0, policy_version 375879 (0.0034) [2024-04-27 12:22:09,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6158532608. Throughput: 0: 53707.5. Samples: 649095060. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:22:09,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 12:22:09,416][52263] Updated weights for policy 0, policy_version 375889 (0.0031) [2024-04-27 12:22:12,689][52263] Updated weights for policy 0, policy_version 375899 (0.0028) [2024-04-27 12:22:14,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6158778368. Throughput: 0: 53526.1. Samples: 649249700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:22:14,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 12:22:15,557][52263] Updated weights for policy 0, policy_version 375909 (0.0029) [2024-04-27 12:22:18,641][52263] Updated weights for policy 0, policy_version 375919 (0.0028) [2024-04-27 12:22:19,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6159056896. Throughput: 0: 53704.1. Samples: 649576600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:22:19,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:22:21,618][52263] Updated weights for policy 0, policy_version 375929 (0.0028) [2024-04-27 12:22:24,107][52031] Fps is (10 sec: 57343.9, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6159351808. Throughput: 0: 53610.4. Samples: 649897760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:22:24,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 12:22:24,739][52263] Updated weights for policy 0, policy_version 375939 (0.0033) [2024-04-27 12:22:27,675][52263] Updated weights for policy 0, policy_version 375949 (0.0035) [2024-04-27 12:22:29,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6159613952. Throughput: 0: 53861.1. Samples: 650062620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:22:29,107][52031] Avg episode reward: [(0, '0.490')] [2024-04-27 12:22:30,925][52263] Updated weights for policy 0, policy_version 375959 (0.0028) [2024-04-27 12:22:33,964][52263] Updated weights for policy 0, policy_version 375969 (0.0030) [2024-04-27 12:22:34,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6159876096. Throughput: 0: 53813.6. Samples: 650384060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:22:34,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 12:22:37,175][52263] Updated weights for policy 0, policy_version 375979 (0.0029) [2024-04-27 12:22:39,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6160121856. Throughput: 0: 53754.9. Samples: 650703860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:22:39,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:22:40,124][52263] Updated weights for policy 0, policy_version 375989 (0.0028) [2024-04-27 12:22:43,094][52263] Updated weights for policy 0, policy_version 375999 (0.0029) [2024-04-27 12:22:44,107][52031] Fps is (10 sec: 50790.9, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6160384000. Throughput: 0: 53629.2. Samples: 650863600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:22:44,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 12:22:46,071][52263] Updated weights for policy 0, policy_version 376009 (0.0035) [2024-04-27 12:22:49,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6160678912. Throughput: 0: 53510.1. Samples: 651182380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:22:49,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 12:22:49,176][52263] Updated weights for policy 0, policy_version 376019 (0.0037) [2024-04-27 12:22:51,299][52242] Signal inference workers to stop experience collection... (9750 times) [2024-04-27 12:22:51,301][52242] Signal inference workers to resume experience collection... (9750 times) [2024-04-27 12:22:51,328][52263] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-04-27 12:22:51,328][52263] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-04-27 12:22:52,225][52263] Updated weights for policy 0, policy_version 376029 (0.0038) [2024-04-27 12:22:54,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6160924672. Throughput: 0: 53462.3. Samples: 651500860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:22:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:22:55,406][52263] Updated weights for policy 0, policy_version 376039 (0.0032) [2024-04-27 12:22:58,550][52263] Updated weights for policy 0, policy_version 376049 (0.0032) [2024-04-27 12:22:59,106][52031] Fps is (10 sec: 55706.3, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6161235968. Throughput: 0: 53815.7. Samples: 651671400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:22:59,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 12:23:02,113][52263] Updated weights for policy 0, policy_version 376059 (0.0029) [2024-04-27 12:23:04,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6161481728. Throughput: 0: 53648.1. Samples: 651990760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:04,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 12:23:04,688][52263] Updated weights for policy 0, policy_version 376069 (0.0031) [2024-04-27 12:23:08,090][52263] Updated weights for policy 0, policy_version 376079 (0.0030) [2024-04-27 12:23:09,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6161743872. Throughput: 0: 53631.2. Samples: 652311160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:09,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 12:23:10,709][52263] Updated weights for policy 0, policy_version 376089 (0.0030) [2024-04-27 12:23:14,065][52263] Updated weights for policy 0, policy_version 376099 (0.0029) [2024-04-27 12:23:14,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6162006016. Throughput: 0: 53394.4. Samples: 652465360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:14,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 12:23:16,739][52263] Updated weights for policy 0, policy_version 376109 (0.0027) [2024-04-27 12:23:19,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.3, 300 sec: 53484.1). Total num frames: 6162284544. Throughput: 0: 53283.3. Samples: 652781800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:19,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 12:23:20,224][52263] Updated weights for policy 0, policy_version 376119 (0.0028) [2024-04-27 12:23:23,076][52263] Updated weights for policy 0, policy_version 376129 (0.0034) [2024-04-27 12:23:24,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6162546688. Throughput: 0: 53382.9. Samples: 653106100. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 12:23:26,423][52263] Updated weights for policy 0, policy_version 376139 (0.0031) [2024-04-27 12:23:29,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6162808832. Throughput: 0: 53492.4. Samples: 653270760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:29,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 12:23:29,369][52263] Updated weights for policy 0, policy_version 376149 (0.0028) [2024-04-27 12:23:32,627][52263] Updated weights for policy 0, policy_version 376159 (0.0031) [2024-04-27 12:23:34,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6163087360. Throughput: 0: 53567.0. Samples: 653592900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:34,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 12:23:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000376165_6163087360.pth... [2024-04-27 12:23:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000375381_6150242304.pth [2024-04-27 12:23:35,473][52263] Updated weights for policy 0, policy_version 376169 (0.0033) [2024-04-27 12:23:38,806][52263] Updated weights for policy 0, policy_version 376179 (0.0030) [2024-04-27 12:23:39,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6163316736. Throughput: 0: 53547.4. Samples: 653910500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:39,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 12:23:41,547][52263] Updated weights for policy 0, policy_version 376189 (0.0029) [2024-04-27 12:23:44,106][52031] Fps is (10 sec: 50791.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6163595264. Throughput: 0: 53269.3. Samples: 654068520. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:44,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 12:23:44,802][52263] Updated weights for policy 0, policy_version 376199 (0.0029) [2024-04-27 12:23:47,820][52263] Updated weights for policy 0, policy_version 376209 (0.0028) [2024-04-27 12:23:49,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6163890176. Throughput: 0: 53338.7. Samples: 654391000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:49,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 12:23:50,709][52263] Updated weights for policy 0, policy_version 376219 (0.0028) [2024-04-27 12:23:53,912][52263] Updated weights for policy 0, policy_version 376229 (0.0028) [2024-04-27 12:23:54,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6164152320. Throughput: 0: 53387.0. Samples: 654713580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:54,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 12:23:54,865][52242] Signal inference workers to stop experience collection... (9800 times) [2024-04-27 12:23:54,866][52242] Signal inference workers to resume experience collection... (9800 times) [2024-04-27 12:23:54,903][52263] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-04-27 12:23:54,903][52263] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-04-27 12:23:56,858][52263] Updated weights for policy 0, policy_version 376239 (0.0027) [2024-04-27 12:23:59,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6164414464. Throughput: 0: 53641.7. Samples: 654879240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:23:59,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 12:23:59,918][52263] Updated weights for policy 0, policy_version 376249 (0.0035) [2024-04-27 12:24:03,024][52263] Updated weights for policy 0, policy_version 376259 (0.0027) [2024-04-27 12:24:04,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6164692992. Throughput: 0: 53828.9. Samples: 655204100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 12:24:04,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:24:06,085][52263] Updated weights for policy 0, policy_version 376269 (0.0025) [2024-04-27 12:24:08,971][52263] Updated weights for policy 0, policy_version 376279 (0.0029) [2024-04-27 12:24:09,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6164955136. Throughput: 0: 53758.7. Samples: 655525240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:09,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 12:24:12,009][52263] Updated weights for policy 0, policy_version 376289 (0.0033) [2024-04-27 12:24:14,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6165217280. Throughput: 0: 53664.4. Samples: 655685660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:14,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 12:24:15,410][52263] Updated weights for policy 0, policy_version 376299 (0.0033) [2024-04-27 12:24:18,071][52263] Updated weights for policy 0, policy_version 376309 (0.0027) [2024-04-27 12:24:19,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6165495808. Throughput: 0: 53633.1. Samples: 656006380. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:19,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 12:24:21,400][52263] Updated weights for policy 0, policy_version 376319 (0.0027) [2024-04-27 12:24:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6165757952. Throughput: 0: 53706.3. Samples: 656327280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:24,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 12:24:24,141][52263] Updated weights for policy 0, policy_version 376329 (0.0028) [2024-04-27 12:24:27,300][52263] Updated weights for policy 0, policy_version 376339 (0.0026) [2024-04-27 12:24:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6166036480. Throughput: 0: 53901.6. Samples: 656494100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:29,108][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 12:24:30,460][52263] Updated weights for policy 0, policy_version 376349 (0.0033) [2024-04-27 12:24:33,372][52263] Updated weights for policy 0, policy_version 376359 (0.0029) [2024-04-27 12:24:34,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6166315008. Throughput: 0: 53810.6. Samples: 656812480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:34,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 12:24:36,460][52263] Updated weights for policy 0, policy_version 376369 (0.0030) [2024-04-27 12:24:39,106][52031] Fps is (10 sec: 52429.5, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6166560768. Throughput: 0: 53866.8. Samples: 657137580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:39,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:24:39,536][52263] Updated weights for policy 0, policy_version 376379 (0.0030) [2024-04-27 12:24:42,541][52263] Updated weights for policy 0, policy_version 376389 (0.0031) [2024-04-27 12:24:44,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6166822912. Throughput: 0: 53687.9. Samples: 657295200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:44,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 12:24:45,728][52263] Updated weights for policy 0, policy_version 376399 (0.0033) [2024-04-27 12:24:46,138][52242] Signal inference workers to stop experience collection... (9850 times) [2024-04-27 12:24:46,139][52242] Signal inference workers to resume experience collection... (9850 times) [2024-04-27 12:24:46,161][52263] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-04-27 12:24:46,161][52263] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-04-27 12:24:48,660][52263] Updated weights for policy 0, policy_version 376409 (0.0031) [2024-04-27 12:24:49,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6167101440. Throughput: 0: 53488.3. Samples: 657611080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:49,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 12:24:51,763][52263] Updated weights for policy 0, policy_version 376419 (0.0037) [2024-04-27 12:24:54,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6167347200. Throughput: 0: 53395.0. Samples: 657928020. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:54,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 12:24:54,875][52263] Updated weights for policy 0, policy_version 376429 (0.0024) [2024-04-27 12:24:57,920][52263] Updated weights for policy 0, policy_version 376439 (0.0031) [2024-04-27 12:24:59,107][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6167658496. Throughput: 0: 53553.8. Samples: 658095580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:24:59,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 12:25:00,903][52263] Updated weights for policy 0, policy_version 376449 (0.0029) [2024-04-27 12:25:03,962][52263] Updated weights for policy 0, policy_version 376459 (0.0033) [2024-04-27 12:25:04,106][52031] Fps is (10 sec: 55706.8, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6167904256. Throughput: 0: 53520.9. Samples: 658414820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:25:04,107][52031] Avg episode reward: [(0, '0.495')] [2024-04-27 12:25:06,999][52263] Updated weights for policy 0, policy_version 376469 (0.0027) [2024-04-27 12:25:09,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6168166400. Throughput: 0: 53513.0. Samples: 658735360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:25:09,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 12:25:10,143][52263] Updated weights for policy 0, policy_version 376479 (0.0026) [2024-04-27 12:25:13,215][52263] Updated weights for policy 0, policy_version 376489 (0.0035) [2024-04-27 12:25:14,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6168444928. Throughput: 0: 53204.5. Samples: 658888300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:25:14,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 12:25:16,232][52263] Updated weights for policy 0, policy_version 376499 (0.0026) [2024-04-27 12:25:19,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6168690688. Throughput: 0: 53264.1. Samples: 659209360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:25:19,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 12:25:19,233][52263] Updated weights for policy 0, policy_version 376509 (0.0027) [2024-04-27 12:25:22,321][52263] Updated weights for policy 0, policy_version 376519 (0.0031) [2024-04-27 12:25:24,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6168969216. Throughput: 0: 53190.9. Samples: 659531180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:25:24,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 12:25:25,761][52263] Updated weights for policy 0, policy_version 376529 (0.0033) [2024-04-27 12:25:28,575][52263] Updated weights for policy 0, policy_version 376539 (0.0035) [2024-04-27 12:25:29,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6169264128. Throughput: 0: 53385.8. Samples: 659697560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:25:29,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 12:25:32,151][52263] Updated weights for policy 0, policy_version 376549 (0.0029) [2024-04-27 12:25:34,106][52031] Fps is (10 sec: 52430.0, 60 sec: 52975.1, 300 sec: 53484.1). Total num frames: 6169493504. Throughput: 0: 53437.6. Samples: 660015760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:25:34,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 12:25:34,189][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000376557_6169509888.pth... 
[2024-04-27 12:25:34,226][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000375773_6156664832.pth [2024-04-27 12:25:34,652][52242] Signal inference workers to stop experience collection... (9900 times) [2024-04-27 12:25:34,688][52263] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-04-27 12:25:34,745][52242] Signal inference workers to resume experience collection... (9900 times) [2024-04-27 12:25:34,746][52263] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-04-27 12:25:34,748][52263] Updated weights for policy 0, policy_version 376559 (0.0025) [2024-04-27 12:25:38,193][52263] Updated weights for policy 0, policy_version 376569 (0.0035) [2024-04-27 12:25:39,106][52031] Fps is (10 sec: 47514.1, 60 sec: 52974.9, 300 sec: 53484.1). Total num frames: 6169739264. Throughput: 0: 53511.4. Samples: 660336020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:25:39,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 12:25:40,757][52263] Updated weights for policy 0, policy_version 376579 (0.0029) [2024-04-27 12:25:44,106][52031] Fps is (10 sec: 52428.2, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6170017792. Throughput: 0: 53085.0. Samples: 660484400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:25:44,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 12:25:44,185][52263] Updated weights for policy 0, policy_version 376589 (0.0027) [2024-04-27 12:25:46,836][52263] Updated weights for policy 0, policy_version 376599 (0.0036) [2024-04-27 12:25:49,106][52031] Fps is (10 sec: 54067.3, 60 sec: 52975.1, 300 sec: 53484.0). Total num frames: 6170279936. Throughput: 0: 53133.8. Samples: 660805840. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:25:49,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 12:25:50,324][52263] Updated weights for policy 0, policy_version 376609 (0.0032) [2024-04-27 12:25:53,084][52263] Updated weights for policy 0, policy_version 376619 (0.0031) [2024-04-27 12:25:54,107][52031] Fps is (10 sec: 57343.6, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6170591232. Throughput: 0: 53183.4. Samples: 661128620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:25:54,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 12:25:56,501][52263] Updated weights for policy 0, policy_version 376629 (0.0031) [2024-04-27 12:25:59,074][52263] Updated weights for policy 0, policy_version 376639 (0.0027) [2024-04-27 12:25:59,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6170853376. Throughput: 0: 53583.7. Samples: 661299560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:25:59,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 12:26:02,552][52263] Updated weights for policy 0, policy_version 376649 (0.0033) [2024-04-27 12:26:04,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6171099136. Throughput: 0: 53622.1. Samples: 661622360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:26:04,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 12:26:05,002][52263] Updated weights for policy 0, policy_version 376659 (0.0031) [2024-04-27 12:26:08,832][52263] Updated weights for policy 0, policy_version 376669 (0.0025) [2024-04-27 12:26:09,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6171377664. Throughput: 0: 53667.2. Samples: 661946200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:26:09,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 12:26:11,137][52263] Updated weights for policy 0, policy_version 376679 (0.0034) [2024-04-27 12:26:14,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6171623424. Throughput: 0: 53307.6. Samples: 662096400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:26:14,107][52031] Avg episode reward: [(0, '0.474')] [2024-04-27 12:26:14,736][52263] Updated weights for policy 0, policy_version 376689 (0.0033) [2024-04-27 12:26:17,238][52263] Updated weights for policy 0, policy_version 376699 (0.0028) [2024-04-27 12:26:19,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6171901952. Throughput: 0: 53474.1. Samples: 662422100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:26:19,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:26:20,844][52263] Updated weights for policy 0, policy_version 376709 (0.0026) [2024-04-27 12:26:23,127][52263] Updated weights for policy 0, policy_version 376719 (0.0025) [2024-04-27 12:26:24,107][52031] Fps is (10 sec: 58981.8, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6172213248. Throughput: 0: 53520.3. Samples: 662744440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:26:24,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 12:26:27,021][52263] Updated weights for policy 0, policy_version 376729 (0.0033) [2024-04-27 12:26:29,107][52031] Fps is (10 sec: 57343.6, 60 sec: 53521.1, 300 sec: 53595.2). Total num frames: 6172475392. Throughput: 0: 53943.0. Samples: 662911840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:26:29,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:26:29,455][52242] Signal inference workers to stop experience collection... 
(9950 times) [2024-04-27 12:26:29,489][52263] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-04-27 12:26:29,518][52242] Signal inference workers to resume experience collection... (9950 times) [2024-04-27 12:26:29,522][52263] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-04-27 12:26:29,525][52263] Updated weights for policy 0, policy_version 376739 (0.0029) [2024-04-27 12:26:32,929][52263] Updated weights for policy 0, policy_version 376749 (0.0029) [2024-04-27 12:26:34,106][52031] Fps is (10 sec: 47514.3, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6172688384. Throughput: 0: 53894.2. Samples: 663231080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:26:34,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 12:26:35,484][52263] Updated weights for policy 0, policy_version 376759 (0.0032) [2024-04-27 12:26:38,993][52263] Updated weights for policy 0, policy_version 376769 (0.0031) [2024-04-27 12:26:39,106][52031] Fps is (10 sec: 50790.8, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6172983296. Throughput: 0: 53849.0. Samples: 663551820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:26:39,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 12:26:41,737][52263] Updated weights for policy 0, policy_version 376779 (0.0034) [2024-04-27 12:26:44,107][52031] Fps is (10 sec: 57343.1, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6173261824. Throughput: 0: 53542.0. Samples: 663708960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:26:44,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 12:26:45,035][52263] Updated weights for policy 0, policy_version 376789 (0.0029) [2024-04-27 12:26:47,887][52263] Updated weights for policy 0, policy_version 376799 (0.0032) [2024-04-27 12:26:49,107][52031] Fps is (10 sec: 55705.2, 60 sec: 54340.2, 300 sec: 53595.1). Total num frames: 6173540352. Throughput: 0: 53405.3. Samples: 664025600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:26:49,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 12:26:51,035][52263] Updated weights for policy 0, policy_version 376809 (0.0034) [2024-04-27 12:26:53,850][52263] Updated weights for policy 0, policy_version 376819 (0.0027) [2024-04-27 12:26:54,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6173802496. Throughput: 0: 53396.6. Samples: 664349040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:26:54,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 12:26:57,153][52263] Updated weights for policy 0, policy_version 376829 (0.0033) [2024-04-27 12:26:59,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6174048256. Throughput: 0: 53820.5. Samples: 664518320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:26:59,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:27:00,112][52263] Updated weights for policy 0, policy_version 376839 (0.0031) [2024-04-27 12:27:03,307][52263] Updated weights for policy 0, policy_version 376849 (0.0036) [2024-04-27 12:27:04,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6174310400. Throughput: 0: 53730.1. Samples: 664839960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:04,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 12:27:06,022][52263] Updated weights for policy 0, policy_version 376859 (0.0025) [2024-04-27 12:27:09,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6174588928. Throughput: 0: 53825.5. Samples: 665166580. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:09,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 12:27:09,679][52263] Updated weights for policy 0, policy_version 376869 (0.0028) [2024-04-27 12:27:12,100][52263] Updated weights for policy 0, policy_version 376879 (0.0025) [2024-04-27 12:27:14,106][52031] Fps is (10 sec: 55706.5, 60 sec: 54067.2, 300 sec: 53595.2). Total num frames: 6174867456. Throughput: 0: 53529.9. Samples: 665320680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:14,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 12:27:15,612][52263] Updated weights for policy 0, policy_version 376889 (0.0030) [2024-04-27 12:27:18,175][52263] Updated weights for policy 0, policy_version 376899 (0.0031) [2024-04-27 12:27:19,106][52031] Fps is (10 sec: 55705.1, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6175145984. Throughput: 0: 53625.7. Samples: 665644240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:19,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 12:27:21,533][52263] Updated weights for policy 0, policy_version 376909 (0.0031) [2024-04-27 12:27:24,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6175391744. Throughput: 0: 53657.8. Samples: 665966420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:24,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:27:24,431][52263] Updated weights for policy 0, policy_version 376919 (0.0034) [2024-04-27 12:27:27,636][52263] Updated weights for policy 0, policy_version 376929 (0.0030) [2024-04-27 12:27:29,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6175670272. Throughput: 0: 53865.4. Samples: 666132900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:29,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 12:27:30,590][52263] Updated weights for policy 0, policy_version 376939 (0.0030) [2024-04-27 12:27:32,584][52242] Signal inference workers to stop experience collection... (10000 times) [2024-04-27 12:27:32,585][52242] Signal inference workers to resume experience collection... (10000 times) [2024-04-27 12:27:32,616][52263] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-04-27 12:27:32,616][52263] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-04-27 12:27:33,814][52263] Updated weights for policy 0, policy_version 376949 (0.0030) [2024-04-27 12:27:34,106][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6175932416. Throughput: 0: 53937.9. Samples: 666452800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:34,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 12:27:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000376949_6175932416.pth... [2024-04-27 12:27:34,161][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000376165_6163087360.pth [2024-04-27 12:27:36,738][52263] Updated weights for policy 0, policy_version 376959 (0.0027) [2024-04-27 12:27:39,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6176210944. Throughput: 0: 53903.9. Samples: 666774720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:39,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:27:39,841][52263] Updated weights for policy 0, policy_version 376969 (0.0032) [2024-04-27 12:27:42,774][52263] Updated weights for policy 0, policy_version 376979 (0.0029) [2024-04-27 12:27:44,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6176473088. Throughput: 0: 53689.1. Samples: 666934340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:44,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 12:27:45,870][52263] Updated weights for policy 0, policy_version 376989 (0.0026) [2024-04-27 12:27:48,820][52263] Updated weights for policy 0, policy_version 376999 (0.0034) [2024-04-27 12:27:49,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6176768000. Throughput: 0: 53781.4. Samples: 667260120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:49,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 12:27:52,345][52263] Updated weights for policy 0, policy_version 377009 (0.0031) [2024-04-27 12:27:54,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.0, 300 sec: 53539.5). Total num frames: 6177030144. Throughput: 0: 53648.2. Samples: 667580760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:54,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 12:27:54,849][52263] Updated weights for policy 0, policy_version 377019 (0.0030) [2024-04-27 12:27:58,450][52263] Updated weights for policy 0, policy_version 377029 (0.0032) [2024-04-27 12:27:59,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6177275904. Throughput: 0: 53806.1. Samples: 667741960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 12:27:59,107][52031] Avg episode reward: [(0, '0.496')] [2024-04-27 12:28:00,968][52263] Updated weights for policy 0, policy_version 377039 (0.0030) [2024-04-27 12:28:04,106][52031] Fps is (10 sec: 52429.6, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6177554432. Throughput: 0: 53864.5. Samples: 668068140. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:04,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:28:04,590][52263] Updated weights for policy 0, policy_version 377049 (0.0030) [2024-04-27 12:28:07,073][52263] Updated weights for policy 0, policy_version 377059 (0.0033) [2024-04-27 12:28:09,107][52031] Fps is (10 sec: 55705.2, 60 sec: 54067.0, 300 sec: 53650.6). Total num frames: 6177832960. Throughput: 0: 53760.2. Samples: 668385640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:09,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 12:28:10,762][52263] Updated weights for policy 0, policy_version 377069 (0.0034) [2024-04-27 12:28:13,289][52263] Updated weights for policy 0, policy_version 377079 (0.0034) [2024-04-27 12:28:14,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 6178095104. Throughput: 0: 53721.6. Samples: 668550380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:14,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 12:28:16,747][52263] Updated weights for policy 0, policy_version 377089 (0.0026) [2024-04-27 12:28:19,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6178357248. Throughput: 0: 53663.9. Samples: 668867680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:19,107][52031] Avg episode reward: [(0, '0.498')] [2024-04-27 12:28:19,779][52263] Updated weights for policy 0, policy_version 377099 (0.0038) [2024-04-27 12:28:22,709][52263] Updated weights for policy 0, policy_version 377109 (0.0032) [2024-04-27 12:28:24,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6178619392. Throughput: 0: 53610.3. Samples: 669187180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:24,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 12:28:25,893][52263] Updated weights for policy 0, policy_version 377119 (0.0028) [2024-04-27 12:28:28,833][52263] Updated weights for policy 0, policy_version 377129 (0.0030) [2024-04-27 12:28:29,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6178881536. Throughput: 0: 53704.0. Samples: 669351020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:29,107][52031] Avg episode reward: [(0, '0.510')] [2024-04-27 12:28:31,852][52263] Updated weights for policy 0, policy_version 377139 (0.0028) [2024-04-27 12:28:32,380][52242] Signal inference workers to stop experience collection... (10050 times) [2024-04-27 12:28:32,391][52263] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-04-27 12:28:32,480][52242] Signal inference workers to resume experience collection... (10050 times) [2024-04-27 12:28:32,480][52263] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-04-27 12:28:34,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6179160064. Throughput: 0: 53604.0. Samples: 669672300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:34,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:28:34,870][52263] Updated weights for policy 0, policy_version 377149 (0.0028) [2024-04-27 12:28:37,884][52263] Updated weights for policy 0, policy_version 377159 (0.0037) [2024-04-27 12:28:39,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6179422208. Throughput: 0: 53630.3. Samples: 669994120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:39,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 12:28:41,253][52263] Updated weights for policy 0, policy_version 377169 (0.0027) [2024-04-27 12:28:44,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6179684352. Throughput: 0: 53560.8. Samples: 670152200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:44,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 12:28:44,125][52263] Updated weights for policy 0, policy_version 377179 (0.0031) [2024-04-27 12:28:47,720][52263] Updated weights for policy 0, policy_version 377189 (0.0033) [2024-04-27 12:28:49,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6179979264. Throughput: 0: 53457.9. Samples: 670473740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:49,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:28:50,364][52263] Updated weights for policy 0, policy_version 377199 (0.0027) [2024-04-27 12:28:53,767][52263] Updated weights for policy 0, policy_version 377209 (0.0029) [2024-04-27 12:28:54,106][52031] Fps is (10 sec: 52429.5, 60 sec: 52975.1, 300 sec: 53539.6). Total num frames: 6180208640. Throughput: 0: 53416.1. Samples: 670789360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:54,107][52031] Avg episode reward: [(0, '0.510')] [2024-04-27 12:28:56,313][52263] Updated weights for policy 0, policy_version 377219 (0.0029) [2024-04-27 12:28:59,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6180503552. Throughput: 0: 53251.7. Samples: 670946700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:28:59,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 12:28:59,765][52263] Updated weights for policy 0, policy_version 377229 (0.0032) [2024-04-27 12:29:02,688][52263] Updated weights for policy 0, policy_version 377239 (0.0028) [2024-04-27 12:29:04,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6180765696. Throughput: 0: 53400.5. Samples: 671270700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:29:04,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:29:05,856][52263] Updated weights for policy 0, policy_version 377249 (0.0030) [2024-04-27 12:29:08,874][52263] Updated weights for policy 0, policy_version 377259 (0.0028) [2024-04-27 12:29:09,107][52031] Fps is (10 sec: 50790.2, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6181011456. Throughput: 0: 53371.5. Samples: 671588900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:29:09,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:29:12,035][52263] Updated weights for policy 0, policy_version 377269 (0.0028) [2024-04-27 12:29:14,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6181289984. Throughput: 0: 53153.9. Samples: 671742940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:29:14,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 12:29:14,960][52263] Updated weights for policy 0, policy_version 377279 (0.0036) [2024-04-27 12:29:18,073][52263] Updated weights for policy 0, policy_version 377289 (0.0025) [2024-04-27 12:29:19,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6181552128. Throughput: 0: 53076.4. Samples: 672060740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 12:29:19,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:29:20,932][52263] Updated weights for policy 0, policy_version 377299 (0.0032) [2024-04-27 12:29:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6181814272. Throughput: 0: 53207.8. Samples: 672388460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:29:24,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 12:29:24,180][52263] Updated weights for policy 0, policy_version 377309 (0.0032) [2024-04-27 12:29:27,112][52263] Updated weights for policy 0, policy_version 377319 (0.0027) [2024-04-27 12:29:29,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6182076416. Throughput: 0: 53089.8. Samples: 672541240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:29:29,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 12:29:30,552][52263] Updated weights for policy 0, policy_version 377329 (0.0033) [2024-04-27 12:29:33,273][52263] Updated weights for policy 0, policy_version 377339 (0.0031) [2024-04-27 12:29:34,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6182354944. Throughput: 0: 53147.1. Samples: 672865360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:29:34,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 12:29:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000377341_6182354944.pth... [2024-04-27 12:29:34,166][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000376557_6169509888.pth [2024-04-27 12:29:36,664][52263] Updated weights for policy 0, policy_version 377349 (0.0033) [2024-04-27 12:29:39,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6182633472. Throughput: 0: 53168.7. Samples: 673181960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:29:39,107][52031] Avg episode reward: [(0, '0.473')] [2024-04-27 12:29:39,259][52263] Updated weights for policy 0, policy_version 377359 (0.0033) [2024-04-27 12:29:42,883][52263] Updated weights for policy 0, policy_version 377369 (0.0029) [2024-04-27 12:29:44,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6182879232. Throughput: 0: 53319.2. Samples: 673346060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:29:44,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 12:29:45,174][52242] Signal inference workers to stop experience collection... (10100 times) [2024-04-27 12:29:45,175][52242] Signal inference workers to resume experience collection... (10100 times) [2024-04-27 12:29:45,196][52263] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-04-27 12:29:45,196][52263] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-04-27 12:29:45,292][52263] Updated weights for policy 0, policy_version 377379 (0.0030) [2024-04-27 12:29:48,856][52263] Updated weights for policy 0, policy_version 377389 (0.0029) [2024-04-27 12:29:49,106][52031] Fps is (10 sec: 50791.6, 60 sec: 52701.9, 300 sec: 53539.6). Total num frames: 6183141376. Throughput: 0: 53173.8. Samples: 673663520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:29:49,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 12:29:51,790][52263] Updated weights for policy 0, policy_version 377399 (0.0030) [2024-04-27 12:29:54,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6183419904. Throughput: 0: 53147.5. Samples: 673980540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:29:54,107][52031] Avg episode reward: [(0, '0.493')] [2024-04-27 12:29:55,056][52263] Updated weights for policy 0, policy_version 377409 (0.0027) [2024-04-27 12:29:57,940][52263] Updated weights for policy 0, policy_version 377419 (0.0026) [2024-04-27 12:29:59,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6183698432. Throughput: 0: 53363.1. Samples: 674144280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:29:59,107][52031] Avg episode reward: [(0, '0.473')] [2024-04-27 12:30:01,264][52263] Updated weights for policy 0, policy_version 377429 (0.0033) [2024-04-27 12:30:04,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6183944192. Throughput: 0: 53507.6. Samples: 674468580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:30:04,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 12:30:04,154][52263] Updated weights for policy 0, policy_version 377439 (0.0028) [2024-04-27 12:30:07,293][52263] Updated weights for policy 0, policy_version 377449 (0.0036) [2024-04-27 12:30:09,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6184222720. Throughput: 0: 53264.5. Samples: 674785360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:30:09,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 12:30:10,258][52263] Updated weights for policy 0, policy_version 377459 (0.0030) [2024-04-27 12:30:13,331][52263] Updated weights for policy 0, policy_version 377469 (0.0027) [2024-04-27 12:30:14,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6184501248. Throughput: 0: 53404.0. Samples: 674944420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:30:14,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 12:30:16,357][52263] Updated weights for policy 0, policy_version 377479 (0.0037) [2024-04-27 12:30:19,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6184763392. Throughput: 0: 53297.0. Samples: 675263720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:30:19,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 12:30:19,516][52263] Updated weights for policy 0, policy_version 377489 (0.0037) [2024-04-27 12:30:22,435][52263] Updated weights for policy 0, policy_version 377499 (0.0033) [2024-04-27 12:30:24,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6185009152. Throughput: 0: 53358.5. Samples: 675583080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:30:24,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 12:30:25,728][52263] Updated weights for policy 0, policy_version 377509 (0.0037) [2024-04-27 12:30:28,378][52263] Updated weights for policy 0, policy_version 377519 (0.0030) [2024-04-27 12:30:29,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6185304064. Throughput: 0: 53404.7. Samples: 675749280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:30:29,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 12:30:31,861][52263] Updated weights for policy 0, policy_version 377529 (0.0037) [2024-04-27 12:30:34,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6185566208. Throughput: 0: 53363.1. Samples: 676064860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:30:34,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 12:30:34,555][52263] Updated weights for policy 0, policy_version 377539 (0.0030) [2024-04-27 12:30:37,869][52263] Updated weights for policy 0, policy_version 377549 (0.0027) [2024-04-27 12:30:39,106][52031] Fps is (10 sec: 50791.2, 60 sec: 52975.1, 300 sec: 53539.6). Total num frames: 6185811968. Throughput: 0: 53482.8. Samples: 676387260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 12:30:39,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 12:30:40,774][52263] Updated weights for policy 0, policy_version 377559 (0.0040) [2024-04-27 12:30:44,072][52263] Updated weights for policy 0, policy_version 377569 (0.0026) [2024-04-27 12:30:44,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6186090496. Throughput: 0: 53207.2. Samples: 676538600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:30:44,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 12:30:46,878][52263] Updated weights for policy 0, policy_version 377579 (0.0034) [2024-04-27 12:30:49,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6186336256. Throughput: 0: 53045.5. Samples: 676855620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:30:49,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 12:30:50,276][52263] Updated weights for policy 0, policy_version 377589 (0.0028) [2024-04-27 12:30:53,036][52263] Updated weights for policy 0, policy_version 377599 (0.0031) [2024-04-27 12:30:54,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6186631168. Throughput: 0: 53085.5. Samples: 677174220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:30:54,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 12:30:56,323][52263] Updated weights for policy 0, policy_version 377609 (0.0029) [2024-04-27 12:30:59,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6186893312. Throughput: 0: 53287.5. Samples: 677342360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:30:59,115][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:30:59,172][52263] Updated weights for policy 0, policy_version 377619 (0.0026) [2024-04-27 12:31:01,668][52242] Signal inference workers to stop experience collection... (10150 times) [2024-04-27 12:31:01,715][52263] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-04-27 12:31:01,726][52242] Signal inference workers to resume experience collection... (10150 times) [2024-04-27 12:31:01,732][52263] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-04-27 12:31:02,512][52263] Updated weights for policy 0, policy_version 377629 (0.0029) [2024-04-27 12:31:04,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6187155456. Throughput: 0: 53383.4. Samples: 677665980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:04,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 12:31:05,236][52263] Updated weights for policy 0, policy_version 377639 (0.0033) [2024-04-27 12:31:08,818][52263] Updated weights for policy 0, policy_version 377649 (0.0029) [2024-04-27 12:31:09,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6187401216. Throughput: 0: 53404.4. Samples: 677986280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:09,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 12:31:11,263][52263] Updated weights for policy 0, policy_version 377659 (0.0028) [2024-04-27 12:31:14,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52702.0, 300 sec: 53428.5). Total num frames: 6187663360. Throughput: 0: 52991.3. Samples: 678133880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:14,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 12:31:14,879][52263] Updated weights for policy 0, policy_version 377669 (0.0026) [2024-04-27 12:31:17,351][52263] Updated weights for policy 0, policy_version 377679 (0.0034) [2024-04-27 12:31:19,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6187941888. Throughput: 0: 53164.0. Samples: 678457240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:19,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 12:31:20,827][52263] Updated weights for policy 0, policy_version 377689 (0.0030) [2024-04-27 12:31:23,600][52263] Updated weights for policy 0, policy_version 377699 (0.0032) [2024-04-27 12:31:24,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6188236800. Throughput: 0: 53163.7. Samples: 678779620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:24,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 12:31:27,212][52263] Updated weights for policy 0, policy_version 377709 (0.0029) [2024-04-27 12:31:29,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6188498944. Throughput: 0: 53541.7. Samples: 678947980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:29,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 12:31:29,767][52263] Updated weights for policy 0, policy_version 377719 (0.0028) [2024-04-27 12:31:33,571][52263] Updated weights for policy 0, policy_version 377729 (0.0033) [2024-04-27 12:31:34,107][52031] Fps is (10 sec: 50789.5, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6188744704. Throughput: 0: 53668.8. Samples: 679270720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:34,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 12:31:34,154][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000377732_6188761088.pth... [2024-04-27 12:31:34,213][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000376949_6175932416.pth [2024-04-27 12:31:35,801][52263] Updated weights for policy 0, policy_version 377739 (0.0034) [2024-04-27 12:31:39,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6189023232. Throughput: 0: 53670.7. Samples: 679589400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:39,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 12:31:39,745][52263] Updated weights for policy 0, policy_version 377749 (0.0027) [2024-04-27 12:31:41,784][52263] Updated weights for policy 0, policy_version 377759 (0.0028) [2024-04-27 12:31:44,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6189268992. Throughput: 0: 53317.1. Samples: 679741620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:44,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:31:45,704][52263] Updated weights for policy 0, policy_version 377769 (0.0032) [2024-04-27 12:31:48,054][52263] Updated weights for policy 0, policy_version 377779 (0.0026) [2024-04-27 12:31:49,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.0, 300 sec: 53428.5). 
Total num frames: 6189563904. Throughput: 0: 53264.7. Samples: 680062900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:49,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 12:31:51,671][52263] Updated weights for policy 0, policy_version 377789 (0.0025) [2024-04-27 12:31:54,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53521.1, 300 sec: 53539.5). Total num frames: 6189842432. Throughput: 0: 53265.2. Samples: 680383220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 12:31:54,116][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 12:31:54,253][52263] Updated weights for policy 0, policy_version 377799 (0.0030) [2024-04-27 12:31:57,951][52263] Updated weights for policy 0, policy_version 377809 (0.0031) [2024-04-27 12:31:57,993][52242] Signal inference workers to stop experience collection... (10200 times) [2024-04-27 12:31:58,033][52263] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-04-27 12:31:58,091][52242] Signal inference workers to resume experience collection... (10200 times) [2024-04-27 12:31:58,091][52263] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-04-27 12:31:59,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6190104576. Throughput: 0: 53787.4. Samples: 680554320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:31:59,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 12:32:00,204][52263] Updated weights for policy 0, policy_version 377819 (0.0027) [2024-04-27 12:32:03,943][52263] Updated weights for policy 0, policy_version 377829 (0.0026) [2024-04-27 12:32:04,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53520.8, 300 sec: 53484.0). Total num frames: 6190366720. Throughput: 0: 53802.8. Samples: 680878380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:04,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:32:06,494][52263] Updated weights for policy 0, policy_version 377839 (0.0029) [2024-04-27 12:32:09,106][52031] Fps is (10 sec: 49152.4, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6190596096. Throughput: 0: 53743.5. Samples: 681198080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:09,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:32:09,998][52263] Updated weights for policy 0, policy_version 377849 (0.0029) [2024-04-27 12:32:12,527][52263] Updated weights for policy 0, policy_version 377859 (0.0024) [2024-04-27 12:32:14,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.0, 300 sec: 53372.9). Total num frames: 6190891008. Throughput: 0: 53338.5. Samples: 681348220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:14,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 12:32:16,106][52263] Updated weights for policy 0, policy_version 377869 (0.0031) [2024-04-27 12:32:18,731][52263] Updated weights for policy 0, policy_version 377879 (0.0033) [2024-04-27 12:32:19,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6191169536. Throughput: 0: 53374.2. Samples: 681672560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:19,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 12:32:22,104][52263] Updated weights for policy 0, policy_version 377889 (0.0029) [2024-04-27 12:32:24,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53520.8, 300 sec: 53484.0). Total num frames: 6191448064. Throughput: 0: 53403.0. Samples: 681992540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:24,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 12:32:24,988][52263] Updated weights for policy 0, policy_version 377899 (0.0032) [2024-04-27 12:32:28,126][52263] Updated weights for policy 0, policy_version 377909 (0.0036) [2024-04-27 12:32:29,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6191710208. Throughput: 0: 53673.2. Samples: 682156920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:29,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 12:32:31,241][52263] Updated weights for policy 0, policy_version 377919 (0.0028) [2024-04-27 12:32:34,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6191972352. Throughput: 0: 53796.6. Samples: 682483740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:34,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:32:34,161][52263] Updated weights for policy 0, policy_version 377929 (0.0031) [2024-04-27 12:32:37,316][52263] Updated weights for policy 0, policy_version 377939 (0.0026) [2024-04-27 12:32:39,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6192218112. Throughput: 0: 53921.9. Samples: 682809700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:39,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 12:32:40,446][52263] Updated weights for policy 0, policy_version 377949 (0.0032) [2024-04-27 12:32:43,420][52263] Updated weights for policy 0, policy_version 377959 (0.0030) [2024-04-27 12:32:44,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53793.9, 300 sec: 53317.4). Total num frames: 6192496640. Throughput: 0: 53409.6. Samples: 682957760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:44,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 12:32:46,738][52263] Updated weights for policy 0, policy_version 377969 (0.0029) [2024-04-27 12:32:49,106][52031] Fps is (10 sec: 57344.5, 60 sec: 53794.3, 300 sec: 53428.5). Total num frames: 6192791552. Throughput: 0: 53359.4. Samples: 683279540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:49,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:32:49,610][52263] Updated weights for policy 0, policy_version 377979 (0.0029) [2024-04-27 12:32:52,740][52263] Updated weights for policy 0, policy_version 377989 (0.0035) [2024-04-27 12:32:54,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6193053696. Throughput: 0: 53388.3. Samples: 683600560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:54,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:32:55,778][52263] Updated weights for policy 0, policy_version 377999 (0.0032) [2024-04-27 12:32:57,691][52242] Signal inference workers to stop experience collection... (10250 times) [2024-04-27 12:32:57,725][52263] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-04-27 12:32:57,784][52242] Signal inference workers to resume experience collection... (10250 times) [2024-04-27 12:32:57,784][52263] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-04-27 12:32:58,868][52263] Updated weights for policy 0, policy_version 378009 (0.0028) [2024-04-27 12:32:59,106][52031] Fps is (10 sec: 50790.1, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6193299456. Throughput: 0: 53696.1. Samples: 683764540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:32:59,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:33:01,839][52263] Updated weights for policy 0, policy_version 378019 (0.0027) [2024-04-27 12:33:04,106][52031] Fps is (10 sec: 49152.7, 60 sec: 52975.2, 300 sec: 53261.9). Total num frames: 6193545216. Throughput: 0: 53575.3. Samples: 684083440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:33:04,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 12:33:05,152][52263] Updated weights for policy 0, policy_version 378029 (0.0030) [2024-04-27 12:33:08,089][52263] Updated weights for policy 0, policy_version 378039 (0.0031) [2024-04-27 12:33:09,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.0, 300 sec: 53317.4). Total num frames: 6193823744. Throughput: 0: 53435.3. Samples: 684397120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:33:09,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 12:33:11,444][52263] Updated weights for policy 0, policy_version 378049 (0.0025) [2024-04-27 12:33:14,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6194102272. Throughput: 0: 53370.3. Samples: 684558580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 12:33:14,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:33:14,187][52263] Updated weights for policy 0, policy_version 378059 (0.0029) [2024-04-27 12:33:17,573][52263] Updated weights for policy 0, policy_version 378069 (0.0030) [2024-04-27 12:33:19,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 6194364416. Throughput: 0: 53287.2. Samples: 684881660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:33:19,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:33:20,142][52263] Updated weights for policy 0, policy_version 378079 (0.0029) [2024-04-27 12:33:23,623][52263] Updated weights for policy 0, policy_version 378089 (0.0031) [2024-04-27 12:33:24,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6194642944. Throughput: 0: 53108.0. Samples: 685199560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:33:24,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 12:33:26,203][52263] Updated weights for policy 0, policy_version 378099 (0.0033) [2024-04-27 12:33:29,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6194888704. Throughput: 0: 53259.3. Samples: 685354420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:33:29,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 12:33:29,638][52263] Updated weights for policy 0, policy_version 378109 (0.0025) [2024-04-27 12:33:32,641][52263] Updated weights for policy 0, policy_version 378119 (0.0029) [2024-04-27 12:33:34,106][52031] Fps is (10 sec: 50790.4, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6195150848. Throughput: 0: 53274.6. Samples: 685676900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:33:34,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 12:33:34,122][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000378123_6195167232.pth... [2024-04-27 12:33:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000377341_6182354944.pth [2024-04-27 12:33:35,826][52263] Updated weights for policy 0, policy_version 378129 (0.0031) [2024-04-27 12:33:38,809][52263] Updated weights for policy 0, policy_version 378139 (0.0029) [2024-04-27 12:33:39,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.0, 300 sec: 53373.0). 
Total num frames: 6195429376. Throughput: 0: 53216.4. Samples: 685995300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:33:39,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 12:33:41,864][52263] Updated weights for policy 0, policy_version 378149 (0.0031) [2024-04-27 12:33:44,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.3, 300 sec: 53317.4). Total num frames: 6195707904. Throughput: 0: 53290.3. Samples: 686162600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:33:44,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 12:33:44,814][52263] Updated weights for policy 0, policy_version 378159 (0.0028) [2024-04-27 12:33:47,955][52263] Updated weights for policy 0, policy_version 378169 (0.0030) [2024-04-27 12:33:49,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6195986432. Throughput: 0: 53310.6. Samples: 686482420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:33:49,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 12:33:50,845][52263] Updated weights for policy 0, policy_version 378179 (0.0030) [2024-04-27 12:33:53,832][52242] Signal inference workers to stop experience collection... (10300 times) [2024-04-27 12:33:53,832][52242] Signal inference workers to resume experience collection... (10300 times) [2024-04-27 12:33:53,844][52263] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-04-27 12:33:53,844][52263] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-04-27 12:33:53,963][52263] Updated weights for policy 0, policy_version 378189 (0.0037) [2024-04-27 12:33:54,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6196248576. Throughput: 0: 53500.8. Samples: 686804660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:33:54,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 12:33:57,405][52263] Updated weights for policy 0, policy_version 378199 (0.0031) [2024-04-27 12:33:59,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6196494336. Throughput: 0: 53254.7. Samples: 686955040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:33:59,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 12:34:00,349][52263] Updated weights for policy 0, policy_version 378209 (0.0032) [2024-04-27 12:34:03,480][52263] Updated weights for policy 0, policy_version 378219 (0.0027) [2024-04-27 12:34:04,107][52031] Fps is (10 sec: 49152.4, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6196740096. Throughput: 0: 53196.8. Samples: 687275520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:34:04,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 12:34:06,639][52263] Updated weights for policy 0, policy_version 378229 (0.0031) [2024-04-27 12:34:09,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6197051392. Throughput: 0: 53168.1. Samples: 687592120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:34:09,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 12:34:09,521][52263] Updated weights for policy 0, policy_version 378239 (0.0029) [2024-04-27 12:34:12,605][52263] Updated weights for policy 0, policy_version 378249 (0.0028) [2024-04-27 12:34:14,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6197313536. Throughput: 0: 53524.0. Samples: 687763000. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:34:14,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 12:34:15,464][52263] Updated weights for policy 0, policy_version 378259 (0.0029) [2024-04-27 12:34:18,642][52263] Updated weights for policy 0, policy_version 378269 (0.0031) [2024-04-27 12:34:19,106][52031] Fps is (10 sec: 50790.0, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6197559296. Throughput: 0: 53530.7. Samples: 688085780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:34:19,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 12:34:21,781][52263] Updated weights for policy 0, policy_version 378279 (0.0031) [2024-04-27 12:34:24,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6197821440. Throughput: 0: 53474.4. Samples: 688401640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:34:24,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 12:34:24,964][52263] Updated weights for policy 0, policy_version 378289 (0.0036) [2024-04-27 12:34:27,994][52263] Updated weights for policy 0, policy_version 378299 (0.0027) [2024-04-27 12:34:29,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6198099968. Throughput: 0: 53198.1. Samples: 688556520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:34:29,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 12:34:31,222][52263] Updated weights for policy 0, policy_version 378309 (0.0030) [2024-04-27 12:34:34,054][52263] Updated weights for policy 0, policy_version 378319 (0.0032) [2024-04-27 12:34:34,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6198378496. Throughput: 0: 53294.6. Samples: 688880680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 12:34:34,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 12:34:37,260][52263] Updated weights for policy 0, policy_version 378329 (0.0029) [2024-04-27 12:34:39,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6198640640. Throughput: 0: 53171.2. Samples: 689197360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:34:39,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:34:40,176][52263] Updated weights for policy 0, policy_version 378339 (0.0034) [2024-04-27 12:34:43,239][52263] Updated weights for policy 0, policy_version 378349 (0.0030) [2024-04-27 12:34:44,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 6198886400. Throughput: 0: 53428.9. Samples: 689359340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:34:44,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:34:46,545][52263] Updated weights for policy 0, policy_version 378359 (0.0027) [2024-04-27 12:34:49,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52975.1, 300 sec: 53373.0). Total num frames: 6199164928. Throughput: 0: 53384.6. Samples: 689677820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:34:49,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 12:34:49,415][52263] Updated weights for policy 0, policy_version 378369 (0.0027) [2024-04-27 12:34:52,805][52263] Updated weights for policy 0, policy_version 378379 (0.0027) [2024-04-27 12:34:54,106][52031] Fps is (10 sec: 54067.5, 60 sec: 52975.1, 300 sec: 53317.4). Total num frames: 6199427072. Throughput: 0: 53607.6. Samples: 690004460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:34:54,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 12:34:55,546][52263] Updated weights for policy 0, policy_version 378389 (0.0036) [2024-04-27 12:34:58,950][52263] Updated weights for policy 0, policy_version 378399 (0.0034) [2024-04-27 12:34:59,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6199705600. Throughput: 0: 53309.0. Samples: 690161900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:34:59,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 12:34:59,915][52242] Signal inference workers to stop experience collection... (10350 times) [2024-04-27 12:34:59,953][52263] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-04-27 12:34:59,981][52242] Signal inference workers to resume experience collection... (10350 times) [2024-04-27 12:34:59,986][52263] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-04-27 12:35:01,785][52263] Updated weights for policy 0, policy_version 378409 (0.0034) [2024-04-27 12:35:04,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 6199967744. Throughput: 0: 53272.5. Samples: 690483040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:04,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 12:35:04,922][52263] Updated weights for policy 0, policy_version 378419 (0.0030) [2024-04-27 12:35:07,949][52263] Updated weights for policy 0, policy_version 378429 (0.0032) [2024-04-27 12:35:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6200246272. Throughput: 0: 53403.1. Samples: 690804780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:09,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 12:35:10,921][52263] Updated weights for policy 0, policy_version 378439 (0.0028) [2024-04-27 12:35:14,107][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.8, 300 sec: 53317.4). Total num frames: 6200492032. Throughput: 0: 53396.3. Samples: 690959360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:14,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 12:35:14,417][52263] Updated weights for policy 0, policy_version 378449 (0.0036) [2024-04-27 12:35:17,117][52263] Updated weights for policy 0, policy_version 378459 (0.0028) [2024-04-27 12:35:19,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6200770560. Throughput: 0: 53317.2. Samples: 691279960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:19,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 12:35:20,394][52263] Updated weights for policy 0, policy_version 378469 (0.0026) [2024-04-27 12:35:23,460][52263] Updated weights for policy 0, policy_version 378479 (0.0031) [2024-04-27 12:35:24,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.0, 300 sec: 53373.0). Total num frames: 6201049088. Throughput: 0: 53461.7. Samples: 691603140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:24,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 12:35:26,397][52263] Updated weights for policy 0, policy_version 378489 (0.0030) [2024-04-27 12:35:29,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6201311232. Throughput: 0: 53398.6. Samples: 691762280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:29,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 12:35:29,492][52263] Updated weights for policy 0, policy_version 378499 (0.0027) [2024-04-27 12:35:32,538][52263] Updated weights for policy 0, policy_version 378509 (0.0028) [2024-04-27 12:35:34,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6201573376. Throughput: 0: 53487.7. Samples: 692084780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:34,107][52031] Avg episode reward: [(0, '0.498')] [2024-04-27 12:35:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000378514_6201573376.pth... [2024-04-27 12:35:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000377732_6188761088.pth [2024-04-27 12:35:35,544][52263] Updated weights for policy 0, policy_version 378519 (0.0027) [2024-04-27 12:35:38,674][52263] Updated weights for policy 0, policy_version 378529 (0.0031) [2024-04-27 12:35:39,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6201835520. Throughput: 0: 53297.3. Samples: 692402840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:39,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 12:35:41,799][52263] Updated weights for policy 0, policy_version 378539 (0.0027) [2024-04-27 12:35:44,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6202097664. Throughput: 0: 53361.3. Samples: 692563160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:44,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 12:35:44,771][52263] Updated weights for policy 0, policy_version 378549 (0.0031) [2024-04-27 12:35:47,797][52263] Updated weights for policy 0, policy_version 378559 (0.0032) [2024-04-27 12:35:49,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53373.0). 
Total num frames: 6202376192. Throughput: 0: 53386.3. Samples: 692885420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:49,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 12:35:50,917][52263] Updated weights for policy 0, policy_version 378569 (0.0027) [2024-04-27 12:35:53,969][52263] Updated weights for policy 0, policy_version 378579 (0.0034) [2024-04-27 12:35:54,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6202654720. Throughput: 0: 53323.4. Samples: 693204340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 12:35:54,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 12:35:57,166][52263] Updated weights for policy 0, policy_version 378589 (0.0033) [2024-04-27 12:35:59,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6202900480. Throughput: 0: 53558.0. Samples: 693369460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:35:59,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 12:35:59,962][52263] Updated weights for policy 0, policy_version 378599 (0.0028) [2024-04-27 12:36:00,126][52242] Signal inference workers to stop experience collection... (10400 times) [2024-04-27 12:36:00,175][52263] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-04-27 12:36:00,188][52242] Signal inference workers to resume experience collection... (10400 times) [2024-04-27 12:36:00,195][52263] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-04-27 12:36:03,403][52263] Updated weights for policy 0, policy_version 378609 (0.0027) [2024-04-27 12:36:04,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6203195392. Throughput: 0: 53571.6. Samples: 693690680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:04,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 12:36:06,109][52263] Updated weights for policy 0, policy_version 378619 (0.0027) [2024-04-27 12:36:09,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6203441152. Throughput: 0: 53431.1. Samples: 694007540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:09,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 12:36:09,432][52263] Updated weights for policy 0, policy_version 378629 (0.0034) [2024-04-27 12:36:12,293][52263] Updated weights for policy 0, policy_version 378639 (0.0028) [2024-04-27 12:36:14,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6203703296. Throughput: 0: 53476.0. Samples: 694168700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:14,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 12:36:15,577][52263] Updated weights for policy 0, policy_version 378649 (0.0031) [2024-04-27 12:36:18,291][52263] Updated weights for policy 0, policy_version 378659 (0.0029) [2024-04-27 12:36:19,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53372.9). Total num frames: 6203981824. Throughput: 0: 53424.4. Samples: 694488880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:19,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:36:21,587][52263] Updated weights for policy 0, policy_version 378669 (0.0033) [2024-04-27 12:36:24,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6204243968. Throughput: 0: 53563.5. Samples: 694813200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:24,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 12:36:24,331][52263] Updated weights for policy 0, policy_version 378679 (0.0029) [2024-04-27 12:36:27,761][52263] Updated weights for policy 0, policy_version 378689 (0.0031) [2024-04-27 12:36:29,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6204506112. Throughput: 0: 53558.1. Samples: 694973280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:29,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 12:36:30,482][52263] Updated weights for policy 0, policy_version 378699 (0.0032) [2024-04-27 12:36:33,741][52263] Updated weights for policy 0, policy_version 378709 (0.0029) [2024-04-27 12:36:34,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6204801024. Throughput: 0: 53536.6. Samples: 695294580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:34,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:36:36,658][52263] Updated weights for policy 0, policy_version 378719 (0.0032) [2024-04-27 12:36:39,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53520.8, 300 sec: 53484.0). Total num frames: 6205046784. Throughput: 0: 53574.4. Samples: 695615200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:39,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:36:39,921][52263] Updated weights for policy 0, policy_version 378729 (0.0032) [2024-04-27 12:36:42,619][52263] Updated weights for policy 0, policy_version 378739 (0.0030) [2024-04-27 12:36:44,107][52031] Fps is (10 sec: 50791.3, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6205308928. Throughput: 0: 53479.5. Samples: 695776040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:44,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 12:36:45,978][52263] Updated weights for policy 0, policy_version 378749 (0.0031) [2024-04-27 12:36:48,853][52263] Updated weights for policy 0, policy_version 378759 (0.0032) [2024-04-27 12:36:49,107][52031] Fps is (10 sec: 55706.7, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6205603840. Throughput: 0: 53440.4. Samples: 696095500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:49,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 12:36:52,081][52263] Updated weights for policy 0, policy_version 378769 (0.0026) [2024-04-27 12:36:54,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6205865984. Throughput: 0: 53628.6. Samples: 696420820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:54,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:36:54,760][52263] Updated weights for policy 0, policy_version 378779 (0.0031) [2024-04-27 12:36:58,131][52263] Updated weights for policy 0, policy_version 378789 (0.0032) [2024-04-27 12:36:59,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6206111744. Throughput: 0: 53618.7. Samples: 696581540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:36:59,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 12:37:01,044][52263] Updated weights for policy 0, policy_version 378799 (0.0039) [2024-04-27 12:37:04,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52975.1, 300 sec: 53484.0). Total num frames: 6206373888. Throughput: 0: 53616.7. Samples: 696901620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:37:04,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:37:04,328][52263] Updated weights for policy 0, policy_version 378809 (0.0030) [2024-04-27 12:37:07,315][52263] Updated weights for policy 0, policy_version 378819 (0.0027) [2024-04-27 12:37:07,773][52242] Signal inference workers to stop experience collection... (10450 times) [2024-04-27 12:37:07,774][52242] Signal inference workers to resume experience collection... (10450 times) [2024-04-27 12:37:07,800][52263] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-04-27 12:37:07,800][52263] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-04-27 12:37:09,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6206652416. Throughput: 0: 53464.9. Samples: 697219120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:37:09,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 12:37:10,426][52263] Updated weights for policy 0, policy_version 378829 (0.0032) [2024-04-27 12:37:13,554][52263] Updated weights for policy 0, policy_version 378839 (0.0029) [2024-04-27 12:37:14,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6206930944. Throughput: 0: 53568.2. Samples: 697383840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 12:37:14,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 12:37:16,357][52263] Updated weights for policy 0, policy_version 378849 (0.0033) [2024-04-27 12:37:19,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6207193088. Throughput: 0: 53574.9. Samples: 697705440. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:37:19,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 12:37:19,626][52263] Updated weights for policy 0, policy_version 378859 (0.0029) [2024-04-27 12:37:22,480][52263] Updated weights for policy 0, policy_version 378869 (0.0036) [2024-04-27 12:37:24,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6207471616. Throughput: 0: 53556.2. Samples: 698025220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:37:24,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 12:37:25,743][52263] Updated weights for policy 0, policy_version 378879 (0.0032) [2024-04-27 12:37:28,657][52263] Updated weights for policy 0, policy_version 378889 (0.0029) [2024-04-27 12:37:29,106][52031] Fps is (10 sec: 55706.1, 60 sec: 54067.4, 300 sec: 53484.1). Total num frames: 6207750144. Throughput: 0: 53639.7. Samples: 698189820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:37:29,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:37:31,776][52263] Updated weights for policy 0, policy_version 378899 (0.0030) [2024-04-27 12:37:34,107][52031] Fps is (10 sec: 50790.8, 60 sec: 52975.1, 300 sec: 53428.5). Total num frames: 6207979520. Throughput: 0: 53637.4. Samples: 698509180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:37:34,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 12:37:34,196][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000378906_6207995904.pth... [2024-04-27 12:37:34,235][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000378123_6195167232.pth [2024-04-27 12:37:34,913][52263] Updated weights for policy 0, policy_version 378909 (0.0031) [2024-04-27 12:37:38,075][52263] Updated weights for policy 0, policy_version 378919 (0.0030) [2024-04-27 12:37:39,107][52031] Fps is (10 sec: 50789.6, 60 sec: 53521.3, 300 sec: 53428.5). 
Total num frames: 6208258048. Throughput: 0: 53419.0. Samples: 698824680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:37:39,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 12:37:40,933][52263] Updated weights for policy 0, policy_version 378929 (0.0032) [2024-04-27 12:37:44,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6208520192. Throughput: 0: 53442.7. Samples: 698986460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:37:44,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 12:37:44,156][52263] Updated weights for policy 0, policy_version 378939 (0.0031) [2024-04-27 12:37:46,998][52263] Updated weights for policy 0, policy_version 378949 (0.0024) [2024-04-27 12:37:49,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6208815104. Throughput: 0: 53566.1. Samples: 699312100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:37:49,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 12:37:50,290][52263] Updated weights for policy 0, policy_version 378959 (0.0028) [2024-04-27 12:37:53,229][52263] Updated weights for policy 0, policy_version 378969 (0.0031) [2024-04-27 12:37:54,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6209060864. Throughput: 0: 53581.6. Samples: 699630300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:37:54,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 12:37:56,384][52263] Updated weights for policy 0, policy_version 378979 (0.0027) [2024-04-27 12:37:59,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6209323008. Throughput: 0: 53520.1. Samples: 699792240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:37:59,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 12:37:59,382][52263] Updated weights for policy 0, policy_version 378989 (0.0033) [2024-04-27 12:38:02,395][52263] Updated weights for policy 0, policy_version 378999 (0.0033) [2024-04-27 12:38:02,581][52242] Signal inference workers to stop experience collection... (10500 times) [2024-04-27 12:38:02,616][52263] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-04-27 12:38:02,644][52242] Signal inference workers to resume experience collection... (10500 times) [2024-04-27 12:38:02,644][52263] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-04-27 12:38:04,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6209585152. Throughput: 0: 53513.6. Samples: 700113560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:38:04,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 12:38:05,376][52263] Updated weights for policy 0, policy_version 379009 (0.0036) [2024-04-27 12:38:08,498][52263] Updated weights for policy 0, policy_version 379019 (0.0032) [2024-04-27 12:38:09,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6209863680. Throughput: 0: 53613.5. Samples: 700437820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:38:09,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 12:38:11,472][52263] Updated weights for policy 0, policy_version 379029 (0.0030) [2024-04-27 12:38:14,107][52031] Fps is (10 sec: 55706.6, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6210142208. Throughput: 0: 53495.9. Samples: 700597140. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:38:14,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 12:38:14,597][52263] Updated weights for policy 0, policy_version 379039 (0.0029) [2024-04-27 12:38:17,598][52263] Updated weights for policy 0, policy_version 379049 (0.0027) [2024-04-27 12:38:19,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6210420736. Throughput: 0: 53574.2. Samples: 700920020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:38:19,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 12:38:20,683][52263] Updated weights for policy 0, policy_version 379059 (0.0036) [2024-04-27 12:38:23,711][52263] Updated weights for policy 0, policy_version 379069 (0.0029) [2024-04-27 12:38:24,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6210682880. Throughput: 0: 53724.2. Samples: 701242260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:38:24,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 12:38:26,636][52263] Updated weights for policy 0, policy_version 379079 (0.0029) [2024-04-27 12:38:29,106][52031] Fps is (10 sec: 50791.4, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6210928640. Throughput: 0: 53690.3. Samples: 701402520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 12:38:29,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 12:38:29,726][52263] Updated weights for policy 0, policy_version 379089 (0.0029) [2024-04-27 12:38:32,647][52263] Updated weights for policy 0, policy_version 379099 (0.0035) [2024-04-27 12:38:34,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6211190784. Throughput: 0: 53621.0. Samples: 701725040. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:38:34,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 12:38:35,882][52263] Updated weights for policy 0, policy_version 379109 (0.0030) [2024-04-27 12:38:38,841][52263] Updated weights for policy 0, policy_version 379119 (0.0029) [2024-04-27 12:38:39,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.3, 300 sec: 53484.0). Total num frames: 6211485696. Throughput: 0: 53804.7. Samples: 702051500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:38:39,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 12:38:42,001][52263] Updated weights for policy 0, policy_version 379129 (0.0028) [2024-04-27 12:38:44,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6211747840. Throughput: 0: 53827.5. Samples: 702214480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:38:44,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 12:38:44,790][52263] Updated weights for policy 0, policy_version 379139 (0.0031) [2024-04-27 12:38:48,123][52263] Updated weights for policy 0, policy_version 379149 (0.0028) [2024-04-27 12:38:49,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6212042752. Throughput: 0: 53808.1. Samples: 702534920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:38:49,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 12:38:51,177][52263] Updated weights for policy 0, policy_version 379159 (0.0029) [2024-04-27 12:38:54,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6212288512. Throughput: 0: 53704.3. Samples: 702854520. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:38:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:38:54,133][52263] Updated weights for policy 0, policy_version 379169 (0.0027) [2024-04-27 12:38:57,765][52263] Updated weights for policy 0, policy_version 379179 (0.0025) [2024-04-27 12:38:59,106][52031] Fps is (10 sec: 49152.4, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6212534272. Throughput: 0: 53706.2. Samples: 703013920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:38:59,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 12:38:59,993][52242] Signal inference workers to stop experience collection... (10550 times) [2024-04-27 12:39:00,022][52263] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-04-27 12:39:00,046][52242] Signal inference workers to resume experience collection... (10550 times) [2024-04-27 12:39:00,047][52263] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-04-27 12:39:00,162][52263] Updated weights for policy 0, policy_version 379189 (0.0028) [2024-04-27 12:39:03,732][52263] Updated weights for policy 0, policy_version 379199 (0.0028) [2024-04-27 12:39:04,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6212812800. Throughput: 0: 53664.5. Samples: 703334920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:04,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 12:39:06,437][52263] Updated weights for policy 0, policy_version 379209 (0.0030) [2024-04-27 12:39:09,107][52031] Fps is (10 sec: 57343.8, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6213107712. Throughput: 0: 53694.1. Samples: 703658500. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:09,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 12:39:09,950][52263] Updated weights for policy 0, policy_version 379219 (0.0027) [2024-04-27 12:39:12,639][52263] Updated weights for policy 0, policy_version 379229 (0.0028) [2024-04-27 12:39:14,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6213353472. Throughput: 0: 53583.3. Samples: 703813780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:14,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 12:39:16,120][52263] Updated weights for policy 0, policy_version 379239 (0.0029) [2024-04-27 12:39:18,747][52263] Updated weights for policy 0, policy_version 379249 (0.0031) [2024-04-27 12:39:19,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6213648384. Throughput: 0: 53636.9. Samples: 704138700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:19,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 12:39:22,210][52263] Updated weights for policy 0, policy_version 379259 (0.0030) [2024-04-27 12:39:24,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6213894144. Throughput: 0: 53470.2. Samples: 704457660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:24,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 12:39:24,824][52263] Updated weights for policy 0, policy_version 379269 (0.0031) [2024-04-27 12:39:28,152][52263] Updated weights for policy 0, policy_version 379279 (0.0027) [2024-04-27 12:39:29,106][52031] Fps is (10 sec: 49151.9, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6214139904. Throughput: 0: 53366.7. Samples: 704615980. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:29,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 12:39:30,915][52263] Updated weights for policy 0, policy_version 379289 (0.0034) [2024-04-27 12:39:34,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6214418432. Throughput: 0: 53333.8. Samples: 704934940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:34,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 12:39:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000379298_6214418432.pth... [2024-04-27 12:39:34,184][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000378514_6201573376.pth [2024-04-27 12:39:34,324][52263] Updated weights for policy 0, policy_version 379299 (0.0034) [2024-04-27 12:39:36,996][52263] Updated weights for policy 0, policy_version 379309 (0.0030) [2024-04-27 12:39:39,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6214680576. Throughput: 0: 53388.7. Samples: 705257000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:39,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 12:39:40,470][52263] Updated weights for policy 0, policy_version 379319 (0.0029) [2024-04-27 12:39:43,096][52263] Updated weights for policy 0, policy_version 379329 (0.0027) [2024-04-27 12:39:44,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6214975488. Throughput: 0: 53524.0. Samples: 705422500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:44,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 12:39:46,502][52263] Updated weights for policy 0, policy_version 379339 (0.0028) [2024-04-27 12:39:49,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6215237632. Throughput: 0: 53449.3. Samples: 705740140. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-27 12:39:49,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 12:39:49,408][52263] Updated weights for policy 0, policy_version 379349 (0.0031) [2024-04-27 12:39:52,587][52263] Updated weights for policy 0, policy_version 379359 (0.0030) [2024-04-27 12:39:54,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6215483392. Throughput: 0: 53316.9. Samples: 706057760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:39:54,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 12:39:54,317][52242] Signal inference workers to stop experience collection... (10600 times) [2024-04-27 12:39:54,319][52242] Signal inference workers to resume experience collection... (10600 times) [2024-04-27 12:39:54,326][52263] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-04-27 12:39:54,340][52263] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-04-27 12:39:55,522][52263] Updated weights for policy 0, policy_version 379369 (0.0030) [2024-04-27 12:39:58,646][52263] Updated weights for policy 0, policy_version 379379 (0.0030) [2024-04-27 12:39:59,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6215745536. Throughput: 0: 53388.2. Samples: 706216240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:39:59,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 12:40:01,493][52263] Updated weights for policy 0, policy_version 379389 (0.0031) [2024-04-27 12:40:04,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6216024064. Throughput: 0: 53282.1. Samples: 706536400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:04,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 12:40:04,743][52263] Updated weights for policy 0, policy_version 379399 (0.0028) [2024-04-27 12:40:07,627][52263] Updated weights for policy 0, policy_version 379409 (0.0027) [2024-04-27 12:40:09,106][52031] Fps is (10 sec: 54067.2, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6216286208. Throughput: 0: 53298.7. Samples: 706856100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:09,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:40:10,974][52263] Updated weights for policy 0, policy_version 379419 (0.0024) [2024-04-27 12:40:13,802][52263] Updated weights for policy 0, policy_version 379429 (0.0025) [2024-04-27 12:40:14,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.3, 300 sec: 53595.2). Total num frames: 6216581120. Throughput: 0: 53557.9. Samples: 707026080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:14,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:40:17,352][52263] Updated weights for policy 0, policy_version 379439 (0.0033) [2024-04-27 12:40:19,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6216826880. Throughput: 0: 53485.7. Samples: 707341800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:19,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 12:40:19,859][52263] Updated weights for policy 0, policy_version 379449 (0.0033) [2024-04-27 12:40:23,332][52263] Updated weights for policy 0, policy_version 379459 (0.0027) [2024-04-27 12:40:24,106][52031] Fps is (10 sec: 49151.5, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6217072640. Throughput: 0: 53378.1. Samples: 707659020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:24,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 12:40:26,075][52263] Updated weights for policy 0, policy_version 379469 (0.0031) [2024-04-27 12:40:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6217367552. Throughput: 0: 53100.9. Samples: 707812040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:29,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 12:40:29,505][52263] Updated weights for policy 0, policy_version 379479 (0.0026) [2024-04-27 12:40:32,042][52263] Updated weights for policy 0, policy_version 379489 (0.0024) [2024-04-27 12:40:34,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6217629696. Throughput: 0: 53242.9. Samples: 708136060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:34,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 12:40:35,567][52263] Updated weights for policy 0, policy_version 379499 (0.0033) [2024-04-27 12:40:38,327][52263] Updated weights for policy 0, policy_version 379509 (0.0030) [2024-04-27 12:40:39,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6217908224. Throughput: 0: 53344.0. Samples: 708458240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:39,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 12:40:41,561][52263] Updated weights for policy 0, policy_version 379519 (0.0029) [2024-04-27 12:40:44,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6218170368. Throughput: 0: 53510.1. Samples: 708624200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:44,107][52031] Avg episode reward: [(0, '0.681')] [2024-04-27 12:40:44,338][52263] Updated weights for policy 0, policy_version 379529 (0.0026) [2024-04-27 12:40:47,608][52263] Updated weights for policy 0, policy_version 379539 (0.0028) [2024-04-27 12:40:49,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52975.1, 300 sec: 53428.5). Total num frames: 6218416128. Throughput: 0: 53504.6. Samples: 708944100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:49,107][52031] Avg episode reward: [(0, '0.495')] [2024-04-27 12:40:50,474][52263] Updated weights for policy 0, policy_version 379549 (0.0031) [2024-04-27 12:40:52,433][52242] Signal inference workers to stop experience collection... (10650 times) [2024-04-27 12:40:52,455][52263] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-04-27 12:40:52,497][52242] Signal inference workers to resume experience collection... (10650 times) [2024-04-27 12:40:52,497][52263] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-04-27 12:40:53,734][52263] Updated weights for policy 0, policy_version 379559 (0.0028) [2024-04-27 12:40:54,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6218694656. Throughput: 0: 53624.9. Samples: 709269220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:54,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 12:40:56,490][52263] Updated weights for policy 0, policy_version 379569 (0.0028) [2024-04-27 12:40:59,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6218973184. Throughput: 0: 53236.0. Samples: 709421700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:40:59,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 12:40:59,916][52263] Updated weights for policy 0, policy_version 379579 (0.0033) [2024-04-27 12:41:02,869][52263] Updated weights for policy 0, policy_version 379589 (0.0036) [2024-04-27 12:41:04,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6219218944. Throughput: 0: 53340.7. Samples: 709742120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:41:04,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 12:41:06,192][52263] Updated weights for policy 0, policy_version 379599 (0.0033) [2024-04-27 12:41:09,002][52263] Updated weights for policy 0, policy_version 379609 (0.0027) [2024-04-27 12:41:09,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.2, 300 sec: 53595.2). Total num frames: 6219513856. Throughput: 0: 53462.0. Samples: 710064800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:41:09,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:41:12,357][52263] Updated weights for policy 0, policy_version 379619 (0.0028) [2024-04-27 12:41:14,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52974.9, 300 sec: 53484.1). Total num frames: 6219759616. Throughput: 0: 53630.8. Samples: 710225420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:14,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 12:41:14,982][52263] Updated weights for policy 0, policy_version 379629 (0.0025) [2024-04-27 12:41:18,300][52263] Updated weights for policy 0, policy_version 379639 (0.0035) [2024-04-27 12:41:19,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6220021760. Throughput: 0: 53626.7. Samples: 710549260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:19,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 12:41:21,238][52263] Updated weights for policy 0, policy_version 379649 (0.0033) [2024-04-27 12:41:24,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6220300288. Throughput: 0: 53533.1. Samples: 710867220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:24,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:41:24,413][52263] Updated weights for policy 0, policy_version 379659 (0.0029) [2024-04-27 12:41:27,400][52263] Updated weights for policy 0, policy_version 379669 (0.0030) [2024-04-27 12:41:29,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6220578816. Throughput: 0: 53501.0. Samples: 711031740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:29,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 12:41:30,848][52263] Updated weights for policy 0, policy_version 379679 (0.0032) [2024-04-27 12:41:33,645][52263] Updated weights for policy 0, policy_version 379689 (0.0031) [2024-04-27 12:41:34,106][52031] Fps is (10 sec: 52428.1, 60 sec: 53247.9, 300 sec: 53484.1). Total num frames: 6220824576. Throughput: 0: 53334.6. Samples: 711344160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:34,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:41:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000379689_6220824576.pth... [2024-04-27 12:41:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000378906_6207995904.pth [2024-04-27 12:41:37,030][52263] Updated weights for policy 0, policy_version 379699 (0.0037) [2024-04-27 12:41:39,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52975.1, 300 sec: 53484.1). Total num frames: 6221086720. Throughput: 0: 53117.4. Samples: 711659500. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:39,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 12:41:39,764][52263] Updated weights for policy 0, policy_version 379709 (0.0026) [2024-04-27 12:41:43,146][52263] Updated weights for policy 0, policy_version 379719 (0.0034) [2024-04-27 12:41:44,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52975.1, 300 sec: 53373.0). Total num frames: 6221348864. Throughput: 0: 53304.4. Samples: 711820400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:44,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 12:41:45,941][52263] Updated weights for policy 0, policy_version 379729 (0.0025) [2024-04-27 12:41:49,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6221627392. Throughput: 0: 53361.7. Samples: 712143400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:49,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 12:41:49,149][52263] Updated weights for policy 0, policy_version 379739 (0.0027) [2024-04-27 12:41:51,303][52242] Signal inference workers to stop experience collection... (10700 times) [2024-04-27 12:41:51,345][52263] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-04-27 12:41:51,403][52242] Signal inference workers to resume experience collection... (10700 times) [2024-04-27 12:41:51,403][52263] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-04-27 12:41:52,175][52263] Updated weights for policy 0, policy_version 379749 (0.0029) [2024-04-27 12:41:54,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6221889536. Throughput: 0: 53275.3. Samples: 712462200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:54,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 12:41:55,440][52263] Updated weights for policy 0, policy_version 379759 (0.0030) [2024-04-27 12:41:58,303][52263] Updated weights for policy 0, policy_version 379769 (0.0026) [2024-04-27 12:41:59,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6222168064. Throughput: 0: 53384.0. Samples: 712627700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:41:59,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 12:42:01,728][52263] Updated weights for policy 0, policy_version 379779 (0.0031) [2024-04-27 12:42:04,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6222430208. Throughput: 0: 53346.2. Samples: 712949840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:42:04,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 12:42:04,466][52263] Updated weights for policy 0, policy_version 379789 (0.0030) [2024-04-27 12:42:07,814][52263] Updated weights for policy 0, policy_version 379799 (0.0036) [2024-04-27 12:42:09,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6222692352. Throughput: 0: 53390.6. Samples: 713269800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:42:09,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 12:42:10,506][52263] Updated weights for policy 0, policy_version 379809 (0.0027) [2024-04-27 12:42:14,018][52263] Updated weights for policy 0, policy_version 379819 (0.0026) [2024-04-27 12:42:14,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6222954496. Throughput: 0: 53299.6. Samples: 713430220. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:42:14,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 12:42:16,823][52263] Updated weights for policy 0, policy_version 379829 (0.0035) [2024-04-27 12:42:19,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6223249408. Throughput: 0: 53488.6. Samples: 713751140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:42:19,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:42:19,970][52263] Updated weights for policy 0, policy_version 379839 (0.0032) [2024-04-27 12:42:22,948][52263] Updated weights for policy 0, policy_version 379849 (0.0026) [2024-04-27 12:42:24,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6223511552. Throughput: 0: 53698.4. Samples: 714075940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:42:24,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 12:42:26,118][52263] Updated weights for policy 0, policy_version 379859 (0.0031) [2024-04-27 12:42:29,077][52263] Updated weights for policy 0, policy_version 379869 (0.0029) [2024-04-27 12:42:29,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6223773696. Throughput: 0: 53764.0. Samples: 714239780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 12:42:29,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 12:42:32,256][52263] Updated weights for policy 0, policy_version 379879 (0.0031) [2024-04-27 12:42:34,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6224052224. Throughput: 0: 53737.3. Samples: 714561580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:42:34,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 12:42:35,075][52263] Updated weights for policy 0, policy_version 379889 (0.0035) [2024-04-27 12:42:38,252][52263] Updated weights for policy 0, policy_version 379899 (0.0029) [2024-04-27 12:42:39,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6224297984. Throughput: 0: 53708.8. Samples: 714879100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:42:39,107][52031] Avg episode reward: [(0, '0.491')] [2024-04-27 12:42:41,048][52263] Updated weights for policy 0, policy_version 379909 (0.0027) [2024-04-27 12:42:44,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6224576512. Throughput: 0: 53543.6. Samples: 715037160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:42:44,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 12:42:44,724][52263] Updated weights for policy 0, policy_version 379919 (0.0032) [2024-04-27 12:42:47,590][52263] Updated weights for policy 0, policy_version 379929 (0.0028) [2024-04-27 12:42:49,106][52031] Fps is (10 sec: 55706.9, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6224855040. Throughput: 0: 53574.3. Samples: 715360680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:42:49,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 12:42:50,933][52263] Updated weights for policy 0, policy_version 379939 (0.0035) [2024-04-27 12:42:51,412][52242] Signal inference workers to stop experience collection... (10750 times) [2024-04-27 12:42:51,442][52263] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-04-27 12:42:51,470][52242] Signal inference workers to resume experience collection... 
(10750 times) [2024-04-27 12:42:51,471][52263] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-04-27 12:42:53,649][52263] Updated weights for policy 0, policy_version 379949 (0.0030) [2024-04-27 12:42:54,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6225100800. Throughput: 0: 53515.0. Samples: 715677980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:42:54,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 12:42:56,948][52263] Updated weights for policy 0, policy_version 379959 (0.0032) [2024-04-27 12:42:59,106][52031] Fps is (10 sec: 52428.1, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6225379328. Throughput: 0: 53477.7. Samples: 715836720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:42:59,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 12:42:59,907][52263] Updated weights for policy 0, policy_version 379969 (0.0029) [2024-04-27 12:43:03,030][52263] Updated weights for policy 0, policy_version 379979 (0.0025) [2024-04-27 12:43:04,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6225625088. Throughput: 0: 53382.1. Samples: 716153340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:04,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 12:43:05,993][52263] Updated weights for policy 0, policy_version 379989 (0.0027) [2024-04-27 12:43:09,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6225887232. Throughput: 0: 53372.6. Samples: 716477700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:09,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 12:43:09,164][52263] Updated weights for policy 0, policy_version 379999 (0.0029) [2024-04-27 12:43:12,106][52263] Updated weights for policy 0, policy_version 380009 (0.0026) [2024-04-27 12:43:14,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6226182144. Throughput: 0: 53270.6. Samples: 716636960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:14,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 12:43:15,156][52263] Updated weights for policy 0, policy_version 380019 (0.0029) [2024-04-27 12:43:18,170][52263] Updated weights for policy 0, policy_version 380029 (0.0031) [2024-04-27 12:43:19,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6226460672. Throughput: 0: 53224.9. Samples: 716956700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:19,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 12:43:21,561][52263] Updated weights for policy 0, policy_version 380039 (0.0030) [2024-04-27 12:43:24,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6226706432. Throughput: 0: 53390.9. Samples: 717281680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:24,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 12:43:24,189][52263] Updated weights for policy 0, policy_version 380049 (0.0034) [2024-04-27 12:43:27,641][52263] Updated weights for policy 0, policy_version 380059 (0.0029) [2024-04-27 12:43:29,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6226984960. Throughput: 0: 53370.2. Samples: 717438820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:29,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 12:43:30,498][52263] Updated weights for policy 0, policy_version 380069 (0.0030) [2024-04-27 12:43:33,788][52263] Updated weights for policy 0, policy_version 380079 (0.0026) [2024-04-27 12:43:34,106][52031] Fps is (10 sec: 52428.4, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 6227230720. Throughput: 0: 53254.5. Samples: 717757140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:34,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 12:43:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000380080_6227230720.pth... [2024-04-27 12:43:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000379298_6214418432.pth [2024-04-27 12:43:36,650][52263] Updated weights for policy 0, policy_version 380089 (0.0026) [2024-04-27 12:43:36,926][52242] Signal inference workers to stop experience collection... (10800 times) [2024-04-27 12:43:36,931][52242] Signal inference workers to resume experience collection... (10800 times) [2024-04-27 12:43:36,946][52263] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-04-27 12:43:36,946][52263] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-04-27 12:43:39,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6227509248. Throughput: 0: 53291.1. Samples: 718076080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:39,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 12:43:39,833][52263] Updated weights for policy 0, policy_version 380099 (0.0032) [2024-04-27 12:43:42,626][52263] Updated weights for policy 0, policy_version 380109 (0.0024) [2024-04-27 12:43:44,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6227804160. Throughput: 0: 53476.1. Samples: 718243140. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:44,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 12:43:45,826][52263] Updated weights for policy 0, policy_version 380119 (0.0035) [2024-04-27 12:43:48,775][52263] Updated weights for policy 0, policy_version 380129 (0.0032) [2024-04-27 12:43:49,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6228066304. Throughput: 0: 53611.2. Samples: 718565840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 12:43:49,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 12:43:52,082][52263] Updated weights for policy 0, policy_version 380139 (0.0034) [2024-04-27 12:43:54,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6228312064. Throughput: 0: 53444.4. Samples: 718882700. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:43:54,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 12:43:55,022][52263] Updated weights for policy 0, policy_version 380149 (0.0026) [2024-04-27 12:43:58,336][52263] Updated weights for policy 0, policy_version 380159 (0.0031) [2024-04-27 12:43:59,106][52031] Fps is (10 sec: 49151.9, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6228557824. Throughput: 0: 53202.2. Samples: 719031060. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:43:59,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 12:44:01,057][52263] Updated weights for policy 0, policy_version 380169 (0.0032) [2024-04-27 12:44:04,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6228836352. Throughput: 0: 53233.4. Samples: 719352200. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:04,107][52031] Avg episode reward: [(0, '0.486')] [2024-04-27 12:44:04,385][52263] Updated weights for policy 0, policy_version 380179 (0.0032) [2024-04-27 12:44:07,206][52263] Updated weights for policy 0, policy_version 380189 (0.0027) [2024-04-27 12:44:09,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6229114880. Throughput: 0: 53142.7. Samples: 719673100. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:09,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 12:44:10,412][52263] Updated weights for policy 0, policy_version 380199 (0.0028) [2024-04-27 12:44:13,193][52263] Updated weights for policy 0, policy_version 380209 (0.0027) [2024-04-27 12:44:14,106][52031] Fps is (10 sec: 57344.5, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6229409792. Throughput: 0: 53366.7. Samples: 719840320. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:14,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 12:44:16,609][52263] Updated weights for policy 0, policy_version 380219 (0.0031) [2024-04-27 12:44:19,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6229639168. Throughput: 0: 53381.9. Samples: 720159320. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:19,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 12:44:19,381][52263] Updated weights for policy 0, policy_version 380229 (0.0027) [2024-04-27 12:44:22,799][52263] Updated weights for policy 0, policy_version 380239 (0.0030) [2024-04-27 12:44:24,106][52031] Fps is (10 sec: 49151.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6229901312. Throughput: 0: 53419.0. Samples: 720479940. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:24,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:44:25,456][52263] Updated weights for policy 0, policy_version 380249 (0.0028) [2024-04-27 12:44:28,933][52263] Updated weights for policy 0, policy_version 380259 (0.0029) [2024-04-27 12:44:29,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6230163456. Throughput: 0: 53125.4. Samples: 720633780. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:29,107][52031] Avg episode reward: [(0, '0.476')] [2024-04-27 12:44:30,981][52242] Signal inference workers to stop experience collection... (10850 times) [2024-04-27 12:44:30,982][52242] Signal inference workers to resume experience collection... (10850 times) [2024-04-27 12:44:31,001][52263] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-04-27 12:44:31,001][52263] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-04-27 12:44:31,449][52263] Updated weights for policy 0, policy_version 380269 (0.0033) [2024-04-27 12:44:34,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6230441984. Throughput: 0: 53049.3. Samples: 720953060. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:34,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 12:44:35,087][52263] Updated weights for policy 0, policy_version 380279 (0.0028) [2024-04-27 12:44:37,542][52263] Updated weights for policy 0, policy_version 380289 (0.0029) [2024-04-27 12:44:39,107][52031] Fps is (10 sec: 57342.4, 60 sec: 53793.9, 300 sec: 53428.5). Total num frames: 6230736896. Throughput: 0: 53154.8. Samples: 721274680. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:39,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:44:41,289][52263] Updated weights for policy 0, policy_version 380299 (0.0028) [2024-04-27 12:44:43,787][52263] Updated weights for policy 0, policy_version 380309 (0.0033) [2024-04-27 12:44:44,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 6230982656. Throughput: 0: 53690.7. Samples: 721447140. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:44,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 12:44:47,282][52263] Updated weights for policy 0, policy_version 380319 (0.0026) [2024-04-27 12:44:49,106][52031] Fps is (10 sec: 49153.1, 60 sec: 52701.9, 300 sec: 53373.0). Total num frames: 6231228416. Throughput: 0: 53622.7. Samples: 721765220. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:49,107][52031] Avg episode reward: [(0, '0.494')] [2024-04-27 12:44:50,040][52263] Updated weights for policy 0, policy_version 380329 (0.0030) [2024-04-27 12:44:53,261][52263] Updated weights for policy 0, policy_version 380339 (0.0026) [2024-04-27 12:44:54,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6231523328. Throughput: 0: 53637.3. Samples: 722086780. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:54,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 12:44:56,137][52263] Updated weights for policy 0, policy_version 380349 (0.0028) [2024-04-27 12:44:59,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6231785472. Throughput: 0: 53228.0. Samples: 722235580. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:44:59,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 12:44:59,350][52263] Updated weights for policy 0, policy_version 380359 (0.0029) [2024-04-27 12:45:02,154][52263] Updated weights for policy 0, policy_version 380369 (0.0028) [2024-04-27 12:45:04,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6232064000. Throughput: 0: 53208.0. Samples: 722553680. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-04-27 12:45:04,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 12:45:05,728][52263] Updated weights for policy 0, policy_version 380379 (0.0026) [2024-04-27 12:45:08,298][52263] Updated weights for policy 0, policy_version 380389 (0.0025) [2024-04-27 12:45:09,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6232342528. Throughput: 0: 53312.0. Samples: 722878980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:09,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 12:45:11,909][52263] Updated weights for policy 0, policy_version 380399 (0.0031) [2024-04-27 12:45:14,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.9, 300 sec: 53484.1). Total num frames: 6232604672. Throughput: 0: 53646.1. Samples: 723047860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:14,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 12:45:14,414][52263] Updated weights for policy 0, policy_version 380409 (0.0040) [2024-04-27 12:45:18,036][52263] Updated weights for policy 0, policy_version 380419 (0.0028) [2024-04-27 12:45:19,106][52031] Fps is (10 sec: 49152.2, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6232834048. Throughput: 0: 53650.3. Samples: 723367320. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:19,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:45:20,425][52263] Updated weights for policy 0, policy_version 380429 (0.0033) [2024-04-27 12:45:24,015][52263] Updated weights for policy 0, policy_version 380439 (0.0034) [2024-04-27 12:45:24,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6233112576. Throughput: 0: 53697.1. Samples: 723691040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:24,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 12:45:24,975][52242] Signal inference workers to stop experience collection... (10900 times) [2024-04-27 12:45:24,976][52242] Signal inference workers to resume experience collection... (10900 times) [2024-04-27 12:45:24,988][52263] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-04-27 12:45:24,989][52263] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-04-27 12:45:26,517][52263] Updated weights for policy 0, policy_version 380449 (0.0030) [2024-04-27 12:45:29,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6233391104. Throughput: 0: 53246.7. Samples: 723843240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:29,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 12:45:30,098][52263] Updated weights for policy 0, policy_version 380459 (0.0029) [2024-04-27 12:45:32,839][52263] Updated weights for policy 0, policy_version 380469 (0.0028) [2024-04-27 12:45:34,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6233669632. Throughput: 0: 53383.1. Samples: 724167460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:34,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 12:45:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000380473_6233669632.pth... 
[2024-04-27 12:45:34,174][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000379689_6220824576.pth [2024-04-27 12:45:36,423][52263] Updated weights for policy 0, policy_version 380479 (0.0033) [2024-04-27 12:45:39,013][52263] Updated weights for policy 0, policy_version 380489 (0.0032) [2024-04-27 12:45:39,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6233931776. Throughput: 0: 53270.3. Samples: 724483940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:39,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 12:45:42,403][52263] Updated weights for policy 0, policy_version 380499 (0.0030) [2024-04-27 12:45:44,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6234177536. Throughput: 0: 53489.4. Samples: 724642600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:44,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 12:45:45,174][52263] Updated weights for policy 0, policy_version 380509 (0.0028) [2024-04-27 12:45:48,698][52263] Updated weights for policy 0, policy_version 380519 (0.0029) [2024-04-27 12:45:49,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6234439680. Throughput: 0: 53557.8. Samples: 724963780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:49,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:45:51,464][52263] Updated weights for policy 0, policy_version 380529 (0.0035) [2024-04-27 12:45:54,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6234701824. Throughput: 0: 53497.0. Samples: 725286340. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:54,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 12:45:54,870][52263] Updated weights for policy 0, policy_version 380539 (0.0031) [2024-04-27 12:45:57,619][52263] Updated weights for policy 0, policy_version 380549 (0.0031) [2024-04-27 12:45:59,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6234980352. Throughput: 0: 53188.1. Samples: 725441320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:45:59,107][52031] Avg episode reward: [(0, '0.486')] [2024-04-27 12:46:00,970][52263] Updated weights for policy 0, policy_version 380559 (0.0030) [2024-04-27 12:46:03,586][52263] Updated weights for policy 0, policy_version 380569 (0.0024) [2024-04-27 12:46:04,106][52031] Fps is (10 sec: 55704.9, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6235258880. Throughput: 0: 53223.5. Samples: 725762380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:46:04,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 12:46:07,249][52263] Updated weights for policy 0, policy_version 380579 (0.0033) [2024-04-27 12:46:09,106][52031] Fps is (10 sec: 54066.5, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6235521024. Throughput: 0: 53171.9. Samples: 726083780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:46:09,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 12:46:09,639][52263] Updated weights for policy 0, policy_version 380589 (0.0029) [2024-04-27 12:46:13,306][52263] Updated weights for policy 0, policy_version 380599 (0.0029) [2024-04-27 12:46:14,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6235799552. Throughput: 0: 53286.6. Samples: 726241140. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:46:14,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 12:46:15,822][52263] Updated weights for policy 0, policy_version 380609 (0.0028) [2024-04-27 12:46:19,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6236045312. Throughput: 0: 53128.8. Samples: 726558260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:46:19,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:46:19,379][52263] Updated weights for policy 0, policy_version 380619 (0.0029) [2024-04-27 12:46:21,955][52263] Updated weights for policy 0, policy_version 380629 (0.0029) [2024-04-27 12:46:23,463][52242] Signal inference workers to stop experience collection... (10950 times) [2024-04-27 12:46:23,463][52242] Signal inference workers to resume experience collection... (10950 times) [2024-04-27 12:46:23,487][52263] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-04-27 12:46:23,488][52263] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-04-27 12:46:24,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6236307456. Throughput: 0: 53336.8. Samples: 726884100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-27 12:46:24,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 12:46:25,322][52263] Updated weights for policy 0, policy_version 380639 (0.0028) [2024-04-27 12:46:27,977][52263] Updated weights for policy 0, policy_version 380649 (0.0037) [2024-04-27 12:46:29,106][52031] Fps is (10 sec: 52429.5, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 6236569600. Throughput: 0: 53412.4. Samples: 727046160. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:46:29,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 12:46:31,499][52263] Updated weights for policy 0, policy_version 380659 (0.0034) [2024-04-27 12:46:34,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6236864512. Throughput: 0: 53378.4. Samples: 727365820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:46:34,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 12:46:34,197][52263] Updated weights for policy 0, policy_version 380669 (0.0029) [2024-04-27 12:46:37,489][52263] Updated weights for policy 0, policy_version 380679 (0.0035) [2024-04-27 12:46:39,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6237126656. Throughput: 0: 53417.3. Samples: 727690120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:46:39,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 12:46:40,137][52263] Updated weights for policy 0, policy_version 380689 (0.0028) [2024-04-27 12:46:43,595][52263] Updated weights for policy 0, policy_version 380699 (0.0029) [2024-04-27 12:46:44,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53793.9, 300 sec: 53484.0). Total num frames: 6237405184. Throughput: 0: 53611.3. Samples: 727853840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:46:44,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 12:46:46,232][52263] Updated weights for policy 0, policy_version 380709 (0.0030) [2024-04-27 12:46:49,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6237650944. Throughput: 0: 53612.5. Samples: 728174940. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:46:49,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 12:46:49,647][52263] Updated weights for policy 0, policy_version 380719 (0.0032) [2024-04-27 12:46:52,327][52263] Updated weights for policy 0, policy_version 380729 (0.0026) [2024-04-27 12:46:54,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6237929472. Throughput: 0: 53532.5. Samples: 728492740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:46:54,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:46:55,906][52263] Updated weights for policy 0, policy_version 380739 (0.0027) [2024-04-27 12:46:58,604][52263] Updated weights for policy 0, policy_version 380749 (0.0033) [2024-04-27 12:46:59,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6238191616. Throughput: 0: 53625.3. Samples: 728654280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:46:59,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 12:47:01,910][52263] Updated weights for policy 0, policy_version 380759 (0.0027) [2024-04-27 12:47:04,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6238470144. Throughput: 0: 53723.7. Samples: 728975820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:47:04,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 12:47:04,843][52263] Updated weights for policy 0, policy_version 380769 (0.0028) [2024-04-27 12:47:07,940][52263] Updated weights for policy 0, policy_version 380779 (0.0029) [2024-04-27 12:47:09,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6238715904. Throughput: 0: 53546.8. Samples: 729293700. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:47:09,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 12:47:10,093][52242] Signal inference workers to stop experience collection... (11000 times) [2024-04-27 12:47:10,139][52263] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-04-27 12:47:10,151][52242] Signal inference workers to resume experience collection... (11000 times) [2024-04-27 12:47:10,157][52263] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-04-27 12:47:10,885][52263] Updated weights for policy 0, policy_version 380789 (0.0031) [2024-04-27 12:47:14,021][52263] Updated weights for policy 0, policy_version 380799 (0.0030) [2024-04-27 12:47:14,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6239010816. Throughput: 0: 53658.1. Samples: 729460780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:47:14,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 12:47:16,958][52263] Updated weights for policy 0, policy_version 380809 (0.0032) [2024-04-27 12:47:19,106][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6239256576. Throughput: 0: 53712.1. Samples: 729782860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:47:19,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 12:47:20,055][52263] Updated weights for policy 0, policy_version 380819 (0.0025) [2024-04-27 12:47:23,193][52263] Updated weights for policy 0, policy_version 380829 (0.0032) [2024-04-27 12:47:24,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6239535104. Throughput: 0: 53684.5. Samples: 730105920. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:47:24,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 12:47:26,158][52263] Updated weights for policy 0, policy_version 380839 (0.0034) [2024-04-27 12:47:29,106][52031] Fps is (10 sec: 55706.2, 60 sec: 54067.2, 300 sec: 53428.5). Total num frames: 6239813632. Throughput: 0: 53482.1. Samples: 730260520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:47:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 12:47:29,297][52263] Updated weights for policy 0, policy_version 380849 (0.0028) [2024-04-27 12:47:32,351][52263] Updated weights for policy 0, policy_version 380859 (0.0025) [2024-04-27 12:47:34,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6240075776. Throughput: 0: 53516.4. Samples: 730583180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:47:34,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 12:47:34,310][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000380866_6240108544.pth... [2024-04-27 12:47:34,356][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000380080_6227230720.pth [2024-04-27 12:47:35,412][52263] Updated weights for policy 0, policy_version 380869 (0.0023) [2024-04-27 12:47:38,427][52263] Updated weights for policy 0, policy_version 380879 (0.0031) [2024-04-27 12:47:39,106][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6240354304. Throughput: 0: 53498.2. Samples: 730900160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:47:39,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 12:47:41,498][52263] Updated weights for policy 0, policy_version 380889 (0.0033) [2024-04-27 12:47:44,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.1, 300 sec: 53372.9). Total num frames: 6240600064. Throughput: 0: 53559.1. Samples: 731064440. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-27 12:47:44,107][52031] Avg episode reward: [(0, '0.489')] [2024-04-27 12:47:44,543][52263] Updated weights for policy 0, policy_version 380899 (0.0035) [2024-04-27 12:47:47,839][52263] Updated weights for policy 0, policy_version 380909 (0.0028) [2024-04-27 12:47:49,106][52031] Fps is (10 sec: 49152.2, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6240845824. Throughput: 0: 53507.5. Samples: 731383660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:47:49,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 12:47:50,770][52263] Updated weights for policy 0, policy_version 380919 (0.0034) [2024-04-27 12:47:54,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6241124352. Throughput: 0: 53525.2. Samples: 731702340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:47:54,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 12:47:54,114][52263] Updated weights for policy 0, policy_version 380929 (0.0027) [2024-04-27 12:47:56,943][52263] Updated weights for policy 0, policy_version 380939 (0.0031) [2024-04-27 12:47:59,106][52031] Fps is (10 sec: 58982.7, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6241435648. Throughput: 0: 53387.2. Samples: 731863200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:47:59,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 12:48:00,272][52263] Updated weights for policy 0, policy_version 380949 (0.0030) [2024-04-27 12:48:02,984][52263] Updated weights for policy 0, policy_version 380959 (0.0029) [2024-04-27 12:48:04,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6241681408. Throughput: 0: 53437.4. Samples: 732187540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:04,107][52031] Avg episode reward: [(0, '0.478')] [2024-04-27 12:48:06,363][52263] Updated weights for policy 0, policy_version 380969 (0.0028) [2024-04-27 12:48:08,748][52242] Signal inference workers to stop experience collection... (11050 times) [2024-04-27 12:48:08,748][52242] Signal inference workers to resume experience collection... (11050 times) [2024-04-27 12:48:08,759][52263] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-04-27 12:48:08,760][52263] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-04-27 12:48:09,000][52263] Updated weights for policy 0, policy_version 380979 (0.0025) [2024-04-27 12:48:09,106][52031] Fps is (10 sec: 52428.5, 60 sec: 54067.1, 300 sec: 53484.0). Total num frames: 6241959936. Throughput: 0: 53424.8. Samples: 732510040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:09,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:48:12,587][52263] Updated weights for policy 0, policy_version 380989 (0.0032) [2024-04-27 12:48:14,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6242205696. Throughput: 0: 53435.1. Samples: 732665100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:14,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 12:48:15,097][52263] Updated weights for policy 0, policy_version 380999 (0.0030) [2024-04-27 12:48:18,919][52263] Updated weights for policy 0, policy_version 381009 (0.0028) [2024-04-27 12:48:19,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6242467840. Throughput: 0: 53385.8. Samples: 732985540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:19,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 12:48:21,386][52263] Updated weights for policy 0, policy_version 381019 (0.0033) [2024-04-27 12:48:24,106][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6242746368. Throughput: 0: 53444.9. Samples: 733305180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:24,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 12:48:24,875][52263] Updated weights for policy 0, policy_version 381029 (0.0033) [2024-04-27 12:48:27,406][52263] Updated weights for policy 0, policy_version 381039 (0.0029) [2024-04-27 12:48:29,106][52031] Fps is (10 sec: 58982.5, 60 sec: 54067.1, 300 sec: 53650.7). Total num frames: 6243057664. Throughput: 0: 53549.4. Samples: 733474160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:29,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 12:48:31,127][52263] Updated weights for policy 0, policy_version 381049 (0.0028) [2024-04-27 12:48:33,598][52263] Updated weights for policy 0, policy_version 381059 (0.0030) [2024-04-27 12:48:34,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6243270656. Throughput: 0: 53449.0. Samples: 733788860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:34,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 12:48:37,226][52263] Updated weights for policy 0, policy_version 381069 (0.0025) [2024-04-27 12:48:39,106][52031] Fps is (10 sec: 47513.5, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6243532800. Throughput: 0: 53459.5. Samples: 734108020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:39,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 12:48:39,744][52263] Updated weights for policy 0, policy_version 381079 (0.0027) [2024-04-27 12:48:43,264][52263] Updated weights for policy 0, policy_version 381089 (0.0029) [2024-04-27 12:48:44,106][52031] Fps is (10 sec: 52428.3, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6243794944. Throughput: 0: 53198.1. Samples: 734257120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:44,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 12:48:46,267][52263] Updated weights for policy 0, policy_version 381099 (0.0033) [2024-04-27 12:48:49,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6244057088. Throughput: 0: 53050.2. Samples: 734574800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:49,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 12:48:49,475][52263] Updated weights for policy 0, policy_version 381109 (0.0029) [2024-04-27 12:48:52,418][52263] Updated weights for policy 0, policy_version 381119 (0.0026) [2024-04-27 12:48:54,106][52031] Fps is (10 sec: 57343.9, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6244368384. Throughput: 0: 53021.7. Samples: 734896020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:54,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 12:48:55,541][52263] Updated weights for policy 0, policy_version 381129 (0.0031) [2024-04-27 12:48:58,577][52263] Updated weights for policy 0, policy_version 381139 (0.0028) [2024-04-27 12:48:59,106][52031] Fps is (10 sec: 57343.7, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6244630528. Throughput: 0: 53399.4. Samples: 735068080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:48:59,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:49:01,664][52263] Updated weights for policy 0, policy_version 381149 (0.0027) [2024-04-27 12:49:04,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6244876288. Throughput: 0: 53457.8. Samples: 735391140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 12:49:04,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 12:49:04,545][52263] Updated weights for policy 0, policy_version 381159 (0.0033) [2024-04-27 12:49:05,161][52242] Signal inference workers to stop experience collection... (11100 times) [2024-04-27 12:49:05,201][52263] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-04-27 12:49:05,217][52242] Signal inference workers to resume experience collection... (11100 times) [2024-04-27 12:49:05,219][52263] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-04-27 12:49:07,610][52263] Updated weights for policy 0, policy_version 381169 (0.0034) [2024-04-27 12:49:09,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6245138432. Throughput: 0: 53417.4. Samples: 735708960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:09,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 12:49:10,787][52263] Updated weights for policy 0, policy_version 381179 (0.0033) [2024-04-27 12:49:13,870][52263] Updated weights for policy 0, policy_version 381189 (0.0035) [2024-04-27 12:49:14,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6245400576. Throughput: 0: 53201.9. Samples: 735868240. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:14,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 12:49:16,967][52263] Updated weights for policy 0, policy_version 381199 (0.0033) [2024-04-27 12:49:19,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6245679104. Throughput: 0: 53332.8. Samples: 736188840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:19,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 12:49:20,044][52263] Updated weights for policy 0, policy_version 381209 (0.0027) [2024-04-27 12:49:22,921][52263] Updated weights for policy 0, policy_version 381219 (0.0030) [2024-04-27 12:49:24,106][52031] Fps is (10 sec: 57343.7, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6245974016. Throughput: 0: 53321.0. Samples: 736507460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:24,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 12:49:26,068][52263] Updated weights for policy 0, policy_version 381229 (0.0031) [2024-04-27 12:49:28,849][52263] Updated weights for policy 0, policy_version 381239 (0.0032) [2024-04-27 12:49:29,106][52031] Fps is (10 sec: 54067.1, 60 sec: 52701.8, 300 sec: 53484.0). Total num frames: 6246219776. Throughput: 0: 53684.9. Samples: 736672940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:29,107][52031] Avg episode reward: [(0, '0.471')] [2024-04-27 12:49:32,150][52263] Updated weights for policy 0, policy_version 381249 (0.0032) [2024-04-27 12:49:34,106][52031] Fps is (10 sec: 49151.9, 60 sec: 53248.0, 300 sec: 53317.5). Total num frames: 6246465536. Throughput: 0: 53682.7. Samples: 736990520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:34,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 12:49:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000381254_6246465536.pth... 
[2024-04-27 12:49:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000380473_6233669632.pth [2024-04-27 12:49:35,100][52263] Updated weights for policy 0, policy_version 381259 (0.0028) [2024-04-27 12:49:38,256][52263] Updated weights for policy 0, policy_version 381269 (0.0032) [2024-04-27 12:49:39,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6246744064. Throughput: 0: 53628.1. Samples: 737309280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:39,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 12:49:41,369][52263] Updated weights for policy 0, policy_version 381279 (0.0030) [2024-04-27 12:49:44,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6247022592. Throughput: 0: 53365.8. Samples: 737469540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:44,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 12:49:44,433][52263] Updated weights for policy 0, policy_version 381289 (0.0032) [2024-04-27 12:49:47,477][52263] Updated weights for policy 0, policy_version 381299 (0.0026) [2024-04-27 12:49:49,106][52031] Fps is (10 sec: 55705.7, 60 sec: 54067.2, 300 sec: 53484.1). Total num frames: 6247301120. Throughput: 0: 53239.2. Samples: 737786900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:49,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 12:49:50,361][52263] Updated weights for policy 0, policy_version 381309 (0.0030) [2024-04-27 12:49:53,866][52263] Updated weights for policy 0, policy_version 381319 (0.0025) [2024-04-27 12:49:54,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6247546880. Throughput: 0: 53335.9. Samples: 738109080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:54,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 12:49:56,507][52263] Updated weights for policy 0, policy_version 381329 (0.0031) [2024-04-27 12:49:59,106][52031] Fps is (10 sec: 50790.3, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6247809024. Throughput: 0: 53290.6. Samples: 738266320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:49:59,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 12:49:59,813][52263] Updated weights for policy 0, policy_version 381339 (0.0031) [2024-04-27 12:50:02,503][52263] Updated weights for policy 0, policy_version 381349 (0.0032) [2024-04-27 12:50:04,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52975.1, 300 sec: 53261.9). Total num frames: 6248054784. Throughput: 0: 53276.1. Samples: 738586260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:50:04,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 12:50:05,600][52242] Signal inference workers to stop experience collection... (11150 times) [2024-04-27 12:50:05,601][52242] Signal inference workers to resume experience collection... (11150 times) [2024-04-27 12:50:05,628][52263] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-04-27 12:50:05,629][52263] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-04-27 12:50:05,928][52263] Updated weights for policy 0, policy_version 381359 (0.0028) [2024-04-27 12:50:08,793][52263] Updated weights for policy 0, policy_version 381369 (0.0028) [2024-04-27 12:50:09,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 6248349696. Throughput: 0: 53568.7. Samples: 738918060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:50:09,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 12:50:11,979][52263] Updated weights for policy 0, policy_version 381379 (0.0027) [2024-04-27 12:50:14,106][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6248628224. Throughput: 0: 53474.3. Samples: 739079280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:50:14,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 12:50:15,486][52263] Updated weights for policy 0, policy_version 381389 (0.0029) [2024-04-27 12:50:18,021][52263] Updated weights for policy 0, policy_version 381399 (0.0026) [2024-04-27 12:50:19,106][52031] Fps is (10 sec: 57344.4, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6248923136. Throughput: 0: 53692.4. Samples: 739406680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:50:19,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:50:21,571][52263] Updated weights for policy 0, policy_version 381409 (0.0032) [2024-04-27 12:50:23,936][52263] Updated weights for policy 0, policy_version 381419 (0.0028) [2024-04-27 12:50:24,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6249168896. Throughput: 0: 53828.4. Samples: 739731560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 12:50:24,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 12:50:27,570][52263] Updated weights for policy 0, policy_version 381429 (0.0035) [2024-04-27 12:50:29,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6249414656. Throughput: 0: 53711.6. Samples: 739886560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:50:29,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 12:50:30,132][52263] Updated weights for policy 0, policy_version 381439 (0.0029) [2024-04-27 12:50:33,746][52263] Updated weights for policy 0, policy_version 381449 (0.0029) [2024-04-27 12:50:34,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6249676800. Throughput: 0: 53814.6. Samples: 740208560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:50:34,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:50:36,286][52263] Updated weights for policy 0, policy_version 381459 (0.0034) [2024-04-27 12:50:39,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6249971712. Throughput: 0: 53838.8. Samples: 740531820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:50:39,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 12:50:39,838][52263] Updated weights for policy 0, policy_version 381469 (0.0028) [2024-04-27 12:50:42,454][52263] Updated weights for policy 0, policy_version 381479 (0.0031) [2024-04-27 12:50:44,106][52031] Fps is (10 sec: 58982.3, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 6250266624. Throughput: 0: 54009.7. Samples: 740696760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:50:44,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 12:50:45,919][52263] Updated weights for policy 0, policy_version 381489 (0.0035) [2024-04-27 12:50:48,516][52263] Updated weights for policy 0, policy_version 381499 (0.0028) [2024-04-27 12:50:49,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6250512384. Throughput: 0: 53991.5. Samples: 741015880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:50:49,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 12:50:51,991][52263] Updated weights for policy 0, policy_version 381509 (0.0033) [2024-04-27 12:50:54,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6250758144. Throughput: 0: 53786.8. Samples: 741338460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:50:54,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 12:50:54,719][52263] Updated weights for policy 0, policy_version 381519 (0.0041) [2024-04-27 12:50:58,255][52263] Updated weights for policy 0, policy_version 381529 (0.0028) [2024-04-27 12:50:59,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6251036672. Throughput: 0: 53733.3. Samples: 741497280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:50:59,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 12:51:00,846][52263] Updated weights for policy 0, policy_version 381539 (0.0032) [2024-04-27 12:51:04,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6251282432. Throughput: 0: 53573.3. Samples: 741817480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:04,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 12:51:04,197][52263] Updated weights for policy 0, policy_version 381549 (0.0028) [2024-04-27 12:51:06,874][52263] Updated weights for policy 0, policy_version 381559 (0.0027) [2024-04-27 12:51:09,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.3, 300 sec: 53484.0). Total num frames: 6251577344. Throughput: 0: 53449.4. Samples: 742136780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:09,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 12:51:10,459][52263] Updated weights for policy 0, policy_version 381569 (0.0031) [2024-04-27 12:51:10,785][52242] Signal inference workers to stop experience collection... (11200 times) [2024-04-27 12:51:10,790][52242] Signal inference workers to resume experience collection... (11200 times) [2024-04-27 12:51:10,818][52263] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-04-27 12:51:10,818][52263] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-04-27 12:51:12,865][52263] Updated weights for policy 0, policy_version 381579 (0.0028) [2024-04-27 12:51:14,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6251855872. Throughput: 0: 53657.8. Samples: 742301160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:14,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 12:51:16,546][52263] Updated weights for policy 0, policy_version 381589 (0.0030) [2024-04-27 12:51:18,954][52263] Updated weights for policy 0, policy_version 381599 (0.0025) [2024-04-27 12:51:19,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6252118016. Throughput: 0: 53750.3. Samples: 742627320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:19,107][52031] Avg episode reward: [(0, '0.496')] [2024-04-27 12:51:22,608][52263] Updated weights for policy 0, policy_version 381609 (0.0032) [2024-04-27 12:51:24,106][52031] Fps is (10 sec: 50790.1, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6252363776. Throughput: 0: 53644.9. Samples: 742945840. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:24,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 12:51:25,062][52263] Updated weights for policy 0, policy_version 381619 (0.0028) [2024-04-27 12:51:28,735][52263] Updated weights for policy 0, policy_version 381629 (0.0035) [2024-04-27 12:51:29,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6252625920. Throughput: 0: 53325.0. Samples: 743096380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:29,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 12:51:31,190][52263] Updated weights for policy 0, policy_version 381639 (0.0028) [2024-04-27 12:51:34,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6252888064. Throughput: 0: 53366.6. Samples: 743417380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:34,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 12:51:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000381646_6252888064.pth... [2024-04-27 12:51:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000380866_6240108544.pth [2024-04-27 12:51:34,714][52263] Updated weights for policy 0, policy_version 381649 (0.0030) [2024-04-27 12:51:37,252][52263] Updated weights for policy 0, policy_version 381659 (0.0030) [2024-04-27 12:51:39,106][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6253182976. Throughput: 0: 53384.8. Samples: 743740780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:39,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 12:51:40,874][52263] Updated weights for policy 0, policy_version 381669 (0.0030) [2024-04-27 12:51:43,341][52263] Updated weights for policy 0, policy_version 381679 (0.0028) [2024-04-27 12:51:44,106][52031] Fps is (10 sec: 57343.9, 60 sec: 53248.0, 300 sec: 53595.1). 
Total num frames: 6253461504. Throughput: 0: 53592.9. Samples: 743908960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:44,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 12:51:46,894][52263] Updated weights for policy 0, policy_version 381689 (0.0034) [2024-04-27 12:51:49,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6253707264. Throughput: 0: 53596.8. Samples: 744229340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:49,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:51:49,521][52263] Updated weights for policy 0, policy_version 381699 (0.0026) [2024-04-27 12:51:53,260][52263] Updated weights for policy 0, policy_version 381709 (0.0027) [2024-04-27 12:51:54,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6253969408. Throughput: 0: 53611.1. Samples: 744549280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:54,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 12:51:55,650][52263] Updated weights for policy 0, policy_version 381719 (0.0028) [2024-04-27 12:51:59,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6254215168. Throughput: 0: 53203.6. Samples: 744695320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:51:59,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 12:51:59,595][52263] Updated weights for policy 0, policy_version 381729 (0.0030) [2024-04-27 12:52:01,717][52263] Updated weights for policy 0, policy_version 381739 (0.0026) [2024-04-27 12:52:04,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6254510080. Throughput: 0: 52995.9. Samples: 745012140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:04,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 12:52:05,593][52263] Updated weights for policy 0, policy_version 381749 (0.0032) [2024-04-27 12:52:07,808][52263] Updated weights for policy 0, policy_version 381759 (0.0030) [2024-04-27 12:52:09,106][52031] Fps is (10 sec: 58981.8, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6254804992. Throughput: 0: 53104.4. Samples: 745335540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:09,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 12:52:11,644][52263] Updated weights for policy 0, policy_version 381769 (0.0034) [2024-04-27 12:52:13,903][52242] Signal inference workers to stop experience collection... (11250 times) [2024-04-27 12:52:13,903][52242] Signal inference workers to resume experience collection... (11250 times) [2024-04-27 12:52:13,931][52263] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-04-27 12:52:13,931][52263] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-04-27 12:52:14,038][52263] Updated weights for policy 0, policy_version 381779 (0.0036) [2024-04-27 12:52:14,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6255067136. Throughput: 0: 53645.3. Samples: 745510420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:14,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 12:52:17,818][52263] Updated weights for policy 0, policy_version 381789 (0.0031) [2024-04-27 12:52:19,106][52031] Fps is (10 sec: 49152.5, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6255296512. Throughput: 0: 53619.6. Samples: 745830260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:19,107][52031] Avg episode reward: [(0, '0.506')] [2024-04-27 12:52:20,143][52263] Updated weights for policy 0, policy_version 381799 (0.0026) [2024-04-27 12:52:23,907][52263] Updated weights for policy 0, policy_version 381809 (0.0028) [2024-04-27 12:52:24,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6255575040. Throughput: 0: 53510.3. Samples: 746148740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:24,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 12:52:26,291][52263] Updated weights for policy 0, policy_version 381819 (0.0027) [2024-04-27 12:52:29,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6255837184. Throughput: 0: 53009.2. Samples: 746294380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:29,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:52:29,898][52263] Updated weights for policy 0, policy_version 381829 (0.0032) [2024-04-27 12:52:32,325][52263] Updated weights for policy 0, policy_version 381839 (0.0030) [2024-04-27 12:52:34,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6256115712. Throughput: 0: 53004.0. Samples: 746614520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:34,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 12:52:35,888][52263] Updated weights for policy 0, policy_version 381849 (0.0029) [2024-04-27 12:52:38,423][52263] Updated weights for policy 0, policy_version 381859 (0.0032) [2024-04-27 12:52:39,106][52031] Fps is (10 sec: 57345.3, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6256410624. Throughput: 0: 53083.6. Samples: 746938040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:39,107][52031] Avg episode reward: [(0, '0.462')] [2024-04-27 12:52:42,114][52263] Updated weights for policy 0, policy_version 381869 (0.0032) [2024-04-27 12:52:44,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6256656384. Throughput: 0: 53697.6. Samples: 747111720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:44,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 12:52:44,415][52263] Updated weights for policy 0, policy_version 381879 (0.0033) [2024-04-27 12:52:48,358][52263] Updated weights for policy 0, policy_version 381889 (0.0028) [2024-04-27 12:52:49,106][52031] Fps is (10 sec: 49151.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6256902144. Throughput: 0: 53820.4. Samples: 747434060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:49,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 12:52:50,470][52263] Updated weights for policy 0, policy_version 381899 (0.0025) [2024-04-27 12:52:54,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6257164288. Throughput: 0: 53743.6. Samples: 747754000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:54,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 12:52:54,658][52263] Updated weights for policy 0, policy_version 381909 (0.0030) [2024-04-27 12:52:56,544][52263] Updated weights for policy 0, policy_version 381919 (0.0029) [2024-04-27 12:52:57,944][52242] Signal inference workers to stop experience collection... (11300 times) [2024-04-27 12:52:57,999][52263] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-04-27 12:52:58,006][52242] Signal inference workers to resume experience collection... 
(11300 times) [2024-04-27 12:52:58,012][52263] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-04-27 12:52:59,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6257442816. Throughput: 0: 53232.4. Samples: 747905880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:52:59,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:53:00,904][52263] Updated weights for policy 0, policy_version 381929 (0.0027) [2024-04-27 12:53:02,825][52263] Updated weights for policy 0, policy_version 381939 (0.0028) [2024-04-27 12:53:04,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6257737728. Throughput: 0: 53285.7. Samples: 748228120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 12:53:04,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 12:53:06,984][52263] Updated weights for policy 0, policy_version 381949 (0.0030) [2024-04-27 12:53:08,984][52263] Updated weights for policy 0, policy_version 381959 (0.0034) [2024-04-27 12:53:09,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6258016256. Throughput: 0: 53385.4. Samples: 748551080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:09,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 12:53:13,038][52263] Updated weights for policy 0, policy_version 381969 (0.0030) [2024-04-27 12:53:14,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6258245632. Throughput: 0: 53861.6. Samples: 748718140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:14,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 12:53:14,900][52263] Updated weights for policy 0, policy_version 381979 (0.0030) [2024-04-27 12:53:19,107][52031] Fps is (10 sec: 47512.9, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6258491392. Throughput: 0: 53922.6. Samples: 749041040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:19,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 12:53:19,126][52263] Updated weights for policy 0, policy_version 381989 (0.0030) [2024-04-27 12:53:21,027][52263] Updated weights for policy 0, policy_version 381999 (0.0031) [2024-04-27 12:53:24,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 6258802688. Throughput: 0: 54011.8. Samples: 749368580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:24,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 12:53:25,149][52263] Updated weights for policy 0, policy_version 382009 (0.0025) [2024-04-27 12:53:27,215][52263] Updated weights for policy 0, policy_version 382019 (0.0032) [2024-04-27 12:53:29,106][52031] Fps is (10 sec: 57344.9, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6259064832. Throughput: 0: 53487.3. Samples: 749518640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:29,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 12:53:31,236][52263] Updated weights for policy 0, policy_version 382029 (0.0030) [2024-04-27 12:53:33,405][52263] Updated weights for policy 0, policy_version 382039 (0.0030) [2024-04-27 12:53:34,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6259343360. Throughput: 0: 53506.6. Samples: 749841860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:34,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:53:34,121][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000382040_6259343360.pth... [2024-04-27 12:53:34,175][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000381254_6246465536.pth [2024-04-27 12:53:37,307][52263] Updated weights for policy 0, policy_version 382049 (0.0030) [2024-04-27 12:53:39,106][52031] Fps is (10 sec: 57343.5, 60 sec: 53794.1, 300 sec: 53706.2). 
Total num frames: 6259638272. Throughput: 0: 53650.6. Samples: 750168280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:39,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 12:53:39,359][52263] Updated weights for policy 0, policy_version 382059 (0.0026) [2024-04-27 12:53:43,322][52263] Updated weights for policy 0, policy_version 382069 (0.0029) [2024-04-27 12:53:43,842][52242] Signal inference workers to stop experience collection... (11350 times) [2024-04-27 12:53:43,842][52242] Signal inference workers to resume experience collection... (11350 times) [2024-04-27 12:53:43,865][52263] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-04-27 12:53:43,865][52263] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-04-27 12:53:44,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6259851264. Throughput: 0: 53987.1. Samples: 750335300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:44,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 12:53:45,324][52263] Updated weights for policy 0, policy_version 382079 (0.0029) [2024-04-27 12:53:49,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6260129792. Throughput: 0: 53932.0. Samples: 750655060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:49,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 12:53:49,413][52263] Updated weights for policy 0, policy_version 382089 (0.0029) [2024-04-27 12:53:51,637][52263] Updated weights for policy 0, policy_version 382099 (0.0036) [2024-04-27 12:53:54,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6260391936. Throughput: 0: 53918.5. Samples: 750977420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:54,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 12:53:55,395][52263] Updated weights for policy 0, policy_version 382109 (0.0032) [2024-04-27 12:53:57,914][52263] Updated weights for policy 0, policy_version 382119 (0.0028) [2024-04-27 12:53:59,106][52031] Fps is (10 sec: 55705.6, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6260686848. Throughput: 0: 53823.9. Samples: 751140220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:53:59,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 12:54:01,318][52263] Updated weights for policy 0, policy_version 382129 (0.0037) [2024-04-27 12:54:03,981][52263] Updated weights for policy 0, policy_version 382139 (0.0029) [2024-04-27 12:54:04,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6260965376. Throughput: 0: 53859.0. Samples: 751464700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:54:04,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 12:54:07,440][52263] Updated weights for policy 0, policy_version 382149 (0.0036) [2024-04-27 12:54:09,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6261243904. Throughput: 0: 53794.8. Samples: 751789340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:54:09,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 12:54:09,997][52263] Updated weights for policy 0, policy_version 382159 (0.0031) [2024-04-27 12:54:13,441][52263] Updated weights for policy 0, policy_version 382169 (0.0026) [2024-04-27 12:54:14,106][52031] Fps is (10 sec: 52429.8, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6261489664. Throughput: 0: 54100.4. Samples: 751953160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:54:14,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 12:54:16,153][52263] Updated weights for policy 0, policy_version 382179 (0.0030) [2024-04-27 12:54:19,106][52031] Fps is (10 sec: 49152.1, 60 sec: 54067.3, 300 sec: 53428.5). Total num frames: 6261735424. Throughput: 0: 54059.7. Samples: 752274540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:54:19,115][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 12:54:19,512][52263] Updated weights for policy 0, policy_version 382189 (0.0027) [2024-04-27 12:54:22,361][52263] Updated weights for policy 0, policy_version 382199 (0.0030) [2024-04-27 12:54:24,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6262013952. Throughput: 0: 53985.4. Samples: 752597620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-27 12:54:24,115][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 12:54:25,640][52263] Updated weights for policy 0, policy_version 382209 (0.0029) [2024-04-27 12:54:28,352][52263] Updated weights for policy 0, policy_version 382219 (0.0034) [2024-04-27 12:54:29,106][52031] Fps is (10 sec: 57343.4, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 6262308864. Throughput: 0: 53747.6. Samples: 752753940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:54:29,116][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 12:54:31,602][52263] Updated weights for policy 0, policy_version 382229 (0.0030) [2024-04-27 12:54:32,987][52242] Signal inference workers to stop experience collection... (11400 times) [2024-04-27 12:54:33,017][52263] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-04-27 12:54:33,051][52242] Signal inference workers to resume experience collection... 
(11400 times) [2024-04-27 12:54:33,052][52263] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-04-27 12:54:34,107][52031] Fps is (10 sec: 57342.9, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 6262587392. Throughput: 0: 53980.3. Samples: 753084180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:54:34,115][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 12:54:34,337][52263] Updated weights for policy 0, policy_version 382239 (0.0032) [2024-04-27 12:54:37,673][52263] Updated weights for policy 0, policy_version 382249 (0.0030) [2024-04-27 12:54:39,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6262833152. Throughput: 0: 53954.3. Samples: 753405360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:54:39,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 12:54:40,513][52263] Updated weights for policy 0, policy_version 382259 (0.0026) [2024-04-27 12:54:43,614][52263] Updated weights for policy 0, policy_version 382269 (0.0032) [2024-04-27 12:54:44,107][52031] Fps is (10 sec: 50790.7, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6263095296. Throughput: 0: 53840.3. Samples: 753563040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:54:44,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 12:54:46,827][52263] Updated weights for policy 0, policy_version 382279 (0.0034) [2024-04-27 12:54:49,107][52031] Fps is (10 sec: 54066.9, 60 sec: 54067.1, 300 sec: 53650.7). Total num frames: 6263373824. Throughput: 0: 53751.6. Samples: 753883520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:54:49,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 12:54:49,769][52263] Updated weights for policy 0, policy_version 382289 (0.0027) [2024-04-27 12:54:52,844][52263] Updated weights for policy 0, policy_version 382299 (0.0029) [2024-04-27 12:54:54,106][52031] Fps is (10 sec: 54067.8, 60 sec: 54067.3, 300 sec: 53650.7). 
Total num frames: 6263635968. Throughput: 0: 53777.7. Samples: 754209340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:54:54,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 12:54:55,801][52263] Updated weights for policy 0, policy_version 382309 (0.0032) [2024-04-27 12:54:58,949][52263] Updated weights for policy 0, policy_version 382319 (0.0034) [2024-04-27 12:54:59,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6263914496. Throughput: 0: 53706.3. Samples: 754369940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:54:59,107][52031] Avg episode reward: [(0, '0.495')] [2024-04-27 12:55:02,056][52263] Updated weights for policy 0, policy_version 382329 (0.0031) [2024-04-27 12:55:04,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6264176640. Throughput: 0: 53663.1. Samples: 754689380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:55:04,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 12:55:05,210][52263] Updated weights for policy 0, policy_version 382339 (0.0029) [2024-04-27 12:55:08,141][52263] Updated weights for policy 0, policy_version 382349 (0.0028) [2024-04-27 12:55:09,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6264455168. Throughput: 0: 53567.9. Samples: 755008180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:55:09,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 12:55:11,227][52263] Updated weights for policy 0, policy_version 382359 (0.0029) [2024-04-27 12:55:14,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6264700928. Throughput: 0: 53709.4. Samples: 755170860. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:55:14,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 12:55:14,501][52263] Updated weights for policy 0, policy_version 382369 (0.0032) [2024-04-27 12:55:17,170][52263] Updated weights for policy 0, policy_version 382379 (0.0031) [2024-04-27 12:55:19,106][52031] Fps is (10 sec: 52429.0, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6264979456. Throughput: 0: 53525.5. Samples: 755492820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:55:19,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 12:55:20,666][52263] Updated weights for policy 0, policy_version 382389 (0.0031) [2024-04-27 12:55:23,416][52263] Updated weights for policy 0, policy_version 382399 (0.0028) [2024-04-27 12:55:24,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6265241600. Throughput: 0: 53424.1. Samples: 755809440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:55:24,107][52031] Avg episode reward: [(0, '0.668')] [2024-04-27 12:55:26,829][52263] Updated weights for policy 0, policy_version 382409 (0.0029) [2024-04-27 12:55:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6265520128. Throughput: 0: 53568.2. Samples: 755973600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:55:29,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:55:29,602][52263] Updated weights for policy 0, policy_version 382419 (0.0041) [2024-04-27 12:55:32,783][52263] Updated weights for policy 0, policy_version 382429 (0.0027) [2024-04-27 12:55:34,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6265782272. Throughput: 0: 53548.0. Samples: 756293180. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:55:34,107][52031] Avg episode reward: [(0, '0.673')] [2024-04-27 12:55:34,128][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000382434_6265798656.pth... [2024-04-27 12:55:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000381646_6252888064.pth [2024-04-27 12:55:35,771][52263] Updated weights for policy 0, policy_version 382439 (0.0032) [2024-04-27 12:55:38,719][52263] Updated weights for policy 0, policy_version 382449 (0.0030) [2024-04-27 12:55:39,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6266044416. Throughput: 0: 53468.9. Samples: 756615440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 12:55:39,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 12:55:41,722][52263] Updated weights for policy 0, policy_version 382459 (0.0034) [2024-04-27 12:55:44,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6266322944. Throughput: 0: 53453.2. Samples: 756775340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:55:44,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:55:44,948][52263] Updated weights for policy 0, policy_version 382469 (0.0034) [2024-04-27 12:55:47,848][52263] Updated weights for policy 0, policy_version 382479 (0.0031) [2024-04-27 12:55:49,107][52031] Fps is (10 sec: 52427.0, 60 sec: 53247.8, 300 sec: 53595.1). Total num frames: 6266568704. Throughput: 0: 53483.5. Samples: 757096160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:55:49,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 12:55:51,106][52263] Updated weights for policy 0, policy_version 382489 (0.0031) [2024-04-27 12:55:51,960][52242] Signal inference workers to stop experience collection... (11450 times) [2024-04-27 12:55:51,961][52242] Signal inference workers to resume experience collection... 
(11450 times) [2024-04-27 12:55:51,973][52263] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-04-27 12:55:51,973][52263] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-04-27 12:55:54,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6266863616. Throughput: 0: 53592.4. Samples: 757419840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:55:54,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 12:55:54,108][52263] Updated weights for policy 0, policy_version 382499 (0.0031) [2024-04-27 12:55:57,212][52263] Updated weights for policy 0, policy_version 382509 (0.0030) [2024-04-27 12:55:59,106][52031] Fps is (10 sec: 55707.8, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6267125760. Throughput: 0: 53572.5. Samples: 757581620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:55:59,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 12:56:00,036][52263] Updated weights for policy 0, policy_version 382519 (0.0029) [2024-04-27 12:56:03,412][52263] Updated weights for policy 0, policy_version 382529 (0.0029) [2024-04-27 12:56:04,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6267404288. Throughput: 0: 53605.3. Samples: 757905060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:04,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 12:56:06,001][52263] Updated weights for policy 0, policy_version 382539 (0.0031) [2024-04-27 12:56:09,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6267666432. Throughput: 0: 53788.7. Samples: 758229940. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:09,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 12:56:09,455][52263] Updated weights for policy 0, policy_version 382549 (0.0041) [2024-04-27 12:56:12,043][52263] Updated weights for policy 0, policy_version 382559 (0.0037) [2024-04-27 12:56:14,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6267928576. Throughput: 0: 53546.2. Samples: 758383180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:14,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 12:56:15,559][52263] Updated weights for policy 0, policy_version 382569 (0.0028) [2024-04-27 12:56:18,505][52263] Updated weights for policy 0, policy_version 382579 (0.0024) [2024-04-27 12:56:19,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6268190720. Throughput: 0: 53701.8. Samples: 758709760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:19,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 12:56:21,619][52263] Updated weights for policy 0, policy_version 382589 (0.0032) [2024-04-27 12:56:24,107][52031] Fps is (10 sec: 55705.2, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 6268485632. Throughput: 0: 53759.9. Samples: 759034640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:24,107][52031] Avg episode reward: [(0, '0.663')] [2024-04-27 12:56:24,426][52263] Updated weights for policy 0, policy_version 382599 (0.0028) [2024-04-27 12:56:27,601][52263] Updated weights for policy 0, policy_version 382609 (0.0028) [2024-04-27 12:56:29,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6268747776. Throughput: 0: 53893.9. Samples: 759200560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:29,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 12:56:30,977][52263] Updated weights for policy 0, policy_version 382619 (0.0029) [2024-04-27 12:56:33,796][52263] Updated weights for policy 0, policy_version 382629 (0.0029) [2024-04-27 12:56:34,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6269009920. Throughput: 0: 53911.0. Samples: 759522140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:34,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 12:56:36,870][52263] Updated weights for policy 0, policy_version 382639 (0.0033) [2024-04-27 12:56:39,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6269272064. Throughput: 0: 53926.3. Samples: 759846520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:39,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 12:56:39,934][52263] Updated weights for policy 0, policy_version 382649 (0.0026) [2024-04-27 12:56:42,831][52263] Updated weights for policy 0, policy_version 382659 (0.0028) [2024-04-27 12:56:44,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6269550592. Throughput: 0: 53885.2. Samples: 760006460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:44,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 12:56:45,989][52263] Updated weights for policy 0, policy_version 382669 (0.0029) [2024-04-27 12:56:46,902][52242] Signal inference workers to stop experience collection... (11500 times) [2024-04-27 12:56:46,902][52242] Signal inference workers to resume experience collection... 
(11500 times) [2024-04-27 12:56:46,925][52263] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-04-27 12:56:46,925][52263] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-04-27 12:56:48,951][52263] Updated weights for policy 0, policy_version 382679 (0.0032) [2024-04-27 12:56:49,106][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.5, 300 sec: 53706.2). Total num frames: 6269812736. Throughput: 0: 53836.9. Samples: 760327720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:49,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 12:56:52,195][52263] Updated weights for policy 0, policy_version 382689 (0.0028) [2024-04-27 12:56:54,106][52031] Fps is (10 sec: 55706.2, 60 sec: 54067.3, 300 sec: 53872.8). Total num frames: 6270107648. Throughput: 0: 53695.7. Samples: 760646240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:54,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 12:56:55,050][52263] Updated weights for policy 0, policy_version 382699 (0.0027) [2024-04-27 12:56:58,367][52263] Updated weights for policy 0, policy_version 382709 (0.0027) [2024-04-27 12:56:59,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6270353408. Throughput: 0: 53867.4. Samples: 760807220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 12:56:59,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:57:01,024][52263] Updated weights for policy 0, policy_version 382719 (0.0032) [2024-04-27 12:57:04,107][52031] Fps is (10 sec: 49151.3, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6270599168. Throughput: 0: 53710.6. Samples: 761126740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:04,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 12:57:04,472][52263] Updated weights for policy 0, policy_version 382729 (0.0033) [2024-04-27 12:57:07,253][52263] Updated weights for policy 0, policy_version 382739 (0.0024) [2024-04-27 12:57:09,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6270877696. Throughput: 0: 53674.2. Samples: 761449980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:09,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 12:57:10,439][52263] Updated weights for policy 0, policy_version 382749 (0.0029) [2024-04-27 12:57:13,380][52263] Updated weights for policy 0, policy_version 382759 (0.0035) [2024-04-27 12:57:14,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6271156224. Throughput: 0: 53554.3. Samples: 761610500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:14,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 12:57:16,428][52263] Updated weights for policy 0, policy_version 382769 (0.0030) [2024-04-27 12:57:19,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6271418368. Throughput: 0: 53521.0. Samples: 761930580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:19,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 12:57:19,474][52263] Updated weights for policy 0, policy_version 382779 (0.0029) [2024-04-27 12:57:22,659][52263] Updated weights for policy 0, policy_version 382789 (0.0032) [2024-04-27 12:57:24,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.1, 300 sec: 53761.8). Total num frames: 6271696896. Throughput: 0: 53492.8. Samples: 762253700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:24,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 12:57:25,412][52263] Updated weights for policy 0, policy_version 382799 (0.0034) [2024-04-27 12:57:28,763][52263] Updated weights for policy 0, policy_version 382809 (0.0028) [2024-04-27 12:57:29,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53520.9, 300 sec: 53706.2). Total num frames: 6271959040. Throughput: 0: 53607.1. Samples: 762418780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:29,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 12:57:31,493][52263] Updated weights for policy 0, policy_version 382819 (0.0038) [2024-04-27 12:57:34,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6272221184. Throughput: 0: 53611.9. Samples: 762740260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:34,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 12:57:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000382826_6272221184.pth... [2024-04-27 12:57:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000382040_6259343360.pth [2024-04-27 12:57:34,869][52263] Updated weights for policy 0, policy_version 382829 (0.0031) [2024-04-27 12:57:37,588][52263] Updated weights for policy 0, policy_version 382839 (0.0032) [2024-04-27 12:57:39,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6272483328. Throughput: 0: 53607.1. Samples: 763058560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:39,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 12:57:40,895][52263] Updated weights for policy 0, policy_version 382849 (0.0027) [2024-04-27 12:57:43,487][52263] Updated weights for policy 0, policy_version 382859 (0.0033) [2024-04-27 12:57:44,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53761.7). 
Total num frames: 6272761856. Throughput: 0: 53595.9. Samples: 763219040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:44,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 12:57:47,077][52263] Updated weights for policy 0, policy_version 382869 (0.0038) [2024-04-27 12:57:47,444][52242] Signal inference workers to stop experience collection... (11550 times) [2024-04-27 12:57:47,445][52242] Signal inference workers to resume experience collection... (11550 times) [2024-04-27 12:57:47,458][52263] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-04-27 12:57:47,459][52263] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-04-27 12:57:49,106][52031] Fps is (10 sec: 57343.9, 60 sec: 54067.2, 300 sec: 53872.8). Total num frames: 6273056768. Throughput: 0: 53625.4. Samples: 763539880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:49,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 12:57:50,059][52263] Updated weights for policy 0, policy_version 382879 (0.0034) [2024-04-27 12:57:53,217][52263] Updated weights for policy 0, policy_version 382889 (0.0027) [2024-04-27 12:57:54,107][52031] Fps is (10 sec: 52429.2, 60 sec: 52974.8, 300 sec: 53706.2). Total num frames: 6273286144. Throughput: 0: 53583.6. Samples: 763861240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:54,108][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 12:57:56,010][52263] Updated weights for policy 0, policy_version 382899 (0.0032) [2024-04-27 12:57:59,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6273564672. Throughput: 0: 53589.7. Samples: 764022040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:57:59,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 12:57:59,256][52263] Updated weights for policy 0, policy_version 382909 (0.0033) [2024-04-27 12:58:02,149][52263] Updated weights for policy 0, policy_version 382919 (0.0028) [2024-04-27 12:58:04,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6273810432. Throughput: 0: 53544.0. Samples: 764340060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:58:04,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 12:58:05,382][52263] Updated weights for policy 0, policy_version 382929 (0.0029) [2024-04-27 12:58:08,338][52263] Updated weights for policy 0, policy_version 382939 (0.0028) [2024-04-27 12:58:09,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6274088960. Throughput: 0: 53482.6. Samples: 764660420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:58:09,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 12:58:11,557][52263] Updated weights for policy 0, policy_version 382949 (0.0028) [2024-04-27 12:58:14,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.1, 300 sec: 53872.8). Total num frames: 6274383872. Throughput: 0: 53449.1. Samples: 764823980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:58:14,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 12:58:14,703][52263] Updated weights for policy 0, policy_version 382959 (0.0025) [2024-04-27 12:58:17,719][52263] Updated weights for policy 0, policy_version 382969 (0.0034) [2024-04-27 12:58:19,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6274646016. Throughput: 0: 53461.8. Samples: 765146040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 12:58:19,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 12:58:20,735][52263] Updated weights for policy 0, policy_version 382979 (0.0029) [2024-04-27 12:58:23,753][52263] Updated weights for policy 0, policy_version 382989 (0.0037) [2024-04-27 12:58:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6274908160. Throughput: 0: 53460.5. Samples: 765464280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:58:24,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 12:58:26,715][52263] Updated weights for policy 0, policy_version 382999 (0.0031) [2024-04-27 12:58:29,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6275153920. Throughput: 0: 53308.9. Samples: 765617940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:58:29,107][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 12:58:29,881][52263] Updated weights for policy 0, policy_version 383009 (0.0027) [2024-04-27 12:58:32,908][52263] Updated weights for policy 0, policy_version 383019 (0.0034) [2024-04-27 12:58:34,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6275416064. Throughput: 0: 53201.4. Samples: 765933940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:58:34,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 12:58:36,052][52263] Updated weights for policy 0, policy_version 383029 (0.0026) [2024-04-27 12:58:39,094][52263] Updated weights for policy 0, policy_version 383039 (0.0029) [2024-04-27 12:58:39,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.0, 300 sec: 53761.7). Total num frames: 6275710976. Throughput: 0: 53156.4. Samples: 766253280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:58:39,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 12:58:42,134][52263] Updated weights for policy 0, policy_version 383049 (0.0033) [2024-04-27 12:58:44,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6275989504. Throughput: 0: 53491.5. Samples: 766429160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:58:44,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 12:58:45,123][52263] Updated weights for policy 0, policy_version 383059 (0.0027) [2024-04-27 12:58:48,055][52242] Signal inference workers to stop experience collection... (11600 times) [2024-04-27 12:58:48,067][52263] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-04-27 12:58:48,120][52242] Signal inference workers to resume experience collection... (11600 times) [2024-04-27 12:58:48,120][52263] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-04-27 12:58:48,234][52263] Updated weights for policy 0, policy_version 383069 (0.0027) [2024-04-27 12:58:49,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52975.0, 300 sec: 53706.2). Total num frames: 6276235264. Throughput: 0: 53556.0. Samples: 766750080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:58:49,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 12:58:51,150][52263] Updated weights for policy 0, policy_version 383079 (0.0032) [2024-04-27 12:58:54,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6276497408. Throughput: 0: 53613.4. Samples: 767073020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:58:54,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 12:58:54,375][52263] Updated weights for policy 0, policy_version 383089 (0.0033) [2024-04-27 12:58:57,364][52263] Updated weights for policy 0, policy_version 383099 (0.0030) [2024-04-27 12:58:59,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6276759552. Throughput: 0: 53120.9. Samples: 767214420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:58:59,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 12:59:00,469][52263] Updated weights for policy 0, policy_version 383109 (0.0027) [2024-04-27 12:59:03,533][52263] Updated weights for policy 0, policy_version 383119 (0.0032) [2024-04-27 12:59:04,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6277021696. Throughput: 0: 53191.2. Samples: 767539640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:59:04,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 12:59:06,546][52263] Updated weights for policy 0, policy_version 383129 (0.0034) [2024-04-27 12:59:09,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6277316608. Throughput: 0: 53369.8. Samples: 767865920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:59:09,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 12:59:09,764][52263] Updated weights for policy 0, policy_version 383139 (0.0028) [2024-04-27 12:59:12,706][52263] Updated weights for policy 0, policy_version 383149 (0.0029) [2024-04-27 12:59:14,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.0, 300 sec: 53706.2). Total num frames: 6277578752. Throughput: 0: 53759.3. Samples: 768037100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:59:14,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 12:59:15,939][52263] Updated weights for policy 0, policy_version 383159 (0.0028) [2024-04-27 12:59:18,816][52263] Updated weights for policy 0, policy_version 383169 (0.0030) [2024-04-27 12:59:19,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53248.0, 300 sec: 53650.6). Total num frames: 6277840896. Throughput: 0: 53843.0. Samples: 768356880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:59:19,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 12:59:22,111][52263] Updated weights for policy 0, policy_version 383179 (0.0027) [2024-04-27 12:59:24,107][52031] Fps is (10 sec: 50789.3, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6278086656. Throughput: 0: 53825.3. Samples: 768675420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:59:24,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 12:59:24,908][52263] Updated weights for policy 0, policy_version 383189 (0.0029) [2024-04-27 12:59:28,286][52263] Updated weights for policy 0, policy_version 383199 (0.0026) [2024-04-27 12:59:29,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6278365184. Throughput: 0: 53337.8. Samples: 768829360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:59:29,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 12:59:30,970][52263] Updated weights for policy 0, policy_version 383209 (0.0025) [2024-04-27 12:59:34,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6278643712. Throughput: 0: 53388.3. Samples: 769152560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:59:34,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 12:59:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000383218_6278643712.pth... 
[2024-04-27 12:59:34,166][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000382434_6265798656.pth [2024-04-27 12:59:34,531][52263] Updated weights for policy 0, policy_version 383219 (0.0039) [2024-04-27 12:59:37,306][52263] Updated weights for policy 0, policy_version 383229 (0.0028) [2024-04-27 12:59:39,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6278938624. Throughput: 0: 53364.3. Samples: 769474420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 12:59:39,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 12:59:40,658][52263] Updated weights for policy 0, policy_version 383239 (0.0030) [2024-04-27 12:59:43,362][52263] Updated weights for policy 0, policy_version 383249 (0.0032) [2024-04-27 12:59:44,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6279184384. Throughput: 0: 54020.8. Samples: 769645360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:59:44,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 12:59:46,705][52263] Updated weights for policy 0, policy_version 383259 (0.0027) [2024-04-27 12:59:49,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6279446528. Throughput: 0: 53891.6. Samples: 769964760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:59:49,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 12:59:49,341][52263] Updated weights for policy 0, policy_version 383269 (0.0034) [2024-04-27 12:59:52,678][52263] Updated weights for policy 0, policy_version 383279 (0.0032) [2024-04-27 12:59:54,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53520.9, 300 sec: 53539.5). Total num frames: 6279708672. Throughput: 0: 53763.3. Samples: 770285280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:59:54,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 12:59:54,556][52242] Signal inference workers to stop experience collection... (11650 times) [2024-04-27 12:59:54,596][52263] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-04-27 12:59:54,655][52242] Signal inference workers to resume experience collection... (11650 times) [2024-04-27 12:59:54,655][52263] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-04-27 12:59:55,453][52263] Updated weights for policy 0, policy_version 383289 (0.0031) [2024-04-27 12:59:58,701][52263] Updated weights for policy 0, policy_version 383299 (0.0027) [2024-04-27 12:59:59,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6279987200. Throughput: 0: 53388.4. Samples: 770439580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 12:59:59,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 13:00:01,527][52263] Updated weights for policy 0, policy_version 383309 (0.0028) [2024-04-27 13:00:04,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6280249344. Throughput: 0: 53475.5. Samples: 770763280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:04,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 13:00:04,886][52263] Updated weights for policy 0, policy_version 383319 (0.0027) [2024-04-27 13:00:07,607][52263] Updated weights for policy 0, policy_version 383329 (0.0031) [2024-04-27 13:00:09,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6280544256. Throughput: 0: 53536.2. Samples: 771084540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:09,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 13:00:11,182][52263] Updated weights for policy 0, policy_version 383339 (0.0030) [2024-04-27 13:00:13,621][52263] Updated weights for policy 0, policy_version 383349 (0.0034) [2024-04-27 13:00:14,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6280806400. Throughput: 0: 53758.7. Samples: 771248500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:14,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 13:00:17,150][52263] Updated weights for policy 0, policy_version 383359 (0.0031) [2024-04-27 13:00:19,107][52031] Fps is (10 sec: 50788.4, 60 sec: 53520.8, 300 sec: 53595.0). Total num frames: 6281052160. Throughput: 0: 53742.3. Samples: 771570980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:19,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 13:00:19,873][52263] Updated weights for policy 0, policy_version 383369 (0.0029) [2024-04-27 13:00:23,298][52263] Updated weights for policy 0, policy_version 383379 (0.0031) [2024-04-27 13:00:24,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6281314304. Throughput: 0: 53709.3. Samples: 771891340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:24,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:00:25,875][52263] Updated weights for policy 0, policy_version 383389 (0.0032) [2024-04-27 13:00:29,107][52031] Fps is (10 sec: 54068.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6281592832. Throughput: 0: 53324.4. Samples: 772044960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:29,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 13:00:29,419][52263] Updated weights for policy 0, policy_version 383399 (0.0027) [2024-04-27 13:00:32,033][52263] Updated weights for policy 0, policy_version 383409 (0.0033) [2024-04-27 13:00:34,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.2, 300 sec: 53650.6). Total num frames: 6281871360. Throughput: 0: 53302.1. Samples: 772363360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:34,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 13:00:35,686][52263] Updated weights for policy 0, policy_version 383419 (0.0027) [2024-04-27 13:00:38,250][52263] Updated weights for policy 0, policy_version 383429 (0.0036) [2024-04-27 13:00:39,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6282133504. Throughput: 0: 53266.5. Samples: 772682260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:39,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 13:00:41,724][52263] Updated weights for policy 0, policy_version 383439 (0.0031) [2024-04-27 13:00:44,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6282395648. Throughput: 0: 53600.0. Samples: 772851580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:44,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 13:00:44,441][52263] Updated weights for policy 0, policy_version 383449 (0.0027) [2024-04-27 13:00:47,688][52263] Updated weights for policy 0, policy_version 383459 (0.0027) [2024-04-27 13:00:49,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6282641408. Throughput: 0: 53517.1. Samples: 773171540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:49,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 13:00:50,508][52263] Updated weights for policy 0, policy_version 383469 (0.0026) [2024-04-27 13:00:53,817][52263] Updated weights for policy 0, policy_version 383479 (0.0028) [2024-04-27 13:00:54,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6282936320. Throughput: 0: 53499.1. Samples: 773492000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:54,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:00:55,319][52242] Signal inference workers to stop experience collection... (11700 times) [2024-04-27 13:00:55,319][52242] Signal inference workers to resume experience collection... (11700 times) [2024-04-27 13:00:55,334][52263] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-04-27 13:00:55,335][52263] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-04-27 13:00:56,478][52263] Updated weights for policy 0, policy_version 383489 (0.0043) [2024-04-27 13:00:59,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6283182080. Throughput: 0: 53354.3. Samples: 773649440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 13:00:59,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 13:00:59,908][52263] Updated weights for policy 0, policy_version 383499 (0.0036) [2024-04-27 13:01:02,429][52263] Updated weights for policy 0, policy_version 383509 (0.0028) [2024-04-27 13:01:04,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6283476992. Throughput: 0: 53287.6. Samples: 773968900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:04,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 13:01:05,996][52263] Updated weights for policy 0, policy_version 383519 (0.0030) [2024-04-27 13:01:08,512][52263] Updated weights for policy 0, policy_version 383529 (0.0029) [2024-04-27 13:01:09,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6283739136. Throughput: 0: 53486.2. Samples: 774298220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:09,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 13:01:11,966][52263] Updated weights for policy 0, policy_version 383539 (0.0027) [2024-04-27 13:01:14,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6284017664. Throughput: 0: 53881.5. Samples: 774469620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:14,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 13:01:14,885][52263] Updated weights for policy 0, policy_version 383549 (0.0032) [2024-04-27 13:01:18,048][52263] Updated weights for policy 0, policy_version 383559 (0.0026) [2024-04-27 13:01:19,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.4, 300 sec: 53484.0). Total num frames: 6284263424. Throughput: 0: 53900.9. Samples: 774788900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:19,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 13:01:21,096][52263] Updated weights for policy 0, policy_version 383569 (0.0032) [2024-04-27 13:01:24,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6284541952. Throughput: 0: 54010.1. Samples: 775112720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:24,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 13:01:24,239][52263] Updated weights for policy 0, policy_version 383579 (0.0029) [2024-04-27 13:01:27,282][52263] Updated weights for policy 0, policy_version 383589 (0.0031) [2024-04-27 13:01:29,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6284820480. Throughput: 0: 53801.8. Samples: 775272660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:29,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 13:01:30,227][52263] Updated weights for policy 0, policy_version 383599 (0.0032) [2024-04-27 13:01:33,585][52263] Updated weights for policy 0, policy_version 383609 (0.0029) [2024-04-27 13:01:34,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6285082624. Throughput: 0: 53831.5. Samples: 775593960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:34,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 13:01:34,206][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000383612_6285099008.pth... [2024-04-27 13:01:34,250][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000382826_6272221184.pth [2024-04-27 13:01:36,147][52263] Updated weights for policy 0, policy_version 383619 (0.0031) [2024-04-27 13:01:39,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6285344768. Throughput: 0: 53766.7. Samples: 775911500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:39,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 13:01:39,664][52263] Updated weights for policy 0, policy_version 383629 (0.0031) [2024-04-27 13:01:42,425][52263] Updated weights for policy 0, policy_version 383639 (0.0028) [2024-04-27 13:01:44,106][52031] Fps is (10 sec: 55705.3, 60 sec: 54067.1, 300 sec: 53650.7). 
Total num frames: 6285639680. Throughput: 0: 54007.5. Samples: 776079780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:44,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 13:01:45,743][52263] Updated weights for policy 0, policy_version 383649 (0.0028) [2024-04-27 13:01:48,625][52263] Updated weights for policy 0, policy_version 383659 (0.0030) [2024-04-27 13:01:49,106][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.1, 300 sec: 53484.0). Total num frames: 6285885440. Throughput: 0: 54111.5. Samples: 776403920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:49,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 13:01:51,916][52263] Updated weights for policy 0, policy_version 383669 (0.0031) [2024-04-27 13:01:54,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6286147584. Throughput: 0: 53921.4. Samples: 776724680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:54,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 13:01:54,940][52263] Updated weights for policy 0, policy_version 383679 (0.0027) [2024-04-27 13:01:58,003][52263] Updated weights for policy 0, policy_version 383689 (0.0031) [2024-04-27 13:01:59,106][52031] Fps is (10 sec: 54067.5, 60 sec: 54067.1, 300 sec: 53650.7). Total num frames: 6286426112. Throughput: 0: 53566.6. Samples: 776880120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:01:59,107][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 13:02:00,925][52263] Updated weights for policy 0, policy_version 383699 (0.0033) [2024-04-27 13:02:04,086][52263] Updated weights for policy 0, policy_version 383709 (0.0033) [2024-04-27 13:02:04,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6286688256. Throughput: 0: 53591.5. Samples: 777200520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:02:04,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 13:02:07,015][52263] Updated weights for policy 0, policy_version 383719 (0.0037) [2024-04-27 13:02:09,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6286966784. Throughput: 0: 53536.9. Samples: 777521880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:02:09,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 13:02:10,123][52263] Updated weights for policy 0, policy_version 383729 (0.0031) [2024-04-27 13:02:10,503][52242] Signal inference workers to stop experience collection... (11750 times) [2024-04-27 13:02:10,560][52263] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-04-27 13:02:10,562][52242] Signal inference workers to resume experience collection... (11750 times) [2024-04-27 13:02:10,573][52263] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-04-27 13:02:12,962][52263] Updated weights for policy 0, policy_version 383739 (0.0033) [2024-04-27 13:02:14,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6287245312. Throughput: 0: 53725.7. Samples: 777690320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:02:14,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 13:02:16,282][52263] Updated weights for policy 0, policy_version 383749 (0.0035) [2024-04-27 13:02:19,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6287491072. Throughput: 0: 53681.3. Samples: 778009620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 13:02:19,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:02:19,159][52263] Updated weights for policy 0, policy_version 383759 (0.0034) [2024-04-27 13:02:22,396][52263] Updated weights for policy 0, policy_version 383769 (0.0032) [2024-04-27 13:02:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6287769600. Throughput: 0: 53853.0. Samples: 778334880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:02:24,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:02:25,258][52263] Updated weights for policy 0, policy_version 383779 (0.0030) [2024-04-27 13:02:28,470][52263] Updated weights for policy 0, policy_version 383789 (0.0037) [2024-04-27 13:02:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6288031744. Throughput: 0: 53546.3. Samples: 778489360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:02:29,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 13:02:31,342][52263] Updated weights for policy 0, policy_version 383799 (0.0029) [2024-04-27 13:02:34,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6288293888. Throughput: 0: 53543.9. Samples: 778813400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:02:34,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 13:02:34,620][52263] Updated weights for policy 0, policy_version 383809 (0.0029) [2024-04-27 13:02:37,297][52263] Updated weights for policy 0, policy_version 383819 (0.0028) [2024-04-27 13:02:39,107][52031] Fps is (10 sec: 55704.6, 60 sec: 54067.1, 300 sec: 53650.7). Total num frames: 6288588800. Throughput: 0: 53519.5. Samples: 779133060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:02:39,107][52031] Avg episode reward: [(0, '0.662')] [2024-04-27 13:02:40,746][52263] Updated weights for policy 0, policy_version 383829 (0.0029) [2024-04-27 13:02:43,648][52263] Updated weights for policy 0, policy_version 383839 (0.0029) [2024-04-27 13:02:44,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6288834560. Throughput: 0: 53824.4. Samples: 779302220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:02:44,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 13:02:46,836][52263] Updated weights for policy 0, policy_version 383849 (0.0030) [2024-04-27 13:02:49,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6289096704. Throughput: 0: 53725.9. Samples: 779618180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:02:49,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 13:02:49,741][52263] Updated weights for policy 0, policy_version 383859 (0.0030) [2024-04-27 13:02:52,978][52263] Updated weights for policy 0, policy_version 383869 (0.0029) [2024-04-27 13:02:54,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6289375232. Throughput: 0: 53575.1. Samples: 779932760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:02:54,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 13:02:55,754][52263] Updated weights for policy 0, policy_version 383879 (0.0032) [2024-04-27 13:02:59,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6289620992. Throughput: 0: 53444.5. Samples: 780095320. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:02:59,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 13:02:59,187][52263] Updated weights for policy 0, policy_version 383889 (0.0032) [2024-04-27 13:03:01,841][52263] Updated weights for policy 0, policy_version 383899 (0.0032) [2024-04-27 13:03:04,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6289899520. Throughput: 0: 53504.8. Samples: 780417340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:04,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 13:03:05,255][52263] Updated weights for policy 0, policy_version 383909 (0.0036) [2024-04-27 13:03:08,055][52263] Updated weights for policy 0, policy_version 383919 (0.0031) [2024-04-27 13:03:09,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6290178048. Throughput: 0: 53366.8. Samples: 780736380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:09,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 13:03:11,528][52263] Updated weights for policy 0, policy_version 383929 (0.0027) [2024-04-27 13:03:14,106][52031] Fps is (10 sec: 54068.4, 60 sec: 53248.2, 300 sec: 53539.6). Total num frames: 6290440192. Throughput: 0: 53512.5. Samples: 780897420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:14,107][52031] Avg episode reward: [(0, '0.481')] [2024-04-27 13:03:14,201][52263] Updated weights for policy 0, policy_version 383939 (0.0030) [2024-04-27 13:03:17,489][52263] Updated weights for policy 0, policy_version 383949 (0.0030) [2024-04-27 13:03:19,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6290718720. Throughput: 0: 53450.7. Samples: 781218680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:19,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 13:03:19,777][52242] Signal inference workers to stop experience collection... (11800 times) [2024-04-27 13:03:19,782][52242] Signal inference workers to resume experience collection... (11800 times) [2024-04-27 13:03:19,806][52263] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-04-27 13:03:19,806][52263] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-04-27 13:03:20,213][52263] Updated weights for policy 0, policy_version 383959 (0.0033) [2024-04-27 13:03:23,736][52263] Updated weights for policy 0, policy_version 383969 (0.0023) [2024-04-27 13:03:24,106][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6290964480. Throughput: 0: 53564.5. Samples: 781543460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:24,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 13:03:26,295][52263] Updated weights for policy 0, policy_version 383979 (0.0027) [2024-04-27 13:03:29,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6291243008. Throughput: 0: 53130.3. Samples: 781693080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:29,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 13:03:29,854][52263] Updated weights for policy 0, policy_version 383989 (0.0028) [2024-04-27 13:03:32,615][52263] Updated weights for policy 0, policy_version 383999 (0.0026) [2024-04-27 13:03:34,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6291505152. Throughput: 0: 53385.4. Samples: 782020520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:34,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 13:03:34,236][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000384004_6291521536.pth... 
[2024-04-27 13:03:34,281][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000383218_6278643712.pth [2024-04-27 13:03:35,877][52263] Updated weights for policy 0, policy_version 384009 (0.0030) [2024-04-27 13:03:38,903][52263] Updated weights for policy 0, policy_version 384019 (0.0025) [2024-04-27 13:03:39,106][52031] Fps is (10 sec: 52428.7, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6291767296. Throughput: 0: 53480.0. Samples: 782339360. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:39,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 13:03:42,165][52263] Updated weights for policy 0, policy_version 384029 (0.0033) [2024-04-27 13:03:44,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6292045824. Throughput: 0: 53396.3. Samples: 782498160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:44,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 13:03:45,143][52263] Updated weights for policy 0, policy_version 384039 (0.0028) [2024-04-27 13:03:48,284][52263] Updated weights for policy 0, policy_version 384049 (0.0027) [2024-04-27 13:03:49,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6292291584. Throughput: 0: 53320.1. Samples: 782816740. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:49,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 13:03:51,231][52263] Updated weights for policy 0, policy_version 384059 (0.0037) [2024-04-27 13:03:54,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6292570112. Throughput: 0: 53476.7. Samples: 783142840. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:54,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 13:03:54,266][52263] Updated weights for policy 0, policy_version 384069 (0.0029) [2024-04-27 13:03:57,170][52263] Updated weights for policy 0, policy_version 384079 (0.0030) [2024-04-27 13:03:59,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6292832256. Throughput: 0: 53451.8. Samples: 783302760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:03:59,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 13:04:00,372][52263] Updated weights for policy 0, policy_version 384089 (0.0033) [2024-04-27 13:04:03,266][52263] Updated weights for policy 0, policy_version 384099 (0.0031) [2024-04-27 13:04:04,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6293110784. Throughput: 0: 53390.4. Samples: 783621240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:04,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 13:04:06,635][52263] Updated weights for policy 0, policy_version 384109 (0.0033) [2024-04-27 13:04:09,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6293389312. Throughput: 0: 53121.3. Samples: 783933920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:09,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 13:04:09,508][52263] Updated weights for policy 0, policy_version 384119 (0.0029) [2024-04-27 13:04:12,892][52263] Updated weights for policy 0, policy_version 384129 (0.0026) [2024-04-27 13:04:14,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6293651456. Throughput: 0: 53595.9. Samples: 784104900. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:14,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:04:15,553][52263] Updated weights for policy 0, policy_version 384139 (0.0032) [2024-04-27 13:04:18,872][52263] Updated weights for policy 0, policy_version 384149 (0.0030) [2024-04-27 13:04:19,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52975.1, 300 sec: 53595.2). Total num frames: 6293897216. Throughput: 0: 53521.4. Samples: 784428980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:19,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 13:04:19,601][52242] Signal inference workers to stop experience collection... (11850 times) [2024-04-27 13:04:19,601][52242] Signal inference workers to resume experience collection... (11850 times) [2024-04-27 13:04:19,613][52263] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-04-27 13:04:19,632][52263] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-04-27 13:04:21,821][52263] Updated weights for policy 0, policy_version 384159 (0.0027) [2024-04-27 13:04:24,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6294159360. Throughput: 0: 53534.2. Samples: 784748400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:24,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 13:04:25,103][52263] Updated weights for policy 0, policy_version 384169 (0.0033) [2024-04-27 13:04:27,956][52263] Updated weights for policy 0, policy_version 384179 (0.0030) [2024-04-27 13:04:29,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6294437888. Throughput: 0: 53330.3. Samples: 784898020. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:29,107][52031] Avg episode reward: [(0, '0.488')] [2024-04-27 13:04:31,266][52263] Updated weights for policy 0, policy_version 384189 (0.0030) [2024-04-27 13:04:34,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6294700032. Throughput: 0: 53259.6. Samples: 785213420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:34,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 13:04:34,228][52263] Updated weights for policy 0, policy_version 384199 (0.0027) [2024-04-27 13:04:37,384][52263] Updated weights for policy 0, policy_version 384209 (0.0032) [2024-04-27 13:04:39,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6294994944. Throughput: 0: 53115.1. Samples: 785533020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:39,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 13:04:40,239][52263] Updated weights for policy 0, policy_version 384219 (0.0033) [2024-04-27 13:04:43,396][52263] Updated weights for policy 0, policy_version 384229 (0.0026) [2024-04-27 13:04:44,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.3, 300 sec: 53595.1). Total num frames: 6295257088. Throughput: 0: 53308.6. Samples: 785701640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:44,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:04:46,368][52263] Updated weights for policy 0, policy_version 384239 (0.0031) [2024-04-27 13:04:49,106][52031] Fps is (10 sec: 49152.4, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6295486464. Throughput: 0: 53472.9. Samples: 786027520. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:49,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 13:04:49,578][52263] Updated weights for policy 0, policy_version 384249 (0.0028) [2024-04-27 13:04:52,437][52263] Updated weights for policy 0, policy_version 384259 (0.0025) [2024-04-27 13:04:54,107][52031] Fps is (10 sec: 50789.4, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6295764992. Throughput: 0: 53684.3. Samples: 786349720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 13:04:54,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 13:04:55,672][52263] Updated weights for policy 0, policy_version 384269 (0.0033) [2024-04-27 13:04:58,458][52263] Updated weights for policy 0, policy_version 384279 (0.0028) [2024-04-27 13:04:59,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6296043520. Throughput: 0: 53250.8. Samples: 786501180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:04:59,107][52031] Avg episode reward: [(0, '0.482')] [2024-04-27 13:05:01,822][52263] Updated weights for policy 0, policy_version 384289 (0.0029) [2024-04-27 13:05:04,106][52031] Fps is (10 sec: 54068.6, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6296305664. Throughput: 0: 53147.6. Samples: 786820620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:04,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 13:05:04,822][52263] Updated weights for policy 0, policy_version 384299 (0.0025) [2024-04-27 13:05:07,841][52263] Updated weights for policy 0, policy_version 384309 (0.0024) [2024-04-27 13:05:09,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6296616960. Throughput: 0: 53265.9. Samples: 787145360. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:09,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:05:10,982][52263] Updated weights for policy 0, policy_version 384319 (0.0030) [2024-04-27 13:05:13,862][52263] Updated weights for policy 0, policy_version 384329 (0.0031) [2024-04-27 13:05:14,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6296846336. Throughput: 0: 53712.0. Samples: 787315060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:14,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 13:05:14,193][52242] Signal inference workers to stop experience collection... (11900 times) [2024-04-27 13:05:14,236][52263] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-04-27 13:05:14,296][52242] Signal inference workers to resume experience collection... (11900 times) [2024-04-27 13:05:14,296][52263] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-04-27 13:05:17,008][52263] Updated weights for policy 0, policy_version 384339 (0.0025) [2024-04-27 13:05:19,106][52031] Fps is (10 sec: 49151.6, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6297108480. Throughput: 0: 53816.8. Samples: 787635180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:19,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 13:05:20,112][52263] Updated weights for policy 0, policy_version 384349 (0.0035) [2024-04-27 13:05:22,958][52263] Updated weights for policy 0, policy_version 384359 (0.0034) [2024-04-27 13:05:24,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6297370624. Throughput: 0: 53806.2. Samples: 787954300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:24,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 13:05:26,113][52263] Updated weights for policy 0, policy_version 384369 (0.0027) [2024-04-27 13:05:29,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6297649152. Throughput: 0: 53527.6. Samples: 788110380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:29,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 13:05:29,111][52263] Updated weights for policy 0, policy_version 384379 (0.0028) [2024-04-27 13:05:32,209][52263] Updated weights for policy 0, policy_version 384389 (0.0032) [2024-04-27 13:05:34,106][52031] Fps is (10 sec: 57344.2, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6297944064. Throughput: 0: 53483.9. Samples: 788434300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:34,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 13:05:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000384396_6297944064.pth... [2024-04-27 13:05:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000383612_6285099008.pth [2024-04-27 13:05:35,263][52263] Updated weights for policy 0, policy_version 384399 (0.0029) [2024-04-27 13:05:38,377][52263] Updated weights for policy 0, policy_version 384409 (0.0042) [2024-04-27 13:05:39,107][52031] Fps is (10 sec: 57342.6, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6298222592. Throughput: 0: 53500.0. Samples: 788757220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:39,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 13:05:41,314][52263] Updated weights for policy 0, policy_version 384419 (0.0031) [2024-04-27 13:05:44,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6298451968. Throughput: 0: 53651.9. Samples: 788915520. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:44,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 13:05:44,471][52263] Updated weights for policy 0, policy_version 384429 (0.0033) [2024-04-27 13:05:47,382][52263] Updated weights for policy 0, policy_version 384439 (0.0031) [2024-04-27 13:05:49,106][52031] Fps is (10 sec: 49152.4, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6298714112. Throughput: 0: 53585.2. Samples: 789231960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:49,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 13:05:50,516][52263] Updated weights for policy 0, policy_version 384449 (0.0028) [2024-04-27 13:05:54,018][52263] Updated weights for policy 0, policy_version 384459 (0.0035) [2024-04-27 13:05:54,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6298976256. Throughput: 0: 53717.8. Samples: 789562660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:54,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 13:05:56,545][52263] Updated weights for policy 0, policy_version 384469 (0.0029) [2024-04-27 13:05:59,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6299271168. Throughput: 0: 53438.4. Samples: 789719780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:05:59,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 13:06:00,139][52263] Updated weights for policy 0, policy_version 384479 (0.0029) [2024-04-27 13:06:02,285][52242] Signal inference workers to stop experience collection... (11950 times) [2024-04-27 13:06:02,285][52242] Signal inference workers to resume experience collection... 
(11950 times) [2024-04-27 13:06:02,310][52263] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-04-27 13:06:02,310][52263] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-04-27 13:06:02,615][52263] Updated weights for policy 0, policy_version 384489 (0.0033) [2024-04-27 13:06:04,107][52031] Fps is (10 sec: 57342.6, 60 sec: 54067.0, 300 sec: 53595.1). Total num frames: 6299549696. Throughput: 0: 53490.1. Samples: 790042240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:06:04,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 13:06:06,033][52263] Updated weights for policy 0, policy_version 384499 (0.0029) [2024-04-27 13:06:08,759][52263] Updated weights for policy 0, policy_version 384509 (0.0029) [2024-04-27 13:06:09,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53247.8, 300 sec: 53539.6). Total num frames: 6299811840. Throughput: 0: 53555.5. Samples: 790364300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:06:09,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 13:06:12,008][52263] Updated weights for policy 0, policy_version 384519 (0.0030) [2024-04-27 13:06:14,107][52031] Fps is (10 sec: 49152.0, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6300041216. Throughput: 0: 53497.5. Samples: 790517780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:06:14,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 13:06:14,862][52263] Updated weights for policy 0, policy_version 384529 (0.0027) [2024-04-27 13:06:18,129][52263] Updated weights for policy 0, policy_version 384539 (0.0036) [2024-04-27 13:06:19,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6300319744. Throughput: 0: 53484.6. Samples: 790841100. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:06:19,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 13:06:20,849][52263] Updated weights for policy 0, policy_version 384549 (0.0032) [2024-04-27 13:06:24,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6300598272. Throughput: 0: 53549.0. Samples: 791166920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:06:24,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 13:06:24,198][52263] Updated weights for policy 0, policy_version 384559 (0.0027) [2024-04-27 13:06:26,907][52263] Updated weights for policy 0, policy_version 384569 (0.0032) [2024-04-27 13:06:29,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6300860416. Throughput: 0: 53705.0. Samples: 791332240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:06:29,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 13:06:30,495][52263] Updated weights for policy 0, policy_version 384579 (0.0028) [2024-04-27 13:06:33,123][52263] Updated weights for policy 0, policy_version 384589 (0.0036) [2024-04-27 13:06:34,107][52031] Fps is (10 sec: 57343.6, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6301171712. Throughput: 0: 53884.4. Samples: 791656760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:06:34,107][52031] Avg episode reward: [(0, '0.506')] [2024-04-27 13:06:36,467][52263] Updated weights for policy 0, policy_version 384599 (0.0033) [2024-04-27 13:06:39,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6301417472. Throughput: 0: 53648.8. Samples: 791976860. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:06:39,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 13:06:39,279][52263] Updated weights for policy 0, policy_version 384609 (0.0035) [2024-04-27 13:06:42,985][52263] Updated weights for policy 0, policy_version 384619 (0.0034) [2024-04-27 13:06:44,107][52031] Fps is (10 sec: 49152.1, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6301663232. Throughput: 0: 53657.2. Samples: 792134360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:06:44,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 13:06:45,301][52263] Updated weights for policy 0, policy_version 384629 (0.0026) [2024-04-27 13:06:49,034][52263] Updated weights for policy 0, policy_version 384639 (0.0028) [2024-04-27 13:06:49,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6301925376. Throughput: 0: 53537.4. Samples: 792451420. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:06:49,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 13:06:51,561][52263] Updated weights for policy 0, policy_version 384649 (0.0026) [2024-04-27 13:06:53,588][52242] Signal inference workers to stop experience collection... (12000 times) [2024-04-27 13:06:53,589][52242] Signal inference workers to resume experience collection... (12000 times) [2024-04-27 13:06:53,607][52263] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-04-27 13:06:53,607][52263] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-04-27 13:06:54,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6302203904. Throughput: 0: 53509.4. Samples: 792772220. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:06:54,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 13:06:55,033][52263] Updated weights for policy 0, policy_version 384659 (0.0028) [2024-04-27 13:06:57,758][52263] Updated weights for policy 0, policy_version 384669 (0.0028) [2024-04-27 13:06:59,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6302482432. Throughput: 0: 53720.6. Samples: 792935200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:06:59,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 13:07:01,175][52263] Updated weights for policy 0, policy_version 384679 (0.0035) [2024-04-27 13:07:03,895][52263] Updated weights for policy 0, policy_version 384689 (0.0037) [2024-04-27 13:07:04,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6302760960. Throughput: 0: 53675.3. Samples: 793256500. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:07:04,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 13:07:07,285][52263] Updated weights for policy 0, policy_version 384699 (0.0029) [2024-04-27 13:07:09,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6303006720. Throughput: 0: 53461.3. Samples: 793572680. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:07:09,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 13:07:09,913][52263] Updated weights for policy 0, policy_version 384709 (0.0033) [2024-04-27 13:07:13,434][52263] Updated weights for policy 0, policy_version 384719 (0.0033) [2024-04-27 13:07:14,107][52031] Fps is (10 sec: 52429.0, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6303285248. Throughput: 0: 53280.8. Samples: 793729880. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:07:14,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 13:07:16,184][52263] Updated weights for policy 0, policy_version 384729 (0.0029) [2024-04-27 13:07:19,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6303531008. Throughput: 0: 53068.5. Samples: 794044840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:07:19,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 13:07:19,608][52263] Updated weights for policy 0, policy_version 384739 (0.0032) [2024-04-27 13:07:22,456][52263] Updated weights for policy 0, policy_version 384749 (0.0030) [2024-04-27 13:07:24,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6303809536. Throughput: 0: 53089.2. Samples: 794365880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:07:24,116][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:07:25,654][52263] Updated weights for policy 0, policy_version 384759 (0.0027) [2024-04-27 13:07:28,493][52263] Updated weights for policy 0, policy_version 384769 (0.0027) [2024-04-27 13:07:29,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6304088064. Throughput: 0: 53200.3. Samples: 794528380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:07:29,116][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 13:07:31,938][52263] Updated weights for policy 0, policy_version 384779 (0.0030) [2024-04-27 13:07:34,107][52031] Fps is (10 sec: 54067.0, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6304350208. Throughput: 0: 53231.5. Samples: 794846840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 24.0) [2024-04-27 13:07:34,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:07:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000384787_6304350208.pth... 
[2024-04-27 13:07:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000384004_6291521536.pth [2024-04-27 13:07:34,502][52263] Updated weights for policy 0, policy_version 384789 (0.0028) [2024-04-27 13:07:37,908][52263] Updated weights for policy 0, policy_version 384799 (0.0033) [2024-04-27 13:07:39,106][52031] Fps is (10 sec: 50791.9, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6304595968. Throughput: 0: 53182.0. Samples: 795165400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:07:39,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 13:07:40,750][52263] Updated weights for policy 0, policy_version 384809 (0.0026) [2024-04-27 13:07:43,920][52263] Updated weights for policy 0, policy_version 384819 (0.0029) [2024-04-27 13:07:44,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6304874496. Throughput: 0: 52992.0. Samples: 795319840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:07:44,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 13:07:46,944][52263] Updated weights for policy 0, policy_version 384829 (0.0029) [2024-04-27 13:07:49,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6305136640. Throughput: 0: 52978.4. Samples: 795640520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:07:49,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:07:49,982][52263] Updated weights for policy 0, policy_version 384839 (0.0030) [2024-04-27 13:07:53,178][52263] Updated weights for policy 0, policy_version 384849 (0.0030) [2024-04-27 13:07:54,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6305398784. Throughput: 0: 53287.2. Samples: 795970600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:07:54,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 13:07:56,196][52263] Updated weights for policy 0, policy_version 384859 (0.0034) [2024-04-27 13:07:59,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.9, 300 sec: 53484.1). Total num frames: 6305677312. Throughput: 0: 53368.0. Samples: 796131440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:07:59,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 13:07:59,184][52263] Updated weights for policy 0, policy_version 384869 (0.0029) [2024-04-27 13:08:02,210][52263] Updated weights for policy 0, policy_version 384879 (0.0027) [2024-04-27 13:08:04,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52702.0, 300 sec: 53372.9). Total num frames: 6305923072. Throughput: 0: 53366.3. Samples: 796446320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:04,107][52031] Avg episode reward: [(0, '0.483')] [2024-04-27 13:08:05,377][52263] Updated weights for policy 0, policy_version 384889 (0.0030) [2024-04-27 13:08:08,354][52263] Updated weights for policy 0, policy_version 384899 (0.0043) [2024-04-27 13:08:09,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6306201600. Throughput: 0: 53262.8. Samples: 796762700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:09,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 13:08:11,576][52263] Updated weights for policy 0, policy_version 384909 (0.0026) [2024-04-27 13:08:14,106][52031] Fps is (10 sec: 54067.0, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6306463744. Throughput: 0: 53294.8. Samples: 796926640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:14,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 13:08:14,909][52263] Updated weights for policy 0, policy_version 384919 (0.0032) [2024-04-27 13:08:17,574][52263] Updated weights for policy 0, policy_version 384929 (0.0026) [2024-04-27 13:08:19,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6306742272. Throughput: 0: 53351.2. Samples: 797247640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:19,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 13:08:21,014][52263] Updated weights for policy 0, policy_version 384939 (0.0029) [2024-04-27 13:08:23,209][52242] Signal inference workers to stop experience collection... (12050 times) [2024-04-27 13:08:23,209][52242] Signal inference workers to resume experience collection... (12050 times) [2024-04-27 13:08:23,222][52263] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-04-27 13:08:23,222][52263] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-04-27 13:08:23,660][52263] Updated weights for policy 0, policy_version 384949 (0.0032) [2024-04-27 13:08:24,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6307004416. Throughput: 0: 53369.2. Samples: 797567020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:24,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 13:08:26,977][52263] Updated weights for policy 0, policy_version 384959 (0.0030) [2024-04-27 13:08:29,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6307282944. Throughput: 0: 53651.1. Samples: 797734140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:29,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 13:08:29,783][52263] Updated weights for policy 0, policy_version 384969 (0.0026) [2024-04-27 13:08:33,032][52263] Updated weights for policy 0, policy_version 384979 (0.0030) [2024-04-27 13:08:34,106][52031] Fps is (10 sec: 52428.5, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6307528704. Throughput: 0: 53547.5. Samples: 798050160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:34,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 13:08:36,065][52263] Updated weights for policy 0, policy_version 384989 (0.0029) [2024-04-27 13:08:39,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6307807232. Throughput: 0: 53452.8. Samples: 798375980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:39,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 13:08:39,235][52263] Updated weights for policy 0, policy_version 384999 (0.0027) [2024-04-27 13:08:42,499][52263] Updated weights for policy 0, policy_version 385009 (0.0031) [2024-04-27 13:08:44,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6308069376. Throughput: 0: 53462.2. Samples: 798537240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:44,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 13:08:45,184][52263] Updated weights for policy 0, policy_version 385019 (0.0031) [2024-04-27 13:08:48,507][52263] Updated weights for policy 0, policy_version 385029 (0.0031) [2024-04-27 13:08:49,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6308347904. Throughput: 0: 53561.4. Samples: 798856580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:49,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 13:08:51,445][52263] Updated weights for policy 0, policy_version 385039 (0.0030) [2024-04-27 13:08:54,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6308626432. Throughput: 0: 53711.0. Samples: 799179700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:08:54,107][52031] Avg episode reward: [(0, '0.662')] [2024-04-27 13:08:54,516][52263] Updated weights for policy 0, policy_version 385049 (0.0033) [2024-04-27 13:08:57,531][52263] Updated weights for policy 0, policy_version 385059 (0.0029) [2024-04-27 13:08:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6308888576. Throughput: 0: 53602.4. Samples: 799338740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:08:59,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 13:09:00,739][52263] Updated weights for policy 0, policy_version 385069 (0.0028) [2024-04-27 13:09:04,038][52263] Updated weights for policy 0, policy_version 385079 (0.0028) [2024-04-27 13:09:04,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6309134336. Throughput: 0: 53585.9. Samples: 799659000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:04,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 13:09:07,008][52263] Updated weights for policy 0, policy_version 385089 (0.0029) [2024-04-27 13:09:09,107][52031] Fps is (10 sec: 52427.5, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6309412864. Throughput: 0: 53539.4. Samples: 799976300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:09,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:09:10,148][52263] Updated weights for policy 0, policy_version 385099 (0.0038) [2024-04-27 13:09:13,048][52263] Updated weights for policy 0, policy_version 385109 (0.0025) [2024-04-27 13:09:14,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6309691392. Throughput: 0: 53349.3. Samples: 800134860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:14,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 13:09:16,390][52263] Updated weights for policy 0, policy_version 385119 (0.0032) [2024-04-27 13:09:19,017][52263] Updated weights for policy 0, policy_version 385129 (0.0027) [2024-04-27 13:09:19,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6309953536. Throughput: 0: 53472.4. Samples: 800456420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:19,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 13:09:22,649][52263] Updated weights for policy 0, policy_version 385139 (0.0029) [2024-04-27 13:09:24,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6310199296. Throughput: 0: 53298.3. Samples: 800774400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:24,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 13:09:25,182][52263] Updated weights for policy 0, policy_version 385149 (0.0034) [2024-04-27 13:09:28,623][52263] Updated weights for policy 0, policy_version 385159 (0.0027) [2024-04-27 13:09:29,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6310477824. Throughput: 0: 53277.5. Samples: 800934720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:29,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 13:09:31,523][52263] Updated weights for policy 0, policy_version 385169 (0.0039) [2024-04-27 13:09:32,000][52242] Signal inference workers to stop experience collection... (12100 times) [2024-04-27 13:09:32,000][52242] Signal inference workers to resume experience collection... (12100 times) [2024-04-27 13:09:32,026][52263] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-04-27 13:09:32,026][52263] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-04-27 13:09:34,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6310739968. Throughput: 0: 53255.0. Samples: 801253060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:34,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 13:09:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000385177_6310739968.pth... [2024-04-27 13:09:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000384396_6297944064.pth [2024-04-27 13:09:34,706][52263] Updated weights for policy 0, policy_version 385179 (0.0033) [2024-04-27 13:09:37,849][52263] Updated weights for policy 0, policy_version 385189 (0.0027) [2024-04-27 13:09:39,107][52031] Fps is (10 sec: 54065.8, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6311018496. Throughput: 0: 53120.7. Samples: 801570140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:39,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 13:09:40,882][52263] Updated weights for policy 0, policy_version 385199 (0.0031) [2024-04-27 13:09:43,870][52263] Updated weights for policy 0, policy_version 385209 (0.0029) [2024-04-27 13:09:44,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6311264256. Throughput: 0: 53306.9. Samples: 801737560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:44,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 13:09:46,971][52263] Updated weights for policy 0, policy_version 385219 (0.0031) [2024-04-27 13:09:49,106][52031] Fps is (10 sec: 52430.0, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6311542784. Throughput: 0: 53226.7. Samples: 802054200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:49,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:09:49,861][52263] Updated weights for policy 0, policy_version 385229 (0.0028) [2024-04-27 13:09:52,998][52263] Updated weights for policy 0, policy_version 385239 (0.0033) [2024-04-27 13:09:54,106][52031] Fps is (10 sec: 54067.7, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6311804928. Throughput: 0: 53310.9. Samples: 802375280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:54,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 13:09:55,977][52263] Updated weights for policy 0, policy_version 385249 (0.0032) [2024-04-27 13:09:59,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52974.8, 300 sec: 53428.5). Total num frames: 6312067072. Throughput: 0: 53333.7. Samples: 802534880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:09:59,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 13:09:59,151][52263] Updated weights for policy 0, policy_version 385259 (0.0032) [2024-04-27 13:10:02,170][52263] Updated weights for policy 0, policy_version 385269 (0.0027) [2024-04-27 13:10:04,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 6312361984. Throughput: 0: 53250.7. Samples: 802852700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:10:04,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 13:10:05,735][52263] Updated weights for policy 0, policy_version 385279 (0.0027) [2024-04-27 13:10:08,176][52263] Updated weights for policy 0, policy_version 385289 (0.0033) [2024-04-27 13:10:09,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6312607744. Throughput: 0: 53258.3. Samples: 803171020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:10:09,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 13:10:11,714][52263] Updated weights for policy 0, policy_version 385299 (0.0034) [2024-04-27 13:10:14,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6312886272. Throughput: 0: 53369.6. Samples: 803336360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-04-27 13:10:14,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 13:10:14,149][52263] Updated weights for policy 0, policy_version 385309 (0.0031) [2024-04-27 13:10:17,702][52263] Updated weights for policy 0, policy_version 385319 (0.0028) [2024-04-27 13:10:19,106][52031] Fps is (10 sec: 52428.2, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6313132032. Throughput: 0: 53389.4. Samples: 803655580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:10:19,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 13:10:20,424][52263] Updated weights for policy 0, policy_version 385329 (0.0034) [2024-04-27 13:10:23,870][52263] Updated weights for policy 0, policy_version 385339 (0.0030) [2024-04-27 13:10:24,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6313394176. Throughput: 0: 53319.8. Samples: 803969520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:10:24,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 13:10:26,663][52263] Updated weights for policy 0, policy_version 385349 (0.0034) [2024-04-27 13:10:29,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6313672704. Throughput: 0: 53157.4. Samples: 804129640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:10:29,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:10:30,033][52263] Updated weights for policy 0, policy_version 385359 (0.0028) [2024-04-27 13:10:31,412][52242] Signal inference workers to stop experience collection... (12150 times) [2024-04-27 13:10:31,412][52242] Signal inference workers to resume experience collection... (12150 times) [2024-04-27 13:10:31,429][52263] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-04-27 13:10:31,429][52263] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-04-27 13:10:32,857][52263] Updated weights for policy 0, policy_version 385369 (0.0032) [2024-04-27 13:10:34,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.2, 300 sec: 53317.5). Total num frames: 6313951232. Throughput: 0: 53220.9. Samples: 804449140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:10:34,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 13:10:36,175][52263] Updated weights for policy 0, policy_version 385379 (0.0035) [2024-04-27 13:10:39,049][52263] Updated weights for policy 0, policy_version 385389 (0.0026) [2024-04-27 13:10:39,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6314213376. Throughput: 0: 53163.9. Samples: 804767660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:10:39,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 13:10:42,189][52263] Updated weights for policy 0, policy_version 385399 (0.0035) [2024-04-27 13:10:44,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6314475520. Throughput: 0: 53317.4. Samples: 804934160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:10:44,107][52031] Avg episode reward: [(0, '0.510')] [2024-04-27 13:10:45,122][52263] Updated weights for policy 0, policy_version 385409 (0.0028) [2024-04-27 13:10:48,383][52263] Updated weights for policy 0, policy_version 385419 (0.0029) [2024-04-27 13:10:49,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52974.9, 300 sec: 53372.9). Total num frames: 6314721280. Throughput: 0: 53245.7. Samples: 805248760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:10:49,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:10:51,444][52263] Updated weights for policy 0, policy_version 385429 (0.0022) [2024-04-27 13:10:54,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 6314983424. Throughput: 0: 53251.9. Samples: 805567360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:10:54,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 13:10:54,588][52263] Updated weights for policy 0, policy_version 385439 (0.0036) [2024-04-27 13:10:57,524][52263] Updated weights for policy 0, policy_version 385449 (0.0034) [2024-04-27 13:10:59,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6315278336. Throughput: 0: 53051.6. Samples: 805723680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:10:59,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 13:11:00,771][52263] Updated weights for policy 0, policy_version 385459 (0.0026) [2024-04-27 13:11:03,495][52263] Updated weights for policy 0, policy_version 385469 (0.0035) [2024-04-27 13:11:04,106][52031] Fps is (10 sec: 55705.8, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6315540480. Throughput: 0: 53169.8. Samples: 806048220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:11:04,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 13:11:06,747][52263] Updated weights for policy 0, policy_version 385479 (0.0029) [2024-04-27 13:11:09,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6315802624. Throughput: 0: 53261.4. Samples: 806366280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:11:09,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 13:11:09,848][52263] Updated weights for policy 0, policy_version 385489 (0.0027) [2024-04-27 13:11:12,813][52263] Updated weights for policy 0, policy_version 385499 (0.0031) [2024-04-27 13:11:14,106][52031] Fps is (10 sec: 52428.4, 60 sec: 52975.0, 300 sec: 53372.9). Total num frames: 6316064768. Throughput: 0: 53371.6. Samples: 806531360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:11:14,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 13:11:15,982][52263] Updated weights for policy 0, policy_version 385509 (0.0035) [2024-04-27 13:11:18,896][52263] Updated weights for policy 0, policy_version 385519 (0.0026) [2024-04-27 13:11:19,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6316343296. Throughput: 0: 53327.1. Samples: 806848860. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:11:19,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 13:11:22,035][52263] Updated weights for policy 0, policy_version 385529 (0.0028) [2024-04-27 13:11:24,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6316589056. Throughput: 0: 53331.1. Samples: 807167560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:11:24,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 13:11:24,968][52263] Updated weights for policy 0, policy_version 385539 (0.0035) [2024-04-27 13:11:28,231][52263] Updated weights for policy 0, policy_version 385549 (0.0029) [2024-04-27 13:11:29,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 6316883968. Throughput: 0: 53244.0. Samples: 807330140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:11:29,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 13:11:31,530][52263] Updated weights for policy 0, policy_version 385559 (0.0029) [2024-04-27 13:11:34,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6317146112. Throughput: 0: 53317.3. Samples: 807648040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:11:34,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 13:11:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000385568_6317146112.pth... [2024-04-27 13:11:34,170][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000384787_6304350208.pth [2024-04-27 13:11:34,303][52263] Updated weights for policy 0, policy_version 385569 (0.0026) [2024-04-27 13:11:37,755][52263] Updated weights for policy 0, policy_version 385579 (0.0030) [2024-04-27 13:11:39,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6317424640. Throughput: 0: 53393.7. Samples: 807970080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:11:39,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 13:11:40,379][52263] Updated weights for policy 0, policy_version 385589 (0.0029) [2024-04-27 13:11:43,779][52263] Updated weights for policy 0, policy_version 385599 (0.0030) [2024-04-27 13:11:44,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6317670400. Throughput: 0: 53474.2. Samples: 808130020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:11:44,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 13:11:46,596][52263] Updated weights for policy 0, policy_version 385609 (0.0034) [2024-04-27 13:11:49,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6317948928. Throughput: 0: 53436.3. Samples: 808452860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:11:49,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:11:49,752][52263] Updated weights for policy 0, policy_version 385619 (0.0027) [2024-04-27 13:11:52,947][52263] Updated weights for policy 0, policy_version 385629 (0.0033) [2024-04-27 13:11:54,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53317.4). Total num frames: 6318211072. Throughput: 0: 53576.3. Samples: 808777220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:11:54,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 13:11:55,915][52263] Updated weights for policy 0, policy_version 385639 (0.0027) [2024-04-27 13:11:58,900][52263] Updated weights for policy 0, policy_version 385649 (0.0032) [2024-04-27 13:11:59,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 6318473216. Throughput: 0: 53473.4. Samples: 808937660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:11:59,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 13:12:00,090][52242] Signal inference workers to stop experience collection... (12200 times) [2024-04-27 13:12:00,118][52263] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-04-27 13:12:00,153][52242] Signal inference workers to resume experience collection... (12200 times) [2024-04-27 13:12:00,154][52263] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-04-27 13:12:01,964][52263] Updated weights for policy 0, policy_version 385659 (0.0037) [2024-04-27 13:12:04,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6318768128. Throughput: 0: 53555.9. Samples: 809258880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:04,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 13:12:05,074][52263] Updated weights for policy 0, policy_version 385669 (0.0031) [2024-04-27 13:12:07,949][52263] Updated weights for policy 0, policy_version 385679 (0.0034) [2024-04-27 13:12:09,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6319030272. Throughput: 0: 53596.9. Samples: 809579420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:09,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 13:12:11,220][52263] Updated weights for policy 0, policy_version 385689 (0.0034) [2024-04-27 13:12:14,033][52263] Updated weights for policy 0, policy_version 385699 (0.0024) [2024-04-27 13:12:14,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6319292416. Throughput: 0: 53651.1. Samples: 809744440. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:14,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:12:17,301][52263] Updated weights for policy 0, policy_version 385709 (0.0025) [2024-04-27 13:12:19,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6319554560. Throughput: 0: 53896.4. Samples: 810073380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:19,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 13:12:20,110][52263] Updated weights for policy 0, policy_version 385719 (0.0030) [2024-04-27 13:12:23,308][52263] Updated weights for policy 0, policy_version 385729 (0.0027) [2024-04-27 13:12:24,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.2, 300 sec: 53317.5). Total num frames: 6319816704. Throughput: 0: 53856.9. Samples: 810393640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:24,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 13:12:26,225][52263] Updated weights for policy 0, policy_version 385739 (0.0027) [2024-04-27 13:12:29,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6320078848. Throughput: 0: 53815.1. Samples: 810551700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:29,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 13:12:29,569][52263] Updated weights for policy 0, policy_version 385749 (0.0026) [2024-04-27 13:12:32,221][52263] Updated weights for policy 0, policy_version 385759 (0.0030) [2024-04-27 13:12:34,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6320357376. Throughput: 0: 53829.0. Samples: 810875160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:34,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 13:12:35,551][52263] Updated weights for policy 0, policy_version 385769 (0.0031) [2024-04-27 13:12:38,100][52263] Updated weights for policy 0, policy_version 385779 (0.0029) [2024-04-27 13:12:39,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6320652288. Throughput: 0: 53845.5. Samples: 811200260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:39,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 13:12:41,755][52263] Updated weights for policy 0, policy_version 385789 (0.0033) [2024-04-27 13:12:44,107][52031] Fps is (10 sec: 55704.3, 60 sec: 54067.0, 300 sec: 53484.0). Total num frames: 6320914432. Throughput: 0: 53999.3. Samples: 811367640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:44,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 13:12:44,311][52263] Updated weights for policy 0, policy_version 385799 (0.0029) [2024-04-27 13:12:47,829][52263] Updated weights for policy 0, policy_version 385809 (0.0038) [2024-04-27 13:12:49,107][52031] Fps is (10 sec: 50789.6, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6321160192. Throughput: 0: 54006.2. Samples: 811689160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:49,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 13:12:50,537][52263] Updated weights for policy 0, policy_version 385819 (0.0029) [2024-04-27 13:12:54,025][52263] Updated weights for policy 0, policy_version 385829 (0.0028) [2024-04-27 13:12:54,106][52031] Fps is (10 sec: 50791.6, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6321422336. Throughput: 0: 53940.5. Samples: 812006740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 13:12:54,107][52031] Avg episode reward: [(0, '0.472')] [2024-04-27 13:12:56,546][52263] Updated weights for policy 0, policy_version 385839 (0.0028) [2024-04-27 13:12:59,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6321700864. Throughput: 0: 53658.2. Samples: 812159060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:12:59,107][52031] Avg episode reward: [(0, '0.676')] [2024-04-27 13:12:59,988][52263] Updated weights for policy 0, policy_version 385849 (0.0028) [2024-04-27 13:13:02,666][52263] Updated weights for policy 0, policy_version 385859 (0.0038) [2024-04-27 13:13:04,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6321979392. Throughput: 0: 53512.1. Samples: 812481420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:04,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 13:13:06,249][52263] Updated weights for policy 0, policy_version 385869 (0.0024) [2024-04-27 13:13:08,706][52263] Updated weights for policy 0, policy_version 385879 (0.0028) [2024-04-27 13:13:09,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6322241536. Throughput: 0: 53500.1. Samples: 812801140. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:09,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 13:13:12,399][52263] Updated weights for policy 0, policy_version 385889 (0.0032) [2024-04-27 13:13:14,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6322503680. Throughput: 0: 53645.8. Samples: 812965760. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:14,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 13:13:14,923][52263] Updated weights for policy 0, policy_version 385899 (0.0032) [2024-04-27 13:13:18,551][52263] Updated weights for policy 0, policy_version 385909 (0.0031) [2024-04-27 13:13:19,109][52031] Fps is (10 sec: 52412.5, 60 sec: 53518.4, 300 sec: 53427.9). Total num frames: 6322765824. Throughput: 0: 53496.3. Samples: 813282660. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:19,110][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 13:13:21,105][52263] Updated weights for policy 0, policy_version 385919 (0.0031) [2024-04-27 13:13:24,106][52031] Fps is (10 sec: 50789.9, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6323011584. Throughput: 0: 53423.8. Samples: 813604340. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:24,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 13:13:24,628][52263] Updated weights for policy 0, policy_version 385929 (0.0040) [2024-04-27 13:13:26,329][52242] Signal inference workers to stop experience collection... (12250 times) [2024-04-27 13:13:26,330][52242] Signal inference workers to resume experience collection... (12250 times) [2024-04-27 13:13:26,356][52263] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-04-27 13:13:26,356][52263] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-04-27 13:13:27,207][52263] Updated weights for policy 0, policy_version 385939 (0.0029) [2024-04-27 13:13:29,107][52031] Fps is (10 sec: 52443.9, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6323290112. Throughput: 0: 53040.5. Samples: 813754460. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:29,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 13:13:30,812][52263] Updated weights for policy 0, policy_version 385949 (0.0033) [2024-04-27 13:13:33,358][52263] Updated weights for policy 0, policy_version 385959 (0.0030) [2024-04-27 13:13:34,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6323585024. Throughput: 0: 52934.3. Samples: 814071200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:34,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 13:13:34,169][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000385962_6323601408.pth... [2024-04-27 13:13:34,216][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000385177_6310739968.pth [2024-04-27 13:13:37,191][52263] Updated weights for policy 0, policy_version 385969 (0.0032) [2024-04-27 13:13:39,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6323830784. Throughput: 0: 53014.1. Samples: 814392380. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:39,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 13:13:39,514][52263] Updated weights for policy 0, policy_version 385979 (0.0029) [2024-04-27 13:13:43,123][52263] Updated weights for policy 0, policy_version 385989 (0.0027) [2024-04-27 13:13:44,106][52031] Fps is (10 sec: 47513.4, 60 sec: 52428.9, 300 sec: 53261.9). Total num frames: 6324060160. Throughput: 0: 53171.6. Samples: 814551780. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:44,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 13:13:45,661][52263] Updated weights for policy 0, policy_version 385999 (0.0031) [2024-04-27 13:13:49,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6324355072. Throughput: 0: 53114.2. Samples: 814871560. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:49,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 13:13:49,207][52263] Updated weights for policy 0, policy_version 386009 (0.0029) [2024-04-27 13:13:51,706][52263] Updated weights for policy 0, policy_version 386019 (0.0032) [2024-04-27 13:13:54,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6324617216. Throughput: 0: 53225.2. Samples: 815196280. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:54,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 13:13:55,389][52263] Updated weights for policy 0, policy_version 386029 (0.0031) [2024-04-27 13:13:58,000][52263] Updated weights for policy 0, policy_version 386039 (0.0031) [2024-04-27 13:13:59,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6324912128. Throughput: 0: 53227.6. Samples: 815361000. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:13:59,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 13:14:01,619][52263] Updated weights for policy 0, policy_version 386049 (0.0029) [2024-04-27 13:14:04,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6325174272. Throughput: 0: 53276.0. Samples: 815679920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:14:04,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 13:14:04,176][52263] Updated weights for policy 0, policy_version 386059 (0.0032) [2024-04-27 13:14:07,607][52263] Updated weights for policy 0, policy_version 386069 (0.0030) [2024-04-27 13:14:09,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6325452800. Throughput: 0: 53367.7. Samples: 816005880. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 13:14:09,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 13:14:10,101][52263] Updated weights for policy 0, policy_version 386079 (0.0029) [2024-04-27 13:14:13,667][52263] Updated weights for policy 0, policy_version 386089 (0.0030) [2024-04-27 13:14:14,106][52031] Fps is (10 sec: 50790.7, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6325682176. Throughput: 0: 53392.2. Samples: 816157100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:14,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 13:14:16,088][52263] Updated weights for policy 0, policy_version 386099 (0.0031) [2024-04-27 13:14:19,106][52031] Fps is (10 sec: 49151.8, 60 sec: 52977.6, 300 sec: 53373.0). Total num frames: 6325944320. Throughput: 0: 53523.1. Samples: 816479740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:19,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 13:14:20,152][52263] Updated weights for policy 0, policy_version 386109 (0.0027) [2024-04-27 13:14:20,754][52242] Signal inference workers to stop experience collection... (12300 times) [2024-04-27 13:14:20,754][52242] Signal inference workers to resume experience collection... (12300 times) [2024-04-27 13:14:20,777][52263] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-04-27 13:14:20,777][52263] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-04-27 13:14:22,193][52263] Updated weights for policy 0, policy_version 386119 (0.0025) [2024-04-27 13:14:24,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6326239232. Throughput: 0: 53545.3. Samples: 816801920. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:24,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:14:26,314][52263] Updated weights for policy 0, policy_version 386129 (0.0029) [2024-04-27 13:14:28,417][52263] Updated weights for policy 0, policy_version 386139 (0.0030) [2024-04-27 13:14:29,107][52031] Fps is (10 sec: 58981.4, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6326534144. Throughput: 0: 53689.2. Samples: 816967800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:29,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 13:14:32,365][52263] Updated weights for policy 0, policy_version 386149 (0.0026) [2024-04-27 13:14:34,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.1, 300 sec: 53428.6). Total num frames: 6326779904. Throughput: 0: 53720.5. Samples: 817288980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:34,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 13:14:34,518][52263] Updated weights for policy 0, policy_version 386159 (0.0030) [2024-04-27 13:14:38,476][52263] Updated weights for policy 0, policy_version 386169 (0.0031) [2024-04-27 13:14:39,107][52031] Fps is (10 sec: 49152.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6327025664. Throughput: 0: 53720.4. Samples: 817613700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:39,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 13:14:40,627][52263] Updated weights for policy 0, policy_version 386179 (0.0031) [2024-04-27 13:14:44,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 6327287808. Throughput: 0: 53354.5. Samples: 817761960. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:44,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 13:14:44,599][52263] Updated weights for policy 0, policy_version 386189 (0.0032) [2024-04-27 13:14:46,692][52263] Updated weights for policy 0, policy_version 386199 (0.0030) [2024-04-27 13:14:49,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6327566336. Throughput: 0: 53405.2. Samples: 818083160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:49,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 13:14:50,610][52263] Updated weights for policy 0, policy_version 386209 (0.0026) [2024-04-27 13:14:52,914][52263] Updated weights for policy 0, policy_version 386219 (0.0030) [2024-04-27 13:14:54,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6327844864. Throughput: 0: 53268.9. Samples: 818402980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:54,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 13:14:56,591][52263] Updated weights for policy 0, policy_version 386229 (0.0032) [2024-04-27 13:14:58,921][52263] Updated weights for policy 0, policy_version 386239 (0.0030) [2024-04-27 13:14:59,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6328139776. Throughput: 0: 53702.0. Samples: 818573700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:14:59,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 13:15:02,662][52263] Updated weights for policy 0, policy_version 386249 (0.0031) [2024-04-27 13:15:04,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6328385536. Throughput: 0: 53701.5. Samples: 818896300. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:15:04,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 13:15:04,928][52263] Updated weights for policy 0, policy_version 386259 (0.0028) [2024-04-27 13:15:08,748][52263] Updated weights for policy 0, policy_version 386269 (0.0034) [2024-04-27 13:15:09,107][52031] Fps is (10 sec: 50791.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6328647680. Throughput: 0: 53721.8. Samples: 819219400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:15:09,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 13:15:11,065][52263] Updated weights for policy 0, policy_version 386279 (0.0032) [2024-04-27 13:15:14,107][52031] Fps is (10 sec: 49150.9, 60 sec: 53247.8, 300 sec: 53372.9). Total num frames: 6328877056. Throughput: 0: 53336.1. Samples: 819367920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:15:14,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 13:15:14,948][52263] Updated weights for policy 0, policy_version 386289 (0.0029) [2024-04-27 13:15:17,206][52263] Updated weights for policy 0, policy_version 386299 (0.0028) [2024-04-27 13:15:17,742][52242] Signal inference workers to stop experience collection... (12350 times) [2024-04-27 13:15:17,783][52263] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-04-27 13:15:17,798][52242] Signal inference workers to resume experience collection... (12350 times) [2024-04-27 13:15:17,799][52263] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-04-27 13:15:19,106][52031] Fps is (10 sec: 54067.5, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6329188352. Throughput: 0: 53336.8. Samples: 819689140. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:15:19,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 13:15:21,062][52263] Updated weights for policy 0, policy_version 386309 (0.0029) [2024-04-27 13:15:23,378][52263] Updated weights for policy 0, policy_version 386319 (0.0024) [2024-04-27 13:15:24,106][52031] Fps is (10 sec: 60621.5, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6329483264. Throughput: 0: 53198.3. Samples: 820007620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:15:24,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 13:15:27,023][52263] Updated weights for policy 0, policy_version 386329 (0.0032) [2024-04-27 13:15:29,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.2, 300 sec: 53484.0). Total num frames: 6329729024. Throughput: 0: 53857.5. Samples: 820185540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 13:15:29,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 13:15:29,942][52263] Updated weights for policy 0, policy_version 386339 (0.0029) [2024-04-27 13:15:33,141][52263] Updated weights for policy 0, policy_version 386349 (0.0029) [2024-04-27 13:15:34,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6329991168. Throughput: 0: 53818.7. Samples: 820505000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:15:34,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 13:15:34,158][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000386353_6330007552.pth... [2024-04-27 13:15:34,217][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000385568_6317146112.pth [2024-04-27 13:15:36,140][52263] Updated weights for policy 0, policy_version 386359 (0.0029) [2024-04-27 13:15:39,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6330253312. Throughput: 0: 53888.4. Samples: 820827960. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:15:39,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 13:15:39,363][52263] Updated weights for policy 0, policy_version 386369 (0.0032) [2024-04-27 13:15:42,155][52263] Updated weights for policy 0, policy_version 386379 (0.0036) [2024-04-27 13:15:44,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6330515456. Throughput: 0: 53403.7. Samples: 820976860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:15:44,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 13:15:45,325][52263] Updated weights for policy 0, policy_version 386389 (0.0030) [2024-04-27 13:15:48,198][52263] Updated weights for policy 0, policy_version 386399 (0.0032) [2024-04-27 13:15:49,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6330793984. Throughput: 0: 53351.3. Samples: 821297120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:15:49,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 13:15:51,363][52263] Updated weights for policy 0, policy_version 386409 (0.0035) [2024-04-27 13:15:54,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6331072512. Throughput: 0: 53276.5. Samples: 821616840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:15:54,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 13:15:54,576][52263] Updated weights for policy 0, policy_version 386419 (0.0029) [2024-04-27 13:15:57,489][52263] Updated weights for policy 0, policy_version 386429 (0.0033) [2024-04-27 13:15:59,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53248.2, 300 sec: 53539.6). Total num frames: 6331334656. Throughput: 0: 53912.2. Samples: 821793960. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:15:59,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 13:16:00,594][52263] Updated weights for policy 0, policy_version 386439 (0.0027) [2024-04-27 13:16:03,938][52263] Updated weights for policy 0, policy_version 386449 (0.0027) [2024-04-27 13:16:04,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6331580416. Throughput: 0: 53846.7. Samples: 822112240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:04,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 13:16:04,463][52242] Signal inference workers to stop experience collection... (12400 times) [2024-04-27 13:16:04,463][52242] Signal inference workers to resume experience collection... (12400 times) [2024-04-27 13:16:04,476][52263] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-04-27 13:16:04,495][52263] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-04-27 13:16:06,586][52263] Updated weights for policy 0, policy_version 386459 (0.0031) [2024-04-27 13:16:09,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6331858944. Throughput: 0: 53899.0. Samples: 822433080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:09,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 13:16:10,185][52263] Updated weights for policy 0, policy_version 386469 (0.0031) [2024-04-27 13:16:12,770][52263] Updated weights for policy 0, policy_version 386479 (0.0037) [2024-04-27 13:16:14,106][52031] Fps is (10 sec: 55705.6, 60 sec: 54340.4, 300 sec: 53539.6). Total num frames: 6332137472. Throughput: 0: 53258.7. Samples: 822582180. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:14,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 13:16:16,192][52263] Updated weights for policy 0, policy_version 386489 (0.0027) [2024-04-27 13:16:19,012][52263] Updated weights for policy 0, policy_version 386499 (0.0026) [2024-04-27 13:16:19,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6332399616. Throughput: 0: 53327.6. Samples: 822904740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:19,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 13:16:22,380][52263] Updated weights for policy 0, policy_version 386509 (0.0027) [2024-04-27 13:16:24,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6332678144. Throughput: 0: 53163.1. Samples: 823220300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:24,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 13:16:25,324][52263] Updated weights for policy 0, policy_version 386519 (0.0025) [2024-04-27 13:16:28,523][52263] Updated weights for policy 0, policy_version 386529 (0.0026) [2024-04-27 13:16:29,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6332940288. Throughput: 0: 53402.2. Samples: 823379960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:29,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 13:16:31,517][52263] Updated weights for policy 0, policy_version 386539 (0.0033) [2024-04-27 13:16:34,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6333202432. Throughput: 0: 53380.6. Samples: 823699240. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:34,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 13:16:34,490][52263] Updated weights for policy 0, policy_version 386549 (0.0032) [2024-04-27 13:16:37,485][52263] Updated weights for policy 0, policy_version 386559 (0.0034) [2024-04-27 13:16:39,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6333448192. Throughput: 0: 53641.0. Samples: 824030680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:39,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 13:16:40,590][52263] Updated weights for policy 0, policy_version 386569 (0.0030) [2024-04-27 13:16:43,681][52263] Updated weights for policy 0, policy_version 386579 (0.0027) [2024-04-27 13:16:44,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6333710336. Throughput: 0: 53079.0. Samples: 824182520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:44,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 13:16:46,698][52263] Updated weights for policy 0, policy_version 386589 (0.0026) [2024-04-27 13:16:49,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6333988864. Throughput: 0: 53058.5. Samples: 824499880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-04-27 13:16:49,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 13:16:49,813][52263] Updated weights for policy 0, policy_version 386599 (0.0028) [2024-04-27 13:16:52,839][52263] Updated weights for policy 0, policy_version 386609 (0.0029) [2024-04-27 13:16:54,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6334267392. Throughput: 0: 53081.9. Samples: 824821760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:16:54,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 13:16:55,818][52263] Updated weights for policy 0, policy_version 386619 (0.0030) [2024-04-27 13:16:58,931][52263] Updated weights for policy 0, policy_version 386629 (0.0026) [2024-04-27 13:16:59,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6334545920. Throughput: 0: 53564.8. Samples: 824992600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:16:59,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:17:01,954][52263] Updated weights for policy 0, policy_version 386639 (0.0029) [2024-04-27 13:17:04,107][52031] Fps is (10 sec: 52427.6, 60 sec: 53520.8, 300 sec: 53428.5). Total num frames: 6334791680. Throughput: 0: 53496.6. Samples: 825312100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:04,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 13:17:05,156][52263] Updated weights for policy 0, policy_version 386649 (0.0030) [2024-04-27 13:17:08,182][52263] Updated weights for policy 0, policy_version 386659 (0.0029) [2024-04-27 13:17:09,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6335070208. Throughput: 0: 53567.7. Samples: 825630840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:09,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 13:17:11,309][52263] Updated weights for policy 0, policy_version 386669 (0.0033) [2024-04-27 13:17:11,563][52242] Signal inference workers to stop experience collection... (12450 times) [2024-04-27 13:17:11,563][52242] Signal inference workers to resume experience collection... 
(12450 times) [2024-04-27 13:17:11,589][52263] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-04-27 13:17:11,589][52263] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-04-27 13:17:14,106][52031] Fps is (10 sec: 54068.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6335332352. Throughput: 0: 53451.2. Samples: 825785260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:14,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:17:14,458][52263] Updated weights for policy 0, policy_version 386679 (0.0029) [2024-04-27 13:17:17,342][52263] Updated weights for policy 0, policy_version 386689 (0.0025) [2024-04-27 13:17:19,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6335594496. Throughput: 0: 53447.9. Samples: 826104400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:19,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 13:17:20,543][52263] Updated weights for policy 0, policy_version 386699 (0.0039) [2024-04-27 13:17:23,444][52263] Updated weights for policy 0, policy_version 386709 (0.0027) [2024-04-27 13:17:24,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6335873024. Throughput: 0: 53154.0. Samples: 826422620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:24,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 13:17:26,789][52263] Updated weights for policy 0, policy_version 386719 (0.0031) [2024-04-27 13:17:29,106][52031] Fps is (10 sec: 52429.5, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6336118784. Throughput: 0: 53551.2. Samples: 826592320. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:29,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 13:17:29,654][52263] Updated weights for policy 0, policy_version 386729 (0.0026) [2024-04-27 13:17:32,918][52263] Updated weights for policy 0, policy_version 386739 (0.0033) [2024-04-27 13:17:34,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6336397312. Throughput: 0: 53591.6. Samples: 826911500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:34,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 13:17:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000386743_6336397312.pth... [2024-04-27 13:17:34,173][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000385962_6323601408.pth [2024-04-27 13:17:35,792][52263] Updated weights for policy 0, policy_version 386749 (0.0029) [2024-04-27 13:17:38,992][52263] Updated weights for policy 0, policy_version 386759 (0.0031) [2024-04-27 13:17:39,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6336659456. Throughput: 0: 53510.3. Samples: 827229720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:39,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 13:17:41,819][52263] Updated weights for policy 0, policy_version 386769 (0.0027) [2024-04-27 13:17:44,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6336937984. Throughput: 0: 53208.0. Samples: 827386960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:44,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 13:17:45,006][52263] Updated weights for policy 0, policy_version 386779 (0.0032) [2024-04-27 13:17:47,804][52263] Updated weights for policy 0, policy_version 386789 (0.0027) [2024-04-27 13:17:49,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53521.0, 300 sec: 53484.0). 
Total num frames: 6337200128. Throughput: 0: 53306.3. Samples: 827710880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:49,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 13:17:51,053][52263] Updated weights for policy 0, policy_version 386799 (0.0030) [2024-04-27 13:17:54,087][52263] Updated weights for policy 0, policy_version 386809 (0.0030) [2024-04-27 13:17:54,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6337478656. Throughput: 0: 53449.2. Samples: 828036060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:54,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 13:17:57,105][52263] Updated weights for policy 0, policy_version 386819 (0.0034) [2024-04-27 13:17:59,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6337724416. Throughput: 0: 53565.3. Samples: 828195700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:17:59,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 13:18:00,273][52263] Updated weights for policy 0, policy_version 386829 (0.0034) [2024-04-27 13:18:03,290][52263] Updated weights for policy 0, policy_version 386839 (0.0027) [2024-04-27 13:18:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.3, 300 sec: 53428.5). Total num frames: 6338002944. Throughput: 0: 53609.9. Samples: 828516840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:18:04,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:18:06,254][52263] Updated weights for policy 0, policy_version 386849 (0.0027) [2024-04-27 13:18:09,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6338281472. Throughput: 0: 53723.2. Samples: 828840160. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 13:18:09,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 13:18:09,387][52263] Updated weights for policy 0, policy_version 386859 (0.0034) [2024-04-27 13:18:12,294][52263] Updated weights for policy 0, policy_version 386869 (0.0035) [2024-04-27 13:18:14,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53484.6). Total num frames: 6338543616. Throughput: 0: 53488.9. Samples: 828999320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:14,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 13:18:15,526][52263] Updated weights for policy 0, policy_version 386879 (0.0032) [2024-04-27 13:18:18,442][52263] Updated weights for policy 0, policy_version 386889 (0.0028) [2024-04-27 13:18:19,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6338805760. Throughput: 0: 53581.3. Samples: 829322660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:19,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 13:18:21,736][52263] Updated weights for policy 0, policy_version 386899 (0.0027) [2024-04-27 13:18:24,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6339067904. Throughput: 0: 53522.2. Samples: 829638220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:24,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 13:18:24,599][52263] Updated weights for policy 0, policy_version 386909 (0.0031) [2024-04-27 13:18:28,022][52263] Updated weights for policy 0, policy_version 386919 (0.0029) [2024-04-27 13:18:29,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6339346432. Throughput: 0: 53550.2. Samples: 829796720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:29,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 13:18:30,792][52263] Updated weights for policy 0, policy_version 386929 (0.0028) [2024-04-27 13:18:34,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6339592192. Throughput: 0: 53466.5. Samples: 830116860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:34,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 13:18:34,155][52263] Updated weights for policy 0, policy_version 386939 (0.0035) [2024-04-27 13:18:36,852][52263] Updated weights for policy 0, policy_version 386949 (0.0026) [2024-04-27 13:18:38,985][52242] Signal inference workers to stop experience collection... (12500 times) [2024-04-27 13:18:39,020][52263] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-04-27 13:18:39,050][52242] Signal inference workers to resume experience collection... (12500 times) [2024-04-27 13:18:39,054][52263] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-04-27 13:18:39,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6339870720. Throughput: 0: 53375.5. Samples: 830437960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:39,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 13:18:40,264][52263] Updated weights for policy 0, policy_version 386959 (0.0029) [2024-04-27 13:18:43,021][52263] Updated weights for policy 0, policy_version 386969 (0.0028) [2024-04-27 13:18:44,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6340132864. Throughput: 0: 53399.6. Samples: 830598680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:44,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 13:18:46,380][52263] Updated weights for policy 0, policy_version 386979 (0.0030) [2024-04-27 13:18:49,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6340411392. Throughput: 0: 53336.0. Samples: 830916960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:49,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 13:18:49,316][52263] Updated weights for policy 0, policy_version 386989 (0.0032) [2024-04-27 13:18:52,463][52263] Updated weights for policy 0, policy_version 386999 (0.0030) [2024-04-27 13:18:54,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6340673536. Throughput: 0: 53323.0. Samples: 831239700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:54,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 13:18:55,412][52263] Updated weights for policy 0, policy_version 387009 (0.0031) [2024-04-27 13:18:58,638][52263] Updated weights for policy 0, policy_version 387019 (0.0028) [2024-04-27 13:18:59,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6340935680. Throughput: 0: 53355.1. Samples: 831400300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:18:59,107][52031] Avg episode reward: [(0, '0.489')] [2024-04-27 13:19:01,364][52263] Updated weights for policy 0, policy_version 387029 (0.0037) [2024-04-27 13:19:04,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6341214208. Throughput: 0: 53469.5. Samples: 831728780. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:19:04,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 13:19:04,696][52263] Updated weights for policy 0, policy_version 387039 (0.0029) [2024-04-27 13:19:07,493][52263] Updated weights for policy 0, policy_version 387049 (0.0030) [2024-04-27 13:19:09,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6341476352. Throughput: 0: 53508.0. Samples: 832046080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:19:09,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 13:19:10,826][52263] Updated weights for policy 0, policy_version 387059 (0.0032) [2024-04-27 13:19:13,634][52263] Updated weights for policy 0, policy_version 387069 (0.0034) [2024-04-27 13:19:14,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6341754880. Throughput: 0: 53511.7. Samples: 832204740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:19:14,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 13:19:17,019][52263] Updated weights for policy 0, policy_version 387079 (0.0032) [2024-04-27 13:19:19,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6342017024. Throughput: 0: 53542.0. Samples: 832526260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:19:19,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 13:19:19,886][52263] Updated weights for policy 0, policy_version 387089 (0.0028) [2024-04-27 13:19:23,203][52263] Updated weights for policy 0, policy_version 387099 (0.0029) [2024-04-27 13:19:24,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6342295552. Throughput: 0: 53670.8. Samples: 832853140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:19:24,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 13:19:25,904][52263] Updated weights for policy 0, policy_version 387109 (0.0032) [2024-04-27 13:19:29,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6342541312. Throughput: 0: 53686.5. Samples: 833014580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 13:19:29,107][52031] Avg episode reward: [(0, '0.457')] [2024-04-27 13:19:29,164][52263] Updated weights for policy 0, policy_version 387119 (0.0027) [2024-04-27 13:19:31,880][52263] Updated weights for policy 0, policy_version 387129 (0.0032) [2024-04-27 13:19:34,107][52031] Fps is (10 sec: 52427.6, 60 sec: 53793.9, 300 sec: 53539.6). Total num frames: 6342819840. Throughput: 0: 53654.0. Samples: 833331400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:19:34,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 13:19:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000387135_6342819840.pth... [2024-04-27 13:19:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000386353_6330007552.pth [2024-04-27 13:19:35,198][52263] Updated weights for policy 0, policy_version 387139 (0.0031) [2024-04-27 13:19:38,140][52263] Updated weights for policy 0, policy_version 387149 (0.0033) [2024-04-27 13:19:39,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6343098368. Throughput: 0: 53498.7. Samples: 833647140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:19:39,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 13:19:41,492][52263] Updated weights for policy 0, policy_version 387159 (0.0032) [2024-04-27 13:19:44,106][52031] Fps is (10 sec: 52430.3, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6343344128. Throughput: 0: 53601.4. Samples: 833812360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:19:44,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 13:19:44,318][52263] Updated weights for policy 0, policy_version 387169 (0.0025) [2024-04-27 13:19:47,666][52263] Updated weights for policy 0, policy_version 387179 (0.0026) [2024-04-27 13:19:49,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6343622656. Throughput: 0: 53534.5. Samples: 834137840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:19:49,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 13:19:50,433][52263] Updated weights for policy 0, policy_version 387189 (0.0031) [2024-04-27 13:19:53,738][52263] Updated weights for policy 0, policy_version 387199 (0.0033) [2024-04-27 13:19:54,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6343884800. Throughput: 0: 53604.8. Samples: 834458300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:19:54,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 13:19:56,434][52263] Updated weights for policy 0, policy_version 387209 (0.0029) [2024-04-27 13:19:59,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6344146944. Throughput: 0: 53519.1. Samples: 834613100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:19:59,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 13:19:59,640][52263] Updated weights for policy 0, policy_version 387219 (0.0028) [2024-04-27 13:20:00,143][52242] Signal inference workers to stop experience collection... (12550 times) [2024-04-27 13:20:00,144][52242] Signal inference workers to resume experience collection... 
(12550 times) [2024-04-27 13:20:00,158][52263] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-04-27 13:20:00,158][52263] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-04-27 13:20:02,618][52263] Updated weights for policy 0, policy_version 387229 (0.0025) [2024-04-27 13:20:04,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6344441856. Throughput: 0: 53690.7. Samples: 834942340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:04,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 13:20:05,867][52263] Updated weights for policy 0, policy_version 387239 (0.0029) [2024-04-27 13:20:08,696][52263] Updated weights for policy 0, policy_version 387249 (0.0031) [2024-04-27 13:20:09,106][52031] Fps is (10 sec: 57344.1, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6344720384. Throughput: 0: 53626.7. Samples: 835266340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:09,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 13:20:11,975][52263] Updated weights for policy 0, policy_version 387259 (0.0031) [2024-04-27 13:20:14,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6344966144. Throughput: 0: 53622.6. Samples: 835427600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:14,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 13:20:14,704][52263] Updated weights for policy 0, policy_version 387269 (0.0026) [2024-04-27 13:20:18,081][52263] Updated weights for policy 0, policy_version 387279 (0.0030) [2024-04-27 13:20:19,106][52031] Fps is (10 sec: 54066.8, 60 sec: 54067.2, 300 sec: 53484.0). Total num frames: 6345261056. Throughput: 0: 53742.0. Samples: 835749780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:19,107][52031] Avg episode reward: [(0, '0.655')] [2024-04-27 13:20:20,638][52263] Updated weights for policy 0, policy_version 387289 (0.0030) [2024-04-27 13:20:24,013][52263] Updated weights for policy 0, policy_version 387299 (0.0035) [2024-04-27 13:20:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6345506816. Throughput: 0: 54048.5. Samples: 836079320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:24,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 13:20:26,729][52263] Updated weights for policy 0, policy_version 387309 (0.0033) [2024-04-27 13:20:29,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6345768960. Throughput: 0: 53805.2. Samples: 836233600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:29,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 13:20:30,194][52263] Updated weights for policy 0, policy_version 387319 (0.0027) [2024-04-27 13:20:32,837][52263] Updated weights for policy 0, policy_version 387329 (0.0038) [2024-04-27 13:20:34,106][52031] Fps is (10 sec: 57344.1, 60 sec: 54340.4, 300 sec: 53650.7). Total num frames: 6346080256. Throughput: 0: 53846.3. Samples: 836560920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:34,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 13:20:36,149][52263] Updated weights for policy 0, policy_version 387339 (0.0030) [2024-04-27 13:20:39,066][52263] Updated weights for policy 0, policy_version 387349 (0.0034) [2024-04-27 13:20:39,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6346326016. Throughput: 0: 53893.9. Samples: 836883520. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:39,107][52031] Avg episode reward: [(0, '0.674')] [2024-04-27 13:20:42,239][52263] Updated weights for policy 0, policy_version 387359 (0.0029) [2024-04-27 13:20:44,106][52031] Fps is (10 sec: 50790.8, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6346588160. Throughput: 0: 54152.9. Samples: 837049980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:44,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 13:20:45,132][52263] Updated weights for policy 0, policy_version 387369 (0.0028) [2024-04-27 13:20:48,347][52263] Updated weights for policy 0, policy_version 387379 (0.0029) [2024-04-27 13:20:49,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6346833920. Throughput: 0: 53925.2. Samples: 837368980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-04-27 13:20:49,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 13:20:51,102][52263] Updated weights for policy 0, policy_version 387389 (0.0027) [2024-04-27 13:20:51,834][52242] Signal inference workers to stop experience collection... (12600 times) [2024-04-27 13:20:51,839][52242] Signal inference workers to resume experience collection... (12600 times) [2024-04-27 13:20:51,865][52263] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-04-27 13:20:51,865][52263] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-04-27 13:20:54,107][52031] Fps is (10 sec: 54066.5, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6347128832. Throughput: 0: 53953.2. Samples: 837694240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:20:54,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 13:20:54,345][52263] Updated weights for policy 0, policy_version 387399 (0.0027) [2024-04-27 13:20:57,190][52263] Updated weights for policy 0, policy_version 387409 (0.0030) [2024-04-27 13:20:59,106][52031] Fps is (10 sec: 55706.4, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6347390976. Throughput: 0: 53822.0. Samples: 837849580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:20:59,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 13:21:00,608][52263] Updated weights for policy 0, policy_version 387419 (0.0028) [2024-04-27 13:21:03,256][52263] Updated weights for policy 0, policy_version 387429 (0.0028) [2024-04-27 13:21:04,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6347669504. Throughput: 0: 53781.8. Samples: 838169960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:04,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 13:21:06,820][52263] Updated weights for policy 0, policy_version 387439 (0.0030) [2024-04-27 13:21:09,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6347931648. Throughput: 0: 53491.1. Samples: 838486420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:09,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 13:21:09,434][52263] Updated weights for policy 0, policy_version 387449 (0.0031) [2024-04-27 13:21:12,835][52263] Updated weights for policy 0, policy_version 387459 (0.0032) [2024-04-27 13:21:14,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6348193792. Throughput: 0: 53718.2. Samples: 838650920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:14,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 13:21:15,447][52263] Updated weights for policy 0, policy_version 387469 (0.0030) [2024-04-27 13:21:19,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6348439552. Throughput: 0: 53584.0. Samples: 838972200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:19,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 13:21:19,337][52263] Updated weights for policy 0, policy_version 387479 (0.0030) [2024-04-27 13:21:21,571][52263] Updated weights for policy 0, policy_version 387489 (0.0034) [2024-04-27 13:21:24,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6348701696. Throughput: 0: 53624.0. Samples: 839296600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:24,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 13:21:25,420][52263] Updated weights for policy 0, policy_version 387499 (0.0029) [2024-04-27 13:21:27,615][52263] Updated weights for policy 0, policy_version 387509 (0.0029) [2024-04-27 13:21:29,106][52031] Fps is (10 sec: 57343.9, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6349012992. Throughput: 0: 53515.0. Samples: 839458160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:29,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 13:21:31,531][52263] Updated weights for policy 0, policy_version 387519 (0.0034) [2024-04-27 13:21:33,729][52263] Updated weights for policy 0, policy_version 387529 (0.0028) [2024-04-27 13:21:34,107][52031] Fps is (10 sec: 57343.2, 60 sec: 53247.9, 300 sec: 53650.6). Total num frames: 6349275136. Throughput: 0: 53501.3. Samples: 839776540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:34,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 13:21:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000387529_6349275136.pth... [2024-04-27 13:21:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000386743_6336397312.pth [2024-04-27 13:21:37,547][52263] Updated weights for policy 0, policy_version 387539 (0.0029) [2024-04-27 13:21:39,106][52031] Fps is (10 sec: 50790.0, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6349520896. Throughput: 0: 53336.0. Samples: 840094360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:39,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 13:21:40,060][52263] Updated weights for policy 0, policy_version 387549 (0.0032) [2024-04-27 13:21:43,571][52263] Updated weights for policy 0, policy_version 387559 (0.0024) [2024-04-27 13:21:44,076][52242] Signal inference workers to stop experience collection... (12650 times) [2024-04-27 13:21:44,077][52242] Signal inference workers to resume experience collection... (12650 times) [2024-04-27 13:21:44,099][52263] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-04-27 13:21:44,099][52263] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-04-27 13:21:44,106][52031] Fps is (10 sec: 50791.5, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6349783040. Throughput: 0: 53390.3. Samples: 840252140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:44,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 13:21:46,106][52263] Updated weights for policy 0, policy_version 387569 (0.0028) [2024-04-27 13:21:49,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6350045184. Throughput: 0: 53423.2. Samples: 840574000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:49,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 13:21:49,823][52263] Updated weights for policy 0, policy_version 387579 (0.0026) [2024-04-27 13:21:52,361][52263] Updated weights for policy 0, policy_version 387589 (0.0031) [2024-04-27 13:21:54,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6350323712. Throughput: 0: 53503.9. Samples: 840894100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:54,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 13:21:55,791][52263] Updated weights for policy 0, policy_version 387599 (0.0028) [2024-04-27 13:21:58,594][52263] Updated weights for policy 0, policy_version 387609 (0.0034) [2024-04-27 13:21:59,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6350618624. Throughput: 0: 53564.5. Samples: 841061320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:21:59,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 13:22:02,089][52263] Updated weights for policy 0, policy_version 387619 (0.0037) [2024-04-27 13:22:04,107][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6350897152. Throughput: 0: 53600.4. Samples: 841384220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:22:04,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 13:22:04,705][52263] Updated weights for policy 0, policy_version 387629 (0.0028) [2024-04-27 13:22:08,151][52263] Updated weights for policy 0, policy_version 387639 (0.0032) [2024-04-27 13:22:09,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6351142912. Throughput: 0: 53541.6. Samples: 841705980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 13:22:09,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 13:22:10,751][52263] Updated weights for policy 0, policy_version 387649 (0.0037) [2024-04-27 13:22:14,107][52031] Fps is (10 sec: 49151.7, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6351388672. Throughput: 0: 53149.7. Samples: 841849900. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:14,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 13:22:14,348][52263] Updated weights for policy 0, policy_version 387659 (0.0027) [2024-04-27 13:22:16,746][52263] Updated weights for policy 0, policy_version 387669 (0.0033) [2024-04-27 13:22:19,107][52031] Fps is (10 sec: 49152.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6351634432. Throughput: 0: 53193.3. Samples: 842170240. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:19,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 13:22:20,599][52263] Updated weights for policy 0, policy_version 387679 (0.0030) [2024-04-27 13:22:22,863][52263] Updated weights for policy 0, policy_version 387689 (0.0028) [2024-04-27 13:22:24,107][52031] Fps is (10 sec: 57344.1, 60 sec: 54340.1, 300 sec: 53706.2). Total num frames: 6351962112. Throughput: 0: 53343.5. Samples: 842494820. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:24,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 13:22:26,607][52263] Updated weights for policy 0, policy_version 387699 (0.0028) [2024-04-27 13:22:29,107][52031] Fps is (10 sec: 57344.2, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6352207872. Throughput: 0: 53658.5. Samples: 842666780. 
Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:29,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 13:22:29,121][52263] Updated weights for policy 0, policy_version 387709 (0.0025) [2024-04-27 13:22:31,809][52242] Signal inference workers to stop experience collection... (12700 times) [2024-04-27 13:22:31,842][52263] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-04-27 13:22:31,870][52242] Signal inference workers to resume experience collection... (12700 times) [2024-04-27 13:22:31,871][52263] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-04-27 13:22:32,561][52263] Updated weights for policy 0, policy_version 387719 (0.0031) [2024-04-27 13:22:34,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6352486400. Throughput: 0: 53633.8. Samples: 842987520. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:34,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 13:22:35,147][52263] Updated weights for policy 0, policy_version 387729 (0.0030) [2024-04-27 13:22:38,836][52263] Updated weights for policy 0, policy_version 387739 (0.0035) [2024-04-27 13:22:39,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6352715776. Throughput: 0: 53516.4. Samples: 843302340. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:39,107][52031] Avg episode reward: [(0, '0.670')] [2024-04-27 13:22:41,166][52263] Updated weights for policy 0, policy_version 387749 (0.0033) [2024-04-27 13:22:44,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6352977920. Throughput: 0: 53151.1. Samples: 843453120. 
Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:44,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 13:22:45,091][52263] Updated weights for policy 0, policy_version 387759 (0.0030) [2024-04-27 13:22:47,350][52263] Updated weights for policy 0, policy_version 387769 (0.0028) [2024-04-27 13:22:49,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6353256448. Throughput: 0: 53000.9. Samples: 843769260. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:49,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 13:22:51,178][52263] Updated weights for policy 0, policy_version 387779 (0.0030) [2024-04-27 13:22:53,575][52263] Updated weights for policy 0, policy_version 387789 (0.0028) [2024-04-27 13:22:54,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6353551360. Throughput: 0: 53194.9. Samples: 844099740. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:54,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 13:22:57,283][52263] Updated weights for policy 0, policy_version 387799 (0.0032) [2024-04-27 13:22:59,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6353829888. Throughput: 0: 53804.1. Samples: 844271080. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:22:59,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 13:22:59,536][52263] Updated weights for policy 0, policy_version 387809 (0.0035) [2024-04-27 13:23:03,445][52263] Updated weights for policy 0, policy_version 387819 (0.0030) [2024-04-27 13:23:04,107][52031] Fps is (10 sec: 52427.8, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6354075648. Throughput: 0: 53783.6. Samples: 844590500. 
Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:23:04,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 13:23:05,637][52263] Updated weights for policy 0, policy_version 387829 (0.0025) [2024-04-27 13:23:09,107][52031] Fps is (10 sec: 47512.9, 60 sec: 52701.8, 300 sec: 53428.5). Total num frames: 6354305024. Throughput: 0: 53591.0. Samples: 844906420. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:23:09,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 13:23:09,672][52263] Updated weights for policy 0, policy_version 387839 (0.0032) [2024-04-27 13:23:11,739][52263] Updated weights for policy 0, policy_version 387849 (0.0034) [2024-04-27 13:23:14,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6354583552. Throughput: 0: 53133.8. Samples: 845057800. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:23:14,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 13:23:15,715][52263] Updated weights for policy 0, policy_version 387859 (0.0035) [2024-04-27 13:23:18,000][52263] Updated weights for policy 0, policy_version 387869 (0.0027) [2024-04-27 13:23:19,106][52031] Fps is (10 sec: 57345.0, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6354878464. Throughput: 0: 53091.0. Samples: 845376620. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:23:19,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 13:23:21,695][52263] Updated weights for policy 0, policy_version 387879 (0.0028) [2024-04-27 13:23:24,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6355156992. Throughput: 0: 53194.8. Samples: 845696100. 
Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:23:24,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 13:23:24,120][52263] Updated weights for policy 0, policy_version 387889 (0.0032) [2024-04-27 13:23:27,963][52263] Updated weights for policy 0, policy_version 387899 (0.0034) [2024-04-27 13:23:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6355419136. Throughput: 0: 53629.3. Samples: 845866440. Policy #0 lag: (min: 1.0, avg: 13.4, max: 26.0) [2024-04-27 13:23:29,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 13:23:30,088][52263] Updated weights for policy 0, policy_version 387909 (0.0028) [2024-04-27 13:23:32,125][52242] Signal inference workers to stop experience collection... (12750 times) [2024-04-27 13:23:32,171][52263] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-04-27 13:23:32,185][52242] Signal inference workers to resume experience collection... (12750 times) [2024-04-27 13:23:32,190][52263] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-04-27 13:23:34,106][52031] Fps is (10 sec: 49151.8, 60 sec: 52701.8, 300 sec: 53484.0). Total num frames: 6355648512. Throughput: 0: 53610.2. Samples: 846181720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:23:34,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:23:34,124][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000387919_6355664896.pth... [2024-04-27 13:23:34,132][52263] Updated weights for policy 0, policy_version 387919 (0.0032) [2024-04-27 13:23:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000387135_6342819840.pth [2024-04-27 13:23:36,136][52263] Updated weights for policy 0, policy_version 387929 (0.0029) [2024-04-27 13:23:39,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6355910656. Throughput: 0: 53500.4. Samples: 846507260. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:23:39,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 13:23:40,387][52263] Updated weights for policy 0, policy_version 387939 (0.0029) [2024-04-27 13:23:42,604][52263] Updated weights for policy 0, policy_version 387949 (0.0025) [2024-04-27 13:23:44,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6356189184. Throughput: 0: 53000.9. Samples: 846656120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:23:44,107][52031] Avg episode reward: [(0, '0.488')] [2024-04-27 13:23:46,389][52263] Updated weights for policy 0, policy_version 387959 (0.0028) [2024-04-27 13:23:48,848][52263] Updated weights for policy 0, policy_version 387969 (0.0028) [2024-04-27 13:23:49,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6356484096. Throughput: 0: 53103.1. Samples: 846980140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:23:49,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 13:23:52,583][52263] Updated weights for policy 0, policy_version 387979 (0.0028) [2024-04-27 13:23:54,107][52031] Fps is (10 sec: 57342.6, 60 sec: 53520.8, 300 sec: 53650.6). Total num frames: 6356762624. Throughput: 0: 53156.4. Samples: 847298460. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:23:54,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 13:23:55,255][52263] Updated weights for policy 0, policy_version 387989 (0.0028) [2024-04-27 13:23:58,892][52263] Updated weights for policy 0, policy_version 387999 (0.0033) [2024-04-27 13:23:59,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52701.9, 300 sec: 53484.0). Total num frames: 6356992000. Throughput: 0: 53388.0. Samples: 847460260. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:23:59,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 13:24:01,328][52263] Updated weights for policy 0, policy_version 388009 (0.0027) [2024-04-27 13:24:04,106][52031] Fps is (10 sec: 49153.7, 60 sec: 52975.1, 300 sec: 53484.0). Total num frames: 6357254144. Throughput: 0: 53562.4. Samples: 847786920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:04,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 13:24:04,931][52263] Updated weights for policy 0, policy_version 388019 (0.0028) [2024-04-27 13:24:07,397][52263] Updated weights for policy 0, policy_version 388029 (0.0031) [2024-04-27 13:24:09,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6357532672. Throughput: 0: 53595.0. Samples: 848107880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:09,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 13:24:10,930][52263] Updated weights for policy 0, policy_version 388039 (0.0031) [2024-04-27 13:24:12,025][52242] Signal inference workers to stop experience collection... (12800 times) [2024-04-27 13:24:12,060][52263] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-04-27 13:24:12,093][52242] Signal inference workers to resume experience collection... (12800 times) [2024-04-27 13:24:12,093][52263] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-04-27 13:24:13,558][52263] Updated weights for policy 0, policy_version 388049 (0.0028) [2024-04-27 13:24:14,106][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6357811200. Throughput: 0: 53184.7. Samples: 848259760. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:14,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 13:24:17,111][52263] Updated weights for policy 0, policy_version 388059 (0.0032) [2024-04-27 13:24:19,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6358106112. Throughput: 0: 53410.5. Samples: 848585200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:19,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 13:24:19,875][52263] Updated weights for policy 0, policy_version 388069 (0.0030) [2024-04-27 13:24:23,204][52263] Updated weights for policy 0, policy_version 388079 (0.0029) [2024-04-27 13:24:24,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6358368256. Throughput: 0: 53373.8. Samples: 848909080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:24,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 13:24:26,018][52263] Updated weights for policy 0, policy_version 388089 (0.0032) [2024-04-27 13:24:29,106][52031] Fps is (10 sec: 49153.2, 60 sec: 52974.9, 300 sec: 53484.1). Total num frames: 6358597632. Throughput: 0: 53690.7. Samples: 849072200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:29,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 13:24:29,201][52263] Updated weights for policy 0, policy_version 388099 (0.0037) [2024-04-27 13:24:31,989][52263] Updated weights for policy 0, policy_version 388109 (0.0026) [2024-04-27 13:24:34,106][52031] Fps is (10 sec: 49151.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6358859776. Throughput: 0: 53641.5. Samples: 849394000. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:34,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 13:24:35,158][52263] Updated weights for policy 0, policy_version 388119 (0.0030) [2024-04-27 13:24:38,104][52263] Updated weights for policy 0, policy_version 388129 (0.0027) [2024-04-27 13:24:39,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6359138304. Throughput: 0: 53611.3. Samples: 849710960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:39,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 13:24:41,361][52263] Updated weights for policy 0, policy_version 388139 (0.0031) [2024-04-27 13:24:44,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6359400448. Throughput: 0: 53579.4. Samples: 849871340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:44,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 13:24:44,301][52263] Updated weights for policy 0, policy_version 388149 (0.0031) [2024-04-27 13:24:47,495][52263] Updated weights for policy 0, policy_version 388159 (0.0027) [2024-04-27 13:24:49,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6359695360. Throughput: 0: 53555.1. Samples: 850196900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-04-27 13:24:49,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 13:24:50,330][52263] Updated weights for policy 0, policy_version 388169 (0.0031) [2024-04-27 13:24:53,475][52263] Updated weights for policy 0, policy_version 388179 (0.0028) [2024-04-27 13:24:54,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53248.2, 300 sec: 53595.1). Total num frames: 6359957504. Throughput: 0: 53580.1. Samples: 850518980. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:24:54,107][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 13:24:56,501][52263] Updated weights for policy 0, policy_version 388189 (0.0027) [2024-04-27 13:24:59,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6360219648. Throughput: 0: 53788.9. Samples: 850680260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:24:59,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 13:24:59,452][52263] Updated weights for policy 0, policy_version 388199 (0.0027) [2024-04-27 13:25:02,580][52263] Updated weights for policy 0, policy_version 388209 (0.0030) [2024-04-27 13:25:04,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53793.9, 300 sec: 53428.5). Total num frames: 6360481792. Throughput: 0: 53673.8. Samples: 851000520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:04,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 13:25:05,688][52263] Updated weights for policy 0, policy_version 388219 (0.0026) [2024-04-27 13:25:08,621][52263] Updated weights for policy 0, policy_version 388229 (0.0031) [2024-04-27 13:25:09,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6360760320. Throughput: 0: 53543.9. Samples: 851318560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:09,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 13:25:11,872][52263] Updated weights for policy 0, policy_version 388239 (0.0035) [2024-04-27 13:25:14,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6361022464. Throughput: 0: 53550.5. Samples: 851481980. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:14,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 13:25:14,715][52263] Updated weights for policy 0, policy_version 388249 (0.0027) [2024-04-27 13:25:17,843][52263] Updated weights for policy 0, policy_version 388259 (0.0027) [2024-04-27 13:25:19,106][52031] Fps is (10 sec: 52428.5, 60 sec: 52975.1, 300 sec: 53484.0). Total num frames: 6361284608. Throughput: 0: 53550.1. Samples: 851803760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:19,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 13:25:20,943][52263] Updated weights for policy 0, policy_version 388269 (0.0026) [2024-04-27 13:25:22,839][52242] Signal inference workers to stop experience collection... (12850 times) [2024-04-27 13:25:22,840][52242] Signal inference workers to resume experience collection... (12850 times) [2024-04-27 13:25:22,850][52263] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-04-27 13:25:22,855][52263] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-04-27 13:25:24,000][52263] Updated weights for policy 0, policy_version 388279 (0.0029) [2024-04-27 13:25:24,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53247.8, 300 sec: 53539.6). Total num frames: 6361563136. Throughput: 0: 53635.0. Samples: 852124540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:24,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 13:25:26,862][52263] Updated weights for policy 0, policy_version 388289 (0.0034) [2024-04-27 13:25:29,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6361808896. Throughput: 0: 53531.4. Samples: 852280240. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:29,107][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 13:25:30,152][52263] Updated weights for policy 0, policy_version 388299 (0.0029) [2024-04-27 13:25:33,268][52263] Updated weights for policy 0, policy_version 388309 (0.0031) [2024-04-27 13:25:34,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53793.9, 300 sec: 53428.5). Total num frames: 6362087424. Throughput: 0: 53431.2. Samples: 852601320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:34,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 13:25:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000388311_6362087424.pth... [2024-04-27 13:25:34,167][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000387529_6349275136.pth [2024-04-27 13:25:36,224][52263] Updated weights for policy 0, policy_version 388319 (0.0035) [2024-04-27 13:25:39,106][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6362365952. Throughput: 0: 53323.6. Samples: 852918540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:39,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 13:25:39,474][52263] Updated weights for policy 0, policy_version 388329 (0.0036) [2024-04-27 13:25:42,224][52263] Updated weights for policy 0, policy_version 388339 (0.0032) [2024-04-27 13:25:44,106][52031] Fps is (10 sec: 52430.3, 60 sec: 53521.3, 300 sec: 53484.1). Total num frames: 6362611712. Throughput: 0: 53486.9. Samples: 853087160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:44,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 13:25:45,524][52263] Updated weights for policy 0, policy_version 388349 (0.0031) [2024-04-27 13:25:48,431][52263] Updated weights for policy 0, policy_version 388359 (0.0036) [2024-04-27 13:25:49,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53247.9, 300 sec: 53428.5). 
Total num frames: 6362890240. Throughput: 0: 53497.0. Samples: 853407880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:49,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:25:51,674][52263] Updated weights for policy 0, policy_version 388369 (0.0030) [2024-04-27 13:25:54,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6363152384. Throughput: 0: 53525.8. Samples: 853727220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:54,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 13:25:54,700][52263] Updated weights for policy 0, policy_version 388379 (0.0031) [2024-04-27 13:25:57,768][52263] Updated weights for policy 0, policy_version 388389 (0.0032) [2024-04-27 13:25:59,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6363430912. Throughput: 0: 53428.5. Samples: 853886260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:25:59,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 13:26:00,738][52263] Updated weights for policy 0, policy_version 388399 (0.0029) [2024-04-27 13:26:03,727][52263] Updated weights for policy 0, policy_version 388409 (0.0028) [2024-04-27 13:26:04,107][52031] Fps is (10 sec: 54065.6, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6363693056. Throughput: 0: 53478.9. Samples: 854210320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-27 13:26:04,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 13:26:06,715][52263] Updated weights for policy 0, policy_version 388419 (0.0031) [2024-04-27 13:26:09,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6363971584. Throughput: 0: 53462.1. Samples: 854530320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:09,107][52031] Avg episode reward: [(0, '0.701')] [2024-04-27 13:26:09,687][52263] Updated weights for policy 0, policy_version 388429 (0.0032) [2024-04-27 13:26:12,911][52263] Updated weights for policy 0, policy_version 388439 (0.0029) [2024-04-27 13:26:14,106][52031] Fps is (10 sec: 52430.3, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6364217344. Throughput: 0: 53551.5. Samples: 854690060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:14,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 13:26:15,849][52263] Updated weights for policy 0, policy_version 388449 (0.0030) [2024-04-27 13:26:19,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6364495872. Throughput: 0: 53607.9. Samples: 855013660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:19,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 13:26:19,109][52263] Updated weights for policy 0, policy_version 388459 (0.0025) [2024-04-27 13:26:22,417][52263] Updated weights for policy 0, policy_version 388469 (0.0023) [2024-04-27 13:26:24,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 6364758016. Throughput: 0: 53621.0. Samples: 855331480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:24,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 13:26:24,822][52242] Signal inference workers to stop experience collection... (12900 times) [2024-04-27 13:26:24,862][52263] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-04-27 13:26:24,920][52242] Signal inference workers to resume experience collection... 
(12900 times) [2024-04-27 13:26:24,920][52263] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-04-27 13:26:25,278][52263] Updated weights for policy 0, policy_version 388479 (0.0028) [2024-04-27 13:26:28,632][52263] Updated weights for policy 0, policy_version 388489 (0.0028) [2024-04-27 13:26:29,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6365036544. Throughput: 0: 53520.8. Samples: 855495600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:29,107][52031] Avg episode reward: [(0, '0.507')] [2024-04-27 13:26:31,267][52263] Updated weights for policy 0, policy_version 388499 (0.0029) [2024-04-27 13:26:34,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6365315072. Throughput: 0: 53521.2. Samples: 855816340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:34,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 13:26:34,574][52263] Updated weights for policy 0, policy_version 388509 (0.0034) [2024-04-27 13:26:37,470][52263] Updated weights for policy 0, policy_version 388519 (0.0024) [2024-04-27 13:26:39,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6365560832. Throughput: 0: 53407.1. Samples: 856130540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:39,107][52031] Avg episode reward: [(0, '0.487')] [2024-04-27 13:26:40,624][52263] Updated weights for policy 0, policy_version 388529 (0.0030) [2024-04-27 13:26:43,658][52263] Updated weights for policy 0, policy_version 388539 (0.0031) [2024-04-27 13:26:44,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6365822976. Throughput: 0: 53433.3. Samples: 856290760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:44,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 13:26:46,774][52263] Updated weights for policy 0, policy_version 388549 (0.0028) [2024-04-27 13:26:49,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6366085120. Throughput: 0: 53286.9. Samples: 856608220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:49,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 13:26:49,834][52263] Updated weights for policy 0, policy_version 388559 (0.0029) [2024-04-27 13:26:52,984][52263] Updated weights for policy 0, policy_version 388569 (0.0036) [2024-04-27 13:26:54,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6366363648. Throughput: 0: 53273.3. Samples: 856927620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:54,107][52031] Avg episode reward: [(0, '0.486')] [2024-04-27 13:26:56,047][52263] Updated weights for policy 0, policy_version 388579 (0.0033) [2024-04-27 13:26:58,934][52263] Updated weights for policy 0, policy_version 388589 (0.0028) [2024-04-27 13:26:59,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6366642176. Throughput: 0: 53570.2. Samples: 857100720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:26:59,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 13:27:02,292][52263] Updated weights for policy 0, policy_version 388599 (0.0025) [2024-04-27 13:27:04,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.3, 300 sec: 53428.5). Total num frames: 6366904320. Throughput: 0: 53411.9. Samples: 857417200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:27:04,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 13:27:04,984][52263] Updated weights for policy 0, policy_version 388609 (0.0025) [2024-04-27 13:27:08,308][52263] Updated weights for policy 0, policy_version 388619 (0.0033) [2024-04-27 13:27:09,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6367166464. Throughput: 0: 53399.7. Samples: 857734480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:27:09,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:27:11,163][52263] Updated weights for policy 0, policy_version 388629 (0.0029) [2024-04-27 13:27:12,533][52242] Signal inference workers to stop experience collection... (12950 times) [2024-04-27 13:27:12,534][52242] Signal inference workers to resume experience collection... (12950 times) [2024-04-27 13:27:12,558][52263] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-04-27 13:27:12,558][52263] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-04-27 13:27:14,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6367444992. Throughput: 0: 53312.8. Samples: 857894680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:27:14,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:27:14,303][52263] Updated weights for policy 0, policy_version 388639 (0.0034) [2024-04-27 13:27:17,322][52263] Updated weights for policy 0, policy_version 388649 (0.0031) [2024-04-27 13:27:19,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6367690752. Throughput: 0: 53265.9. Samples: 858213300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:27:19,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 13:27:20,506][52263] Updated weights for policy 0, policy_version 388659 (0.0025) [2024-04-27 13:27:23,367][52263] Updated weights for policy 0, policy_version 388669 (0.0031) [2024-04-27 13:27:24,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6367985664. Throughput: 0: 53445.7. Samples: 858535600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 13:27:24,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 13:27:26,911][52263] Updated weights for policy 0, policy_version 388679 (0.0034) [2024-04-27 13:27:29,107][52031] Fps is (10 sec: 57343.1, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6368264192. Throughput: 0: 53587.4. Samples: 858702200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:27:29,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 13:27:29,313][52263] Updated weights for policy 0, policy_version 388689 (0.0036) [2024-04-27 13:27:32,826][52263] Updated weights for policy 0, policy_version 388699 (0.0030) [2024-04-27 13:27:34,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6368526336. Throughput: 0: 53771.1. Samples: 859027920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:27:34,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 13:27:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000388704_6368526336.pth... [2024-04-27 13:27:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000387919_6355664896.pth [2024-04-27 13:27:35,365][52263] Updated weights for policy 0, policy_version 388709 (0.0025) [2024-04-27 13:27:39,096][52263] Updated weights for policy 0, policy_version 388719 (0.0033) [2024-04-27 13:27:39,107][52031] Fps is (10 sec: 50790.6, 60 sec: 53520.9, 300 sec: 53539.6). 
Total num frames: 6368772096. Throughput: 0: 53788.3. Samples: 859348100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:27:39,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 13:27:41,557][52263] Updated weights for policy 0, policy_version 388729 (0.0029) [2024-04-27 13:27:44,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6369050624. Throughput: 0: 53399.9. Samples: 859503720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:27:44,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 13:27:45,243][52263] Updated weights for policy 0, policy_version 388739 (0.0032) [2024-04-27 13:27:47,815][52263] Updated weights for policy 0, policy_version 388749 (0.0026) [2024-04-27 13:27:49,107][52031] Fps is (10 sec: 55705.4, 60 sec: 54067.2, 300 sec: 53484.0). Total num frames: 6369329152. Throughput: 0: 53465.6. Samples: 859823160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:27:49,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 13:27:51,340][52263] Updated weights for policy 0, policy_version 388759 (0.0030) [2024-04-27 13:27:53,883][52263] Updated weights for policy 0, policy_version 388769 (0.0027) [2024-04-27 13:27:54,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6369591296. Throughput: 0: 53540.3. Samples: 860143780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:27:54,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 13:27:57,399][52263] Updated weights for policy 0, policy_version 388779 (0.0025) [2024-04-27 13:27:59,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6369853440. Throughput: 0: 53681.1. Samples: 860310320. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:27:59,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 13:27:59,990][52263] Updated weights for policy 0, policy_version 388789 (0.0028) [2024-04-27 13:28:03,361][52263] Updated weights for policy 0, policy_version 388799 (0.0027) [2024-04-27 13:28:04,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6370131968. Throughput: 0: 53732.5. Samples: 860631260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:28:04,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 13:28:05,984][52263] Updated weights for policy 0, policy_version 388809 (0.0032) [2024-04-27 13:28:07,435][52242] Signal inference workers to stop experience collection... (13000 times) [2024-04-27 13:28:07,436][52242] Signal inference workers to resume experience collection... (13000 times) [2024-04-27 13:28:07,449][52263] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-04-27 13:28:07,449][52263] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-04-27 13:28:09,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6370394112. Throughput: 0: 53753.2. Samples: 860954500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:28:09,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 13:28:09,551][52263] Updated weights for policy 0, policy_version 388819 (0.0030) [2024-04-27 13:28:12,129][52263] Updated weights for policy 0, policy_version 388829 (0.0032) [2024-04-27 13:28:14,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6370639872. Throughput: 0: 53590.2. Samples: 861113760. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:28:14,107][52031] Avg episode reward: [(0, '0.496')] [2024-04-27 13:28:15,699][52263] Updated weights for policy 0, policy_version 388839 (0.0027) [2024-04-27 13:28:18,258][52263] Updated weights for policy 0, policy_version 388849 (0.0028) [2024-04-27 13:28:19,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6370918400. Throughput: 0: 53411.8. Samples: 861431440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:28:19,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 13:28:21,760][52263] Updated weights for policy 0, policy_version 388859 (0.0027) [2024-04-27 13:28:24,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6371213312. Throughput: 0: 53490.3. Samples: 861755160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:28:24,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 13:28:24,177][52263] Updated weights for policy 0, policy_version 388869 (0.0039) [2024-04-27 13:28:27,843][52263] Updated weights for policy 0, policy_version 388879 (0.0026) [2024-04-27 13:28:29,106][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6371475456. Throughput: 0: 53830.7. Samples: 861926100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:28:29,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 13:28:30,162][52263] Updated weights for policy 0, policy_version 388889 (0.0029) [2024-04-27 13:28:33,894][52263] Updated weights for policy 0, policy_version 388899 (0.0037) [2024-04-27 13:28:34,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6371721216. Throughput: 0: 53851.3. Samples: 862246460. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:28:34,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 13:28:36,697][52263] Updated weights for policy 0, policy_version 388909 (0.0029) [2024-04-27 13:28:39,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6371999744. Throughput: 0: 53888.9. Samples: 862568780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:28:39,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 13:28:39,974][52263] Updated weights for policy 0, policy_version 388919 (0.0031) [2024-04-27 13:28:42,997][52263] Updated weights for policy 0, policy_version 388929 (0.0030) [2024-04-27 13:28:44,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6372261888. Throughput: 0: 53735.8. Samples: 862728440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 13:28:44,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 13:28:45,899][52263] Updated weights for policy 0, policy_version 388939 (0.0034) [2024-04-27 13:28:49,034][52263] Updated weights for policy 0, policy_version 388949 (0.0027) [2024-04-27 13:28:49,107][52031] Fps is (10 sec: 54065.8, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6372540416. Throughput: 0: 53759.8. Samples: 863050460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:28:49,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 13:28:51,938][52263] Updated weights for policy 0, policy_version 388959 (0.0027) [2024-04-27 13:28:54,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53793.9, 300 sec: 53650.6). Total num frames: 6372818944. Throughput: 0: 53712.8. Samples: 863371580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:28:54,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 13:28:55,343][52263] Updated weights for policy 0, policy_version 388969 (0.0026) [2024-04-27 13:28:58,065][52263] Updated weights for policy 0, policy_version 388979 (0.0026) [2024-04-27 13:28:59,106][52031] Fps is (10 sec: 54068.5, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6373081088. Throughput: 0: 53802.9. Samples: 863534880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:28:59,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 13:29:01,541][52263] Updated weights for policy 0, policy_version 388989 (0.0025) [2024-04-27 13:29:04,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6373343232. Throughput: 0: 53979.0. Samples: 863860500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:04,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 13:29:04,360][52263] Updated weights for policy 0, policy_version 388999 (0.0032) [2024-04-27 13:29:07,646][52263] Updated weights for policy 0, policy_version 389009 (0.0033) [2024-04-27 13:29:09,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6373605376. Throughput: 0: 53824.8. Samples: 864177280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:09,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:29:10,352][52263] Updated weights for policy 0, policy_version 389019 (0.0041) [2024-04-27 13:29:13,765][52263] Updated weights for policy 0, policy_version 389029 (0.0036) [2024-04-27 13:29:13,924][52242] Signal inference workers to stop experience collection... (13050 times) [2024-04-27 13:29:13,925][52242] Signal inference workers to resume experience collection... 
(13050 times) [2024-04-27 13:29:13,956][52263] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-04-27 13:29:13,962][52263] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-04-27 13:29:14,106][52031] Fps is (10 sec: 54067.2, 60 sec: 54067.3, 300 sec: 53484.1). Total num frames: 6373883904. Throughput: 0: 53397.8. Samples: 864329000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:14,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 13:29:16,548][52263] Updated weights for policy 0, policy_version 389039 (0.0034) [2024-04-27 13:29:19,107][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6374162432. Throughput: 0: 53490.5. Samples: 864653540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:19,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 13:29:19,797][52263] Updated weights for policy 0, policy_version 389049 (0.0032) [2024-04-27 13:29:22,668][52263] Updated weights for policy 0, policy_version 389059 (0.0029) [2024-04-27 13:29:24,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 6374424576. Throughput: 0: 53499.4. Samples: 864976260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:24,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 13:29:25,835][52263] Updated weights for policy 0, policy_version 389069 (0.0027) [2024-04-27 13:29:28,800][52263] Updated weights for policy 0, policy_version 389079 (0.0031) [2024-04-27 13:29:29,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6374670336. Throughput: 0: 53521.9. Samples: 865136920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:29,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 13:29:32,054][52263] Updated weights for policy 0, policy_version 389089 (0.0029) [2024-04-27 13:29:34,107][52031] Fps is (10 sec: 54066.7, 60 sec: 54067.0, 300 sec: 53650.6). Total num frames: 6374965248. Throughput: 0: 53487.2. Samples: 865457380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:34,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 13:29:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000389097_6374965248.pth... [2024-04-27 13:29:34,161][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000388311_6362087424.pth [2024-04-27 13:29:34,791][52263] Updated weights for policy 0, policy_version 389099 (0.0033) [2024-04-27 13:29:38,245][52263] Updated weights for policy 0, policy_version 389109 (0.0026) [2024-04-27 13:29:39,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53595.2). Total num frames: 6375211008. Throughput: 0: 53558.5. Samples: 865781700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:39,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 13:29:41,092][52263] Updated weights for policy 0, policy_version 389119 (0.0027) [2024-04-27 13:29:44,106][52031] Fps is (10 sec: 49152.7, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6375456768. Throughput: 0: 53484.8. Samples: 865941700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:44,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 13:29:44,386][52263] Updated weights for policy 0, policy_version 389129 (0.0028) [2024-04-27 13:29:47,391][52263] Updated weights for policy 0, policy_version 389139 (0.0031) [2024-04-27 13:29:49,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6375751680. Throughput: 0: 53402.2. Samples: 866263600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:49,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 13:29:50,525][52263] Updated weights for policy 0, policy_version 389149 (0.0035) [2024-04-27 13:29:53,412][52263] Updated weights for policy 0, policy_version 389159 (0.0028) [2024-04-27 13:29:54,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6376030208. Throughput: 0: 53469.0. Samples: 866583380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:54,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 13:29:56,713][52263] Updated weights for policy 0, policy_version 389169 (0.0030) [2024-04-27 13:29:59,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53247.8, 300 sec: 53539.6). Total num frames: 6376275968. Throughput: 0: 53737.7. Samples: 866747200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:29:59,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 13:29:59,508][52263] Updated weights for policy 0, policy_version 389179 (0.0030) [2024-04-27 13:30:02,656][52263] Updated weights for policy 0, policy_version 389189 (0.0037) [2024-04-27 13:30:04,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6376554496. Throughput: 0: 53641.0. Samples: 867067380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:30:04,107][52031] Avg episode reward: [(0, '0.506')] [2024-04-27 13:30:05,734][52263] Updated weights for policy 0, policy_version 389199 (0.0032) [2024-04-27 13:30:06,449][52242] Signal inference workers to stop experience collection... (13100 times) [2024-04-27 13:30:06,486][52263] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-04-27 13:30:06,522][52242] Signal inference workers to resume experience collection... 
(13100 times) [2024-04-27 13:30:06,522][52263] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-04-27 13:30:08,753][52263] Updated weights for policy 0, policy_version 389209 (0.0034) [2024-04-27 13:30:09,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6376816640. Throughput: 0: 53544.9. Samples: 867385780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:09,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 13:30:11,952][52263] Updated weights for policy 0, policy_version 389219 (0.0030) [2024-04-27 13:30:14,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6377078784. Throughput: 0: 53589.4. Samples: 867548440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:14,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 13:30:14,926][52263] Updated weights for policy 0, policy_version 389229 (0.0030) [2024-04-27 13:30:17,971][52263] Updated weights for policy 0, policy_version 389239 (0.0031) [2024-04-27 13:30:19,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6377340928. Throughput: 0: 53531.7. Samples: 867866300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:19,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 13:30:21,003][52263] Updated weights for policy 0, policy_version 389249 (0.0031) [2024-04-27 13:30:23,937][52263] Updated weights for policy 0, policy_version 389259 (0.0027) [2024-04-27 13:30:24,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6377619456. Throughput: 0: 53472.8. Samples: 868187980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:24,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 13:30:27,130][52263] Updated weights for policy 0, policy_version 389269 (0.0028) [2024-04-27 13:30:29,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.2, 300 sec: 53595.2). Total num frames: 6377897984. Throughput: 0: 53480.0. Samples: 868348300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:29,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 13:30:30,125][52263] Updated weights for policy 0, policy_version 389279 (0.0026) [2024-04-27 13:30:33,275][52263] Updated weights for policy 0, policy_version 389289 (0.0033) [2024-04-27 13:30:34,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6378143744. Throughput: 0: 53482.1. Samples: 868670300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:34,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 13:30:36,072][52263] Updated weights for policy 0, policy_version 389299 (0.0033) [2024-04-27 13:30:39,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6378422272. Throughput: 0: 53449.7. Samples: 868988620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:39,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:30:39,244][52263] Updated weights for policy 0, policy_version 389309 (0.0033) [2024-04-27 13:30:42,085][52263] Updated weights for policy 0, policy_version 389319 (0.0031) [2024-04-27 13:30:44,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6378684416. Throughput: 0: 53338.0. Samples: 869147400. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:44,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:30:45,356][52263] Updated weights for policy 0, policy_version 389329 (0.0031) [2024-04-27 13:30:48,625][52263] Updated weights for policy 0, policy_version 389339 (0.0031) [2024-04-27 13:30:49,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6378962944. Throughput: 0: 53322.0. Samples: 869466880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:49,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 13:30:51,423][52263] Updated weights for policy 0, policy_version 389349 (0.0034) [2024-04-27 13:30:54,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6379225088. Throughput: 0: 53378.6. Samples: 869787820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:54,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 13:30:54,794][52263] Updated weights for policy 0, policy_version 389359 (0.0027) [2024-04-27 13:30:55,653][52242] Signal inference workers to stop experience collection... (13150 times) [2024-04-27 13:30:55,701][52263] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-04-27 13:30:55,715][52242] Signal inference workers to resume experience collection... (13150 times) [2024-04-27 13:30:55,719][52263] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-04-27 13:30:57,399][52263] Updated weights for policy 0, policy_version 389369 (0.0029) [2024-04-27 13:30:59,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.2, 300 sec: 53595.2). Total num frames: 6379503616. Throughput: 0: 53335.9. Samples: 869948560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:30:59,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 13:31:00,783][52263] Updated weights for policy 0, policy_version 389379 (0.0032) [2024-04-27 13:31:03,590][52263] Updated weights for policy 0, policy_version 389389 (0.0033) [2024-04-27 13:31:04,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6379765760. Throughput: 0: 53536.0. Samples: 870275420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:31:04,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:31:06,971][52263] Updated weights for policy 0, policy_version 389399 (0.0026) [2024-04-27 13:31:09,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6380027904. Throughput: 0: 53497.7. Samples: 870595380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:31:09,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 13:31:09,615][52263] Updated weights for policy 0, policy_version 389409 (0.0036) [2024-04-27 13:31:13,195][52263] Updated weights for policy 0, policy_version 389419 (0.0035) [2024-04-27 13:31:14,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6380273664. Throughput: 0: 53486.1. Samples: 870755180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:31:14,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 13:31:15,703][52263] Updated weights for policy 0, policy_version 389429 (0.0025) [2024-04-27 13:31:19,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6380552192. Throughput: 0: 53417.4. Samples: 871074080. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:31:19,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 13:31:19,292][52263] Updated weights for policy 0, policy_version 389439 (0.0029) [2024-04-27 13:31:22,071][52263] Updated weights for policy 0, policy_version 389449 (0.0026) [2024-04-27 13:31:24,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6380830720. Throughput: 0: 53506.6. Samples: 871396420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-04-27 13:31:24,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:31:25,336][52263] Updated weights for policy 0, policy_version 389459 (0.0037) [2024-04-27 13:31:28,158][52263] Updated weights for policy 0, policy_version 389469 (0.0027) [2024-04-27 13:31:29,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6381092864. Throughput: 0: 53505.0. Samples: 871555120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:31:29,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 13:31:31,545][52263] Updated weights for policy 0, policy_version 389479 (0.0033) [2024-04-27 13:31:34,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6381371392. Throughput: 0: 53591.3. Samples: 871878480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:31:34,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 13:31:34,121][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000389489_6381387776.pth... [2024-04-27 13:31:34,127][52263] Updated weights for policy 0, policy_version 389489 (0.0027) [2024-04-27 13:31:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000388704_6368526336.pth [2024-04-27 13:31:37,621][52263] Updated weights for policy 0, policy_version 389499 (0.0035) [2024-04-27 13:31:39,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53521.0, 300 sec: 53595.1). 
Total num frames: 6381633536. Throughput: 0: 53604.8. Samples: 872200040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:31:39,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 13:31:40,278][52263] Updated weights for policy 0, policy_version 389509 (0.0027) [2024-04-27 13:31:43,717][52263] Updated weights for policy 0, policy_version 389519 (0.0031) [2024-04-27 13:31:44,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6381895680. Throughput: 0: 53543.5. Samples: 872358020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:31:44,116][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 13:31:46,394][52263] Updated weights for policy 0, policy_version 389529 (0.0029) [2024-04-27 13:31:49,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6382157824. Throughput: 0: 53500.1. Samples: 872682920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:31:49,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 13:31:50,062][52263] Updated weights for policy 0, policy_version 389539 (0.0030) [2024-04-27 13:31:52,493][52263] Updated weights for policy 0, policy_version 389549 (0.0025) [2024-04-27 13:31:54,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6382436352. Throughput: 0: 53483.6. Samples: 873002140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:31:54,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 13:31:56,187][52263] Updated weights for policy 0, policy_version 389559 (0.0029) [2024-04-27 13:31:58,499][52263] Updated weights for policy 0, policy_version 389569 (0.0028) [2024-04-27 13:31:59,106][52031] Fps is (10 sec: 57343.6, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6382731264. Throughput: 0: 53539.6. Samples: 873164460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:31:59,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 13:31:59,615][52242] Signal inference workers to stop experience collection... (13200 times) [2024-04-27 13:31:59,651][52263] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-04-27 13:31:59,667][52242] Signal inference workers to resume experience collection... (13200 times) [2024-04-27 13:31:59,672][52263] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-04-27 13:32:02,388][52263] Updated weights for policy 0, policy_version 389579 (0.0038) [2024-04-27 13:32:04,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6382960640. Throughput: 0: 53554.7. Samples: 873484040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:32:04,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 13:32:04,633][52263] Updated weights for policy 0, policy_version 389589 (0.0029) [2024-04-27 13:32:08,338][52263] Updated weights for policy 0, policy_version 389599 (0.0030) [2024-04-27 13:32:09,106][52031] Fps is (10 sec: 49152.3, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6383222784. Throughput: 0: 53521.9. Samples: 873804900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:32:09,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 13:32:10,987][52263] Updated weights for policy 0, policy_version 389609 (0.0030) [2024-04-27 13:32:14,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6383484928. Throughput: 0: 53424.4. Samples: 873959220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:32:14,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 13:32:14,363][52263] Updated weights for policy 0, policy_version 389619 (0.0034) [2024-04-27 13:32:17,080][52263] Updated weights for policy 0, policy_version 389629 (0.0032) [2024-04-27 13:32:19,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6383763456. Throughput: 0: 53503.4. Samples: 874286140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:32:19,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 13:32:20,535][52263] Updated weights for policy 0, policy_version 389639 (0.0029) [2024-04-27 13:32:23,178][52263] Updated weights for policy 0, policy_version 389649 (0.0032) [2024-04-27 13:32:24,107][52031] Fps is (10 sec: 57342.4, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6384058368. Throughput: 0: 53395.9. Samples: 874602860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:32:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 13:32:26,750][52263] Updated weights for policy 0, policy_version 389659 (0.0026) [2024-04-27 13:32:29,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53793.9, 300 sec: 53539.6). Total num frames: 6384320512. Throughput: 0: 53592.8. Samples: 874769700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:32:29,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 13:32:29,376][52263] Updated weights for policy 0, policy_version 389669 (0.0025) [2024-04-27 13:32:32,732][52263] Updated weights for policy 0, policy_version 389679 (0.0031) [2024-04-27 13:32:34,107][52031] Fps is (10 sec: 52429.5, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6384582656. Throughput: 0: 53558.5. Samples: 875093060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:32:34,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 13:32:35,634][52263] Updated weights for policy 0, policy_version 389689 (0.0029) [2024-04-27 13:32:38,783][52263] Updated weights for policy 0, policy_version 389699 (0.0032) [2024-04-27 13:32:39,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6384844800. Throughput: 0: 53599.0. Samples: 875414100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:32:39,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:32:41,591][52263] Updated weights for policy 0, policy_version 389709 (0.0028) [2024-04-27 13:32:44,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6385090560. Throughput: 0: 53381.3. Samples: 875566620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 13:32:44,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 13:32:45,099][52263] Updated weights for policy 0, policy_version 389719 (0.0036) [2024-04-27 13:32:47,561][52263] Updated weights for policy 0, policy_version 389729 (0.0027) [2024-04-27 13:32:49,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6385385472. Throughput: 0: 53391.6. Samples: 875886660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:32:49,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 13:32:50,948][52242] Signal inference workers to stop experience collection... (13250 times) [2024-04-27 13:32:50,950][52242] Signal inference workers to resume experience collection... 
(13250 times) [2024-04-27 13:32:50,981][52263] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-04-27 13:32:50,981][52263] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-04-27 13:32:51,066][52263] Updated weights for policy 0, policy_version 389739 (0.0032) [2024-04-27 13:32:54,017][52263] Updated weights for policy 0, policy_version 389749 (0.0029) [2024-04-27 13:32:54,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6385647616. Throughput: 0: 53351.6. Samples: 876205720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:32:54,107][52031] Avg episode reward: [(0, '0.663')] [2024-04-27 13:32:57,080][52263] Updated weights for policy 0, policy_version 389759 (0.0027) [2024-04-27 13:32:59,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6385909760. Throughput: 0: 53724.9. Samples: 876376840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:32:59,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 13:33:00,031][52263] Updated weights for policy 0, policy_version 389769 (0.0026) [2024-04-27 13:33:03,172][52263] Updated weights for policy 0, policy_version 389779 (0.0028) [2024-04-27 13:33:04,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6386188288. Throughput: 0: 53721.3. Samples: 876703600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:04,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 13:33:06,171][52263] Updated weights for policy 0, policy_version 389789 (0.0028) [2024-04-27 13:33:09,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6386434048. Throughput: 0: 53779.7. Samples: 877022940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:09,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 13:33:09,343][52263] Updated weights for policy 0, policy_version 389799 (0.0031) [2024-04-27 13:33:12,343][52263] Updated weights for policy 0, policy_version 389809 (0.0026) [2024-04-27 13:33:14,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6386712576. Throughput: 0: 53481.9. Samples: 877176380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:14,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 13:33:15,403][52263] Updated weights for policy 0, policy_version 389819 (0.0029) [2024-04-27 13:33:18,565][52263] Updated weights for policy 0, policy_version 389829 (0.0030) [2024-04-27 13:33:19,106][52031] Fps is (10 sec: 57344.4, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6387007488. Throughput: 0: 53358.3. Samples: 877494180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:19,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 13:33:21,512][52263] Updated weights for policy 0, policy_version 389839 (0.0029) [2024-04-27 13:33:24,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6387253248. Throughput: 0: 53434.7. Samples: 877818660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:24,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 13:33:24,551][52263] Updated weights for policy 0, policy_version 389849 (0.0028) [2024-04-27 13:33:27,611][52263] Updated weights for policy 0, policy_version 389859 (0.0028) [2024-04-27 13:33:29,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6387548160. Throughput: 0: 53819.2. Samples: 877988480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 13:33:30,482][52263] Updated weights for policy 0, policy_version 389869 (0.0026) [2024-04-27 13:33:33,750][52263] Updated weights for policy 0, policy_version 389879 (0.0028) [2024-04-27 13:33:34,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6387810304. Throughput: 0: 53878.9. Samples: 878311220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:34,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 13:33:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000389881_6387810304.pth... [2024-04-27 13:33:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000389097_6374965248.pth [2024-04-27 13:33:36,557][52263] Updated weights for policy 0, policy_version 389889 (0.0028) [2024-04-27 13:33:39,107][52031] Fps is (10 sec: 47513.2, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6388023296. Throughput: 0: 53903.9. Samples: 878631400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:39,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:33:39,471][52242] Signal inference workers to stop experience collection... (13300 times) [2024-04-27 13:33:39,520][52263] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-04-27 13:33:39,533][52242] Signal inference workers to resume experience collection... (13300 times) [2024-04-27 13:33:39,538][52263] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-04-27 13:33:39,820][52263] Updated weights for policy 0, policy_version 389899 (0.0036) [2024-04-27 13:33:42,884][52263] Updated weights for policy 0, policy_version 389909 (0.0029) [2024-04-27 13:33:44,107][52031] Fps is (10 sec: 52429.0, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6388334592. Throughput: 0: 53409.2. Samples: 878780260. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:44,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 13:33:45,788][52263] Updated weights for policy 0, policy_version 389919 (0.0025) [2024-04-27 13:33:48,822][52263] Updated weights for policy 0, policy_version 389929 (0.0032) [2024-04-27 13:33:49,106][52031] Fps is (10 sec: 58982.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6388613120. Throughput: 0: 53404.6. Samples: 879106800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:49,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 13:33:51,816][52263] Updated weights for policy 0, policy_version 389939 (0.0035) [2024-04-27 13:33:54,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6388875264. Throughput: 0: 53403.1. Samples: 879426080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:54,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 13:33:54,949][52263] Updated weights for policy 0, policy_version 389949 (0.0027) [2024-04-27 13:33:58,072][52263] Updated weights for policy 0, policy_version 389959 (0.0030) [2024-04-27 13:33:59,107][52031] Fps is (10 sec: 54066.4, 60 sec: 54067.0, 300 sec: 53595.1). Total num frames: 6389153792. Throughput: 0: 53745.2. Samples: 879594920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:33:59,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 13:34:01,133][52263] Updated weights for policy 0, policy_version 389969 (0.0029) [2024-04-27 13:34:04,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6389399552. Throughput: 0: 53750.8. Samples: 879912960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 13:34:04,107][52031] Avg episode reward: [(0, '0.473')] [2024-04-27 13:34:04,188][52263] Updated weights for policy 0, policy_version 389979 (0.0029) [2024-04-27 13:34:07,378][52263] Updated weights for policy 0, policy_version 389989 (0.0026) [2024-04-27 13:34:09,106][52031] Fps is (10 sec: 49152.7, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6389645312. Throughput: 0: 53679.0. Samples: 880234220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:09,107][52031] Avg episode reward: [(0, '0.686')] [2024-04-27 13:34:10,236][52263] Updated weights for policy 0, policy_version 389999 (0.0029) [2024-04-27 13:34:13,359][52263] Updated weights for policy 0, policy_version 390009 (0.0032) [2024-04-27 13:34:14,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6389940224. Throughput: 0: 53226.1. Samples: 880383660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:14,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 13:34:16,392][52263] Updated weights for policy 0, policy_version 390019 (0.0041) [2024-04-27 13:34:19,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6390202368. Throughput: 0: 53241.5. Samples: 880707080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:19,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 13:34:19,385][52263] Updated weights for policy 0, policy_version 390029 (0.0031) [2024-04-27 13:34:22,421][52263] Updated weights for policy 0, policy_version 390039 (0.0031) [2024-04-27 13:34:24,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6390464512. Throughput: 0: 53294.5. Samples: 881029660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:24,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 13:34:25,744][52263] Updated weights for policy 0, policy_version 390049 (0.0032) [2024-04-27 13:34:28,555][52263] Updated weights for policy 0, policy_version 390059 (0.0028) [2024-04-27 13:34:29,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6390743040. Throughput: 0: 53665.3. Samples: 881195200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:29,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:34:31,921][52263] Updated weights for policy 0, policy_version 390069 (0.0028) [2024-04-27 13:34:34,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6391005184. Throughput: 0: 53614.1. Samples: 881519440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:34,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 13:34:34,786][52263] Updated weights for policy 0, policy_version 390079 (0.0034) [2024-04-27 13:34:38,000][52263] Updated weights for policy 0, policy_version 390089 (0.0028) [2024-04-27 13:34:39,107][52031] Fps is (10 sec: 50789.4, 60 sec: 53793.9, 300 sec: 53539.5). Total num frames: 6391250944. Throughput: 0: 53556.6. Samples: 881836140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:39,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 13:34:39,909][52242] Signal inference workers to stop experience collection... (13350 times) [2024-04-27 13:34:39,910][52242] Signal inference workers to resume experience collection... 
(13350 times) [2024-04-27 13:34:39,922][52263] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-04-27 13:34:39,927][52263] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-04-27 13:34:40,824][52263] Updated weights for policy 0, policy_version 390099 (0.0026) [2024-04-27 13:34:44,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6391529472. Throughput: 0: 53254.0. Samples: 881991340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:44,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 13:34:44,252][52263] Updated weights for policy 0, policy_version 390109 (0.0033) [2024-04-27 13:34:46,943][52263] Updated weights for policy 0, policy_version 390119 (0.0031) [2024-04-27 13:34:49,106][52031] Fps is (10 sec: 55707.2, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6391808000. Throughput: 0: 53388.3. Samples: 882315440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:49,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 13:34:50,309][52263] Updated weights for policy 0, policy_version 390129 (0.0032) [2024-04-27 13:34:52,959][52263] Updated weights for policy 0, policy_version 390139 (0.0035) [2024-04-27 13:34:54,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6392070144. Throughput: 0: 53375.0. Samples: 882636100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:54,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 13:34:56,493][52263] Updated weights for policy 0, policy_version 390149 (0.0031) [2024-04-27 13:34:59,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.2, 300 sec: 53539.6). Total num frames: 6392348672. Throughput: 0: 53720.1. Samples: 882801060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:34:59,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 13:34:59,139][52263] Updated weights for policy 0, policy_version 390159 (0.0034) [2024-04-27 13:35:02,694][52263] Updated weights for policy 0, policy_version 390169 (0.0027) [2024-04-27 13:35:04,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6392594432. Throughput: 0: 53744.4. Samples: 883125580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:35:04,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 13:35:05,198][52263] Updated weights for policy 0, policy_version 390179 (0.0029) [2024-04-27 13:35:08,721][52263] Updated weights for policy 0, policy_version 390189 (0.0033) [2024-04-27 13:35:09,106][52031] Fps is (10 sec: 54067.2, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6392889344. Throughput: 0: 53629.2. Samples: 883442960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:35:09,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 13:35:11,699][52263] Updated weights for policy 0, policy_version 390199 (0.0032) [2024-04-27 13:35:14,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6393135104. Throughput: 0: 53355.7. Samples: 883596200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:35:14,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 13:35:14,949][52263] Updated weights for policy 0, policy_version 390209 (0.0029) [2024-04-27 13:35:17,735][52263] Updated weights for policy 0, policy_version 390219 (0.0027) [2024-04-27 13:35:19,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6393413632. Throughput: 0: 53295.7. Samples: 883917740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:35:19,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 13:35:20,911][52263] Updated weights for policy 0, policy_version 390229 (0.0027) [2024-04-27 13:35:23,663][52263] Updated weights for policy 0, policy_version 390239 (0.0033) [2024-04-27 13:35:24,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6393675776. Throughput: 0: 53455.7. Samples: 884241640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 13:35:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 13:35:26,987][52263] Updated weights for policy 0, policy_version 390249 (0.0036) [2024-04-27 13:35:29,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6393954304. Throughput: 0: 53786.2. Samples: 884411720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:35:29,107][52031] Avg episode reward: [(0, '0.506')] [2024-04-27 13:35:29,711][52263] Updated weights for policy 0, policy_version 390259 (0.0028) [2024-04-27 13:35:33,159][52263] Updated weights for policy 0, policy_version 390269 (0.0028) [2024-04-27 13:35:34,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6394216448. Throughput: 0: 53820.9. Samples: 884737380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:35:34,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 13:35:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000390272_6394216448.pth... [2024-04-27 13:35:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000389489_6381387776.pth [2024-04-27 13:35:36,113][52263] Updated weights for policy 0, policy_version 390279 (0.0030) [2024-04-27 13:35:39,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.3, 300 sec: 53484.0). Total num frames: 6394462208. Throughput: 0: 53733.8. Samples: 885054120. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:35:39,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 13:35:39,530][52263] Updated weights for policy 0, policy_version 390289 (0.0034) [2024-04-27 13:35:40,992][52242] Signal inference workers to stop experience collection... (13400 times) [2024-04-27 13:35:40,992][52242] Signal inference workers to resume experience collection... (13400 times) [2024-04-27 13:35:41,016][52263] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-04-27 13:35:41,016][52263] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-04-27 13:35:42,221][52263] Updated weights for policy 0, policy_version 390299 (0.0029) [2024-04-27 13:35:44,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6394757120. Throughput: 0: 53485.1. Samples: 885207900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:35:44,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 13:35:45,490][52263] Updated weights for policy 0, policy_version 390309 (0.0034) [2024-04-27 13:35:48,384][52263] Updated weights for policy 0, policy_version 390319 (0.0027) [2024-04-27 13:35:49,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6395019264. Throughput: 0: 53406.3. Samples: 885528860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:35:49,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 13:35:51,491][52263] Updated weights for policy 0, policy_version 390329 (0.0032) [2024-04-27 13:35:54,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6395297792. Throughput: 0: 53483.0. Samples: 885849700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:35:54,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 13:35:54,476][52263] Updated weights for policy 0, policy_version 390339 (0.0028) [2024-04-27 13:35:57,598][52263] Updated weights for policy 0, policy_version 390349 (0.0028) [2024-04-27 13:35:59,106][52031] Fps is (10 sec: 52428.3, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6395543552. Throughput: 0: 53733.8. Samples: 886014220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:35:59,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 13:36:00,575][52263] Updated weights for policy 0, policy_version 390359 (0.0030) [2024-04-27 13:36:03,914][52263] Updated weights for policy 0, policy_version 390369 (0.0030) [2024-04-27 13:36:04,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6395805696. Throughput: 0: 53702.9. Samples: 886334380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:36:04,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 13:36:06,688][52263] Updated weights for policy 0, policy_version 390379 (0.0026) [2024-04-27 13:36:09,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6396067840. Throughput: 0: 53615.8. Samples: 886654340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:36:09,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 13:36:09,828][52263] Updated weights for policy 0, policy_version 390389 (0.0032) [2024-04-27 13:36:12,690][52263] Updated weights for policy 0, policy_version 390399 (0.0028) [2024-04-27 13:36:14,107][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6396362752. Throughput: 0: 53407.2. Samples: 886815040. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:36:14,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 13:36:16,029][52263] Updated weights for policy 0, policy_version 390409 (0.0026) [2024-04-27 13:36:18,770][52263] Updated weights for policy 0, policy_version 390419 (0.0034) [2024-04-27 13:36:19,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6396641280. Throughput: 0: 53329.2. Samples: 887137200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:36:19,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 13:36:22,061][52263] Updated weights for policy 0, policy_version 390429 (0.0030) [2024-04-27 13:36:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6396887040. Throughput: 0: 53472.6. Samples: 887460380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:36:24,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 13:36:24,928][52263] Updated weights for policy 0, policy_version 390439 (0.0026) [2024-04-27 13:36:28,350][52263] Updated weights for policy 0, policy_version 390449 (0.0030) [2024-04-27 13:36:29,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53520.9, 300 sec: 53539.5). Total num frames: 6397165568. Throughput: 0: 53470.2. Samples: 887614060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:36:29,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 13:36:30,937][52263] Updated weights for policy 0, policy_version 390459 (0.0026) [2024-04-27 13:36:34,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6397427712. Throughput: 0: 53431.8. Samples: 887933300. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:36:34,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 13:36:34,599][52263] Updated weights for policy 0, policy_version 390469 (0.0038) [2024-04-27 13:36:37,246][52263] Updated weights for policy 0, policy_version 390479 (0.0032) [2024-04-27 13:36:39,107][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6397689856. Throughput: 0: 53536.9. Samples: 888258860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:36:39,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 13:36:40,716][52263] Updated weights for policy 0, policy_version 390489 (0.0029) [2024-04-27 13:36:43,309][52263] Updated weights for policy 0, policy_version 390499 (0.0032) [2024-04-27 13:36:44,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53248.2, 300 sec: 53539.6). Total num frames: 6397952000. Throughput: 0: 53378.3. Samples: 888416240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-04-27 13:36:44,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 13:36:46,673][52263] Updated weights for policy 0, policy_version 390509 (0.0028) [2024-04-27 13:36:48,032][52242] Signal inference workers to stop experience collection... (13450 times) [2024-04-27 13:36:48,032][52242] Signal inference workers to resume experience collection... (13450 times) [2024-04-27 13:36:48,064][52263] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-04-27 13:36:48,065][52263] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-04-27 13:36:49,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6398246912. Throughput: 0: 53329.0. Samples: 888734180. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:36:49,116][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 13:36:49,369][52263] Updated weights for policy 0, policy_version 390519 (0.0032) [2024-04-27 13:36:52,948][52263] Updated weights for policy 0, policy_version 390529 (0.0033) [2024-04-27 13:36:54,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6398492672. Throughput: 0: 53394.7. Samples: 889057100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:36:54,115][52031] Avg episode reward: [(0, '0.506')] [2024-04-27 13:36:55,535][52263] Updated weights for policy 0, policy_version 390539 (0.0030) [2024-04-27 13:36:59,087][52263] Updated weights for policy 0, policy_version 390549 (0.0030) [2024-04-27 13:36:59,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6398754816. Throughput: 0: 53347.0. Samples: 889215660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:36:59,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 13:37:01,744][52263] Updated weights for policy 0, policy_version 390559 (0.0032) [2024-04-27 13:37:04,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 6399016960. Throughput: 0: 53324.8. Samples: 889536820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:04,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 13:37:05,194][52263] Updated weights for policy 0, policy_version 390569 (0.0030) [2024-04-27 13:37:07,850][52263] Updated weights for policy 0, policy_version 390579 (0.0030) [2024-04-27 13:37:09,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6399279104. Throughput: 0: 53228.7. Samples: 889855680. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:09,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 13:37:11,208][52263] Updated weights for policy 0, policy_version 390589 (0.0030) [2024-04-27 13:37:14,066][52263] Updated weights for policy 0, policy_version 390599 (0.0029) [2024-04-27 13:37:14,107][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6399574016. Throughput: 0: 53471.2. Samples: 890020260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:14,108][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 13:37:17,329][52263] Updated weights for policy 0, policy_version 390609 (0.0033) [2024-04-27 13:37:19,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6399836160. Throughput: 0: 53369.4. Samples: 890334920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:19,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 13:37:20,184][52263] Updated weights for policy 0, policy_version 390619 (0.0024) [2024-04-27 13:37:23,468][52263] Updated weights for policy 0, policy_version 390629 (0.0028) [2024-04-27 13:37:24,107][52031] Fps is (10 sec: 49151.4, 60 sec: 52974.7, 300 sec: 53373.0). Total num frames: 6400065536. Throughput: 0: 53292.7. Samples: 890657040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:24,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 13:37:26,205][52263] Updated weights for policy 0, policy_version 390639 (0.0035) [2024-04-27 13:37:29,107][52031] Fps is (10 sec: 50790.0, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6400344064. Throughput: 0: 53224.7. Samples: 890811360. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:29,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 13:37:29,885][52263] Updated weights for policy 0, policy_version 390649 (0.0033) [2024-04-27 13:37:32,244][52263] Updated weights for policy 0, policy_version 390659 (0.0032) [2024-04-27 13:37:34,107][52031] Fps is (10 sec: 55706.2, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6400622592. Throughput: 0: 53183.5. Samples: 891127440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:34,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 13:37:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000390663_6400622592.pth... [2024-04-27 13:37:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000389881_6387810304.pth [2024-04-27 13:37:36,284][52263] Updated weights for policy 0, policy_version 390669 (0.0038) [2024-04-27 13:37:38,524][52263] Updated weights for policy 0, policy_version 390679 (0.0031) [2024-04-27 13:37:39,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6400884736. Throughput: 0: 53099.0. Samples: 891446560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:39,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 13:37:42,525][52263] Updated weights for policy 0, policy_version 390689 (0.0029) [2024-04-27 13:37:44,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6401179648. Throughput: 0: 53350.2. Samples: 891616420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:44,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 13:37:45,145][52263] Updated weights for policy 0, policy_version 390699 (0.0027) [2024-04-27 13:37:48,476][52263] Updated weights for policy 0, policy_version 390709 (0.0028) [2024-04-27 13:37:49,106][52031] Fps is (10 sec: 54067.9, 60 sec: 52975.0, 300 sec: 53484.0). 
Total num frames: 6401425408. Throughput: 0: 53306.1. Samples: 891935580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:49,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 13:37:51,228][52263] Updated weights for policy 0, policy_version 390719 (0.0030) [2024-04-27 13:37:54,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6401687552. Throughput: 0: 53202.2. Samples: 892249780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:54,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 13:37:54,527][52263] Updated weights for policy 0, policy_version 390729 (0.0038) [2024-04-27 13:37:56,096][52242] Signal inference workers to stop experience collection... (13500 times) [2024-04-27 13:37:56,096][52242] Signal inference workers to resume experience collection... (13500 times) [2024-04-27 13:37:56,121][52263] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-04-27 13:37:56,121][52263] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-04-27 13:37:57,437][52263] Updated weights for policy 0, policy_version 390739 (0.0030) [2024-04-27 13:37:59,106][52031] Fps is (10 sec: 52428.3, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6401949696. Throughput: 0: 52962.7. Samples: 892403580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:37:59,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 13:38:00,599][52263] Updated weights for policy 0, policy_version 390749 (0.0030) [2024-04-27 13:38:03,465][52263] Updated weights for policy 0, policy_version 390759 (0.0030) [2024-04-27 13:38:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6402211840. Throughput: 0: 53075.1. Samples: 892723300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 13:38:04,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 13:38:06,830][52263] Updated weights for policy 0, policy_version 390769 (0.0026) [2024-04-27 13:38:09,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6402506752. Throughput: 0: 53075.4. Samples: 893045420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:09,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 13:38:09,578][52263] Updated weights for policy 0, policy_version 390779 (0.0032) [2024-04-27 13:38:12,836][52263] Updated weights for policy 0, policy_version 390789 (0.0037) [2024-04-27 13:38:14,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6402768896. Throughput: 0: 53413.4. Samples: 893214960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:14,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 13:38:15,729][52263] Updated weights for policy 0, policy_version 390799 (0.0027) [2024-04-27 13:38:18,927][52263] Updated weights for policy 0, policy_version 390809 (0.0027) [2024-04-27 13:38:19,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6403014656. Throughput: 0: 53445.8. Samples: 893532500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:19,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 13:38:21,701][52263] Updated weights for policy 0, policy_version 390819 (0.0034) [2024-04-27 13:38:24,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.2, 300 sec: 53317.4). Total num frames: 6403276800. Throughput: 0: 53477.8. Samples: 893853060. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:24,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 13:38:24,961][52263] Updated weights for policy 0, policy_version 390829 (0.0034) [2024-04-27 13:38:27,717][52263] Updated weights for policy 0, policy_version 390839 (0.0025) [2024-04-27 13:38:29,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6403538944. Throughput: 0: 53164.1. Samples: 894008800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:29,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 13:38:31,206][52263] Updated weights for policy 0, policy_version 390849 (0.0027) [2024-04-27 13:38:33,956][52263] Updated weights for policy 0, policy_version 390859 (0.0029) [2024-04-27 13:38:34,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6403833856. Throughput: 0: 53316.7. Samples: 894334840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:34,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 13:38:37,260][52263] Updated weights for policy 0, policy_version 390869 (0.0026) [2024-04-27 13:38:39,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6404112384. Throughput: 0: 53487.6. Samples: 894656720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:39,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 13:38:40,055][52263] Updated weights for policy 0, policy_version 390879 (0.0031) [2024-04-27 13:38:43,230][52263] Updated weights for policy 0, policy_version 390889 (0.0032) [2024-04-27 13:38:44,107][52031] Fps is (10 sec: 52429.1, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6404358144. Throughput: 0: 53673.4. Samples: 894818880. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:44,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:38:46,277][52263] Updated weights for policy 0, policy_version 390899 (0.0027) [2024-04-27 13:38:49,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6404636672. Throughput: 0: 53772.4. Samples: 895143060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:49,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 13:38:49,399][52263] Updated weights for policy 0, policy_version 390909 (0.0029) [2024-04-27 13:38:52,496][52263] Updated weights for policy 0, policy_version 390919 (0.0029) [2024-04-27 13:38:54,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6404882432. Throughput: 0: 53746.0. Samples: 895464000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:54,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 13:38:55,499][52263] Updated weights for policy 0, policy_version 390929 (0.0029) [2024-04-27 13:38:57,317][52242] Signal inference workers to stop experience collection... (13550 times) [2024-04-27 13:38:57,355][52263] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-04-27 13:38:57,412][52242] Signal inference workers to resume experience collection... (13550 times) [2024-04-27 13:38:57,413][52263] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-04-27 13:38:58,444][52263] Updated weights for policy 0, policy_version 390939 (0.0019) [2024-04-27 13:38:59,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6405144576. Throughput: 0: 53375.7. Samples: 895616860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:38:59,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 13:39:01,626][52263] Updated weights for policy 0, policy_version 390949 (0.0035) [2024-04-27 13:39:04,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6405423104. Throughput: 0: 53517.0. Samples: 895940760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:39:04,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 13:39:04,615][52263] Updated weights for policy 0, policy_version 390959 (0.0032) [2024-04-27 13:39:08,175][52263] Updated weights for policy 0, policy_version 390969 (0.0032) [2024-04-27 13:39:09,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6405701632. Throughput: 0: 53434.6. Samples: 896257620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:39:09,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 13:39:10,584][52263] Updated weights for policy 0, policy_version 390979 (0.0028) [2024-04-27 13:39:14,106][52031] Fps is (10 sec: 52428.8, 60 sec: 52975.0, 300 sec: 53372.9). Total num frames: 6405947392. Throughput: 0: 53666.6. Samples: 896423800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:39:14,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 13:39:14,428][52263] Updated weights for policy 0, policy_version 390989 (0.0035) [2024-04-27 13:39:16,525][52263] Updated weights for policy 0, policy_version 390999 (0.0037) [2024-04-27 13:39:19,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6406242304. Throughput: 0: 53521.4. Samples: 896743300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:39:19,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 13:39:20,634][52263] Updated weights for policy 0, policy_version 391009 (0.0031) [2024-04-27 13:39:22,636][52263] Updated weights for policy 0, policy_version 391019 (0.0028) [2024-04-27 13:39:24,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6406488064. Throughput: 0: 53439.9. Samples: 897061520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 13:39:24,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 13:39:26,638][52263] Updated weights for policy 0, policy_version 391029 (0.0030) [2024-04-27 13:39:29,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6406766592. Throughput: 0: 53504.9. Samples: 897226600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:39:29,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 13:39:29,241][52263] Updated weights for policy 0, policy_version 391039 (0.0033) [2024-04-27 13:39:32,704][52263] Updated weights for policy 0, policy_version 391049 (0.0029) [2024-04-27 13:39:34,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6407028736. Throughput: 0: 53359.2. Samples: 897544220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:39:34,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 13:39:34,230][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000391055_6407045120.pth... [2024-04-27 13:39:34,280][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000390272_6394216448.pth [2024-04-27 13:39:35,680][52263] Updated weights for policy 0, policy_version 391059 (0.0031) [2024-04-27 13:39:38,881][52263] Updated weights for policy 0, policy_version 391069 (0.0027) [2024-04-27 13:39:39,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52701.8, 300 sec: 53372.9). 
Total num frames: 6407274496. Throughput: 0: 53324.5. Samples: 897863600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:39:39,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 13:39:41,844][52263] Updated weights for policy 0, policy_version 391079 (0.0025) [2024-04-27 13:39:44,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6407585792. Throughput: 0: 53506.6. Samples: 898024660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:39:44,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 13:39:45,001][52263] Updated weights for policy 0, policy_version 391089 (0.0025) [2024-04-27 13:39:45,963][52242] Signal inference workers to stop experience collection... (13600 times) [2024-04-27 13:39:45,963][52242] Signal inference workers to resume experience collection... (13600 times) [2024-04-27 13:39:45,974][52263] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-04-27 13:39:45,994][52263] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-04-27 13:39:47,806][52263] Updated weights for policy 0, policy_version 391099 (0.0029) [2024-04-27 13:39:49,107][52031] Fps is (10 sec: 54066.8, 60 sec: 52974.8, 300 sec: 53372.9). Total num frames: 6407815168. Throughput: 0: 53351.8. Samples: 898341600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:39:49,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 13:39:51,082][52263] Updated weights for policy 0, policy_version 391109 (0.0033) [2024-04-27 13:39:53,779][52263] Updated weights for policy 0, policy_version 391119 (0.0029) [2024-04-27 13:39:54,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.1, 300 sec: 53372.9). Total num frames: 6408093696. Throughput: 0: 53391.1. Samples: 898660220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:39:54,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:39:57,351][52263] Updated weights for policy 0, policy_version 391129 (0.0034) [2024-04-27 13:39:59,106][52031] Fps is (10 sec: 58983.8, 60 sec: 54340.3, 300 sec: 53595.1). Total num frames: 6408404992. Throughput: 0: 53320.1. Samples: 898823200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:39:59,107][52031] Avg episode reward: [(0, '0.491')] [2024-04-27 13:39:59,864][52263] Updated weights for policy 0, policy_version 391139 (0.0035) [2024-04-27 13:40:03,384][52263] Updated weights for policy 0, policy_version 391149 (0.0028) [2024-04-27 13:40:04,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6408634368. Throughput: 0: 53345.2. Samples: 899143840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:40:04,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 13:40:05,936][52263] Updated weights for policy 0, policy_version 391159 (0.0026) [2024-04-27 13:40:09,106][52031] Fps is (10 sec: 47513.3, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6408880128. Throughput: 0: 53342.8. Samples: 899461940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:40:09,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 13:40:09,462][52263] Updated weights for policy 0, policy_version 391169 (0.0036) [2024-04-27 13:40:11,920][52263] Updated weights for policy 0, policy_version 391179 (0.0033) [2024-04-27 13:40:14,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6409158656. Throughput: 0: 53194.1. Samples: 899620340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:40:14,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 13:40:15,648][52263] Updated weights for policy 0, policy_version 391189 (0.0033) [2024-04-27 13:40:18,465][52263] Updated weights for policy 0, policy_version 391199 (0.0033) [2024-04-27 13:40:19,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52701.9, 300 sec: 53317.5). Total num frames: 6409404416. Throughput: 0: 53259.7. Samples: 899940900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:40:19,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 13:40:21,890][52263] Updated weights for policy 0, policy_version 391209 (0.0028) [2024-04-27 13:40:24,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6409715712. Throughput: 0: 53221.4. Samples: 900258560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:40:24,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 13:40:24,885][52263] Updated weights for policy 0, policy_version 391219 (0.0030) [2024-04-27 13:40:27,907][52263] Updated weights for policy 0, policy_version 391229 (0.0033) [2024-04-27 13:40:29,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6409977856. Throughput: 0: 53314.0. Samples: 900423780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:40:29,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 13:40:30,995][52263] Updated weights for policy 0, policy_version 391239 (0.0030) [2024-04-27 13:40:34,068][52263] Updated weights for policy 0, policy_version 391249 (0.0032) [2024-04-27 13:40:34,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6410223616. Throughput: 0: 53463.3. Samples: 900747440. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:40:34,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 13:40:37,234][52263] Updated weights for policy 0, policy_version 391259 (0.0029) [2024-04-27 13:40:39,106][52031] Fps is (10 sec: 50789.7, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6410485760. Throughput: 0: 53516.4. Samples: 901068460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:40:39,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 13:40:40,084][52263] Updated weights for policy 0, policy_version 391269 (0.0029) [2024-04-27 13:40:40,687][52242] Signal inference workers to stop experience collection... (13650 times) [2024-04-27 13:40:40,687][52242] Signal inference workers to resume experience collection... (13650 times) [2024-04-27 13:40:40,698][52263] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-04-27 13:40:40,698][52263] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-04-27 13:40:43,188][52263] Updated weights for policy 0, policy_version 391279 (0.0027) [2024-04-27 13:40:44,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.8, 300 sec: 53317.4). Total num frames: 6410747904. Throughput: 0: 53196.2. Samples: 901217040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 13:40:44,107][52031] Avg episode reward: [(0, '0.703')] [2024-04-27 13:40:46,328][52263] Updated weights for policy 0, policy_version 391289 (0.0032) [2024-04-27 13:40:49,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.3, 300 sec: 53317.4). Total num frames: 6411026432. Throughput: 0: 53131.3. Samples: 901534740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:40:49,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 13:40:49,181][52263] Updated weights for policy 0, policy_version 391299 (0.0031) [2024-04-27 13:40:52,375][52263] Updated weights for policy 0, policy_version 391309 (0.0036) [2024-04-27 13:40:54,106][52031] Fps is (10 sec: 57344.9, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6411321344. Throughput: 0: 53202.3. Samples: 901856040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:40:54,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:40:55,145][52263] Updated weights for policy 0, policy_version 391319 (0.0034) [2024-04-27 13:40:58,521][52263] Updated weights for policy 0, policy_version 391329 (0.0030) [2024-04-27 13:40:59,106][52031] Fps is (10 sec: 55705.3, 60 sec: 52974.9, 300 sec: 53484.1). Total num frames: 6411583488. Throughput: 0: 53470.7. Samples: 902026520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:40:59,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 13:41:01,240][52263] Updated weights for policy 0, policy_version 391339 (0.0031) [2024-04-27 13:41:04,107][52031] Fps is (10 sec: 49150.8, 60 sec: 52974.8, 300 sec: 53372.9). Total num frames: 6411812864. Throughput: 0: 53591.2. Samples: 902352520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:04,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 13:41:04,519][52263] Updated weights for policy 0, policy_version 391349 (0.0032) [2024-04-27 13:41:07,500][52263] Updated weights for policy 0, policy_version 391359 (0.0023) [2024-04-27 13:41:09,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 6412107776. Throughput: 0: 53640.4. Samples: 902672380. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:09,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 13:41:10,496][52263] Updated weights for policy 0, policy_version 391369 (0.0029) [2024-04-27 13:41:13,794][52263] Updated weights for policy 0, policy_version 391379 (0.0028) [2024-04-27 13:41:14,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6412369920. Throughput: 0: 53330.1. Samples: 902823640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:14,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 13:41:16,515][52263] Updated weights for policy 0, policy_version 391389 (0.0028) [2024-04-27 13:41:19,106][52031] Fps is (10 sec: 54067.7, 60 sec: 54067.1, 300 sec: 53428.5). Total num frames: 6412648448. Throughput: 0: 53298.7. Samples: 903145880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:19,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 13:41:19,739][52263] Updated weights for policy 0, policy_version 391399 (0.0034) [2024-04-27 13:41:22,655][52263] Updated weights for policy 0, policy_version 391409 (0.0031) [2024-04-27 13:41:24,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6412926976. Throughput: 0: 53356.4. Samples: 903469500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:24,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 13:41:25,727][52263] Updated weights for policy 0, policy_version 391419 (0.0035) [2024-04-27 13:41:28,773][52263] Updated weights for policy 0, policy_version 391429 (0.0027) [2024-04-27 13:41:29,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6413189120. Throughput: 0: 53889.4. Samples: 903642060. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:29,107][52031] Avg episode reward: [(0, '0.680')] [2024-04-27 13:41:31,867][52263] Updated weights for policy 0, policy_version 391439 (0.0030) [2024-04-27 13:41:34,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6413434880. Throughput: 0: 53964.4. Samples: 903963140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:34,115][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:41:34,125][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000391445_6413434880.pth... [2024-04-27 13:41:34,173][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000390663_6400622592.pth [2024-04-27 13:41:35,075][52263] Updated weights for policy 0, policy_version 391449 (0.0032) [2024-04-27 13:41:38,022][52263] Updated weights for policy 0, policy_version 391459 (0.0030) [2024-04-27 13:41:39,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6413713408. Throughput: 0: 53901.2. Samples: 904281600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:39,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 13:41:41,072][52263] Updated weights for policy 0, policy_version 391469 (0.0031) [2024-04-27 13:41:44,006][52263] Updated weights for policy 0, policy_version 391479 (0.0024) [2024-04-27 13:41:44,106][52031] Fps is (10 sec: 55705.6, 60 sec: 54067.3, 300 sec: 53373.0). Total num frames: 6413991936. Throughput: 0: 53690.7. Samples: 904442600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:44,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 13:41:47,241][52263] Updated weights for policy 0, policy_version 391489 (0.0025) [2024-04-27 13:41:49,107][52031] Fps is (10 sec: 55705.3, 60 sec: 54067.0, 300 sec: 53484.0). Total num frames: 6414270464. Throughput: 0: 53601.0. Samples: 904764560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:49,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 13:41:50,016][52263] Updated weights for policy 0, policy_version 391499 (0.0028) [2024-04-27 13:41:53,341][52263] Updated weights for policy 0, policy_version 391509 (0.0036) [2024-04-27 13:41:54,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6414548992. Throughput: 0: 53694.3. Samples: 905088620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:54,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 13:41:56,274][52263] Updated weights for policy 0, policy_version 391519 (0.0037) [2024-04-27 13:41:58,260][52242] Signal inference workers to stop experience collection... (13700 times) [2024-04-27 13:41:58,260][52242] Signal inference workers to resume experience collection... (13700 times) [2024-04-27 13:41:58,282][52263] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-04-27 13:41:58,282][52263] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-04-27 13:41:59,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6414778368. Throughput: 0: 53952.8. Samples: 905251520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:41:59,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 13:41:59,425][52263] Updated weights for policy 0, policy_version 391529 (0.0037) [2024-04-27 13:42:02,512][52263] Updated weights for policy 0, policy_version 391539 (0.0034) [2024-04-27 13:42:04,106][52031] Fps is (10 sec: 50790.5, 60 sec: 54067.4, 300 sec: 53484.1). Total num frames: 6415056896. Throughput: 0: 53921.8. Samples: 905572360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 13:42:04,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 13:42:05,380][52263] Updated weights for policy 0, policy_version 391549 (0.0035) [2024-04-27 13:42:08,484][52263] Updated weights for policy 0, policy_version 391559 (0.0026) [2024-04-27 13:42:09,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6415335424. Throughput: 0: 53888.6. Samples: 905894480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:09,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 13:42:11,518][52263] Updated weights for policy 0, policy_version 391569 (0.0031) [2024-04-27 13:42:14,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6415597568. Throughput: 0: 53638.6. Samples: 906055800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:14,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 13:42:14,599][52263] Updated weights for policy 0, policy_version 391579 (0.0030) [2024-04-27 13:42:17,764][52263] Updated weights for policy 0, policy_version 391589 (0.0029) [2024-04-27 13:42:19,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6415876096. Throughput: 0: 53677.1. Samples: 906378620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:19,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 13:42:20,788][52263] Updated weights for policy 0, policy_version 391599 (0.0030) [2024-04-27 13:42:23,893][52263] Updated weights for policy 0, policy_version 391609 (0.0025) [2024-04-27 13:42:24,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6416138240. Throughput: 0: 53749.1. Samples: 906700300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:24,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 13:42:27,006][52263] Updated weights for policy 0, policy_version 391619 (0.0030) [2024-04-27 13:42:29,106][52031] Fps is (10 sec: 49153.2, 60 sec: 52975.1, 300 sec: 53373.0). Total num frames: 6416367616. Throughput: 0: 53593.9. Samples: 906854320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:29,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 13:42:30,106][52263] Updated weights for policy 0, policy_version 391629 (0.0030) [2024-04-27 13:42:33,202][52263] Updated weights for policy 0, policy_version 391639 (0.0034) [2024-04-27 13:42:34,107][52031] Fps is (10 sec: 52427.5, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6416662528. Throughput: 0: 53415.1. Samples: 907168240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:34,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 13:42:36,104][52263] Updated weights for policy 0, policy_version 391649 (0.0026) [2024-04-27 13:42:39,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6416924672. Throughput: 0: 53287.6. Samples: 907486560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:39,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 13:42:39,217][52263] Updated weights for policy 0, policy_version 391659 (0.0034) [2024-04-27 13:42:42,311][52263] Updated weights for policy 0, policy_version 391669 (0.0030) [2024-04-27 13:42:44,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6417186816. Throughput: 0: 53307.9. Samples: 907650380. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:44,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 13:42:45,441][52263] Updated weights for policy 0, policy_version 391679 (0.0031) [2024-04-27 13:42:48,415][52263] Updated weights for policy 0, policy_version 391689 (0.0036) [2024-04-27 13:42:49,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6417481728. Throughput: 0: 53285.1. Samples: 907970200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:49,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 13:42:51,637][52263] Updated weights for policy 0, policy_version 391699 (0.0026) [2024-04-27 13:42:54,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52701.8, 300 sec: 53428.5). Total num frames: 6417711104. Throughput: 0: 53290.2. Samples: 908292540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:54,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 13:42:54,526][52263] Updated weights for policy 0, policy_version 391709 (0.0030) [2024-04-27 13:42:57,829][52263] Updated weights for policy 0, policy_version 391719 (0.0036) [2024-04-27 13:42:59,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6417989632. Throughput: 0: 53144.5. Samples: 908447300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:42:59,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 13:43:00,591][52263] Updated weights for policy 0, policy_version 391729 (0.0039) [2024-04-27 13:43:04,083][52263] Updated weights for policy 0, policy_version 391739 (0.0027) [2024-04-27 13:43:04,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6418251776. Throughput: 0: 53035.0. Samples: 908765180. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:43:04,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 13:43:06,795][52263] Updated weights for policy 0, policy_version 391749 (0.0029) [2024-04-27 13:43:09,106][52031] Fps is (10 sec: 52429.1, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 6418513920. Throughput: 0: 53002.6. Samples: 909085420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:43:09,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 13:43:10,049][52263] Updated weights for policy 0, policy_version 391759 (0.0029) [2024-04-27 13:43:12,466][52242] Signal inference workers to stop experience collection... (13750 times) [2024-04-27 13:43:12,466][52242] Signal inference workers to resume experience collection... (13750 times) [2024-04-27 13:43:12,493][52263] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-04-27 13:43:12,493][52263] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-04-27 13:43:12,869][52263] Updated weights for policy 0, policy_version 391769 (0.0035) [2024-04-27 13:43:14,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6418808832. Throughput: 0: 53319.8. Samples: 909253720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:43:14,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 13:43:16,117][52263] Updated weights for policy 0, policy_version 391779 (0.0028) [2024-04-27 13:43:18,884][52263] Updated weights for policy 0, policy_version 391789 (0.0028) [2024-04-27 13:43:19,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6419070976. Throughput: 0: 53600.5. Samples: 909580260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:43:19,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 13:43:22,096][52263] Updated weights for policy 0, policy_version 391799 (0.0028) [2024-04-27 13:43:24,107][52031] Fps is (10 sec: 50790.4, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6419316736. Throughput: 0: 53768.3. Samples: 909906140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 13:43:24,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 13:43:24,974][52263] Updated weights for policy 0, policy_version 391809 (0.0029) [2024-04-27 13:43:28,091][52263] Updated weights for policy 0, policy_version 391819 (0.0025) [2024-04-27 13:43:29,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53793.9, 300 sec: 53428.5). Total num frames: 6419595264. Throughput: 0: 53553.8. Samples: 910060300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:43:29,107][52031] Avg episode reward: [(0, '0.683')] [2024-04-27 13:43:30,968][52263] Updated weights for policy 0, policy_version 391829 (0.0028) [2024-04-27 13:43:34,103][52263] Updated weights for policy 0, policy_version 391839 (0.0030) [2024-04-27 13:43:34,106][52031] Fps is (10 sec: 57345.1, 60 sec: 53794.4, 300 sec: 53484.1). Total num frames: 6419890176. Throughput: 0: 53631.8. Samples: 910383620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:43:34,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 13:43:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000391839_6419890176.pth... [2024-04-27 13:43:34,169][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000391055_6407045120.pth [2024-04-27 13:43:36,982][52263] Updated weights for policy 0, policy_version 391849 (0.0026) [2024-04-27 13:43:39,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6420135936. Throughput: 0: 53715.0. Samples: 910709720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:43:39,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 13:43:40,284][52263] Updated weights for policy 0, policy_version 391859 (0.0033) [2024-04-27 13:43:43,072][52263] Updated weights for policy 0, policy_version 391869 (0.0027) [2024-04-27 13:43:44,107][52031] Fps is (10 sec: 54066.4, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6420430848. Throughput: 0: 53975.5. Samples: 910876200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:43:44,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 13:43:46,196][52263] Updated weights for policy 0, policy_version 391879 (0.0029) [2024-04-27 13:43:49,086][52263] Updated weights for policy 0, policy_version 391889 (0.0038) [2024-04-27 13:43:49,107][52031] Fps is (10 sec: 57344.5, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6420709376. Throughput: 0: 54087.8. Samples: 911199140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:43:49,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 13:43:52,244][52263] Updated weights for policy 0, policy_version 391899 (0.0032) [2024-04-27 13:43:54,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6420938752. Throughput: 0: 54061.8. Samples: 911518200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:43:54,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 13:43:55,392][52263] Updated weights for policy 0, policy_version 391909 (0.0027) [2024-04-27 13:43:58,381][52263] Updated weights for policy 0, policy_version 391919 (0.0034) [2024-04-27 13:43:59,106][52031] Fps is (10 sec: 49152.3, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6421200896. Throughput: 0: 53775.7. Samples: 911673620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:43:59,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 13:44:01,348][52263] Updated weights for policy 0, policy_version 391929 (0.0034) [2024-04-27 13:44:04,106][52031] Fps is (10 sec: 55705.4, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6421495808. Throughput: 0: 53596.1. Samples: 911992080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:44:04,107][52031] Avg episode reward: [(0, '0.677')] [2024-04-27 13:44:04,587][52263] Updated weights for policy 0, policy_version 391939 (0.0034) [2024-04-27 13:44:06,791][52242] Signal inference workers to stop experience collection... (13800 times) [2024-04-27 13:44:06,797][52242] Signal inference workers to resume experience collection... (13800 times) [2024-04-27 13:44:06,821][52263] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-04-27 13:44:06,821][52263] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-04-27 13:44:07,539][52263] Updated weights for policy 0, policy_version 391949 (0.0031) [2024-04-27 13:44:09,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6421741568. Throughput: 0: 53526.4. Samples: 912314820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:44:09,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 13:44:10,580][52263] Updated weights for policy 0, policy_version 391959 (0.0027) [2024-04-27 13:44:13,867][52263] Updated weights for policy 0, policy_version 391969 (0.0028) [2024-04-27 13:44:14,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6422036480. Throughput: 0: 53871.7. Samples: 912484520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:44:14,107][52031] Avg episode reward: [(0, '0.689')] [2024-04-27 13:44:16,551][52263] Updated weights for policy 0, policy_version 391979 (0.0028) [2024-04-27 13:44:19,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6422282240. Throughput: 0: 53780.9. Samples: 912803760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:44:19,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:44:19,828][52263] Updated weights for policy 0, policy_version 391989 (0.0033) [2024-04-27 13:44:22,806][52263] Updated weights for policy 0, policy_version 391999 (0.0031) [2024-04-27 13:44:24,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6422544384. Throughput: 0: 53639.2. Samples: 913123480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:44:24,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 13:44:26,022][52263] Updated weights for policy 0, policy_version 392009 (0.0029) [2024-04-27 13:44:28,911][52263] Updated weights for policy 0, policy_version 392019 (0.0039) [2024-04-27 13:44:29,106][52031] Fps is (10 sec: 55705.5, 60 sec: 54067.4, 300 sec: 53595.1). Total num frames: 6422839296. Throughput: 0: 53344.6. Samples: 913276700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:44:29,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 13:44:31,945][52263] Updated weights for policy 0, policy_version 392029 (0.0027) [2024-04-27 13:44:34,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.8, 300 sec: 53595.1). Total num frames: 6423085056. Throughput: 0: 53265.3. Samples: 913596080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:44:34,107][52031] Avg episode reward: [(0, '0.477')] [2024-04-27 13:44:35,118][52263] Updated weights for policy 0, policy_version 392039 (0.0028) [2024-04-27 13:44:38,104][52263] Updated weights for policy 0, policy_version 392049 (0.0028) [2024-04-27 13:44:39,106][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.3, 300 sec: 53484.0). Total num frames: 6423363584. Throughput: 0: 53264.8. Samples: 913915120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:44:39,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 13:44:41,882][52263] Updated weights for policy 0, policy_version 392059 (0.0026) [2024-04-27 13:44:44,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6423642112. Throughput: 0: 53542.6. Samples: 914083040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 13:44:44,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 13:44:44,278][52263] Updated weights for policy 0, policy_version 392069 (0.0031) [2024-04-27 13:44:48,088][52263] Updated weights for policy 0, policy_version 392079 (0.0030) [2024-04-27 13:44:49,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52975.1, 300 sec: 53539.6). Total num frames: 6423887872. Throughput: 0: 53630.3. Samples: 914405440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:44:49,107][52031] Avg episode reward: [(0, '0.675')] [2024-04-27 13:44:50,512][52263] Updated weights for policy 0, policy_version 392089 (0.0030) [2024-04-27 13:44:54,107][52031] Fps is (10 sec: 49151.9, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6424133632. Throughput: 0: 53635.0. Samples: 914728400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:44:54,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 13:44:54,218][52263] Updated weights for policy 0, policy_version 392099 (0.0028) [2024-04-27 13:44:55,593][52242] Signal inference workers to stop experience collection... (13850 times) [2024-04-27 13:44:55,596][52242] Signal inference workers to resume experience collection... (13850 times) [2024-04-27 13:44:55,615][52263] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-04-27 13:44:55,615][52263] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-04-27 13:44:56,719][52263] Updated weights for policy 0, policy_version 392109 (0.0028) [2024-04-27 13:44:59,106][52031] Fps is (10 sec: 57343.4, 60 sec: 54340.3, 300 sec: 53650.7). Total num frames: 6424461312. Throughput: 0: 53414.7. Samples: 914888180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:44:59,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 13:45:00,231][52263] Updated weights for policy 0, policy_version 392119 (0.0039) [2024-04-27 13:45:02,738][52263] Updated weights for policy 0, policy_version 392129 (0.0028) [2024-04-27 13:45:04,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6424690688. Throughput: 0: 53438.6. Samples: 915208500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:04,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 13:45:06,287][52263] Updated weights for policy 0, policy_version 392139 (0.0026) [2024-04-27 13:45:08,696][52263] Updated weights for policy 0, policy_version 392149 (0.0031) [2024-04-27 13:45:09,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6424969216. Throughput: 0: 53523.2. Samples: 915532020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:09,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:45:12,462][52263] Updated weights for policy 0, policy_version 392159 (0.0034) [2024-04-27 13:45:14,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6425247744. Throughput: 0: 53690.5. Samples: 915692780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:14,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 13:45:14,918][52263] Updated weights for policy 0, policy_version 392169 (0.0036) [2024-04-27 13:45:18,501][52263] Updated weights for policy 0, policy_version 392179 (0.0032) [2024-04-27 13:45:19,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6425509888. Throughput: 0: 53667.2. Samples: 916011100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:19,107][52031] Avg episode reward: [(0, '0.500')] [2024-04-27 13:45:21,168][52263] Updated weights for policy 0, policy_version 392189 (0.0029) [2024-04-27 13:45:24,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.1, 300 sec: 53539.5). Total num frames: 6425772032. Throughput: 0: 53849.3. Samples: 916338340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:24,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 13:45:24,492][52263] Updated weights for policy 0, policy_version 392199 (0.0030) [2024-04-27 13:45:27,171][52263] Updated weights for policy 0, policy_version 392209 (0.0030) [2024-04-27 13:45:29,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6426050560. Throughput: 0: 53564.6. Samples: 916493440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:29,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 13:45:30,600][52263] Updated weights for policy 0, policy_version 392219 (0.0028) [2024-04-27 13:45:33,204][52263] Updated weights for policy 0, policy_version 392229 (0.0026) [2024-04-27 13:45:34,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6426312704. Throughput: 0: 53638.5. Samples: 916819180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:34,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 13:45:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000392231_6426312704.pth... [2024-04-27 13:45:34,169][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000391445_6413434880.pth [2024-04-27 13:45:36,763][52263] Updated weights for policy 0, policy_version 392239 (0.0026) [2024-04-27 13:45:39,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6426591232. Throughput: 0: 53673.4. Samples: 917143700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:39,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 13:45:39,252][52263] Updated weights for policy 0, policy_version 392249 (0.0030) [2024-04-27 13:45:42,921][52263] Updated weights for policy 0, policy_version 392259 (0.0028) [2024-04-27 13:45:44,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 6426853376. Throughput: 0: 53758.7. Samples: 917307320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:44,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 13:45:45,399][52263] Updated weights for policy 0, policy_version 392269 (0.0028) [2024-04-27 13:45:48,850][52263] Updated weights for policy 0, policy_version 392279 (0.0031) [2024-04-27 13:45:49,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53794.1, 300 sec: 53539.6). 
Total num frames: 6427115520. Throughput: 0: 53818.2. Samples: 917630320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:49,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 13:45:51,401][52263] Updated weights for policy 0, policy_version 392289 (0.0033) [2024-04-27 13:45:54,106][52031] Fps is (10 sec: 52429.3, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6427377664. Throughput: 0: 53800.1. Samples: 917953020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:54,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 13:45:54,803][52263] Updated weights for policy 0, policy_version 392299 (0.0031) [2024-04-27 13:45:55,114][52242] Signal inference workers to stop experience collection... (13900 times) [2024-04-27 13:45:55,145][52263] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-04-27 13:45:55,174][52242] Signal inference workers to resume experience collection... (13900 times) [2024-04-27 13:45:55,178][52263] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-04-27 13:45:57,617][52263] Updated weights for policy 0, policy_version 392309 (0.0031) [2024-04-27 13:45:59,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.0, 300 sec: 53706.2). Total num frames: 6427656192. Throughput: 0: 53771.6. Samples: 918112500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 13:45:59,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 13:46:00,964][52263] Updated weights for policy 0, policy_version 392319 (0.0031) [2024-04-27 13:46:03,906][52263] Updated weights for policy 0, policy_version 392329 (0.0028) [2024-04-27 13:46:04,106][52031] Fps is (10 sec: 55705.1, 60 sec: 54067.1, 300 sec: 53650.7). Total num frames: 6427934720. Throughput: 0: 53834.7. Samples: 918433660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:04,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 13:46:07,233][52263] Updated weights for policy 0, policy_version 392339 (0.0026) [2024-04-27 13:46:09,106][52031] Fps is (10 sec: 55705.6, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6428213248. Throughput: 0: 53684.5. Samples: 918754140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:09,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 13:46:09,898][52263] Updated weights for policy 0, policy_version 392349 (0.0031) [2024-04-27 13:46:13,178][52263] Updated weights for policy 0, policy_version 392359 (0.0024) [2024-04-27 13:46:14,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6428459008. Throughput: 0: 53872.3. Samples: 918917700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:14,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 13:46:15,924][52263] Updated weights for policy 0, policy_version 392369 (0.0032) [2024-04-27 13:46:19,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6428721152. Throughput: 0: 53756.0. Samples: 919238200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:19,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 13:46:19,236][52263] Updated weights for policy 0, policy_version 392379 (0.0028) [2024-04-27 13:46:21,969][52263] Updated weights for policy 0, policy_version 392389 (0.0025) [2024-04-27 13:46:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6428983296. Throughput: 0: 53708.9. Samples: 919560600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:24,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 13:46:25,645][52263] Updated weights for policy 0, policy_version 392399 (0.0031) [2024-04-27 13:46:28,191][52263] Updated weights for policy 0, policy_version 392409 (0.0028) [2024-04-27 13:46:29,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 6429261824. Throughput: 0: 53547.0. Samples: 919716940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:29,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 13:46:31,599][52263] Updated weights for policy 0, policy_version 392419 (0.0037) [2024-04-27 13:46:34,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6429523968. Throughput: 0: 53531.6. Samples: 920039240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:34,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 13:46:34,327][52263] Updated weights for policy 0, policy_version 392429 (0.0033) [2024-04-27 13:46:37,673][52263] Updated weights for policy 0, policy_version 392439 (0.0034) [2024-04-27 13:46:39,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6429802496. Throughput: 0: 53575.4. Samples: 920363920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:39,116][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 13:46:40,425][52263] Updated weights for policy 0, policy_version 392449 (0.0031) [2024-04-27 13:46:43,791][52263] Updated weights for policy 0, policy_version 392459 (0.0026) [2024-04-27 13:46:44,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6430064640. Throughput: 0: 53490.2. Samples: 920519560. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:44,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 13:46:46,583][52263] Updated weights for policy 0, policy_version 392469 (0.0038) [2024-04-27 13:46:49,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6430326784. Throughput: 0: 53475.4. Samples: 920840060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:49,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 13:46:49,866][52263] Updated weights for policy 0, policy_version 392479 (0.0029) [2024-04-27 13:46:52,637][52263] Updated weights for policy 0, policy_version 392489 (0.0031) [2024-04-27 13:46:54,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.0, 300 sec: 53650.7). Total num frames: 6430605312. Throughput: 0: 53457.7. Samples: 921159740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:54,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 13:46:56,008][52263] Updated weights for policy 0, policy_version 392499 (0.0035) [2024-04-27 13:46:58,907][52263] Updated weights for policy 0, policy_version 392509 (0.0030) [2024-04-27 13:46:59,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6430867456. Throughput: 0: 53468.1. Samples: 921323760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:46:59,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 13:47:02,144][52263] Updated weights for policy 0, policy_version 392519 (0.0025) [2024-04-27 13:47:04,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6431145984. Throughput: 0: 53422.8. Samples: 921642220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:47:04,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 13:47:05,154][52263] Updated weights for policy 0, policy_version 392529 (0.0029) [2024-04-27 13:47:08,215][52263] Updated weights for policy 0, policy_version 392539 (0.0038) [2024-04-27 13:47:09,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6431391744. Throughput: 0: 53400.4. Samples: 921963620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:47:09,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 13:47:11,289][52263] Updated weights for policy 0, policy_version 392549 (0.0029) [2024-04-27 13:47:14,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6431670272. Throughput: 0: 53406.8. Samples: 922120240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:47:14,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 13:47:14,437][52263] Updated weights for policy 0, policy_version 392559 (0.0033) [2024-04-27 13:47:17,296][52263] Updated weights for policy 0, policy_version 392569 (0.0032) [2024-04-27 13:47:18,561][52242] Signal inference workers to stop experience collection... (13950 times) [2024-04-27 13:47:18,561][52242] Signal inference workers to resume experience collection... (13950 times) [2024-04-27 13:47:18,586][52263] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-04-27 13:47:18,586][52263] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-04-27 13:47:19,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6431916032. Throughput: 0: 53269.6. Samples: 922436380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 13:47:19,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 13:47:20,638][52263] Updated weights for policy 0, policy_version 392579 (0.0029) [2024-04-27 13:47:23,383][52263] Updated weights for policy 0, policy_version 392589 (0.0033) [2024-04-27 13:47:24,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6432194560. Throughput: 0: 53172.0. Samples: 922756660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:47:24,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 13:47:26,599][52263] Updated weights for policy 0, policy_version 392599 (0.0033) [2024-04-27 13:47:29,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6432473088. Throughput: 0: 53396.9. Samples: 922922420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:47:29,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 13:47:29,511][52263] Updated weights for policy 0, policy_version 392609 (0.0032) [2024-04-27 13:47:32,837][52263] Updated weights for policy 0, policy_version 392619 (0.0027) [2024-04-27 13:47:34,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6432735232. Throughput: 0: 53470.8. Samples: 923246240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:47:34,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 13:47:34,183][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000392624_6432751616.pth... [2024-04-27 13:47:34,226][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000391839_6419890176.pth [2024-04-27 13:47:35,638][52263] Updated weights for policy 0, policy_version 392629 (0.0030) [2024-04-27 13:47:39,044][52263] Updated weights for policy 0, policy_version 392639 (0.0034) [2024-04-27 13:47:39,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53248.0, 300 sec: 53595.1). 
Total num frames: 6432997376. Throughput: 0: 53464.8. Samples: 923565660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:47:39,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 13:47:41,938][52263] Updated weights for policy 0, policy_version 392649 (0.0027) [2024-04-27 13:47:44,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6433275904. Throughput: 0: 53444.9. Samples: 923728780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:47:44,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 13:47:45,017][52263] Updated weights for policy 0, policy_version 392659 (0.0033) [2024-04-27 13:47:47,956][52263] Updated weights for policy 0, policy_version 392669 (0.0028) [2024-04-27 13:47:49,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53248.2, 300 sec: 53595.1). Total num frames: 6433521664. Throughput: 0: 53497.4. Samples: 924049600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:47:49,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 13:47:51,164][52263] Updated weights for policy 0, policy_version 392679 (0.0028) [2024-04-27 13:47:54,020][52263] Updated weights for policy 0, policy_version 392689 (0.0028) [2024-04-27 13:47:54,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6433816576. Throughput: 0: 53514.2. Samples: 924371760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:47:54,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 13:47:57,258][52263] Updated weights for policy 0, policy_version 392699 (0.0032) [2024-04-27 13:47:59,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6434078720. Throughput: 0: 53559.2. Samples: 924530400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:47:59,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 13:48:00,089][52263] Updated weights for policy 0, policy_version 392709 (0.0031) [2024-04-27 13:48:03,368][52263] Updated weights for policy 0, policy_version 392719 (0.0028) [2024-04-27 13:48:04,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53247.9, 300 sec: 53650.6). Total num frames: 6434340864. Throughput: 0: 53708.5. Samples: 924853260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:48:04,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:48:06,291][52263] Updated weights for policy 0, policy_version 392729 (0.0036) [2024-04-27 13:48:09,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6434603008. Throughput: 0: 53682.7. Samples: 925172380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:48:09,108][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 13:48:09,374][52263] Updated weights for policy 0, policy_version 392739 (0.0030) [2024-04-27 13:48:12,534][52263] Updated weights for policy 0, policy_version 392749 (0.0027) [2024-04-27 13:48:14,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6434865152. Throughput: 0: 53554.5. Samples: 925332380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:48:14,116][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 13:48:15,525][52263] Updated weights for policy 0, policy_version 392759 (0.0029) [2024-04-27 13:48:18,527][52263] Updated weights for policy 0, policy_version 392769 (0.0029) [2024-04-27 13:48:19,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6435127296. Throughput: 0: 53431.0. Samples: 925650640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:48:19,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 13:48:21,787][52263] Updated weights for policy 0, policy_version 392779 (0.0027) [2024-04-27 13:48:24,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6435405824. Throughput: 0: 53496.6. Samples: 925973000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:48:24,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 13:48:24,574][52263] Updated weights for policy 0, policy_version 392789 (0.0026) [2024-04-27 13:48:27,839][52263] Updated weights for policy 0, policy_version 392799 (0.0032) [2024-04-27 13:48:29,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6435684352. Throughput: 0: 53484.5. Samples: 926135580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:48:29,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 13:48:30,859][52263] Updated weights for policy 0, policy_version 392809 (0.0030) [2024-04-27 13:48:33,822][52263] Updated weights for policy 0, policy_version 392819 (0.0029) [2024-04-27 13:48:34,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53595.2). Total num frames: 6435946496. Throughput: 0: 53476.8. Samples: 926456060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:48:34,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 13:48:37,049][52263] Updated weights for policy 0, policy_version 392829 (0.0030) [2024-04-27 13:48:39,106][52031] Fps is (10 sec: 52428.3, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6436208640. Throughput: 0: 53381.0. Samples: 926773900. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 13:48:39,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 13:48:39,974][52263] Updated weights for policy 0, policy_version 392839 (0.0026) [2024-04-27 13:48:43,300][52263] Updated weights for policy 0, policy_version 392849 (0.0025) [2024-04-27 13:48:44,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6436487168. Throughput: 0: 53447.8. Samples: 926935560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:48:44,110][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 13:48:45,692][52242] Signal inference workers to stop experience collection... (14000 times) [2024-04-27 13:48:45,734][52263] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-04-27 13:48:45,792][52242] Signal inference workers to resume experience collection... (14000 times) [2024-04-27 13:48:45,792][52263] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-04-27 13:48:46,309][52263] Updated weights for policy 0, policy_version 392859 (0.0026) [2024-04-27 13:48:49,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6436732928. Throughput: 0: 53213.4. Samples: 927247860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:48:49,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 13:48:49,639][52263] Updated weights for policy 0, policy_version 392869 (0.0032) [2024-04-27 13:48:52,388][52263] Updated weights for policy 0, policy_version 392879 (0.0030) [2024-04-27 13:48:54,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6437011456. Throughput: 0: 53135.4. Samples: 927563480. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:48:54,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 13:48:55,793][52263] Updated weights for policy 0, policy_version 392889 (0.0031) [2024-04-27 13:48:58,464][52263] Updated weights for policy 0, policy_version 392899 (0.0033) [2024-04-27 13:48:59,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6437273600. Throughput: 0: 53221.0. Samples: 927727320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:48:59,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 13:49:01,845][52263] Updated weights for policy 0, policy_version 392909 (0.0030) [2024-04-27 13:49:04,107][52031] Fps is (10 sec: 50790.7, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6437519360. Throughput: 0: 53199.5. Samples: 928044620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:04,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 13:49:04,709][52263] Updated weights for policy 0, policy_version 392919 (0.0027) [2024-04-27 13:49:07,887][52263] Updated weights for policy 0, policy_version 392929 (0.0031) [2024-04-27 13:49:09,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6437797888. Throughput: 0: 53126.1. Samples: 928363680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:09,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 13:49:11,065][52263] Updated weights for policy 0, policy_version 392939 (0.0034) [2024-04-27 13:49:14,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6438060032. Throughput: 0: 52872.3. Samples: 928514840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:14,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 13:49:14,171][52263] Updated weights for policy 0, policy_version 392949 (0.0034) [2024-04-27 13:49:17,171][52263] Updated weights for policy 0, policy_version 392959 (0.0028) [2024-04-27 13:49:19,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6438322176. Throughput: 0: 52978.1. Samples: 928840080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:19,108][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 13:49:20,408][52263] Updated weights for policy 0, policy_version 392969 (0.0027) [2024-04-27 13:49:23,280][52263] Updated weights for policy 0, policy_version 392979 (0.0032) [2024-04-27 13:49:24,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6438584320. Throughput: 0: 52946.3. Samples: 929156480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:24,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 13:49:26,621][52263] Updated weights for policy 0, policy_version 392989 (0.0039) [2024-04-27 13:49:29,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52974.9, 300 sec: 53484.1). Total num frames: 6438862848. Throughput: 0: 53017.5. Samples: 929321340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:29,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 13:49:29,348][52263] Updated weights for policy 0, policy_version 392999 (0.0032) [2024-04-27 13:49:32,594][52263] Updated weights for policy 0, policy_version 393009 (0.0027) [2024-04-27 13:49:34,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6439141376. Throughput: 0: 53167.0. Samples: 929640380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:34,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 13:49:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000393014_6439141376.pth... [2024-04-27 13:49:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000392231_6426312704.pth [2024-04-27 13:49:35,589][52263] Updated weights for policy 0, policy_version 393019 (0.0027) [2024-04-27 13:49:38,878][52263] Updated weights for policy 0, policy_version 393029 (0.0035) [2024-04-27 13:49:39,107][52031] Fps is (10 sec: 54065.7, 60 sec: 53247.8, 300 sec: 53428.5). Total num frames: 6439403520. Throughput: 0: 53250.5. Samples: 929959760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:39,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:49:41,850][52263] Updated weights for policy 0, policy_version 393039 (0.0035) [2024-04-27 13:49:44,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 6439649280. Throughput: 0: 53042.6. Samples: 930114240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:44,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 13:49:44,943][52263] Updated weights for policy 0, policy_version 393049 (0.0032) [2024-04-27 13:49:47,835][52263] Updated weights for policy 0, policy_version 393059 (0.0032) [2024-04-27 13:49:49,106][52031] Fps is (10 sec: 52430.4, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6439927808. Throughput: 0: 53187.3. Samples: 930438040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:49,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 13:49:51,068][52263] Updated weights for policy 0, policy_version 393069 (0.0028) [2024-04-27 13:49:54,035][52263] Updated weights for policy 0, policy_version 393079 (0.0027) [2024-04-27 13:49:54,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53248.1, 300 sec: 53373.0). 
Total num frames: 6440206336. Throughput: 0: 53274.3. Samples: 930761020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:54,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 13:49:57,104][52263] Updated weights for policy 0, policy_version 393089 (0.0030) [2024-04-27 13:49:59,106][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6440452096. Throughput: 0: 53449.8. Samples: 930920080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 13:49:59,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 13:49:59,308][52242] Signal inference workers to stop experience collection... (14050 times) [2024-04-27 13:49:59,354][52263] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-04-27 13:49:59,364][52242] Signal inference workers to resume experience collection... (14050 times) [2024-04-27 13:49:59,371][52263] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-04-27 13:50:00,168][52263] Updated weights for policy 0, policy_version 393099 (0.0028) [2024-04-27 13:50:03,082][52263] Updated weights for policy 0, policy_version 393109 (0.0026) [2024-04-27 13:50:04,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6440730624. Throughput: 0: 53415.1. Samples: 931243760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:04,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 13:50:06,475][52263] Updated weights for policy 0, policy_version 393119 (0.0027) [2024-04-27 13:50:09,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6441009152. Throughput: 0: 53463.5. Samples: 931562340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:09,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 13:50:09,113][52263] Updated weights for policy 0, policy_version 393129 (0.0034) [2024-04-27 13:50:12,634][52263] Updated weights for policy 0, policy_version 393139 (0.0037) [2024-04-27 13:50:14,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6441271296. Throughput: 0: 53450.7. Samples: 931726620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:14,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:50:15,348][52263] Updated weights for policy 0, policy_version 393149 (0.0031) [2024-04-27 13:50:18,767][52263] Updated weights for policy 0, policy_version 393159 (0.0030) [2024-04-27 13:50:19,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6441533440. Throughput: 0: 53502.3. Samples: 932047980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:19,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 13:50:21,551][52263] Updated weights for policy 0, policy_version 393169 (0.0027) [2024-04-27 13:50:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6441795584. Throughput: 0: 53614.6. Samples: 932372400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:24,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 13:50:24,776][52263] Updated weights for policy 0, policy_version 393179 (0.0024) [2024-04-27 13:50:27,766][52263] Updated weights for policy 0, policy_version 393189 (0.0034) [2024-04-27 13:50:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6442074112. Throughput: 0: 53656.8. Samples: 932528800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:29,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 13:50:31,005][52263] Updated weights for policy 0, policy_version 393199 (0.0031) [2024-04-27 13:50:33,827][52263] Updated weights for policy 0, policy_version 393209 (0.0032) [2024-04-27 13:50:34,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6442352640. Throughput: 0: 53562.1. Samples: 932848340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:34,107][52031] Avg episode reward: [(0, '0.705')] [2024-04-27 13:50:37,064][52263] Updated weights for policy 0, policy_version 393219 (0.0031) [2024-04-27 13:50:39,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6442614784. Throughput: 0: 53502.6. Samples: 933168640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:39,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 13:50:39,876][52263] Updated weights for policy 0, policy_version 393229 (0.0033) [2024-04-27 13:50:43,293][52263] Updated weights for policy 0, policy_version 393239 (0.0033) [2024-04-27 13:50:44,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6442876928. Throughput: 0: 53579.2. Samples: 933331140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:44,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 13:50:46,027][52263] Updated weights for policy 0, policy_version 393249 (0.0028) [2024-04-27 13:50:49,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6443122688. Throughput: 0: 53526.6. Samples: 933652460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:49,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 13:50:49,558][52263] Updated weights for policy 0, policy_version 393259 (0.0029) [2024-04-27 13:50:52,233][52263] Updated weights for policy 0, policy_version 393269 (0.0031) [2024-04-27 13:50:54,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53247.8, 300 sec: 53372.9). Total num frames: 6443401216. Throughput: 0: 53507.8. Samples: 933970200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:54,108][52031] Avg episode reward: [(0, '0.506')] [2024-04-27 13:50:55,679][52263] Updated weights for policy 0, policy_version 393279 (0.0034) [2024-04-27 13:50:58,203][52263] Updated weights for policy 0, policy_version 393289 (0.0031) [2024-04-27 13:50:59,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6443679744. Throughput: 0: 53311.0. Samples: 934125620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:50:59,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 13:51:01,912][52263] Updated weights for policy 0, policy_version 393299 (0.0041) [2024-04-27 13:51:04,106][52031] Fps is (10 sec: 54068.7, 60 sec: 53521.2, 300 sec: 53317.4). Total num frames: 6443941888. Throughput: 0: 53289.4. Samples: 934446000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:51:04,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 13:51:04,310][52263] Updated weights for policy 0, policy_version 393309 (0.0030) [2024-04-27 13:51:08,015][52263] Updated weights for policy 0, policy_version 393319 (0.0029) [2024-04-27 13:51:09,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6444204032. Throughput: 0: 53157.7. Samples: 934764500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:51:09,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 13:51:10,619][52263] Updated weights for policy 0, policy_version 393329 (0.0027) [2024-04-27 13:51:14,097][52263] Updated weights for policy 0, policy_version 393339 (0.0030) [2024-04-27 13:51:14,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6444466176. Throughput: 0: 53218.3. Samples: 934923620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:51:14,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 13:51:16,628][52263] Updated weights for policy 0, policy_version 393349 (0.0036) [2024-04-27 13:51:19,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6444728320. Throughput: 0: 53264.1. Samples: 935245220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 13:51:19,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 13:51:20,191][52263] Updated weights for policy 0, policy_version 393359 (0.0032) [2024-04-27 13:51:22,894][52263] Updated weights for policy 0, policy_version 393369 (0.0027) [2024-04-27 13:51:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.0, 300 sec: 53317.5). Total num frames: 6444990464. Throughput: 0: 53230.9. Samples: 935564020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:51:24,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 13:51:26,252][52263] Updated weights for policy 0, policy_version 393379 (0.0032) [2024-04-27 13:51:28,903][52263] Updated weights for policy 0, policy_version 393389 (0.0029) [2024-04-27 13:51:29,106][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6445285376. Throughput: 0: 53188.4. Samples: 935724620. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:51:29,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 13:51:32,401][52263] Updated weights for policy 0, policy_version 393399 (0.0034) [2024-04-27 13:51:34,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6445547520. Throughput: 0: 53208.0. Samples: 936046820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:51:34,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 13:51:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000393405_6445547520.pth... [2024-04-27 13:51:34,175][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000392624_6432751616.pth [2024-04-27 13:51:35,098][52263] Updated weights for policy 0, policy_version 393409 (0.0035) [2024-04-27 13:51:38,195][52242] Signal inference workers to stop experience collection... (14100 times) [2024-04-27 13:51:38,200][52242] Signal inference workers to resume experience collection... (14100 times) [2024-04-27 13:51:38,206][52263] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-04-27 13:51:38,237][52263] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-04-27 13:51:38,479][52263] Updated weights for policy 0, policy_version 393419 (0.0033) [2024-04-27 13:51:39,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6445809664. Throughput: 0: 53225.5. Samples: 936365340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:51:39,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 13:51:41,362][52263] Updated weights for policy 0, policy_version 393429 (0.0033) [2024-04-27 13:51:44,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6446055424. Throughput: 0: 53249.0. Samples: 936521820. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:51:44,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 13:51:44,570][52263] Updated weights for policy 0, policy_version 393439 (0.0036) [2024-04-27 13:51:47,332][52263] Updated weights for policy 0, policy_version 393449 (0.0030) [2024-04-27 13:51:49,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6446333952. Throughput: 0: 53190.1. Samples: 936839560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:51:49,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 13:51:50,691][52263] Updated weights for policy 0, policy_version 393459 (0.0026) [2024-04-27 13:51:53,435][52263] Updated weights for policy 0, policy_version 393469 (0.0030) [2024-04-27 13:51:54,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6446612480. Throughput: 0: 53252.5. Samples: 937160860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:51:54,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 13:51:56,840][52263] Updated weights for policy 0, policy_version 393479 (0.0027) [2024-04-27 13:51:59,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6446874624. Throughput: 0: 53515.0. Samples: 937331800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:51:59,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 13:51:59,679][52263] Updated weights for policy 0, policy_version 393489 (0.0032) [2024-04-27 13:52:03,001][52263] Updated weights for policy 0, policy_version 393499 (0.0030) [2024-04-27 13:52:04,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6447136768. Throughput: 0: 53422.6. Samples: 937649240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:52:04,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 13:52:06,106][52263] Updated weights for policy 0, policy_version 393509 (0.0028) [2024-04-27 13:52:08,986][52263] Updated weights for policy 0, policy_version 393519 (0.0029) [2024-04-27 13:52:09,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6447415296. Throughput: 0: 53437.2. Samples: 937968700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:52:09,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 13:52:12,074][52263] Updated weights for policy 0, policy_version 393529 (0.0038) [2024-04-27 13:52:14,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6447661056. Throughput: 0: 53412.9. Samples: 938128200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:52:14,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 13:52:15,128][52263] Updated weights for policy 0, policy_version 393539 (0.0026) [2024-04-27 13:52:18,258][52263] Updated weights for policy 0, policy_version 393549 (0.0030) [2024-04-27 13:52:19,106][52031] Fps is (10 sec: 49152.1, 60 sec: 52974.9, 300 sec: 53261.9). Total num frames: 6447906816. Throughput: 0: 53339.6. Samples: 938447100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:52:19,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 13:52:21,032][52242] Signal inference workers to stop experience collection... (14150 times) [2024-04-27 13:52:21,033][52242] Signal inference workers to resume experience collection... 
(14150 times) [2024-04-27 13:52:21,066][52263] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-04-27 13:52:21,066][52263] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-04-27 13:52:21,158][52263] Updated weights for policy 0, policy_version 393559 (0.0025) [2024-04-27 13:52:24,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53793.9, 300 sec: 53372.9). Total num frames: 6448218112. Throughput: 0: 53481.7. Samples: 938772020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:52:24,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 13:52:24,563][52263] Updated weights for policy 0, policy_version 393569 (0.0035) [2024-04-27 13:52:27,595][52263] Updated weights for policy 0, policy_version 393579 (0.0036) [2024-04-27 13:52:29,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6448480256. Throughput: 0: 53629.1. Samples: 938935140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:52:29,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 13:52:30,506][52263] Updated weights for policy 0, policy_version 393589 (0.0029) [2024-04-27 13:52:33,770][52263] Updated weights for policy 0, policy_version 393599 (0.0029) [2024-04-27 13:52:34,107][52031] Fps is (10 sec: 50790.9, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6448726016. Throughput: 0: 53656.4. Samples: 939254100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:52:34,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 13:52:36,510][52263] Updated weights for policy 0, policy_version 393609 (0.0033) [2024-04-27 13:52:39,106][52031] Fps is (10 sec: 52430.0, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6449004544. Throughput: 0: 53659.2. Samples: 939575520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 13:52:39,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 13:52:39,793][52263] Updated weights for policy 0, policy_version 393619 (0.0035) [2024-04-27 13:52:42,750][52263] Updated weights for policy 0, policy_version 393629 (0.0031) [2024-04-27 13:52:44,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6449266688. Throughput: 0: 53340.2. Samples: 939732100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:52:44,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 13:52:45,960][52263] Updated weights for policy 0, policy_version 393639 (0.0032) [2024-04-27 13:52:48,986][52263] Updated weights for policy 0, policy_version 393649 (0.0031) [2024-04-27 13:52:49,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6449545216. Throughput: 0: 53364.9. Samples: 940050660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:52:49,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:52:52,246][52263] Updated weights for policy 0, policy_version 393659 (0.0027) [2024-04-27 13:52:54,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6449807360. Throughput: 0: 53445.3. Samples: 940373740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:52:54,107][52031] Avg episode reward: [(0, '0.690')] [2024-04-27 13:52:55,651][52263] Updated weights for policy 0, policy_version 393669 (0.0026) [2024-04-27 13:52:58,305][52263] Updated weights for policy 0, policy_version 393679 (0.0029) [2024-04-27 13:52:59,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6450102272. Throughput: 0: 53475.5. Samples: 940534600. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:52:59,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 13:53:01,865][52263] Updated weights for policy 0, policy_version 393689 (0.0032) [2024-04-27 13:53:04,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 6450348032. Throughput: 0: 53623.0. Samples: 940860140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:04,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:53:04,350][52263] Updated weights for policy 0, policy_version 393699 (0.0035) [2024-04-27 13:53:07,880][52263] Updated weights for policy 0, policy_version 393709 (0.0033) [2024-04-27 13:53:09,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6450610176. Throughput: 0: 53614.9. Samples: 941184680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:09,107][52031] Avg episode reward: [(0, '0.484')] [2024-04-27 13:53:10,339][52263] Updated weights for policy 0, policy_version 393719 (0.0031) [2024-04-27 13:53:14,007][52263] Updated weights for policy 0, policy_version 393729 (0.0032) [2024-04-27 13:53:14,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6450855936. Throughput: 0: 53323.0. Samples: 941334660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:14,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 13:53:15,862][52242] Signal inference workers to stop experience collection... (14200 times) [2024-04-27 13:53:15,862][52242] Signal inference workers to resume experience collection... 
(14200 times) [2024-04-27 13:53:15,888][52263] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-04-27 13:53:15,888][52263] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-04-27 13:53:16,529][52263] Updated weights for policy 0, policy_version 393739 (0.0030) [2024-04-27 13:53:19,107][52031] Fps is (10 sec: 54066.3, 60 sec: 54067.1, 300 sec: 53372.9). Total num frames: 6451150848. Throughput: 0: 53438.1. Samples: 941658820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:19,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 13:53:19,973][52263] Updated weights for policy 0, policy_version 393749 (0.0026) [2024-04-27 13:53:22,471][52263] Updated weights for policy 0, policy_version 393759 (0.0030) [2024-04-27 13:53:24,106][52031] Fps is (10 sec: 57343.4, 60 sec: 53521.2, 300 sec: 53372.9). Total num frames: 6451429376. Throughput: 0: 53467.4. Samples: 941981560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:24,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 13:53:26,194][52263] Updated weights for policy 0, policy_version 393769 (0.0031) [2024-04-27 13:53:28,503][52263] Updated weights for policy 0, policy_version 393779 (0.0030) [2024-04-27 13:53:29,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53794.3, 300 sec: 53428.5). Total num frames: 6451707904. Throughput: 0: 53680.8. Samples: 942147740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:29,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 13:53:32,363][52263] Updated weights for policy 0, policy_version 393789 (0.0031) [2024-04-27 13:53:34,107][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.2, 300 sec: 53428.5). Total num frames: 6451970048. Throughput: 0: 53766.2. Samples: 942470140. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:34,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 13:53:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000393797_6451970048.pth... [2024-04-27 13:53:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000393014_6439141376.pth [2024-04-27 13:53:34,598][52263] Updated weights for policy 0, policy_version 393799 (0.0027) [2024-04-27 13:53:38,335][52263] Updated weights for policy 0, policy_version 393809 (0.0028) [2024-04-27 13:53:39,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.1, 300 sec: 53317.5). Total num frames: 6452215808. Throughput: 0: 53678.3. Samples: 942789260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:39,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 13:53:40,802][52263] Updated weights for policy 0, policy_version 393819 (0.0032) [2024-04-27 13:53:44,107][52031] Fps is (10 sec: 49151.9, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6452461568. Throughput: 0: 53506.3. Samples: 942942380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:44,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 13:53:44,582][52263] Updated weights for policy 0, policy_version 393829 (0.0035) [2024-04-27 13:53:46,953][52263] Updated weights for policy 0, policy_version 393839 (0.0033) [2024-04-27 13:53:49,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6452756480. Throughput: 0: 53453.5. Samples: 943265540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:49,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 13:53:50,563][52263] Updated weights for policy 0, policy_version 393849 (0.0029) [2024-04-27 13:53:52,920][52263] Updated weights for policy 0, policy_version 393859 (0.0024) [2024-04-27 13:53:54,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.1, 300 sec: 53373.0). 
Total num frames: 6453018624. Throughput: 0: 53386.7. Samples: 943587080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:54,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 13:53:56,605][52263] Updated weights for policy 0, policy_version 393869 (0.0032) [2024-04-27 13:53:58,953][52263] Updated weights for policy 0, policy_version 393879 (0.0037) [2024-04-27 13:53:59,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6453313536. Throughput: 0: 53799.1. Samples: 943755620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 13:53:59,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 13:54:02,747][52263] Updated weights for policy 0, policy_version 393889 (0.0037) [2024-04-27 13:54:04,107][52031] Fps is (10 sec: 54065.7, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6453559296. Throughput: 0: 53647.9. Samples: 944072980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:04,107][52031] Avg episode reward: [(0, '0.493')] [2024-04-27 13:54:05,184][52263] Updated weights for policy 0, policy_version 393899 (0.0033) [2024-04-27 13:54:08,981][52263] Updated weights for policy 0, policy_version 393909 (0.0029) [2024-04-27 13:54:09,106][52031] Fps is (10 sec: 49151.9, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6453805056. Throughput: 0: 53529.0. Samples: 944390360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:09,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 13:54:11,444][52263] Updated weights for policy 0, policy_version 393919 (0.0035) [2024-04-27 13:54:14,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6454083584. Throughput: 0: 53279.5. Samples: 944545320. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:14,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 13:54:14,959][52263] Updated weights for policy 0, policy_version 393929 (0.0029) [2024-04-27 13:54:17,439][52263] Updated weights for policy 0, policy_version 393939 (0.0026) [2024-04-27 13:54:19,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6454362112. Throughput: 0: 53237.8. Samples: 944865840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:19,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 13:54:21,151][52263] Updated weights for policy 0, policy_version 393949 (0.0029) [2024-04-27 13:54:21,413][52242] Signal inference workers to stop experience collection... (14250 times) [2024-04-27 13:54:21,413][52242] Signal inference workers to resume experience collection... (14250 times) [2024-04-27 13:54:21,429][52263] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-04-27 13:54:21,430][52263] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-04-27 13:54:23,465][52263] Updated weights for policy 0, policy_version 393959 (0.0029) [2024-04-27 13:54:24,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6454624256. Throughput: 0: 53211.8. Samples: 945183800. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:24,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 13:54:27,373][52263] Updated weights for policy 0, policy_version 393969 (0.0025) [2024-04-27 13:54:29,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6454919168. Throughput: 0: 53690.4. Samples: 945358440. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:29,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 13:54:29,636][52263] Updated weights for policy 0, policy_version 393979 (0.0033) [2024-04-27 13:54:33,543][52263] Updated weights for policy 0, policy_version 393989 (0.0036) [2024-04-27 13:54:34,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6455148544. Throughput: 0: 53533.7. Samples: 945674560. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:34,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 13:54:35,999][52263] Updated weights for policy 0, policy_version 393999 (0.0026) [2024-04-27 13:54:39,106][52031] Fps is (10 sec: 49151.8, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6455410688. Throughput: 0: 53552.4. Samples: 945996940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:39,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 13:54:39,675][52263] Updated weights for policy 0, policy_version 394009 (0.0026) [2024-04-27 13:54:42,035][52263] Updated weights for policy 0, policy_version 394019 (0.0030) [2024-04-27 13:54:44,107][52031] Fps is (10 sec: 55704.7, 60 sec: 54067.1, 300 sec: 53484.0). Total num frames: 6455705600. Throughput: 0: 53162.8. Samples: 946147960. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:44,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 13:54:45,824][52263] Updated weights for policy 0, policy_version 394029 (0.0033) [2024-04-27 13:54:48,510][52263] Updated weights for policy 0, policy_version 394039 (0.0027) [2024-04-27 13:54:49,107][52031] Fps is (10 sec: 55704.5, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6455967744. Throughput: 0: 53119.2. Samples: 946463340. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:49,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 13:54:51,803][52263] Updated weights for policy 0, policy_version 394049 (0.0034) [2024-04-27 13:54:54,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6456229888. Throughput: 0: 53200.4. Samples: 946784380. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:54,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 13:54:54,583][52263] Updated weights for policy 0, policy_version 394059 (0.0026) [2024-04-27 13:54:57,954][52263] Updated weights for policy 0, policy_version 394069 (0.0030) [2024-04-27 13:54:59,106][52031] Fps is (10 sec: 54068.4, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6456508416. Throughput: 0: 53502.8. Samples: 946952940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:54:59,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 13:55:00,539][52263] Updated weights for policy 0, policy_version 394079 (0.0028) [2024-04-27 13:55:04,024][52263] Updated weights for policy 0, policy_version 394089 (0.0027) [2024-04-27 13:55:04,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 6456754176. Throughput: 0: 53517.4. Samples: 947274120. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:55:04,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 13:55:06,628][52263] Updated weights for policy 0, policy_version 394099 (0.0034) [2024-04-27 13:55:09,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6457016320. Throughput: 0: 53591.2. Samples: 947595400. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:55:09,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 13:55:10,236][52263] Updated weights for policy 0, policy_version 394109 (0.0037) [2024-04-27 13:55:12,683][52263] Updated weights for policy 0, policy_version 394119 (0.0031) [2024-04-27 13:55:14,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6457278464. Throughput: 0: 53123.1. Samples: 947748980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:55:14,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 13:55:16,269][52263] Updated weights for policy 0, policy_version 394129 (0.0027) [2024-04-27 13:55:18,740][52263] Updated weights for policy 0, policy_version 394139 (0.0031) [2024-04-27 13:55:19,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6457573376. Throughput: 0: 53250.3. Samples: 948070820. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-27 13:55:19,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 13:55:22,513][52263] Updated weights for policy 0, policy_version 394149 (0.0029) [2024-04-27 13:55:24,107][52031] Fps is (10 sec: 57342.9, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6457851904. Throughput: 0: 53179.8. Samples: 948390040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:55:24,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 13:55:25,266][52263] Updated weights for policy 0, policy_version 394159 (0.0032) [2024-04-27 13:55:28,509][52263] Updated weights for policy 0, policy_version 394169 (0.0028) [2024-04-27 13:55:29,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.8, 300 sec: 53372.9). Total num frames: 6458097664. Throughput: 0: 53421.9. Samples: 948551940. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:55:29,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 13:55:31,281][52263] Updated weights for policy 0, policy_version 394179 (0.0031) [2024-04-27 13:55:32,675][52242] Signal inference workers to stop experience collection... (14300 times) [2024-04-27 13:55:32,675][52242] Signal inference workers to resume experience collection... (14300 times) [2024-04-27 13:55:32,688][52263] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-04-27 13:55:32,688][52263] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-04-27 13:55:34,107][52031] Fps is (10 sec: 49152.2, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6458343424. Throughput: 0: 53583.7. Samples: 948874600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:55:34,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 13:55:34,241][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000394187_6458359808.pth... [2024-04-27 13:55:34,293][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000393405_6445547520.pth [2024-04-27 13:55:34,624][52263] Updated weights for policy 0, policy_version 394189 (0.0032) [2024-04-27 13:55:37,519][52263] Updated weights for policy 0, policy_version 394199 (0.0027) [2024-04-27 13:55:39,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6458621952. Throughput: 0: 53572.8. Samples: 949195160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:55:39,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 13:55:40,663][52263] Updated weights for policy 0, policy_version 394209 (0.0029) [2024-04-27 13:55:43,697][52263] Updated weights for policy 0, policy_version 394219 (0.0032) [2024-04-27 13:55:44,107][52031] Fps is (10 sec: 54067.1, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6458884096. Throughput: 0: 53312.3. Samples: 949352000. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:55:44,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 13:55:46,765][52263] Updated weights for policy 0, policy_version 394229 (0.0031) [2024-04-27 13:55:49,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6459179008. Throughput: 0: 53280.9. Samples: 949671760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:55:49,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 13:55:49,650][52263] Updated weights for policy 0, policy_version 394239 (0.0032) [2024-04-27 13:55:52,782][52263] Updated weights for policy 0, policy_version 394249 (0.0032) [2024-04-27 13:55:54,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6459441152. Throughput: 0: 53342.3. Samples: 949995800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:55:54,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 13:55:55,816][52263] Updated weights for policy 0, policy_version 394259 (0.0030) [2024-04-27 13:55:58,784][52263] Updated weights for policy 0, policy_version 394269 (0.0032) [2024-04-27 13:55:59,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6459719680. Throughput: 0: 53717.8. Samples: 950166280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:55:59,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 13:56:02,066][52263] Updated weights for policy 0, policy_version 394279 (0.0027) [2024-04-27 13:56:04,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6459949056. Throughput: 0: 53723.0. Samples: 950488360. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:56:04,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 13:56:04,950][52263] Updated weights for policy 0, policy_version 394289 (0.0027) [2024-04-27 13:56:07,956][52263] Updated weights for policy 0, policy_version 394299 (0.0031) [2024-04-27 13:56:09,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6460243968. Throughput: 0: 53818.8. Samples: 950811880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:56:09,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 13:56:11,095][52263] Updated weights for policy 0, policy_version 394309 (0.0031) [2024-04-27 13:56:14,040][52263] Updated weights for policy 0, policy_version 394319 (0.0028) [2024-04-27 13:56:14,106][52031] Fps is (10 sec: 57344.5, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6460522496. Throughput: 0: 53728.6. Samples: 950969720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:56:14,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 13:56:16,968][52263] Updated weights for policy 0, policy_version 394329 (0.0027) [2024-04-27 13:56:19,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6460801024. Throughput: 0: 53837.2. Samples: 951297280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:56:19,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 13:56:20,145][52263] Updated weights for policy 0, policy_version 394339 (0.0032) [2024-04-27 13:56:23,097][52263] Updated weights for policy 0, policy_version 394349 (0.0027) [2024-04-27 13:56:24,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6461046784. Throughput: 0: 53962.0. Samples: 951623440. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:56:24,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 13:56:26,360][52263] Updated weights for policy 0, policy_version 394359 (0.0030) [2024-04-27 13:56:29,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6461325312. Throughput: 0: 54062.2. Samples: 951784800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:56:29,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 13:56:29,231][52263] Updated weights for policy 0, policy_version 394369 (0.0029) [2024-04-27 13:56:32,480][52263] Updated weights for policy 0, policy_version 394379 (0.0027) [2024-04-27 13:56:34,106][52031] Fps is (10 sec: 55705.6, 60 sec: 54340.4, 300 sec: 53539.6). Total num frames: 6461603840. Throughput: 0: 54216.1. Samples: 952111480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:56:34,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:56:35,169][52263] Updated weights for policy 0, policy_version 394389 (0.0027) [2024-04-27 13:56:38,428][52263] Updated weights for policy 0, policy_version 394399 (0.0033) [2024-04-27 13:56:39,087][52242] Signal inference workers to stop experience collection... (14350 times) [2024-04-27 13:56:39,088][52242] Signal inference workers to resume experience collection... (14350 times) [2024-04-27 13:56:39,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6461849600. Throughput: 0: 54167.2. Samples: 952433320. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-04-27 13:56:39,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 13:56:39,115][52263] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-04-27 13:56:39,116][52263] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-04-27 13:56:41,054][52263] Updated weights for policy 0, policy_version 394409 (0.0027) [2024-04-27 13:56:44,107][52031] Fps is (10 sec: 54066.5, 60 sec: 54340.3, 300 sec: 53595.1). Total num frames: 6462144512. Throughput: 0: 53959.9. Samples: 952594480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:56:44,116][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 13:56:44,518][52263] Updated weights for policy 0, policy_version 394419 (0.0027) [2024-04-27 13:56:47,425][52263] Updated weights for policy 0, policy_version 394429 (0.0034) [2024-04-27 13:56:49,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6462406656. Throughput: 0: 53983.2. Samples: 952917600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:56:49,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 13:56:50,682][52263] Updated weights for policy 0, policy_version 394439 (0.0030) [2024-04-27 13:56:53,784][52263] Updated weights for policy 0, policy_version 394449 (0.0030) [2024-04-27 13:56:54,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6462668800. Throughput: 0: 53991.0. Samples: 953241480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:56:54,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 13:56:56,694][52263] Updated weights for policy 0, policy_version 394459 (0.0028) [2024-04-27 13:56:59,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6462947328. Throughput: 0: 54038.7. Samples: 953401460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:56:59,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 13:56:59,888][52263] Updated weights for policy 0, policy_version 394469 (0.0032) [2024-04-27 13:57:02,637][52263] Updated weights for policy 0, policy_version 394479 (0.0035) [2024-04-27 13:57:04,107][52031] Fps is (10 sec: 54066.9, 60 sec: 54340.2, 300 sec: 53539.5). Total num frames: 6463209472. Throughput: 0: 53946.7. Samples: 953724880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:04,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 13:57:05,998][52263] Updated weights for policy 0, policy_version 394489 (0.0027) [2024-04-27 13:57:08,831][52263] Updated weights for policy 0, policy_version 394499 (0.0030) [2024-04-27 13:57:09,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6463471616. Throughput: 0: 53831.9. Samples: 954045880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:09,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 13:57:12,085][52263] Updated weights for policy 0, policy_version 394509 (0.0028) [2024-04-27 13:57:14,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6463750144. Throughput: 0: 53809.3. Samples: 954206220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:14,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:57:15,115][52263] Updated weights for policy 0, policy_version 394519 (0.0026) [2024-04-27 13:57:18,100][52263] Updated weights for policy 0, policy_version 394529 (0.0023) [2024-04-27 13:57:19,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6464028672. Throughput: 0: 53767.4. Samples: 954531020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:19,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 13:57:21,072][52263] Updated weights for policy 0, policy_version 394539 (0.0033) [2024-04-27 13:57:24,107][52031] Fps is (10 sec: 52426.3, 60 sec: 53793.5, 300 sec: 53539.5). Total num frames: 6464274432. Throughput: 0: 53743.7. Samples: 954851820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:24,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:57:24,153][52263] Updated weights for policy 0, policy_version 394549 (0.0031) [2024-04-27 13:57:27,121][52263] Updated weights for policy 0, policy_version 394559 (0.0032) [2024-04-27 13:57:29,107][52031] Fps is (10 sec: 52427.2, 60 sec: 53793.9, 300 sec: 53650.6). Total num frames: 6464552960. Throughput: 0: 53837.0. Samples: 955017160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:29,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 13:57:30,248][52263] Updated weights for policy 0, policy_version 394569 (0.0034) [2024-04-27 13:57:33,124][52263] Updated weights for policy 0, policy_version 394579 (0.0033) [2024-04-27 13:57:34,106][52031] Fps is (10 sec: 55709.0, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6464831488. Throughput: 0: 53789.3. Samples: 955338120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:34,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 13:57:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000394582_6464831488.pth... [2024-04-27 13:57:34,161][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000393797_6451970048.pth [2024-04-27 13:57:36,413][52263] Updated weights for policy 0, policy_version 394589 (0.0038) [2024-04-27 13:57:38,091][52242] Signal inference workers to stop experience collection... (14400 times) [2024-04-27 13:57:38,092][52242] Signal inference workers to resume experience collection... 
(14400 times) [2024-04-27 13:57:38,114][52263] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-04-27 13:57:38,114][52263] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-04-27 13:57:39,106][52031] Fps is (10 sec: 54069.0, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 6465093632. Throughput: 0: 53735.7. Samples: 955659580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:39,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 13:57:39,285][52263] Updated weights for policy 0, policy_version 394599 (0.0034) [2024-04-27 13:57:42,457][52263] Updated weights for policy 0, policy_version 394609 (0.0035) [2024-04-27 13:57:44,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6465372160. Throughput: 0: 53833.3. Samples: 955823960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:44,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 13:57:45,294][52263] Updated weights for policy 0, policy_version 394619 (0.0032) [2024-04-27 13:57:48,548][52263] Updated weights for policy 0, policy_version 394629 (0.0027) [2024-04-27 13:57:49,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6465634304. Throughput: 0: 53822.4. Samples: 956146880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:49,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 13:57:51,327][52263] Updated weights for policy 0, policy_version 394639 (0.0028) [2024-04-27 13:57:54,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6465896448. Throughput: 0: 53827.6. Samples: 956468120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:54,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 13:57:54,614][52263] Updated weights for policy 0, policy_version 394649 (0.0030) [2024-04-27 13:57:57,548][52263] Updated weights for policy 0, policy_version 394659 (0.0031) [2024-04-27 13:57:59,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.0, 300 sec: 53650.7). Total num frames: 6466174976. Throughput: 0: 53774.7. Samples: 956626080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 13:57:59,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 13:58:00,654][52263] Updated weights for policy 0, policy_version 394669 (0.0027) [2024-04-27 13:58:03,755][52263] Updated weights for policy 0, policy_version 394679 (0.0030) [2024-04-27 13:58:04,106][52031] Fps is (10 sec: 55705.7, 60 sec: 54067.3, 300 sec: 53706.2). Total num frames: 6466453504. Throughput: 0: 53773.3. Samples: 956950820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:04,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 13:58:06,721][52263] Updated weights for policy 0, policy_version 394689 (0.0028) [2024-04-27 13:58:09,107][52031] Fps is (10 sec: 54066.8, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 6466715648. Throughput: 0: 53913.0. Samples: 957277880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:09,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 13:58:09,756][52263] Updated weights for policy 0, policy_version 394699 (0.0031) [2024-04-27 13:58:12,688][52263] Updated weights for policy 0, policy_version 394709 (0.0032) [2024-04-27 13:58:14,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6466977792. Throughput: 0: 53832.9. Samples: 957439620. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:14,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 13:58:15,723][52263] Updated weights for policy 0, policy_version 394719 (0.0026) [2024-04-27 13:58:18,859][52263] Updated weights for policy 0, policy_version 394729 (0.0030) [2024-04-27 13:58:19,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6467239936. Throughput: 0: 53853.8. Samples: 957761540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:19,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 13:58:21,764][52263] Updated weights for policy 0, policy_version 394739 (0.0029) [2024-04-27 13:58:24,107][52031] Fps is (10 sec: 54066.8, 60 sec: 54067.7, 300 sec: 53595.1). Total num frames: 6467518464. Throughput: 0: 53864.4. Samples: 958083480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:24,107][52031] Avg episode reward: [(0, '0.699')] [2024-04-27 13:58:24,998][52263] Updated weights for policy 0, policy_version 394749 (0.0037) [2024-04-27 13:58:27,825][52263] Updated weights for policy 0, policy_version 394759 (0.0034) [2024-04-27 13:58:29,106][52031] Fps is (10 sec: 55705.1, 60 sec: 54067.5, 300 sec: 53650.7). Total num frames: 6467796992. Throughput: 0: 53780.8. Samples: 958244100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:29,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 13:58:30,985][52263] Updated weights for policy 0, policy_version 394769 (0.0033) [2024-04-27 13:58:33,760][52263] Updated weights for policy 0, policy_version 394779 (0.0031) [2024-04-27 13:58:34,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.0, 300 sec: 53706.1). Total num frames: 6468059136. Throughput: 0: 53734.0. Samples: 958564920. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:34,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 13:58:37,246][52263] Updated weights for policy 0, policy_version 394789 (0.0029) [2024-04-27 13:58:39,107][52031] Fps is (10 sec: 54067.0, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6468337664. Throughput: 0: 53854.6. Samples: 958891580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:39,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 13:58:39,792][52263] Updated weights for policy 0, policy_version 394799 (0.0034) [2024-04-27 13:58:43,286][52263] Updated weights for policy 0, policy_version 394809 (0.0027) [2024-04-27 13:58:44,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6468583424. Throughput: 0: 53939.1. Samples: 959053340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:44,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 13:58:46,037][52263] Updated weights for policy 0, policy_version 394819 (0.0028) [2024-04-27 13:58:49,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6468861952. Throughput: 0: 53909.3. Samples: 959376740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:49,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 13:58:49,196][52263] Updated weights for policy 0, policy_version 394829 (0.0029) [2024-04-27 13:58:52,044][52263] Updated weights for policy 0, policy_version 394839 (0.0026) [2024-04-27 13:58:54,107][52031] Fps is (10 sec: 55704.6, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 6469140480. Throughput: 0: 53795.5. Samples: 959698680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:54,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 13:58:54,790][52242] Signal inference workers to stop experience collection... 
(14450 times) [2024-04-27 13:58:54,826][52263] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-04-27 13:58:54,884][52242] Signal inference workers to resume experience collection... (14450 times) [2024-04-27 13:58:54,884][52263] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-04-27 13:58:55,198][52263] Updated weights for policy 0, policy_version 394849 (0.0031) [2024-04-27 13:58:58,084][52263] Updated weights for policy 0, policy_version 394859 (0.0027) [2024-04-27 13:58:59,106][52031] Fps is (10 sec: 55705.7, 60 sec: 54067.2, 300 sec: 53761.8). Total num frames: 6469419008. Throughput: 0: 53807.1. Samples: 959860940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:58:59,107][52031] Avg episode reward: [(0, '0.677')] [2024-04-27 13:59:01,338][52263] Updated weights for policy 0, policy_version 394869 (0.0040) [2024-04-27 13:59:04,106][52031] Fps is (10 sec: 54068.5, 60 sec: 53794.2, 300 sec: 53817.3). Total num frames: 6469681152. Throughput: 0: 53796.4. Samples: 960182380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:59:04,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 13:59:04,140][52263] Updated weights for policy 0, policy_version 394879 (0.0028) [2024-04-27 13:59:07,550][52263] Updated weights for policy 0, policy_version 394889 (0.0032) [2024-04-27 13:59:09,107][52031] Fps is (10 sec: 54066.8, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6469959680. Throughput: 0: 53745.3. Samples: 960502020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:59:09,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:59:10,347][52263] Updated weights for policy 0, policy_version 394899 (0.0029) [2024-04-27 13:59:13,812][52263] Updated weights for policy 0, policy_version 394909 (0.0038) [2024-04-27 13:59:14,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6470189056. Throughput: 0: 53807.6. 
Samples: 960665440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:59:14,107][52031] Avg episode reward: [(0, '0.457')] [2024-04-27 13:59:16,410][52263] Updated weights for policy 0, policy_version 394919 (0.0028) [2024-04-27 13:59:19,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6470467584. Throughput: 0: 53814.3. Samples: 960986560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-04-27 13:59:19,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 13:59:19,793][52263] Updated weights for policy 0, policy_version 394929 (0.0033) [2024-04-27 13:59:22,491][52263] Updated weights for policy 0, policy_version 394939 (0.0030) [2024-04-27 13:59:24,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6470746112. Throughput: 0: 53702.6. Samples: 961308200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 13:59:24,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 13:59:25,854][52263] Updated weights for policy 0, policy_version 394949 (0.0027) [2024-04-27 13:59:28,513][52263] Updated weights for policy 0, policy_version 394959 (0.0031) [2024-04-27 13:59:29,107][52031] Fps is (10 sec: 57344.1, 60 sec: 54067.2, 300 sec: 53872.8). Total num frames: 6471041024. Throughput: 0: 53813.7. Samples: 961474960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 13:59:29,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 13:59:32,138][52263] Updated weights for policy 0, policy_version 394969 (0.0027) [2024-04-27 13:59:34,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53794.3, 300 sec: 53817.3). Total num frames: 6471286784. Throughput: 0: 53854.3. Samples: 961800180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 13:59:34,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 13:59:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000394976_6471286784.pth... 
[2024-04-27 13:59:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000394187_6458359808.pth [2024-04-27 13:59:34,908][52263] Updated weights for policy 0, policy_version 394979 (0.0031) [2024-04-27 13:59:38,270][52263] Updated weights for policy 0, policy_version 394989 (0.0035) [2024-04-27 13:59:39,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53794.0, 300 sec: 53761.7). Total num frames: 6471565312. Throughput: 0: 53772.9. Samples: 962118460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 13:59:39,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 13:59:40,958][52263] Updated weights for policy 0, policy_version 394999 (0.0028) [2024-04-27 13:59:44,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6471811072. Throughput: 0: 53614.7. Samples: 962273600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 13:59:44,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 13:59:44,354][52263] Updated weights for policy 0, policy_version 395009 (0.0034) [2024-04-27 13:59:47,148][52263] Updated weights for policy 0, policy_version 395019 (0.0033) [2024-04-27 13:59:49,107][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6472089600. Throughput: 0: 53591.0. Samples: 962593980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 13:59:49,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 13:59:50,277][52263] Updated weights for policy 0, policy_version 395029 (0.0031) [2024-04-27 13:59:53,296][52263] Updated weights for policy 0, policy_version 395039 (0.0030) [2024-04-27 13:59:54,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 6472351744. Throughput: 0: 53711.2. Samples: 962919020. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 13:59:54,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 13:59:56,252][52263] Updated weights for policy 0, policy_version 395049 (0.0026) [2024-04-27 13:59:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53817.3). Total num frames: 6472630272. Throughput: 0: 53638.6. Samples: 963079180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 13:59:59,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 13:59:59,417][52263] Updated weights for policy 0, policy_version 395059 (0.0029) [2024-04-27 14:00:02,072][52242] Signal inference workers to stop experience collection... (14500 times) [2024-04-27 14:00:02,108][52263] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-04-27 14:00:02,137][52242] Signal inference workers to resume experience collection... (14500 times) [2024-04-27 14:00:02,137][52263] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-04-27 14:00:02,587][52263] Updated weights for policy 0, policy_version 395069 (0.0030) [2024-04-27 14:00:04,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53817.3). Total num frames: 6472892416. Throughput: 0: 53618.3. Samples: 963399380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 14:00:04,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 14:00:05,629][52263] Updated weights for policy 0, policy_version 395079 (0.0033) [2024-04-27 14:00:08,864][52263] Updated weights for policy 0, policy_version 395089 (0.0030) [2024-04-27 14:00:09,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53817.3). Total num frames: 6473154560. Throughput: 0: 53568.7. Samples: 963718780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 14:00:09,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 14:00:11,627][52263] Updated weights for policy 0, policy_version 395099 (0.0031) [2024-04-27 14:00:14,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 6473400320. Throughput: 0: 53396.3. Samples: 963877800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 14:00:14,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 14:00:15,042][52263] Updated weights for policy 0, policy_version 395109 (0.0029) [2024-04-27 14:00:17,800][52263] Updated weights for policy 0, policy_version 395119 (0.0029) [2024-04-27 14:00:19,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6473695232. Throughput: 0: 53260.8. Samples: 964196920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 14:00:19,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 14:00:21,078][52263] Updated weights for policy 0, policy_version 395129 (0.0032) [2024-04-27 14:00:23,920][52263] Updated weights for policy 0, policy_version 395139 (0.0030) [2024-04-27 14:00:24,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.2, 300 sec: 53761.7). Total num frames: 6473957376. Throughput: 0: 53430.8. Samples: 964522840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 14:00:24,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 14:00:27,192][52263] Updated weights for policy 0, policy_version 395149 (0.0032) [2024-04-27 14:00:29,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52974.9, 300 sec: 53817.3). Total num frames: 6474219520. Throughput: 0: 53457.6. Samples: 964679200. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 14:00:29,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 14:00:30,094][52263] Updated weights for policy 0, policy_version 395159 (0.0035) [2024-04-27 14:00:33,371][52263] Updated weights for policy 0, policy_version 395169 (0.0035) [2024-04-27 14:00:34,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53761.8). Total num frames: 6474481664. Throughput: 0: 53461.4. Samples: 964999740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 14:00:34,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 14:00:36,174][52263] Updated weights for policy 0, policy_version 395179 (0.0030) [2024-04-27 14:00:39,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53248.2, 300 sec: 53817.3). Total num frames: 6474760192. Throughput: 0: 53375.6. Samples: 965320920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 14:00:39,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 14:00:39,581][52263] Updated weights for policy 0, policy_version 395189 (0.0032) [2024-04-27 14:00:42,458][52263] Updated weights for policy 0, policy_version 395199 (0.0032) [2024-04-27 14:00:44,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 6475005952. Throughput: 0: 53412.0. Samples: 965482720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:00:44,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 14:00:45,635][52263] Updated weights for policy 0, policy_version 395209 (0.0031) [2024-04-27 14:00:48,464][52263] Updated weights for policy 0, policy_version 395219 (0.0028) [2024-04-27 14:00:49,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 53706.2). Total num frames: 6475284480. Throughput: 0: 53484.3. Samples: 965806180. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:00:49,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 14:00:51,582][52263] Updated weights for policy 0, policy_version 395229 (0.0034) [2024-04-27 14:00:52,011][52242] Signal inference workers to stop experience collection... (14550 times) [2024-04-27 14:00:52,063][52263] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-04-27 14:00:52,063][52242] Signal inference workers to resume experience collection... (14550 times) [2024-04-27 14:00:52,077][52263] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-04-27 14:00:54,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53247.9, 300 sec: 53650.6). Total num frames: 6475546624. Throughput: 0: 53461.6. Samples: 966124560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:00:54,107][52031] Avg episode reward: [(0, '0.493')] [2024-04-27 14:00:54,549][52263] Updated weights for policy 0, policy_version 395239 (0.0029) [2024-04-27 14:00:57,718][52263] Updated weights for policy 0, policy_version 395249 (0.0036) [2024-04-27 14:00:59,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53248.1, 300 sec: 53817.3). Total num frames: 6475825152. Throughput: 0: 53593.2. Samples: 966289480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:00:59,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 14:01:00,632][52263] Updated weights for policy 0, policy_version 395259 (0.0028) [2024-04-27 14:01:03,829][52263] Updated weights for policy 0, policy_version 395269 (0.0040) [2024-04-27 14:01:04,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53521.1, 300 sec: 53761.8). Total num frames: 6476103680. Throughput: 0: 53582.4. Samples: 966608120. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:04,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 14:01:06,688][52263] Updated weights for policy 0, policy_version 395279 (0.0030) [2024-04-27 14:01:09,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 6476349440. Throughput: 0: 53406.7. Samples: 966926140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:09,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 14:01:09,893][52263] Updated weights for policy 0, policy_version 395289 (0.0026) [2024-04-27 14:01:13,029][52263] Updated weights for policy 0, policy_version 395299 (0.0033) [2024-04-27 14:01:14,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6476627968. Throughput: 0: 53589.9. Samples: 967090740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:14,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 14:01:15,891][52263] Updated weights for policy 0, policy_version 395309 (0.0026) [2024-04-27 14:01:19,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.2, 300 sec: 53706.2). Total num frames: 6476890112. Throughput: 0: 53520.9. Samples: 967408180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:19,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 14:01:19,208][52263] Updated weights for policy 0, policy_version 395319 (0.0030) [2024-04-27 14:01:21,976][52263] Updated weights for policy 0, policy_version 395329 (0.0025) [2024-04-27 14:01:24,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52974.9, 300 sec: 53595.1). Total num frames: 6477135872. Throughput: 0: 53533.3. Samples: 967729920. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:24,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 14:01:25,351][52263] Updated weights for policy 0, policy_version 395339 (0.0031) [2024-04-27 14:01:28,197][52263] Updated weights for policy 0, policy_version 395349 (0.0032) [2024-04-27 14:01:29,107][52031] Fps is (10 sec: 55703.9, 60 sec: 53794.0, 300 sec: 53706.1). Total num frames: 6477447168. Throughput: 0: 53517.0. Samples: 967891000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:29,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 14:01:31,469][52263] Updated weights for policy 0, policy_version 395359 (0.0028) [2024-04-27 14:01:34,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6477692928. Throughput: 0: 53308.2. Samples: 968205040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:34,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 14:01:34,189][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000395368_6477709312.pth... [2024-04-27 14:01:34,241][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000394582_6464831488.pth [2024-04-27 14:01:34,390][52263] Updated weights for policy 0, policy_version 395369 (0.0031) [2024-04-27 14:01:37,697][52263] Updated weights for policy 0, policy_version 395379 (0.0031) [2024-04-27 14:01:39,107][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6477971456. Throughput: 0: 53429.4. Samples: 968528880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:39,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 14:01:40,436][52263] Updated weights for policy 0, policy_version 395389 (0.0034) [2024-04-27 14:01:43,814][52263] Updated weights for policy 0, policy_version 395399 (0.0028) [2024-04-27 14:01:44,106][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.0, 300 sec: 53595.1). 
Total num frames: 6478217216. Throughput: 0: 53158.5. Samples: 968681620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:44,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 14:01:46,563][52263] Updated weights for policy 0, policy_version 395409 (0.0031) [2024-04-27 14:01:49,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6478495744. Throughput: 0: 53315.5. Samples: 969007320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:49,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 14:01:49,961][52263] Updated weights for policy 0, policy_version 395419 (0.0031) [2024-04-27 14:01:52,748][52263] Updated weights for policy 0, policy_version 395429 (0.0028) [2024-04-27 14:01:54,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6478757888. Throughput: 0: 53351.4. Samples: 969326960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:54,107][52031] Avg episode reward: [(0, '0.698')] [2024-04-27 14:01:55,943][52263] Updated weights for policy 0, policy_version 395439 (0.0027) [2024-04-27 14:01:58,682][52263] Updated weights for policy 0, policy_version 395449 (0.0027) [2024-04-27 14:01:59,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6479052800. Throughput: 0: 53289.7. Samples: 969488780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-27 14:01:59,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 14:02:02,010][52263] Updated weights for policy 0, policy_version 395459 (0.0032) [2024-04-27 14:02:04,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.9, 300 sec: 53650.7). Total num frames: 6479298560. Throughput: 0: 53442.5. Samples: 969813100. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:04,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 14:02:05,049][52263] Updated weights for policy 0, policy_version 395469 (0.0028) [2024-04-27 14:02:07,856][52242] Signal inference workers to stop experience collection... (14600 times) [2024-04-27 14:02:07,856][52242] Signal inference workers to resume experience collection... (14600 times) [2024-04-27 14:02:07,868][52263] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-04-27 14:02:07,868][52263] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-04-27 14:02:08,218][52263] Updated weights for policy 0, policy_version 395479 (0.0036) [2024-04-27 14:02:09,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6479577088. Throughput: 0: 53471.7. Samples: 970136140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:09,107][52031] Avg episode reward: [(0, '0.492')] [2024-04-27 14:02:10,998][52263] Updated weights for policy 0, policy_version 395489 (0.0033) [2024-04-27 14:02:14,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6479839232. Throughput: 0: 53510.5. Samples: 970298960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:14,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 14:02:14,359][52263] Updated weights for policy 0, policy_version 395499 (0.0036) [2024-04-27 14:02:16,943][52263] Updated weights for policy 0, policy_version 395509 (0.0027) [2024-04-27 14:02:19,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53247.9, 300 sec: 53595.2). Total num frames: 6480084992. Throughput: 0: 53606.1. Samples: 970617320. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:19,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 14:02:20,285][52263] Updated weights for policy 0, policy_version 395519 (0.0031) [2024-04-27 14:02:23,124][52263] Updated weights for policy 0, policy_version 395529 (0.0029) [2024-04-27 14:02:24,106][52031] Fps is (10 sec: 54067.4, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6480379904. Throughput: 0: 53487.6. Samples: 970935820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:24,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 14:02:26,471][52263] Updated weights for policy 0, policy_version 395539 (0.0039) [2024-04-27 14:02:29,107][52031] Fps is (10 sec: 57343.3, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 6480658432. Throughput: 0: 53686.5. Samples: 971097520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:29,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 14:02:29,249][52263] Updated weights for policy 0, policy_version 395549 (0.0029) [2024-04-27 14:02:32,771][52263] Updated weights for policy 0, policy_version 395559 (0.0030) [2024-04-27 14:02:34,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53793.9, 300 sec: 53650.6). Total num frames: 6480920576. Throughput: 0: 53695.8. Samples: 971423640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:34,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 14:02:35,210][52263] Updated weights for policy 0, policy_version 395569 (0.0029) [2024-04-27 14:02:38,817][52263] Updated weights for policy 0, policy_version 395579 (0.0027) [2024-04-27 14:02:39,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6481182720. Throughput: 0: 53676.1. Samples: 971742380. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:39,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 14:02:41,141][52263] Updated weights for policy 0, policy_version 395589 (0.0034) [2024-04-27 14:02:44,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6481444864. Throughput: 0: 53578.1. Samples: 971899800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:44,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 14:02:44,891][52263] Updated weights for policy 0, policy_version 395599 (0.0030) [2024-04-27 14:02:47,267][52263] Updated weights for policy 0, policy_version 395609 (0.0025) [2024-04-27 14:02:49,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6481690624. Throughput: 0: 53463.6. Samples: 972218960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:49,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 14:02:51,100][52263] Updated weights for policy 0, policy_version 395619 (0.0030) [2024-04-27 14:02:53,916][52263] Updated weights for policy 0, policy_version 395629 (0.0029) [2024-04-27 14:02:54,107][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6481985536. Throughput: 0: 53428.7. Samples: 972540440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:54,108][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 14:02:57,268][52263] Updated weights for policy 0, policy_version 395639 (0.0026) [2024-04-27 14:02:59,106][52031] Fps is (10 sec: 58982.7, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6482280448. Throughput: 0: 53548.5. Samples: 972708640. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:02:59,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 14:03:00,162][52263] Updated weights for policy 0, policy_version 395649 (0.0027) [2024-04-27 14:03:03,408][52263] Updated weights for policy 0, policy_version 395659 (0.0029) [2024-04-27 14:03:04,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6482509824. Throughput: 0: 53601.5. Samples: 973029380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:03:04,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 14:03:06,125][52263] Updated weights for policy 0, policy_version 395669 (0.0034) [2024-04-27 14:03:09,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6482788352. Throughput: 0: 53589.4. Samples: 973347340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:03:09,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 14:03:09,437][52263] Updated weights for policy 0, policy_version 395679 (0.0031) [2024-04-27 14:03:12,260][52263] Updated weights for policy 0, policy_version 395689 (0.0032) [2024-04-27 14:03:14,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 53539.5). Total num frames: 6483034112. Throughput: 0: 53446.3. Samples: 973502600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:03:14,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 14:03:15,626][52263] Updated weights for policy 0, policy_version 395699 (0.0034) [2024-04-27 14:03:17,483][52242] Signal inference workers to stop experience collection... (14650 times) [2024-04-27 14:03:17,484][52242] Signal inference workers to resume experience collection... 
(14650 times) [2024-04-27 14:03:17,514][52263] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-04-27 14:03:17,514][52263] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-04-27 14:03:18,430][52263] Updated weights for policy 0, policy_version 395709 (0.0031) [2024-04-27 14:03:19,106][52031] Fps is (10 sec: 54067.4, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6483329024. Throughput: 0: 53327.8. Samples: 973823380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:03:19,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 14:03:21,683][52263] Updated weights for policy 0, policy_version 395719 (0.0032) [2024-04-27 14:03:24,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6483607552. Throughput: 0: 53399.4. Samples: 974145360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:03:24,107][52031] Avg episode reward: [(0, '0.504')] [2024-04-27 14:03:24,489][52263] Updated weights for policy 0, policy_version 395729 (0.0029) [2024-04-27 14:03:27,689][52263] Updated weights for policy 0, policy_version 395739 (0.0028) [2024-04-27 14:03:29,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.3, 300 sec: 53595.2). Total num frames: 6483869696. Throughput: 0: 53771.8. Samples: 974319520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:03:29,107][52031] Avg episode reward: [(0, '0.480')] [2024-04-27 14:03:30,463][52263] Updated weights for policy 0, policy_version 395749 (0.0033) [2024-04-27 14:03:33,826][52263] Updated weights for policy 0, policy_version 395759 (0.0032) [2024-04-27 14:03:34,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6484115456. Throughput: 0: 53833.0. Samples: 974641440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:03:34,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:03:34,120][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000395760_6484131840.pth... [2024-04-27 14:03:34,176][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000394976_6471286784.pth [2024-04-27 14:03:36,438][52263] Updated weights for policy 0, policy_version 395769 (0.0026) [2024-04-27 14:03:39,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6484393984. Throughput: 0: 53829.1. Samples: 974962740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:03:39,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 14:03:39,854][52263] Updated weights for policy 0, policy_version 395779 (0.0030) [2024-04-27 14:03:42,680][52263] Updated weights for policy 0, policy_version 395789 (0.0032) [2024-04-27 14:03:44,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6484656128. Throughput: 0: 53512.2. Samples: 975116700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:03:44,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 14:03:46,117][52263] Updated weights for policy 0, policy_version 395799 (0.0037) [2024-04-27 14:03:49,025][52263] Updated weights for policy 0, policy_version 395809 (0.0029) [2024-04-27 14:03:49,107][52031] Fps is (10 sec: 54065.5, 60 sec: 54067.0, 300 sec: 53539.6). Total num frames: 6484934656. Throughput: 0: 53551.2. Samples: 975439200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:03:49,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 14:03:52,202][52263] Updated weights for policy 0, policy_version 395819 (0.0029) [2024-04-27 14:03:54,107][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6485213184. Throughput: 0: 53656.8. Samples: 975761900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:03:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:03:55,053][52263] Updated weights for policy 0, policy_version 395829 (0.0029) [2024-04-27 14:03:58,280][52263] Updated weights for policy 0, policy_version 395839 (0.0028) [2024-04-27 14:03:59,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.7, 300 sec: 53484.0). Total num frames: 6485458944. Throughput: 0: 53739.5. Samples: 975920880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:03:59,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 14:04:01,064][52263] Updated weights for policy 0, policy_version 395849 (0.0033) [2024-04-27 14:04:04,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6485737472. Throughput: 0: 53757.8. Samples: 976242480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:04:04,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 14:04:04,333][52263] Updated weights for policy 0, policy_version 395859 (0.0027) [2024-04-27 14:04:07,502][52263] Updated weights for policy 0, policy_version 395869 (0.0027) [2024-04-27 14:04:09,106][52031] Fps is (10 sec: 52430.3, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6485983232. Throughput: 0: 53676.2. Samples: 976560780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:04:09,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 14:04:10,539][52263] Updated weights for policy 0, policy_version 395879 (0.0026) [2024-04-27 14:04:13,676][52263] Updated weights for policy 0, policy_version 395889 (0.0029) [2024-04-27 14:04:14,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6486245376. Throughput: 0: 53420.3. Samples: 976723440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:04:14,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 14:04:16,741][52263] Updated weights for policy 0, policy_version 395899 (0.0030) [2024-04-27 14:04:19,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6486540288. Throughput: 0: 53344.0. Samples: 977041920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:04:19,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 14:04:19,747][52263] Updated weights for policy 0, policy_version 395909 (0.0029) [2024-04-27 14:04:22,955][52263] Updated weights for policy 0, policy_version 395919 (0.0029) [2024-04-27 14:04:23,583][52242] Signal inference workers to stop experience collection... (14700 times) [2024-04-27 14:04:23,583][52242] Signal inference workers to resume experience collection... (14700 times) [2024-04-27 14:04:23,610][52263] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-04-27 14:04:23,610][52263] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-04-27 14:04:24,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6486802432. Throughput: 0: 53473.3. Samples: 977369040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:04:24,107][52031] Avg episode reward: [(0, '0.434')] [2024-04-27 14:04:25,702][52263] Updated weights for policy 0, policy_version 395929 (0.0030) [2024-04-27 14:04:29,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6487048192. Throughput: 0: 53496.7. Samples: 977524040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:04:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 14:04:29,163][52263] Updated weights for policy 0, policy_version 395939 (0.0029) [2024-04-27 14:04:31,859][52263] Updated weights for policy 0, policy_version 395949 (0.0032) [2024-04-27 14:04:34,107][52031] Fps is (10 sec: 54065.8, 60 sec: 53793.9, 300 sec: 53484.0). Total num frames: 6487343104. Throughput: 0: 53473.8. Samples: 977845520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:04:34,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 14:04:35,301][52263] Updated weights for policy 0, policy_version 395959 (0.0034) [2024-04-27 14:04:38,053][52263] Updated weights for policy 0, policy_version 395969 (0.0029) [2024-04-27 14:04:39,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 6487621632. Throughput: 0: 53418.1. Samples: 978165720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:04:39,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 14:04:41,372][52263] Updated weights for policy 0, policy_version 395979 (0.0031) [2024-04-27 14:04:44,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6487867392. Throughput: 0: 53538.6. Samples: 978330120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:04:44,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 14:04:44,239][52263] Updated weights for policy 0, policy_version 395989 (0.0029) [2024-04-27 14:04:47,413][52263] Updated weights for policy 0, policy_version 395999 (0.0030) [2024-04-27 14:04:49,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6488145920. Throughput: 0: 53487.8. Samples: 978649440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:04:49,107][52031] Avg episode reward: [(0, '0.673')] [2024-04-27 14:04:50,550][52263] Updated weights for policy 0, policy_version 396009 (0.0028) [2024-04-27 14:04:53,602][52263] Updated weights for policy 0, policy_version 396019 (0.0034) [2024-04-27 14:04:54,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6488408064. Throughput: 0: 53650.9. Samples: 978975080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:04:54,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 14:04:56,487][52263] Updated weights for policy 0, policy_version 396029 (0.0032) [2024-04-27 14:04:59,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6488670208. Throughput: 0: 53491.1. Samples: 979130540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:04:59,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 14:04:59,718][52263] Updated weights for policy 0, policy_version 396039 (0.0029) [2024-04-27 14:05:02,440][52263] Updated weights for policy 0, policy_version 396049 (0.0030) [2024-04-27 14:05:04,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6488932352. Throughput: 0: 53548.3. Samples: 979451600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:04,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 14:05:05,682][52263] Updated weights for policy 0, policy_version 396059 (0.0033) [2024-04-27 14:05:08,450][52263] Updated weights for policy 0, policy_version 396069 (0.0031) [2024-04-27 14:05:09,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6489194496. Throughput: 0: 53361.2. Samples: 979770300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:09,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:05:11,826][52263] Updated weights for policy 0, policy_version 396079 (0.0034) [2024-04-27 14:05:14,107][52031] Fps is (10 sec: 55705.1, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6489489408. Throughput: 0: 53804.2. Samples: 979945240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:14,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 14:05:14,637][52263] Updated weights for policy 0, policy_version 396089 (0.0034) [2024-04-27 14:05:17,996][52263] Updated weights for policy 0, policy_version 396099 (0.0032) [2024-04-27 14:05:19,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6489751552. Throughput: 0: 53740.7. Samples: 980263840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:19,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 14:05:20,797][52263] Updated weights for policy 0, policy_version 396109 (0.0027) [2024-04-27 14:05:24,106][52031] Fps is (10 sec: 50791.6, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6489997312. Throughput: 0: 53722.0. Samples: 980583200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:24,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 14:05:24,199][52263] Updated weights for policy 0, policy_version 396119 (0.0026) [2024-04-27 14:05:26,825][52263] Updated weights for policy 0, policy_version 396129 (0.0032) [2024-04-27 14:05:29,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6490275840. Throughput: 0: 53360.8. Samples: 980731340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:29,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 14:05:30,274][52263] Updated weights for policy 0, policy_version 396139 (0.0027) [2024-04-27 14:05:33,013][52263] Updated weights for policy 0, policy_version 396149 (0.0030) [2024-04-27 14:05:34,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6490554368. Throughput: 0: 53538.0. Samples: 981058640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:34,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 14:05:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000396153_6490570752.pth... [2024-04-27 14:05:34,167][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000395368_6477709312.pth [2024-04-27 14:05:36,293][52263] Updated weights for policy 0, policy_version 396159 (0.0028) [2024-04-27 14:05:36,803][52242] Signal inference workers to stop experience collection... (14750 times) [2024-04-27 14:05:36,807][52242] Signal inference workers to resume experience collection... (14750 times) [2024-04-27 14:05:36,825][52263] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-04-27 14:05:36,825][52263] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-04-27 14:05:39,106][52031] Fps is (10 sec: 52427.9, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6490800128. Throughput: 0: 53437.8. Samples: 981379780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:39,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 14:05:39,394][52263] Updated weights for policy 0, policy_version 396169 (0.0032) [2024-04-27 14:05:42,552][52263] Updated weights for policy 0, policy_version 396179 (0.0027) [2024-04-27 14:05:44,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6491062272. Throughput: 0: 53527.3. Samples: 981539260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:44,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 14:05:45,699][52263] Updated weights for policy 0, policy_version 396189 (0.0031) [2024-04-27 14:05:48,829][52263] Updated weights for policy 0, policy_version 396199 (0.0029) [2024-04-27 14:05:49,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6491340800. Throughput: 0: 53368.5. Samples: 981853180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:49,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 14:05:51,682][52263] Updated weights for policy 0, policy_version 396209 (0.0027) [2024-04-27 14:05:54,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52975.1, 300 sec: 53428.5). Total num frames: 6491586560. Throughput: 0: 53507.7. Samples: 982178140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:54,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 14:05:54,806][52263] Updated weights for policy 0, policy_version 396219 (0.0026) [2024-04-27 14:05:57,640][52263] Updated weights for policy 0, policy_version 396229 (0.0030) [2024-04-27 14:05:59,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6491881472. Throughput: 0: 53155.0. Samples: 982337200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:05:59,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 14:06:01,114][52263] Updated weights for policy 0, policy_version 396239 (0.0033) [2024-04-27 14:06:03,791][52263] Updated weights for policy 0, policy_version 396249 (0.0035) [2024-04-27 14:06:04,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6492143616. Throughput: 0: 53134.2. Samples: 982654880. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:04,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 14:06:07,274][52263] Updated weights for policy 0, policy_version 396259 (0.0030) [2024-04-27 14:06:09,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6492405760. Throughput: 0: 53189.5. Samples: 982976740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:09,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 14:06:09,921][52263] Updated weights for policy 0, policy_version 396269 (0.0036) [2024-04-27 14:06:13,324][52263] Updated weights for policy 0, policy_version 396279 (0.0029) [2024-04-27 14:06:14,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53248.0, 300 sec: 53539.5). Total num frames: 6492684288. Throughput: 0: 53416.5. Samples: 983135100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:14,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 14:06:15,987][52263] Updated weights for policy 0, policy_version 396289 (0.0035) [2024-04-27 14:06:19,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6492930048. Throughput: 0: 53269.3. Samples: 983455760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:19,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 14:06:19,481][52263] Updated weights for policy 0, policy_version 396299 (0.0033) [2024-04-27 14:06:22,264][52263] Updated weights for policy 0, policy_version 396309 (0.0033) [2024-04-27 14:06:24,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6493208576. Throughput: 0: 53173.7. Samples: 983772600. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:24,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 14:06:25,685][52263] Updated weights for policy 0, policy_version 396319 (0.0031) [2024-04-27 14:06:28,237][52263] Updated weights for policy 0, policy_version 396329 (0.0031) [2024-04-27 14:06:29,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6493487104. Throughput: 0: 53193.8. Samples: 983932980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:29,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 14:06:31,717][52263] Updated weights for policy 0, policy_version 396339 (0.0030) [2024-04-27 14:06:34,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52974.8, 300 sec: 53428.5). Total num frames: 6493732864. Throughput: 0: 53365.7. Samples: 984254640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:34,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 14:06:34,440][52263] Updated weights for policy 0, policy_version 396349 (0.0030) [2024-04-27 14:06:37,914][52242] Signal inference workers to stop experience collection... (14800 times) [2024-04-27 14:06:37,944][52263] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-04-27 14:06:38,009][52242] Signal inference workers to resume experience collection... (14800 times) [2024-04-27 14:06:38,010][52263] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-04-27 14:06:38,012][52263] Updated weights for policy 0, policy_version 396359 (0.0029) [2024-04-27 14:06:39,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6494027776. Throughput: 0: 53257.6. Samples: 984574740. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:39,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 14:06:40,510][52263] Updated weights for policy 0, policy_version 396369 (0.0026) [2024-04-27 14:06:44,092][52263] Updated weights for policy 0, policy_version 396379 (0.0038) [2024-04-27 14:06:44,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6494273536. Throughput: 0: 53298.6. Samples: 984735640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:44,107][52031] Avg episode reward: [(0, '0.655')] [2024-04-27 14:06:46,541][52263] Updated weights for policy 0, policy_version 396389 (0.0029) [2024-04-27 14:06:49,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6494519296. Throughput: 0: 53402.2. Samples: 985057980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:49,107][52031] Avg episode reward: [(0, '0.684')] [2024-04-27 14:06:50,091][52263] Updated weights for policy 0, policy_version 396399 (0.0035) [2024-04-27 14:06:52,582][52263] Updated weights for policy 0, policy_version 396409 (0.0026) [2024-04-27 14:06:54,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6494814208. Throughput: 0: 53442.3. Samples: 985381640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:06:56,126][52263] Updated weights for policy 0, policy_version 396419 (0.0031) [2024-04-27 14:06:58,751][52263] Updated weights for policy 0, policy_version 396429 (0.0032) [2024-04-27 14:06:59,107][52031] Fps is (10 sec: 58981.6, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6495109120. Throughput: 0: 53507.6. Samples: 985542940. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:06:59,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 14:07:02,265][52263] Updated weights for policy 0, policy_version 396439 (0.0026) [2024-04-27 14:07:04,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6495371264. Throughput: 0: 53578.5. Samples: 985866800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:07:04,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 14:07:04,778][52263] Updated weights for policy 0, policy_version 396449 (0.0030) [2024-04-27 14:07:08,403][52263] Updated weights for policy 0, policy_version 396459 (0.0030) [2024-04-27 14:07:09,106][52031] Fps is (10 sec: 50791.4, 60 sec: 53521.3, 300 sec: 53484.1). Total num frames: 6495617024. Throughput: 0: 53729.1. Samples: 986190400. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:07:09,107][52031] Avg episode reward: [(0, '0.467')] [2024-04-27 14:07:10,805][52263] Updated weights for policy 0, policy_version 396469 (0.0028) [2024-04-27 14:07:14,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6495879168. Throughput: 0: 53581.9. Samples: 986344180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:07:14,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 14:07:14,477][52263] Updated weights for policy 0, policy_version 396479 (0.0032) [2024-04-27 14:07:17,109][52263] Updated weights for policy 0, policy_version 396489 (0.0030) [2024-04-27 14:07:19,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6496141312. Throughput: 0: 53630.6. Samples: 986668020. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 14:07:19,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:07:20,371][52263] Updated weights for policy 0, policy_version 396499 (0.0026) [2024-04-27 14:07:23,146][52263] Updated weights for policy 0, policy_version 396509 (0.0031) [2024-04-27 14:07:24,107][52031] Fps is (10 sec: 55706.3, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6496436224. Throughput: 0: 53809.7. Samples: 986996180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:07:24,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:07:26,423][52263] Updated weights for policy 0, policy_version 396519 (0.0028) [2024-04-27 14:07:29,106][52031] Fps is (10 sec: 57344.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6496714752. Throughput: 0: 53796.9. Samples: 987156500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:07:29,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 14:07:29,306][52263] Updated weights for policy 0, policy_version 396529 (0.0029) [2024-04-27 14:07:32,599][52263] Updated weights for policy 0, policy_version 396539 (0.0031) [2024-04-27 14:07:34,106][52031] Fps is (10 sec: 54067.8, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6496976896. Throughput: 0: 53936.5. Samples: 987485120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:07:34,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 14:07:34,206][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000396545_6496993280.pth... [2024-04-27 14:07:34,264][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000395760_6484131840.pth [2024-04-27 14:07:35,436][52263] Updated weights for policy 0, policy_version 396549 (0.0028) [2024-04-27 14:07:38,650][52263] Updated weights for policy 0, policy_version 396559 (0.0033) [2024-04-27 14:07:39,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.0, 300 sec: 53484.1). 
Total num frames: 6497222656. Throughput: 0: 53824.0. Samples: 987803720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:07:39,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 14:07:41,437][52263] Updated weights for policy 0, policy_version 396569 (0.0033) [2024-04-27 14:07:44,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6497484800. Throughput: 0: 53686.0. Samples: 987958800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:07:44,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 14:07:44,724][52263] Updated weights for policy 0, policy_version 396579 (0.0027) [2024-04-27 14:07:47,659][52263] Updated weights for policy 0, policy_version 396589 (0.0031) [2024-04-27 14:07:49,107][52031] Fps is (10 sec: 55705.3, 60 sec: 54340.2, 300 sec: 53539.6). Total num frames: 6497779712. Throughput: 0: 53651.6. Samples: 988281120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:07:49,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 14:07:50,896][52263] Updated weights for policy 0, policy_version 396599 (0.0027) [2024-04-27 14:07:53,907][52263] Updated weights for policy 0, policy_version 396609 (0.0028) [2024-04-27 14:07:54,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6498041856. Throughput: 0: 53546.9. Samples: 988600020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:07:54,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 14:07:54,399][52242] Signal inference workers to stop experience collection... (14850 times) [2024-04-27 14:07:54,399][52242] Signal inference workers to resume experience collection... 
(14850 times) [2024-04-27 14:07:54,421][52263] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-04-27 14:07:54,422][52263] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-04-27 14:07:57,252][52263] Updated weights for policy 0, policy_version 396619 (0.0028) [2024-04-27 14:07:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6498320384. Throughput: 0: 53936.7. Samples: 988771320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:07:59,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 14:08:00,014][52263] Updated weights for policy 0, policy_version 396629 (0.0030) [2024-04-27 14:08:03,267][52263] Updated weights for policy 0, policy_version 396639 (0.0029) [2024-04-27 14:08:04,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6498582528. Throughput: 0: 53775.2. Samples: 989087900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:08:04,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 14:08:06,005][52263] Updated weights for policy 0, policy_version 396649 (0.0027) [2024-04-27 14:08:09,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6498844672. Throughput: 0: 53701.4. Samples: 989412740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:08:09,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:08:09,209][52263] Updated weights for policy 0, policy_version 396659 (0.0035) [2024-04-27 14:08:12,358][52263] Updated weights for policy 0, policy_version 396669 (0.0027) [2024-04-27 14:08:14,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6499106816. Throughput: 0: 53606.5. Samples: 989568800. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:08:14,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 14:08:15,615][52263] Updated weights for policy 0, policy_version 396679 (0.0032) [2024-04-27 14:08:18,482][52263] Updated weights for policy 0, policy_version 396689 (0.0029) [2024-04-27 14:08:19,106][52031] Fps is (10 sec: 54067.5, 60 sec: 54067.3, 300 sec: 53484.1). Total num frames: 6499385344. Throughput: 0: 53349.3. Samples: 989885840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:08:19,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 14:08:21,671][52263] Updated weights for policy 0, policy_version 396699 (0.0027) [2024-04-27 14:08:24,107][52031] Fps is (10 sec: 55706.2, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6499663872. Throughput: 0: 53519.5. Samples: 990212100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:08:24,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 14:08:24,597][52263] Updated weights for policy 0, policy_version 396709 (0.0028) [2024-04-27 14:08:27,642][52263] Updated weights for policy 0, policy_version 396719 (0.0028) [2024-04-27 14:08:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6499926016. Throughput: 0: 53683.0. Samples: 990374540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:08:29,107][52031] Avg episode reward: [(0, '0.493')] [2024-04-27 14:08:30,605][52263] Updated weights for policy 0, policy_version 396729 (0.0027) [2024-04-27 14:08:33,781][52263] Updated weights for policy 0, policy_version 396739 (0.0029) [2024-04-27 14:08:34,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6500171776. Throughput: 0: 53615.7. Samples: 990693820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:08:34,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 14:08:36,657][52263] Updated weights for policy 0, policy_version 396749 (0.0029) [2024-04-27 14:08:39,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6500433920. Throughput: 0: 53588.0. Samples: 991011480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 14:08:39,107][52031] Avg episode reward: [(0, '0.493')] [2024-04-27 14:08:40,172][52263] Updated weights for policy 0, policy_version 396759 (0.0033) [2024-04-27 14:08:42,802][52263] Updated weights for policy 0, policy_version 396769 (0.0028) [2024-04-27 14:08:44,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6500712448. Throughput: 0: 53339.2. Samples: 991171580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:08:44,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 14:08:46,188][52263] Updated weights for policy 0, policy_version 396779 (0.0029) [2024-04-27 14:08:48,924][52263] Updated weights for policy 0, policy_version 396789 (0.0034) [2024-04-27 14:08:49,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6500990976. Throughput: 0: 53412.8. Samples: 991491480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:08:49,107][52031] Avg episode reward: [(0, '0.655')] [2024-04-27 14:08:52,259][52263] Updated weights for policy 0, policy_version 396799 (0.0029) [2024-04-27 14:08:54,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6501253120. Throughput: 0: 53315.8. Samples: 991811960. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:08:54,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 14:08:54,992][52263] Updated weights for policy 0, policy_version 396809 (0.0029) [2024-04-27 14:08:58,236][52263] Updated weights for policy 0, policy_version 396819 (0.0028) [2024-04-27 14:08:59,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6501531648. Throughput: 0: 53481.1. Samples: 991975440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:08:59,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 14:09:00,697][52242] Signal inference workers to stop experience collection... (14900 times) [2024-04-27 14:09:00,746][52263] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-04-27 14:09:00,756][52242] Signal inference workers to resume experience collection... (14900 times) [2024-04-27 14:09:00,762][52263] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-04-27 14:09:01,134][52263] Updated weights for policy 0, policy_version 396829 (0.0039) [2024-04-27 14:09:04,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6501777408. Throughput: 0: 53503.1. Samples: 992293480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:04,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:09:04,410][52263] Updated weights for policy 0, policy_version 396839 (0.0026) [2024-04-27 14:09:07,706][52263] Updated weights for policy 0, policy_version 396849 (0.0035) [2024-04-27 14:09:09,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6502039552. Throughput: 0: 53375.9. Samples: 992614020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:09,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:09:10,595][52263] Updated weights for policy 0, policy_version 396859 (0.0029) [2024-04-27 14:09:13,810][52263] Updated weights for policy 0, policy_version 396869 (0.0031) [2024-04-27 14:09:14,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6502318080. Throughput: 0: 53302.2. Samples: 992773140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:14,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 14:09:16,552][52263] Updated weights for policy 0, policy_version 396879 (0.0033) [2024-04-27 14:09:19,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6502596608. Throughput: 0: 53460.9. Samples: 993099560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:19,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 14:09:19,722][52263] Updated weights for policy 0, policy_version 396889 (0.0032) [2024-04-27 14:09:22,518][52263] Updated weights for policy 0, policy_version 396899 (0.0032) [2024-04-27 14:09:24,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 6502875136. Throughput: 0: 53604.6. Samples: 993423680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:24,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 14:09:25,660][52263] Updated weights for policy 0, policy_version 396909 (0.0032) [2024-04-27 14:09:28,644][52263] Updated weights for policy 0, policy_version 396919 (0.0030) [2024-04-27 14:09:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6503137280. Throughput: 0: 53730.0. Samples: 993589440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:29,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 14:09:32,068][52263] Updated weights for policy 0, policy_version 396929 (0.0038) [2024-04-27 14:09:34,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6503383040. Throughput: 0: 53695.7. Samples: 993907780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:34,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 14:09:34,195][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000396936_6503399424.pth... [2024-04-27 14:09:34,240][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000396153_6490570752.pth [2024-04-27 14:09:34,859][52263] Updated weights for policy 0, policy_version 396939 (0.0026) [2024-04-27 14:09:38,086][52263] Updated weights for policy 0, policy_version 396949 (0.0031) [2024-04-27 14:09:39,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6503661568. Throughput: 0: 53718.9. Samples: 994229300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:39,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 14:09:40,937][52263] Updated weights for policy 0, policy_version 396959 (0.0026) [2024-04-27 14:09:44,013][52263] Updated weights for policy 0, policy_version 396969 (0.0029) [2024-04-27 14:09:44,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6503940096. Throughput: 0: 53679.0. Samples: 994391000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:44,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 14:09:47,097][52263] Updated weights for policy 0, policy_version 396979 (0.0027) [2024-04-27 14:09:49,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6504202240. Throughput: 0: 53808.5. Samples: 994714860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:49,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 14:09:50,169][52263] Updated weights for policy 0, policy_version 396989 (0.0023) [2024-04-27 14:09:53,422][52263] Updated weights for policy 0, policy_version 396999 (0.0030) [2024-04-27 14:09:54,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6504464384. Throughput: 0: 53698.5. Samples: 995030440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:54,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:09:56,450][52263] Updated weights for policy 0, policy_version 397009 (0.0034) [2024-04-27 14:09:59,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6504742912. Throughput: 0: 53700.8. Samples: 995189680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 14:09:59,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 14:09:59,734][52263] Updated weights for policy 0, policy_version 397019 (0.0039) [2024-04-27 14:10:02,447][52263] Updated weights for policy 0, policy_version 397029 (0.0031) [2024-04-27 14:10:04,107][52031] Fps is (10 sec: 54065.8, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 6505005056. Throughput: 0: 53553.1. Samples: 995509460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:04,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 14:10:05,763][52263] Updated weights for policy 0, policy_version 397039 (0.0033) [2024-04-27 14:10:08,584][52263] Updated weights for policy 0, policy_version 397049 (0.0028) [2024-04-27 14:10:09,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6505267200. Throughput: 0: 53545.2. Samples: 995833220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:09,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 14:10:10,480][52242] Signal inference workers to stop experience collection... (14950 times) [2024-04-27 14:10:10,483][52242] Signal inference workers to resume experience collection... (14950 times) [2024-04-27 14:10:10,510][52263] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-04-27 14:10:10,510][52263] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-04-27 14:10:11,727][52263] Updated weights for policy 0, policy_version 397059 (0.0027) [2024-04-27 14:10:14,106][52031] Fps is (10 sec: 54068.4, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6505545728. Throughput: 0: 53459.7. Samples: 995995120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:14,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 14:10:14,617][52263] Updated weights for policy 0, policy_version 397069 (0.0030) [2024-04-27 14:10:18,022][52263] Updated weights for policy 0, policy_version 397079 (0.0030) [2024-04-27 14:10:19,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6505824256. Throughput: 0: 53570.9. Samples: 996318480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:19,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 14:10:20,695][52263] Updated weights for policy 0, policy_version 397089 (0.0026) [2024-04-27 14:10:24,057][52263] Updated weights for policy 0, policy_version 397099 (0.0028) [2024-04-27 14:10:24,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53247.8, 300 sec: 53539.5). Total num frames: 6506070016. Throughput: 0: 53615.3. Samples: 996642000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:24,116][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 14:10:26,862][52263] Updated weights for policy 0, policy_version 397109 (0.0031) [2024-04-27 14:10:29,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6506348544. Throughput: 0: 53549.4. Samples: 996800720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:29,107][52031] Avg episode reward: [(0, '0.488')] [2024-04-27 14:10:30,223][52263] Updated weights for policy 0, policy_version 397119 (0.0025) [2024-04-27 14:10:32,944][52263] Updated weights for policy 0, policy_version 397129 (0.0036) [2024-04-27 14:10:34,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6506594304. Throughput: 0: 53560.4. Samples: 997125080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:34,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:10:36,207][52263] Updated weights for policy 0, policy_version 397139 (0.0032) [2024-04-27 14:10:38,914][52263] Updated weights for policy 0, policy_version 397149 (0.0032) [2024-04-27 14:10:39,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6506889216. Throughput: 0: 53730.5. Samples: 997448320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:39,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 14:10:42,395][52263] Updated weights for policy 0, policy_version 397159 (0.0020) [2024-04-27 14:10:44,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.2, 300 sec: 53650.6). Total num frames: 6507167744. Throughput: 0: 53889.4. Samples: 997614700. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:44,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 14:10:45,145][52263] Updated weights for policy 0, policy_version 397169 (0.0026) [2024-04-27 14:10:48,405][52263] Updated weights for policy 0, policy_version 397179 (0.0028) [2024-04-27 14:10:49,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6507429888. Throughput: 0: 53927.0. Samples: 997936160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:49,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 14:10:51,174][52263] Updated weights for policy 0, policy_version 397189 (0.0026) [2024-04-27 14:10:54,107][52031] Fps is (10 sec: 50790.6, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6507675648. Throughput: 0: 53917.4. Samples: 998259500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:54,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 14:10:54,484][52263] Updated weights for policy 0, policy_version 397199 (0.0034) [2024-04-27 14:10:57,099][52263] Updated weights for policy 0, policy_version 397209 (0.0025) [2024-04-27 14:10:59,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6507954176. Throughput: 0: 53717.6. Samples: 998412420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:10:59,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 14:11:00,506][52263] Updated weights for policy 0, policy_version 397219 (0.0030) [2024-04-27 14:11:03,332][52263] Updated weights for policy 0, policy_version 397229 (0.0027) [2024-04-27 14:11:04,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6508232704. Throughput: 0: 53728.9. Samples: 998736280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:11:04,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 14:11:06,574][52263] Updated weights for policy 0, policy_version 397239 (0.0027) [2024-04-27 14:11:09,107][52031] Fps is (10 sec: 55705.6, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6508511232. Throughput: 0: 53781.4. Samples: 999062160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:11:09,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 14:11:09,650][52263] Updated weights for policy 0, policy_version 397249 (0.0032) [2024-04-27 14:11:12,657][52263] Updated weights for policy 0, policy_version 397259 (0.0028) [2024-04-27 14:11:14,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6508773376. Throughput: 0: 54018.3. Samples: 999231540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:11:14,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 14:11:14,150][52242] Signal inference workers to stop experience collection... (15000 times) [2024-04-27 14:11:14,151][52242] Signal inference workers to resume experience collection... (15000 times) [2024-04-27 14:11:14,173][52263] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-04-27 14:11:14,174][52263] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-04-27 14:11:15,612][52263] Updated weights for policy 0, policy_version 397269 (0.0029) [2024-04-27 14:11:18,849][52263] Updated weights for policy 0, policy_version 397279 (0.0027) [2024-04-27 14:11:19,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6509019136. Throughput: 0: 53950.7. Samples: 999552860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 14:11:19,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 14:11:21,795][52263] Updated weights for policy 0, policy_version 397289 (0.0027) [2024-04-27 14:11:24,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6509281280. Throughput: 0: 53847.7. Samples: 999871460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:11:24,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 14:11:24,860][52263] Updated weights for policy 0, policy_version 397299 (0.0028) [2024-04-27 14:11:27,972][52263] Updated weights for policy 0, policy_version 397309 (0.0030) [2024-04-27 14:11:29,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 6509576192. Throughput: 0: 53572.2. Samples: 1000025440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:11:29,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 14:11:31,065][52263] Updated weights for policy 0, policy_version 397319 (0.0030) [2024-04-27 14:11:33,986][52263] Updated weights for policy 0, policy_version 397329 (0.0027) [2024-04-27 14:11:34,106][52031] Fps is (10 sec: 55705.0, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6509838336. Throughput: 0: 53549.6. Samples: 1000345900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:11:34,107][52031] Avg episode reward: [(0, '0.683')] [2024-04-27 14:11:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000397329_6509838336.pth... [2024-04-27 14:11:34,181][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000396545_6496993280.pth [2024-04-27 14:11:37,267][52263] Updated weights for policy 0, policy_version 397339 (0.0027) [2024-04-27 14:11:39,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6510116864. Throughput: 0: 53489.0. Samples: 1000666500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:11:39,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 14:11:40,030][52263] Updated weights for policy 0, policy_version 397349 (0.0035) [2024-04-27 14:11:43,262][52263] Updated weights for policy 0, policy_version 397359 (0.0027) [2024-04-27 14:11:44,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53706.2). Total num frames: 6510362624. Throughput: 0: 53758.7. Samples: 1000831560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:11:44,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 14:11:46,039][52263] Updated weights for policy 0, policy_version 397369 (0.0024) [2024-04-27 14:11:49,107][52031] Fps is (10 sec: 52427.6, 60 sec: 53520.8, 300 sec: 53650.6). Total num frames: 6510641152. Throughput: 0: 53818.5. Samples: 1001158120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:11:49,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 14:11:49,335][52263] Updated weights for policy 0, policy_version 397379 (0.0036) [2024-04-27 14:11:52,215][52263] Updated weights for policy 0, policy_version 397389 (0.0029) [2024-04-27 14:11:54,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6510903296. Throughput: 0: 53757.8. Samples: 1001481260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:11:54,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 14:11:55,389][52263] Updated weights for policy 0, policy_version 397399 (0.0032) [2024-04-27 14:11:58,374][52263] Updated weights for policy 0, policy_version 397409 (0.0036) [2024-04-27 14:11:59,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 6511181824. Throughput: 0: 53359.2. Samples: 1001632720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:11:59,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 14:12:01,543][52263] Updated weights for policy 0, policy_version 397419 (0.0032) [2024-04-27 14:12:04,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6511460352. Throughput: 0: 53393.8. Samples: 1001955580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:12:04,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:12:04,346][52263] Updated weights for policy 0, policy_version 397429 (0.0027) [2024-04-27 14:12:07,582][52263] Updated weights for policy 0, policy_version 397439 (0.0027) [2024-04-27 14:12:09,106][52031] Fps is (10 sec: 54069.1, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 6511722496. Throughput: 0: 53472.4. Samples: 1002277720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:12:09,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 14:12:10,388][52263] Updated weights for policy 0, policy_version 397449 (0.0026) [2024-04-27 14:12:13,409][52242] Signal inference workers to stop experience collection... (15050 times) [2024-04-27 14:12:13,444][52263] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-04-27 14:12:13,473][52242] Signal inference workers to resume experience collection... (15050 times) [2024-04-27 14:12:13,475][52263] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-04-27 14:12:13,733][52263] Updated weights for policy 0, policy_version 397459 (0.0028) [2024-04-27 14:12:14,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 6511968256. Throughput: 0: 53592.8. Samples: 1002437120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:12:14,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 14:12:16,553][52263] Updated weights for policy 0, policy_version 397469 (0.0029) [2024-04-27 14:12:19,106][52031] Fps is (10 sec: 49151.6, 60 sec: 53247.9, 300 sec: 53484.1). Total num frames: 6512214016. Throughput: 0: 53474.3. Samples: 1002752240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:12:19,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 14:12:20,049][52263] Updated weights for policy 0, policy_version 397479 (0.0028) [2024-04-27 14:12:22,820][52263] Updated weights for policy 0, policy_version 397489 (0.0028) [2024-04-27 14:12:24,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53793.9, 300 sec: 53539.6). Total num frames: 6512508928. Throughput: 0: 53430.8. Samples: 1003070900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:12:24,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 14:12:26,124][52263] Updated weights for policy 0, policy_version 397499 (0.0027) [2024-04-27 14:12:28,794][52263] Updated weights for policy 0, policy_version 397509 (0.0031) [2024-04-27 14:12:29,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6512787456. Throughput: 0: 53551.6. Samples: 1003241380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:12:29,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 14:12:32,316][52263] Updated weights for policy 0, policy_version 397519 (0.0030) [2024-04-27 14:12:34,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6513065984. Throughput: 0: 53305.3. Samples: 1003556860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:12:34,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 14:12:34,861][52263] Updated weights for policy 0, policy_version 397529 (0.0024) [2024-04-27 14:12:38,540][52263] Updated weights for policy 0, policy_version 397539 (0.0030) [2024-04-27 14:12:39,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53247.9, 300 sec: 53650.6). Total num frames: 6513311744. Throughput: 0: 53221.8. Samples: 1003876240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:12:39,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 14:12:41,056][52263] Updated weights for policy 0, policy_version 397549 (0.0027) [2024-04-27 14:12:44,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6513573888. Throughput: 0: 53291.0. Samples: 1004030800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:12:44,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 14:12:44,537][52263] Updated weights for policy 0, policy_version 397559 (0.0034) [2024-04-27 14:12:47,226][52263] Updated weights for policy 0, policy_version 397569 (0.0027) [2024-04-27 14:12:49,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6513836032. Throughput: 0: 53244.4. Samples: 1004351580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:12:49,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 14:12:50,742][52263] Updated weights for policy 0, policy_version 397579 (0.0029) [2024-04-27 14:12:53,182][52263] Updated weights for policy 0, policy_version 397589 (0.0029) [2024-04-27 14:12:54,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6514114560. Throughput: 0: 53213.6. Samples: 1004672340. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:12:54,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 14:12:56,879][52263] Updated weights for policy 0, policy_version 397599 (0.0033) [2024-04-27 14:12:59,106][52031] Fps is (10 sec: 57344.7, 60 sec: 53794.5, 300 sec: 53650.7). Total num frames: 6514409472. Throughput: 0: 53385.9. Samples: 1004839480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:12:59,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 14:12:59,198][52263] Updated weights for policy 0, policy_version 397609 (0.0029) [2024-04-27 14:13:02,881][52263] Updated weights for policy 0, policy_version 397619 (0.0028) [2024-04-27 14:13:04,106][52031] Fps is (10 sec: 52429.6, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6514638848. Throughput: 0: 53439.2. Samples: 1005157000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:04,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:13:05,389][52263] Updated weights for policy 0, policy_version 397629 (0.0026) [2024-04-27 14:13:08,962][52263] Updated weights for policy 0, policy_version 397639 (0.0036) [2024-04-27 14:13:09,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6514917376. Throughput: 0: 53585.9. Samples: 1005482260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:09,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 14:13:11,908][52263] Updated weights for policy 0, policy_version 397649 (0.0028) [2024-04-27 14:13:14,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6515163136. Throughput: 0: 53089.6. Samples: 1005630420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:14,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 14:13:14,261][52242] Signal inference workers to stop experience collection... 
(15100 times) [2024-04-27 14:13:14,261][52242] Signal inference workers to resume experience collection... (15100 times) [2024-04-27 14:13:14,289][52263] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-04-27 14:13:14,289][52263] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-04-27 14:13:15,379][52263] Updated weights for policy 0, policy_version 397659 (0.0031) [2024-04-27 14:13:18,157][52263] Updated weights for policy 0, policy_version 397669 (0.0031) [2024-04-27 14:13:19,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6515441664. Throughput: 0: 53145.6. Samples: 1005948400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:19,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 14:13:21,590][52263] Updated weights for policy 0, policy_version 397679 (0.0031) [2024-04-27 14:13:24,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6515720192. Throughput: 0: 53122.6. Samples: 1006266760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:24,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 14:13:24,453][52263] Updated weights for policy 0, policy_version 397689 (0.0035) [2024-04-27 14:13:27,568][52263] Updated weights for policy 0, policy_version 397699 (0.0041) [2024-04-27 14:13:29,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6515982336. Throughput: 0: 53475.1. Samples: 1006437180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:29,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 14:13:30,768][52263] Updated weights for policy 0, policy_version 397709 (0.0032) [2024-04-27 14:13:33,621][52263] Updated weights for policy 0, policy_version 397719 (0.0032) [2024-04-27 14:13:34,107][52031] Fps is (10 sec: 52429.1, 60 sec: 52975.0, 300 sec: 53595.1). Total num frames: 6516244480. Throughput: 0: 53329.7. 
Samples: 1006751420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:34,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 14:13:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000397720_6516244480.pth... [2024-04-27 14:13:34,170][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000396936_6503399424.pth [2024-04-27 14:13:36,799][52263] Updated weights for policy 0, policy_version 397729 (0.0031) [2024-04-27 14:13:39,107][52031] Fps is (10 sec: 50790.3, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6516490240. Throughput: 0: 53382.7. Samples: 1007074560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:39,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 14:13:39,860][52263] Updated weights for policy 0, policy_version 397739 (0.0034) [2024-04-27 14:13:42,794][52263] Updated weights for policy 0, policy_version 397749 (0.0029) [2024-04-27 14:13:44,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6516752384. Throughput: 0: 52931.1. Samples: 1007221380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:44,107][52031] Avg episode reward: [(0, '0.504')] [2024-04-27 14:13:46,141][52263] Updated weights for policy 0, policy_version 397759 (0.0032) [2024-04-27 14:13:49,091][52263] Updated weights for policy 0, policy_version 397769 (0.0036) [2024-04-27 14:13:49,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6517047296. Throughput: 0: 53034.5. Samples: 1007543560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:49,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 14:13:52,257][52263] Updated weights for policy 0, policy_version 397779 (0.0029) [2024-04-27 14:13:54,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6517325824. Throughput: 0: 52932.4. Samples: 1007864220. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:54,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 14:13:55,378][52263] Updated weights for policy 0, policy_version 397789 (0.0032) [2024-04-27 14:13:58,440][52263] Updated weights for policy 0, policy_version 397799 (0.0028) [2024-04-27 14:13:59,107][52031] Fps is (10 sec: 52428.8, 60 sec: 52701.7, 300 sec: 53539.6). Total num frames: 6517571584. Throughput: 0: 53391.2. Samples: 1008033020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 14:13:59,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 14:14:01,408][52263] Updated weights for policy 0, policy_version 397809 (0.0029) [2024-04-27 14:14:04,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52974.9, 300 sec: 53484.1). Total num frames: 6517817344. Throughput: 0: 53417.4. Samples: 1008352180. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:04,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 14:14:04,518][52263] Updated weights for policy 0, policy_version 397819 (0.0029) [2024-04-27 14:14:07,496][52263] Updated weights for policy 0, policy_version 397829 (0.0032) [2024-04-27 14:14:09,106][52031] Fps is (10 sec: 50790.8, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 6518079488. Throughput: 0: 53376.6. Samples: 1008668700. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:09,107][52031] Avg episode reward: [(0, '0.679')] [2024-04-27 14:14:10,724][52263] Updated weights for policy 0, policy_version 397839 (0.0040) [2024-04-27 14:14:13,786][52263] Updated weights for policy 0, policy_version 397849 (0.0029) [2024-04-27 14:14:14,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6518358016. Throughput: 0: 53036.8. Samples: 1008823840. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:14,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 14:14:16,739][52263] Updated weights for policy 0, policy_version 397859 (0.0029) [2024-04-27 14:14:19,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6518636544. Throughput: 0: 53061.9. Samples: 1009139200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:19,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 14:14:20,058][52263] Updated weights for policy 0, policy_version 397869 (0.0036) [2024-04-27 14:14:22,956][52263] Updated weights for policy 0, policy_version 397879 (0.0031) [2024-04-27 14:14:23,171][52242] Signal inference workers to stop experience collection... (15150 times) [2024-04-27 14:14:23,171][52242] Signal inference workers to resume experience collection... (15150 times) [2024-04-27 14:14:23,203][52263] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-04-27 14:14:23,204][52263] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-04-27 14:14:24,106][52031] Fps is (10 sec: 52429.3, 60 sec: 52702.0, 300 sec: 53373.0). Total num frames: 6518882304. Throughput: 0: 52889.4. Samples: 1009454580. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:24,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 14:14:26,065][52263] Updated weights for policy 0, policy_version 397889 (0.0032) [2024-04-27 14:14:29,084][52263] Updated weights for policy 0, policy_version 397899 (0.0033) [2024-04-27 14:14:29,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6519177216. Throughput: 0: 53378.3. Samples: 1009623400. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:29,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 14:14:32,545][52263] Updated weights for policy 0, policy_version 397909 (0.0029) [2024-04-27 14:14:34,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52702.0, 300 sec: 53373.0). Total num frames: 6519406592. Throughput: 0: 53359.4. Samples: 1009944720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:34,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 14:14:35,091][52263] Updated weights for policy 0, policy_version 397919 (0.0038) [2024-04-27 14:14:38,562][52263] Updated weights for policy 0, policy_version 397929 (0.0024) [2024-04-27 14:14:39,106][52031] Fps is (10 sec: 49151.5, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6519668736. Throughput: 0: 53298.3. Samples: 1010262640. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:39,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 14:14:41,130][52263] Updated weights for policy 0, policy_version 397939 (0.0036) [2024-04-27 14:14:44,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6519980032. Throughput: 0: 53030.7. Samples: 1010419400. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:44,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 14:14:44,536][52263] Updated weights for policy 0, policy_version 397949 (0.0030) [2024-04-27 14:14:47,267][52263] Updated weights for policy 0, policy_version 397959 (0.0033) [2024-04-27 14:14:49,106][52031] Fps is (10 sec: 55705.6, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6520225792. Throughput: 0: 52983.5. Samples: 1010736440. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:49,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 14:14:50,500][52263] Updated weights for policy 0, policy_version 397969 (0.0027) [2024-04-27 14:14:53,453][52263] Updated weights for policy 0, policy_version 397979 (0.0032) [2024-04-27 14:14:54,106][52031] Fps is (10 sec: 52429.0, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6520504320. Throughput: 0: 53065.7. Samples: 1011056660. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:14:56,628][52263] Updated weights for policy 0, policy_version 397989 (0.0029) [2024-04-27 14:14:59,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6520782848. Throughput: 0: 53240.2. Samples: 1011219640. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:14:59,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 14:14:59,504][52263] Updated weights for policy 0, policy_version 397999 (0.0034) [2024-04-27 14:15:02,971][52263] Updated weights for policy 0, policy_version 398009 (0.0029) [2024-04-27 14:15:04,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53247.8, 300 sec: 53372.9). Total num frames: 6521012224. Throughput: 0: 53357.0. Samples: 1011540280. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:15:04,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 14:15:05,567][52263] Updated weights for policy 0, policy_version 398019 (0.0032) [2024-04-27 14:15:09,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6521290752. Throughput: 0: 53417.2. Samples: 1011858360. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:15:09,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 14:15:09,221][52263] Updated weights for policy 0, policy_version 398029 (0.0033) [2024-04-27 14:15:11,697][52263] Updated weights for policy 0, policy_version 398039 (0.0029) [2024-04-27 14:15:14,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6521552896. Throughput: 0: 53196.3. Samples: 1012017240. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:15:14,107][52031] Avg episode reward: [(0, '0.711')] [2024-04-27 14:15:15,416][52263] Updated weights for policy 0, policy_version 398049 (0.0026) [2024-04-27 14:15:16,016][52242] Signal inference workers to stop experience collection... (15200 times) [2024-04-27 14:15:16,016][52242] Signal inference workers to resume experience collection... (15200 times) [2024-04-27 14:15:16,026][52263] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-04-27 14:15:16,026][52263] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-04-27 14:15:17,979][52263] Updated weights for policy 0, policy_version 398059 (0.0030) [2024-04-27 14:15:19,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6521831424. Throughput: 0: 53206.9. Samples: 1012339040. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:15:19,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 14:15:21,691][52263] Updated weights for policy 0, policy_version 398069 (0.0025) [2024-04-27 14:15:24,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53793.9, 300 sec: 53428.5). Total num frames: 6522109952. Throughput: 0: 53286.4. Samples: 1012660540. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-27 14:15:24,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 14:15:24,181][52263] Updated weights for policy 0, policy_version 398079 (0.0035) [2024-04-27 14:15:28,136][52263] Updated weights for policy 0, policy_version 398089 (0.0027) [2024-04-27 14:15:29,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52974.8, 300 sec: 53428.5). Total num frames: 6522355712. Throughput: 0: 53208.0. Samples: 1012813760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:15:29,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 14:15:30,468][52263] Updated weights for policy 0, policy_version 398099 (0.0037) [2024-04-27 14:15:34,106][52031] Fps is (10 sec: 49153.4, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 6522601472. Throughput: 0: 53373.9. Samples: 1013138260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:15:34,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 14:15:34,134][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000398109_6522617856.pth... [2024-04-27 14:15:34,139][52263] Updated weights for policy 0, policy_version 398109 (0.0028) [2024-04-27 14:15:34,179][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000397329_6509838336.pth [2024-04-27 14:15:36,511][52263] Updated weights for policy 0, policy_version 398119 (0.0033) [2024-04-27 14:15:39,106][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.2, 300 sec: 53373.0). Total num frames: 6522912768. Throughput: 0: 53342.2. Samples: 1013457060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:15:39,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 14:15:40,222][52263] Updated weights for policy 0, policy_version 398129 (0.0034) [2024-04-27 14:15:42,723][52263] Updated weights for policy 0, policy_version 398139 (0.0027) [2024-04-27 14:15:44,106][52031] Fps is (10 sec: 55705.7, 60 sec: 52975.1, 300 sec: 53317.4). 
Total num frames: 6523158528. Throughput: 0: 53430.7. Samples: 1013624020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:15:44,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 14:15:46,436][52263] Updated weights for policy 0, policy_version 398149 (0.0031) [2024-04-27 14:15:49,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6523420672. Throughput: 0: 53319.0. Samples: 1013939620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:15:49,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 14:15:49,130][52263] Updated weights for policy 0, policy_version 398159 (0.0031) [2024-04-27 14:15:52,448][52263] Updated weights for policy 0, policy_version 398169 (0.0032) [2024-04-27 14:15:54,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6523699200. Throughput: 0: 53366.4. Samples: 1014259840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:15:54,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 14:15:55,142][52263] Updated weights for policy 0, policy_version 398179 (0.0025) [2024-04-27 14:15:58,421][52263] Updated weights for policy 0, policy_version 398189 (0.0030) [2024-04-27 14:15:59,107][52031] Fps is (10 sec: 52428.2, 60 sec: 52701.8, 300 sec: 53261.9). Total num frames: 6523944960. Throughput: 0: 53369.3. Samples: 1014418860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:15:59,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 14:16:01,300][52263] Updated weights for policy 0, policy_version 398199 (0.0029) [2024-04-27 14:16:04,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.3, 300 sec: 53317.4). Total num frames: 6524239872. Throughput: 0: 53353.0. Samples: 1014739920. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:16:04,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 14:16:04,748][52263] Updated weights for policy 0, policy_version 398209 (0.0031) [2024-04-27 14:16:07,486][52263] Updated weights for policy 0, policy_version 398219 (0.0028) [2024-04-27 14:16:09,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.2, 300 sec: 53261.9). Total num frames: 6524485632. Throughput: 0: 53367.9. Samples: 1015062080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:16:09,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 14:16:10,966][52263] Updated weights for policy 0, policy_version 398229 (0.0025) [2024-04-27 14:16:13,247][52242] Signal inference workers to stop experience collection... (15250 times) [2024-04-27 14:16:13,289][52263] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-04-27 14:16:13,302][52242] Signal inference workers to resume experience collection... (15250 times) [2024-04-27 14:16:13,309][52263] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-04-27 14:16:13,572][52263] Updated weights for policy 0, policy_version 398239 (0.0029) [2024-04-27 14:16:14,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6524764160. Throughput: 0: 53448.2. Samples: 1015218920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:16:14,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 14:16:17,133][52263] Updated weights for policy 0, policy_version 398249 (0.0031) [2024-04-27 14:16:19,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6525026304. Throughput: 0: 53323.4. Samples: 1015537820. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:16:19,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:16:19,610][52263] Updated weights for policy 0, policy_version 398259 (0.0035) [2024-04-27 14:16:23,184][52263] Updated weights for policy 0, policy_version 398269 (0.0027) [2024-04-27 14:16:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.2, 300 sec: 53261.9). Total num frames: 6525288448. Throughput: 0: 53414.8. Samples: 1015860720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:16:24,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 14:16:26,025][52263] Updated weights for policy 0, policy_version 398279 (0.0031) [2024-04-27 14:16:29,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 6525550592. Throughput: 0: 53142.8. Samples: 1016015460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:16:29,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 14:16:29,285][52263] Updated weights for policy 0, policy_version 398289 (0.0033) [2024-04-27 14:16:32,057][52263] Updated weights for policy 0, policy_version 398299 (0.0025) [2024-04-27 14:16:34,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53261.9). Total num frames: 6525829120. Throughput: 0: 53277.7. Samples: 1016337120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:16:34,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 14:16:35,362][52263] Updated weights for policy 0, policy_version 398309 (0.0027) [2024-04-27 14:16:38,197][52263] Updated weights for policy 0, policy_version 398319 (0.0030) [2024-04-27 14:16:39,106][52031] Fps is (10 sec: 54068.6, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6526091264. Throughput: 0: 53307.6. Samples: 1016658680. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:16:39,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 14:16:41,545][52263] Updated weights for policy 0, policy_version 398329 (0.0031) [2024-04-27 14:16:44,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 6526353408. Throughput: 0: 53415.2. Samples: 1016822540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-04-27 14:16:44,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 14:16:44,465][52263] Updated weights for policy 0, policy_version 398339 (0.0034) [2024-04-27 14:16:47,614][52263] Updated weights for policy 0, policy_version 398349 (0.0028) [2024-04-27 14:16:49,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 6526615552. Throughput: 0: 53334.6. Samples: 1017139980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:16:49,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 14:16:50,584][52263] Updated weights for policy 0, policy_version 398359 (0.0029) [2024-04-27 14:16:53,538][52263] Updated weights for policy 0, policy_version 398369 (0.0029) [2024-04-27 14:16:54,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53247.8, 300 sec: 53261.9). Total num frames: 6526894080. Throughput: 0: 53292.2. Samples: 1017460240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:16:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:16:56,644][52263] Updated weights for policy 0, policy_version 398379 (0.0028) [2024-04-27 14:16:59,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53206.3). Total num frames: 6527156224. Throughput: 0: 53396.8. Samples: 1017621780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:16:59,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:16:59,710][52263] Updated weights for policy 0, policy_version 398389 (0.0032) [2024-04-27 14:17:02,697][52263] Updated weights for policy 0, policy_version 398399 (0.0030) [2024-04-27 14:17:04,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 6527434752. Throughput: 0: 53423.3. Samples: 1017941860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:04,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 14:17:05,858][52263] Updated weights for policy 0, policy_version 398409 (0.0027) [2024-04-27 14:17:08,729][52263] Updated weights for policy 0, policy_version 398419 (0.0034) [2024-04-27 14:17:09,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6527713280. Throughput: 0: 53391.6. Samples: 1018263340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:09,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 14:17:12,115][52263] Updated weights for policy 0, policy_version 398429 (0.0035) [2024-04-27 14:17:14,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53247.8, 300 sec: 53372.9). Total num frames: 6527959040. Throughput: 0: 53672.5. Samples: 1018430720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:14,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:17:14,846][52263] Updated weights for policy 0, policy_version 398439 (0.0036) [2024-04-27 14:17:18,240][52263] Updated weights for policy 0, policy_version 398449 (0.0032) [2024-04-27 14:17:19,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53248.1, 300 sec: 53261.9). Total num frames: 6528221184. Throughput: 0: 53567.2. Samples: 1018747640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:19,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 14:17:21,043][52263] Updated weights for policy 0, policy_version 398459 (0.0028) [2024-04-27 14:17:24,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53520.9, 300 sec: 53261.9). Total num frames: 6528499712. Throughput: 0: 53447.3. Samples: 1019063820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:24,107][52031] Avg episode reward: [(0, '0.436')] [2024-04-27 14:17:24,199][52263] Updated weights for policy 0, policy_version 398469 (0.0031) [2024-04-27 14:17:27,190][52263] Updated weights for policy 0, policy_version 398479 (0.0032) [2024-04-27 14:17:29,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.3, 300 sec: 53206.4). Total num frames: 6528761856. Throughput: 0: 53529.8. Samples: 1019231380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:29,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 14:17:30,393][52263] Updated weights for policy 0, policy_version 398489 (0.0030) [2024-04-27 14:17:33,358][52263] Updated weights for policy 0, policy_version 398499 (0.0036) [2024-04-27 14:17:34,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6529056768. Throughput: 0: 53626.6. Samples: 1019553180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:34,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 14:17:34,114][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000398502_6529056768.pth... [2024-04-27 14:17:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000397720_6516244480.pth [2024-04-27 14:17:36,440][52263] Updated weights for policy 0, policy_version 398509 (0.0029) [2024-04-27 14:17:39,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53247.9, 300 sec: 53261.9). Total num frames: 6529286144. Throughput: 0: 53577.0. Samples: 1019871200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:39,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 14:17:39,500][52263] Updated weights for policy 0, policy_version 398519 (0.0032) [2024-04-27 14:17:39,653][52242] Signal inference workers to stop experience collection... (15300 times) [2024-04-27 14:17:39,654][52242] Signal inference workers to resume experience collection... (15300 times) [2024-04-27 14:17:39,667][52263] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-04-27 14:17:39,668][52263] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-04-27 14:17:42,672][52263] Updated weights for policy 0, policy_version 398529 (0.0032) [2024-04-27 14:17:44,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6529564672. Throughput: 0: 53392.5. Samples: 1020024440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:44,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 14:17:45,674][52263] Updated weights for policy 0, policy_version 398539 (0.0035) [2024-04-27 14:17:48,891][52263] Updated weights for policy 0, policy_version 398549 (0.0029) [2024-04-27 14:17:49,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53793.9, 300 sec: 53317.4). Total num frames: 6529843200. Throughput: 0: 53376.5. Samples: 1020343820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:49,116][52031] Avg episode reward: [(0, '0.496')] [2024-04-27 14:17:51,691][52263] Updated weights for policy 0, policy_version 398559 (0.0027) [2024-04-27 14:17:54,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53248.0, 300 sec: 53150.8). Total num frames: 6530088960. Throughput: 0: 53367.8. Samples: 1020664900. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:54,115][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 14:17:55,087][52263] Updated weights for policy 0, policy_version 398569 (0.0032) [2024-04-27 14:17:57,785][52263] Updated weights for policy 0, policy_version 398579 (0.0027) [2024-04-27 14:17:59,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.0, 300 sec: 53372.9). Total num frames: 6530383872. Throughput: 0: 53303.6. Samples: 1020829380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:17:59,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 14:18:01,327][52263] Updated weights for policy 0, policy_version 398589 (0.0029) [2024-04-27 14:18:03,931][52263] Updated weights for policy 0, policy_version 398599 (0.0028) [2024-04-27 14:18:04,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 6530646016. Throughput: 0: 53388.8. Samples: 1021150140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 14:18:04,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:18:07,351][52263] Updated weights for policy 0, policy_version 398609 (0.0028) [2024-04-27 14:18:09,107][52031] Fps is (10 sec: 50790.3, 60 sec: 52974.7, 300 sec: 53317.4). Total num frames: 6530891776. Throughput: 0: 53585.3. Samples: 1021475160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:09,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 14:18:09,987][52263] Updated weights for policy 0, policy_version 398619 (0.0029) [2024-04-27 14:18:13,379][52263] Updated weights for policy 0, policy_version 398629 (0.0040) [2024-04-27 14:18:14,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53248.2, 300 sec: 53261.9). Total num frames: 6531153920. Throughput: 0: 53208.0. Samples: 1021625740. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:14,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 14:18:16,117][52263] Updated weights for policy 0, policy_version 398639 (0.0026) [2024-04-27 14:18:19,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53794.1, 300 sec: 53317.5). Total num frames: 6531448832. Throughput: 0: 53202.4. Samples: 1021947280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:19,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 14:18:19,588][52263] Updated weights for policy 0, policy_version 398649 (0.0027) [2024-04-27 14:18:22,297][52263] Updated weights for policy 0, policy_version 398659 (0.0029) [2024-04-27 14:18:24,107][52031] Fps is (10 sec: 57342.8, 60 sec: 53794.1, 300 sec: 53372.9). Total num frames: 6531727360. Throughput: 0: 53260.7. Samples: 1022267940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:24,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 14:18:25,714][52263] Updated weights for policy 0, policy_version 398669 (0.0031) [2024-04-27 14:18:28,507][52263] Updated weights for policy 0, policy_version 398679 (0.0033) [2024-04-27 14:18:29,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 6531973120. Throughput: 0: 53609.3. Samples: 1022436860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:29,107][52031] Avg episode reward: [(0, '0.487')] [2024-04-27 14:18:31,743][52263] Updated weights for policy 0, policy_version 398689 (0.0030) [2024-04-27 14:18:34,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6532235264. Throughput: 0: 53750.0. Samples: 1022762560. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:34,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 14:18:34,643][52263] Updated weights for policy 0, policy_version 398699 (0.0033) [2024-04-27 14:18:37,831][52263] Updated weights for policy 0, policy_version 398709 (0.0028) [2024-04-27 14:18:39,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6532513792. Throughput: 0: 53629.4. Samples: 1023078220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:39,115][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 14:18:40,657][52263] Updated weights for policy 0, policy_version 398719 (0.0025) [2024-04-27 14:18:43,978][52263] Updated weights for policy 0, policy_version 398729 (0.0030) [2024-04-27 14:18:44,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 6532775936. Throughput: 0: 53482.7. Samples: 1023236100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:44,116][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 14:18:46,894][52263] Updated weights for policy 0, policy_version 398739 (0.0031) [2024-04-27 14:18:49,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.2, 300 sec: 53261.9). Total num frames: 6533038080. Throughput: 0: 53582.6. Samples: 1023561360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:49,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 14:18:50,289][52263] Updated weights for policy 0, policy_version 398749 (0.0029) [2024-04-27 14:18:53,060][52263] Updated weights for policy 0, policy_version 398759 (0.0030) [2024-04-27 14:18:54,107][52031] Fps is (10 sec: 57343.8, 60 sec: 54340.3, 300 sec: 53484.0). Total num frames: 6533349376. Throughput: 0: 53396.0. Samples: 1023877980. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:54,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 14:18:56,401][52263] Updated weights for policy 0, policy_version 398769 (0.0033) [2024-04-27 14:18:59,063][52263] Updated weights for policy 0, policy_version 398779 (0.0029) [2024-04-27 14:18:59,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6533595136. Throughput: 0: 53883.1. Samples: 1024050480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:18:59,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 14:19:02,423][52263] Updated weights for policy 0, policy_version 398789 (0.0028) [2024-04-27 14:19:02,727][52242] Signal inference workers to stop experience collection... (15350 times) [2024-04-27 14:19:02,732][52242] Signal inference workers to resume experience collection... (15350 times) [2024-04-27 14:19:02,766][52263] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-04-27 14:19:02,766][52263] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-04-27 14:19:04,106][52031] Fps is (10 sec: 49152.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6533840896. Throughput: 0: 53848.8. Samples: 1024370480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:19:04,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 14:19:05,092][52263] Updated weights for policy 0, policy_version 398799 (0.0033) [2024-04-27 14:19:08,559][52263] Updated weights for policy 0, policy_version 398809 (0.0031) [2024-04-27 14:19:09,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6534119424. Throughput: 0: 53915.6. Samples: 1024694140. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:19:09,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 14:19:11,305][52263] Updated weights for policy 0, policy_version 398819 (0.0029) [2024-04-27 14:19:14,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6534365184. Throughput: 0: 53604.5. Samples: 1024849060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:19:14,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:19:14,727][52263] Updated weights for policy 0, policy_version 398829 (0.0033) [2024-04-27 14:19:17,480][52263] Updated weights for policy 0, policy_version 398839 (0.0031) [2024-04-27 14:19:19,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6534660096. Throughput: 0: 53532.4. Samples: 1025171520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:19:19,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 14:19:20,811][52263] Updated weights for policy 0, policy_version 398849 (0.0029) [2024-04-27 14:19:23,496][52263] Updated weights for policy 0, policy_version 398859 (0.0032) [2024-04-27 14:19:24,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53521.3, 300 sec: 53428.5). Total num frames: 6534938624. Throughput: 0: 53636.6. Samples: 1025491860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-27 14:19:24,107][52031] Avg episode reward: [(0, '0.476')] [2024-04-27 14:19:26,818][52263] Updated weights for policy 0, policy_version 398869 (0.0033) [2024-04-27 14:19:29,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.0, 300 sec: 53539.5). Total num frames: 6535200768. Throughput: 0: 53810.6. Samples: 1025657580. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:19:29,116][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 14:19:29,591][52263] Updated weights for policy 0, policy_version 398879 (0.0029) [2024-04-27 14:19:32,828][52263] Updated weights for policy 0, policy_version 398889 (0.0025) [2024-04-27 14:19:34,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6535462912. Throughput: 0: 53992.5. Samples: 1025991020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:19:34,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 14:19:34,303][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000398895_6535495680.pth... [2024-04-27 14:19:34,347][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000398109_6522617856.pth [2024-04-27 14:19:35,525][52263] Updated weights for policy 0, policy_version 398899 (0.0031) [2024-04-27 14:19:39,086][52263] Updated weights for policy 0, policy_version 398909 (0.0033) [2024-04-27 14:19:39,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6535725056. Throughput: 0: 54071.2. Samples: 1026311180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:19:39,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 14:19:41,554][52263] Updated weights for policy 0, policy_version 398919 (0.0031) [2024-04-27 14:19:44,107][52031] Fps is (10 sec: 50789.2, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6535970816. Throughput: 0: 53576.2. Samples: 1026461420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:19:44,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 14:19:45,194][52263] Updated weights for policy 0, policy_version 398929 (0.0027) [2024-04-27 14:19:47,648][52263] Updated weights for policy 0, policy_version 398939 (0.0027) [2024-04-27 14:19:49,107][52031] Fps is (10 sec: 55705.2, 60 sec: 54067.1, 300 sec: 53484.0). 
Total num frames: 6536282112. Throughput: 0: 53518.6. Samples: 1026778820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:19:49,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 14:19:51,210][52263] Updated weights for policy 0, policy_version 398949 (0.0033) [2024-04-27 14:19:53,807][52263] Updated weights for policy 0, policy_version 398959 (0.0030) [2024-04-27 14:19:54,106][52031] Fps is (10 sec: 57344.9, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6536544256. Throughput: 0: 53495.2. Samples: 1027101420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:19:54,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 14:19:57,358][52263] Updated weights for policy 0, policy_version 398969 (0.0028) [2024-04-27 14:19:59,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6536806400. Throughput: 0: 53901.6. Samples: 1027274640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:19:59,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 14:19:59,953][52263] Updated weights for policy 0, policy_version 398979 (0.0030) [2024-04-27 14:20:03,736][52263] Updated weights for policy 0, policy_version 398989 (0.0032) [2024-04-27 14:20:03,828][52242] Signal inference workers to stop experience collection... (15400 times) [2024-04-27 14:20:03,874][52263] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-04-27 14:20:03,897][52242] Signal inference workers to resume experience collection... (15400 times) [2024-04-27 14:20:03,898][52263] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-04-27 14:20:04,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6537068544. Throughput: 0: 53791.1. Samples: 1027592120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:20:04,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 14:20:06,187][52263] Updated weights for policy 0, policy_version 398999 (0.0028) [2024-04-27 14:20:09,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6537314304. Throughput: 0: 53676.3. Samples: 1027907300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:20:09,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 14:20:09,697][52263] Updated weights for policy 0, policy_version 399009 (0.0029) [2024-04-27 14:20:12,224][52263] Updated weights for policy 0, policy_version 399019 (0.0031) [2024-04-27 14:20:14,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53793.9, 300 sec: 53428.5). Total num frames: 6537592832. Throughput: 0: 53452.9. Samples: 1028062960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:20:14,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 14:20:15,603][52263] Updated weights for policy 0, policy_version 399029 (0.0035) [2024-04-27 14:20:18,236][52263] Updated weights for policy 0, policy_version 399039 (0.0031) [2024-04-27 14:20:19,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6537887744. Throughput: 0: 53192.8. Samples: 1028384700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:20:19,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 14:20:21,878][52263] Updated weights for policy 0, policy_version 399049 (0.0030) [2024-04-27 14:20:24,106][52031] Fps is (10 sec: 57344.9, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6538166272. Throughput: 0: 53196.0. Samples: 1028705000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:20:24,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 14:20:24,284][52263] Updated weights for policy 0, policy_version 399059 (0.0030) [2024-04-27 14:20:28,102][52263] Updated weights for policy 0, policy_version 399069 (0.0026) [2024-04-27 14:20:29,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6538412032. Throughput: 0: 53578.9. Samples: 1028872460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:20:29,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 14:20:30,345][52263] Updated weights for policy 0, policy_version 399079 (0.0037) [2024-04-27 14:20:34,106][52031] Fps is (10 sec: 49151.7, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6538657792. Throughput: 0: 53635.2. Samples: 1029192400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:20:34,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 14:20:34,160][52263] Updated weights for policy 0, policy_version 399089 (0.0033) [2024-04-27 14:20:36,425][52263] Updated weights for policy 0, policy_version 399099 (0.0029) [2024-04-27 14:20:39,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6538936320. Throughput: 0: 53556.8. Samples: 1029511480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:20:39,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 14:20:40,258][52263] Updated weights for policy 0, policy_version 399109 (0.0033) [2024-04-27 14:20:42,507][52263] Updated weights for policy 0, policy_version 399119 (0.0029) [2024-04-27 14:20:44,107][52031] Fps is (10 sec: 55705.1, 60 sec: 54067.2, 300 sec: 53539.5). Total num frames: 6539214848. Throughput: 0: 53281.3. Samples: 1029672300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 14:20:44,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 14:20:46,357][52263] Updated weights for policy 0, policy_version 399129 (0.0030) [2024-04-27 14:20:48,601][52263] Updated weights for policy 0, policy_version 399139 (0.0033) [2024-04-27 14:20:49,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6539493376. Throughput: 0: 53365.9. Samples: 1029993580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:20:49,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 14:20:52,655][52263] Updated weights for policy 0, policy_version 399149 (0.0036) [2024-04-27 14:20:54,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6539755520. Throughput: 0: 53430.8. Samples: 1030311680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:20:54,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 14:20:54,854][52263] Updated weights for policy 0, policy_version 399159 (0.0028) [2024-04-27 14:20:58,623][52263] Updated weights for policy 0, policy_version 399169 (0.0032) [2024-04-27 14:20:59,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6540001280. Throughput: 0: 53541.9. Samples: 1030472340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:20:59,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 14:21:01,129][52263] Updated weights for policy 0, policy_version 399179 (0.0033) [2024-04-27 14:21:04,107][52031] Fps is (10 sec: 49151.3, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6540247040. Throughput: 0: 53551.5. Samples: 1030794520. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:04,107][52031] Avg episode reward: [(0, '0.487')] [2024-04-27 14:21:04,673][52263] Updated weights for policy 0, policy_version 399189 (0.0028) [2024-04-27 14:21:07,360][52263] Updated weights for policy 0, policy_version 399199 (0.0030) [2024-04-27 14:21:09,107][52031] Fps is (10 sec: 55705.6, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6540558336. Throughput: 0: 53529.7. Samples: 1031113840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:09,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 14:21:10,636][52242] Signal inference workers to stop experience collection... (15450 times) [2024-04-27 14:21:10,674][52263] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-04-27 14:21:10,700][52242] Signal inference workers to resume experience collection... (15450 times) [2024-04-27 14:21:10,700][52263] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-04-27 14:21:10,816][52263] Updated weights for policy 0, policy_version 399209 (0.0027) [2024-04-27 14:21:13,447][52263] Updated weights for policy 0, policy_version 399219 (0.0031) [2024-04-27 14:21:14,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6540820480. Throughput: 0: 53482.9. Samples: 1031279200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:14,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 14:21:17,056][52263] Updated weights for policy 0, policy_version 399229 (0.0029) [2024-04-27 14:21:19,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6541099008. Throughput: 0: 53331.9. Samples: 1031592340. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:19,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 14:21:19,629][52263] Updated weights for policy 0, policy_version 399239 (0.0027) [2024-04-27 14:21:22,990][52263] Updated weights for policy 0, policy_version 399249 (0.0029) [2024-04-27 14:21:24,107][52031] Fps is (10 sec: 50790.7, 60 sec: 52701.8, 300 sec: 53484.1). Total num frames: 6541328384. Throughput: 0: 53339.1. Samples: 1031911740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:24,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 14:21:25,851][52263] Updated weights for policy 0, policy_version 399259 (0.0033) [2024-04-27 14:21:29,061][52263] Updated weights for policy 0, policy_version 399269 (0.0026) [2024-04-27 14:21:29,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6541623296. Throughput: 0: 53316.6. Samples: 1032071540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:29,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 14:21:31,808][52263] Updated weights for policy 0, policy_version 399279 (0.0032) [2024-04-27 14:21:34,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6541869056. Throughput: 0: 53422.9. Samples: 1032397620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:34,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 14:21:34,120][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000399284_6541869056.pth... [2024-04-27 14:21:34,173][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000398502_6529056768.pth [2024-04-27 14:21:35,267][52263] Updated weights for policy 0, policy_version 399289 (0.0028) [2024-04-27 14:21:37,908][52263] Updated weights for policy 0, policy_version 399299 (0.0030) [2024-04-27 14:21:39,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.3, 300 sec: 53595.1). 
Total num frames: 6542163968. Throughput: 0: 53533.8. Samples: 1032720700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:39,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 14:21:41,299][52263] Updated weights for policy 0, policy_version 399309 (0.0034) [2024-04-27 14:21:43,895][52263] Updated weights for policy 0, policy_version 399319 (0.0023) [2024-04-27 14:21:44,107][52031] Fps is (10 sec: 57344.5, 60 sec: 53794.2, 300 sec: 53650.6). Total num frames: 6542442496. Throughput: 0: 53535.6. Samples: 1032881440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:44,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 14:21:47,364][52263] Updated weights for policy 0, policy_version 399329 (0.0036) [2024-04-27 14:21:49,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6542721024. Throughput: 0: 53543.1. Samples: 1033203960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:49,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 14:21:49,970][52263] Updated weights for policy 0, policy_version 399339 (0.0028) [2024-04-27 14:21:53,666][52263] Updated weights for policy 0, policy_version 399349 (0.0029) [2024-04-27 14:21:54,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6542950400. Throughput: 0: 53768.5. Samples: 1033533420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:54,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 14:21:56,319][52263] Updated weights for policy 0, policy_version 399359 (0.0030) [2024-04-27 14:21:56,767][52242] Signal inference workers to stop experience collection... (15500 times) [2024-04-27 14:21:56,767][52242] Signal inference workers to resume experience collection... 
(15500 times) [2024-04-27 14:21:56,792][52263] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-04-27 14:21:56,792][52263] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-04-27 14:21:59,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6543228928. Throughput: 0: 53503.8. Samples: 1033686860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:21:59,107][52031] Avg episode reward: [(0, '0.478')] [2024-04-27 14:21:59,752][52263] Updated weights for policy 0, policy_version 399369 (0.0033) [2024-04-27 14:22:02,365][52263] Updated weights for policy 0, policy_version 399379 (0.0028) [2024-04-27 14:22:04,107][52031] Fps is (10 sec: 54066.4, 60 sec: 54067.1, 300 sec: 53484.0). Total num frames: 6543491072. Throughput: 0: 53662.2. Samples: 1034007140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 14:22:04,107][52031] Avg episode reward: [(0, '0.481')] [2024-04-27 14:22:05,985][52263] Updated weights for policy 0, policy_version 399389 (0.0034) [2024-04-27 14:22:08,315][52263] Updated weights for policy 0, policy_version 399399 (0.0026) [2024-04-27 14:22:09,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6543769600. Throughput: 0: 53592.8. Samples: 1034323420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:09,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 14:22:12,184][52263] Updated weights for policy 0, policy_version 399409 (0.0027) [2024-04-27 14:22:14,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53521.3, 300 sec: 53595.1). Total num frames: 6544031744. Throughput: 0: 53688.1. Samples: 1034487500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:14,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 14:22:14,902][52263] Updated weights for policy 0, policy_version 399419 (0.0030) [2024-04-27 14:22:18,106][52263] Updated weights for policy 0, policy_version 399429 (0.0026) [2024-04-27 14:22:19,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6544310272. Throughput: 0: 53557.0. Samples: 1034807680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:19,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 14:22:20,826][52263] Updated weights for policy 0, policy_version 399439 (0.0031) [2024-04-27 14:22:24,080][52263] Updated weights for policy 0, policy_version 399449 (0.0027) [2024-04-27 14:22:24,107][52031] Fps is (10 sec: 54066.3, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6544572416. Throughput: 0: 53727.7. Samples: 1035138460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:24,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 14:22:26,900][52263] Updated weights for policy 0, policy_version 399459 (0.0032) [2024-04-27 14:22:29,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6544818176. Throughput: 0: 53545.9. Samples: 1035291000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:29,116][52031] Avg episode reward: [(0, '0.691')] [2024-04-27 14:22:30,487][52263] Updated weights for policy 0, policy_version 399469 (0.0032) [2024-04-27 14:22:33,057][52263] Updated weights for policy 0, policy_version 399479 (0.0031) [2024-04-27 14:22:34,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6545096704. Throughput: 0: 53397.7. Samples: 1035606860. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:34,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 14:22:36,684][52263] Updated weights for policy 0, policy_version 399489 (0.0030) [2024-04-27 14:22:39,008][52263] Updated weights for policy 0, policy_version 399499 (0.0028) [2024-04-27 14:22:39,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6545391616. Throughput: 0: 53263.1. Samples: 1035930260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:39,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 14:22:42,867][52263] Updated weights for policy 0, policy_version 399509 (0.0031) [2024-04-27 14:22:44,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.1, 300 sec: 53595.2). Total num frames: 6545653760. Throughput: 0: 53624.8. Samples: 1036099980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:44,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:22:44,973][52263] Updated weights for policy 0, policy_version 399519 (0.0036) [2024-04-27 14:22:48,853][52263] Updated weights for policy 0, policy_version 399529 (0.0030) [2024-04-27 14:22:49,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52974.8, 300 sec: 53595.1). Total num frames: 6545899520. Throughput: 0: 53613.3. Samples: 1036419740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:49,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 14:22:49,200][52242] Signal inference workers to stop experience collection... (15550 times) [2024-04-27 14:22:49,200][52242] Signal inference workers to resume experience collection... 
(15550 times) [2024-04-27 14:22:49,215][52263] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-04-27 14:22:49,215][52263] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-04-27 14:22:51,394][52263] Updated weights for policy 0, policy_version 399539 (0.0033) [2024-04-27 14:22:54,106][52031] Fps is (10 sec: 54067.4, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6546194432. Throughput: 0: 53778.9. Samples: 1036743460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:54,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 14:22:54,953][52263] Updated weights for policy 0, policy_version 399549 (0.0032) [2024-04-27 14:22:57,828][52263] Updated weights for policy 0, policy_version 399559 (0.0025) [2024-04-27 14:22:59,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6546440192. Throughput: 0: 53627.9. Samples: 1036900760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:22:59,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 14:23:01,093][52263] Updated weights for policy 0, policy_version 399569 (0.0028) [2024-04-27 14:23:04,020][52263] Updated weights for policy 0, policy_version 399579 (0.0027) [2024-04-27 14:23:04,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6546702336. Throughput: 0: 53605.8. Samples: 1037219940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:23:04,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 14:23:07,045][52263] Updated weights for policy 0, policy_version 399589 (0.0032) [2024-04-27 14:23:09,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6546997248. Throughput: 0: 53473.0. Samples: 1037544740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:23:09,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 14:23:10,055][52263] Updated weights for policy 0, policy_version 399599 (0.0029) [2024-04-27 14:23:13,153][52263] Updated weights for policy 0, policy_version 399609 (0.0032) [2024-04-27 14:23:14,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6547259392. Throughput: 0: 53821.3. Samples: 1037712960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:23:14,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 14:23:16,280][52263] Updated weights for policy 0, policy_version 399619 (0.0031) [2024-04-27 14:23:19,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6547505152. Throughput: 0: 53902.2. Samples: 1038032460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:23:19,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:23:19,324][52263] Updated weights for policy 0, policy_version 399629 (0.0029) [2024-04-27 14:23:22,393][52263] Updated weights for policy 0, policy_version 399639 (0.0038) [2024-04-27 14:23:24,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6547783680. Throughput: 0: 53877.4. Samples: 1038354740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 14:23:24,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:23:25,396][52263] Updated weights for policy 0, policy_version 399649 (0.0030) [2024-04-27 14:23:28,383][52263] Updated weights for policy 0, policy_version 399659 (0.0027) [2024-04-27 14:23:29,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6548045824. Throughput: 0: 53471.1. Samples: 1038506180. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:23:29,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 14:23:31,395][52263] Updated weights for policy 0, policy_version 399669 (0.0026) [2024-04-27 14:23:34,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6548324352. Throughput: 0: 53603.8. Samples: 1038831900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:23:34,107][52031] Avg episode reward: [(0, '0.695')] [2024-04-27 14:23:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000399678_6548324352.pth... [2024-04-27 14:23:34,169][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000398895_6535495680.pth [2024-04-27 14:23:34,333][52263] Updated weights for policy 0, policy_version 399679 (0.0029) [2024-04-27 14:23:37,531][52263] Updated weights for policy 0, policy_version 399689 (0.0032) [2024-04-27 14:23:39,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6548619264. Throughput: 0: 53520.4. Samples: 1039151880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:23:39,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 14:23:40,435][52263] Updated weights for policy 0, policy_version 399699 (0.0030) [2024-04-27 14:23:41,923][52242] Signal inference workers to stop experience collection... (15600 times) [2024-04-27 14:23:41,960][52263] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-04-27 14:23:42,018][52242] Signal inference workers to resume experience collection... (15600 times) [2024-04-27 14:23:42,018][52263] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-04-27 14:23:43,764][52263] Updated weights for policy 0, policy_version 399709 (0.0040) [2024-04-27 14:23:44,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6548848640. Throughput: 0: 53693.5. 
Samples: 1039316960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:23:44,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 14:23:46,619][52263] Updated weights for policy 0, policy_version 399719 (0.0027) [2024-04-27 14:23:49,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6549127168. Throughput: 0: 53636.0. Samples: 1039633560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:23:49,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 14:23:49,847][52263] Updated weights for policy 0, policy_version 399729 (0.0032) [2024-04-27 14:23:53,014][52263] Updated weights for policy 0, policy_version 399739 (0.0028) [2024-04-27 14:23:54,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6549389312. Throughput: 0: 53614.3. Samples: 1039957380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:23:54,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 14:23:55,836][52263] Updated weights for policy 0, policy_version 399749 (0.0026) [2024-04-27 14:23:59,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6549635072. Throughput: 0: 53431.5. Samples: 1040117380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:23:59,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 14:23:59,233][52263] Updated weights for policy 0, policy_version 399759 (0.0029) [2024-04-27 14:24:01,917][52263] Updated weights for policy 0, policy_version 399769 (0.0027) [2024-04-27 14:24:04,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.2, 300 sec: 53595.2). Total num frames: 6549929984. Throughput: 0: 53392.7. Samples: 1040435120. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:24:04,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 14:24:05,454][52263] Updated weights for policy 0, policy_version 399779 (0.0026) [2024-04-27 14:24:08,051][52263] Updated weights for policy 0, policy_version 399789 (0.0031) [2024-04-27 14:24:09,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6550208512. Throughput: 0: 53352.8. Samples: 1040755620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:24:09,107][52031] Avg episode reward: [(0, '0.468')] [2024-04-27 14:24:11,537][52263] Updated weights for policy 0, policy_version 399799 (0.0033) [2024-04-27 14:24:14,106][52031] Fps is (10 sec: 50789.9, 60 sec: 52974.9, 300 sec: 53484.1). Total num frames: 6550437888. Throughput: 0: 53734.7. Samples: 1040924240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:24:14,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 14:24:14,256][52263] Updated weights for policy 0, policy_version 399809 (0.0028) [2024-04-27 14:24:17,530][52263] Updated weights for policy 0, policy_version 399819 (0.0026) [2024-04-27 14:24:19,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6550716416. Throughput: 0: 53504.9. Samples: 1041239620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:24:19,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 14:24:20,485][52263] Updated weights for policy 0, policy_version 399829 (0.0026) [2024-04-27 14:24:23,613][52263] Updated weights for policy 0, policy_version 399839 (0.0028) [2024-04-27 14:24:24,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6550994944. Throughput: 0: 53579.5. Samples: 1041562960. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:24:24,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 14:24:26,527][52263] Updated weights for policy 0, policy_version 399849 (0.0033) [2024-04-27 14:24:29,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 6551257088. Throughput: 0: 53456.2. Samples: 1041722500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:24:29,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 14:24:29,651][52263] Updated weights for policy 0, policy_version 399859 (0.0025) [2024-04-27 14:24:32,483][52263] Updated weights for policy 0, policy_version 399869 (0.0029) [2024-04-27 14:24:34,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6551535616. Throughput: 0: 53488.9. Samples: 1042040560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:24:34,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 14:24:35,749][52263] Updated weights for policy 0, policy_version 399879 (0.0031) [2024-04-27 14:24:38,652][52263] Updated weights for policy 0, policy_version 399889 (0.0033) [2024-04-27 14:24:39,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53248.0, 300 sec: 53706.2). Total num frames: 6551814144. Throughput: 0: 53412.8. Samples: 1042360960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:24:39,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 14:24:41,966][52263] Updated weights for policy 0, policy_version 399899 (0.0029) [2024-04-27 14:24:44,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6552059904. Throughput: 0: 53485.3. Samples: 1042524220. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-27 14:24:44,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 14:24:44,593][52263] Updated weights for policy 0, policy_version 399909 (0.0027) [2024-04-27 14:24:48,049][52263] Updated weights for policy 0, policy_version 399919 (0.0028) [2024-04-27 14:24:49,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6552322048. Throughput: 0: 53556.3. Samples: 1042845160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:24:49,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:24:50,655][52263] Updated weights for policy 0, policy_version 399929 (0.0035) [2024-04-27 14:24:54,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6552584192. Throughput: 0: 53617.3. Samples: 1043168400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:24:54,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 14:24:54,172][52263] Updated weights for policy 0, policy_version 399939 (0.0027) [2024-04-27 14:24:54,991][52242] Signal inference workers to stop experience collection... (15650 times) [2024-04-27 14:24:55,029][52263] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-04-27 14:24:55,088][52242] Signal inference workers to resume experience collection... (15650 times) [2024-04-27 14:24:55,088][52263] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-04-27 14:24:56,816][52263] Updated weights for policy 0, policy_version 399949 (0.0024) [2024-04-27 14:24:59,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6552862720. Throughput: 0: 53363.2. Samples: 1043325580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:24:59,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 14:25:00,227][52263] Updated weights for policy 0, policy_version 399959 (0.0031) [2024-04-27 14:25:03,107][52263] Updated weights for policy 0, policy_version 399969 (0.0027) [2024-04-27 14:25:04,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6553157632. Throughput: 0: 53565.3. Samples: 1043650060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:04,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 14:25:06,148][52263] Updated weights for policy 0, policy_version 399979 (0.0027) [2024-04-27 14:25:09,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6553403392. Throughput: 0: 53571.7. Samples: 1043973680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:09,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 14:25:09,239][52263] Updated weights for policy 0, policy_version 399989 (0.0034) [2024-04-27 14:25:12,176][52263] Updated weights for policy 0, policy_version 399999 (0.0027) [2024-04-27 14:25:14,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6553665536. Throughput: 0: 53463.7. Samples: 1044128360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:14,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 14:25:15,352][52263] Updated weights for policy 0, policy_version 400009 (0.0034) [2024-04-27 14:25:18,315][52263] Updated weights for policy 0, policy_version 400019 (0.0033) [2024-04-27 14:25:19,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6553927680. Throughput: 0: 53506.7. Samples: 1044448360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:19,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 14:25:21,459][52263] Updated weights for policy 0, policy_version 400029 (0.0038) [2024-04-27 14:25:24,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 6554206208. Throughput: 0: 53686.6. Samples: 1044776860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:24,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 14:25:24,346][52263] Updated weights for policy 0, policy_version 400039 (0.0032) [2024-04-27 14:25:27,577][52263] Updated weights for policy 0, policy_version 400049 (0.0026) [2024-04-27 14:25:29,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6554468352. Throughput: 0: 53448.1. Samples: 1044929380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:29,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 14:25:30,593][52263] Updated weights for policy 0, policy_version 400059 (0.0033) [2024-04-27 14:25:33,679][52263] Updated weights for policy 0, policy_version 400069 (0.0029) [2024-04-27 14:25:34,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6554746880. Throughput: 0: 53406.2. Samples: 1045248440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:34,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 14:25:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000400070_6554746880.pth... [2024-04-27 14:25:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000399284_6541869056.pth [2024-04-27 14:25:36,993][52263] Updated weights for policy 0, policy_version 400079 (0.0028) [2024-04-27 14:25:39,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.2, 300 sec: 53595.2). Total num frames: 6555025408. Throughput: 0: 53304.6. Samples: 1045567100. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:39,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 14:25:39,834][52263] Updated weights for policy 0, policy_version 400089 (0.0025) [2024-04-27 14:25:42,912][52263] Updated weights for policy 0, policy_version 400099 (0.0026) [2024-04-27 14:25:44,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6555271168. Throughput: 0: 53544.5. Samples: 1045735100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:44,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 14:25:45,967][52263] Updated weights for policy 0, policy_version 400109 (0.0036) [2024-04-27 14:25:49,087][52263] Updated weights for policy 0, policy_version 400119 (0.0030) [2024-04-27 14:25:49,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6555549696. Throughput: 0: 53416.8. Samples: 1046053820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:49,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 14:25:51,949][52263] Updated weights for policy 0, policy_version 400129 (0.0027) [2024-04-27 14:25:54,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6555811840. Throughput: 0: 53402.0. Samples: 1046376780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:54,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 14:25:55,256][52263] Updated weights for policy 0, policy_version 400139 (0.0034) [2024-04-27 14:25:58,285][52263] Updated weights for policy 0, policy_version 400149 (0.0031) [2024-04-27 14:25:59,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53520.8, 300 sec: 53650.6). Total num frames: 6556073984. Throughput: 0: 53495.3. Samples: 1046535660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:25:59,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 14:26:01,216][52263] Updated weights for policy 0, policy_version 400159 (0.0033) [2024-04-27 14:26:04,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6556352512. Throughput: 0: 53484.3. Samples: 1046855160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 14:26:04,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 14:26:04,482][52263] Updated weights for policy 0, policy_version 400169 (0.0038) [2024-04-27 14:26:07,467][52263] Updated weights for policy 0, policy_version 400179 (0.0036) [2024-04-27 14:26:09,107][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6556614656. Throughput: 0: 53249.4. Samples: 1047173080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:09,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 14:26:09,367][52242] Signal inference workers to stop experience collection... (15700 times) [2024-04-27 14:26:09,367][52242] Signal inference workers to resume experience collection... (15700 times) [2024-04-27 14:26:09,386][52263] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-04-27 14:26:09,386][52263] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-04-27 14:26:10,755][52263] Updated weights for policy 0, policy_version 400189 (0.0029) [2024-04-27 14:26:13,662][52263] Updated weights for policy 0, policy_version 400199 (0.0032) [2024-04-27 14:26:14,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6556893184. Throughput: 0: 53624.0. Samples: 1047342460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:14,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 14:26:16,775][52263] Updated weights for policy 0, policy_version 400209 (0.0036) [2024-04-27 14:26:19,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6557155328. Throughput: 0: 53644.9. Samples: 1047662460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:19,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 14:26:19,659][52263] Updated weights for policy 0, policy_version 400219 (0.0032) [2024-04-27 14:26:22,780][52263] Updated weights for policy 0, policy_version 400229 (0.0026) [2024-04-27 14:26:24,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6557401088. Throughput: 0: 53658.5. Samples: 1047981740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:24,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 14:26:26,078][52263] Updated weights for policy 0, policy_version 400239 (0.0028) [2024-04-27 14:26:28,879][52263] Updated weights for policy 0, policy_version 400249 (0.0028) [2024-04-27 14:26:29,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.1, 300 sec: 53595.2). Total num frames: 6557679616. Throughput: 0: 53363.5. Samples: 1048136440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:29,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 14:26:32,093][52263] Updated weights for policy 0, policy_version 400259 (0.0034) [2024-04-27 14:26:34,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 6557958144. Throughput: 0: 53339.8. Samples: 1048454120. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:34,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 14:26:35,145][52263] Updated weights for policy 0, policy_version 400269 (0.0030) [2024-04-27 14:26:38,238][52263] Updated weights for policy 0, policy_version 400279 (0.0032) [2024-04-27 14:26:39,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6558220288. Throughput: 0: 53285.4. Samples: 1048774620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:39,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 14:26:41,141][52263] Updated weights for policy 0, policy_version 400289 (0.0034) [2024-04-27 14:26:44,106][52031] Fps is (10 sec: 50791.4, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 6558466048. Throughput: 0: 53468.7. Samples: 1048941740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:44,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 14:26:44,421][52263] Updated weights for policy 0, policy_version 400299 (0.0027) [2024-04-27 14:26:47,471][52263] Updated weights for policy 0, policy_version 400309 (0.0030) [2024-04-27 14:26:49,106][52031] Fps is (10 sec: 54068.4, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6558760960. Throughput: 0: 53441.1. Samples: 1049260000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:49,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 14:26:50,591][52263] Updated weights for policy 0, policy_version 400319 (0.0031) [2024-04-27 14:26:53,631][52263] Updated weights for policy 0, policy_version 400329 (0.0029) [2024-04-27 14:26:54,106][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.1, 300 sec: 53428.5). Total num frames: 6558990336. Throughput: 0: 53508.1. Samples: 1049580940. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:26:56,522][52263] Updated weights for policy 0, policy_version 400339 (0.0034) [2024-04-27 14:26:59,106][52031] Fps is (10 sec: 50790.0, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6559268864. Throughput: 0: 53228.5. Samples: 1049737740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:26:59,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 14:26:59,896][52263] Updated weights for policy 0, policy_version 400349 (0.0032) [2024-04-27 14:27:02,573][52263] Updated weights for policy 0, policy_version 400359 (0.0030) [2024-04-27 14:27:04,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6559547392. Throughput: 0: 53244.8. Samples: 1050058480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:27:04,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 14:27:05,993][52263] Updated weights for policy 0, policy_version 400369 (0.0030) [2024-04-27 14:27:08,769][52263] Updated weights for policy 0, policy_version 400379 (0.0027) [2024-04-27 14:27:09,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 6559825920. Throughput: 0: 53305.2. Samples: 1050380480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:27:09,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 14:27:11,976][52263] Updated weights for policy 0, policy_version 400389 (0.0034) [2024-04-27 14:27:14,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6560088064. Throughput: 0: 53513.2. Samples: 1050544540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:27:14,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 14:27:15,048][52263] Updated weights for policy 0, policy_version 400399 (0.0035) [2024-04-27 14:27:18,389][52263] Updated weights for policy 0, policy_version 400409 (0.0031) [2024-04-27 14:27:19,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6560350208. Throughput: 0: 53519.8. Samples: 1050862500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:27:19,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 14:27:21,037][52263] Updated weights for policy 0, policy_version 400419 (0.0027) [2024-04-27 14:27:24,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6560612352. Throughput: 0: 53452.4. Samples: 1051179980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 14:27:24,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 14:27:24,360][52263] Updated weights for policy 0, policy_version 400429 (0.0031) [2024-04-27 14:27:27,143][52263] Updated weights for policy 0, policy_version 400439 (0.0032) [2024-04-27 14:27:29,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53247.9, 300 sec: 53484.1). Total num frames: 6560874496. Throughput: 0: 53287.1. Samples: 1051339660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:27:29,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 14:27:30,318][52263] Updated weights for policy 0, policy_version 400449 (0.0031) [2024-04-27 14:27:31,326][52242] Signal inference workers to stop experience collection... (15750 times) [2024-04-27 14:27:31,326][52242] Signal inference workers to resume experience collection... 
(15750 times) [2024-04-27 14:27:31,355][52263] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-04-27 14:27:31,355][52263] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-04-27 14:27:33,349][52263] Updated weights for policy 0, policy_version 400459 (0.0028) [2024-04-27 14:27:34,106][52031] Fps is (10 sec: 52429.8, 60 sec: 52975.2, 300 sec: 53373.0). Total num frames: 6561136640. Throughput: 0: 53296.8. Samples: 1051658360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:27:34,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 14:27:34,175][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000400461_6561153024.pth... [2024-04-27 14:27:34,221][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000399678_6548324352.pth [2024-04-27 14:27:36,524][52263] Updated weights for policy 0, policy_version 400469 (0.0034) [2024-04-27 14:27:39,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6561415168. Throughput: 0: 53312.0. Samples: 1051979980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:27:39,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:27:39,455][52263] Updated weights for policy 0, policy_version 400479 (0.0031) [2024-04-27 14:27:42,839][52263] Updated weights for policy 0, policy_version 400489 (0.0030) [2024-04-27 14:27:44,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6561693696. Throughput: 0: 53439.5. Samples: 1052142520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:27:44,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:27:45,479][52263] Updated weights for policy 0, policy_version 400499 (0.0032) [2024-04-27 14:27:48,850][52263] Updated weights for policy 0, policy_version 400509 (0.0029) [2024-04-27 14:27:49,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53247.8, 300 sec: 53428.5). 
Total num frames: 6561955840. Throughput: 0: 53488.0. Samples: 1052465440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:27:49,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 14:27:51,597][52263] Updated weights for policy 0, policy_version 400519 (0.0029) [2024-04-27 14:27:54,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6562217984. Throughput: 0: 53477.4. Samples: 1052786960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:27:54,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 14:27:54,935][52263] Updated weights for policy 0, policy_version 400529 (0.0028) [2024-04-27 14:27:57,748][52263] Updated weights for policy 0, policy_version 400539 (0.0034) [2024-04-27 14:27:59,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6562480128. Throughput: 0: 53370.7. Samples: 1052946220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:27:59,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 14:28:01,069][52263] Updated weights for policy 0, policy_version 400549 (0.0031) [2024-04-27 14:28:03,868][52263] Updated weights for policy 0, policy_version 400559 (0.0037) [2024-04-27 14:28:04,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6562758656. Throughput: 0: 53347.1. Samples: 1053263120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:28:04,107][52031] Avg episode reward: [(0, '0.681')] [2024-04-27 14:28:07,066][52263] Updated weights for policy 0, policy_version 400569 (0.0033) [2024-04-27 14:28:09,111][52031] Fps is (10 sec: 54041.6, 60 sec: 53243.9, 300 sec: 53427.6). Total num frames: 6563020800. Throughput: 0: 53430.5. Samples: 1053584600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:28:09,112][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 14:28:10,073][52263] Updated weights for policy 0, policy_version 400579 (0.0030) [2024-04-27 14:28:13,093][52263] Updated weights for policy 0, policy_version 400589 (0.0030) [2024-04-27 14:28:14,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6563282944. Throughput: 0: 53528.0. Samples: 1053748420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:28:14,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 14:28:16,265][52263] Updated weights for policy 0, policy_version 400599 (0.0033) [2024-04-27 14:28:19,107][52031] Fps is (10 sec: 54092.5, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6563561472. Throughput: 0: 53586.5. Samples: 1054069760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:28:19,116][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:28:19,295][52263] Updated weights for policy 0, policy_version 400609 (0.0028) [2024-04-27 14:28:22,396][52263] Updated weights for policy 0, policy_version 400619 (0.0032) [2024-04-27 14:28:24,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6563807232. Throughput: 0: 53529.3. Samples: 1054388800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:28:24,115][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 14:28:25,600][52263] Updated weights for policy 0, policy_version 400629 (0.0029) [2024-04-27 14:28:28,869][52263] Updated weights for policy 0, policy_version 400639 (0.0030) [2024-04-27 14:28:29,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6564085760. Throughput: 0: 53380.5. Samples: 1054544640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:28:29,116][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 14:28:31,794][52263] Updated weights for policy 0, policy_version 400649 (0.0030) [2024-04-27 14:28:34,106][52031] Fps is (10 sec: 57344.2, 60 sec: 54067.2, 300 sec: 53428.5). Total num frames: 6564380672. Throughput: 0: 53295.7. Samples: 1054863740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:28:34,115][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 14:28:34,849][52263] Updated weights for policy 0, policy_version 400659 (0.0027) [2024-04-27 14:28:37,751][52263] Updated weights for policy 0, policy_version 400669 (0.0028) [2024-04-27 14:28:39,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6564610048. Throughput: 0: 53247.6. Samples: 1055183100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:28:39,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 14:28:40,985][52263] Updated weights for policy 0, policy_version 400679 (0.0028) [2024-04-27 14:28:43,814][52263] Updated weights for policy 0, policy_version 400689 (0.0030) [2024-04-27 14:28:44,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6564904960. Throughput: 0: 53314.5. Samples: 1055345380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-04-27 14:28:44,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 14:28:47,203][52263] Updated weights for policy 0, policy_version 400699 (0.0034) [2024-04-27 14:28:49,107][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 53372.9). Total num frames: 6565134336. Throughput: 0: 53417.7. Samples: 1055666920. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:28:49,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 14:28:49,982][52263] Updated weights for policy 0, policy_version 400709 (0.0027) [2024-04-27 14:28:53,348][52263] Updated weights for policy 0, policy_version 400719 (0.0031) [2024-04-27 14:28:54,107][52031] Fps is (10 sec: 50790.8, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6565412864. Throughput: 0: 53425.1. Samples: 1055988480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:28:54,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 14:28:56,073][52263] Updated weights for policy 0, policy_version 400729 (0.0030) [2024-04-27 14:28:59,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6565691392. Throughput: 0: 53282.5. Samples: 1056146140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:28:59,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 14:28:59,293][52263] Updated weights for policy 0, policy_version 400739 (0.0031) [2024-04-27 14:29:02,080][52263] Updated weights for policy 0, policy_version 400749 (0.0036) [2024-04-27 14:29:02,592][52242] Signal inference workers to stop experience collection... (15800 times) [2024-04-27 14:29:02,592][52242] Signal inference workers to resume experience collection... (15800 times) [2024-04-27 14:29:02,620][52263] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-04-27 14:29:02,620][52263] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-04-27 14:29:04,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6565969920. Throughput: 0: 53338.2. Samples: 1056469980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:04,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 14:29:05,443][52263] Updated weights for policy 0, policy_version 400759 (0.0032) [2024-04-27 14:29:08,156][52263] Updated weights for policy 0, policy_version 400769 (0.0029) [2024-04-27 14:29:09,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53525.2, 300 sec: 53539.6). Total num frames: 6566232064. Throughput: 0: 53406.6. Samples: 1056792100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:09,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 14:29:11,613][52263] Updated weights for policy 0, policy_version 400779 (0.0031) [2024-04-27 14:29:14,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.0, 300 sec: 53539.5). Total num frames: 6566510592. Throughput: 0: 53633.2. Samples: 1056958140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:14,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 14:29:14,306][52263] Updated weights for policy 0, policy_version 400789 (0.0030) [2024-04-27 14:29:17,663][52263] Updated weights for policy 0, policy_version 400799 (0.0036) [2024-04-27 14:29:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6566756352. Throughput: 0: 53750.9. Samples: 1057282540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:19,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 14:29:20,460][52263] Updated weights for policy 0, policy_version 400809 (0.0031) [2024-04-27 14:29:23,888][52263] Updated weights for policy 0, policy_version 400819 (0.0026) [2024-04-27 14:29:24,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6567034880. Throughput: 0: 53776.5. Samples: 1057603040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:24,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 14:29:26,502][52263] Updated weights for policy 0, policy_version 400829 (0.0029) [2024-04-27 14:29:29,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6567297024. Throughput: 0: 53590.8. Samples: 1057756960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:29,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:29:30,129][52263] Updated weights for policy 0, policy_version 400839 (0.0029) [2024-04-27 14:29:32,443][52263] Updated weights for policy 0, policy_version 400849 (0.0030) [2024-04-27 14:29:34,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6567575552. Throughput: 0: 53557.8. Samples: 1058077020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:34,108][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 14:29:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000400853_6567575552.pth... [2024-04-27 14:29:34,174][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000400070_6554746880.pth [2024-04-27 14:29:36,133][52263] Updated weights for policy 0, policy_version 400859 (0.0031) [2024-04-27 14:29:38,548][52263] Updated weights for policy 0, policy_version 400869 (0.0028) [2024-04-27 14:29:39,106][52031] Fps is (10 sec: 55706.0, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6567854080. Throughput: 0: 53617.0. Samples: 1058401240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:39,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 14:29:42,280][52263] Updated weights for policy 0, policy_version 400879 (0.0030) [2024-04-27 14:29:44,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6568099840. Throughput: 0: 53873.6. Samples: 1058570440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:44,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 14:29:44,773][52263] Updated weights for policy 0, policy_version 400889 (0.0033) [2024-04-27 14:29:48,363][52263] Updated weights for policy 0, policy_version 400899 (0.0037) [2024-04-27 14:29:49,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6568361984. Throughput: 0: 53722.7. Samples: 1058887500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:49,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 14:29:50,772][52263] Updated weights for policy 0, policy_version 400909 (0.0030) [2024-04-27 14:29:54,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6568640512. Throughput: 0: 53710.7. Samples: 1059209080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:54,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 14:29:54,513][52263] Updated weights for policy 0, policy_version 400919 (0.0026) [2024-04-27 14:29:56,440][52242] Signal inference workers to stop experience collection... (15850 times) [2024-04-27 14:29:56,442][52242] Signal inference workers to resume experience collection... (15850 times) [2024-04-27 14:29:56,466][52263] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-04-27 14:29:56,466][52263] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-04-27 14:29:56,840][52263] Updated weights for policy 0, policy_version 400929 (0.0031) [2024-04-27 14:29:59,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6568902656. Throughput: 0: 53528.6. Samples: 1059366920. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:29:59,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 14:30:00,487][52263] Updated weights for policy 0, policy_version 400939 (0.0031) [2024-04-27 14:30:03,334][52263] Updated weights for policy 0, policy_version 400949 (0.0031) [2024-04-27 14:30:04,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6569181184. Throughput: 0: 53466.2. Samples: 1059688520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 14:30:04,107][52031] Avg episode reward: [(0, '0.504')] [2024-04-27 14:30:06,597][52263] Updated weights for policy 0, policy_version 400959 (0.0030) [2024-04-27 14:30:09,107][52031] Fps is (10 sec: 55704.5, 60 sec: 53794.0, 300 sec: 53539.5). Total num frames: 6569459712. Throughput: 0: 53487.3. Samples: 1060009980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:09,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 14:30:09,310][52263] Updated weights for policy 0, policy_version 400969 (0.0032) [2024-04-27 14:30:12,760][52263] Updated weights for policy 0, policy_version 400979 (0.0032) [2024-04-27 14:30:14,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6569705472. Throughput: 0: 53701.3. Samples: 1060173520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:14,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 14:30:15,313][52263] Updated weights for policy 0, policy_version 400989 (0.0027) [2024-04-27 14:30:18,737][52263] Updated weights for policy 0, policy_version 400999 (0.0031) [2024-04-27 14:30:19,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6569967616. Throughput: 0: 53691.2. Samples: 1060493120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:19,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 14:30:21,501][52263] Updated weights for policy 0, policy_version 401009 (0.0026) [2024-04-27 14:30:24,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6570229760. Throughput: 0: 53542.2. Samples: 1060810640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:24,107][52031] Avg episode reward: [(0, '0.471')] [2024-04-27 14:30:24,733][52263] Updated weights for policy 0, policy_version 401019 (0.0031) [2024-04-27 14:30:27,573][52263] Updated weights for policy 0, policy_version 401029 (0.0031) [2024-04-27 14:30:29,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6570524672. Throughput: 0: 53422.9. Samples: 1060974480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:29,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 14:30:31,008][52263] Updated weights for policy 0, policy_version 401039 (0.0028) [2024-04-27 14:30:33,680][52263] Updated weights for policy 0, policy_version 401049 (0.0026) [2024-04-27 14:30:34,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6570786816. Throughput: 0: 53508.4. Samples: 1061295380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:34,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:30:37,052][52263] Updated weights for policy 0, policy_version 401059 (0.0034) [2024-04-27 14:30:39,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6571048960. Throughput: 0: 53417.8. Samples: 1061612880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:39,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 14:30:39,710][52263] Updated weights for policy 0, policy_version 401069 (0.0030) [2024-04-27 14:30:43,264][52263] Updated weights for policy 0, policy_version 401079 (0.0027) [2024-04-27 14:30:44,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6571311104. Throughput: 0: 53521.8. Samples: 1061775400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:44,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 14:30:45,954][52263] Updated weights for policy 0, policy_version 401089 (0.0033) [2024-04-27 14:30:49,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6571589632. Throughput: 0: 53368.0. Samples: 1062090080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:49,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 14:30:49,570][52263] Updated weights for policy 0, policy_version 401099 (0.0029) [2024-04-27 14:30:52,158][52263] Updated weights for policy 0, policy_version 401109 (0.0033) [2024-04-27 14:30:54,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6571851776. Throughput: 0: 53324.0. Samples: 1062409560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:54,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 14:30:55,108][52242] Signal inference workers to stop experience collection... (15900 times) [2024-04-27 14:30:55,108][52242] Signal inference workers to resume experience collection... 
(15900 times) [2024-04-27 14:30:55,143][52263] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-04-27 14:30:55,143][52263] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-04-27 14:30:55,558][52263] Updated weights for policy 0, policy_version 401119 (0.0034) [2024-04-27 14:30:58,916][52263] Updated weights for policy 0, policy_version 401129 (0.0026) [2024-04-27 14:30:59,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6572113920. Throughput: 0: 53340.3. Samples: 1062573820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:30:59,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 14:31:01,540][52263] Updated weights for policy 0, policy_version 401139 (0.0026) [2024-04-27 14:31:04,107][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6572408832. Throughput: 0: 53418.6. Samples: 1062896960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:31:04,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 14:31:04,906][52263] Updated weights for policy 0, policy_version 401149 (0.0029) [2024-04-27 14:31:07,654][52263] Updated weights for policy 0, policy_version 401159 (0.0030) [2024-04-27 14:31:09,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6572654592. Throughput: 0: 53545.5. Samples: 1063220180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:31:09,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 14:31:11,005][52263] Updated weights for policy 0, policy_version 401169 (0.0032) [2024-04-27 14:31:13,859][52263] Updated weights for policy 0, policy_version 401179 (0.0034) [2024-04-27 14:31:14,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6572916736. Throughput: 0: 53397.3. Samples: 1063377360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:31:14,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 14:31:17,060][52263] Updated weights for policy 0, policy_version 401189 (0.0031) [2024-04-27 14:31:19,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6573178880. Throughput: 0: 53401.0. Samples: 1063698420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:31:19,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 14:31:19,912][52263] Updated weights for policy 0, policy_version 401199 (0.0031) [2024-04-27 14:31:23,035][52263] Updated weights for policy 0, policy_version 401209 (0.0028) [2024-04-27 14:31:24,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6573457408. Throughput: 0: 53508.5. Samples: 1064020760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:31:24,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 14:31:26,108][52263] Updated weights for policy 0, policy_version 401219 (0.0027) [2024-04-27 14:31:29,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53248.2, 300 sec: 53428.6). Total num frames: 6573719552. Throughput: 0: 53495.6. Samples: 1064182700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 14:31:29,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 14:31:29,329][52263] Updated weights for policy 0, policy_version 401229 (0.0030) [2024-04-27 14:31:32,144][52263] Updated weights for policy 0, policy_version 401239 (0.0033) [2024-04-27 14:31:34,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6573998080. Throughput: 0: 53598.3. Samples: 1064502000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:31:34,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:31:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000401245_6573998080.pth... 
[2024-04-27 14:31:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000400461_6561153024.pth [2024-04-27 14:31:35,453][52263] Updated weights for policy 0, policy_version 401249 (0.0034) [2024-04-27 14:31:38,230][52263] Updated weights for policy 0, policy_version 401259 (0.0030) [2024-04-27 14:31:39,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6574243840. Throughput: 0: 53614.4. Samples: 1064822200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:31:39,107][52031] Avg episode reward: [(0, '0.655')] [2024-04-27 14:31:41,456][52263] Updated weights for policy 0, policy_version 401269 (0.0031) [2024-04-27 14:31:44,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6574538752. Throughput: 0: 53532.7. Samples: 1064982800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:31:44,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 14:31:44,307][52263] Updated weights for policy 0, policy_version 401279 (0.0027) [2024-04-27 14:31:47,653][52263] Updated weights for policy 0, policy_version 401289 (0.0029) [2024-04-27 14:31:49,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6574800896. Throughput: 0: 53545.4. Samples: 1065306500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:31:49,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 14:31:50,471][52263] Updated weights for policy 0, policy_version 401299 (0.0030) [2024-04-27 14:31:53,723][52263] Updated weights for policy 0, policy_version 401309 (0.0036) [2024-04-27 14:31:54,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6575063040. Throughput: 0: 53605.1. Samples: 1065632420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:31:54,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 14:31:56,414][52263] Updated weights for policy 0, policy_version 401319 (0.0027) [2024-04-27 14:31:59,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6575325184. Throughput: 0: 53469.6. Samples: 1065783480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:31:59,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 14:31:59,720][52263] Updated weights for policy 0, policy_version 401329 (0.0027) [2024-04-27 14:32:02,812][52263] Updated weights for policy 0, policy_version 401339 (0.0032) [2024-04-27 14:32:04,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6575603712. Throughput: 0: 53570.3. Samples: 1066109080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:04,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 14:32:05,780][52263] Updated weights for policy 0, policy_version 401349 (0.0026) [2024-04-27 14:32:08,277][52242] Signal inference workers to stop experience collection... (15950 times) [2024-04-27 14:32:08,316][52263] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-04-27 14:32:08,333][52242] Signal inference workers to resume experience collection... (15950 times) [2024-04-27 14:32:08,334][52263] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-04-27 14:32:08,757][52263] Updated weights for policy 0, policy_version 401359 (0.0028) [2024-04-27 14:32:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6575865856. Throughput: 0: 53550.7. Samples: 1066430540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:09,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 14:32:11,926][52263] Updated weights for policy 0, policy_version 401369 (0.0027) [2024-04-27 14:32:14,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6576128000. Throughput: 0: 53611.4. Samples: 1066595220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:14,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 14:32:15,407][52263] Updated weights for policy 0, policy_version 401379 (0.0036) [2024-04-27 14:32:18,006][52263] Updated weights for policy 0, policy_version 401389 (0.0031) [2024-04-27 14:32:19,106][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.3, 300 sec: 53595.2). Total num frames: 6576422912. Throughput: 0: 53681.0. Samples: 1066917640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:19,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 14:32:21,520][52263] Updated weights for policy 0, policy_version 401399 (0.0027) [2024-04-27 14:32:24,064][52263] Updated weights for policy 0, policy_version 401409 (0.0028) [2024-04-27 14:32:24,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6576685056. Throughput: 0: 53692.9. Samples: 1067238380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:24,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 14:32:27,564][52263] Updated weights for policy 0, policy_version 401419 (0.0028) [2024-04-27 14:32:29,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6576930816. Throughput: 0: 53682.2. Samples: 1067398500. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:29,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 14:32:30,239][52263] Updated weights for policy 0, policy_version 401429 (0.0027) [2024-04-27 14:32:33,511][52263] Updated weights for policy 0, policy_version 401439 (0.0027) [2024-04-27 14:32:34,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6577209344. Throughput: 0: 53600.0. Samples: 1067718500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:34,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 14:32:36,273][52263] Updated weights for policy 0, policy_version 401449 (0.0033) [2024-04-27 14:32:39,107][52031] Fps is (10 sec: 55705.1, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6577487872. Throughput: 0: 53495.0. Samples: 1068039700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:39,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 14:32:39,657][52263] Updated weights for policy 0, policy_version 401459 (0.0032) [2024-04-27 14:32:42,377][52263] Updated weights for policy 0, policy_version 401469 (0.0036) [2024-04-27 14:32:44,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6577750016. Throughput: 0: 53868.4. Samples: 1068207560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:44,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 14:32:45,812][52263] Updated weights for policy 0, policy_version 401479 (0.0026) [2024-04-27 14:32:48,441][52263] Updated weights for policy 0, policy_version 401489 (0.0035) [2024-04-27 14:32:49,108][52031] Fps is (10 sec: 54060.8, 60 sec: 53793.0, 300 sec: 53594.9). Total num frames: 6578028544. Throughput: 0: 53822.9. Samples: 1068531180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 14:32:49,108][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 14:32:51,748][52263] Updated weights for policy 0, policy_version 401499 (0.0034) [2024-04-27 14:32:54,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6578290688. Throughput: 0: 53784.9. Samples: 1068850860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:32:54,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 14:32:54,403][52263] Updated weights for policy 0, policy_version 401509 (0.0032) [2024-04-27 14:32:58,271][52263] Updated weights for policy 0, policy_version 401519 (0.0030) [2024-04-27 14:32:59,107][52031] Fps is (10 sec: 52435.2, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6578552832. Throughput: 0: 53719.1. Samples: 1069012580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:32:59,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 14:33:00,481][52263] Updated weights for policy 0, policy_version 401529 (0.0027) [2024-04-27 14:33:04,108][52031] Fps is (10 sec: 50782.9, 60 sec: 53246.8, 300 sec: 53484.7). Total num frames: 6578798592. Throughput: 0: 53606.3. Samples: 1069330000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:04,108][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 14:33:04,290][52263] Updated weights for policy 0, policy_version 401539 (0.0035) [2024-04-27 14:33:06,738][52263] Updated weights for policy 0, policy_version 401549 (0.0026) [2024-04-27 14:33:06,967][52242] Signal inference workers to stop experience collection... (16000 times) [2024-04-27 14:33:06,967][52242] Signal inference workers to resume experience collection... 
(16000 times) [2024-04-27 14:33:06,994][52263] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-04-27 14:33:06,994][52263] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-04-27 14:33:09,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6579077120. Throughput: 0: 53541.8. Samples: 1069647760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:09,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:33:10,599][52263] Updated weights for policy 0, policy_version 401559 (0.0032) [2024-04-27 14:33:12,764][52263] Updated weights for policy 0, policy_version 401569 (0.0029) [2024-04-27 14:33:14,107][52031] Fps is (10 sec: 57351.7, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6579372032. Throughput: 0: 53602.7. Samples: 1069810620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:14,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:33:16,761][52263] Updated weights for policy 0, policy_version 401579 (0.0028) [2024-04-27 14:33:19,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6579617792. Throughput: 0: 53642.2. Samples: 1070132400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:19,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 14:33:19,162][52263] Updated weights for policy 0, policy_version 401589 (0.0032) [2024-04-27 14:33:22,858][52263] Updated weights for policy 0, policy_version 401599 (0.0027) [2024-04-27 14:33:24,106][52031] Fps is (10 sec: 49152.6, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6579863552. Throughput: 0: 53562.5. Samples: 1070450000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:24,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 14:33:25,225][52263] Updated weights for policy 0, policy_version 401609 (0.0033) [2024-04-27 14:33:28,967][52263] Updated weights for policy 0, policy_version 401619 (0.0033) [2024-04-27 14:33:29,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6580125696. Throughput: 0: 53316.8. Samples: 1070606820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:29,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 14:33:31,341][52263] Updated weights for policy 0, policy_version 401629 (0.0031) [2024-04-27 14:33:34,106][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6580420608. Throughput: 0: 53158.8. Samples: 1070923260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:34,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 14:33:34,120][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000401638_6580436992.pth... [2024-04-27 14:33:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000400853_6567575552.pth [2024-04-27 14:33:35,117][52263] Updated weights for policy 0, policy_version 401639 (0.0027) [2024-04-27 14:33:37,535][52263] Updated weights for policy 0, policy_version 401649 (0.0033) [2024-04-27 14:33:39,107][52031] Fps is (10 sec: 57343.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6580699136. Throughput: 0: 53161.2. Samples: 1071243120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:39,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 14:33:41,433][52263] Updated weights for policy 0, policy_version 401659 (0.0030) [2024-04-27 14:33:43,768][52263] Updated weights for policy 0, policy_version 401669 (0.0039) [2024-04-27 14:33:44,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.0, 300 sec: 53650.7). 
Total num frames: 6580961280. Throughput: 0: 53373.8. Samples: 1071414400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:44,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 14:33:47,473][52263] Updated weights for policy 0, policy_version 401679 (0.0040) [2024-04-27 14:33:49,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53249.1, 300 sec: 53595.1). Total num frames: 6581223424. Throughput: 0: 53495.3. Samples: 1071737220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:49,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 14:33:49,752][52263] Updated weights for policy 0, policy_version 401689 (0.0031) [2024-04-27 14:33:53,644][52263] Updated weights for policy 0, policy_version 401699 (0.0038) [2024-04-27 14:33:54,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52974.8, 300 sec: 53484.1). Total num frames: 6581469184. Throughput: 0: 53593.7. Samples: 1072059480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:54,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 14:33:55,914][52263] Updated weights for policy 0, policy_version 401709 (0.0029) [2024-04-27 14:33:59,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6581731328. Throughput: 0: 53119.2. Samples: 1072200980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:33:59,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:33:59,710][52263] Updated weights for policy 0, policy_version 401719 (0.0035) [2024-04-27 14:34:02,030][52263] Updated weights for policy 0, policy_version 401729 (0.0035) [2024-04-27 14:34:02,119][52242] Signal inference workers to stop experience collection... (16050 times) [2024-04-27 14:34:02,160][52263] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-04-27 14:34:02,218][52242] Signal inference workers to resume experience collection... 
(16050 times) [2024-04-27 14:34:02,218][52263] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-04-27 14:34:04,106][52031] Fps is (10 sec: 57344.6, 60 sec: 54068.5, 300 sec: 53595.1). Total num frames: 6582042624. Throughput: 0: 53109.0. Samples: 1072522300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:34:04,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 14:34:06,012][52263] Updated weights for policy 0, policy_version 401739 (0.0039) [2024-04-27 14:34:08,124][52263] Updated weights for policy 0, policy_version 401749 (0.0035) [2024-04-27 14:34:09,106][52031] Fps is (10 sec: 58982.7, 60 sec: 54067.2, 300 sec: 53595.2). Total num frames: 6582321152. Throughput: 0: 53316.9. Samples: 1072849260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-27 14:34:09,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 14:34:12,023][52263] Updated weights for policy 0, policy_version 401759 (0.0030) [2024-04-27 14:34:14,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6582566912. Throughput: 0: 53681.0. Samples: 1073022460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:14,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 14:34:14,430][52263] Updated weights for policy 0, policy_version 401769 (0.0029) [2024-04-27 14:34:18,167][52263] Updated weights for policy 0, policy_version 401779 (0.0030) [2024-04-27 14:34:19,106][52031] Fps is (10 sec: 49151.8, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6582812672. Throughput: 0: 53696.9. Samples: 1073339620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:19,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 14:34:20,473][52263] Updated weights for policy 0, policy_version 401789 (0.0034) [2024-04-27 14:34:24,106][52031] Fps is (10 sec: 49152.1, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6583058432. Throughput: 0: 53695.3. Samples: 1073659400. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:24,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 14:34:24,139][52263] Updated weights for policy 0, policy_version 401799 (0.0028) [2024-04-27 14:34:26,529][52263] Updated weights for policy 0, policy_version 401809 (0.0027) [2024-04-27 14:34:29,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6583353344. Throughput: 0: 53338.2. Samples: 1073814620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:29,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 14:34:30,146][52263] Updated weights for policy 0, policy_version 401819 (0.0027) [2024-04-27 14:34:32,716][52263] Updated weights for policy 0, policy_version 401829 (0.0031) [2024-04-27 14:34:34,107][52031] Fps is (10 sec: 58981.3, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6583648256. Throughput: 0: 53310.2. Samples: 1074136180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:34,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 14:34:36,152][52263] Updated weights for policy 0, policy_version 401839 (0.0033) [2024-04-27 14:34:38,750][52263] Updated weights for policy 0, policy_version 401849 (0.0031) [2024-04-27 14:34:39,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6583926784. Throughput: 0: 53300.3. Samples: 1074458000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:39,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 14:34:42,346][52263] Updated weights for policy 0, policy_version 401859 (0.0027) [2024-04-27 14:34:44,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6584172544. Throughput: 0: 53928.9. Samples: 1074627780. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:44,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 14:34:44,689][52263] Updated weights for policy 0, policy_version 401869 (0.0026) [2024-04-27 14:34:48,455][52263] Updated weights for policy 0, policy_version 401879 (0.0030) [2024-04-27 14:34:49,107][52031] Fps is (10 sec: 49151.9, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6584418304. Throughput: 0: 53964.6. Samples: 1074950720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:49,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:34:50,725][52263] Updated weights for policy 0, policy_version 401889 (0.0032) [2024-04-27 14:34:54,106][52031] Fps is (10 sec: 49152.2, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6584664064. Throughput: 0: 53811.2. Samples: 1075270760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:54,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 14:34:54,446][52263] Updated weights for policy 0, policy_version 401899 (0.0025) [2024-04-27 14:34:56,879][52263] Updated weights for policy 0, policy_version 401909 (0.0028) [2024-04-27 14:34:59,107][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6584975360. Throughput: 0: 53452.7. Samples: 1075427840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:34:59,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 14:35:00,536][52263] Updated weights for policy 0, policy_version 401919 (0.0031) [2024-04-27 14:35:02,998][52263] Updated weights for policy 0, policy_version 401929 (0.0037) [2024-04-27 14:35:04,107][52031] Fps is (10 sec: 60619.8, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6585270272. Throughput: 0: 53623.4. Samples: 1075752680. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:35:04,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 14:35:06,503][52263] Updated weights for policy 0, policy_version 401939 (0.0034) [2024-04-27 14:35:08,336][52242] Signal inference workers to stop experience collection... (16100 times) [2024-04-27 14:35:08,336][52242] Signal inference workers to resume experience collection... (16100 times) [2024-04-27 14:35:08,349][52263] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-04-27 14:35:08,349][52263] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-04-27 14:35:09,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53248.0, 300 sec: 53595.2). Total num frames: 6585516032. Throughput: 0: 53588.9. Samples: 1076070900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:35:09,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 14:35:09,119][52263] Updated weights for policy 0, policy_version 401949 (0.0027) [2024-04-27 14:35:12,584][52263] Updated weights for policy 0, policy_version 401959 (0.0027) [2024-04-27 14:35:14,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6585778176. Throughput: 0: 53826.2. Samples: 1076236800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:35:14,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 14:35:15,279][52263] Updated weights for policy 0, policy_version 401969 (0.0028) [2024-04-27 14:35:18,563][52263] Updated weights for policy 0, policy_version 401979 (0.0030) [2024-04-27 14:35:19,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6586040320. Throughput: 0: 53906.9. Samples: 1076561980. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:35:19,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 14:35:21,409][52263] Updated weights for policy 0, policy_version 401989 (0.0026) [2024-04-27 14:35:24,107][52031] Fps is (10 sec: 52428.7, 60 sec: 54067.1, 300 sec: 53484.0). Total num frames: 6586302464. Throughput: 0: 53873.0. Samples: 1076882280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:35:24,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 14:35:24,702][52263] Updated weights for policy 0, policy_version 401999 (0.0035) [2024-04-27 14:35:27,463][52263] Updated weights for policy 0, policy_version 402009 (0.0031) [2024-04-27 14:35:29,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6586580992. Throughput: 0: 53531.1. Samples: 1077036680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:35:29,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 14:35:30,934][52263] Updated weights for policy 0, policy_version 402019 (0.0030) [2024-04-27 14:35:33,471][52263] Updated weights for policy 0, policy_version 402029 (0.0038) [2024-04-27 14:35:34,107][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6586875904. Throughput: 0: 53566.7. Samples: 1077361220. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:35:34,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 14:35:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000402031_6586875904.pth... [2024-04-27 14:35:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000401245_6573998080.pth [2024-04-27 14:35:37,229][52263] Updated weights for policy 0, policy_version 402039 (0.0034) [2024-04-27 14:35:39,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 6587138048. Throughput: 0: 53636.7. Samples: 1077684420. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:35:39,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 14:35:39,576][52263] Updated weights for policy 0, policy_version 402049 (0.0027) [2024-04-27 14:35:43,132][52263] Updated weights for policy 0, policy_version 402059 (0.0029) [2024-04-27 14:35:44,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6587383808. Throughput: 0: 53715.1. Samples: 1077845020. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:35:44,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 14:35:45,621][52263] Updated weights for policy 0, policy_version 402069 (0.0023) [2024-04-27 14:35:49,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6587645952. Throughput: 0: 53596.5. Samples: 1078164520. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:35:49,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 14:35:49,190][52263] Updated weights for policy 0, policy_version 402079 (0.0030) [2024-04-27 14:35:52,027][52263] Updated weights for policy 0, policy_version 402089 (0.0027) [2024-04-27 14:35:54,107][52031] Fps is (10 sec: 52429.1, 60 sec: 54067.1, 300 sec: 53539.5). Total num frames: 6587908096. Throughput: 0: 53756.3. Samples: 1078489940. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:35:54,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 14:35:55,379][52263] Updated weights for policy 0, policy_version 402099 (0.0030) [2024-04-27 14:35:58,196][52263] Updated weights for policy 0, policy_version 402109 (0.0032) [2024-04-27 14:35:59,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53521.3, 300 sec: 53484.1). Total num frames: 6588186624. Throughput: 0: 53577.6. Samples: 1078647780. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:35:59,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 14:36:01,511][52263] Updated weights for policy 0, policy_version 402119 (0.0033) [2024-04-27 14:36:04,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6588465152. Throughput: 0: 53454.1. Samples: 1078967420. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:04,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 14:36:04,365][52263] Updated weights for policy 0, policy_version 402129 (0.0031) [2024-04-27 14:36:07,499][52263] Updated weights for policy 0, policy_version 402139 (0.0032) [2024-04-27 14:36:09,107][52031] Fps is (10 sec: 52427.5, 60 sec: 53247.8, 300 sec: 53539.6). Total num frames: 6588710912. Throughput: 0: 53499.5. Samples: 1079289760. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:09,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 14:36:10,423][52263] Updated weights for policy 0, policy_version 402149 (0.0028) [2024-04-27 14:36:12,487][52242] Signal inference workers to stop experience collection... (16150 times) [2024-04-27 14:36:12,487][52242] Signal inference workers to resume experience collection... (16150 times) [2024-04-27 14:36:12,515][52263] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-04-27 14:36:12,515][52263] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-04-27 14:36:13,698][52263] Updated weights for policy 0, policy_version 402159 (0.0033) [2024-04-27 14:36:14,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6588989440. Throughput: 0: 53586.2. Samples: 1079448060. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:14,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 14:36:16,527][52263] Updated weights for policy 0, policy_version 402169 (0.0027) [2024-04-27 14:36:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6589235200. Throughput: 0: 53433.7. Samples: 1079765740. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:19,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 14:36:20,005][52263] Updated weights for policy 0, policy_version 402179 (0.0027) [2024-04-27 14:36:22,637][52263] Updated weights for policy 0, policy_version 402189 (0.0033) [2024-04-27 14:36:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6589513728. Throughput: 0: 53276.1. Samples: 1080081840. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:24,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 14:36:26,167][52263] Updated weights for policy 0, policy_version 402199 (0.0027) [2024-04-27 14:36:28,701][52263] Updated weights for policy 0, policy_version 402209 (0.0033) [2024-04-27 14:36:29,107][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6589808640. Throughput: 0: 53493.4. Samples: 1080252220. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:29,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 14:36:32,286][52263] Updated weights for policy 0, policy_version 402219 (0.0035) [2024-04-27 14:36:34,106][52031] Fps is (10 sec: 54067.8, 60 sec: 52975.1, 300 sec: 53595.2). Total num frames: 6590054400. Throughput: 0: 53508.6. Samples: 1080572400. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:34,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 14:36:34,851][52263] Updated weights for policy 0, policy_version 402229 (0.0029) [2024-04-27 14:36:38,292][52263] Updated weights for policy 0, policy_version 402239 (0.0035) [2024-04-27 14:36:39,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6590316544. Throughput: 0: 53309.8. Samples: 1080888880. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:39,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 14:36:41,118][52263] Updated weights for policy 0, policy_version 402249 (0.0034) [2024-04-27 14:36:44,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6590595072. Throughput: 0: 53135.3. Samples: 1081038880. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:44,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 14:36:44,383][52263] Updated weights for policy 0, policy_version 402259 (0.0030) [2024-04-27 14:36:47,273][52263] Updated weights for policy 0, policy_version 402269 (0.0033) [2024-04-27 14:36:49,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6590857216. Throughput: 0: 53193.9. Samples: 1081361140. Policy #0 lag: (min: 1.0, avg: 8.1, max: 20.0) [2024-04-27 14:36:49,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 14:36:50,431][52263] Updated weights for policy 0, policy_version 402279 (0.0032) [2024-04-27 14:36:53,554][52263] Updated weights for policy 0, policy_version 402289 (0.0028) [2024-04-27 14:36:54,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6591135744. Throughput: 0: 53376.5. Samples: 1081691700. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:36:54,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 14:36:56,556][52263] Updated weights for policy 0, policy_version 402299 (0.0028) [2024-04-27 14:36:59,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 6591414272. Throughput: 0: 53397.3. Samples: 1081850940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:36:59,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 14:36:59,561][52263] Updated weights for policy 0, policy_version 402309 (0.0031) [2024-04-27 14:37:02,579][52263] Updated weights for policy 0, policy_version 402319 (0.0035) [2024-04-27 14:37:04,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52975.1, 300 sec: 53484.1). Total num frames: 6591643648. Throughput: 0: 53441.2. Samples: 1082170580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:04,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 14:37:05,620][52263] Updated weights for policy 0, policy_version 402329 (0.0027) [2024-04-27 14:37:08,809][52263] Updated weights for policy 0, policy_version 402339 (0.0027) [2024-04-27 14:37:09,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6591938560. Throughput: 0: 53503.0. Samples: 1082489480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:09,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 14:37:11,820][52263] Updated weights for policy 0, policy_version 402349 (0.0030) [2024-04-27 14:37:14,106][52031] Fps is (10 sec: 52428.3, 60 sec: 52974.9, 300 sec: 53372.9). Total num frames: 6592167936. Throughput: 0: 53348.6. Samples: 1082652900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:14,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 14:37:15,053][52263] Updated weights for policy 0, policy_version 402359 (0.0027) [2024-04-27 14:37:17,940][52263] Updated weights for policy 0, policy_version 402369 (0.0033) [2024-04-27 14:37:19,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6592462848. Throughput: 0: 53309.6. Samples: 1082971340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:19,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 14:37:20,964][52263] Updated weights for policy 0, policy_version 402379 (0.0030) [2024-04-27 14:37:23,935][52263] Updated weights for policy 0, policy_version 402389 (0.0028) [2024-04-27 14:37:24,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6592741376. Throughput: 0: 53442.2. Samples: 1083293780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:24,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 14:37:27,036][52263] Updated weights for policy 0, policy_version 402399 (0.0028) [2024-04-27 14:37:27,820][52242] Signal inference workers to stop experience collection... (16200 times) [2024-04-27 14:37:27,855][52263] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-04-27 14:37:27,890][52242] Signal inference workers to resume experience collection... (16200 times) [2024-04-27 14:37:27,890][52263] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-04-27 14:37:29,106][52031] Fps is (10 sec: 52429.2, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6592987136. Throughput: 0: 53796.6. Samples: 1083459720. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:29,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 14:37:30,050][52263] Updated weights for policy 0, policy_version 402409 (0.0025) [2024-04-27 14:37:33,252][52263] Updated weights for policy 0, policy_version 402419 (0.0031) [2024-04-27 14:37:34,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6593265664. Throughput: 0: 53675.9. Samples: 1083776560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:34,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 14:37:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000402421_6593265664.pth... [2024-04-27 14:37:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000401638_6580436992.pth [2024-04-27 14:37:36,287][52263] Updated weights for policy 0, policy_version 402429 (0.0035) [2024-04-27 14:37:39,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6593544192. Throughput: 0: 53526.6. Samples: 1084100400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:39,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 14:37:39,216][52263] Updated weights for policy 0, policy_version 402439 (0.0029) [2024-04-27 14:37:42,429][52263] Updated weights for policy 0, policy_version 402449 (0.0031) [2024-04-27 14:37:44,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.2, 300 sec: 53484.3). Total num frames: 6593806336. Throughput: 0: 53605.4. Samples: 1084263180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:44,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 14:37:45,239][52263] Updated weights for policy 0, policy_version 402459 (0.0031) [2024-04-27 14:37:48,513][52263] Updated weights for policy 0, policy_version 402469 (0.0025) [2024-04-27 14:37:49,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.0, 300 sec: 53539.5). 
Total num frames: 6594084864. Throughput: 0: 53613.9. Samples: 1084583220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:49,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 14:37:51,448][52263] Updated weights for policy 0, policy_version 402479 (0.0026) [2024-04-27 14:37:54,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6594330624. Throughput: 0: 53587.5. Samples: 1084900920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:37:54,550][52263] Updated weights for policy 0, policy_version 402489 (0.0031) [2024-04-27 14:37:57,785][52263] Updated weights for policy 0, policy_version 402499 (0.0031) [2024-04-27 14:37:59,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53595.4). Total num frames: 6594609152. Throughput: 0: 53677.3. Samples: 1085068380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:37:59,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 14:38:00,717][52263] Updated weights for policy 0, policy_version 402509 (0.0026) [2024-04-27 14:38:03,937][52263] Updated weights for policy 0, policy_version 402519 (0.0031) [2024-04-27 14:38:04,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6594871296. Throughput: 0: 53640.4. Samples: 1085385160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:38:04,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 14:38:06,896][52263] Updated weights for policy 0, policy_version 402529 (0.0029) [2024-04-27 14:38:09,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6595133440. Throughput: 0: 53564.0. Samples: 1085704160. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 14:38:09,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 14:38:10,491][52263] Updated weights for policy 0, policy_version 402539 (0.0026) [2024-04-27 14:38:12,945][52263] Updated weights for policy 0, policy_version 402549 (0.0030) [2024-04-27 14:38:14,107][52031] Fps is (10 sec: 54066.9, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6595411968. Throughput: 0: 53420.3. Samples: 1085863640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:14,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 14:38:16,738][52263] Updated weights for policy 0, policy_version 402559 (0.0025) [2024-04-27 14:38:19,028][52263] Updated weights for policy 0, policy_version 402569 (0.0033) [2024-04-27 14:38:19,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6595690496. Throughput: 0: 53584.5. Samples: 1086187860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:19,107][52031] Avg episode reward: [(0, '0.510')] [2024-04-27 14:38:22,722][52263] Updated weights for policy 0, policy_version 402579 (0.0033) [2024-04-27 14:38:24,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6595936256. Throughput: 0: 53473.9. Samples: 1086506720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:24,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 14:38:24,966][52242] Signal inference workers to stop experience collection... (16250 times) [2024-04-27 14:38:24,967][52242] Signal inference workers to resume experience collection... 
(16250 times) [2024-04-27 14:38:24,994][52263] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-04-27 14:38:24,994][52263] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-04-27 14:38:25,260][52263] Updated weights for policy 0, policy_version 402589 (0.0030) [2024-04-27 14:38:28,709][52263] Updated weights for policy 0, policy_version 402599 (0.0032) [2024-04-27 14:38:29,107][52031] Fps is (10 sec: 49151.5, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6596182016. Throughput: 0: 53225.7. Samples: 1086658340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:29,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 14:38:31,551][52263] Updated weights for policy 0, policy_version 402609 (0.0029) [2024-04-27 14:38:34,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6596493312. Throughput: 0: 53258.4. Samples: 1086979840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:34,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 14:38:34,860][52263] Updated weights for policy 0, policy_version 402619 (0.0026) [2024-04-27 14:38:37,696][52263] Updated weights for policy 0, policy_version 402629 (0.0027) [2024-04-27 14:38:39,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6596739072. Throughput: 0: 53367.0. Samples: 1087302420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:39,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 14:38:40,984][52263] Updated weights for policy 0, policy_version 402639 (0.0028) [2024-04-27 14:38:43,969][52263] Updated weights for policy 0, policy_version 402649 (0.0029) [2024-04-27 14:38:44,107][52031] Fps is (10 sec: 50789.3, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6597001216. Throughput: 0: 53375.0. Samples: 1087470260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:44,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 14:38:47,124][52263] Updated weights for policy 0, policy_version 402659 (0.0028) [2024-04-27 14:38:49,106][52031] Fps is (10 sec: 52428.3, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6597263360. Throughput: 0: 53465.4. Samples: 1087791100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:49,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 14:38:50,012][52263] Updated weights for policy 0, policy_version 402669 (0.0028) [2024-04-27 14:38:53,360][52263] Updated weights for policy 0, policy_version 402679 (0.0030) [2024-04-27 14:38:54,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6597541888. Throughput: 0: 53472.3. Samples: 1088110420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:54,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 14:38:55,969][52263] Updated weights for policy 0, policy_version 402689 (0.0028) [2024-04-27 14:38:59,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6597804032. Throughput: 0: 53277.4. Samples: 1088261120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:38:59,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 14:38:59,583][52263] Updated weights for policy 0, policy_version 402699 (0.0032) [2024-04-27 14:39:02,157][52263] Updated weights for policy 0, policy_version 402709 (0.0032) [2024-04-27 14:39:04,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6598082560. Throughput: 0: 53208.4. Samples: 1088582240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:39:04,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 14:39:05,712][52263] Updated weights for policy 0, policy_version 402719 (0.0030) [2024-04-27 14:39:08,360][52263] Updated weights for policy 0, policy_version 402729 (0.0030) [2024-04-27 14:39:09,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6598344704. Throughput: 0: 53364.3. Samples: 1088908120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:39:09,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 14:39:11,730][52263] Updated weights for policy 0, policy_version 402739 (0.0027) [2024-04-27 14:39:14,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6598606848. Throughput: 0: 53752.9. Samples: 1089077220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:39:14,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 14:39:14,311][52263] Updated weights for policy 0, policy_version 402749 (0.0030) [2024-04-27 14:39:17,705][52263] Updated weights for policy 0, policy_version 402759 (0.0029) [2024-04-27 14:39:19,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.9, 300 sec: 53650.6). Total num frames: 6598885376. Throughput: 0: 53712.3. Samples: 1089396900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:39:19,107][52031] Avg episode reward: [(0, '0.484')] [2024-04-27 14:39:20,336][52263] Updated weights for policy 0, policy_version 402769 (0.0032) [2024-04-27 14:39:24,011][52263] Updated weights for policy 0, policy_version 402779 (0.0030) [2024-04-27 14:39:24,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6599131136. Throughput: 0: 53554.1. Samples: 1089712360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:39:24,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 14:39:26,621][52263] Updated weights for policy 0, policy_version 402789 (0.0031) [2024-04-27 14:39:29,107][52031] Fps is (10 sec: 54067.3, 60 sec: 54067.2, 300 sec: 53484.0). Total num frames: 6599426048. Throughput: 0: 53374.8. Samples: 1089872120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 14:39:29,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 14:39:30,008][52263] Updated weights for policy 0, policy_version 402799 (0.0028) [2024-04-27 14:39:32,771][52263] Updated weights for policy 0, policy_version 402809 (0.0028) [2024-04-27 14:39:34,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6599688192. Throughput: 0: 53362.1. Samples: 1090192400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:39:34,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 14:39:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000402813_6599688192.pth... [2024-04-27 14:39:34,170][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000402031_6586875904.pth [2024-04-27 14:39:35,867][52242] Signal inference workers to stop experience collection... (16300 times) [2024-04-27 14:39:35,868][52242] Signal inference workers to resume experience collection... (16300 times) [2024-04-27 14:39:35,877][52263] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-04-27 14:39:35,887][52263] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-04-27 14:39:35,996][52263] Updated weights for policy 0, policy_version 402819 (0.0037) [2024-04-27 14:39:39,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6599933952. Throughput: 0: 53268.1. Samples: 1090507480. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:39:39,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:39:39,115][52263] Updated weights for policy 0, policy_version 402829 (0.0029) [2024-04-27 14:39:42,446][52263] Updated weights for policy 0, policy_version 402839 (0.0031) [2024-04-27 14:39:44,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6600196096. Throughput: 0: 53620.5. Samples: 1090674040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:39:44,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 14:39:45,181][52263] Updated weights for policy 0, policy_version 402849 (0.0033) [2024-04-27 14:39:48,622][52263] Updated weights for policy 0, policy_version 402859 (0.0026) [2024-04-27 14:39:49,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6600474624. Throughput: 0: 53567.2. Samples: 1090992760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:39:49,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 14:39:51,278][52263] Updated weights for policy 0, policy_version 402869 (0.0027) [2024-04-27 14:39:54,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6600736768. Throughput: 0: 53430.4. Samples: 1091312480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:39:54,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 14:39:54,640][52263] Updated weights for policy 0, policy_version 402879 (0.0030) [2024-04-27 14:39:57,543][52263] Updated weights for policy 0, policy_version 402889 (0.0033) [2024-04-27 14:39:59,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6601015296. Throughput: 0: 53209.7. Samples: 1091471660. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:39:59,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 14:40:00,762][52263] Updated weights for policy 0, policy_version 402899 (0.0036) [2024-04-27 14:40:03,759][52263] Updated weights for policy 0, policy_version 402909 (0.0027) [2024-04-27 14:40:04,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6601277440. Throughput: 0: 53166.7. Samples: 1091789400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:04,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 14:40:06,899][52263] Updated weights for policy 0, policy_version 402919 (0.0026) [2024-04-27 14:40:09,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6601572352. Throughput: 0: 53291.1. Samples: 1092110460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:09,107][52031] Avg episode reward: [(0, '0.502')] [2024-04-27 14:40:09,902][52263] Updated weights for policy 0, policy_version 402929 (0.0031) [2024-04-27 14:40:12,968][52263] Updated weights for policy 0, policy_version 402939 (0.0031) [2024-04-27 14:40:14,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6601818112. Throughput: 0: 53364.1. Samples: 1092273500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:14,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 14:40:15,978][52263] Updated weights for policy 0, policy_version 402949 (0.0027) [2024-04-27 14:40:19,009][52263] Updated weights for policy 0, policy_version 402959 (0.0028) [2024-04-27 14:40:19,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6602080256. Throughput: 0: 53373.1. Samples: 1092594180. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:19,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 14:40:22,128][52263] Updated weights for policy 0, policy_version 402969 (0.0029) [2024-04-27 14:40:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6602342400. Throughput: 0: 53463.6. Samples: 1092913340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:24,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:40:25,113][52263] Updated weights for policy 0, policy_version 402979 (0.0031) [2024-04-27 14:40:28,335][52263] Updated weights for policy 0, policy_version 402989 (0.0028) [2024-04-27 14:40:29,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 6602620928. Throughput: 0: 53173.9. Samples: 1093066860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:29,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 14:40:31,275][52263] Updated weights for policy 0, policy_version 402999 (0.0029) [2024-04-27 14:40:32,595][52242] Signal inference workers to stop experience collection... (16350 times) [2024-04-27 14:40:32,596][52242] Signal inference workers to resume experience collection... (16350 times) [2024-04-27 14:40:32,608][52263] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-04-27 14:40:32,609][52263] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-04-27 14:40:34,107][52031] Fps is (10 sec: 52427.3, 60 sec: 52974.8, 300 sec: 53317.4). Total num frames: 6602866688. Throughput: 0: 53341.0. Samples: 1093393120. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:34,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 14:40:34,360][52263] Updated weights for policy 0, policy_version 403009 (0.0034) [2024-04-27 14:40:37,396][52263] Updated weights for policy 0, policy_version 403019 (0.0029) [2024-04-27 14:40:39,107][52031] Fps is (10 sec: 55704.3, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6603177984. Throughput: 0: 53441.6. Samples: 1093717360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:39,107][52031] Avg episode reward: [(0, '0.504')] [2024-04-27 14:40:40,370][52263] Updated weights for policy 0, policy_version 403029 (0.0030) [2024-04-27 14:40:43,460][52263] Updated weights for policy 0, policy_version 403039 (0.0032) [2024-04-27 14:40:44,106][52031] Fps is (10 sec: 55707.0, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6603423744. Throughput: 0: 53535.3. Samples: 1093880740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:44,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 14:40:46,503][52263] Updated weights for policy 0, policy_version 403049 (0.0030) [2024-04-27 14:40:49,106][52031] Fps is (10 sec: 49152.6, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6603669504. Throughput: 0: 53668.5. Samples: 1094204480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 14:40:49,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 14:40:49,552][52263] Updated weights for policy 0, policy_version 403059 (0.0034) [2024-04-27 14:40:52,830][52263] Updated weights for policy 0, policy_version 403069 (0.0031) [2024-04-27 14:40:54,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6603948032. Throughput: 0: 53671.7. Samples: 1094525680. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:40:54,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 14:40:55,972][52263] Updated weights for policy 0, policy_version 403079 (0.0030) [2024-04-27 14:40:58,968][52263] Updated weights for policy 0, policy_version 403089 (0.0026) [2024-04-27 14:40:59,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6604210176. Throughput: 0: 53546.1. Samples: 1094683080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:40:59,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 14:41:01,976][52263] Updated weights for policy 0, policy_version 403099 (0.0033) [2024-04-27 14:41:04,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6604488704. Throughput: 0: 53450.5. Samples: 1094999460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:04,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 14:41:05,003][52263] Updated weights for policy 0, policy_version 403109 (0.0032) [2024-04-27 14:41:08,023][52263] Updated weights for policy 0, policy_version 403119 (0.0031) [2024-04-27 14:41:09,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6604783616. Throughput: 0: 53476.8. Samples: 1095319800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:09,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 14:41:11,267][52263] Updated weights for policy 0, policy_version 403129 (0.0038) [2024-04-27 14:41:14,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6605012992. Throughput: 0: 53730.1. Samples: 1095484720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:14,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 14:41:14,164][52263] Updated weights for policy 0, policy_version 403139 (0.0028) [2024-04-27 14:41:17,193][52263] Updated weights for policy 0, policy_version 403149 (0.0029) [2024-04-27 14:41:19,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6605291520. Throughput: 0: 53705.5. Samples: 1095809860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:19,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 14:41:20,243][52263] Updated weights for policy 0, policy_version 403159 (0.0040) [2024-04-27 14:41:23,426][52263] Updated weights for policy 0, policy_version 403169 (0.0029) [2024-04-27 14:41:24,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6605570048. Throughput: 0: 53596.9. Samples: 1096129220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:24,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 14:41:26,311][52263] Updated weights for policy 0, policy_version 403179 (0.0031) [2024-04-27 14:41:29,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6605815808. Throughput: 0: 53408.8. Samples: 1096284140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:29,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 14:41:29,625][52263] Updated weights for policy 0, policy_version 403189 (0.0029) [2024-04-27 14:41:32,288][52263] Updated weights for policy 0, policy_version 403199 (0.0031) [2024-04-27 14:41:34,106][52031] Fps is (10 sec: 54067.8, 60 sec: 54067.4, 300 sec: 53539.6). Total num frames: 6606110720. Throughput: 0: 53430.2. Samples: 1096608840. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:34,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 14:41:34,206][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000403206_6606127104.pth... [2024-04-27 14:41:34,251][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000402421_6593265664.pth [2024-04-27 14:41:35,655][52263] Updated weights for policy 0, policy_version 403209 (0.0033) [2024-04-27 14:41:37,473][52242] Signal inference workers to stop experience collection... (16400 times) [2024-04-27 14:41:37,474][52242] Signal inference workers to resume experience collection... (16400 times) [2024-04-27 14:41:37,505][52263] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-04-27 14:41:37,505][52263] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-04-27 14:41:38,398][52263] Updated weights for policy 0, policy_version 403219 (0.0038) [2024-04-27 14:41:39,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6606389248. Throughput: 0: 53407.9. Samples: 1096929040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:39,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 14:41:41,771][52263] Updated weights for policy 0, policy_version 403229 (0.0031) [2024-04-27 14:41:44,107][52031] Fps is (10 sec: 50789.6, 60 sec: 53247.8, 300 sec: 53428.5). Total num frames: 6606618624. Throughput: 0: 53656.4. Samples: 1097097620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:44,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 14:41:44,477][52263] Updated weights for policy 0, policy_version 403239 (0.0034) [2024-04-27 14:41:48,056][52263] Updated weights for policy 0, policy_version 403249 (0.0029) [2024-04-27 14:41:49,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6606897152. Throughput: 0: 53709.4. 
Samples: 1097416380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:49,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 14:41:50,686][52263] Updated weights for policy 0, policy_version 403259 (0.0030) [2024-04-27 14:41:54,025][52263] Updated weights for policy 0, policy_version 403269 (0.0028) [2024-04-27 14:41:54,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6607159296. Throughput: 0: 53794.3. Samples: 1097740540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:54,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 14:41:56,704][52263] Updated weights for policy 0, policy_version 403279 (0.0027) [2024-04-27 14:41:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6607437824. Throughput: 0: 53658.7. Samples: 1097899360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:41:59,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 14:42:00,057][52263] Updated weights for policy 0, policy_version 403289 (0.0030) [2024-04-27 14:42:02,801][52263] Updated weights for policy 0, policy_version 403299 (0.0032) [2024-04-27 14:42:04,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6607716352. Throughput: 0: 53548.0. Samples: 1098219520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:42:04,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 14:42:06,013][52263] Updated weights for policy 0, policy_version 403309 (0.0028) [2024-04-27 14:42:08,988][52263] Updated weights for policy 0, policy_version 403319 (0.0024) [2024-04-27 14:42:09,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6607978496. Throughput: 0: 53656.6. Samples: 1098543760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 14:42:09,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 14:42:12,100][52263] Updated weights for policy 0, policy_version 403329 (0.0028) [2024-04-27 14:42:14,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6608240640. Throughput: 0: 53991.8. Samples: 1098713760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:14,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 14:42:14,949][52263] Updated weights for policy 0, policy_version 403339 (0.0030) [2024-04-27 14:42:18,310][52263] Updated weights for policy 0, policy_version 403349 (0.0031) [2024-04-27 14:42:19,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6608502784. Throughput: 0: 53912.8. Samples: 1099034920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:19,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 14:42:21,006][52263] Updated weights for policy 0, policy_version 403359 (0.0025) [2024-04-27 14:42:24,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6608764928. Throughput: 0: 53929.0. Samples: 1099355840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:24,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 14:42:24,365][52263] Updated weights for policy 0, policy_version 403369 (0.0027) [2024-04-27 14:42:27,195][52263] Updated weights for policy 0, policy_version 403379 (0.0029) [2024-04-27 14:42:29,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6609043456. Throughput: 0: 53657.0. Samples: 1099512180. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:29,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 14:42:30,582][52263] Updated weights for policy 0, policy_version 403389 (0.0030) [2024-04-27 14:42:33,324][52263] Updated weights for policy 0, policy_version 403399 (0.0028) [2024-04-27 14:42:33,692][52242] Signal inference workers to stop experience collection... (16450 times) [2024-04-27 14:42:33,737][52263] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-04-27 14:42:33,750][52242] Signal inference workers to resume experience collection... (16450 times) [2024-04-27 14:42:33,756][52263] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-04-27 14:42:34,106][52031] Fps is (10 sec: 58982.0, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6609354752. Throughput: 0: 53817.4. Samples: 1099838160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:34,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 14:42:36,578][52263] Updated weights for policy 0, policy_version 403409 (0.0024) [2024-04-27 14:42:39,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6609584128. Throughput: 0: 53739.4. Samples: 1100158820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:39,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 14:42:39,380][52263] Updated weights for policy 0, policy_version 403419 (0.0037) [2024-04-27 14:42:42,611][52263] Updated weights for policy 0, policy_version 403429 (0.0033) [2024-04-27 14:42:44,106][52031] Fps is (10 sec: 49152.2, 60 sec: 53794.3, 300 sec: 53428.5). Total num frames: 6609846272. Throughput: 0: 53826.2. Samples: 1100321540. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:44,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 14:42:45,402][52263] Updated weights for policy 0, policy_version 403439 (0.0027) [2024-04-27 14:42:48,712][52263] Updated weights for policy 0, policy_version 403449 (0.0030) [2024-04-27 14:42:49,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6610108416. Throughput: 0: 53863.1. Samples: 1100643360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:49,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 14:42:51,489][52263] Updated weights for policy 0, policy_version 403459 (0.0032) [2024-04-27 14:42:54,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6610386944. Throughput: 0: 53776.1. Samples: 1100963680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:54,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 14:42:54,697][52263] Updated weights for policy 0, policy_version 403469 (0.0027) [2024-04-27 14:42:57,585][52263] Updated weights for policy 0, policy_version 403479 (0.0028) [2024-04-27 14:42:59,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6610665472. Throughput: 0: 53573.1. Samples: 1101124560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:42:59,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 14:43:00,980][52263] Updated weights for policy 0, policy_version 403489 (0.0023) [2024-04-27 14:43:03,655][52263] Updated weights for policy 0, policy_version 403499 (0.0032) [2024-04-27 14:43:04,107][52031] Fps is (10 sec: 57343.3, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 6610960384. Throughput: 0: 53526.8. Samples: 1101443620. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:43:04,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 14:43:07,255][52263] Updated weights for policy 0, policy_version 403509 (0.0026) [2024-04-27 14:43:09,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6611189760. Throughput: 0: 53449.3. Samples: 1101761060. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:43:09,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 14:43:09,747][52263] Updated weights for policy 0, policy_version 403519 (0.0031) [2024-04-27 14:43:13,490][52263] Updated weights for policy 0, policy_version 403529 (0.0028) [2024-04-27 14:43:14,106][52031] Fps is (10 sec: 49152.1, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6611451904. Throughput: 0: 53572.1. Samples: 1101922920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:43:14,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 14:43:15,809][52263] Updated weights for policy 0, policy_version 403539 (0.0036) [2024-04-27 14:43:19,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6611730432. Throughput: 0: 53453.3. Samples: 1102243560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:43:19,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 14:43:19,514][52263] Updated weights for policy 0, policy_version 403549 (0.0036) [2024-04-27 14:43:22,040][52263] Updated weights for policy 0, policy_version 403559 (0.0036) [2024-04-27 14:43:24,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6611976192. Throughput: 0: 53396.6. Samples: 1102561660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:43:24,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:43:24,980][52242] Signal inference workers to stop experience collection... 
(16500 times) [2024-04-27 14:43:24,985][52242] Signal inference workers to resume experience collection... (16500 times) [2024-04-27 14:43:25,007][52263] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-04-27 14:43:25,007][52263] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-04-27 14:43:25,588][52263] Updated weights for policy 0, policy_version 403569 (0.0029) [2024-04-27 14:43:28,051][52263] Updated weights for policy 0, policy_version 403579 (0.0027) [2024-04-27 14:43:29,106][52031] Fps is (10 sec: 55705.8, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6612287488. Throughput: 0: 53560.4. Samples: 1102731760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 14:43:29,107][52031] Avg episode reward: [(0, '0.681')] [2024-04-27 14:43:31,803][52263] Updated weights for policy 0, policy_version 403589 (0.0028) [2024-04-27 14:43:34,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6612549632. Throughput: 0: 53438.3. Samples: 1103048080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:43:34,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 14:43:34,215][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000403599_6612566016.pth... [2024-04-27 14:43:34,218][52263] Updated weights for policy 0, policy_version 403599 (0.0029) [2024-04-27 14:43:34,261][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000402813_6599688192.pth [2024-04-27 14:43:37,764][52263] Updated weights for policy 0, policy_version 403609 (0.0027) [2024-04-27 14:43:39,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6612795392. Throughput: 0: 53387.9. Samples: 1103366140. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:43:39,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 14:43:40,385][52263] Updated weights for policy 0, policy_version 403619 (0.0039) [2024-04-27 14:43:43,784][52263] Updated weights for policy 0, policy_version 403629 (0.0031) [2024-04-27 14:43:44,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 6613073920. Throughput: 0: 53386.1. Samples: 1103526940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:43:44,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 14:43:46,465][52263] Updated weights for policy 0, policy_version 403639 (0.0030) [2024-04-27 14:43:49,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6613319680. Throughput: 0: 53312.3. Samples: 1103842680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:43:49,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 14:43:50,094][52263] Updated weights for policy 0, policy_version 403649 (0.0028) [2024-04-27 14:43:52,564][52263] Updated weights for policy 0, policy_version 403659 (0.0040) [2024-04-27 14:43:54,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6613598208. Throughput: 0: 53344.3. Samples: 1104161560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:43:54,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 14:43:56,261][52263] Updated weights for policy 0, policy_version 403669 (0.0029) [2024-04-27 14:43:58,652][52263] Updated weights for policy 0, policy_version 403679 (0.0031) [2024-04-27 14:43:59,107][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6613893120. Throughput: 0: 53378.2. Samples: 1104324940. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:43:59,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 14:44:02,259][52263] Updated weights for policy 0, policy_version 403689 (0.0030) [2024-04-27 14:44:04,106][52031] Fps is (10 sec: 52429.4, 60 sec: 52701.9, 300 sec: 53484.1). Total num frames: 6614122496. Throughput: 0: 53358.7. Samples: 1104644700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:04,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:44:04,956][52263] Updated weights for policy 0, policy_version 403699 (0.0033) [2024-04-27 14:44:08,329][52263] Updated weights for policy 0, policy_version 403709 (0.0029) [2024-04-27 14:44:09,106][52031] Fps is (10 sec: 49152.5, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6614384640. Throughput: 0: 53437.4. Samples: 1104966340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:09,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:44:11,090][52263] Updated weights for policy 0, policy_version 403719 (0.0030) [2024-04-27 14:44:14,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6614646784. Throughput: 0: 52934.3. Samples: 1105113800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:14,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 14:44:14,505][52263] Updated weights for policy 0, policy_version 403729 (0.0029) [2024-04-27 14:44:17,071][52263] Updated weights for policy 0, policy_version 403739 (0.0030) [2024-04-27 14:44:19,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6614941696. Throughput: 0: 53173.8. Samples: 1105440900. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:19,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 14:44:20,570][52263] Updated weights for policy 0, policy_version 403749 (0.0030) [2024-04-27 14:44:23,179][52263] Updated weights for policy 0, policy_version 403759 (0.0032) [2024-04-27 14:44:24,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6615203840. Throughput: 0: 53359.2. Samples: 1105767300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:24,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 14:44:26,571][52263] Updated weights for policy 0, policy_version 403769 (0.0029) [2024-04-27 14:44:28,968][52242] Signal inference workers to stop experience collection... (16550 times) [2024-04-27 14:44:29,011][52263] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-04-27 14:44:29,023][52242] Signal inference workers to resume experience collection... (16550 times) [2024-04-27 14:44:29,030][52263] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-04-27 14:44:29,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6615482368. Throughput: 0: 53437.6. Samples: 1105931620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:29,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 14:44:29,287][52263] Updated weights for policy 0, policy_version 403779 (0.0028) [2024-04-27 14:44:32,648][52263] Updated weights for policy 0, policy_version 403789 (0.0028) [2024-04-27 14:44:34,107][52031] Fps is (10 sec: 54065.4, 60 sec: 53247.8, 300 sec: 53595.1). Total num frames: 6615744512. Throughput: 0: 53583.8. Samples: 1106253960. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:34,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 14:44:35,344][52263] Updated weights for policy 0, policy_version 403799 (0.0025) [2024-04-27 14:44:38,923][52263] Updated weights for policy 0, policy_version 403809 (0.0034) [2024-04-27 14:44:39,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6616006656. Throughput: 0: 53722.8. Samples: 1106579080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:39,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:44:41,312][52263] Updated weights for policy 0, policy_version 403819 (0.0028) [2024-04-27 14:44:44,107][52031] Fps is (10 sec: 50791.4, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6616252416. Throughput: 0: 53371.5. Samples: 1106726660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:44,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 14:44:45,191][52263] Updated weights for policy 0, policy_version 403829 (0.0028) [2024-04-27 14:44:47,417][52263] Updated weights for policy 0, policy_version 403839 (0.0028) [2024-04-27 14:44:49,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6616547328. Throughput: 0: 53353.7. Samples: 1107045620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 14:44:49,107][52031] Avg episode reward: [(0, '0.515')] [2024-04-27 14:44:51,270][52263] Updated weights for policy 0, policy_version 403849 (0.0038) [2024-04-27 14:44:53,674][52263] Updated weights for policy 0, policy_version 403859 (0.0027) [2024-04-27 14:44:54,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6616825856. Throughput: 0: 53390.6. Samples: 1107368920. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:44:54,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 14:44:57,580][52263] Updated weights for policy 0, policy_version 403869 (0.0026) [2024-04-27 14:44:59,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6617088000. Throughput: 0: 53934.2. Samples: 1107540840. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:44:59,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 14:44:59,855][52263] Updated weights for policy 0, policy_version 403879 (0.0029) [2024-04-27 14:45:03,628][52263] Updated weights for policy 0, policy_version 403889 (0.0029) [2024-04-27 14:45:04,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6617333760. Throughput: 0: 53826.7. Samples: 1107863100. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:04,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 14:45:06,112][52263] Updated weights for policy 0, policy_version 403899 (0.0031) [2024-04-27 14:45:09,106][52031] Fps is (10 sec: 52428.2, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6617612288. Throughput: 0: 53656.8. Samples: 1108181860. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:09,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 14:45:09,663][52263] Updated weights for policy 0, policy_version 403909 (0.0032) [2024-04-27 14:45:12,250][52263] Updated weights for policy 0, policy_version 403919 (0.0031) [2024-04-27 14:45:14,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6617874432. Throughput: 0: 53450.1. Samples: 1108336880. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:14,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 14:45:15,735][52263] Updated weights for policy 0, policy_version 403929 (0.0029) [2024-04-27 14:45:18,442][52263] Updated weights for policy 0, policy_version 403939 (0.0029) [2024-04-27 14:45:19,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6618136576. Throughput: 0: 53417.7. Samples: 1108657740. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:19,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 14:45:22,040][52263] Updated weights for policy 0, policy_version 403949 (0.0030) [2024-04-27 14:45:24,106][52031] Fps is (10 sec: 57344.4, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 6618447872. Throughput: 0: 53268.0. Samples: 1108976140. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:24,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 14:45:24,507][52263] Updated weights for policy 0, policy_version 403959 (0.0037) [2024-04-27 14:45:28,062][52263] Updated weights for policy 0, policy_version 403969 (0.0034) [2024-04-27 14:45:29,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6618693632. Throughput: 0: 53812.6. Samples: 1109148220. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:29,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 14:45:30,462][52263] Updated weights for policy 0, policy_version 403979 (0.0028) [2024-04-27 14:45:34,106][52031] Fps is (10 sec: 49152.1, 60 sec: 53248.3, 300 sec: 53428.5). Total num frames: 6618939392. Throughput: 0: 53879.6. Samples: 1109470200. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:34,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 14:45:34,188][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000403989_6618955776.pth... 
[2024-04-27 14:45:34,201][52263] Updated weights for policy 0, policy_version 403989 (0.0031) [2024-04-27 14:45:34,247][52242] Signal inference workers to stop experience collection... (16600 times) [2024-04-27 14:45:34,247][52242] Signal inference workers to resume experience collection... (16600 times) [2024-04-27 14:45:34,250][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000403206_6606127104.pth [2024-04-27 14:45:34,262][52263] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-04-27 14:45:34,262][52263] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-04-27 14:45:36,672][52263] Updated weights for policy 0, policy_version 403999 (0.0030) [2024-04-27 14:45:39,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6619217920. Throughput: 0: 53852.0. Samples: 1109792260. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:39,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 14:45:40,420][52263] Updated weights for policy 0, policy_version 404009 (0.0027) [2024-04-27 14:45:42,945][52263] Updated weights for policy 0, policy_version 404019 (0.0032) [2024-04-27 14:45:44,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6619480064. Throughput: 0: 53453.2. Samples: 1109946240. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:44,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 14:45:46,379][52263] Updated weights for policy 0, policy_version 404029 (0.0032) [2024-04-27 14:45:48,979][52263] Updated weights for policy 0, policy_version 404039 (0.0025) [2024-04-27 14:45:49,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6619774976. Throughput: 0: 53444.3. Samples: 1110268100. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:49,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 14:45:52,509][52263] Updated weights for policy 0, policy_version 404049 (0.0038) [2024-04-27 14:45:54,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6620053504. Throughput: 0: 53504.9. Samples: 1110589580. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:54,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 14:45:55,357][52263] Updated weights for policy 0, policy_version 404059 (0.0034) [2024-04-27 14:45:58,514][52263] Updated weights for policy 0, policy_version 404069 (0.0029) [2024-04-27 14:45:59,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6620299264. Throughput: 0: 53715.5. Samples: 1110754080. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:45:59,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 14:46:01,952][52263] Updated weights for policy 0, policy_version 404079 (0.0027) [2024-04-27 14:46:04,106][52031] Fps is (10 sec: 49152.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6620545024. Throughput: 0: 53750.3. Samples: 1111076500. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:46:04,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 14:46:04,640][52263] Updated weights for policy 0, policy_version 404089 (0.0035) [2024-04-27 14:46:07,938][52263] Updated weights for policy 0, policy_version 404099 (0.0030) [2024-04-27 14:46:09,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6620823552. Throughput: 0: 53782.9. Samples: 1111396380. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:46:09,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 14:46:10,815][52263] Updated weights for policy 0, policy_version 404109 (0.0027) [2024-04-27 14:46:14,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6621069312. Throughput: 0: 53421.4. Samples: 1111552180. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-04-27 14:46:14,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 14:46:14,186][52263] Updated weights for policy 0, policy_version 404119 (0.0033) [2024-04-27 14:46:16,866][52263] Updated weights for policy 0, policy_version 404129 (0.0025) [2024-04-27 14:46:19,107][52031] Fps is (10 sec: 55705.2, 60 sec: 54067.0, 300 sec: 53595.1). Total num frames: 6621380608. Throughput: 0: 53341.0. Samples: 1111870560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:46:19,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 14:46:20,360][52263] Updated weights for policy 0, policy_version 404139 (0.0035) [2024-04-27 14:46:22,898][52263] Updated weights for policy 0, policy_version 404149 (0.0028) [2024-04-27 14:46:24,106][52031] Fps is (10 sec: 55705.6, 60 sec: 52975.0, 300 sec: 53595.1). Total num frames: 6621626368. Throughput: 0: 53305.0. Samples: 1112190980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:46:24,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 14:46:26,301][52263] Updated weights for policy 0, policy_version 404159 (0.0038) [2024-04-27 14:46:29,106][52031] Fps is (10 sec: 50792.0, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6621888512. Throughput: 0: 53628.6. Samples: 1112359520. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:46:29,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 14:46:29,196][52263] Updated weights for policy 0, policy_version 404169 (0.0036) [2024-04-27 14:46:29,323][52242] Signal inference workers to stop experience collection... (16650 times) [2024-04-27 14:46:29,351][52263] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-04-27 14:46:29,382][52242] Signal inference workers to resume experience collection... (16650 times) [2024-04-27 14:46:29,383][52263] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-04-27 14:46:32,217][52263] Updated weights for policy 0, policy_version 404179 (0.0027) [2024-04-27 14:46:34,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6622150656. Throughput: 0: 53608.6. Samples: 1112680480. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:46:34,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 14:46:35,148][52263] Updated weights for policy 0, policy_version 404189 (0.0029) [2024-04-27 14:46:38,301][52263] Updated weights for policy 0, policy_version 404199 (0.0031) [2024-04-27 14:46:39,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6622445568. Throughput: 0: 53595.6. Samples: 1113001380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:46:39,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 14:46:41,281][52263] Updated weights for policy 0, policy_version 404209 (0.0035) [2024-04-27 14:46:44,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6622691328. Throughput: 0: 53447.3. Samples: 1113159200. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:46:44,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 14:46:44,393][52263] Updated weights for policy 0, policy_version 404219 (0.0027) [2024-04-27 14:46:47,319][52263] Updated weights for policy 0, policy_version 404229 (0.0030) [2024-04-27 14:46:49,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6622969856. Throughput: 0: 53355.5. Samples: 1113477500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:46:49,108][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 14:46:50,731][52263] Updated weights for policy 0, policy_version 404239 (0.0034) [2024-04-27 14:46:53,533][52263] Updated weights for policy 0, policy_version 404249 (0.0026) [2024-04-27 14:46:54,106][52031] Fps is (10 sec: 55705.1, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6623248384. Throughput: 0: 53364.7. Samples: 1113797780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:46:54,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:46:56,923][52263] Updated weights for policy 0, policy_version 404259 (0.0033) [2024-04-27 14:46:59,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6623494144. Throughput: 0: 53494.2. Samples: 1113959420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:46:59,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 14:46:59,535][52263] Updated weights for policy 0, policy_version 404269 (0.0032) [2024-04-27 14:47:02,876][52263] Updated weights for policy 0, policy_version 404279 (0.0029) [2024-04-27 14:47:04,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53793.9, 300 sec: 53539.6). Total num frames: 6623772672. Throughput: 0: 53590.8. Samples: 1114282140. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:47:04,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 14:47:05,715][52263] Updated weights for policy 0, policy_version 404289 (0.0032) [2024-04-27 14:47:08,895][52263] Updated weights for policy 0, policy_version 404299 (0.0035) [2024-04-27 14:47:09,107][52031] Fps is (10 sec: 54065.6, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 6624034816. Throughput: 0: 53607.7. Samples: 1114603340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:47:09,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 14:47:11,948][52263] Updated weights for policy 0, policy_version 404309 (0.0032) [2024-04-27 14:47:14,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6624296960. Throughput: 0: 53423.4. Samples: 1114763580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:47:14,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 14:47:15,069][52263] Updated weights for policy 0, policy_version 404319 (0.0039) [2024-04-27 14:47:18,061][52263] Updated weights for policy 0, policy_version 404329 (0.0026) [2024-04-27 14:47:19,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53248.2, 300 sec: 53595.1). Total num frames: 6624575488. Throughput: 0: 53435.5. Samples: 1115085080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:47:19,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 14:47:21,233][52263] Updated weights for policy 0, policy_version 404339 (0.0034) [2024-04-27 14:47:23,648][52242] Signal inference workers to stop experience collection... (16700 times) [2024-04-27 14:47:23,648][52242] Signal inference workers to resume experience collection... 
(16700 times) [2024-04-27 14:47:23,681][52263] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-04-27 14:47:23,681][52263] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-04-27 14:47:24,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6624821248. Throughput: 0: 53533.9. Samples: 1115410400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:47:24,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 14:47:24,301][52263] Updated weights for policy 0, policy_version 404349 (0.0025) [2024-04-27 14:47:27,201][52263] Updated weights for policy 0, policy_version 404359 (0.0027) [2024-04-27 14:47:29,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 6625099776. Throughput: 0: 53438.8. Samples: 1115563960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:47:29,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 14:47:30,363][52263] Updated weights for policy 0, policy_version 404369 (0.0028) [2024-04-27 14:47:33,423][52263] Updated weights for policy 0, policy_version 404379 (0.0038) [2024-04-27 14:47:34,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6625378304. Throughput: 0: 53438.9. Samples: 1115882260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 14:47:34,116][52031] Avg episode reward: [(0, '0.665')] [2024-04-27 14:47:34,125][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000404381_6625378304.pth... [2024-04-27 14:47:34,170][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000403599_6612566016.pth [2024-04-27 14:47:36,337][52263] Updated weights for policy 0, policy_version 404389 (0.0030) [2024-04-27 14:47:39,107][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6625656832. Throughput: 0: 53504.3. Samples: 1116205480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:47:39,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 14:47:39,371][52263] Updated weights for policy 0, policy_version 404399 (0.0028) [2024-04-27 14:47:42,489][52263] Updated weights for policy 0, policy_version 404409 (0.0029) [2024-04-27 14:47:44,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6625902592. Throughput: 0: 53575.9. Samples: 1116370340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:47:44,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 14:47:45,544][52263] Updated weights for policy 0, policy_version 404419 (0.0029) [2024-04-27 14:47:48,571][52263] Updated weights for policy 0, policy_version 404429 (0.0027) [2024-04-27 14:47:49,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6626181120. Throughput: 0: 53512.2. Samples: 1116690180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:47:49,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 14:47:51,697][52263] Updated weights for policy 0, policy_version 404439 (0.0037) [2024-04-27 14:47:54,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6626443264. Throughput: 0: 53581.6. Samples: 1117014500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:47:54,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 14:47:54,648][52263] Updated weights for policy 0, policy_version 404449 (0.0031) [2024-04-27 14:47:57,662][52263] Updated weights for policy 0, policy_version 404459 (0.0028) [2024-04-27 14:47:59,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6626705408. Throughput: 0: 53608.1. Samples: 1117175940. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:47:59,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 14:48:00,606][52263] Updated weights for policy 0, policy_version 404469 (0.0033) [2024-04-27 14:48:03,700][52263] Updated weights for policy 0, policy_version 404479 (0.0037) [2024-04-27 14:48:04,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6626983936. Throughput: 0: 53606.1. Samples: 1117497360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:04,115][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 14:48:06,652][52263] Updated weights for policy 0, policy_version 404489 (0.0028) [2024-04-27 14:48:09,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6627262464. Throughput: 0: 53521.7. Samples: 1117818880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:09,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 14:48:09,701][52263] Updated weights for policy 0, policy_version 404499 (0.0028) [2024-04-27 14:48:13,001][52263] Updated weights for policy 0, policy_version 404509 (0.0031) [2024-04-27 14:48:14,107][52031] Fps is (10 sec: 55706.0, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6627540992. Throughput: 0: 53845.9. Samples: 1117987020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:14,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 14:48:15,875][52263] Updated weights for policy 0, policy_version 404519 (0.0035) [2024-04-27 14:48:18,999][52263] Updated weights for policy 0, policy_version 404529 (0.0030) [2024-04-27 14:48:19,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6627803136. Throughput: 0: 53937.4. Samples: 1118309440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:19,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 14:48:21,916][52263] Updated weights for policy 0, policy_version 404539 (0.0033) [2024-04-27 14:48:23,106][52242] Signal inference workers to stop experience collection... (16750 times) [2024-04-27 14:48:23,159][52242] Signal inference workers to resume experience collection... (16750 times) [2024-04-27 14:48:23,159][52263] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-04-27 14:48:23,173][52263] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-04-27 14:48:24,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6628048896. Throughput: 0: 53886.3. Samples: 1118630360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:24,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 14:48:25,010][52263] Updated weights for policy 0, policy_version 404549 (0.0030) [2024-04-27 14:48:27,924][52263] Updated weights for policy 0, policy_version 404559 (0.0028) [2024-04-27 14:48:29,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6628327424. Throughput: 0: 53703.0. Samples: 1118786980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:29,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 14:48:31,158][52263] Updated weights for policy 0, policy_version 404569 (0.0030) [2024-04-27 14:48:33,994][52263] Updated weights for policy 0, policy_version 404579 (0.0029) [2024-04-27 14:48:34,106][52031] Fps is (10 sec: 57344.3, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6628622336. Throughput: 0: 53716.0. Samples: 1119107400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:34,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 14:48:37,320][52263] Updated weights for policy 0, policy_version 404589 (0.0027) [2024-04-27 14:48:39,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6628884480. Throughput: 0: 53709.3. Samples: 1119431420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:39,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 14:48:40,563][52263] Updated weights for policy 0, policy_version 404599 (0.0029) [2024-04-27 14:48:43,615][52263] Updated weights for policy 0, policy_version 404609 (0.0031) [2024-04-27 14:48:44,106][52031] Fps is (10 sec: 52428.7, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6629146624. Throughput: 0: 53763.0. Samples: 1119595280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:44,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 14:48:46,556][52263] Updated weights for policy 0, policy_version 404619 (0.0034) [2024-04-27 14:48:49,106][52031] Fps is (10 sec: 54067.5, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6629425152. Throughput: 0: 53852.6. Samples: 1119920720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:49,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 14:48:49,595][52263] Updated weights for policy 0, policy_version 404629 (0.0027) [2024-04-27 14:48:52,572][52263] Updated weights for policy 0, policy_version 404639 (0.0033) [2024-04-27 14:48:54,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6629670912. Throughput: 0: 53854.0. Samples: 1120242320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-04-27 14:48:54,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 14:48:55,657][52263] Updated weights for policy 0, policy_version 404649 (0.0035) [2024-04-27 14:48:58,716][52263] Updated weights for policy 0, policy_version 404659 (0.0026) [2024-04-27 14:48:59,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6629933056. Throughput: 0: 53515.5. Samples: 1120395220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:48:59,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 14:49:01,619][52263] Updated weights for policy 0, policy_version 404669 (0.0037) [2024-04-27 14:49:04,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6630211584. Throughput: 0: 53597.5. Samples: 1120721320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:04,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 14:49:04,820][52263] Updated weights for policy 0, policy_version 404679 (0.0031) [2024-04-27 14:49:07,917][52263] Updated weights for policy 0, policy_version 404689 (0.0032) [2024-04-27 14:49:09,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53793.9, 300 sec: 53706.1). Total num frames: 6630490112. Throughput: 0: 53526.8. Samples: 1121039080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:09,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 14:49:10,910][52263] Updated weights for policy 0, policy_version 404699 (0.0029) [2024-04-27 14:49:13,952][52263] Updated weights for policy 0, policy_version 404709 (0.0034) [2024-04-27 14:49:14,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6630768640. Throughput: 0: 53739.6. Samples: 1121205260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:14,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 14:49:14,123][52242] Signal inference workers to stop experience collection... (16800 times) [2024-04-27 14:49:14,124][52242] Signal inference workers to resume experience collection... (16800 times) [2024-04-27 14:49:14,153][52263] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-04-27 14:49:14,153][52263] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-04-27 14:49:16,818][52263] Updated weights for policy 0, policy_version 404719 (0.0028) [2024-04-27 14:49:19,107][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6631014400. Throughput: 0: 53859.3. Samples: 1121531080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:19,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 14:49:19,924][52263] Updated weights for policy 0, policy_version 404729 (0.0028) [2024-04-27 14:49:22,836][52263] Updated weights for policy 0, policy_version 404739 (0.0033) [2024-04-27 14:49:24,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6631276544. Throughput: 0: 53817.4. Samples: 1121853200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:24,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 14:49:25,998][52263] Updated weights for policy 0, policy_version 404749 (0.0025) [2024-04-27 14:49:28,939][52263] Updated weights for policy 0, policy_version 404759 (0.0035) [2024-04-27 14:49:29,107][52031] Fps is (10 sec: 55706.3, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6631571456. Throughput: 0: 53832.0. Samples: 1122017720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 14:49:32,174][52263] Updated weights for policy 0, policy_version 404769 (0.0033) [2024-04-27 14:49:34,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6631833600. Throughput: 0: 53639.6. Samples: 1122334500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:34,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 14:49:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000404775_6631833600.pth... [2024-04-27 14:49:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000403989_6618955776.pth [2024-04-27 14:49:35,337][52263] Updated weights for policy 0, policy_version 404779 (0.0028) [2024-04-27 14:49:38,181][52263] Updated weights for policy 0, policy_version 404789 (0.0030) [2024-04-27 14:49:39,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6632095744. Throughput: 0: 53541.9. Samples: 1122651700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:39,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 14:49:41,299][52263] Updated weights for policy 0, policy_version 404799 (0.0035) [2024-04-27 14:49:44,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6632357888. Throughput: 0: 53704.6. Samples: 1122811920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:44,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 14:49:44,234][52263] Updated weights for policy 0, policy_version 404809 (0.0025) [2024-04-27 14:49:47,585][52263] Updated weights for policy 0, policy_version 404819 (0.0035) [2024-04-27 14:49:49,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 6632587264. Throughput: 0: 53607.6. Samples: 1123133660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:49,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 14:49:50,485][52263] Updated weights for policy 0, policy_version 404829 (0.0035) [2024-04-27 14:49:53,565][52263] Updated weights for policy 0, policy_version 404839 (0.0033) [2024-04-27 14:49:54,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6632882176. Throughput: 0: 53594.1. Samples: 1123450800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:54,107][52031] Avg episode reward: [(0, '0.490')] [2024-04-27 14:49:56,674][52263] Updated weights for policy 0, policy_version 404849 (0.0026) [2024-04-27 14:49:59,107][52031] Fps is (10 sec: 57342.8, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6633160704. Throughput: 0: 53514.5. Samples: 1123613420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:49:59,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 14:49:59,736][52263] Updated weights for policy 0, policy_version 404859 (0.0028) [2024-04-27 14:50:02,769][52263] Updated weights for policy 0, policy_version 404869 (0.0028) [2024-04-27 14:50:04,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6633422848. Throughput: 0: 53396.2. Samples: 1123933900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:50:04,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 14:50:05,934][52263] Updated weights for policy 0, policy_version 404879 (0.0028) [2024-04-27 14:50:08,880][52263] Updated weights for policy 0, policy_version 404889 (0.0028) [2024-04-27 14:50:09,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.2, 300 sec: 53650.6). Total num frames: 6633701376. Throughput: 0: 53341.2. Samples: 1124253560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:50:09,107][52031] Avg episode reward: [(0, '0.490')] [2024-04-27 14:50:10,839][52242] Signal inference workers to stop experience collection... 
(16850 times) [2024-04-27 14:50:10,839][52242] Signal inference workers to resume experience collection... (16850 times) [2024-04-27 14:50:10,865][52263] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-04-27 14:50:10,866][52263] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-04-27 14:50:11,965][52263] Updated weights for policy 0, policy_version 404899 (0.0028) [2024-04-27 14:50:14,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52701.8, 300 sec: 53539.5). Total num frames: 6633930752. Throughput: 0: 53147.0. Samples: 1124409340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 14:50:14,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 14:50:15,089][52263] Updated weights for policy 0, policy_version 404909 (0.0030) [2024-04-27 14:50:17,993][52263] Updated weights for policy 0, policy_version 404919 (0.0031) [2024-04-27 14:50:19,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6634209280. Throughput: 0: 53354.0. Samples: 1124735440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:50:19,107][52031] Avg episode reward: [(0, '0.466')] [2024-04-27 14:50:21,381][52263] Updated weights for policy 0, policy_version 404929 (0.0028) [2024-04-27 14:50:24,106][52031] Fps is (10 sec: 57345.0, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6634504192. Throughput: 0: 53373.8. Samples: 1125053520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:50:24,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 14:50:24,198][52263] Updated weights for policy 0, policy_version 404939 (0.0032) [2024-04-27 14:50:27,479][52263] Updated weights for policy 0, policy_version 404949 (0.0030) [2024-04-27 14:50:29,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 6634766336. Throughput: 0: 53524.9. Samples: 1125220540. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:50:29,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 14:50:30,664][52263] Updated weights for policy 0, policy_version 404959 (0.0029) [2024-04-27 14:50:33,528][52263] Updated weights for policy 0, policy_version 404969 (0.0032) [2024-04-27 14:50:34,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6635044864. Throughput: 0: 53531.5. Samples: 1125542580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:50:34,107][52031] Avg episode reward: [(0, '0.684')] [2024-04-27 14:50:37,020][52263] Updated weights for policy 0, policy_version 404979 (0.0030) [2024-04-27 14:50:39,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6635307008. Throughput: 0: 53653.3. Samples: 1125865200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:50:39,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 14:50:39,486][52263] Updated weights for policy 0, policy_version 404989 (0.0030) [2024-04-27 14:50:43,507][52263] Updated weights for policy 0, policy_version 404999 (0.0026) [2024-04-27 14:50:44,106][52031] Fps is (10 sec: 49152.4, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6635536384. Throughput: 0: 53513.7. Samples: 1126021520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:50:44,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 14:50:45,649][52263] Updated weights for policy 0, policy_version 405009 (0.0026) [2024-04-27 14:50:49,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6635814912. Throughput: 0: 53532.5. Samples: 1126342860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:50:49,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 14:50:49,597][52263] Updated weights for policy 0, policy_version 405019 (0.0030) [2024-04-27 14:50:51,842][52263] Updated weights for policy 0, policy_version 405029 (0.0023) [2024-04-27 14:50:54,106][52031] Fps is (10 sec: 58981.8, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6636126208. Throughput: 0: 53507.2. Samples: 1126661380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:50:54,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 14:50:55,709][52263] Updated weights for policy 0, policy_version 405039 (0.0027) [2024-04-27 14:50:56,497][52242] Signal inference workers to stop experience collection... (16900 times) [2024-04-27 14:50:56,533][52263] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-04-27 14:50:56,561][52242] Signal inference workers to resume experience collection... (16900 times) [2024-04-27 14:50:56,566][52263] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-04-27 14:50:57,954][52263] Updated weights for policy 0, policy_version 405049 (0.0025) [2024-04-27 14:50:59,107][52031] Fps is (10 sec: 58982.0, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 6636404736. Throughput: 0: 53949.0. Samples: 1126837040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:50:59,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 14:51:01,767][52263] Updated weights for policy 0, policy_version 405059 (0.0035) [2024-04-27 14:51:04,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.2, 300 sec: 53595.2). Total num frames: 6636634112. Throughput: 0: 53754.9. Samples: 1127154400. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:51:04,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:51:04,158][52263] Updated weights for policy 0, policy_version 405069 (0.0027) [2024-04-27 14:51:08,116][52263] Updated weights for policy 0, policy_version 405079 (0.0039) [2024-04-27 14:51:09,107][52031] Fps is (10 sec: 49151.6, 60 sec: 53247.9, 300 sec: 53650.6). Total num frames: 6636896256. Throughput: 0: 53712.7. Samples: 1127470600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:51:09,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 14:51:10,398][52263] Updated weights for policy 0, policy_version 405089 (0.0034) [2024-04-27 14:51:14,106][52263] Updated weights for policy 0, policy_version 405099 (0.0026) [2024-04-27 14:51:14,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6637142016. Throughput: 0: 53323.0. Samples: 1127620080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:51:14,107][52031] Avg episode reward: [(0, '0.477')] [2024-04-27 14:51:16,625][52263] Updated weights for policy 0, policy_version 405109 (0.0026) [2024-04-27 14:51:19,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6637420544. Throughput: 0: 53252.4. Samples: 1127938940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:51:19,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 14:51:20,298][52263] Updated weights for policy 0, policy_version 405119 (0.0026) [2024-04-27 14:51:22,633][52263] Updated weights for policy 0, policy_version 405129 (0.0028) [2024-04-27 14:51:24,107][52031] Fps is (10 sec: 58982.1, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6637731840. Throughput: 0: 53151.4. Samples: 1128257020. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:51:24,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 14:51:26,356][52263] Updated weights for policy 0, policy_version 405139 (0.0031) [2024-04-27 14:51:28,668][52263] Updated weights for policy 0, policy_version 405149 (0.0041) [2024-04-27 14:51:29,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6637977600. Throughput: 0: 53573.2. Samples: 1128432320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:51:29,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 14:51:32,584][52263] Updated weights for policy 0, policy_version 405159 (0.0029) [2024-04-27 14:51:34,106][52031] Fps is (10 sec: 49153.3, 60 sec: 52975.0, 300 sec: 53484.1). Total num frames: 6638223360. Throughput: 0: 53517.9. Samples: 1128751160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 14:51:34,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 14:51:34,212][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000405166_6638239744.pth... [2024-04-27 14:51:34,269][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000404381_6625378304.pth [2024-04-27 14:51:34,879][52263] Updated weights for policy 0, policy_version 405169 (0.0027) [2024-04-27 14:51:38,716][52263] Updated weights for policy 0, policy_version 405179 (0.0031) [2024-04-27 14:51:39,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52701.9, 300 sec: 53484.0). Total num frames: 6638469120. Throughput: 0: 53552.9. Samples: 1129071260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:51:39,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 14:51:40,457][52242] Signal inference workers to stop experience collection... (16950 times) [2024-04-27 14:51:40,457][52242] Signal inference workers to resume experience collection... 
(16950 times) [2024-04-27 14:51:40,481][52263] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-04-27 14:51:40,482][52263] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-04-27 14:51:41,112][52263] Updated weights for policy 0, policy_version 405189 (0.0029) [2024-04-27 14:51:44,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6638731264. Throughput: 0: 52803.3. Samples: 1129213180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:51:44,107][52031] Avg episode reward: [(0, '0.494')] [2024-04-27 14:51:44,849][52263] Updated weights for policy 0, policy_version 405199 (0.0035) [2024-04-27 14:51:47,240][52263] Updated weights for policy 0, policy_version 405209 (0.0029) [2024-04-27 14:51:49,107][52031] Fps is (10 sec: 58981.9, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6639058944. Throughput: 0: 52937.2. Samples: 1129536580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:51:49,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 14:51:50,891][52263] Updated weights for policy 0, policy_version 405219 (0.0039) [2024-04-27 14:51:53,358][52263] Updated weights for policy 0, policy_version 405229 (0.0028) [2024-04-27 14:51:54,107][52031] Fps is (10 sec: 58981.2, 60 sec: 53247.9, 300 sec: 53650.6). Total num frames: 6639321088. Throughput: 0: 53117.8. Samples: 1129860900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:51:54,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 14:51:56,902][52263] Updated weights for policy 0, policy_version 405239 (0.0026) [2024-04-27 14:51:59,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52702.0, 300 sec: 53539.6). Total num frames: 6639566848. Throughput: 0: 53604.6. Samples: 1130032280. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:51:59,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 14:51:59,426][52263] Updated weights for policy 0, policy_version 405249 (0.0032) [2024-04-27 14:52:03,118][52263] Updated weights for policy 0, policy_version 405259 (0.0029) [2024-04-27 14:52:04,107][52031] Fps is (10 sec: 50790.8, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6639828992. Throughput: 0: 53622.1. Samples: 1130351940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:04,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 14:52:05,482][52263] Updated weights for policy 0, policy_version 405269 (0.0029) [2024-04-27 14:52:09,106][52031] Fps is (10 sec: 50790.1, 60 sec: 52975.1, 300 sec: 53484.1). Total num frames: 6640074752. Throughput: 0: 53694.0. Samples: 1130673240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:09,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 14:52:09,149][52263] Updated weights for policy 0, policy_version 405279 (0.0036) [2024-04-27 14:52:11,469][52263] Updated weights for policy 0, policy_version 405289 (0.0030) [2024-04-27 14:52:14,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6640353280. Throughput: 0: 53103.9. Samples: 1130822000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:14,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 14:52:15,183][52263] Updated weights for policy 0, policy_version 405299 (0.0035) [2024-04-27 14:52:17,461][52263] Updated weights for policy 0, policy_version 405309 (0.0028) [2024-04-27 14:52:19,107][52031] Fps is (10 sec: 58981.7, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 6640664576. Throughput: 0: 53185.5. Samples: 1131144520. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:19,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 14:52:21,310][52263] Updated weights for policy 0, policy_version 405319 (0.0035) [2024-04-27 14:52:23,723][52263] Updated weights for policy 0, policy_version 405329 (0.0029) [2024-04-27 14:52:24,106][52031] Fps is (10 sec: 57344.8, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 6640926720. Throughput: 0: 53211.5. Samples: 1131465780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:24,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 14:52:27,442][52263] Updated weights for policy 0, policy_version 405339 (0.0036) [2024-04-27 14:52:29,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6641172480. Throughput: 0: 53893.8. Samples: 1131638400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:29,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 14:52:29,921][52263] Updated weights for policy 0, policy_version 405349 (0.0029) [2024-04-27 14:52:33,511][52263] Updated weights for policy 0, policy_version 405359 (0.0034) [2024-04-27 14:52:34,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6641434624. Throughput: 0: 53898.7. Samples: 1131962020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:34,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 14:52:36,075][52263] Updated weights for policy 0, policy_version 405369 (0.0038) [2024-04-27 14:52:39,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6641680384. Throughput: 0: 53745.4. Samples: 1132279440. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:39,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 14:52:39,620][52263] Updated weights for policy 0, policy_version 405379 (0.0033) [2024-04-27 14:52:42,272][52263] Updated weights for policy 0, policy_version 405389 (0.0038) [2024-04-27 14:52:44,107][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6641975296. Throughput: 0: 53373.6. Samples: 1132434100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:44,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 14:52:45,656][52263] Updated weights for policy 0, policy_version 405399 (0.0034) [2024-04-27 14:52:46,772][52242] Signal inference workers to stop experience collection... (17000 times) [2024-04-27 14:52:46,809][52263] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-04-27 14:52:46,835][52242] Signal inference workers to resume experience collection... (17000 times) [2024-04-27 14:52:46,838][52263] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-04-27 14:52:48,396][52263] Updated weights for policy 0, policy_version 405409 (0.0027) [2024-04-27 14:52:49,106][52031] Fps is (10 sec: 58982.6, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6642270208. Throughput: 0: 53400.9. Samples: 1132754980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:49,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 14:52:51,805][52263] Updated weights for policy 0, policy_version 405419 (0.0028) [2024-04-27 14:52:54,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6642515968. Throughput: 0: 53435.1. Samples: 1133077820. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 14:52:54,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 14:52:54,505][52263] Updated weights for policy 0, policy_version 405429 (0.0033) [2024-04-27 14:52:57,889][52263] Updated weights for policy 0, policy_version 405439 (0.0027) [2024-04-27 14:52:59,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6642794496. Throughput: 0: 53816.1. Samples: 1133243720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:52:59,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 14:53:00,561][52263] Updated weights for policy 0, policy_version 405449 (0.0029) [2024-04-27 14:53:04,055][52263] Updated weights for policy 0, policy_version 405459 (0.0033) [2024-04-27 14:53:04,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6643040256. Throughput: 0: 53763.8. Samples: 1133563880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:04,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 14:53:06,698][52263] Updated weights for policy 0, policy_version 405469 (0.0038) [2024-04-27 14:53:09,106][52031] Fps is (10 sec: 49152.3, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6643286016. Throughput: 0: 53697.4. Samples: 1133882160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:09,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 14:53:10,215][52263] Updated weights for policy 0, policy_version 405479 (0.0031) [2024-04-27 14:53:12,738][52263] Updated weights for policy 0, policy_version 405489 (0.0028) [2024-04-27 14:53:14,106][52031] Fps is (10 sec: 55705.2, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6643597312. Throughput: 0: 53375.0. Samples: 1134040280. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:14,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 14:53:16,442][52263] Updated weights for policy 0, policy_version 405499 (0.0026) [2024-04-27 14:53:18,935][52263] Updated weights for policy 0, policy_version 405509 (0.0033) [2024-04-27 14:53:19,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53248.2, 300 sec: 53595.1). Total num frames: 6643859456. Throughput: 0: 53333.9. Samples: 1134362040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:19,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 14:53:22,527][52263] Updated weights for policy 0, policy_version 405519 (0.0030) [2024-04-27 14:53:24,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6644137984. Throughput: 0: 53464.4. Samples: 1134685340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:24,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 14:53:25,035][52263] Updated weights for policy 0, policy_version 405529 (0.0034) [2024-04-27 14:53:28,571][52263] Updated weights for policy 0, policy_version 405539 (0.0032) [2024-04-27 14:53:29,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6644383744. Throughput: 0: 53585.4. Samples: 1134845440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:29,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 14:53:31,189][52263] Updated weights for policy 0, policy_version 405549 (0.0027) [2024-04-27 14:53:34,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6644645888. Throughput: 0: 53492.4. Samples: 1135162140. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:34,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 14:53:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000405557_6644645888.pth... 
[2024-04-27 14:53:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000404775_6631833600.pth [2024-04-27 14:53:34,664][52263] Updated weights for policy 0, policy_version 405559 (0.0029) [2024-04-27 14:53:37,361][52263] Updated weights for policy 0, policy_version 405569 (0.0030) [2024-04-27 14:53:39,107][52031] Fps is (10 sec: 54066.7, 60 sec: 54067.2, 300 sec: 53484.0). Total num frames: 6644924416. Throughput: 0: 53427.5. Samples: 1135482060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:39,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 14:53:40,801][52263] Updated weights for policy 0, policy_version 405579 (0.0027) [2024-04-27 14:53:43,447][52263] Updated weights for policy 0, policy_version 405589 (0.0035) [2024-04-27 14:53:44,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6645202944. Throughput: 0: 53493.7. Samples: 1135650940. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:44,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 14:53:46,829][52263] Updated weights for policy 0, policy_version 405599 (0.0029) [2024-04-27 14:53:49,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6645465088. Throughput: 0: 53555.5. Samples: 1135973880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:49,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 14:53:49,481][52263] Updated weights for policy 0, policy_version 405609 (0.0037) [2024-04-27 14:53:53,012][52263] Updated weights for policy 0, policy_version 405619 (0.0031) [2024-04-27 14:53:54,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6645743616. Throughput: 0: 53645.7. Samples: 1136296220. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 14:53:55,627][52263] Updated weights for policy 0, policy_version 405629 (0.0032) [2024-04-27 14:53:58,635][52242] Signal inference workers to stop experience collection... (17050 times) [2024-04-27 14:53:58,635][52242] Signal inference workers to resume experience collection... (17050 times) [2024-04-27 14:53:58,651][52263] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-04-27 14:53:58,651][52263] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-04-27 14:53:59,034][52263] Updated weights for policy 0, policy_version 405639 (0.0024) [2024-04-27 14:53:59,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6645989376. Throughput: 0: 53612.0. Samples: 1136452820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:53:59,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 14:54:01,675][52263] Updated weights for policy 0, policy_version 405649 (0.0029) [2024-04-27 14:54:04,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6646251520. Throughput: 0: 53533.7. Samples: 1136771060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:54:04,107][52031] Avg episode reward: [(0, '0.507')] [2024-04-27 14:54:05,124][52263] Updated weights for policy 0, policy_version 405659 (0.0027) [2024-04-27 14:54:07,774][52263] Updated weights for policy 0, policy_version 405669 (0.0031) [2024-04-27 14:54:09,107][52031] Fps is (10 sec: 55705.2, 60 sec: 54340.1, 300 sec: 53484.0). Total num frames: 6646546432. Throughput: 0: 53543.1. Samples: 1137094780. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:54:09,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 14:54:11,188][52263] Updated weights for policy 0, policy_version 405679 (0.0037) [2024-04-27 14:54:13,994][52263] Updated weights for policy 0, policy_version 405689 (0.0036) [2024-04-27 14:54:14,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6646808576. Throughput: 0: 53713.9. Samples: 1137262560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 14:54:14,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 14:54:17,164][52263] Updated weights for policy 0, policy_version 405699 (0.0035) [2024-04-27 14:54:19,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6647070720. Throughput: 0: 53873.9. Samples: 1137586460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:54:19,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 14:54:19,966][52263] Updated weights for policy 0, policy_version 405709 (0.0027) [2024-04-27 14:54:23,342][52263] Updated weights for policy 0, policy_version 405719 (0.0032) [2024-04-27 14:54:24,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6647365632. Throughput: 0: 53940.5. Samples: 1137909380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:54:24,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 14:54:26,049][52263] Updated weights for policy 0, policy_version 405729 (0.0032) [2024-04-27 14:54:29,106][52031] Fps is (10 sec: 50790.0, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6647578624. Throughput: 0: 53531.2. Samples: 1138059840. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:54:29,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 14:54:29,496][52263] Updated weights for policy 0, policy_version 405739 (0.0032) [2024-04-27 14:54:32,125][52263] Updated weights for policy 0, policy_version 405749 (0.0030) [2024-04-27 14:54:34,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6647873536. Throughput: 0: 53670.5. Samples: 1138389060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:54:34,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:54:35,514][52263] Updated weights for policy 0, policy_version 405759 (0.0028) [2024-04-27 14:54:38,309][52263] Updated weights for policy 0, policy_version 405769 (0.0032) [2024-04-27 14:54:39,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6648152064. Throughput: 0: 53640.1. Samples: 1138710020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:54:39,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 14:54:41,576][52263] Updated weights for policy 0, policy_version 405779 (0.0027) [2024-04-27 14:54:44,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6648430592. Throughput: 0: 53898.1. Samples: 1138878240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:54:44,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 14:54:44,270][52263] Updated weights for policy 0, policy_version 405789 (0.0027) [2024-04-27 14:54:47,806][52263] Updated weights for policy 0, policy_version 405799 (0.0031) [2024-04-27 14:54:49,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6648676352. Throughput: 0: 53867.1. Samples: 1139195080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:54:49,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 14:54:50,314][52263] Updated weights for policy 0, policy_version 405809 (0.0029) [2024-04-27 14:54:53,819][52263] Updated weights for policy 0, policy_version 405819 (0.0028) [2024-04-27 14:54:54,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6648954880. Throughput: 0: 53978.3. Samples: 1139523800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:54:54,107][52031] Avg episode reward: [(0, '0.693')] [2024-04-27 14:54:54,189][52242] Signal inference workers to stop experience collection... (17100 times) [2024-04-27 14:54:54,192][52242] Signal inference workers to resume experience collection... (17100 times) [2024-04-27 14:54:54,218][52263] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-04-27 14:54:54,218][52263] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-04-27 14:54:56,629][52263] Updated weights for policy 0, policy_version 405829 (0.0029) [2024-04-27 14:54:59,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6649200640. Throughput: 0: 53607.3. Samples: 1139674900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:54:59,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 14:54:59,905][52263] Updated weights for policy 0, policy_version 405839 (0.0026) [2024-04-27 14:55:02,754][52263] Updated weights for policy 0, policy_version 405849 (0.0036) [2024-04-27 14:55:04,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6649479168. Throughput: 0: 53550.9. Samples: 1139996260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:55:04,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 14:55:06,077][52263] Updated weights for policy 0, policy_version 405859 (0.0025) [2024-04-27 14:55:08,785][52263] Updated weights for policy 0, policy_version 405869 (0.0030) [2024-04-27 14:55:09,106][52031] Fps is (10 sec: 57344.8, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 6649774080. Throughput: 0: 53370.8. Samples: 1140311060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:55:09,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 14:55:12,240][52263] Updated weights for policy 0, policy_version 405879 (0.0026) [2024-04-27 14:55:14,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6650019840. Throughput: 0: 53886.6. Samples: 1140484740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:55:14,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 14:55:14,924][52263] Updated weights for policy 0, policy_version 405889 (0.0035) [2024-04-27 14:55:18,317][52263] Updated weights for policy 0, policy_version 405899 (0.0038) [2024-04-27 14:55:19,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6650298368. Throughput: 0: 53822.4. Samples: 1140811060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:55:19,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 14:55:21,122][52263] Updated weights for policy 0, policy_version 405909 (0.0033) [2024-04-27 14:55:24,106][52031] Fps is (10 sec: 50790.9, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 6650527744. Throughput: 0: 53731.1. Samples: 1141127920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:55:24,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 14:55:24,542][52263] Updated weights for policy 0, policy_version 405919 (0.0030) [2024-04-27 14:55:27,326][52263] Updated weights for policy 0, policy_version 405929 (0.0027) [2024-04-27 14:55:29,107][52031] Fps is (10 sec: 52428.4, 60 sec: 54067.2, 300 sec: 53484.0). Total num frames: 6650822656. Throughput: 0: 53309.0. Samples: 1141277140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:55:29,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 14:55:30,580][52263] Updated weights for policy 0, policy_version 405939 (0.0027) [2024-04-27 14:55:33,287][52263] Updated weights for policy 0, policy_version 405949 (0.0032) [2024-04-27 14:55:34,107][52031] Fps is (10 sec: 57343.2, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6651101184. Throughput: 0: 53365.3. Samples: 1141596520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:55:34,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 14:55:34,113][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000405951_6651101184.pth... [2024-04-27 14:55:34,168][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000405166_6638239744.pth [2024-04-27 14:55:36,800][52263] Updated weights for policy 0, policy_version 405959 (0.0031) [2024-04-27 14:55:39,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6651363328. Throughput: 0: 53235.1. Samples: 1141919380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 14:55:39,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 14:55:39,389][52263] Updated weights for policy 0, policy_version 405969 (0.0029) [2024-04-27 14:55:42,902][52263] Updated weights for policy 0, policy_version 405979 (0.0032) [2024-04-27 14:55:44,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 53595.1). 
Total num frames: 6651625472. Throughput: 0: 53516.8. Samples: 1142083160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:55:44,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 14:55:45,545][52263] Updated weights for policy 0, policy_version 405989 (0.0026) [2024-04-27 14:55:49,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6651871232. Throughput: 0: 53521.3. Samples: 1142404720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:55:49,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 14:55:49,146][52263] Updated weights for policy 0, policy_version 405999 (0.0029) [2024-04-27 14:55:51,745][52263] Updated weights for policy 0, policy_version 406009 (0.0028) [2024-04-27 14:55:54,107][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6652149760. Throughput: 0: 53606.5. Samples: 1142723360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:55:54,107][52031] Avg episode reward: [(0, '0.488')] [2024-04-27 14:55:54,844][52242] Signal inference workers to stop experience collection... (17150 times) [2024-04-27 14:55:54,885][52263] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-04-27 14:55:54,913][52242] Signal inference workers to resume experience collection... (17150 times) [2024-04-27 14:55:54,919][52263] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-04-27 14:55:55,187][52263] Updated weights for policy 0, policy_version 406019 (0.0029) [2024-04-27 14:55:57,791][52263] Updated weights for policy 0, policy_version 406029 (0.0028) [2024-04-27 14:55:59,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6652411904. Throughput: 0: 53325.4. Samples: 1142884380. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:55:59,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 14:56:01,249][52263] Updated weights for policy 0, policy_version 406039 (0.0026) [2024-04-27 14:56:03,860][52263] Updated weights for policy 0, policy_version 406049 (0.0025) [2024-04-27 14:56:04,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6652706816. Throughput: 0: 53234.8. Samples: 1143206640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:04,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 14:56:07,461][52263] Updated weights for policy 0, policy_version 406059 (0.0030) [2024-04-27 14:56:09,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53247.8, 300 sec: 53650.7). Total num frames: 6652968960. Throughput: 0: 53236.3. Samples: 1143523560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:09,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 14:56:09,923][52263] Updated weights for policy 0, policy_version 406069 (0.0045) [2024-04-27 14:56:13,454][52263] Updated weights for policy 0, policy_version 406079 (0.0031) [2024-04-27 14:56:14,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6653247488. Throughput: 0: 53561.7. Samples: 1143687420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:14,107][52031] Avg episode reward: [(0, '0.510')] [2024-04-27 14:56:16,058][52263] Updated weights for policy 0, policy_version 406089 (0.0031) [2024-04-27 14:56:19,107][52031] Fps is (10 sec: 50790.6, 60 sec: 52974.8, 300 sec: 53373.0). Total num frames: 6653476864. Throughput: 0: 53569.8. Samples: 1144007160. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:19,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 14:56:19,617][52263] Updated weights for policy 0, policy_version 406099 (0.0032) [2024-04-27 14:56:22,188][52263] Updated weights for policy 0, policy_version 406109 (0.0031) [2024-04-27 14:56:24,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6653755392. Throughput: 0: 53577.0. Samples: 1144330340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:24,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 14:56:25,709][52263] Updated weights for policy 0, policy_version 406119 (0.0034) [2024-04-27 14:56:28,408][52263] Updated weights for policy 0, policy_version 406129 (0.0031) [2024-04-27 14:56:29,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6654033920. Throughput: 0: 53482.5. Samples: 1144489860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:29,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 14:56:31,630][52263] Updated weights for policy 0, policy_version 406139 (0.0032) [2024-04-27 14:56:34,107][52031] Fps is (10 sec: 57343.2, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6654328832. Throughput: 0: 53521.4. Samples: 1144813180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:34,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 14:56:34,816][52263] Updated weights for policy 0, policy_version 406149 (0.0034) [2024-04-27 14:56:37,746][52263] Updated weights for policy 0, policy_version 406159 (0.0031) [2024-04-27 14:56:39,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6654574592. Throughput: 0: 53639.7. Samples: 1145137140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:39,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 14:56:40,855][52263] Updated weights for policy 0, policy_version 406169 (0.0025) [2024-04-27 14:56:43,790][52263] Updated weights for policy 0, policy_version 406179 (0.0032) [2024-04-27 14:56:44,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6654853120. Throughput: 0: 53719.2. Samples: 1145301740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:44,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 14:56:46,771][52263] Updated weights for policy 0, policy_version 406189 (0.0035) [2024-04-27 14:56:49,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.3, 300 sec: 53428.5). Total num frames: 6655082496. Throughput: 0: 53671.9. Samples: 1145621860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:49,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 14:56:49,202][52242] Signal inference workers to stop experience collection... (17200 times) [2024-04-27 14:56:49,202][52242] Signal inference workers to resume experience collection... (17200 times) [2024-04-27 14:56:49,233][52263] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-04-27 14:56:49,233][52263] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-04-27 14:56:49,955][52263] Updated weights for policy 0, policy_version 406199 (0.0029) [2024-04-27 14:56:52,716][52263] Updated weights for policy 0, policy_version 406209 (0.0026) [2024-04-27 14:56:54,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6655377408. Throughput: 0: 53808.7. Samples: 1145944940. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:54,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 14:56:55,934][52263] Updated weights for policy 0, policy_version 406219 (0.0030) [2024-04-27 14:56:58,874][52263] Updated weights for policy 0, policy_version 406229 (0.0027) [2024-04-27 14:56:59,106][52031] Fps is (10 sec: 57343.6, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6655655936. Throughput: 0: 53611.7. Samples: 1146099940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 14:56:59,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 14:57:02,014][52263] Updated weights for policy 0, policy_version 406239 (0.0026) [2024-04-27 14:57:04,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.3, 300 sec: 53761.7). Total num frames: 6655934464. Throughput: 0: 53723.2. Samples: 1146424700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:04,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 14:57:05,225][52263] Updated weights for policy 0, policy_version 406249 (0.0029) [2024-04-27 14:57:08,254][52263] Updated weights for policy 0, policy_version 406259 (0.0030) [2024-04-27 14:57:09,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6656163840. Throughput: 0: 53592.9. Samples: 1146742020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:09,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 14:57:11,500][52263] Updated weights for policy 0, policy_version 406269 (0.0031) [2024-04-27 14:57:14,106][52031] Fps is (10 sec: 49152.4, 60 sec: 52975.1, 300 sec: 53428.5). Total num frames: 6656425984. Throughput: 0: 53579.2. Samples: 1146900920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:14,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 14:57:14,393][52263] Updated weights for policy 0, policy_version 406279 (0.0030) [2024-04-27 14:57:17,555][52263] Updated weights for policy 0, policy_version 406289 (0.0029) [2024-04-27 14:57:19,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6656688128. Throughput: 0: 53465.9. Samples: 1147219140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:19,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 14:57:20,424][52263] Updated weights for policy 0, policy_version 406299 (0.0026) [2024-04-27 14:57:23,755][52263] Updated weights for policy 0, policy_version 406309 (0.0033) [2024-04-27 14:57:24,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6656966656. Throughput: 0: 53395.5. Samples: 1147539940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:24,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:57:26,554][52263] Updated weights for policy 0, policy_version 406319 (0.0025) [2024-04-27 14:57:29,107][52031] Fps is (10 sec: 57343.6, 60 sec: 53794.0, 300 sec: 53650.7). Total num frames: 6657261568. Throughput: 0: 53301.3. Samples: 1147700300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 14:57:29,968][52263] Updated weights for policy 0, policy_version 406329 (0.0031) [2024-04-27 14:57:32,859][52263] Updated weights for policy 0, policy_version 406339 (0.0032) [2024-04-27 14:57:34,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 6657540096. Throughput: 0: 53319.3. Samples: 1148021240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:34,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 14:57:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000406344_6657540096.pth... [2024-04-27 14:57:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000405557_6644645888.pth [2024-04-27 14:57:35,978][52263] Updated weights for policy 0, policy_version 406349 (0.0033) [2024-04-27 14:57:38,902][52263] Updated weights for policy 0, policy_version 406359 (0.0032) [2024-04-27 14:57:39,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6657785856. Throughput: 0: 53283.5. Samples: 1148342700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:39,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 14:57:39,557][52242] Signal inference workers to stop experience collection... (17250 times) [2024-04-27 14:57:39,586][52263] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-04-27 14:57:39,616][52242] Signal inference workers to resume experience collection... (17250 times) [2024-04-27 14:57:39,622][52263] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-04-27 14:57:41,986][52263] Updated weights for policy 0, policy_version 406369 (0.0027) [2024-04-27 14:57:44,107][52031] Fps is (10 sec: 47513.7, 60 sec: 52701.8, 300 sec: 53373.0). Total num frames: 6658015232. Throughput: 0: 53324.8. Samples: 1148499560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:44,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 14:57:45,301][52263] Updated weights for policy 0, policy_version 406379 (0.0030) [2024-04-27 14:57:48,179][52263] Updated weights for policy 0, policy_version 406389 (0.0027) [2024-04-27 14:57:49,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6658310144. Throughput: 0: 53235.4. Samples: 1148820300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:49,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 14:57:51,501][52263] Updated weights for policy 0, policy_version 406399 (0.0031) [2024-04-27 14:57:54,106][52031] Fps is (10 sec: 57344.5, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6658588672. Throughput: 0: 53436.0. Samples: 1149146640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:54,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 14:57:54,251][52263] Updated weights for policy 0, policy_version 406409 (0.0033) [2024-04-27 14:57:57,622][52263] Updated weights for policy 0, policy_version 406419 (0.0028) [2024-04-27 14:57:59,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6658850816. Throughput: 0: 53583.3. Samples: 1149312180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:57:59,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 14:58:00,238][52263] Updated weights for policy 0, policy_version 406429 (0.0028) [2024-04-27 14:58:03,570][52263] Updated weights for policy 0, policy_version 406439 (0.0031) [2024-04-27 14:58:04,107][52031] Fps is (10 sec: 54065.8, 60 sec: 53247.8, 300 sec: 53706.1). Total num frames: 6659129344. Throughput: 0: 53625.9. Samples: 1149632320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:58:04,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 14:58:06,579][52263] Updated weights for policy 0, policy_version 406449 (0.0034) [2024-04-27 14:58:09,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6659391488. Throughput: 0: 53628.5. Samples: 1149953220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:58:09,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:58:09,746][52263] Updated weights for policy 0, policy_version 406459 (0.0028) [2024-04-27 14:58:12,738][52263] Updated weights for policy 0, policy_version 406469 (0.0033) [2024-04-27 14:58:14,107][52031] Fps is (10 sec: 49152.7, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6659620864. Throughput: 0: 53518.7. Samples: 1150108640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:58:14,108][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 14:58:15,796][52263] Updated weights for policy 0, policy_version 406479 (0.0029) [2024-04-27 14:58:18,697][52263] Updated weights for policy 0, policy_version 406489 (0.0030) [2024-04-27 14:58:19,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6659915776. Throughput: 0: 53445.9. Samples: 1150426300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-27 14:58:19,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 14:58:21,870][52263] Updated weights for policy 0, policy_version 406499 (0.0030) [2024-04-27 14:58:24,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6660177920. Throughput: 0: 53365.4. Samples: 1150744140. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:58:24,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 14:58:24,939][52263] Updated weights for policy 0, policy_version 406509 (0.0029) [2024-04-27 14:58:27,932][52263] Updated weights for policy 0, policy_version 406519 (0.0028) [2024-04-27 14:58:29,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6660472832. Throughput: 0: 53768.5. Samples: 1150919140. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:58:29,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 14:58:31,208][52263] Updated weights for policy 0, policy_version 406529 (0.0040) [2024-04-27 14:58:34,106][52031] Fps is (10 sec: 54067.0, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6660718592. Throughput: 0: 53773.0. Samples: 1151240080. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:58:34,107][52031] Avg episode reward: [(0, '0.479')] [2024-04-27 14:58:34,167][52263] Updated weights for policy 0, policy_version 406539 (0.0034) [2024-04-27 14:58:37,260][52263] Updated weights for policy 0, policy_version 406549 (0.0028) [2024-04-27 14:58:39,106][52031] Fps is (10 sec: 47513.4, 60 sec: 52701.9, 300 sec: 53373.0). Total num frames: 6660947968. Throughput: 0: 53495.1. Samples: 1151553920. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:58:39,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 14:58:40,264][52263] Updated weights for policy 0, policy_version 406559 (0.0036) [2024-04-27 14:58:43,217][52263] Updated weights for policy 0, policy_version 406569 (0.0029) [2024-04-27 14:58:44,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6661242880. Throughput: 0: 53140.5. Samples: 1151703500. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:58:44,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 14:58:46,091][52242] Signal inference workers to stop experience collection... (17300 times) [2024-04-27 14:58:46,117][52263] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-04-27 14:58:46,151][52242] Signal inference workers to resume experience collection... 
(17300 times) [2024-04-27 14:58:46,151][52263] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-04-27 14:58:46,274][52263] Updated weights for policy 0, policy_version 406579 (0.0026) [2024-04-27 14:58:49,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6661521408. Throughput: 0: 53201.9. Samples: 1152026400. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:58:49,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 14:58:49,738][52263] Updated weights for policy 0, policy_version 406589 (0.0029) [2024-04-27 14:58:52,400][52263] Updated weights for policy 0, policy_version 406599 (0.0028) [2024-04-27 14:58:54,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6661799936. Throughput: 0: 53245.5. Samples: 1152349280. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:58:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 14:58:55,724][52263] Updated weights for policy 0, policy_version 406609 (0.0031) [2024-04-27 14:58:58,722][52263] Updated weights for policy 0, policy_version 406619 (0.0031) [2024-04-27 14:58:59,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6662062080. Throughput: 0: 53530.0. Samples: 1152517480. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:58:59,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 14:59:02,335][52263] Updated weights for policy 0, policy_version 406629 (0.0025) [2024-04-27 14:59:04,107][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6662324224. Throughput: 0: 53563.5. Samples: 1152836660. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:59:04,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 14:59:04,763][52263] Updated weights for policy 0, policy_version 406639 (0.0034) [2024-04-27 14:59:08,463][52263] Updated weights for policy 0, policy_version 406649 (0.0029) [2024-04-27 14:59:09,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52974.8, 300 sec: 53428.5). Total num frames: 6662569984. Throughput: 0: 53651.8. Samples: 1153158480. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:59:09,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 14:59:10,811][52263] Updated weights for policy 0, policy_version 406659 (0.0026) [2024-04-27 14:59:14,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6662848512. Throughput: 0: 53090.1. Samples: 1153308200. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:59:14,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 14:59:14,519][52263] Updated weights for policy 0, policy_version 406669 (0.0033) [2024-04-27 14:59:16,866][52263] Updated weights for policy 0, policy_version 406679 (0.0026) [2024-04-27 14:59:19,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6663110656. Throughput: 0: 53124.3. Samples: 1153630680. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:59:19,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 14:59:20,481][52263] Updated weights for policy 0, policy_version 406689 (0.0029) [2024-04-27 14:59:23,016][52263] Updated weights for policy 0, policy_version 406699 (0.0028) [2024-04-27 14:59:24,107][52031] Fps is (10 sec: 57343.4, 60 sec: 54067.0, 300 sec: 53706.2). Total num frames: 6663421952. Throughput: 0: 53343.4. Samples: 1153954380. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:59:24,107][52031] Avg episode reward: [(0, '0.659')] [2024-04-27 14:59:26,637][52263] Updated weights for policy 0, policy_version 406709 (0.0026) [2024-04-27 14:59:29,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6663667712. Throughput: 0: 53888.2. Samples: 1154128460. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:59:29,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 14:59:29,122][52263] Updated weights for policy 0, policy_version 406719 (0.0037) [2024-04-27 14:59:32,829][52263] Updated weights for policy 0, policy_version 406729 (0.0032) [2024-04-27 14:59:34,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6663929856. Throughput: 0: 53834.6. Samples: 1154448960. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:59:34,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 14:59:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000406734_6663929856.pth... [2024-04-27 14:59:34,166][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000405951_6651101184.pth [2024-04-27 14:59:35,118][52263] Updated weights for policy 0, policy_version 406739 (0.0035) [2024-04-27 14:59:38,753][52263] Updated weights for policy 0, policy_version 406749 (0.0033) [2024-04-27 14:59:39,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.2, 300 sec: 53373.0). Total num frames: 6664175616. Throughput: 0: 53801.2. Samples: 1154770320. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-27 14:59:39,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 14:59:41,100][52263] Updated weights for policy 0, policy_version 406759 (0.0029) [2024-04-27 14:59:44,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6664454144. Throughput: 0: 53395.5. Samples: 1154920280. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:59:44,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 14:59:45,011][52263] Updated weights for policy 0, policy_version 406769 (0.0033) [2024-04-27 14:59:45,285][52242] Signal inference workers to stop experience collection... (17350 times) [2024-04-27 14:59:45,286][52242] Signal inference workers to resume experience collection... (17350 times) [2024-04-27 14:59:45,300][52263] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-04-27 14:59:45,319][52263] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-04-27 14:59:47,293][52263] Updated weights for policy 0, policy_version 406779 (0.0036) [2024-04-27 14:59:49,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6664732672. Throughput: 0: 53473.4. Samples: 1155242960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:59:49,107][52031] Avg episode reward: [(0, '0.663')] [2024-04-27 14:59:51,288][52263] Updated weights for policy 0, policy_version 406789 (0.0035) [2024-04-27 14:59:53,407][52263] Updated weights for policy 0, policy_version 406799 (0.0031) [2024-04-27 14:59:54,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.4, 300 sec: 53650.7). Total num frames: 6665027584. Throughput: 0: 53410.4. Samples: 1155561940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:59:54,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 14:59:57,410][52263] Updated weights for policy 0, policy_version 406809 (0.0030) [2024-04-27 14:59:59,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6665256960. Throughput: 0: 53902.6. Samples: 1155733820. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 14:59:59,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 14:59:59,550][52263] Updated weights for policy 0, policy_version 406819 (0.0031) [2024-04-27 15:00:03,510][52263] Updated weights for policy 0, policy_version 406829 (0.0029) [2024-04-27 15:00:04,106][52031] Fps is (10 sec: 49152.2, 60 sec: 53248.2, 300 sec: 53373.0). Total num frames: 6665519104. Throughput: 0: 53829.5. Samples: 1156053000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:04,107][52031] Avg episode reward: [(0, '0.497')] [2024-04-27 15:00:05,777][52263] Updated weights for policy 0, policy_version 406839 (0.0026) [2024-04-27 15:00:09,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6665797632. Throughput: 0: 53673.5. Samples: 1156369680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:09,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 15:00:09,672][52263] Updated weights for policy 0, policy_version 406849 (0.0025) [2024-04-27 15:00:11,937][52263] Updated weights for policy 0, policy_version 406859 (0.0030) [2024-04-27 15:00:14,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6666076160. Throughput: 0: 53393.6. Samples: 1156531180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:14,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 15:00:15,855][52263] Updated weights for policy 0, policy_version 406869 (0.0031) [2024-04-27 15:00:18,112][52263] Updated weights for policy 0, policy_version 406879 (0.0024) [2024-04-27 15:00:19,106][52031] Fps is (10 sec: 55705.8, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6666354688. Throughput: 0: 53268.2. Samples: 1156846020. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:19,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 15:00:21,869][52263] Updated weights for policy 0, policy_version 406889 (0.0033) [2024-04-27 15:00:24,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6666616832. Throughput: 0: 53273.9. Samples: 1157167660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:24,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 15:00:24,215][52263] Updated weights for policy 0, policy_version 406899 (0.0033) [2024-04-27 15:00:27,883][52263] Updated weights for policy 0, policy_version 406909 (0.0030) [2024-04-27 15:00:29,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6666862592. Throughput: 0: 53408.9. Samples: 1157323680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:29,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 15:00:30,343][52263] Updated weights for policy 0, policy_version 406919 (0.0028) [2024-04-27 15:00:33,049][52242] Signal inference workers to stop experience collection... (17400 times) [2024-04-27 15:00:33,067][52263] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-04-27 15:00:33,141][52242] Signal inference workers to resume experience collection... (17400 times) [2024-04-27 15:00:33,142][52263] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-04-27 15:00:34,032][52263] Updated weights for policy 0, policy_version 406929 (0.0032) [2024-04-27 15:00:34,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6667124736. Throughput: 0: 53493.3. Samples: 1157650160. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:34,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 15:00:36,384][52263] Updated weights for policy 0, policy_version 406939 (0.0026) [2024-04-27 15:00:39,112][52031] Fps is (10 sec: 50763.0, 60 sec: 53243.1, 300 sec: 53372.0). Total num frames: 6667370496. Throughput: 0: 53410.4. Samples: 1157965700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:39,113][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 15:00:40,164][52263] Updated weights for policy 0, policy_version 406949 (0.0027) [2024-04-27 15:00:42,590][52263] Updated weights for policy 0, policy_version 406959 (0.0030) [2024-04-27 15:00:44,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6667681792. Throughput: 0: 53270.3. Samples: 1158130980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:44,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 15:00:46,291][52263] Updated weights for policy 0, policy_version 406969 (0.0029) [2024-04-27 15:00:48,631][52263] Updated weights for policy 0, policy_version 406979 (0.0033) [2024-04-27 15:00:49,106][52031] Fps is (10 sec: 59014.5, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6667960320. Throughput: 0: 53298.1. Samples: 1158451420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:49,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 15:00:52,396][52263] Updated weights for policy 0, policy_version 406989 (0.0029) [2024-04-27 15:00:54,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53247.8, 300 sec: 53595.1). Total num frames: 6668222464. Throughput: 0: 53472.7. Samples: 1158775960. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:54,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 15:00:54,693][52263] Updated weights for policy 0, policy_version 406999 (0.0029) [2024-04-27 15:00:58,359][52263] Updated weights for policy 0, policy_version 407009 (0.0037) [2024-04-27 15:00:59,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6668468224. Throughput: 0: 53332.9. Samples: 1158931160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-04-27 15:00:59,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 15:01:01,083][52263] Updated weights for policy 0, policy_version 407019 (0.0037) [2024-04-27 15:01:04,106][52031] Fps is (10 sec: 49153.2, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6668713984. Throughput: 0: 53349.0. Samples: 1159246720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:04,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 15:01:04,421][52263] Updated weights for policy 0, policy_version 407029 (0.0035) [2024-04-27 15:01:07,313][52263] Updated weights for policy 0, policy_version 407039 (0.0030) [2024-04-27 15:01:09,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6669008896. Throughput: 0: 53269.4. Samples: 1159564780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:09,107][52031] Avg episode reward: [(0, '0.678')] [2024-04-27 15:01:10,651][52263] Updated weights for policy 0, policy_version 407049 (0.0029) [2024-04-27 15:01:13,622][52263] Updated weights for policy 0, policy_version 407059 (0.0033) [2024-04-27 15:01:14,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6669287424. Throughput: 0: 53439.5. Samples: 1159728460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:14,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 15:01:16,677][52263] Updated weights for policy 0, policy_version 407069 (0.0032) [2024-04-27 15:01:19,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6669565952. Throughput: 0: 53425.6. Samples: 1160054320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:19,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 15:01:19,646][52263] Updated weights for policy 0, policy_version 407079 (0.0038) [2024-04-27 15:01:22,714][52263] Updated weights for policy 0, policy_version 407089 (0.0027) [2024-04-27 15:01:24,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6669811712. Throughput: 0: 53551.3. Samples: 1160375220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 15:01:25,601][52263] Updated weights for policy 0, policy_version 407099 (0.0036) [2024-04-27 15:01:28,990][52263] Updated weights for policy 0, policy_version 407109 (0.0032) [2024-04-27 15:01:29,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 6670073856. Throughput: 0: 53335.4. Samples: 1160531080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:29,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 15:01:29,355][52242] Signal inference workers to stop experience collection... (17450 times) [2024-04-27 15:01:29,384][52263] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-04-27 15:01:29,414][52242] Signal inference workers to resume experience collection... 
(17450 times) [2024-04-27 15:01:29,418][52263] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-04-27 15:01:31,774][52263] Updated weights for policy 0, policy_version 407119 (0.0028) [2024-04-27 15:01:34,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6670336000. Throughput: 0: 53287.1. Samples: 1160849340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:34,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 15:01:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000407125_6670336000.pth... [2024-04-27 15:01:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000406344_6657540096.pth [2024-04-27 15:01:35,502][52263] Updated weights for policy 0, policy_version 407129 (0.0029) [2024-04-27 15:01:37,807][52263] Updated weights for policy 0, policy_version 407139 (0.0028) [2024-04-27 15:01:39,106][52031] Fps is (10 sec: 54068.7, 60 sec: 54072.1, 300 sec: 53428.5). Total num frames: 6670614528. Throughput: 0: 53137.7. Samples: 1161167140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:39,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 15:01:41,579][52263] Updated weights for policy 0, policy_version 407149 (0.0034) [2024-04-27 15:01:43,975][52263] Updated weights for policy 0, policy_version 407159 (0.0025) [2024-04-27 15:01:44,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6670893056. Throughput: 0: 53520.9. Samples: 1161339600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:44,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 15:01:47,630][52263] Updated weights for policy 0, policy_version 407169 (0.0029) [2024-04-27 15:01:49,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6671155200. Throughput: 0: 53623.0. Samples: 1161659760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:49,107][52031] Avg episode reward: [(0, '0.469')] [2024-04-27 15:01:50,071][52263] Updated weights for policy 0, policy_version 407179 (0.0035) [2024-04-27 15:01:53,653][52263] Updated weights for policy 0, policy_version 407189 (0.0028) [2024-04-27 15:01:54,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6671417344. Throughput: 0: 53541.9. Samples: 1161974160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:54,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 15:01:56,459][52263] Updated weights for policy 0, policy_version 407199 (0.0031) [2024-04-27 15:01:59,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6671663104. Throughput: 0: 53324.4. Samples: 1162128060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:01:59,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 15:01:59,820][52263] Updated weights for policy 0, policy_version 407209 (0.0029) [2024-04-27 15:02:02,633][52263] Updated weights for policy 0, policy_version 407219 (0.0033) [2024-04-27 15:02:04,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6671941632. Throughput: 0: 53276.4. Samples: 1162451760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:02:04,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 15:02:05,987][52263] Updated weights for policy 0, policy_version 407229 (0.0034) [2024-04-27 15:02:08,894][52263] Updated weights for policy 0, policy_version 407239 (0.0032) [2024-04-27 15:02:09,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6672203776. Throughput: 0: 53220.1. Samples: 1162770120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:02:09,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 15:02:12,007][52263] Updated weights for policy 0, policy_version 407249 (0.0033) [2024-04-27 15:02:14,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6672498688. Throughput: 0: 53453.4. Samples: 1162936480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:02:14,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 15:02:14,844][52263] Updated weights for policy 0, policy_version 407259 (0.0034) [2024-04-27 15:02:18,111][52263] Updated weights for policy 0, policy_version 407269 (0.0032) [2024-04-27 15:02:19,106][52031] Fps is (10 sec: 52428.6, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 6672728064. Throughput: 0: 53514.6. Samples: 1163257500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:02:19,107][52031] Avg episode reward: [(0, '0.491')] [2024-04-27 15:02:20,999][52263] Updated weights for policy 0, policy_version 407279 (0.0033) [2024-04-27 15:02:24,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6673006592. Throughput: 0: 53471.4. Samples: 1163573360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 15:02:24,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 15:02:24,282][52263] Updated weights for policy 0, policy_version 407289 (0.0027) [2024-04-27 15:02:27,237][52263] Updated weights for policy 0, policy_version 407299 (0.0032) [2024-04-27 15:02:29,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53317.4). Total num frames: 6673268736. Throughput: 0: 53125.4. Samples: 1163730240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:02:29,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 15:02:30,477][52263] Updated weights for policy 0, policy_version 407309 (0.0029) [2024-04-27 15:02:33,284][52263] Updated weights for policy 0, policy_version 407319 (0.0032) [2024-04-27 15:02:34,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6673530880. Throughput: 0: 53180.9. Samples: 1164052900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:02:34,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 15:02:36,619][52263] Updated weights for policy 0, policy_version 407329 (0.0025) [2024-04-27 15:02:37,564][52242] Signal inference workers to stop experience collection... (17500 times) [2024-04-27 15:02:37,587][52263] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-04-27 15:02:37,661][52242] Signal inference workers to resume experience collection... (17500 times) [2024-04-27 15:02:37,661][52263] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-04-27 15:02:39,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.8, 300 sec: 53539.6). Total num frames: 6673809408. Throughput: 0: 53238.5. Samples: 1164369900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:02:39,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 15:02:39,523][52263] Updated weights for policy 0, policy_version 407339 (0.0027) [2024-04-27 15:02:42,702][52263] Updated weights for policy 0, policy_version 407349 (0.0026) [2024-04-27 15:02:44,106][52031] Fps is (10 sec: 54067.4, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6674071552. Throughput: 0: 53439.2. Samples: 1164532820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:02:44,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 15:02:45,810][52263] Updated weights for policy 0, policy_version 407359 (0.0027) [2024-04-27 15:02:48,858][52263] Updated weights for policy 0, policy_version 407369 (0.0033) [2024-04-27 15:02:49,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6674350080. Throughput: 0: 53266.3. Samples: 1164848740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:02:49,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 15:02:51,818][52263] Updated weights for policy 0, policy_version 407379 (0.0029) [2024-04-27 15:02:54,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6674612224. Throughput: 0: 53359.5. Samples: 1165171300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:02:54,107][52031] Avg episode reward: [(0, '0.506')] [2024-04-27 15:02:54,884][52263] Updated weights for policy 0, policy_version 407389 (0.0032) [2024-04-27 15:02:57,903][52263] Updated weights for policy 0, policy_version 407399 (0.0026) [2024-04-27 15:02:59,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6674890752. Throughput: 0: 53270.2. Samples: 1165333640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:02:59,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 15:03:01,156][52263] Updated weights for policy 0, policy_version 407409 (0.0033) [2024-04-27 15:03:04,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6675136512. Throughput: 0: 53228.8. Samples: 1165652800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:03:04,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 15:03:04,294][52263] Updated weights for policy 0, policy_version 407419 (0.0030) [2024-04-27 15:03:07,248][52263] Updated weights for policy 0, policy_version 407429 (0.0029) [2024-04-27 15:03:09,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6675431424. Throughput: 0: 53314.7. Samples: 1165972520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:03:09,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 15:03:10,455][52263] Updated weights for policy 0, policy_version 407439 (0.0035) [2024-04-27 15:03:13,219][52263] Updated weights for policy 0, policy_version 407449 (0.0026) [2024-04-27 15:03:14,106][52031] Fps is (10 sec: 52429.5, 60 sec: 52702.0, 300 sec: 53373.0). Total num frames: 6675660800. Throughput: 0: 53475.6. Samples: 1166136640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:03:14,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 15:03:16,755][52263] Updated weights for policy 0, policy_version 407459 (0.0026) [2024-04-27 15:03:19,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6675955712. Throughput: 0: 53534.2. Samples: 1166461940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:03:19,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 15:03:19,346][52263] Updated weights for policy 0, policy_version 407469 (0.0031) [2024-04-27 15:03:22,697][52263] Updated weights for policy 0, policy_version 407479 (0.0036) [2024-04-27 15:03:24,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6676234240. Throughput: 0: 53632.6. Samples: 1166783360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:03:24,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 15:03:25,675][52263] Updated weights for policy 0, policy_version 407489 (0.0032) [2024-04-27 15:03:28,652][52263] Updated weights for policy 0, policy_version 407499 (0.0027) [2024-04-27 15:03:29,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6676480000. Throughput: 0: 53573.4. Samples: 1166943620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:03:29,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 15:03:31,808][52263] Updated weights for policy 0, policy_version 407509 (0.0029) [2024-04-27 15:03:34,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6676758528. Throughput: 0: 53722.1. Samples: 1167266240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:03:34,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 15:03:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000407517_6676758528.pth... [2024-04-27 15:03:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000406734_6663929856.pth [2024-04-27 15:03:34,835][52263] Updated weights for policy 0, policy_version 407519 (0.0028) [2024-04-27 15:03:37,771][52263] Updated weights for policy 0, policy_version 407529 (0.0030) [2024-04-27 15:03:39,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6677020672. Throughput: 0: 53613.3. Samples: 1167583900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:03:39,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:03:41,051][52263] Updated weights for policy 0, policy_version 407539 (0.0034) [2024-04-27 15:03:43,764][52263] Updated weights for policy 0, policy_version 407549 (0.0034) [2024-04-27 15:03:44,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.1, 300 sec: 53428.5). 
Total num frames: 6677282816. Throughput: 0: 53528.2. Samples: 1167742400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 15:03:44,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 15:03:45,872][52242] Signal inference workers to stop experience collection... (17550 times) [2024-04-27 15:03:45,901][52263] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-04-27 15:03:45,932][52242] Signal inference workers to resume experience collection... (17550 times) [2024-04-27 15:03:45,934][52263] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-04-27 15:03:47,112][52263] Updated weights for policy 0, policy_version 407559 (0.0028) [2024-04-27 15:03:49,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6677561344. Throughput: 0: 53553.3. Samples: 1168062700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:03:49,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 15:03:49,974][52263] Updated weights for policy 0, policy_version 407569 (0.0038) [2024-04-27 15:03:53,206][52263] Updated weights for policy 0, policy_version 407579 (0.0030) [2024-04-27 15:03:54,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6677823488. Throughput: 0: 53613.1. Samples: 1168385100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:03:54,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 15:03:56,407][52263] Updated weights for policy 0, policy_version 407589 (0.0035) [2024-04-27 15:03:59,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 6678085632. Throughput: 0: 53620.5. Samples: 1168549560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:03:59,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 15:03:59,189][52263] Updated weights for policy 0, policy_version 407599 (0.0029) [2024-04-27 15:04:02,618][52263] Updated weights for policy 0, policy_version 407609 (0.0034) [2024-04-27 15:04:04,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6678347776. Throughput: 0: 53598.5. Samples: 1168873880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:04,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 15:04:05,365][52263] Updated weights for policy 0, policy_version 407619 (0.0030) [2024-04-27 15:04:08,626][52263] Updated weights for policy 0, policy_version 407629 (0.0032) [2024-04-27 15:04:09,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6678626304. Throughput: 0: 53517.7. Samples: 1169191660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:09,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 15:04:11,485][52263] Updated weights for policy 0, policy_version 407639 (0.0031) [2024-04-27 15:04:14,106][52031] Fps is (10 sec: 52430.0, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6678872064. Throughput: 0: 53386.7. Samples: 1169346020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:14,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:04:14,825][52263] Updated weights for policy 0, policy_version 407649 (0.0031) [2024-04-27 15:04:17,478][52263] Updated weights for policy 0, policy_version 407659 (0.0029) [2024-04-27 15:04:19,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6679166976. Throughput: 0: 53394.3. Samples: 1169668980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:19,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 15:04:21,033][52263] Updated weights for policy 0, policy_version 407669 (0.0037) [2024-04-27 15:04:23,540][52263] Updated weights for policy 0, policy_version 407679 (0.0038) [2024-04-27 15:04:24,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6679445504. Throughput: 0: 53467.3. Samples: 1169989920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:24,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 15:04:27,012][52263] Updated weights for policy 0, policy_version 407689 (0.0031) [2024-04-27 15:04:29,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6679691264. Throughput: 0: 53499.1. Samples: 1170149860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:29,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 15:04:29,599][52263] Updated weights for policy 0, policy_version 407699 (0.0029) [2024-04-27 15:04:33,016][52263] Updated weights for policy 0, policy_version 407709 (0.0028) [2024-04-27 15:04:34,106][52031] Fps is (10 sec: 49151.0, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6679937024. Throughput: 0: 53509.0. Samples: 1170470600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:34,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 15:04:35,779][52263] Updated weights for policy 0, policy_version 407719 (0.0029) [2024-04-27 15:04:39,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6680215552. Throughput: 0: 53521.2. Samples: 1170793560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:39,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 15:04:39,156][52263] Updated weights for policy 0, policy_version 407729 (0.0028) [2024-04-27 15:04:41,864][52263] Updated weights for policy 0, policy_version 407739 (0.0030) [2024-04-27 15:04:44,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6680477696. Throughput: 0: 53428.9. Samples: 1170953860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:44,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 15:04:45,452][52263] Updated weights for policy 0, policy_version 407749 (0.0030) [2024-04-27 15:04:47,878][52263] Updated weights for policy 0, policy_version 407759 (0.0027) [2024-04-27 15:04:48,403][52242] Signal inference workers to stop experience collection... (17600 times) [2024-04-27 15:04:48,403][52242] Signal inference workers to resume experience collection... (17600 times) [2024-04-27 15:04:48,425][52263] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-04-27 15:04:48,426][52263] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-04-27 15:04:49,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 53372.9). Total num frames: 6680772608. Throughput: 0: 53290.3. Samples: 1171271940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:49,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 15:04:51,432][52263] Updated weights for policy 0, policy_version 407769 (0.0031) [2024-04-27 15:04:53,999][52263] Updated weights for policy 0, policy_version 407779 (0.0030) [2024-04-27 15:04:54,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6681051136. Throughput: 0: 53292.0. Samples: 1171589800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:54,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 15:04:57,467][52263] Updated weights for policy 0, policy_version 407789 (0.0035) [2024-04-27 15:04:59,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6681280512. Throughput: 0: 53470.2. Samples: 1171752180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:04:59,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 15:05:00,120][52263] Updated weights for policy 0, policy_version 407799 (0.0027) [2024-04-27 15:05:03,647][52263] Updated weights for policy 0, policy_version 407809 (0.0028) [2024-04-27 15:05:04,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6681559040. Throughput: 0: 53478.1. Samples: 1172075500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 15:05:04,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 15:05:06,351][52263] Updated weights for policy 0, policy_version 407819 (0.0026) [2024-04-27 15:05:09,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6681821184. Throughput: 0: 53428.7. Samples: 1172394220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:09,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 15:05:09,840][52263] Updated weights for policy 0, policy_version 407829 (0.0033) [2024-04-27 15:05:12,376][52263] Updated weights for policy 0, policy_version 407839 (0.0027) [2024-04-27 15:05:14,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.0, 300 sec: 53373.0). Total num frames: 6682099712. Throughput: 0: 53527.1. Samples: 1172558580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:14,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 15:05:16,015][52263] Updated weights for policy 0, policy_version 407849 (0.0031) [2024-04-27 15:05:18,568][52263] Updated weights for policy 0, policy_version 407859 (0.0035) [2024-04-27 15:05:19,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6682378240. Throughput: 0: 53471.2. Samples: 1172876800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:19,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:05:22,346][52263] Updated weights for policy 0, policy_version 407869 (0.0031) [2024-04-27 15:05:24,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6682640384. Throughput: 0: 53412.4. Samples: 1173197120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:24,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 15:05:24,558][52263] Updated weights for policy 0, policy_version 407879 (0.0028) [2024-04-27 15:05:28,392][52263] Updated weights for policy 0, policy_version 407889 (0.0027) [2024-04-27 15:05:29,106][52031] Fps is (10 sec: 49151.9, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6682869760. Throughput: 0: 53337.3. Samples: 1173354040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:29,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:05:30,662][52263] Updated weights for policy 0, policy_version 407899 (0.0035) [2024-04-27 15:05:34,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.1, 300 sec: 53540.6). Total num frames: 6683164672. Throughput: 0: 53358.3. Samples: 1173673060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:34,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:05:34,114][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000407908_6683164672.pth... 
[2024-04-27 15:05:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000407125_6670336000.pth [2024-04-27 15:05:34,449][52263] Updated weights for policy 0, policy_version 407909 (0.0030) [2024-04-27 15:05:36,721][52263] Updated weights for policy 0, policy_version 407919 (0.0026) [2024-04-27 15:05:39,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 6683426816. Throughput: 0: 53365.2. Samples: 1173991240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:39,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 15:05:40,577][52263] Updated weights for policy 0, policy_version 407929 (0.0029) [2024-04-27 15:05:42,730][52242] Signal inference workers to stop experience collection... (17650 times) [2024-04-27 15:05:42,730][52242] Signal inference workers to resume experience collection... (17650 times) [2024-04-27 15:05:42,761][52263] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-04-27 15:05:42,767][52263] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-04-27 15:05:42,843][52263] Updated weights for policy 0, policy_version 407939 (0.0031) [2024-04-27 15:05:44,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6683705344. Throughput: 0: 53602.2. Samples: 1174164280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:44,107][52031] Avg episode reward: [(0, '0.512')] [2024-04-27 15:05:46,731][52263] Updated weights for policy 0, policy_version 407949 (0.0026) [2024-04-27 15:05:48,999][52263] Updated weights for policy 0, policy_version 407959 (0.0032) [2024-04-27 15:05:49,107][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6684000256. Throughput: 0: 53576.1. Samples: 1174486420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:49,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 15:05:52,723][52263] Updated weights for policy 0, policy_version 407969 (0.0026) [2024-04-27 15:05:54,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52701.9, 300 sec: 53373.0). Total num frames: 6684213248. Throughput: 0: 53616.0. Samples: 1174806940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:54,108][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 15:05:55,086][52263] Updated weights for policy 0, policy_version 407979 (0.0028) [2024-04-27 15:05:58,759][52263] Updated weights for policy 0, policy_version 407989 (0.0035) [2024-04-27 15:05:59,107][52031] Fps is (10 sec: 49152.0, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6684491776. Throughput: 0: 53243.9. Samples: 1174954560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:05:59,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 15:06:01,079][52263] Updated weights for policy 0, policy_version 407999 (0.0024) [2024-04-27 15:06:04,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6684753920. Throughput: 0: 53429.5. Samples: 1175281140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:06:04,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 15:06:05,283][52263] Updated weights for policy 0, policy_version 408009 (0.0030) [2024-04-27 15:06:07,304][52263] Updated weights for policy 0, policy_version 408019 (0.0028) [2024-04-27 15:06:09,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53520.9, 300 sec: 53372.9). Total num frames: 6685032448. Throughput: 0: 53425.1. Samples: 1175601260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:06:09,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 15:06:11,256][52263] Updated weights for policy 0, policy_version 408029 (0.0030) [2024-04-27 15:06:13,414][52263] Updated weights for policy 0, policy_version 408039 (0.0038) [2024-04-27 15:06:14,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6685310976. Throughput: 0: 53734.0. Samples: 1175772080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:06:14,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 15:06:17,160][52263] Updated weights for policy 0, policy_version 408049 (0.0040) [2024-04-27 15:06:19,106][52031] Fps is (10 sec: 57345.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6685605888. Throughput: 0: 53750.8. Samples: 1176091840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:06:19,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 15:06:19,488][52263] Updated weights for policy 0, policy_version 408059 (0.0026) [2024-04-27 15:06:23,483][52263] Updated weights for policy 0, policy_version 408069 (0.0024) [2024-04-27 15:06:24,107][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6685835264. Throughput: 0: 53860.2. Samples: 1176414940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 15:06:24,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 15:06:25,582][52263] Updated weights for policy 0, policy_version 408079 (0.0029) [2024-04-27 15:06:29,107][52031] Fps is (10 sec: 49150.9, 60 sec: 53793.9, 300 sec: 53428.5). Total num frames: 6686097408. Throughput: 0: 53409.1. Samples: 1176567700. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:06:29,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 15:06:29,524][52263] Updated weights for policy 0, policy_version 408089 (0.0029) [2024-04-27 15:06:31,719][52263] Updated weights for policy 0, policy_version 408099 (0.0032) [2024-04-27 15:06:34,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53372.9). Total num frames: 6686359552. Throughput: 0: 53412.6. Samples: 1176889980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:06:34,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 15:06:35,690][52263] Updated weights for policy 0, policy_version 408109 (0.0030) [2024-04-27 15:06:36,972][52242] Signal inference workers to stop experience collection... (17700 times) [2024-04-27 15:06:37,018][52263] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-04-27 15:06:37,035][52242] Signal inference workers to resume experience collection... (17700 times) [2024-04-27 15:06:37,039][52263] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-04-27 15:06:37,860][52263] Updated weights for policy 0, policy_version 408119 (0.0039) [2024-04-27 15:06:39,107][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6686654464. Throughput: 0: 53333.2. Samples: 1177206940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:06:39,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 15:06:41,914][52263] Updated weights for policy 0, policy_version 408129 (0.0031) [2024-04-27 15:06:44,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6686932992. Throughput: 0: 53802.9. Samples: 1177375680. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:06:44,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:06:44,143][52263] Updated weights for policy 0, policy_version 408139 (0.0030) [2024-04-27 15:06:47,905][52263] Updated weights for policy 0, policy_version 408149 (0.0035) [2024-04-27 15:06:49,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6687195136. Throughput: 0: 53783.3. Samples: 1177701380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:06:49,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 15:06:50,099][52263] Updated weights for policy 0, policy_version 408159 (0.0030) [2024-04-27 15:06:54,107][52031] Fps is (10 sec: 49151.4, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6687424512. Throughput: 0: 53778.4. Samples: 1178021280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:06:54,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:06:54,277][52263] Updated weights for policy 0, policy_version 408169 (0.0029) [2024-04-27 15:06:56,278][52263] Updated weights for policy 0, policy_version 408179 (0.0028) [2024-04-27 15:06:59,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6687703040. Throughput: 0: 53264.9. Samples: 1178169000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:06:59,108][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 15:07:00,248][52263] Updated weights for policy 0, policy_version 408189 (0.0037) [2024-04-27 15:07:02,495][52263] Updated weights for policy 0, policy_version 408199 (0.0037) [2024-04-27 15:07:04,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6687965184. Throughput: 0: 53278.2. Samples: 1178489360. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:07:04,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:07:06,312][52263] Updated weights for policy 0, policy_version 408209 (0.0029) [2024-04-27 15:07:08,559][52263] Updated weights for policy 0, policy_version 408219 (0.0031) [2024-04-27 15:07:09,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6688260096. Throughput: 0: 53343.0. Samples: 1178815380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:07:09,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 15:07:12,442][52263] Updated weights for policy 0, policy_version 408229 (0.0029) [2024-04-27 15:07:14,107][52031] Fps is (10 sec: 58981.5, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 6688555008. Throughput: 0: 53758.3. Samples: 1178986820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:07:14,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 15:07:14,772][52263] Updated weights for policy 0, policy_version 408239 (0.0028) [2024-04-27 15:07:18,548][52263] Updated weights for policy 0, policy_version 408249 (0.0030) [2024-04-27 15:07:19,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6688800768. Throughput: 0: 53764.0. Samples: 1179309360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:07:19,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 15:07:20,951][52263] Updated weights for policy 0, policy_version 408259 (0.0035) [2024-04-27 15:07:24,106][52031] Fps is (10 sec: 47514.4, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6689030144. Throughput: 0: 53811.3. Samples: 1179628440. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:07:24,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 15:07:24,510][52263] Updated weights for policy 0, policy_version 408269 (0.0028) [2024-04-27 15:07:26,987][52263] Updated weights for policy 0, policy_version 408279 (0.0028) [2024-04-27 15:07:29,107][52031] Fps is (10 sec: 49151.4, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6689292288. Throughput: 0: 53400.6. Samples: 1179778720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:07:29,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 15:07:30,358][52242] Signal inference workers to stop experience collection... (17750 times) [2024-04-27 15:07:30,387][52263] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-04-27 15:07:30,449][52242] Signal inference workers to resume experience collection... (17750 times) [2024-04-27 15:07:30,449][52263] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-04-27 15:07:30,565][52263] Updated weights for policy 0, policy_version 408289 (0.0030) [2024-04-27 15:07:33,078][52263] Updated weights for policy 0, policy_version 408299 (0.0028) [2024-04-27 15:07:34,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6689587200. Throughput: 0: 53456.5. Samples: 1180106920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:07:34,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 15:07:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000408300_6689587200.pth... [2024-04-27 15:07:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000407517_6676758528.pth [2024-04-27 15:07:36,707][52263] Updated weights for policy 0, policy_version 408309 (0.0031) [2024-04-27 15:07:39,107][52031] Fps is (10 sec: 58982.2, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6689882112. Throughput: 0: 53437.2. 
Samples: 1180425960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:07:39,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 15:07:39,741][52263] Updated weights for policy 0, policy_version 408319 (0.0031) [2024-04-27 15:07:42,784][52263] Updated weights for policy 0, policy_version 408329 (0.0032) [2024-04-27 15:07:44,107][52031] Fps is (10 sec: 58981.7, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 6690177024. Throughput: 0: 53978.7. Samples: 1180598040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 15:07:44,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:07:45,806][52263] Updated weights for policy 0, policy_version 408339 (0.0026) [2024-04-27 15:07:48,714][52263] Updated weights for policy 0, policy_version 408349 (0.0032) [2024-04-27 15:07:49,107][52031] Fps is (10 sec: 50791.0, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6690390016. Throughput: 0: 53976.3. Samples: 1180918300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:07:49,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 15:07:51,986][52263] Updated weights for policy 0, policy_version 408359 (0.0024) [2024-04-27 15:07:54,107][52031] Fps is (10 sec: 47513.5, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6690652160. Throughput: 0: 54044.5. Samples: 1181247380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:07:54,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 15:07:54,823][52263] Updated weights for policy 0, policy_version 408369 (0.0040) [2024-04-27 15:07:58,197][52263] Updated weights for policy 0, policy_version 408379 (0.0028) [2024-04-27 15:07:59,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6690914304. Throughput: 0: 53305.4. Samples: 1181385560. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:07:59,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 15:08:00,971][52263] Updated weights for policy 0, policy_version 408389 (0.0031) [2024-04-27 15:08:04,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6691192832. Throughput: 0: 53292.3. Samples: 1181707520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:04,107][52031] Avg episode reward: [(0, '0.479')] [2024-04-27 15:08:04,188][52263] Updated weights for policy 0, policy_version 408399 (0.0029) [2024-04-27 15:08:07,036][52263] Updated weights for policy 0, policy_version 408409 (0.0028) [2024-04-27 15:08:07,382][52242] Signal inference workers to stop experience collection... (17800 times) [2024-04-27 15:08:07,430][52263] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-04-27 15:08:07,446][52242] Signal inference workers to resume experience collection... (17800 times) [2024-04-27 15:08:07,446][52263] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-04-27 15:08:09,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.2, 300 sec: 53650.6). Total num frames: 6691487744. Throughput: 0: 53328.8. Samples: 1182028240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:09,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 15:08:10,384][52263] Updated weights for policy 0, policy_version 408419 (0.0034) [2024-04-27 15:08:13,172][52263] Updated weights for policy 0, policy_version 408429 (0.0028) [2024-04-27 15:08:14,106][52031] Fps is (10 sec: 58983.6, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6691782656. Throughput: 0: 54086.9. Samples: 1182212620. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:14,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 15:08:16,570][52263] Updated weights for policy 0, policy_version 408439 (0.0033) [2024-04-27 15:08:19,068][52263] Updated weights for policy 0, policy_version 408449 (0.0036) [2024-04-27 15:08:19,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6692028416. Throughput: 0: 53922.2. Samples: 1182533420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:19,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 15:08:22,522][52263] Updated weights for policy 0, policy_version 408459 (0.0035) [2024-04-27 15:08:24,106][52031] Fps is (10 sec: 45875.1, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6692241408. Throughput: 0: 54012.7. Samples: 1182856520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:24,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 15:08:25,233][52263] Updated weights for policy 0, policy_version 408469 (0.0026) [2024-04-27 15:08:28,714][52263] Updated weights for policy 0, policy_version 408479 (0.0027) [2024-04-27 15:08:29,106][52031] Fps is (10 sec: 49151.8, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6692519936. Throughput: 0: 53362.3. Samples: 1182999340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:29,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 15:08:31,252][52263] Updated weights for policy 0, policy_version 408489 (0.0030) [2024-04-27 15:08:34,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6692798464. Throughput: 0: 53382.7. Samples: 1183320520. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:34,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 15:08:35,021][52263] Updated weights for policy 0, policy_version 408499 (0.0028) [2024-04-27 15:08:37,412][52263] Updated weights for policy 0, policy_version 408509 (0.0025) [2024-04-27 15:08:39,106][52031] Fps is (10 sec: 58983.0, 60 sec: 53794.4, 300 sec: 53650.7). Total num frames: 6693109760. Throughput: 0: 53215.3. Samples: 1183642060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:39,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 15:08:41,080][52263] Updated weights for policy 0, policy_version 408519 (0.0036) [2024-04-27 15:08:43,388][52263] Updated weights for policy 0, policy_version 408529 (0.0033) [2024-04-27 15:08:44,106][52031] Fps is (10 sec: 60621.2, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6693404672. Throughput: 0: 54165.1. Samples: 1183822980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:44,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 15:08:47,073][52263] Updated weights for policy 0, policy_version 408539 (0.0028) [2024-04-27 15:08:49,107][52031] Fps is (10 sec: 49150.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6693601280. Throughput: 0: 54151.1. Samples: 1184144320. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:49,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:08:49,278][52242] Signal inference workers to stop experience collection... (17850 times) [2024-04-27 15:08:49,320][52263] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-04-27 15:08:49,381][52242] Signal inference workers to resume experience collection... 
(17850 times) [2024-04-27 15:08:49,382][52263] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-04-27 15:08:49,502][52263] Updated weights for policy 0, policy_version 408549 (0.0034) [2024-04-27 15:08:53,179][52263] Updated weights for policy 0, policy_version 408559 (0.0029) [2024-04-27 15:08:54,107][52031] Fps is (10 sec: 45874.3, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6693863424. Throughput: 0: 54056.8. Samples: 1184460800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:54,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 15:08:55,664][52263] Updated weights for policy 0, policy_version 408569 (0.0027) [2024-04-27 15:08:59,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6694141952. Throughput: 0: 53138.6. Samples: 1184603860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:08:59,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 15:08:59,201][52263] Updated weights for policy 0, policy_version 408579 (0.0028) [2024-04-27 15:09:01,731][52263] Updated weights for policy 0, policy_version 408589 (0.0034) [2024-04-27 15:09:04,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6694420480. Throughput: 0: 53115.1. Samples: 1184923600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-04-27 15:09:04,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 15:09:05,501][52263] Updated weights for policy 0, policy_version 408599 (0.0033) [2024-04-27 15:09:07,935][52263] Updated weights for policy 0, policy_version 408609 (0.0026) [2024-04-27 15:09:09,107][52031] Fps is (10 sec: 57343.3, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6694715392. Throughput: 0: 52980.7. Samples: 1185240660. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:09,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 15:09:11,700][52263] Updated weights for policy 0, policy_version 408619 (0.0032) [2024-04-27 15:09:13,903][52263] Updated weights for policy 0, policy_version 408629 (0.0037) [2024-04-27 15:09:14,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6694993920. Throughput: 0: 53839.2. Samples: 1185422100. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:14,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 15:09:17,774][52263] Updated weights for policy 0, policy_version 408639 (0.0031) [2024-04-27 15:09:19,107][52031] Fps is (10 sec: 49152.0, 60 sec: 52974.8, 300 sec: 53428.5). Total num frames: 6695206912. Throughput: 0: 53855.0. Samples: 1185744000. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:19,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 15:09:20,001][52263] Updated weights for policy 0, policy_version 408649 (0.0025) [2024-04-27 15:09:23,948][52263] Updated weights for policy 0, policy_version 408659 (0.0033) [2024-04-27 15:09:24,107][52031] Fps is (10 sec: 47512.8, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6695469056. Throughput: 0: 53804.2. Samples: 1186063260. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:24,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:09:26,141][52263] Updated weights for policy 0, policy_version 408669 (0.0032) [2024-04-27 15:09:29,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6695747584. Throughput: 0: 53023.9. Samples: 1186209060. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:29,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 15:09:30,081][52263] Updated weights for policy 0, policy_version 408679 (0.0027) [2024-04-27 15:09:32,275][52263] Updated weights for policy 0, policy_version 408689 (0.0027) [2024-04-27 15:09:34,106][52031] Fps is (10 sec: 57344.7, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6696042496. Throughput: 0: 52992.2. Samples: 1186528960. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:34,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 15:09:34,112][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000408694_6696042496.pth... [2024-04-27 15:09:34,156][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000407908_6683164672.pth [2024-04-27 15:09:36,287][52263] Updated weights for policy 0, policy_version 408699 (0.0027) [2024-04-27 15:09:38,331][52263] Updated weights for policy 0, policy_version 408709 (0.0032) [2024-04-27 15:09:39,106][52031] Fps is (10 sec: 58983.2, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6696337408. Throughput: 0: 53151.3. Samples: 1186852600. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:39,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 15:09:42,244][52263] Updated weights for policy 0, policy_version 408719 (0.0033) [2024-04-27 15:09:42,662][52242] Signal inference workers to stop experience collection... (17900 times) [2024-04-27 15:09:42,664][52242] Signal inference workers to resume experience collection... (17900 times) [2024-04-27 15:09:42,683][52263] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-04-27 15:09:42,683][52263] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-04-27 15:09:44,106][52031] Fps is (10 sec: 52428.5, 60 sec: 52701.8, 300 sec: 53539.6). Total num frames: 6696566784. Throughput: 0: 53798.1. 
Samples: 1187024780. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:44,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 15:09:44,480][52263] Updated weights for policy 0, policy_version 408729 (0.0034) [2024-04-27 15:09:48,286][52263] Updated weights for policy 0, policy_version 408739 (0.0028) [2024-04-27 15:09:49,106][52031] Fps is (10 sec: 47513.5, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6696812544. Throughput: 0: 53748.0. Samples: 1187342260. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:49,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 15:09:50,604][52263] Updated weights for policy 0, policy_version 408749 (0.0028) [2024-04-27 15:09:54,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53521.1, 300 sec: 53539.5). Total num frames: 6697074688. Throughput: 0: 53952.4. Samples: 1187668520. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:54,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 15:09:54,438][52263] Updated weights for policy 0, policy_version 408759 (0.0030) [2024-04-27 15:09:56,689][52263] Updated weights for policy 0, policy_version 408769 (0.0027) [2024-04-27 15:09:59,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6697353216. Throughput: 0: 53182.7. Samples: 1187815320. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:09:59,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 15:10:00,691][52263] Updated weights for policy 0, policy_version 408779 (0.0029) [2024-04-27 15:10:02,787][52263] Updated weights for policy 0, policy_version 408789 (0.0030) [2024-04-27 15:10:04,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6697631744. Throughput: 0: 53117.7. Samples: 1188134300. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:10:04,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:10:06,775][52263] Updated weights for policy 0, policy_version 408799 (0.0031) [2024-04-27 15:10:08,914][52263] Updated weights for policy 0, policy_version 408809 (0.0033) [2024-04-27 15:10:09,106][52031] Fps is (10 sec: 57343.5, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6697926656. Throughput: 0: 53125.5. Samples: 1188453900. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:10:09,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 15:10:13,024][52263] Updated weights for policy 0, policy_version 408819 (0.0034) [2024-04-27 15:10:14,106][52031] Fps is (10 sec: 52429.7, 60 sec: 52701.8, 300 sec: 53484.0). Total num frames: 6698156032. Throughput: 0: 53624.1. Samples: 1188622140. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:10:14,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 15:10:15,150][52263] Updated weights for policy 0, policy_version 408829 (0.0031) [2024-04-27 15:10:19,106][52031] Fps is (10 sec: 47513.7, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6698401792. Throughput: 0: 53505.8. Samples: 1188936720. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:10:19,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:10:19,166][52263] Updated weights for policy 0, policy_version 408839 (0.0028) [2024-04-27 15:10:21,311][52263] Updated weights for policy 0, policy_version 408849 (0.0030) [2024-04-27 15:10:24,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6698680320. Throughput: 0: 53461.1. Samples: 1189258360. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:10:24,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 15:10:25,240][52263] Updated weights for policy 0, policy_version 408859 (0.0030) [2024-04-27 15:10:27,471][52263] Updated weights for policy 0, policy_version 408869 (0.0033) [2024-04-27 15:10:29,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6698942464. Throughput: 0: 53219.3. Samples: 1189419640. Policy #0 lag: (min: 1.0, avg: 12.9, max: 22.0) [2024-04-27 15:10:29,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:10:31,298][52263] Updated weights for policy 0, policy_version 408879 (0.0032) [2024-04-27 15:10:33,569][52263] Updated weights for policy 0, policy_version 408889 (0.0030) [2024-04-27 15:10:34,107][52031] Fps is (10 sec: 57344.4, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6699253760. Throughput: 0: 53203.0. Samples: 1189736400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:10:34,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 15:10:37,468][52263] Updated weights for policy 0, policy_version 408899 (0.0026) [2024-04-27 15:10:38,834][52242] Signal inference workers to stop experience collection... (17950 times) [2024-04-27 15:10:38,886][52263] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-04-27 15:10:38,897][52242] Signal inference workers to resume experience collection... (17950 times) [2024-04-27 15:10:38,902][52263] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-04-27 15:10:39,106][52031] Fps is (10 sec: 55705.2, 60 sec: 52701.8, 300 sec: 53539.6). Total num frames: 6699499520. Throughput: 0: 53022.4. Samples: 1190054520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:10:39,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 15:10:39,870][52263] Updated weights for policy 0, policy_version 408909 (0.0035) [2024-04-27 15:10:43,422][52263] Updated weights for policy 0, policy_version 408919 (0.0028) [2024-04-27 15:10:44,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6699761664. Throughput: 0: 53406.0. Samples: 1190218600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:10:44,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 15:10:46,025][52263] Updated weights for policy 0, policy_version 408929 (0.0032) [2024-04-27 15:10:49,107][52031] Fps is (10 sec: 49151.1, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6699991040. Throughput: 0: 53540.4. Samples: 1190543620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:10:49,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 15:10:49,635][52263] Updated weights for policy 0, policy_version 408939 (0.0023) [2024-04-27 15:10:52,097][52263] Updated weights for policy 0, policy_version 408949 (0.0025) [2024-04-27 15:10:54,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6700285952. Throughput: 0: 53471.9. Samples: 1190860140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:10:54,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 15:10:55,864][52263] Updated weights for policy 0, policy_version 408959 (0.0032) [2024-04-27 15:10:58,212][52263] Updated weights for policy 0, policy_version 408969 (0.0031) [2024-04-27 15:10:59,107][52031] Fps is (10 sec: 57344.6, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6700564480. Throughput: 0: 53218.6. Samples: 1191016980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:10:59,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 15:11:01,946][52263] Updated weights for policy 0, policy_version 408979 (0.0030) [2024-04-27 15:11:04,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6700843008. Throughput: 0: 53360.3. Samples: 1191337940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:04,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 15:11:04,423][52263] Updated weights for policy 0, policy_version 408989 (0.0030) [2024-04-27 15:11:08,081][52263] Updated weights for policy 0, policy_version 408999 (0.0032) [2024-04-27 15:11:09,107][52031] Fps is (10 sec: 54066.4, 60 sec: 52974.8, 300 sec: 53539.6). Total num frames: 6701105152. Throughput: 0: 53433.3. Samples: 1191662860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:09,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:11:10,698][52263] Updated weights for policy 0, policy_version 409009 (0.0032) [2024-04-27 15:11:14,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53247.8, 300 sec: 53372.9). Total num frames: 6701350912. Throughput: 0: 53301.9. Samples: 1191818240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:14,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 15:11:14,151][52263] Updated weights for policy 0, policy_version 409019 (0.0034) [2024-04-27 15:11:16,788][52263] Updated weights for policy 0, policy_version 409029 (0.0036) [2024-04-27 15:11:19,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6701613056. Throughput: 0: 53391.7. Samples: 1192139020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:19,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 15:11:20,373][52263] Updated weights for policy 0, policy_version 409039 (0.0033) [2024-04-27 15:11:22,884][52263] Updated weights for policy 0, policy_version 409049 (0.0036) [2024-04-27 15:11:24,106][52031] Fps is (10 sec: 54068.6, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6701891584. Throughput: 0: 53361.4. Samples: 1192455780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:24,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 15:11:26,471][52263] Updated weights for policy 0, policy_version 409059 (0.0028) [2024-04-27 15:11:29,069][52263] Updated weights for policy 0, policy_version 409069 (0.0028) [2024-04-27 15:11:29,107][52031] Fps is (10 sec: 57343.1, 60 sec: 54067.0, 300 sec: 53650.6). Total num frames: 6702186496. Throughput: 0: 53425.7. Samples: 1192622760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:29,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 15:11:32,560][52263] Updated weights for policy 0, policy_version 409079 (0.0032) [2024-04-27 15:11:34,106][52031] Fps is (10 sec: 52428.4, 60 sec: 52702.0, 300 sec: 53428.5). Total num frames: 6702415872. Throughput: 0: 53311.3. Samples: 1192942620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:34,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 15:11:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000409084_6702432256.pth... [2024-04-27 15:11:34,166][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000408300_6689587200.pth [2024-04-27 15:11:35,286][52263] Updated weights for policy 0, policy_version 409089 (0.0029) [2024-04-27 15:11:38,349][52242] Signal inference workers to stop experience collection... (18000 times) [2024-04-27 15:11:38,350][52242] Signal inference workers to resume experience collection... 
(18000 times) [2024-04-27 15:11:38,367][52263] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-04-27 15:11:38,367][52263] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-04-27 15:11:38,677][52263] Updated weights for policy 0, policy_version 409099 (0.0025) [2024-04-27 15:11:39,107][52031] Fps is (10 sec: 50790.9, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6702694400. Throughput: 0: 53403.1. Samples: 1193263280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:39,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 15:11:41,461][52263] Updated weights for policy 0, policy_version 409109 (0.0032) [2024-04-27 15:11:44,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6702956544. Throughput: 0: 53318.2. Samples: 1193416300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:44,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 15:11:44,994][52263] Updated weights for policy 0, policy_version 409119 (0.0025) [2024-04-27 15:11:47,417][52263] Updated weights for policy 0, policy_version 409129 (0.0037) [2024-04-27 15:11:49,106][52031] Fps is (10 sec: 54067.8, 60 sec: 54067.4, 300 sec: 53595.1). Total num frames: 6703235072. Throughput: 0: 53346.0. Samples: 1193738500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 15:11:49,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 15:11:51,058][52263] Updated weights for policy 0, policy_version 409139 (0.0035) [2024-04-27 15:11:53,457][52263] Updated weights for policy 0, policy_version 409149 (0.0027) [2024-04-27 15:11:54,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6703513600. Throughput: 0: 53253.5. Samples: 1194059260. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:11:54,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 15:11:57,114][52263] Updated weights for policy 0, policy_version 409159 (0.0029) [2024-04-27 15:11:59,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6703775744. Throughput: 0: 53711.8. Samples: 1194235260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:11:59,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 15:11:59,529][52263] Updated weights for policy 0, policy_version 409169 (0.0026) [2024-04-27 15:12:03,229][52263] Updated weights for policy 0, policy_version 409179 (0.0029) [2024-04-27 15:12:04,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6704037888. Throughput: 0: 53728.0. Samples: 1194556780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:04,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:12:05,913][52263] Updated weights for policy 0, policy_version 409189 (0.0034) [2024-04-27 15:12:09,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6704300032. Throughput: 0: 53902.9. Samples: 1194881420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:09,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 15:12:09,332][52263] Updated weights for policy 0, policy_version 409199 (0.0033) [2024-04-27 15:12:11,893][52263] Updated weights for policy 0, policy_version 409209 (0.0027) [2024-04-27 15:12:14,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6704562176. Throughput: 0: 53462.8. Samples: 1195028580. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:14,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 15:12:15,285][52263] Updated weights for policy 0, policy_version 409219 (0.0033) [2024-04-27 15:12:18,035][52263] Updated weights for policy 0, policy_version 409229 (0.0028) [2024-04-27 15:12:19,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6704840704. Throughput: 0: 53524.8. Samples: 1195351240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:19,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 15:12:21,490][52263] Updated weights for policy 0, policy_version 409239 (0.0035) [2024-04-27 15:12:24,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.0, 300 sec: 53650.7). Total num frames: 6705119232. Throughput: 0: 53599.2. Samples: 1195675240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:24,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:12:24,240][52263] Updated weights for policy 0, policy_version 409249 (0.0027) [2024-04-27 15:12:27,605][52263] Updated weights for policy 0, policy_version 409259 (0.0027) [2024-04-27 15:12:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.2, 300 sec: 53539.6). Total num frames: 6705381376. Throughput: 0: 53949.0. Samples: 1195844000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:29,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 15:12:30,281][52263] Updated weights for policy 0, policy_version 409269 (0.0030) [2024-04-27 15:12:33,511][52263] Updated weights for policy 0, policy_version 409279 (0.0029) [2024-04-27 15:12:34,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6705643520. Throughput: 0: 54157.8. Samples: 1196175600. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:34,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:12:36,233][52263] Updated weights for policy 0, policy_version 409289 (0.0028) [2024-04-27 15:12:39,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6705922048. Throughput: 0: 54213.8. Samples: 1196498880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:39,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 15:12:39,845][52263] Updated weights for policy 0, policy_version 409299 (0.0030) [2024-04-27 15:12:42,166][52263] Updated weights for policy 0, policy_version 409309 (0.0030) [2024-04-27 15:12:44,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6706184192. Throughput: 0: 53545.6. Samples: 1196644820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:44,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 15:12:45,904][52263] Updated weights for policy 0, policy_version 409319 (0.0040) [2024-04-27 15:12:48,256][52263] Updated weights for policy 0, policy_version 409329 (0.0033) [2024-04-27 15:12:49,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6706462720. Throughput: 0: 53557.4. Samples: 1196966860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:49,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 15:12:52,334][52263] Updated weights for policy 0, policy_version 409339 (0.0032) [2024-04-27 15:12:52,465][52242] Signal inference workers to stop experience collection... (18050 times) [2024-04-27 15:12:52,472][52242] Signal inference workers to resume experience collection... 
(18050 times) [2024-04-27 15:12:52,491][52263] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-04-27 15:12:52,492][52263] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-04-27 15:12:54,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6706741248. Throughput: 0: 53472.5. Samples: 1197287680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:54,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 15:12:54,447][52263] Updated weights for policy 0, policy_version 409349 (0.0028) [2024-04-27 15:12:58,317][52263] Updated weights for policy 0, policy_version 409359 (0.0036) [2024-04-27 15:12:59,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6707003392. Throughput: 0: 53921.0. Samples: 1197455020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:12:59,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 15:13:00,815][52263] Updated weights for policy 0, policy_version 409369 (0.0030) [2024-04-27 15:13:04,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6707249152. Throughput: 0: 53930.9. Samples: 1197778140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:13:04,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 15:13:04,227][52263] Updated weights for policy 0, policy_version 409379 (0.0033) [2024-04-27 15:13:07,033][52263] Updated weights for policy 0, policy_version 409389 (0.0031) [2024-04-27 15:13:09,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.2, 300 sec: 53372.9). Total num frames: 6707527680. Throughput: 0: 53794.6. Samples: 1198096000. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 15:13:09,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:13:10,514][52263] Updated weights for policy 0, policy_version 409399 (0.0027) [2024-04-27 15:13:13,320][52263] Updated weights for policy 0, policy_version 409409 (0.0032) [2024-04-27 15:13:14,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6707789824. Throughput: 0: 53533.9. Samples: 1198253040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:14,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 15:13:16,759][52263] Updated weights for policy 0, policy_version 409419 (0.0030) [2024-04-27 15:13:19,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6708068352. Throughput: 0: 53213.2. Samples: 1198570200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:19,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 15:13:19,412][52263] Updated weights for policy 0, policy_version 409429 (0.0032) [2024-04-27 15:13:22,773][52263] Updated weights for policy 0, policy_version 409439 (0.0030) [2024-04-27 15:13:24,106][52031] Fps is (10 sec: 55706.8, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6708346880. Throughput: 0: 53144.6. Samples: 1198890380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:24,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 15:13:25,453][52263] Updated weights for policy 0, policy_version 409449 (0.0038) [2024-04-27 15:13:28,765][52263] Updated weights for policy 0, policy_version 409459 (0.0033) [2024-04-27 15:13:29,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6708576256. Throughput: 0: 53567.2. Samples: 1199055340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:29,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:13:31,441][52263] Updated weights for policy 0, policy_version 409469 (0.0032) [2024-04-27 15:13:34,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6708838400. Throughput: 0: 53461.8. Samples: 1199372640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:34,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 15:13:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000409475_6708838400.pth... [2024-04-27 15:13:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000408694_6696042496.pth [2024-04-27 15:13:35,106][52263] Updated weights for policy 0, policy_version 409479 (0.0027) [2024-04-27 15:13:37,585][52263] Updated weights for policy 0, policy_version 409489 (0.0026) [2024-04-27 15:13:39,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53248.0, 300 sec: 53261.9). Total num frames: 6709116928. Throughput: 0: 53364.4. Samples: 1199689080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:39,107][52031] Avg episode reward: [(0, '0.688')] [2024-04-27 15:13:41,400][52263] Updated weights for policy 0, policy_version 409499 (0.0029) [2024-04-27 15:13:43,594][52263] Updated weights for policy 0, policy_version 409509 (0.0026) [2024-04-27 15:13:44,107][52031] Fps is (10 sec: 57343.1, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6709411840. Throughput: 0: 53359.4. Samples: 1199856200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:44,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 15:13:47,458][52263] Updated weights for policy 0, policy_version 409519 (0.0034) [2024-04-27 15:13:49,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6709657600. Throughput: 0: 53241.4. Samples: 1200174000. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:49,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 15:13:49,756][52263] Updated weights for policy 0, policy_version 409529 (0.0029) [2024-04-27 15:13:50,356][52242] Signal inference workers to stop experience collection... (18100 times) [2024-04-27 15:13:50,414][52263] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-04-27 15:13:50,416][52242] Signal inference workers to resume experience collection... (18100 times) [2024-04-27 15:13:50,423][52263] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-04-27 15:13:53,536][52263] Updated weights for policy 0, policy_version 409539 (0.0029) [2024-04-27 15:13:54,107][52031] Fps is (10 sec: 50790.6, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6709919744. Throughput: 0: 53240.9. Samples: 1200491840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:54,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 15:13:56,126][52263] Updated weights for policy 0, policy_version 409549 (0.0033) [2024-04-27 15:13:59,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52701.9, 300 sec: 53373.0). Total num frames: 6710165504. Throughput: 0: 53169.1. Samples: 1200645640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:13:59,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 15:13:59,705][52263] Updated weights for policy 0, policy_version 409559 (0.0030) [2024-04-27 15:14:02,531][52263] Updated weights for policy 0, policy_version 409569 (0.0029) [2024-04-27 15:14:04,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6710476800. Throughput: 0: 53386.2. Samples: 1200972580. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:14:04,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 15:14:05,780][52263] Updated weights for policy 0, policy_version 409579 (0.0029) [2024-04-27 15:14:08,505][52263] Updated weights for policy 0, policy_version 409589 (0.0026) [2024-04-27 15:14:09,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6710722560. Throughput: 0: 53306.0. Samples: 1201289160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:14:09,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 15:14:11,887][52263] Updated weights for policy 0, policy_version 409599 (0.0033) [2024-04-27 15:14:14,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6711017472. Throughput: 0: 53372.0. Samples: 1201457080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:14:14,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 15:14:14,475][52263] Updated weights for policy 0, policy_version 409609 (0.0030) [2024-04-27 15:14:17,974][52263] Updated weights for policy 0, policy_version 409619 (0.0031) [2024-04-27 15:14:19,107][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6711279616. Throughput: 0: 53435.0. Samples: 1201777220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:14:19,108][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 15:14:20,576][52263] Updated weights for policy 0, policy_version 409629 (0.0034) [2024-04-27 15:14:24,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 6711508992. Throughput: 0: 53637.5. Samples: 1202102760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:14:24,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 15:14:24,220][52263] Updated weights for policy 0, policy_version 409639 (0.0027) [2024-04-27 15:14:26,606][52263] Updated weights for policy 0, policy_version 409649 (0.0031) [2024-04-27 15:14:29,107][52031] Fps is (10 sec: 49152.0, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6711771136. Throughput: 0: 53115.7. Samples: 1202246400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 15:14:29,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 15:14:30,405][52263] Updated weights for policy 0, policy_version 409659 (0.0028) [2024-04-27 15:14:32,600][52263] Updated weights for policy 0, policy_version 409669 (0.0028) [2024-04-27 15:14:34,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 6712049664. Throughput: 0: 53061.1. Samples: 1202561740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:14:34,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 15:14:36,417][52263] Updated weights for policy 0, policy_version 409679 (0.0030) [2024-04-27 15:14:38,622][52263] Updated weights for policy 0, policy_version 409689 (0.0031) [2024-04-27 15:14:39,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6712344576. Throughput: 0: 53157.0. Samples: 1202883900. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:14:39,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 15:14:42,672][52263] Updated weights for policy 0, policy_version 409699 (0.0030) [2024-04-27 15:14:42,827][52242] Signal inference workers to stop experience collection... (18150 times) [2024-04-27 15:14:42,827][52242] Signal inference workers to resume experience collection... 
(18150 times) [2024-04-27 15:14:42,850][52263] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-04-27 15:14:42,850][52263] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-04-27 15:14:44,106][52031] Fps is (10 sec: 57343.2, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6712623104. Throughput: 0: 53696.8. Samples: 1203062000. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:14:44,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 15:14:45,399][52263] Updated weights for policy 0, policy_version 409709 (0.0033) [2024-04-27 15:14:48,751][52263] Updated weights for policy 0, policy_version 409719 (0.0027) [2024-04-27 15:14:49,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6712868864. Throughput: 0: 53508.2. Samples: 1203380440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:14:49,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 15:14:51,403][52263] Updated weights for policy 0, policy_version 409729 (0.0030) [2024-04-27 15:14:54,107][52031] Fps is (10 sec: 47513.0, 60 sec: 52974.9, 300 sec: 53372.9). Total num frames: 6713098240. Throughput: 0: 53604.8. Samples: 1203701380. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:14:54,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 15:14:54,735][52263] Updated weights for policy 0, policy_version 409739 (0.0031) [2024-04-27 15:14:57,824][52263] Updated weights for policy 0, policy_version 409749 (0.0031) [2024-04-27 15:14:59,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6713393152. Throughput: 0: 53203.1. Samples: 1203851220. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:14:59,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 15:15:00,810][52263] Updated weights for policy 0, policy_version 409759 (0.0033) [2024-04-27 15:15:03,891][52263] Updated weights for policy 0, policy_version 409769 (0.0030) [2024-04-27 15:15:04,107][52031] Fps is (10 sec: 55706.1, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6713655296. Throughput: 0: 53141.7. Samples: 1204168600. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:04,108][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 15:15:06,884][52263] Updated weights for policy 0, policy_version 409779 (0.0036) [2024-04-27 15:15:09,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6713950208. Throughput: 0: 53056.7. Samples: 1204490320. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:09,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 15:15:09,845][52263] Updated weights for policy 0, policy_version 409789 (0.0027) [2024-04-27 15:15:12,953][52263] Updated weights for policy 0, policy_version 409799 (0.0028) [2024-04-27 15:15:14,106][52031] Fps is (10 sec: 57344.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6714228736. Throughput: 0: 53900.1. Samples: 1204671900. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:14,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 15:15:16,013][52263] Updated weights for policy 0, policy_version 409809 (0.0027) [2024-04-27 15:15:18,940][52263] Updated weights for policy 0, policy_version 409819 (0.0029) [2024-04-27 15:15:19,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53521.2, 300 sec: 53595.2). Total num frames: 6714490880. Throughput: 0: 54076.4. Samples: 1204995180. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:19,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 15:15:22,043][52263] Updated weights for policy 0, policy_version 409829 (0.0028) [2024-04-27 15:15:24,106][52031] Fps is (10 sec: 49151.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6714720256. Throughput: 0: 54050.7. Samples: 1205316180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:24,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 15:15:25,000][52263] Updated weights for policy 0, policy_version 409839 (0.0031) [2024-04-27 15:15:26,455][52242] Signal inference workers to stop experience collection... (18200 times) [2024-04-27 15:15:26,498][52263] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-04-27 15:15:26,511][52242] Signal inference workers to resume experience collection... (18200 times) [2024-04-27 15:15:26,516][52263] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-04-27 15:15:27,986][52263] Updated weights for policy 0, policy_version 409849 (0.0028) [2024-04-27 15:15:29,106][52031] Fps is (10 sec: 52428.5, 60 sec: 54067.2, 300 sec: 53428.5). Total num frames: 6715015168. Throughput: 0: 53503.2. Samples: 1205469640. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:29,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 15:15:31,040][52263] Updated weights for policy 0, policy_version 409859 (0.0031) [2024-04-27 15:15:34,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6715277312. Throughput: 0: 53573.1. Samples: 1205791240. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:34,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 15:15:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000409869_6715293696.pth... 
[2024-04-27 15:15:34,120][52263] Updated weights for policy 0, policy_version 409869 (0.0033) [2024-04-27 15:15:34,172][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000409084_6702432256.pth [2024-04-27 15:15:37,058][52263] Updated weights for policy 0, policy_version 409879 (0.0032) [2024-04-27 15:15:39,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6715572224. Throughput: 0: 53671.5. Samples: 1206116600. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:39,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 15:15:40,250][52263] Updated weights for policy 0, policy_version 409889 (0.0030) [2024-04-27 15:15:43,270][52263] Updated weights for policy 0, policy_version 409899 (0.0032) [2024-04-27 15:15:44,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.2, 300 sec: 53761.8). Total num frames: 6715850752. Throughput: 0: 54023.5. Samples: 1206282280. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:44,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 15:15:46,339][52263] Updated weights for policy 0, policy_version 409909 (0.0028) [2024-04-27 15:15:49,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6716096512. Throughput: 0: 54199.6. Samples: 1206607580. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-04-27 15:15:49,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 15:15:49,257][52263] Updated weights for policy 0, policy_version 409919 (0.0028) [2024-04-27 15:15:52,574][52263] Updated weights for policy 0, policy_version 409929 (0.0031) [2024-04-27 15:15:54,106][52031] Fps is (10 sec: 49152.0, 60 sec: 54067.4, 300 sec: 53484.1). Total num frames: 6716342272. Throughput: 0: 54201.5. Samples: 1206929380. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:15:54,107][52031] Avg episode reward: [(0, '0.710')] [2024-04-27 15:15:55,370][52263] Updated weights for policy 0, policy_version 409939 (0.0030) [2024-04-27 15:15:58,711][52263] Updated weights for policy 0, policy_version 409949 (0.0035) [2024-04-27 15:15:59,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6716620800. Throughput: 0: 53376.9. Samples: 1207073860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:15:59,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 15:16:01,392][52263] Updated weights for policy 0, policy_version 409959 (0.0035) [2024-04-27 15:16:04,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6716882944. Throughput: 0: 53451.8. Samples: 1207400520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:04,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 15:16:04,848][52263] Updated weights for policy 0, policy_version 409969 (0.0029) [2024-04-27 15:16:07,627][52263] Updated weights for policy 0, policy_version 409979 (0.0037) [2024-04-27 15:16:09,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6717177856. Throughput: 0: 53396.0. Samples: 1207719000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:09,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 15:16:10,821][52263] Updated weights for policy 0, policy_version 409989 (0.0034) [2024-04-27 15:16:13,700][52263] Updated weights for policy 0, policy_version 409999 (0.0031) [2024-04-27 15:16:14,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6717440000. Throughput: 0: 53783.2. Samples: 1207889880. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:14,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 15:16:16,826][52263] Updated weights for policy 0, policy_version 410009 (0.0034) [2024-04-27 15:16:19,107][52031] Fps is (10 sec: 49149.2, 60 sec: 52974.4, 300 sec: 53483.9). Total num frames: 6717669376. Throughput: 0: 53718.5. Samples: 1208208600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:19,108][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 15:16:19,334][52242] Signal inference workers to stop experience collection... (18250 times) [2024-04-27 15:16:19,367][52263] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-04-27 15:16:19,396][52242] Signal inference workers to resume experience collection... (18250 times) [2024-04-27 15:16:19,403][52263] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-04-27 15:16:19,852][52263] Updated weights for policy 0, policy_version 410019 (0.0028) [2024-04-27 15:16:23,219][52263] Updated weights for policy 0, policy_version 410029 (0.0029) [2024-04-27 15:16:24,107][52031] Fps is (10 sec: 52428.1, 60 sec: 54067.2, 300 sec: 53484.1). Total num frames: 6717964288. Throughput: 0: 53592.2. Samples: 1208528240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:24,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 15:16:25,870][52263] Updated weights for policy 0, policy_version 410039 (0.0030) [2024-04-27 15:16:29,106][52031] Fps is (10 sec: 54070.4, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6718210048. Throughput: 0: 53378.7. Samples: 1208684320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:29,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 15:16:29,318][52263] Updated weights for policy 0, policy_version 410049 (0.0030) [2024-04-27 15:16:31,966][52263] Updated weights for policy 0, policy_version 410059 (0.0026) [2024-04-27 15:16:34,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6718488576. Throughput: 0: 53254.9. Samples: 1209004060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:34,107][52031] Avg episode reward: [(0, '0.493')] [2024-04-27 15:16:35,428][52263] Updated weights for policy 0, policy_version 410069 (0.0029) [2024-04-27 15:16:38,062][52263] Updated weights for policy 0, policy_version 410079 (0.0027) [2024-04-27 15:16:39,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53521.3, 300 sec: 53650.7). Total num frames: 6718783488. Throughput: 0: 53216.0. Samples: 1209324100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:39,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 15:16:41,635][52263] Updated weights for policy 0, policy_version 410089 (0.0027) [2024-04-27 15:16:44,106][52031] Fps is (10 sec: 55706.8, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6719045632. Throughput: 0: 53926.6. Samples: 1209500560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:44,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:16:44,124][52263] Updated weights for policy 0, policy_version 410099 (0.0037) [2024-04-27 15:16:47,640][52263] Updated weights for policy 0, policy_version 410109 (0.0029) [2024-04-27 15:16:49,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6719291392. Throughput: 0: 53865.1. Samples: 1209824440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:49,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 15:16:50,103][52263] Updated weights for policy 0, policy_version 410119 (0.0033) [2024-04-27 15:16:53,747][52263] Updated weights for policy 0, policy_version 410129 (0.0031) [2024-04-27 15:16:54,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6719569920. Throughput: 0: 53928.0. Samples: 1210145760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:54,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 15:16:56,207][52263] Updated weights for policy 0, policy_version 410139 (0.0028) [2024-04-27 15:16:59,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6719832064. Throughput: 0: 53466.4. Samples: 1210295880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:16:59,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 15:16:59,831][52263] Updated weights for policy 0, policy_version 410149 (0.0033) [2024-04-27 15:17:02,369][52263] Updated weights for policy 0, policy_version 410159 (0.0027) [2024-04-27 15:17:04,107][52031] Fps is (10 sec: 55704.6, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 6720126976. Throughput: 0: 53596.5. Samples: 1210620420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:17:04,107][52031] Avg episode reward: [(0, '0.475')] [2024-04-27 15:17:05,899][52263] Updated weights for policy 0, policy_version 410169 (0.0036) [2024-04-27 15:17:08,354][52263] Updated weights for policy 0, policy_version 410179 (0.0025) [2024-04-27 15:17:09,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6720405504. Throughput: 0: 53627.1. Samples: 1210941460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 15:17:09,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 15:17:11,828][52263] Updated weights for policy 0, policy_version 410189 (0.0030) [2024-04-27 15:17:14,106][52031] Fps is (10 sec: 54068.5, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6720667648. Throughput: 0: 54030.7. Samples: 1211115700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:14,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 15:17:14,460][52263] Updated weights for policy 0, policy_version 410199 (0.0031) [2024-04-27 15:17:17,975][52263] Updated weights for policy 0, policy_version 410209 (0.0039) [2024-04-27 15:17:19,106][52031] Fps is (10 sec: 50791.1, 60 sec: 54067.8, 300 sec: 53539.6). Total num frames: 6720913408. Throughput: 0: 54109.7. Samples: 1211438980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:19,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 15:17:20,500][52263] Updated weights for policy 0, policy_version 410219 (0.0030) [2024-04-27 15:17:24,081][52263] Updated weights for policy 0, policy_version 410229 (0.0031) [2024-04-27 15:17:24,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6721191936. Throughput: 0: 54084.4. Samples: 1211757900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:24,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 15:17:24,530][52242] Signal inference workers to stop experience collection... (18300 times) [2024-04-27 15:17:24,560][52263] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-04-27 15:17:24,627][52242] Signal inference workers to resume experience collection... 
(18300 times) [2024-04-27 15:17:24,628][52263] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-04-27 15:17:26,819][52263] Updated weights for policy 0, policy_version 410239 (0.0030) [2024-04-27 15:17:29,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6721437696. Throughput: 0: 53551.4. Samples: 1211910380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:29,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 15:17:30,305][52263] Updated weights for policy 0, policy_version 410249 (0.0031) [2024-04-27 15:17:32,836][52263] Updated weights for policy 0, policy_version 410259 (0.0029) [2024-04-27 15:17:34,106][52031] Fps is (10 sec: 54066.9, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6721732608. Throughput: 0: 53453.7. Samples: 1212229860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:34,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 15:17:34,121][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000410262_6721732608.pth... [2024-04-27 15:17:34,173][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000409475_6708838400.pth [2024-04-27 15:17:36,384][52263] Updated weights for policy 0, policy_version 410269 (0.0030) [2024-04-27 15:17:38,979][52263] Updated weights for policy 0, policy_version 410279 (0.0024) [2024-04-27 15:17:39,107][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6722011136. Throughput: 0: 53463.1. Samples: 1212551600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:39,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 15:17:42,436][52263] Updated weights for policy 0, policy_version 410289 (0.0033) [2024-04-27 15:17:44,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6722273280. Throughput: 0: 53872.5. Samples: 1212720140. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:44,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 15:17:44,969][52263] Updated weights for policy 0, policy_version 410299 (0.0033) [2024-04-27 15:17:48,617][52263] Updated weights for policy 0, policy_version 410309 (0.0036) [2024-04-27 15:17:49,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53793.9, 300 sec: 53484.0). Total num frames: 6722519040. Throughput: 0: 53893.3. Samples: 1213045620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:49,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 15:17:51,218][52263] Updated weights for policy 0, policy_version 410319 (0.0034) [2024-04-27 15:17:54,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6722781184. Throughput: 0: 53857.3. Samples: 1213365040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:54,107][52031] Avg episode reward: [(0, '0.700')] [2024-04-27 15:17:54,702][52263] Updated weights for policy 0, policy_version 410329 (0.0029) [2024-04-27 15:17:57,272][52263] Updated weights for policy 0, policy_version 410339 (0.0033) [2024-04-27 15:17:59,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6723059712. Throughput: 0: 53491.9. Samples: 1213522840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:17:59,107][52031] Avg episode reward: [(0, '0.499')] [2024-04-27 15:18:00,826][52263] Updated weights for policy 0, policy_version 410349 (0.0026) [2024-04-27 15:18:03,532][52263] Updated weights for policy 0, policy_version 410359 (0.0032) [2024-04-27 15:18:04,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6723338240. Throughput: 0: 53425.2. Samples: 1213843120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:18:04,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 15:18:06,867][52263] Updated weights for policy 0, policy_version 410369 (0.0039) [2024-04-27 15:18:09,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6723616768. Throughput: 0: 53382.5. Samples: 1214160120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:18:09,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 15:18:09,741][52263] Updated weights for policy 0, policy_version 410379 (0.0032) [2024-04-27 15:18:13,091][52263] Updated weights for policy 0, policy_version 410389 (0.0029) [2024-04-27 15:18:14,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6723862528. Throughput: 0: 53616.6. Samples: 1214323120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:18:14,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:18:15,773][52263] Updated weights for policy 0, policy_version 410399 (0.0024) [2024-04-27 15:18:19,057][52263] Updated weights for policy 0, policy_version 410409 (0.0032) [2024-04-27 15:18:19,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53793.9, 300 sec: 53539.5). Total num frames: 6724141056. Throughput: 0: 53660.7. Samples: 1214644600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:18:19,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 15:18:21,890][52263] Updated weights for policy 0, policy_version 410419 (0.0029) [2024-04-27 15:18:24,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6724386816. Throughput: 0: 53610.8. Samples: 1214964080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:18:24,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 15:18:25,187][52263] Updated weights for policy 0, policy_version 410429 (0.0033) [2024-04-27 15:18:27,946][52263] Updated weights for policy 0, policy_version 410439 (0.0030) [2024-04-27 15:18:29,107][52031] Fps is (10 sec: 50790.9, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6724648960. Throughput: 0: 53386.3. Samples: 1215122520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:18:29,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 15:18:31,461][52263] Updated weights for policy 0, policy_version 410449 (0.0026) [2024-04-27 15:18:31,654][52242] Signal inference workers to stop experience collection... (18350 times) [2024-04-27 15:18:31,695][52263] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-04-27 15:18:31,711][52242] Signal inference workers to resume experience collection... (18350 times) [2024-04-27 15:18:31,716][52263] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-04-27 15:18:34,105][52263] Updated weights for policy 0, policy_version 410459 (0.0028) [2024-04-27 15:18:34,107][52031] Fps is (10 sec: 57343.0, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6724960256. Throughput: 0: 53243.6. Samples: 1215441580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 15:18:34,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 15:18:37,431][52263] Updated weights for policy 0, policy_version 410469 (0.0028) [2024-04-27 15:18:39,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6725206016. Throughput: 0: 53364.0. Samples: 1215766420. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:18:39,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 15:18:40,237][52263] Updated weights for policy 0, policy_version 410479 (0.0029) [2024-04-27 15:18:43,554][52263] Updated weights for policy 0, policy_version 410489 (0.0029) [2024-04-27 15:18:44,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6725484544. Throughput: 0: 53495.1. Samples: 1215930120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:18:44,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 15:18:46,365][52263] Updated weights for policy 0, policy_version 410499 (0.0031) [2024-04-27 15:18:49,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6725730304. Throughput: 0: 53460.9. Samples: 1216248860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:18:49,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 15:18:49,693][52263] Updated weights for policy 0, policy_version 410509 (0.0032) [2024-04-27 15:18:52,570][52263] Updated weights for policy 0, policy_version 410519 (0.0031) [2024-04-27 15:18:54,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6726008832. Throughput: 0: 53583.1. Samples: 1216571360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:18:54,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 15:18:55,663][52263] Updated weights for policy 0, policy_version 410529 (0.0033) [2024-04-27 15:18:58,574][52263] Updated weights for policy 0, policy_version 410539 (0.0036) [2024-04-27 15:18:59,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6726270976. Throughput: 0: 53531.3. Samples: 1216732040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:18:59,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 15:19:01,764][52263] Updated weights for policy 0, policy_version 410549 (0.0034) [2024-04-27 15:19:04,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6726549504. Throughput: 0: 53490.7. Samples: 1217051680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:04,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:19:05,023][52263] Updated weights for policy 0, policy_version 410559 (0.0029) [2024-04-27 15:19:07,994][52263] Updated weights for policy 0, policy_version 410569 (0.0036) [2024-04-27 15:19:09,107][52031] Fps is (10 sec: 52429.0, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6726795264. Throughput: 0: 53414.5. Samples: 1217367740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:09,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 15:19:11,315][52263] Updated weights for policy 0, policy_version 410579 (0.0028) [2024-04-27 15:19:14,089][52263] Updated weights for policy 0, policy_version 410589 (0.0029) [2024-04-27 15:19:14,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6727090176. Throughput: 0: 53523.6. Samples: 1217531080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:14,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 15:19:17,404][52263] Updated weights for policy 0, policy_version 410599 (0.0030) [2024-04-27 15:19:19,107][52031] Fps is (10 sec: 52428.9, 60 sec: 52975.0, 300 sec: 53595.1). Total num frames: 6727319552. Throughput: 0: 53546.3. Samples: 1217851160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:19,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 15:19:20,028][52263] Updated weights for policy 0, policy_version 410609 (0.0027) [2024-04-27 15:19:23,383][52263] Updated weights for policy 0, policy_version 410619 (0.0030) [2024-04-27 15:19:24,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6727598080. Throughput: 0: 53382.7. Samples: 1218168640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:24,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 15:19:26,173][52263] Updated weights for policy 0, policy_version 410629 (0.0032) [2024-04-27 15:19:29,107][52031] Fps is (10 sec: 57344.0, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6727892992. Throughput: 0: 53352.0. Samples: 1218330960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:29,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 15:19:29,323][52263] Updated weights for policy 0, policy_version 410639 (0.0032) [2024-04-27 15:19:31,674][52242] Signal inference workers to stop experience collection... (18400 times) [2024-04-27 15:19:31,674][52242] Signal inference workers to resume experience collection... (18400 times) [2024-04-27 15:19:31,692][52263] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-04-27 15:19:31,692][52263] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-04-27 15:19:32,601][52263] Updated weights for policy 0, policy_version 410649 (0.0028) [2024-04-27 15:19:34,107][52031] Fps is (10 sec: 54066.8, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6728138752. Throughput: 0: 53493.7. Samples: 1218656080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:34,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:19:34,154][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000410654_6728155136.pth... 
[2024-04-27 15:19:34,213][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000409869_6715293696.pth [2024-04-27 15:19:35,452][52263] Updated weights for policy 0, policy_version 410659 (0.0028) [2024-04-27 15:19:38,558][52263] Updated weights for policy 0, policy_version 410669 (0.0028) [2024-04-27 15:19:39,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6728400896. Throughput: 0: 53484.2. Samples: 1218978140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:39,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 15:19:41,620][52263] Updated weights for policy 0, policy_version 410679 (0.0027) [2024-04-27 15:19:44,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6728695808. Throughput: 0: 53460.5. Samples: 1219137760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:44,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 15:19:44,646][52263] Updated weights for policy 0, policy_version 410689 (0.0031) [2024-04-27 15:19:47,574][52263] Updated weights for policy 0, policy_version 410699 (0.0036) [2024-04-27 15:19:49,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53520.9, 300 sec: 53706.2). Total num frames: 6728941568. Throughput: 0: 53474.1. Samples: 1219458020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:49,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 15:19:50,768][52263] Updated weights for policy 0, policy_version 410709 (0.0036) [2024-04-27 15:19:54,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53521.3, 300 sec: 53650.7). Total num frames: 6729220096. Throughput: 0: 53619.3. Samples: 1219780600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-27 15:19:54,107][52031] Avg episode reward: [(0, '0.668')] [2024-04-27 15:19:54,113][52263] Updated weights for policy 0, policy_version 410719 (0.0028) [2024-04-27 15:19:57,022][52263] Updated weights for policy 0, policy_version 410729 (0.0031) [2024-04-27 15:19:59,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6729498624. Throughput: 0: 53490.6. Samples: 1219938160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:19:59,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 15:20:00,102][52263] Updated weights for policy 0, policy_version 410739 (0.0031) [2024-04-27 15:20:03,002][52263] Updated weights for policy 0, policy_version 410749 (0.0028) [2024-04-27 15:20:04,107][52031] Fps is (10 sec: 52427.6, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6729744384. Throughput: 0: 53517.3. Samples: 1220259440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:04,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:20:06,087][52263] Updated weights for policy 0, policy_version 410759 (0.0028) [2024-04-27 15:20:09,018][52263] Updated weights for policy 0, policy_version 410769 (0.0027) [2024-04-27 15:20:09,106][52031] Fps is (10 sec: 54068.3, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6730039296. Throughput: 0: 53799.7. Samples: 1220589620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:09,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 15:20:12,196][52263] Updated weights for policy 0, policy_version 410779 (0.0034) [2024-04-27 15:20:14,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6730301440. Throughput: 0: 53669.5. Samples: 1220746080. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:14,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 15:20:15,306][52263] Updated weights for policy 0, policy_version 410789 (0.0026) [2024-04-27 15:20:18,071][52263] Updated weights for policy 0, policy_version 410799 (0.0032) [2024-04-27 15:20:19,106][52031] Fps is (10 sec: 52428.8, 60 sec: 54067.3, 300 sec: 53706.2). Total num frames: 6730563584. Throughput: 0: 53706.9. Samples: 1221072880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:19,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 15:20:21,645][52263] Updated weights for policy 0, policy_version 410809 (0.0028) [2024-04-27 15:20:24,102][52263] Updated weights for policy 0, policy_version 410819 (0.0034) [2024-04-27 15:20:24,106][52031] Fps is (10 sec: 55705.1, 60 sec: 54340.3, 300 sec: 53706.2). Total num frames: 6730858496. Throughput: 0: 53722.2. Samples: 1221395640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:24,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:20:27,922][52263] Updated weights for policy 0, policy_version 410829 (0.0034) [2024-04-27 15:20:28,914][52242] Signal inference workers to stop experience collection... (18450 times) [2024-04-27 15:20:28,915][52242] Signal inference workers to resume experience collection... (18450 times) [2024-04-27 15:20:28,938][52263] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-04-27 15:20:28,938][52263] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-04-27 15:20:29,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6731104256. Throughput: 0: 53756.2. Samples: 1221556780. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:29,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 15:20:30,119][52263] Updated weights for policy 0, policy_version 410839 (0.0032) [2024-04-27 15:20:33,978][52263] Updated weights for policy 0, policy_version 410849 (0.0030) [2024-04-27 15:20:34,106][52031] Fps is (10 sec: 49152.5, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6731350016. Throughput: 0: 53711.8. Samples: 1221875040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:34,107][52031] Avg episode reward: [(0, '0.510')] [2024-04-27 15:20:36,291][52263] Updated weights for policy 0, policy_version 410859 (0.0027) [2024-04-27 15:20:39,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6731628544. Throughput: 0: 53707.0. Samples: 1222197420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:39,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 15:20:40,307][52263] Updated weights for policy 0, policy_version 410869 (0.0029) [2024-04-27 15:20:42,381][52263] Updated weights for policy 0, policy_version 410879 (0.0028) [2024-04-27 15:20:44,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6731923456. Throughput: 0: 53838.9. Samples: 1222360900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:44,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 15:20:46,447][52263] Updated weights for policy 0, policy_version 410889 (0.0028) [2024-04-27 15:20:48,845][52263] Updated weights for policy 0, policy_version 410899 (0.0032) [2024-04-27 15:20:49,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6732169216. Throughput: 0: 53960.6. Samples: 1222687660. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:49,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 15:20:52,384][52263] Updated weights for policy 0, policy_version 410909 (0.0031) [2024-04-27 15:20:54,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6732447744. Throughput: 0: 53739.6. Samples: 1223007900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:54,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 15:20:55,046][52263] Updated weights for policy 0, policy_version 410919 (0.0029) [2024-04-27 15:20:58,620][52263] Updated weights for policy 0, policy_version 410929 (0.0024) [2024-04-27 15:20:59,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6732693504. Throughput: 0: 53759.8. Samples: 1223165280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:20:59,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 15:21:01,183][52263] Updated weights for policy 0, policy_version 410939 (0.0034) [2024-04-27 15:21:04,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6732972032. Throughput: 0: 53609.4. Samples: 1223485300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:21:04,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 15:21:04,637][52263] Updated weights for policy 0, policy_version 410949 (0.0031) [2024-04-27 15:21:07,226][52263] Updated weights for policy 0, policy_version 410959 (0.0030) [2024-04-27 15:21:09,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6733234176. Throughput: 0: 53555.1. Samples: 1223805620. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:21:09,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 15:21:10,568][52263] Updated weights for policy 0, policy_version 410969 (0.0030) [2024-04-27 15:21:13,194][52263] Updated weights for policy 0, policy_version 410979 (0.0030) [2024-04-27 15:21:14,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53761.8). Total num frames: 6733529088. Throughput: 0: 53724.0. Samples: 1223974360. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-04-27 15:21:14,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 15:21:17,096][52263] Updated weights for policy 0, policy_version 410989 (0.0028) [2024-04-27 15:21:18,087][52242] Signal inference workers to stop experience collection... (18500 times) [2024-04-27 15:21:18,107][52263] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-04-27 15:21:18,150][52242] Signal inference workers to resume experience collection... (18500 times) [2024-04-27 15:21:18,150][52263] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-04-27 15:21:19,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6733791232. Throughput: 0: 53718.4. Samples: 1224292380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:21:19,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 15:21:19,405][52263] Updated weights for policy 0, policy_version 410999 (0.0027) [2024-04-27 15:21:23,148][52263] Updated weights for policy 0, policy_version 411009 (0.0030) [2024-04-27 15:21:24,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52702.0, 300 sec: 53595.1). Total num frames: 6734020608. Throughput: 0: 53790.3. Samples: 1224617980. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:21:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 15:21:25,417][52263] Updated weights for policy 0, policy_version 411019 (0.0031) [2024-04-27 15:21:29,106][52031] Fps is (10 sec: 49153.2, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6734282752. Throughput: 0: 53363.2. Samples: 1224762240. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:21:29,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 15:21:29,420][52263] Updated weights for policy 0, policy_version 411029 (0.0038) [2024-04-27 15:21:31,517][52263] Updated weights for policy 0, policy_version 411039 (0.0030) [2024-04-27 15:21:34,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6734561280. Throughput: 0: 53201.3. Samples: 1225081720. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:21:34,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 15:21:34,177][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000411046_6734577664.pth... [2024-04-27 15:21:34,223][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000410262_6721732608.pth [2024-04-27 15:21:35,449][52263] Updated weights for policy 0, policy_version 411049 (0.0031) [2024-04-27 15:21:37,604][52263] Updated weights for policy 0, policy_version 411059 (0.0025) [2024-04-27 15:21:39,106][52031] Fps is (10 sec: 57343.3, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6734856192. Throughput: 0: 53242.1. Samples: 1225403800. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:21:39,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 15:21:41,677][52263] Updated weights for policy 0, policy_version 411069 (0.0033) [2024-04-27 15:21:43,848][52263] Updated weights for policy 0, policy_version 411079 (0.0030) [2024-04-27 15:21:44,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53247.9, 300 sec: 53650.6). 
Total num frames: 6735118336. Throughput: 0: 53585.8. Samples: 1225576640. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:21:44,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 15:21:47,695][52263] Updated weights for policy 0, policy_version 411089 (0.0026) [2024-04-27 15:21:49,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6735380480. Throughput: 0: 53535.6. Samples: 1225894400. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:21:49,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 15:21:49,945][52263] Updated weights for policy 0, policy_version 411099 (0.0026) [2024-04-27 15:21:53,805][52263] Updated weights for policy 0, policy_version 411109 (0.0026) [2024-04-27 15:21:54,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53247.9, 300 sec: 53595.2). Total num frames: 6735642624. Throughput: 0: 53587.7. Samples: 1226217060. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:21:54,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 15:21:55,885][52263] Updated weights for policy 0, policy_version 411119 (0.0035) [2024-04-27 15:21:59,107][52031] Fps is (10 sec: 50789.0, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6735888384. Throughput: 0: 53182.0. Samples: 1226367560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:21:59,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 15:21:59,847][52263] Updated weights for policy 0, policy_version 411129 (0.0036) [2024-04-27 15:22:02,018][52263] Updated weights for policy 0, policy_version 411139 (0.0026) [2024-04-27 15:22:04,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6736183296. Throughput: 0: 53129.9. Samples: 1226683220. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:22:04,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 15:22:06,102][52263] Updated weights for policy 0, policy_version 411149 (0.0028) [2024-04-27 15:22:08,173][52263] Updated weights for policy 0, policy_version 411159 (0.0031) [2024-04-27 15:22:09,107][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6736445440. Throughput: 0: 52900.3. Samples: 1226998500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:22:09,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 15:22:12,148][52263] Updated weights for policy 0, policy_version 411169 (0.0027) [2024-04-27 15:22:14,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6736740352. Throughput: 0: 53612.3. Samples: 1227174800. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:22:14,107][52031] Avg episode reward: [(0, '0.695')] [2024-04-27 15:22:14,422][52263] Updated weights for policy 0, policy_version 411179 (0.0030) [2024-04-27 15:22:18,361][52263] Updated weights for policy 0, policy_version 411189 (0.0028) [2024-04-27 15:22:19,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6736986112. Throughput: 0: 53759.9. Samples: 1227500920. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:22:19,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 15:22:20,544][52263] Updated weights for policy 0, policy_version 411199 (0.0030) [2024-04-27 15:22:24,107][52031] Fps is (10 sec: 49152.0, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6737231872. Throughput: 0: 53715.1. Samples: 1227820980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:22:24,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:22:24,329][52263] Updated weights for policy 0, policy_version 411209 (0.0032) [2024-04-27 15:22:25,722][52242] Signal inference workers to stop experience collection... 
(18550 times) [2024-04-27 15:22:25,727][52242] Signal inference workers to resume experience collection... (18550 times) [2024-04-27 15:22:25,745][52263] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-04-27 15:22:25,746][52263] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-04-27 15:22:26,531][52263] Updated weights for policy 0, policy_version 411219 (0.0030) [2024-04-27 15:22:29,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6737494016. Throughput: 0: 53083.7. Samples: 1227965400. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:22:29,107][52031] Avg episode reward: [(0, '0.680')] [2024-04-27 15:22:30,477][52263] Updated weights for policy 0, policy_version 411229 (0.0039) [2024-04-27 15:22:32,913][52263] Updated weights for policy 0, policy_version 411239 (0.0029) [2024-04-27 15:22:34,107][52031] Fps is (10 sec: 57343.5, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6737805312. Throughput: 0: 53125.5. Samples: 1228285060. Policy #0 lag: (min: 0.0, avg: 7.9, max: 19.0) [2024-04-27 15:22:34,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 15:22:36,521][52263] Updated weights for policy 0, policy_version 411249 (0.0024) [2024-04-27 15:22:39,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6738051072. Throughput: 0: 53135.1. Samples: 1228608140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:22:39,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 15:22:39,261][52263] Updated weights for policy 0, policy_version 411259 (0.0038) [2024-04-27 15:22:42,637][52263] Updated weights for policy 0, policy_version 411269 (0.0027) [2024-04-27 15:22:44,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6738329600. Throughput: 0: 53638.8. Samples: 1228781300. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:22:44,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 15:22:45,340][52263] Updated weights for policy 0, policy_version 411279 (0.0035) [2024-04-27 15:22:48,666][52263] Updated weights for policy 0, policy_version 411289 (0.0029) [2024-04-27 15:22:49,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6738591744. Throughput: 0: 53741.0. Samples: 1229101560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:22:49,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 15:22:51,603][52263] Updated weights for policy 0, policy_version 411299 (0.0030) [2024-04-27 15:22:54,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6738837504. Throughput: 0: 53937.4. Samples: 1229425680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:22:54,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 15:22:54,882][52263] Updated weights for policy 0, policy_version 411309 (0.0035) [2024-04-27 15:22:57,628][52263] Updated weights for policy 0, policy_version 411319 (0.0031) [2024-04-27 15:22:59,107][52031] Fps is (10 sec: 50789.6, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6739099648. Throughput: 0: 53201.7. Samples: 1229568880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:22:59,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 15:23:00,884][52263] Updated weights for policy 0, policy_version 411329 (0.0027) [2024-04-27 15:23:03,541][52263] Updated weights for policy 0, policy_version 411339 (0.0034) [2024-04-27 15:23:04,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6739410944. Throughput: 0: 53271.2. Samples: 1229898120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:04,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 15:23:07,119][52263] Updated weights for policy 0, policy_version 411349 (0.0032) [2024-04-27 15:23:09,107][52031] Fps is (10 sec: 58982.5, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 6739689472. Throughput: 0: 53344.4. Samples: 1230221480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:09,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 15:23:09,625][52263] Updated weights for policy 0, policy_version 411359 (0.0035) [2024-04-27 15:23:13,178][52263] Updated weights for policy 0, policy_version 411369 (0.0028) [2024-04-27 15:23:14,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6739951616. Throughput: 0: 53878.5. Samples: 1230389940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:14,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 15:23:15,951][52263] Updated weights for policy 0, policy_version 411379 (0.0030) [2024-04-27 15:23:19,106][52031] Fps is (10 sec: 47513.9, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6740164608. Throughput: 0: 53843.7. Samples: 1230708020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:19,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 15:23:19,312][52263] Updated weights for policy 0, policy_version 411389 (0.0035) [2024-04-27 15:23:19,734][52242] Signal inference workers to stop experience collection... (18600 times) [2024-04-27 15:23:19,784][52263] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-04-27 15:23:19,797][52242] Signal inference workers to resume experience collection... 
(18600 times) [2024-04-27 15:23:19,804][52263] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-04-27 15:23:22,280][52263] Updated weights for policy 0, policy_version 411399 (0.0032) [2024-04-27 15:23:24,106][52031] Fps is (10 sec: 47514.2, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6740426752. Throughput: 0: 53740.5. Samples: 1231026460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:24,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 15:23:25,360][52263] Updated weights for policy 0, policy_version 411409 (0.0032) [2024-04-27 15:23:28,221][52263] Updated weights for policy 0, policy_version 411419 (0.0036) [2024-04-27 15:23:29,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6740721664. Throughput: 0: 53226.1. Samples: 1231176480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:29,107][52031] Avg episode reward: [(0, '0.462')] [2024-04-27 15:23:31,513][52263] Updated weights for policy 0, policy_version 411429 (0.0034) [2024-04-27 15:23:34,106][52031] Fps is (10 sec: 57343.5, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6741000192. Throughput: 0: 53308.8. Samples: 1231500460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:34,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 15:23:34,211][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000411439_6741016576.pth... [2024-04-27 15:23:34,215][52263] Updated weights for policy 0, policy_version 411439 (0.0028) [2024-04-27 15:23:34,257][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000410654_6728155136.pth [2024-04-27 15:23:37,519][52263] Updated weights for policy 0, policy_version 411449 (0.0029) [2024-04-27 15:23:39,107][52031] Fps is (10 sec: 57344.3, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6741295104. Throughput: 0: 53219.9. Samples: 1231820580. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:39,107][52031] Avg episode reward: [(0, '0.704')] [2024-04-27 15:23:40,402][52263] Updated weights for policy 0, policy_version 411459 (0.0026) [2024-04-27 15:23:43,608][52263] Updated weights for policy 0, policy_version 411469 (0.0024) [2024-04-27 15:23:44,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6741540864. Throughput: 0: 53906.2. Samples: 1231994660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:44,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 15:23:46,578][52263] Updated weights for policy 0, policy_version 411479 (0.0025) [2024-04-27 15:23:49,107][52031] Fps is (10 sec: 49151.4, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6741786624. Throughput: 0: 53638.0. Samples: 1232311840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:49,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:23:49,633][52263] Updated weights for policy 0, policy_version 411489 (0.0031) [2024-04-27 15:23:52,609][52263] Updated weights for policy 0, policy_version 411499 (0.0028) [2024-04-27 15:23:54,106][52031] Fps is (10 sec: 49152.9, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6742032384. Throughput: 0: 53522.8. Samples: 1232630000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:54,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 15:23:55,845][52263] Updated weights for policy 0, policy_version 411509 (0.0029) [2024-04-27 15:23:58,727][52263] Updated weights for policy 0, policy_version 411519 (0.0026) [2024-04-27 15:23:59,107][52031] Fps is (10 sec: 55706.4, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6742343680. Throughput: 0: 53181.8. Samples: 1232783120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 15:23:59,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 15:24:01,976][52263] Updated weights for policy 0, policy_version 411529 (0.0029) [2024-04-27 15:24:04,106][52031] Fps is (10 sec: 57343.5, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6742605824. Throughput: 0: 53364.0. Samples: 1233109400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:04,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 15:24:04,868][52263] Updated weights for policy 0, policy_version 411539 (0.0028) [2024-04-27 15:24:08,010][52263] Updated weights for policy 0, policy_version 411549 (0.0032) [2024-04-27 15:24:09,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6742884352. Throughput: 0: 53382.8. Samples: 1233428700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:09,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 15:24:11,014][52263] Updated weights for policy 0, policy_version 411559 (0.0026) [2024-04-27 15:24:13,241][52242] Signal inference workers to stop experience collection... (18650 times) [2024-04-27 15:24:13,242][52242] Signal inference workers to resume experience collection... (18650 times) [2024-04-27 15:24:13,264][52263] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-04-27 15:24:13,265][52263] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-04-27 15:24:14,026][52263] Updated weights for policy 0, policy_version 411569 (0.0028) [2024-04-27 15:24:14,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 6743146496. Throughput: 0: 53736.1. Samples: 1233594600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:14,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 15:24:17,057][52263] Updated weights for policy 0, policy_version 411579 (0.0032) [2024-04-27 15:24:19,106][52031] Fps is (10 sec: 49152.9, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6743375872. Throughput: 0: 53648.9. Samples: 1233914660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:19,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 15:24:20,172][52263] Updated weights for policy 0, policy_version 411589 (0.0031) [2024-04-27 15:24:23,349][52263] Updated weights for policy 0, policy_version 411599 (0.0028) [2024-04-27 15:24:24,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6743654400. Throughput: 0: 53655.6. Samples: 1234235080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:24,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 15:24:26,183][52263] Updated weights for policy 0, policy_version 411609 (0.0030) [2024-04-27 15:24:29,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6743949312. Throughput: 0: 53382.4. Samples: 1234396860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:29,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 15:24:29,427][52263] Updated weights for policy 0, policy_version 411619 (0.0028) [2024-04-27 15:24:32,490][52263] Updated weights for policy 0, policy_version 411629 (0.0029) [2024-04-27 15:24:34,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6744211456. Throughput: 0: 53497.4. Samples: 1234719220. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:34,107][52031] Avg episode reward: [(0, '0.662')] [2024-04-27 15:24:35,442][52263] Updated weights for policy 0, policy_version 411639 (0.0027) [2024-04-27 15:24:38,619][52263] Updated weights for policy 0, policy_version 411649 (0.0035) [2024-04-27 15:24:39,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6744489984. Throughput: 0: 53447.8. Samples: 1235035160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:39,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 15:24:41,634][52263] Updated weights for policy 0, policy_version 411659 (0.0026) [2024-04-27 15:24:44,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6744735744. Throughput: 0: 53666.5. Samples: 1235198120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:44,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 15:24:44,613][52263] Updated weights for policy 0, policy_version 411669 (0.0033) [2024-04-27 15:24:47,656][52263] Updated weights for policy 0, policy_version 411679 (0.0027) [2024-04-27 15:24:49,106][52031] Fps is (10 sec: 49153.1, 60 sec: 53248.3, 300 sec: 53428.5). Total num frames: 6744981504. Throughput: 0: 53571.7. Samples: 1235520120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:49,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 15:24:50,684][52263] Updated weights for policy 0, policy_version 411689 (0.0032) [2024-04-27 15:24:53,962][52263] Updated weights for policy 0, policy_version 411699 (0.0037) [2024-04-27 15:24:54,107][52031] Fps is (10 sec: 54067.3, 60 sec: 54067.0, 300 sec: 53484.0). Total num frames: 6745276416. Throughput: 0: 53655.6. Samples: 1235843200. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:54,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 15:24:56,803][52263] Updated weights for policy 0, policy_version 411709 (0.0028) [2024-04-27 15:24:59,107][52031] Fps is (10 sec: 54065.7, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6745522176. Throughput: 0: 53375.0. Samples: 1235996480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:24:59,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 15:25:00,003][52263] Updated weights for policy 0, policy_version 411719 (0.0027) [2024-04-27 15:25:02,842][52263] Updated weights for policy 0, policy_version 411729 (0.0032) [2024-04-27 15:25:04,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6745817088. Throughput: 0: 53334.3. Samples: 1236314700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:25:04,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 15:25:04,575][52242] Signal inference workers to stop experience collection... (18700 times) [2024-04-27 15:25:04,576][52242] Signal inference workers to resume experience collection... (18700 times) [2024-04-27 15:25:04,605][52263] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-04-27 15:25:04,605][52263] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-04-27 15:25:06,046][52263] Updated weights for policy 0, policy_version 411739 (0.0034) [2024-04-27 15:25:09,004][52263] Updated weights for policy 0, policy_version 411749 (0.0028) [2024-04-27 15:25:09,106][52031] Fps is (10 sec: 57345.2, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6746095616. Throughput: 0: 53363.7. Samples: 1236636440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:25:09,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:25:12,467][52263] Updated weights for policy 0, policy_version 411759 (0.0034) [2024-04-27 15:25:14,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6746341376. Throughput: 0: 53511.4. Samples: 1236804880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:25:14,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 15:25:15,109][52263] Updated weights for policy 0, policy_version 411769 (0.0028) [2024-04-27 15:25:18,655][52263] Updated weights for policy 0, policy_version 411779 (0.0032) [2024-04-27 15:25:19,107][52031] Fps is (10 sec: 49151.4, 60 sec: 53521.0, 300 sec: 53317.4). Total num frames: 6746587136. Throughput: 0: 53438.3. Samples: 1237123940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 15:25:19,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 15:25:21,216][52263] Updated weights for policy 0, policy_version 411789 (0.0029) [2024-04-27 15:25:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6746865664. Throughput: 0: 53484.6. Samples: 1237441960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:25:24,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 15:25:24,612][52263] Updated weights for policy 0, policy_version 411799 (0.0032) [2024-04-27 15:25:27,278][52263] Updated weights for policy 0, policy_version 411809 (0.0025) [2024-04-27 15:25:29,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53247.8, 300 sec: 53539.5). Total num frames: 6747144192. Throughput: 0: 53417.3. Samples: 1237601900. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:25:29,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 15:25:30,557][52263] Updated weights for policy 0, policy_version 411819 (0.0027) [2024-04-27 15:25:33,416][52263] Updated weights for policy 0, policy_version 411829 (0.0026) [2024-04-27 15:25:34,107][52031] Fps is (10 sec: 57343.6, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6747439104. Throughput: 0: 53465.1. Samples: 1237926060. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:25:34,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 15:25:34,114][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000411831_6747439104.pth... [2024-04-27 15:25:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000411046_6734577664.pth [2024-04-27 15:25:36,884][52263] Updated weights for policy 0, policy_version 411839 (0.0032) [2024-04-27 15:25:39,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6747684864. Throughput: 0: 53337.8. Samples: 1238243400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:25:39,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 15:25:39,726][52263] Updated weights for policy 0, policy_version 411849 (0.0033) [2024-04-27 15:25:43,353][52263] Updated weights for policy 0, policy_version 411859 (0.0031) [2024-04-27 15:25:44,106][52031] Fps is (10 sec: 49152.1, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6747930624. Throughput: 0: 53420.6. Samples: 1238400400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:25:44,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 15:25:45,728][52263] Updated weights for policy 0, policy_version 411869 (0.0027) [2024-04-27 15:25:49,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6748209152. Throughput: 0: 53554.1. Samples: 1238724640. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:25:49,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 15:25:49,650][52263] Updated weights for policy 0, policy_version 411879 (0.0036) [2024-04-27 15:25:51,888][52263] Updated weights for policy 0, policy_version 411889 (0.0028) [2024-04-27 15:25:53,368][52242] Signal inference workers to stop experience collection... (18750 times) [2024-04-27 15:25:53,377][52263] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-04-27 15:25:53,467][52242] Signal inference workers to resume experience collection... (18750 times) [2024-04-27 15:25:53,467][52263] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-04-27 15:25:54,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6748471296. Throughput: 0: 53602.7. Samples: 1239048560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:25:54,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 15:25:55,678][52263] Updated weights for policy 0, policy_version 411899 (0.0027) [2024-04-27 15:25:58,012][52263] Updated weights for policy 0, policy_version 411909 (0.0029) [2024-04-27 15:25:59,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6748733440. Throughput: 0: 53459.1. Samples: 1239210540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:25:59,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 15:26:01,640][52263] Updated weights for policy 0, policy_version 411919 (0.0026) [2024-04-27 15:26:04,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6749028352. Throughput: 0: 53450.1. Samples: 1239529200. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:26:04,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 15:26:04,242][52263] Updated weights for policy 0, policy_version 411929 (0.0033) [2024-04-27 15:26:07,820][52263] Updated weights for policy 0, policy_version 411939 (0.0038) [2024-04-27 15:26:09,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6749290496. Throughput: 0: 53454.3. Samples: 1239847400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:26:09,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 15:26:10,258][52263] Updated weights for policy 0, policy_version 411949 (0.0022) [2024-04-27 15:26:13,963][52263] Updated weights for policy 0, policy_version 411959 (0.0027) [2024-04-27 15:26:14,106][52031] Fps is (10 sec: 52430.1, 60 sec: 53521.3, 300 sec: 53428.5). Total num frames: 6749552640. Throughput: 0: 53594.5. Samples: 1240013640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:26:14,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 15:26:16,249][52263] Updated weights for policy 0, policy_version 411969 (0.0032) [2024-04-27 15:26:19,106][52031] Fps is (10 sec: 55705.1, 60 sec: 54340.3, 300 sec: 53650.6). Total num frames: 6749847552. Throughput: 0: 53627.2. Samples: 1240339280. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:26:19,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 15:26:19,882][52263] Updated weights for policy 0, policy_version 411979 (0.0028) [2024-04-27 15:26:22,436][52263] Updated weights for policy 0, policy_version 411989 (0.0031) [2024-04-27 15:26:24,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 6750076928. Throughput: 0: 53709.7. Samples: 1240660340. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:26:24,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 15:26:25,917][52263] Updated weights for policy 0, policy_version 411999 (0.0031) [2024-04-27 15:26:28,517][52263] Updated weights for policy 0, policy_version 412009 (0.0035) [2024-04-27 15:26:29,107][52031] Fps is (10 sec: 50789.4, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 6750355456. Throughput: 0: 53676.7. Samples: 1240815860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:26:29,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 15:26:32,104][52263] Updated weights for policy 0, policy_version 412019 (0.0029) [2024-04-27 15:26:34,106][52031] Fps is (10 sec: 54068.2, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6750617600. Throughput: 0: 53593.4. Samples: 1241136340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:26:34,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 15:26:34,611][52263] Updated weights for policy 0, policy_version 412029 (0.0028) [2024-04-27 15:26:38,200][52263] Updated weights for policy 0, policy_version 412039 (0.0034) [2024-04-27 15:26:39,107][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6750896128. Throughput: 0: 53580.8. Samples: 1241459700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-04-27 15:26:39,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 15:26:40,861][52263] Updated weights for policy 0, policy_version 412049 (0.0029) [2024-04-27 15:26:44,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6751158272. Throughput: 0: 53558.7. Samples: 1241620680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:26:44,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 15:26:44,194][52263] Updated weights for policy 0, policy_version 412059 (0.0029) [2024-04-27 15:26:44,476][52242] Signal inference workers to stop experience collection... (18800 times) [2024-04-27 15:26:44,539][52263] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-04-27 15:26:44,539][52242] Signal inference workers to resume experience collection... (18800 times) [2024-04-27 15:26:44,551][52263] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-04-27 15:26:46,953][52263] Updated weights for policy 0, policy_version 412069 (0.0029) [2024-04-27 15:26:49,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6751436800. Throughput: 0: 53616.3. Samples: 1241941920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:26:49,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 15:26:50,187][52263] Updated weights for policy 0, policy_version 412079 (0.0034) [2024-04-27 15:26:53,415][52263] Updated weights for policy 0, policy_version 412089 (0.0030) [2024-04-27 15:26:54,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53520.8, 300 sec: 53539.6). Total num frames: 6751682560. Throughput: 0: 53592.6. Samples: 1242259080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:26:54,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 15:26:56,237][52263] Updated weights for policy 0, policy_version 412099 (0.0030) [2024-04-27 15:26:59,106][52031] Fps is (10 sec: 52428.1, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6751961088. Throughput: 0: 53428.8. Samples: 1242417940. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:26:59,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 15:26:59,417][52263] Updated weights for policy 0, policy_version 412109 (0.0029) [2024-04-27 15:27:02,506][52263] Updated weights for policy 0, policy_version 412119 (0.0029) [2024-04-27 15:27:04,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6752223232. Throughput: 0: 53384.3. Samples: 1242741580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:04,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:27:05,686][52263] Updated weights for policy 0, policy_version 412129 (0.0028) [2024-04-27 15:27:08,691][52263] Updated weights for policy 0, policy_version 412139 (0.0029) [2024-04-27 15:27:09,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6752501760. Throughput: 0: 53314.4. Samples: 1243059480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:09,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:27:11,795][52263] Updated weights for policy 0, policy_version 412149 (0.0030) [2024-04-27 15:27:14,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6752763904. Throughput: 0: 53422.4. Samples: 1243219860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:14,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 15:27:14,711][52263] Updated weights for policy 0, policy_version 412159 (0.0028) [2024-04-27 15:27:18,089][52263] Updated weights for policy 0, policy_version 412169 (0.0030) [2024-04-27 15:27:19,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6753026048. Throughput: 0: 53415.9. Samples: 1243540060. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:19,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 15:27:21,098][52263] Updated weights for policy 0, policy_version 412179 (0.0028) [2024-04-27 15:27:24,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6753288192. Throughput: 0: 53376.1. Samples: 1243861620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:24,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:27:24,452][52263] Updated weights for policy 0, policy_version 412189 (0.0027) [2024-04-27 15:27:27,240][52263] Updated weights for policy 0, policy_version 412199 (0.0029) [2024-04-27 15:27:29,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6753550336. Throughput: 0: 53164.0. Samples: 1244013060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:29,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 15:27:30,509][52263] Updated weights for policy 0, policy_version 412209 (0.0029) [2024-04-27 15:27:33,343][52263] Updated weights for policy 0, policy_version 412219 (0.0030) [2024-04-27 15:27:34,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6753828864. Throughput: 0: 53050.3. Samples: 1244329200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:34,116][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 15:27:34,127][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000412221_6753828864.pth... [2024-04-27 15:27:34,170][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000411439_6741016576.pth [2024-04-27 15:27:36,646][52263] Updated weights for policy 0, policy_version 412229 (0.0028) [2024-04-27 15:27:39,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6754107392. Throughput: 0: 53210.3. Samples: 1244653540. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:39,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 15:27:39,382][52263] Updated weights for policy 0, policy_version 412239 (0.0032) [2024-04-27 15:27:42,844][52263] Updated weights for policy 0, policy_version 412249 (0.0039) [2024-04-27 15:27:44,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6754353152. Throughput: 0: 53438.3. Samples: 1244822660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:44,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 15:27:45,445][52263] Updated weights for policy 0, policy_version 412259 (0.0028) [2024-04-27 15:27:48,945][52263] Updated weights for policy 0, policy_version 412269 (0.0029) [2024-04-27 15:27:49,107][52031] Fps is (10 sec: 50790.7, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6754615296. Throughput: 0: 53285.8. Samples: 1245139440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:49,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 15:27:51,446][52263] Updated weights for policy 0, policy_version 412279 (0.0028) [2024-04-27 15:27:54,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6754893824. Throughput: 0: 53350.1. Samples: 1245460240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:54,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 15:27:55,078][52263] Updated weights for policy 0, policy_version 412289 (0.0028) [2024-04-27 15:27:57,769][52263] Updated weights for policy 0, policy_version 412299 (0.0028) [2024-04-27 15:27:59,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.1, 300 sec: 53373.0). Total num frames: 6755155968. Throughput: 0: 53263.2. Samples: 1245616700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-27 15:27:59,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 15:28:01,194][52263] Updated weights for policy 0, policy_version 412309 (0.0027) [2024-04-27 15:28:02,523][52242] Signal inference workers to stop experience collection... (18850 times) [2024-04-27 15:28:02,529][52242] Signal inference workers to resume experience collection... (18850 times) [2024-04-27 15:28:02,556][52263] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-04-27 15:28:02,556][52263] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-04-27 15:28:03,751][52263] Updated weights for policy 0, policy_version 412319 (0.0035) [2024-04-27 15:28:04,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6755450880. Throughput: 0: 53282.7. Samples: 1245937780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:04,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 15:28:07,480][52263] Updated weights for policy 0, policy_version 412329 (0.0031) [2024-04-27 15:28:09,107][52031] Fps is (10 sec: 54065.7, 60 sec: 53247.7, 300 sec: 53372.9). Total num frames: 6755696640. Throughput: 0: 53289.9. Samples: 1246259680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:09,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 15:28:10,089][52263] Updated weights for policy 0, policy_version 412339 (0.0028) [2024-04-27 15:28:13,614][52263] Updated weights for policy 0, policy_version 412349 (0.0029) [2024-04-27 15:28:14,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6755958784. Throughput: 0: 53508.0. Samples: 1246420920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:14,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:28:16,350][52263] Updated weights for policy 0, policy_version 412359 (0.0031) [2024-04-27 15:28:19,107][52031] Fps is (10 sec: 50790.9, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6756204544. Throughput: 0: 53667.6. Samples: 1246744240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:19,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 15:28:19,654][52263] Updated weights for policy 0, policy_version 412369 (0.0026) [2024-04-27 15:28:22,372][52263] Updated weights for policy 0, policy_version 412379 (0.0032) [2024-04-27 15:28:24,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6756483072. Throughput: 0: 53515.3. Samples: 1247061720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:24,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 15:28:25,782][52263] Updated weights for policy 0, policy_version 412389 (0.0026) [2024-04-27 15:28:28,405][52263] Updated weights for policy 0, policy_version 412399 (0.0028) [2024-04-27 15:28:29,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6756761600. Throughput: 0: 53281.6. Samples: 1247220340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:29,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 15:28:31,977][52263] Updated weights for policy 0, policy_version 412409 (0.0027) [2024-04-27 15:28:34,106][52031] Fps is (10 sec: 57343.5, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6757056512. Throughput: 0: 53304.5. Samples: 1247538140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:34,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 15:28:34,693][52263] Updated weights for policy 0, policy_version 412419 (0.0029) [2024-04-27 15:28:38,139][52263] Updated weights for policy 0, policy_version 412429 (0.0026) [2024-04-27 15:28:39,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6757302272. Throughput: 0: 53267.1. Samples: 1247857260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:39,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 15:28:40,951][52263] Updated weights for policy 0, policy_version 412439 (0.0033) [2024-04-27 15:28:44,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6757548032. Throughput: 0: 53363.5. Samples: 1248018060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:44,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 15:28:44,193][52263] Updated weights for policy 0, policy_version 412449 (0.0031) [2024-04-27 15:28:47,022][52263] Updated weights for policy 0, policy_version 412459 (0.0030) [2024-04-27 15:28:49,108][52031] Fps is (10 sec: 50783.3, 60 sec: 53246.8, 300 sec: 53483.8). Total num frames: 6757810176. Throughput: 0: 53201.5. Samples: 1248331920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:49,109][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 15:28:50,322][52263] Updated weights for policy 0, policy_version 412469 (0.0025) [2024-04-27 15:28:53,328][52263] Updated weights for policy 0, policy_version 412479 (0.0034) [2024-04-27 15:28:54,107][52031] Fps is (10 sec: 52428.5, 60 sec: 52974.9, 300 sec: 53317.4). Total num frames: 6758072320. Throughput: 0: 53202.4. Samples: 1248653780. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:54,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 15:28:56,411][52263] Updated weights for policy 0, policy_version 412489 (0.0028) [2024-04-27 15:28:59,107][52031] Fps is (10 sec: 54074.6, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6758350848. Throughput: 0: 53180.5. Samples: 1248814040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:28:59,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 15:28:59,563][52263] Updated weights for policy 0, policy_version 412499 (0.0029) [2024-04-27 15:29:00,517][52242] Signal inference workers to stop experience collection... (18900 times) [2024-04-27 15:29:00,517][52242] Signal inference workers to resume experience collection... (18900 times) [2024-04-27 15:29:00,542][52263] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-04-27 15:29:00,542][52263] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-04-27 15:29:02,508][52263] Updated weights for policy 0, policy_version 412509 (0.0029) [2024-04-27 15:29:04,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6758645760. Throughput: 0: 53203.7. Samples: 1249138400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:29:04,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 15:29:05,517][52263] Updated weights for policy 0, policy_version 412519 (0.0032) [2024-04-27 15:29:08,524][52263] Updated weights for policy 0, policy_version 412529 (0.0031) [2024-04-27 15:29:09,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6758907904. Throughput: 0: 53310.6. Samples: 1249460700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:29:09,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 15:29:11,710][52263] Updated weights for policy 0, policy_version 412539 (0.0032) [2024-04-27 15:29:14,106][52031] Fps is (10 sec: 49152.2, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6759137280. Throughput: 0: 53385.0. Samples: 1249622660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:29:14,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 15:29:14,719][52263] Updated weights for policy 0, policy_version 412549 (0.0035) [2024-04-27 15:29:17,870][52263] Updated weights for policy 0, policy_version 412559 (0.0029) [2024-04-27 15:29:19,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.3, 300 sec: 53484.1). Total num frames: 6759432192. Throughput: 0: 53529.0. Samples: 1249946940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 15:29:19,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:29:20,980][52263] Updated weights for policy 0, policy_version 412569 (0.0026) [2024-04-27 15:29:23,866][52263] Updated weights for policy 0, policy_version 412579 (0.0026) [2024-04-27 15:29:24,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.0, 300 sec: 53373.0). Total num frames: 6759694336. Throughput: 0: 53597.3. Samples: 1250269140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:29:24,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 15:29:26,958][52263] Updated weights for policy 0, policy_version 412589 (0.0031) [2024-04-27 15:29:29,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6759989248. Throughput: 0: 53668.9. Samples: 1250433160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:29:29,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 15:29:29,844][52263] Updated weights for policy 0, policy_version 412599 (0.0029) [2024-04-27 15:29:33,133][52263] Updated weights for policy 0, policy_version 412609 (0.0031) [2024-04-27 15:29:34,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6760251392. Throughput: 0: 53811.3. Samples: 1250753360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:29:34,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 15:29:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000412613_6760251392.pth... [2024-04-27 15:29:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000411831_6747439104.pth [2024-04-27 15:29:35,992][52263] Updated weights for policy 0, policy_version 412619 (0.0029) [2024-04-27 15:29:39,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6760497152. Throughput: 0: 53773.9. Samples: 1251073600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:29:39,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:29:39,395][52263] Updated weights for policy 0, policy_version 412629 (0.0026) [2024-04-27 15:29:42,383][52263] Updated weights for policy 0, policy_version 412639 (0.0027) [2024-04-27 15:29:44,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6760759296. Throughput: 0: 53659.2. Samples: 1251228700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:29:44,107][52031] Avg episode reward: [(0, '0.682')] [2024-04-27 15:29:45,392][52263] Updated weights for policy 0, policy_version 412649 (0.0026) [2024-04-27 15:29:48,540][52263] Updated weights for policy 0, policy_version 412659 (0.0029) [2024-04-27 15:29:49,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53249.3, 300 sec: 53317.5). 
Total num frames: 6761005056. Throughput: 0: 53547.7. Samples: 1251548040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:29:49,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 15:29:51,358][52263] Updated weights for policy 0, policy_version 412669 (0.0029) [2024-04-27 15:29:54,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6761299968. Throughput: 0: 53499.9. Samples: 1251868200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:29:54,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:29:54,659][52263] Updated weights for policy 0, policy_version 412679 (0.0028) [2024-04-27 15:29:56,915][52242] Signal inference workers to stop experience collection... (18950 times) [2024-04-27 15:29:56,915][52242] Signal inference workers to resume experience collection... (18950 times) [2024-04-27 15:29:56,947][52263] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-04-27 15:29:56,947][52263] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-04-27 15:29:57,468][52263] Updated weights for policy 0, policy_version 412689 (0.0038) [2024-04-27 15:29:59,107][52031] Fps is (10 sec: 57342.9, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6761578496. Throughput: 0: 53544.7. Samples: 1252032180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:29:59,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 15:30:00,667][52263] Updated weights for policy 0, policy_version 412699 (0.0029) [2024-04-27 15:30:03,553][52263] Updated weights for policy 0, policy_version 412709 (0.0027) [2024-04-27 15:30:04,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6761857024. Throughput: 0: 53550.7. Samples: 1252356720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:30:04,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 15:30:06,615][52263] Updated weights for policy 0, policy_version 412719 (0.0031) [2024-04-27 15:30:09,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6762102784. Throughput: 0: 53464.4. Samples: 1252675040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:30:09,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 15:30:10,007][52263] Updated weights for policy 0, policy_version 412729 (0.0027) [2024-04-27 15:30:12,864][52263] Updated weights for policy 0, policy_version 412739 (0.0039) [2024-04-27 15:30:14,107][52031] Fps is (10 sec: 52428.0, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6762381312. Throughput: 0: 53262.6. Samples: 1252829980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:30:14,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 15:30:16,053][52263] Updated weights for policy 0, policy_version 412749 (0.0026) [2024-04-27 15:30:19,024][52263] Updated weights for policy 0, policy_version 412759 (0.0034) [2024-04-27 15:30:19,107][52031] Fps is (10 sec: 54064.7, 60 sec: 53520.5, 300 sec: 53483.9). Total num frames: 6762643456. Throughput: 0: 53344.8. Samples: 1253153900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:30:19,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 15:30:22,090][52263] Updated weights for policy 0, policy_version 412769 (0.0027) [2024-04-27 15:30:24,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6762905600. Throughput: 0: 53415.1. Samples: 1253477280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:30:24,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 15:30:25,027][52263] Updated weights for policy 0, policy_version 412779 (0.0027) [2024-04-27 15:30:28,137][52263] Updated weights for policy 0, policy_version 412789 (0.0032) [2024-04-27 15:30:29,107][52031] Fps is (10 sec: 54070.0, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6763184128. Throughput: 0: 53439.4. Samples: 1253633480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:30:29,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 15:30:31,136][52263] Updated weights for policy 0, policy_version 412799 (0.0033) [2024-04-27 15:30:34,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6763446272. Throughput: 0: 53443.8. Samples: 1253953020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:30:34,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 15:30:34,333][52263] Updated weights for policy 0, policy_version 412809 (0.0026) [2024-04-27 15:30:37,670][52263] Updated weights for policy 0, policy_version 412819 (0.0032) [2024-04-27 15:30:39,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6763708416. Throughput: 0: 53349.4. Samples: 1254268920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:30:39,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 15:30:40,496][52263] Updated weights for policy 0, policy_version 412829 (0.0028) [2024-04-27 15:30:43,992][52263] Updated weights for policy 0, policy_version 412839 (0.0032) [2024-04-27 15:30:44,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6763954176. Throughput: 0: 53410.3. Samples: 1254435640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 15:30:44,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 15:30:46,498][52263] Updated weights for policy 0, policy_version 412849 (0.0028) [2024-04-27 15:30:49,106][52031] Fps is (10 sec: 54067.3, 60 sec: 54067.1, 300 sec: 53484.0). Total num frames: 6764249088. Throughput: 0: 53289.7. Samples: 1254754760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:30:49,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:30:50,159][52263] Updated weights for policy 0, policy_version 412859 (0.0024) [2024-04-27 15:30:52,444][52263] Updated weights for policy 0, policy_version 412869 (0.0028) [2024-04-27 15:30:54,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6764511232. Throughput: 0: 53328.8. Samples: 1255074840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:30:54,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 15:30:56,111][52263] Updated weights for policy 0, policy_version 412879 (0.0036) [2024-04-27 15:30:58,677][52263] Updated weights for policy 0, policy_version 412889 (0.0037) [2024-04-27 15:30:59,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53248.0, 300 sec: 53373.0). Total num frames: 6764773376. Throughput: 0: 53525.3. Samples: 1255238620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:30:59,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 15:31:02,209][52263] Updated weights for policy 0, policy_version 412899 (0.0028) [2024-04-27 15:31:04,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 6765051904. Throughput: 0: 53378.3. Samples: 1255555900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:04,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 15:31:05,074][52263] Updated weights for policy 0, policy_version 412909 (0.0032) [2024-04-27 15:31:07,242][52242] Signal inference workers to stop experience collection... (19000 times) [2024-04-27 15:31:07,243][52242] Signal inference workers to resume experience collection... (19000 times) [2024-04-27 15:31:07,259][52263] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-04-27 15:31:07,259][52263] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-04-27 15:31:08,314][52263] Updated weights for policy 0, policy_version 412919 (0.0029) [2024-04-27 15:31:09,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6765330432. Throughput: 0: 53465.1. Samples: 1255883220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:09,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 15:31:11,184][52263] Updated weights for policy 0, policy_version 412929 (0.0028) [2024-04-27 15:31:14,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6765576192. Throughput: 0: 53599.8. Samples: 1256045480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:14,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:31:14,282][52263] Updated weights for policy 0, policy_version 412939 (0.0027) [2024-04-27 15:31:17,210][52263] Updated weights for policy 0, policy_version 412949 (0.0033) [2024-04-27 15:31:19,106][52031] Fps is (10 sec: 50791.7, 60 sec: 53248.6, 300 sec: 53428.5). Total num frames: 6765838336. Throughput: 0: 53607.8. Samples: 1256365360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:19,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 15:31:20,232][52263] Updated weights for policy 0, policy_version 412959 (0.0030) [2024-04-27 15:31:23,327][52263] Updated weights for policy 0, policy_version 412969 (0.0028) [2024-04-27 15:31:24,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6766116864. Throughput: 0: 53622.1. Samples: 1256681920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:24,107][52031] Avg episode reward: [(0, '0.484')] [2024-04-27 15:31:26,406][52263] Updated weights for policy 0, policy_version 412979 (0.0030) [2024-04-27 15:31:29,106][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6766395392. Throughput: 0: 53450.3. Samples: 1256840900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:29,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 15:31:29,472][52263] Updated weights for policy 0, policy_version 412989 (0.0030) [2024-04-27 15:31:32,842][52263] Updated weights for policy 0, policy_version 412999 (0.0029) [2024-04-27 15:31:34,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.3, 300 sec: 53484.1). Total num frames: 6766673920. Throughput: 0: 53573.8. Samples: 1257165580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:34,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 15:31:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000413005_6766673920.pth... [2024-04-27 15:31:34,172][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000412221_6753828864.pth [2024-04-27 15:31:35,409][52263] Updated weights for policy 0, policy_version 413009 (0.0029) [2024-04-27 15:31:38,990][52263] Updated weights for policy 0, policy_version 413019 (0.0028) [2024-04-27 15:31:39,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53247.9, 300 sec: 53373.0). 
Total num frames: 6766903296. Throughput: 0: 53580.5. Samples: 1257485960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:39,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 15:31:41,359][52263] Updated weights for policy 0, policy_version 413029 (0.0030) [2024-04-27 15:31:44,107][52031] Fps is (10 sec: 52428.4, 60 sec: 54067.2, 300 sec: 53428.5). Total num frames: 6767198208. Throughput: 0: 53429.9. Samples: 1257642960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:44,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 15:31:44,953][52263] Updated weights for policy 0, policy_version 413039 (0.0027) [2024-04-27 15:31:47,451][52263] Updated weights for policy 0, policy_version 413049 (0.0025) [2024-04-27 15:31:49,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6767443968. Throughput: 0: 53469.8. Samples: 1257962040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:49,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 15:31:50,978][52263] Updated weights for policy 0, policy_version 413059 (0.0031) [2024-04-27 15:31:53,910][52263] Updated weights for policy 0, policy_version 413069 (0.0029) [2024-04-27 15:31:54,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6767722496. Throughput: 0: 53525.9. Samples: 1258291880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:54,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 15:31:57,123][52263] Updated weights for policy 0, policy_version 413079 (0.0028) [2024-04-27 15:31:59,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.2, 300 sec: 53484.1). Total num frames: 6768001024. Throughput: 0: 53493.5. Samples: 1258452680. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:31:59,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:32:00,580][52263] Updated weights for policy 0, policy_version 413089 (0.0033) [2024-04-27 15:32:03,091][52263] Updated weights for policy 0, policy_version 413099 (0.0036) [2024-04-27 15:32:04,107][52031] Fps is (10 sec: 57344.0, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6768295936. Throughput: 0: 53510.0. Samples: 1258773320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-27 15:32:04,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 15:32:06,722][52263] Updated weights for policy 0, policy_version 413109 (0.0038) [2024-04-27 15:32:06,921][52242] Signal inference workers to stop experience collection... (19050 times) [2024-04-27 15:32:06,959][52263] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-04-27 15:32:06,975][52242] Signal inference workers to resume experience collection... (19050 times) [2024-04-27 15:32:06,983][52263] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-04-27 15:32:09,083][52263] Updated weights for policy 0, policy_version 413119 (0.0027) [2024-04-27 15:32:09,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6768541696. Throughput: 0: 53589.1. Samples: 1259093420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:09,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 15:32:12,907][52263] Updated weights for policy 0, policy_version 413129 (0.0038) [2024-04-27 15:32:14,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53794.3, 300 sec: 53484.1). Total num frames: 6768803840. Throughput: 0: 53647.6. Samples: 1259255040. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:14,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 15:32:15,101][52263] Updated weights for policy 0, policy_version 413139 (0.0027) [2024-04-27 15:32:18,843][52263] Updated weights for policy 0, policy_version 413149 (0.0029) [2024-04-27 15:32:19,107][52031] Fps is (10 sec: 49151.5, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6769033216. Throughput: 0: 53552.3. Samples: 1259575440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:19,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 15:32:21,321][52263] Updated weights for policy 0, policy_version 413159 (0.0032) [2024-04-27 15:32:24,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6769344512. Throughput: 0: 53592.1. Samples: 1259897600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:24,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 15:32:24,779][52263] Updated weights for policy 0, policy_version 413169 (0.0027) [2024-04-27 15:32:27,519][52263] Updated weights for policy 0, policy_version 413179 (0.0026) [2024-04-27 15:32:29,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6769606656. Throughput: 0: 53716.5. Samples: 1260060200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:29,107][52031] Avg episode reward: [(0, '0.690')] [2024-04-27 15:32:30,961][52263] Updated weights for policy 0, policy_version 413189 (0.0027) [2024-04-27 15:32:33,545][52263] Updated weights for policy 0, policy_version 413199 (0.0035) [2024-04-27 15:32:34,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6769901568. Throughput: 0: 53721.4. Samples: 1260379500. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:34,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 15:32:37,298][52263] Updated weights for policy 0, policy_version 413209 (0.0029) [2024-04-27 15:32:39,106][52031] Fps is (10 sec: 54067.5, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6770147328. Throughput: 0: 53518.4. Samples: 1260700200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:39,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 15:32:39,619][52263] Updated weights for policy 0, policy_version 413219 (0.0026) [2024-04-27 15:32:43,274][52263] Updated weights for policy 0, policy_version 413229 (0.0026) [2024-04-27 15:32:44,106][52031] Fps is (10 sec: 49152.1, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6770393088. Throughput: 0: 53640.1. Samples: 1260866480. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:44,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 15:32:45,771][52263] Updated weights for policy 0, policy_version 413239 (0.0031) [2024-04-27 15:32:49,107][52031] Fps is (10 sec: 50789.2, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6770655232. Throughput: 0: 53668.7. Samples: 1261188420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:49,107][52031] Avg episode reward: [(0, '0.674')] [2024-04-27 15:32:49,243][52263] Updated weights for policy 0, policy_version 413249 (0.0032) [2024-04-27 15:32:51,899][52263] Updated weights for policy 0, policy_version 413259 (0.0026) [2024-04-27 15:32:54,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6770950144. Throughput: 0: 53664.3. Samples: 1261508320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:54,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 15:32:55,336][52263] Updated weights for policy 0, policy_version 413269 (0.0030) [2024-04-27 15:32:56,152][52242] Signal inference workers to stop experience collection... 
(19100 times) [2024-04-27 15:32:56,193][52263] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-04-27 15:32:56,252][52242] Signal inference workers to resume experience collection... (19100 times) [2024-04-27 15:32:56,252][52263] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-04-27 15:32:57,897][52263] Updated weights for policy 0, policy_version 413279 (0.0034) [2024-04-27 15:32:59,107][52031] Fps is (10 sec: 57344.5, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6771228672. Throughput: 0: 53657.2. Samples: 1261669620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:32:59,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 15:33:01,749][52263] Updated weights for policy 0, policy_version 413289 (0.0033) [2024-04-27 15:33:03,997][52263] Updated weights for policy 0, policy_version 413299 (0.0033) [2024-04-27 15:33:04,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6771490816. Throughput: 0: 53674.4. Samples: 1261990780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:33:04,107][52031] Avg episode reward: [(0, '0.675')] [2024-04-27 15:33:07,782][52263] Updated weights for policy 0, policy_version 413309 (0.0028) [2024-04-27 15:33:09,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6771736576. Throughput: 0: 53672.8. Samples: 1262312880. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:33:09,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 15:33:10,145][52263] Updated weights for policy 0, policy_version 413319 (0.0028) [2024-04-27 15:33:13,899][52263] Updated weights for policy 0, policy_version 413329 (0.0034) [2024-04-27 15:33:14,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6771998720. Throughput: 0: 53558.3. Samples: 1262470320. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:33:14,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 15:33:16,287][52263] Updated weights for policy 0, policy_version 413339 (0.0034) [2024-04-27 15:33:19,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6772260864. Throughput: 0: 53615.4. Samples: 1262792200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:33:19,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 15:33:20,138][52263] Updated weights for policy 0, policy_version 413349 (0.0026) [2024-04-27 15:33:22,441][52263] Updated weights for policy 0, policy_version 413359 (0.0029) [2024-04-27 15:33:24,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6772555776. Throughput: 0: 53547.1. Samples: 1263109820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 15:33:24,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 15:33:26,118][52263] Updated weights for policy 0, policy_version 413369 (0.0033) [2024-04-27 15:33:28,483][52263] Updated weights for policy 0, policy_version 413379 (0.0039) [2024-04-27 15:33:29,107][52031] Fps is (10 sec: 57344.6, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6772834304. Throughput: 0: 53602.1. Samples: 1263278580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:33:29,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 15:33:32,090][52263] Updated weights for policy 0, policy_version 413389 (0.0032) [2024-04-27 15:33:34,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6773096448. Throughput: 0: 53632.6. Samples: 1263601880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:33:34,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 15:33:34,113][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000413397_6773096448.pth... 
[2024-04-27 15:33:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000412613_6760251392.pth [2024-04-27 15:33:34,639][52263] Updated weights for policy 0, policy_version 413399 (0.0023) [2024-04-27 15:33:38,191][52263] Updated weights for policy 0, policy_version 413409 (0.0028) [2024-04-27 15:33:39,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6773342208. Throughput: 0: 53753.8. Samples: 1263927240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:33:39,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 15:33:40,682][52263] Updated weights for policy 0, policy_version 413419 (0.0033) [2024-04-27 15:33:44,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53521.1, 300 sec: 53539.8). Total num frames: 6773604352. Throughput: 0: 53459.8. Samples: 1264075300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:33:44,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 15:33:44,355][52263] Updated weights for policy 0, policy_version 413429 (0.0034) [2024-04-27 15:33:46,783][52263] Updated weights for policy 0, policy_version 413439 (0.0029) [2024-04-27 15:33:49,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6773882880. Throughput: 0: 53433.1. Samples: 1264395280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:33:49,107][52031] Avg episode reward: [(0, '0.492')] [2024-04-27 15:33:50,262][52263] Updated weights for policy 0, policy_version 413449 (0.0031) [2024-04-27 15:33:51,010][52242] Signal inference workers to stop experience collection... (19150 times) [2024-04-27 15:33:51,012][52242] Signal inference workers to resume experience collection... 
(19150 times) [2024-04-27 15:33:51,053][52263] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-04-27 15:33:51,053][52263] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-04-27 15:33:52,847][52263] Updated weights for policy 0, policy_version 413459 (0.0026) [2024-04-27 15:33:54,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6774161408. Throughput: 0: 53380.1. Samples: 1264714980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:33:54,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 15:33:56,516][52263] Updated weights for policy 0, policy_version 413469 (0.0028) [2024-04-27 15:33:58,866][52263] Updated weights for policy 0, policy_version 413479 (0.0035) [2024-04-27 15:33:59,106][52031] Fps is (10 sec: 57345.0, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6774456320. Throughput: 0: 53820.0. Samples: 1264892220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:33:59,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 15:34:02,728][52263] Updated weights for policy 0, policy_version 413489 (0.0031) [2024-04-27 15:34:04,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52974.8, 300 sec: 53428.5). Total num frames: 6774669312. Throughput: 0: 53760.5. Samples: 1265211420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:34:04,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 15:34:04,971][52263] Updated weights for policy 0, policy_version 413499 (0.0026) [2024-04-27 15:34:08,930][52263] Updated weights for policy 0, policy_version 413509 (0.0034) [2024-04-27 15:34:09,106][52031] Fps is (10 sec: 49151.8, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6774947840. Throughput: 0: 53791.9. Samples: 1265530460. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:34:09,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 15:34:11,081][52263] Updated weights for policy 0, policy_version 413519 (0.0028) [2024-04-27 15:34:14,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6775209984. Throughput: 0: 53261.4. Samples: 1265675340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:34:14,115][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:34:14,878][52263] Updated weights for policy 0, policy_version 413529 (0.0031) [2024-04-27 15:34:17,456][52263] Updated weights for policy 0, policy_version 413539 (0.0038) [2024-04-27 15:34:19,107][52031] Fps is (10 sec: 55705.7, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6775504896. Throughput: 0: 53327.2. Samples: 1266001600. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:34:19,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 15:34:21,050][52263] Updated weights for policy 0, policy_version 413549 (0.0031) [2024-04-27 15:34:23,529][52263] Updated weights for policy 0, policy_version 413559 (0.0026) [2024-04-27 15:34:24,107][52031] Fps is (10 sec: 57342.8, 60 sec: 53793.9, 300 sec: 53539.6). Total num frames: 6775783424. Throughput: 0: 53267.4. Samples: 1266324280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:34:24,116][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:34:27,195][52263] Updated weights for policy 0, policy_version 413569 (0.0026) [2024-04-27 15:34:29,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6776029184. Throughput: 0: 53769.4. Samples: 1266494920. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:34:29,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 15:34:29,550][52263] Updated weights for policy 0, policy_version 413579 (0.0025) [2024-04-27 15:34:33,282][52263] Updated weights for policy 0, policy_version 413589 (0.0029) [2024-04-27 15:34:34,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6776291328. Throughput: 0: 53662.9. Samples: 1266810100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:34:34,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 15:34:35,742][52263] Updated weights for policy 0, policy_version 413599 (0.0033) [2024-04-27 15:34:38,969][52242] Signal inference workers to stop experience collection... (19200 times) [2024-04-27 15:34:39,014][52263] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-04-27 15:34:39,032][52242] Signal inference workers to resume experience collection... (19200 times) [2024-04-27 15:34:39,033][52263] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-04-27 15:34:39,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6776537088. Throughput: 0: 53674.3. Samples: 1267130320. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:34:39,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 15:34:39,277][52263] Updated weights for policy 0, policy_version 413609 (0.0031) [2024-04-27 15:34:41,949][52263] Updated weights for policy 0, policy_version 413619 (0.0028) [2024-04-27 15:34:44,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53793.9, 300 sec: 53650.6). Total num frames: 6776832000. Throughput: 0: 53226.9. Samples: 1267287440. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 15:34:44,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:34:45,407][52263] Updated weights for policy 0, policy_version 413629 (0.0035) [2024-04-27 15:34:47,953][52263] Updated weights for policy 0, policy_version 413639 (0.0030) [2024-04-27 15:34:49,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6777110528. Throughput: 0: 53338.4. Samples: 1267611640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:34:49,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 15:34:51,606][52263] Updated weights for policy 0, policy_version 413649 (0.0032) [2024-04-27 15:34:54,045][52263] Updated weights for policy 0, policy_version 413659 (0.0027) [2024-04-27 15:34:54,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6777389056. Throughput: 0: 53350.6. Samples: 1267931240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:34:54,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 15:34:57,882][52263] Updated weights for policy 0, policy_version 413669 (0.0029) [2024-04-27 15:34:59,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52701.8, 300 sec: 53428.5). Total num frames: 6777618432. Throughput: 0: 53671.9. Samples: 1268090580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:34:59,107][52031] Avg episode reward: [(0, '0.518')] [2024-04-27 15:35:00,303][52263] Updated weights for policy 0, policy_version 413679 (0.0023) [2024-04-27 15:35:03,872][52263] Updated weights for policy 0, policy_version 413689 (0.0026) [2024-04-27 15:35:04,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6777896960. Throughput: 0: 53560.5. Samples: 1268411820. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:04,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:35:06,430][52263] Updated weights for policy 0, policy_version 413699 (0.0027) [2024-04-27 15:35:09,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6778142720. Throughput: 0: 53461.7. Samples: 1268730040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:09,107][52031] Avg episode reward: [(0, '0.522')] [2024-04-27 15:35:09,940][52263] Updated weights for policy 0, policy_version 413709 (0.0029) [2024-04-27 15:35:12,464][52263] Updated weights for policy 0, policy_version 413719 (0.0031) [2024-04-27 15:35:14,107][52031] Fps is (10 sec: 55705.1, 60 sec: 54067.1, 300 sec: 53595.2). Total num frames: 6778454016. Throughput: 0: 53391.8. Samples: 1268897560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:14,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 15:35:16,049][52263] Updated weights for policy 0, policy_version 413729 (0.0025) [2024-04-27 15:35:18,836][52263] Updated weights for policy 0, policy_version 413739 (0.0028) [2024-04-27 15:35:19,106][52031] Fps is (10 sec: 58981.8, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6778732544. Throughput: 0: 53577.3. Samples: 1269221080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:19,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 15:35:22,246][52263] Updated weights for policy 0, policy_version 413749 (0.0023) [2024-04-27 15:35:24,106][52031] Fps is (10 sec: 49152.8, 60 sec: 52702.1, 300 sec: 53428.5). Total num frames: 6778945536. Throughput: 0: 53618.2. Samples: 1269543140. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:24,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 15:35:24,950][52263] Updated weights for policy 0, policy_version 413759 (0.0027) [2024-04-27 15:35:27,787][52242] Signal inference workers to stop experience collection... (19250 times) [2024-04-27 15:35:27,846][52263] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-04-27 15:35:27,846][52242] Signal inference workers to resume experience collection... (19250 times) [2024-04-27 15:35:27,861][52263] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-04-27 15:35:28,264][52263] Updated weights for policy 0, policy_version 413769 (0.0031) [2024-04-27 15:35:29,107][52031] Fps is (10 sec: 49151.9, 60 sec: 53247.9, 300 sec: 53484.1). Total num frames: 6779224064. Throughput: 0: 53348.1. Samples: 1269688100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:29,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 15:35:31,024][52263] Updated weights for policy 0, policy_version 413779 (0.0034) [2024-04-27 15:35:34,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6779502592. Throughput: 0: 53254.6. Samples: 1270008100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:34,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 15:35:34,124][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000413788_6779502592.pth... [2024-04-27 15:35:34,177][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000413005_6766673920.pth [2024-04-27 15:35:34,422][52263] Updated weights for policy 0, policy_version 413789 (0.0029) [2024-04-27 15:35:37,059][52263] Updated weights for policy 0, policy_version 413799 (0.0026) [2024-04-27 15:35:39,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6779764736. Throughput: 0: 53265.4. 
Samples: 1270328180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:39,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 15:35:40,456][52263] Updated weights for policy 0, policy_version 413809 (0.0025) [2024-04-27 15:35:43,280][52263] Updated weights for policy 0, policy_version 413819 (0.0030) [2024-04-27 15:35:44,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6780059648. Throughput: 0: 53401.8. Samples: 1270493660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:44,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 15:35:46,680][52263] Updated weights for policy 0, policy_version 413829 (0.0031) [2024-04-27 15:35:49,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6780305408. Throughput: 0: 53321.8. Samples: 1270811300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:49,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 15:35:49,386][52263] Updated weights for policy 0, policy_version 413839 (0.0029) [2024-04-27 15:35:52,773][52263] Updated weights for policy 0, policy_version 413849 (0.0031) [2024-04-27 15:35:54,107][52031] Fps is (10 sec: 49151.7, 60 sec: 52701.9, 300 sec: 53484.1). Total num frames: 6780551168. Throughput: 0: 53462.4. Samples: 1271135860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:54,107][52031] Avg episode reward: [(0, '0.486')] [2024-04-27 15:35:55,495][52263] Updated weights for policy 0, policy_version 413859 (0.0030) [2024-04-27 15:35:58,861][52263] Updated weights for policy 0, policy_version 413869 (0.0027) [2024-04-27 15:35:59,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6780829696. Throughput: 0: 53078.8. Samples: 1271286100. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:35:59,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 15:36:01,748][52263] Updated weights for policy 0, policy_version 413879 (0.0031) [2024-04-27 15:36:04,106][52031] Fps is (10 sec: 52429.5, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6781075456. Throughput: 0: 52859.2. Samples: 1271599740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:36:04,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 15:36:04,884][52263] Updated weights for policy 0, policy_version 413889 (0.0033) [2024-04-27 15:36:07,908][52263] Updated weights for policy 0, policy_version 413899 (0.0032) [2024-04-27 15:36:09,107][52031] Fps is (10 sec: 55705.1, 60 sec: 54067.0, 300 sec: 53595.1). Total num frames: 6781386752. Throughput: 0: 52871.8. Samples: 1271922380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 15:36:09,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 15:36:10,957][52263] Updated weights for policy 0, policy_version 413909 (0.0030) [2024-04-27 15:36:14,038][52263] Updated weights for policy 0, policy_version 413919 (0.0025) [2024-04-27 15:36:14,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53248.2, 300 sec: 53595.1). Total num frames: 6781648896. Throughput: 0: 53384.6. Samples: 1272090400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:14,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 15:36:17,158][52263] Updated weights for policy 0, policy_version 413929 (0.0031) [2024-04-27 15:36:19,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52428.8, 300 sec: 53428.5). Total num frames: 6781878272. Throughput: 0: 53321.3. Samples: 1272407560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:19,107][52031] Avg episode reward: [(0, '0.673')] [2024-04-27 15:36:19,850][52242] Signal inference workers to stop experience collection... 
(19300 times) [2024-04-27 15:36:19,851][52242] Signal inference workers to resume experience collection... (19300 times) [2024-04-27 15:36:19,879][52263] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-04-27 15:36:19,879][52263] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-04-27 15:36:20,182][52263] Updated weights for policy 0, policy_version 413939 (0.0031) [2024-04-27 15:36:23,220][52263] Updated weights for policy 0, policy_version 413949 (0.0031) [2024-04-27 15:36:24,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6782156800. Throughput: 0: 53279.9. Samples: 1272725780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:24,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 15:36:26,306][52263] Updated weights for policy 0, policy_version 413959 (0.0030) [2024-04-27 15:36:29,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53247.9, 300 sec: 53372.9). Total num frames: 6782418944. Throughput: 0: 53104.4. Samples: 1272883360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:29,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:36:29,408][52263] Updated weights for policy 0, policy_version 413969 (0.0033) [2024-04-27 15:36:32,481][52263] Updated weights for policy 0, policy_version 413979 (0.0029) [2024-04-27 15:36:34,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6782697472. Throughput: 0: 53114.1. Samples: 1273201440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:34,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 15:36:36,019][52263] Updated weights for policy 0, policy_version 413989 (0.0028) [2024-04-27 15:36:38,676][52263] Updated weights for policy 0, policy_version 413999 (0.0035) [2024-04-27 15:36:39,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6782976000. Throughput: 0: 53000.8. 
Samples: 1273520900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:39,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 15:36:42,384][52263] Updated weights for policy 0, policy_version 414009 (0.0031) [2024-04-27 15:36:44,107][52031] Fps is (10 sec: 54066.5, 60 sec: 52974.8, 300 sec: 53539.5). Total num frames: 6783238144. Throughput: 0: 53253.5. Samples: 1273682520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:44,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:36:44,773][52263] Updated weights for policy 0, policy_version 414019 (0.0028) [2024-04-27 15:36:48,447][52263] Updated weights for policy 0, policy_version 414029 (0.0033) [2024-04-27 15:36:49,107][52031] Fps is (10 sec: 49152.2, 60 sec: 52701.8, 300 sec: 53373.0). Total num frames: 6783467520. Throughput: 0: 53439.4. Samples: 1274004520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:49,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 15:36:50,804][52263] Updated weights for policy 0, policy_version 414039 (0.0031) [2024-04-27 15:36:54,106][52031] Fps is (10 sec: 52430.2, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 6783762432. Throughput: 0: 53442.8. Samples: 1274327300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:54,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 15:36:54,732][52263] Updated weights for policy 0, policy_version 414049 (0.0035) [2024-04-27 15:36:57,087][52263] Updated weights for policy 0, policy_version 414059 (0.0025) [2024-04-27 15:36:59,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6784024576. Throughput: 0: 53091.8. Samples: 1274479540. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:36:59,107][52031] Avg episode reward: [(0, '0.471')] [2024-04-27 15:37:00,859][52263] Updated weights for policy 0, policy_version 414069 (0.0029) [2024-04-27 15:37:03,354][52263] Updated weights for policy 0, policy_version 414079 (0.0026) [2024-04-27 15:37:04,106][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.2, 300 sec: 53484.0). Total num frames: 6784319488. Throughput: 0: 53227.3. Samples: 1274802780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:37:04,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 15:37:06,879][52263] Updated weights for policy 0, policy_version 414089 (0.0031) [2024-04-27 15:37:09,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6784581632. Throughput: 0: 53348.9. Samples: 1275126480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:37:09,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 15:37:09,392][52263] Updated weights for policy 0, policy_version 414099 (0.0031) [2024-04-27 15:37:12,884][52263] Updated weights for policy 0, policy_version 414109 (0.0026) [2024-04-27 15:37:14,107][52031] Fps is (10 sec: 50789.6, 60 sec: 52974.8, 300 sec: 53539.6). Total num frames: 6784827392. Throughput: 0: 53430.7. Samples: 1275287740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:37:14,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 15:37:15,377][52263] Updated weights for policy 0, policy_version 414119 (0.0027) [2024-04-27 15:37:15,985][52242] Signal inference workers to stop experience collection... (19350 times) [2024-04-27 15:37:15,985][52242] Signal inference workers to resume experience collection... 
(19350 times) [2024-04-27 15:37:16,009][52263] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-04-27 15:37:16,009][52263] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-04-27 15:37:19,012][52263] Updated weights for policy 0, policy_version 414129 (0.0037) [2024-04-27 15:37:19,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53521.1, 300 sec: 53373.0). Total num frames: 6785089536. Throughput: 0: 53576.6. Samples: 1275612380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:37:19,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 15:37:21,534][52263] Updated weights for policy 0, policy_version 414139 (0.0029) [2024-04-27 15:37:24,106][52031] Fps is (10 sec: 50791.3, 60 sec: 52975.1, 300 sec: 53317.4). Total num frames: 6785335296. Throughput: 0: 53511.0. Samples: 1275928880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:37:24,107][52031] Avg episode reward: [(0, '0.498')] [2024-04-27 15:37:25,366][52263] Updated weights for policy 0, policy_version 414149 (0.0030) [2024-04-27 15:37:27,743][52263] Updated weights for policy 0, policy_version 414159 (0.0030) [2024-04-27 15:37:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.1, 300 sec: 53317.4). Total num frames: 6785630208. Throughput: 0: 53453.5. Samples: 1276087920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 15:37:29,107][52031] Avg episode reward: [(0, '0.674')] [2024-04-27 15:37:31,506][52263] Updated weights for policy 0, policy_version 414169 (0.0034) [2024-04-27 15:37:33,830][52263] Updated weights for policy 0, policy_version 414179 (0.0034) [2024-04-27 15:37:34,107][52031] Fps is (10 sec: 58981.0, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6785925120. Throughput: 0: 53430.2. Samples: 1276408880. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:37:34,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 15:37:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000414180_6785925120.pth... [2024-04-27 15:37:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000413397_6773096448.pth [2024-04-27 15:37:37,589][52263] Updated weights for policy 0, policy_version 414189 (0.0030) [2024-04-27 15:37:39,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6786170880. Throughput: 0: 53333.2. Samples: 1276727300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:37:39,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 15:37:39,929][52263] Updated weights for policy 0, policy_version 414199 (0.0035) [2024-04-27 15:37:43,890][52263] Updated weights for policy 0, policy_version 414209 (0.0036) [2024-04-27 15:37:44,106][52031] Fps is (10 sec: 47514.2, 60 sec: 52702.1, 300 sec: 53373.0). Total num frames: 6786400256. Throughput: 0: 53530.8. Samples: 1276888420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:37:44,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:37:46,256][52263] Updated weights for policy 0, policy_version 414219 (0.0028) [2024-04-27 15:37:49,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53794.3, 300 sec: 53373.0). Total num frames: 6786695168. Throughput: 0: 53351.9. Samples: 1277203620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:37:49,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 15:37:50,062][52263] Updated weights for policy 0, policy_version 414229 (0.0033) [2024-04-27 15:37:52,424][52263] Updated weights for policy 0, policy_version 414239 (0.0038) [2024-04-27 15:37:54,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53247.8, 300 sec: 53317.4). Total num frames: 6786957312. Throughput: 0: 53231.8. Samples: 1277521920. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:37:54,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 15:37:56,104][52263] Updated weights for policy 0, policy_version 414249 (0.0033) [2024-04-27 15:37:58,677][52263] Updated weights for policy 0, policy_version 414259 (0.0029) [2024-04-27 15:37:59,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6787252224. Throughput: 0: 53379.9. Samples: 1277689840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:37:59,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 15:38:02,170][52263] Updated weights for policy 0, policy_version 414269 (0.0026) [2024-04-27 15:38:04,107][52031] Fps is (10 sec: 55706.2, 60 sec: 53247.8, 300 sec: 53484.0). Total num frames: 6787514368. Throughput: 0: 53246.6. Samples: 1278008480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:04,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 15:38:04,791][52263] Updated weights for policy 0, policy_version 414279 (0.0032) [2024-04-27 15:38:08,277][52263] Updated weights for policy 0, policy_version 414289 (0.0033) [2024-04-27 15:38:09,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6787776512. Throughput: 0: 53426.1. Samples: 1278333060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:09,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 15:38:10,807][52263] Updated weights for policy 0, policy_version 414299 (0.0026) [2024-04-27 15:38:14,106][52031] Fps is (10 sec: 49152.4, 60 sec: 52975.0, 300 sec: 53373.0). Total num frames: 6788005888. Throughput: 0: 53357.0. Samples: 1278488980. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:14,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 15:38:14,372][52263] Updated weights for policy 0, policy_version 414309 (0.0027) [2024-04-27 15:38:16,871][52263] Updated weights for policy 0, policy_version 414319 (0.0030) [2024-04-27 15:38:19,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6788284416. Throughput: 0: 53334.6. Samples: 1278808940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:19,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 15:38:20,362][52263] Updated weights for policy 0, policy_version 414329 (0.0026) [2024-04-27 15:38:21,962][52242] Signal inference workers to stop experience collection... (19400 times) [2024-04-27 15:38:21,962][52242] Signal inference workers to resume experience collection... (19400 times) [2024-04-27 15:38:21,983][52263] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-04-27 15:38:21,984][52263] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-04-27 15:38:23,065][52263] Updated weights for policy 0, policy_version 414339 (0.0031) [2024-04-27 15:38:24,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.0, 300 sec: 53317.4). Total num frames: 6788562944. Throughput: 0: 53410.8. Samples: 1279130780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:24,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 15:38:26,536][52263] Updated weights for policy 0, policy_version 414349 (0.0031) [2024-04-27 15:38:29,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53521.2, 300 sec: 53373.0). Total num frames: 6788841472. Throughput: 0: 53637.8. Samples: 1279302120. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:29,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 15:38:29,207][52263] Updated weights for policy 0, policy_version 414359 (0.0036) [2024-04-27 15:38:32,632][52263] Updated weights for policy 0, policy_version 414369 (0.0026) [2024-04-27 15:38:34,107][52031] Fps is (10 sec: 57343.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6789136384. Throughput: 0: 53755.8. Samples: 1279622640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:34,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 15:38:35,191][52263] Updated weights for policy 0, policy_version 414379 (0.0029) [2024-04-27 15:38:38,691][52263] Updated weights for policy 0, policy_version 414389 (0.0027) [2024-04-27 15:38:39,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 6789382144. Throughput: 0: 53891.0. Samples: 1279947000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:39,107][52031] Avg episode reward: [(0, '0.670')] [2024-04-27 15:38:41,328][52263] Updated weights for policy 0, policy_version 414399 (0.0029) [2024-04-27 15:38:44,106][52031] Fps is (10 sec: 49152.5, 60 sec: 53794.1, 300 sec: 53373.0). Total num frames: 6789627904. Throughput: 0: 53561.9. Samples: 1280100120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:44,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 15:38:44,588][52263] Updated weights for policy 0, policy_version 414409 (0.0026) [2024-04-27 15:38:47,405][52263] Updated weights for policy 0, policy_version 414419 (0.0028) [2024-04-27 15:38:49,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53247.9, 300 sec: 53317.4). Total num frames: 6789890048. Throughput: 0: 53636.9. Samples: 1280422140. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 15:38:49,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 15:38:50,650][52263] Updated weights for policy 0, policy_version 414429 (0.0031) [2024-04-27 15:38:53,485][52263] Updated weights for policy 0, policy_version 414439 (0.0030) [2024-04-27 15:38:54,107][52031] Fps is (10 sec: 57343.3, 60 sec: 54067.3, 300 sec: 53372.9). Total num frames: 6790201344. Throughput: 0: 53638.0. Samples: 1280746780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:38:54,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 15:38:56,814][52263] Updated weights for policy 0, policy_version 414449 (0.0025) [2024-04-27 15:38:59,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6790447104. Throughput: 0: 53849.8. Samples: 1280912220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:38:59,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 15:38:59,481][52263] Updated weights for policy 0, policy_version 414459 (0.0030) [2024-04-27 15:39:02,871][52263] Updated weights for policy 0, policy_version 414469 (0.0035) [2024-04-27 15:39:04,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6790742016. Throughput: 0: 53942.9. Samples: 1281236360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:04,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 15:39:04,120][52242] Signal inference workers to stop experience collection... (19450 times) [2024-04-27 15:39:04,120][52242] Signal inference workers to resume experience collection... 
(19450 times) [2024-04-27 15:39:04,133][52263] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-04-27 15:39:04,133][52263] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-04-27 15:39:05,536][52263] Updated weights for policy 0, policy_version 414479 (0.0027) [2024-04-27 15:39:08,905][52263] Updated weights for policy 0, policy_version 414489 (0.0039) [2024-04-27 15:39:09,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6790987776. Throughput: 0: 53957.7. Samples: 1281558880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:09,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 15:39:11,680][52263] Updated weights for policy 0, policy_version 414499 (0.0033) [2024-04-27 15:39:14,107][52031] Fps is (10 sec: 47512.8, 60 sec: 53521.0, 300 sec: 53261.9). Total num frames: 6791217152. Throughput: 0: 53560.7. Samples: 1281712360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:14,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 15:39:15,124][52263] Updated weights for policy 0, policy_version 414509 (0.0033) [2024-04-27 15:39:17,958][52263] Updated weights for policy 0, policy_version 414519 (0.0022) [2024-04-27 15:39:19,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.1, 300 sec: 53261.9). Total num frames: 6791495680. Throughput: 0: 53564.1. Samples: 1282033020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:19,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 15:39:21,257][52263] Updated weights for policy 0, policy_version 414529 (0.0032) [2024-04-27 15:39:24,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.1, 300 sec: 53428.5). Total num frames: 6791790592. Throughput: 0: 53443.4. Samples: 1282351960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:24,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 15:39:24,166][52263] Updated weights for policy 0, policy_version 414539 (0.0031) [2024-04-27 15:39:27,264][52263] Updated weights for policy 0, policy_version 414549 (0.0032) [2024-04-27 15:39:29,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 6792069120. Throughput: 0: 53881.8. Samples: 1282524800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:29,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 15:39:30,225][52263] Updated weights for policy 0, policy_version 414559 (0.0028) [2024-04-27 15:39:33,231][52263] Updated weights for policy 0, policy_version 414569 (0.0039) [2024-04-27 15:39:34,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6792364032. Throughput: 0: 53874.7. Samples: 1282846500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:34,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:39:34,122][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000414573_6792364032.pth... [2024-04-27 15:39:34,174][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000413788_6779502592.pth [2024-04-27 15:39:36,406][52263] Updated weights for policy 0, policy_version 414579 (0.0032) [2024-04-27 15:39:39,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6792593408. Throughput: 0: 53795.1. Samples: 1283167560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:39,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 15:39:39,389][52263] Updated weights for policy 0, policy_version 414589 (0.0031) [2024-04-27 15:39:42,967][52263] Updated weights for policy 0, policy_version 414599 (0.0031) [2024-04-27 15:39:44,106][52031] Fps is (10 sec: 45875.6, 60 sec: 53248.0, 300 sec: 53261.9). 
Total num frames: 6792822784. Throughput: 0: 53498.6. Samples: 1283319660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:44,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 15:39:45,479][52263] Updated weights for policy 0, policy_version 414609 (0.0028) [2024-04-27 15:39:49,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53521.2, 300 sec: 53261.9). Total num frames: 6793101312. Throughput: 0: 53393.7. Samples: 1283639080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:49,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 15:39:49,273][52263] Updated weights for policy 0, policy_version 414619 (0.0030) [2024-04-27 15:39:50,799][52242] Signal inference workers to stop experience collection... (19500 times) [2024-04-27 15:39:50,803][52242] Signal inference workers to resume experience collection... (19500 times) [2024-04-27 15:39:50,829][52263] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-04-27 15:39:50,829][52263] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-04-27 15:39:51,591][52263] Updated weights for policy 0, policy_version 414629 (0.0033) [2024-04-27 15:39:54,106][52031] Fps is (10 sec: 58982.5, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6793412608. Throughput: 0: 53476.9. Samples: 1283965340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:54,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 15:39:55,436][52263] Updated weights for policy 0, policy_version 414639 (0.0030) [2024-04-27 15:39:57,635][52263] Updated weights for policy 0, policy_version 414649 (0.0030) [2024-04-27 15:39:59,106][52031] Fps is (10 sec: 58982.3, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6793691136. Throughput: 0: 53758.3. Samples: 1284131480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:39:59,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 15:40:01,470][52263] Updated weights for policy 0, policy_version 414659 (0.0034) [2024-04-27 15:40:03,773][52263] Updated weights for policy 0, policy_version 414669 (0.0025) [2024-04-27 15:40:04,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6793953280. Throughput: 0: 53769.8. Samples: 1284452660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:40:04,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 15:40:07,543][52263] Updated weights for policy 0, policy_version 414679 (0.0030) [2024-04-27 15:40:09,106][52031] Fps is (10 sec: 49152.1, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6794182656. Throughput: 0: 53806.3. Samples: 1284773240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:40:09,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 15:40:09,836][52263] Updated weights for policy 0, policy_version 414689 (0.0033) [2024-04-27 15:40:13,748][52263] Updated weights for policy 0, policy_version 414699 (0.0028) [2024-04-27 15:40:14,106][52031] Fps is (10 sec: 47513.8, 60 sec: 53521.2, 300 sec: 53206.4). Total num frames: 6794428416. Throughput: 0: 53214.7. Samples: 1284919460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 15:40:14,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 15:40:16,167][52263] Updated weights for policy 0, policy_version 414709 (0.0031) [2024-04-27 15:40:19,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 6794723328. Throughput: 0: 53232.6. Samples: 1285241960. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:40:19,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:40:20,060][52263] Updated weights for policy 0, policy_version 414719 (0.0031) [2024-04-27 15:40:22,271][52263] Updated weights for policy 0, policy_version 414729 (0.0031) [2024-04-27 15:40:24,107][52031] Fps is (10 sec: 58981.4, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 6795018240. Throughput: 0: 53111.1. Samples: 1285557560. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:40:24,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 15:40:26,098][52263] Updated weights for policy 0, policy_version 414739 (0.0030) [2024-04-27 15:40:28,235][52263] Updated weights for policy 0, policy_version 414749 (0.0031) [2024-04-27 15:40:29,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6795280384. Throughput: 0: 53523.1. Samples: 1285728200. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:40:29,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 15:40:32,101][52263] Updated weights for policy 0, policy_version 414759 (0.0035) [2024-04-27 15:40:34,106][52031] Fps is (10 sec: 52430.1, 60 sec: 52975.1, 300 sec: 53484.1). Total num frames: 6795542528. Throughput: 0: 53550.3. Samples: 1286048840. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:40:34,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 15:40:34,323][52263] Updated weights for policy 0, policy_version 414769 (0.0039) [2024-04-27 15:40:38,318][52263] Updated weights for policy 0, policy_version 414779 (0.0034) [2024-04-27 15:40:39,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53248.0, 300 sec: 53317.4). Total num frames: 6795788288. Throughput: 0: 53478.0. Samples: 1286371860. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:40:39,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 15:40:40,457][52263] Updated weights for policy 0, policy_version 414789 (0.0028) [2024-04-27 15:40:44,107][52031] Fps is (10 sec: 50789.1, 60 sec: 53794.0, 300 sec: 53372.9). Total num frames: 6796050432. Throughput: 0: 53089.5. Samples: 1286520520. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:40:44,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 15:40:44,351][52263] Updated weights for policy 0, policy_version 414799 (0.0028) [2024-04-27 15:40:46,154][52242] Signal inference workers to stop experience collection... (19550 times) [2024-04-27 15:40:46,199][52263] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-04-27 15:40:46,212][52242] Signal inference workers to resume experience collection... (19550 times) [2024-04-27 15:40:46,219][52263] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-04-27 15:40:46,469][52263] Updated weights for policy 0, policy_version 414809 (0.0026) [2024-04-27 15:40:49,107][52031] Fps is (10 sec: 55705.7, 60 sec: 54067.0, 300 sec: 53539.6). Total num frames: 6796345344. Throughput: 0: 53086.5. Samples: 1286841560. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:40:49,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 15:40:50,427][52263] Updated weights for policy 0, policy_version 414819 (0.0027) [2024-04-27 15:40:52,662][52263] Updated weights for policy 0, policy_version 414829 (0.0028) [2024-04-27 15:40:54,107][52031] Fps is (10 sec: 58983.0, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6796640256. Throughput: 0: 53151.4. Samples: 1287165060. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:40:54,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 15:40:56,604][52263] Updated weights for policy 0, policy_version 414839 (0.0026) [2024-04-27 15:40:59,016][52263] Updated weights for policy 0, policy_version 414849 (0.0032) [2024-04-27 15:40:59,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6796886016. Throughput: 0: 53789.6. Samples: 1287340000. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:40:59,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 15:41:02,735][52263] Updated weights for policy 0, policy_version 414859 (0.0034) [2024-04-27 15:41:04,106][52031] Fps is (10 sec: 49152.2, 60 sec: 52974.9, 300 sec: 53373.0). Total num frames: 6797131776. Throughput: 0: 53824.8. Samples: 1287664080. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:41:04,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 15:41:05,260][52263] Updated weights for policy 0, policy_version 414869 (0.0030) [2024-04-27 15:41:08,795][52263] Updated weights for policy 0, policy_version 414879 (0.0027) [2024-04-27 15:41:09,107][52031] Fps is (10 sec: 50790.6, 60 sec: 53521.0, 300 sec: 53372.9). Total num frames: 6797393920. Throughput: 0: 53914.8. Samples: 1287983720. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:41:09,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 15:41:11,241][52263] Updated weights for policy 0, policy_version 414889 (0.0028) [2024-04-27 15:41:14,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6797656064. Throughput: 0: 53459.0. Samples: 1288133860. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:41:14,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 15:41:15,052][52263] Updated weights for policy 0, policy_version 414899 (0.0028) [2024-04-27 15:41:17,570][52263] Updated weights for policy 0, policy_version 414909 (0.0032) [2024-04-27 15:41:19,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6797950976. Throughput: 0: 53475.1. Samples: 1288455220. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:41:19,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 15:41:21,095][52263] Updated weights for policy 0, policy_version 414919 (0.0027) [2024-04-27 15:41:23,567][52263] Updated weights for policy 0, policy_version 414929 (0.0028) [2024-04-27 15:41:24,107][52031] Fps is (10 sec: 58981.7, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6798245888. Throughput: 0: 53406.6. Samples: 1288775160. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:41:24,108][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 15:41:27,347][52263] Updated weights for policy 0, policy_version 414939 (0.0028) [2024-04-27 15:41:29,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6798491648. Throughput: 0: 53891.4. Samples: 1288945620. Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:41:29,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 15:41:29,721][52263] Updated weights for policy 0, policy_version 414949 (0.0027) [2024-04-27 15:41:33,423][52263] Updated weights for policy 0, policy_version 414959 (0.0030) [2024-04-27 15:41:34,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6798753792. Throughput: 0: 53873.0. Samples: 1289265840. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 21.0) [2024-04-27 15:41:34,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 15:41:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000414963_6798753792.pth... [2024-04-27 15:41:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000414180_6785925120.pth [2024-04-27 15:41:35,696][52263] Updated weights for policy 0, policy_version 414969 (0.0030) [2024-04-27 15:41:36,292][52242] Signal inference workers to stop experience collection... (19600 times) [2024-04-27 15:41:36,296][52242] Signal inference workers to resume experience collection... (19600 times) [2024-04-27 15:41:36,310][52263] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-04-27 15:41:36,310][52263] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-04-27 15:41:39,106][52031] Fps is (10 sec: 47513.5, 60 sec: 52975.1, 300 sec: 53317.5). Total num frames: 6798966784. Throughput: 0: 53809.0. Samples: 1289586460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:41:39,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 15:41:39,634][52263] Updated weights for policy 0, policy_version 414979 (0.0030) [2024-04-27 15:41:41,850][52263] Updated weights for policy 0, policy_version 414989 (0.0032) [2024-04-27 15:41:44,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6799261696. Throughput: 0: 53101.9. Samples: 1289729580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:41:44,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 15:41:45,744][52263] Updated weights for policy 0, policy_version 414999 (0.0028) [2024-04-27 15:41:47,858][52263] Updated weights for policy 0, policy_version 415009 (0.0033) [2024-04-27 15:41:49,107][52031] Fps is (10 sec: 60620.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6799572992. Throughput: 0: 53176.9. 
Samples: 1290057040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:41:49,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 15:41:51,670][52263] Updated weights for policy 0, policy_version 415019 (0.0027) [2024-04-27 15:41:54,107][52031] Fps is (10 sec: 55704.8, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6799818752. Throughput: 0: 53269.2. Samples: 1290380840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:41:54,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 15:41:54,170][52263] Updated weights for policy 0, policy_version 415029 (0.0034) [2024-04-27 15:41:57,802][52263] Updated weights for policy 0, policy_version 415039 (0.0039) [2024-04-27 15:41:59,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6800097280. Throughput: 0: 53663.6. Samples: 1290548720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:41:59,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 15:42:00,208][52263] Updated weights for policy 0, policy_version 415049 (0.0038) [2024-04-27 15:42:03,940][52263] Updated weights for policy 0, policy_version 415059 (0.0027) [2024-04-27 15:42:04,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6800343040. Throughput: 0: 53644.4. Samples: 1290869220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:04,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 15:42:06,386][52263] Updated weights for policy 0, policy_version 415069 (0.0030) [2024-04-27 15:42:09,107][52031] Fps is (10 sec: 49151.9, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 6800588800. Throughput: 0: 53635.3. Samples: 1291188740. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:09,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 15:42:09,858][52263] Updated weights for policy 0, policy_version 415079 (0.0029) [2024-04-27 15:42:12,546][52263] Updated weights for policy 0, policy_version 415089 (0.0027) [2024-04-27 15:42:14,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6800883712. Throughput: 0: 53285.1. Samples: 1291343460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:14,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 15:42:15,851][52263] Updated weights for policy 0, policy_version 415099 (0.0041) [2024-04-27 15:42:18,665][52263] Updated weights for policy 0, policy_version 415109 (0.0029) [2024-04-27 15:42:19,107][52031] Fps is (10 sec: 58982.6, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6801178624. Throughput: 0: 53344.0. Samples: 1291666320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:19,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 15:42:22,062][52263] Updated weights for policy 0, policy_version 415119 (0.0034) [2024-04-27 15:42:24,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53248.2, 300 sec: 53595.1). Total num frames: 6801440768. Throughput: 0: 53358.6. Samples: 1291987600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:24,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 15:42:24,680][52263] Updated weights for policy 0, policy_version 415129 (0.0032) [2024-04-27 15:42:27,859][52242] Signal inference workers to stop experience collection... (19650 times) [2024-04-27 15:42:27,899][52263] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-04-27 15:42:27,928][52242] Signal inference workers to resume experience collection... 
(19650 times) [2024-04-27 15:42:27,928][52263] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-04-27 15:42:28,191][52263] Updated weights for policy 0, policy_version 415139 (0.0031) [2024-04-27 15:42:29,107][52031] Fps is (10 sec: 49151.2, 60 sec: 52974.7, 300 sec: 53372.9). Total num frames: 6801670144. Throughput: 0: 53815.3. Samples: 1292151280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:29,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 15:42:30,867][52263] Updated weights for policy 0, policy_version 415149 (0.0029) [2024-04-27 15:42:34,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6801948672. Throughput: 0: 53537.0. Samples: 1292466200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:34,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 15:42:34,219][52263] Updated weights for policy 0, policy_version 415159 (0.0033) [2024-04-27 15:42:37,041][52263] Updated weights for policy 0, policy_version 415169 (0.0033) [2024-04-27 15:42:39,107][52031] Fps is (10 sec: 54067.8, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6802210816. Throughput: 0: 53445.4. Samples: 1292785880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:39,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 15:42:40,229][52263] Updated weights for policy 0, policy_version 415179 (0.0028) [2024-04-27 15:42:43,149][52263] Updated weights for policy 0, policy_version 415189 (0.0031) [2024-04-27 15:42:44,106][52031] Fps is (10 sec: 55705.4, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6802505728. Throughput: 0: 53519.6. Samples: 1292957100. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:44,107][52031] Avg episode reward: [(0, '0.496')] [2024-04-27 15:42:46,520][52263] Updated weights for policy 0, policy_version 415199 (0.0030) [2024-04-27 15:42:49,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53248.0, 300 sec: 53595.2). Total num frames: 6802767872. Throughput: 0: 53492.9. Samples: 1293276400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:49,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 15:42:49,190][52263] Updated weights for policy 0, policy_version 415209 (0.0033) [2024-04-27 15:42:52,679][52263] Updated weights for policy 0, policy_version 415219 (0.0029) [2024-04-27 15:42:54,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6803030016. Throughput: 0: 53411.6. Samples: 1293592260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 15:42:54,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 15:42:55,189][52263] Updated weights for policy 0, policy_version 415229 (0.0030) [2024-04-27 15:42:58,653][52263] Updated weights for policy 0, policy_version 415239 (0.0027) [2024-04-27 15:42:59,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6803292160. Throughput: 0: 53570.2. Samples: 1293754120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:42:59,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 15:43:01,334][52263] Updated weights for policy 0, policy_version 415249 (0.0029) [2024-04-27 15:43:04,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6803554304. Throughput: 0: 53529.3. Samples: 1294075140. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:04,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 15:43:04,788][52263] Updated weights for policy 0, policy_version 415259 (0.0026) [2024-04-27 15:43:07,518][52263] Updated weights for policy 0, policy_version 415269 (0.0026) [2024-04-27 15:43:09,106][52031] Fps is (10 sec: 54067.9, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6803832832. Throughput: 0: 53492.0. Samples: 1294394740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:09,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 15:43:11,065][52263] Updated weights for policy 0, policy_version 415279 (0.0032) [2024-04-27 15:43:13,473][52263] Updated weights for policy 0, policy_version 415289 (0.0033) [2024-04-27 15:43:14,106][52031] Fps is (10 sec: 57344.8, 60 sec: 54067.4, 300 sec: 53706.2). Total num frames: 6804127744. Throughput: 0: 53657.2. Samples: 1294565840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:14,107][52031] Avg episode reward: [(0, '0.451')] [2024-04-27 15:43:17,121][52263] Updated weights for policy 0, policy_version 415299 (0.0034) [2024-04-27 15:43:19,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6804373504. Throughput: 0: 53809.3. Samples: 1294887620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:19,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 15:43:19,579][52263] Updated weights for policy 0, policy_version 415309 (0.0031) [2024-04-27 15:43:19,921][52242] Signal inference workers to stop experience collection... (19700 times) [2024-04-27 15:43:19,921][52242] Signal inference workers to resume experience collection... 
(19700 times) [2024-04-27 15:43:19,935][52263] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-04-27 15:43:19,935][52263] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-04-27 15:43:23,143][52263] Updated weights for policy 0, policy_version 415319 (0.0035) [2024-04-27 15:43:24,106][52031] Fps is (10 sec: 49151.7, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6804619264. Throughput: 0: 53783.7. Samples: 1295206140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:24,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 15:43:25,824][52263] Updated weights for policy 0, policy_version 415329 (0.0033) [2024-04-27 15:43:29,084][52263] Updated weights for policy 0, policy_version 415339 (0.0027) [2024-04-27 15:43:29,107][52031] Fps is (10 sec: 54066.3, 60 sec: 54067.2, 300 sec: 53484.0). Total num frames: 6804914176. Throughput: 0: 53430.0. Samples: 1295361460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:29,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 15:43:31,908][52263] Updated weights for policy 0, policy_version 415349 (0.0033) [2024-04-27 15:43:34,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6805159936. Throughput: 0: 53446.3. Samples: 1295681480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:34,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 15:43:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000415355_6805176320.pth... [2024-04-27 15:43:34,169][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000414573_6792364032.pth [2024-04-27 15:43:35,325][52263] Updated weights for policy 0, policy_version 415359 (0.0029) [2024-04-27 15:43:38,022][52263] Updated weights for policy 0, policy_version 415369 (0.0035) [2024-04-27 15:43:39,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53794.1, 300 sec: 53595.1). 
Total num frames: 6805438464. Throughput: 0: 53656.3. Samples: 1296006800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:39,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 15:43:41,410][52263] Updated weights for policy 0, policy_version 415379 (0.0026) [2024-04-27 15:43:44,050][52263] Updated weights for policy 0, policy_version 415389 (0.0028) [2024-04-27 15:43:44,107][52031] Fps is (10 sec: 57343.2, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6805733376. Throughput: 0: 53624.9. Samples: 1296167240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:44,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 15:43:47,468][52263] Updated weights for policy 0, policy_version 415399 (0.0033) [2024-04-27 15:43:49,107][52031] Fps is (10 sec: 55706.3, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6805995520. Throughput: 0: 53667.1. Samples: 1296490160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:49,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 15:43:50,192][52263] Updated weights for policy 0, policy_version 415409 (0.0031) [2024-04-27 15:43:53,528][52263] Updated weights for policy 0, policy_version 415419 (0.0029) [2024-04-27 15:43:54,107][52031] Fps is (10 sec: 49151.7, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6806224896. Throughput: 0: 53784.2. Samples: 1296815040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:54,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 15:43:56,176][52263] Updated weights for policy 0, policy_version 415429 (0.0027) [2024-04-27 15:43:59,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.3, 300 sec: 53484.0). Total num frames: 6806519808. Throughput: 0: 53427.5. Samples: 1296970080. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:43:59,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 15:43:59,796][52263] Updated weights for policy 0, policy_version 415439 (0.0028) [2024-04-27 15:44:02,263][52263] Updated weights for policy 0, policy_version 415449 (0.0029) [2024-04-27 15:44:04,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6806781952. Throughput: 0: 53473.8. Samples: 1297293940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:44:04,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:44:05,893][52263] Updated weights for policy 0, policy_version 415459 (0.0026) [2024-04-27 15:44:08,426][52263] Updated weights for policy 0, policy_version 415469 (0.0029) [2024-04-27 15:44:09,107][52031] Fps is (10 sec: 55705.1, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 6807076864. Throughput: 0: 53506.6. Samples: 1297613940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:44:09,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 15:44:11,971][52263] Updated weights for policy 0, policy_version 415479 (0.0028) [2024-04-27 15:44:14,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53520.9, 300 sec: 53706.2). Total num frames: 6807339008. Throughput: 0: 53712.1. Samples: 1297778500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 15:44:14,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 15:44:14,557][52263] Updated weights for policy 0, policy_version 415489 (0.0032) [2024-04-27 15:44:18,040][52263] Updated weights for policy 0, policy_version 415499 (0.0025) [2024-04-27 15:44:19,106][52031] Fps is (10 sec: 49152.6, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6807568384. Throughput: 0: 53829.0. Samples: 1298103780. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:44:19,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 15:44:20,594][52263] Updated weights for policy 0, policy_version 415509 (0.0028) [2024-04-27 15:44:23,976][52263] Updated weights for policy 0, policy_version 415519 (0.0034) [2024-04-27 15:44:24,107][52031] Fps is (10 sec: 52428.6, 60 sec: 54067.1, 300 sec: 53539.6). Total num frames: 6807863296. Throughput: 0: 53729.3. Samples: 1298424620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:44:24,116][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 15:44:26,841][52263] Updated weights for policy 0, policy_version 415529 (0.0037) [2024-04-27 15:44:29,107][52031] Fps is (10 sec: 55704.2, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6808125440. Throughput: 0: 53655.9. Samples: 1298581760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:44:29,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 15:44:29,906][52263] Updated weights for policy 0, policy_version 415539 (0.0032) [2024-04-27 15:44:32,892][52263] Updated weights for policy 0, policy_version 415549 (0.0031) [2024-04-27 15:44:34,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6808387584. Throughput: 0: 53700.5. Samples: 1298906680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:44:34,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 15:44:36,289][52263] Updated weights for policy 0, policy_version 415559 (0.0031) [2024-04-27 15:44:38,943][52263] Updated weights for policy 0, policy_version 415569 (0.0040) [2024-04-27 15:44:39,107][52031] Fps is (10 sec: 55706.0, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 6808682496. Throughput: 0: 53716.1. Samples: 1299232260. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:44:39,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 15:44:42,675][52263] Updated weights for policy 0, policy_version 415579 (0.0038) [2024-04-27 15:44:43,345][52242] Signal inference workers to stop experience collection... (19750 times) [2024-04-27 15:44:43,373][52263] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-04-27 15:44:43,406][52242] Signal inference workers to resume experience collection... (19750 times) [2024-04-27 15:44:43,406][52263] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-04-27 15:44:44,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6808944640. Throughput: 0: 53779.0. Samples: 1299390140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:44:44,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:44:45,036][52263] Updated weights for policy 0, policy_version 415589 (0.0031) [2024-04-27 15:44:48,662][52263] Updated weights for policy 0, policy_version 415599 (0.0030) [2024-04-27 15:44:49,106][52031] Fps is (10 sec: 49152.6, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6809174016. Throughput: 0: 53683.1. Samples: 1299709680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:44:49,107][52031] Avg episode reward: [(0, '0.474')] [2024-04-27 15:44:51,118][52263] Updated weights for policy 0, policy_version 415609 (0.0030) [2024-04-27 15:44:54,107][52031] Fps is (10 sec: 54067.0, 60 sec: 54340.4, 300 sec: 53539.6). Total num frames: 6809485312. Throughput: 0: 53793.7. Samples: 1300034660. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:44:54,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 15:44:54,878][52263] Updated weights for policy 0, policy_version 415619 (0.0030) [2024-04-27 15:44:57,125][52263] Updated weights for policy 0, policy_version 415629 (0.0025) [2024-04-27 15:44:59,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6809731072. Throughput: 0: 53736.5. Samples: 1300196640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:44:59,108][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 15:45:01,089][52263] Updated weights for policy 0, policy_version 415639 (0.0033) [2024-04-27 15:45:03,477][52263] Updated weights for policy 0, policy_version 415649 (0.0030) [2024-04-27 15:45:04,107][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 6810025984. Throughput: 0: 53740.2. Samples: 1300522100. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:45:04,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 15:45:07,084][52263] Updated weights for policy 0, policy_version 415659 (0.0037) [2024-04-27 15:45:09,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53521.2, 300 sec: 53761.7). Total num frames: 6810288128. Throughput: 0: 53702.0. Samples: 1300841200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:45:09,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 15:45:09,590][52263] Updated weights for policy 0, policy_version 415669 (0.0028) [2024-04-27 15:45:13,057][52263] Updated weights for policy 0, policy_version 415679 (0.0030) [2024-04-27 15:45:14,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6810533888. Throughput: 0: 53749.4. Samples: 1301000480. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:45:14,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 15:45:15,644][52263] Updated weights for policy 0, policy_version 415689 (0.0035) [2024-04-27 15:45:19,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.1, 300 sec: 53484.1). Total num frames: 6810796032. Throughput: 0: 53668.9. Samples: 1301321780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:45:19,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 15:45:19,125][52263] Updated weights for policy 0, policy_version 415699 (0.0029) [2024-04-27 15:45:21,702][52263] Updated weights for policy 0, policy_version 415709 (0.0033) [2024-04-27 15:45:24,106][52031] Fps is (10 sec: 55706.8, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6811090944. Throughput: 0: 53682.9. Samples: 1301647980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:45:24,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 15:45:25,264][52263] Updated weights for policy 0, policy_version 415719 (0.0031) [2024-04-27 15:45:27,772][52263] Updated weights for policy 0, policy_version 415729 (0.0030) [2024-04-27 15:45:29,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6811336704. Throughput: 0: 53697.9. Samples: 1301806540. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:45:29,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 15:45:31,481][52263] Updated weights for policy 0, policy_version 415739 (0.0032) [2024-04-27 15:45:33,761][52263] Updated weights for policy 0, policy_version 415749 (0.0029) [2024-04-27 15:45:34,106][52031] Fps is (10 sec: 54067.2, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6811631616. Throughput: 0: 53737.0. Samples: 1302127840. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:45:34,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 15:45:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000415749_6811631616.pth... [2024-04-27 15:45:34,178][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000414963_6798753792.pth [2024-04-27 15:45:37,437][52263] Updated weights for policy 0, policy_version 415759 (0.0030) [2024-04-27 15:45:39,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 6811877376. Throughput: 0: 53703.3. Samples: 1302451300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-04-27 15:45:39,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 15:45:39,902][52263] Updated weights for policy 0, policy_version 415769 (0.0036) [2024-04-27 15:45:43,842][52263] Updated weights for policy 0, policy_version 415779 (0.0034) [2024-04-27 15:45:44,107][52031] Fps is (10 sec: 50789.2, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6812139520. Throughput: 0: 53728.8. Samples: 1302614440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:45:44,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 15:45:46,108][52263] Updated weights for policy 0, policy_version 415789 (0.0029) [2024-04-27 15:45:49,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6812401664. Throughput: 0: 53683.5. Samples: 1302937860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:45:49,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 15:45:49,972][52263] Updated weights for policy 0, policy_version 415799 (0.0035) [2024-04-27 15:45:52,159][52263] Updated weights for policy 0, policy_version 415809 (0.0027) [2024-04-27 15:45:54,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6812696576. Throughput: 0: 53611.0. Samples: 1303253700. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:45:54,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 15:45:56,137][52263] Updated weights for policy 0, policy_version 415819 (0.0038) [2024-04-27 15:45:56,425][52242] Signal inference workers to stop experience collection... (19800 times) [2024-04-27 15:45:56,426][52242] Signal inference workers to resume experience collection... (19800 times) [2024-04-27 15:45:56,442][52263] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-04-27 15:45:56,443][52263] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-04-27 15:45:58,389][52263] Updated weights for policy 0, policy_version 415829 (0.0032) [2024-04-27 15:45:59,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6812958720. Throughput: 0: 53785.5. Samples: 1303420820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:45:59,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 15:46:02,082][52263] Updated weights for policy 0, policy_version 415839 (0.0034) [2024-04-27 15:46:04,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6813237248. Throughput: 0: 53785.9. Samples: 1303742160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:04,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 15:46:04,707][52263] Updated weights for policy 0, policy_version 415849 (0.0028) [2024-04-27 15:46:08,252][52263] Updated weights for policy 0, policy_version 415859 (0.0029) [2024-04-27 15:46:09,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 6813483008. Throughput: 0: 53627.1. Samples: 1304061200. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:09,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 15:46:10,790][52263] Updated weights for policy 0, policy_version 415869 (0.0026) [2024-04-27 15:46:14,107][52031] Fps is (10 sec: 50791.0, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6813745152. Throughput: 0: 53584.7. Samples: 1304217860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:14,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 15:46:14,356][52263] Updated weights for policy 0, policy_version 415879 (0.0025) [2024-04-27 15:46:16,791][52263] Updated weights for policy 0, policy_version 415889 (0.0025) [2024-04-27 15:46:19,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53428.6). Total num frames: 6814007296. Throughput: 0: 53618.7. Samples: 1304540680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:19,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 15:46:20,529][52263] Updated weights for policy 0, policy_version 415899 (0.0026) [2024-04-27 15:46:23,041][52263] Updated weights for policy 0, policy_version 415909 (0.0028) [2024-04-27 15:46:24,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6814285824. Throughput: 0: 53465.3. Samples: 1304857240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:24,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 15:46:26,518][52263] Updated weights for policy 0, policy_version 415919 (0.0029) [2024-04-27 15:46:29,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6814564352. Throughput: 0: 53549.0. Samples: 1305024140. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:29,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 15:46:29,250][52263] Updated weights for policy 0, policy_version 415929 (0.0028) [2024-04-27 15:46:32,546][52263] Updated weights for policy 0, policy_version 415939 (0.0028) [2024-04-27 15:46:34,107][52031] Fps is (10 sec: 57342.8, 60 sec: 53793.9, 300 sec: 53872.8). Total num frames: 6814859264. Throughput: 0: 53526.6. Samples: 1305346560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:34,107][52031] Avg episode reward: [(0, '0.527')] [2024-04-27 15:46:35,156][52263] Updated weights for policy 0, policy_version 415949 (0.0028) [2024-04-27 15:46:38,600][52263] Updated weights for policy 0, policy_version 415959 (0.0026) [2024-04-27 15:46:39,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6815088640. Throughput: 0: 53709.4. Samples: 1305670620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:39,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 15:46:41,335][52263] Updated weights for policy 0, policy_version 415969 (0.0030) [2024-04-27 15:46:44,107][52031] Fps is (10 sec: 49151.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6815350784. Throughput: 0: 53481.5. Samples: 1305827500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:44,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 15:46:44,564][52263] Updated weights for policy 0, policy_version 415979 (0.0037) [2024-04-27 15:46:47,374][52263] Updated weights for policy 0, policy_version 415989 (0.0031) [2024-04-27 15:46:49,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6815629312. Throughput: 0: 53459.3. Samples: 1306147820. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:49,107][52031] Avg episode reward: [(0, '0.495')] [2024-04-27 15:46:50,618][52263] Updated weights for policy 0, policy_version 415999 (0.0029) [2024-04-27 15:46:53,543][52263] Updated weights for policy 0, policy_version 416009 (0.0029) [2024-04-27 15:46:54,107][52031] Fps is (10 sec: 57344.6, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6815924224. Throughput: 0: 53592.3. Samples: 1306472860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:54,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 15:46:56,780][52263] Updated weights for policy 0, policy_version 416019 (0.0030) [2024-04-27 15:46:59,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6816169984. Throughput: 0: 53816.7. Samples: 1306639600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-27 15:46:59,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 15:46:59,663][52263] Updated weights for policy 0, policy_version 416029 (0.0025) [2024-04-27 15:47:03,002][52263] Updated weights for policy 0, policy_version 416039 (0.0029) [2024-04-27 15:47:03,631][52242] Signal inference workers to stop experience collection... (19850 times) [2024-04-27 15:47:03,678][52263] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-04-27 15:47:03,691][52242] Signal inference workers to resume experience collection... (19850 times) [2024-04-27 15:47:03,698][52263] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-04-27 15:47:04,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.2, 300 sec: 53761.7). Total num frames: 6816448512. Throughput: 0: 53727.8. Samples: 1306958440. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:04,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 15:47:05,735][52263] Updated weights for policy 0, policy_version 416049 (0.0030) [2024-04-27 15:47:09,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6816694272. Throughput: 0: 53847.6. Samples: 1307280380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:09,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 15:47:09,188][52263] Updated weights for policy 0, policy_version 416059 (0.0026) [2024-04-27 15:47:11,876][52263] Updated weights for policy 0, policy_version 416069 (0.0033) [2024-04-27 15:47:14,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6816956416. Throughput: 0: 53535.1. Samples: 1307433220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:14,107][52031] Avg episode reward: [(0, '0.472')] [2024-04-27 15:47:15,172][52263] Updated weights for policy 0, policy_version 416079 (0.0028) [2024-04-27 15:47:17,894][52263] Updated weights for policy 0, policy_version 416089 (0.0031) [2024-04-27 15:47:19,107][52031] Fps is (10 sec: 55704.8, 60 sec: 54067.0, 300 sec: 53595.1). Total num frames: 6817251328. Throughput: 0: 53542.8. Samples: 1307755980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:19,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 15:47:21,263][52263] Updated weights for policy 0, policy_version 416099 (0.0027) [2024-04-27 15:47:24,038][52263] Updated weights for policy 0, policy_version 416109 (0.0031) [2024-04-27 15:47:24,106][52031] Fps is (10 sec: 57344.9, 60 sec: 54067.2, 300 sec: 53761.8). Total num frames: 6817529856. Throughput: 0: 53570.3. Samples: 1308081280. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:24,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 15:47:27,508][52263] Updated weights for policy 0, policy_version 416119 (0.0028) [2024-04-27 15:47:29,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6817775616. Throughput: 0: 53729.2. Samples: 1308245300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:29,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 15:47:30,116][52263] Updated weights for policy 0, policy_version 416129 (0.0030) [2024-04-27 15:47:33,447][52263] Updated weights for policy 0, policy_version 416139 (0.0031) [2024-04-27 15:47:34,106][52031] Fps is (10 sec: 50790.2, 60 sec: 52975.1, 300 sec: 53650.7). Total num frames: 6818037760. Throughput: 0: 53727.2. Samples: 1308565540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:34,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 15:47:34,273][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000416142_6818070528.pth... [2024-04-27 15:47:34,316][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000415355_6805176320.pth [2024-04-27 15:47:36,357][52263] Updated weights for policy 0, policy_version 416149 (0.0029) [2024-04-27 15:47:39,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6818299904. Throughput: 0: 53638.3. Samples: 1308886580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:39,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 15:47:39,603][52263] Updated weights for policy 0, policy_version 416159 (0.0036) [2024-04-27 15:47:42,428][52263] Updated weights for policy 0, policy_version 416169 (0.0027) [2024-04-27 15:47:44,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6818578432. Throughput: 0: 53476.2. Samples: 1309046040. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:44,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 15:47:45,660][52263] Updated weights for policy 0, policy_version 416179 (0.0027) [2024-04-27 15:47:48,494][52263] Updated weights for policy 0, policy_version 416189 (0.0029) [2024-04-27 15:47:49,106][52031] Fps is (10 sec: 57343.9, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6818873344. Throughput: 0: 53648.5. Samples: 1309372620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:49,107][52031] Avg episode reward: [(0, '0.717')] [2024-04-27 15:47:51,723][52263] Updated weights for policy 0, policy_version 416199 (0.0030) [2024-04-27 15:47:54,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53248.2, 300 sec: 53650.7). Total num frames: 6819119104. Throughput: 0: 53568.9. Samples: 1309690980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:54,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 15:47:54,596][52263] Updated weights for policy 0, policy_version 416209 (0.0027) [2024-04-27 15:47:58,001][52263] Updated weights for policy 0, policy_version 416219 (0.0026) [2024-04-27 15:47:59,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6819397632. Throughput: 0: 53750.7. Samples: 1309852000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:47:59,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 15:48:00,643][52263] Updated weights for policy 0, policy_version 416229 (0.0033) [2024-04-27 15:48:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6819643392. Throughput: 0: 53678.4. Samples: 1310171500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:48:04,107][52031] Avg episode reward: [(0, '0.665')] [2024-04-27 15:48:04,162][52263] Updated weights for policy 0, policy_version 416239 (0.0035) [2024-04-27 15:48:04,796][52242] Signal inference workers to stop experience collection... 
(19900 times) [2024-04-27 15:48:04,841][52263] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-04-27 15:48:04,894][52242] Signal inference workers to resume experience collection... (19900 times) [2024-04-27 15:48:04,894][52263] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-04-27 15:48:06,657][52263] Updated weights for policy 0, policy_version 416249 (0.0030) [2024-04-27 15:48:09,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6819905536. Throughput: 0: 53628.7. Samples: 1310494580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:48:09,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 15:48:10,194][52263] Updated weights for policy 0, policy_version 416259 (0.0027) [2024-04-27 15:48:12,904][52263] Updated weights for policy 0, policy_version 416269 (0.0029) [2024-04-27 15:48:14,106][52031] Fps is (10 sec: 55705.2, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6820200448. Throughput: 0: 53535.5. Samples: 1310654400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:48:14,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:48:16,396][52263] Updated weights for policy 0, policy_version 416279 (0.0027) [2024-04-27 15:48:19,096][52263] Updated weights for policy 0, policy_version 416289 (0.0033) [2024-04-27 15:48:19,106][52031] Fps is (10 sec: 57345.2, 60 sec: 53794.3, 300 sec: 53761.8). Total num frames: 6820478976. Throughput: 0: 53569.9. Samples: 1310976180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 15:48:19,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 15:48:22,928][52263] Updated weights for policy 0, policy_version 416299 (0.0025) [2024-04-27 15:48:24,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6820741120. Throughput: 0: 53596.0. Samples: 1311298400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:48:24,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 15:48:25,325][52263] Updated weights for policy 0, policy_version 416309 (0.0028) [2024-04-27 15:48:29,035][52263] Updated weights for policy 0, policy_version 416319 (0.0032) [2024-04-27 15:48:29,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6820970496. Throughput: 0: 53390.9. Samples: 1311448620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:48:29,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 15:48:31,285][52263] Updated weights for policy 0, policy_version 416329 (0.0029) [2024-04-27 15:48:34,106][52031] Fps is (10 sec: 50790.1, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6821249024. Throughput: 0: 53302.7. Samples: 1311771240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:48:34,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 15:48:34,917][52263] Updated weights for policy 0, policy_version 416339 (0.0026) [2024-04-27 15:48:37,301][52263] Updated weights for policy 0, policy_version 416349 (0.0035) [2024-04-27 15:48:39,107][52031] Fps is (10 sec: 57343.0, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6821543936. Throughput: 0: 53329.6. Samples: 1312090820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:48:39,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 15:48:41,018][52263] Updated weights for policy 0, policy_version 416359 (0.0033) [2024-04-27 15:48:43,581][52263] Updated weights for policy 0, policy_version 416369 (0.0034) [2024-04-27 15:48:44,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6821806080. Throughput: 0: 53489.2. Samples: 1312259020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:48:44,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 15:48:47,171][52263] Updated weights for policy 0, policy_version 416379 (0.0027) [2024-04-27 15:48:49,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 6822068224. Throughput: 0: 53558.6. Samples: 1312581640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:48:49,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 15:48:49,676][52263] Updated weights for policy 0, policy_version 416389 (0.0030) [2024-04-27 15:48:53,476][52263] Updated weights for policy 0, policy_version 416399 (0.0032) [2024-04-27 15:48:54,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6822346752. Throughput: 0: 53684.5. Samples: 1312910380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:48:54,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 15:48:55,752][52263] Updated weights for policy 0, policy_version 416409 (0.0028) [2024-04-27 15:48:59,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6822592512. Throughput: 0: 53403.0. Samples: 1313057540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:48:59,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 15:48:59,429][52263] Updated weights for policy 0, policy_version 416419 (0.0028) [2024-04-27 15:49:01,761][52263] Updated weights for policy 0, policy_version 416429 (0.0028) [2024-04-27 15:49:04,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6822854656. Throughput: 0: 53387.0. Samples: 1313378600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:49:04,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 15:49:05,442][52263] Updated weights for policy 0, policy_version 416439 (0.0026) [2024-04-27 15:49:07,878][52263] Updated weights for policy 0, policy_version 416449 (0.0027) [2024-04-27 15:49:09,106][52031] Fps is (10 sec: 57344.8, 60 sec: 54340.4, 300 sec: 53650.7). Total num frames: 6823165952. Throughput: 0: 53344.9. Samples: 1313698920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:49:09,107][52031] Avg episode reward: [(0, '0.663')] [2024-04-27 15:49:11,621][52263] Updated weights for policy 0, policy_version 416459 (0.0033) [2024-04-27 15:49:11,873][52242] Signal inference workers to stop experience collection... (19950 times) [2024-04-27 15:49:11,874][52242] Signal inference workers to resume experience collection... (19950 times) [2024-04-27 15:49:11,887][52263] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-04-27 15:49:11,905][52263] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-04-27 15:49:14,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6823411712. Throughput: 0: 53791.4. Samples: 1313869240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:49:14,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 15:49:14,188][52263] Updated weights for policy 0, policy_version 416469 (0.0030) [2024-04-27 15:49:17,765][52263] Updated weights for policy 0, policy_version 416479 (0.0029) [2024-04-27 15:49:19,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53247.9, 300 sec: 53595.2). Total num frames: 6823673856. Throughput: 0: 53721.5. Samples: 1314188700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:49:19,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 15:49:20,250][52263] Updated weights for policy 0, policy_version 416489 (0.0028) [2024-04-27 15:49:23,828][52263] Updated weights for policy 0, policy_version 416499 (0.0028) [2024-04-27 15:49:24,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.0, 300 sec: 53595.2). Total num frames: 6823936000. Throughput: 0: 53752.1. Samples: 1314509660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:49:24,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 15:49:26,299][52263] Updated weights for policy 0, policy_version 416509 (0.0029) [2024-04-27 15:49:29,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 6824198144. Throughput: 0: 53424.9. Samples: 1314663140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:49:29,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 15:49:29,898][52263] Updated weights for policy 0, policy_version 416519 (0.0033) [2024-04-27 15:49:32,587][52263] Updated weights for policy 0, policy_version 416529 (0.0032) [2024-04-27 15:49:34,106][52031] Fps is (10 sec: 55705.4, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6824493056. Throughput: 0: 53416.8. Samples: 1314985400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:49:34,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 15:49:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000416534_6824493056.pth... [2024-04-27 15:49:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000415749_6811631616.pth [2024-04-27 15:49:35,897][52263] Updated weights for policy 0, policy_version 416539 (0.0031) [2024-04-27 15:49:38,713][52263] Updated weights for policy 0, policy_version 416549 (0.0024) [2024-04-27 15:49:39,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53521.2, 300 sec: 53595.1). 
Total num frames: 6824755200. Throughput: 0: 53231.7. Samples: 1315305800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:49:39,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:49:42,258][52263] Updated weights for policy 0, policy_version 416559 (0.0036) [2024-04-27 15:49:44,107][52031] Fps is (10 sec: 49151.9, 60 sec: 52975.0, 300 sec: 53595.1). Total num frames: 6824984576. Throughput: 0: 53638.3. Samples: 1315471260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-27 15:49:44,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 15:49:44,793][52263] Updated weights for policy 0, policy_version 416569 (0.0033) [2024-04-27 15:49:48,209][52263] Updated weights for policy 0, policy_version 416579 (0.0030) [2024-04-27 15:49:49,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6825279488. Throughput: 0: 53587.8. Samples: 1315790060. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:49:49,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 15:49:51,059][52263] Updated weights for policy 0, policy_version 416589 (0.0032) [2024-04-27 15:49:54,107][52031] Fps is (10 sec: 55704.3, 60 sec: 53247.8, 300 sec: 53595.1). Total num frames: 6825541632. Throughput: 0: 53660.5. Samples: 1316113660. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:49:54,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 15:49:54,297][52263] Updated weights for policy 0, policy_version 416599 (0.0035) [2024-04-27 15:49:57,164][52263] Updated weights for policy 0, policy_version 416609 (0.0032) [2024-04-27 15:49:57,725][52242] Signal inference workers to stop experience collection... (20000 times) [2024-04-27 15:49:57,725][52242] Signal inference workers to resume experience collection... 
(20000 times) [2024-04-27 15:49:57,737][52263] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-04-27 15:49:57,755][52263] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-04-27 15:49:59,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6825803776. Throughput: 0: 53341.8. Samples: 1316269620. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:49:59,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 15:50:00,353][52263] Updated weights for policy 0, policy_version 416619 (0.0027) [2024-04-27 15:50:03,215][52263] Updated weights for policy 0, policy_version 416629 (0.0035) [2024-04-27 15:50:04,107][52031] Fps is (10 sec: 57344.9, 60 sec: 54340.1, 300 sec: 53650.6). Total num frames: 6826115072. Throughput: 0: 53483.8. Samples: 1316595480. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:04,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 15:50:06,450][52263] Updated weights for policy 0, policy_version 416639 (0.0028) [2024-04-27 15:50:09,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53247.9, 300 sec: 53650.7). Total num frames: 6826360832. Throughput: 0: 53465.2. Samples: 1316915600. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:09,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 15:50:09,231][52263] Updated weights for policy 0, policy_version 416649 (0.0033) [2024-04-27 15:50:12,453][52263] Updated weights for policy 0, policy_version 416659 (0.0028) [2024-04-27 15:50:14,106][52031] Fps is (10 sec: 47514.6, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6826590208. Throughput: 0: 53727.8. Samples: 1317080880. 
Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:14,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 15:50:15,406][52263] Updated weights for policy 0, policy_version 416669 (0.0035) [2024-04-27 15:50:18,640][52263] Updated weights for policy 0, policy_version 416679 (0.0030) [2024-04-27 15:50:19,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 6826901504. Throughput: 0: 53566.0. Samples: 1317395880. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:19,108][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:50:21,390][52263] Updated weights for policy 0, policy_version 416689 (0.0027) [2024-04-27 15:50:24,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6827147264. Throughput: 0: 53637.3. Samples: 1317719480. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:24,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 15:50:24,709][52263] Updated weights for policy 0, policy_version 416699 (0.0029) [2024-04-27 15:50:27,490][52263] Updated weights for policy 0, policy_version 416709 (0.0028) [2024-04-27 15:50:29,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.1, 300 sec: 53539.5). Total num frames: 6827425792. Throughput: 0: 53609.2. Samples: 1317883680. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:29,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 15:50:30,953][52263] Updated weights for policy 0, policy_version 416719 (0.0031) [2024-04-27 15:50:33,666][52263] Updated weights for policy 0, policy_version 416729 (0.0028) [2024-04-27 15:50:34,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6827704320. Throughput: 0: 53567.2. Samples: 1318200580. 
Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:34,107][52031] Avg episode reward: [(0, '0.465')] [2024-04-27 15:50:37,060][52263] Updated weights for policy 0, policy_version 416739 (0.0028) [2024-04-27 15:50:39,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6827966464. Throughput: 0: 53467.1. Samples: 1318519660. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:39,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 15:50:39,756][52263] Updated weights for policy 0, policy_version 416749 (0.0027) [2024-04-27 15:50:43,216][52263] Updated weights for policy 0, policy_version 416759 (0.0026) [2024-04-27 15:50:44,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6828212224. Throughput: 0: 53604.0. Samples: 1318681800. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:44,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 15:50:45,896][52263] Updated weights for policy 0, policy_version 416769 (0.0030) [2024-04-27 15:50:49,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6828490752. Throughput: 0: 53487.2. Samples: 1319002400. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:49,107][52031] Avg episode reward: [(0, '0.662')] [2024-04-27 15:50:49,252][52263] Updated weights for policy 0, policy_version 416779 (0.0031) [2024-04-27 15:50:52,051][52242] Signal inference workers to stop experience collection... (20050 times) [2024-04-27 15:50:52,053][52242] Signal inference workers to resume experience collection... 
(20050 times) [2024-04-27 15:50:52,064][52263] Updated weights for policy 0, policy_version 416789 (0.0025) [2024-04-27 15:50:52,088][52263] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-04-27 15:50:52,088][52263] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-04-27 15:50:54,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6828752896. Throughput: 0: 53523.1. Samples: 1319324140. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:54,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 15:50:55,361][52263] Updated weights for policy 0, policy_version 416799 (0.0029) [2024-04-27 15:50:58,029][52263] Updated weights for policy 0, policy_version 416809 (0.0028) [2024-04-27 15:50:59,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6829015040. Throughput: 0: 53441.2. Samples: 1319485740. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:50:59,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:51:01,376][52263] Updated weights for policy 0, policy_version 416819 (0.0032) [2024-04-27 15:51:04,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 6829309952. Throughput: 0: 53670.1. Samples: 1319811020. Policy #0 lag: (min: 2.0, avg: 11.2, max: 24.0) [2024-04-27 15:51:04,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 15:51:04,145][52263] Updated weights for policy 0, policy_version 416829 (0.0031) [2024-04-27 15:51:07,550][52263] Updated weights for policy 0, policy_version 416839 (0.0026) [2024-04-27 15:51:09,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6829572096. Throughput: 0: 53640.9. Samples: 1320133320. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:09,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 15:51:10,377][52263] Updated weights for policy 0, policy_version 416849 (0.0030) [2024-04-27 15:51:13,716][52263] Updated weights for policy 0, policy_version 416859 (0.0031) [2024-04-27 15:51:14,107][52031] Fps is (10 sec: 52427.4, 60 sec: 54066.9, 300 sec: 53650.6). Total num frames: 6829834240. Throughput: 0: 53642.6. Samples: 1320297600. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:14,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:51:16,313][52263] Updated weights for policy 0, policy_version 416869 (0.0029) [2024-04-27 15:51:19,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.2, 300 sec: 53595.1). Total num frames: 6830096384. Throughput: 0: 53777.4. Samples: 1320620560. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:19,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 15:51:19,641][52263] Updated weights for policy 0, policy_version 416879 (0.0032) [2024-04-27 15:51:22,492][52263] Updated weights for policy 0, policy_version 416889 (0.0030) [2024-04-27 15:51:24,106][52031] Fps is (10 sec: 54068.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6830374912. Throughput: 0: 53858.7. Samples: 1320943300. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:24,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 15:51:25,824][52263] Updated weights for policy 0, policy_version 416899 (0.0033) [2024-04-27 15:51:28,701][52263] Updated weights for policy 0, policy_version 416909 (0.0035) [2024-04-27 15:51:29,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.3, 300 sec: 53484.1). Total num frames: 6830637056. Throughput: 0: 53725.4. Samples: 1321099440. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:29,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 15:51:32,008][52263] Updated weights for policy 0, policy_version 416919 (0.0028) [2024-04-27 15:51:34,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6830915584. Throughput: 0: 53712.4. Samples: 1321419460. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:34,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 15:51:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000416926_6830915584.pth... [2024-04-27 15:51:34,167][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000416142_6818070528.pth [2024-04-27 15:51:34,813][52263] Updated weights for policy 0, policy_version 416929 (0.0031) [2024-04-27 15:51:37,923][52263] Updated weights for policy 0, policy_version 416939 (0.0029) [2024-04-27 15:51:39,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6831194112. Throughput: 0: 53705.0. Samples: 1321740860. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:39,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 15:51:41,012][52263] Updated weights for policy 0, policy_version 416949 (0.0027) [2024-04-27 15:51:43,925][52263] Updated weights for policy 0, policy_version 416959 (0.0034) [2024-04-27 15:51:44,107][52031] Fps is (10 sec: 54067.0, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 6831456256. Throughput: 0: 53891.0. Samples: 1321910840. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:44,116][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 15:51:47,136][52263] Updated weights for policy 0, policy_version 416969 (0.0028) [2024-04-27 15:51:49,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6831702016. Throughput: 0: 53812.9. Samples: 1322232600. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:49,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 15:51:50,290][52263] Updated weights for policy 0, policy_version 416979 (0.0031) [2024-04-27 15:51:53,199][52263] Updated weights for policy 0, policy_version 416989 (0.0031) [2024-04-27 15:51:54,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6831980544. Throughput: 0: 53728.4. Samples: 1322551100. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:54,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 15:51:56,316][52263] Updated weights for policy 0, policy_version 416999 (0.0033) [2024-04-27 15:51:59,106][52031] Fps is (10 sec: 55705.4, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6832259072. Throughput: 0: 53591.4. Samples: 1322709200. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:51:59,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 15:51:59,205][52263] Updated weights for policy 0, policy_version 417009 (0.0032) [2024-04-27 15:52:02,435][52263] Updated weights for policy 0, policy_version 417019 (0.0031) [2024-04-27 15:52:04,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6832521216. Throughput: 0: 53577.8. Samples: 1323031560. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:52:04,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:52:05,344][52263] Updated weights for policy 0, policy_version 417029 (0.0028) [2024-04-27 15:52:08,507][52263] Updated weights for policy 0, policy_version 417039 (0.0028) [2024-04-27 15:52:09,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6832799744. Throughput: 0: 53596.5. Samples: 1323355140. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:52:09,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 15:52:11,570][52263] Updated weights for policy 0, policy_version 417049 (0.0033) [2024-04-27 15:52:14,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.3, 300 sec: 53539.6). Total num frames: 6833045504. Throughput: 0: 53755.5. Samples: 1323518440. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:52:14,107][52031] Avg episode reward: [(0, '0.641')] [2024-04-27 15:52:14,590][52263] Updated weights for policy 0, policy_version 417059 (0.0023) [2024-04-27 15:52:17,202][52242] Signal inference workers to stop experience collection... (20100 times) [2024-04-27 15:52:17,235][52263] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-04-27 15:52:17,303][52242] Signal inference workers to resume experience collection... (20100 times) [2024-04-27 15:52:17,303][52263] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-04-27 15:52:17,563][52263] Updated weights for policy 0, policy_version 417069 (0.0032) [2024-04-27 15:52:19,107][52031] Fps is (10 sec: 52427.7, 60 sec: 53794.0, 300 sec: 53539.5). Total num frames: 6833324032. Throughput: 0: 53791.1. Samples: 1323840060. Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:52:19,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 15:52:20,653][52263] Updated weights for policy 0, policy_version 417079 (0.0029) [2024-04-27 15:52:23,690][52263] Updated weights for policy 0, policy_version 417089 (0.0032) [2024-04-27 15:52:24,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6833602560. Throughput: 0: 53787.7. Samples: 1324161300. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 21.0) [2024-04-27 15:52:24,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 15:52:26,684][52263] Updated weights for policy 0, policy_version 417099 (0.0039) [2024-04-27 15:52:29,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6833864704. Throughput: 0: 53531.1. Samples: 1324319740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:52:29,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 15:52:29,846][52263] Updated weights for policy 0, policy_version 417109 (0.0028) [2024-04-27 15:52:32,766][52263] Updated weights for policy 0, policy_version 417119 (0.0028) [2024-04-27 15:52:34,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6834143232. Throughput: 0: 53584.3. Samples: 1324643900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:52:34,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 15:52:35,944][52263] Updated weights for policy 0, policy_version 417129 (0.0033) [2024-04-27 15:52:39,027][52263] Updated weights for policy 0, policy_version 417139 (0.0031) [2024-04-27 15:52:39,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6834405376. Throughput: 0: 53668.8. Samples: 1324966200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:52:39,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 15:52:41,893][52263] Updated weights for policy 0, policy_version 417149 (0.0025) [2024-04-27 15:52:44,107][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6834651136. Throughput: 0: 53853.2. Samples: 1325132600. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:52:44,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 15:52:45,057][52263] Updated weights for policy 0, policy_version 417159 (0.0024) [2024-04-27 15:52:47,945][52263] Updated weights for policy 0, policy_version 417169 (0.0027) [2024-04-27 15:52:49,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6834929664. Throughput: 0: 53853.6. Samples: 1325454980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:52:49,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 15:52:51,000][52263] Updated weights for policy 0, policy_version 417179 (0.0029) [2024-04-27 15:52:54,106][52263] Updated weights for policy 0, policy_version 417189 (0.0026) [2024-04-27 15:52:54,107][52031] Fps is (10 sec: 57343.7, 60 sec: 54067.0, 300 sec: 53650.6). Total num frames: 6835224576. Throughput: 0: 53764.6. Samples: 1325774560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:52:54,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 15:52:57,187][52263] Updated weights for policy 0, policy_version 417199 (0.0027) [2024-04-27 15:52:59,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6835486720. Throughput: 0: 53695.6. Samples: 1325934740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:52:59,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 15:53:00,098][52263] Updated weights for policy 0, policy_version 417209 (0.0026) [2024-04-27 15:53:03,203][52263] Updated weights for policy 0, policy_version 417219 (0.0031) [2024-04-27 15:53:04,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6835748864. Throughput: 0: 53675.2. Samples: 1326255440. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:04,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 15:53:06,361][52263] Updated weights for policy 0, policy_version 417229 (0.0029) [2024-04-27 15:53:09,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53793.9, 300 sec: 53650.6). Total num frames: 6836027392. Throughput: 0: 53822.8. Samples: 1326583340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:09,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 15:53:09,290][52263] Updated weights for policy 0, policy_version 417239 (0.0035) [2024-04-27 15:53:11,976][52242] Signal inference workers to stop experience collection... (20150 times) [2024-04-27 15:53:11,998][52263] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-04-27 15:53:12,042][52242] Signal inference workers to resume experience collection... (20150 times) [2024-04-27 15:53:12,042][52263] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-04-27 15:53:12,312][52263] Updated weights for policy 0, policy_version 417249 (0.0026) [2024-04-27 15:53:14,107][52031] Fps is (10 sec: 54066.8, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6836289536. Throughput: 0: 53749.8. Samples: 1326738480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:14,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 15:53:15,338][52263] Updated weights for policy 0, policy_version 417259 (0.0025) [2024-04-27 15:53:18,317][52263] Updated weights for policy 0, policy_version 417269 (0.0030) [2024-04-27 15:53:19,107][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6836535296. Throughput: 0: 53860.5. Samples: 1327067620. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:19,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 15:53:21,299][52263] Updated weights for policy 0, policy_version 417279 (0.0035) [2024-04-27 15:53:24,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.0, 300 sec: 53761.7). Total num frames: 6836830208. Throughput: 0: 53949.2. Samples: 1327393920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:24,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 15:53:24,700][52263] Updated weights for policy 0, policy_version 417289 (0.0029) [2024-04-27 15:53:27,444][52263] Updated weights for policy 0, policy_version 417299 (0.0026) [2024-04-27 15:53:29,107][52031] Fps is (10 sec: 57343.5, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 6837108736. Throughput: 0: 53841.7. Samples: 1327555480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:29,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 15:53:30,746][52263] Updated weights for policy 0, policy_version 417309 (0.0033) [2024-04-27 15:53:33,601][52263] Updated weights for policy 0, policy_version 417319 (0.0030) [2024-04-27 15:53:34,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6837354496. Throughput: 0: 53747.7. Samples: 1327873620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:34,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 15:53:34,176][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000417320_6837370880.pth... [2024-04-27 15:53:34,222][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000416534_6824493056.pth [2024-04-27 15:53:36,747][52263] Updated weights for policy 0, policy_version 417329 (0.0029) [2024-04-27 15:53:39,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6837633024. Throughput: 0: 53899.3. Samples: 1328200020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:39,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 15:53:39,670][52263] Updated weights for policy 0, policy_version 417339 (0.0033) [2024-04-27 15:53:42,939][52263] Updated weights for policy 0, policy_version 417349 (0.0031) [2024-04-27 15:53:44,107][52031] Fps is (10 sec: 55705.2, 60 sec: 54340.3, 300 sec: 53706.2). Total num frames: 6837911552. Throughput: 0: 53896.3. Samples: 1328360080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:44,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 15:53:45,836][52263] Updated weights for policy 0, policy_version 417359 (0.0029) [2024-04-27 15:53:49,063][52263] Updated weights for policy 0, policy_version 417369 (0.0029) [2024-04-27 15:53:49,107][52031] Fps is (10 sec: 54067.0, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6838173696. Throughput: 0: 53976.9. Samples: 1328684400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-27 15:53:49,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 15:53:52,079][52263] Updated weights for policy 0, policy_version 417379 (0.0028) [2024-04-27 15:53:54,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6838452224. Throughput: 0: 53803.2. Samples: 1329004480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:53:54,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 15:53:55,412][52263] Updated weights for policy 0, policy_version 417389 (0.0027) [2024-04-27 15:53:58,111][52263] Updated weights for policy 0, policy_version 417399 (0.0036) [2024-04-27 15:53:59,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6838714368. Throughput: 0: 53908.5. Samples: 1329164360. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:53:59,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 15:54:01,363][52263] Updated weights for policy 0, policy_version 417409 (0.0032) [2024-04-27 15:54:04,065][52263] Updated weights for policy 0, policy_version 417419 (0.0029) [2024-04-27 15:54:04,107][52031] Fps is (10 sec: 54067.7, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 6838992896. Throughput: 0: 53894.7. Samples: 1329492880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:04,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 15:54:07,376][52263] Updated weights for policy 0, policy_version 417429 (0.0027) [2024-04-27 15:54:09,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6839238656. Throughput: 0: 53828.3. Samples: 1329816200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:09,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 15:54:10,049][52263] Updated weights for policy 0, policy_version 417439 (0.0033) [2024-04-27 15:54:13,486][52263] Updated weights for policy 0, policy_version 417449 (0.0029) [2024-04-27 15:54:13,977][52242] Signal inference workers to stop experience collection... (20200 times) [2024-04-27 15:54:13,997][52263] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-04-27 15:54:14,078][52242] Signal inference workers to resume experience collection... (20200 times) [2024-04-27 15:54:14,078][52263] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-04-27 15:54:14,107][52031] Fps is (10 sec: 54066.8, 60 sec: 54067.2, 300 sec: 53761.7). Total num frames: 6839533568. Throughput: 0: 53814.7. Samples: 1329977140. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:14,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 15:54:16,177][52263] Updated weights for policy 0, policy_version 417459 (0.0031) [2024-04-27 15:54:19,106][52031] Fps is (10 sec: 52430.1, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6839762944. Throughput: 0: 53820.5. Samples: 1330295540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:19,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 15:54:19,654][52263] Updated weights for policy 0, policy_version 417469 (0.0031) [2024-04-27 15:54:22,241][52263] Updated weights for policy 0, policy_version 417479 (0.0036) [2024-04-27 15:54:24,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.2, 300 sec: 53761.8). Total num frames: 6840057856. Throughput: 0: 53907.5. Samples: 1330625860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:24,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 15:54:25,676][52263] Updated weights for policy 0, policy_version 417489 (0.0031) [2024-04-27 15:54:28,318][52263] Updated weights for policy 0, policy_version 417499 (0.0036) [2024-04-27 15:54:29,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6840320000. Throughput: 0: 53733.0. Samples: 1330778060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:29,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 15:54:31,786][52263] Updated weights for policy 0, policy_version 417509 (0.0029) [2024-04-27 15:54:34,106][52031] Fps is (10 sec: 55705.8, 60 sec: 54340.3, 300 sec: 53761.7). Total num frames: 6840614912. Throughput: 0: 53696.5. Samples: 1331100740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:34,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 15:54:34,345][52263] Updated weights for policy 0, policy_version 417519 (0.0031) [2024-04-27 15:54:37,862][52263] Updated weights for policy 0, policy_version 417529 (0.0027) [2024-04-27 15:54:39,106][52031] Fps is (10 sec: 55705.8, 60 sec: 54067.2, 300 sec: 53872.8). Total num frames: 6840877056. Throughput: 0: 53750.0. Samples: 1331423220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:39,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 15:54:40,460][52263] Updated weights for policy 0, policy_version 417539 (0.0031) [2024-04-27 15:54:43,912][52263] Updated weights for policy 0, policy_version 417549 (0.0031) [2024-04-27 15:54:44,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 6841122816. Throughput: 0: 53886.4. Samples: 1331589240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:44,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 15:54:46,594][52263] Updated weights for policy 0, policy_version 417559 (0.0033) [2024-04-27 15:54:49,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53521.2, 300 sec: 53706.3). Total num frames: 6841384960. Throughput: 0: 53872.6. Samples: 1331917140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:49,107][52031] Avg episode reward: [(0, '0.458')] [2024-04-27 15:54:50,009][52263] Updated weights for policy 0, policy_version 417569 (0.0029) [2024-04-27 15:54:52,756][52263] Updated weights for policy 0, policy_version 417579 (0.0025) [2024-04-27 15:54:54,106][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.2, 300 sec: 53817.3). Total num frames: 6841679872. Throughput: 0: 53868.2. Samples: 1332240260. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:54,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 15:54:56,089][52263] Updated weights for policy 0, policy_version 417589 (0.0029) [2024-04-27 15:54:58,687][52263] Updated weights for policy 0, policy_version 417599 (0.0030) [2024-04-27 15:54:59,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6841942016. Throughput: 0: 53779.6. Samples: 1332397220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:54:59,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:55:02,200][52263] Updated weights for policy 0, policy_version 417609 (0.0031) [2024-04-27 15:55:04,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6842220544. Throughput: 0: 53774.9. Samples: 1332715420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:55:04,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 15:55:05,054][52263] Updated weights for policy 0, policy_version 417619 (0.0029) [2024-04-27 15:55:05,497][52242] Signal inference workers to stop experience collection... (20250 times) [2024-04-27 15:55:05,498][52242] Signal inference workers to resume experience collection... (20250 times) [2024-04-27 15:55:05,519][52263] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-04-27 15:55:05,519][52263] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-04-27 15:55:08,185][52263] Updated weights for policy 0, policy_version 417629 (0.0034) [2024-04-27 15:55:09,106][52031] Fps is (10 sec: 55706.0, 60 sec: 54340.4, 300 sec: 53928.3). Total num frames: 6842499072. Throughput: 0: 53604.9. Samples: 1333038080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 15:55:09,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 15:55:11,197][52263] Updated weights for policy 0, policy_version 417639 (0.0031) [2024-04-27 15:55:14,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 6842744832. Throughput: 0: 53872.5. Samples: 1333202320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:14,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 15:55:14,282][52263] Updated weights for policy 0, policy_version 417649 (0.0033) [2024-04-27 15:55:17,234][52263] Updated weights for policy 0, policy_version 417659 (0.0032) [2024-04-27 15:55:19,107][52031] Fps is (10 sec: 50790.2, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 6843006976. Throughput: 0: 53827.5. Samples: 1333522980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:19,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 15:55:20,387][52263] Updated weights for policy 0, policy_version 417669 (0.0027) [2024-04-27 15:55:23,193][52263] Updated weights for policy 0, policy_version 417679 (0.0035) [2024-04-27 15:55:24,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.2, 300 sec: 53761.8). Total num frames: 6843285504. Throughput: 0: 53790.7. Samples: 1333843800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:24,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 15:55:26,487][52263] Updated weights for policy 0, policy_version 417689 (0.0029) [2024-04-27 15:55:29,106][52031] Fps is (10 sec: 55706.1, 60 sec: 54067.2, 300 sec: 53761.7). Total num frames: 6843564032. Throughput: 0: 53729.7. Samples: 1334007080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:29,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 15:55:29,411][52263] Updated weights for policy 0, policy_version 417699 (0.0038) [2024-04-27 15:55:32,451][52263] Updated weights for policy 0, policy_version 417709 (0.0029) [2024-04-27 15:55:34,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 6843809792. Throughput: 0: 53494.7. Samples: 1334324400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:34,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 15:55:34,156][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000417714_6843826176.pth... [2024-04-27 15:55:34,194][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000416926_6830915584.pth [2024-04-27 15:55:35,487][52263] Updated weights for policy 0, policy_version 417719 (0.0033) [2024-04-27 15:55:38,732][52263] Updated weights for policy 0, policy_version 417729 (0.0030) [2024-04-27 15:55:39,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53817.3). Total num frames: 6844088320. Throughput: 0: 53471.6. Samples: 1334646480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:39,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 15:55:41,468][52263] Updated weights for policy 0, policy_version 417739 (0.0030) [2024-04-27 15:55:44,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6844334080. Throughput: 0: 53475.7. Samples: 1334803620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:44,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 15:55:44,899][52263] Updated weights for policy 0, policy_version 417749 (0.0030) [2024-04-27 15:55:47,431][52263] Updated weights for policy 0, policy_version 417759 (0.0029) [2024-04-27 15:55:49,106][52031] Fps is (10 sec: 54066.7, 60 sec: 54067.1, 300 sec: 53817.3). 
Total num frames: 6844628992. Throughput: 0: 53460.5. Samples: 1335121140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:49,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 15:55:50,863][52263] Updated weights for policy 0, policy_version 417769 (0.0028) [2024-04-27 15:55:53,584][52263] Updated weights for policy 0, policy_version 417779 (0.0030) [2024-04-27 15:55:54,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.0, 300 sec: 53817.3). Total num frames: 6844891136. Throughput: 0: 53436.4. Samples: 1335442720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:54,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 15:55:57,081][52263] Updated weights for policy 0, policy_version 417789 (0.0032) [2024-04-27 15:55:59,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 6845136896. Throughput: 0: 53567.5. Samples: 1335612860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:55:59,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 15:56:00,263][52263] Updated weights for policy 0, policy_version 417799 (0.0035) [2024-04-27 15:56:03,226][52263] Updated weights for policy 0, policy_version 417809 (0.0035) [2024-04-27 15:56:04,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 6845415424. Throughput: 0: 53640.1. Samples: 1335936780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:56:04,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 15:56:06,369][52263] Updated weights for policy 0, policy_version 417819 (0.0040) [2024-04-27 15:56:09,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53248.1, 300 sec: 53761.8). Total num frames: 6845693952. Throughput: 0: 53585.9. Samples: 1336255160. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:56:09,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 15:56:09,310][52263] Updated weights for policy 0, policy_version 417829 (0.0026) [2024-04-27 15:56:12,442][52263] Updated weights for policy 0, policy_version 417839 (0.0026) [2024-04-27 15:56:14,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 6845956096. Throughput: 0: 53433.8. Samples: 1336411600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:56:14,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 15:56:15,362][52263] Updated weights for policy 0, policy_version 417849 (0.0027) [2024-04-27 15:56:17,106][52242] Signal inference workers to stop experience collection... (20300 times) [2024-04-27 15:56:17,107][52242] Signal inference workers to resume experience collection... (20300 times) [2024-04-27 15:56:17,125][52263] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-04-27 15:56:17,125][52263] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-04-27 15:56:18,595][52263] Updated weights for policy 0, policy_version 417859 (0.0030) [2024-04-27 15:56:19,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6846234624. Throughput: 0: 53605.1. Samples: 1336736640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:56:19,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 15:56:21,449][52263] Updated weights for policy 0, policy_version 417869 (0.0032) [2024-04-27 15:56:24,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.0, 300 sec: 53817.2). Total num frames: 6846513152. Throughput: 0: 53634.5. Samples: 1337060040. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:56:24,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 15:56:24,642][52263] Updated weights for policy 0, policy_version 417879 (0.0027) [2024-04-27 15:56:27,718][52263] Updated weights for policy 0, policy_version 417889 (0.0029) [2024-04-27 15:56:29,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.0, 300 sec: 53761.8). Total num frames: 6846775296. Throughput: 0: 53820.4. Samples: 1337225540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 15:56:29,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 15:56:30,786][52263] Updated weights for policy 0, policy_version 417899 (0.0037) [2024-04-27 15:56:33,677][52263] Updated weights for policy 0, policy_version 417909 (0.0032) [2024-04-27 15:56:34,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6847037440. Throughput: 0: 53983.6. Samples: 1337550400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:56:34,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 15:56:36,750][52263] Updated weights for policy 0, policy_version 417919 (0.0031) [2024-04-27 15:56:39,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53247.9, 300 sec: 53650.7). Total num frames: 6847283200. Throughput: 0: 53961.8. Samples: 1337871000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:56:39,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 15:56:39,868][52263] Updated weights for policy 0, policy_version 417929 (0.0039) [2024-04-27 15:56:42,731][52263] Updated weights for policy 0, policy_version 417939 (0.0028) [2024-04-27 15:56:44,107][52031] Fps is (10 sec: 54066.3, 60 sec: 54067.0, 300 sec: 53817.2). Total num frames: 6847578112. Throughput: 0: 53733.6. Samples: 1338030880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:56:44,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 15:56:46,090][52263] Updated weights for policy 0, policy_version 417949 (0.0031) [2024-04-27 15:56:48,866][52263] Updated weights for policy 0, policy_version 417959 (0.0028) [2024-04-27 15:56:49,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.2, 300 sec: 53817.3). Total num frames: 6847856640. Throughput: 0: 53686.1. Samples: 1338352660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:56:49,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 15:56:52,070][52263] Updated weights for policy 0, policy_version 417969 (0.0032) [2024-04-27 15:56:54,107][52031] Fps is (10 sec: 55706.0, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6848135168. Throughput: 0: 53812.6. Samples: 1338676740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:56:54,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 15:56:54,885][52263] Updated weights for policy 0, policy_version 417979 (0.0027) [2024-04-27 15:56:58,180][52263] Updated weights for policy 0, policy_version 417989 (0.0032) [2024-04-27 15:56:59,107][52031] Fps is (10 sec: 52428.6, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 6848380928. Throughput: 0: 53929.7. Samples: 1338838440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:56:59,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:57:00,973][52263] Updated weights for policy 0, policy_version 417999 (0.0028) [2024-04-27 15:57:04,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6848643072. Throughput: 0: 53909.5. Samples: 1339162560. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:04,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 15:57:04,426][52263] Updated weights for policy 0, policy_version 418009 (0.0030) [2024-04-27 15:57:07,246][52263] Updated weights for policy 0, policy_version 418019 (0.0031) [2024-04-27 15:57:09,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53520.8, 300 sec: 53761.7). Total num frames: 6848905216. Throughput: 0: 53787.1. Samples: 1339480460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:09,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 15:57:10,400][52263] Updated weights for policy 0, policy_version 418029 (0.0031) [2024-04-27 15:57:13,146][52263] Updated weights for policy 0, policy_version 418039 (0.0034) [2024-04-27 15:57:14,107][52031] Fps is (10 sec: 55704.6, 60 sec: 54067.0, 300 sec: 53817.3). Total num frames: 6849200128. Throughput: 0: 53673.6. Samples: 1339640860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:14,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 15:57:16,676][52263] Updated weights for policy 0, policy_version 418049 (0.0029) [2024-04-27 15:57:19,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6849462272. Throughput: 0: 53615.1. Samples: 1339963080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:19,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 15:57:19,154][52263] Updated weights for policy 0, policy_version 418059 (0.0033) [2024-04-27 15:57:22,660][52263] Updated weights for policy 0, policy_version 418069 (0.0029) [2024-04-27 15:57:24,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53521.2, 300 sec: 53761.8). Total num frames: 6849724416. Throughput: 0: 53657.9. Samples: 1340285600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:24,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 15:57:24,130][52242] Signal inference workers to stop experience collection... 
(20350 times) [2024-04-27 15:57:24,131][52242] Signal inference workers to resume experience collection... (20350 times) [2024-04-27 15:57:24,165][52263] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-04-27 15:57:24,165][52263] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-04-27 15:57:25,274][52263] Updated weights for policy 0, policy_version 418079 (0.0028) [2024-04-27 15:57:28,833][52263] Updated weights for policy 0, policy_version 418089 (0.0034) [2024-04-27 15:57:29,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53520.9, 300 sec: 53706.2). Total num frames: 6849986560. Throughput: 0: 53656.9. Samples: 1340445440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:29,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 15:57:31,392][52263] Updated weights for policy 0, policy_version 418099 (0.0030) [2024-04-27 15:57:34,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6850248704. Throughput: 0: 53764.7. Samples: 1340772080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:34,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 15:57:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000418106_6850248704.pth... [2024-04-27 15:57:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000417320_6837370880.pth [2024-04-27 15:57:34,844][52263] Updated weights for policy 0, policy_version 418109 (0.0029) [2024-04-27 15:57:37,370][52263] Updated weights for policy 0, policy_version 418119 (0.0032) [2024-04-27 15:57:39,107][52031] Fps is (10 sec: 54067.5, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6850527232. Throughput: 0: 53644.9. Samples: 1341090760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:39,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 15:57:40,875][52263] Updated weights for policy 0, policy_version 418129 (0.0026) [2024-04-27 15:57:43,406][52263] Updated weights for policy 0, policy_version 418139 (0.0034) [2024-04-27 15:57:44,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.2, 300 sec: 53817.3). Total num frames: 6850805760. Throughput: 0: 53794.7. Samples: 1341259200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:44,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 15:57:47,017][52263] Updated weights for policy 0, policy_version 418149 (0.0027) [2024-04-27 15:57:49,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6851084288. Throughput: 0: 53705.2. Samples: 1341579300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 15:57:49,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 15:57:49,673][52263] Updated weights for policy 0, policy_version 418159 (0.0027) [2024-04-27 15:57:53,175][52263] Updated weights for policy 0, policy_version 418169 (0.0027) [2024-04-27 15:57:54,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53248.0, 300 sec: 53706.2). Total num frames: 6851330048. Throughput: 0: 53863.1. Samples: 1341904300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:57:54,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 15:57:55,712][52263] Updated weights for policy 0, policy_version 418179 (0.0034) [2024-04-27 15:57:59,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6851592192. Throughput: 0: 53760.5. Samples: 1342060080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:57:59,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 15:57:59,217][52263] Updated weights for policy 0, policy_version 418189 (0.0029) [2024-04-27 15:58:01,635][52263] Updated weights for policy 0, policy_version 418199 (0.0030) [2024-04-27 15:58:04,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6851854336. Throughput: 0: 53910.2. Samples: 1342389040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:04,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 15:58:05,142][52263] Updated weights for policy 0, policy_version 418209 (0.0029) [2024-04-27 15:58:07,636][52263] Updated weights for policy 0, policy_version 418219 (0.0027) [2024-04-27 15:58:09,107][52031] Fps is (10 sec: 55705.7, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 6852149248. Throughput: 0: 53883.8. Samples: 1342710380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:09,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 15:58:11,534][52263] Updated weights for policy 0, policy_version 418229 (0.0034) [2024-04-27 15:58:13,716][52263] Updated weights for policy 0, policy_version 418239 (0.0028) [2024-04-27 15:58:14,107][52031] Fps is (10 sec: 57343.6, 60 sec: 53794.1, 300 sec: 53872.8). Total num frames: 6852427776. Throughput: 0: 54159.1. Samples: 1342882600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:14,107][52031] Avg episode reward: [(0, '0.684')] [2024-04-27 15:58:17,635][52263] Updated weights for policy 0, policy_version 418249 (0.0034) [2024-04-27 15:58:19,107][52031] Fps is (10 sec: 55705.4, 60 sec: 54067.1, 300 sec: 53817.3). Total num frames: 6852706304. Throughput: 0: 54046.2. Samples: 1343204160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:19,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 15:58:19,911][52263] Updated weights for policy 0, policy_version 418259 (0.0030) [2024-04-27 15:58:23,643][52263] Updated weights for policy 0, policy_version 418269 (0.0035) [2024-04-27 15:58:24,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6852935680. Throughput: 0: 54095.7. Samples: 1343525060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:24,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 15:58:25,851][52263] Updated weights for policy 0, policy_version 418279 (0.0025) [2024-04-27 15:58:29,106][52031] Fps is (10 sec: 49152.7, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 6853197824. Throughput: 0: 53718.8. Samples: 1343676540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:29,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:58:29,856][52263] Updated weights for policy 0, policy_version 418289 (0.0028) [2024-04-27 15:58:32,225][52263] Updated weights for policy 0, policy_version 418299 (0.0032) [2024-04-27 15:58:34,106][52031] Fps is (10 sec: 55705.4, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 6853492736. Throughput: 0: 53797.4. Samples: 1344000180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:34,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 15:58:35,967][52263] Updated weights for policy 0, policy_version 418309 (0.0032) [2024-04-27 15:58:38,437][52263] Updated weights for policy 0, policy_version 418319 (0.0029) [2024-04-27 15:58:39,106][52031] Fps is (10 sec: 57343.8, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 6853771264. Throughput: 0: 53799.3. Samples: 1344325260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:39,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 15:58:40,671][52242] Signal inference workers to stop experience collection... (20400 times) [2024-04-27 15:58:40,706][52263] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-04-27 15:58:40,736][52242] Signal inference workers to resume experience collection... (20400 times) [2024-04-27 15:58:40,736][52263] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-04-27 15:58:41,978][52263] Updated weights for policy 0, policy_version 418329 (0.0028) [2024-04-27 15:58:44,106][52031] Fps is (10 sec: 55706.0, 60 sec: 54067.3, 300 sec: 53817.3). Total num frames: 6854049792. Throughput: 0: 54104.2. Samples: 1344494760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:44,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 15:58:44,608][52263] Updated weights for policy 0, policy_version 418339 (0.0025) [2024-04-27 15:58:48,084][52263] Updated weights for policy 0, policy_version 418349 (0.0026) [2024-04-27 15:58:49,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6854311936. Throughput: 0: 54004.4. Samples: 1344819240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:49,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 15:58:50,529][52263] Updated weights for policy 0, policy_version 418359 (0.0028) [2024-04-27 15:58:54,107][52031] Fps is (10 sec: 49151.4, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 6854541312. Throughput: 0: 53956.0. Samples: 1345138400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:54,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 15:58:54,285][52263] Updated weights for policy 0, policy_version 418369 (0.0039) [2024-04-27 15:58:56,583][52263] Updated weights for policy 0, policy_version 418379 (0.0028) [2024-04-27 15:58:59,107][52031] Fps is (10 sec: 52428.7, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6854836224. Throughput: 0: 53557.3. Samples: 1345292680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:58:59,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 15:59:00,187][52263] Updated weights for policy 0, policy_version 418389 (0.0029) [2024-04-27 15:59:02,817][52263] Updated weights for policy 0, policy_version 418399 (0.0026) [2024-04-27 15:59:04,106][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.2, 300 sec: 53761.8). Total num frames: 6855098368. Throughput: 0: 53551.2. Samples: 1345613960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:59:04,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 15:59:06,298][52263] Updated weights for policy 0, policy_version 418409 (0.0027) [2024-04-27 15:59:08,812][52263] Updated weights for policy 0, policy_version 418419 (0.0023) [2024-04-27 15:59:09,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 6855376896. Throughput: 0: 53522.3. Samples: 1345933560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:59:09,107][52031] Avg episode reward: [(0, '0.495')] [2024-04-27 15:59:12,604][52263] Updated weights for policy 0, policy_version 418429 (0.0032) [2024-04-27 15:59:14,107][52031] Fps is (10 sec: 57343.7, 60 sec: 54067.2, 300 sec: 53928.3). Total num frames: 6855671808. Throughput: 0: 53945.2. Samples: 1346104080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 15:59:14,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 15:59:14,959][52263] Updated weights for policy 0, policy_version 418439 (0.0031) [2024-04-27 15:59:18,529][52263] Updated weights for policy 0, policy_version 418449 (0.0032) [2024-04-27 15:59:19,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.3, 300 sec: 53761.8). Total num frames: 6855917568. Throughput: 0: 53875.7. Samples: 1346424580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 15:59:19,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 15:59:20,982][52263] Updated weights for policy 0, policy_version 418459 (0.0030) [2024-04-27 15:59:24,107][52031] Fps is (10 sec: 49151.6, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6856163328. Throughput: 0: 53781.1. Samples: 1346745420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 15:59:24,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 15:59:24,670][52263] Updated weights for policy 0, policy_version 418469 (0.0029) [2024-04-27 15:59:25,159][52242] Signal inference workers to stop experience collection... (20450 times) [2024-04-27 15:59:25,188][52263] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-04-27 15:59:25,252][52242] Signal inference workers to resume experience collection... (20450 times) [2024-04-27 15:59:25,253][52263] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-04-27 15:59:27,240][52263] Updated weights for policy 0, policy_version 418479 (0.0028) [2024-04-27 15:59:29,107][52031] Fps is (10 sec: 52427.9, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 6856441856. Throughput: 0: 53540.8. Samples: 1346904100. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 15:59:29,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 15:59:30,608][52263] Updated weights for policy 0, policy_version 418489 (0.0030) [2024-04-27 15:59:33,496][52263] Updated weights for policy 0, policy_version 418499 (0.0026) [2024-04-27 15:59:34,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6856704000. Throughput: 0: 53433.8. Samples: 1347223760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 15:59:34,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 15:59:34,166][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000418501_6856720384.pth... [2024-04-27 15:59:34,213][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000417714_6843826176.pth [2024-04-27 15:59:36,674][52263] Updated weights for policy 0, policy_version 418509 (0.0027) [2024-04-27 15:59:39,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 6856982528. Throughput: 0: 53494.3. Samples: 1347545640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 15:59:39,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 15:59:39,629][52263] Updated weights for policy 0, policy_version 418519 (0.0029) [2024-04-27 15:59:42,862][52263] Updated weights for policy 0, policy_version 418529 (0.0029) [2024-04-27 15:59:44,106][52031] Fps is (10 sec: 57345.1, 60 sec: 53794.2, 300 sec: 53872.8). Total num frames: 6857277440. Throughput: 0: 53721.2. Samples: 1347710120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 15:59:44,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 15:59:45,779][52263] Updated weights for policy 0, policy_version 418539 (0.0027) [2024-04-27 15:59:48,859][52263] Updated weights for policy 0, policy_version 418549 (0.0026) [2024-04-27 15:59:49,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.2, 300 sec: 53650.7). 
Total num frames: 6857506816. Throughput: 0: 53775.3. Samples: 1348033840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 15:59:49,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 15:59:51,829][52263] Updated weights for policy 0, policy_version 418559 (0.0028) [2024-04-27 15:59:54,107][52031] Fps is (10 sec: 49150.9, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6857768960. Throughput: 0: 53785.1. Samples: 1348353900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 15:59:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 15:59:54,956][52263] Updated weights for policy 0, policy_version 418569 (0.0036) [2024-04-27 15:59:57,904][52263] Updated weights for policy 0, policy_version 418579 (0.0030) [2024-04-27 15:59:59,107][52031] Fps is (10 sec: 55704.1, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6858063872. Throughput: 0: 53350.6. Samples: 1348504860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 15:59:59,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 16:00:01,152][52263] Updated weights for policy 0, policy_version 418589 (0.0029) [2024-04-27 16:00:04,013][52263] Updated weights for policy 0, policy_version 418599 (0.0028) [2024-04-27 16:00:04,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6858326016. Throughput: 0: 53450.9. Samples: 1348829880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 16:00:04,107][52031] Avg episode reward: [(0, '0.504')] [2024-04-27 16:00:07,249][52263] Updated weights for policy 0, policy_version 418609 (0.0025) [2024-04-27 16:00:09,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.0, 300 sec: 53761.7). Total num frames: 6858604544. Throughput: 0: 53457.9. Samples: 1349151020. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 16:00:09,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 16:00:10,112][52263] Updated weights for policy 0, policy_version 418619 (0.0028) [2024-04-27 16:00:13,312][52263] Updated weights for policy 0, policy_version 418629 (0.0035) [2024-04-27 16:00:14,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.0, 300 sec: 53817.3). Total num frames: 6858883072. Throughput: 0: 53649.3. Samples: 1349318320. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 16:00:14,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 16:00:16,351][52263] Updated weights for policy 0, policy_version 418639 (0.0030) [2024-04-27 16:00:19,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53247.9, 300 sec: 53650.7). Total num frames: 6859112448. Throughput: 0: 53552.6. Samples: 1349633620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 16:00:19,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 16:00:19,473][52263] Updated weights for policy 0, policy_version 418649 (0.0035) [2024-04-27 16:00:22,645][52263] Updated weights for policy 0, policy_version 418659 (0.0026) [2024-04-27 16:00:24,106][52031] Fps is (10 sec: 50791.1, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6859390976. Throughput: 0: 53620.1. Samples: 1349958540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 16:00:24,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 16:00:25,462][52263] Updated weights for policy 0, policy_version 418669 (0.0026) [2024-04-27 16:00:28,779][52263] Updated weights for policy 0, policy_version 418679 (0.0027) [2024-04-27 16:00:29,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53248.0, 300 sec: 53650.6). Total num frames: 6859636736. Throughput: 0: 53398.9. Samples: 1350113080. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 16:00:29,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 16:00:29,224][52242] Signal inference workers to stop experience collection... (20500 times) [2024-04-27 16:00:29,271][52263] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-04-27 16:00:29,285][52242] Signal inference workers to resume experience collection... (20500 times) [2024-04-27 16:00:29,291][52263] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-04-27 16:00:31,625][52263] Updated weights for policy 0, policy_version 418689 (0.0029) [2024-04-27 16:00:34,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 6859915264. Throughput: 0: 53334.1. Samples: 1350433880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-04-27 16:00:34,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 16:00:34,754][52263] Updated weights for policy 0, policy_version 418699 (0.0029) [2024-04-27 16:00:37,838][52263] Updated weights for policy 0, policy_version 418709 (0.0037) [2024-04-27 16:00:39,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.1, 300 sec: 53817.3). Total num frames: 6860210176. Throughput: 0: 53377.4. Samples: 1350755880. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:00:39,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 16:00:40,902][52263] Updated weights for policy 0, policy_version 418719 (0.0028) [2024-04-27 16:00:44,023][52263] Updated weights for policy 0, policy_version 418729 (0.0034) [2024-04-27 16:00:44,107][52031] Fps is (10 sec: 54066.2, 60 sec: 52974.7, 300 sec: 53650.6). Total num frames: 6860455936. Throughput: 0: 53772.4. Samples: 1350924620. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:00:44,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 16:00:46,914][52263] Updated weights for policy 0, policy_version 418739 (0.0030) [2024-04-27 16:00:49,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6860718080. Throughput: 0: 53763.3. Samples: 1351249220. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:00:49,107][52031] Avg episode reward: [(0, '0.505')] [2024-04-27 16:00:50,164][52263] Updated weights for policy 0, policy_version 418749 (0.0030) [2024-04-27 16:00:52,905][52263] Updated weights for policy 0, policy_version 418759 (0.0034) [2024-04-27 16:00:54,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6860980224. Throughput: 0: 53698.2. Samples: 1351567440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:00:54,116][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 16:00:56,258][52263] Updated weights for policy 0, policy_version 418769 (0.0032) [2024-04-27 16:00:59,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53248.2, 300 sec: 53706.2). Total num frames: 6861258752. Throughput: 0: 53423.2. Samples: 1351722360. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:00:59,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 16:00:59,184][52263] Updated weights for policy 0, policy_version 418779 (0.0027) [2024-04-27 16:01:02,417][52263] Updated weights for policy 0, policy_version 418789 (0.0029) [2024-04-27 16:01:04,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6861537280. Throughput: 0: 53560.7. Samples: 1352043860. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:04,116][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 16:01:05,415][52263] Updated weights for policy 0, policy_version 418799 (0.0033) [2024-04-27 16:01:08,461][52263] Updated weights for policy 0, policy_version 418809 (0.0030) [2024-04-27 16:01:09,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 6861799424. Throughput: 0: 53388.9. Samples: 1352361040. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:09,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 16:01:11,487][52263] Updated weights for policy 0, policy_version 418819 (0.0031) [2024-04-27 16:01:14,107][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.0, 300 sec: 53706.2). Total num frames: 6862077952. Throughput: 0: 53596.0. Samples: 1352524900. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:14,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 16:01:14,647][52263] Updated weights for policy 0, policy_version 418829 (0.0038) [2024-04-27 16:01:18,046][52263] Updated weights for policy 0, policy_version 418839 (0.0031) [2024-04-27 16:01:19,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6862340096. Throughput: 0: 53659.8. Samples: 1352848580. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:19,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 16:01:20,660][52263] Updated weights for policy 0, policy_version 418849 (0.0023) [2024-04-27 16:01:24,107][52031] Fps is (10 sec: 49152.0, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6862569472. Throughput: 0: 53524.9. Samples: 1353164500. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:24,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 16:01:24,210][52263] Updated weights for policy 0, policy_version 418859 (0.0027) [2024-04-27 16:01:24,354][52242] Signal inference workers to stop experience collection... 
(20550 times) [2024-04-27 16:01:24,354][52242] Signal inference workers to resume experience collection... (20550 times) [2024-04-27 16:01:24,367][52263] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-04-27 16:01:24,367][52263] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-04-27 16:01:26,882][52263] Updated weights for policy 0, policy_version 418869 (0.0024) [2024-04-27 16:01:29,107][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6862848000. Throughput: 0: 53340.6. Samples: 1353324940. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:29,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 16:01:30,228][52263] Updated weights for policy 0, policy_version 418879 (0.0033) [2024-04-27 16:01:32,997][52263] Updated weights for policy 0, policy_version 418889 (0.0028) [2024-04-27 16:01:34,106][52031] Fps is (10 sec: 58982.6, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6863159296. Throughput: 0: 53295.9. Samples: 1353647540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:34,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 16:01:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000418894_6863159296.pth... [2024-04-27 16:01:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000418106_6850248704.pth [2024-04-27 16:01:36,304][52263] Updated weights for policy 0, policy_version 418899 (0.0032) [2024-04-27 16:01:39,088][52263] Updated weights for policy 0, policy_version 418909 (0.0034) [2024-04-27 16:01:39,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 6863405056. Throughput: 0: 53411.6. Samples: 1353970960. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:39,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 16:01:42,361][52263] Updated weights for policy 0, policy_version 418919 (0.0027) [2024-04-27 16:01:44,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6863667200. Throughput: 0: 53420.8. Samples: 1354126300. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:44,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 16:01:45,313][52263] Updated weights for policy 0, policy_version 418929 (0.0032) [2024-04-27 16:01:48,470][52263] Updated weights for policy 0, policy_version 418939 (0.0031) [2024-04-27 16:01:49,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 6863945728. Throughput: 0: 53317.7. Samples: 1354443160. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:49,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 16:01:51,450][52263] Updated weights for policy 0, policy_version 418949 (0.0026) [2024-04-27 16:01:54,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6864207872. Throughput: 0: 53448.6. Samples: 1354766240. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-27 16:01:54,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 16:01:54,544][52263] Updated weights for policy 0, policy_version 418959 (0.0028) [2024-04-27 16:01:57,541][52263] Updated weights for policy 0, policy_version 418969 (0.0028) [2024-04-27 16:01:59,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6864470016. Throughput: 0: 53464.5. Samples: 1354930800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:01:59,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 16:02:00,710][52263] Updated weights for policy 0, policy_version 418979 (0.0032) [2024-04-27 16:02:03,604][52263] Updated weights for policy 0, policy_version 418989 (0.0030) [2024-04-27 16:02:04,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6864748544. Throughput: 0: 53385.4. Samples: 1355250920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:04,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 16:02:06,772][52263] Updated weights for policy 0, policy_version 418999 (0.0032) [2024-04-27 16:02:09,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6864994304. Throughput: 0: 53462.2. Samples: 1355570300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:09,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 16:02:09,781][52263] Updated weights for policy 0, policy_version 419009 (0.0035) [2024-04-27 16:02:12,907][52263] Updated weights for policy 0, policy_version 419019 (0.0035) [2024-04-27 16:02:14,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6865272832. Throughput: 0: 53466.3. Samples: 1355730920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:14,107][52031] Avg episode reward: [(0, '0.524')] [2024-04-27 16:02:15,848][52263] Updated weights for policy 0, policy_version 419029 (0.0034) [2024-04-27 16:02:15,856][52242] Signal inference workers to stop experience collection... (20600 times) [2024-04-27 16:02:15,856][52242] Signal inference workers to resume experience collection... 
(20600 times) [2024-04-27 16:02:15,876][52263] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-04-27 16:02:15,876][52263] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-04-27 16:02:18,828][52263] Updated weights for policy 0, policy_version 419039 (0.0035) [2024-04-27 16:02:19,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.2, 300 sec: 53650.6). Total num frames: 6865551360. Throughput: 0: 53425.3. Samples: 1356051680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:19,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 16:02:21,866][52263] Updated weights for policy 0, policy_version 419049 (0.0032) [2024-04-27 16:02:24,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6865797120. Throughput: 0: 53372.5. Samples: 1356372720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:24,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 16:02:24,869][52263] Updated weights for policy 0, policy_version 419059 (0.0035) [2024-04-27 16:02:27,975][52263] Updated weights for policy 0, policy_version 419069 (0.0034) [2024-04-27 16:02:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 54067.3, 300 sec: 53706.2). Total num frames: 6866092032. Throughput: 0: 53711.7. Samples: 1356543320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:29,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 16:02:30,956][52263] Updated weights for policy 0, policy_version 419079 (0.0026) [2024-04-27 16:02:34,089][52263] Updated weights for policy 0, policy_version 419089 (0.0028) [2024-04-27 16:02:34,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 6866354176. Throughput: 0: 53751.7. Samples: 1356861980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:34,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 16:02:37,256][52263] Updated weights for policy 0, policy_version 419099 (0.0030) [2024-04-27 16:02:39,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6866599936. Throughput: 0: 53631.8. Samples: 1357179660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:39,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 16:02:40,227][52263] Updated weights for policy 0, policy_version 419109 (0.0028) [2024-04-27 16:02:43,171][52263] Updated weights for policy 0, policy_version 419119 (0.0040) [2024-04-27 16:02:44,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6866894848. Throughput: 0: 53479.0. Samples: 1357337360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:44,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 16:02:46,368][52263] Updated weights for policy 0, policy_version 419129 (0.0029) [2024-04-27 16:02:49,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.2, 300 sec: 53595.2). Total num frames: 6867140608. Throughput: 0: 53433.1. Samples: 1357655400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:49,107][52031] Avg episode reward: [(0, '0.490')] [2024-04-27 16:02:49,346][52263] Updated weights for policy 0, policy_version 419139 (0.0026) [2024-04-27 16:02:52,431][52263] Updated weights for policy 0, policy_version 419149 (0.0033) [2024-04-27 16:02:54,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6867402752. Throughput: 0: 53445.4. Samples: 1357975340. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:54,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 16:02:55,513][52263] Updated weights for policy 0, policy_version 419159 (0.0031) [2024-04-27 16:02:58,621][52263] Updated weights for policy 0, policy_version 419169 (0.0030) [2024-04-27 16:02:59,106][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6867697664. Throughput: 0: 53504.5. Samples: 1358138620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:02:59,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 16:03:01,801][52263] Updated weights for policy 0, policy_version 419179 (0.0029) [2024-04-27 16:03:04,107][52031] Fps is (10 sec: 50789.8, 60 sec: 52701.9, 300 sec: 53428.5). Total num frames: 6867910656. Throughput: 0: 53488.8. Samples: 1358458680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:03:04,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 16:03:04,298][52242] Signal inference workers to stop experience collection... (20650 times) [2024-04-27 16:03:04,299][52242] Signal inference workers to resume experience collection... (20650 times) [2024-04-27 16:03:04,310][52263] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-04-27 16:03:04,330][52263] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-04-27 16:03:04,710][52263] Updated weights for policy 0, policy_version 419189 (0.0030) [2024-04-27 16:03:07,885][52263] Updated weights for policy 0, policy_version 419199 (0.0028) [2024-04-27 16:03:09,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6868221952. Throughput: 0: 53464.9. Samples: 1358778640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:03:09,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 16:03:10,952][52263] Updated weights for policy 0, policy_version 419209 (0.0029) [2024-04-27 16:03:14,009][52263] Updated weights for policy 0, policy_version 419219 (0.0034) [2024-04-27 16:03:14,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6868484096. Throughput: 0: 53099.8. Samples: 1358932820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:03:14,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 16:03:16,996][52263] Updated weights for policy 0, policy_version 419229 (0.0031) [2024-04-27 16:03:19,106][52031] Fps is (10 sec: 50790.5, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6868729856. Throughput: 0: 53186.3. Samples: 1359255360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 16:03:19,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 16:03:20,103][52263] Updated weights for policy 0, policy_version 419239 (0.0038) [2024-04-27 16:03:23,088][52263] Updated weights for policy 0, policy_version 419249 (0.0031) [2024-04-27 16:03:24,107][52031] Fps is (10 sec: 55705.8, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6869041152. Throughput: 0: 53311.0. Samples: 1359578660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:03:24,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 16:03:26,210][52263] Updated weights for policy 0, policy_version 419259 (0.0030) [2024-04-27 16:03:29,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6869286912. Throughput: 0: 53364.6. Samples: 1359738760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:03:29,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 16:03:29,190][52263] Updated weights for policy 0, policy_version 419269 (0.0027) [2024-04-27 16:03:32,270][52263] Updated weights for policy 0, policy_version 419279 (0.0026) [2024-04-27 16:03:34,106][52031] Fps is (10 sec: 49152.3, 60 sec: 52975.0, 300 sec: 53428.5). Total num frames: 6869532672. Throughput: 0: 53456.4. Samples: 1360060940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:03:34,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 16:03:34,226][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000419284_6869549056.pth... [2024-04-27 16:03:34,276][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000418501_6856720384.pth [2024-04-27 16:03:35,294][52263] Updated weights for policy 0, policy_version 419289 (0.0034) [2024-04-27 16:03:38,435][52263] Updated weights for policy 0, policy_version 419299 (0.0031) [2024-04-27 16:03:39,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6869827584. Throughput: 0: 53537.2. Samples: 1360384520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:03:39,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 16:03:41,502][52263] Updated weights for policy 0, policy_version 419309 (0.0028) [2024-04-27 16:03:44,107][52031] Fps is (10 sec: 54066.5, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 6870073344. Throughput: 0: 53509.7. Samples: 1360546560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:03:44,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 16:03:44,439][52263] Updated weights for policy 0, policy_version 419319 (0.0034) [2024-04-27 16:03:47,597][52263] Updated weights for policy 0, policy_version 419329 (0.0033) [2024-04-27 16:03:49,106][52031] Fps is (10 sec: 54068.2, 60 sec: 53794.2, 300 sec: 53650.7). 
Total num frames: 6870368256. Throughput: 0: 53474.9. Samples: 1360865040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:03:49,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 16:03:50,669][52263] Updated weights for policy 0, policy_version 419339 (0.0036) [2024-04-27 16:03:53,661][52263] Updated weights for policy 0, policy_version 419349 (0.0027) [2024-04-27 16:03:54,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6870630400. Throughput: 0: 53493.4. Samples: 1361185840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:03:54,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 16:03:55,651][52242] Signal inference workers to stop experience collection... (20700 times) [2024-04-27 16:03:55,651][52242] Signal inference workers to resume experience collection... (20700 times) [2024-04-27 16:03:55,676][52263] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-04-27 16:03:55,676][52263] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-04-27 16:03:56,595][52263] Updated weights for policy 0, policy_version 419359 (0.0033) [2024-04-27 16:03:59,106][52031] Fps is (10 sec: 50790.0, 60 sec: 52975.0, 300 sec: 53484.0). Total num frames: 6870876160. Throughput: 0: 53629.9. Samples: 1361346160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:03:59,107][52031] Avg episode reward: [(0, '0.510')] [2024-04-27 16:03:59,738][52263] Updated weights for policy 0, policy_version 419369 (0.0034) [2024-04-27 16:04:02,706][52263] Updated weights for policy 0, policy_version 419379 (0.0027) [2024-04-27 16:04:04,107][52031] Fps is (10 sec: 52428.6, 60 sec: 54067.3, 300 sec: 53484.0). Total num frames: 6871154688. Throughput: 0: 53657.7. Samples: 1361669960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:04:04,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 16:04:05,797][52263] Updated weights for policy 0, policy_version 419389 (0.0027) [2024-04-27 16:04:08,938][52263] Updated weights for policy 0, policy_version 419399 (0.0031) [2024-04-27 16:04:09,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6871433216. Throughput: 0: 53568.9. Samples: 1361989260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:04:09,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 16:04:12,009][52263] Updated weights for policy 0, policy_version 419409 (0.0036) [2024-04-27 16:04:14,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6871695360. Throughput: 0: 53735.9. Samples: 1362156880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:04:14,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 16:04:14,913][52263] Updated weights for policy 0, policy_version 419419 (0.0024) [2024-04-27 16:04:18,064][52263] Updated weights for policy 0, policy_version 419429 (0.0031) [2024-04-27 16:04:19,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6871957504. Throughput: 0: 53619.1. Samples: 1362473800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:04:19,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 16:04:21,013][52263] Updated weights for policy 0, policy_version 419439 (0.0030) [2024-04-27 16:04:24,056][52263] Updated weights for policy 0, policy_version 419449 (0.0027) [2024-04-27 16:04:24,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6872252416. Throughput: 0: 53572.1. Samples: 1362795260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:04:24,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 16:04:27,446][52263] Updated weights for policy 0, policy_version 419459 (0.0032) [2024-04-27 16:04:29,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6872481792. Throughput: 0: 53553.0. Samples: 1362956440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:04:29,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 16:04:30,249][52263] Updated weights for policy 0, policy_version 419469 (0.0031) [2024-04-27 16:04:33,414][52263] Updated weights for policy 0, policy_version 419479 (0.0028) [2024-04-27 16:04:34,106][52031] Fps is (10 sec: 49152.4, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6872743936. Throughput: 0: 53567.5. Samples: 1363275580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:04:34,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 16:04:36,426][52263] Updated weights for policy 0, policy_version 419489 (0.0031) [2024-04-27 16:04:39,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6873038848. Throughput: 0: 53558.7. Samples: 1363595980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:04:39,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 16:04:39,611][52263] Updated weights for policy 0, policy_version 419499 (0.0034) [2024-04-27 16:04:42,718][52263] Updated weights for policy 0, policy_version 419509 (0.0026) [2024-04-27 16:04:44,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 6873300992. Throughput: 0: 53585.4. Samples: 1363757500. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:04:44,107][52031] Avg episode reward: [(0, '0.685')] [2024-04-27 16:04:45,794][52263] Updated weights for policy 0, policy_version 419519 (0.0028) [2024-04-27 16:04:47,278][52242] Signal inference workers to stop experience collection... (20750 times) [2024-04-27 16:04:47,278][52242] Signal inference workers to resume experience collection... (20750 times) [2024-04-27 16:04:47,293][52263] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-04-27 16:04:47,294][52263] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-04-27 16:04:48,953][52263] Updated weights for policy 0, policy_version 419529 (0.0029) [2024-04-27 16:04:49,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53247.8, 300 sec: 53539.6). Total num frames: 6873563136. Throughput: 0: 53525.6. Samples: 1364078620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:04:49,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 16:04:51,813][52263] Updated weights for policy 0, policy_version 419539 (0.0031) [2024-04-27 16:04:54,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53521.0, 300 sec: 53484.1). Total num frames: 6873841664. Throughput: 0: 53543.5. Samples: 1364398720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:04:54,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 16:04:55,032][52263] Updated weights for policy 0, policy_version 419549 (0.0035) [2024-04-27 16:04:57,902][52263] Updated weights for policy 0, policy_version 419559 (0.0033) [2024-04-27 16:04:59,106][52031] Fps is (10 sec: 52430.2, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6874087424. Throughput: 0: 53284.1. Samples: 1364554660. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:04:59,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 16:05:01,260][52263] Updated weights for policy 0, policy_version 419569 (0.0035) [2024-04-27 16:05:04,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53428.5). Total num frames: 6874365952. Throughput: 0: 53430.1. Samples: 1364878160. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:04,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 16:05:04,457][52263] Updated weights for policy 0, policy_version 419579 (0.0028) [2024-04-27 16:05:07,421][52263] Updated weights for policy 0, policy_version 419589 (0.0027) [2024-04-27 16:05:09,106][52031] Fps is (10 sec: 52428.4, 60 sec: 52975.0, 300 sec: 53317.4). Total num frames: 6874611712. Throughput: 0: 53356.9. Samples: 1365196320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:09,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 16:05:10,686][52263] Updated weights for policy 0, policy_version 419599 (0.0029) [2024-04-27 16:05:13,367][52263] Updated weights for policy 0, policy_version 419609 (0.0032) [2024-04-27 16:05:14,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6874890240. Throughput: 0: 53336.0. Samples: 1365356560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:14,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 16:05:16,792][52263] Updated weights for policy 0, policy_version 419619 (0.0028) [2024-04-27 16:05:19,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6875168768. Throughput: 0: 53411.2. Samples: 1365679080. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:19,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 16:05:19,354][52263] Updated weights for policy 0, policy_version 419629 (0.0027) [2024-04-27 16:05:22,915][52263] Updated weights for policy 0, policy_version 419639 (0.0028) [2024-04-27 16:05:24,107][52031] Fps is (10 sec: 54066.6, 60 sec: 52974.8, 300 sec: 53539.6). Total num frames: 6875430912. Throughput: 0: 53408.7. Samples: 1365999380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:24,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 16:05:25,349][52263] Updated weights for policy 0, policy_version 419649 (0.0031) [2024-04-27 16:05:28,973][52263] Updated weights for policy 0, policy_version 419659 (0.0029) [2024-04-27 16:05:29,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 6875693056. Throughput: 0: 53458.3. Samples: 1366163120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:29,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 16:05:31,294][52242] Signal inference workers to stop experience collection... (20800 times) [2024-04-27 16:05:31,298][52242] Signal inference workers to resume experience collection... (20800 times) [2024-04-27 16:05:31,326][52263] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-04-27 16:05:31,327][52263] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-04-27 16:05:31,557][52263] Updated weights for policy 0, policy_version 419669 (0.0035) [2024-04-27 16:05:34,106][52031] Fps is (10 sec: 54068.4, 60 sec: 53794.2, 300 sec: 53428.5). Total num frames: 6875971584. Throughput: 0: 53437.2. Samples: 1366483280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:34,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 16:05:34,200][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000419677_6875987968.pth... 
[2024-04-27 16:05:34,240][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000418894_6863159296.pth [2024-04-27 16:05:35,054][52263] Updated weights for policy 0, policy_version 419679 (0.0034) [2024-04-27 16:05:38,030][52263] Updated weights for policy 0, policy_version 419689 (0.0028) [2024-04-27 16:05:39,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6876233728. Throughput: 0: 53501.8. Samples: 1366806300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:39,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 16:05:41,180][52263] Updated weights for policy 0, policy_version 419699 (0.0040) [2024-04-27 16:05:44,039][52263] Updated weights for policy 0, policy_version 419709 (0.0025) [2024-04-27 16:05:44,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53520.9, 300 sec: 53539.5). Total num frames: 6876512256. Throughput: 0: 53567.3. Samples: 1366965200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:44,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 16:05:47,257][52263] Updated weights for policy 0, policy_version 419719 (0.0031) [2024-04-27 16:05:49,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6876790784. Throughput: 0: 53466.2. Samples: 1367284140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:49,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 16:05:50,233][52263] Updated weights for policy 0, policy_version 419729 (0.0026) [2024-04-27 16:05:53,403][52263] Updated weights for policy 0, policy_version 419739 (0.0032) [2024-04-27 16:05:54,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6877069312. Throughput: 0: 53603.9. Samples: 1367608500. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:54,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 16:05:56,456][52263] Updated weights for policy 0, policy_version 419749 (0.0026) [2024-04-27 16:05:59,107][52031] Fps is (10 sec: 49152.2, 60 sec: 53247.9, 300 sec: 53373.0). Total num frames: 6877282304. Throughput: 0: 53606.7. Samples: 1367768860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 16:05:59,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 16:05:59,515][52263] Updated weights for policy 0, policy_version 419759 (0.0034) [2024-04-27 16:06:02,498][52263] Updated weights for policy 0, policy_version 419769 (0.0028) [2024-04-27 16:06:04,106][52031] Fps is (10 sec: 49152.7, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6877560832. Throughput: 0: 53609.3. Samples: 1368091500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:04,107][52031] Avg episode reward: [(0, '0.689')] [2024-04-27 16:06:05,572][52263] Updated weights for policy 0, policy_version 419779 (0.0034) [2024-04-27 16:06:08,543][52263] Updated weights for policy 0, policy_version 419789 (0.0032) [2024-04-27 16:06:09,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.0, 300 sec: 53428.5). Total num frames: 6877839360. Throughput: 0: 53672.5. Samples: 1368414640. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:09,107][52031] Avg episode reward: [(0, '0.516')] [2024-04-27 16:06:11,545][52263] Updated weights for policy 0, policy_version 419799 (0.0027) [2024-04-27 16:06:14,106][52031] Fps is (10 sec: 57343.5, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6878134272. Throughput: 0: 53635.4. Samples: 1368576720. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:14,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 16:06:14,673][52263] Updated weights for policy 0, policy_version 419809 (0.0028) [2024-04-27 16:06:17,602][52263] Updated weights for policy 0, policy_version 419819 (0.0029) [2024-04-27 16:06:19,106][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6878396416. Throughput: 0: 53691.1. Samples: 1368899380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:19,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 16:06:20,847][52263] Updated weights for policy 0, policy_version 419829 (0.0031) [2024-04-27 16:06:23,753][52263] Updated weights for policy 0, policy_version 419839 (0.0034) [2024-04-27 16:06:24,107][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6878674944. Throughput: 0: 53700.9. Samples: 1369222840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:24,107][52031] Avg episode reward: [(0, '0.681')] [2024-04-27 16:06:26,911][52263] Updated weights for policy 0, policy_version 419849 (0.0029) [2024-04-27 16:06:28,211][52242] Signal inference workers to stop experience collection... (20850 times) [2024-04-27 16:06:28,212][52242] Signal inference workers to resume experience collection... (20850 times) [2024-04-27 16:06:28,224][52263] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-04-27 16:06:28,224][52263] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-04-27 16:06:29,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53520.9, 300 sec: 53373.0). Total num frames: 6878904320. Throughput: 0: 53715.7. Samples: 1369382400. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:29,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 16:06:29,816][52263] Updated weights for policy 0, policy_version 419859 (0.0032) [2024-04-27 16:06:33,198][52263] Updated weights for policy 0, policy_version 419869 (0.0030) [2024-04-27 16:06:34,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 6879182848. Throughput: 0: 53788.8. Samples: 1369704640. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:34,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 16:06:35,808][52263] Updated weights for policy 0, policy_version 419879 (0.0038) [2024-04-27 16:06:39,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53484.1). Total num frames: 6879444992. Throughput: 0: 53725.9. Samples: 1370026160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:39,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 16:06:39,318][52263] Updated weights for policy 0, policy_version 419889 (0.0027) [2024-04-27 16:06:41,918][52263] Updated weights for policy 0, policy_version 419899 (0.0031) [2024-04-27 16:06:44,107][52031] Fps is (10 sec: 57344.4, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6879756288. Throughput: 0: 53765.7. Samples: 1370188320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:44,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 16:06:45,398][52263] Updated weights for policy 0, policy_version 419909 (0.0028) [2024-04-27 16:06:48,017][52263] Updated weights for policy 0, policy_version 419919 (0.0041) [2024-04-27 16:06:49,107][52031] Fps is (10 sec: 58982.0, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6880034816. Throughput: 0: 53823.4. Samples: 1370513560. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:49,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 16:06:51,404][52263] Updated weights for policy 0, policy_version 419929 (0.0035) [2024-04-27 16:06:54,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53248.2, 300 sec: 53539.6). Total num frames: 6880264192. Throughput: 0: 53894.5. Samples: 1370839880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:54,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 16:06:54,137][52263] Updated weights for policy 0, policy_version 419939 (0.0027) [2024-04-27 16:06:57,548][52263] Updated weights for policy 0, policy_version 419949 (0.0031) [2024-04-27 16:06:59,107][52031] Fps is (10 sec: 49151.9, 60 sec: 54067.1, 300 sec: 53484.0). Total num frames: 6880526336. Throughput: 0: 53667.9. Samples: 1370991780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:06:59,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 16:07:00,161][52263] Updated weights for policy 0, policy_version 419959 (0.0027) [2024-04-27 16:07:03,612][52263] Updated weights for policy 0, policy_version 419969 (0.0035) [2024-04-27 16:07:04,106][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6880788480. Throughput: 0: 53644.8. Samples: 1371313400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:07:04,115][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 16:07:06,239][52263] Updated weights for policy 0, policy_version 419979 (0.0030) [2024-04-27 16:07:09,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6881067008. Throughput: 0: 53792.0. Samples: 1371643480. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:07:09,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 16:07:09,582][52263] Updated weights for policy 0, policy_version 419989 (0.0029) [2024-04-27 16:07:12,284][52263] Updated weights for policy 0, policy_version 419999 (0.0030) [2024-04-27 16:07:14,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6881361920. Throughput: 0: 53898.2. Samples: 1371807820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:07:14,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 16:07:15,741][52263] Updated weights for policy 0, policy_version 420009 (0.0027) [2024-04-27 16:07:18,186][52263] Updated weights for policy 0, policy_version 420019 (0.0032) [2024-04-27 16:07:19,107][52031] Fps is (10 sec: 57344.0, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 6881640448. Throughput: 0: 53913.9. Samples: 1372130760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:07:19,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 16:07:21,795][52263] Updated weights for policy 0, policy_version 420029 (0.0027) [2024-04-27 16:07:24,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6881902592. Throughput: 0: 53846.6. Samples: 1372449260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 16:07:24,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 16:07:24,358][52263] Updated weights for policy 0, policy_version 420039 (0.0031) [2024-04-27 16:07:28,110][52263] Updated weights for policy 0, policy_version 420049 (0.0037) [2024-04-27 16:07:29,107][52031] Fps is (10 sec: 49151.7, 60 sec: 53794.0, 300 sec: 53484.0). Total num frames: 6882131968. Throughput: 0: 53691.1. Samples: 1372604420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:07:29,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 16:07:30,389][52263] Updated weights for policy 0, policy_version 420059 (0.0033) [2024-04-27 16:07:34,107][52263] Updated weights for policy 0, policy_version 420069 (0.0027) [2024-04-27 16:07:34,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6882410496. Throughput: 0: 53568.8. Samples: 1372924160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:07:34,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 16:07:34,114][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000420069_6882410496.pth... [2024-04-27 16:07:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000419284_6869549056.pth [2024-04-27 16:07:36,616][52263] Updated weights for policy 0, policy_version 420079 (0.0030) [2024-04-27 16:07:37,252][52242] Signal inference workers to stop experience collection... (20900 times) [2024-04-27 16:07:37,287][52263] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-04-27 16:07:37,344][52242] Signal inference workers to resume experience collection... (20900 times) [2024-04-27 16:07:37,344][52263] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-04-27 16:07:39,106][52031] Fps is (10 sec: 55706.4, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6882689024. Throughput: 0: 53492.8. Samples: 1373247060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:07:39,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 16:07:40,244][52263] Updated weights for policy 0, policy_version 420089 (0.0028) [2024-04-27 16:07:42,556][52263] Updated weights for policy 0, policy_version 420099 (0.0029) [2024-04-27 16:07:44,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 6882967552. Throughput: 0: 54016.1. 
Samples: 1373422500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:07:44,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 16:07:46,431][52263] Updated weights for policy 0, policy_version 420109 (0.0025) [2024-04-27 16:07:48,661][52263] Updated weights for policy 0, policy_version 420119 (0.0030) [2024-04-27 16:07:49,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53248.0, 300 sec: 53650.6). Total num frames: 6883229696. Throughput: 0: 54039.0. Samples: 1373745160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:07:49,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 16:07:52,457][52263] Updated weights for policy 0, policy_version 420129 (0.0034) [2024-04-27 16:07:54,107][52031] Fps is (10 sec: 54066.7, 60 sec: 54067.0, 300 sec: 53595.1). Total num frames: 6883508224. Throughput: 0: 53868.8. Samples: 1374067580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:07:54,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 16:07:54,721][52263] Updated weights for policy 0, policy_version 420139 (0.0031) [2024-04-27 16:07:58,488][52263] Updated weights for policy 0, policy_version 420149 (0.0032) [2024-04-27 16:07:59,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6883753984. Throughput: 0: 53603.6. Samples: 1374219980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:07:59,107][52031] Avg episode reward: [(0, '0.486')] [2024-04-27 16:08:00,920][52263] Updated weights for policy 0, policy_version 420159 (0.0028) [2024-04-27 16:08:04,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6884016128. Throughput: 0: 53663.3. Samples: 1374545600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:08:04,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 16:08:04,499][52263] Updated weights for policy 0, policy_version 420169 (0.0028) [2024-04-27 16:08:06,996][52263] Updated weights for policy 0, policy_version 420179 (0.0026) [2024-04-27 16:08:09,107][52031] Fps is (10 sec: 55704.8, 60 sec: 54067.1, 300 sec: 53650.7). Total num frames: 6884311040. Throughput: 0: 53776.4. Samples: 1374869200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:08:09,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 16:08:10,693][52263] Updated weights for policy 0, policy_version 420189 (0.0027) [2024-04-27 16:08:13,051][52263] Updated weights for policy 0, policy_version 420199 (0.0036) [2024-04-27 16:08:14,106][52031] Fps is (10 sec: 57343.6, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6884589568. Throughput: 0: 54018.8. Samples: 1375035260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:08:14,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 16:08:16,760][52263] Updated weights for policy 0, policy_version 420209 (0.0032) [2024-04-27 16:08:19,049][52263] Updated weights for policy 0, policy_version 420219 (0.0034) [2024-04-27 16:08:19,106][52031] Fps is (10 sec: 55706.8, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6884868096. Throughput: 0: 54090.1. Samples: 1375358200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:08:19,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 16:08:22,728][52263] Updated weights for policy 0, policy_version 420229 (0.0031) [2024-04-27 16:08:24,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6885113856. Throughput: 0: 54222.3. Samples: 1375687060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:08:24,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 16:08:25,114][52263] Updated weights for policy 0, policy_version 420239 (0.0031) [2024-04-27 16:08:28,751][52263] Updated weights for policy 0, policy_version 420249 (0.0031) [2024-04-27 16:08:29,106][52031] Fps is (10 sec: 49151.5, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6885359616. Throughput: 0: 53653.8. Samples: 1375836920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:08:29,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 16:08:31,318][52263] Updated weights for policy 0, policy_version 420259 (0.0032) [2024-04-27 16:08:34,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6885621760. Throughput: 0: 53532.0. Samples: 1376154100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:08:34,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 16:08:34,984][52263] Updated weights for policy 0, policy_version 420269 (0.0027) [2024-04-27 16:08:37,509][52263] Updated weights for policy 0, policy_version 420279 (0.0027) [2024-04-27 16:08:37,891][52242] Signal inference workers to stop experience collection... (20950 times) [2024-04-27 16:08:37,891][52242] Signal inference workers to resume experience collection... (20950 times) [2024-04-27 16:08:37,903][52263] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-04-27 16:08:37,921][52263] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-04-27 16:08:39,106][52031] Fps is (10 sec: 57344.1, 60 sec: 54067.2, 300 sec: 53761.8). Total num frames: 6885933056. Throughput: 0: 53691.3. Samples: 1376483680. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:08:39,107][52031] Avg episode reward: [(0, '0.492')] [2024-04-27 16:08:41,130][52263] Updated weights for policy 0, policy_version 420289 (0.0028) [2024-04-27 16:08:43,576][52263] Updated weights for policy 0, policy_version 420299 (0.0032) [2024-04-27 16:08:44,107][52031] Fps is (10 sec: 58982.4, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6886211584. Throughput: 0: 53999.5. Samples: 1376649960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 16:08:44,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 16:08:47,199][52263] Updated weights for policy 0, policy_version 420309 (0.0026) [2024-04-27 16:08:49,106][52031] Fps is (10 sec: 54067.2, 60 sec: 54067.3, 300 sec: 53706.2). Total num frames: 6886473728. Throughput: 0: 53869.3. Samples: 1376969720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:08:49,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 16:08:49,808][52263] Updated weights for policy 0, policy_version 420319 (0.0027) [2024-04-27 16:08:53,305][52263] Updated weights for policy 0, policy_version 420329 (0.0031) [2024-04-27 16:08:54,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6886735872. Throughput: 0: 53745.5. Samples: 1377287740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:08:54,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 16:08:56,017][52263] Updated weights for policy 0, policy_version 420339 (0.0031) [2024-04-27 16:08:59,107][52031] Fps is (10 sec: 49151.5, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6886965248. Throughput: 0: 53470.1. Samples: 1377441420. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:08:59,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 16:08:59,395][52263] Updated weights for policy 0, policy_version 420349 (0.0029) [2024-04-27 16:09:02,099][52263] Updated weights for policy 0, policy_version 420359 (0.0029) [2024-04-27 16:09:04,106][52031] Fps is (10 sec: 49152.1, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6887227392. Throughput: 0: 53397.7. Samples: 1377761100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:04,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 16:09:05,578][52263] Updated weights for policy 0, policy_version 420369 (0.0035) [2024-04-27 16:09:08,147][52263] Updated weights for policy 0, policy_version 420379 (0.0025) [2024-04-27 16:09:09,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6887522304. Throughput: 0: 53160.1. Samples: 1378079280. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:09,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 16:09:11,811][52263] Updated weights for policy 0, policy_version 420389 (0.0027) [2024-04-27 16:09:14,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6887800832. Throughput: 0: 53694.2. Samples: 1378253160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:14,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 16:09:14,280][52263] Updated weights for policy 0, policy_version 420399 (0.0024) [2024-04-27 16:09:17,782][52263] Updated weights for policy 0, policy_version 420409 (0.0028) [2024-04-27 16:09:19,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53247.7, 300 sec: 53595.1). Total num frames: 6888062976. Throughput: 0: 53752.2. Samples: 1378572960. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:19,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 16:09:20,522][52263] Updated weights for policy 0, policy_version 420419 (0.0029) [2024-04-27 16:09:23,932][52263] Updated weights for policy 0, policy_version 420429 (0.0027) [2024-04-27 16:09:24,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6888325120. Throughput: 0: 53460.1. Samples: 1378889380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:24,107][52031] Avg episode reward: [(0, '0.673')] [2024-04-27 16:09:26,637][52263] Updated weights for policy 0, policy_version 420439 (0.0031) [2024-04-27 16:09:29,107][52031] Fps is (10 sec: 49152.6, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6888554496. Throughput: 0: 53267.1. Samples: 1379046980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:29,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 16:09:30,089][52263] Updated weights for policy 0, policy_version 420449 (0.0029) [2024-04-27 16:09:32,689][52263] Updated weights for policy 0, policy_version 420459 (0.0031) [2024-04-27 16:09:34,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6888849408. Throughput: 0: 53211.1. Samples: 1379364220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:34,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 16:09:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000420462_6888849408.pth... [2024-04-27 16:09:34,165][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000419677_6875987968.pth [2024-04-27 16:09:36,105][52263] Updated weights for policy 0, policy_version 420469 (0.0031) [2024-04-27 16:09:38,872][52263] Updated weights for policy 0, policy_version 420479 (0.0026) [2024-04-27 16:09:39,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53247.9, 300 sec: 53650.6). 
Total num frames: 6889127936. Throughput: 0: 53196.4. Samples: 1379681580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:39,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 16:09:41,830][52242] Signal inference workers to stop experience collection... (21000 times) [2024-04-27 16:09:41,830][52242] Signal inference workers to resume experience collection... (21000 times) [2024-04-27 16:09:41,843][52263] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-04-27 16:09:41,843][52263] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-04-27 16:09:42,341][52263] Updated weights for policy 0, policy_version 420489 (0.0032) [2024-04-27 16:09:44,106][52031] Fps is (10 sec: 55705.8, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 6889406464. Throughput: 0: 53598.8. Samples: 1379853360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:44,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 16:09:45,051][52263] Updated weights for policy 0, policy_version 420499 (0.0033) [2024-04-27 16:09:48,471][52263] Updated weights for policy 0, policy_version 420509 (0.0028) [2024-04-27 16:09:49,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53247.9, 300 sec: 53650.7). Total num frames: 6889668608. Throughput: 0: 53580.8. Samples: 1380172240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:49,107][52031] Avg episode reward: [(0, '0.673')] [2024-04-27 16:09:51,063][52263] Updated weights for policy 0, policy_version 420519 (0.0028) [2024-04-27 16:09:54,107][52031] Fps is (10 sec: 49151.3, 60 sec: 52701.8, 300 sec: 53595.1). Total num frames: 6889897984. Throughput: 0: 53610.8. Samples: 1380491760. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:54,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 16:09:54,623][52263] Updated weights for policy 0, policy_version 420529 (0.0032) [2024-04-27 16:09:57,087][52263] Updated weights for policy 0, policy_version 420539 (0.0030) [2024-04-27 16:09:59,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6890176512. Throughput: 0: 53045.2. Samples: 1380640200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:09:59,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 16:10:00,815][52263] Updated weights for policy 0, policy_version 420549 (0.0032) [2024-04-27 16:10:03,231][52263] Updated weights for policy 0, policy_version 420559 (0.0030) [2024-04-27 16:10:04,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6890455040. Throughput: 0: 53101.6. Samples: 1380962520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 16:10:04,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 16:10:06,860][52263] Updated weights for policy 0, policy_version 420569 (0.0034) [2024-04-27 16:10:09,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53521.3, 300 sec: 53706.2). Total num frames: 6890733568. Throughput: 0: 53145.7. Samples: 1381280940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:09,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 16:10:09,355][52263] Updated weights for policy 0, policy_version 420579 (0.0036) [2024-04-27 16:10:12,926][52263] Updated weights for policy 0, policy_version 420589 (0.0030) [2024-04-27 16:10:14,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 6890995712. Throughput: 0: 53399.8. Samples: 1381449960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:14,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 16:10:15,305][52263] Updated weights for policy 0, policy_version 420599 (0.0028) [2024-04-27 16:10:19,061][52263] Updated weights for policy 0, policy_version 420609 (0.0036) [2024-04-27 16:10:19,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 6891257856. Throughput: 0: 53465.6. Samples: 1381770180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:19,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 16:10:21,703][52263] Updated weights for policy 0, policy_version 420619 (0.0030) [2024-04-27 16:10:24,107][52031] Fps is (10 sec: 50789.5, 60 sec: 52974.8, 300 sec: 53595.1). Total num frames: 6891503616. Throughput: 0: 53570.6. Samples: 1382092260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:24,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 16:10:25,141][52263] Updated weights for policy 0, policy_version 420629 (0.0033) [2024-04-27 16:10:28,295][52263] Updated weights for policy 0, policy_version 420639 (0.0032) [2024-04-27 16:10:29,107][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6891782144. Throughput: 0: 53135.9. Samples: 1382244480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:29,107][52031] Avg episode reward: [(0, '0.646')] [2024-04-27 16:10:31,202][52263] Updated weights for policy 0, policy_version 420649 (0.0035) [2024-04-27 16:10:34,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6892060672. Throughput: 0: 53159.2. Samples: 1382564400. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:34,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 16:10:34,545][52263] Updated weights for policy 0, policy_version 420659 (0.0027) [2024-04-27 16:10:37,307][52242] Signal inference workers to stop experience collection... (21050 times) [2024-04-27 16:10:37,308][52242] Signal inference workers to resume experience collection... (21050 times) [2024-04-27 16:10:37,349][52263] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-04-27 16:10:37,349][52263] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-04-27 16:10:37,424][52263] Updated weights for policy 0, policy_version 420669 (0.0032) [2024-04-27 16:10:39,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6892339200. Throughput: 0: 53161.8. Samples: 1382884040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:39,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 16:10:40,543][52263] Updated weights for policy 0, policy_version 420679 (0.0029) [2024-04-27 16:10:43,443][52263] Updated weights for policy 0, policy_version 420689 (0.0030) [2024-04-27 16:10:44,107][52031] Fps is (10 sec: 52428.7, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 6892584960. Throughput: 0: 53570.4. Samples: 1383050860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:44,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 16:10:46,686][52263] Updated weights for policy 0, policy_version 420699 (0.0029) [2024-04-27 16:10:49,107][52031] Fps is (10 sec: 50790.5, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 6892847104. Throughput: 0: 53586.6. Samples: 1383373920. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:49,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 16:10:49,544][52263] Updated weights for policy 0, policy_version 420709 (0.0029) [2024-04-27 16:10:52,817][52263] Updated weights for policy 0, policy_version 420719 (0.0026) [2024-04-27 16:10:54,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6893125632. Throughput: 0: 53633.3. Samples: 1383694440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:54,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 16:10:55,709][52263] Updated weights for policy 0, policy_version 420729 (0.0027) [2024-04-27 16:10:58,820][52263] Updated weights for policy 0, policy_version 420739 (0.0026) [2024-04-27 16:10:59,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6893404160. Throughput: 0: 53338.1. Samples: 1383850180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:10:59,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 16:11:01,744][52263] Updated weights for policy 0, policy_version 420749 (0.0028) [2024-04-27 16:11:04,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6893682688. Throughput: 0: 53550.1. Samples: 1384179920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:11:04,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:11:04,922][52263] Updated weights for policy 0, policy_version 420759 (0.0027) [2024-04-27 16:11:07,722][52263] Updated weights for policy 0, policy_version 420769 (0.0027) [2024-04-27 16:11:09,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6893944832. Throughput: 0: 53594.4. Samples: 1384504000. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:11:09,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 16:11:10,838][52263] Updated weights for policy 0, policy_version 420779 (0.0027) [2024-04-27 16:11:13,793][52263] Updated weights for policy 0, policy_version 420789 (0.0028) [2024-04-27 16:11:14,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6894206976. Throughput: 0: 53804.8. Samples: 1384665700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:11:14,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 16:11:17,070][52263] Updated weights for policy 0, policy_version 420799 (0.0030) [2024-04-27 16:11:19,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.2, 300 sec: 53484.1). Total num frames: 6894452736. Throughput: 0: 53835.7. Samples: 1384987000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:11:19,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 16:11:19,921][52263] Updated weights for policy 0, policy_version 420809 (0.0029) [2024-04-27 16:11:23,257][52263] Updated weights for policy 0, policy_version 420819 (0.0031) [2024-04-27 16:11:24,107][52031] Fps is (10 sec: 54067.4, 60 sec: 54067.3, 300 sec: 53706.2). Total num frames: 6894747648. Throughput: 0: 53932.9. Samples: 1385311020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:11:24,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 16:11:26,410][52263] Updated weights for policy 0, policy_version 420829 (0.0033) [2024-04-27 16:11:29,107][52031] Fps is (10 sec: 55704.3, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6895009792. Throughput: 0: 53841.7. Samples: 1385473740. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:11:29,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 16:11:29,288][52263] Updated weights for policy 0, policy_version 420839 (0.0031) [2024-04-27 16:11:32,392][52263] Updated weights for policy 0, policy_version 420849 (0.0024) [2024-04-27 16:11:33,868][52242] Signal inference workers to stop experience collection... (21100 times) [2024-04-27 16:11:33,901][52263] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-04-27 16:11:33,932][52242] Signal inference workers to resume experience collection... (21100 times) [2024-04-27 16:11:33,933][52263] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-04-27 16:11:34,106][52031] Fps is (10 sec: 55705.6, 60 sec: 54067.2, 300 sec: 53761.7). Total num frames: 6895304704. Throughput: 0: 53758.3. Samples: 1385793040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:11:34,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 16:11:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000420856_6895304704.pth... [2024-04-27 16:11:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000420069_6882410496.pth [2024-04-27 16:11:35,327][52263] Updated weights for policy 0, policy_version 420859 (0.0041) [2024-04-27 16:11:38,403][52263] Updated weights for policy 0, policy_version 420869 (0.0035) [2024-04-27 16:11:39,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6895550464. Throughput: 0: 53715.6. Samples: 1386111640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:11:39,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 16:11:41,400][52263] Updated weights for policy 0, policy_version 420879 (0.0026) [2024-04-27 16:11:44,106][52031] Fps is (10 sec: 52429.5, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 6895828992. Throughput: 0: 53851.3. Samples: 1386273480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:11:44,107][52031] Avg episode reward: [(0, '0.679')] [2024-04-27 16:11:44,505][52263] Updated weights for policy 0, policy_version 420889 (0.0030) [2024-04-27 16:11:47,641][52263] Updated weights for policy 0, policy_version 420899 (0.0031) [2024-04-27 16:11:49,106][52031] Fps is (10 sec: 54067.0, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 6896091136. Throughput: 0: 53773.2. Samples: 1386599720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:11:49,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 16:11:50,676][52263] Updated weights for policy 0, policy_version 420909 (0.0026) [2024-04-27 16:11:53,627][52263] Updated weights for policy 0, policy_version 420919 (0.0029) [2024-04-27 16:11:54,106][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6896353280. Throughput: 0: 53791.5. Samples: 1386924620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:11:54,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 16:11:56,641][52263] Updated weights for policy 0, policy_version 420929 (0.0027) [2024-04-27 16:11:59,106][52031] Fps is (10 sec: 55706.0, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 6896648192. Throughput: 0: 53759.7. Samples: 1387084880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:11:59,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 16:11:59,630][52263] Updated weights for policy 0, policy_version 420939 (0.0031) [2024-04-27 16:12:02,607][52263] Updated weights for policy 0, policy_version 420949 (0.0026) [2024-04-27 16:12:04,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6896893952. Throughput: 0: 53748.3. Samples: 1387405680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:04,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 16:12:05,708][52263] Updated weights for policy 0, policy_version 420959 (0.0043) [2024-04-27 16:12:08,811][52263] Updated weights for policy 0, policy_version 420969 (0.0028) [2024-04-27 16:12:09,107][52031] Fps is (10 sec: 54066.5, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 6897188864. Throughput: 0: 53779.5. Samples: 1387731100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:09,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 16:12:11,744][52263] Updated weights for policy 0, policy_version 420979 (0.0032) [2024-04-27 16:12:14,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6897434624. Throughput: 0: 53663.6. Samples: 1387888600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:14,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 16:12:14,927][52263] Updated weights for policy 0, policy_version 420989 (0.0027) [2024-04-27 16:12:17,839][52263] Updated weights for policy 0, policy_version 420999 (0.0028) [2024-04-27 16:12:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 54340.0, 300 sec: 53595.1). Total num frames: 6897713152. Throughput: 0: 53749.2. Samples: 1388211760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:19,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 16:12:20,927][52263] Updated weights for policy 0, policy_version 421009 (0.0027) [2024-04-27 16:12:23,750][52263] Updated weights for policy 0, policy_version 421019 (0.0030) [2024-04-27 16:12:24,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6897975296. Throughput: 0: 53952.3. Samples: 1388539500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:24,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 16:12:26,928][52263] Updated weights for policy 0, policy_version 421029 (0.0031) [2024-04-27 16:12:29,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6898237440. Throughput: 0: 53904.3. Samples: 1388699180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:29,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 16:12:30,044][52263] Updated weights for policy 0, policy_version 421039 (0.0033) [2024-04-27 16:12:32,972][52263] Updated weights for policy 0, policy_version 421049 (0.0034) [2024-04-27 16:12:34,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6898499584. Throughput: 0: 53879.1. Samples: 1389024280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:34,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 16:12:36,242][52263] Updated weights for policy 0, policy_version 421059 (0.0030) [2024-04-27 16:12:39,103][52263] Updated weights for policy 0, policy_version 421069 (0.0038) [2024-04-27 16:12:39,106][52031] Fps is (10 sec: 55706.2, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6898794496. Throughput: 0: 53774.3. Samples: 1389344460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:39,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 16:12:41,953][52242] Signal inference workers to stop experience collection... (21150 times) [2024-04-27 16:12:41,954][52242] Signal inference workers to resume experience collection... 
(21150 times) [2024-04-27 16:12:41,987][52263] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-04-27 16:12:41,988][52263] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-04-27 16:12:42,279][52263] Updated weights for policy 0, policy_version 421079 (0.0037) [2024-04-27 16:12:44,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6899040256. Throughput: 0: 53776.0. Samples: 1389504800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:44,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 16:12:45,104][52263] Updated weights for policy 0, policy_version 421089 (0.0028) [2024-04-27 16:12:48,283][52263] Updated weights for policy 0, policy_version 421099 (0.0031) [2024-04-27 16:12:49,107][52031] Fps is (10 sec: 52427.6, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6899318784. Throughput: 0: 53761.5. Samples: 1389824960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-27 16:12:49,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 16:12:51,322][52263] Updated weights for policy 0, policy_version 421109 (0.0028) [2024-04-27 16:12:54,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6899564544. Throughput: 0: 53647.2. Samples: 1390145220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:12:54,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 16:12:54,391][52263] Updated weights for policy 0, policy_version 421119 (0.0028) [2024-04-27 16:12:57,498][52263] Updated weights for policy 0, policy_version 421129 (0.0034) [2024-04-27 16:12:59,107][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6899859456. Throughput: 0: 53857.0. Samples: 1390312160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:12:59,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 16:13:00,563][52263] Updated weights for policy 0, policy_version 421139 (0.0034) [2024-04-27 16:13:03,643][52263] Updated weights for policy 0, policy_version 421149 (0.0027) [2024-04-27 16:13:04,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6900105216. Throughput: 0: 53870.7. Samples: 1390635940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:04,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 16:13:06,576][52263] Updated weights for policy 0, policy_version 421159 (0.0031) [2024-04-27 16:13:09,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6900383744. Throughput: 0: 53723.2. Samples: 1390957040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:09,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 16:13:09,846][52263] Updated weights for policy 0, policy_version 421169 (0.0025) [2024-04-27 16:13:12,637][52263] Updated weights for policy 0, policy_version 421179 (0.0032) [2024-04-27 16:13:14,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6900662272. Throughput: 0: 53703.1. Samples: 1391115820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:14,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 16:13:15,818][52263] Updated weights for policy 0, policy_version 421189 (0.0028) [2024-04-27 16:13:19,084][52263] Updated weights for policy 0, policy_version 421199 (0.0029) [2024-04-27 16:13:19,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6900924416. Throughput: 0: 53476.4. Samples: 1391430720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:19,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 16:13:21,934][52263] Updated weights for policy 0, policy_version 421209 (0.0032) [2024-04-27 16:13:24,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.2, 300 sec: 53650.6). Total num frames: 6901186560. Throughput: 0: 53547.4. Samples: 1391754100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:24,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 16:13:25,188][52263] Updated weights for policy 0, policy_version 421219 (0.0034) [2024-04-27 16:13:28,144][52263] Updated weights for policy 0, policy_version 421229 (0.0027) [2024-04-27 16:13:29,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6901465088. Throughput: 0: 53567.5. Samples: 1391915340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:29,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 16:13:31,421][52263] Updated weights for policy 0, policy_version 421239 (0.0027) [2024-04-27 16:13:34,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6901727232. Throughput: 0: 53520.3. Samples: 1392233360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:34,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 16:13:34,140][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000421249_6901743616.pth... [2024-04-27 16:13:34,145][52263] Updated weights for policy 0, policy_version 421249 (0.0024) [2024-04-27 16:13:34,187][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000420462_6888849408.pth [2024-04-27 16:13:35,320][52242] Signal inference workers to stop experience collection... (21200 times) [2024-04-27 16:13:35,320][52242] Signal inference workers to resume experience collection... 
(21200 times) [2024-04-27 16:13:35,350][52263] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-04-27 16:13:35,350][52263] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-04-27 16:13:37,506][52263] Updated weights for policy 0, policy_version 421259 (0.0026) [2024-04-27 16:13:39,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6901989376. Throughput: 0: 53584.4. Samples: 1392556520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:39,107][52031] Avg episode reward: [(0, '0.552')] [2024-04-27 16:13:40,133][52263] Updated weights for policy 0, policy_version 421269 (0.0028) [2024-04-27 16:13:43,787][52263] Updated weights for policy 0, policy_version 421279 (0.0033) [2024-04-27 16:13:44,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6902251520. Throughput: 0: 53497.3. Samples: 1392719540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:44,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 16:13:46,576][52263] Updated weights for policy 0, policy_version 421289 (0.0034) [2024-04-27 16:13:49,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6902513664. Throughput: 0: 53364.1. Samples: 1393037320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:49,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 16:13:49,790][52263] Updated weights for policy 0, policy_version 421299 (0.0028) [2024-04-27 16:13:52,890][52263] Updated weights for policy 0, policy_version 421309 (0.0032) [2024-04-27 16:13:54,106][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6902808576. Throughput: 0: 53330.7. Samples: 1393356920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:54,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 16:13:55,810][52263] Updated weights for policy 0, policy_version 421319 (0.0030) [2024-04-27 16:13:58,944][52263] Updated weights for policy 0, policy_version 421329 (0.0032) [2024-04-27 16:13:59,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 6903054336. Throughput: 0: 53409.4. Samples: 1393519240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:13:59,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 16:14:01,982][52263] Updated weights for policy 0, policy_version 421339 (0.0029) [2024-04-27 16:14:04,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6903332864. Throughput: 0: 53534.2. Samples: 1393839760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:14:04,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 16:14:04,948][52263] Updated weights for policy 0, policy_version 421349 (0.0038) [2024-04-27 16:14:08,068][52263] Updated weights for policy 0, policy_version 421359 (0.0030) [2024-04-27 16:14:09,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6903595008. Throughput: 0: 53422.2. Samples: 1394158100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:14:09,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 16:14:11,064][52263] Updated weights for policy 0, policy_version 421369 (0.0028) [2024-04-27 16:14:14,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6903857152. Throughput: 0: 53603.1. Samples: 1394327480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-04-27 16:14:14,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 16:14:14,324][52263] Updated weights for policy 0, policy_version 421379 (0.0028) [2024-04-27 16:14:17,281][52263] Updated weights for policy 0, policy_version 421389 (0.0032) [2024-04-27 16:14:19,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6904135680. Throughput: 0: 53637.7. Samples: 1394647060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:14:19,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 16:14:20,490][52263] Updated weights for policy 0, policy_version 421399 (0.0033) [2024-04-27 16:14:23,334][52263] Updated weights for policy 0, policy_version 421409 (0.0031) [2024-04-27 16:14:24,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6904414208. Throughput: 0: 53533.8. Samples: 1394965540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:14:24,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 16:14:26,624][52263] Updated weights for policy 0, policy_version 421419 (0.0027) [2024-04-27 16:14:29,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 6904676352. Throughput: 0: 53515.6. Samples: 1395127740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:14:29,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 16:14:29,402][52263] Updated weights for policy 0, policy_version 421429 (0.0031) [2024-04-27 16:14:32,661][52263] Updated weights for policy 0, policy_version 421439 (0.0026) [2024-04-27 16:14:34,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6904922112. Throughput: 0: 53551.1. Samples: 1395447120. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:14:34,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 16:14:35,577][52263] Updated weights for policy 0, policy_version 421449 (0.0033) [2024-04-27 16:14:38,994][52263] Updated weights for policy 0, policy_version 421459 (0.0029) [2024-04-27 16:14:39,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6905184256. Throughput: 0: 53550.3. Samples: 1395766680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:14:39,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 16:14:40,915][52242] Signal inference workers to stop experience collection... (21250 times) [2024-04-27 16:14:40,915][52242] Signal inference workers to resume experience collection... (21250 times) [2024-04-27 16:14:40,926][52263] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-04-27 16:14:40,946][52263] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-04-27 16:14:41,764][52263] Updated weights for policy 0, policy_version 421469 (0.0033) [2024-04-27 16:14:44,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 6905462784. Throughput: 0: 53473.1. Samples: 1395925540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:14:44,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 16:14:45,165][52263] Updated weights for policy 0, policy_version 421479 (0.0033) [2024-04-27 16:14:47,924][52263] Updated weights for policy 0, policy_version 421489 (0.0028) [2024-04-27 16:14:49,106][52031] Fps is (10 sec: 57344.5, 60 sec: 54067.3, 300 sec: 53761.8). Total num frames: 6905757696. Throughput: 0: 53449.0. Samples: 1396244960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:14:49,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 16:14:51,148][52263] Updated weights for policy 0, policy_version 421499 (0.0029) [2024-04-27 16:14:54,045][52263] Updated weights for policy 0, policy_version 421509 (0.0038) [2024-04-27 16:14:54,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 6906003456. Throughput: 0: 53469.0. Samples: 1396564200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:14:54,115][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 16:14:57,292][52263] Updated weights for policy 0, policy_version 421519 (0.0030) [2024-04-27 16:14:59,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 6906281984. Throughput: 0: 53318.0. Samples: 1396726780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:14:59,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 16:15:00,179][52263] Updated weights for policy 0, policy_version 421529 (0.0032) [2024-04-27 16:15:03,387][52263] Updated weights for policy 0, policy_version 421539 (0.0032) [2024-04-27 16:15:04,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6906527744. Throughput: 0: 53374.2. Samples: 1397048900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:15:04,115][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 16:15:06,123][52263] Updated weights for policy 0, policy_version 421549 (0.0029) [2024-04-27 16:15:09,106][52031] Fps is (10 sec: 50790.0, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6906789888. Throughput: 0: 53526.3. Samples: 1397374220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:15:09,116][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 16:15:09,472][52263] Updated weights for policy 0, policy_version 421559 (0.0034) [2024-04-27 16:15:12,287][52263] Updated weights for policy 0, policy_version 421569 (0.0032) [2024-04-27 16:15:14,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6907068416. Throughput: 0: 53285.3. Samples: 1397525580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:15:14,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 16:15:15,684][52263] Updated weights for policy 0, policy_version 421579 (0.0027) [2024-04-27 16:15:18,307][52263] Updated weights for policy 0, policy_version 421589 (0.0029) [2024-04-27 16:15:19,107][52031] Fps is (10 sec: 58981.7, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6907379712. Throughput: 0: 53364.4. Samples: 1397848520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:15:19,116][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 16:15:21,743][52263] Updated weights for policy 0, policy_version 421599 (0.0032) [2024-04-27 16:15:24,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6907625472. Throughput: 0: 53409.8. Samples: 1398170120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:15:24,115][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 16:15:24,301][52263] Updated weights for policy 0, policy_version 421609 (0.0027) [2024-04-27 16:15:27,918][52263] Updated weights for policy 0, policy_version 421619 (0.0035) [2024-04-27 16:15:29,106][52031] Fps is (10 sec: 49152.5, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6907871232. Throughput: 0: 53525.5. Samples: 1398334180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:15:29,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 16:15:30,365][52263] Updated weights for policy 0, policy_version 421629 (0.0030) [2024-04-27 16:15:34,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6908116992. Throughput: 0: 53539.5. Samples: 1398654240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 16:15:34,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 16:15:34,159][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000421639_6908133376.pth... [2024-04-27 16:15:34,163][52263] Updated weights for policy 0, policy_version 421639 (0.0028) [2024-04-27 16:15:34,201][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000420856_6895304704.pth [2024-04-27 16:15:36,529][52242] Signal inference workers to stop experience collection... (21300 times) [2024-04-27 16:15:36,576][52263] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-04-27 16:15:36,592][52242] Signal inference workers to resume experience collection... (21300 times) [2024-04-27 16:15:36,593][52263] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-04-27 16:15:36,597][52263] Updated weights for policy 0, policy_version 421649 (0.0027) [2024-04-27 16:15:39,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6908411904. Throughput: 0: 53587.0. Samples: 1398975620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:15:39,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 16:15:40,217][52263] Updated weights for policy 0, policy_version 421659 (0.0028) [2024-04-27 16:15:42,737][52263] Updated weights for policy 0, policy_version 421669 (0.0037) [2024-04-27 16:15:44,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6908690432. Throughput: 0: 53537.2. 
Samples: 1399135960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:15:44,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 16:15:46,217][52263] Updated weights for policy 0, policy_version 421679 (0.0034) [2024-04-27 16:15:48,777][52263] Updated weights for policy 0, policy_version 421689 (0.0030) [2024-04-27 16:15:49,107][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.0, 300 sec: 53761.7). Total num frames: 6908985344. Throughput: 0: 53657.4. Samples: 1399463480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:15:49,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 16:15:52,370][52263] Updated weights for policy 0, policy_version 421699 (0.0037) [2024-04-27 16:15:54,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6909214720. Throughput: 0: 53576.2. Samples: 1399785160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:15:54,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 16:15:54,689][52263] Updated weights for policy 0, policy_version 421709 (0.0032) [2024-04-27 16:15:58,568][52263] Updated weights for policy 0, policy_version 421719 (0.0030) [2024-04-27 16:15:59,106][52031] Fps is (10 sec: 47513.8, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 6909460480. Throughput: 0: 53786.3. Samples: 1399945960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:15:59,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 16:16:00,897][52263] Updated weights for policy 0, policy_version 421729 (0.0027) [2024-04-27 16:16:04,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 6909722624. Throughput: 0: 53733.9. Samples: 1400266540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:04,107][52031] Avg episode reward: [(0, '0.481')] [2024-04-27 16:16:04,699][52263] Updated weights for policy 0, policy_version 421739 (0.0025) [2024-04-27 16:16:07,184][52263] Updated weights for policy 0, policy_version 421749 (0.0029) [2024-04-27 16:16:09,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6910017536. Throughput: 0: 53698.0. Samples: 1400586540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:09,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 16:16:10,627][52263] Updated weights for policy 0, policy_version 421759 (0.0023) [2024-04-27 16:16:13,059][52263] Updated weights for policy 0, policy_version 421769 (0.0027) [2024-04-27 16:16:14,106][52031] Fps is (10 sec: 58982.3, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 6910312448. Throughput: 0: 53824.9. Samples: 1400756300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:14,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 16:16:16,895][52263] Updated weights for policy 0, policy_version 421779 (0.0031) [2024-04-27 16:16:19,046][52263] Updated weights for policy 0, policy_version 421789 (0.0026) [2024-04-27 16:16:19,107][52031] Fps is (10 sec: 57344.0, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6910590976. Throughput: 0: 53902.5. Samples: 1401079860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:19,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 16:16:23,036][52242] Signal inference workers to stop experience collection... (21350 times) [2024-04-27 16:16:23,084][52263] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-04-27 16:16:23,097][52242] Signal inference workers to resume experience collection... 
(21350 times) [2024-04-27 16:16:23,102][52263] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-04-27 16:16:23,105][52263] Updated weights for policy 0, policy_version 421799 (0.0029) [2024-04-27 16:16:24,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6910853120. Throughput: 0: 53937.0. Samples: 1401402780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:24,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 16:16:25,190][52263] Updated weights for policy 0, policy_version 421809 (0.0030) [2024-04-27 16:16:29,106][52031] Fps is (10 sec: 47514.7, 60 sec: 53248.1, 300 sec: 53428.5). Total num frames: 6911066112. Throughput: 0: 53780.2. Samples: 1401556060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:29,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 16:16:29,119][52263] Updated weights for policy 0, policy_version 421819 (0.0028) [2024-04-27 16:16:31,293][52263] Updated weights for policy 0, policy_version 421829 (0.0030) [2024-04-27 16:16:34,107][52031] Fps is (10 sec: 50790.5, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6911361024. Throughput: 0: 53776.5. Samples: 1401883420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:34,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 16:16:34,998][52263] Updated weights for policy 0, policy_version 421839 (0.0027) [2024-04-27 16:16:37,503][52263] Updated weights for policy 0, policy_version 421849 (0.0034) [2024-04-27 16:16:39,106][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6911639552. Throughput: 0: 53831.3. Samples: 1402207560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:39,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 16:16:41,085][52263] Updated weights for policy 0, policy_version 421859 (0.0029) [2024-04-27 16:16:43,515][52263] Updated weights for policy 0, policy_version 421869 (0.0027) [2024-04-27 16:16:44,107][52031] Fps is (10 sec: 57343.5, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 6911934464. Throughput: 0: 53952.3. Samples: 1402373820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:44,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 16:16:47,176][52263] Updated weights for policy 0, policy_version 421879 (0.0028) [2024-04-27 16:16:49,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6912196608. Throughput: 0: 53979.0. Samples: 1402695600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:49,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 16:16:49,641][52263] Updated weights for policy 0, policy_version 421889 (0.0035) [2024-04-27 16:16:53,218][52263] Updated weights for policy 0, policy_version 421899 (0.0029) [2024-04-27 16:16:54,106][52031] Fps is (10 sec: 54067.8, 60 sec: 54340.4, 300 sec: 53650.6). Total num frames: 6912475136. Throughput: 0: 54098.8. Samples: 1403020980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:16:54,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 16:16:55,688][52263] Updated weights for policy 0, policy_version 421909 (0.0032) [2024-04-27 16:16:59,106][52031] Fps is (10 sec: 50790.9, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6912704512. Throughput: 0: 53716.5. Samples: 1403173540. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:16:59,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 16:16:59,383][52263] Updated weights for policy 0, policy_version 421919 (0.0032) [2024-04-27 16:17:01,839][52263] Updated weights for policy 0, policy_version 421929 (0.0027) [2024-04-27 16:17:04,106][52031] Fps is (10 sec: 49152.3, 60 sec: 54067.2, 300 sec: 53484.1). Total num frames: 6912966656. Throughput: 0: 53707.8. Samples: 1403496700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:04,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 16:17:05,476][52263] Updated weights for policy 0, policy_version 421939 (0.0028) [2024-04-27 16:17:08,071][52263] Updated weights for policy 0, policy_version 421949 (0.0028) [2024-04-27 16:17:09,107][52031] Fps is (10 sec: 57342.9, 60 sec: 54340.3, 300 sec: 53706.2). Total num frames: 6913277952. Throughput: 0: 53672.8. Samples: 1403818060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:09,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 16:17:11,633][52263] Updated weights for policy 0, policy_version 421959 (0.0031) [2024-04-27 16:17:14,020][52263] Updated weights for policy 0, policy_version 421969 (0.0029) [2024-04-27 16:17:14,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6913540096. Throughput: 0: 54106.1. Samples: 1403990840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:14,107][52031] Avg episode reward: [(0, '0.701')] [2024-04-27 16:17:14,905][52242] Signal inference workers to stop experience collection... (21400 times) [2024-04-27 16:17:14,906][52242] Signal inference workers to resume experience collection... 
(21400 times) [2024-04-27 16:17:14,932][52263] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-04-27 16:17:14,932][52263] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-04-27 16:17:17,633][52263] Updated weights for policy 0, policy_version 421979 (0.0035) [2024-04-27 16:17:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6913802240. Throughput: 0: 54040.8. Samples: 1404315260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:19,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 16:17:20,296][52263] Updated weights for policy 0, policy_version 421989 (0.0025) [2024-04-27 16:17:23,618][52263] Updated weights for policy 0, policy_version 421999 (0.0030) [2024-04-27 16:17:24,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6914064384. Throughput: 0: 54050.6. Samples: 1404639840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:24,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 16:17:26,538][52263] Updated weights for policy 0, policy_version 422009 (0.0028) [2024-04-27 16:17:29,106][52031] Fps is (10 sec: 52429.8, 60 sec: 54340.2, 300 sec: 53650.7). Total num frames: 6914326528. Throughput: 0: 53853.1. Samples: 1404797200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:29,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 16:17:29,583][52263] Updated weights for policy 0, policy_version 422019 (0.0031) [2024-04-27 16:17:32,485][52263] Updated weights for policy 0, policy_version 422029 (0.0028) [2024-04-27 16:17:34,106][52031] Fps is (10 sec: 54067.5, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6914605056. Throughput: 0: 53858.7. Samples: 1405119240. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:34,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:17:34,119][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000422034_6914605056.pth... [2024-04-27 16:17:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000421249_6901743616.pth [2024-04-27 16:17:35,815][52263] Updated weights for policy 0, policy_version 422039 (0.0027) [2024-04-27 16:17:38,460][52263] Updated weights for policy 0, policy_version 422049 (0.0029) [2024-04-27 16:17:39,107][52031] Fps is (10 sec: 55704.8, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 6914883584. Throughput: 0: 53723.5. Samples: 1405438540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:39,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 16:17:41,827][52263] Updated weights for policy 0, policy_version 422059 (0.0029) [2024-04-27 16:17:44,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6915145728. Throughput: 0: 54043.9. Samples: 1405605520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:44,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 16:17:44,580][52263] Updated weights for policy 0, policy_version 422069 (0.0026) [2024-04-27 16:17:47,897][52263] Updated weights for policy 0, policy_version 422079 (0.0031) [2024-04-27 16:17:49,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6915424256. Throughput: 0: 54079.9. Samples: 1405930300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:49,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 16:17:50,699][52263] Updated weights for policy 0, policy_version 422089 (0.0027) [2024-04-27 16:17:54,019][52263] Updated weights for policy 0, policy_version 422099 (0.0027) [2024-04-27 16:17:54,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.0, 300 sec: 53595.1). 
Total num frames: 6915670016. Throughput: 0: 54116.1. Samples: 1406253280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:54,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 16:17:56,666][52263] Updated weights for policy 0, policy_version 422109 (0.0034) [2024-04-27 16:17:59,107][52031] Fps is (10 sec: 52428.1, 60 sec: 54067.0, 300 sec: 53706.2). Total num frames: 6915948544. Throughput: 0: 53671.3. Samples: 1406406060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:17:59,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 16:18:00,169][52263] Updated weights for policy 0, policy_version 422119 (0.0031) [2024-04-27 16:18:02,756][52263] Updated weights for policy 0, policy_version 422129 (0.0036) [2024-04-27 16:18:04,106][52031] Fps is (10 sec: 54067.4, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 6916210688. Throughput: 0: 53592.2. Samples: 1406726900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:18:04,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 16:18:06,181][52242] Signal inference workers to stop experience collection... (21450 times) [2024-04-27 16:18:06,220][52263] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-04-27 16:18:06,250][52242] Signal inference workers to resume experience collection... (21450 times) [2024-04-27 16:18:06,251][52263] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-04-27 16:18:06,254][52263] Updated weights for policy 0, policy_version 422139 (0.0029) [2024-04-27 16:18:08,945][52263] Updated weights for policy 0, policy_version 422149 (0.0033) [2024-04-27 16:18:09,106][52031] Fps is (10 sec: 55707.3, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 6916505600. Throughput: 0: 53532.6. Samples: 1407048800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:18:09,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 16:18:12,272][52263] Updated weights for policy 0, policy_version 422159 (0.0030) [2024-04-27 16:18:14,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6916767744. Throughput: 0: 53719.4. Samples: 1407214580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:18:14,107][52031] Avg episode reward: [(0, '0.665')] [2024-04-27 16:18:15,045][52263] Updated weights for policy 0, policy_version 422169 (0.0028) [2024-04-27 16:18:18,259][52263] Updated weights for policy 0, policy_version 422179 (0.0030) [2024-04-27 16:18:19,107][52031] Fps is (10 sec: 54066.3, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 6917046272. Throughput: 0: 53850.6. Samples: 1407542520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-27 16:18:19,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 16:18:20,994][52263] Updated weights for policy 0, policy_version 422189 (0.0031) [2024-04-27 16:18:24,107][52031] Fps is (10 sec: 49151.8, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6917259264. Throughput: 0: 53818.6. Samples: 1407860380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:18:24,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 16:18:24,431][52263] Updated weights for policy 0, policy_version 422199 (0.0026) [2024-04-27 16:18:27,023][52263] Updated weights for policy 0, policy_version 422209 (0.0030) [2024-04-27 16:18:29,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6917554176. Throughput: 0: 53643.9. Samples: 1408019500. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:18:29,108][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 16:18:30,621][52263] Updated weights for policy 0, policy_version 422219 (0.0030) [2024-04-27 16:18:33,053][52263] Updated weights for policy 0, policy_version 422229 (0.0025) [2024-04-27 16:18:34,106][52031] Fps is (10 sec: 57344.9, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6917832704. Throughput: 0: 53531.2. Samples: 1408339200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:18:34,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 16:18:36,551][52263] Updated weights for policy 0, policy_version 422239 (0.0023) [2024-04-27 16:18:39,104][52263] Updated weights for policy 0, policy_version 422249 (0.0028) [2024-04-27 16:18:39,107][52031] Fps is (10 sec: 57344.0, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6918127616. Throughput: 0: 53637.3. Samples: 1408666960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:18:39,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 16:18:42,618][52263] Updated weights for policy 0, policy_version 422259 (0.0032) [2024-04-27 16:18:44,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6918373376. Throughput: 0: 53946.0. Samples: 1408833620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:18:44,107][52031] Avg episode reward: [(0, '0.520')] [2024-04-27 16:18:45,365][52263] Updated weights for policy 0, policy_version 422269 (0.0040) [2024-04-27 16:18:48,781][52263] Updated weights for policy 0, policy_version 422279 (0.0035) [2024-04-27 16:18:49,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6918651904. Throughput: 0: 53870.2. Samples: 1409151060. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:18:49,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 16:18:51,335][52263] Updated weights for policy 0, policy_version 422289 (0.0031) [2024-04-27 16:18:54,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6918897664. Throughput: 0: 53827.3. Samples: 1409471040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:18:54,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 16:18:54,968][52263] Updated weights for policy 0, policy_version 422299 (0.0029) [2024-04-27 16:18:56,734][52242] Signal inference workers to stop experience collection... (21500 times) [2024-04-27 16:18:56,735][52242] Signal inference workers to resume experience collection... (21500 times) [2024-04-27 16:18:56,758][52263] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-04-27 16:18:56,758][52263] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-04-27 16:18:57,288][52263] Updated weights for policy 0, policy_version 422309 (0.0032) [2024-04-27 16:18:59,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6919176192. Throughput: 0: 53757.3. Samples: 1409633660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:18:59,107][52031] Avg episode reward: [(0, '0.708')] [2024-04-27 16:19:00,933][52263] Updated weights for policy 0, policy_version 422319 (0.0027) [2024-04-27 16:19:03,674][52263] Updated weights for policy 0, policy_version 422329 (0.0027) [2024-04-27 16:19:04,107][52031] Fps is (10 sec: 55705.8, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 6919454720. Throughput: 0: 53618.2. Samples: 1409955340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:19:04,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 16:19:06,973][52263] Updated weights for policy 0, policy_version 422339 (0.0029) [2024-04-27 16:19:09,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53520.9, 300 sec: 53761.7). Total num frames: 6919716864. Throughput: 0: 53694.7. Samples: 1410276640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:19:09,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 16:19:09,603][52263] Updated weights for policy 0, policy_version 422349 (0.0028) [2024-04-27 16:19:13,186][52263] Updated weights for policy 0, policy_version 422359 (0.0035) [2024-04-27 16:19:14,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6919995392. Throughput: 0: 53756.8. Samples: 1410438560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:19:14,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 16:19:15,568][52263] Updated weights for policy 0, policy_version 422369 (0.0032) [2024-04-27 16:19:19,106][52031] Fps is (10 sec: 50791.1, 60 sec: 52975.0, 300 sec: 53595.1). Total num frames: 6920224768. Throughput: 0: 53772.9. Samples: 1410758980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:19:19,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 16:19:19,268][52263] Updated weights for policy 0, policy_version 422379 (0.0032) [2024-04-27 16:19:21,579][52263] Updated weights for policy 0, policy_version 422389 (0.0028) [2024-04-27 16:19:24,107][52031] Fps is (10 sec: 50790.2, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 6920503296. Throughput: 0: 53674.1. Samples: 1411082300. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:19:24,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 16:19:25,317][52263] Updated weights for policy 0, policy_version 422399 (0.0026) [2024-04-27 16:19:27,651][52263] Updated weights for policy 0, policy_version 422409 (0.0029) [2024-04-27 16:19:29,106][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6920781824. Throughput: 0: 53525.3. Samples: 1411242260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:19:29,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 16:19:31,307][52263] Updated weights for policy 0, policy_version 422419 (0.0030) [2024-04-27 16:19:34,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53794.1, 300 sec: 53817.3). Total num frames: 6921060352. Throughput: 0: 53679.7. Samples: 1411566640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:19:34,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 16:19:34,158][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000422429_6921076736.pth... [2024-04-27 16:19:34,163][52263] Updated weights for policy 0, policy_version 422429 (0.0029) [2024-04-27 16:19:34,215][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000421639_6908133376.pth [2024-04-27 16:19:37,618][52263] Updated weights for policy 0, policy_version 422439 (0.0030) [2024-04-27 16:19:39,107][52031] Fps is (10 sec: 57343.4, 60 sec: 53794.1, 300 sec: 53872.8). Total num frames: 6921355264. Throughput: 0: 53701.3. Samples: 1411887600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 16:19:39,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 16:19:40,265][52263] Updated weights for policy 0, policy_version 422449 (0.0032) [2024-04-27 16:19:43,608][52263] Updated weights for policy 0, policy_version 422459 (0.0030) [2024-04-27 16:19:44,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.1, 300 sec: 53706.2). 
Total num frames: 6921601024. Throughput: 0: 53685.8. Samples: 1412049520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:19:44,107][52031] Avg episode reward: [(0, '0.757')] [2024-04-27 16:19:44,219][52242] Saving new best policy, reward=0.757! [2024-04-27 16:19:46,188][52263] Updated weights for policy 0, policy_version 422469 (0.0026) [2024-04-27 16:19:47,231][52242] Signal inference workers to stop experience collection... (21550 times) [2024-04-27 16:19:47,232][52242] Signal inference workers to resume experience collection... (21550 times) [2024-04-27 16:19:47,257][52263] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-04-27 16:19:47,257][52263] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-04-27 16:19:49,106][52031] Fps is (10 sec: 47514.6, 60 sec: 52975.0, 300 sec: 53650.7). Total num frames: 6921830400. Throughput: 0: 53732.6. Samples: 1412373300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:19:49,107][52031] Avg episode reward: [(0, '0.683')] [2024-04-27 16:19:49,839][52263] Updated weights for policy 0, policy_version 422479 (0.0032) [2024-04-27 16:19:52,139][52263] Updated weights for policy 0, policy_version 422489 (0.0027) [2024-04-27 16:19:54,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6922125312. Throughput: 0: 53716.8. Samples: 1412693900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:19:54,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 16:19:55,867][52263] Updated weights for policy 0, policy_version 422499 (0.0027) [2024-04-27 16:19:58,484][52263] Updated weights for policy 0, policy_version 422509 (0.0029) [2024-04-27 16:19:59,107][52031] Fps is (10 sec: 58981.3, 60 sec: 54067.2, 300 sec: 53872.8). Total num frames: 6922420224. Throughput: 0: 53715.1. Samples: 1412855740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:19:59,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 16:20:01,942][52263] Updated weights for policy 0, policy_version 422519 (0.0025) [2024-04-27 16:20:04,107][52031] Fps is (10 sec: 57344.1, 60 sec: 54067.2, 300 sec: 53928.3). Total num frames: 6922698752. Throughput: 0: 53857.6. Samples: 1413182580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:04,107][52031] Avg episode reward: [(0, '0.693')] [2024-04-27 16:20:04,472][52263] Updated weights for policy 0, policy_version 422529 (0.0027) [2024-04-27 16:20:08,111][52263] Updated weights for policy 0, policy_version 422539 (0.0036) [2024-04-27 16:20:09,107][52031] Fps is (10 sec: 54067.5, 60 sec: 54067.2, 300 sec: 53872.8). Total num frames: 6922960896. Throughput: 0: 53743.2. Samples: 1413500740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:09,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 16:20:10,536][52263] Updated weights for policy 0, policy_version 422549 (0.0024) [2024-04-27 16:20:14,058][52263] Updated weights for policy 0, policy_version 422559 (0.0029) [2024-04-27 16:20:14,106][52031] Fps is (10 sec: 50791.4, 60 sec: 53521.3, 300 sec: 53650.7). Total num frames: 6923206656. Throughput: 0: 53792.2. Samples: 1413662900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:14,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 16:20:16,593][52263] Updated weights for policy 0, policy_version 422569 (0.0031) [2024-04-27 16:20:19,107][52031] Fps is (10 sec: 47513.5, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6923436032. Throughput: 0: 53684.3. Samples: 1413982440. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:19,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 16:20:20,154][52263] Updated weights for policy 0, policy_version 422579 (0.0032) [2024-04-27 16:20:22,730][52263] Updated weights for policy 0, policy_version 422589 (0.0030) [2024-04-27 16:20:24,107][52031] Fps is (10 sec: 54066.1, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6923747328. Throughput: 0: 53633.4. Samples: 1414301100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:24,107][52031] Avg episode reward: [(0, '0.663')] [2024-04-27 16:20:26,372][52263] Updated weights for policy 0, policy_version 422599 (0.0027) [2024-04-27 16:20:28,888][52263] Updated weights for policy 0, policy_version 422609 (0.0029) [2024-04-27 16:20:29,106][52031] Fps is (10 sec: 58983.1, 60 sec: 54067.3, 300 sec: 53928.4). Total num frames: 6924025856. Throughput: 0: 53806.3. Samples: 1414470800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:29,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:20:32,327][52263] Updated weights for policy 0, policy_version 422619 (0.0035) [2024-04-27 16:20:34,106][52031] Fps is (10 sec: 55706.1, 60 sec: 54067.1, 300 sec: 53872.8). Total num frames: 6924304384. Throughput: 0: 53776.4. Samples: 1414793240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:34,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 16:20:34,946][52263] Updated weights for policy 0, policy_version 422629 (0.0031) [2024-04-27 16:20:37,282][52242] Signal inference workers to stop experience collection... (21600 times) [2024-04-27 16:20:37,283][52242] Signal inference workers to resume experience collection... 
(21600 times) [2024-04-27 16:20:37,321][52263] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-04-27 16:20:37,321][52263] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-04-27 16:20:38,409][52263] Updated weights for policy 0, policy_version 422639 (0.0032) [2024-04-27 16:20:39,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.2, 300 sec: 53761.8). Total num frames: 6924550144. Throughput: 0: 53711.7. Samples: 1415110920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:39,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 16:20:41,165][52263] Updated weights for policy 0, policy_version 422649 (0.0038) [2024-04-27 16:20:44,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6924812288. Throughput: 0: 53670.5. Samples: 1415270900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:44,107][52031] Avg episode reward: [(0, '0.665')] [2024-04-27 16:20:44,549][52263] Updated weights for policy 0, policy_version 422659 (0.0032) [2024-04-27 16:20:47,524][52263] Updated weights for policy 0, policy_version 422669 (0.0031) [2024-04-27 16:20:49,107][52031] Fps is (10 sec: 49151.3, 60 sec: 53520.9, 300 sec: 53650.7). Total num frames: 6925041664. Throughput: 0: 53578.2. Samples: 1415593600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:49,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 16:20:50,625][52263] Updated weights for policy 0, policy_version 422679 (0.0036) [2024-04-27 16:20:53,590][52263] Updated weights for policy 0, policy_version 422689 (0.0027) [2024-04-27 16:20:54,106][52031] Fps is (10 sec: 54066.7, 60 sec: 53794.2, 300 sec: 53872.8). Total num frames: 6925352960. Throughput: 0: 53713.4. Samples: 1415917840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:54,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 16:20:56,801][52263] Updated weights for policy 0, policy_version 422699 (0.0025) [2024-04-27 16:20:59,107][52031] Fps is (10 sec: 60621.2, 60 sec: 53794.2, 300 sec: 53983.9). Total num frames: 6925647872. Throughput: 0: 53809.2. Samples: 1416084320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:20:59,108][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 16:20:59,531][52263] Updated weights for policy 0, policy_version 422709 (0.0029) [2024-04-27 16:21:02,772][52263] Updated weights for policy 0, policy_version 422719 (0.0033) [2024-04-27 16:21:04,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.2, 300 sec: 53872.8). Total num frames: 6925910016. Throughput: 0: 53792.6. Samples: 1416403100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:04,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 16:21:05,560][52263] Updated weights for policy 0, policy_version 422729 (0.0030) [2024-04-27 16:21:09,005][52263] Updated weights for policy 0, policy_version 422739 (0.0031) [2024-04-27 16:21:09,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 6926155776. Throughput: 0: 53905.9. Samples: 1416726860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:09,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 16:21:11,695][52263] Updated weights for policy 0, policy_version 422749 (0.0028) [2024-04-27 16:21:14,107][52031] Fps is (10 sec: 49151.0, 60 sec: 53247.7, 300 sec: 53595.1). Total num frames: 6926401536. Throughput: 0: 53442.0. Samples: 1416875700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:14,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 16:21:15,098][52263] Updated weights for policy 0, policy_version 422759 (0.0029) [2024-04-27 16:21:17,638][52263] Updated weights for policy 0, policy_version 422769 (0.0030) [2024-04-27 16:21:19,106][52031] Fps is (10 sec: 52429.2, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6926680064. Throughput: 0: 53404.1. Samples: 1417196420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:19,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 16:21:21,140][52263] Updated weights for policy 0, policy_version 422779 (0.0031) [2024-04-27 16:21:24,061][52263] Updated weights for policy 0, policy_version 422789 (0.0029) [2024-04-27 16:21:24,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.1, 300 sec: 53928.3). Total num frames: 6926974976. Throughput: 0: 53516.7. Samples: 1417519180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:24,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 16:21:26,909][52242] Signal inference workers to stop experience collection... (21650 times) [2024-04-27 16:21:26,943][52263] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-04-27 16:21:26,968][52242] Signal inference workers to resume experience collection... (21650 times) [2024-04-27 16:21:26,973][52263] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-04-27 16:21:27,096][52263] Updated weights for policy 0, policy_version 422799 (0.0029) [2024-04-27 16:21:29,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.1, 300 sec: 53817.3). Total num frames: 6927237120. Throughput: 0: 53896.0. Samples: 1417696220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:29,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 16:21:30,056][52263] Updated weights for policy 0, policy_version 422809 (0.0030) [2024-04-27 16:21:33,139][52263] Updated weights for policy 0, policy_version 422819 (0.0037) [2024-04-27 16:21:34,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53520.9, 300 sec: 53817.2). Total num frames: 6927515648. Throughput: 0: 53814.1. Samples: 1418015240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:34,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 16:21:34,168][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000422823_6927532032.pth... [2024-04-27 16:21:34,216][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000422034_6914605056.pth [2024-04-27 16:21:36,536][52263] Updated weights for policy 0, policy_version 422829 (0.0029) [2024-04-27 16:21:39,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6927761408. Throughput: 0: 53755.2. Samples: 1418336820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:39,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 16:21:39,286][52263] Updated weights for policy 0, policy_version 422839 (0.0030) [2024-04-27 16:21:42,759][52263] Updated weights for policy 0, policy_version 422849 (0.0032) [2024-04-27 16:21:44,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 6928023552. Throughput: 0: 53486.9. Samples: 1418491240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:44,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 16:21:45,412][52263] Updated weights for policy 0, policy_version 422859 (0.0026) [2024-04-27 16:21:48,844][52263] Updated weights for policy 0, policy_version 422869 (0.0038) [2024-04-27 16:21:49,106][52031] Fps is (10 sec: 52428.7, 60 sec: 54067.3, 300 sec: 53595.1). 
Total num frames: 6928285696. Throughput: 0: 53584.8. Samples: 1418814420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:49,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 16:21:51,490][52263] Updated weights for policy 0, policy_version 422879 (0.0027) [2024-04-27 16:21:54,107][52031] Fps is (10 sec: 57344.5, 60 sec: 54067.2, 300 sec: 53872.8). Total num frames: 6928596992. Throughput: 0: 53525.7. Samples: 1419135520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:54,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 16:21:54,885][52263] Updated weights for policy 0, policy_version 422889 (0.0027) [2024-04-27 16:21:57,452][52263] Updated weights for policy 0, policy_version 422899 (0.0024) [2024-04-27 16:21:59,106][52031] Fps is (10 sec: 57344.3, 60 sec: 53521.1, 300 sec: 53872.8). Total num frames: 6928859136. Throughput: 0: 54036.7. Samples: 1419307340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:21:59,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 16:22:00,855][52263] Updated weights for policy 0, policy_version 422909 (0.0027) [2024-04-27 16:22:03,576][52263] Updated weights for policy 0, policy_version 422919 (0.0028) [2024-04-27 16:22:04,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.1, 300 sec: 53761.8). Total num frames: 6929137664. Throughput: 0: 54172.9. Samples: 1419634200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:22:04,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 16:22:07,038][52263] Updated weights for policy 0, policy_version 422929 (0.0035) [2024-04-27 16:22:09,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6929383424. Throughput: 0: 54178.4. Samples: 1419957200. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:22:09,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 16:22:09,721][52263] Updated weights for policy 0, policy_version 422939 (0.0029) [2024-04-27 16:22:13,031][52263] Updated weights for policy 0, policy_version 422949 (0.0024) [2024-04-27 16:22:14,106][52031] Fps is (10 sec: 49151.6, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6929629184. Throughput: 0: 53596.3. Samples: 1420108060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:22:14,107][52031] Avg episode reward: [(0, '0.662')] [2024-04-27 16:22:15,721][52263] Updated weights for policy 0, policy_version 422959 (0.0030) [2024-04-27 16:22:19,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6929907712. Throughput: 0: 53805.0. Samples: 1420436460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:22:19,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 16:22:19,403][52263] Updated weights for policy 0, policy_version 422969 (0.0029) [2024-04-27 16:22:21,322][52242] Signal inference workers to stop experience collection... (21700 times) [2024-04-27 16:22:21,361][52263] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-04-27 16:22:21,375][52242] Signal inference workers to resume experience collection... (21700 times) [2024-04-27 16:22:21,382][52263] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-04-27 16:22:21,708][52263] Updated weights for policy 0, policy_version 422979 (0.0028) [2024-04-27 16:22:24,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.3, 300 sec: 53817.3). Total num frames: 6930202624. Throughput: 0: 53776.4. Samples: 1420756760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:22:24,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 16:22:25,408][52263] Updated weights for policy 0, policy_version 422989 (0.0033) [2024-04-27 16:22:27,799][52263] Updated weights for policy 0, policy_version 422999 (0.0034) [2024-04-27 16:22:29,107][52031] Fps is (10 sec: 57344.0, 60 sec: 54067.1, 300 sec: 53817.3). Total num frames: 6930481152. Throughput: 0: 54129.4. Samples: 1420927060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:22:29,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 16:22:31,874][52263] Updated weights for policy 0, policy_version 423009 (0.0027) [2024-04-27 16:22:33,827][52263] Updated weights for policy 0, policy_version 423019 (0.0032) [2024-04-27 16:22:34,107][52031] Fps is (10 sec: 55705.2, 60 sec: 54067.3, 300 sec: 53817.3). Total num frames: 6930759680. Throughput: 0: 54226.6. Samples: 1421254620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:22:34,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 16:22:38,131][52263] Updated weights for policy 0, policy_version 423029 (0.0034) [2024-04-27 16:22:39,107][52031] Fps is (10 sec: 52428.8, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 6931005440. Throughput: 0: 54223.5. Samples: 1421575580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:22:39,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 16:22:39,800][52263] Updated weights for policy 0, policy_version 423039 (0.0030) [2024-04-27 16:22:44,068][52263] Updated weights for policy 0, policy_version 423049 (0.0031) [2024-04-27 16:22:44,107][52031] Fps is (10 sec: 47513.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6931234816. Throughput: 0: 53726.0. Samples: 1421725020. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:22:44,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 16:22:46,040][52263] Updated weights for policy 0, policy_version 423059 (0.0023) [2024-04-27 16:22:49,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6931513344. Throughput: 0: 53554.6. Samples: 1422044160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:22:49,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 16:22:50,314][52263] Updated weights for policy 0, policy_version 423069 (0.0032) [2024-04-27 16:22:51,970][52263] Updated weights for policy 0, policy_version 423079 (0.0034) [2024-04-27 16:22:54,106][52031] Fps is (10 sec: 57345.1, 60 sec: 53521.2, 300 sec: 53761.8). Total num frames: 6931808256. Throughput: 0: 53566.4. Samples: 1422367680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:22:54,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 16:22:56,318][52263] Updated weights for policy 0, policy_version 423089 (0.0029) [2024-04-27 16:22:58,135][52263] Updated weights for policy 0, policy_version 423099 (0.0029) [2024-04-27 16:22:59,107][52031] Fps is (10 sec: 58982.0, 60 sec: 54067.1, 300 sec: 53872.8). Total num frames: 6932103168. Throughput: 0: 54149.3. Samples: 1422544780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:22:59,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 16:23:02,388][52263] Updated weights for policy 0, policy_version 423109 (0.0030) [2024-04-27 16:23:04,100][52263] Updated weights for policy 0, policy_version 423119 (0.0030) [2024-04-27 16:23:04,107][52031] Fps is (10 sec: 57342.6, 60 sec: 54067.0, 300 sec: 53817.2). Total num frames: 6932381696. Throughput: 0: 54001.7. Samples: 1422866540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:23:04,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 16:23:04,707][52242] Signal inference workers to stop experience collection... 
(21750 times) [2024-04-27 16:23:04,746][52263] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-04-27 16:23:04,772][52242] Signal inference workers to resume experience collection... (21750 times) [2024-04-27 16:23:04,773][52263] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-04-27 16:23:08,456][52263] Updated weights for policy 0, policy_version 423129 (0.0028) [2024-04-27 16:23:09,107][52031] Fps is (10 sec: 49152.2, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6932594688. Throughput: 0: 53955.5. Samples: 1423184760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:23:09,107][52031] Avg episode reward: [(0, '0.665')] [2024-04-27 16:23:10,298][52263] Updated weights for policy 0, policy_version 423139 (0.0031) [2024-04-27 16:23:14,106][52031] Fps is (10 sec: 45875.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6932840448. Throughput: 0: 53268.1. Samples: 1423324120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:23:14,107][52031] Avg episode reward: [(0, '0.519')] [2024-04-27 16:23:14,426][52263] Updated weights for policy 0, policy_version 423149 (0.0031) [2024-04-27 16:23:16,235][52263] Updated weights for policy 0, policy_version 423159 (0.0026) [2024-04-27 16:23:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 6933118976. Throughput: 0: 53112.4. Samples: 1423644680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:23:19,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 16:23:20,724][52263] Updated weights for policy 0, policy_version 423169 (0.0028) [2024-04-27 16:23:22,393][52263] Updated weights for policy 0, policy_version 423179 (0.0023) [2024-04-27 16:23:24,107][52031] Fps is (10 sec: 58982.1, 60 sec: 53794.1, 300 sec: 53817.3). Total num frames: 6933430272. Throughput: 0: 53081.4. Samples: 1423964240. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:23:24,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 16:23:26,940][52263] Updated weights for policy 0, policy_version 423189 (0.0031) [2024-04-27 16:23:28,783][52263] Updated weights for policy 0, policy_version 423199 (0.0027) [2024-04-27 16:23:29,106][52031] Fps is (10 sec: 58983.3, 60 sec: 53794.3, 300 sec: 53817.3). Total num frames: 6933708800. Throughput: 0: 53783.7. Samples: 1424145280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:23:29,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 16:23:33,046][52263] Updated weights for policy 0, policy_version 423209 (0.0033) [2024-04-27 16:23:34,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6933970944. Throughput: 0: 53814.6. Samples: 1424465820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:23:34,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 16:23:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000423216_6933970944.pth... [2024-04-27 16:23:34,163][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000422429_6921076736.pth [2024-04-27 16:23:34,832][52263] Updated weights for policy 0, policy_version 423219 (0.0023) [2024-04-27 16:23:39,106][52031] Fps is (10 sec: 45875.5, 60 sec: 52702.0, 300 sec: 53539.6). Total num frames: 6934167552. Throughput: 0: 53693.4. Samples: 1424783880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:23:39,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 16:23:39,189][52263] Updated weights for policy 0, policy_version 423229 (0.0027) [2024-04-27 16:23:40,856][52263] Updated weights for policy 0, policy_version 423239 (0.0027) [2024-04-27 16:23:44,106][52031] Fps is (10 sec: 47514.2, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6934446080. Throughput: 0: 52930.4. Samples: 1424926640. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 16:23:44,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 16:23:45,282][52263] Updated weights for policy 0, policy_version 423249 (0.0032) [2024-04-27 16:23:46,649][52242] Signal inference workers to stop experience collection... (21800 times) [2024-04-27 16:23:46,650][52242] Signal inference workers to resume experience collection... (21800 times) [2024-04-27 16:23:46,662][52263] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-04-27 16:23:46,662][52263] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-04-27 16:23:47,045][52263] Updated weights for policy 0, policy_version 423259 (0.0031) [2024-04-27 16:23:49,107][52031] Fps is (10 sec: 57342.9, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6934740992. Throughput: 0: 52865.0. Samples: 1425245460. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:23:49,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 16:23:51,419][52263] Updated weights for policy 0, policy_version 423269 (0.0031) [2024-04-27 16:23:53,230][52263] Updated weights for policy 0, policy_version 423279 (0.0029) [2024-04-27 16:23:54,107][52031] Fps is (10 sec: 58981.4, 60 sec: 53793.9, 300 sec: 53761.7). Total num frames: 6935035904. Throughput: 0: 52955.5. Samples: 1425567760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:23:54,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 16:23:57,372][52263] Updated weights for policy 0, policy_version 423289 (0.0028) [2024-04-27 16:23:59,106][52031] Fps is (10 sec: 57344.7, 60 sec: 53521.2, 300 sec: 53761.8). Total num frames: 6935314432. Throughput: 0: 53963.6. Samples: 1425752480. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:23:59,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 16:23:59,271][52263] Updated weights for policy 0, policy_version 423299 (0.0026) [2024-04-27 16:24:03,531][52263] Updated weights for policy 0, policy_version 423309 (0.0039) [2024-04-27 16:24:04,106][52031] Fps is (10 sec: 50791.0, 60 sec: 52702.0, 300 sec: 53650.7). Total num frames: 6935543808. Throughput: 0: 53960.1. Samples: 1426072880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:04,107][52031] Avg episode reward: [(0, '0.675')] [2024-04-27 16:24:05,266][52263] Updated weights for policy 0, policy_version 423319 (0.0035) [2024-04-27 16:24:09,107][52031] Fps is (10 sec: 47512.9, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6935789568. Throughput: 0: 53924.9. Samples: 1426390860. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:09,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 16:24:09,803][52263] Updated weights for policy 0, policy_version 423329 (0.0024) [2024-04-27 16:24:11,493][52263] Updated weights for policy 0, policy_version 423339 (0.0030) [2024-04-27 16:24:14,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6936051712. Throughput: 0: 52847.6. Samples: 1426523420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:14,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 16:24:15,724][52263] Updated weights for policy 0, policy_version 423349 (0.0030) [2024-04-27 16:24:17,675][52263] Updated weights for policy 0, policy_version 423359 (0.0027) [2024-04-27 16:24:19,106][52031] Fps is (10 sec: 57344.5, 60 sec: 54067.3, 300 sec: 53761.8). Total num frames: 6936363008. Throughput: 0: 52901.4. Samples: 1426846380. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:19,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 16:24:21,843][52263] Updated weights for policy 0, policy_version 423369 (0.0030) [2024-04-27 16:24:23,783][52263] Updated weights for policy 0, policy_version 423379 (0.0029) [2024-04-27 16:24:24,106][52031] Fps is (10 sec: 58982.1, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 6936641536. Throughput: 0: 53028.8. Samples: 1427170180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:24,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 16:24:28,045][52263] Updated weights for policy 0, policy_version 423389 (0.0027) [2024-04-27 16:24:29,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53247.9, 300 sec: 53706.2). Total num frames: 6936903680. Throughput: 0: 53855.8. Samples: 1427350160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:29,107][52031] Avg episode reward: [(0, '0.473')] [2024-04-27 16:24:30,126][52263] Updated weights for policy 0, policy_version 423399 (0.0027) [2024-04-27 16:24:34,054][52263] Updated weights for policy 0, policy_version 423409 (0.0035) [2024-04-27 16:24:34,107][52031] Fps is (10 sec: 49151.8, 60 sec: 52701.9, 300 sec: 53484.1). Total num frames: 6937133056. Throughput: 0: 53828.9. Samples: 1427667760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:34,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 16:24:36,124][52263] Updated weights for policy 0, policy_version 423419 (0.0028) [2024-04-27 16:24:36,896][52242] Signal inference workers to stop experience collection... (21850 times) [2024-04-27 16:24:36,934][52263] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-04-27 16:24:36,950][52242] Signal inference workers to resume experience collection... 
(21850 times) [2024-04-27 16:24:36,951][52263] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-04-27 16:24:39,106][52031] Fps is (10 sec: 47513.9, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6937378816. Throughput: 0: 53634.8. Samples: 1427981320. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:39,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 16:24:40,068][52263] Updated weights for policy 0, policy_version 423429 (0.0030) [2024-04-27 16:24:42,362][52263] Updated weights for policy 0, policy_version 423439 (0.0035) [2024-04-27 16:24:44,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6937673728. Throughput: 0: 52878.6. Samples: 1428132020. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:44,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:24:46,224][52263] Updated weights for policy 0, policy_version 423449 (0.0030) [2024-04-27 16:24:48,645][52263] Updated weights for policy 0, policy_version 423459 (0.0027) [2024-04-27 16:24:49,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6937952256. Throughput: 0: 52867.5. Samples: 1428451920. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:49,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 16:24:52,306][52263] Updated weights for policy 0, policy_version 423469 (0.0025) [2024-04-27 16:24:54,107][52031] Fps is (10 sec: 58982.3, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6938263552. Throughput: 0: 52954.3. Samples: 1428773800. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:54,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 16:24:54,542][52263] Updated weights for policy 0, policy_version 423479 (0.0026) [2024-04-27 16:24:58,324][52263] Updated weights for policy 0, policy_version 423489 (0.0027) [2024-04-27 16:24:59,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6938509312. Throughput: 0: 54023.6. Samples: 1428954480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:24:59,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 16:25:00,440][52263] Updated weights for policy 0, policy_version 423499 (0.0035) [2024-04-27 16:25:04,106][52031] Fps is (10 sec: 47513.7, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6938738688. Throughput: 0: 54061.8. Samples: 1429279160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-04-27 16:25:04,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 16:25:04,367][52263] Updated weights for policy 0, policy_version 423509 (0.0028) [2024-04-27 16:25:06,706][52263] Updated weights for policy 0, policy_version 423519 (0.0039) [2024-04-27 16:25:09,106][52031] Fps is (10 sec: 49152.0, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6939000832. Throughput: 0: 53995.6. Samples: 1429599980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:09,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 16:25:10,417][52263] Updated weights for policy 0, policy_version 423529 (0.0027) [2024-04-27 16:25:12,874][52263] Updated weights for policy 0, policy_version 423539 (0.0033) [2024-04-27 16:25:14,106][52031] Fps is (10 sec: 55705.7, 60 sec: 54067.2, 300 sec: 53761.8). Total num frames: 6939295744. Throughput: 0: 53284.5. Samples: 1429747960. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:14,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 16:25:16,480][52263] Updated weights for policy 0, policy_version 423549 (0.0032) [2024-04-27 16:25:18,874][52263] Updated weights for policy 0, policy_version 423559 (0.0025) [2024-04-27 16:25:19,107][52031] Fps is (10 sec: 58981.7, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6939590656. Throughput: 0: 53441.3. Samples: 1430072620. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:19,107][52031] Avg episode reward: [(0, '0.477')] [2024-04-27 16:25:21,953][52242] Signal inference workers to stop experience collection... (21900 times) [2024-04-27 16:25:21,992][52263] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-04-27 16:25:22,051][52242] Signal inference workers to resume experience collection... (21900 times) [2024-04-27 16:25:22,051][52263] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-04-27 16:25:22,492][52263] Updated weights for policy 0, policy_version 423569 (0.0024) [2024-04-27 16:25:24,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6939869184. Throughput: 0: 53637.7. Samples: 1430395020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:24,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 16:25:25,390][52263] Updated weights for policy 0, policy_version 423579 (0.0029) [2024-04-27 16:25:28,515][52263] Updated weights for policy 0, policy_version 423589 (0.0030) [2024-04-27 16:25:29,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6940114944. Throughput: 0: 54141.9. Samples: 1430568400. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:29,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 16:25:31,576][52263] Updated weights for policy 0, policy_version 423599 (0.0037) [2024-04-27 16:25:34,107][52031] Fps is (10 sec: 49151.7, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6940360704. Throughput: 0: 54237.2. Samples: 1430892600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:34,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 16:25:34,207][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000423607_6940377088.pth... [2024-04-27 16:25:34,249][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000422823_6927532032.pth [2024-04-27 16:25:34,623][52263] Updated weights for policy 0, policy_version 423609 (0.0033) [2024-04-27 16:25:37,536][52263] Updated weights for policy 0, policy_version 423619 (0.0026) [2024-04-27 16:25:39,106][52031] Fps is (10 sec: 50790.1, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6940622848. Throughput: 0: 54414.7. Samples: 1431222460. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:39,107][52031] Avg episode reward: [(0, '0.491')] [2024-04-27 16:25:40,738][52263] Updated weights for policy 0, policy_version 423629 (0.0034) [2024-04-27 16:25:43,694][52263] Updated weights for policy 0, policy_version 423639 (0.0030) [2024-04-27 16:25:44,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6940901376. Throughput: 0: 53524.3. Samples: 1431363080. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:44,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 16:25:46,734][52263] Updated weights for policy 0, policy_version 423649 (0.0036) [2024-04-27 16:25:49,106][52031] Fps is (10 sec: 57344.1, 60 sec: 54067.3, 300 sec: 53706.2). Total num frames: 6941196288. Throughput: 0: 53460.5. Samples: 1431684880. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:49,107][52031] Avg episode reward: [(0, '0.507')] [2024-04-27 16:25:49,754][52263] Updated weights for policy 0, policy_version 423659 (0.0031) [2024-04-27 16:25:52,955][52263] Updated weights for policy 0, policy_version 423669 (0.0027) [2024-04-27 16:25:54,106][52031] Fps is (10 sec: 58983.1, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6941491200. Throughput: 0: 53584.0. Samples: 1432011260. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:54,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 16:25:55,995][52263] Updated weights for policy 0, policy_version 423679 (0.0032) [2024-04-27 16:25:58,970][52263] Updated weights for policy 0, policy_version 423689 (0.0027) [2024-04-27 16:25:59,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6941720576. Throughput: 0: 54213.8. Samples: 1432187580. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:25:59,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 16:26:02,034][52263] Updated weights for policy 0, policy_version 423699 (0.0025) [2024-04-27 16:26:04,106][52031] Fps is (10 sec: 47513.7, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6941966336. Throughput: 0: 54057.5. Samples: 1432505200. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:26:04,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 16:26:04,968][52263] Updated weights for policy 0, policy_version 423709 (0.0035) [2024-04-27 16:26:08,167][52263] Updated weights for policy 0, policy_version 423719 (0.0027) [2024-04-27 16:26:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 54340.2, 300 sec: 53761.8). Total num frames: 6942261248. Throughput: 0: 54105.9. Samples: 1432829780. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:26:09,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 16:26:11,020][52263] Updated weights for policy 0, policy_version 423729 (0.0032) [2024-04-27 16:26:14,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6942523392. Throughput: 0: 53678.1. Samples: 1432983920. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:26:14,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 16:26:14,230][52263] Updated weights for policy 0, policy_version 423739 (0.0032) [2024-04-27 16:26:17,027][52263] Updated weights for policy 0, policy_version 423749 (0.0031) [2024-04-27 16:26:19,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6942801920. Throughput: 0: 53643.4. Samples: 1433306540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:26:19,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 16:26:20,447][52263] Updated weights for policy 0, policy_version 423759 (0.0031) [2024-04-27 16:26:22,195][52242] Signal inference workers to stop experience collection... (21950 times) [2024-04-27 16:26:22,215][52263] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-04-27 16:26:22,289][52242] Signal inference workers to resume experience collection... (21950 times) [2024-04-27 16:26:22,289][52263] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-04-27 16:26:23,167][52263] Updated weights for policy 0, policy_version 423769 (0.0028) [2024-04-27 16:26:24,106][52031] Fps is (10 sec: 57344.8, 60 sec: 53794.3, 300 sec: 53761.7). Total num frames: 6943096832. Throughput: 0: 53482.8. Samples: 1433629180. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:26:24,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 16:26:26,479][52263] Updated weights for policy 0, policy_version 423779 (0.0030) [2024-04-27 16:26:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6943342592. Throughput: 0: 54235.2. Samples: 1433803660. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-27 16:26:29,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 16:26:29,150][52263] Updated weights for policy 0, policy_version 423789 (0.0034) [2024-04-27 16:26:32,494][52263] Updated weights for policy 0, policy_version 423799 (0.0026) [2024-04-27 16:26:34,107][52031] Fps is (10 sec: 50789.3, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6943604736. Throughput: 0: 54287.8. Samples: 1434127840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:26:34,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 16:26:35,235][52263] Updated weights for policy 0, policy_version 423809 (0.0029) [2024-04-27 16:26:38,578][52263] Updated weights for policy 0, policy_version 423819 (0.0026) [2024-04-27 16:26:39,107][52031] Fps is (10 sec: 54066.9, 60 sec: 54340.2, 300 sec: 53761.7). Total num frames: 6943883264. Throughput: 0: 54231.4. Samples: 1434451680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:26:39,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 16:26:41,210][52263] Updated weights for policy 0, policy_version 423829 (0.0029) [2024-04-27 16:26:44,106][52031] Fps is (10 sec: 54067.6, 60 sec: 54067.2, 300 sec: 53761.7). Total num frames: 6944145408. Throughput: 0: 53726.5. Samples: 1434605280. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:26:44,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 16:26:44,598][52263] Updated weights for policy 0, policy_version 423839 (0.0025) [2024-04-27 16:26:47,426][52263] Updated weights for policy 0, policy_version 423849 (0.0035) [2024-04-27 16:26:49,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6944423936. Throughput: 0: 53863.5. Samples: 1434929060. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:26:49,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 16:26:50,541][52263] Updated weights for policy 0, policy_version 423859 (0.0024) [2024-04-27 16:26:53,457][52263] Updated weights for policy 0, policy_version 423869 (0.0028) [2024-04-27 16:26:54,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6944702464. Throughput: 0: 53825.9. Samples: 1435251940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:26:54,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 16:26:56,594][52263] Updated weights for policy 0, policy_version 423879 (0.0035) [2024-04-27 16:26:59,107][52031] Fps is (10 sec: 54066.8, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 6944964608. Throughput: 0: 54119.1. Samples: 1435419280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:26:59,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 16:26:59,458][52263] Updated weights for policy 0, policy_version 423889 (0.0029) [2024-04-27 16:27:02,949][52263] Updated weights for policy 0, policy_version 423899 (0.0035) [2024-04-27 16:27:04,106][52031] Fps is (10 sec: 52428.2, 60 sec: 54340.2, 300 sec: 53706.2). Total num frames: 6945226752. Throughput: 0: 54188.8. Samples: 1435745040. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:04,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 16:27:05,545][52263] Updated weights for policy 0, policy_version 423909 (0.0035) [2024-04-27 16:27:09,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53520.8, 300 sec: 53706.2). Total num frames: 6945472512. Throughput: 0: 54111.2. Samples: 1436064200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:09,108][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 16:27:09,240][52263] Updated weights for policy 0, policy_version 423919 (0.0030) [2024-04-27 16:27:11,617][52263] Updated weights for policy 0, policy_version 423929 (0.0030) [2024-04-27 16:27:14,106][52031] Fps is (10 sec: 54067.3, 60 sec: 54067.3, 300 sec: 53761.8). Total num frames: 6945767424. Throughput: 0: 53807.2. Samples: 1436224980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:14,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 16:27:15,312][52263] Updated weights for policy 0, policy_version 423939 (0.0032) [2024-04-27 16:27:17,642][52263] Updated weights for policy 0, policy_version 423949 (0.0028) [2024-04-27 16:27:19,107][52031] Fps is (10 sec: 58983.2, 60 sec: 54340.1, 300 sec: 53761.7). Total num frames: 6946062336. Throughput: 0: 53700.5. Samples: 1436544360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:19,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 16:27:21,403][52263] Updated weights for policy 0, policy_version 423959 (0.0027) [2024-04-27 16:27:21,982][52242] Signal inference workers to stop experience collection... (22000 times) [2024-04-27 16:27:21,983][52242] Signal inference workers to resume experience collection... 
(22000 times) [2024-04-27 16:27:22,010][52263] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-04-27 16:27:22,010][52263] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-04-27 16:27:23,741][52263] Updated weights for policy 0, policy_version 423969 (0.0035) [2024-04-27 16:27:24,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6946324480. Throughput: 0: 53744.1. Samples: 1436870160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:24,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 16:27:27,431][52263] Updated weights for policy 0, policy_version 423979 (0.0027) [2024-04-27 16:27:29,106][52031] Fps is (10 sec: 52429.6, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6946586624. Throughput: 0: 54023.2. Samples: 1437036320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:29,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 16:27:29,877][52263] Updated weights for policy 0, policy_version 423989 (0.0025) [2024-04-27 16:27:33,470][52263] Updated weights for policy 0, policy_version 423999 (0.0033) [2024-04-27 16:27:34,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6946832384. Throughput: 0: 53915.1. Samples: 1437355240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:34,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 16:27:34,152][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000424002_6946848768.pth... [2024-04-27 16:27:34,197][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000423216_6933970944.pth [2024-04-27 16:27:35,853][52263] Updated weights for policy 0, policy_version 424009 (0.0033) [2024-04-27 16:27:39,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53794.2, 300 sec: 53817.3). Total num frames: 6947110912. Throughput: 0: 53943.0. Samples: 1437679380. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:39,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 16:27:39,534][52263] Updated weights for policy 0, policy_version 424019 (0.0032) [2024-04-27 16:27:41,902][52263] Updated weights for policy 0, policy_version 424029 (0.0031) [2024-04-27 16:27:44,106][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.3, 300 sec: 53817.3). Total num frames: 6947389440. Throughput: 0: 53896.6. Samples: 1437844620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:44,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 16:27:45,633][52263] Updated weights for policy 0, policy_version 424039 (0.0040) [2024-04-27 16:27:47,977][52263] Updated weights for policy 0, policy_version 424049 (0.0030) [2024-04-27 16:27:49,106][52031] Fps is (10 sec: 55706.1, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 6947667968. Throughput: 0: 53837.9. Samples: 1438167740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-04-27 16:27:49,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 16:27:51,581][52263] Updated weights for policy 0, policy_version 424059 (0.0028) [2024-04-27 16:27:53,976][52263] Updated weights for policy 0, policy_version 424069 (0.0035) [2024-04-27 16:27:54,107][52031] Fps is (10 sec: 55704.5, 60 sec: 54067.0, 300 sec: 53706.2). Total num frames: 6947946496. Throughput: 0: 53969.5. Samples: 1438492820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:27:54,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 16:27:57,724][52263] Updated weights for policy 0, policy_version 424079 (0.0033) [2024-04-27 16:27:59,106][52031] Fps is (10 sec: 50790.0, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6948175872. Throughput: 0: 53952.4. Samples: 1438652840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:27:59,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 16:28:00,038][52263] Updated weights for policy 0, policy_version 424089 (0.0026) [2024-04-27 16:28:03,734][52263] Updated weights for policy 0, policy_version 424099 (0.0029) [2024-04-27 16:28:04,106][52031] Fps is (10 sec: 52429.8, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6948470784. Throughput: 0: 54023.7. Samples: 1438975420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:04,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 16:28:06,122][52263] Updated weights for policy 0, policy_version 424109 (0.0027) [2024-04-27 16:28:09,106][52031] Fps is (10 sec: 55705.6, 60 sec: 54340.5, 300 sec: 53872.8). Total num frames: 6948732928. Throughput: 0: 53916.0. Samples: 1439296380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:09,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 16:28:09,690][52263] Updated weights for policy 0, policy_version 424119 (0.0029) [2024-04-27 16:28:11,542][52242] Signal inference workers to stop experience collection... (22050 times) [2024-04-27 16:28:11,585][52263] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-04-27 16:28:11,643][52242] Signal inference workers to resume experience collection... (22050 times) [2024-04-27 16:28:11,643][52263] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-04-27 16:28:12,489][52263] Updated weights for policy 0, policy_version 424129 (0.0025) [2024-04-27 16:28:14,106][52031] Fps is (10 sec: 54066.9, 60 sec: 54067.2, 300 sec: 53872.8). Total num frames: 6949011456. Throughput: 0: 54026.6. Samples: 1439467520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:14,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 16:28:15,788][52263] Updated weights for policy 0, policy_version 424139 (0.0039) [2024-04-27 16:28:18,478][52263] Updated weights for policy 0, policy_version 424149 (0.0028) [2024-04-27 16:28:19,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 6949273600. Throughput: 0: 54174.2. Samples: 1439793080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:19,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 16:28:21,905][52263] Updated weights for policy 0, policy_version 424159 (0.0034) [2024-04-27 16:28:24,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 6949535744. Throughput: 0: 54152.2. Samples: 1440116240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:24,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 16:28:24,451][52263] Updated weights for policy 0, policy_version 424169 (0.0027) [2024-04-27 16:28:27,905][52263] Updated weights for policy 0, policy_version 424179 (0.0029) [2024-04-27 16:28:29,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 6949797888. Throughput: 0: 53651.8. Samples: 1440258960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:29,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 16:28:30,503][52263] Updated weights for policy 0, policy_version 424189 (0.0025) [2024-04-27 16:28:33,928][52263] Updated weights for policy 0, policy_version 424199 (0.0029) [2024-04-27 16:28:34,106][52031] Fps is (10 sec: 55706.8, 60 sec: 54340.3, 300 sec: 53983.9). Total num frames: 6950092800. Throughput: 0: 53815.0. Samples: 1440589420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:34,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 16:28:36,618][52263] Updated weights for policy 0, policy_version 424209 (0.0030) [2024-04-27 16:28:39,107][52031] Fps is (10 sec: 55706.1, 60 sec: 54067.2, 300 sec: 53928.3). Total num frames: 6950354944. Throughput: 0: 53735.7. Samples: 1440910920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:39,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:28:39,901][52263] Updated weights for policy 0, policy_version 424219 (0.0034) [2024-04-27 16:28:42,773][52263] Updated weights for policy 0, policy_version 424229 (0.0026) [2024-04-27 16:28:44,107][52031] Fps is (10 sec: 55704.9, 60 sec: 54340.1, 300 sec: 53928.3). Total num frames: 6950649856. Throughput: 0: 54169.6. Samples: 1441090480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:44,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 16:28:45,957][52263] Updated weights for policy 0, policy_version 424239 (0.0026) [2024-04-27 16:28:48,794][52263] Updated weights for policy 0, policy_version 424249 (0.0030) [2024-04-27 16:28:49,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53794.2, 300 sec: 53761.8). Total num frames: 6950895616. Throughput: 0: 54184.1. Samples: 1441413700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:49,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 16:28:51,951][52263] Updated weights for policy 0, policy_version 424259 (0.0027) [2024-04-27 16:28:54,107][52031] Fps is (10 sec: 49152.1, 60 sec: 53248.0, 300 sec: 53650.6). Total num frames: 6951141376. Throughput: 0: 54260.3. Samples: 1441738100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:54,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 16:28:54,813][52263] Updated weights for policy 0, policy_version 424269 (0.0032) [2024-04-27 16:28:58,137][52263] Updated weights for policy 0, policy_version 424279 (0.0029) [2024-04-27 16:28:58,642][52242] Signal inference workers to stop experience collection... (22100 times) [2024-04-27 16:28:58,671][52263] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-04-27 16:28:58,741][52242] Signal inference workers to resume experience collection... (22100 times) [2024-04-27 16:28:58,741][52263] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-04-27 16:28:59,106][52031] Fps is (10 sec: 52428.1, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6951419904. Throughput: 0: 53719.1. Samples: 1441884880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:28:59,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 16:29:01,034][52263] Updated weights for policy 0, policy_version 424289 (0.0032) [2024-04-27 16:29:04,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53794.1, 300 sec: 53928.4). Total num frames: 6951698432. Throughput: 0: 53607.2. Samples: 1442205400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:29:04,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 16:29:04,241][52263] Updated weights for policy 0, policy_version 424299 (0.0028) [2024-04-27 16:29:07,253][52263] Updated weights for policy 0, policy_version 424309 (0.0025) [2024-04-27 16:29:09,106][52031] Fps is (10 sec: 55706.3, 60 sec: 54067.3, 300 sec: 53983.9). Total num frames: 6951976960. Throughput: 0: 53571.0. Samples: 1442526920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-04-27 16:29:09,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 16:29:10,354][52263] Updated weights for policy 0, policy_version 424319 (0.0025) [2024-04-27 16:29:13,321][52263] Updated weights for policy 0, policy_version 424329 (0.0027) [2024-04-27 16:29:14,107][52031] Fps is (10 sec: 57343.4, 60 sec: 54340.2, 300 sec: 53928.3). Total num frames: 6952271872. Throughput: 0: 54217.8. Samples: 1442698760. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:14,107][52031] Avg episode reward: [(0, '0.589')] [2024-04-27 16:29:16,313][52263] Updated weights for policy 0, policy_version 424339 (0.0027) [2024-04-27 16:29:19,107][52031] Fps is (10 sec: 54066.0, 60 sec: 54067.1, 300 sec: 53817.3). Total num frames: 6952517632. Throughput: 0: 54058.5. Samples: 1443022060. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:19,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 16:29:19,327][52263] Updated weights for policy 0, policy_version 424349 (0.0030) [2024-04-27 16:29:22,580][52263] Updated weights for policy 0, policy_version 424359 (0.0032) [2024-04-27 16:29:24,107][52031] Fps is (10 sec: 45875.1, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 6952730624. Throughput: 0: 54010.2. Samples: 1443341380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 16:29:25,468][52263] Updated weights for policy 0, policy_version 424369 (0.0030) [2024-04-27 16:29:28,700][52263] Updated weights for policy 0, policy_version 424379 (0.0026) [2024-04-27 16:29:29,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53794.3, 300 sec: 53872.8). Total num frames: 6953025536. Throughput: 0: 53227.7. Samples: 1443485720. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:29,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 16:29:31,657][52263] Updated weights for policy 0, policy_version 424389 (0.0025) [2024-04-27 16:29:34,107][52031] Fps is (10 sec: 57343.8, 60 sec: 53520.9, 300 sec: 53983.9). Total num frames: 6953304064. Throughput: 0: 53097.0. Samples: 1443803080. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:34,107][52031] Avg episode reward: [(0, '0.659')] [2024-04-27 16:29:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000424396_6953304064.pth... [2024-04-27 16:29:34,185][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000423607_6940377088.pth [2024-04-27 16:29:34,853][52263] Updated weights for policy 0, policy_version 424399 (0.0031) [2024-04-27 16:29:37,630][52263] Updated weights for policy 0, policy_version 424409 (0.0028) [2024-04-27 16:29:39,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.1, 300 sec: 53928.3). Total num frames: 6953582592. Throughput: 0: 53192.5. Samples: 1444131760. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:39,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 16:29:41,244][52263] Updated weights for policy 0, policy_version 424419 (0.0030) [2024-04-27 16:29:43,666][52263] Updated weights for policy 0, policy_version 424429 (0.0032) [2024-04-27 16:29:44,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.0, 300 sec: 53928.3). Total num frames: 6953861120. Throughput: 0: 53699.8. Samples: 1444301380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:44,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 16:29:47,177][52263] Updated weights for policy 0, policy_version 424439 (0.0031) [2024-04-27 16:29:49,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6954106880. Throughput: 0: 53715.6. Samples: 1444622600. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:49,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 16:29:49,802][52263] Updated weights for policy 0, policy_version 424449 (0.0030) [2024-04-27 16:29:53,316][52263] Updated weights for policy 0, policy_version 424459 (0.0034) [2024-04-27 16:29:54,106][52031] Fps is (10 sec: 49152.7, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6954352640. Throughput: 0: 53682.1. Samples: 1444942620. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:54,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 16:29:54,735][52242] Signal inference workers to stop experience collection... (22150 times) [2024-04-27 16:29:54,736][52242] Signal inference workers to resume experience collection... (22150 times) [2024-04-27 16:29:54,750][52263] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-04-27 16:29:54,750][52263] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-04-27 16:29:55,803][52263] Updated weights for policy 0, policy_version 424469 (0.0029) [2024-04-27 16:29:59,106][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53928.4). Total num frames: 6954647552. Throughput: 0: 53207.6. Samples: 1445093100. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:29:59,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 16:29:59,459][52263] Updated weights for policy 0, policy_version 424479 (0.0029) [2024-04-27 16:30:01,825][52263] Updated weights for policy 0, policy_version 424489 (0.0029) [2024-04-27 16:30:04,107][52031] Fps is (10 sec: 57342.9, 60 sec: 53793.9, 300 sec: 53983.8). Total num frames: 6954926080. Throughput: 0: 53296.8. Samples: 1445420420. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:30:04,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 16:30:05,970][52263] Updated weights for policy 0, policy_version 424499 (0.0031) [2024-04-27 16:30:07,937][52263] Updated weights for policy 0, policy_version 424509 (0.0029) [2024-04-27 16:30:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.0, 300 sec: 53872.8). Total num frames: 6955188224. Throughput: 0: 53336.5. Samples: 1445741520. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:30:09,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 16:30:11,899][52263] Updated weights for policy 0, policy_version 424519 (0.0027) [2024-04-27 16:30:13,910][52263] Updated weights for policy 0, policy_version 424529 (0.0035) [2024-04-27 16:30:14,106][52031] Fps is (10 sec: 55706.9, 60 sec: 53521.2, 300 sec: 53872.8). Total num frames: 6955483136. Throughput: 0: 53951.1. Samples: 1445913520. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:30:14,107][52031] Avg episode reward: [(0, '0.705')] [2024-04-27 16:30:18,070][52263] Updated weights for policy 0, policy_version 424539 (0.0029) [2024-04-27 16:30:19,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 6955712512. Throughput: 0: 54055.2. Samples: 1446235560. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:30:19,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 16:30:19,960][52263] Updated weights for policy 0, policy_version 424549 (0.0032) [2024-04-27 16:30:24,107][52031] Fps is (10 sec: 47513.0, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6955958272. Throughput: 0: 54005.7. Samples: 1446562020. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:30:24,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 16:30:24,272][52263] Updated weights for policy 0, policy_version 424559 (0.0029) [2024-04-27 16:30:26,234][52263] Updated weights for policy 0, policy_version 424569 (0.0027) [2024-04-27 16:30:29,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.0, 300 sec: 53872.8). Total num frames: 6956253184. Throughput: 0: 53574.8. Samples: 1446712240. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:30:29,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 16:30:30,345][52263] Updated weights for policy 0, policy_version 424579 (0.0032) [2024-04-27 16:30:32,422][52263] Updated weights for policy 0, policy_version 424589 (0.0030) [2024-04-27 16:30:34,106][52031] Fps is (10 sec: 57345.1, 60 sec: 53794.3, 300 sec: 53928.4). Total num frames: 6956531712. Throughput: 0: 53506.7. Samples: 1447030400. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-27 16:30:34,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 16:30:36,379][52263] Updated weights for policy 0, policy_version 424599 (0.0029) [2024-04-27 16:30:37,388][52242] Signal inference workers to stop experience collection... (22200 times) [2024-04-27 16:30:37,416][52263] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-04-27 16:30:37,480][52242] Signal inference workers to resume experience collection... (22200 times) [2024-04-27 16:30:37,480][52263] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-04-27 16:30:38,632][52263] Updated weights for policy 0, policy_version 424609 (0.0037) [2024-04-27 16:30:39,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53872.8). Total num frames: 6956793856. Throughput: 0: 53592.4. Samples: 1447354280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:30:39,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:30:42,512][52263] Updated weights for policy 0, policy_version 424619 (0.0026) [2024-04-27 16:30:44,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.3, 300 sec: 53817.3). Total num frames: 6957072384. Throughput: 0: 54044.5. Samples: 1447525100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:30:44,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 16:30:44,575][52263] Updated weights for policy 0, policy_version 424629 (0.0027) [2024-04-27 16:30:48,601][52263] Updated weights for policy 0, policy_version 424639 (0.0029) [2024-04-27 16:30:49,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6957318144. Throughput: 0: 53919.9. Samples: 1447846800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:30:49,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 16:30:50,792][52263] Updated weights for policy 0, policy_version 424649 (0.0026) [2024-04-27 16:30:54,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6957580288. Throughput: 0: 53885.9. Samples: 1448166380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:30:54,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 16:30:54,631][52263] Updated weights for policy 0, policy_version 424659 (0.0031) [2024-04-27 16:30:57,074][52263] Updated weights for policy 0, policy_version 424669 (0.0029) [2024-04-27 16:30:59,107][52031] Fps is (10 sec: 55704.4, 60 sec: 53794.0, 300 sec: 53928.3). Total num frames: 6957875200. Throughput: 0: 53426.9. Samples: 1448317740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:30:59,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 16:31:00,619][52263] Updated weights for policy 0, policy_version 424679 (0.0034) [2024-04-27 16:31:03,112][52263] Updated weights for policy 0, policy_version 424689 (0.0033) [2024-04-27 16:31:04,106][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.2, 300 sec: 53817.3). Total num frames: 6958137344. Throughput: 0: 53419.1. Samples: 1448639420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:04,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 16:31:06,798][52263] Updated weights for policy 0, policy_version 424699 (0.0030) [2024-04-27 16:31:09,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53794.2, 300 sec: 53872.8). Total num frames: 6958415872. Throughput: 0: 53393.9. Samples: 1448964740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:09,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 16:31:09,188][52263] Updated weights for policy 0, policy_version 424709 (0.0029) [2024-04-27 16:31:12,865][52263] Updated weights for policy 0, policy_version 424719 (0.0030) [2024-04-27 16:31:14,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53520.9, 300 sec: 53872.8). Total num frames: 6958694400. Throughput: 0: 53755.1. Samples: 1449131220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:14,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 16:31:15,327][52263] Updated weights for policy 0, policy_version 424729 (0.0038) [2024-04-27 16:31:18,811][52263] Updated weights for policy 0, policy_version 424739 (0.0031) [2024-04-27 16:31:19,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6958940160. Throughput: 0: 53952.2. Samples: 1449458260. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:19,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 16:31:21,456][52263] Updated weights for policy 0, policy_version 424749 (0.0029) [2024-04-27 16:31:24,107][52031] Fps is (10 sec: 49151.9, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6959185920. Throughput: 0: 53894.6. Samples: 1449779540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:24,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 16:31:24,912][52263] Updated weights for policy 0, policy_version 424759 (0.0031) [2024-04-27 16:31:27,632][52263] Updated weights for policy 0, policy_version 424769 (0.0031) [2024-04-27 16:31:28,431][52242] Signal inference workers to stop experience collection... (22250 times) [2024-04-27 16:31:28,485][52263] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-04-27 16:31:28,490][52242] Signal inference workers to resume experience collection... (22250 times) [2024-04-27 16:31:28,499][52263] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-04-27 16:31:29,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53794.3, 300 sec: 53817.3). Total num frames: 6959480832. Throughput: 0: 53416.0. Samples: 1449928820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:29,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 16:31:31,047][52263] Updated weights for policy 0, policy_version 424779 (0.0029) [2024-04-27 16:31:33,628][52263] Updated weights for policy 0, policy_version 424789 (0.0030) [2024-04-27 16:31:34,107][52031] Fps is (10 sec: 55706.1, 60 sec: 53520.9, 300 sec: 53761.7). Total num frames: 6959742976. Throughput: 0: 53507.4. Samples: 1450254640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:34,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 16:31:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000424789_6959742976.pth... 
[2024-04-27 16:31:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000424002_6946848768.pth [2024-04-27 16:31:37,053][52263] Updated weights for policy 0, policy_version 424799 (0.0034) [2024-04-27 16:31:39,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53817.3). Total num frames: 6960021504. Throughput: 0: 53517.6. Samples: 1450574680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:39,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 16:31:39,829][52263] Updated weights for policy 0, policy_version 424809 (0.0029) [2024-04-27 16:31:43,110][52263] Updated weights for policy 0, policy_version 424819 (0.0029) [2024-04-27 16:31:44,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53247.8, 300 sec: 53706.2). Total num frames: 6960267264. Throughput: 0: 53826.1. Samples: 1450739920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:44,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 16:31:45,968][52263] Updated weights for policy 0, policy_version 424829 (0.0031) [2024-04-27 16:31:49,028][52263] Updated weights for policy 0, policy_version 424839 (0.0029) [2024-04-27 16:31:49,107][52031] Fps is (10 sec: 54066.8, 60 sec: 54067.0, 300 sec: 53761.7). Total num frames: 6960562176. Throughput: 0: 53844.3. Samples: 1451062420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:49,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 16:31:51,860][52263] Updated weights for policy 0, policy_version 424849 (0.0032) [2024-04-27 16:31:54,106][52031] Fps is (10 sec: 52430.3, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6960791552. Throughput: 0: 53798.7. Samples: 1451385680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 16:31:54,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 16:31:55,257][52263] Updated weights for policy 0, policy_version 424859 (0.0029) [2024-04-27 16:31:58,019][52263] Updated weights for policy 0, policy_version 424869 (0.0034) [2024-04-27 16:31:59,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53761.7). Total num frames: 6961086464. Throughput: 0: 53558.8. Samples: 1451541360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:31:59,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 16:32:01,412][52263] Updated weights for policy 0, policy_version 424879 (0.0033) [2024-04-27 16:32:04,073][52263] Updated weights for policy 0, policy_version 424889 (0.0032) [2024-04-27 16:32:04,106][52031] Fps is (10 sec: 58981.8, 60 sec: 54067.2, 300 sec: 53928.4). Total num frames: 6961381376. Throughput: 0: 53506.0. Samples: 1451866020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:04,107][52031] Avg episode reward: [(0, '0.655')] [2024-04-27 16:32:07,421][52263] Updated weights for policy 0, policy_version 424899 (0.0026) [2024-04-27 16:32:09,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53761.7). Total num frames: 6961627136. Throughput: 0: 53579.3. Samples: 1452190600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:09,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 16:32:10,127][52263] Updated weights for policy 0, policy_version 424909 (0.0027) [2024-04-27 16:32:13,674][52263] Updated weights for policy 0, policy_version 424919 (0.0030) [2024-04-27 16:32:14,106][52031] Fps is (10 sec: 49152.0, 60 sec: 52975.0, 300 sec: 53595.1). Total num frames: 6961872896. Throughput: 0: 53853.3. Samples: 1452352220. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:14,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 16:32:16,388][52263] Updated weights for policy 0, policy_version 424929 (0.0034) [2024-04-27 16:32:19,107][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6962151424. Throughput: 0: 53637.4. Samples: 1452668320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:19,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 16:32:19,886][52263] Updated weights for policy 0, policy_version 424939 (0.0029) [2024-04-27 16:32:22,511][52263] Updated weights for policy 0, policy_version 424949 (0.0029) [2024-04-27 16:32:24,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.4, 300 sec: 53650.7). Total num frames: 6962413568. Throughput: 0: 53600.2. Samples: 1452986680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:24,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 16:32:25,860][52263] Updated weights for policy 0, policy_version 424959 (0.0034) [2024-04-27 16:32:27,396][52242] Signal inference workers to stop experience collection... (22300 times) [2024-04-27 16:32:27,397][52242] Signal inference workers to resume experience collection... (22300 times) [2024-04-27 16:32:27,419][52263] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-04-27 16:32:27,420][52263] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-04-27 16:32:28,903][52263] Updated weights for policy 0, policy_version 424969 (0.0032) [2024-04-27 16:32:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 6962692096. Throughput: 0: 53617.6. Samples: 1453152700. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:29,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 16:32:31,976][52263] Updated weights for policy 0, policy_version 424979 (0.0028) [2024-04-27 16:32:34,107][52031] Fps is (10 sec: 57342.6, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6962987008. Throughput: 0: 53698.7. Samples: 1453478860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:34,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 16:32:34,903][52263] Updated weights for policy 0, policy_version 424989 (0.0028) [2024-04-27 16:32:38,293][52263] Updated weights for policy 0, policy_version 424999 (0.0029) [2024-04-27 16:32:39,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53248.0, 300 sec: 53650.6). Total num frames: 6963216384. Throughput: 0: 53534.5. Samples: 1453794740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:39,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 16:32:41,024][52263] Updated weights for policy 0, policy_version 425009 (0.0030) [2024-04-27 16:32:44,106][52031] Fps is (10 sec: 50791.5, 60 sec: 53794.4, 300 sec: 53650.7). Total num frames: 6963494912. Throughput: 0: 53636.6. Samples: 1453955000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:44,107][52031] Avg episode reward: [(0, '0.675')] [2024-04-27 16:32:44,493][52263] Updated weights for policy 0, policy_version 425019 (0.0027) [2024-04-27 16:32:47,151][52263] Updated weights for policy 0, policy_version 425029 (0.0028) [2024-04-27 16:32:49,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6963757056. Throughput: 0: 53520.0. Samples: 1454274420. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:49,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 16:32:50,685][52263] Updated weights for policy 0, policy_version 425039 (0.0037) [2024-04-27 16:32:53,090][52263] Updated weights for policy 0, policy_version 425049 (0.0031) [2024-04-27 16:32:54,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 6964002816. Throughput: 0: 53382.6. Samples: 1454592820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:54,115][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 16:32:56,902][52263] Updated weights for policy 0, policy_version 425059 (0.0029) [2024-04-27 16:32:59,061][52263] Updated weights for policy 0, policy_version 425069 (0.0030) [2024-04-27 16:32:59,107][52031] Fps is (10 sec: 57343.2, 60 sec: 54067.0, 300 sec: 53761.7). Total num frames: 6964330496. Throughput: 0: 53588.2. Samples: 1454763700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:32:59,107][52031] Avg episode reward: [(0, '0.603')] [2024-04-27 16:33:03,004][52263] Updated weights for policy 0, policy_version 425079 (0.0032) [2024-04-27 16:33:04,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 6964576256. Throughput: 0: 53627.2. Samples: 1455081540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:33:04,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 16:33:05,134][52263] Updated weights for policy 0, policy_version 425089 (0.0031) [2024-04-27 16:33:09,106][52031] Fps is (10 sec: 47515.1, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6964805632. Throughput: 0: 53661.3. Samples: 1455401440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:33:09,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:33:09,117][52263] Updated weights for policy 0, policy_version 425099 (0.0027) [2024-04-27 16:33:11,583][52263] Updated weights for policy 0, policy_version 425109 (0.0029) [2024-04-27 16:33:14,107][52031] Fps is (10 sec: 52427.8, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 6965100544. Throughput: 0: 53485.6. Samples: 1455559560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:33:14,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 16:33:15,235][52263] Updated weights for policy 0, policy_version 425119 (0.0029) [2024-04-27 16:33:17,842][52263] Updated weights for policy 0, policy_version 425129 (0.0031) [2024-04-27 16:33:19,107][52031] Fps is (10 sec: 57342.3, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6965379072. Throughput: 0: 53377.7. Samples: 1455880860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-27 16:33:19,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 16:33:21,260][52263] Updated weights for policy 0, policy_version 425139 (0.0038) [2024-04-27 16:33:24,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53520.9, 300 sec: 53650.7). Total num frames: 6965624832. Throughput: 0: 53536.9. Samples: 1456203900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:33:24,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 16:33:24,409][52263] Updated weights for policy 0, policy_version 425149 (0.0029) [2024-04-27 16:33:24,422][52242] Signal inference workers to stop experience collection... (22350 times) [2024-04-27 16:33:24,423][52242] Signal inference workers to resume experience collection... 
(22350 times) [2024-04-27 16:33:24,442][52263] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-04-27 16:33:24,442][52263] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-04-27 16:33:27,315][52263] Updated weights for policy 0, policy_version 425159 (0.0034) [2024-04-27 16:33:29,106][52031] Fps is (10 sec: 55706.9, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6965936128. Throughput: 0: 53605.7. Samples: 1456367260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:33:29,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 16:33:30,417][52263] Updated weights for policy 0, policy_version 425169 (0.0031) [2024-04-27 16:33:33,330][52263] Updated weights for policy 0, policy_version 425179 (0.0029) [2024-04-27 16:33:34,107][52031] Fps is (10 sec: 55704.6, 60 sec: 53247.9, 300 sec: 53650.6). Total num frames: 6966181888. Throughput: 0: 53609.1. Samples: 1456686840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:33:34,107][52031] Avg episode reward: [(0, '0.467')] [2024-04-27 16:33:34,164][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000425183_6966198272.pth... [2024-04-27 16:33:34,219][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000424396_6953304064.pth [2024-04-27 16:33:36,534][52263] Updated weights for policy 0, policy_version 425189 (0.0033) [2024-04-27 16:33:39,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6966444032. Throughput: 0: 53709.0. Samples: 1457009720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:33:39,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 16:33:39,296][52263] Updated weights for policy 0, policy_version 425199 (0.0030) [2024-04-27 16:33:42,627][52263] Updated weights for policy 0, policy_version 425209 (0.0028) [2024-04-27 16:33:44,107][52031] Fps is (10 sec: 52429.6, 60 sec: 53520.9, 300 sec: 53595.1). 
Total num frames: 6966706176. Throughput: 0: 53289.9. Samples: 1457161740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:33:44,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 16:33:45,777][52263] Updated weights for policy 0, policy_version 425219 (0.0034) [2024-04-27 16:33:48,690][52263] Updated weights for policy 0, policy_version 425229 (0.0034) [2024-04-27 16:33:49,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6966984704. Throughput: 0: 53373.7. Samples: 1457483360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:33:49,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 16:33:51,994][52263] Updated weights for policy 0, policy_version 425239 (0.0026) [2024-04-27 16:33:54,107][52031] Fps is (10 sec: 55705.5, 60 sec: 54340.2, 300 sec: 53706.2). Total num frames: 6967263232. Throughput: 0: 53444.2. Samples: 1457806440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:33:54,107][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 16:33:54,649][52263] Updated weights for policy 0, policy_version 425249 (0.0032) [2024-04-27 16:33:58,037][52263] Updated weights for policy 0, policy_version 425259 (0.0026) [2024-04-27 16:33:59,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53650.6). Total num frames: 6967525376. Throughput: 0: 53636.5. Samples: 1457973200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:33:59,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 16:34:00,646][52263] Updated weights for policy 0, policy_version 425269 (0.0036) [2024-04-27 16:34:04,063][52263] Updated weights for policy 0, policy_version 425279 (0.0037) [2024-04-27 16:34:04,107][52031] Fps is (10 sec: 50790.4, 60 sec: 53247.9, 300 sec: 53539.5). Total num frames: 6967771136. Throughput: 0: 53689.4. Samples: 1458296880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:34:04,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 16:34:06,731][52263] Updated weights for policy 0, policy_version 425289 (0.0032) [2024-04-27 16:34:09,106][52031] Fps is (10 sec: 52429.0, 60 sec: 54067.1, 300 sec: 53484.1). Total num frames: 6968049664. Throughput: 0: 53629.8. Samples: 1458617240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:34:09,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 16:34:10,207][52263] Updated weights for policy 0, policy_version 425299 (0.0026) [2024-04-27 16:34:12,783][52263] Updated weights for policy 0, policy_version 425309 (0.0027) [2024-04-27 16:34:14,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6968328192. Throughput: 0: 53437.9. Samples: 1458771980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:34:14,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 16:34:16,205][52263] Updated weights for policy 0, policy_version 425319 (0.0028) [2024-04-27 16:34:19,014][52263] Updated weights for policy 0, policy_version 425329 (0.0027) [2024-04-27 16:34:19,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.2, 300 sec: 53761.7). Total num frames: 6968590336. Throughput: 0: 53460.6. Samples: 1459092560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:34:19,115][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 16:34:21,484][52242] Signal inference workers to stop experience collection... (22400 times) [2024-04-27 16:34:21,525][52263] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-04-27 16:34:21,538][52242] Signal inference workers to resume experience collection... 
(22400 times) [2024-04-27 16:34:21,545][52263] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-04-27 16:34:22,251][52263] Updated weights for policy 0, policy_version 425339 (0.0030) [2024-04-27 16:34:24,106][52031] Fps is (10 sec: 54068.4, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 6968868864. Throughput: 0: 53573.3. Samples: 1459420520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:34:24,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 16:34:24,955][52263] Updated weights for policy 0, policy_version 425349 (0.0035) [2024-04-27 16:34:28,290][52263] Updated weights for policy 0, policy_version 425359 (0.0026) [2024-04-27 16:34:29,107][52031] Fps is (10 sec: 55705.6, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 6969147392. Throughput: 0: 53870.7. Samples: 1459585920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:34:29,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 16:34:30,880][52263] Updated weights for policy 0, policy_version 425369 (0.0029) [2024-04-27 16:34:34,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6969393152. Throughput: 0: 53880.4. Samples: 1459907980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:34:34,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 16:34:34,434][52263] Updated weights for policy 0, policy_version 425379 (0.0026) [2024-04-27 16:34:37,022][52263] Updated weights for policy 0, policy_version 425389 (0.0028) [2024-04-27 16:34:39,107][52031] Fps is (10 sec: 49152.1, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6969638912. Throughput: 0: 53919.2. Samples: 1460232800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 16:34:39,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 16:34:40,588][52263] Updated weights for policy 0, policy_version 425399 (0.0030) [2024-04-27 16:34:43,187][52263] Updated weights for policy 0, policy_version 425409 (0.0030) [2024-04-27 16:34:44,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 6969933824. Throughput: 0: 53640.0. Samples: 1460387000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:34:44,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 16:34:46,554][52263] Updated weights for policy 0, policy_version 425419 (0.0033) [2024-04-27 16:34:49,107][52031] Fps is (10 sec: 57343.7, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 6970212352. Throughput: 0: 53670.2. Samples: 1460712040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:34:49,107][52031] Avg episode reward: [(0, '0.685')] [2024-04-27 16:34:49,124][52263] Updated weights for policy 0, policy_version 425429 (0.0021) [2024-04-27 16:34:52,825][52263] Updated weights for policy 0, policy_version 425439 (0.0026) [2024-04-27 16:34:54,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6970458112. Throughput: 0: 53687.9. Samples: 1461033200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:34:54,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 16:34:55,165][52263] Updated weights for policy 0, policy_version 425449 (0.0034) [2024-04-27 16:34:58,976][52263] Updated weights for policy 0, policy_version 425459 (0.0033) [2024-04-27 16:34:59,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53521.2, 300 sec: 53595.2). Total num frames: 6970736640. Throughput: 0: 53781.7. Samples: 1461192140. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:34:59,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 16:35:01,605][52263] Updated weights for policy 0, policy_version 425469 (0.0035) [2024-04-27 16:35:04,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6970998784. Throughput: 0: 53877.3. Samples: 1461517040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:04,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 16:35:04,999][52263] Updated weights for policy 0, policy_version 425479 (0.0029) [2024-04-27 16:35:07,805][52263] Updated weights for policy 0, policy_version 425489 (0.0029) [2024-04-27 16:35:09,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 6971260928. Throughput: 0: 53722.3. Samples: 1461838020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:09,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 16:35:10,991][52263] Updated weights for policy 0, policy_version 425499 (0.0030) [2024-04-27 16:35:14,056][52263] Updated weights for policy 0, policy_version 425509 (0.0031) [2024-04-27 16:35:14,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.3, 300 sec: 53650.7). Total num frames: 6971539456. Throughput: 0: 53558.7. Samples: 1461996060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:14,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 16:35:17,092][52263] Updated weights for policy 0, policy_version 425519 (0.0028) [2024-04-27 16:35:19,106][52031] Fps is (10 sec: 57343.6, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 6971834368. Throughput: 0: 53639.6. Samples: 1462321760. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:19,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 16:35:20,205][52263] Updated weights for policy 0, policy_version 425529 (0.0028) [2024-04-27 16:35:20,485][52242] Signal inference workers to stop experience collection... (22450 times) [2024-04-27 16:35:20,543][52242] Signal inference workers to resume experience collection... (22450 times) [2024-04-27 16:35:20,544][52263] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-04-27 16:35:20,557][52263] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-04-27 16:35:23,239][52263] Updated weights for policy 0, policy_version 425539 (0.0027) [2024-04-27 16:35:24,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6972096512. Throughput: 0: 53575.9. Samples: 1462643720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:24,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 16:35:26,151][52263] Updated weights for policy 0, policy_version 425549 (0.0027) [2024-04-27 16:35:29,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6972342272. Throughput: 0: 53784.5. Samples: 1462807300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:29,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 16:35:29,258][52263] Updated weights for policy 0, policy_version 425559 (0.0027) [2024-04-27 16:35:32,147][52263] Updated weights for policy 0, policy_version 425569 (0.0034) [2024-04-27 16:35:34,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6972604416. Throughput: 0: 53665.3. Samples: 1463126980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:34,107][52031] Avg episode reward: [(0, '0.533')] [2024-04-27 16:35:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000425574_6972604416.pth... 
[2024-04-27 16:35:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000424789_6959742976.pth [2024-04-27 16:35:35,264][52263] Updated weights for policy 0, policy_version 425579 (0.0027) [2024-04-27 16:35:38,329][52263] Updated weights for policy 0, policy_version 425589 (0.0028) [2024-04-27 16:35:39,106][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 6972882944. Throughput: 0: 53748.1. Samples: 1463451860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:39,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 16:35:41,385][52263] Updated weights for policy 0, policy_version 425599 (0.0034) [2024-04-27 16:35:44,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6973161472. Throughput: 0: 53774.6. Samples: 1463612000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:44,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 16:35:44,440][52263] Updated weights for policy 0, policy_version 425609 (0.0034) [2024-04-27 16:35:47,515][52263] Updated weights for policy 0, policy_version 425619 (0.0032) [2024-04-27 16:35:49,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.3, 300 sec: 53706.2). Total num frames: 6973423616. Throughput: 0: 53754.9. Samples: 1463936000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:49,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 16:35:50,511][52263] Updated weights for policy 0, policy_version 425629 (0.0027) [2024-04-27 16:35:53,536][52263] Updated weights for policy 0, policy_version 425639 (0.0031) [2024-04-27 16:35:54,106][52031] Fps is (10 sec: 54067.2, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 6973702144. Throughput: 0: 53689.8. Samples: 1464254060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:54,116][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 16:35:56,695][52263] Updated weights for policy 0, policy_version 425649 (0.0032) [2024-04-27 16:35:59,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6973931520. Throughput: 0: 53767.7. Samples: 1464415600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-27 16:35:59,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 16:35:59,637][52263] Updated weights for policy 0, policy_version 425659 (0.0028) [2024-04-27 16:36:02,925][52263] Updated weights for policy 0, policy_version 425669 (0.0029) [2024-04-27 16:36:04,107][52031] Fps is (10 sec: 52427.2, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6974226432. Throughput: 0: 53717.0. Samples: 1464739040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:04,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 16:36:05,683][52263] Updated weights for policy 0, policy_version 425679 (0.0028) [2024-04-27 16:36:08,986][52263] Updated weights for policy 0, policy_version 425689 (0.0026) [2024-04-27 16:36:09,106][52031] Fps is (10 sec: 55705.6, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6974488576. Throughput: 0: 53705.1. Samples: 1465060440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:09,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 16:36:11,868][52263] Updated weights for policy 0, policy_version 425699 (0.0029) [2024-04-27 16:36:14,107][52031] Fps is (10 sec: 54068.4, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6974767104. Throughput: 0: 53720.4. Samples: 1465224720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:14,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 16:36:15,089][52263] Updated weights for policy 0, policy_version 425709 (0.0029) [2024-04-27 16:36:17,344][52242] Signal inference workers to stop experience collection... (22500 times) [2024-04-27 16:36:17,380][52263] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-04-27 16:36:17,409][52242] Signal inference workers to resume experience collection... (22500 times) [2024-04-27 16:36:17,411][52263] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-04-27 16:36:17,990][52263] Updated weights for policy 0, policy_version 425719 (0.0030) [2024-04-27 16:36:19,107][52031] Fps is (10 sec: 57343.2, 60 sec: 53794.1, 300 sec: 53817.3). Total num frames: 6975062016. Throughput: 0: 53823.7. Samples: 1465549040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:19,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 16:36:21,056][52263] Updated weights for policy 0, policy_version 425729 (0.0032) [2024-04-27 16:36:24,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6975291392. Throughput: 0: 53720.9. Samples: 1465869300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:24,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 16:36:24,169][52263] Updated weights for policy 0, policy_version 425739 (0.0027) [2024-04-27 16:36:27,243][52263] Updated weights for policy 0, policy_version 425749 (0.0027) [2024-04-27 16:36:29,106][52031] Fps is (10 sec: 47514.0, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 6975537152. Throughput: 0: 53596.5. Samples: 1466023840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:29,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 16:36:30,248][52263] Updated weights for policy 0, policy_version 425759 (0.0030) [2024-04-27 16:36:33,291][52263] Updated weights for policy 0, policy_version 425769 (0.0028) [2024-04-27 16:36:34,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.3, 300 sec: 53595.1). Total num frames: 6975832064. Throughput: 0: 53496.8. Samples: 1466343360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:34,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 16:36:36,218][52263] Updated weights for policy 0, policy_version 425779 (0.0026) [2024-04-27 16:36:39,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6976094208. Throughput: 0: 53631.1. Samples: 1466667460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:39,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 16:36:39,450][52263] Updated weights for policy 0, policy_version 425789 (0.0027) [2024-04-27 16:36:42,314][52263] Updated weights for policy 0, policy_version 425799 (0.0032) [2024-04-27 16:36:44,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 6976356352. Throughput: 0: 53737.1. Samples: 1466833780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:44,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 16:36:45,556][52263] Updated weights for policy 0, policy_version 425809 (0.0025) [2024-04-27 16:36:48,465][52263] Updated weights for policy 0, policy_version 425819 (0.0029) [2024-04-27 16:36:49,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.0, 300 sec: 53761.7). Total num frames: 6976651264. Throughput: 0: 53678.5. Samples: 1467154560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:49,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 16:36:51,654][52263] Updated weights for policy 0, policy_version 425829 (0.0029) [2024-04-27 16:36:54,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 6976913408. Throughput: 0: 53657.5. Samples: 1467475040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:54,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 16:36:54,387][52263] Updated weights for policy 0, policy_version 425839 (0.0029) [2024-04-27 16:36:57,825][52263] Updated weights for policy 0, policy_version 425849 (0.0028) [2024-04-27 16:36:59,107][52031] Fps is (10 sec: 49151.7, 60 sec: 53520.9, 300 sec: 53428.5). Total num frames: 6977142784. Throughput: 0: 53561.7. Samples: 1467635000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:36:59,108][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 16:37:00,405][52263] Updated weights for policy 0, policy_version 425859 (0.0037) [2024-04-27 16:37:03,830][52263] Updated weights for policy 0, policy_version 425869 (0.0033) [2024-04-27 16:37:04,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.3, 300 sec: 53595.1). Total num frames: 6977437696. Throughput: 0: 53526.6. Samples: 1467957740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:37:04,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 16:37:05,829][52242] Signal inference workers to stop experience collection... (22550 times) [2024-04-27 16:37:05,829][52242] Signal inference workers to resume experience collection... 
(22550 times) [2024-04-27 16:37:05,842][52263] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-04-27 16:37:05,842][52263] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-04-27 16:37:06,702][52263] Updated weights for policy 0, policy_version 425879 (0.0026) [2024-04-27 16:37:09,107][52031] Fps is (10 sec: 57344.0, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6977716224. Throughput: 0: 53543.0. Samples: 1468278740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:37:09,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 16:37:10,021][52263] Updated weights for policy 0, policy_version 425889 (0.0028) [2024-04-27 16:37:12,866][52263] Updated weights for policy 0, policy_version 425899 (0.0033) [2024-04-27 16:37:14,107][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 6977994752. Throughput: 0: 53872.3. Samples: 1468448100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:37:14,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 16:37:16,307][52263] Updated weights for policy 0, policy_version 425909 (0.0032) [2024-04-27 16:37:19,036][52263] Updated weights for policy 0, policy_version 425919 (0.0032) [2024-04-27 16:37:19,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.0, 300 sec: 53706.2). Total num frames: 6978256896. Throughput: 0: 53847.5. Samples: 1468766500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:37:19,107][52031] Avg episode reward: [(0, '0.447')] [2024-04-27 16:37:22,259][52263] Updated weights for policy 0, policy_version 425929 (0.0028) [2024-04-27 16:37:24,107][52031] Fps is (10 sec: 50787.2, 60 sec: 53520.4, 300 sec: 53595.0). Total num frames: 6978502656. Throughput: 0: 53817.8. Samples: 1469089300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:37:24,108][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 16:37:25,078][52263] Updated weights for policy 0, policy_version 425939 (0.0032) [2024-04-27 16:37:28,396][52263] Updated weights for policy 0, policy_version 425949 (0.0030) [2024-04-27 16:37:29,106][52031] Fps is (10 sec: 52429.2, 60 sec: 54067.2, 300 sec: 53539.6). Total num frames: 6978781184. Throughput: 0: 53529.1. Samples: 1469242580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:37:29,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:37:31,073][52263] Updated weights for policy 0, policy_version 425959 (0.0027) [2024-04-27 16:37:34,106][52031] Fps is (10 sec: 54071.4, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 6979043328. Throughput: 0: 53499.2. Samples: 1469562020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:37:34,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 16:37:34,172][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000425968_6979059712.pth... [2024-04-27 16:37:34,219][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000425183_6966198272.pth [2024-04-27 16:37:34,591][52263] Updated weights for policy 0, policy_version 425969 (0.0027) [2024-04-27 16:37:37,360][52263] Updated weights for policy 0, policy_version 425979 (0.0027) [2024-04-27 16:37:39,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6979305472. Throughput: 0: 53417.4. Samples: 1469878820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:37:39,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 16:37:40,772][52263] Updated weights for policy 0, policy_version 425989 (0.0035) [2024-04-27 16:37:43,564][52263] Updated weights for policy 0, policy_version 425999 (0.0029) [2024-04-27 16:37:44,107][52031] Fps is (10 sec: 55704.3, 60 sec: 54067.1, 300 sec: 53706.2). 
Total num frames: 6979600384. Throughput: 0: 53462.5. Samples: 1470040820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:37:44,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 16:37:46,809][52263] Updated weights for policy 0, policy_version 426009 (0.0026) [2024-04-27 16:37:49,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53521.0, 300 sec: 53761.7). Total num frames: 6979862528. Throughput: 0: 53467.6. Samples: 1470363780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:37:49,107][52031] Avg episode reward: [(0, '0.545')] [2024-04-27 16:37:49,579][52263] Updated weights for policy 0, policy_version 426019 (0.0024) [2024-04-27 16:37:52,801][52263] Updated weights for policy 0, policy_version 426029 (0.0030) [2024-04-27 16:37:54,106][52031] Fps is (10 sec: 49153.0, 60 sec: 52975.1, 300 sec: 53428.5). Total num frames: 6980091904. Throughput: 0: 53599.2. Samples: 1470690700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:37:54,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 16:37:55,585][52263] Updated weights for policy 0, policy_version 426039 (0.0028) [2024-04-27 16:37:58,988][52263] Updated weights for policy 0, policy_version 426049 (0.0027) [2024-04-27 16:37:59,106][52031] Fps is (10 sec: 52429.1, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6980386816. Throughput: 0: 53174.4. Samples: 1470840940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:37:59,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 16:38:00,350][52242] Signal inference workers to stop experience collection... (22600 times) [2024-04-27 16:38:00,350][52242] Signal inference workers to resume experience collection... 
(22600 times) [2024-04-27 16:38:00,365][52263] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-04-27 16:38:00,365][52263] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-04-27 16:38:01,716][52263] Updated weights for policy 0, policy_version 426059 (0.0032) [2024-04-27 16:38:04,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 6980665344. Throughput: 0: 53248.9. Samples: 1471162700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:38:04,107][52031] Avg episode reward: [(0, '0.513')] [2024-04-27 16:38:05,078][52263] Updated weights for policy 0, policy_version 426069 (0.0032) [2024-04-27 16:38:07,936][52263] Updated weights for policy 0, policy_version 426079 (0.0027) [2024-04-27 16:38:09,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6980927488. Throughput: 0: 53223.0. Samples: 1471484300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:38:09,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 16:38:11,188][52263] Updated weights for policy 0, policy_version 426089 (0.0035) [2024-04-27 16:38:13,964][52263] Updated weights for policy 0, policy_version 426099 (0.0028) [2024-04-27 16:38:14,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6981206016. Throughput: 0: 53516.3. Samples: 1471650820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:38:14,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 16:38:17,277][52263] Updated weights for policy 0, policy_version 426109 (0.0031) [2024-04-27 16:38:19,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 6981468160. Throughput: 0: 53559.6. Samples: 1471972200. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:38:19,107][52031] Avg episode reward: [(0, '0.500')] [2024-04-27 16:38:20,143][52263] Updated weights for policy 0, policy_version 426119 (0.0025) [2024-04-27 16:38:23,240][52263] Updated weights for policy 0, policy_version 426129 (0.0031) [2024-04-27 16:38:24,106][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.8, 300 sec: 53484.0). Total num frames: 6981713920. Throughput: 0: 53712.1. Samples: 1472295860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:38:24,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 16:38:26,063][52263] Updated weights for policy 0, policy_version 426139 (0.0027) [2024-04-27 16:38:29,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53595.2). Total num frames: 6981992448. Throughput: 0: 53687.0. Samples: 1472456720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:38:29,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 16:38:29,740][52263] Updated weights for policy 0, policy_version 426149 (0.0030) [2024-04-27 16:38:32,078][52263] Updated weights for policy 0, policy_version 426159 (0.0034) [2024-04-27 16:38:34,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6982254592. Throughput: 0: 53648.8. Samples: 1472777980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:38:34,107][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 16:38:35,821][52263] Updated weights for policy 0, policy_version 426169 (0.0033) [2024-04-27 16:38:38,085][52263] Updated weights for policy 0, policy_version 426179 (0.0032) [2024-04-27 16:38:39,107][52031] Fps is (10 sec: 55704.1, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 6982549504. Throughput: 0: 53537.1. Samples: 1473099880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:38:39,107][52031] Avg episode reward: [(0, '0.623')] [2024-04-27 16:38:41,985][52263] Updated weights for policy 0, policy_version 426189 (0.0025) [2024-04-27 16:38:44,107][52031] Fps is (10 sec: 57344.3, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 6982828032. Throughput: 0: 53888.8. Samples: 1473265940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 16:38:44,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 16:38:44,387][52263] Updated weights for policy 0, policy_version 426199 (0.0027) [2024-04-27 16:38:47,979][52263] Updated weights for policy 0, policy_version 426209 (0.0028) [2024-04-27 16:38:49,107][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6983073792. Throughput: 0: 53774.0. Samples: 1473582540. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:38:49,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 16:38:50,707][52263] Updated weights for policy 0, policy_version 426219 (0.0033) [2024-04-27 16:38:54,107][52031] Fps is (10 sec: 50790.2, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6983335936. Throughput: 0: 53716.0. Samples: 1473901520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:38:54,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 16:38:54,110][52263] Updated weights for policy 0, policy_version 426229 (0.0031) [2024-04-27 16:38:56,857][52263] Updated weights for policy 0, policy_version 426239 (0.0036) [2024-04-27 16:38:59,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 6983581696. Throughput: 0: 53504.1. Samples: 1474058500. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:38:59,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 16:39:00,157][52263] Updated weights for policy 0, policy_version 426249 (0.0036) [2024-04-27 16:39:02,835][52263] Updated weights for policy 0, policy_version 426259 (0.0027) [2024-04-27 16:39:04,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6983876608. Throughput: 0: 53433.6. Samples: 1474376720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:04,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 16:39:06,153][52263] Updated weights for policy 0, policy_version 426269 (0.0027) [2024-04-27 16:39:08,955][52263] Updated weights for policy 0, policy_version 426279 (0.0028) [2024-04-27 16:39:09,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 6984155136. Throughput: 0: 53436.1. Samples: 1474700480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:09,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 16:39:09,711][52242] Signal inference workers to stop experience collection... (22650 times) [2024-04-27 16:39:09,712][52242] Signal inference workers to resume experience collection... (22650 times) [2024-04-27 16:39:09,735][52263] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-04-27 16:39:09,735][52263] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-04-27 16:39:12,281][52263] Updated weights for policy 0, policy_version 426289 (0.0034) [2024-04-27 16:39:14,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 6984417280. Throughput: 0: 53672.0. Samples: 1474871960. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:14,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 16:39:15,143][52263] Updated weights for policy 0, policy_version 426299 (0.0034) [2024-04-27 16:39:18,349][52263] Updated weights for policy 0, policy_version 426309 (0.0026) [2024-04-27 16:39:19,107][52031] Fps is (10 sec: 52427.2, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6984679424. Throughput: 0: 53650.6. Samples: 1475192260. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:19,107][52031] Avg episode reward: [(0, '0.503')] [2024-04-27 16:39:21,099][52263] Updated weights for policy 0, policy_version 426319 (0.0026) [2024-04-27 16:39:24,106][52031] Fps is (10 sec: 50790.1, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 6984925184. Throughput: 0: 53620.7. Samples: 1475512800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:24,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 16:39:24,535][52263] Updated weights for policy 0, policy_version 426329 (0.0028) [2024-04-27 16:39:27,077][52263] Updated weights for policy 0, policy_version 426339 (0.0027) [2024-04-27 16:39:29,107][52031] Fps is (10 sec: 52429.4, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6985203712. Throughput: 0: 53312.0. Samples: 1475664980. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:29,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 16:39:30,606][52263] Updated weights for policy 0, policy_version 426349 (0.0032) [2024-04-27 16:39:33,278][52263] Updated weights for policy 0, policy_version 426359 (0.0026) [2024-04-27 16:39:34,107][52031] Fps is (10 sec: 55704.2, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6985482240. Throughput: 0: 53475.4. Samples: 1475988940. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:34,107][52031] Avg episode reward: [(0, '0.506')] [2024-04-27 16:39:34,228][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000426361_6985498624.pth... [2024-04-27 16:39:34,270][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000425574_6972604416.pth [2024-04-27 16:39:36,536][52263] Updated weights for policy 0, policy_version 426369 (0.0032) [2024-04-27 16:39:39,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 6985744384. Throughput: 0: 53539.6. Samples: 1476310800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:39,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 16:39:39,608][52263] Updated weights for policy 0, policy_version 426379 (0.0035) [2024-04-27 16:39:42,709][52263] Updated weights for policy 0, policy_version 426389 (0.0030) [2024-04-27 16:39:44,106][52031] Fps is (10 sec: 52430.1, 60 sec: 52975.0, 300 sec: 53539.6). Total num frames: 6986006528. Throughput: 0: 53652.0. Samples: 1476472840. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:44,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 16:39:45,820][52263] Updated weights for policy 0, policy_version 426399 (0.0031) [2024-04-27 16:39:48,852][52263] Updated weights for policy 0, policy_version 426409 (0.0026) [2024-04-27 16:39:49,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 6986285056. Throughput: 0: 53714.3. Samples: 1476793860. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:49,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 16:39:52,321][52263] Updated weights for policy 0, policy_version 426419 (0.0032) [2024-04-27 16:39:54,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6986530816. Throughput: 0: 53689.2. Samples: 1477116500. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:54,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 16:39:55,015][52263] Updated weights for policy 0, policy_version 426429 (0.0032) [2024-04-27 16:39:58,425][52263] Updated weights for policy 0, policy_version 426439 (0.0027) [2024-04-27 16:39:59,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6986809344. Throughput: 0: 53254.0. Samples: 1477268400. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:39:59,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 16:40:01,002][52263] Updated weights for policy 0, policy_version 426449 (0.0031) [2024-04-27 16:40:04,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 6987087872. Throughput: 0: 53252.1. Samples: 1477588600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:40:04,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 16:40:04,429][52263] Updated weights for policy 0, policy_version 426459 (0.0028) [2024-04-27 16:40:07,328][52263] Updated weights for policy 0, policy_version 426469 (0.0025) [2024-04-27 16:40:09,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 6987350016. Throughput: 0: 53202.3. Samples: 1477906900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 16:40:09,107][52031] Avg episode reward: [(0, '0.677')] [2024-04-27 16:40:10,681][52263] Updated weights for policy 0, policy_version 426479 (0.0038) [2024-04-27 16:40:13,441][52263] Updated weights for policy 0, policy_version 426489 (0.0037) [2024-04-27 16:40:14,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 6987612160. Throughput: 0: 53520.4. Samples: 1478073400. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:14,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 16:40:16,994][52263] Updated weights for policy 0, policy_version 426499 (0.0030) [2024-04-27 16:40:19,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6987890688. Throughput: 0: 53358.5. Samples: 1478390060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:19,107][52031] Avg episode reward: [(0, '0.492')] [2024-04-27 16:40:19,905][52263] Updated weights for policy 0, policy_version 426509 (0.0026) [2024-04-27 16:40:20,111][52242] Signal inference workers to stop experience collection... (22700 times) [2024-04-27 16:40:20,112][52242] Signal inference workers to resume experience collection... (22700 times) [2024-04-27 16:40:20,129][52263] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-04-27 16:40:20,130][52263] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-04-27 16:40:22,995][52263] Updated weights for policy 0, policy_version 426519 (0.0029) [2024-04-27 16:40:24,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6988136448. Throughput: 0: 53338.5. Samples: 1478711040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:24,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 16:40:26,004][52263] Updated weights for policy 0, policy_version 426529 (0.0027) [2024-04-27 16:40:29,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6988398592. Throughput: 0: 53253.3. Samples: 1478869240. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:29,107][52031] Avg episode reward: [(0, '0.498')] [2024-04-27 16:40:29,316][52263] Updated weights for policy 0, policy_version 426539 (0.0032) [2024-04-27 16:40:32,183][52263] Updated weights for policy 0, policy_version 426549 (0.0031) [2024-04-27 16:40:34,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 6988677120. Throughput: 0: 53271.0. Samples: 1479191060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:34,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 16:40:35,440][52263] Updated weights for policy 0, policy_version 426559 (0.0026) [2024-04-27 16:40:38,230][52263] Updated weights for policy 0, policy_version 426569 (0.0033) [2024-04-27 16:40:39,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6988955648. Throughput: 0: 53174.7. Samples: 1479509360. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:39,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 16:40:41,521][52263] Updated weights for policy 0, policy_version 426579 (0.0028) [2024-04-27 16:40:44,106][52031] Fps is (10 sec: 54068.3, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 6989217792. Throughput: 0: 53514.1. Samples: 1479676520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:44,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 16:40:44,285][52263] Updated weights for policy 0, policy_version 426589 (0.0028) [2024-04-27 16:40:47,637][52263] Updated weights for policy 0, policy_version 426599 (0.0032) [2024-04-27 16:40:49,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 6989479936. Throughput: 0: 53611.6. Samples: 1480001120. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:49,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 16:40:50,456][52263] Updated weights for policy 0, policy_version 426609 (0.0034) [2024-04-27 16:40:53,806][52263] Updated weights for policy 0, policy_version 426619 (0.0031) [2024-04-27 16:40:54,107][52031] Fps is (10 sec: 52427.6, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 6989742080. Throughput: 0: 53706.9. Samples: 1480323720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:54,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 16:40:56,517][52263] Updated weights for policy 0, policy_version 426629 (0.0032) [2024-04-27 16:40:59,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 6990004224. Throughput: 0: 53438.7. Samples: 1480478140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:40:59,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 16:40:59,877][52263] Updated weights for policy 0, policy_version 426639 (0.0029) [2024-04-27 16:41:02,496][52263] Updated weights for policy 0, policy_version 426649 (0.0030) [2024-04-27 16:41:04,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6990299136. Throughput: 0: 53620.2. Samples: 1480802980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:41:04,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 16:41:05,917][52263] Updated weights for policy 0, policy_version 426659 (0.0037) [2024-04-27 16:41:08,470][52263] Updated weights for policy 0, policy_version 426669 (0.0028) [2024-04-27 16:41:09,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 6990577664. Throughput: 0: 53608.2. Samples: 1481123400. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:41:09,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 16:41:11,841][52263] Updated weights for policy 0, policy_version 426679 (0.0029) [2024-04-27 16:41:14,107][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.1, 300 sec: 53428.5). Total num frames: 6990823424. Throughput: 0: 53977.7. Samples: 1481298240. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:41:14,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 16:41:14,815][52263] Updated weights for policy 0, policy_version 426689 (0.0030) [2024-04-27 16:41:17,850][52263] Updated weights for policy 0, policy_version 426699 (0.0031) [2024-04-27 16:41:19,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6991101952. Throughput: 0: 53982.7. Samples: 1481620280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:41:19,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 16:41:20,333][52242] Signal inference workers to stop experience collection... (22750 times) [2024-04-27 16:41:20,333][52242] Signal inference workers to resume experience collection... (22750 times) [2024-04-27 16:41:20,364][52263] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-04-27 16:41:20,364][52263] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-04-27 16:41:20,897][52263] Updated weights for policy 0, policy_version 426709 (0.0033) [2024-04-27 16:41:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.3, 300 sec: 53595.1). Total num frames: 6991347712. Throughput: 0: 54013.7. Samples: 1481939980. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:41:24,107][52031] Avg episode reward: [(0, '0.500')] [2024-04-27 16:41:24,232][52263] Updated weights for policy 0, policy_version 426719 (0.0031) [2024-04-27 16:41:26,809][52263] Updated weights for policy 0, policy_version 426729 (0.0029) [2024-04-27 16:41:29,107][52031] Fps is (10 sec: 54067.0, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 6991642624. Throughput: 0: 53725.5. Samples: 1482094180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-27 16:41:29,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 16:41:30,392][52263] Updated weights for policy 0, policy_version 426739 (0.0036) [2024-04-27 16:41:32,739][52263] Updated weights for policy 0, policy_version 426749 (0.0029) [2024-04-27 16:41:34,106][52031] Fps is (10 sec: 57344.3, 60 sec: 54067.4, 300 sec: 53650.7). Total num frames: 6991921152. Throughput: 0: 53628.7. Samples: 1482414400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:41:34,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 16:41:34,113][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000426753_6991921152.pth... [2024-04-27 16:41:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000425968_6979059712.pth [2024-04-27 16:41:36,535][52263] Updated weights for policy 0, policy_version 426759 (0.0027) [2024-04-27 16:41:38,922][52263] Updated weights for policy 0, policy_version 426769 (0.0031) [2024-04-27 16:41:39,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53793.9, 300 sec: 53650.6). Total num frames: 6992183296. Throughput: 0: 53741.3. Samples: 1482742080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:41:39,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 16:41:42,592][52263] Updated weights for policy 0, policy_version 426779 (0.0031) [2024-04-27 16:41:44,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.1, 300 sec: 53539.6). 
Total num frames: 6992445440. Throughput: 0: 53981.0. Samples: 1482907280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:41:44,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 16:41:45,202][52263] Updated weights for policy 0, policy_version 426789 (0.0030) [2024-04-27 16:41:48,651][52263] Updated weights for policy 0, policy_version 426799 (0.0027) [2024-04-27 16:41:49,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6992707584. Throughput: 0: 53886.4. Samples: 1483227860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:41:49,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 16:41:51,332][52263] Updated weights for policy 0, policy_version 426809 (0.0027) [2024-04-27 16:41:54,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 6992969728. Throughput: 0: 53919.4. Samples: 1483549780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:41:54,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 16:41:54,696][52263] Updated weights for policy 0, policy_version 426819 (0.0028) [2024-04-27 16:41:57,445][52263] Updated weights for policy 0, policy_version 426829 (0.0024) [2024-04-27 16:41:59,106][52031] Fps is (10 sec: 54067.5, 60 sec: 54067.3, 300 sec: 53595.1). Total num frames: 6993248256. Throughput: 0: 53454.8. Samples: 1483703700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:41:59,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 16:42:00,689][52263] Updated weights for policy 0, policy_version 426839 (0.0034) [2024-04-27 16:42:03,538][52263] Updated weights for policy 0, policy_version 426849 (0.0031) [2024-04-27 16:42:04,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6993526784. Throughput: 0: 53405.3. Samples: 1484023520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:04,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 16:42:06,833][52263] Updated weights for policy 0, policy_version 426859 (0.0031) [2024-04-27 16:42:09,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6993772544. Throughput: 0: 53495.5. Samples: 1484347280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:09,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 16:42:09,510][52263] Updated weights for policy 0, policy_version 426869 (0.0027) [2024-04-27 16:42:12,879][52263] Updated weights for policy 0, policy_version 426879 (0.0030) [2024-04-27 16:42:14,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6994051072. Throughput: 0: 53616.2. Samples: 1484506900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:14,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:42:15,691][52263] Updated weights for policy 0, policy_version 426889 (0.0037) [2024-04-27 16:42:18,904][52263] Updated weights for policy 0, policy_version 426899 (0.0038) [2024-04-27 16:42:19,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.2, 300 sec: 53595.3). Total num frames: 6994313216. Throughput: 0: 53725.2. Samples: 1484832040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:19,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 16:42:21,912][52263] Updated weights for policy 0, policy_version 426909 (0.0035) [2024-04-27 16:42:24,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53794.0, 300 sec: 53539.5). Total num frames: 6994575360. Throughput: 0: 53579.6. Samples: 1485153160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:24,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 16:42:24,893][52263] Updated weights for policy 0, policy_version 426919 (0.0030) [2024-04-27 16:42:25,972][52242] Signal inference workers to stop experience collection... (22800 times) [2024-04-27 16:42:26,010][52263] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-04-27 16:42:26,068][52242] Signal inference workers to resume experience collection... (22800 times) [2024-04-27 16:42:26,068][52263] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-04-27 16:42:27,901][52263] Updated weights for policy 0, policy_version 426929 (0.0028) [2024-04-27 16:42:29,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6994853888. Throughput: 0: 53468.4. Samples: 1485313360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:29,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 16:42:31,035][52263] Updated weights for policy 0, policy_version 426939 (0.0031) [2024-04-27 16:42:33,849][52263] Updated weights for policy 0, policy_version 426949 (0.0027) [2024-04-27 16:42:34,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 6995132416. Throughput: 0: 53449.2. Samples: 1485633080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:34,107][52031] Avg episode reward: [(0, '0.687')] [2024-04-27 16:42:37,238][52263] Updated weights for policy 0, policy_version 426959 (0.0032) [2024-04-27 16:42:39,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6995394560. Throughput: 0: 53413.3. Samples: 1485953380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:39,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 16:42:40,231][52263] Updated weights for policy 0, policy_version 426969 (0.0028) [2024-04-27 16:42:43,216][52263] Updated weights for policy 0, policy_version 426979 (0.0026) [2024-04-27 16:42:44,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 6995673088. Throughput: 0: 53661.6. Samples: 1486118480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:44,116][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 16:42:46,273][52263] Updated weights for policy 0, policy_version 426989 (0.0036) [2024-04-27 16:42:49,107][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 6995935232. Throughput: 0: 53659.0. Samples: 1486438180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:49,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 16:42:49,274][52263] Updated weights for policy 0, policy_version 426999 (0.0025) [2024-04-27 16:42:52,347][52263] Updated weights for policy 0, policy_version 427009 (0.0035) [2024-04-27 16:42:54,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 6996197376. Throughput: 0: 53593.7. Samples: 1486759000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 16:42:54,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 16:42:55,288][52263] Updated weights for policy 0, policy_version 427019 (0.0028) [2024-04-27 16:42:58,662][52263] Updated weights for policy 0, policy_version 427029 (0.0032) [2024-04-27 16:42:59,107][52031] Fps is (10 sec: 52429.5, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 6996459520. Throughput: 0: 53592.8. Samples: 1486918580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:42:59,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 16:43:01,343][52263] Updated weights for policy 0, policy_version 427039 (0.0025) [2024-04-27 16:43:04,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6996738048. Throughput: 0: 53551.1. Samples: 1487241840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:04,107][52031] Avg episode reward: [(0, '0.484')] [2024-04-27 16:43:04,883][52263] Updated weights for policy 0, policy_version 427049 (0.0029) [2024-04-27 16:43:07,584][52263] Updated weights for policy 0, policy_version 427059 (0.0028) [2024-04-27 16:43:09,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 6997000192. Throughput: 0: 53513.9. Samples: 1487561280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:09,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 16:43:10,964][52263] Updated weights for policy 0, policy_version 427069 (0.0033) [2024-04-27 16:43:13,609][52263] Updated weights for policy 0, policy_version 427079 (0.0028) [2024-04-27 16:43:14,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6997262336. Throughput: 0: 53643.2. Samples: 1487727300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:14,107][52031] Avg episode reward: [(0, '0.572')] [2024-04-27 16:43:17,053][52263] Updated weights for policy 0, policy_version 427089 (0.0029) [2024-04-27 16:43:19,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 6997524480. Throughput: 0: 53718.0. Samples: 1488050380. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:19,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 16:43:19,770][52263] Updated weights for policy 0, policy_version 427099 (0.0027) [2024-04-27 16:43:20,282][52242] Signal inference workers to stop experience collection... (22850 times) [2024-04-27 16:43:20,283][52242] Signal inference workers to resume experience collection... (22850 times) [2024-04-27 16:43:20,304][52263] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-04-27 16:43:20,304][52263] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-04-27 16:43:23,221][52263] Updated weights for policy 0, policy_version 427109 (0.0032) [2024-04-27 16:43:24,107][52031] Fps is (10 sec: 55704.5, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 6997819392. Throughput: 0: 53697.8. Samples: 1488369780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:24,107][52031] Avg episode reward: [(0, '0.514')] [2024-04-27 16:43:26,105][52263] Updated weights for policy 0, policy_version 427119 (0.0030) [2024-04-27 16:43:29,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 6998065152. Throughput: 0: 53534.7. Samples: 1488527540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:29,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 16:43:29,334][52263] Updated weights for policy 0, policy_version 427129 (0.0030) [2024-04-27 16:43:32,091][52263] Updated weights for policy 0, policy_version 427139 (0.0030) [2024-04-27 16:43:34,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53484.1). Total num frames: 6998327296. Throughput: 0: 53617.9. Samples: 1488850980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:34,108][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 16:43:34,114][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000427144_6998327296.pth... 
[2024-04-27 16:43:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000426361_6985498624.pth [2024-04-27 16:43:35,624][52263] Updated weights for policy 0, policy_version 427149 (0.0033) [2024-04-27 16:43:38,194][52263] Updated weights for policy 0, policy_version 427159 (0.0031) [2024-04-27 16:43:39,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 6998622208. Throughput: 0: 53521.8. Samples: 1489167480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:39,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 16:43:41,779][52263] Updated weights for policy 0, policy_version 427169 (0.0032) [2024-04-27 16:43:44,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 6998884352. Throughput: 0: 53654.3. Samples: 1489333020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:44,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 16:43:44,403][52263] Updated weights for policy 0, policy_version 427179 (0.0027) [2024-04-27 16:43:47,745][52263] Updated weights for policy 0, policy_version 427189 (0.0028) [2024-04-27 16:43:49,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.3, 300 sec: 53595.1). Total num frames: 6999146496. Throughput: 0: 53579.1. Samples: 1489652900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:49,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 16:43:50,323][52263] Updated weights for policy 0, policy_version 427199 (0.0033) [2024-04-27 16:43:53,841][52263] Updated weights for policy 0, policy_version 427209 (0.0029) [2024-04-27 16:43:54,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 6999425024. Throughput: 0: 53760.6. Samples: 1489980500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:54,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 16:43:56,385][52263] Updated weights for policy 0, policy_version 427219 (0.0027) [2024-04-27 16:43:59,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 6999670784. Throughput: 0: 53493.3. Samples: 1490134500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:43:59,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 16:43:59,907][52263] Updated weights for policy 0, policy_version 427229 (0.0027) [2024-04-27 16:44:02,486][52263] Updated weights for policy 0, policy_version 427239 (0.0028) [2024-04-27 16:44:04,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 6999949312. Throughput: 0: 53474.5. Samples: 1490456740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:44:04,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:44:05,966][52263] Updated weights for policy 0, policy_version 427249 (0.0032) [2024-04-27 16:44:08,588][52263] Updated weights for policy 0, policy_version 427259 (0.0024) [2024-04-27 16:44:09,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7000227840. Throughput: 0: 53587.7. Samples: 1490781220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:44:09,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 16:44:12,031][52263] Updated weights for policy 0, policy_version 427269 (0.0030) [2024-04-27 16:44:14,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.1, 300 sec: 53595.2). Total num frames: 7000489984. Throughput: 0: 53861.1. Samples: 1490951280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 16:44:14,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 16:44:14,804][52263] Updated weights for policy 0, policy_version 427279 (0.0029) [2024-04-27 16:44:18,155][52263] Updated weights for policy 0, policy_version 427289 (0.0036) [2024-04-27 16:44:19,107][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 7000768512. Throughput: 0: 53914.7. Samples: 1491277140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:44:19,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 16:44:19,763][52242] Signal inference workers to stop experience collection... (22900 times) [2024-04-27 16:44:19,767][52242] Signal inference workers to resume experience collection... (22900 times) [2024-04-27 16:44:19,798][52263] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-04-27 16:44:19,798][52263] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-04-27 16:44:20,748][52263] Updated weights for policy 0, policy_version 427299 (0.0039) [2024-04-27 16:44:24,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 7001014272. Throughput: 0: 54022.6. Samples: 1491598500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:44:24,107][52031] Avg episode reward: [(0, '0.562')] [2024-04-27 16:44:24,177][52263] Updated weights for policy 0, policy_version 427309 (0.0033) [2024-04-27 16:44:26,825][52263] Updated weights for policy 0, policy_version 427319 (0.0035) [2024-04-27 16:44:29,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 7001276416. Throughput: 0: 53674.5. Samples: 1491748380. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:44:29,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 16:44:30,480][52263] Updated weights for policy 0, policy_version 427329 (0.0033) [2024-04-27 16:44:32,891][52263] Updated weights for policy 0, policy_version 427339 (0.0028) [2024-04-27 16:44:34,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7001554944. Throughput: 0: 53653.6. Samples: 1492067320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:44:34,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 16:44:36,469][52263] Updated weights for policy 0, policy_version 427349 (0.0029) [2024-04-27 16:44:38,831][52263] Updated weights for policy 0, policy_version 427359 (0.0029) [2024-04-27 16:44:39,106][52031] Fps is (10 sec: 57345.0, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 7001849856. Throughput: 0: 53576.5. Samples: 1492391440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:44:39,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 16:44:42,651][52263] Updated weights for policy 0, policy_version 427369 (0.0029) [2024-04-27 16:44:44,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 7002112000. Throughput: 0: 54075.9. Samples: 1492567920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:44:44,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 16:44:45,009][52263] Updated weights for policy 0, policy_version 427379 (0.0035) [2024-04-27 16:44:48,679][52263] Updated weights for policy 0, policy_version 427389 (0.0027) [2024-04-27 16:44:49,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 7002357760. Throughput: 0: 53989.9. Samples: 1492886280. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:44:49,107][52031] Avg episode reward: [(0, '0.652')] [2024-04-27 16:44:51,089][52263] Updated weights for policy 0, policy_version 427399 (0.0025) [2024-04-27 16:44:54,106][52031] Fps is (10 sec: 49153.0, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 7002603520. Throughput: 0: 53950.8. Samples: 1493209000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:44:54,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 16:44:54,691][52263] Updated weights for policy 0, policy_version 427409 (0.0029) [2024-04-27 16:44:57,192][52263] Updated weights for policy 0, policy_version 427419 (0.0026) [2024-04-27 16:44:59,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 7002865664. Throughput: 0: 53466.2. Samples: 1493357260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:44:59,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 16:45:00,708][52263] Updated weights for policy 0, policy_version 427429 (0.0029) [2024-04-27 16:45:03,328][52263] Updated weights for policy 0, policy_version 427439 (0.0032) [2024-04-27 16:45:04,107][52031] Fps is (10 sec: 58981.7, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 7003193344. Throughput: 0: 53442.2. Samples: 1493682040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:45:04,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 16:45:06,854][52263] Updated weights for policy 0, policy_version 427449 (0.0029) [2024-04-27 16:45:09,107][52031] Fps is (10 sec: 60619.8, 60 sec: 54067.2, 300 sec: 53761.7). Total num frames: 7003471872. Throughput: 0: 53471.1. Samples: 1494004700. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:45:09,107][52031] Avg episode reward: [(0, '0.564')] [2024-04-27 16:45:09,285][52263] Updated weights for policy 0, policy_version 427459 (0.0033) [2024-04-27 16:45:12,807][52263] Updated weights for policy 0, policy_version 427469 (0.0031) [2024-04-27 16:45:14,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 7003717632. Throughput: 0: 54030.4. Samples: 1494179740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:45:14,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 16:45:15,354][52263] Updated weights for policy 0, policy_version 427479 (0.0033) [2024-04-27 16:45:18,914][52263] Updated weights for policy 0, policy_version 427489 (0.0032) [2024-04-27 16:45:19,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 7003979776. Throughput: 0: 54028.6. Samples: 1494498600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:45:19,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 16:45:21,399][52263] Updated weights for policy 0, policy_version 427499 (0.0025) [2024-04-27 16:45:24,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 7004225536. Throughput: 0: 54016.8. Samples: 1494822200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:45:24,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 16:45:24,934][52242] Signal inference workers to stop experience collection... (22950 times) [2024-04-27 16:45:24,940][52242] Signal inference workers to resume experience collection... 
(22950 times) [2024-04-27 16:45:24,968][52263] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-04-27 16:45:24,968][52263] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-04-27 16:45:25,057][52263] Updated weights for policy 0, policy_version 427509 (0.0030) [2024-04-27 16:45:27,458][52263] Updated weights for policy 0, policy_version 427519 (0.0028) [2024-04-27 16:45:29,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 7004504064. Throughput: 0: 53370.9. Samples: 1494969600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:45:29,107][52031] Avg episode reward: [(0, '0.660')] [2024-04-27 16:45:31,224][52263] Updated weights for policy 0, policy_version 427529 (0.0029) [2024-04-27 16:45:33,413][52263] Updated weights for policy 0, policy_version 427539 (0.0034) [2024-04-27 16:45:34,107][52031] Fps is (10 sec: 58981.4, 60 sec: 54340.2, 300 sec: 53761.7). Total num frames: 7004815360. Throughput: 0: 53545.1. Samples: 1495295820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:45:34,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 16:45:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000427540_7004815360.pth... [2024-04-27 16:45:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000426753_6991921152.pth [2024-04-27 16:45:37,401][52263] Updated weights for policy 0, policy_version 427549 (0.0036) [2024-04-27 16:45:39,106][52031] Fps is (10 sec: 57344.2, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 7005077504. Throughput: 0: 53579.6. Samples: 1495620080. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:45:39,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 16:45:39,573][52263] Updated weights for policy 0, policy_version 427559 (0.0035) [2024-04-27 16:45:43,611][52263] Updated weights for policy 0, policy_version 427569 (0.0028) [2024-04-27 16:45:44,106][52031] Fps is (10 sec: 50791.0, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 7005323264. Throughput: 0: 53999.4. Samples: 1495787240. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:45:44,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 16:45:45,626][52263] Updated weights for policy 0, policy_version 427579 (0.0039) [2024-04-27 16:45:49,107][52031] Fps is (10 sec: 50789.4, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 7005585408. Throughput: 0: 53965.3. Samples: 1496110480. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:45:49,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 16:45:49,584][52263] Updated weights for policy 0, policy_version 427589 (0.0030) [2024-04-27 16:45:51,882][52263] Updated weights for policy 0, policy_version 427599 (0.0027) [2024-04-27 16:45:54,106][52031] Fps is (10 sec: 49152.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7005814784. Throughput: 0: 53913.1. Samples: 1496430780. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:45:54,107][52031] Avg episode reward: [(0, '0.539')] [2024-04-27 16:45:55,575][52263] Updated weights for policy 0, policy_version 427609 (0.0032) [2024-04-27 16:45:57,835][52263] Updated weights for policy 0, policy_version 427619 (0.0031) [2024-04-27 16:45:59,106][52031] Fps is (10 sec: 54067.9, 60 sec: 54340.2, 300 sec: 53650.7). Total num frames: 7006126080. Throughput: 0: 53485.3. Samples: 1496586580. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:45:59,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 16:46:01,730][52263] Updated weights for policy 0, policy_version 427629 (0.0027) [2024-04-27 16:46:04,106][52031] Fps is (10 sec: 60620.4, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 7006420992. Throughput: 0: 53591.6. Samples: 1496910220. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:04,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 16:46:04,219][52263] Updated weights for policy 0, policy_version 427639 (0.0039) [2024-04-27 16:46:07,801][52263] Updated weights for policy 0, policy_version 427649 (0.0031) [2024-04-27 16:46:09,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 7006666752. Throughput: 0: 53463.5. Samples: 1497228060. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:09,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 16:46:10,217][52263] Updated weights for policy 0, policy_version 427659 (0.0027) [2024-04-27 16:46:13,262][52242] Signal inference workers to stop experience collection... (23000 times) [2024-04-27 16:46:13,262][52242] Signal inference workers to resume experience collection... (23000 times) [2024-04-27 16:46:13,279][52263] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-04-27 16:46:13,279][52263] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-04-27 16:46:13,925][52263] Updated weights for policy 0, policy_version 427669 (0.0029) [2024-04-27 16:46:14,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53520.9, 300 sec: 53650.7). Total num frames: 7006928896. Throughput: 0: 53753.1. Samples: 1497388500. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:14,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 16:46:16,179][52263] Updated weights for policy 0, policy_version 427679 (0.0025) [2024-04-27 16:46:19,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53247.9, 300 sec: 53650.6). Total num frames: 7007174656. Throughput: 0: 53685.4. Samples: 1497711660. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:19,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 16:46:19,937][52263] Updated weights for policy 0, policy_version 427689 (0.0029) [2024-04-27 16:46:22,308][52263] Updated weights for policy 0, policy_version 427699 (0.0027) [2024-04-27 16:46:24,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7007453184. Throughput: 0: 53672.8. Samples: 1498035360. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:24,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 16:46:26,346][52263] Updated weights for policy 0, policy_version 427709 (0.0026) [2024-04-27 16:46:28,848][52263] Updated weights for policy 0, policy_version 427719 (0.0029) [2024-04-27 16:46:29,107][52031] Fps is (10 sec: 57343.8, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 7007748096. Throughput: 0: 53464.4. Samples: 1498193140. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:29,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 16:46:32,289][52263] Updated weights for policy 0, policy_version 427729 (0.0027) [2024-04-27 16:46:34,106][52031] Fps is (10 sec: 57344.1, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 7008026624. Throughput: 0: 53409.1. Samples: 1498513880. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:34,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 16:46:35,017][52263] Updated weights for policy 0, policy_version 427739 (0.0025) [2024-04-27 16:46:38,448][52263] Updated weights for policy 0, policy_version 427749 (0.0029) [2024-04-27 16:46:39,106][52031] Fps is (10 sec: 52429.6, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 7008272384. Throughput: 0: 53375.5. Samples: 1498832680. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:39,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 16:46:41,043][52263] Updated weights for policy 0, policy_version 427759 (0.0030) [2024-04-27 16:46:44,106][52031] Fps is (10 sec: 49151.9, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 7008518144. Throughput: 0: 53424.0. Samples: 1498990660. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:44,107][52031] Avg episode reward: [(0, '0.659')] [2024-04-27 16:46:44,795][52263] Updated weights for policy 0, policy_version 427769 (0.0028) [2024-04-27 16:46:47,058][52263] Updated weights for policy 0, policy_version 427779 (0.0034) [2024-04-27 16:46:49,106][52031] Fps is (10 sec: 50790.1, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 7008780288. Throughput: 0: 53337.8. Samples: 1499310420. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:49,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 16:46:50,709][52263] Updated weights for policy 0, policy_version 427789 (0.0025) [2024-04-27 16:46:53,352][52263] Updated weights for policy 0, policy_version 427799 (0.0030) [2024-04-27 16:46:54,106][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 7009058816. Throughput: 0: 53345.7. Samples: 1499628620. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:54,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 16:46:56,693][52263] Updated weights for policy 0, policy_version 427809 (0.0031) [2024-04-27 16:46:59,107][52031] Fps is (10 sec: 58981.9, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 7009370112. Throughput: 0: 53533.4. Samples: 1499797500. Policy #0 lag: (min: 2.0, avg: 12.1, max: 22.0) [2024-04-27 16:46:59,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 16:46:59,560][52263] Updated weights for policy 0, policy_version 427819 (0.0026) [2024-04-27 16:47:02,847][52263] Updated weights for policy 0, policy_version 427829 (0.0028) [2024-04-27 16:47:04,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53248.0, 300 sec: 53706.2). Total num frames: 7009615872. Throughput: 0: 53534.8. Samples: 1500120720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:04,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 16:47:05,910][52263] Updated weights for policy 0, policy_version 427839 (0.0032) [2024-04-27 16:47:06,498][52242] Signal inference workers to stop experience collection... (23050 times) [2024-04-27 16:47:06,498][52242] Signal inference workers to resume experience collection... (23050 times) [2024-04-27 16:47:06,526][52263] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-04-27 16:47:06,527][52263] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-04-27 16:47:08,997][52263] Updated weights for policy 0, policy_version 427849 (0.0028) [2024-04-27 16:47:09,107][52031] Fps is (10 sec: 50790.3, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 7009878016. Throughput: 0: 53539.9. Samples: 1500444660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:09,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 16:47:12,118][52263] Updated weights for policy 0, policy_version 427859 (0.0029) [2024-04-27 16:47:14,106][52031] Fps is (10 sec: 50790.6, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 7010123776. Throughput: 0: 53353.5. Samples: 1500594040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:14,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 16:47:15,165][52263] Updated weights for policy 0, policy_version 427869 (0.0025) [2024-04-27 16:47:18,332][52263] Updated weights for policy 0, policy_version 427879 (0.0027) [2024-04-27 16:47:19,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 7010402304. Throughput: 0: 53392.5. Samples: 1500916540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:19,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 16:47:21,105][52263] Updated weights for policy 0, policy_version 427889 (0.0034) [2024-04-27 16:47:24,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7010680832. Throughput: 0: 53416.0. Samples: 1501236400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:24,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 16:47:24,415][52263] Updated weights for policy 0, policy_version 427899 (0.0032) [2024-04-27 16:47:27,152][52263] Updated weights for policy 0, policy_version 427909 (0.0029) [2024-04-27 16:47:29,106][52031] Fps is (10 sec: 57343.8, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 7010975744. Throughput: 0: 53615.6. Samples: 1501403360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:29,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 16:47:30,474][52263] Updated weights for policy 0, policy_version 427919 (0.0031) [2024-04-27 16:47:33,290][52263] Updated weights for policy 0, policy_version 427929 (0.0030) [2024-04-27 16:47:34,107][52031] Fps is (10 sec: 54066.7, 60 sec: 53247.9, 300 sec: 53650.7). Total num frames: 7011221504. Throughput: 0: 53841.3. Samples: 1501733280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:34,107][52031] Avg episode reward: [(0, '0.619')] [2024-04-27 16:47:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000427931_7011221504.pth... [2024-04-27 16:47:34,279][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000427144_6998327296.pth [2024-04-27 16:47:36,621][52263] Updated weights for policy 0, policy_version 427939 (0.0029) [2024-04-27 16:47:39,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7011483648. Throughput: 0: 53816.5. Samples: 1502050360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:39,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 16:47:39,367][52263] Updated weights for policy 0, policy_version 427949 (0.0027) [2024-04-27 16:47:42,789][52263] Updated weights for policy 0, policy_version 427959 (0.0028) [2024-04-27 16:47:44,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 7011729408. Throughput: 0: 53267.2. Samples: 1502194520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:44,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 16:47:45,440][52263] Updated weights for policy 0, policy_version 427969 (0.0027) [2024-04-27 16:47:48,772][52263] Updated weights for policy 0, policy_version 427979 (0.0031) [2024-04-27 16:47:49,106][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.1, 300 sec: 53595.1). 
Total num frames: 7012007936. Throughput: 0: 53224.0. Samples: 1502515800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:49,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 16:47:51,632][52263] Updated weights for policy 0, policy_version 427989 (0.0025) [2024-04-27 16:47:54,107][52031] Fps is (10 sec: 57343.3, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 7012302848. Throughput: 0: 53247.5. Samples: 1502840800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:54,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 16:47:54,742][52263] Updated weights for policy 0, policy_version 427999 (0.0028) [2024-04-27 16:47:57,812][52263] Updated weights for policy 0, policy_version 428009 (0.0034) [2024-04-27 16:47:58,985][52242] Signal inference workers to stop experience collection... (23100 times) [2024-04-27 16:47:59,032][52263] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-04-27 16:47:59,047][52242] Signal inference workers to resume experience collection... (23100 times) [2024-04-27 16:47:59,050][52263] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-04-27 16:47:59,106][52031] Fps is (10 sec: 55706.1, 60 sec: 53248.1, 300 sec: 53650.7). Total num frames: 7012564992. Throughput: 0: 53807.2. Samples: 1503015360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:47:59,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 16:48:00,946][52263] Updated weights for policy 0, policy_version 428019 (0.0033) [2024-04-27 16:48:03,792][52263] Updated weights for policy 0, policy_version 428029 (0.0030) [2024-04-27 16:48:04,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 7012827136. Throughput: 0: 53891.1. Samples: 1503341640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:48:04,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 16:48:07,254][52263] Updated weights for policy 0, policy_version 428039 (0.0032) [2024-04-27 16:48:09,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 7013089280. Throughput: 0: 53930.7. Samples: 1503663280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:48:09,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 16:48:09,776][52263] Updated weights for policy 0, policy_version 428049 (0.0026) [2024-04-27 16:48:13,305][52263] Updated weights for policy 0, policy_version 428059 (0.0033) [2024-04-27 16:48:14,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 7013351424. Throughput: 0: 53646.5. Samples: 1503817460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:48:14,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 16:48:16,009][52263] Updated weights for policy 0, policy_version 428069 (0.0027) [2024-04-27 16:48:19,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 7013629952. Throughput: 0: 53484.3. Samples: 1504140080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-27 16:48:19,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 16:48:19,544][52263] Updated weights for policy 0, policy_version 428079 (0.0025) [2024-04-27 16:48:22,186][52263] Updated weights for policy 0, policy_version 428089 (0.0026) [2024-04-27 16:48:24,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 7013908480. Throughput: 0: 53469.2. Samples: 1504456480. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:48:24,107][52031] Avg episode reward: [(0, '0.556')] [2024-04-27 16:48:25,702][52263] Updated weights for policy 0, policy_version 428099 (0.0034) [2024-04-27 16:48:28,364][52263] Updated weights for policy 0, policy_version 428109 (0.0028) [2024-04-27 16:48:29,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53247.9, 300 sec: 53706.2). Total num frames: 7014170624. Throughput: 0: 54177.3. Samples: 1504632500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:48:29,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 16:48:31,813][52263] Updated weights for policy 0, policy_version 428119 (0.0030) [2024-04-27 16:48:34,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 7014432768. Throughput: 0: 54179.6. Samples: 1504953880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:48:34,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 16:48:34,640][52263] Updated weights for policy 0, policy_version 428129 (0.0030) [2024-04-27 16:48:37,990][52263] Updated weights for policy 0, policy_version 428139 (0.0031) [2024-04-27 16:48:39,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 7014711296. Throughput: 0: 54072.1. Samples: 1505274040. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:48:39,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 16:48:40,682][52263] Updated weights for policy 0, policy_version 428149 (0.0034) [2024-04-27 16:48:43,918][52263] Updated weights for policy 0, policy_version 428159 (0.0034) [2024-04-27 16:48:44,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7014957056. Throughput: 0: 53606.6. Samples: 1505427660. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:48:44,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 16:48:47,068][52263] Updated weights for policy 0, policy_version 428169 (0.0030) [2024-04-27 16:48:48,278][52242] Signal inference workers to stop experience collection... (23150 times) [2024-04-27 16:48:48,278][52242] Signal inference workers to resume experience collection... (23150 times) [2024-04-27 16:48:48,309][52263] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-04-27 16:48:48,309][52263] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-04-27 16:48:49,106][52031] Fps is (10 sec: 54067.8, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 7015251968. Throughput: 0: 53585.7. Samples: 1505753000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:48:49,107][52031] Avg episode reward: [(0, '0.536')] [2024-04-27 16:48:49,951][52263] Updated weights for policy 0, policy_version 428179 (0.0032) [2024-04-27 16:48:53,065][52263] Updated weights for policy 0, policy_version 428189 (0.0028) [2024-04-27 16:48:54,106][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 7015514112. Throughput: 0: 53574.5. Samples: 1506074140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:48:54,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 16:48:56,223][52263] Updated weights for policy 0, policy_version 428199 (0.0033) [2024-04-27 16:48:59,046][52263] Updated weights for policy 0, policy_version 428209 (0.0035) [2024-04-27 16:48:59,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 7015776256. Throughput: 0: 53828.6. Samples: 1506239740. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:48:59,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 16:49:02,354][52263] Updated weights for policy 0, policy_version 428219 (0.0024) [2024-04-27 16:49:04,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 7016038400. Throughput: 0: 53782.8. Samples: 1506560300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:04,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 16:49:05,250][52263] Updated weights for policy 0, policy_version 428229 (0.0030) [2024-04-27 16:49:08,505][52263] Updated weights for policy 0, policy_version 428239 (0.0022) [2024-04-27 16:49:09,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 7016316928. Throughput: 0: 53907.1. Samples: 1506882300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:09,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 16:49:11,315][52263] Updated weights for policy 0, policy_version 428249 (0.0031) [2024-04-27 16:49:14,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 7016562688. Throughput: 0: 53452.1. Samples: 1507037840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:14,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 16:49:14,522][52263] Updated weights for policy 0, policy_version 428259 (0.0034) [2024-04-27 16:49:17,372][52263] Updated weights for policy 0, policy_version 428269 (0.0027) [2024-04-27 16:49:19,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 7016857600. Throughput: 0: 53487.0. Samples: 1507360800. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:19,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 16:49:20,673][52263] Updated weights for policy 0, policy_version 428279 (0.0026) [2024-04-27 16:49:23,394][52263] Updated weights for policy 0, policy_version 428289 (0.0025) [2024-04-27 16:49:24,107][52031] Fps is (10 sec: 57342.8, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 7017136128. Throughput: 0: 53562.6. Samples: 1507684360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:24,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:49:26,737][52263] Updated weights for policy 0, policy_version 428299 (0.0027) [2024-04-27 16:49:29,106][52031] Fps is (10 sec: 50791.3, 60 sec: 53248.1, 300 sec: 53595.2). Total num frames: 7017365504. Throughput: 0: 53820.2. Samples: 1507849560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:29,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 16:49:29,498][52263] Updated weights for policy 0, policy_version 428309 (0.0027) [2024-04-27 16:49:32,808][52263] Updated weights for policy 0, policy_version 428319 (0.0029) [2024-04-27 16:49:34,106][52031] Fps is (10 sec: 52430.1, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 7017660416. Throughput: 0: 53742.8. Samples: 1508171420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:34,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 16:49:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000428324_7017660416.pth... [2024-04-27 16:49:34,166][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000427540_7004815360.pth [2024-04-27 16:49:35,703][52263] Updated weights for policy 0, policy_version 428329 (0.0034) [2024-04-27 16:49:38,759][52263] Updated weights for policy 0, policy_version 428339 (0.0024) [2024-04-27 16:49:39,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.2, 300 sec: 53595.2). 
Total num frames: 7017922560. Throughput: 0: 53767.7. Samples: 1508493680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:39,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 16:49:41,948][52263] Updated weights for policy 0, policy_version 428349 (0.0029) [2024-04-27 16:49:44,107][52031] Fps is (10 sec: 54066.3, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 7018201088. Throughput: 0: 53696.8. Samples: 1508656100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:44,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 16:49:44,662][52263] Updated weights for policy 0, policy_version 428359 (0.0031) [2024-04-27 16:49:47,948][52263] Updated weights for policy 0, policy_version 428369 (0.0026) [2024-04-27 16:49:49,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.1, 300 sec: 53817.3). Total num frames: 7018479616. Throughput: 0: 53858.1. Samples: 1508983920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:49,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 16:49:50,696][52263] Updated weights for policy 0, policy_version 428379 (0.0030) [2024-04-27 16:49:53,998][52263] Updated weights for policy 0, policy_version 428389 (0.0030) [2024-04-27 16:49:54,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 7018725376. Throughput: 0: 53801.1. Samples: 1509303340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:54,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 16:49:54,142][52242] Signal inference workers to stop experience collection... (23200 times) [2024-04-27 16:49:54,142][52242] Signal inference workers to resume experience collection... 
(23200 times) [2024-04-27 16:49:54,161][52263] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-04-27 16:49:54,161][52263] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-04-27 16:49:57,069][52263] Updated weights for policy 0, policy_version 428399 (0.0037) [2024-04-27 16:49:59,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 7019003904. Throughput: 0: 53916.7. Samples: 1509464100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:49:59,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 16:50:00,089][52263] Updated weights for policy 0, policy_version 428409 (0.0030) [2024-04-27 16:50:03,024][52263] Updated weights for policy 0, policy_version 428419 (0.0032) [2024-04-27 16:50:04,107][52031] Fps is (10 sec: 52427.4, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 7019249664. Throughput: 0: 53779.0. Samples: 1509780860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:04,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 16:50:06,172][52263] Updated weights for policy 0, policy_version 428429 (0.0029) [2024-04-27 16:50:09,106][52031] Fps is (10 sec: 52430.0, 60 sec: 53521.3, 300 sec: 53595.1). Total num frames: 7019528192. Throughput: 0: 53808.8. Samples: 1510105740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:09,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 16:50:09,162][52263] Updated weights for policy 0, policy_version 428439 (0.0030) [2024-04-27 16:50:12,156][52263] Updated weights for policy 0, policy_version 428449 (0.0028) [2024-04-27 16:50:14,106][52031] Fps is (10 sec: 57345.2, 60 sec: 54340.2, 300 sec: 53706.2). Total num frames: 7019823104. Throughput: 0: 53818.1. Samples: 1510271380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:14,107][52031] Avg episode reward: [(0, '0.561')] [2024-04-27 16:50:15,144][52263] Updated weights for policy 0, policy_version 428459 (0.0029) [2024-04-27 16:50:18,258][52263] Updated weights for policy 0, policy_version 428469 (0.0036) [2024-04-27 16:50:19,107][52031] Fps is (10 sec: 57343.1, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 7020101632. Throughput: 0: 53858.1. Samples: 1510595040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:19,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 16:50:21,257][52263] Updated weights for policy 0, policy_version 428479 (0.0030) [2024-04-27 16:50:24,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.2, 300 sec: 53650.7). Total num frames: 7020331008. Throughput: 0: 53839.6. Samples: 1510916460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:24,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 16:50:24,485][52263] Updated weights for policy 0, policy_version 428489 (0.0025) [2024-04-27 16:50:27,170][52263] Updated weights for policy 0, policy_version 428499 (0.0029) [2024-04-27 16:50:29,107][52031] Fps is (10 sec: 50790.3, 60 sec: 54067.0, 300 sec: 53539.6). Total num frames: 7020609536. Throughput: 0: 53585.3. Samples: 1511067440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:29,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 16:50:30,569][52263] Updated weights for policy 0, policy_version 428509 (0.0027) [2024-04-27 16:50:33,201][52263] Updated weights for policy 0, policy_version 428519 (0.0033) [2024-04-27 16:50:34,106][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 7020888064. Throughput: 0: 53579.6. Samples: 1511395000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:34,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 16:50:36,659][52263] Updated weights for policy 0, policy_version 428529 (0.0036) [2024-04-27 16:50:39,107][52031] Fps is (10 sec: 55704.1, 60 sec: 54066.9, 300 sec: 53706.1). Total num frames: 7021166592. Throughput: 0: 53638.6. Samples: 1511717100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:39,107][52031] Avg episode reward: [(0, '0.661')] [2024-04-27 16:50:39,351][52263] Updated weights for policy 0, policy_version 428539 (0.0029) [2024-04-27 16:50:42,639][52263] Updated weights for policy 0, policy_version 428549 (0.0027) [2024-04-27 16:50:44,106][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 7021412352. Throughput: 0: 53728.1. Samples: 1511881860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:44,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 16:50:45,671][52263] Updated weights for policy 0, policy_version 428559 (0.0031) [2024-04-27 16:50:48,826][52263] Updated weights for policy 0, policy_version 428569 (0.0033) [2024-04-27 16:50:49,107][52031] Fps is (10 sec: 54068.8, 60 sec: 53794.2, 300 sec: 53872.8). Total num frames: 7021707264. Throughput: 0: 53760.6. Samples: 1512200080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:49,107][52031] Avg episode reward: [(0, '0.647')] [2024-04-27 16:50:51,713][52263] Updated weights for policy 0, policy_version 428579 (0.0036) [2024-04-27 16:50:52,723][52242] Signal inference workers to stop experience collection... (23250 times) [2024-04-27 16:50:52,723][52242] Signal inference workers to resume experience collection... 
(23250 times) [2024-04-27 16:50:52,766][52263] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-04-27 16:50:52,767][52263] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-04-27 16:50:54,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 7021953024. Throughput: 0: 53709.7. Samples: 1512522680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:54,107][52031] Avg episode reward: [(0, '0.551')] [2024-04-27 16:50:54,950][52263] Updated weights for policy 0, policy_version 428589 (0.0026) [2024-04-27 16:50:57,804][52263] Updated weights for policy 0, policy_version 428599 (0.0033) [2024-04-27 16:50:59,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7022231552. Throughput: 0: 53681.6. Samples: 1512687060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:50:59,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 16:51:00,908][52263] Updated weights for policy 0, policy_version 428609 (0.0030) [2024-04-27 16:51:04,055][52263] Updated weights for policy 0, policy_version 428619 (0.0029) [2024-04-27 16:51:04,107][52031] Fps is (10 sec: 54066.8, 60 sec: 54067.3, 300 sec: 53650.6). Total num frames: 7022493696. Throughput: 0: 53633.3. Samples: 1513008540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 16:51:04,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 16:51:06,919][52263] Updated weights for policy 0, policy_version 428629 (0.0028) [2024-04-27 16:51:09,106][52031] Fps is (10 sec: 54068.2, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 7022772224. Throughput: 0: 53584.0. Samples: 1513327740. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:09,107][52031] Avg episode reward: [(0, '0.678')] [2024-04-27 16:51:10,039][52263] Updated weights for policy 0, policy_version 428639 (0.0035) [2024-04-27 16:51:13,224][52263] Updated weights for policy 0, policy_version 428649 (0.0029) [2024-04-27 16:51:14,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 7023034368. Throughput: 0: 53768.5. Samples: 1513487020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:14,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 16:51:16,036][52263] Updated weights for policy 0, policy_version 428659 (0.0030) [2024-04-27 16:51:19,107][52031] Fps is (10 sec: 50789.4, 60 sec: 52974.8, 300 sec: 53650.6). Total num frames: 7023280128. Throughput: 0: 53584.3. Samples: 1513806300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:19,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 16:51:19,590][52263] Updated weights for policy 0, policy_version 428669 (0.0032) [2024-04-27 16:51:22,228][52263] Updated weights for policy 0, policy_version 428679 (0.0028) [2024-04-27 16:51:24,107][52031] Fps is (10 sec: 52428.3, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 7023558656. Throughput: 0: 53586.0. Samples: 1514128460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:24,107][52031] Avg episode reward: [(0, '0.645')] [2024-04-27 16:51:25,598][52263] Updated weights for policy 0, policy_version 428689 (0.0034) [2024-04-27 16:51:28,407][52263] Updated weights for policy 0, policy_version 428699 (0.0027) [2024-04-27 16:51:29,107][52031] Fps is (10 sec: 55705.8, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7023837184. Throughput: 0: 53544.3. Samples: 1514291360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:29,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 16:51:31,705][52263] Updated weights for policy 0, policy_version 428709 (0.0029) [2024-04-27 16:51:34,107][52031] Fps is (10 sec: 55706.0, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 7024115712. Throughput: 0: 53580.0. Samples: 1514611180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:34,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 16:51:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000428718_7024115712.pth... [2024-04-27 16:51:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000427931_7011221504.pth [2024-04-27 16:51:34,665][52263] Updated weights for policy 0, policy_version 428719 (0.0031) [2024-04-27 16:51:37,858][52263] Updated weights for policy 0, policy_version 428729 (0.0032) [2024-04-27 16:51:39,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.2, 300 sec: 53761.7). Total num frames: 7024377856. Throughput: 0: 53506.5. Samples: 1514930480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:39,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 16:51:40,719][52263] Updated weights for policy 0, policy_version 428739 (0.0030) [2024-04-27 16:51:43,807][52263] Updated weights for policy 0, policy_version 428749 (0.0031) [2024-04-27 16:51:44,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53794.0, 300 sec: 53761.7). Total num frames: 7024640000. Throughput: 0: 53473.3. Samples: 1515093360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:44,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 16:51:46,655][52263] Updated weights for policy 0, policy_version 428759 (0.0027) [2024-04-27 16:51:49,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53248.1, 300 sec: 53706.2). Total num frames: 7024902144. Throughput: 0: 53586.4. Samples: 1515419920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:49,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 16:51:49,717][52263] Updated weights for policy 0, policy_version 428769 (0.0026) [2024-04-27 16:51:52,671][52263] Updated weights for policy 0, policy_version 428779 (0.0031) [2024-04-27 16:51:54,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 7025164288. Throughput: 0: 53524.3. Samples: 1515736340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:54,107][52031] Avg episode reward: [(0, '0.634')] [2024-04-27 16:51:56,022][52263] Updated weights for policy 0, policy_version 428789 (0.0030) [2024-04-27 16:51:58,867][52263] Updated weights for policy 0, policy_version 428799 (0.0031) [2024-04-27 16:51:59,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 7025442816. Throughput: 0: 53491.2. Samples: 1515894120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:51:59,107][52031] Avg episode reward: [(0, '0.576')] [2024-04-27 16:52:02,176][52263] Updated weights for policy 0, policy_version 428809 (0.0030) [2024-04-27 16:52:04,106][52031] Fps is (10 sec: 55706.8, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 7025721344. Throughput: 0: 53554.1. Samples: 1516216220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:52:04,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 16:52:05,019][52263] Updated weights for policy 0, policy_version 428819 (0.0026) [2024-04-27 16:52:08,278][52263] Updated weights for policy 0, policy_version 428829 (0.0033) [2024-04-27 16:52:09,086][52242] Signal inference workers to stop experience collection... (23300 times) [2024-04-27 16:52:09,087][52242] Signal inference workers to resume experience collection... (23300 times) [2024-04-27 16:52:09,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 7025983488. Throughput: 0: 53596.7. 
Samples: 1516540300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:52:09,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 16:52:09,113][52263] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-04-27 16:52:09,113][52263] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-04-27 16:52:11,067][52263] Updated weights for policy 0, policy_version 428839 (0.0026) [2024-04-27 16:52:14,106][52031] Fps is (10 sec: 52428.1, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 7026245632. Throughput: 0: 53527.7. Samples: 1516700100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:52:14,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 16:52:14,384][52263] Updated weights for policy 0, policy_version 428849 (0.0029) [2024-04-27 16:52:17,100][52263] Updated weights for policy 0, policy_version 428859 (0.0028) [2024-04-27 16:52:19,106][52031] Fps is (10 sec: 50790.1, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 7026491392. Throughput: 0: 53550.3. Samples: 1517020940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:52:19,107][52031] Avg episode reward: [(0, '0.529')] [2024-04-27 16:52:20,415][52263] Updated weights for policy 0, policy_version 428869 (0.0034) [2024-04-27 16:52:23,571][52263] Updated weights for policy 0, policy_version 428879 (0.0028) [2024-04-27 16:52:24,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 7026786304. Throughput: 0: 53618.4. Samples: 1517343300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:52:24,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 16:52:26,490][52263] Updated weights for policy 0, policy_version 428889 (0.0027) [2024-04-27 16:52:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.2, 300 sec: 53595.1). Total num frames: 7027032064. Throughput: 0: 53473.6. Samples: 1517499660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 16:52:29,107][52031] Avg episode reward: [(0, '0.557')] [2024-04-27 16:52:29,656][52263] Updated weights for policy 0, policy_version 428899 (0.0029) [2024-04-27 16:52:32,624][52263] Updated weights for policy 0, policy_version 428909 (0.0026) [2024-04-27 16:52:34,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 7027326976. Throughput: 0: 53452.7. Samples: 1517825300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:52:34,107][52031] Avg episode reward: [(0, '0.509')] [2024-04-27 16:52:35,845][52263] Updated weights for policy 0, policy_version 428919 (0.0031) [2024-04-27 16:52:38,771][52263] Updated weights for policy 0, policy_version 428929 (0.0032) [2024-04-27 16:52:39,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53248.0, 300 sec: 53706.2). Total num frames: 7027572736. Throughput: 0: 53569.7. Samples: 1518146980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:52:39,107][52031] Avg episode reward: [(0, '0.503')] [2024-04-27 16:52:41,960][52263] Updated weights for policy 0, policy_version 428939 (0.0024) [2024-04-27 16:52:44,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53248.0, 300 sec: 53650.6). Total num frames: 7027834880. Throughput: 0: 53663.4. Samples: 1518308980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:52:44,107][52031] Avg episode reward: [(0, '0.485')] [2024-04-27 16:52:44,993][52263] Updated weights for policy 0, policy_version 428949 (0.0031) [2024-04-27 16:52:48,011][52263] Updated weights for policy 0, policy_version 428959 (0.0029) [2024-04-27 16:52:49,106][52031] Fps is (10 sec: 52429.5, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 7028097024. Throughput: 0: 53599.9. Samples: 1518628220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:52:49,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 16:52:51,192][52263] Updated weights for policy 0, policy_version 428969 (0.0028) [2024-04-27 16:52:54,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 7028375552. Throughput: 0: 53469.3. Samples: 1518946420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:52:54,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 16:52:54,184][52263] Updated weights for policy 0, policy_version 428979 (0.0026) [2024-04-27 16:52:57,185][52263] Updated weights for policy 0, policy_version 428989 (0.0033) [2024-04-27 16:52:59,106][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 7028654080. Throughput: 0: 53518.3. Samples: 1519108420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:52:59,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 16:53:00,272][52263] Updated weights for policy 0, policy_version 428999 (0.0025) [2024-04-27 16:53:03,487][52263] Updated weights for policy 0, policy_version 429009 (0.0028) [2024-04-27 16:53:04,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 7028932608. Throughput: 0: 53582.3. Samples: 1519432140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:04,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 16:53:06,446][52263] Updated weights for policy 0, policy_version 429019 (0.0028) [2024-04-27 16:53:09,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53247.9, 300 sec: 53650.7). Total num frames: 7029178368. Throughput: 0: 53444.4. Samples: 1519748300. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:09,108][52031] Avg episode reward: [(0, '0.540')] [2024-04-27 16:53:09,608][52263] Updated weights for policy 0, policy_version 429029 (0.0031) [2024-04-27 16:53:12,582][52263] Updated weights for policy 0, policy_version 429039 (0.0036) [2024-04-27 16:53:14,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 7029440512. Throughput: 0: 53540.6. Samples: 1519909000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:14,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 16:53:15,665][52263] Updated weights for policy 0, policy_version 429049 (0.0033) [2024-04-27 16:53:18,539][52263] Updated weights for policy 0, policy_version 429059 (0.0027) [2024-04-27 16:53:19,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 7029719040. Throughput: 0: 53596.4. Samples: 1520237140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:19,107][52031] Avg episode reward: [(0, '0.579')] [2024-04-27 16:53:21,596][52263] Updated weights for policy 0, policy_version 429069 (0.0036) [2024-04-27 16:53:23,606][52242] Signal inference workers to stop experience collection... (23350 times) [2024-04-27 16:53:23,655][52263] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-04-27 16:53:23,673][52242] Signal inference workers to resume experience collection... (23350 times) [2024-04-27 16:53:23,673][52263] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-04-27 16:53:24,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 7029981184. Throughput: 0: 53631.6. Samples: 1520560400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:24,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 16:53:24,579][52263] Updated weights for policy 0, policy_version 429079 (0.0029) [2024-04-27 16:53:27,680][52263] Updated weights for policy 0, policy_version 429089 (0.0027) [2024-04-27 16:53:29,107][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 7030276096. Throughput: 0: 53508.1. Samples: 1520716840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:29,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 16:53:30,812][52263] Updated weights for policy 0, policy_version 429099 (0.0038) [2024-04-27 16:53:33,865][52263] Updated weights for policy 0, policy_version 429109 (0.0032) [2024-04-27 16:53:34,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 7030521856. Throughput: 0: 53564.5. Samples: 1521038620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:34,107][52031] Avg episode reward: [(0, '0.599')] [2024-04-27 16:53:34,204][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000429110_7030538240.pth... [2024-04-27 16:53:34,246][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000428324_7017660416.pth [2024-04-27 16:53:36,909][52263] Updated weights for policy 0, policy_version 429119 (0.0031) [2024-04-27 16:53:39,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 7030800384. Throughput: 0: 53682.6. Samples: 1521362140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:39,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 16:53:39,948][52263] Updated weights for policy 0, policy_version 429129 (0.0032) [2024-04-27 16:53:43,046][52263] Updated weights for policy 0, policy_version 429139 (0.0028) [2024-04-27 16:53:44,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53794.1, 300 sec: 53595.1). 
Total num frames: 7031062528. Throughput: 0: 53525.6. Samples: 1521517080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:44,107][52031] Avg episode reward: [(0, '0.538')] [2024-04-27 16:53:45,955][52263] Updated weights for policy 0, policy_version 429149 (0.0029) [2024-04-27 16:53:49,027][52263] Updated weights for policy 0, policy_version 429159 (0.0030) [2024-04-27 16:53:49,107][52031] Fps is (10 sec: 54066.3, 60 sec: 54067.0, 300 sec: 53650.6). Total num frames: 7031341056. Throughput: 0: 53484.1. Samples: 1521838940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 16:53:49,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 16:53:52,334][52263] Updated weights for policy 0, policy_version 429169 (0.0030) [2024-04-27 16:53:54,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 7031603200. Throughput: 0: 53625.3. Samples: 1522161440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:53:54,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 16:53:55,040][52263] Updated weights for policy 0, policy_version 429179 (0.0033) [2024-04-27 16:53:58,482][52263] Updated weights for policy 0, policy_version 429189 (0.0028) [2024-04-27 16:53:59,106][52031] Fps is (10 sec: 52429.8, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 7031865344. Throughput: 0: 53609.5. Samples: 1522321420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:53:59,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 16:54:01,270][52263] Updated weights for policy 0, policy_version 429199 (0.0029) [2024-04-27 16:54:04,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53520.9, 300 sec: 53650.7). Total num frames: 7032143872. Throughput: 0: 53446.7. Samples: 1522642240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:04,107][52031] Avg episode reward: [(0, '0.532')] [2024-04-27 16:54:04,453][52263] Updated weights for policy 0, policy_version 429209 (0.0029) [2024-04-27 16:54:07,644][52263] Updated weights for policy 0, policy_version 429219 (0.0030) [2024-04-27 16:54:09,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 7032406016. Throughput: 0: 53457.5. Samples: 1522965980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:09,107][52031] Avg episode reward: [(0, '0.465')] [2024-04-27 16:54:10,410][52263] Updated weights for policy 0, policy_version 429229 (0.0032) [2024-04-27 16:54:13,736][52263] Updated weights for policy 0, policy_version 429239 (0.0028) [2024-04-27 16:54:14,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 7032668160. Throughput: 0: 53640.0. Samples: 1523130640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:14,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 16:54:16,504][52263] Updated weights for policy 0, policy_version 429249 (0.0035) [2024-04-27 16:54:19,106][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.3, 300 sec: 53595.2). Total num frames: 7032946688. Throughput: 0: 53710.2. Samples: 1523455580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:19,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 16:54:19,948][52263] Updated weights for policy 0, policy_version 429259 (0.0028) [2024-04-27 16:54:22,628][52263] Updated weights for policy 0, policy_version 429269 (0.0033) [2024-04-27 16:54:24,106][52031] Fps is (10 sec: 55706.0, 60 sec: 54067.3, 300 sec: 53761.7). Total num frames: 7033225216. Throughput: 0: 53569.4. Samples: 1523772760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:24,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 16:54:26,084][52263] Updated weights for policy 0, policy_version 429279 (0.0032) [2024-04-27 16:54:28,634][52263] Updated weights for policy 0, policy_version 429289 (0.0030) [2024-04-27 16:54:29,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53248.0, 300 sec: 53595.1). Total num frames: 7033470976. Throughput: 0: 53771.1. Samples: 1523936780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:29,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 16:54:32,067][52263] Updated weights for policy 0, policy_version 429299 (0.0032) [2024-04-27 16:54:32,439][52242] Signal inference workers to stop experience collection... (23400 times) [2024-04-27 16:54:32,478][52263] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-04-27 16:54:32,492][52242] Signal inference workers to resume experience collection... (23400 times) [2024-04-27 16:54:32,500][52263] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-04-27 16:54:34,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7033733120. Throughput: 0: 53810.6. Samples: 1524260400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:34,107][52031] Avg episode reward: [(0, '0.663')] [2024-04-27 16:54:34,772][52263] Updated weights for policy 0, policy_version 429309 (0.0027) [2024-04-27 16:54:38,129][52263] Updated weights for policy 0, policy_version 429319 (0.0028) [2024-04-27 16:54:39,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7034011648. Throughput: 0: 53721.0. Samples: 1524578880. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:39,107][52031] Avg episode reward: [(0, '0.636')] [2024-04-27 16:54:40,886][52263] Updated weights for policy 0, policy_version 429329 (0.0026) [2024-04-27 16:54:44,107][52031] Fps is (10 sec: 54066.2, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 7034273792. Throughput: 0: 53685.7. Samples: 1524737280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:44,107][52031] Avg episode reward: [(0, '0.457')] [2024-04-27 16:54:44,262][52263] Updated weights for policy 0, policy_version 429339 (0.0033) [2024-04-27 16:54:47,589][52263] Updated weights for policy 0, policy_version 429349 (0.0026) [2024-04-27 16:54:49,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 7034552320. Throughput: 0: 53800.4. Samples: 1525063260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:49,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 16:54:50,402][52263] Updated weights for policy 0, policy_version 429359 (0.0036) [2024-04-27 16:54:53,595][52263] Updated weights for policy 0, policy_version 429369 (0.0032) [2024-04-27 16:54:54,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 7034798080. Throughput: 0: 53683.0. Samples: 1525381720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:54,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 16:54:56,464][52263] Updated weights for policy 0, policy_version 429379 (0.0026) [2024-04-27 16:54:59,106][52031] Fps is (10 sec: 54068.4, 60 sec: 53794.2, 300 sec: 53706.2). Total num frames: 7035092992. Throughput: 0: 53520.5. Samples: 1525539060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:54:59,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 16:54:59,593][52263] Updated weights for policy 0, policy_version 429389 (0.0027) [2024-04-27 16:55:02,629][52263] Updated weights for policy 0, policy_version 429399 (0.0032) [2024-04-27 16:55:04,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.1, 300 sec: 53650.6). Total num frames: 7035355136. Throughput: 0: 53451.9. Samples: 1525860920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:55:04,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 16:55:05,795][52263] Updated weights for policy 0, policy_version 429409 (0.0029) [2024-04-27 16:55:08,752][52263] Updated weights for policy 0, policy_version 429419 (0.0028) [2024-04-27 16:55:09,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 7035600896. Throughput: 0: 53435.5. Samples: 1526177360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:55:09,107][52031] Avg episode reward: [(0, '0.658')] [2024-04-27 16:55:11,929][52263] Updated weights for policy 0, policy_version 429429 (0.0036) [2024-04-27 16:55:14,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53248.0, 300 sec: 53428.5). Total num frames: 7035863040. Throughput: 0: 53271.6. Samples: 1526334000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 16:55:14,108][52031] Avg episode reward: [(0, '0.492')] [2024-04-27 16:55:15,134][52263] Updated weights for policy 0, policy_version 429439 (0.0029) [2024-04-27 16:55:18,038][52263] Updated weights for policy 0, policy_version 429449 (0.0031) [2024-04-27 16:55:19,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 7036157952. Throughput: 0: 53088.3. Samples: 1526649380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:55:19,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 16:55:21,156][52263] Updated weights for policy 0, policy_version 429459 (0.0028) [2024-04-27 16:55:24,106][52031] Fps is (10 sec: 54067.6, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 7036403712. Throughput: 0: 53256.0. Samples: 1526975400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:55:24,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 16:55:24,118][52263] Updated weights for policy 0, policy_version 429469 (0.0025) [2024-04-27 16:55:27,267][52263] Updated weights for policy 0, policy_version 429479 (0.0030) [2024-04-27 16:55:29,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 7036682240. Throughput: 0: 53284.5. Samples: 1527135080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:55:29,107][52031] Avg episode reward: [(0, '0.566')] [2024-04-27 16:55:30,272][52263] Updated weights for policy 0, policy_version 429489 (0.0029) [2024-04-27 16:55:33,402][52263] Updated weights for policy 0, policy_version 429499 (0.0035) [2024-04-27 16:55:34,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 7036960768. Throughput: 0: 53128.1. Samples: 1527454020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:55:34,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 16:55:34,227][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000429503_7036977152.pth... [2024-04-27 16:55:34,272][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000428718_7024115712.pth [2024-04-27 16:55:36,411][52263] Updated weights for policy 0, policy_version 429509 (0.0025) [2024-04-27 16:55:39,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 7037222912. Throughput: 0: 53247.6. Samples: 1527777860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:55:39,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 16:55:39,402][52263] Updated weights for policy 0, policy_version 429519 (0.0029) [2024-04-27 16:55:39,667][52242] Signal inference workers to stop experience collection... (23450 times) [2024-04-27 16:55:39,709][52263] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-04-27 16:55:39,722][52242] Signal inference workers to resume experience collection... (23450 times) [2024-04-27 16:55:39,730][52263] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-04-27 16:55:42,502][52263] Updated weights for policy 0, policy_version 429529 (0.0031) [2024-04-27 16:55:44,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53248.2, 300 sec: 53428.5). Total num frames: 7037468672. Throughput: 0: 53399.6. Samples: 1527942040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:55:44,107][52031] Avg episode reward: [(0, '0.560')] [2024-04-27 16:55:45,452][52263] Updated weights for policy 0, policy_version 429539 (0.0031) [2024-04-27 16:55:48,737][52263] Updated weights for policy 0, policy_version 429549 (0.0030) [2024-04-27 16:55:49,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 7037730816. Throughput: 0: 53471.8. Samples: 1528267160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:55:49,108][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 16:55:51,649][52263] Updated weights for policy 0, policy_version 429559 (0.0027) [2024-04-27 16:55:54,107][52031] Fps is (10 sec: 57342.7, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 7038042112. Throughput: 0: 53604.7. Samples: 1528589580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:55:54,107][52031] Avg episode reward: [(0, '0.521')] [2024-04-27 16:55:54,856][52263] Updated weights for policy 0, policy_version 429569 (0.0035) [2024-04-27 16:55:57,822][52263] Updated weights for policy 0, policy_version 429579 (0.0028) [2024-04-27 16:55:59,106][52031] Fps is (10 sec: 55706.8, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 7038287872. Throughput: 0: 53653.5. Samples: 1528748400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:55:59,107][52031] Avg episode reward: [(0, '0.648')] [2024-04-27 16:56:00,841][52263] Updated weights for policy 0, policy_version 429589 (0.0027) [2024-04-27 16:56:03,858][52263] Updated weights for policy 0, policy_version 429599 (0.0032) [2024-04-27 16:56:04,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53539.5). Total num frames: 7038566400. Throughput: 0: 53779.5. Samples: 1529069460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:56:04,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 16:56:06,783][52263] Updated weights for policy 0, policy_version 429609 (0.0033) [2024-04-27 16:56:09,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 7038828544. Throughput: 0: 53730.5. Samples: 1529393280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:56:09,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 16:56:10,031][52263] Updated weights for policy 0, policy_version 429619 (0.0030) [2024-04-27 16:56:13,041][52263] Updated weights for policy 0, policy_version 429629 (0.0031) [2024-04-27 16:56:14,107][52031] Fps is (10 sec: 52428.7, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7039090688. Throughput: 0: 53683.4. Samples: 1529550840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:56:14,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 16:56:15,963][52263] Updated weights for policy 0, policy_version 429639 (0.0029) [2024-04-27 16:56:19,107][52031] Fps is (10 sec: 52429.3, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 7039352832. Throughput: 0: 53769.3. Samples: 1529873640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:56:19,107][52031] Avg episode reward: [(0, '0.620')] [2024-04-27 16:56:19,219][52263] Updated weights for policy 0, policy_version 429649 (0.0028) [2024-04-27 16:56:22,149][52263] Updated weights for policy 0, policy_version 429659 (0.0035) [2024-04-27 16:56:24,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53794.2, 300 sec: 53539.6). Total num frames: 7039631360. Throughput: 0: 53765.0. Samples: 1530197280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:56:24,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 16:56:25,161][52263] Updated weights for policy 0, policy_version 429669 (0.0027) [2024-04-27 16:56:28,125][52263] Updated weights for policy 0, policy_version 429679 (0.0028) [2024-04-27 16:56:29,107][52031] Fps is (10 sec: 57343.6, 60 sec: 54067.1, 300 sec: 53595.1). Total num frames: 7039926272. Throughput: 0: 53678.8. Samples: 1530357600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:56:29,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 16:56:31,587][52263] Updated weights for policy 0, policy_version 429689 (0.0032) [2024-04-27 16:56:34,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 7040172032. Throughput: 0: 53673.5. Samples: 1530682460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 16:56:34,107][52031] Avg episode reward: [(0, '0.672')] [2024-04-27 16:56:34,119][52263] Updated weights for policy 0, policy_version 429699 (0.0029) [2024-04-27 16:56:37,710][52263] Updated weights for policy 0, policy_version 429709 (0.0030) [2024-04-27 16:56:39,107][52031] Fps is (10 sec: 50790.8, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 7040434176. Throughput: 0: 53670.7. Samples: 1531004760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:56:39,107][52031] Avg episode reward: [(0, '0.681')] [2024-04-27 16:56:40,377][52263] Updated weights for policy 0, policy_version 429719 (0.0033) [2024-04-27 16:56:43,667][52263] Updated weights for policy 0, policy_version 429729 (0.0031) [2024-04-27 16:56:44,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53793.9, 300 sec: 53539.5). Total num frames: 7040696320. Throughput: 0: 53613.1. Samples: 1531161000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:56:44,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 16:56:46,498][52263] Updated weights for policy 0, policy_version 429739 (0.0029) [2024-04-27 16:56:49,106][52031] Fps is (10 sec: 55706.0, 60 sec: 54340.4, 300 sec: 53650.7). Total num frames: 7040991232. Throughput: 0: 53581.5. Samples: 1531480620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:56:49,107][52031] Avg episode reward: [(0, '0.558')] [2024-04-27 16:56:49,629][52263] Updated weights for policy 0, policy_version 429749 (0.0029) [2024-04-27 16:56:50,801][52242] Signal inference workers to stop experience collection... (23500 times) [2024-04-27 16:56:50,843][52263] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-04-27 16:56:50,856][52242] Signal inference workers to resume experience collection... 
(23500 times) [2024-04-27 16:56:50,861][52263] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-04-27 16:56:52,556][52263] Updated weights for policy 0, policy_version 429759 (0.0032) [2024-04-27 16:56:54,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 7041236992. Throughput: 0: 53484.6. Samples: 1531800080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:56:54,107][52031] Avg episode reward: [(0, '0.543')] [2024-04-27 16:56:55,804][52263] Updated weights for policy 0, policy_version 429769 (0.0035) [2024-04-27 16:56:58,614][52263] Updated weights for policy 0, policy_version 429779 (0.0025) [2024-04-27 16:56:59,106][52031] Fps is (10 sec: 54067.2, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 7041531904. Throughput: 0: 53811.3. Samples: 1531972340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:56:59,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 16:57:01,939][52263] Updated weights for policy 0, policy_version 429789 (0.0028) [2024-04-27 16:57:04,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.1, 300 sec: 53484.0). Total num frames: 7041761280. Throughput: 0: 53790.7. Samples: 1532294220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:04,107][52031] Avg episode reward: [(0, '0.550')] [2024-04-27 16:57:04,752][52263] Updated weights for policy 0, policy_version 429799 (0.0034) [2024-04-27 16:57:08,084][52263] Updated weights for policy 0, policy_version 429809 (0.0027) [2024-04-27 16:57:09,107][52031] Fps is (10 sec: 50789.8, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 7042039808. Throughput: 0: 53730.5. Samples: 1532615160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:09,107][52031] Avg episode reward: [(0, '0.606')] [2024-04-27 16:57:10,911][52263] Updated weights for policy 0, policy_version 429819 (0.0031) [2024-04-27 16:57:14,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 7042301952. Throughput: 0: 53486.8. Samples: 1532764500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:14,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 16:57:14,161][52263] Updated weights for policy 0, policy_version 429829 (0.0028) [2024-04-27 16:57:16,912][52263] Updated weights for policy 0, policy_version 429839 (0.0025) [2024-04-27 16:57:19,107][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 7042596864. Throughput: 0: 53465.8. Samples: 1533088420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:19,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 16:57:20,224][52263] Updated weights for policy 0, policy_version 429849 (0.0025) [2024-04-27 16:57:22,955][52263] Updated weights for policy 0, policy_version 429859 (0.0035) [2024-04-27 16:57:24,107][52031] Fps is (10 sec: 57343.3, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 7042875392. Throughput: 0: 53497.2. Samples: 1533412140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:24,107][52031] Avg episode reward: [(0, '0.487')] [2024-04-27 16:57:26,363][52263] Updated weights for policy 0, policy_version 429869 (0.0030) [2024-04-27 16:57:29,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 7043121152. Throughput: 0: 53786.4. Samples: 1533581380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:29,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 16:57:29,144][52263] Updated weights for policy 0, policy_version 429879 (0.0027) [2024-04-27 16:57:32,317][52263] Updated weights for policy 0, policy_version 429889 (0.0026) [2024-04-27 16:57:34,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 7043383296. Throughput: 0: 53841.2. Samples: 1533903480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:34,107][52031] Avg episode reward: [(0, '0.663')] [2024-04-27 16:57:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000429894_7043383296.pth... [2024-04-27 16:57:34,162][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000429110_7030538240.pth [2024-04-27 16:57:35,151][52263] Updated weights for policy 0, policy_version 429899 (0.0029) [2024-04-27 16:57:38,445][52263] Updated weights for policy 0, policy_version 429909 (0.0029) [2024-04-27 16:57:39,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7043645440. Throughput: 0: 53845.8. Samples: 1534223140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:39,107][52031] Avg episode reward: [(0, '0.507')] [2024-04-27 16:57:41,244][52263] Updated weights for policy 0, policy_version 429919 (0.0026) [2024-04-27 16:57:44,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 7043923968. Throughput: 0: 53440.0. Samples: 1534377140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:44,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 16:57:44,415][52263] Updated weights for policy 0, policy_version 429929 (0.0027) [2024-04-27 16:57:47,230][52263] Updated weights for policy 0, policy_version 429939 (0.0022) [2024-04-27 16:57:49,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.0, 300 sec: 53595.1). 
Total num frames: 7044186112. Throughput: 0: 53436.1. Samples: 1534698840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:49,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 16:57:50,577][52263] Updated weights for policy 0, policy_version 429949 (0.0030) [2024-04-27 16:57:52,883][52242] Signal inference workers to stop experience collection... (23550 times) [2024-04-27 16:57:52,917][52263] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-04-27 16:57:52,951][52242] Signal inference workers to resume experience collection... (23550 times) [2024-04-27 16:57:52,952][52263] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-04-27 16:57:53,488][52263] Updated weights for policy 0, policy_version 429959 (0.0028) [2024-04-27 16:57:54,107][52031] Fps is (10 sec: 55704.7, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 7044481024. Throughput: 0: 53493.3. Samples: 1535022360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 16:57:54,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 16:57:56,716][52263] Updated weights for policy 0, policy_version 429969 (0.0030) [2024-04-27 16:57:59,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 7044726784. Throughput: 0: 53874.2. Samples: 1535188840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:57:59,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 16:57:59,497][52263] Updated weights for policy 0, policy_version 429979 (0.0028) [2024-04-27 16:58:02,786][52263] Updated weights for policy 0, policy_version 429989 (0.0030) [2024-04-27 16:58:04,106][52031] Fps is (10 sec: 49152.6, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 7044972544. Throughput: 0: 53839.2. Samples: 1535511180. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:04,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 16:58:05,588][52263] Updated weights for policy 0, policy_version 429999 (0.0028) [2024-04-27 16:58:08,944][52263] Updated weights for policy 0, policy_version 430009 (0.0030) [2024-04-27 16:58:09,107][52031] Fps is (10 sec: 54067.0, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7045267456. Throughput: 0: 53771.2. Samples: 1535831840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:09,107][52031] Avg episode reward: [(0, '0.632')] [2024-04-27 16:58:11,590][52263] Updated weights for policy 0, policy_version 430019 (0.0028) [2024-04-27 16:58:14,107][52031] Fps is (10 sec: 55704.8, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 7045529600. Throughput: 0: 53454.9. Samples: 1535986860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:14,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 16:58:14,971][52263] Updated weights for policy 0, policy_version 430029 (0.0031) [2024-04-27 16:58:17,846][52263] Updated weights for policy 0, policy_version 430039 (0.0028) [2024-04-27 16:58:19,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 7045824512. Throughput: 0: 53448.7. Samples: 1536308660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:19,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 16:58:21,134][52263] Updated weights for policy 0, policy_version 430049 (0.0029) [2024-04-27 16:58:23,941][52263] Updated weights for policy 0, policy_version 430059 (0.0028) [2024-04-27 16:58:24,107][52031] Fps is (10 sec: 55705.9, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7046086656. Throughput: 0: 53595.4. Samples: 1536634940. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:24,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 16:58:27,093][52263] Updated weights for policy 0, policy_version 430069 (0.0031) [2024-04-27 16:58:29,107][52031] Fps is (10 sec: 50789.1, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 7046332416. Throughput: 0: 53781.1. Samples: 1536797300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:29,107][52031] Avg episode reward: [(0, '0.535')] [2024-04-27 16:58:29,976][52263] Updated weights for policy 0, policy_version 430079 (0.0026) [2024-04-27 16:58:33,329][52263] Updated weights for policy 0, policy_version 430089 (0.0032) [2024-04-27 16:58:34,106][52031] Fps is (10 sec: 49152.6, 60 sec: 53248.1, 300 sec: 53484.1). Total num frames: 7046578176. Throughput: 0: 53860.0. Samples: 1537122540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:34,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 16:58:35,952][52263] Updated weights for policy 0, policy_version 430099 (0.0030) [2024-04-27 16:58:39,106][52031] Fps is (10 sec: 55706.8, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 7046889472. Throughput: 0: 53766.0. Samples: 1537441820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:39,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 16:58:39,467][52263] Updated weights for policy 0, policy_version 430109 (0.0029) [2024-04-27 16:58:42,216][52263] Updated weights for policy 0, policy_version 430119 (0.0032) [2024-04-27 16:58:44,107][52031] Fps is (10 sec: 57342.8, 60 sec: 53793.9, 300 sec: 53595.1). Total num frames: 7047151616. Throughput: 0: 53651.4. Samples: 1537603160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:44,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 16:58:45,812][52263] Updated weights for policy 0, policy_version 430129 (0.0031) [2024-04-27 16:58:47,581][52242] Signal inference workers to stop experience collection... 
(23600 times) [2024-04-27 16:58:47,582][52242] Signal inference workers to resume experience collection... (23600 times) [2024-04-27 16:58:47,595][52263] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-04-27 16:58:47,595][52263] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-04-27 16:58:48,470][52263] Updated weights for policy 0, policy_version 430139 (0.0029) [2024-04-27 16:58:49,106][52031] Fps is (10 sec: 54067.2, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 7047430144. Throughput: 0: 53662.7. Samples: 1537926000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:49,107][52031] Avg episode reward: [(0, '0.575')] [2024-04-27 16:58:52,386][52263] Updated weights for policy 0, policy_version 430149 (0.0029) [2024-04-27 16:58:54,106][52031] Fps is (10 sec: 54068.7, 60 sec: 53521.3, 300 sec: 53650.7). Total num frames: 7047692288. Throughput: 0: 53668.2. Samples: 1538246900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:54,107][52031] Avg episode reward: [(0, '0.588')] [2024-04-27 16:58:54,420][52263] Updated weights for policy 0, policy_version 430159 (0.0036) [2024-04-27 16:58:58,413][52263] Updated weights for policy 0, policy_version 430169 (0.0030) [2024-04-27 16:58:59,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 7047938048. Throughput: 0: 53693.1. Samples: 1538403040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:58:59,107][52031] Avg episode reward: [(0, '0.563')] [2024-04-27 16:59:00,609][52263] Updated weights for policy 0, policy_version 430179 (0.0031) [2024-04-27 16:59:04,107][52031] Fps is (10 sec: 50789.5, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 7048200192. Throughput: 0: 53627.3. Samples: 1538721900. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:59:04,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 16:59:04,587][52263] Updated weights for policy 0, policy_version 430189 (0.0030) [2024-04-27 16:59:06,777][52263] Updated weights for policy 0, policy_version 430199 (0.0032) [2024-04-27 16:59:09,107][52031] Fps is (10 sec: 57343.1, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 7048511488. Throughput: 0: 53576.4. Samples: 1539045880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:59:09,107][52031] Avg episode reward: [(0, '0.649')] [2024-04-27 16:59:10,587][52263] Updated weights for policy 0, policy_version 430209 (0.0040) [2024-04-27 16:59:13,102][52263] Updated weights for policy 0, policy_version 430219 (0.0028) [2024-04-27 16:59:14,106][52031] Fps is (10 sec: 54067.5, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 7048740864. Throughput: 0: 53584.2. Samples: 1539208580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:59:14,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 16:59:16,517][52263] Updated weights for policy 0, policy_version 430229 (0.0027) [2024-04-27 16:59:19,093][52263] Updated weights for policy 0, policy_version 430239 (0.0027) [2024-04-27 16:59:19,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53520.9, 300 sec: 53595.1). Total num frames: 7049035776. Throughput: 0: 53562.0. Samples: 1539532840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 16:59:19,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 16:59:22,610][52263] Updated weights for policy 0, policy_version 430249 (0.0036) [2024-04-27 16:59:24,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 7049297920. Throughput: 0: 53590.0. Samples: 1539853380. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 16:59:24,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 16:59:25,317][52263] Updated weights for policy 0, policy_version 430259 (0.0030) [2024-04-27 16:59:28,834][52263] Updated weights for policy 0, policy_version 430269 (0.0033) [2024-04-27 16:59:29,106][52031] Fps is (10 sec: 49153.3, 60 sec: 53248.3, 300 sec: 53539.6). Total num frames: 7049527296. Throughput: 0: 53509.7. Samples: 1540011080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 16:59:29,107][52031] Avg episode reward: [(0, '0.537')] [2024-04-27 16:59:31,430][52263] Updated weights for policy 0, policy_version 430279 (0.0028) [2024-04-27 16:59:34,107][52031] Fps is (10 sec: 52428.5, 60 sec: 54067.0, 300 sec: 53595.1). Total num frames: 7049822208. Throughput: 0: 53458.8. Samples: 1540331660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 16:59:34,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 16:59:34,118][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000430287_7049822208.pth... [2024-04-27 16:59:34,167][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000429503_7036977152.pth [2024-04-27 16:59:34,955][52263] Updated weights for policy 0, policy_version 430289 (0.0030) [2024-04-27 16:59:37,530][52263] Updated weights for policy 0, policy_version 430299 (0.0026) [2024-04-27 16:59:39,106][52031] Fps is (10 sec: 54066.7, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 7050067968. Throughput: 0: 53395.5. Samples: 1540649700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 16:59:39,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 16:59:41,136][52263] Updated weights for policy 0, policy_version 430309 (0.0027) [2024-04-27 16:59:43,736][52263] Updated weights for policy 0, policy_version 430319 (0.0033) [2024-04-27 16:59:44,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53248.0, 300 sec: 53539.6). 
Total num frames: 7050346496. Throughput: 0: 53467.3. Samples: 1540809080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 16:59:44,107][52031] Avg episode reward: [(0, '0.657')] [2024-04-27 16:59:45,205][52242] Signal inference workers to stop experience collection... (23650 times) [2024-04-27 16:59:45,246][52263] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-04-27 16:59:45,258][52242] Signal inference workers to resume experience collection... (23650 times) [2024-04-27 16:59:45,263][52263] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-04-27 16:59:47,137][52263] Updated weights for policy 0, policy_version 430329 (0.0028) [2024-04-27 16:59:49,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 7050641408. Throughput: 0: 53523.7. Samples: 1541130460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 16:59:49,107][52031] Avg episode reward: [(0, '0.681')] [2024-04-27 16:59:50,212][52263] Updated weights for policy 0, policy_version 430339 (0.0030) [2024-04-27 16:59:53,478][52263] Updated weights for policy 0, policy_version 430349 (0.0031) [2024-04-27 16:59:54,106][52031] Fps is (10 sec: 52429.9, 60 sec: 52974.9, 300 sec: 53484.0). Total num frames: 7050870784. Throughput: 0: 53430.4. Samples: 1541450240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 16:59:54,107][52031] Avg episode reward: [(0, '0.669')] [2024-04-27 16:59:56,403][52263] Updated weights for policy 0, policy_version 430359 (0.0031) [2024-04-27 16:59:59,106][52031] Fps is (10 sec: 50790.2, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 7051149312. Throughput: 0: 53363.2. Samples: 1541609920. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 16:59:59,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 16:59:59,567][52263] Updated weights for policy 0, policy_version 430369 (0.0030) [2024-04-27 17:00:02,543][52263] Updated weights for policy 0, policy_version 430379 (0.0031) [2024-04-27 17:00:04,107][52031] Fps is (10 sec: 54066.1, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 7051411456. Throughput: 0: 53320.4. Samples: 1541932260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 17:00:04,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 17:00:05,521][52263] Updated weights for policy 0, policy_version 430389 (0.0031) [2024-04-27 17:00:08,558][52263] Updated weights for policy 0, policy_version 430399 (0.0026) [2024-04-27 17:00:09,107][52031] Fps is (10 sec: 52428.0, 60 sec: 52701.8, 300 sec: 53595.1). Total num frames: 7051673600. Throughput: 0: 53329.3. Samples: 1542253200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 17:00:09,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 17:00:11,713][52263] Updated weights for policy 0, policy_version 430409 (0.0030) [2024-04-27 17:00:14,107][52031] Fps is (10 sec: 55706.4, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7051968512. Throughput: 0: 53403.8. Samples: 1542414260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 17:00:14,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 17:00:14,629][52263] Updated weights for policy 0, policy_version 430419 (0.0032) [2024-04-27 17:00:17,872][52263] Updated weights for policy 0, policy_version 430429 (0.0034) [2024-04-27 17:00:19,106][52031] Fps is (10 sec: 57344.9, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 7052247040. Throughput: 0: 53371.4. Samples: 1542733360. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 17:00:19,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 17:00:20,853][52263] Updated weights for policy 0, policy_version 430439 (0.0030) [2024-04-27 17:00:23,955][52263] Updated weights for policy 0, policy_version 430449 (0.0026) [2024-04-27 17:00:24,107][52031] Fps is (10 sec: 50790.1, 60 sec: 52974.9, 300 sec: 53539.6). Total num frames: 7052476416. Throughput: 0: 53417.6. Samples: 1543053500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 17:00:24,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 17:00:26,831][52263] Updated weights for policy 0, policy_version 430459 (0.0033) [2024-04-27 17:00:29,106][52031] Fps is (10 sec: 50790.3, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 7052754944. Throughput: 0: 53366.0. Samples: 1543210540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 17:00:29,107][52031] Avg episode reward: [(0, '0.674')] [2024-04-27 17:00:30,062][52263] Updated weights for policy 0, policy_version 430469 (0.0029) [2024-04-27 17:00:32,825][52263] Updated weights for policy 0, policy_version 430479 (0.0031) [2024-04-27 17:00:34,107][52031] Fps is (10 sec: 55705.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7053033472. Throughput: 0: 53372.2. Samples: 1543532220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 17:00:34,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 17:00:36,132][52263] Updated weights for policy 0, policy_version 430489 (0.0023) [2024-04-27 17:00:39,106][52263] Updated weights for policy 0, policy_version 430499 (0.0027) [2024-04-27 17:00:39,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 7053295616. Throughput: 0: 53479.4. Samples: 1543856820. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 17:00:39,108][52031] Avg episode reward: [(0, '0.666')] [2024-04-27 17:00:42,180][52263] Updated weights for policy 0, policy_version 430509 (0.0025) [2024-04-27 17:00:42,999][52242] Signal inference workers to stop experience collection... (23700 times) [2024-04-27 17:00:42,999][52242] Signal inference workers to resume experience collection... (23700 times) [2024-04-27 17:00:43,022][52263] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-04-27 17:00:43,022][52263] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-04-27 17:00:44,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 7053574144. Throughput: 0: 53608.4. Samples: 1544022300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:00:44,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 17:00:45,214][52263] Updated weights for policy 0, policy_version 430519 (0.0031) [2024-04-27 17:00:48,351][52263] Updated weights for policy 0, policy_version 430529 (0.0027) [2024-04-27 17:00:49,107][52031] Fps is (10 sec: 54067.4, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 7053836288. Throughput: 0: 53530.4. Samples: 1544341120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:00:49,107][52031] Avg episode reward: [(0, '0.517')] [2024-04-27 17:00:51,208][52263] Updated weights for policy 0, policy_version 430539 (0.0026) [2024-04-27 17:00:54,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 7054082048. Throughput: 0: 53521.8. Samples: 1544661680. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:00:54,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 17:00:54,404][52263] Updated weights for policy 0, policy_version 430549 (0.0026) [2024-04-27 17:00:57,264][52263] Updated weights for policy 0, policy_version 430559 (0.0035) [2024-04-27 17:00:59,107][52031] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 53539.6). Total num frames: 7054360576. Throughput: 0: 53521.3. Samples: 1544822720. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:00:59,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 17:01:00,552][52263] Updated weights for policy 0, policy_version 430569 (0.0028) [2024-04-27 17:01:03,293][52263] Updated weights for policy 0, policy_version 430579 (0.0036) [2024-04-27 17:01:04,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7054639104. Throughput: 0: 53554.0. Samples: 1545143300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:04,107][52031] Avg episode reward: [(0, '0.625')] [2024-04-27 17:01:06,760][52263] Updated weights for policy 0, policy_version 430589 (0.0027) [2024-04-27 17:01:09,106][52031] Fps is (10 sec: 54068.1, 60 sec: 53794.3, 300 sec: 53595.2). Total num frames: 7054901248. Throughput: 0: 53658.9. Samples: 1545468140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:09,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 17:01:09,340][52263] Updated weights for policy 0, policy_version 430599 (0.0028) [2024-04-27 17:01:12,788][52263] Updated weights for policy 0, policy_version 430609 (0.0027) [2024-04-27 17:01:14,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 7055179776. Throughput: 0: 53711.4. Samples: 1545627560. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:14,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 17:01:15,364][52263] Updated weights for policy 0, policy_version 430619 (0.0029) [2024-04-27 17:01:18,973][52263] Updated weights for policy 0, policy_version 430629 (0.0035) [2024-04-27 17:01:19,107][52031] Fps is (10 sec: 52428.1, 60 sec: 52974.8, 300 sec: 53539.6). Total num frames: 7055425536. Throughput: 0: 53723.6. Samples: 1545949780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:19,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 17:01:21,374][52263] Updated weights for policy 0, policy_version 430639 (0.0032) [2024-04-27 17:01:24,106][52031] Fps is (10 sec: 50791.6, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 7055687680. Throughput: 0: 53725.5. Samples: 1546274460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:24,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 17:01:25,082][52263] Updated weights for policy 0, policy_version 430649 (0.0029) [2024-04-27 17:01:27,455][52263] Updated weights for policy 0, policy_version 430659 (0.0026) [2024-04-27 17:01:29,106][52031] Fps is (10 sec: 55706.3, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 7055982592. Throughput: 0: 53578.3. Samples: 1546433320. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:29,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 17:01:31,103][52263] Updated weights for policy 0, policy_version 430669 (0.0028) [2024-04-27 17:01:33,485][52263] Updated weights for policy 0, policy_version 430679 (0.0030) [2024-04-27 17:01:34,106][52031] Fps is (10 sec: 55705.4, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 7056244736. Throughput: 0: 53700.1. Samples: 1546757620. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:34,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 17:01:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000430679_7056244736.pth... [2024-04-27 17:01:34,177][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000429894_7043383296.pth [2024-04-27 17:01:37,242][52263] Updated weights for policy 0, policy_version 430689 (0.0029) [2024-04-27 17:01:39,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7056506880. Throughput: 0: 53731.2. Samples: 1547079580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:39,107][52031] Avg episode reward: [(0, '0.478')] [2024-04-27 17:01:39,973][52263] Updated weights for policy 0, policy_version 430699 (0.0031) [2024-04-27 17:01:43,224][52263] Updated weights for policy 0, policy_version 430709 (0.0035) [2024-04-27 17:01:44,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7056801792. Throughput: 0: 53684.5. Samples: 1547238520. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:44,107][52031] Avg episode reward: [(0, '0.544')] [2024-04-27 17:01:45,459][52242] Signal inference workers to stop experience collection... (23750 times) [2024-04-27 17:01:45,461][52242] Signal inference workers to resume experience collection... (23750 times) [2024-04-27 17:01:45,477][52263] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-04-27 17:01:45,478][52263] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-04-27 17:01:46,309][52263] Updated weights for policy 0, policy_version 430719 (0.0030) [2024-04-27 17:01:49,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7057047552. Throughput: 0: 53694.4. Samples: 1547559540. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:49,107][52031] Avg episode reward: [(0, '0.501')] [2024-04-27 17:01:49,306][52263] Updated weights for policy 0, policy_version 430729 (0.0025) [2024-04-27 17:01:52,191][52263] Updated weights for policy 0, policy_version 430739 (0.0033) [2024-04-27 17:01:54,107][52031] Fps is (10 sec: 50790.9, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 7057309696. Throughput: 0: 53813.8. Samples: 1547889760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:54,107][52031] Avg episode reward: [(0, '0.602')] [2024-04-27 17:01:55,411][52263] Updated weights for policy 0, policy_version 430749 (0.0028) [2024-04-27 17:01:58,406][52263] Updated weights for policy 0, policy_version 430759 (0.0030) [2024-04-27 17:01:59,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 7057588224. Throughput: 0: 53600.0. Samples: 1548039560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:01:59,107][52031] Avg episode reward: [(0, '0.644')] [2024-04-27 17:02:01,478][52263] Updated weights for policy 0, policy_version 430769 (0.0027) [2024-04-27 17:02:04,107][52031] Fps is (10 sec: 55705.0, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7057866752. Throughput: 0: 53615.6. Samples: 1548362480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-27 17:02:04,109][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 17:02:04,410][52263] Updated weights for policy 0, policy_version 430779 (0.0027) [2024-04-27 17:02:07,597][52263] Updated weights for policy 0, policy_version 430789 (0.0024) [2024-04-27 17:02:09,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 7058112512. Throughput: 0: 53592.3. Samples: 1548686120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:09,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 17:02:10,605][52263] Updated weights for policy 0, policy_version 430799 (0.0031) [2024-04-27 17:02:13,682][52263] Updated weights for policy 0, policy_version 430809 (0.0028) [2024-04-27 17:02:14,107][52031] Fps is (10 sec: 54067.1, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 7058407424. Throughput: 0: 53731.4. Samples: 1548851240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:14,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 17:02:16,512][52263] Updated weights for policy 0, policy_version 430819 (0.0037) [2024-04-27 17:02:19,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 7058653184. Throughput: 0: 53670.5. Samples: 1549172800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:19,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 17:02:19,831][52263] Updated weights for policy 0, policy_version 430829 (0.0029) [2024-04-27 17:02:22,736][52263] Updated weights for policy 0, policy_version 430839 (0.0040) [2024-04-27 17:02:24,107][52031] Fps is (10 sec: 49152.2, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 7058898944. Throughput: 0: 53687.9. Samples: 1549495540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:24,107][52031] Avg episode reward: [(0, '0.629')] [2024-04-27 17:02:25,768][52263] Updated weights for policy 0, policy_version 430849 (0.0029) [2024-04-27 17:02:28,841][52263] Updated weights for policy 0, policy_version 430859 (0.0036) [2024-04-27 17:02:29,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53794.0, 300 sec: 53650.7). Total num frames: 7059210240. Throughput: 0: 53580.8. Samples: 1549649660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:29,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 17:02:31,795][52263] Updated weights for policy 0, policy_version 430869 (0.0027) [2024-04-27 17:02:34,106][52031] Fps is (10 sec: 57344.6, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7059472384. Throughput: 0: 53626.8. Samples: 1549972740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:34,107][52031] Avg episode reward: [(0, '0.525')] [2024-04-27 17:02:34,922][52263] Updated weights for policy 0, policy_version 430879 (0.0028) [2024-04-27 17:02:37,991][52263] Updated weights for policy 0, policy_version 430889 (0.0026) [2024-04-27 17:02:39,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53520.9, 300 sec: 53539.5). Total num frames: 7059718144. Throughput: 0: 53379.7. Samples: 1550291860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:39,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 17:02:40,910][52263] Updated weights for policy 0, policy_version 430899 (0.0031) [2024-04-27 17:02:44,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 7059996672. Throughput: 0: 53794.0. Samples: 1550460280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:44,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 17:02:44,160][52263] Updated weights for policy 0, policy_version 430909 (0.0037) [2024-04-27 17:02:47,093][52263] Updated weights for policy 0, policy_version 430919 (0.0028) [2024-04-27 17:02:49,107][52031] Fps is (10 sec: 54067.5, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 7060258816. Throughput: 0: 53658.1. Samples: 1550777100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:49,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 17:02:49,953][52242] Signal inference workers to stop experience collection... 
(23800 times) [2024-04-27 17:02:49,953][52242] Signal inference workers to resume experience collection... (23800 times) [2024-04-27 17:02:49,966][52263] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-04-27 17:02:49,967][52263] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-04-27 17:02:50,325][52263] Updated weights for policy 0, policy_version 430929 (0.0035) [2024-04-27 17:02:53,219][52263] Updated weights for policy 0, policy_version 430939 (0.0035) [2024-04-27 17:02:54,107][52031] Fps is (10 sec: 50789.7, 60 sec: 53247.9, 300 sec: 53484.0). Total num frames: 7060504576. Throughput: 0: 53547.0. Samples: 1551095740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:54,107][52031] Avg episode reward: [(0, '0.662')] [2024-04-27 17:02:56,435][52263] Updated weights for policy 0, policy_version 430949 (0.0030) [2024-04-27 17:02:59,106][52031] Fps is (10 sec: 55706.6, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 7060815872. Throughput: 0: 53457.5. Samples: 1551256820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:02:59,107][52031] Avg episode reward: [(0, '0.474')] [2024-04-27 17:02:59,225][52263] Updated weights for policy 0, policy_version 430959 (0.0028) [2024-04-27 17:03:02,569][52263] Updated weights for policy 0, policy_version 430969 (0.0027) [2024-04-27 17:03:04,106][52031] Fps is (10 sec: 57344.7, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7061078016. Throughput: 0: 53487.3. Samples: 1551579720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:03:04,107][52031] Avg episode reward: [(0, '0.607')] [2024-04-27 17:03:05,255][52263] Updated weights for policy 0, policy_version 430979 (0.0027) [2024-04-27 17:03:08,735][52263] Updated weights for policy 0, policy_version 430989 (0.0026) [2024-04-27 17:03:09,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.1, 300 sec: 53539.6). Total num frames: 7061323776. Throughput: 0: 53492.1. 
Samples: 1551902680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:03:09,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 17:03:11,782][52263] Updated weights for policy 0, policy_version 430999 (0.0032) [2024-04-27 17:03:14,107][52031] Fps is (10 sec: 50789.7, 60 sec: 52974.9, 300 sec: 53428.5). Total num frames: 7061585920. Throughput: 0: 53440.0. Samples: 1552054460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:03:14,107][52031] Avg episode reward: [(0, '0.546')] [2024-04-27 17:03:14,703][52263] Updated weights for policy 0, policy_version 431009 (0.0030) [2024-04-27 17:03:18,172][52263] Updated weights for policy 0, policy_version 431019 (0.0028) [2024-04-27 17:03:19,107][52031] Fps is (10 sec: 54066.5, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 7061864448. Throughput: 0: 53460.2. Samples: 1552378460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:03:19,107][52031] Avg episode reward: [(0, '0.664')] [2024-04-27 17:03:20,722][52263] Updated weights for policy 0, policy_version 431029 (0.0029) [2024-04-27 17:03:24,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53539.6). Total num frames: 7062126592. Throughput: 0: 53579.3. Samples: 1552702920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:03:24,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 17:03:24,635][52263] Updated weights for policy 0, policy_version 431039 (0.0029) [2024-04-27 17:03:26,838][52263] Updated weights for policy 0, policy_version 431049 (0.0033) [2024-04-27 17:03:29,106][52031] Fps is (10 sec: 55706.2, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 7062421504. Throughput: 0: 53444.9. Samples: 1552865300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:03:29,107][52031] Avg episode reward: [(0, '0.523')] [2024-04-27 17:03:30,559][52263] Updated weights for policy 0, policy_version 431059 (0.0032) [2024-04-27 17:03:32,910][52263] Updated weights for policy 0, policy_version 431069 (0.0028) [2024-04-27 17:03:34,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53248.0, 300 sec: 53484.0). Total num frames: 7062667264. Throughput: 0: 53636.7. Samples: 1553190740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:03:34,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 17:03:34,115][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000431071_7062667264.pth... [2024-04-27 17:03:34,171][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000430287_7049822208.pth [2024-04-27 17:03:34,349][52242] Signal inference workers to stop experience collection... (23850 times) [2024-04-27 17:03:34,349][52242] Signal inference workers to resume experience collection... (23850 times) [2024-04-27 17:03:34,374][52263] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-04-27 17:03:34,374][52263] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-04-27 17:03:36,546][52263] Updated weights for policy 0, policy_version 431079 (0.0026) [2024-04-27 17:03:39,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53794.4, 300 sec: 53539.6). Total num frames: 7062945792. Throughput: 0: 53673.1. Samples: 1553511020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:03:39,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 17:03:39,292][52263] Updated weights for policy 0, policy_version 431089 (0.0030) [2024-04-27 17:03:42,786][52263] Updated weights for policy 0, policy_version 431099 (0.0031) [2024-04-27 17:03:44,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53247.9, 300 sec: 53428.5). Total num frames: 7063191552. Throughput: 0: 53477.2. 
Samples: 1553663300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:03:44,107][52031] Avg episode reward: [(0, '0.638')] [2024-04-27 17:03:45,454][52263] Updated weights for policy 0, policy_version 431109 (0.0028) [2024-04-27 17:03:48,991][52263] Updated weights for policy 0, policy_version 431119 (0.0027) [2024-04-27 17:03:49,107][52031] Fps is (10 sec: 52428.2, 60 sec: 53521.2, 300 sec: 53484.0). Total num frames: 7063470080. Throughput: 0: 53370.6. Samples: 1553981400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:03:49,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 17:03:51,410][52263] Updated weights for policy 0, policy_version 431129 (0.0030) [2024-04-27 17:03:54,107][52031] Fps is (10 sec: 55705.7, 60 sec: 54067.2, 300 sec: 53595.1). Total num frames: 7063748608. Throughput: 0: 53334.5. Samples: 1554302740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:03:54,107][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 17:03:54,925][52263] Updated weights for policy 0, policy_version 431139 (0.0031) [2024-04-27 17:03:57,538][52263] Updated weights for policy 0, policy_version 431149 (0.0028) [2024-04-27 17:03:59,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 7064010752. Throughput: 0: 53835.7. Samples: 1554477060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:03:59,107][52031] Avg episode reward: [(0, '0.547')] [2024-04-27 17:04:00,894][52263] Updated weights for policy 0, policy_version 431159 (0.0031) [2024-04-27 17:04:03,562][52263] Updated weights for policy 0, policy_version 431169 (0.0027) [2024-04-27 17:04:04,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53521.0, 300 sec: 53484.0). Total num frames: 7064289280. Throughput: 0: 53815.5. Samples: 1554800160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:04,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 17:04:07,164][52263] Updated weights for policy 0, policy_version 431179 (0.0034) [2024-04-27 17:04:09,107][52031] Fps is (10 sec: 55705.3, 60 sec: 54067.1, 300 sec: 53650.6). Total num frames: 7064567808. Throughput: 0: 53720.8. Samples: 1555120360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:09,107][52031] Avg episode reward: [(0, '0.541')] [2024-04-27 17:04:09,561][52263] Updated weights for policy 0, policy_version 431189 (0.0029) [2024-04-27 17:04:13,188][52263] Updated weights for policy 0, policy_version 431199 (0.0032) [2024-04-27 17:04:14,106][52031] Fps is (10 sec: 50791.2, 60 sec: 53521.2, 300 sec: 53428.5). Total num frames: 7064797184. Throughput: 0: 53549.8. Samples: 1555275040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:14,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 17:04:15,545][52263] Updated weights for policy 0, policy_version 431209 (0.0031) [2024-04-27 17:04:19,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.1, 300 sec: 53484.0). Total num frames: 7065075712. Throughput: 0: 53399.4. Samples: 1555593720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:19,107][52031] Avg episode reward: [(0, '0.643')] [2024-04-27 17:04:19,262][52263] Updated weights for policy 0, policy_version 431219 (0.0031) [2024-04-27 17:04:21,626][52263] Updated weights for policy 0, policy_version 431229 (0.0031) [2024-04-27 17:04:24,107][52031] Fps is (10 sec: 55704.0, 60 sec: 53794.0, 300 sec: 53650.6). Total num frames: 7065354240. Throughput: 0: 53524.5. Samples: 1555919640. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:24,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 17:04:25,221][52263] Updated weights for policy 0, policy_version 431239 (0.0037) [2024-04-27 17:04:27,900][52263] Updated weights for policy 0, policy_version 431249 (0.0032) [2024-04-27 17:04:28,987][52242] Signal inference workers to stop experience collection... (23900 times) [2024-04-27 17:04:28,988][52242] Signal inference workers to resume experience collection... (23900 times) [2024-04-27 17:04:29,012][52263] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-04-27 17:04:29,012][52263] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-04-27 17:04:29,107][52031] Fps is (10 sec: 57343.5, 60 sec: 53794.0, 300 sec: 53650.7). Total num frames: 7065649152. Throughput: 0: 53918.6. Samples: 1556089640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:29,107][52031] Avg episode reward: [(0, '0.570')] [2024-04-27 17:04:31,470][52263] Updated weights for policy 0, policy_version 431259 (0.0030) [2024-04-27 17:04:34,106][52031] Fps is (10 sec: 54068.8, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7065894912. Throughput: 0: 53901.0. Samples: 1556406940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:34,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 17:04:34,165][52263] Updated weights for policy 0, policy_version 431269 (0.0023) [2024-04-27 17:04:37,662][52263] Updated weights for policy 0, policy_version 431279 (0.0030) [2024-04-27 17:04:39,107][52031] Fps is (10 sec: 49152.1, 60 sec: 53247.8, 300 sec: 53539.6). Total num frames: 7066140672. Throughput: 0: 53843.5. Samples: 1556725700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:39,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 17:04:40,409][52263] Updated weights for policy 0, policy_version 431289 (0.0027) [2024-04-27 17:04:43,629][52263] Updated weights for policy 0, policy_version 431299 (0.0034) [2024-04-27 17:04:44,106][52031] Fps is (10 sec: 54067.1, 60 sec: 54067.3, 300 sec: 53539.6). Total num frames: 7066435584. Throughput: 0: 53288.5. Samples: 1556875040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:44,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 17:04:46,367][52263] Updated weights for policy 0, policy_version 431309 (0.0023) [2024-04-27 17:04:49,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7066681344. Throughput: 0: 53282.3. Samples: 1557197860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 17:04:49,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 17:04:49,706][52263] Updated weights for policy 0, policy_version 431319 (0.0039) [2024-04-27 17:04:52,478][52263] Updated weights for policy 0, policy_version 431329 (0.0028) [2024-04-27 17:04:54,106][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.2, 300 sec: 53539.6). Total num frames: 7066943488. Throughput: 0: 53226.1. Samples: 1557515520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:04:54,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 17:04:55,993][52263] Updated weights for policy 0, policy_version 431339 (0.0031) [2024-04-27 17:04:58,504][52263] Updated weights for policy 0, policy_version 431349 (0.0021) [2024-04-27 17:04:59,107][52031] Fps is (10 sec: 54066.4, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 7067222016. Throughput: 0: 53446.8. Samples: 1557680160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:04:59,107][52031] Avg episode reward: [(0, '0.577')] [2024-04-27 17:05:02,091][52263] Updated weights for policy 0, policy_version 431359 (0.0033) [2024-04-27 17:05:04,107][52031] Fps is (10 sec: 57342.6, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 7067516928. Throughput: 0: 53505.8. Samples: 1558001480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:04,107][52031] Avg episode reward: [(0, '0.605')] [2024-04-27 17:05:04,703][52263] Updated weights for policy 0, policy_version 431369 (0.0029) [2024-04-27 17:05:08,103][52263] Updated weights for policy 0, policy_version 431379 (0.0030) [2024-04-27 17:05:09,106][52031] Fps is (10 sec: 52430.2, 60 sec: 52975.1, 300 sec: 53484.1). Total num frames: 7067746304. Throughput: 0: 53463.1. Samples: 1558325460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:09,107][52031] Avg episode reward: [(0, '0.479')] [2024-04-27 17:05:10,679][52263] Updated weights for policy 0, policy_version 431389 (0.0031) [2024-04-27 17:05:14,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53794.1, 300 sec: 53484.0). Total num frames: 7068024832. Throughput: 0: 53231.3. Samples: 1558485040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:14,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 17:05:14,202][52263] Updated weights for policy 0, policy_version 431399 (0.0033) [2024-04-27 17:05:16,715][52263] Updated weights for policy 0, policy_version 431409 (0.0031) [2024-04-27 17:05:19,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 7068286976. Throughput: 0: 53380.0. Samples: 1558809040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:19,107][52031] Avg episode reward: [(0, '0.486')] [2024-04-27 17:05:20,261][52263] Updated weights for policy 0, policy_version 431419 (0.0028) [2024-04-27 17:05:22,790][52263] Updated weights for policy 0, policy_version 431429 (0.0032) [2024-04-27 17:05:24,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 7068565504. Throughput: 0: 53389.4. Samples: 1559128220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:24,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 17:05:26,434][52263] Updated weights for policy 0, policy_version 431439 (0.0030) [2024-04-27 17:05:29,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 7068844032. Throughput: 0: 53727.9. Samples: 1559292800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:29,107][52031] Avg episode reward: [(0, '0.621')] [2024-04-27 17:05:29,293][52263] Updated weights for policy 0, policy_version 431449 (0.0038) [2024-04-27 17:05:32,587][52263] Updated weights for policy 0, policy_version 431459 (0.0030) [2024-04-27 17:05:34,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53521.0, 300 sec: 53595.1). Total num frames: 7069106176. Throughput: 0: 53735.2. Samples: 1559615940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:34,107][52031] Avg episode reward: [(0, '0.617')] [2024-04-27 17:05:34,123][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000431465_7069122560.pth... [2024-04-27 17:05:34,184][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000430679_7056244736.pth [2024-04-27 17:05:34,207][52242] Signal inference workers to stop experience collection... (23950 times) [2024-04-27 17:05:34,207][52242] Signal inference workers to resume experience collection... 
(23950 times) [2024-04-27 17:05:34,236][52263] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-04-27 17:05:34,236][52263] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-04-27 17:05:35,497][52263] Updated weights for policy 0, policy_version 431469 (0.0025) [2024-04-27 17:05:38,577][52263] Updated weights for policy 0, policy_version 431479 (0.0035) [2024-04-27 17:05:39,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53794.3, 300 sec: 53539.6). Total num frames: 7069368320. Throughput: 0: 53859.0. Samples: 1559939180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:39,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 17:05:41,447][52263] Updated weights for policy 0, policy_version 431489 (0.0030) [2024-04-27 17:05:44,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 7069630464. Throughput: 0: 53619.2. Samples: 1560093020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:44,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 17:05:44,833][52263] Updated weights for policy 0, policy_version 431499 (0.0028) [2024-04-27 17:05:47,401][52263] Updated weights for policy 0, policy_version 431509 (0.0031) [2024-04-27 17:05:49,106][52031] Fps is (10 sec: 54066.9, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7069908992. Throughput: 0: 53565.9. Samples: 1560411940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:49,107][52031] Avg episode reward: [(0, '0.554')] [2024-04-27 17:05:50,975][52263] Updated weights for policy 0, policy_version 431519 (0.0030) [2024-04-27 17:05:53,487][52263] Updated weights for policy 0, policy_version 431529 (0.0027) [2024-04-27 17:05:54,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 7070171136. Throughput: 0: 53551.4. Samples: 1560735280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:54,107][52031] Avg episode reward: [(0, '0.571')] [2024-04-27 17:05:57,105][52263] Updated weights for policy 0, policy_version 431539 (0.0031) [2024-04-27 17:05:59,107][52031] Fps is (10 sec: 55705.0, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 7070466048. Throughput: 0: 53880.8. Samples: 1560909680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:05:59,107][52031] Avg episode reward: [(0, '0.569')] [2024-04-27 17:05:59,498][52263] Updated weights for policy 0, policy_version 431549 (0.0027) [2024-04-27 17:06:03,189][52263] Updated weights for policy 0, policy_version 431559 (0.0028) [2024-04-27 17:06:04,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 7070711808. Throughput: 0: 53826.2. Samples: 1561231220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:06:04,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 17:06:05,926][52263] Updated weights for policy 0, policy_version 431569 (0.0026) [2024-04-27 17:06:09,107][52031] Fps is (10 sec: 50790.7, 60 sec: 53794.0, 300 sec: 53539.6). Total num frames: 7070973952. Throughput: 0: 53943.2. Samples: 1561555660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 17:06:09,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 17:06:09,389][52263] Updated weights for policy 0, policy_version 431579 (0.0029) [2024-04-27 17:06:12,234][52263] Updated weights for policy 0, policy_version 431589 (0.0028) [2024-04-27 17:06:14,106][52031] Fps is (10 sec: 54067.8, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 7071252480. Throughput: 0: 53605.6. Samples: 1561705040. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:14,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 17:06:15,386][52263] Updated weights for policy 0, policy_version 431599 (0.0032) [2024-04-27 17:06:18,218][52263] Updated weights for policy 0, policy_version 431609 (0.0026) [2024-04-27 17:06:19,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7071514624. Throughput: 0: 53570.7. Samples: 1562026620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:19,107][52031] Avg episode reward: [(0, '0.593')] [2024-04-27 17:06:21,453][52263] Updated weights for policy 0, policy_version 431619 (0.0027) [2024-04-27 17:06:24,107][52031] Fps is (10 sec: 54066.3, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 7071793152. Throughput: 0: 53623.9. Samples: 1562352260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:24,107][52031] Avg episode reward: [(0, '0.667')] [2024-04-27 17:06:24,230][52263] Updated weights for policy 0, policy_version 431629 (0.0032) [2024-04-27 17:06:27,605][52263] Updated weights for policy 0, policy_version 431639 (0.0030) [2024-04-27 17:06:29,107][52031] Fps is (10 sec: 55705.2, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7072071680. Throughput: 0: 54124.1. Samples: 1562528600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:29,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 17:06:30,368][52263] Updated weights for policy 0, policy_version 431649 (0.0029) [2024-04-27 17:06:33,625][52263] Updated weights for policy 0, policy_version 431659 (0.0026) [2024-04-27 17:06:34,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7072333824. Throughput: 0: 54166.3. Samples: 1562849420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:34,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 17:06:34,396][52242] Signal inference workers to stop experience collection... (24000 times) [2024-04-27 17:06:34,396][52242] Signal inference workers to resume experience collection... (24000 times) [2024-04-27 17:06:34,408][52263] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-04-27 17:06:34,408][52263] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-04-27 17:06:36,475][52263] Updated weights for policy 0, policy_version 431669 (0.0031) [2024-04-27 17:06:39,107][52031] Fps is (10 sec: 50790.0, 60 sec: 53520.9, 300 sec: 53484.0). Total num frames: 7072579584. Throughput: 0: 54190.6. Samples: 1563173860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:39,107][52031] Avg episode reward: [(0, '0.613')] [2024-04-27 17:06:39,672][52263] Updated weights for policy 0, policy_version 431679 (0.0030) [2024-04-27 17:06:42,487][52263] Updated weights for policy 0, policy_version 431689 (0.0028) [2024-04-27 17:06:44,107][52031] Fps is (10 sec: 54066.5, 60 sec: 54067.2, 300 sec: 53650.6). Total num frames: 7072874496. Throughput: 0: 53627.1. Samples: 1563322900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:44,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 17:06:45,721][52263] Updated weights for policy 0, policy_version 431699 (0.0029) [2024-04-27 17:06:48,375][52263] Updated weights for policy 0, policy_version 431709 (0.0032) [2024-04-27 17:06:49,106][52031] Fps is (10 sec: 55706.5, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7073136640. Throughput: 0: 53613.8. Samples: 1563643840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:49,107][52031] Avg episode reward: [(0, '0.582')] [2024-04-27 17:06:51,949][52263] Updated weights for policy 0, policy_version 431719 (0.0021) [2024-04-27 17:06:54,107][52031] Fps is (10 sec: 54066.6, 60 sec: 54067.0, 300 sec: 53650.6). Total num frames: 7073415168. Throughput: 0: 53571.8. Samples: 1563966400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:54,107][52031] Avg episode reward: [(0, '0.568')] [2024-04-27 17:06:54,552][52263] Updated weights for policy 0, policy_version 431729 (0.0033) [2024-04-27 17:06:57,988][52263] Updated weights for policy 0, policy_version 431739 (0.0039) [2024-04-27 17:06:59,106][52031] Fps is (10 sec: 57343.9, 60 sec: 54067.3, 300 sec: 53706.2). Total num frames: 7073710080. Throughput: 0: 54065.7. Samples: 1564138000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:06:59,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 17:07:00,702][52263] Updated weights for policy 0, policy_version 431749 (0.0029) [2024-04-27 17:07:03,933][52263] Updated weights for policy 0, policy_version 431759 (0.0029) [2024-04-27 17:07:04,106][52031] Fps is (10 sec: 52429.9, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7073939456. Throughput: 0: 54059.5. Samples: 1564459300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:07:04,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 17:07:06,677][52263] Updated weights for policy 0, policy_version 431769 (0.0026) [2024-04-27 17:07:09,106][52031] Fps is (10 sec: 47513.8, 60 sec: 53521.2, 300 sec: 53484.1). Total num frames: 7074185216. Throughput: 0: 53967.3. Samples: 1564780780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:07:09,107][52031] Avg episode reward: [(0, '0.590')] [2024-04-27 17:07:10,045][52263] Updated weights for policy 0, policy_version 431779 (0.0028) [2024-04-27 17:07:12,841][52263] Updated weights for policy 0, policy_version 431789 (0.0029) [2024-04-27 17:07:14,107][52031] Fps is (10 sec: 55705.0, 60 sec: 54067.0, 300 sec: 53706.2). Total num frames: 7074496512. Throughput: 0: 53555.9. Samples: 1564938620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:07:14,107][52031] Avg episode reward: [(0, '0.614')] [2024-04-27 17:07:16,220][52263] Updated weights for policy 0, policy_version 431799 (0.0029) [2024-04-27 17:07:19,063][52263] Updated weights for policy 0, policy_version 431809 (0.0027) [2024-04-27 17:07:19,107][52031] Fps is (10 sec: 57343.4, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 7074758656. Throughput: 0: 53537.7. Samples: 1565258620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:07:19,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 17:07:22,160][52263] Updated weights for policy 0, policy_version 431819 (0.0029) [2024-04-27 17:07:24,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7075020800. Throughput: 0: 53478.6. Samples: 1565580400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:07:24,107][52031] Avg episode reward: [(0, '0.670')] [2024-04-27 17:07:25,041][52263] Updated weights for policy 0, policy_version 431829 (0.0028) [2024-04-27 17:07:28,378][52263] Updated weights for policy 0, policy_version 431839 (0.0029) [2024-04-27 17:07:29,107][52031] Fps is (10 sec: 54067.2, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 7075299328. Throughput: 0: 53854.7. Samples: 1565746360. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:07:29,107][52031] Avg episode reward: [(0, '0.477')] [2024-04-27 17:07:31,203][52263] Updated weights for policy 0, policy_version 431849 (0.0024) [2024-04-27 17:07:33,974][52242] Signal inference workers to stop experience collection... (24050 times) [2024-04-27 17:07:33,978][52242] Signal inference workers to resume experience collection... (24050 times) [2024-04-27 17:07:33,989][52263] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-04-27 17:07:34,006][52263] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-04-27 17:07:34,106][52031] Fps is (10 sec: 54067.9, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 7075561472. Throughput: 0: 53855.1. Samples: 1566067320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-04-27 17:07:34,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 17:07:34,117][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000431858_7075561472.pth... [2024-04-27 17:07:34,160][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000431071_7062667264.pth [2024-04-27 17:07:34,553][52263] Updated weights for policy 0, policy_version 431859 (0.0027) [2024-04-27 17:07:37,265][52263] Updated weights for policy 0, policy_version 431869 (0.0030) [2024-04-27 17:07:39,107][52031] Fps is (10 sec: 50790.1, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7075807232. Throughput: 0: 53953.9. Samples: 1566394320. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:07:39,107][52031] Avg episode reward: [(0, '0.530')] [2024-04-27 17:07:40,539][52263] Updated weights for policy 0, policy_version 431879 (0.0027) [2024-04-27 17:07:43,211][52263] Updated weights for policy 0, policy_version 431889 (0.0027) [2024-04-27 17:07:44,107][52031] Fps is (10 sec: 54066.8, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 7076102144. Throughput: 0: 53515.0. 
Samples: 1566546180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:07:44,107][52031] Avg episode reward: [(0, '0.674')] [2024-04-27 17:07:46,771][52263] Updated weights for policy 0, policy_version 431899 (0.0033) [2024-04-27 17:07:49,106][52031] Fps is (10 sec: 55706.7, 60 sec: 53794.2, 300 sec: 53761.8). Total num frames: 7076364288. Throughput: 0: 53618.8. Samples: 1566872140. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:07:49,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 17:07:49,391][52263] Updated weights for policy 0, policy_version 431909 (0.0029) [2024-04-27 17:07:52,857][52263] Updated weights for policy 0, policy_version 431919 (0.0029) [2024-04-27 17:07:54,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.2, 300 sec: 53595.1). Total num frames: 7076626432. Throughput: 0: 53686.6. Samples: 1567196680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:07:54,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 17:07:55,550][52263] Updated weights for policy 0, policy_version 431929 (0.0027) [2024-04-27 17:07:58,796][52263] Updated weights for policy 0, policy_version 431939 (0.0031) [2024-04-27 17:07:59,107][52031] Fps is (10 sec: 52427.5, 60 sec: 52974.8, 300 sec: 53595.1). Total num frames: 7076888576. Throughput: 0: 53751.5. Samples: 1567357440. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:07:59,107][52031] Avg episode reward: [(0, '0.549')] [2024-04-27 17:08:01,530][52263] Updated weights for policy 0, policy_version 431949 (0.0035) [2024-04-27 17:08:04,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 7077167104. Throughput: 0: 53792.8. Samples: 1567679300. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:04,107][52031] Avg episode reward: [(0, '0.526')] [2024-04-27 17:08:04,751][52263] Updated weights for policy 0, policy_version 431959 (0.0026) [2024-04-27 17:08:07,814][52263] Updated weights for policy 0, policy_version 431969 (0.0035) [2024-04-27 17:08:09,106][52031] Fps is (10 sec: 54067.9, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 7077429248. Throughput: 0: 53794.3. Samples: 1568001140. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:09,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 17:08:10,897][52263] Updated weights for policy 0, policy_version 431979 (0.0026) [2024-04-27 17:08:13,879][52263] Updated weights for policy 0, policy_version 431989 (0.0036) [2024-04-27 17:08:14,107][52031] Fps is (10 sec: 54067.3, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 7077707776. Throughput: 0: 53677.3. Samples: 1568161840. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:14,107][52031] Avg episode reward: [(0, '0.612')] [2024-04-27 17:08:17,019][52263] Updated weights for policy 0, policy_version 431999 (0.0029) [2024-04-27 17:08:19,106][52031] Fps is (10 sec: 54067.6, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 7077969920. Throughput: 0: 53715.2. Samples: 1568484500. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:19,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 17:08:19,868][52263] Updated weights for policy 0, policy_version 432009 (0.0028) [2024-04-27 17:08:23,178][52263] Updated weights for policy 0, policy_version 432019 (0.0031) [2024-04-27 17:08:24,106][52031] Fps is (10 sec: 54068.0, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 7078248448. Throughput: 0: 53615.3. Samples: 1568807000. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:24,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 17:08:25,960][52263] Updated weights for policy 0, policy_version 432029 (0.0030) [2024-04-27 17:08:29,106][52031] Fps is (10 sec: 54067.0, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 7078510592. Throughput: 0: 53791.2. Samples: 1568966780. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:29,107][52031] Avg episode reward: [(0, '0.594')] [2024-04-27 17:08:29,370][52263] Updated weights for policy 0, policy_version 432039 (0.0026) [2024-04-27 17:08:32,007][52263] Updated weights for policy 0, policy_version 432049 (0.0028) [2024-04-27 17:08:32,705][52242] Signal inference workers to stop experience collection... (24100 times) [2024-04-27 17:08:32,758][52263] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-04-27 17:08:32,758][52242] Signal inference workers to resume experience collection... (24100 times) [2024-04-27 17:08:32,773][52263] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-04-27 17:08:34,107][52031] Fps is (10 sec: 50789.9, 60 sec: 53247.9, 300 sec: 53595.1). Total num frames: 7078756352. Throughput: 0: 53606.0. Samples: 1569284420. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:34,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 17:08:35,578][52263] Updated weights for policy 0, policy_version 432059 (0.0025) [2024-04-27 17:08:38,137][52263] Updated weights for policy 0, policy_version 432069 (0.0034) [2024-04-27 17:08:39,107][52031] Fps is (10 sec: 54066.4, 60 sec: 54067.2, 300 sec: 53761.7). Total num frames: 7079051264. Throughput: 0: 53417.2. Samples: 1569600460. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:39,108][52031] Avg episode reward: [(0, '0.640')] [2024-04-27 17:08:41,714][52263] Updated weights for policy 0, policy_version 432079 (0.0026) [2024-04-27 17:08:44,106][52031] Fps is (10 sec: 55706.0, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 7079313408. Throughput: 0: 53527.7. Samples: 1569766180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:44,107][52031] Avg episode reward: [(0, '0.663')] [2024-04-27 17:08:44,389][52263] Updated weights for policy 0, policy_version 432089 (0.0026) [2024-04-27 17:08:47,836][52263] Updated weights for policy 0, policy_version 432099 (0.0029) [2024-04-27 17:08:49,106][52031] Fps is (10 sec: 52429.7, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 7079575552. Throughput: 0: 53459.3. Samples: 1570084960. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:49,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 17:08:50,686][52263] Updated weights for policy 0, policy_version 432109 (0.0031) [2024-04-27 17:08:53,931][52263] Updated weights for policy 0, policy_version 432119 (0.0032) [2024-04-27 17:08:54,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 7079837696. Throughput: 0: 53416.5. Samples: 1570404880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-27 17:08:54,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 17:08:56,973][52263] Updated weights for policy 0, policy_version 432129 (0.0027) [2024-04-27 17:08:59,106][52031] Fps is (10 sec: 54067.3, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 7080116224. Throughput: 0: 53489.1. Samples: 1570568840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:08:59,107][52031] Avg episode reward: [(0, '0.653')] [2024-04-27 17:08:59,913][52263] Updated weights for policy 0, policy_version 432139 (0.0028) [2024-04-27 17:09:02,915][52263] Updated weights for policy 0, policy_version 432149 (0.0031) [2024-04-27 17:09:04,106][52031] Fps is (10 sec: 52428.7, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 7080361984. Throughput: 0: 53475.9. Samples: 1570890920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:04,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 17:09:06,055][52263] Updated weights for policy 0, policy_version 432159 (0.0030) [2024-04-27 17:09:09,023][52263] Updated weights for policy 0, policy_version 432169 (0.0032) [2024-04-27 17:09:09,107][52031] Fps is (10 sec: 54066.6, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 7080656896. Throughput: 0: 53438.5. Samples: 1571211740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:09,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 17:09:12,125][52263] Updated weights for policy 0, policy_version 432179 (0.0026) [2024-04-27 17:09:14,107][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 7080919040. Throughput: 0: 53479.4. Samples: 1571373360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:14,107][52031] Avg episode reward: [(0, '0.508')] [2024-04-27 17:09:15,065][52263] Updated weights for policy 0, policy_version 432189 (0.0029) [2024-04-27 17:09:18,371][52263] Updated weights for policy 0, policy_version 432199 (0.0028) [2024-04-27 17:09:19,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 7081197568. Throughput: 0: 53668.1. Samples: 1571699480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:19,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 17:09:21,089][52263] Updated weights for policy 0, policy_version 432209 (0.0036) [2024-04-27 17:09:24,107][52031] Fps is (10 sec: 52429.0, 60 sec: 53247.9, 300 sec: 53539.6). Total num frames: 7081443328. Throughput: 0: 53717.8. Samples: 1572017760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:24,107][52031] Avg episode reward: [(0, '0.673')] [2024-04-27 17:09:24,463][52263] Updated weights for policy 0, policy_version 432219 (0.0029) [2024-04-27 17:09:27,217][52263] Updated weights for policy 0, policy_version 432229 (0.0032) [2024-04-27 17:09:29,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 7081721856. Throughput: 0: 53569.0. Samples: 1572176780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:29,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 17:09:30,676][52263] Updated weights for policy 0, policy_version 432239 (0.0028) [2024-04-27 17:09:33,418][52263] Updated weights for policy 0, policy_version 432249 (0.0028) [2024-04-27 17:09:34,107][52031] Fps is (10 sec: 55704.8, 60 sec: 54067.0, 300 sec: 53761.7). Total num frames: 7082000384. Throughput: 0: 53631.2. Samples: 1572498380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:34,107][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 17:09:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000432251_7082000384.pth... [2024-04-27 17:09:34,158][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000431465_7069122560.pth [2024-04-27 17:09:36,859][52263] Updated weights for policy 0, policy_version 432259 (0.0028) [2024-04-27 17:09:39,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53248.2, 300 sec: 53595.1). Total num frames: 7082246144. Throughput: 0: 53686.3. Samples: 1572820760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:39,107][52031] Avg episode reward: [(0, '0.578')] [2024-04-27 17:09:39,592][52263] Updated weights for policy 0, policy_version 432269 (0.0035) [2024-04-27 17:09:42,859][52263] Updated weights for policy 0, policy_version 432279 (0.0034) [2024-04-27 17:09:44,107][52031] Fps is (10 sec: 52429.5, 60 sec: 53521.0, 300 sec: 53706.2). Total num frames: 7082524672. Throughput: 0: 53637.2. Samples: 1572982520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:44,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 17:09:45,696][52263] Updated weights for policy 0, policy_version 432289 (0.0033) [2024-04-27 17:09:48,854][52263] Updated weights for policy 0, policy_version 432299 (0.0031) [2024-04-27 17:09:49,106][52031] Fps is (10 sec: 54067.1, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 7082786816. Throughput: 0: 53575.6. Samples: 1573301820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:49,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 17:09:51,923][52263] Updated weights for policy 0, policy_version 432309 (0.0031) [2024-04-27 17:09:54,107][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 7083048960. Throughput: 0: 53575.0. Samples: 1573622620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:54,107][52031] Avg episode reward: [(0, '0.581')] [2024-04-27 17:09:54,985][52263] Updated weights for policy 0, policy_version 432319 (0.0032) [2024-04-27 17:09:55,772][52242] Signal inference workers to stop experience collection... (24150 times) [2024-04-27 17:09:55,772][52242] Signal inference workers to resume experience collection... 
(24150 times) [2024-04-27 17:09:55,787][52263] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-04-27 17:09:55,787][52263] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-04-27 17:09:58,061][52263] Updated weights for policy 0, policy_version 432329 (0.0026) [2024-04-27 17:09:59,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 7083311104. Throughput: 0: 53548.1. Samples: 1573783020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:09:59,107][52031] Avg episode reward: [(0, '0.642')] [2024-04-27 17:10:01,179][52263] Updated weights for policy 0, policy_version 432339 (0.0029) [2024-04-27 17:10:04,033][52263] Updated weights for policy 0, policy_version 432349 (0.0033) [2024-04-27 17:10:04,107][52031] Fps is (10 sec: 55705.9, 60 sec: 54067.2, 300 sec: 53761.7). Total num frames: 7083606016. Throughput: 0: 53411.9. Samples: 1574103020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:10:04,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 17:10:07,329][52263] Updated weights for policy 0, policy_version 432359 (0.0028) [2024-04-27 17:10:09,107][52031] Fps is (10 sec: 57343.3, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 7083884544. Throughput: 0: 53502.2. Samples: 1574425360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:10:09,107][52031] Avg episode reward: [(0, '0.662')] [2024-04-27 17:10:10,159][52263] Updated weights for policy 0, policy_version 432369 (0.0031) [2024-04-27 17:10:13,314][52263] Updated weights for policy 0, policy_version 432379 (0.0032) [2024-04-27 17:10:14,106][52031] Fps is (10 sec: 52429.4, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 7084130304. Throughput: 0: 53764.8. Samples: 1574596200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:10:14,107][52031] Avg episode reward: [(0, '0.615')] [2024-04-27 17:10:16,272][52263] Updated weights for policy 0, policy_version 432389 (0.0031) [2024-04-27 17:10:19,106][52031] Fps is (10 sec: 50791.4, 60 sec: 53248.0, 300 sec: 53650.7). Total num frames: 7084392448. Throughput: 0: 53866.1. Samples: 1574922340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:10:19,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 17:10:19,563][52263] Updated weights for policy 0, policy_version 432399 (0.0036) [2024-04-27 17:10:22,343][52263] Updated weights for policy 0, policy_version 432409 (0.0031) [2024-04-27 17:10:24,106][52031] Fps is (10 sec: 52428.5, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7084654592. Throughput: 0: 53784.8. Samples: 1575241080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:10:24,107][52031] Avg episode reward: [(0, '0.596')] [2024-04-27 17:10:25,674][52263] Updated weights for policy 0, policy_version 432419 (0.0033) [2024-04-27 17:10:28,564][52263] Updated weights for policy 0, policy_version 432429 (0.0028) [2024-04-27 17:10:29,107][52031] Fps is (10 sec: 54065.9, 60 sec: 53520.8, 300 sec: 53650.6). Total num frames: 7084933120. Throughput: 0: 53649.2. Samples: 1575396740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:10:29,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 17:10:31,742][52263] Updated weights for policy 0, policy_version 432439 (0.0033) [2024-04-27 17:10:34,106][52031] Fps is (10 sec: 57344.4, 60 sec: 53794.4, 300 sec: 53761.7). Total num frames: 7085228032. Throughput: 0: 53784.9. Samples: 1575722140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:10:34,107][52031] Avg episode reward: [(0, '0.639')] [2024-04-27 17:10:34,632][52263] Updated weights for policy 0, policy_version 432449 (0.0035) [2024-04-27 17:10:37,759][52263] Updated weights for policy 0, policy_version 432459 (0.0039) [2024-04-27 17:10:39,107][52031] Fps is (10 sec: 55705.8, 60 sec: 54067.0, 300 sec: 53761.7). Total num frames: 7085490176. Throughput: 0: 53760.4. Samples: 1576041840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:10:39,107][52031] Avg episode reward: [(0, '0.534')] [2024-04-27 17:10:40,702][52263] Updated weights for policy 0, policy_version 432469 (0.0037) [2024-04-27 17:10:43,971][52263] Updated weights for policy 0, policy_version 432479 (0.0034) [2024-04-27 17:10:44,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 7085735936. Throughput: 0: 53862.7. Samples: 1576206840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:10:44,107][52031] Avg episode reward: [(0, '0.567')] [2024-04-27 17:10:47,084][52263] Updated weights for policy 0, policy_version 432489 (0.0029) [2024-04-27 17:10:49,107][52031] Fps is (10 sec: 52428.8, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 7086014464. Throughput: 0: 53887.5. Samples: 1576527960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:10:49,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 17:10:50,118][52263] Updated weights for policy 0, policy_version 432499 (0.0028) [2024-04-27 17:10:53,144][52263] Updated weights for policy 0, policy_version 432509 (0.0036) [2024-04-27 17:10:54,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53521.2, 300 sec: 53539.6). Total num frames: 7086260224. Throughput: 0: 53942.4. Samples: 1576852760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:10:54,107][52031] Avg episode reward: [(0, '0.624')] [2024-04-27 17:10:56,102][52263] Updated weights for policy 0, policy_version 432519 (0.0029) [2024-04-27 17:10:59,106][52031] Fps is (10 sec: 52430.0, 60 sec: 53794.2, 300 sec: 53650.7). Total num frames: 7086538752. Throughput: 0: 53724.5. Samples: 1577013800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:10:59,107][52031] Avg episode reward: [(0, '0.626')] [2024-04-27 17:10:59,135][52263] Updated weights for policy 0, policy_version 432529 (0.0030) [2024-04-27 17:11:02,174][52263] Updated weights for policy 0, policy_version 432539 (0.0028) [2024-04-27 17:11:04,106][52031] Fps is (10 sec: 57344.0, 60 sec: 53794.2, 300 sec: 53761.7). Total num frames: 7086833664. Throughput: 0: 53583.9. Samples: 1577333620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:11:04,107][52031] Avg episode reward: [(0, '0.531')] [2024-04-27 17:11:05,275][52263] Updated weights for policy 0, policy_version 432549 (0.0036) [2024-04-27 17:11:08,440][52263] Updated weights for policy 0, policy_version 432559 (0.0031) [2024-04-27 17:11:09,070][52242] Signal inference workers to stop experience collection... (24200 times) [2024-04-27 17:11:09,071][52242] Signal inference workers to resume experience collection... (24200 times) [2024-04-27 17:11:09,102][52263] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-04-27 17:11:09,102][52263] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-04-27 17:11:09,106][52031] Fps is (10 sec: 55705.1, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 7087095808. Throughput: 0: 53616.5. Samples: 1577653820. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:11:09,107][52031] Avg episode reward: [(0, '0.584')] [2024-04-27 17:11:11,441][52263] Updated weights for policy 0, policy_version 432569 (0.0027) [2024-04-27 17:11:14,106][52031] Fps is (10 sec: 50790.5, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 7087341568. Throughput: 0: 53774.9. Samples: 1577816600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:11:14,107][52031] Avg episode reward: [(0, '0.496')] [2024-04-27 17:11:14,559][52263] Updated weights for policy 0, policy_version 432579 (0.0028) [2024-04-27 17:11:17,700][52263] Updated weights for policy 0, policy_version 432589 (0.0028) [2024-04-27 17:11:19,106][52031] Fps is (10 sec: 52428.9, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 7087620096. Throughput: 0: 53800.9. Samples: 1578143180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:11:19,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 17:11:20,498][52263] Updated weights for policy 0, policy_version 432599 (0.0030) [2024-04-27 17:11:23,821][52263] Updated weights for policy 0, policy_version 432609 (0.0035) [2024-04-27 17:11:24,106][52031] Fps is (10 sec: 54067.4, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 7087882240. Throughput: 0: 53824.7. Samples: 1578463940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:11:24,107][52031] Avg episode reward: [(0, '0.628')] [2024-04-27 17:11:26,464][52263] Updated weights for policy 0, policy_version 432619 (0.0027) [2024-04-27 17:11:29,107][52031] Fps is (10 sec: 55704.6, 60 sec: 54067.2, 300 sec: 53706.2). Total num frames: 7088177152. Throughput: 0: 53670.4. Samples: 1578622020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:11:29,107][52031] Avg episode reward: [(0, '0.559')] [2024-04-27 17:11:29,822][52263] Updated weights for policy 0, policy_version 432629 (0.0032) [2024-04-27 17:11:32,608][52263] Updated weights for policy 0, policy_version 432639 (0.0026) [2024-04-27 17:11:34,106][52031] Fps is (10 sec: 58982.6, 60 sec: 54067.2, 300 sec: 53872.8). Total num frames: 7088472064. Throughput: 0: 53720.7. Samples: 1578945380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:11:34,107][52031] Avg episode reward: [(0, '0.580')] [2024-04-27 17:11:34,116][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000432646_7088472064.pth... [2024-04-27 17:11:34,164][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000431858_7075561472.pth [2024-04-27 17:11:35,918][52263] Updated weights for policy 0, policy_version 432649 (0.0038) [2024-04-27 17:11:38,819][52263] Updated weights for policy 0, policy_version 432659 (0.0032) [2024-04-27 17:11:39,107][52031] Fps is (10 sec: 50790.9, 60 sec: 53248.1, 300 sec: 53595.1). Total num frames: 7088685056. Throughput: 0: 53623.5. Samples: 1579265820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:11:39,107][52031] Avg episode reward: [(0, '0.573')] [2024-04-27 17:11:42,096][52263] Updated weights for policy 0, policy_version 432669 (0.0030) [2024-04-27 17:11:44,107][52031] Fps is (10 sec: 49151.3, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 7088963584. Throughput: 0: 53644.7. Samples: 1579427820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:11:44,107][52031] Avg episode reward: [(0, '0.595')] [2024-04-27 17:11:45,098][52263] Updated weights for policy 0, policy_version 432679 (0.0031) [2024-04-27 17:11:48,479][52263] Updated weights for policy 0, policy_version 432689 (0.0029) [2024-04-27 17:11:49,106][52031] Fps is (10 sec: 52429.2, 60 sec: 53248.1, 300 sec: 53539.6). 
Total num frames: 7089209344. Throughput: 0: 53575.2. Samples: 1579744500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:11:49,107][52031] Avg episode reward: [(0, '0.631')] [2024-04-27 17:11:51,037][52263] Updated weights for policy 0, policy_version 432699 (0.0031) [2024-04-27 17:11:54,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.2, 300 sec: 53484.0). Total num frames: 7089487872. Throughput: 0: 53587.1. Samples: 1580065240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:11:54,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 17:11:54,488][52263] Updated weights for policy 0, policy_version 432709 (0.0028) [2024-04-27 17:11:56,979][52263] Updated weights for policy 0, policy_version 432719 (0.0031) [2024-04-27 17:11:59,106][52031] Fps is (10 sec: 57343.7, 60 sec: 54067.1, 300 sec: 53706.2). Total num frames: 7089782784. Throughput: 0: 53673.3. Samples: 1580231900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:11:59,107][52031] Avg episode reward: [(0, '0.654')] [2024-04-27 17:12:00,648][52263] Updated weights for policy 0, policy_version 432729 (0.0029) [2024-04-27 17:12:03,303][52263] Updated weights for policy 0, policy_version 432739 (0.0032) [2024-04-27 17:12:04,106][52031] Fps is (10 sec: 58982.2, 60 sec: 54067.2, 300 sec: 53872.8). Total num frames: 7090077696. Throughput: 0: 53535.1. Samples: 1580552260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:04,107][52031] Avg episode reward: [(0, '0.553')] [2024-04-27 17:12:06,006][52242] Signal inference workers to stop experience collection... (24250 times) [2024-04-27 17:12:06,007][52242] Signal inference workers to resume experience collection... 
(24250 times) [2024-04-27 17:12:06,031][52263] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-04-27 17:12:06,032][52263] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-04-27 17:12:06,930][52263] Updated weights for policy 0, policy_version 432749 (0.0036) [2024-04-27 17:12:09,106][52031] Fps is (10 sec: 50790.7, 60 sec: 53248.0, 300 sec: 53539.6). Total num frames: 7090290688. Throughput: 0: 53498.2. Samples: 1580871360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:09,107][52031] Avg episode reward: [(0, '0.637')] [2024-04-27 17:12:09,363][52263] Updated weights for policy 0, policy_version 432759 (0.0028) [2024-04-27 17:12:12,950][52263] Updated weights for policy 0, policy_version 432769 (0.0026) [2024-04-27 17:12:14,106][52031] Fps is (10 sec: 49152.2, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7090569216. Throughput: 0: 53454.0. Samples: 1581027440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:14,107][52031] Avg episode reward: [(0, '0.608')] [2024-04-27 17:12:15,301][52263] Updated weights for policy 0, policy_version 432779 (0.0033) [2024-04-27 17:12:19,107][52031] Fps is (10 sec: 50789.5, 60 sec: 52974.8, 300 sec: 53484.0). Total num frames: 7090798592. Throughput: 0: 53362.8. Samples: 1581346720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:19,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 17:12:19,289][52263] Updated weights for policy 0, policy_version 432789 (0.0030) [2024-04-27 17:12:21,348][52263] Updated weights for policy 0, policy_version 432799 (0.0033) [2024-04-27 17:12:24,107][52031] Fps is (10 sec: 52428.1, 60 sec: 53520.9, 300 sec: 53539.6). Total num frames: 7091093504. Throughput: 0: 53381.7. Samples: 1581668000. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:24,107][52031] Avg episode reward: [(0, '0.611')] [2024-04-27 17:12:25,287][52263] Updated weights for policy 0, policy_version 432809 (0.0030) [2024-04-27 17:12:27,427][52263] Updated weights for policy 0, policy_version 432819 (0.0031) [2024-04-27 17:12:29,106][52031] Fps is (10 sec: 58983.9, 60 sec: 53521.3, 300 sec: 53650.7). Total num frames: 7091388416. Throughput: 0: 53585.1. Samples: 1581839140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:29,107][52031] Avg episode reward: [(0, '0.650')] [2024-04-27 17:12:31,370][52263] Updated weights for policy 0, policy_version 432829 (0.0029) [2024-04-27 17:12:33,527][52263] Updated weights for policy 0, policy_version 432839 (0.0028) [2024-04-27 17:12:34,106][52031] Fps is (10 sec: 57344.7, 60 sec: 53247.9, 300 sec: 53761.8). Total num frames: 7091666944. Throughput: 0: 53734.7. Samples: 1582162560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:34,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 17:12:37,450][52263] Updated weights for policy 0, policy_version 432849 (0.0032) [2024-04-27 17:12:39,106][52031] Fps is (10 sec: 52428.6, 60 sec: 53794.2, 300 sec: 53595.1). Total num frames: 7091912704. Throughput: 0: 53795.6. Samples: 1582486040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:39,107][52031] Avg episode reward: [(0, '0.507')] [2024-04-27 17:12:39,579][52263] Updated weights for policy 0, policy_version 432859 (0.0031) [2024-04-27 17:12:43,489][52263] Updated weights for policy 0, policy_version 432869 (0.0038) [2024-04-27 17:12:44,106][52031] Fps is (10 sec: 49152.2, 60 sec: 53248.1, 300 sec: 53539.6). Total num frames: 7092158464. Throughput: 0: 53436.1. Samples: 1582636520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:44,107][52031] Avg episode reward: [(0, '0.548')] [2024-04-27 17:12:45,711][52263] Updated weights for policy 0, policy_version 432879 (0.0029) [2024-04-27 17:12:49,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53794.0, 300 sec: 53595.1). Total num frames: 7092436992. Throughput: 0: 53423.0. Samples: 1582956300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:49,107][52031] Avg episode reward: [(0, '0.555')] [2024-04-27 17:12:49,548][52263] Updated weights for policy 0, policy_version 432889 (0.0027) [2024-04-27 17:12:51,622][52242] Signal inference workers to stop experience collection... (24300 times) [2024-04-27 17:12:51,623][52242] Signal inference workers to resume experience collection... (24300 times) [2024-04-27 17:12:51,636][52263] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-04-27 17:12:51,636][52263] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-04-27 17:12:51,751][52263] Updated weights for policy 0, policy_version 432899 (0.0029) [2024-04-27 17:12:54,106][52031] Fps is (10 sec: 57344.2, 60 sec: 54067.3, 300 sec: 53706.2). Total num frames: 7092731904. Throughput: 0: 53515.6. Samples: 1583279560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:54,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 17:12:55,649][52263] Updated weights for policy 0, policy_version 432909 (0.0041) [2024-04-27 17:12:57,959][52263] Updated weights for policy 0, policy_version 432919 (0.0031) [2024-04-27 17:12:59,107][52031] Fps is (10 sec: 57344.1, 60 sec: 53794.1, 300 sec: 53706.2). Total num frames: 7093010432. Throughput: 0: 53832.8. Samples: 1583449920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:12:59,116][52031] Avg episode reward: [(0, '0.610')] [2024-04-27 17:13:01,570][52263] Updated weights for policy 0, policy_version 432929 (0.0025) [2024-04-27 17:13:03,855][52263] Updated weights for policy 0, policy_version 432939 (0.0033) [2024-04-27 17:13:04,107][52031] Fps is (10 sec: 54066.0, 60 sec: 53247.9, 300 sec: 53706.2). Total num frames: 7093272576. Throughput: 0: 53947.1. Samples: 1583774340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-27 17:13:04,107][52031] Avg episode reward: [(0, '0.565')] [2024-04-27 17:13:07,631][52263] Updated weights for policy 0, policy_version 432949 (0.0027) [2024-04-27 17:13:09,107][52031] Fps is (10 sec: 50790.6, 60 sec: 53794.1, 300 sec: 53595.1). Total num frames: 7093518336. Throughput: 0: 53934.3. Samples: 1584095040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:09,107][52031] Avg episode reward: [(0, '0.635')] [2024-04-27 17:13:09,920][52263] Updated weights for policy 0, policy_version 432959 (0.0030) [2024-04-27 17:13:13,674][52263] Updated weights for policy 0, policy_version 432969 (0.0031) [2024-04-27 17:13:14,107][52031] Fps is (10 sec: 52429.2, 60 sec: 53794.1, 300 sec: 53650.6). Total num frames: 7093796864. Throughput: 0: 53562.9. Samples: 1584249480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:14,107][52031] Avg episode reward: [(0, '0.671')] [2024-04-27 17:13:15,985][52263] Updated weights for policy 0, policy_version 432979 (0.0031) [2024-04-27 17:13:19,107][52031] Fps is (10 sec: 50790.5, 60 sec: 53794.3, 300 sec: 53484.0). Total num frames: 7094026240. Throughput: 0: 53588.8. Samples: 1584574060. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:19,107][52031] Avg episode reward: [(0, '0.585')] [2024-04-27 17:13:19,802][52263] Updated weights for policy 0, policy_version 432989 (0.0032) [2024-04-27 17:13:22,130][52263] Updated weights for policy 0, policy_version 432999 (0.0033) [2024-04-27 17:13:24,106][52031] Fps is (10 sec: 54067.8, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 7094337536. Throughput: 0: 53555.1. Samples: 1584896020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:24,107][52031] Avg episode reward: [(0, '0.616')] [2024-04-27 17:13:25,851][52263] Updated weights for policy 0, policy_version 433009 (0.0030) [2024-04-27 17:13:28,317][52263] Updated weights for policy 0, policy_version 433019 (0.0032) [2024-04-27 17:13:29,106][52031] Fps is (10 sec: 60621.4, 60 sec: 54067.2, 300 sec: 53817.3). Total num frames: 7094632448. Throughput: 0: 53940.9. Samples: 1585063860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:29,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 17:13:31,861][52263] Updated weights for policy 0, policy_version 433029 (0.0027) [2024-04-27 17:13:34,106][52031] Fps is (10 sec: 54066.8, 60 sec: 53521.0, 300 sec: 53650.7). Total num frames: 7094878208. Throughput: 0: 54025.4. Samples: 1585387440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:34,107][52031] Avg episode reward: [(0, '0.685')] [2024-04-27 17:13:34,153][52242] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000433038_7094894592.pth... [2024-04-27 17:13:34,203][52242] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000432251_7082000384.pth [2024-04-27 17:13:34,359][52263] Updated weights for policy 0, policy_version 433039 (0.0031) [2024-04-27 17:13:37,974][52263] Updated weights for policy 0, policy_version 433049 (0.0027) [2024-04-27 17:13:39,106][52031] Fps is (10 sec: 50790.4, 60 sec: 53794.1, 300 sec: 53650.7). 
Total num frames: 7095140352. Throughput: 0: 53996.0. Samples: 1585709380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:39,107][52031] Avg episode reward: [(0, '0.622')] [2024-04-27 17:13:40,339][52263] Updated weights for policy 0, policy_version 433059 (0.0030) [2024-04-27 17:13:43,954][52263] Updated weights for policy 0, policy_version 433069 (0.0030) [2024-04-27 17:13:44,106][52031] Fps is (10 sec: 52429.2, 60 sec: 54067.2, 300 sec: 53650.7). Total num frames: 7095402496. Throughput: 0: 53773.5. Samples: 1585869720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:44,107][52031] Avg episode reward: [(0, '0.587')] [2024-04-27 17:13:46,388][52263] Updated weights for policy 0, policy_version 433079 (0.0028) [2024-04-27 17:13:49,106][52031] Fps is (10 sec: 52429.0, 60 sec: 53794.3, 300 sec: 53650.7). Total num frames: 7095664640. Throughput: 0: 53720.7. Samples: 1586191760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:49,107][52031] Avg episode reward: [(0, '0.591')] [2024-04-27 17:13:49,954][52263] Updated weights for policy 0, policy_version 433089 (0.0027) [2024-04-27 17:13:52,455][52263] Updated weights for policy 0, policy_version 433099 (0.0030) [2024-04-27 17:13:54,107][52031] Fps is (10 sec: 55704.9, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 7095959552. Throughput: 0: 53660.4. Samples: 1586509760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:54,107][52031] Avg episode reward: [(0, '0.604')] [2024-04-27 17:13:56,099][52263] Updated weights for policy 0, policy_version 433109 (0.0028) [2024-04-27 17:13:58,171][52242] Signal inference workers to stop experience collection... (24350 times) [2024-04-27 17:13:58,219][52263] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-04-27 17:13:58,230][52242] Signal inference workers to resume experience collection... 
(24350 times) [2024-04-27 17:13:58,234][52263] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-04-27 17:13:58,680][52263] Updated weights for policy 0, policy_version 433119 (0.0031) [2024-04-27 17:13:59,106][52031] Fps is (10 sec: 55705.3, 60 sec: 53521.2, 300 sec: 53761.7). Total num frames: 7096221696. Throughput: 0: 53969.5. Samples: 1586678100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:13:59,107][52031] Avg episode reward: [(0, '0.511')] [2024-04-27 17:14:02,399][52263] Updated weights for policy 0, policy_version 433129 (0.0033) [2024-04-27 17:14:04,107][52031] Fps is (10 sec: 52428.9, 60 sec: 53521.2, 300 sec: 53650.7). Total num frames: 7096483840. Throughput: 0: 53879.5. Samples: 1586998640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:14:04,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 17:14:04,680][52263] Updated weights for policy 0, policy_version 433139 (0.0037) [2024-04-27 17:14:08,527][52263] Updated weights for policy 0, policy_version 433149 (0.0029) [2024-04-27 17:14:09,107][52031] Fps is (10 sec: 52427.9, 60 sec: 53794.1, 300 sec: 53650.7). Total num frames: 7096745984. Throughput: 0: 53889.6. Samples: 1587321060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:14:09,107][52031] Avg episode reward: [(0, '0.597')] [2024-04-27 17:14:10,802][52263] Updated weights for policy 0, policy_version 433159 (0.0027) [2024-04-27 17:14:14,106][52031] Fps is (10 sec: 52429.1, 60 sec: 53521.1, 300 sec: 53595.1). Total num frames: 7097008128. Throughput: 0: 53570.6. Samples: 1587474540. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:14:14,107][52031] Avg episode reward: [(0, '0.598')] [2024-04-27 17:14:14,670][52263] Updated weights for policy 0, policy_version 433169 (0.0030) [2024-04-27 17:14:16,974][52263] Updated weights for policy 0, policy_version 433179 (0.0031) [2024-04-27 17:14:19,106][52031] Fps is (10 sec: 52429.8, 60 sec: 54067.3, 300 sec: 53650.7). Total num frames: 7097270272. Throughput: 0: 53523.7. Samples: 1587796000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:14:19,107][52031] Avg episode reward: [(0, '0.601')] [2024-04-27 17:14:20,737][52263] Updated weights for policy 0, policy_version 433189 (0.0032) [2024-04-27 17:14:23,226][52263] Updated weights for policy 0, policy_version 433199 (0.0036) [2024-04-27 17:14:24,107][52031] Fps is (10 sec: 55704.7, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 7097565184. Throughput: 0: 53599.3. Samples: 1588121360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 17:14:24,107][52031] Avg episode reward: [(0, '0.630')] [2024-04-27 17:14:26,671][52263] Updated weights for policy 0, policy_version 433209 (0.0029) [2024-04-27 17:14:29,106][52031] Fps is (10 sec: 57343.9, 60 sec: 53521.1, 300 sec: 53706.2). Total num frames: 7097843712. Throughput: 0: 53808.4. Samples: 1588291100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:14:29,107][52031] Avg episode reward: [(0, '0.651')] [2024-04-27 17:14:29,150][52263] Updated weights for policy 0, policy_version 433219 (0.0026) [2024-04-27 17:14:32,737][52263] Updated weights for policy 0, policy_version 433229 (0.0032) [2024-04-27 17:14:34,107][52031] Fps is (10 sec: 54067.7, 60 sec: 53794.1, 300 sec: 53761.7). Total num frames: 7098105856. Throughput: 0: 53791.8. Samples: 1588612400. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:14:34,107][52031] Avg episode reward: [(0, '0.586')] [2024-04-27 17:14:35,680][52263] Updated weights for policy 0, policy_version 433239 (0.0026) [2024-04-27 17:14:38,925][52263] Updated weights for policy 0, policy_version 433249 (0.0028) [2024-04-27 17:14:39,107][52031] Fps is (10 sec: 52428.0, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 7098368000. Throughput: 0: 53874.2. Samples: 1588934100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:14:39,107][52031] Avg episode reward: [(0, '0.528')] [2024-04-27 17:14:41,601][52263] Updated weights for policy 0, policy_version 433259 (0.0032) [2024-04-27 17:14:44,107][52031] Fps is (10 sec: 50790.2, 60 sec: 53520.9, 300 sec: 53650.6). Total num frames: 7098613760. Throughput: 0: 53589.2. Samples: 1589089620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:14:44,107][52031] Avg episode reward: [(0, '0.656')] [2024-04-27 17:14:44,854][52263] Updated weights for policy 0, policy_version 433269 (0.0029) [2024-04-27 17:14:47,809][52263] Updated weights for policy 0, policy_version 433279 (0.0038) [2024-04-27 17:14:49,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 7098892288. Throughput: 0: 53576.0. Samples: 1589409560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:14:49,107][52031] Avg episode reward: [(0, '0.592')] [2024-04-27 17:14:50,935][52263] Updated weights for policy 0, policy_version 433289 (0.0026) [2024-04-27 17:14:53,877][52263] Updated weights for policy 0, policy_version 433299 (0.0031) [2024-04-27 17:14:54,107][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 7099170816. Throughput: 0: 53568.9. Samples: 1589731660. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:14:54,107][52031] Avg episode reward: [(0, '0.583')] [2024-04-27 17:14:57,105][52263] Updated weights for policy 0, policy_version 433309 (0.0025) [2024-04-27 17:14:57,349][52242] Signal inference workers to stop experience collection... (24400 times) [2024-04-27 17:14:57,395][52263] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-04-27 17:14:57,407][52242] Signal inference workers to resume experience collection... (24400 times) [2024-04-27 17:14:57,416][52263] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-04-27 17:14:59,107][52031] Fps is (10 sec: 55705.3, 60 sec: 53794.0, 300 sec: 53706.2). Total num frames: 7099449344. Throughput: 0: 53919.9. Samples: 1589900940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:14:59,107][52031] Avg episode reward: [(0, '0.627')] [2024-04-27 17:15:00,012][52263] Updated weights for policy 0, policy_version 433319 (0.0040) [2024-04-27 17:15:03,364][52263] Updated weights for policy 0, policy_version 433329 (0.0034) [2024-04-27 17:15:04,106][52031] Fps is (10 sec: 55706.3, 60 sec: 54067.3, 300 sec: 53706.2). Total num frames: 7099727872. Throughput: 0: 53943.1. Samples: 1590223440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:15:04,107][52031] Avg episode reward: [(0, '0.542')] [2024-04-27 17:15:06,049][52263] Updated weights for policy 0, policy_version 433339 (0.0030) [2024-04-27 17:15:09,106][52031] Fps is (10 sec: 52429.3, 60 sec: 53794.3, 300 sec: 53706.2). Total num frames: 7099973632. Throughput: 0: 53853.5. Samples: 1590544760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:15:09,107][52031] Avg episode reward: [(0, '0.574')] [2024-04-27 17:15:09,326][52263] Updated weights for policy 0, policy_version 433349 (0.0040) [2024-04-27 17:15:12,084][52263] Updated weights for policy 0, policy_version 433359 (0.0031) [2024-04-27 17:15:14,107][52031] Fps is (10 sec: 49151.5, 60 sec: 53521.0, 300 sec: 53650.6). Total num frames: 7100219392. Throughput: 0: 53421.7. Samples: 1590695080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:15:14,107][52031] Avg episode reward: [(0, '0.600')] [2024-04-27 17:15:15,435][52263] Updated weights for policy 0, policy_version 433369 (0.0030) [2024-04-27 17:15:18,408][52263] Updated weights for policy 0, policy_version 433379 (0.0026) [2024-04-27 17:15:19,106][52031] Fps is (10 sec: 54066.7, 60 sec: 54067.1, 300 sec: 53761.7). Total num frames: 7100514304. Throughput: 0: 53379.6. Samples: 1591014480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:15:19,107][52031] Avg episode reward: [(0, '0.609')] [2024-04-27 17:15:21,513][52263] Updated weights for policy 0, policy_version 433389 (0.0029) [2024-04-27 17:15:24,106][52031] Fps is (10 sec: 55705.7, 60 sec: 53521.2, 300 sec: 53706.2). Total num frames: 7100776448. Throughput: 0: 53358.3. Samples: 1591335220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:15:24,107][52031] Avg episode reward: [(0, '0.633')] [2024-04-27 17:15:24,387][52263] Updated weights for policy 0, policy_version 433399 (0.0032) [2024-04-27 17:15:27,602][52263] Updated weights for policy 0, policy_version 433409 (0.0031) [2024-04-27 17:15:29,106][52031] Fps is (10 sec: 54067.7, 60 sec: 53521.1, 300 sec: 53650.7). Total num frames: 7101054976. Throughput: 0: 53703.3. Samples: 1591506260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 17:15:29,107][52031] Avg episode reward: [(0, '0.618')] [2024-04-27 17:17:51,541][54587] Saving configuration to /workspace/metta/train_dir/p2.fa.clean/config.json... [2024-04-27 17:17:51,549][54587] Rollout worker 0 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 1 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 2 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 3 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 4 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 5 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 6 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 7 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 8 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 9 uses device cpu [2024-04-27 17:17:51,550][54587] Rollout worker 10 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 11 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 12 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 13 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 14 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 15 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 16 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 17 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 18 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 19 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 20 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 21 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 22 uses device cpu [2024-04-27 17:17:51,551][54587] Rollout worker 23 uses device cpu [2024-04-27 17:17:51,552][54587] Rollout worker 24 uses device cpu [2024-04-27 17:17:51,552][54587] Rollout worker 25 uses device cpu [2024-04-27 17:17:51,552][54587] Rollout 
worker 26 uses device cpu [2024-04-27 17:17:51,552][54587] Rollout worker 27 uses device cpu [2024-04-27 17:17:51,552][54587] Rollout worker 28 uses device cpu [2024-04-27 17:17:51,552][54587] Rollout worker 29 uses device cpu [2024-04-27 17:17:51,552][54587] Rollout worker 30 uses device cpu [2024-04-27 17:17:51,552][54587] Rollout worker 31 uses device cpu [2024-04-27 17:17:52,095][54587] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 17:17:52,095][54587] InferenceWorker_p0-w0: min num requests: 10 [2024-04-27 17:17:52,141][54587] Starting all processes... [2024-04-27 17:17:52,142][54587] Starting process learner_proc0 [2024-04-27 17:17:52,196][54587] Starting all processes... [2024-04-27 17:17:52,199][54587] Starting process inference_proc0-0 [2024-04-27 17:17:52,200][54587] Starting process rollout_proc6 [2024-04-27 17:17:52,199][54587] Starting process rollout_proc1 [2024-04-27 17:17:52,199][54587] Starting process rollout_proc2 [2024-04-27 17:17:52,199][54587] Starting process rollout_proc3 [2024-04-27 17:17:52,199][54587] Starting process rollout_proc4 [2024-04-27 17:17:52,200][54587] Starting process rollout_proc5 [2024-04-27 17:17:52,199][54587] Starting process rollout_proc0 [2024-04-27 17:17:52,200][54587] Starting process rollout_proc7 [2024-04-27 17:17:52,200][54587] Starting process rollout_proc8 [2024-04-27 17:17:52,201][54587] Starting process rollout_proc9 [2024-04-27 17:17:52,202][54587] Starting process rollout_proc10 [2024-04-27 17:17:52,202][54587] Starting process rollout_proc11 [2024-04-27 17:17:52,202][54587] Starting process rollout_proc12 [2024-04-27 17:17:52,203][54587] Starting process rollout_proc13 [2024-04-27 17:17:52,203][54587] Starting process rollout_proc14 [2024-04-27 17:17:52,203][54587] Starting process rollout_proc15 [2024-04-27 17:17:52,203][54587] Starting process rollout_proc16 [2024-04-27 17:17:52,204][54587] Starting process rollout_proc17 [2024-04-27 17:17:52,205][54587] Starting process 
rollout_proc18 [2024-04-27 17:17:52,206][54587] Starting process rollout_proc19 [2024-04-27 17:17:52,207][54587] Starting process rollout_proc20 [2024-04-27 17:17:52,212][54587] Starting process rollout_proc21 [2024-04-27 17:17:52,213][54587] Starting process rollout_proc22 [2024-04-27 17:17:52,214][54587] Starting process rollout_proc23 [2024-04-27 17:17:52,215][54587] Starting process rollout_proc24 [2024-04-27 17:17:52,215][54587] Starting process rollout_proc25 [2024-04-27 17:17:52,218][54587] Starting process rollout_proc26 [2024-04-27 17:17:52,220][54587] Starting process rollout_proc27 [2024-04-27 17:17:52,222][54587] Starting process rollout_proc28 [2024-04-27 17:17:52,228][54587] Starting process rollout_proc29 [2024-04-27 17:17:52,228][54587] Starting process rollout_proc30 [2024-04-27 17:17:52,230][54587] Starting process rollout_proc31 [2024-04-27 17:17:55,667][54820] Worker 1 uses CPU cores [1] [2024-04-27 17:17:55,834][54848] Worker 30 uses CPU cores [30] [2024-04-27 17:17:55,854][54823] Worker 4 uses CPU cores [4] [2024-04-27 17:17:55,870][54818] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 17:17:55,870][54818] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-04-27 17:17:55,873][54798] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 17:17:55,873][54798] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-04-27 17:17:55,883][54798] Num visible devices: 1 [2024-04-27 17:17:55,883][54818] Num visible devices: 1 [2024-04-27 17:17:55,890][54831] Worker 12 uses CPU cores [12] [2024-04-27 17:17:55,898][54837] Worker 20 uses CPU cores [20] [2024-04-27 17:17:55,946][54846] Worker 28 uses CPU cores [28] [2024-04-27 17:17:55,953][54830] Worker 11 uses CPU cores [11] [2024-04-27 17:17:55,967][54798] Starting seed is not provided [2024-04-27 17:17:55,967][54798] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 
17:17:55,968][54798] Initializing actor-critic model on device cuda:0 [2024-04-27 17:17:55,969][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,982][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,982][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,983][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,983][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,983][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,983][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,983][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,983][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,984][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,984][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,984][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,984][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,984][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,984][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,985][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,985][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,985][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,985][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,985][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,985][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,986][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,986][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,986][54798] RunningMeanStd input shape: (1,) [2024-04-27 17:17:55,988][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,988][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,988][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,988][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,989][54798] RunningMeanStd input shape: (11, 
11) [2024-04-27 17:17:55,989][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,989][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,989][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,989][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,989][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,990][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,990][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,990][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,990][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,990][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,991][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,991][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,991][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,991][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,991][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,991][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,992][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:55,992][54798] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,027][54836] Worker 15 uses CPU cores [15] [2024-04-27 17:17:56,054][54832] Worker 14 uses CPU cores [14] [2024-04-27 17:17:56,071][54798] Created Actor Critic model with architecture: [2024-04-27 17:17:56,071][54798] PredictingActorCritic( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): FeatureEncoder( (_global_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (conf:agent:energy:initial): RunningMeanStdInPlace() (conf:agent:energy:max): RunningMeanStdInPlace() (conf:agent:energy:regen): RunningMeanStdInPlace() (conf:altar:cooldown): RunningMeanStdInPlace() (conf:altar:cost): RunningMeanStdInPlace() 
(conf:attack:damage): RunningMeanStdInPlace() (conf:attack:freeze_duration): RunningMeanStdInPlace() (conf:charger:cooldown): RunningMeanStdInPlace() (conf:charger:energy): RunningMeanStdInPlace() (conf:cost:attack): RunningMeanStdInPlace() (conf:cost:frozen): RunningMeanStdInPlace() (conf:cost:jump): RunningMeanStdInPlace() (conf:cost:move): RunningMeanStdInPlace() (conf:cost:move:predator): RunningMeanStdInPlace() (conf:cost:move:prey): RunningMeanStdInPlace() (conf:cost:rotate): RunningMeanStdInPlace() (conf:cost:shield): RunningMeanStdInPlace() (conf:cost:shield:upkeep): RunningMeanStdInPlace() (conf:generator:cooldown): RunningMeanStdInPlace() (conf:gift:energy): RunningMeanStdInPlace() (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() (last_reward): RunningMeanStdInPlace() ) ) (_grid_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (charger): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv:1): RunningMeanStdInPlace() (agent:inv:2): RunningMeanStdInPlace() (agent:inv:3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (agent:species): RunningMeanStdInPlace() (altar:ready): RunningMeanStdInPlace() (charger:bonus): RunningMeanStdInPlace() (charger:input:1): RunningMeanStdInPlace() (charger:input:2): RunningMeanStdInPlace() (charger:input:3): RunningMeanStdInPlace() (charger:output): RunningMeanStdInPlace() (charger:ready): RunningMeanStdInPlace() (generator:ready): RunningMeanStdInPlace() (generator:resource): RunningMeanStdInPlace() ) ) (_global_encoder): FeatureListEncoder( (_embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, 
bias=True) (3): ELU(alpha=1.0) ) ) (_grid_encoder): FeatureListEncoder( (_embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (encoder_head): Sequential( (0): Linear(in_features=520, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): FeatureDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=17, bias=True) ) ) [2024-04-27 17:17:56,112][54845] Worker 25 uses CPU cores [25] [2024-04-27 17:17:56,130][54824] Worker 8 uses CPU cores [8] [2024-04-27 17:17:56,134][54839] Worker 19 uses CPU cores [19] [2024-04-27 17:17:56,138][54842] Worker 23 uses CPU cores [23] [2024-04-27 17:17:56,142][54825] Worker 9 uses CPU cores [9] [2024-04-27 17:17:56,146][54829] Worker 10 uses CPU cores [10] [2024-04-27 17:17:56,154][54834] Worker 16 uses CPU cores [16] [2024-04-27 17:17:56,154][54828] Worker 0 uses CPU cores [0] [2024-04-27 17:17:56,160][54819] Worker 6 uses CPU cores [6] [2024-04-27 17:17:56,162][54844] Worker 26 uses CPU cores [26] [2024-04-27 17:17:56,172][54826] Worker 7 uses CPU cores [7] [2024-04-27 17:17:56,182][54841] Worker 22 uses CPU cores [22] [2024-04-27 17:17:56,189][54835] Worker 17 uses CPU cores [17] [2024-04-27 17:17:56,189][54833] Worker 13 uses CPU cores [13] [2024-04-27 17:17:56,192][54843] Worker 24 uses CPU cores [24] [2024-04-27 17:17:56,226][54827] Worker 5 uses CPU cores [5] [2024-04-27 
17:17:56,229][54850] Worker 31 uses CPU cores [31] [2024-04-27 17:17:56,264][54847] Worker 27 uses CPU cores [27] [2024-04-27 17:17:56,279][54798] Using optimizer [2024-04-27 17:17:56,285][54821] Worker 2 uses CPU cores [2] [2024-04-27 17:17:56,295][54822] Worker 3 uses CPU cores [3] [2024-04-27 17:17:56,423][54838] Worker 18 uses CPU cores [18] [2024-04-27 17:17:56,442][54798] Loading state from checkpoint /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000433038_7094894592.pth... [2024-04-27 17:17:56,462][54798] Loading model from checkpoint [2024-04-27 17:17:56,464][54798] Loaded experiment state at self.train_step=433038, self.env_steps=7094894592 [2024-04-27 17:17:56,464][54798] Initialized policy 0 weights for model version 433038 [2024-04-27 17:17:56,466][54798] LearnerWorker_p0 finished initialization! [2024-04-27 17:17:56,466][54798] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-27 17:17:56,538][54849] Worker 29 uses CPU cores [29] [2024-04-27 17:17:56,550][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,556][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 
17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,557][54818] RunningMeanStd input shape: (1,) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,558][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,559][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,559][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,559][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,559][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,559][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,559][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 
17:17:56,559][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,559][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,559][54818] RunningMeanStd input shape: (11, 11) [2024-04-27 17:17:56,571][54840] Worker 21 uses CPU cores [21] [2024-04-27 17:17:56,618][54587] Inference worker 0-0 is ready! [2024-04-27 17:17:56,619][54587] All inference workers are ready! Signal rollout workers to start! [2024-04-27 17:17:57,369][54820] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,378][54825] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,379][54821] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,381][54824] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,381][54828] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,384][54827] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,394][54836] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,403][54832] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,405][54830] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,408][54819] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,408][54826] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,411][54829] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,412][54831] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,413][54822] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,414][54833] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,414][54823] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,418][54850] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,422][54834] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,424][54839] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,427][54846] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,427][54841] Decorrelating experience for 0 frames... 
[2024-04-27 17:17:57,428][54835] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,429][54843] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,429][54848] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,431][54844] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,432][54842] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,432][54845] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,433][54837] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,439][54847] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,506][54838] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,601][54840] Decorrelating experience for 0 frames... [2024-04-27 17:17:57,665][54849] Decorrelating experience for 0 frames... [2024-04-27 17:17:58,064][54820] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,069][54821] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,075][54827] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,093][54828] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,093][54825] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,099][54824] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,116][54822] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,125][54836] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,134][54830] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,136][54832] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,151][54829] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,154][54831] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,157][54819] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,157][54826] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,159][54833] Decorrelating experience for 256 frames... 
[2024-04-27 17:17:58,161][54823] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,186][54850] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,188][54834] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,205][54839] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,212][54841] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,219][54835] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,223][54846] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,231][54844] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,231][54848] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,235][54843] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,237][54847] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,238][54837] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,238][54842] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,240][54845] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,283][54838] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,345][54840] Decorrelating experience for 256 frames... [2024-04-27 17:17:58,396][54849] Decorrelating experience for 256 frames... [2024-04-27 17:17:59,253][54587] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 7094894592. Throughput: 0: nan. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-04-27 17:18:02,929][54820] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-04-27 17:18:02,937][54822] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-04-27 17:18:02,947][54826] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-04-27 17:18:02,947][54825] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-04-27 17:18:02,954][54830] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-04-27 17:18:02,954][54833] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-04-27 17:18:02,970][54836] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-04-27 17:18:02,991][54821] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-04-27 17:18:02,991][54832] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-04-27 17:18:02,991][54827] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-04-27 17:18:02,991][54831] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-04-27 17:18:02,991][54824] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-04-27 17:18:03,002][54829] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-04-27 17:18:03,016][54841] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-04-27 17:18:03,055][54798] Signal inference workers to stop experience collection... [2024-04-27 17:18:03,057][54839] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-04-27 17:18:03,060][54818] InferenceWorker_p0-w0: stopping experience collection [2024-04-27 17:18:03,527][54798] Signal inference workers to resume experience collection... 
[2024-04-27 17:18:03,528][54818] InferenceWorker_p0-w0: resuming experience collection [2024-04-27 17:18:03,551][54834] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-04-27 17:18:03,555][54838] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-04-27 17:18:03,748][54837] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-04-27 17:18:03,752][54850] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-04-27 17:18:03,787][54835] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-04-27 17:18:03,850][54842] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-04-27 17:18:03,875][54845] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-04-27 17:18:03,891][54843] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-04-27 17:18:03,953][54844] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-04-27 17:18:03,958][54846] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-04-27 17:18:03,985][54847] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-04-27 17:18:03,995][54840] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-04-27 17:18:03,999][54848] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-04-27 17:18:04,051][54849] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-04-27 17:18:04,253][54587] Fps is (10 sec: 22937.7, 60 sec: 22937.7, 300 sec: 22937.7). Total num frames: 7095009280. Throughput: 0: 58000.4. Samples: 290000. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:04,512][54819] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-04-27 17:18:04,635][54818] Updated weights for policy 0, policy_version 433048 (0.0019) [2024-04-27 17:18:04,692][54823] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-04-27 17:18:07,640][54820] Worker 1 awakens! [2024-04-27 17:18:09,253][54587] Fps is (10 sec: 16383.8, 60 sec: 16383.8, 300 sec: 16383.8). Total num frames: 7095058432. Throughput: 0: 33413.6. Samples: 334140. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:12,091][54587] Heartbeat connected on Batcher_0 [2024-04-27 17:18:12,093][54587] Heartbeat connected on LearnerWorker_p0 [2024-04-27 17:18:12,111][54587] Heartbeat connected on RolloutWorker_w1 [2024-04-27 17:18:12,111][54587] Heartbeat connected on RolloutWorker_w0 [2024-04-27 17:18:12,168][54587] Heartbeat connected on InferenceWorker_p0-w0 [2024-04-27 17:18:12,413][54821] Worker 2 awakens! [2024-04-27 17:18:12,420][54587] Heartbeat connected on RolloutWorker_w2 [2024-04-27 17:18:14,253][54587] Fps is (10 sec: 6553.5, 60 sec: 12014.8, 300 sec: 12014.8). Total num frames: 7095074816. Throughput: 0: 22707.8. Samples: 340620. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:17,070][54822] Worker 3 awakens! [2024-04-27 17:18:17,082][54587] Heartbeat connected on RolloutWorker_w3 [2024-04-27 17:18:19,253][54587] Fps is (10 sec: 3276.8, 60 sec: 9830.3, 300 sec: 9830.3). Total num frames: 7095091200. Throughput: 0: 17963.9. Samples: 359280. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:23,446][54823] Worker 4 awakens! [2024-04-27 17:18:23,455][54587] Heartbeat connected on RolloutWorker_w4 [2024-04-27 17:18:24,253][54587] Fps is (10 sec: 4915.3, 60 sec: 9175.0, 300 sec: 9175.0). Total num frames: 7095123968. Throughput: 0: 15420.8. Samples: 385520. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:24,253][54587] Avg episode reward: [(0, '0.364')] [2024-04-27 17:18:26,489][54827] Worker 5 awakens! [2024-04-27 17:18:26,495][54587] Heartbeat connected on RolloutWorker_w5 [2024-04-27 17:18:29,117][54818] Updated weights for policy 0, policy_version 433058 (0.0018) [2024-04-27 17:18:29,253][54587] Fps is (10 sec: 13107.4, 60 sec: 10922.7, 300 sec: 10922.7). Total num frames: 7095222272. Throughput: 0: 14683.3. Samples: 440500. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:29,254][54587] Avg episode reward: [(0, '0.478')] [2024-04-27 17:18:32,734][54819] Worker 6 awakens! [2024-04-27 17:18:32,738][54587] Heartbeat connected on RolloutWorker_w6 [2024-04-27 17:18:34,253][54587] Fps is (10 sec: 19660.8, 60 sec: 12171.0, 300 sec: 12171.0). Total num frames: 7095320576. Throughput: 0: 16000.0. Samples: 560000. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:34,253][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 17:18:35,860][54826] Worker 7 awakens! [2024-04-27 17:18:35,864][54587] Heartbeat connected on RolloutWorker_w7 [2024-04-27 17:18:35,976][54818] Updated weights for policy 0, policy_version 433068 (0.0013) [2024-04-27 17:18:39,253][54587] Fps is (10 sec: 24575.8, 60 sec: 14336.0, 300 sec: 14336.0). Total num frames: 7095468032. Throughput: 0: 18023.0. Samples: 720920. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:39,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 17:18:40,598][54824] Worker 8 awakens! [2024-04-27 17:18:40,602][54587] Heartbeat connected on RolloutWorker_w8 [2024-04-27 17:18:41,747][54818] Updated weights for policy 0, policy_version 433078 (0.0013) [2024-04-27 17:18:44,253][54587] Fps is (10 sec: 27852.6, 60 sec: 15655.8, 300 sec: 15655.8). Total num frames: 7095599104. Throughput: 0: 17920.0. Samples: 806400. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:44,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 17:18:45,212][54825] Worker 9 awakens! [2024-04-27 17:18:45,219][54587] Heartbeat connected on RolloutWorker_w9 [2024-04-27 17:18:47,300][54818] Updated weights for policy 0, policy_version 433088 (0.0013) [2024-04-27 17:18:49,253][54587] Fps is (10 sec: 29491.2, 60 sec: 17367.0, 300 sec: 17367.0). Total num frames: 7095762944. Throughput: 0: 15685.7. Samples: 995860. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:49,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 17:18:49,974][54829] Worker 10 awakens! [2024-04-27 17:18:49,979][54587] Heartbeat connected on RolloutWorker_w10 [2024-04-27 17:18:52,036][54818] Updated weights for policy 0, policy_version 433098 (0.0013) [2024-04-27 17:18:54,253][54587] Fps is (10 sec: 37683.1, 60 sec: 19660.8, 300 sec: 19660.8). Total num frames: 7095975936. Throughput: 0: 19889.3. Samples: 1229160. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:54,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 17:18:54,614][54830] Worker 11 awakens! [2024-04-27 17:18:54,621][54587] Heartbeat connected on RolloutWorker_w11 [2024-04-27 17:18:55,586][54818] Updated weights for policy 0, policy_version 433108 (0.0013) [2024-04-27 17:18:59,116][54818] Updated weights for policy 0, policy_version 433118 (0.0014) [2024-04-27 17:18:59,253][54587] Fps is (10 sec: 44237.0, 60 sec: 21845.3, 300 sec: 21845.3). Total num frames: 7096205312. Throughput: 0: 22545.8. Samples: 1355180. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:18:59,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 17:18:59,344][54831] Worker 12 awakens! [2024-04-27 17:18:59,349][54587] Heartbeat connected on RolloutWorker_w12 [2024-04-27 17:19:02,769][54818] Updated weights for policy 0, policy_version 433128 (0.0018) [2024-04-27 17:19:03,990][54833] Worker 13 awakens! 
[2024-04-27 17:19:03,997][54587] Heartbeat connected on RolloutWorker_w13 [2024-04-27 17:19:04,253][54587] Fps is (10 sec: 42598.7, 60 sec: 23210.6, 300 sec: 23189.7). Total num frames: 7096401920. Throughput: 0: 27917.9. Samples: 1615580. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:19:04,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 17:19:07,294][54818] Updated weights for policy 0, policy_version 433138 (0.0021) [2024-04-27 17:19:08,714][54832] Worker 14 awakens! [2024-04-27 17:19:08,720][54587] Heartbeat connected on RolloutWorker_w14 [2024-04-27 17:19:09,253][54587] Fps is (10 sec: 42598.1, 60 sec: 26214.4, 300 sec: 24810.0). Total num frames: 7096631296. Throughput: 0: 33182.1. Samples: 1878720. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:19:09,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 17:19:11,218][54818] Updated weights for policy 0, policy_version 433148 (0.0018) [2024-04-27 17:19:13,383][54836] Worker 15 awakens! [2024-04-27 17:19:13,390][54587] Heartbeat connected on RolloutWorker_w15 [2024-04-27 17:19:14,253][54587] Fps is (10 sec: 42598.0, 60 sec: 29218.1, 300 sec: 25777.5). Total num frames: 7096827904. Throughput: 0: 34538.1. Samples: 1994720. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:19:14,254][54587] Avg episode reward: [(0, '0.697')] [2024-04-27 17:19:14,991][54818] Updated weights for policy 0, policy_version 433158 (0.0024) [2024-04-27 17:19:18,650][54834] Worker 16 awakens! [2024-04-27 17:19:18,659][54587] Heartbeat connected on RolloutWorker_w16 [2024-04-27 17:19:18,833][54818] Updated weights for policy 0, policy_version 433168 (0.0022) [2024-04-27 17:19:19,253][54587] Fps is (10 sec: 40960.5, 60 sec: 32495.0, 300 sec: 26828.8). Total num frames: 7097040896. Throughput: 0: 37503.5. Samples: 2247660. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:19:19,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 17:19:22,768][54818] Updated weights for policy 0, policy_version 433178 (0.0023) [2024-04-27 17:19:23,574][54835] Worker 17 awakens! [2024-04-27 17:19:23,584][54587] Heartbeat connected on RolloutWorker_w17 [2024-04-27 17:19:24,253][54587] Fps is (10 sec: 44236.9, 60 sec: 35771.7, 300 sec: 27949.1). Total num frames: 7097270272. Throughput: 0: 39697.3. Samples: 2507300. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-27 17:19:24,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 17:19:26,417][54818] Updated weights for policy 0, policy_version 433188 (0.0022) [2024-04-27 17:19:28,030][54838] Worker 18 awakens! [2024-04-27 17:19:28,040][54587] Heartbeat connected on RolloutWorker_w18 [2024-04-27 17:19:29,253][54587] Fps is (10 sec: 45874.4, 60 sec: 37956.2, 300 sec: 28945.0). Total num frames: 7097499648. Throughput: 0: 40918.6. Samples: 2647740. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:19:29,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 17:19:29,881][54818] Updated weights for policy 0, policy_version 433198 (0.0023) [2024-04-27 17:19:32,144][54839] Worker 19 awakens! [2024-04-27 17:19:32,153][54587] Heartbeat connected on RolloutWorker_w19 [2024-04-27 17:19:32,966][54818] Updated weights for policy 0, policy_version 433208 (0.0032) [2024-04-27 17:19:34,253][54587] Fps is (10 sec: 44236.8, 60 sec: 39867.6, 300 sec: 29663.6). Total num frames: 7097712640. Throughput: 0: 42944.0. Samples: 2928340. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:19:34,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 17:19:36,420][54818] Updated weights for policy 0, policy_version 433218 (0.0023) [2024-04-27 17:19:37,567][54837] Worker 20 awakens! 
[2024-04-27 17:19:37,577][54587] Heartbeat connected on RolloutWorker_w20 [2024-04-27 17:19:39,253][54587] Fps is (10 sec: 45875.6, 60 sec: 41506.1, 300 sec: 30638.1). Total num frames: 7097958400. Throughput: 0: 44206.3. Samples: 3218440. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:19:39,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 17:19:39,753][54818] Updated weights for policy 0, policy_version 433228 (0.0024) [2024-04-27 17:19:42,529][54840] Worker 21 awakens! [2024-04-27 17:19:42,540][54587] Heartbeat connected on RolloutWorker_w21 [2024-04-27 17:19:43,480][54818] Updated weights for policy 0, policy_version 433238 (0.0025) [2024-04-27 17:19:44,253][54587] Fps is (10 sec: 50790.0, 60 sec: 43690.6, 300 sec: 31675.7). Total num frames: 7098220544. Throughput: 0: 44566.5. Samples: 3360680. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:19:44,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 17:19:46,242][54841] Worker 22 awakens! [2024-04-27 17:19:46,252][54587] Heartbeat connected on RolloutWorker_w22 [2024-04-27 17:19:46,306][54818] Updated weights for policy 0, policy_version 433248 (0.0025) [2024-04-27 17:19:49,254][54587] Fps is (10 sec: 49150.9, 60 sec: 44782.8, 300 sec: 32321.1). Total num frames: 7098449920. Throughput: 0: 45498.4. Samples: 3663020. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:19:49,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 17:19:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000433255_7098449920.pth... [2024-04-27 17:19:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000432646_7088472064.pth [2024-04-27 17:19:49,844][54818] Updated weights for policy 0, policy_version 433258 (0.0031) [2024-04-27 17:19:51,762][54842] Worker 23 awakens! 
[2024-04-27 17:19:51,774][54587] Heartbeat connected on RolloutWorker_w23 [2024-04-27 17:20:32,141][54587] Fps is (10 sec: 8895.6, 60 sec: 27282.4, 300 sec: 24540.5). Total num frames: 7098646528. Throughput: 0: 23299.9. Samples: 3809980. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:32,142][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:20:32,152][54587] Fps is (10 sec: 4583.1, 60 sec: 26278.3, 300 sec: 24538.7). Total num frames: 7098646528. Throughput: 0: 23302.9. Samples: 3809980. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:32,152][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:20:32,161][54587] Fps is (10 sec: 0.0, 60 sec: 25533.7, 300 sec: 24537.2). Total num frames: 7098646528. Throughput: 0: 21428.7. Samples: 3809980. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:32,161][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:20:32,166][54587] Fps is (10 sec: 0.0, 60 sec: 24305.5, 300 sec: 24536.5). Total num frames: 7098646528. Throughput: 0: 20694.9. Samples: 3912740. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:32,166][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:20:32,170][54587] Fps is (10 sec: 0.0, 60 sec: 23340.6, 300 sec: 24535.8). Total num frames: 7098646528. Throughput: 0: 20105.9. Samples: 3912740. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:32,170][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:20:32,171][54587] Fps is (10 sec: 0.0, 60 sec: 22019.9, 300 sec: 24535.7). Total num frames: 7098646528. Throughput: 0: 16996.7. Samples: 3912740. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:32,171][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:20:32,171][54587] Fps is (10 sec: 0.0, 60 sec: 20263.6, 300 sec: 24535.6). Total num frames: 7098646528. Throughput: 0: 13120.4. Samples: 3912740. 
Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:32,171][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:20:32,171][54587] Fps is (10 sec: 0.0, 60 sec: 18228.2, 300 sec: 24535.6). Total num frames: 7098646528. Throughput: 0: 11521.0. Samples: 3912740. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:32,171][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:20:32,488][54818] Updated weights for policy 0, policy_version 433268 (0.0031) [2024-04-27 17:20:34,253][54587] Fps is (10 sec: 39338.5, 60 sec: 16930.1, 300 sec: 24734.5). Total num frames: 7098728448. Throughput: 0: 6060.9. Samples: 3935760. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:34,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 17:20:35,831][54843] Worker 24 awakens! [2024-04-27 17:20:35,843][54587] Heartbeat connected on RolloutWorker_w24 [2024-04-27 17:20:35,989][54818] Updated weights for policy 0, policy_version 433278 (0.0026) [2024-04-27 17:20:39,253][54587] Fps is (10 sec: 46269.0, 60 sec: 16930.1, 300 sec: 25497.6). Total num frames: 7098974208. Throughput: 0: 60588.1. Samples: 4240940. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:39,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 17:20:39,496][54818] Updated weights for policy 0, policy_version 433288 (0.0027) [2024-04-27 17:20:40,517][54845] Worker 25 awakens! [2024-04-27 17:20:40,529][54587] Heartbeat connected on RolloutWorker_w25 [2024-04-27 17:20:41,882][54818] Updated weights for policy 0, policy_version 433298 (0.0022) [2024-04-27 17:20:44,253][54587] Fps is (10 sec: 50790.6, 60 sec: 16930.1, 300 sec: 26313.7). Total num frames: 7099236352. Throughput: 0: 48322.6. Samples: 4394760. Policy #0 lag: (min: 0.0, avg: 4.3, max: 12.0) [2024-04-27 17:20:44,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 17:20:45,316][54844] Worker 26 awakens! 
[2024-04-27 17:20:45,327][54587] Heartbeat connected on RolloutWorker_w26 [2024-04-27 17:20:45,493][54818] Updated weights for policy 0, policy_version 433308 (0.0035) [2024-04-27 17:20:48,580][54818] Updated weights for policy 0, policy_version 433318 (0.0028) [2024-04-27 17:20:49,253][54587] Fps is (10 sec: 52429.1, 60 sec: 17476.3, 300 sec: 27081.8). Total num frames: 7099498496. Throughput: 0: 52465.9. Samples: 4706740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:20:49,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 17:20:49,262][54587] Components not started: RolloutWorker_w27, RolloutWorker_w28, RolloutWorker_w29, RolloutWorker_w30, RolloutWorker_w31, wait_time=181.9 seconds [2024-04-27 17:20:50,017][54847] Worker 27 awakens! [2024-04-27 17:20:50,028][54587] Heartbeat connected on RolloutWorker_w27 [2024-04-27 17:20:51,467][54818] Updated weights for policy 0, policy_version 433328 (0.0024) [2024-04-27 17:20:54,254][54587] Fps is (10 sec: 54066.7, 60 sec: 51123.5, 300 sec: 27899.6). Total num frames: 7099777024. Throughput: 0: 50666.2. Samples: 5031840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:20:54,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 17:20:54,725][54818] Updated weights for policy 0, policy_version 433338 (0.0032) [2024-04-27 17:20:54,726][54846] Worker 28 awakens! [2024-04-27 17:20:54,739][54587] Heartbeat connected on RolloutWorker_w28 [2024-04-27 17:20:57,575][54818] Updated weights for policy 0, policy_version 433348 (0.0030) [2024-04-27 17:20:59,253][54587] Fps is (10 sec: 54067.3, 60 sec: 51386.1, 300 sec: 28581.0). Total num frames: 7100039168. Throughput: 0: 47127.5. Samples: 5189100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:20:59,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 17:20:59,502][54849] Worker 29 awakens! 
[2024-04-27 17:20:59,516][54587] Heartbeat connected on RolloutWorker_w29 [2024-04-27 17:21:00,897][54818] Updated weights for policy 0, policy_version 433358 (0.0029) [2024-04-27 17:21:02,823][54798] Signal inference workers to stop experience collection... (50 times) [2024-04-27 17:21:02,852][54818] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-04-27 17:21:02,880][54798] Signal inference workers to resume experience collection... (50 times) [2024-04-27 17:21:02,883][54818] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-04-27 17:21:03,401][54818] Updated weights for policy 0, policy_version 433368 (0.0028) [2024-04-27 17:21:04,142][54848] Worker 30 awakens! [2024-04-27 17:21:04,156][54587] Heartbeat connected on RolloutWorker_w30 [2024-04-27 17:21:04,254][54587] Fps is (10 sec: 54067.1, 60 sec: 52073.6, 300 sec: 29314.0). Total num frames: 7100317696. Throughput: 0: 49946.9. Samples: 5515180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:04,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 17:21:06,778][54818] Updated weights for policy 0, policy_version 433378 (0.0025) [2024-04-27 17:21:08,586][54850] Worker 31 awakens! [2024-04-27 17:21:08,598][54587] Heartbeat connected on RolloutWorker_w31 [2024-04-27 17:21:09,253][54587] Fps is (10 sec: 57344.2, 60 sec: 53012.0, 300 sec: 30094.8). Total num frames: 7100612608. Throughput: 0: 51991.9. Samples: 5840720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:09,254][54587] Avg episode reward: [(0, '0.476')] [2024-04-27 17:21:09,321][54818] Updated weights for policy 0, policy_version 433388 (0.0023) [2024-04-27 17:21:12,875][54818] Updated weights for policy 0, policy_version 433398 (0.0029) [2024-04-27 17:21:14,253][54587] Fps is (10 sec: 55706.8, 60 sec: 52948.2, 300 sec: 30667.5). Total num frames: 7100874752. Throughput: 0: 49888.4. Samples: 6012140. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:14,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 17:21:15,225][54818] Updated weights for policy 0, policy_version 433408 (0.0034) [2024-04-27 17:21:18,900][54818] Updated weights for policy 0, policy_version 433418 (0.0031) [2024-04-27 17:21:19,253][54587] Fps is (10 sec: 52429.1, 60 sec: 52893.6, 300 sec: 31211.5). Total num frames: 7101136896. Throughput: 0: 53454.5. Samples: 6341200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:19,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 17:21:21,059][54818] Updated weights for policy 0, policy_version 433428 (0.0029) [2024-04-27 17:21:24,253][54587] Fps is (10 sec: 54067.1, 60 sec: 53163.9, 300 sec: 31808.9). Total num frames: 7101415424. Throughput: 0: 54113.9. Samples: 6676060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:24,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 17:21:24,754][54818] Updated weights for policy 0, policy_version 433438 (0.0025) [2024-04-27 17:21:27,028][54818] Updated weights for policy 0, policy_version 433448 (0.0030) [2024-04-27 17:21:29,253][54587] Fps is (10 sec: 55705.5, 60 sec: 53386.9, 300 sec: 32377.9). Total num frames: 7101693952. Throughput: 0: 54164.2. Samples: 6832140. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:29,253][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 17:21:30,604][54818] Updated weights for policy 0, policy_version 433458 (0.0032) [2024-04-27 17:21:33,271][54818] Updated weights for policy 0, policy_version 433468 (0.0026) [2024-04-27 17:21:34,253][54587] Fps is (10 sec: 57343.5, 60 sec: 54340.3, 300 sec: 32996.6). Total num frames: 7101988864. Throughput: 0: 54663.9. Samples: 7166620. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:34,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 17:21:36,531][54818] Updated weights for policy 0, policy_version 433478 (0.0033) [2024-04-27 17:21:39,253][54587] Fps is (10 sec: 55705.3, 60 sec: 54613.4, 300 sec: 33438.2). Total num frames: 7102251008. Throughput: 0: 54846.4. Samples: 7499920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:39,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 17:21:39,525][54818] Updated weights for policy 0, policy_version 433488 (0.0033) [2024-04-27 17:21:42,485][54818] Updated weights for policy 0, policy_version 433498 (0.0029) [2024-04-27 17:21:44,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55159.6, 300 sec: 34005.9). Total num frames: 7102545920. Throughput: 0: 55171.6. Samples: 7671820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:44,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 17:21:45,438][54818] Updated weights for policy 0, policy_version 433508 (0.0033) [2024-04-27 17:21:48,268][54818] Updated weights for policy 0, policy_version 433518 (0.0032) [2024-04-27 17:21:49,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 34477.6). Total num frames: 7102824448. Throughput: 0: 55298.9. Samples: 8003620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:49,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 17:21:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000433522_7102824448.pth... [2024-04-27 17:21:49,310][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000433038_7094894592.pth [2024-04-27 17:21:51,208][54818] Updated weights for policy 0, policy_version 433528 (0.0027) [2024-04-27 17:21:54,066][54818] Updated weights for policy 0, policy_version 433538 (0.0028) [2024-04-27 17:21:54,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.6, 300 sec: 34859.6). 
Total num frames: 7103086592. Throughput: 0: 55495.5. Samples: 8338020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 17:21:57,005][54818] Updated weights for policy 0, policy_version 433548 (0.0031) [2024-04-27 17:21:59,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 35225.6). Total num frames: 7103348736. Throughput: 0: 55383.0. Samples: 8504380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:21:59,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 17:22:00,125][54818] Updated weights for policy 0, policy_version 433558 (0.0027) [2024-04-27 17:22:02,831][54818] Updated weights for policy 0, policy_version 433568 (0.0030) [2024-04-27 17:22:04,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55432.8, 300 sec: 35710.4). Total num frames: 7103643648. Throughput: 0: 55453.4. Samples: 8836600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:22:04,253][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 17:22:06,214][54818] Updated weights for policy 0, policy_version 433578 (0.0029) [2024-04-27 17:22:08,664][54818] Updated weights for policy 0, policy_version 433588 (0.0031) [2024-04-27 17:22:09,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 36110.3). Total num frames: 7103922176. Throughput: 0: 55275.4. Samples: 9163460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 18.0) [2024-04-27 17:22:09,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 17:22:11,944][54818] Updated weights for policy 0, policy_version 433598 (0.0030) [2024-04-27 17:22:14,253][54587] Fps is (10 sec: 54065.8, 60 sec: 55159.3, 300 sec: 36430.3). Total num frames: 7104184320. Throughput: 0: 55748.7. Samples: 9340840. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:14,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 17:22:14,526][54818] Updated weights for policy 0, policy_version 433608 (0.0030) [2024-04-27 17:22:17,893][54818] Updated weights for policy 0, policy_version 433618 (0.0028) [2024-04-27 17:22:18,229][54798] Signal inference workers to stop experience collection... (100 times) [2024-04-27 17:22:18,269][54818] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-04-27 17:22:18,324][54798] Signal inference workers to resume experience collection... (100 times) [2024-04-27 17:22:18,325][54818] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-04-27 17:22:19,253][54587] Fps is (10 sec: 57345.0, 60 sec: 55978.7, 300 sec: 36927.0). Total num frames: 7104495616. Throughput: 0: 55713.1. Samples: 9673700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:19,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 17:22:20,571][54818] Updated weights for policy 0, policy_version 433628 (0.0026) [2024-04-27 17:22:23,790][54818] Updated weights for policy 0, policy_version 433638 (0.0030) [2024-04-27 17:22:24,253][54587] Fps is (10 sec: 57345.6, 60 sec: 55705.7, 300 sec: 37219.5). Total num frames: 7104757760. Throughput: 0: 55683.3. Samples: 10005660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:24,253][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 17:22:26,570][54818] Updated weights for policy 0, policy_version 433648 (0.0032) [2024-04-27 17:22:29,253][54587] Fps is (10 sec: 52428.3, 60 sec: 55432.5, 300 sec: 37501.1). Total num frames: 7105019904. Throughput: 0: 55431.1. Samples: 10166220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:29,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 17:22:29,636][54818] Updated weights for policy 0, policy_version 433658 (0.0028) [2024-04-27 17:22:32,706][54818] Updated weights for policy 0, policy_version 433668 (0.0031) [2024-04-27 17:22:34,253][54587] Fps is (10 sec: 52428.0, 60 sec: 54886.5, 300 sec: 37772.6). Total num frames: 7105282048. Throughput: 0: 55466.3. Samples: 10499600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:34,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 17:22:35,500][54818] Updated weights for policy 0, policy_version 433678 (0.0029) [2024-04-27 17:22:38,565][54818] Updated weights for policy 0, policy_version 433688 (0.0027) [2024-04-27 17:22:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 38151.3). Total num frames: 7105576960. Throughput: 0: 55397.9. Samples: 10830920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:39,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 17:22:41,462][54818] Updated weights for policy 0, policy_version 433698 (0.0028) [2024-04-27 17:22:44,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55159.3, 300 sec: 38459.3). Total num frames: 7105855488. Throughput: 0: 55345.7. Samples: 10994940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:44,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 17:22:44,405][54818] Updated weights for policy 0, policy_version 433708 (0.0041) [2024-04-27 17:22:47,507][54818] Updated weights for policy 0, policy_version 433718 (0.0026) [2024-04-27 17:22:49,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.7, 300 sec: 38813.1). Total num frames: 7106150400. Throughput: 0: 55224.4. Samples: 11321700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:49,253][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 17:22:50,355][54818] Updated weights for policy 0, policy_version 433728 (0.0035) [2024-04-27 17:22:53,356][54818] Updated weights for policy 0, policy_version 433738 (0.0034) [2024-04-27 17:22:54,253][54587] Fps is (10 sec: 55706.6, 60 sec: 55432.6, 300 sec: 39043.9). Total num frames: 7106412544. Throughput: 0: 55249.5. Samples: 11649680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:54,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 17:22:56,363][54818] Updated weights for policy 0, policy_version 433748 (0.0029) [2024-04-27 17:22:59,253][54587] Fps is (10 sec: 52428.1, 60 sec: 55432.6, 300 sec: 39543.7). Total num frames: 7106674688. Throughput: 0: 55077.0. Samples: 11819300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:22:59,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 17:22:59,329][54818] Updated weights for policy 0, policy_version 433758 (0.0030) [2024-04-27 17:23:02,365][54818] Updated weights for policy 0, policy_version 433768 (0.0028) [2024-04-27 17:23:04,253][54587] Fps is (10 sec: 52428.0, 60 sec: 54886.2, 300 sec: 40265.8). Total num frames: 7106936832. Throughput: 0: 54934.0. Samples: 12145740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:23:04,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 17:23:05,227][54818] Updated weights for policy 0, policy_version 433778 (0.0031) [2024-04-27 17:23:08,193][54818] Updated weights for policy 0, policy_version 433788 (0.0028) [2024-04-27 17:23:09,253][54587] Fps is (10 sec: 54067.2, 60 sec: 54886.5, 300 sec: 41154.4). Total num frames: 7107215360. Throughput: 0: 54954.0. Samples: 12478600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:23:09,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 17:23:10,735][54798] Signal inference workers to stop experience collection... 
(150 times) [2024-04-27 17:23:10,770][54818] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-04-27 17:23:10,784][54798] Signal inference workers to resume experience collection... (150 times) [2024-04-27 17:23:10,789][54818] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-04-27 17:23:11,073][54818] Updated weights for policy 0, policy_version 433798 (0.0026) [2024-04-27 17:23:14,181][54818] Updated weights for policy 0, policy_version 433808 (0.0027) [2024-04-27 17:23:14,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 42098.6). Total num frames: 7107510272. Throughput: 0: 54885.7. Samples: 12636080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:23:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 17:23:16,963][54818] Updated weights for policy 0, policy_version 433818 (0.0029) [2024-04-27 17:23:19,253][54587] Fps is (10 sec: 57343.6, 60 sec: 54886.3, 300 sec: 42931.6). Total num frames: 7107788800. Throughput: 0: 54845.2. Samples: 12967640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:23:19,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 17:23:20,124][54818] Updated weights for policy 0, policy_version 433828 (0.0027) [2024-04-27 17:23:22,764][54818] Updated weights for policy 0, policy_version 433838 (0.0027) [2024-04-27 17:23:24,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55159.4, 300 sec: 43542.6). Total num frames: 7108067328. Throughput: 0: 54897.4. Samples: 13301300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:23:24,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 17:23:26,022][54818] Updated weights for policy 0, policy_version 433848 (0.0027) [2024-04-27 17:23:28,776][54818] Updated weights for policy 0, policy_version 433858 (0.0028) [2024-04-27 17:23:29,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 44153.5). Total num frames: 7108345856. Throughput: 0: 55201.8. Samples: 13479020. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:23:29,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 17:23:31,769][54818] Updated weights for policy 0, policy_version 433868 (0.0029) [2024-04-27 17:23:34,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.7, 300 sec: 44542.3). Total num frames: 7108608000. Throughput: 0: 55332.9. Samples: 13811680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:23:34,253][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 17:23:34,820][54818] Updated weights for policy 0, policy_version 433878 (0.0029) [2024-04-27 17:23:37,797][54818] Updated weights for policy 0, policy_version 433888 (0.0029) [2024-04-27 17:23:39,253][54587] Fps is (10 sec: 52429.0, 60 sec: 54886.3, 300 sec: 44986.6). Total num frames: 7108870144. Throughput: 0: 55302.1. Samples: 14138280. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:23:39,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 17:23:40,773][54818] Updated weights for policy 0, policy_version 433898 (0.0033) [2024-04-27 17:23:43,874][54818] Updated weights for policy 0, policy_version 433908 (0.0037) [2024-04-27 17:23:44,253][54587] Fps is (10 sec: 54066.3, 60 sec: 54886.5, 300 sec: 45375.4). Total num frames: 7109148672. Throughput: 0: 55052.0. Samples: 14296640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:23:44,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 17:23:46,617][54818] Updated weights for policy 0, policy_version 433918 (0.0027) [2024-04-27 17:23:49,253][54587] Fps is (10 sec: 55705.1, 60 sec: 54613.1, 300 sec: 45597.5). Total num frames: 7109427200. Throughput: 0: 55190.2. Samples: 14629300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:23:49,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 17:23:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000433925_7109427200.pth... 
[2024-04-27 17:23:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000433255_7098449920.pth [2024-04-27 17:23:49,771][54818] Updated weights for policy 0, policy_version 433928 (0.0031) [2024-04-27 17:23:52,522][54818] Updated weights for policy 0, policy_version 433938 (0.0029) [2024-04-27 17:23:54,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55159.4, 300 sec: 45819.7). Total num frames: 7109722112. Throughput: 0: 55072.9. Samples: 14956880. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:23:54,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 17:23:55,994][54818] Updated weights for policy 0, policy_version 433948 (0.0032) [2024-04-27 17:23:58,558][54818] Updated weights for policy 0, policy_version 433958 (0.0033) [2024-04-27 17:23:59,253][54587] Fps is (10 sec: 55706.8, 60 sec: 55159.6, 300 sec: 46041.8). Total num frames: 7109984256. Throughput: 0: 55453.1. Samples: 15131460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:23:59,253][54587] Avg episode reward: [(0, '0.511')] [2024-04-27 17:24:02,070][54818] Updated weights for policy 0, policy_version 433968 (0.0031) [2024-04-27 17:24:04,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 46208.4). Total num frames: 7110262784. Throughput: 0: 55386.8. Samples: 15460040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:04,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 17:24:04,452][54818] Updated weights for policy 0, policy_version 433978 (0.0025) [2024-04-27 17:24:07,901][54818] Updated weights for policy 0, policy_version 433988 (0.0034) [2024-04-27 17:24:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 46486.2). Total num frames: 7110541312. Throughput: 0: 55359.1. Samples: 15792460. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:09,253][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 17:24:10,493][54818] Updated weights for policy 0, policy_version 433998 (0.0035) [2024-04-27 17:24:13,861][54818] Updated weights for policy 0, policy_version 434008 (0.0028) [2024-04-27 17:24:14,253][54587] Fps is (10 sec: 52428.4, 60 sec: 54613.3, 300 sec: 46597.2). Total num frames: 7110787072. Throughput: 0: 54888.4. Samples: 15949000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 17:24:16,476][54818] Updated weights for policy 0, policy_version 434018 (0.0032) [2024-04-27 17:24:17,169][54798] Signal inference workers to stop experience collection... (200 times) [2024-04-27 17:24:17,172][54798] Signal inference workers to resume experience collection... (200 times) [2024-04-27 17:24:17,202][54818] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-04-27 17:24:17,202][54818] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-04-27 17:24:19,253][54587] Fps is (10 sec: 54066.1, 60 sec: 54886.4, 300 sec: 46819.4). Total num frames: 7111081984. Throughput: 0: 54792.1. Samples: 16277340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:19,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 17:24:19,850][54818] Updated weights for policy 0, policy_version 434028 (0.0032) [2024-04-27 17:24:22,418][54818] Updated weights for policy 0, policy_version 434038 (0.0033) [2024-04-27 17:24:24,253][54587] Fps is (10 sec: 57343.6, 60 sec: 54886.2, 300 sec: 46986.0). Total num frames: 7111360512. Throughput: 0: 54855.0. Samples: 16606760. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:24,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 17:24:25,751][54818] Updated weights for policy 0, policy_version 434048 (0.0032) [2024-04-27 17:24:28,270][54818] Updated weights for policy 0, policy_version 434058 (0.0029) [2024-04-27 17:24:29,253][54587] Fps is (10 sec: 55706.4, 60 sec: 54886.5, 300 sec: 47208.2). Total num frames: 7111639040. Throughput: 0: 55082.3. Samples: 16775340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:29,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 17:24:31,734][54818] Updated weights for policy 0, policy_version 434068 (0.0033) [2024-04-27 17:24:34,189][54818] Updated weights for policy 0, policy_version 434078 (0.0040) [2024-04-27 17:24:34,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55432.4, 300 sec: 47374.8). Total num frames: 7111933952. Throughput: 0: 55040.2. Samples: 17106100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:34,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 17:24:37,844][54818] Updated weights for policy 0, policy_version 434088 (0.0027) [2024-04-27 17:24:39,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 47374.8). Total num frames: 7112196096. Throughput: 0: 54954.2. Samples: 17429820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:39,254][54587] Avg episode reward: [(0, '0.507')] [2024-04-27 17:24:40,273][54818] Updated weights for policy 0, policy_version 434098 (0.0036) [2024-04-27 17:24:43,897][54818] Updated weights for policy 0, policy_version 434108 (0.0025) [2024-04-27 17:24:44,253][54587] Fps is (10 sec: 50789.7, 60 sec: 54886.3, 300 sec: 47430.3). Total num frames: 7112441856. Throughput: 0: 54682.8. Samples: 17592200. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:44,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 17:24:46,219][54818] Updated weights for policy 0, policy_version 434118 (0.0023) [2024-04-27 17:24:49,253][54587] Fps is (10 sec: 50790.8, 60 sec: 54613.5, 300 sec: 54674.3). Total num frames: 7112704000. Throughput: 0: 54667.6. Samples: 17920080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:49,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 17:24:49,870][54818] Updated weights for policy 0, policy_version 434128 (0.0031) [2024-04-27 17:24:52,350][54818] Updated weights for policy 0, policy_version 434138 (0.0033) [2024-04-27 17:24:54,253][54587] Fps is (10 sec: 55706.1, 60 sec: 54613.3, 300 sec: 54758.9). Total num frames: 7112998912. Throughput: 0: 54576.3. Samples: 18248400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-27 17:24:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 17:24:55,696][54818] Updated weights for policy 0, policy_version 434148 (0.0032) [2024-04-27 17:24:58,147][54818] Updated weights for policy 0, policy_version 434158 (0.0032) [2024-04-27 17:24:59,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55159.4, 300 sec: 54839.8). Total num frames: 7113293824. Throughput: 0: 54825.0. Samples: 18416120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:24:59,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 17:25:01,577][54818] Updated weights for policy 0, policy_version 434168 (0.0027) [2024-04-27 17:25:04,253][54587] Fps is (10 sec: 55706.1, 60 sec: 54886.4, 300 sec: 54796.5). Total num frames: 7113555968. Throughput: 0: 54834.0. Samples: 18744860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:04,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 17:25:04,290][54818] Updated weights for policy 0, policy_version 434178 (0.0031) [2024-04-27 17:25:07,594][54818] Updated weights for policy 0, policy_version 434188 (0.0034) [2024-04-27 17:25:09,253][54587] Fps is (10 sec: 54067.9, 60 sec: 54886.4, 300 sec: 54813.8). Total num frames: 7113834496. Throughput: 0: 54828.3. Samples: 19074020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:09,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 17:25:10,186][54818] Updated weights for policy 0, policy_version 434198 (0.0027) [2024-04-27 17:25:13,586][54818] Updated weights for policy 0, policy_version 434208 (0.0032) [2024-04-27 17:25:14,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 54829.6). Total num frames: 7114113024. Throughput: 0: 54835.4. Samples: 19242940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:14,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 17:25:16,125][54818] Updated weights for policy 0, policy_version 434218 (0.0028) [2024-04-27 17:25:19,253][54587] Fps is (10 sec: 52428.4, 60 sec: 54613.5, 300 sec: 54730.9). Total num frames: 7114358784. Throughput: 0: 54944.9. Samples: 19578620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 17:25:19,531][54818] Updated weights for policy 0, policy_version 434228 (0.0030) [2024-04-27 17:25:22,096][54818] Updated weights for policy 0, policy_version 434238 (0.0027) [2024-04-27 17:25:24,253][54587] Fps is (10 sec: 52428.3, 60 sec: 54613.4, 300 sec: 54747.6). Total num frames: 7114637312. Throughput: 0: 55097.6. Samples: 19909220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:24,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 17:25:25,449][54818] Updated weights for policy 0, policy_version 434248 (0.0028) [2024-04-27 17:25:28,045][54818] Updated weights for policy 0, policy_version 434258 (0.0027) [2024-04-27 17:25:29,253][54587] Fps is (10 sec: 57343.4, 60 sec: 54886.3, 300 sec: 54928.1). Total num frames: 7114932224. Throughput: 0: 54928.5. Samples: 20063980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:29,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 17:25:30,879][54798] Signal inference workers to stop experience collection... (250 times) [2024-04-27 17:25:30,880][54798] Signal inference workers to resume experience collection... (250 times) [2024-04-27 17:25:30,893][54818] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-04-27 17:25:30,893][54818] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-04-27 17:25:31,297][54818] Updated weights for policy 0, policy_version 434268 (0.0029) [2024-04-27 17:25:33,900][54818] Updated weights for policy 0, policy_version 434278 (0.0034) [2024-04-27 17:25:34,253][54587] Fps is (10 sec: 57344.8, 60 sec: 54613.3, 300 sec: 55039.2). Total num frames: 7115210752. Throughput: 0: 55075.9. Samples: 20398500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:34,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 17:25:37,234][54818] Updated weights for policy 0, policy_version 434288 (0.0029) [2024-04-27 17:25:39,253][54587] Fps is (10 sec: 55706.3, 60 sec: 54886.5, 300 sec: 55094.7). Total num frames: 7115489280. Throughput: 0: 55085.5. Samples: 20727240. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:39,253][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 17:25:40,142][54818] Updated weights for policy 0, policy_version 434298 (0.0028) [2024-04-27 17:25:43,217][54818] Updated weights for policy 0, policy_version 434308 (0.0030) [2024-04-27 17:25:44,253][54587] Fps is (10 sec: 54066.3, 60 sec: 55159.4, 300 sec: 55094.6). Total num frames: 7115751424. Throughput: 0: 55186.9. Samples: 20899540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:44,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 17:25:46,093][54818] Updated weights for policy 0, policy_version 434318 (0.0037) [2024-04-27 17:25:49,081][54818] Updated weights for policy 0, policy_version 434328 (0.0024) [2024-04-27 17:25:49,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55094.7). Total num frames: 7116029952. Throughput: 0: 55212.3. Samples: 21229420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:49,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 17:25:49,289][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000434329_7116046336.pth... [2024-04-27 17:25:49,337][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000433522_7102824448.pth [2024-04-27 17:25:52,131][54818] Updated weights for policy 0, policy_version 434338 (0.0028) [2024-04-27 17:25:54,253][54587] Fps is (10 sec: 52430.0, 60 sec: 54613.4, 300 sec: 55039.1). Total num frames: 7116275712. Throughput: 0: 55167.5. Samples: 21556560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:54,253][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 17:25:55,012][54818] Updated weights for policy 0, policy_version 434348 (0.0029) [2024-04-27 17:25:58,056][54818] Updated weights for policy 0, policy_version 434358 (0.0028) [2024-04-27 17:25:59,253][54587] Fps is (10 sec: 52428.9, 60 sec: 54340.3, 300 sec: 55039.2). 
Total num frames: 7116554240. Throughput: 0: 54810.7. Samples: 21709420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:25:59,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:26:00,962][54818] Updated weights for policy 0, policy_version 434368 (0.0027) [2024-04-27 17:26:04,075][54818] Updated weights for policy 0, policy_version 434378 (0.0037) [2024-04-27 17:26:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 7116865536. Throughput: 0: 54669.8. Samples: 22038760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:26:04,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 17:26:07,079][54818] Updated weights for policy 0, policy_version 434388 (0.0032) [2024-04-27 17:26:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55159.3, 300 sec: 55150.2). Total num frames: 7117144064. Throughput: 0: 54482.7. Samples: 22360940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:26:09,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 17:26:09,906][54818] Updated weights for policy 0, policy_version 434398 (0.0031) [2024-04-27 17:26:12,984][54818] Updated weights for policy 0, policy_version 434408 (0.0031) [2024-04-27 17:26:14,253][54587] Fps is (10 sec: 54067.1, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 7117406208. Throughput: 0: 55227.7. Samples: 22549220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:26:14,255][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 17:26:15,854][54818] Updated weights for policy 0, policy_version 434418 (0.0032) [2024-04-27 17:26:18,983][54818] Updated weights for policy 0, policy_version 434428 (0.0028) [2024-04-27 17:26:19,253][54587] Fps is (10 sec: 52429.7, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 7117668352. Throughput: 0: 55097.0. Samples: 22877860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:26:19,253][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 17:26:21,692][54818] Updated weights for policy 0, policy_version 434438 (0.0031) [2024-04-27 17:26:24,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.6, 300 sec: 55039.1). Total num frames: 7117930496. Throughput: 0: 55091.6. Samples: 23206360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:26:24,253][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 17:26:24,874][54818] Updated weights for policy 0, policy_version 434448 (0.0028) [2024-04-27 17:26:27,825][54818] Updated weights for policy 0, policy_version 434458 (0.0030) [2024-04-27 17:26:29,253][54587] Fps is (10 sec: 54066.7, 60 sec: 54613.4, 300 sec: 54983.6). Total num frames: 7118209024. Throughput: 0: 54835.3. Samples: 23367120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:26:29,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 17:26:30,688][54818] Updated weights for policy 0, policy_version 434468 (0.0031) [2024-04-27 17:26:33,850][54818] Updated weights for policy 0, policy_version 434478 (0.0029) [2024-04-27 17:26:34,253][54587] Fps is (10 sec: 57342.8, 60 sec: 54886.3, 300 sec: 55094.7). Total num frames: 7118503936. Throughput: 0: 54752.4. Samples: 23693280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:26:34,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 17:26:36,496][54818] Updated weights for policy 0, policy_version 434488 (0.0036) [2024-04-27 17:26:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 54886.3, 300 sec: 55039.1). Total num frames: 7118782464. Throughput: 0: 54899.9. Samples: 24027060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:26:39,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 17:26:39,778][54818] Updated weights for policy 0, policy_version 434498 (0.0033) [2024-04-27 17:26:42,483][54818] Updated weights for policy 0, policy_version 434508 (0.0026) [2024-04-27 17:26:44,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55432.7, 300 sec: 55094.7). Total num frames: 7119077376. Throughput: 0: 55376.9. Samples: 24201380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:26:44,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 17:26:45,622][54818] Updated weights for policy 0, policy_version 434518 (0.0028) [2024-04-27 17:26:46,628][54798] Signal inference workers to stop experience collection... (300 times) [2024-04-27 17:26:46,643][54818] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-04-27 17:26:46,728][54798] Signal inference workers to resume experience collection... (300 times) [2024-04-27 17:26:46,729][54818] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-04-27 17:26:48,565][54818] Updated weights for policy 0, policy_version 434528 (0.0034) [2024-04-27 17:26:49,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 55094.7). Total num frames: 7119339520. Throughput: 0: 55362.1. Samples: 24530060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:26:49,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 17:26:51,679][54818] Updated weights for policy 0, policy_version 434538 (0.0042) [2024-04-27 17:26:54,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55150.2). Total num frames: 7119618048. Throughput: 0: 55433.0. Samples: 24855420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:26:54,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 17:26:54,616][54818] Updated weights for policy 0, policy_version 434548 (0.0027) [2024-04-27 17:26:57,557][54818] Updated weights for policy 0, policy_version 434558 (0.0031) [2024-04-27 17:26:59,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 7119863808. Throughput: 0: 54794.6. Samples: 25014980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:26:59,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 17:27:00,550][54818] Updated weights for policy 0, policy_version 434568 (0.0026) [2024-04-27 17:27:03,525][54818] Updated weights for policy 0, policy_version 434578 (0.0029) [2024-04-27 17:27:04,253][54587] Fps is (10 sec: 52428.9, 60 sec: 54613.3, 300 sec: 54983.6). Total num frames: 7120142336. Throughput: 0: 54970.1. Samples: 25351520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:27:04,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 17:27:06,328][54818] Updated weights for policy 0, policy_version 434588 (0.0029) [2024-04-27 17:27:09,253][54587] Fps is (10 sec: 55705.1, 60 sec: 54613.3, 300 sec: 55039.1). Total num frames: 7120420864. Throughput: 0: 55037.1. Samples: 25683040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:27:09,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 17:27:09,735][54818] Updated weights for policy 0, policy_version 434598 (0.0029) [2024-04-27 17:27:12,266][54818] Updated weights for policy 0, policy_version 434608 (0.0034) [2024-04-27 17:27:14,253][54587] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 54928.0). Total num frames: 7120699392. Throughput: 0: 55194.2. Samples: 25850860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:27:14,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 17:27:15,888][54818] Updated weights for policy 0, policy_version 434618 (0.0029) [2024-04-27 17:27:18,377][54818] Updated weights for policy 0, policy_version 434628 (0.0031) [2024-04-27 17:27:19,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 55039.1). Total num frames: 7120994304. Throughput: 0: 55129.5. Samples: 26174100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:27:19,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 17:27:21,729][54818] Updated weights for policy 0, policy_version 434638 (0.0028) [2024-04-27 17:27:24,187][54818] Updated weights for policy 0, policy_version 434648 (0.0027) [2024-04-27 17:27:24,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55094.7). Total num frames: 7121272832. Throughput: 0: 55029.9. Samples: 26503400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:27:24,262][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 17:27:27,720][54818] Updated weights for policy 0, policy_version 434658 (0.0029) [2024-04-27 17:27:29,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55039.1). Total num frames: 7121518592. Throughput: 0: 54836.4. Samples: 26669020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:27:29,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 17:27:30,260][54818] Updated weights for policy 0, policy_version 434668 (0.0031) [2024-04-27 17:27:33,630][54818] Updated weights for policy 0, policy_version 434678 (0.0030) [2024-04-27 17:27:34,253][54587] Fps is (10 sec: 52428.9, 60 sec: 54886.5, 300 sec: 54983.6). Total num frames: 7121797120. Throughput: 0: 54921.0. Samples: 27001500. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:27:34,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 17:27:36,057][54818] Updated weights for policy 0, policy_version 434688 (0.0030) [2024-04-27 17:27:39,253][54587] Fps is (10 sec: 55704.9, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7122075648. Throughput: 0: 55001.6. Samples: 27330500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-27 17:27:39,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 17:27:39,476][54818] Updated weights for policy 0, policy_version 434698 (0.0039) [2024-04-27 17:27:41,957][54818] Updated weights for policy 0, policy_version 434708 (0.0027) [2024-04-27 17:27:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 54613.4, 300 sec: 54928.0). Total num frames: 7122354176. Throughput: 0: 55145.4. Samples: 27496520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:27:44,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 17:27:45,325][54818] Updated weights for policy 0, policy_version 434718 (0.0029) [2024-04-27 17:27:47,953][54818] Updated weights for policy 0, policy_version 434728 (0.0031) [2024-04-27 17:27:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7122632704. Throughput: 0: 54958.5. Samples: 27824660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:27:49,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 17:27:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000434731_7122632704.pth... [2024-04-27 17:27:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000433925_7109427200.pth [2024-04-27 17:27:51,317][54818] Updated weights for policy 0, policy_version 434738 (0.0029) [2024-04-27 17:27:53,054][54798] Signal inference workers to stop experience collection... 
(350 times) [2024-04-27 17:27:53,072][54818] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-04-27 17:27:53,144][54798] Signal inference workers to resume experience collection... (350 times) [2024-04-27 17:27:53,144][54818] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-04-27 17:27:53,750][54818] Updated weights for policy 0, policy_version 434748 (0.0028) [2024-04-27 17:27:54,253][54587] Fps is (10 sec: 55705.1, 60 sec: 54886.4, 300 sec: 55039.1). Total num frames: 7122911232. Throughput: 0: 54932.1. Samples: 28154980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:27:54,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 17:27:57,355][54818] Updated weights for policy 0, policy_version 434758 (0.0028) [2024-04-27 17:27:59,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 55150.2). Total num frames: 7123206144. Throughput: 0: 54950.3. Samples: 28323620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:27:59,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 17:27:59,539][54818] Updated weights for policy 0, policy_version 434768 (0.0027) [2024-04-27 17:28:03,368][54818] Updated weights for policy 0, policy_version 434778 (0.0034) [2024-04-27 17:28:04,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 7123451904. Throughput: 0: 55067.0. Samples: 28652120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:04,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 17:28:05,615][54818] Updated weights for policy 0, policy_version 434788 (0.0029) [2024-04-27 17:28:09,253][54587] Fps is (10 sec: 50789.5, 60 sec: 54886.3, 300 sec: 54928.0). Total num frames: 7123714048. Throughput: 0: 55110.5. Samples: 28983380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:09,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 17:28:09,373][54818] Updated weights for policy 0, policy_version 434798 (0.0026) [2024-04-27 17:28:11,696][54818] Updated weights for policy 0, policy_version 434808 (0.0031) [2024-04-27 17:28:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 7124008960. Throughput: 0: 55011.6. Samples: 29144540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:14,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 17:28:15,435][54818] Updated weights for policy 0, policy_version 434818 (0.0031) [2024-04-27 17:28:17,798][54818] Updated weights for policy 0, policy_version 434828 (0.0030) [2024-04-27 17:28:19,253][54587] Fps is (10 sec: 57343.8, 60 sec: 54886.2, 300 sec: 54983.5). Total num frames: 7124287488. Throughput: 0: 55001.5. Samples: 29476580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:19,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 17:28:21,231][54818] Updated weights for policy 0, policy_version 434838 (0.0029) [2024-04-27 17:28:23,756][54818] Updated weights for policy 0, policy_version 434848 (0.0026) [2024-04-27 17:28:24,253][54587] Fps is (10 sec: 55705.4, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7124566016. Throughput: 0: 55024.1. Samples: 29806580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:24,254][54587] Avg episode reward: [(0, '0.679')] [2024-04-27 17:28:27,096][54818] Updated weights for policy 0, policy_version 434858 (0.0027) [2024-04-27 17:28:29,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55094.6). Total num frames: 7124860928. Throughput: 0: 55032.7. Samples: 29973000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:29,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 17:28:29,545][54818] Updated weights for policy 0, policy_version 434868 (0.0026) [2024-04-27 17:28:32,941][54818] Updated weights for policy 0, policy_version 434878 (0.0032) [2024-04-27 17:28:34,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55094.7). Total num frames: 7125123072. Throughput: 0: 55209.3. Samples: 30309080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:34,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 17:28:35,386][54818] Updated weights for policy 0, policy_version 434888 (0.0033) [2024-04-27 17:28:38,933][54818] Updated weights for policy 0, policy_version 434898 (0.0034) [2024-04-27 17:28:39,253][54587] Fps is (10 sec: 50789.9, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7125368832. Throughput: 0: 55245.2. Samples: 30641020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:39,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 17:28:41,321][54818] Updated weights for policy 0, policy_version 434908 (0.0026) [2024-04-27 17:28:44,253][54587] Fps is (10 sec: 52428.5, 60 sec: 54886.2, 300 sec: 54983.6). Total num frames: 7125647360. Throughput: 0: 54978.9. Samples: 30797680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:44,254][54587] Avg episode reward: [(0, '0.677')] [2024-04-27 17:28:45,083][54818] Updated weights for policy 0, policy_version 434918 (0.0030) [2024-04-27 17:28:47,251][54818] Updated weights for policy 0, policy_version 434928 (0.0027) [2024-04-27 17:28:49,253][54587] Fps is (10 sec: 55706.3, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 7125925888. Throughput: 0: 55085.8. Samples: 31130980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:49,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 17:28:50,911][54818] Updated weights for policy 0, policy_version 434938 (0.0028) [2024-04-27 17:28:53,250][54818] Updated weights for policy 0, policy_version 434948 (0.0029) [2024-04-27 17:28:54,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55159.5, 300 sec: 55039.1). Total num frames: 7126220800. Throughput: 0: 55075.2. Samples: 31461760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:54,263][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 17:28:56,116][54798] Signal inference workers to stop experience collection... (400 times) [2024-04-27 17:28:56,147][54818] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-04-27 17:28:56,200][54798] Signal inference workers to resume experience collection... (400 times) [2024-04-27 17:28:56,201][54818] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-04-27 17:28:56,916][54818] Updated weights for policy 0, policy_version 434958 (0.0032) [2024-04-27 17:28:59,253][54587] Fps is (10 sec: 55705.3, 60 sec: 54613.2, 300 sec: 54983.6). Total num frames: 7126482944. Throughput: 0: 55332.3. Samples: 31634500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:28:59,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 17:28:59,644][54818] Updated weights for policy 0, policy_version 434968 (0.0026) [2024-04-27 17:29:02,865][54818] Updated weights for policy 0, policy_version 434978 (0.0029) [2024-04-27 17:29:04,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55039.1). Total num frames: 7126777856. Throughput: 0: 55327.3. Samples: 31966300. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:04,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 17:29:05,713][54818] Updated weights for policy 0, policy_version 434988 (0.0032) [2024-04-27 17:29:08,724][54818] Updated weights for policy 0, policy_version 434998 (0.0035) [2024-04-27 17:29:09,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55094.7). Total num frames: 7127040000. Throughput: 0: 55252.9. Samples: 32292960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:09,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 17:29:11,679][54818] Updated weights for policy 0, policy_version 435008 (0.0029) [2024-04-27 17:29:14,253][54587] Fps is (10 sec: 52429.0, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7127302144. Throughput: 0: 55140.0. Samples: 32454300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:14,263][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 17:29:14,636][54818] Updated weights for policy 0, policy_version 435018 (0.0030) [2024-04-27 17:29:17,773][54818] Updated weights for policy 0, policy_version 435028 (0.0026) [2024-04-27 17:29:19,253][54587] Fps is (10 sec: 54067.2, 60 sec: 54886.5, 300 sec: 54983.6). Total num frames: 7127580672. Throughput: 0: 55075.2. Samples: 32787460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:19,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 17:29:20,785][54818] Updated weights for policy 0, policy_version 435038 (0.0028) [2024-04-27 17:29:23,656][54818] Updated weights for policy 0, policy_version 435048 (0.0027) [2024-04-27 17:29:24,253][54587] Fps is (10 sec: 55705.4, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7127859200. Throughput: 0: 54812.6. Samples: 33107580. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:24,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 17:29:26,849][54818] Updated weights for policy 0, policy_version 435058 (0.0028) [2024-04-27 17:29:29,253][54587] Fps is (10 sec: 54067.0, 60 sec: 54340.3, 300 sec: 54872.5). Total num frames: 7128121344. Throughput: 0: 55190.3. Samples: 33281240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:29,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 17:29:29,581][54818] Updated weights for policy 0, policy_version 435068 (0.0030) [2024-04-27 17:29:32,825][54818] Updated weights for policy 0, policy_version 435078 (0.0028) [2024-04-27 17:29:34,253][54587] Fps is (10 sec: 55705.6, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7128416256. Throughput: 0: 55054.7. Samples: 33608440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:34,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 17:29:35,611][54818] Updated weights for policy 0, policy_version 435088 (0.0030) [2024-04-27 17:29:38,769][54818] Updated weights for policy 0, policy_version 435098 (0.0031) [2024-04-27 17:29:39,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55094.7). Total num frames: 7128694784. Throughput: 0: 54984.8. Samples: 33936080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:39,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:29:41,600][54818] Updated weights for policy 0, policy_version 435108 (0.0028) [2024-04-27 17:29:44,253][54587] Fps is (10 sec: 50790.8, 60 sec: 54613.5, 300 sec: 54983.6). Total num frames: 7128924160. Throughput: 0: 54687.3. Samples: 34095420. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:44,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 17:29:44,582][54818] Updated weights for policy 0, policy_version 435118 (0.0029) [2024-04-27 17:29:47,525][54818] Updated weights for policy 0, policy_version 435128 (0.0028) [2024-04-27 17:29:49,253][54587] Fps is (10 sec: 52428.9, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7129219072. Throughput: 0: 54626.6. Samples: 34424500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:49,254][54587] Avg episode reward: [(0, '0.512')] [2024-04-27 17:29:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000435133_7129219072.pth... [2024-04-27 17:29:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000434329_7116046336.pth [2024-04-27 17:29:50,623][54818] Updated weights for policy 0, policy_version 435138 (0.0031) [2024-04-27 17:29:51,491][54798] Signal inference workers to stop experience collection... (450 times) [2024-04-27 17:29:51,491][54798] Signal inference workers to resume experience collection... (450 times) [2024-04-27 17:29:51,517][54818] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-04-27 17:29:51,517][54818] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-04-27 17:29:53,673][54818] Updated weights for policy 0, policy_version 435148 (0.0030) [2024-04-27 17:29:54,253][54587] Fps is (10 sec: 55705.7, 60 sec: 54340.4, 300 sec: 54872.5). Total num frames: 7129481216. Throughput: 0: 54686.8. Samples: 34753860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:54,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 17:29:56,630][54818] Updated weights for policy 0, policy_version 435158 (0.0029) [2024-04-27 17:29:59,253][54587] Fps is (10 sec: 54067.0, 60 sec: 54613.3, 300 sec: 54928.0). Total num frames: 7129759744. Throughput: 0: 54710.5. Samples: 34916280. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:29:59,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 17:29:59,560][54818] Updated weights for policy 0, policy_version 435168 (0.0033) [2024-04-27 17:30:02,526][54818] Updated weights for policy 0, policy_version 435178 (0.0023) [2024-04-27 17:30:04,253][54587] Fps is (10 sec: 57343.1, 60 sec: 54613.3, 300 sec: 54983.6). Total num frames: 7130054656. Throughput: 0: 54507.5. Samples: 35240300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:30:04,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 17:30:05,542][54818] Updated weights for policy 0, policy_version 435188 (0.0028) [2024-04-27 17:30:08,423][54818] Updated weights for policy 0, policy_version 435198 (0.0029) [2024-04-27 17:30:09,253][54587] Fps is (10 sec: 58983.2, 60 sec: 55159.5, 300 sec: 55039.1). Total num frames: 7130349568. Throughput: 0: 54757.4. Samples: 35571660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:30:09,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 17:30:11,361][54818] Updated weights for policy 0, policy_version 435208 (0.0031) [2024-04-27 17:30:14,253][54587] Fps is (10 sec: 52429.5, 60 sec: 54613.4, 300 sec: 54983.6). Total num frames: 7130578944. Throughput: 0: 54605.9. Samples: 35738500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:30:14,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 17:30:14,458][54818] Updated weights for policy 0, policy_version 435218 (0.0027) [2024-04-27 17:30:17,277][54818] Updated weights for policy 0, policy_version 435228 (0.0029) [2024-04-27 17:30:19,253][54587] Fps is (10 sec: 50790.6, 60 sec: 54613.4, 300 sec: 54983.6). Total num frames: 7130857472. Throughput: 0: 54561.0. Samples: 36063680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:30:19,253][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 17:30:20,468][54818] Updated weights for policy 0, policy_version 435238 (0.0028) [2024-04-27 17:30:23,148][54818] Updated weights for policy 0, policy_version 435248 (0.0030) [2024-04-27 17:30:24,253][54587] Fps is (10 sec: 55705.1, 60 sec: 54613.3, 300 sec: 54928.1). Total num frames: 7131136000. Throughput: 0: 54638.7. Samples: 36394820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-04-27 17:30:24,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 17:30:26,316][54818] Updated weights for policy 0, policy_version 435258 (0.0029) [2024-04-27 17:30:29,153][54818] Updated weights for policy 0, policy_version 435268 (0.0031) [2024-04-27 17:30:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55159.6, 300 sec: 54983.6). Total num frames: 7131430912. Throughput: 0: 54682.7. Samples: 36556140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:30:29,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 17:30:32,099][54818] Updated weights for policy 0, policy_version 435278 (0.0032) [2024-04-27 17:30:34,253][54587] Fps is (10 sec: 55705.6, 60 sec: 54613.3, 300 sec: 54928.0). Total num frames: 7131693056. Throughput: 0: 54702.7. Samples: 36886120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:30:34,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 17:30:35,314][54818] Updated weights for policy 0, policy_version 435288 (0.0023) [2024-04-27 17:30:38,168][54818] Updated weights for policy 0, policy_version 435298 (0.0028) [2024-04-27 17:30:39,253][54587] Fps is (10 sec: 55704.4, 60 sec: 54886.3, 300 sec: 55039.1). Total num frames: 7131987968. Throughput: 0: 54683.3. Samples: 37214620. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:30:39,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 17:30:41,233][54818] Updated weights for policy 0, policy_version 435308 (0.0030) [2024-04-27 17:30:44,062][54818] Updated weights for policy 0, policy_version 435318 (0.0035) [2024-04-27 17:30:44,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 54983.6). Total num frames: 7132250112. Throughput: 0: 54849.2. Samples: 37384480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:30:44,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 17:30:47,190][54818] Updated weights for policy 0, policy_version 435328 (0.0030) [2024-04-27 17:30:49,253][54587] Fps is (10 sec: 52429.9, 60 sec: 54886.5, 300 sec: 55039.1). Total num frames: 7132512256. Throughput: 0: 55011.3. Samples: 37715800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:30:49,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 17:30:50,122][54818] Updated weights for policy 0, policy_version 435338 (0.0038) [2024-04-27 17:30:53,128][54818] Updated weights for policy 0, policy_version 435348 (0.0023) [2024-04-27 17:30:54,253][54587] Fps is (10 sec: 52427.7, 60 sec: 54886.2, 300 sec: 54983.6). Total num frames: 7132774400. Throughput: 0: 54905.6. Samples: 38042420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:30:54,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 17:30:55,768][54798] Signal inference workers to stop experience collection... (500 times) [2024-04-27 17:30:55,769][54798] Signal inference workers to resume experience collection... 
(500 times) [2024-04-27 17:30:55,789][54818] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-04-27 17:30:55,789][54818] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-04-27 17:30:56,109][54818] Updated weights for policy 0, policy_version 435358 (0.0032) [2024-04-27 17:30:59,019][54818] Updated weights for policy 0, policy_version 435368 (0.0034) [2024-04-27 17:30:59,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55159.5, 300 sec: 54928.0). Total num frames: 7133069312. Throughput: 0: 54723.4. Samples: 38201060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:30:59,262][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 17:31:01,938][54818] Updated weights for policy 0, policy_version 435378 (0.0035) [2024-04-27 17:31:04,253][54587] Fps is (10 sec: 57344.2, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 7133347840. Throughput: 0: 54934.5. Samples: 38535740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:31:04,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 17:31:04,936][54818] Updated weights for policy 0, policy_version 435388 (0.0033) [2024-04-27 17:31:07,925][54818] Updated weights for policy 0, policy_version 435398 (0.0033) [2024-04-27 17:31:09,253][54587] Fps is (10 sec: 54068.0, 60 sec: 54340.3, 300 sec: 54928.1). Total num frames: 7133609984. Throughput: 0: 55027.2. Samples: 38871040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:31:09,253][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 17:31:10,740][54818] Updated weights for policy 0, policy_version 435408 (0.0031) [2024-04-27 17:31:13,713][54818] Updated weights for policy 0, policy_version 435418 (0.0039) [2024-04-27 17:31:14,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55705.6, 300 sec: 55094.7). Total num frames: 7133921280. Throughput: 0: 55258.7. Samples: 39042780. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:31:14,253][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 17:31:16,851][54818] Updated weights for policy 0, policy_version 435428 (0.0035) [2024-04-27 17:31:19,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55432.4, 300 sec: 55094.6). Total num frames: 7134183424. Throughput: 0: 55214.6. Samples: 39370780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:31:19,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 17:31:19,657][54818] Updated weights for policy 0, policy_version 435438 (0.0028) [2024-04-27 17:31:22,936][54818] Updated weights for policy 0, policy_version 435448 (0.0031) [2024-04-27 17:31:24,253][54587] Fps is (10 sec: 52427.8, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 7134445568. Throughput: 0: 55261.4. Samples: 39701380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:31:24,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 17:31:25,701][54818] Updated weights for policy 0, policy_version 435458 (0.0033) [2024-04-27 17:31:28,936][54818] Updated weights for policy 0, policy_version 435468 (0.0030) [2024-04-27 17:31:29,253][54587] Fps is (10 sec: 52428.6, 60 sec: 54613.2, 300 sec: 54928.1). Total num frames: 7134707712. Throughput: 0: 54947.8. Samples: 39857140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:31:29,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 17:31:31,619][54818] Updated weights for policy 0, policy_version 435478 (0.0026) [2024-04-27 17:31:34,253][54587] Fps is (10 sec: 54067.7, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 7134986240. Throughput: 0: 54929.3. Samples: 40187620. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:31:34,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 17:31:34,879][54818] Updated weights for policy 0, policy_version 435488 (0.0036) [2024-04-27 17:31:37,443][54818] Updated weights for policy 0, policy_version 435498 (0.0032) [2024-04-27 17:31:39,253][54587] Fps is (10 sec: 55705.2, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 7135264768. Throughput: 0: 55022.2. Samples: 40518420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:31:39,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 17:31:40,780][54818] Updated weights for policy 0, policy_version 435508 (0.0026) [2024-04-27 17:31:43,422][54818] Updated weights for policy 0, policy_version 435518 (0.0031) [2024-04-27 17:31:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7135559680. Throughput: 0: 55303.7. Samples: 40689720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 17:31:44,254][54587] Avg episode reward: [(0, '0.508')] [2024-04-27 17:31:46,960][54818] Updated weights for policy 0, policy_version 435528 (0.0029) [2024-04-27 17:31:49,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 7135821824. Throughput: 0: 55140.1. Samples: 41017040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:31:49,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 17:31:49,293][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000435537_7135838208.pth... [2024-04-27 17:31:49,345][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000434731_7122632704.pth [2024-04-27 17:31:49,496][54818] Updated weights for policy 0, policy_version 435538 (0.0033) [2024-04-27 17:31:52,920][54818] Updated weights for policy 0, policy_version 435548 (0.0029) [2024-04-27 17:31:54,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55039.1). 
Total num frames: 7136100352. Throughput: 0: 55002.1. Samples: 41346140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:31:54,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 17:31:55,305][54818] Updated weights for policy 0, policy_version 435558 (0.0031) [2024-04-27 17:31:58,842][54818] Updated weights for policy 0, policy_version 435568 (0.0030) [2024-04-27 17:31:59,253][54587] Fps is (10 sec: 54066.6, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7136362496. Throughput: 0: 54816.7. Samples: 41509540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:31:59,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 17:32:01,174][54818] Updated weights for policy 0, policy_version 435578 (0.0026) [2024-04-27 17:32:04,253][54587] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7136641024. Throughput: 0: 54813.4. Samples: 41837380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:04,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 17:32:04,720][54818] Updated weights for policy 0, policy_version 435588 (0.0034) [2024-04-27 17:32:07,237][54818] Updated weights for policy 0, policy_version 435598 (0.0031) [2024-04-27 17:32:09,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7136919552. Throughput: 0: 54844.1. Samples: 42169360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:09,253][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 17:32:09,839][54798] Signal inference workers to stop experience collection... (550 times) [2024-04-27 17:32:09,844][54798] Signal inference workers to resume experience collection... 
(550 times) [2024-04-27 17:32:09,853][54818] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-04-27 17:32:09,883][54818] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-04-27 17:32:10,831][54818] Updated weights for policy 0, policy_version 435608 (0.0034) [2024-04-27 17:32:13,221][54818] Updated weights for policy 0, policy_version 435618 (0.0031) [2024-04-27 17:32:14,253][54587] Fps is (10 sec: 55705.0, 60 sec: 54613.1, 300 sec: 54928.0). Total num frames: 7137198080. Throughput: 0: 55073.3. Samples: 42335440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:14,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 17:32:16,692][54818] Updated weights for policy 0, policy_version 435628 (0.0029) [2024-04-27 17:32:19,124][54818] Updated weights for policy 0, policy_version 435638 (0.0026) [2024-04-27 17:32:19,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7137492992. Throughput: 0: 55067.0. Samples: 42665640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:19,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 17:32:22,627][54818] Updated weights for policy 0, policy_version 435648 (0.0030) [2024-04-27 17:32:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7137738752. Throughput: 0: 55031.2. Samples: 42994820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:24,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 17:32:25,004][54818] Updated weights for policy 0, policy_version 435658 (0.0034) [2024-04-27 17:32:28,756][54818] Updated weights for policy 0, policy_version 435668 (0.0026) [2024-04-27 17:32:29,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55159.6, 300 sec: 54983.6). Total num frames: 7138017280. Throughput: 0: 54732.5. Samples: 43152680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:29,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 17:32:30,998][54818] Updated weights for policy 0, policy_version 435678 (0.0026) [2024-04-27 17:32:34,253][54587] Fps is (10 sec: 54068.0, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 7138279424. Throughput: 0: 54794.7. Samples: 43482800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:34,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 17:32:34,786][54818] Updated weights for policy 0, policy_version 435688 (0.0025) [2024-04-27 17:32:37,125][54818] Updated weights for policy 0, policy_version 435698 (0.0024) [2024-04-27 17:32:39,253][54587] Fps is (10 sec: 54067.1, 60 sec: 54886.6, 300 sec: 54928.0). Total num frames: 7138557952. Throughput: 0: 54902.3. Samples: 43816740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:39,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:32:40,624][54818] Updated weights for policy 0, policy_version 435708 (0.0028) [2024-04-27 17:32:42,973][54818] Updated weights for policy 0, policy_version 435718 (0.0027) [2024-04-27 17:32:44,253][54587] Fps is (10 sec: 57344.0, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7138852864. Throughput: 0: 54942.3. Samples: 43981940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:44,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 17:32:46,692][54818] Updated weights for policy 0, policy_version 435728 (0.0028) [2024-04-27 17:32:49,253][54587] Fps is (10 sec: 55704.8, 60 sec: 54886.2, 300 sec: 54928.0). Total num frames: 7139115008. Throughput: 0: 55029.7. Samples: 44313720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:49,254][54587] Avg episode reward: [(0, '0.511')] [2024-04-27 17:32:49,369][54818] Updated weights for policy 0, policy_version 435738 (0.0029) [2024-04-27 17:32:52,687][54818] Updated weights for policy 0, policy_version 435748 (0.0024) [2024-04-27 17:32:54,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 54928.0). Total num frames: 7139409920. Throughput: 0: 54960.4. Samples: 44642580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:54,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 17:32:55,375][54818] Updated weights for policy 0, policy_version 435758 (0.0030) [2024-04-27 17:32:58,525][54818] Updated weights for policy 0, policy_version 435768 (0.0032) [2024-04-27 17:32:59,253][54587] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 54928.0). Total num frames: 7139655680. Throughput: 0: 54839.6. Samples: 44803220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:32:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 17:33:01,250][54818] Updated weights for policy 0, policy_version 435778 (0.0034) [2024-04-27 17:33:04,253][54587] Fps is (10 sec: 50790.2, 60 sec: 54613.3, 300 sec: 54928.1). Total num frames: 7139917824. Throughput: 0: 54857.0. Samples: 45134200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:33:04,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 17:33:04,495][54818] Updated weights for policy 0, policy_version 435788 (0.0033) [2024-04-27 17:33:07,081][54818] Updated weights for policy 0, policy_version 435798 (0.0027) [2024-04-27 17:33:09,253][54587] Fps is (10 sec: 55706.2, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 7140212736. Throughput: 0: 54816.1. Samples: 45461540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 17:33:09,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 17:33:10,607][54818] Updated weights for policy 0, policy_version 435808 (0.0032) [2024-04-27 17:33:13,019][54818] Updated weights for policy 0, policy_version 435818 (0.0026) [2024-04-27 17:33:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 7140491264. Throughput: 0: 55038.2. Samples: 45629400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:14,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 17:33:16,481][54818] Updated weights for policy 0, policy_version 435828 (0.0028) [2024-04-27 17:33:19,059][54818] Updated weights for policy 0, policy_version 435838 (0.0028) [2024-04-27 17:33:19,253][54587] Fps is (10 sec: 57343.6, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7140786176. Throughput: 0: 54935.9. Samples: 45954920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:19,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 17:33:22,291][54818] Updated weights for policy 0, policy_version 435848 (0.0039) [2024-04-27 17:33:24,253][54587] Fps is (10 sec: 54067.5, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 7141031936. Throughput: 0: 54909.4. Samples: 46287660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:24,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 17:33:24,877][54818] Updated weights for policy 0, policy_version 435858 (0.0028) [2024-04-27 17:33:28,290][54818] Updated weights for policy 0, policy_version 435868 (0.0030) [2024-04-27 17:33:29,253][54587] Fps is (10 sec: 50790.2, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 7141294080. Throughput: 0: 54936.7. Samples: 46454100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:29,254][54587] Avg episode reward: [(0, '0.494')] [2024-04-27 17:33:30,800][54818] Updated weights for policy 0, policy_version 435878 (0.0030) [2024-04-27 17:33:34,253][54587] Fps is (10 sec: 54067.0, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 7141572608. Throughput: 0: 54900.2. Samples: 46784220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:34,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 17:33:34,312][54818] Updated weights for policy 0, policy_version 435888 (0.0030) [2024-04-27 17:33:35,228][54798] Signal inference workers to stop experience collection... (600 times) [2024-04-27 17:33:35,260][54818] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-04-27 17:33:35,285][54798] Signal inference workers to resume experience collection... (600 times) [2024-04-27 17:33:35,287][54818] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-04-27 17:33:36,714][54818] Updated weights for policy 0, policy_version 435898 (0.0034) [2024-04-27 17:33:39,253][54587] Fps is (10 sec: 55705.3, 60 sec: 54886.2, 300 sec: 54928.1). Total num frames: 7141851136. Throughput: 0: 54787.4. Samples: 47108020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:39,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 17:33:40,338][54818] Updated weights for policy 0, policy_version 435908 (0.0029) [2024-04-27 17:33:42,832][54818] Updated weights for policy 0, policy_version 435918 (0.0034) [2024-04-27 17:33:44,253][54587] Fps is (10 sec: 55705.5, 60 sec: 54613.3, 300 sec: 54928.1). Total num frames: 7142129664. Throughput: 0: 54877.4. Samples: 47272700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:44,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 17:33:46,151][54818] Updated weights for policy 0, policy_version 435928 (0.0027) [2024-04-27 17:33:48,725][54818] Updated weights for policy 0, policy_version 435938 (0.0036) [2024-04-27 17:33:49,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55159.6, 300 sec: 54928.1). Total num frames: 7142424576. Throughput: 0: 54840.9. Samples: 47602040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:49,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 17:33:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000435939_7142424576.pth... [2024-04-27 17:33:49,307][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000435133_7129219072.pth [2024-04-27 17:33:52,027][54818] Updated weights for policy 0, policy_version 435948 (0.0034) [2024-04-27 17:33:54,253][54587] Fps is (10 sec: 57344.4, 60 sec: 54886.5, 300 sec: 54983.6). Total num frames: 7142703104. Throughput: 0: 54944.1. Samples: 47934020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:54,253][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 17:33:54,504][54818] Updated weights for policy 0, policy_version 435958 (0.0027) [2024-04-27 17:33:58,471][54818] Updated weights for policy 0, policy_version 435968 (0.0027) [2024-04-27 17:33:59,253][54587] Fps is (10 sec: 54066.2, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 7142965248. Throughput: 0: 54991.3. Samples: 48104020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:33:59,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 17:34:00,594][54818] Updated weights for policy 0, policy_version 435978 (0.0029) [2024-04-27 17:34:04,232][54818] Updated weights for policy 0, policy_version 435988 (0.0030) [2024-04-27 17:34:04,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 54872.5). 
Total num frames: 7143227392. Throughput: 0: 55081.4. Samples: 48433580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:34:04,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 17:34:06,426][54818] Updated weights for policy 0, policy_version 435998 (0.0036) [2024-04-27 17:34:09,253][54587] Fps is (10 sec: 52430.0, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 7143489536. Throughput: 0: 55027.6. Samples: 48763900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:34:09,253][54587] Avg episode reward: [(0, '0.672')] [2024-04-27 17:34:10,060][54818] Updated weights for policy 0, policy_version 436008 (0.0027) [2024-04-27 17:34:12,518][54818] Updated weights for policy 0, policy_version 436018 (0.0031) [2024-04-27 17:34:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 7143784448. Throughput: 0: 54985.5. Samples: 48928440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:34:14,253][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 17:34:15,949][54818] Updated weights for policy 0, policy_version 436028 (0.0032) [2024-04-27 17:34:18,358][54818] Updated weights for policy 0, policy_version 436038 (0.0038) [2024-04-27 17:34:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 54613.5, 300 sec: 54928.1). Total num frames: 7144062976. Throughput: 0: 55072.1. Samples: 49262460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:34:19,253][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 17:34:21,764][54818] Updated weights for policy 0, policy_version 436048 (0.0028) [2024-04-27 17:34:24,240][54818] Updated weights for policy 0, policy_version 436058 (0.0028) [2024-04-27 17:34:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55705.5, 300 sec: 55094.7). Total num frames: 7144374272. Throughput: 0: 55260.6. Samples: 49594740. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:34:24,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 17:34:27,730][54818] Updated weights for policy 0, policy_version 436068 (0.0028) [2024-04-27 17:34:29,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.7, 300 sec: 54983.6). Total num frames: 7144636416. Throughput: 0: 55408.9. Samples: 49766100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 17:34:29,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 17:34:30,211][54818] Updated weights for policy 0, policy_version 436078 (0.0031) [2024-04-27 17:34:33,756][54818] Updated weights for policy 0, policy_version 436088 (0.0032) [2024-04-27 17:34:34,253][54587] Fps is (10 sec: 50790.2, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 7144882176. Throughput: 0: 55451.0. Samples: 50097340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:34:34,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 17:34:36,161][54818] Updated weights for policy 0, policy_version 436098 (0.0031) [2024-04-27 17:34:39,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55159.5, 300 sec: 55039.1). Total num frames: 7145160704. Throughput: 0: 55442.1. Samples: 50428920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:34:39,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 17:34:39,888][54818] Updated weights for policy 0, policy_version 436108 (0.0023) [2024-04-27 17:34:40,187][54798] Signal inference workers to stop experience collection... (650 times) [2024-04-27 17:34:40,223][54818] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-04-27 17:34:40,235][54798] Signal inference workers to resume experience collection... 
(650 times) [2024-04-27 17:34:40,241][54818] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-04-27 17:34:42,261][54818] Updated weights for policy 0, policy_version 436118 (0.0032) [2024-04-27 17:34:44,253][54587] Fps is (10 sec: 54067.9, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 7145422848. Throughput: 0: 55126.0. Samples: 50584680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:34:44,254][54587] Avg episode reward: [(0, '0.717')] [2024-04-27 17:34:45,852][54818] Updated weights for policy 0, policy_version 436128 (0.0027) [2024-04-27 17:34:48,141][54818] Updated weights for policy 0, policy_version 436138 (0.0030) [2024-04-27 17:34:49,254][54587] Fps is (10 sec: 57343.3, 60 sec: 55159.3, 300 sec: 55094.6). Total num frames: 7145734144. Throughput: 0: 55054.4. Samples: 50911040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:34:49,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 17:34:51,736][54818] Updated weights for policy 0, policy_version 436148 (0.0031) [2024-04-27 17:34:54,172][54818] Updated weights for policy 0, policy_version 436158 (0.0032) [2024-04-27 17:34:54,253][54587] Fps is (10 sec: 58981.9, 60 sec: 55159.4, 300 sec: 55094.7). Total num frames: 7146012672. Throughput: 0: 54987.4. Samples: 51238340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:34:54,256][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 17:34:57,696][54818] Updated weights for policy 0, policy_version 436168 (0.0026) [2024-04-27 17:34:59,253][54587] Fps is (10 sec: 52430.1, 60 sec: 54886.6, 300 sec: 54928.1). Total num frames: 7146258432. Throughput: 0: 55150.3. Samples: 51410200. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:34:59,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 17:35:00,257][54818] Updated weights for policy 0, policy_version 436178 (0.0030) [2024-04-27 17:35:03,673][54818] Updated weights for policy 0, policy_version 436188 (0.0035) [2024-04-27 17:35:04,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55705.7, 300 sec: 54983.6). Total num frames: 7146569728. Throughput: 0: 55012.0. Samples: 51738000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:04,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 17:35:06,218][54818] Updated weights for policy 0, policy_version 436198 (0.0027) [2024-04-27 17:35:09,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7146799104. Throughput: 0: 54972.5. Samples: 52068500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:09,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 17:35:09,491][54818] Updated weights for policy 0, policy_version 436208 (0.0028) [2024-04-27 17:35:12,127][54818] Updated weights for policy 0, policy_version 436218 (0.0028) [2024-04-27 17:35:14,253][54587] Fps is (10 sec: 49152.0, 60 sec: 54613.4, 300 sec: 54928.1). Total num frames: 7147061248. Throughput: 0: 54710.8. Samples: 52228080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:14,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 17:35:15,487][54818] Updated weights for policy 0, policy_version 436228 (0.0026) [2024-04-27 17:35:18,042][54818] Updated weights for policy 0, policy_version 436238 (0.0034) [2024-04-27 17:35:19,253][54587] Fps is (10 sec: 55705.2, 60 sec: 54886.2, 300 sec: 54983.6). Total num frames: 7147356160. Throughput: 0: 54688.9. Samples: 52558340. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:19,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 17:35:21,528][54818] Updated weights for policy 0, policy_version 436248 (0.0030) [2024-04-27 17:35:23,839][54818] Updated weights for policy 0, policy_version 436258 (0.0036) [2024-04-27 17:35:24,253][54587] Fps is (10 sec: 58982.2, 60 sec: 54613.4, 300 sec: 54983.6). Total num frames: 7147651072. Throughput: 0: 54608.6. Samples: 52886300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:24,253][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 17:35:27,338][54818] Updated weights for policy 0, policy_version 436268 (0.0028) [2024-04-27 17:35:29,253][54587] Fps is (10 sec: 57345.0, 60 sec: 54886.4, 300 sec: 55039.1). Total num frames: 7147929600. Throughput: 0: 54883.1. Samples: 53054420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:29,254][54587] Avg episode reward: [(0, '0.700')] [2024-04-27 17:35:29,788][54818] Updated weights for policy 0, policy_version 436278 (0.0029) [2024-04-27 17:35:33,311][54818] Updated weights for policy 0, policy_version 436288 (0.0027) [2024-04-27 17:35:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.6, 300 sec: 54928.1). Total num frames: 7148191744. Throughput: 0: 54961.6. Samples: 53384300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:34,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 17:35:36,028][54818] Updated weights for policy 0, policy_version 436298 (0.0027) [2024-04-27 17:35:36,062][54798] Signal inference workers to stop experience collection... (700 times) [2024-04-27 17:35:36,062][54798] Signal inference workers to resume experience collection... 
(700 times) [2024-04-27 17:35:36,092][54818] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-04-27 17:35:36,093][54818] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-04-27 17:35:39,240][54818] Updated weights for policy 0, policy_version 436308 (0.0033) [2024-04-27 17:35:39,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 7148470272. Throughput: 0: 55060.5. Samples: 53716060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:39,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 17:35:42,090][54818] Updated weights for policy 0, policy_version 436318 (0.0030) [2024-04-27 17:35:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55039.1). Total num frames: 7148748800. Throughput: 0: 54879.5. Samples: 53879780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:44,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 17:35:44,992][54818] Updated weights for policy 0, policy_version 436328 (0.0032) [2024-04-27 17:35:48,145][54818] Updated weights for policy 0, policy_version 436338 (0.0032) [2024-04-27 17:35:49,253][54587] Fps is (10 sec: 52428.3, 60 sec: 54340.3, 300 sec: 54983.6). Total num frames: 7148994560. Throughput: 0: 54993.0. Samples: 54212700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:49,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 17:35:49,364][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000436341_7149010944.pth... [2024-04-27 17:35:49,431][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000435537_7135838208.pth [2024-04-27 17:35:51,003][54818] Updated weights for policy 0, policy_version 436348 (0.0027) [2024-04-27 17:35:53,942][54818] Updated weights for policy 0, policy_version 436358 (0.0026) [2024-04-27 17:35:54,253][54587] Fps is (10 sec: 54066.8, 60 sec: 54613.4, 300 sec: 54983.6). 
Total num frames: 7149289472. Throughput: 0: 54957.4. Samples: 54541580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 17:35:54,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 17:35:56,835][54818] Updated weights for policy 0, policy_version 436368 (0.0029) [2024-04-27 17:35:59,253][54587] Fps is (10 sec: 58983.1, 60 sec: 55432.4, 300 sec: 55039.1). Total num frames: 7149584384. Throughput: 0: 55098.5. Samples: 54707520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:35:59,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 17:35:59,780][54818] Updated weights for policy 0, policy_version 436378 (0.0029) [2024-04-27 17:36:02,788][54818] Updated weights for policy 0, policy_version 436388 (0.0031) [2024-04-27 17:36:04,253][54587] Fps is (10 sec: 55705.5, 60 sec: 54613.2, 300 sec: 55039.1). Total num frames: 7149846528. Throughput: 0: 55047.7. Samples: 55035480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:04,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 17:36:05,669][54818] Updated weights for policy 0, policy_version 436398 (0.0030) [2024-04-27 17:36:08,795][54818] Updated weights for policy 0, policy_version 436408 (0.0028) [2024-04-27 17:36:09,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 54928.0). Total num frames: 7150125056. Throughput: 0: 55150.2. Samples: 55368060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:09,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 17:36:11,674][54818] Updated weights for policy 0, policy_version 436418 (0.0028) [2024-04-27 17:36:14,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 54928.1). Total num frames: 7150387200. Throughput: 0: 55009.7. Samples: 55529860. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:14,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 17:36:14,692][54818] Updated weights for policy 0, policy_version 436428 (0.0028) [2024-04-27 17:36:17,786][54818] Updated weights for policy 0, policy_version 436438 (0.0030) [2024-04-27 17:36:19,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 7150665728. Throughput: 0: 55067.8. Samples: 55862360. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:19,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 17:36:20,691][54818] Updated weights for policy 0, policy_version 436448 (0.0031) [2024-04-27 17:36:24,181][54818] Updated weights for policy 0, policy_version 436458 (0.0029) [2024-04-27 17:36:24,253][54587] Fps is (10 sec: 54067.1, 60 sec: 54613.3, 300 sec: 54983.6). Total num frames: 7150927872. Throughput: 0: 55013.4. Samples: 56191660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:24,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 17:36:26,489][54818] Updated weights for policy 0, policy_version 436468 (0.0026) [2024-04-27 17:36:29,253][54587] Fps is (10 sec: 55706.0, 60 sec: 54886.3, 300 sec: 55039.1). Total num frames: 7151222784. Throughput: 0: 55000.8. Samples: 56354820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:29,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 17:36:30,150][54818] Updated weights for policy 0, policy_version 436478 (0.0030) [2024-04-27 17:36:32,474][54818] Updated weights for policy 0, policy_version 436488 (0.0027) [2024-04-27 17:36:34,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55159.3, 300 sec: 55039.1). Total num frames: 7151501312. Throughput: 0: 54877.8. Samples: 56682200. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:34,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 17:36:36,228][54818] Updated weights for policy 0, policy_version 436498 (0.0035) [2024-04-27 17:36:38,455][54818] Updated weights for policy 0, policy_version 436508 (0.0030) [2024-04-27 17:36:39,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 7151779840. Throughput: 0: 54820.9. Samples: 57008520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:39,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 17:36:42,189][54818] Updated weights for policy 0, policy_version 436518 (0.0026) [2024-04-27 17:36:44,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55159.3, 300 sec: 55039.1). Total num frames: 7152058368. Throughput: 0: 55001.2. Samples: 57182580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:44,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 17:36:44,436][54818] Updated weights for policy 0, policy_version 436528 (0.0031) [2024-04-27 17:36:48,097][54818] Updated weights for policy 0, policy_version 436538 (0.0032) [2024-04-27 17:36:49,143][54798] Signal inference workers to stop experience collection... (750 times) [2024-04-27 17:36:49,180][54818] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-04-27 17:36:49,236][54798] Signal inference workers to resume experience collection... (750 times) [2024-04-27 17:36:49,236][54818] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-04-27 17:36:49,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55159.6, 300 sec: 54928.0). Total num frames: 7152304128. Throughput: 0: 55013.3. Samples: 57511080. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:49,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 17:36:50,403][54818] Updated weights for policy 0, policy_version 436548 (0.0025) [2024-04-27 17:36:54,078][54818] Updated weights for policy 0, policy_version 436558 (0.0033) [2024-04-27 17:36:54,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7152582656. Throughput: 0: 54823.5. Samples: 57835120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:54,254][54587] Avg episode reward: [(0, '0.482')] [2024-04-27 17:36:56,501][54818] Updated weights for policy 0, policy_version 436568 (0.0031) [2024-04-27 17:36:59,253][54587] Fps is (10 sec: 54067.7, 60 sec: 54340.4, 300 sec: 54928.1). Total num frames: 7152844800. Throughput: 0: 54664.5. Samples: 57989760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:36:59,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 17:36:59,989][54818] Updated weights for policy 0, policy_version 436578 (0.0026) [2024-04-27 17:37:02,451][54818] Updated weights for policy 0, policy_version 436588 (0.0030) [2024-04-27 17:37:04,253][54587] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7153139712. Throughput: 0: 54571.6. Samples: 58318080. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:37:04,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 17:37:05,954][54818] Updated weights for policy 0, policy_version 436598 (0.0026) [2024-04-27 17:37:08,595][54818] Updated weights for policy 0, policy_version 436608 (0.0033) [2024-04-27 17:37:09,253][54587] Fps is (10 sec: 57343.5, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7153418240. Throughput: 0: 54533.3. Samples: 58645660. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:37:09,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 17:37:11,897][54818] Updated weights for policy 0, policy_version 436618 (0.0027) [2024-04-27 17:37:14,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55159.4, 300 sec: 54928.1). Total num frames: 7153696768. Throughput: 0: 54787.9. Samples: 58820280. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:37:14,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 17:37:14,368][54818] Updated weights for policy 0, policy_version 436628 (0.0036) [2024-04-27 17:37:17,892][54818] Updated weights for policy 0, policy_version 436638 (0.0028) [2024-04-27 17:37:19,253][54587] Fps is (10 sec: 50790.8, 60 sec: 54340.4, 300 sec: 54872.5). Total num frames: 7153926144. Throughput: 0: 54812.7. Samples: 59148760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 25.0) [2024-04-27 17:37:19,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 17:37:20,442][54818] Updated weights for policy 0, policy_version 436648 (0.0027) [2024-04-27 17:37:23,903][54818] Updated weights for policy 0, policy_version 436658 (0.0026) [2024-04-27 17:37:24,253][54587] Fps is (10 sec: 50790.2, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 7154204672. Throughput: 0: 54903.9. Samples: 59479200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:37:24,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 17:37:26,522][54818] Updated weights for policy 0, policy_version 436668 (0.0032) [2024-04-27 17:37:29,253][54587] Fps is (10 sec: 57343.7, 60 sec: 54613.4, 300 sec: 54983.6). Total num frames: 7154499584. Throughput: 0: 54450.4. Samples: 59632840. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:37:29,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-27 17:37:29,876][54818] Updated weights for policy 0, policy_version 436678 (0.0035) [2024-04-27 17:37:32,627][54818] Updated weights for policy 0, policy_version 436688 (0.0024) [2024-04-27 17:37:34,253][54587] Fps is (10 sec: 57344.7, 60 sec: 54613.5, 300 sec: 54983.6). Total num frames: 7154778112. Throughput: 0: 54477.4. Samples: 59962560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:37:34,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 17:37:35,744][54818] Updated weights for policy 0, policy_version 436698 (0.0033) [2024-04-27 17:37:38,452][54818] Updated weights for policy 0, policy_version 436708 (0.0027) [2024-04-27 17:37:39,253][54587] Fps is (10 sec: 55705.5, 60 sec: 54613.4, 300 sec: 54928.0). Total num frames: 7155056640. Throughput: 0: 54624.9. Samples: 60293240. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:37:39,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 17:37:41,619][54818] Updated weights for policy 0, policy_version 436718 (0.0032) [2024-04-27 17:37:44,253][54587] Fps is (10 sec: 55704.6, 60 sec: 54613.3, 300 sec: 54983.6). Total num frames: 7155335168. Throughput: 0: 54818.4. Samples: 60456600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:37:44,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 17:37:44,367][54818] Updated weights for policy 0, policy_version 436728 (0.0026) [2024-04-27 17:37:47,694][54818] Updated weights for policy 0, policy_version 436738 (0.0026) [2024-04-27 17:37:48,821][54798] Signal inference workers to stop experience collection... (800 times) [2024-04-27 17:37:48,826][54798] Signal inference workers to resume experience collection... 
(800 times) [2024-04-27 17:37:48,840][54818] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-04-27 17:37:48,840][54818] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-04-27 17:37:49,253][54587] Fps is (10 sec: 54067.4, 60 sec: 54886.5, 300 sec: 54872.5). Total num frames: 7155597312. Throughput: 0: 54887.2. Samples: 60788000. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:37:49,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 17:37:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000436743_7155597312.pth... [2024-04-27 17:37:49,314][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000435939_7142424576.pth [2024-04-27 17:37:50,408][54818] Updated weights for policy 0, policy_version 436748 (0.0030) [2024-04-27 17:37:53,801][54818] Updated weights for policy 0, policy_version 436758 (0.0027) [2024-04-27 17:37:54,253][54587] Fps is (10 sec: 50791.1, 60 sec: 54340.3, 300 sec: 54872.5). Total num frames: 7155843072. Throughput: 0: 54944.4. Samples: 61118160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:37:54,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 17:37:56,295][54818] Updated weights for policy 0, policy_version 436768 (0.0035) [2024-04-27 17:37:59,253][54587] Fps is (10 sec: 52428.7, 60 sec: 54613.3, 300 sec: 54928.1). Total num frames: 7156121600. Throughput: 0: 54571.6. Samples: 61276000. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:37:59,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 17:37:59,972][54818] Updated weights for policy 0, policy_version 436778 (0.0032) [2024-04-27 17:38:02,333][54818] Updated weights for policy 0, policy_version 436788 (0.0033) [2024-04-27 17:38:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 54613.2, 300 sec: 54928.0). Total num frames: 7156416512. Throughput: 0: 54663.3. Samples: 61608620. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:38:04,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 17:38:05,681][54818] Updated weights for policy 0, policy_version 436798 (0.0028) [2024-04-27 17:38:08,564][54818] Updated weights for policy 0, policy_version 436808 (0.0029) [2024-04-27 17:38:09,253][54587] Fps is (10 sec: 57343.0, 60 sec: 54613.2, 300 sec: 54928.0). Total num frames: 7156695040. Throughput: 0: 54654.1. Samples: 61938640. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:38:09,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 17:38:11,585][54818] Updated weights for policy 0, policy_version 436818 (0.0033) [2024-04-27 17:38:14,253][54587] Fps is (10 sec: 55705.5, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 7156973568. Throughput: 0: 54918.9. Samples: 62104200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:38:14,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 17:38:14,520][54818] Updated weights for policy 0, policy_version 436828 (0.0027) [2024-04-27 17:38:17,495][54818] Updated weights for policy 0, policy_version 436838 (0.0027) [2024-04-27 17:38:19,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55159.4, 300 sec: 54928.0). Total num frames: 7157235712. Throughput: 0: 54940.8. Samples: 62434900. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:38:19,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 17:38:20,442][54818] Updated weights for policy 0, policy_version 436848 (0.0030) [2024-04-27 17:38:23,660][54818] Updated weights for policy 0, policy_version 436858 (0.0030) [2024-04-27 17:38:24,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55159.6, 300 sec: 54983.6). Total num frames: 7157514240. Throughput: 0: 54997.8. Samples: 62768140. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:38:24,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 17:38:26,370][54818] Updated weights for policy 0, policy_version 436868 (0.0031) [2024-04-27 17:38:29,253][54587] Fps is (10 sec: 55705.6, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7157792768. Throughput: 0: 55032.6. Samples: 62933060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:38:29,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 17:38:29,550][54818] Updated weights for policy 0, policy_version 436878 (0.0027) [2024-04-27 17:38:32,237][54818] Updated weights for policy 0, policy_version 436888 (0.0029) [2024-04-27 17:38:34,253][54587] Fps is (10 sec: 55705.3, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7158071296. Throughput: 0: 55026.6. Samples: 63264200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:38:34,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 17:38:35,374][54818] Updated weights for policy 0, policy_version 436898 (0.0031) [2024-04-27 17:38:38,136][54818] Updated weights for policy 0, policy_version 436908 (0.0033) [2024-04-27 17:38:39,253][54587] Fps is (10 sec: 54067.6, 60 sec: 54613.4, 300 sec: 54928.1). Total num frames: 7158333440. Throughput: 0: 54913.9. Samples: 63589280. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-27 17:38:39,253][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 17:38:41,202][54818] Updated weights for policy 0, policy_version 436918 (0.0028) [2024-04-27 17:38:44,253][54587] Fps is (10 sec: 54068.0, 60 sec: 54613.6, 300 sec: 54872.5). Total num frames: 7158611968. Throughput: 0: 55155.7. Samples: 63758000. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:38:44,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 17:38:44,263][54818] Updated weights for policy 0, policy_version 436928 (0.0033) [2024-04-27 17:38:47,443][54818] Updated weights for policy 0, policy_version 436938 (0.0029) [2024-04-27 17:38:49,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55159.4, 300 sec: 54928.0). Total num frames: 7158906880. Throughput: 0: 55103.2. Samples: 64088260. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:38:49,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 17:38:50,306][54818] Updated weights for policy 0, policy_version 436948 (0.0030) [2024-04-27 17:38:53,244][54818] Updated weights for policy 0, policy_version 436958 (0.0028) [2024-04-27 17:38:54,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 54928.1). Total num frames: 7159169024. Throughput: 0: 55143.7. Samples: 64420100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:38:54,255][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 17:38:56,271][54818] Updated weights for policy 0, policy_version 436968 (0.0029) [2024-04-27 17:38:59,135][54818] Updated weights for policy 0, policy_version 436978 (0.0034) [2024-04-27 17:38:59,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 54983.6). Total num frames: 7159447552. Throughput: 0: 55191.4. Samples: 64587800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:38:59,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 17:39:00,474][54798] Signal inference workers to stop experience collection... (850 times) [2024-04-27 17:39:00,474][54798] Signal inference workers to resume experience collection... 
(850 times) [2024-04-27 17:39:00,493][54818] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-04-27 17:39:00,493][54818] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-04-27 17:39:02,107][54818] Updated weights for policy 0, policy_version 436988 (0.0033) [2024-04-27 17:39:04,253][54587] Fps is (10 sec: 54066.9, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7159709696. Throughput: 0: 55187.0. Samples: 64918320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:04,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 17:39:05,096][54818] Updated weights for policy 0, policy_version 436998 (0.0029) [2024-04-27 17:39:08,343][54818] Updated weights for policy 0, policy_version 437008 (0.0032) [2024-04-27 17:39:09,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55159.6, 300 sec: 54983.6). Total num frames: 7160004608. Throughput: 0: 55077.8. Samples: 65246640. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:09,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 17:39:11,312][54818] Updated weights for policy 0, policy_version 437018 (0.0034) [2024-04-27 17:39:14,253][54587] Fps is (10 sec: 54067.9, 60 sec: 54613.5, 300 sec: 54872.5). Total num frames: 7160250368. Throughput: 0: 55152.5. Samples: 65414920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:14,253][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 17:39:14,263][54818] Updated weights for policy 0, policy_version 437028 (0.0030) [2024-04-27 17:39:17,279][54818] Updated weights for policy 0, policy_version 437038 (0.0026) [2024-04-27 17:39:19,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 7160545280. Throughput: 0: 55100.5. Samples: 65743720. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:19,254][54587] Avg episode reward: [(0, '0.476')] [2024-04-27 17:39:20,115][54818] Updated weights for policy 0, policy_version 437048 (0.0029) [2024-04-27 17:39:23,199][54818] Updated weights for policy 0, policy_version 437058 (0.0032) [2024-04-27 17:39:24,253][54587] Fps is (10 sec: 58981.2, 60 sec: 55432.4, 300 sec: 54928.0). Total num frames: 7160840192. Throughput: 0: 55167.3. Samples: 66071820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:24,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 17:39:26,206][54818] Updated weights for policy 0, policy_version 437068 (0.0034) [2024-04-27 17:39:29,097][54818] Updated weights for policy 0, policy_version 437078 (0.0028) [2024-04-27 17:39:29,253][54587] Fps is (10 sec: 54067.4, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 7161085952. Throughput: 0: 55150.2. Samples: 66239760. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:29,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 17:39:32,112][54818] Updated weights for policy 0, policy_version 437088 (0.0023) [2024-04-27 17:39:34,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7161380864. Throughput: 0: 55103.1. Samples: 66567900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:34,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 17:39:35,087][54818] Updated weights for policy 0, policy_version 437098 (0.0029) [2024-04-27 17:39:38,002][54818] Updated weights for policy 0, policy_version 437108 (0.0031) [2024-04-27 17:39:39,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55432.4, 300 sec: 55039.1). Total num frames: 7161659392. Throughput: 0: 55027.4. Samples: 66896340. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:39,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 17:39:41,031][54818] Updated weights for policy 0, policy_version 437118 (0.0030) [2024-04-27 17:39:43,958][54818] Updated weights for policy 0, policy_version 437128 (0.0030) [2024-04-27 17:39:44,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 7161905152. Throughput: 0: 54870.6. Samples: 67056980. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:44,254][54587] Avg episode reward: [(0, '0.507')] [2024-04-27 17:39:46,978][54818] Updated weights for policy 0, policy_version 437138 (0.0027) [2024-04-27 17:39:49,253][54587] Fps is (10 sec: 50791.0, 60 sec: 54340.3, 300 sec: 54761.4). Total num frames: 7162167296. Throughput: 0: 54846.3. Samples: 67386400. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:49,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 17:39:49,295][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000437145_7162183680.pth... [2024-04-27 17:39:49,351][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000436341_7149010944.pth [2024-04-27 17:39:50,028][54818] Updated weights for policy 0, policy_version 437148 (0.0035) [2024-04-27 17:39:53,034][54818] Updated weights for policy 0, policy_version 437158 (0.0029) [2024-04-27 17:39:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7162478592. Throughput: 0: 54696.8. Samples: 67708000. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:54,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 17:39:56,105][54818] Updated weights for policy 0, policy_version 437168 (0.0026) [2024-04-27 17:39:59,026][54818] Updated weights for policy 0, policy_version 437178 (0.0034) [2024-04-27 17:39:59,253][54587] Fps is (10 sec: 57344.0, 60 sec: 54886.4, 300 sec: 54817.0). 
Total num frames: 7162740736. Throughput: 0: 54808.4. Samples: 67881300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:39:59,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 17:39:59,766][54798] Signal inference workers to stop experience collection... (900 times) [2024-04-27 17:39:59,797][54818] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-04-27 17:39:59,828][54798] Signal inference workers to resume experience collection... (900 times) [2024-04-27 17:39:59,828][54818] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-04-27 17:40:01,965][54818] Updated weights for policy 0, policy_version 437188 (0.0036) [2024-04-27 17:40:04,253][54587] Fps is (10 sec: 52428.3, 60 sec: 54886.3, 300 sec: 54928.0). Total num frames: 7163002880. Throughput: 0: 54812.2. Samples: 68210280. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 17:40:04,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 17:40:04,891][54818] Updated weights for policy 0, policy_version 437198 (0.0031) [2024-04-27 17:40:07,906][54818] Updated weights for policy 0, policy_version 437208 (0.0028) [2024-04-27 17:40:09,253][54587] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 55039.1). Total num frames: 7163297792. Throughput: 0: 54747.7. Samples: 68535460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:09,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 17:40:10,807][54818] Updated weights for policy 0, policy_version 437218 (0.0031) [2024-04-27 17:40:13,844][54818] Updated weights for policy 0, policy_version 437228 (0.0032) [2024-04-27 17:40:14,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55159.4, 300 sec: 54928.1). Total num frames: 7163559936. Throughput: 0: 54633.7. Samples: 68698280. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:14,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 17:40:16,773][54818] Updated weights for policy 0, policy_version 437238 (0.0031) [2024-04-27 17:40:19,253][54587] Fps is (10 sec: 54066.4, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 7163838464. Throughput: 0: 54742.2. Samples: 69031300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:19,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 17:40:19,831][54818] Updated weights for policy 0, policy_version 437248 (0.0028) [2024-04-27 17:40:22,657][54818] Updated weights for policy 0, policy_version 437258 (0.0032) [2024-04-27 17:40:24,253][54587] Fps is (10 sec: 54067.8, 60 sec: 54340.5, 300 sec: 54817.0). Total num frames: 7164100608. Throughput: 0: 54828.2. Samples: 69363600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:24,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 17:40:25,656][54818] Updated weights for policy 0, policy_version 437268 (0.0030) [2024-04-27 17:40:28,636][54818] Updated weights for policy 0, policy_version 437278 (0.0026) [2024-04-27 17:40:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55432.4, 300 sec: 54983.6). Total num frames: 7164411904. Throughput: 0: 54939.4. Samples: 69529260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:29,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 17:40:31,507][54818] Updated weights for policy 0, policy_version 437288 (0.0032) [2024-04-27 17:40:34,253][54587] Fps is (10 sec: 55705.1, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 7164657664. Throughput: 0: 54980.4. Samples: 69860520. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:34,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 17:40:34,668][54818] Updated weights for policy 0, policy_version 437298 (0.0034) [2024-04-27 17:40:37,583][54818] Updated weights for policy 0, policy_version 437308 (0.0033) [2024-04-27 17:40:39,253][54587] Fps is (10 sec: 54068.4, 60 sec: 54886.6, 300 sec: 54928.1). Total num frames: 7164952576. Throughput: 0: 55242.4. Samples: 70193900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:39,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 17:40:40,536][54818] Updated weights for policy 0, policy_version 437318 (0.0022) [2024-04-27 17:40:43,372][54818] Updated weights for policy 0, policy_version 437328 (0.0028) [2024-04-27 17:40:44,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55039.2). Total num frames: 7165231104. Throughput: 0: 55182.3. Samples: 70364500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:44,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 17:40:46,484][54818] Updated weights for policy 0, policy_version 437338 (0.0027) [2024-04-27 17:40:49,240][54818] Updated weights for policy 0, policy_version 437348 (0.0031) [2024-04-27 17:40:49,253][54587] Fps is (10 sec: 55704.4, 60 sec: 55705.5, 300 sec: 54983.6). Total num frames: 7165509632. Throughput: 0: 55202.3. Samples: 70694380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:49,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 17:40:51,375][54798] Signal inference workers to stop experience collection... (950 times) [2024-04-27 17:40:51,376][54798] Signal inference workers to resume experience collection... 
(950 times) [2024-04-27 17:40:51,389][54818] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-04-27 17:40:51,390][54818] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-04-27 17:40:52,529][54818] Updated weights for policy 0, policy_version 437358 (0.0031) [2024-04-27 17:40:54,253][54587] Fps is (10 sec: 52427.7, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 7165755392. Throughput: 0: 55180.2. Samples: 71018580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:54,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 17:40:55,282][54818] Updated weights for policy 0, policy_version 437368 (0.0033) [2024-04-27 17:40:58,479][54818] Updated weights for policy 0, policy_version 437378 (0.0031) [2024-04-27 17:40:59,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 7166050304. Throughput: 0: 55240.1. Samples: 71184080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:40:59,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 17:41:01,233][54818] Updated weights for policy 0, policy_version 437388 (0.0037) [2024-04-27 17:41:04,253][54587] Fps is (10 sec: 55706.8, 60 sec: 55159.7, 300 sec: 54872.5). Total num frames: 7166312448. Throughput: 0: 55154.0. Samples: 71513220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:41:04,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 17:41:04,437][54818] Updated weights for policy 0, policy_version 437398 (0.0031) [2024-04-27 17:41:07,070][54818] Updated weights for policy 0, policy_version 437408 (0.0031) [2024-04-27 17:41:09,254][54587] Fps is (10 sec: 54065.6, 60 sec: 54886.2, 300 sec: 54928.0). Total num frames: 7166590976. Throughput: 0: 55178.7. Samples: 71846660. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:41:09,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 17:41:10,390][54818] Updated weights for policy 0, policy_version 437418 (0.0031) [2024-04-27 17:41:12,887][54818] Updated weights for policy 0, policy_version 437428 (0.0037) [2024-04-27 17:41:14,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55432.4, 300 sec: 54983.6). Total num frames: 7166885888. Throughput: 0: 55097.4. Samples: 72008640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:41:14,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 17:41:16,337][54818] Updated weights for policy 0, policy_version 437438 (0.0028) [2024-04-27 17:41:18,904][54818] Updated weights for policy 0, policy_version 437448 (0.0034) [2024-04-27 17:41:19,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55432.6, 300 sec: 55039.1). Total num frames: 7167164416. Throughput: 0: 55147.9. Samples: 72342180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:41:19,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 17:41:22,274][54818] Updated weights for policy 0, policy_version 437458 (0.0031) [2024-04-27 17:41:24,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.4, 300 sec: 54928.0). Total num frames: 7167426560. Throughput: 0: 55010.9. Samples: 72669400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 17:41:24,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-27 17:41:24,845][54818] Updated weights for policy 0, policy_version 437468 (0.0030) [2024-04-27 17:41:28,136][54818] Updated weights for policy 0, policy_version 437478 (0.0034) [2024-04-27 17:41:29,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 7167688704. Throughput: 0: 54951.0. Samples: 72837300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:41:29,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 17:41:30,707][54818] Updated weights for policy 0, policy_version 437488 (0.0028) [2024-04-27 17:41:34,188][54818] Updated weights for policy 0, policy_version 437498 (0.0022) [2024-04-27 17:41:34,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 7167967232. Throughput: 0: 55038.3. Samples: 73171100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:41:34,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 17:41:36,739][54818] Updated weights for policy 0, policy_version 437508 (0.0030) [2024-04-27 17:41:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 7168245760. Throughput: 0: 55196.6. Samples: 73502420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:41:39,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 17:41:39,961][54818] Updated weights for policy 0, policy_version 437518 (0.0034) [2024-04-27 17:41:42,809][54818] Updated weights for policy 0, policy_version 437528 (0.0028) [2024-04-27 17:41:44,253][54587] Fps is (10 sec: 55706.1, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7168524288. Throughput: 0: 55148.0. Samples: 73665740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:41:44,254][54587] Avg episode reward: [(0, '0.480')] [2024-04-27 17:41:45,934][54818] Updated weights for policy 0, policy_version 437538 (0.0026) [2024-04-27 17:41:48,805][54818] Updated weights for policy 0, policy_version 437548 (0.0029) [2024-04-27 17:41:49,253][54587] Fps is (10 sec: 55705.1, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7168802816. Throughput: 0: 55068.7. Samples: 73991320. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:41:49,254][54587] Avg episode reward: [(0, '0.503')] [2024-04-27 17:41:49,337][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000437550_7168819200.pth... [2024-04-27 17:41:49,385][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000436743_7155597312.pth [2024-04-27 17:41:51,848][54818] Updated weights for policy 0, policy_version 437558 (0.0030) [2024-04-27 17:41:54,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.7, 300 sec: 55039.1). Total num frames: 7169081344. Throughput: 0: 54975.0. Samples: 74320520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:41:54,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 17:41:54,698][54818] Updated weights for policy 0, policy_version 437568 (0.0030) [2024-04-27 17:41:57,303][54798] Signal inference workers to stop experience collection... (1000 times) [2024-04-27 17:41:57,328][54818] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-04-27 17:41:57,364][54798] Signal inference workers to resume experience collection... (1000 times) [2024-04-27 17:41:57,364][54818] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-04-27 17:41:57,816][54818] Updated weights for policy 0, policy_version 437578 (0.0027) [2024-04-27 17:41:59,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 55039.1). Total num frames: 7169376256. Throughput: 0: 55313.9. Samples: 74497760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:41:59,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 17:42:00,566][54818] Updated weights for policy 0, policy_version 437588 (0.0032) [2024-04-27 17:42:03,818][54818] Updated weights for policy 0, policy_version 437598 (0.0027) [2024-04-27 17:42:04,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 54928.1). Total num frames: 7169622016. Throughput: 0: 55208.1. Samples: 74826540. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:04,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 17:42:06,569][54818] Updated weights for policy 0, policy_version 437608 (0.0025) [2024-04-27 17:42:09,253][54587] Fps is (10 sec: 50790.7, 60 sec: 54886.7, 300 sec: 54872.5). Total num frames: 7169884160. Throughput: 0: 55268.2. Samples: 75156460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:09,253][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 17:42:09,753][54818] Updated weights for policy 0, policy_version 437618 (0.0026) [2024-04-27 17:42:12,691][54818] Updated weights for policy 0, policy_version 437628 (0.0027) [2024-04-27 17:42:14,253][54587] Fps is (10 sec: 54067.5, 60 sec: 54613.5, 300 sec: 55039.1). Total num frames: 7170162688. Throughput: 0: 55093.4. Samples: 75316500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:14,253][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 17:42:15,738][54818] Updated weights for policy 0, policy_version 437638 (0.0027) [2024-04-27 17:42:18,570][54818] Updated weights for policy 0, policy_version 437648 (0.0029) [2024-04-27 17:42:19,253][54587] Fps is (10 sec: 58981.6, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 7170473984. Throughput: 0: 55032.9. Samples: 75647580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:19,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 17:42:21,641][54818] Updated weights for policy 0, policy_version 437658 (0.0035) [2024-04-27 17:42:24,253][54587] Fps is (10 sec: 55705.8, 60 sec: 54886.6, 300 sec: 54983.6). Total num frames: 7170719744. Throughput: 0: 54932.1. Samples: 75974360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:24,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 17:42:24,539][54818] Updated weights for policy 0, policy_version 437668 (0.0031) [2024-04-27 17:42:27,559][54818] Updated weights for policy 0, policy_version 437678 (0.0027) [2024-04-27 17:42:29,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55039.1). Total num frames: 7171014656. Throughput: 0: 55072.3. Samples: 76144000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:29,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 17:42:30,429][54818] Updated weights for policy 0, policy_version 437688 (0.0027) [2024-04-27 17:42:33,621][54818] Updated weights for policy 0, policy_version 437698 (0.0027) [2024-04-27 17:42:34,253][54587] Fps is (10 sec: 54066.4, 60 sec: 54886.4, 300 sec: 54928.0). Total num frames: 7171260416. Throughput: 0: 55159.2. Samples: 76473480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:34,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 17:42:36,433][54818] Updated weights for policy 0, policy_version 437708 (0.0034) [2024-04-27 17:42:39,253][54587] Fps is (10 sec: 52429.8, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 7171538944. Throughput: 0: 55218.2. Samples: 76805340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:39,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 17:42:39,480][54818] Updated weights for policy 0, policy_version 437718 (0.0028) [2024-04-27 17:42:42,319][54818] Updated weights for policy 0, policy_version 437728 (0.0027) [2024-04-27 17:42:44,253][54587] Fps is (10 sec: 55705.3, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7171817472. Throughput: 0: 54742.1. Samples: 76961160. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:44,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 17:42:45,462][54818] Updated weights for policy 0, policy_version 437738 (0.0026) [2024-04-27 17:42:48,252][54818] Updated weights for policy 0, policy_version 437748 (0.0031) [2024-04-27 17:42:49,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 7172112384. Throughput: 0: 54752.0. Samples: 77290380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 17:42:49,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 17:42:51,491][54818] Updated weights for policy 0, policy_version 437758 (0.0029) [2024-04-27 17:42:54,253][54587] Fps is (10 sec: 55706.2, 60 sec: 54886.4, 300 sec: 55094.7). Total num frames: 7172374528. Throughput: 0: 54735.0. Samples: 77619540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:42:54,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 17:42:54,260][54818] Updated weights for policy 0, policy_version 437768 (0.0035) [2024-04-27 17:42:57,423][54818] Updated weights for policy 0, policy_version 437778 (0.0028) [2024-04-27 17:42:59,253][54587] Fps is (10 sec: 54066.6, 60 sec: 54613.2, 300 sec: 55039.1). Total num frames: 7172653056. Throughput: 0: 54987.8. Samples: 77790960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:42:59,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 17:43:00,320][54818] Updated weights for policy 0, policy_version 437788 (0.0027) [2024-04-27 17:43:03,364][54818] Updated weights for policy 0, policy_version 437798 (0.0028) [2024-04-27 17:43:03,379][54798] Signal inference workers to stop experience collection... (1050 times) [2024-04-27 17:43:03,379][54798] Signal inference workers to resume experience collection... 
(1050 times) [2024-04-27 17:43:03,394][54818] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-04-27 17:43:03,394][54818] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-04-27 17:43:04,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 7172931584. Throughput: 0: 54938.6. Samples: 78119820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:04,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 17:43:06,110][54818] Updated weights for policy 0, policy_version 437808 (0.0034) [2024-04-27 17:43:09,191][54818] Updated weights for policy 0, policy_version 437818 (0.0028) [2024-04-27 17:43:09,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55432.5, 300 sec: 55039.2). Total num frames: 7173210112. Throughput: 0: 55056.0. Samples: 78451880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:09,253][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 17:43:12,020][54818] Updated weights for policy 0, policy_version 437828 (0.0030) [2024-04-27 17:43:14,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7173455872. Throughput: 0: 54890.8. Samples: 78614080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:14,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 17:43:15,093][54818] Updated weights for policy 0, policy_version 437838 (0.0025) [2024-04-27 17:43:18,010][54818] Updated weights for policy 0, policy_version 437848 (0.0027) [2024-04-27 17:43:19,253][54587] Fps is (10 sec: 52427.9, 60 sec: 54340.2, 300 sec: 54983.6). Total num frames: 7173734400. Throughput: 0: 54866.6. Samples: 78942480. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:19,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 17:43:21,172][54818] Updated weights for policy 0, policy_version 437858 (0.0030) [2024-04-27 17:43:23,996][54818] Updated weights for policy 0, policy_version 437868 (0.0029) [2024-04-27 17:43:24,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 7174029312. Throughput: 0: 54783.1. Samples: 79270580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:24,253][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 17:43:26,980][54818] Updated weights for policy 0, policy_version 437878 (0.0030) [2024-04-27 17:43:29,253][54587] Fps is (10 sec: 57344.6, 60 sec: 54886.5, 300 sec: 55039.1). Total num frames: 7174307840. Throughput: 0: 55024.6. Samples: 79437260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:29,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 17:43:29,935][54818] Updated weights for policy 0, policy_version 437888 (0.0026) [2024-04-27 17:43:32,789][54818] Updated weights for policy 0, policy_version 437898 (0.0025) [2024-04-27 17:43:34,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 7174569984. Throughput: 0: 54918.2. Samples: 79761700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:34,254][54587] Avg episode reward: [(0, '0.677')] [2024-04-27 17:43:35,997][54818] Updated weights for policy 0, policy_version 437908 (0.0026) [2024-04-27 17:43:38,798][54818] Updated weights for policy 0, policy_version 437918 (0.0026) [2024-04-27 17:43:39,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 7174848512. Throughput: 0: 54897.8. Samples: 80089940. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:39,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 17:43:41,983][54818] Updated weights for policy 0, policy_version 437928 (0.0027) [2024-04-27 17:43:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 7175127040. Throughput: 0: 54873.0. Samples: 80260240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:44,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 17:43:44,857][54818] Updated weights for policy 0, policy_version 437938 (0.0028) [2024-04-27 17:43:48,234][54818] Updated weights for policy 0, policy_version 437948 (0.0030) [2024-04-27 17:43:49,253][54587] Fps is (10 sec: 54066.2, 60 sec: 54613.2, 300 sec: 54983.6). Total num frames: 7175389184. Throughput: 0: 54955.1. Samples: 80592800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:49,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 17:43:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000437951_7175389184.pth... [2024-04-27 17:43:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000437145_7162183680.pth [2024-04-27 17:43:50,796][54818] Updated weights for policy 0, policy_version 437958 (0.0030) [2024-04-27 17:43:54,052][54818] Updated weights for policy 0, policy_version 437968 (0.0032) [2024-04-27 17:43:54,253][54587] Fps is (10 sec: 54066.9, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7175667712. Throughput: 0: 54804.2. Samples: 80918080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 17:43:56,722][54818] Updated weights for policy 0, policy_version 437978 (0.0026) [2024-04-27 17:43:59,253][54587] Fps is (10 sec: 55706.7, 60 sec: 54886.6, 300 sec: 55039.2). Total num frames: 7175946240. Throughput: 0: 54840.1. Samples: 81081880. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:43:59,253][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 17:44:00,060][54818] Updated weights for policy 0, policy_version 437988 (0.0032) [2024-04-27 17:44:02,509][54818] Updated weights for policy 0, policy_version 437998 (0.0030) [2024-04-27 17:44:04,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55159.6, 300 sec: 55039.1). Total num frames: 7176241152. Throughput: 0: 54839.7. Samples: 81410260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:44:04,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 17:44:06,069][54818] Updated weights for policy 0, policy_version 438008 (0.0030) [2024-04-27 17:44:08,719][54818] Updated weights for policy 0, policy_version 438018 (0.0027) [2024-04-27 17:44:09,253][54587] Fps is (10 sec: 55705.1, 60 sec: 54886.3, 300 sec: 55094.6). Total num frames: 7176503296. Throughput: 0: 55026.5. Samples: 81746780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 17:44:09,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 17:44:09,990][54798] Signal inference workers to stop experience collection... (1100 times) [2024-04-27 17:44:10,030][54818] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-04-27 17:44:10,080][54798] Signal inference workers to resume experience collection... (1100 times) [2024-04-27 17:44:10,080][54818] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-04-27 17:44:11,854][54818] Updated weights for policy 0, policy_version 438028 (0.0025) [2024-04-27 17:44:14,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55094.7). Total num frames: 7176798208. Throughput: 0: 54981.4. Samples: 81911420. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:14,253][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 17:44:14,733][54818] Updated weights for policy 0, policy_version 438038 (0.0032) [2024-04-27 17:44:17,833][54818] Updated weights for policy 0, policy_version 438048 (0.0032) [2024-04-27 17:44:19,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55159.6, 300 sec: 54928.1). Total num frames: 7177043968. Throughput: 0: 55129.1. Samples: 82242500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:19,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 17:44:20,788][54818] Updated weights for policy 0, policy_version 438058 (0.0040) [2024-04-27 17:44:23,792][54818] Updated weights for policy 0, policy_version 438068 (0.0028) [2024-04-27 17:44:24,253][54587] Fps is (10 sec: 52428.0, 60 sec: 54886.3, 300 sec: 55039.1). Total num frames: 7177322496. Throughput: 0: 55190.1. Samples: 82573500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:24,254][54587] Avg episode reward: [(0, '0.500')] [2024-04-27 17:44:26,639][54818] Updated weights for policy 0, policy_version 438078 (0.0034) [2024-04-27 17:44:29,253][54587] Fps is (10 sec: 54066.7, 60 sec: 54613.3, 300 sec: 54928.1). Total num frames: 7177584640. Throughput: 0: 54982.3. Samples: 82734440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:29,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 17:44:29,822][54818] Updated weights for policy 0, policy_version 438088 (0.0028) [2024-04-27 17:44:32,470][54818] Updated weights for policy 0, policy_version 438098 (0.0025) [2024-04-27 17:44:34,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7177879552. Throughput: 0: 54940.0. Samples: 83065100. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:34,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 17:44:35,832][54818] Updated weights for policy 0, policy_version 438108 (0.0030) [2024-04-27 17:44:38,669][54818] Updated weights for policy 0, policy_version 438118 (0.0025) [2024-04-27 17:44:39,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 7178158080. Throughput: 0: 55104.6. Samples: 83397780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:39,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 17:44:41,768][54818] Updated weights for policy 0, policy_version 438128 (0.0029) [2024-04-27 17:44:44,253][54587] Fps is (10 sec: 54067.5, 60 sec: 54886.4, 300 sec: 55094.7). Total num frames: 7178420224. Throughput: 0: 55215.9. Samples: 83566600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:44,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 17:44:44,610][54818] Updated weights for policy 0, policy_version 438138 (0.0026) [2024-04-27 17:44:47,641][54818] Updated weights for policy 0, policy_version 438148 (0.0033) [2024-04-27 17:44:49,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.6, 300 sec: 54983.6). Total num frames: 7178698752. Throughput: 0: 55190.2. Samples: 83893820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:49,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 17:44:50,457][54818] Updated weights for policy 0, policy_version 438158 (0.0027) [2024-04-27 17:44:53,527][54818] Updated weights for policy 0, policy_version 438168 (0.0028) [2024-04-27 17:44:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55039.1). Total num frames: 7178977280. Throughput: 0: 55116.9. Samples: 84227040. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:54,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 17:44:56,341][54818] Updated weights for policy 0, policy_version 438178 (0.0029) [2024-04-27 17:44:59,253][54587] Fps is (10 sec: 54066.8, 60 sec: 54886.3, 300 sec: 55039.1). Total num frames: 7179239424. Throughput: 0: 55111.8. Samples: 84391460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:44:59,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 17:44:59,610][54818] Updated weights for policy 0, policy_version 438188 (0.0026) [2024-04-27 17:45:02,363][54818] Updated weights for policy 0, policy_version 438198 (0.0036) [2024-04-27 17:45:04,253][54587] Fps is (10 sec: 55706.8, 60 sec: 54886.6, 300 sec: 55039.2). Total num frames: 7179534336. Throughput: 0: 54975.2. Samples: 84716380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:45:04,253][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 17:45:05,674][54818] Updated weights for policy 0, policy_version 438208 (0.0028) [2024-04-27 17:45:08,355][54818] Updated weights for policy 0, policy_version 438218 (0.0029) [2024-04-27 17:45:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 7179812864. Throughput: 0: 55016.1. Samples: 85049220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:45:09,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 17:45:11,556][54818] Updated weights for policy 0, policy_version 438228 (0.0026) [2024-04-27 17:45:14,139][54818] Updated weights for policy 0, policy_version 438238 (0.0027) [2024-04-27 17:45:14,253][54587] Fps is (10 sec: 55704.8, 60 sec: 54886.3, 300 sec: 55094.7). Total num frames: 7180091392. Throughput: 0: 55153.8. Samples: 85216360. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:45:14,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 17:45:17,466][54818] Updated weights for policy 0, policy_version 438248 (0.0034) [2024-04-27 17:45:18,642][54798] Signal inference workers to stop experience collection... (1150 times) [2024-04-27 17:45:18,644][54798] Signal inference workers to resume experience collection... (1150 times) [2024-04-27 17:45:18,684][54818] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-04-27 17:45:18,684][54818] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-04-27 17:45:19,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55150.2). Total num frames: 7180369920. Throughput: 0: 55184.5. Samples: 85548400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:45:19,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 17:45:20,017][54818] Updated weights for policy 0, policy_version 438258 (0.0028) [2024-04-27 17:45:23,412][54818] Updated weights for policy 0, policy_version 438268 (0.0031) [2024-04-27 17:45:24,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 7180632064. Throughput: 0: 55006.1. Samples: 85873060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:45:24,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 17:45:26,100][54818] Updated weights for policy 0, policy_version 438278 (0.0032) [2024-04-27 17:45:29,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 7180894208. Throughput: 0: 54924.5. Samples: 86038200. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:45:29,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 17:45:29,362][54818] Updated weights for policy 0, policy_version 438288 (0.0027) [2024-04-27 17:45:32,122][54818] Updated weights for policy 0, policy_version 438298 (0.0028) [2024-04-27 17:45:34,253][54587] Fps is (10 sec: 52428.8, 60 sec: 54613.4, 300 sec: 54928.0). Total num frames: 7181156352. Throughput: 0: 54966.6. Samples: 86367320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-04-27 17:45:34,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 17:45:35,229][54818] Updated weights for policy 0, policy_version 438308 (0.0031) [2024-04-27 17:45:38,132][54818] Updated weights for policy 0, policy_version 438318 (0.0031) [2024-04-27 17:45:39,253][54587] Fps is (10 sec: 54067.4, 60 sec: 54613.3, 300 sec: 54928.0). Total num frames: 7181434880. Throughput: 0: 54842.3. Samples: 86694940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:45:39,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 17:45:41,193][54818] Updated weights for policy 0, policy_version 438328 (0.0029) [2024-04-27 17:45:44,037][54818] Updated weights for policy 0, policy_version 438338 (0.0034) [2024-04-27 17:45:44,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 7181729792. Throughput: 0: 54776.1. Samples: 86856380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:45:44,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 17:45:47,148][54818] Updated weights for policy 0, policy_version 438348 (0.0028) [2024-04-27 17:45:49,254][54587] Fps is (10 sec: 55704.1, 60 sec: 54886.2, 300 sec: 55039.1). Total num frames: 7181991936. Throughput: 0: 54793.7. Samples: 87182120. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:45:49,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 17:45:49,272][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000438355_7182008320.pth... [2024-04-27 17:45:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000437550_7168819200.pth [2024-04-27 17:45:50,026][54818] Updated weights for policy 0, policy_version 438358 (0.0028) [2024-04-27 17:45:53,290][54818] Updated weights for policy 0, policy_version 438368 (0.0036) [2024-04-27 17:45:54,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55159.6, 300 sec: 55039.1). Total num frames: 7182286848. Throughput: 0: 54724.5. Samples: 87511820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:45:54,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 17:45:56,023][54818] Updated weights for policy 0, policy_version 438378 (0.0031) [2024-04-27 17:45:59,245][54818] Updated weights for policy 0, policy_version 438388 (0.0028) [2024-04-27 17:45:59,253][54587] Fps is (10 sec: 55707.2, 60 sec: 55159.5, 300 sec: 55039.1). Total num frames: 7182548992. Throughput: 0: 54709.3. Samples: 87678280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:45:59,254][54587] Avg episode reward: [(0, '0.682')] [2024-04-27 17:46:01,932][54818] Updated weights for policy 0, policy_version 438398 (0.0033) [2024-04-27 17:46:04,253][54587] Fps is (10 sec: 50789.6, 60 sec: 54340.0, 300 sec: 54928.1). Total num frames: 7182794752. Throughput: 0: 54623.1. Samples: 88006440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:04,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 17:46:05,163][54818] Updated weights for policy 0, policy_version 438408 (0.0030) [2024-04-27 17:46:07,870][54818] Updated weights for policy 0, policy_version 438418 (0.0031) [2024-04-27 17:46:09,253][54587] Fps is (10 sec: 52429.0, 60 sec: 54340.3, 300 sec: 54872.5). 
Total num frames: 7183073280. Throughput: 0: 54813.0. Samples: 88339640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:09,253][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 17:46:11,084][54818] Updated weights for policy 0, policy_version 438428 (0.0033) [2024-04-27 17:46:13,935][54818] Updated weights for policy 0, policy_version 438438 (0.0028) [2024-04-27 17:46:14,253][54587] Fps is (10 sec: 58982.4, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7183384576. Throughput: 0: 54714.1. Samples: 88500340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:14,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 17:46:17,040][54818] Updated weights for policy 0, policy_version 438448 (0.0031) [2024-04-27 17:46:19,253][54587] Fps is (10 sec: 55704.4, 60 sec: 54340.2, 300 sec: 54928.0). Total num frames: 7183630336. Throughput: 0: 54670.1. Samples: 88827480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:19,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 17:46:20,059][54818] Updated weights for policy 0, policy_version 438458 (0.0029) [2024-04-27 17:46:23,020][54818] Updated weights for policy 0, policy_version 438468 (0.0029) [2024-04-27 17:46:24,253][54587] Fps is (10 sec: 54068.2, 60 sec: 54886.5, 300 sec: 55039.1). Total num frames: 7183925248. Throughput: 0: 54748.5. Samples: 89158620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:24,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 17:46:25,205][54798] Signal inference workers to stop experience collection... (1200 times) [2024-04-27 17:46:25,205][54798] Signal inference workers to resume experience collection... 
(1200 times) [2024-04-27 17:46:25,222][54818] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-04-27 17:46:25,222][54818] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-04-27 17:46:26,142][54818] Updated weights for policy 0, policy_version 438478 (0.0030) [2024-04-27 17:46:29,012][54818] Updated weights for policy 0, policy_version 438488 (0.0030) [2024-04-27 17:46:29,253][54587] Fps is (10 sec: 57345.6, 60 sec: 55159.6, 300 sec: 55039.2). Total num frames: 7184203776. Throughput: 0: 54944.1. Samples: 89328860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:29,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 17:46:31,958][54818] Updated weights for policy 0, policy_version 438498 (0.0032) [2024-04-27 17:46:34,253][54587] Fps is (10 sec: 52428.8, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 7184449536. Throughput: 0: 55084.0. Samples: 89660880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:34,262][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 17:46:34,934][54818] Updated weights for policy 0, policy_version 438508 (0.0030) [2024-04-27 17:46:37,966][54818] Updated weights for policy 0, policy_version 438518 (0.0027) [2024-04-27 17:46:39,253][54587] Fps is (10 sec: 49151.8, 60 sec: 54340.3, 300 sec: 54817.0). Total num frames: 7184695296. Throughput: 0: 55098.7. Samples: 89991260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:39,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 17:46:40,926][54818] Updated weights for policy 0, policy_version 438528 (0.0026) [2024-04-27 17:46:44,057][54818] Updated weights for policy 0, policy_version 438538 (0.0033) [2024-04-27 17:46:44,253][54587] Fps is (10 sec: 55704.6, 60 sec: 54613.3, 300 sec: 54928.1). Total num frames: 7185006592. Throughput: 0: 54865.6. Samples: 90147240. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:44,262][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 17:46:46,977][54818] Updated weights for policy 0, policy_version 438548 (0.0037) [2024-04-27 17:46:49,253][54587] Fps is (10 sec: 62257.7, 60 sec: 55432.6, 300 sec: 55039.1). Total num frames: 7185317888. Throughput: 0: 54930.6. Samples: 90478320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:49,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 17:46:49,864][54818] Updated weights for policy 0, policy_version 438558 (0.0031) [2024-04-27 17:46:52,918][54818] Updated weights for policy 0, policy_version 438568 (0.0027) [2024-04-27 17:46:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 54886.3, 300 sec: 54928.0). Total num frames: 7185580032. Throughput: 0: 54934.0. Samples: 90811680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 17:46:54,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 17:46:55,986][54818] Updated weights for policy 0, policy_version 438578 (0.0030) [2024-04-27 17:46:58,875][54818] Updated weights for policy 0, policy_version 438588 (0.0025) [2024-04-27 17:46:59,253][54587] Fps is (10 sec: 55706.9, 60 sec: 55432.6, 300 sec: 55094.7). Total num frames: 7185874944. Throughput: 0: 55302.9. Samples: 90988960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:46:59,253][54587] Avg episode reward: [(0, '0.683')] [2024-04-27 17:47:02,075][54818] Updated weights for policy 0, policy_version 438598 (0.0024) [2024-04-27 17:47:04,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55039.1). Total num frames: 7186120704. Throughput: 0: 55308.5. Samples: 91316360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:04,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 17:47:04,869][54818] Updated weights for policy 0, policy_version 438608 (0.0031) [2024-04-27 17:47:08,084][54818] Updated weights for policy 0, policy_version 438618 (0.0033) [2024-04-27 17:47:09,253][54587] Fps is (10 sec: 49151.7, 60 sec: 54886.4, 300 sec: 54928.0). Total num frames: 7186366464. Throughput: 0: 55359.5. Samples: 91649800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:09,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 17:47:10,783][54818] Updated weights for policy 0, policy_version 438628 (0.0033) [2024-04-27 17:47:13,870][54818] Updated weights for policy 0, policy_version 438638 (0.0034) [2024-04-27 17:47:14,253][54587] Fps is (10 sec: 52428.6, 60 sec: 54340.2, 300 sec: 54817.0). Total num frames: 7186644992. Throughput: 0: 54806.4. Samples: 91795160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:14,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 17:47:16,731][54818] Updated weights for policy 0, policy_version 438648 (0.0032) [2024-04-27 17:47:19,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55159.6, 300 sec: 54983.6). Total num frames: 7186939904. Throughput: 0: 54793.2. Samples: 92126580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:19,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 17:47:19,926][54818] Updated weights for policy 0, policy_version 438658 (0.0029) [2024-04-27 17:47:22,150][54798] Signal inference workers to stop experience collection... (1250 times) [2024-04-27 17:47:22,150][54798] Signal inference workers to resume experience collection... 
(1250 times) [2024-04-27 17:47:22,165][54818] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-04-27 17:47:22,165][54818] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-04-27 17:47:22,673][54818] Updated weights for policy 0, policy_version 438668 (0.0035) [2024-04-27 17:47:24,253][54587] Fps is (10 sec: 57344.8, 60 sec: 54886.3, 300 sec: 54928.1). Total num frames: 7187218432. Throughput: 0: 54812.8. Samples: 92457840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:24,254][54587] Avg episode reward: [(0, '0.728')] [2024-04-27 17:47:26,044][54818] Updated weights for policy 0, policy_version 438678 (0.0029) [2024-04-27 17:47:28,807][54818] Updated weights for policy 0, policy_version 438688 (0.0037) [2024-04-27 17:47:29,253][54587] Fps is (10 sec: 55706.0, 60 sec: 54886.3, 300 sec: 55039.1). Total num frames: 7187496960. Throughput: 0: 55254.8. Samples: 92633700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 17:47:31,813][54818] Updated weights for policy 0, policy_version 438698 (0.0032) [2024-04-27 17:47:34,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7187759104. Throughput: 0: 55232.6. Samples: 92963780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:34,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 17:47:34,608][54818] Updated weights for policy 0, policy_version 438708 (0.0035) [2024-04-27 17:47:37,608][54818] Updated weights for policy 0, policy_version 438718 (0.0031) [2024-04-27 17:47:39,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55432.5, 300 sec: 54928.1). Total num frames: 7188021248. Throughput: 0: 55216.1. Samples: 93296400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:39,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 17:47:40,670][54818] Updated weights for policy 0, policy_version 438728 (0.0026) [2024-04-27 17:47:43,617][54818] Updated weights for policy 0, policy_version 438738 (0.0022) [2024-04-27 17:47:44,253][54587] Fps is (10 sec: 54067.5, 60 sec: 54886.5, 300 sec: 54872.5). Total num frames: 7188299776. Throughput: 0: 54552.4. Samples: 93443820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:44,262][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 17:47:46,717][54818] Updated weights for policy 0, policy_version 438748 (0.0033) [2024-04-27 17:47:49,253][54587] Fps is (10 sec: 54067.3, 60 sec: 54067.4, 300 sec: 54872.5). Total num frames: 7188561920. Throughput: 0: 54551.7. Samples: 93771180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:49,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 17:47:49,268][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000438756_7188578304.pth... [2024-04-27 17:47:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000437951_7175389184.pth [2024-04-27 17:47:49,648][54818] Updated weights for policy 0, policy_version 438758 (0.0026) [2024-04-27 17:47:52,557][54818] Updated weights for policy 0, policy_version 438768 (0.0035) [2024-04-27 17:47:54,253][54587] Fps is (10 sec: 57344.3, 60 sec: 54886.5, 300 sec: 54983.6). Total num frames: 7188873216. Throughput: 0: 54489.4. Samples: 94101820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:54,262][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 17:47:55,477][54818] Updated weights for policy 0, policy_version 438778 (0.0037) [2024-04-27 17:47:58,483][54818] Updated weights for policy 0, policy_version 438788 (0.0030) [2024-04-27 17:47:59,253][54587] Fps is (10 sec: 58982.3, 60 sec: 54613.3, 300 sec: 54983.6). 
Total num frames: 7189151744. Throughput: 0: 55283.8. Samples: 94282920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:47:59,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 17:48:01,321][54818] Updated weights for policy 0, policy_version 438798 (0.0028) [2024-04-27 17:48:04,253][54587] Fps is (10 sec: 54067.2, 60 sec: 54886.6, 300 sec: 54928.1). Total num frames: 7189413888. Throughput: 0: 55282.8. Samples: 94614300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:48:04,262][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 17:48:04,336][54818] Updated weights for policy 0, policy_version 438808 (0.0030) [2024-04-27 17:48:07,224][54818] Updated weights for policy 0, policy_version 438818 (0.0030) [2024-04-27 17:48:09,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7189676032. Throughput: 0: 55211.1. Samples: 94942340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:48:09,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 17:48:10,272][54818] Updated weights for policy 0, policy_version 438828 (0.0031) [2024-04-27 17:48:13,255][54818] Updated weights for policy 0, policy_version 438838 (0.0033) [2024-04-27 17:48:14,253][54587] Fps is (10 sec: 52428.0, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 7189938176. Throughput: 0: 54712.8. Samples: 95095780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:48:14,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 17:48:16,194][54818] Updated weights for policy 0, policy_version 438848 (0.0026) [2024-04-27 17:48:19,253][54587] Fps is (10 sec: 55704.8, 60 sec: 54886.3, 300 sec: 54928.0). Total num frames: 7190233088. Throughput: 0: 54671.0. Samples: 95423980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 17:48:19,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 17:48:19,531][54818] Updated weights for policy 0, policy_version 438858 (0.0035) [2024-04-27 17:48:22,057][54798] Signal inference workers to stop experience collection... (1300 times) [2024-04-27 17:48:22,062][54798] Signal inference workers to resume experience collection... (1300 times) [2024-04-27 17:48:22,071][54818] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-04-27 17:48:22,089][54818] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-04-27 17:48:22,175][54818] Updated weights for policy 0, policy_version 438868 (0.0034) [2024-04-27 17:48:24,253][54587] Fps is (10 sec: 57344.5, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 7190511616. Throughput: 0: 54612.5. Samples: 95753960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:48:24,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 17:48:25,302][54818] Updated weights for policy 0, policy_version 438878 (0.0034) [2024-04-27 17:48:28,255][54818] Updated weights for policy 0, policy_version 438888 (0.0021) [2024-04-27 17:48:29,253][54587] Fps is (10 sec: 55706.4, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7190790144. Throughput: 0: 55248.0. Samples: 95929980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:48:29,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 17:48:31,469][54818] Updated weights for policy 0, policy_version 438898 (0.0029) [2024-04-27 17:48:34,068][54818] Updated weights for policy 0, policy_version 438908 (0.0030) [2024-04-27 17:48:34,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55159.6, 300 sec: 54983.6). Total num frames: 7191068672. Throughput: 0: 55200.5. Samples: 96255200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:48:34,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 17:48:37,740][54818] Updated weights for policy 0, policy_version 438918 (0.0029) [2024-04-27 17:48:39,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 54928.1). Total num frames: 7191330816. Throughput: 0: 55186.1. Samples: 96585200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:48:39,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 17:48:40,080][54818] Updated weights for policy 0, policy_version 438928 (0.0036) [2024-04-27 17:48:43,733][54818] Updated weights for policy 0, policy_version 438938 (0.0030) [2024-04-27 17:48:44,253][54587] Fps is (10 sec: 50789.9, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 7191576576. Throughput: 0: 54632.4. Samples: 96741380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:48:44,254][54587] Avg episode reward: [(0, '0.502')] [2024-04-27 17:48:45,902][54818] Updated weights for policy 0, policy_version 438948 (0.0031) [2024-04-27 17:48:49,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 54928.1). Total num frames: 7191871488. Throughput: 0: 54651.4. Samples: 97073620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:48:49,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 17:48:49,532][54818] Updated weights for policy 0, policy_version 438958 (0.0031) [2024-04-27 17:48:51,830][54818] Updated weights for policy 0, policy_version 438968 (0.0026) [2024-04-27 17:48:54,253][54587] Fps is (10 sec: 58982.3, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7192166400. Throughput: 0: 54712.5. Samples: 97404400. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:48:54,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:48:55,296][54818] Updated weights for policy 0, policy_version 438978 (0.0029) [2024-04-27 17:48:57,880][54818] Updated weights for policy 0, policy_version 438988 (0.0028) [2024-04-27 17:48:59,253][54587] Fps is (10 sec: 57344.2, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 7192444928. Throughput: 0: 55110.8. Samples: 97575760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:48:59,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:49:01,350][54818] Updated weights for policy 0, policy_version 438998 (0.0029) [2024-04-27 17:49:03,901][54818] Updated weights for policy 0, policy_version 439008 (0.0029) [2024-04-27 17:49:04,253][54587] Fps is (10 sec: 54067.2, 60 sec: 54886.3, 300 sec: 54928.1). Total num frames: 7192707072. Throughput: 0: 55096.6. Samples: 97903320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:49:04,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 17:49:07,619][54818] Updated weights for policy 0, policy_version 439018 (0.0028) [2024-04-27 17:49:09,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 54928.0). Total num frames: 7193001984. Throughput: 0: 55050.6. Samples: 98231240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:49:09,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 17:49:09,832][54818] Updated weights for policy 0, policy_version 439028 (0.0030) [2024-04-27 17:49:13,382][54818] Updated weights for policy 0, policy_version 439038 (0.0028) [2024-04-27 17:49:14,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 54928.0). Total num frames: 7193247744. Throughput: 0: 54864.8. Samples: 98398900. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:49:14,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 17:49:15,682][54818] Updated weights for policy 0, policy_version 439048 (0.0033) [2024-04-27 17:49:19,186][54818] Updated weights for policy 0, policy_version 439058 (0.0031) [2024-04-27 17:49:19,253][54587] Fps is (10 sec: 52429.0, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 7193526272. Throughput: 0: 54970.5. Samples: 98728880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:49:19,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 17:49:20,119][54798] Signal inference workers to stop experience collection... (1350 times) [2024-04-27 17:49:20,153][54818] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-04-27 17:49:20,181][54798] Signal inference workers to resume experience collection... (1350 times) [2024-04-27 17:49:20,181][54818] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-04-27 17:49:21,706][54818] Updated weights for policy 0, policy_version 439068 (0.0028) [2024-04-27 17:49:24,253][54587] Fps is (10 sec: 55705.8, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7193804800. Throughput: 0: 54897.3. Samples: 99055580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:49:24,262][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 17:49:25,550][54818] Updated weights for policy 0, policy_version 439078 (0.0031) [2024-04-27 17:49:27,766][54818] Updated weights for policy 0, policy_version 439088 (0.0026) [2024-04-27 17:49:29,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 7194099712. Throughput: 0: 55072.4. Samples: 99219640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:49:29,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 17:49:31,553][54818] Updated weights for policy 0, policy_version 439098 (0.0033) [2024-04-27 17:49:33,654][54818] Updated weights for policy 0, policy_version 439108 (0.0032) [2024-04-27 17:49:34,253][54587] Fps is (10 sec: 55706.0, 60 sec: 54886.3, 300 sec: 54928.1). Total num frames: 7194361856. Throughput: 0: 55035.2. Samples: 99550200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:49:34,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 17:49:37,536][54818] Updated weights for policy 0, policy_version 439118 (0.0024) [2024-04-27 17:49:39,255][54587] Fps is (10 sec: 54058.7, 60 sec: 55158.0, 300 sec: 54983.3). Total num frames: 7194640384. Throughput: 0: 55075.4. Samples: 99882880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:49:39,264][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 17:49:39,730][54818] Updated weights for policy 0, policy_version 439128 (0.0028) [2024-04-27 17:49:43,306][54818] Updated weights for policy 0, policy_version 439138 (0.0028) [2024-04-27 17:49:44,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 54983.6). Total num frames: 7194918912. Throughput: 0: 54932.1. Samples: 100047700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 17:49:44,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 17:49:45,765][54818] Updated weights for policy 0, policy_version 439148 (0.0028) [2024-04-27 17:49:49,177][54818] Updated weights for policy 0, policy_version 439158 (0.0026) [2024-04-27 17:49:49,253][54587] Fps is (10 sec: 52436.4, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 7195164672. Throughput: 0: 54896.3. Samples: 100373660. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:49:49,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 17:49:49,361][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000439159_7195181056.pth... [2024-04-27 17:49:49,407][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000438355_7182008320.pth [2024-04-27 17:49:51,664][54818] Updated weights for policy 0, policy_version 439168 (0.0030) [2024-04-27 17:49:54,253][54587] Fps is (10 sec: 52429.0, 60 sec: 54613.4, 300 sec: 54928.1). Total num frames: 7195443200. Throughput: 0: 54878.9. Samples: 100700780. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:49:54,253][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 17:49:55,283][54818] Updated weights for policy 0, policy_version 439178 (0.0028) [2024-04-27 17:49:57,730][54818] Updated weights for policy 0, policy_version 439188 (0.0030) [2024-04-27 17:49:59,253][54587] Fps is (10 sec: 57345.0, 60 sec: 54886.5, 300 sec: 54928.0). Total num frames: 7195738112. Throughput: 0: 54700.6. Samples: 100860420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:49:59,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 17:50:01,112][54818] Updated weights for policy 0, policy_version 439198 (0.0031) [2024-04-27 17:50:03,764][54818] Updated weights for policy 0, policy_version 439208 (0.0033) [2024-04-27 17:50:04,253][54587] Fps is (10 sec: 55705.3, 60 sec: 54886.5, 300 sec: 54872.5). Total num frames: 7196000256. Throughput: 0: 54739.2. Samples: 101192140. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:04,253][54587] Avg episode reward: [(0, '0.518')] [2024-04-27 17:50:07,059][54818] Updated weights for policy 0, policy_version 439218 (0.0030) [2024-04-27 17:50:09,253][54587] Fps is (10 sec: 52428.6, 60 sec: 54340.4, 300 sec: 54817.0). Total num frames: 7196262400. Throughput: 0: 54836.1. Samples: 101523200. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:09,253][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 17:50:09,611][54818] Updated weights for policy 0, policy_version 439228 (0.0032) [2024-04-27 17:50:13,014][54818] Updated weights for policy 0, policy_version 439238 (0.0032) [2024-04-27 17:50:14,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55432.6, 300 sec: 54928.1). Total num frames: 7196573696. Throughput: 0: 54968.4. Samples: 101693220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:14,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 17:50:15,597][54818] Updated weights for policy 0, policy_version 439248 (0.0027) [2024-04-27 17:50:18,896][54818] Updated weights for policy 0, policy_version 439258 (0.0028) [2024-04-27 17:50:19,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 7196835840. Throughput: 0: 54966.7. Samples: 102023700. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:19,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 17:50:21,465][54818] Updated weights for policy 0, policy_version 439268 (0.0029) [2024-04-27 17:50:24,253][54587] Fps is (10 sec: 50790.6, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 7197081600. Throughput: 0: 54957.9. Samples: 102355900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:24,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 17:50:24,747][54818] Updated weights for policy 0, policy_version 439278 (0.0027) [2024-04-27 17:50:26,825][54798] Signal inference workers to stop experience collection... (1400 times) [2024-04-27 17:50:26,856][54818] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-04-27 17:50:26,916][54798] Signal inference workers to resume experience collection... 
(1400 times) [2024-04-27 17:50:26,917][54818] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-04-27 17:50:27,585][54818] Updated weights for policy 0, policy_version 439288 (0.0026) [2024-04-27 17:50:29,253][54587] Fps is (10 sec: 54067.3, 60 sec: 54613.4, 300 sec: 54983.6). Total num frames: 7197376512. Throughput: 0: 54905.3. Samples: 102518440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:29,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 17:50:30,794][54818] Updated weights for policy 0, policy_version 439298 (0.0033) [2024-04-27 17:50:33,407][54818] Updated weights for policy 0, policy_version 439308 (0.0029) [2024-04-27 17:50:34,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 7197671424. Throughput: 0: 54969.9. Samples: 102847300. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:34,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 17:50:36,688][54818] Updated weights for policy 0, policy_version 439318 (0.0028) [2024-04-27 17:50:39,197][54818] Updated weights for policy 0, policy_version 439328 (0.0028) [2024-04-27 17:50:39,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55161.0, 300 sec: 54983.6). Total num frames: 7197949952. Throughput: 0: 55060.4. Samples: 103178500. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:39,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 17:50:42,697][54818] Updated weights for policy 0, policy_version 439338 (0.0030) [2024-04-27 17:50:44,253][54587] Fps is (10 sec: 54067.5, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 7198212096. Throughput: 0: 55235.1. Samples: 103346000. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:44,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-27 17:50:45,336][54818] Updated weights for policy 0, policy_version 439348 (0.0031) [2024-04-27 17:50:48,503][54818] Updated weights for policy 0, policy_version 439358 (0.0032) [2024-04-27 17:50:49,254][54587] Fps is (10 sec: 54065.5, 60 sec: 55432.4, 300 sec: 54928.0). Total num frames: 7198490624. Throughput: 0: 55277.4. Samples: 103679640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:49,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 17:50:51,352][54818] Updated weights for policy 0, policy_version 439368 (0.0031) [2024-04-27 17:50:54,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55159.3, 300 sec: 54928.0). Total num frames: 7198752768. Throughput: 0: 55220.4. Samples: 104008120. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:54,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 17:50:54,485][54818] Updated weights for policy 0, policy_version 439378 (0.0036) [2024-04-27 17:50:57,191][54818] Updated weights for policy 0, policy_version 439388 (0.0033) [2024-04-27 17:50:59,253][54587] Fps is (10 sec: 52429.7, 60 sec: 54613.2, 300 sec: 54983.6). Total num frames: 7199014912. Throughput: 0: 55066.2. Samples: 104171200. Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:50:59,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 17:51:00,312][54818] Updated weights for policy 0, policy_version 439398 (0.0026) [2024-04-27 17:51:02,942][54818] Updated weights for policy 0, policy_version 439408 (0.0033) [2024-04-27 17:51:04,253][54587] Fps is (10 sec: 54067.4, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7199293440. Throughput: 0: 55024.0. Samples: 104499780. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 23.0) [2024-04-27 17:51:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 17:51:06,352][54818] Updated weights for policy 0, policy_version 439418 (0.0028) [2024-04-27 17:51:08,875][54818] Updated weights for policy 0, policy_version 439428 (0.0033) [2024-04-27 17:51:09,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55432.4, 300 sec: 54928.1). Total num frames: 7199588352. Throughput: 0: 55017.7. Samples: 104831700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:09,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 17:51:12,273][54818] Updated weights for policy 0, policy_version 439438 (0.0027) [2024-04-27 17:51:14,253][54587] Fps is (10 sec: 55705.3, 60 sec: 54613.3, 300 sec: 54983.6). Total num frames: 7199850496. Throughput: 0: 55084.8. Samples: 104997260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:14,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 17:51:14,954][54818] Updated weights for policy 0, policy_version 439448 (0.0027) [2024-04-27 17:51:18,078][54818] Updated weights for policy 0, policy_version 439458 (0.0032) [2024-04-27 17:51:19,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 7200145408. Throughput: 0: 55159.1. Samples: 105329460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:19,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 17:51:20,907][54818] Updated weights for policy 0, policy_version 439468 (0.0028) [2024-04-27 17:51:24,206][54818] Updated weights for policy 0, policy_version 439478 (0.0037) [2024-04-27 17:51:24,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 54928.0). Total num frames: 7200407552. Throughput: 0: 55151.5. Samples: 105660320. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:24,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 17:51:26,945][54818] Updated weights for policy 0, policy_version 439488 (0.0026) [2024-04-27 17:51:29,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55094.7). Total num frames: 7200702464. Throughput: 0: 55047.1. Samples: 105823120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:29,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 17:51:30,015][54818] Updated weights for policy 0, policy_version 439498 (0.0027) [2024-04-27 17:51:32,984][54818] Updated weights for policy 0, policy_version 439508 (0.0025) [2024-04-27 17:51:34,253][54587] Fps is (10 sec: 54066.6, 60 sec: 54613.3, 300 sec: 55094.7). Total num frames: 7200948224. Throughput: 0: 55045.6. Samples: 106156680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:34,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 17:51:35,887][54818] Updated weights for policy 0, policy_version 439518 (0.0024) [2024-04-27 17:51:38,770][54818] Updated weights for policy 0, policy_version 439528 (0.0029) [2024-04-27 17:51:39,253][54587] Fps is (10 sec: 52428.7, 60 sec: 54613.2, 300 sec: 54983.6). Total num frames: 7201226752. Throughput: 0: 55185.8. Samples: 106491480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 17:51:41,847][54818] Updated weights for policy 0, policy_version 439538 (0.0032) [2024-04-27 17:51:41,991][54798] Signal inference workers to stop experience collection... (1450 times) [2024-04-27 17:51:41,991][54798] Signal inference workers to resume experience collection... 
(1450 times) [2024-04-27 17:51:42,003][54818] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-04-27 17:51:42,023][54818] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-04-27 17:51:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55159.4, 300 sec: 54928.1). Total num frames: 7201521664. Throughput: 0: 55368.5. Samples: 106662780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:44,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 17:51:44,542][54818] Updated weights for policy 0, policy_version 439548 (0.0031) [2024-04-27 17:51:47,739][54818] Updated weights for policy 0, policy_version 439558 (0.0034) [2024-04-27 17:51:49,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55159.7, 300 sec: 54983.6). Total num frames: 7201800192. Throughput: 0: 55340.5. Samples: 106990100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:49,253][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 17:51:49,345][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000439564_7201816576.pth... [2024-04-27 17:51:49,396][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000438756_7188578304.pth [2024-04-27 17:51:50,572][54818] Updated weights for policy 0, policy_version 439568 (0.0035) [2024-04-27 17:51:53,610][54818] Updated weights for policy 0, policy_version 439578 (0.0029) [2024-04-27 17:51:54,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 54928.1). Total num frames: 7202078720. Throughput: 0: 55202.4. Samples: 107315800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:54,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 17:51:56,635][54818] Updated weights for policy 0, policy_version 439588 (0.0029) [2024-04-27 17:51:59,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 54983.6). Total num frames: 7202340864. Throughput: 0: 55307.2. Samples: 107486080. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:51:59,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:51:59,492][54818] Updated weights for policy 0, policy_version 439598 (0.0034) [2024-04-27 17:52:02,728][54818] Updated weights for policy 0, policy_version 439608 (0.0032) [2024-04-27 17:52:04,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55094.7). Total num frames: 7202619392. Throughput: 0: 55360.6. Samples: 107820680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:52:04,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 17:52:05,361][54818] Updated weights for policy 0, policy_version 439618 (0.0032) [2024-04-27 17:52:08,492][54818] Updated weights for policy 0, policy_version 439628 (0.0027) [2024-04-27 17:52:09,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55159.4, 300 sec: 55094.7). Total num frames: 7202897920. Throughput: 0: 55316.2. Samples: 108149560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:52:09,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 17:52:11,279][54818] Updated weights for policy 0, policy_version 439638 (0.0031) [2024-04-27 17:52:14,254][54587] Fps is (10 sec: 54061.5, 60 sec: 55158.5, 300 sec: 54983.4). Total num frames: 7203160064. Throughput: 0: 55290.3. Samples: 108311240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:52:14,255][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 17:52:14,416][54818] Updated weights for policy 0, policy_version 439648 (0.0028) [2024-04-27 17:52:17,094][54818] Updated weights for policy 0, policy_version 439658 (0.0028) [2024-04-27 17:52:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55094.7). Total num frames: 7203471360. Throughput: 0: 55308.0. Samples: 108645540. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:52:19,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 17:52:20,333][54818] Updated weights for policy 0, policy_version 439668 (0.0030) [2024-04-27 17:52:22,963][54818] Updated weights for policy 0, policy_version 439678 (0.0031) [2024-04-27 17:52:24,253][54587] Fps is (10 sec: 57350.1, 60 sec: 55432.5, 300 sec: 55039.1). Total num frames: 7203733504. Throughput: 0: 55317.4. Samples: 108980760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:52:24,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 17:52:26,288][54818] Updated weights for policy 0, policy_version 439688 (0.0024) [2024-04-27 17:52:29,100][54818] Updated weights for policy 0, policy_version 439698 (0.0026) [2024-04-27 17:52:29,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55150.2). Total num frames: 7204028416. Throughput: 0: 55193.7. Samples: 109146500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 17:52:29,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-27 17:52:32,087][54818] Updated weights for policy 0, policy_version 439708 (0.0027) [2024-04-27 17:52:34,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55094.7). Total num frames: 7204274176. Throughput: 0: 55345.2. Samples: 109480640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:52:34,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 17:52:34,898][54818] Updated weights for policy 0, policy_version 439718 (0.0025) [2024-04-27 17:52:38,050][54818] Updated weights for policy 0, policy_version 439728 (0.0028) [2024-04-27 17:52:39,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55150.2). Total num frames: 7204569088. Throughput: 0: 55480.8. Samples: 109812440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:52:39,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:52:40,916][54818] Updated weights for policy 0, policy_version 439738 (0.0029) [2024-04-27 17:52:44,039][54818] Updated weights for policy 0, policy_version 439748 (0.0028) [2024-04-27 17:52:44,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55159.6, 300 sec: 55150.2). Total num frames: 7204831232. Throughput: 0: 55345.9. Samples: 109976640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:52:44,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 17:52:46,772][54818] Updated weights for policy 0, policy_version 439758 (0.0030) [2024-04-27 17:52:49,253][54587] Fps is (10 sec: 52428.4, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 7205093376. Throughput: 0: 55281.2. Samples: 110308340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:52:49,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 17:52:49,842][54818] Updated weights for policy 0, policy_version 439768 (0.0028) [2024-04-27 17:52:52,651][54818] Updated weights for policy 0, policy_version 439778 (0.0028) [2024-04-27 17:52:53,056][54798] Signal inference workers to stop experience collection... (1500 times) [2024-04-27 17:52:53,056][54798] Signal inference workers to resume experience collection... (1500 times) [2024-04-27 17:52:53,078][54818] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-04-27 17:52:53,079][54818] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-04-27 17:52:54,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 7205388288. Throughput: 0: 55308.1. Samples: 110638420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:52:54,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 17:52:55,897][54818] Updated weights for policy 0, policy_version 439788 (0.0034) [2024-04-27 17:52:58,533][54818] Updated weights for policy 0, policy_version 439798 (0.0029) [2024-04-27 17:52:59,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55705.5, 300 sec: 55150.2). Total num frames: 7205683200. Throughput: 0: 55410.5. Samples: 110804660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:52:59,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 17:53:01,832][54818] Updated weights for policy 0, policy_version 439808 (0.0029) [2024-04-27 17:53:04,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55150.2). Total num frames: 7205945344. Throughput: 0: 55408.1. Samples: 111138900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:04,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 17:53:04,592][54818] Updated weights for policy 0, policy_version 439818 (0.0033) [2024-04-27 17:53:07,777][54818] Updated weights for policy 0, policy_version 439828 (0.0033) [2024-04-27 17:53:09,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55205.8). Total num frames: 7206223872. Throughput: 0: 55263.1. Samples: 111467600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:09,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 17:53:10,535][54818] Updated weights for policy 0, policy_version 439838 (0.0030) [2024-04-27 17:53:13,619][54818] Updated weights for policy 0, policy_version 439848 (0.0030) [2024-04-27 17:53:14,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55433.5, 300 sec: 55094.7). Total num frames: 7206486016. Throughput: 0: 55220.5. Samples: 111631420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:14,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 17:53:16,378][54818] Updated weights for policy 0, policy_version 439858 (0.0030) [2024-04-27 17:53:19,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 7206780928. Throughput: 0: 55277.8. Samples: 111968140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:19,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 17:53:19,695][54818] Updated weights for policy 0, policy_version 439868 (0.0030) [2024-04-27 17:53:22,169][54818] Updated weights for policy 0, policy_version 439878 (0.0030) [2024-04-27 17:53:24,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 7207043072. Throughput: 0: 55165.8. Samples: 112294900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:24,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 17:53:25,439][54818] Updated weights for policy 0, policy_version 439888 (0.0034) [2024-04-27 17:53:28,180][54818] Updated weights for policy 0, policy_version 439898 (0.0031) [2024-04-27 17:53:29,253][54587] Fps is (10 sec: 52429.5, 60 sec: 54613.5, 300 sec: 55039.1). Total num frames: 7207305216. Throughput: 0: 55212.4. Samples: 112461200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:29,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 17:53:31,647][54818] Updated weights for policy 0, policy_version 439908 (0.0035) [2024-04-27 17:53:33,943][54818] Updated weights for policy 0, policy_version 439918 (0.0030) [2024-04-27 17:53:34,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55705.6, 300 sec: 55205.7). Total num frames: 7207616512. Throughput: 0: 55121.3. Samples: 112788800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:34,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 17:53:37,589][54818] Updated weights for policy 0, policy_version 439928 (0.0033) [2024-04-27 17:53:39,253][54587] Fps is (10 sec: 57342.9, 60 sec: 55159.3, 300 sec: 55261.3). Total num frames: 7207878656. Throughput: 0: 55185.6. Samples: 113121780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:39,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 17:53:39,746][54818] Updated weights for policy 0, policy_version 439938 (0.0029) [2024-04-27 17:53:43,635][54818] Updated weights for policy 0, policy_version 439948 (0.0029) [2024-04-27 17:53:44,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55159.4, 300 sec: 55150.2). Total num frames: 7208140800. Throughput: 0: 55141.5. Samples: 113286020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:44,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 17:53:45,292][54798] Signal inference workers to stop experience collection... (1550 times) [2024-04-27 17:53:45,324][54818] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-04-27 17:53:45,356][54798] Signal inference workers to resume experience collection... (1550 times) [2024-04-27 17:53:45,357][54818] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-04-27 17:53:45,919][54818] Updated weights for policy 0, policy_version 439958 (0.0026) [2024-04-27 17:53:49,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55039.1). Total num frames: 7208402944. Throughput: 0: 55100.8. Samples: 113618440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 17:53:49,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 17:53:49,306][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000439967_7208419328.pth... 
[2024-04-27 17:53:49,353][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000439159_7195181056.pth [2024-04-27 17:53:49,537][54818] Updated weights for policy 0, policy_version 439968 (0.0026) [2024-04-27 17:53:52,361][54818] Updated weights for policy 0, policy_version 439978 (0.0024) [2024-04-27 17:53:54,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 7208697856. Throughput: 0: 55162.3. Samples: 113949900. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:53:54,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 17:53:55,342][54818] Updated weights for policy 0, policy_version 439988 (0.0033) [2024-04-27 17:53:58,570][54818] Updated weights for policy 0, policy_version 439998 (0.0033) [2024-04-27 17:53:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 7208976384. Throughput: 0: 55134.1. Samples: 114112460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:53:59,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 17:54:01,260][54818] Updated weights for policy 0, policy_version 440008 (0.0027) [2024-04-27 17:54:04,253][54587] Fps is (10 sec: 54066.0, 60 sec: 54886.2, 300 sec: 55039.1). Total num frames: 7209238528. Throughput: 0: 54982.1. Samples: 114442340. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:04,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 17:54:04,563][54818] Updated weights for policy 0, policy_version 440018 (0.0024) [2024-04-27 17:54:07,007][54818] Updated weights for policy 0, policy_version 440028 (0.0029) [2024-04-27 17:54:09,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55205.7). Total num frames: 7209533440. Throughput: 0: 55079.0. Samples: 114773460. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:09,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 17:54:10,395][54818] Updated weights for policy 0, policy_version 440038 (0.0029) [2024-04-27 17:54:13,063][54818] Updated weights for policy 0, policy_version 440048 (0.0027) [2024-04-27 17:54:14,253][54587] Fps is (10 sec: 57345.0, 60 sec: 55432.6, 300 sec: 55205.8). Total num frames: 7209811968. Throughput: 0: 55121.7. Samples: 114941680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:14,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 17:54:16,243][54818] Updated weights for policy 0, policy_version 440058 (0.0032) [2024-04-27 17:54:19,004][54818] Updated weights for policy 0, policy_version 440068 (0.0027) [2024-04-27 17:54:19,253][54587] Fps is (10 sec: 54067.5, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 7210074112. Throughput: 0: 55220.9. Samples: 115273740. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:19,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 17:54:22,199][54818] Updated weights for policy 0, policy_version 440078 (0.0034) [2024-04-27 17:54:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 7210352640. Throughput: 0: 55201.1. Samples: 115605820. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:24,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 17:54:24,891][54818] Updated weights for policy 0, policy_version 440088 (0.0031) [2024-04-27 17:54:28,238][54818] Updated weights for policy 0, policy_version 440098 (0.0029) [2024-04-27 17:54:29,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55432.6, 300 sec: 55150.2). Total num frames: 7210631168. Throughput: 0: 55145.9. Samples: 115767580. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:29,253][54587] Avg episode reward: [(0, '0.681')] [2024-04-27 17:54:30,927][54818] Updated weights for policy 0, policy_version 440108 (0.0025) [2024-04-27 17:54:34,147][54818] Updated weights for policy 0, policy_version 440118 (0.0023) [2024-04-27 17:54:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 54613.5, 300 sec: 55095.0). Total num frames: 7210893312. Throughput: 0: 55192.6. Samples: 116102100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:34,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-27 17:54:36,725][54798] Signal inference workers to stop experience collection... (1600 times) [2024-04-27 17:54:36,728][54798] Signal inference workers to resume experience collection... (1600 times) [2024-04-27 17:54:36,752][54818] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-04-27 17:54:36,752][54818] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-04-27 17:54:36,839][54818] Updated weights for policy 0, policy_version 440128 (0.0033) [2024-04-27 17:54:39,253][54587] Fps is (10 sec: 54066.9, 60 sec: 54886.6, 300 sec: 55094.7). Total num frames: 7211171840. Throughput: 0: 55176.9. Samples: 116432860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:39,254][54587] Avg episode reward: [(0, '0.503')] [2024-04-27 17:54:40,081][54818] Updated weights for policy 0, policy_version 440138 (0.0029) [2024-04-27 17:54:42,729][54818] Updated weights for policy 0, policy_version 440148 (0.0039) [2024-04-27 17:54:44,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55432.4, 300 sec: 55261.3). Total num frames: 7211466752. Throughput: 0: 55140.9. Samples: 116593800. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:44,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 17:54:46,131][54818] Updated weights for policy 0, policy_version 440158 (0.0027) [2024-04-27 17:54:48,552][54818] Updated weights for policy 0, policy_version 440168 (0.0028) [2024-04-27 17:54:49,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.8, 300 sec: 55261.3). Total num frames: 7211745280. Throughput: 0: 55171.9. Samples: 116925060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:49,253][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 17:54:52,025][54818] Updated weights for policy 0, policy_version 440178 (0.0032) [2024-04-27 17:54:54,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.4, 300 sec: 55150.2). Total num frames: 7212007424. Throughput: 0: 55259.7. Samples: 117260140. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:54,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 17:54:54,670][54818] Updated weights for policy 0, policy_version 440188 (0.0035) [2024-04-27 17:54:57,935][54818] Updated weights for policy 0, policy_version 440198 (0.0028) [2024-04-27 17:54:59,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.7, 300 sec: 55261.3). Total num frames: 7212302336. Throughput: 0: 55215.1. Samples: 117426360. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:54:59,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 17:55:00,781][54818] Updated weights for policy 0, policy_version 440208 (0.0036) [2024-04-27 17:55:03,862][54818] Updated weights for policy 0, policy_version 440218 (0.0032) [2024-04-27 17:55:04,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.7, 300 sec: 55261.3). Total num frames: 7212564480. Throughput: 0: 55217.0. Samples: 117758500. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:55:04,254][54587] Avg episode reward: [(0, '0.690')] [2024-04-27 17:55:06,608][54818] Updated weights for policy 0, policy_version 440228 (0.0034) [2024-04-27 17:55:09,253][54587] Fps is (10 sec: 52428.4, 60 sec: 54886.5, 300 sec: 55094.7). Total num frames: 7212826624. Throughput: 0: 55220.8. Samples: 118090760. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:55:09,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 17:55:09,662][54818] Updated weights for policy 0, policy_version 440238 (0.0027) [2024-04-27 17:55:12,361][54818] Updated weights for policy 0, policy_version 440248 (0.0030) [2024-04-27 17:55:14,253][54587] Fps is (10 sec: 54066.5, 60 sec: 54886.3, 300 sec: 55150.2). Total num frames: 7213105152. Throughput: 0: 55400.2. Samples: 118260600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-27 17:55:14,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-27 17:55:15,592][54818] Updated weights for policy 0, policy_version 440258 (0.0037) [2024-04-27 17:55:18,378][54818] Updated weights for policy 0, policy_version 440268 (0.0022) [2024-04-27 17:55:19,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 7213400064. Throughput: 0: 55239.0. Samples: 118587860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:55:19,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 17:55:21,574][54818] Updated weights for policy 0, policy_version 440278 (0.0030) [2024-04-27 17:55:22,341][54798] Signal inference workers to stop experience collection... (1650 times) [2024-04-27 17:55:22,341][54798] Signal inference workers to resume experience collection... 
(1650 times) [2024-04-27 17:55:22,357][54818] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-04-27 17:55:22,357][54818] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-04-27 17:55:24,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 55205.7). Total num frames: 7213662208. Throughput: 0: 55208.7. Samples: 118917260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:55:24,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 17:55:24,385][54818] Updated weights for policy 0, policy_version 440288 (0.0030) [2024-04-27 17:55:27,596][54818] Updated weights for policy 0, policy_version 440298 (0.0027) [2024-04-27 17:55:29,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55159.4, 300 sec: 55150.2). Total num frames: 7213940736. Throughput: 0: 55451.3. Samples: 119089100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:55:29,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 17:55:30,159][54818] Updated weights for policy 0, policy_version 440308 (0.0027) [2024-04-27 17:55:33,328][54818] Updated weights for policy 0, policy_version 440318 (0.0030) [2024-04-27 17:55:34,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55205.7). Total num frames: 7214235648. Throughput: 0: 55602.9. Samples: 119427200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:55:34,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 17:55:35,927][54818] Updated weights for policy 0, policy_version 440328 (0.0036) [2024-04-27 17:55:39,036][54818] Updated weights for policy 0, policy_version 440338 (0.0024) [2024-04-27 17:55:39,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55205.7). Total num frames: 7214497792. Throughput: 0: 55545.4. Samples: 119759680. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:55:39,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 17:55:42,106][54818] Updated weights for policy 0, policy_version 440348 (0.0032) [2024-04-27 17:55:44,253][54587] Fps is (10 sec: 52429.4, 60 sec: 54886.5, 300 sec: 55150.3). Total num frames: 7214759936. Throughput: 0: 55470.7. Samples: 119922540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:55:44,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 17:55:44,938][54818] Updated weights for policy 0, policy_version 440358 (0.0033) [2024-04-27 17:55:47,762][54818] Updated weights for policy 0, policy_version 440368 (0.0032) [2024-04-27 17:55:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 7215054848. Throughput: 0: 55532.0. Samples: 120257440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:55:49,253][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 17:55:49,317][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000440373_7215071232.pth... [2024-04-27 17:55:49,365][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000439564_7201816576.pth [2024-04-27 17:55:50,831][54818] Updated weights for policy 0, policy_version 440378 (0.0024) [2024-04-27 17:55:53,535][54818] Updated weights for policy 0, policy_version 440388 (0.0025) [2024-04-27 17:55:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 7215333376. Throughput: 0: 55481.4. Samples: 120587420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:55:54,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 17:55:56,669][54818] Updated weights for policy 0, policy_version 440398 (0.0031) [2024-04-27 17:55:59,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7215628288. Throughput: 0: 55522.3. Samples: 120759100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:55:59,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 17:55:59,407][54818] Updated weights for policy 0, policy_version 440408 (0.0030) [2024-04-27 17:56:02,614][54818] Updated weights for policy 0, policy_version 440418 (0.0030) [2024-04-27 17:56:04,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 55316.9). Total num frames: 7215906816. Throughput: 0: 55630.8. Samples: 121091240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:56:04,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 17:56:05,380][54818] Updated weights for policy 0, policy_version 440428 (0.0027) [2024-04-27 17:56:08,504][54818] Updated weights for policy 0, policy_version 440438 (0.0028) [2024-04-27 17:56:09,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55372.4). Total num frames: 7216185344. Throughput: 0: 55707.1. Samples: 121424080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:56:09,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 17:56:11,191][54818] Updated weights for policy 0, policy_version 440448 (0.0027) [2024-04-27 17:56:14,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55261.3). Total num frames: 7216447488. Throughput: 0: 55524.0. Samples: 121587680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:56:14,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 17:56:14,311][54818] Updated weights for policy 0, policy_version 440458 (0.0028) [2024-04-27 17:56:17,274][54818] Updated weights for policy 0, policy_version 440468 (0.0030) [2024-04-27 17:56:19,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 7216726016. Throughput: 0: 55383.1. Samples: 121919440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:56:19,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 17:56:20,146][54818] Updated weights for policy 0, policy_version 440478 (0.0030) [2024-04-27 17:56:23,229][54818] Updated weights for policy 0, policy_version 440488 (0.0031) [2024-04-27 17:56:24,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55261.3). Total num frames: 7217004544. Throughput: 0: 55340.0. Samples: 122249980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:56:24,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 17:56:25,966][54818] Updated weights for policy 0, policy_version 440498 (0.0029) [2024-04-27 17:56:28,977][54818] Updated weights for policy 0, policy_version 440508 (0.0031) [2024-04-27 17:56:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55427.9). Total num frames: 7217299456. Throughput: 0: 55545.7. Samples: 122422100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:56:29,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 17:56:31,963][54818] Updated weights for policy 0, policy_version 440518 (0.0030) [2024-04-27 17:56:34,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 7217561600. Throughput: 0: 55436.8. Samples: 122752100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-27 17:56:34,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 17:56:34,898][54818] Updated weights for policy 0, policy_version 440528 (0.0029) [2024-04-27 17:56:38,057][54818] Updated weights for policy 0, policy_version 440538 (0.0032) [2024-04-27 17:56:39,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 7217823744. Throughput: 0: 55476.9. Samples: 123083880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:56:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 17:56:40,558][54798] Signal inference workers to stop experience collection... (1700 times) [2024-04-27 17:56:40,598][54818] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-04-27 17:56:40,624][54798] Signal inference workers to resume experience collection... (1700 times) [2024-04-27 17:56:40,629][54818] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-04-27 17:56:40,752][54818] Updated weights for policy 0, policy_version 440548 (0.0029) [2024-04-27 17:56:43,802][54818] Updated weights for policy 0, policy_version 440558 (0.0028) [2024-04-27 17:56:44,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55316.8). Total num frames: 7218118656. Throughput: 0: 55536.0. Samples: 123258220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:56:44,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 17:56:46,657][54818] Updated weights for policy 0, policy_version 440568 (0.0038) [2024-04-27 17:56:49,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55316.8). Total num frames: 7218397184. Throughput: 0: 55537.6. Samples: 123590440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:56:49,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 17:56:49,741][54818] Updated weights for policy 0, policy_version 440578 (0.0029) [2024-04-27 17:56:52,460][54818] Updated weights for policy 0, policy_version 440588 (0.0031) [2024-04-27 17:56:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 7218675712. Throughput: 0: 55461.4. Samples: 123919840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:56:54,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 17:56:55,785][54818] Updated weights for policy 0, policy_version 440598 (0.0030) [2024-04-27 17:56:58,327][54818] Updated weights for policy 0, policy_version 440608 (0.0030) [2024-04-27 17:56:59,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 7218954240. Throughput: 0: 55675.2. Samples: 124093060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:56:59,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 17:57:01,749][54818] Updated weights for policy 0, policy_version 440618 (0.0030) [2024-04-27 17:57:04,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7219232768. Throughput: 0: 55628.1. Samples: 124422700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:04,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 17:57:04,338][54818] Updated weights for policy 0, policy_version 440628 (0.0028) [2024-04-27 17:57:07,517][54818] Updated weights for policy 0, policy_version 440638 (0.0027) [2024-04-27 17:57:09,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55372.6). Total num frames: 7219494912. Throughput: 0: 55684.5. Samples: 124755780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:09,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 17:57:10,308][54818] Updated weights for policy 0, policy_version 440648 (0.0034) [2024-04-27 17:57:13,296][54818] Updated weights for policy 0, policy_version 440658 (0.0029) [2024-04-27 17:57:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 7219789824. Throughput: 0: 55603.6. Samples: 124924260. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:14,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 17:57:16,083][54818] Updated weights for policy 0, policy_version 440668 (0.0032) [2024-04-27 17:57:19,253][54587] Fps is (10 sec: 55704.6, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 7220051968. Throughput: 0: 55620.8. Samples: 125255040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:19,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 17:57:19,313][54818] Updated weights for policy 0, policy_version 440678 (0.0025) [2024-04-27 17:57:21,870][54818] Updated weights for policy 0, policy_version 440688 (0.0032) [2024-04-27 17:57:24,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55261.3). Total num frames: 7220330496. Throughput: 0: 55547.5. Samples: 125583520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:24,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 17:57:25,128][54818] Updated weights for policy 0, policy_version 440698 (0.0030) [2024-04-27 17:57:27,835][54818] Updated weights for policy 0, policy_version 440708 (0.0029) [2024-04-27 17:57:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7220625408. Throughput: 0: 55516.4. Samples: 125756460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:29,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 17:57:31,063][54818] Updated weights for policy 0, policy_version 440718 (0.0028) [2024-04-27 17:57:33,730][54818] Updated weights for policy 0, policy_version 440728 (0.0033) [2024-04-27 17:57:34,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55372.4). Total num frames: 7220903936. Throughput: 0: 55553.5. Samples: 126090340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:34,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 17:57:36,952][54818] Updated weights for policy 0, policy_version 440738 (0.0031) [2024-04-27 17:57:39,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55427.9). Total num frames: 7221182464. Throughput: 0: 55617.7. Samples: 126422640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:39,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 17:57:39,667][54818] Updated weights for policy 0, policy_version 440748 (0.0030) [2024-04-27 17:57:42,835][54818] Updated weights for policy 0, policy_version 440758 (0.0035) [2024-04-27 17:57:44,253][54587] Fps is (10 sec: 52427.9, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 7221428224. Throughput: 0: 55511.3. Samples: 126591080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:44,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 17:57:45,417][54818] Updated weights for policy 0, policy_version 440768 (0.0035) [2024-04-27 17:57:48,654][54818] Updated weights for policy 0, policy_version 440778 (0.0030) [2024-04-27 17:57:49,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 7221723136. Throughput: 0: 55612.9. Samples: 126925280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:49,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 17:57:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000440779_7221723136.pth... [2024-04-27 17:57:49,327][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000439967_7208419328.pth [2024-04-27 17:57:51,388][54818] Updated weights for policy 0, policy_version 440788 (0.0028) [2024-04-27 17:57:54,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55432.6, 300 sec: 55316.9). Total num frames: 7222001664. Throughput: 0: 55572.4. Samples: 127256540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:54,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 17:57:54,437][54818] Updated weights for policy 0, policy_version 440798 (0.0034) [2024-04-27 17:57:57,249][54818] Updated weights for policy 0, policy_version 440808 (0.0026) [2024-04-27 17:57:59,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7222280192. Throughput: 0: 55555.6. Samples: 127424260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 17:57:59,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 17:58:00,380][54818] Updated weights for policy 0, policy_version 440818 (0.0029) [2024-04-27 17:58:02,752][54798] Signal inference workers to stop experience collection... (1750 times) [2024-04-27 17:58:02,788][54818] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-04-27 17:58:02,843][54798] Signal inference workers to resume experience collection... (1750 times) [2024-04-27 17:58:02,843][54818] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-04-27 17:58:03,096][54818] Updated weights for policy 0, policy_version 440828 (0.0026) [2024-04-27 17:58:04,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7222558720. Throughput: 0: 55576.9. Samples: 127756000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:04,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 17:58:06,395][54818] Updated weights for policy 0, policy_version 440838 (0.0028) [2024-04-27 17:58:09,040][54818] Updated weights for policy 0, policy_version 440848 (0.0030) [2024-04-27 17:58:09,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55978.5, 300 sec: 55483.4). Total num frames: 7222853632. Throughput: 0: 55524.0. Samples: 128082100. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:09,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 17:58:12,393][54818] Updated weights for policy 0, policy_version 440858 (0.0028) [2024-04-27 17:58:14,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 7223099392. Throughput: 0: 55517.3. Samples: 128254740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:14,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 17:58:14,843][54818] Updated weights for policy 0, policy_version 440868 (0.0024) [2024-04-27 17:58:18,585][54818] Updated weights for policy 0, policy_version 440878 (0.0027) [2024-04-27 17:58:19,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 7223394304. Throughput: 0: 55367.6. Samples: 128581880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:19,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 17:58:20,885][54818] Updated weights for policy 0, policy_version 440888 (0.0028) [2024-04-27 17:58:24,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7223656448. Throughput: 0: 55389.5. Samples: 128915160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:24,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 17:58:24,370][54818] Updated weights for policy 0, policy_version 440898 (0.0024) [2024-04-27 17:58:26,760][54818] Updated weights for policy 0, policy_version 440908 (0.0034) [2024-04-27 17:58:29,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7223951360. Throughput: 0: 55236.9. Samples: 129076740. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:29,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 17:58:30,171][54818] Updated weights for policy 0, policy_version 440918 (0.0024) [2024-04-27 17:58:32,737][54818] Updated weights for policy 0, policy_version 440928 (0.0028) [2024-04-27 17:58:34,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 7224213504. Throughput: 0: 55129.8. Samples: 129406120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:34,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 17:58:36,124][54818] Updated weights for policy 0, policy_version 440938 (0.0036) [2024-04-27 17:58:38,646][54818] Updated weights for policy 0, policy_version 440948 (0.0030) [2024-04-27 17:58:39,253][54587] Fps is (10 sec: 54068.3, 60 sec: 55159.7, 300 sec: 55427.9). Total num frames: 7224492032. Throughput: 0: 55145.8. Samples: 129738100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 17:58:42,105][54818] Updated weights for policy 0, policy_version 440958 (0.0030) [2024-04-27 17:58:44,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7224786944. Throughput: 0: 55373.6. Samples: 129916080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:44,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 17:58:44,452][54818] Updated weights for policy 0, policy_version 440968 (0.0031) [2024-04-27 17:58:47,890][54818] Updated weights for policy 0, policy_version 440978 (0.0031) [2024-04-27 17:58:49,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7225049088. Throughput: 0: 55425.4. Samples: 130250140. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:49,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 17:58:50,300][54818] Updated weights for policy 0, policy_version 440988 (0.0031) [2024-04-27 17:58:53,692][54818] Updated weights for policy 0, policy_version 440998 (0.0028) [2024-04-27 17:58:54,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7225327616. Throughput: 0: 55520.1. Samples: 130580500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:54,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 17:58:56,311][54818] Updated weights for policy 0, policy_version 441008 (0.0034) [2024-04-27 17:58:59,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 7225606144. Throughput: 0: 55295.2. Samples: 130743020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:58:59,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 17:58:59,843][54818] Updated weights for policy 0, policy_version 441018 (0.0034) [2024-04-27 17:59:02,239][54818] Updated weights for policy 0, policy_version 441028 (0.0029) [2024-04-27 17:59:03,527][54798] Signal inference workers to stop experience collection... (1800 times) [2024-04-27 17:59:03,559][54818] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-04-27 17:59:03,589][54798] Signal inference workers to resume experience collection... (1800 times) [2024-04-27 17:59:03,590][54818] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-04-27 17:59:04,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.7, 300 sec: 55427.9). Total num frames: 7225884672. Throughput: 0: 55471.2. Samples: 131078080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:59:04,253][54587] Avg episode reward: [(0, '0.520')] [2024-04-27 17:59:05,743][54818] Updated weights for policy 0, policy_version 441038 (0.0032) [2024-04-27 17:59:08,113][54818] Updated weights for policy 0, policy_version 441048 (0.0033) [2024-04-27 17:59:09,253][54587] Fps is (10 sec: 54066.2, 60 sec: 54886.3, 300 sec: 55372.3). Total num frames: 7226146816. Throughput: 0: 55516.7. Samples: 131413420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:59:09,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 17:59:11,666][54818] Updated weights for policy 0, policy_version 441058 (0.0030) [2024-04-27 17:59:14,067][54818] Updated weights for policy 0, policy_version 441068 (0.0027) [2024-04-27 17:59:14,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7226458112. Throughput: 0: 55717.4. Samples: 131584020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:59:14,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 17:59:17,396][54818] Updated weights for policy 0, policy_version 441078 (0.0031) [2024-04-27 17:59:19,253][54587] Fps is (10 sec: 57345.2, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7226720256. Throughput: 0: 55819.2. Samples: 131917980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:59:19,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 17:59:20,143][54818] Updated weights for policy 0, policy_version 441088 (0.0030) [2024-04-27 17:59:23,493][54818] Updated weights for policy 0, policy_version 441098 (0.0029) [2024-04-27 17:59:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7226998784. Throughput: 0: 55788.3. Samples: 132248580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 17:59:24,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 17:59:26,129][54818] Updated weights for policy 0, policy_version 441108 (0.0027) [2024-04-27 17:59:29,253][54587] Fps is (10 sec: 54066.2, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7227260928. Throughput: 0: 55596.0. Samples: 132417900. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 17:59:29,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 17:59:29,372][54818] Updated weights for policy 0, policy_version 441118 (0.0026) [2024-04-27 17:59:31,852][54818] Updated weights for policy 0, policy_version 441128 (0.0025) [2024-04-27 17:59:34,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7227555840. Throughput: 0: 55693.8. Samples: 132756360. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 17:59:34,254][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 17:59:35,055][54818] Updated weights for policy 0, policy_version 441138 (0.0030) [2024-04-27 17:59:37,742][54818] Updated weights for policy 0, policy_version 441148 (0.0031) [2024-04-27 17:59:39,255][54587] Fps is (10 sec: 55697.9, 60 sec: 55431.1, 300 sec: 55427.7). Total num frames: 7227817984. Throughput: 0: 55748.8. Samples: 133089280. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 17:59:39,255][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 17:59:40,876][54818] Updated weights for policy 0, policy_version 441158 (0.0028) [2024-04-27 17:59:43,800][54818] Updated weights for policy 0, policy_version 441168 (0.0025) [2024-04-27 17:59:44,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7228112896. Throughput: 0: 55759.9. Samples: 133252220. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 17:59:44,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 17:59:46,875][54818] Updated weights for policy 0, policy_version 441178 (0.0027) [2024-04-27 17:59:49,253][54587] Fps is (10 sec: 57351.8, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7228391424. Throughput: 0: 55744.2. Samples: 133586580. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 17:59:49,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 17:59:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000441186_7228391424.pth... [2024-04-27 17:59:49,311][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000440373_7215071232.pth [2024-04-27 17:59:49,629][54818] Updated weights for policy 0, policy_version 441188 (0.0027) [2024-04-27 17:59:52,726][54818] Updated weights for policy 0, policy_version 441198 (0.0027) [2024-04-27 17:59:54,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7228669952. Throughput: 0: 55554.4. Samples: 133913360. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 17:59:54,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 17:59:55,538][54818] Updated weights for policy 0, policy_version 441208 (0.0026) [2024-04-27 17:59:58,460][54818] Updated weights for policy 0, policy_version 441218 (0.0025) [2024-04-27 17:59:59,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7228932096. Throughput: 0: 55584.6. Samples: 134085320. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 17:59:59,253][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 18:00:01,412][54818] Updated weights for policy 0, policy_version 441228 (0.0024) [2024-04-27 18:00:03,693][54798] Signal inference workers to stop experience collection... 
(1850 times) [2024-04-27 18:00:03,730][54818] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-04-27 18:00:03,785][54798] Signal inference workers to resume experience collection... (1850 times) [2024-04-27 18:00:03,785][54818] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-04-27 18:00:04,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7229227008. Throughput: 0: 55568.0. Samples: 134418540. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 18:00:04,253][54587] Avg episode reward: [(0, '0.463')] [2024-04-27 18:00:04,306][54818] Updated weights for policy 0, policy_version 441238 (0.0029) [2024-04-27 18:00:07,320][54818] Updated weights for policy 0, policy_version 441248 (0.0030) [2024-04-27 18:00:09,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 7229505536. Throughput: 0: 55619.6. Samples: 134751460. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 18:00:09,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 18:00:10,205][54818] Updated weights for policy 0, policy_version 441258 (0.0024) [2024-04-27 18:00:13,045][54818] Updated weights for policy 0, policy_version 441268 (0.0027) [2024-04-27 18:00:14,253][54587] Fps is (10 sec: 52428.7, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 7229751296. Throughput: 0: 55479.3. Samples: 134914460. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 18:00:14,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 18:00:16,044][54818] Updated weights for policy 0, policy_version 441278 (0.0026) [2024-04-27 18:00:18,856][54818] Updated weights for policy 0, policy_version 441288 (0.0026) [2024-04-27 18:00:19,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7230062592. Throughput: 0: 55541.0. Samples: 135255700. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 18:00:19,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 18:00:22,008][54818] Updated weights for policy 0, policy_version 441298 (0.0031) [2024-04-27 18:00:24,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7230324736. Throughput: 0: 55555.1. Samples: 135589180. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 18:00:24,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:00:24,854][54818] Updated weights for policy 0, policy_version 441308 (0.0031) [2024-04-27 18:00:27,935][54818] Updated weights for policy 0, policy_version 441318 (0.0029) [2024-04-27 18:00:29,253][54587] Fps is (10 sec: 54066.2, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7230603264. Throughput: 0: 55486.6. Samples: 135749120. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 18:00:29,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 18:00:30,914][54818] Updated weights for policy 0, policy_version 441328 (0.0028) [2024-04-27 18:00:33,642][54818] Updated weights for policy 0, policy_version 441338 (0.0029) [2024-04-27 18:00:34,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7230898176. Throughput: 0: 55408.0. Samples: 136079940. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 18:00:34,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-27 18:00:36,735][54818] Updated weights for policy 0, policy_version 441348 (0.0025) [2024-04-27 18:00:39,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55980.0, 300 sec: 55650.1). Total num frames: 7231176704. Throughput: 0: 55595.5. Samples: 136415160. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 18:00:39,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 18:00:39,713][54818] Updated weights for policy 0, policy_version 441358 (0.0028) [2024-04-27 18:00:42,662][54818] Updated weights for policy 0, policy_version 441368 (0.0029) [2024-04-27 18:00:44,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7231438848. Throughput: 0: 55560.4. Samples: 136585540. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-04-27 18:00:44,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 18:00:45,492][54818] Updated weights for policy 0, policy_version 441378 (0.0025) [2024-04-27 18:00:48,709][54818] Updated weights for policy 0, policy_version 441388 (0.0043) [2024-04-27 18:00:49,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55159.6, 300 sec: 55483.4). Total num frames: 7231700992. Throughput: 0: 55536.8. Samples: 136917700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:00:49,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 18:00:51,476][54818] Updated weights for policy 0, policy_version 441398 (0.0026) [2024-04-27 18:00:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7231995904. Throughput: 0: 55462.6. Samples: 137247280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:00:54,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 18:00:54,815][54818] Updated weights for policy 0, policy_version 441408 (0.0031) [2024-04-27 18:00:57,189][54818] Updated weights for policy 0, policy_version 441418 (0.0025) [2024-04-27 18:00:59,157][54798] Signal inference workers to stop experience collection... (1900 times) [2024-04-27 18:00:59,187][54818] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-04-27 18:00:59,245][54798] Signal inference workers to resume experience collection... 
(1900 times) [2024-04-27 18:00:59,245][54818] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-04-27 18:00:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7232258048. Throughput: 0: 55496.4. Samples: 137411800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:00:59,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 18:01:00,692][54818] Updated weights for policy 0, policy_version 441428 (0.0029) [2024-04-27 18:01:03,309][54818] Updated weights for policy 0, policy_version 441438 (0.0028) [2024-04-27 18:01:04,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 7232552960. Throughput: 0: 55293.3. Samples: 137743900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 18:01:06,523][54818] Updated weights for policy 0, policy_version 441448 (0.0028) [2024-04-27 18:01:09,231][54818] Updated weights for policy 0, policy_version 441458 (0.0029) [2024-04-27 18:01:09,253][54587] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7232847872. Throughput: 0: 55231.2. Samples: 138074580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:09,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 18:01:12,788][54818] Updated weights for policy 0, policy_version 441468 (0.0028) [2024-04-27 18:01:14,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7233110016. Throughput: 0: 55329.9. Samples: 138238960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:14,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 18:01:15,039][54818] Updated weights for policy 0, policy_version 441478 (0.0027) [2024-04-27 18:01:18,658][54818] Updated weights for policy 0, policy_version 441488 (0.0025) [2024-04-27 18:01:19,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55539.0). 
Total num frames: 7233388544. Throughput: 0: 55500.7. Samples: 138577460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:19,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 18:01:20,825][54818] Updated weights for policy 0, policy_version 441498 (0.0025) [2024-04-27 18:01:24,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 7233634304. Throughput: 0: 55480.5. Samples: 138911780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:24,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 18:01:24,457][54818] Updated weights for policy 0, policy_version 441508 (0.0030) [2024-04-27 18:01:26,779][54818] Updated weights for policy 0, policy_version 441518 (0.0030) [2024-04-27 18:01:29,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7233945600. Throughput: 0: 55375.1. Samples: 139077420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:29,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 18:01:30,533][54818] Updated weights for policy 0, policy_version 441528 (0.0033) [2024-04-27 18:01:32,514][54818] Updated weights for policy 0, policy_version 441538 (0.0026) [2024-04-27 18:01:34,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7234207744. Throughput: 0: 55305.9. Samples: 139406460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:34,253][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 18:01:36,350][54818] Updated weights for policy 0, policy_version 441548 (0.0036) [2024-04-27 18:01:38,364][54818] Updated weights for policy 0, policy_version 441558 (0.0029) [2024-04-27 18:01:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7234502656. Throughput: 0: 55421.4. Samples: 139741240. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:39,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-27 18:01:42,141][54818] Updated weights for policy 0, policy_version 441568 (0.0028) [2024-04-27 18:01:44,253][54587] Fps is (10 sec: 58981.1, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7234797568. Throughput: 0: 55704.7. Samples: 139918520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:44,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:01:44,373][54818] Updated weights for policy 0, policy_version 441578 (0.0032) [2024-04-27 18:01:47,982][54818] Updated weights for policy 0, policy_version 441588 (0.0034) [2024-04-27 18:01:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7235059712. Throughput: 0: 55760.9. Samples: 140253140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:49,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 18:01:49,340][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000441594_7235076096.pth... [2024-04-27 18:01:49,391][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000440779_7221723136.pth [2024-04-27 18:01:50,572][54818] Updated weights for policy 0, policy_version 441598 (0.0029) [2024-04-27 18:01:53,946][54818] Updated weights for policy 0, policy_version 441608 (0.0025) [2024-04-27 18:01:54,194][54798] Signal inference workers to stop experience collection... (1950 times) [2024-04-27 18:01:54,239][54818] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-04-27 18:01:54,248][54798] Signal inference workers to resume experience collection... (1950 times) [2024-04-27 18:01:54,253][54587] Fps is (10 sec: 54068.6, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7235338240. Throughput: 0: 55833.9. Samples: 140587100. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:54,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 18:01:54,256][54818] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-04-27 18:01:56,532][54818] Updated weights for policy 0, policy_version 441618 (0.0027) [2024-04-27 18:01:59,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7235584000. Throughput: 0: 55674.4. Samples: 140744300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:01:59,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 18:01:59,879][54818] Updated weights for policy 0, policy_version 441628 (0.0029) [2024-04-27 18:02:02,395][54818] Updated weights for policy 0, policy_version 441638 (0.0026) [2024-04-27 18:02:04,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7235878912. Throughput: 0: 55598.1. Samples: 141079380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:02:04,262][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 18:02:05,711][54818] Updated weights for policy 0, policy_version 441648 (0.0025) [2024-04-27 18:02:08,154][54818] Updated weights for policy 0, policy_version 441658 (0.0029) [2024-04-27 18:02:09,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7236157440. Throughput: 0: 55599.4. Samples: 141413760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-27 18:02:09,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 18:02:11,567][54818] Updated weights for policy 0, policy_version 441668 (0.0028) [2024-04-27 18:02:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7236435968. Throughput: 0: 55580.5. Samples: 141578540. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:14,254][54587] Avg episode reward: [(0, '0.497')] [2024-04-27 18:02:14,307][54818] Updated weights for policy 0, policy_version 441678 (0.0036) [2024-04-27 18:02:17,510][54818] Updated weights for policy 0, policy_version 441688 (0.0032) [2024-04-27 18:02:19,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 7236747264. Throughput: 0: 55613.6. Samples: 141909080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:19,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 18:02:20,023][54818] Updated weights for policy 0, policy_version 441698 (0.0028) [2024-04-27 18:02:23,237][54818] Updated weights for policy 0, policy_version 441708 (0.0027) [2024-04-27 18:02:24,253][54587] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55539.0). Total num frames: 7237009408. Throughput: 0: 55559.7. Samples: 142241420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:24,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:02:26,044][54818] Updated weights for policy 0, policy_version 441718 (0.0030) [2024-04-27 18:02:29,210][54818] Updated weights for policy 0, policy_version 441728 (0.0030) [2024-04-27 18:02:29,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7237271552. Throughput: 0: 55394.4. Samples: 142411260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:29,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 18:02:31,977][54818] Updated weights for policy 0, policy_version 441738 (0.0027) [2024-04-27 18:02:34,253][54587] Fps is (10 sec: 50790.3, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 7237517312. Throughput: 0: 55384.0. Samples: 142745420. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:34,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 18:02:34,976][54818] Updated weights for policy 0, policy_version 441748 (0.0029) [2024-04-27 18:02:37,700][54818] Updated weights for policy 0, policy_version 441758 (0.0025) [2024-04-27 18:02:39,253][54587] Fps is (10 sec: 54066.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7237812224. Throughput: 0: 55457.0. Samples: 143082680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:39,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 18:02:40,920][54818] Updated weights for policy 0, policy_version 441768 (0.0027) [2024-04-27 18:02:43,418][54818] Updated weights for policy 0, policy_version 441778 (0.0031) [2024-04-27 18:02:44,253][54587] Fps is (10 sec: 58982.8, 60 sec: 55159.7, 300 sec: 55539.0). Total num frames: 7238107136. Throughput: 0: 55519.2. Samples: 143242660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:44,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 18:02:46,672][54818] Updated weights for policy 0, policy_version 441788 (0.0027) [2024-04-27 18:02:49,253][54587] Fps is (10 sec: 58983.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7238402048. Throughput: 0: 55518.3. Samples: 143577700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:49,253][54587] Avg episode reward: [(0, '0.688')] [2024-04-27 18:02:49,308][54818] Updated weights for policy 0, policy_version 441798 (0.0029) [2024-04-27 18:02:52,668][54818] Updated weights for policy 0, policy_version 441808 (0.0035) [2024-04-27 18:02:54,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7238664192. Throughput: 0: 55456.5. Samples: 143909300. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:54,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 18:02:55,207][54818] Updated weights for policy 0, policy_version 441818 (0.0032) [2024-04-27 18:02:58,333][54818] Updated weights for policy 0, policy_version 441828 (0.0030) [2024-04-27 18:02:59,253][54587] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 7238959104. Throughput: 0: 55739.6. Samples: 144086820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:02:59,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 18:02:59,406][54798] Signal inference workers to stop experience collection... (2000 times) [2024-04-27 18:02:59,440][54818] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-04-27 18:02:59,469][54798] Signal inference workers to resume experience collection... (2000 times) [2024-04-27 18:02:59,474][54818] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-04-27 18:03:01,165][54818] Updated weights for policy 0, policy_version 441838 (0.0034) [2024-04-27 18:03:04,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 7239221248. Throughput: 0: 55788.0. Samples: 144419540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:03:04,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 18:03:04,359][54818] Updated weights for policy 0, policy_version 441848 (0.0033) [2024-04-27 18:03:07,141][54818] Updated weights for policy 0, policy_version 441858 (0.0034) [2024-04-27 18:03:09,253][54587] Fps is (10 sec: 50790.5, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 7239467008. Throughput: 0: 55837.8. Samples: 144754120. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:03:09,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 18:03:10,175][54818] Updated weights for policy 0, policy_version 441868 (0.0027) [2024-04-27 18:03:13,377][54818] Updated weights for policy 0, policy_version 441878 (0.0030) [2024-04-27 18:03:14,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 7239745536. Throughput: 0: 55473.7. Samples: 144907580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:03:14,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 18:03:16,009][54818] Updated weights for policy 0, policy_version 441888 (0.0027) [2024-04-27 18:03:19,174][54818] Updated weights for policy 0, policy_version 441898 (0.0032) [2024-04-27 18:03:19,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 7240056832. Throughput: 0: 55505.3. Samples: 145243160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:03:19,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 18:03:21,870][54818] Updated weights for policy 0, policy_version 441908 (0.0029) [2024-04-27 18:03:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7240351744. Throughput: 0: 55473.0. Samples: 145578960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:03:24,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 18:03:24,899][54818] Updated weights for policy 0, policy_version 441918 (0.0031) [2024-04-27 18:03:27,749][54818] Updated weights for policy 0, policy_version 441928 (0.0032) [2024-04-27 18:03:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7240630272. Throughput: 0: 55930.6. Samples: 145759540. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 18:03:29,253][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 18:03:30,797][54818] Updated weights for policy 0, policy_version 441938 (0.0027) [2024-04-27 18:03:33,696][54818] Updated weights for policy 0, policy_version 441948 (0.0034) [2024-04-27 18:03:34,253][54587] Fps is (10 sec: 55705.5, 60 sec: 56524.7, 300 sec: 55650.0). Total num frames: 7240908800. Throughput: 0: 55900.3. Samples: 146093220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:03:34,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 18:03:36,706][54818] Updated weights for policy 0, policy_version 441958 (0.0027) [2024-04-27 18:03:39,253][54587] Fps is (10 sec: 55704.4, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 7241187328. Throughput: 0: 55829.6. Samples: 146421640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:03:39,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 18:03:39,493][54818] Updated weights for policy 0, policy_version 441968 (0.0031) [2024-04-27 18:03:42,527][54818] Updated weights for policy 0, policy_version 441978 (0.0027) [2024-04-27 18:03:44,253][54587] Fps is (10 sec: 50790.3, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 7241416704. Throughput: 0: 55385.7. Samples: 146579180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:03:44,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:03:45,177][54798] Signal inference workers to stop experience collection... (2050 times) [2024-04-27 18:03:45,216][54818] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-04-27 18:03:45,228][54798] Signal inference workers to resume experience collection... 
(2050 times) [2024-04-27 18:03:45,234][54818] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-04-27 18:03:45,338][54818] Updated weights for policy 0, policy_version 441988 (0.0024) [2024-04-27 18:03:48,984][54818] Updated weights for policy 0, policy_version 441998 (0.0028) [2024-04-27 18:03:49,253][54587] Fps is (10 sec: 50791.0, 60 sec: 54886.3, 300 sec: 55483.4). Total num frames: 7241695232. Throughput: 0: 55378.7. Samples: 146911580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:03:49,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 18:03:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000441998_7241695232.pth... [2024-04-27 18:03:49,324][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000441186_7228391424.pth [2024-04-27 18:03:51,314][54818] Updated weights for policy 0, policy_version 442008 (0.0027) [2024-04-27 18:03:54,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7241990144. Throughput: 0: 55314.2. Samples: 147243260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:03:54,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 18:03:55,101][54818] Updated weights for policy 0, policy_version 442018 (0.0030) [2024-04-27 18:03:57,346][54818] Updated weights for policy 0, policy_version 442028 (0.0030) [2024-04-27 18:03:59,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7242285056. Throughput: 0: 55611.1. Samples: 147410080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:03:59,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 18:04:00,859][54818] Updated weights for policy 0, policy_version 442038 (0.0027) [2024-04-27 18:04:03,249][54818] Updated weights for policy 0, policy_version 442048 (0.0029) [2024-04-27 18:04:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.7, 300 sec: 55650.1). 
Total num frames: 7242563584. Throughput: 0: 55460.0. Samples: 147738860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:04,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:04:06,754][54818] Updated weights for policy 0, policy_version 442058 (0.0024) [2024-04-27 18:04:09,087][54818] Updated weights for policy 0, policy_version 442068 (0.0026) [2024-04-27 18:04:09,253][54587] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55539.0). Total num frames: 7242842112. Throughput: 0: 55440.9. Samples: 148073800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:09,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 18:04:12,672][54818] Updated weights for policy 0, policy_version 442078 (0.0025) [2024-04-27 18:04:14,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55483.4). Total num frames: 7243087872. Throughput: 0: 55240.9. Samples: 148245380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:14,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 18:04:14,986][54818] Updated weights for policy 0, policy_version 442088 (0.0031) [2024-04-27 18:04:18,395][54818] Updated weights for policy 0, policy_version 442098 (0.0028) [2024-04-27 18:04:19,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 7243366400. Throughput: 0: 55313.5. Samples: 148582320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:19,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 18:04:20,889][54818] Updated weights for policy 0, policy_version 442108 (0.0030) [2024-04-27 18:04:24,249][54818] Updated weights for policy 0, policy_version 442118 (0.0031) [2024-04-27 18:04:24,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7243661312. Throughput: 0: 55495.3. Samples: 148918920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:24,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 18:04:26,762][54818] Updated weights for policy 0, policy_version 442128 (0.0030) [2024-04-27 18:04:29,253][54587] Fps is (10 sec: 55705.1, 60 sec: 54886.3, 300 sec: 55483.4). Total num frames: 7243923456. Throughput: 0: 55534.7. Samples: 149078240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:29,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 18:04:30,304][54818] Updated weights for policy 0, policy_version 442138 (0.0029) [2024-04-27 18:04:32,528][54818] Updated weights for policy 0, policy_version 442148 (0.0030) [2024-04-27 18:04:34,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55650.3). Total num frames: 7244234752. Throughput: 0: 55492.4. Samples: 149408740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:34,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 18:04:36,082][54818] Updated weights for policy 0, policy_version 442158 (0.0028) [2024-04-27 18:04:38,516][54818] Updated weights for policy 0, policy_version 442168 (0.0035) [2024-04-27 18:04:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 7244529664. Throughput: 0: 55532.4. Samples: 149742220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:39,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 18:04:41,894][54818] Updated weights for policy 0, policy_version 442178 (0.0030) [2024-04-27 18:04:44,253][54587] Fps is (10 sec: 55706.3, 60 sec: 56251.9, 300 sec: 55594.6). Total num frames: 7244791808. Throughput: 0: 55842.4. Samples: 149922980. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:44,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 18:04:44,451][54818] Updated weights for policy 0, policy_version 442188 (0.0032) [2024-04-27 18:04:47,543][54798] Signal inference workers to stop experience collection... (2100 times) [2024-04-27 18:04:47,573][54818] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-04-27 18:04:47,628][54798] Signal inference workers to resume experience collection... (2100 times) [2024-04-27 18:04:47,628][54818] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-04-27 18:04:47,734][54818] Updated weights for policy 0, policy_version 442198 (0.0027) [2024-04-27 18:04:49,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7245053952. Throughput: 0: 55854.7. Samples: 150252320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:49,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 18:04:50,278][54818] Updated weights for policy 0, policy_version 442208 (0.0037) [2024-04-27 18:04:53,679][54818] Updated weights for policy 0, policy_version 442218 (0.0026) [2024-04-27 18:04:54,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7245316096. Throughput: 0: 55815.6. Samples: 150585500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 18:04:54,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-27 18:04:56,142][54818] Updated weights for policy 0, policy_version 442228 (0.0028) [2024-04-27 18:04:59,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7245594624. Throughput: 0: 55520.7. Samples: 150743820. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:04:59,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 18:04:59,478][54818] Updated weights for policy 0, policy_version 442238 (0.0030) [2024-04-27 18:05:02,148][54818] Updated weights for policy 0, policy_version 442248 (0.0029) [2024-04-27 18:05:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7245889536. Throughput: 0: 55375.8. Samples: 151074240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:04,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 18:05:05,485][54818] Updated weights for policy 0, policy_version 442258 (0.0026) [2024-04-27 18:05:08,117][54818] Updated weights for policy 0, policy_version 442268 (0.0026) [2024-04-27 18:05:09,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7246184448. Throughput: 0: 55297.2. Samples: 151407300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:09,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 18:05:11,492][54818] Updated weights for policy 0, policy_version 442278 (0.0028) [2024-04-27 18:05:13,982][54818] Updated weights for policy 0, policy_version 442288 (0.0027) [2024-04-27 18:05:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 7246462976. Throughput: 0: 55646.2. Samples: 151582320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:14,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 18:05:17,369][54818] Updated weights for policy 0, policy_version 442298 (0.0027) [2024-04-27 18:05:19,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7246725120. Throughput: 0: 55668.9. Samples: 151913840. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:19,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 18:05:19,935][54818] Updated weights for policy 0, policy_version 442308 (0.0031) [2024-04-27 18:05:23,317][54818] Updated weights for policy 0, policy_version 442318 (0.0028) [2024-04-27 18:05:24,253][54587] Fps is (10 sec: 50790.6, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 7246970880. Throughput: 0: 55656.8. Samples: 152246780. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:24,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 18:05:25,825][54818] Updated weights for policy 0, policy_version 442328 (0.0031) [2024-04-27 18:05:29,116][54818] Updated weights for policy 0, policy_version 442338 (0.0028) [2024-04-27 18:05:29,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 7247265792. Throughput: 0: 55177.7. Samples: 152405980. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:29,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 18:05:31,724][54818] Updated weights for policy 0, policy_version 442348 (0.0027) [2024-04-27 18:05:34,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7247544320. Throughput: 0: 55183.6. Samples: 152735580. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:34,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 18:05:35,112][54818] Updated weights for policy 0, policy_version 442358 (0.0027) [2024-04-27 18:05:37,606][54818] Updated weights for policy 0, policy_version 442368 (0.0027) [2024-04-27 18:05:39,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7247839232. Throughput: 0: 55272.9. Samples: 153072780. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 18:05:41,032][54818] Updated weights for policy 0, policy_version 442378 (0.0035) [2024-04-27 18:05:43,436][54818] Updated weights for policy 0, policy_version 442388 (0.0025) [2024-04-27 18:05:44,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 7248101376. Throughput: 0: 55604.4. Samples: 153246020. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:44,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:05:46,797][54818] Updated weights for policy 0, policy_version 442398 (0.0028) [2024-04-27 18:05:49,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7248396288. Throughput: 0: 55670.7. Samples: 153579420. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:49,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 18:05:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000442407_7248396288.pth... [2024-04-27 18:05:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000441594_7235076096.pth [2024-04-27 18:05:49,642][54818] Updated weights for policy 0, policy_version 442408 (0.0026) [2024-04-27 18:05:52,645][54818] Updated weights for policy 0, policy_version 442418 (0.0031) [2024-04-27 18:05:54,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7248642048. Throughput: 0: 55616.6. Samples: 153910040. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:54,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:05:55,622][54818] Updated weights for policy 0, policy_version 442428 (0.0027) [2024-04-27 18:05:58,811][54818] Updated weights for policy 0, policy_version 442438 (0.0029) [2024-04-27 18:05:59,253][54587] Fps is (10 sec: 50790.6, 60 sec: 55159.5, 300 sec: 55427.9). 
Total num frames: 7248904192. Throughput: 0: 55141.8. Samples: 154063700. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:05:59,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 18:06:01,554][54818] Updated weights for policy 0, policy_version 442448 (0.0026) [2024-04-27 18:06:04,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7249199104. Throughput: 0: 55134.2. Samples: 154394880. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:06:04,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:06:04,639][54818] Updated weights for policy 0, policy_version 442458 (0.0034) [2024-04-27 18:06:06,270][54798] Signal inference workers to stop experience collection... (2150 times) [2024-04-27 18:06:06,271][54798] Signal inference workers to resume experience collection... (2150 times) [2024-04-27 18:06:06,283][54818] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-04-27 18:06:06,283][54818] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-04-27 18:06:07,368][54818] Updated weights for policy 0, policy_version 442468 (0.0029) [2024-04-27 18:06:09,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7249494016. Throughput: 0: 55163.5. Samples: 154729140. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:06:09,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 18:06:10,437][54818] Updated weights for policy 0, policy_version 442478 (0.0030) [2024-04-27 18:06:13,307][54818] Updated weights for policy 0, policy_version 442488 (0.0028) [2024-04-27 18:06:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7249772544. Throughput: 0: 55572.0. Samples: 154906720. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:06:14,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 18:06:16,880][54818] Updated weights for policy 0, policy_version 442498 (0.0026) [2024-04-27 18:06:19,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7250034688. Throughput: 0: 55542.2. Samples: 155234980. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:06:19,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 18:06:19,378][54818] Updated weights for policy 0, policy_version 442508 (0.0029) [2024-04-27 18:06:22,658][54818] Updated weights for policy 0, policy_version 442518 (0.0028) [2024-04-27 18:06:24,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7250329600. Throughput: 0: 55435.5. Samples: 155567380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:06:24,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 18:06:25,298][54818] Updated weights for policy 0, policy_version 442528 (0.0030) [2024-04-27 18:06:28,671][54818] Updated weights for policy 0, policy_version 442538 (0.0027) [2024-04-27 18:06:29,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7250575360. Throughput: 0: 55131.7. Samples: 155726940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:06:29,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 18:06:31,083][54818] Updated weights for policy 0, policy_version 442548 (0.0026) [2024-04-27 18:06:34,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 7250853888. Throughput: 0: 55111.1. Samples: 156059420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:06:34,255][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 18:06:34,423][54818] Updated weights for policy 0, policy_version 442558 (0.0028) [2024-04-27 18:06:37,095][54818] Updated weights for policy 0, policy_version 442568 (0.0027) [2024-04-27 18:06:39,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 7251148800. Throughput: 0: 55174.8. Samples: 156392900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:06:39,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 18:06:40,229][54818] Updated weights for policy 0, policy_version 442578 (0.0033) [2024-04-27 18:06:42,952][54818] Updated weights for policy 0, policy_version 442588 (0.0031) [2024-04-27 18:06:44,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55432.7, 300 sec: 55483.4). Total num frames: 7251427328. Throughput: 0: 55500.1. Samples: 156561200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:06:44,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 18:06:46,196][54818] Updated weights for policy 0, policy_version 442598 (0.0027) [2024-04-27 18:06:48,826][54818] Updated weights for policy 0, policy_version 442608 (0.0027) [2024-04-27 18:06:49,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7251722240. Throughput: 0: 55606.3. Samples: 156897160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:06:49,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-27 18:06:51,983][54818] Updated weights for policy 0, policy_version 442618 (0.0028) [2024-04-27 18:06:54,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7251968000. Throughput: 0: 55604.0. Samples: 157231320. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:06:54,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 18:06:54,820][54818] Updated weights for policy 0, policy_version 442628 (0.0028) [2024-04-27 18:06:58,090][54818] Updated weights for policy 0, policy_version 442638 (0.0033) [2024-04-27 18:06:59,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7252262912. Throughput: 0: 55352.1. Samples: 157397560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:06:59,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 18:07:00,492][54798] Signal inference workers to stop experience collection... (2200 times) [2024-04-27 18:07:00,521][54818] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-04-27 18:07:00,548][54798] Signal inference workers to resume experience collection... (2200 times) [2024-04-27 18:07:00,548][54818] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-04-27 18:07:00,664][54818] Updated weights for policy 0, policy_version 442648 (0.0026) [2024-04-27 18:07:03,988][54818] Updated weights for policy 0, policy_version 442658 (0.0026) [2024-04-27 18:07:04,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7252525056. Throughput: 0: 55451.1. Samples: 157730280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:07:04,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 18:07:06,506][54818] Updated weights for policy 0, policy_version 442668 (0.0031) [2024-04-27 18:07:09,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7252803584. Throughput: 0: 55550.3. Samples: 158067140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:07:09,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 18:07:09,782][54818] Updated weights for policy 0, policy_version 442678 (0.0027) [2024-04-27 18:07:12,273][54818] Updated weights for policy 0, policy_version 442688 (0.0024) [2024-04-27 18:07:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7253098496. Throughput: 0: 55727.5. Samples: 158234680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:07:14,254][54587] Avg episode reward: [(0, '0.492')] [2024-04-27 18:07:15,657][54818] Updated weights for policy 0, policy_version 442698 (0.0028) [2024-04-27 18:07:18,297][54818] Updated weights for policy 0, policy_version 442708 (0.0035) [2024-04-27 18:07:19,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7253393408. Throughput: 0: 55740.0. Samples: 158567720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:07:19,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-27 18:07:21,628][54818] Updated weights for policy 0, policy_version 442718 (0.0035) [2024-04-27 18:07:24,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7253639168. Throughput: 0: 55710.6. Samples: 158899880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:07:24,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 18:07:24,285][54818] Updated weights for policy 0, policy_version 442728 (0.0025) [2024-04-27 18:07:27,413][54818] Updated weights for policy 0, policy_version 442738 (0.0033) [2024-04-27 18:07:29,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7253917696. Throughput: 0: 55647.8. Samples: 159065360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:07:29,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 18:07:30,073][54818] Updated weights for policy 0, policy_version 442748 (0.0027) [2024-04-27 18:07:33,176][54818] Updated weights for policy 0, policy_version 442758 (0.0027) [2024-04-27 18:07:34,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7254212608. Throughput: 0: 55587.1. Samples: 159398580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:07:34,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:07:36,014][54818] Updated weights for policy 0, policy_version 442768 (0.0028) [2024-04-27 18:07:39,154][54818] Updated weights for policy 0, policy_version 442778 (0.0027) [2024-04-27 18:07:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7254474752. Throughput: 0: 55648.8. Samples: 159735520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:07:39,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:07:42,036][54818] Updated weights for policy 0, policy_version 442788 (0.0030) [2024-04-27 18:07:44,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7254753280. Throughput: 0: 55599.2. Samples: 159899520. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:07:44,253][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 18:07:45,064][54818] Updated weights for policy 0, policy_version 442798 (0.0034) [2024-04-27 18:07:47,903][54818] Updated weights for policy 0, policy_version 442808 (0.0027) [2024-04-27 18:07:49,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7255048192. Throughput: 0: 55624.5. Samples: 160233380. 
Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:07:49,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 18:07:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000442813_7255048192.pth... [2024-04-27 18:07:49,314][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000441998_7241695232.pth [2024-04-27 18:07:50,852][54818] Updated weights for policy 0, policy_version 442818 (0.0030) [2024-04-27 18:07:53,739][54818] Updated weights for policy 0, policy_version 442828 (0.0028) [2024-04-27 18:07:54,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55483.4). Total num frames: 7255326720. Throughput: 0: 55570.8. Samples: 160567820. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:07:54,253][54587] Avg episode reward: [(0, '0.522')] [2024-04-27 18:07:56,784][54818] Updated weights for policy 0, policy_version 442838 (0.0026) [2024-04-27 18:07:59,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7255588864. Throughput: 0: 55610.6. Samples: 160737160. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:07:59,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 18:07:59,607][54818] Updated weights for policy 0, policy_version 442848 (0.0027) [2024-04-27 18:08:02,736][54818] Updated weights for policy 0, policy_version 442858 (0.0034) [2024-04-27 18:08:04,253][54587] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 7255883776. Throughput: 0: 55680.8. Samples: 161073360. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:04,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 18:08:05,378][54818] Updated weights for policy 0, policy_version 442868 (0.0027) [2024-04-27 18:08:06,045][54798] Signal inference workers to stop experience collection... 
(2250 times) [2024-04-27 18:08:06,078][54818] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-04-27 18:08:06,104][54798] Signal inference workers to resume experience collection... (2250 times) [2024-04-27 18:08:06,104][54818] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-04-27 18:08:08,474][54818] Updated weights for policy 0, policy_version 442878 (0.0028) [2024-04-27 18:08:09,253][54587] Fps is (10 sec: 54068.3, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7256129536. Throughput: 0: 55653.4. Samples: 161404280. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:09,253][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 18:08:11,409][54818] Updated weights for policy 0, policy_version 442888 (0.0025) [2024-04-27 18:08:14,213][54818] Updated weights for policy 0, policy_version 442898 (0.0029) [2024-04-27 18:08:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7256440832. Throughput: 0: 55669.3. Samples: 161570480. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:14,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 18:08:17,315][54818] Updated weights for policy 0, policy_version 442908 (0.0026) [2024-04-27 18:08:19,253][54587] Fps is (10 sec: 57342.9, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 7256702976. Throughput: 0: 55624.3. Samples: 161901680. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:19,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 18:08:20,299][54818] Updated weights for policy 0, policy_version 442918 (0.0027) [2024-04-27 18:08:23,169][54818] Updated weights for policy 0, policy_version 442928 (0.0023) [2024-04-27 18:08:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 55538.9). Total num frames: 7257014272. Throughput: 0: 55573.3. Samples: 162236320. 
Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:24,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 18:08:26,373][54818] Updated weights for policy 0, policy_version 442938 (0.0033) [2024-04-27 18:08:29,062][54818] Updated weights for policy 0, policy_version 442948 (0.0033) [2024-04-27 18:08:29,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 7257260032. Throughput: 0: 55608.3. Samples: 162401900. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:29,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 18:08:32,310][54818] Updated weights for policy 0, policy_version 442958 (0.0028) [2024-04-27 18:08:34,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7257538560. Throughput: 0: 55542.7. Samples: 162732800. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:34,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 18:08:35,009][54818] Updated weights for policy 0, policy_version 442968 (0.0029) [2024-04-27 18:08:38,043][54818] Updated weights for policy 0, policy_version 442978 (0.0030) [2024-04-27 18:08:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7257817088. Throughput: 0: 55421.7. Samples: 163061800. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:39,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 18:08:41,031][54818] Updated weights for policy 0, policy_version 442988 (0.0034) [2024-04-27 18:08:43,909][54818] Updated weights for policy 0, policy_version 442998 (0.0028) [2024-04-27 18:08:44,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7258095616. Throughput: 0: 55469.5. Samples: 163233280. 
Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:44,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 18:08:46,869][54818] Updated weights for policy 0, policy_version 443008 (0.0025) [2024-04-27 18:08:49,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7258357760. Throughput: 0: 55375.1. Samples: 163565240. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:49,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 18:08:49,837][54818] Updated weights for policy 0, policy_version 443018 (0.0031) [2024-04-27 18:08:52,734][54818] Updated weights for policy 0, policy_version 443028 (0.0025) [2024-04-27 18:08:54,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7258652672. Throughput: 0: 55467.4. Samples: 163900320. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:54,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 18:08:55,680][54818] Updated weights for policy 0, policy_version 443038 (0.0025) [2024-04-27 18:08:58,678][54818] Updated weights for policy 0, policy_version 443048 (0.0032) [2024-04-27 18:08:59,041][54798] Signal inference workers to stop experience collection... (2300 times) [2024-04-27 18:08:59,078][54818] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-04-27 18:08:59,099][54798] Signal inference workers to resume experience collection... (2300 times) [2024-04-27 18:08:59,102][54818] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-04-27 18:08:59,253][54587] Fps is (10 sec: 58983.1, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 7258947584. Throughput: 0: 55511.7. Samples: 164068500. 
Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:08:59,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-27 18:09:01,586][54818] Updated weights for policy 0, policy_version 443058 (0.0024) [2024-04-27 18:09:04,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7259193344. Throughput: 0: 55510.3. Samples: 164399640. Policy #0 lag: (min: 1.0, avg: 13.0, max: 26.0) [2024-04-27 18:09:04,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 18:09:04,606][54818] Updated weights for policy 0, policy_version 443068 (0.0027) [2024-04-27 18:09:07,312][54818] Updated weights for policy 0, policy_version 443078 (0.0033) [2024-04-27 18:09:09,253][54587] Fps is (10 sec: 55704.7, 60 sec: 56251.5, 300 sec: 55650.0). Total num frames: 7259504640. Throughput: 0: 55521.7. Samples: 164734800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:09,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 18:09:10,477][54818] Updated weights for policy 0, policy_version 443088 (0.0026) [2024-04-27 18:09:13,209][54818] Updated weights for policy 0, policy_version 443098 (0.0026) [2024-04-27 18:09:14,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7259750400. Throughput: 0: 55496.5. Samples: 164899240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:14,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 18:09:16,259][54818] Updated weights for policy 0, policy_version 443108 (0.0028) [2024-04-27 18:09:19,123][54818] Updated weights for policy 0, policy_version 443118 (0.0028) [2024-04-27 18:09:19,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7260045312. Throughput: 0: 55554.3. Samples: 165232740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:19,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 18:09:22,203][54818] Updated weights for policy 0, policy_version 443128 (0.0024) [2024-04-27 18:09:24,253][54587] Fps is (10 sec: 55705.3, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 7260307456. Throughput: 0: 55672.0. Samples: 165567040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:24,254][54587] Avg episode reward: [(0, '0.485')] [2024-04-27 18:09:25,021][54818] Updated weights for policy 0, policy_version 443138 (0.0030) [2024-04-27 18:09:28,106][54818] Updated weights for policy 0, policy_version 443148 (0.0033) [2024-04-27 18:09:29,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7260585984. Throughput: 0: 55419.1. Samples: 165727140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:29,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 18:09:30,778][54818] Updated weights for policy 0, policy_version 443158 (0.0028) [2024-04-27 18:09:33,928][54818] Updated weights for policy 0, policy_version 443168 (0.0026) [2024-04-27 18:09:34,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 7260880896. Throughput: 0: 55452.0. Samples: 166060580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:34,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 18:09:36,744][54818] Updated weights for policy 0, policy_version 443178 (0.0027) [2024-04-27 18:09:39,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 7261126656. Throughput: 0: 55384.5. Samples: 166392620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:39,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 18:09:40,013][54818] Updated weights for policy 0, policy_version 443188 (0.0029) [2024-04-27 18:09:42,694][54818] Updated weights for policy 0, policy_version 443198 (0.0025) [2024-04-27 18:09:44,253][54587] Fps is (10 sec: 55706.6, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7261437952. Throughput: 0: 55408.1. Samples: 166561860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:44,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 18:09:45,943][54818] Updated weights for policy 0, policy_version 443208 (0.0027) [2024-04-27 18:09:48,527][54818] Updated weights for policy 0, policy_version 443218 (0.0027) [2024-04-27 18:09:49,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7261700096. Throughput: 0: 55534.1. Samples: 166898680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:49,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 18:09:49,268][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000443219_7261700096.pth... [2024-04-27 18:09:49,311][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000442407_7248396288.pth [2024-04-27 18:09:51,630][54818] Updated weights for policy 0, policy_version 443228 (0.0028) [2024-04-27 18:09:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 7261995008. Throughput: 0: 55530.1. Samples: 167233640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:54,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 18:09:54,339][54818] Updated weights for policy 0, policy_version 443238 (0.0026) [2024-04-27 18:09:57,609][54818] Updated weights for policy 0, policy_version 443248 (0.0029) [2024-04-27 18:09:59,253][54587] Fps is (10 sec: 54068.2, 60 sec: 54886.4, 300 sec: 55427.9). 
Total num frames: 7262240768. Throughput: 0: 55518.6. Samples: 167397580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:09:59,253][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 18:10:00,264][54818] Updated weights for policy 0, policy_version 443258 (0.0026) [2024-04-27 18:10:02,932][54798] Signal inference workers to stop experience collection... (2350 times) [2024-04-27 18:10:02,976][54818] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-04-27 18:10:02,991][54798] Signal inference workers to resume experience collection... (2350 times) [2024-04-27 18:10:02,994][54818] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-04-27 18:10:03,478][54818] Updated weights for policy 0, policy_version 443268 (0.0027) [2024-04-27 18:10:04,253][54587] Fps is (10 sec: 54065.9, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 7262535680. Throughput: 0: 55534.0. Samples: 167731780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:10:04,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 18:10:06,147][54818] Updated weights for policy 0, policy_version 443278 (0.0027) [2024-04-27 18:10:09,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7262814208. Throughput: 0: 55523.8. Samples: 168065620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:10:09,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 18:10:09,495][54818] Updated weights for policy 0, policy_version 443288 (0.0031) [2024-04-27 18:10:12,097][54818] Updated weights for policy 0, policy_version 443298 (0.0025) [2024-04-27 18:10:14,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7263076352. Throughput: 0: 55585.8. Samples: 168228500. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:10:14,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:10:15,352][54818] Updated weights for policy 0, policy_version 443308 (0.0025) [2024-04-27 18:10:17,890][54818] Updated weights for policy 0, policy_version 443318 (0.0028) [2024-04-27 18:10:19,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7263371264. Throughput: 0: 55515.3. Samples: 168558760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:10:19,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 18:10:21,428][54818] Updated weights for policy 0, policy_version 443328 (0.0026) [2024-04-27 18:10:23,785][54818] Updated weights for policy 0, policy_version 443338 (0.0030) [2024-04-27 18:10:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7263666176. Throughput: 0: 55504.5. Samples: 168890320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:10:24,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 18:10:27,315][54818] Updated weights for policy 0, policy_version 443348 (0.0029) [2024-04-27 18:10:29,253][54587] Fps is (10 sec: 55704.4, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 7263928320. Throughput: 0: 55506.8. Samples: 169059680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:10:29,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 18:10:29,829][54818] Updated weights for policy 0, policy_version 443358 (0.0034) [2024-04-27 18:10:33,382][54818] Updated weights for policy 0, policy_version 443368 (0.0027) [2024-04-27 18:10:34,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 7264190464. Throughput: 0: 55501.2. Samples: 169396220. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:10:34,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 18:10:35,806][54818] Updated weights for policy 0, policy_version 443378 (0.0039) [2024-04-27 18:10:39,253][54587] Fps is (10 sec: 52430.1, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7264452608. Throughput: 0: 55488.0. Samples: 169730600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:10:39,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:10:39,295][54818] Updated weights for policy 0, policy_version 443388 (0.0027) [2024-04-27 18:10:41,731][54818] Updated weights for policy 0, policy_version 443398 (0.0025) [2024-04-27 18:10:44,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7264763904. Throughput: 0: 55481.2. Samples: 169894240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:10:44,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 18:10:45,060][54818] Updated weights for policy 0, policy_version 443408 (0.0036) [2024-04-27 18:10:47,585][54818] Updated weights for policy 0, policy_version 443418 (0.0027) [2024-04-27 18:10:49,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 7265042432. Throughput: 0: 55424.7. Samples: 170225880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:10:49,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:10:50,740][54818] Updated weights for policy 0, policy_version 443428 (0.0031) [2024-04-27 18:10:53,497][54818] Updated weights for policy 0, policy_version 443438 (0.0029) [2024-04-27 18:10:54,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7265304576. Throughput: 0: 55399.7. Samples: 170558600. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:10:54,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 18:10:56,879][54818] Updated weights for policy 0, policy_version 443448 (0.0026) [2024-04-27 18:10:59,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7265599488. Throughput: 0: 55667.8. Samples: 170733560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:10:59,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:10:59,480][54818] Updated weights for policy 0, policy_version 443458 (0.0036) [2024-04-27 18:11:02,828][54818] Updated weights for policy 0, policy_version 443468 (0.0034) [2024-04-27 18:11:04,253][54587] Fps is (10 sec: 57342.9, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7265878016. Throughput: 0: 55750.9. Samples: 171067560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:04,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 18:11:05,285][54818] Updated weights for policy 0, policy_version 443478 (0.0028) [2024-04-27 18:11:08,642][54818] Updated weights for policy 0, policy_version 443488 (0.0026) [2024-04-27 18:11:09,255][54587] Fps is (10 sec: 54060.4, 60 sec: 55431.3, 300 sec: 55483.2). Total num frames: 7266140160. Throughput: 0: 55777.7. Samples: 171400400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:09,255][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 18:11:11,102][54818] Updated weights for policy 0, policy_version 443498 (0.0040) [2024-04-27 18:11:11,819][54798] Signal inference workers to stop experience collection... (2400 times) [2024-04-27 18:11:11,819][54798] Signal inference workers to resume experience collection... 
(2400 times) [2024-04-27 18:11:11,831][54818] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-04-27 18:11:11,849][54818] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-04-27 18:11:14,253][54587] Fps is (10 sec: 52430.0, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7266402304. Throughput: 0: 55615.9. Samples: 171562380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:14,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 18:11:14,453][54818] Updated weights for policy 0, policy_version 443508 (0.0035) [2024-04-27 18:11:17,080][54818] Updated weights for policy 0, policy_version 443518 (0.0030) [2024-04-27 18:11:19,253][54587] Fps is (10 sec: 55713.4, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 7266697216. Throughput: 0: 55611.9. Samples: 171898760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:19,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 18:11:20,325][54818] Updated weights for policy 0, policy_version 443528 (0.0034) [2024-04-27 18:11:23,166][54818] Updated weights for policy 0, policy_version 443538 (0.0025) [2024-04-27 18:11:24,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7266992128. Throughput: 0: 55624.9. Samples: 172233720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:24,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 18:11:26,273][54818] Updated weights for policy 0, policy_version 443548 (0.0033) [2024-04-27 18:11:29,106][54818] Updated weights for policy 0, policy_version 443558 (0.0026) [2024-04-27 18:11:29,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7267254272. Throughput: 0: 55784.6. Samples: 172404540. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:29,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 18:11:32,024][54818] Updated weights for policy 0, policy_version 443568 (0.0035) [2024-04-27 18:11:34,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7267549184. Throughput: 0: 55705.7. Samples: 172732640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:34,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 18:11:35,106][54818] Updated weights for policy 0, policy_version 443578 (0.0031) [2024-04-27 18:11:37,919][54818] Updated weights for policy 0, policy_version 443588 (0.0023) [2024-04-27 18:11:39,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 7267827712. Throughput: 0: 55678.2. Samples: 173064120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:39,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 18:11:40,909][54818] Updated weights for policy 0, policy_version 443598 (0.0025) [2024-04-27 18:11:43,852][54818] Updated weights for policy 0, policy_version 443608 (0.0029) [2024-04-27 18:11:44,253][54587] Fps is (10 sec: 52429.6, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 7268073472. Throughput: 0: 55436.8. Samples: 173228200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:44,253][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 18:11:46,786][54818] Updated weights for policy 0, policy_version 443618 (0.0025) [2024-04-27 18:11:49,253][54587] Fps is (10 sec: 52427.4, 60 sec: 55159.3, 300 sec: 55538.9). Total num frames: 7268352000. Throughput: 0: 55312.4. Samples: 173556620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-27 18:11:49,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 18:11:49,370][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000443626_7268368384.pth... 
[2024-04-27 18:11:49,424][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000442813_7255048192.pth [2024-04-27 18:11:49,819][54818] Updated weights for policy 0, policy_version 443628 (0.0029) [2024-04-27 18:11:52,664][54818] Updated weights for policy 0, policy_version 443638 (0.0030) [2024-04-27 18:11:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7268646912. Throughput: 0: 55229.3. Samples: 173885640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:11:54,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 18:11:55,850][54818] Updated weights for policy 0, policy_version 443648 (0.0027) [2024-04-27 18:11:58,697][54818] Updated weights for policy 0, policy_version 443658 (0.0029) [2024-04-27 18:11:59,253][54587] Fps is (10 sec: 58983.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7268941824. Throughput: 0: 55540.3. Samples: 174061700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:11:59,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 18:12:01,603][54818] Updated weights for policy 0, policy_version 443668 (0.0027) [2024-04-27 18:12:04,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7269187584. Throughput: 0: 55449.0. Samples: 174393960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:04,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:12:04,550][54818] Updated weights for policy 0, policy_version 443678 (0.0027) [2024-04-27 18:12:07,630][54818] Updated weights for policy 0, policy_version 443688 (0.0028) [2024-04-27 18:12:09,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55706.9, 300 sec: 55539.0). Total num frames: 7269482496. Throughput: 0: 55296.4. Samples: 174722060. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:09,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 18:12:10,386][54818] Updated weights for policy 0, policy_version 443698 (0.0028) [2024-04-27 18:12:13,462][54818] Updated weights for policy 0, policy_version 443708 (0.0026) [2024-04-27 18:12:14,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55483.5). Total num frames: 7269761024. Throughput: 0: 55232.4. Samples: 174890000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:14,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 18:12:16,015][54798] Signal inference workers to stop experience collection... (2450 times) [2024-04-27 18:12:16,049][54818] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-04-27 18:12:16,076][54798] Signal inference workers to resume experience collection... (2450 times) [2024-04-27 18:12:16,082][54818] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-04-27 18:12:16,181][54818] Updated weights for policy 0, policy_version 443718 (0.0023) [2024-04-27 18:12:19,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7270023168. Throughput: 0: 55268.1. Samples: 175219700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:19,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 18:12:19,417][54818] Updated weights for policy 0, policy_version 443728 (0.0031) [2024-04-27 18:12:22,292][54818] Updated weights for policy 0, policy_version 443738 (0.0025) [2024-04-27 18:12:24,253][54587] Fps is (10 sec: 52428.7, 60 sec: 54886.3, 300 sec: 55483.5). Total num frames: 7270285312. Throughput: 0: 55193.7. Samples: 175547840. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:24,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 18:12:25,416][54818] Updated weights for policy 0, policy_version 443748 (0.0024) [2024-04-27 18:12:28,032][54818] Updated weights for policy 0, policy_version 443758 (0.0038) [2024-04-27 18:12:29,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7270580224. Throughput: 0: 55261.7. Samples: 175714980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:29,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 18:12:31,604][54818] Updated weights for policy 0, policy_version 443768 (0.0030) [2024-04-27 18:12:33,981][54818] Updated weights for policy 0, policy_version 443778 (0.0031) [2024-04-27 18:12:34,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7270875136. Throughput: 0: 55366.4. Samples: 176048100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:34,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:12:37,371][54818] Updated weights for policy 0, policy_version 443788 (0.0027) [2024-04-27 18:12:39,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7271137280. Throughput: 0: 55511.9. Samples: 176383680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:39,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 18:12:39,923][54818] Updated weights for policy 0, policy_version 443798 (0.0035) [2024-04-27 18:12:43,210][54818] Updated weights for policy 0, policy_version 443808 (0.0028) [2024-04-27 18:12:44,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55705.4, 300 sec: 55483.4). Total num frames: 7271415808. Throughput: 0: 55336.7. Samples: 176551860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:44,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 18:12:45,689][54818] Updated weights for policy 0, policy_version 443818 (0.0027) [2024-04-27 18:12:49,244][54818] Updated weights for policy 0, policy_version 443828 (0.0031) [2024-04-27 18:12:49,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.7, 300 sec: 55427.9). Total num frames: 7271677952. Throughput: 0: 55201.7. Samples: 176878040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:49,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 18:12:51,593][54818] Updated weights for policy 0, policy_version 443838 (0.0026) [2024-04-27 18:12:54,253][54587] Fps is (10 sec: 52429.4, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 7271940096. Throughput: 0: 55443.9. Samples: 177217040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:54,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 18:12:54,975][54818] Updated weights for policy 0, policy_version 443848 (0.0025) [2024-04-27 18:12:57,570][54818] Updated weights for policy 0, policy_version 443858 (0.0028) [2024-04-27 18:12:59,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 7272251392. Throughput: 0: 55328.8. Samples: 177379800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:12:59,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 18:13:01,089][54818] Updated weights for policy 0, policy_version 443868 (0.0026) [2024-04-27 18:13:03,516][54818] Updated weights for policy 0, policy_version 443878 (0.0029) [2024-04-27 18:13:04,253][54587] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7272529920. Throughput: 0: 55314.2. Samples: 177708840. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:13:04,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 18:13:06,934][54818] Updated weights for policy 0, policy_version 443888 (0.0026) [2024-04-27 18:13:09,218][54818] Updated weights for policy 0, policy_version 443898 (0.0030) [2024-04-27 18:13:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7272824832. Throughput: 0: 55495.9. Samples: 178045160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:13:09,254][54587] Avg episode reward: [(0, '0.687')] [2024-04-27 18:13:12,858][54818] Updated weights for policy 0, policy_version 443908 (0.0026) [2024-04-27 18:13:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7273086976. Throughput: 0: 55609.3. Samples: 178217400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-27 18:13:14,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 18:13:15,070][54818] Updated weights for policy 0, policy_version 443918 (0.0026) [2024-04-27 18:13:18,546][54798] Signal inference workers to stop experience collection... (2500 times) [2024-04-27 18:13:18,547][54798] Signal inference workers to resume experience collection... (2500 times) [2024-04-27 18:13:18,558][54818] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-04-27 18:13:18,558][54818] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-04-27 18:13:18,662][54818] Updated weights for policy 0, policy_version 443928 (0.0029) [2024-04-27 18:13:19,253][54587] Fps is (10 sec: 52428.3, 60 sec: 55432.4, 300 sec: 55372.4). Total num frames: 7273349120. Throughput: 0: 55623.9. Samples: 178551180. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:13:19,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:13:21,086][54818] Updated weights for policy 0, policy_version 443938 (0.0026) [2024-04-27 18:13:24,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7273611264. Throughput: 0: 55602.7. Samples: 178885800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:13:24,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:13:24,502][54818] Updated weights for policy 0, policy_version 443948 (0.0033) [2024-04-27 18:13:27,022][54818] Updated weights for policy 0, policy_version 443958 (0.0032) [2024-04-27 18:13:29,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7273906176. Throughput: 0: 55334.8. Samples: 179041920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:13:29,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:13:30,383][54818] Updated weights for policy 0, policy_version 443968 (0.0028) [2024-04-27 18:13:32,821][54818] Updated weights for policy 0, policy_version 443978 (0.0026) [2024-04-27 18:13:34,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7274184704. Throughput: 0: 55489.8. Samples: 179375080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:13:34,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 18:13:36,152][54818] Updated weights for policy 0, policy_version 443988 (0.0029) [2024-04-27 18:13:38,557][54818] Updated weights for policy 0, policy_version 443998 (0.0031) [2024-04-27 18:13:39,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7274479616. Throughput: 0: 55408.0. Samples: 179710400. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:13:39,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 18:13:41,944][54818] Updated weights for policy 0, policy_version 444008 (0.0030) [2024-04-27 18:13:44,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7274758144. Throughput: 0: 55762.7. Samples: 179889120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:13:44,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 18:13:44,435][54818] Updated weights for policy 0, policy_version 444018 (0.0029) [2024-04-27 18:13:47,858][54818] Updated weights for policy 0, policy_version 444028 (0.0028) [2024-04-27 18:13:49,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7275036672. Throughput: 0: 55855.1. Samples: 180222320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:13:49,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 18:13:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000444033_7275036672.pth... [2024-04-27 18:13:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000443219_7261700096.pth [2024-04-27 18:13:50,282][54818] Updated weights for policy 0, policy_version 444038 (0.0039) [2024-04-27 18:13:53,932][54818] Updated weights for policy 0, policy_version 444048 (0.0025) [2024-04-27 18:13:54,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55978.6, 300 sec: 55427.9). Total num frames: 7275298816. Throughput: 0: 55742.2. Samples: 180553560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:13:54,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 18:13:56,090][54818] Updated weights for policy 0, policy_version 444058 (0.0027) [2024-04-27 18:13:59,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7275560960. Throughput: 0: 55547.1. Samples: 180717020. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:13:59,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 18:13:59,688][54818] Updated weights for policy 0, policy_version 444068 (0.0029) [2024-04-27 18:14:01,961][54818] Updated weights for policy 0, policy_version 444078 (0.0029) [2024-04-27 18:14:04,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7275855872. Throughput: 0: 55476.5. Samples: 181047620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:14:04,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 18:14:05,760][54818] Updated weights for policy 0, policy_version 444088 (0.0028) [2024-04-27 18:14:08,088][54818] Updated weights for policy 0, policy_version 444098 (0.0033) [2024-04-27 18:14:09,254][54587] Fps is (10 sec: 57341.7, 60 sec: 55159.1, 300 sec: 55538.9). Total num frames: 7276134400. Throughput: 0: 55523.5. Samples: 181384380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:14:09,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 18:14:09,704][54798] Signal inference workers to stop experience collection... (2550 times) [2024-04-27 18:14:09,740][54818] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-04-27 18:14:09,767][54798] Signal inference workers to resume experience collection... (2550 times) [2024-04-27 18:14:09,767][54818] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-04-27 18:14:11,544][54818] Updated weights for policy 0, policy_version 444108 (0.0025) [2024-04-27 18:14:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7276412928. Throughput: 0: 55793.8. Samples: 181552640. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:14:14,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:14:14,374][54818] Updated weights for policy 0, policy_version 444118 (0.0030) [2024-04-27 18:14:17,573][54818] Updated weights for policy 0, policy_version 444128 (0.0032) [2024-04-27 18:14:19,253][54587] Fps is (10 sec: 57346.4, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 7276707840. Throughput: 0: 55742.3. Samples: 181883480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:14:19,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 18:14:20,329][54818] Updated weights for policy 0, policy_version 444138 (0.0027) [2024-04-27 18:14:23,313][54818] Updated weights for policy 0, policy_version 444148 (0.0028) [2024-04-27 18:14:24,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7276969984. Throughput: 0: 55627.2. Samples: 182213620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:14:24,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 18:14:26,196][54818] Updated weights for policy 0, policy_version 444158 (0.0028) [2024-04-27 18:14:29,253][54587] Fps is (10 sec: 52428.3, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7277232128. Throughput: 0: 55361.3. Samples: 182380380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:14:29,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:14:29,315][54818] Updated weights for policy 0, policy_version 444168 (0.0024) [2024-04-27 18:14:31,944][54818] Updated weights for policy 0, policy_version 444178 (0.0030) [2024-04-27 18:14:34,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7277494272. Throughput: 0: 55359.1. Samples: 182713480. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-27 18:14:34,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 18:14:35,194][54818] Updated weights for policy 0, policy_version 444188 (0.0029) [2024-04-27 18:14:37,905][54818] Updated weights for policy 0, policy_version 444198 (0.0035) [2024-04-27 18:14:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 7277789184. Throughput: 0: 55478.6. Samples: 183050100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:14:39,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 18:14:41,084][54818] Updated weights for policy 0, policy_version 444208 (0.0026) [2024-04-27 18:14:43,694][54818] Updated weights for policy 0, policy_version 444218 (0.0028) [2024-04-27 18:14:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7278084096. Throughput: 0: 55535.2. Samples: 183216100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:14:44,253][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 18:14:46,953][54818] Updated weights for policy 0, policy_version 444228 (0.0027) [2024-04-27 18:14:49,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7278362624. Throughput: 0: 55599.7. Samples: 183549600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:14:49,253][54587] Avg episode reward: [(0, '0.713')] [2024-04-27 18:14:49,665][54818] Updated weights for policy 0, policy_version 444238 (0.0030) [2024-04-27 18:14:52,846][54818] Updated weights for policy 0, policy_version 444248 (0.0027) [2024-04-27 18:14:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7278641152. Throughput: 0: 55363.7. Samples: 183875720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:14:54,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 18:14:55,531][54818] Updated weights for policy 0, policy_version 444258 (0.0028) [2024-04-27 18:14:58,676][54818] Updated weights for policy 0, policy_version 444268 (0.0026) [2024-04-27 18:14:59,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 7278903296. Throughput: 0: 55387.6. Samples: 184045080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:14:59,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 18:15:01,554][54818] Updated weights for policy 0, policy_version 444278 (0.0038) [2024-04-27 18:15:04,253][54587] Fps is (10 sec: 52428.4, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7279165440. Throughput: 0: 55415.9. Samples: 184377200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:04,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 18:15:04,552][54818] Updated weights for policy 0, policy_version 444288 (0.0031) [2024-04-27 18:15:07,530][54818] Updated weights for policy 0, policy_version 444298 (0.0031) [2024-04-27 18:15:08,599][54798] Signal inference workers to stop experience collection... (2600 times) [2024-04-27 18:15:08,600][54798] Signal inference workers to resume experience collection... (2600 times) [2024-04-27 18:15:08,613][54818] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-04-27 18:15:08,613][54818] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-04-27 18:15:09,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.8, 300 sec: 55483.4). Total num frames: 7279443968. Throughput: 0: 55585.7. Samples: 184714980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:09,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 18:15:10,545][54818] Updated weights for policy 0, policy_version 444308 (0.0025) [2024-04-27 18:15:13,310][54818] Updated weights for policy 0, policy_version 444318 (0.0025) [2024-04-27 18:15:14,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 7279722496. Throughput: 0: 55277.8. Samples: 184867880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:14,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 18:15:16,610][54818] Updated weights for policy 0, policy_version 444328 (0.0029) [2024-04-27 18:15:19,104][54818] Updated weights for policy 0, policy_version 444338 (0.0025) [2024-04-27 18:15:19,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7280033792. Throughput: 0: 55254.2. Samples: 185199920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:19,254][54587] Avg episode reward: [(0, '0.685')] [2024-04-27 18:15:22,585][54818] Updated weights for policy 0, policy_version 444348 (0.0025) [2024-04-27 18:15:24,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 7280295936. Throughput: 0: 55202.0. Samples: 185534180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:24,253][54587] Avg episode reward: [(0, '0.680')] [2024-04-27 18:15:25,172][54818] Updated weights for policy 0, policy_version 444358 (0.0031) [2024-04-27 18:15:28,396][54818] Updated weights for policy 0, policy_version 444368 (0.0033) [2024-04-27 18:15:29,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7280574464. Throughput: 0: 55524.9. Samples: 185714720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:29,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 18:15:31,156][54818] Updated weights for policy 0, policy_version 444378 (0.0029) [2024-04-27 18:15:34,129][54818] Updated weights for policy 0, policy_version 444388 (0.0028) [2024-04-27 18:15:34,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7280852992. Throughput: 0: 55459.5. Samples: 186045280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:34,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:15:36,839][54818] Updated weights for policy 0, policy_version 444398 (0.0030) [2024-04-27 18:15:39,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7281131520. Throughput: 0: 55668.9. Samples: 186380820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:39,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 18:15:39,908][54818] Updated weights for policy 0, policy_version 444408 (0.0037) [2024-04-27 18:15:42,950][54818] Updated weights for policy 0, policy_version 444418 (0.0031) [2024-04-27 18:15:44,254][54587] Fps is (10 sec: 54063.2, 60 sec: 55158.7, 300 sec: 55427.8). Total num frames: 7281393664. Throughput: 0: 55505.3. Samples: 186542860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:44,255][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 18:15:46,003][54818] Updated weights for policy 0, policy_version 444428 (0.0029) [2024-04-27 18:15:48,948][54818] Updated weights for policy 0, policy_version 444438 (0.0025) [2024-04-27 18:15:49,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7281672192. Throughput: 0: 55513.8. Samples: 186875320. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:49,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 18:15:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000444438_7281672192.pth... [2024-04-27 18:15:49,312][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000443626_7268368384.pth [2024-04-27 18:15:52,036][54818] Updated weights for policy 0, policy_version 444448 (0.0026) [2024-04-27 18:15:54,253][54587] Fps is (10 sec: 57348.6, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7281967104. Throughput: 0: 55359.2. Samples: 187206140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:54,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 18:15:54,679][54818] Updated weights for policy 0, policy_version 444458 (0.0033) [2024-04-27 18:15:57,950][54818] Updated weights for policy 0, policy_version 444468 (0.0027) [2024-04-27 18:15:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 7282245632. Throughput: 0: 55655.7. Samples: 187372380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 18:15:59,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 18:16:00,614][54818] Updated weights for policy 0, policy_version 444478 (0.0027) [2024-04-27 18:16:03,866][54818] Updated weights for policy 0, policy_version 444488 (0.0027) [2024-04-27 18:16:04,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 55539.2). Total num frames: 7282524160. Throughput: 0: 55714.2. Samples: 187707060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:04,254][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 18:16:06,452][54818] Updated weights for policy 0, policy_version 444498 (0.0027) [2024-04-27 18:16:09,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7282786304. Throughput: 0: 55820.0. Samples: 188046080. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:09,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:16:09,799][54818] Updated weights for policy 0, policy_version 444508 (0.0027) [2024-04-27 18:16:12,341][54818] Updated weights for policy 0, policy_version 444518 (0.0027) [2024-04-27 18:16:14,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7283064832. Throughput: 0: 55343.5. Samples: 188205180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:14,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 18:16:15,525][54798] Signal inference workers to stop experience collection... (2650 times) [2024-04-27 18:16:15,573][54818] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-04-27 18:16:15,580][54798] Signal inference workers to resume experience collection... (2650 times) [2024-04-27 18:16:15,586][54818] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-04-27 18:16:15,589][54818] Updated weights for policy 0, policy_version 444528 (0.0027) [2024-04-27 18:16:18,298][54818] Updated weights for policy 0, policy_version 444538 (0.0027) [2024-04-27 18:16:19,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 7283343360. Throughput: 0: 55502.1. Samples: 188542880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:19,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:16:21,358][54818] Updated weights for policy 0, policy_version 444548 (0.0031) [2024-04-27 18:16:24,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7283621888. Throughput: 0: 55508.8. Samples: 188878720. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:24,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 18:16:24,324][54818] Updated weights for policy 0, policy_version 444558 (0.0025) [2024-04-27 18:16:27,238][54818] Updated weights for policy 0, policy_version 444568 (0.0025) [2024-04-27 18:16:29,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55705.4, 300 sec: 55483.4). Total num frames: 7283916800. Throughput: 0: 55573.7. Samples: 189043640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:29,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 18:16:30,309][54818] Updated weights for policy 0, policy_version 444578 (0.0037) [2024-04-27 18:16:33,107][54818] Updated weights for policy 0, policy_version 444588 (0.0030) [2024-04-27 18:16:34,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7284178944. Throughput: 0: 55561.4. Samples: 189375580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:34,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 18:16:36,055][54818] Updated weights for policy 0, policy_version 444598 (0.0028) [2024-04-27 18:16:38,962][54818] Updated weights for policy 0, policy_version 444608 (0.0031) [2024-04-27 18:16:39,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7284473856. Throughput: 0: 55635.4. Samples: 189709740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:39,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 18:16:41,878][54818] Updated weights for policy 0, policy_version 444618 (0.0025) [2024-04-27 18:16:44,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55706.3, 300 sec: 55539.0). Total num frames: 7284736000. Throughput: 0: 55676.0. Samples: 189877800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:44,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 18:16:44,877][54818] Updated weights for policy 0, policy_version 444628 (0.0039) [2024-04-27 18:16:47,773][54818] Updated weights for policy 0, policy_version 444638 (0.0029) [2024-04-27 18:16:49,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7285014528. Throughput: 0: 55671.8. Samples: 190212280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:49,253][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 18:16:50,746][54818] Updated weights for policy 0, policy_version 444648 (0.0034) [2024-04-27 18:16:53,584][54818] Updated weights for policy 0, policy_version 444658 (0.0028) [2024-04-27 18:16:54,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7285293056. Throughput: 0: 55418.2. Samples: 190539900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:54,254][54587] Avg episode reward: [(0, '0.513')] [2024-04-27 18:16:56,500][54818] Updated weights for policy 0, policy_version 444668 (0.0037) [2024-04-27 18:16:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7285587968. Throughput: 0: 55639.2. Samples: 190708940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:16:59,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 18:16:59,517][54818] Updated weights for policy 0, policy_version 444678 (0.0023) [2024-04-27 18:17:02,210][54818] Updated weights for policy 0, policy_version 444688 (0.0029) [2024-04-27 18:17:04,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.7, 300 sec: 55483.4). Total num frames: 7285850112. Throughput: 0: 55640.1. Samples: 191046680. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:17:04,263][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 18:17:05,531][54818] Updated weights for policy 0, policy_version 444698 (0.0028) [2024-04-27 18:17:08,104][54818] Updated weights for policy 0, policy_version 444708 (0.0025) [2024-04-27 18:17:09,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7286112256. Throughput: 0: 55615.6. Samples: 191381420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:17:09,262][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 18:17:11,290][54818] Updated weights for policy 0, policy_version 444718 (0.0032) [2024-04-27 18:17:14,032][54818] Updated weights for policy 0, policy_version 444728 (0.0028) [2024-04-27 18:17:14,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7286423552. Throughput: 0: 55619.3. Samples: 191546500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:17:14,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 18:17:17,134][54818] Updated weights for policy 0, policy_version 444738 (0.0025) [2024-04-27 18:17:19,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7286669312. Throughput: 0: 55666.2. Samples: 191880560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:17:19,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 18:17:19,948][54818] Updated weights for policy 0, policy_version 444748 (0.0031) [2024-04-27 18:17:23,167][54818] Updated weights for policy 0, policy_version 444758 (0.0033) [2024-04-27 18:17:24,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7286964224. Throughput: 0: 55591.4. Samples: 192211360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-27 18:17:24,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 18:17:25,211][54798] Signal inference workers to stop experience collection... 
(2700 times) [2024-04-27 18:17:25,244][54818] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-04-27 18:17:25,272][54798] Signal inference workers to resume experience collection... (2700 times) [2024-04-27 18:17:25,273][54818] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-04-27 18:17:26,511][54818] Updated weights for policy 0, policy_version 444768 (0.0029) [2024-04-27 18:17:29,216][54818] Updated weights for policy 0, policy_version 444778 (0.0027) [2024-04-27 18:17:29,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7287242752. Throughput: 0: 55617.7. Samples: 192380600. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:17:29,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 18:17:32,266][54818] Updated weights for policy 0, policy_version 444788 (0.0027) [2024-04-27 18:17:34,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7287537664. Throughput: 0: 55561.2. Samples: 192712540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:17:34,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 18:17:34,970][54818] Updated weights for policy 0, policy_version 444798 (0.0028) [2024-04-27 18:17:37,996][54818] Updated weights for policy 0, policy_version 444808 (0.0027) [2024-04-27 18:17:39,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 7287783424. Throughput: 0: 55722.7. Samples: 193047420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:17:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:17:40,749][54818] Updated weights for policy 0, policy_version 444818 (0.0031) [2024-04-27 18:17:43,943][54818] Updated weights for policy 0, policy_version 444828 (0.0026) [2024-04-27 18:17:44,253][54587] Fps is (10 sec: 52428.1, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7288061952. Throughput: 0: 55532.2. Samples: 193207900. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:17:44,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 18:17:46,880][54818] Updated weights for policy 0, policy_version 444838 (0.0028) [2024-04-27 18:17:49,253][54587] Fps is (10 sec: 58981.4, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 7288373248. Throughput: 0: 55521.6. Samples: 193545160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:17:49,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 18:17:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000444847_7288373248.pth... [2024-04-27 18:17:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000444033_7275036672.pth [2024-04-27 18:17:49,778][54818] Updated weights for policy 0, policy_version 444848 (0.0028) [2024-04-27 18:17:52,797][54818] Updated weights for policy 0, policy_version 444858 (0.0033) [2024-04-27 18:17:54,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7288635392. Throughput: 0: 55521.3. Samples: 193879880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:17:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 18:17:55,508][54818] Updated weights for policy 0, policy_version 444868 (0.0027) [2024-04-27 18:17:58,701][54818] Updated weights for policy 0, policy_version 444878 (0.0025) [2024-04-27 18:17:59,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7288913920. Throughput: 0: 55667.9. Samples: 194051560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:17:59,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 18:18:01,402][54818] Updated weights for policy 0, policy_version 444888 (0.0026) [2024-04-27 18:18:04,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 7289176064. Throughput: 0: 55589.2. Samples: 194382080. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:18:04,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 18:18:04,504][54818] Updated weights for policy 0, policy_version 444898 (0.0030) [2024-04-27 18:18:07,568][54818] Updated weights for policy 0, policy_version 444908 (0.0029) [2024-04-27 18:18:09,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7289470976. Throughput: 0: 55658.9. Samples: 194716000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:18:09,254][54587] Avg episode reward: [(0, '0.511')] [2024-04-27 18:18:10,337][54818] Updated weights for policy 0, policy_version 444918 (0.0030) [2024-04-27 18:18:13,607][54818] Updated weights for policy 0, policy_version 444928 (0.0026) [2024-04-27 18:18:14,253][54587] Fps is (10 sec: 54067.4, 60 sec: 54886.3, 300 sec: 55483.5). Total num frames: 7289716736. Throughput: 0: 55560.0. Samples: 194880800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:18:14,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 18:18:16,296][54818] Updated weights for policy 0, policy_version 444938 (0.0033) [2024-04-27 18:18:19,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7289995264. Throughput: 0: 55520.1. Samples: 195210940. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:18:19,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:18:19,460][54818] Updated weights for policy 0, policy_version 444948 (0.0026) [2024-04-27 18:18:22,172][54818] Updated weights for policy 0, policy_version 444958 (0.0030) [2024-04-27 18:18:24,253][54587] Fps is (10 sec: 58982.9, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7290306560. Throughput: 0: 55366.6. Samples: 195538920. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:18:24,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 18:18:25,292][54818] Updated weights for policy 0, policy_version 444968 (0.0027) [2024-04-27 18:18:27,483][54798] Signal inference workers to stop experience collection... (2750 times) [2024-04-27 18:18:27,484][54798] Signal inference workers to resume experience collection... (2750 times) [2024-04-27 18:18:27,505][54818] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-04-27 18:18:27,505][54818] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-04-27 18:18:28,152][54818] Updated weights for policy 0, policy_version 444978 (0.0030) [2024-04-27 18:18:29,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7290585088. Throughput: 0: 55683.6. Samples: 195713660. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:18:29,254][54587] Avg episode reward: [(0, '0.685')] [2024-04-27 18:18:31,237][54818] Updated weights for policy 0, policy_version 444988 (0.0029) [2024-04-27 18:18:34,099][54818] Updated weights for policy 0, policy_version 444998 (0.0030) [2024-04-27 18:18:34,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 7290847232. Throughput: 0: 55559.1. Samples: 196045320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:18:34,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 18:18:37,056][54818] Updated weights for policy 0, policy_version 445008 (0.0027) [2024-04-27 18:18:39,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 7291125760. Throughput: 0: 55444.0. Samples: 196374860. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:18:39,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 18:18:40,183][54818] Updated weights for policy 0, policy_version 445018 (0.0037) [2024-04-27 18:18:42,827][54818] Updated weights for policy 0, policy_version 445028 (0.0029) [2024-04-27 18:18:44,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55432.7, 300 sec: 55427.9). Total num frames: 7291387904. Throughput: 0: 55189.9. Samples: 196535100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 18:18:44,254][54587] Avg episode reward: [(0, '0.494')] [2024-04-27 18:18:46,052][54818] Updated weights for policy 0, policy_version 445038 (0.0031) [2024-04-27 18:18:48,819][54818] Updated weights for policy 0, policy_version 445048 (0.0033) [2024-04-27 18:18:49,253][54587] Fps is (10 sec: 54067.1, 60 sec: 54886.5, 300 sec: 55483.5). Total num frames: 7291666432. Throughput: 0: 55187.2. Samples: 196865500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:18:49,262][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 18:18:51,944][54818] Updated weights for policy 0, policy_version 445058 (0.0033) [2024-04-27 18:18:54,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7291944960. Throughput: 0: 55123.1. Samples: 197196540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:18:54,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 18:18:55,035][54818] Updated weights for policy 0, policy_version 445068 (0.0038) [2024-04-27 18:18:57,744][54818] Updated weights for policy 0, policy_version 445078 (0.0025) [2024-04-27 18:18:59,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7292239872. Throughput: 0: 55194.4. Samples: 197364540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:18:59,253][54587] Avg episode reward: [(0, '0.495')] [2024-04-27 18:19:00,898][54818] Updated weights for policy 0, policy_version 445088 (0.0027) [2024-04-27 18:19:03,784][54818] Updated weights for policy 0, policy_version 445098 (0.0032) [2024-04-27 18:19:04,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7292502016. Throughput: 0: 55279.0. Samples: 197698500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:04,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 18:19:06,704][54818] Updated weights for policy 0, policy_version 445108 (0.0025) [2024-04-27 18:19:09,253][54587] Fps is (10 sec: 52428.9, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 7292764160. Throughput: 0: 55389.4. Samples: 198031440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:09,253][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 18:19:09,645][54818] Updated weights for policy 0, policy_version 445118 (0.0026) [2024-04-27 18:19:12,462][54818] Updated weights for policy 0, policy_version 445128 (0.0029) [2024-04-27 18:19:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 7293059072. Throughput: 0: 55099.6. Samples: 198193140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:14,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:19:15,546][54818] Updated weights for policy 0, policy_version 445138 (0.0027) [2024-04-27 18:19:18,480][54818] Updated weights for policy 0, policy_version 445148 (0.0026) [2024-04-27 18:19:19,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7293337600. Throughput: 0: 55209.9. Samples: 198529760. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:19,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 18:19:21,496][54818] Updated weights for policy 0, policy_version 445158 (0.0026) [2024-04-27 18:19:24,253][54587] Fps is (10 sec: 54067.2, 60 sec: 54886.4, 300 sec: 55483.5). Total num frames: 7293599744. Throughput: 0: 55226.2. Samples: 198860040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:19:24,581][54818] Updated weights for policy 0, policy_version 445168 (0.0025) [2024-04-27 18:19:27,398][54818] Updated weights for policy 0, policy_version 445178 (0.0026) [2024-04-27 18:19:29,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 7293894656. Throughput: 0: 55588.1. Samples: 199036560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:29,253][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 18:19:30,520][54818] Updated weights for policy 0, policy_version 445188 (0.0029) [2024-04-27 18:19:33,151][54818] Updated weights for policy 0, policy_version 445198 (0.0027) [2024-04-27 18:19:34,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7294173184. Throughput: 0: 55506.7. Samples: 199363300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:34,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 18:19:36,455][54818] Updated weights for policy 0, policy_version 445208 (0.0026) [2024-04-27 18:19:38,309][54798] Signal inference workers to stop experience collection... (2800 times) [2024-04-27 18:19:38,310][54798] Signal inference workers to resume experience collection... 
(2800 times) [2024-04-27 18:19:38,323][54818] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-04-27 18:19:38,323][54818] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-04-27 18:19:38,974][54818] Updated weights for policy 0, policy_version 445218 (0.0034) [2024-04-27 18:19:39,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7294451712. Throughput: 0: 55499.9. Samples: 199694040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:39,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 18:19:42,451][54818] Updated weights for policy 0, policy_version 445228 (0.0027) [2024-04-27 18:19:44,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7294713856. Throughput: 0: 55445.7. Samples: 199859600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:44,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 18:19:44,941][54818] Updated weights for policy 0, policy_version 445238 (0.0026) [2024-04-27 18:19:48,234][54818] Updated weights for policy 0, policy_version 445248 (0.0026) [2024-04-27 18:19:49,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 7294976000. Throughput: 0: 55483.1. Samples: 200195240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:49,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 18:19:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000445250_7294976000.pth... [2024-04-27 18:19:49,328][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000444438_7281672192.pth [2024-04-27 18:19:50,808][54818] Updated weights for policy 0, policy_version 445258 (0.0026) [2024-04-27 18:19:54,136][54818] Updated weights for policy 0, policy_version 445268 (0.0031) [2024-04-27 18:19:54,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55483.5). 
Total num frames: 7295270912. Throughput: 0: 55496.0. Samples: 200528760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:54,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 18:19:56,655][54818] Updated weights for policy 0, policy_version 445278 (0.0031) [2024-04-27 18:19:59,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7295549440. Throughput: 0: 55396.1. Samples: 200685960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:19:59,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 18:20:00,028][54818] Updated weights for policy 0, policy_version 445288 (0.0031) [2024-04-27 18:20:03,019][54818] Updated weights for policy 0, policy_version 445298 (0.0028) [2024-04-27 18:20:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7295860736. Throughput: 0: 55389.8. Samples: 201022300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:20:04,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 18:20:05,933][54818] Updated weights for policy 0, policy_version 445308 (0.0026) [2024-04-27 18:20:08,709][54818] Updated weights for policy 0, policy_version 445318 (0.0027) [2024-04-27 18:20:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7296106496. Throughput: 0: 55501.0. Samples: 201357580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 18:20:09,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 18:20:11,962][54818] Updated weights for policy 0, policy_version 445328 (0.0032) [2024-04-27 18:20:14,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7296401408. Throughput: 0: 55464.3. Samples: 201532460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:14,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:20:14,434][54818] Updated weights for policy 0, policy_version 445338 (0.0031) [2024-04-27 18:20:17,932][54818] Updated weights for policy 0, policy_version 445348 (0.0023) [2024-04-27 18:20:19,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 7296647168. Throughput: 0: 55436.0. Samples: 201857920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:19,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 18:20:20,438][54818] Updated weights for policy 0, policy_version 445358 (0.0028) [2024-04-27 18:20:23,785][54818] Updated weights for policy 0, policy_version 445368 (0.0027) [2024-04-27 18:20:24,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7296925696. Throughput: 0: 55437.8. Samples: 202188740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:20:26,818][54818] Updated weights for policy 0, policy_version 445378 (0.0030) [2024-04-27 18:20:29,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 7297204224. Throughput: 0: 55487.5. Samples: 202356540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:29,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 18:20:29,737][54818] Updated weights for policy 0, policy_version 445388 (0.0024) [2024-04-27 18:20:32,603][54818] Updated weights for policy 0, policy_version 445398 (0.0028) [2024-04-27 18:20:33,054][54798] Signal inference workers to stop experience collection... (2850 times) [2024-04-27 18:20:33,066][54818] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-04-27 18:20:33,150][54798] Signal inference workers to resume experience collection... 
(2850 times) [2024-04-27 18:20:33,150][54818] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-04-27 18:20:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7297515520. Throughput: 0: 55383.6. Samples: 202687500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:34,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 18:20:35,739][54818] Updated weights for policy 0, policy_version 445408 (0.0028) [2024-04-27 18:20:38,507][54818] Updated weights for policy 0, policy_version 445418 (0.0025) [2024-04-27 18:20:39,253][54587] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 7297794048. Throughput: 0: 55301.2. Samples: 203017320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:39,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 18:20:41,552][54818] Updated weights for policy 0, policy_version 445428 (0.0025) [2024-04-27 18:20:44,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7298039808. Throughput: 0: 55529.6. Samples: 203184800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:44,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 18:20:44,387][54818] Updated weights for policy 0, policy_version 445438 (0.0024) [2024-04-27 18:20:47,336][54818] Updated weights for policy 0, policy_version 445448 (0.0027) [2024-04-27 18:20:49,254][54587] Fps is (10 sec: 52428.0, 60 sec: 55705.4, 300 sec: 55427.9). Total num frames: 7298318336. Throughput: 0: 55543.3. Samples: 203521760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:49,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 18:20:50,424][54818] Updated weights for policy 0, policy_version 445458 (0.0028) [2024-04-27 18:20:53,490][54818] Updated weights for policy 0, policy_version 445468 (0.0026) [2024-04-27 18:20:54,253][54587] Fps is (10 sec: 52429.1, 60 sec: 54886.3, 300 sec: 55316.8). 
Total num frames: 7298564096. Throughput: 0: 55460.7. Samples: 203853320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:54,262][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 18:20:56,185][54818] Updated weights for policy 0, policy_version 445478 (0.0027) [2024-04-27 18:20:59,253][54587] Fps is (10 sec: 54068.3, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 7298859008. Throughput: 0: 55038.3. Samples: 204009180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:20:59,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:20:59,355][54818] Updated weights for policy 0, policy_version 445488 (0.0027) [2024-04-27 18:21:02,056][54818] Updated weights for policy 0, policy_version 445498 (0.0028) [2024-04-27 18:21:04,253][54587] Fps is (10 sec: 57344.3, 60 sec: 54613.3, 300 sec: 55427.9). Total num frames: 7299137536. Throughput: 0: 55106.2. Samples: 204337700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:21:04,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 18:21:05,331][54818] Updated weights for policy 0, policy_version 445508 (0.0029) [2024-04-27 18:21:08,032][54818] Updated weights for policy 0, policy_version 445518 (0.0027) [2024-04-27 18:21:09,253][54587] Fps is (10 sec: 58982.8, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7299448832. Throughput: 0: 55063.1. Samples: 204666580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:21:09,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 18:21:11,465][54818] Updated weights for policy 0, policy_version 445528 (0.0027) [2024-04-27 18:21:13,797][54818] Updated weights for policy 0, policy_version 445538 (0.0026) [2024-04-27 18:21:14,253][54587] Fps is (10 sec: 60620.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7299743744. Throughput: 0: 55384.8. Samples: 204848860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:21:14,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 18:21:14,579][54798] Signal inference workers to stop experience collection... (2900 times) [2024-04-27 18:21:14,588][54798] Signal inference workers to resume experience collection... (2900 times) [2024-04-27 18:21:14,627][54818] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-04-27 18:21:14,627][54818] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-04-27 18:21:17,271][54818] Updated weights for policy 0, policy_version 445548 (0.0029) [2024-04-27 18:21:19,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7299989504. Throughput: 0: 55476.3. Samples: 205183940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:21:19,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 18:21:19,684][54818] Updated weights for policy 0, policy_version 445558 (0.0027) [2024-04-27 18:21:23,040][54818] Updated weights for policy 0, policy_version 445568 (0.0036) [2024-04-27 18:21:24,253][54587] Fps is (10 sec: 50791.3, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 7300251648. Throughput: 0: 55491.3. Samples: 205514420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:21:24,253][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 18:21:25,606][54818] Updated weights for policy 0, policy_version 445578 (0.0029) [2024-04-27 18:21:28,863][54818] Updated weights for policy 0, policy_version 445588 (0.0031) [2024-04-27 18:21:29,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7300530176. Throughput: 0: 55116.6. Samples: 205665040. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:21:29,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 18:21:31,562][54818] Updated weights for policy 0, policy_version 445598 (0.0029) [2024-04-27 18:21:34,253][54587] Fps is (10 sec: 55704.5, 60 sec: 54886.3, 300 sec: 55372.4). Total num frames: 7300808704. Throughput: 0: 55052.6. Samples: 205999120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:21:34,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:21:34,913][54818] Updated weights for policy 0, policy_version 445608 (0.0030) [2024-04-27 18:21:37,332][54818] Updated weights for policy 0, policy_version 445618 (0.0032) [2024-04-27 18:21:39,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7301103616. Throughput: 0: 55207.4. Samples: 206337660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:21:39,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:21:40,677][54818] Updated weights for policy 0, policy_version 445628 (0.0036) [2024-04-27 18:21:43,164][54818] Updated weights for policy 0, policy_version 445638 (0.0035) [2024-04-27 18:21:44,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7301382144. Throughput: 0: 55570.6. Samples: 206509860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:21:44,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 18:21:46,440][54818] Updated weights for policy 0, policy_version 445648 (0.0027) [2024-04-27 18:21:49,237][54818] Updated weights for policy 0, policy_version 445658 (0.0030) [2024-04-27 18:21:49,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55705.8, 300 sec: 55483.4). Total num frames: 7301660672. Throughput: 0: 55654.7. Samples: 206842160. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:21:49,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:21:49,362][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000445659_7301677056.pth... [2024-04-27 18:21:49,407][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000444847_7288373248.pth [2024-04-27 18:21:52,476][54818] Updated weights for policy 0, policy_version 445668 (0.0027) [2024-04-27 18:21:54,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55705.7, 300 sec: 55316.8). Total num frames: 7301906432. Throughput: 0: 55677.3. Samples: 207172060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:21:54,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 18:21:55,112][54818] Updated weights for policy 0, policy_version 445678 (0.0029) [2024-04-27 18:21:58,328][54818] Updated weights for policy 0, policy_version 445688 (0.0025) [2024-04-27 18:21:59,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 7302184960. Throughput: 0: 55251.8. Samples: 207335180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:21:59,262][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 18:22:00,905][54818] Updated weights for policy 0, policy_version 445698 (0.0026) [2024-04-27 18:22:04,224][54818] Updated weights for policy 0, policy_version 445708 (0.0031) [2024-04-27 18:22:04,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7302479872. Throughput: 0: 55378.3. Samples: 207675960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:04,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 18:22:06,729][54818] Updated weights for policy 0, policy_version 445718 (0.0027) [2024-04-27 18:22:09,156][54798] Signal inference workers to stop experience collection... 
(2950 times) [2024-04-27 18:22:09,201][54818] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-04-27 18:22:09,214][54798] Signal inference workers to resume experience collection... (2950 times) [2024-04-27 18:22:09,219][54818] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-04-27 18:22:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 54886.4, 300 sec: 55316.8). Total num frames: 7302742016. Throughput: 0: 55396.4. Samples: 208007260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:09,253][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 18:22:09,974][54818] Updated weights for policy 0, policy_version 445728 (0.0033) [2024-04-27 18:22:12,705][54818] Updated weights for policy 0, policy_version 445738 (0.0027) [2024-04-27 18:22:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 54886.5, 300 sec: 55483.5). Total num frames: 7303036928. Throughput: 0: 55699.6. Samples: 208171520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:14,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 18:22:16,209][54818] Updated weights for policy 0, policy_version 445748 (0.0028) [2024-04-27 18:22:18,636][54818] Updated weights for policy 0, policy_version 445758 (0.0027) [2024-04-27 18:22:19,253][54587] Fps is (10 sec: 58981.4, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7303331840. Throughput: 0: 55648.0. Samples: 208503280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:19,263][54587] Avg episode reward: [(0, '0.513')] [2024-04-27 18:22:22,116][54818] Updated weights for policy 0, policy_version 445768 (0.0026) [2024-04-27 18:22:24,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 7303593984. Throughput: 0: 55549.9. Samples: 208837400. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:24,262][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 18:22:24,438][54818] Updated weights for policy 0, policy_version 445778 (0.0029) [2024-04-27 18:22:27,872][54818] Updated weights for policy 0, policy_version 445788 (0.0028) [2024-04-27 18:22:29,253][54587] Fps is (10 sec: 52429.6, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 7303856128. Throughput: 0: 55417.0. Samples: 209003620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:29,253][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 18:22:30,401][54818] Updated weights for policy 0, policy_version 445798 (0.0033) [2024-04-27 18:22:33,817][54818] Updated weights for policy 0, policy_version 445808 (0.0025) [2024-04-27 18:22:34,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 7304118272. Throughput: 0: 55396.5. Samples: 209335000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:34,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:22:36,379][54818] Updated weights for policy 0, policy_version 445818 (0.0027) [2024-04-27 18:22:39,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 7304413184. Throughput: 0: 55436.5. Samples: 209666700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:39,254][54587] Avg episode reward: [(0, '0.512')] [2024-04-27 18:22:39,954][54818] Updated weights for policy 0, policy_version 445828 (0.0030) [2024-04-27 18:22:42,148][54818] Updated weights for policy 0, policy_version 445838 (0.0028) [2024-04-27 18:22:44,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 7304691712. Throughput: 0: 55484.2. Samples: 209831980. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:44,254][54587] Avg episode reward: [(0, '0.507')] [2024-04-27 18:22:45,937][54818] Updated weights for policy 0, policy_version 445848 (0.0027) [2024-04-27 18:22:48,030][54818] Updated weights for policy 0, policy_version 445858 (0.0026) [2024-04-27 18:22:49,253][54587] Fps is (10 sec: 57342.8, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 7304986624. Throughput: 0: 55324.7. Samples: 210165580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:49,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 18:22:51,747][54818] Updated weights for policy 0, policy_version 445868 (0.0026) [2024-04-27 18:22:53,994][54818] Updated weights for policy 0, policy_version 445878 (0.0030) [2024-04-27 18:22:54,253][54587] Fps is (10 sec: 58982.9, 60 sec: 56251.7, 300 sec: 55483.5). Total num frames: 7305281536. Throughput: 0: 55395.9. Samples: 210500080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-27 18:22:54,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-27 18:22:57,704][54818] Updated weights for policy 0, policy_version 445888 (0.0029) [2024-04-27 18:22:59,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.4, 300 sec: 55427.9). Total num frames: 7305527296. Throughput: 0: 55462.0. Samples: 210667320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:22:59,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 18:22:59,845][54818] Updated weights for policy 0, policy_version 445898 (0.0031) [2024-04-27 18:23:03,503][54818] Updated weights for policy 0, policy_version 445908 (0.0033) [2024-04-27 18:23:04,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7305805824. Throughput: 0: 55535.3. Samples: 211002360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:04,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 18:23:05,706][54818] Updated weights for policy 0, policy_version 445918 (0.0026) [2024-04-27 18:23:05,734][54798] Signal inference workers to stop experience collection... (3000 times) [2024-04-27 18:23:05,734][54798] Signal inference workers to resume experience collection... (3000 times) [2024-04-27 18:23:05,758][54818] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-04-27 18:23:05,758][54818] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-04-27 18:23:09,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 7306051584. Throughput: 0: 55421.4. Samples: 211331360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:09,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 18:23:09,454][54818] Updated weights for policy 0, policy_version 445928 (0.0028) [2024-04-27 18:23:11,567][54818] Updated weights for policy 0, policy_version 445938 (0.0029) [2024-04-27 18:23:14,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7306346496. Throughput: 0: 55241.3. Samples: 211489480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:23:15,359][54818] Updated weights for policy 0, policy_version 445948 (0.0034) [2024-04-27 18:23:17,448][54818] Updated weights for policy 0, policy_version 445958 (0.0030) [2024-04-27 18:23:19,253][54587] Fps is (10 sec: 57343.5, 60 sec: 54886.4, 300 sec: 55316.8). Total num frames: 7306625024. Throughput: 0: 55268.8. Samples: 211822100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:19,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 18:23:21,332][54818] Updated weights for policy 0, policy_version 445968 (0.0033) [2024-04-27 18:23:23,547][54818] Updated weights for policy 0, policy_version 445978 (0.0032) [2024-04-27 18:23:24,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7306919936. Throughput: 0: 55178.6. Samples: 212149740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:24,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 18:23:27,252][54818] Updated weights for policy 0, policy_version 445988 (0.0027) [2024-04-27 18:23:29,253][54587] Fps is (10 sec: 57345.4, 60 sec: 55705.7, 300 sec: 55428.0). Total num frames: 7307198464. Throughput: 0: 55522.5. Samples: 212330480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:29,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:23:29,409][54818] Updated weights for policy 0, policy_version 445998 (0.0029) [2024-04-27 18:23:33,050][54818] Updated weights for policy 0, policy_version 446008 (0.0031) [2024-04-27 18:23:34,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 7307460608. Throughput: 0: 55594.4. Samples: 212667320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:34,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 18:23:35,568][54818] Updated weights for policy 0, policy_version 446018 (0.0027) [2024-04-27 18:23:38,749][54818] Updated weights for policy 0, policy_version 446028 (0.0028) [2024-04-27 18:23:39,253][54587] Fps is (10 sec: 54066.3, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7307739136. Throughput: 0: 55425.8. Samples: 212994240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:23:41,677][54818] Updated weights for policy 0, policy_version 446038 (0.0031) [2024-04-27 18:23:44,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 7308001280. Throughput: 0: 55328.1. Samples: 213157080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:44,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:23:44,906][54818] Updated weights for policy 0, policy_version 446048 (0.0028) [2024-04-27 18:23:47,602][54818] Updated weights for policy 0, policy_version 446058 (0.0025) [2024-04-27 18:23:49,253][54587] Fps is (10 sec: 54067.5, 60 sec: 54886.6, 300 sec: 55372.4). Total num frames: 7308279808. Throughput: 0: 55105.8. Samples: 213482120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:49,253][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 18:23:49,293][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000446063_7308296192.pth... [2024-04-27 18:23:49,353][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000445250_7294976000.pth [2024-04-27 18:23:50,863][54818] Updated weights for policy 0, policy_version 446068 (0.0032) [2024-04-27 18:23:53,533][54818] Updated weights for policy 0, policy_version 446078 (0.0026) [2024-04-27 18:23:54,253][54587] Fps is (10 sec: 57344.2, 60 sec: 54886.5, 300 sec: 55372.4). Total num frames: 7308574720. Throughput: 0: 55186.3. Samples: 213814740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:54,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 18:23:56,834][54818] Updated weights for policy 0, policy_version 446088 (0.0034) [2024-04-27 18:23:59,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7308853248. Throughput: 0: 55549.7. Samples: 213989220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:23:59,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 18:23:59,594][54818] Updated weights for policy 0, policy_version 446098 (0.0026) [2024-04-27 18:24:02,601][54818] Updated weights for policy 0, policy_version 446108 (0.0026) [2024-04-27 18:24:04,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7309115392. Throughput: 0: 55415.7. Samples: 214315800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:24:04,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 18:24:05,635][54818] Updated weights for policy 0, policy_version 446118 (0.0027) [2024-04-27 18:24:05,662][54798] Signal inference workers to stop experience collection... (3050 times) [2024-04-27 18:24:05,663][54798] Signal inference workers to resume experience collection... (3050 times) [2024-04-27 18:24:05,673][54818] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-04-27 18:24:05,673][54818] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-04-27 18:24:08,549][54818] Updated weights for policy 0, policy_version 446128 (0.0037) [2024-04-27 18:24:09,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 7309393920. Throughput: 0: 55514.8. Samples: 214647900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:24:09,254][54587] Avg episode reward: [(0, '0.679')] [2024-04-27 18:24:11,372][54818] Updated weights for policy 0, policy_version 446138 (0.0027) [2024-04-27 18:24:14,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 7309656064. Throughput: 0: 55144.7. Samples: 214812000. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:24:14,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 18:24:14,501][54818] Updated weights for policy 0, policy_version 446148 (0.0029) [2024-04-27 18:24:17,351][54818] Updated weights for policy 0, policy_version 446158 (0.0029) [2024-04-27 18:24:19,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 7309934592. Throughput: 0: 55105.8. Samples: 215147080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 18:24:19,254][54587] Avg episode reward: [(0, '0.687')] [2024-04-27 18:24:20,249][54818] Updated weights for policy 0, policy_version 446168 (0.0033) [2024-04-27 18:24:23,281][54818] Updated weights for policy 0, policy_version 446178 (0.0032) [2024-04-27 18:24:24,254][54587] Fps is (10 sec: 57342.2, 60 sec: 55159.2, 300 sec: 55372.3). Total num frames: 7310229504. Throughput: 0: 55180.6. Samples: 215477380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:24:24,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 18:24:26,328][54818] Updated weights for policy 0, policy_version 446188 (0.0030) [2024-04-27 18:24:29,167][54818] Updated weights for policy 0, policy_version 446198 (0.0033) [2024-04-27 18:24:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55159.3, 300 sec: 55372.4). Total num frames: 7310508032. Throughput: 0: 55257.3. Samples: 215643660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:24:29,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 18:24:32,189][54818] Updated weights for policy 0, policy_version 446208 (0.0027) [2024-04-27 18:24:34,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55432.4, 300 sec: 55372.3). Total num frames: 7310786560. Throughput: 0: 55418.4. Samples: 215975960. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:24:34,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:24:35,208][54818] Updated weights for policy 0, policy_version 446218 (0.0028) [2024-04-27 18:24:38,088][54818] Updated weights for policy 0, policy_version 446228 (0.0029) [2024-04-27 18:24:39,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7311065088. Throughput: 0: 55415.2. Samples: 216308420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:24:39,253][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 18:24:41,039][54818] Updated weights for policy 0, policy_version 446238 (0.0027) [2024-04-27 18:24:44,069][54818] Updated weights for policy 0, policy_version 446248 (0.0030) [2024-04-27 18:24:44,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 7311343616. Throughput: 0: 55190.6. Samples: 216472800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:24:44,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 18:24:46,904][54818] Updated weights for policy 0, policy_version 446258 (0.0025) [2024-04-27 18:24:49,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7311605760. Throughput: 0: 55395.2. Samples: 216808580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:24:49,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 18:24:49,831][54818] Updated weights for policy 0, policy_version 446268 (0.0030) [2024-04-27 18:24:52,844][54818] Updated weights for policy 0, policy_version 446278 (0.0031) [2024-04-27 18:24:54,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 7311900672. Throughput: 0: 55442.5. Samples: 217142820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:24:54,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 18:24:55,607][54818] Updated weights for policy 0, policy_version 446288 (0.0028) [2024-04-27 18:24:58,760][54818] Updated weights for policy 0, policy_version 446298 (0.0027) [2024-04-27 18:24:59,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55316.8). Total num frames: 7312179200. Throughput: 0: 55543.2. Samples: 217311440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:24:59,253][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:25:01,441][54818] Updated weights for policy 0, policy_version 446308 (0.0030) [2024-04-27 18:25:04,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7312441344. Throughput: 0: 55448.9. Samples: 217642280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:25:04,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 18:25:04,586][54818] Updated weights for policy 0, policy_version 446318 (0.0028) [2024-04-27 18:25:07,376][54818] Updated weights for policy 0, policy_version 446328 (0.0025) [2024-04-27 18:25:09,253][54587] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55372.4). Total num frames: 7312736256. Throughput: 0: 55565.6. Samples: 217977820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:25:09,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 18:25:10,498][54818] Updated weights for policy 0, policy_version 446338 (0.0029) [2024-04-27 18:25:13,283][54818] Updated weights for policy 0, policy_version 446348 (0.0031) [2024-04-27 18:25:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 55483.4). Total num frames: 7313014784. Throughput: 0: 55743.1. Samples: 218152100. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:25:14,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 18:25:16,381][54818] Updated weights for policy 0, policy_version 446358 (0.0028) [2024-04-27 18:25:19,130][54818] Updated weights for policy 0, policy_version 446368 (0.0033) [2024-04-27 18:25:19,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55483.4). Total num frames: 7313293312. Throughput: 0: 55673.1. Samples: 218481240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:25:19,253][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 18:25:22,383][54818] Updated weights for policy 0, policy_version 446378 (0.0026) [2024-04-27 18:25:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.8, 300 sec: 55427.9). Total num frames: 7313555456. Throughput: 0: 55576.4. Samples: 218809360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:25:24,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 18:25:25,031][54818] Updated weights for policy 0, policy_version 446388 (0.0036) [2024-04-27 18:25:28,155][54818] Updated weights for policy 0, policy_version 446398 (0.0029) [2024-04-27 18:25:29,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55372.3). Total num frames: 7313850368. Throughput: 0: 55670.2. Samples: 218977960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:25:29,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 18:25:31,033][54818] Updated weights for policy 0, policy_version 446408 (0.0031) [2024-04-27 18:25:34,105][54818] Updated weights for policy 0, policy_version 446418 (0.0026) [2024-04-27 18:25:34,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55372.4). Total num frames: 7314128896. Throughput: 0: 55613.2. Samples: 219311180. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:25:34,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 18:25:36,635][54798] Signal inference workers to stop experience collection... (3100 times) [2024-04-27 18:25:36,674][54818] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-04-27 18:25:36,728][54798] Signal inference workers to resume experience collection... (3100 times) [2024-04-27 18:25:36,728][54818] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-04-27 18:25:36,835][54818] Updated weights for policy 0, policy_version 446428 (0.0031) [2024-04-27 18:25:39,253][54587] Fps is (10 sec: 52429.9, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 7314374656. Throughput: 0: 55564.7. Samples: 219643220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:25:39,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 18:25:40,142][54818] Updated weights for policy 0, policy_version 446438 (0.0021) [2024-04-27 18:25:42,656][54818] Updated weights for policy 0, policy_version 446448 (0.0027) [2024-04-27 18:25:44,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7314669568. Throughput: 0: 55493.7. Samples: 219808660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 18:25:44,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 18:25:46,060][54818] Updated weights for policy 0, policy_version 446458 (0.0033) [2024-04-27 18:25:48,795][54818] Updated weights for policy 0, policy_version 446468 (0.0033) [2024-04-27 18:25:49,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7314948096. Throughput: 0: 55508.0. Samples: 220140140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:25:49,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 18:25:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000446469_7314948096.pth... 
[2024-04-27 18:25:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000445659_7301677056.pth [2024-04-27 18:25:51,833][54818] Updated weights for policy 0, policy_version 446478 (0.0026) [2024-04-27 18:25:54,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 7315226624. Throughput: 0: 55430.9. Samples: 220472200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:25:54,253][54587] Avg episode reward: [(0, '0.697')] [2024-04-27 18:25:54,785][54818] Updated weights for policy 0, policy_version 446488 (0.0028) [2024-04-27 18:25:57,699][54818] Updated weights for policy 0, policy_version 446498 (0.0026) [2024-04-27 18:25:59,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7315505152. Throughput: 0: 55285.3. Samples: 220639940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:25:59,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:26:00,905][54818] Updated weights for policy 0, policy_version 446508 (0.0028) [2024-04-27 18:26:03,618][54818] Updated weights for policy 0, policy_version 446518 (0.0030) [2024-04-27 18:26:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55427.9). Total num frames: 7315800064. Throughput: 0: 55350.2. Samples: 220972000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:04,254][54587] Avg episode reward: [(0, '0.495')] [2024-04-27 18:26:06,837][54818] Updated weights for policy 0, policy_version 446528 (0.0031) [2024-04-27 18:26:09,253][54587] Fps is (10 sec: 52429.3, 60 sec: 54886.5, 300 sec: 55205.8). Total num frames: 7316029440. Throughput: 0: 55468.1. Samples: 221305420. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:09,253][54587] Avg episode reward: [(0, '0.679')] [2024-04-27 18:26:09,637][54818] Updated weights for policy 0, policy_version 446538 (0.0030) [2024-04-27 18:26:12,659][54818] Updated weights for policy 0, policy_version 446548 (0.0029) [2024-04-27 18:26:14,253][54587] Fps is (10 sec: 52428.4, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 7316324352. Throughput: 0: 55386.7. Samples: 221470360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:14,254][54587] Avg episode reward: [(0, '0.508')] [2024-04-27 18:26:15,442][54818] Updated weights for policy 0, policy_version 446558 (0.0025) [2024-04-27 18:26:18,418][54818] Updated weights for policy 0, policy_version 446568 (0.0036) [2024-04-27 18:26:19,253][54587] Fps is (10 sec: 58981.5, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7316619264. Throughput: 0: 55392.8. Samples: 221803860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:19,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:26:21,211][54818] Updated weights for policy 0, policy_version 446578 (0.0025) [2024-04-27 18:26:24,251][54818] Updated weights for policy 0, policy_version 446588 (0.0029) [2024-04-27 18:26:24,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 7316897792. Throughput: 0: 55386.4. Samples: 222135620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:24,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 18:26:27,126][54818] Updated weights for policy 0, policy_version 446598 (0.0029) [2024-04-27 18:26:29,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7317176320. Throughput: 0: 55617.3. Samples: 222311440. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:29,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 18:26:30,267][54818] Updated weights for policy 0, policy_version 446608 (0.0028) [2024-04-27 18:26:32,988][54818] Updated weights for policy 0, policy_version 446618 (0.0027) [2024-04-27 18:26:34,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7317471232. Throughput: 0: 55692.1. Samples: 222646280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:34,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 18:26:35,998][54818] Updated weights for policy 0, policy_version 446628 (0.0030) [2024-04-27 18:26:38,658][54798] Signal inference workers to stop experience collection... (3150 times) [2024-04-27 18:26:38,659][54798] Signal inference workers to resume experience collection... (3150 times) [2024-04-27 18:26:38,683][54818] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-04-27 18:26:38,683][54818] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-04-27 18:26:38,771][54818] Updated weights for policy 0, policy_version 446638 (0.0028) [2024-04-27 18:26:39,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55978.4, 300 sec: 55427.9). Total num frames: 7317733376. Throughput: 0: 55662.4. Samples: 222977020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:39,254][54587] Avg episode reward: [(0, '0.680')] [2024-04-27 18:26:41,746][54818] Updated weights for policy 0, policy_version 446648 (0.0027) [2024-04-27 18:26:44,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 7317995520. Throughput: 0: 55579.3. Samples: 223141000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:44,253][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 18:26:44,711][54818] Updated weights for policy 0, policy_version 446658 (0.0031) [2024-04-27 18:26:47,686][54818] Updated weights for policy 0, policy_version 446668 (0.0028) [2024-04-27 18:26:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7318290432. Throughput: 0: 55663.9. Samples: 223476880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:49,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 18:26:50,727][54818] Updated weights for policy 0, policy_version 446678 (0.0026) [2024-04-27 18:26:53,716][54818] Updated weights for policy 0, policy_version 446688 (0.0029) [2024-04-27 18:26:54,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7318552576. Throughput: 0: 55708.6. Samples: 223812300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:54,253][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 18:26:56,644][54818] Updated weights for policy 0, policy_version 446698 (0.0027) [2024-04-27 18:26:59,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7318847488. Throughput: 0: 55779.3. Samples: 223980420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:26:59,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 18:26:59,392][54818] Updated weights for policy 0, policy_version 446708 (0.0029) [2024-04-27 18:27:02,513][54818] Updated weights for policy 0, policy_version 446718 (0.0026) [2024-04-27 18:27:04,253][54587] Fps is (10 sec: 57342.2, 60 sec: 55432.4, 300 sec: 55538.9). Total num frames: 7319126016. Throughput: 0: 55754.6. Samples: 224312820. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 18:27:04,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 18:27:05,233][54818] Updated weights for policy 0, policy_version 446728 (0.0032) [2024-04-27 18:27:08,425][54818] Updated weights for policy 0, policy_version 446738 (0.0027) [2024-04-27 18:27:09,253][54587] Fps is (10 sec: 57343.7, 60 sec: 56524.8, 300 sec: 55539.0). Total num frames: 7319420928. Throughput: 0: 55748.1. Samples: 224644280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:09,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 18:27:11,218][54818] Updated weights for policy 0, policy_version 446748 (0.0024) [2024-04-27 18:27:14,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55978.8, 300 sec: 55427.9). Total num frames: 7319683072. Throughput: 0: 55582.3. Samples: 224812640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:14,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 18:27:14,256][54818] Updated weights for policy 0, policy_version 446758 (0.0031) [2024-04-27 18:27:17,051][54818] Updated weights for policy 0, policy_version 446768 (0.0029) [2024-04-27 18:27:19,253][54587] Fps is (10 sec: 50790.2, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 7319928832. Throughput: 0: 55643.0. Samples: 225150220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:19,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 18:27:19,992][54818] Updated weights for policy 0, policy_version 446778 (0.0027) [2024-04-27 18:27:22,839][54818] Updated weights for policy 0, policy_version 446788 (0.0033) [2024-04-27 18:27:24,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7320223744. Throughput: 0: 55743.3. Samples: 225485460. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:24,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 18:27:26,074][54818] Updated weights for policy 0, policy_version 446798 (0.0030) [2024-04-27 18:27:28,814][54818] Updated weights for policy 0, policy_version 446808 (0.0030) [2024-04-27 18:27:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7320518656. Throughput: 0: 55607.8. Samples: 225643360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:29,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 18:27:31,850][54818] Updated weights for policy 0, policy_version 446818 (0.0026) [2024-04-27 18:27:34,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7320797184. Throughput: 0: 55610.8. Samples: 225979360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:34,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 18:27:34,731][54818] Updated weights for policy 0, policy_version 446828 (0.0032) [2024-04-27 18:27:37,790][54818] Updated weights for policy 0, policy_version 446838 (0.0026) [2024-04-27 18:27:39,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 7321075712. Throughput: 0: 55608.2. Samples: 226314680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:39,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 18:27:40,562][54818] Updated weights for policy 0, policy_version 446848 (0.0027) [2024-04-27 18:27:43,714][54818] Updated weights for policy 0, policy_version 446858 (0.0026) [2024-04-27 18:27:44,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55483.5). Total num frames: 7321354240. Throughput: 0: 55610.7. Samples: 226482900. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:44,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:27:46,456][54818] Updated weights for policy 0, policy_version 446868 (0.0025) [2024-04-27 18:27:49,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.7, 300 sec: 55372.4). Total num frames: 7321616384. Throughput: 0: 55685.1. Samples: 226818640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:49,253][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 18:27:49,350][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000446877_7321632768.pth... [2024-04-27 18:27:49,396][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000446063_7308296192.pth [2024-04-27 18:27:49,511][54818] Updated weights for policy 0, policy_version 446878 (0.0027) [2024-04-27 18:27:52,445][54818] Updated weights for policy 0, policy_version 446888 (0.0026) [2024-04-27 18:27:54,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55705.4, 300 sec: 55483.5). Total num frames: 7321894912. Throughput: 0: 55646.1. Samples: 227148360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:54,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 18:27:55,541][54818] Updated weights for policy 0, policy_version 446898 (0.0026) [2024-04-27 18:27:55,809][54798] Signal inference workers to stop experience collection... (3200 times) [2024-04-27 18:27:55,861][54818] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-04-27 18:27:55,862][54798] Signal inference workers to resume experience collection... (3200 times) [2024-04-27 18:27:55,875][54818] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-04-27 18:27:58,316][54818] Updated weights for policy 0, policy_version 446908 (0.0032) [2024-04-27 18:27:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 7322173440. Throughput: 0: 55591.6. Samples: 227314260. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:27:59,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 18:28:01,466][54818] Updated weights for policy 0, policy_version 446918 (0.0021) [2024-04-27 18:28:04,068][54818] Updated weights for policy 0, policy_version 446928 (0.0036) [2024-04-27 18:28:04,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7322468352. Throughput: 0: 55430.2. Samples: 227644580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:28:04,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 18:28:07,426][54818] Updated weights for policy 0, policy_version 446938 (0.0029) [2024-04-27 18:28:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7322730496. Throughput: 0: 55419.6. Samples: 227979340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:28:09,253][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 18:28:09,869][54818] Updated weights for policy 0, policy_version 446948 (0.0026) [2024-04-27 18:28:13,340][54818] Updated weights for policy 0, policy_version 446958 (0.0034) [2024-04-27 18:28:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7323025408. Throughput: 0: 55771.5. Samples: 228153080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:28:14,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 18:28:15,837][54818] Updated weights for policy 0, policy_version 446968 (0.0028) [2024-04-27 18:28:19,200][54818] Updated weights for policy 0, policy_version 446978 (0.0025) [2024-04-27 18:28:19,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55483.4). Total num frames: 7323287552. Throughput: 0: 55693.3. Samples: 228485560. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:28:19,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 18:28:21,882][54818] Updated weights for policy 0, policy_version 446988 (0.0039) [2024-04-27 18:28:24,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7323549696. Throughput: 0: 55573.6. Samples: 228815500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:28:24,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:28:25,088][54818] Updated weights for policy 0, policy_version 446998 (0.0029) [2024-04-27 18:28:27,989][54818] Updated weights for policy 0, policy_version 447008 (0.0030) [2024-04-27 18:28:29,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 7323828224. Throughput: 0: 55533.3. Samples: 228981900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-04-27 18:28:29,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 18:28:30,986][54818] Updated weights for policy 0, policy_version 447018 (0.0030) [2024-04-27 18:28:33,768][54818] Updated weights for policy 0, policy_version 447028 (0.0031) [2024-04-27 18:28:34,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7324123136. Throughput: 0: 55400.9. Samples: 229311680. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:28:34,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 18:28:36,803][54818] Updated weights for policy 0, policy_version 447038 (0.0027) [2024-04-27 18:28:39,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7324401664. Throughput: 0: 55342.3. Samples: 229638760. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:28:39,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 18:28:39,548][54818] Updated weights for policy 0, policy_version 447048 (0.0034) [2024-04-27 18:28:42,954][54818] Updated weights for policy 0, policy_version 447058 (0.0027) [2024-04-27 18:28:44,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7324680192. Throughput: 0: 55598.6. Samples: 229816200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:28:44,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 18:28:45,437][54818] Updated weights for policy 0, policy_version 447068 (0.0022) [2024-04-27 18:28:48,938][54818] Updated weights for policy 0, policy_version 447078 (0.0026) [2024-04-27 18:28:49,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7324958720. Throughput: 0: 55698.6. Samples: 230151020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:28:49,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 18:28:51,204][54818] Updated weights for policy 0, policy_version 447088 (0.0029) [2024-04-27 18:28:54,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7325220864. Throughput: 0: 55717.6. Samples: 230486640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:28:54,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 18:28:54,671][54818] Updated weights for policy 0, policy_version 447098 (0.0026) [2024-04-27 18:28:57,058][54818] Updated weights for policy 0, policy_version 447108 (0.0039) [2024-04-27 18:28:59,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7325499392. Throughput: 0: 55447.9. Samples: 230648240. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:28:59,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 18:29:00,620][54818] Updated weights for policy 0, policy_version 447118 (0.0039) [2024-04-27 18:29:03,069][54818] Updated weights for policy 0, policy_version 447128 (0.0028) [2024-04-27 18:29:04,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7325777920. Throughput: 0: 55487.5. Samples: 230982500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:04,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 18:29:06,532][54818] Updated weights for policy 0, policy_version 447138 (0.0028) [2024-04-27 18:29:08,833][54818] Updated weights for policy 0, policy_version 447148 (0.0026) [2024-04-27 18:29:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7326072832. Throughput: 0: 55503.6. Samples: 231313160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:09,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 18:29:10,163][54798] Signal inference workers to stop experience collection... (3250 times) [2024-04-27 18:29:10,164][54798] Signal inference workers to resume experience collection... (3250 times) [2024-04-27 18:29:10,195][54818] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-04-27 18:29:10,195][54818] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-04-27 18:29:12,340][54818] Updated weights for policy 0, policy_version 447158 (0.0028) [2024-04-27 18:29:14,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7326351360. Throughput: 0: 55605.3. Samples: 231484140. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:14,254][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 18:29:14,757][54818] Updated weights for policy 0, policy_version 447168 (0.0035) [2024-04-27 18:29:18,161][54818] Updated weights for policy 0, policy_version 447178 (0.0027) [2024-04-27 18:29:19,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7326646272. Throughput: 0: 55723.6. Samples: 231819240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:19,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 18:29:20,520][54818] Updated weights for policy 0, policy_version 447188 (0.0030) [2024-04-27 18:29:24,234][54818] Updated weights for policy 0, policy_version 447198 (0.0027) [2024-04-27 18:29:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7326892032. Throughput: 0: 55905.8. Samples: 232154520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:24,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 18:29:26,502][54818] Updated weights for policy 0, policy_version 447208 (0.0034) [2024-04-27 18:29:29,253][54587] Fps is (10 sec: 52427.9, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 7327170560. Throughput: 0: 55424.8. Samples: 232310320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:29,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 18:29:30,085][54818] Updated weights for policy 0, policy_version 447218 (0.0033) [2024-04-27 18:29:32,200][54818] Updated weights for policy 0, policy_version 447228 (0.0030) [2024-04-27 18:29:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7327432704. Throughput: 0: 55429.0. Samples: 232645320. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:34,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 18:29:35,963][54818] Updated weights for policy 0, policy_version 447238 (0.0033) [2024-04-27 18:29:38,069][54818] Updated weights for policy 0, policy_version 447248 (0.0028) [2024-04-27 18:29:39,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7327727616. Throughput: 0: 55392.2. Samples: 232979280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:39,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 18:29:41,912][54818] Updated weights for policy 0, policy_version 447258 (0.0034) [2024-04-27 18:29:44,238][54818] Updated weights for policy 0, policy_version 447268 (0.0031) [2024-04-27 18:29:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7328038912. Throughput: 0: 55807.7. Samples: 233159580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:44,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 18:29:47,751][54818] Updated weights for policy 0, policy_version 447278 (0.0031) [2024-04-27 18:29:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 7328301056. Throughput: 0: 55720.5. Samples: 233489920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:49,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 18:29:49,329][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000447285_7328317440.pth... [2024-04-27 18:29:49,379][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000446469_7314948096.pth [2024-04-27 18:29:50,277][54818] Updated weights for policy 0, policy_version 447288 (0.0032) [2024-04-27 18:29:53,581][54818] Updated weights for policy 0, policy_version 447298 (0.0028) [2024-04-27 18:29:54,253][54587] Fps is (10 sec: 52427.9, 60 sec: 55705.6, 300 sec: 55538.9). 
Total num frames: 7328563200. Throughput: 0: 55680.3. Samples: 233818780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-04-27 18:29:54,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 18:29:56,034][54818] Updated weights for policy 0, policy_version 447308 (0.0031) [2024-04-27 18:29:59,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7328841728. Throughput: 0: 55563.1. Samples: 233984480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:29:59,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 18:29:59,510][54818] Updated weights for policy 0, policy_version 447318 (0.0032) [2024-04-27 18:30:01,899][54818] Updated weights for policy 0, policy_version 447328 (0.0028) [2024-04-27 18:30:03,671][54798] Signal inference workers to stop experience collection... (3300 times) [2024-04-27 18:30:03,714][54818] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-04-27 18:30:03,726][54798] Signal inference workers to resume experience collection... (3300 times) [2024-04-27 18:30:03,732][54818] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-04-27 18:30:04,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7329120256. Throughput: 0: 55583.8. Samples: 234320520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:04,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 18:30:05,419][54818] Updated weights for policy 0, policy_version 447338 (0.0031) [2024-04-27 18:30:07,803][54818] Updated weights for policy 0, policy_version 447348 (0.0029) [2024-04-27 18:30:09,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7329398784. Throughput: 0: 55524.9. Samples: 234653140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:09,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 18:30:11,478][54818] Updated weights for policy 0, policy_version 447358 (0.0028) [2024-04-27 18:30:13,494][54818] Updated weights for policy 0, policy_version 447368 (0.0028) [2024-04-27 18:30:14,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7329677312. Throughput: 0: 55796.6. Samples: 234821160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:14,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 18:30:17,405][54818] Updated weights for policy 0, policy_version 447378 (0.0027) [2024-04-27 18:30:19,253][54587] Fps is (10 sec: 58982.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7329988608. Throughput: 0: 55825.8. Samples: 235157480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:19,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 18:30:19,286][54818] Updated weights for policy 0, policy_version 447388 (0.0027) [2024-04-27 18:30:23,243][54818] Updated weights for policy 0, policy_version 447398 (0.0027) [2024-04-27 18:30:24,253][54587] Fps is (10 sec: 58982.1, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 7330267136. Throughput: 0: 55761.7. Samples: 235488560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:24,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-27 18:30:25,946][54818] Updated weights for policy 0, policy_version 447408 (0.0025) [2024-04-27 18:30:29,025][54818] Updated weights for policy 0, policy_version 447418 (0.0029) [2024-04-27 18:30:29,253][54587] Fps is (10 sec: 52428.3, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7330512896. Throughput: 0: 55468.4. Samples: 235655660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:29,254][54587] Avg episode reward: [(0, '0.683')] [2024-04-27 18:30:31,908][54818] Updated weights for policy 0, policy_version 447428 (0.0026) [2024-04-27 18:30:34,253][54587] Fps is (10 sec: 49152.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7330758656. Throughput: 0: 55416.4. Samples: 235983660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:34,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 18:30:34,945][54818] Updated weights for policy 0, policy_version 447438 (0.0026) [2024-04-27 18:30:37,640][54818] Updated weights for policy 0, policy_version 447448 (0.0028) [2024-04-27 18:30:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7331069952. Throughput: 0: 55551.3. Samples: 236318580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:39,253][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 18:30:40,753][54818] Updated weights for policy 0, policy_version 447458 (0.0027) [2024-04-27 18:30:43,634][54818] Updated weights for policy 0, policy_version 447468 (0.0035) [2024-04-27 18:30:44,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7331348480. Throughput: 0: 55500.4. Samples: 236482000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:44,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 18:30:46,730][54818] Updated weights for policy 0, policy_version 447478 (0.0029) [2024-04-27 18:30:49,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7331627008. Throughput: 0: 55405.8. Samples: 236813780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:49,262][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 18:30:49,544][54818] Updated weights for policy 0, policy_version 447488 (0.0027) [2024-04-27 18:30:52,575][54818] Updated weights for policy 0, policy_version 447498 (0.0028) [2024-04-27 18:30:52,836][54798] Signal inference workers to stop experience collection... (3350 times) [2024-04-27 18:30:52,837][54798] Signal inference workers to resume experience collection... (3350 times) [2024-04-27 18:30:52,860][54818] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-04-27 18:30:52,860][54818] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-04-27 18:30:54,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7331921920. Throughput: 0: 55431.5. Samples: 237147560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:54,262][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 18:30:55,398][54818] Updated weights for policy 0, policy_version 447508 (0.0034) [2024-04-27 18:30:58,427][54818] Updated weights for policy 0, policy_version 447518 (0.0026) [2024-04-27 18:30:59,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7332200448. Throughput: 0: 55511.0. Samples: 237319160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:30:59,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 18:31:01,200][54818] Updated weights for policy 0, policy_version 447528 (0.0029) [2024-04-27 18:31:04,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7332446208. Throughput: 0: 55401.3. Samples: 237650540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:31:04,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 18:31:04,459][54818] Updated weights for policy 0, policy_version 447538 (0.0030) [2024-04-27 18:31:07,160][54818] Updated weights for policy 0, policy_version 447548 (0.0027) [2024-04-27 18:31:09,253][54587] Fps is (10 sec: 50790.7, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7332708352. Throughput: 0: 55435.1. Samples: 237983140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:31:09,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-27 18:31:10,284][54818] Updated weights for policy 0, policy_version 447558 (0.0032) [2024-04-27 18:31:13,173][54818] Updated weights for policy 0, policy_version 447568 (0.0026) [2024-04-27 18:31:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7333019648. Throughput: 0: 55292.1. Samples: 238143800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 18:31:14,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 18:31:16,048][54818] Updated weights for policy 0, policy_version 447578 (0.0032) [2024-04-27 18:31:18,898][54818] Updated weights for policy 0, policy_version 447588 (0.0030) [2024-04-27 18:31:19,253][54587] Fps is (10 sec: 57344.6, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 7333281792. Throughput: 0: 55420.1. Samples: 238477560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:31:19,253][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 18:31:21,948][54818] Updated weights for policy 0, policy_version 447598 (0.0034) [2024-04-27 18:31:24,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7333576704. Throughput: 0: 55447.8. Samples: 238813740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:31:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:31:24,689][54818] Updated weights for policy 0, policy_version 447608 (0.0028) [2024-04-27 18:31:27,993][54818] Updated weights for policy 0, policy_version 447618 (0.0032) [2024-04-27 18:31:29,253][54587] Fps is (10 sec: 58981.1, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7333871616. Throughput: 0: 55769.7. Samples: 238991640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:31:29,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 18:31:30,603][54818] Updated weights for policy 0, policy_version 447628 (0.0029) [2024-04-27 18:31:33,761][54818] Updated weights for policy 0, policy_version 447638 (0.0033) [2024-04-27 18:31:34,253][54587] Fps is (10 sec: 55706.7, 60 sec: 56251.8, 300 sec: 55594.6). Total num frames: 7334133760. Throughput: 0: 55834.4. Samples: 239326320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:31:34,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 18:31:36,389][54818] Updated weights for policy 0, policy_version 447648 (0.0035) [2024-04-27 18:31:39,253][54587] Fps is (10 sec: 52429.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7334395904. Throughput: 0: 55882.7. Samples: 239662280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:31:39,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 18:31:39,492][54818] Updated weights for policy 0, policy_version 447658 (0.0029) [2024-04-27 18:31:42,171][54818] Updated weights for policy 0, policy_version 447668 (0.0030) [2024-04-27 18:31:44,253][54587] Fps is (10 sec: 54066.1, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7334674432. Throughput: 0: 55653.3. Samples: 239823560. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:31:44,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:31:45,430][54818] Updated weights for policy 0, policy_version 447678 (0.0033) [2024-04-27 18:31:48,364][54818] Updated weights for policy 0, policy_version 447688 (0.0034) [2024-04-27 18:31:49,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 7334969344. Throughput: 0: 55743.1. Samples: 240158980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:31:49,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 18:31:49,364][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000447692_7334985728.pth... [2024-04-27 18:31:49,415][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000446877_7321632768.pth [2024-04-27 18:31:51,350][54818] Updated weights for policy 0, policy_version 447698 (0.0034) [2024-04-27 18:31:54,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7335231488. Throughput: 0: 55814.8. Samples: 240494800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:31:54,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 18:31:54,358][54818] Updated weights for policy 0, policy_version 447708 (0.0030) [2024-04-27 18:31:54,881][54798] Signal inference workers to stop experience collection... (3400 times) [2024-04-27 18:31:54,912][54818] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-04-27 18:31:54,939][54798] Signal inference workers to resume experience collection... (3400 times) [2024-04-27 18:31:54,942][54818] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-04-27 18:31:57,070][54818] Updated weights for policy 0, policy_version 447718 (0.0032) [2024-04-27 18:31:59,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.7, 300 sec: 55594.6). Total num frames: 7335526400. Throughput: 0: 55929.4. Samples: 240660620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:31:59,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:32:00,196][54818] Updated weights for policy 0, policy_version 447728 (0.0032) [2024-04-27 18:32:02,874][54818] Updated weights for policy 0, policy_version 447738 (0.0025) [2024-04-27 18:32:04,253][54587] Fps is (10 sec: 58982.5, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 7335821312. Throughput: 0: 55941.3. Samples: 240994920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:32:04,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 18:32:05,953][54818] Updated weights for policy 0, policy_version 447748 (0.0026) [2024-04-27 18:32:09,006][54818] Updated weights for policy 0, policy_version 447758 (0.0027) [2024-04-27 18:32:09,254][54587] Fps is (10 sec: 55704.1, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 7336083456. Throughput: 0: 55909.7. Samples: 241329680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:32:09,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 18:32:11,827][54818] Updated weights for policy 0, policy_version 447768 (0.0030) [2024-04-27 18:32:14,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7336361984. Throughput: 0: 55695.2. Samples: 241497920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:32:14,254][54587] Avg episode reward: [(0, '0.702')] [2024-04-27 18:32:14,872][54818] Updated weights for policy 0, policy_version 447778 (0.0033) [2024-04-27 18:32:17,750][54818] Updated weights for policy 0, policy_version 447788 (0.0025) [2024-04-27 18:32:19,253][54587] Fps is (10 sec: 54068.4, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7336624128. Throughput: 0: 55673.3. Samples: 241831620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:32:19,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 18:32:20,659][54818] Updated weights for policy 0, policy_version 447798 (0.0033) [2024-04-27 18:32:23,533][54818] Updated weights for policy 0, policy_version 447808 (0.0031) [2024-04-27 18:32:24,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7336919040. Throughput: 0: 55559.0. Samples: 242162440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:32:24,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 18:32:26,586][54818] Updated weights for policy 0, policy_version 447818 (0.0025) [2024-04-27 18:32:29,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7337197568. Throughput: 0: 55643.6. Samples: 242327520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:32:29,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 18:32:29,652][54818] Updated weights for policy 0, policy_version 447828 (0.0026) [2024-04-27 18:32:32,456][54818] Updated weights for policy 0, policy_version 447838 (0.0024) [2024-04-27 18:32:34,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 7337476096. Throughput: 0: 55672.3. Samples: 242664240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:32:34,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 18:32:35,480][54818] Updated weights for policy 0, policy_version 447848 (0.0025) [2024-04-27 18:32:38,218][54818] Updated weights for policy 0, policy_version 447858 (0.0030) [2024-04-27 18:32:39,253][54587] Fps is (10 sec: 55706.6, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7337754624. Throughput: 0: 55577.4. Samples: 242995780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 18:32:39,253][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 18:32:41,470][54818] Updated weights for policy 0, policy_version 447868 (0.0028) [2024-04-27 18:32:44,018][54818] Updated weights for policy 0, policy_version 447878 (0.0029) [2024-04-27 18:32:44,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 7338033152. Throughput: 0: 55591.0. Samples: 243162220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:32:44,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 18:32:47,362][54818] Updated weights for policy 0, policy_version 447888 (0.0029) [2024-04-27 18:32:49,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7338295296. Throughput: 0: 55595.0. Samples: 243496700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:32:49,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 18:32:50,029][54818] Updated weights for policy 0, policy_version 447898 (0.0031) [2024-04-27 18:32:51,195][54798] Signal inference workers to stop experience collection... (3450 times) [2024-04-27 18:32:51,195][54798] Signal inference workers to resume experience collection... (3450 times) [2024-04-27 18:32:51,224][54818] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-04-27 18:32:51,225][54818] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-04-27 18:32:53,154][54818] Updated weights for policy 0, policy_version 447908 (0.0025) [2024-04-27 18:32:54,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7338573824. Throughput: 0: 55595.9. Samples: 243831480. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:32:54,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 18:32:56,127][54818] Updated weights for policy 0, policy_version 447918 (0.0035) [2024-04-27 18:32:59,082][54818] Updated weights for policy 0, policy_version 447928 (0.0025) [2024-04-27 18:32:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7338852352. Throughput: 0: 55559.6. Samples: 243998100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:32:59,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 18:33:01,872][54818] Updated weights for policy 0, policy_version 447938 (0.0027) [2024-04-27 18:33:04,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7339130880. Throughput: 0: 55700.0. Samples: 244338120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:04,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 18:33:04,755][54818] Updated weights for policy 0, policy_version 447948 (0.0026) [2024-04-27 18:33:07,707][54818] Updated weights for policy 0, policy_version 447958 (0.0030) [2024-04-27 18:33:09,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7339425792. Throughput: 0: 55631.0. Samples: 244665840. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:09,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 18:33:10,737][54818] Updated weights for policy 0, policy_version 447968 (0.0030) [2024-04-27 18:33:13,574][54818] Updated weights for policy 0, policy_version 447978 (0.0025) [2024-04-27 18:33:14,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7339687936. Throughput: 0: 55755.2. Samples: 244836500. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:14,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 18:33:16,784][54818] Updated weights for policy 0, policy_version 447988 (0.0036) [2024-04-27 18:33:19,253][54587] Fps is (10 sec: 54068.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7339966464. Throughput: 0: 55561.6. Samples: 245164500. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:19,253][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 18:33:19,574][54818] Updated weights for policy 0, policy_version 447998 (0.0031) [2024-04-27 18:33:22,562][54818] Updated weights for policy 0, policy_version 448008 (0.0025) [2024-04-27 18:33:24,253][54587] Fps is (10 sec: 58982.5, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7340277760. Throughput: 0: 55838.2. Samples: 245508500. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:24,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 18:33:25,366][54818] Updated weights for policy 0, policy_version 448018 (0.0029) [2024-04-27 18:33:28,286][54818] Updated weights for policy 0, policy_version 448028 (0.0035) [2024-04-27 18:33:29,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 7340539904. Throughput: 0: 55681.3. Samples: 245667880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:29,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 18:33:31,298][54818] Updated weights for policy 0, policy_version 448038 (0.0033) [2024-04-27 18:33:34,200][54818] Updated weights for policy 0, policy_version 448048 (0.0027) [2024-04-27 18:33:34,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7340818432. Throughput: 0: 55726.2. Samples: 246004380. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:34,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 18:33:37,474][54818] Updated weights for policy 0, policy_version 448058 (0.0026) [2024-04-27 18:33:39,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7341080576. Throughput: 0: 55663.6. Samples: 246336340. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:39,253][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 18:33:40,099][54818] Updated weights for policy 0, policy_version 448068 (0.0027) [2024-04-27 18:33:43,284][54818] Updated weights for policy 0, policy_version 448078 (0.0034) [2024-04-27 18:33:44,254][54587] Fps is (10 sec: 55704.5, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 7341375488. Throughput: 0: 55610.8. Samples: 246500600. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:44,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 18:33:45,848][54818] Updated weights for policy 0, policy_version 448088 (0.0026) [2024-04-27 18:33:49,015][54818] Updated weights for policy 0, policy_version 448098 (0.0026) [2024-04-27 18:33:49,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7341654016. Throughput: 0: 55486.3. Samples: 246835000. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:49,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 18:33:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000448099_7341654016.pth... [2024-04-27 18:33:49,310][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000447285_7328317440.pth [2024-04-27 18:33:51,978][54818] Updated weights for policy 0, policy_version 448108 (0.0032) [2024-04-27 18:33:54,253][54587] Fps is (10 sec: 55706.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7341932544. Throughput: 0: 55649.1. Samples: 247170040. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:54,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:33:54,900][54818] Updated weights for policy 0, policy_version 448118 (0.0026) [2024-04-27 18:33:57,955][54818] Updated weights for policy 0, policy_version 448128 (0.0031) [2024-04-27 18:33:59,067][54798] Signal inference workers to stop experience collection... (3500 times) [2024-04-27 18:33:59,068][54798] Signal inference workers to resume experience collection... (3500 times) [2024-04-27 18:33:59,088][54818] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-04-27 18:33:59,088][54818] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-04-27 18:33:59,253][54587] Fps is (10 sec: 57343.9, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 7342227456. Throughput: 0: 55710.7. Samples: 247343480. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:33:59,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 18:34:00,711][54818] Updated weights for policy 0, policy_version 448138 (0.0026) [2024-04-27 18:34:03,661][54818] Updated weights for policy 0, policy_version 448148 (0.0029) [2024-04-27 18:34:04,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7342473216. Throughput: 0: 55799.8. Samples: 247675500. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-04-27 18:34:04,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 18:34:06,572][54818] Updated weights for policy 0, policy_version 448158 (0.0034) [2024-04-27 18:34:09,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7342751744. Throughput: 0: 55602.3. Samples: 248010600. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:09,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 18:34:09,520][54818] Updated weights for policy 0, policy_version 448168 (0.0036) [2024-04-27 18:34:12,421][54818] Updated weights for policy 0, policy_version 448178 (0.0026) [2024-04-27 18:34:14,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7343030272. Throughput: 0: 55841.5. Samples: 248180740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:14,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 18:34:15,399][54818] Updated weights for policy 0, policy_version 448188 (0.0030) [2024-04-27 18:34:18,337][54818] Updated weights for policy 0, policy_version 448198 (0.0032) [2024-04-27 18:34:19,254][54587] Fps is (10 sec: 55704.1, 60 sec: 55705.3, 300 sec: 55650.0). Total num frames: 7343308800. Throughput: 0: 55773.1. Samples: 248514180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:19,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 18:34:21,145][54818] Updated weights for policy 0, policy_version 448208 (0.0030) [2024-04-27 18:34:24,203][54818] Updated weights for policy 0, policy_version 448218 (0.0026) [2024-04-27 18:34:24,254][54587] Fps is (10 sec: 57337.7, 60 sec: 55431.5, 300 sec: 55705.4). Total num frames: 7343603712. Throughput: 0: 55638.6. Samples: 248840140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:24,255][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 18:34:27,018][54818] Updated weights for policy 0, policy_version 448228 (0.0025) [2024-04-27 18:34:29,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7343865856. Throughput: 0: 55672.3. Samples: 249005840. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:29,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 18:34:29,976][54818] Updated weights for policy 0, policy_version 448238 (0.0027) [2024-04-27 18:34:33,254][54818] Updated weights for policy 0, policy_version 448248 (0.0025) [2024-04-27 18:34:34,253][54587] Fps is (10 sec: 55712.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7344160768. Throughput: 0: 55664.5. Samples: 249339900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:34,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 18:34:35,783][54818] Updated weights for policy 0, policy_version 448258 (0.0034) [2024-04-27 18:34:39,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7344406528. Throughput: 0: 55653.4. Samples: 249674440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:39,253][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 18:34:39,338][54818] Updated weights for policy 0, policy_version 448268 (0.0029) [2024-04-27 18:34:41,732][54818] Updated weights for policy 0, policy_version 448278 (0.0027) [2024-04-27 18:34:44,253][54587] Fps is (10 sec: 52427.8, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7344685056. Throughput: 0: 55359.4. Samples: 249834660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:44,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 18:34:45,323][54818] Updated weights for policy 0, policy_version 448288 (0.0033) [2024-04-27 18:34:47,739][54818] Updated weights for policy 0, policy_version 448298 (0.0034) [2024-04-27 18:34:49,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7344963584. Throughput: 0: 55318.3. Samples: 250164820. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:49,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 18:34:51,052][54818] Updated weights for policy 0, policy_version 448308 (0.0033) [2024-04-27 18:34:53,757][54818] Updated weights for policy 0, policy_version 448318 (0.0028) [2024-04-27 18:34:54,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7345242112. Throughput: 0: 55258.2. Samples: 250497220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:54,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 18:34:56,930][54818] Updated weights for policy 0, policy_version 448328 (0.0029) [2024-04-27 18:34:57,119][54798] Signal inference workers to stop experience collection... (3550 times) [2024-04-27 18:34:57,153][54818] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-04-27 18:34:57,208][54798] Signal inference workers to resume experience collection... (3550 times) [2024-04-27 18:34:57,209][54818] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-04-27 18:34:59,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7345537024. Throughput: 0: 55280.4. Samples: 250668360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:34:59,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 18:34:59,666][54818] Updated weights for policy 0, policy_version 448338 (0.0030) [2024-04-27 18:35:02,812][54818] Updated weights for policy 0, policy_version 448348 (0.0028) [2024-04-27 18:35:04,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7345799168. Throughput: 0: 55155.9. Samples: 250996180. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:35:04,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 18:35:06,005][54818] Updated weights for policy 0, policy_version 448358 (0.0024) [2024-04-27 18:35:08,775][54818] Updated weights for policy 0, policy_version 448368 (0.0027) [2024-04-27 18:35:09,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7346077696. Throughput: 0: 55231.6. Samples: 251325500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:35:09,253][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 18:35:12,052][54818] Updated weights for policy 0, policy_version 448378 (0.0035) [2024-04-27 18:35:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7346356224. Throughput: 0: 55205.1. Samples: 251490060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:35:14,253][54587] Avg episode reward: [(0, '0.497')] [2024-04-27 18:35:14,828][54818] Updated weights for policy 0, policy_version 448388 (0.0030) [2024-04-27 18:35:18,041][54818] Updated weights for policy 0, policy_version 448398 (0.0034) [2024-04-27 18:35:19,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55432.7, 300 sec: 55483.4). Total num frames: 7346634752. Throughput: 0: 55270.4. Samples: 251827080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:35:19,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 18:35:20,557][54818] Updated weights for policy 0, policy_version 448408 (0.0026) [2024-04-27 18:35:23,869][54818] Updated weights for policy 0, policy_version 448418 (0.0029) [2024-04-27 18:35:24,253][54587] Fps is (10 sec: 52428.1, 60 sec: 54614.3, 300 sec: 55483.4). Total num frames: 7346880512. Throughput: 0: 55331.9. Samples: 252164380. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 20.0) [2024-04-27 18:35:24,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 18:35:26,405][54818] Updated weights for policy 0, policy_version 448428 (0.0028) [2024-04-27 18:35:29,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7347175424. Throughput: 0: 55244.2. Samples: 252320640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:35:29,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 18:35:30,016][54818] Updated weights for policy 0, policy_version 448438 (0.0029) [2024-04-27 18:35:32,383][54818] Updated weights for policy 0, policy_version 448448 (0.0026) [2024-04-27 18:35:34,253][54587] Fps is (10 sec: 58981.5, 60 sec: 55159.2, 300 sec: 55594.5). Total num frames: 7347470336. Throughput: 0: 55313.6. Samples: 252653940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:35:34,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 18:35:35,839][54818] Updated weights for policy 0, policy_version 448458 (0.0029) [2024-04-27 18:35:38,239][54818] Updated weights for policy 0, policy_version 448468 (0.0035) [2024-04-27 18:35:39,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7347748864. Throughput: 0: 55339.3. Samples: 252987500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:35:39,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 18:35:41,839][54818] Updated weights for policy 0, policy_version 448478 (0.0035) [2024-04-27 18:35:44,047][54818] Updated weights for policy 0, policy_version 448488 (0.0029) [2024-04-27 18:35:44,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7348027392. Throughput: 0: 55489.3. Samples: 253165380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:35:44,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 18:35:47,624][54818] Updated weights for policy 0, policy_version 448498 (0.0029) [2024-04-27 18:35:49,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7348305920. Throughput: 0: 55478.4. Samples: 253492720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:35:49,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:35:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000448505_7348305920.pth... [2024-04-27 18:35:49,307][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000447692_7334985728.pth [2024-04-27 18:35:50,006][54818] Updated weights for policy 0, policy_version 448508 (0.0027) [2024-04-27 18:35:53,404][54818] Updated weights for policy 0, policy_version 448518 (0.0026) [2024-04-27 18:35:53,888][54798] Signal inference workers to stop experience collection... (3600 times) [2024-04-27 18:35:53,915][54818] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-04-27 18:35:53,956][54798] Signal inference workers to resume experience collection... (3600 times) [2024-04-27 18:35:53,956][54818] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-04-27 18:35:54,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 7348568064. Throughput: 0: 55461.7. Samples: 253821280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:35:54,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 18:35:56,019][54818] Updated weights for policy 0, policy_version 448528 (0.0033) [2024-04-27 18:35:59,254][54587] Fps is (10 sec: 52428.6, 60 sec: 54886.3, 300 sec: 55538.9). Total num frames: 7348830208. Throughput: 0: 55356.5. Samples: 253981120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:35:59,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:35:59,387][54818] Updated weights for policy 0, policy_version 448538 (0.0028) [2024-04-27 18:36:01,891][54818] Updated weights for policy 0, policy_version 448548 (0.0028) [2024-04-27 18:36:04,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7349108736. Throughput: 0: 55241.9. Samples: 254312960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:04,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 18:36:05,237][54818] Updated weights for policy 0, policy_version 448558 (0.0025) [2024-04-27 18:36:07,590][54818] Updated weights for policy 0, policy_version 448568 (0.0025) [2024-04-27 18:36:09,253][54587] Fps is (10 sec: 58982.9, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7349420032. Throughput: 0: 55191.4. Samples: 254648000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:09,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 18:36:11,175][54818] Updated weights for policy 0, policy_version 448578 (0.0026) [2024-04-27 18:36:13,431][54818] Updated weights for policy 0, policy_version 448588 (0.0028) [2024-04-27 18:36:14,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 7349698560. Throughput: 0: 55689.2. Samples: 254826660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:14,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 18:36:16,944][54818] Updated weights for policy 0, policy_version 448598 (0.0031) [2024-04-27 18:36:19,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7349960704. Throughput: 0: 55655.8. Samples: 255158440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:19,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 18:36:19,538][54818] Updated weights for policy 0, policy_version 448608 (0.0032) [2024-04-27 18:36:22,983][54818] Updated weights for policy 0, policy_version 448618 (0.0034) [2024-04-27 18:36:24,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55978.8, 300 sec: 55483.5). Total num frames: 7350239232. Throughput: 0: 55604.7. Samples: 255489700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:24,253][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 18:36:25,587][54818] Updated weights for policy 0, policy_version 448628 (0.0029) [2024-04-27 18:36:28,919][54818] Updated weights for policy 0, policy_version 448638 (0.0033) [2024-04-27 18:36:29,254][54587] Fps is (10 sec: 54065.7, 60 sec: 55432.2, 300 sec: 55483.4). Total num frames: 7350501376. Throughput: 0: 55134.4. Samples: 255646440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:29,254][54587] Avg episode reward: [(0, '0.488')] [2024-04-27 18:36:31,336][54818] Updated weights for policy 0, policy_version 448648 (0.0029) [2024-04-27 18:36:34,253][54587] Fps is (10 sec: 54066.0, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7350779904. Throughput: 0: 55337.4. Samples: 255982900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:34,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 18:36:34,673][54818] Updated weights for policy 0, policy_version 448658 (0.0037) [2024-04-27 18:36:37,093][54818] Updated weights for policy 0, policy_version 448668 (0.0032) [2024-04-27 18:36:39,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7351058432. Throughput: 0: 55417.7. Samples: 256315080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:39,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 18:36:40,547][54818] Updated weights for policy 0, policy_version 448678 (0.0034) [2024-04-27 18:36:43,027][54818] Updated weights for policy 0, policy_version 448688 (0.0025) [2024-04-27 18:36:44,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7351353344. Throughput: 0: 55579.3. Samples: 256482180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:44,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 18:36:46,617][54818] Updated weights for policy 0, policy_version 448698 (0.0027) [2024-04-27 18:36:48,954][54818] Updated weights for policy 0, policy_version 448708 (0.0031) [2024-04-27 18:36:49,253][54587] Fps is (10 sec: 58983.2, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 7351648256. Throughput: 0: 55621.8. Samples: 256815940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 18:36:49,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 18:36:52,503][54818] Updated weights for policy 0, policy_version 448718 (0.0026) [2024-04-27 18:36:54,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7351910400. Throughput: 0: 55712.2. Samples: 257155040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:36:54,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 18:36:54,747][54818] Updated weights for policy 0, policy_version 448728 (0.0027) [2024-04-27 18:36:58,186][54818] Updated weights for policy 0, policy_version 448738 (0.0032) [2024-04-27 18:36:59,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55705.8, 300 sec: 55427.9). Total num frames: 7352172544. Throughput: 0: 55438.3. Samples: 257321380. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:36:59,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 18:37:00,630][54818] Updated weights for policy 0, policy_version 448748 (0.0033) [2024-04-27 18:37:04,192][54818] Updated weights for policy 0, policy_version 448758 (0.0025) [2024-04-27 18:37:04,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55483.5). Total num frames: 7352451072. Throughput: 0: 55399.5. Samples: 257651420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:04,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 18:37:06,737][54818] Updated weights for policy 0, policy_version 448768 (0.0029) [2024-04-27 18:37:09,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 7352729600. Throughput: 0: 55401.2. Samples: 257982760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:09,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 18:37:09,985][54818] Updated weights for policy 0, policy_version 448778 (0.0030) [2024-04-27 18:37:12,013][54798] Signal inference workers to stop experience collection... (3650 times) [2024-04-27 18:37:12,013][54798] Signal inference workers to resume experience collection... (3650 times) [2024-04-27 18:37:12,022][54818] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-04-27 18:37:12,023][54818] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-04-27 18:37:12,498][54818] Updated weights for policy 0, policy_version 448788 (0.0027) [2024-04-27 18:37:14,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7353024512. Throughput: 0: 55796.4. Samples: 258157260. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:14,253][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 18:37:15,880][54818] Updated weights for policy 0, policy_version 448798 (0.0032) [2024-04-27 18:37:18,407][54818] Updated weights for policy 0, policy_version 448808 (0.0025) [2024-04-27 18:37:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7353303040. Throughput: 0: 55693.5. Samples: 258489100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:19,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:37:21,803][54818] Updated weights for policy 0, policy_version 448818 (0.0027) [2024-04-27 18:37:24,253][54818] Updated weights for policy 0, policy_version 448828 (0.0031) [2024-04-27 18:37:24,254][54587] Fps is (10 sec: 57342.4, 60 sec: 55978.4, 300 sec: 55594.5). Total num frames: 7353597952. Throughput: 0: 55746.1. Samples: 258823660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:24,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 18:37:27,727][54818] Updated weights for policy 0, policy_version 448838 (0.0027) [2024-04-27 18:37:29,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55706.0, 300 sec: 55483.5). Total num frames: 7353843712. Throughput: 0: 55786.9. Samples: 258992580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:29,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 18:37:30,118][54818] Updated weights for policy 0, policy_version 448848 (0.0024) [2024-04-27 18:37:33,792][54818] Updated weights for policy 0, policy_version 448858 (0.0027) [2024-04-27 18:37:34,253][54587] Fps is (10 sec: 50792.0, 60 sec: 55432.7, 300 sec: 55427.9). Total num frames: 7354105856. Throughput: 0: 55813.4. Samples: 259327540. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:34,253][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 18:37:36,161][54818] Updated weights for policy 0, policy_version 448868 (0.0030) [2024-04-27 18:37:39,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 7354368000. Throughput: 0: 55696.0. Samples: 259661360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:39,262][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 18:37:39,633][54818] Updated weights for policy 0, policy_version 448878 (0.0026) [2024-04-27 18:37:42,154][54818] Updated weights for policy 0, policy_version 448888 (0.0026) [2024-04-27 18:37:44,253][54587] Fps is (10 sec: 57342.8, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7354679296. Throughput: 0: 55513.6. Samples: 259819500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:44,262][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:37:45,565][54818] Updated weights for policy 0, policy_version 448898 (0.0025) [2024-04-27 18:37:48,132][54818] Updated weights for policy 0, policy_version 448908 (0.0026) [2024-04-27 18:37:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7354990592. Throughput: 0: 55644.1. Samples: 260155400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:49,262][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:37:49,273][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000448913_7354990592.pth... [2024-04-27 18:37:49,323][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000448099_7341654016.pth [2024-04-27 18:37:51,372][54818] Updated weights for policy 0, policy_version 448918 (0.0027) [2024-04-27 18:37:53,988][54818] Updated weights for policy 0, policy_version 448928 (0.0026) [2024-04-27 18:37:54,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.4, 300 sec: 55539.0). 
Total num frames: 7355236352. Throughput: 0: 55680.4. Samples: 260488380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:54,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 18:37:57,252][54818] Updated weights for policy 0, policy_version 448938 (0.0028) [2024-04-27 18:37:59,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7355531264. Throughput: 0: 55707.4. Samples: 260664100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:37:59,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 18:37:59,886][54818] Updated weights for policy 0, policy_version 448948 (0.0030) [2024-04-27 18:38:03,216][54818] Updated weights for policy 0, policy_version 448958 (0.0028) [2024-04-27 18:38:04,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 7355793408. Throughput: 0: 55652.0. Samples: 260993440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:38:04,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 18:38:05,840][54818] Updated weights for policy 0, policy_version 448968 (0.0030) [2024-04-27 18:38:08,995][54818] Updated weights for policy 0, policy_version 448978 (0.0027) [2024-04-27 18:38:09,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7356071936. Throughput: 0: 55611.3. Samples: 261326160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:38:09,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:38:11,645][54818] Updated weights for policy 0, policy_version 448988 (0.0026) [2024-04-27 18:38:14,253][54587] Fps is (10 sec: 52428.8, 60 sec: 54886.3, 300 sec: 55427.9). Total num frames: 7356317696. Throughput: 0: 55396.7. Samples: 261485440. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 18:38:14,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 18:38:14,886][54818] Updated weights for policy 0, policy_version 448998 (0.0036) [2024-04-27 18:38:17,405][54818] Updated weights for policy 0, policy_version 449008 (0.0026) [2024-04-27 18:38:19,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7356628992. Throughput: 0: 55368.7. Samples: 261819140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:38:19,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:38:20,979][54818] Updated weights for policy 0, policy_version 449018 (0.0026) [2024-04-27 18:38:23,183][54818] Updated weights for policy 0, policy_version 449028 (0.0030) [2024-04-27 18:38:23,595][54798] Signal inference workers to stop experience collection... (3700 times) [2024-04-27 18:38:23,625][54818] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-04-27 18:38:23,679][54798] Signal inference workers to resume experience collection... (3700 times) [2024-04-27 18:38:23,680][54818] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-04-27 18:38:24,253][54587] Fps is (10 sec: 62259.8, 60 sec: 55705.9, 300 sec: 55594.5). Total num frames: 7356940288. Throughput: 0: 55389.9. Samples: 262153900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:38:24,253][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 18:38:26,661][54818] Updated weights for policy 0, policy_version 449038 (0.0030) [2024-04-27 18:38:29,224][54818] Updated weights for policy 0, policy_version 449048 (0.0033) [2024-04-27 18:38:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 7357202432. Throughput: 0: 55658.7. Samples: 262324140. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:38:29,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 18:38:32,482][54818] Updated weights for policy 0, policy_version 449058 (0.0031) [2024-04-27 18:38:34,253][54587] Fps is (10 sec: 52427.7, 60 sec: 55978.4, 300 sec: 55538.9). Total num frames: 7357464576. Throughput: 0: 55582.1. Samples: 262656600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:38:34,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:38:35,178][54818] Updated weights for policy 0, policy_version 449068 (0.0030) [2024-04-27 18:38:38,306][54818] Updated weights for policy 0, policy_version 449078 (0.0027) [2024-04-27 18:38:39,253][54587] Fps is (10 sec: 54066.8, 60 sec: 56251.6, 300 sec: 55483.5). Total num frames: 7357743104. Throughput: 0: 55627.5. Samples: 262991620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:38:39,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 18:38:41,043][54818] Updated weights for policy 0, policy_version 449088 (0.0028) [2024-04-27 18:38:44,150][54818] Updated weights for policy 0, policy_version 449098 (0.0027) [2024-04-27 18:38:44,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7358021632. Throughput: 0: 55180.9. Samples: 263147240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:38:44,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 18:38:46,847][54818] Updated weights for policy 0, policy_version 449108 (0.0024) [2024-04-27 18:38:49,253][54587] Fps is (10 sec: 52429.4, 60 sec: 54613.3, 300 sec: 55372.4). Total num frames: 7358267392. Throughput: 0: 55366.2. Samples: 263484920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:38:49,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 18:38:50,319][54818] Updated weights for policy 0, policy_version 449118 (0.0030) [2024-04-27 18:38:52,669][54818] Updated weights for policy 0, policy_version 449128 (0.0025) [2024-04-27 18:38:54,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 7358578688. Throughput: 0: 55384.3. Samples: 263818460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:38:54,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 18:38:56,105][54818] Updated weights for policy 0, policy_version 449138 (0.0032) [2024-04-27 18:38:58,520][54818] Updated weights for policy 0, policy_version 449148 (0.0031) [2024-04-27 18:38:59,253][54587] Fps is (10 sec: 62259.4, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7358889984. Throughput: 0: 55714.3. Samples: 263992580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:38:59,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 18:39:01,887][54818] Updated weights for policy 0, policy_version 449158 (0.0029) [2024-04-27 18:39:04,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7359152128. Throughput: 0: 55703.6. Samples: 264325800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:39:04,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 18:39:04,480][54818] Updated weights for policy 0, policy_version 449168 (0.0026) [2024-04-27 18:39:08,182][54818] Updated weights for policy 0, policy_version 449178 (0.0027) [2024-04-27 18:39:09,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7359414272. Throughput: 0: 55701.2. Samples: 264660460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:39:09,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 18:39:10,332][54818] Updated weights for policy 0, policy_version 449188 (0.0027) [2024-04-27 18:39:10,353][54798] Signal inference workers to stop experience collection... (3750 times) [2024-04-27 18:39:10,372][54818] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-04-27 18:39:10,447][54798] Signal inference workers to resume experience collection... (3750 times) [2024-04-27 18:39:10,448][54818] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-04-27 18:39:14,008][54818] Updated weights for policy 0, policy_version 449198 (0.0028) [2024-04-27 18:39:14,253][54587] Fps is (10 sec: 50790.6, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 7359660032. Throughput: 0: 55537.4. Samples: 264823320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:39:14,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 18:39:16,215][54818] Updated weights for policy 0, policy_version 449208 (0.0032) [2024-04-27 18:39:19,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55428.1). Total num frames: 7359954944. Throughput: 0: 55492.6. Samples: 265153760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:39:19,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:39:20,068][54818] Updated weights for policy 0, policy_version 449218 (0.0026) [2024-04-27 18:39:22,295][54818] Updated weights for policy 0, policy_version 449228 (0.0032) [2024-04-27 18:39:24,253][54587] Fps is (10 sec: 55705.6, 60 sec: 54613.3, 300 sec: 55427.9). Total num frames: 7360217088. Throughput: 0: 55405.1. Samples: 265484840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:39:24,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 18:39:25,959][54818] Updated weights for policy 0, policy_version 449238 (0.0026) [2024-04-27 18:39:28,055][54818] Updated weights for policy 0, policy_version 449248 (0.0027) [2024-04-27 18:39:29,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7360512000. Throughput: 0: 55697.4. Samples: 265653620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:39:29,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 18:39:31,865][54818] Updated weights for policy 0, policy_version 449258 (0.0029) [2024-04-27 18:39:33,841][54818] Updated weights for policy 0, policy_version 449268 (0.0027) [2024-04-27 18:39:34,253][54587] Fps is (10 sec: 60620.5, 60 sec: 55978.8, 300 sec: 55650.0). Total num frames: 7360823296. Throughput: 0: 55528.9. Samples: 265983720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 18:39:34,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 18:39:37,664][54818] Updated weights for policy 0, policy_version 449278 (0.0029) [2024-04-27 18:39:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7361069056. Throughput: 0: 55567.3. Samples: 266318980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:39:39,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 18:39:39,822][54818] Updated weights for policy 0, policy_version 449288 (0.0027) [2024-04-27 18:39:43,505][54818] Updated weights for policy 0, policy_version 449298 (0.0026) [2024-04-27 18:39:44,253][54587] Fps is (10 sec: 50791.1, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 7361331200. Throughput: 0: 55498.3. Samples: 266490000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:39:44,253][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:39:45,698][54818] Updated weights for policy 0, policy_version 449308 (0.0027) [2024-04-27 18:39:49,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7361593344. Throughput: 0: 55448.0. Samples: 266820960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:39:49,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 18:39:49,339][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000449317_7361609728.pth... [2024-04-27 18:39:49,393][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000448505_7348305920.pth [2024-04-27 18:39:49,526][54818] Updated weights for policy 0, policy_version 449318 (0.0030) [2024-04-27 18:39:51,681][54818] Updated weights for policy 0, policy_version 449328 (0.0029) [2024-04-27 18:39:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 7361904640. Throughput: 0: 55361.9. Samples: 267151740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:39:54,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 18:39:55,523][54818] Updated weights for policy 0, policy_version 449338 (0.0030) [2024-04-27 18:39:57,525][54818] Updated weights for policy 0, policy_version 449348 (0.0027) [2024-04-27 18:39:59,253][54587] Fps is (10 sec: 55705.6, 60 sec: 54340.3, 300 sec: 55427.9). Total num frames: 7362150400. Throughput: 0: 55347.5. Samples: 267313960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:39:59,254][54587] Avg episode reward: [(0, '0.462')] [2024-04-27 18:40:01,390][54818] Updated weights for policy 0, policy_version 449358 (0.0025) [2024-04-27 18:40:02,221][54798] Signal inference workers to stop experience collection... (3800 times) [2024-04-27 18:40:02,221][54798] Signal inference workers to resume experience collection... 
(3800 times) [2024-04-27 18:40:02,236][54818] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-04-27 18:40:02,236][54818] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-04-27 18:40:03,340][54818] Updated weights for policy 0, policy_version 449368 (0.0031) [2024-04-27 18:40:04,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7362461696. Throughput: 0: 55440.4. Samples: 267648580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:04,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 18:40:07,180][54818] Updated weights for policy 0, policy_version 449378 (0.0024) [2024-04-27 18:40:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55432.5, 300 sec: 55538.9). Total num frames: 7362740224. Throughput: 0: 55394.1. Samples: 267977580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:09,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 18:40:09,444][54818] Updated weights for policy 0, policy_version 449388 (0.0029) [2024-04-27 18:40:13,090][54818] Updated weights for policy 0, policy_version 449398 (0.0027) [2024-04-27 18:40:14,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7363018752. Throughput: 0: 55358.6. Samples: 268144760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:14,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 18:40:15,352][54818] Updated weights for policy 0, policy_version 449408 (0.0032) [2024-04-27 18:40:19,040][54818] Updated weights for policy 0, policy_version 449418 (0.0029) [2024-04-27 18:40:19,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7363280896. Throughput: 0: 55422.6. Samples: 268477740. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:19,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-27 18:40:21,170][54818] Updated weights for policy 0, policy_version 449428 (0.0025) [2024-04-27 18:40:24,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7363559424. Throughput: 0: 55447.9. Samples: 268814140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:24,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:40:24,890][54818] Updated weights for policy 0, policy_version 449438 (0.0025) [2024-04-27 18:40:27,109][54818] Updated weights for policy 0, policy_version 449448 (0.0027) [2024-04-27 18:40:29,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7363854336. Throughput: 0: 55353.1. Samples: 268980900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:29,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 18:40:30,685][54818] Updated weights for policy 0, policy_version 449458 (0.0028) [2024-04-27 18:40:33,086][54818] Updated weights for policy 0, policy_version 449468 (0.0033) [2024-04-27 18:40:34,253][54587] Fps is (10 sec: 54067.4, 60 sec: 54613.3, 300 sec: 55427.9). Total num frames: 7364100096. Throughput: 0: 55322.2. Samples: 269310460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:34,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 18:40:36,672][54818] Updated weights for policy 0, policy_version 449478 (0.0033) [2024-04-27 18:40:38,968][54818] Updated weights for policy 0, policy_version 449488 (0.0024) [2024-04-27 18:40:39,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7364411392. Throughput: 0: 55338.5. Samples: 269641980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:39,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 18:40:42,421][54818] Updated weights for policy 0, policy_version 449498 (0.0026) [2024-04-27 18:40:44,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 7364689920. Throughput: 0: 55571.4. Samples: 269814680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:44,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 18:40:44,857][54818] Updated weights for policy 0, policy_version 449508 (0.0028) [2024-04-27 18:40:48,236][54818] Updated weights for policy 0, policy_version 449518 (0.0030) [2024-04-27 18:40:49,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7364952064. Throughput: 0: 55576.9. Samples: 270149540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:49,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 18:40:50,790][54818] Updated weights for policy 0, policy_version 449528 (0.0035) [2024-04-27 18:40:54,232][54818] Updated weights for policy 0, policy_version 449538 (0.0034) [2024-04-27 18:40:54,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55594.6). Total num frames: 7365230592. Throughput: 0: 55641.4. Samples: 270481440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:54,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 18:40:56,953][54818] Updated weights for policy 0, policy_version 449548 (0.0030) [2024-04-27 18:40:57,837][54798] Signal inference workers to stop experience collection... (3850 times) [2024-04-27 18:40:57,838][54798] Signal inference workers to resume experience collection... 
(3850 times) [2024-04-27 18:40:57,865][54818] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-04-27 18:40:57,865][54818] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-04-27 18:40:59,253][54587] Fps is (10 sec: 54066.2, 60 sec: 55705.5, 300 sec: 55538.9). Total num frames: 7365492736. Throughput: 0: 55409.7. Samples: 270638200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 26.0) [2024-04-27 18:40:59,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 18:40:59,969][54818] Updated weights for policy 0, policy_version 449558 (0.0032) [2024-04-27 18:41:02,712][54818] Updated weights for policy 0, policy_version 449568 (0.0026) [2024-04-27 18:41:04,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 7365771264. Throughput: 0: 55474.4. Samples: 270974080. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:04,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 18:41:05,830][54818] Updated weights for policy 0, policy_version 449578 (0.0035) [2024-04-27 18:41:08,668][54818] Updated weights for policy 0, policy_version 449588 (0.0029) [2024-04-27 18:41:09,253][54587] Fps is (10 sec: 55706.8, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 7366049792. Throughput: 0: 55380.6. Samples: 271306260. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:09,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 18:41:11,794][54818] Updated weights for policy 0, policy_version 449598 (0.0032) [2024-04-27 18:41:14,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7366344704. Throughput: 0: 55492.6. Samples: 271478060. 
Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:14,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 18:41:14,797][54818] Updated weights for policy 0, policy_version 449608 (0.0026) [2024-04-27 18:41:17,785][54818] Updated weights for policy 0, policy_version 449618 (0.0023) [2024-04-27 18:41:19,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7366623232. Throughput: 0: 55507.7. Samples: 271808300. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 18:41:20,686][54818] Updated weights for policy 0, policy_version 449628 (0.0025) [2024-04-27 18:41:23,573][54818] Updated weights for policy 0, policy_version 449638 (0.0027) [2024-04-27 18:41:24,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7366885376. Throughput: 0: 55404.4. Samples: 272135180. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:24,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 18:41:26,422][54818] Updated weights for policy 0, policy_version 449648 (0.0027) [2024-04-27 18:41:29,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7367163904. Throughput: 0: 55277.6. Samples: 272302160. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:29,253][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 18:41:29,415][54818] Updated weights for policy 0, policy_version 449658 (0.0030) [2024-04-27 18:41:32,233][54818] Updated weights for policy 0, policy_version 449668 (0.0033) [2024-04-27 18:41:34,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7367426048. Throughput: 0: 55225.3. Samples: 272634680. 
Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:34,262][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 18:41:35,501][54818] Updated weights for policy 0, policy_version 449678 (0.0030) [2024-04-27 18:41:38,507][54818] Updated weights for policy 0, policy_version 449688 (0.0033) [2024-04-27 18:41:39,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7367737344. Throughput: 0: 55252.9. Samples: 272967820. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:39,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:41:41,495][54818] Updated weights for policy 0, policy_version 449698 (0.0030) [2024-04-27 18:41:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 7367999488. Throughput: 0: 55410.4. Samples: 273131660. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:44,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 18:41:44,302][54818] Updated weights for policy 0, policy_version 449708 (0.0031) [2024-04-27 18:41:47,293][54818] Updated weights for policy 0, policy_version 449718 (0.0035) [2024-04-27 18:41:49,253][54587] Fps is (10 sec: 52427.9, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 7368261632. Throughput: 0: 55314.8. Samples: 273463260. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:49,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:41:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000449723_7368261632.pth... [2024-04-27 18:41:49,324][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000448913_7354990592.pth [2024-04-27 18:41:50,196][54818] Updated weights for policy 0, policy_version 449728 (0.0030) [2024-04-27 18:41:51,172][54798] Signal inference workers to stop experience collection... (3900 times) [2024-04-27 18:41:51,177][54798] Signal inference workers to resume experience collection... 
(3900 times) [2024-04-27 18:41:51,202][54818] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-04-27 18:41:51,203][54818] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-04-27 18:41:53,080][54818] Updated weights for policy 0, policy_version 449738 (0.0028) [2024-04-27 18:41:54,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7368540160. Throughput: 0: 55359.5. Samples: 273797440. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 18:41:56,255][54818] Updated weights for policy 0, policy_version 449748 (0.0025) [2024-04-27 18:41:59,169][54818] Updated weights for policy 0, policy_version 449758 (0.0031) [2024-04-27 18:41:59,253][54587] Fps is (10 sec: 57345.2, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 7368835072. Throughput: 0: 55202.2. Samples: 273962160. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:41:59,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 18:42:02,227][54818] Updated weights for policy 0, policy_version 449768 (0.0036) [2024-04-27 18:42:04,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7369097216. Throughput: 0: 55252.3. Samples: 274294660. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:42:04,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 18:42:05,046][54818] Updated weights for policy 0, policy_version 449778 (0.0029) [2024-04-27 18:42:08,283][54818] Updated weights for policy 0, policy_version 449788 (0.0031) [2024-04-27 18:42:09,254][54587] Fps is (10 sec: 54066.0, 60 sec: 55432.3, 300 sec: 55427.9). Total num frames: 7369375744. Throughput: 0: 55387.4. Samples: 274627620. 
Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:42:09,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 18:42:10,847][54818] Updated weights for policy 0, policy_version 449798 (0.0028) [2024-04-27 18:42:14,073][54818] Updated weights for policy 0, policy_version 449808 (0.0029) [2024-04-27 18:42:14,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7369670656. Throughput: 0: 55414.5. Samples: 274795820. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:42:14,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 18:42:16,915][54818] Updated weights for policy 0, policy_version 449818 (0.0025) [2024-04-27 18:42:19,253][54587] Fps is (10 sec: 54068.6, 60 sec: 54886.4, 300 sec: 55316.9). Total num frames: 7369916416. Throughput: 0: 55388.0. Samples: 275127140. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:42:19,253][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 18:42:20,005][54818] Updated weights for policy 0, policy_version 449828 (0.0027) [2024-04-27 18:42:22,889][54818] Updated weights for policy 0, policy_version 449838 (0.0027) [2024-04-27 18:42:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7370211328. Throughput: 0: 55443.9. Samples: 275462800. Policy #0 lag: (min: 2.0, avg: 11.4, max: 22.0) [2024-04-27 18:42:24,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 18:42:25,898][54818] Updated weights for policy 0, policy_version 449848 (0.0031) [2024-04-27 18:42:28,642][54818] Updated weights for policy 0, policy_version 449858 (0.0030) [2024-04-27 18:42:29,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7370489856. Throughput: 0: 55469.8. Samples: 275627800. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:42:29,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 18:42:31,779][54818] Updated weights for policy 0, policy_version 449868 (0.0032) [2024-04-27 18:42:34,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7370768384. Throughput: 0: 55475.4. Samples: 275959640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:42:34,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 18:42:34,394][54818] Updated weights for policy 0, policy_version 449878 (0.0031) [2024-04-27 18:42:37,603][54818] Updated weights for policy 0, policy_version 449888 (0.0030) [2024-04-27 18:42:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 7371046912. Throughput: 0: 55595.1. Samples: 276299220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:42:39,262][54587] Avg episode reward: [(0, '0.499')] [2024-04-27 18:42:40,272][54818] Updated weights for policy 0, policy_version 449898 (0.0027) [2024-04-27 18:42:43,382][54818] Updated weights for policy 0, policy_version 449908 (0.0032) [2024-04-27 18:42:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 7371341824. Throughput: 0: 55660.4. Samples: 276466880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:42:44,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 18:42:46,216][54818] Updated weights for policy 0, policy_version 449918 (0.0031) [2024-04-27 18:42:49,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55483.5). Total num frames: 7371603968. Throughput: 0: 55608.6. Samples: 276797040. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:42:49,253][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 18:42:49,306][54818] Updated weights for policy 0, policy_version 449928 (0.0028) [2024-04-27 18:42:52,088][54818] Updated weights for policy 0, policy_version 449938 (0.0029) [2024-04-27 18:42:54,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7371866112. Throughput: 0: 55721.1. Samples: 277135060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:42:54,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:42:55,181][54818] Updated weights for policy 0, policy_version 449948 (0.0027) [2024-04-27 18:42:55,594][54798] Signal inference workers to stop experience collection... (3950 times) [2024-04-27 18:42:55,595][54798] Signal inference workers to resume experience collection... (3950 times) [2024-04-27 18:42:55,616][54818] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-04-27 18:42:55,617][54818] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-04-27 18:42:58,168][54818] Updated weights for policy 0, policy_version 449958 (0.0028) [2024-04-27 18:42:59,253][54587] Fps is (10 sec: 55704.3, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 7372161024. Throughput: 0: 55688.8. Samples: 277301820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:42:59,263][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 18:43:01,077][54818] Updated weights for policy 0, policy_version 449968 (0.0029) [2024-04-27 18:43:03,990][54818] Updated weights for policy 0, policy_version 449978 (0.0026) [2024-04-27 18:43:04,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7372439552. Throughput: 0: 55675.9. Samples: 277632560. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:04,263][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 18:43:06,986][54818] Updated weights for policy 0, policy_version 449988 (0.0025) [2024-04-27 18:43:09,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7372718080. Throughput: 0: 55579.1. Samples: 277963860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:09,262][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:43:09,885][54818] Updated weights for policy 0, policy_version 449998 (0.0025) [2024-04-27 18:43:13,013][54818] Updated weights for policy 0, policy_version 450008 (0.0031) [2024-04-27 18:43:14,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7372980224. Throughput: 0: 55696.4. Samples: 278134140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:14,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 18:43:15,757][54818] Updated weights for policy 0, policy_version 450018 (0.0042) [2024-04-27 18:43:18,913][54818] Updated weights for policy 0, policy_version 450028 (0.0028) [2024-04-27 18:43:19,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55978.5, 300 sec: 55372.3). Total num frames: 7373275136. Throughput: 0: 55698.9. Samples: 278466100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:19,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 18:43:21,718][54818] Updated weights for policy 0, policy_version 450038 (0.0026) [2024-04-27 18:43:24,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 7373537280. Throughput: 0: 55609.8. Samples: 278801660. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:24,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 18:43:24,852][54818] Updated weights for policy 0, policy_version 450048 (0.0028) [2024-04-27 18:43:27,656][54818] Updated weights for policy 0, policy_version 450058 (0.0031) [2024-04-27 18:43:29,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7373815808. Throughput: 0: 55472.9. Samples: 278963160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:29,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 18:43:30,715][54818] Updated weights for policy 0, policy_version 450068 (0.0026) [2024-04-27 18:43:33,635][54818] Updated weights for policy 0, policy_version 450078 (0.0030) [2024-04-27 18:43:34,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55483.5). Total num frames: 7374110720. Throughput: 0: 55519.4. Samples: 279295420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:34,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 18:43:36,483][54818] Updated weights for policy 0, policy_version 450088 (0.0028) [2024-04-27 18:43:39,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7374372864. Throughput: 0: 55439.2. Samples: 279629820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:39,253][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 18:43:39,422][54818] Updated weights for policy 0, policy_version 450098 (0.0027) [2024-04-27 18:43:42,332][54818] Updated weights for policy 0, policy_version 450108 (0.0035) [2024-04-27 18:43:44,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7374667776. Throughput: 0: 55481.1. Samples: 279798460. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:44,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 18:43:45,333][54818] Updated weights for policy 0, policy_version 450118 (0.0034) [2024-04-27 18:43:48,202][54818] Updated weights for policy 0, policy_version 450128 (0.0035) [2024-04-27 18:43:49,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55483.5). Total num frames: 7374946304. Throughput: 0: 55614.2. Samples: 280135200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 18:43:49,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:43:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000450131_7374946304.pth... [2024-04-27 18:43:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000449317_7361609728.pth [2024-04-27 18:43:51,199][54818] Updated weights for policy 0, policy_version 450138 (0.0026) [2024-04-27 18:43:54,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 7375208448. Throughput: 0: 55666.3. Samples: 280468840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:43:54,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 18:43:54,298][54818] Updated weights for policy 0, policy_version 450148 (0.0027) [2024-04-27 18:43:57,029][54818] Updated weights for policy 0, policy_version 450158 (0.0025) [2024-04-27 18:43:59,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55159.6, 300 sec: 55316.8). Total num frames: 7375470592. Throughput: 0: 55352.0. Samples: 280624980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:43:59,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:44:00,134][54818] Updated weights for policy 0, policy_version 450168 (0.0027) [2024-04-27 18:44:01,381][54798] Signal inference workers to stop experience collection... (4000 times) [2024-04-27 18:44:01,382][54798] Signal inference workers to resume experience collection... 
(4000 times) [2024-04-27 18:44:01,399][54818] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-04-27 18:44:01,399][54818] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-04-27 18:44:02,919][54818] Updated weights for policy 0, policy_version 450178 (0.0032) [2024-04-27 18:44:04,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7375765504. Throughput: 0: 55446.8. Samples: 280961200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:04,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 18:44:05,937][54818] Updated weights for policy 0, policy_version 450188 (0.0030) [2024-04-27 18:44:08,857][54818] Updated weights for policy 0, policy_version 450198 (0.0026) [2024-04-27 18:44:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7376060416. Throughput: 0: 55507.0. Samples: 281299480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:09,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:44:11,852][54818] Updated weights for policy 0, policy_version 450208 (0.0027) [2024-04-27 18:44:14,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7376306176. Throughput: 0: 55603.2. Samples: 281465300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:14,253][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 18:44:14,827][54818] Updated weights for policy 0, policy_version 450218 (0.0029) [2024-04-27 18:44:17,634][54818] Updated weights for policy 0, policy_version 450228 (0.0026) [2024-04-27 18:44:19,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7376617472. Throughput: 0: 55510.2. Samples: 281793380. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:19,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 18:44:20,674][54818] Updated weights for policy 0, policy_version 450238 (0.0026) [2024-04-27 18:44:23,419][54818] Updated weights for policy 0, policy_version 450248 (0.0027) [2024-04-27 18:44:24,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7376879616. Throughput: 0: 55492.8. Samples: 282127000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:24,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 18:44:26,674][54818] Updated weights for policy 0, policy_version 450258 (0.0026) [2024-04-27 18:44:29,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.5, 300 sec: 55372.4). Total num frames: 7377158144. Throughput: 0: 55540.9. Samples: 282297800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:29,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 18:44:29,535][54818] Updated weights for policy 0, policy_version 450268 (0.0035) [2024-04-27 18:44:32,549][54818] Updated weights for policy 0, policy_version 450278 (0.0031) [2024-04-27 18:44:34,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 7377436672. Throughput: 0: 55522.0. Samples: 282633680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:34,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 18:44:35,353][54818] Updated weights for policy 0, policy_version 450288 (0.0035) [2024-04-27 18:44:38,522][54818] Updated weights for policy 0, policy_version 450298 (0.0026) [2024-04-27 18:44:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7377715200. Throughput: 0: 55419.6. Samples: 282962720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:39,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 18:44:41,257][54818] Updated weights for policy 0, policy_version 450308 (0.0032) [2024-04-27 18:44:44,253][54587] Fps is (10 sec: 55704.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7377993728. Throughput: 0: 55703.0. Samples: 283131620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:44,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-27 18:44:44,403][54818] Updated weights for policy 0, policy_version 450318 (0.0031) [2024-04-27 18:44:47,075][54818] Updated weights for policy 0, policy_version 450328 (0.0027) [2024-04-27 18:44:49,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7378272256. Throughput: 0: 55688.8. Samples: 283467200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:49,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 18:44:50,376][54818] Updated weights for policy 0, policy_version 450338 (0.0033) [2024-04-27 18:44:53,025][54818] Updated weights for policy 0, policy_version 450348 (0.0032) [2024-04-27 18:44:54,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7378550784. Throughput: 0: 55558.1. Samples: 283799600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 18:44:56,154][54818] Updated weights for policy 0, policy_version 450358 (0.0030) [2024-04-27 18:44:58,861][54818] Updated weights for policy 0, policy_version 450368 (0.0028) [2024-04-27 18:44:59,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55483.4). Total num frames: 7378829312. Throughput: 0: 55511.8. Samples: 283963340. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:44:59,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 18:45:02,208][54818] Updated weights for policy 0, policy_version 450378 (0.0024) [2024-04-27 18:45:04,253][54587] Fps is (10 sec: 55706.9, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7379107840. Throughput: 0: 55563.3. Samples: 284293720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:45:04,253][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 18:45:04,917][54818] Updated weights for policy 0, policy_version 450388 (0.0030) [2024-04-27 18:45:08,017][54798] Signal inference workers to stop experience collection... (4050 times) [2024-04-27 18:45:08,021][54798] Signal inference workers to resume experience collection... (4050 times) [2024-04-27 18:45:08,043][54818] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-04-27 18:45:08,044][54818] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-04-27 18:45:08,131][54818] Updated weights for policy 0, policy_version 450398 (0.0025) [2024-04-27 18:45:09,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7379386368. Throughput: 0: 55597.2. Samples: 284628880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 18:45:09,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:45:10,891][54818] Updated weights for policy 0, policy_version 450408 (0.0035) [2024-04-27 18:45:13,988][54818] Updated weights for policy 0, policy_version 450418 (0.0038) [2024-04-27 18:45:14,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 7379664896. Throughput: 0: 55423.6. Samples: 284791860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:14,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 18:45:16,610][54818] Updated weights for policy 0, policy_version 450428 (0.0025) [2024-04-27 18:45:19,253][54587] Fps is (10 sec: 52428.6, 60 sec: 54886.3, 300 sec: 55427.9). Total num frames: 7379910656. Throughput: 0: 55453.0. Samples: 285129080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:19,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 18:45:19,855][54818] Updated weights for policy 0, policy_version 450438 (0.0030) [2024-04-27 18:45:22,539][54818] Updated weights for policy 0, policy_version 450448 (0.0025) [2024-04-27 18:45:24,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 7380221952. Throughput: 0: 55500.9. Samples: 285460260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:24,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 18:45:25,687][54818] Updated weights for policy 0, policy_version 450458 (0.0033) [2024-04-27 18:45:28,645][54818] Updated weights for policy 0, policy_version 450468 (0.0036) [2024-04-27 18:45:29,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7380484096. Throughput: 0: 55403.6. Samples: 285624780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:29,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 18:45:31,645][54818] Updated weights for policy 0, policy_version 450478 (0.0032) [2024-04-27 18:45:34,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 7380779008. Throughput: 0: 55321.9. Samples: 285956680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:34,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 18:45:34,665][54818] Updated weights for policy 0, policy_version 450488 (0.0028) [2024-04-27 18:45:37,484][54818] Updated weights for policy 0, policy_version 450498 (0.0034) [2024-04-27 18:45:39,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7381041152. Throughput: 0: 55529.1. Samples: 286298400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:39,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 18:45:40,387][54818] Updated weights for policy 0, policy_version 450508 (0.0025) [2024-04-27 18:45:43,489][54818] Updated weights for policy 0, policy_version 450518 (0.0033) [2024-04-27 18:45:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7381336064. Throughput: 0: 55602.3. Samples: 286465440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:44,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 18:45:46,342][54818] Updated weights for policy 0, policy_version 450528 (0.0034) [2024-04-27 18:45:49,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7381598208. Throughput: 0: 55540.7. Samples: 286793060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:49,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 18:45:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000450537_7381598208.pth... [2024-04-27 18:45:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000449723_7368261632.pth [2024-04-27 18:45:49,468][54818] Updated weights for policy 0, policy_version 450538 (0.0026) [2024-04-27 18:45:52,467][54818] Updated weights for policy 0, policy_version 450548 (0.0025) [2024-04-27 18:45:54,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.6, 300 sec: 55539.0). 
Total num frames: 7381876736. Throughput: 0: 55452.1. Samples: 287124220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:54,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 18:45:55,221][54818] Updated weights for policy 0, policy_version 450558 (0.0028) [2024-04-27 18:45:58,179][54818] Updated weights for policy 0, policy_version 450568 (0.0027) [2024-04-27 18:45:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7382155264. Throughput: 0: 55579.1. Samples: 287292920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:45:59,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 18:46:01,189][54818] Updated weights for policy 0, policy_version 450578 (0.0028) [2024-04-27 18:46:03,862][54818] Updated weights for policy 0, policy_version 450588 (0.0031) [2024-04-27 18:46:04,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7382450176. Throughput: 0: 55666.4. Samples: 287634060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:46:04,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-27 18:46:06,775][54798] Signal inference workers to stop experience collection... (4100 times) [2024-04-27 18:46:06,813][54818] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-04-27 18:46:06,867][54798] Signal inference workers to resume experience collection... (4100 times) [2024-04-27 18:46:06,867][54818] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-04-27 18:46:06,991][54818] Updated weights for policy 0, policy_version 450598 (0.0029) [2024-04-27 18:46:09,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7382712320. Throughput: 0: 55614.2. Samples: 287962900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:46:09,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 18:46:09,959][54818] Updated weights for policy 0, policy_version 450608 (0.0028) [2024-04-27 18:46:12,899][54818] Updated weights for policy 0, policy_version 450618 (0.0025) [2024-04-27 18:46:14,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7383007232. Throughput: 0: 55675.7. Samples: 288130180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:46:14,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 18:46:15,794][54818] Updated weights for policy 0, policy_version 450628 (0.0033) [2024-04-27 18:46:18,738][54818] Updated weights for policy 0, policy_version 450638 (0.0026) [2024-04-27 18:46:19,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7383269376. Throughput: 0: 55735.5. Samples: 288464780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:46:19,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 18:46:21,567][54818] Updated weights for policy 0, policy_version 450648 (0.0032) [2024-04-27 18:46:24,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7383547904. Throughput: 0: 55700.1. Samples: 288804900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:46:24,253][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 18:46:24,764][54818] Updated weights for policy 0, policy_version 450658 (0.0031) [2024-04-27 18:46:27,504][54818] Updated weights for policy 0, policy_version 450668 (0.0030) [2024-04-27 18:46:29,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7383826432. Throughput: 0: 55578.6. Samples: 288966480. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:46:29,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 18:46:30,631][54818] Updated weights for policy 0, policy_version 450678 (0.0028) [2024-04-27 18:46:33,573][54818] Updated weights for policy 0, policy_version 450688 (0.0026) [2024-04-27 18:46:34,253][54587] Fps is (10 sec: 57342.7, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7384121344. Throughput: 0: 55686.6. Samples: 289298960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 18:46:34,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 18:46:36,528][54818] Updated weights for policy 0, policy_version 450698 (0.0033) [2024-04-27 18:46:39,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7384383488. Throughput: 0: 55743.2. Samples: 289632660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:46:39,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 18:46:39,279][54818] Updated weights for policy 0, policy_version 450708 (0.0029) [2024-04-27 18:46:42,517][54818] Updated weights for policy 0, policy_version 450718 (0.0029) [2024-04-27 18:46:44,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7384678400. Throughput: 0: 55731.0. Samples: 289800820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:46:44,254][54587] Avg episode reward: [(0, '0.507')] [2024-04-27 18:46:45,085][54818] Updated weights for policy 0, policy_version 450728 (0.0028) [2024-04-27 18:46:48,328][54818] Updated weights for policy 0, policy_version 450738 (0.0023) [2024-04-27 18:46:49,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7384924160. Throughput: 0: 55615.2. Samples: 290136740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:46:49,253][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 18:46:50,990][54818] Updated weights for policy 0, policy_version 450748 (0.0026) [2024-04-27 18:46:54,180][54818] Updated weights for policy 0, policy_version 450758 (0.0029) [2024-04-27 18:46:54,253][54587] Fps is (10 sec: 54068.2, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7385219072. Throughput: 0: 55716.1. Samples: 290470120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:46:54,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:46:56,940][54818] Updated weights for policy 0, policy_version 450768 (0.0029) [2024-04-27 18:46:59,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7385497600. Throughput: 0: 55583.8. Samples: 290631460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:46:59,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 18:47:00,144][54818] Updated weights for policy 0, policy_version 450778 (0.0024) [2024-04-27 18:47:02,916][54818] Updated weights for policy 0, policy_version 450788 (0.0031) [2024-04-27 18:47:04,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55594.6). Total num frames: 7385776128. Throughput: 0: 55610.7. Samples: 290967260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:04,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 18:47:06,074][54818] Updated weights for policy 0, policy_version 450798 (0.0036) [2024-04-27 18:47:08,667][54818] Updated weights for policy 0, policy_version 450808 (0.0028) [2024-04-27 18:47:09,254][54587] Fps is (10 sec: 57341.5, 60 sec: 55978.2, 300 sec: 55594.4). Total num frames: 7386071040. Throughput: 0: 55425.0. Samples: 291299060. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:09,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:47:11,858][54818] Updated weights for policy 0, policy_version 450818 (0.0030) [2024-04-27 18:47:14,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7386333184. Throughput: 0: 55745.0. Samples: 291475000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:14,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 18:47:14,720][54818] Updated weights for policy 0, policy_version 450828 (0.0025) [2024-04-27 18:47:17,851][54818] Updated weights for policy 0, policy_version 450838 (0.0032) [2024-04-27 18:47:19,253][54587] Fps is (10 sec: 54069.9, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7386611712. Throughput: 0: 55688.6. Samples: 291804940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:19,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 18:47:20,696][54818] Updated weights for policy 0, policy_version 450848 (0.0026) [2024-04-27 18:47:23,618][54818] Updated weights for policy 0, policy_version 450858 (0.0032) [2024-04-27 18:47:24,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7386873856. Throughput: 0: 55543.4. Samples: 292132120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:24,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 18:47:26,316][54798] Signal inference workers to stop experience collection... (4150 times) [2024-04-27 18:47:26,345][54818] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-04-27 18:47:26,372][54798] Signal inference workers to resume experience collection... 
(4150 times) [2024-04-27 18:47:26,377][54818] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-04-27 18:47:26,485][54818] Updated weights for policy 0, policy_version 450868 (0.0028) [2024-04-27 18:47:29,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7387152384. Throughput: 0: 55450.8. Samples: 292296100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:29,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 18:47:29,649][54818] Updated weights for policy 0, policy_version 450878 (0.0028) [2024-04-27 18:47:32,240][54818] Updated weights for policy 0, policy_version 450888 (0.0028) [2024-04-27 18:47:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7387447296. Throughput: 0: 55446.2. Samples: 292631820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:34,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 18:47:35,751][54818] Updated weights for policy 0, policy_version 450898 (0.0032) [2024-04-27 18:47:38,108][54818] Updated weights for policy 0, policy_version 450908 (0.0026) [2024-04-27 18:47:39,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7387725824. Throughput: 0: 55488.8. Samples: 292967120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:39,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 18:47:41,611][54818] Updated weights for policy 0, policy_version 450918 (0.0030) [2024-04-27 18:47:44,083][54818] Updated weights for policy 0, policy_version 450928 (0.0027) [2024-04-27 18:47:44,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7388004352. Throughput: 0: 55739.7. Samples: 293139740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:44,253][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 18:47:47,518][54818] Updated weights for policy 0, policy_version 450938 (0.0028) [2024-04-27 18:47:49,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 7388282880. Throughput: 0: 55608.9. Samples: 293469660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:49,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 18:47:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000450945_7388282880.pth... [2024-04-27 18:47:49,312][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000450131_7374946304.pth [2024-04-27 18:47:49,814][54818] Updated weights for policy 0, policy_version 450948 (0.0025) [2024-04-27 18:47:53,269][54818] Updated weights for policy 0, policy_version 450958 (0.0032) [2024-04-27 18:47:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 7388561408. Throughput: 0: 55665.6. Samples: 293803980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:54,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 18:47:55,924][54818] Updated weights for policy 0, policy_version 450968 (0.0033) [2024-04-27 18:47:59,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7388807168. Throughput: 0: 55345.3. Samples: 293965540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 18:47:59,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 18:47:59,295][54818] Updated weights for policy 0, policy_version 450978 (0.0028) [2024-04-27 18:48:01,717][54818] Updated weights for policy 0, policy_version 450988 (0.0033) [2024-04-27 18:48:04,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7389118464. Throughput: 0: 55382.6. Samples: 294297160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:04,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 18:48:05,142][54818] Updated weights for policy 0, policy_version 450998 (0.0028) [2024-04-27 18:48:07,587][54818] Updated weights for policy 0, policy_version 451008 (0.0027) [2024-04-27 18:48:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55159.9, 300 sec: 55594.5). Total num frames: 7389380608. Throughput: 0: 55532.5. Samples: 294631080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:09,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 18:48:10,827][54818] Updated weights for policy 0, policy_version 451018 (0.0024) [2024-04-27 18:48:13,704][54818] Updated weights for policy 0, policy_version 451028 (0.0029) [2024-04-27 18:48:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7389691904. Throughput: 0: 55702.7. Samples: 294802720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:14,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 18:48:16,885][54818] Updated weights for policy 0, policy_version 451038 (0.0031) [2024-04-27 18:48:19,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7389937664. Throughput: 0: 55649.3. Samples: 295136040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:19,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:48:19,470][54818] Updated weights for policy 0, policy_version 451048 (0.0028) [2024-04-27 18:48:22,805][54818] Updated weights for policy 0, policy_version 451058 (0.0027) [2024-04-27 18:48:24,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7390216192. Throughput: 0: 55602.6. Samples: 295469240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:24,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 18:48:25,421][54818] Updated weights for policy 0, policy_version 451068 (0.0031) [2024-04-27 18:48:28,555][54818] Updated weights for policy 0, policy_version 451078 (0.0030) [2024-04-27 18:48:29,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7390494720. Throughput: 0: 55462.6. Samples: 295635560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:29,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 18:48:31,211][54818] Updated weights for policy 0, policy_version 451088 (0.0025) [2024-04-27 18:48:34,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7390773248. Throughput: 0: 55546.5. Samples: 295969260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:34,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 18:48:34,489][54818] Updated weights for policy 0, policy_version 451098 (0.0030) [2024-04-27 18:48:37,205][54818] Updated weights for policy 0, policy_version 451108 (0.0030) [2024-04-27 18:48:39,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7391051776. Throughput: 0: 55523.1. Samples: 296302520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:39,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 18:48:40,341][54818] Updated weights for policy 0, policy_version 451118 (0.0028) [2024-04-27 18:48:42,940][54818] Updated weights for policy 0, policy_version 451128 (0.0027) [2024-04-27 18:48:44,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7391330304. Throughput: 0: 55636.5. Samples: 296469180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:44,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 18:48:46,163][54818] Updated weights for policy 0, policy_version 451138 (0.0029) [2024-04-27 18:48:48,748][54818] Updated weights for policy 0, policy_version 451148 (0.0028) [2024-04-27 18:48:49,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7391625216. Throughput: 0: 55652.0. Samples: 296801500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:49,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 18:48:51,949][54818] Updated weights for policy 0, policy_version 451158 (0.0027) [2024-04-27 18:48:53,144][54798] Signal inference workers to stop experience collection... (4200 times) [2024-04-27 18:48:53,149][54798] Signal inference workers to resume experience collection... (4200 times) [2024-04-27 18:48:53,167][54818] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-04-27 18:48:53,167][54818] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-04-27 18:48:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7391903744. Throughput: 0: 55561.3. Samples: 297131340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:54,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 18:48:54,786][54818] Updated weights for policy 0, policy_version 451168 (0.0040) [2024-04-27 18:48:58,270][54818] Updated weights for policy 0, policy_version 451178 (0.0029) [2024-04-27 18:48:59,254][54587] Fps is (10 sec: 52427.9, 60 sec: 55705.4, 300 sec: 55538.9). Total num frames: 7392149504. Throughput: 0: 55450.8. Samples: 297298020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:48:59,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 18:49:00,600][54818] Updated weights for policy 0, policy_version 451188 (0.0027) [2024-04-27 18:49:03,981][54818] Updated weights for policy 0, policy_version 451198 (0.0026) [2024-04-27 18:49:04,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7392428032. Throughput: 0: 55522.6. Samples: 297634560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:49:04,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 18:49:06,501][54818] Updated weights for policy 0, policy_version 451208 (0.0029) [2024-04-27 18:49:09,253][54587] Fps is (10 sec: 55707.2, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7392706560. Throughput: 0: 55421.5. Samples: 297963200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:49:09,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:49:09,852][54818] Updated weights for policy 0, policy_version 451218 (0.0028) [2024-04-27 18:49:12,569][54818] Updated weights for policy 0, policy_version 451228 (0.0026) [2024-04-27 18:49:14,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7393001472. Throughput: 0: 55384.1. Samples: 298127840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:49:14,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 18:49:15,746][54818] Updated weights for policy 0, policy_version 451238 (0.0031) [2024-04-27 18:49:18,606][54818] Updated weights for policy 0, policy_version 451248 (0.0024) [2024-04-27 18:49:19,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7393263616. Throughput: 0: 55359.7. Samples: 298460440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:49:19,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 18:49:21,861][54818] Updated weights for policy 0, policy_version 451258 (0.0028) [2024-04-27 18:49:24,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7393558528. Throughput: 0: 55181.4. Samples: 298785680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 18:49:24,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 18:49:24,420][54818] Updated weights for policy 0, policy_version 451268 (0.0030) [2024-04-27 18:49:27,886][54818] Updated weights for policy 0, policy_version 451278 (0.0030) [2024-04-27 18:49:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7393837056. Throughput: 0: 55343.9. Samples: 298959660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:49:29,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:49:30,152][54818] Updated weights for policy 0, policy_version 451288 (0.0032) [2024-04-27 18:49:33,712][54818] Updated weights for policy 0, policy_version 451298 (0.0027) [2024-04-27 18:49:34,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55159.6, 300 sec: 55483.4). Total num frames: 7394082816. Throughput: 0: 55429.4. Samples: 299295820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:49:34,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 18:49:36,088][54818] Updated weights for policy 0, policy_version 451308 (0.0032) [2024-04-27 18:49:39,253][54587] Fps is (10 sec: 50790.4, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 7394344960. Throughput: 0: 55428.9. Samples: 299625640. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:49:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:49:39,634][54818] Updated weights for policy 0, policy_version 451318 (0.0024) [2024-04-27 18:49:42,082][54818] Updated weights for policy 0, policy_version 451328 (0.0030) [2024-04-27 18:49:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7394656256. Throughput: 0: 55267.8. Samples: 299785060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:49:44,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 18:49:45,504][54818] Updated weights for policy 0, policy_version 451338 (0.0029) [2024-04-27 18:49:47,848][54818] Updated weights for policy 0, policy_version 451348 (0.0028) [2024-04-27 18:49:49,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7394934784. Throughput: 0: 55186.6. Samples: 300117960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:49:49,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 18:49:49,290][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000451352_7394951168.pth... [2024-04-27 18:49:49,336][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000450537_7381598208.pth [2024-04-27 18:49:51,475][54818] Updated weights for policy 0, policy_version 451358 (0.0027) [2024-04-27 18:49:53,900][54818] Updated weights for policy 0, policy_version 451368 (0.0027) [2024-04-27 18:49:54,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7395213312. Throughput: 0: 55431.5. Samples: 300457620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:49:54,254][54587] Avg episode reward: [(0, '0.683')] [2024-04-27 18:49:57,368][54818] Updated weights for policy 0, policy_version 451378 (0.0038) [2024-04-27 18:49:58,303][54798] Signal inference workers to stop experience collection... 
(4250 times) [2024-04-27 18:49:58,304][54798] Signal inference workers to resume experience collection... (4250 times) [2024-04-27 18:49:58,315][54818] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-04-27 18:49:58,315][54818] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-04-27 18:49:59,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 7395508224. Throughput: 0: 55496.7. Samples: 300625200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:49:59,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 18:49:59,780][54818] Updated weights for policy 0, policy_version 451388 (0.0028) [2024-04-27 18:50:03,145][54818] Updated weights for policy 0, policy_version 451398 (0.0028) [2024-04-27 18:50:04,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7395753984. Throughput: 0: 55527.1. Samples: 300959160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:50:04,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 18:50:05,508][54818] Updated weights for policy 0, policy_version 451408 (0.0029) [2024-04-27 18:50:09,047][54818] Updated weights for policy 0, policy_version 451418 (0.0029) [2024-04-27 18:50:09,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 7396032512. Throughput: 0: 55742.6. Samples: 301294100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:50:09,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 18:50:11,303][54818] Updated weights for policy 0, policy_version 451428 (0.0027) [2024-04-27 18:50:14,253][54587] Fps is (10 sec: 54066.7, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 7396294656. Throughput: 0: 55459.9. Samples: 301455360. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:50:14,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 18:50:14,957][54818] Updated weights for policy 0, policy_version 451438 (0.0028) [2024-04-27 18:50:17,260][54818] Updated weights for policy 0, policy_version 451448 (0.0033) [2024-04-27 18:50:19,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7396605952. Throughput: 0: 55360.4. Samples: 301787040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:50:19,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 18:50:21,070][54818] Updated weights for policy 0, policy_version 451458 (0.0030) [2024-04-27 18:50:23,196][54818] Updated weights for policy 0, policy_version 451468 (0.0036) [2024-04-27 18:50:24,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7396868096. Throughput: 0: 55393.9. Samples: 302118360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:50:24,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 18:50:26,968][54818] Updated weights for policy 0, policy_version 451478 (0.0025) [2024-04-27 18:50:29,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7397163008. Throughput: 0: 55673.8. Samples: 302290380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:50:29,254][54587] Avg episode reward: [(0, '0.672')] [2024-04-27 18:50:29,462][54818] Updated weights for policy 0, policy_version 451488 (0.0030) [2024-04-27 18:50:32,740][54818] Updated weights for policy 0, policy_version 451498 (0.0028) [2024-04-27 18:50:34,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7397441536. Throughput: 0: 55733.5. Samples: 302625960. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:50:34,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 18:50:35,153][54818] Updated weights for policy 0, policy_version 451508 (0.0038) [2024-04-27 18:50:38,632][54818] Updated weights for policy 0, policy_version 451518 (0.0025) [2024-04-27 18:50:39,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 7397687296. Throughput: 0: 55638.2. Samples: 302961340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:50:39,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 18:50:41,249][54818] Updated weights for policy 0, policy_version 451528 (0.0029) [2024-04-27 18:50:44,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7397965824. Throughput: 0: 55541.0. Samples: 303124540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 18:50:44,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 18:50:44,537][54818] Updated weights for policy 0, policy_version 451538 (0.0028) [2024-04-27 18:50:44,951][54798] Signal inference workers to stop experience collection... (4300 times) [2024-04-27 18:50:44,993][54818] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-04-27 18:50:45,000][54798] Signal inference workers to resume experience collection... (4300 times) [2024-04-27 18:50:45,007][54818] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-04-27 18:50:47,185][54818] Updated weights for policy 0, policy_version 451548 (0.0030) [2024-04-27 18:50:49,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7398260736. Throughput: 0: 55656.3. Samples: 303463700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:50:49,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 18:50:50,312][54818] Updated weights for policy 0, policy_version 451558 (0.0033) [2024-04-27 18:50:52,927][54818] Updated weights for policy 0, policy_version 451568 (0.0027) [2024-04-27 18:50:54,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7398539264. Throughput: 0: 55640.8. Samples: 303797940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:50:54,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 18:50:56,272][54818] Updated weights for policy 0, policy_version 451578 (0.0031) [2024-04-27 18:50:58,804][54818] Updated weights for policy 0, policy_version 451588 (0.0030) [2024-04-27 18:50:59,253][54587] Fps is (10 sec: 55706.6, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 7398817792. Throughput: 0: 55740.2. Samples: 303963660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:50:59,253][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 18:51:02,092][54818] Updated weights for policy 0, policy_version 451598 (0.0034) [2024-04-27 18:51:04,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7399112704. Throughput: 0: 55744.5. Samples: 304295540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:04,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 18:51:04,722][54818] Updated weights for policy 0, policy_version 451608 (0.0030) [2024-04-27 18:51:08,067][54818] Updated weights for policy 0, policy_version 451618 (0.0031) [2024-04-27 18:51:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7399391232. Throughput: 0: 55736.0. Samples: 304626480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:09,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 18:51:10,842][54818] Updated weights for policy 0, policy_version 451628 (0.0030) [2024-04-27 18:51:13,905][54818] Updated weights for policy 0, policy_version 451638 (0.0032) [2024-04-27 18:51:14,254][54587] Fps is (10 sec: 55704.2, 60 sec: 56251.5, 300 sec: 55594.5). Total num frames: 7399669760. Throughput: 0: 55700.1. Samples: 304796900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:14,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 18:51:16,693][54818] Updated weights for policy 0, policy_version 451648 (0.0026) [2024-04-27 18:51:19,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7399915520. Throughput: 0: 55673.7. Samples: 305131280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:19,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 18:51:19,596][54818] Updated weights for policy 0, policy_version 451658 (0.0027) [2024-04-27 18:51:22,519][54818] Updated weights for policy 0, policy_version 451668 (0.0028) [2024-04-27 18:51:24,253][54587] Fps is (10 sec: 54068.4, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7400210432. Throughput: 0: 55743.0. Samples: 305469780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:24,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:51:25,535][54818] Updated weights for policy 0, policy_version 451678 (0.0029) [2024-04-27 18:51:28,530][54818] Updated weights for policy 0, policy_version 451688 (0.0030) [2024-04-27 18:51:29,253][54587] Fps is (10 sec: 58981.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7400505344. Throughput: 0: 55661.2. Samples: 305629300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:29,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 18:51:31,485][54818] Updated weights for policy 0, policy_version 451698 (0.0031) [2024-04-27 18:51:34,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7400767488. Throughput: 0: 55602.3. Samples: 305965800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:34,255][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:51:34,365][54818] Updated weights for policy 0, policy_version 451708 (0.0025) [2024-04-27 18:51:37,127][54818] Updated weights for policy 0, policy_version 451718 (0.0027) [2024-04-27 18:51:39,253][54587] Fps is (10 sec: 57344.6, 60 sec: 56524.7, 300 sec: 55594.5). Total num frames: 7401078784. Throughput: 0: 55600.5. Samples: 306299960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:39,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 18:51:40,141][54818] Updated weights for policy 0, policy_version 451728 (0.0031) [2024-04-27 18:51:42,886][54818] Updated weights for policy 0, policy_version 451738 (0.0029) [2024-04-27 18:51:44,253][54587] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 7401340928. Throughput: 0: 55780.7. Samples: 306473800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:44,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 18:51:45,850][54818] Updated weights for policy 0, policy_version 451748 (0.0028) [2024-04-27 18:51:49,070][54818] Updated weights for policy 0, policy_version 451758 (0.0030) [2024-04-27 18:51:49,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 7401619456. Throughput: 0: 55953.8. Samples: 306813460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:49,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 18:51:49,387][54798] Signal inference workers to stop experience collection... (4350 times) [2024-04-27 18:51:49,388][54798] Signal inference workers to resume experience collection... (4350 times) [2024-04-27 18:51:49,389][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000451760_7401635840.pth... [2024-04-27 18:51:49,400][54818] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-04-27 18:51:49,400][54818] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-04-27 18:51:49,439][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000450945_7388282880.pth [2024-04-27 18:51:51,760][54818] Updated weights for policy 0, policy_version 451768 (0.0029) [2024-04-27 18:51:54,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 7401881600. Throughput: 0: 55894.3. Samples: 307141720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:54,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 18:51:54,933][54818] Updated weights for policy 0, policy_version 451778 (0.0038) [2024-04-27 18:51:57,820][54818] Updated weights for policy 0, policy_version 451788 (0.0032) [2024-04-27 18:51:59,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7402176512. Throughput: 0: 55809.2. Samples: 307308300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:51:59,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 18:52:00,674][54818] Updated weights for policy 0, policy_version 451798 (0.0031) [2024-04-27 18:52:03,619][54818] Updated weights for policy 0, policy_version 451808 (0.0027) [2024-04-27 18:52:04,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7402438656. Throughput: 0: 55827.1. Samples: 307643500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:52:04,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 18:52:06,509][54818] Updated weights for policy 0, policy_version 451818 (0.0023) [2024-04-27 18:52:09,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7402733568. Throughput: 0: 55722.7. Samples: 307977300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:52:09,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 18:52:09,800][54818] Updated weights for policy 0, policy_version 451828 (0.0027) [2024-04-27 18:52:12,623][54818] Updated weights for policy 0, policy_version 451838 (0.0033) [2024-04-27 18:52:14,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7402995712. Throughput: 0: 55841.9. Samples: 308142180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:14,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 18:52:15,593][54818] Updated weights for policy 0, policy_version 451848 (0.0028) [2024-04-27 18:52:18,511][54818] Updated weights for policy 0, policy_version 451858 (0.0025) [2024-04-27 18:52:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56524.7, 300 sec: 55705.6). Total num frames: 7403307008. Throughput: 0: 55816.5. Samples: 308477540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:19,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 18:52:21,602][54818] Updated weights for policy 0, policy_version 451868 (0.0029) [2024-04-27 18:52:24,241][54818] Updated weights for policy 0, policy_version 451878 (0.0025) [2024-04-27 18:52:24,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7403569152. Throughput: 0: 55743.7. Samples: 308808420. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:24,253][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:52:27,438][54818] Updated weights for policy 0, policy_version 451888 (0.0027) [2024-04-27 18:52:29,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7403831296. Throughput: 0: 55502.3. Samples: 308971400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:29,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 18:52:30,255][54818] Updated weights for policy 0, policy_version 451898 (0.0025) [2024-04-27 18:52:33,244][54818] Updated weights for policy 0, policy_version 451908 (0.0034) [2024-04-27 18:52:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7404109824. Throughput: 0: 55362.3. Samples: 309304760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:34,253][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 18:52:36,099][54818] Updated weights for policy 0, policy_version 451918 (0.0029) [2024-04-27 18:52:39,026][54818] Updated weights for policy 0, policy_version 451928 (0.0029) [2024-04-27 18:52:39,253][54587] Fps is (10 sec: 55704.6, 60 sec: 55159.4, 300 sec: 55538.9). Total num frames: 7404388352. Throughput: 0: 55507.2. Samples: 309639560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:39,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 18:52:41,987][54818] Updated weights for policy 0, policy_version 451938 (0.0024) [2024-04-27 18:52:44,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 7404650496. Throughput: 0: 55502.9. Samples: 309805920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:44,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 18:52:44,972][54818] Updated weights for policy 0, policy_version 451948 (0.0028) [2024-04-27 18:52:47,782][54818] Updated weights for policy 0, policy_version 451958 (0.0032) [2024-04-27 18:52:49,253][54587] Fps is (10 sec: 55706.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7404945408. Throughput: 0: 55418.7. Samples: 310137340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:49,254][54587] Avg episode reward: [(0, '0.694')] [2024-04-27 18:52:50,754][54798] Signal inference workers to stop experience collection... (4400 times) [2024-04-27 18:52:50,788][54818] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-04-27 18:52:50,850][54798] Signal inference workers to resume experience collection... (4400 times) [2024-04-27 18:52:50,850][54818] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-04-27 18:52:50,957][54818] Updated weights for policy 0, policy_version 451968 (0.0025) [2024-04-27 18:52:53,565][54818] Updated weights for policy 0, policy_version 451978 (0.0026) [2024-04-27 18:52:54,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7405240320. Throughput: 0: 55287.7. Samples: 310465240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:54,253][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 18:52:56,879][54818] Updated weights for policy 0, policy_version 451988 (0.0031) [2024-04-27 18:52:59,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7405502464. Throughput: 0: 55441.0. Samples: 310637020. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:52:59,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 18:52:59,620][54818] Updated weights for policy 0, policy_version 451998 (0.0023) [2024-04-27 18:53:02,815][54818] Updated weights for policy 0, policy_version 452008 (0.0037) [2024-04-27 18:53:04,253][54587] Fps is (10 sec: 50789.5, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7405748224. Throughput: 0: 55387.4. Samples: 310969980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:53:04,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:53:05,444][54818] Updated weights for policy 0, policy_version 452018 (0.0031) [2024-04-27 18:53:08,604][54818] Updated weights for policy 0, policy_version 452028 (0.0029) [2024-04-27 18:53:09,253][54587] Fps is (10 sec: 54066.2, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 7406043136. Throughput: 0: 55311.3. Samples: 311297440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:53:09,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 18:53:11,294][54818] Updated weights for policy 0, policy_version 452038 (0.0032) [2024-04-27 18:53:14,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7406321664. Throughput: 0: 55404.0. Samples: 311464580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:53:14,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 18:53:14,443][54818] Updated weights for policy 0, policy_version 452048 (0.0029) [2024-04-27 18:53:17,220][54818] Updated weights for policy 0, policy_version 452058 (0.0031) [2024-04-27 18:53:19,253][54587] Fps is (10 sec: 55706.5, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 7406600192. Throughput: 0: 55414.1. Samples: 311798400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:53:19,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:53:20,277][54818] Updated weights for policy 0, policy_version 452068 (0.0022) [2024-04-27 18:53:23,287][54818] Updated weights for policy 0, policy_version 452078 (0.0029) [2024-04-27 18:53:24,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7406895104. Throughput: 0: 55333.9. Samples: 312129580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:53:24,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 18:53:26,352][54818] Updated weights for policy 0, policy_version 452088 (0.0032) [2024-04-27 18:53:29,147][54818] Updated weights for policy 0, policy_version 452098 (0.0029) [2024-04-27 18:53:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 7407173632. Throughput: 0: 55291.0. Samples: 312294020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:53:29,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 18:53:32,440][54818] Updated weights for policy 0, policy_version 452108 (0.0027) [2024-04-27 18:53:34,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7407452160. Throughput: 0: 55416.5. Samples: 312631080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 18:53:34,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 18:53:34,874][54818] Updated weights for policy 0, policy_version 452118 (0.0026) [2024-04-27 18:53:38,292][54818] Updated weights for policy 0, policy_version 452128 (0.0026) [2024-04-27 18:53:39,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.6, 300 sec: 55483.4). Total num frames: 7407697920. Throughput: 0: 55627.0. Samples: 312968460. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:53:39,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 18:53:40,722][54818] Updated weights for policy 0, policy_version 452138 (0.0031) [2024-04-27 18:53:44,174][54818] Updated weights for policy 0, policy_version 452148 (0.0030) [2024-04-27 18:53:44,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 7407992832. Throughput: 0: 55295.5. Samples: 313125320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:53:44,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 18:53:46,672][54818] Updated weights for policy 0, policy_version 452158 (0.0026) [2024-04-27 18:53:49,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7408271360. Throughput: 0: 55336.1. Samples: 313460100. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:53:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 18:53:49,261][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000452165_7408271360.pth... [2024-04-27 18:53:49,310][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000451352_7394951168.pth [2024-04-27 18:53:50,001][54818] Updated weights for policy 0, policy_version 452168 (0.0028) [2024-04-27 18:53:52,966][54818] Updated weights for policy 0, policy_version 452178 (0.0035) [2024-04-27 18:53:54,253][54587] Fps is (10 sec: 54067.2, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 7408533504. Throughput: 0: 55416.1. Samples: 313791160. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:53:54,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 18:53:55,822][54818] Updated weights for policy 0, policy_version 452188 (0.0032) [2024-04-27 18:53:58,663][54818] Updated weights for policy 0, policy_version 452198 (0.0031) [2024-04-27 18:53:59,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55594.5). 
Total num frames: 7408828416. Throughput: 0: 55605.3. Samples: 313966820. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:53:59,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 18:54:01,692][54818] Updated weights for policy 0, policy_version 452208 (0.0034) [2024-04-27 18:54:02,778][54798] Signal inference workers to stop experience collection... (4450 times) [2024-04-27 18:54:02,807][54818] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-04-27 18:54:02,875][54798] Signal inference workers to resume experience collection... (4450 times) [2024-04-27 18:54:02,876][54818] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-04-27 18:54:04,253][54587] Fps is (10 sec: 58982.7, 60 sec: 56251.8, 300 sec: 55650.0). Total num frames: 7409123328. Throughput: 0: 55575.5. Samples: 314299300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:04,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 18:54:04,381][54818] Updated weights for policy 0, policy_version 452218 (0.0031) [2024-04-27 18:54:07,542][54818] Updated weights for policy 0, policy_version 452228 (0.0025) [2024-04-27 18:54:09,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 7409385472. Throughput: 0: 55565.5. Samples: 314630020. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:09,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:54:10,274][54818] Updated weights for policy 0, policy_version 452238 (0.0031) [2024-04-27 18:54:13,821][54818] Updated weights for policy 0, policy_version 452248 (0.0025) [2024-04-27 18:54:14,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7409647616. Throughput: 0: 55781.8. Samples: 314804200. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:14,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 18:54:16,127][54818] Updated weights for policy 0, policy_version 452258 (0.0036) [2024-04-27 18:54:19,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7409942528. Throughput: 0: 55659.9. Samples: 315135780. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:19,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 18:54:19,732][54818] Updated weights for policy 0, policy_version 452268 (0.0033) [2024-04-27 18:54:21,836][54818] Updated weights for policy 0, policy_version 452278 (0.0037) [2024-04-27 18:54:24,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 7410204672. Throughput: 0: 55463.7. Samples: 315464320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:24,253][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 18:54:25,492][54818] Updated weights for policy 0, policy_version 452288 (0.0029) [2024-04-27 18:54:27,787][54818] Updated weights for policy 0, policy_version 452298 (0.0031) [2024-04-27 18:54:29,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7410483200. Throughput: 0: 55751.7. Samples: 315634140. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:29,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 18:54:31,413][54818] Updated weights for policy 0, policy_version 452308 (0.0029) [2024-04-27 18:54:34,238][54818] Updated weights for policy 0, policy_version 452318 (0.0031) [2024-04-27 18:54:34,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7410778112. Throughput: 0: 55674.6. Samples: 315965460. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:34,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 18:54:37,291][54818] Updated weights for policy 0, policy_version 452328 (0.0027) [2024-04-27 18:54:39,253][54587] Fps is (10 sec: 58982.2, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 7411073024. Throughput: 0: 55691.7. Samples: 316297280. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:39,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 18:54:40,013][54818] Updated weights for policy 0, policy_version 452338 (0.0026) [2024-04-27 18:54:43,025][54818] Updated weights for policy 0, policy_version 452348 (0.0029) [2024-04-27 18:54:44,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55705.8, 300 sec: 55594.6). Total num frames: 7411335168. Throughput: 0: 55623.3. Samples: 316469860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:44,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 18:54:45,840][54818] Updated weights for policy 0, policy_version 452358 (0.0028) [2024-04-27 18:54:48,773][54818] Updated weights for policy 0, policy_version 452368 (0.0027) [2024-04-27 18:54:49,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7411613696. Throughput: 0: 55830.8. Samples: 316811680. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:49,253][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 18:54:51,528][54818] Updated weights for policy 0, policy_version 452378 (0.0030) [2024-04-27 18:54:54,253][54587] Fps is (10 sec: 52428.4, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7411859456. Throughput: 0: 55826.2. Samples: 317142200. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-27 18:54:54,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 18:54:54,704][54818] Updated weights for policy 0, policy_version 452388 (0.0031) [2024-04-27 18:54:57,458][54818] Updated weights for policy 0, policy_version 452398 (0.0026) [2024-04-27 18:54:59,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7412137984. Throughput: 0: 55424.9. Samples: 317298320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:54:59,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 18:55:00,668][54818] Updated weights for policy 0, policy_version 452408 (0.0033) [2024-04-27 18:55:03,395][54818] Updated weights for policy 0, policy_version 452418 (0.0029) [2024-04-27 18:55:04,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7412432896. Throughput: 0: 55493.7. Samples: 317633000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:04,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 18:55:06,435][54818] Updated weights for policy 0, policy_version 452428 (0.0030) [2024-04-27 18:55:07,196][54798] Signal inference workers to stop experience collection... (4500 times) [2024-04-27 18:55:07,217][54818] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-04-27 18:55:07,256][54798] Signal inference workers to resume experience collection... (4500 times) [2024-04-27 18:55:07,256][54818] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-04-27 18:55:09,231][54818] Updated weights for policy 0, policy_version 452438 (0.0025) [2024-04-27 18:55:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 7412744192. Throughput: 0: 55615.5. Samples: 317967020. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:09,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 18:55:12,253][54818] Updated weights for policy 0, policy_version 452448 (0.0030) [2024-04-27 18:55:14,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7413006336. Throughput: 0: 55724.8. Samples: 318141760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:14,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 18:55:15,390][54818] Updated weights for policy 0, policy_version 452458 (0.0034) [2024-04-27 18:55:18,165][54818] Updated weights for policy 0, policy_version 452468 (0.0031) [2024-04-27 18:55:19,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7413284864. Throughput: 0: 55747.2. Samples: 318474080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:19,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 18:55:21,517][54818] Updated weights for policy 0, policy_version 452478 (0.0025) [2024-04-27 18:55:24,156][54818] Updated weights for policy 0, policy_version 452488 (0.0025) [2024-04-27 18:55:24,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7413563392. Throughput: 0: 55533.6. Samples: 318796300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:24,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 18:55:27,301][54818] Updated weights for policy 0, policy_version 452498 (0.0032) [2024-04-27 18:55:29,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7413809152. Throughput: 0: 55336.8. Samples: 318960020. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:29,253][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 18:55:29,919][54818] Updated weights for policy 0, policy_version 452508 (0.0024) [2024-04-27 18:55:33,157][54818] Updated weights for policy 0, policy_version 452518 (0.0034) [2024-04-27 18:55:34,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7414087680. Throughput: 0: 55235.0. Samples: 319297260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:34,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 18:55:36,081][54818] Updated weights for policy 0, policy_version 452528 (0.0031) [2024-04-27 18:55:39,147][54818] Updated weights for policy 0, policy_version 452538 (0.0033) [2024-04-27 18:55:39,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7414382592. Throughput: 0: 55283.9. Samples: 319629980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:39,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 18:55:42,157][54818] Updated weights for policy 0, policy_version 452548 (0.0026) [2024-04-27 18:55:44,253][54587] Fps is (10 sec: 58982.7, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7414677504. Throughput: 0: 55544.0. Samples: 319797800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:44,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 18:55:44,952][54818] Updated weights for policy 0, policy_version 452558 (0.0026) [2024-04-27 18:55:47,966][54818] Updated weights for policy 0, policy_version 452568 (0.0028) [2024-04-27 18:55:49,254][54587] Fps is (10 sec: 57343.1, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 7414956032. Throughput: 0: 55588.7. Samples: 320134500. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:49,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 18:55:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000452573_7414956032.pth... [2024-04-27 18:55:49,323][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000451760_7401635840.pth [2024-04-27 18:55:50,737][54818] Updated weights for policy 0, policy_version 452578 (0.0028) [2024-04-27 18:55:53,883][54818] Updated weights for policy 0, policy_version 452588 (0.0027) [2024-04-27 18:55:54,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7415218176. Throughput: 0: 55500.0. Samples: 320464520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:54,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 18:55:56,185][54798] Signal inference workers to stop experience collection... (4550 times) [2024-04-27 18:55:56,188][54798] Signal inference workers to resume experience collection... (4550 times) [2024-04-27 18:55:56,216][54818] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-04-27 18:55:56,221][54818] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-04-27 18:55:56,870][54818] Updated weights for policy 0, policy_version 452598 (0.0025) [2024-04-27 18:55:59,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7415496704. Throughput: 0: 55451.5. Samples: 320637080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:55:59,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 18:55:59,615][54818] Updated weights for policy 0, policy_version 452608 (0.0028) [2024-04-27 18:56:02,949][54818] Updated weights for policy 0, policy_version 452618 (0.0030) [2024-04-27 18:56:04,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7415775232. Throughput: 0: 55553.4. Samples: 320973980. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:56:04,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 18:56:05,536][54818] Updated weights for policy 0, policy_version 452628 (0.0030) [2024-04-27 18:56:08,862][54818] Updated weights for policy 0, policy_version 452638 (0.0032) [2024-04-27 18:56:09,253][54587] Fps is (10 sec: 54066.8, 60 sec: 54886.3, 300 sec: 55483.5). Total num frames: 7416037376. Throughput: 0: 55711.1. Samples: 321303300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:56:09,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 18:56:11,323][54818] Updated weights for policy 0, policy_version 452648 (0.0028) [2024-04-27 18:56:14,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7416315904. Throughput: 0: 55633.8. Samples: 321463540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:56:14,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 18:56:14,810][54818] Updated weights for policy 0, policy_version 452658 (0.0027) [2024-04-27 18:56:17,273][54818] Updated weights for policy 0, policy_version 452668 (0.0025) [2024-04-27 18:56:19,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7416610816. Throughput: 0: 55498.2. Samples: 321794680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 18:56:19,254][54587] Avg episode reward: [(0, '0.700')] [2024-04-27 18:56:20,527][54818] Updated weights for policy 0, policy_version 452678 (0.0027) [2024-04-27 18:56:23,177][54818] Updated weights for policy 0, policy_version 452688 (0.0033) [2024-04-27 18:56:24,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7416889344. Throughput: 0: 55671.7. Samples: 322135200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:56:24,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 18:56:26,321][54818] Updated weights for policy 0, policy_version 452698 (0.0029) [2024-04-27 18:56:28,989][54818] Updated weights for policy 0, policy_version 452708 (0.0031) [2024-04-27 18:56:29,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7417167872. Throughput: 0: 55657.1. Samples: 322302380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:56:29,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 18:56:32,168][54818] Updated weights for policy 0, policy_version 452718 (0.0031) [2024-04-27 18:56:34,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55978.6, 300 sec: 55483.5). Total num frames: 7417446400. Throughput: 0: 55563.3. Samples: 322634840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:56:34,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 18:56:34,935][54818] Updated weights for policy 0, policy_version 452728 (0.0036) [2024-04-27 18:56:38,214][54818] Updated weights for policy 0, policy_version 452738 (0.0031) [2024-04-27 18:56:39,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7417724928. Throughput: 0: 55706.2. Samples: 322971300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:56:39,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 18:56:40,846][54818] Updated weights for policy 0, policy_version 452748 (0.0031) [2024-04-27 18:56:44,155][54818] Updated weights for policy 0, policy_version 452758 (0.0036) [2024-04-27 18:56:44,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 7417987072. Throughput: 0: 55490.8. Samples: 323134160. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:56:44,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 18:56:46,597][54818] Updated weights for policy 0, policy_version 452768 (0.0027) [2024-04-27 18:56:49,253][54587] Fps is (10 sec: 52429.0, 60 sec: 54886.6, 300 sec: 55483.4). Total num frames: 7418249216. Throughput: 0: 55438.6. Samples: 323468720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:56:49,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 18:56:49,935][54818] Updated weights for policy 0, policy_version 452778 (0.0026) [2024-04-27 18:56:52,521][54818] Updated weights for policy 0, policy_version 452788 (0.0023) [2024-04-27 18:56:54,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7418560512. Throughput: 0: 55481.2. Samples: 323799940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:56:54,253][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 18:56:55,844][54818] Updated weights for policy 0, policy_version 452798 (0.0030) [2024-04-27 18:56:58,431][54818] Updated weights for policy 0, policy_version 452808 (0.0025) [2024-04-27 18:56:59,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7418839040. Throughput: 0: 55779.9. Samples: 323973640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:56:59,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 18:57:01,589][54818] Updated weights for policy 0, policy_version 452818 (0.0026) [2024-04-27 18:57:04,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7419117568. Throughput: 0: 55844.5. Samples: 324307680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:57:04,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 18:57:04,381][54818] Updated weights for policy 0, policy_version 452828 (0.0032) [2024-04-27 18:57:07,360][54818] Updated weights for policy 0, policy_version 452838 (0.0030) [2024-04-27 18:57:08,485][54798] Signal inference workers to stop experience collection... (4600 times) [2024-04-27 18:57:08,529][54818] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-04-27 18:57:08,541][54798] Signal inference workers to resume experience collection... (4600 times) [2024-04-27 18:57:08,547][54818] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-04-27 18:57:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 56251.9, 300 sec: 55650.1). Total num frames: 7419412480. Throughput: 0: 55640.3. Samples: 324639020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:57:09,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 18:57:10,285][54818] Updated weights for policy 0, policy_version 452848 (0.0031) [2024-04-27 18:57:13,325][54818] Updated weights for policy 0, policy_version 452858 (0.0034) [2024-04-27 18:57:14,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55978.5, 300 sec: 55483.4). Total num frames: 7419674624. Throughput: 0: 55695.1. Samples: 324808660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:57:14,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 18:57:16,160][54818] Updated weights for policy 0, policy_version 452868 (0.0029) [2024-04-27 18:57:19,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7419936768. Throughput: 0: 55722.7. Samples: 325142360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:57:19,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 18:57:19,327][54818] Updated weights for policy 0, policy_version 452878 (0.0032) [2024-04-27 18:57:21,912][54818] Updated weights for policy 0, policy_version 452888 (0.0024) [2024-04-27 18:57:24,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7420198912. Throughput: 0: 55705.8. Samples: 325478060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:57:24,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 18:57:25,390][54818] Updated weights for policy 0, policy_version 452898 (0.0032) [2024-04-27 18:57:27,792][54818] Updated weights for policy 0, policy_version 452908 (0.0032) [2024-04-27 18:57:29,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7420493824. Throughput: 0: 55586.2. Samples: 325635540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:57:29,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 18:57:31,277][54818] Updated weights for policy 0, policy_version 452918 (0.0031) [2024-04-27 18:57:33,660][54818] Updated weights for policy 0, policy_version 452928 (0.0025) [2024-04-27 18:57:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7420772352. Throughput: 0: 55629.4. Samples: 325972040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:57:34,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 18:57:37,295][54818] Updated weights for policy 0, policy_version 452938 (0.0031) [2024-04-27 18:57:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7421050880. Throughput: 0: 55686.6. Samples: 326305840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:57:39,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 18:57:39,598][54818] Updated weights for policy 0, policy_version 452948 (0.0027) [2024-04-27 18:57:43,083][54818] Updated weights for policy 0, policy_version 452958 (0.0028) [2024-04-27 18:57:44,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7421345792. Throughput: 0: 55645.3. Samples: 326477680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 18:57:44,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 18:57:45,672][54818] Updated weights for policy 0, policy_version 452968 (0.0033) [2024-04-27 18:57:48,847][54818] Updated weights for policy 0, policy_version 452978 (0.0029) [2024-04-27 18:57:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55539.0). Total num frames: 7421624320. Throughput: 0: 55620.9. Samples: 326810620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:57:49,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 18:57:49,328][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000452981_7421640704.pth... [2024-04-27 18:57:49,373][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000452165_7408271360.pth [2024-04-27 18:57:51,562][54818] Updated weights for policy 0, policy_version 452988 (0.0038) [2024-04-27 18:57:54,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7421886464. Throughput: 0: 55616.9. Samples: 327141780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:57:54,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 18:57:54,619][54818] Updated weights for policy 0, policy_version 452998 (0.0032) [2024-04-27 18:57:57,576][54818] Updated weights for policy 0, policy_version 453008 (0.0033) [2024-04-27 18:57:59,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55594.5). 
Total num frames: 7422148608. Throughput: 0: 55581.9. Samples: 327309840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:57:59,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 18:58:00,670][54818] Updated weights for policy 0, policy_version 453018 (0.0035) [2024-04-27 18:58:03,373][54818] Updated weights for policy 0, policy_version 453028 (0.0033) [2024-04-27 18:58:04,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7422427136. Throughput: 0: 55393.3. Samples: 327635060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:04,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 18:58:06,665][54818] Updated weights for policy 0, policy_version 453038 (0.0027) [2024-04-27 18:58:08,588][54798] Signal inference workers to stop experience collection... (4650 times) [2024-04-27 18:58:08,592][54798] Signal inference workers to resume experience collection... (4650 times) [2024-04-27 18:58:08,629][54818] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-04-27 18:58:08,629][54818] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-04-27 18:58:09,083][54818] Updated weights for policy 0, policy_version 453048 (0.0028) [2024-04-27 18:58:09,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7422738432. Throughput: 0: 55329.3. Samples: 327967880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:09,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 18:58:12,446][54818] Updated weights for policy 0, policy_version 453058 (0.0027) [2024-04-27 18:58:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7422984192. Throughput: 0: 55717.9. Samples: 328142840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:14,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 18:58:15,010][54818] Updated weights for policy 0, policy_version 453068 (0.0026) [2024-04-27 18:58:18,253][54818] Updated weights for policy 0, policy_version 453078 (0.0035) [2024-04-27 18:58:19,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7423295488. Throughput: 0: 55631.4. Samples: 328475460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:19,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 18:58:20,983][54818] Updated weights for policy 0, policy_version 453088 (0.0027) [2024-04-27 18:58:24,130][54818] Updated weights for policy 0, policy_version 453098 (0.0027) [2024-04-27 18:58:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7423557632. Throughput: 0: 55654.7. Samples: 328810300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:24,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 18:58:27,003][54818] Updated weights for policy 0, policy_version 453108 (0.0033) [2024-04-27 18:58:29,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7423819776. Throughput: 0: 55307.3. Samples: 328966500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:29,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 18:58:30,011][54818] Updated weights for policy 0, policy_version 453118 (0.0025) [2024-04-27 18:58:33,095][54818] Updated weights for policy 0, policy_version 453128 (0.0027) [2024-04-27 18:58:34,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7424098304. Throughput: 0: 55416.4. Samples: 329304360. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:34,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 18:58:35,765][54818] Updated weights for policy 0, policy_version 453138 (0.0028) [2024-04-27 18:58:39,113][54818] Updated weights for policy 0, policy_version 453148 (0.0029) [2024-04-27 18:58:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7424376832. Throughput: 0: 55416.6. Samples: 329635520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:39,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 18:58:41,698][54818] Updated weights for policy 0, policy_version 453158 (0.0027) [2024-04-27 18:58:44,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7424688128. Throughput: 0: 55413.7. Samples: 329803460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:44,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 18:58:45,121][54818] Updated weights for policy 0, policy_version 453168 (0.0028) [2024-04-27 18:58:47,636][54818] Updated weights for policy 0, policy_version 453178 (0.0032) [2024-04-27 18:58:49,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7424933888. Throughput: 0: 55516.0. Samples: 330133280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:49,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 18:58:50,931][54818] Updated weights for policy 0, policy_version 453188 (0.0031) [2024-04-27 18:58:53,634][54818] Updated weights for policy 0, policy_version 453198 (0.0025) [2024-04-27 18:58:54,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7425212416. Throughput: 0: 55611.1. Samples: 330470380. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:54,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 18:58:56,655][54818] Updated weights for policy 0, policy_version 453208 (0.0029) [2024-04-27 18:58:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7425490944. Throughput: 0: 55491.5. Samples: 330639960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:58:59,253][54587] Avg episode reward: [(0, '0.665')] [2024-04-27 18:58:59,378][54818] Updated weights for policy 0, policy_version 453218 (0.0034) [2024-04-27 18:59:02,408][54818] Updated weights for policy 0, policy_version 453228 (0.0025) [2024-04-27 18:59:04,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7425753088. Throughput: 0: 55424.9. Samples: 330969580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:59:04,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 18:59:05,249][54818] Updated weights for policy 0, policy_version 453238 (0.0026) [2024-04-27 18:59:06,841][54798] Signal inference workers to stop experience collection... (4700 times) [2024-04-27 18:59:06,842][54798] Signal inference workers to resume experience collection... (4700 times) [2024-04-27 18:59:06,874][54818] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-04-27 18:59:06,874][54818] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-04-27 18:59:08,429][54818] Updated weights for policy 0, policy_version 453248 (0.0029) [2024-04-27 18:59:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7426064384. Throughput: 0: 55380.9. Samples: 331302440. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-27 18:59:09,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 18:59:11,236][54818] Updated weights for policy 0, policy_version 453258 (0.0024) [2024-04-27 18:59:14,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7426326528. Throughput: 0: 55638.7. Samples: 331470240. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:14,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 18:59:14,305][54818] Updated weights for policy 0, policy_version 453268 (0.0025) [2024-04-27 18:59:17,110][54818] Updated weights for policy 0, policy_version 453278 (0.0029) [2024-04-27 18:59:19,253][54587] Fps is (10 sec: 55704.5, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 7426621440. Throughput: 0: 55558.9. Samples: 331804520. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:19,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 18:59:20,075][54818] Updated weights for policy 0, policy_version 453288 (0.0034) [2024-04-27 18:59:23,065][54818] Updated weights for policy 0, policy_version 453298 (0.0026) [2024-04-27 18:59:24,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7426899968. Throughput: 0: 55562.6. Samples: 332135840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:24,253][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 18:59:26,022][54818] Updated weights for policy 0, policy_version 453308 (0.0033) [2024-04-27 18:59:28,911][54818] Updated weights for policy 0, policy_version 453318 (0.0026) [2024-04-27 18:59:29,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7427178496. Throughput: 0: 55696.0. Samples: 332309780. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:29,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:59:31,771][54818] Updated weights for policy 0, policy_version 453328 (0.0029) [2024-04-27 18:59:34,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 7427440640. Throughput: 0: 55868.8. Samples: 332647380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:34,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 18:59:34,820][54818] Updated weights for policy 0, policy_version 453338 (0.0025) [2024-04-27 18:59:37,565][54818] Updated weights for policy 0, policy_version 453348 (0.0029) [2024-04-27 18:59:39,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.4, 300 sec: 55538.9). Total num frames: 7427719168. Throughput: 0: 55747.5. Samples: 332979020. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:39,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 18:59:40,787][54818] Updated weights for policy 0, policy_version 453358 (0.0024) [2024-04-27 18:59:43,389][54818] Updated weights for policy 0, policy_version 453368 (0.0030) [2024-04-27 18:59:44,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7427997696. Throughput: 0: 55611.1. Samples: 333142460. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:44,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 18:59:46,647][54818] Updated weights for policy 0, policy_version 453378 (0.0025) [2024-04-27 18:59:49,238][54818] Updated weights for policy 0, policy_version 453388 (0.0028) [2024-04-27 18:59:49,253][54587] Fps is (10 sec: 58983.1, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7428308992. Throughput: 0: 55802.8. Samples: 333480700. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:49,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 18:59:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000453388_7428308992.pth... [2024-04-27 18:59:49,310][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000452573_7414956032.pth [2024-04-27 18:59:52,497][54818] Updated weights for policy 0, policy_version 453398 (0.0025) [2024-04-27 18:59:54,253][54587] Fps is (10 sec: 58982.2, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 7428587520. Throughput: 0: 55745.7. Samples: 333811000. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:54,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 18:59:55,194][54818] Updated weights for policy 0, policy_version 453408 (0.0033) [2024-04-27 18:59:58,283][54818] Updated weights for policy 0, policy_version 453418 (0.0031) [2024-04-27 18:59:59,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 7428849664. Throughput: 0: 55923.9. Samples: 333986820. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 18:59:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 19:00:01,363][54818] Updated weights for policy 0, policy_version 453428 (0.0026) [2024-04-27 19:00:04,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55978.7, 300 sec: 55483.4). Total num frames: 7429111808. Throughput: 0: 55890.8. Samples: 334319600. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 19:00:04,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 19:00:04,303][54818] Updated weights for policy 0, policy_version 453438 (0.0036) [2024-04-27 19:00:07,113][54818] Updated weights for policy 0, policy_version 453448 (0.0025) [2024-04-27 19:00:09,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7429390336. Throughput: 0: 55999.1. Samples: 334655800. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 19:00:09,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 19:00:10,190][54818] Updated weights for policy 0, policy_version 453458 (0.0026) [2024-04-27 19:00:12,834][54818] Updated weights for policy 0, policy_version 453468 (0.0027) [2024-04-27 19:00:14,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7429668864. Throughput: 0: 55628.4. Samples: 334813060. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 19:00:14,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 19:00:15,904][54818] Updated weights for policy 0, policy_version 453478 (0.0026) [2024-04-27 19:00:18,739][54818] Updated weights for policy 0, policy_version 453488 (0.0027) [2024-04-27 19:00:19,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 7429963776. Throughput: 0: 55582.3. Samples: 335148580. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 19:00:19,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 19:00:21,833][54818] Updated weights for policy 0, policy_version 453498 (0.0035) [2024-04-27 19:00:24,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7430242304. Throughput: 0: 55588.1. Samples: 335480480. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 19:00:24,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 19:00:24,704][54818] Updated weights for policy 0, policy_version 453508 (0.0027) [2024-04-27 19:00:27,695][54818] Updated weights for policy 0, policy_version 453518 (0.0027) [2024-04-27 19:00:28,703][54798] Signal inference workers to stop experience collection... (4750 times) [2024-04-27 19:00:28,729][54818] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-04-27 19:00:28,753][54798] Signal inference workers to resume experience collection... 
(4750 times) [2024-04-27 19:00:28,758][54818] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-04-27 19:00:29,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 7430537216. Throughput: 0: 55934.7. Samples: 335659520. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-27 19:00:29,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 19:00:30,513][54818] Updated weights for policy 0, policy_version 453528 (0.0023) [2024-04-27 19:00:33,531][54818] Updated weights for policy 0, policy_version 453538 (0.0030) [2024-04-27 19:00:34,253][54587] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7430815744. Throughput: 0: 55858.6. Samples: 335994340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:00:34,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:00:36,378][54818] Updated weights for policy 0, policy_version 453548 (0.0027) [2024-04-27 19:00:39,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7431077888. Throughput: 0: 55929.4. Samples: 336327820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:00:39,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 19:00:39,428][54818] Updated weights for policy 0, policy_version 453558 (0.0034) [2024-04-27 19:00:42,304][54818] Updated weights for policy 0, policy_version 453568 (0.0028) [2024-04-27 19:00:44,253][54587] Fps is (10 sec: 50791.5, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 7431323648. Throughput: 0: 55656.3. Samples: 336491340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:00:44,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 19:00:45,296][54818] Updated weights for policy 0, policy_version 453578 (0.0032) [2024-04-27 19:00:48,138][54818] Updated weights for policy 0, policy_version 453588 (0.0031) [2024-04-27 19:00:49,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55650.0). 
Total num frames: 7431634944. Throughput: 0: 55672.4. Samples: 336824860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:00:49,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 19:00:51,283][54818] Updated weights for policy 0, policy_version 453598 (0.0026) [2024-04-27 19:00:53,841][54818] Updated weights for policy 0, policy_version 453608 (0.0029) [2024-04-27 19:00:54,253][54587] Fps is (10 sec: 60618.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7431929856. Throughput: 0: 55654.4. Samples: 337160260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:00:54,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 19:00:57,068][54818] Updated weights for policy 0, policy_version 453618 (0.0032) [2024-04-27 19:00:59,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7432208384. Throughput: 0: 55936.0. Samples: 337330180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:00:59,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 19:00:59,679][54818] Updated weights for policy 0, policy_version 453628 (0.0023) [2024-04-27 19:01:02,988][54818] Updated weights for policy 0, policy_version 453638 (0.0028) [2024-04-27 19:01:04,253][54587] Fps is (10 sec: 55706.4, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 7432486912. Throughput: 0: 55816.8. Samples: 337660340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:04,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 19:01:05,782][54818] Updated weights for policy 0, policy_version 453648 (0.0032) [2024-04-27 19:01:08,906][54818] Updated weights for policy 0, policy_version 453658 (0.0033) [2024-04-27 19:01:09,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7432732672. Throughput: 0: 55917.4. Samples: 337996760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:09,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 19:01:11,628][54818] Updated weights for policy 0, policy_version 453668 (0.0027) [2024-04-27 19:01:14,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7433011200. Throughput: 0: 55427.5. Samples: 338153760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:14,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 19:01:14,751][54818] Updated weights for policy 0, policy_version 453678 (0.0026) [2024-04-27 19:01:17,378][54818] Updated weights for policy 0, policy_version 453688 (0.0029) [2024-04-27 19:01:19,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7433273344. Throughput: 0: 55504.1. Samples: 338492020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:19,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 19:01:20,572][54818] Updated weights for policy 0, policy_version 453698 (0.0035) [2024-04-27 19:01:23,372][54818] Updated weights for policy 0, policy_version 453708 (0.0034) [2024-04-27 19:01:24,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7433568256. Throughput: 0: 55470.6. Samples: 338824000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:24,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 19:01:26,582][54818] Updated weights for policy 0, policy_version 453718 (0.0030) [2024-04-27 19:01:29,253][54587] Fps is (10 sec: 58983.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7433863168. Throughput: 0: 55590.1. Samples: 338992900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:29,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 19:01:29,287][54818] Updated weights for policy 0, policy_version 453728 (0.0027) [2024-04-27 19:01:32,493][54818] Updated weights for policy 0, policy_version 453738 (0.0032) [2024-04-27 19:01:34,253][54587] Fps is (10 sec: 58982.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7434158080. Throughput: 0: 55695.7. Samples: 339331160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:34,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:01:34,863][54798] Signal inference workers to stop experience collection... (4800 times) [2024-04-27 19:01:34,894][54818] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-04-27 19:01:34,948][54798] Signal inference workers to resume experience collection... (4800 times) [2024-04-27 19:01:34,948][54818] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-04-27 19:01:35,056][54818] Updated weights for policy 0, policy_version 453748 (0.0029) [2024-04-27 19:01:38,271][54818] Updated weights for policy 0, policy_version 453758 (0.0035) [2024-04-27 19:01:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7434436608. Throughput: 0: 55656.7. Samples: 339664800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:39,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 19:01:40,866][54818] Updated weights for policy 0, policy_version 453768 (0.0026) [2024-04-27 19:01:44,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55978.4, 300 sec: 55705.6). Total num frames: 7434682368. Throughput: 0: 55618.6. Samples: 339833020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:44,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 19:01:44,265][54818] Updated weights for policy 0, policy_version 453778 (0.0030) [2024-04-27 19:01:46,906][54818] Updated weights for policy 0, policy_version 453788 (0.0029) [2024-04-27 19:01:49,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7434960896. Throughput: 0: 55733.4. Samples: 340168340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:49,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 19:01:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000453794_7434960896.pth... [2024-04-27 19:01:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000452981_7421640704.pth [2024-04-27 19:01:50,095][54818] Updated weights for policy 0, policy_version 453798 (0.0030) [2024-04-27 19:01:52,734][54818] Updated weights for policy 0, policy_version 453808 (0.0028) [2024-04-27 19:01:54,253][54587] Fps is (10 sec: 54067.2, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 7435223040. Throughput: 0: 55728.3. Samples: 340504540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 19:01:54,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:01:55,922][54818] Updated weights for policy 0, policy_version 453818 (0.0025) [2024-04-27 19:01:58,504][54818] Updated weights for policy 0, policy_version 453828 (0.0028) [2024-04-27 19:01:59,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7435517952. Throughput: 0: 55759.5. Samples: 340662940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:01:59,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 19:02:01,835][54818] Updated weights for policy 0, policy_version 453838 (0.0026) [2024-04-27 19:02:04,253][54587] Fps is (10 sec: 58982.8, 60 sec: 55432.5, 300 sec: 55594.5). 
Total num frames: 7435812864. Throughput: 0: 55604.0. Samples: 340994200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:04,254][54587] Avg episode reward: [(0, '0.498')] [2024-04-27 19:02:04,430][54818] Updated weights for policy 0, policy_version 453848 (0.0027) [2024-04-27 19:02:07,603][54818] Updated weights for policy 0, policy_version 453858 (0.0031) [2024-04-27 19:02:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 7436091392. Throughput: 0: 55648.9. Samples: 341328200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:09,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 19:02:10,621][54818] Updated weights for policy 0, policy_version 453868 (0.0025) [2024-04-27 19:02:13,468][54818] Updated weights for policy 0, policy_version 453878 (0.0032) [2024-04-27 19:02:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 7436386304. Throughput: 0: 55792.4. Samples: 341503560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:14,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 19:02:16,390][54818] Updated weights for policy 0, policy_version 453888 (0.0027) [2024-04-27 19:02:19,253][54587] Fps is (10 sec: 55706.3, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 7436648448. Throughput: 0: 55775.6. Samples: 341841060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:19,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 19:02:19,338][54818] Updated weights for policy 0, policy_version 453898 (0.0030) [2024-04-27 19:02:22,115][54818] Updated weights for policy 0, policy_version 453908 (0.0028) [2024-04-27 19:02:24,253][54587] Fps is (10 sec: 50790.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7436894208. Throughput: 0: 55740.5. Samples: 342173120. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:24,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 19:02:25,318][54818] Updated weights for policy 0, policy_version 453918 (0.0024) [2024-04-27 19:02:25,735][54798] Signal inference workers to stop experience collection... (4850 times) [2024-04-27 19:02:25,740][54798] Signal inference workers to resume experience collection... (4850 times) [2024-04-27 19:02:25,754][54818] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-04-27 19:02:25,754][54818] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-04-27 19:02:28,072][54818] Updated weights for policy 0, policy_version 453928 (0.0029) [2024-04-27 19:02:29,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7437205504. Throughput: 0: 55502.7. Samples: 342330640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:29,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 19:02:31,059][54818] Updated weights for policy 0, policy_version 453938 (0.0032) [2024-04-27 19:02:34,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7437467648. Throughput: 0: 55511.0. Samples: 342666340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:34,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 19:02:34,272][54818] Updated weights for policy 0, policy_version 453948 (0.0031) [2024-04-27 19:02:37,044][54818] Updated weights for policy 0, policy_version 453958 (0.0026) [2024-04-27 19:02:39,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 7437746176. Throughput: 0: 55492.8. Samples: 343001720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:39,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 19:02:40,048][54818] Updated weights for policy 0, policy_version 453968 (0.0030) [2024-04-27 19:02:42,922][54818] Updated weights for policy 0, policy_version 453978 (0.0035) [2024-04-27 19:02:44,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 7438024704. Throughput: 0: 55872.2. Samples: 343177180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:44,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 19:02:45,840][54818] Updated weights for policy 0, policy_version 453988 (0.0030) [2024-04-27 19:02:48,678][54818] Updated weights for policy 0, policy_version 453998 (0.0030) [2024-04-27 19:02:49,253][54587] Fps is (10 sec: 60621.5, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 7438352384. Throughput: 0: 55922.6. Samples: 343510720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:49,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 19:02:51,868][54818] Updated weights for policy 0, policy_version 454008 (0.0026) [2024-04-27 19:02:54,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7438581760. Throughput: 0: 55841.4. Samples: 343841060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:54,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 19:02:54,602][54818] Updated weights for policy 0, policy_version 454018 (0.0033) [2024-04-27 19:02:57,870][54818] Updated weights for policy 0, policy_version 454028 (0.0027) [2024-04-27 19:02:59,253][54587] Fps is (10 sec: 49152.6, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7438843904. Throughput: 0: 55673.8. Samples: 344008880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:02:59,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:03:00,531][54818] Updated weights for policy 0, policy_version 454038 (0.0025) [2024-04-27 19:03:03,857][54818] Updated weights for policy 0, policy_version 454048 (0.0031) [2024-04-27 19:03:04,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7439138816. Throughput: 0: 55677.3. Samples: 344346540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:03:04,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 19:03:06,261][54818] Updated weights for policy 0, policy_version 454058 (0.0028) [2024-04-27 19:03:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7439417344. Throughput: 0: 55656.0. Samples: 344677640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:03:09,254][54587] Avg episode reward: [(0, '0.680')] [2024-04-27 19:03:09,766][54818] Updated weights for policy 0, policy_version 454068 (0.0028) [2024-04-27 19:03:12,061][54818] Updated weights for policy 0, policy_version 454078 (0.0025) [2024-04-27 19:03:14,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7439695872. Throughput: 0: 55841.1. Samples: 344843480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:03:14,253][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 19:03:15,580][54818] Updated weights for policy 0, policy_version 454088 (0.0030) [2024-04-27 19:03:18,022][54818] Updated weights for policy 0, policy_version 454098 (0.0027) [2024-04-27 19:03:19,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7439990784. Throughput: 0: 55774.6. Samples: 345176200. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-27 19:03:19,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 19:03:20,155][54798] Signal inference workers to stop experience collection... (4900 times) [2024-04-27 19:03:20,198][54818] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-04-27 19:03:20,207][54798] Signal inference workers to resume experience collection... (4900 times) [2024-04-27 19:03:20,213][54818] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-04-27 19:03:21,487][54818] Updated weights for policy 0, policy_version 454108 (0.0029) [2024-04-27 19:03:23,928][54818] Updated weights for policy 0, policy_version 454118 (0.0032) [2024-04-27 19:03:24,253][54587] Fps is (10 sec: 58981.3, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 7440285696. Throughput: 0: 55678.8. Samples: 345507260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:03:24,263][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 19:03:27,389][54818] Updated weights for policy 0, policy_version 454128 (0.0037) [2024-04-27 19:03:29,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7440531456. Throughput: 0: 55663.4. Samples: 345682040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:03:29,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 19:03:29,777][54818] Updated weights for policy 0, policy_version 454138 (0.0032) [2024-04-27 19:03:33,126][54818] Updated weights for policy 0, policy_version 454148 (0.0036) [2024-04-27 19:03:34,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7440809984. Throughput: 0: 55685.9. Samples: 346016580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:03:34,262][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:03:35,684][54818] Updated weights for policy 0, policy_version 454158 (0.0030) [2024-04-27 19:03:39,113][54818] Updated weights for policy 0, policy_version 454168 (0.0025) [2024-04-27 19:03:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 7441088512. Throughput: 0: 55743.1. Samples: 346349500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:03:39,253][54587] Avg episode reward: [(0, '0.520')] [2024-04-27 19:03:41,509][54818] Updated weights for policy 0, policy_version 454178 (0.0032) [2024-04-27 19:03:44,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 7441350656. Throughput: 0: 55471.5. Samples: 346505100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:03:44,254][54587] Avg episode reward: [(0, '0.490')] [2024-04-27 19:03:45,068][54818] Updated weights for policy 0, policy_version 454188 (0.0027) [2024-04-27 19:03:47,355][54818] Updated weights for policy 0, policy_version 454198 (0.0024) [2024-04-27 19:03:49,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 7441661952. Throughput: 0: 55388.3. Samples: 346839020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:03:49,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 19:03:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000454203_7441661952.pth... [2024-04-27 19:03:49,311][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000453388_7428308992.pth [2024-04-27 19:03:51,106][54818] Updated weights for policy 0, policy_version 454208 (0.0029) [2024-04-27 19:03:53,216][54818] Updated weights for policy 0, policy_version 454218 (0.0031) [2024-04-27 19:03:54,253][54587] Fps is (10 sec: 58981.5, 60 sec: 55978.5, 300 sec: 55761.1). 
Total num frames: 7441940480. Throughput: 0: 55456.2. Samples: 347173180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:03:54,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 19:03:56,866][54818] Updated weights for policy 0, policy_version 454228 (0.0026) [2024-04-27 19:03:59,245][54818] Updated weights for policy 0, policy_version 454238 (0.0025) [2024-04-27 19:03:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 7442235392. Throughput: 0: 55761.6. Samples: 347352760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:03:59,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 19:04:02,712][54818] Updated weights for policy 0, policy_version 454248 (0.0030) [2024-04-27 19:04:04,253][54587] Fps is (10 sec: 55706.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7442497536. Throughput: 0: 55732.6. Samples: 347684160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:04:04,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 19:04:05,175][54818] Updated weights for policy 0, policy_version 454258 (0.0025) [2024-04-27 19:04:08,561][54818] Updated weights for policy 0, policy_version 454268 (0.0029) [2024-04-27 19:04:09,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7442759680. Throughput: 0: 55800.5. Samples: 348018280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:04:09,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 19:04:11,001][54818] Updated weights for policy 0, policy_version 454278 (0.0035) [2024-04-27 19:04:14,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7443038208. Throughput: 0: 55591.5. Samples: 348183660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:04:14,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 19:04:14,308][54818] Updated weights for policy 0, policy_version 454288 (0.0031) [2024-04-27 19:04:16,915][54818] Updated weights for policy 0, policy_version 454298 (0.0024) [2024-04-27 19:04:19,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7443316736. Throughput: 0: 55588.7. Samples: 348518080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:04:19,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 19:04:20,113][54818] Updated weights for policy 0, policy_version 454308 (0.0030) [2024-04-27 19:04:22,817][54818] Updated weights for policy 0, policy_version 454318 (0.0027) [2024-04-27 19:04:24,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7443611648. Throughput: 0: 55583.0. Samples: 348850740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:04:24,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:04:26,227][54818] Updated weights for policy 0, policy_version 454328 (0.0025) [2024-04-27 19:04:26,746][54798] Signal inference workers to stop experience collection... (4950 times) [2024-04-27 19:04:26,779][54818] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-04-27 19:04:26,806][54798] Signal inference workers to resume experience collection... (4950 times) [2024-04-27 19:04:26,811][54818] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-04-27 19:04:28,632][54818] Updated weights for policy 0, policy_version 454338 (0.0032) [2024-04-27 19:04:29,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7443890176. Throughput: 0: 55867.5. Samples: 349019140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:04:29,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 19:04:32,145][54818] Updated weights for policy 0, policy_version 454348 (0.0030) [2024-04-27 19:04:34,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7444168704. Throughput: 0: 55980.8. Samples: 349358160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:04:34,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 19:04:34,501][54818] Updated weights for policy 0, policy_version 454358 (0.0036) [2024-04-27 19:04:37,965][54818] Updated weights for policy 0, policy_version 454368 (0.0033) [2024-04-27 19:04:39,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7444430848. Throughput: 0: 55963.7. Samples: 349691540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:04:39,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 19:04:40,428][54818] Updated weights for policy 0, policy_version 454378 (0.0032) [2024-04-27 19:04:43,878][54818] Updated weights for policy 0, policy_version 454388 (0.0030) [2024-04-27 19:04:44,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7444709376. Throughput: 0: 55510.6. Samples: 349850740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 19:04:44,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 19:04:46,364][54818] Updated weights for policy 0, policy_version 454398 (0.0026) [2024-04-27 19:04:49,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7444987904. Throughput: 0: 55585.5. Samples: 350185520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:04:49,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 19:04:49,671][54818] Updated weights for policy 0, policy_version 454408 (0.0028) [2024-04-27 19:04:52,389][54818] Updated weights for policy 0, policy_version 454418 (0.0028) [2024-04-27 19:04:54,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7445266432. Throughput: 0: 55612.5. Samples: 350520840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:04:54,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 19:04:55,611][54818] Updated weights for policy 0, policy_version 454428 (0.0028) [2024-04-27 19:04:58,125][54818] Updated weights for policy 0, policy_version 454438 (0.0027) [2024-04-27 19:04:59,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7445561344. Throughput: 0: 55654.6. Samples: 350688120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:04:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 19:05:01,438][54818] Updated weights for policy 0, policy_version 454448 (0.0030) [2024-04-27 19:05:04,155][54818] Updated weights for policy 0, policy_version 454458 (0.0030) [2024-04-27 19:05:04,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7445839872. Throughput: 0: 55583.1. Samples: 351019320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:04,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 19:05:07,342][54818] Updated weights for policy 0, policy_version 454468 (0.0027) [2024-04-27 19:05:09,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7446102016. Throughput: 0: 55637.3. Samples: 351354420. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:09,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 19:05:10,095][54818] Updated weights for policy 0, policy_version 454478 (0.0028) [2024-04-27 19:05:13,197][54818] Updated weights for policy 0, policy_version 454488 (0.0035) [2024-04-27 19:05:14,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7446396928. Throughput: 0: 55703.7. Samples: 351525800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:14,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 19:05:15,818][54818] Updated weights for policy 0, policy_version 454498 (0.0027) [2024-04-27 19:05:19,197][54818] Updated weights for policy 0, policy_version 454508 (0.0027) [2024-04-27 19:05:19,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7446659072. Throughput: 0: 55625.9. Samples: 351861320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:19,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 19:05:21,704][54818] Updated weights for policy 0, policy_version 454518 (0.0026) [2024-04-27 19:05:24,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7446937600. Throughput: 0: 55701.4. Samples: 352198100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:24,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 19:05:24,992][54818] Updated weights for policy 0, policy_version 454528 (0.0027) [2024-04-27 19:05:27,597][54818] Updated weights for policy 0, policy_version 454538 (0.0032) [2024-04-27 19:05:29,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7447216128. Throughput: 0: 55799.5. Samples: 352361720. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:29,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 19:05:30,769][54818] Updated weights for policy 0, policy_version 454548 (0.0027) [2024-04-27 19:05:33,577][54818] Updated weights for policy 0, policy_version 454558 (0.0024) [2024-04-27 19:05:34,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 7447511040. Throughput: 0: 55774.5. Samples: 352695360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:34,253][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 19:05:36,631][54818] Updated weights for policy 0, policy_version 454568 (0.0030) [2024-04-27 19:05:39,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7447773184. Throughput: 0: 55790.6. Samples: 353031420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:39,254][54587] Avg episode reward: [(0, '0.499')] [2024-04-27 19:05:39,370][54818] Updated weights for policy 0, policy_version 454578 (0.0032) [2024-04-27 19:05:40,378][54798] Signal inference workers to stop experience collection... (5000 times) [2024-04-27 19:05:40,378][54798] Signal inference workers to resume experience collection... (5000 times) [2024-04-27 19:05:40,402][54818] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-04-27 19:05:40,402][54818] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-04-27 19:05:42,577][54818] Updated weights for policy 0, policy_version 454588 (0.0027) [2024-04-27 19:05:44,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7448068096. Throughput: 0: 55837.4. Samples: 353200800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:44,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 19:05:45,186][54818] Updated weights for policy 0, policy_version 454598 (0.0027) [2024-04-27 19:05:48,402][54818] Updated weights for policy 0, policy_version 454608 (0.0028) [2024-04-27 19:05:49,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7448346624. Throughput: 0: 55890.8. Samples: 353534400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:49,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:05:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000454611_7448346624.pth... [2024-04-27 19:05:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000453794_7434960896.pth [2024-04-27 19:05:51,019][54818] Updated weights for policy 0, policy_version 454618 (0.0036) [2024-04-27 19:05:54,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7448608768. Throughput: 0: 55862.3. Samples: 353868220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:54,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 19:05:54,390][54818] Updated weights for policy 0, policy_version 454628 (0.0026) [2024-04-27 19:05:56,906][54818] Updated weights for policy 0, policy_version 454638 (0.0028) [2024-04-27 19:05:59,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7448887296. Throughput: 0: 55657.3. Samples: 354030380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:05:59,253][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 19:06:00,197][54818] Updated weights for policy 0, policy_version 454648 (0.0027) [2024-04-27 19:06:02,775][54818] Updated weights for policy 0, policy_version 454658 (0.0030) [2024-04-27 19:06:04,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55761.1). 
Total num frames: 7449182208. Throughput: 0: 55598.7. Samples: 354363260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:06:04,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 19:06:06,184][54818] Updated weights for policy 0, policy_version 454668 (0.0034) [2024-04-27 19:06:08,765][54818] Updated weights for policy 0, policy_version 454678 (0.0029) [2024-04-27 19:06:09,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7449460736. Throughput: 0: 55548.8. Samples: 354697800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 19:06:09,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:06:11,989][54818] Updated weights for policy 0, policy_version 454688 (0.0027) [2024-04-27 19:06:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7449739264. Throughput: 0: 55774.8. Samples: 354871580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:14,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-27 19:06:14,771][54818] Updated weights for policy 0, policy_version 454698 (0.0032) [2024-04-27 19:06:17,798][54818] Updated weights for policy 0, policy_version 454708 (0.0028) [2024-04-27 19:06:19,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7450001408. Throughput: 0: 55715.9. Samples: 355202580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:19,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 19:06:20,480][54818] Updated weights for policy 0, policy_version 454718 (0.0028) [2024-04-27 19:06:23,734][54818] Updated weights for policy 0, policy_version 454728 (0.0037) [2024-04-27 19:06:24,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7450296320. Throughput: 0: 55678.2. Samples: 355536940. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:24,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 19:06:26,310][54818] Updated weights for policy 0, policy_version 454738 (0.0024) [2024-04-27 19:06:29,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7450558464. Throughput: 0: 55556.1. Samples: 355700820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:29,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 19:06:29,794][54818] Updated weights for policy 0, policy_version 454748 (0.0028) [2024-04-27 19:06:32,172][54818] Updated weights for policy 0, policy_version 454758 (0.0030) [2024-04-27 19:06:34,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7450853376. Throughput: 0: 55626.2. Samples: 356037580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:34,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 19:06:35,511][54818] Updated weights for policy 0, policy_version 454768 (0.0029) [2024-04-27 19:06:38,171][54818] Updated weights for policy 0, policy_version 454778 (0.0028) [2024-04-27 19:06:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 7451131904. Throughput: 0: 55622.7. Samples: 356371240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:39,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 19:06:41,465][54818] Updated weights for policy 0, policy_version 454788 (0.0030) [2024-04-27 19:06:43,923][54818] Updated weights for policy 0, policy_version 454798 (0.0033) [2024-04-27 19:06:44,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7451410432. Throughput: 0: 55803.4. Samples: 356541540. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:44,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 19:06:46,696][54798] Signal inference workers to stop experience collection... (5050 times) [2024-04-27 19:06:46,746][54798] Signal inference workers to resume experience collection... (5050 times) [2024-04-27 19:06:46,746][54818] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-04-27 19:06:46,762][54818] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-04-27 19:06:47,334][54818] Updated weights for policy 0, policy_version 454808 (0.0029) [2024-04-27 19:06:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7451688960. Throughput: 0: 55816.4. Samples: 356875000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:49,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 19:06:50,156][54818] Updated weights for policy 0, policy_version 454818 (0.0031) [2024-04-27 19:06:53,208][54818] Updated weights for policy 0, policy_version 454828 (0.0029) [2024-04-27 19:06:54,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7451951104. Throughput: 0: 55804.9. Samples: 357209020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:54,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 19:06:56,005][54818] Updated weights for policy 0, policy_version 454838 (0.0027) [2024-04-27 19:06:58,993][54818] Updated weights for policy 0, policy_version 454848 (0.0033) [2024-04-27 19:06:59,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7452246016. Throughput: 0: 55497.7. Samples: 357368980. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:06:59,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 19:07:01,959][54818] Updated weights for policy 0, policy_version 454858 (0.0032) [2024-04-27 19:07:04,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7452491776. Throughput: 0: 55476.5. Samples: 357699020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:07:04,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:07:04,956][54818] Updated weights for policy 0, policy_version 454868 (0.0037) [2024-04-27 19:07:07,738][54818] Updated weights for policy 0, policy_version 454878 (0.0032) [2024-04-27 19:07:09,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7452786688. Throughput: 0: 55516.1. Samples: 358035160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:07:09,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 19:07:10,916][54818] Updated weights for policy 0, policy_version 454888 (0.0025) [2024-04-27 19:07:13,654][54818] Updated weights for policy 0, policy_version 454898 (0.0031) [2024-04-27 19:07:14,253][54587] Fps is (10 sec: 58983.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7453081600. Throughput: 0: 55607.2. Samples: 358203140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:07:14,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 19:07:16,761][54818] Updated weights for policy 0, policy_version 454908 (0.0025) [2024-04-27 19:07:19,254][54587] Fps is (10 sec: 55704.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7453343744. Throughput: 0: 55523.3. Samples: 358536140. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:07:19,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 19:07:19,653][54818] Updated weights for policy 0, policy_version 454918 (0.0028) [2024-04-27 19:07:22,832][54818] Updated weights for policy 0, policy_version 454928 (0.0027) [2024-04-27 19:07:24,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7453622272. Throughput: 0: 55493.9. Samples: 358868460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:07:24,253][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 19:07:25,456][54818] Updated weights for policy 0, policy_version 454938 (0.0028) [2024-04-27 19:07:28,576][54818] Updated weights for policy 0, policy_version 454948 (0.0034) [2024-04-27 19:07:29,253][54587] Fps is (10 sec: 54068.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7453884416. Throughput: 0: 55375.2. Samples: 359033420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-04-27 19:07:29,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 19:07:31,484][54818] Updated weights for policy 0, policy_version 454958 (0.0028) [2024-04-27 19:07:34,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7454162944. Throughput: 0: 55366.2. Samples: 359366480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:07:34,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 19:07:34,564][54818] Updated weights for policy 0, policy_version 454968 (0.0025) [2024-04-27 19:07:37,331][54818] Updated weights for policy 0, policy_version 454978 (0.0026) [2024-04-27 19:07:39,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 7454441472. Throughput: 0: 55396.0. Samples: 359701840. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:07:39,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 19:07:40,436][54818] Updated weights for policy 0, policy_version 454988 (0.0027) [2024-04-27 19:07:43,357][54818] Updated weights for policy 0, policy_version 454998 (0.0028) [2024-04-27 19:07:44,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7454736384. Throughput: 0: 55436.1. Samples: 359863600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:07:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 19:07:46,375][54818] Updated weights for policy 0, policy_version 455008 (0.0027) [2024-04-27 19:07:49,123][54818] Updated weights for policy 0, policy_version 455018 (0.0027) [2024-04-27 19:07:49,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7455014912. Throughput: 0: 55518.5. Samples: 360197360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:07:49,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 19:07:49,313][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000455019_7455031296.pth... [2024-04-27 19:07:49,356][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000454203_7441661952.pth [2024-04-27 19:07:51,148][54798] Signal inference workers to stop experience collection... (5100 times) [2024-04-27 19:07:51,155][54798] Signal inference workers to resume experience collection... (5100 times) [2024-04-27 19:07:51,168][54818] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-04-27 19:07:51,169][54818] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-04-27 19:07:52,135][54818] Updated weights for policy 0, policy_version 455028 (0.0030) [2024-04-27 19:07:54,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7455277056. Throughput: 0: 55495.1. Samples: 360532440. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:07:54,253][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 19:07:55,096][54818] Updated weights for policy 0, policy_version 455038 (0.0030) [2024-04-27 19:07:58,150][54818] Updated weights for policy 0, policy_version 455048 (0.0031) [2024-04-27 19:07:59,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7455588352. Throughput: 0: 55541.2. Samples: 360702500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:07:59,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 19:08:01,085][54818] Updated weights for policy 0, policy_version 455058 (0.0031) [2024-04-27 19:08:04,151][54818] Updated weights for policy 0, policy_version 455068 (0.0030) [2024-04-27 19:08:04,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7455850496. Throughput: 0: 55671.2. Samples: 361041340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:04,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 19:08:06,932][54818] Updated weights for policy 0, policy_version 455078 (0.0028) [2024-04-27 19:08:09,253][54587] Fps is (10 sec: 52429.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7456112640. Throughput: 0: 55837.8. Samples: 361381160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:09,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 19:08:09,840][54818] Updated weights for policy 0, policy_version 455088 (0.0023) [2024-04-27 19:08:12,582][54818] Updated weights for policy 0, policy_version 455098 (0.0027) [2024-04-27 19:08:14,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 7456374784. Throughput: 0: 55599.1. Samples: 361535380. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:14,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 19:08:15,831][54818] Updated weights for policy 0, policy_version 455108 (0.0035) [2024-04-27 19:08:18,453][54818] Updated weights for policy 0, policy_version 455118 (0.0028) [2024-04-27 19:08:19,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7456669696. Throughput: 0: 55664.4. Samples: 361871380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:19,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 19:08:21,592][54818] Updated weights for policy 0, policy_version 455128 (0.0032) [2024-04-27 19:08:24,218][54818] Updated weights for policy 0, policy_version 455138 (0.0028) [2024-04-27 19:08:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7456980992. Throughput: 0: 55599.1. Samples: 362203800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:24,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-27 19:08:27,560][54818] Updated weights for policy 0, policy_version 455148 (0.0026) [2024-04-27 19:08:29,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7457243136. Throughput: 0: 55914.9. Samples: 362379780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:29,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 19:08:30,017][54818] Updated weights for policy 0, policy_version 455158 (0.0031) [2024-04-27 19:08:33,313][54818] Updated weights for policy 0, policy_version 455168 (0.0033) [2024-04-27 19:08:34,253][54587] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 7457538048. Throughput: 0: 55918.9. Samples: 362713700. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:34,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 19:08:36,061][54818] Updated weights for policy 0, policy_version 455178 (0.0031) [2024-04-27 19:08:39,244][54818] Updated weights for policy 0, policy_version 455188 (0.0028) [2024-04-27 19:08:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7457800192. Throughput: 0: 55848.3. Samples: 363045620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:39,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 19:08:41,838][54818] Updated weights for policy 0, policy_version 455198 (0.0030) [2024-04-27 19:08:44,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7458062336. Throughput: 0: 55660.2. Samples: 363207200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:44,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 19:08:45,055][54818] Updated weights for policy 0, policy_version 455208 (0.0030) [2024-04-27 19:08:47,830][54818] Updated weights for policy 0, policy_version 455218 (0.0030) [2024-04-27 19:08:49,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7458324480. Throughput: 0: 55614.4. Samples: 363543980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:49,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 19:08:50,834][54818] Updated weights for policy 0, policy_version 455228 (0.0026) [2024-04-27 19:08:54,153][54818] Updated weights for policy 0, policy_version 455238 (0.0037) [2024-04-27 19:08:54,253][54587] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7458619392. Throughput: 0: 55527.8. Samples: 363879920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-27 19:08:54,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 19:08:55,530][54798] Signal inference workers to stop experience collection... 
(5150 times) [2024-04-27 19:08:55,530][54798] Signal inference workers to resume experience collection... (5150 times) [2024-04-27 19:08:55,542][54818] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-04-27 19:08:55,543][54818] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-04-27 19:08:56,760][54818] Updated weights for policy 0, policy_version 455248 (0.0028) [2024-04-27 19:08:59,253][54587] Fps is (10 sec: 58981.6, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7458914304. Throughput: 0: 55804.8. Samples: 364046600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:08:59,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 19:09:00,072][54818] Updated weights for policy 0, policy_version 455258 (0.0030) [2024-04-27 19:09:02,723][54818] Updated weights for policy 0, policy_version 455268 (0.0030) [2024-04-27 19:09:04,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7459192832. Throughput: 0: 55812.0. Samples: 364382920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:04,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:09:05,998][54818] Updated weights for policy 0, policy_version 455278 (0.0033) [2024-04-27 19:09:08,431][54818] Updated weights for policy 0, policy_version 455288 (0.0033) [2024-04-27 19:09:09,253][54587] Fps is (10 sec: 57344.9, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7459487744. Throughput: 0: 55802.3. Samples: 364714900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:09,253][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 19:09:11,800][54818] Updated weights for policy 0, policy_version 455298 (0.0027) [2024-04-27 19:09:14,253][54587] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 7459749888. Throughput: 0: 55826.0. Samples: 364891940. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:14,253][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 19:09:14,325][54818] Updated weights for policy 0, policy_version 455308 (0.0029) [2024-04-27 19:09:17,993][54818] Updated weights for policy 0, policy_version 455318 (0.0036) [2024-04-27 19:09:19,253][54587] Fps is (10 sec: 52428.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7460012032. Throughput: 0: 55697.1. Samples: 365220080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:19,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 19:09:20,268][54818] Updated weights for policy 0, policy_version 455328 (0.0029) [2024-04-27 19:09:23,902][54818] Updated weights for policy 0, policy_version 455338 (0.0025) [2024-04-27 19:09:24,253][54587] Fps is (10 sec: 52428.6, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 7460274176. Throughput: 0: 55694.4. Samples: 365551860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:24,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 19:09:26,245][54818] Updated weights for policy 0, policy_version 455348 (0.0030) [2024-04-27 19:09:29,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7460552704. Throughput: 0: 55521.2. Samples: 365705660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:29,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 19:09:29,794][54818] Updated weights for policy 0, policy_version 455358 (0.0032) [2024-04-27 19:09:32,144][54818] Updated weights for policy 0, policy_version 455368 (0.0032) [2024-04-27 19:09:34,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7460864000. Throughput: 0: 55464.9. Samples: 366039900. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:34,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 19:09:35,676][54818] Updated weights for policy 0, policy_version 455378 (0.0029) [2024-04-27 19:09:37,876][54818] Updated weights for policy 0, policy_version 455388 (0.0030) [2024-04-27 19:09:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 7461158912. Throughput: 0: 55485.9. Samples: 366376780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:39,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 19:09:41,631][54818] Updated weights for policy 0, policy_version 455398 (0.0031) [2024-04-27 19:09:43,678][54818] Updated weights for policy 0, policy_version 455408 (0.0034) [2024-04-27 19:09:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 7461437440. Throughput: 0: 55828.6. Samples: 366558880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:44,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 19:09:47,674][54818] Updated weights for policy 0, policy_version 455418 (0.0034) [2024-04-27 19:09:49,253][54587] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7461699584. Throughput: 0: 55848.9. Samples: 366896120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:49,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 19:09:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000455426_7461699584.pth... [2024-04-27 19:09:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000454611_7448346624.pth [2024-04-27 19:09:49,586][54818] Updated weights for policy 0, policy_version 455428 (0.0032) [2024-04-27 19:09:53,446][54818] Updated weights for policy 0, policy_version 455438 (0.0028) [2024-04-27 19:09:54,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55705.8, 300 sec: 55594.5). 
Total num frames: 7461961728. Throughput: 0: 55736.5. Samples: 367223040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:54,253][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 19:09:55,628][54818] Updated weights for policy 0, policy_version 455448 (0.0027) [2024-04-27 19:09:59,253][54587] Fps is (10 sec: 49152.1, 60 sec: 54613.5, 300 sec: 55427.9). Total num frames: 7462191104. Throughput: 0: 55105.3. Samples: 367371680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:09:59,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 19:09:59,513][54818] Updated weights for policy 0, policy_version 455458 (0.0029) [2024-04-27 19:10:01,059][54798] Signal inference workers to stop experience collection... (5200 times) [2024-04-27 19:10:01,060][54798] Signal inference workers to resume experience collection... (5200 times) [2024-04-27 19:10:01,084][54818] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-04-27 19:10:01,084][54818] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-04-27 19:10:01,353][54818] Updated weights for policy 0, policy_version 455468 (0.0027) [2024-04-27 19:10:04,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7462518784. Throughput: 0: 55248.0. Samples: 367706240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:10:04,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 19:10:05,353][54818] Updated weights for policy 0, policy_version 455478 (0.0025) [2024-04-27 19:10:07,226][54818] Updated weights for policy 0, policy_version 455488 (0.0023) [2024-04-27 19:10:09,253][54587] Fps is (10 sec: 60619.9, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 7462797312. Throughput: 0: 55411.8. Samples: 368045400. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:10:09,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 19:10:11,096][54818] Updated weights for policy 0, policy_version 455498 (0.0028) [2024-04-27 19:10:13,295][54818] Updated weights for policy 0, policy_version 455508 (0.0032) [2024-04-27 19:10:14,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7463092224. Throughput: 0: 55765.3. Samples: 368215100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:10:14,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 19:10:17,001][54818] Updated weights for policy 0, policy_version 455518 (0.0030) [2024-04-27 19:10:19,028][54818] Updated weights for policy 0, policy_version 455528 (0.0033) [2024-04-27 19:10:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7463370752. Throughput: 0: 55773.7. Samples: 368549720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-27 19:10:19,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 19:10:22,923][54818] Updated weights for policy 0, policy_version 455538 (0.0032) [2024-04-27 19:10:24,253][54587] Fps is (10 sec: 55705.6, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7463649280. Throughput: 0: 55653.3. Samples: 368881180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:10:24,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 19:10:24,770][54818] Updated weights for policy 0, policy_version 455548 (0.0032) [2024-04-27 19:10:28,622][54818] Updated weights for policy 0, policy_version 455558 (0.0025) [2024-04-27 19:10:29,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7463895040. Throughput: 0: 55279.6. Samples: 369046460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:10:29,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 19:10:30,684][54818] Updated weights for policy 0, policy_version 455568 (0.0026) [2024-04-27 19:10:34,253][54587] Fps is (10 sec: 50791.1, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 7464157184. Throughput: 0: 55257.9. Samples: 369382720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:10:34,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:10:34,594][54818] Updated weights for policy 0, policy_version 455578 (0.0026) [2024-04-27 19:10:36,674][54818] Updated weights for policy 0, policy_version 455588 (0.0027) [2024-04-27 19:10:39,253][54587] Fps is (10 sec: 55705.3, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 7464452096. Throughput: 0: 55423.0. Samples: 369717080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:10:39,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 19:10:40,686][54818] Updated weights for policy 0, policy_version 455598 (0.0035) [2024-04-27 19:10:42,460][54818] Updated weights for policy 0, policy_version 455608 (0.0023) [2024-04-27 19:10:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7464747008. Throughput: 0: 55796.1. Samples: 369882500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:10:44,253][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 19:10:46,393][54818] Updated weights for policy 0, policy_version 455618 (0.0030) [2024-04-27 19:10:46,736][54798] Signal inference workers to stop experience collection... (5250 times) [2024-04-27 19:10:46,737][54798] Signal inference workers to resume experience collection... 
(5250 times) [2024-04-27 19:10:46,752][54818] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-04-27 19:10:46,753][54818] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-04-27 19:10:48,410][54818] Updated weights for policy 0, policy_version 455628 (0.0031) [2024-04-27 19:10:49,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7465025536. Throughput: 0: 55815.2. Samples: 370217920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:10:49,257][54587] Avg episode reward: [(0, '0.697')] [2024-04-27 19:10:52,369][54818] Updated weights for policy 0, policy_version 455638 (0.0027) [2024-04-27 19:10:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7465320448. Throughput: 0: 55689.5. Samples: 370551420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:10:54,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 19:10:54,440][54818] Updated weights for policy 0, policy_version 455648 (0.0033) [2024-04-27 19:10:58,084][54818] Updated weights for policy 0, policy_version 455658 (0.0028) [2024-04-27 19:10:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56797.8, 300 sec: 55650.1). Total num frames: 7465598976. Throughput: 0: 55662.7. Samples: 370719920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:10:59,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:11:00,228][54818] Updated weights for policy 0, policy_version 455668 (0.0025) [2024-04-27 19:11:03,974][54818] Updated weights for policy 0, policy_version 455678 (0.0026) [2024-04-27 19:11:04,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7465861120. Throughput: 0: 55726.4. Samples: 371057400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:11:04,253][54587] Avg episode reward: [(0, '0.465')] [2024-04-27 19:11:06,120][54818] Updated weights for policy 0, policy_version 455688 (0.0031) [2024-04-27 19:11:09,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7466123264. Throughput: 0: 55841.9. Samples: 371394060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:11:09,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 19:11:09,760][54818] Updated weights for policy 0, policy_version 455698 (0.0030) [2024-04-27 19:11:11,967][54818] Updated weights for policy 0, policy_version 455708 (0.0027) [2024-04-27 19:11:14,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 7466401792. Throughput: 0: 55531.1. Samples: 371545360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:11:14,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 19:11:15,550][54818] Updated weights for policy 0, policy_version 455718 (0.0026) [2024-04-27 19:11:17,833][54818] Updated weights for policy 0, policy_version 455728 (0.0026) [2024-04-27 19:11:19,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7466696704. Throughput: 0: 55606.9. Samples: 371885040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:11:19,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 19:11:21,444][54818] Updated weights for policy 0, policy_version 455738 (0.0033) [2024-04-27 19:11:23,767][54818] Updated weights for policy 0, policy_version 455748 (0.0027) [2024-04-27 19:11:24,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7466975232. Throughput: 0: 55581.7. Samples: 372218260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:11:24,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 19:11:27,420][54818] Updated weights for policy 0, policy_version 455758 (0.0028) [2024-04-27 19:11:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 7467270144. Throughput: 0: 55895.3. Samples: 372397800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:11:29,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:11:29,566][54818] Updated weights for policy 0, policy_version 455768 (0.0028) [2024-04-27 19:11:33,276][54818] Updated weights for policy 0, policy_version 455778 (0.0032) [2024-04-27 19:11:34,253][54587] Fps is (10 sec: 55705.9, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 7467532288. Throughput: 0: 55784.9. Samples: 372728240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:11:34,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 19:11:35,738][54818] Updated weights for policy 0, policy_version 455788 (0.0031) [2024-04-27 19:11:39,045][54818] Updated weights for policy 0, policy_version 455798 (0.0028) [2024-04-27 19:11:39,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7467810816. Throughput: 0: 55839.8. Samples: 373064220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:11:39,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 19:11:39,944][54798] Signal inference workers to stop experience collection... (5300 times) [2024-04-27 19:11:39,944][54798] Signal inference workers to resume experience collection... 
(5300 times) [2024-04-27 19:11:39,968][54818] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-04-27 19:11:39,969][54818] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-04-27 19:11:41,674][54818] Updated weights for policy 0, policy_version 455808 (0.0030) [2024-04-27 19:11:44,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7468072960. Throughput: 0: 55633.8. Samples: 373223440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-27 19:11:44,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 19:11:44,903][54818] Updated weights for policy 0, policy_version 455818 (0.0027) [2024-04-27 19:11:47,420][54818] Updated weights for policy 0, policy_version 455828 (0.0031) [2024-04-27 19:11:49,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7468351488. Throughput: 0: 55612.2. Samples: 373559960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:11:49,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 19:11:49,294][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000455833_7468367872.pth... [2024-04-27 19:11:49,339][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000455019_7455031296.pth [2024-04-27 19:11:50,653][54818] Updated weights for policy 0, policy_version 455838 (0.0036) [2024-04-27 19:11:53,138][54818] Updated weights for policy 0, policy_version 455848 (0.0025) [2024-04-27 19:11:54,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7468630016. Throughput: 0: 55606.2. Samples: 373896340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:11:54,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 19:11:56,588][54818] Updated weights for policy 0, policy_version 455858 (0.0027) [2024-04-27 19:11:59,145][54818] Updated weights for policy 0, policy_version 455868 (0.0031) [2024-04-27 19:11:59,253][54587] Fps is (10 sec: 58983.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7468941312. Throughput: 0: 55931.5. Samples: 374062280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:11:59,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 19:12:02,479][54818] Updated weights for policy 0, policy_version 455878 (0.0027) [2024-04-27 19:12:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7469219840. Throughput: 0: 55686.8. Samples: 374390940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:04,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 19:12:05,171][54818] Updated weights for policy 0, policy_version 455888 (0.0029) [2024-04-27 19:12:08,323][54818] Updated weights for policy 0, policy_version 455898 (0.0027) [2024-04-27 19:12:09,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7469481984. Throughput: 0: 55753.4. Samples: 374727160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:09,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 19:12:11,179][54818] Updated weights for policy 0, policy_version 455908 (0.0030) [2024-04-27 19:12:14,143][54818] Updated weights for policy 0, policy_version 455918 (0.0034) [2024-04-27 19:12:14,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 7469760512. Throughput: 0: 55523.5. Samples: 374896360. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:14,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 19:12:17,101][54818] Updated weights for policy 0, policy_version 455928 (0.0030) [2024-04-27 19:12:19,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7470039040. Throughput: 0: 55683.2. Samples: 375233980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:19,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 19:12:19,943][54818] Updated weights for policy 0, policy_version 455938 (0.0027) [2024-04-27 19:12:23,025][54818] Updated weights for policy 0, policy_version 455948 (0.0034) [2024-04-27 19:12:24,253][54587] Fps is (10 sec: 52429.6, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 7470284800. Throughput: 0: 55691.3. Samples: 375570320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:24,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 19:12:25,709][54818] Updated weights for policy 0, policy_version 455958 (0.0029) [2024-04-27 19:12:28,919][54818] Updated weights for policy 0, policy_version 455968 (0.0027) [2024-04-27 19:12:29,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7470579712. Throughput: 0: 55665.3. Samples: 375728380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:29,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 19:12:31,600][54818] Updated weights for policy 0, policy_version 455978 (0.0031) [2024-04-27 19:12:34,253][54587] Fps is (10 sec: 60620.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7470891008. Throughput: 0: 55688.1. Samples: 376065920. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:34,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 19:12:34,651][54818] Updated weights for policy 0, policy_version 455988 (0.0028) [2024-04-27 19:12:37,507][54818] Updated weights for policy 0, policy_version 455998 (0.0034) [2024-04-27 19:12:39,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7471169536. Throughput: 0: 55605.3. Samples: 376398580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:39,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 19:12:40,346][54818] Updated weights for policy 0, policy_version 456008 (0.0028) [2024-04-27 19:12:43,213][54798] Signal inference workers to stop experience collection... (5350 times) [2024-04-27 19:12:43,214][54798] Signal inference workers to resume experience collection... (5350 times) [2024-04-27 19:12:43,241][54818] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-04-27 19:12:43,241][54818] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-04-27 19:12:43,322][54818] Updated weights for policy 0, policy_version 456018 (0.0033) [2024-04-27 19:12:44,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7471431680. Throughput: 0: 55954.3. Samples: 376580220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:44,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 19:12:46,247][54818] Updated weights for policy 0, policy_version 456028 (0.0031) [2024-04-27 19:12:49,147][54818] Updated weights for policy 0, policy_version 456038 (0.0038) [2024-04-27 19:12:49,253][54587] Fps is (10 sec: 55705.1, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 7471726592. Throughput: 0: 55941.6. Samples: 376908320. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:49,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 19:12:52,357][54818] Updated weights for policy 0, policy_version 456048 (0.0031) [2024-04-27 19:12:54,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7471988736. Throughput: 0: 55772.4. Samples: 377236920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:54,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 19:12:55,005][54818] Updated weights for policy 0, policy_version 456058 (0.0034) [2024-04-27 19:12:59,041][54818] Updated weights for policy 0, policy_version 456068 (0.0026) [2024-04-27 19:12:59,253][54587] Fps is (10 sec: 50790.7, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 7472234496. Throughput: 0: 55549.0. Samples: 377396060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:12:59,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 19:13:00,936][54818] Updated weights for policy 0, policy_version 456078 (0.0027) [2024-04-27 19:13:04,253][54587] Fps is (10 sec: 52429.4, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 7472513024. Throughput: 0: 55441.0. Samples: 377728820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:13:04,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 19:13:04,851][54818] Updated weights for policy 0, policy_version 456088 (0.0031) [2024-04-27 19:13:06,720][54818] Updated weights for policy 0, policy_version 456098 (0.0031) [2024-04-27 19:13:09,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7472807936. Throughput: 0: 55423.5. Samples: 378064380. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 19:13:09,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 19:13:10,654][54818] Updated weights for policy 0, policy_version 456108 (0.0028) [2024-04-27 19:13:12,695][54818] Updated weights for policy 0, policy_version 456118 (0.0033) [2024-04-27 19:13:14,253][54587] Fps is (10 sec: 58980.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7473102848. Throughput: 0: 55659.8. Samples: 378233080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:14,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 19:13:16,549][54818] Updated weights for policy 0, policy_version 456128 (0.0031) [2024-04-27 19:13:18,648][54818] Updated weights for policy 0, policy_version 456138 (0.0032) [2024-04-27 19:13:19,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7473381376. Throughput: 0: 55623.1. Samples: 378568960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:19,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 19:13:22,329][54818] Updated weights for policy 0, policy_version 456148 (0.0028) [2024-04-27 19:13:24,253][54587] Fps is (10 sec: 55706.8, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 7473659904. Throughput: 0: 55601.9. Samples: 378900660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:24,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 19:13:24,508][54818] Updated weights for policy 0, policy_version 456158 (0.0029) [2024-04-27 19:13:28,110][54818] Updated weights for policy 0, policy_version 456168 (0.0035) [2024-04-27 19:13:29,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7473938432. Throughput: 0: 55345.6. Samples: 379070780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 19:13:30,224][54818] Updated weights for policy 0, policy_version 456178 (0.0032) [2024-04-27 19:13:33,992][54818] Updated weights for policy 0, policy_version 456188 (0.0029) [2024-04-27 19:13:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 7474200576. Throughput: 0: 55486.9. Samples: 379405220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:34,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 19:13:35,096][54798] Signal inference workers to stop experience collection... (5400 times) [2024-04-27 19:13:35,103][54798] Signal inference workers to resume experience collection... (5400 times) [2024-04-27 19:13:35,118][54818] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-04-27 19:13:35,137][54818] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-04-27 19:13:36,043][54818] Updated weights for policy 0, policy_version 456198 (0.0034) [2024-04-27 19:13:39,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 7474479104. Throughput: 0: 55613.3. Samples: 379739520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:39,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 19:13:39,857][54818] Updated weights for policy 0, policy_version 456208 (0.0025) [2024-04-27 19:13:41,986][54818] Updated weights for policy 0, policy_version 456218 (0.0030) [2024-04-27 19:13:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7474774016. Throughput: 0: 55587.2. Samples: 379897480. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:44,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 19:13:45,716][54818] Updated weights for policy 0, policy_version 456228 (0.0036) [2024-04-27 19:13:47,976][54818] Updated weights for policy 0, policy_version 456238 (0.0026) [2024-04-27 19:13:49,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7475052544. Throughput: 0: 55628.6. Samples: 380232120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:49,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 19:13:49,396][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000456242_7475068928.pth... [2024-04-27 19:13:49,440][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000455426_7461699584.pth [2024-04-27 19:13:51,672][54818] Updated weights for policy 0, policy_version 456248 (0.0030) [2024-04-27 19:13:54,233][54818] Updated weights for policy 0, policy_version 456258 (0.0026) [2024-04-27 19:13:54,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7475331072. Throughput: 0: 55555.1. Samples: 380564360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:54,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 19:13:57,387][54818] Updated weights for policy 0, policy_version 456268 (0.0031) [2024-04-27 19:13:59,253][54587] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 7475609600. Throughput: 0: 55694.3. Samples: 380739320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:13:59,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 19:14:00,354][54818] Updated weights for policy 0, policy_version 456278 (0.0033) [2024-04-27 19:14:03,165][54818] Updated weights for policy 0, policy_version 456288 (0.0027) [2024-04-27 19:14:04,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55483.4). 
Total num frames: 7475855360. Throughput: 0: 55644.6. Samples: 381072960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:14:04,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 19:14:06,117][54818] Updated weights for policy 0, policy_version 456298 (0.0035) [2024-04-27 19:14:08,987][54818] Updated weights for policy 0, policy_version 456308 (0.0032) [2024-04-27 19:14:09,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7476150272. Throughput: 0: 55521.7. Samples: 381399140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:14:09,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 19:14:12,068][54818] Updated weights for policy 0, policy_version 456318 (0.0028) [2024-04-27 19:14:14,253][54587] Fps is (10 sec: 55704.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7476412416. Throughput: 0: 55426.6. Samples: 381564980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:14:14,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 19:14:15,055][54818] Updated weights for policy 0, policy_version 456328 (0.0023) [2024-04-27 19:14:17,956][54818] Updated weights for policy 0, policy_version 456338 (0.0033) [2024-04-27 19:14:19,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7476707328. Throughput: 0: 55427.0. Samples: 381899440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:14:19,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 19:14:20,878][54818] Updated weights for policy 0, policy_version 456348 (0.0029) [2024-04-27 19:14:23,802][54818] Updated weights for policy 0, policy_version 456358 (0.0027) [2024-04-27 19:14:24,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7476985856. Throughput: 0: 55406.3. Samples: 382232800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:14:24,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 19:14:26,880][54818] Updated weights for policy 0, policy_version 456368 (0.0029) [2024-04-27 19:14:29,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7477264384. Throughput: 0: 55596.8. Samples: 382399340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-27 19:14:29,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 19:14:29,630][54818] Updated weights for policy 0, policy_version 456378 (0.0032) [2024-04-27 19:14:32,781][54818] Updated weights for policy 0, policy_version 456388 (0.0025) [2024-04-27 19:14:34,022][54798] Signal inference workers to stop experience collection... (5450 times) [2024-04-27 19:14:34,048][54818] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-04-27 19:14:34,076][54798] Signal inference workers to resume experience collection... (5450 times) [2024-04-27 19:14:34,081][54818] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-04-27 19:14:34,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7477559296. Throughput: 0: 55597.4. Samples: 382734000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:14:34,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 19:14:35,464][54818] Updated weights for policy 0, policy_version 456398 (0.0030) [2024-04-27 19:14:38,595][54818] Updated weights for policy 0, policy_version 456408 (0.0033) [2024-04-27 19:14:39,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7477805056. Throughput: 0: 55545.4. Samples: 383063900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:14:39,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 19:14:41,430][54818] Updated weights for policy 0, policy_version 456418 (0.0025) [2024-04-27 19:14:44,253][54587] Fps is (10 sec: 52429.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7478083584. Throughput: 0: 55430.0. Samples: 383233660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:14:44,253][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 19:14:44,501][54818] Updated weights for policy 0, policy_version 456428 (0.0030) [2024-04-27 19:14:47,279][54818] Updated weights for policy 0, policy_version 456438 (0.0031) [2024-04-27 19:14:49,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7478362112. Throughput: 0: 55340.2. Samples: 383563280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:14:49,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 19:14:50,472][54818] Updated weights for policy 0, policy_version 456448 (0.0035) [2024-04-27 19:14:53,182][54818] Updated weights for policy 0, policy_version 456458 (0.0030) [2024-04-27 19:14:54,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 7478640640. Throughput: 0: 55511.5. Samples: 383897160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:14:54,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 19:14:56,384][54818] Updated weights for policy 0, policy_version 456468 (0.0026) [2024-04-27 19:14:59,105][54818] Updated weights for policy 0, policy_version 456478 (0.0027) [2024-04-27 19:14:59,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7478935552. Throughput: 0: 55651.2. Samples: 384069280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:14:59,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 19:15:02,291][54818] Updated weights for policy 0, policy_version 456488 (0.0028) [2024-04-27 19:15:04,253][54587] Fps is (10 sec: 58982.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7479230464. Throughput: 0: 55640.4. Samples: 384403260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:04,254][54587] Avg episode reward: [(0, '0.677')] [2024-04-27 19:15:04,954][54818] Updated weights for policy 0, policy_version 456498 (0.0029) [2024-04-27 19:15:08,117][54818] Updated weights for policy 0, policy_version 456508 (0.0027) [2024-04-27 19:15:09,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 7479508992. Throughput: 0: 55711.2. Samples: 384739800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:09,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 19:15:10,937][54818] Updated weights for policy 0, policy_version 456518 (0.0032) [2024-04-27 19:15:13,953][54818] Updated weights for policy 0, policy_version 456528 (0.0030) [2024-04-27 19:15:14,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 7479771136. Throughput: 0: 55770.7. Samples: 384909020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:14,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 19:15:16,713][54818] Updated weights for policy 0, policy_version 456538 (0.0031) [2024-04-27 19:15:19,253][54587] Fps is (10 sec: 52428.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7480033280. Throughput: 0: 55792.0. Samples: 385244640. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:19,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 19:15:19,725][54818] Updated weights for policy 0, policy_version 456548 (0.0023) [2024-04-27 19:15:22,669][54818] Updated weights for policy 0, policy_version 456558 (0.0031) [2024-04-27 19:15:24,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7480328192. Throughput: 0: 55782.7. Samples: 385574120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:24,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 19:15:25,655][54818] Updated weights for policy 0, policy_version 456568 (0.0032) [2024-04-27 19:15:28,466][54818] Updated weights for policy 0, policy_version 456578 (0.0030) [2024-04-27 19:15:29,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7480590336. Throughput: 0: 55628.8. Samples: 385736960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:29,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 19:15:31,717][54818] Updated weights for policy 0, policy_version 456588 (0.0031) [2024-04-27 19:15:34,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7480885248. Throughput: 0: 55780.6. Samples: 386073400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:34,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 19:15:34,372][54818] Updated weights for policy 0, policy_version 456598 (0.0034) [2024-04-27 19:15:37,610][54818] Updated weights for policy 0, policy_version 456608 (0.0032) [2024-04-27 19:15:39,253][54587] Fps is (10 sec: 58982.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7481180160. Throughput: 0: 55827.0. Samples: 386409380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:39,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 19:15:40,374][54818] Updated weights for policy 0, policy_version 456618 (0.0037) [2024-04-27 19:15:43,475][54818] Updated weights for policy 0, policy_version 456628 (0.0031) [2024-04-27 19:15:43,953][54798] Signal inference workers to stop experience collection... (5500 times) [2024-04-27 19:15:43,954][54798] Signal inference workers to resume experience collection... (5500 times) [2024-04-27 19:15:43,969][54818] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-04-27 19:15:43,969][54818] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-04-27 19:15:44,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 7481442304. Throughput: 0: 55749.1. Samples: 386577980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:44,253][54587] Avg episode reward: [(0, '0.520')] [2024-04-27 19:15:46,194][54818] Updated weights for policy 0, policy_version 456638 (0.0035) [2024-04-27 19:15:49,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 7481704448. Throughput: 0: 55766.3. Samples: 386912740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:49,253][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 19:15:49,295][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000456648_7481720832.pth... [2024-04-27 19:15:49,301][54818] Updated weights for policy 0, policy_version 456648 (0.0029) [2024-04-27 19:15:49,356][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000455833_7468367872.pth [2024-04-27 19:15:51,972][54818] Updated weights for policy 0, policy_version 456658 (0.0027) [2024-04-27 19:15:54,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7481982976. Throughput: 0: 55656.5. Samples: 387244340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-27 19:15:54,254][54587] Avg episode reward: [(0, '0.499')] [2024-04-27 19:15:55,315][54818] Updated weights for policy 0, policy_version 456668 (0.0027) [2024-04-27 19:15:58,059][54818] Updated weights for policy 0, policy_version 456678 (0.0036) [2024-04-27 19:15:59,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7482277888. Throughput: 0: 55570.3. Samples: 387409680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:15:59,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 19:16:01,154][54818] Updated weights for policy 0, policy_version 456688 (0.0027) [2024-04-27 19:16:04,019][54818] Updated weights for policy 0, policy_version 456698 (0.0030) [2024-04-27 19:16:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7482556416. Throughput: 0: 55551.6. Samples: 387744460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:04,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 19:16:06,853][54818] Updated weights for policy 0, policy_version 456708 (0.0029) [2024-04-27 19:16:09,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 7482818560. Throughput: 0: 55719.5. Samples: 388081500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:09,254][54587] Avg episode reward: [(0, '0.513')] [2024-04-27 19:16:09,830][54818] Updated weights for policy 0, policy_version 456718 (0.0028) [2024-04-27 19:16:12,714][54818] Updated weights for policy 0, policy_version 456728 (0.0028) [2024-04-27 19:16:14,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7483113472. Throughput: 0: 55943.6. Samples: 388254420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:14,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 19:16:15,599][54818] Updated weights for policy 0, policy_version 456738 (0.0028) [2024-04-27 19:16:18,747][54818] Updated weights for policy 0, policy_version 456748 (0.0030) [2024-04-27 19:16:19,253][54587] Fps is (10 sec: 58982.7, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 7483408384. Throughput: 0: 55836.5. Samples: 388586040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:19,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 19:16:21,359][54818] Updated weights for policy 0, policy_version 456758 (0.0032) [2024-04-27 19:16:24,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7483654144. Throughput: 0: 55928.6. Samples: 388926160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:24,253][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 19:16:24,458][54818] Updated weights for policy 0, policy_version 456768 (0.0026) [2024-04-27 19:16:27,174][54818] Updated weights for policy 0, policy_version 456778 (0.0030) [2024-04-27 19:16:29,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7483932672. Throughput: 0: 55823.0. Samples: 389090020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:29,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 19:16:30,229][54818] Updated weights for policy 0, policy_version 456788 (0.0028) [2024-04-27 19:16:32,955][54818] Updated weights for policy 0, policy_version 456798 (0.0025) [2024-04-27 19:16:34,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7484243968. Throughput: 0: 55854.6. Samples: 389426200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:34,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 19:16:36,182][54818] Updated weights for policy 0, policy_version 456808 (0.0031) [2024-04-27 19:16:38,713][54818] Updated weights for policy 0, policy_version 456818 (0.0024) [2024-04-27 19:16:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7484506112. Throughput: 0: 55770.5. Samples: 389754020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:39,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-27 19:16:42,106][54818] Updated weights for policy 0, policy_version 456828 (0.0037) [2024-04-27 19:16:44,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7484801024. Throughput: 0: 56088.9. Samples: 389933680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:44,253][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 19:16:44,450][54818] Updated weights for policy 0, policy_version 456838 (0.0025) [2024-04-27 19:16:47,878][54818] Updated weights for policy 0, policy_version 456848 (0.0026) [2024-04-27 19:16:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7485063168. Throughput: 0: 56005.8. Samples: 390264720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:49,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 19:16:50,134][54798] Signal inference workers to stop experience collection... (5550 times) [2024-04-27 19:16:50,135][54798] Signal inference workers to resume experience collection... 
(5550 times) [2024-04-27 19:16:50,155][54818] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-04-27 19:16:50,155][54818] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-04-27 19:16:50,378][54818] Updated weights for policy 0, policy_version 456858 (0.0029) [2024-04-27 19:16:53,730][54818] Updated weights for policy 0, policy_version 456868 (0.0028) [2024-04-27 19:16:54,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7485341696. Throughput: 0: 55838.1. Samples: 390594220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:54,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 19:16:56,545][54818] Updated weights for policy 0, policy_version 456878 (0.0032) [2024-04-27 19:16:59,254][54587] Fps is (10 sec: 54065.8, 60 sec: 55432.2, 300 sec: 55538.9). Total num frames: 7485603840. Throughput: 0: 55625.5. Samples: 390757580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:16:59,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 19:16:59,673][54818] Updated weights for policy 0, policy_version 456888 (0.0030) [2024-04-27 19:17:02,381][54818] Updated weights for policy 0, policy_version 456898 (0.0026) [2024-04-27 19:17:04,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 7485898752. Throughput: 0: 55600.3. Samples: 391088060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:17:04,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 19:17:05,556][54818] Updated weights for policy 0, policy_version 456908 (0.0027) [2024-04-27 19:17:08,210][54818] Updated weights for policy 0, policy_version 456918 (0.0024) [2024-04-27 19:17:09,253][54587] Fps is (10 sec: 57345.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7486177280. Throughput: 0: 55573.3. Samples: 391426960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:17:09,253][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 19:17:11,335][54818] Updated weights for policy 0, policy_version 456928 (0.0029) [2024-04-27 19:17:13,931][54818] Updated weights for policy 0, policy_version 456938 (0.0027) [2024-04-27 19:17:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7486472192. Throughput: 0: 55805.8. Samples: 391601280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:17:14,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 19:17:17,058][54818] Updated weights for policy 0, policy_version 456948 (0.0030) [2024-04-27 19:17:19,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7486767104. Throughput: 0: 55867.5. Samples: 391940240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 19:17:19,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 19:17:19,824][54818] Updated weights for policy 0, policy_version 456958 (0.0031) [2024-04-27 19:17:23,084][54818] Updated weights for policy 0, policy_version 456968 (0.0031) [2024-04-27 19:17:24,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7487012864. Throughput: 0: 56025.5. Samples: 392275160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:17:24,253][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 19:17:25,733][54818] Updated weights for policy 0, policy_version 456978 (0.0031) [2024-04-27 19:17:29,044][54818] Updated weights for policy 0, policy_version 456988 (0.0024) [2024-04-27 19:17:29,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55978.7, 300 sec: 55594.6). Total num frames: 7487291392. Throughput: 0: 55697.4. Samples: 392440060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:17:29,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 19:17:31,505][54818] Updated weights for policy 0, policy_version 456998 (0.0028) [2024-04-27 19:17:34,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7487569920. Throughput: 0: 55720.9. Samples: 392772160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:17:34,254][54587] Avg episode reward: [(0, '0.503')] [2024-04-27 19:17:34,989][54818] Updated weights for policy 0, policy_version 457008 (0.0026) [2024-04-27 19:17:37,283][54818] Updated weights for policy 0, policy_version 457018 (0.0031) [2024-04-27 19:17:39,253][54587] Fps is (10 sec: 55704.4, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7487848448. Throughput: 0: 55877.7. Samples: 393108720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:17:39,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 19:17:40,949][54818] Updated weights for policy 0, policy_version 457028 (0.0032) [2024-04-27 19:17:43,213][54818] Updated weights for policy 0, policy_version 457038 (0.0028) [2024-04-27 19:17:44,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7488159744. Throughput: 0: 56028.3. Samples: 393278840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:17:44,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 19:17:46,795][54818] Updated weights for policy 0, policy_version 457048 (0.0028) [2024-04-27 19:17:47,518][54798] Signal inference workers to stop experience collection... (5600 times) [2024-04-27 19:17:47,518][54798] Signal inference workers to resume experience collection... 
(5600 times) [2024-04-27 19:17:47,528][54818] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-04-27 19:17:47,529][54818] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-04-27 19:17:49,179][54818] Updated weights for policy 0, policy_version 457058 (0.0031) [2024-04-27 19:17:49,253][54587] Fps is (10 sec: 58982.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 7488438272. Throughput: 0: 56087.0. Samples: 393611980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:17:49,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 19:17:49,307][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000457059_7488454656.pth... [2024-04-27 19:17:49,351][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000456242_7475068928.pth [2024-04-27 19:17:52,584][54818] Updated weights for policy 0, policy_version 457068 (0.0024) [2024-04-27 19:17:54,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7488700416. Throughput: 0: 55905.7. Samples: 393942720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:17:54,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 19:17:54,906][54818] Updated weights for policy 0, policy_version 457078 (0.0025) [2024-04-27 19:17:58,325][54818] Updated weights for policy 0, policy_version 457088 (0.0026) [2024-04-27 19:17:59,253][54587] Fps is (10 sec: 54068.2, 60 sec: 56252.0, 300 sec: 55816.7). Total num frames: 7488978944. Throughput: 0: 56037.4. Samples: 394122960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:17:59,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 19:18:00,689][54818] Updated weights for policy 0, policy_version 457098 (0.0031) [2024-04-27 19:18:04,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7489241088. Throughput: 0: 55956.9. Samples: 394458300. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:18:04,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 19:18:04,380][54818] Updated weights for policy 0, policy_version 457108 (0.0027) [2024-04-27 19:18:06,580][54818] Updated weights for policy 0, policy_version 457118 (0.0026) [2024-04-27 19:18:09,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7489536000. Throughput: 0: 55914.2. Samples: 394791300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:18:09,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 19:18:10,123][54818] Updated weights for policy 0, policy_version 457128 (0.0031) [2024-04-27 19:18:12,532][54818] Updated weights for policy 0, policy_version 457138 (0.0028) [2024-04-27 19:18:14,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7489798144. Throughput: 0: 55760.9. Samples: 394949300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:18:14,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 19:18:16,045][54818] Updated weights for policy 0, policy_version 457148 (0.0030) [2024-04-27 19:18:18,322][54818] Updated weights for policy 0, policy_version 457158 (0.0031) [2024-04-27 19:18:19,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7490109440. Throughput: 0: 55956.0. Samples: 395290180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:18:19,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 19:18:21,954][54818] Updated weights for policy 0, policy_version 457168 (0.0025) [2024-04-27 19:18:24,204][54818] Updated weights for policy 0, policy_version 457178 (0.0029) [2024-04-27 19:18:24,253][54587] Fps is (10 sec: 60620.0, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 7490404352. Throughput: 0: 55899.6. Samples: 395624200. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:18:24,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 19:18:27,714][54818] Updated weights for policy 0, policy_version 457188 (0.0035) [2024-04-27 19:18:29,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7490650112. Throughput: 0: 55976.1. Samples: 395797760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:18:29,253][54587] Avg episode reward: [(0, '0.507')] [2024-04-27 19:18:29,987][54818] Updated weights for policy 0, policy_version 457198 (0.0028) [2024-04-27 19:18:33,454][54818] Updated weights for policy 0, policy_version 457208 (0.0026) [2024-04-27 19:18:34,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7490928640. Throughput: 0: 56050.5. Samples: 396134240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:18:34,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 19:18:35,908][54818] Updated weights for policy 0, policy_version 457218 (0.0038) [2024-04-27 19:18:39,253][54587] Fps is (10 sec: 55704.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7491207168. Throughput: 0: 56073.5. Samples: 396466040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:18:39,263][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:18:39,391][54818] Updated weights for policy 0, policy_version 457228 (0.0034) [2024-04-27 19:18:42,192][54818] Updated weights for policy 0, policy_version 457238 (0.0024) [2024-04-27 19:18:44,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7491485696. Throughput: 0: 55444.9. Samples: 396617980. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:18:44,262][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 19:18:45,348][54818] Updated weights for policy 0, policy_version 457248 (0.0031) [2024-04-27 19:18:47,922][54818] Updated weights for policy 0, policy_version 457258 (0.0027) [2024-04-27 19:18:49,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7491764224. Throughput: 0: 55510.1. Samples: 396956260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:18:49,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:18:51,161][54818] Updated weights for policy 0, policy_version 457268 (0.0031) [2024-04-27 19:18:53,687][54818] Updated weights for policy 0, policy_version 457278 (0.0030) [2024-04-27 19:18:54,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7492059136. Throughput: 0: 55508.9. Samples: 397289200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:18:54,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:18:56,497][54798] Signal inference workers to stop experience collection... (5650 times) [2024-04-27 19:18:56,533][54818] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-04-27 19:18:56,589][54798] Signal inference workers to resume experience collection... (5650 times) [2024-04-27 19:18:56,589][54818] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-04-27 19:18:56,967][54818] Updated weights for policy 0, policy_version 457288 (0.0031) [2024-04-27 19:18:59,253][54587] Fps is (10 sec: 57345.2, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 7492337664. Throughput: 0: 55902.3. Samples: 397464900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:18:59,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 19:18:59,662][54818] Updated weights for policy 0, policy_version 457298 (0.0027) [2024-04-27 19:19:02,887][54818] Updated weights for policy 0, policy_version 457308 (0.0029) [2024-04-27 19:19:04,253][54587] Fps is (10 sec: 55704.6, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 7492616192. Throughput: 0: 55755.0. Samples: 397799160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:04,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 19:19:05,732][54818] Updated weights for policy 0, policy_version 457318 (0.0032) [2024-04-27 19:19:08,715][54818] Updated weights for policy 0, policy_version 457328 (0.0030) [2024-04-27 19:19:09,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 7492861952. Throughput: 0: 55706.4. Samples: 398130980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:09,253][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 19:19:11,527][54818] Updated weights for policy 0, policy_version 457338 (0.0030) [2024-04-27 19:19:14,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7493156864. Throughput: 0: 55408.7. Samples: 398291160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:14,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 19:19:14,521][54818] Updated weights for policy 0, policy_version 457348 (0.0025) [2024-04-27 19:19:17,422][54818] Updated weights for policy 0, policy_version 457358 (0.0031) [2024-04-27 19:19:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 7493435392. Throughput: 0: 55379.6. Samples: 398626320. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:19,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 19:19:20,473][54818] Updated weights for policy 0, policy_version 457368 (0.0033) [2024-04-27 19:19:23,531][54818] Updated weights for policy 0, policy_version 457378 (0.0032) [2024-04-27 19:19:24,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 7493713920. Throughput: 0: 55455.0. Samples: 398961500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:24,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 19:19:26,404][54818] Updated weights for policy 0, policy_version 457388 (0.0027) [2024-04-27 19:19:29,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7493992448. Throughput: 0: 55852.0. Samples: 399131320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:29,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 19:19:29,309][54818] Updated weights for policy 0, policy_version 457398 (0.0028) [2024-04-27 19:19:32,266][54818] Updated weights for policy 0, policy_version 457408 (0.0027) [2024-04-27 19:19:34,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7494287360. Throughput: 0: 55744.1. Samples: 399464740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:34,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:19:35,078][54818] Updated weights for policy 0, policy_version 457418 (0.0030) [2024-04-27 19:19:38,119][54818] Updated weights for policy 0, policy_version 457428 (0.0028) [2024-04-27 19:19:39,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 7494533120. Throughput: 0: 55820.0. Samples: 399801100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:39,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 19:19:41,007][54818] Updated weights for policy 0, policy_version 457438 (0.0031) [2024-04-27 19:19:44,106][54818] Updated weights for policy 0, policy_version 457448 (0.0028) [2024-04-27 19:19:44,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7494828032. Throughput: 0: 55530.6. Samples: 399963780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:44,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 19:19:47,127][54818] Updated weights for policy 0, policy_version 457458 (0.0032) [2024-04-27 19:19:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 7495106560. Throughput: 0: 55568.7. Samples: 400299740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:49,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 19:19:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000457465_7495106560.pth... [2024-04-27 19:19:49,314][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000456648_7481720832.pth [2024-04-27 19:19:49,886][54818] Updated weights for policy 0, policy_version 457468 (0.0033) [2024-04-27 19:19:51,074][54798] Signal inference workers to stop experience collection... (5700 times) [2024-04-27 19:19:51,074][54798] Signal inference workers to resume experience collection... (5700 times) [2024-04-27 19:19:51,087][54818] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-04-27 19:19:51,087][54818] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-04-27 19:19:53,153][54818] Updated weights for policy 0, policy_version 457478 (0.0032) [2024-04-27 19:19:54,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 7495368704. Throughput: 0: 55712.1. Samples: 400638020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:54,253][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:19:55,711][54818] Updated weights for policy 0, policy_version 457488 (0.0032) [2024-04-27 19:19:59,113][54818] Updated weights for policy 0, policy_version 457498 (0.0032) [2024-04-27 19:19:59,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7495663616. Throughput: 0: 55874.3. Samples: 400805500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:19:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 19:20:01,657][54818] Updated weights for policy 0, policy_version 457508 (0.0026) [2024-04-27 19:20:04,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 7495958528. Throughput: 0: 55737.8. Samples: 401134520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:20:04,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:20:04,823][54818] Updated weights for policy 0, policy_version 457518 (0.0035) [2024-04-27 19:20:07,758][54818] Updated weights for policy 0, policy_version 457528 (0.0030) [2024-04-27 19:20:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7496237056. Throughput: 0: 55690.2. Samples: 401467560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 19:20:09,253][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 19:20:10,913][54818] Updated weights for policy 0, policy_version 457538 (0.0026) [2024-04-27 19:20:13,500][54818] Updated weights for policy 0, policy_version 457548 (0.0032) [2024-04-27 19:20:14,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 7496482816. Throughput: 0: 55668.0. Samples: 401636380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:14,253][54587] Avg episode reward: [(0, '0.690')] [2024-04-27 19:20:16,806][54818] Updated weights for policy 0, policy_version 457558 (0.0031) [2024-04-27 19:20:19,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 7496777728. Throughput: 0: 55756.8. Samples: 401973800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:19,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:20:19,368][54818] Updated weights for policy 0, policy_version 457568 (0.0031) [2024-04-27 19:20:22,536][54818] Updated weights for policy 0, policy_version 457578 (0.0026) [2024-04-27 19:20:24,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55705.4, 300 sec: 55816.7). Total num frames: 7497056256. Throughput: 0: 55586.1. Samples: 402302480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:24,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 19:20:25,204][54818] Updated weights for policy 0, policy_version 457588 (0.0030) [2024-04-27 19:20:28,302][54818] Updated weights for policy 0, policy_version 457598 (0.0036) [2024-04-27 19:20:29,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7497334784. Throughput: 0: 55681.2. Samples: 402469440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:29,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 19:20:31,404][54818] Updated weights for policy 0, policy_version 457608 (0.0032) [2024-04-27 19:20:34,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7497596928. Throughput: 0: 55680.9. Samples: 402805380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:34,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 19:20:34,331][54818] Updated weights for policy 0, policy_version 457618 (0.0026) [2024-04-27 19:20:37,175][54818] Updated weights for policy 0, policy_version 457628 (0.0034) [2024-04-27 19:20:39,253][54587] Fps is (10 sec: 57343.6, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 7497908224. Throughput: 0: 55629.5. Samples: 403141360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:39,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 19:20:40,224][54818] Updated weights for policy 0, policy_version 457638 (0.0029) [2024-04-27 19:20:40,736][54798] Signal inference workers to stop experience collection... (5750 times) [2024-04-27 19:20:40,769][54818] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-04-27 19:20:40,794][54798] Signal inference workers to resume experience collection... (5750 times) [2024-04-27 19:20:40,799][54818] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-04-27 19:20:42,973][54818] Updated weights for policy 0, policy_version 457648 (0.0029) [2024-04-27 19:20:44,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7498186752. Throughput: 0: 55612.0. Samples: 403308040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:44,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 19:20:45,888][54818] Updated weights for policy 0, policy_version 457658 (0.0031) [2024-04-27 19:20:48,789][54818] Updated weights for policy 0, policy_version 457668 (0.0028) [2024-04-27 19:20:49,254][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 7498448896. Throughput: 0: 55802.8. Samples: 403645660. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:49,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:20:51,981][54818] Updated weights for policy 0, policy_version 457678 (0.0029) [2024-04-27 19:20:54,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7498727424. Throughput: 0: 55873.8. Samples: 403981880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:54,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:20:54,616][54818] Updated weights for policy 0, policy_version 457688 (0.0033) [2024-04-27 19:20:57,899][54818] Updated weights for policy 0, policy_version 457698 (0.0028) [2024-04-27 19:20:59,253][54587] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7499005952. Throughput: 0: 55809.2. Samples: 404147800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:20:59,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 19:21:00,712][54818] Updated weights for policy 0, policy_version 457708 (0.0030) [2024-04-27 19:21:03,608][54818] Updated weights for policy 0, policy_version 457718 (0.0032) [2024-04-27 19:21:04,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 7499284480. Throughput: 0: 55773.5. Samples: 404483600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:21:04,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 19:21:06,466][54818] Updated weights for policy 0, policy_version 457728 (0.0038) [2024-04-27 19:21:09,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7499563008. Throughput: 0: 55950.3. Samples: 404820240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:21:09,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 19:21:09,396][54818] Updated weights for policy 0, policy_version 457738 (0.0027) [2024-04-27 19:21:12,508][54818] Updated weights for policy 0, policy_version 457748 (0.0030) [2024-04-27 19:21:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7499841536. Throughput: 0: 55896.6. Samples: 404984780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:21:14,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 19:21:15,283][54818] Updated weights for policy 0, policy_version 457758 (0.0032) [2024-04-27 19:21:18,248][54818] Updated weights for policy 0, policy_version 457768 (0.0027) [2024-04-27 19:21:19,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 7500136448. Throughput: 0: 55780.5. Samples: 405315500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:21:19,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 19:21:21,343][54818] Updated weights for policy 0, policy_version 457778 (0.0025) [2024-04-27 19:21:24,239][54818] Updated weights for policy 0, policy_version 457788 (0.0027) [2024-04-27 19:21:24,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 7500398592. Throughput: 0: 55771.4. Samples: 405651060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:21:24,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 19:21:27,046][54818] Updated weights for policy 0, policy_version 457798 (0.0027) [2024-04-27 19:21:29,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7500677120. Throughput: 0: 55719.5. Samples: 405815420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:21:29,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 19:21:30,049][54818] Updated weights for policy 0, policy_version 457808 (0.0031) [2024-04-27 19:21:32,819][54818] Updated weights for policy 0, policy_version 457818 (0.0026) [2024-04-27 19:21:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7500939264. Throughput: 0: 55754.6. Samples: 406154600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:21:34,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 19:21:35,915][54818] Updated weights for policy 0, policy_version 457828 (0.0029) [2024-04-27 19:21:38,703][54818] Updated weights for policy 0, policy_version 457838 (0.0030) [2024-04-27 19:21:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 7501234176. Throughput: 0: 55579.1. Samples: 406482940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:21:39,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 19:21:39,669][54798] Signal inference workers to stop experience collection... (5800 times) [2024-04-27 19:21:39,701][54818] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-04-27 19:21:39,730][54798] Signal inference workers to resume experience collection... (5800 times) [2024-04-27 19:21:39,730][54818] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-04-27 19:21:41,696][54818] Updated weights for policy 0, policy_version 457848 (0.0031) [2024-04-27 19:21:44,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7501512704. Throughput: 0: 55594.6. Samples: 406649560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:21:44,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 19:21:44,712][54818] Updated weights for policy 0, policy_version 457858 (0.0032) [2024-04-27 19:21:47,688][54818] Updated weights for policy 0, policy_version 457868 (0.0024) [2024-04-27 19:21:49,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 7501791232. Throughput: 0: 55557.3. Samples: 406983680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:21:49,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 19:21:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000457873_7501791232.pth... [2024-04-27 19:21:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000457059_7488454656.pth [2024-04-27 19:21:50,436][54818] Updated weights for policy 0, policy_version 457878 (0.0024) [2024-04-27 19:21:53,589][54818] Updated weights for policy 0, policy_version 457888 (0.0025) [2024-04-27 19:21:54,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55872.3). Total num frames: 7502086144. Throughput: 0: 55484.5. Samples: 407317040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:21:54,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 19:21:56,156][54818] Updated weights for policy 0, policy_version 457898 (0.0029) [2024-04-27 19:21:59,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7502331904. Throughput: 0: 55463.5. Samples: 407480640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:21:59,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 19:21:59,528][54818] Updated weights for policy 0, policy_version 457908 (0.0026) [2024-04-27 19:22:02,048][54818] Updated weights for policy 0, policy_version 457918 (0.0035) [2024-04-27 19:22:04,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55761.1). 
Total num frames: 7502626816. Throughput: 0: 55535.5. Samples: 407814600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:04,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 19:22:05,241][54818] Updated weights for policy 0, policy_version 457928 (0.0031) [2024-04-27 19:22:08,353][54818] Updated weights for policy 0, policy_version 457938 (0.0034) [2024-04-27 19:22:09,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7502905344. Throughput: 0: 55461.5. Samples: 408146840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:09,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 19:22:11,131][54818] Updated weights for policy 0, policy_version 457948 (0.0026) [2024-04-27 19:22:14,038][54818] Updated weights for policy 0, policy_version 457958 (0.0033) [2024-04-27 19:22:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7503183872. Throughput: 0: 55637.0. Samples: 408319080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:14,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 19:22:16,934][54818] Updated weights for policy 0, policy_version 457968 (0.0037) [2024-04-27 19:22:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 7503478784. Throughput: 0: 55435.7. Samples: 408649220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:19,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 19:22:19,846][54818] Updated weights for policy 0, policy_version 457978 (0.0029) [2024-04-27 19:22:22,834][54818] Updated weights for policy 0, policy_version 457988 (0.0030) [2024-04-27 19:22:24,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7503724544. Throughput: 0: 55538.6. Samples: 408982180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:24,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 19:22:25,715][54818] Updated weights for policy 0, policy_version 457998 (0.0026) [2024-04-27 19:22:28,779][54818] Updated weights for policy 0, policy_version 458008 (0.0030) [2024-04-27 19:22:29,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 7504035840. Throughput: 0: 55601.2. Samples: 409151620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:29,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 19:22:31,629][54818] Updated weights for policy 0, policy_version 458018 (0.0027) [2024-04-27 19:22:34,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7504265216. Throughput: 0: 55649.4. Samples: 409487900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:34,253][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 19:22:34,873][54818] Updated weights for policy 0, policy_version 458028 (0.0027) [2024-04-27 19:22:37,346][54818] Updated weights for policy 0, policy_version 458038 (0.0026) [2024-04-27 19:22:39,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7504576512. Throughput: 0: 55691.5. Samples: 409823160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:39,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 19:22:40,626][54818] Updated weights for policy 0, policy_version 458048 (0.0027) [2024-04-27 19:22:42,103][54798] Signal inference workers to stop experience collection... (5850 times) [2024-04-27 19:22:42,104][54798] Signal inference workers to resume experience collection... 
(5850 times) [2024-04-27 19:22:42,129][54818] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-04-27 19:22:42,129][54818] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-04-27 19:22:43,163][54818] Updated weights for policy 0, policy_version 458058 (0.0032) [2024-04-27 19:22:44,253][54587] Fps is (10 sec: 60620.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7504871424. Throughput: 0: 55841.3. Samples: 409993500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:44,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 19:22:46,467][54818] Updated weights for policy 0, policy_version 458068 (0.0026) [2024-04-27 19:22:49,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7505133568. Throughput: 0: 55848.4. Samples: 410327780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:49,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 19:22:49,391][54818] Updated weights for policy 0, policy_version 458078 (0.0034) [2024-04-27 19:22:52,297][54818] Updated weights for policy 0, policy_version 458088 (0.0027) [2024-04-27 19:22:54,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7505412096. Throughput: 0: 55893.9. Samples: 410662060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-27 19:22:54,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 19:22:55,112][54818] Updated weights for policy 0, policy_version 458098 (0.0025) [2024-04-27 19:22:58,256][54818] Updated weights for policy 0, policy_version 458108 (0.0032) [2024-04-27 19:22:59,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7505690624. Throughput: 0: 55728.8. Samples: 410826880. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:22:59,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 19:23:00,859][54818] Updated weights for policy 0, policy_version 458118 (0.0026) [2024-04-27 19:23:04,185][54818] Updated weights for policy 0, policy_version 458128 (0.0026) [2024-04-27 19:23:04,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7505969152. Throughput: 0: 55905.4. Samples: 411164960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:04,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 19:23:06,897][54818] Updated weights for policy 0, policy_version 458138 (0.0025) [2024-04-27 19:23:09,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7506231296. Throughput: 0: 55910.1. Samples: 411498140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:09,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 19:23:10,168][54818] Updated weights for policy 0, policy_version 458148 (0.0031) [2024-04-27 19:23:12,818][54818] Updated weights for policy 0, policy_version 458158 (0.0025) [2024-04-27 19:23:14,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7506509824. Throughput: 0: 55692.7. Samples: 411657780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:14,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 19:23:16,057][54818] Updated weights for policy 0, policy_version 458168 (0.0025) [2024-04-27 19:23:18,577][54818] Updated weights for policy 0, policy_version 458178 (0.0033) [2024-04-27 19:23:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7506804736. Throughput: 0: 55640.7. Samples: 411991740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:19,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 19:23:21,964][54818] Updated weights for policy 0, policy_version 458188 (0.0031) [2024-04-27 19:23:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7507099648. Throughput: 0: 55578.6. Samples: 412324200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:24,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 19:23:24,315][54818] Updated weights for policy 0, policy_version 458198 (0.0034) [2024-04-27 19:23:27,877][54818] Updated weights for policy 0, policy_version 458208 (0.0032) [2024-04-27 19:23:29,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 7507361792. Throughput: 0: 55747.2. Samples: 412502120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:29,253][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 19:23:30,361][54818] Updated weights for policy 0, policy_version 458218 (0.0037) [2024-04-27 19:23:33,704][54818] Updated weights for policy 0, policy_version 458228 (0.0030) [2024-04-27 19:23:34,253][54587] Fps is (10 sec: 54067.6, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7507640320. Throughput: 0: 55603.1. Samples: 412829920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:34,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 19:23:36,353][54818] Updated weights for policy 0, policy_version 458238 (0.0034) [2024-04-27 19:23:39,253][54587] Fps is (10 sec: 54066.0, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 7507902464. Throughput: 0: 55605.6. Samples: 413164320. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 19:23:39,556][54818] Updated weights for policy 0, policy_version 458248 (0.0035) [2024-04-27 19:23:42,084][54818] Updated weights for policy 0, policy_version 458258 (0.0032) [2024-04-27 19:23:44,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7508180992. Throughput: 0: 55467.1. Samples: 413322900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:44,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 19:23:45,476][54818] Updated weights for policy 0, policy_version 458268 (0.0029) [2024-04-27 19:23:48,135][54818] Updated weights for policy 0, policy_version 458278 (0.0030) [2024-04-27 19:23:49,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7508459520. Throughput: 0: 55397.9. Samples: 413657860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:49,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 19:23:49,261][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000458280_7508459520.pth... [2024-04-27 19:23:49,312][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000457465_7495106560.pth [2024-04-27 19:23:50,036][54798] Signal inference workers to stop experience collection... (5900 times) [2024-04-27 19:23:50,037][54798] Signal inference workers to resume experience collection... 
(5900 times) [2024-04-27 19:23:50,063][54818] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-04-27 19:23:50,063][54818] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-04-27 19:23:51,351][54818] Updated weights for policy 0, policy_version 458288 (0.0030) [2024-04-27 19:23:54,194][54818] Updated weights for policy 0, policy_version 458298 (0.0026) [2024-04-27 19:23:54,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 7508754432. Throughput: 0: 55382.4. Samples: 413990340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:54,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 19:23:57,251][54818] Updated weights for policy 0, policy_version 458308 (0.0028) [2024-04-27 19:23:59,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7509032960. Throughput: 0: 55686.2. Samples: 414163660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:23:59,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 19:23:59,918][54818] Updated weights for policy 0, policy_version 458318 (0.0029) [2024-04-27 19:24:03,218][54818] Updated weights for policy 0, policy_version 458328 (0.0031) [2024-04-27 19:24:04,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7509295104. Throughput: 0: 55621.3. Samples: 414494700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:24:04,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:24:05,645][54818] Updated weights for policy 0, policy_version 458338 (0.0026) [2024-04-27 19:24:08,910][54818] Updated weights for policy 0, policy_version 458348 (0.0030) [2024-04-27 19:24:09,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7509573632. Throughput: 0: 55720.9. Samples: 414831640. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:24:09,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 19:24:11,824][54818] Updated weights for policy 0, policy_version 458358 (0.0030) [2024-04-27 19:24:14,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 7509868544. Throughput: 0: 55393.5. Samples: 414994840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:24:14,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 19:24:14,635][54818] Updated weights for policy 0, policy_version 458368 (0.0031) [2024-04-27 19:24:17,608][54818] Updated weights for policy 0, policy_version 458378 (0.0027) [2024-04-27 19:24:19,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7510130688. Throughput: 0: 55478.8. Samples: 415326460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 19:24:19,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 19:24:20,611][54818] Updated weights for policy 0, policy_version 458388 (0.0026) [2024-04-27 19:24:23,572][54818] Updated weights for policy 0, policy_version 458398 (0.0029) [2024-04-27 19:24:24,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 7510409216. Throughput: 0: 55516.1. Samples: 415662540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:24:24,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 19:24:26,530][54818] Updated weights for policy 0, policy_version 458408 (0.0029) [2024-04-27 19:24:29,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7510704128. Throughput: 0: 55731.9. Samples: 415830840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:24:29,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 19:24:29,346][54818] Updated weights for policy 0, policy_version 458418 (0.0027) [2024-04-27 19:24:32,892][54818] Updated weights for policy 0, policy_version 458428 (0.0026) [2024-04-27 19:24:34,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7510982656. Throughput: 0: 55852.0. Samples: 416171200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:24:34,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 19:24:35,272][54818] Updated weights for policy 0, policy_version 458438 (0.0030) [2024-04-27 19:24:38,662][54818] Updated weights for policy 0, policy_version 458448 (0.0028) [2024-04-27 19:24:39,253][54587] Fps is (10 sec: 52429.6, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7511228416. Throughput: 0: 55761.8. Samples: 416499620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:24:39,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 19:24:41,178][54818] Updated weights for policy 0, policy_version 458458 (0.0026) [2024-04-27 19:24:44,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 7511523328. Throughput: 0: 55529.2. Samples: 416662480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:24:44,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 19:24:44,564][54818] Updated weights for policy 0, policy_version 458468 (0.0029) [2024-04-27 19:24:47,243][54818] Updated weights for policy 0, policy_version 458478 (0.0026) [2024-04-27 19:24:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7511801856. Throughput: 0: 55580.6. Samples: 416995820. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:24:49,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 19:24:50,631][54818] Updated weights for policy 0, policy_version 458488 (0.0032) [2024-04-27 19:24:53,113][54818] Updated weights for policy 0, policy_version 458498 (0.0033) [2024-04-27 19:24:54,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7512080384. Throughput: 0: 55535.1. Samples: 417330720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:24:54,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 19:24:56,480][54818] Updated weights for policy 0, policy_version 458508 (0.0033) [2024-04-27 19:24:58,882][54818] Updated weights for policy 0, policy_version 458518 (0.0030) [2024-04-27 19:24:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7512358912. Throughput: 0: 55760.7. Samples: 417504060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:24:59,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:25:02,251][54818] Updated weights for policy 0, policy_version 458528 (0.0028) [2024-04-27 19:25:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 7512653824. Throughput: 0: 55879.4. Samples: 417841040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:25:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 19:25:04,808][54818] Updated weights for policy 0, policy_version 458538 (0.0027) [2024-04-27 19:25:05,804][54798] Signal inference workers to stop experience collection... (5950 times) [2024-04-27 19:25:05,805][54798] Signal inference workers to resume experience collection... 
(5950 times) [2024-04-27 19:25:05,817][54818] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-04-27 19:25:05,817][54818] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-04-27 19:25:07,910][54818] Updated weights for policy 0, policy_version 458548 (0.0031) [2024-04-27 19:25:09,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7512915968. Throughput: 0: 55792.1. Samples: 418173180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:25:09,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 19:25:10,738][54818] Updated weights for policy 0, policy_version 458558 (0.0033) [2024-04-27 19:25:13,921][54818] Updated weights for policy 0, policy_version 458568 (0.0028) [2024-04-27 19:25:14,253][54587] Fps is (10 sec: 52429.8, 60 sec: 55159.7, 300 sec: 55594.6). Total num frames: 7513178112. Throughput: 0: 55647.8. Samples: 418334980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:25:14,253][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 19:25:16,617][54818] Updated weights for policy 0, policy_version 458578 (0.0028) [2024-04-27 19:25:19,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7513473024. Throughput: 0: 55511.6. Samples: 418669220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:25:19,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:25:20,006][54818] Updated weights for policy 0, policy_version 458588 (0.0026) [2024-04-27 19:25:22,620][54818] Updated weights for policy 0, policy_version 458598 (0.0030) [2024-04-27 19:25:24,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7513751552. Throughput: 0: 55647.5. Samples: 419003760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:25:24,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-27 19:25:25,843][54818] Updated weights for policy 0, policy_version 458608 (0.0031) [2024-04-27 19:25:28,366][54818] Updated weights for policy 0, policy_version 458618 (0.0028) [2024-04-27 19:25:29,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7514046464. Throughput: 0: 55808.8. Samples: 419173880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:25:29,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 19:25:31,779][54818] Updated weights for policy 0, policy_version 458628 (0.0029) [2024-04-27 19:25:34,181][54818] Updated weights for policy 0, policy_version 458638 (0.0033) [2024-04-27 19:25:34,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7514324992. Throughput: 0: 55859.1. Samples: 419509480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:25:34,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 19:25:37,608][54818] Updated weights for policy 0, policy_version 458648 (0.0027) [2024-04-27 19:25:39,253][54587] Fps is (10 sec: 55706.4, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 7514603520. Throughput: 0: 55731.2. Samples: 419838620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:25:39,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 19:25:40,161][54818] Updated weights for policy 0, policy_version 458658 (0.0028) [2024-04-27 19:25:43,579][54818] Updated weights for policy 0, policy_version 458668 (0.0028) [2024-04-27 19:25:44,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7514865664. Throughput: 0: 55549.3. Samples: 420003780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 19:25:44,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 19:25:46,030][54818] Updated weights for policy 0, policy_version 458678 (0.0025) [2024-04-27 19:25:49,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7515127808. Throughput: 0: 55466.7. Samples: 420337040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:25:49,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 19:25:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000458687_7515127808.pth... [2024-04-27 19:25:49,332][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000457873_7501791232.pth [2024-04-27 19:25:49,466][54818] Updated weights for policy 0, policy_version 458688 (0.0028) [2024-04-27 19:25:51,808][54818] Updated weights for policy 0, policy_version 458698 (0.0027) [2024-04-27 19:25:54,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7515422720. Throughput: 0: 55514.7. Samples: 420671340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:25:54,253][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 19:25:55,174][54818] Updated weights for policy 0, policy_version 458708 (0.0031) [2024-04-27 19:25:57,629][54818] Updated weights for policy 0, policy_version 458718 (0.0026) [2024-04-27 19:25:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7515717632. Throughput: 0: 55647.4. Samples: 420839120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:25:59,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:26:01,112][54818] Updated weights for policy 0, policy_version 458728 (0.0028) [2024-04-27 19:26:01,657][54798] Signal inference workers to stop experience collection... 
(6000 times) [2024-04-27 19:26:01,707][54818] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-04-27 19:26:01,714][54798] Signal inference workers to resume experience collection... (6000 times) [2024-04-27 19:26:01,717][54818] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-04-27 19:26:03,435][54818] Updated weights for policy 0, policy_version 458738 (0.0026) [2024-04-27 19:26:04,253][54587] Fps is (10 sec: 55704.5, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7515979776. Throughput: 0: 55728.3. Samples: 421177000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:04,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 19:26:07,035][54818] Updated weights for policy 0, policy_version 458748 (0.0034) [2024-04-27 19:26:09,216][54818] Updated weights for policy 0, policy_version 458758 (0.0030) [2024-04-27 19:26:09,253][54587] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 7516291072. Throughput: 0: 55823.9. Samples: 421515840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:09,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 19:26:13,077][54818] Updated weights for policy 0, policy_version 458768 (0.0028) [2024-04-27 19:26:14,253][54587] Fps is (10 sec: 57344.9, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 7516553216. Throughput: 0: 55793.6. Samples: 421684580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:14,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 19:26:15,031][54818] Updated weights for policy 0, policy_version 458778 (0.0031) [2024-04-27 19:26:18,759][54818] Updated weights for policy 0, policy_version 458788 (0.0025) [2024-04-27 19:26:19,253][54587] Fps is (10 sec: 50791.0, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7516798976. Throughput: 0: 55734.6. Samples: 422017540. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:19,253][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 19:26:20,985][54818] Updated weights for policy 0, policy_version 458798 (0.0026) [2024-04-27 19:26:24,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7517077504. Throughput: 0: 55937.0. Samples: 422355780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:24,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 19:26:24,619][54818] Updated weights for policy 0, policy_version 458808 (0.0026) [2024-04-27 19:26:26,984][54818] Updated weights for policy 0, policy_version 458818 (0.0032) [2024-04-27 19:26:29,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 7517372416. Throughput: 0: 55952.4. Samples: 422521640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:29,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 19:26:30,493][54818] Updated weights for policy 0, policy_version 458828 (0.0028) [2024-04-27 19:26:32,912][54818] Updated weights for policy 0, policy_version 458838 (0.0036) [2024-04-27 19:26:34,253][54587] Fps is (10 sec: 60619.5, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7517683712. Throughput: 0: 55954.6. Samples: 422855000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:34,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 19:26:36,333][54818] Updated weights for policy 0, policy_version 458848 (0.0029) [2024-04-27 19:26:38,790][54818] Updated weights for policy 0, policy_version 458858 (0.0027) [2024-04-27 19:26:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7517945856. Throughput: 0: 55906.6. Samples: 423187140. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:39,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 19:26:42,084][54818] Updated weights for policy 0, policy_version 458868 (0.0030) [2024-04-27 19:26:44,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7518224384. Throughput: 0: 56001.8. Samples: 423359200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:44,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-27 19:26:44,605][54818] Updated weights for policy 0, policy_version 458878 (0.0030) [2024-04-27 19:26:47,856][54818] Updated weights for policy 0, policy_version 458888 (0.0028) [2024-04-27 19:26:49,253][54587] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 7518502912. Throughput: 0: 56044.2. Samples: 423698980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:49,253][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 19:26:50,417][54818] Updated weights for policy 0, policy_version 458898 (0.0026) [2024-04-27 19:26:53,869][54818] Updated weights for policy 0, policy_version 458908 (0.0026) [2024-04-27 19:26:54,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7518765056. Throughput: 0: 55959.3. Samples: 424034000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:54,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 19:26:54,813][54798] Signal inference workers to stop experience collection... (6050 times) [2024-04-27 19:26:54,850][54818] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-04-27 19:26:54,906][54798] Signal inference workers to resume experience collection... 
(6050 times) [2024-04-27 19:26:54,906][54818] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-04-27 19:26:56,480][54818] Updated weights for policy 0, policy_version 458918 (0.0029) [2024-04-27 19:26:59,253][54587] Fps is (10 sec: 54066.2, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7519043584. Throughput: 0: 55747.4. Samples: 424193220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:26:59,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-27 19:26:59,737][54818] Updated weights for policy 0, policy_version 458928 (0.0031) [2024-04-27 19:27:02,242][54818] Updated weights for policy 0, policy_version 458938 (0.0031) [2024-04-27 19:27:04,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7519338496. Throughput: 0: 55638.5. Samples: 424521280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:27:04,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 19:27:05,607][54818] Updated weights for policy 0, policy_version 458948 (0.0028) [2024-04-27 19:27:08,188][54818] Updated weights for policy 0, policy_version 458958 (0.0027) [2024-04-27 19:27:09,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7519617024. Throughput: 0: 55493.5. Samples: 424853000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-04-27 19:27:09,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 19:27:11,407][54818] Updated weights for policy 0, policy_version 458968 (0.0031) [2024-04-27 19:27:13,947][54818] Updated weights for policy 0, policy_version 458978 (0.0029) [2024-04-27 19:27:14,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.4, 300 sec: 55650.1). Total num frames: 7519895552. Throughput: 0: 55882.4. Samples: 425036360. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:14,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 19:27:17,134][54818] Updated weights for policy 0, policy_version 458988 (0.0027) [2024-04-27 19:27:19,253][54587] Fps is (10 sec: 57344.8, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 7520190464. Throughput: 0: 55868.6. Samples: 425369080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:19,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 19:27:19,895][54818] Updated weights for policy 0, policy_version 458998 (0.0029) [2024-04-27 19:27:23,013][54818] Updated weights for policy 0, policy_version 459008 (0.0027) [2024-04-27 19:27:24,253][54587] Fps is (10 sec: 55706.8, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 7520452608. Throughput: 0: 55894.8. Samples: 425702400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:24,253][54587] Avg episode reward: [(0, '0.700')] [2024-04-27 19:27:25,803][54818] Updated weights for policy 0, policy_version 459018 (0.0034) [2024-04-27 19:27:29,000][54818] Updated weights for policy 0, policy_version 459028 (0.0031) [2024-04-27 19:27:29,254][54587] Fps is (10 sec: 52425.7, 60 sec: 55705.0, 300 sec: 55761.0). Total num frames: 7520714752. Throughput: 0: 55602.8. Samples: 425861360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:29,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 19:27:31,557][54818] Updated weights for policy 0, policy_version 459038 (0.0034) [2024-04-27 19:27:34,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 7520993280. Throughput: 0: 55470.1. Samples: 426195140. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:34,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 19:27:35,020][54818] Updated weights for policy 0, policy_version 459048 (0.0033) [2024-04-27 19:27:37,478][54818] Updated weights for policy 0, policy_version 459058 (0.0025) [2024-04-27 19:27:39,253][54587] Fps is (10 sec: 55708.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7521271808. Throughput: 0: 55519.8. Samples: 426532400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:39,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 19:27:40,997][54818] Updated weights for policy 0, policy_version 459068 (0.0027) [2024-04-27 19:27:43,384][54818] Updated weights for policy 0, policy_version 459078 (0.0032) [2024-04-27 19:27:44,253][54587] Fps is (10 sec: 58983.1, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7521583104. Throughput: 0: 55645.1. Samples: 426697240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:44,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 19:27:46,913][54818] Updated weights for policy 0, policy_version 459088 (0.0029) [2024-04-27 19:27:49,243][54818] Updated weights for policy 0, policy_version 459098 (0.0025) [2024-04-27 19:27:49,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7521861632. Throughput: 0: 55864.9. Samples: 427035200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:49,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 19:27:49,371][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000459099_7521878016.pth... [2024-04-27 19:27:49,416][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000458280_7508459520.pth [2024-04-27 19:27:52,612][54818] Updated weights for policy 0, policy_version 459108 (0.0025) [2024-04-27 19:27:54,253][54587] Fps is (10 sec: 52428.0, 60 sec: 55705.5, 300 sec: 55650.1). 
Total num frames: 7522107392. Throughput: 0: 55845.0. Samples: 427366020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:54,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 19:27:55,283][54818] Updated weights for policy 0, policy_version 459118 (0.0031) [2024-04-27 19:27:57,245][54798] Signal inference workers to stop experience collection... (6100 times) [2024-04-27 19:27:57,277][54818] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-04-27 19:27:57,303][54798] Signal inference workers to resume experience collection... (6100 times) [2024-04-27 19:27:57,303][54818] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-04-27 19:27:58,511][54818] Updated weights for policy 0, policy_version 459128 (0.0027) [2024-04-27 19:27:59,253][54587] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7522418688. Throughput: 0: 55530.2. Samples: 427535220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:27:59,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 19:28:01,046][54818] Updated weights for policy 0, policy_version 459138 (0.0026) [2024-04-27 19:28:04,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 7522680832. Throughput: 0: 55698.8. Samples: 427875520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:28:04,254][54587] Avg episode reward: [(0, '0.726')] [2024-04-27 19:28:04,256][54818] Updated weights for policy 0, policy_version 459148 (0.0024) [2024-04-27 19:28:06,808][54818] Updated weights for policy 0, policy_version 459158 (0.0030) [2024-04-27 19:28:09,253][54587] Fps is (10 sec: 52429.6, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 7522942976. Throughput: 0: 55747.9. Samples: 428211060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:28:09,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 19:28:10,130][54818] Updated weights for policy 0, policy_version 459168 (0.0032) [2024-04-27 19:28:12,753][54818] Updated weights for policy 0, policy_version 459178 (0.0028) [2024-04-27 19:28:14,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7523221504. Throughput: 0: 55863.8. Samples: 428375200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:28:14,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 19:28:16,046][54818] Updated weights for policy 0, policy_version 459188 (0.0034) [2024-04-27 19:28:18,627][54818] Updated weights for policy 0, policy_version 459198 (0.0028) [2024-04-27 19:28:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7523516416. Throughput: 0: 55836.6. Samples: 428707780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:28:19,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 19:28:21,807][54818] Updated weights for policy 0, policy_version 459208 (0.0035) [2024-04-27 19:28:24,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7523811328. Throughput: 0: 55843.6. Samples: 429045360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:28:24,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 19:28:24,337][54818] Updated weights for policy 0, policy_version 459218 (0.0028) [2024-04-27 19:28:27,648][54818] Updated weights for policy 0, policy_version 459228 (0.0033) [2024-04-27 19:28:29,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55979.1, 300 sec: 55705.6). Total num frames: 7524073472. Throughput: 0: 55966.9. Samples: 429215760. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:28:29,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 19:28:30,194][54818] Updated weights for policy 0, policy_version 459238 (0.0028) [2024-04-27 19:28:33,446][54818] Updated weights for policy 0, policy_version 459248 (0.0029) [2024-04-27 19:28:34,253][54587] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7524368384. Throughput: 0: 55840.0. Samples: 429548000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 19:28:34,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 19:28:36,390][54818] Updated weights for policy 0, policy_version 459258 (0.0029) [2024-04-27 19:28:39,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 7524630528. Throughput: 0: 55821.1. Samples: 429877960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:28:39,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 19:28:39,376][54818] Updated weights for policy 0, policy_version 459268 (0.0028) [2024-04-27 19:28:42,322][54818] Updated weights for policy 0, policy_version 459278 (0.0027) [2024-04-27 19:28:44,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7524909056. Throughput: 0: 55897.1. Samples: 430050580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:28:44,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 19:28:45,165][54818] Updated weights for policy 0, policy_version 459288 (0.0031) [2024-04-27 19:28:48,161][54818] Updated weights for policy 0, policy_version 459298 (0.0030) [2024-04-27 19:28:49,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7525171200. Throughput: 0: 55647.9. Samples: 430379680. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:28:49,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 19:28:50,906][54818] Updated weights for policy 0, policy_version 459308 (0.0030) [2024-04-27 19:28:54,144][54818] Updated weights for policy 0, policy_version 459318 (0.0029) [2024-04-27 19:28:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7525466112. Throughput: 0: 55621.4. Samples: 430714020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:28:54,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:28:56,942][54818] Updated weights for policy 0, policy_version 459328 (0.0030) [2024-04-27 19:28:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 7525761024. Throughput: 0: 55721.8. Samples: 430882680. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:28:59,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 19:29:00,020][54818] Updated weights for policy 0, policy_version 459338 (0.0027) [2024-04-27 19:29:02,753][54818] Updated weights for policy 0, policy_version 459348 (0.0027) [2024-04-27 19:29:04,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 7526023168. Throughput: 0: 55746.3. Samples: 431216360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:04,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 19:29:05,818][54818] Updated weights for policy 0, policy_version 459358 (0.0035) [2024-04-27 19:29:06,664][54798] Signal inference workers to stop experience collection... (6150 times) [2024-04-27 19:29:06,664][54798] Signal inference workers to resume experience collection... 
(6150 times) [2024-04-27 19:29:06,679][54818] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-04-27 19:29:06,679][54818] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-04-27 19:29:08,520][54818] Updated weights for policy 0, policy_version 459368 (0.0030) [2024-04-27 19:29:09,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7526301696. Throughput: 0: 55654.7. Samples: 431549820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:09,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:29:11,594][54818] Updated weights for policy 0, policy_version 459378 (0.0027) [2024-04-27 19:29:14,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7526580224. Throughput: 0: 55655.6. Samples: 431720260. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:14,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 19:29:14,380][54818] Updated weights for policy 0, policy_version 459388 (0.0028) [2024-04-27 19:29:17,634][54818] Updated weights for policy 0, policy_version 459398 (0.0024) [2024-04-27 19:29:19,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7526858752. Throughput: 0: 55767.1. Samples: 432057520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:19,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 19:29:20,291][54818] Updated weights for policy 0, policy_version 459408 (0.0027) [2024-04-27 19:29:23,526][54818] Updated weights for policy 0, policy_version 459418 (0.0025) [2024-04-27 19:29:24,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 7527137280. Throughput: 0: 55863.2. Samples: 432391800. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 19:29:25,992][54818] Updated weights for policy 0, policy_version 459428 (0.0027) [2024-04-27 19:29:29,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7527399424. Throughput: 0: 55702.7. Samples: 432557200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:29,253][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 19:29:29,603][54818] Updated weights for policy 0, policy_version 459438 (0.0031) [2024-04-27 19:29:31,726][54818] Updated weights for policy 0, policy_version 459448 (0.0025) [2024-04-27 19:29:34,253][54587] Fps is (10 sec: 57342.5, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 7527710720. Throughput: 0: 55898.9. Samples: 432895140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:34,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-27 19:29:35,318][54818] Updated weights for policy 0, policy_version 459458 (0.0032) [2024-04-27 19:29:37,919][54818] Updated weights for policy 0, policy_version 459468 (0.0027) [2024-04-27 19:29:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7527989248. Throughput: 0: 55971.6. Samples: 433232740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:39,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 19:29:41,217][54818] Updated weights for policy 0, policy_version 459478 (0.0028) [2024-04-27 19:29:43,997][54818] Updated weights for policy 0, policy_version 459488 (0.0028) [2024-04-27 19:29:44,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7528267776. Throughput: 0: 55900.4. Samples: 433398200. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:44,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:29:47,147][54818] Updated weights for policy 0, policy_version 459498 (0.0033) [2024-04-27 19:29:49,253][54587] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7528546304. Throughput: 0: 56001.2. Samples: 433736420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:49,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 19:29:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000459506_7528546304.pth... [2024-04-27 19:29:49,311][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000458687_7515127808.pth [2024-04-27 19:29:49,730][54818] Updated weights for policy 0, policy_version 459508 (0.0025) [2024-04-27 19:29:52,873][54818] Updated weights for policy 0, policy_version 459518 (0.0031) [2024-04-27 19:29:54,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7528824832. Throughput: 0: 56029.3. Samples: 434071140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:54,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 19:29:55,442][54818] Updated weights for policy 0, policy_version 459528 (0.0030) [2024-04-27 19:29:58,714][54818] Updated weights for policy 0, policy_version 459538 (0.0028) [2024-04-27 19:29:59,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7529086976. Throughput: 0: 56029.9. Samples: 434241600. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-04-27 19:29:59,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 19:30:01,318][54818] Updated weights for policy 0, policy_version 459548 (0.0032) [2024-04-27 19:30:04,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7529365504. Throughput: 0: 55936.1. Samples: 434574640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:04,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 19:30:04,527][54818] Updated weights for policy 0, policy_version 459558 (0.0030) [2024-04-27 19:30:07,181][54818] Updated weights for policy 0, policy_version 459568 (0.0027) [2024-04-27 19:30:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 7529660416. Throughput: 0: 55927.9. Samples: 434908560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:09,254][54587] Avg episode reward: [(0, '0.696')] [2024-04-27 19:30:10,423][54818] Updated weights for policy 0, policy_version 459578 (0.0027) [2024-04-27 19:30:10,445][54798] Signal inference workers to stop experience collection... (6200 times) [2024-04-27 19:30:10,445][54798] Signal inference workers to resume experience collection... (6200 times) [2024-04-27 19:30:10,468][54818] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-04-27 19:30:10,468][54818] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-04-27 19:30:13,102][54818] Updated weights for policy 0, policy_version 459588 (0.0032) [2024-04-27 19:30:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7529938944. Throughput: 0: 55987.0. Samples: 435076620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:14,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 19:30:16,324][54818] Updated weights for policy 0, policy_version 459598 (0.0031) [2024-04-27 19:30:18,943][54818] Updated weights for policy 0, policy_version 459608 (0.0026) [2024-04-27 19:30:19,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7530217472. Throughput: 0: 55846.7. Samples: 435408240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:19,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 19:30:22,128][54818] Updated weights for policy 0, policy_version 459618 (0.0027) [2024-04-27 19:30:24,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7530512384. Throughput: 0: 55807.1. Samples: 435744060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:24,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 19:30:24,975][54818] Updated weights for policy 0, policy_version 459628 (0.0031) [2024-04-27 19:30:28,079][54818] Updated weights for policy 0, policy_version 459638 (0.0030) [2024-04-27 19:30:29,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 7530758144. Throughput: 0: 55800.8. Samples: 435909240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:29,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 19:30:30,907][54818] Updated weights for policy 0, policy_version 459648 (0.0030) [2024-04-27 19:30:33,988][54818] Updated weights for policy 0, policy_version 459658 (0.0024) [2024-04-27 19:30:34,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 7531053056. Throughput: 0: 55685.9. Samples: 436242280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:34,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 19:30:36,862][54818] Updated weights for policy 0, policy_version 459668 (0.0027) [2024-04-27 19:30:39,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7531315200. Throughput: 0: 55731.5. Samples: 436579060. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:39,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 19:30:39,782][54818] Updated weights for policy 0, policy_version 459678 (0.0027) [2024-04-27 19:30:42,528][54818] Updated weights for policy 0, policy_version 459688 (0.0027) [2024-04-27 19:30:44,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 7531610112. Throughput: 0: 55595.2. Samples: 436743380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:44,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 19:30:45,635][54818] Updated weights for policy 0, policy_version 459698 (0.0031) [2024-04-27 19:30:48,213][54818] Updated weights for policy 0, policy_version 459708 (0.0032) [2024-04-27 19:30:49,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 7531905024. Throughput: 0: 55616.9. Samples: 437077400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:49,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 19:30:51,548][54818] Updated weights for policy 0, policy_version 459718 (0.0029) [2024-04-27 19:30:54,199][54818] Updated weights for policy 0, policy_version 459728 (0.0027) [2024-04-27 19:30:54,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7532183552. Throughput: 0: 55707.4. Samples: 437415400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:54,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 19:30:57,326][54818] Updated weights for policy 0, policy_version 459738 (0.0029) [2024-04-27 19:30:59,253][54587] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 7532462080. Throughput: 0: 55820.0. Samples: 437588520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:30:59,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:31:00,392][54818] Updated weights for policy 0, policy_version 459748 (0.0022) [2024-04-27 19:31:03,129][54818] Updated weights for policy 0, policy_version 459758 (0.0031) [2024-04-27 19:31:04,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7532724224. Throughput: 0: 55825.4. Samples: 437920380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:31:04,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 19:31:06,146][54818] Updated weights for policy 0, policy_version 459768 (0.0029) [2024-04-27 19:31:08,910][54818] Updated weights for policy 0, policy_version 459778 (0.0030) [2024-04-27 19:31:09,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7533002752. Throughput: 0: 55866.6. Samples: 438258060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:31:09,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 19:31:12,053][54818] Updated weights for policy 0, policy_version 459788 (0.0027) [2024-04-27 19:31:14,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 7533281280. Throughput: 0: 55981.8. Samples: 438428420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:31:14,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 19:31:14,436][54798] Signal inference workers to stop experience collection... (6250 times) [2024-04-27 19:31:14,436][54798] Signal inference workers to resume experience collection... 
(6250 times) [2024-04-27 19:31:14,454][54818] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-04-27 19:31:14,455][54818] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-04-27 19:31:14,675][54818] Updated weights for policy 0, policy_version 459798 (0.0031) [2024-04-27 19:31:17,895][54818] Updated weights for policy 0, policy_version 459808 (0.0029) [2024-04-27 19:31:19,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 55927.7). Total num frames: 7533576192. Throughput: 0: 55972.4. Samples: 438761040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:31:19,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 19:31:20,607][54818] Updated weights for policy 0, policy_version 459818 (0.0029) [2024-04-27 19:31:23,813][54818] Updated weights for policy 0, policy_version 459828 (0.0028) [2024-04-27 19:31:24,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 7533838336. Throughput: 0: 55977.7. Samples: 439098060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 19:31:24,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 19:31:26,578][54818] Updated weights for policy 0, policy_version 459838 (0.0027) [2024-04-27 19:31:29,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 7534116864. Throughput: 0: 55955.2. Samples: 439261360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:31:29,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 19:31:29,602][54818] Updated weights for policy 0, policy_version 459848 (0.0030) [2024-04-27 19:31:32,662][54818] Updated weights for policy 0, policy_version 459858 (0.0027) [2024-04-27 19:31:34,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7534411776. Throughput: 0: 56106.3. Samples: 439602180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:31:34,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 19:31:35,367][54818] Updated weights for policy 0, policy_version 459868 (0.0031) [2024-04-27 19:31:38,403][54818] Updated weights for policy 0, policy_version 459878 (0.0033) [2024-04-27 19:31:39,253][54587] Fps is (10 sec: 57343.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7534690304. Throughput: 0: 55968.5. Samples: 439933980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:31:39,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 19:31:41,153][54818] Updated weights for policy 0, policy_version 459888 (0.0030) [2024-04-27 19:31:44,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7534952448. Throughput: 0: 55881.5. Samples: 440103180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:31:44,253][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 19:31:44,284][54818] Updated weights for policy 0, policy_version 459898 (0.0026) [2024-04-27 19:31:46,996][54818] Updated weights for policy 0, policy_version 459908 (0.0032) [2024-04-27 19:31:49,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 7535263744. Throughput: 0: 55989.8. Samples: 440439920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:31:49,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 19:31:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000459916_7535263744.pth... [2024-04-27 19:31:49,312][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000459099_7521878016.pth [2024-04-27 19:31:50,013][54818] Updated weights for policy 0, policy_version 459918 (0.0028) [2024-04-27 19:31:52,817][54818] Updated weights for policy 0, policy_version 459928 (0.0028) [2024-04-27 19:31:54,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.7, 300 sec: 55816.7). 
Total num frames: 7535509504. Throughput: 0: 55759.2. Samples: 440767220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:31:54,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 19:31:55,966][54818] Updated weights for policy 0, policy_version 459938 (0.0029) [2024-04-27 19:31:58,627][54818] Updated weights for policy 0, policy_version 459948 (0.0032) [2024-04-27 19:31:59,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7535804416. Throughput: 0: 55807.6. Samples: 440939760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:31:59,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 19:32:01,826][54818] Updated weights for policy 0, policy_version 459958 (0.0031) [2024-04-27 19:32:04,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7536082944. Throughput: 0: 55950.3. Samples: 441278800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:32:04,253][54587] Avg episode reward: [(0, '0.501')] [2024-04-27 19:32:04,468][54818] Updated weights for policy 0, policy_version 459968 (0.0029) [2024-04-27 19:32:07,594][54818] Updated weights for policy 0, policy_version 459978 (0.0027) [2024-04-27 19:32:09,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7536361472. Throughput: 0: 55931.7. Samples: 441614980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:32:09,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 19:32:09,642][54798] Signal inference workers to stop experience collection... (6300 times) [2024-04-27 19:32:09,661][54818] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-04-27 19:32:09,699][54798] Signal inference workers to resume experience collection... 
(6300 times) [2024-04-27 19:32:09,708][54818] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-04-27 19:32:10,458][54818] Updated weights for policy 0, policy_version 459988 (0.0036) [2024-04-27 19:32:13,471][54818] Updated weights for policy 0, policy_version 459998 (0.0029) [2024-04-27 19:32:14,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 7536640000. Throughput: 0: 55969.7. Samples: 441780000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:32:14,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 19:32:16,226][54818] Updated weights for policy 0, policy_version 460008 (0.0029) [2024-04-27 19:32:19,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7536918528. Throughput: 0: 55862.6. Samples: 442116000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:32:19,253][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 19:32:19,277][54818] Updated weights for policy 0, policy_version 460018 (0.0028) [2024-04-27 19:32:21,987][54818] Updated weights for policy 0, policy_version 460028 (0.0026) [2024-04-27 19:32:24,253][54587] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55927.9). Total num frames: 7537213440. Throughput: 0: 55919.7. Samples: 442450360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:32:24,253][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 19:32:25,159][54818] Updated weights for policy 0, policy_version 460038 (0.0034) [2024-04-27 19:32:27,975][54818] Updated weights for policy 0, policy_version 460048 (0.0034) [2024-04-27 19:32:29,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7537475584. Throughput: 0: 55779.5. Samples: 442613260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:32:29,254][54587] Avg episode reward: [(0, '0.501')] [2024-04-27 19:32:31,133][54818] Updated weights for policy 0, policy_version 460058 (0.0029) [2024-04-27 19:32:33,904][54818] Updated weights for policy 0, policy_version 460068 (0.0030) [2024-04-27 19:32:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 7537754112. Throughput: 0: 55757.5. Samples: 442949000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:32:34,253][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 19:32:37,268][54818] Updated weights for policy 0, policy_version 460078 (0.0025) [2024-04-27 19:32:39,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7538032640. Throughput: 0: 56037.2. Samples: 443288900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:32:39,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 19:32:39,672][54818] Updated weights for policy 0, policy_version 460088 (0.0025) [2024-04-27 19:32:43,031][54818] Updated weights for policy 0, policy_version 460098 (0.0030) [2024-04-27 19:32:44,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7538294784. Throughput: 0: 55684.1. Samples: 443445540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 19:32:44,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 19:32:45,660][54818] Updated weights for policy 0, policy_version 460108 (0.0032) [2024-04-27 19:32:48,985][54818] Updated weights for policy 0, policy_version 460118 (0.0034) [2024-04-27 19:32:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 7538589696. Throughput: 0: 55530.1. Samples: 443777660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:32:49,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 19:32:51,761][54818] Updated weights for policy 0, policy_version 460128 (0.0027) [2024-04-27 19:32:54,253][54587] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7538884608. Throughput: 0: 55495.6. Samples: 444112280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:32:54,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 19:32:54,828][54818] Updated weights for policy 0, policy_version 460138 (0.0028) [2024-04-27 19:32:57,529][54818] Updated weights for policy 0, policy_version 460148 (0.0033) [2024-04-27 19:32:59,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 7539163136. Throughput: 0: 55753.3. Samples: 444288900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:32:59,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 19:32:59,768][54798] Signal inference workers to stop experience collection... (6350 times) [2024-04-27 19:32:59,768][54798] Signal inference workers to resume experience collection... (6350 times) [2024-04-27 19:32:59,780][54818] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-04-27 19:32:59,780][54818] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-04-27 19:33:00,504][54818] Updated weights for policy 0, policy_version 460158 (0.0029) [2024-04-27 19:33:03,322][54818] Updated weights for policy 0, policy_version 460168 (0.0029) [2024-04-27 19:33:04,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 7539441664. Throughput: 0: 55787.0. Samples: 444626420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:04,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-27 19:33:06,424][54818] Updated weights for policy 0, policy_version 460178 (0.0027) [2024-04-27 19:33:09,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 7539703808. Throughput: 0: 55580.0. Samples: 444951460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:09,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:33:09,321][54818] Updated weights for policy 0, policy_version 460188 (0.0028) [2024-04-27 19:33:12,530][54818] Updated weights for policy 0, policy_version 460198 (0.0027) [2024-04-27 19:33:14,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7539965952. Throughput: 0: 55623.1. Samples: 445116300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:14,254][54587] Avg episode reward: [(0, '0.496')] [2024-04-27 19:33:15,115][54818] Updated weights for policy 0, policy_version 460208 (0.0025) [2024-04-27 19:33:18,384][54818] Updated weights for policy 0, policy_version 460218 (0.0027) [2024-04-27 19:33:19,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7540244480. Throughput: 0: 55591.1. Samples: 445450600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:19,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 19:33:21,049][54818] Updated weights for policy 0, policy_version 460228 (0.0029) [2024-04-27 19:33:24,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 7540523008. Throughput: 0: 55493.9. Samples: 445786120. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:24,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 19:33:24,333][54818] Updated weights for policy 0, policy_version 460238 (0.0031) [2024-04-27 19:33:26,983][54818] Updated weights for policy 0, policy_version 460248 (0.0029) [2024-04-27 19:33:29,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7540834304. Throughput: 0: 55829.2. Samples: 445957860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:29,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 19:33:30,343][54818] Updated weights for policy 0, policy_version 460258 (0.0026) [2024-04-27 19:33:32,936][54818] Updated weights for policy 0, policy_version 460268 (0.0029) [2024-04-27 19:33:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 7541112832. Throughput: 0: 55851.6. Samples: 446290980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:34,253][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 19:33:36,202][54818] Updated weights for policy 0, policy_version 460278 (0.0027) [2024-04-27 19:33:38,837][54818] Updated weights for policy 0, policy_version 460288 (0.0027) [2024-04-27 19:33:39,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 7541374976. Throughput: 0: 55782.4. Samples: 446622500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 19:33:41,902][54818] Updated weights for policy 0, policy_version 460298 (0.0029) [2024-04-27 19:33:44,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7541637120. Throughput: 0: 55557.4. Samples: 446788980. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:44,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 19:33:44,612][54818] Updated weights for policy 0, policy_version 460308 (0.0027) [2024-04-27 19:33:47,805][54818] Updated weights for policy 0, policy_version 460318 (0.0028) [2024-04-27 19:33:49,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 7541899264. Throughput: 0: 55475.2. Samples: 447122800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:49,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 19:33:49,313][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000460322_7541915648.pth... [2024-04-27 19:33:49,362][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000459506_7528546304.pth [2024-04-27 19:33:50,466][54818] Updated weights for policy 0, policy_version 460328 (0.0028) [2024-04-27 19:33:53,930][54818] Updated weights for policy 0, policy_version 460338 (0.0032) [2024-04-27 19:33:54,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7542210560. Throughput: 0: 55692.4. Samples: 447457620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 19:33:56,386][54818] Updated weights for policy 0, policy_version 460348 (0.0026) [2024-04-27 19:33:59,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 7542472704. Throughput: 0: 55518.7. Samples: 447614640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:33:59,253][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 19:33:59,777][54818] Updated weights for policy 0, policy_version 460358 (0.0030) [2024-04-27 19:34:01,121][54798] Signal inference workers to stop experience collection... 
(6400 times) [2024-04-27 19:34:01,151][54818] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-04-27 19:34:01,180][54798] Signal inference workers to resume experience collection... (6400 times) [2024-04-27 19:34:01,181][54818] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-04-27 19:34:02,309][54818] Updated weights for policy 0, policy_version 460368 (0.0029) [2024-04-27 19:34:04,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 7542767616. Throughput: 0: 55508.0. Samples: 447948460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:34:04,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 19:34:05,690][54818] Updated weights for policy 0, policy_version 460378 (0.0034) [2024-04-27 19:34:08,121][54818] Updated weights for policy 0, policy_version 460388 (0.0034) [2024-04-27 19:34:09,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7543062528. Throughput: 0: 55513.7. Samples: 448284240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 19:34:09,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 19:34:11,368][54818] Updated weights for policy 0, policy_version 460398 (0.0032) [2024-04-27 19:34:14,100][54818] Updated weights for policy 0, policy_version 460408 (0.0035) [2024-04-27 19:34:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7543324672. Throughput: 0: 55522.4. Samples: 448456360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:14,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 19:34:17,254][54818] Updated weights for policy 0, policy_version 460418 (0.0031) [2024-04-27 19:34:19,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7543586816. Throughput: 0: 55449.7. Samples: 448786220. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:19,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 19:34:20,065][54818] Updated weights for policy 0, policy_version 460428 (0.0030) [2024-04-27 19:34:23,125][54818] Updated weights for policy 0, policy_version 460438 (0.0029) [2024-04-27 19:34:24,253][54587] Fps is (10 sec: 52428.1, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 7543848960. Throughput: 0: 55559.6. Samples: 449122680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:24,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 19:34:25,934][54818] Updated weights for policy 0, policy_version 460448 (0.0029) [2024-04-27 19:34:28,871][54818] Updated weights for policy 0, policy_version 460458 (0.0033) [2024-04-27 19:34:29,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 7544160256. Throughput: 0: 55420.2. Samples: 449282900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:29,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 19:34:31,626][54818] Updated weights for policy 0, policy_version 460468 (0.0028) [2024-04-27 19:34:34,253][54587] Fps is (10 sec: 58983.2, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 7544438784. Throughput: 0: 55467.2. Samples: 449618820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:34,254][54587] Avg episode reward: [(0, '0.493')] [2024-04-27 19:34:34,893][54818] Updated weights for policy 0, policy_version 460478 (0.0027) [2024-04-27 19:34:37,571][54818] Updated weights for policy 0, policy_version 460488 (0.0028) [2024-04-27 19:34:39,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 7544717312. Throughput: 0: 55429.3. Samples: 449951940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:39,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 19:34:40,731][54818] Updated weights for policy 0, policy_version 460498 (0.0028) [2024-04-27 19:34:43,470][54818] Updated weights for policy 0, policy_version 460508 (0.0038) [2024-04-27 19:34:44,253][54587] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7545012224. Throughput: 0: 55964.4. Samples: 450133040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:44,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 19:34:46,919][54818] Updated weights for policy 0, policy_version 460518 (0.0027) [2024-04-27 19:34:49,253][54587] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7545274368. Throughput: 0: 56022.1. Samples: 450469460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:49,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 19:34:49,406][54818] Updated weights for policy 0, policy_version 460528 (0.0034) [2024-04-27 19:34:52,674][54818] Updated weights for policy 0, policy_version 460538 (0.0025) [2024-04-27 19:34:54,253][54587] Fps is (10 sec: 52428.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7545536512. Throughput: 0: 55978.2. Samples: 450803260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:54,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 19:34:55,215][54818] Updated weights for policy 0, policy_version 460548 (0.0032) [2024-04-27 19:34:58,574][54818] Updated weights for policy 0, policy_version 460558 (0.0030) [2024-04-27 19:34:59,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7545798656. Throughput: 0: 55583.5. Samples: 450957620. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:34:59,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 19:35:01,103][54818] Updated weights for policy 0, policy_version 460568 (0.0024) [2024-04-27 19:35:04,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7546093568. Throughput: 0: 55731.5. Samples: 451294140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:35:04,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 19:35:04,492][54818] Updated weights for policy 0, policy_version 460578 (0.0032) [2024-04-27 19:35:07,029][54798] Signal inference workers to stop experience collection... (6450 times) [2024-04-27 19:35:07,029][54798] Signal inference workers to resume experience collection... (6450 times) [2024-04-27 19:35:07,043][54818] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-04-27 19:35:07,043][54818] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-04-27 19:35:07,146][54818] Updated weights for policy 0, policy_version 460588 (0.0025) [2024-04-27 19:35:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7546404864. Throughput: 0: 55801.0. Samples: 451633720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:35:09,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 19:35:10,157][54818] Updated weights for policy 0, policy_version 460598 (0.0031) [2024-04-27 19:35:12,874][54818] Updated weights for policy 0, policy_version 460608 (0.0029) [2024-04-27 19:35:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7546667008. Throughput: 0: 56025.0. Samples: 451804020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:35:14,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 19:35:16,313][54818] Updated weights for policy 0, policy_version 460618 (0.0027) [2024-04-27 19:35:18,702][54818] Updated weights for policy 0, policy_version 460628 (0.0030) [2024-04-27 19:35:19,253][54587] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7546961920. Throughput: 0: 55871.9. Samples: 452133060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:35:19,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 19:35:22,395][54818] Updated weights for policy 0, policy_version 460638 (0.0028) [2024-04-27 19:35:24,253][54587] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7547224064. Throughput: 0: 55842.2. Samples: 452464840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:35:24,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:35:24,747][54818] Updated weights for policy 0, policy_version 460648 (0.0038) [2024-04-27 19:35:28,196][54818] Updated weights for policy 0, policy_version 460658 (0.0029) [2024-04-27 19:35:29,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 7547502592. Throughput: 0: 55574.6. Samples: 452633900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:35:29,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 19:35:30,696][54818] Updated weights for policy 0, policy_version 460668 (0.0033) [2024-04-27 19:35:34,156][54818] Updated weights for policy 0, policy_version 460678 (0.0037) [2024-04-27 19:35:34,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 7547748352. Throughput: 0: 55553.9. Samples: 452969380. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 19:35:34,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 19:35:36,508][54818] Updated weights for policy 0, policy_version 460688 (0.0029) [2024-04-27 19:35:39,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7548043264. Throughput: 0: 55530.2. Samples: 453302120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:35:39,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 19:35:39,906][54818] Updated weights for policy 0, policy_version 460698 (0.0030) [2024-04-27 19:35:42,254][54818] Updated weights for policy 0, policy_version 460708 (0.0029) [2024-04-27 19:35:44,254][54587] Fps is (10 sec: 58981.3, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 7548338176. Throughput: 0: 55733.9. Samples: 453465660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:35:44,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 19:35:45,686][54818] Updated weights for policy 0, policy_version 460718 (0.0031) [2024-04-27 19:35:48,282][54818] Updated weights for policy 0, policy_version 460728 (0.0028) [2024-04-27 19:35:49,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7548616704. Throughput: 0: 55794.4. Samples: 453804880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:35:49,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 19:35:49,270][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000460732_7548633088.pth... [2024-04-27 19:35:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000459916_7535263744.pth [2024-04-27 19:35:51,409][54818] Updated weights for policy 0, policy_version 460738 (0.0027) [2024-04-27 19:35:54,081][54818] Updated weights for policy 0, policy_version 460748 (0.0033) [2024-04-27 19:35:54,253][54587] Fps is (10 sec: 57344.7, 60 sec: 56251.7, 300 sec: 55761.1). 
Total num frames: 7548911616. Throughput: 0: 55684.3. Samples: 454139520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:35:54,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 19:35:57,471][54818] Updated weights for policy 0, policy_version 460758 (0.0031) [2024-04-27 19:35:59,253][54587] Fps is (10 sec: 55704.9, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 7549173760. Throughput: 0: 55661.8. Samples: 454308800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:35:59,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 19:35:59,870][54818] Updated weights for policy 0, policy_version 460768 (0.0031) [2024-04-27 19:36:03,246][54818] Updated weights for policy 0, policy_version 460778 (0.0025) [2024-04-27 19:36:04,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7549452288. Throughput: 0: 55819.1. Samples: 454644920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:04,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 19:36:05,777][54818] Updated weights for policy 0, policy_version 460788 (0.0031) [2024-04-27 19:36:09,043][54818] Updated weights for policy 0, policy_version 460798 (0.0029) [2024-04-27 19:36:09,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 7549730816. Throughput: 0: 55837.9. Samples: 454977540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:09,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 19:36:11,629][54818] Updated weights for policy 0, policy_version 460808 (0.0033) [2024-04-27 19:36:14,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7549992960. Throughput: 0: 55607.3. Samples: 455136220. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:14,259][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 19:36:14,959][54818] Updated weights for policy 0, policy_version 460818 (0.0029) [2024-04-27 19:36:17,492][54818] Updated weights for policy 0, policy_version 460828 (0.0032) [2024-04-27 19:36:19,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 7550271488. Throughput: 0: 55584.7. Samples: 455470700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:19,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 19:36:20,739][54798] Signal inference workers to stop experience collection... (6500 times) [2024-04-27 19:36:20,743][54798] Signal inference workers to resume experience collection... (6500 times) [2024-04-27 19:36:20,762][54818] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-04-27 19:36:20,762][54818] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-04-27 19:36:20,872][54818] Updated weights for policy 0, policy_version 460838 (0.0030) [2024-04-27 19:36:23,322][54818] Updated weights for policy 0, policy_version 460848 (0.0027) [2024-04-27 19:36:24,253][54587] Fps is (10 sec: 57342.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7550566400. Throughput: 0: 55507.6. Samples: 455799960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:24,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 19:36:26,683][54818] Updated weights for policy 0, policy_version 460858 (0.0026) [2024-04-27 19:36:29,253][54587] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7550844928. Throughput: 0: 55795.9. Samples: 455976460. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:29,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 19:36:29,260][54818] Updated weights for policy 0, policy_version 460868 (0.0028) [2024-04-27 19:36:32,454][54818] Updated weights for policy 0, policy_version 460878 (0.0025) [2024-04-27 19:36:34,253][54587] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7551123456. Throughput: 0: 55589.5. Samples: 456306420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:34,254][54587] Avg episode reward: [(0, '0.476')] [2024-04-27 19:36:35,235][54818] Updated weights for policy 0, policy_version 460888 (0.0029) [2024-04-27 19:36:38,359][54818] Updated weights for policy 0, policy_version 460898 (0.0028) [2024-04-27 19:36:39,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7551385600. Throughput: 0: 55525.4. Samples: 456638160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:39,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 19:36:40,975][54818] Updated weights for policy 0, policy_version 460908 (0.0026) [2024-04-27 19:36:44,253][54587] Fps is (10 sec: 54068.4, 60 sec: 55432.8, 300 sec: 55594.5). Total num frames: 7551664128. Throughput: 0: 55467.3. Samples: 456804820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:44,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 19:36:44,300][54818] Updated weights for policy 0, policy_version 460918 (0.0032) [2024-04-27 19:36:46,754][54818] Updated weights for policy 0, policy_version 460928 (0.0025) [2024-04-27 19:36:49,253][54587] Fps is (10 sec: 52428.7, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 7551909888. Throughput: 0: 55427.5. Samples: 457139160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:49,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 19:36:50,136][54818] Updated weights for policy 0, policy_version 460938 (0.0032) [2024-04-27 19:36:52,748][54818] Updated weights for policy 0, policy_version 460948 (0.0028) [2024-04-27 19:36:54,253][54587] Fps is (10 sec: 52428.5, 60 sec: 54613.4, 300 sec: 55539.0). Total num frames: 7552188416. Throughput: 0: 55396.9. Samples: 457470400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:54,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 19:36:56,231][54818] Updated weights for policy 0, policy_version 460958 (0.0030) [2024-04-27 19:36:58,884][54818] Updated weights for policy 0, policy_version 460968 (0.0029) [2024-04-27 19:36:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7552516096. Throughput: 0: 55510.5. Samples: 457634200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 19:36:59,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:37:02,015][54818] Updated weights for policy 0, policy_version 460978 (0.0033) [2024-04-27 19:37:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7552794624. Throughput: 0: 55460.6. Samples: 457966420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:04,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 19:37:04,788][54818] Updated weights for policy 0, policy_version 460988 (0.0028) [2024-04-27 19:37:07,941][54818] Updated weights for policy 0, policy_version 460998 (0.0028) [2024-04-27 19:37:09,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 7553056768. Throughput: 0: 55551.5. Samples: 458299780. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:09,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 19:37:10,571][54818] Updated weights for policy 0, policy_version 461008 (0.0026) [2024-04-27 19:37:13,626][54798] Signal inference workers to stop experience collection... (6550 times) [2024-04-27 19:37:13,676][54818] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-04-27 19:37:13,684][54798] Signal inference workers to resume experience collection... (6550 times) [2024-04-27 19:37:13,691][54818] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-04-27 19:37:13,813][54818] Updated weights for policy 0, policy_version 461018 (0.0033) [2024-04-27 19:37:14,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7553335296. Throughput: 0: 55360.4. Samples: 458467680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:14,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 19:37:16,475][54818] Updated weights for policy 0, policy_version 461028 (0.0033) [2024-04-27 19:37:19,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7553597440. Throughput: 0: 55501.2. Samples: 458803960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:19,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 19:37:19,793][54818] Updated weights for policy 0, policy_version 461038 (0.0028) [2024-04-27 19:37:22,446][54818] Updated weights for policy 0, policy_version 461048 (0.0025) [2024-04-27 19:37:24,253][54587] Fps is (10 sec: 52429.4, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 7553859584. Throughput: 0: 55504.2. Samples: 459135840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:24,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 19:37:25,645][54818] Updated weights for policy 0, policy_version 461058 (0.0032) [2024-04-27 19:37:28,297][54818] Updated weights for policy 0, policy_version 461068 (0.0031) [2024-04-27 19:37:29,253][54587] Fps is (10 sec: 54066.8, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 7554138112. Throughput: 0: 55271.9. Samples: 459292060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:29,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 19:37:31,675][54818] Updated weights for policy 0, policy_version 461078 (0.0030) [2024-04-27 19:37:34,253][54587] Fps is (10 sec: 58981.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7554449408. Throughput: 0: 55135.6. Samples: 459620260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:34,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 19:37:34,466][54818] Updated weights for policy 0, policy_version 461088 (0.0028) [2024-04-27 19:37:37,567][54818] Updated weights for policy 0, policy_version 461098 (0.0043) [2024-04-27 19:37:39,253][54587] Fps is (10 sec: 60620.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7554744320. Throughput: 0: 55140.8. Samples: 459951740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:39,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 19:37:40,217][54818] Updated weights for policy 0, policy_version 461108 (0.0026) [2024-04-27 19:37:43,623][54818] Updated weights for policy 0, policy_version 461118 (0.0028) [2024-04-27 19:37:44,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7554990080. Throughput: 0: 55550.8. Samples: 460133980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:44,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 19:37:46,277][54818] Updated weights for policy 0, policy_version 461128 (0.0036) [2024-04-27 19:37:49,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7555268608. Throughput: 0: 55555.5. Samples: 460466420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:49,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 19:37:49,323][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000461138_7555284992.pth... [2024-04-27 19:37:49,328][54818] Updated weights for policy 0, policy_version 461138 (0.0030) [2024-04-27 19:37:49,375][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000460322_7541915648.pth [2024-04-27 19:37:52,084][54818] Updated weights for policy 0, policy_version 461148 (0.0027) [2024-04-27 19:37:54,253][54587] Fps is (10 sec: 50790.0, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 7555497984. Throughput: 0: 55581.9. Samples: 460800960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:54,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 19:37:55,457][54818] Updated weights for policy 0, policy_version 461158 (0.0032) [2024-04-27 19:37:57,920][54818] Updated weights for policy 0, policy_version 461168 (0.0034) [2024-04-27 19:37:59,253][54587] Fps is (10 sec: 52428.5, 60 sec: 54613.3, 300 sec: 55427.9). Total num frames: 7555792896. Throughput: 0: 55167.0. Samples: 460950200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:37:59,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 19:37:59,657][54798] Signal inference workers to stop experience collection... (6600 times) [2024-04-27 19:37:59,706][54818] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-04-27 19:37:59,710][54798] Signal inference workers to resume experience collection... 
(6600 times) [2024-04-27 19:37:59,718][54818] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-04-27 19:38:01,532][54818] Updated weights for policy 0, policy_version 461178 (0.0027) [2024-04-27 19:38:03,841][54818] Updated weights for policy 0, policy_version 461188 (0.0030) [2024-04-27 19:38:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7556104192. Throughput: 0: 55025.6. Samples: 461280120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:38:04,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 19:38:07,405][54818] Updated weights for policy 0, policy_version 461198 (0.0027) [2024-04-27 19:38:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7556399104. Throughput: 0: 55047.8. Samples: 461613000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:38:09,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 19:38:10,185][54818] Updated weights for policy 0, policy_version 461208 (0.0033) [2024-04-27 19:38:13,132][54818] Updated weights for policy 0, policy_version 461218 (0.0027) [2024-04-27 19:38:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7556677632. Throughput: 0: 55618.7. Samples: 461794900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:38:14,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:38:16,143][54818] Updated weights for policy 0, policy_version 461228 (0.0026) [2024-04-27 19:38:19,001][54818] Updated weights for policy 0, policy_version 461238 (0.0027) [2024-04-27 19:38:19,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7556939776. Throughput: 0: 55772.8. Samples: 462130040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:38:19,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:38:22,365][54818] Updated weights for policy 0, policy_version 461248 (0.0030) [2024-04-27 19:38:24,253][54587] Fps is (10 sec: 50790.6, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7557185536. Throughput: 0: 55846.8. Samples: 462464840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-27 19:38:24,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 19:38:24,926][54818] Updated weights for policy 0, policy_version 461258 (0.0028) [2024-04-27 19:38:27,989][54818] Updated weights for policy 0, policy_version 461268 (0.0026) [2024-04-27 19:38:29,253][54587] Fps is (10 sec: 50790.3, 60 sec: 55159.4, 300 sec: 55372.3). Total num frames: 7557447680. Throughput: 0: 55128.3. Samples: 462614760. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:38:29,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 19:38:30,727][54818] Updated weights for policy 0, policy_version 461278 (0.0027) [2024-04-27 19:38:33,653][54818] Updated weights for policy 0, policy_version 461288 (0.0026) [2024-04-27 19:38:34,253][54587] Fps is (10 sec: 55705.7, 60 sec: 54886.5, 300 sec: 55483.5). Total num frames: 7557742592. Throughput: 0: 55213.9. Samples: 462951040. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:38:34,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 19:38:36,710][54818] Updated weights for policy 0, policy_version 461298 (0.0038) [2024-04-27 19:38:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7558053888. Throughput: 0: 55183.6. Samples: 463284220. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:38:39,262][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 19:38:39,607][54818] Updated weights for policy 0, policy_version 461308 (0.0027) [2024-04-27 19:38:42,537][54818] Updated weights for policy 0, policy_version 461318 (0.0028) [2024-04-27 19:38:44,253][54587] Fps is (10 sec: 60619.4, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7558348800. Throughput: 0: 55894.6. Samples: 463465460. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:38:44,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 19:38:45,469][54818] Updated weights for policy 0, policy_version 461328 (0.0026) [2024-04-27 19:38:48,520][54818] Updated weights for policy 0, policy_version 461338 (0.0031) [2024-04-27 19:38:49,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7558627328. Throughput: 0: 55956.9. Samples: 463798180. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:38:49,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 19:38:51,161][54818] Updated weights for policy 0, policy_version 461348 (0.0037) [2024-04-27 19:38:54,253][54587] Fps is (10 sec: 52429.5, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 7558873088. Throughput: 0: 55892.5. Samples: 464128160. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:38:54,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 19:38:54,365][54818] Updated weights for policy 0, policy_version 461358 (0.0025) [2024-04-27 19:38:57,197][54818] Updated weights for policy 0, policy_version 461368 (0.0022) [2024-04-27 19:38:59,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7559151616. Throughput: 0: 55462.0. Samples: 464290700. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:38:59,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 19:39:00,279][54818] Updated weights for policy 0, policy_version 461378 (0.0031) [2024-04-27 19:39:03,325][54818] Updated weights for policy 0, policy_version 461388 (0.0027) [2024-04-27 19:39:04,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7559413760. Throughput: 0: 55509.4. Samples: 464627960. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:04,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 19:39:04,667][54798] Signal inference workers to stop experience collection... (6650 times) [2024-04-27 19:39:04,667][54798] Signal inference workers to resume experience collection... (6650 times) [2024-04-27 19:39:04,683][54818] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-04-27 19:39:04,684][54818] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-04-27 19:39:05,994][54818] Updated weights for policy 0, policy_version 461398 (0.0030) [2024-04-27 19:39:09,025][54818] Updated weights for policy 0, policy_version 461408 (0.0028) [2024-04-27 19:39:09,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7559708672. Throughput: 0: 55557.6. Samples: 464964940. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:09,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:39:11,881][54818] Updated weights for policy 0, policy_version 461418 (0.0030) [2024-04-27 19:39:14,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7560003584. Throughput: 0: 55983.2. Samples: 465134000. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:14,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 19:39:14,755][54818] Updated weights for policy 0, policy_version 461428 (0.0027) [2024-04-27 19:39:17,843][54818] Updated weights for policy 0, policy_version 461438 (0.0029) [2024-04-27 19:39:19,253][54587] Fps is (10 sec: 58982.5, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7560298496. Throughput: 0: 55984.3. Samples: 465470340. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:19,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-27 19:39:20,632][54818] Updated weights for policy 0, policy_version 461448 (0.0025) [2024-04-27 19:39:23,706][54818] Updated weights for policy 0, policy_version 461458 (0.0027) [2024-04-27 19:39:24,253][54587] Fps is (10 sec: 55706.5, 60 sec: 56251.8, 300 sec: 55594.6). Total num frames: 7560560640. Throughput: 0: 55940.1. Samples: 465801520. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:24,253][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 19:39:26,469][54818] Updated weights for policy 0, policy_version 461468 (0.0027) [2024-04-27 19:39:29,253][54587] Fps is (10 sec: 52429.3, 60 sec: 56251.9, 300 sec: 55539.0). Total num frames: 7560822784. Throughput: 0: 55582.1. Samples: 465966640. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:29,253][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 19:39:29,464][54818] Updated weights for policy 0, policy_version 461478 (0.0030) [2024-04-27 19:39:32,223][54818] Updated weights for policy 0, policy_version 461488 (0.0031) [2024-04-27 19:39:34,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7561101312. Throughput: 0: 55684.1. Samples: 466303960. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:34,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 19:39:35,414][54818] Updated weights for policy 0, policy_version 461498 (0.0025) [2024-04-27 19:39:38,374][54818] Updated weights for policy 0, policy_version 461508 (0.0024) [2024-04-27 19:39:39,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 7561379840. Throughput: 0: 55825.4. Samples: 466640300. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:39,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 19:39:41,222][54818] Updated weights for policy 0, policy_version 461518 (0.0027) [2024-04-27 19:39:44,253][54587] Fps is (10 sec: 54067.1, 60 sec: 54886.6, 300 sec: 55483.5). Total num frames: 7561641984. Throughput: 0: 55810.9. Samples: 466802180. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:44,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 19:39:44,416][54818] Updated weights for policy 0, policy_version 461528 (0.0028) [2024-04-27 19:39:46,910][54818] Updated weights for policy 0, policy_version 461538 (0.0028) [2024-04-27 19:39:49,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7561936896. Throughput: 0: 55796.0. Samples: 467138780. Policy #0 lag: (min: 0.0, avg: 13.6, max: 23.0) [2024-04-27 19:39:49,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 19:39:49,315][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000461545_7561953280.pth... [2024-04-27 19:39:49,370][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000460732_7548633088.pth [2024-04-27 19:39:50,185][54818] Updated weights for policy 0, policy_version 461548 (0.0029) [2024-04-27 19:39:52,736][54818] Updated weights for policy 0, policy_version 461558 (0.0024) [2024-04-27 19:39:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 56251.8, 300 sec: 55761.1). 
Total num frames: 7562248192. Throughput: 0: 55770.3. Samples: 467474600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:39:54,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 19:39:56,056][54818] Updated weights for policy 0, policy_version 461568 (0.0031) [2024-04-27 19:39:58,693][54818] Updated weights for policy 0, policy_version 461578 (0.0031) [2024-04-27 19:39:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 7562526720. Throughput: 0: 55930.7. Samples: 467650880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:39:59,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 19:40:01,860][54818] Updated weights for policy 0, policy_version 461588 (0.0030) [2024-04-27 19:40:04,253][54587] Fps is (10 sec: 54066.3, 60 sec: 56251.6, 300 sec: 55539.0). Total num frames: 7562788864. Throughput: 0: 55784.8. Samples: 467980660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:04,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:40:04,665][54818] Updated weights for policy 0, policy_version 461598 (0.0032) [2024-04-27 19:40:07,566][54818] Updated weights for policy 0, policy_version 461608 (0.0026) [2024-04-27 19:40:09,247][54798] Signal inference workers to stop experience collection... (6700 times) [2024-04-27 19:40:09,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7563051008. Throughput: 0: 55929.2. Samples: 468318340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:09,253][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 19:40:09,299][54818] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-04-27 19:40:09,300][54798] Signal inference workers to resume experience collection... 
(6700 times) [2024-04-27 19:40:09,312][54818] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-04-27 19:40:10,518][54818] Updated weights for policy 0, policy_version 461618 (0.0033) [2024-04-27 19:40:13,327][54818] Updated weights for policy 0, policy_version 461628 (0.0026) [2024-04-27 19:40:14,253][54587] Fps is (10 sec: 54068.5, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 7563329536. Throughput: 0: 55825.4. Samples: 468478780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:14,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 19:40:16,238][54818] Updated weights for policy 0, policy_version 461638 (0.0026) [2024-04-27 19:40:19,239][54818] Updated weights for policy 0, policy_version 461648 (0.0031) [2024-04-27 19:40:19,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7563640832. Throughput: 0: 55735.4. Samples: 468812060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:19,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 19:40:22,046][54818] Updated weights for policy 0, policy_version 461658 (0.0028) [2024-04-27 19:40:24,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7563902976. Throughput: 0: 55668.4. Samples: 469145380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:24,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 19:40:25,563][54818] Updated weights for policy 0, policy_version 461668 (0.0029) [2024-04-27 19:40:27,896][54818] Updated weights for policy 0, policy_version 461678 (0.0027) [2024-04-27 19:40:29,253][54587] Fps is (10 sec: 55706.2, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7564197888. Throughput: 0: 55887.6. Samples: 469317120. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:29,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 19:40:31,275][54818] Updated weights for policy 0, policy_version 461688 (0.0028) [2024-04-27 19:40:33,905][54818] Updated weights for policy 0, policy_version 461698 (0.0027) [2024-04-27 19:40:34,253][54587] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7564476416. Throughput: 0: 55861.3. Samples: 469652540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:34,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 19:40:37,237][54818] Updated weights for policy 0, policy_version 461708 (0.0030) [2024-04-27 19:40:39,253][54587] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 7564754944. Throughput: 0: 55917.2. Samples: 469990880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:39,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 19:40:39,745][54818] Updated weights for policy 0, policy_version 461718 (0.0026) [2024-04-27 19:40:43,166][54818] Updated weights for policy 0, policy_version 461728 (0.0031) [2024-04-27 19:40:44,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7565000704. Throughput: 0: 55778.7. Samples: 470160920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:44,262][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 19:40:45,578][54818] Updated weights for policy 0, policy_version 461738 (0.0031) [2024-04-27 19:40:48,820][54818] Updated weights for policy 0, policy_version 461748 (0.0029) [2024-04-27 19:40:49,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7565295616. Throughput: 0: 55840.0. Samples: 470493460. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:49,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 19:40:51,501][54818] Updated weights for policy 0, policy_version 461758 (0.0026) [2024-04-27 19:40:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7565574144. Throughput: 0: 55700.9. Samples: 470824880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:54,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 19:40:54,656][54818] Updated weights for policy 0, policy_version 461768 (0.0035) [2024-04-27 19:40:57,357][54818] Updated weights for policy 0, policy_version 461778 (0.0030) [2024-04-27 19:40:59,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7565852672. Throughput: 0: 55885.2. Samples: 470993620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:40:59,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 19:41:00,586][54818] Updated weights for policy 0, policy_version 461788 (0.0028) [2024-04-27 19:41:03,159][54818] Updated weights for policy 0, policy_version 461798 (0.0027) [2024-04-27 19:41:04,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 7566147584. Throughput: 0: 55908.9. Samples: 471327960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:41:04,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 19:41:06,357][54818] Updated weights for policy 0, policy_version 461808 (0.0028) [2024-04-27 19:41:09,060][54818] Updated weights for policy 0, policy_version 461818 (0.0032) [2024-04-27 19:41:09,253][54587] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7566426112. Throughput: 0: 55934.7. Samples: 471662440. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:41:09,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 19:41:12,275][54818] Updated weights for policy 0, policy_version 461828 (0.0029) [2024-04-27 19:41:14,253][54587] Fps is (10 sec: 55705.8, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7566704640. Throughput: 0: 56029.2. Samples: 471838440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 19:41:14,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 19:41:14,791][54818] Updated weights for policy 0, policy_version 461838 (0.0029) [2024-04-27 19:41:18,010][54818] Updated weights for policy 0, policy_version 461848 (0.0027) [2024-04-27 19:41:19,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7566966784. Throughput: 0: 56079.5. Samples: 472176120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:41:19,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 19:41:20,618][54818] Updated weights for policy 0, policy_version 461858 (0.0031) [2024-04-27 19:41:23,978][54818] Updated weights for policy 0, policy_version 461868 (0.0027) [2024-04-27 19:41:24,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7567245312. Throughput: 0: 55925.0. Samples: 472507500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:41:24,254][54587] Avg episode reward: [(0, '0.472')] [2024-04-27 19:41:26,352][54798] Signal inference workers to stop experience collection... (6750 times) [2024-04-27 19:41:26,397][54818] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-04-27 19:41:26,404][54798] Signal inference workers to resume experience collection... 
(6750 times) [2024-04-27 19:41:26,411][54818] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-04-27 19:41:26,513][54818] Updated weights for policy 0, policy_version 461878 (0.0027) [2024-04-27 19:41:29,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.3, 300 sec: 55539.0). Total num frames: 7567507456. Throughput: 0: 55659.0. Samples: 472665580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:41:29,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 19:41:29,924][54818] Updated weights for policy 0, policy_version 461888 (0.0033) [2024-04-27 19:41:32,470][54818] Updated weights for policy 0, policy_version 461898 (0.0027) [2024-04-27 19:41:34,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7567818752. Throughput: 0: 55733.9. Samples: 473001480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:41:34,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 19:41:35,650][54818] Updated weights for policy 0, policy_version 461908 (0.0022) [2024-04-27 19:41:38,266][54818] Updated weights for policy 0, policy_version 461918 (0.0025) [2024-04-27 19:41:39,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7568080896. Throughput: 0: 55772.9. Samples: 473334660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:41:39,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 19:41:41,432][54818] Updated weights for policy 0, policy_version 461928 (0.0031) [2024-04-27 19:41:44,163][54818] Updated weights for policy 0, policy_version 461938 (0.0041) [2024-04-27 19:41:44,253][54587] Fps is (10 sec: 57344.6, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 7568392192. Throughput: 0: 55861.4. Samples: 473507380. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:41:44,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 19:41:47,625][54818] Updated weights for policy 0, policy_version 461948 (0.0033) [2024-04-27 19:41:49,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7568654336. Throughput: 0: 55907.6. Samples: 473843800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:41:49,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 19:41:49,275][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000461955_7568670720.pth... [2024-04-27 19:41:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000461138_7555284992.pth [2024-04-27 19:41:49,879][54818] Updated weights for policy 0, policy_version 461958 (0.0029) [2024-04-27 19:41:53,541][54818] Updated weights for policy 0, policy_version 461968 (0.0025) [2024-04-27 19:41:54,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7568916480. Throughput: 0: 55971.2. Samples: 474181140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:41:54,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 19:41:55,725][54818] Updated weights for policy 0, policy_version 461978 (0.0027) [2024-04-27 19:41:59,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7569178624. Throughput: 0: 55703.1. Samples: 474345080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:41:59,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 19:41:59,419][54818] Updated weights for policy 0, policy_version 461988 (0.0027) [2024-04-27 19:42:01,682][54818] Updated weights for policy 0, policy_version 461998 (0.0030) [2024-04-27 19:42:04,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7569473536. Throughput: 0: 55641.0. Samples: 474679960. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:42:04,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 19:42:05,328][54818] Updated weights for policy 0, policy_version 462008 (0.0031) [2024-04-27 19:42:07,612][54818] Updated weights for policy 0, policy_version 462018 (0.0032) [2024-04-27 19:42:09,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7569752064. Throughput: 0: 55679.7. Samples: 475013080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:42:09,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 19:42:11,034][54818] Updated weights for policy 0, policy_version 462028 (0.0029) [2024-04-27 19:42:13,350][54818] Updated weights for policy 0, policy_version 462038 (0.0036) [2024-04-27 19:42:14,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7570030592. Throughput: 0: 55960.6. Samples: 475183800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:42:14,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 19:42:16,963][54818] Updated weights for policy 0, policy_version 462048 (0.0038) [2024-04-27 19:42:19,253][54587] Fps is (10 sec: 58982.0, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 7570341888. Throughput: 0: 55856.9. Samples: 475515040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:42:19,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 19:42:19,284][54818] Updated weights for policy 0, policy_version 462058 (0.0028) [2024-04-27 19:42:22,758][54818] Updated weights for policy 0, policy_version 462068 (0.0040) [2024-04-27 19:42:24,253][54587] Fps is (10 sec: 58981.9, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 7570620416. Throughput: 0: 55860.8. Samples: 475848400. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:42:24,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:42:25,171][54818] Updated weights for policy 0, policy_version 462078 (0.0028) [2024-04-27 19:42:28,704][54818] Updated weights for policy 0, policy_version 462088 (0.0026) [2024-04-27 19:42:29,253][54587] Fps is (10 sec: 55705.6, 60 sec: 56524.9, 300 sec: 55761.1). Total num frames: 7570898944. Throughput: 0: 55850.1. Samples: 476020640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:42:29,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 19:42:29,767][54798] Signal inference workers to stop experience collection... (6800 times) [2024-04-27 19:42:29,767][54798] Signal inference workers to resume experience collection... (6800 times) [2024-04-27 19:42:29,794][54818] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-04-27 19:42:29,795][54818] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-04-27 19:42:30,959][54818] Updated weights for policy 0, policy_version 462098 (0.0033) [2024-04-27 19:42:34,253][54587] Fps is (10 sec: 50790.9, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7571128320. Throughput: 0: 55805.9. Samples: 476355060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:42:34,253][54587] Avg episode reward: [(0, '0.520')] [2024-04-27 19:42:34,640][54818] Updated weights for policy 0, policy_version 462108 (0.0024) [2024-04-27 19:42:36,826][54818] Updated weights for policy 0, policy_version 462118 (0.0032) [2024-04-27 19:42:39,253][54587] Fps is (10 sec: 50790.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7571406848. Throughput: 0: 55716.9. Samples: 476688400. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 19:42:39,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 19:42:40,452][54818] Updated weights for policy 0, policy_version 462128 (0.0029) [2024-04-27 19:42:42,719][54818] Updated weights for policy 0, policy_version 462138 (0.0027) [2024-04-27 19:42:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 7571718144. Throughput: 0: 55654.8. Samples: 476849540. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:42:44,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 19:42:46,430][54818] Updated weights for policy 0, policy_version 462148 (0.0028) [2024-04-27 19:42:48,755][54818] Updated weights for policy 0, policy_version 462158 (0.0026) [2024-04-27 19:42:49,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 7571996672. Throughput: 0: 55681.4. Samples: 477185620. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:42:49,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 19:42:52,364][54818] Updated weights for policy 0, policy_version 462168 (0.0028) [2024-04-27 19:42:54,253][54587] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 7572291584. Throughput: 0: 55641.2. Samples: 477516940. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:42:54,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 19:42:54,783][54818] Updated weights for policy 0, policy_version 462178 (0.0030) [2024-04-27 19:42:58,129][54818] Updated weights for policy 0, policy_version 462188 (0.0025) [2024-04-27 19:42:59,253][54587] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 7572570112. Throughput: 0: 55740.3. Samples: 477692120. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:42:59,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 19:43:00,623][54818] Updated weights for policy 0, policy_version 462198 (0.0032) [2024-04-27 19:43:03,888][54818] Updated weights for policy 0, policy_version 462208 (0.0030) [2024-04-27 19:43:04,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 7572832256. Throughput: 0: 55980.7. Samples: 478034180. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:04,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 19:43:06,479][54818] Updated weights for policy 0, policy_version 462218 (0.0033) [2024-04-27 19:43:09,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7573094400. Throughput: 0: 56007.1. Samples: 478368720. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:09,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 19:43:09,803][54818] Updated weights for policy 0, policy_version 462228 (0.0026) [2024-04-27 19:43:12,294][54818] Updated weights for policy 0, policy_version 462238 (0.0028) [2024-04-27 19:43:14,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 7573356544. Throughput: 0: 55611.5. Samples: 478523160. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:14,254][54587] Avg episode reward: [(0, '0.491')] [2024-04-27 19:43:15,642][54818] Updated weights for policy 0, policy_version 462248 (0.0028) [2024-04-27 19:43:18,075][54818] Updated weights for policy 0, policy_version 462258 (0.0025) [2024-04-27 19:43:19,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 7573667840. Throughput: 0: 55609.2. Samples: 478857480. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:19,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 19:43:21,489][54818] Updated weights for policy 0, policy_version 462268 (0.0033) [2024-04-27 19:43:23,971][54818] Updated weights for policy 0, policy_version 462278 (0.0033) [2024-04-27 19:43:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 7573962752. Throughput: 0: 55556.8. Samples: 479188460. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:24,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 19:43:27,479][54818] Updated weights for policy 0, policy_version 462288 (0.0028) [2024-04-27 19:43:27,842][54798] Signal inference workers to stop experience collection... (6850 times) [2024-04-27 19:43:27,846][54798] Signal inference workers to resume experience collection... (6850 times) [2024-04-27 19:43:27,867][54818] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-04-27 19:43:27,867][54818] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-04-27 19:43:29,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 7574241280. Throughput: 0: 55925.7. Samples: 479366200. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:29,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 19:43:30,045][54818] Updated weights for policy 0, policy_version 462298 (0.0034) [2024-04-27 19:43:33,249][54818] Updated weights for policy 0, policy_version 462308 (0.0033) [2024-04-27 19:43:34,253][54587] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 7574519808. Throughput: 0: 55857.4. Samples: 479699200. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:34,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 19:43:35,868][54818] Updated weights for policy 0, policy_version 462318 (0.0027) [2024-04-27 19:43:39,056][54818] Updated weights for policy 0, policy_version 462328 (0.0025) [2024-04-27 19:43:39,253][54587] Fps is (10 sec: 55705.7, 60 sec: 56524.8, 300 sec: 55761.2). Total num frames: 7574798336. Throughput: 0: 56043.1. Samples: 480038880. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:39,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 19:43:41,970][54818] Updated weights for policy 0, policy_version 462338 (0.0030) [2024-04-27 19:43:44,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7575044096. Throughput: 0: 55633.9. Samples: 480195640. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:44,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 19:43:44,869][54818] Updated weights for policy 0, policy_version 462348 (0.0031) [2024-04-27 19:43:47,658][54818] Updated weights for policy 0, policy_version 462358 (0.0029) [2024-04-27 19:43:49,253][54587] Fps is (10 sec: 49152.2, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 7575289856. Throughput: 0: 55450.9. Samples: 480529460. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:49,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 19:43:49,330][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000462360_7575306240.pth... [2024-04-27 19:43:49,380][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000461545_7561953280.pth [2024-04-27 19:43:50,827][54818] Updated weights for policy 0, policy_version 462368 (0.0029) [2024-04-27 19:43:53,444][54818] Updated weights for policy 0, policy_version 462378 (0.0028) [2024-04-27 19:43:54,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55816.7). 
Total num frames: 7575617536. Throughput: 0: 55447.1. Samples: 480863840. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:54,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 19:43:56,566][54818] Updated weights for policy 0, policy_version 462388 (0.0024) [2024-04-27 19:43:59,253][54587] Fps is (10 sec: 62258.8, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 7575912448. Throughput: 0: 55724.1. Samples: 481030740. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:43:59,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 19:43:59,313][54818] Updated weights for policy 0, policy_version 462398 (0.0025) [2024-04-27 19:44:02,520][54818] Updated weights for policy 0, policy_version 462408 (0.0025) [2024-04-27 19:44:04,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7576174592. Throughput: 0: 55670.6. Samples: 481362660. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-04-27 19:44:04,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 19:44:05,129][54818] Updated weights for policy 0, policy_version 462418 (0.0028) [2024-04-27 19:44:08,392][54818] Updated weights for policy 0, policy_version 462428 (0.0029) [2024-04-27 19:44:09,253][54587] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7576469504. Throughput: 0: 55796.4. Samples: 481699300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:09,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 19:44:11,150][54818] Updated weights for policy 0, policy_version 462438 (0.0027) [2024-04-27 19:44:12,928][54798] Signal inference workers to stop experience collection... (6900 times) [2024-04-27 19:44:12,928][54798] Signal inference workers to resume experience collection... 
(6900 times) [2024-04-27 19:44:12,942][54818] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-04-27 19:44:12,942][54818] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-04-27 19:44:14,253][54587] Fps is (10 sec: 55706.5, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 7576731648. Throughput: 0: 55532.5. Samples: 481865160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:14,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 19:44:14,262][54818] Updated weights for policy 0, policy_version 462448 (0.0027) [2024-04-27 19:44:16,957][54818] Updated weights for policy 0, policy_version 462458 (0.0034) [2024-04-27 19:44:19,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7576993792. Throughput: 0: 55756.9. Samples: 482208260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:19,253][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 19:44:19,999][54818] Updated weights for policy 0, policy_version 462468 (0.0027) [2024-04-27 19:44:22,944][54818] Updated weights for policy 0, policy_version 462478 (0.0031) [2024-04-27 19:44:24,253][54587] Fps is (10 sec: 52428.6, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 7577255936. Throughput: 0: 55693.3. Samples: 482545080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:24,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 19:44:25,873][54818] Updated weights for policy 0, policy_version 462488 (0.0031) [2024-04-27 19:44:29,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 7577550848. Throughput: 0: 55591.9. Samples: 482697280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:29,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 19:44:29,307][54818] Updated weights for policy 0, policy_version 462498 (0.0028) [2024-04-27 19:44:31,890][54818] Updated weights for policy 0, policy_version 462508 (0.0029) [2024-04-27 19:44:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 7577862144. Throughput: 0: 55691.6. Samples: 483035580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:34,253][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 19:44:35,320][54818] Updated weights for policy 0, policy_version 462518 (0.0027) [2024-04-27 19:44:37,624][54818] Updated weights for policy 0, policy_version 462528 (0.0029) [2024-04-27 19:44:39,253][54587] Fps is (10 sec: 58982.7, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 7578140672. Throughput: 0: 55666.7. Samples: 483368840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:39,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 19:44:41,292][54818] Updated weights for policy 0, policy_version 462538 (0.0026) [2024-04-27 19:44:43,529][54818] Updated weights for policy 0, policy_version 462548 (0.0032) [2024-04-27 19:44:44,253][54587] Fps is (10 sec: 57343.0, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 7578435584. Throughput: 0: 56107.4. Samples: 483555580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:44,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 19:44:47,028][54818] Updated weights for policy 0, policy_version 462558 (0.0026) [2024-04-27 19:44:49,253][54587] Fps is (10 sec: 55704.7, 60 sec: 56797.7, 300 sec: 55761.1). Total num frames: 7578697728. Throughput: 0: 56139.0. Samples: 483888920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:49,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 19:44:49,317][54818] Updated weights for policy 0, policy_version 462568 (0.0031) [2024-04-27 19:44:53,067][54818] Updated weights for policy 0, policy_version 462578 (0.0030) [2024-04-27 19:44:54,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7578959872. Throughput: 0: 56075.9. Samples: 484222720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:54,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 19:44:55,224][54818] Updated weights for policy 0, policy_version 462588 (0.0032) [2024-04-27 19:44:58,861][54818] Updated weights for policy 0, policy_version 462598 (0.0028) [2024-04-27 19:44:59,253][54587] Fps is (10 sec: 52429.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 7579222016. Throughput: 0: 55859.5. Samples: 484378840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:44:59,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 19:45:01,253][54818] Updated weights for policy 0, policy_version 462608 (0.0027) [2024-04-27 19:45:04,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7579500544. Throughput: 0: 55669.1. Samples: 484713380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:45:04,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 19:45:04,942][54818] Updated weights for policy 0, policy_version 462618 (0.0024) [2024-04-27 19:45:07,187][54818] Updated weights for policy 0, policy_version 462628 (0.0025) [2024-04-27 19:45:07,213][54798] Signal inference workers to stop experience collection... (6950 times) [2024-04-27 19:45:07,248][54818] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-04-27 19:45:07,303][54798] Signal inference workers to resume experience collection... 
(6950 times) [2024-04-27 19:45:07,303][54818] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-04-27 19:45:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 7579811840. Throughput: 0: 55568.8. Samples: 485045680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:45:09,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 19:45:10,633][54818] Updated weights for policy 0, policy_version 462638 (0.0028) [2024-04-27 19:45:13,001][54818] Updated weights for policy 0, policy_version 462648 (0.0035) [2024-04-27 19:45:14,253][54587] Fps is (10 sec: 58983.3, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 7580090368. Throughput: 0: 56037.4. Samples: 485218960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:45:14,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 19:45:16,554][54818] Updated weights for policy 0, policy_version 462658 (0.0030) [2024-04-27 19:45:18,793][54818] Updated weights for policy 0, policy_version 462668 (0.0029) [2024-04-27 19:45:19,253][54587] Fps is (10 sec: 57343.2, 60 sec: 56524.6, 300 sec: 55872.2). Total num frames: 7580385280. Throughput: 0: 55889.0. Samples: 485550600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:45:19,254][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 19:45:22,425][54818] Updated weights for policy 0, policy_version 462678 (0.0031) [2024-04-27 19:45:24,253][54587] Fps is (10 sec: 55705.2, 60 sec: 56524.7, 300 sec: 55761.1). Total num frames: 7580647424. Throughput: 0: 55941.2. Samples: 485886200. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-27 19:45:24,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 19:45:24,551][54818] Updated weights for policy 0, policy_version 462688 (0.0033) [2024-04-27 19:45:28,230][54818] Updated weights for policy 0, policy_version 462698 (0.0034) [2024-04-27 19:45:29,253][54587] Fps is (10 sec: 50791.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7580893184. Throughput: 0: 55376.6. Samples: 486047520. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:45:29,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 19:45:30,449][54818] Updated weights for policy 0, policy_version 462708 (0.0032) [2024-04-27 19:45:34,253][54587] Fps is (10 sec: 50791.2, 60 sec: 54886.4, 300 sec: 55594.6). Total num frames: 7581155328. Throughput: 0: 55301.6. Samples: 486377480. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:45:34,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 19:45:34,430][54818] Updated weights for policy 0, policy_version 462718 (0.0032) [2024-04-27 19:45:36,373][54818] Updated weights for policy 0, policy_version 462728 (0.0031) [2024-04-27 19:45:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 7581450240. Throughput: 0: 55355.8. Samples: 486713720. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:45:39,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 19:45:40,384][54818] Updated weights for policy 0, policy_version 462738 (0.0031) [2024-04-27 19:45:42,244][54818] Updated weights for policy 0, policy_version 462748 (0.0031) [2024-04-27 19:45:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 7581761536. Throughput: 0: 55617.8. Samples: 486881640. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:45:44,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 19:45:46,101][54818] Updated weights for policy 0, policy_version 462758 (0.0025) [2024-04-27 19:45:48,145][54818] Updated weights for policy 0, policy_version 462768 (0.0027) [2024-04-27 19:45:49,253][54587] Fps is (10 sec: 58981.3, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 7582040064. Throughput: 0: 55698.7. Samples: 487219820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:45:49,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 19:45:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000462771_7582040064.pth... [2024-04-27 19:45:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000461955_7568670720.pth [2024-04-27 19:45:51,973][54818] Updated weights for policy 0, policy_version 462778 (0.0026) [2024-04-27 19:45:53,880][54818] Updated weights for policy 0, policy_version 462788 (0.0031) [2024-04-27 19:45:54,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7582318592. Throughput: 0: 55575.6. Samples: 487546580. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:45:54,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 19:45:57,829][54818] Updated weights for policy 0, policy_version 462798 (0.0031) [2024-04-27 19:45:59,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7582580736. Throughput: 0: 55658.6. Samples: 487723600. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:45:59,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 19:45:59,775][54818] Updated weights for policy 0, policy_version 462808 (0.0025) [2024-04-27 19:46:03,654][54818] Updated weights for policy 0, policy_version 462818 (0.0028) [2024-04-27 19:46:04,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55705.8, 300 sec: 55650.1). 
Total num frames: 7582842880. Throughput: 0: 55716.3. Samples: 488057820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:04,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 19:46:05,658][54818] Updated weights for policy 0, policy_version 462828 (0.0024) [2024-04-27 19:46:09,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 7583105024. Throughput: 0: 55693.9. Samples: 488392420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:09,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 19:46:09,384][54818] Updated weights for policy 0, policy_version 462838 (0.0030) [2024-04-27 19:46:11,295][54798] Signal inference workers to stop experience collection... (7000 times) [2024-04-27 19:46:11,296][54798] Signal inference workers to resume experience collection... (7000 times) [2024-04-27 19:46:11,325][54818] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-04-27 19:46:11,326][54818] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-04-27 19:46:11,559][54818] Updated weights for policy 0, policy_version 462848 (0.0024) [2024-04-27 19:46:14,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 7583399936. Throughput: 0: 55647.0. Samples: 488551640. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:14,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 19:46:15,340][54818] Updated weights for policy 0, policy_version 462858 (0.0025) [2024-04-27 19:46:17,604][54818] Updated weights for policy 0, policy_version 462868 (0.0034) [2024-04-27 19:46:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 7583711232. Throughput: 0: 55782.6. Samples: 488887700. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:19,253][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 19:46:21,258][54818] Updated weights for policy 0, policy_version 462878 (0.0026) [2024-04-27 19:46:23,405][54818] Updated weights for policy 0, policy_version 462888 (0.0034) [2024-04-27 19:46:24,253][54587] Fps is (10 sec: 58983.0, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 7583989760. Throughput: 0: 55682.2. Samples: 489219420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:24,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 19:46:27,182][54818] Updated weights for policy 0, policy_version 462898 (0.0028) [2024-04-27 19:46:29,253][54587] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 7584268288. Throughput: 0: 55987.6. Samples: 489401080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:29,253][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 19:46:29,369][54818] Updated weights for policy 0, policy_version 462908 (0.0029) [2024-04-27 19:46:32,915][54818] Updated weights for policy 0, policy_version 462918 (0.0032) [2024-04-27 19:46:34,253][54587] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7584530432. Throughput: 0: 55863.3. Samples: 489733660. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:34,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 19:46:35,433][54818] Updated weights for policy 0, policy_version 462928 (0.0031) [2024-04-27 19:46:38,711][54818] Updated weights for policy 0, policy_version 462938 (0.0036) [2024-04-27 19:46:39,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7584792576. Throughput: 0: 56015.7. Samples: 490067280. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:39,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 19:46:41,238][54818] Updated weights for policy 0, policy_version 462948 (0.0031) [2024-04-27 19:46:44,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7585071104. Throughput: 0: 55661.8. Samples: 490228380. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:44,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 19:46:44,759][54818] Updated weights for policy 0, policy_version 462958 (0.0030) [2024-04-27 19:46:47,002][54818] Updated weights for policy 0, policy_version 462968 (0.0032) [2024-04-27 19:46:49,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 7585366016. Throughput: 0: 55728.4. Samples: 490565600. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 19:46:49,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 19:46:50,609][54818] Updated weights for policy 0, policy_version 462978 (0.0035) [2024-04-27 19:46:53,098][54818] Updated weights for policy 0, policy_version 462988 (0.0028) [2024-04-27 19:46:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 7585677312. Throughput: 0: 55666.2. Samples: 490897400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:46:54,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 19:46:56,467][54818] Updated weights for policy 0, policy_version 462998 (0.0038) [2024-04-27 19:46:59,020][54818] Updated weights for policy 0, policy_version 463008 (0.0033) [2024-04-27 19:46:59,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7585939456. Throughput: 0: 55998.6. Samples: 491071580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:46:59,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 19:47:02,324][54818] Updated weights for policy 0, policy_version 463018 (0.0031) [2024-04-27 19:47:04,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7586201600. Throughput: 0: 55945.3. Samples: 491405240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:04,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 19:47:04,763][54818] Updated weights for policy 0, policy_version 463028 (0.0025) [2024-04-27 19:47:08,131][54818] Updated weights for policy 0, policy_version 463038 (0.0032) [2024-04-27 19:47:09,253][54587] Fps is (10 sec: 54068.2, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 7586480128. Throughput: 0: 55980.5. Samples: 491738540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:09,253][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 19:47:10,763][54818] Updated weights for policy 0, policy_version 463048 (0.0028) [2024-04-27 19:47:13,985][54818] Updated weights for policy 0, policy_version 463058 (0.0029) [2024-04-27 19:47:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7586758656. Throughput: 0: 55464.0. Samples: 491896960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:14,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 19:47:16,740][54818] Updated weights for policy 0, policy_version 463068 (0.0026) [2024-04-27 19:47:16,777][54798] Signal inference workers to stop experience collection... (7050 times) [2024-04-27 19:47:16,777][54798] Signal inference workers to resume experience collection... 
(7050 times) [2024-04-27 19:47:16,803][54818] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-04-27 19:47:16,804][54818] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-04-27 19:47:19,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7587020800. Throughput: 0: 55545.8. Samples: 492233220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:19,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 19:47:19,738][54818] Updated weights for policy 0, policy_version 463078 (0.0027) [2024-04-27 19:47:22,711][54818] Updated weights for policy 0, policy_version 463088 (0.0033) [2024-04-27 19:47:24,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7587332096. Throughput: 0: 55542.5. Samples: 492566700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:24,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 19:47:25,671][54818] Updated weights for policy 0, policy_version 463098 (0.0031) [2024-04-27 19:47:28,561][54818] Updated weights for policy 0, policy_version 463108 (0.0028) [2024-04-27 19:47:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 7587627008. Throughput: 0: 55808.5. Samples: 492739760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:29,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 19:47:31,572][54818] Updated weights for policy 0, policy_version 463118 (0.0029) [2024-04-27 19:47:34,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 7587889152. Throughput: 0: 55680.6. Samples: 493071220. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:34,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 19:47:34,258][54818] Updated weights for policy 0, policy_version 463128 (0.0028) [2024-04-27 19:47:37,490][54818] Updated weights for policy 0, policy_version 463138 (0.0028) [2024-04-27 19:47:39,253][54587] Fps is (10 sec: 54066.2, 60 sec: 56251.5, 300 sec: 55761.1). Total num frames: 7588167680. Throughput: 0: 55795.0. Samples: 493408180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:39,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 19:47:40,029][54818] Updated weights for policy 0, policy_version 463148 (0.0026) [2024-04-27 19:47:43,214][54818] Updated weights for policy 0, policy_version 463158 (0.0029) [2024-04-27 19:47:44,253][54587] Fps is (10 sec: 55704.7, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7588446208. Throughput: 0: 55580.5. Samples: 493572700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:44,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 19:47:45,924][54818] Updated weights for policy 0, policy_version 463168 (0.0029) [2024-04-27 19:47:48,962][54818] Updated weights for policy 0, policy_version 463178 (0.0035) [2024-04-27 19:47:49,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7588708352. Throughput: 0: 55639.4. Samples: 493909020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:49,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 19:47:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000463178_7588708352.pth... [2024-04-27 19:47:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000462360_7575306240.pth [2024-04-27 19:47:52,123][54818] Updated weights for policy 0, policy_version 463188 (0.0028) [2024-04-27 19:47:54,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.4, 300 sec: 55594.5). 
Total num frames: 7588970496. Throughput: 0: 55680.0. Samples: 494244140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:54,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 19:47:54,946][54818] Updated weights for policy 0, policy_version 463198 (0.0025) [2024-04-27 19:47:57,893][54818] Updated weights for policy 0, policy_version 463208 (0.0034) [2024-04-27 19:47:59,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 7589281792. Throughput: 0: 55965.7. Samples: 494415420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:47:59,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-27 19:48:00,739][54818] Updated weights for policy 0, policy_version 463218 (0.0033) [2024-04-27 19:48:03,650][54818] Updated weights for policy 0, policy_version 463228 (0.0029) [2024-04-27 19:48:04,253][54587] Fps is (10 sec: 58982.9, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7589560320. Throughput: 0: 55915.2. Samples: 494749400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:48:04,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 19:48:04,392][54798] Signal inference workers to stop experience collection... (7100 times) [2024-04-27 19:48:04,393][54798] Signal inference workers to resume experience collection... (7100 times) [2024-04-27 19:48:04,420][54818] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-04-27 19:48:04,420][54818] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-04-27 19:48:06,545][54818] Updated weights for policy 0, policy_version 463238 (0.0025) [2024-04-27 19:48:09,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7589822464. Throughput: 0: 55932.2. Samples: 495083640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:48:09,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 19:48:09,739][54818] Updated weights for policy 0, policy_version 463248 (0.0028) [2024-04-27 19:48:12,538][54818] Updated weights for policy 0, policy_version 463258 (0.0030) [2024-04-27 19:48:14,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7590117376. Throughput: 0: 55833.2. Samples: 495252260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 19:48:14,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 19:48:15,477][54818] Updated weights for policy 0, policy_version 463268 (0.0031) [2024-04-27 19:48:18,503][54818] Updated weights for policy 0, policy_version 463278 (0.0034) [2024-04-27 19:48:19,253][54587] Fps is (10 sec: 57343.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7590395904. Throughput: 0: 55901.1. Samples: 495586780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:48:19,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 19:48:21,322][54818] Updated weights for policy 0, policy_version 463288 (0.0033) [2024-04-27 19:48:24,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7590658048. Throughput: 0: 55767.4. Samples: 495917700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:48:24,253][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 19:48:24,327][54818] Updated weights for policy 0, policy_version 463298 (0.0031) [2024-04-27 19:48:27,259][54818] Updated weights for policy 0, policy_version 463308 (0.0028) [2024-04-27 19:48:29,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 7590936576. Throughput: 0: 55794.7. Samples: 496083460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:48:29,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 19:48:30,241][54818] Updated weights for policy 0, policy_version 463318 (0.0028) [2024-04-27 19:48:33,217][54818] Updated weights for policy 0, policy_version 463328 (0.0025) [2024-04-27 19:48:34,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7591247872. Throughput: 0: 55760.7. Samples: 496418240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:48:34,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 19:48:35,984][54818] Updated weights for policy 0, policy_version 463338 (0.0031) [2024-04-27 19:48:39,201][54818] Updated weights for policy 0, policy_version 463348 (0.0032) [2024-04-27 19:48:39,255][54587] Fps is (10 sec: 55694.5, 60 sec: 55430.8, 300 sec: 55760.8). Total num frames: 7591493632. Throughput: 0: 55695.3. Samples: 496750540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:48:39,256][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 19:48:41,979][54818] Updated weights for policy 0, policy_version 463358 (0.0036) [2024-04-27 19:48:44,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 7591772160. Throughput: 0: 55596.5. Samples: 496917260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:48:44,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 19:48:44,916][54818] Updated weights for policy 0, policy_version 463368 (0.0025) [2024-04-27 19:48:47,855][54818] Updated weights for policy 0, policy_version 463378 (0.0032) [2024-04-27 19:48:49,253][54587] Fps is (10 sec: 55717.2, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 7592050688. Throughput: 0: 55588.8. Samples: 497250900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:48:49,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 19:48:50,728][54818] Updated weights for policy 0, policy_version 463388 (0.0035) [2024-04-27 19:48:53,688][54818] Updated weights for policy 0, policy_version 463398 (0.0029) [2024-04-27 19:48:54,253][54587] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 7592345600. Throughput: 0: 55587.1. Samples: 497585060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:48:54,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 19:48:56,683][54818] Updated weights for policy 0, policy_version 463408 (0.0031) [2024-04-27 19:48:59,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7592607744. Throughput: 0: 55599.2. Samples: 497754220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:48:59,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 19:48:59,668][54818] Updated weights for policy 0, policy_version 463418 (0.0029) [2024-04-27 19:49:02,530][54818] Updated weights for policy 0, policy_version 463428 (0.0025) [2024-04-27 19:49:04,253][54587] Fps is (10 sec: 54066.2, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 7592886272. Throughput: 0: 55577.8. Samples: 498087780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:49:04,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 19:49:05,430][54818] Updated weights for policy 0, policy_version 463438 (0.0032) [2024-04-27 19:49:08,264][54818] Updated weights for policy 0, policy_version 463448 (0.0031) [2024-04-27 19:49:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7593181184. Throughput: 0: 55623.5. Samples: 498420760. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:49:09,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 19:49:11,340][54818] Updated weights for policy 0, policy_version 463458 (0.0031) [2024-04-27 19:49:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7593443328. Throughput: 0: 55700.4. Samples: 498589980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:49:14,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 19:49:14,273][54818] Updated weights for policy 0, policy_version 463468 (0.0029) [2024-04-27 19:49:17,235][54818] Updated weights for policy 0, policy_version 463478 (0.0023) [2024-04-27 19:49:19,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 7593721856. Throughput: 0: 55686.6. Samples: 498924140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:49:19,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 19:49:20,236][54818] Updated weights for policy 0, policy_version 463488 (0.0027) [2024-04-27 19:49:23,149][54818] Updated weights for policy 0, policy_version 463498 (0.0029) [2024-04-27 19:49:24,253][54587] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 7594000384. Throughput: 0: 55630.6. Samples: 499253800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:49:24,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 19:49:26,033][54818] Updated weights for policy 0, policy_version 463508 (0.0028) [2024-04-27 19:49:28,950][54818] Updated weights for policy 0, policy_version 463518 (0.0029) [2024-04-27 19:49:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7594295296. Throughput: 0: 55734.7. Samples: 499425320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:49:29,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 19:49:31,920][54818] Updated weights for policy 0, policy_version 463528 (0.0026) [2024-04-27 19:49:34,253][54587] Fps is (10 sec: 54066.9, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 7594541056. Throughput: 0: 55593.8. Samples: 499752620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:49:34,253][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 19:49:34,357][54798] Signal inference workers to stop experience collection... (7150 times) [2024-04-27 19:49:34,357][54798] Signal inference workers to resume experience collection... (7150 times) [2024-04-27 19:49:34,368][54818] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-04-27 19:49:34,368][54818] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-04-27 19:49:34,858][54818] Updated weights for policy 0, policy_version 463538 (0.0031) [2024-04-27 19:49:37,827][54818] Updated weights for policy 0, policy_version 463548 (0.0029) [2024-04-27 19:49:39,253][54587] Fps is (10 sec: 52428.0, 60 sec: 55434.3, 300 sec: 55539.0). Total num frames: 7594819584. Throughput: 0: 55575.7. Samples: 500085980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-27 19:49:39,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 19:49:40,988][54818] Updated weights for policy 0, policy_version 463558 (0.0028) [2024-04-27 19:49:43,895][54818] Updated weights for policy 0, policy_version 463568 (0.0027) [2024-04-27 19:49:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7595130880. Throughput: 0: 55528.8. Samples: 500253020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:49:44,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 19:49:46,722][54818] Updated weights for policy 0, policy_version 463578 (0.0030) [2024-04-27 19:49:49,253][54587] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7595393024. Throughput: 0: 55553.1. Samples: 500587660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:49:49,253][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 19:49:49,286][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000463587_7595409408.pth... [2024-04-27 19:49:49,332][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000462771_7582040064.pth [2024-04-27 19:49:49,684][54818] Updated weights for policy 0, policy_version 463588 (0.0034) [2024-04-27 19:49:52,526][54818] Updated weights for policy 0, policy_version 463598 (0.0030) [2024-04-27 19:49:54,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 7595671552. Throughput: 0: 55629.8. Samples: 500924100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:49:54,253][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 19:49:55,621][54818] Updated weights for policy 0, policy_version 463608 (0.0030) [2024-04-27 19:49:58,412][54818] Updated weights for policy 0, policy_version 463618 (0.0034) [2024-04-27 19:49:59,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 7595950080. Throughput: 0: 55661.5. Samples: 501094740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:49:59,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 19:50:01,510][54818] Updated weights for policy 0, policy_version 463628 (0.0026) [2024-04-27 19:50:04,155][54818] Updated weights for policy 0, policy_version 463638 (0.0028) [2024-04-27 19:50:04,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.8, 300 sec: 55705.6). 
Total num frames: 7596244992. Throughput: 0: 55625.0. Samples: 501427260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:04,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 19:50:07,445][54818] Updated weights for policy 0, policy_version 463648 (0.0031) [2024-04-27 19:50:09,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7596490752. Throughput: 0: 55752.7. Samples: 501762680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:09,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 19:50:10,042][54818] Updated weights for policy 0, policy_version 463658 (0.0026) [2024-04-27 19:50:13,282][54818] Updated weights for policy 0, policy_version 463668 (0.0025) [2024-04-27 19:50:14,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 7596785664. Throughput: 0: 55474.2. Samples: 501921660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:14,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 19:50:16,051][54818] Updated weights for policy 0, policy_version 463678 (0.0035) [2024-04-27 19:50:19,184][54818] Updated weights for policy 0, policy_version 463688 (0.0031) [2024-04-27 19:50:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7597064192. Throughput: 0: 55631.0. Samples: 502256020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:19,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:50:22,033][54818] Updated weights for policy 0, policy_version 463698 (0.0023) [2024-04-27 19:50:24,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7597342720. Throughput: 0: 55560.2. Samples: 502586180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:24,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:50:25,201][54818] Updated weights for policy 0, policy_version 463708 (0.0029) [2024-04-27 19:50:27,859][54818] Updated weights for policy 0, policy_version 463718 (0.0027) [2024-04-27 19:50:29,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 7597637632. Throughput: 0: 55869.6. Samples: 502767160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:29,254][54587] Avg episode reward: [(0, '0.677')] [2024-04-27 19:50:31,028][54818] Updated weights for policy 0, policy_version 463728 (0.0025) [2024-04-27 19:50:33,599][54818] Updated weights for policy 0, policy_version 463738 (0.0036) [2024-04-27 19:50:34,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7597899776. Throughput: 0: 55863.1. Samples: 503101500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:34,253][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:50:36,845][54818] Updated weights for policy 0, policy_version 463748 (0.0033) [2024-04-27 19:50:39,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 7598178304. Throughput: 0: 55652.2. Samples: 503428460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:39,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 19:50:39,538][54818] Updated weights for policy 0, policy_version 463758 (0.0034) [2024-04-27 19:50:40,376][54798] Signal inference workers to stop experience collection... (7200 times) [2024-04-27 19:50:40,420][54818] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-04-27 19:50:40,435][54798] Signal inference workers to resume experience collection... 
(7200 times) [2024-04-27 19:50:40,437][54818] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-04-27 19:50:42,799][54818] Updated weights for policy 0, policy_version 463768 (0.0028) [2024-04-27 19:50:44,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55594.6). Total num frames: 7598440448. Throughput: 0: 55489.2. Samples: 503591760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:44,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 19:50:45,442][54818] Updated weights for policy 0, policy_version 463778 (0.0032) [2024-04-27 19:50:48,681][54818] Updated weights for policy 0, policy_version 463788 (0.0029) [2024-04-27 19:50:49,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7598718976. Throughput: 0: 55546.9. Samples: 503926880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:49,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 19:50:51,359][54818] Updated weights for policy 0, policy_version 463798 (0.0025) [2024-04-27 19:50:54,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7598997504. Throughput: 0: 55488.1. Samples: 504259640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:54,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 19:50:54,590][54818] Updated weights for policy 0, policy_version 463808 (0.0027) [2024-04-27 19:50:57,162][54818] Updated weights for policy 0, policy_version 463818 (0.0028) [2024-04-27 19:50:59,253][54587] Fps is (10 sec: 57345.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7599292416. Throughput: 0: 55649.4. Samples: 504425880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:50:59,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 19:51:00,347][54818] Updated weights for policy 0, policy_version 463828 (0.0031) [2024-04-27 19:51:03,229][54818] Updated weights for policy 0, policy_version 463838 (0.0025) [2024-04-27 19:51:04,253][54587] Fps is (10 sec: 60620.4, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 7599603712. Throughput: 0: 55686.7. Samples: 504761920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 19:51:04,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 19:51:06,177][54818] Updated weights for policy 0, policy_version 463848 (0.0034) [2024-04-27 19:51:09,191][54818] Updated weights for policy 0, policy_version 463858 (0.0029) [2024-04-27 19:51:09,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7599849472. Throughput: 0: 55784.3. Samples: 505096480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:09,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 19:51:12,074][54818] Updated weights for policy 0, policy_version 463868 (0.0032) [2024-04-27 19:51:14,253][54587] Fps is (10 sec: 50790.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7600111616. Throughput: 0: 55410.7. Samples: 505260640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:14,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 19:51:14,936][54818] Updated weights for policy 0, policy_version 463878 (0.0028) [2024-04-27 19:51:17,890][54818] Updated weights for policy 0, policy_version 463888 (0.0034) [2024-04-27 19:51:19,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7600390144. Throughput: 0: 55446.1. Samples: 505596580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:19,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 19:51:20,687][54818] Updated weights for policy 0, policy_version 463898 (0.0031) [2024-04-27 19:51:23,799][54818] Updated weights for policy 0, policy_version 463908 (0.0030) [2024-04-27 19:51:24,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7600685056. Throughput: 0: 55663.2. Samples: 505933300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:24,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 19:51:26,560][54818] Updated weights for policy 0, policy_version 463918 (0.0030) [2024-04-27 19:51:29,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 7600947200. Throughput: 0: 55565.3. Samples: 506092200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:29,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 19:51:29,897][54818] Updated weights for policy 0, policy_version 463928 (0.0024) [2024-04-27 19:51:32,617][54818] Updated weights for policy 0, policy_version 463938 (0.0037) [2024-04-27 19:51:34,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7601242112. Throughput: 0: 55506.3. Samples: 506424660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:34,254][54587] Avg episode reward: [(0, '0.499')] [2024-04-27 19:51:35,867][54818] Updated weights for policy 0, policy_version 463948 (0.0029) [2024-04-27 19:51:38,402][54818] Updated weights for policy 0, policy_version 463958 (0.0032) [2024-04-27 19:51:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7601537024. Throughput: 0: 55424.5. Samples: 506753740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:39,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 19:51:41,846][54818] Updated weights for policy 0, policy_version 463968 (0.0026) [2024-04-27 19:51:44,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7601799168. Throughput: 0: 55644.8. Samples: 506929900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:44,254][54587] Avg episode reward: [(0, '0.679')] [2024-04-27 19:51:44,275][54818] Updated weights for policy 0, policy_version 463978 (0.0029) [2024-04-27 19:51:47,677][54818] Updated weights for policy 0, policy_version 463988 (0.0029) [2024-04-27 19:51:49,067][54798] Signal inference workers to stop experience collection... (7250 times) [2024-04-27 19:51:49,067][54798] Signal inference workers to resume experience collection... (7250 times) [2024-04-27 19:51:49,082][54818] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-04-27 19:51:49,082][54818] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-04-27 19:51:49,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 7602061312. Throughput: 0: 55556.1. Samples: 507261940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:49,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 19:51:49,345][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000463994_7602077696.pth... [2024-04-27 19:51:49,387][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000463178_7588708352.pth [2024-04-27 19:51:50,220][54818] Updated weights for policy 0, policy_version 463998 (0.0028) [2024-04-27 19:51:53,553][54818] Updated weights for policy 0, policy_version 464008 (0.0026) [2024-04-27 19:51:54,254][54587] Fps is (10 sec: 52427.5, 60 sec: 55432.3, 300 sec: 55539.0). Total num frames: 7602323456. Throughput: 0: 55441.5. Samples: 507591360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:54,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-27 19:51:56,219][54818] Updated weights for policy 0, policy_version 464018 (0.0032) [2024-04-27 19:51:59,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7602601984. Throughput: 0: 55452.1. Samples: 507755980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:51:59,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 19:51:59,480][54818] Updated weights for policy 0, policy_version 464028 (0.0024) [2024-04-27 19:52:01,951][54818] Updated weights for policy 0, policy_version 464038 (0.0028) [2024-04-27 19:52:04,253][54587] Fps is (10 sec: 57345.1, 60 sec: 54886.4, 300 sec: 55650.0). Total num frames: 7602896896. Throughput: 0: 55424.4. Samples: 508090680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:52:04,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 19:52:05,325][54818] Updated weights for policy 0, policy_version 464048 (0.0031) [2024-04-27 19:52:07,626][54818] Updated weights for policy 0, policy_version 464058 (0.0030) [2024-04-27 19:52:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7603191808. Throughput: 0: 55405.8. Samples: 508426560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:52:09,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 19:52:11,209][54818] Updated weights for policy 0, policy_version 464068 (0.0030) [2024-04-27 19:52:13,528][54818] Updated weights for policy 0, policy_version 464078 (0.0029) [2024-04-27 19:52:14,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 7603470336. Throughput: 0: 55794.3. Samples: 508602940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:52:14,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 19:52:17,147][54818] Updated weights for policy 0, policy_version 464088 (0.0028) [2024-04-27 19:52:19,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7603748864. Throughput: 0: 55789.0. Samples: 508935160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:52:19,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 19:52:19,465][54818] Updated weights for policy 0, policy_version 464098 (0.0026) [2024-04-27 19:52:23,070][54818] Updated weights for policy 0, policy_version 464108 (0.0028) [2024-04-27 19:52:24,253][54587] Fps is (10 sec: 52428.1, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7603994624. Throughput: 0: 55882.6. Samples: 509268460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:52:24,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 19:52:25,389][54818] Updated weights for policy 0, policy_version 464118 (0.0034) [2024-04-27 19:52:28,838][54818] Updated weights for policy 0, policy_version 464128 (0.0026) [2024-04-27 19:52:29,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7604273152. Throughput: 0: 55509.8. Samples: 509427840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 19:52:29,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 19:52:31,306][54818] Updated weights for policy 0, policy_version 464138 (0.0028) [2024-04-27 19:52:34,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7604551680. Throughput: 0: 55591.4. Samples: 509763560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:52:34,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 19:52:34,653][54818] Updated weights for policy 0, policy_version 464148 (0.0029) [2024-04-27 19:52:37,186][54818] Updated weights for policy 0, policy_version 464158 (0.0026) [2024-04-27 19:52:39,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7604846592. Throughput: 0: 55830.0. Samples: 510103700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:52:39,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 19:52:40,465][54818] Updated weights for policy 0, policy_version 464168 (0.0025) [2024-04-27 19:52:43,066][54818] Updated weights for policy 0, policy_version 464178 (0.0030) [2024-04-27 19:52:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7605141504. Throughput: 0: 55868.3. Samples: 510270060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:52:44,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 19:52:46,593][54818] Updated weights for policy 0, policy_version 464188 (0.0028) [2024-04-27 19:52:47,617][54798] Signal inference workers to stop experience collection... (7300 times) [2024-04-27 19:52:47,618][54798] Signal inference workers to resume experience collection... (7300 times) [2024-04-27 19:52:47,646][54818] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-04-27 19:52:47,646][54818] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-04-27 19:52:48,874][54818] Updated weights for policy 0, policy_version 464198 (0.0031) [2024-04-27 19:52:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7605420032. Throughput: 0: 55847.1. Samples: 510603800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:52:49,254][54587] Avg episode reward: [(0, '0.474')] [2024-04-27 19:52:52,477][54818] Updated weights for policy 0, policy_version 464208 (0.0031) [2024-04-27 19:52:54,253][54587] Fps is (10 sec: 57344.5, 60 sec: 56525.0, 300 sec: 55705.6). Total num frames: 7605714944. Throughput: 0: 55694.3. Samples: 510932800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:52:54,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 19:52:54,657][54818] Updated weights for policy 0, policy_version 464218 (0.0030) [2024-04-27 19:52:58,325][54818] Updated weights for policy 0, policy_version 464228 (0.0027) [2024-04-27 19:52:59,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7605960704. Throughput: 0: 55572.0. Samples: 511103680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:52:59,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 19:53:00,691][54818] Updated weights for policy 0, policy_version 464238 (0.0025) [2024-04-27 19:53:04,253][54587] Fps is (10 sec: 50790.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7606222848. Throughput: 0: 55685.3. Samples: 511441000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:04,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 19:53:04,314][54818] Updated weights for policy 0, policy_version 464248 (0.0026) [2024-04-27 19:53:07,014][54818] Updated weights for policy 0, policy_version 464258 (0.0027) [2024-04-27 19:53:09,253][54587] Fps is (10 sec: 54066.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7606501376. Throughput: 0: 55608.9. Samples: 511770860. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:09,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 19:53:10,056][54818] Updated weights for policy 0, policy_version 464268 (0.0028) [2024-04-27 19:53:12,935][54818] Updated weights for policy 0, policy_version 464278 (0.0027) [2024-04-27 19:53:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7606779904. Throughput: 0: 55627.6. Samples: 511931080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:14,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 19:53:16,075][54818] Updated weights for policy 0, policy_version 464288 (0.0027) [2024-04-27 19:53:18,690][54818] Updated weights for policy 0, policy_version 464298 (0.0038) [2024-04-27 19:53:19,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 7607074816. Throughput: 0: 55583.4. Samples: 512264820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:19,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:53:21,955][54818] Updated weights for policy 0, policy_version 464308 (0.0029) [2024-04-27 19:53:24,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7607353344. Throughput: 0: 55365.1. Samples: 512595120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:24,253][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 19:53:24,456][54818] Updated weights for policy 0, policy_version 464318 (0.0030) [2024-04-27 19:53:27,781][54818] Updated weights for policy 0, policy_version 464328 (0.0029) [2024-04-27 19:53:29,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7607631872. Throughput: 0: 55526.4. Samples: 512768740. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:29,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 19:53:30,449][54818] Updated weights for policy 0, policy_version 464338 (0.0030) [2024-04-27 19:53:33,575][54818] Updated weights for policy 0, policy_version 464348 (0.0031) [2024-04-27 19:53:34,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55978.8, 300 sec: 55650.5). Total num frames: 7607910400. Throughput: 0: 55630.9. Samples: 513107180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:34,253][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:53:36,356][54818] Updated weights for policy 0, policy_version 464358 (0.0027) [2024-04-27 19:53:39,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7608172544. Throughput: 0: 55800.8. Samples: 513443840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:39,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 19:53:39,482][54818] Updated weights for policy 0, policy_version 464368 (0.0028) [2024-04-27 19:53:42,048][54818] Updated weights for policy 0, policy_version 464378 (0.0032) [2024-04-27 19:53:44,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 7608451072. Throughput: 0: 55455.5. Samples: 513599180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:44,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 19:53:45,258][54818] Updated weights for policy 0, policy_version 464388 (0.0027) [2024-04-27 19:53:47,894][54818] Updated weights for policy 0, policy_version 464398 (0.0027) [2024-04-27 19:53:49,253][54587] Fps is (10 sec: 54067.6, 60 sec: 54886.5, 300 sec: 55483.4). Total num frames: 7608713216. Throughput: 0: 55442.7. Samples: 513935920. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:49,253][54587] Avg episode reward: [(0, '0.672')] [2024-04-27 19:53:49,299][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000464400_7608729600.pth... [2024-04-27 19:53:49,339][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000463587_7595409408.pth [2024-04-27 19:53:50,767][54798] Signal inference workers to stop experience collection... (7350 times) [2024-04-27 19:53:50,775][54798] Signal inference workers to resume experience collection... (7350 times) [2024-04-27 19:53:50,795][54818] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-04-27 19:53:50,795][54818] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-04-27 19:53:51,053][54818] Updated weights for policy 0, policy_version 464408 (0.0033) [2024-04-27 19:53:53,895][54818] Updated weights for policy 0, policy_version 464418 (0.0027) [2024-04-27 19:53:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7609024512. Throughput: 0: 55581.4. Samples: 514272020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-27 19:53:54,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 19:53:56,881][54818] Updated weights for policy 0, policy_version 464428 (0.0029) [2024-04-27 19:53:59,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7609286656. Throughput: 0: 55732.3. Samples: 514439040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:53:59,254][54587] Avg episode reward: [(0, '0.494')] [2024-04-27 19:53:59,841][54818] Updated weights for policy 0, policy_version 464438 (0.0036) [2024-04-27 19:54:02,784][54818] Updated weights for policy 0, policy_version 464448 (0.0036) [2024-04-27 19:54:04,253][54587] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 7609597952. Throughput: 0: 55717.9. Samples: 514772120. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:04,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 19:54:05,693][54818] Updated weights for policy 0, policy_version 464458 (0.0030) [2024-04-27 19:54:08,636][54818] Updated weights for policy 0, policy_version 464468 (0.0029) [2024-04-27 19:54:09,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7609860096. Throughput: 0: 55742.2. Samples: 515103520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:09,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 19:54:11,725][54818] Updated weights for policy 0, policy_version 464478 (0.0030) [2024-04-27 19:54:14,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 7610138624. Throughput: 0: 55700.7. Samples: 515275280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:14,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 19:54:14,511][54818] Updated weights for policy 0, policy_version 464488 (0.0029) [2024-04-27 19:54:17,430][54818] Updated weights for policy 0, policy_version 464498 (0.0029) [2024-04-27 19:54:19,253][54587] Fps is (10 sec: 50789.5, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 7610368000. Throughput: 0: 55514.0. Samples: 515605320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:19,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 19:54:20,541][54818] Updated weights for policy 0, policy_version 464508 (0.0026) [2024-04-27 19:54:23,239][54818] Updated weights for policy 0, policy_version 464518 (0.0028) [2024-04-27 19:54:24,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 7610662912. Throughput: 0: 55431.1. Samples: 515938240. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:24,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 19:54:26,300][54818] Updated weights for policy 0, policy_version 464528 (0.0026) [2024-04-27 19:54:29,253][54587] Fps is (10 sec: 60620.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7610974208. Throughput: 0: 55679.4. Samples: 516104760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:29,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 19:54:29,656][54818] Updated weights for policy 0, policy_version 464538 (0.0030) [2024-04-27 19:54:32,150][54818] Updated weights for policy 0, policy_version 464548 (0.0027) [2024-04-27 19:54:34,253][54587] Fps is (10 sec: 57345.1, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7611236352. Throughput: 0: 55573.4. Samples: 516436720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:34,253][54587] Avg episode reward: [(0, '0.507')] [2024-04-27 19:54:35,533][54818] Updated weights for policy 0, policy_version 464558 (0.0029) [2024-04-27 19:54:36,899][54798] Signal inference workers to stop experience collection... (7400 times) [2024-04-27 19:54:36,913][54818] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-04-27 19:54:36,990][54798] Signal inference workers to resume experience collection... (7400 times) [2024-04-27 19:54:36,990][54818] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-04-27 19:54:38,160][54818] Updated weights for policy 0, policy_version 464568 (0.0029) [2024-04-27 19:54:39,253][54587] Fps is (10 sec: 57345.0, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 7611547648. Throughput: 0: 55469.9. Samples: 516768160. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:39,253][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 19:54:41,369][54818] Updated weights for policy 0, policy_version 464578 (0.0030) [2024-04-27 19:54:43,963][54818] Updated weights for policy 0, policy_version 464588 (0.0027) [2024-04-27 19:54:44,253][54587] Fps is (10 sec: 58982.0, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7611826176. Throughput: 0: 55623.3. Samples: 516942080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:44,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 19:54:47,466][54818] Updated weights for policy 0, policy_version 464598 (0.0034) [2024-04-27 19:54:49,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7612071936. Throughput: 0: 55669.8. Samples: 517277260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:49,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 19:54:49,896][54818] Updated weights for policy 0, policy_version 464608 (0.0030) [2024-04-27 19:54:53,266][54818] Updated weights for policy 0, policy_version 464618 (0.0025) [2024-04-27 19:54:54,253][54587] Fps is (10 sec: 49151.8, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 7612317696. Throughput: 0: 55699.5. Samples: 517610000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:54,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 19:54:55,766][54818] Updated weights for policy 0, policy_version 464628 (0.0028) [2024-04-27 19:54:59,006][54818] Updated weights for policy 0, policy_version 464638 (0.0030) [2024-04-27 19:54:59,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7612628992. Throughput: 0: 55297.8. Samples: 517763680. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:54:59,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 19:55:01,808][54818] Updated weights for policy 0, policy_version 464648 (0.0028) [2024-04-27 19:55:04,253][54587] Fps is (10 sec: 60619.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7612923904. Throughput: 0: 55333.7. Samples: 518095340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:55:04,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 19:55:04,891][54818] Updated weights for policy 0, policy_version 464658 (0.0026) [2024-04-27 19:55:07,785][54818] Updated weights for policy 0, policy_version 464668 (0.0024) [2024-04-27 19:55:09,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7613202432. Throughput: 0: 55378.8. Samples: 518430280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:55:09,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 19:55:10,917][54818] Updated weights for policy 0, policy_version 464678 (0.0026) [2024-04-27 19:55:13,534][54818] Updated weights for policy 0, policy_version 464688 (0.0034) [2024-04-27 19:55:14,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7613480960. Throughput: 0: 55768.2. Samples: 518614320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:55:14,253][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:55:17,462][54818] Updated weights for policy 0, policy_version 464698 (0.0023) [2024-04-27 19:55:19,253][54587] Fps is (10 sec: 55704.9, 60 sec: 56524.9, 300 sec: 55650.0). Total num frames: 7613759488. Throughput: 0: 55759.3. Samples: 518945900. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-27 19:55:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 19:55:19,383][54818] Updated weights for policy 0, policy_version 464708 (0.0024) [2024-04-27 19:55:23,296][54818] Updated weights for policy 0, policy_version 464718 (0.0025) [2024-04-27 19:55:24,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 7614005248. Throughput: 0: 55701.6. Samples: 519274740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:55:24,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 19:55:25,363][54818] Updated weights for policy 0, policy_version 464728 (0.0033) [2024-04-27 19:55:29,103][54818] Updated weights for policy 0, policy_version 464738 (0.0034) [2024-04-27 19:55:29,253][54587] Fps is (10 sec: 50790.6, 60 sec: 54886.5, 300 sec: 55483.4). Total num frames: 7614267392. Throughput: 0: 55256.4. Samples: 519428620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:55:29,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 19:55:29,671][54798] Signal inference workers to stop experience collection... (7450 times) [2024-04-27 19:55:29,706][54818] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-04-27 19:55:29,734][54798] Signal inference workers to resume experience collection... (7450 times) [2024-04-27 19:55:29,734][54818] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-04-27 19:55:31,399][54818] Updated weights for policy 0, policy_version 464748 (0.0031) [2024-04-27 19:55:34,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7614562304. Throughput: 0: 55184.0. Samples: 519760540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:55:34,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 19:55:35,030][54818] Updated weights for policy 0, policy_version 464758 (0.0035) [2024-04-27 19:55:37,132][54818] Updated weights for policy 0, policy_version 464768 (0.0025) [2024-04-27 19:55:39,253][54587] Fps is (10 sec: 58981.6, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 7614857216. Throughput: 0: 55170.5. Samples: 520092680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:55:39,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 19:55:41,057][54818] Updated weights for policy 0, policy_version 464778 (0.0033) [2024-04-27 19:55:43,175][54818] Updated weights for policy 0, policy_version 464788 (0.0030) [2024-04-27 19:55:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7615135744. Throughput: 0: 55549.8. Samples: 520263420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:55:44,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 19:55:46,832][54818] Updated weights for policy 0, policy_version 464798 (0.0032) [2024-04-27 19:55:49,001][54818] Updated weights for policy 0, policy_version 464808 (0.0027) [2024-04-27 19:55:49,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7615414272. Throughput: 0: 55740.2. Samples: 520603640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:55:49,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 19:55:49,326][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000464809_7615430656.pth... [2024-04-27 19:55:49,377][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000463994_7602077696.pth [2024-04-27 19:55:52,613][54818] Updated weights for policy 0, policy_version 464818 (0.0029) [2024-04-27 19:55:54,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55539.0). 
Total num frames: 7615676416. Throughput: 0: 55775.0. Samples: 520940160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:55:54,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 19:55:55,020][54818] Updated weights for policy 0, policy_version 464828 (0.0035) [2024-04-27 19:55:58,594][54818] Updated weights for policy 0, policy_version 464838 (0.0025) [2024-04-27 19:55:59,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7615954944. Throughput: 0: 55173.4. Samples: 521097120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:55:59,253][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 19:56:00,754][54818] Updated weights for policy 0, policy_version 464848 (0.0031) [2024-04-27 19:56:04,253][54587] Fps is (10 sec: 54067.4, 60 sec: 54886.6, 300 sec: 55483.5). Total num frames: 7616217088. Throughput: 0: 55166.4. Samples: 521428380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:56:04,253][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 19:56:04,412][54818] Updated weights for policy 0, policy_version 464858 (0.0033) [2024-04-27 19:56:06,692][54818] Updated weights for policy 0, policy_version 464868 (0.0027) [2024-04-27 19:56:09,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7616528384. Throughput: 0: 55335.6. Samples: 521764840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:56:09,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 19:56:10,123][54818] Updated weights for policy 0, policy_version 464878 (0.0025) [2024-04-27 19:56:12,601][54818] Updated weights for policy 0, policy_version 464888 (0.0027) [2024-04-27 19:56:14,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7616806912. Throughput: 0: 55730.7. Samples: 521936500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:56:14,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:56:15,837][54818] Updated weights for policy 0, policy_version 464898 (0.0022) [2024-04-27 19:56:18,048][54798] Signal inference workers to stop experience collection... (7500 times) [2024-04-27 19:56:18,048][54798] Signal inference workers to resume experience collection... (7500 times) [2024-04-27 19:56:18,059][54818] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-04-27 19:56:18,059][54818] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-04-27 19:56:18,508][54818] Updated weights for policy 0, policy_version 464908 (0.0028) [2024-04-27 19:56:19,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7617085440. Throughput: 0: 55700.9. Samples: 522267080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:56:19,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 19:56:22,011][54818] Updated weights for policy 0, policy_version 464918 (0.0041) [2024-04-27 19:56:24,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 7617363968. Throughput: 0: 55672.5. Samples: 522597940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:56:24,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 19:56:24,590][54818] Updated weights for policy 0, policy_version 464928 (0.0032) [2024-04-27 19:56:27,870][54818] Updated weights for policy 0, policy_version 464938 (0.0030) [2024-04-27 19:56:29,253][54587] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 7617642496. Throughput: 0: 55556.4. Samples: 522763460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:56:29,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:56:30,446][54818] Updated weights for policy 0, policy_version 464948 (0.0025) [2024-04-27 19:56:33,695][54818] Updated weights for policy 0, policy_version 464958 (0.0031) [2024-04-27 19:56:34,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7617888256. Throughput: 0: 55528.8. Samples: 523102440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:56:34,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 19:56:36,245][54818] Updated weights for policy 0, policy_version 464968 (0.0031) [2024-04-27 19:56:39,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7618166784. Throughput: 0: 55443.4. Samples: 523435120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:56:39,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 19:56:39,517][54818] Updated weights for policy 0, policy_version 464978 (0.0027) [2024-04-27 19:56:42,124][54818] Updated weights for policy 0, policy_version 464988 (0.0029) [2024-04-27 19:56:44,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7618445312. Throughput: 0: 55552.4. Samples: 523596980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-27 19:56:44,253][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 19:56:45,653][54818] Updated weights for policy 0, policy_version 464998 (0.0032) [2024-04-27 19:56:48,017][54818] Updated weights for policy 0, policy_version 465008 (0.0031) [2024-04-27 19:56:49,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7618740224. Throughput: 0: 55576.0. Samples: 523929300. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:56:49,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 19:56:51,364][54818] Updated weights for policy 0, policy_version 465018 (0.0027) [2024-04-27 19:56:53,914][54818] Updated weights for policy 0, policy_version 465028 (0.0029) [2024-04-27 19:56:54,253][54587] Fps is (10 sec: 58981.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7619035136. Throughput: 0: 55551.0. Samples: 524264640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:56:54,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 19:56:57,300][54818] Updated weights for policy 0, policy_version 465038 (0.0028) [2024-04-27 19:56:59,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7619297280. Throughput: 0: 55424.5. Samples: 524430600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:56:59,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 19:56:59,999][54818] Updated weights for policy 0, policy_version 465048 (0.0028) [2024-04-27 19:57:02,966][54818] Updated weights for policy 0, policy_version 465058 (0.0029) [2024-04-27 19:57:04,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7619575808. Throughput: 0: 55606.7. Samples: 524769380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:04,254][54587] Avg episode reward: [(0, '0.681')] [2024-04-27 19:57:05,840][54818] Updated weights for policy 0, policy_version 465068 (0.0026) [2024-04-27 19:57:08,992][54818] Updated weights for policy 0, policy_version 465078 (0.0042) [2024-04-27 19:57:09,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7619854336. Throughput: 0: 55737.8. Samples: 525106140. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:09,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 19:57:11,611][54818] Updated weights for policy 0, policy_version 465088 (0.0027) [2024-04-27 19:57:14,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7620116480. Throughput: 0: 55597.8. Samples: 525265360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:14,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 19:57:14,693][54818] Updated weights for policy 0, policy_version 465098 (0.0026) [2024-04-27 19:57:17,287][54818] Updated weights for policy 0, policy_version 465108 (0.0025) [2024-04-27 19:57:19,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7620411392. Throughput: 0: 55482.7. Samples: 525599160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:19,254][54587] Avg episode reward: [(0, '0.459')] [2024-04-27 19:57:20,628][54818] Updated weights for policy 0, policy_version 465118 (0.0029) [2024-04-27 19:57:23,192][54818] Updated weights for policy 0, policy_version 465128 (0.0030) [2024-04-27 19:57:24,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7620706304. Throughput: 0: 55615.5. Samples: 525937820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:24,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 19:57:26,583][54818] Updated weights for policy 0, policy_version 465138 (0.0026) [2024-04-27 19:57:28,967][54798] Signal inference workers to stop experience collection... (7550 times) [2024-04-27 19:57:28,967][54798] Signal inference workers to resume experience collection... 
(7550 times) [2024-04-27 19:57:28,977][54818] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-04-27 19:57:28,997][54818] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-04-27 19:57:29,088][54818] Updated weights for policy 0, policy_version 465148 (0.0037) [2024-04-27 19:57:29,254][54587] Fps is (10 sec: 57343.0, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 7620984832. Throughput: 0: 55902.8. Samples: 526112620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:29,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 19:57:32,491][54818] Updated weights for policy 0, policy_version 465158 (0.0028) [2024-04-27 19:57:34,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7621246976. Throughput: 0: 55855.5. Samples: 526442800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:34,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 19:57:34,856][54818] Updated weights for policy 0, policy_version 465168 (0.0031) [2024-04-27 19:57:38,205][54818] Updated weights for policy 0, policy_version 465178 (0.0033) [2024-04-27 19:57:39,253][54587] Fps is (10 sec: 54068.2, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7621525504. Throughput: 0: 55924.1. Samples: 526781220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:39,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 19:57:40,810][54818] Updated weights for policy 0, policy_version 465188 (0.0026) [2024-04-27 19:57:44,178][54818] Updated weights for policy 0, policy_version 465198 (0.0030) [2024-04-27 19:57:44,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 7621804032. Throughput: 0: 55958.5. Samples: 526948740. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:44,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 19:57:46,758][54818] Updated weights for policy 0, policy_version 465208 (0.0030) [2024-04-27 19:57:49,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 7622066176. Throughput: 0: 55878.1. Samples: 527283900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:49,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 19:57:49,261][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000465214_7622066176.pth... [2024-04-27 19:57:49,325][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000464400_7608729600.pth [2024-04-27 19:57:49,922][54818] Updated weights for policy 0, policy_version 465218 (0.0029) [2024-04-27 19:57:52,563][54818] Updated weights for policy 0, policy_version 465228 (0.0033) [2024-04-27 19:57:54,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7622361088. Throughput: 0: 55901.9. Samples: 527621720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:54,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 19:57:55,743][54818] Updated weights for policy 0, policy_version 465238 (0.0031) [2024-04-27 19:57:58,365][54818] Updated weights for policy 0, policy_version 465248 (0.0027) [2024-04-27 19:57:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7622656000. Throughput: 0: 55937.7. Samples: 527782560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:57:59,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 19:58:01,618][54818] Updated weights for policy 0, policy_version 465258 (0.0027) [2024-04-27 19:58:04,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7622934528. Throughput: 0: 55936.9. Samples: 528116320. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:58:04,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 19:58:04,342][54818] Updated weights for policy 0, policy_version 465268 (0.0031) [2024-04-27 19:58:07,492][54818] Updated weights for policy 0, policy_version 465278 (0.0036) [2024-04-27 19:58:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7623213056. Throughput: 0: 55864.5. Samples: 528451720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-27 19:58:09,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 19:58:10,169][54818] Updated weights for policy 0, policy_version 465288 (0.0026) [2024-04-27 19:58:13,315][54818] Updated weights for policy 0, policy_version 465298 (0.0031) [2024-04-27 19:58:14,253][54587] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 7623491584. Throughput: 0: 55773.7. Samples: 528622420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:14,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 19:58:16,052][54818] Updated weights for policy 0, policy_version 465308 (0.0029) [2024-04-27 19:58:19,211][54818] Updated weights for policy 0, policy_version 465318 (0.0034) [2024-04-27 19:58:19,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 7623770112. Throughput: 0: 55949.8. Samples: 528960540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:19,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 19:58:21,956][54818] Updated weights for policy 0, policy_version 465328 (0.0029) [2024-04-27 19:58:24,253][54587] Fps is (10 sec: 50790.1, 60 sec: 54886.5, 300 sec: 55483.4). Total num frames: 7623999488. Throughput: 0: 55896.9. Samples: 529296580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:24,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 19:58:24,452][54798] Signal inference workers to stop experience collection... 
(7600 times) [2024-04-27 19:58:24,452][54798] Signal inference workers to resume experience collection... (7600 times) [2024-04-27 19:58:24,473][54818] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-04-27 19:58:24,473][54818] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-04-27 19:58:25,094][54818] Updated weights for policy 0, policy_version 465338 (0.0026) [2024-04-27 19:58:27,894][54818] Updated weights for policy 0, policy_version 465348 (0.0024) [2024-04-27 19:58:29,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7624310784. Throughput: 0: 55713.8. Samples: 529455860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:29,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 19:58:30,926][54818] Updated weights for policy 0, policy_version 465358 (0.0029) [2024-04-27 19:58:33,721][54818] Updated weights for policy 0, policy_version 465368 (0.0027) [2024-04-27 19:58:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7624605696. Throughput: 0: 55786.8. Samples: 529794300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:34,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 19:58:36,803][54818] Updated weights for policy 0, policy_version 465378 (0.0028) [2024-04-27 19:58:39,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7624884224. Throughput: 0: 55693.2. Samples: 530127920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:39,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 19:58:39,577][54818] Updated weights for policy 0, policy_version 465388 (0.0028) [2024-04-27 19:58:42,738][54818] Updated weights for policy 0, policy_version 465398 (0.0032) [2024-04-27 19:58:44,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 7625162752. Throughput: 0: 56046.8. Samples: 530304660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:44,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 19:58:45,376][54818] Updated weights for policy 0, policy_version 465408 (0.0027) [2024-04-27 19:58:48,634][54818] Updated weights for policy 0, policy_version 465418 (0.0030) [2024-04-27 19:58:49,253][54587] Fps is (10 sec: 55706.2, 60 sec: 56251.9, 300 sec: 55650.1). Total num frames: 7625441280. Throughput: 0: 56005.4. Samples: 530636560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:49,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 19:58:51,246][54818] Updated weights for policy 0, policy_version 465428 (0.0031) [2024-04-27 19:58:54,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7625719808. Throughput: 0: 55991.2. Samples: 530971320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:54,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 19:58:54,374][54818] Updated weights for policy 0, policy_version 465438 (0.0027) [2024-04-27 19:58:57,183][54818] Updated weights for policy 0, policy_version 465448 (0.0027) [2024-04-27 19:58:59,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 7625965568. Throughput: 0: 55761.3. Samples: 531131680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:58:59,262][54587] Avg episode reward: [(0, '0.488')] [2024-04-27 19:59:00,239][54818] Updated weights for policy 0, policy_version 465458 (0.0030) [2024-04-27 19:59:03,082][54818] Updated weights for policy 0, policy_version 465468 (0.0031) [2024-04-27 19:59:04,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7626260480. Throughput: 0: 55665.3. Samples: 531465480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:59:04,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 19:59:06,240][54818] Updated weights for policy 0, policy_version 465478 (0.0024) [2024-04-27 19:59:08,866][54818] Updated weights for policy 0, policy_version 465488 (0.0037) [2024-04-27 19:59:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7626571776. Throughput: 0: 55520.0. Samples: 531794980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:59:09,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 19:59:12,020][54818] Updated weights for policy 0, policy_version 465498 (0.0034) [2024-04-27 19:59:14,253][54587] Fps is (10 sec: 58982.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 7626850304. Throughput: 0: 55878.4. Samples: 531970380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:59:14,253][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 19:59:14,689][54818] Updated weights for policy 0, policy_version 465508 (0.0035) [2024-04-27 19:59:17,851][54818] Updated weights for policy 0, policy_version 465518 (0.0029) [2024-04-27 19:59:18,467][54798] Signal inference workers to stop experience collection... (7650 times) [2024-04-27 19:59:18,467][54798] Signal inference workers to resume experience collection... (7650 times) [2024-04-27 19:59:18,491][54818] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-04-27 19:59:18,491][54818] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-04-27 19:59:19,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 7627112448. Throughput: 0: 55791.6. Samples: 532304920. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:59:19,253][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 19:59:20,476][54818] Updated weights for policy 0, policy_version 465528 (0.0030) [2024-04-27 19:59:23,923][54818] Updated weights for policy 0, policy_version 465538 (0.0031) [2024-04-27 19:59:24,253][54587] Fps is (10 sec: 54067.0, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 7627390976. Throughput: 0: 55836.5. Samples: 532640560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:59:24,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 19:59:26,378][54818] Updated weights for policy 0, policy_version 465548 (0.0026) [2024-04-27 19:59:29,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 7627669504. Throughput: 0: 55407.1. Samples: 532797980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:59:29,253][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 19:59:29,824][54818] Updated weights for policy 0, policy_version 465558 (0.0025) [2024-04-27 19:59:32,049][54818] Updated weights for policy 0, policy_version 465568 (0.0027) [2024-04-27 19:59:34,253][54587] Fps is (10 sec: 52428.1, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7627915264. Throughput: 0: 55474.5. Samples: 533132920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 19:59:34,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 19:59:35,813][54818] Updated weights for policy 0, policy_version 465578 (0.0026) [2024-04-27 19:59:38,024][54818] Updated weights for policy 0, policy_version 465588 (0.0031) [2024-04-27 19:59:39,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7628210176. Throughput: 0: 55459.6. Samples: 533467000. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 19:59:39,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 19:59:41,713][54818] Updated weights for policy 0, policy_version 465598 (0.0025) [2024-04-27 19:59:44,011][54818] Updated weights for policy 0, policy_version 465608 (0.0032) [2024-04-27 19:59:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7628521472. Throughput: 0: 55833.6. Samples: 533644200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 19:59:44,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 19:59:47,481][54818] Updated weights for policy 0, policy_version 465618 (0.0033) [2024-04-27 19:59:49,253][54587] Fps is (10 sec: 60620.3, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 7628816384. Throughput: 0: 55836.4. Samples: 533978120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 19:59:49,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 19:59:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000465626_7628816384.pth... [2024-04-27 19:59:49,311][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000464809_7615430656.pth [2024-04-27 19:59:49,765][54818] Updated weights for policy 0, policy_version 465628 (0.0024) [2024-04-27 19:59:53,387][54818] Updated weights for policy 0, policy_version 465638 (0.0030) [2024-04-27 19:59:54,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7629062144. Throughput: 0: 55892.7. Samples: 534310160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 19:59:54,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 19:59:55,614][54818] Updated weights for policy 0, policy_version 465648 (0.0034) [2024-04-27 19:59:59,253][54587] Fps is (10 sec: 50790.8, 60 sec: 55978.7, 300 sec: 55594.6). Total num frames: 7629324288. Throughput: 0: 55776.0. Samples: 534480300. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 19:59:59,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 19:59:59,323][54818] Updated weights for policy 0, policy_version 465658 (0.0030) [2024-04-27 20:00:01,530][54818] Updated weights for policy 0, policy_version 465668 (0.0026) [2024-04-27 20:00:04,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7629602816. Throughput: 0: 55687.5. Samples: 534810860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:04,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 20:00:05,229][54818] Updated weights for policy 0, policy_version 465678 (0.0025) [2024-04-27 20:00:07,473][54818] Updated weights for policy 0, policy_version 465688 (0.0028) [2024-04-27 20:00:09,253][54587] Fps is (10 sec: 54067.0, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 7629864960. Throughput: 0: 55689.3. Samples: 535146580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:09,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 20:00:11,303][54818] Updated weights for policy 0, policy_version 465698 (0.0027) [2024-04-27 20:00:13,230][54818] Updated weights for policy 0, policy_version 465708 (0.0028) [2024-04-27 20:00:14,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 7630159872. Throughput: 0: 55791.3. Samples: 535308600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:14,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 20:00:17,161][54818] Updated weights for policy 0, policy_version 465718 (0.0028) [2024-04-27 20:00:19,148][54818] Updated weights for policy 0, policy_version 465728 (0.0033) [2024-04-27 20:00:19,253][54587] Fps is (10 sec: 62257.8, 60 sec: 56251.5, 300 sec: 55872.2). Total num frames: 7630487552. Throughput: 0: 55779.9. Samples: 535643020. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:19,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 20:00:22,937][54818] Updated weights for policy 0, policy_version 465738 (0.0026) [2024-04-27 20:00:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7630749696. Throughput: 0: 55755.9. Samples: 535976020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:24,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 20:00:25,918][54818] Updated weights for policy 0, policy_version 465748 (0.0029) [2024-04-27 20:00:28,615][54798] Signal inference workers to stop experience collection... (7700 times) [2024-04-27 20:00:28,616][54798] Signal inference workers to resume experience collection... (7700 times) [2024-04-27 20:00:28,643][54818] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-04-27 20:00:28,643][54818] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-04-27 20:00:28,728][54818] Updated weights for policy 0, policy_version 465758 (0.0028) [2024-04-27 20:00:29,253][54587] Fps is (10 sec: 52429.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7631011840. Throughput: 0: 55562.8. Samples: 536144520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:29,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 20:00:31,710][54818] Updated weights for policy 0, policy_version 465768 (0.0034) [2024-04-27 20:00:34,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7631273984. Throughput: 0: 55624.1. Samples: 536481200. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:34,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 20:00:34,617][54818] Updated weights for policy 0, policy_version 465778 (0.0026) [2024-04-27 20:00:37,684][54818] Updated weights for policy 0, policy_version 465788 (0.0026) [2024-04-27 20:00:39,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7631552512. Throughput: 0: 55525.0. Samples: 536808780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:39,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 20:00:40,529][54818] Updated weights for policy 0, policy_version 465798 (0.0028) [2024-04-27 20:00:43,643][54818] Updated weights for policy 0, policy_version 465808 (0.0024) [2024-04-27 20:00:44,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 7631831040. Throughput: 0: 55447.5. Samples: 536975440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:44,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 20:00:46,386][54818] Updated weights for policy 0, policy_version 465818 (0.0038) [2024-04-27 20:00:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 7632109568. Throughput: 0: 55491.1. Samples: 537307960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:49,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 20:00:49,365][54818] Updated weights for policy 0, policy_version 465828 (0.0032) [2024-04-27 20:00:52,114][54818] Updated weights for policy 0, policy_version 465838 (0.0027) [2024-04-27 20:00:54,253][54587] Fps is (10 sec: 58981.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7632420864. Throughput: 0: 55438.1. Samples: 537641300. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:54,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 20:00:55,090][54818] Updated weights for policy 0, policy_version 465848 (0.0025) [2024-04-27 20:00:58,157][54818] Updated weights for policy 0, policy_version 465858 (0.0030) [2024-04-27 20:00:59,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7632683008. Throughput: 0: 55670.9. Samples: 537813780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-04-27 20:00:59,253][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 20:01:00,957][54818] Updated weights for policy 0, policy_version 465868 (0.0026) [2024-04-27 20:01:03,896][54818] Updated weights for policy 0, policy_version 465878 (0.0027) [2024-04-27 20:01:04,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7632945152. Throughput: 0: 55666.1. Samples: 538147980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:04,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:01:06,899][54818] Updated weights for policy 0, policy_version 465888 (0.0029) [2024-04-27 20:01:09,253][54587] Fps is (10 sec: 54065.8, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 7633223680. Throughput: 0: 55792.3. Samples: 538486680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:09,254][54587] Avg episode reward: [(0, '0.445')] [2024-04-27 20:01:09,796][54818] Updated weights for policy 0, policy_version 465898 (0.0029) [2024-04-27 20:01:12,593][54818] Updated weights for policy 0, policy_version 465908 (0.0030) [2024-04-27 20:01:14,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7633518592. Throughput: 0: 55637.2. Samples: 538648200. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:14,254][54587] Avg episode reward: [(0, '0.502')] [2024-04-27 20:01:15,637][54818] Updated weights for policy 0, policy_version 465918 (0.0023) [2024-04-27 20:01:18,295][54818] Updated weights for policy 0, policy_version 465928 (0.0027) [2024-04-27 20:01:19,254][54587] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 7633797120. Throughput: 0: 55747.6. Samples: 538989860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:19,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 20:01:21,460][54818] Updated weights for policy 0, policy_version 465938 (0.0027) [2024-04-27 20:01:24,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7634075648. Throughput: 0: 55910.7. Samples: 539324760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:24,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 20:01:24,508][54818] Updated weights for policy 0, policy_version 465948 (0.0028) [2024-04-27 20:01:27,464][54818] Updated weights for policy 0, policy_version 465958 (0.0030) [2024-04-27 20:01:28,589][54798] Signal inference workers to stop experience collection... (7750 times) [2024-04-27 20:01:28,589][54798] Signal inference workers to resume experience collection... (7750 times) [2024-04-27 20:01:28,622][54818] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-04-27 20:01:28,622][54818] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-04-27 20:01:29,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 7634354176. Throughput: 0: 55908.3. Samples: 539491320. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:29,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 20:01:30,376][54818] Updated weights for policy 0, policy_version 465968 (0.0029) [2024-04-27 20:01:33,283][54818] Updated weights for policy 0, policy_version 465978 (0.0028) [2024-04-27 20:01:34,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55761.2). Total num frames: 7634616320. Throughput: 0: 55827.5. Samples: 539820200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:34,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 20:01:36,068][54818] Updated weights for policy 0, policy_version 465988 (0.0030) [2024-04-27 20:01:39,021][54818] Updated weights for policy 0, policy_version 465998 (0.0037) [2024-04-27 20:01:39,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7634911232. Throughput: 0: 55902.7. Samples: 540156920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:39,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 20:01:41,988][54818] Updated weights for policy 0, policy_version 466008 (0.0032) [2024-04-27 20:01:44,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7635173376. Throughput: 0: 55656.3. Samples: 540318320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:44,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 20:01:44,940][54818] Updated weights for policy 0, policy_version 466018 (0.0037) [2024-04-27 20:01:47,821][54818] Updated weights for policy 0, policy_version 466028 (0.0030) [2024-04-27 20:01:49,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7635468288. Throughput: 0: 55741.3. Samples: 540656340. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 20:01:49,348][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000466033_7635484672.pth... [2024-04-27 20:01:49,394][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000465214_7622066176.pth [2024-04-27 20:01:50,829][54818] Updated weights for policy 0, policy_version 466038 (0.0027) [2024-04-27 20:01:53,689][54818] Updated weights for policy 0, policy_version 466048 (0.0029) [2024-04-27 20:01:54,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 7635730432. Throughput: 0: 55578.7. Samples: 540987720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:54,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 20:01:56,625][54818] Updated weights for policy 0, policy_version 466058 (0.0031) [2024-04-27 20:01:59,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7636025344. Throughput: 0: 55632.2. Samples: 541151640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:01:59,253][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 20:01:59,608][54818] Updated weights for policy 0, policy_version 466068 (0.0028) [2024-04-27 20:02:02,660][54818] Updated weights for policy 0, policy_version 466078 (0.0039) [2024-04-27 20:02:04,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7636303872. Throughput: 0: 55552.2. Samples: 541489700. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:02:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 20:02:05,728][54818] Updated weights for policy 0, policy_version 466088 (0.0026) [2024-04-27 20:02:08,592][54818] Updated weights for policy 0, policy_version 466098 (0.0026) [2024-04-27 20:02:09,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.8, 300 sec: 55761.2). 
Total num frames: 7636566016. Throughput: 0: 55595.2. Samples: 541826540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:02:09,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 20:02:11,879][54818] Updated weights for policy 0, policy_version 466108 (0.0032) [2024-04-27 20:02:14,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 7636860928. Throughput: 0: 55643.7. Samples: 541995280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:02:14,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 20:02:14,718][54818] Updated weights for policy 0, policy_version 466118 (0.0032) [2024-04-27 20:02:17,722][54818] Updated weights for policy 0, policy_version 466128 (0.0028) [2024-04-27 20:02:19,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7637123072. Throughput: 0: 55771.0. Samples: 542329900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:02:19,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 20:02:20,446][54818] Updated weights for policy 0, policy_version 466138 (0.0028) [2024-04-27 20:02:23,566][54818] Updated weights for policy 0, policy_version 466148 (0.0037) [2024-04-27 20:02:24,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7637417984. Throughput: 0: 55596.5. Samples: 542658760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:02:24,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 20:02:26,434][54818] Updated weights for policy 0, policy_version 466158 (0.0026) [2024-04-27 20:02:29,254][54587] Fps is (10 sec: 55704.7, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7637680128. Throughput: 0: 55742.8. Samples: 542826760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:02:29,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 20:02:29,433][54818] Updated weights for policy 0, policy_version 466168 (0.0034) [2024-04-27 20:02:32,533][54818] Updated weights for policy 0, policy_version 466178 (0.0026) [2024-04-27 20:02:32,790][54798] Signal inference workers to stop experience collection... (7800 times) [2024-04-27 20:02:32,795][54798] Signal inference workers to resume experience collection... (7800 times) [2024-04-27 20:02:32,812][54818] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-04-27 20:02:32,812][54818] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-04-27 20:02:34,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7637958656. Throughput: 0: 55735.2. Samples: 543164420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:02:34,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 20:02:35,292][54818] Updated weights for policy 0, policy_version 466188 (0.0033) [2024-04-27 20:02:38,293][54818] Updated weights for policy 0, policy_version 466198 (0.0029) [2024-04-27 20:02:39,253][54587] Fps is (10 sec: 55707.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7638237184. Throughput: 0: 55798.0. Samples: 543498620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:02:39,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:02:41,038][54818] Updated weights for policy 0, policy_version 466208 (0.0038) [2024-04-27 20:02:44,103][54818] Updated weights for policy 0, policy_version 466218 (0.0028) [2024-04-27 20:02:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 7638515712. Throughput: 0: 55672.9. Samples: 543656920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:02:44,253][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 20:02:46,846][54818] Updated weights for policy 0, policy_version 466228 (0.0027) [2024-04-27 20:02:49,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7638810624. Throughput: 0: 55710.2. Samples: 543996660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:02:49,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 20:02:49,891][54818] Updated weights for policy 0, policy_version 466238 (0.0033) [2024-04-27 20:02:52,758][54818] Updated weights for policy 0, policy_version 466248 (0.0030) [2024-04-27 20:02:54,253][54587] Fps is (10 sec: 58982.4, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 7639105536. Throughput: 0: 55764.0. Samples: 544335920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:02:54,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 20:02:55,774][54818] Updated weights for policy 0, policy_version 466258 (0.0030) [2024-04-27 20:02:58,620][54818] Updated weights for policy 0, policy_version 466268 (0.0029) [2024-04-27 20:02:59,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7639384064. Throughput: 0: 55772.5. Samples: 544505040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:02:59,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 20:03:01,600][54818] Updated weights for policy 0, policy_version 466278 (0.0025) [2024-04-27 20:03:04,253][54587] Fps is (10 sec: 54066.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7639646208. Throughput: 0: 55801.7. Samples: 544840980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:03:04,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:03:04,377][54818] Updated weights for policy 0, policy_version 466288 (0.0026) [2024-04-27 20:03:07,434][54818] Updated weights for policy 0, policy_version 466298 (0.0029) [2024-04-27 20:03:09,253][54587] Fps is (10 sec: 52428.4, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7639908352. Throughput: 0: 55844.8. Samples: 545171780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:03:09,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 20:03:10,211][54818] Updated weights for policy 0, policy_version 466308 (0.0031) [2024-04-27 20:03:13,420][54818] Updated weights for policy 0, policy_version 466318 (0.0032) [2024-04-27 20:03:14,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7640203264. Throughput: 0: 55753.7. Samples: 545335660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:03:14,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 20:03:16,077][54818] Updated weights for policy 0, policy_version 466328 (0.0032) [2024-04-27 20:03:19,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7640465408. Throughput: 0: 55762.1. Samples: 545673720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:03:19,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 20:03:19,317][54818] Updated weights for policy 0, policy_version 466338 (0.0028) [2024-04-27 20:03:21,931][54818] Updated weights for policy 0, policy_version 466348 (0.0033) [2024-04-27 20:03:24,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7640760320. Throughput: 0: 55789.2. Samples: 546009140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:03:24,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 20:03:25,066][54818] Updated weights for policy 0, policy_version 466358 (0.0027) [2024-04-27 20:03:27,768][54818] Updated weights for policy 0, policy_version 466368 (0.0027) [2024-04-27 20:03:29,253][54587] Fps is (10 sec: 58982.8, 60 sec: 56252.0, 300 sec: 55761.1). Total num frames: 7641055232. Throughput: 0: 56178.1. Samples: 546184940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:03:29,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 20:03:30,777][54818] Updated weights for policy 0, policy_version 466378 (0.0028) [2024-04-27 20:03:33,538][54818] Updated weights for policy 0, policy_version 466388 (0.0026) [2024-04-27 20:03:34,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7641317376. Throughput: 0: 56086.2. Samples: 546520540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:03:34,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:03:36,612][54818] Updated weights for policy 0, policy_version 466398 (0.0029) [2024-04-27 20:03:39,253][54587] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7641612288. Throughput: 0: 56128.9. Samples: 546861720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:03:39,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 20:03:39,285][54818] Updated weights for policy 0, policy_version 466408 (0.0023) [2024-04-27 20:03:41,697][54798] Signal inference workers to stop experience collection... (7850 times) [2024-04-27 20:03:41,723][54818] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-04-27 20:03:41,751][54798] Signal inference workers to resume experience collection... 
(7850 times) [2024-04-27 20:03:41,756][54818] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-04-27 20:03:42,512][54818] Updated weights for policy 0, policy_version 466418 (0.0026) [2024-04-27 20:03:44,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7641874432. Throughput: 0: 56047.1. Samples: 547027160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-27 20:03:44,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 20:03:45,186][54818] Updated weights for policy 0, policy_version 466428 (0.0027) [2024-04-27 20:03:48,365][54818] Updated weights for policy 0, policy_version 466438 (0.0031) [2024-04-27 20:03:49,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7642169344. Throughput: 0: 56011.2. Samples: 547361480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:03:49,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:03:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000466441_7642169344.pth... [2024-04-27 20:03:49,324][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000465626_7628816384.pth [2024-04-27 20:03:51,077][54818] Updated weights for policy 0, policy_version 466448 (0.0025) [2024-04-27 20:03:54,086][54818] Updated weights for policy 0, policy_version 466458 (0.0028) [2024-04-27 20:03:54,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 7642447872. Throughput: 0: 56150.8. Samples: 547698560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:03:54,253][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 20:03:56,936][54818] Updated weights for policy 0, policy_version 466468 (0.0032) [2024-04-27 20:03:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 7642726400. Throughput: 0: 56218.1. Samples: 547865480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:03:59,262][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 20:04:00,067][54818] Updated weights for policy 0, policy_version 466478 (0.0031) [2024-04-27 20:04:02,866][54818] Updated weights for policy 0, policy_version 466488 (0.0030) [2024-04-27 20:04:04,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7642988544. Throughput: 0: 56080.5. Samples: 548197340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:04,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 20:04:05,919][54818] Updated weights for policy 0, policy_version 466498 (0.0028) [2024-04-27 20:04:08,545][54818] Updated weights for policy 0, policy_version 466508 (0.0027) [2024-04-27 20:04:09,253][54587] Fps is (10 sec: 55706.4, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 7643283456. Throughput: 0: 56039.3. Samples: 548530900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:09,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 20:04:11,655][54818] Updated weights for policy 0, policy_version 466518 (0.0027) [2024-04-27 20:04:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7643561984. Throughput: 0: 55897.9. Samples: 548700340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:14,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 20:04:14,468][54818] Updated weights for policy 0, policy_version 466528 (0.0030) [2024-04-27 20:04:17,496][54818] Updated weights for policy 0, policy_version 466538 (0.0030) [2024-04-27 20:04:19,253][54587] Fps is (10 sec: 54066.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7643824128. Throughput: 0: 55926.6. Samples: 549037240. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:19,263][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 20:04:20,439][54818] Updated weights for policy 0, policy_version 466548 (0.0030) [2024-04-27 20:04:23,256][54818] Updated weights for policy 0, policy_version 466558 (0.0025) [2024-04-27 20:04:24,253][54587] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 7644135424. Throughput: 0: 55739.5. Samples: 549370000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:04:26,204][54818] Updated weights for policy 0, policy_version 466568 (0.0027) [2024-04-27 20:04:29,222][54818] Updated weights for policy 0, policy_version 466578 (0.0029) [2024-04-27 20:04:29,253][54587] Fps is (10 sec: 58983.0, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 7644413952. Throughput: 0: 55661.3. Samples: 549531920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:29,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 20:04:32,026][54818] Updated weights for policy 0, policy_version 466588 (0.0030) [2024-04-27 20:04:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7644676096. Throughput: 0: 55778.3. Samples: 549871500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:34,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 20:04:35,023][54818] Updated weights for policy 0, policy_version 466598 (0.0033) [2024-04-27 20:04:37,438][54798] Signal inference workers to stop experience collection... (7900 times) [2024-04-27 20:04:37,438][54798] Signal inference workers to resume experience collection... 
(7900 times) [2024-04-27 20:04:37,463][54818] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-04-27 20:04:37,463][54818] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-04-27 20:04:38,011][54818] Updated weights for policy 0, policy_version 466608 (0.0035) [2024-04-27 20:04:39,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7644954624. Throughput: 0: 55780.4. Samples: 550208680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:39,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 20:04:41,054][54818] Updated weights for policy 0, policy_version 466618 (0.0027) [2024-04-27 20:04:43,873][54818] Updated weights for policy 0, policy_version 466628 (0.0027) [2024-04-27 20:04:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7645249536. Throughput: 0: 55689.8. Samples: 550371520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:44,254][54587] Avg episode reward: [(0, '0.493')] [2024-04-27 20:04:47,067][54818] Updated weights for policy 0, policy_version 466638 (0.0030) [2024-04-27 20:04:49,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7645528064. Throughput: 0: 55800.4. Samples: 550708360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:49,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:04:49,661][54818] Updated weights for policy 0, policy_version 466648 (0.0035) [2024-04-27 20:04:52,862][54818] Updated weights for policy 0, policy_version 466658 (0.0031) [2024-04-27 20:04:54,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7645773824. Throughput: 0: 55727.1. Samples: 551038620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:54,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 20:04:55,648][54818] Updated weights for policy 0, policy_version 466668 (0.0028) [2024-04-27 20:04:58,844][54818] Updated weights for policy 0, policy_version 466678 (0.0024) [2024-04-27 20:04:59,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 7646068736. Throughput: 0: 55693.5. Samples: 551206560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:04:59,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 20:05:01,554][54818] Updated weights for policy 0, policy_version 466688 (0.0028) [2024-04-27 20:05:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7646347264. Throughput: 0: 55745.0. Samples: 551545760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:05:04,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 20:05:04,529][54818] Updated weights for policy 0, policy_version 466698 (0.0027) [2024-04-27 20:05:07,432][54818] Updated weights for policy 0, policy_version 466708 (0.0026) [2024-04-27 20:05:09,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7646642176. Throughput: 0: 55743.1. Samples: 551878440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-27 20:05:09,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:05:10,459][54818] Updated weights for policy 0, policy_version 466718 (0.0043) [2024-04-27 20:05:13,327][54818] Updated weights for policy 0, policy_version 466728 (0.0035) [2024-04-27 20:05:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7646904320. Throughput: 0: 55908.8. Samples: 552047820. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:14,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 20:05:16,337][54818] Updated weights for policy 0, policy_version 466738 (0.0025) [2024-04-27 20:05:19,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7647182848. Throughput: 0: 55809.4. Samples: 552382920. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:19,253][54587] Avg episode reward: [(0, '0.702')] [2024-04-27 20:05:19,331][54818] Updated weights for policy 0, policy_version 466748 (0.0027) [2024-04-27 20:05:22,102][54818] Updated weights for policy 0, policy_version 466758 (0.0029) [2024-04-27 20:05:24,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7647477760. Throughput: 0: 55727.0. Samples: 552716400. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:24,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 20:05:25,285][54818] Updated weights for policy 0, policy_version 466768 (0.0024) [2024-04-27 20:05:27,861][54818] Updated weights for policy 0, policy_version 466778 (0.0028) [2024-04-27 20:05:29,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 7647723520. Throughput: 0: 55696.6. Samples: 552877860. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:29,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 20:05:31,067][54818] Updated weights for policy 0, policy_version 466788 (0.0029) [2024-04-27 20:05:33,977][54818] Updated weights for policy 0, policy_version 466798 (0.0031) [2024-04-27 20:05:34,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7648034816. Throughput: 0: 55611.1. Samples: 553210860. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:34,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:05:36,961][54818] Updated weights for policy 0, policy_version 466808 (0.0035) [2024-04-27 20:05:37,745][54798] Signal inference workers to stop experience collection... (7950 times) [2024-04-27 20:05:37,745][54798] Signal inference workers to resume experience collection... (7950 times) [2024-04-27 20:05:37,758][54818] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-04-27 20:05:37,759][54818] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-04-27 20:05:39,253][54587] Fps is (10 sec: 58981.7, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 7648313344. Throughput: 0: 55700.3. Samples: 553545140. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 20:05:39,782][54818] Updated weights for policy 0, policy_version 466818 (0.0026) [2024-04-27 20:05:42,927][54818] Updated weights for policy 0, policy_version 466828 (0.0026) [2024-04-27 20:05:44,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 7648591872. Throughput: 0: 55762.3. Samples: 553715860. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:44,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 20:05:45,546][54818] Updated weights for policy 0, policy_version 466838 (0.0029) [2024-04-27 20:05:48,798][54818] Updated weights for policy 0, policy_version 466848 (0.0026) [2024-04-27 20:05:49,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 7648870400. Throughput: 0: 55701.9. Samples: 554052340. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:49,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 20:05:49,333][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000466851_7648886784.pth... 
[2024-04-27 20:05:49,380][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000466033_7635484672.pth [2024-04-27 20:05:51,404][54818] Updated weights for policy 0, policy_version 466858 (0.0030) [2024-04-27 20:05:54,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7649132544. Throughput: 0: 55798.2. Samples: 554389360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:54,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 20:05:54,576][54818] Updated weights for policy 0, policy_version 466868 (0.0025) [2024-04-27 20:05:57,617][54818] Updated weights for policy 0, policy_version 466878 (0.0028) [2024-04-27 20:05:59,253][54587] Fps is (10 sec: 55704.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 7649427456. Throughput: 0: 55612.8. Samples: 554550400. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:05:59,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 20:06:00,650][54818] Updated weights for policy 0, policy_version 466888 (0.0025) [2024-04-27 20:06:03,347][54818] Updated weights for policy 0, policy_version 466898 (0.0032) [2024-04-27 20:06:04,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 7649689600. Throughput: 0: 55610.1. Samples: 554885380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:06:04,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 20:06:06,532][54818] Updated weights for policy 0, policy_version 466908 (0.0032) [2024-04-27 20:06:09,251][54818] Updated weights for policy 0, policy_version 466918 (0.0029) [2024-04-27 20:06:09,253][54587] Fps is (10 sec: 55706.8, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 7649984512. Throughput: 0: 55661.0. Samples: 555221140. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:06:09,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:06:12,214][54818] Updated weights for policy 0, policy_version 466928 (0.0028) [2024-04-27 20:06:14,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7650263040. Throughput: 0: 56004.8. Samples: 555398080. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:06:14,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:06:15,143][54818] Updated weights for policy 0, policy_version 466938 (0.0029) [2024-04-27 20:06:18,041][54818] Updated weights for policy 0, policy_version 466948 (0.0034) [2024-04-27 20:06:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 7650557952. Throughput: 0: 56104.6. Samples: 555735560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:06:19,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 20:06:21,323][54818] Updated weights for policy 0, policy_version 466958 (0.0031) [2024-04-27 20:06:24,023][54818] Updated weights for policy 0, policy_version 466968 (0.0025) [2024-04-27 20:06:24,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 7650820096. Throughput: 0: 56051.8. Samples: 556067460. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:06:24,253][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 20:06:27,263][54818] Updated weights for policy 0, policy_version 466978 (0.0025) [2024-04-27 20:06:29,253][54587] Fps is (10 sec: 52427.8, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 7651082240. Throughput: 0: 56007.1. Samples: 556236180. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:06:29,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 20:06:29,793][54818] Updated weights for policy 0, policy_version 466988 (0.0030) [2024-04-27 20:06:33,214][54818] Updated weights for policy 0, policy_version 466998 (0.0028) [2024-04-27 20:06:34,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7651377152. Throughput: 0: 56033.2. Samples: 556573840. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-27 20:06:34,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 20:06:35,593][54798] Signal inference workers to stop experience collection... (8000 times) [2024-04-27 20:06:35,611][54818] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-04-27 20:06:35,684][54798] Signal inference workers to resume experience collection... (8000 times) [2024-04-27 20:06:35,684][54818] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-04-27 20:06:35,686][54818] Updated weights for policy 0, policy_version 467008 (0.0030) [2024-04-27 20:06:38,898][54818] Updated weights for policy 0, policy_version 467018 (0.0033) [2024-04-27 20:06:39,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 7651639296. Throughput: 0: 56003.9. Samples: 556909540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:06:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 20:06:41,558][54818] Updated weights for policy 0, policy_version 467028 (0.0027) [2024-04-27 20:06:44,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 7651917824. Throughput: 0: 55923.4. Samples: 557066940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:06:44,253][54587] Avg episode reward: [(0, '0.506')] [2024-04-27 20:06:44,889][54818] Updated weights for policy 0, policy_version 467038 (0.0033) [2024-04-27 20:06:47,303][54818] Updated weights for policy 0, policy_version 467048 (0.0030) [2024-04-27 20:06:49,253][54587] Fps is (10 sec: 58982.9, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 7652229120. Throughput: 0: 55940.9. Samples: 557402720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:06:49,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 20:06:50,850][54818] Updated weights for policy 0, policy_version 467058 (0.0027) [2024-04-27 20:06:53,079][54818] Updated weights for policy 0, policy_version 467068 (0.0029) [2024-04-27 20:06:54,253][54587] Fps is (10 sec: 58981.8, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 7652507648. Throughput: 0: 55856.8. Samples: 557734700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:06:54,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 20:06:56,796][54818] Updated weights for policy 0, policy_version 467078 (0.0027) [2024-04-27 20:06:58,947][54818] Updated weights for policy 0, policy_version 467088 (0.0030) [2024-04-27 20:06:59,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 7652786176. Throughput: 0: 56037.0. Samples: 557919740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:06:59,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 20:07:02,716][54818] Updated weights for policy 0, policy_version 467098 (0.0030) [2024-04-27 20:07:04,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 7653031936. Throughput: 0: 55842.9. Samples: 558248500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:04,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 20:07:04,944][54818] Updated weights for policy 0, policy_version 467108 (0.0027) [2024-04-27 20:07:08,619][54818] Updated weights for policy 0, policy_version 467118 (0.0028) [2024-04-27 20:07:09,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7653310464. Throughput: 0: 55851.1. Samples: 558580760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:09,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 20:07:10,778][54818] Updated weights for policy 0, policy_version 467128 (0.0027) [2024-04-27 20:07:14,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 7653556224. Throughput: 0: 55570.0. Samples: 558736820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:14,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 20:07:14,464][54818] Updated weights for policy 0, policy_version 467138 (0.0029) [2024-04-27 20:07:16,718][54818] Updated weights for policy 0, policy_version 467148 (0.0034) [2024-04-27 20:07:19,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 7653867520. Throughput: 0: 55433.9. Samples: 559068360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:19,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 20:07:20,365][54818] Updated weights for policy 0, policy_version 467158 (0.0029) [2024-04-27 20:07:22,657][54818] Updated weights for policy 0, policy_version 467168 (0.0026) [2024-04-27 20:07:24,253][54587] Fps is (10 sec: 62259.3, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 7654178816. Throughput: 0: 55409.5. Samples: 559402960. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:24,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 20:07:26,081][54818] Updated weights for policy 0, policy_version 467178 (0.0031) [2024-04-27 20:07:27,720][54798] Signal inference workers to stop experience collection... (8050 times) [2024-04-27 20:07:27,720][54798] Signal inference workers to resume experience collection... (8050 times) [2024-04-27 20:07:27,732][54818] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-04-27 20:07:27,733][54818] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-04-27 20:07:28,315][54818] Updated weights for policy 0, policy_version 467188 (0.0027) [2024-04-27 20:07:29,253][54587] Fps is (10 sec: 58982.1, 60 sec: 56251.8, 300 sec: 55927.7). Total num frames: 7654457344. Throughput: 0: 55813.6. Samples: 559578560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:29,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 20:07:32,061][54818] Updated weights for policy 0, policy_version 467198 (0.0027) [2024-04-27 20:07:34,019][54818] Updated weights for policy 0, policy_version 467208 (0.0026) [2024-04-27 20:07:34,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55978.8, 300 sec: 55927.7). Total num frames: 7654735872. Throughput: 0: 55764.0. Samples: 559912100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:34,254][54587] Avg episode reward: [(0, '0.703')] [2024-04-27 20:07:37,787][54818] Updated weights for policy 0, policy_version 467218 (0.0026) [2024-04-27 20:07:39,253][54587] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 55927.7). Total num frames: 7655014400. Throughput: 0: 55920.4. Samples: 560251120. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:39,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 20:07:39,944][54818] Updated weights for policy 0, policy_version 467228 (0.0027) [2024-04-27 20:07:43,686][54818] Updated weights for policy 0, policy_version 467238 (0.0036) [2024-04-27 20:07:44,253][54587] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 7655292928. Throughput: 0: 55456.8. Samples: 560415300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:44,254][54587] Avg episode reward: [(0, '0.683')] [2024-04-27 20:07:45,888][54818] Updated weights for policy 0, policy_version 467248 (0.0024) [2024-04-27 20:07:49,253][54587] Fps is (10 sec: 50790.2, 60 sec: 54886.3, 300 sec: 55650.0). Total num frames: 7655522304. Throughput: 0: 55605.3. Samples: 560750740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:49,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:07:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000467256_7655522304.pth... [2024-04-27 20:07:49,323][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000466441_7642169344.pth [2024-04-27 20:07:49,617][54818] Updated weights for policy 0, policy_version 467258 (0.0025) [2024-04-27 20:07:51,626][54818] Updated weights for policy 0, policy_version 467268 (0.0028) [2024-04-27 20:07:54,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 7655817216. Throughput: 0: 55669.3. Samples: 561085880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:54,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 20:07:55,354][54818] Updated weights for policy 0, policy_version 467278 (0.0030) [2024-04-27 20:07:57,438][54818] Updated weights for policy 0, policy_version 467288 (0.0026) [2024-04-27 20:07:59,253][54587] Fps is (10 sec: 60621.9, 60 sec: 55705.7, 300 sec: 55872.3). 
Total num frames: 7656128512. Throughput: 0: 56055.7. Samples: 561259320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:07:59,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 20:08:01,203][54818] Updated weights for policy 0, policy_version 467298 (0.0030) [2024-04-27 20:08:03,375][54818] Updated weights for policy 0, policy_version 467308 (0.0025) [2024-04-27 20:08:04,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 7656390656. Throughput: 0: 55944.1. Samples: 561585840. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:04,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 20:08:06,963][54818] Updated weights for policy 0, policy_version 467318 (0.0036) [2024-04-27 20:08:09,183][54818] Updated weights for policy 0, policy_version 467328 (0.0037) [2024-04-27 20:08:09,253][54587] Fps is (10 sec: 57343.2, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 7656701952. Throughput: 0: 55979.9. Samples: 561922060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:09,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 20:08:12,376][54798] Signal inference workers to stop experience collection... (8100 times) [2024-04-27 20:08:12,403][54818] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-04-27 20:08:12,431][54798] Signal inference workers to resume experience collection... (8100 times) [2024-04-27 20:08:12,434][54818] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-04-27 20:08:12,998][54818] Updated weights for policy 0, policy_version 467338 (0.0032) [2024-04-27 20:08:14,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56797.9, 300 sec: 55927.8). Total num frames: 7656964096. Throughput: 0: 55837.9. Samples: 562091260. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:14,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 20:08:15,259][54818] Updated weights for policy 0, policy_version 467348 (0.0028) [2024-04-27 20:08:18,879][54818] Updated weights for policy 0, policy_version 467358 (0.0028) [2024-04-27 20:08:19,253][54587] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 7657242624. Throughput: 0: 55965.2. Samples: 562430540. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:19,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 20:08:21,089][54818] Updated weights for policy 0, policy_version 467368 (0.0036) [2024-04-27 20:08:24,253][54587] Fps is (10 sec: 50790.1, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 7657472000. Throughput: 0: 55899.6. Samples: 562766600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:24,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 20:08:24,641][54818] Updated weights for policy 0, policy_version 467378 (0.0033) [2024-04-27 20:08:27,036][54818] Updated weights for policy 0, policy_version 467388 (0.0026) [2024-04-27 20:08:29,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 7657766912. Throughput: 0: 55678.8. Samples: 562920840. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:29,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 20:08:30,529][54818] Updated weights for policy 0, policy_version 467398 (0.0032) [2024-04-27 20:08:32,886][54818] Updated weights for policy 0, policy_version 467408 (0.0027) [2024-04-27 20:08:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7658078208. Throughput: 0: 55541.0. Samples: 563250080. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:34,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 20:08:36,378][54818] Updated weights for policy 0, policy_version 467418 (0.0032) [2024-04-27 20:08:39,016][54818] Updated weights for policy 0, policy_version 467428 (0.0026) [2024-04-27 20:08:39,253][54587] Fps is (10 sec: 58981.7, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 7658356736. Throughput: 0: 55577.7. Samples: 563586880. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:39,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 20:08:42,148][54818] Updated weights for policy 0, policy_version 467438 (0.0028) [2024-04-27 20:08:44,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7658635264. Throughput: 0: 55641.5. Samples: 563763200. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:44,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 20:08:44,826][54818] Updated weights for policy 0, policy_version 467448 (0.0030) [2024-04-27 20:08:48,014][54818] Updated weights for policy 0, policy_version 467458 (0.0028) [2024-04-27 20:08:49,253][54587] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 7658913792. Throughput: 0: 55673.2. Samples: 564091140. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:49,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 20:08:50,769][54818] Updated weights for policy 0, policy_version 467468 (0.0027) [2024-04-27 20:08:53,928][54818] Updated weights for policy 0, policy_version 467478 (0.0027) [2024-04-27 20:08:54,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7659175936. Throughput: 0: 55633.4. Samples: 564425560. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:54,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 20:08:56,770][54818] Updated weights for policy 0, policy_version 467488 (0.0026) [2024-04-27 20:08:59,253][54587] Fps is (10 sec: 50790.9, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 7659421696. Throughput: 0: 55506.2. Samples: 564589040. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:08:59,253][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 20:08:59,720][54798] Signal inference workers to stop experience collection... (8150 times) [2024-04-27 20:08:59,720][54798] Signal inference workers to resume experience collection... (8150 times) [2024-04-27 20:08:59,745][54818] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-04-27 20:08:59,746][54818] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-04-27 20:08:59,829][54818] Updated weights for policy 0, policy_version 467498 (0.0035) [2024-04-27 20:09:02,483][54818] Updated weights for policy 0, policy_version 467508 (0.0027) [2024-04-27 20:09:04,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 7659700224. Throughput: 0: 55358.3. Samples: 564921660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:09:04,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 20:09:05,646][54818] Updated weights for policy 0, policy_version 467518 (0.0028) [2024-04-27 20:09:08,302][54818] Updated weights for policy 0, policy_version 467528 (0.0025) [2024-04-27 20:09:09,253][54587] Fps is (10 sec: 60619.9, 60 sec: 55432.5, 300 sec: 55816.6). Total num frames: 7660027904. Throughput: 0: 55359.4. Samples: 565257780. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:09:09,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 20:09:11,564][54818] Updated weights for policy 0, policy_version 467538 (0.0031) [2024-04-27 20:09:14,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 7660290048. Throughput: 0: 55604.0. Samples: 565423020. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:09:14,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 20:09:14,585][54818] Updated weights for policy 0, policy_version 467548 (0.0033) [2024-04-27 20:09:17,420][54818] Updated weights for policy 0, policy_version 467558 (0.0026) [2024-04-27 20:09:19,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7660568576. Throughput: 0: 55641.7. Samples: 565753960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:09:19,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 20:09:20,441][54818] Updated weights for policy 0, policy_version 467568 (0.0030) [2024-04-27 20:09:23,331][54818] Updated weights for policy 0, policy_version 467578 (0.0025) [2024-04-27 20:09:24,253][54587] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7660847104. Throughput: 0: 55442.4. Samples: 566081780. Policy #0 lag: (min: 1.0, avg: 11.3, max: 19.0) [2024-04-27 20:09:24,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 20:09:26,424][54818] Updated weights for policy 0, policy_version 467588 (0.0036) [2024-04-27 20:09:29,134][54818] Updated weights for policy 0, policy_version 467598 (0.0026) [2024-04-27 20:09:29,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7661125632. Throughput: 0: 55346.8. Samples: 566253800. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:09:29,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:09:32,304][54818] Updated weights for policy 0, policy_version 467608 (0.0032) [2024-04-27 20:09:34,253][54587] Fps is (10 sec: 52429.0, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 7661371392. Throughput: 0: 55443.2. Samples: 566586080. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:09:34,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 20:09:35,061][54818] Updated weights for policy 0, policy_version 467618 (0.0027) [2024-04-27 20:09:38,239][54818] Updated weights for policy 0, policy_version 467628 (0.0028) [2024-04-27 20:09:39,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7661666304. Throughput: 0: 55421.7. Samples: 566919540. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:09:39,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:09:41,052][54818] Updated weights for policy 0, policy_version 467638 (0.0026) [2024-04-27 20:09:44,195][54818] Updated weights for policy 0, policy_version 467648 (0.0029) [2024-04-27 20:09:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 7661944832. Throughput: 0: 55300.0. Samples: 567077540. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:09:44,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:09:47,142][54818] Updated weights for policy 0, policy_version 467658 (0.0032) [2024-04-27 20:09:49,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55816.6). Total num frames: 7662239744. Throughput: 0: 55327.0. Samples: 567411380. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:09:49,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 20:09:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000467666_7662239744.pth... 
[2024-04-27 20:09:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000466851_7648886784.pth [2024-04-27 20:09:50,092][54818] Updated weights for policy 0, policy_version 467668 (0.0024) [2024-04-27 20:09:52,931][54818] Updated weights for policy 0, policy_version 467678 (0.0034) [2024-04-27 20:09:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 7662518272. Throughput: 0: 55407.7. Samples: 567751120. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:09:54,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 20:09:55,897][54818] Updated weights for policy 0, policy_version 467688 (0.0026) [2024-04-27 20:09:58,738][54818] Updated weights for policy 0, policy_version 467698 (0.0027) [2024-04-27 20:09:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 7662796800. Throughput: 0: 55368.8. Samples: 567914620. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:09:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 20:10:01,956][54818] Updated weights for policy 0, policy_version 467708 (0.0029) [2024-04-27 20:10:04,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7663042560. Throughput: 0: 55440.0. Samples: 568248760. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:04,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 20:10:04,453][54798] Signal inference workers to stop experience collection... (8200 times) [2024-04-27 20:10:04,493][54818] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-04-27 20:10:04,519][54798] Signal inference workers to resume experience collection... 
(8200 times) [2024-04-27 20:10:04,525][54818] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-04-27 20:10:04,638][54818] Updated weights for policy 0, policy_version 467718 (0.0026) [2024-04-27 20:10:07,963][54818] Updated weights for policy 0, policy_version 467728 (0.0028) [2024-04-27 20:10:09,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 7663337472. Throughput: 0: 55574.5. Samples: 568582640. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:09,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 20:10:10,473][54818] Updated weights for policy 0, policy_version 467738 (0.0030) [2024-04-27 20:10:13,882][54818] Updated weights for policy 0, policy_version 467748 (0.0028) [2024-04-27 20:10:14,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 7663599616. Throughput: 0: 55480.5. Samples: 568750420. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:14,253][54587] Avg episode reward: [(0, '0.508')] [2024-04-27 20:10:16,260][54818] Updated weights for policy 0, policy_version 467758 (0.0028) [2024-04-27 20:10:19,253][54587] Fps is (10 sec: 54068.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7663878144. Throughput: 0: 55451.1. Samples: 569081380. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:19,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 20:10:19,639][54818] Updated weights for policy 0, policy_version 467768 (0.0033) [2024-04-27 20:10:22,233][54818] Updated weights for policy 0, policy_version 467778 (0.0024) [2024-04-27 20:10:24,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7664173056. Throughput: 0: 55354.8. Samples: 569410500. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:24,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 20:10:25,494][54818] Updated weights for policy 0, policy_version 467788 (0.0026) [2024-04-27 20:10:28,214][54818] Updated weights for policy 0, policy_version 467798 (0.0024) [2024-04-27 20:10:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7664451584. Throughput: 0: 55727.6. Samples: 569585280. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:29,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 20:10:31,427][54818] Updated weights for policy 0, policy_version 467808 (0.0023) [2024-04-27 20:10:34,099][54818] Updated weights for policy 0, policy_version 467818 (0.0028) [2024-04-27 20:10:34,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7664730112. Throughput: 0: 55673.1. Samples: 569916660. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:34,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 20:10:37,140][54818] Updated weights for policy 0, policy_version 467828 (0.0026) [2024-04-27 20:10:39,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7664992256. Throughput: 0: 55604.8. Samples: 570253340. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:39,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 20:10:39,845][54818] Updated weights for policy 0, policy_version 467838 (0.0027) [2024-04-27 20:10:43,029][54818] Updated weights for policy 0, policy_version 467848 (0.0038) [2024-04-27 20:10:44,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7665287168. Throughput: 0: 55750.4. Samples: 570423380. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 20:10:45,871][54818] Updated weights for policy 0, policy_version 467858 (0.0030) [2024-04-27 20:10:49,073][54818] Updated weights for policy 0, policy_version 467868 (0.0026) [2024-04-27 20:10:49,254][54587] Fps is (10 sec: 55701.8, 60 sec: 55158.9, 300 sec: 55649.9). Total num frames: 7665549312. Throughput: 0: 55563.1. Samples: 570749140. Policy #0 lag: (min: 1.0, avg: 8.7, max: 19.0) [2024-04-27 20:10:49,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 20:10:51,822][54818] Updated weights for policy 0, policy_version 467878 (0.0035) [2024-04-27 20:10:54,253][54587] Fps is (10 sec: 52428.4, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 7665811456. Throughput: 0: 55567.6. Samples: 571083180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:10:54,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 20:10:55,076][54818] Updated weights for policy 0, policy_version 467888 (0.0029) [2024-04-27 20:10:57,606][54818] Updated weights for policy 0, policy_version 467898 (0.0027) [2024-04-27 20:10:59,253][54587] Fps is (10 sec: 57348.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7666122752. Throughput: 0: 55590.6. Samples: 571252000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:10:59,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 20:11:00,883][54818] Updated weights for policy 0, policy_version 467908 (0.0027) [2024-04-27 20:11:02,730][54798] Signal inference workers to stop experience collection... (8250 times) [2024-04-27 20:11:02,778][54818] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-04-27 20:11:02,788][54798] Signal inference workers to resume experience collection... 
(8250 times) [2024-04-27 20:11:02,796][54818] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-04-27 20:11:03,408][54818] Updated weights for policy 0, policy_version 467918 (0.0025) [2024-04-27 20:11:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7666384896. Throughput: 0: 55717.6. Samples: 571588680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:04,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:11:06,877][54818] Updated weights for policy 0, policy_version 467928 (0.0034) [2024-04-27 20:11:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7666679808. Throughput: 0: 55764.0. Samples: 571919880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:09,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 20:11:09,729][54818] Updated weights for policy 0, policy_version 467938 (0.0037) [2024-04-27 20:11:12,736][54818] Updated weights for policy 0, policy_version 467948 (0.0029) [2024-04-27 20:11:14,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7666941952. Throughput: 0: 55601.1. Samples: 572087340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:14,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 20:11:15,746][54818] Updated weights for policy 0, policy_version 467958 (0.0028) [2024-04-27 20:11:18,725][54818] Updated weights for policy 0, policy_version 467968 (0.0028) [2024-04-27 20:11:19,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7667236864. Throughput: 0: 55632.0. Samples: 572420100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:19,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 20:11:21,659][54818] Updated weights for policy 0, policy_version 467978 (0.0029) [2024-04-27 20:11:24,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7667482624. Throughput: 0: 55531.2. Samples: 572752240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:24,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 20:11:24,694][54818] Updated weights for policy 0, policy_version 467988 (0.0036) [2024-04-27 20:11:27,419][54818] Updated weights for policy 0, policy_version 467998 (0.0026) [2024-04-27 20:11:29,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7667777536. Throughput: 0: 55296.4. Samples: 572911720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:29,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 20:11:30,659][54818] Updated weights for policy 0, policy_version 468008 (0.0031) [2024-04-27 20:11:33,438][54818] Updated weights for policy 0, policy_version 468018 (0.0030) [2024-04-27 20:11:34,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 7668056064. Throughput: 0: 55460.7. Samples: 573244840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:34,254][54587] Avg episode reward: [(0, '0.494')] [2024-04-27 20:11:36,638][54818] Updated weights for policy 0, policy_version 468028 (0.0027) [2024-04-27 20:11:39,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7668318208. Throughput: 0: 55466.3. Samples: 573579160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:39,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 20:11:39,301][54818] Updated weights for policy 0, policy_version 468038 (0.0024) [2024-04-27 20:11:42,340][54818] Updated weights for policy 0, policy_version 468048 (0.0027) [2024-04-27 20:11:44,253][54587] Fps is (10 sec: 54068.2, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 7668596736. Throughput: 0: 55449.8. Samples: 573747240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:44,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 20:11:45,139][54818] Updated weights for policy 0, policy_version 468058 (0.0028) [2024-04-27 20:11:48,201][54818] Updated weights for policy 0, policy_version 468068 (0.0032) [2024-04-27 20:11:49,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55706.3, 300 sec: 55539.0). Total num frames: 7668891648. Throughput: 0: 55454.0. Samples: 574084100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:49,253][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 20:11:49,294][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000468073_7668908032.pth... [2024-04-27 20:11:49,340][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000467256_7655522304.pth [2024-04-27 20:11:50,971][54818] Updated weights for policy 0, policy_version 468078 (0.0028) [2024-04-27 20:11:54,046][54818] Updated weights for policy 0, policy_version 468088 (0.0029) [2024-04-27 20:11:54,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7669170176. Throughput: 0: 55519.0. Samples: 574418240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:54,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:11:56,891][54818] Updated weights for policy 0, policy_version 468098 (0.0032) [2024-04-27 20:11:59,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55650.1). 
Total num frames: 7669448704. Throughput: 0: 55525.0. Samples: 574585960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:11:59,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 20:11:59,986][54818] Updated weights for policy 0, policy_version 468108 (0.0026) [2024-04-27 20:12:02,806][54818] Updated weights for policy 0, policy_version 468118 (0.0034) [2024-04-27 20:12:04,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7669710848. Throughput: 0: 55480.8. Samples: 574916740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:12:04,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 20:12:05,095][54798] Signal inference workers to stop experience collection... (8300 times) [2024-04-27 20:12:05,095][54798] Signal inference workers to resume experience collection... (8300 times) [2024-04-27 20:12:05,117][54818] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-04-27 20:12:05,142][54818] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-04-27 20:12:05,658][54818] Updated weights for policy 0, policy_version 468128 (0.0029) [2024-04-27 20:12:08,564][54818] Updated weights for policy 0, policy_version 468138 (0.0031) [2024-04-27 20:12:09,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 7669989376. Throughput: 0: 55466.7. Samples: 575248240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:12:09,254][54587] Avg episode reward: [(0, '0.489')] [2024-04-27 20:12:11,691][54818] Updated weights for policy 0, policy_version 468148 (0.0039) [2024-04-27 20:12:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7670284288. Throughput: 0: 55747.6. Samples: 575420360. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-27 20:12:14,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 20:12:14,284][54818] Updated weights for policy 0, policy_version 468158 (0.0033) [2024-04-27 20:12:17,478][54818] Updated weights for policy 0, policy_version 468168 (0.0028) [2024-04-27 20:12:19,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7670562816. Throughput: 0: 55719.7. Samples: 575752220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:12:19,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 20:12:20,120][54818] Updated weights for policy 0, policy_version 468178 (0.0031) [2024-04-27 20:12:23,232][54818] Updated weights for policy 0, policy_version 468188 (0.0028) [2024-04-27 20:12:24,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7670841344. Throughput: 0: 55721.4. Samples: 576086620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:12:24,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 20:12:25,926][54818] Updated weights for policy 0, policy_version 468198 (0.0027) [2024-04-27 20:12:29,049][54818] Updated weights for policy 0, policy_version 468208 (0.0032) [2024-04-27 20:12:29,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7671119872. Throughput: 0: 55803.5. Samples: 576258400. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:12:29,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 20:12:32,125][54818] Updated weights for policy 0, policy_version 468218 (0.0032) [2024-04-27 20:12:34,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7671398400. Throughput: 0: 55784.8. Samples: 576594420. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:12:34,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-27 20:12:34,808][54818] Updated weights for policy 0, policy_version 468228 (0.0029) [2024-04-27 20:12:37,969][54818] Updated weights for policy 0, policy_version 468238 (0.0031) [2024-04-27 20:12:39,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7671660544. Throughput: 0: 55883.7. Samples: 576933000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:12:39,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:12:40,770][54818] Updated weights for policy 0, policy_version 468248 (0.0029) [2024-04-27 20:12:43,788][54818] Updated weights for policy 0, policy_version 468258 (0.0028) [2024-04-27 20:12:44,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7671939072. Throughput: 0: 55757.8. Samples: 577095060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:12:44,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 20:12:46,583][54818] Updated weights for policy 0, policy_version 468268 (0.0034) [2024-04-27 20:12:49,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7672250368. Throughput: 0: 55911.1. Samples: 577432740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:12:49,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 20:12:49,539][54818] Updated weights for policy 0, policy_version 468278 (0.0029) [2024-04-27 20:12:52,429][54818] Updated weights for policy 0, policy_version 468288 (0.0029) [2024-04-27 20:12:54,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7672528896. Throughput: 0: 56022.2. Samples: 577769240. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:12:54,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 20:12:55,263][54818] Updated weights for policy 0, policy_version 468298 (0.0031) [2024-04-27 20:12:58,424][54818] Updated weights for policy 0, policy_version 468308 (0.0034) [2024-04-27 20:12:59,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7672791040. Throughput: 0: 56042.7. Samples: 577942280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:12:59,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 20:13:01,262][54818] Updated weights for policy 0, policy_version 468318 (0.0022) [2024-04-27 20:13:04,186][54818] Updated weights for policy 0, policy_version 468328 (0.0036) [2024-04-27 20:13:04,253][54587] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 55539.0). Total num frames: 7673085952. Throughput: 0: 55999.1. Samples: 578272180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:13:04,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 20:13:07,006][54818] Updated weights for policy 0, policy_version 468338 (0.0026) [2024-04-27 20:13:08,611][54798] Signal inference workers to stop experience collection... (8350 times) [2024-04-27 20:13:08,612][54798] Signal inference workers to resume experience collection... (8350 times) [2024-04-27 20:13:08,636][54818] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-04-27 20:13:08,636][54818] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-04-27 20:13:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7673348096. Throughput: 0: 55933.5. Samples: 578603620. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:13:09,253][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 20:13:10,015][54818] Updated weights for policy 0, policy_version 468348 (0.0028) [2024-04-27 20:13:12,927][54818] Updated weights for policy 0, policy_version 468358 (0.0027) [2024-04-27 20:13:14,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7673626624. Throughput: 0: 55930.2. Samples: 578775260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:13:14,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 20:13:15,843][54818] Updated weights for policy 0, policy_version 468368 (0.0031) [2024-04-27 20:13:18,839][54818] Updated weights for policy 0, policy_version 468378 (0.0030) [2024-04-27 20:13:19,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7673905152. Throughput: 0: 55956.0. Samples: 579112440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:13:19,262][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 20:13:21,819][54818] Updated weights for policy 0, policy_version 468388 (0.0033) [2024-04-27 20:13:24,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7674200064. Throughput: 0: 55744.1. Samples: 579441480. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:13:24,253][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 20:13:24,732][54818] Updated weights for policy 0, policy_version 468398 (0.0028) [2024-04-27 20:13:27,698][54818] Updated weights for policy 0, policy_version 468408 (0.0027) [2024-04-27 20:13:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7674478592. Throughput: 0: 55840.5. Samples: 579607880. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:13:29,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 20:13:30,724][54818] Updated weights for policy 0, policy_version 468418 (0.0025) [2024-04-27 20:13:33,630][54818] Updated weights for policy 0, policy_version 468428 (0.0032) [2024-04-27 20:13:34,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55594.6). Total num frames: 7674757120. Throughput: 0: 55798.7. Samples: 579943680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:13:34,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 20:13:36,499][54818] Updated weights for policy 0, policy_version 468438 (0.0028) [2024-04-27 20:13:39,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7675019264. Throughput: 0: 55772.9. Samples: 580279020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-04-27 20:13:39,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-27 20:13:39,588][54818] Updated weights for policy 0, policy_version 468448 (0.0026) [2024-04-27 20:13:42,228][54818] Updated weights for policy 0, policy_version 468458 (0.0030) [2024-04-27 20:13:44,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7675297792. Throughput: 0: 55503.1. Samples: 580439920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:13:44,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 20:13:45,386][54818] Updated weights for policy 0, policy_version 468468 (0.0029) [2024-04-27 20:13:48,206][54818] Updated weights for policy 0, policy_version 468478 (0.0027) [2024-04-27 20:13:49,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7675592704. Throughput: 0: 55688.9. Samples: 580778180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:13:49,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 20:13:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000468481_7675592704.pth... [2024-04-27 20:13:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000467666_7662239744.pth [2024-04-27 20:13:51,444][54818] Updated weights for policy 0, policy_version 468488 (0.0026) [2024-04-27 20:13:54,191][54818] Updated weights for policy 0, policy_version 468498 (0.0028) [2024-04-27 20:13:54,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 7675871232. Throughput: 0: 55693.4. Samples: 581109820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:13:54,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 20:13:57,216][54818] Updated weights for policy 0, policy_version 468508 (0.0027) [2024-04-27 20:13:59,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7676149760. Throughput: 0: 55702.3. Samples: 581281860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:13:59,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 20:14:00,194][54818] Updated weights for policy 0, policy_version 468518 (0.0035) [2024-04-27 20:14:03,033][54818] Updated weights for policy 0, policy_version 468528 (0.0032) [2024-04-27 20:14:03,556][54798] Signal inference workers to stop experience collection... (8400 times) [2024-04-27 20:14:03,556][54798] Signal inference workers to resume experience collection... (8400 times) [2024-04-27 20:14:03,569][54818] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-04-27 20:14:03,569][54818] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-04-27 20:14:04,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7676411904. Throughput: 0: 55633.9. Samples: 581615960. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:04,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 20:14:06,108][54818] Updated weights for policy 0, policy_version 468538 (0.0028) [2024-04-27 20:14:08,945][54818] Updated weights for policy 0, policy_version 468548 (0.0027) [2024-04-27 20:14:09,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 7676706816. Throughput: 0: 55779.4. Samples: 581951560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:09,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 20:14:11,852][54818] Updated weights for policy 0, policy_version 468558 (0.0031) [2024-04-27 20:14:14,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 7676985344. Throughput: 0: 55784.3. Samples: 582118180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:14,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 20:14:15,038][54818] Updated weights for policy 0, policy_version 468568 (0.0026) [2024-04-27 20:14:17,719][54818] Updated weights for policy 0, policy_version 468578 (0.0024) [2024-04-27 20:14:19,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7677231104. Throughput: 0: 55648.4. Samples: 582447860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:19,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 20:14:20,776][54818] Updated weights for policy 0, policy_version 468588 (0.0030) [2024-04-27 20:14:23,580][54818] Updated weights for policy 0, policy_version 468598 (0.0032) [2024-04-27 20:14:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7677526016. Throughput: 0: 55576.4. Samples: 582779960. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:24,254][54587] Avg episode reward: [(0, '0.680')] [2024-04-27 20:14:26,602][54818] Updated weights for policy 0, policy_version 468608 (0.0029) [2024-04-27 20:14:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7677804544. Throughput: 0: 55782.2. Samples: 582950120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:29,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 20:14:29,415][54818] Updated weights for policy 0, policy_version 468618 (0.0024) [2024-04-27 20:14:32,617][54818] Updated weights for policy 0, policy_version 468628 (0.0030) [2024-04-27 20:14:34,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 7678083072. Throughput: 0: 55711.4. Samples: 583285200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:34,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 20:14:35,207][54818] Updated weights for policy 0, policy_version 468638 (0.0039) [2024-04-27 20:14:38,546][54818] Updated weights for policy 0, policy_version 468648 (0.0027) [2024-04-27 20:14:39,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7678345216. Throughput: 0: 55708.4. Samples: 583616700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:39,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 20:14:41,074][54818] Updated weights for policy 0, policy_version 468658 (0.0032) [2024-04-27 20:14:44,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 7678640128. Throughput: 0: 55629.8. Samples: 583785200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:44,253][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 20:14:44,309][54818] Updated weights for policy 0, policy_version 468668 (0.0028) [2024-04-27 20:14:46,965][54818] Updated weights for policy 0, policy_version 468678 (0.0027) [2024-04-27 20:14:49,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7678902272. Throughput: 0: 55695.9. Samples: 584122280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:49,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 20:14:50,218][54818] Updated weights for policy 0, policy_version 468688 (0.0030) [2024-04-27 20:14:52,743][54818] Updated weights for policy 0, policy_version 468698 (0.0030) [2024-04-27 20:14:54,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7679197184. Throughput: 0: 55764.9. Samples: 584460980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:54,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 20:14:56,187][54818] Updated weights for policy 0, policy_version 468708 (0.0028) [2024-04-27 20:14:56,195][54798] Signal inference workers to stop experience collection... (8450 times) [2024-04-27 20:14:56,196][54798] Signal inference workers to resume experience collection... (8450 times) [2024-04-27 20:14:56,214][54818] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-04-27 20:14:56,214][54818] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-04-27 20:14:58,692][54818] Updated weights for policy 0, policy_version 468718 (0.0033) [2024-04-27 20:14:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7679475712. Throughput: 0: 55618.7. Samples: 584621020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:14:59,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 20:15:01,934][54818] Updated weights for policy 0, policy_version 468728 (0.0025) [2024-04-27 20:15:04,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7679770624. Throughput: 0: 55724.0. Samples: 584955440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:15:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:15:04,469][54818] Updated weights for policy 0, policy_version 468738 (0.0027) [2024-04-27 20:15:07,733][54818] Updated weights for policy 0, policy_version 468748 (0.0029) [2024-04-27 20:15:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7680049152. Throughput: 0: 55654.2. Samples: 585284400. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:09,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 20:15:10,234][54818] Updated weights for policy 0, policy_version 468758 (0.0025) [2024-04-27 20:15:13,723][54818] Updated weights for policy 0, policy_version 468768 (0.0035) [2024-04-27 20:15:14,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7680311296. Throughput: 0: 55877.3. Samples: 585464600. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:14,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:15:16,690][54818] Updated weights for policy 0, policy_version 468778 (0.0031) [2024-04-27 20:15:19,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7680573440. Throughput: 0: 55767.3. Samples: 585794720. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:19,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 20:15:19,693][54818] Updated weights for policy 0, policy_version 468788 (0.0027) [2024-04-27 20:15:22,732][54818] Updated weights for policy 0, policy_version 468798 (0.0026) [2024-04-27 20:15:24,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7680868352. Throughput: 0: 55751.1. Samples: 586125500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:24,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 20:15:25,571][54818] Updated weights for policy 0, policy_version 468808 (0.0028) [2024-04-27 20:15:28,624][54818] Updated weights for policy 0, policy_version 468818 (0.0025) [2024-04-27 20:15:29,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7681130496. Throughput: 0: 55446.0. Samples: 586280280. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:29,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 20:15:31,357][54818] Updated weights for policy 0, policy_version 468828 (0.0030) [2024-04-27 20:15:34,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7681409024. Throughput: 0: 55454.1. Samples: 586617720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:34,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 20:15:34,541][54818] Updated weights for policy 0, policy_version 468838 (0.0027) [2024-04-27 20:15:37,241][54818] Updated weights for policy 0, policy_version 468848 (0.0029) [2024-04-27 20:15:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 7681736704. Throughput: 0: 55316.5. Samples: 586950220. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:39,254][54587] Avg episode reward: [(0, '0.680')] [2024-04-27 20:15:40,368][54818] Updated weights for policy 0, policy_version 468858 (0.0033) [2024-04-27 20:15:43,217][54818] Updated weights for policy 0, policy_version 468868 (0.0028) [2024-04-27 20:15:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 56251.5, 300 sec: 55816.8). Total num frames: 7682015232. Throughput: 0: 55749.2. Samples: 587129740. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:44,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:15:46,294][54818] Updated weights for policy 0, policy_version 468878 (0.0035) [2024-04-27 20:15:49,226][54818] Updated weights for policy 0, policy_version 468888 (0.0028) [2024-04-27 20:15:49,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7682260992. Throughput: 0: 55737.8. Samples: 587463640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:49,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:15:49,349][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000468889_7682277376.pth... [2024-04-27 20:15:49,398][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000468073_7668908032.pth [2024-04-27 20:15:49,418][54798] Signal inference workers to stop experience collection... (8500 times) [2024-04-27 20:15:49,419][54798] Signal inference workers to resume experience collection... (8500 times) [2024-04-27 20:15:49,438][54818] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-04-27 20:15:49,438][54818] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-04-27 20:15:52,310][54818] Updated weights for policy 0, policy_version 468898 (0.0025) [2024-04-27 20:15:54,253][54587] Fps is (10 sec: 50790.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7682523136. Throughput: 0: 55823.6. Samples: 587796460. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:54,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 20:15:54,996][54818] Updated weights for policy 0, policy_version 468908 (0.0027) [2024-04-27 20:15:57,973][54818] Updated weights for policy 0, policy_version 468918 (0.0034) [2024-04-27 20:15:59,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7682801664. Throughput: 0: 55240.7. Samples: 587950440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:15:59,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 20:16:00,852][54818] Updated weights for policy 0, policy_version 468928 (0.0025) [2024-04-27 20:16:03,797][54818] Updated weights for policy 0, policy_version 468938 (0.0026) [2024-04-27 20:16:04,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7683080192. Throughput: 0: 55386.3. Samples: 588287100. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:16:04,253][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 20:16:06,762][54818] Updated weights for policy 0, policy_version 468948 (0.0032) [2024-04-27 20:16:09,253][54587] Fps is (10 sec: 57345.1, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7683375104. Throughput: 0: 55483.1. Samples: 588622240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:16:09,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 20:16:09,775][54818] Updated weights for policy 0, policy_version 468958 (0.0031) [2024-04-27 20:16:12,638][54818] Updated weights for policy 0, policy_version 468968 (0.0031) [2024-04-27 20:16:14,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7683670016. Throughput: 0: 55872.5. Samples: 588794540. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:16:14,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 20:16:15,589][54818] Updated weights for policy 0, policy_version 468978 (0.0027) [2024-04-27 20:16:18,445][54818] Updated weights for policy 0, policy_version 468988 (0.0026) [2024-04-27 20:16:19,253][54587] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 7683964928. Throughput: 0: 55844.7. Samples: 589130720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:16:19,253][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 20:16:21,632][54818] Updated weights for policy 0, policy_version 468998 (0.0024) [2024-04-27 20:16:24,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7684210688. Throughput: 0: 55939.4. Samples: 589467500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:16:24,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 20:16:24,293][54818] Updated weights for policy 0, policy_version 469008 (0.0030) [2024-04-27 20:16:27,330][54818] Updated weights for policy 0, policy_version 469018 (0.0030) [2024-04-27 20:16:29,253][54587] Fps is (10 sec: 50790.3, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 7684472832. Throughput: 0: 55580.2. Samples: 589630840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-04-27 20:16:29,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 20:16:30,292][54818] Updated weights for policy 0, policy_version 469028 (0.0030) [2024-04-27 20:16:33,026][54818] Updated weights for policy 0, policy_version 469038 (0.0032) [2024-04-27 20:16:34,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7684767744. Throughput: 0: 55553.7. Samples: 589963560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:16:34,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-27 20:16:36,299][54818] Updated weights for policy 0, policy_version 469048 (0.0034) [2024-04-27 20:16:38,912][54818] Updated weights for policy 0, policy_version 469058 (0.0031) [2024-04-27 20:16:39,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 7685046272. Throughput: 0: 55598.3. Samples: 590298380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:16:39,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 20:16:42,043][54818] Updated weights for policy 0, policy_version 469068 (0.0033) [2024-04-27 20:16:44,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 7685324800. Throughput: 0: 55951.2. Samples: 590468240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:16:44,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 20:16:44,984][54818] Updated weights for policy 0, policy_version 469078 (0.0031) [2024-04-27 20:16:47,744][54818] Updated weights for policy 0, policy_version 469088 (0.0032) [2024-04-27 20:16:48,506][54798] Signal inference workers to stop experience collection... (8550 times) [2024-04-27 20:16:48,510][54798] Signal inference workers to resume experience collection... (8550 times) [2024-04-27 20:16:48,534][54818] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-04-27 20:16:48,534][54818] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-04-27 20:16:49,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7685603328. Throughput: 0: 55851.0. Samples: 590800400. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:16:49,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 20:16:50,802][54818] Updated weights for policy 0, policy_version 469098 (0.0029) [2024-04-27 20:16:53,667][54818] Updated weights for policy 0, policy_version 469108 (0.0031) [2024-04-27 20:16:54,253][54587] Fps is (10 sec: 57344.7, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 7685898240. Throughput: 0: 55730.7. Samples: 591130120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:16:54,253][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 20:16:56,660][54818] Updated weights for policy 0, policy_version 469118 (0.0032) [2024-04-27 20:16:59,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 7686160384. Throughput: 0: 55607.6. Samples: 591296880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:16:59,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 20:16:59,653][54818] Updated weights for policy 0, policy_version 469128 (0.0038) [2024-04-27 20:17:02,927][54818] Updated weights for policy 0, policy_version 469138 (0.0025) [2024-04-27 20:17:04,253][54587] Fps is (10 sec: 52428.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7686422528. Throughput: 0: 55628.3. Samples: 591634000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:04,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 20:17:05,534][54818] Updated weights for policy 0, policy_version 469148 (0.0028) [2024-04-27 20:17:08,724][54818] Updated weights for policy 0, policy_version 469158 (0.0029) [2024-04-27 20:17:09,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7686684672. Throughput: 0: 55633.0. Samples: 591970980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:09,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 20:17:11,250][54818] Updated weights for policy 0, policy_version 469168 (0.0028) [2024-04-27 20:17:14,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7686995968. Throughput: 0: 55591.9. Samples: 592132480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:14,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 20:17:14,473][54818] Updated weights for policy 0, policy_version 469178 (0.0027) [2024-04-27 20:17:17,149][54818] Updated weights for policy 0, policy_version 469188 (0.0032) [2024-04-27 20:17:19,253][54587] Fps is (10 sec: 58982.6, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 7687274496. Throughput: 0: 55596.6. Samples: 592465400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:19,253][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 20:17:20,390][54818] Updated weights for policy 0, policy_version 469198 (0.0029) [2024-04-27 20:17:23,342][54818] Updated weights for policy 0, policy_version 469208 (0.0025) [2024-04-27 20:17:24,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7687553024. Throughput: 0: 55447.8. Samples: 592793540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:24,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 20:17:26,329][54818] Updated weights for policy 0, policy_version 469218 (0.0027) [2024-04-27 20:17:29,033][54818] Updated weights for policy 0, policy_version 469228 (0.0028) [2024-04-27 20:17:29,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7687831552. Throughput: 0: 55582.8. Samples: 592969460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:29,253][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:17:32,178][54818] Updated weights for policy 0, policy_version 469238 (0.0028) [2024-04-27 20:17:34,253][54587] Fps is (10 sec: 57345.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7688126464. Throughput: 0: 55688.5. Samples: 593306380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:34,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 20:17:34,830][54818] Updated weights for policy 0, policy_version 469248 (0.0031) [2024-04-27 20:17:38,051][54818] Updated weights for policy 0, policy_version 469258 (0.0031) [2024-04-27 20:17:39,253][54587] Fps is (10 sec: 55704.4, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 7688388608. Throughput: 0: 55761.9. Samples: 593639420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:39,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 20:17:40,775][54818] Updated weights for policy 0, policy_version 469268 (0.0027) [2024-04-27 20:17:43,844][54818] Updated weights for policy 0, policy_version 469278 (0.0030) [2024-04-27 20:17:44,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 7688667136. Throughput: 0: 55686.6. Samples: 593802780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:44,254][54587] Avg episode reward: [(0, '0.493')] [2024-04-27 20:17:46,662][54818] Updated weights for policy 0, policy_version 469288 (0.0027) [2024-04-27 20:17:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7688945664. Throughput: 0: 55692.8. Samples: 594140180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:49,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:17:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000469296_7688945664.pth... 
[2024-04-27 20:17:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000468481_7675592704.pth [2024-04-27 20:17:49,735][54818] Updated weights for policy 0, policy_version 469298 (0.0028) [2024-04-27 20:17:52,532][54818] Updated weights for policy 0, policy_version 469308 (0.0028) [2024-04-27 20:17:54,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7689207808. Throughput: 0: 55636.9. Samples: 594474640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:17:54,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:17:54,344][54798] Signal inference workers to stop experience collection... (8600 times) [2024-04-27 20:17:54,349][54798] Signal inference workers to resume experience collection... (8600 times) [2024-04-27 20:17:54,361][54818] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-04-27 20:17:54,362][54818] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-04-27 20:17:55,684][54818] Updated weights for policy 0, policy_version 469318 (0.0025) [2024-04-27 20:17:58,308][54818] Updated weights for policy 0, policy_version 469328 (0.0030) [2024-04-27 20:17:59,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7689486336. Throughput: 0: 55741.1. Samples: 594640840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:17:59,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 20:18:01,583][54818] Updated weights for policy 0, policy_version 469338 (0.0026) [2024-04-27 20:18:04,193][54818] Updated weights for policy 0, policy_version 469348 (0.0026) [2024-04-27 20:18:04,253][54587] Fps is (10 sec: 58982.1, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 7689797632. Throughput: 0: 55724.8. Samples: 594973020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:04,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 20:18:07,363][54818] Updated weights for policy 0, policy_version 469358 (0.0028) [2024-04-27 20:18:09,253][54587] Fps is (10 sec: 57344.5, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7690059776. Throughput: 0: 55726.3. Samples: 595301220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:09,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 20:18:10,252][54818] Updated weights for policy 0, policy_version 469368 (0.0027) [2024-04-27 20:18:13,282][54818] Updated weights for policy 0, policy_version 469378 (0.0028) [2024-04-27 20:18:14,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7690321920. Throughput: 0: 55622.7. Samples: 595472480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:14,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 20:18:16,084][54818] Updated weights for policy 0, policy_version 469388 (0.0022) [2024-04-27 20:18:19,036][54818] Updated weights for policy 0, policy_version 469398 (0.0036) [2024-04-27 20:18:19,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7690616832. Throughput: 0: 55503.9. Samples: 595804060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:19,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 20:18:21,966][54818] Updated weights for policy 0, policy_version 469408 (0.0031) [2024-04-27 20:18:24,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 7690895360. Throughput: 0: 55567.7. Samples: 596139960. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:24,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 20:18:24,981][54818] Updated weights for policy 0, policy_version 469418 (0.0026) [2024-04-27 20:18:27,774][54818] Updated weights for policy 0, policy_version 469428 (0.0028) [2024-04-27 20:18:29,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7691173888. Throughput: 0: 55510.3. Samples: 596300740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:29,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 20:18:30,763][54818] Updated weights for policy 0, policy_version 469438 (0.0031) [2024-04-27 20:18:33,903][54818] Updated weights for policy 0, policy_version 469448 (0.0027) [2024-04-27 20:18:34,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7691436032. Throughput: 0: 55466.0. Samples: 596636140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:34,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 20:18:36,618][54818] Updated weights for policy 0, policy_version 469458 (0.0029) [2024-04-27 20:18:39,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 7691730944. Throughput: 0: 55555.6. Samples: 596974640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:39,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:18:39,701][54818] Updated weights for policy 0, policy_version 469468 (0.0032) [2024-04-27 20:18:42,614][54818] Updated weights for policy 0, policy_version 469478 (0.0031) [2024-04-27 20:18:44,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7692009472. Throughput: 0: 55638.9. Samples: 597144580. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:44,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 20:18:45,583][54818] Updated weights for policy 0, policy_version 469488 (0.0033) [2024-04-27 20:18:48,284][54818] Updated weights for policy 0, policy_version 469498 (0.0026) [2024-04-27 20:18:49,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7692271616. Throughput: 0: 55749.5. Samples: 597481740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:49,253][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 20:18:51,480][54818] Updated weights for policy 0, policy_version 469508 (0.0030) [2024-04-27 20:18:54,157][54818] Updated weights for policy 0, policy_version 469518 (0.0027) [2024-04-27 20:18:54,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7692582912. Throughput: 0: 55772.6. Samples: 597810980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:54,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 20:18:56,553][54798] Signal inference workers to stop experience collection... (8650 times) [2024-04-27 20:18:56,584][54818] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-04-27 20:18:56,612][54798] Signal inference workers to resume experience collection... (8650 times) [2024-04-27 20:18:56,613][54818] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-04-27 20:18:57,421][54818] Updated weights for policy 0, policy_version 469528 (0.0030) [2024-04-27 20:18:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 7692845056. Throughput: 0: 55786.3. Samples: 597982860. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:18:59,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 20:19:00,016][54818] Updated weights for policy 0, policy_version 469538 (0.0033) [2024-04-27 20:19:03,329][54818] Updated weights for policy 0, policy_version 469548 (0.0037) [2024-04-27 20:19:04,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7693123584. Throughput: 0: 55831.9. Samples: 598316500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:19:04,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:19:05,833][54818] Updated weights for policy 0, policy_version 469558 (0.0028) [2024-04-27 20:19:09,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7693385728. Throughput: 0: 55793.4. Samples: 598650660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:19:09,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 20:19:09,298][54818] Updated weights for policy 0, policy_version 469568 (0.0025) [2024-04-27 20:19:11,601][54818] Updated weights for policy 0, policy_version 469578 (0.0026) [2024-04-27 20:19:14,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7693680640. Throughput: 0: 55726.1. Samples: 598808420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:19:14,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 20:19:15,223][54818] Updated weights for policy 0, policy_version 469588 (0.0034) [2024-04-27 20:19:17,872][54818] Updated weights for policy 0, policy_version 469598 (0.0027) [2024-04-27 20:19:19,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7693959168. Throughput: 0: 55735.9. Samples: 599144260. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-04-27 20:19:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 20:19:21,140][54818] Updated weights for policy 0, policy_version 469608 (0.0035) [2024-04-27 20:19:23,765][54818] Updated weights for policy 0, policy_version 469618 (0.0029) [2024-04-27 20:19:24,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7694237696. Throughput: 0: 55694.7. Samples: 599480900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:19:24,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 20:19:26,888][54818] Updated weights for policy 0, policy_version 469628 (0.0030) [2024-04-27 20:19:29,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7694516224. Throughput: 0: 55643.2. Samples: 599648520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:19:29,253][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 20:19:29,536][54818] Updated weights for policy 0, policy_version 469638 (0.0027) [2024-04-27 20:19:32,671][54818] Updated weights for policy 0, policy_version 469648 (0.0025) [2024-04-27 20:19:34,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7694778368. Throughput: 0: 55466.5. Samples: 599977740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:19:34,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 20:19:35,497][54818] Updated weights for policy 0, policy_version 469658 (0.0040) [2024-04-27 20:19:38,601][54818] Updated weights for policy 0, policy_version 469668 (0.0031) [2024-04-27 20:19:39,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7695056896. Throughput: 0: 55386.6. Samples: 600303380. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:19:39,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 20:19:41,420][54818] Updated weights for policy 0, policy_version 469678 (0.0031) [2024-04-27 20:19:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7695335424. Throughput: 0: 55292.7. Samples: 600471040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:19:44,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 20:19:44,783][54818] Updated weights for policy 0, policy_version 469688 (0.0036) [2024-04-27 20:19:47,200][54818] Updated weights for policy 0, policy_version 469698 (0.0039) [2024-04-27 20:19:49,253][54587] Fps is (10 sec: 54066.3, 60 sec: 55432.3, 300 sec: 55594.5). Total num frames: 7695597568. Throughput: 0: 55311.9. Samples: 600805540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:19:49,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 20:19:49,312][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000469703_7695613952.pth... [2024-04-27 20:19:49,373][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000468889_7682277376.pth [2024-04-27 20:19:50,849][54818] Updated weights for policy 0, policy_version 469708 (0.0028) [2024-04-27 20:19:53,125][54818] Updated weights for policy 0, policy_version 469718 (0.0025) [2024-04-27 20:19:54,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7695892480. Throughput: 0: 55320.5. Samples: 601140080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:19:54,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 20:19:56,817][54818] Updated weights for policy 0, policy_version 469728 (0.0023) [2024-04-27 20:19:59,011][54818] Updated weights for policy 0, policy_version 469738 (0.0032) [2024-04-27 20:19:59,253][54587] Fps is (10 sec: 58983.0, 60 sec: 55705.4, 300 sec: 55650.0). 
Total num frames: 7696187392. Throughput: 0: 55573.8. Samples: 601309240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:19:59,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 20:20:02,587][54818] Updated weights for policy 0, policy_version 469748 (0.0027) [2024-04-27 20:20:04,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7696433152. Throughput: 0: 55356.6. Samples: 601635300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:20:04,253][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 20:20:05,180][54818] Updated weights for policy 0, policy_version 469758 (0.0029) [2024-04-27 20:20:08,325][54818] Updated weights for policy 0, policy_version 469768 (0.0033) [2024-04-27 20:20:09,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7696728064. Throughput: 0: 55230.5. Samples: 601966280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:20:09,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 20:20:09,763][54798] Signal inference workers to stop experience collection... (8700 times) [2024-04-27 20:20:09,763][54798] Signal inference workers to resume experience collection... (8700 times) [2024-04-27 20:20:09,777][54818] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-04-27 20:20:09,799][54818] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-04-27 20:20:11,366][54818] Updated weights for policy 0, policy_version 469778 (0.0029) [2024-04-27 20:20:14,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 7696990208. Throughput: 0: 55320.0. Samples: 602137920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:20:14,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 20:20:14,324][54818] Updated weights for policy 0, policy_version 469788 (0.0036) [2024-04-27 20:20:17,447][54818] Updated weights for policy 0, policy_version 469798 (0.0032) [2024-04-27 20:20:19,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7697268736. Throughput: 0: 55288.4. Samples: 602465720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:20:19,254][54587] Avg episode reward: [(0, '0.490')] [2024-04-27 20:20:20,176][54818] Updated weights for policy 0, policy_version 469808 (0.0031) [2024-04-27 20:20:23,437][54818] Updated weights for policy 0, policy_version 469818 (0.0029) [2024-04-27 20:20:24,253][54587] Fps is (10 sec: 52428.5, 60 sec: 54613.3, 300 sec: 55539.0). Total num frames: 7697514496. Throughput: 0: 55292.4. Samples: 602791540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:20:24,254][54587] Avg episode reward: [(0, '0.512')] [2024-04-27 20:20:26,165][54818] Updated weights for policy 0, policy_version 469828 (0.0031) [2024-04-27 20:20:29,160][54818] Updated weights for policy 0, policy_version 469838 (0.0028) [2024-04-27 20:20:29,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7697825792. Throughput: 0: 55264.0. Samples: 602957920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:20:29,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:20:31,990][54818] Updated weights for policy 0, policy_version 469848 (0.0034) [2024-04-27 20:20:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7698120704. Throughput: 0: 55342.0. Samples: 603295920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:20:34,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-27 20:20:35,050][54818] Updated weights for policy 0, policy_version 469858 (0.0032) [2024-04-27 20:20:37,950][54818] Updated weights for policy 0, policy_version 469868 (0.0030) [2024-04-27 20:20:39,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 7698382848. Throughput: 0: 55319.5. Samples: 603629460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:20:39,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 20:20:41,074][54818] Updated weights for policy 0, policy_version 469878 (0.0029) [2024-04-27 20:20:43,896][54818] Updated weights for policy 0, policy_version 469888 (0.0025) [2024-04-27 20:20:44,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7698661376. Throughput: 0: 55277.7. Samples: 603796740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:20:44,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 20:20:46,882][54818] Updated weights for policy 0, policy_version 469898 (0.0032) [2024-04-27 20:20:49,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.8, 300 sec: 55594.5). Total num frames: 7698923520. Throughput: 0: 55438.3. Samples: 604130020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:20:49,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 20:20:49,744][54818] Updated weights for policy 0, policy_version 469908 (0.0030) [2024-04-27 20:20:52,894][54818] Updated weights for policy 0, policy_version 469918 (0.0028) [2024-04-27 20:20:54,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 7699218432. Throughput: 0: 55519.1. Samples: 604464640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:20:54,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 20:20:55,511][54818] Updated weights for policy 0, policy_version 469928 (0.0024) [2024-04-27 20:20:58,749][54818] Updated weights for policy 0, policy_version 469938 (0.0027) [2024-04-27 20:20:59,253][54587] Fps is (10 sec: 54067.2, 60 sec: 54613.5, 300 sec: 55539.0). Total num frames: 7699464192. Throughput: 0: 55298.7. Samples: 604626360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:20:59,253][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 20:21:01,436][54818] Updated weights for policy 0, policy_version 469948 (0.0027) [2024-04-27 20:21:04,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7699759104. Throughput: 0: 55408.5. Samples: 604959100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:04,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 20:21:04,559][54818] Updated weights for policy 0, policy_version 469958 (0.0030) [2024-04-27 20:21:07,487][54818] Updated weights for policy 0, policy_version 469968 (0.0025) [2024-04-27 20:21:09,253][54587] Fps is (10 sec: 58981.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7700054016. Throughput: 0: 55439.4. Samples: 605286320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:09,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:21:10,530][54818] Updated weights for policy 0, policy_version 469978 (0.0029) [2024-04-27 20:21:13,393][54818] Updated weights for policy 0, policy_version 469988 (0.0031) [2024-04-27 20:21:14,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7700348928. Throughput: 0: 55698.2. Samples: 605464340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:14,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 20:21:16,347][54818] Updated weights for policy 0, policy_version 469998 (0.0028) [2024-04-27 20:21:19,113][54818] Updated weights for policy 0, policy_version 470008 (0.0030) [2024-04-27 20:21:19,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7700611072. Throughput: 0: 55694.1. Samples: 605802160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:19,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 20:21:22,094][54818] Updated weights for policy 0, policy_version 470018 (0.0029) [2024-04-27 20:21:24,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7700873216. Throughput: 0: 55697.4. Samples: 606135840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:24,253][54587] Avg episode reward: [(0, '0.513')] [2024-04-27 20:21:24,921][54818] Updated weights for policy 0, policy_version 470028 (0.0028) [2024-04-27 20:21:28,124][54818] Updated weights for policy 0, policy_version 470038 (0.0034) [2024-04-27 20:21:28,445][54798] Signal inference workers to stop experience collection... (8750 times) [2024-04-27 20:21:28,484][54818] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-04-27 20:21:28,501][54798] Signal inference workers to resume experience collection... (8750 times) [2024-04-27 20:21:28,502][54818] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-04-27 20:21:29,254][54587] Fps is (10 sec: 54061.6, 60 sec: 55431.5, 300 sec: 55538.8). Total num frames: 7701151744. Throughput: 0: 55552.1. Samples: 606296640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:29,255][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 20:21:30,922][54818] Updated weights for policy 0, policy_version 470048 (0.0033) [2024-04-27 20:21:34,158][54818] Updated weights for policy 0, policy_version 470058 (0.0024) [2024-04-27 20:21:34,253][54587] Fps is (10 sec: 55704.3, 60 sec: 55159.3, 300 sec: 55538.9). Total num frames: 7701430272. Throughput: 0: 55517.0. Samples: 606628300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:34,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 20:21:36,790][54818] Updated weights for policy 0, policy_version 470068 (0.0028) [2024-04-27 20:21:39,253][54587] Fps is (10 sec: 54073.9, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 7701692416. Throughput: 0: 55557.1. Samples: 606964700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:39,253][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 20:21:40,113][54818] Updated weights for policy 0, policy_version 470078 (0.0031) [2024-04-27 20:21:42,504][54818] Updated weights for policy 0, policy_version 470088 (0.0033) [2024-04-27 20:21:44,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7702003712. Throughput: 0: 55641.2. Samples: 607130220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:44,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 20:21:45,992][54818] Updated weights for policy 0, policy_version 470098 (0.0026) [2024-04-27 20:21:48,380][54818] Updated weights for policy 0, policy_version 470108 (0.0030) [2024-04-27 20:21:49,253][54587] Fps is (10 sec: 60619.6, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 7702298624. Throughput: 0: 55742.1. Samples: 607467500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:49,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:21:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000470111_7702298624.pth... [2024-04-27 20:21:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000469296_7688945664.pth [2024-04-27 20:21:51,762][54818] Updated weights for policy 0, policy_version 470118 (0.0032) [2024-04-27 20:21:54,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7702544384. Throughput: 0: 55849.1. Samples: 607799520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:54,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 20:21:54,567][54818] Updated weights for policy 0, policy_version 470128 (0.0030) [2024-04-27 20:21:57,529][54818] Updated weights for policy 0, policy_version 470138 (0.0028) [2024-04-27 20:21:59,253][54587] Fps is (10 sec: 54067.4, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 7702839296. Throughput: 0: 55730.2. Samples: 607972200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:21:59,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 20:22:00,369][54818] Updated weights for policy 0, policy_version 470148 (0.0026) [2024-04-27 20:22:03,483][54818] Updated weights for policy 0, policy_version 470158 (0.0027) [2024-04-27 20:22:04,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7703117824. Throughput: 0: 55653.0. Samples: 608306540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:22:04,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 20:22:06,186][54818] Updated weights for policy 0, policy_version 470168 (0.0031) [2024-04-27 20:22:09,245][54818] Updated weights for policy 0, policy_version 470178 (0.0030) [2024-04-27 20:22:09,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55594.5). 
Total num frames: 7703396352. Throughput: 0: 55613.2. Samples: 608638440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-27 20:22:09,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 20:22:11,900][54818] Updated weights for policy 0, policy_version 470188 (0.0030) [2024-04-27 20:22:14,253][54587] Fps is (10 sec: 52428.9, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 7703642112. Throughput: 0: 55717.4. Samples: 608803860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:14,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 20:22:15,142][54818] Updated weights for policy 0, policy_version 470198 (0.0024) [2024-04-27 20:22:17,753][54818] Updated weights for policy 0, policy_version 470208 (0.0029) [2024-04-27 20:22:19,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7703937024. Throughput: 0: 55655.8. Samples: 609132800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:19,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 20:22:21,142][54818] Updated weights for policy 0, policy_version 470218 (0.0029) [2024-04-27 20:22:23,666][54818] Updated weights for policy 0, policy_version 470228 (0.0027) [2024-04-27 20:22:24,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7704231936. Throughput: 0: 55618.5. Samples: 609467540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:24,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 20:22:27,030][54818] Updated weights for policy 0, policy_version 470238 (0.0024) [2024-04-27 20:22:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55979.7, 300 sec: 55539.0). Total num frames: 7704510464. Throughput: 0: 55958.7. Samples: 609648360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:29,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:22:29,550][54818] Updated weights for policy 0, policy_version 470248 (0.0032) [2024-04-27 20:22:31,107][54798] Signal inference workers to stop experience collection... (8800 times) [2024-04-27 20:22:31,142][54818] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-04-27 20:22:31,203][54798] Signal inference workers to resume experience collection... (8800 times) [2024-04-27 20:22:31,204][54818] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-04-27 20:22:32,826][54818] Updated weights for policy 0, policy_version 470258 (0.0024) [2024-04-27 20:22:34,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7704788992. Throughput: 0: 55908.8. Samples: 609983400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:34,255][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 20:22:35,445][54818] Updated weights for policy 0, policy_version 470268 (0.0029) [2024-04-27 20:22:38,694][54818] Updated weights for policy 0, policy_version 470278 (0.0026) [2024-04-27 20:22:39,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 7705051136. Throughput: 0: 55812.5. Samples: 610311080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:39,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 20:22:41,272][54818] Updated weights for policy 0, policy_version 470288 (0.0027) [2024-04-27 20:22:44,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7705329664. Throughput: 0: 55612.0. Samples: 610474740. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:44,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 20:22:44,577][54818] Updated weights for policy 0, policy_version 470298 (0.0024) [2024-04-27 20:22:47,338][54818] Updated weights for policy 0, policy_version 470308 (0.0031) [2024-04-27 20:22:49,253][54587] Fps is (10 sec: 54067.1, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 7705591808. Throughput: 0: 55591.6. Samples: 610808160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:49,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 20:22:50,478][54818] Updated weights for policy 0, policy_version 470318 (0.0026) [2024-04-27 20:22:53,068][54818] Updated weights for policy 0, policy_version 470328 (0.0031) [2024-04-27 20:22:54,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 7705886720. Throughput: 0: 55724.1. Samples: 611146020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:54,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 20:22:56,297][54818] Updated weights for policy 0, policy_version 470338 (0.0029) [2024-04-27 20:22:58,977][54818] Updated weights for policy 0, policy_version 470348 (0.0026) [2024-04-27 20:22:59,253][54587] Fps is (10 sec: 58981.3, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 7706181632. Throughput: 0: 55709.1. Samples: 611310780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:22:59,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:23:02,219][54818] Updated weights for policy 0, policy_version 470358 (0.0027) [2024-04-27 20:23:04,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7706460160. Throughput: 0: 55785.9. Samples: 611643160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:23:04,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 20:23:04,895][54818] Updated weights for policy 0, policy_version 470368 (0.0030) [2024-04-27 20:23:08,060][54818] Updated weights for policy 0, policy_version 470378 (0.0026) [2024-04-27 20:23:09,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 7706738688. Throughput: 0: 55828.5. Samples: 611979820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:23:09,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 20:23:10,636][54818] Updated weights for policy 0, policy_version 470388 (0.0028) [2024-04-27 20:23:13,917][54818] Updated weights for policy 0, policy_version 470398 (0.0028) [2024-04-27 20:23:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 7707017216. Throughput: 0: 55625.4. Samples: 612151500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:23:14,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 20:23:16,688][54818] Updated weights for policy 0, policy_version 470408 (0.0027) [2024-04-27 20:23:19,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7707279360. Throughput: 0: 55618.0. Samples: 612486200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:23:19,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 20:23:19,647][54818] Updated weights for policy 0, policy_version 470418 (0.0032) [2024-04-27 20:23:22,846][54818] Updated weights for policy 0, policy_version 470428 (0.0025) [2024-04-27 20:23:24,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7707557888. Throughput: 0: 55740.8. Samples: 612819420. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:23:24,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 20:23:25,547][54818] Updated weights for policy 0, policy_version 470438 (0.0025) [2024-04-27 20:23:29,104][54818] Updated weights for policy 0, policy_version 470448 (0.0030) [2024-04-27 20:23:29,253][54587] Fps is (10 sec: 54066.0, 60 sec: 55159.3, 300 sec: 55538.9). Total num frames: 7707820032. Throughput: 0: 55643.0. Samples: 612978680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:23:29,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 20:23:31,342][54818] Updated weights for policy 0, policy_version 470458 (0.0025) [2024-04-27 20:23:34,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7708114944. Throughput: 0: 55592.0. Samples: 613309800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 20:23:34,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:23:34,898][54818] Updated weights for policy 0, policy_version 470468 (0.0036) [2024-04-27 20:23:37,191][54818] Updated weights for policy 0, policy_version 470478 (0.0034) [2024-04-27 20:23:39,253][54587] Fps is (10 sec: 60621.9, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 7708426240. Throughput: 0: 55546.7. Samples: 613645620. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:23:39,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 20:23:40,626][54818] Updated weights for policy 0, policy_version 470488 (0.0033) [2024-04-27 20:23:42,122][54798] Signal inference workers to stop experience collection... (8850 times) [2024-04-27 20:23:42,122][54798] Signal inference workers to resume experience collection... 
(8850 times) [2024-04-27 20:23:42,133][54818] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-04-27 20:23:42,133][54818] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-04-27 20:23:43,250][54818] Updated weights for policy 0, policy_version 470498 (0.0034) [2024-04-27 20:23:44,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7708688384. Throughput: 0: 55891.9. Samples: 613825900. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:23:44,253][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 20:23:46,547][54818] Updated weights for policy 0, policy_version 470508 (0.0035) [2024-04-27 20:23:49,032][54818] Updated weights for policy 0, policy_version 470518 (0.0029) [2024-04-27 20:23:49,253][54587] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 55539.0). Total num frames: 7708966912. Throughput: 0: 55892.4. Samples: 614158320. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:23:49,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 20:23:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000470518_7708966912.pth... [2024-04-27 20:23:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000469703_7695613952.pth [2024-04-27 20:23:52,613][54818] Updated weights for policy 0, policy_version 470528 (0.0031) [2024-04-27 20:23:54,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7709229056. Throughput: 0: 55834.1. Samples: 614492360. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:23:54,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 20:23:54,805][54818] Updated weights for policy 0, policy_version 470538 (0.0031) [2024-04-27 20:23:58,397][54818] Updated weights for policy 0, policy_version 470548 (0.0032) [2024-04-27 20:23:59,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.7, 300 sec: 55539.0). 
Total num frames: 7709507584. Throughput: 0: 55563.0. Samples: 614651840. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:23:59,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 20:24:00,749][54818] Updated weights for policy 0, policy_version 470558 (0.0029) [2024-04-27 20:24:04,203][54818] Updated weights for policy 0, policy_version 470568 (0.0030) [2024-04-27 20:24:04,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7709786112. Throughput: 0: 55571.4. Samples: 614986920. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:04,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 20:24:06,616][54818] Updated weights for policy 0, policy_version 470578 (0.0028) [2024-04-27 20:24:09,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 7710081024. Throughput: 0: 55519.3. Samples: 615317780. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:09,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 20:24:10,112][54818] Updated weights for policy 0, policy_version 470588 (0.0028) [2024-04-27 20:24:12,444][54818] Updated weights for policy 0, policy_version 470598 (0.0029) [2024-04-27 20:24:14,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7710359552. Throughput: 0: 55709.5. Samples: 615485600. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:14,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 20:24:16,084][54818] Updated weights for policy 0, policy_version 470608 (0.0027) [2024-04-27 20:24:18,275][54818] Updated weights for policy 0, policy_version 470618 (0.0028) [2024-04-27 20:24:19,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7710638080. Throughput: 0: 55752.9. Samples: 615818680. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:19,263][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 20:24:21,840][54818] Updated weights for policy 0, policy_version 470628 (0.0027) [2024-04-27 20:24:24,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 7710916608. Throughput: 0: 55694.7. Samples: 616151880. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:24,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 20:24:24,287][54818] Updated weights for policy 0, policy_version 470638 (0.0026) [2024-04-27 20:24:27,608][54818] Updated weights for policy 0, policy_version 470648 (0.0032) [2024-04-27 20:24:29,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 7711178752. Throughput: 0: 55389.7. Samples: 616318440. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:29,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 20:24:30,576][54818] Updated weights for policy 0, policy_version 470658 (0.0029) [2024-04-27 20:24:33,541][54818] Updated weights for policy 0, policy_version 470668 (0.0030) [2024-04-27 20:24:34,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7711440896. Throughput: 0: 55327.7. Samples: 616648060. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:34,253][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 20:24:36,465][54818] Updated weights for policy 0, policy_version 470678 (0.0030) [2024-04-27 20:24:39,253][54587] Fps is (10 sec: 54066.5, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 7711719424. Throughput: 0: 55336.8. Samples: 616982520. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:39,263][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 20:24:39,537][54818] Updated weights for policy 0, policy_version 470688 (0.0035) [2024-04-27 20:24:40,391][54798] Signal inference workers to stop experience collection... (8900 times) [2024-04-27 20:24:40,391][54798] Signal inference workers to resume experience collection... (8900 times) [2024-04-27 20:24:40,406][54818] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-04-27 20:24:40,407][54818] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-04-27 20:24:42,215][54818] Updated weights for policy 0, policy_version 470698 (0.0027) [2024-04-27 20:24:44,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 7712014336. Throughput: 0: 55477.3. Samples: 617148320. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:44,263][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:24:45,318][54818] Updated weights for policy 0, policy_version 470708 (0.0040) [2024-04-27 20:24:48,139][54818] Updated weights for policy 0, policy_version 470718 (0.0027) [2024-04-27 20:24:49,253][54587] Fps is (10 sec: 58982.8, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 7712309248. Throughput: 0: 55522.7. Samples: 617485440. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:49,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 20:24:51,419][54818] Updated weights for policy 0, policy_version 470728 (0.0034) [2024-04-27 20:24:53,979][54818] Updated weights for policy 0, policy_version 470738 (0.0030) [2024-04-27 20:24:54,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7712571392. Throughput: 0: 55624.3. Samples: 617820880. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:54,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 20:24:57,329][54818] Updated weights for policy 0, policy_version 470748 (0.0033) [2024-04-27 20:24:59,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7712866304. Throughput: 0: 55721.4. Samples: 617993060. Policy #0 lag: (min: 0.0, avg: 12.8, max: 27.0) [2024-04-27 20:24:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 20:24:59,880][54818] Updated weights for policy 0, policy_version 470758 (0.0025) [2024-04-27 20:25:03,046][54818] Updated weights for policy 0, policy_version 470768 (0.0030) [2024-04-27 20:25:04,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7713128448. Throughput: 0: 55748.8. Samples: 618327380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:04,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 20:25:05,772][54818] Updated weights for policy 0, policy_version 470778 (0.0030) [2024-04-27 20:25:09,001][54818] Updated weights for policy 0, policy_version 470788 (0.0027) [2024-04-27 20:25:09,253][54587] Fps is (10 sec: 52429.6, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7713390592. Throughput: 0: 55831.2. Samples: 618664280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:09,253][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 20:25:11,530][54818] Updated weights for policy 0, policy_version 470798 (0.0034) [2024-04-27 20:25:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7713685504. Throughput: 0: 55692.3. Samples: 618824600. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:14,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 20:25:14,898][54818] Updated weights for policy 0, policy_version 470808 (0.0029) [2024-04-27 20:25:17,595][54818] Updated weights for policy 0, policy_version 470818 (0.0026) [2024-04-27 20:25:19,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7713980416. Throughput: 0: 55790.6. Samples: 619158640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:19,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:25:20,746][54818] Updated weights for policy 0, policy_version 470828 (0.0031) [2024-04-27 20:25:23,498][54818] Updated weights for policy 0, policy_version 470838 (0.0026) [2024-04-27 20:25:24,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7714242560. Throughput: 0: 55784.6. Samples: 619492820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:24,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 20:25:26,656][54818] Updated weights for policy 0, policy_version 470848 (0.0026) [2024-04-27 20:25:29,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7714521088. Throughput: 0: 55865.5. Samples: 619662260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:29,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:25:29,310][54818] Updated weights for policy 0, policy_version 470858 (0.0031) [2024-04-27 20:25:32,515][54818] Updated weights for policy 0, policy_version 470868 (0.0038) [2024-04-27 20:25:34,253][54587] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7714816000. Throughput: 0: 55770.4. Samples: 619995100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:34,253][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 20:25:35,199][54818] Updated weights for policy 0, policy_version 470878 (0.0030) [2024-04-27 20:25:38,011][54798] Signal inference workers to stop experience collection... (8950 times) [2024-04-27 20:25:38,042][54818] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-04-27 20:25:38,099][54798] Signal inference workers to resume experience collection... (8950 times) [2024-04-27 20:25:38,099][54818] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-04-27 20:25:38,350][54818] Updated weights for policy 0, policy_version 470888 (0.0025) [2024-04-27 20:25:39,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7715078144. Throughput: 0: 55651.3. Samples: 620325180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:39,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:25:41,274][54818] Updated weights for policy 0, policy_version 470898 (0.0025) [2024-04-27 20:25:44,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7715340288. Throughput: 0: 55503.3. Samples: 620490700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:44,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 20:25:44,313][54818] Updated weights for policy 0, policy_version 470908 (0.0027) [2024-04-27 20:25:47,239][54818] Updated weights for policy 0, policy_version 470918 (0.0028) [2024-04-27 20:25:49,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7715618816. Throughput: 0: 55454.3. Samples: 620822820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:49,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:25:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000470924_7715618816.pth... 
[2024-04-27 20:25:49,312][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000470111_7702298624.pth [2024-04-27 20:25:50,083][54818] Updated weights for policy 0, policy_version 470928 (0.0030) [2024-04-27 20:25:53,126][54818] Updated weights for policy 0, policy_version 470938 (0.0029) [2024-04-27 20:25:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 7715913728. Throughput: 0: 55457.2. Samples: 621159860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:54,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 20:25:55,883][54818] Updated weights for policy 0, policy_version 470948 (0.0029) [2024-04-27 20:25:58,922][54818] Updated weights for policy 0, policy_version 470958 (0.0030) [2024-04-27 20:25:59,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7716192256. Throughput: 0: 55523.2. Samples: 621323140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:25:59,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 20:26:01,781][54818] Updated weights for policy 0, policy_version 470968 (0.0028) [2024-04-27 20:26:04,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.7, 300 sec: 55594.6). Total num frames: 7716454400. Throughput: 0: 55538.3. Samples: 621657860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:26:04,253][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 20:26:04,979][54818] Updated weights for policy 0, policy_version 470978 (0.0030) [2024-04-27 20:26:07,700][54818] Updated weights for policy 0, policy_version 470988 (0.0028) [2024-04-27 20:26:09,253][54587] Fps is (10 sec: 57344.2, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 7716765696. Throughput: 0: 55405.4. Samples: 621986060. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:26:09,254][54587] Avg episode reward: [(0, '0.487')] [2024-04-27 20:26:10,902][54818] Updated weights for policy 0, policy_version 470998 (0.0029) [2024-04-27 20:26:13,452][54818] Updated weights for policy 0, policy_version 471008 (0.0027) [2024-04-27 20:26:14,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7717044224. Throughput: 0: 55460.9. Samples: 622158000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:26:14,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 20:26:16,756][54818] Updated weights for policy 0, policy_version 471018 (0.0030) [2024-04-27 20:26:19,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7717306368. Throughput: 0: 55531.9. Samples: 622494040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:26:19,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 20:26:19,355][54818] Updated weights for policy 0, policy_version 471028 (0.0027) [2024-04-27 20:26:22,638][54818] Updated weights for policy 0, policy_version 471038 (0.0030) [2024-04-27 20:26:24,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55432.5, 300 sec: 55650.3). Total num frames: 7717568512. Throughput: 0: 55552.3. Samples: 622825040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-27 20:26:24,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 20:26:25,241][54818] Updated weights for policy 0, policy_version 471048 (0.0025) [2024-04-27 20:26:28,639][54818] Updated weights for policy 0, policy_version 471058 (0.0027) [2024-04-27 20:26:29,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 7717863424. Throughput: 0: 55469.1. Samples: 622986820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:26:29,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:26:31,065][54798] Signal inference workers to stop experience collection... 
(9000 times) [2024-04-27 20:26:31,066][54798] Signal inference workers to resume experience collection... (9000 times) [2024-04-27 20:26:31,092][54818] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-04-27 20:26:31,092][54818] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-04-27 20:26:31,174][54818] Updated weights for policy 0, policy_version 471068 (0.0028) [2024-04-27 20:26:34,253][54587] Fps is (10 sec: 54067.1, 60 sec: 54886.3, 300 sec: 55650.0). Total num frames: 7718109184. Throughput: 0: 55512.9. Samples: 623320900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:26:34,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 20:26:34,602][54818] Updated weights for policy 0, policy_version 471078 (0.0028) [2024-04-27 20:26:36,925][54818] Updated weights for policy 0, policy_version 471088 (0.0031) [2024-04-27 20:26:39,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7718404096. Throughput: 0: 55451.5. Samples: 623655180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:26:39,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 20:26:40,333][54818] Updated weights for policy 0, policy_version 471098 (0.0026) [2024-04-27 20:26:42,749][54818] Updated weights for policy 0, policy_version 471108 (0.0025) [2024-04-27 20:26:44,253][54587] Fps is (10 sec: 58983.1, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7718699008. Throughput: 0: 55688.1. Samples: 623829100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:26:44,253][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 20:26:46,154][54818] Updated weights for policy 0, policy_version 471118 (0.0026) [2024-04-27 20:26:48,960][54818] Updated weights for policy 0, policy_version 471128 (0.0032) [2024-04-27 20:26:49,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7718977536. Throughput: 0: 55605.1. Samples: 624160100. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:26:49,255][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 20:26:52,113][54818] Updated weights for policy 0, policy_version 471138 (0.0029) [2024-04-27 20:26:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7719256064. Throughput: 0: 55636.9. Samples: 624489720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:26:54,254][54587] Avg episode reward: [(0, '0.494')] [2024-04-27 20:26:54,722][54818] Updated weights for policy 0, policy_version 471148 (0.0031) [2024-04-27 20:26:58,076][54818] Updated weights for policy 0, policy_version 471158 (0.0024) [2024-04-27 20:26:59,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7719501824. Throughput: 0: 55519.5. Samples: 624656380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:26:59,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 20:27:00,492][54818] Updated weights for policy 0, policy_version 471168 (0.0028) [2024-04-27 20:27:03,842][54818] Updated weights for policy 0, policy_version 471178 (0.0025) [2024-04-27 20:27:04,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 7719796736. Throughput: 0: 55523.0. Samples: 624992580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:04,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 20:27:06,533][54818] Updated weights for policy 0, policy_version 471188 (0.0038) [2024-04-27 20:27:09,253][54587] Fps is (10 sec: 54067.3, 60 sec: 54613.4, 300 sec: 55594.5). Total num frames: 7720042496. Throughput: 0: 55652.6. Samples: 625329400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:09,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 20:27:09,796][54818] Updated weights for policy 0, policy_version 471198 (0.0029) [2024-04-27 20:27:12,529][54818] Updated weights for policy 0, policy_version 471208 (0.0028) [2024-04-27 20:27:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7720353792. Throughput: 0: 55633.0. Samples: 625490300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:14,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:27:15,739][54818] Updated weights for policy 0, policy_version 471218 (0.0036) [2024-04-27 20:27:18,255][54818] Updated weights for policy 0, policy_version 471228 (0.0030) [2024-04-27 20:27:19,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7720615936. Throughput: 0: 55561.8. Samples: 625821180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 20:27:21,677][54818] Updated weights for policy 0, policy_version 471238 (0.0026) [2024-04-27 20:27:24,013][54818] Updated weights for policy 0, policy_version 471248 (0.0029) [2024-04-27 20:27:24,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7720927232. Throughput: 0: 55458.7. Samples: 626150820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:24,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 20:27:27,407][54818] Updated weights for policy 0, policy_version 471258 (0.0023) [2024-04-27 20:27:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7721189376. Throughput: 0: 55580.0. Samples: 626330200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:29,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 20:27:30,400][54818] Updated weights for policy 0, policy_version 471268 (0.0035) [2024-04-27 20:27:33,378][54818] Updated weights for policy 0, policy_version 471278 (0.0028) [2024-04-27 20:27:34,253][54587] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 7721484288. Throughput: 0: 55638.0. Samples: 626663800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:34,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 20:27:36,366][54818] Updated weights for policy 0, policy_version 471288 (0.0034) [2024-04-27 20:27:39,253][54587] Fps is (10 sec: 54066.1, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7721730048. Throughput: 0: 55688.2. Samples: 626995700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:39,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 20:27:39,384][54818] Updated weights for policy 0, policy_version 471298 (0.0025) [2024-04-27 20:27:42,128][54818] Updated weights for policy 0, policy_version 471308 (0.0026) [2024-04-27 20:27:44,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7722008576. Throughput: 0: 55527.6. Samples: 627155120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:44,253][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 20:27:45,066][54818] Updated weights for policy 0, policy_version 471318 (0.0030) [2024-04-27 20:27:46,109][54798] Signal inference workers to stop experience collection... (9050 times) [2024-04-27 20:27:46,148][54818] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-04-27 20:27:46,200][54798] Signal inference workers to resume experience collection... 
(9050 times) [2024-04-27 20:27:46,200][54818] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-04-27 20:27:47,890][54818] Updated weights for policy 0, policy_version 471328 (0.0029) [2024-04-27 20:27:49,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7722287104. Throughput: 0: 55526.7. Samples: 627491280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-27 20:27:49,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 20:27:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000471331_7722287104.pth... [2024-04-27 20:27:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000470518_7708966912.pth [2024-04-27 20:27:50,820][54818] Updated weights for policy 0, policy_version 471338 (0.0028) [2024-04-27 20:27:53,777][54818] Updated weights for policy 0, policy_version 471348 (0.0033) [2024-04-27 20:27:54,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55159.3, 300 sec: 55539.0). Total num frames: 7722565632. Throughput: 0: 55558.1. Samples: 627829520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:27:54,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:27:56,755][54818] Updated weights for policy 0, policy_version 471358 (0.0026) [2024-04-27 20:27:59,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7722860544. Throughput: 0: 55726.4. Samples: 627997980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:27:59,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 20:27:59,852][54818] Updated weights for policy 0, policy_version 471368 (0.0031) [2024-04-27 20:28:02,559][54818] Updated weights for policy 0, policy_version 471378 (0.0031) [2024-04-27 20:28:04,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7723139072. Throughput: 0: 55712.1. Samples: 628328220. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:04,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 20:28:05,725][54818] Updated weights for policy 0, policy_version 471388 (0.0029) [2024-04-27 20:28:08,367][54818] Updated weights for policy 0, policy_version 471398 (0.0030) [2024-04-27 20:28:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 7723433984. Throughput: 0: 55740.5. Samples: 628659140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:09,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 20:28:11,649][54818] Updated weights for policy 0, policy_version 471408 (0.0030) [2024-04-27 20:28:14,231][54818] Updated weights for policy 0, policy_version 471418 (0.0027) [2024-04-27 20:28:14,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7723712512. Throughput: 0: 55631.6. Samples: 628833620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:14,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 20:28:17,634][54818] Updated weights for policy 0, policy_version 471428 (0.0031) [2024-04-27 20:28:19,253][54587] Fps is (10 sec: 52428.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7723958272. Throughput: 0: 55563.9. Samples: 629164180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:19,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 20:28:20,128][54818] Updated weights for policy 0, policy_version 471438 (0.0026) [2024-04-27 20:28:23,609][54818] Updated weights for policy 0, policy_version 471448 (0.0028) [2024-04-27 20:28:24,253][54587] Fps is (10 sec: 50789.8, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 7724220416. Throughput: 0: 55583.3. Samples: 629496940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:24,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 20:28:26,153][54818] Updated weights for policy 0, policy_version 471458 (0.0024) [2024-04-27 20:28:29,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7724515328. Throughput: 0: 55615.4. Samples: 629657820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:29,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 20:28:29,580][54818] Updated weights for policy 0, policy_version 471468 (0.0031) [2024-04-27 20:28:32,023][54818] Updated weights for policy 0, policy_version 471478 (0.0028) [2024-04-27 20:28:32,707][54798] Signal inference workers to stop experience collection... (9100 times) [2024-04-27 20:28:32,711][54798] Signal inference workers to resume experience collection... (9100 times) [2024-04-27 20:28:32,736][54818] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-04-27 20:28:32,736][54818] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-04-27 20:28:34,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7724810240. Throughput: 0: 55583.0. Samples: 629992520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:34,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 20:28:35,326][54818] Updated weights for policy 0, policy_version 471488 (0.0029) [2024-04-27 20:28:37,715][54818] Updated weights for policy 0, policy_version 471498 (0.0031) [2024-04-27 20:28:39,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55978.9, 300 sec: 55594.5). Total num frames: 7725088768. Throughput: 0: 55528.6. Samples: 630328300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:39,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 20:28:41,161][54818] Updated weights for policy 0, policy_version 471508 (0.0033) [2024-04-27 20:28:43,765][54818] Updated weights for policy 0, policy_version 471518 (0.0028) [2024-04-27 20:28:44,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7725367296. Throughput: 0: 55599.5. Samples: 630499960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:44,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 20:28:47,066][54818] Updated weights for policy 0, policy_version 471528 (0.0029) [2024-04-27 20:28:49,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7725629440. Throughput: 0: 55732.5. Samples: 630836180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:49,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 20:28:49,527][54818] Updated weights for policy 0, policy_version 471538 (0.0029) [2024-04-27 20:28:52,976][54818] Updated weights for policy 0, policy_version 471548 (0.0032) [2024-04-27 20:28:54,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7725891584. Throughput: 0: 55846.3. Samples: 631172220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:54,253][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 20:28:55,359][54818] Updated weights for policy 0, policy_version 471558 (0.0026) [2024-04-27 20:28:58,808][54818] Updated weights for policy 0, policy_version 471568 (0.0023) [2024-04-27 20:28:59,253][54587] Fps is (10 sec: 54066.1, 60 sec: 55159.3, 300 sec: 55539.0). Total num frames: 7726170112. Throughput: 0: 55373.0. Samples: 631325420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:28:59,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 20:29:01,390][54818] Updated weights for policy 0, policy_version 471578 (0.0034) [2024-04-27 20:29:04,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7726465024. Throughput: 0: 55444.2. Samples: 631659160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:29:04,253][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 20:29:04,585][54818] Updated weights for policy 0, policy_version 471588 (0.0023) [2024-04-27 20:29:07,333][54818] Updated weights for policy 0, policy_version 471598 (0.0030) [2024-04-27 20:29:09,253][54587] Fps is (10 sec: 58982.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7726759936. Throughput: 0: 55447.5. Samples: 631992080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:29:09,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:29:10,414][54818] Updated weights for policy 0, policy_version 471608 (0.0030) [2024-04-27 20:29:13,382][54818] Updated weights for policy 0, policy_version 471618 (0.0029) [2024-04-27 20:29:14,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7727054848. Throughput: 0: 55811.2. Samples: 632169320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 20:29:14,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 20:29:16,333][54818] Updated weights for policy 0, policy_version 471628 (0.0026) [2024-04-27 20:29:18,947][54798] Signal inference workers to stop experience collection... (9150 times) [2024-04-27 20:29:18,973][54818] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-04-27 20:29:19,033][54798] Signal inference workers to resume experience collection... 
(9150 times) [2024-04-27 20:29:19,033][54818] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-04-27 20:29:19,159][54818] Updated weights for policy 0, policy_version 471638 (0.0025) [2024-04-27 20:29:19,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7727316992. Throughput: 0: 55871.9. Samples: 632506760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:29:19,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 20:29:22,240][54818] Updated weights for policy 0, policy_version 471648 (0.0034) [2024-04-27 20:29:24,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7727579136. Throughput: 0: 55728.0. Samples: 632836060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:29:24,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 20:29:25,009][54818] Updated weights for policy 0, policy_version 471658 (0.0031) [2024-04-27 20:29:28,023][54818] Updated weights for policy 0, policy_version 471668 (0.0027) [2024-04-27 20:29:29,253][54587] Fps is (10 sec: 54068.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7727857664. Throughput: 0: 55649.5. Samples: 633004180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:29:29,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 20:29:30,859][54818] Updated weights for policy 0, policy_version 471678 (0.0034) [2024-04-27 20:29:33,993][54818] Updated weights for policy 0, policy_version 471688 (0.0025) [2024-04-27 20:29:34,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7728136192. Throughput: 0: 55628.4. Samples: 633339460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:29:34,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-27 20:29:36,706][54818] Updated weights for policy 0, policy_version 471698 (0.0026) [2024-04-27 20:29:39,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55159.3, 300 sec: 55539.0). 
Total num frames: 7728398336. Throughput: 0: 55558.9. Samples: 633672380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:29:39,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 20:29:40,086][54818] Updated weights for policy 0, policy_version 471708 (0.0033) [2024-04-27 20:29:42,497][54818] Updated weights for policy 0, policy_version 471718 (0.0031) [2024-04-27 20:29:44,253][54587] Fps is (10 sec: 58981.4, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 7728726016. Throughput: 0: 56043.6. Samples: 633847380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:29:44,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 20:29:45,769][54818] Updated weights for policy 0, policy_version 471728 (0.0036) [2024-04-27 20:29:48,584][54818] Updated weights for policy 0, policy_version 471738 (0.0031) [2024-04-27 20:29:49,253][54587] Fps is (10 sec: 60621.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7729004544. Throughput: 0: 56015.1. Samples: 634179840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:29:49,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:29:49,317][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000471742_7729020928.pth... [2024-04-27 20:29:49,372][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000470924_7715618816.pth [2024-04-27 20:29:51,680][54818] Updated weights for policy 0, policy_version 471748 (0.0034) [2024-04-27 20:29:54,253][54587] Fps is (10 sec: 54067.4, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 7729266688. Throughput: 0: 55966.2. Samples: 634510560. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:29:54,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 20:29:54,504][54818] Updated weights for policy 0, policy_version 471758 (0.0034) [2024-04-27 20:29:57,488][54818] Updated weights for policy 0, policy_version 471768 (0.0032) [2024-04-27 20:29:59,253][54587] Fps is (10 sec: 52428.0, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7729528832. Throughput: 0: 55737.2. Samples: 634677500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:29:59,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:30:00,278][54818] Updated weights for policy 0, policy_version 471778 (0.0035) [2024-04-27 20:30:03,243][54818] Updated weights for policy 0, policy_version 471788 (0.0038) [2024-04-27 20:30:04,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7729807360. Throughput: 0: 55799.4. Samples: 635017720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:30:04,253][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 20:30:06,153][54818] Updated weights for policy 0, policy_version 471798 (0.0024) [2024-04-27 20:30:08,872][54798] Signal inference workers to stop experience collection... (9200 times) [2024-04-27 20:30:08,872][54798] Signal inference workers to resume experience collection... (9200 times) [2024-04-27 20:30:08,886][54818] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-04-27 20:30:08,922][54818] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-04-27 20:30:08,985][54818] Updated weights for policy 0, policy_version 471808 (0.0026) [2024-04-27 20:30:09,253][54587] Fps is (10 sec: 57345.2, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 7730102272. Throughput: 0: 55925.0. Samples: 635352680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:30:09,253][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 20:30:12,098][54818] Updated weights for policy 0, policy_version 471818 (0.0032) [2024-04-27 20:30:14,253][54587] Fps is (10 sec: 54066.6, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 7730348032. Throughput: 0: 55807.0. Samples: 635515500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:30:14,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 20:30:14,877][54818] Updated weights for policy 0, policy_version 471828 (0.0025) [2024-04-27 20:30:17,829][54818] Updated weights for policy 0, policy_version 471838 (0.0027) [2024-04-27 20:30:19,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55432.8, 300 sec: 55594.5). Total num frames: 7730642944. Throughput: 0: 55699.1. Samples: 635845920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:30:19,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 20:30:21,112][54818] Updated weights for policy 0, policy_version 471848 (0.0028) [2024-04-27 20:30:23,546][54818] Updated weights for policy 0, policy_version 471858 (0.0043) [2024-04-27 20:30:24,253][54587] Fps is (10 sec: 60620.3, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7730954240. Throughput: 0: 55601.8. Samples: 636174460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:30:24,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 20:30:26,843][54818] Updated weights for policy 0, policy_version 471868 (0.0031) [2024-04-27 20:30:29,253][54587] Fps is (10 sec: 58982.1, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 7731232768. Throughput: 0: 55635.7. Samples: 636350980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:30:29,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:30:29,416][54818] Updated weights for policy 0, policy_version 471878 (0.0028) [2024-04-27 20:30:32,722][54818] Updated weights for policy 0, policy_version 471888 (0.0029) [2024-04-27 20:30:34,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7731494912. Throughput: 0: 55708.0. Samples: 636686700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:30:34,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 20:30:35,375][54818] Updated weights for policy 0, policy_version 471898 (0.0035) [2024-04-27 20:30:38,450][54818] Updated weights for policy 0, policy_version 471908 (0.0027) [2024-04-27 20:30:39,253][54587] Fps is (10 sec: 54067.1, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 7731773440. Throughput: 0: 55727.6. Samples: 637018300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 20:30:39,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 20:30:41,278][54818] Updated weights for policy 0, policy_version 471918 (0.0030) [2024-04-27 20:30:44,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7732051968. Throughput: 0: 55642.3. Samples: 637181400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:30:44,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 20:30:44,344][54818] Updated weights for policy 0, policy_version 471928 (0.0027) [2024-04-27 20:30:47,210][54818] Updated weights for policy 0, policy_version 471938 (0.0030) [2024-04-27 20:30:49,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7732314112. Throughput: 0: 55524.8. Samples: 637516340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:30:49,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 20:30:50,208][54818] Updated weights for policy 0, policy_version 471948 (0.0027) [2024-04-27 20:30:53,003][54818] Updated weights for policy 0, policy_version 471958 (0.0027) [2024-04-27 20:30:54,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7732592640. Throughput: 0: 55634.2. Samples: 637856220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:30:54,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:30:56,054][54818] Updated weights for policy 0, policy_version 471968 (0.0025) [2024-04-27 20:30:58,816][54818] Updated weights for policy 0, policy_version 471978 (0.0029) [2024-04-27 20:30:59,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7732887552. Throughput: 0: 55761.9. Samples: 638024780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:30:59,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 20:31:01,984][54818] Updated weights for policy 0, policy_version 471988 (0.0028) [2024-04-27 20:31:04,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7733166080. Throughput: 0: 55854.6. Samples: 638359380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:04,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 20:31:04,753][54818] Updated weights for policy 0, policy_version 471998 (0.0035) [2024-04-27 20:31:07,760][54818] Updated weights for policy 0, policy_version 472008 (0.0027) [2024-04-27 20:31:09,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7733428224. Throughput: 0: 56049.0. Samples: 638696660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:09,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 20:31:10,510][54818] Updated weights for policy 0, policy_version 472018 (0.0027) [2024-04-27 20:31:13,717][54818] Updated weights for policy 0, policy_version 472028 (0.0029) [2024-04-27 20:31:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 7733723136. Throughput: 0: 55881.8. Samples: 638865660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 20:31:16,360][54818] Updated weights for policy 0, policy_version 472038 (0.0025) [2024-04-27 20:31:18,305][54798] Signal inference workers to stop experience collection... (9250 times) [2024-04-27 20:31:18,356][54818] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-04-27 20:31:18,362][54798] Signal inference workers to resume experience collection... (9250 times) [2024-04-27 20:31:18,371][54818] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-04-27 20:31:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7734001664. Throughput: 0: 55802.1. Samples: 639197800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:19,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 20:31:19,603][54818] Updated weights for policy 0, policy_version 472048 (0.0027) [2024-04-27 20:31:22,485][54818] Updated weights for policy 0, policy_version 472058 (0.0028) [2024-04-27 20:31:24,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 7734263808. Throughput: 0: 55793.4. Samples: 639529000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:24,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 20:31:25,390][54818] Updated weights for policy 0, policy_version 472068 (0.0029) [2024-04-27 20:31:28,707][54818] Updated weights for policy 0, policy_version 472078 (0.0030) [2024-04-27 20:31:29,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 7734542336. Throughput: 0: 55803.2. Samples: 639692540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:29,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 20:31:31,398][54818] Updated weights for policy 0, policy_version 472088 (0.0028) [2024-04-27 20:31:34,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7734837248. Throughput: 0: 55727.1. Samples: 640024060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:34,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:31:34,521][54818] Updated weights for policy 0, policy_version 472098 (0.0030) [2024-04-27 20:31:37,257][54818] Updated weights for policy 0, policy_version 472108 (0.0027) [2024-04-27 20:31:39,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7735115776. Throughput: 0: 55728.0. Samples: 640363980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:39,253][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 20:31:40,598][54818] Updated weights for policy 0, policy_version 472118 (0.0030) [2024-04-27 20:31:43,109][54818] Updated weights for policy 0, policy_version 472128 (0.0029) [2024-04-27 20:31:44,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7735377920. Throughput: 0: 55605.2. Samples: 640527020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:44,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 20:31:46,610][54818] Updated weights for policy 0, policy_version 472138 (0.0029) [2024-04-27 20:31:49,033][54818] Updated weights for policy 0, policy_version 472148 (0.0028) [2024-04-27 20:31:49,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 7735672832. Throughput: 0: 55501.2. Samples: 640856940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:49,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 20:31:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000472148_7735672832.pth... [2024-04-27 20:31:49,330][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000471331_7722287104.pth [2024-04-27 20:31:52,373][54818] Updated weights for policy 0, policy_version 472158 (0.0035) [2024-04-27 20:31:54,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7735951360. Throughput: 0: 55389.3. Samples: 641189180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:54,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 20:31:54,836][54818] Updated weights for policy 0, policy_version 472168 (0.0029) [2024-04-27 20:31:58,133][54818] Updated weights for policy 0, policy_version 472178 (0.0030) [2024-04-27 20:31:59,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7736213504. Throughput: 0: 55329.7. Samples: 641355500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:31:59,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 20:32:00,810][54818] Updated weights for policy 0, policy_version 472188 (0.0027) [2024-04-27 20:32:04,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 7736475648. Throughput: 0: 55372.1. Samples: 641689540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-04-27 20:32:04,253][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 20:32:04,305][54818] Updated weights for policy 0, policy_version 472198 (0.0028) [2024-04-27 20:32:06,596][54818] Updated weights for policy 0, policy_version 472208 (0.0030) [2024-04-27 20:32:09,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7736770560. Throughput: 0: 55438.6. Samples: 642023740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:09,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 20:32:10,088][54818] Updated weights for policy 0, policy_version 472218 (0.0034) [2024-04-27 20:32:11,153][54798] Signal inference workers to stop experience collection... (9300 times) [2024-04-27 20:32:11,153][54798] Signal inference workers to resume experience collection... (9300 times) [2024-04-27 20:32:11,166][54818] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-04-27 20:32:11,166][54818] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-04-27 20:32:12,441][54818] Updated weights for policy 0, policy_version 472228 (0.0026) [2024-04-27 20:32:14,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7737032704. Throughput: 0: 55397.8. Samples: 642185440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:14,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 20:32:15,980][54818] Updated weights for policy 0, policy_version 472238 (0.0029) [2024-04-27 20:32:18,341][54818] Updated weights for policy 0, policy_version 472248 (0.0028) [2024-04-27 20:32:19,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7737311232. Throughput: 0: 55371.3. Samples: 642515760. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:19,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 20:32:21,783][54818] Updated weights for policy 0, policy_version 472258 (0.0030) [2024-04-27 20:32:24,253][54587] Fps is (10 sec: 58981.5, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 7737622528. Throughput: 0: 55318.5. Samples: 642853320. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:24,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 20:32:24,445][54818] Updated weights for policy 0, policy_version 472268 (0.0031) [2024-04-27 20:32:27,559][54818] Updated weights for policy 0, policy_version 472278 (0.0031) [2024-04-27 20:32:29,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7737901056. Throughput: 0: 55417.8. Samples: 643020820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:29,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 20:32:30,307][54818] Updated weights for policy 0, policy_version 472288 (0.0032) [2024-04-27 20:32:33,412][54818] Updated weights for policy 0, policy_version 472298 (0.0026) [2024-04-27 20:32:34,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 7738163200. Throughput: 0: 55516.2. Samples: 643355160. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:34,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 20:32:36,679][54818] Updated weights for policy 0, policy_version 472308 (0.0027) [2024-04-27 20:32:39,123][54818] Updated weights for policy 0, policy_version 472318 (0.0030) [2024-04-27 20:32:39,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7738458112. Throughput: 0: 55658.4. Samples: 643693800. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:39,253][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 20:32:42,712][54818] Updated weights for policy 0, policy_version 472328 (0.0031) [2024-04-27 20:32:44,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7738720256. Throughput: 0: 55501.0. Samples: 643853040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:44,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 20:32:45,107][54818] Updated weights for policy 0, policy_version 472338 (0.0027) [2024-04-27 20:32:48,421][54818] Updated weights for policy 0, policy_version 472348 (0.0026) [2024-04-27 20:32:49,253][54587] Fps is (10 sec: 52428.0, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7738982400. Throughput: 0: 55561.6. Samples: 644189820. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:49,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:32:50,932][54818] Updated weights for policy 0, policy_version 472358 (0.0030) [2024-04-27 20:32:54,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7739260928. Throughput: 0: 55515.2. Samples: 644521920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:54,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 20:32:54,273][54818] Updated weights for policy 0, policy_version 472368 (0.0032) [2024-04-27 20:32:57,192][54818] Updated weights for policy 0, policy_version 472378 (0.0027) [2024-04-27 20:32:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7739555840. Throughput: 0: 55713.1. Samples: 644692540. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:32:59,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 20:33:00,132][54818] Updated weights for policy 0, policy_version 472388 (0.0037) [2024-04-27 20:33:02,966][54818] Updated weights for policy 0, policy_version 472398 (0.0031) [2024-04-27 20:33:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7739834368. Throughput: 0: 55665.6. Samples: 645020720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:33:04,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:33:05,970][54818] Updated weights for policy 0, policy_version 472408 (0.0034) [2024-04-27 20:33:08,734][54818] Updated weights for policy 0, policy_version 472418 (0.0028) [2024-04-27 20:33:09,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7740096512. Throughput: 0: 55620.6. Samples: 645356240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:33:09,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 20:33:11,871][54818] Updated weights for policy 0, policy_version 472428 (0.0026) [2024-04-27 20:33:14,253][54587] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 7740407808. Throughput: 0: 55653.7. Samples: 645525240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:33:14,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 20:33:14,470][54818] Updated weights for policy 0, policy_version 472438 (0.0026) [2024-04-27 20:33:17,916][54818] Updated weights for policy 0, policy_version 472448 (0.0028) [2024-04-27 20:33:19,235][54798] Signal inference workers to stop experience collection... (9350 times) [2024-04-27 20:33:19,235][54798] Signal inference workers to resume experience collection... 
(9350 times) [2024-04-27 20:33:19,252][54818] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-04-27 20:33:19,252][54818] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-04-27 20:33:19,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7740653568. Throughput: 0: 55705.6. Samples: 645861920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:33:19,255][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 20:33:20,326][54818] Updated weights for policy 0, policy_version 472458 (0.0025) [2024-04-27 20:33:23,956][54818] Updated weights for policy 0, policy_version 472468 (0.0032) [2024-04-27 20:33:24,253][54587] Fps is (10 sec: 50790.9, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 7740915712. Throughput: 0: 55555.0. Samples: 646193780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:33:24,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-27 20:33:26,447][54818] Updated weights for policy 0, policy_version 472478 (0.0023) [2024-04-27 20:33:29,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55594.6). Total num frames: 7741210624. Throughput: 0: 55552.5. Samples: 646352900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-27 20:33:29,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 20:33:29,703][54818] Updated weights for policy 0, policy_version 472488 (0.0034) [2024-04-27 20:33:32,157][54818] Updated weights for policy 0, policy_version 472498 (0.0030) [2024-04-27 20:33:34,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7741489152. Throughput: 0: 55606.4. Samples: 646692100. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:33:34,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 20:33:35,452][54818] Updated weights for policy 0, policy_version 472508 (0.0029) [2024-04-27 20:33:38,314][54818] Updated weights for policy 0, policy_version 472518 (0.0029) [2024-04-27 20:33:39,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 7741784064. Throughput: 0: 55607.4. Samples: 647024260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:33:39,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:33:41,462][54818] Updated weights for policy 0, policy_version 472528 (0.0033) [2024-04-27 20:33:44,010][54818] Updated weights for policy 0, policy_version 472538 (0.0031) [2024-04-27 20:33:44,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7742062592. Throughput: 0: 55552.0. Samples: 647192380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:33:44,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 20:33:47,342][54818] Updated weights for policy 0, policy_version 472548 (0.0030) [2024-04-27 20:33:49,253][54587] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 7742357504. Throughput: 0: 55741.5. Samples: 647529080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:33:49,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 20:33:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000472556_7742357504.pth... [2024-04-27 20:33:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000471742_7729020928.pth [2024-04-27 20:33:49,936][54818] Updated weights for policy 0, policy_version 472558 (0.0029) [2024-04-27 20:33:53,236][54818] Updated weights for policy 0, policy_version 472568 (0.0029) [2024-04-27 20:33:54,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55761.2). 
Total num frames: 7742619648. Throughput: 0: 55727.5. Samples: 647863980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:33:54,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 20:33:55,910][54818] Updated weights for policy 0, policy_version 472578 (0.0028) [2024-04-27 20:33:59,152][54818] Updated weights for policy 0, policy_version 472588 (0.0027) [2024-04-27 20:33:59,253][54587] Fps is (10 sec: 52428.0, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7742881792. Throughput: 0: 55608.0. Samples: 648027600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:33:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 20:34:01,682][54818] Updated weights for policy 0, policy_version 472598 (0.0029) [2024-04-27 20:34:04,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7743160320. Throughput: 0: 55473.0. Samples: 648358200. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:04,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 20:34:05,135][54818] Updated weights for policy 0, policy_version 472608 (0.0026) [2024-04-27 20:34:07,443][54818] Updated weights for policy 0, policy_version 472618 (0.0029) [2024-04-27 20:34:09,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7743438848. Throughput: 0: 55482.5. Samples: 648690500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:09,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 20:34:11,317][54818] Updated weights for policy 0, policy_version 472628 (0.0026) [2024-04-27 20:34:13,769][54818] Updated weights for policy 0, policy_version 472638 (0.0026) [2024-04-27 20:34:14,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 7743717376. Throughput: 0: 55661.4. Samples: 648857660. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:14,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 20:34:17,074][54818] Updated weights for policy 0, policy_version 472648 (0.0036) [2024-04-27 20:34:19,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 7743995904. Throughput: 0: 55463.8. Samples: 649187980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:19,262][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 20:34:19,671][54818] Updated weights for policy 0, policy_version 472658 (0.0025) [2024-04-27 20:34:22,799][54818] Updated weights for policy 0, policy_version 472668 (0.0033) [2024-04-27 20:34:24,253][54587] Fps is (10 sec: 58980.8, 60 sec: 56524.6, 300 sec: 55761.1). Total num frames: 7744307200. Throughput: 0: 55508.8. Samples: 649522160. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:24,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 20:34:25,396][54818] Updated weights for policy 0, policy_version 472678 (0.0028) [2024-04-27 20:34:27,607][54798] Signal inference workers to stop experience collection... (9400 times) [2024-04-27 20:34:27,607][54798] Signal inference workers to resume experience collection... (9400 times) [2024-04-27 20:34:27,634][54818] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-04-27 20:34:27,634][54818] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-04-27 20:34:28,705][54818] Updated weights for policy 0, policy_version 472688 (0.0031) [2024-04-27 20:34:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7744569344. Throughput: 0: 55634.3. Samples: 649695920. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:29,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 20:34:31,341][54818] Updated weights for policy 0, policy_version 472698 (0.0033) [2024-04-27 20:34:34,253][54587] Fps is (10 sec: 52429.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7744831488. Throughput: 0: 55575.9. Samples: 650030000. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:34,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:34:34,605][54818] Updated weights for policy 0, policy_version 472708 (0.0033) [2024-04-27 20:34:37,401][54818] Updated weights for policy 0, policy_version 472718 (0.0028) [2024-04-27 20:34:39,253][54587] Fps is (10 sec: 52428.4, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 7745093632. Throughput: 0: 55577.7. Samples: 650364980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:39,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 20:34:40,462][54818] Updated weights for policy 0, policy_version 472728 (0.0034) [2024-04-27 20:34:43,171][54818] Updated weights for policy 0, policy_version 472738 (0.0029) [2024-04-27 20:34:44,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7745388544. Throughput: 0: 55267.2. Samples: 650514620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:44,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 20:34:46,334][54818] Updated weights for policy 0, policy_version 472748 (0.0032) [2024-04-27 20:34:48,947][54818] Updated weights for policy 0, policy_version 472758 (0.0031) [2024-04-27 20:34:49,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7745667072. Throughput: 0: 55404.3. Samples: 650851400. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:49,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:34:52,240][54818] Updated weights for policy 0, policy_version 472768 (0.0030) [2024-04-27 20:34:54,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7745945600. Throughput: 0: 55336.0. Samples: 651180620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-27 20:34:54,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:34:54,891][54818] Updated weights for policy 0, policy_version 472778 (0.0027) [2024-04-27 20:34:58,247][54818] Updated weights for policy 0, policy_version 472788 (0.0027) [2024-04-27 20:34:59,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7746240512. Throughput: 0: 55572.3. Samples: 651358420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:34:59,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 20:35:01,190][54818] Updated weights for policy 0, policy_version 472798 (0.0026) [2024-04-27 20:35:03,980][54818] Updated weights for policy 0, policy_version 472808 (0.0026) [2024-04-27 20:35:04,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7746502656. Throughput: 0: 55639.2. Samples: 651691740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:04,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 20:35:07,229][54818] Updated weights for policy 0, policy_version 472818 (0.0031) [2024-04-27 20:35:09,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7746764800. Throughput: 0: 55612.7. Samples: 652024720. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:09,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 20:35:09,880][54818] Updated weights for policy 0, policy_version 472828 (0.0032) [2024-04-27 20:35:12,974][54818] Updated weights for policy 0, policy_version 472838 (0.0030) [2024-04-27 20:35:14,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7747026944. Throughput: 0: 55264.5. Samples: 652182820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:14,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 20:35:15,647][54818] Updated weights for policy 0, policy_version 472848 (0.0031) [2024-04-27 20:35:18,737][54818] Updated weights for policy 0, policy_version 472858 (0.0030) [2024-04-27 20:35:19,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 7747305472. Throughput: 0: 55259.6. Samples: 652516680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:19,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 20:35:21,763][54818] Updated weights for policy 0, policy_version 472868 (0.0028) [2024-04-27 20:35:24,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55159.7, 300 sec: 55539.0). Total num frames: 7747616768. Throughput: 0: 55259.2. Samples: 652851640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:24,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 20:35:24,712][54818] Updated weights for policy 0, policy_version 472878 (0.0026) [2024-04-27 20:35:27,619][54798] Signal inference workers to stop experience collection... (9450 times) [2024-04-27 20:35:27,645][54818] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-04-27 20:35:27,681][54798] Signal inference workers to resume experience collection... 
(9450 times) [2024-04-27 20:35:27,681][54818] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-04-27 20:35:27,683][54818] Updated weights for policy 0, policy_version 472888 (0.0023) [2024-04-27 20:35:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7747911680. Throughput: 0: 55727.2. Samples: 653022340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:29,253][54587] Avg episode reward: [(0, '0.473')] [2024-04-27 20:35:30,678][54818] Updated weights for policy 0, policy_version 472898 (0.0033) [2024-04-27 20:35:33,504][54818] Updated weights for policy 0, policy_version 472908 (0.0027) [2024-04-27 20:35:34,254][54587] Fps is (10 sec: 57339.3, 60 sec: 55977.9, 300 sec: 55649.9). Total num frames: 7748190208. Throughput: 0: 55762.2. Samples: 653360740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:34,255][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 20:35:36,384][54818] Updated weights for policy 0, policy_version 472918 (0.0031) [2024-04-27 20:35:39,232][54818] Updated weights for policy 0, policy_version 472928 (0.0030) [2024-04-27 20:35:39,253][54587] Fps is (10 sec: 54066.0, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7748452352. Throughput: 0: 55707.5. Samples: 653687460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:39,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 20:35:42,687][54818] Updated weights for policy 0, policy_version 472938 (0.0026) [2024-04-27 20:35:44,253][54587] Fps is (10 sec: 52433.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7748714496. Throughput: 0: 55584.1. Samples: 653859700. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:44,253][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 20:35:45,181][54818] Updated weights for policy 0, policy_version 472948 (0.0025) [2024-04-27 20:35:48,633][54818] Updated weights for policy 0, policy_version 472958 (0.0026) [2024-04-27 20:35:49,253][54587] Fps is (10 sec: 52429.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7748976640. Throughput: 0: 55595.1. Samples: 654193520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:49,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 20:35:49,355][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000472961_7748993024.pth... [2024-04-27 20:35:49,400][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000472148_7735672832.pth [2024-04-27 20:35:51,075][54818] Updated weights for policy 0, policy_version 472968 (0.0033) [2024-04-27 20:35:54,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.6, 300 sec: 55483.4). Total num frames: 7749255168. Throughput: 0: 55576.0. Samples: 654525640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 20:35:54,510][54818] Updated weights for policy 0, policy_version 472978 (0.0028) [2024-04-27 20:35:56,919][54818] Updated weights for policy 0, policy_version 472988 (0.0024) [2024-04-27 20:35:59,253][54587] Fps is (10 sec: 55705.6, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 7749533696. Throughput: 0: 55635.6. Samples: 654686420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:35:59,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 20:36:00,246][54818] Updated weights for policy 0, policy_version 472998 (0.0026) [2024-04-27 20:36:02,792][54818] Updated weights for policy 0, policy_version 473008 (0.0028) [2024-04-27 20:36:04,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55705.5, 300 sec: 55650.1). 
Total num frames: 7749844992. Throughput: 0: 55621.2. Samples: 655019640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:36:04,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 20:36:06,041][54818] Updated weights for policy 0, policy_version 473018 (0.0028) [2024-04-27 20:36:08,630][54818] Updated weights for policy 0, policy_version 473028 (0.0029) [2024-04-27 20:36:09,253][54587] Fps is (10 sec: 58981.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7750123520. Throughput: 0: 55526.1. Samples: 655350320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:36:09,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 20:36:11,855][54818] Updated weights for policy 0, policy_version 473038 (0.0033) [2024-04-27 20:36:14,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 7750385664. Throughput: 0: 55684.2. Samples: 655528140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:36:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 20:36:14,621][54818] Updated weights for policy 0, policy_version 473048 (0.0034) [2024-04-27 20:36:17,637][54818] Updated weights for policy 0, policy_version 473058 (0.0025) [2024-04-27 20:36:19,253][54587] Fps is (10 sec: 55706.0, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 7750680576. Throughput: 0: 55487.6. Samples: 655857640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-27 20:36:19,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 20:36:20,384][54818] Updated weights for policy 0, policy_version 473068 (0.0034) [2024-04-27 20:36:23,768][54818] Updated weights for policy 0, policy_version 473078 (0.0033) [2024-04-27 20:36:24,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7750926336. Throughput: 0: 55780.2. Samples: 656197560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:36:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 20:36:26,109][54818] Updated weights for policy 0, policy_version 473088 (0.0030) [2024-04-27 20:36:29,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.4, 300 sec: 55483.5). Total num frames: 7751204864. Throughput: 0: 55454.2. Samples: 656355140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:36:29,253][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 20:36:29,796][54818] Updated weights for policy 0, policy_version 473098 (0.0032) [2024-04-27 20:36:30,144][54798] Signal inference workers to stop experience collection... (9500 times) [2024-04-27 20:36:30,187][54818] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-04-27 20:36:30,203][54798] Signal inference workers to resume experience collection... (9500 times) [2024-04-27 20:36:30,204][54818] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-04-27 20:36:32,070][54818] Updated weights for policy 0, policy_version 473108 (0.0027) [2024-04-27 20:36:34,253][54587] Fps is (10 sec: 55705.2, 60 sec: 54887.1, 300 sec: 55483.4). Total num frames: 7751483392. Throughput: 0: 55315.4. Samples: 656682720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:36:34,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 20:36:35,766][54818] Updated weights for policy 0, policy_version 473118 (0.0026) [2024-04-27 20:36:37,966][54818] Updated weights for policy 0, policy_version 473128 (0.0031) [2024-04-27 20:36:39,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7751778304. Throughput: 0: 55343.6. Samples: 657016100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:36:39,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 20:36:41,702][54818] Updated weights for policy 0, policy_version 473138 (0.0026) [2024-04-27 20:36:43,822][54818] Updated weights for policy 0, policy_version 473148 (0.0024) [2024-04-27 20:36:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7752073216. Throughput: 0: 55885.2. Samples: 657201260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:36:44,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 20:36:47,399][54818] Updated weights for policy 0, policy_version 473158 (0.0032) [2024-04-27 20:36:49,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7752335360. Throughput: 0: 55841.0. Samples: 657532480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:36:49,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 20:36:49,766][54818] Updated weights for policy 0, policy_version 473168 (0.0025) [2024-04-27 20:36:53,155][54818] Updated weights for policy 0, policy_version 473178 (0.0029) [2024-04-27 20:36:54,253][54587] Fps is (10 sec: 52429.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7752597504. Throughput: 0: 55864.7. Samples: 657864220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:36:54,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:36:55,501][54818] Updated weights for policy 0, policy_version 473188 (0.0033) [2024-04-27 20:36:58,979][54818] Updated weights for policy 0, policy_version 473198 (0.0032) [2024-04-27 20:36:59,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7752876032. Throughput: 0: 55471.3. Samples: 658024340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:36:59,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 20:37:01,304][54818] Updated weights for policy 0, policy_version 473208 (0.0028) [2024-04-27 20:37:04,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7753154560. Throughput: 0: 55764.1. Samples: 658367020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:37:04,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 20:37:05,124][54818] Updated weights for policy 0, policy_version 473218 (0.0035) [2024-04-27 20:37:07,259][54818] Updated weights for policy 0, policy_version 473228 (0.0029) [2024-04-27 20:37:09,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 7753433088. Throughput: 0: 55515.2. Samples: 658695740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:37:09,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 20:37:11,070][54818] Updated weights for policy 0, policy_version 473238 (0.0031) [2024-04-27 20:37:13,212][54818] Updated weights for policy 0, policy_version 473248 (0.0031) [2024-04-27 20:37:14,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 7753728000. Throughput: 0: 55637.2. Samples: 658858820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:37:14,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 20:37:16,899][54818] Updated weights for policy 0, policy_version 473258 (0.0026) [2024-04-27 20:37:19,090][54818] Updated weights for policy 0, policy_version 473268 (0.0034) [2024-04-27 20:37:19,253][54587] Fps is (10 sec: 60619.6, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 7754039296. Throughput: 0: 55744.8. Samples: 659191240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:37:19,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-27 20:37:22,821][54818] Updated weights for policy 0, policy_version 473278 (0.0027) [2024-04-27 20:37:24,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7754268672. Throughput: 0: 55751.7. Samples: 659524920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:37:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 20:37:24,533][54798] Signal inference workers to stop experience collection... (9550 times) [2024-04-27 20:37:24,533][54798] Signal inference workers to resume experience collection... (9550 times) [2024-04-27 20:37:24,555][54818] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-04-27 20:37:24,555][54818] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-04-27 20:37:24,958][54818] Updated weights for policy 0, policy_version 473288 (0.0031) [2024-04-27 20:37:28,780][54818] Updated weights for policy 0, policy_version 473298 (0.0032) [2024-04-27 20:37:29,253][54587] Fps is (10 sec: 50791.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7754547200. Throughput: 0: 55410.8. Samples: 659694740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:37:29,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:37:31,025][54818] Updated weights for policy 0, policy_version 473308 (0.0029) [2024-04-27 20:37:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.7, 300 sec: 55427.9). Total num frames: 7754809344. Throughput: 0: 55365.5. Samples: 660023920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:37:34,253][54587] Avg episode reward: [(0, '0.685')] [2024-04-27 20:37:34,561][54818] Updated weights for policy 0, policy_version 473318 (0.0028) [2024-04-27 20:37:36,817][54818] Updated weights for policy 0, policy_version 473328 (0.0028) [2024-04-27 20:37:39,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7755104256. Throughput: 0: 55550.2. Samples: 660363980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:37:39,254][54587] Avg episode reward: [(0, '0.477')] [2024-04-27 20:37:40,358][54818] Updated weights for policy 0, policy_version 473338 (0.0030) [2024-04-27 20:37:42,623][54818] Updated weights for policy 0, policy_version 473348 (0.0025) [2024-04-27 20:37:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 7755382784. Throughput: 0: 55615.2. Samples: 660527020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-04-27 20:37:44,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 20:37:46,176][54818] Updated weights for policy 0, policy_version 473358 (0.0027) [2024-04-27 20:37:48,445][54818] Updated weights for policy 0, policy_version 473368 (0.0029) [2024-04-27 20:37:49,253][54587] Fps is (10 sec: 57342.7, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 7755677696. Throughput: 0: 55442.0. Samples: 660861920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:37:49,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 20:37:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000473369_7755677696.pth... [2024-04-27 20:37:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000472556_7742357504.pth [2024-04-27 20:37:52,203][54818] Updated weights for policy 0, policy_version 473378 (0.0034) [2024-04-27 20:37:54,253][54587] Fps is (10 sec: 58981.7, 60 sec: 56251.7, 300 sec: 55650.1). 
Total num frames: 7755972608. Throughput: 0: 55369.3. Samples: 661187360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:37:54,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 20:37:54,393][54818] Updated weights for policy 0, policy_version 473388 (0.0027) [2024-04-27 20:37:57,984][54818] Updated weights for policy 0, policy_version 473398 (0.0032) [2024-04-27 20:37:59,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7756218368. Throughput: 0: 55522.2. Samples: 661357320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:37:59,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 20:38:00,381][54818] Updated weights for policy 0, policy_version 473408 (0.0030) [2024-04-27 20:38:03,816][54818] Updated weights for policy 0, policy_version 473418 (0.0025) [2024-04-27 20:38:04,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7756496896. Throughput: 0: 55614.9. Samples: 661693900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:04,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 20:38:06,281][54818] Updated weights for policy 0, policy_version 473428 (0.0029) [2024-04-27 20:38:09,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 7756775424. Throughput: 0: 55730.6. Samples: 662032800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:09,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 20:38:09,604][54818] Updated weights for policy 0, policy_version 473438 (0.0034) [2024-04-27 20:38:12,018][54818] Updated weights for policy 0, policy_version 473448 (0.0030) [2024-04-27 20:38:14,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7757053952. Throughput: 0: 55471.4. Samples: 662190960. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:14,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 20:38:15,473][54818] Updated weights for policy 0, policy_version 473458 (0.0027) [2024-04-27 20:38:18,229][54818] Updated weights for policy 0, policy_version 473468 (0.0028) [2024-04-27 20:38:19,253][54587] Fps is (10 sec: 55705.0, 60 sec: 54886.5, 300 sec: 55650.0). Total num frames: 7757332480. Throughput: 0: 55626.9. Samples: 662527140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:19,263][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 20:38:21,286][54818] Updated weights for policy 0, policy_version 473478 (0.0024) [2024-04-27 20:38:24,235][54818] Updated weights for policy 0, policy_version 473488 (0.0034) [2024-04-27 20:38:24,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 7757627392. Throughput: 0: 55436.3. Samples: 662858620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:24,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 20:38:27,175][54818] Updated weights for policy 0, policy_version 473498 (0.0037) [2024-04-27 20:38:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 7757905920. Throughput: 0: 55750.1. Samples: 663035780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:29,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 20:38:30,135][54818] Updated weights for policy 0, policy_version 473508 (0.0034) [2024-04-27 20:38:32,506][54798] Signal inference workers to stop experience collection... (9600 times) [2024-04-27 20:38:32,538][54818] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-04-27 20:38:32,565][54798] Signal inference workers to resume experience collection... 
(9600 times) [2024-04-27 20:38:32,570][54818] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-04-27 20:38:32,922][54818] Updated weights for policy 0, policy_version 473518 (0.0028) [2024-04-27 20:38:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 7758168064. Throughput: 0: 55638.8. Samples: 663365660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:34,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:38:36,149][54818] Updated weights for policy 0, policy_version 473528 (0.0029) [2024-04-27 20:38:38,896][54818] Updated weights for policy 0, policy_version 473538 (0.0028) [2024-04-27 20:38:39,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7758446592. Throughput: 0: 55820.1. Samples: 663699260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:39,253][54587] Avg episode reward: [(0, '0.675')] [2024-04-27 20:38:42,073][54818] Updated weights for policy 0, policy_version 473548 (0.0028) [2024-04-27 20:38:44,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 7758708736. Throughput: 0: 55651.5. Samples: 663861640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 20:38:44,949][54818] Updated weights for policy 0, policy_version 473558 (0.0028) [2024-04-27 20:38:47,973][54818] Updated weights for policy 0, policy_version 473568 (0.0029) [2024-04-27 20:38:49,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7759003648. Throughput: 0: 55776.9. Samples: 664203860. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:49,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 20:38:50,710][54818] Updated weights for policy 0, policy_version 473578 (0.0033) [2024-04-27 20:38:53,757][54818] Updated weights for policy 0, policy_version 473588 (0.0025) [2024-04-27 20:38:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7759282176. Throughput: 0: 55705.2. Samples: 664539540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:54,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 20:38:56,403][54818] Updated weights for policy 0, policy_version 473598 (0.0030) [2024-04-27 20:38:59,254][54587] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7759560704. Throughput: 0: 55776.3. Samples: 664700900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:38:59,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 20:38:59,691][54818] Updated weights for policy 0, policy_version 473608 (0.0028) [2024-04-27 20:39:02,351][54818] Updated weights for policy 0, policy_version 473618 (0.0028) [2024-04-27 20:39:04,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 7759855616. Throughput: 0: 55796.0. Samples: 665037960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:39:04,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 20:39:05,374][54818] Updated weights for policy 0, policy_version 473628 (0.0033) [2024-04-27 20:39:08,430][54818] Updated weights for policy 0, policy_version 473638 (0.0035) [2024-04-27 20:39:09,253][54587] Fps is (10 sec: 55707.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7760117760. Throughput: 0: 55667.3. Samples: 665363640. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-27 20:39:09,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 20:39:11,456][54818] Updated weights for policy 0, policy_version 473648 (0.0032) [2024-04-27 20:39:14,234][54818] Updated weights for policy 0, policy_version 473658 (0.0027) [2024-04-27 20:39:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 7760412672. Throughput: 0: 55554.6. Samples: 665535740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:14,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:39:17,305][54818] Updated weights for policy 0, policy_version 473668 (0.0030) [2024-04-27 20:39:19,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7760691200. Throughput: 0: 55725.0. Samples: 665873280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:19,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 20:39:19,914][54818] Updated weights for policy 0, policy_version 473678 (0.0028) [2024-04-27 20:39:23,067][54818] Updated weights for policy 0, policy_version 473688 (0.0033) [2024-04-27 20:39:24,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7760953344. Throughput: 0: 55734.0. Samples: 666207300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:24,262][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 20:39:25,745][54818] Updated weights for policy 0, policy_version 473698 (0.0027) [2024-04-27 20:39:28,842][54818] Updated weights for policy 0, policy_version 473708 (0.0029) [2024-04-27 20:39:29,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7761248256. Throughput: 0: 55917.1. Samples: 666377900. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:29,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 20:39:32,141][54818] Updated weights for policy 0, policy_version 473718 (0.0030) [2024-04-27 20:39:34,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7761510400. Throughput: 0: 55724.5. Samples: 666711460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:34,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 20:39:34,908][54818] Updated weights for policy 0, policy_version 473728 (0.0029) [2024-04-27 20:39:37,094][54798] Signal inference workers to stop experience collection... (9650 times) [2024-04-27 20:39:37,095][54798] Signal inference workers to resume experience collection... (9650 times) [2024-04-27 20:39:37,115][54818] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-04-27 20:39:37,116][54818] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-04-27 20:39:37,850][54818] Updated weights for policy 0, policy_version 473738 (0.0027) [2024-04-27 20:39:39,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7761805312. Throughput: 0: 55812.7. Samples: 667051100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:39,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 20:39:40,774][54818] Updated weights for policy 0, policy_version 473748 (0.0026) [2024-04-27 20:39:43,516][54818] Updated weights for policy 0, policy_version 473758 (0.0025) [2024-04-27 20:39:44,253][54587] Fps is (10 sec: 57343.3, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 7762083840. Throughput: 0: 56024.1. Samples: 667221980. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:44,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 20:39:46,580][54818] Updated weights for policy 0, policy_version 473768 (0.0034) [2024-04-27 20:39:49,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7762362368. Throughput: 0: 55894.8. Samples: 667553220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:49,262][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 20:39:49,274][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000473777_7762362368.pth... [2024-04-27 20:39:49,330][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000472961_7748993024.pth [2024-04-27 20:39:49,494][54818] Updated weights for policy 0, policy_version 473778 (0.0027) [2024-04-27 20:39:52,414][54818] Updated weights for policy 0, policy_version 473788 (0.0026) [2024-04-27 20:39:54,253][54587] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 7762657280. Throughput: 0: 56079.9. Samples: 667887240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:54,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 20:39:55,354][54818] Updated weights for policy 0, policy_version 473798 (0.0031) [2024-04-27 20:39:58,365][54818] Updated weights for policy 0, policy_version 473808 (0.0026) [2024-04-27 20:39:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 7762935808. Throughput: 0: 56159.6. Samples: 668062920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:39:59,262][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 20:40:01,370][54818] Updated weights for policy 0, policy_version 473818 (0.0026) [2024-04-27 20:40:04,192][54818] Updated weights for policy 0, policy_version 473828 (0.0037) [2024-04-27 20:40:04,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55705.6). 
Total num frames: 7763197952. Throughput: 0: 55964.8. Samples: 668391700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:40:04,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 20:40:07,310][54818] Updated weights for policy 0, policy_version 473838 (0.0039) [2024-04-27 20:40:09,253][54587] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7763492864. Throughput: 0: 55978.9. Samples: 668726340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:40:09,253][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 20:40:10,062][54818] Updated weights for policy 0, policy_version 473848 (0.0032) [2024-04-27 20:40:13,145][54818] Updated weights for policy 0, policy_version 473858 (0.0031) [2024-04-27 20:40:14,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 7763755008. Throughput: 0: 55847.1. Samples: 668891020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:40:14,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 20:40:15,859][54818] Updated weights for policy 0, policy_version 473868 (0.0030) [2024-04-27 20:40:18,901][54818] Updated weights for policy 0, policy_version 473878 (0.0036) [2024-04-27 20:40:19,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7764033536. Throughput: 0: 55904.8. Samples: 669227180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:40:19,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 20:40:21,829][54818] Updated weights for policy 0, policy_version 473888 (0.0032) [2024-04-27 20:40:24,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7764312064. Throughput: 0: 55719.4. Samples: 669558480. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:40:24,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 20:40:24,863][54818] Updated weights for policy 0, policy_version 473898 (0.0028) [2024-04-27 20:40:27,599][54818] Updated weights for policy 0, policy_version 473908 (0.0027) [2024-04-27 20:40:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 55650.2). Total num frames: 7764606976. Throughput: 0: 55750.7. Samples: 669730760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:40:29,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:40:30,778][54818] Updated weights for policy 0, policy_version 473918 (0.0025) [2024-04-27 20:40:33,490][54818] Updated weights for policy 0, policy_version 473928 (0.0029) [2024-04-27 20:40:34,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7764869120. Throughput: 0: 55811.2. Samples: 670064720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 20:40:34,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 20:40:36,558][54818] Updated weights for policy 0, policy_version 473938 (0.0027) [2024-04-27 20:40:37,218][54798] Signal inference workers to stop experience collection... (9700 times) [2024-04-27 20:40:37,263][54818] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-04-27 20:40:37,268][54798] Signal inference workers to resume experience collection... (9700 times) [2024-04-27 20:40:37,276][54818] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-04-27 20:40:39,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7765147648. Throughput: 0: 55830.6. Samples: 670399620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:40:39,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 20:40:39,340][54818] Updated weights for policy 0, policy_version 473948 (0.0029) [2024-04-27 20:40:42,487][54818] Updated weights for policy 0, policy_version 473958 (0.0035) [2024-04-27 20:40:44,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 7765426176. Throughput: 0: 55625.8. Samples: 670566080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:40:44,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 20:40:45,217][54818] Updated weights for policy 0, policy_version 473968 (0.0027) [2024-04-27 20:40:48,464][54818] Updated weights for policy 0, policy_version 473978 (0.0023) [2024-04-27 20:40:49,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7765688320. Throughput: 0: 55740.4. Samples: 670900020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:40:49,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 20:40:50,946][54818] Updated weights for policy 0, policy_version 473988 (0.0029) [2024-04-27 20:40:54,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 7765966848. Throughput: 0: 55851.0. Samples: 671239640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:40:54,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 20:40:54,357][54818] Updated weights for policy 0, policy_version 473998 (0.0026) [2024-04-27 20:40:56,819][54818] Updated weights for policy 0, policy_version 474008 (0.0033) [2024-04-27 20:40:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7766245376. Throughput: 0: 55705.2. Samples: 671397760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:40:59,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 20:41:00,140][54818] Updated weights for policy 0, policy_version 474018 (0.0026) [2024-04-27 20:41:02,770][54818] Updated weights for policy 0, policy_version 474028 (0.0025) [2024-04-27 20:41:04,253][54587] Fps is (10 sec: 58983.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7766556672. Throughput: 0: 55714.9. Samples: 671734340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:04,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 20:41:05,865][54818] Updated weights for policy 0, policy_version 474038 (0.0025) [2024-04-27 20:41:08,530][54818] Updated weights for policy 0, policy_version 474048 (0.0032) [2024-04-27 20:41:09,253][54587] Fps is (10 sec: 58982.7, 60 sec: 55705.5, 300 sec: 55761.2). Total num frames: 7766835200. Throughput: 0: 55804.9. Samples: 672069700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:09,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 20:41:11,901][54818] Updated weights for policy 0, policy_version 474058 (0.0027) [2024-04-27 20:41:14,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7767113728. Throughput: 0: 55928.6. Samples: 672247540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:14,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 20:41:14,363][54818] Updated weights for policy 0, policy_version 474068 (0.0026) [2024-04-27 20:41:17,728][54818] Updated weights for policy 0, policy_version 474078 (0.0029) [2024-04-27 20:41:19,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7767392256. Throughput: 0: 55883.8. Samples: 672579500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:19,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:41:20,352][54818] Updated weights for policy 0, policy_version 474088 (0.0030) [2024-04-27 20:41:23,612][54818] Updated weights for policy 0, policy_version 474098 (0.0025) [2024-04-27 20:41:24,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7767638016. Throughput: 0: 55713.8. Samples: 672906740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:24,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-27 20:41:26,239][54818] Updated weights for policy 0, policy_version 474108 (0.0031) [2024-04-27 20:41:29,253][54587] Fps is (10 sec: 54068.2, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 7767932928. Throughput: 0: 55670.8. Samples: 673071260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:29,253][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 20:41:29,330][54818] Updated weights for policy 0, policy_version 474118 (0.0030) [2024-04-27 20:41:30,657][54798] Signal inference workers to stop experience collection... (9750 times) [2024-04-27 20:41:30,663][54798] Signal inference workers to resume experience collection... (9750 times) [2024-04-27 20:41:30,677][54818] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-04-27 20:41:30,678][54818] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-04-27 20:41:32,036][54818] Updated weights for policy 0, policy_version 474128 (0.0027) [2024-04-27 20:41:34,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7768211456. Throughput: 0: 55781.4. Samples: 673410180. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:34,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 20:41:35,192][54818] Updated weights for policy 0, policy_version 474138 (0.0027) [2024-04-27 20:41:37,892][54818] Updated weights for policy 0, policy_version 474148 (0.0025) [2024-04-27 20:41:39,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7768506368. Throughput: 0: 55615.1. Samples: 673742320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:39,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-27 20:41:41,157][54818] Updated weights for policy 0, policy_version 474158 (0.0025) [2024-04-27 20:41:43,828][54818] Updated weights for policy 0, policy_version 474168 (0.0027) [2024-04-27 20:41:44,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7768784896. Throughput: 0: 55913.8. Samples: 673913880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:44,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 20:41:46,996][54818] Updated weights for policy 0, policy_version 474178 (0.0032) [2024-04-27 20:41:49,253][54587] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55816.6). Total num frames: 7769063424. Throughput: 0: 55882.0. Samples: 674249040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:49,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 20:41:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000474186_7769063424.pth... [2024-04-27 20:41:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000473369_7755677696.pth [2024-04-27 20:41:49,591][54818] Updated weights for policy 0, policy_version 474188 (0.0023) [2024-04-27 20:41:52,697][54818] Updated weights for policy 0, policy_version 474198 (0.0025) [2024-04-27 20:41:54,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55761.1). 
Total num frames: 7769325568. Throughput: 0: 55984.5. Samples: 674589000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:54,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 20:41:55,278][54818] Updated weights for policy 0, policy_version 474208 (0.0027) [2024-04-27 20:41:58,552][54818] Updated weights for policy 0, policy_version 474218 (0.0030) [2024-04-27 20:41:59,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7769604096. Throughput: 0: 55581.3. Samples: 674748700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-27 20:41:59,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 20:42:01,096][54818] Updated weights for policy 0, policy_version 474228 (0.0029) [2024-04-27 20:42:04,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 7769882624. Throughput: 0: 55663.1. Samples: 675084340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:04,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 20:42:04,588][54818] Updated weights for policy 0, policy_version 474238 (0.0025) [2024-04-27 20:42:07,414][54818] Updated weights for policy 0, policy_version 474248 (0.0033) [2024-04-27 20:42:09,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7770161152. Throughput: 0: 55769.7. Samples: 675416380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:09,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 20:42:10,567][54818] Updated weights for policy 0, policy_version 474258 (0.0039) [2024-04-27 20:42:13,295][54818] Updated weights for policy 0, policy_version 474268 (0.0029) [2024-04-27 20:42:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7770456064. Throughput: 0: 55920.7. Samples: 675587700. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:14,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 20:42:16,533][54818] Updated weights for policy 0, policy_version 474278 (0.0029) [2024-04-27 20:42:19,135][54818] Updated weights for policy 0, policy_version 474288 (0.0024) [2024-04-27 20:42:19,253][54587] Fps is (10 sec: 57345.0, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 7770734592. Throughput: 0: 55903.3. Samples: 675925820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:19,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 20:42:22,337][54818] Updated weights for policy 0, policy_version 474298 (0.0028) [2024-04-27 20:42:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 7771029504. Throughput: 0: 55895.1. Samples: 676257600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:24,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 20:42:24,991][54818] Updated weights for policy 0, policy_version 474308 (0.0028) [2024-04-27 20:42:25,290][54798] Signal inference workers to stop experience collection... (9800 times) [2024-04-27 20:42:25,291][54798] Signal inference workers to resume experience collection... (9800 times) [2024-04-27 20:42:25,300][54818] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-04-27 20:42:25,324][54818] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-04-27 20:42:28,032][54818] Updated weights for policy 0, policy_version 474318 (0.0033) [2024-04-27 20:42:29,253][54587] Fps is (10 sec: 54066.1, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 7771275264. Throughput: 0: 55722.6. Samples: 676421400. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:29,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 20:42:30,889][54818] Updated weights for policy 0, policy_version 474328 (0.0035) [2024-04-27 20:42:33,933][54818] Updated weights for policy 0, policy_version 474338 (0.0030) [2024-04-27 20:42:34,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7771553792. Throughput: 0: 55710.8. Samples: 676756020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:34,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 20:42:36,680][54818] Updated weights for policy 0, policy_version 474348 (0.0024) [2024-04-27 20:42:39,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 7771848704. Throughput: 0: 55670.3. Samples: 677094160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:39,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 20:42:39,655][54818] Updated weights for policy 0, policy_version 474358 (0.0028) [2024-04-27 20:42:42,585][54818] Updated weights for policy 0, policy_version 474368 (0.0030) [2024-04-27 20:42:44,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 7772127232. Throughput: 0: 55752.4. Samples: 677257560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:44,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 20:42:45,560][54818] Updated weights for policy 0, policy_version 474378 (0.0033) [2024-04-27 20:42:48,527][54818] Updated weights for policy 0, policy_version 474388 (0.0035) [2024-04-27 20:42:49,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7772405760. Throughput: 0: 55695.7. Samples: 677590640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:49,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:42:51,660][54818] Updated weights for policy 0, policy_version 474398 (0.0033) [2024-04-27 20:42:54,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 7772667904. Throughput: 0: 55722.0. Samples: 677923860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:54,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 20:42:54,452][54818] Updated weights for policy 0, policy_version 474408 (0.0025) [2024-04-27 20:42:57,536][54818] Updated weights for policy 0, policy_version 474418 (0.0027) [2024-04-27 20:42:59,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7772962816. Throughput: 0: 55748.4. Samples: 678096380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:42:59,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 20:43:00,279][54818] Updated weights for policy 0, policy_version 474428 (0.0022) [2024-04-27 20:43:03,293][54818] Updated weights for policy 0, policy_version 474438 (0.0030) [2024-04-27 20:43:04,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 7773224960. Throughput: 0: 55548.7. Samples: 678425520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:43:04,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 20:43:06,128][54818] Updated weights for policy 0, policy_version 474448 (0.0027) [2024-04-27 20:43:09,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 7773503488. Throughput: 0: 55603.3. Samples: 678759740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:43:09,253][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 20:43:09,274][54818] Updated weights for policy 0, policy_version 474458 (0.0026) [2024-04-27 20:43:12,022][54818] Updated weights for policy 0, policy_version 474468 (0.0028) [2024-04-27 20:43:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 7773798400. Throughput: 0: 55825.0. Samples: 678933520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:43:14,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 20:43:15,003][54818] Updated weights for policy 0, policy_version 474478 (0.0027) [2024-04-27 20:43:17,654][54798] Signal inference workers to stop experience collection... (9850 times) [2024-04-27 20:43:17,694][54818] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-04-27 20:43:17,703][54798] Signal inference workers to resume experience collection... (9850 times) [2024-04-27 20:43:17,710][54818] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-04-27 20:43:17,942][54818] Updated weights for policy 0, policy_version 474488 (0.0027) [2024-04-27 20:43:19,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7774060544. Throughput: 0: 55762.3. Samples: 679265320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:43:19,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 20:43:20,796][54818] Updated weights for policy 0, policy_version 474498 (0.0040) [2024-04-27 20:43:23,899][54818] Updated weights for policy 0, policy_version 474508 (0.0033) [2024-04-27 20:43:24,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 7774355456. Throughput: 0: 55653.8. Samples: 679598580. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-27 20:43:24,253][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 20:43:26,660][54818] Updated weights for policy 0, policy_version 474518 (0.0024) [2024-04-27 20:43:29,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 7774617600. Throughput: 0: 55654.9. Samples: 679762020. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:43:29,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 20:43:29,639][54818] Updated weights for policy 0, policy_version 474528 (0.0026) [2024-04-27 20:43:32,582][54818] Updated weights for policy 0, policy_version 474538 (0.0028) [2024-04-27 20:43:34,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7774912512. Throughput: 0: 55753.3. Samples: 680099540. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:43:34,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:43:35,735][54818] Updated weights for policy 0, policy_version 474548 (0.0030) [2024-04-27 20:43:38,404][54818] Updated weights for policy 0, policy_version 474558 (0.0029) [2024-04-27 20:43:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 7775191040. Throughput: 0: 55768.0. Samples: 680433420. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:43:39,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:43:41,579][54818] Updated weights for policy 0, policy_version 474568 (0.0028) [2024-04-27 20:43:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7775469568. Throughput: 0: 55647.2. Samples: 680600500. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:43:44,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 20:43:44,331][54818] Updated weights for policy 0, policy_version 474578 (0.0030) [2024-04-27 20:43:47,459][54818] Updated weights for policy 0, policy_version 474588 (0.0028) [2024-04-27 20:43:49,253][54587] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 7775748096. Throughput: 0: 55812.3. Samples: 680937080. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:43:49,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 20:43:49,369][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000474595_7775764480.pth... [2024-04-27 20:43:49,418][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000473777_7762362368.pth [2024-04-27 20:43:50,026][54818] Updated weights for policy 0, policy_version 474598 (0.0026) [2024-04-27 20:43:53,551][54818] Updated weights for policy 0, policy_version 474608 (0.0029) [2024-04-27 20:43:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 7776026624. Throughput: 0: 55751.0. Samples: 681268540. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:43:54,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 20:43:56,076][54818] Updated weights for policy 0, policy_version 474618 (0.0029) [2024-04-27 20:43:59,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7776288768. Throughput: 0: 55522.6. Samples: 681432040. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:43:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 20:43:59,415][54818] Updated weights for policy 0, policy_version 474628 (0.0031) [2024-04-27 20:44:02,040][54818] Updated weights for policy 0, policy_version 474638 (0.0026) [2024-04-27 20:44:04,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55761.1). 
Total num frames: 7776567296. Throughput: 0: 55556.4. Samples: 681765360. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:04,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 20:44:05,180][54818] Updated weights for policy 0, policy_version 474648 (0.0031) [2024-04-27 20:44:07,981][54818] Updated weights for policy 0, policy_version 474658 (0.0034) [2024-04-27 20:44:09,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7776845824. Throughput: 0: 55654.6. Samples: 682103040. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:09,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 20:44:11,035][54818] Updated weights for policy 0, policy_version 474668 (0.0034) [2024-04-27 20:44:13,658][54818] Updated weights for policy 0, policy_version 474678 (0.0028) [2024-04-27 20:44:14,254][54587] Fps is (10 sec: 55704.7, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 7777124352. Throughput: 0: 55680.1. Samples: 682267640. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:14,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 20:44:16,900][54818] Updated weights for policy 0, policy_version 474688 (0.0031) [2024-04-27 20:44:19,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 7777419264. Throughput: 0: 55677.7. Samples: 682605040. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:19,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 20:44:19,542][54818] Updated weights for policy 0, policy_version 474698 (0.0026) [2024-04-27 20:44:22,943][54818] Updated weights for policy 0, policy_version 474708 (0.0033) [2024-04-27 20:44:24,253][54587] Fps is (10 sec: 57345.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7777697792. Throughput: 0: 55635.0. Samples: 682937000. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:24,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 20:44:25,382][54818] Updated weights for policy 0, policy_version 474718 (0.0030) [2024-04-27 20:44:28,708][54818] Updated weights for policy 0, policy_version 474728 (0.0031) [2024-04-27 20:44:29,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7777976320. Throughput: 0: 55805.8. Samples: 683111760. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:29,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 20:44:31,154][54818] Updated weights for policy 0, policy_version 474738 (0.0030) [2024-04-27 20:44:34,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 7778222080. Throughput: 0: 55613.5. Samples: 683439680. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:34,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 20:44:34,495][54798] Signal inference workers to stop experience collection... (9900 times) [2024-04-27 20:44:34,515][54818] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-04-27 20:44:34,588][54798] Signal inference workers to resume experience collection... (9900 times) [2024-04-27 20:44:34,588][54818] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-04-27 20:44:34,717][54818] Updated weights for policy 0, policy_version 474748 (0.0024) [2024-04-27 20:44:37,201][54818] Updated weights for policy 0, policy_version 474758 (0.0026) [2024-04-27 20:44:39,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7778516992. Throughput: 0: 55681.0. Samples: 683774180. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:39,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 20:44:40,643][54818] Updated weights for policy 0, policy_version 474768 (0.0029) [2024-04-27 20:44:43,624][54818] Updated weights for policy 0, policy_version 474778 (0.0029) [2024-04-27 20:44:44,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7778795520. Throughput: 0: 55709.9. Samples: 683938980. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:44,253][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 20:44:46,495][54818] Updated weights for policy 0, policy_version 474788 (0.0027) [2024-04-27 20:44:49,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 7779074048. Throughput: 0: 55790.2. Samples: 684275920. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:49,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:44:49,551][54818] Updated weights for policy 0, policy_version 474798 (0.0029) [2024-04-27 20:44:52,290][54818] Updated weights for policy 0, policy_version 474808 (0.0023) [2024-04-27 20:44:54,253][54587] Fps is (10 sec: 57342.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7779368960. Throughput: 0: 55553.2. Samples: 684602940. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-04-27 20:44:54,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 20:44:55,548][54818] Updated weights for policy 0, policy_version 474818 (0.0033) [2024-04-27 20:44:58,089][54818] Updated weights for policy 0, policy_version 474828 (0.0030) [2024-04-27 20:44:59,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7779647488. Throughput: 0: 55807.3. Samples: 684778960. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:44:59,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 20:45:01,325][54818] Updated weights for policy 0, policy_version 474838 (0.0028) [2024-04-27 20:45:04,126][54818] Updated weights for policy 0, policy_version 474848 (0.0027) [2024-04-27 20:45:04,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7779926016. Throughput: 0: 55619.3. Samples: 685107900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:04,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 20:45:07,089][54818] Updated weights for policy 0, policy_version 474858 (0.0031) [2024-04-27 20:45:09,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7780188160. Throughput: 0: 55679.9. Samples: 685442600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:09,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 20:45:09,947][54818] Updated weights for policy 0, policy_version 474868 (0.0029) [2024-04-27 20:45:12,935][54818] Updated weights for policy 0, policy_version 474878 (0.0033) [2024-04-27 20:45:14,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 7780450304. Throughput: 0: 55361.3. Samples: 685603020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:14,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 20:45:15,796][54818] Updated weights for policy 0, policy_version 474888 (0.0030) [2024-04-27 20:45:18,941][54818] Updated weights for policy 0, policy_version 474898 (0.0027) [2024-04-27 20:45:19,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 7780745216. Throughput: 0: 55541.9. Samples: 685939060. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:19,253][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 20:45:21,609][54818] Updated weights for policy 0, policy_version 474908 (0.0031) [2024-04-27 20:45:24,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7781007360. Throughput: 0: 55567.8. Samples: 686274740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:24,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 20:45:24,587][54798] Signal inference workers to stop experience collection... (9950 times) [2024-04-27 20:45:24,591][54798] Signal inference workers to resume experience collection... (9950 times) [2024-04-27 20:45:24,615][54818] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-04-27 20:45:24,616][54818] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-04-27 20:45:24,702][54818] Updated weights for policy 0, policy_version 474918 (0.0029) [2024-04-27 20:45:27,531][54818] Updated weights for policy 0, policy_version 474928 (0.0030) [2024-04-27 20:45:29,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7781318656. Throughput: 0: 55750.6. Samples: 686447760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:29,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 20:45:30,715][54818] Updated weights for policy 0, policy_version 474938 (0.0030) [2024-04-27 20:45:33,443][54818] Updated weights for policy 0, policy_version 474948 (0.0026) [2024-04-27 20:45:34,253][54587] Fps is (10 sec: 57345.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7781580800. Throughput: 0: 55667.4. Samples: 686780940. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:34,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 20:45:36,449][54818] Updated weights for policy 0, policy_version 474958 (0.0026) [2024-04-27 20:45:39,222][54818] Updated weights for policy 0, policy_version 474968 (0.0028) [2024-04-27 20:45:39,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7781875712. Throughput: 0: 55869.9. Samples: 687117080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:39,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 20:45:42,386][54818] Updated weights for policy 0, policy_version 474978 (0.0033) [2024-04-27 20:45:44,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7782121472. Throughput: 0: 55709.0. Samples: 687285860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:44,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 20:45:45,087][54818] Updated weights for policy 0, policy_version 474988 (0.0031) [2024-04-27 20:45:48,378][54818] Updated weights for policy 0, policy_version 474998 (0.0030) [2024-04-27 20:45:49,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 7782400000. Throughput: 0: 55848.0. Samples: 687621060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:49,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 20:45:49,351][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000475001_7782416384.pth... [2024-04-27 20:45:49,396][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000474186_7769063424.pth [2024-04-27 20:45:51,015][54818] Updated weights for policy 0, policy_version 475008 (0.0029) [2024-04-27 20:45:54,159][54818] Updated weights for policy 0, policy_version 475018 (0.0032) [2024-04-27 20:45:54,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55432.7, 300 sec: 55761.2). 
Total num frames: 7782694912. Throughput: 0: 55649.9. Samples: 687946840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:54,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 20:45:56,856][54818] Updated weights for policy 0, policy_version 475028 (0.0032) [2024-04-27 20:45:59,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7782957056. Throughput: 0: 55822.2. Samples: 688115020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:45:59,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 20:45:59,909][54818] Updated weights for policy 0, policy_version 475038 (0.0025) [2024-04-27 20:46:02,704][54818] Updated weights for policy 0, policy_version 475048 (0.0035) [2024-04-27 20:46:04,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7783268352. Throughput: 0: 55686.2. Samples: 688444940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:46:04,253][54587] Avg episode reward: [(0, '0.678')] [2024-04-27 20:46:05,883][54818] Updated weights for policy 0, policy_version 475058 (0.0028) [2024-04-27 20:46:08,584][54818] Updated weights for policy 0, policy_version 475068 (0.0026) [2024-04-27 20:46:09,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7783546880. Throughput: 0: 55575.2. Samples: 688775620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:46:09,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 20:46:11,828][54818] Updated weights for policy 0, policy_version 475078 (0.0027) [2024-04-27 20:46:14,253][54587] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7783825408. Throughput: 0: 55608.7. Samples: 688950160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:46:14,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 20:46:14,509][54818] Updated weights for policy 0, policy_version 475088 (0.0030) [2024-04-27 20:46:17,779][54818] Updated weights for policy 0, policy_version 475098 (0.0026) [2024-04-27 20:46:19,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7784071168. Throughput: 0: 55583.4. Samples: 689282200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 20:46:19,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 20:46:20,409][54818] Updated weights for policy 0, policy_version 475108 (0.0026) [2024-04-27 20:46:21,658][54798] Signal inference workers to stop experience collection... (10000 times) [2024-04-27 20:46:21,701][54818] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-04-27 20:46:21,716][54798] Signal inference workers to resume experience collection... (10000 times) [2024-04-27 20:46:21,721][54818] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-04-27 20:46:24,020][54818] Updated weights for policy 0, policy_version 475118 (0.0027) [2024-04-27 20:46:24,253][54587] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 7784349696. Throughput: 0: 55531.2. Samples: 689615980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:46:24,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 20:46:26,148][54818] Updated weights for policy 0, policy_version 475128 (0.0031) [2024-04-27 20:46:29,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7784628224. Throughput: 0: 55246.1. Samples: 689771940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:46:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:46:29,722][54818] Updated weights for policy 0, policy_version 475138 (0.0029) [2024-04-27 20:46:31,988][54818] Updated weights for policy 0, policy_version 475148 (0.0027) [2024-04-27 20:46:34,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7784923136. Throughput: 0: 55253.3. Samples: 690107460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:46:34,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:46:35,660][54818] Updated weights for policy 0, policy_version 475158 (0.0025) [2024-04-27 20:46:38,040][54818] Updated weights for policy 0, policy_version 475168 (0.0027) [2024-04-27 20:46:39,253][54587] Fps is (10 sec: 58983.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7785218048. Throughput: 0: 55428.1. Samples: 690441100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:46:39,253][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 20:46:41,590][54818] Updated weights for policy 0, policy_version 475178 (0.0029) [2024-04-27 20:46:43,971][54818] Updated weights for policy 0, policy_version 475188 (0.0032) [2024-04-27 20:46:44,253][54587] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 7785496576. Throughput: 0: 55576.6. Samples: 690615960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:46:44,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 20:46:47,385][54818] Updated weights for policy 0, policy_version 475198 (0.0028) [2024-04-27 20:46:49,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7785742336. Throughput: 0: 55627.9. Samples: 690948200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:46:49,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 20:46:49,679][54818] Updated weights for policy 0, policy_version 475208 (0.0035) [2024-04-27 20:46:53,232][54818] Updated weights for policy 0, policy_version 475218 (0.0023) [2024-04-27 20:46:54,253][54587] Fps is (10 sec: 52428.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7786020864. Throughput: 0: 55670.7. Samples: 691280800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:46:54,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:46:55,625][54818] Updated weights for policy 0, policy_version 475228 (0.0029) [2024-04-27 20:46:59,173][54818] Updated weights for policy 0, policy_version 475238 (0.0032) [2024-04-27 20:46:59,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7786299392. Throughput: 0: 55284.0. Samples: 691437940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:46:59,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 20:47:01,652][54818] Updated weights for policy 0, policy_version 475248 (0.0027) [2024-04-27 20:47:04,253][54587] Fps is (10 sec: 54067.7, 60 sec: 54886.4, 300 sec: 55594.6). Total num frames: 7786561536. Throughput: 0: 55302.8. Samples: 691770820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:47:04,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 20:47:05,034][54818] Updated weights for policy 0, policy_version 475258 (0.0026) [2024-04-27 20:47:07,526][54818] Updated weights for policy 0, policy_version 475268 (0.0026) [2024-04-27 20:47:09,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7786856448. Throughput: 0: 55298.7. Samples: 692104420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:47:09,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 20:47:11,222][54818] Updated weights for policy 0, policy_version 475278 (0.0027) [2024-04-27 20:47:13,219][54818] Updated weights for policy 0, policy_version 475288 (0.0030) [2024-04-27 20:47:14,253][54587] Fps is (10 sec: 60620.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7787167744. Throughput: 0: 55747.2. Samples: 692280560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:47:14,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:47:16,931][54818] Updated weights for policy 0, policy_version 475298 (0.0033) [2024-04-27 20:47:18,694][54798] Signal inference workers to stop experience collection... (10050 times) [2024-04-27 20:47:18,697][54798] Signal inference workers to resume experience collection... (10050 times) [2024-04-27 20:47:18,721][54818] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-04-27 20:47:18,722][54818] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-04-27 20:47:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7787429888. Throughput: 0: 55695.1. Samples: 692613740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:47:19,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 20:47:19,381][54818] Updated weights for policy 0, policy_version 475308 (0.0028) [2024-04-27 20:47:22,643][54818] Updated weights for policy 0, policy_version 475318 (0.0028) [2024-04-27 20:47:24,253][54587] Fps is (10 sec: 50790.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7787675648. Throughput: 0: 55716.8. Samples: 692948360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:47:24,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 20:47:25,254][54818] Updated weights for policy 0, policy_version 475328 (0.0030) [2024-04-27 20:47:28,598][54818] Updated weights for policy 0, policy_version 475338 (0.0027) [2024-04-27 20:47:29,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7787970560. Throughput: 0: 55540.3. Samples: 693115280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:47:29,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 20:47:31,001][54818] Updated weights for policy 0, policy_version 475348 (0.0026) [2024-04-27 20:47:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7788249088. Throughput: 0: 55593.0. Samples: 693449880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:47:34,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 20:47:34,477][54818] Updated weights for policy 0, policy_version 475358 (0.0033) [2024-04-27 20:47:36,737][54818] Updated weights for policy 0, policy_version 475368 (0.0035) [2024-04-27 20:47:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55594.6). Total num frames: 7788527616. Throughput: 0: 55676.6. Samples: 693786240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:47:39,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 20:47:40,300][54818] Updated weights for policy 0, policy_version 475378 (0.0025) [2024-04-27 20:47:42,543][54818] Updated weights for policy 0, policy_version 475388 (0.0032) [2024-04-27 20:47:44,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 7788806144. Throughput: 0: 55875.1. Samples: 693952320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 20:47:44,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 20:47:46,000][54818] Updated weights for policy 0, policy_version 475398 (0.0032) [2024-04-27 20:47:48,457][54818] Updated weights for policy 0, policy_version 475408 (0.0031) [2024-04-27 20:47:49,253][54587] Fps is (10 sec: 58981.7, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 7789117440. Throughput: 0: 55865.2. Samples: 694284760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:47:49,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 20:47:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000475410_7789117440.pth... [2024-04-27 20:47:49,311][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000474595_7775764480.pth [2024-04-27 20:47:51,943][54818] Updated weights for policy 0, policy_version 475418 (0.0030) [2024-04-27 20:47:54,253][54587] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7789395968. Throughput: 0: 55964.9. Samples: 694622840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:47:54,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 20:47:54,497][54818] Updated weights for policy 0, policy_version 475428 (0.0038) [2024-04-27 20:47:57,835][54818] Updated weights for policy 0, policy_version 475438 (0.0029) [2024-04-27 20:47:59,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7789658112. Throughput: 0: 55809.3. Samples: 694791980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:47:59,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:48:00,251][54818] Updated weights for policy 0, policy_version 475448 (0.0024) [2024-04-27 20:48:03,597][54818] Updated weights for policy 0, policy_version 475458 (0.0033) [2024-04-27 20:48:04,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55978.7, 300 sec: 55650.1). 
Total num frames: 7789920256. Throughput: 0: 55794.8. Samples: 695124500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:04,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 20:48:06,353][54818] Updated weights for policy 0, policy_version 475468 (0.0028) [2024-04-27 20:48:09,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7790198784. Throughput: 0: 55797.9. Samples: 695459260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:09,253][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 20:48:09,460][54818] Updated weights for policy 0, policy_version 475478 (0.0028) [2024-04-27 20:48:12,167][54818] Updated weights for policy 0, policy_version 475488 (0.0032) [2024-04-27 20:48:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7790477312. Throughput: 0: 55717.8. Samples: 695622580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:14,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 20:48:15,336][54818] Updated weights for policy 0, policy_version 475498 (0.0027) [2024-04-27 20:48:18,076][54818] Updated weights for policy 0, policy_version 475508 (0.0033) [2024-04-27 20:48:19,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7790755840. Throughput: 0: 55666.7. Samples: 695954880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:19,253][54587] Avg episode reward: [(0, '0.672')] [2024-04-27 20:48:21,213][54818] Updated weights for policy 0, policy_version 475518 (0.0037) [2024-04-27 20:48:23,787][54818] Updated weights for policy 0, policy_version 475528 (0.0030) [2024-04-27 20:48:24,253][54587] Fps is (10 sec: 57343.2, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7791050752. Throughput: 0: 55578.0. Samples: 696287260. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:24,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 20:48:27,167][54818] Updated weights for policy 0, policy_version 475538 (0.0030) [2024-04-27 20:48:28,427][54798] Signal inference workers to stop experience collection... (10100 times) [2024-04-27 20:48:28,431][54798] Signal inference workers to resume experience collection... (10100 times) [2024-04-27 20:48:28,456][54818] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-04-27 20:48:28,456][54818] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-04-27 20:48:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7791329280. Throughput: 0: 55835.3. Samples: 696464900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:29,253][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 20:48:29,589][54818] Updated weights for policy 0, policy_version 475548 (0.0028) [2024-04-27 20:48:32,964][54818] Updated weights for policy 0, policy_version 475558 (0.0038) [2024-04-27 20:48:34,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7791591424. Throughput: 0: 55791.2. Samples: 696795360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:34,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 20:48:35,592][54818] Updated weights for policy 0, policy_version 475568 (0.0034) [2024-04-27 20:48:38,953][54818] Updated weights for policy 0, policy_version 475578 (0.0030) [2024-04-27 20:48:39,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 7791886336. Throughput: 0: 55635.1. Samples: 697126420. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:39,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:48:41,916][54818] Updated weights for policy 0, policy_version 475588 (0.0031) [2024-04-27 20:48:44,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.8, 300 sec: 55594.6). Total num frames: 7792148480. Throughput: 0: 55530.4. Samples: 697290840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:44,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 20:48:44,756][54818] Updated weights for policy 0, policy_version 475598 (0.0030) [2024-04-27 20:48:47,766][54818] Updated weights for policy 0, policy_version 475608 (0.0033) [2024-04-27 20:48:49,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7792427008. Throughput: 0: 55627.4. Samples: 697627740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:49,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 20:48:50,502][54818] Updated weights for policy 0, policy_version 475618 (0.0029) [2024-04-27 20:48:53,724][54818] Updated weights for policy 0, policy_version 475628 (0.0029) [2024-04-27 20:48:54,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7792705536. Throughput: 0: 55648.4. Samples: 697963440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:54,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 20:48:56,436][54818] Updated weights for policy 0, policy_version 475638 (0.0030) [2024-04-27 20:48:59,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7792984064. Throughput: 0: 55566.4. Samples: 698123080. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:48:59,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 20:48:59,563][54818] Updated weights for policy 0, policy_version 475648 (0.0031) [2024-04-27 20:49:02,393][54818] Updated weights for policy 0, policy_version 475658 (0.0029) [2024-04-27 20:49:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 7793278976. Throughput: 0: 55530.0. Samples: 698453740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:49:04,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 20:49:05,310][54818] Updated weights for policy 0, policy_version 475668 (0.0032) [2024-04-27 20:49:08,289][54818] Updated weights for policy 0, policy_version 475678 (0.0029) [2024-04-27 20:49:09,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7793557504. Throughput: 0: 55784.9. Samples: 698797580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-27 20:49:09,254][54587] Avg episode reward: [(0, '0.677')] [2024-04-27 20:49:11,141][54818] Updated weights for policy 0, policy_version 475688 (0.0025) [2024-04-27 20:49:14,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 7793819648. Throughput: 0: 55460.4. Samples: 698960620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:14,253][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 20:49:14,309][54818] Updated weights for policy 0, policy_version 475698 (0.0030) [2024-04-27 20:49:17,080][54818] Updated weights for policy 0, policy_version 475708 (0.0040) [2024-04-27 20:49:19,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7794081792. Throughput: 0: 55624.1. Samples: 699298440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:19,253][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 20:49:19,614][54798] Signal inference workers to stop experience collection... 
(10150 times) [2024-04-27 20:49:19,615][54798] Signal inference workers to resume experience collection... (10150 times) [2024-04-27 20:49:19,626][54818] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-04-27 20:49:19,626][54818] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-04-27 20:49:20,081][54818] Updated weights for policy 0, policy_version 475718 (0.0031) [2024-04-27 20:49:22,995][54818] Updated weights for policy 0, policy_version 475728 (0.0026) [2024-04-27 20:49:24,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7794376704. Throughput: 0: 55704.0. Samples: 699633100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:24,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 20:49:25,976][54818] Updated weights for policy 0, policy_version 475738 (0.0025) [2024-04-27 20:49:28,918][54818] Updated weights for policy 0, policy_version 475748 (0.0032) [2024-04-27 20:49:29,253][54587] Fps is (10 sec: 58981.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7794671616. Throughput: 0: 55738.5. Samples: 699799080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:29,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 20:49:32,117][54818] Updated weights for policy 0, policy_version 475758 (0.0026) [2024-04-27 20:49:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7794950144. Throughput: 0: 55635.7. Samples: 700131340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:34,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 20:49:34,761][54818] Updated weights for policy 0, policy_version 475768 (0.0037) [2024-04-27 20:49:38,038][54818] Updated weights for policy 0, policy_version 475778 (0.0025) [2024-04-27 20:49:39,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7795228672. Throughput: 0: 55613.3. 
Samples: 700466040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:39,254][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 20:49:40,685][54818] Updated weights for policy 0, policy_version 475788 (0.0027) [2024-04-27 20:49:43,877][54818] Updated weights for policy 0, policy_version 475798 (0.0027) [2024-04-27 20:49:44,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7795490816. Throughput: 0: 55789.5. Samples: 700633600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:44,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 20:49:46,457][54818] Updated weights for policy 0, policy_version 475808 (0.0033) [2024-04-27 20:49:49,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7795769344. Throughput: 0: 55956.5. Samples: 700971780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:49,262][54587] Avg episode reward: [(0, '0.487')] [2024-04-27 20:49:49,273][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000475816_7795769344.pth... [2024-04-27 20:49:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000475001_7782416384.pth [2024-04-27 20:49:49,888][54818] Updated weights for policy 0, policy_version 475818 (0.0029) [2024-04-27 20:49:52,395][54818] Updated weights for policy 0, policy_version 475828 (0.0031) [2024-04-27 20:49:54,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7796047872. Throughput: 0: 55670.8. Samples: 701302760. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:54,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 20:49:55,709][54818] Updated weights for policy 0, policy_version 475838 (0.0027) [2024-04-27 20:49:58,302][54818] Updated weights for policy 0, policy_version 475848 (0.0033) [2024-04-27 20:49:59,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7796326400. Throughput: 0: 55629.7. Samples: 701463960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:49:59,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 20:50:01,577][54818] Updated weights for policy 0, policy_version 475858 (0.0027) [2024-04-27 20:50:04,083][54818] Updated weights for policy 0, policy_version 475868 (0.0032) [2024-04-27 20:50:04,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7796621312. Throughput: 0: 55716.4. Samples: 701805680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:50:04,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 20:50:07,521][54818] Updated weights for policy 0, policy_version 475878 (0.0026) [2024-04-27 20:50:09,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7796883456. Throughput: 0: 55622.7. Samples: 702136120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:50:09,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 20:50:10,086][54818] Updated weights for policy 0, policy_version 475888 (0.0031) [2024-04-27 20:50:13,247][54818] Updated weights for policy 0, policy_version 475898 (0.0032) [2024-04-27 20:50:14,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 7797178368. Throughput: 0: 55614.1. Samples: 702301720. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:50:14,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 20:50:16,308][54818] Updated weights for policy 0, policy_version 475908 (0.0031) [2024-04-27 20:50:19,018][54818] Updated weights for policy 0, policy_version 475918 (0.0033) [2024-04-27 20:50:19,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7797440512. Throughput: 0: 55565.4. Samples: 702631780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:50:19,253][54587] Avg episode reward: [(0, '0.474')] [2024-04-27 20:50:22,094][54818] Updated weights for policy 0, policy_version 475928 (0.0029) [2024-04-27 20:50:24,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 7797735424. Throughput: 0: 55637.7. Samples: 702969740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:50:24,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 20:50:24,998][54818] Updated weights for policy 0, policy_version 475938 (0.0028) [2024-04-27 20:50:27,926][54818] Updated weights for policy 0, policy_version 475948 (0.0033) [2024-04-27 20:50:29,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 7797997568. Throughput: 0: 55611.2. Samples: 703136100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:50:29,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 20:50:31,158][54818] Updated weights for policy 0, policy_version 475958 (0.0032) [2024-04-27 20:50:33,936][54818] Updated weights for policy 0, policy_version 475968 (0.0033) [2024-04-27 20:50:34,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7798292480. Throughput: 0: 55405.8. Samples: 703465040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-04-27 20:50:34,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 20:50:36,995][54818] Updated weights for policy 0, policy_version 475978 (0.0031) [2024-04-27 20:50:37,002][54798] Signal inference workers to stop experience collection... (10200 times) [2024-04-27 20:50:37,002][54798] Signal inference workers to resume experience collection... (10200 times) [2024-04-27 20:50:37,014][54818] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-04-27 20:50:37,014][54818] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-04-27 20:50:39,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7798554624. Throughput: 0: 55540.7. Samples: 703802100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:50:39,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 20:50:39,743][54818] Updated weights for policy 0, policy_version 475988 (0.0030) [2024-04-27 20:50:42,843][54818] Updated weights for policy 0, policy_version 475998 (0.0032) [2024-04-27 20:50:44,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7798833152. Throughput: 0: 55656.7. Samples: 703968520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:50:44,254][54587] Avg episode reward: [(0, '0.690')] [2024-04-27 20:50:45,591][54818] Updated weights for policy 0, policy_version 476008 (0.0028) [2024-04-27 20:50:48,746][54818] Updated weights for policy 0, policy_version 476018 (0.0029) [2024-04-27 20:50:49,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7799095296. Throughput: 0: 55400.4. Samples: 704298700. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:50:49,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 20:50:51,497][54818] Updated weights for policy 0, policy_version 476028 (0.0033) [2024-04-27 20:50:54,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 7799373824. Throughput: 0: 55472.9. Samples: 704632400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:50:54,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:50:54,573][54818] Updated weights for policy 0, policy_version 476038 (0.0028) [2024-04-27 20:50:57,371][54818] Updated weights for policy 0, policy_version 476048 (0.0029) [2024-04-27 20:50:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7799668736. Throughput: 0: 55622.3. Samples: 704804720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:50:59,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 20:51:00,367][54818] Updated weights for policy 0, policy_version 476058 (0.0028) [2024-04-27 20:51:03,410][54818] Updated weights for policy 0, policy_version 476068 (0.0028) [2024-04-27 20:51:04,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7799947264. Throughput: 0: 55749.7. Samples: 705140520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:04,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-27 20:51:06,210][54818] Updated weights for policy 0, policy_version 476078 (0.0026) [2024-04-27 20:51:09,080][54818] Updated weights for policy 0, policy_version 476088 (0.0033) [2024-04-27 20:51:09,253][54587] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7800242176. Throughput: 0: 55633.1. Samples: 705473220. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:09,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 20:51:12,117][54818] Updated weights for policy 0, policy_version 476098 (0.0028) [2024-04-27 20:51:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 7800504320. Throughput: 0: 55729.8. Samples: 705643940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:14,253][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 20:51:14,746][54818] Updated weights for policy 0, policy_version 476108 (0.0034) [2024-04-27 20:51:17,953][54818] Updated weights for policy 0, policy_version 476118 (0.0035) [2024-04-27 20:51:19,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7800766464. Throughput: 0: 55890.4. Samples: 705980100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:19,253][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 20:51:20,524][54818] Updated weights for policy 0, policy_version 476128 (0.0031) [2024-04-27 20:51:23,733][54818] Updated weights for policy 0, policy_version 476138 (0.0035) [2024-04-27 20:51:24,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 7801077760. Throughput: 0: 55816.1. Samples: 706313820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:24,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:51:26,451][54818] Updated weights for policy 0, policy_version 476148 (0.0032) [2024-04-27 20:51:29,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7801339904. Throughput: 0: 55805.4. Samples: 706479760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:29,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 20:51:29,594][54818] Updated weights for policy 0, policy_version 476158 (0.0028) [2024-04-27 20:51:32,371][54818] Updated weights for policy 0, policy_version 476168 (0.0031) [2024-04-27 20:51:34,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 7801634816. Throughput: 0: 55832.4. Samples: 706811160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:34,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 20:51:35,388][54818] Updated weights for policy 0, policy_version 476178 (0.0036) [2024-04-27 20:51:38,309][54818] Updated weights for policy 0, policy_version 476188 (0.0027) [2024-04-27 20:51:39,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7801896960. Throughput: 0: 55845.3. Samples: 707145440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:39,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 20:51:41,253][54818] Updated weights for policy 0, policy_version 476198 (0.0034) [2024-04-27 20:51:44,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7802175488. Throughput: 0: 55721.4. Samples: 707312180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:44,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 20:51:44,397][54818] Updated weights for policy 0, policy_version 476208 (0.0027) [2024-04-27 20:51:47,011][54818] Updated weights for policy 0, policy_version 476218 (0.0036) [2024-04-27 20:51:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7802454016. Throughput: 0: 55741.3. Samples: 707648880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:49,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 20:51:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000476224_7802454016.pth... [2024-04-27 20:51:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000475410_7789117440.pth [2024-04-27 20:51:50,067][54798] Signal inference workers to stop experience collection... (10250 times) [2024-04-27 20:51:50,070][54798] Signal inference workers to resume experience collection... (10250 times) [2024-04-27 20:51:50,094][54818] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-04-27 20:51:50,094][54818] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-04-27 20:51:50,179][54818] Updated weights for policy 0, policy_version 476228 (0.0031) [2024-04-27 20:51:52,928][54818] Updated weights for policy 0, policy_version 476238 (0.0026) [2024-04-27 20:51:54,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7802732544. Throughput: 0: 55799.4. Samples: 707984200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:54,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 20:51:55,967][54818] Updated weights for policy 0, policy_version 476248 (0.0029) [2024-04-27 20:51:58,886][54818] Updated weights for policy 0, policy_version 476258 (0.0029) [2024-04-27 20:51:59,254][54587] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7803011072. Throughput: 0: 55678.8. Samples: 708149500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 20:51:59,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 20:52:01,978][54818] Updated weights for policy 0, policy_version 476268 (0.0029) [2024-04-27 20:52:04,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7803273216. Throughput: 0: 55581.1. Samples: 708481260. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:04,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 20:52:04,957][54818] Updated weights for policy 0, policy_version 476278 (0.0026) [2024-04-27 20:52:07,969][54818] Updated weights for policy 0, policy_version 476288 (0.0028) [2024-04-27 20:52:09,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 7803568128. Throughput: 0: 55535.5. Samples: 708812920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:09,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:52:10,760][54818] Updated weights for policy 0, policy_version 476298 (0.0040) [2024-04-27 20:52:13,703][54818] Updated weights for policy 0, policy_version 476308 (0.0027) [2024-04-27 20:52:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 7803846656. Throughput: 0: 55649.8. Samples: 708984000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:14,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 20:52:16,622][54818] Updated weights for policy 0, policy_version 476318 (0.0030) [2024-04-27 20:52:19,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7804125184. Throughput: 0: 55586.9. Samples: 709312560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:19,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 20:52:19,776][54818] Updated weights for policy 0, policy_version 476328 (0.0025) [2024-04-27 20:52:22,481][54818] Updated weights for policy 0, policy_version 476338 (0.0028) [2024-04-27 20:52:24,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7804403712. Throughput: 0: 55618.3. Samples: 709648260. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:24,254][54587] Avg episode reward: [(0, '0.682')] [2024-04-27 20:52:25,763][54818] Updated weights for policy 0, policy_version 476348 (0.0031) [2024-04-27 20:52:28,258][54818] Updated weights for policy 0, policy_version 476358 (0.0032) [2024-04-27 20:52:29,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7804698624. Throughput: 0: 55638.3. Samples: 709815900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:29,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:52:31,457][54818] Updated weights for policy 0, policy_version 476368 (0.0027) [2024-04-27 20:52:34,004][54818] Updated weights for policy 0, policy_version 476378 (0.0026) [2024-04-27 20:52:34,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7804977152. Throughput: 0: 55765.8. Samples: 710158340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:34,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 20:52:37,195][54818] Updated weights for policy 0, policy_version 476388 (0.0028) [2024-04-27 20:52:39,253][54587] Fps is (10 sec: 52428.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7805222912. Throughput: 0: 55760.9. Samples: 710493440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:39,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 20:52:40,210][54818] Updated weights for policy 0, policy_version 476398 (0.0026) [2024-04-27 20:52:43,216][54818] Updated weights for policy 0, policy_version 476408 (0.0029) [2024-04-27 20:52:44,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7805517824. Throughput: 0: 55669.1. Samples: 710654600. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:44,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 20:52:46,181][54818] Updated weights for policy 0, policy_version 476418 (0.0027) [2024-04-27 20:52:49,039][54818] Updated weights for policy 0, policy_version 476428 (0.0031) [2024-04-27 20:52:49,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7805796352. Throughput: 0: 55648.8. Samples: 710985460. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:49,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 20:52:51,231][54798] Signal inference workers to stop experience collection... (10300 times) [2024-04-27 20:52:51,235][54798] Signal inference workers to resume experience collection... (10300 times) [2024-04-27 20:52:51,252][54818] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-04-27 20:52:51,252][54818] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-04-27 20:52:51,920][54818] Updated weights for policy 0, policy_version 476438 (0.0029) [2024-04-27 20:52:54,253][54587] Fps is (10 sec: 54067.6, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 7806058496. Throughput: 0: 55769.0. Samples: 711322520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:54,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 20:52:54,763][54818] Updated weights for policy 0, policy_version 476448 (0.0033) [2024-04-27 20:52:57,867][54818] Updated weights for policy 0, policy_version 476458 (0.0031) [2024-04-27 20:52:59,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.7, 300 sec: 55650.0). Total num frames: 7806337024. Throughput: 0: 55645.7. Samples: 711488060. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:52:59,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:53:00,783][54818] Updated weights for policy 0, policy_version 476468 (0.0031) [2024-04-27 20:53:03,613][54818] Updated weights for policy 0, policy_version 476478 (0.0022) [2024-04-27 20:53:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7806631936. Throughput: 0: 55820.7. Samples: 711824500. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:53:04,254][54587] Avg episode reward: [(0, '0.484')] [2024-04-27 20:53:07,158][54818] Updated weights for policy 0, policy_version 476488 (0.0029) [2024-04-27 20:53:09,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7806894080. Throughput: 0: 55715.2. Samples: 712155440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:53:09,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:53:09,539][54818] Updated weights for policy 0, policy_version 476498 (0.0026) [2024-04-27 20:53:13,140][54818] Updated weights for policy 0, policy_version 476508 (0.0030) [2024-04-27 20:53:14,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7807188992. Throughput: 0: 55680.7. Samples: 712321540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:53:14,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:53:15,274][54818] Updated weights for policy 0, policy_version 476518 (0.0031) [2024-04-27 20:53:18,996][54818] Updated weights for policy 0, policy_version 476528 (0.0038) [2024-04-27 20:53:19,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55594.6). Total num frames: 7807451136. Throughput: 0: 55596.2. Samples: 712660160. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:53:19,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 20:53:21,140][54818] Updated weights for policy 0, policy_version 476538 (0.0028) [2024-04-27 20:53:24,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7807729664. Throughput: 0: 55584.5. Samples: 712994740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-04-27 20:53:24,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 20:53:24,732][54818] Updated weights for policy 0, policy_version 476548 (0.0032) [2024-04-27 20:53:27,229][54818] Updated weights for policy 0, policy_version 476558 (0.0037) [2024-04-27 20:53:29,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7808024576. Throughput: 0: 55667.1. Samples: 713159620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:53:29,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 20:53:30,726][54818] Updated weights for policy 0, policy_version 476568 (0.0024) [2024-04-27 20:53:32,975][54818] Updated weights for policy 0, policy_version 476578 (0.0030) [2024-04-27 20:53:34,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7808286720. Throughput: 0: 55571.7. Samples: 713486180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:53:34,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:53:36,703][54818] Updated weights for policy 0, policy_version 476588 (0.0033) [2024-04-27 20:53:38,779][54818] Updated weights for policy 0, policy_version 476598 (0.0028) [2024-04-27 20:53:39,253][54587] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7808598016. Throughput: 0: 55419.4. Samples: 713816400. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:53:39,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 20:53:42,446][54818] Updated weights for policy 0, policy_version 476608 (0.0027) [2024-04-27 20:53:44,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7808860160. Throughput: 0: 55662.8. Samples: 713992880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:53:44,253][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 20:53:44,342][54798] Signal inference workers to stop experience collection... (10350 times) [2024-04-27 20:53:44,343][54798] Signal inference workers to resume experience collection... (10350 times) [2024-04-27 20:53:44,355][54818] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-04-27 20:53:44,355][54818] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-04-27 20:53:44,704][54818] Updated weights for policy 0, policy_version 476618 (0.0032) [2024-04-27 20:53:48,260][54818] Updated weights for policy 0, policy_version 476628 (0.0027) [2024-04-27 20:53:49,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7809138688. Throughput: 0: 55731.4. Samples: 714332420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:53:49,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 20:53:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000476632_7809138688.pth... [2024-04-27 20:53:49,311][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000475816_7795769344.pth [2024-04-27 20:53:50,577][54818] Updated weights for policy 0, policy_version 476638 (0.0031) [2024-04-27 20:53:54,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55594.6). Total num frames: 7809384448. Throughput: 0: 55718.2. Samples: 714662760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:53:54,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 20:53:54,333][54818] Updated weights for policy 0, policy_version 476648 (0.0029) [2024-04-27 20:53:56,369][54818] Updated weights for policy 0, policy_version 476658 (0.0027) [2024-04-27 20:53:59,253][54587] Fps is (10 sec: 52429.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7809662976. Throughput: 0: 55376.6. Samples: 714813480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:53:59,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 20:54:00,148][54818] Updated weights for policy 0, policy_version 476668 (0.0030) [2024-04-27 20:54:02,147][54818] Updated weights for policy 0, policy_version 476678 (0.0028) [2024-04-27 20:54:04,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7809941504. Throughput: 0: 55302.5. Samples: 715148780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:04,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 20:54:05,932][54818] Updated weights for policy 0, policy_version 476688 (0.0028) [2024-04-27 20:54:08,161][54818] Updated weights for policy 0, policy_version 476698 (0.0027) [2024-04-27 20:54:09,253][54587] Fps is (10 sec: 60619.4, 60 sec: 56251.5, 300 sec: 55761.1). Total num frames: 7810269184. Throughput: 0: 55378.1. Samples: 715486760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:09,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 20:54:11,855][54818] Updated weights for policy 0, policy_version 476708 (0.0027) [2024-04-27 20:54:14,028][54818] Updated weights for policy 0, policy_version 476718 (0.0030) [2024-04-27 20:54:14,253][54587] Fps is (10 sec: 60621.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7810547712. Throughput: 0: 55680.6. Samples: 715665240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:14,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 20:54:17,768][54818] Updated weights for policy 0, policy_version 476728 (0.0026) [2024-04-27 20:54:19,253][54587] Fps is (10 sec: 55706.4, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 7810826240. Throughput: 0: 55784.9. Samples: 715996500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:19,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 20:54:19,725][54818] Updated weights for policy 0, policy_version 476738 (0.0028) [2024-04-27 20:54:23,628][54818] Updated weights for policy 0, policy_version 476748 (0.0026) [2024-04-27 20:54:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7811088384. Throughput: 0: 55872.6. Samples: 716330660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:24,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 20:54:25,477][54818] Updated weights for policy 0, policy_version 476758 (0.0030) [2024-04-27 20:54:29,253][54587] Fps is (10 sec: 50790.8, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7811334144. Throughput: 0: 55592.9. Samples: 716494560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:29,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 20:54:29,470][54818] Updated weights for policy 0, policy_version 476768 (0.0028) [2024-04-27 20:54:30,359][54798] Signal inference workers to stop experience collection... (10400 times) [2024-04-27 20:54:30,359][54798] Signal inference workers to resume experience collection... 
(10400 times) [2024-04-27 20:54:30,381][54818] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-04-27 20:54:30,386][54818] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-04-27 20:54:31,468][54818] Updated weights for policy 0, policy_version 476778 (0.0030) [2024-04-27 20:54:34,253][54587] Fps is (10 sec: 52428.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7811612672. Throughput: 0: 55541.9. Samples: 716831800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:34,263][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 20:54:35,502][54818] Updated weights for policy 0, policy_version 476788 (0.0037) [2024-04-27 20:54:37,385][54818] Updated weights for policy 0, policy_version 476798 (0.0028) [2024-04-27 20:54:39,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 7811907584. Throughput: 0: 55574.0. Samples: 717163600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:39,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 20:54:41,466][54818] Updated weights for policy 0, policy_version 476808 (0.0033) [2024-04-27 20:54:43,290][54818] Updated weights for policy 0, policy_version 476818 (0.0031) [2024-04-27 20:54:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7812202496. Throughput: 0: 55868.3. Samples: 717327560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:44,262][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 20:54:47,339][54818] Updated weights for policy 0, policy_version 476828 (0.0032) [2024-04-27 20:54:49,224][54818] Updated weights for policy 0, policy_version 476838 (0.0023) [2024-04-27 20:54:49,253][54587] Fps is (10 sec: 60621.1, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 7812513792. Throughput: 0: 55767.6. Samples: 717658320. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 20:54:49,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 20:54:53,204][54818] Updated weights for policy 0, policy_version 476848 (0.0024) [2024-04-27 20:54:54,253][54587] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7812759552. Throughput: 0: 55635.8. Samples: 717990360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:54:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 20:54:55,455][54818] Updated weights for policy 0, policy_version 476858 (0.0043) [2024-04-27 20:54:59,056][54818] Updated weights for policy 0, policy_version 476868 (0.0024) [2024-04-27 20:54:59,253][54587] Fps is (10 sec: 50790.5, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7813021696. Throughput: 0: 55433.3. Samples: 718159740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:54:59,262][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 20:55:01,415][54818] Updated weights for policy 0, policy_version 476878 (0.0033) [2024-04-27 20:55:04,253][54587] Fps is (10 sec: 49151.9, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 7813251072. Throughput: 0: 55561.8. Samples: 718496780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:04,263][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 20:55:05,025][54818] Updated weights for policy 0, policy_version 476888 (0.0027) [2024-04-27 20:55:06,755][54798] Signal inference workers to stop experience collection... (10450 times) [2024-04-27 20:55:06,755][54798] Signal inference workers to resume experience collection... 
(10450 times) [2024-04-27 20:55:06,764][54818] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-04-27 20:55:06,774][54818] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-04-27 20:55:07,254][54818] Updated weights for policy 0, policy_version 476898 (0.0027) [2024-04-27 20:55:09,253][54587] Fps is (10 sec: 54067.0, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 7813562368. Throughput: 0: 55512.3. Samples: 718828720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:09,262][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:55:10,824][54818] Updated weights for policy 0, policy_version 476908 (0.0028) [2024-04-27 20:55:12,944][54818] Updated weights for policy 0, policy_version 476918 (0.0030) [2024-04-27 20:55:14,253][54587] Fps is (10 sec: 58982.1, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 7813840896. Throughput: 0: 55295.0. Samples: 718982840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:14,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 20:55:16,618][54818] Updated weights for policy 0, policy_version 476928 (0.0027) [2024-04-27 20:55:18,665][54818] Updated weights for policy 0, policy_version 476938 (0.0033) [2024-04-27 20:55:19,253][54587] Fps is (10 sec: 58982.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7814152192. Throughput: 0: 55311.5. Samples: 719320820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:19,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 20:55:22,678][54818] Updated weights for policy 0, policy_version 476948 (0.0028) [2024-04-27 20:55:24,253][54587] Fps is (10 sec: 58983.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7814430720. Throughput: 0: 55344.6. Samples: 719654100. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:24,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 20:55:24,700][54818] Updated weights for policy 0, policy_version 476958 (0.0031) [2024-04-27 20:55:28,548][54818] Updated weights for policy 0, policy_version 476968 (0.0028) [2024-04-27 20:55:29,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7814676480. Throughput: 0: 55376.9. Samples: 719819520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:29,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 20:55:30,886][54818] Updated weights for policy 0, policy_version 476978 (0.0027) [2024-04-27 20:55:34,253][54587] Fps is (10 sec: 50790.2, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7814938624. Throughput: 0: 55463.2. Samples: 720154160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:34,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 20:55:34,475][54818] Updated weights for policy 0, policy_version 476988 (0.0031) [2024-04-27 20:55:37,039][54818] Updated weights for policy 0, policy_version 476998 (0.0032) [2024-04-27 20:55:39,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7815217152. Throughput: 0: 55559.0. Samples: 720490520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:39,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 20:55:40,302][54818] Updated weights for policy 0, policy_version 477008 (0.0025) [2024-04-27 20:55:42,766][54818] Updated weights for policy 0, policy_version 477018 (0.0032) [2024-04-27 20:55:44,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 7815512064. Throughput: 0: 55368.8. Samples: 720651340. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:44,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 20:55:46,054][54818] Updated weights for policy 0, policy_version 477028 (0.0032) [2024-04-27 20:55:48,718][54818] Updated weights for policy 0, policy_version 477038 (0.0024) [2024-04-27 20:55:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 7815806976. Throughput: 0: 55291.5. Samples: 720984900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:49,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 20:55:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000477039_7815806976.pth... [2024-04-27 20:55:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000476224_7802454016.pth [2024-04-27 20:55:52,006][54818] Updated weights for policy 0, policy_version 477048 (0.0031) [2024-04-27 20:55:53,319][54798] Signal inference workers to stop experience collection... (10500 times) [2024-04-27 20:55:53,319][54798] Signal inference workers to resume experience collection... (10500 times) [2024-04-27 20:55:53,348][54818] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-04-27 20:55:53,348][54818] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-04-27 20:55:54,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7816085504. Throughput: 0: 55255.6. Samples: 721315220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:54,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:55:54,686][54818] Updated weights for policy 0, policy_version 477058 (0.0028) [2024-04-27 20:55:57,910][54818] Updated weights for policy 0, policy_version 477068 (0.0029) [2024-04-27 20:55:59,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7816380416. Throughput: 0: 55941.4. Samples: 721500200. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:55:59,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:56:00,407][54818] Updated weights for policy 0, policy_version 477078 (0.0030) [2024-04-27 20:56:03,781][54818] Updated weights for policy 0, policy_version 477088 (0.0030) [2024-04-27 20:56:04,253][54587] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 55539.0). Total num frames: 7816626176. Throughput: 0: 55778.8. Samples: 721830860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:56:04,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 20:56:06,190][54818] Updated weights for policy 0, policy_version 477098 (0.0030) [2024-04-27 20:56:09,253][54587] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7816904704. Throughput: 0: 55628.0. Samples: 722157360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:56:09,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 20:56:09,704][54818] Updated weights for policy 0, policy_version 477108 (0.0026) [2024-04-27 20:56:12,339][54818] Updated weights for policy 0, policy_version 477118 (0.0031) [2024-04-27 20:56:14,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 7817183232. Throughput: 0: 55538.8. Samples: 722318760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-04-27 20:56:14,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 20:56:15,755][54818] Updated weights for policy 0, policy_version 477128 (0.0032) [2024-04-27 20:56:18,331][54818] Updated weights for policy 0, policy_version 477138 (0.0026) [2024-04-27 20:56:19,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 7817461760. Throughput: 0: 55508.9. Samples: 722652060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:56:19,253][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 20:56:21,575][54818] Updated weights for policy 0, policy_version 477148 (0.0029) [2024-04-27 20:56:24,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 7817740288. Throughput: 0: 55495.6. Samples: 722987820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:56:24,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 20:56:24,293][54818] Updated weights for policy 0, policy_version 477158 (0.0032) [2024-04-27 20:56:27,514][54818] Updated weights for policy 0, policy_version 477168 (0.0028) [2024-04-27 20:56:29,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7818018816. Throughput: 0: 55669.4. Samples: 723156460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:56:29,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 20:56:30,220][54818] Updated weights for policy 0, policy_version 477178 (0.0037) [2024-04-27 20:56:33,355][54818] Updated weights for policy 0, policy_version 477188 (0.0030) [2024-04-27 20:56:34,254][54587] Fps is (10 sec: 55699.5, 60 sec: 55977.5, 300 sec: 55594.3). Total num frames: 7818297344. Throughput: 0: 55704.0. Samples: 723491640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:56:34,255][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 20:56:35,985][54818] Updated weights for policy 0, policy_version 477198 (0.0031) [2024-04-27 20:56:39,196][54818] Updated weights for policy 0, policy_version 477208 (0.0037) [2024-04-27 20:56:39,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7818575872. Throughput: 0: 55705.7. Samples: 723821980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:56:39,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 20:56:42,040][54818] Updated weights for policy 0, policy_version 477218 (0.0028) [2024-04-27 20:56:44,253][54587] Fps is (10 sec: 54073.7, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7818838016. Throughput: 0: 55051.2. Samples: 723977500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:56:44,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:56:45,102][54818] Updated weights for policy 0, policy_version 477228 (0.0031) [2024-04-27 20:56:48,079][54818] Updated weights for policy 0, policy_version 477238 (0.0032) [2024-04-27 20:56:49,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7819116544. Throughput: 0: 55141.7. Samples: 724312240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:56:49,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 20:56:51,074][54818] Updated weights for policy 0, policy_version 477248 (0.0032) [2024-04-27 20:56:53,789][54818] Updated weights for policy 0, policy_version 477258 (0.0032) [2024-04-27 20:56:54,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7819395072. Throughput: 0: 55361.8. Samples: 724648640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:56:54,253][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 20:56:57,084][54818] Updated weights for policy 0, policy_version 477268 (0.0026) [2024-04-27 20:56:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 7819689984. Throughput: 0: 55574.1. Samples: 724819600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:56:59,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 20:56:59,922][54818] Updated weights for policy 0, policy_version 477278 (0.0024) [2024-04-27 20:57:02,838][54818] Updated weights for policy 0, policy_version 477288 (0.0032) [2024-04-27 20:57:04,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7819968512. Throughput: 0: 55604.7. Samples: 725154280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:57:04,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 20:57:05,713][54818] Updated weights for policy 0, policy_version 477298 (0.0028) [2024-04-27 20:57:08,526][54818] Updated weights for policy 0, policy_version 477308 (0.0025) [2024-04-27 20:57:09,150][54798] Signal inference workers to stop experience collection... (10550 times) [2024-04-27 20:57:09,151][54798] Signal inference workers to resume experience collection... (10550 times) [2024-04-27 20:57:09,175][54818] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-04-27 20:57:09,176][54818] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-04-27 20:57:09,253][54587] Fps is (10 sec: 55706.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7820247040. Throughput: 0: 55518.4. Samples: 725486140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:57:09,253][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 20:57:11,559][54818] Updated weights for policy 0, policy_version 477318 (0.0030) [2024-04-27 20:57:14,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7820525568. Throughput: 0: 55511.2. Samples: 725654460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:57:14,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 20:57:14,377][54818] Updated weights for policy 0, policy_version 477328 (0.0031) [2024-04-27 20:57:17,413][54818] Updated weights for policy 0, policy_version 477338 (0.0031) [2024-04-27 20:57:19,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7820787712. Throughput: 0: 55456.5. Samples: 725987120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:57:19,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 20:57:20,611][54818] Updated weights for policy 0, policy_version 477348 (0.0029) [2024-04-27 20:57:23,280][54818] Updated weights for policy 0, policy_version 477358 (0.0030) [2024-04-27 20:57:24,253][54587] Fps is (10 sec: 52428.4, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7821049856. Throughput: 0: 55391.2. Samples: 726314580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:57:24,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 20:57:26,417][54818] Updated weights for policy 0, policy_version 477368 (0.0031) [2024-04-27 20:57:29,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7821344768. Throughput: 0: 55743.6. Samples: 726485960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:57:29,253][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 20:57:29,273][54818] Updated weights for policy 0, policy_version 477378 (0.0028) [2024-04-27 20:57:32,199][54818] Updated weights for policy 0, policy_version 477388 (0.0028) [2024-04-27 20:57:34,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55433.6, 300 sec: 55594.5). Total num frames: 7821623296. Throughput: 0: 55712.4. Samples: 726819300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:57:34,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 20:57:35,041][54818] Updated weights for policy 0, policy_version 477398 (0.0025) [2024-04-27 20:57:38,173][54818] Updated weights for policy 0, policy_version 477408 (0.0026) [2024-04-27 20:57:39,253][54587] Fps is (10 sec: 57343.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7821918208. Throughput: 0: 55665.6. Samples: 727153600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 20:57:39,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 20:57:41,017][54818] Updated weights for policy 0, policy_version 477418 (0.0025) [2024-04-27 20:57:44,231][54818] Updated weights for policy 0, policy_version 477428 (0.0030) [2024-04-27 20:57:44,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7822180352. Throughput: 0: 55646.9. Samples: 727323700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:57:44,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 20:57:46,889][54818] Updated weights for policy 0, policy_version 477438 (0.0027) [2024-04-27 20:57:49,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7822458880. Throughput: 0: 55569.4. Samples: 727654900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:57:49,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:57:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000477445_7822458880.pth... [2024-04-27 20:57:49,341][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000476632_7809138688.pth [2024-04-27 20:57:50,062][54818] Updated weights for policy 0, policy_version 477448 (0.0029) [2024-04-27 20:57:52,796][54818] Updated weights for policy 0, policy_version 477458 (0.0028) [2024-04-27 20:57:54,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55539.0). 
Total num frames: 7822721024. Throughput: 0: 55710.6. Samples: 727993120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:57:54,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 20:57:55,843][54818] Updated weights for policy 0, policy_version 477468 (0.0035) [2024-04-27 20:57:58,550][54818] Updated weights for policy 0, policy_version 477478 (0.0034) [2024-04-27 20:57:59,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7823015936. Throughput: 0: 55516.8. Samples: 728152720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:57:59,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 20:58:01,906][54818] Updated weights for policy 0, policy_version 477488 (0.0029) [2024-04-27 20:58:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7823294464. Throughput: 0: 55492.5. Samples: 728484280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:04,254][54587] Avg episode reward: [(0, '0.702')] [2024-04-27 20:58:04,522][54818] Updated weights for policy 0, policy_version 477498 (0.0029) [2024-04-27 20:58:07,864][54818] Updated weights for policy 0, policy_version 477508 (0.0030) [2024-04-27 20:58:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 7823572992. Throughput: 0: 55620.0. Samples: 728817480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:09,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 20:58:10,420][54818] Updated weights for policy 0, policy_version 477518 (0.0027) [2024-04-27 20:58:13,685][54818] Updated weights for policy 0, policy_version 477528 (0.0024) [2024-04-27 20:58:14,253][54587] Fps is (10 sec: 57344.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7823867904. Throughput: 0: 55494.7. Samples: 728983220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:14,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 20:58:14,836][54798] Signal inference workers to stop experience collection... (10600 times) [2024-04-27 20:58:14,876][54818] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-04-27 20:58:14,889][54798] Signal inference workers to resume experience collection... (10600 times) [2024-04-27 20:58:14,896][54818] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-04-27 20:58:16,387][54818] Updated weights for policy 0, policy_version 477538 (0.0031) [2024-04-27 20:58:19,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7824113664. Throughput: 0: 55577.5. Samples: 729320280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:19,253][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 20:58:19,399][54818] Updated weights for policy 0, policy_version 477548 (0.0029) [2024-04-27 20:58:22,239][54818] Updated weights for policy 0, policy_version 477558 (0.0024) [2024-04-27 20:58:24,253][54587] Fps is (10 sec: 52428.3, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 7824392192. Throughput: 0: 55554.3. Samples: 729653540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:24,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 20:58:25,345][54818] Updated weights for policy 0, policy_version 477568 (0.0030) [2024-04-27 20:58:28,153][54818] Updated weights for policy 0, policy_version 477578 (0.0024) [2024-04-27 20:58:29,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7824670720. Throughput: 0: 55371.1. Samples: 729815400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:29,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 20:58:31,442][54818] Updated weights for policy 0, policy_version 477588 (0.0034) [2024-04-27 20:58:34,013][54818] Updated weights for policy 0, policy_version 477598 (0.0029) [2024-04-27 20:58:34,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7824965632. Throughput: 0: 55445.0. Samples: 730149920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:34,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 20:58:37,172][54818] Updated weights for policy 0, policy_version 477608 (0.0028) [2024-04-27 20:58:39,253][54587] Fps is (10 sec: 57342.7, 60 sec: 55432.5, 300 sec: 55538.9). Total num frames: 7825244160. Throughput: 0: 55471.8. Samples: 730489360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:39,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 20:58:39,926][54818] Updated weights for policy 0, policy_version 477618 (0.0035) [2024-04-27 20:58:42,969][54818] Updated weights for policy 0, policy_version 477628 (0.0027) [2024-04-27 20:58:44,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 55483.5). Total num frames: 7825506304. Throughput: 0: 55652.9. Samples: 730657100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:44,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 20:58:45,773][54818] Updated weights for policy 0, policy_version 477638 (0.0026) [2024-04-27 20:58:48,957][54818] Updated weights for policy 0, policy_version 477648 (0.0031) [2024-04-27 20:58:49,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7825801216. Throughput: 0: 55604.4. Samples: 730986480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:49,254][54587] Avg episode reward: [(0, '0.677')] [2024-04-27 20:58:51,828][54818] Updated weights for policy 0, policy_version 477658 (0.0027) [2024-04-27 20:58:54,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7826063360. Throughput: 0: 55527.3. Samples: 731316200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:54,253][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 20:58:54,929][54818] Updated weights for policy 0, policy_version 477668 (0.0028) [2024-04-27 20:58:57,866][54818] Updated weights for policy 0, policy_version 477678 (0.0029) [2024-04-27 20:58:59,253][54587] Fps is (10 sec: 54068.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7826341888. Throughput: 0: 55608.4. Samples: 731485600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:58:59,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 20:59:00,626][54818] Updated weights for policy 0, policy_version 477688 (0.0031) [2024-04-27 20:59:03,920][54818] Updated weights for policy 0, policy_version 477698 (0.0028) [2024-04-27 20:59:04,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7826620416. Throughput: 0: 55517.2. Samples: 731818560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:59:04,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 20:59:06,389][54818] Updated weights for policy 0, policy_version 477708 (0.0028) [2024-04-27 20:59:09,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7826898944. Throughput: 0: 55497.3. Samples: 732150920. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-27 20:59:09,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 20:59:09,696][54818] Updated weights for policy 0, policy_version 477718 (0.0032) [2024-04-27 20:59:12,449][54818] Updated weights for policy 0, policy_version 477728 (0.0031) [2024-04-27 20:59:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 7827177472. Throughput: 0: 55654.1. Samples: 732319840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 20:59:15,664][54818] Updated weights for policy 0, policy_version 477738 (0.0028) [2024-04-27 20:59:18,394][54818] Updated weights for policy 0, policy_version 477748 (0.0027) [2024-04-27 20:59:19,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 7827472384. Throughput: 0: 55610.1. Samples: 732652380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 20:59:21,661][54818] Updated weights for policy 0, policy_version 477758 (0.0037) [2024-04-27 20:59:24,105][54798] Signal inference workers to stop experience collection... (10650 times) [2024-04-27 20:59:24,105][54798] Signal inference workers to resume experience collection... (10650 times) [2024-04-27 20:59:24,129][54818] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-04-27 20:59:24,129][54818] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-04-27 20:59:24,215][54818] Updated weights for policy 0, policy_version 477768 (0.0026) [2024-04-27 20:59:24,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7827750912. Throughput: 0: 55401.1. Samples: 732982400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:24,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-27 20:59:27,439][54818] Updated weights for policy 0, policy_version 477778 (0.0034) [2024-04-27 20:59:29,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7827996672. Throughput: 0: 55279.7. Samples: 733144680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:29,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 20:59:30,039][54818] Updated weights for policy 0, policy_version 477788 (0.0025) [2024-04-27 20:59:33,169][54818] Updated weights for policy 0, policy_version 477798 (0.0042) [2024-04-27 20:59:34,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7828291584. Throughput: 0: 55531.7. Samples: 733485400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:34,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 20:59:35,940][54818] Updated weights for policy 0, policy_version 477808 (0.0032) [2024-04-27 20:59:39,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55159.7, 300 sec: 55427.9). Total num frames: 7828553728. Throughput: 0: 55691.6. Samples: 733822320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:39,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 20:59:39,276][54818] Updated weights for policy 0, policy_version 477818 (0.0032) [2024-04-27 20:59:41,806][54818] Updated weights for policy 0, policy_version 477828 (0.0029) [2024-04-27 20:59:44,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 7828848640. Throughput: 0: 55523.0. Samples: 733984140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:44,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 20:59:45,261][54818] Updated weights for policy 0, policy_version 477838 (0.0025) [2024-04-27 20:59:47,517][54818] Updated weights for policy 0, policy_version 477848 (0.0029) [2024-04-27 20:59:49,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7829127168. Throughput: 0: 55502.3. Samples: 734316160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:49,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 20:59:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000477852_7829127168.pth... [2024-04-27 20:59:49,331][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000477039_7815806976.pth [2024-04-27 20:59:51,123][54818] Updated weights for policy 0, policy_version 477858 (0.0034) [2024-04-27 20:59:53,394][54818] Updated weights for policy 0, policy_version 477868 (0.0039) [2024-04-27 20:59:54,253][54587] Fps is (10 sec: 58981.4, 60 sec: 56251.5, 300 sec: 55650.0). Total num frames: 7829438464. Throughput: 0: 55555.0. Samples: 734650900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:54,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 20:59:56,977][54818] Updated weights for policy 0, policy_version 477878 (0.0031) [2024-04-27 20:59:59,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7829684224. Throughput: 0: 55643.1. Samples: 734823780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 20:59:59,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 20:59:59,469][54818] Updated weights for policy 0, policy_version 477888 (0.0031) [2024-04-27 21:00:02,906][54818] Updated weights for policy 0, policy_version 477898 (0.0030) [2024-04-27 21:00:04,253][54587] Fps is (10 sec: 50791.8, 60 sec: 55432.7, 300 sec: 55539.0). 
Total num frames: 7829946368. Throughput: 0: 55498.9. Samples: 735149820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 21:00:04,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 21:00:05,570][54818] Updated weights for policy 0, policy_version 477908 (0.0030) [2024-04-27 21:00:08,787][54818] Updated weights for policy 0, policy_version 477918 (0.0029) [2024-04-27 21:00:09,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7830224896. Throughput: 0: 55633.4. Samples: 735485900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 21:00:09,253][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 21:00:11,239][54818] Updated weights for policy 0, policy_version 477928 (0.0035) [2024-04-27 21:00:14,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 7830503424. Throughput: 0: 55691.0. Samples: 735650780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 21:00:14,255][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 21:00:14,721][54818] Updated weights for policy 0, policy_version 477938 (0.0032) [2024-04-27 21:00:16,994][54818] Updated weights for policy 0, policy_version 477948 (0.0029) [2024-04-27 21:00:19,253][54587] Fps is (10 sec: 55704.9, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7830781952. Throughput: 0: 55586.6. Samples: 735986800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 21:00:19,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 21:00:20,562][54818] Updated weights for policy 0, policy_version 477958 (0.0034) [2024-04-27 21:00:22,993][54818] Updated weights for policy 0, policy_version 477968 (0.0039) [2024-04-27 21:00:24,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7831076864. Throughput: 0: 55359.9. Samples: 736313520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 21:00:24,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 21:00:26,602][54818] Updated weights for policy 0, policy_version 477978 (0.0028) [2024-04-27 21:00:29,026][54818] Updated weights for policy 0, policy_version 477988 (0.0034) [2024-04-27 21:00:29,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 7831355392. Throughput: 0: 55540.7. Samples: 736483480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 21:00:29,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:00:32,560][54818] Updated weights for policy 0, policy_version 477998 (0.0030) [2024-04-27 21:00:34,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7831633920. Throughput: 0: 55572.5. Samples: 736816920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 21:00:34,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 21:00:34,801][54818] Updated weights for policy 0, policy_version 478008 (0.0031) [2024-04-27 21:00:38,299][54818] Updated weights for policy 0, policy_version 478018 (0.0027) [2024-04-27 21:00:39,253][54587] Fps is (10 sec: 54068.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7831896064. Throughput: 0: 55642.5. Samples: 737154800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:00:39,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 21:00:40,179][54798] Signal inference workers to stop experience collection... (10700 times) [2024-04-27 21:00:40,179][54798] Signal inference workers to resume experience collection... 
(10700 times) [2024-04-27 21:00:40,191][54818] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-04-27 21:00:40,191][54818] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-04-27 21:00:40,845][54818] Updated weights for policy 0, policy_version 478028 (0.0024) [2024-04-27 21:00:44,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7832158208. Throughput: 0: 55261.9. Samples: 737310560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:00:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 21:00:44,288][54818] Updated weights for policy 0, policy_version 478038 (0.0028) [2024-04-27 21:00:46,909][54818] Updated weights for policy 0, policy_version 478048 (0.0029) [2024-04-27 21:00:49,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 7832436736. Throughput: 0: 55439.1. Samples: 737644580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:00:49,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 21:00:50,213][54818] Updated weights for policy 0, policy_version 478058 (0.0024) [2024-04-27 21:00:52,836][54818] Updated weights for policy 0, policy_version 478068 (0.0031) [2024-04-27 21:00:54,253][54587] Fps is (10 sec: 57342.9, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 7832731648. Throughput: 0: 55352.2. Samples: 737976760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:00:54,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 21:00:55,907][54818] Updated weights for policy 0, policy_version 478078 (0.0029) [2024-04-27 21:00:58,547][54818] Updated weights for policy 0, policy_version 478088 (0.0030) [2024-04-27 21:00:59,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7833010176. Throughput: 0: 55451.7. Samples: 738146100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:00:59,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 21:01:01,918][54818] Updated weights for policy 0, policy_version 478098 (0.0029) [2024-04-27 21:01:04,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 7833288704. Throughput: 0: 55416.4. Samples: 738480540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:04,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:01:04,586][54818] Updated weights for policy 0, policy_version 478108 (0.0030) [2024-04-27 21:01:07,931][54818] Updated weights for policy 0, policy_version 478118 (0.0031) [2024-04-27 21:01:09,253][54587] Fps is (10 sec: 57343.3, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7833583616. Throughput: 0: 55477.2. Samples: 738810000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:09,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 21:01:10,528][54818] Updated weights for policy 0, policy_version 478128 (0.0026) [2024-04-27 21:01:13,842][54818] Updated weights for policy 0, policy_version 478138 (0.0028) [2024-04-27 21:01:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7833845760. Throughput: 0: 55426.8. Samples: 738977680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:14,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 21:01:16,352][54818] Updated weights for policy 0, policy_version 478148 (0.0031) [2024-04-27 21:01:19,253][54587] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 7834107904. Throughput: 0: 55417.4. Samples: 739310700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:19,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 21:01:19,832][54818] Updated weights for policy 0, policy_version 478158 (0.0027) [2024-04-27 21:01:22,366][54818] Updated weights for policy 0, policy_version 478168 (0.0027) [2024-04-27 21:01:24,253][54587] Fps is (10 sec: 52428.6, 60 sec: 54886.3, 300 sec: 55427.9). Total num frames: 7834370048. Throughput: 0: 55295.3. Samples: 739643100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:24,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:01:25,722][54818] Updated weights for policy 0, policy_version 478178 (0.0033) [2024-04-27 21:01:28,209][54818] Updated weights for policy 0, policy_version 478188 (0.0025) [2024-04-27 21:01:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55432.7, 300 sec: 55539.2). Total num frames: 7834681344. Throughput: 0: 55525.3. Samples: 739809200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:29,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 21:01:31,541][54818] Updated weights for policy 0, policy_version 478198 (0.0027) [2024-04-27 21:01:34,204][54818] Updated weights for policy 0, policy_version 478208 (0.0027) [2024-04-27 21:01:34,253][54587] Fps is (10 sec: 58983.3, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 7834959872. Throughput: 0: 55360.4. Samples: 740135800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:34,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 21:01:37,383][54818] Updated weights for policy 0, policy_version 478218 (0.0027) [2024-04-27 21:01:39,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7835222016. Throughput: 0: 55389.0. Samples: 740469260. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:39,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 21:01:39,722][54798] Signal inference workers to stop experience collection... (10750 times) [2024-04-27 21:01:39,773][54818] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-04-27 21:01:39,783][54798] Signal inference workers to resume experience collection... (10750 times) [2024-04-27 21:01:39,787][54818] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-04-27 21:01:40,026][54818] Updated weights for policy 0, policy_version 478228 (0.0028) [2024-04-27 21:01:43,362][54818] Updated weights for policy 0, policy_version 478238 (0.0025) [2024-04-27 21:01:44,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7835516928. Throughput: 0: 55542.1. Samples: 740645500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:44,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 21:01:45,747][54818] Updated weights for policy 0, policy_version 478248 (0.0028) [2024-04-27 21:01:49,197][54818] Updated weights for policy 0, policy_version 478258 (0.0032) [2024-04-27 21:01:49,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7835779072. Throughput: 0: 55578.4. Samples: 740981560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:49,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 21:01:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000478258_7835779072.pth... [2024-04-27 21:01:49,329][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000477445_7822458880.pth [2024-04-27 21:01:51,656][54818] Updated weights for policy 0, policy_version 478268 (0.0029) [2024-04-27 21:01:54,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7836041216. Throughput: 0: 55589.4. Samples: 741311520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:54,254][54587] Avg episode reward: [(0, '0.469')] [2024-04-27 21:01:55,021][54818] Updated weights for policy 0, policy_version 478278 (0.0034) [2024-04-27 21:01:57,580][54818] Updated weights for policy 0, policy_version 478288 (0.0024) [2024-04-27 21:01:59,253][54587] Fps is (10 sec: 52428.7, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 7836303360. Throughput: 0: 55361.9. Samples: 741468960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 21:01:59,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 21:02:00,957][54818] Updated weights for policy 0, policy_version 478298 (0.0029) [2024-04-27 21:02:03,419][54818] Updated weights for policy 0, policy_version 478308 (0.0036) [2024-04-27 21:02:04,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 7836614656. Throughput: 0: 55380.8. Samples: 741802840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 21:02:06,935][54818] Updated weights for policy 0, policy_version 478318 (0.0031) [2024-04-27 21:02:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 7836909568. Throughput: 0: 55428.7. Samples: 742137380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:09,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 21:02:09,330][54818] Updated weights for policy 0, policy_version 478328 (0.0029) [2024-04-27 21:02:12,725][54818] Updated weights for policy 0, policy_version 478338 (0.0038) [2024-04-27 21:02:14,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 7837155328. Throughput: 0: 55536.5. Samples: 742308340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:14,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:02:15,298][54818] Updated weights for policy 0, policy_version 478348 (0.0024) [2024-04-27 21:02:18,575][54818] Updated weights for policy 0, policy_version 478358 (0.0026) [2024-04-27 21:02:19,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7837433856. Throughput: 0: 55619.1. Samples: 742638660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:19,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 21:02:21,429][54818] Updated weights for policy 0, policy_version 478368 (0.0027) [2024-04-27 21:02:24,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7837728768. Throughput: 0: 55574.6. Samples: 742970120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:24,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 21:02:24,591][54818] Updated weights for policy 0, policy_version 478378 (0.0031) [2024-04-27 21:02:27,363][54818] Updated weights for policy 0, policy_version 478388 (0.0033) [2024-04-27 21:02:29,253][54587] Fps is (10 sec: 52428.8, 60 sec: 54613.4, 300 sec: 55372.4). Total num frames: 7837958144. Throughput: 0: 55125.0. Samples: 743126120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:29,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 21:02:30,412][54818] Updated weights for policy 0, policy_version 478398 (0.0028) [2024-04-27 21:02:33,163][54818] Updated weights for policy 0, policy_version 478408 (0.0028) [2024-04-27 21:02:34,253][54587] Fps is (10 sec: 52429.3, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 7838253056. Throughput: 0: 55159.6. Samples: 743463740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:34,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 21:02:36,306][54818] Updated weights for policy 0, policy_version 478418 (0.0034) [2024-04-27 21:02:39,022][54818] Updated weights for policy 0, policy_version 478428 (0.0035) [2024-04-27 21:02:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7838564352. Throughput: 0: 55294.3. Samples: 743799760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:39,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 21:02:42,308][54818] Updated weights for policy 0, policy_version 478438 (0.0028) [2024-04-27 21:02:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7838859264. Throughput: 0: 55625.3. Samples: 743972100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:02:44,863][54818] Updated weights for policy 0, policy_version 478448 (0.0025) [2024-04-27 21:02:48,137][54818] Updated weights for policy 0, policy_version 478458 (0.0026) [2024-04-27 21:02:49,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 7839088640. Throughput: 0: 55499.6. Samples: 744300320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:49,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 21:02:50,943][54818] Updated weights for policy 0, policy_version 478468 (0.0030) [2024-04-27 21:02:54,020][54818] Updated weights for policy 0, policy_version 478478 (0.0029) [2024-04-27 21:02:54,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7839383552. Throughput: 0: 55350.2. Samples: 744628140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:54,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 21:02:56,798][54818] Updated weights for policy 0, policy_version 478488 (0.0028) [2024-04-27 21:02:59,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 7839645696. Throughput: 0: 55340.7. Samples: 744798680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:02:59,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 21:03:00,125][54818] Updated weights for policy 0, policy_version 478498 (0.0037) [2024-04-27 21:03:00,614][54798] Signal inference workers to stop experience collection... (10800 times) [2024-04-27 21:03:00,653][54818] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-04-27 21:03:00,703][54798] Signal inference workers to resume experience collection... (10800 times) [2024-04-27 21:03:00,703][54818] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-04-27 21:03:02,867][54818] Updated weights for policy 0, policy_version 478508 (0.0029) [2024-04-27 21:03:04,253][54587] Fps is (10 sec: 52428.4, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 7839907840. Throughput: 0: 55285.2. Samples: 745126500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:03:04,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 21:03:05,917][54818] Updated weights for policy 0, policy_version 478518 (0.0031) [2024-04-27 21:03:08,955][54818] Updated weights for policy 0, policy_version 478528 (0.0035) [2024-04-27 21:03:09,253][54587] Fps is (10 sec: 55706.7, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 7840202752. Throughput: 0: 55381.5. Samples: 745462280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:03:09,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 21:03:11,835][54818] Updated weights for policy 0, policy_version 478538 (0.0030) [2024-04-27 21:03:14,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 7840497664. Throughput: 0: 55458.1. Samples: 745621740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:03:14,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 21:03:14,877][54818] Updated weights for policy 0, policy_version 478548 (0.0039) [2024-04-27 21:03:17,717][54818] Updated weights for policy 0, policy_version 478558 (0.0025) [2024-04-27 21:03:19,253][54587] Fps is (10 sec: 58981.3, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 7840792576. Throughput: 0: 55354.1. Samples: 745954680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:03:19,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 21:03:20,594][54818] Updated weights for policy 0, policy_version 478568 (0.0029) [2024-04-27 21:03:23,632][54818] Updated weights for policy 0, policy_version 478578 (0.0027) [2024-04-27 21:03:24,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7841054720. Throughput: 0: 55306.1. Samples: 746288540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-27 21:03:24,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 21:03:26,575][54818] Updated weights for policy 0, policy_version 478588 (0.0026) [2024-04-27 21:03:29,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55978.6, 300 sec: 55427.9). Total num frames: 7841316864. Throughput: 0: 55204.4. Samples: 746456300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:03:29,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 21:03:29,821][54818] Updated weights for policy 0, policy_version 478598 (0.0029) [2024-04-27 21:03:32,670][54818] Updated weights for policy 0, policy_version 478608 (0.0032) [2024-04-27 21:03:34,253][54587] Fps is (10 sec: 50790.5, 60 sec: 55159.4, 300 sec: 55316.9). Total num frames: 7841562624. Throughput: 0: 55349.4. Samples: 746791040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:03:34,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 21:03:35,595][54818] Updated weights for policy 0, policy_version 478618 (0.0031) [2024-04-27 21:03:38,493][54818] Updated weights for policy 0, policy_version 478628 (0.0029) [2024-04-27 21:03:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 7841873920. Throughput: 0: 55565.3. Samples: 747128580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:03:39,253][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 21:03:41,420][54818] Updated weights for policy 0, policy_version 478638 (0.0027) [2024-04-27 21:03:44,253][54587] Fps is (10 sec: 57344.5, 60 sec: 54613.4, 300 sec: 55372.4). Total num frames: 7842136064. Throughput: 0: 55539.7. Samples: 747297960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:03:44,253][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 21:03:44,424][54818] Updated weights for policy 0, policy_version 478648 (0.0035) [2024-04-27 21:03:47,358][54818] Updated weights for policy 0, policy_version 478658 (0.0029) [2024-04-27 21:03:49,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7842447360. Throughput: 0: 55563.6. Samples: 747626860. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:03:49,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 21:03:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000478665_7842447360.pth... [2024-04-27 21:03:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000477852_7829127168.pth [2024-04-27 21:03:50,398][54818] Updated weights for policy 0, policy_version 478668 (0.0035) [2024-04-27 21:03:53,241][54818] Updated weights for policy 0, policy_version 478678 (0.0026) [2024-04-27 21:03:54,253][54587] Fps is (10 sec: 58981.4, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 7842725888. Throughput: 0: 55423.7. Samples: 747956360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:03:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:03:56,447][54818] Updated weights for policy 0, policy_version 478688 (0.0026) [2024-04-27 21:03:59,067][54818] Updated weights for policy 0, policy_version 478698 (0.0031) [2024-04-27 21:03:59,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 7842988032. Throughput: 0: 55647.6. Samples: 748125880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:03:59,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:04:02,252][54818] Updated weights for policy 0, policy_version 478708 (0.0026) [2024-04-27 21:04:04,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55978.5, 300 sec: 55483.4). Total num frames: 7843266560. Throughput: 0: 55704.8. Samples: 748461400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:04,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 21:04:04,617][54798] Signal inference workers to stop experience collection... (10850 times) [2024-04-27 21:04:04,618][54798] Signal inference workers to resume experience collection... 
(10850 times) [2024-04-27 21:04:04,633][54818] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-04-27 21:04:04,633][54818] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-04-27 21:04:04,867][54818] Updated weights for policy 0, policy_version 478718 (0.0026) [2024-04-27 21:04:07,936][54818] Updated weights for policy 0, policy_version 478728 (0.0031) [2024-04-27 21:04:09,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 7843528704. Throughput: 0: 55749.8. Samples: 748797280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:09,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 21:04:11,003][54818] Updated weights for policy 0, policy_version 478738 (0.0034) [2024-04-27 21:04:13,657][54818] Updated weights for policy 0, policy_version 478748 (0.0030) [2024-04-27 21:04:14,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 7843823616. Throughput: 0: 55631.2. Samples: 748959700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:14,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 21:04:16,952][54818] Updated weights for policy 0, policy_version 478758 (0.0030) [2024-04-27 21:04:19,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7844102144. Throughput: 0: 55582.6. Samples: 749292260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:19,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 21:04:19,623][54818] Updated weights for policy 0, policy_version 478768 (0.0029) [2024-04-27 21:04:22,826][54818] Updated weights for policy 0, policy_version 478778 (0.0025) [2024-04-27 21:04:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7844397056. Throughput: 0: 55471.5. Samples: 749624800. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:24,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 21:04:25,612][54818] Updated weights for policy 0, policy_version 478788 (0.0033) [2024-04-27 21:04:28,552][54818] Updated weights for policy 0, policy_version 478798 (0.0028) [2024-04-27 21:04:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7844675584. Throughput: 0: 55603.0. Samples: 749800100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:29,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 21:04:31,537][54818] Updated weights for policy 0, policy_version 478808 (0.0025) [2024-04-27 21:04:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 55539.0). Total num frames: 7844937728. Throughput: 0: 55700.5. Samples: 750133380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:34,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 21:04:34,316][54818] Updated weights for policy 0, policy_version 478818 (0.0031) [2024-04-27 21:04:37,446][54818] Updated weights for policy 0, policy_version 478828 (0.0030) [2024-04-27 21:04:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7845232640. Throughput: 0: 55898.4. Samples: 750471780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:39,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 21:04:40,216][54818] Updated weights for policy 0, policy_version 478838 (0.0031) [2024-04-27 21:04:43,272][54818] Updated weights for policy 0, policy_version 478848 (0.0025) [2024-04-27 21:04:44,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55483.4). Total num frames: 7845494784. Throughput: 0: 55694.6. Samples: 750632140. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:44,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 21:04:46,090][54818] Updated weights for policy 0, policy_version 478858 (0.0028) [2024-04-27 21:04:49,012][54818] Updated weights for policy 0, policy_version 478868 (0.0028) [2024-04-27 21:04:49,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 7845773312. Throughput: 0: 55724.6. Samples: 750969000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 21:04:49,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 21:04:51,983][54818] Updated weights for policy 0, policy_version 478878 (0.0025) [2024-04-27 21:04:54,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7846068224. Throughput: 0: 55728.0. Samples: 751305040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:04:54,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 21:04:54,987][54818] Updated weights for policy 0, policy_version 478888 (0.0029) [2024-04-27 21:04:57,880][54818] Updated weights for policy 0, policy_version 478898 (0.0026) [2024-04-27 21:04:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 7846363136. Throughput: 0: 55951.9. Samples: 751477540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:04:59,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 21:05:00,694][54818] Updated weights for policy 0, policy_version 478908 (0.0029) [2024-04-27 21:05:03,819][54818] Updated weights for policy 0, policy_version 478918 (0.0027) [2024-04-27 21:05:04,253][54587] Fps is (10 sec: 55705.0, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7846625280. Throughput: 0: 55998.1. Samples: 751812180. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:04,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 21:05:06,579][54818] Updated weights for policy 0, policy_version 478928 (0.0036) [2024-04-27 21:05:09,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7846887424. Throughput: 0: 56114.6. Samples: 752149960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:09,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 21:05:09,587][54818] Updated weights for policy 0, policy_version 478938 (0.0032) [2024-04-27 21:05:12,575][54818] Updated weights for policy 0, policy_version 478948 (0.0032) [2024-04-27 21:05:13,800][54798] Signal inference workers to stop experience collection... (10900 times) [2024-04-27 21:05:13,801][54798] Signal inference workers to resume experience collection... (10900 times) [2024-04-27 21:05:13,817][54818] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-04-27 21:05:13,817][54818] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-04-27 21:05:14,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7847182336. Throughput: 0: 55930.6. Samples: 752316980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:14,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 21:05:15,446][54818] Updated weights for policy 0, policy_version 478958 (0.0031) [2024-04-27 21:05:18,547][54818] Updated weights for policy 0, policy_version 478968 (0.0025) [2024-04-27 21:05:19,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7847460864. Throughput: 0: 55988.0. Samples: 752652840. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:19,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 21:05:21,498][54818] Updated weights for policy 0, policy_version 478978 (0.0031) [2024-04-27 21:05:24,253][54587] Fps is (10 sec: 52429.3, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 7847706624. Throughput: 0: 55738.7. Samples: 752980020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:24,253][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 21:05:24,692][54818] Updated weights for policy 0, policy_version 478988 (0.0034) [2024-04-27 21:05:27,301][54818] Updated weights for policy 0, policy_version 478998 (0.0028) [2024-04-27 21:05:29,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 7848017920. Throughput: 0: 55926.8. Samples: 753148840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:29,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 21:05:30,465][54818] Updated weights for policy 0, policy_version 479008 (0.0033) [2024-04-27 21:05:33,108][54818] Updated weights for policy 0, policy_version 479018 (0.0037) [2024-04-27 21:05:34,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7848296448. Throughput: 0: 55912.1. Samples: 753485040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:34,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 21:05:36,269][54818] Updated weights for policy 0, policy_version 479028 (0.0036) [2024-04-27 21:05:39,095][54818] Updated weights for policy 0, policy_version 479038 (0.0032) [2024-04-27 21:05:39,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7848558592. Throughput: 0: 55837.8. Samples: 753817740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:39,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 21:05:42,221][54818] Updated weights for policy 0, policy_version 479048 (0.0027) [2024-04-27 21:05:44,253][54587] Fps is (10 sec: 52428.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7848820736. Throughput: 0: 55656.4. Samples: 753982080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:44,254][54587] Avg episode reward: [(0, '0.687')] [2024-04-27 21:05:45,020][54818] Updated weights for policy 0, policy_version 479058 (0.0027) [2024-04-27 21:05:47,908][54818] Updated weights for policy 0, policy_version 479068 (0.0036) [2024-04-27 21:05:49,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7849132032. Throughput: 0: 55657.5. Samples: 754316760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:49,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 21:05:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000479073_7849132032.pth... [2024-04-27 21:05:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000478258_7835779072.pth [2024-04-27 21:05:50,801][54818] Updated weights for policy 0, policy_version 479078 (0.0031) [2024-04-27 21:05:53,786][54818] Updated weights for policy 0, policy_version 479088 (0.0033) [2024-04-27 21:05:54,253][54587] Fps is (10 sec: 58983.1, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7849410560. Throughput: 0: 55660.6. Samples: 754654680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:54,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 21:05:56,538][54818] Updated weights for policy 0, policy_version 479098 (0.0028) [2024-04-27 21:05:59,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7849672704. Throughput: 0: 55562.6. Samples: 754817300. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:05:59,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 21:05:59,648][54818] Updated weights for policy 0, policy_version 479108 (0.0025) [2024-04-27 21:06:02,604][54818] Updated weights for policy 0, policy_version 479118 (0.0029) [2024-04-27 21:06:04,253][54587] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 7849967616. Throughput: 0: 55580.3. Samples: 755153960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:06:04,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:06:05,497][54818] Updated weights for policy 0, policy_version 479128 (0.0028) [2024-04-27 21:06:08,507][54818] Updated weights for policy 0, policy_version 479138 (0.0025) [2024-04-27 21:06:09,253][54587] Fps is (10 sec: 58983.2, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 7850262528. Throughput: 0: 55677.8. Samples: 755485520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:06:09,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 21:06:11,270][54818] Updated weights for policy 0, policy_version 479148 (0.0031) [2024-04-27 21:06:14,222][54818] Updated weights for policy 0, policy_version 479158 (0.0032) [2024-04-27 21:06:14,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 7850524672. Throughput: 0: 55804.3. Samples: 755660040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-27 21:06:14,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 21:06:16,981][54818] Updated weights for policy 0, policy_version 479168 (0.0028) [2024-04-27 21:06:19,253][54587] Fps is (10 sec: 50790.7, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 7850770432. Throughput: 0: 55743.7. Samples: 755993500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:06:19,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-27 21:06:19,959][54818] Updated weights for policy 0, policy_version 479178 (0.0026) [2024-04-27 21:06:22,840][54818] Updated weights for policy 0, policy_version 479188 (0.0024) [2024-04-27 21:06:24,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 7851065344. Throughput: 0: 55907.2. Samples: 756333560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:06:24,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 21:06:25,861][54818] Updated weights for policy 0, policy_version 479198 (0.0029) [2024-04-27 21:06:26,499][54798] Signal inference workers to stop experience collection... (10950 times) [2024-04-27 21:06:26,499][54798] Signal inference workers to resume experience collection... (10950 times) [2024-04-27 21:06:26,525][54818] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-04-27 21:06:26,525][54818] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-04-27 21:06:28,865][54818] Updated weights for policy 0, policy_version 479208 (0.0029) [2024-04-27 21:06:29,253][54587] Fps is (10 sec: 58981.7, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 7851360256. Throughput: 0: 55823.2. Samples: 756494120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:06:29,262][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 21:06:31,771][54818] Updated weights for policy 0, policy_version 479218 (0.0028) [2024-04-27 21:06:34,253][54587] Fps is (10 sec: 57342.6, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 7851638784. Throughput: 0: 55758.0. Samples: 756825880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:06:34,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 21:06:34,817][54818] Updated weights for policy 0, policy_version 479228 (0.0032) [2024-04-27 21:06:37,590][54818] Updated weights for policy 0, policy_version 479238 (0.0025) [2024-04-27 21:06:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7851917312. Throughput: 0: 55702.7. Samples: 757161300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:06:39,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 21:06:40,658][54818] Updated weights for policy 0, policy_version 479248 (0.0026) [2024-04-27 21:06:43,406][54818] Updated weights for policy 0, policy_version 479258 (0.0028) [2024-04-27 21:06:44,253][54587] Fps is (10 sec: 57345.3, 60 sec: 56524.9, 300 sec: 55705.6). Total num frames: 7852212224. Throughput: 0: 55944.1. Samples: 757334780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:06:44,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 21:06:46,789][54818] Updated weights for policy 0, policy_version 479268 (0.0026) [2024-04-27 21:06:49,182][54818] Updated weights for policy 0, policy_version 479278 (0.0031) [2024-04-27 21:06:49,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7852490752. Throughput: 0: 55980.6. Samples: 757673080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:06:49,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 21:06:52,499][54818] Updated weights for policy 0, policy_version 479288 (0.0033) [2024-04-27 21:06:54,253][54587] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7852736512. Throughput: 0: 56072.0. Samples: 758008760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:06:54,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 21:06:55,145][54818] Updated weights for policy 0, policy_version 479298 (0.0027) [2024-04-27 21:06:58,249][54818] Updated weights for policy 0, policy_version 479308 (0.0031) [2024-04-27 21:06:59,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7853015040. Throughput: 0: 55768.6. Samples: 758169620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:06:59,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 21:07:01,045][54818] Updated weights for policy 0, policy_version 479318 (0.0029) [2024-04-27 21:07:04,172][54818] Updated weights for policy 0, policy_version 479328 (0.0026) [2024-04-27 21:07:04,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7853309952. Throughput: 0: 55824.3. Samples: 758505600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:07:04,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 21:07:06,987][54818] Updated weights for policy 0, policy_version 479338 (0.0028) [2024-04-27 21:07:09,253][54587] Fps is (10 sec: 58981.6, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7853604864. Throughput: 0: 55653.6. Samples: 758837980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:07:09,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 21:07:10,087][54818] Updated weights for policy 0, policy_version 479348 (0.0030) [2024-04-27 21:07:12,870][54818] Updated weights for policy 0, policy_version 479358 (0.0028) [2024-04-27 21:07:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 7853867008. Throughput: 0: 55988.6. Samples: 759013600. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:07:14,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 21:07:15,846][54818] Updated weights for policy 0, policy_version 479368 (0.0031) [2024-04-27 21:07:18,752][54818] Updated weights for policy 0, policy_version 479378 (0.0041) [2024-04-27 21:07:19,253][54587] Fps is (10 sec: 55705.8, 60 sec: 56524.6, 300 sec: 55705.6). Total num frames: 7854161920. Throughput: 0: 55925.0. Samples: 759342500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:07:19,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 21:07:21,656][54818] Updated weights for policy 0, policy_version 479388 (0.0030) [2024-04-27 21:07:24,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7854407680. Throughput: 0: 55866.6. Samples: 759675300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:07:24,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 21:07:24,606][54818] Updated weights for policy 0, policy_version 479398 (0.0032) [2024-04-27 21:07:27,700][54818] Updated weights for policy 0, policy_version 479408 (0.0034) [2024-04-27 21:07:29,253][54587] Fps is (10 sec: 50790.4, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 7854669824. Throughput: 0: 55541.6. Samples: 759834160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:07:29,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 21:07:30,514][54818] Updated weights for policy 0, policy_version 479418 (0.0030) [2024-04-27 21:07:33,664][54818] Updated weights for policy 0, policy_version 479428 (0.0027) [2024-04-27 21:07:34,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55432.8, 300 sec: 55594.5). Total num frames: 7854964736. Throughput: 0: 55402.3. Samples: 760166180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:07:34,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:07:36,667][54798] Signal inference workers to stop experience collection... 
(11000 times) [2024-04-27 21:07:36,700][54818] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-04-27 21:07:36,725][54798] Signal inference workers to resume experience collection... (11000 times) [2024-04-27 21:07:36,727][54818] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-04-27 21:07:36,732][54818] Updated weights for policy 0, policy_version 479438 (0.0029) [2024-04-27 21:07:39,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 7855243264. Throughput: 0: 55334.6. Samples: 760498820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 21:07:39,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:07:39,462][54818] Updated weights for policy 0, policy_version 479448 (0.0029) [2024-04-27 21:07:42,543][54818] Updated weights for policy 0, policy_version 479458 (0.0025) [2024-04-27 21:07:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7855554560. Throughput: 0: 55534.6. Samples: 760668680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:07:44,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 21:07:45,229][54818] Updated weights for policy 0, policy_version 479468 (0.0028) [2024-04-27 21:07:48,448][54818] Updated weights for policy 0, policy_version 479478 (0.0031) [2024-04-27 21:07:49,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7855816704. Throughput: 0: 55522.6. Samples: 761004120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:07:49,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 21:07:49,261][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000479481_7855816704.pth... 
[2024-04-27 21:07:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000478665_7842447360.pth [2024-04-27 21:07:51,217][54818] Updated weights for policy 0, policy_version 479488 (0.0031) [2024-04-27 21:07:54,253][54587] Fps is (10 sec: 50790.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7856062464. Throughput: 0: 55490.4. Samples: 761335040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:07:54,253][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 21:07:54,433][54818] Updated weights for policy 0, policy_version 479498 (0.0027) [2024-04-27 21:07:57,226][54818] Updated weights for policy 0, policy_version 479508 (0.0025) [2024-04-27 21:07:59,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7856340992. Throughput: 0: 55199.5. Samples: 761497580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:07:59,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 21:08:00,338][54818] Updated weights for policy 0, policy_version 479518 (0.0025) [2024-04-27 21:08:03,095][54818] Updated weights for policy 0, policy_version 479528 (0.0026) [2024-04-27 21:08:04,253][54587] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7856635904. Throughput: 0: 55264.6. Samples: 761829400. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:04,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:08:06,165][54818] Updated weights for policy 0, policy_version 479538 (0.0030) [2024-04-27 21:08:08,788][54818] Updated weights for policy 0, policy_version 479548 (0.0026) [2024-04-27 21:08:09,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7856914432. Throughput: 0: 55319.0. Samples: 762164660. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:09,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 21:08:11,905][54818] Updated weights for policy 0, policy_version 479558 (0.0030) [2024-04-27 21:08:14,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7857192960. Throughput: 0: 55656.2. Samples: 762338680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:14,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:08:14,581][54818] Updated weights for policy 0, policy_version 479568 (0.0027) [2024-04-27 21:08:17,852][54818] Updated weights for policy 0, policy_version 479578 (0.0035) [2024-04-27 21:08:19,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7857487872. Throughput: 0: 55714.7. Samples: 762673340. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:19,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 21:08:20,907][54818] Updated weights for policy 0, policy_version 479588 (0.0028) [2024-04-27 21:08:23,808][54818] Updated weights for policy 0, policy_version 479598 (0.0032) [2024-04-27 21:08:24,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7857766400. Throughput: 0: 55704.9. Samples: 763005540. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:24,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 21:08:26,832][54818] Updated weights for policy 0, policy_version 479608 (0.0028) [2024-04-27 21:08:29,253][54587] Fps is (10 sec: 54066.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7858028544. Throughput: 0: 55601.7. Samples: 763170760. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:29,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 21:08:29,703][54798] Signal inference workers to stop experience collection... 
(11050 times) [2024-04-27 21:08:29,751][54818] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-04-27 21:08:29,758][54798] Signal inference workers to resume experience collection... (11050 times) [2024-04-27 21:08:29,765][54818] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-04-27 21:08:29,768][54818] Updated weights for policy 0, policy_version 479618 (0.0026) [2024-04-27 21:08:32,844][54818] Updated weights for policy 0, policy_version 479628 (0.0033) [2024-04-27 21:08:34,253][54587] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7858290688. Throughput: 0: 55616.0. Samples: 763506840. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:34,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 21:08:35,516][54818] Updated weights for policy 0, policy_version 479638 (0.0032) [2024-04-27 21:08:38,611][54818] Updated weights for policy 0, policy_version 479648 (0.0026) [2024-04-27 21:08:39,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7858585600. Throughput: 0: 55686.0. Samples: 763840920. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:39,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-27 21:08:41,362][54818] Updated weights for policy 0, policy_version 479658 (0.0027) [2024-04-27 21:08:44,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7858864128. Throughput: 0: 55662.2. Samples: 764002380. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:44,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 21:08:44,304][54818] Updated weights for policy 0, policy_version 479668 (0.0029) [2024-04-27 21:08:47,399][54818] Updated weights for policy 0, policy_version 479678 (0.0024) [2024-04-27 21:08:49,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 7859142656. Throughput: 0: 55594.9. 
Samples: 764331180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:49,262][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 21:08:50,259][54818] Updated weights for policy 0, policy_version 479688 (0.0024) [2024-04-27 21:08:53,253][54818] Updated weights for policy 0, policy_version 479698 (0.0030) [2024-04-27 21:08:54,253][54587] Fps is (10 sec: 58982.0, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 7859453952. Throughput: 0: 55531.1. Samples: 764663560. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:54,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 21:08:56,274][54818] Updated weights for policy 0, policy_version 479708 (0.0029) [2024-04-27 21:08:59,049][54818] Updated weights for policy 0, policy_version 479718 (0.0036) [2024-04-27 21:08:59,253][54587] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7859699712. Throughput: 0: 55535.4. Samples: 764837780. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:08:59,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 21:09:01,961][54818] Updated weights for policy 0, policy_version 479728 (0.0026) [2024-04-27 21:09:04,253][54587] Fps is (10 sec: 50790.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7859961856. Throughput: 0: 55573.6. Samples: 765174160. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-04-27 21:09:04,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 21:09:04,863][54818] Updated weights for policy 0, policy_version 479738 (0.0027) [2024-04-27 21:09:07,714][54818] Updated weights for policy 0, policy_version 479748 (0.0028) [2024-04-27 21:09:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7860256768. Throughput: 0: 55608.0. Samples: 765507900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:09,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 21:09:10,699][54818] Updated weights for policy 0, policy_version 479758 (0.0031) [2024-04-27 21:09:13,713][54818] Updated weights for policy 0, policy_version 479768 (0.0026) [2024-04-27 21:09:14,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7860535296. Throughput: 0: 55464.1. Samples: 765666640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:14,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:09:16,482][54818] Updated weights for policy 0, policy_version 479778 (0.0034) [2024-04-27 21:09:19,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7860813824. Throughput: 0: 55428.0. Samples: 766001100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:19,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 21:09:19,918][54818] Updated weights for policy 0, policy_version 479788 (0.0028) [2024-04-27 21:09:22,283][54798] Signal inference workers to stop experience collection... (11100 times) [2024-04-27 21:09:22,283][54798] Signal inference workers to resume experience collection... (11100 times) [2024-04-27 21:09:22,297][54818] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-04-27 21:09:22,297][54818] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-04-27 21:09:22,399][54818] Updated weights for policy 0, policy_version 479798 (0.0029) [2024-04-27 21:09:24,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7861075968. Throughput: 0: 55403.8. Samples: 766334080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:24,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 21:09:25,764][54818] Updated weights for policy 0, policy_version 479808 (0.0030) [2024-04-27 21:09:28,237][54818] Updated weights for policy 0, policy_version 479818 (0.0030) [2024-04-27 21:09:29,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7861370880. Throughput: 0: 55704.8. Samples: 766509100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:29,254][54587] Avg episode reward: [(0, '0.691')] [2024-04-27 21:09:31,550][54818] Updated weights for policy 0, policy_version 479828 (0.0036) [2024-04-27 21:09:34,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7861633024. Throughput: 0: 55668.6. Samples: 766836260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:34,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 21:09:34,612][54818] Updated weights for policy 0, policy_version 479838 (0.0028) [2024-04-27 21:09:37,494][54818] Updated weights for policy 0, policy_version 479848 (0.0028) [2024-04-27 21:09:39,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7861911552. Throughput: 0: 55689.8. Samples: 767169600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:39,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 21:09:40,503][54818] Updated weights for policy 0, policy_version 479858 (0.0027) [2024-04-27 21:09:43,456][54818] Updated weights for policy 0, policy_version 479868 (0.0029) [2024-04-27 21:09:44,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 7862190080. Throughput: 0: 55453.9. Samples: 767333200. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:44,253][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 21:09:46,295][54818] Updated weights for policy 0, policy_version 479878 (0.0033) [2024-04-27 21:09:49,208][54818] Updated weights for policy 0, policy_version 479888 (0.0030) [2024-04-27 21:09:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7862484992. Throughput: 0: 55515.6. Samples: 767672360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:49,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 21:09:49,268][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000479888_7862484992.pth... [2024-04-27 21:09:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000479073_7849132032.pth [2024-04-27 21:09:52,330][54818] Updated weights for policy 0, policy_version 479898 (0.0040) [2024-04-27 21:09:54,253][54587] Fps is (10 sec: 55704.7, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 7862747136. Throughput: 0: 55561.7. Samples: 768008180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:54,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 21:09:55,187][54818] Updated weights for policy 0, policy_version 479908 (0.0031) [2024-04-27 21:09:58,338][54818] Updated weights for policy 0, policy_version 479918 (0.0024) [2024-04-27 21:09:59,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7863025664. Throughput: 0: 55627.1. Samples: 768169860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:09:59,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 21:10:01,207][54818] Updated weights for policy 0, policy_version 479928 (0.0026) [2024-04-27 21:10:04,053][54818] Updated weights for policy 0, policy_version 479938 (0.0027) [2024-04-27 21:10:04,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55650.1). 
Total num frames: 7863304192. Throughput: 0: 55499.1. Samples: 768498560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:10:04,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 21:10:07,039][54818] Updated weights for policy 0, policy_version 479948 (0.0028) [2024-04-27 21:10:09,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7863599104. Throughput: 0: 55582.1. Samples: 768835280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:10:09,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 21:10:09,919][54818] Updated weights for policy 0, policy_version 479958 (0.0030) [2024-04-27 21:10:12,998][54818] Updated weights for policy 0, policy_version 479968 (0.0031) [2024-04-27 21:10:14,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7863861248. Throughput: 0: 55444.9. Samples: 769004120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:10:14,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:10:15,783][54818] Updated weights for policy 0, policy_version 479978 (0.0031) [2024-04-27 21:10:18,684][54818] Updated weights for policy 0, policy_version 479988 (0.0032) [2024-04-27 21:10:19,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7864139776. Throughput: 0: 55632.0. Samples: 769339700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:10:19,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:10:21,762][54818] Updated weights for policy 0, policy_version 479998 (0.0028) [2024-04-27 21:10:24,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 7864434688. Throughput: 0: 55573.8. Samples: 769670420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:10:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 21:10:24,383][54818] Updated weights for policy 0, policy_version 480008 (0.0027) [2024-04-27 21:10:27,824][54818] Updated weights for policy 0, policy_version 480018 (0.0029) [2024-04-27 21:10:29,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 7864713216. Throughput: 0: 55701.1. Samples: 769839760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:10:29,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 21:10:30,266][54818] Updated weights for policy 0, policy_version 480028 (0.0027) [2024-04-27 21:10:33,580][54818] Updated weights for policy 0, policy_version 480038 (0.0026) [2024-04-27 21:10:34,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7864958976. Throughput: 0: 55690.3. Samples: 770178420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:10:34,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 21:10:34,653][54798] Signal inference workers to stop experience collection... (11150 times) [2024-04-27 21:10:34,653][54798] Signal inference workers to resume experience collection... (11150 times) [2024-04-27 21:10:34,675][54818] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-04-27 21:10:34,675][54818] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-04-27 21:10:36,245][54818] Updated weights for policy 0, policy_version 480048 (0.0029) [2024-04-27 21:10:39,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7865253888. Throughput: 0: 55633.8. Samples: 770511700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:10:39,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 21:10:39,511][54818] Updated weights for policy 0, policy_version 480058 (0.0029) [2024-04-27 21:10:41,922][54818] Updated weights for policy 0, policy_version 480068 (0.0033) [2024-04-27 21:10:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 7865565184. Throughput: 0: 55796.1. Samples: 770680680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:10:44,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 21:10:45,300][54818] Updated weights for policy 0, policy_version 480078 (0.0031) [2024-04-27 21:10:47,816][54818] Updated weights for policy 0, policy_version 480088 (0.0036) [2024-04-27 21:10:49,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7865843712. Throughput: 0: 56005.7. Samples: 771018820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:10:49,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 21:10:51,092][54818] Updated weights for policy 0, policy_version 480098 (0.0033) [2024-04-27 21:10:53,664][54818] Updated weights for policy 0, policy_version 480108 (0.0027) [2024-04-27 21:10:54,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7866105856. Throughput: 0: 55929.4. Samples: 771352100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:10:54,253][54587] Avg episode reward: [(0, '0.493')] [2024-04-27 21:10:57,074][54818] Updated weights for policy 0, policy_version 480118 (0.0034) [2024-04-27 21:10:59,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7866384384. Throughput: 0: 56023.2. Samples: 771525160. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:10:59,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 21:10:59,480][54818] Updated weights for policy 0, policy_version 480128 (0.0031) [2024-04-27 21:11:03,034][54818] Updated weights for policy 0, policy_version 480138 (0.0034) [2024-04-27 21:11:04,253][54587] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7866662912. Throughput: 0: 56002.1. Samples: 771859800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:04,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 21:11:05,309][54818] Updated weights for policy 0, policy_version 480148 (0.0029) [2024-04-27 21:11:08,831][54818] Updated weights for policy 0, policy_version 480158 (0.0034) [2024-04-27 21:11:09,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 7866925056. Throughput: 0: 56072.6. Samples: 772193680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:09,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:11:11,188][54818] Updated weights for policy 0, policy_version 480168 (0.0026) [2024-04-27 21:11:14,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7867203584. Throughput: 0: 55805.0. Samples: 772350980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:14,254][54587] Avg episode reward: [(0, '0.502')] [2024-04-27 21:11:14,915][54818] Updated weights for policy 0, policy_version 480178 (0.0031) [2024-04-27 21:11:17,173][54818] Updated weights for policy 0, policy_version 480188 (0.0028) [2024-04-27 21:11:19,254][54587] Fps is (10 sec: 57342.5, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 7867498496. Throughput: 0: 55822.5. Samples: 772690440. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:19,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 21:11:20,610][54818] Updated weights for policy 0, policy_version 480198 (0.0031) [2024-04-27 21:11:23,035][54818] Updated weights for policy 0, policy_version 480208 (0.0027) [2024-04-27 21:11:24,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7867777024. Throughput: 0: 55852.1. Samples: 773025040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:24,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 21:11:26,430][54818] Updated weights for policy 0, policy_version 480218 (0.0027) [2024-04-27 21:11:28,925][54818] Updated weights for policy 0, policy_version 480228 (0.0031) [2024-04-27 21:11:29,253][54587] Fps is (10 sec: 55706.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 7868055552. Throughput: 0: 55948.0. Samples: 773198340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:29,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 21:11:32,348][54818] Updated weights for policy 0, policy_version 480238 (0.0027) [2024-04-27 21:11:34,253][54587] Fps is (10 sec: 57343.9, 60 sec: 56524.9, 300 sec: 55705.6). Total num frames: 7868350464. Throughput: 0: 55817.5. Samples: 773530600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:34,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 21:11:34,683][54818] Updated weights for policy 0, policy_version 480248 (0.0035) [2024-04-27 21:11:38,161][54818] Updated weights for policy 0, policy_version 480258 (0.0027) [2024-04-27 21:11:39,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 7868612608. Throughput: 0: 55924.4. Samples: 773868700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:39,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 21:11:40,720][54818] Updated weights for policy 0, policy_version 480268 (0.0026) [2024-04-27 21:11:43,138][54798] Signal inference workers to stop experience collection... (11200 times) [2024-04-27 21:11:43,138][54798] Signal inference workers to resume experience collection... (11200 times) [2024-04-27 21:11:43,150][54818] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-04-27 21:11:43,150][54818] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-04-27 21:11:44,089][54818] Updated weights for policy 0, policy_version 480278 (0.0034) [2024-04-27 21:11:44,253][54587] Fps is (10 sec: 52428.0, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 7868874752. Throughput: 0: 55623.8. Samples: 774028240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:44,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 21:11:46,762][54818] Updated weights for policy 0, policy_version 480288 (0.0030) [2024-04-27 21:11:49,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7869169664. Throughput: 0: 55557.7. Samples: 774359900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:49,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 21:11:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000480296_7869169664.pth... [2024-04-27 21:11:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000479481_7855816704.pth [2024-04-27 21:11:49,844][54818] Updated weights for policy 0, policy_version 480298 (0.0032) [2024-04-27 21:11:52,571][54818] Updated weights for policy 0, policy_version 480308 (0.0027) [2024-04-27 21:11:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 7869448192. Throughput: 0: 55499.7. Samples: 774691180. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:54,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 21:11:55,601][54818] Updated weights for policy 0, policy_version 480318 (0.0031) [2024-04-27 21:11:58,493][54818] Updated weights for policy 0, policy_version 480328 (0.0026) [2024-04-27 21:11:59,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7869743104. Throughput: 0: 55975.9. Samples: 774869900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:11:59,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 21:12:01,502][54818] Updated weights for policy 0, policy_version 480338 (0.0027) [2024-04-27 21:12:04,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7870005248. Throughput: 0: 55855.3. Samples: 775203920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:04,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 21:12:04,360][54818] Updated weights for policy 0, policy_version 480348 (0.0025) [2024-04-27 21:12:07,442][54818] Updated weights for policy 0, policy_version 480358 (0.0030) [2024-04-27 21:12:09,253][54587] Fps is (10 sec: 55706.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7870300160. Throughput: 0: 55769.8. Samples: 775534680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:09,253][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 21:12:10,363][54818] Updated weights for policy 0, policy_version 480368 (0.0035) [2024-04-27 21:12:13,428][54818] Updated weights for policy 0, policy_version 480378 (0.0027) [2024-04-27 21:12:14,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 7870562304. Throughput: 0: 55621.7. Samples: 775701320. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:14,254][54587] Avg episode reward: [(0, '0.485')] [2024-04-27 21:12:16,099][54818] Updated weights for policy 0, policy_version 480388 (0.0038) [2024-04-27 21:12:19,207][54818] Updated weights for policy 0, policy_version 480398 (0.0027) [2024-04-27 21:12:19,253][54587] Fps is (10 sec: 54066.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 7870840832. Throughput: 0: 55691.4. Samples: 776036720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:19,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 21:12:21,765][54818] Updated weights for policy 0, policy_version 480408 (0.0027) [2024-04-27 21:12:24,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7871119360. Throughput: 0: 55589.3. Samples: 776370220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:24,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 21:12:25,023][54818] Updated weights for policy 0, policy_version 480418 (0.0029) [2024-04-27 21:12:27,759][54818] Updated weights for policy 0, policy_version 480428 (0.0027) [2024-04-27 21:12:29,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7871414272. Throughput: 0: 55893.1. Samples: 776543420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:29,254][54587] Avg episode reward: [(0, '0.503')] [2024-04-27 21:12:30,919][54818] Updated weights for policy 0, policy_version 480438 (0.0032) [2024-04-27 21:12:33,644][54818] Updated weights for policy 0, policy_version 480448 (0.0027) [2024-04-27 21:12:34,253][54587] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7871692800. Throughput: 0: 55873.0. Samples: 776874180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:34,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:12:36,834][54818] Updated weights for policy 0, policy_version 480458 (0.0032) [2024-04-27 21:12:39,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 7871954944. Throughput: 0: 55895.4. Samples: 777206460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:39,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 21:12:39,634][54818] Updated weights for policy 0, policy_version 480468 (0.0028) [2024-04-27 21:12:42,508][54798] Signal inference workers to stop experience collection... (11250 times) [2024-04-27 21:12:42,508][54798] Signal inference workers to resume experience collection... (11250 times) [2024-04-27 21:12:42,535][54818] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-04-27 21:12:42,535][54818] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-04-27 21:12:42,621][54818] Updated weights for policy 0, policy_version 480478 (0.0032) [2024-04-27 21:12:44,253][54587] Fps is (10 sec: 54067.8, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 7872233472. Throughput: 0: 55651.7. Samples: 777374220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:44,253][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 21:12:45,392][54818] Updated weights for policy 0, policy_version 480488 (0.0031) [2024-04-27 21:12:48,573][54818] Updated weights for policy 0, policy_version 480498 (0.0031) [2024-04-27 21:12:49,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7872528384. Throughput: 0: 55727.2. Samples: 777711640. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:49,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 21:12:51,185][54818] Updated weights for policy 0, policy_version 480508 (0.0028) [2024-04-27 21:12:54,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55432.8, 300 sec: 55705.6). Total num frames: 7872774144. Throughput: 0: 55834.7. Samples: 778047240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:54,253][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 21:12:54,433][54818] Updated weights for policy 0, policy_version 480518 (0.0027) [2024-04-27 21:12:57,200][54818] Updated weights for policy 0, policy_version 480528 (0.0035) [2024-04-27 21:12:59,253][54587] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7873069056. Throughput: 0: 55704.1. Samples: 778208000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:12:59,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 21:13:00,329][54818] Updated weights for policy 0, policy_version 480538 (0.0031) [2024-04-27 21:13:02,956][54818] Updated weights for policy 0, policy_version 480548 (0.0023) [2024-04-27 21:13:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 7873363968. Throughput: 0: 55710.9. Samples: 778543700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:13:04,253][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 21:13:06,205][54818] Updated weights for policy 0, policy_version 480558 (0.0028) [2024-04-27 21:13:08,849][54818] Updated weights for policy 0, policy_version 480568 (0.0027) [2024-04-27 21:13:09,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7873642496. Throughput: 0: 55710.2. Samples: 778877180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:13:09,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 21:13:11,973][54818] Updated weights for policy 0, policy_version 480578 (0.0029) [2024-04-27 21:13:14,253][54587] Fps is (10 sec: 52427.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 7873888256. Throughput: 0: 55684.2. Samples: 779049220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:13:14,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 21:13:14,707][54818] Updated weights for policy 0, policy_version 480588 (0.0029) [2024-04-27 21:13:17,793][54818] Updated weights for policy 0, policy_version 480598 (0.0028) [2024-04-27 21:13:19,253][54587] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 7874183168. Throughput: 0: 55683.8. Samples: 779379960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:13:19,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 21:13:20,588][54818] Updated weights for policy 0, policy_version 480608 (0.0031) [2024-04-27 21:13:23,673][54818] Updated weights for policy 0, policy_version 480618 (0.0037) [2024-04-27 21:13:24,253][54587] Fps is (10 sec: 58982.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7874478080. Throughput: 0: 55563.0. Samples: 779706800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-27 21:13:24,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 21:13:26,506][54818] Updated weights for policy 0, policy_version 480628 (0.0031) [2024-04-27 21:13:29,253][54587] Fps is (10 sec: 55706.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7874740224. Throughput: 0: 55579.4. Samples: 779875300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:13:29,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 21:13:29,630][54818] Updated weights for policy 0, policy_version 480638 (0.0025) [2024-04-27 21:13:32,413][54818] Updated weights for policy 0, policy_version 480648 (0.0029) [2024-04-27 21:13:34,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7875018752. Throughput: 0: 55505.8. Samples: 780209400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:13:34,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:13:35,475][54818] Updated weights for policy 0, policy_version 480658 (0.0031) [2024-04-27 21:13:38,306][54818] Updated weights for policy 0, policy_version 480668 (0.0027) [2024-04-27 21:13:39,254][54587] Fps is (10 sec: 57341.9, 60 sec: 55978.3, 300 sec: 55761.1). Total num frames: 7875313664. Throughput: 0: 55434.5. Samples: 780541820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:13:39,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 21:13:41,293][54818] Updated weights for policy 0, policy_version 480678 (0.0028) [2024-04-27 21:13:44,099][54818] Updated weights for policy 0, policy_version 480688 (0.0030) [2024-04-27 21:13:44,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7875592192. Throughput: 0: 55577.2. Samples: 780708980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:13:44,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 21:13:45,438][54798] Signal inference workers to stop experience collection... (11300 times) [2024-04-27 21:13:45,439][54798] Signal inference workers to resume experience collection... 
(11300 times) [2024-04-27 21:13:45,453][54818] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-04-27 21:13:45,453][54818] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-04-27 21:13:47,297][54818] Updated weights for policy 0, policy_version 480698 (0.0029) [2024-04-27 21:13:49,253][54587] Fps is (10 sec: 52430.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7875837952. Throughput: 0: 55567.0. Samples: 781044220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:13:49,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 21:13:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000480703_7875837952.pth... [2024-04-27 21:13:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000479888_7862484992.pth [2024-04-27 21:13:50,072][54818] Updated weights for policy 0, policy_version 480708 (0.0026) [2024-04-27 21:13:53,088][54818] Updated weights for policy 0, policy_version 480718 (0.0030) [2024-04-27 21:13:54,253][54587] Fps is (10 sec: 52429.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 7876116480. Throughput: 0: 55617.1. Samples: 781379940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:13:54,253][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 21:13:56,094][54818] Updated weights for policy 0, policy_version 480728 (0.0029) [2024-04-27 21:13:58,807][54818] Updated weights for policy 0, policy_version 480738 (0.0027) [2024-04-27 21:13:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7876427776. Throughput: 0: 55525.1. Samples: 781547840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:13:59,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 21:14:02,003][54818] Updated weights for policy 0, policy_version 480748 (0.0029) [2024-04-27 21:14:04,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). 
Total num frames: 7876689920. Throughput: 0: 55659.8. Samples: 781884640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:04,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:14:04,696][54818] Updated weights for policy 0, policy_version 480758 (0.0028) [2024-04-27 21:14:07,847][54818] Updated weights for policy 0, policy_version 480768 (0.0029) [2024-04-27 21:14:09,253][54587] Fps is (10 sec: 52428.5, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7876952064. Throughput: 0: 55871.6. Samples: 782221020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:09,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-27 21:14:10,620][54818] Updated weights for policy 0, policy_version 480778 (0.0029) [2024-04-27 21:14:13,585][54818] Updated weights for policy 0, policy_version 480788 (0.0029) [2024-04-27 21:14:14,253][54587] Fps is (10 sec: 55705.8, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 7877246976. Throughput: 0: 55782.8. Samples: 782385520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:14,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 21:14:16,504][54818] Updated weights for policy 0, policy_version 480798 (0.0033) [2024-04-27 21:14:19,253][54587] Fps is (10 sec: 57345.0, 60 sec: 55705.9, 300 sec: 55761.1). Total num frames: 7877525504. Throughput: 0: 55873.0. Samples: 782723680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:19,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 21:14:19,366][54818] Updated weights for policy 0, policy_version 480808 (0.0027) [2024-04-27 21:14:22,472][54818] Updated weights for policy 0, policy_version 480818 (0.0032) [2024-04-27 21:14:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 7877787648. Throughput: 0: 55891.3. Samples: 783056900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:24,253][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 21:14:25,283][54818] Updated weights for policy 0, policy_version 480828 (0.0027) [2024-04-27 21:14:28,330][54818] Updated weights for policy 0, policy_version 480838 (0.0028) [2024-04-27 21:14:29,253][54587] Fps is (10 sec: 54066.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7878066176. Throughput: 0: 55802.3. Samples: 783220080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:29,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 21:14:31,528][54818] Updated weights for policy 0, policy_version 480848 (0.0027) [2024-04-27 21:14:34,139][54818] Updated weights for policy 0, policy_version 480858 (0.0029) [2024-04-27 21:14:34,253][54587] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7878377472. Throughput: 0: 55660.6. Samples: 783548940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:34,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 21:14:37,204][54818] Updated weights for policy 0, policy_version 480868 (0.0029) [2024-04-27 21:14:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 55705.9, 300 sec: 55816.7). Total num frames: 7878656000. Throughput: 0: 55777.7. Samples: 783889940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:39,254][54587] Avg episode reward: [(0, '0.702')] [2024-04-27 21:14:39,782][54818] Updated weights for policy 0, policy_version 480878 (0.0025) [2024-04-27 21:14:42,960][54818] Updated weights for policy 0, policy_version 480888 (0.0032) [2024-04-27 21:14:44,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 7878934528. Throughput: 0: 56022.7. Samples: 784068860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:44,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 21:14:45,766][54818] Updated weights for policy 0, policy_version 480898 (0.0028) [2024-04-27 21:14:48,535][54798] Signal inference workers to stop experience collection... (11350 times) [2024-04-27 21:14:48,535][54798] Signal inference workers to resume experience collection... (11350 times) [2024-04-27 21:14:48,561][54818] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-04-27 21:14:48,561][54818] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-04-27 21:14:48,864][54818] Updated weights for policy 0, policy_version 480908 (0.0030) [2024-04-27 21:14:49,253][54587] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7879196672. Throughput: 0: 55967.1. Samples: 784403160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:14:49,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 21:14:51,863][54818] Updated weights for policy 0, policy_version 480918 (0.0036) [2024-04-27 21:14:54,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 7879475200. Throughput: 0: 55736.0. Samples: 784729140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:14:54,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 21:14:54,947][54818] Updated weights for policy 0, policy_version 480928 (0.0026) [2024-04-27 21:14:57,565][54818] Updated weights for policy 0, policy_version 480938 (0.0027) [2024-04-27 21:14:59,253][54587] Fps is (10 sec: 52429.2, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 7879720960. Throughput: 0: 55760.0. Samples: 784894720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:14:59,253][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 21:15:00,639][54818] Updated weights for policy 0, policy_version 480948 (0.0032) [2024-04-27 21:15:03,718][54818] Updated weights for policy 0, policy_version 480958 (0.0032) [2024-04-27 21:15:04,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 7880032256. Throughput: 0: 55668.6. Samples: 785228780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:04,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 21:15:06,496][54818] Updated weights for policy 0, policy_version 480968 (0.0026) [2024-04-27 21:15:09,253][54587] Fps is (10 sec: 60620.1, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 7880327168. Throughput: 0: 55762.1. Samples: 785566200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:09,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:15:09,686][54818] Updated weights for policy 0, policy_version 480978 (0.0028) [2024-04-27 21:15:12,425][54818] Updated weights for policy 0, policy_version 480988 (0.0024) [2024-04-27 21:15:14,253][54587] Fps is (10 sec: 57345.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7880605696. Throughput: 0: 55918.4. Samples: 785736400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:14,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 21:15:15,568][54818] Updated weights for policy 0, policy_version 480998 (0.0030) [2024-04-27 21:15:18,360][54818] Updated weights for policy 0, policy_version 481008 (0.0026) [2024-04-27 21:15:19,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 7880884224. Throughput: 0: 55974.2. Samples: 786067780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:19,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:15:21,249][54818] Updated weights for policy 0, policy_version 481018 (0.0025) [2024-04-27 21:15:24,189][54818] Updated weights for policy 0, policy_version 481028 (0.0028) [2024-04-27 21:15:24,253][54587] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 7881162752. Throughput: 0: 55869.7. Samples: 786404080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:24,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 21:15:27,140][54818] Updated weights for policy 0, policy_version 481038 (0.0030) [2024-04-27 21:15:29,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7881424896. Throughput: 0: 55544.4. Samples: 786568360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:29,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 21:15:30,100][54818] Updated weights for policy 0, policy_version 481048 (0.0024) [2024-04-27 21:15:33,038][54818] Updated weights for policy 0, policy_version 481058 (0.0030) [2024-04-27 21:15:34,253][54587] Fps is (10 sec: 50790.8, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 7881670656. Throughput: 0: 55624.0. Samples: 786906240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:34,262][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 21:15:35,830][54818] Updated weights for policy 0, policy_version 481068 (0.0025) [2024-04-27 21:15:38,892][54818] Updated weights for policy 0, policy_version 481078 (0.0028) [2024-04-27 21:15:39,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7881981952. Throughput: 0: 55751.6. Samples: 787237960. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:39,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 21:15:40,258][54798] Signal inference workers to stop experience collection... (11400 times) [2024-04-27 21:15:40,261][54798] Signal inference workers to resume experience collection... (11400 times) [2024-04-27 21:15:40,276][54818] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-04-27 21:15:40,276][54818] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-04-27 21:15:41,808][54818] Updated weights for policy 0, policy_version 481088 (0.0034) [2024-04-27 21:15:44,253][54587] Fps is (10 sec: 60621.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7882276864. Throughput: 0: 55774.2. Samples: 787404560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:44,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 21:15:44,644][54818] Updated weights for policy 0, policy_version 481098 (0.0026) [2024-04-27 21:15:47,701][54818] Updated weights for policy 0, policy_version 481108 (0.0033) [2024-04-27 21:15:49,253][54587] Fps is (10 sec: 58982.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 7882571776. Throughput: 0: 55845.0. Samples: 787741800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:49,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 21:15:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000481114_7882571776.pth... [2024-04-27 21:15:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000480296_7869169664.pth [2024-04-27 21:15:50,840][54818] Updated weights for policy 0, policy_version 481118 (0.0030) [2024-04-27 21:15:53,542][54818] Updated weights for policy 0, policy_version 481128 (0.0030) [2024-04-27 21:15:54,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 7882833920. Throughput: 0: 55781.8. Samples: 788076380. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:54,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 21:15:56,751][54818] Updated weights for policy 0, policy_version 481138 (0.0026) [2024-04-27 21:15:59,253][54587] Fps is (10 sec: 55706.4, 60 sec: 56797.9, 300 sec: 55816.7). Total num frames: 7883128832. Throughput: 0: 55876.0. Samples: 788250820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:15:59,253][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 21:15:59,258][54818] Updated weights for policy 0, policy_version 481148 (0.0026) [2024-04-27 21:16:02,552][54818] Updated weights for policy 0, policy_version 481158 (0.0036) [2024-04-27 21:16:04,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 7883374592. Throughput: 0: 55864.3. Samples: 788581680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:16:04,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 21:16:05,169][54818] Updated weights for policy 0, policy_version 481168 (0.0032) [2024-04-27 21:16:08,281][54818] Updated weights for policy 0, policy_version 481178 (0.0029) [2024-04-27 21:16:09,253][54587] Fps is (10 sec: 50790.4, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 7883636736. Throughput: 0: 55870.8. Samples: 788918260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:16:09,253][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 21:16:11,210][54818] Updated weights for policy 0, policy_version 481188 (0.0029) [2024-04-27 21:16:14,253][54587] Fps is (10 sec: 55705.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 7883931648. Throughput: 0: 55756.8. Samples: 789077420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-27 21:16:14,254][54587] Avg episode reward: [(0, '0.696')] [2024-04-27 21:16:14,508][54818] Updated weights for policy 0, policy_version 481198 (0.0030) [2024-04-27 21:16:17,101][54818] Updated weights for policy 0, policy_version 481208 (0.0031) [2024-04-27 21:16:19,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7884226560. Throughput: 0: 55726.7. Samples: 789413940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:16:19,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 21:16:20,568][54818] Updated weights for policy 0, policy_version 481218 (0.0034) [2024-04-27 21:16:22,868][54798] Signal inference workers to stop experience collection... (11450 times) [2024-04-27 21:16:22,875][54798] Signal inference workers to resume experience collection... (11450 times) [2024-04-27 21:16:22,895][54818] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-04-27 21:16:22,896][54818] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-04-27 21:16:22,982][54818] Updated weights for policy 0, policy_version 481228 (0.0024) [2024-04-27 21:16:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7884521472. Throughput: 0: 55718.7. Samples: 789745300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:16:24,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 21:16:26,376][54818] Updated weights for policy 0, policy_version 481238 (0.0031) [2024-04-27 21:16:28,760][54818] Updated weights for policy 0, policy_version 481248 (0.0027) [2024-04-27 21:16:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 7884800000. Throughput: 0: 55942.7. Samples: 789921980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:16:29,253][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 21:16:32,043][54818] Updated weights for policy 0, policy_version 481258 (0.0025) [2024-04-27 21:16:34,253][54587] Fps is (10 sec: 50790.5, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 7885029376. Throughput: 0: 55796.1. Samples: 790252620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:16:34,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 21:16:34,703][54818] Updated weights for policy 0, policy_version 481268 (0.0031) [2024-04-27 21:16:38,071][54818] Updated weights for policy 0, policy_version 481278 (0.0028) [2024-04-27 21:16:39,253][54587] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7885340672. Throughput: 0: 55710.7. Samples: 790583360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:16:39,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 21:16:40,693][54818] Updated weights for policy 0, policy_version 481288 (0.0030) [2024-04-27 21:16:43,957][54818] Updated weights for policy 0, policy_version 481298 (0.0027) [2024-04-27 21:16:44,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7885602816. Throughput: 0: 55500.4. Samples: 790748340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:16:44,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 21:16:46,465][54818] Updated weights for policy 0, policy_version 481308 (0.0029) [2024-04-27 21:16:49,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 7885881344. Throughput: 0: 55479.1. Samples: 791078240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:16:49,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 21:16:49,762][54818] Updated weights for policy 0, policy_version 481318 (0.0030) [2024-04-27 21:16:52,398][54818] Updated weights for policy 0, policy_version 481328 (0.0031) [2024-04-27 21:16:54,253][54587] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7886176256. Throughput: 0: 55501.3. Samples: 791415820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:16:54,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 21:16:55,758][54818] Updated weights for policy 0, policy_version 481338 (0.0029) [2024-04-27 21:16:58,345][54818] Updated weights for policy 0, policy_version 481348 (0.0032) [2024-04-27 21:16:59,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 7886454784. Throughput: 0: 55651.3. Samples: 791581720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:16:59,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 21:17:01,513][54818] Updated weights for policy 0, policy_version 481358 (0.0030) [2024-04-27 21:17:04,211][54818] Updated weights for policy 0, policy_version 481368 (0.0029) [2024-04-27 21:17:04,253][54587] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7886733312. Throughput: 0: 55743.5. Samples: 791922400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:17:04,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:17:07,319][54818] Updated weights for policy 0, policy_version 481378 (0.0030) [2024-04-27 21:17:09,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 7886995456. Throughput: 0: 55767.2. Samples: 792254820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:17:09,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 21:17:10,032][54818] Updated weights for policy 0, policy_version 481388 (0.0033) [2024-04-27 21:17:13,313][54818] Updated weights for policy 0, policy_version 481398 (0.0026) [2024-04-27 21:17:14,253][54587] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 7887290368. Throughput: 0: 55549.3. Samples: 792421700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:17:14,253][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 21:17:15,823][54818] Updated weights for policy 0, policy_version 481408 (0.0027) [2024-04-27 21:17:19,090][54818] Updated weights for policy 0, policy_version 481418 (0.0025) [2024-04-27 21:17:19,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7887568896. Throughput: 0: 55613.3. Samples: 792755220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:17:19,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 21:17:21,787][54818] Updated weights for policy 0, policy_version 481428 (0.0028) [2024-04-27 21:17:24,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7887847424. Throughput: 0: 55784.0. Samples: 793093640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:17:24,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 21:17:24,807][54818] Updated weights for policy 0, policy_version 481438 (0.0028) [2024-04-27 21:17:27,762][54818] Updated weights for policy 0, policy_version 481448 (0.0030) [2024-04-27 21:17:28,926][54798] Signal inference workers to stop experience collection... (11500 times) [2024-04-27 21:17:28,927][54798] Signal inference workers to resume experience collection... 
(11500 times) [2024-04-27 21:17:28,943][54818] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-04-27 21:17:28,943][54818] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-04-27 21:17:29,253][54587] Fps is (10 sec: 55704.8, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 7888125952. Throughput: 0: 55815.7. Samples: 793260060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:17:29,254][54587] Avg episode reward: [(0, '0.701')] [2024-04-27 21:17:30,800][54818] Updated weights for policy 0, policy_version 481458 (0.0028) [2024-04-27 21:17:33,525][54818] Updated weights for policy 0, policy_version 481468 (0.0027) [2024-04-27 21:17:34,253][54587] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7888404480. Throughput: 0: 55921.3. Samples: 793594700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:17:34,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 21:17:36,660][54818] Updated weights for policy 0, policy_version 481478 (0.0028) [2024-04-27 21:17:39,253][54587] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7888666624. Throughput: 0: 55919.9. Samples: 793932220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-27 21:17:39,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 21:17:39,439][54818] Updated weights for policy 0, policy_version 481488 (0.0031) [2024-04-27 21:17:42,371][54818] Updated weights for policy 0, policy_version 481498 (0.0027) [2024-04-27 21:17:44,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7888961536. Throughput: 0: 55989.7. Samples: 794101260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:17:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 21:17:45,420][54818] Updated weights for policy 0, policy_version 481508 (0.0030) [2024-04-27 21:17:48,106][54818] Updated weights for policy 0, policy_version 481518 (0.0031) [2024-04-27 21:17:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 7889256448. Throughput: 0: 55820.4. Samples: 794434320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:17:49,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 21:17:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000481522_7889256448.pth... [2024-04-27 21:17:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000480703_7875837952.pth [2024-04-27 21:17:51,413][54818] Updated weights for policy 0, policy_version 481528 (0.0033) [2024-04-27 21:17:54,008][54818] Updated weights for policy 0, policy_version 481538 (0.0026) [2024-04-27 21:17:54,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7889534976. Throughput: 0: 55922.1. Samples: 794771320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:17:54,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 21:17:57,188][54818] Updated weights for policy 0, policy_version 481548 (0.0030) [2024-04-27 21:17:59,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 7889797120. Throughput: 0: 55970.0. Samples: 794940360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:17:59,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 21:17:59,985][54818] Updated weights for policy 0, policy_version 481558 (0.0029) [2024-04-27 21:18:02,976][54818] Updated weights for policy 0, policy_version 481568 (0.0028) [2024-04-27 21:18:04,253][54587] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55705.6). 
Total num frames: 7890075648. Throughput: 0: 56060.1. Samples: 795277920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:04,253][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 21:18:05,781][54818] Updated weights for policy 0, policy_version 481578 (0.0030) [2024-04-27 21:18:08,865][54818] Updated weights for policy 0, policy_version 481588 (0.0031) [2024-04-27 21:18:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 7890354176. Throughput: 0: 55894.5. Samples: 795608900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:09,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 21:18:11,743][54818] Updated weights for policy 0, policy_version 481598 (0.0034) [2024-04-27 21:18:14,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 7890632704. Throughput: 0: 55925.1. Samples: 795776680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:14,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 21:18:14,796][54818] Updated weights for policy 0, policy_version 481608 (0.0029) [2024-04-27 21:18:17,634][54818] Updated weights for policy 0, policy_version 481618 (0.0029) [2024-04-27 21:18:19,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7890927616. Throughput: 0: 55912.1. Samples: 796110740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:19,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:18:20,570][54818] Updated weights for policy 0, policy_version 481628 (0.0028) [2024-04-27 21:18:23,420][54818] Updated weights for policy 0, policy_version 481638 (0.0029) [2024-04-27 21:18:24,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 7891206144. Throughput: 0: 55818.8. Samples: 796444060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:24,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 21:18:26,533][54818] Updated weights for policy 0, policy_version 481648 (0.0029) [2024-04-27 21:18:29,215][54818] Updated weights for policy 0, policy_version 481658 (0.0033) [2024-04-27 21:18:29,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 7891484672. Throughput: 0: 55827.1. Samples: 796613480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:29,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 21:18:32,311][54818] Updated weights for policy 0, policy_version 481668 (0.0025) [2024-04-27 21:18:34,253][54587] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55705.7). Total num frames: 7891746816. Throughput: 0: 55910.3. Samples: 796950280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:34,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 21:18:34,941][54818] Updated weights for policy 0, policy_version 481678 (0.0031) [2024-04-27 21:18:38,231][54818] Updated weights for policy 0, policy_version 481688 (0.0030) [2024-04-27 21:18:39,253][54587] Fps is (10 sec: 55705.5, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 7892041728. Throughput: 0: 55931.7. Samples: 797288240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:39,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 21:18:40,806][54818] Updated weights for policy 0, policy_version 481698 (0.0031) [2024-04-27 21:18:44,010][54818] Updated weights for policy 0, policy_version 481708 (0.0036) [2024-04-27 21:18:44,253][54587] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7892320256. Throughput: 0: 55775.1. Samples: 797450240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:44,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 21:18:46,752][54818] Updated weights for policy 0, policy_version 481718 (0.0029) [2024-04-27 21:18:49,254][54587] Fps is (10 sec: 54065.7, 60 sec: 55432.4, 300 sec: 55816.6). Total num frames: 7892582400. Throughput: 0: 55814.7. Samples: 797789600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:49,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 21:18:49,609][54798] Signal inference workers to stop experience collection... (11550 times) [2024-04-27 21:18:49,645][54818] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-04-27 21:18:49,658][54798] Signal inference workers to resume experience collection... (11550 times) [2024-04-27 21:18:49,666][54818] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-04-27 21:18:49,781][54818] Updated weights for policy 0, policy_version 481728 (0.0029) [2024-04-27 21:18:52,540][54818] Updated weights for policy 0, policy_version 481738 (0.0036) [2024-04-27 21:18:54,253][54587] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7892877312. Throughput: 0: 55932.5. Samples: 798125860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 21:18:55,650][54818] Updated weights for policy 0, policy_version 481748 (0.0035) [2024-04-27 21:18:58,427][54818] Updated weights for policy 0, policy_version 481758 (0.0030) [2024-04-27 21:18:59,253][54587] Fps is (10 sec: 57345.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7893155840. Throughput: 0: 55819.0. Samples: 798288540. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:18:59,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 21:19:01,682][54818] Updated weights for policy 0, policy_version 481768 (0.0030) [2024-04-27 21:19:04,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 7893434368. Throughput: 0: 55859.6. Samples: 798624420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 21:19:04,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 21:19:04,296][54818] Updated weights for policy 0, policy_version 481778 (0.0032) [2024-04-27 21:19:07,367][54818] Updated weights for policy 0, policy_version 481788 (0.0028) [2024-04-27 21:19:09,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7893712896. Throughput: 0: 55840.8. Samples: 798956900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:09,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:19:10,101][54818] Updated weights for policy 0, policy_version 481798 (0.0028) [2024-04-27 21:19:13,325][54818] Updated weights for policy 0, policy_version 481808 (0.0027) [2024-04-27 21:19:14,253][54587] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 7894007808. Throughput: 0: 56054.1. Samples: 799135920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:14,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 21:19:15,828][54818] Updated weights for policy 0, policy_version 481818 (0.0029) [2024-04-27 21:19:19,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 7894253568. Throughput: 0: 55891.9. Samples: 799465420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:19,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 21:19:19,267][54818] Updated weights for policy 0, policy_version 481828 (0.0034) [2024-04-27 21:19:21,755][54818] Updated weights for policy 0, policy_version 481838 (0.0027) [2024-04-27 21:19:24,253][54587] Fps is (10 sec: 50790.5, 60 sec: 55159.4, 300 sec: 55761.2). Total num frames: 7894515712. Throughput: 0: 55777.7. Samples: 799798240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:24,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 21:19:25,158][54818] Updated weights for policy 0, policy_version 481848 (0.0028) [2024-04-27 21:19:27,694][54818] Updated weights for policy 0, policy_version 481858 (0.0027) [2024-04-27 21:19:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7894827008. Throughput: 0: 55899.7. Samples: 799965720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:29,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 21:19:31,036][54818] Updated weights for policy 0, policy_version 481868 (0.0033) [2024-04-27 21:19:33,652][54818] Updated weights for policy 0, policy_version 481878 (0.0028) [2024-04-27 21:19:34,253][54587] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 7895105536. Throughput: 0: 55745.6. Samples: 800298140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:34,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 21:19:36,783][54818] Updated weights for policy 0, policy_version 481888 (0.0027) [2024-04-27 21:19:39,253][54587] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7895400448. Throughput: 0: 55793.8. Samples: 800636580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:39,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 21:19:39,336][54818] Updated weights for policy 0, policy_version 481898 (0.0038) [2024-04-27 21:19:42,651][54818] Updated weights for policy 0, policy_version 481908 (0.0030) [2024-04-27 21:19:44,253][54587] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7895662592. Throughput: 0: 55939.9. Samples: 800805840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:44,254][54587] Avg episode reward: [(0, '0.501')] [2024-04-27 21:19:45,145][54818] Updated weights for policy 0, policy_version 481918 (0.0027) [2024-04-27 21:19:48,589][54818] Updated weights for policy 0, policy_version 481928 (0.0034) [2024-04-27 21:19:49,253][54587] Fps is (10 sec: 54066.9, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 7895941120. Throughput: 0: 55904.3. Samples: 801140120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:49,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 21:19:49,330][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000481931_7895957504.pth... [2024-04-27 21:19:49,374][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000481114_7882571776.pth [2024-04-27 21:19:51,084][54818] Updated weights for policy 0, policy_version 481938 (0.0028) [2024-04-27 21:19:54,253][54587] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 7896203264. Throughput: 0: 55829.8. Samples: 801469240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:54,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 21:19:54,579][54818] Updated weights for policy 0, policy_version 481948 (0.0030) [2024-04-27 21:19:56,984][54818] Updated weights for policy 0, policy_version 481958 (0.0029) [2024-04-27 21:19:57,677][54798] Signal inference workers to stop experience collection... 
(11600 times) [2024-04-27 21:19:57,710][54818] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-04-27 21:19:57,738][54798] Signal inference workers to resume experience collection... (11600 times) [2024-04-27 21:19:57,742][54818] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-04-27 21:19:59,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 7896481792. Throughput: 0: 55322.1. Samples: 801625420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:19:59,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 21:20:00,560][54818] Updated weights for policy 0, policy_version 481968 (0.0029) [2024-04-27 21:20:02,718][54818] Updated weights for policy 0, policy_version 481978 (0.0032) [2024-04-27 21:20:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 7896776704. Throughput: 0: 55368.9. Samples: 801957020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:20:04,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 21:20:06,489][54818] Updated weights for policy 0, policy_version 481988 (0.0026) [2024-04-27 21:20:08,632][54818] Updated weights for policy 0, policy_version 481998 (0.0032) [2024-04-27 21:20:09,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7897055232. Throughput: 0: 55363.1. Samples: 802289580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:20:09,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 21:20:12,280][54818] Updated weights for policy 0, policy_version 482008 (0.0033) [2024-04-27 21:20:14,253][54587] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7897350144. Throughput: 0: 55732.4. Samples: 802473680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:20:14,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:20:14,515][54818] Updated weights for policy 0, policy_version 482018 (0.0028) [2024-04-27 21:20:18,092][54818] Updated weights for policy 0, policy_version 482028 (0.0026) [2024-04-27 21:20:19,253][54587] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 7897612288. Throughput: 0: 55848.5. Samples: 802811320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:20:19,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 21:20:20,875][54818] Updated weights for policy 0, policy_version 482038 (0.0033) [2024-04-27 21:20:24,039][54818] Updated weights for policy 0, policy_version 482048 (0.0030) [2024-04-27 21:20:24,253][54587] Fps is (10 sec: 54068.0, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 7897890816. Throughput: 0: 55718.9. Samples: 803143920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:20:24,253][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 21:20:26,599][54818] Updated weights for policy 0, policy_version 482058 (0.0026) [2024-04-27 21:20:29,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 7898169344. Throughput: 0: 55569.8. Samples: 803306480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:20:29,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 21:20:29,915][54818] Updated weights for policy 0, policy_version 482068 (0.0030) [2024-04-27 21:20:32,260][54818] Updated weights for policy 0, policy_version 482078 (0.0027) [2024-04-27 21:20:34,253][54587] Fps is (10 sec: 55704.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7898447872. Throughput: 0: 55650.2. Samples: 803644380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:20:34,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 21:20:35,677][54818] Updated weights for policy 0, policy_version 482088 (0.0029) [2024-04-27 21:20:38,174][54818] Updated weights for policy 0, policy_version 482098 (0.0027) [2024-04-27 21:20:39,253][54587] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 7898710016. Throughput: 0: 55679.8. Samples: 803974840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:20:39,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 21:20:41,663][54818] Updated weights for policy 0, policy_version 482108 (0.0027) [2024-04-27 21:20:43,890][54818] Updated weights for policy 0, policy_version 482118 (0.0031) [2024-04-27 21:20:44,253][54587] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 7899021312. Throughput: 0: 56131.7. Samples: 804151340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:20:44,254][54587] Avg episode reward: [(0, '0.702')] [2024-04-27 21:20:47,588][54818] Updated weights for policy 0, policy_version 482128 (0.0026) [2024-04-27 21:20:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 7899316224. Throughput: 0: 56270.9. Samples: 804489220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:20:49,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 21:20:49,636][54818] Updated weights for policy 0, policy_version 482138 (0.0028) [2024-04-27 21:20:51,667][54798] Signal inference workers to stop experience collection... (11650 times) [2024-04-27 21:20:51,667][54798] Signal inference workers to resume experience collection... 
(11650 times) [2024-04-27 21:20:51,705][54818] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-04-27 21:20:51,705][54818] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-04-27 21:20:53,361][54818] Updated weights for policy 0, policy_version 482148 (0.0026) [2024-04-27 21:20:54,253][54587] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 7899561984. Throughput: 0: 56304.4. Samples: 804823280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:20:54,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 21:20:55,622][54818] Updated weights for policy 0, policy_version 482158 (0.0024) [2024-04-27 21:20:59,086][54818] Updated weights for policy 0, policy_version 482168 (0.0032) [2024-04-27 21:20:59,253][54587] Fps is (10 sec: 52429.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 7899840512. Throughput: 0: 55800.7. Samples: 804984720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:20:59,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 21:21:01,688][54818] Updated weights for policy 0, policy_version 482178 (0.0035) [2024-04-27 21:21:04,253][54587] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 7900119040. Throughput: 0: 55673.8. Samples: 805316640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:04,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:21:05,164][54818] Updated weights for policy 0, policy_version 482188 (0.0030) [2024-04-27 21:21:07,606][54818] Updated weights for policy 0, policy_version 482198 (0.0026) [2024-04-27 21:21:09,253][54587] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 7900397568. Throughput: 0: 55804.7. Samples: 805655140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:09,254][54587] Avg episode reward: [(0, '0.728')] [2024-04-27 21:21:11,170][54818] Updated weights for policy 0, policy_version 482208 (0.0028) [2024-04-27 21:21:13,642][54818] Updated weights for policy 0, policy_version 482218 (0.0030) [2024-04-27 21:21:14,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7900676096. Throughput: 0: 55923.6. Samples: 805823040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:14,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 21:21:16,908][54818] Updated weights for policy 0, policy_version 482228 (0.0031) [2024-04-27 21:21:19,253][54587] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 7900971008. Throughput: 0: 55770.8. Samples: 806154060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:19,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 21:21:19,344][54818] Updated weights for policy 0, policy_version 482238 (0.0028) [2024-04-27 21:21:22,707][54818] Updated weights for policy 0, policy_version 482248 (0.0033) [2024-04-27 21:21:24,253][54587] Fps is (10 sec: 58982.8, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 7901265920. Throughput: 0: 55917.5. Samples: 806491120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:21:25,015][54818] Updated weights for policy 0, policy_version 482258 (0.0026) [2024-04-27 21:21:28,511][54818] Updated weights for policy 0, policy_version 482268 (0.0032) [2024-04-27 21:21:29,253][54587] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 7901495296. Throughput: 0: 55767.5. Samples: 806660880. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:29,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 21:21:30,922][54818] Updated weights for policy 0, policy_version 482278 (0.0027) [2024-04-27 21:21:34,253][54587] Fps is (10 sec: 50790.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 7901773824. Throughput: 0: 55721.2. Samples: 806996660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:34,253][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 21:21:34,644][54818] Updated weights for policy 0, policy_version 482288 (0.0034) [2024-04-27 21:21:36,905][54818] Updated weights for policy 0, policy_version 482298 (0.0027) [2024-04-27 21:21:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 7902085120. Throughput: 0: 55605.4. Samples: 807325520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:39,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 21:21:40,471][54818] Updated weights for policy 0, policy_version 482308 (0.0033) [2024-04-27 21:21:42,863][54818] Updated weights for policy 0, policy_version 482318 (0.0028) [2024-04-27 21:21:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 7902347264. Throughput: 0: 55604.6. Samples: 807486920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:44,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 21:21:46,233][54818] Updated weights for policy 0, policy_version 482328 (0.0029) [2024-04-27 21:21:47,081][54798] Signal inference workers to stop experience collection... (11700 times) [2024-04-27 21:21:47,121][54818] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-04-27 21:21:47,133][54798] Signal inference workers to resume experience collection... 
(11700 times) [2024-04-27 21:21:47,136][54818] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-04-27 21:21:48,594][54818] Updated weights for policy 0, policy_version 482338 (0.0029) [2024-04-27 21:21:49,253][54587] Fps is (10 sec: 55705.1, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 7902642176. Throughput: 0: 55678.6. Samples: 807822180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:49,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 21:21:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000482339_7902642176.pth... [2024-04-27 21:21:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000481522_7889256448.pth [2024-04-27 21:21:52,112][54818] Updated weights for policy 0, policy_version 482348 (0.0037) [2024-04-27 21:21:54,253][54587] Fps is (10 sec: 58982.8, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 7902937088. Throughput: 0: 55524.5. Samples: 808153740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:54,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 21:21:54,558][54818] Updated weights for policy 0, policy_version 482358 (0.0027) [2024-04-27 21:21:58,092][54818] Updated weights for policy 0, policy_version 482368 (0.0029) [2024-04-27 21:21:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 7903215616. Throughput: 0: 55648.9. Samples: 808327240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-27 21:21:59,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:22:00,508][54818] Updated weights for policy 0, policy_version 482378 (0.0024) [2024-04-27 21:22:04,114][54818] Updated weights for policy 0, policy_version 482388 (0.0034) [2024-04-27 21:22:04,253][54587] Fps is (10 sec: 50789.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 7903444992. Throughput: 0: 55724.4. Samples: 808661660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:04,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 21:22:06,218][54818] Updated weights for policy 0, policy_version 482398 (0.0027) [2024-04-27 21:22:09,253][54587] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 7903739904. Throughput: 0: 55686.2. Samples: 808997000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:09,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:22:10,213][54818] Updated weights for policy 0, policy_version 482408 (0.0024) [2024-04-27 21:22:11,936][54818] Updated weights for policy 0, policy_version 482418 (0.0036) [2024-04-27 21:22:14,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 7904018432. Throughput: 0: 55365.4. Samples: 809152320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:14,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 21:22:15,967][54818] Updated weights for policy 0, policy_version 482428 (0.0025) [2024-04-27 21:22:18,022][54818] Updated weights for policy 0, policy_version 482438 (0.0032) [2024-04-27 21:22:19,253][54587] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 7904296960. Throughput: 0: 55383.5. Samples: 809488920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:19,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 21:22:21,924][54818] Updated weights for policy 0, policy_version 482448 (0.0030) [2024-04-27 21:22:23,967][54818] Updated weights for policy 0, policy_version 482458 (0.0024) [2024-04-27 21:22:24,253][54587] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 7904591872. Throughput: 0: 55357.3. Samples: 809816600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:24,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:22:28,048][54818] Updated weights for policy 0, policy_version 482468 (0.0043) [2024-04-27 21:22:29,155][54798] Signal inference workers to stop experience collection... (11750 times) [2024-04-27 21:22:29,162][54798] Signal inference workers to resume experience collection... (11750 times) [2024-04-27 21:22:29,183][54818] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-04-27 21:22:29,183][54818] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-04-27 21:22:29,253][54587] Fps is (10 sec: 58982.0, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 7904886784. Throughput: 0: 55840.4. Samples: 809999740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:29,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:22:29,802][54818] Updated weights for policy 0, policy_version 482478 (0.0031) [2024-04-27 21:22:33,871][54818] Updated weights for policy 0, policy_version 482488 (0.0032) [2024-04-27 21:22:34,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 7905116160. Throughput: 0: 55809.8. Samples: 810333620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:34,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 21:22:35,569][54818] Updated weights for policy 0, policy_version 482498 (0.0023) [2024-04-27 21:22:39,253][54587] Fps is (10 sec: 49152.6, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 7905378304. Throughput: 0: 55864.5. Samples: 810667640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:39,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 21:22:39,643][54818] Updated weights for policy 0, policy_version 482508 (0.0028) [2024-04-27 21:22:41,897][54818] Updated weights for policy 0, policy_version 482518 (0.0024) [2024-04-27 21:22:44,253][54587] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 7905673216. Throughput: 0: 55457.7. Samples: 810822840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:44,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 21:22:45,666][54818] Updated weights for policy 0, policy_version 482528 (0.0033) [2024-04-27 21:22:47,862][54818] Updated weights for policy 0, policy_version 482538 (0.0029) [2024-04-27 21:22:49,253][54587] Fps is (10 sec: 57342.8, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 7905951744. Throughput: 0: 55407.9. Samples: 811155020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:49,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 21:22:51,598][54818] Updated weights for policy 0, policy_version 482548 (0.0031) [2024-04-27 21:22:53,631][54818] Updated weights for policy 0, policy_version 482558 (0.0026) [2024-04-27 21:22:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 7906246656. Throughput: 0: 55175.9. Samples: 811479920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:54,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 21:22:57,401][54818] Updated weights for policy 0, policy_version 482568 (0.0032) [2024-04-27 21:22:59,253][54587] Fps is (10 sec: 57344.6, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 7906525184. Throughput: 0: 55811.5. Samples: 811663840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:22:59,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 21:22:59,828][54818] Updated weights for policy 0, policy_version 482578 (0.0029) [2024-04-27 21:23:03,246][54818] Updated weights for policy 0, policy_version 482588 (0.0027) [2024-04-27 21:23:04,253][54587] Fps is (10 sec: 57345.1, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 7906820096. Throughput: 0: 55691.6. Samples: 811995040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:23:04,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 21:23:05,818][54818] Updated weights for policy 0, policy_version 482598 (0.0036) [2024-04-27 21:23:08,739][54798] Signal inference workers to stop experience collection... (11800 times) [2024-04-27 21:23:08,781][54818] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-04-27 21:23:08,797][54798] Signal inference workers to resume experience collection... (11800 times) [2024-04-27 21:23:08,798][54818] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-04-27 21:23:09,165][54818] Updated weights for policy 0, policy_version 482608 (0.0027) [2024-04-27 21:23:09,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 7907049472. Throughput: 0: 55975.6. Samples: 812335500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:23:09,253][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 21:23:11,488][54818] Updated weights for policy 0, policy_version 482618 (0.0026) [2024-04-27 21:23:14,253][54587] Fps is (10 sec: 49152.0, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 7907311616. Throughput: 0: 55199.7. Samples: 812483720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:23:14,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 21:23:15,154][54818] Updated weights for policy 0, policy_version 482628 (0.0032) [2024-04-27 21:23:17,170][54818] Updated weights for policy 0, policy_version 482638 (0.0029) [2024-04-27 21:23:19,253][54587] Fps is (10 sec: 57343.0, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 7907622912. Throughput: 0: 55232.3. Samples: 812819080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:23:19,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 21:23:21,043][54818] Updated weights for policy 0, policy_version 482648 (0.0033) [2024-04-27 21:23:22,972][54818] Updated weights for policy 0, policy_version 482658 (0.0023) [2024-04-27 21:23:24,253][54587] Fps is (10 sec: 57343.4, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 7907885056. Throughput: 0: 55174.5. Samples: 813150500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 21:23:24,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 21:23:26,869][54818] Updated weights for policy 0, policy_version 482668 (0.0029) [2024-04-27 21:23:29,056][54818] Updated weights for policy 0, policy_version 482678 (0.0029) [2024-04-27 21:23:29,253][54587] Fps is (10 sec: 57344.8, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 7908196352. Throughput: 0: 55760.1. Samples: 813332040. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:23:29,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 21:23:32,795][54818] Updated weights for policy 0, policy_version 482688 (0.0026) [2024-04-27 21:23:33,374][54798] Signal inference workers to stop experience collection... (11850 times) [2024-04-27 21:23:33,401][54818] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-04-27 21:23:33,428][54798] Signal inference workers to resume experience collection... 
(11850 times) [2024-04-27 21:23:33,429][54818] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-04-27 21:23:34,253][54587] Fps is (10 sec: 60621.2, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 7908491264. Throughput: 0: 55770.4. Samples: 813664680. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:23:34,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 21:23:35,030][54818] Updated weights for policy 0, policy_version 482698 (0.0028) [2024-04-27 21:23:38,597][54818] Updated weights for policy 0, policy_version 482708 (0.0032) [2024-04-27 21:23:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7908753408. Throughput: 0: 55834.0. Samples: 813992440. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:23:39,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 21:23:40,957][54818] Updated weights for policy 0, policy_version 482718 (0.0029) [2024-04-27 21:23:44,253][54587] Fps is (10 sec: 49152.2, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 7908982784. Throughput: 0: 55426.3. Samples: 814158020. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:23:44,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:23:44,388][54818] Updated weights for policy 0, policy_version 482728 (0.0026) [2024-04-27 21:23:46,864][54818] Updated weights for policy 0, policy_version 482738 (0.0026) [2024-04-27 21:23:49,253][54587] Fps is (10 sec: 50789.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 7909261312. Throughput: 0: 55526.9. Samples: 814493760. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:23:49,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 21:23:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000482743_7909261312.pth... 
[2024-04-27 21:23:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000481931_7895957504.pth [2024-04-27 21:23:50,243][54818] Updated weights for policy 0, policy_version 482748 (0.0036) [2024-04-27 21:23:52,627][54818] Updated weights for policy 0, policy_version 482758 (0.0027) [2024-04-27 21:23:54,253][54587] Fps is (10 sec: 57343.4, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 7909556224. Throughput: 0: 55369.2. Samples: 814827120. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:23:54,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:23:56,226][54818] Updated weights for policy 0, policy_version 482768 (0.0027) [2024-04-27 21:23:58,318][54818] Updated weights for policy 0, policy_version 482778 (0.0033) [2024-04-27 21:23:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 7909851136. Throughput: 0: 55593.7. Samples: 814985440. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:23:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 21:24:02,159][54818] Updated weights for policy 0, policy_version 482788 (0.0028) [2024-04-27 21:24:04,253][54587] Fps is (10 sec: 58982.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 7910146048. Throughput: 0: 55508.2. Samples: 815316940. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:04,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 21:24:04,306][54818] Updated weights for policy 0, policy_version 482798 (0.0031) [2024-04-27 21:24:07,904][54818] Updated weights for policy 0, policy_version 482808 (0.0026) [2024-04-27 21:24:08,579][54798] Signal inference workers to stop experience collection... (11900 times) [2024-04-27 21:24:08,585][54798] Signal inference workers to resume experience collection... 
(11900 times) [2024-04-27 21:24:08,601][54818] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-04-27 21:24:08,601][54818] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-04-27 21:24:09,253][54587] Fps is (10 sec: 58982.4, 60 sec: 56524.7, 300 sec: 55705.6). Total num frames: 7910440960. Throughput: 0: 55581.8. Samples: 815651680. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:09,254][54587] Avg episode reward: [(0, '0.479')] [2024-04-27 21:24:10,331][54818] Updated weights for policy 0, policy_version 482818 (0.0032) [2024-04-27 21:24:13,808][54818] Updated weights for policy 0, policy_version 482828 (0.0026) [2024-04-27 21:24:14,253][54587] Fps is (10 sec: 54067.2, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 7910686720. Throughput: 0: 55452.0. Samples: 815827380. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:14,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 21:24:16,341][54818] Updated weights for policy 0, policy_version 482838 (0.0028) [2024-04-27 21:24:19,253][54587] Fps is (10 sec: 49151.8, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 7910932480. Throughput: 0: 55447.0. Samples: 816159800. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:19,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 21:24:19,720][54818] Updated weights for policy 0, policy_version 482848 (0.0027) [2024-04-27 21:24:22,285][54818] Updated weights for policy 0, policy_version 482858 (0.0037) [2024-04-27 21:24:24,253][54587] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 7911227392. Throughput: 0: 55623.4. Samples: 816495500. 
Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 21:24:25,650][54818] Updated weights for policy 0, policy_version 482868 (0.0031) [2024-04-27 21:24:28,207][54818] Updated weights for policy 0, policy_version 482878 (0.0031) [2024-04-27 21:24:29,253][54587] Fps is (10 sec: 57344.4, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 7911505920. Throughput: 0: 55168.3. Samples: 816640600. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:29,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 21:24:31,513][54818] Updated weights for policy 0, policy_version 482888 (0.0026) [2024-04-27 21:24:34,253][54587] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 7911800832. Throughput: 0: 55149.9. Samples: 816975500. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:34,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 21:24:34,258][54818] Updated weights for policy 0, policy_version 482898 (0.0028) [2024-04-27 21:24:37,079][54818] Updated weights for policy 0, policy_version 482908 (0.0027) [2024-04-27 21:24:39,253][54587] Fps is (10 sec: 58982.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 7912095744. Throughput: 0: 55513.4. Samples: 817325220. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:39,253][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 21:24:39,770][54818] Updated weights for policy 0, policy_version 482918 (0.0026) [2024-04-27 21:24:42,685][54818] Updated weights for policy 0, policy_version 482928 (0.0026) [2024-04-27 21:24:44,253][54587] Fps is (10 sec: 60621.1, 60 sec: 57070.9, 300 sec: 55816.7). Total num frames: 7912407040. Throughput: 0: 56401.5. Samples: 817523500. 
Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:44,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 21:24:45,334][54818] Updated weights for policy 0, policy_version 482938 (0.0024) [2024-04-27 21:24:48,251][54818] Updated weights for policy 0, policy_version 482948 (0.0024) [2024-04-27 21:24:48,684][54798] Signal inference workers to stop experience collection... (11950 times) [2024-04-27 21:24:48,684][54798] Signal inference workers to resume experience collection... (11950 times) [2024-04-27 21:24:48,708][54818] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-04-27 21:24:48,708][54818] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-04-27 21:24:49,253][54587] Fps is (10 sec: 60620.5, 60 sec: 57344.1, 300 sec: 55927.7). Total num frames: 7912701952. Throughput: 0: 56828.9. Samples: 817874240. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-04-27 21:24:49,254][54587] Avg episode reward: [(0, '0.714')] [2024-04-27 21:24:50,892][54818] Updated weights for policy 0, policy_version 482958 (0.0025) [2024-04-27 21:24:53,738][54818] Updated weights for policy 0, policy_version 482968 (0.0025) [2024-04-27 21:24:54,253][54587] Fps is (10 sec: 58982.1, 60 sec: 57344.0, 300 sec: 55983.3). Total num frames: 7912996864. Throughput: 0: 56902.3. Samples: 818212280. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:24:54,254][54587] Avg episode reward: [(0, '0.672')] [2024-04-27 21:24:56,691][54818] Updated weights for policy 0, policy_version 482978 (0.0027) [2024-04-27 21:24:59,171][54818] Updated weights for policy 0, policy_version 482988 (0.0026) [2024-04-27 21:24:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 57071.0, 300 sec: 55927.8). Total num frames: 7913275392. Throughput: 0: 56972.0. Samples: 818391120. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:24:59,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 21:25:02,567][54818] Updated weights for policy 0, policy_version 482998 (0.0027) [2024-04-27 21:25:04,253][54587] Fps is (10 sec: 54066.9, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 7913537536. Throughput: 0: 57552.9. Samples: 818749680. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:04,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 21:25:04,644][54818] Updated weights for policy 0, policy_version 483008 (0.0026) [2024-04-27 21:25:08,126][54818] Updated weights for policy 0, policy_version 483018 (0.0026) [2024-04-27 21:25:09,253][54587] Fps is (10 sec: 52429.0, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 7913799680. Throughput: 0: 58142.3. Samples: 819111900. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:09,253][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 21:25:10,195][54818] Updated weights for policy 0, policy_version 483028 (0.0022) [2024-04-27 21:25:13,455][54818] Updated weights for policy 0, policy_version 483038 (0.0027) [2024-04-27 21:25:14,253][54587] Fps is (10 sec: 55705.5, 60 sec: 56797.8, 300 sec: 55872.2). Total num frames: 7914094592. Throughput: 0: 58416.8. Samples: 819269360. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:14,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 21:25:15,705][54818] Updated weights for policy 0, policy_version 483048 (0.0021) [2024-04-27 21:25:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 57890.3, 300 sec: 55983.3). Total num frames: 7914405888. Throughput: 0: 58855.6. Samples: 819624000. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:19,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 21:25:19,359][54818] Updated weights for policy 0, policy_version 483058 (0.0025) [2024-04-27 21:25:19,935][54798] Signal inference workers to stop experience collection... 
(12000 times) [2024-04-27 21:25:19,973][54818] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-04-27 21:25:20,028][54798] Signal inference workers to resume experience collection... (12000 times) [2024-04-27 21:25:20,028][54818] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-04-27 21:25:21,215][54818] Updated weights for policy 0, policy_version 483068 (0.0024) [2024-04-27 21:25:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 57890.1, 300 sec: 56038.8). Total num frames: 7914700800. Throughput: 0: 58911.5. Samples: 819976240. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:24,254][54587] Avg episode reward: [(0, '0.511')] [2024-04-27 21:25:25,331][54818] Updated weights for policy 0, policy_version 483078 (0.0027) [2024-04-27 21:25:26,664][54818] Updated weights for policy 0, policy_version 483088 (0.0028) [2024-04-27 21:25:29,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58436.2, 300 sec: 56149.9). Total num frames: 7915012096. Throughput: 0: 58208.3. Samples: 820142880. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:29,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 21:25:30,685][54818] Updated weights for policy 0, policy_version 483098 (0.0026) [2024-04-27 21:25:32,419][54818] Updated weights for policy 0, policy_version 483108 (0.0025) [2024-04-27 21:25:34,253][54587] Fps is (10 sec: 62259.3, 60 sec: 58709.3, 300 sec: 56316.5). Total num frames: 7915323392. Throughput: 0: 58264.9. Samples: 820496160. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:34,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 21:25:36,217][54818] Updated weights for policy 0, policy_version 483118 (0.0025) [2024-04-27 21:25:38,005][54818] Updated weights for policy 0, policy_version 483128 (0.0022) [2024-04-27 21:25:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58709.2, 300 sec: 56261.0). Total num frames: 7915618304. Throughput: 0: 58765.7. Samples: 820856740. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:39,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 21:25:41,661][54818] Updated weights for policy 0, policy_version 483138 (0.0026) [2024-04-27 21:25:43,416][54818] Updated weights for policy 0, policy_version 483148 (0.0025) [2024-04-27 21:25:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58436.2, 300 sec: 56261.0). Total num frames: 7915913216. Throughput: 0: 59005.2. Samples: 821046360. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:44,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 21:25:47,149][54818] Updated weights for policy 0, policy_version 483158 (0.0024) [2024-04-27 21:25:49,092][54818] Updated weights for policy 0, policy_version 483168 (0.0025) [2024-04-27 21:25:49,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.3, 300 sec: 56483.2). Total num frames: 7916224512. Throughput: 0: 58571.6. Samples: 821385400. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:49,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:25:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000483168_7916224512.pth... [2024-04-27 21:25:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000482339_7902642176.pth [2024-04-27 21:25:50,452][54798] Signal inference workers to stop experience collection... (12050 times) [2024-04-27 21:25:50,453][54798] Signal inference workers to resume experience collection... (12050 times) [2024-04-27 21:25:50,462][54818] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-04-27 21:25:50,483][54818] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-04-27 21:25:52,604][54818] Updated weights for policy 0, policy_version 483178 (0.0025) [2024-04-27 21:25:54,253][54587] Fps is (10 sec: 62259.8, 60 sec: 58982.4, 300 sec: 56594.3). Total num frames: 7916535808. Throughput: 0: 58280.9. Samples: 821734540. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:54,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 21:25:55,010][54818] Updated weights for policy 0, policy_version 483188 (0.0027) [2024-04-27 21:25:58,059][54818] Updated weights for policy 0, policy_version 483198 (0.0025) [2024-04-27 21:25:59,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 56594.2). Total num frames: 7916814336. Throughput: 0: 59047.2. Samples: 821926480. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:25:59,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 21:26:00,811][54818] Updated weights for policy 0, policy_version 483208 (0.0023) [2024-04-27 21:26:03,556][54818] Updated weights for policy 0, policy_version 483218 (0.0025) [2024-04-27 21:26:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59801.6, 300 sec: 56705.3). Total num frames: 7917125632. Throughput: 0: 59131.0. Samples: 822284900. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:26:04,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 21:26:06,282][54818] Updated weights for policy 0, policy_version 483228 (0.0030) [2024-04-27 21:26:09,020][54818] Updated weights for policy 0, policy_version 483238 (0.0026) [2024-04-27 21:26:09,253][54587] Fps is (10 sec: 57344.4, 60 sec: 59801.6, 300 sec: 56649.8). Total num frames: 7917387776. Throughput: 0: 59123.2. Samples: 822636780. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:26:09,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:26:11,986][54818] Updated weights for policy 0, policy_version 483248 (0.0028) [2024-04-27 21:26:14,253][54587] Fps is (10 sec: 54067.4, 60 sec: 59528.6, 300 sec: 56594.2). Total num frames: 7917666304. Throughput: 0: 59182.4. Samples: 822806080. 
Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) [2024-04-27 21:26:14,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 21:26:14,505][54818] Updated weights for policy 0, policy_version 483258 (0.0024) [2024-04-27 21:26:17,422][54818] Updated weights for policy 0, policy_version 483268 (0.0024) [2024-04-27 21:26:19,253][54587] Fps is (10 sec: 55705.2, 60 sec: 58982.3, 300 sec: 56538.7). Total num frames: 7917944832. Throughput: 0: 59117.7. Samples: 823156460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:26:19,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 21:26:19,932][54818] Updated weights for policy 0, policy_version 483278 (0.0027) [2024-04-27 21:26:22,812][54798] Signal inference workers to stop experience collection... (12100 times) [2024-04-27 21:26:22,830][54818] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-04-27 21:26:22,906][54798] Signal inference workers to resume experience collection... (12100 times) [2024-04-27 21:26:22,906][54818] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-04-27 21:26:23,011][54818] Updated weights for policy 0, policy_version 483288 (0.0024) [2024-04-27 21:26:24,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58709.4, 300 sec: 56705.3). Total num frames: 7918223360. Throughput: 0: 58991.2. Samples: 823511340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:26:24,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 21:26:25,448][54818] Updated weights for policy 0, policy_version 483298 (0.0026) [2024-04-27 21:26:28,688][54818] Updated weights for policy 0, policy_version 483308 (0.0025) [2024-04-27 21:26:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.4, 300 sec: 56871.9). Total num frames: 7918551040. Throughput: 0: 58525.3. Samples: 823680000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:26:29,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:26:30,985][54818] Updated weights for policy 0, policy_version 483318 (0.0022) [2024-04-27 21:26:34,153][54818] Updated weights for policy 0, policy_version 483328 (0.0026) [2024-04-27 21:26:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58709.3, 300 sec: 56816.4). Total num frames: 7918845952. Throughput: 0: 58929.8. Samples: 824037240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:26:34,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 21:26:36,552][54818] Updated weights for policy 0, policy_version 483338 (0.0022) [2024-04-27 21:26:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58436.2, 300 sec: 56871.9). Total num frames: 7919124480. Throughput: 0: 59010.9. Samples: 824390040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:26:39,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 21:26:39,799][54818] Updated weights for policy 0, policy_version 483348 (0.0027) [2024-04-27 21:26:42,038][54818] Updated weights for policy 0, policy_version 483358 (0.0021) [2024-04-27 21:26:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 58163.2, 300 sec: 56816.4). Total num frames: 7919403008. Throughput: 0: 58466.3. Samples: 824557460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:26:44,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:26:45,551][54818] Updated weights for policy 0, policy_version 483368 (0.0027) [2024-04-27 21:26:47,534][54818] Updated weights for policy 0, policy_version 483378 (0.0023) [2024-04-27 21:26:49,253][54587] Fps is (10 sec: 57344.7, 60 sec: 57890.2, 300 sec: 56816.4). Total num frames: 7919697920. Throughput: 0: 58165.4. Samples: 824902340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:26:49,253][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 21:26:51,175][54818] Updated weights for policy 0, policy_version 483388 (0.0026) [2024-04-27 21:26:53,056][54818] Updated weights for policy 0, policy_version 483398 (0.0027) [2024-04-27 21:26:54,253][54587] Fps is (10 sec: 62259.3, 60 sec: 58163.2, 300 sec: 56983.0). Total num frames: 7920025600. Throughput: 0: 58260.4. Samples: 825258500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:26:54,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 21:26:57,063][54818] Updated weights for policy 0, policy_version 483408 (0.0027) [2024-04-27 21:26:57,066][54798] Signal inference workers to stop experience collection... (12150 times) [2024-04-27 21:26:57,066][54798] Signal inference workers to resume experience collection... (12150 times) [2024-04-27 21:26:57,081][54818] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-04-27 21:26:57,082][54818] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-04-27 21:26:58,622][54818] Updated weights for policy 0, policy_version 483418 (0.0024) [2024-04-27 21:26:59,253][54587] Fps is (10 sec: 62258.9, 60 sec: 58436.2, 300 sec: 57205.2). Total num frames: 7920320512. Throughput: 0: 58789.7. Samples: 825451620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:26:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 21:27:02,486][54818] Updated weights for policy 0, policy_version 483428 (0.0025) [2024-04-27 21:27:04,253][54587] Fps is (10 sec: 60620.2, 60 sec: 58436.2, 300 sec: 57260.7). Total num frames: 7920631808. Throughput: 0: 58777.7. Samples: 825801460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:27:04,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 21:27:04,382][54818] Updated weights for policy 0, policy_version 483438 (0.0025) [2024-04-27 21:27:07,925][54818] Updated weights for policy 0, policy_version 483448 (0.0027) [2024-04-27 21:27:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 59255.4, 300 sec: 57371.8). Total num frames: 7920943104. Throughput: 0: 58509.3. Samples: 826144260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:27:09,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 21:27:10,574][54818] Updated weights for policy 0, policy_version 483458 (0.0024) [2024-04-27 21:27:13,365][54818] Updated weights for policy 0, policy_version 483468 (0.0025) [2024-04-27 21:27:14,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59255.5, 300 sec: 57371.8). Total num frames: 7921221632. Throughput: 0: 59133.0. Samples: 826340980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:27:14,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 21:27:16,104][54818] Updated weights for policy 0, policy_version 483478 (0.0023) [2024-04-27 21:27:18,876][54818] Updated weights for policy 0, policy_version 483488 (0.0023) [2024-04-27 21:27:19,253][54587] Fps is (10 sec: 54067.6, 60 sec: 58982.5, 300 sec: 57260.7). Total num frames: 7921483776. Throughput: 0: 58888.5. Samples: 826687220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:27:19,253][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 21:27:21,664][54818] Updated weights for policy 0, policy_version 483498 (0.0029) [2024-04-27 21:27:24,253][54587] Fps is (10 sec: 55705.5, 60 sec: 59255.5, 300 sec: 57260.7). Total num frames: 7921778688. Throughput: 0: 58927.8. Samples: 827041780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:27:24,253][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 21:27:24,328][54818] Updated weights for policy 0, policy_version 483508 (0.0025) [2024-04-27 21:27:27,221][54818] Updated weights for policy 0, policy_version 483518 (0.0025) [2024-04-27 21:27:29,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58436.3, 300 sec: 57427.3). Total num frames: 7922057216. Throughput: 0: 59114.2. Samples: 827217600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:27:29,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 21:27:29,838][54818] Updated weights for policy 0, policy_version 483528 (0.0025) [2024-04-27 21:27:33,070][54818] Updated weights for policy 0, policy_version 483538 (0.0027) [2024-04-27 21:27:33,635][54798] Signal inference workers to stop experience collection... (12200 times) [2024-04-27 21:27:33,635][54798] Signal inference workers to resume experience collection... (12200 times) [2024-04-27 21:27:33,646][54818] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-04-27 21:27:33,647][54818] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-04-27 21:27:34,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58436.3, 300 sec: 57538.4). Total num frames: 7922352128. Throughput: 0: 59311.2. Samples: 827571340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:27:34,253][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 21:27:35,287][54818] Updated weights for policy 0, policy_version 483548 (0.0026) [2024-04-27 21:27:38,675][54818] Updated weights for policy 0, policy_version 483558 (0.0024) [2024-04-27 21:27:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 57538.4). Total num frames: 7922647040. Throughput: 0: 59411.1. Samples: 827932000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:27:39,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 21:27:40,743][54818] Updated weights for policy 0, policy_version 483568 (0.0027) [2024-04-27 21:27:44,145][54818] Updated weights for policy 0, policy_version 483578 (0.0025) [2024-04-27 21:27:44,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.4, 300 sec: 57593.9). Total num frames: 7922941952. Throughput: 0: 58533.8. Samples: 828085640. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:27:44,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 21:27:46,169][54818] Updated weights for policy 0, policy_version 483588 (0.0027) [2024-04-27 21:27:49,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58982.2, 300 sec: 57593.9). Total num frames: 7923236864. Throughput: 0: 58760.3. Samples: 828445680. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:27:49,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:27:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000483596_7923236864.pth... [2024-04-27 21:27:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000482743_7909261312.pth [2024-04-27 21:27:49,757][54818] Updated weights for policy 0, policy_version 483598 (0.0027) [2024-04-27 21:27:51,681][54818] Updated weights for policy 0, policy_version 483608 (0.0024) [2024-04-27 21:27:54,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58436.3, 300 sec: 57649.5). Total num frames: 7923531776. Throughput: 0: 59156.5. Samples: 828806300. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:27:54,253][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 21:27:55,326][54818] Updated weights for policy 0, policy_version 483618 (0.0027) [2024-04-27 21:27:57,325][54818] Updated weights for policy 0, policy_version 483628 (0.0029) [2024-04-27 21:27:59,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58709.3, 300 sec: 57705.0). 
Total num frames: 7923843072. Throughput: 0: 58716.3. Samples: 828983220. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:27:59,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 21:28:00,705][54818] Updated weights for policy 0, policy_version 483638 (0.0022) [2024-04-27 21:28:02,795][54818] Updated weights for policy 0, policy_version 483648 (0.0026) [2024-04-27 21:28:04,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58709.4, 300 sec: 57982.7). Total num frames: 7924154368. Throughput: 0: 58659.0. Samples: 829326880. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:04,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 21:28:06,426][54818] Updated weights for policy 0, policy_version 483658 (0.0027) [2024-04-27 21:28:08,181][54818] Updated weights for policy 0, policy_version 483668 (0.0027) [2024-04-27 21:28:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58436.3, 300 sec: 58093.8). Total num frames: 7924449280. Throughput: 0: 58537.7. Samples: 829675980. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:09,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 21:28:12,092][54818] Updated weights for policy 0, policy_version 483678 (0.0026) [2024-04-27 21:28:13,689][54818] Updated weights for policy 0, policy_version 483688 (0.0025) [2024-04-27 21:28:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58982.4, 300 sec: 58093.8). Total num frames: 7924760576. Throughput: 0: 58845.0. Samples: 829865620. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:14,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 21:28:16,549][54798] Signal inference workers to stop experience collection... (12250 times) [2024-04-27 21:28:16,589][54818] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-04-27 21:28:16,599][54798] Signal inference workers to resume experience collection... 
(12250 times) [2024-04-27 21:28:16,605][54818] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-04-27 21:28:17,455][54818] Updated weights for policy 0, policy_version 483698 (0.0026) [2024-04-27 21:28:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59528.5, 300 sec: 58204.9). Total num frames: 7925055488. Throughput: 0: 58748.8. Samples: 830215040. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:19,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 21:28:19,282][54818] Updated weights for policy 0, policy_version 483708 (0.0025) [2024-04-27 21:28:22,970][54818] Updated weights for policy 0, policy_version 483718 (0.0025) [2024-04-27 21:28:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59255.4, 300 sec: 58093.8). Total num frames: 7925334016. Throughput: 0: 58588.9. Samples: 830568500. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:24,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 21:28:25,008][54818] Updated weights for policy 0, policy_version 483728 (0.0030) [2024-04-27 21:28:28,638][54818] Updated weights for policy 0, policy_version 483738 (0.0027) [2024-04-27 21:28:29,253][54587] Fps is (10 sec: 55705.7, 60 sec: 59255.5, 300 sec: 58038.2). Total num frames: 7925612544. Throughput: 0: 59024.5. Samples: 830741740. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:29,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 21:28:30,609][54818] Updated weights for policy 0, policy_version 483748 (0.0028) [2024-04-27 21:28:34,051][54818] Updated weights for policy 0, policy_version 483758 (0.0026) [2024-04-27 21:28:34,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58982.4, 300 sec: 58093.8). Total num frames: 7925891072. Throughput: 0: 58932.7. Samples: 831097640. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:34,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 21:28:36,413][54818] Updated weights for policy 0, policy_version 483768 (0.0025) [2024-04-27 21:28:39,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.5, 300 sec: 58315.9). Total num frames: 7926185984. Throughput: 0: 58817.0. Samples: 831453060. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:39,253][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 21:28:39,611][54818] Updated weights for policy 0, policy_version 483778 (0.0023) [2024-04-27 21:28:41,774][54818] Updated weights for policy 0, policy_version 483788 (0.0026) [2024-04-27 21:28:44,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58709.3, 300 sec: 58315.9). Total num frames: 7926464512. Throughput: 0: 58441.3. Samples: 831613080. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:44,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 21:28:45,186][54818] Updated weights for policy 0, policy_version 483798 (0.0025) [2024-04-27 21:28:47,775][54818] Updated weights for policy 0, policy_version 483808 (0.0025) [2024-04-27 21:28:49,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58982.5, 300 sec: 58371.5). Total num frames: 7926775808. Throughput: 0: 58877.3. Samples: 831976360. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:49,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 21:28:50,771][54818] Updated weights for policy 0, policy_version 483818 (0.0027) [2024-04-27 21:28:53,375][54818] Updated weights for policy 0, policy_version 483828 (0.0030) [2024-04-27 21:28:54,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.3, 300 sec: 58371.5). Total num frames: 7927070720. Throughput: 0: 58997.8. Samples: 832330880. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:54,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 21:28:56,211][54818] Updated weights for policy 0, policy_version 483838 (0.0027) [2024-04-27 21:28:59,078][54818] Updated weights for policy 0, policy_version 483848 (0.0027) [2024-04-27 21:28:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 58371.5). Total num frames: 7927365632. Throughput: 0: 58538.6. Samples: 832499860. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:28:59,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 21:29:01,794][54818] Updated weights for policy 0, policy_version 483858 (0.0026) [2024-04-27 21:29:04,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58436.3, 300 sec: 58371.5). Total num frames: 7927660544. Throughput: 0: 58636.5. Samples: 832853680. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:29:04,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 21:29:04,413][54818] Updated weights for policy 0, policy_version 483868 (0.0027) [2024-04-27 21:29:07,203][54818] Updated weights for policy 0, policy_version 483878 (0.0026) [2024-04-27 21:29:07,465][54798] Signal inference workers to stop experience collection... (12300 times) [2024-04-27 21:29:07,505][54818] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-04-27 21:29:07,555][54798] Signal inference workers to resume experience collection... (12300 times) [2024-04-27 21:29:07,555][54818] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-04-27 21:29:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 58593.6). Total num frames: 7927971840. Throughput: 0: 58734.6. Samples: 833211560. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-27 21:29:09,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 21:29:10,236][54818] Updated weights for policy 0, policy_version 483888 (0.0023) [2024-04-27 21:29:12,774][54818] Updated weights for policy 0, policy_version 483898 (0.0027) [2024-04-27 21:29:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58436.3, 300 sec: 58760.3). Total num frames: 7928266752. Throughput: 0: 58952.5. Samples: 833394600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:14,254][54587] Avg episode reward: [(0, '0.485')] [2024-04-27 21:29:15,723][54818] Updated weights for policy 0, policy_version 483908 (0.0025) [2024-04-27 21:29:18,150][54818] Updated weights for policy 0, policy_version 483918 (0.0026) [2024-04-27 21:29:19,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58436.2, 300 sec: 58760.2). Total num frames: 7928561664. Throughput: 0: 58938.1. Samples: 833749860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:19,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 21:29:21,321][54818] Updated weights for policy 0, policy_version 483928 (0.0025) [2024-04-27 21:29:23,704][54818] Updated weights for policy 0, policy_version 483938 (0.0027) [2024-04-27 21:29:24,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 7928872960. Throughput: 0: 58874.1. Samples: 834102400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:24,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 21:29:26,816][54818] Updated weights for policy 0, policy_version 483948 (0.0026) [2024-04-27 21:29:29,195][54818] Updated weights for policy 0, policy_version 483958 (0.0027) [2024-04-27 21:29:29,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 7929167872. Throughput: 0: 59368.6. Samples: 834284660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:29,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 21:29:32,445][54818] Updated weights for policy 0, policy_version 483968 (0.0025) [2024-04-27 21:29:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59528.5, 300 sec: 58871.3). Total num frames: 7929462784. Throughput: 0: 59192.5. Samples: 834640020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:34,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 21:29:34,545][54818] Updated weights for policy 0, policy_version 483978 (0.0027) [2024-04-27 21:29:37,948][54818] Updated weights for policy 0, policy_version 483988 (0.0026) [2024-04-27 21:29:39,253][54587] Fps is (10 sec: 57343.5, 60 sec: 59255.3, 300 sec: 58760.2). Total num frames: 7929741312. Throughput: 0: 59132.0. Samples: 834991820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:39,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:29:40,077][54818] Updated weights for policy 0, policy_version 483998 (0.0024) [2024-04-27 21:29:43,434][54818] Updated weights for policy 0, policy_version 484008 (0.0028) [2024-04-27 21:29:44,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59801.8, 300 sec: 58815.8). Total num frames: 7930052608. Throughput: 0: 59276.6. Samples: 835167300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:44,253][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 21:29:45,814][54818] Updated weights for policy 0, policy_version 484018 (0.0028) [2024-04-27 21:29:48,900][54818] Updated weights for policy 0, policy_version 484028 (0.0027) [2024-04-27 21:29:49,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.5, 300 sec: 58760.2). Total num frames: 7930331136. Throughput: 0: 59335.5. Samples: 835523780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:49,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 21:29:49,261][54587] No heartbeat for components: RolloutWorker_w4 (337 seconds) [2024-04-27 21:29:49,295][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000484030_7930347520.pth... [2024-04-27 21:29:49,348][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000483168_7916224512.pth [2024-04-27 21:29:51,369][54818] Updated weights for policy 0, policy_version 484038 (0.0024) [2024-04-27 21:29:54,253][54587] Fps is (10 sec: 57343.1, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 7930626048. Throughput: 0: 59335.5. Samples: 835881660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:54,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 21:29:54,453][54818] Updated weights for policy 0, policy_version 484048 (0.0026) [2024-04-27 21:29:57,131][54818] Updated weights for policy 0, policy_version 484058 (0.0026) [2024-04-27 21:29:59,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 7930904576. Throughput: 0: 58927.1. Samples: 836046320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:29:59,262][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 21:29:59,917][54818] Updated weights for policy 0, policy_version 484068 (0.0026) [2024-04-27 21:30:02,534][54818] Updated weights for policy 0, policy_version 484078 (0.0027) [2024-04-27 21:30:04,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 7931199488. Throughput: 0: 58990.7. Samples: 836404440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:30:04,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 21:30:05,068][54798] Signal inference workers to stop experience collection... (12350 times) [2024-04-27 21:30:05,068][54798] Signal inference workers to resume experience collection... 
(12350 times) [2024-04-27 21:30:05,084][54818] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-04-27 21:30:05,085][54818] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-04-27 21:30:05,351][54818] Updated weights for policy 0, policy_version 484088 (0.0025) [2024-04-27 21:30:08,276][54818] Updated weights for policy 0, policy_version 484098 (0.0024) [2024-04-27 21:30:09,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 7931494400. Throughput: 0: 59071.9. Samples: 836760640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:30:09,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 21:30:10,841][54818] Updated weights for policy 0, policy_version 484108 (0.0023) [2024-04-27 21:30:13,858][54818] Updated weights for policy 0, policy_version 484118 (0.0024) [2024-04-27 21:30:14,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 7931805696. Throughput: 0: 58895.1. Samples: 836934940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:30:14,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:30:16,408][54818] Updated weights for policy 0, policy_version 484128 (0.0025) [2024-04-27 21:30:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 7932100608. Throughput: 0: 58954.6. Samples: 837292980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:30:19,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:30:19,375][54818] Updated weights for policy 0, policy_version 484138 (0.0027) [2024-04-27 21:30:21,812][54818] Updated weights for policy 0, policy_version 484148 (0.0025) [2024-04-27 21:30:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 7932395520. Throughput: 0: 58944.0. Samples: 837644300. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:30:24,262][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 21:30:24,903][54818] Updated weights for policy 0, policy_version 484158 (0.0026) [2024-04-27 21:30:27,721][54818] Updated weights for policy 0, policy_version 484168 (0.0026) [2024-04-27 21:30:29,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 7932690432. Throughput: 0: 58962.8. Samples: 837820640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:30:29,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 21:30:30,365][54818] Updated weights for policy 0, policy_version 484178 (0.0027) [2024-04-27 21:30:33,248][54818] Updated weights for policy 0, policy_version 484188 (0.0026) [2024-04-27 21:30:34,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 7933001728. Throughput: 0: 58647.5. Samples: 838162920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 21:30:34,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 21:30:36,700][54818] Updated weights for policy 0, policy_version 484198 (0.0027) [2024-04-27 21:30:38,828][54818] Updated weights for policy 0, policy_version 484208 (0.0026) [2024-04-27 21:30:39,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 7933280256. Throughput: 0: 58789.2. Samples: 838527180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:30:39,263][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 21:30:42,118][54818] Updated weights for policy 0, policy_version 484218 (0.0024) [2024-04-27 21:30:43,089][54798] Signal inference workers to stop experience collection... (12400 times) [2024-04-27 21:30:43,089][54798] Signal inference workers to resume experience collection... 
(12400 times) [2024-04-27 21:30:43,115][54818] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-04-27 21:30:43,115][54818] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-04-27 21:30:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.2, 300 sec: 58815.8). Total num frames: 7933575168. Throughput: 0: 59200.0. Samples: 838710320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:30:44,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 21:30:44,484][54818] Updated weights for policy 0, policy_version 484228 (0.0027) [2024-04-27 21:30:47,712][54818] Updated weights for policy 0, policy_version 484238 (0.0027) [2024-04-27 21:30:49,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.3, 300 sec: 58760.2). Total num frames: 7933870080. Throughput: 0: 59183.9. Samples: 839067720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:30:49,262][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 21:30:50,151][54818] Updated weights for policy 0, policy_version 484248 (0.0027) [2024-04-27 21:30:53,096][54818] Updated weights for policy 0, policy_version 484258 (0.0026) [2024-04-27 21:30:54,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 7934181376. Throughput: 0: 58953.4. Samples: 839413540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:30:54,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:30:55,594][54818] Updated weights for policy 0, policy_version 484268 (0.0028) [2024-04-27 21:30:58,758][54818] Updated weights for policy 0, policy_version 484278 (0.0026) [2024-04-27 21:30:59,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.4, 300 sec: 58760.2). Total num frames: 7934459904. Throughput: 0: 58883.4. Samples: 839584700. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:30:59,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 21:31:01,003][54818] Updated weights for policy 0, policy_version 484288 (0.0028) [2024-04-27 21:31:04,213][54818] Updated weights for policy 0, policy_version 484298 (0.0026) [2024-04-27 21:31:04,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 7934738432. Throughput: 0: 58824.5. Samples: 839940080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:04,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 21:31:06,595][54818] Updated weights for policy 0, policy_version 484308 (0.0024) [2024-04-27 21:31:09,253][54587] Fps is (10 sec: 55706.1, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 7935016960. Throughput: 0: 58971.6. Samples: 840298020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:09,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 21:31:09,772][54818] Updated weights for policy 0, policy_version 484318 (0.0027) [2024-04-27 21:31:12,163][54818] Updated weights for policy 0, policy_version 484328 (0.0027) [2024-04-27 21:31:14,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58436.2, 300 sec: 58871.3). Total num frames: 7935311872. Throughput: 0: 58844.6. Samples: 840468640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:14,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 21:31:15,210][54818] Updated weights for policy 0, policy_version 484338 (0.0027) [2024-04-27 21:31:17,568][54818] Updated weights for policy 0, policy_version 484348 (0.0025) [2024-04-27 21:31:19,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 7935623168. Throughput: 0: 58947.1. Samples: 840815540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:19,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 21:31:20,677][54818] Updated weights for policy 0, policy_version 484358 (0.0025) [2024-04-27 21:31:22,641][54798] Signal inference workers to stop experience collection... (12450 times) [2024-04-27 21:31:22,641][54798] Signal inference workers to resume experience collection... (12450 times) [2024-04-27 21:31:22,651][54818] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-04-27 21:31:22,652][54818] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-04-27 21:31:23,111][54818] Updated weights for policy 0, policy_version 484368 (0.0024) [2024-04-27 21:31:24,253][54587] Fps is (10 sec: 62259.1, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 7935934464. Throughput: 0: 58748.6. Samples: 841170860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:24,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 21:31:26,148][54818] Updated weights for policy 0, policy_version 484378 (0.0025) [2024-04-27 21:31:28,810][54818] Updated weights for policy 0, policy_version 484388 (0.0023) [2024-04-27 21:31:29,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 7936229376. Throughput: 0: 58578.1. Samples: 841346340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:29,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 21:31:31,606][54818] Updated weights for policy 0, policy_version 484398 (0.0026) [2024-04-27 21:31:34,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 7936524288. Throughput: 0: 58560.1. Samples: 841702920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:34,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 21:31:34,330][54818] Updated weights for policy 0, policy_version 484408 (0.0026) [2024-04-27 21:31:37,309][54818] Updated weights for policy 0, policy_version 484418 (0.0027) [2024-04-27 21:31:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 7936835584. Throughput: 0: 58766.2. Samples: 842058020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:39,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 21:31:39,991][54818] Updated weights for policy 0, policy_version 484428 (0.0024) [2024-04-27 21:31:42,970][54818] Updated weights for policy 0, policy_version 484438 (0.0027) [2024-04-27 21:31:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 7937114112. Throughput: 0: 59149.9. Samples: 842246440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:44,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 21:31:45,503][54818] Updated weights for policy 0, policy_version 484448 (0.0026) [2024-04-27 21:31:48,544][54818] Updated weights for policy 0, policy_version 484458 (0.0027) [2024-04-27 21:31:49,253][54587] Fps is (10 sec: 55705.2, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 7937392640. Throughput: 0: 58857.6. Samples: 842588680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:49,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 21:31:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000484460_7937392640.pth... [2024-04-27 21:31:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000483596_7923236864.pth [2024-04-27 21:31:51,187][54818] Updated weights for policy 0, policy_version 484468 (0.0026) [2024-04-27 21:31:54,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58163.3, 300 sec: 58815.8). 
Total num frames: 7937671168. Throughput: 0: 58813.3. Samples: 842944620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:54,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 21:31:54,287][54818] Updated weights for policy 0, policy_version 484478 (0.0025) [2024-04-27 21:31:56,695][54818] Updated weights for policy 0, policy_version 484488 (0.0024) [2024-04-27 21:31:59,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 7937982464. Throughput: 0: 58863.4. Samples: 843117500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-27 21:31:59,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 21:31:59,628][54818] Updated weights for policy 0, policy_version 484498 (0.0024) [2024-04-27 21:32:02,452][54818] Updated weights for policy 0, policy_version 484508 (0.0025) [2024-04-27 21:32:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.5, 300 sec: 58760.3). Total num frames: 7938277376. Throughput: 0: 59142.8. Samples: 843476960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:04,253][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 21:32:05,205][54818] Updated weights for policy 0, policy_version 484518 (0.0026) [2024-04-27 21:32:05,824][54798] Signal inference workers to stop experience collection... (12500 times) [2024-04-27 21:32:05,846][54818] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-04-27 21:32:05,919][54798] Signal inference workers to resume experience collection... (12500 times) [2024-04-27 21:32:05,919][54818] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-04-27 21:32:08,447][54818] Updated weights for policy 0, policy_version 484528 (0.0024) [2024-04-27 21:32:09,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 7938572288. Throughput: 0: 59138.7. Samples: 843832100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:09,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 21:32:10,728][54818] Updated weights for policy 0, policy_version 484538 (0.0027) [2024-04-27 21:32:13,868][54818] Updated weights for policy 0, policy_version 484548 (0.0027) [2024-04-27 21:32:14,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59255.4, 300 sec: 58926.8). Total num frames: 7938867200. Throughput: 0: 58926.3. Samples: 843998020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:14,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 21:32:16,196][54818] Updated weights for policy 0, policy_version 484558 (0.0027) [2024-04-27 21:32:19,254][54587] Fps is (10 sec: 57342.7, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 7939145728. Throughput: 0: 58822.4. Samples: 844349940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:19,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 21:32:19,372][54818] Updated weights for policy 0, policy_version 484568 (0.0025) [2024-04-27 21:32:21,735][54818] Updated weights for policy 0, policy_version 484578 (0.0027) [2024-04-27 21:32:24,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58436.2, 300 sec: 58926.8). Total num frames: 7939440640. Throughput: 0: 58926.6. Samples: 844709720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:24,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 21:32:24,913][54818] Updated weights for policy 0, policy_version 484588 (0.0029) [2024-04-27 21:32:27,313][54818] Updated weights for policy 0, policy_version 484598 (0.0027) [2024-04-27 21:32:29,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 7939751936. Throughput: 0: 58776.0. Samples: 844891360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:29,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 21:32:30,361][54818] Updated weights for policy 0, policy_version 484608 (0.0025) [2024-04-27 21:32:32,800][54818] Updated weights for policy 0, policy_version 484618 (0.0027) [2024-04-27 21:32:34,253][54587] Fps is (10 sec: 62259.5, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 7940063232. Throughput: 0: 58957.5. Samples: 845241760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:34,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 21:32:35,748][54818] Updated weights for policy 0, policy_version 484628 (0.0022) [2024-04-27 21:32:38,381][54818] Updated weights for policy 0, policy_version 484638 (0.0025) [2024-04-27 21:32:39,253][54587] Fps is (10 sec: 62259.3, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 7940374528. Throughput: 0: 58678.6. Samples: 845585160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 21:32:41,276][54818] Updated weights for policy 0, policy_version 484648 (0.0025) [2024-04-27 21:32:43,768][54818] Updated weights for policy 0, policy_version 484658 (0.0025) [2024-04-27 21:32:44,255][54587] Fps is (10 sec: 58975.5, 60 sec: 58981.2, 300 sec: 59037.7). Total num frames: 7940653056. Throughput: 0: 58952.3. Samples: 845770420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:44,255][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 21:32:46,947][54818] Updated weights for policy 0, policy_version 484668 (0.0025) [2024-04-27 21:32:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 7940947968. Throughput: 0: 58971.0. Samples: 846130660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:49,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 21:32:49,259][54587] No heartbeat for components: RolloutWorker_w4 (517 seconds) [2024-04-27 21:32:49,308][54818] Updated weights for policy 0, policy_version 484678 (0.0027) [2024-04-27 21:32:51,419][54798] Signal inference workers to stop experience collection... (12550 times) [2024-04-27 21:32:51,419][54798] Signal inference workers to resume experience collection... (12550 times) [2024-04-27 21:32:51,431][54818] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-04-27 21:32:51,431][54818] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-04-27 21:32:52,465][54818] Updated weights for policy 0, policy_version 484688 (0.0026) [2024-04-27 21:32:54,253][54587] Fps is (10 sec: 55711.8, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 7941210112. Throughput: 0: 58765.6. Samples: 846476560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:54,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 21:32:54,990][54818] Updated weights for policy 0, policy_version 484698 (0.0023) [2024-04-27 21:32:57,816][54818] Updated weights for policy 0, policy_version 484708 (0.0026) [2024-04-27 21:32:59,253][54587] Fps is (10 sec: 55706.1, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 7941505024. Throughput: 0: 59033.0. Samples: 846654500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:32:59,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 21:33:00,430][54818] Updated weights for policy 0, policy_version 484718 (0.0027) [2024-04-27 21:33:03,684][54818] Updated weights for policy 0, policy_version 484728 (0.0026) [2024-04-27 21:33:04,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 7941799936. Throughput: 0: 58965.6. Samples: 847003380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:33:04,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 21:33:06,000][54818] Updated weights for policy 0, policy_version 484738 (0.0025) [2024-04-27 21:33:09,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 7942094848. Throughput: 0: 58767.3. Samples: 847354240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:33:09,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 21:33:09,454][54818] Updated weights for policy 0, policy_version 484748 (0.0026) [2024-04-27 21:33:11,488][54818] Updated weights for policy 0, policy_version 484758 (0.0027) [2024-04-27 21:33:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 7942406144. Throughput: 0: 58496.6. Samples: 847523700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:33:14,253][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 21:33:15,013][54818] Updated weights for policy 0, policy_version 484768 (0.0027) [2024-04-27 21:33:17,306][54818] Updated weights for policy 0, policy_version 484778 (0.0027) [2024-04-27 21:33:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.7, 300 sec: 58871.3). Total num frames: 7942701056. Throughput: 0: 58685.4. Samples: 847882600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:33:19,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 21:33:20,504][54818] Updated weights for policy 0, policy_version 484788 (0.0027) [2024-04-27 21:33:23,131][54818] Updated weights for policy 0, policy_version 484798 (0.0026) [2024-04-27 21:33:24,253][54587] Fps is (10 sec: 58981.3, 60 sec: 59255.5, 300 sec: 58926.8). Total num frames: 7942995968. Throughput: 0: 58947.9. Samples: 848237820. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 21:33:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:33:26,028][54818] Updated weights for policy 0, policy_version 484808 (0.0027) [2024-04-27 21:33:28,646][54818] Updated weights for policy 0, policy_version 484818 (0.0026) [2024-04-27 21:33:29,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 7943274496. Throughput: 0: 58747.9. Samples: 848414000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:33:29,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 21:33:31,446][54818] Updated weights for policy 0, policy_version 484828 (0.0026) [2024-04-27 21:33:34,253][54587] Fps is (10 sec: 57344.9, 60 sec: 58436.4, 300 sec: 58926.9). Total num frames: 7943569408. Throughput: 0: 58533.0. Samples: 848764640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:33:34,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:33:34,346][54818] Updated weights for policy 0, policy_version 484838 (0.0026) [2024-04-27 21:33:37,207][54818] Updated weights for policy 0, policy_version 484848 (0.0026) [2024-04-27 21:33:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58163.2, 300 sec: 58982.4). Total num frames: 7943864320. Throughput: 0: 58678.0. Samples: 849117060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:33:39,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 21:33:39,930][54818] Updated weights for policy 0, policy_version 484858 (0.0027) [2024-04-27 21:33:40,418][54798] Signal inference workers to stop experience collection... (12600 times) [2024-04-27 21:33:40,418][54798] Signal inference workers to resume experience collection... 
(12600 times) [2024-04-27 21:33:40,429][54818] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-04-27 21:33:40,430][54818] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-04-27 21:33:42,691][54818] Updated weights for policy 0, policy_version 484868 (0.0026) [2024-04-27 21:33:44,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58710.5, 300 sec: 58982.4). Total num frames: 7944175616. Throughput: 0: 58786.1. Samples: 849299880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:33:44,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 21:33:45,439][54818] Updated weights for policy 0, policy_version 484878 (0.0025) [2024-04-27 21:33:48,216][54818] Updated weights for policy 0, policy_version 484888 (0.0023) [2024-04-27 21:33:49,253][54587] Fps is (10 sec: 62258.6, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 7944486912. Throughput: 0: 58988.7. Samples: 849657880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:33:49,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 21:33:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000484893_7944486912.pth... [2024-04-27 21:33:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000484030_7930347520.pth [2024-04-27 21:33:50,965][54818] Updated weights for policy 0, policy_version 484898 (0.0025) [2024-04-27 21:33:53,722][54818] Updated weights for policy 0, policy_version 484908 (0.0024) [2024-04-27 21:33:54,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.6, 300 sec: 58982.4). Total num frames: 7944765440. Throughput: 0: 58990.6. Samples: 850008820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:33:54,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:33:56,626][54818] Updated weights for policy 0, policy_version 484918 (0.0025) [2024-04-27 21:33:59,211][54818] Updated weights for policy 0, policy_version 484928 (0.0025) [2024-04-27 21:33:59,253][54587] Fps is (10 sec: 57343.5, 60 sec: 59255.2, 300 sec: 58982.4). Total num frames: 7945060352. Throughput: 0: 59066.8. Samples: 850181720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:33:59,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 21:34:01,948][54818] Updated weights for policy 0, policy_version 484938 (0.0026) [2024-04-27 21:34:04,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 7945338880. Throughput: 0: 58945.0. Samples: 850535120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:04,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 21:34:04,813][54818] Updated weights for policy 0, policy_version 484948 (0.0027) [2024-04-27 21:34:07,583][54818] Updated weights for policy 0, policy_version 484958 (0.0026) [2024-04-27 21:34:09,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 7945633792. Throughput: 0: 58957.4. Samples: 850890900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:09,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 21:34:10,378][54818] Updated weights for policy 0, policy_version 484968 (0.0026) [2024-04-27 21:34:13,215][54818] Updated weights for policy 0, policy_version 484978 (0.0025) [2024-04-27 21:34:14,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 7945928704. Throughput: 0: 58937.3. Samples: 851066180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:14,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 21:34:15,937][54818] Updated weights for policy 0, policy_version 484988 (0.0028) [2024-04-27 21:34:18,620][54818] Updated weights for policy 0, policy_version 484998 (0.0024) [2024-04-27 21:34:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 7946240000. Throughput: 0: 58953.2. Samples: 851417540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:19,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:34:21,521][54818] Updated weights for policy 0, policy_version 485008 (0.0027) [2024-04-27 21:34:24,179][54818] Updated weights for policy 0, policy_version 485018 (0.0025) [2024-04-27 21:34:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 7946534912. Throughput: 0: 58881.8. Samples: 851766740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:24,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 21:34:27,162][54818] Updated weights for policy 0, policy_version 485028 (0.0025) [2024-04-27 21:34:29,253][54587] Fps is (10 sec: 55706.1, 60 sec: 58709.4, 300 sec: 58760.3). Total num frames: 7946797056. Throughput: 0: 58746.4. Samples: 851943460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:29,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 21:34:29,817][54818] Updated weights for policy 0, policy_version 485038 (0.0026) [2024-04-27 21:34:32,744][54818] Updated weights for policy 0, policy_version 485048 (0.0026) [2024-04-27 21:34:34,253][54587] Fps is (10 sec: 55705.7, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 7947091968. Throughput: 0: 58620.6. Samples: 852295800. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:34,253][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 21:34:34,944][54798] Signal inference workers to stop experience collection... (12650 times) [2024-04-27 21:34:34,945][54798] Signal inference workers to resume experience collection... (12650 times) [2024-04-27 21:34:34,956][54818] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-04-27 21:34:34,983][54818] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-04-27 21:34:35,251][54818] Updated weights for policy 0, policy_version 485058 (0.0026) [2024-04-27 21:34:38,273][54818] Updated weights for policy 0, policy_version 485068 (0.0025) [2024-04-27 21:34:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 7947386880. Throughput: 0: 58582.7. Samples: 852645040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:39,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 21:34:41,123][54818] Updated weights for policy 0, policy_version 485078 (0.0026) [2024-04-27 21:34:43,740][54818] Updated weights for policy 0, policy_version 485088 (0.0027) [2024-04-27 21:34:44,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 7947681792. Throughput: 0: 58861.5. Samples: 852830480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:44,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:34:47,095][54818] Updated weights for policy 0, policy_version 485098 (0.0026) [2024-04-27 21:34:49,168][54818] Updated weights for policy 0, policy_version 485108 (0.0026) [2024-04-27 21:34:49,253][54587] Fps is (10 sec: 62258.8, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 7948009472. Throughput: 0: 58837.2. Samples: 853182800. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:49,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 21:34:52,788][54818] Updated weights for policy 0, policy_version 485118 (0.0026) [2024-04-27 21:34:54,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 7948271616. Throughput: 0: 58455.7. Samples: 853521400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-27 21:34:54,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 21:34:55,228][54818] Updated weights for policy 0, policy_version 485128 (0.0027) [2024-04-27 21:34:58,335][54818] Updated weights for policy 0, policy_version 485138 (0.0026) [2024-04-27 21:34:59,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.5, 300 sec: 58926.9). Total num frames: 7948582912. Throughput: 0: 58586.7. Samples: 853702580. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:34:59,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 21:35:00,806][54818] Updated weights for policy 0, policy_version 485148 (0.0025) [2024-04-27 21:35:03,958][54818] Updated weights for policy 0, policy_version 485158 (0.0025) [2024-04-27 21:35:04,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 7948845056. Throughput: 0: 58537.4. Samples: 854051720. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:04,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 21:35:06,224][54818] Updated weights for policy 0, policy_version 485168 (0.0029) [2024-04-27 21:35:09,253][54587] Fps is (10 sec: 55705.2, 60 sec: 58436.3, 300 sec: 58760.2). Total num frames: 7949139968. Throughput: 0: 58796.8. Samples: 854412600. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:09,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 21:35:09,395][54818] Updated weights for policy 0, policy_version 485178 (0.0025) [2024-04-27 21:35:11,863][54818] Updated weights for policy 0, policy_version 485188 (0.0026) [2024-04-27 21:35:14,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58436.3, 300 sec: 58760.2). Total num frames: 7949434880. Throughput: 0: 58671.9. Samples: 854583700. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:14,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 21:35:14,881][54818] Updated weights for policy 0, policy_version 485198 (0.0026) [2024-04-27 21:35:17,451][54818] Updated weights for policy 0, policy_version 485208 (0.0027) [2024-04-27 21:35:18,062][54798] Signal inference workers to stop experience collection... (12700 times) [2024-04-27 21:35:18,074][54818] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-04-27 21:35:18,125][54798] Signal inference workers to resume experience collection... (12700 times) [2024-04-27 21:35:18,125][54818] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-04-27 21:35:19,254][54587] Fps is (10 sec: 58981.6, 60 sec: 58163.0, 300 sec: 58760.2). Total num frames: 7949729792. Throughput: 0: 58846.8. Samples: 854943920. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:19,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 21:35:20,565][54818] Updated weights for policy 0, policy_version 485218 (0.0022) [2024-04-27 21:35:22,905][54818] Updated weights for policy 0, policy_version 485228 (0.0025) [2024-04-27 21:35:24,253][54587] Fps is (10 sec: 62259.2, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 7950057472. Throughput: 0: 58868.3. Samples: 855294120. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:24,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:35:26,030][54818] Updated weights for policy 0, policy_version 485238 (0.0027) [2024-04-27 21:35:28,492][54818] Updated weights for policy 0, policy_version 485248 (0.0026) [2024-04-27 21:35:29,253][54587] Fps is (10 sec: 62259.7, 60 sec: 59255.3, 300 sec: 58815.8). Total num frames: 7950352384. Throughput: 0: 58586.6. Samples: 855466880. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:29,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 21:35:31,443][54818] Updated weights for policy 0, policy_version 485258 (0.0026) [2024-04-27 21:35:34,025][54818] Updated weights for policy 0, policy_version 485268 (0.0027) [2024-04-27 21:35:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 7950647296. Throughput: 0: 58664.4. Samples: 855822700. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:34,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 21:35:36,938][54818] Updated weights for policy 0, policy_version 485278 (0.0026) [2024-04-27 21:35:39,253][54587] Fps is (10 sec: 57345.0, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 7950925824. Throughput: 0: 59084.0. Samples: 856180180. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:39,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 21:35:39,448][54818] Updated weights for policy 0, policy_version 485288 (0.0027) [2024-04-27 21:35:42,541][54818] Updated weights for policy 0, policy_version 485298 (0.0025) [2024-04-27 21:35:44,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.6, 300 sec: 58871.3). Total num frames: 7951237120. Throughput: 0: 59080.5. Samples: 856361200. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:44,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 21:35:45,047][54818] Updated weights for policy 0, policy_version 485308 (0.0025) [2024-04-27 21:35:47,967][54818] Updated weights for policy 0, policy_version 485318 (0.0026) [2024-04-27 21:35:49,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 7951532032. Throughput: 0: 59203.9. Samples: 856715900. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:49,263][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 21:35:49,275][54587] No heartbeat for components: RolloutWorker_w4 (697 seconds) [2024-04-27 21:35:49,276][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000485323_7951532032.pth... [2024-04-27 21:35:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000484460_7937392640.pth [2024-04-27 21:35:50,655][54818] Updated weights for policy 0, policy_version 485328 (0.0027) [2024-04-27 21:35:53,418][54818] Updated weights for policy 0, policy_version 485338 (0.0025) [2024-04-27 21:35:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 7951810560. Throughput: 0: 58829.1. Samples: 857059900. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:54,253][54587] Avg episode reward: [(0, '0.522')] [2024-04-27 21:35:56,092][54818] Updated weights for policy 0, policy_version 485348 (0.0027) [2024-04-27 21:35:58,857][54818] Updated weights for policy 0, policy_version 485358 (0.0026) [2024-04-27 21:35:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 7952121856. Throughput: 0: 59188.9. Samples: 857247200. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:35:59,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 21:36:01,563][54818] Updated weights for policy 0, policy_version 485368 (0.0026) [2024-04-27 21:36:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59528.5, 300 sec: 58982.4). Total num frames: 7952416768. Throughput: 0: 59003.8. Samples: 857599080. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:36:04,262][54587] Avg episode reward: [(0, '0.473')] [2024-04-27 21:36:04,414][54818] Updated weights for policy 0, policy_version 485378 (0.0024) [2024-04-27 21:36:07,679][54798] Signal inference workers to stop experience collection... (12750 times) [2024-04-27 21:36:07,694][54818] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-04-27 21:36:07,774][54798] Signal inference workers to resume experience collection... (12750 times) [2024-04-27 21:36:07,774][54818] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-04-27 21:36:07,776][54818] Updated weights for policy 0, policy_version 485388 (0.0023) [2024-04-27 21:36:09,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59255.5, 300 sec: 58926.8). Total num frames: 7952695296. Throughput: 0: 59102.7. Samples: 857953740. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:36:09,263][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 21:36:10,135][54818] Updated weights for policy 0, policy_version 485398 (0.0025) [2024-04-27 21:36:13,125][54818] Updated weights for policy 0, policy_version 485408 (0.0028) [2024-04-27 21:36:14,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 7952990208. Throughput: 0: 58933.9. Samples: 858118900. 
Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:36:14,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 21:36:15,646][54818] Updated weights for policy 0, policy_version 485418 (0.0026) [2024-04-27 21:36:18,745][54818] Updated weights for policy 0, policy_version 485428 (0.0024) [2024-04-27 21:36:19,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.6, 300 sec: 58760.2). Total num frames: 7953268736. Throughput: 0: 58941.3. Samples: 858475060. Policy #0 lag: (min: 2.0, avg: 11.9, max: 21.0) [2024-04-27 21:36:19,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 21:36:21,092][54818] Updated weights for policy 0, policy_version 485438 (0.0025) [2024-04-27 21:36:24,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58436.3, 300 sec: 58760.2). Total num frames: 7953563648. Throughput: 0: 59012.3. Samples: 858835740. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:36:24,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:36:24,355][54818] Updated weights for policy 0, policy_version 485448 (0.0027) [2024-04-27 21:36:26,634][54818] Updated weights for policy 0, policy_version 485458 (0.0025) [2024-04-27 21:36:29,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58436.4, 300 sec: 58760.3). Total num frames: 7953858560. Throughput: 0: 58698.2. Samples: 859002620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:36:29,253][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 21:36:30,022][54818] Updated weights for policy 0, policy_version 485468 (0.0027) [2024-04-27 21:36:32,434][54818] Updated weights for policy 0, policy_version 485478 (0.0026) [2024-04-27 21:36:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 7954169856. Throughput: 0: 58732.9. Samples: 859358880. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:36:34,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 21:36:35,537][54818] Updated weights for policy 0, policy_version 485488 (0.0026) [2024-04-27 21:36:37,860][54818] Updated weights for policy 0, policy_version 485498 (0.0026) [2024-04-27 21:36:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 7954481152. Throughput: 0: 58887.4. Samples: 859709840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:36:39,254][54587] Avg episode reward: [(0, '0.475')] [2024-04-27 21:36:41,015][54818] Updated weights for policy 0, policy_version 485508 (0.0026) [2024-04-27 21:36:43,198][54818] Updated weights for policy 0, policy_version 485518 (0.0024) [2024-04-27 21:36:44,253][54587] Fps is (10 sec: 62259.5, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 7954792448. Throughput: 0: 58980.9. Samples: 859901340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:36:44,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 21:36:46,528][54818] Updated weights for policy 0, policy_version 485528 (0.0026) [2024-04-27 21:36:48,791][54818] Updated weights for policy 0, policy_version 485538 (0.0025) [2024-04-27 21:36:49,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 7955070976. Throughput: 0: 58841.8. Samples: 860246960. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:36:49,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 21:36:52,066][54818] Updated weights for policy 0, policy_version 485548 (0.0027) [2024-04-27 21:36:54,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.5, 300 sec: 58926.9). Total num frames: 7955365888. Throughput: 0: 58895.7. Samples: 860604040. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:36:54,253][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 21:36:54,322][54818] Updated weights for policy 0, policy_version 485558 (0.0022) [2024-04-27 21:36:57,631][54818] Updated weights for policy 0, policy_version 485568 (0.0026) [2024-04-27 21:36:59,253][54587] Fps is (10 sec: 57343.2, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 7955644416. Throughput: 0: 59201.7. Samples: 860782980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:36:59,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 21:37:00,034][54818] Updated weights for policy 0, policy_version 485578 (0.0025) [2024-04-27 21:37:03,137][54818] Updated weights for policy 0, policy_version 485588 (0.0027) [2024-04-27 21:37:03,168][54798] Signal inference workers to stop experience collection... (12800 times) [2024-04-27 21:37:03,206][54818] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-04-27 21:37:03,260][54798] Signal inference workers to resume experience collection... (12800 times) [2024-04-27 21:37:03,260][54818] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-04-27 21:37:04,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58982.4, 300 sec: 58926.8). Total num frames: 7955955712. Throughput: 0: 59083.5. Samples: 861133820. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:37:04,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 21:37:05,500][54818] Updated weights for policy 0, policy_version 485598 (0.0025) [2024-04-27 21:37:08,717][54818] Updated weights for policy 0, policy_version 485608 (0.0023) [2024-04-27 21:37:09,253][54587] Fps is (10 sec: 58983.7, 60 sec: 58982.6, 300 sec: 58871.4). Total num frames: 7956234240. Throughput: 0: 58929.1. Samples: 861487540. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:37:09,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 21:37:11,070][54818] Updated weights for policy 0, policy_version 485618 (0.0024) [2024-04-27 21:37:14,138][54818] Updated weights for policy 0, policy_version 485628 (0.0024) [2024-04-27 21:37:14,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 7956529152. Throughput: 0: 58836.4. Samples: 861650260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:37:14,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 21:37:16,482][54818] Updated weights for policy 0, policy_version 485638 (0.0026) [2024-04-27 21:37:19,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.5, 300 sec: 58871.4). Total num frames: 7956807680. Throughput: 0: 58955.3. Samples: 862011860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:37:19,253][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 21:37:19,591][54818] Updated weights for policy 0, policy_version 485648 (0.0027) [2024-04-27 21:37:22,014][54818] Updated weights for policy 0, policy_version 485658 (0.0022) [2024-04-27 21:37:24,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 7957102592. Throughput: 0: 59020.9. Samples: 862365780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:37:24,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 21:37:25,397][54818] Updated weights for policy 0, policy_version 485668 (0.0023) [2024-04-27 21:37:28,019][54818] Updated weights for policy 0, policy_version 485678 (0.0025) [2024-04-27 21:37:29,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 58760.3). Total num frames: 7957397504. Throughput: 0: 58455.2. Samples: 862531820. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:37:29,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 21:37:30,912][54818] Updated weights for policy 0, policy_version 485688 (0.0028) [2024-04-27 21:37:33,843][54818] Updated weights for policy 0, policy_version 485698 (0.0027) [2024-04-27 21:37:34,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.5, 300 sec: 58704.7). Total num frames: 7957692416. Throughput: 0: 58594.3. Samples: 862883700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:37:34,253][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 21:37:36,629][54818] Updated weights for policy 0, policy_version 485708 (0.0026) [2024-04-27 21:37:39,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58436.3, 300 sec: 58760.5). Total num frames: 7957987328. Throughput: 0: 58523.4. Samples: 863237600. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:37:39,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 21:37:39,513][54818] Updated weights for policy 0, policy_version 485718 (0.0026) [2024-04-27 21:37:41,680][54798] Signal inference workers to stop experience collection... (12850 times) [2024-04-27 21:37:41,681][54798] Signal inference workers to resume experience collection... (12850 times) [2024-04-27 21:37:41,713][54818] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-04-27 21:37:41,714][54818] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-04-27 21:37:42,092][54818] Updated weights for policy 0, policy_version 485728 (0.0032) [2024-04-27 21:37:44,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 7958298624. Throughput: 0: 58590.8. Samples: 863419560. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-04-27 21:37:44,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 21:37:45,013][54818] Updated weights for policy 0, policy_version 485738 (0.0027) [2024-04-27 21:37:47,610][54818] Updated weights for policy 0, policy_version 485748 (0.0026) [2024-04-27 21:37:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 7958593536. Throughput: 0: 58474.7. Samples: 863765180. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:37:49,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:37:49,279][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000485755_7958609920.pth... [2024-04-27 21:37:49,327][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000484893_7944486912.pth [2024-04-27 21:37:50,639][54818] Updated weights for policy 0, policy_version 485758 (0.0027) [2024-04-27 21:37:53,156][54818] Updated weights for policy 0, policy_version 485768 (0.0024) [2024-04-27 21:37:54,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 7958888448. Throughput: 0: 58601.3. Samples: 864124600. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:37:54,253][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 21:37:56,068][54818] Updated weights for policy 0, policy_version 485778 (0.0025) [2024-04-27 21:37:58,788][54818] Updated weights for policy 0, policy_version 485788 (0.0026) [2024-04-27 21:37:59,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58436.4, 300 sec: 58815.8). Total num frames: 7959150592. Throughput: 0: 58898.3. Samples: 864300680. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:37:59,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 21:38:01,633][54818] Updated weights for policy 0, policy_version 485798 (0.0026) [2024-04-27 21:38:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58436.3, 300 sec: 58871.3). 
Total num frames: 7959461888. Throughput: 0: 58673.7. Samples: 864652180. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:04,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 21:38:04,422][54818] Updated weights for policy 0, policy_version 485808 (0.0026) [2024-04-27 21:38:07,300][54818] Updated weights for policy 0, policy_version 485818 (0.0027) [2024-04-27 21:38:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.2, 300 sec: 58815.8). Total num frames: 7959756800. Throughput: 0: 58454.2. Samples: 864996220. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:09,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 21:38:10,092][54818] Updated weights for policy 0, policy_version 485828 (0.0024) [2024-04-27 21:38:13,497][54818] Updated weights for policy 0, policy_version 485838 (0.0024) [2024-04-27 21:38:14,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 7960051712. Throughput: 0: 58786.2. Samples: 865177200. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:14,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 21:38:15,693][54818] Updated weights for policy 0, policy_version 485848 (0.0029) [2024-04-27 21:38:18,870][54818] Updated weights for policy 0, policy_version 485858 (0.0031) [2024-04-27 21:38:19,186][54798] Signal inference workers to stop experience collection... (12900 times) [2024-04-27 21:38:19,191][54798] Signal inference workers to resume experience collection... (12900 times) [2024-04-27 21:38:19,205][54818] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-04-27 21:38:19,205][54818] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-04-27 21:38:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58709.3, 300 sec: 58760.3). Total num frames: 7960330240. Throughput: 0: 58698.6. Samples: 865525140. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:19,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:38:21,103][54818] Updated weights for policy 0, policy_version 485868 (0.0027) [2024-04-27 21:38:24,253][54587] Fps is (10 sec: 55705.4, 60 sec: 58436.3, 300 sec: 58760.2). Total num frames: 7960608768. Throughput: 0: 58849.4. Samples: 865885820. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:24,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 21:38:24,445][54818] Updated weights for policy 0, policy_version 485878 (0.0027) [2024-04-27 21:38:26,690][54818] Updated weights for policy 0, policy_version 485888 (0.0023) [2024-04-27 21:38:29,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58436.2, 300 sec: 58760.2). Total num frames: 7960903680. Throughput: 0: 58512.9. Samples: 866052640. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:29,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 21:38:30,132][54818] Updated weights for policy 0, policy_version 485898 (0.0027) [2024-04-27 21:38:32,221][54818] Updated weights for policy 0, policy_version 485908 (0.0022) [2024-04-27 21:38:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.2, 300 sec: 58815.8). Total num frames: 7961214976. Throughput: 0: 58489.8. Samples: 866397220. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:34,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 21:38:35,674][54818] Updated weights for policy 0, policy_version 485918 (0.0024) [2024-04-27 21:38:37,811][54818] Updated weights for policy 0, policy_version 485928 (0.0022) [2024-04-27 21:38:39,253][54587] Fps is (10 sec: 60619.8, 60 sec: 58709.2, 300 sec: 58760.2). Total num frames: 7961509888. Throughput: 0: 58419.7. Samples: 866753500. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:39,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 21:38:41,121][54818] Updated weights for policy 0, policy_version 485938 (0.0025) [2024-04-27 21:38:43,411][54818] Updated weights for policy 0, policy_version 485948 (0.0025) [2024-04-27 21:38:44,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58163.2, 300 sec: 58649.2). Total num frames: 7961788416. Throughput: 0: 58288.5. Samples: 866923660. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:44,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 21:38:46,649][54818] Updated weights for policy 0, policy_version 485958 (0.0022) [2024-04-27 21:38:48,973][54818] Updated weights for policy 0, policy_version 485968 (0.0025) [2024-04-27 21:38:49,253][54587] Fps is (10 sec: 58983.4, 60 sec: 58436.2, 300 sec: 58760.2). Total num frames: 7962099712. Throughput: 0: 58410.7. Samples: 867280660. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:49,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 21:38:49,264][54587] No heartbeat for components: RolloutWorker_w4 (877 seconds) [2024-04-27 21:38:51,949][54798] Signal inference workers to stop experience collection... (12950 times) [2024-04-27 21:38:51,986][54818] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-04-27 21:38:52,016][54798] Signal inference workers to resume experience collection... (12950 times) [2024-04-27 21:38:52,018][54818] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-04-27 21:38:52,146][54818] Updated weights for policy 0, policy_version 485978 (0.0023) [2024-04-27 21:38:54,253][54587] Fps is (10 sec: 62258.7, 60 sec: 58709.2, 300 sec: 58815.8). Total num frames: 7962411008. Throughput: 0: 58566.2. Samples: 867631700. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:54,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 21:38:54,660][54818] Updated weights for policy 0, policy_version 485988 (0.0027) [2024-04-27 21:38:57,649][54818] Updated weights for policy 0, policy_version 485998 (0.0027) [2024-04-27 21:38:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 7962689536. Throughput: 0: 58736.4. Samples: 867820340. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:38:59,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 21:39:00,373][54818] Updated weights for policy 0, policy_version 486008 (0.0024) [2024-04-27 21:39:03,135][54818] Updated weights for policy 0, policy_version 486018 (0.0023) [2024-04-27 21:39:04,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 7962984448. Throughput: 0: 58822.6. Samples: 868172160. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:39:04,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 21:39:06,046][54818] Updated weights for policy 0, policy_version 486028 (0.0026) [2024-04-27 21:39:08,521][54818] Updated weights for policy 0, policy_version 486038 (0.0023) [2024-04-27 21:39:09,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 7963279360. Throughput: 0: 58386.7. Samples: 868513220. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-04-27 21:39:09,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 21:39:11,679][54818] Updated weights for policy 0, policy_version 486048 (0.0027) [2024-04-27 21:39:14,172][54818] Updated weights for policy 0, policy_version 486058 (0.0030) [2024-04-27 21:39:14,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58709.4, 300 sec: 58760.3). Total num frames: 7963574272. Throughput: 0: 58815.2. Samples: 868699320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:14,254][54587] Avg episode reward: [(0, '0.696')] [2024-04-27 21:39:17,219][54818] Updated weights for policy 0, policy_version 486068 (0.0028) [2024-04-27 21:39:19,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58709.4, 300 sec: 58704.7). Total num frames: 7963852800. Throughput: 0: 59077.4. Samples: 869055700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:19,253][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 21:39:19,870][54818] Updated weights for policy 0, policy_version 486078 (0.0023) [2024-04-27 21:39:22,691][54818] Updated weights for policy 0, policy_version 486088 (0.0027) [2024-04-27 21:39:24,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 7964131328. Throughput: 0: 59103.4. Samples: 869413140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:24,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 21:39:25,415][54818] Updated weights for policy 0, policy_version 486098 (0.0026) [2024-04-27 21:39:28,636][54818] Updated weights for policy 0, policy_version 486108 (0.0027) [2024-04-27 21:39:29,253][54587] Fps is (10 sec: 55705.2, 60 sec: 58436.3, 300 sec: 58704.7). Total num frames: 7964409856. Throughput: 0: 58864.8. Samples: 869572580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:29,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 21:39:30,798][54818] Updated weights for policy 0, policy_version 486118 (0.0027) [2024-04-27 21:39:31,628][54798] Signal inference workers to stop experience collection... (13000 times) [2024-04-27 21:39:31,674][54818] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-04-27 21:39:31,688][54798] Signal inference workers to resume experience collection... 
(13000 times) [2024-04-27 21:39:31,691][54818] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-04-27 21:39:34,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58163.2, 300 sec: 58704.7). Total num frames: 7964704768. Throughput: 0: 58823.6. Samples: 869927720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:34,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 21:39:34,274][54818] Updated weights for policy 0, policy_version 486128 (0.0026) [2024-04-27 21:39:36,341][54818] Updated weights for policy 0, policy_version 486138 (0.0024) [2024-04-27 21:39:39,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58436.5, 300 sec: 58760.3). Total num frames: 7965016064. Throughput: 0: 58956.1. Samples: 870284720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:39,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 21:39:39,958][54818] Updated weights for policy 0, policy_version 486148 (0.0027) [2024-04-27 21:39:41,872][54818] Updated weights for policy 0, policy_version 486158 (0.0024) [2024-04-27 21:39:44,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58982.3, 300 sec: 58704.7). Total num frames: 7965327360. Throughput: 0: 58515.5. Samples: 870453540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:44,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 21:39:45,709][54818] Updated weights for policy 0, policy_version 486168 (0.0030) [2024-04-27 21:39:47,344][54818] Updated weights for policy 0, policy_version 486178 (0.0024) [2024-04-27 21:39:49,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58709.2, 300 sec: 58815.7). Total num frames: 7965622272. Throughput: 0: 58536.7. Samples: 870806320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:49,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 21:39:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000486183_7965622272.pth... 
[2024-04-27 21:39:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000485323_7951532032.pth [2024-04-27 21:39:51,081][54818] Updated weights for policy 0, policy_version 486188 (0.0026) [2024-04-27 21:39:52,815][54818] Updated weights for policy 0, policy_version 486198 (0.0027) [2024-04-27 21:39:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 7965933568. Throughput: 0: 58751.0. Samples: 871157020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:54,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 21:39:56,574][54818] Updated weights for policy 0, policy_version 486208 (0.0026) [2024-04-27 21:39:58,385][54818] Updated weights for policy 0, policy_version 486218 (0.0026) [2024-04-27 21:39:59,253][54587] Fps is (10 sec: 60622.0, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 7966228480. Throughput: 0: 58935.1. Samples: 871351400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:39:59,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 21:40:02,124][54818] Updated weights for policy 0, policy_version 486228 (0.0024) [2024-04-27 21:40:03,855][54818] Updated weights for policy 0, policy_version 486238 (0.0026) [2024-04-27 21:40:04,253][54587] Fps is (10 sec: 62259.5, 60 sec: 59528.5, 300 sec: 59037.9). Total num frames: 7966556160. Throughput: 0: 58794.6. Samples: 871701460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:40:04,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 21:40:07,434][54798] Signal inference workers to stop experience collection... (13050 times) [2024-04-27 21:40:07,483][54818] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-04-27 21:40:07,484][54798] Signal inference workers to resume experience collection... 
(13050 times) [2024-04-27 21:40:07,497][54818] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-04-27 21:40:07,617][54818] Updated weights for policy 0, policy_version 486248 (0.0027) [2024-04-27 21:40:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 7966834688. Throughput: 0: 58538.1. Samples: 872047360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:40:09,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:40:09,476][54818] Updated weights for policy 0, policy_version 486258 (0.0025) [2024-04-27 21:40:12,996][54818] Updated weights for policy 0, policy_version 486268 (0.0027) [2024-04-27 21:40:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 7967129600. Throughput: 0: 59080.9. Samples: 872231220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:40:14,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-27 21:40:15,002][54818] Updated weights for policy 0, policy_version 486278 (0.0025) [2024-04-27 21:40:18,742][54818] Updated weights for policy 0, policy_version 486288 (0.0025) [2024-04-27 21:40:19,253][54587] Fps is (10 sec: 55705.4, 60 sec: 58982.3, 300 sec: 58760.2). Total num frames: 7967391744. Throughput: 0: 59185.8. Samples: 872591080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:40:19,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 21:40:20,634][54818] Updated weights for policy 0, policy_version 486298 (0.0025) [2024-04-27 21:40:24,252][54818] Updated weights for policy 0, policy_version 486308 (0.0026) [2024-04-27 21:40:24,254][54587] Fps is (10 sec: 54064.8, 60 sec: 58981.9, 300 sec: 58704.6). Total num frames: 7967670272. Throughput: 0: 59042.5. Samples: 872941660. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:40:24,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 21:40:26,176][54818] Updated weights for policy 0, policy_version 486318 (0.0027) [2024-04-27 21:40:29,253][54587] Fps is (10 sec: 55706.1, 60 sec: 58982.5, 300 sec: 58649.2). Total num frames: 7967948800. Throughput: 0: 58888.6. Samples: 873103520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:40:29,253][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 21:40:29,594][54818] Updated weights for policy 0, policy_version 486328 (0.0025) [2024-04-27 21:40:31,765][54818] Updated weights for policy 0, policy_version 486338 (0.0025) [2024-04-27 21:40:34,253][54587] Fps is (10 sec: 57346.5, 60 sec: 58982.4, 300 sec: 58704.7). Total num frames: 7968243712. Throughput: 0: 58920.2. Samples: 873457720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:40:34,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 21:40:35,137][54818] Updated weights for policy 0, policy_version 486348 (0.0027) [2024-04-27 21:40:37,944][54818] Updated weights for policy 0, policy_version 486358 (0.0022) [2024-04-27 21:40:39,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58436.2, 300 sec: 58593.6). Total num frames: 7968522240. Throughput: 0: 59003.1. Samples: 873812160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:40:39,254][54587] Avg episode reward: [(0, '0.680')] [2024-04-27 21:40:40,658][54818] Updated weights for policy 0, policy_version 486368 (0.0027) [2024-04-27 21:40:43,263][54818] Updated weights for policy 0, policy_version 486378 (0.0026) [2024-04-27 21:40:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58436.3, 300 sec: 58649.2). Total num frames: 7968833536. Throughput: 0: 58399.9. Samples: 873979400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:40:44,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 21:40:45,556][54798] Signal inference workers to stop experience collection... 
(13100 times) [2024-04-27 21:40:45,594][54818] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-04-27 21:40:45,648][54798] Signal inference workers to resume experience collection... (13100 times) [2024-04-27 21:40:45,649][54818] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-04-27 21:40:46,127][54818] Updated weights for policy 0, policy_version 486388 (0.0026) [2024-04-27 21:40:48,857][54818] Updated weights for policy 0, policy_version 486398 (0.0027) [2024-04-27 21:40:49,253][54587] Fps is (10 sec: 62258.5, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 7969144832. Throughput: 0: 58494.9. Samples: 874333740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:40:49,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 21:40:51,592][54818] Updated weights for policy 0, policy_version 486408 (0.0027) [2024-04-27 21:40:54,253][54587] Fps is (10 sec: 62258.6, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 7969456128. Throughput: 0: 58716.8. Samples: 874689620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:40:54,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 21:40:54,510][54818] Updated weights for policy 0, policy_version 486418 (0.0027) [2024-04-27 21:40:57,183][54818] Updated weights for policy 0, policy_version 486428 (0.0025) [2024-04-27 21:40:59,253][54587] Fps is (10 sec: 60621.9, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 7969751040. Throughput: 0: 58656.0. Samples: 874870740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:40:59,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 21:41:00,430][54818] Updated weights for policy 0, policy_version 486438 (0.0025) [2024-04-27 21:41:02,855][54818] Updated weights for policy 0, policy_version 486448 (0.0026) [2024-04-27 21:41:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58436.2, 300 sec: 58871.3). Total num frames: 7970062336. Throughput: 0: 58489.3. 
Samples: 875223100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:04,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 21:41:06,268][54818] Updated weights for policy 0, policy_version 486458 (0.0025) [2024-04-27 21:41:08,199][54818] Updated weights for policy 0, policy_version 486468 (0.0027) [2024-04-27 21:41:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 7970357248. Throughput: 0: 58609.1. Samples: 875579040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:09,253][54587] Avg episode reward: [(0, '0.501')] [2024-04-27 21:41:11,871][54818] Updated weights for policy 0, policy_version 486478 (0.0025) [2024-04-27 21:41:14,009][54818] Updated weights for policy 0, policy_version 486488 (0.0022) [2024-04-27 21:41:14,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58436.4, 300 sec: 58871.3). Total num frames: 7970635776. Throughput: 0: 59142.2. Samples: 875764920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:14,253][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 21:41:17,173][54818] Updated weights for policy 0, policy_version 486498 (0.0026) [2024-04-27 21:41:19,253][54587] Fps is (10 sec: 57343.2, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 7970930688. Throughput: 0: 58978.1. Samples: 876111740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:19,254][54587] Avg episode reward: [(0, '0.685')] [2024-04-27 21:41:19,437][54818] Updated weights for policy 0, policy_version 486508 (0.0027) [2024-04-27 21:41:22,686][54818] Updated weights for policy 0, policy_version 486518 (0.0026) [2024-04-27 21:41:24,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59255.8, 300 sec: 58871.3). Total num frames: 7971225600. Throughput: 0: 58868.9. Samples: 876461260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:24,254][54587] Avg episode reward: [(0, '0.507')] [2024-04-27 21:41:24,963][54818] Updated weights for policy 0, policy_version 486528 (0.0026) [2024-04-27 21:41:28,241][54818] Updated weights for policy 0, policy_version 486538 (0.0026) [2024-04-27 21:41:28,970][54798] Signal inference workers to stop experience collection... (13150 times) [2024-04-27 21:41:29,007][54818] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-04-27 21:41:29,061][54798] Signal inference workers to resume experience collection... (13150 times) [2024-04-27 21:41:29,061][54818] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-04-27 21:41:29,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59528.4, 300 sec: 58815.8). Total num frames: 7971520512. Throughput: 0: 59021.7. Samples: 876635380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:29,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 21:41:30,496][54818] Updated weights for policy 0, policy_version 486548 (0.0025) [2024-04-27 21:41:33,862][54818] Updated weights for policy 0, policy_version 486558 (0.0027) [2024-04-27 21:41:34,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.5, 300 sec: 58704.7). Total num frames: 7971799040. Throughput: 0: 59118.4. Samples: 876994060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:34,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 21:41:35,968][54818] Updated weights for policy 0, policy_version 486568 (0.0035) [2024-04-27 21:41:39,228][54818] Updated weights for policy 0, policy_version 486578 (0.0026) [2024-04-27 21:41:39,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59528.5, 300 sec: 58649.2). Total num frames: 7972093952. Throughput: 0: 59172.1. Samples: 877352360. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:39,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 21:41:41,561][54818] Updated weights for policy 0, policy_version 486588 (0.0027) [2024-04-27 21:41:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 58704.7). Total num frames: 7972388864. Throughput: 0: 58776.4. Samples: 877515680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:44,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 21:41:44,744][54818] Updated weights for policy 0, policy_version 486598 (0.0027) [2024-04-27 21:41:47,013][54818] Updated weights for policy 0, policy_version 486608 (0.0026) [2024-04-27 21:41:49,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.4, 300 sec: 58649.1). Total num frames: 7972667392. Throughput: 0: 59027.5. Samples: 877879340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:49,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 21:41:49,262][54587] No heartbeat for components: RolloutWorker_w4 (1057 seconds) [2024-04-27 21:41:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000486613_7972667392.pth... [2024-04-27 21:41:49,328][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000485755_7958609920.pth [2024-04-27 21:41:50,285][54818] Updated weights for policy 0, policy_version 486618 (0.0025) [2024-04-27 21:41:52,955][54818] Updated weights for policy 0, policy_version 486628 (0.0026) [2024-04-27 21:41:54,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58709.4, 300 sec: 58760.3). Total num frames: 7972978688. Throughput: 0: 58987.9. Samples: 878233500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:54,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 21:41:55,676][54818] Updated weights for policy 0, policy_version 486638 (0.0025) [2024-04-27 21:41:58,753][54818] Updated weights for policy 0, policy_version 486648 (0.0027) [2024-04-27 21:41:59,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58709.2, 300 sec: 58704.7). Total num frames: 7973273600. Throughput: 0: 58647.8. Samples: 878404080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:41:59,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 21:42:01,292][54818] Updated weights for policy 0, policy_version 486658 (0.0025) [2024-04-27 21:42:04,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58163.2, 300 sec: 58704.7). Total num frames: 7973552128. Throughput: 0: 58678.2. Samples: 878752260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-27 21:42:04,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 21:42:04,336][54818] Updated weights for policy 0, policy_version 486668 (0.0027) [2024-04-27 21:42:06,738][54818] Updated weights for policy 0, policy_version 486678 (0.0025) [2024-04-27 21:42:09,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58436.1, 300 sec: 58760.2). Total num frames: 7973863424. Throughput: 0: 58862.7. Samples: 879110080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:09,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 21:42:09,744][54818] Updated weights for policy 0, policy_version 486688 (0.0027) [2024-04-27 21:42:12,109][54818] Updated weights for policy 0, policy_version 486698 (0.0023) [2024-04-27 21:42:14,253][54587] Fps is (10 sec: 62259.3, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 7974174720. Throughput: 0: 59122.2. Samples: 879295880. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 21:42:15,363][54818] Updated weights for policy 0, policy_version 486708 (0.0026) [2024-04-27 21:42:17,854][54818] Updated weights for policy 0, policy_version 486718 (0.0028) [2024-04-27 21:42:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 7974469632. Throughput: 0: 58912.9. Samples: 879645140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 21:42:20,828][54818] Updated weights for policy 0, policy_version 486728 (0.0024) [2024-04-27 21:42:21,705][54798] Signal inference workers to stop experience collection... (13200 times) [2024-04-27 21:42:21,705][54798] Signal inference workers to resume experience collection... (13200 times) [2024-04-27 21:42:21,716][54818] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-04-27 21:42:21,716][54818] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-04-27 21:42:23,523][54818] Updated weights for policy 0, policy_version 486738 (0.0029) [2024-04-27 21:42:24,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59255.6, 300 sec: 58926.9). Total num frames: 7974780928. Throughput: 0: 58787.3. Samples: 879997780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:24,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 21:42:26,328][54818] Updated weights for policy 0, policy_version 486748 (0.0027) [2024-04-27 21:42:28,896][54818] Updated weights for policy 0, policy_version 486758 (0.0025) [2024-04-27 21:42:29,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 7975059456. Throughput: 0: 59225.9. Samples: 880180840. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:29,253][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 21:42:32,229][54818] Updated weights for policy 0, policy_version 486768 (0.0026) [2024-04-27 21:42:34,253][54587] Fps is (10 sec: 55705.4, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 7975337984. Throughput: 0: 59095.3. Samples: 880538620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:34,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 21:42:34,386][54818] Updated weights for policy 0, policy_version 486778 (0.0026) [2024-04-27 21:42:37,823][54818] Updated weights for policy 0, policy_version 486788 (0.0027) [2024-04-27 21:42:39,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 7975649280. Throughput: 0: 58806.2. Samples: 880879780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:39,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:42:39,998][54818] Updated weights for policy 0, policy_version 486798 (0.0022) [2024-04-27 21:42:43,277][54818] Updated weights for policy 0, policy_version 486808 (0.0026) [2024-04-27 21:42:44,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 7975944192. Throughput: 0: 59049.4. Samples: 881061300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:44,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 21:42:45,531][54818] Updated weights for policy 0, policy_version 486818 (0.0030) [2024-04-27 21:42:49,064][54818] Updated weights for policy 0, policy_version 486828 (0.0025) [2024-04-27 21:42:49,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58982.4, 300 sec: 58704.7). Total num frames: 7976206336. Throughput: 0: 59235.6. Samples: 881417860. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:49,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 21:42:50,978][54818] Updated weights for policy 0, policy_version 486838 (0.0028) [2024-04-27 21:42:54,253][54587] Fps is (10 sec: 54067.1, 60 sec: 58436.2, 300 sec: 58760.2). Total num frames: 7976484864. Throughput: 0: 59070.2. Samples: 881768240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:54,262][54587] Avg episode reward: [(0, '0.522')] [2024-04-27 21:42:54,581][54818] Updated weights for policy 0, policy_version 486848 (0.0025) [2024-04-27 21:42:56,697][54818] Updated weights for policy 0, policy_version 486858 (0.0027) [2024-04-27 21:42:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 7976796160. Throughput: 0: 58656.9. Samples: 881935440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:42:59,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:42:59,956][54818] Updated weights for policy 0, policy_version 486868 (0.0027) [2024-04-27 21:43:02,360][54818] Updated weights for policy 0, policy_version 486878 (0.0027) [2024-04-27 21:43:04,253][54587] Fps is (10 sec: 60621.8, 60 sec: 58982.6, 300 sec: 58760.3). Total num frames: 7977091072. Throughput: 0: 58844.1. Samples: 882293120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:43:04,253][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 21:43:05,226][54798] Signal inference workers to stop experience collection... (13250 times) [2024-04-27 21:43:05,228][54798] Signal inference workers to resume experience collection... 
(13250 times) [2024-04-27 21:43:05,257][54818] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-04-27 21:43:05,257][54818] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-04-27 21:43:05,517][54818] Updated weights for policy 0, policy_version 486888 (0.0026) [2024-04-27 21:43:07,874][54818] Updated weights for policy 0, policy_version 486898 (0.0025) [2024-04-27 21:43:09,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.2, 300 sec: 58760.2). Total num frames: 7977385984. Throughput: 0: 58817.0. Samples: 882644560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:43:09,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 21:43:11,106][54818] Updated weights for policy 0, policy_version 486908 (0.0027) [2024-04-27 21:43:13,395][54818] Updated weights for policy 0, policy_version 486918 (0.0025) [2024-04-27 21:43:14,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 7977680896. Throughput: 0: 58527.0. Samples: 882814560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:43:14,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 21:43:16,594][54818] Updated weights for policy 0, policy_version 486928 (0.0027) [2024-04-27 21:43:19,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 7977975808. Throughput: 0: 58479.0. Samples: 883170180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:43:19,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 21:43:19,321][54818] Updated weights for policy 0, policy_version 486938 (0.0027) [2024-04-27 21:43:22,067][54818] Updated weights for policy 0, policy_version 486948 (0.0027) [2024-04-27 21:43:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58436.1, 300 sec: 58926.9). Total num frames: 7978287104. Throughput: 0: 58578.2. Samples: 883515800. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:43:24,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 21:43:24,894][54818] Updated weights for policy 0, policy_version 486958 (0.0026) [2024-04-27 21:43:27,880][54818] Updated weights for policy 0, policy_version 486968 (0.0026) [2024-04-27 21:43:29,253][54587] Fps is (10 sec: 62258.8, 60 sec: 58982.3, 300 sec: 58926.9). Total num frames: 7978598400. Throughput: 0: 58752.0. Samples: 883705140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 21:43:29,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:43:30,546][54818] Updated weights for policy 0, policy_version 486978 (0.0027) [2024-04-27 21:43:33,266][54818] Updated weights for policy 0, policy_version 486988 (0.0026) [2024-04-27 21:43:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.3, 300 sec: 58871.4). Total num frames: 7978876928. Throughput: 0: 58654.3. Samples: 884057300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:43:34,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 21:43:35,924][54818] Updated weights for policy 0, policy_version 486998 (0.0027) [2024-04-27 21:43:38,634][54818] Updated weights for policy 0, policy_version 487008 (0.0028) [2024-04-27 21:43:39,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.3, 300 sec: 58926.8). Total num frames: 7979171840. Throughput: 0: 58455.1. Samples: 884398720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:43:39,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 21:43:41,832][54818] Updated weights for policy 0, policy_version 487018 (0.0026) [2024-04-27 21:43:44,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 7979450368. Throughput: 0: 58849.4. Samples: 884583660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:43:44,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 21:43:44,377][54818] Updated weights for policy 0, policy_version 487028 (0.0024) [2024-04-27 21:43:47,506][54818] Updated weights for policy 0, policy_version 487038 (0.0027) [2024-04-27 21:43:49,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.4, 300 sec: 58760.2). Total num frames: 7979745280. Throughput: 0: 58633.6. Samples: 884931640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:43:49,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 21:43:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000487045_7979745280.pth... [2024-04-27 21:43:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000486183_7965622272.pth [2024-04-27 21:43:49,985][54818] Updated weights for policy 0, policy_version 487048 (0.0027) [2024-04-27 21:43:52,917][54818] Updated weights for policy 0, policy_version 487058 (0.0025) [2024-04-27 21:43:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.4, 300 sec: 58760.2). Total num frames: 7980023808. Throughput: 0: 58858.0. Samples: 885293160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:43:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 21:43:54,416][54798] Signal inference workers to stop experience collection... (13300 times) [2024-04-27 21:43:54,417][54798] Signal inference workers to resume experience collection... 
(13300 times) [2024-04-27 21:43:54,442][54818] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-04-27 21:43:54,443][54818] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-04-27 21:43:55,373][54818] Updated weights for policy 0, policy_version 487068 (0.0024) [2024-04-27 21:43:58,439][54818] Updated weights for policy 0, policy_version 487078 (0.0026) [2024-04-27 21:43:59,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58436.3, 300 sec: 58704.7). Total num frames: 7980302336. Throughput: 0: 58620.5. Samples: 885452480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:43:59,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 21:44:00,928][54818] Updated weights for policy 0, policy_version 487088 (0.0026) [2024-04-27 21:44:04,148][54818] Updated weights for policy 0, policy_version 487098 (0.0026) [2024-04-27 21:44:04,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.3, 300 sec: 58760.3). Total num frames: 7980613632. Throughput: 0: 58724.9. Samples: 885812800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:04,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 21:44:06,424][54818] Updated weights for policy 0, policy_version 487108 (0.0025) [2024-04-27 21:44:09,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 7980924928. Throughput: 0: 59062.3. Samples: 886173600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:09,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 21:44:09,593][54818] Updated weights for policy 0, policy_version 487118 (0.0026) [2024-04-27 21:44:12,095][54818] Updated weights for policy 0, policy_version 487128 (0.0025) [2024-04-27 21:44:14,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 7981219840. Throughput: 0: 58698.7. Samples: 886346580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:14,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:44:15,225][54818] Updated weights for policy 0, policy_version 487138 (0.0026) [2024-04-27 21:44:17,561][54818] Updated weights for policy 0, policy_version 487148 (0.0026) [2024-04-27 21:44:19,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.3, 300 sec: 58926.8). Total num frames: 7981514752. Throughput: 0: 58783.5. Samples: 886702560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:19,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 21:44:20,888][54818] Updated weights for policy 0, policy_version 487158 (0.0026) [2024-04-27 21:44:23,088][54818] Updated weights for policy 0, policy_version 487168 (0.0027) [2024-04-27 21:44:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 7981826048. Throughput: 0: 59077.8. Samples: 887057220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:24,263][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 21:44:26,212][54818] Updated weights for policy 0, policy_version 487178 (0.0026) [2024-04-27 21:44:28,581][54818] Updated weights for policy 0, policy_version 487188 (0.0025) [2024-04-27 21:44:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 7982120960. Throughput: 0: 59024.9. Samples: 887239780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:29,262][54587] Avg episode reward: [(0, '0.488')] [2024-04-27 21:44:31,749][54818] Updated weights for policy 0, policy_version 487198 (0.0025) [2024-04-27 21:44:34,214][54818] Updated weights for policy 0, policy_version 487208 (0.0026) [2024-04-27 21:44:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 7982415872. Throughput: 0: 59127.1. Samples: 887592360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:34,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 21:44:37,345][54818] Updated weights for policy 0, policy_version 487218 (0.0027) [2024-04-27 21:44:39,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 7982694400. Throughput: 0: 58903.5. Samples: 887943820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:39,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:44:39,579][54818] Updated weights for policy 0, policy_version 487228 (0.0027) [2024-04-27 21:44:42,801][54818] Updated weights for policy 0, policy_version 487238 (0.0028) [2024-04-27 21:44:44,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.4, 300 sec: 58926.9). Total num frames: 7983005696. Throughput: 0: 59500.7. Samples: 888130020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:44,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 21:44:45,065][54818] Updated weights for policy 0, policy_version 487248 (0.0027) [2024-04-27 21:44:48,250][54818] Updated weights for policy 0, policy_version 487258 (0.0027) [2024-04-27 21:44:48,275][54798] Signal inference workers to stop experience collection... (13350 times) [2024-04-27 21:44:48,275][54798] Signal inference workers to resume experience collection... (13350 times) [2024-04-27 21:44:48,287][54818] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-04-27 21:44:48,287][54818] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-04-27 21:44:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 7983300608. Throughput: 0: 59354.6. Samples: 888483760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:49,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 21:44:49,262][54587] No heartbeat for components: RolloutWorker_w4 (1237 seconds) [2024-04-27 21:44:51,097][54818] Updated weights for policy 0, policy_version 487268 (0.0027) [2024-04-27 21:44:53,659][54818] Updated weights for policy 0, policy_version 487278 (0.0025) [2024-04-27 21:44:54,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59528.5, 300 sec: 58871.3). Total num frames: 7983595520. Throughput: 0: 59052.4. Samples: 888830960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 21:44:54,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 21:44:56,655][54818] Updated weights for policy 0, policy_version 487288 (0.0025) [2024-04-27 21:44:59,210][54818] Updated weights for policy 0, policy_version 487298 (0.0024) [2024-04-27 21:44:59,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59801.5, 300 sec: 58760.2). Total num frames: 7983890432. Throughput: 0: 59213.3. Samples: 889011180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:44:59,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 21:45:02,012][54818] Updated weights for policy 0, policy_version 487308 (0.0026) [2024-04-27 21:45:04,253][54587] Fps is (10 sec: 57344.6, 60 sec: 59255.5, 300 sec: 58760.3). Total num frames: 7984168960. Throughput: 0: 59093.1. Samples: 889361740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:04,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 21:45:04,679][54818] Updated weights for policy 0, policy_version 487318 (0.0025) [2024-04-27 21:45:07,667][54818] Updated weights for policy 0, policy_version 487328 (0.0027) [2024-04-27 21:45:09,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.3, 300 sec: 58760.2). Total num frames: 7984463872. Throughput: 0: 59134.5. Samples: 889718280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:09,254][54587] Avg episode reward: [(0, '0.681')] [2024-04-27 21:45:10,242][54818] Updated weights for policy 0, policy_version 487338 (0.0026) [2024-04-27 21:45:13,197][54818] Updated weights for policy 0, policy_version 487348 (0.0026) [2024-04-27 21:45:14,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 7984742400. Throughput: 0: 58835.1. Samples: 889887360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:14,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 21:45:15,803][54818] Updated weights for policy 0, policy_version 487358 (0.0026) [2024-04-27 21:45:18,759][54818] Updated weights for policy 0, policy_version 487368 (0.0027) [2024-04-27 21:45:19,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 7985053696. Throughput: 0: 58877.8. Samples: 890241860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:19,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 21:45:21,381][54818] Updated weights for policy 0, policy_version 487378 (0.0026) [2024-04-27 21:45:24,254][54587] Fps is (10 sec: 60619.7, 60 sec: 58709.2, 300 sec: 58982.3). Total num frames: 7985348608. Throughput: 0: 59011.8. Samples: 890599360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:24,254][54587] Avg episode reward: [(0, '0.512')] [2024-04-27 21:45:24,392][54818] Updated weights for policy 0, policy_version 487388 (0.0025) [2024-04-27 21:45:27,132][54818] Updated weights for policy 0, policy_version 487398 (0.0026) [2024-04-27 21:45:29,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 7985643520. Throughput: 0: 58705.1. Samples: 890771740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:29,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 21:45:29,959][54818] Updated weights for policy 0, policy_version 487408 (0.0027) [2024-04-27 21:45:32,726][54818] Updated weights for policy 0, policy_version 487418 (0.0027) [2024-04-27 21:45:34,253][54587] Fps is (10 sec: 58984.5, 60 sec: 58709.5, 300 sec: 59038.0). Total num frames: 7985938432. Throughput: 0: 58770.9. Samples: 891128440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:34,253][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 21:45:35,664][54818] Updated weights for policy 0, policy_version 487428 (0.0027) [2024-04-27 21:45:38,105][54798] Signal inference workers to stop experience collection... (13400 times) [2024-04-27 21:45:38,105][54798] Signal inference workers to resume experience collection... (13400 times) [2024-04-27 21:45:38,132][54818] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-04-27 21:45:38,133][54818] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-04-27 21:45:38,217][54818] Updated weights for policy 0, policy_version 487438 (0.0027) [2024-04-27 21:45:39,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 7986249728. Throughput: 0: 58819.0. Samples: 891477820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:39,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 21:45:41,362][54818] Updated weights for policy 0, policy_version 487448 (0.0026) [2024-04-27 21:45:43,595][54818] Updated weights for policy 0, policy_version 487458 (0.0027) [2024-04-27 21:45:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.5, 300 sec: 58926.9). Total num frames: 7986528256. Throughput: 0: 58916.2. Samples: 891662400. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:44,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 21:45:47,000][54818] Updated weights for policy 0, policy_version 487468 (0.0025) [2024-04-27 21:45:49,062][54818] Updated weights for policy 0, policy_version 487478 (0.0025) [2024-04-27 21:45:49,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 7986839552. Throughput: 0: 58838.9. Samples: 892009500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:49,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 21:45:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000487478_7986839552.pth... [2024-04-27 21:45:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000486613_7972667392.pth [2024-04-27 21:45:52,463][54818] Updated weights for policy 0, policy_version 487488 (0.0024) [2024-04-27 21:45:54,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 7987118080. Throughput: 0: 58754.8. Samples: 892362240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:54,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 21:45:54,872][54818] Updated weights for policy 0, policy_version 487498 (0.0023) [2024-04-27 21:45:57,818][54818] Updated weights for policy 0, policy_version 487508 (0.0026) [2024-04-27 21:45:59,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 7987412992. Throughput: 0: 59044.4. Samples: 892544360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:45:59,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 21:46:00,484][54818] Updated weights for policy 0, policy_version 487518 (0.0026) [2024-04-27 21:46:03,423][54818] Updated weights for policy 0, policy_version 487528 (0.0025) [2024-04-27 21:46:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.4, 300 sec: 58871.3). 
Total num frames: 7987724288. Throughput: 0: 59160.1. Samples: 892904060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:46:04,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 21:46:06,158][54818] Updated weights for policy 0, policy_version 487538 (0.0026) [2024-04-27 21:46:08,901][54818] Updated weights for policy 0, policy_version 487548 (0.0024) [2024-04-27 21:46:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.6, 300 sec: 58926.9). Total num frames: 7988019200. Throughput: 0: 58943.0. Samples: 893251780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:46:09,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 21:46:11,749][54818] Updated weights for policy 0, policy_version 487558 (0.0025) [2024-04-27 21:46:14,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 7988297728. Throughput: 0: 59035.9. Samples: 893428360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:46:14,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:46:14,352][54818] Updated weights for policy 0, policy_version 487568 (0.0027) [2024-04-27 21:46:17,193][54818] Updated weights for policy 0, policy_version 487578 (0.0025) [2024-04-27 21:46:19,253][54587] Fps is (10 sec: 55705.4, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 7988576256. Throughput: 0: 58985.1. Samples: 893782780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 21:46:19,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 21:46:19,929][54818] Updated weights for policy 0, policy_version 487588 (0.0027) [2024-04-27 21:46:20,422][54798] Signal inference workers to stop experience collection... (13450 times) [2024-04-27 21:46:20,470][54818] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-04-27 21:46:20,477][54798] Signal inference workers to resume experience collection... 
(13450 times) [2024-04-27 21:46:20,483][54818] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-04-27 21:46:22,845][54818] Updated weights for policy 0, policy_version 487598 (0.0026) [2024-04-27 21:46:24,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 7988871168. Throughput: 0: 59025.5. Samples: 894133960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:46:24,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 21:46:25,555][54818] Updated weights for policy 0, policy_version 487608 (0.0027) [2024-04-27 21:46:28,489][54818] Updated weights for policy 0, policy_version 487618 (0.0024) [2024-04-27 21:46:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 7989166080. Throughput: 0: 58771.4. Samples: 894307120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:46:29,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 21:46:31,093][54818] Updated weights for policy 0, policy_version 487628 (0.0026) [2024-04-27 21:46:34,008][54818] Updated weights for policy 0, policy_version 487638 (0.0026) [2024-04-27 21:46:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 7989460992. Throughput: 0: 59017.5. Samples: 894665280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:46:34,253][54587] Avg episode reward: [(0, '0.512')] [2024-04-27 21:46:36,610][54818] Updated weights for policy 0, policy_version 487648 (0.0026) [2024-04-27 21:46:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58436.4, 300 sec: 58871.3). Total num frames: 7989755904. Throughput: 0: 58959.1. Samples: 895015400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:46:39,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:46:39,669][54818] Updated weights for policy 0, policy_version 487658 (0.0025) [2024-04-27 21:46:42,126][54818] Updated weights for policy 0, policy_version 487668 (0.0027) [2024-04-27 21:46:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 7990067200. Throughput: 0: 58941.0. Samples: 895196700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:46:44,262][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:46:45,058][54818] Updated weights for policy 0, policy_version 487678 (0.0026) [2024-04-27 21:46:47,605][54818] Updated weights for policy 0, policy_version 487688 (0.0026) [2024-04-27 21:46:49,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 7990362112. Throughput: 0: 58687.9. Samples: 895545020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:46:49,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 21:46:50,670][54818] Updated weights for policy 0, policy_version 487698 (0.0026) [2024-04-27 21:46:53,320][54818] Updated weights for policy 0, policy_version 487708 (0.0025) [2024-04-27 21:46:54,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 7990640640. Throughput: 0: 58815.4. Samples: 895898480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:46:54,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 21:46:56,306][54818] Updated weights for policy 0, policy_version 487718 (0.0025) [2024-04-27 21:46:58,777][54818] Updated weights for policy 0, policy_version 487728 (0.0025) [2024-04-27 21:46:59,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 7990935552. Throughput: 0: 58957.8. Samples: 896081460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:46:59,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 21:47:01,816][54818] Updated weights for policy 0, policy_version 487738 (0.0026) [2024-04-27 21:47:04,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 7991246848. Throughput: 0: 58746.7. Samples: 896426380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:47:04,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 21:47:04,546][54818] Updated weights for policy 0, policy_version 487748 (0.0027) [2024-04-27 21:47:07,638][54818] Updated weights for policy 0, policy_version 487758 (0.0025) [2024-04-27 21:47:09,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 7991541760. Throughput: 0: 58864.8. Samples: 896782880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:47:09,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:47:10,042][54818] Updated weights for policy 0, policy_version 487768 (0.0025) [2024-04-27 21:47:13,058][54818] Updated weights for policy 0, policy_version 487778 (0.0027) [2024-04-27 21:47:14,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 7991836672. Throughput: 0: 58964.3. Samples: 896960520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:47:14,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 21:47:15,899][54818] Updated weights for policy 0, policy_version 487788 (0.0027) [2024-04-27 21:47:18,423][54798] Signal inference workers to stop experience collection... (13500 times) [2024-04-27 21:47:18,447][54818] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-04-27 21:47:18,487][54798] Signal inference workers to resume experience collection... 
(13500 times) [2024-04-27 21:47:18,487][54818] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-04-27 21:47:18,489][54818] Updated weights for policy 0, policy_version 487798 (0.0025) [2024-04-27 21:47:19,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 7992131584. Throughput: 0: 58884.3. Samples: 897315080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:47:19,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 21:47:21,276][54818] Updated weights for policy 0, policy_version 487808 (0.0026) [2024-04-27 21:47:23,991][54818] Updated weights for policy 0, policy_version 487818 (0.0025) [2024-04-27 21:47:24,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 7992410112. Throughput: 0: 58927.0. Samples: 897667120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:47:24,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 21:47:27,067][54818] Updated weights for policy 0, policy_version 487828 (0.0027) [2024-04-27 21:47:29,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 7992705024. Throughput: 0: 58713.8. Samples: 897838820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:47:29,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 21:47:29,501][54818] Updated weights for policy 0, policy_version 487838 (0.0025) [2024-04-27 21:47:32,429][54818] Updated weights for policy 0, policy_version 487848 (0.0026) [2024-04-27 21:47:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 7992983552. Throughput: 0: 58889.4. Samples: 898195040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:47:34,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 21:47:35,109][54818] Updated weights for policy 0, policy_version 487858 (0.0024) [2024-04-27 21:47:38,030][54818] Updated weights for policy 0, policy_version 487868 (0.0026) [2024-04-27 21:47:39,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 7993294848. Throughput: 0: 58813.4. Samples: 898545080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:47:39,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 21:47:40,654][54818] Updated weights for policy 0, policy_version 487878 (0.0025) [2024-04-27 21:47:43,690][54818] Updated weights for policy 0, policy_version 487888 (0.0026) [2024-04-27 21:47:44,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 7993573376. Throughput: 0: 58547.5. Samples: 898716100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-27 21:47:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:47:46,305][54818] Updated weights for policy 0, policy_version 487898 (0.0025) [2024-04-27 21:47:49,135][54818] Updated weights for policy 0, policy_version 487908 (0.0027) [2024-04-27 21:47:49,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 7993884672. Throughput: 0: 58797.4. Samples: 899072260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:47:49,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 21:47:49,262][54587] No heartbeat for components: RolloutWorker_w4 (1417 seconds) [2024-04-27 21:47:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000487908_7993884672.pth... 
[2024-04-27 21:47:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000487045_7979745280.pth [2024-04-27 21:47:51,772][54818] Updated weights for policy 0, policy_version 487918 (0.0027) [2024-04-27 21:47:54,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.5, 300 sec: 58871.3). Total num frames: 7994163200. Throughput: 0: 58824.1. Samples: 899429960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:47:54,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 21:47:54,822][54818] Updated weights for policy 0, policy_version 487928 (0.0027) [2024-04-27 21:47:57,223][54818] Updated weights for policy 0, policy_version 487938 (0.0026) [2024-04-27 21:47:59,253][54587] Fps is (10 sec: 58981.3, 60 sec: 58982.3, 300 sec: 58926.8). Total num frames: 7994474496. Throughput: 0: 58700.0. Samples: 899602020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:47:59,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 21:48:00,383][54818] Updated weights for policy 0, policy_version 487948 (0.0027) [2024-04-27 21:48:02,847][54818] Updated weights for policy 0, policy_version 487958 (0.0027) [2024-04-27 21:48:04,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 7994769408. Throughput: 0: 58604.9. Samples: 899952300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:04,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 21:48:06,057][54818] Updated weights for policy 0, policy_version 487968 (0.0026) [2024-04-27 21:48:08,320][54818] Updated weights for policy 0, policy_version 487978 (0.0028) [2024-04-27 21:48:09,068][54798] Signal inference workers to stop experience collection... (13550 times) [2024-04-27 21:48:09,117][54818] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-04-27 21:48:09,118][54798] Signal inference workers to resume experience collection... 
(13550 times) [2024-04-27 21:48:09,129][54818] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-04-27 21:48:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 7995080704. Throughput: 0: 58775.1. Samples: 900312000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:09,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 21:48:11,719][54818] Updated weights for policy 0, policy_version 487988 (0.0025) [2024-04-27 21:48:13,670][54818] Updated weights for policy 0, policy_version 487998 (0.0027) [2024-04-27 21:48:14,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58709.5, 300 sec: 58926.9). Total num frames: 7995359232. Throughput: 0: 59079.6. Samples: 900497400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:14,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 21:48:17,221][54818] Updated weights for policy 0, policy_version 488008 (0.0027) [2024-04-27 21:48:19,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 7995670528. Throughput: 0: 58882.2. Samples: 900844740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:19,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 21:48:19,912][54818] Updated weights for policy 0, policy_version 488018 (0.0024) [2024-04-27 21:48:22,763][54818] Updated weights for policy 0, policy_version 488028 (0.0025) [2024-04-27 21:48:24,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 7995949056. Throughput: 0: 58733.9. Samples: 901188100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:24,254][54587] Avg episode reward: [(0, '0.501')] [2024-04-27 21:48:25,348][54818] Updated weights for policy 0, policy_version 488038 (0.0026) [2024-04-27 21:48:28,268][54818] Updated weights for policy 0, policy_version 488048 (0.0027) [2024-04-27 21:48:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.5, 300 sec: 58926.9). 
Total num frames: 7996260352. Throughput: 0: 59081.3. Samples: 901374760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:29,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 21:48:30,751][54818] Updated weights for policy 0, policy_version 488058 (0.0027) [2024-04-27 21:48:33,820][54818] Updated weights for policy 0, policy_version 488068 (0.0026) [2024-04-27 21:48:34,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 7996538880. Throughput: 0: 59142.6. Samples: 901733680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:34,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 21:48:36,266][54818] Updated weights for policy 0, policy_version 488078 (0.0025) [2024-04-27 21:48:39,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 7996817408. Throughput: 0: 58998.2. Samples: 902084880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:39,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 21:48:39,341][54818] Updated weights for policy 0, policy_version 488088 (0.0027) [2024-04-27 21:48:41,832][54818] Updated weights for policy 0, policy_version 488098 (0.0027) [2024-04-27 21:48:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 7997112320. Throughput: 0: 59028.1. Samples: 902258280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:44,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:48:44,792][54818] Updated weights for policy 0, policy_version 488108 (0.0025) [2024-04-27 21:48:47,426][54818] Updated weights for policy 0, policy_version 488118 (0.0027) [2024-04-27 21:48:49,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 7997407232. Throughput: 0: 59115.7. Samples: 902612500. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:49,253][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 21:48:50,243][54818] Updated weights for policy 0, policy_version 488128 (0.0027) [2024-04-27 21:48:52,846][54818] Updated weights for policy 0, policy_version 488138 (0.0026) [2024-04-27 21:48:54,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58982.2, 300 sec: 58982.4). Total num frames: 7997702144. Throughput: 0: 58908.3. Samples: 902962880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:54,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-27 21:48:55,862][54818] Updated weights for policy 0, policy_version 488148 (0.0026) [2024-04-27 21:48:58,706][54818] Updated weights for policy 0, policy_version 488158 (0.0027) [2024-04-27 21:48:59,253][54587] Fps is (10 sec: 60619.9, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 7998013440. Throughput: 0: 58569.7. Samples: 903133040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:48:59,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 21:49:00,523][54798] Signal inference workers to stop experience collection... (13600 times) [2024-04-27 21:49:00,524][54798] Signal inference workers to resume experience collection... (13600 times) [2024-04-27 21:49:00,551][54818] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-04-27 21:49:00,552][54818] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-04-27 21:49:01,242][54818] Updated weights for policy 0, policy_version 488168 (0.0026) [2024-04-27 21:49:04,209][54818] Updated weights for policy 0, policy_version 488178 (0.0027) [2024-04-27 21:49:04,253][54587] Fps is (10 sec: 60621.8, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 7998308352. Throughput: 0: 58905.3. Samples: 903495480. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:49:04,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 21:49:06,980][54818] Updated weights for policy 0, policy_version 488188 (0.0024) [2024-04-27 21:49:09,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 7998603264. Throughput: 0: 59055.9. Samples: 903845620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:49:09,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 21:49:09,702][54818] Updated weights for policy 0, policy_version 488198 (0.0027) [2024-04-27 21:49:12,529][54818] Updated weights for policy 0, policy_version 488208 (0.0026) [2024-04-27 21:49:14,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 7998898176. Throughput: 0: 58936.1. Samples: 904026880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 21:49:14,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 21:49:15,330][54818] Updated weights for policy 0, policy_version 488218 (0.0026) [2024-04-27 21:49:17,861][54818] Updated weights for policy 0, policy_version 488228 (0.0026) [2024-04-27 21:49:19,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 7999193088. Throughput: 0: 58695.2. Samples: 904374960. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:49:19,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:49:21,030][54818] Updated weights for policy 0, policy_version 488238 (0.0026) [2024-04-27 21:49:23,382][54818] Updated weights for policy 0, policy_version 488248 (0.0026) [2024-04-27 21:49:24,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 7999488000. Throughput: 0: 58705.8. Samples: 904726640. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:49:24,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 21:49:26,531][54818] Updated weights for policy 0, policy_version 488258 (0.0027) [2024-04-27 21:49:29,062][54818] Updated weights for policy 0, policy_version 488268 (0.0025) [2024-04-27 21:49:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 7999782912. Throughput: 0: 58973.9. Samples: 904912100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:49:29,253][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 21:49:32,024][54818] Updated weights for policy 0, policy_version 488278 (0.0027) [2024-04-27 21:49:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8000077824. Throughput: 0: 58980.4. Samples: 905266620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:49:34,253][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 21:49:34,564][54818] Updated weights for policy 0, policy_version 488288 (0.0026) [2024-04-27 21:49:37,568][54818] Updated weights for policy 0, policy_version 488298 (0.0025) [2024-04-27 21:49:39,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 58871.4). Total num frames: 8000372736. Throughput: 0: 59013.1. Samples: 905618460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:49:39,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 21:49:40,062][54818] Updated weights for policy 0, policy_version 488308 (0.0027) [2024-04-27 21:49:43,128][54818] Updated weights for policy 0, policy_version 488318 (0.0027) [2024-04-27 21:49:44,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59528.5, 300 sec: 58926.9). Total num frames: 8000684032. Throughput: 0: 59181.8. Samples: 905796220. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:49:44,263][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 21:49:45,860][54818] Updated weights for policy 0, policy_version 488328 (0.0026) [2024-04-27 21:49:48,762][54818] Updated weights for policy 0, policy_version 488338 (0.0027) [2024-04-27 21:49:49,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8000962560. Throughput: 0: 59049.8. Samples: 906152720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:49:49,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 21:49:49,396][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000488341_8000978944.pth... [2024-04-27 21:49:49,431][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000487478_7986839552.pth [2024-04-27 21:49:51,454][54818] Updated weights for policy 0, policy_version 488348 (0.0027) [2024-04-27 21:49:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 8001241088. Throughput: 0: 58979.1. Samples: 906499680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:49:54,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 21:49:54,448][54818] Updated weights for policy 0, policy_version 488358 (0.0026) [2024-04-27 21:49:56,986][54818] Updated weights for policy 0, policy_version 488368 (0.0027) [2024-04-27 21:49:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8001536000. Throughput: 0: 58803.9. Samples: 906673060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:49:59,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:49:59,847][54818] Updated weights for policy 0, policy_version 488378 (0.0026) [2024-04-27 21:50:02,789][54818] Updated weights for policy 0, policy_version 488388 (0.0027) [2024-04-27 21:50:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.2, 300 sec: 58871.3). 
Total num frames: 8001830912. Throughput: 0: 59079.3. Samples: 907033540. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:50:04,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:50:05,181][54798] Signal inference workers to stop experience collection... (13650 times) [2024-04-27 21:50:05,183][54798] Signal inference workers to resume experience collection... (13650 times) [2024-04-27 21:50:05,194][54818] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-04-27 21:50:05,204][54818] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-04-27 21:50:05,442][54818] Updated weights for policy 0, policy_version 488398 (0.0024) [2024-04-27 21:50:08,363][54818] Updated weights for policy 0, policy_version 488408 (0.0026) [2024-04-27 21:50:09,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8002125824. Throughput: 0: 59089.8. Samples: 907385680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:50:09,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 21:50:10,865][54818] Updated weights for policy 0, policy_version 488418 (0.0027) [2024-04-27 21:50:13,707][54818] Updated weights for policy 0, policy_version 488428 (0.0026) [2024-04-27 21:50:14,253][54587] Fps is (10 sec: 57345.0, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8002404352. Throughput: 0: 58721.8. Samples: 907554580. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:50:14,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:50:16,406][54818] Updated weights for policy 0, policy_version 488438 (0.0026) [2024-04-27 21:50:19,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 58871.4). Total num frames: 8002715648. Throughput: 0: 58616.0. Samples: 907904340. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:50:19,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:50:19,412][54818] Updated weights for policy 0, policy_version 488448 (0.0026) [2024-04-27 21:50:22,074][54818] Updated weights for policy 0, policy_version 488458 (0.0027) [2024-04-27 21:50:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8002994176. Throughput: 0: 58601.2. Samples: 908255520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:50:24,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 21:50:25,281][54818] Updated weights for policy 0, policy_version 488468 (0.0027) [2024-04-27 21:50:27,612][54818] Updated weights for policy 0, policy_version 488478 (0.0026) [2024-04-27 21:50:29,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 8003305472. Throughput: 0: 58602.7. Samples: 908433340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:50:29,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 21:50:30,873][54818] Updated weights for policy 0, policy_version 488488 (0.0027) [2024-04-27 21:50:33,298][54818] Updated weights for policy 0, policy_version 488498 (0.0027) [2024-04-27 21:50:34,253][54587] Fps is (10 sec: 62259.3, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8003616768. Throughput: 0: 58350.1. Samples: 908778480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:50:34,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 21:50:36,397][54818] Updated weights for policy 0, policy_version 488508 (0.0025) [2024-04-27 21:50:38,945][54818] Updated weights for policy 0, policy_version 488518 (0.0024) [2024-04-27 21:50:39,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8003895296. Throughput: 0: 58452.6. Samples: 909130040. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-27 21:50:39,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:50:41,966][54818] Updated weights for policy 0, policy_version 488528 (0.0027) [2024-04-27 21:50:44,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8004190208. Throughput: 0: 58715.6. Samples: 909315260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:50:44,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 21:50:44,525][54818] Updated weights for policy 0, policy_version 488538 (0.0027) [2024-04-27 21:50:47,463][54818] Updated weights for policy 0, policy_version 488548 (0.0027) [2024-04-27 21:50:49,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8004485120. Throughput: 0: 58453.5. Samples: 909663940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:50:49,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 21:50:49,274][54587] No heartbeat for components: RolloutWorker_w4 (1597 seconds) [2024-04-27 21:50:50,097][54818] Updated weights for policy 0, policy_version 488558 (0.0026) [2024-04-27 21:50:53,052][54818] Updated weights for policy 0, policy_version 488568 (0.0026) [2024-04-27 21:50:54,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8004780032. Throughput: 0: 58381.7. Samples: 910012860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:50:54,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 21:50:55,709][54818] Updated weights for policy 0, policy_version 488578 (0.0025) [2024-04-27 21:50:58,804][54818] Updated weights for policy 0, policy_version 488588 (0.0027) [2024-04-27 21:50:59,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 8005074944. Throughput: 0: 58557.6. Samples: 910189680. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:50:59,262][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:51:01,097][54818] Updated weights for policy 0, policy_version 488598 (0.0027) [2024-04-27 21:51:03,564][54798] Signal inference workers to stop experience collection... (13700 times) [2024-04-27 21:51:03,565][54798] Signal inference workers to resume experience collection... (13700 times) [2024-04-27 21:51:03,595][54818] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-04-27 21:51:03,595][54818] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-04-27 21:51:04,186][54818] Updated weights for policy 0, policy_version 488608 (0.0025) [2024-04-27 21:51:04,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 8005353472. Throughput: 0: 58722.2. Samples: 910546840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:04,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 21:51:06,516][54818] Updated weights for policy 0, policy_version 488618 (0.0026) [2024-04-27 21:51:09,253][54587] Fps is (10 sec: 55704.9, 60 sec: 58436.1, 300 sec: 58760.2). Total num frames: 8005632000. Throughput: 0: 58859.4. Samples: 910904200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:09,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 21:51:09,743][54818] Updated weights for policy 0, policy_version 488628 (0.0025) [2024-04-27 21:51:12,423][54818] Updated weights for policy 0, policy_version 488638 (0.0027) [2024-04-27 21:51:14,253][54587] Fps is (10 sec: 55705.2, 60 sec: 58436.1, 300 sec: 58760.2). Total num frames: 8005910528. Throughput: 0: 58553.3. Samples: 911068240. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:14,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 21:51:15,280][54818] Updated weights for policy 0, policy_version 488648 (0.0025) [2024-04-27 21:51:18,149][54818] Updated weights for policy 0, policy_version 488658 (0.0025) [2024-04-27 21:51:19,253][54587] Fps is (10 sec: 57345.3, 60 sec: 58163.2, 300 sec: 58760.2). Total num frames: 8006205440. Throughput: 0: 58687.2. Samples: 911419400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:19,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 21:51:20,700][54818] Updated weights for policy 0, policy_version 488668 (0.0027) [2024-04-27 21:51:24,000][54818] Updated weights for policy 0, policy_version 488678 (0.0026) [2024-04-27 21:51:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8006516736. Throughput: 0: 58923.8. Samples: 911781620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:24,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 21:51:26,268][54818] Updated weights for policy 0, policy_version 488688 (0.0027) [2024-04-27 21:51:29,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8006811648. Throughput: 0: 58567.9. Samples: 911950820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:29,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 21:51:29,563][54818] Updated weights for policy 0, policy_version 488698 (0.0027) [2024-04-27 21:51:31,754][54818] Updated weights for policy 0, policy_version 488708 (0.0026) [2024-04-27 21:51:34,253][54587] Fps is (10 sec: 58983.4, 60 sec: 58163.3, 300 sec: 58815.8). Total num frames: 8007106560. Throughput: 0: 58689.4. Samples: 912304960. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:34,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 21:51:34,951][54818] Updated weights for policy 0, policy_version 488718 (0.0025) [2024-04-27 21:51:37,282][54818] Updated weights for policy 0, policy_version 488728 (0.0027) [2024-04-27 21:51:39,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8007417856. Throughput: 0: 58761.0. Samples: 912657100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:39,253][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 21:51:40,352][54818] Updated weights for policy 0, policy_version 488738 (0.0025) [2024-04-27 21:51:42,733][54818] Updated weights for policy 0, policy_version 488748 (0.0026) [2024-04-27 21:51:44,253][54587] Fps is (10 sec: 62258.7, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8007729152. Throughput: 0: 59062.3. Samples: 912847480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:44,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 21:51:45,864][54818] Updated weights for policy 0, policy_version 488758 (0.0026) [2024-04-27 21:51:48,341][54818] Updated weights for policy 0, policy_version 488768 (0.0030) [2024-04-27 21:51:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8008024064. Throughput: 0: 58875.5. Samples: 913196240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:49,254][54587] Avg episode reward: [(0, '0.491')] [2024-04-27 21:51:49,304][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000488772_8008040448.pth... [2024-04-27 21:51:49,351][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000487908_7993884672.pth [2024-04-27 21:51:51,507][54818] Updated weights for policy 0, policy_version 488778 (0.0027) [2024-04-27 21:51:52,635][54798] Signal inference workers to stop experience collection... 
(13750 times) [2024-04-27 21:51:52,635][54798] Signal inference workers to resume experience collection... (13750 times) [2024-04-27 21:51:52,648][54818] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-04-27 21:51:52,648][54818] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-04-27 21:51:53,834][54818] Updated weights for policy 0, policy_version 488788 (0.0025) [2024-04-27 21:51:54,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58982.3, 300 sec: 58926.8). Total num frames: 8008318976. Throughput: 0: 58544.9. Samples: 913538720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:54,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 21:51:57,474][54818] Updated weights for policy 0, policy_version 488798 (0.0027) [2024-04-27 21:51:59,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8008613888. Throughput: 0: 58992.0. Samples: 913722880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:51:59,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:51:59,487][54818] Updated weights for policy 0, policy_version 488808 (0.0027) [2024-04-27 21:52:03,018][54818] Updated weights for policy 0, policy_version 488818 (0.0027) [2024-04-27 21:52:04,253][54587] Fps is (10 sec: 57345.0, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8008892416. Throughput: 0: 59138.6. Samples: 914080640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:52:04,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 21:52:04,944][54818] Updated weights for policy 0, policy_version 488828 (0.0026) [2024-04-27 21:52:08,545][54818] Updated weights for policy 0, policy_version 488838 (0.0026) [2024-04-27 21:52:09,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58982.6, 300 sec: 58760.3). Total num frames: 8009170944. Throughput: 0: 59012.1. Samples: 914437160. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:09,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 21:52:10,579][54818] Updated weights for policy 0, policy_version 488848 (0.0026) [2024-04-27 21:52:14,075][54818] Updated weights for policy 0, policy_version 488858 (0.0024) [2024-04-27 21:52:14,253][54587] Fps is (10 sec: 55705.0, 60 sec: 58982.4, 300 sec: 58704.7). Total num frames: 8009449472. Throughput: 0: 58891.5. Samples: 914600940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:14,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 21:52:16,359][54818] Updated weights for policy 0, policy_version 488868 (0.0025) [2024-04-27 21:52:19,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.3, 300 sec: 58760.2). Total num frames: 8009744384. Throughput: 0: 59014.0. Samples: 914960600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:19,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 21:52:19,536][54818] Updated weights for policy 0, policy_version 488878 (0.0026) [2024-04-27 21:52:21,922][54818] Updated weights for policy 0, policy_version 488888 (0.0027) [2024-04-27 21:52:24,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 8010039296. Throughput: 0: 58996.3. Samples: 915311940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:24,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 21:52:25,057][54818] Updated weights for policy 0, policy_version 488898 (0.0026) [2024-04-27 21:52:27,357][54818] Updated weights for policy 0, policy_version 488908 (0.0026) [2024-04-27 21:52:29,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8010350592. Throughput: 0: 58563.9. Samples: 915482860. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:29,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 21:52:30,561][54818] Updated weights for policy 0, policy_version 488918 (0.0027) [2024-04-27 21:52:32,787][54818] Updated weights for policy 0, policy_version 488928 (0.0026) [2024-04-27 21:52:34,253][54587] Fps is (10 sec: 62259.2, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8010661888. Throughput: 0: 58660.9. Samples: 915835980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:34,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 21:52:36,068][54818] Updated weights for policy 0, policy_version 488938 (0.0027) [2024-04-27 21:52:38,776][54818] Updated weights for policy 0, policy_version 488948 (0.0027) [2024-04-27 21:52:39,253][54587] Fps is (10 sec: 58983.4, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8010940416. Throughput: 0: 58995.0. Samples: 916193480. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:39,254][54587] Avg episode reward: [(0, '0.513')] [2024-04-27 21:52:41,610][54818] Updated weights for policy 0, policy_version 488958 (0.0024) [2024-04-27 21:52:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8011235328. Throughput: 0: 58846.7. Samples: 916370980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:44,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 21:52:44,350][54798] Signal inference workers to stop experience collection... (13800 times) [2024-04-27 21:52:44,386][54818] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-04-27 21:52:44,441][54798] Signal inference workers to resume experience collection... 
(13800 times) [2024-04-27 21:52:44,442][54818] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-04-27 21:52:44,443][54818] Updated weights for policy 0, policy_version 488968 (0.0025) [2024-04-27 21:52:47,051][54818] Updated weights for policy 0, policy_version 488978 (0.0025) [2024-04-27 21:52:49,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58709.3, 300 sec: 58926.8). Total num frames: 8011546624. Throughput: 0: 58649.3. Samples: 916719860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:49,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 21:52:49,888][54818] Updated weights for policy 0, policy_version 488988 (0.0028) [2024-04-27 21:52:52,604][54818] Updated weights for policy 0, policy_version 488998 (0.0026) [2024-04-27 21:52:54,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58709.5, 300 sec: 58871.4). Total num frames: 8011841536. Throughput: 0: 58515.6. Samples: 917070360. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:54,253][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 21:52:55,687][54818] Updated weights for policy 0, policy_version 489008 (0.0027) [2024-04-27 21:52:57,983][54818] Updated weights for policy 0, policy_version 489018 (0.0026) [2024-04-27 21:52:59,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8012136448. Throughput: 0: 58975.1. Samples: 917254820. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:52:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 21:53:01,069][54818] Updated weights for policy 0, policy_version 489028 (0.0025) [2024-04-27 21:53:03,545][54818] Updated weights for policy 0, policy_version 489038 (0.0027) [2024-04-27 21:53:04,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 8012431360. Throughput: 0: 58998.7. Samples: 917615540. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:53:04,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 21:53:06,464][54818] Updated weights for policy 0, policy_version 489048 (0.0026) [2024-04-27 21:53:09,076][54818] Updated weights for policy 0, policy_version 489058 (0.0026) [2024-04-27 21:53:09,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 8012726272. Throughput: 0: 58943.1. Samples: 917964380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:53:09,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 21:53:12,028][54818] Updated weights for policy 0, policy_version 489068 (0.0027) [2024-04-27 21:53:14,253][54587] Fps is (10 sec: 57345.3, 60 sec: 59255.7, 300 sec: 58760.3). Total num frames: 8013004800. Throughput: 0: 59159.4. Samples: 918145020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:53:14,253][54587] Avg episode reward: [(0, '0.524')] [2024-04-27 21:53:14,665][54818] Updated weights for policy 0, policy_version 489078 (0.0027) [2024-04-27 21:53:17,513][54818] Updated weights for policy 0, policy_version 489088 (0.0027) [2024-04-27 21:53:19,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.6, 300 sec: 58815.8). Total num frames: 8013299712. Throughput: 0: 58986.7. Samples: 918490380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:53:19,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 21:53:20,168][54818] Updated weights for policy 0, policy_version 489098 (0.0025) [2024-04-27 21:53:23,173][54818] Updated weights for policy 0, policy_version 489108 (0.0027) [2024-04-27 21:53:24,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.5, 300 sec: 58704.7). Total num frames: 8013578240. Throughput: 0: 59042.6. Samples: 918850400. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:53:24,253][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 21:53:25,691][54818] Updated weights for policy 0, policy_version 489118 (0.0027) [2024-04-27 21:53:28,606][54818] Updated weights for policy 0, policy_version 489128 (0.0027) [2024-04-27 21:53:29,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8013889536. Throughput: 0: 58981.3. Samples: 919025140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-04-27 21:53:29,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 21:53:31,152][54818] Updated weights for policy 0, policy_version 489138 (0.0026) [2024-04-27 21:53:33,269][54798] Signal inference workers to stop experience collection... (13850 times) [2024-04-27 21:53:33,289][54818] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-04-27 21:53:33,361][54798] Signal inference workers to resume experience collection... (13850 times) [2024-04-27 21:53:33,362][54818] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-04-27 21:53:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8014184448. Throughput: 0: 59199.2. Samples: 919383820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:53:34,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 21:53:34,465][54818] Updated weights for policy 0, policy_version 489148 (0.0027) [2024-04-27 21:53:37,091][54818] Updated weights for policy 0, policy_version 489158 (0.0027) [2024-04-27 21:53:39,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.4, 300 sec: 58926.9). Total num frames: 8014495744. Throughput: 0: 59160.9. Samples: 919732600. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:53:39,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 21:53:40,195][54818] Updated weights for policy 0, policy_version 489168 (0.0026) [2024-04-27 21:53:42,757][54818] Updated weights for policy 0, policy_version 489178 (0.0026) [2024-04-27 21:53:44,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8014774272. Throughput: 0: 58815.7. Samples: 919901520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:53:44,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:53:45,688][54818] Updated weights for policy 0, policy_version 489188 (0.0027) [2024-04-27 21:53:48,296][54818] Updated weights for policy 0, policy_version 489198 (0.0025) [2024-04-27 21:53:49,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8015052800. Throughput: 0: 58725.0. Samples: 920258160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:53:49,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 21:53:49,262][54587] No heartbeat for components: RolloutWorker_w4 (1777 seconds) [2024-04-27 21:53:49,391][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000489202_8015085568.pth... [2024-04-27 21:53:49,443][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000488341_8000978944.pth [2024-04-27 21:53:51,208][54818] Updated weights for policy 0, policy_version 489208 (0.0027) [2024-04-27 21:53:53,981][54818] Updated weights for policy 0, policy_version 489218 (0.0027) [2024-04-27 21:53:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58436.2, 300 sec: 58760.2). Total num frames: 8015347712. Throughput: 0: 58932.4. Samples: 920616340. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:53:54,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:53:56,667][54818] Updated weights for policy 0, policy_version 489228 (0.0024) [2024-04-27 21:53:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 8015659008. Throughput: 0: 58851.5. Samples: 920793340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:53:59,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 21:53:59,609][54818] Updated weights for policy 0, policy_version 489238 (0.0027) [2024-04-27 21:54:02,161][54818] Updated weights for policy 0, policy_version 489248 (0.0027) [2024-04-27 21:54:04,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58436.3, 300 sec: 58760.2). Total num frames: 8015937536. Throughput: 0: 58899.0. Samples: 921140840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 21:54:05,288][54818] Updated weights for policy 0, policy_version 489258 (0.0025) [2024-04-27 21:54:07,721][54818] Updated weights for policy 0, policy_version 489268 (0.0025) [2024-04-27 21:54:09,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8016248832. Throughput: 0: 58663.8. Samples: 921490280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:09,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 21:54:10,837][54818] Updated weights for policy 0, policy_version 489278 (0.0027) [2024-04-27 21:54:13,168][54818] Updated weights for policy 0, policy_version 489288 (0.0027) [2024-04-27 21:54:14,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8016560128. Throughput: 0: 58841.9. Samples: 921673020. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:14,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 21:54:16,239][54818] Updated weights for policy 0, policy_version 489298 (0.0026) [2024-04-27 21:54:18,630][54818] Updated weights for policy 0, policy_version 489308 (0.0025) [2024-04-27 21:54:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8016855040. Throughput: 0: 58866.2. Samples: 922032800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:19,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 21:54:21,435][54798] Signal inference workers to stop experience collection... (13900 times) [2024-04-27 21:54:21,439][54798] Signal inference workers to resume experience collection... (13900 times) [2024-04-27 21:54:21,462][54818] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-04-27 21:54:21,462][54818] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-04-27 21:54:21,735][54818] Updated weights for policy 0, policy_version 489318 (0.0026) [2024-04-27 21:54:24,253][54587] Fps is (10 sec: 57343.1, 60 sec: 59255.3, 300 sec: 58815.7). Total num frames: 8017133568. Throughput: 0: 58830.5. Samples: 922379980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:24,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 21:54:24,321][54818] Updated weights for policy 0, policy_version 489328 (0.0025) [2024-04-27 21:54:27,163][54818] Updated weights for policy 0, policy_version 489338 (0.0031) [2024-04-27 21:54:29,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.4, 300 sec: 58815.7). Total num frames: 8017428480. Throughput: 0: 59117.6. Samples: 922561820. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:29,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 21:54:29,787][54818] Updated weights for policy 0, policy_version 489348 (0.0027) [2024-04-27 21:54:32,590][54818] Updated weights for policy 0, policy_version 489358 (0.0027) [2024-04-27 21:54:34,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8017723392. Throughput: 0: 58888.9. Samples: 922908160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:34,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 21:54:35,279][54818] Updated weights for policy 0, policy_version 489368 (0.0025) [2024-04-27 21:54:38,195][54818] Updated weights for policy 0, policy_version 489378 (0.0027) [2024-04-27 21:54:39,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58436.1, 300 sec: 58704.7). Total num frames: 8018001920. Throughput: 0: 58841.6. Samples: 923264220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:39,254][54587] Avg episode reward: [(0, '0.500')] [2024-04-27 21:54:40,871][54818] Updated weights for policy 0, policy_version 489388 (0.0027) [2024-04-27 21:54:43,695][54818] Updated weights for policy 0, policy_version 489398 (0.0025) [2024-04-27 21:54:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8018296832. Throughput: 0: 58773.7. Samples: 923438160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:44,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 21:54:46,382][54818] Updated weights for policy 0, policy_version 489408 (0.0026) [2024-04-27 21:54:49,153][54818] Updated weights for policy 0, policy_version 489418 (0.0026) [2024-04-27 21:54:49,253][54587] Fps is (10 sec: 62260.0, 60 sec: 59528.5, 300 sec: 58926.9). Total num frames: 8018624512. Throughput: 0: 59030.2. Samples: 923797200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:49,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 21:54:52,097][54818] Updated weights for policy 0, policy_version 489428 (0.0027) [2024-04-27 21:54:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8018903040. Throughput: 0: 59082.7. Samples: 924149000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:54,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 21:54:55,251][54818] Updated weights for policy 0, policy_version 489438 (0.0023) [2024-04-27 21:54:57,768][54818] Updated weights for policy 0, policy_version 489448 (0.0024) [2024-04-27 21:54:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8019197952. Throughput: 0: 58799.1. Samples: 924318980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 21:54:59,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 21:55:00,694][54818] Updated weights for policy 0, policy_version 489458 (0.0026) [2024-04-27 21:55:03,444][54818] Updated weights for policy 0, policy_version 489468 (0.0026) [2024-04-27 21:55:04,253][54587] Fps is (10 sec: 55706.4, 60 sec: 58709.4, 300 sec: 58760.3). Total num frames: 8019460096. Throughput: 0: 58701.5. Samples: 924674360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:04,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 21:55:06,285][54818] Updated weights for policy 0, policy_version 489478 (0.0026) [2024-04-27 21:55:09,164][54818] Updated weights for policy 0, policy_version 489488 (0.0026) [2024-04-27 21:55:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8019771392. Throughput: 0: 58752.2. Samples: 925023820. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:09,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 21:55:11,732][54818] Updated weights for policy 0, policy_version 489498 (0.0027) [2024-04-27 21:55:12,821][54798] Signal inference workers to stop experience collection... (13950 times) [2024-04-27 21:55:12,826][54798] Signal inference workers to resume experience collection... (13950 times) [2024-04-27 21:55:12,841][54818] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-04-27 21:55:12,841][54818] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-04-27 21:55:14,253][54587] Fps is (10 sec: 60620.2, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8020066304. Throughput: 0: 58719.2. Samples: 925204180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:14,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 21:55:14,968][54818] Updated weights for policy 0, policy_version 489508 (0.0025) [2024-04-27 21:55:17,249][54818] Updated weights for policy 0, policy_version 489518 (0.0026) [2024-04-27 21:55:19,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8020377600. Throughput: 0: 58935.6. Samples: 925560260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:19,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 21:55:20,374][54818] Updated weights for policy 0, policy_version 489528 (0.0026) [2024-04-27 21:55:23,143][54818] Updated weights for policy 0, policy_version 489538 (0.0027) [2024-04-27 21:55:24,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 8020656128. Throughput: 0: 58705.5. Samples: 925905960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:24,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 21:55:25,969][54818] Updated weights for policy 0, policy_version 489548 (0.0025) [2024-04-27 21:55:28,787][54818] Updated weights for policy 0, policy_version 489558 (0.0025) [2024-04-27 21:55:29,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 8020951040. Throughput: 0: 58730.7. Samples: 926081040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:29,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 21:55:31,467][54818] Updated weights for policy 0, policy_version 489568 (0.0025) [2024-04-27 21:55:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58436.3, 300 sec: 58760.2). Total num frames: 8021229568. Throughput: 0: 58479.2. Samples: 926428760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:34,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 21:55:34,425][54818] Updated weights for policy 0, policy_version 489578 (0.0027) [2024-04-27 21:55:36,891][54818] Updated weights for policy 0, policy_version 489588 (0.0025) [2024-04-27 21:55:39,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.5, 300 sec: 58760.2). Total num frames: 8021524480. Throughput: 0: 58587.6. Samples: 926785440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:39,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 21:55:39,859][54818] Updated weights for policy 0, policy_version 489598 (0.0027) [2024-04-27 21:55:42,415][54818] Updated weights for policy 0, policy_version 489608 (0.0026) [2024-04-27 21:55:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 8021835776. Throughput: 0: 58903.3. Samples: 926969620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:44,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 21:55:45,343][54818] Updated weights for policy 0, policy_version 489618 (0.0025) [2024-04-27 21:55:48,021][54818] Updated weights for policy 0, policy_version 489628 (0.0025) [2024-04-27 21:55:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8022130688. Throughput: 0: 58698.1. Samples: 927315780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:49,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 21:55:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000489632_8022130688.pth... [2024-04-27 21:55:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000488772_8008040448.pth [2024-04-27 21:55:50,926][54818] Updated weights for policy 0, policy_version 489638 (0.0027) [2024-04-27 21:55:53,595][54818] Updated weights for policy 0, policy_version 489648 (0.0025) [2024-04-27 21:55:54,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 8022425600. Throughput: 0: 58633.9. Samples: 927662340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:54,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 21:55:56,469][54818] Updated weights for policy 0, policy_version 489658 (0.0026) [2024-04-27 21:55:59,163][54818] Updated weights for policy 0, policy_version 489668 (0.0026) [2024-04-27 21:55:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8022720512. Throughput: 0: 58643.6. Samples: 927843140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:55:59,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 21:56:02,054][54818] Updated weights for policy 0, policy_version 489678 (0.0027) [2024-04-27 21:56:04,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.4, 300 sec: 58871.4). 
Total num frames: 8022999040. Throughput: 0: 58502.2. Samples: 928192860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:56:04,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 21:56:04,742][54818] Updated weights for policy 0, policy_version 489688 (0.0025) [2024-04-27 21:56:05,001][54798] Signal inference workers to stop experience collection... (14000 times) [2024-04-27 21:56:05,033][54818] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-04-27 21:56:05,059][54798] Signal inference workers to resume experience collection... (14000 times) [2024-04-27 21:56:05,060][54818] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-04-27 21:56:07,693][54818] Updated weights for policy 0, policy_version 489698 (0.0024) [2024-04-27 21:56:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 8023277568. Throughput: 0: 58796.5. Samples: 928551800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:56:09,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 21:56:10,228][54818] Updated weights for policy 0, policy_version 489708 (0.0027) [2024-04-27 21:56:13,384][54818] Updated weights for policy 0, policy_version 489718 (0.0026) [2024-04-27 21:56:14,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 8023572480. Throughput: 0: 58643.1. Samples: 928719980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:56:14,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 21:56:15,716][54818] Updated weights for policy 0, policy_version 489728 (0.0026) [2024-04-27 21:56:18,846][54818] Updated weights for policy 0, policy_version 489738 (0.0026) [2024-04-27 21:56:19,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58163.2, 300 sec: 58815.8). Total num frames: 8023867392. Throughput: 0: 58807.5. Samples: 929075100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:56:19,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:56:21,271][54818] Updated weights for policy 0, policy_version 489748 (0.0024) [2024-04-27 21:56:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8024178688. Throughput: 0: 58584.0. Samples: 929421720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 21:56:24,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 21:56:24,583][54818] Updated weights for policy 0, policy_version 489758 (0.0027) [2024-04-27 21:56:26,926][54818] Updated weights for policy 0, policy_version 489768 (0.0022) [2024-04-27 21:56:29,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58436.2, 300 sec: 58815.7). Total num frames: 8024457216. Throughput: 0: 58384.6. Samples: 929596940. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:56:29,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 21:56:30,416][54818] Updated weights for policy 0, policy_version 489778 (0.0024) [2024-04-27 21:56:32,683][54818] Updated weights for policy 0, policy_version 489788 (0.0028) [2024-04-27 21:56:34,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 8024768512. Throughput: 0: 58531.7. Samples: 929949700. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:56:34,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 21:56:36,059][54818] Updated weights for policy 0, policy_version 489798 (0.0027) [2024-04-27 21:56:38,157][54818] Updated weights for policy 0, policy_version 489808 (0.0027) [2024-04-27 21:56:39,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.3, 300 sec: 58704.7). Total num frames: 8025047040. Throughput: 0: 58640.3. Samples: 930301160. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:56:39,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 21:56:41,461][54818] Updated weights for policy 0, policy_version 489818 (0.0027) [2024-04-27 21:56:43,573][54818] Updated weights for policy 0, policy_version 489828 (0.0027) [2024-04-27 21:56:44,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58436.2, 300 sec: 58704.7). Total num frames: 8025341952. Throughput: 0: 58780.9. Samples: 930488280. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:56:44,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:56:47,023][54818] Updated weights for policy 0, policy_version 489838 (0.0027) [2024-04-27 21:56:49,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58709.4, 300 sec: 58760.3). Total num frames: 8025653248. Throughput: 0: 58634.2. Samples: 930831400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:56:49,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 21:56:49,263][54587] No heartbeat for components: RolloutWorker_w4 (1957 seconds) [2024-04-27 21:56:49,502][54818] Updated weights for policy 0, policy_version 489848 (0.0027) [2024-04-27 21:56:52,650][54818] Updated weights for policy 0, policy_version 489858 (0.0026) [2024-04-27 21:56:54,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58709.2, 300 sec: 58760.3). Total num frames: 8025948160. Throughput: 0: 58324.4. Samples: 931176400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:56:54,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 21:56:55,260][54818] Updated weights for policy 0, policy_version 489868 (0.0025) [2024-04-27 21:56:58,175][54818] Updated weights for policy 0, policy_version 489878 (0.0026) [2024-04-27 21:56:59,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8026243072. Throughput: 0: 58623.6. Samples: 931358040. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:56:59,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 21:57:00,926][54818] Updated weights for policy 0, policy_version 489888 (0.0025) [2024-04-27 21:57:03,558][54818] Updated weights for policy 0, policy_version 489898 (0.0025) [2024-04-27 21:57:04,253][54587] Fps is (10 sec: 58983.4, 60 sec: 58982.5, 300 sec: 58871.4). Total num frames: 8026537984. Throughput: 0: 58662.4. Samples: 931714900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:04,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 21:57:06,525][54818] Updated weights for policy 0, policy_version 489908 (0.0026) [2024-04-27 21:57:08,614][54798] Signal inference workers to stop experience collection... (14050 times) [2024-04-27 21:57:08,643][54818] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-04-27 21:57:08,704][54798] Signal inference workers to resume experience collection... (14050 times) [2024-04-27 21:57:08,704][54818] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-04-27 21:57:09,027][54818] Updated weights for policy 0, policy_version 489918 (0.0025) [2024-04-27 21:57:09,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8026816512. Throughput: 0: 58904.4. Samples: 932072420. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:09,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 21:57:12,003][54818] Updated weights for policy 0, policy_version 489928 (0.0030) [2024-04-27 21:57:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 8027095040. Throughput: 0: 58723.0. Samples: 932239460. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:14,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:57:14,697][54818] Updated weights for policy 0, policy_version 489938 (0.0025) [2024-04-27 21:57:17,596][54818] Updated weights for policy 0, policy_version 489948 (0.0027) [2024-04-27 21:57:19,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8027389952. Throughput: 0: 58836.7. Samples: 932597360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:19,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 21:57:20,098][54818] Updated weights for policy 0, policy_version 489958 (0.0028) [2024-04-27 21:57:23,198][54818] Updated weights for policy 0, policy_version 489968 (0.0026) [2024-04-27 21:57:24,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58436.3, 300 sec: 58760.3). Total num frames: 8027684864. Throughput: 0: 58913.1. Samples: 932952240. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:24,253][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 21:57:25,637][54818] Updated weights for policy 0, policy_version 489978 (0.0026) [2024-04-27 21:57:28,571][54818] Updated weights for policy 0, policy_version 489988 (0.0025) [2024-04-27 21:57:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 58704.7). Total num frames: 8027979776. Throughput: 0: 58478.2. Samples: 933119800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:29,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 21:57:31,177][54818] Updated weights for policy 0, policy_version 489998 (0.0027) [2024-04-27 21:57:34,147][54818] Updated weights for policy 0, policy_version 490008 (0.0028) [2024-04-27 21:57:34,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58709.2, 300 sec: 58815.8). Total num frames: 8028291072. Throughput: 0: 58695.5. Samples: 933472700. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:34,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 21:57:36,669][54818] Updated weights for policy 0, policy_version 490018 (0.0027) [2024-04-27 21:57:39,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8028585984. Throughput: 0: 59144.0. Samples: 933837880. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:39,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 21:57:39,941][54818] Updated weights for policy 0, policy_version 490028 (0.0026) [2024-04-27 21:57:42,248][54818] Updated weights for policy 0, policy_version 490038 (0.0027) [2024-04-27 21:57:44,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58709.4, 300 sec: 58704.7). Total num frames: 8028864512. Throughput: 0: 58933.9. Samples: 934010060. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:44,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:57:45,451][54818] Updated weights for policy 0, policy_version 490048 (0.0026) [2024-04-27 21:57:47,640][54818] Updated weights for policy 0, policy_version 490058 (0.0026) [2024-04-27 21:57:49,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8029175808. Throughput: 0: 58814.4. Samples: 934361560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-04-27 21:57:49,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 21:57:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000490062_8029175808.pth... [2024-04-27 21:57:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000489202_8015085568.pth [2024-04-27 21:57:50,810][54818] Updated weights for policy 0, policy_version 490068 (0.0026) [2024-04-27 21:57:53,220][54818] Updated weights for policy 0, policy_version 490078 (0.0027) [2024-04-27 21:57:54,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.4, 300 sec: 58760.3). 
Total num frames: 8029470720. Throughput: 0: 58849.4. Samples: 934720640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:57:54,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 21:57:56,328][54818] Updated weights for policy 0, policy_version 490088 (0.0024) [2024-04-27 21:57:58,797][54818] Updated weights for policy 0, policy_version 490098 (0.0026) [2024-04-27 21:57:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8029782016. Throughput: 0: 59242.9. Samples: 934905400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:57:59,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 21:58:01,954][54818] Updated weights for policy 0, policy_version 490108 (0.0026) [2024-04-27 21:58:03,221][54798] Signal inference workers to stop experience collection... (14100 times) [2024-04-27 21:58:03,250][54818] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-04-27 21:58:03,280][54798] Signal inference workers to resume experience collection... (14100 times) [2024-04-27 21:58:03,280][54818] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-04-27 21:58:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 8030076928. Throughput: 0: 59126.3. Samples: 935258040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:04,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 21:58:04,274][54818] Updated weights for policy 0, policy_version 490118 (0.0026) [2024-04-27 21:58:07,434][54818] Updated weights for policy 0, policy_version 490128 (0.0027) [2024-04-27 21:58:09,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59255.6, 300 sec: 58871.3). Total num frames: 8030371840. Throughput: 0: 58943.1. Samples: 935604680. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:09,253][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 21:58:09,905][54818] Updated weights for policy 0, policy_version 490138 (0.0024) [2024-04-27 21:58:13,159][54818] Updated weights for policy 0, policy_version 490148 (0.0026) [2024-04-27 21:58:14,253][54587] Fps is (10 sec: 58981.5, 60 sec: 59528.3, 300 sec: 58871.3). Total num frames: 8030666752. Throughput: 0: 59366.6. Samples: 935791300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:14,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:58:16,154][54818] Updated weights for policy 0, policy_version 490158 (0.0026) [2024-04-27 21:58:18,610][54818] Updated weights for policy 0, policy_version 490168 (0.0027) [2024-04-27 21:58:19,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59528.6, 300 sec: 58926.8). Total num frames: 8030961664. Throughput: 0: 59334.7. Samples: 936142760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:19,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 21:58:21,726][54818] Updated weights for policy 0, policy_version 490178 (0.0026) [2024-04-27 21:58:24,115][54818] Updated weights for policy 0, policy_version 490188 (0.0025) [2024-04-27 21:58:24,253][54587] Fps is (10 sec: 57345.2, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 8031240192. Throughput: 0: 58998.0. Samples: 936492780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:24,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 21:58:27,179][54818] Updated weights for policy 0, policy_version 490198 (0.0027) [2024-04-27 21:58:29,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 8031535104. Throughput: 0: 59260.8. Samples: 936676800. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:29,254][54587] Avg episode reward: [(0, '0.700')] [2024-04-27 21:58:29,489][54818] Updated weights for policy 0, policy_version 490208 (0.0025) [2024-04-27 21:58:32,597][54818] Updated weights for policy 0, policy_version 490218 (0.0026) [2024-04-27 21:58:34,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.5, 300 sec: 58760.3). Total num frames: 8031830016. Throughput: 0: 59235.7. Samples: 937027160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:34,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 21:58:35,083][54818] Updated weights for policy 0, policy_version 490228 (0.0026) [2024-04-27 21:58:38,307][54818] Updated weights for policy 0, policy_version 490238 (0.0026) [2024-04-27 21:58:39,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8032124928. Throughput: 0: 59097.2. Samples: 937380020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:39,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 21:58:40,481][54818] Updated weights for policy 0, policy_version 490248 (0.0023) [2024-04-27 21:58:43,879][54818] Updated weights for policy 0, policy_version 490258 (0.0026) [2024-04-27 21:58:44,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 8032403456. Throughput: 0: 58756.1. Samples: 937549420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:44,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 21:58:45,899][54818] Updated weights for policy 0, policy_version 490268 (0.0026) [2024-04-27 21:58:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8032698368. Throughput: 0: 58838.5. Samples: 937905780. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:49,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 21:58:49,425][54818] Updated weights for policy 0, policy_version 490278 (0.0026) [2024-04-27 21:58:51,592][54818] Updated weights for policy 0, policy_version 490288 (0.0027) [2024-04-27 21:58:54,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8032993280. Throughput: 0: 59172.3. Samples: 938267440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:54,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 21:58:55,009][54818] Updated weights for policy 0, policy_version 490298 (0.0026) [2024-04-27 21:58:57,083][54818] Updated weights for policy 0, policy_version 490308 (0.0025) [2024-04-27 21:58:57,395][54798] Signal inference workers to stop experience collection... (14150 times) [2024-04-27 21:58:57,426][54818] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-04-27 21:58:57,453][54798] Signal inference workers to resume experience collection... (14150 times) [2024-04-27 21:58:57,454][54818] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-04-27 21:58:59,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8033288192. Throughput: 0: 58627.5. Samples: 938429540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:58:59,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 21:59:00,621][54818] Updated weights for policy 0, policy_version 490318 (0.0024) [2024-04-27 21:59:02,685][54818] Updated weights for policy 0, policy_version 490328 (0.0025) [2024-04-27 21:59:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8033599488. Throughput: 0: 58702.6. Samples: 938784380. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:59:04,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 21:59:06,147][54818] Updated weights for policy 0, policy_version 490338 (0.0025) [2024-04-27 21:59:08,232][54818] Updated weights for policy 0, policy_version 490348 (0.0026) [2024-04-27 21:59:09,253][54587] Fps is (10 sec: 62260.3, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8033910784. Throughput: 0: 58706.6. Samples: 939134580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:59:09,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 21:59:11,701][54818] Updated weights for policy 0, policy_version 490358 (0.0026) [2024-04-27 21:59:13,748][54818] Updated weights for policy 0, policy_version 490368 (0.0024) [2024-04-27 21:59:14,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58982.6, 300 sec: 58815.8). Total num frames: 8034205696. Throughput: 0: 58848.1. Samples: 939324960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-27 21:59:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 21:59:17,331][54818] Updated weights for policy 0, policy_version 490378 (0.0027) [2024-04-27 21:59:19,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8034500608. Throughput: 0: 58743.8. Samples: 939670640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 21:59:19,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 21:59:19,458][54818] Updated weights for policy 0, policy_version 490388 (0.0028) [2024-04-27 21:59:22,755][54818] Updated weights for policy 0, policy_version 490398 (0.0027) [2024-04-27 21:59:24,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59528.4, 300 sec: 58926.9). Total num frames: 8034811904. Throughput: 0: 58560.9. Samples: 940015260. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 21:59:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 21:59:24,884][54818] Updated weights for policy 0, policy_version 490408 (0.0027) [2024-04-27 21:59:28,305][54818] Updated weights for policy 0, policy_version 490418 (0.0027) [2024-04-27 21:59:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.5, 300 sec: 58926.9). Total num frames: 8035106816. Throughput: 0: 58997.3. Samples: 940204300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 21:59:29,254][54587] Avg episode reward: [(0, '0.683')] [2024-04-27 21:59:30,300][54818] Updated weights for policy 0, policy_version 490428 (0.0027) [2024-04-27 21:59:33,825][54818] Updated weights for policy 0, policy_version 490438 (0.0026) [2024-04-27 21:59:34,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8035368960. Throughput: 0: 59052.4. Samples: 940563140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 21:59:34,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 21:59:36,476][54818] Updated weights for policy 0, policy_version 490448 (0.0026) [2024-04-27 21:59:39,253][54587] Fps is (10 sec: 54067.7, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 8035647488. Throughput: 0: 58879.2. Samples: 940917000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 21:59:39,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 21:59:39,324][54818] Updated weights for policy 0, policy_version 490458 (0.0025) [2024-04-27 21:59:42,223][54818] Updated weights for policy 0, policy_version 490468 (0.0025) [2024-04-27 21:59:44,253][54587] Fps is (10 sec: 54067.4, 60 sec: 58436.2, 300 sec: 58593.6). Total num frames: 8035909632. Throughput: 0: 58863.2. Samples: 941078380. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 21:59:44,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 21:59:44,601][54798] Signal inference workers to stop experience collection... (14200 times) [2024-04-27 21:59:44,606][54798] Signal inference workers to resume experience collection... (14200 times) [2024-04-27 21:59:44,628][54818] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-04-27 21:59:44,629][54818] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-04-27 21:59:44,888][54818] Updated weights for policy 0, policy_version 490478 (0.0025) [2024-04-27 21:59:47,652][54818] Updated weights for policy 0, policy_version 490488 (0.0027) [2024-04-27 21:59:49,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.4, 300 sec: 58704.7). Total num frames: 8036220928. Throughput: 0: 58883.1. Samples: 941434120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 21:59:49,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 21:59:49,265][54587] No heartbeat for components: RolloutWorker_w4 (2137 seconds) [2024-04-27 21:59:49,376][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000490493_8036237312.pth... [2024-04-27 21:59:49,430][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000489632_8022130688.pth [2024-04-27 21:59:50,399][54818] Updated weights for policy 0, policy_version 490498 (0.0023) [2024-04-27 21:59:53,408][54818] Updated weights for policy 0, policy_version 490508 (0.0024) [2024-04-27 21:59:54,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.4, 300 sec: 58704.7). Total num frames: 8036515840. Throughput: 0: 59073.7. Samples: 941792900. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 21:59:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 21:59:55,846][54818] Updated weights for policy 0, policy_version 490518 (0.0029) [2024-04-27 21:59:58,818][54818] Updated weights for policy 0, policy_version 490528 (0.0025) [2024-04-27 21:59:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 8036810752. Throughput: 0: 58407.0. Samples: 941953280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 21:59:59,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 22:00:01,445][54818] Updated weights for policy 0, policy_version 490538 (0.0027) [2024-04-27 22:00:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8037122048. Throughput: 0: 58556.5. Samples: 942305680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 22:00:04,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 22:00:04,479][54818] Updated weights for policy 0, policy_version 490548 (0.0025) [2024-04-27 22:00:06,909][54818] Updated weights for policy 0, policy_version 490558 (0.0027) [2024-04-27 22:00:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8037416960. Throughput: 0: 58750.2. Samples: 942659020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 22:00:09,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:00:10,234][54818] Updated weights for policy 0, policy_version 490568 (0.0027) [2024-04-27 22:00:12,409][54818] Updated weights for policy 0, policy_version 490578 (0.0027) [2024-04-27 22:00:14,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58436.1, 300 sec: 58760.2). Total num frames: 8037711872. Throughput: 0: 58813.7. Samples: 942850920. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 22:00:14,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 22:00:15,718][54818] Updated weights for policy 0, policy_version 490588 (0.0026) [2024-04-27 22:00:17,984][54818] Updated weights for policy 0, policy_version 490598 (0.0026) [2024-04-27 22:00:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8038023168. Throughput: 0: 58412.5. Samples: 943191700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 22:00:19,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 22:00:21,662][54818] Updated weights for policy 0, policy_version 490608 (0.0025) [2024-04-27 22:00:23,233][54798] Signal inference workers to stop experience collection... (14250 times) [2024-04-27 22:00:23,239][54798] Signal inference workers to resume experience collection... (14250 times) [2024-04-27 22:00:23,253][54818] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-04-27 22:00:23,254][54818] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-04-27 22:00:23,483][54818] Updated weights for policy 0, policy_version 490618 (0.0031) [2024-04-27 22:00:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58163.2, 300 sec: 58815.8). Total num frames: 8038301696. Throughput: 0: 58252.3. Samples: 943538360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 22:00:24,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-27 22:00:27,130][54818] Updated weights for policy 0, policy_version 490628 (0.0027) [2024-04-27 22:00:29,038][54818] Updated weights for policy 0, policy_version 490638 (0.0032) [2024-04-27 22:00:29,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58436.2, 300 sec: 58926.8). Total num frames: 8038612992. Throughput: 0: 58932.8. Samples: 943730360. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 22:00:29,254][54587] Avg episode reward: [(0, '0.483')] [2024-04-27 22:00:32,559][54818] Updated weights for policy 0, policy_version 490648 (0.0027) [2024-04-27 22:00:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8038907904. Throughput: 0: 58805.7. Samples: 944080380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 22:00:34,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 22:00:34,466][54818] Updated weights for policy 0, policy_version 490658 (0.0026) [2024-04-27 22:00:38,145][54818] Updated weights for policy 0, policy_version 490668 (0.0027) [2024-04-27 22:00:39,253][54587] Fps is (10 sec: 55706.4, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8039170048. Throughput: 0: 58746.3. Samples: 944436480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 22:00:39,253][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 22:00:40,362][54818] Updated weights for policy 0, policy_version 490678 (0.0025) [2024-04-27 22:00:43,688][54818] Updated weights for policy 0, policy_version 490688 (0.0025) [2024-04-27 22:00:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 59255.5, 300 sec: 58760.2). Total num frames: 8039464960. Throughput: 0: 58788.4. Samples: 944598760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-04-27 22:00:44,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 22:00:45,833][54818] Updated weights for policy 0, policy_version 490698 (0.0024) [2024-04-27 22:00:49,239][54818] Updated weights for policy 0, policy_version 490708 (0.0027) [2024-04-27 22:00:49,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.4, 300 sec: 58760.2). Total num frames: 8039759872. Throughput: 0: 58869.4. Samples: 944954800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:00:49,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 22:00:51,718][54818] Updated weights for policy 0, policy_version 490718 (0.0025) [2024-04-27 22:00:54,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.4, 300 sec: 58760.2). Total num frames: 8040054784. Throughput: 0: 59049.9. Samples: 945316260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:00:54,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 22:00:54,876][54818] Updated weights for policy 0, policy_version 490728 (0.0025) [2024-04-27 22:00:57,237][54818] Updated weights for policy 0, policy_version 490738 (0.0025) [2024-04-27 22:00:59,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8040333312. Throughput: 0: 58257.8. Samples: 945472520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:00:59,254][54587] Avg episode reward: [(0, '0.442')] [2024-04-27 22:01:00,360][54818] Updated weights for policy 0, policy_version 490748 (0.0027) [2024-04-27 22:01:02,735][54818] Updated weights for policy 0, policy_version 490758 (0.0025) [2024-04-27 22:01:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8040644608. Throughput: 0: 58666.2. Samples: 945831680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:04,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 22:01:05,928][54818] Updated weights for policy 0, policy_version 490768 (0.0026) [2024-04-27 22:01:07,165][54798] Signal inference workers to stop experience collection... (14300 times) [2024-04-27 22:01:07,170][54798] Signal inference workers to resume experience collection... 
(14300 times) [2024-04-27 22:01:07,191][54818] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-04-27 22:01:07,191][54818] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-04-27 22:01:08,531][54818] Updated weights for policy 0, policy_version 490778 (0.0026) [2024-04-27 22:01:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8040939520. Throughput: 0: 58936.4. Samples: 946190500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:09,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 22:01:11,467][54818] Updated weights for policy 0, policy_version 490788 (0.0025) [2024-04-27 22:01:14,231][54818] Updated weights for policy 0, policy_version 490798 (0.0029) [2024-04-27 22:01:14,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8041234432. Throughput: 0: 58505.9. Samples: 946363120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:14,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 22:01:16,963][54818] Updated weights for policy 0, policy_version 490808 (0.0027) [2024-04-27 22:01:19,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58436.4, 300 sec: 58815.8). Total num frames: 8041529344. Throughput: 0: 58546.4. Samples: 946714960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:19,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 22:01:19,757][54818] Updated weights for policy 0, policy_version 490818 (0.0025) [2024-04-27 22:01:22,525][54818] Updated weights for policy 0, policy_version 490828 (0.0025) [2024-04-27 22:01:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8041840640. Throughput: 0: 58411.9. Samples: 947065020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 22:01:25,200][54818] Updated weights for policy 0, policy_version 490838 (0.0027) [2024-04-27 22:01:27,906][54818] Updated weights for policy 0, policy_version 490848 (0.0025) [2024-04-27 22:01:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8042135552. Throughput: 0: 58894.7. Samples: 947249020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:29,254][54587] Avg episode reward: [(0, '0.490')] [2024-04-27 22:01:30,921][54818] Updated weights for policy 0, policy_version 490858 (0.0025) [2024-04-27 22:01:33,621][54818] Updated weights for policy 0, policy_version 490868 (0.0026) [2024-04-27 22:01:34,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58436.2, 300 sec: 58871.3). Total num frames: 8042414080. Throughput: 0: 58935.8. Samples: 947606920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:34,254][54587] Avg episode reward: [(0, '0.485')] [2024-04-27 22:01:36,554][54818] Updated weights for policy 0, policy_version 490878 (0.0027) [2024-04-27 22:01:39,094][54818] Updated weights for policy 0, policy_version 490888 (0.0026) [2024-04-27 22:01:39,253][54587] Fps is (10 sec: 57343.2, 60 sec: 58982.2, 300 sec: 58871.3). Total num frames: 8042708992. Throughput: 0: 58592.2. Samples: 947952920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:39,254][54587] Avg episode reward: [(0, '0.472')] [2024-04-27 22:01:42,112][54818] Updated weights for policy 0, policy_version 490898 (0.0026) [2024-04-27 22:01:44,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 8042987520. Throughput: 0: 59069.8. Samples: 948130660. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:44,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 22:01:44,784][54818] Updated weights for policy 0, policy_version 490908 (0.0026) [2024-04-27 22:01:47,424][54798] Signal inference workers to stop experience collection... (14350 times) [2024-04-27 22:01:47,453][54818] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-04-27 22:01:47,481][54798] Signal inference workers to resume experience collection... (14350 times) [2024-04-27 22:01:47,481][54818] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-04-27 22:01:47,604][54818] Updated weights for policy 0, policy_version 490918 (0.0026) [2024-04-27 22:01:49,254][54587] Fps is (10 sec: 55702.0, 60 sec: 58435.5, 300 sec: 58704.6). Total num frames: 8043266048. Throughput: 0: 58701.7. Samples: 948473300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:49,255][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:01:49,389][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000490923_8043282432.pth... [2024-04-27 22:01:49,453][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000490062_8029175808.pth [2024-04-27 22:01:50,387][54818] Updated weights for policy 0, policy_version 490928 (0.0021) [2024-04-27 22:01:53,222][54818] Updated weights for policy 0, policy_version 490938 (0.0023) [2024-04-27 22:01:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58436.2, 300 sec: 58704.7). Total num frames: 8043560960. Throughput: 0: 58690.7. Samples: 948831580. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:54,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 22:01:55,959][54818] Updated weights for policy 0, policy_version 490948 (0.0025) [2024-04-27 22:01:58,686][54818] Updated weights for policy 0, policy_version 490958 (0.0027) [2024-04-27 22:01:59,253][54587] Fps is (10 sec: 58986.7, 60 sec: 58709.3, 300 sec: 58704.7). Total num frames: 8043855872. Throughput: 0: 58513.6. Samples: 948996240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:01:59,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 22:02:01,363][54818] Updated weights for policy 0, policy_version 490968 (0.0026) [2024-04-27 22:02:04,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 8044167168. Throughput: 0: 58649.3. Samples: 949354180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:02:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:02:04,861][54818] Updated weights for policy 0, policy_version 490978 (0.0027) [2024-04-27 22:02:06,877][54818] Updated weights for policy 0, policy_version 490988 (0.0032) [2024-04-27 22:02:09,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58709.5, 300 sec: 58871.3). Total num frames: 8044462080. Throughput: 0: 58885.9. Samples: 949714880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-27 22:02:09,253][54587] Avg episode reward: [(0, '0.446')] [2024-04-27 22:02:10,312][54818] Updated weights for policy 0, policy_version 490998 (0.0027) [2024-04-27 22:02:12,281][54818] Updated weights for policy 0, policy_version 491008 (0.0024) [2024-04-27 22:02:14,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58436.1, 300 sec: 58815.8). Total num frames: 8044740608. Throughput: 0: 58595.9. Samples: 949885840. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:14,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 22:02:15,937][54818] Updated weights for policy 0, policy_version 491018 (0.0027) [2024-04-27 22:02:17,695][54818] Updated weights for policy 0, policy_version 491028 (0.0023) [2024-04-27 22:02:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8045035520. Throughput: 0: 58255.8. Samples: 950228420. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:19,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 22:02:21,579][54818] Updated weights for policy 0, policy_version 491038 (0.0027) [2024-04-27 22:02:22,003][54798] Signal inference workers to stop experience collection... (14400 times) [2024-04-27 22:02:22,037][54818] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-04-27 22:02:22,066][54798] Signal inference workers to resume experience collection... (14400 times) [2024-04-27 22:02:22,066][54818] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-04-27 22:02:23,394][54818] Updated weights for policy 0, policy_version 491048 (0.0025) [2024-04-27 22:02:24,253][54587] Fps is (10 sec: 60621.9, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 8045346816. Throughput: 0: 58518.1. Samples: 950586220. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:24,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:02:27,138][54818] Updated weights for policy 0, policy_version 491058 (0.0027) [2024-04-27 22:02:29,173][54818] Updated weights for policy 0, policy_version 491068 (0.0023) [2024-04-27 22:02:29,253][54587] Fps is (10 sec: 62258.3, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8045658112. Throughput: 0: 58798.2. Samples: 950776580. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:29,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:02:32,705][54818] Updated weights for policy 0, policy_version 491078 (0.0029) [2024-04-27 22:02:34,253][54587] Fps is (10 sec: 62258.7, 60 sec: 59255.6, 300 sec: 58926.9). Total num frames: 8045969408. Throughput: 0: 59107.7. Samples: 951133100. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:34,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 22:02:34,689][54818] Updated weights for policy 0, policy_version 491088 (0.0027) [2024-04-27 22:02:38,036][54818] Updated weights for policy 0, policy_version 491098 (0.0026) [2024-04-27 22:02:39,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59255.6, 300 sec: 58982.4). Total num frames: 8046264320. Throughput: 0: 58636.0. Samples: 951470200. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:39,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 22:02:40,696][54818] Updated weights for policy 0, policy_version 491108 (0.0026) [2024-04-27 22:02:43,552][54818] Updated weights for policy 0, policy_version 491118 (0.0027) [2024-04-27 22:02:44,253][54587] Fps is (10 sec: 57343.4, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8046542848. Throughput: 0: 59010.2. Samples: 951651700. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:44,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:02:46,210][54818] Updated weights for policy 0, policy_version 491128 (0.0026) [2024-04-27 22:02:49,124][54818] Updated weights for policy 0, policy_version 491138 (0.0026) [2024-04-27 22:02:49,253][54587] Fps is (10 sec: 54067.3, 60 sec: 58983.1, 300 sec: 58760.2). Total num frames: 8046804992. Throughput: 0: 59020.4. Samples: 952010100. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:49,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 22:02:49,262][54587] No heartbeat for components: RolloutWorker_w4 (2317 seconds) [2024-04-27 22:02:51,830][54818] Updated weights for policy 0, policy_version 491148 (0.0025) [2024-04-27 22:02:54,253][54587] Fps is (10 sec: 54068.6, 60 sec: 58709.6, 300 sec: 58649.2). Total num frames: 8047083520. Throughput: 0: 58930.8. Samples: 952366760. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:54,253][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 22:02:54,610][54818] Updated weights for policy 0, policy_version 491158 (0.0024) [2024-04-27 22:02:57,345][54818] Updated weights for policy 0, policy_version 491168 (0.0025) [2024-04-27 22:02:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.4, 300 sec: 58649.2). Total num frames: 8047378432. Throughput: 0: 58855.6. Samples: 952534340. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:02:59,254][54587] Avg episode reward: [(0, '0.474')] [2024-04-27 22:03:00,094][54818] Updated weights for policy 0, policy_version 491178 (0.0028) [2024-04-27 22:03:00,649][54798] Signal inference workers to stop experience collection... (14450 times) [2024-04-27 22:03:00,649][54798] Signal inference workers to resume experience collection... (14450 times) [2024-04-27 22:03:00,667][54818] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-04-27 22:03:00,667][54818] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-04-27 22:03:02,857][54818] Updated weights for policy 0, policy_version 491188 (0.0026) [2024-04-27 22:03:04,253][54587] Fps is (10 sec: 57343.1, 60 sec: 58163.2, 300 sec: 58593.6). Total num frames: 8047656960. Throughput: 0: 59024.3. Samples: 952884520. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:03:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:03:05,649][54818] Updated weights for policy 0, policy_version 491198 (0.0025) [2024-04-27 22:03:08,429][54818] Updated weights for policy 0, policy_version 491208 (0.0028) [2024-04-27 22:03:09,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58436.1, 300 sec: 58649.2). Total num frames: 8047968256. Throughput: 0: 58859.8. Samples: 953234920. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:03:09,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:03:11,248][54818] Updated weights for policy 0, policy_version 491218 (0.0027) [2024-04-27 22:03:14,047][54818] Updated weights for policy 0, policy_version 491228 (0.0025) [2024-04-27 22:03:14,253][54587] Fps is (10 sec: 62258.6, 60 sec: 58982.4, 300 sec: 58704.7). Total num frames: 8048279552. Throughput: 0: 58391.1. Samples: 953404180. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:03:14,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 22:03:16,735][54818] Updated weights for policy 0, policy_version 491238 (0.0024) [2024-04-27 22:03:19,253][54587] Fps is (10 sec: 60621.6, 60 sec: 58982.4, 300 sec: 58760.2). Total num frames: 8048574464. Throughput: 0: 58254.7. Samples: 953754560. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:03:19,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 22:03:19,619][54818] Updated weights for policy 0, policy_version 491248 (0.0025) [2024-04-27 22:03:22,293][54818] Updated weights for policy 0, policy_version 491258 (0.0027) [2024-04-27 22:03:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.2, 300 sec: 58815.8). Total num frames: 8048885760. Throughput: 0: 58841.3. Samples: 954118060. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:03:24,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:03:25,431][54818] Updated weights for policy 0, policy_version 491268 (0.0025) [2024-04-27 22:03:27,817][54818] Updated weights for policy 0, policy_version 491278 (0.0027) [2024-04-27 22:03:29,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58436.3, 300 sec: 58760.2). Total num frames: 8049164288. Throughput: 0: 58841.9. Samples: 954299580. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:03:29,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 22:03:31,115][54818] Updated weights for policy 0, policy_version 491288 (0.0027) [2024-04-27 22:03:33,306][54818] Updated weights for policy 0, policy_version 491298 (0.0027) [2024-04-27 22:03:34,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58163.1, 300 sec: 58760.2). Total num frames: 8049459200. Throughput: 0: 58592.4. Samples: 954646760. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-04-27 22:03:34,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 22:03:36,870][54818] Updated weights for policy 0, policy_version 491308 (0.0025) [2024-04-27 22:03:38,746][54818] Updated weights for policy 0, policy_version 491318 (0.0031) [2024-04-27 22:03:39,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 8049770496. Throughput: 0: 58325.6. Samples: 954991420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:03:39,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 22:03:42,504][54818] Updated weights for policy 0, policy_version 491328 (0.0026) [2024-04-27 22:03:43,724][54798] Signal inference workers to stop experience collection... (14500 times) [2024-04-27 22:03:43,729][54798] Signal inference workers to resume experience collection... 
(14500 times) [2024-04-27 22:03:43,745][54818] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-04-27 22:03:43,746][54818] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-04-27 22:03:44,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8050065408. Throughput: 0: 58741.3. Samples: 955177700. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:03:44,254][54587] Avg episode reward: [(0, '0.501')] [2024-04-27 22:03:44,530][54818] Updated weights for policy 0, policy_version 491338 (0.0027) [2024-04-27 22:03:48,074][54818] Updated weights for policy 0, policy_version 491348 (0.0026) [2024-04-27 22:03:49,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 8050343936. Throughput: 0: 59022.3. Samples: 955540520. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:03:49,253][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 22:03:49,270][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000491355_8050360320.pth... [2024-04-27 22:03:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000490493_8036237312.pth [2024-04-27 22:03:49,982][54818] Updated weights for policy 0, policy_version 491358 (0.0023) [2024-04-27 22:03:53,580][54818] Updated weights for policy 0, policy_version 491368 (0.0027) [2024-04-27 22:03:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.3, 300 sec: 58815.8). Total num frames: 8050638848. Throughput: 0: 58948.0. Samples: 955887580. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:03:54,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:03:55,611][54818] Updated weights for policy 0, policy_version 491378 (0.0025) [2024-04-27 22:03:59,062][54818] Updated weights for policy 0, policy_version 491388 (0.0025) [2024-04-27 22:03:59,253][54587] Fps is (10 sec: 55705.1, 60 sec: 58709.4, 300 sec: 58649.2). 
Total num frames: 8050900992. Throughput: 0: 58824.1. Samples: 956051260. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:03:59,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-27 22:04:01,622][54818] Updated weights for policy 0, policy_version 491398 (0.0026) [2024-04-27 22:04:04,253][54587] Fps is (10 sec: 55706.3, 60 sec: 58982.5, 300 sec: 58593.6). Total num frames: 8051195904. Throughput: 0: 58881.8. Samples: 956404240. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:04,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 22:04:04,528][54818] Updated weights for policy 0, policy_version 491408 (0.0026) [2024-04-27 22:04:07,007][54818] Updated weights for policy 0, policy_version 491418 (0.0025) [2024-04-27 22:04:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.4, 300 sec: 58593.6). Total num frames: 8051490816. Throughput: 0: 58888.9. Samples: 956768060. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:09,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 22:04:10,030][54818] Updated weights for policy 0, policy_version 491428 (0.0026) [2024-04-27 22:04:12,393][54818] Updated weights for policy 0, policy_version 491438 (0.0025) [2024-04-27 22:04:14,253][54587] Fps is (10 sec: 60619.9, 60 sec: 58709.3, 300 sec: 58649.2). Total num frames: 8051802112. Throughput: 0: 58649.7. Samples: 956938820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:14,254][54587] Avg episode reward: [(0, '0.479')] [2024-04-27 22:04:15,558][54818] Updated weights for policy 0, policy_version 491448 (0.0027) [2024-04-27 22:04:17,934][54818] Updated weights for policy 0, policy_version 491458 (0.0027) [2024-04-27 22:04:19,253][54587] Fps is (10 sec: 62259.3, 60 sec: 58982.3, 300 sec: 58649.2). Total num frames: 8052113408. Throughput: 0: 58873.4. Samples: 957296060. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:19,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 22:04:21,107][54818] Updated weights for policy 0, policy_version 491468 (0.0026) [2024-04-27 22:04:23,567][54818] Updated weights for policy 0, policy_version 491478 (0.0027) [2024-04-27 22:04:24,253][54587] Fps is (10 sec: 62259.1, 60 sec: 58982.4, 300 sec: 58704.7). Total num frames: 8052424704. Throughput: 0: 58971.0. Samples: 957645120. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:24,254][54587] Avg episode reward: [(0, '0.498')] [2024-04-27 22:04:26,688][54818] Updated weights for policy 0, policy_version 491488 (0.0027) [2024-04-27 22:04:28,958][54818] Updated weights for policy 0, policy_version 491498 (0.0024) [2024-04-27 22:04:29,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 8052719616. Throughput: 0: 58983.7. Samples: 957831960. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:29,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 22:04:32,073][54818] Updated weights for policy 0, policy_version 491508 (0.0026) [2024-04-27 22:04:32,351][54798] Signal inference workers to stop experience collection... (14550 times) [2024-04-27 22:04:32,351][54798] Signal inference workers to resume experience collection... (14550 times) [2024-04-27 22:04:32,374][54818] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-04-27 22:04:32,374][54818] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-04-27 22:04:34,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59255.6, 300 sec: 58871.3). Total num frames: 8053014528. Throughput: 0: 58516.9. Samples: 958173780. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:34,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 22:04:34,362][54818] Updated weights for policy 0, policy_version 491518 (0.0022) [2024-04-27 22:04:37,594][54818] Updated weights for policy 0, policy_version 491528 (0.0026) [2024-04-27 22:04:39,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 8053276672. Throughput: 0: 58726.3. Samples: 958530260. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:39,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 22:04:40,133][54818] Updated weights for policy 0, policy_version 491538 (0.0027) [2024-04-27 22:04:43,027][54818] Updated weights for policy 0, policy_version 491548 (0.0028) [2024-04-27 22:04:44,253][54587] Fps is (10 sec: 57343.0, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8053587968. Throughput: 0: 59238.1. Samples: 958716980. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:44,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 22:04:45,613][54818] Updated weights for policy 0, policy_version 491558 (0.0027) [2024-04-27 22:04:48,606][54818] Updated weights for policy 0, policy_version 491568 (0.0025) [2024-04-27 22:04:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8053882880. Throughput: 0: 59234.6. Samples: 959069800. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:49,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 22:04:51,103][54818] Updated weights for policy 0, policy_version 491578 (0.0032) [2024-04-27 22:04:54,044][54818] Updated weights for policy 0, policy_version 491588 (0.0024) [2024-04-27 22:04:54,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8054177792. Throughput: 0: 58952.1. Samples: 959420900. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:54,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 22:04:57,096][54818] Updated weights for policy 0, policy_version 491598 (0.0026) [2024-04-27 22:04:59,253][54587] Fps is (10 sec: 57343.2, 60 sec: 59255.3, 300 sec: 58760.2). Total num frames: 8054456320. Throughput: 0: 58868.4. Samples: 959587900. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-27 22:04:59,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 22:04:59,549][54818] Updated weights for policy 0, policy_version 491608 (0.0028) [2024-04-27 22:05:03,067][54818] Updated weights for policy 0, policy_version 491618 (0.0026) [2024-04-27 22:05:04,253][54587] Fps is (10 sec: 55705.7, 60 sec: 58982.3, 300 sec: 58704.7). Total num frames: 8054734848. Throughput: 0: 58998.3. Samples: 959950980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:04,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 22:05:05,163][54818] Updated weights for policy 0, policy_version 491628 (0.0024) [2024-04-27 22:05:08,479][54818] Updated weights for policy 0, policy_version 491638 (0.0027) [2024-04-27 22:05:09,253][54587] Fps is (10 sec: 57345.1, 60 sec: 58982.5, 300 sec: 58704.7). Total num frames: 8055029760. Throughput: 0: 59185.1. Samples: 960308440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:09,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:05:10,755][54818] Updated weights for policy 0, policy_version 491648 (0.0024) [2024-04-27 22:05:13,877][54818] Updated weights for policy 0, policy_version 491658 (0.0025) [2024-04-27 22:05:14,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.5, 300 sec: 58649.2). Total num frames: 8055324672. Throughput: 0: 58638.2. Samples: 960470680. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:14,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 22:05:16,389][54818] Updated weights for policy 0, policy_version 491668 (0.0025) [2024-04-27 22:05:19,187][54798] Signal inference workers to stop experience collection... (14600 times) [2024-04-27 22:05:19,188][54798] Signal inference workers to resume experience collection... (14600 times) [2024-04-27 22:05:19,201][54818] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-04-27 22:05:19,201][54818] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-04-27 22:05:19,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58436.4, 300 sec: 58704.7). Total num frames: 8055619584. Throughput: 0: 59040.9. Samples: 960830620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:19,253][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 22:05:19,517][54818] Updated weights for policy 0, policy_version 491678 (0.0025) [2024-04-27 22:05:21,860][54818] Updated weights for policy 0, policy_version 491688 (0.0027) [2024-04-27 22:05:24,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58436.3, 300 sec: 58704.7). Total num frames: 8055930880. Throughput: 0: 58923.0. Samples: 961181800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:24,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 22:05:25,064][54818] Updated weights for policy 0, policy_version 491698 (0.0026) [2024-04-27 22:05:27,359][54818] Updated weights for policy 0, policy_version 491708 (0.0025) [2024-04-27 22:05:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58436.3, 300 sec: 58704.7). Total num frames: 8056225792. Throughput: 0: 58698.9. Samples: 961358420. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:29,253][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 22:05:30,684][54818] Updated weights for policy 0, policy_version 491718 (0.0027) [2024-04-27 22:05:33,056][54818] Updated weights for policy 0, policy_version 491728 (0.0026) [2024-04-27 22:05:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8056520704. Throughput: 0: 58658.6. Samples: 961709440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:34,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 22:05:36,373][54818] Updated weights for policy 0, policy_version 491738 (0.0025) [2024-04-27 22:05:38,514][54818] Updated weights for policy 0, policy_version 491748 (0.0026) [2024-04-27 22:05:39,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8056832000. Throughput: 0: 58818.2. Samples: 962067720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:39,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 22:05:41,776][54818] Updated weights for policy 0, policy_version 491758 (0.0026) [2024-04-27 22:05:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 8057110528. Throughput: 0: 59283.8. Samples: 962255660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:44,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:05:44,274][54818] Updated weights for policy 0, policy_version 491768 (0.0026) [2024-04-27 22:05:47,233][54818] Updated weights for policy 0, policy_version 491778 (0.0027) [2024-04-27 22:05:49,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8057421824. Throughput: 0: 58896.8. Samples: 962601340. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:49,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 22:05:49,264][54587] No heartbeat for components: RolloutWorker_w4 (2497 seconds) [2024-04-27 22:05:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000491786_8057421824.pth... [2024-04-27 22:05:49,324][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000490923_8043282432.pth [2024-04-27 22:05:49,847][54818] Updated weights for policy 0, policy_version 491788 (0.0025) [2024-04-27 22:05:52,807][54818] Updated weights for policy 0, policy_version 491798 (0.0027) [2024-04-27 22:05:54,253][54587] Fps is (10 sec: 62259.3, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8057733120. Throughput: 0: 58733.3. Samples: 962951440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:54,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 22:05:55,293][54818] Updated weights for policy 0, policy_version 491808 (0.0023) [2024-04-27 22:05:58,331][54818] Updated weights for policy 0, policy_version 491818 (0.0026) [2024-04-27 22:05:59,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59528.7, 300 sec: 58926.9). Total num frames: 8058028032. Throughput: 0: 59262.6. Samples: 963137500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:05:59,253][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 22:06:01,005][54818] Updated weights for policy 0, policy_version 491828 (0.0026) [2024-04-27 22:06:03,786][54818] Updated weights for policy 0, policy_version 491838 (0.0027) [2024-04-27 22:06:04,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59528.6, 300 sec: 58871.4). Total num frames: 8058306560. Throughput: 0: 59096.0. Samples: 963489940. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:06:04,253][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 22:06:06,464][54818] Updated weights for policy 0, policy_version 491848 (0.0026) [2024-04-27 22:06:09,253][54587] Fps is (10 sec: 55705.1, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 8058585088. Throughput: 0: 59202.7. Samples: 963845920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:06:09,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 22:06:09,312][54818] Updated weights for policy 0, policy_version 491858 (0.0026) [2024-04-27 22:06:12,086][54818] Updated weights for policy 0, policy_version 491868 (0.0026) [2024-04-27 22:06:14,253][54587] Fps is (10 sec: 55705.2, 60 sec: 58982.4, 300 sec: 58760.2). Total num frames: 8058863616. Throughput: 0: 58976.9. Samples: 964012380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:06:14,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 22:06:14,871][54818] Updated weights for policy 0, policy_version 491878 (0.0026) [2024-04-27 22:06:17,516][54818] Updated weights for policy 0, policy_version 491888 (0.0024) [2024-04-27 22:06:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.3, 300 sec: 58704.7). Total num frames: 8059158528. Throughput: 0: 58980.5. Samples: 964363560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:06:19,263][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 22:06:19,778][54798] Signal inference workers to stop experience collection... (14650 times) [2024-04-27 22:06:19,818][54818] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-04-27 22:06:19,871][54798] Signal inference workers to resume experience collection... 
(14650 times) [2024-04-27 22:06:19,871][54818] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-04-27 22:06:20,397][54818] Updated weights for policy 0, policy_version 491898 (0.0026) [2024-04-27 22:06:23,472][54818] Updated weights for policy 0, policy_version 491908 (0.0026) [2024-04-27 22:06:24,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58436.3, 300 sec: 58649.2). Total num frames: 8059437056. Throughput: 0: 59010.7. Samples: 964723200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:06:24,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 22:06:25,815][54818] Updated weights for policy 0, policy_version 491918 (0.0023) [2024-04-27 22:06:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58436.2, 300 sec: 58704.7). Total num frames: 8059731968. Throughput: 0: 58456.9. Samples: 964886220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-27 22:06:29,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 22:06:29,287][54818] Updated weights for policy 0, policy_version 491928 (0.0026) [2024-04-27 22:06:31,298][54818] Updated weights for policy 0, policy_version 491938 (0.0027) [2024-04-27 22:06:34,253][54587] Fps is (10 sec: 62259.3, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8060059648. Throughput: 0: 58819.7. Samples: 965248220. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:06:34,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 22:06:34,908][54818] Updated weights for policy 0, policy_version 491948 (0.0026) [2024-04-27 22:06:36,822][54818] Updated weights for policy 0, policy_version 491958 (0.0024) [2024-04-27 22:06:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8060354560. Throughput: 0: 58810.1. Samples: 965597900. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:06:39,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 22:06:40,521][54818] Updated weights for policy 0, policy_version 491968 (0.0026) [2024-04-27 22:06:42,388][54818] Updated weights for policy 0, policy_version 491978 (0.0025) [2024-04-27 22:06:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 58927.0). Total num frames: 8060649472. Throughput: 0: 58664.0. Samples: 965777380. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:06:44,254][54587] Avg episode reward: [(0, '0.487')] [2024-04-27 22:06:46,028][54818] Updated weights for policy 0, policy_version 491988 (0.0027) [2024-04-27 22:06:47,892][54818] Updated weights for policy 0, policy_version 491998 (0.0027) [2024-04-27 22:06:49,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58709.3, 300 sec: 58926.8). Total num frames: 8060944384. Throughput: 0: 58465.0. Samples: 966120880. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:06:49,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 22:06:51,456][54818] Updated weights for policy 0, policy_version 492008 (0.0029) [2024-04-27 22:06:53,437][54818] Updated weights for policy 0, policy_version 492018 (0.0022) [2024-04-27 22:06:54,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58163.2, 300 sec: 58871.3). Total num frames: 8061222912. Throughput: 0: 58376.1. Samples: 966472840. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:06:54,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 22:06:57,129][54818] Updated weights for policy 0, policy_version 492028 (0.0025) [2024-04-27 22:06:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58436.1, 300 sec: 58871.3). Total num frames: 8061534208. Throughput: 0: 59042.4. Samples: 966669300. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:06:59,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 22:06:59,420][54818] Updated weights for policy 0, policy_version 492038 (0.0023) [2024-04-27 22:07:03,064][54818] Updated weights for policy 0, policy_version 492048 (0.0026) [2024-04-27 22:07:03,550][54798] Signal inference workers to stop experience collection... (14700 times) [2024-04-27 22:07:03,581][54818] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-04-27 22:07:03,608][54798] Signal inference workers to resume experience collection... (14700 times) [2024-04-27 22:07:03,609][54818] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-04-27 22:07:04,253][54587] Fps is (10 sec: 62258.9, 60 sec: 58982.3, 300 sec: 58926.8). Total num frames: 8061845504. Throughput: 0: 58928.4. Samples: 967015340. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:04,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 22:07:05,041][54818] Updated weights for policy 0, policy_version 492058 (0.0029) [2024-04-27 22:07:08,514][54818] Updated weights for policy 0, policy_version 492068 (0.0025) [2024-04-27 22:07:09,253][54587] Fps is (10 sec: 57345.4, 60 sec: 58709.4, 300 sec: 58871.4). Total num frames: 8062107648. Throughput: 0: 58767.2. Samples: 967367720. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:09,253][54587] Avg episode reward: [(0, '0.505')] [2024-04-27 22:07:10,626][54818] Updated weights for policy 0, policy_version 492078 (0.0024) [2024-04-27 22:07:13,931][54818] Updated weights for policy 0, policy_version 492088 (0.0027) [2024-04-27 22:07:14,253][54587] Fps is (10 sec: 54067.4, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8062386176. Throughput: 0: 58733.8. Samples: 967529240. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:14,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 22:07:16,205][54818] Updated weights for policy 0, policy_version 492098 (0.0025) [2024-04-27 22:07:19,253][54587] Fps is (10 sec: 54066.9, 60 sec: 58163.2, 300 sec: 58649.2). Total num frames: 8062648320. Throughput: 0: 58730.2. Samples: 967891080. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:19,254][54587] Avg episode reward: [(0, '0.486')] [2024-04-27 22:07:19,531][54818] Updated weights for policy 0, policy_version 492108 (0.0025) [2024-04-27 22:07:21,666][54818] Updated weights for policy 0, policy_version 492118 (0.0027) [2024-04-27 22:07:24,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.4, 300 sec: 58704.7). Total num frames: 8062976000. Throughput: 0: 58831.2. Samples: 968245300. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:24,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 22:07:24,991][54818] Updated weights for policy 0, policy_version 492128 (0.0027) [2024-04-27 22:07:27,215][54818] Updated weights for policy 0, policy_version 492138 (0.0027) [2024-04-27 22:07:29,253][54587] Fps is (10 sec: 62258.8, 60 sec: 58982.4, 300 sec: 58649.2). Total num frames: 8063270912. Throughput: 0: 58593.2. Samples: 968414080. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:29,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 22:07:30,442][54818] Updated weights for policy 0, policy_version 492148 (0.0027) [2024-04-27 22:07:32,707][54818] Updated weights for policy 0, policy_version 492158 (0.0027) [2024-04-27 22:07:34,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58163.2, 300 sec: 58593.6). Total num frames: 8063549440. Throughput: 0: 58780.2. Samples: 968765980. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:34,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 22:07:35,835][54818] Updated weights for policy 0, policy_version 492168 (0.0023) [2024-04-27 22:07:38,167][54818] Updated weights for policy 0, policy_version 492178 (0.0027) [2024-04-27 22:07:39,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58709.4, 300 sec: 58760.3). Total num frames: 8063877120. Throughput: 0: 58865.7. Samples: 969121800. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:39,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 22:07:41,317][54818] Updated weights for policy 0, policy_version 492188 (0.0025) [2024-04-27 22:07:43,605][54798] Signal inference workers to stop experience collection... (14750 times) [2024-04-27 22:07:43,606][54798] Signal inference workers to resume experience collection... (14750 times) [2024-04-27 22:07:43,625][54818] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-04-27 22:07:43,625][54818] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-04-27 22:07:43,725][54818] Updated weights for policy 0, policy_version 492198 (0.0027) [2024-04-27 22:07:44,253][54587] Fps is (10 sec: 63897.9, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8064188416. Throughput: 0: 58469.6. Samples: 969300420. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:44,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 22:07:46,903][54818] Updated weights for policy 0, policy_version 492208 (0.0025) [2024-04-27 22:07:49,254][54587] Fps is (10 sec: 60617.3, 60 sec: 58982.0, 300 sec: 58982.2). Total num frames: 8064483328. Throughput: 0: 58689.9. Samples: 969656420. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:49,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:07:49,283][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000492218_8064499712.pth... 
[2024-04-27 22:07:49,286][54818] Updated weights for policy 0, policy_version 492218 (0.0025) [2024-04-27 22:07:49,332][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000491355_8050360320.pth [2024-04-27 22:07:52,513][54818] Updated weights for policy 0, policy_version 492228 (0.0024) [2024-04-27 22:07:54,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8064778240. Throughput: 0: 58619.0. Samples: 970005580. Policy #0 lag: (min: 2.0, avg: 8.8, max: 22.0) [2024-04-27 22:07:54,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 22:07:54,904][54818] Updated weights for policy 0, policy_version 492238 (0.0026) [2024-04-27 22:07:58,032][54818] Updated weights for policy 0, policy_version 492248 (0.0025) [2024-04-27 22:07:59,253][54587] Fps is (10 sec: 58986.2, 60 sec: 58982.6, 300 sec: 59037.9). Total num frames: 8065073152. Throughput: 0: 59212.9. Samples: 970193820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:07:59,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 22:08:01,109][54818] Updated weights for policy 0, policy_version 492258 (0.0027) [2024-04-27 22:08:03,530][54818] Updated weights for policy 0, policy_version 492268 (0.0025) [2024-04-27 22:08:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58436.2, 300 sec: 58926.9). Total num frames: 8065351680. Throughput: 0: 58859.4. Samples: 970539760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:04,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 22:08:06,625][54818] Updated weights for policy 0, policy_version 492278 (0.0026) [2024-04-27 22:08:09,217][54818] Updated weights for policy 0, policy_version 492288 (0.0027) [2024-04-27 22:08:09,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8065646592. Throughput: 0: 58803.1. Samples: 970891440. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:09,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:08:12,187][54818] Updated weights for policy 0, policy_version 492298 (0.0025) [2024-04-27 22:08:14,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8065941504. Throughput: 0: 59050.2. Samples: 971071340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:14,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-27 22:08:14,683][54818] Updated weights for policy 0, policy_version 492308 (0.0026) [2024-04-27 22:08:17,745][54818] Updated weights for policy 0, policy_version 492318 (0.0025) [2024-04-27 22:08:19,253][54587] Fps is (10 sec: 58981.5, 60 sec: 59801.4, 300 sec: 58815.8). Total num frames: 8066236416. Throughput: 0: 59170.0. Samples: 971428640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:19,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 22:08:20,246][54818] Updated weights for policy 0, policy_version 492328 (0.0025) [2024-04-27 22:08:23,389][54818] Updated weights for policy 0, policy_version 492338 (0.0026) [2024-04-27 22:08:24,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8066531328. Throughput: 0: 59011.4. Samples: 971777320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:24,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 22:08:25,823][54818] Updated weights for policy 0, policy_version 492348 (0.0026) [2024-04-27 22:08:28,865][54818] Updated weights for policy 0, policy_version 492358 (0.0026) [2024-04-27 22:08:29,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 8066826240. Throughput: 0: 58875.1. Samples: 971949800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:29,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 22:08:31,272][54818] Updated weights for policy 0, policy_version 492368 (0.0025) [2024-04-27 22:08:34,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.4, 300 sec: 58760.2). Total num frames: 8067104768. Throughput: 0: 58819.4. Samples: 972303260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:34,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-27 22:08:34,262][54818] Updated weights for policy 0, policy_version 492378 (0.0028) [2024-04-27 22:08:36,192][54798] Signal inference workers to stop experience collection... (14800 times) [2024-04-27 22:08:36,192][54798] Signal inference workers to resume experience collection... (14800 times) [2024-04-27 22:08:36,202][54818] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-04-27 22:08:36,220][54818] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-04-27 22:08:36,833][54818] Updated weights for policy 0, policy_version 492388 (0.0025) [2024-04-27 22:08:39,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8067399680. Throughput: 0: 59135.1. Samples: 972666660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:39,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 22:08:39,701][54818] Updated weights for policy 0, policy_version 492398 (0.0026) [2024-04-27 22:08:42,365][54818] Updated weights for policy 0, policy_version 492408 (0.0026) [2024-04-27 22:08:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8067694592. Throughput: 0: 58754.6. Samples: 972837780. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:44,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 22:08:45,387][54818] Updated weights for policy 0, policy_version 492418 (0.0025) [2024-04-27 22:08:48,058][54818] Updated weights for policy 0, policy_version 492428 (0.0025) [2024-04-27 22:08:49,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58709.8, 300 sec: 58871.3). Total num frames: 8068005888. Throughput: 0: 58838.2. Samples: 973187480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:49,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:08:49,265][54587] No heartbeat for components: RolloutWorker_w4 (2677 seconds) [2024-04-27 22:08:50,903][54818] Updated weights for policy 0, policy_version 492438 (0.0024) [2024-04-27 22:08:53,637][54818] Updated weights for policy 0, policy_version 492448 (0.0028) [2024-04-27 22:08:54,254][54587] Fps is (10 sec: 62258.4, 60 sec: 58982.2, 300 sec: 59037.9). Total num frames: 8068317184. Throughput: 0: 58997.1. Samples: 973546320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:54,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 22:08:56,213][54818] Updated weights for policy 0, policy_version 492458 (0.0026) [2024-04-27 22:08:59,098][54818] Updated weights for policy 0, policy_version 492468 (0.0027) [2024-04-27 22:08:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 8068595712. Throughput: 0: 58994.6. Samples: 973726100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:08:59,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:09:01,880][54818] Updated weights for policy 0, policy_version 492478 (0.0026) [2024-04-27 22:09:04,253][54587] Fps is (10 sec: 57345.5, 60 sec: 58982.6, 300 sec: 58982.4). Total num frames: 8068890624. Throughput: 0: 58859.9. Samples: 974077320. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:09:04,253][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 22:09:04,704][54818] Updated weights for policy 0, policy_version 492488 (0.0027) [2024-04-27 22:09:07,593][54818] Updated weights for policy 0, policy_version 492498 (0.0025) [2024-04-27 22:09:09,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.3, 300 sec: 58926.9). Total num frames: 8069185536. Throughput: 0: 58920.1. Samples: 974428720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:09:09,254][54587] Avg episode reward: [(0, '0.699')] [2024-04-27 22:09:10,143][54818] Updated weights for policy 0, policy_version 492508 (0.0026) [2024-04-27 22:09:13,028][54818] Updated weights for policy 0, policy_version 492518 (0.0026) [2024-04-27 22:09:14,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8069480448. Throughput: 0: 59000.4. Samples: 974604820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:09:14,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 22:09:15,942][54818] Updated weights for policy 0, policy_version 492528 (0.0027) [2024-04-27 22:09:18,750][54818] Updated weights for policy 0, policy_version 492538 (0.0026) [2024-04-27 22:09:19,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.6, 300 sec: 58815.8). Total num frames: 8069775360. Throughput: 0: 58936.5. Samples: 974955400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-04-27 22:09:19,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:09:21,430][54818] Updated weights for policy 0, policy_version 492548 (0.0027) [2024-04-27 22:09:24,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 8070053888. Throughput: 0: 58756.9. Samples: 975310720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:09:24,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:09:24,319][54818] Updated weights for policy 0, policy_version 492558 (0.0026) [2024-04-27 22:09:27,338][54818] Updated weights for policy 0, policy_version 492568 (0.0026) [2024-04-27 22:09:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8070348800. Throughput: 0: 58824.5. Samples: 975484880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:09:29,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 22:09:29,743][54818] Updated weights for policy 0, policy_version 492578 (0.0027) [2024-04-27 22:09:32,982][54818] Updated weights for policy 0, policy_version 492588 (0.0026) [2024-04-27 22:09:34,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 8070643712. Throughput: 0: 58889.1. Samples: 975837480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:09:34,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 22:09:35,262][54818] Updated weights for policy 0, policy_version 492598 (0.0027) [2024-04-27 22:09:38,482][54818] Updated weights for policy 0, policy_version 492608 (0.0026) [2024-04-27 22:09:38,490][54798] Signal inference workers to stop experience collection... (14850 times) [2024-04-27 22:09:38,494][54798] Signal inference workers to resume experience collection... (14850 times) [2024-04-27 22:09:38,517][54818] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-04-27 22:09:38,517][54818] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-04-27 22:09:39,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8070938624. Throughput: 0: 58732.6. Samples: 976189280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:09:39,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 22:09:40,940][54818] Updated weights for policy 0, policy_version 492618 (0.0027) [2024-04-27 22:09:44,073][54818] Updated weights for policy 0, policy_version 492628 (0.0027) [2024-04-27 22:09:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.6, 300 sec: 58815.8). Total num frames: 8071233536. Throughput: 0: 58656.6. Samples: 976365640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:09:44,253][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 22:09:46,613][54818] Updated weights for policy 0, policy_version 492638 (0.0025) [2024-04-27 22:09:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58436.3, 300 sec: 58760.2). Total num frames: 8071512064. Throughput: 0: 58612.7. Samples: 976714900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:09:49,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 22:09:49,307][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000492647_8071528448.pth... [2024-04-27 22:09:49,352][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000491786_8057421824.pth [2024-04-27 22:09:49,556][54818] Updated weights for policy 0, policy_version 492648 (0.0027) [2024-04-27 22:09:52,379][54818] Updated weights for policy 0, policy_version 492658 (0.0026) [2024-04-27 22:09:54,253][54587] Fps is (10 sec: 57343.1, 60 sec: 58163.3, 300 sec: 58815.8). Total num frames: 8071806976. Throughput: 0: 58656.5. Samples: 977068260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:09:54,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 22:09:55,044][54818] Updated weights for policy 0, policy_version 492668 (0.0025) [2024-04-27 22:09:57,920][54818] Updated weights for policy 0, policy_version 492678 (0.0028) [2024-04-27 22:09:59,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58436.4, 300 sec: 58871.3). 
Total num frames: 8072101888. Throughput: 0: 58670.7. Samples: 977245000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:09:59,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:10:00,803][54818] Updated weights for policy 0, policy_version 492688 (0.0027) [2024-04-27 22:10:03,447][54818] Updated weights for policy 0, policy_version 492698 (0.0025) [2024-04-27 22:10:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58436.1, 300 sec: 58871.3). Total num frames: 8072396800. Throughput: 0: 58684.8. Samples: 977596220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:04,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 22:10:06,271][54818] Updated weights for policy 0, policy_version 492708 (0.0027) [2024-04-27 22:10:08,860][54818] Updated weights for policy 0, policy_version 492718 (0.0026) [2024-04-27 22:10:09,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.5, 300 sec: 58926.9). Total num frames: 8072708096. Throughput: 0: 58762.0. Samples: 977955000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:09,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 22:10:11,668][54818] Updated weights for policy 0, policy_version 492728 (0.0025) [2024-04-27 22:10:14,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8073003008. Throughput: 0: 58728.5. Samples: 978127660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:14,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 22:10:14,350][54818] Updated weights for policy 0, policy_version 492738 (0.0028) [2024-04-27 22:10:17,129][54818] Updated weights for policy 0, policy_version 492748 (0.0026) [2024-04-27 22:10:19,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8073297920. Throughput: 0: 58832.8. Samples: 978484960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:10:19,927][54818] Updated weights for policy 0, policy_version 492758 (0.0026) [2024-04-27 22:10:22,767][54818] Updated weights for policy 0, policy_version 492768 (0.0025) [2024-04-27 22:10:24,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8073592832. Throughput: 0: 58742.3. Samples: 978832680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:24,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 22:10:25,577][54818] Updated weights for policy 0, policy_version 492778 (0.0026) [2024-04-27 22:10:28,227][54818] Updated weights for policy 0, policy_version 492788 (0.0026) [2024-04-27 22:10:29,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59255.4, 300 sec: 58926.8). Total num frames: 8073904128. Throughput: 0: 58874.0. Samples: 979014980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:29,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 22:10:30,118][54798] Signal inference workers to stop experience collection... (14900 times) [2024-04-27 22:10:30,149][54818] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-04-27 22:10:30,178][54798] Signal inference workers to resume experience collection... (14900 times) [2024-04-27 22:10:30,178][54818] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-04-27 22:10:31,088][54818] Updated weights for policy 0, policy_version 492798 (0.0024) [2024-04-27 22:10:33,899][54818] Updated weights for policy 0, policy_version 492808 (0.0027) [2024-04-27 22:10:34,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59255.3, 300 sec: 58871.3). Total num frames: 8074199040. Throughput: 0: 59064.3. Samples: 979372800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:34,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 22:10:36,740][54818] Updated weights for policy 0, policy_version 492818 (0.0025) [2024-04-27 22:10:39,253][54587] Fps is (10 sec: 55706.1, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 8074461184. Throughput: 0: 58957.9. Samples: 979721360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:39,254][54587] Avg episode reward: [(0, '0.717')] [2024-04-27 22:10:39,448][54818] Updated weights for policy 0, policy_version 492828 (0.0027) [2024-04-27 22:10:42,380][54818] Updated weights for policy 0, policy_version 492838 (0.0027) [2024-04-27 22:10:44,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 8074772480. Throughput: 0: 58953.2. Samples: 979897900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:44,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 22:10:44,939][54818] Updated weights for policy 0, policy_version 492848 (0.0027) [2024-04-27 22:10:47,897][54818] Updated weights for policy 0, policy_version 492858 (0.0025) [2024-04-27 22:10:49,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.3, 300 sec: 58704.7). Total num frames: 8075051008. Throughput: 0: 59058.6. Samples: 980253860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:10:49,254][54587] Avg episode reward: [(0, '0.697')] [2024-04-27 22:10:50,424][54818] Updated weights for policy 0, policy_version 492868 (0.0025) [2024-04-27 22:10:53,454][54818] Updated weights for policy 0, policy_version 492878 (0.0026) [2024-04-27 22:10:54,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58982.5, 300 sec: 58704.7). Total num frames: 8075345920. Throughput: 0: 58983.1. Samples: 980609240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:10:54,253][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 22:10:56,171][54818] Updated weights for policy 0, policy_version 492888 (0.0027) [2024-04-27 22:10:59,070][54818] Updated weights for policy 0, policy_version 492898 (0.0027) [2024-04-27 22:10:59,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 8075657216. Throughput: 0: 59009.3. Samples: 980783080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:10:59,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 22:11:01,698][54818] Updated weights for policy 0, policy_version 492908 (0.0026) [2024-04-27 22:11:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.6, 300 sec: 58871.3). Total num frames: 8075952128. Throughput: 0: 58983.6. Samples: 981139220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:04,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 22:11:04,488][54818] Updated weights for policy 0, policy_version 492918 (0.0024) [2024-04-27 22:11:07,329][54818] Updated weights for policy 0, policy_version 492928 (0.0026) [2024-04-27 22:11:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8076230656. Throughput: 0: 58931.2. Samples: 981484580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:09,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 22:11:09,922][54818] Updated weights for policy 0, policy_version 492938 (0.0026) [2024-04-27 22:11:12,893][54818] Updated weights for policy 0, policy_version 492948 (0.0025) [2024-04-27 22:11:14,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58982.3, 300 sec: 58926.8). Total num frames: 8076541952. Throughput: 0: 58936.5. Samples: 981667120. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:14,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 22:11:15,666][54818] Updated weights for policy 0, policy_version 492958 (0.0025) [2024-04-27 22:11:18,478][54818] Updated weights for policy 0, policy_version 492968 (0.0026) [2024-04-27 22:11:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8076836864. Throughput: 0: 58772.6. Samples: 982017560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:19,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 22:11:21,128][54818] Updated weights for policy 0, policy_version 492978 (0.0027) [2024-04-27 22:11:24,147][54818] Updated weights for policy 0, policy_version 492988 (0.0027) [2024-04-27 22:11:24,253][54587] Fps is (10 sec: 57344.9, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8077115392. Throughput: 0: 59025.0. Samples: 982377480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:24,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:11:26,797][54818] Updated weights for policy 0, policy_version 492998 (0.0028) [2024-04-27 22:11:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58436.4, 300 sec: 58815.8). Total num frames: 8077410304. Throughput: 0: 58937.9. Samples: 982550100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:29,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 22:11:29,599][54818] Updated weights for policy 0, policy_version 493008 (0.0027) [2024-04-27 22:11:32,429][54818] Updated weights for policy 0, policy_version 493018 (0.0025) [2024-04-27 22:11:34,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58436.4, 300 sec: 58815.8). Total num frames: 8077705216. Throughput: 0: 58822.4. Samples: 982900860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:34,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:11:35,104][54818] Updated weights for policy 0, policy_version 493028 (0.0027) [2024-04-27 22:11:37,933][54818] Updated weights for policy 0, policy_version 493038 (0.0027) [2024-04-27 22:11:39,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8078016512. Throughput: 0: 58697.1. Samples: 983250620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:39,254][54587] Avg episode reward: [(0, '0.699')] [2024-04-27 22:11:40,583][54818] Updated weights for policy 0, policy_version 493048 (0.0027) [2024-04-27 22:11:43,408][54818] Updated weights for policy 0, policy_version 493058 (0.0026) [2024-04-27 22:11:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 8078295040. Throughput: 0: 58895.1. Samples: 983433360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:44,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 22:11:46,118][54818] Updated weights for policy 0, policy_version 493068 (0.0023) [2024-04-27 22:11:48,826][54818] Updated weights for policy 0, policy_version 493078 (0.0026) [2024-04-27 22:11:49,253][54587] Fps is (10 sec: 57344.9, 60 sec: 58982.6, 300 sec: 58871.3). Total num frames: 8078589952. Throughput: 0: 58782.7. Samples: 983784440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:49,253][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 22:11:49,270][54587] No heartbeat for components: RolloutWorker_w4 (2857 seconds) [2024-04-27 22:11:49,344][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000493079_8078606336.pth... 
[2024-04-27 22:11:49,393][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000492218_8064499712.pth [2024-04-27 22:11:51,925][54818] Updated weights for policy 0, policy_version 493088 (0.0027) [2024-04-27 22:11:52,358][54798] Signal inference workers to stop experience collection... (14950 times) [2024-04-27 22:11:52,359][54798] Signal inference workers to resume experience collection... (14950 times) [2024-04-27 22:11:52,370][54818] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-04-27 22:11:52,370][54818] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-04-27 22:11:54,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 8078884864. Throughput: 0: 59017.3. Samples: 984140360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:54,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 22:11:54,523][54818] Updated weights for policy 0, policy_version 493098 (0.0027) [2024-04-27 22:11:57,330][54818] Updated weights for policy 0, policy_version 493108 (0.0026) [2024-04-27 22:11:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58436.3, 300 sec: 58704.7). Total num frames: 8079163392. Throughput: 0: 58783.7. Samples: 984312380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:11:59,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 22:12:00,120][54818] Updated weights for policy 0, policy_version 493118 (0.0024) [2024-04-27 22:12:02,974][54818] Updated weights for policy 0, policy_version 493128 (0.0027) [2024-04-27 22:12:04,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 8079474688. Throughput: 0: 58944.8. Samples: 984670080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:12:04,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 22:12:05,664][54818] Updated weights for policy 0, policy_version 493138 (0.0025) [2024-04-27 22:12:08,506][54818] Updated weights for policy 0, policy_version 493148 (0.0025) [2024-04-27 22:12:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8079769600. Throughput: 0: 58782.2. Samples: 985022680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:12:09,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 22:12:11,154][54818] Updated weights for policy 0, policy_version 493158 (0.0026) [2024-04-27 22:12:13,920][54818] Updated weights for policy 0, policy_version 493168 (0.0026) [2024-04-27 22:12:14,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8080064512. Throughput: 0: 58837.8. Samples: 985197800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-27 22:12:14,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 22:12:16,797][54818] Updated weights for policy 0, policy_version 493178 (0.0025) [2024-04-27 22:12:19,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8080359424. Throughput: 0: 59000.8. Samples: 985555900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:12:19,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 22:12:19,574][54818] Updated weights for policy 0, policy_version 493188 (0.0024) [2024-04-27 22:12:22,447][54818] Updated weights for policy 0, policy_version 493198 (0.0024) [2024-04-27 22:12:24,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.3, 300 sec: 58982.4). Total num frames: 8080670720. Throughput: 0: 58943.6. Samples: 985903080. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:12:24,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:12:25,395][54818] Updated weights for policy 0, policy_version 493208 (0.0026) [2024-04-27 22:12:28,007][54818] Updated weights for policy 0, policy_version 493218 (0.0025) [2024-04-27 22:12:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8080949248. Throughput: 0: 58814.2. Samples: 986080000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:12:29,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 22:12:30,767][54818] Updated weights for policy 0, policy_version 493228 (0.0024) [2024-04-27 22:12:33,587][54818] Updated weights for policy 0, policy_version 493238 (0.0027) [2024-04-27 22:12:34,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59255.6, 300 sec: 58926.9). Total num frames: 8081260544. Throughput: 0: 58868.9. Samples: 986433540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:12:34,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 22:12:36,556][54818] Updated weights for policy 0, policy_version 493248 (0.0025) [2024-04-27 22:12:39,139][54818] Updated weights for policy 0, policy_version 493258 (0.0028) [2024-04-27 22:12:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 8081555456. Throughput: 0: 58836.8. Samples: 986788020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:12:39,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 22:12:42,216][54818] Updated weights for policy 0, policy_version 493268 (0.0025) [2024-04-27 22:12:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.5, 300 sec: 58815.9). Total num frames: 8081833984. Throughput: 0: 58904.9. Samples: 986963100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:12:44,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 22:12:44,574][54818] Updated weights for policy 0, policy_version 493278 (0.0025) [2024-04-27 22:12:47,660][54818] Updated weights for policy 0, policy_version 493288 (0.0026) [2024-04-27 22:12:49,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.2, 300 sec: 58815.8). Total num frames: 8082128896. Throughput: 0: 58970.6. Samples: 987323760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:12:49,254][54587] Avg episode reward: [(0, '0.685')] [2024-04-27 22:12:50,287][54818] Updated weights for policy 0, policy_version 493298 (0.0025) [2024-04-27 22:12:53,190][54818] Updated weights for policy 0, policy_version 493308 (0.0025) [2024-04-27 22:12:53,735][54798] Signal inference workers to stop experience collection... (15000 times) [2024-04-27 22:12:53,779][54818] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-04-27 22:12:53,793][54798] Signal inference workers to resume experience collection... (15000 times) [2024-04-27 22:12:53,799][54818] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-04-27 22:12:54,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 8082440192. Throughput: 0: 58933.8. Samples: 987674700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:12:54,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 22:12:55,653][54818] Updated weights for policy 0, policy_version 493318 (0.0028) [2024-04-27 22:12:58,655][54818] Updated weights for policy 0, policy_version 493328 (0.0025) [2024-04-27 22:12:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59528.4, 300 sec: 58926.9). Total num frames: 8082735104. Throughput: 0: 58902.2. Samples: 987848400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:12:59,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 22:13:01,272][54818] Updated weights for policy 0, policy_version 493338 (0.0026) [2024-04-27 22:13:04,132][54818] Updated weights for policy 0, policy_version 493348 (0.0027) [2024-04-27 22:13:04,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 8083013632. Throughput: 0: 58832.6. Samples: 988203360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:13:04,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 22:13:06,894][54818] Updated weights for policy 0, policy_version 493358 (0.0027) [2024-04-27 22:13:09,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8083292160. Throughput: 0: 58963.7. Samples: 988556440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:13:09,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 22:13:09,506][54818] Updated weights for policy 0, policy_version 493368 (0.0025) [2024-04-27 22:13:12,349][54818] Updated weights for policy 0, policy_version 493378 (0.0025) [2024-04-27 22:13:14,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58709.2, 300 sec: 58815.8). Total num frames: 8083587072. Throughput: 0: 59040.9. Samples: 988736840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:13:14,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:13:15,212][54818] Updated weights for policy 0, policy_version 493388 (0.0025) [2024-04-27 22:13:17,916][54818] Updated weights for policy 0, policy_version 493398 (0.0023) [2024-04-27 22:13:19,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8083898368. Throughput: 0: 58844.7. Samples: 989081560. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:13:19,254][54587] Avg episode reward: [(0, '0.677')] [2024-04-27 22:13:20,725][54818] Updated weights for policy 0, policy_version 493408 (0.0025) [2024-04-27 22:13:23,543][54818] Updated weights for policy 0, policy_version 493418 (0.0026) [2024-04-27 22:13:24,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8084193280. Throughput: 0: 58834.6. Samples: 989435580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:13:24,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 22:13:26,150][54818] Updated weights for policy 0, policy_version 493428 (0.0025) [2024-04-27 22:13:29,029][54818] Updated weights for policy 0, policy_version 493438 (0.0027) [2024-04-27 22:13:29,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8084488192. Throughput: 0: 58850.7. Samples: 989611380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:13:29,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 22:13:31,640][54818] Updated weights for policy 0, policy_version 493448 (0.0027) [2024-04-27 22:13:34,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58709.2, 300 sec: 58926.9). Total num frames: 8084783104. Throughput: 0: 58921.0. Samples: 989975200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:13:34,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 22:13:34,491][54818] Updated weights for policy 0, policy_version 493458 (0.0025) [2024-04-27 22:13:37,692][54818] Updated weights for policy 0, policy_version 493468 (0.0027) [2024-04-27 22:13:39,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8085078016. Throughput: 0: 58789.7. Samples: 990320240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 22:13:39,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 22:13:40,243][54818] Updated weights for policy 0, policy_version 493478 (0.0027) [2024-04-27 22:13:40,925][54798] Signal inference workers to stop experience collection... (15050 times) [2024-04-27 22:13:40,955][54818] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-04-27 22:13:40,982][54798] Signal inference workers to resume experience collection... (15050 times) [2024-04-27 22:13:40,983][54818] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-04-27 22:13:43,320][54818] Updated weights for policy 0, policy_version 493488 (0.0028) [2024-04-27 22:13:44,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8085356544. Throughput: 0: 58863.6. Samples: 990497260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:13:44,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:13:46,261][54818] Updated weights for policy 0, policy_version 493498 (0.0027) [2024-04-27 22:13:48,779][54818] Updated weights for policy 0, policy_version 493508 (0.0027) [2024-04-27 22:13:49,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.4, 300 sec: 58760.3). Total num frames: 8085651456. Throughput: 0: 58643.9. Samples: 990842340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:13:49,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 22:13:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000493509_8085651456.pth... [2024-04-27 22:13:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000492647_8071528448.pth [2024-04-27 22:13:51,916][54818] Updated weights for policy 0, policy_version 493518 (0.0027) [2024-04-27 22:13:54,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8085946368. Throughput: 0: 58732.0. Samples: 991199380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:13:54,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 22:13:54,466][54818] Updated weights for policy 0, policy_version 493528 (0.0027) [2024-04-27 22:13:57,375][54818] Updated weights for policy 0, policy_version 493538 (0.0026) [2024-04-27 22:13:59,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8086241280. Throughput: 0: 58643.2. Samples: 991375780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:13:59,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 22:13:59,990][54818] Updated weights for policy 0, policy_version 493548 (0.0024) [2024-04-27 22:14:03,029][54818] Updated weights for policy 0, policy_version 493558 (0.0027) [2024-04-27 22:14:04,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8086552576. Throughput: 0: 58915.1. Samples: 991732740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:04,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 22:14:05,610][54818] Updated weights for policy 0, policy_version 493568 (0.0025) [2024-04-27 22:14:08,488][54818] Updated weights for policy 0, policy_version 493578 (0.0026) [2024-04-27 22:14:09,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 8086847488. Throughput: 0: 58900.6. Samples: 992086100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:09,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 22:14:11,072][54818] Updated weights for policy 0, policy_version 493588 (0.0025) [2024-04-27 22:14:13,994][54818] Updated weights for policy 0, policy_version 493598 (0.0027) [2024-04-27 22:14:14,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 8087126016. Throughput: 0: 58870.1. Samples: 992260540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:14,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 22:14:16,604][54818] Updated weights for policy 0, policy_version 493608 (0.0025) [2024-04-27 22:14:19,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58436.4, 300 sec: 58815.8). Total num frames: 8087404544. Throughput: 0: 58623.2. Samples: 992613240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:19,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 22:14:19,468][54818] Updated weights for policy 0, policy_version 493618 (0.0026) [2024-04-27 22:14:22,169][54818] Updated weights for policy 0, policy_version 493628 (0.0025) [2024-04-27 22:14:24,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8087715840. Throughput: 0: 58820.3. Samples: 992967160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:24,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:14:25,065][54818] Updated weights for policy 0, policy_version 493638 (0.0025) [2024-04-27 22:14:25,066][54798] Signal inference workers to stop experience collection... (15100 times) [2024-04-27 22:14:25,067][54798] Signal inference workers to resume experience collection... (15100 times) [2024-04-27 22:14:25,094][54818] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-04-27 22:14:25,094][54818] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-04-27 22:14:27,761][54818] Updated weights for policy 0, policy_version 493648 (0.0026) [2024-04-27 22:14:29,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8087994368. Throughput: 0: 58913.8. Samples: 993148380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:29,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 22:14:30,415][54818] Updated weights for policy 0, policy_version 493658 (0.0027) [2024-04-27 22:14:33,210][54818] Updated weights for policy 0, policy_version 493668 (0.0026) [2024-04-27 22:14:34,253][54587] Fps is (10 sec: 60621.6, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8088322048. Throughput: 0: 59047.6. Samples: 993499480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:34,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 22:14:35,835][54818] Updated weights for policy 0, policy_version 493678 (0.0029) [2024-04-27 22:14:38,670][54818] Updated weights for policy 0, policy_version 493688 (0.0026) [2024-04-27 22:14:39,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58982.4, 300 sec: 58926.8). Total num frames: 8088616960. Throughput: 0: 59056.4. Samples: 993856920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:39,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 22:14:41,398][54818] Updated weights for policy 0, policy_version 493698 (0.0027) [2024-04-27 22:14:44,073][54818] Updated weights for policy 0, policy_version 493708 (0.0027) [2024-04-27 22:14:44,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59255.6, 300 sec: 58982.4). Total num frames: 8088911872. Throughput: 0: 58857.1. Samples: 994024340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:44,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 22:14:46,898][54818] Updated weights for policy 0, policy_version 493718 (0.0026) [2024-04-27 22:14:49,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8089190400. Throughput: 0: 58916.5. Samples: 994383980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:49,254][54587] Avg episode reward: [(0, '0.687')] [2024-04-27 22:14:49,263][54587] No heartbeat for components: RolloutWorker_w4 (3037 seconds) [2024-04-27 22:14:49,811][54818] Updated weights for policy 0, policy_version 493728 (0.0027) [2024-04-27 22:14:52,342][54818] Updated weights for policy 0, policy_version 493738 (0.0024) [2024-04-27 22:14:54,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8089485312. Throughput: 0: 59027.1. Samples: 994742320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:54,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 22:14:55,416][54818] Updated weights for policy 0, policy_version 493748 (0.0027) [2024-04-27 22:14:57,737][54818] Updated weights for policy 0, policy_version 493758 (0.0027) [2024-04-27 22:14:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8089780224. Throughput: 0: 59083.1. Samples: 994919280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:14:59,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 22:15:00,847][54818] Updated weights for policy 0, policy_version 493768 (0.0026) [2024-04-27 22:15:03,522][54818] Updated weights for policy 0, policy_version 493778 (0.0026) [2024-04-27 22:15:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8090075136. Throughput: 0: 59074.1. Samples: 995271580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 22:15:04,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 22:15:06,309][54818] Updated weights for policy 0, policy_version 493788 (0.0026) [2024-04-27 22:15:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.4, 300 sec: 58926.8). Total num frames: 8090386432. Throughput: 0: 59040.5. Samples: 995623980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:09,254][54587] Avg episode reward: [(0, '0.466')] [2024-04-27 22:15:09,256][54818] Updated weights for policy 0, policy_version 493798 (0.0024) [2024-04-27 22:15:12,146][54818] Updated weights for policy 0, policy_version 493808 (0.0024) [2024-04-27 22:15:13,376][54798] Signal inference workers to stop experience collection... (15150 times) [2024-04-27 22:15:13,376][54798] Signal inference workers to resume experience collection... (15150 times) [2024-04-27 22:15:13,406][54818] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-04-27 22:15:13,406][54818] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-04-27 22:15:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.5, 300 sec: 58926.9). Total num frames: 8090681344. Throughput: 0: 58945.3. Samples: 995800920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:14,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:15:14,969][54818] Updated weights for policy 0, policy_version 493818 (0.0023) [2024-04-27 22:15:17,807][54818] Updated weights for policy 0, policy_version 493828 (0.0026) [2024-04-27 22:15:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59801.6, 300 sec: 58982.4). Total num frames: 8090992640. Throughput: 0: 59194.2. Samples: 996163220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:19,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 22:15:20,420][54818] Updated weights for policy 0, policy_version 493838 (0.0024) [2024-04-27 22:15:23,186][54818] Updated weights for policy 0, policy_version 493848 (0.0024) [2024-04-27 22:15:24,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59255.6, 300 sec: 58871.4). Total num frames: 8091271168. Throughput: 0: 59036.1. Samples: 996513540. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:24,253][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 22:15:25,896][54818] Updated weights for policy 0, policy_version 493858 (0.0027) [2024-04-27 22:15:28,886][54818] Updated weights for policy 0, policy_version 493868 (0.0026) [2024-04-27 22:15:29,253][54587] Fps is (10 sec: 55705.0, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 8091549696. Throughput: 0: 59098.4. Samples: 996683780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:29,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 22:15:31,492][54818] Updated weights for policy 0, policy_version 493878 (0.0026) [2024-04-27 22:15:34,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8091844608. Throughput: 0: 59017.3. Samples: 997039760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:34,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 22:15:34,444][54818] Updated weights for policy 0, policy_version 493888 (0.0027) [2024-04-27 22:15:36,984][54818] Updated weights for policy 0, policy_version 493898 (0.0027) [2024-04-27 22:15:39,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8092123136. Throughput: 0: 59064.4. Samples: 997400220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:15:39,891][54818] Updated weights for policy 0, policy_version 493908 (0.0025) [2024-04-27 22:15:42,380][54818] Updated weights for policy 0, policy_version 493918 (0.0027) [2024-04-27 22:15:44,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58436.1, 300 sec: 58871.3). Total num frames: 8092418048. Throughput: 0: 58977.3. Samples: 997573260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:44,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 22:15:45,337][54818] Updated weights for policy 0, policy_version 493928 (0.0026) [2024-04-27 22:15:48,003][54818] Updated weights for policy 0, policy_version 493938 (0.0027) [2024-04-27 22:15:49,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.4, 300 sec: 58926.8). Total num frames: 8092729344. Throughput: 0: 58910.3. Samples: 997922540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:49,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 22:15:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000493941_8092729344.pth... [2024-04-27 22:15:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000493079_8078606336.pth [2024-04-27 22:15:50,892][54818] Updated weights for policy 0, policy_version 493948 (0.0026) [2024-04-27 22:15:53,549][54818] Updated weights for policy 0, policy_version 493958 (0.0027) [2024-04-27 22:15:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8093024256. Throughput: 0: 58887.9. Samples: 998273940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:54,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 22:15:56,274][54818] Updated weights for policy 0, policy_version 493968 (0.0027) [2024-04-27 22:15:59,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8093319168. Throughput: 0: 58955.5. Samples: 998453920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:15:59,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 22:15:59,269][54818] Updated weights for policy 0, policy_version 493978 (0.0025) [2024-04-27 22:16:01,902][54818] Updated weights for policy 0, policy_version 493988 (0.0025) [2024-04-27 22:16:04,253][54587] Fps is (10 sec: 62259.2, 60 sec: 59528.5, 300 sec: 59037.9). 
Total num frames: 8093646848. Throughput: 0: 58846.5. Samples: 998811320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:16:04,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 22:16:04,914][54818] Updated weights for policy 0, policy_version 493998 (0.0027) [2024-04-27 22:16:07,495][54818] Updated weights for policy 0, policy_version 494008 (0.0026) [2024-04-27 22:16:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8093925376. Throughput: 0: 58847.4. Samples: 999161680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:16:09,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 22:16:10,266][54818] Updated weights for policy 0, policy_version 494018 (0.0026) [2024-04-27 22:16:11,634][54798] Signal inference workers to stop experience collection... (15200 times) [2024-04-27 22:16:11,635][54798] Signal inference workers to resume experience collection... (15200 times) [2024-04-27 22:16:11,659][54818] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-04-27 22:16:11,659][54818] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-04-27 22:16:12,988][54818] Updated weights for policy 0, policy_version 494028 (0.0025) [2024-04-27 22:16:14,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.3, 300 sec: 58926.8). Total num frames: 8094220288. Throughput: 0: 59082.6. Samples: 999342500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:16:14,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 22:16:16,148][54818] Updated weights for policy 0, policy_version 494038 (0.0026) [2024-04-27 22:16:18,471][54818] Updated weights for policy 0, policy_version 494048 (0.0026) [2024-04-27 22:16:19,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8094515200. Throughput: 0: 59000.4. Samples: 999694780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:16:19,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-27 22:16:21,828][54818] Updated weights for policy 0, policy_version 494058 (0.0027) [2024-04-27 22:16:24,124][54818] Updated weights for policy 0, policy_version 494068 (0.0028) [2024-04-27 22:16:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.2, 300 sec: 58982.4). Total num frames: 8094810112. Throughput: 0: 58873.2. Samples: 1000049520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:16:24,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 22:16:27,440][54818] Updated weights for policy 0, policy_version 494078 (0.0026) [2024-04-27 22:16:29,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8095088640. Throughput: 0: 58973.9. Samples: 1000227080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:16:29,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 22:16:29,720][54818] Updated weights for policy 0, policy_version 494088 (0.0027) [2024-04-27 22:16:32,749][54818] Updated weights for policy 0, policy_version 494098 (0.0026) [2024-04-27 22:16:34,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8095383552. Throughput: 0: 59174.1. Samples: 1000585380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:16:34,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 22:16:35,223][54818] Updated weights for policy 0, policy_version 494108 (0.0026) [2024-04-27 22:16:38,242][54818] Updated weights for policy 0, policy_version 494118 (0.0027) [2024-04-27 22:16:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59528.6, 300 sec: 58982.4). Total num frames: 8095694848. Throughput: 0: 59160.2. Samples: 1000936140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:16:39,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-27 22:16:40,735][54818] Updated weights for policy 0, policy_version 494128 (0.0026) [2024-04-27 22:16:43,744][54818] Updated weights for policy 0, policy_version 494138 (0.0026) [2024-04-27 22:16:44,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59528.5, 300 sec: 58982.4). Total num frames: 8095989760. Throughput: 0: 58997.0. Samples: 1001108780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:16:44,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 22:16:46,283][54818] Updated weights for policy 0, policy_version 494148 (0.0026) [2024-04-27 22:16:49,233][54818] Updated weights for policy 0, policy_version 494158 (0.0026) [2024-04-27 22:16:49,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8096284672. Throughput: 0: 59002.8. Samples: 1001466440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:16:49,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-27 22:16:51,843][54818] Updated weights for policy 0, policy_version 494168 (0.0027) [2024-04-27 22:16:54,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8096563200. Throughput: 0: 59107.9. Samples: 1001821540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:16:54,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 22:16:54,742][54818] Updated weights for policy 0, policy_version 494178 (0.0028) [2024-04-27 22:16:57,409][54818] Updated weights for policy 0, policy_version 494188 (0.0028) [2024-04-27 22:16:59,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8096858112. Throughput: 0: 58854.0. Samples: 1001990920. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:16:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 22:17:00,398][54818] Updated weights for policy 0, policy_version 494198 (0.0027) [2024-04-27 22:17:03,030][54818] Updated weights for policy 0, policy_version 494208 (0.0025) [2024-04-27 22:17:04,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58163.3, 300 sec: 58871.3). Total num frames: 8097136640. Throughput: 0: 58940.9. Samples: 1002347120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:04,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 22:17:05,921][54818] Updated weights for policy 0, policy_version 494218 (0.0026) [2024-04-27 22:17:08,534][54818] Updated weights for policy 0, policy_version 494228 (0.0026) [2024-04-27 22:17:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8097464320. Throughput: 0: 58822.8. Samples: 1002696540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:09,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 22:17:11,507][54818] Updated weights for policy 0, policy_version 494238 (0.0027) [2024-04-27 22:17:14,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8097759232. Throughput: 0: 58897.2. Samples: 1002877460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:14,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 22:17:14,262][54818] Updated weights for policy 0, policy_version 494248 (0.0025) [2024-04-27 22:17:17,221][54818] Updated weights for policy 0, policy_version 494258 (0.0027) [2024-04-27 22:17:19,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8098037760. Throughput: 0: 58688.0. Samples: 1003226340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:19,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 22:17:19,841][54818] Updated weights for policy 0, policy_version 494268 (0.0026) [2024-04-27 22:17:21,007][54798] Signal inference workers to stop experience collection... (15250 times) [2024-04-27 22:17:21,013][54798] Signal inference workers to resume experience collection... (15250 times) [2024-04-27 22:17:21,026][54818] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-04-27 22:17:21,027][54818] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-04-27 22:17:22,825][54818] Updated weights for policy 0, policy_version 494278 (0.0025) [2024-04-27 22:17:24,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8098332672. Throughput: 0: 58560.4. Samples: 1003571360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:24,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:17:25,655][54818] Updated weights for policy 0, policy_version 494288 (0.0026) [2024-04-27 22:17:28,507][54818] Updated weights for policy 0, policy_version 494298 (0.0027) [2024-04-27 22:17:29,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8098627584. Throughput: 0: 58855.6. Samples: 1003757280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:29,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:17:31,480][54818] Updated weights for policy 0, policy_version 494308 (0.0025) [2024-04-27 22:17:33,913][54818] Updated weights for policy 0, policy_version 494318 (0.0029) [2024-04-27 22:17:34,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 8098906112. Throughput: 0: 58628.4. Samples: 1004104720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:34,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 22:17:37,131][54818] Updated weights for policy 0, policy_version 494328 (0.0027) [2024-04-27 22:17:39,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.2, 300 sec: 58926.8). Total num frames: 8099217408. Throughput: 0: 58705.4. Samples: 1004463280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:39,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 22:17:39,398][54818] Updated weights for policy 0, policy_version 494338 (0.0025) [2024-04-27 22:17:42,670][54818] Updated weights for policy 0, policy_version 494348 (0.0026) [2024-04-27 22:17:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58436.3, 300 sec: 58871.4). Total num frames: 8099495936. Throughput: 0: 58750.7. Samples: 1004634700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:44,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 22:17:45,011][54818] Updated weights for policy 0, policy_version 494358 (0.0026) [2024-04-27 22:17:48,190][54818] Updated weights for policy 0, policy_version 494368 (0.0026) [2024-04-27 22:17:49,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8099790848. Throughput: 0: 58863.2. Samples: 1004995960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:49,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 22:17:49,264][54587] No heartbeat for components: RolloutWorker_w4 (3217 seconds) [2024-04-27 22:17:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000494372_8099790848.pth... 
[2024-04-27 22:17:49,328][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000493509_8085651456.pth [2024-04-27 22:17:50,586][54818] Updated weights for policy 0, policy_version 494378 (0.0024) [2024-04-27 22:17:53,734][54818] Updated weights for policy 0, policy_version 494388 (0.0024) [2024-04-27 22:17:54,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 8100085760. Throughput: 0: 58803.5. Samples: 1005342700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:54,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 22:17:56,140][54818] Updated weights for policy 0, policy_version 494398 (0.0027) [2024-04-27 22:17:59,195][54818] Updated weights for policy 0, policy_version 494408 (0.0026) [2024-04-27 22:17:59,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 8100380672. Throughput: 0: 58449.8. Samples: 1005507700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 22:17:59,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 22:18:01,614][54818] Updated weights for policy 0, policy_version 494418 (0.0025) [2024-04-27 22:18:04,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8100642816. Throughput: 0: 58687.7. Samples: 1005867280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:04,254][54587] Avg episode reward: [(0, '0.694')] [2024-04-27 22:18:04,779][54818] Updated weights for policy 0, policy_version 494428 (0.0024) [2024-04-27 22:18:07,038][54818] Updated weights for policy 0, policy_version 494438 (0.0027) [2024-04-27 22:18:09,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58163.2, 300 sec: 58871.3). Total num frames: 8100954112. Throughput: 0: 58833.8. Samples: 1006218880. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:09,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:18:10,399][54818] Updated weights for policy 0, policy_version 494448 (0.0025) [2024-04-27 22:18:11,330][54798] Signal inference workers to stop experience collection... (15300 times) [2024-04-27 22:18:11,335][54798] Signal inference workers to resume experience collection... (15300 times) [2024-04-27 22:18:11,347][54818] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-04-27 22:18:11,347][54818] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-04-27 22:18:12,649][54818] Updated weights for policy 0, policy_version 494458 (0.0027) [2024-04-27 22:18:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58163.3, 300 sec: 58815.8). Total num frames: 8101249024. Throughput: 0: 58463.2. Samples: 1006388120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:14,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 22:18:15,922][54818] Updated weights for policy 0, policy_version 494468 (0.0027) [2024-04-27 22:18:18,589][54818] Updated weights for policy 0, policy_version 494478 (0.0025) [2024-04-27 22:18:19,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8101543936. Throughput: 0: 58440.4. Samples: 1006734540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:19,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 22:18:21,646][54818] Updated weights for policy 0, policy_version 494488 (0.0025) [2024-04-27 22:18:24,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8101838848. Throughput: 0: 58202.2. Samples: 1007082380. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:24,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 22:18:24,573][54818] Updated weights for policy 0, policy_version 494498 (0.0026) [2024-04-27 22:18:27,091][54818] Updated weights for policy 0, policy_version 494508 (0.0027) [2024-04-27 22:18:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8102150144. Throughput: 0: 58599.0. Samples: 1007271660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:29,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 22:18:29,953][54818] Updated weights for policy 0, policy_version 494518 (0.0027) [2024-04-27 22:18:32,869][54818] Updated weights for policy 0, policy_version 494528 (0.0026) [2024-04-27 22:18:34,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 8102428672. Throughput: 0: 58307.6. Samples: 1007619800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:34,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 22:18:35,599][54818] Updated weights for policy 0, policy_version 494538 (0.0027) [2024-04-27 22:18:38,519][54818] Updated weights for policy 0, policy_version 494548 (0.0023) [2024-04-27 22:18:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8102739968. Throughput: 0: 58367.6. Samples: 1007969240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:39,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 22:18:41,440][54818] Updated weights for policy 0, policy_version 494558 (0.0027) [2024-04-27 22:18:44,006][54818] Updated weights for policy 0, policy_version 494568 (0.0025) [2024-04-27 22:18:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8103018496. Throughput: 0: 58706.4. Samples: 1008149480. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:44,253][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 22:18:47,154][54818] Updated weights for policy 0, policy_version 494578 (0.0031) [2024-04-27 22:18:49,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 8103313408. Throughput: 0: 58557.7. Samples: 1008502380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:49,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:18:49,395][54818] Updated weights for policy 0, policy_version 494588 (0.0027) [2024-04-27 22:18:52,761][54818] Updated weights for policy 0, policy_version 494598 (0.0025) [2024-04-27 22:18:53,943][54798] Signal inference workers to stop experience collection... (15350 times) [2024-04-27 22:18:53,949][54798] Signal inference workers to resume experience collection... (15350 times) [2024-04-27 22:18:53,961][54818] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-04-27 22:18:53,962][54818] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-04-27 22:18:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8103591936. Throughput: 0: 58400.9. Samples: 1008846920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:54,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 22:18:55,124][54818] Updated weights for policy 0, policy_version 494608 (0.0027) [2024-04-27 22:18:58,335][54818] Updated weights for policy 0, policy_version 494618 (0.0024) [2024-04-27 22:18:59,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58436.4, 300 sec: 58760.3). Total num frames: 8103886848. Throughput: 0: 58508.9. Samples: 1009021020. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:18:59,253][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 22:19:00,615][54818] Updated weights for policy 0, policy_version 494628 (0.0025) [2024-04-27 22:19:03,757][54818] Updated weights for policy 0, policy_version 494638 (0.0027) [2024-04-27 22:19:04,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58436.3, 300 sec: 58649.2). Total num frames: 8104148992. Throughput: 0: 58498.8. Samples: 1009366980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:19:04,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:19:06,046][54818] Updated weights for policy 0, policy_version 494648 (0.0026) [2024-04-27 22:19:09,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58436.2, 300 sec: 58760.3). Total num frames: 8104460288. Throughput: 0: 58851.2. Samples: 1009730680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:19:09,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 22:19:09,622][54818] Updated weights for policy 0, policy_version 494658 (0.0027) [2024-04-27 22:19:11,524][54818] Updated weights for policy 0, policy_version 494668 (0.0027) [2024-04-27 22:19:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8104755200. Throughput: 0: 58253.9. Samples: 1009893080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:19:14,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 22:19:14,974][54818] Updated weights for policy 0, policy_version 494678 (0.0027) [2024-04-27 22:19:17,155][54818] Updated weights for policy 0, policy_version 494688 (0.0027) [2024-04-27 22:19:19,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58436.2, 300 sec: 58760.2). Total num frames: 8105050112. Throughput: 0: 58541.2. Samples: 1010254160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:19:19,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 22:19:20,342][54818] Updated weights for policy 0, policy_version 494698 (0.0026) [2024-04-27 22:19:22,633][54818] Updated weights for policy 0, policy_version 494708 (0.0025) [2024-04-27 22:19:24,253][54587] Fps is (10 sec: 62258.8, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8105377792. Throughput: 0: 58592.5. Samples: 1010605900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-27 22:19:24,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 22:19:26,105][54818] Updated weights for policy 0, policy_version 494718 (0.0026) [2024-04-27 22:19:28,074][54818] Updated weights for policy 0, policy_version 494728 (0.0026) [2024-04-27 22:19:29,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58436.4, 300 sec: 58760.2). Total num frames: 8105656320. Throughput: 0: 58768.4. Samples: 1010794060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:19:29,253][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 22:19:31,781][54818] Updated weights for policy 0, policy_version 494738 (0.0025) [2024-04-27 22:19:33,823][54818] Updated weights for policy 0, policy_version 494748 (0.0025) [2024-04-27 22:19:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8105967616. Throughput: 0: 58696.6. Samples: 1011143720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:19:34,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 22:19:37,119][54818] Updated weights for policy 0, policy_version 494758 (0.0027) [2024-04-27 22:19:39,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58709.3, 300 sec: 58815.7). Total num frames: 8106262528. Throughput: 0: 58774.1. Samples: 1011491760. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:19:39,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 22:19:39,525][54818] Updated weights for policy 0, policy_version 494768 (0.0026) [2024-04-27 22:19:42,574][54818] Updated weights for policy 0, policy_version 494778 (0.0026) [2024-04-27 22:19:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8106557440. Throughput: 0: 59021.7. Samples: 1011677000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:19:44,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 22:19:45,015][54818] Updated weights for policy 0, policy_version 494788 (0.0024) [2024-04-27 22:19:47,990][54818] Updated weights for policy 0, policy_version 494798 (0.0027) [2024-04-27 22:19:48,949][54798] Signal inference workers to stop experience collection... (15400 times) [2024-04-27 22:19:48,979][54818] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-04-27 22:19:49,006][54798] Signal inference workers to resume experience collection... (15400 times) [2024-04-27 22:19:49,007][54818] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-04-27 22:19:49,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8106835968. Throughput: 0: 59306.9. Samples: 1012035800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:19:49,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 22:19:49,271][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000494803_8106852352.pth... 
[2024-04-27 22:19:49,332][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000493941_8092729344.pth [2024-04-27 22:19:50,424][54818] Updated weights for policy 0, policy_version 494808 (0.0028) [2024-04-27 22:19:53,720][54818] Updated weights for policy 0, policy_version 494818 (0.0026) [2024-04-27 22:19:54,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8107114496. Throughput: 0: 59079.6. Samples: 1012389260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:19:54,254][54587] Avg episode reward: [(0, '0.498')] [2024-04-27 22:19:56,041][54818] Updated weights for policy 0, policy_version 494828 (0.0027) [2024-04-27 22:19:59,156][54818] Updated weights for policy 0, policy_version 494838 (0.0027) [2024-04-27 22:19:59,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 8107425792. Throughput: 0: 59219.4. Samples: 1012557960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:19:59,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:20:01,531][54818] Updated weights for policy 0, policy_version 494848 (0.0026) [2024-04-27 22:20:04,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59528.4, 300 sec: 58760.2). Total num frames: 8107720704. Throughput: 0: 59078.7. Samples: 1012912700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:04,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:20:04,702][54818] Updated weights for policy 0, policy_version 494858 (0.0026) [2024-04-27 22:20:07,211][54818] Updated weights for policy 0, policy_version 494868 (0.0027) [2024-04-27 22:20:09,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.3, 300 sec: 58704.7). Total num frames: 8107999232. Throughput: 0: 59153.3. Samples: 1013267800. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:09,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 22:20:10,383][54818] Updated weights for policy 0, policy_version 494878 (0.0025) [2024-04-27 22:20:12,920][54818] Updated weights for policy 0, policy_version 494888 (0.0027) [2024-04-27 22:20:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.3, 300 sec: 58649.2). Total num frames: 8108294144. Throughput: 0: 58888.8. Samples: 1013444060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:14,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 22:20:15,824][54818] Updated weights for policy 0, policy_version 494898 (0.0026) [2024-04-27 22:20:18,496][54818] Updated weights for policy 0, policy_version 494908 (0.0022) [2024-04-27 22:20:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59255.5, 300 sec: 58760.2). Total num frames: 8108605440. Throughput: 0: 58897.7. Samples: 1013794120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:19,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 22:20:21,428][54818] Updated weights for policy 0, policy_version 494918 (0.0027) [2024-04-27 22:20:23,952][54818] Updated weights for policy 0, policy_version 494928 (0.0026) [2024-04-27 22:20:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8108900352. Throughput: 0: 59067.6. Samples: 1014149800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:24,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-27 22:20:26,858][54818] Updated weights for policy 0, policy_version 494938 (0.0027) [2024-04-27 22:20:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8109211648. Throughput: 0: 58882.6. Samples: 1014326720. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:29,254][54587] Avg episode reward: [(0, '0.488')] [2024-04-27 22:20:29,499][54818] Updated weights for policy 0, policy_version 494948 (0.0024) [2024-04-27 22:20:32,364][54818] Updated weights for policy 0, policy_version 494958 (0.0027) [2024-04-27 22:20:34,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8109490176. Throughput: 0: 58555.2. Samples: 1014670780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:34,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 22:20:35,507][54818] Updated weights for policy 0, policy_version 494968 (0.0025) [2024-04-27 22:20:37,838][54818] Updated weights for policy 0, policy_version 494978 (0.0026) [2024-04-27 22:20:39,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8109801472. Throughput: 0: 58720.9. Samples: 1015031700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:39,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 22:20:41,029][54818] Updated weights for policy 0, policy_version 494988 (0.0025) [2024-04-27 22:20:43,375][54818] Updated weights for policy 0, policy_version 494998 (0.0028) [2024-04-27 22:20:44,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58709.2, 300 sec: 58815.8). Total num frames: 8110080000. Throughput: 0: 58884.8. Samples: 1015207780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:44,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 22:20:45,464][54798] Signal inference workers to stop experience collection... (15450 times) [2024-04-27 22:20:45,464][54798] Signal inference workers to resume experience collection... 
(15450 times) [2024-04-27 22:20:45,474][54818] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-04-27 22:20:45,475][54818] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-04-27 22:20:46,387][54818] Updated weights for policy 0, policy_version 495008 (0.0026) [2024-04-27 22:20:48,924][54818] Updated weights for policy 0, policy_version 495018 (0.0025) [2024-04-27 22:20:49,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8110374912. Throughput: 0: 58901.3. Samples: 1015563260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:49,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 22:20:49,265][54587] No heartbeat for components: RolloutWorker_w4 (3397 seconds) [2024-04-27 22:20:52,059][54818] Updated weights for policy 0, policy_version 495028 (0.0026) [2024-04-27 22:20:54,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59528.6, 300 sec: 58871.4). Total num frames: 8110686208. Throughput: 0: 58828.1. Samples: 1015915060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-27 22:20:54,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:20:54,521][54818] Updated weights for policy 0, policy_version 495038 (0.0024) [2024-04-27 22:20:57,629][54818] Updated weights for policy 0, policy_version 495048 (0.0027) [2024-04-27 22:20:59,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58982.4, 300 sec: 58704.7). Total num frames: 8110964736. Throughput: 0: 58821.0. Samples: 1016091000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:20:59,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 22:21:00,235][54818] Updated weights for policy 0, policy_version 495058 (0.0026) [2024-04-27 22:21:03,241][54818] Updated weights for policy 0, policy_version 495068 (0.0027) [2024-04-27 22:21:04,253][54587] Fps is (10 sec: 55705.4, 60 sec: 58709.4, 300 sec: 58704.7). Total num frames: 8111243264. Throughput: 0: 58933.8. Samples: 1016446140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:04,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 22:21:05,832][54818] Updated weights for policy 0, policy_version 495078 (0.0026) [2024-04-27 22:21:08,833][54818] Updated weights for policy 0, policy_version 495088 (0.0027) [2024-04-27 22:21:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.5, 300 sec: 58760.3). Total num frames: 8111554560. Throughput: 0: 58904.0. Samples: 1016800480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:09,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 22:21:11,366][54818] Updated weights for policy 0, policy_version 495098 (0.0027) [2024-04-27 22:21:14,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 58704.7). Total num frames: 8111833088. Throughput: 0: 58905.9. Samples: 1016977480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:14,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 22:21:14,257][54818] Updated weights for policy 0, policy_version 495108 (0.0025) [2024-04-27 22:21:16,963][54818] Updated weights for policy 0, policy_version 495118 (0.0026) [2024-04-27 22:21:19,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.3, 300 sec: 58704.7). Total num frames: 8112128000. Throughput: 0: 59144.8. Samples: 1017332300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:19,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 22:21:19,773][54818] Updated weights for policy 0, policy_version 495128 (0.0027) [2024-04-27 22:21:22,634][54818] Updated weights for policy 0, policy_version 495138 (0.0027) [2024-04-27 22:21:24,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 8112422912. Throughput: 0: 58901.4. Samples: 1017682260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:24,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 22:21:25,384][54818] Updated weights for policy 0, policy_version 495148 (0.0026) [2024-04-27 22:21:28,100][54818] Updated weights for policy 0, policy_version 495158 (0.0027) [2024-04-27 22:21:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8112734208. Throughput: 0: 58950.7. Samples: 1017860560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:29,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 22:21:30,861][54818] Updated weights for policy 0, policy_version 495168 (0.0025) [2024-04-27 22:21:33,605][54818] Updated weights for policy 0, policy_version 495178 (0.0025) [2024-04-27 22:21:34,253][54587] Fps is (10 sec: 60619.9, 60 sec: 58982.3, 300 sec: 58760.2). Total num frames: 8113029120. Throughput: 0: 58946.7. Samples: 1018215860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:34,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 22:21:36,291][54818] Updated weights for policy 0, policy_version 495188 (0.0027) [2024-04-27 22:21:39,236][54818] Updated weights for policy 0, policy_version 495198 (0.0027) [2024-04-27 22:21:39,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8113324032. Throughput: 0: 59045.7. Samples: 1018572120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:39,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:21:41,958][54818] Updated weights for policy 0, policy_version 495208 (0.0027) [2024-04-27 22:21:44,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 58760.2). Total num frames: 8113618944. Throughput: 0: 59020.3. Samples: 1018746920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:44,254][54587] Avg episode reward: [(0, '0.496')] [2024-04-27 22:21:44,694][54818] Updated weights for policy 0, policy_version 495218 (0.0026) [2024-04-27 22:21:47,528][54818] Updated weights for policy 0, policy_version 495228 (0.0031) [2024-04-27 22:21:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.6, 300 sec: 58871.3). Total num frames: 8113930240. Throughput: 0: 59022.2. Samples: 1019102140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:49,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 22:21:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000495235_8113930240.pth... [2024-04-27 22:21:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000494372_8099790848.pth [2024-04-27 22:21:50,114][54818] Updated weights for policy 0, policy_version 495238 (0.0026) [2024-04-27 22:21:52,958][54818] Updated weights for policy 0, policy_version 495248 (0.0025) [2024-04-27 22:21:54,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8114208768. Throughput: 0: 58958.3. Samples: 1019453600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:54,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 22:21:55,753][54818] Updated weights for policy 0, policy_version 495258 (0.0027) [2024-04-27 22:21:58,393][54818] Updated weights for policy 0, policy_version 495268 (0.0026) [2024-04-27 22:21:59,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 8114503680. Throughput: 0: 59147.6. Samples: 1019639120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:21:59,253][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 22:22:00,919][54798] Signal inference workers to stop experience collection... (15500 times) [2024-04-27 22:22:00,919][54798] Signal inference workers to resume experience collection... 
(15500 times) [2024-04-27 22:22:00,938][54818] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-04-27 22:22:00,939][54818] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-04-27 22:22:01,259][54818] Updated weights for policy 0, policy_version 495278 (0.0027) [2024-04-27 22:22:03,990][54818] Updated weights for policy 0, policy_version 495288 (0.0026) [2024-04-27 22:22:04,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.4, 300 sec: 58760.2). Total num frames: 8114798592. Throughput: 0: 59073.7. Samples: 1019990620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:22:04,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 22:22:06,626][54818] Updated weights for policy 0, policy_version 495298 (0.0025) [2024-04-27 22:22:09,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.4, 300 sec: 58704.7). Total num frames: 8115077120. Throughput: 0: 59083.9. Samples: 1020341040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:22:09,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 22:22:09,626][54818] Updated weights for policy 0, policy_version 495308 (0.0025) [2024-04-27 22:22:12,795][54818] Updated weights for policy 0, policy_version 495318 (0.0027) [2024-04-27 22:22:14,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 8115388416. Throughput: 0: 58865.8. Samples: 1020509520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:22:14,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 22:22:15,216][54818] Updated weights for policy 0, policy_version 495328 (0.0027) [2024-04-27 22:22:18,532][54818] Updated weights for policy 0, policy_version 495338 (0.0027) [2024-04-27 22:22:19,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.4, 300 sec: 58704.7). Total num frames: 8115650560. Throughput: 0: 58780.9. Samples: 1020861000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:22:19,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 22:22:20,710][54818] Updated weights for policy 0, policy_version 495348 (0.0027) [2024-04-27 22:22:24,041][54818] Updated weights for policy 0, policy_version 495358 (0.0027) [2024-04-27 22:22:24,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58709.2, 300 sec: 58704.7). Total num frames: 8115945472. Throughput: 0: 58880.8. Samples: 1021221760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:22:24,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 22:22:26,263][54818] Updated weights for policy 0, policy_version 495368 (0.0027) [2024-04-27 22:22:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.4, 300 sec: 58815.8). Total num frames: 8116256768. Throughput: 0: 58887.1. Samples: 1021396840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:22:29,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 22:22:29,575][54818] Updated weights for policy 0, policy_version 495378 (0.0025) [2024-04-27 22:22:31,916][54818] Updated weights for policy 0, policy_version 495388 (0.0025) [2024-04-27 22:22:34,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.4, 300 sec: 58760.2). Total num frames: 8116551680. Throughput: 0: 58749.7. Samples: 1021745880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:22:34,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-27 22:22:35,075][54818] Updated weights for policy 0, policy_version 495398 (0.0027) [2024-04-27 22:22:37,419][54818] Updated weights for policy 0, policy_version 495408 (0.0026) [2024-04-27 22:22:39,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8116862976. Throughput: 0: 58695.9. Samples: 1022094920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:22:39,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 22:22:40,803][54818] Updated weights for policy 0, policy_version 495418 (0.0025) [2024-04-27 22:22:43,029][54818] Updated weights for policy 0, policy_version 495428 (0.0026) [2024-04-27 22:22:44,253][54587] Fps is (10 sec: 62259.0, 60 sec: 59255.4, 300 sec: 58926.8). Total num frames: 8117174272. Throughput: 0: 58653.6. Samples: 1022278540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:22:44,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 22:22:46,426][54818] Updated weights for policy 0, policy_version 495438 (0.0027) [2024-04-27 22:22:48,602][54818] Updated weights for policy 0, policy_version 495448 (0.0026) [2024-04-27 22:22:49,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8117452800. Throughput: 0: 58547.2. Samples: 1022625240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:22:49,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 22:22:52,045][54818] Updated weights for policy 0, policy_version 495458 (0.0026) [2024-04-27 22:22:53,139][54798] Signal inference workers to stop experience collection... (15550 times) [2024-04-27 22:22:53,169][54818] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-04-27 22:22:53,198][54798] Signal inference workers to resume experience collection... (15550 times) [2024-04-27 22:22:53,198][54818] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-04-27 22:22:54,067][54818] Updated weights for policy 0, policy_version 495468 (0.0026) [2024-04-27 22:22:54,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8117747712. Throughput: 0: 58575.0. Samples: 1022976920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:22:54,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 22:22:57,654][54818] Updated weights for policy 0, policy_version 495478 (0.0026) [2024-04-27 22:22:59,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8118042624. Throughput: 0: 58911.6. Samples: 1023160540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:22:59,254][54587] Avg episode reward: [(0, '0.455')] [2024-04-27 22:22:59,892][54818] Updated weights for policy 0, policy_version 495488 (0.0024) [2024-04-27 22:23:03,285][54818] Updated weights for policy 0, policy_version 495498 (0.0025) [2024-04-27 22:23:04,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8118321152. Throughput: 0: 59060.4. Samples: 1023518720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:23:04,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 22:23:05,340][54818] Updated weights for policy 0, policy_version 495508 (0.0026) [2024-04-27 22:23:08,827][54818] Updated weights for policy 0, policy_version 495518 (0.0026) [2024-04-27 22:23:09,253][54587] Fps is (10 sec: 55705.7, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8118599680. Throughput: 0: 58913.9. Samples: 1023872880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:23:09,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 22:23:10,761][54818] Updated weights for policy 0, policy_version 495528 (0.0027) [2024-04-27 22:23:14,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58163.2, 300 sec: 58760.2). Total num frames: 8118878208. Throughput: 0: 58717.9. Samples: 1024039140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:23:14,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 22:23:14,434][54818] Updated weights for policy 0, policy_version 495538 (0.0027) [2024-04-27 22:23:16,255][54818] Updated weights for policy 0, policy_version 495548 (0.0023) [2024-04-27 22:23:19,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58709.4, 300 sec: 58760.3). Total num frames: 8119173120. Throughput: 0: 58791.3. Samples: 1024391480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:23:19,253][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 22:23:19,754][54818] Updated weights for policy 0, policy_version 495558 (0.0025) [2024-04-27 22:23:21,765][54818] Updated weights for policy 0, policy_version 495568 (0.0022) [2024-04-27 22:23:24,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.5, 300 sec: 58704.7). Total num frames: 8119468032. Throughput: 0: 59014.4. Samples: 1024750560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:23:24,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 22:23:25,138][54818] Updated weights for policy 0, policy_version 495578 (0.0026) [2024-04-27 22:23:27,284][54818] Updated weights for policy 0, policy_version 495588 (0.0026) [2024-04-27 22:23:29,253][54587] Fps is (10 sec: 60619.7, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8119779328. Throughput: 0: 58749.8. Samples: 1024922280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:23:29,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 22:23:30,777][54818] Updated weights for policy 0, policy_version 495598 (0.0027) [2024-04-27 22:23:32,717][54818] Updated weights for policy 0, policy_version 495608 (0.0024) [2024-04-27 22:23:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 8120090624. Throughput: 0: 58809.8. Samples: 1025271680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:23:34,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 22:23:35,801][54798] Signal inference workers to stop experience collection... (15600 times) [2024-04-27 22:23:35,838][54818] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-04-27 22:23:35,892][54798] Signal inference workers to resume experience collection... (15600 times) [2024-04-27 22:23:35,892][54818] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-04-27 22:23:36,277][54818] Updated weights for policy 0, policy_version 495618 (0.0023) [2024-04-27 22:23:38,771][54818] Updated weights for policy 0, policy_version 495628 (0.0025) [2024-04-27 22:23:39,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8120385536. Throughput: 0: 59019.6. Samples: 1025632800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:23:39,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 22:23:41,643][54818] Updated weights for policy 0, policy_version 495638 (0.0025) [2024-04-27 22:23:44,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58436.4, 300 sec: 58871.3). Total num frames: 8120680448. Throughput: 0: 58876.9. Samples: 1025810000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-27 22:23:44,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 22:23:44,345][54818] Updated weights for policy 0, policy_version 495648 (0.0025) [2024-04-27 22:23:47,120][54818] Updated weights for policy 0, policy_version 495658 (0.0024) [2024-04-27 22:23:49,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8120991744. Throughput: 0: 58745.8. Samples: 1026162280. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:23:49,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 22:23:49,266][54587] No heartbeat for components: RolloutWorker_w4 (3577 seconds) [2024-04-27 22:23:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000495666_8120991744.pth... [2024-04-27 22:23:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000494803_8106852352.pth [2024-04-27 22:23:49,846][54818] Updated weights for policy 0, policy_version 495668 (0.0026) [2024-04-27 22:23:52,690][54818] Updated weights for policy 0, policy_version 495678 (0.0027) [2024-04-27 22:23:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59255.5, 300 sec: 59037.9). Total num frames: 8121303040. Throughput: 0: 58601.8. Samples: 1026509960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:23:54,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 22:23:55,462][54818] Updated weights for policy 0, policy_version 495688 (0.0024) [2024-04-27 22:23:58,207][54818] Updated weights for policy 0, policy_version 495698 (0.0024) [2024-04-27 22:23:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8121581568. Throughput: 0: 59149.3. Samples: 1026700860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:23:59,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 22:24:01,075][54818] Updated weights for policy 0, policy_version 495708 (0.0024) [2024-04-27 22:24:03,892][54818] Updated weights for policy 0, policy_version 495718 (0.0027) [2024-04-27 22:24:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8121876480. Throughput: 0: 59125.1. Samples: 1027052120. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:04,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 22:24:06,765][54818] Updated weights for policy 0, policy_version 495728 (0.0025) [2024-04-27 22:24:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8122155008. Throughput: 0: 58828.4. Samples: 1027397840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:09,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 22:24:09,291][54818] Updated weights for policy 0, policy_version 495738 (0.0027) [2024-04-27 22:24:12,396][54818] Updated weights for policy 0, policy_version 495748 (0.0027) [2024-04-27 22:24:13,816][54798] Signal inference workers to stop experience collection... (15650 times) [2024-04-27 22:24:13,852][54818] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-04-27 22:24:13,906][54798] Signal inference workers to resume experience collection... (15650 times) [2024-04-27 22:24:13,907][54818] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-04-27 22:24:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59528.5, 300 sec: 58982.4). Total num frames: 8122449920. Throughput: 0: 59012.5. Samples: 1027577840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:14,254][54587] Avg episode reward: [(0, '0.440')] [2024-04-27 22:24:14,847][54818] Updated weights for policy 0, policy_version 495758 (0.0026) [2024-04-27 22:24:18,171][54818] Updated weights for policy 0, policy_version 495768 (0.0025) [2024-04-27 22:24:19,253][54587] Fps is (10 sec: 57343.1, 60 sec: 59255.3, 300 sec: 58815.8). Total num frames: 8122728448. Throughput: 0: 59062.9. Samples: 1027929520. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:19,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 22:24:20,358][54818] Updated weights for policy 0, policy_version 495778 (0.0027) [2024-04-27 22:24:23,703][54818] Updated weights for policy 0, policy_version 495788 (0.0027) [2024-04-27 22:24:24,253][54587] Fps is (10 sec: 55706.0, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 8123006976. Throughput: 0: 59047.5. Samples: 1028289940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:24,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 22:24:25,671][54818] Updated weights for policy 0, policy_version 495798 (0.0027) [2024-04-27 22:24:29,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58709.5, 300 sec: 58760.2). Total num frames: 8123301888. Throughput: 0: 58816.0. Samples: 1028456720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:29,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 22:24:29,311][54818] Updated weights for policy 0, policy_version 495808 (0.0024) [2024-04-27 22:24:31,293][54818] Updated weights for policy 0, policy_version 495818 (0.0024) [2024-04-27 22:24:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.2, 300 sec: 58815.8). Total num frames: 8123613184. Throughput: 0: 58807.9. Samples: 1028808640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:34,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 22:24:34,716][54818] Updated weights for policy 0, policy_version 495828 (0.0027) [2024-04-27 22:24:36,981][54818] Updated weights for policy 0, policy_version 495838 (0.0026) [2024-04-27 22:24:39,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8123924480. Throughput: 0: 59143.6. Samples: 1029171420. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:39,254][54587] Avg episode reward: [(0, '0.497')] [2024-04-27 22:24:40,456][54818] Updated weights for policy 0, policy_version 495848 (0.0027) [2024-04-27 22:24:42,583][54818] Updated weights for policy 0, policy_version 495858 (0.0025) [2024-04-27 22:24:44,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58709.4, 300 sec: 58871.4). Total num frames: 8124203008. Throughput: 0: 58700.6. Samples: 1029342380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:44,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 22:24:45,743][54818] Updated weights for policy 0, policy_version 495868 (0.0023) [2024-04-27 22:24:48,267][54818] Updated weights for policy 0, policy_version 495878 (0.0025) [2024-04-27 22:24:49,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58436.3, 300 sec: 58926.9). Total num frames: 8124497920. Throughput: 0: 58743.2. Samples: 1029695560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:49,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 22:24:51,172][54818] Updated weights for policy 0, policy_version 495888 (0.0025) [2024-04-27 22:24:53,931][54818] Updated weights for policy 0, policy_version 495898 (0.0024) [2024-04-27 22:24:54,253][54587] Fps is (10 sec: 60619.9, 60 sec: 58436.2, 300 sec: 58926.8). Total num frames: 8124809216. Throughput: 0: 58930.5. Samples: 1030049720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:54,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 22:24:56,727][54818] Updated weights for policy 0, policy_version 495908 (0.0027) [2024-04-27 22:24:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8125104128. Throughput: 0: 58983.2. Samples: 1030232080. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:24:59,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 22:24:59,433][54818] Updated weights for policy 0, policy_version 495918 (0.0025) [2024-04-27 22:25:02,364][54818] Updated weights for policy 0, policy_version 495928 (0.0026) [2024-04-27 22:25:03,204][54798] Signal inference workers to stop experience collection... (15700 times) [2024-04-27 22:25:03,208][54798] Signal inference workers to resume experience collection... (15700 times) [2024-04-27 22:25:03,234][54818] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-04-27 22:25:03,234][54818] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-04-27 22:25:04,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8125399040. Throughput: 0: 59054.4. Samples: 1030586960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:25:04,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 22:25:04,959][54818] Updated weights for policy 0, policy_version 495938 (0.0025) [2024-04-27 22:25:07,809][54818] Updated weights for policy 0, policy_version 495948 (0.0025) [2024-04-27 22:25:09,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8125693952. Throughput: 0: 58643.1. Samples: 1030928880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:25:09,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 22:25:10,829][54818] Updated weights for policy 0, policy_version 495958 (0.0027) [2024-04-27 22:25:13,396][54818] Updated weights for policy 0, policy_version 495968 (0.0025) [2024-04-27 22:25:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8126005248. Throughput: 0: 59218.2. Samples: 1031121540. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-27 22:25:14,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 22:25:16,406][54818] Updated weights for policy 0, policy_version 495978 (0.0023) [2024-04-27 22:25:18,963][54818] Updated weights for policy 0, policy_version 495988 (0.0027) [2024-04-27 22:25:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59528.6, 300 sec: 58982.4). Total num frames: 8126300160. Throughput: 0: 59116.4. Samples: 1031468880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:25:19,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 22:25:21,968][54818] Updated weights for policy 0, policy_version 495998 (0.0026) [2024-04-27 22:25:24,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59528.5, 300 sec: 58871.3). Total num frames: 8126578688. Throughput: 0: 58771.1. Samples: 1031816120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:25:24,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 22:25:24,417][54818] Updated weights for policy 0, policy_version 496008 (0.0026) [2024-04-27 22:25:27,877][54818] Updated weights for policy 0, policy_version 496018 (0.0026) [2024-04-27 22:25:29,253][54587] Fps is (10 sec: 57344.8, 60 sec: 59528.6, 300 sec: 58926.9). Total num frames: 8126873600. Throughput: 0: 58896.9. Samples: 1031992740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:25:29,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 22:25:29,956][54818] Updated weights for policy 0, policy_version 496028 (0.0024) [2024-04-27 22:25:33,331][54818] Updated weights for policy 0, policy_version 496038 (0.0024) [2024-04-27 22:25:34,257][54587] Fps is (10 sec: 57322.0, 60 sec: 58978.7, 300 sec: 58815.0). Total num frames: 8127152128. Throughput: 0: 59107.8. Samples: 1032355640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:25:34,258][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 22:25:35,361][54818] Updated weights for policy 0, policy_version 496048 (0.0027) [2024-04-27 22:25:38,753][54818] Updated weights for policy 0, policy_version 496058 (0.0026) [2024-04-27 22:25:39,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8127447040. Throughput: 0: 59024.6. Samples: 1032705820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:25:39,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:25:40,900][54818] Updated weights for policy 0, policy_version 496068 (0.0026) [2024-04-27 22:25:44,248][54818] Updated weights for policy 0, policy_version 496078 (0.0026) [2024-04-27 22:25:44,253][54587] Fps is (10 sec: 59005.2, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8127741952. Throughput: 0: 58612.9. Samples: 1032869660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:25:44,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-27 22:25:46,417][54818] Updated weights for policy 0, policy_version 496088 (0.0027) [2024-04-27 22:25:49,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58436.3, 300 sec: 58704.7). Total num frames: 8128004096. Throughput: 0: 58857.4. Samples: 1033235540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:25:49,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 22:25:49,417][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000496096_8128036864.pth... [2024-04-27 22:25:49,461][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000495235_8113930240.pth [2024-04-27 22:25:49,878][54818] Updated weights for policy 0, policy_version 496098 (0.0024) [2024-04-27 22:25:51,927][54818] Updated weights for policy 0, policy_version 496108 (0.0025) [2024-04-27 22:25:54,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58436.3, 300 sec: 58815.8). 
Total num frames: 8128315392. Throughput: 0: 58956.9. Samples: 1033581940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:25:54,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 22:25:55,512][54818] Updated weights for policy 0, policy_version 496118 (0.0025) [2024-04-27 22:25:57,865][54818] Updated weights for policy 0, policy_version 496128 (0.0026) [2024-04-27 22:25:59,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8128626688. Throughput: 0: 58396.4. Samples: 1033749380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:25:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 22:26:00,326][54798] Signal inference workers to stop experience collection... (15750 times) [2024-04-27 22:26:00,326][54798] Signal inference workers to resume experience collection... (15750 times) [2024-04-27 22:26:00,337][54818] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-04-27 22:26:00,354][54818] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-04-27 22:26:00,950][54818] Updated weights for policy 0, policy_version 496138 (0.0032) [2024-04-27 22:26:03,709][54818] Updated weights for policy 0, policy_version 496148 (0.0027) [2024-04-27 22:26:04,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58436.2, 300 sec: 58815.8). Total num frames: 8128905216. Throughput: 0: 58607.2. Samples: 1034106200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:26:04,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 22:26:06,502][54818] Updated weights for policy 0, policy_version 496158 (0.0028) [2024-04-27 22:26:09,202][54818] Updated weights for policy 0, policy_version 496168 (0.0027) [2024-04-27 22:26:09,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8129216512. Throughput: 0: 58876.1. Samples: 1034465540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:26:09,254][54587] Avg episode reward: [(0, '0.695')] [2024-04-27 22:26:11,948][54818] Updated weights for policy 0, policy_version 496178 (0.0027) [2024-04-27 22:26:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58436.3, 300 sec: 58926.9). Total num frames: 8129511424. Throughput: 0: 59032.8. Samples: 1034649220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:26:14,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 22:26:14,867][54818] Updated weights for policy 0, policy_version 496188 (0.0029) [2024-04-27 22:26:17,535][54818] Updated weights for policy 0, policy_version 496198 (0.0026) [2024-04-27 22:26:19,253][54587] Fps is (10 sec: 60619.9, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8129822720. Throughput: 0: 58703.1. Samples: 1034997060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:26:19,254][54587] Avg episode reward: [(0, '0.680')] [2024-04-27 22:26:20,316][54818] Updated weights for policy 0, policy_version 496208 (0.0024) [2024-04-27 22:26:22,961][54818] Updated weights for policy 0, policy_version 496218 (0.0026) [2024-04-27 22:26:24,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59255.6, 300 sec: 58982.4). Total num frames: 8130134016. Throughput: 0: 58652.9. Samples: 1035345200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:26:24,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 22:26:25,803][54818] Updated weights for policy 0, policy_version 496228 (0.0026) [2024-04-27 22:26:28,468][54818] Updated weights for policy 0, policy_version 496238 (0.0026) [2024-04-27 22:26:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.3, 300 sec: 58982.4). Total num frames: 8130428928. Throughput: 0: 59262.6. Samples: 1035536480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:26:29,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 22:26:31,380][54818] Updated weights for policy 0, policy_version 496248 (0.0026) [2024-04-27 22:26:33,875][54818] Updated weights for policy 0, policy_version 496258 (0.0024) [2024-04-27 22:26:34,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59532.3, 300 sec: 58982.4). Total num frames: 8130723840. Throughput: 0: 59090.6. Samples: 1035894620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:26:34,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-27 22:26:36,729][54818] Updated weights for policy 0, policy_version 496268 (0.0025) [2024-04-27 22:26:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8130985984. Throughput: 0: 59235.1. Samples: 1036247520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-27 22:26:39,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 22:26:39,478][54818] Updated weights for policy 0, policy_version 496278 (0.0027) [2024-04-27 22:26:42,574][54818] Updated weights for policy 0, policy_version 496288 (0.0027) [2024-04-27 22:26:44,253][54587] Fps is (10 sec: 54067.8, 60 sec: 58709.4, 300 sec: 58760.3). Total num frames: 8131264512. Throughput: 0: 59307.2. Samples: 1036418200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:26:44,253][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 22:26:45,003][54818] Updated weights for policy 0, policy_version 496298 (0.0024) [2024-04-27 22:26:48,119][54818] Updated weights for policy 0, policy_version 496308 (0.0027) [2024-04-27 22:26:49,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 8131559424. Throughput: 0: 59285.4. Samples: 1036774040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:26:49,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 22:26:49,264][54587] No heartbeat for components: RolloutWorker_w4 (3757 seconds) [2024-04-27 22:26:49,693][54798] Signal inference workers to stop experience collection... (15800 times) [2024-04-27 22:26:49,736][54818] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-04-27 22:26:49,752][54798] Signal inference workers to resume experience collection... (15800 times) [2024-04-27 22:26:49,753][54818] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-04-27 22:26:50,357][54818] Updated weights for policy 0, policy_version 496318 (0.0026) [2024-04-27 22:26:53,756][54818] Updated weights for policy 0, policy_version 496328 (0.0026) [2024-04-27 22:26:54,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8131837952. Throughput: 0: 59226.1. Samples: 1037130720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:26:54,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:26:55,962][54818] Updated weights for policy 0, policy_version 496338 (0.0026) [2024-04-27 22:26:59,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8132149248. Throughput: 0: 58704.8. Samples: 1037290940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:26:59,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 22:26:59,434][54818] Updated weights for policy 0, policy_version 496348 (0.0028) [2024-04-27 22:27:01,581][54818] Updated weights for policy 0, policy_version 496358 (0.0025) [2024-04-27 22:27:04,253][54587] Fps is (10 sec: 62259.7, 60 sec: 59255.6, 300 sec: 58926.9). Total num frames: 8132460544. Throughput: 0: 58887.3. Samples: 1037646980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:04,253][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 22:27:04,956][54818] Updated weights for policy 0, policy_version 496368 (0.0026) [2024-04-27 22:27:07,135][54818] Updated weights for policy 0, policy_version 496378 (0.0025) [2024-04-27 22:27:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8132755456. Throughput: 0: 59057.1. Samples: 1038002780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:09,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 22:27:10,712][54818] Updated weights for policy 0, policy_version 496388 (0.0026) [2024-04-27 22:27:12,678][54818] Updated weights for policy 0, policy_version 496398 (0.0024) [2024-04-27 22:27:14,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8133066752. Throughput: 0: 58896.0. Samples: 1038186800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:14,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 22:27:16,204][54818] Updated weights for policy 0, policy_version 496408 (0.0025) [2024-04-27 22:27:18,241][54818] Updated weights for policy 0, policy_version 496418 (0.0026) [2024-04-27 22:27:19,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58982.5, 300 sec: 59038.0). Total num frames: 8133361664. Throughput: 0: 58655.2. Samples: 1038534100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:19,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 22:27:21,828][54818] Updated weights for policy 0, policy_version 496428 (0.0026) [2024-04-27 22:27:23,833][54818] Updated weights for policy 0, policy_version 496438 (0.0025) [2024-04-27 22:27:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 8133656576. Throughput: 0: 58540.9. Samples: 1038881860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:24,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 22:27:27,264][54818] Updated weights for policy 0, policy_version 496448 (0.0027) [2024-04-27 22:27:29,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.5, 300 sec: 58982.4). Total num frames: 8133951488. Throughput: 0: 59031.1. Samples: 1039074600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:29,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 22:27:29,290][54818] Updated weights for policy 0, policy_version 496458 (0.0027) [2024-04-27 22:27:32,683][54818] Updated weights for policy 0, policy_version 496468 (0.0025) [2024-04-27 22:27:34,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8134246400. Throughput: 0: 58913.7. Samples: 1039425160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:34,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 22:27:35,204][54818] Updated weights for policy 0, policy_version 496478 (0.0025) [2024-04-27 22:27:38,224][54818] Updated weights for policy 0, policy_version 496488 (0.0025) [2024-04-27 22:27:39,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58982.3, 300 sec: 58815.8). Total num frames: 8134524928. Throughput: 0: 58634.6. Samples: 1039769280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:39,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 22:27:39,585][54798] Signal inference workers to stop experience collection... (15850 times) [2024-04-27 22:27:39,590][54798] Signal inference workers to resume experience collection... 
(15850 times) [2024-04-27 22:27:39,601][54818] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-04-27 22:27:39,601][54818] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-04-27 22:27:40,776][54818] Updated weights for policy 0, policy_version 496498 (0.0025) [2024-04-27 22:27:43,697][54818] Updated weights for policy 0, policy_version 496508 (0.0026) [2024-04-27 22:27:44,253][54587] Fps is (10 sec: 57344.7, 60 sec: 59255.4, 300 sec: 58871.3). Total num frames: 8134819840. Throughput: 0: 58994.8. Samples: 1039945700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:44,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 22:27:46,204][54818] Updated weights for policy 0, policy_version 496518 (0.0025) [2024-04-27 22:27:49,157][54818] Updated weights for policy 0, policy_version 496528 (0.0026) [2024-04-27 22:27:49,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 8135114752. Throughput: 0: 59044.4. Samples: 1040303980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:49,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 22:27:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000496528_8135114752.pth... [2024-04-27 22:27:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000495666_8120991744.pth [2024-04-27 22:27:51,794][54818] Updated weights for policy 0, policy_version 496538 (0.0026) [2024-04-27 22:27:54,253][54587] Fps is (10 sec: 57343.4, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 8135393280. Throughput: 0: 59167.2. Samples: 1040665300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:54,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 22:27:54,711][54818] Updated weights for policy 0, policy_version 496548 (0.0023) [2024-04-27 22:27:57,333][54818] Updated weights for policy 0, policy_version 496558 (0.0027) [2024-04-27 22:27:59,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8135688192. Throughput: 0: 58817.8. Samples: 1040833600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:27:59,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 22:28:00,128][54818] Updated weights for policy 0, policy_version 496568 (0.0027) [2024-04-27 22:28:02,944][54818] Updated weights for policy 0, policy_version 496578 (0.0024) [2024-04-27 22:28:04,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8135983104. Throughput: 0: 58827.2. Samples: 1041181320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 22:28:04,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:28:05,590][54818] Updated weights for policy 0, policy_version 496588 (0.0023) [2024-04-27 22:28:08,443][54818] Updated weights for policy 0, policy_version 496598 (0.0025) [2024-04-27 22:28:09,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8136278016. Throughput: 0: 59002.2. Samples: 1041536960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:09,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 22:28:11,236][54818] Updated weights for policy 0, policy_version 496608 (0.0025) [2024-04-27 22:28:14,107][54818] Updated weights for policy 0, policy_version 496618 (0.0026) [2024-04-27 22:28:14,253][54587] Fps is (10 sec: 60620.2, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8136589312. Throughput: 0: 58614.1. Samples: 1041712240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:14,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 22:28:16,754][54818] Updated weights for policy 0, policy_version 496628 (0.0024) [2024-04-27 22:28:19,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58436.1, 300 sec: 58982.4). Total num frames: 8136867840. Throughput: 0: 58649.3. Samples: 1042064380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:19,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:28:19,695][54818] Updated weights for policy 0, policy_version 496638 (0.0025) [2024-04-27 22:28:22,337][54818] Updated weights for policy 0, policy_version 496648 (0.0026) [2024-04-27 22:28:24,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8137179136. Throughput: 0: 58973.0. Samples: 1042423060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:24,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 22:28:25,326][54818] Updated weights for policy 0, policy_version 496658 (0.0026) [2024-04-27 22:28:28,055][54818] Updated weights for policy 0, policy_version 496668 (0.0027) [2024-04-27 22:28:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58709.2, 300 sec: 58926.8). Total num frames: 8137474048. Throughput: 0: 58979.4. Samples: 1042599780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:29,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 22:28:30,876][54818] Updated weights for policy 0, policy_version 496678 (0.0027) [2024-04-27 22:28:31,478][54798] Signal inference workers to stop experience collection... (15900 times) [2024-04-27 22:28:31,513][54818] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-04-27 22:28:31,540][54798] Signal inference workers to resume experience collection... 
(15900 times) [2024-04-27 22:28:31,541][54818] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-04-27 22:28:33,553][54818] Updated weights for policy 0, policy_version 496688 (0.0027) [2024-04-27 22:28:34,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58709.4, 300 sec: 58926.8). Total num frames: 8137768960. Throughput: 0: 58755.5. Samples: 1042947980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:34,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 22:28:36,811][54818] Updated weights for policy 0, policy_version 496698 (0.0026) [2024-04-27 22:28:38,977][54818] Updated weights for policy 0, policy_version 496708 (0.0026) [2024-04-27 22:28:39,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.3, 300 sec: 58926.8). Total num frames: 8138063872. Throughput: 0: 58681.3. Samples: 1043305960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:39,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 22:28:42,120][54818] Updated weights for policy 0, policy_version 496718 (0.0025) [2024-04-27 22:28:44,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8138358784. Throughput: 0: 58976.0. Samples: 1043487520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:44,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:28:44,533][54818] Updated weights for policy 0, policy_version 496728 (0.0027) [2024-04-27 22:28:47,686][54818] Updated weights for policy 0, policy_version 496738 (0.0025) [2024-04-27 22:28:49,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8138653696. Throughput: 0: 59053.7. Samples: 1043838740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:49,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 22:28:50,289][54818] Updated weights for policy 0, policy_version 496748 (0.0025) [2024-04-27 22:28:53,209][54818] Updated weights for policy 0, policy_version 496758 (0.0027) [2024-04-27 22:28:54,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59528.7, 300 sec: 58926.9). Total num frames: 8138964992. Throughput: 0: 58988.1. Samples: 1044191420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:54,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 22:28:55,843][54818] Updated weights for policy 0, policy_version 496768 (0.0025) [2024-04-27 22:28:58,659][54818] Updated weights for policy 0, policy_version 496778 (0.0027) [2024-04-27 22:28:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 8139243520. Throughput: 0: 58930.7. Samples: 1044364120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:28:59,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 22:29:01,541][54818] Updated weights for policy 0, policy_version 496788 (0.0025) [2024-04-27 22:29:04,108][54818] Updated weights for policy 0, policy_version 496798 (0.0025) [2024-04-27 22:29:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59255.4, 300 sec: 58926.9). Total num frames: 8139538432. Throughput: 0: 59109.5. Samples: 1044724300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:29:04,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:29:07,093][54818] Updated weights for policy 0, policy_version 496808 (0.0025) [2024-04-27 22:29:09,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8139816960. Throughput: 0: 59043.8. Samples: 1045080040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:29:09,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 22:29:09,849][54818] Updated weights for policy 0, policy_version 496818 (0.0027) [2024-04-27 22:29:12,556][54818] Updated weights for policy 0, policy_version 496828 (0.0025) [2024-04-27 22:29:14,253][54587] Fps is (10 sec: 55705.0, 60 sec: 58436.2, 300 sec: 58871.3). Total num frames: 8140095488. Throughput: 0: 58933.3. Samples: 1045251780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:29:14,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 22:29:15,508][54818] Updated weights for policy 0, policy_version 496838 (0.0028) [2024-04-27 22:29:18,109][54818] Updated weights for policy 0, policy_version 496848 (0.0026) [2024-04-27 22:29:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58709.4, 300 sec: 58926.8). Total num frames: 8140390400. Throughput: 0: 59128.4. Samples: 1045608760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:29:19,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 22:29:20,858][54818] Updated weights for policy 0, policy_version 496858 (0.0025) [2024-04-27 22:29:23,647][54818] Updated weights for policy 0, policy_version 496868 (0.0027) [2024-04-27 22:29:24,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58436.2, 300 sec: 58926.8). Total num frames: 8140685312. Throughput: 0: 58857.5. Samples: 1045954540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:29:24,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:29:26,264][54798] Signal inference workers to stop experience collection... (15950 times) [2024-04-27 22:29:26,264][54798] Signal inference workers to resume experience collection... 
(15950 times) [2024-04-27 22:29:26,286][54818] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-04-27 22:29:26,287][54818] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-04-27 22:29:26,372][54818] Updated weights for policy 0, policy_version 496878 (0.0026) [2024-04-27 22:29:29,092][54818] Updated weights for policy 0, policy_version 496888 (0.0026) [2024-04-27 22:29:29,253][54587] Fps is (10 sec: 62259.5, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8141012992. Throughput: 0: 58711.1. Samples: 1046129520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:29:29,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:29:31,916][54818] Updated weights for policy 0, policy_version 496898 (0.0024) [2024-04-27 22:29:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8141291520. Throughput: 0: 58765.3. Samples: 1046483180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-27 22:29:34,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 22:29:35,244][54818] Updated weights for policy 0, policy_version 496908 (0.0028) [2024-04-27 22:29:37,321][54818] Updated weights for policy 0, policy_version 496918 (0.0025) [2024-04-27 22:29:39,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58436.4, 300 sec: 58871.3). Total num frames: 8141570048. Throughput: 0: 58809.7. Samples: 1046837860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:29:39,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 22:29:40,849][54818] Updated weights for policy 0, policy_version 496928 (0.0030) [2024-04-27 22:29:42,826][54818] Updated weights for policy 0, policy_version 496938 (0.0027) [2024-04-27 22:29:44,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8141881344. Throughput: 0: 58992.9. Samples: 1047018800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:29:44,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 22:29:46,226][54818] Updated weights for policy 0, policy_version 496948 (0.0025) [2024-04-27 22:29:48,406][54818] Updated weights for policy 0, policy_version 496958 (0.0028) [2024-04-27 22:29:49,253][54587] Fps is (10 sec: 62258.4, 60 sec: 58982.3, 300 sec: 58926.9). Total num frames: 8142192640. Throughput: 0: 58894.9. Samples: 1047374580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:29:49,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 22:29:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000496960_8142192640.pth... [2024-04-27 22:29:49,263][54587] No heartbeat for components: RolloutWorker_w4 (3937 seconds) [2024-04-27 22:29:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000496096_8128036864.pth [2024-04-27 22:29:51,704][54818] Updated weights for policy 0, policy_version 496968 (0.0027) [2024-04-27 22:29:54,098][54818] Updated weights for policy 0, policy_version 496978 (0.0026) [2024-04-27 22:29:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.2, 300 sec: 58926.9). Total num frames: 8142487552. Throughput: 0: 58634.8. Samples: 1047718600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:29:54,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 22:29:57,335][54818] Updated weights for policy 0, policy_version 496988 (0.0027) [2024-04-27 22:29:59,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8142782464. Throughput: 0: 58973.9. Samples: 1047905600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:29:59,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 22:29:59,690][54818] Updated weights for policy 0, policy_version 496998 (0.0024) [2024-04-27 22:30:02,803][54818] Updated weights for policy 0, policy_version 497008 (0.0027) [2024-04-27 22:30:04,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8143060992. Throughput: 0: 58866.0. Samples: 1048257720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:04,253][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 22:30:05,132][54818] Updated weights for policy 0, policy_version 497018 (0.0028) [2024-04-27 22:30:08,277][54818] Updated weights for policy 0, policy_version 497028 (0.0026) [2024-04-27 22:30:09,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58982.6, 300 sec: 58815.8). Total num frames: 8143355904. Throughput: 0: 58971.7. Samples: 1048608260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:09,253][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 22:30:11,018][54818] Updated weights for policy 0, policy_version 497038 (0.0026) [2024-04-27 22:30:13,709][54798] Signal inference workers to stop experience collection... (16000 times) [2024-04-27 22:30:13,741][54818] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-04-27 22:30:13,769][54798] Signal inference workers to resume experience collection... (16000 times) [2024-04-27 22:30:13,770][54818] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-04-27 22:30:13,880][54818] Updated weights for policy 0, policy_version 497048 (0.0027) [2024-04-27 22:30:14,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 8143650816. Throughput: 0: 58981.3. Samples: 1048783680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:14,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 22:30:16,697][54818] Updated weights for policy 0, policy_version 497058 (0.0026) [2024-04-27 22:30:19,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 8143945728. Throughput: 0: 58898.2. Samples: 1049133600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:19,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 22:30:19,468][54818] Updated weights for policy 0, policy_version 497068 (0.0026) [2024-04-27 22:30:22,129][54818] Updated weights for policy 0, policy_version 497078 (0.0026) [2024-04-27 22:30:24,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58982.5, 300 sec: 58815.8). Total num frames: 8144224256. Throughput: 0: 58979.2. Samples: 1049491920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:24,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 22:30:25,331][54818] Updated weights for policy 0, policy_version 497088 (0.0026) [2024-04-27 22:30:27,820][54818] Updated weights for policy 0, policy_version 497098 (0.0025) [2024-04-27 22:30:29,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58436.3, 300 sec: 58872.1). Total num frames: 8144519168. Throughput: 0: 58704.4. Samples: 1049660500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:29,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 22:30:30,782][54818] Updated weights for policy 0, policy_version 497108 (0.0027) [2024-04-27 22:30:33,194][54818] Updated weights for policy 0, policy_version 497118 (0.0025) [2024-04-27 22:30:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8144814080. Throughput: 0: 58876.2. Samples: 1050024000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:34,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 22:30:36,128][54818] Updated weights for policy 0, policy_version 497128 (0.0027) [2024-04-27 22:30:38,772][54818] Updated weights for policy 0, policy_version 497138 (0.0025) [2024-04-27 22:30:39,253][54587] Fps is (10 sec: 62259.3, 60 sec: 59528.5, 300 sec: 58982.4). Total num frames: 8145141760. Throughput: 0: 59056.1. Samples: 1050376120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:39,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 22:30:41,708][54818] Updated weights for policy 0, policy_version 497148 (0.0027) [2024-04-27 22:30:44,135][54818] Updated weights for policy 0, policy_version 497158 (0.0026) [2024-04-27 22:30:44,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8145436672. Throughput: 0: 58895.3. Samples: 1050555880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:44,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 22:30:47,115][54818] Updated weights for policy 0, policy_version 497168 (0.0024) [2024-04-27 22:30:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.5, 300 sec: 58982.4). Total num frames: 8145715200. Throughput: 0: 58869.7. Samples: 1050906860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:49,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 22:30:49,739][54818] Updated weights for policy 0, policy_version 497178 (0.0027) [2024-04-27 22:30:52,686][54818] Updated weights for policy 0, policy_version 497188 (0.0026) [2024-04-27 22:30:54,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8146010112. Throughput: 0: 58861.3. Samples: 1051257020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:54,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 22:30:55,352][54818] Updated weights for policy 0, policy_version 497198 (0.0026) [2024-04-27 22:30:58,160][54818] Updated weights for policy 0, policy_version 497208 (0.0027) [2024-04-27 22:30:59,253][54587] Fps is (10 sec: 60620.2, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8146321408. Throughput: 0: 58948.4. Samples: 1051436360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-27 22:30:59,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 22:31:00,821][54818] Updated weights for policy 0, policy_version 497218 (0.0026) [2024-04-27 22:31:03,681][54818] Updated weights for policy 0, policy_version 497228 (0.0025) [2024-04-27 22:31:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8146616320. Throughput: 0: 59143.7. Samples: 1051795060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:04,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 22:31:05,507][54798] Signal inference workers to stop experience collection... (16050 times) [2024-04-27 22:31:05,507][54798] Signal inference workers to resume experience collection... (16050 times) [2024-04-27 22:31:05,520][54818] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-04-27 22:31:05,520][54818] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-04-27 22:31:06,516][54818] Updated weights for policy 0, policy_version 497238 (0.0025) [2024-04-27 22:31:09,250][54818] Updated weights for policy 0, policy_version 497248 (0.0026) [2024-04-27 22:31:09,253][54587] Fps is (10 sec: 58983.5, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8146911232. Throughput: 0: 58960.1. Samples: 1052145120. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:09,253][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 22:31:12,337][54818] Updated weights for policy 0, policy_version 497258 (0.0026) [2024-04-27 22:31:14,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58982.4, 300 sec: 58871.3). Total num frames: 8147189760. Throughput: 0: 58992.3. Samples: 1052315160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:14,254][54587] Avg episode reward: [(0, '0.694')] [2024-04-27 22:31:14,811][54818] Updated weights for policy 0, policy_version 497268 (0.0028) [2024-04-27 22:31:17,904][54818] Updated weights for policy 0, policy_version 497278 (0.0026) [2024-04-27 22:31:19,253][54587] Fps is (10 sec: 55705.7, 60 sec: 58709.5, 300 sec: 58760.3). Total num frames: 8147468288. Throughput: 0: 58970.8. Samples: 1052677680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:19,253][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 22:31:20,352][54818] Updated weights for policy 0, policy_version 497288 (0.0025) [2024-04-27 22:31:23,379][54818] Updated weights for policy 0, policy_version 497298 (0.0026) [2024-04-27 22:31:24,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 8147779584. Throughput: 0: 58997.7. Samples: 1053031020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:24,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 22:31:25,994][54818] Updated weights for policy 0, policy_version 497308 (0.0026) [2024-04-27 22:31:28,920][54818] Updated weights for policy 0, policy_version 497318 (0.0026) [2024-04-27 22:31:29,253][54587] Fps is (10 sec: 62257.7, 60 sec: 59528.4, 300 sec: 58871.3). Total num frames: 8148090880. Throughput: 0: 58906.4. Samples: 1053206680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:29,262][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 22:31:31,620][54818] Updated weights for policy 0, policy_version 497328 (0.0025) [2024-04-27 22:31:34,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.5, 300 sec: 58926.9). Total num frames: 8148369408. Throughput: 0: 58961.4. Samples: 1053560120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:34,254][54587] Avg episode reward: [(0, '0.683')] [2024-04-27 22:31:34,329][54818] Updated weights for policy 0, policy_version 497338 (0.0025) [2024-04-27 22:31:37,035][54818] Updated weights for policy 0, policy_version 497348 (0.0026) [2024-04-27 22:31:39,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8148664320. Throughput: 0: 59012.3. Samples: 1053912580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:39,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 22:31:39,802][54818] Updated weights for policy 0, policy_version 497358 (0.0025) [2024-04-27 22:31:42,434][54818] Updated weights for policy 0, policy_version 497368 (0.0028) [2024-04-27 22:31:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 8148959232. Throughput: 0: 59088.6. Samples: 1054095340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:44,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 22:31:45,330][54818] Updated weights for policy 0, policy_version 497378 (0.0025) [2024-04-27 22:31:48,207][54818] Updated weights for policy 0, policy_version 497388 (0.0025) [2024-04-27 22:31:49,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8149270528. Throughput: 0: 59003.5. Samples: 1054450220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:49,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 22:31:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000497392_8149270528.pth... [2024-04-27 22:31:49,343][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000496528_8135114752.pth [2024-04-27 22:31:51,304][54818] Updated weights for policy 0, policy_version 497398 (0.0025) [2024-04-27 22:31:53,824][54818] Updated weights for policy 0, policy_version 497408 (0.0024) [2024-04-27 22:31:54,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8149565440. Throughput: 0: 59053.6. Samples: 1054802540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:54,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 22:31:56,725][54818] Updated weights for policy 0, policy_version 497418 (0.0026) [2024-04-27 22:31:59,233][54818] Updated weights for policy 0, policy_version 497428 (0.0027) [2024-04-27 22:31:59,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.6, 300 sec: 58982.4). Total num frames: 8149860352. Throughput: 0: 59294.5. Samples: 1054983400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:31:59,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 22:32:02,341][54798] Signal inference workers to stop experience collection... (16100 times) [2024-04-27 22:32:02,384][54818] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-04-27 22:32:02,399][54798] Signal inference workers to resume experience collection... (16100 times) [2024-04-27 22:32:02,402][54818] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-04-27 22:32:02,405][54818] Updated weights for policy 0, policy_version 497438 (0.0026) [2024-04-27 22:32:04,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8150138880. Throughput: 0: 59001.6. Samples: 1055332760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:32:04,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 22:32:04,858][54818] Updated weights for policy 0, policy_version 497448 (0.0024) [2024-04-27 22:32:07,715][54818] Updated weights for policy 0, policy_version 497458 (0.0025) [2024-04-27 22:32:09,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 8150433792. Throughput: 0: 58999.1. Samples: 1055685980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:32:09,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 22:32:10,208][54818] Updated weights for policy 0, policy_version 497468 (0.0025) [2024-04-27 22:32:13,069][54818] Updated weights for policy 0, policy_version 497478 (0.0025) [2024-04-27 22:32:14,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.5, 300 sec: 58926.8). Total num frames: 8150745088. Throughput: 0: 59129.4. Samples: 1055867500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:32:14,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 22:32:15,847][54818] Updated weights for policy 0, policy_version 497488 (0.0025) [2024-04-27 22:32:18,832][54818] Updated weights for policy 0, policy_version 497498 (0.0025) [2024-04-27 22:32:19,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59528.3, 300 sec: 58926.8). Total num frames: 8151040000. Throughput: 0: 59114.4. Samples: 1056220280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:32:19,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 22:32:21,384][54818] Updated weights for policy 0, policy_version 497508 (0.0023) [2024-04-27 22:32:24,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 8151318528. Throughput: 0: 59221.5. Samples: 1056577540. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-27 22:32:24,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:32:24,278][54818] Updated weights for policy 0, policy_version 497518 (0.0023) [2024-04-27 22:32:27,071][54818] Updated weights for policy 0, policy_version 497528 (0.0025) [2024-04-27 22:32:29,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8151629824. Throughput: 0: 58983.6. Samples: 1056749600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:32:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 22:32:29,713][54818] Updated weights for policy 0, policy_version 497538 (0.0027) [2024-04-27 22:32:32,556][54818] Updated weights for policy 0, policy_version 497548 (0.0025) [2024-04-27 22:32:34,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8151891968. Throughput: 0: 59051.6. Samples: 1057107540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:32:34,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 22:32:35,208][54818] Updated weights for policy 0, policy_version 497558 (0.0028) [2024-04-27 22:32:38,438][54818] Updated weights for policy 0, policy_version 497568 (0.0024) [2024-04-27 22:32:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8152219648. Throughput: 0: 59105.4. Samples: 1057462280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:32:39,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 22:32:40,786][54818] Updated weights for policy 0, policy_version 497578 (0.0028) [2024-04-27 22:32:43,915][54818] Updated weights for policy 0, policy_version 497588 (0.0027) [2024-04-27 22:32:44,253][54587] Fps is (10 sec: 62259.3, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8152514560. Throughput: 0: 58975.1. Samples: 1057637280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:32:44,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 22:32:46,230][54818] Updated weights for policy 0, policy_version 497598 (0.0024) [2024-04-27 22:32:49,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8152793088. Throughput: 0: 59013.8. Samples: 1057988380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:32:49,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-27 22:32:49,263][54587] No heartbeat for components: RolloutWorker_w4 (4117 seconds) [2024-04-27 22:32:49,356][54818] Updated weights for policy 0, policy_version 497608 (0.0025) [2024-04-27 22:32:51,136][54798] Signal inference workers to stop experience collection... (16150 times) [2024-04-27 22:32:51,136][54798] Signal inference workers to resume experience collection... (16150 times) [2024-04-27 22:32:51,157][54818] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-04-27 22:32:51,157][54818] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-04-27 22:32:51,808][54818] Updated weights for policy 0, policy_version 497618 (0.0027) [2024-04-27 22:32:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8153088000. Throughput: 0: 59172.0. Samples: 1058348720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:32:54,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 22:32:54,839][54818] Updated weights for policy 0, policy_version 497628 (0.0026) [2024-04-27 22:32:57,559][54818] Updated weights for policy 0, policy_version 497638 (0.0026) [2024-04-27 22:32:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8153399296. Throughput: 0: 59097.8. Samples: 1058526900. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:32:59,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 22:33:00,359][54818] Updated weights for policy 0, policy_version 497648 (0.0025) [2024-04-27 22:33:02,999][54818] Updated weights for policy 0, policy_version 497658 (0.0027) [2024-04-27 22:33:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59255.3, 300 sec: 59037.9). Total num frames: 8153694208. Throughput: 0: 58908.5. Samples: 1058871160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:04,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 22:33:06,110][54818] Updated weights for policy 0, policy_version 497668 (0.0028) [2024-04-27 22:33:08,670][54818] Updated weights for policy 0, policy_version 497678 (0.0024) [2024-04-27 22:33:09,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8153989120. Throughput: 0: 59008.0. Samples: 1059232900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:09,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 22:33:11,656][54818] Updated weights for policy 0, policy_version 497688 (0.0025) [2024-04-27 22:33:14,152][54818] Updated weights for policy 0, policy_version 497698 (0.0024) [2024-04-27 22:33:14,253][54587] Fps is (10 sec: 58983.5, 60 sec: 58982.5, 300 sec: 59038.0). Total num frames: 8154284032. Throughput: 0: 59233.9. Samples: 1059415120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:14,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 22:33:17,194][54818] Updated weights for policy 0, policy_version 497708 (0.0027) [2024-04-27 22:33:19,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.6, 300 sec: 58982.4). Total num frames: 8154578944. Throughput: 0: 59305.8. Samples: 1059776300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:19,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 22:33:19,689][54818] Updated weights for policy 0, policy_version 497718 (0.0022) [2024-04-27 22:33:22,778][54818] Updated weights for policy 0, policy_version 497728 (0.0027) [2024-04-27 22:33:24,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8154873856. Throughput: 0: 58937.9. Samples: 1060114480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:24,253][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 22:33:25,156][54818] Updated weights for policy 0, policy_version 497738 (0.0026) [2024-04-27 22:33:28,397][54818] Updated weights for policy 0, policy_version 497748 (0.0027) [2024-04-27 22:33:29,253][54587] Fps is (10 sec: 60619.5, 60 sec: 59255.3, 300 sec: 59037.9). Total num frames: 8155185152. Throughput: 0: 59171.3. Samples: 1060300000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:29,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 22:33:30,695][54818] Updated weights for policy 0, policy_version 497758 (0.0025) [2024-04-27 22:33:33,833][54818] Updated weights for policy 0, policy_version 497768 (0.0023) [2024-04-27 22:33:34,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59528.6, 300 sec: 58982.4). Total num frames: 8155463680. Throughput: 0: 59291.6. Samples: 1060656500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:34,253][54587] Avg episode reward: [(0, '0.681')] [2024-04-27 22:33:36,083][54818] Updated weights for policy 0, policy_version 497778 (0.0025) [2024-04-27 22:33:39,253][54587] Fps is (10 sec: 55706.0, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8155742208. Throughput: 0: 59134.6. Samples: 1061009780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:39,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 22:33:39,466][54818] Updated weights for policy 0, policy_version 497788 (0.0026) [2024-04-27 22:33:41,635][54818] Updated weights for policy 0, policy_version 497798 (0.0025) [2024-04-27 22:33:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8156037120. Throughput: 0: 58838.3. Samples: 1061174620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:44,254][54587] Avg episode reward: [(0, '0.469')] [2024-04-27 22:33:44,937][54818] Updated weights for policy 0, policy_version 497808 (0.0025) [2024-04-27 22:33:45,151][54798] Signal inference workers to stop experience collection... (16200 times) [2024-04-27 22:33:45,152][54798] Signal inference workers to resume experience collection... (16200 times) [2024-04-27 22:33:45,165][54818] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-04-27 22:33:45,175][54818] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-04-27 22:33:47,221][54818] Updated weights for policy 0, policy_version 497818 (0.0025) [2024-04-27 22:33:49,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.2, 300 sec: 58871.3). Total num frames: 8156332032. Throughput: 0: 59284.4. Samples: 1061538960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 22:33:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000497823_8156332032.pth... 
[2024-04-27 22:33:49,327][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000496960_8142192640.pth [2024-04-27 22:33:50,422][54818] Updated weights for policy 0, policy_version 497828 (0.0026) [2024-04-27 22:33:52,905][54818] Updated weights for policy 0, policy_version 497838 (0.0026) [2024-04-27 22:33:54,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8156626944. Throughput: 0: 59292.0. Samples: 1061901040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-27 22:33:54,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 22:33:55,866][54818] Updated weights for policy 0, policy_version 497848 (0.0024) [2024-04-27 22:33:58,430][54818] Updated weights for policy 0, policy_version 497858 (0.0025) [2024-04-27 22:33:59,253][54587] Fps is (10 sec: 57345.1, 60 sec: 58436.4, 300 sec: 58871.3). Total num frames: 8156905472. Throughput: 0: 58794.2. Samples: 1062060860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:33:59,253][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 22:34:01,381][54818] Updated weights for policy 0, policy_version 497868 (0.0027) [2024-04-27 22:34:04,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8157216768. Throughput: 0: 58779.0. Samples: 1062421360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:04,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 22:34:04,474][54818] Updated weights for policy 0, policy_version 497878 (0.0027) [2024-04-27 22:34:06,796][54818] Updated weights for policy 0, policy_version 497888 (0.0024) [2024-04-27 22:34:09,253][54587] Fps is (10 sec: 63896.7, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8157544448. Throughput: 0: 59131.4. Samples: 1062775400. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:09,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 22:34:10,057][54818] Updated weights for policy 0, policy_version 497898 (0.0026) [2024-04-27 22:34:12,280][54818] Updated weights for policy 0, policy_version 497908 (0.0026) [2024-04-27 22:34:14,253][54587] Fps is (10 sec: 62259.2, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8157839360. Throughput: 0: 59185.0. Samples: 1062963320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:14,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 22:34:15,515][54818] Updated weights for policy 0, policy_version 497918 (0.0025) [2024-04-27 22:34:17,770][54818] Updated weights for policy 0, policy_version 497928 (0.0025) [2024-04-27 22:34:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59528.4, 300 sec: 59204.6). Total num frames: 8158150656. Throughput: 0: 58936.8. Samples: 1063308660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:19,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 22:34:21,035][54818] Updated weights for policy 0, policy_version 497938 (0.0025) [2024-04-27 22:34:23,292][54818] Updated weights for policy 0, policy_version 497948 (0.0025) [2024-04-27 22:34:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8158445568. Throughput: 0: 58882.8. Samples: 1063659500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:24,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 22:34:26,587][54818] Updated weights for policy 0, policy_version 497958 (0.0026) [2024-04-27 22:34:28,859][54818] Updated weights for policy 0, policy_version 497968 (0.0024) [2024-04-27 22:34:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8158740480. Throughput: 0: 59446.2. Samples: 1063849700. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:29,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:34:31,958][54818] Updated weights for policy 0, policy_version 497978 (0.0026) [2024-04-27 22:34:33,208][54798] Signal inference workers to stop experience collection... (16250 times) [2024-04-27 22:34:33,237][54818] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-04-27 22:34:33,291][54798] Signal inference workers to resume experience collection... (16250 times) [2024-04-27 22:34:33,292][54818] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-04-27 22:34:34,145][54818] Updated weights for policy 0, policy_version 497988 (0.0022) [2024-04-27 22:34:34,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8159035392. Throughput: 0: 59302.3. Samples: 1064207560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:34,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 22:34:37,517][54818] Updated weights for policy 0, policy_version 497998 (0.0026) [2024-04-27 22:34:39,253][54587] Fps is (10 sec: 57344.6, 60 sec: 59528.7, 300 sec: 59093.5). Total num frames: 8159313920. Throughput: 0: 58985.4. Samples: 1064555380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:39,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 22:34:39,852][54818] Updated weights for policy 0, policy_version 498008 (0.0024) [2024-04-27 22:34:43,332][54818] Updated weights for policy 0, policy_version 498018 (0.0026) [2024-04-27 22:34:44,253][54587] Fps is (10 sec: 55706.4, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8159592448. Throughput: 0: 59356.5. Samples: 1064731900. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:44,253][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 22:34:45,310][54818] Updated weights for policy 0, policy_version 498028 (0.0026) [2024-04-27 22:34:48,967][54818] Updated weights for policy 0, policy_version 498038 (0.0027) [2024-04-27 22:34:49,253][54587] Fps is (10 sec: 57343.3, 60 sec: 59255.6, 300 sec: 58982.4). Total num frames: 8159887360. Throughput: 0: 59334.7. Samples: 1065091420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:49,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:34:50,963][54818] Updated weights for policy 0, policy_version 498048 (0.0022) [2024-04-27 22:34:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.4, 300 sec: 58926.9). Total num frames: 8160165888. Throughput: 0: 59286.8. Samples: 1065443300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:54,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 22:34:54,543][54818] Updated weights for policy 0, policy_version 498058 (0.0025) [2024-04-27 22:34:56,348][54818] Updated weights for policy 0, policy_version 498068 (0.0024) [2024-04-27 22:34:59,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8160460800. Throughput: 0: 58908.6. Samples: 1065614200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:34:59,253][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 22:34:59,925][54818] Updated weights for policy 0, policy_version 498078 (0.0026) [2024-04-27 22:35:01,880][54818] Updated weights for policy 0, policy_version 498088 (0.0026) [2024-04-27 22:35:04,253][54587] Fps is (10 sec: 62259.4, 60 sec: 59528.6, 300 sec: 59093.5). Total num frames: 8160788480. Throughput: 0: 59074.3. Samples: 1065967000. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:35:04,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 22:35:05,448][54818] Updated weights for policy 0, policy_version 498098 (0.0025) [2024-04-27 22:35:07,386][54818] Updated weights for policy 0, policy_version 498108 (0.0026) [2024-04-27 22:35:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.4, 300 sec: 59038.0). Total num frames: 8161067008. Throughput: 0: 59364.0. Samples: 1066330880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:35:09,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 22:35:10,853][54818] Updated weights for policy 0, policy_version 498118 (0.0025) [2024-04-27 22:35:13,221][54818] Updated weights for policy 0, policy_version 498128 (0.0024) [2024-04-27 22:35:14,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8161361920. Throughput: 0: 58857.7. Samples: 1066498300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:35:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 22:35:16,318][54818] Updated weights for policy 0, policy_version 498138 (0.0026) [2024-04-27 22:35:16,811][54798] Signal inference workers to stop experience collection... (16300 times) [2024-04-27 22:35:16,812][54798] Signal inference workers to resume experience collection... (16300 times) [2024-04-27 22:35:16,839][54818] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-04-27 22:35:16,839][54818] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-04-27 22:35:18,799][54818] Updated weights for policy 0, policy_version 498148 (0.0026) [2024-04-27 22:35:19,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8161673216. Throughput: 0: 58795.6. Samples: 1066853360. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-27 22:35:19,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 22:35:21,769][54818] Updated weights for policy 0, policy_version 498158 (0.0025) [2024-04-27 22:35:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58709.2, 300 sec: 59149.0). Total num frames: 8161968128. Throughput: 0: 58974.9. Samples: 1067209260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:35:24,254][54587] Avg episode reward: [(0, '0.511')] [2024-04-27 22:35:24,266][54818] Updated weights for policy 0, policy_version 498168 (0.0025) [2024-04-27 22:35:27,277][54818] Updated weights for policy 0, policy_version 498178 (0.0023) [2024-04-27 22:35:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8162263040. Throughput: 0: 59251.9. Samples: 1067398240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:35:29,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 22:35:30,138][54818] Updated weights for policy 0, policy_version 498188 (0.0026) [2024-04-27 22:35:32,883][54818] Updated weights for policy 0, policy_version 498198 (0.0034) [2024-04-27 22:35:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8162557952. Throughput: 0: 58992.5. Samples: 1067746080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:35:34,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 22:35:35,552][54818] Updated weights for policy 0, policy_version 498208 (0.0026) [2024-04-27 22:35:38,245][54818] Updated weights for policy 0, policy_version 498218 (0.0024) [2024-04-27 22:35:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.3, 300 sec: 59093.5). Total num frames: 8162869248. Throughput: 0: 58784.4. Samples: 1068088600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:35:39,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 22:35:41,052][54818] Updated weights for policy 0, policy_version 498228 (0.0024) [2024-04-27 22:35:43,933][54818] Updated weights for policy 0, policy_version 498238 (0.0022) [2024-04-27 22:35:44,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8163147776. Throughput: 0: 59050.7. Samples: 1068271480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:35:44,253][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 22:35:46,550][54818] Updated weights for policy 0, policy_version 498248 (0.0027) [2024-04-27 22:35:49,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59255.4, 300 sec: 59093.4). Total num frames: 8163442688. Throughput: 0: 59141.2. Samples: 1068628360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:35:49,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 22:35:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000498257_8163442688.pth... [2024-04-27 22:35:49,266][54587] No heartbeat for components: RolloutWorker_w4 (4297 seconds) [2024-04-27 22:35:49,322][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000497392_8149270528.pth [2024-04-27 22:35:49,488][54818] Updated weights for policy 0, policy_version 498258 (0.0023) [2024-04-27 22:35:52,173][54818] Updated weights for policy 0, policy_version 498268 (0.0025) [2024-04-27 22:35:54,253][54587] Fps is (10 sec: 57343.3, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8163721216. Throughput: 0: 59048.3. Samples: 1068988060. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:35:54,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 22:35:54,833][54818] Updated weights for policy 0, policy_version 498278 (0.0025) [2024-04-27 22:35:57,786][54818] Updated weights for policy 0, policy_version 498288 (0.0024) [2024-04-27 22:35:59,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8164016128. Throughput: 0: 59142.3. Samples: 1069159700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:35:59,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 22:36:00,238][54818] Updated weights for policy 0, policy_version 498298 (0.0025) [2024-04-27 22:36:03,520][54818] Updated weights for policy 0, policy_version 498308 (0.0026) [2024-04-27 22:36:04,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8164311040. Throughput: 0: 59030.3. Samples: 1069509720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:04,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 22:36:05,844][54818] Updated weights for policy 0, policy_version 498318 (0.0026) [2024-04-27 22:36:08,962][54818] Updated weights for policy 0, policy_version 498328 (0.0026) [2024-04-27 22:36:09,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8164605952. Throughput: 0: 58910.3. Samples: 1069860220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:09,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:36:10,674][54798] Signal inference workers to stop experience collection... (16350 times) [2024-04-27 22:36:10,718][54818] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-04-27 22:36:10,734][54798] Signal inference workers to resume experience collection... 
(16350 times) [2024-04-27 22:36:10,735][54818] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-04-27 22:36:11,455][54818] Updated weights for policy 0, policy_version 498338 (0.0026) [2024-04-27 22:36:14,253][54587] Fps is (10 sec: 60619.9, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8164917248. Throughput: 0: 58615.4. Samples: 1070035940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:14,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 22:36:14,454][54818] Updated weights for policy 0, policy_version 498348 (0.0026) [2024-04-27 22:36:17,225][54818] Updated weights for policy 0, policy_version 498358 (0.0026) [2024-04-27 22:36:19,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8165195776. Throughput: 0: 58794.6. Samples: 1070391840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:19,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 22:36:20,272][54818] Updated weights for policy 0, policy_version 498368 (0.0028) [2024-04-27 22:36:22,999][54818] Updated weights for policy 0, policy_version 498378 (0.0025) [2024-04-27 22:36:24,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58982.5, 300 sec: 59038.0). Total num frames: 8165507072. Throughput: 0: 58921.9. Samples: 1070740080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:24,262][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 22:36:25,817][54818] Updated weights for policy 0, policy_version 498388 (0.0026) [2024-04-27 22:36:28,397][54818] Updated weights for policy 0, policy_version 498398 (0.0026) [2024-04-27 22:36:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8165801984. Throughput: 0: 58915.0. Samples: 1070922660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:29,262][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 22:36:31,245][54818] Updated weights for policy 0, policy_version 498408 (0.0025) [2024-04-27 22:36:33,872][54818] Updated weights for policy 0, policy_version 498418 (0.0026) [2024-04-27 22:36:34,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8166096896. Throughput: 0: 58804.2. Samples: 1071274540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:34,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 22:36:36,722][54818] Updated weights for policy 0, policy_version 498428 (0.0026) [2024-04-27 22:36:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8166391808. Throughput: 0: 58624.5. Samples: 1071626160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:39,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 22:36:39,508][54818] Updated weights for policy 0, policy_version 498438 (0.0025) [2024-04-27 22:36:42,559][54818] Updated weights for policy 0, policy_version 498448 (0.0026) [2024-04-27 22:36:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8166670336. Throughput: 0: 58860.1. Samples: 1071808400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:44,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 22:36:45,081][54818] Updated weights for policy 0, policy_version 498458 (0.0026) [2024-04-27 22:36:48,230][54818] Updated weights for policy 0, policy_version 498468 (0.0025) [2024-04-27 22:36:49,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8166965248. Throughput: 0: 58963.8. Samples: 1072163100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-27 22:36:49,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 22:36:50,660][54818] Updated weights for policy 0, policy_version 498478 (0.0026) [2024-04-27 22:36:53,738][54818] Updated weights for policy 0, policy_version 498488 (0.0026) [2024-04-27 22:36:54,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8167260160. Throughput: 0: 59164.1. Samples: 1072522600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:36:54,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 22:36:56,117][54818] Updated weights for policy 0, policy_version 498498 (0.0026) [2024-04-27 22:36:59,147][54818] Updated weights for policy 0, policy_version 498508 (0.0026) [2024-04-27 22:36:59,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8167555072. Throughput: 0: 58945.1. Samples: 1072688460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:36:59,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-27 22:37:01,706][54818] Updated weights for policy 0, policy_version 498518 (0.0026) [2024-04-27 22:37:04,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8167849984. Throughput: 0: 59091.2. Samples: 1073050940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:04,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 22:37:04,620][54818] Updated weights for policy 0, policy_version 498528 (0.0026) [2024-04-27 22:37:07,174][54818] Updated weights for policy 0, policy_version 498538 (0.0026) [2024-04-27 22:37:09,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.2, 300 sec: 58926.9). Total num frames: 8168128512. Throughput: 0: 59150.5. Samples: 1073401860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:09,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 22:37:10,089][54818] Updated weights for policy 0, policy_version 498548 (0.0025) [2024-04-27 22:37:11,384][54798] Signal inference workers to stop experience collection... (16400 times) [2024-04-27 22:37:11,385][54798] Signal inference workers to resume experience collection... (16400 times) [2024-04-27 22:37:11,404][54818] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-04-27 22:37:11,405][54818] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-04-27 22:37:12,741][54818] Updated weights for policy 0, policy_version 498558 (0.0026) [2024-04-27 22:37:14,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8168439808. Throughput: 0: 58907.0. Samples: 1073573480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:14,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 22:37:15,667][54818] Updated weights for policy 0, policy_version 498568 (0.0026) [2024-04-27 22:37:18,141][54818] Updated weights for policy 0, policy_version 498578 (0.0026) [2024-04-27 22:37:19,253][54587] Fps is (10 sec: 62259.0, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8168751104. Throughput: 0: 58977.1. Samples: 1073928520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:19,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 22:37:21,226][54818] Updated weights for policy 0, policy_version 498588 (0.0025) [2024-04-27 22:37:24,025][54818] Updated weights for policy 0, policy_version 498598 (0.0025) [2024-04-27 22:37:24,253][54587] Fps is (10 sec: 58983.4, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8169029632. Throughput: 0: 59084.6. Samples: 1074284960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:24,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 22:37:26,833][54818] Updated weights for policy 0, policy_version 498608 (0.0026) [2024-04-27 22:37:29,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8169340928. Throughput: 0: 59066.5. Samples: 1074466400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:29,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 22:37:29,636][54818] Updated weights for policy 0, policy_version 498618 (0.0022) [2024-04-27 22:37:32,406][54818] Updated weights for policy 0, policy_version 498628 (0.0028) [2024-04-27 22:37:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8169652224. Throughput: 0: 58953.1. Samples: 1074815980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:34,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 22:37:35,146][54818] Updated weights for policy 0, policy_version 498638 (0.0027) [2024-04-27 22:37:37,780][54818] Updated weights for policy 0, policy_version 498648 (0.0026) [2024-04-27 22:37:39,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8169930752. Throughput: 0: 58777.3. Samples: 1075167580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:39,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 22:37:40,549][54818] Updated weights for policy 0, policy_version 498658 (0.0025) [2024-04-27 22:37:43,203][54818] Updated weights for policy 0, policy_version 498668 (0.0024) [2024-04-27 22:37:44,253][54587] Fps is (10 sec: 58981.5, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8170242048. Throughput: 0: 59339.9. Samples: 1075358760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:44,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-27 22:37:45,971][54818] Updated weights for policy 0, policy_version 498678 (0.0025) [2024-04-27 22:37:48,839][54818] Updated weights for policy 0, policy_version 498688 (0.0025) [2024-04-27 22:37:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8170536960. Throughput: 0: 59217.3. Samples: 1075715720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:49,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 22:37:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000498690_8170536960.pth... [2024-04-27 22:37:49,324][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000497823_8156332032.pth [2024-04-27 22:37:51,437][54818] Updated weights for policy 0, policy_version 498698 (0.0026) [2024-04-27 22:37:54,245][54818] Updated weights for policy 0, policy_version 498708 (0.0024) [2024-04-27 22:37:54,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8170831872. Throughput: 0: 59149.9. Samples: 1076063600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:54,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 22:37:55,312][54798] Signal inference workers to stop experience collection... (16450 times) [2024-04-27 22:37:55,352][54818] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-04-27 22:37:55,362][54798] Signal inference workers to resume experience collection... (16450 times) [2024-04-27 22:37:55,368][54818] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-04-27 22:37:57,403][54818] Updated weights for policy 0, policy_version 498718 (0.0025) [2024-04-27 22:37:59,253][54587] Fps is (10 sec: 57344.7, 60 sec: 59255.5, 300 sec: 59038.0). Total num frames: 8171110400. Throughput: 0: 59177.1. 
Samples: 1076236440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:37:59,253][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 22:37:59,740][54818] Updated weights for policy 0, policy_version 498728 (0.0022) [2024-04-27 22:38:03,121][54818] Updated weights for policy 0, policy_version 498738 (0.0024) [2024-04-27 22:38:04,253][54587] Fps is (10 sec: 55704.9, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8171388928. Throughput: 0: 59216.0. Samples: 1076593240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:38:04,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 22:38:05,384][54818] Updated weights for policy 0, policy_version 498748 (0.0023) [2024-04-27 22:38:09,129][54818] Updated weights for policy 0, policy_version 498758 (0.0025) [2024-04-27 22:38:09,253][54587] Fps is (10 sec: 54066.9, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8171651072. Throughput: 0: 59290.6. Samples: 1076953040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:38:09,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 22:38:10,973][54818] Updated weights for policy 0, policy_version 498768 (0.0022) [2024-04-27 22:38:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58709.4, 300 sec: 58926.8). Total num frames: 8171962368. Throughput: 0: 58749.4. Samples: 1077110120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:38:14,262][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 22:38:14,562][54818] Updated weights for policy 0, policy_version 498778 (0.0026) [2024-04-27 22:38:16,460][54818] Updated weights for policy 0, policy_version 498788 (0.0023) [2024-04-27 22:38:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58436.4, 300 sec: 58926.8). Total num frames: 8172257280. Throughput: 0: 59016.0. Samples: 1077471700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:38:19,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 22:38:20,129][54818] Updated weights for policy 0, policy_version 498798 (0.0026) [2024-04-27 22:38:21,902][54818] Updated weights for policy 0, policy_version 498808 (0.0027) [2024-04-27 22:38:24,253][54587] Fps is (10 sec: 62258.9, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8172584960. Throughput: 0: 59098.1. Samples: 1077827000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:38:24,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 22:38:25,583][54818] Updated weights for policy 0, policy_version 498818 (0.0026) [2024-04-27 22:38:27,342][54818] Updated weights for policy 0, policy_version 498828 (0.0025) [2024-04-27 22:38:29,253][54587] Fps is (10 sec: 63897.2, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8172896256. Throughput: 0: 58893.4. Samples: 1078008960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:38:29,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 22:38:30,982][54818] Updated weights for policy 0, policy_version 498838 (0.0026) [2024-04-27 22:38:32,787][54818] Updated weights for policy 0, policy_version 498848 (0.0027) [2024-04-27 22:38:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8173191168. Throughput: 0: 58664.4. Samples: 1078355620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:38:34,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 22:38:36,551][54818] Updated weights for policy 0, policy_version 498858 (0.0025) [2024-04-27 22:38:36,900][54798] Signal inference workers to stop experience collection... (16500 times) [2024-04-27 22:38:36,900][54798] Signal inference workers to resume experience collection... 
(16500 times) [2024-04-27 22:38:36,920][54818] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-04-27 22:38:36,920][54818] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-04-27 22:38:38,309][54818] Updated weights for policy 0, policy_version 498868 (0.0026) [2024-04-27 22:38:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8173502464. Throughput: 0: 58948.0. Samples: 1078716260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:38:39,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 22:38:41,998][54818] Updated weights for policy 0, policy_version 498878 (0.0026) [2024-04-27 22:38:44,033][54818] Updated weights for policy 0, policy_version 498888 (0.0024) [2024-04-27 22:38:44,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8173780992. Throughput: 0: 59378.5. Samples: 1078908480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:38:44,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:38:47,439][54818] Updated weights for policy 0, policy_version 498898 (0.0026) [2024-04-27 22:38:49,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.5, 300 sec: 59204.5). Total num frames: 8174092288. Throughput: 0: 59181.4. Samples: 1079256400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:38:49,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-27 22:38:49,267][54587] No heartbeat for components: RolloutWorker_w4 (4477 seconds) [2024-04-27 22:38:49,602][54818] Updated weights for policy 0, policy_version 498908 (0.0023) [2024-04-27 22:38:52,861][54818] Updated weights for policy 0, policy_version 498918 (0.0025) [2024-04-27 22:38:54,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.4, 300 sec: 59204.5). Total num frames: 8174370816. Throughput: 0: 59110.2. Samples: 1079613000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:38:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 22:38:55,055][54818] Updated weights for policy 0, policy_version 498928 (0.0026) [2024-04-27 22:38:58,297][54818] Updated weights for policy 0, policy_version 498938 (0.0026) [2024-04-27 22:38:59,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8174665728. Throughput: 0: 59522.2. Samples: 1079788620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:38:59,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 22:39:00,506][54818] Updated weights for policy 0, policy_version 498948 (0.0024) [2024-04-27 22:39:03,915][54818] Updated weights for policy 0, policy_version 498958 (0.0026) [2024-04-27 22:39:04,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.6, 300 sec: 58982.4). Total num frames: 8174944256. Throughput: 0: 59510.7. Samples: 1080149680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:39:04,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 22:39:05,983][54818] Updated weights for policy 0, policy_version 498968 (0.0026) [2024-04-27 22:39:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59801.5, 300 sec: 58982.4). Total num frames: 8175239168. Throughput: 0: 59383.1. Samples: 1080499240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:39:09,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 22:39:09,400][54818] Updated weights for policy 0, policy_version 498978 (0.0027) [2024-04-27 22:39:11,622][54818] Updated weights for policy 0, policy_version 498988 (0.0027) [2024-04-27 22:39:14,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59528.5, 300 sec: 58926.9). Total num frames: 8175534080. Throughput: 0: 58928.0. Samples: 1080660720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:39:14,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 22:39:14,970][54818] Updated weights for policy 0, policy_version 498998 (0.0026) [2024-04-27 22:39:16,580][54798] Signal inference workers to stop experience collection... (16550 times) [2024-04-27 22:39:16,619][54818] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-04-27 22:39:16,672][54798] Signal inference workers to resume experience collection... (16550 times) [2024-04-27 22:39:16,672][54818] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-04-27 22:39:17,354][54818] Updated weights for policy 0, policy_version 499008 (0.0025) [2024-04-27 22:39:19,253][54587] Fps is (10 sec: 57344.8, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 8175812608. Throughput: 0: 59314.4. Samples: 1081024760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:39:19,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 22:39:20,512][54818] Updated weights for policy 0, policy_version 499018 (0.0026) [2024-04-27 22:39:22,898][54818] Updated weights for policy 0, policy_version 499028 (0.0027) [2024-04-27 22:39:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8176107520. Throughput: 0: 59358.1. Samples: 1081387380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:39:24,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 22:39:26,004][54818] Updated weights for policy 0, policy_version 499038 (0.0026) [2024-04-27 22:39:28,578][54818] Updated weights for policy 0, policy_version 499048 (0.0025) [2024-04-27 22:39:29,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 8176402432. Throughput: 0: 58748.9. Samples: 1081552180. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:39:29,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 22:39:31,480][54818] Updated weights for policy 0, policy_version 499058 (0.0026) [2024-04-27 22:39:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8176713728. Throughput: 0: 58795.1. Samples: 1081902180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:39:34,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 22:39:34,604][54818] Updated weights for policy 0, policy_version 499068 (0.0028) [2024-04-27 22:39:37,032][54818] Updated weights for policy 0, policy_version 499078 (0.0026) [2024-04-27 22:39:39,253][54587] Fps is (10 sec: 62259.7, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8177025024. Throughput: 0: 58751.2. Samples: 1082256800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-27 22:39:39,253][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 22:39:40,088][54818] Updated weights for policy 0, policy_version 499088 (0.0025) [2024-04-27 22:39:42,469][54818] Updated weights for policy 0, policy_version 499098 (0.0026) [2024-04-27 22:39:44,253][54587] Fps is (10 sec: 62259.5, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8177336320. Throughput: 0: 59044.5. Samples: 1082445620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:39:44,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 22:39:45,731][54818] Updated weights for policy 0, policy_version 499108 (0.0025) [2024-04-27 22:39:48,006][54818] Updated weights for policy 0, policy_version 499118 (0.0027) [2024-04-27 22:39:49,253][54587] Fps is (10 sec: 60619.5, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8177631232. Throughput: 0: 58812.2. Samples: 1082796240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:39:49,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 22:39:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000499123_8177631232.pth... [2024-04-27 22:39:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000498257_8163442688.pth [2024-04-27 22:39:51,253][54818] Updated weights for policy 0, policy_version 499128 (0.0024) [2024-04-27 22:39:53,513][54818] Updated weights for policy 0, policy_version 499138 (0.0026) [2024-04-27 22:39:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8177909760. Throughput: 0: 58863.6. Samples: 1083148100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:39:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:39:56,862][54818] Updated weights for policy 0, policy_version 499148 (0.0026) [2024-04-27 22:39:58,976][54818] Updated weights for policy 0, policy_version 499158 (0.0023) [2024-04-27 22:39:59,253][54587] Fps is (10 sec: 57345.5, 60 sec: 58982.6, 300 sec: 59038.0). Total num frames: 8178204672. Throughput: 0: 59459.8. Samples: 1083336400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:39:59,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 22:40:02,336][54818] Updated weights for policy 0, policy_version 499168 (0.0026) [2024-04-27 22:40:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8178499584. Throughput: 0: 59167.4. Samples: 1083687300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:04,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 22:40:04,486][54818] Updated weights for policy 0, policy_version 499178 (0.0026) [2024-04-27 22:40:07,730][54818] Updated weights for policy 0, policy_version 499188 (0.0027) [2024-04-27 22:40:09,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58982.5, 300 sec: 59038.0). 
Total num frames: 8178778112. Throughput: 0: 58898.8. Samples: 1084037820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:09,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 22:40:10,031][54818] Updated weights for policy 0, policy_version 499198 (0.0023) [2024-04-27 22:40:13,218][54818] Updated weights for policy 0, policy_version 499208 (0.0024) [2024-04-27 22:40:14,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 8179089408. Throughput: 0: 59128.5. Samples: 1084212960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 22:40:15,598][54818] Updated weights for policy 0, policy_version 499218 (0.0026) [2024-04-27 22:40:18,632][54818] Updated weights for policy 0, policy_version 499228 (0.0025) [2024-04-27 22:40:19,061][54798] Signal inference workers to stop experience collection... (16600 times) [2024-04-27 22:40:19,094][54818] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-04-27 22:40:19,122][54798] Signal inference workers to resume experience collection... (16600 times) [2024-04-27 22:40:19,123][54818] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-04-27 22:40:19,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59528.5, 300 sec: 59038.0). Total num frames: 8179384320. Throughput: 0: 59473.9. Samples: 1084578500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:19,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-27 22:40:21,080][54818] Updated weights for policy 0, policy_version 499238 (0.0030) [2024-04-27 22:40:24,253][54587] Fps is (10 sec: 57343.4, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8179662848. Throughput: 0: 59537.1. Samples: 1084935980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:24,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 22:40:24,312][54818] Updated weights for policy 0, policy_version 499248 (0.0026) [2024-04-27 22:40:26,642][54818] Updated weights for policy 0, policy_version 499258 (0.0026) [2024-04-27 22:40:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59528.6, 300 sec: 59037.9). Total num frames: 8179974144. Throughput: 0: 58989.0. Samples: 1085100120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:29,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 22:40:29,785][54818] Updated weights for policy 0, policy_version 499268 (0.0025) [2024-04-27 22:40:32,010][54818] Updated weights for policy 0, policy_version 499278 (0.0026) [2024-04-27 22:40:34,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58982.6, 300 sec: 58926.9). Total num frames: 8180252672. Throughput: 0: 59070.1. Samples: 1085454380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:34,253][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 22:40:35,211][54818] Updated weights for policy 0, policy_version 499288 (0.0026) [2024-04-27 22:40:37,577][54818] Updated weights for policy 0, policy_version 499298 (0.0026) [2024-04-27 22:40:39,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8180563968. Throughput: 0: 59489.0. Samples: 1085825100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:39,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:40:40,626][54818] Updated weights for policy 0, policy_version 499308 (0.0024) [2024-04-27 22:40:43,223][54818] Updated weights for policy 0, policy_version 499318 (0.0025) [2024-04-27 22:40:44,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8180842496. Throughput: 0: 58966.5. Samples: 1085989900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:44,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 22:40:46,280][54818] Updated weights for policy 0, policy_version 499328 (0.0025) [2024-04-27 22:40:48,883][54818] Updated weights for policy 0, policy_version 499338 (0.0023) [2024-04-27 22:40:49,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8181153792. Throughput: 0: 59162.6. Samples: 1086349620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:49,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 22:40:51,856][54818] Updated weights for policy 0, policy_version 499348 (0.0025) [2024-04-27 22:40:54,253][54587] Fps is (10 sec: 62260.1, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8181465088. Throughput: 0: 59078.3. Samples: 1086696340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:54,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 22:40:54,935][54818] Updated weights for policy 0, policy_version 499358 (0.0026) [2024-04-27 22:40:57,380][54818] Updated weights for policy 0, policy_version 499368 (0.0026) [2024-04-27 22:40:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59255.2, 300 sec: 59149.0). Total num frames: 8181760000. Throughput: 0: 59353.1. Samples: 1086883860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:40:59,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 22:41:00,357][54818] Updated weights for policy 0, policy_version 499378 (0.0027) [2024-04-27 22:41:02,831][54818] Updated weights for policy 0, policy_version 499388 (0.0025) [2024-04-27 22:41:04,253][54587] Fps is (10 sec: 60619.5, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8182071296. Throughput: 0: 59070.9. Samples: 1087236700. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:41:04,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:41:05,733][54818] Updated weights for policy 0, policy_version 499398 (0.0026) [2024-04-27 22:41:08,176][54818] Updated weights for policy 0, policy_version 499408 (0.0026) [2024-04-27 22:41:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59801.5, 300 sec: 59149.0). Total num frames: 8182366208. Throughput: 0: 58928.4. Samples: 1087587760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:41:09,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 22:41:11,390][54818] Updated weights for policy 0, policy_version 499418 (0.0026) [2024-04-27 22:41:13,593][54818] Updated weights for policy 0, policy_version 499428 (0.0027) [2024-04-27 22:41:14,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59528.4, 300 sec: 59204.6). Total num frames: 8182661120. Throughput: 0: 59621.2. Samples: 1087783080. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:14,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 22:41:17,063][54818] Updated weights for policy 0, policy_version 499438 (0.0026) [2024-04-27 22:41:18,635][54798] Signal inference workers to stop experience collection... (16650 times) [2024-04-27 22:41:18,636][54798] Signal inference workers to resume experience collection... (16650 times) [2024-04-27 22:41:18,650][54818] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-04-27 22:41:18,667][54818] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-04-27 22:41:19,102][54818] Updated weights for policy 0, policy_version 499448 (0.0026) [2024-04-27 22:41:19,253][54587] Fps is (10 sec: 58983.8, 60 sec: 59528.7, 300 sec: 59149.0). Total num frames: 8182956032. Throughput: 0: 59564.5. Samples: 1088134780. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:19,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 22:41:22,666][54818] Updated weights for policy 0, policy_version 499458 (0.0026) [2024-04-27 22:41:24,253][54587] Fps is (10 sec: 58983.7, 60 sec: 59801.8, 300 sec: 59149.1). Total num frames: 8183250944. Throughput: 0: 59133.0. Samples: 1088486080. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:24,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 22:41:24,470][54818] Updated weights for policy 0, policy_version 499468 (0.0026) [2024-04-27 22:41:28,126][54818] Updated weights for policy 0, policy_version 499478 (0.0025) [2024-04-27 22:41:29,253][54587] Fps is (10 sec: 57342.5, 60 sec: 59255.3, 300 sec: 59093.4). Total num frames: 8183529472. Throughput: 0: 59572.3. Samples: 1088670660. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:29,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 22:41:30,082][54818] Updated weights for policy 0, policy_version 499488 (0.0024) [2024-04-27 22:41:33,659][54818] Updated weights for policy 0, policy_version 499498 (0.0025) [2024-04-27 22:41:34,253][54587] Fps is (10 sec: 57342.6, 60 sec: 59528.4, 300 sec: 59093.5). Total num frames: 8183824384. Throughput: 0: 59497.3. Samples: 1089027000. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:34,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 22:41:35,657][54818] Updated weights for policy 0, policy_version 499508 (0.0023) [2024-04-27 22:41:39,157][54818] Updated weights for policy 0, policy_version 499518 (0.0025) [2024-04-27 22:41:39,253][54587] Fps is (10 sec: 57345.1, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8184102912. Throughput: 0: 59776.0. Samples: 1089386260. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:39,253][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 22:41:41,042][54818] Updated weights for policy 0, policy_version 499528 (0.0026) [2024-04-27 22:41:44,253][54587] Fps is (10 sec: 55706.5, 60 sec: 58982.5, 300 sec: 59038.0). Total num frames: 8184381440. Throughput: 0: 59105.1. Samples: 1089543580. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:44,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 22:41:44,613][54818] Updated weights for policy 0, policy_version 499538 (0.0023) [2024-04-27 22:41:46,537][54818] Updated weights for policy 0, policy_version 499548 (0.0025) [2024-04-27 22:41:49,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.5, 300 sec: 59037.9). Total num frames: 8184676352. Throughput: 0: 59398.0. Samples: 1089909600. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:49,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 22:41:49,259][54587] No heartbeat for components: RolloutWorker_w4 (4657 seconds) [2024-04-27 22:41:49,286][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000499554_8184692736.pth... [2024-04-27 22:41:49,336][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000498690_8170536960.pth [2024-04-27 22:41:50,077][54818] Updated weights for policy 0, policy_version 499558 (0.0025) [2024-04-27 22:41:52,014][54818] Updated weights for policy 0, policy_version 499568 (0.0024) [2024-04-27 22:41:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.2, 300 sec: 59093.5). Total num frames: 8184987648. Throughput: 0: 59528.1. Samples: 1090266520. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:54,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 22:41:55,620][54818] Updated weights for policy 0, policy_version 499578 (0.0025) [2024-04-27 22:41:57,431][54818] Updated weights for policy 0, policy_version 499588 (0.0026) [2024-04-27 22:41:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58709.6, 300 sec: 59093.5). Total num frames: 8185282560. Throughput: 0: 58942.0. Samples: 1090435460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:41:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 22:42:00,826][54798] Signal inference workers to stop experience collection... (16700 times) [2024-04-27 22:42:00,867][54818] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-04-27 22:42:00,883][54798] Signal inference workers to resume experience collection... (16700 times) [2024-04-27 22:42:00,883][54818] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-04-27 22:42:01,015][54818] Updated weights for policy 0, policy_version 499598 (0.0024) [2024-04-27 22:42:03,468][54818] Updated weights for policy 0, policy_version 499608 (0.0025) [2024-04-27 22:42:04,253][54587] Fps is (10 sec: 62259.2, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8185610240. Throughput: 0: 58849.6. Samples: 1090783020. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:42:04,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 22:42:06,488][54818] Updated weights for policy 0, policy_version 499618 (0.0025) [2024-04-27 22:42:09,060][54818] Updated weights for policy 0, policy_version 499628 (0.0026) [2024-04-27 22:42:09,253][54587] Fps is (10 sec: 62257.8, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8185905152. Throughput: 0: 59095.2. Samples: 1091145380. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:42:09,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 22:42:11,915][54818] Updated weights for policy 0, policy_version 499638 (0.0025) [2024-04-27 22:42:14,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8186216448. Throughput: 0: 59175.1. Samples: 1091333540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:42:14,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 22:42:14,539][54818] Updated weights for policy 0, policy_version 499648 (0.0025) [2024-04-27 22:42:17,312][54818] Updated weights for policy 0, policy_version 499658 (0.0025) [2024-04-27 22:42:19,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59255.3, 300 sec: 59260.1). Total num frames: 8186511360. Throughput: 0: 59025.9. Samples: 1091683160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:42:19,254][54587] Avg episode reward: [(0, '0.484')] [2024-04-27 22:42:20,069][54818] Updated weights for policy 0, policy_version 499668 (0.0027) [2024-04-27 22:42:22,852][54818] Updated weights for policy 0, policy_version 499678 (0.0026) [2024-04-27 22:42:24,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8186822656. Throughput: 0: 58918.1. Samples: 1092037580. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:42:24,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 22:42:25,958][54818] Updated weights for policy 0, policy_version 499688 (0.0028) [2024-04-27 22:42:28,447][54818] Updated weights for policy 0, policy_version 499698 (0.0027) [2024-04-27 22:42:29,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59801.6, 300 sec: 59204.5). Total num frames: 8187117568. Throughput: 0: 59632.3. Samples: 1092227040. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:42:29,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-27 22:42:31,513][54818] Updated weights for policy 0, policy_version 499708 (0.0025) [2024-04-27 22:42:33,818][54818] Updated weights for policy 0, policy_version 499718 (0.0025) [2024-04-27 22:42:34,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59528.7, 300 sec: 59204.6). Total num frames: 8187396096. Throughput: 0: 59380.0. Samples: 1092581700. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-04-27 22:42:34,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 22:42:36,889][54818] Updated weights for policy 0, policy_version 499728 (0.0026) [2024-04-27 22:42:39,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59801.5, 300 sec: 59149.0). Total num frames: 8187691008. Throughput: 0: 59223.0. Samples: 1092931560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:42:39,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 22:42:39,416][54818] Updated weights for policy 0, policy_version 499738 (0.0024) [2024-04-27 22:42:42,506][54818] Updated weights for policy 0, policy_version 499748 (0.0026) [2024-04-27 22:42:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 60074.6, 300 sec: 59149.0). Total num frames: 8187985920. Throughput: 0: 59522.6. Samples: 1093113980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:42:44,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 22:42:44,957][54818] Updated weights for policy 0, policy_version 499758 (0.0026) [2024-04-27 22:42:47,891][54818] Updated weights for policy 0, policy_version 499768 (0.0023) [2024-04-27 22:42:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 60074.6, 300 sec: 59149.0). Total num frames: 8188280832. Throughput: 0: 59804.8. Samples: 1093474240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:42:49,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:42:50,195][54818] Updated weights for policy 0, policy_version 499778 (0.0026) [2024-04-27 22:42:53,447][54818] Updated weights for policy 0, policy_version 499788 (0.0026) [2024-04-27 22:42:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8188559360. Throughput: 0: 59674.8. Samples: 1093830740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:42:54,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 22:42:54,443][54798] Signal inference workers to stop experience collection... (16750 times) [2024-04-27 22:42:54,443][54798] Signal inference workers to resume experience collection... (16750 times) [2024-04-27 22:42:54,459][54818] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-04-27 22:42:54,460][54818] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-04-27 22:42:55,681][54818] Updated weights for policy 0, policy_version 499798 (0.0025) [2024-04-27 22:42:59,166][54818] Updated weights for policy 0, policy_version 499808 (0.0026) [2024-04-27 22:42:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59528.3, 300 sec: 59204.6). Total num frames: 8188854272. Throughput: 0: 59340.9. Samples: 1094003880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:42:59,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 22:43:01,170][54818] Updated weights for policy 0, policy_version 499818 (0.0026) [2024-04-27 22:43:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59315.6). Total num frames: 8189149184. Throughput: 0: 59439.6. Samples: 1094357940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:04,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 22:43:04,613][54818] Updated weights for policy 0, policy_version 499828 (0.0026) [2024-04-27 22:43:06,639][54818] Updated weights for policy 0, policy_version 499838 (0.0026) [2024-04-27 22:43:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59255.5, 300 sec: 59315.6). Total num frames: 8189460480. Throughput: 0: 59529.7. Samples: 1094716420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:09,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 22:43:10,188][54818] Updated weights for policy 0, policy_version 499848 (0.0026) [2024-04-27 22:43:12,187][54818] Updated weights for policy 0, policy_version 499858 (0.0025) [2024-04-27 22:43:14,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58982.5, 300 sec: 59315.6). Total num frames: 8189755392. Throughput: 0: 59147.1. Samples: 1094888660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:14,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 22:43:15,716][54818] Updated weights for policy 0, policy_version 499868 (0.0026) [2024-04-27 22:43:17,760][54818] Updated weights for policy 0, policy_version 499878 (0.0027) [2024-04-27 22:43:19,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.3, 300 sec: 59204.6). Total num frames: 8190050304. Throughput: 0: 59055.4. Samples: 1095239200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:19,254][54587] Avg episode reward: [(0, '0.489')] [2024-04-27 22:43:21,271][54818] Updated weights for policy 0, policy_version 499888 (0.0025) [2024-04-27 22:43:23,558][54818] Updated weights for policy 0, policy_version 499898 (0.0023) [2024-04-27 22:43:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8190361600. Throughput: 0: 59143.7. Samples: 1095593020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:24,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 22:43:26,653][54818] Updated weights for policy 0, policy_version 499908 (0.0025) [2024-04-27 22:43:29,110][54818] Updated weights for policy 0, policy_version 499918 (0.0024) [2024-04-27 22:43:29,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8190656512. Throughput: 0: 59193.2. Samples: 1095777680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:29,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 22:43:32,237][54818] Updated weights for policy 0, policy_version 499928 (0.0025) [2024-04-27 22:43:34,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8190967808. Throughput: 0: 59104.5. Samples: 1096133940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:34,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 22:43:34,611][54818] Updated weights for policy 0, policy_version 499938 (0.0026) [2024-04-27 22:43:37,752][54818] Updated weights for policy 0, policy_version 499948 (0.0026) [2024-04-27 22:43:39,215][54798] Signal inference workers to stop experience collection... (16800 times) [2024-04-27 22:43:39,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8191229952. Throughput: 0: 58844.5. Samples: 1096478740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:39,253][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:43:39,258][54818] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-04-27 22:43:39,273][54798] Signal inference workers to resume experience collection... 
(16800 times) [2024-04-27 22:43:39,275][54818] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-04-27 22:43:40,118][54818] Updated weights for policy 0, policy_version 499958 (0.0023) [2024-04-27 22:43:43,412][54818] Updated weights for policy 0, policy_version 499968 (0.0026) [2024-04-27 22:43:44,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8191541248. Throughput: 0: 59113.4. Samples: 1096663980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:44,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 22:43:45,963][54818] Updated weights for policy 0, policy_version 499978 (0.0025) [2024-04-27 22:43:49,110][54818] Updated weights for policy 0, policy_version 499988 (0.0026) [2024-04-27 22:43:49,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8191803392. Throughput: 0: 59014.6. Samples: 1097013600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:49,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 22:43:49,406][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000499989_8191819776.pth... [2024-04-27 22:43:49,449][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000499123_8177631232.pth [2024-04-27 22:43:51,766][54818] Updated weights for policy 0, policy_version 499998 (0.0023) [2024-04-27 22:43:54,253][54587] Fps is (10 sec: 55706.2, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8192098304. Throughput: 0: 59019.7. Samples: 1097372300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:54,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 22:43:54,516][54818] Updated weights for policy 0, policy_version 500008 (0.0027) [2024-04-27 22:43:57,308][54818] Updated weights for policy 0, policy_version 500018 (0.0026) [2024-04-27 22:43:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.6, 300 sec: 59204.5). 
Total num frames: 8192409600. Throughput: 0: 58713.8. Samples: 1097530780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-27 22:43:59,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 22:44:00,050][54818] Updated weights for policy 0, policy_version 500028 (0.0025) [2024-04-27 22:44:02,718][54818] Updated weights for policy 0, policy_version 500038 (0.0026) [2024-04-27 22:44:04,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8192688128. Throughput: 0: 59001.5. Samples: 1097894260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:04,253][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 22:44:05,577][54818] Updated weights for policy 0, policy_version 500048 (0.0026) [2024-04-27 22:44:08,219][54818] Updated weights for policy 0, policy_version 500058 (0.0025) [2024-04-27 22:44:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8192983040. Throughput: 0: 59082.2. Samples: 1098251720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:09,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 22:44:11,175][54818] Updated weights for policy 0, policy_version 500068 (0.0026) [2024-04-27 22:44:13,752][54818] Updated weights for policy 0, policy_version 500078 (0.0024) [2024-04-27 22:44:14,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.5, 300 sec: 59204.6). Total num frames: 8193277952. Throughput: 0: 58566.4. Samples: 1098413160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:14,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 22:44:16,531][54818] Updated weights for policy 0, policy_version 500088 (0.0025) [2024-04-27 22:44:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8193589248. Throughput: 0: 58635.1. Samples: 1098772520. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:19,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 22:44:19,351][54818] Updated weights for policy 0, policy_version 500098 (0.0026) [2024-04-27 22:44:22,032][54818] Updated weights for policy 0, policy_version 500108 (0.0026) [2024-04-27 22:44:24,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.3, 300 sec: 59260.1). Total num frames: 8193884160. Throughput: 0: 58927.5. Samples: 1099130480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:24,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 22:44:25,111][54818] Updated weights for policy 0, policy_version 500118 (0.0026) [2024-04-27 22:44:27,578][54818] Updated weights for policy 0, policy_version 500128 (0.0025) [2024-04-27 22:44:29,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.3, 300 sec: 59204.6). Total num frames: 8194179072. Throughput: 0: 58849.3. Samples: 1099312200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:29,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 22:44:30,988][54818] Updated weights for policy 0, policy_version 500138 (0.0026) [2024-04-27 22:44:33,014][54818] Updated weights for policy 0, policy_version 500148 (0.0026) [2024-04-27 22:44:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.3, 300 sec: 59204.5). Total num frames: 8194490368. Throughput: 0: 58825.2. Samples: 1099660740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:34,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 22:44:36,490][54818] Updated weights for policy 0, policy_version 500158 (0.0027) [2024-04-27 22:44:37,659][54798] Signal inference workers to stop experience collection... (16850 times) [2024-04-27 22:44:37,660][54798] Signal inference workers to resume experience collection... 
(16850 times) [2024-04-27 22:44:37,674][54818] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-04-27 22:44:37,674][54818] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-04-27 22:44:38,539][54818] Updated weights for policy 0, policy_version 500168 (0.0026) [2024-04-27 22:44:39,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8194785280. Throughput: 0: 58623.4. Samples: 1100010360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:39,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 22:44:41,996][54818] Updated weights for policy 0, policy_version 500178 (0.0026) [2024-04-27 22:44:44,067][54818] Updated weights for policy 0, policy_version 500188 (0.0026) [2024-04-27 22:44:44,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8195080192. Throughput: 0: 59380.8. Samples: 1100202920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:44,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 22:44:47,549][54818] Updated weights for policy 0, policy_version 500198 (0.0025) [2024-04-27 22:44:49,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8195375104. Throughput: 0: 59035.6. Samples: 1100550860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:49,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 22:44:49,261][54587] No heartbeat for components: RolloutWorker_w4 (4837 seconds) [2024-04-27 22:44:49,785][54818] Updated weights for policy 0, policy_version 500208 (0.0025) [2024-04-27 22:44:53,074][54818] Updated weights for policy 0, policy_version 500218 (0.0024) [2024-04-27 22:44:54,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8195653632. Throughput: 0: 58879.0. Samples: 1100901280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:54,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 22:44:55,448][54818] Updated weights for policy 0, policy_version 500228 (0.0026) [2024-04-27 22:44:58,767][54818] Updated weights for policy 0, policy_version 500238 (0.0023) [2024-04-27 22:44:59,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8195932160. Throughput: 0: 59069.3. Samples: 1101071280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:44:59,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 22:45:01,101][54818] Updated weights for policy 0, policy_version 500248 (0.0025) [2024-04-27 22:45:04,253][54587] Fps is (10 sec: 55706.2, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8196210688. Throughput: 0: 58967.6. Samples: 1101426060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:45:04,262][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 22:45:04,299][54818] Updated weights for policy 0, policy_version 500258 (0.0024) [2024-04-27 22:45:06,588][54818] Updated weights for policy 0, policy_version 500268 (0.0024) [2024-04-27 22:45:09,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8196505600. Throughput: 0: 59036.9. Samples: 1101787140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:45:09,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 22:45:09,826][54818] Updated weights for policy 0, policy_version 500278 (0.0024) [2024-04-27 22:45:12,182][54818] Updated weights for policy 0, policy_version 500288 (0.0026) [2024-04-27 22:45:14,253][54587] Fps is (10 sec: 57343.2, 60 sec: 58436.1, 300 sec: 58982.4). Total num frames: 8196784128. Throughput: 0: 58640.4. Samples: 1101951020. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:45:14,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 22:45:15,305][54818] Updated weights for policy 0, policy_version 500298 (0.0024) [2024-04-27 22:45:17,658][54818] Updated weights for policy 0, policy_version 500308 (0.0023) [2024-04-27 22:45:19,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8197111808. Throughput: 0: 58746.8. Samples: 1102304340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:45:19,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 22:45:20,931][54818] Updated weights for policy 0, policy_version 500318 (0.0026) [2024-04-27 22:45:23,285][54818] Updated weights for policy 0, policy_version 500328 (0.0026) [2024-04-27 22:45:23,495][54798] Signal inference workers to stop experience collection... (16900 times) [2024-04-27 22:45:23,524][54818] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-04-27 22:45:23,552][54798] Signal inference workers to resume experience collection... (16900 times) [2024-04-27 22:45:23,553][54818] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-04-27 22:45:24,253][54587] Fps is (10 sec: 63897.6, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8197423104. Throughput: 0: 58760.4. Samples: 1102654580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:45:24,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 22:45:26,519][54818] Updated weights for policy 0, policy_version 500338 (0.0026) [2024-04-27 22:45:28,932][54818] Updated weights for policy 0, policy_version 500348 (0.0020) [2024-04-27 22:45:29,253][54587] Fps is (10 sec: 62258.5, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8197734400. Throughput: 0: 58597.4. Samples: 1102839800. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 22:45:29,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 22:45:31,946][54818] Updated weights for policy 0, policy_version 500358 (0.0024) [2024-04-27 22:45:34,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8198012928. Throughput: 0: 58492.3. Samples: 1103183020. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:45:34,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 22:45:34,290][54818] Updated weights for policy 0, policy_version 500368 (0.0025) [2024-04-27 22:45:37,449][54818] Updated weights for policy 0, policy_version 500378 (0.0025) [2024-04-27 22:45:39,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.3, 300 sec: 59204.5). Total num frames: 8198307840. Throughput: 0: 58527.9. Samples: 1103535040. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:45:39,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 22:45:39,859][54818] Updated weights for policy 0, policy_version 500388 (0.0026) [2024-04-27 22:45:42,902][54818] Updated weights for policy 0, policy_version 500398 (0.0024) [2024-04-27 22:45:44,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8198602752. Throughput: 0: 59224.9. Samples: 1103736400. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:45:44,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 22:45:45,252][54818] Updated weights for policy 0, policy_version 500408 (0.0024) [2024-04-27 22:45:48,474][54818] Updated weights for policy 0, policy_version 500418 (0.0027) [2024-04-27 22:45:49,253][54587] Fps is (10 sec: 60621.9, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8198914048. Throughput: 0: 58968.0. Samples: 1104079620. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:45:49,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 22:45:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000500422_8198914048.pth... [2024-04-27 22:45:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000499554_8184692736.pth [2024-04-27 22:45:50,906][54818] Updated weights for policy 0, policy_version 500428 (0.0025) [2024-04-27 22:45:54,010][54818] Updated weights for policy 0, policy_version 500438 (0.0026) [2024-04-27 22:45:54,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8199192576. Throughput: 0: 58663.6. Samples: 1104427000. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:45:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 22:45:57,689][54818] Updated weights for policy 0, policy_version 500448 (0.0026) [2024-04-27 22:45:59,253][54587] Fps is (10 sec: 54066.8, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8199454720. Throughput: 0: 58831.7. Samples: 1104598440. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:45:59,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 22:45:59,264][54798] Signal inference workers to stop experience collection... (16950 times) [2024-04-27 22:45:59,264][54798] Signal inference workers to resume experience collection... (16950 times) [2024-04-27 22:45:59,286][54818] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-04-27 22:45:59,286][54818] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-04-27 22:45:59,539][54818] Updated weights for policy 0, policy_version 500458 (0.0026) [2024-04-27 22:46:03,164][54818] Updated weights for policy 0, policy_version 500468 (0.0026) [2024-04-27 22:46:04,253][54587] Fps is (10 sec: 54067.3, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8199733248. Throughput: 0: 59004.8. Samples: 1104959560. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:04,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 22:46:05,119][54818] Updated weights for policy 0, policy_version 500478 (0.0025) [2024-04-27 22:46:08,698][54818] Updated weights for policy 0, policy_version 500488 (0.0025) [2024-04-27 22:46:09,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8200028160. Throughput: 0: 59194.8. Samples: 1105318340. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:09,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 22:46:10,521][54818] Updated weights for policy 0, policy_version 500498 (0.0025) [2024-04-27 22:46:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 8200306688. Throughput: 0: 58556.6. Samples: 1105474840. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:14,253][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 22:46:14,334][54818] Updated weights for policy 0, policy_version 500508 (0.0026) [2024-04-27 22:46:15,938][54818] Updated weights for policy 0, policy_version 500518 (0.0025) [2024-04-27 22:46:19,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58436.1, 300 sec: 58871.3). Total num frames: 8200617984. Throughput: 0: 58804.8. Samples: 1105829240. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:19,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 22:46:19,907][54818] Updated weights for policy 0, policy_version 500528 (0.0026) [2024-04-27 22:46:21,800][54818] Updated weights for policy 0, policy_version 500538 (0.0025) [2024-04-27 22:46:24,253][54587] Fps is (10 sec: 60619.8, 60 sec: 58163.2, 300 sec: 58926.9). Total num frames: 8200912896. Throughput: 0: 58953.8. Samples: 1106187960. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:24,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 22:46:25,545][54818] Updated weights for policy 0, policy_version 500548 (0.0026) [2024-04-27 22:46:27,324][54818] Updated weights for policy 0, policy_version 500558 (0.0025) [2024-04-27 22:46:29,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58163.2, 300 sec: 58982.4). Total num frames: 8201224192. Throughput: 0: 58246.5. Samples: 1106357500. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:29,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 22:46:30,893][54818] Updated weights for policy 0, policy_version 500568 (0.0026) [2024-04-27 22:46:32,907][54818] Updated weights for policy 0, policy_version 500578 (0.0027) [2024-04-27 22:46:34,253][54587] Fps is (10 sec: 63897.9, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8201551872. Throughput: 0: 58571.0. Samples: 1106715320. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:34,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 22:46:36,007][54798] Signal inference workers to stop experience collection... (17000 times) [2024-04-27 22:46:36,007][54798] Signal inference workers to resume experience collection... (17000 times) [2024-04-27 22:46:36,033][54818] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-04-27 22:46:36,033][54818] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-04-27 22:46:36,440][54818] Updated weights for policy 0, policy_version 500588 (0.0026) [2024-04-27 22:46:38,256][54818] Updated weights for policy 0, policy_version 500598 (0.0025) [2024-04-27 22:46:39,253][54587] Fps is (10 sec: 62259.7, 60 sec: 58982.5, 300 sec: 59204.5). Total num frames: 8201846784. Throughput: 0: 58589.4. Samples: 1107063520. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:39,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 22:46:41,911][54818] Updated weights for policy 0, policy_version 500608 (0.0026) [2024-04-27 22:46:43,738][54818] Updated weights for policy 0, policy_version 500618 (0.0027) [2024-04-27 22:46:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8202158080. Throughput: 0: 59273.3. Samples: 1107265740. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:44,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 22:46:47,399][54818] Updated weights for policy 0, policy_version 500628 (0.0024) [2024-04-27 22:46:49,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8202436608. Throughput: 0: 58828.1. Samples: 1107606820. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:49,254][54587] Avg episode reward: [(0, '0.496')] [2024-04-27 22:46:49,319][54818] Updated weights for policy 0, policy_version 500638 (0.0024) [2024-04-27 22:46:52,867][54818] Updated weights for policy 0, policy_version 500648 (0.0026) [2024-04-27 22:46:54,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8202731520. Throughput: 0: 58578.7. Samples: 1107954380. Policy #0 lag: (min: 0.0, avg: 7.1, max: 20.0) [2024-04-27 22:46:54,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 22:46:54,860][54818] Updated weights for policy 0, policy_version 500658 (0.0026) [2024-04-27 22:46:58,366][54818] Updated weights for policy 0, policy_version 500668 (0.0026) [2024-04-27 22:46:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59528.6, 300 sec: 59038.0). Total num frames: 8203026432. Throughput: 0: 59120.0. Samples: 1108135240. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:46:59,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 22:47:00,278][54818] Updated weights for policy 0, policy_version 500678 (0.0024) [2024-04-27 22:47:03,874][54818] Updated weights for policy 0, policy_version 500688 (0.0026) [2024-04-27 22:47:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59528.5, 300 sec: 58982.4). Total num frames: 8203304960. Throughput: 0: 59398.2. Samples: 1108502160. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:04,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:47:05,900][54818] Updated weights for policy 0, policy_version 500698 (0.0023) [2024-04-27 22:47:09,110][54798] Signal inference workers to stop experience collection... (17050 times) [2024-04-27 22:47:09,140][54818] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-04-27 22:47:09,169][54798] Signal inference workers to resume experience collection... (17050 times) [2024-04-27 22:47:09,170][54818] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-04-27 22:47:09,253][54587] Fps is (10 sec: 55705.4, 60 sec: 59255.5, 300 sec: 58871.4). Total num frames: 8203583488. Throughput: 0: 59233.1. Samples: 1108853440. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:09,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 22:47:09,282][54818] Updated weights for policy 0, policy_version 500708 (0.0026) [2024-04-27 22:47:11,394][54818] Updated weights for policy 0, policy_version 500718 (0.0024) [2024-04-27 22:47:14,253][54587] Fps is (10 sec: 55706.5, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 8203862016. Throughput: 0: 58981.5. Samples: 1109011660. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:14,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 22:47:14,991][54818] Updated weights for policy 0, policy_version 500728 (0.0026) [2024-04-27 22:47:17,095][54818] Updated weights for policy 0, policy_version 500738 (0.0025) [2024-04-27 22:47:19,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58982.4, 300 sec: 58760.2). Total num frames: 8204156928. Throughput: 0: 59030.6. Samples: 1109371700. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:19,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 22:47:20,355][54818] Updated weights for policy 0, policy_version 500748 (0.0031) [2024-04-27 22:47:22,954][54818] Updated weights for policy 0, policy_version 500758 (0.0022) [2024-04-27 22:47:24,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.5, 300 sec: 58760.3). Total num frames: 8204451840. Throughput: 0: 59258.2. Samples: 1109730140. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:24,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 22:47:25,826][54818] Updated weights for policy 0, policy_version 500768 (0.0027) [2024-04-27 22:47:28,468][54818] Updated weights for policy 0, policy_version 500778 (0.0026) [2024-04-27 22:47:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 8204763136. Throughput: 0: 58555.6. Samples: 1109900740. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:29,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 22:47:31,512][54818] Updated weights for policy 0, policy_version 500788 (0.0027) [2024-04-27 22:47:34,255][54587] Fps is (10 sec: 60612.5, 60 sec: 58435.0, 300 sec: 58871.1). Total num frames: 8205058048. Throughput: 0: 58581.3. Samples: 1110243060. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:34,255][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 22:47:34,588][54818] Updated weights for policy 0, policy_version 500798 (0.0026) [2024-04-27 22:47:37,171][54818] Updated weights for policy 0, policy_version 500808 (0.0026) [2024-04-27 22:47:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8205369344. Throughput: 0: 58800.1. Samples: 1110600380. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:39,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 22:47:40,129][54818] Updated weights for policy 0, policy_version 500818 (0.0022) [2024-04-27 22:47:42,699][54818] Updated weights for policy 0, policy_version 500828 (0.0025) [2024-04-27 22:47:44,253][54587] Fps is (10 sec: 60628.7, 60 sec: 58436.3, 300 sec: 58926.9). Total num frames: 8205664256. Throughput: 0: 58971.8. Samples: 1110788980. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:44,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 22:47:45,805][54818] Updated weights for policy 0, policy_version 500838 (0.0025) [2024-04-27 22:47:47,941][54798] Signal inference workers to stop experience collection... (17100 times) [2024-04-27 22:47:47,960][54818] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-04-27 22:47:48,000][54798] Signal inference workers to resume experience collection... (17100 times) [2024-04-27 22:47:48,000][54818] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-04-27 22:47:48,118][54818] Updated weights for policy 0, policy_version 500848 (0.0021) [2024-04-27 22:47:49,253][54587] Fps is (10 sec: 60619.6, 60 sec: 58982.2, 300 sec: 59037.9). Total num frames: 8205975552. Throughput: 0: 58779.9. Samples: 1111147260. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:49,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:47:49,265][54587] No heartbeat for components: RolloutWorker_w4 (5017 seconds) [2024-04-27 22:47:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000500853_8205975552.pth... [2024-04-27 22:47:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000499989_8191819776.pth [2024-04-27 22:47:51,310][54818] Updated weights for policy 0, policy_version 500858 (0.0026) [2024-04-27 22:47:53,594][54818] Updated weights for policy 0, policy_version 500868 (0.0025) [2024-04-27 22:47:54,253][54587] Fps is (10 sec: 62259.0, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8206286848. Throughput: 0: 58633.6. Samples: 1111491960. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 22:47:56,643][54818] Updated weights for policy 0, policy_version 500878 (0.0028) [2024-04-27 22:47:59,124][54818] Updated weights for policy 0, policy_version 500888 (0.0022) [2024-04-27 22:47:59,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8206565376. Throughput: 0: 59194.1. Samples: 1111675400. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:47:59,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 22:48:02,402][54818] Updated weights for policy 0, policy_version 500898 (0.0026) [2024-04-27 22:48:04,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8206843904. Throughput: 0: 59183.6. Samples: 1112034960. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:48:04,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 22:48:04,536][54818] Updated weights for policy 0, policy_version 500908 (0.0022) [2024-04-27 22:48:07,812][54818] Updated weights for policy 0, policy_version 500918 (0.0027) [2024-04-27 22:48:09,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8207122432. Throughput: 0: 58883.9. Samples: 1112379920. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:48:09,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 22:48:10,067][54818] Updated weights for policy 0, policy_version 500928 (0.0024) [2024-04-27 22:48:13,435][54818] Updated weights for policy 0, policy_version 500938 (0.0026) [2024-04-27 22:48:14,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59528.4, 300 sec: 58926.9). Total num frames: 8207433728. Throughput: 0: 59092.4. Samples: 1112559900. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:48:14,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 22:48:15,418][54798] Signal inference workers to stop experience collection... (17150 times) [2024-04-27 22:48:15,455][54818] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-04-27 22:48:15,511][54798] Signal inference workers to resume experience collection... (17150 times) [2024-04-27 22:48:15,511][54818] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-04-27 22:48:15,513][54818] Updated weights for policy 0, policy_version 500948 (0.0022) [2024-04-27 22:48:18,881][54818] Updated weights for policy 0, policy_version 500958 (0.0026) [2024-04-27 22:48:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59528.5, 300 sec: 58871.3). Total num frames: 8207728640. Throughput: 0: 59390.5. Samples: 1112915560. 
Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:48:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:48:20,941][54818] Updated weights for policy 0, policy_version 500968 (0.0024) [2024-04-27 22:48:24,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.4, 300 sec: 58815.8). Total num frames: 8208007168. Throughput: 0: 59351.9. Samples: 1113271220. Policy #0 lag: (min: 0.0, avg: 7.1, max: 21.0) [2024-04-27 22:48:24,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:48:24,314][54818] Updated weights for policy 0, policy_version 500978 (0.0026) [2024-04-27 22:48:26,405][54818] Updated weights for policy 0, policy_version 500988 (0.0025) [2024-04-27 22:48:29,253][54587] Fps is (10 sec: 55706.7, 60 sec: 58709.4, 300 sec: 58704.7). Total num frames: 8208285696. Throughput: 0: 58804.1. Samples: 1113435160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:48:29,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:48:29,753][54818] Updated weights for policy 0, policy_version 500998 (0.0026) [2024-04-27 22:48:32,096][54818] Updated weights for policy 0, policy_version 501008 (0.0026) [2024-04-27 22:48:34,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58983.8, 300 sec: 58871.3). Total num frames: 8208596992. Throughput: 0: 58831.8. Samples: 1113794680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:48:34,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 22:48:35,213][54818] Updated weights for policy 0, policy_version 501018 (0.0025) [2024-04-27 22:48:37,507][54818] Updated weights for policy 0, policy_version 501028 (0.0023) [2024-04-27 22:48:39,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58436.1, 300 sec: 58760.2). Total num frames: 8208875520. Throughput: 0: 59254.7. Samples: 1114158420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:48:39,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 22:48:40,833][54818] Updated weights for policy 0, policy_version 501038 (0.0026) [2024-04-27 22:48:43,307][54818] Updated weights for policy 0, policy_version 501048 (0.0026) [2024-04-27 22:48:44,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8209186816. Throughput: 0: 59087.6. Samples: 1114334340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:48:44,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-27 22:48:46,566][54818] Updated weights for policy 0, policy_version 501058 (0.0026) [2024-04-27 22:48:48,957][54818] Updated weights for policy 0, policy_version 501068 (0.0026) [2024-04-27 22:48:49,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8209498112. Throughput: 0: 58817.2. Samples: 1114681740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:48:49,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 22:48:52,113][54818] Updated weights for policy 0, policy_version 501078 (0.0026) [2024-04-27 22:48:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8209809408. Throughput: 0: 58866.3. Samples: 1115028900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:48:54,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 22:48:54,939][54818] Updated weights for policy 0, policy_version 501088 (0.0027) [2024-04-27 22:48:57,592][54818] Updated weights for policy 0, policy_version 501098 (0.0025) [2024-04-27 22:48:58,042][54798] Signal inference workers to stop experience collection... (17200 times) [2024-04-27 22:48:58,063][54818] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-04-27 22:48:58,103][54798] Signal inference workers to resume experience collection... 
(17200 times) [2024-04-27 22:48:58,104][54818] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-04-27 22:48:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8210104320. Throughput: 0: 59204.4. Samples: 1115224100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:48:59,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 22:49:00,439][54818] Updated weights for policy 0, policy_version 501108 (0.0024) [2024-04-27 22:49:03,016][54818] Updated weights for policy 0, policy_version 501118 (0.0028) [2024-04-27 22:49:04,253][54587] Fps is (10 sec: 62259.2, 60 sec: 59801.6, 300 sec: 59149.0). Total num frames: 8210432000. Throughput: 0: 59263.7. Samples: 1115582420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:04,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 22:49:06,276][54818] Updated weights for policy 0, policy_version 501128 (0.0024) [2024-04-27 22:49:08,430][54818] Updated weights for policy 0, policy_version 501138 (0.0023) [2024-04-27 22:49:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59801.6, 300 sec: 59093.4). Total num frames: 8210710528. Throughput: 0: 59062.6. Samples: 1115929040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:09,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 22:49:11,755][54818] Updated weights for policy 0, policy_version 501148 (0.0024) [2024-04-27 22:49:14,012][54818] Updated weights for policy 0, policy_version 501158 (0.0023) [2024-04-27 22:49:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59528.6, 300 sec: 59037.9). Total num frames: 8211005440. Throughput: 0: 59619.0. Samples: 1116118020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:14,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 22:49:17,243][54818] Updated weights for policy 0, policy_version 501168 (0.0025) [2024-04-27 22:49:19,253][54587] Fps is (10 sec: 54067.0, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8211251200. Throughput: 0: 59343.4. Samples: 1116465140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:19,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 22:49:19,591][54818] Updated weights for policy 0, policy_version 501178 (0.0026) [2024-04-27 22:49:22,678][54818] Updated weights for policy 0, policy_version 501188 (0.0028) [2024-04-27 22:49:24,253][54587] Fps is (10 sec: 54067.4, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 8211546112. Throughput: 0: 59082.8. Samples: 1116817140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:24,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:49:25,022][54818] Updated weights for policy 0, policy_version 501198 (0.0026) [2024-04-27 22:49:28,287][54818] Updated weights for policy 0, policy_version 501208 (0.0026) [2024-04-27 22:49:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.4, 300 sec: 58871.3). Total num frames: 8211857408. Throughput: 0: 58984.8. Samples: 1116988660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:29,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 22:49:30,214][54798] Signal inference workers to stop experience collection... (17250 times) [2024-04-27 22:49:30,214][54798] Signal inference workers to resume experience collection... 
(17250 times) [2024-04-27 22:49:30,237][54818] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-04-27 22:49:30,237][54818] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-04-27 22:49:30,466][54818] Updated weights for policy 0, policy_version 501218 (0.0023) [2024-04-27 22:49:33,924][54818] Updated weights for policy 0, policy_version 501228 (0.0026) [2024-04-27 22:49:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 58815.8). Total num frames: 8212135936. Throughput: 0: 59178.4. Samples: 1117344760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:34,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 22:49:35,967][54818] Updated weights for policy 0, policy_version 501238 (0.0027) [2024-04-27 22:49:39,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 8212430848. Throughput: 0: 59490.2. Samples: 1117705960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:39,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 22:49:39,439][54818] Updated weights for policy 0, policy_version 501248 (0.0026) [2024-04-27 22:49:41,380][54818] Updated weights for policy 0, policy_version 501258 (0.0023) [2024-04-27 22:49:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58760.2). Total num frames: 8212709376. Throughput: 0: 58622.7. Samples: 1117862120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:44,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:49:44,988][54818] Updated weights for policy 0, policy_version 501268 (0.0023) [2024-04-27 22:49:46,895][54818] Updated weights for policy 0, policy_version 501278 (0.0026) [2024-04-27 22:49:49,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58436.3, 300 sec: 58815.8). Total num frames: 8213004288. Throughput: 0: 58648.5. Samples: 1118221600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 22:49:49,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-27 22:49:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000501282_8213004288.pth... [2024-04-27 22:49:49,336][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000500422_8198914048.pth [2024-04-27 22:49:50,531][54818] Updated weights for policy 0, policy_version 501288 (0.0027) [2024-04-27 22:49:52,426][54818] Updated weights for policy 0, policy_version 501298 (0.0023) [2024-04-27 22:49:54,253][54587] Fps is (10 sec: 62258.4, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 8213331968. Throughput: 0: 59013.3. Samples: 1118584640. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:49:54,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 22:49:55,967][54818] Updated weights for policy 0, policy_version 501308 (0.0023) [2024-04-27 22:49:57,975][54818] Updated weights for policy 0, policy_version 501318 (0.0024) [2024-04-27 22:49:59,253][54587] Fps is (10 sec: 62258.5, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8213626880. Throughput: 0: 58814.9. Samples: 1118764700. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:49:59,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 22:50:01,565][54818] Updated weights for policy 0, policy_version 501328 (0.0026) [2024-04-27 22:50:03,717][54818] Updated weights for policy 0, policy_version 501338 (0.0023) [2024-04-27 22:50:04,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58163.2, 300 sec: 59037.9). Total num frames: 8213921792. Throughput: 0: 58775.2. Samples: 1119110020. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:04,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 22:50:07,034][54818] Updated weights for policy 0, policy_version 501348 (0.0026) [2024-04-27 22:50:08,500][54798] Signal inference workers to stop experience collection... 
(17300 times) [2024-04-27 22:50:08,535][54818] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-04-27 22:50:08,563][54798] Signal inference workers to resume experience collection... (17300 times) [2024-04-27 22:50:08,563][54818] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-04-27 22:50:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8214233088. Throughput: 0: 58661.6. Samples: 1119456920. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:09,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 22:50:09,439][54818] Updated weights for policy 0, policy_version 501358 (0.0028) [2024-04-27 22:50:12,425][54818] Updated weights for policy 0, policy_version 501368 (0.0027) [2024-04-27 22:50:14,253][54587] Fps is (10 sec: 62258.8, 60 sec: 58982.3, 300 sec: 59093.4). Total num frames: 8214544384. Throughput: 0: 59161.3. Samples: 1119650920. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:14,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:50:14,886][54818] Updated weights for policy 0, policy_version 501378 (0.0026) [2024-04-27 22:50:18,026][54818] Updated weights for policy 0, policy_version 501388 (0.0026) [2024-04-27 22:50:19,253][54587] Fps is (10 sec: 58983.7, 60 sec: 59528.7, 300 sec: 58982.4). Total num frames: 8214822912. Throughput: 0: 59089.4. Samples: 1120003780. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:19,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 22:50:20,679][54818] Updated weights for policy 0, policy_version 501398 (0.0023) [2024-04-27 22:50:23,624][54818] Updated weights for policy 0, policy_version 501408 (0.0026) [2024-04-27 22:50:24,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59801.6, 300 sec: 58982.4). Total num frames: 8215134208. Throughput: 0: 58979.3. Samples: 1120360020. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:24,253][54587] Avg episode reward: [(0, '0.473')] [2024-04-27 22:50:26,344][54818] Updated weights for policy 0, policy_version 501418 (0.0026) [2024-04-27 22:50:29,003][54818] Updated weights for policy 0, policy_version 501428 (0.0026) [2024-04-27 22:50:29,253][54587] Fps is (10 sec: 58981.4, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8215412736. Throughput: 0: 59601.7. Samples: 1120544200. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:29,254][54587] Avg episode reward: [(0, '0.687')] [2024-04-27 22:50:32,191][54818] Updated weights for policy 0, policy_version 501438 (0.0029) [2024-04-27 22:50:34,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59528.6, 300 sec: 58982.4). Total num frames: 8215707648. Throughput: 0: 59509.0. Samples: 1120899500. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:34,253][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 22:50:34,321][54818] Updated weights for policy 0, policy_version 501448 (0.0026) [2024-04-27 22:50:37,563][54818] Updated weights for policy 0, policy_version 501458 (0.0028) [2024-04-27 22:50:39,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59528.6, 300 sec: 58982.4). Total num frames: 8216002560. Throughput: 0: 59262.0. Samples: 1121251420. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:39,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-27 22:50:40,029][54818] Updated weights for policy 0, policy_version 501468 (0.0030) [2024-04-27 22:50:42,996][54818] Updated weights for policy 0, policy_version 501478 (0.0025) [2024-04-27 22:50:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59801.6, 300 sec: 58926.9). Total num frames: 8216297472. Throughput: 0: 58965.6. Samples: 1121418140. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:44,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:50:45,558][54818] Updated weights for policy 0, policy_version 501488 (0.0025) [2024-04-27 22:50:48,545][54818] Updated weights for policy 0, policy_version 501498 (0.0026) [2024-04-27 22:50:49,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59528.6, 300 sec: 58926.9). Total num frames: 8216576000. Throughput: 0: 59376.5. Samples: 1121781960. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:49,253][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 22:50:49,260][54587] No heartbeat for components: RolloutWorker_w4 (5197 seconds) [2024-04-27 22:50:50,997][54818] Updated weights for policy 0, policy_version 501508 (0.0026) [2024-04-27 22:50:54,089][54818] Updated weights for policy 0, policy_version 501518 (0.0027) [2024-04-27 22:50:54,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8216887296. Throughput: 0: 59590.4. Samples: 1122138480. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:54,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 22:50:56,563][54818] Updated weights for policy 0, policy_version 501528 (0.0023) [2024-04-27 22:50:59,154][54798] Signal inference workers to stop experience collection... (17350 times) [2024-04-27 22:50:59,155][54798] Signal inference workers to resume experience collection... (17350 times) [2024-04-27 22:50:59,164][54818] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-04-27 22:50:59,182][54818] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-04-27 22:50:59,253][54587] Fps is (10 sec: 58981.2, 60 sec: 58982.4, 300 sec: 59093.4). Total num frames: 8217165824. Throughput: 0: 59031.9. Samples: 1122307360. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:50:59,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 22:50:59,660][54818] Updated weights for policy 0, policy_version 501538 (0.0023) [2024-04-27 22:51:01,992][54818] Updated weights for policy 0, policy_version 501548 (0.0026) [2024-04-27 22:51:04,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8217460736. Throughput: 0: 58885.7. Samples: 1122653640. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:51:04,254][54587] Avg episode reward: [(0, '0.682')] [2024-04-27 22:51:05,302][54818] Updated weights for policy 0, policy_version 501558 (0.0025) [2024-04-27 22:51:07,505][54818] Updated weights for policy 0, policy_version 501568 (0.0028) [2024-04-27 22:51:09,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58982.5, 300 sec: 59204.5). Total num frames: 8217772032. Throughput: 0: 58979.9. Samples: 1123014120. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:51:09,263][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 22:51:10,588][54818] Updated weights for policy 0, policy_version 501578 (0.0026) [2024-04-27 22:51:12,986][54818] Updated weights for policy 0, policy_version 501588 (0.0023) [2024-04-27 22:51:14,253][54587] Fps is (10 sec: 62258.7, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8218083328. Throughput: 0: 58927.2. Samples: 1123195920. Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:51:14,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 22:51:16,201][54818] Updated weights for policy 0, policy_version 501598 (0.0022) [2024-04-27 22:51:18,502][54818] Updated weights for policy 0, policy_version 501608 (0.0025) [2024-04-27 22:51:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8218378240. Throughput: 0: 59046.6. Samples: 1123556600. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 22.0) [2024-04-27 22:51:19,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 22:51:21,685][54818] Updated weights for policy 0, policy_version 501618 (0.0025) [2024-04-27 22:51:24,001][54818] Updated weights for policy 0, policy_version 501628 (0.0026) [2024-04-27 22:51:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8218689536. Throughput: 0: 58896.8. Samples: 1123901780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:51:24,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 22:51:27,167][54818] Updated weights for policy 0, policy_version 501638 (0.0024) [2024-04-27 22:51:29,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 59037.9). Total num frames: 8218968064. Throughput: 0: 59284.8. Samples: 1124085960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:51:29,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 22:51:29,491][54818] Updated weights for policy 0, policy_version 501648 (0.0023) [2024-04-27 22:51:32,601][54818] Updated weights for policy 0, policy_version 501658 (0.0024) [2024-04-27 22:51:34,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8219246592. Throughput: 0: 58860.3. Samples: 1124430680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:51:34,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 22:51:34,965][54818] Updated weights for policy 0, policy_version 501668 (0.0024) [2024-04-27 22:51:37,846][54798] Signal inference workers to stop experience collection... (17400 times) [2024-04-27 22:51:37,847][54798] Signal inference workers to resume experience collection... 
(17400 times) [2024-04-27 22:51:37,872][54818] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-04-27 22:51:37,872][54818] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-04-27 22:51:38,108][54818] Updated weights for policy 0, policy_version 501678 (0.0025) [2024-04-27 22:51:39,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 8219525120. Throughput: 0: 58899.5. Samples: 1124788960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:51:39,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 22:51:40,622][54818] Updated weights for policy 0, policy_version 501688 (0.0026) [2024-04-27 22:51:43,662][54818] Updated weights for policy 0, policy_version 501698 (0.0027) [2024-04-27 22:51:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.2, 300 sec: 58926.8). Total num frames: 8219820032. Throughput: 0: 59093.9. Samples: 1124966580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:51:44,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 22:51:46,216][54818] Updated weights for policy 0, policy_version 501708 (0.0023) [2024-04-27 22:51:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8220131328. Throughput: 0: 59382.6. Samples: 1125325860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:51:49,254][54587] Avg episode reward: [(0, '0.477')] [2024-04-27 22:51:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000501717_8220131328.pth... [2024-04-27 22:51:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000500853_8205975552.pth [2024-04-27 22:51:49,455][54818] Updated weights for policy 0, policy_version 501718 (0.0025) [2024-04-27 22:51:52,339][54818] Updated weights for policy 0, policy_version 501728 (0.0029) [2024-04-27 22:51:54,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.3, 300 sec: 58982.4). 
Total num frames: 8220426240. Throughput: 0: 59058.5. Samples: 1125671760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:51:54,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 22:51:55,120][54818] Updated weights for policy 0, policy_version 501738 (0.0026) [2024-04-27 22:51:57,928][54818] Updated weights for policy 0, policy_version 501748 (0.0025) [2024-04-27 22:51:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 8220721152. Throughput: 0: 58760.4. Samples: 1125840140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:51:59,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 22:52:00,737][54818] Updated weights for policy 0, policy_version 501758 (0.0025) [2024-04-27 22:52:03,413][54818] Updated weights for policy 0, policy_version 501768 (0.0025) [2024-04-27 22:52:04,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8220999680. Throughput: 0: 58653.7. Samples: 1126196020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:52:04,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 22:52:06,306][54818] Updated weights for policy 0, policy_version 501778 (0.0026) [2024-04-27 22:52:08,952][54818] Updated weights for policy 0, policy_version 501788 (0.0024) [2024-04-27 22:52:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8221294592. Throughput: 0: 58921.8. Samples: 1126553260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:52:09,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-27 22:52:11,796][54818] Updated weights for policy 0, policy_version 501798 (0.0026) [2024-04-27 22:52:14,254][54587] Fps is (10 sec: 60619.9, 60 sec: 58709.2, 300 sec: 59149.0). Total num frames: 8221605888. Throughput: 0: 58925.1. Samples: 1126737600. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:52:14,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 22:52:14,453][54818] Updated weights for policy 0, policy_version 501808 (0.0025) [2024-04-27 22:52:17,192][54818] Updated weights for policy 0, policy_version 501818 (0.0029) [2024-04-27 22:52:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8221900800. Throughput: 0: 58959.6. Samples: 1127083860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:52:19,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 22:52:20,084][54818] Updated weights for policy 0, policy_version 501828 (0.0026) [2024-04-27 22:52:22,710][54818] Updated weights for policy 0, policy_version 501838 (0.0025) [2024-04-27 22:52:24,253][54587] Fps is (10 sec: 58983.8, 60 sec: 58436.3, 300 sec: 59093.5). Total num frames: 8222195712. Throughput: 0: 58753.9. Samples: 1127432880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:52:24,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 22:52:25,644][54818] Updated weights for policy 0, policy_version 501848 (0.0026) [2024-04-27 22:52:28,203][54818] Updated weights for policy 0, policy_version 501858 (0.0023) [2024-04-27 22:52:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.5, 300 sec: 59149.3). Total num frames: 8222507008. Throughput: 0: 58945.0. Samples: 1127619100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:52:29,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 22:52:30,767][54798] Signal inference workers to stop experience collection... (17450 times) [2024-04-27 22:52:30,774][54798] Signal inference workers to resume experience collection... 
(17450 times) [2024-04-27 22:52:30,796][54818] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-04-27 22:52:30,796][54818] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-04-27 22:52:31,406][54818] Updated weights for policy 0, policy_version 501868 (0.0026) [2024-04-27 22:52:33,700][54818] Updated weights for policy 0, policy_version 501878 (0.0025) [2024-04-27 22:52:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8222818304. Throughput: 0: 59104.9. Samples: 1127985580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:52:34,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 22:52:36,876][54818] Updated weights for policy 0, policy_version 501888 (0.0025) [2024-04-27 22:52:39,070][54818] Updated weights for policy 0, policy_version 501898 (0.0026) [2024-04-27 22:52:39,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8223096832. Throughput: 0: 59168.0. Samples: 1128334320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:52:39,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-27 22:52:42,331][54818] Updated weights for policy 0, policy_version 501908 (0.0026) [2024-04-27 22:52:44,253][54587] Fps is (10 sec: 57343.5, 60 sec: 59528.5, 300 sec: 59037.9). Total num frames: 8223391744. Throughput: 0: 59481.7. Samples: 1128516820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-27 22:52:44,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 22:52:44,552][54818] Updated weights for policy 0, policy_version 501918 (0.0024) [2024-04-27 22:52:47,827][54818] Updated weights for policy 0, policy_version 501928 (0.0024) [2024-04-27 22:52:49,253][54587] Fps is (10 sec: 55706.2, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8223653888. Throughput: 0: 59412.5. Samples: 1128869580. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:52:49,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 22:52:50,078][54818] Updated weights for policy 0, policy_version 501938 (0.0026) [2024-04-27 22:52:53,243][54818] Updated weights for policy 0, policy_version 501948 (0.0026) [2024-04-27 22:52:54,253][54587] Fps is (10 sec: 55706.2, 60 sec: 58709.5, 300 sec: 58926.9). Total num frames: 8223948800. Throughput: 0: 59279.1. Samples: 1129220820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:52:54,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 22:52:55,638][54818] Updated weights for policy 0, policy_version 501958 (0.0026) [2024-04-27 22:52:58,809][54818] Updated weights for policy 0, policy_version 501968 (0.0026) [2024-04-27 22:52:59,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8224243712. Throughput: 0: 59037.7. Samples: 1129394280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:52:59,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 22:53:01,394][54818] Updated weights for policy 0, policy_version 501978 (0.0024) [2024-04-27 22:53:04,188][54818] Updated weights for policy 0, policy_version 501988 (0.0026) [2024-04-27 22:53:04,253][54587] Fps is (10 sec: 62258.9, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8224571392. Throughput: 0: 59293.3. Samples: 1129752060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:04,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 22:53:06,885][54818] Updated weights for policy 0, policy_version 501998 (0.0025) [2024-04-27 22:53:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8224849920. Throughput: 0: 59517.7. Samples: 1130111180. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:09,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 22:53:10,027][54818] Updated weights for policy 0, policy_version 502008 (0.0023) [2024-04-27 22:53:12,295][54818] Updated weights for policy 0, policy_version 502018 (0.0026) [2024-04-27 22:53:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.6, 300 sec: 59038.0). Total num frames: 8225144832. Throughput: 0: 59147.1. Samples: 1130280720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:14,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 22:53:15,500][54818] Updated weights for policy 0, policy_version 502028 (0.0025) [2024-04-27 22:53:17,871][54818] Updated weights for policy 0, policy_version 502038 (0.0026) [2024-04-27 22:53:19,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8225439744. Throughput: 0: 58973.1. Samples: 1130639380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:19,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 22:53:21,044][54818] Updated weights for policy 0, policy_version 502048 (0.0026) [2024-04-27 22:53:23,303][54818] Updated weights for policy 0, policy_version 502058 (0.0025) [2024-04-27 22:53:24,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8225734656. Throughput: 0: 59057.8. Samples: 1130991920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:24,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 22:53:26,699][54798] Signal inference workers to stop experience collection... (17500 times) [2024-04-27 22:53:26,742][54818] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-04-27 22:53:26,759][54798] Signal inference workers to resume experience collection... 
(17500 times) [2024-04-27 22:53:26,760][54818] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-04-27 22:53:26,763][54818] Updated weights for policy 0, policy_version 502068 (0.0025) [2024-04-27 22:53:29,058][54818] Updated weights for policy 0, policy_version 502078 (0.0026) [2024-04-27 22:53:29,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8226045952. Throughput: 0: 59166.7. Samples: 1131179320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:29,254][54587] Avg episode reward: [(0, '0.487')] [2024-04-27 22:53:32,155][54818] Updated weights for policy 0, policy_version 502088 (0.0027) [2024-04-27 22:53:34,253][54587] Fps is (10 sec: 60622.2, 60 sec: 58709.5, 300 sec: 59204.6). Total num frames: 8226340864. Throughput: 0: 58928.2. Samples: 1131521340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:34,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 22:53:34,404][54818] Updated weights for policy 0, policy_version 502098 (0.0027) [2024-04-27 22:53:37,548][54818] Updated weights for policy 0, policy_version 502108 (0.0026) [2024-04-27 22:53:39,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8226635776. Throughput: 0: 59008.9. Samples: 1131876220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:39,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 22:53:40,446][54818] Updated weights for policy 0, policy_version 502118 (0.0025) [2024-04-27 22:53:43,011][54818] Updated weights for policy 0, policy_version 502128 (0.0024) [2024-04-27 22:53:44,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8226930688. Throughput: 0: 59368.8. Samples: 1132065880. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:44,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 22:53:45,951][54818] Updated weights for policy 0, policy_version 502138 (0.0026) [2024-04-27 22:53:48,560][54818] Updated weights for policy 0, policy_version 502148 (0.0026) [2024-04-27 22:53:49,253][54587] Fps is (10 sec: 62259.6, 60 sec: 60074.7, 300 sec: 59149.0). Total num frames: 8227258368. Throughput: 0: 59196.1. Samples: 1132415880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:49,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 22:53:49,263][54587] No heartbeat for components: RolloutWorker_w4 (5377 seconds) [2024-04-27 22:53:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000502152_8227258368.pth... [2024-04-27 22:53:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000501282_8213004288.pth [2024-04-27 22:53:51,409][54818] Updated weights for policy 0, policy_version 502158 (0.0023) [2024-04-27 22:53:53,958][54818] Updated weights for policy 0, policy_version 502168 (0.0024) [2024-04-27 22:53:54,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59528.6, 300 sec: 59038.0). Total num frames: 8227520512. Throughput: 0: 59101.5. Samples: 1132770740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:54,253][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 22:53:57,101][54818] Updated weights for policy 0, policy_version 502178 (0.0024) [2024-04-27 22:53:59,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59801.6, 300 sec: 58982.4). Total num frames: 8227831808. Throughput: 0: 59191.6. Samples: 1132944340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:53:59,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 22:53:59,423][54818] Updated weights for policy 0, policy_version 502188 (0.0027) [2024-04-27 22:54:02,853][54818] Updated weights for policy 0, policy_version 502198 (0.0029) [2024-04-27 22:54:04,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.5, 300 sec: 58926.9). Total num frames: 8228093952. Throughput: 0: 59236.3. Samples: 1133305000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:54:04,253][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 22:54:05,105][54818] Updated weights for policy 0, policy_version 502208 (0.0027) [2024-04-27 22:54:08,428][54818] Updated weights for policy 0, policy_version 502218 (0.0025) [2024-04-27 22:54:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8228388864. Throughput: 0: 59309.9. Samples: 1133660860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-27 22:54:09,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 22:54:10,530][54818] Updated weights for policy 0, policy_version 502228 (0.0024) [2024-04-27 22:54:11,543][54798] Signal inference workers to stop experience collection... (17550 times) [2024-04-27 22:54:11,543][54798] Signal inference workers to resume experience collection... (17550 times) [2024-04-27 22:54:11,555][54818] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-04-27 22:54:11,555][54818] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-04-27 22:54:14,135][54818] Updated weights for policy 0, policy_version 502238 (0.0026) [2024-04-27 22:54:14,253][54587] Fps is (10 sec: 57342.8, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8228667392. Throughput: 0: 58705.3. Samples: 1133821060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:14,254][54587] Avg episode reward: [(0, '0.702')] [2024-04-27 22:54:16,060][54818] Updated weights for policy 0, policy_version 502248 (0.0025) [2024-04-27 22:54:19,253][54587] Fps is (10 sec: 55705.4, 60 sec: 58436.4, 300 sec: 58982.4). Total num frames: 8228945920. Throughput: 0: 59030.0. Samples: 1134177700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:19,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 22:54:19,552][54818] Updated weights for policy 0, policy_version 502258 (0.0027) [2024-04-27 22:54:21,706][54818] Updated weights for policy 0, policy_version 502268 (0.0025) [2024-04-27 22:54:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8229273600. Throughput: 0: 59132.0. Samples: 1134537160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:24,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 22:54:25,069][54818] Updated weights for policy 0, policy_version 502278 (0.0026) [2024-04-27 22:54:27,235][54818] Updated weights for policy 0, policy_version 502288 (0.0025) [2024-04-27 22:54:29,253][54587] Fps is (10 sec: 62258.9, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8229568512. Throughput: 0: 58749.7. Samples: 1134709620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:29,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 22:54:30,678][54818] Updated weights for policy 0, policy_version 502298 (0.0027) [2024-04-27 22:54:32,872][54818] Updated weights for policy 0, policy_version 502308 (0.0025) [2024-04-27 22:54:34,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8229879808. Throughput: 0: 58809.8. Samples: 1135062320. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:34,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 22:54:36,070][54818] Updated weights for policy 0, policy_version 502318 (0.0026) [2024-04-27 22:54:38,260][54818] Updated weights for policy 0, policy_version 502328 (0.0025) [2024-04-27 22:54:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8230174720. Throughput: 0: 58742.1. Samples: 1135414140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:39,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 22:54:41,512][54818] Updated weights for policy 0, policy_version 502338 (0.0026) [2024-04-27 22:54:43,756][54818] Updated weights for policy 0, policy_version 502348 (0.0027) [2024-04-27 22:54:44,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8230486016. Throughput: 0: 58984.9. Samples: 1135598660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:44,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 22:54:46,934][54818] Updated weights for policy 0, policy_version 502358 (0.0030) [2024-04-27 22:54:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8230780928. Throughput: 0: 58799.4. Samples: 1135950980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:49,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 22:54:49,328][54818] Updated weights for policy 0, policy_version 502368 (0.0025) [2024-04-27 22:54:52,415][54818] Updated weights for policy 0, policy_version 502378 (0.0023) [2024-04-27 22:54:54,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.4, 300 sec: 59149.1). Total num frames: 8231075840. Throughput: 0: 58728.9. Samples: 1136303660. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:54,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 22:54:54,894][54818] Updated weights for policy 0, policy_version 502388 (0.0024) [2024-04-27 22:54:57,776][54818] Updated weights for policy 0, policy_version 502398 (0.0026) [2024-04-27 22:54:59,253][54587] Fps is (10 sec: 58981.4, 60 sec: 58982.2, 300 sec: 59149.0). Total num frames: 8231370752. Throughput: 0: 59403.1. Samples: 1136494200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:54:59,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 22:55:00,516][54818] Updated weights for policy 0, policy_version 502408 (0.0026) [2024-04-27 22:55:02,707][54798] Signal inference workers to stop experience collection... (17600 times) [2024-04-27 22:55:02,708][54798] Signal inference workers to resume experience collection... (17600 times) [2024-04-27 22:55:02,734][54818] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-04-27 22:55:02,734][54818] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-04-27 22:55:03,416][54818] Updated weights for policy 0, policy_version 502418 (0.0025) [2024-04-27 22:55:04,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8231665664. Throughput: 0: 59280.9. Samples: 1136845340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:55:04,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-27 22:55:06,168][54818] Updated weights for policy 0, policy_version 502428 (0.0025) [2024-04-27 22:55:09,054][54818] Updated weights for policy 0, policy_version 502438 (0.0024) [2024-04-27 22:55:09,253][54587] Fps is (10 sec: 57344.8, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8231944192. Throughput: 0: 58961.8. Samples: 1137190440. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:55:09,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 22:55:12,213][54818] Updated weights for policy 0, policy_version 502448 (0.0026) [2024-04-27 22:55:14,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59801.7, 300 sec: 59093.5). Total num frames: 8232255488. Throughput: 0: 59130.4. Samples: 1137370480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:55:14,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 22:55:14,553][54818] Updated weights for policy 0, policy_version 502458 (0.0026) [2024-04-27 22:55:17,790][54818] Updated weights for policy 0, policy_version 502468 (0.0030) [2024-04-27 22:55:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60074.7, 300 sec: 59037.9). Total num frames: 8232550400. Throughput: 0: 59417.2. Samples: 1137736100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:55:19,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 22:55:20,097][54818] Updated weights for policy 0, policy_version 502478 (0.0027) [2024-04-27 22:55:23,448][54818] Updated weights for policy 0, policy_version 502488 (0.0026) [2024-04-27 22:55:24,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59255.5, 300 sec: 59038.0). Total num frames: 8232828928. Throughput: 0: 59480.9. Samples: 1138090780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:55:24,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 22:55:25,424][54818] Updated weights for policy 0, policy_version 502498 (0.0024) [2024-04-27 22:55:28,852][54818] Updated weights for policy 0, policy_version 502508 (0.0026) [2024-04-27 22:55:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.5, 300 sec: 59037.9). Total num frames: 8233123840. Throughput: 0: 59049.7. Samples: 1138255900. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:55:29,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 22:55:30,962][54818] Updated weights for policy 0, policy_version 502518 (0.0026) [2024-04-27 22:55:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8233402368. Throughput: 0: 59072.1. Samples: 1138609220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:55:34,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:55:34,482][54818] Updated weights for policy 0, policy_version 502528 (0.0025) [2024-04-27 22:55:36,430][54818] Updated weights for policy 0, policy_version 502538 (0.0026) [2024-04-27 22:55:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 8233697280. Throughput: 0: 59378.9. Samples: 1138975720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-27 22:55:39,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 22:55:39,843][54818] Updated weights for policy 0, policy_version 502548 (0.0026) [2024-04-27 22:55:42,101][54818] Updated weights for policy 0, policy_version 502558 (0.0025) [2024-04-27 22:55:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8234008576. Throughput: 0: 58860.7. Samples: 1139142920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:55:44,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 22:55:45,242][54818] Updated weights for policy 0, policy_version 502568 (0.0025) [2024-04-27 22:55:47,804][54818] Updated weights for policy 0, policy_version 502578 (0.0026) [2024-04-27 22:55:49,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58709.4, 300 sec: 59038.0). Total num frames: 8234303488. Throughput: 0: 59078.7. Samples: 1139503880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:55:49,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 22:55:49,277][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000502583_8234319872.pth... [2024-04-27 22:55:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000501717_8220131328.pth [2024-04-27 22:55:50,733][54818] Updated weights for policy 0, policy_version 502588 (0.0026) [2024-04-27 22:55:53,251][54818] Updated weights for policy 0, policy_version 502598 (0.0024) [2024-04-27 22:55:54,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8234614784. Throughput: 0: 59277.3. Samples: 1139857920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:55:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 22:55:56,179][54818] Updated weights for policy 0, policy_version 502608 (0.0025) [2024-04-27 22:55:58,349][54798] Signal inference workers to stop experience collection... (17650 times) [2024-04-27 22:55:58,351][54798] Signal inference workers to resume experience collection... (17650 times) [2024-04-27 22:55:58,363][54818] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-04-27 22:55:58,363][54818] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-04-27 22:55:58,697][54818] Updated weights for policy 0, policy_version 502618 (0.0025) [2024-04-27 22:55:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8234909696. Throughput: 0: 59145.7. Samples: 1140032040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:55:59,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 22:56:01,640][54818] Updated weights for policy 0, policy_version 502628 (0.0026) [2024-04-27 22:56:04,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8235204608. Throughput: 0: 58932.1. 
Samples: 1140388040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:04,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 22:56:04,329][54818] Updated weights for policy 0, policy_version 502638 (0.0026) [2024-04-27 22:56:07,053][54818] Updated weights for policy 0, policy_version 502648 (0.0025) [2024-04-27 22:56:09,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59255.6, 300 sec: 59038.0). Total num frames: 8235499520. Throughput: 0: 58998.8. Samples: 1140745720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:09,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 22:56:10,001][54818] Updated weights for policy 0, policy_version 502658 (0.0026) [2024-04-27 22:56:12,642][54818] Updated weights for policy 0, policy_version 502668 (0.0026) [2024-04-27 22:56:14,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8235794432. Throughput: 0: 59503.9. Samples: 1140933580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:14,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 22:56:15,462][54818] Updated weights for policy 0, policy_version 502678 (0.0024) [2024-04-27 22:56:17,960][54818] Updated weights for policy 0, policy_version 502688 (0.0025) [2024-04-27 22:56:19,253][54587] Fps is (10 sec: 60619.5, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8236105728. Throughput: 0: 59410.9. Samples: 1141282720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:19,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 22:56:20,848][54818] Updated weights for policy 0, policy_version 502698 (0.0023) [2024-04-27 22:56:23,560][54818] Updated weights for policy 0, policy_version 502708 (0.0029) [2024-04-27 22:56:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8236400640. Throughput: 0: 59026.3. Samples: 1141631900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:24,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 22:56:26,671][54818] Updated weights for policy 0, policy_version 502718 (0.0026) [2024-04-27 22:56:29,127][54818] Updated weights for policy 0, policy_version 502728 (0.0024) [2024-04-27 22:56:29,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8236695552. Throughput: 0: 59446.6. Samples: 1141818020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:29,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 22:56:32,360][54818] Updated weights for policy 0, policy_version 502738 (0.0026) [2024-04-27 22:56:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59801.5, 300 sec: 59204.6). Total num frames: 8236990464. Throughput: 0: 59304.8. Samples: 1142172600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:34,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 22:56:34,624][54818] Updated weights for policy 0, policy_version 502748 (0.0024) [2024-04-27 22:56:37,840][54818] Updated weights for policy 0, policy_version 502758 (0.0026) [2024-04-27 22:56:39,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59801.6, 300 sec: 59204.6). Total num frames: 8237285376. Throughput: 0: 59408.4. Samples: 1142531300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 22:56:40,075][54818] Updated weights for policy 0, policy_version 502768 (0.0023) [2024-04-27 22:56:43,550][54818] Updated weights for policy 0, policy_version 502778 (0.0026) [2024-04-27 22:56:44,059][54798] Signal inference workers to stop experience collection... (17700 times) [2024-04-27 22:56:44,079][54818] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-04-27 22:56:44,154][54798] Signal inference workers to resume experience collection... 
(17700 times) [2024-04-27 22:56:44,154][54818] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-04-27 22:56:44,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8237580288. Throughput: 0: 59317.5. Samples: 1142701320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:44,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:56:46,053][54818] Updated weights for policy 0, policy_version 502788 (0.0025) [2024-04-27 22:56:49,020][54818] Updated weights for policy 0, policy_version 502798 (0.0025) [2024-04-27 22:56:49,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58982.3, 300 sec: 59038.0). Total num frames: 8237842432. Throughput: 0: 59406.6. Samples: 1143061340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:49,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 22:56:49,260][54587] No heartbeat for components: RolloutWorker_w4 (5557 seconds) [2024-04-27 22:56:51,439][54818] Updated weights for policy 0, policy_version 502808 (0.0025) [2024-04-27 22:56:54,253][54587] Fps is (10 sec: 57343.0, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8238153728. Throughput: 0: 59252.2. Samples: 1143412080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:54,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 22:56:54,541][54818] Updated weights for policy 0, policy_version 502818 (0.0026) [2024-04-27 22:56:57,334][54818] Updated weights for policy 0, policy_version 502828 (0.0024) [2024-04-27 22:56:59,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8238448640. Throughput: 0: 58979.6. Samples: 1143587660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:56:59,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 22:56:59,990][54818] Updated weights for policy 0, policy_version 502838 (0.0024) [2024-04-27 22:57:02,633][54818] Updated weights for policy 0, policy_version 502848 (0.0026) [2024-04-27 22:57:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8238743552. Throughput: 0: 59139.6. Samples: 1143944000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-27 22:57:04,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 22:57:05,430][54818] Updated weights for policy 0, policy_version 502858 (0.0025) [2024-04-27 22:57:08,205][54818] Updated weights for policy 0, policy_version 502868 (0.0022) [2024-04-27 22:57:09,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58709.2, 300 sec: 59038.0). Total num frames: 8239022080. Throughput: 0: 59388.1. Samples: 1144304360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:09,254][54587] Avg episode reward: [(0, '0.712')] [2024-04-27 22:57:10,859][54818] Updated weights for policy 0, policy_version 502878 (0.0026) [2024-04-27 22:57:13,772][54818] Updated weights for policy 0, policy_version 502888 (0.0026) [2024-04-27 22:57:14,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58982.6, 300 sec: 59093.5). Total num frames: 8239333376. Throughput: 0: 58915.6. Samples: 1144469220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:14,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 22:57:16,268][54818] Updated weights for policy 0, policy_version 502898 (0.0024) [2024-04-27 22:57:19,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58709.5, 300 sec: 59093.5). Total num frames: 8239628288. Throughput: 0: 59035.7. Samples: 1144829200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:19,253][54587] Avg episode reward: [(0, '0.516')] [2024-04-27 22:57:19,276][54818] Updated weights for policy 0, policy_version 502908 (0.0025) [2024-04-27 22:57:21,681][54818] Updated weights for policy 0, policy_version 502918 (0.0025) [2024-04-27 22:57:24,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8239939584. Throughput: 0: 59178.4. Samples: 1145194320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:24,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 22:57:24,841][54818] Updated weights for policy 0, policy_version 502928 (0.0027) [2024-04-27 22:57:27,075][54818] Updated weights for policy 0, policy_version 502938 (0.0024) [2024-04-27 22:57:29,253][54587] Fps is (10 sec: 62258.0, 60 sec: 59255.3, 300 sec: 59093.5). Total num frames: 8240250880. Throughput: 0: 59277.5. Samples: 1145368820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:29,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-27 22:57:30,558][54798] Signal inference workers to stop experience collection... (17750 times) [2024-04-27 22:57:30,590][54818] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-04-27 22:57:30,618][54798] Signal inference workers to resume experience collection... (17750 times) [2024-04-27 22:57:30,619][54818] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-04-27 22:57:30,622][54818] Updated weights for policy 0, policy_version 502948 (0.0026) [2024-04-27 22:57:32,664][54818] Updated weights for policy 0, policy_version 502958 (0.0027) [2024-04-27 22:57:34,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8240545792. Throughput: 0: 59004.4. Samples: 1145716540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:34,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 22:57:35,893][54818] Updated weights for policy 0, policy_version 502968 (0.0026) [2024-04-27 22:57:38,107][54818] Updated weights for policy 0, policy_version 502978 (0.0026) [2024-04-27 22:57:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8240840704. Throughput: 0: 59043.5. Samples: 1146069040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:39,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 22:57:41,405][54818] Updated weights for policy 0, policy_version 502988 (0.0026) [2024-04-27 22:57:43,730][54818] Updated weights for policy 0, policy_version 502998 (0.0025) [2024-04-27 22:57:44,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59528.4, 300 sec: 59315.6). Total num frames: 8241152000. Throughput: 0: 59440.5. Samples: 1146262480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:44,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 22:57:46,918][54818] Updated weights for policy 0, policy_version 503008 (0.0026) [2024-04-27 22:57:49,209][54818] Updated weights for policy 0, policy_version 503018 (0.0026) [2024-04-27 22:57:49,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60074.7, 300 sec: 59315.6). Total num frames: 8241446912. Throughput: 0: 59424.9. Samples: 1146618120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:49,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 22:57:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000503018_8241446912.pth... [2024-04-27 22:57:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000502152_8227258368.pth [2024-04-27 22:57:52,467][54818] Updated weights for policy 0, policy_version 503028 (0.0026) [2024-04-27 22:57:54,253][54587] Fps is (10 sec: 57344.8, 60 sec: 59528.7, 300 sec: 59260.1). 
Total num frames: 8241725440. Throughput: 0: 59093.9. Samples: 1146963580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:54,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 22:57:54,742][54818] Updated weights for policy 0, policy_version 503038 (0.0026) [2024-04-27 22:57:58,101][54818] Updated weights for policy 0, policy_version 503048 (0.0026) [2024-04-27 22:57:59,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8242020352. Throughput: 0: 59616.3. Samples: 1147151960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:57:59,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 22:58:00,165][54818] Updated weights for policy 0, policy_version 503058 (0.0029) [2024-04-27 22:58:03,505][54818] Updated weights for policy 0, policy_version 503068 (0.0026) [2024-04-27 22:58:04,253][54587] Fps is (10 sec: 58981.5, 60 sec: 59528.5, 300 sec: 59204.5). Total num frames: 8242315264. Throughput: 0: 59598.9. Samples: 1147511160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:58:04,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:58:05,950][54818] Updated weights for policy 0, policy_version 503078 (0.0026) [2024-04-27 22:58:08,945][54818] Updated weights for policy 0, policy_version 503088 (0.0026) [2024-04-27 22:58:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8242593792. Throughput: 0: 59271.1. Samples: 1147861520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:58:09,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 22:58:11,443][54818] Updated weights for policy 0, policy_version 503098 (0.0024) [2024-04-27 22:58:14,253][54587] Fps is (10 sec: 55706.2, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8242872320. Throughput: 0: 59061.5. Samples: 1148026580. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:58:14,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 22:58:14,506][54818] Updated weights for policy 0, policy_version 503108 (0.0024) [2024-04-27 22:58:16,815][54818] Updated weights for policy 0, policy_version 503118 (0.0025) [2024-04-27 22:58:19,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8243167232. Throughput: 0: 59404.6. Samples: 1148389740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:58:19,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 22:58:19,978][54818] Updated weights for policy 0, policy_version 503128 (0.0027) [2024-04-27 22:58:22,415][54818] Updated weights for policy 0, policy_version 503138 (0.0026) [2024-04-27 22:58:24,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.3, 300 sec: 59038.0). Total num frames: 8243462144. Throughput: 0: 59560.1. Samples: 1148749240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:58:24,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 22:58:25,106][54798] Signal inference workers to stop experience collection... (17800 times) [2024-04-27 22:58:25,140][54818] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-04-27 22:58:25,169][54798] Signal inference workers to resume experience collection... (17800 times) [2024-04-27 22:58:25,169][54818] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-04-27 22:58:25,585][54818] Updated weights for policy 0, policy_version 503148 (0.0027) [2024-04-27 22:58:27,820][54818] Updated weights for policy 0, policy_version 503158 (0.0025) [2024-04-27 22:58:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.4, 300 sec: 59093.4). Total num frames: 8243773440. Throughput: 0: 58883.1. Samples: 1148912220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:58:29,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 22:58:30,988][54818] Updated weights for policy 0, policy_version 503168 (0.0024) [2024-04-27 22:58:33,616][54818] Updated weights for policy 0, policy_version 503178 (0.0025) [2024-04-27 22:58:34,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8244068352. Throughput: 0: 59061.3. Samples: 1149275880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-27 22:58:34,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 22:58:36,491][54818] Updated weights for policy 0, policy_version 503188 (0.0026) [2024-04-27 22:58:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8244379648. Throughput: 0: 59051.4. Samples: 1149620900. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:58:39,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 22:58:39,604][54818] Updated weights for policy 0, policy_version 503198 (0.0026) [2024-04-27 22:58:42,109][54818] Updated weights for policy 0, policy_version 503208 (0.0028) [2024-04-27 22:58:44,253][54587] Fps is (10 sec: 62259.6, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8244690944. Throughput: 0: 59030.2. Samples: 1149808320. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:58:44,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 22:58:45,396][54818] Updated weights for policy 0, policy_version 503218 (0.0023) [2024-04-27 22:58:47,701][54818] Updated weights for policy 0, policy_version 503228 (0.0025) [2024-04-27 22:58:49,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58982.5, 300 sec: 59204.5). Total num frames: 8244985856. Throughput: 0: 58914.8. Samples: 1150162320. 
Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:58:49,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 22:58:50,885][54818] Updated weights for policy 0, policy_version 503238 (0.0026) [2024-04-27 22:58:53,068][54818] Updated weights for policy 0, policy_version 503248 (0.0024) [2024-04-27 22:58:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8245297152. Throughput: 0: 58828.4. Samples: 1150508800. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:58:54,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 22:58:56,631][54818] Updated weights for policy 0, policy_version 503258 (0.0026) [2024-04-27 22:58:58,664][54818] Updated weights for policy 0, policy_version 503268 (0.0026) [2024-04-27 22:58:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8245575680. Throughput: 0: 59459.5. Samples: 1150702260. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:58:59,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 22:59:02,075][54818] Updated weights for policy 0, policy_version 503278 (0.0026) [2024-04-27 22:59:03,619][54798] Signal inference workers to stop experience collection... (17850 times) [2024-04-27 22:59:03,619][54798] Signal inference workers to resume experience collection... (17850 times) [2024-04-27 22:59:03,632][54818] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-04-27 22:59:03,633][54818] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-04-27 22:59:04,163][54818] Updated weights for policy 0, policy_version 503288 (0.0025) [2024-04-27 22:59:04,253][54587] Fps is (10 sec: 57344.4, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8245870592. Throughput: 0: 58995.6. Samples: 1151044540. 
Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:04,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 22:59:07,490][54818] Updated weights for policy 0, policy_version 503298 (0.0026) [2024-04-27 22:59:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8246149120. Throughput: 0: 58978.8. Samples: 1151403280. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:09,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 22:59:09,641][54818] Updated weights for policy 0, policy_version 503308 (0.0025) [2024-04-27 22:59:12,917][54818] Updated weights for policy 0, policy_version 503318 (0.0025) [2024-04-27 22:59:14,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8246444032. Throughput: 0: 59256.9. Samples: 1151578780. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:14,254][54587] Avg episode reward: [(0, '0.685')] [2024-04-27 22:59:15,190][54818] Updated weights for policy 0, policy_version 503328 (0.0025) [2024-04-27 22:59:18,399][54818] Updated weights for policy 0, policy_version 503338 (0.0025) [2024-04-27 22:59:19,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8246738944. Throughput: 0: 59219.7. Samples: 1151940760. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:19,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 22:59:20,723][54818] Updated weights for policy 0, policy_version 503348 (0.0027) [2024-04-27 22:59:23,839][54818] Updated weights for policy 0, policy_version 503358 (0.0026) [2024-04-27 22:59:24,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59801.5, 300 sec: 59260.1). Total num frames: 8247050240. Throughput: 0: 59347.0. Samples: 1152291520. 
Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:24,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 22:59:26,297][54818] Updated weights for policy 0, policy_version 503368 (0.0026) [2024-04-27 22:59:29,248][54818] Updated weights for policy 0, policy_version 503378 (0.0023) [2024-04-27 22:59:29,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59528.5, 300 sec: 59204.5). Total num frames: 8247345152. Throughput: 0: 58926.6. Samples: 1152460020. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:29,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:59:31,639][54818] Updated weights for policy 0, policy_version 503388 (0.0026) [2024-04-27 22:59:34,253][54587] Fps is (10 sec: 55706.5, 60 sec: 58982.6, 300 sec: 59093.5). Total num frames: 8247607296. Throughput: 0: 58929.0. Samples: 1152814120. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:34,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 22:59:34,789][54818] Updated weights for policy 0, policy_version 503398 (0.0026) [2024-04-27 22:59:37,215][54818] Updated weights for policy 0, policy_version 503408 (0.0024) [2024-04-27 22:59:39,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8247902208. Throughput: 0: 59346.6. Samples: 1153179400. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:39,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 22:59:40,353][54818] Updated weights for policy 0, policy_version 503418 (0.0026) [2024-04-27 22:59:42,869][54818] Updated weights for policy 0, policy_version 503428 (0.0026) [2024-04-27 22:59:44,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58436.2, 300 sec: 59037.9). Total num frames: 8248197120. Throughput: 0: 58897.7. Samples: 1153352660. 
Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:44,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 22:59:45,741][54818] Updated weights for policy 0, policy_version 503438 (0.0026) [2024-04-27 22:59:48,383][54818] Updated weights for policy 0, policy_version 503448 (0.0025) [2024-04-27 22:59:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8248508416. Throughput: 0: 59224.3. Samples: 1153709640. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:49,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 22:59:49,262][54587] No heartbeat for components: RolloutWorker_w4 (5737 seconds) [2024-04-27 22:59:49,372][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000503450_8248524800.pth... [2024-04-27 22:59:49,422][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000502583_8234319872.pth [2024-04-27 22:59:51,371][54818] Updated weights for policy 0, policy_version 503458 (0.0026) [2024-04-27 22:59:54,016][54818] Updated weights for policy 0, policy_version 503468 (0.0026) [2024-04-27 22:59:54,253][54587] Fps is (10 sec: 62259.2, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8248819712. Throughput: 0: 58944.7. Samples: 1154055800. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:54,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 22:59:56,774][54818] Updated weights for policy 0, policy_version 503478 (0.0025) [2024-04-27 22:59:59,078][54798] Signal inference workers to stop experience collection... (17900 times) [2024-04-27 22:59:59,100][54818] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-04-27 22:59:59,167][54798] Signal inference workers to resume experience collection... 
(17900 times) [2024-04-27 22:59:59,167][54818] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-04-27 22:59:59,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8249114624. Throughput: 0: 59117.8. Samples: 1154239080. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-04-27 22:59:59,254][54587] Avg episode reward: [(0, '0.689')] [2024-04-27 22:59:59,763][54818] Updated weights for policy 0, policy_version 503488 (0.0025) [2024-04-27 23:00:02,384][54818] Updated weights for policy 0, policy_version 503498 (0.0025) [2024-04-27 23:00:04,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8249425920. Throughput: 0: 59022.7. Samples: 1154596780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:04,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 23:00:05,393][54818] Updated weights for policy 0, policy_version 503508 (0.0026) [2024-04-27 23:00:07,950][54818] Updated weights for policy 0, policy_version 503518 (0.0026) [2024-04-27 23:00:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59528.3, 300 sec: 59204.5). Total num frames: 8249720832. Throughput: 0: 59131.5. Samples: 1154952440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:09,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 23:00:11,018][54818] Updated weights for policy 0, policy_version 503528 (0.0026) [2024-04-27 23:00:13,550][54818] Updated weights for policy 0, policy_version 503538 (0.0025) [2024-04-27 23:00:14,253][54587] Fps is (10 sec: 57342.8, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8249999360. Throughput: 0: 59361.3. Samples: 1155131280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:14,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 23:00:16,602][54818] Updated weights for policy 0, policy_version 503548 (0.0026) [2024-04-27 23:00:19,079][54818] Updated weights for policy 0, policy_version 503558 (0.0027) [2024-04-27 23:00:19,253][54587] Fps is (10 sec: 57344.6, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8250294272. Throughput: 0: 59384.3. Samples: 1155486420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 23:00:22,101][54818] Updated weights for policy 0, policy_version 503568 (0.0026) [2024-04-27 23:00:24,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8250589184. Throughput: 0: 58993.2. Samples: 1155834100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:24,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 23:00:24,698][54818] Updated weights for policy 0, policy_version 503578 (0.0026) [2024-04-27 23:00:27,528][54818] Updated weights for policy 0, policy_version 503588 (0.0025) [2024-04-27 23:00:29,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8250884096. Throughput: 0: 59375.6. Samples: 1156024560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:29,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 23:00:30,177][54818] Updated weights for policy 0, policy_version 503598 (0.0026) [2024-04-27 23:00:32,935][54818] Updated weights for policy 0, policy_version 503608 (0.0030) [2024-04-27 23:00:34,253][54587] Fps is (10 sec: 60621.7, 60 sec: 59801.5, 300 sec: 59315.6). Total num frames: 8251195392. Throughput: 0: 59298.2. Samples: 1156378060. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:34,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-27 23:00:35,688][54818] Updated weights for policy 0, policy_version 503618 (0.0026) [2024-04-27 23:00:38,367][54818] Updated weights for policy 0, policy_version 503628 (0.0024) [2024-04-27 23:00:39,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60074.8, 300 sec: 59315.6). Total num frames: 8251506688. Throughput: 0: 59471.7. Samples: 1156732020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:39,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 23:00:41,103][54818] Updated weights for policy 0, policy_version 503638 (0.0026) [2024-04-27 23:00:43,897][54818] Updated weights for policy 0, policy_version 503648 (0.0025) [2024-04-27 23:00:44,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59528.6, 300 sec: 59204.5). Total num frames: 8251768832. Throughput: 0: 59195.2. Samples: 1156902860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:44,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 23:00:46,681][54818] Updated weights for policy 0, policy_version 503658 (0.0026) [2024-04-27 23:00:49,253][54587] Fps is (10 sec: 57343.0, 60 sec: 59528.5, 300 sec: 59204.5). Total num frames: 8252080128. Throughput: 0: 59443.3. Samples: 1157271740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:49,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 23:00:49,509][54818] Updated weights for policy 0, policy_version 503668 (0.0024) [2024-04-27 23:00:51,967][54818] Updated weights for policy 0, policy_version 503678 (0.0026) [2024-04-27 23:00:54,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8252358656. Throughput: 0: 59294.0. Samples: 1157620660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:54,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 23:00:54,985][54818] Updated weights for policy 0, policy_version 503688 (0.0025) [2024-04-27 23:00:57,020][54798] Signal inference workers to stop experience collection... (17950 times) [2024-04-27 23:00:57,021][54798] Signal inference workers to resume experience collection... (17950 times) [2024-04-27 23:00:57,045][54818] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-04-27 23:00:57,046][54818] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-04-27 23:00:57,499][54818] Updated weights for policy 0, policy_version 503698 (0.0026) [2024-04-27 23:00:59,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8252653568. Throughput: 0: 59271.2. Samples: 1157798480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:00:59,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 23:01:00,290][54818] Updated weights for policy 0, policy_version 503708 (0.0022) [2024-04-27 23:01:03,772][54818] Updated weights for policy 0, policy_version 503718 (0.0024) [2024-04-27 23:01:04,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58709.2, 300 sec: 59149.0). Total num frames: 8252948480. Throughput: 0: 59212.8. Samples: 1158151000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:01:04,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 23:01:05,736][54818] Updated weights for policy 0, policy_version 503728 (0.0023) [2024-04-27 23:01:09,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.5, 300 sec: 59149.1). Total num frames: 8253243392. Throughput: 0: 59333.2. Samples: 1158504080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:01:09,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 23:01:09,256][54818] Updated weights for policy 0, policy_version 503738 (0.0023) [2024-04-27 23:01:11,284][54818] Updated weights for policy 0, policy_version 503748 (0.0026) [2024-04-27 23:01:14,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58982.6, 300 sec: 59093.5). Total num frames: 8253538304. Throughput: 0: 58921.8. Samples: 1158676040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:01:14,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 23:01:14,723][54818] Updated weights for policy 0, policy_version 503758 (0.0026) [2024-04-27 23:01:16,822][54818] Updated weights for policy 0, policy_version 503768 (0.0026) [2024-04-27 23:01:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8253849600. Throughput: 0: 59206.7. Samples: 1159042360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:01:19,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 23:01:20,216][54818] Updated weights for policy 0, policy_version 503778 (0.0026) [2024-04-27 23:01:22,460][54818] Updated weights for policy 0, policy_version 503788 (0.0026) [2024-04-27 23:01:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.7, 300 sec: 59149.0). Total num frames: 8254144512. Throughput: 0: 59188.9. Samples: 1159395520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:01:24,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:01:25,678][54818] Updated weights for policy 0, policy_version 503798 (0.0026) [2024-04-27 23:01:28,239][54818] Updated weights for policy 0, policy_version 503808 (0.0023) [2024-04-27 23:01:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8254423040. Throughput: 0: 59353.3. Samples: 1159573760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:01:29,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 23:01:31,211][54818] Updated weights for policy 0, policy_version 503818 (0.0026) [2024-04-27 23:01:33,827][54818] Updated weights for policy 0, policy_version 503828 (0.0026) [2024-04-27 23:01:34,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8254750720. Throughput: 0: 58945.0. Samples: 1159924260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:01:34,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 23:01:34,742][54798] Signal inference workers to stop experience collection... (18000 times) [2024-04-27 23:01:34,746][54798] Signal inference workers to resume experience collection... (18000 times) [2024-04-27 23:01:34,772][54818] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-04-27 23:01:34,772][54818] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-04-27 23:01:36,601][54818] Updated weights for policy 0, policy_version 503838 (0.0024) [2024-04-27 23:01:39,212][54818] Updated weights for policy 0, policy_version 503848 (0.0026) [2024-04-27 23:01:39,253][54587] Fps is (10 sec: 62259.2, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8255045632. Throughput: 0: 59184.4. Samples: 1160283960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:01:39,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 23:01:42,185][54818] Updated weights for policy 0, policy_version 503858 (0.0021) [2024-04-27 23:01:44,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59528.6, 300 sec: 59315.7). Total num frames: 8255340544. Throughput: 0: 59302.8. Samples: 1160467100. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:01:44,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:01:44,643][54818] Updated weights for policy 0, policy_version 503868 (0.0025) [2024-04-27 23:01:47,735][54818] Updated weights for policy 0, policy_version 503878 (0.0025) [2024-04-27 23:01:49,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8255635456. Throughput: 0: 59358.3. Samples: 1160822120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:01:49,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 23:01:49,320][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000503885_8255651840.pth... [2024-04-27 23:01:49,372][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000503018_8241446912.pth [2024-04-27 23:01:50,111][54818] Updated weights for policy 0, policy_version 503888 (0.0026) [2024-04-27 23:01:53,057][54818] Updated weights for policy 0, policy_version 503898 (0.0024) [2024-04-27 23:01:54,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8255930368. Throughput: 0: 59260.0. Samples: 1161170780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:01:54,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 23:01:55,980][54818] Updated weights for policy 0, policy_version 503908 (0.0027) [2024-04-27 23:01:58,520][54818] Updated weights for policy 0, policy_version 503918 (0.0025) [2024-04-27 23:01:59,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59528.7, 300 sec: 59260.1). Total num frames: 8256225280. Throughput: 0: 59376.0. Samples: 1161347960. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:01:59,254][54587] Avg episode reward: [(0, '0.693')] [2024-04-27 23:02:01,520][54818] Updated weights for policy 0, policy_version 503928 (0.0025) [2024-04-27 23:02:04,140][54818] Updated weights for policy 0, policy_version 503938 (0.0026) [2024-04-27 23:02:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59528.6, 300 sec: 59315.6). Total num frames: 8256520192. Throughput: 0: 59307.5. Samples: 1161711200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:04,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 23:02:06,957][54818] Updated weights for policy 0, policy_version 503948 (0.0026) [2024-04-27 23:02:09,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8256815104. Throughput: 0: 59357.7. Samples: 1162066620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:09,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:02:09,754][54818] Updated weights for policy 0, policy_version 503958 (0.0023) [2024-04-27 23:02:12,367][54818] Updated weights for policy 0, policy_version 503968 (0.0026) [2024-04-27 23:02:14,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8257093632. Throughput: 0: 59202.8. Samples: 1162237880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:14,253][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 23:02:15,137][54818] Updated weights for policy 0, policy_version 503978 (0.0026) [2024-04-27 23:02:17,808][54818] Updated weights for policy 0, policy_version 503988 (0.0026) [2024-04-27 23:02:19,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8257388544. Throughput: 0: 59333.4. Samples: 1162594260. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:19,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 23:02:20,581][54818] Updated weights for policy 0, policy_version 503998 (0.0026) [2024-04-27 23:02:23,319][54818] Updated weights for policy 0, policy_version 504008 (0.0026) [2024-04-27 23:02:24,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8257683456. Throughput: 0: 59289.4. Samples: 1162951980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:24,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 23:02:26,173][54818] Updated weights for policy 0, policy_version 504018 (0.0024) [2024-04-27 23:02:29,252][54818] Updated weights for policy 0, policy_version 504028 (0.0025) [2024-04-27 23:02:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8257994752. Throughput: 0: 59161.2. Samples: 1163129360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:29,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 23:02:29,728][54798] Signal inference workers to stop experience collection... (18050 times) [2024-04-27 23:02:29,761][54818] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-04-27 23:02:29,780][54798] Signal inference workers to resume experience collection... (18050 times) [2024-04-27 23:02:29,780][54818] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-04-27 23:02:31,558][54818] Updated weights for policy 0, policy_version 504038 (0.0025) [2024-04-27 23:02:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8258306048. Throughput: 0: 59281.4. Samples: 1163489780. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:34,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 23:02:34,682][54818] Updated weights for policy 0, policy_version 504048 (0.0026) [2024-04-27 23:02:37,236][54818] Updated weights for policy 0, policy_version 504058 (0.0026) [2024-04-27 23:02:39,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59255.6, 300 sec: 59149.1). Total num frames: 8258600960. Throughput: 0: 59402.3. Samples: 1163843880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:39,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 23:02:40,156][54818] Updated weights for policy 0, policy_version 504068 (0.0026) [2024-04-27 23:02:42,833][54818] Updated weights for policy 0, policy_version 504078 (0.0027) [2024-04-27 23:02:44,253][54587] Fps is (10 sec: 58981.5, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8258895872. Throughput: 0: 59452.2. Samples: 1164023320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:44,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 23:02:45,689][54818] Updated weights for policy 0, policy_version 504088 (0.0024) [2024-04-27 23:02:48,502][54818] Updated weights for policy 0, policy_version 504098 (0.0026) [2024-04-27 23:02:49,253][54587] Fps is (10 sec: 58981.3, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8259190784. Throughput: 0: 59204.3. Samples: 1164375400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:49,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 23:02:49,264][54587] No heartbeat for components: RolloutWorker_w4 (5917 seconds) [2024-04-27 23:02:51,263][54818] Updated weights for policy 0, policy_version 504108 (0.0025) [2024-04-27 23:02:53,956][54818] Updated weights for policy 0, policy_version 504118 (0.0026) [2024-04-27 23:02:54,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8259485696. Throughput: 0: 59160.5. Samples: 1164728840. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-04-27 23:02:54,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 23:02:56,766][54818] Updated weights for policy 0, policy_version 504128 (0.0026) [2024-04-27 23:02:59,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8259780608. Throughput: 0: 59441.6. Samples: 1164912760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:02:59,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 23:02:59,457][54818] Updated weights for policy 0, policy_version 504138 (0.0025) [2024-04-27 23:03:02,356][54818] Updated weights for policy 0, policy_version 504148 (0.0022) [2024-04-27 23:03:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8260075520. Throughput: 0: 59329.7. Samples: 1165264100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:04,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 23:03:04,873][54818] Updated weights for policy 0, policy_version 504158 (0.0024) [2024-04-27 23:03:07,798][54818] Updated weights for policy 0, policy_version 504168 (0.0026) [2024-04-27 23:03:09,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 59315.6). Total num frames: 8260370432. Throughput: 0: 59105.6. Samples: 1165611740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:09,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 23:03:10,466][54818] Updated weights for policy 0, policy_version 504178 (0.0026) [2024-04-27 23:03:13,365][54818] Updated weights for policy 0, policy_version 504188 (0.0026) [2024-04-27 23:03:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59801.5, 300 sec: 59371.2). Total num frames: 8260681728. Throughput: 0: 59215.5. Samples: 1165794060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:14,263][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 23:03:15,975][54818] Updated weights for policy 0, policy_version 504198 (0.0026) [2024-04-27 23:03:18,704][54818] Updated weights for policy 0, policy_version 504208 (0.0027) [2024-04-27 23:03:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59801.5, 300 sec: 59371.2). Total num frames: 8260976640. Throughput: 0: 59267.4. Samples: 1166156820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:19,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 23:03:21,673][54818] Updated weights for policy 0, policy_version 504218 (0.0026) [2024-04-27 23:03:24,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8261255168. Throughput: 0: 59276.8. Samples: 1166511340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:24,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 23:03:24,317][54818] Updated weights for policy 0, policy_version 504228 (0.0025) [2024-04-27 23:03:27,235][54818] Updated weights for policy 0, policy_version 504238 (0.0024) [2024-04-27 23:03:29,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8261550080. Throughput: 0: 59071.7. Samples: 1166681540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:29,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 23:03:29,646][54818] Updated weights for policy 0, policy_version 504248 (0.0026) [2024-04-27 23:03:31,780][54798] Signal inference workers to stop experience collection... (18100 times) [2024-04-27 23:03:31,784][54798] Signal inference workers to resume experience collection... 
(18100 times) [2024-04-27 23:03:31,808][54818] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-04-27 23:03:31,809][54818] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-04-27 23:03:32,696][54818] Updated weights for policy 0, policy_version 504258 (0.0026) [2024-04-27 23:03:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8261861376. Throughput: 0: 59358.3. Samples: 1167046520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:34,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 23:03:35,158][54818] Updated weights for policy 0, policy_version 504268 (0.0025) [2024-04-27 23:03:38,116][54818] Updated weights for policy 0, policy_version 504278 (0.0025) [2024-04-27 23:03:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8262139904. Throughput: 0: 59359.1. Samples: 1167400000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:39,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 23:03:40,635][54818] Updated weights for policy 0, policy_version 504288 (0.0026) [2024-04-27 23:03:43,501][54818] Updated weights for policy 0, policy_version 504298 (0.0026) [2024-04-27 23:03:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8262434816. Throughput: 0: 59237.4. Samples: 1167578440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:44,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 23:03:46,062][54818] Updated weights for policy 0, policy_version 504308 (0.0026) [2024-04-27 23:03:49,096][54818] Updated weights for policy 0, policy_version 504318 (0.0025) [2024-04-27 23:03:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8262746112. Throughput: 0: 59247.1. Samples: 1167930220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:49,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 23:03:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000504318_8262746112.pth... [2024-04-27 23:03:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000503450_8248524800.pth [2024-04-27 23:03:51,924][54818] Updated weights for policy 0, policy_version 504328 (0.0024) [2024-04-27 23:03:54,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8263041024. Throughput: 0: 59531.1. Samples: 1168290640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:54,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 23:03:54,518][54818] Updated weights for policy 0, policy_version 504338 (0.0025) [2024-04-27 23:03:57,390][54818] Updated weights for policy 0, policy_version 504348 (0.0026) [2024-04-27 23:03:59,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8263319552. Throughput: 0: 59137.3. Samples: 1168455240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:03:59,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-27 23:04:00,350][54818] Updated weights for policy 0, policy_version 504358 (0.0026) [2024-04-27 23:04:02,941][54818] Updated weights for policy 0, policy_version 504368 (0.0024) [2024-04-27 23:04:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8263630848. Throughput: 0: 59120.9. Samples: 1168817260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:04:04,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 23:04:05,772][54818] Updated weights for policy 0, policy_version 504378 (0.0028) [2024-04-27 23:04:08,584][54818] Updated weights for policy 0, policy_version 504388 (0.0026) [2024-04-27 23:04:09,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.5, 300 sec: 59260.1). 
Total num frames: 8263925760. Throughput: 0: 59213.2. Samples: 1169175940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:04:09,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 23:04:11,303][54818] Updated weights for policy 0, policy_version 504398 (0.0023) [2024-04-27 23:04:14,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.3, 300 sec: 59204.5). Total num frames: 8264204288. Throughput: 0: 59464.3. Samples: 1169357440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:04:14,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 23:04:14,405][54818] Updated weights for policy 0, policy_version 504408 (0.0024) [2024-04-27 23:04:16,997][54818] Updated weights for policy 0, policy_version 504418 (0.0026) [2024-04-27 23:04:19,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8264515584. Throughput: 0: 59108.4. Samples: 1169706400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:04:19,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:04:19,889][54818] Updated weights for policy 0, policy_version 504428 (0.0026) [2024-04-27 23:04:22,594][54818] Updated weights for policy 0, policy_version 504438 (0.0026) [2024-04-27 23:04:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8264810496. Throughput: 0: 59022.1. Samples: 1170056000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:04:24,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 23:04:25,332][54818] Updated weights for policy 0, policy_version 504448 (0.0025) [2024-04-27 23:04:28,116][54818] Updated weights for policy 0, policy_version 504458 (0.0025) [2024-04-27 23:04:29,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59255.5, 300 sec: 59315.6). Total num frames: 8265105408. Throughput: 0: 59032.5. Samples: 1170234900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:04:29,253][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 23:04:30,833][54818] Updated weights for policy 0, policy_version 504468 (0.0025) [2024-04-27 23:04:31,830][54798] Signal inference workers to stop experience collection... (18150 times) [2024-04-27 23:04:31,830][54798] Signal inference workers to resume experience collection... (18150 times) [2024-04-27 23:04:31,843][54818] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-04-27 23:04:31,843][54818] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-04-27 23:04:33,574][54818] Updated weights for policy 0, policy_version 504478 (0.0025) [2024-04-27 23:04:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59255.5, 300 sec: 59371.2). Total num frames: 8265416704. Throughput: 0: 59146.7. Samples: 1170591820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:04:34,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 23:04:36,389][54818] Updated weights for policy 0, policy_version 504488 (0.0026) [2024-04-27 23:04:39,088][54818] Updated weights for policy 0, policy_version 504498 (0.0026) [2024-04-27 23:04:39,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.5, 300 sec: 59315.6). Total num frames: 8265695232. Throughput: 0: 59008.1. Samples: 1170946000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:04:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 23:04:42,017][54818] Updated weights for policy 0, policy_version 504508 (0.0026) [2024-04-27 23:04:44,253][54587] Fps is (10 sec: 55705.7, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8265973760. Throughput: 0: 59338.4. Samples: 1171125460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:04:44,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 23:04:44,564][54818] Updated weights for policy 0, policy_version 504518 (0.0023) [2024-04-27 23:04:47,537][54818] Updated weights for policy 0, policy_version 504528 (0.0026) [2024-04-27 23:04:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8266285056. Throughput: 0: 59198.8. Samples: 1171481200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:04:49,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 23:04:50,038][54818] Updated weights for policy 0, policy_version 504538 (0.0022) [2024-04-27 23:04:53,113][54818] Updated weights for policy 0, policy_version 504548 (0.0026) [2024-04-27 23:04:54,254][54587] Fps is (10 sec: 62255.1, 60 sec: 59254.9, 300 sec: 59260.0). Total num frames: 8266596352. Throughput: 0: 58901.9. Samples: 1171826560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:04:54,255][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 23:04:55,551][54818] Updated weights for policy 0, policy_version 504558 (0.0026) [2024-04-27 23:04:58,537][54818] Updated weights for policy 0, policy_version 504568 (0.0025) [2024-04-27 23:04:59,253][54587] Fps is (10 sec: 60619.6, 60 sec: 59528.5, 300 sec: 59204.5). Total num frames: 8266891264. Throughput: 0: 58994.6. Samples: 1172012200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:04:59,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 23:05:00,994][54818] Updated weights for policy 0, policy_version 504578 (0.0025) [2024-04-27 23:05:03,901][54818] Updated weights for policy 0, policy_version 504588 (0.0026) [2024-04-27 23:05:04,253][54587] Fps is (10 sec: 60624.2, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8267202560. Throughput: 0: 59236.4. Samples: 1172372040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:04,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 23:05:06,580][54818] Updated weights for policy 0, policy_version 504598 (0.0026) [2024-04-27 23:05:09,253][54587] Fps is (10 sec: 58983.7, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8267481088. Throughput: 0: 59321.9. Samples: 1172725480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:09,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 23:05:09,355][54818] Updated weights for policy 0, policy_version 504608 (0.0026) [2024-04-27 23:05:12,131][54818] Updated weights for policy 0, policy_version 504618 (0.0025) [2024-04-27 23:05:14,253][54587] Fps is (10 sec: 55706.1, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8267759616. Throughput: 0: 59163.1. Samples: 1172897240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:14,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 23:05:14,962][54818] Updated weights for policy 0, policy_version 504628 (0.0026) [2024-04-27 23:05:17,903][54818] Updated weights for policy 0, policy_version 504638 (0.0026) [2024-04-27 23:05:19,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8268054528. Throughput: 0: 59241.0. Samples: 1173257660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:19,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 23:05:20,348][54818] Updated weights for policy 0, policy_version 504648 (0.0025) [2024-04-27 23:05:23,311][54818] Updated weights for policy 0, policy_version 504658 (0.0026) [2024-04-27 23:05:24,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58709.5, 300 sec: 59149.0). Total num frames: 8268333056. Throughput: 0: 59401.0. Samples: 1173619040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:24,253][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 23:05:25,824][54818] Updated weights for policy 0, policy_version 504668 (0.0025) [2024-04-27 23:05:29,176][54818] Updated weights for policy 0, policy_version 504678 (0.0025) [2024-04-27 23:05:29,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8268644352. Throughput: 0: 59246.5. Samples: 1173791560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:29,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 23:05:29,923][54798] Signal inference workers to stop experience collection... (18200 times) [2024-04-27 23:05:29,923][54798] Signal inference workers to resume experience collection... (18200 times) [2024-04-27 23:05:29,944][54818] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-04-27 23:05:29,944][54818] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-04-27 23:05:31,333][54818] Updated weights for policy 0, policy_version 504688 (0.0025) [2024-04-27 23:05:34,253][54587] Fps is (10 sec: 62257.7, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8268955648. Throughput: 0: 59134.0. Samples: 1174142240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:34,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 23:05:34,828][54818] Updated weights for policy 0, policy_version 504698 (0.0026) [2024-04-27 23:05:36,999][54818] Updated weights for policy 0, policy_version 504708 (0.0026) [2024-04-27 23:05:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8269250560. Throughput: 0: 59383.1. Samples: 1174498760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:39,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 23:05:40,412][54818] Updated weights for policy 0, policy_version 504718 (0.0025) [2024-04-27 23:05:42,503][54818] Updated weights for policy 0, policy_version 504728 (0.0027) [2024-04-27 23:05:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59801.5, 300 sec: 59260.1). Total num frames: 8269561856. Throughput: 0: 59253.0. Samples: 1174678580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:44,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 23:05:45,725][54818] Updated weights for policy 0, policy_version 504738 (0.0025) [2024-04-27 23:05:48,084][54818] Updated weights for policy 0, policy_version 504748 (0.0026) [2024-04-27 23:05:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59801.5, 300 sec: 59371.2). Total num frames: 8269873152. Throughput: 0: 59286.3. Samples: 1175039920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-27 23:05:49,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-27 23:05:49,265][54587] No heartbeat for components: RolloutWorker_w4 (6097 seconds) [2024-04-27 23:05:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000504753_8269873152.pth... [2024-04-27 23:05:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000503885_8255651840.pth [2024-04-27 23:05:51,303][54818] Updated weights for policy 0, policy_version 504758 (0.0025) [2024-04-27 23:05:53,418][54818] Updated weights for policy 0, policy_version 504768 (0.0025) [2024-04-27 23:05:54,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59529.1, 300 sec: 59371.2). Total num frames: 8270168064. Throughput: 0: 59298.5. Samples: 1175393920. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:05:54,254][54587] Avg episode reward: [(0, '0.428')] [2024-04-27 23:05:56,776][54818] Updated weights for policy 0, policy_version 504778 (0.0025) [2024-04-27 23:05:58,811][54818] Updated weights for policy 0, policy_version 504788 (0.0028) [2024-04-27 23:05:59,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59255.6, 300 sec: 59315.7). Total num frames: 8270446592. Throughput: 0: 59732.8. Samples: 1175585220. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:05:59,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-27 23:06:02,141][54818] Updated weights for policy 0, policy_version 504798 (0.0026) [2024-04-27 23:06:04,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.5, 300 sec: 59371.2). Total num frames: 8270757888. Throughput: 0: 59604.8. Samples: 1175939880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:04,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 23:06:04,311][54818] Updated weights for policy 0, policy_version 504808 (0.0024) [2024-04-27 23:06:07,797][54818] Updated weights for policy 0, policy_version 504818 (0.0026) [2024-04-27 23:06:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59528.4, 300 sec: 59371.2). Total num frames: 8271052800. Throughput: 0: 59263.7. Samples: 1176285920. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:09,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 23:06:09,798][54818] Updated weights for policy 0, policy_version 504828 (0.0025) [2024-04-27 23:06:13,188][54818] Updated weights for policy 0, policy_version 504838 (0.0026) [2024-04-27 23:06:14,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8271331328. Throughput: 0: 59526.8. Samples: 1176470260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:14,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 23:06:14,903][54798] Signal inference workers to stop experience collection... 
(18250 times) [2024-04-27 23:06:14,937][54818] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-04-27 23:06:14,964][54798] Signal inference workers to resume experience collection... (18250 times) [2024-04-27 23:06:14,964][54818] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-04-27 23:06:15,365][54818] Updated weights for policy 0, policy_version 504848 (0.0026) [2024-04-27 23:06:18,622][54818] Updated weights for policy 0, policy_version 504858 (0.0023) [2024-04-27 23:06:19,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8271626240. Throughput: 0: 59618.8. Samples: 1176825080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:19,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 23:06:21,016][54818] Updated weights for policy 0, policy_version 504868 (0.0024) [2024-04-27 23:06:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8271904768. Throughput: 0: 59663.1. Samples: 1177183600. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:24,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 23:06:24,305][54818] Updated weights for policy 0, policy_version 504878 (0.0026) [2024-04-27 23:06:26,328][54818] Updated weights for policy 0, policy_version 504888 (0.0025) [2024-04-27 23:06:29,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59528.5, 300 sec: 59204.5). Total num frames: 8272216064. Throughput: 0: 59237.3. Samples: 1177344260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:29,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-27 23:06:29,849][54818] Updated weights for policy 0, policy_version 504898 (0.0024) [2024-04-27 23:06:31,881][54818] Updated weights for policy 0, policy_version 504908 (0.0024) [2024-04-27 23:06:34,254][54587] Fps is (10 sec: 58980.8, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8272494592. Throughput: 0: 59363.2. 
Samples: 1177711280. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:34,254][54587] Avg episode reward: [(0, '0.481')] [2024-04-27 23:06:35,455][54818] Updated weights for policy 0, policy_version 504918 (0.0029) [2024-04-27 23:06:37,482][54818] Updated weights for policy 0, policy_version 504928 (0.0028) [2024-04-27 23:06:39,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8272805888. Throughput: 0: 59291.1. Samples: 1178062020. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:39,262][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 23:06:41,043][54818] Updated weights for policy 0, policy_version 504938 (0.0025) [2024-04-27 23:06:43,421][54818] Updated weights for policy 0, policy_version 504948 (0.0024) [2024-04-27 23:06:44,253][54587] Fps is (10 sec: 58984.2, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8273084416. Throughput: 0: 58799.7. Samples: 1178231200. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:44,253][54587] Avg episode reward: [(0, '0.513')] [2024-04-27 23:06:46,370][54818] Updated weights for policy 0, policy_version 504958 (0.0029) [2024-04-27 23:06:48,841][54818] Updated weights for policy 0, policy_version 504968 (0.0024) [2024-04-27 23:06:49,253][54587] Fps is (10 sec: 58983.4, 60 sec: 58709.4, 300 sec: 59204.6). Total num frames: 8273395712. Throughput: 0: 58891.6. Samples: 1178590000. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:49,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 23:06:51,771][54818] Updated weights for policy 0, policy_version 504978 (0.0026) [2024-04-27 23:06:54,253][54587] Fps is (10 sec: 62258.0, 60 sec: 58982.3, 300 sec: 59260.1). Total num frames: 8273707008. Throughput: 0: 59114.6. Samples: 1178946080. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:54,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 23:06:54,498][54818] Updated weights for policy 0, policy_version 504988 (0.0026) [2024-04-27 23:06:57,303][54818] Updated weights for policy 0, policy_version 504998 (0.0024) [2024-04-27 23:06:59,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8274001920. Throughput: 0: 59191.4. Samples: 1179133880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:06:59,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 23:07:00,198][54818] Updated weights for policy 0, policy_version 505008 (0.0030) [2024-04-27 23:07:01,920][54798] Signal inference workers to stop experience collection... (18300 times) [2024-04-27 23:07:01,931][54818] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-04-27 23:07:02,013][54798] Signal inference workers to resume experience collection... (18300 times) [2024-04-27 23:07:02,013][54818] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-04-27 23:07:02,793][54818] Updated weights for policy 0, policy_version 505018 (0.0024) [2024-04-27 23:07:04,253][54587] Fps is (10 sec: 62259.5, 60 sec: 59528.5, 300 sec: 59371.2). Total num frames: 8274329600. Throughput: 0: 59179.4. Samples: 1179488160. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:07:04,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 23:07:05,613][54818] Updated weights for policy 0, policy_version 505028 (0.0026) [2024-04-27 23:07:08,261][54818] Updated weights for policy 0, policy_version 505038 (0.0026) [2024-04-27 23:07:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.4, 300 sec: 59371.1). Total num frames: 8274608128. Throughput: 0: 58982.9. Samples: 1179837840. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:07:09,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 23:07:11,156][54818] Updated weights for policy 0, policy_version 505048 (0.0027) [2024-04-27 23:07:13,861][54818] Updated weights for policy 0, policy_version 505058 (0.0025) [2024-04-27 23:07:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 59528.5, 300 sec: 59371.2). Total num frames: 8274903040. Throughput: 0: 59452.1. Samples: 1180019600. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-04-27 23:07:14,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 23:07:16,460][54818] Updated weights for policy 0, policy_version 505068 (0.0023) [2024-04-27 23:07:19,253][54587] Fps is (10 sec: 57344.7, 60 sec: 59255.5, 300 sec: 59315.6). Total num frames: 8275181568. Throughput: 0: 59289.2. Samples: 1180379280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:07:19,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 23:07:19,306][54818] Updated weights for policy 0, policy_version 505078 (0.0024) [2024-04-27 23:07:22,317][54818] Updated weights for policy 0, policy_version 505088 (0.0024) [2024-04-27 23:07:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8275476480. Throughput: 0: 59319.2. Samples: 1180731380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:07:24,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 23:07:24,885][54818] Updated weights for policy 0, policy_version 505098 (0.0026) [2024-04-27 23:07:27,645][54818] Updated weights for policy 0, policy_version 505108 (0.0026) [2024-04-27 23:07:29,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.5, 300 sec: 59204.5). Total num frames: 8275771392. Throughput: 0: 59571.4. Samples: 1180911920. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:07:29,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 23:07:30,369][54818] Updated weights for policy 0, policy_version 505118 (0.0026) [2024-04-27 23:07:33,043][54818] Updated weights for policy 0, policy_version 505128 (0.0024) [2024-04-27 23:07:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59528.8, 300 sec: 59204.5). Total num frames: 8276066304. Throughput: 0: 59507.4. Samples: 1181267840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:07:34,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 23:07:35,813][54818] Updated weights for policy 0, policy_version 505138 (0.0026) [2024-04-27 23:07:38,623][54818] Updated weights for policy 0, policy_version 505148 (0.0026) [2024-04-27 23:07:39,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8276344832. Throughput: 0: 59307.3. Samples: 1181614900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:07:39,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 23:07:41,292][54818] Updated weights for policy 0, policy_version 505158 (0.0028) [2024-04-27 23:07:44,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59528.4, 300 sec: 59204.6). Total num frames: 8276656128. Throughput: 0: 59028.9. Samples: 1181790180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:07:44,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 23:07:44,401][54818] Updated weights for policy 0, policy_version 505168 (0.0025) [2024-04-27 23:07:46,736][54818] Updated weights for policy 0, policy_version 505178 (0.0024) [2024-04-27 23:07:49,253][54587] Fps is (10 sec: 62259.0, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8276967424. Throughput: 0: 59217.9. Samples: 1182152960. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:07:49,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 23:07:49,338][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000505187_8276983808.pth... [2024-04-27 23:07:49,390][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000504318_8262746112.pth [2024-04-27 23:07:50,004][54818] Updated weights for policy 0, policy_version 505188 (0.0023) [2024-04-27 23:07:52,201][54818] Updated weights for policy 0, policy_version 505198 (0.0023) [2024-04-27 23:07:54,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.6, 300 sec: 59204.6). Total num frames: 8277245952. Throughput: 0: 59304.7. Samples: 1182506540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:07:54,254][54587] Avg episode reward: [(0, '0.682')] [2024-04-27 23:07:54,306][54798] Signal inference workers to stop experience collection... (18350 times) [2024-04-27 23:07:54,306][54798] Signal inference workers to resume experience collection... (18350 times) [2024-04-27 23:07:54,317][54818] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-04-27 23:07:54,334][54818] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-04-27 23:07:55,457][54818] Updated weights for policy 0, policy_version 505208 (0.0026) [2024-04-27 23:07:57,741][54818] Updated weights for policy 0, policy_version 505218 (0.0026) [2024-04-27 23:07:59,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8277540864. Throughput: 0: 59208.5. Samples: 1182683980. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:07:59,254][54587] Avg episode reward: [(0, '0.694')] [2024-04-27 23:08:00,976][54818] Updated weights for policy 0, policy_version 505228 (0.0026) [2024-04-27 23:08:03,549][54818] Updated weights for policy 0, policy_version 505238 (0.0026) [2024-04-27 23:08:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58709.3, 300 sec: 59260.1). Total num frames: 8277852160. Throughput: 0: 59067.9. Samples: 1183037340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:08:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 23:08:06,434][54818] Updated weights for policy 0, policy_version 505248 (0.0025) [2024-04-27 23:08:09,124][54818] Updated weights for policy 0, policy_version 505258 (0.0026) [2024-04-27 23:08:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8278147072. Throughput: 0: 59092.9. Samples: 1183390560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:08:09,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 23:08:11,986][54818] Updated weights for policy 0, policy_version 505268 (0.0026) [2024-04-27 23:08:14,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.3, 300 sec: 59204.6). Total num frames: 8278441984. Throughput: 0: 59229.3. Samples: 1183577240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:08:14,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 23:08:14,709][54818] Updated weights for policy 0, policy_version 505278 (0.0026) [2024-04-27 23:08:17,564][54818] Updated weights for policy 0, policy_version 505288 (0.0026) [2024-04-27 23:08:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8278753280. Throughput: 0: 59226.2. Samples: 1183933020. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:08:19,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 23:08:20,288][54818] Updated weights for policy 0, policy_version 505298 (0.0025) [2024-04-27 23:08:22,976][54818] Updated weights for policy 0, policy_version 505308 (0.0024) [2024-04-27 23:08:24,253][54587] Fps is (10 sec: 60621.7, 60 sec: 59528.6, 300 sec: 59315.6). Total num frames: 8279048192. Throughput: 0: 59303.2. Samples: 1184283540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:08:24,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 23:08:25,700][54818] Updated weights for policy 0, policy_version 505318 (0.0023) [2024-04-27 23:08:28,536][54818] Updated weights for policy 0, policy_version 505328 (0.0026) [2024-04-27 23:08:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8279326720. Throughput: 0: 59335.6. Samples: 1184460280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:08:29,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 23:08:31,054][54818] Updated weights for policy 0, policy_version 505338 (0.0023) [2024-04-27 23:08:33,943][54818] Updated weights for policy 0, policy_version 505348 (0.0023) [2024-04-27 23:08:34,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8279638016. Throughput: 0: 59398.2. Samples: 1184825880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:08:34,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 23:08:36,669][54798] Signal inference workers to stop experience collection... (18400 times) [2024-04-27 23:08:36,676][54798] Signal inference workers to resume experience collection... 
(18400 times) [2024-04-27 23:08:36,678][54818] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-04-27 23:08:36,681][54818] Updated weights for policy 0, policy_version 505358 (0.0024) [2024-04-27 23:08:36,698][54818] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-04-27 23:08:39,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59801.6, 300 sec: 59315.6). Total num frames: 8279932928. Throughput: 0: 59454.1. Samples: 1185181980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:08:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 23:08:39,355][54818] Updated weights for policy 0, policy_version 505368 (0.0026) [2024-04-27 23:08:42,194][54818] Updated weights for policy 0, policy_version 505378 (0.0025) [2024-04-27 23:08:44,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8280211456. Throughput: 0: 59417.2. Samples: 1185357760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:08:44,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 23:08:44,868][54818] Updated weights for policy 0, policy_version 505388 (0.0026) [2024-04-27 23:08:47,660][54818] Updated weights for policy 0, policy_version 505398 (0.0026) [2024-04-27 23:08:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8280506368. Throughput: 0: 59274.7. Samples: 1185704700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:08:49,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 23:08:49,265][54587] No heartbeat for components: RolloutWorker_w4 (6277 seconds) [2024-04-27 23:08:50,526][54818] Updated weights for policy 0, policy_version 505408 (0.0026) [2024-04-27 23:08:53,132][54818] Updated weights for policy 0, policy_version 505418 (0.0025) [2024-04-27 23:08:54,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8280801280. Throughput: 0: 59566.0. Samples: 1186071020. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:08:54,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:08:55,917][54818] Updated weights for policy 0, policy_version 505428 (0.0026) [2024-04-27 23:08:58,567][54818] Updated weights for policy 0, policy_version 505438 (0.0026) [2024-04-27 23:08:59,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59255.3, 300 sec: 59204.5). Total num frames: 8281096192. Throughput: 0: 59231.9. Samples: 1186242680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:08:59,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 23:09:01,368][54818] Updated weights for policy 0, policy_version 505448 (0.0024) [2024-04-27 23:09:04,253][54587] Fps is (10 sec: 60619.8, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8281407488. Throughput: 0: 59295.5. Samples: 1186601320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:04,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 23:09:04,627][54818] Updated weights for policy 0, policy_version 505458 (0.0026) [2024-04-27 23:09:06,839][54818] Updated weights for policy 0, policy_version 505468 (0.0025) [2024-04-27 23:09:09,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.4, 300 sec: 59315.6). Total num frames: 8281702400. Throughput: 0: 59360.7. Samples: 1186954780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:09,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 23:09:10,115][54818] Updated weights for policy 0, policy_version 505478 (0.0026) [2024-04-27 23:09:12,238][54818] Updated weights for policy 0, policy_version 505488 (0.0025) [2024-04-27 23:09:14,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8281997312. Throughput: 0: 59384.5. Samples: 1187132580. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:14,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 23:09:15,595][54818] Updated weights for policy 0, policy_version 505498 (0.0026) [2024-04-27 23:09:17,766][54818] Updated weights for policy 0, policy_version 505508 (0.0026) [2024-04-27 23:09:19,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59255.6, 300 sec: 59315.6). Total num frames: 8282308608. Throughput: 0: 59057.4. Samples: 1187483460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:19,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 23:09:21,131][54818] Updated weights for policy 0, policy_version 505518 (0.0026) [2024-04-27 23:09:23,559][54818] Updated weights for policy 0, policy_version 505528 (0.0026) [2024-04-27 23:09:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.4, 300 sec: 59315.6). Total num frames: 8282603520. Throughput: 0: 59110.7. Samples: 1187841960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:24,254][54587] Avg episode reward: [(0, '0.508')] [2024-04-27 23:09:26,551][54818] Updated weights for policy 0, policy_version 505538 (0.0026) [2024-04-27 23:09:29,163][54818] Updated weights for policy 0, policy_version 505548 (0.0026) [2024-04-27 23:09:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8282898432. Throughput: 0: 59228.1. Samples: 1188023020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:29,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 23:09:32,060][54818] Updated weights for policy 0, policy_version 505558 (0.0026) [2024-04-27 23:09:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59528.5, 300 sec: 59371.2). Total num frames: 8283209728. Throughput: 0: 59440.4. Samples: 1188379520. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:34,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 23:09:34,521][54818] Updated weights for policy 0, policy_version 505568 (0.0024) [2024-04-27 23:09:37,741][54818] Updated weights for policy 0, policy_version 505578 (0.0024) [2024-04-27 23:09:39,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59528.5, 300 sec: 59426.7). Total num frames: 8283504640. Throughput: 0: 59147.0. Samples: 1188732640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:39,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 23:09:40,329][54818] Updated weights for policy 0, policy_version 505588 (0.0024) [2024-04-27 23:09:43,143][54818] Updated weights for policy 0, policy_version 505598 (0.0026) [2024-04-27 23:09:44,254][54587] Fps is (10 sec: 58981.5, 60 sec: 59801.4, 300 sec: 59371.1). Total num frames: 8283799552. Throughput: 0: 59431.0. Samples: 1188917080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:44,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 23:09:45,654][54818] Updated weights for policy 0, policy_version 505608 (0.0025) [2024-04-27 23:09:47,449][54798] Signal inference workers to stop experience collection... (18450 times) [2024-04-27 23:09:47,480][54818] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-04-27 23:09:47,538][54798] Signal inference workers to resume experience collection... (18450 times) [2024-04-27 23:09:47,539][54818] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-04-27 23:09:48,567][54818] Updated weights for policy 0, policy_version 505618 (0.0024) [2024-04-27 23:09:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59528.5, 300 sec: 59260.2). Total num frames: 8284078080. Throughput: 0: 59340.1. Samples: 1189271620. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:49,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 23:09:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000505620_8284078080.pth... [2024-04-27 23:09:49,329][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000504753_8269873152.pth [2024-04-27 23:09:51,223][54818] Updated weights for policy 0, policy_version 505628 (0.0026) [2024-04-27 23:09:54,152][54818] Updated weights for policy 0, policy_version 505638 (0.0026) [2024-04-27 23:09:54,253][54587] Fps is (10 sec: 57344.7, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8284372992. Throughput: 0: 59357.3. Samples: 1189625860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:54,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 23:09:56,666][54818] Updated weights for policy 0, policy_version 505648 (0.0025) [2024-04-27 23:09:59,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8284651520. Throughput: 0: 59288.9. Samples: 1189800580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:09:59,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 23:09:59,621][54818] Updated weights for policy 0, policy_version 505658 (0.0026) [2024-04-27 23:10:02,146][54818] Updated weights for policy 0, policy_version 505668 (0.0025) [2024-04-27 23:10:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8284962816. Throughput: 0: 59442.1. Samples: 1190158360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:10:04,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 23:10:05,254][54818] Updated weights for policy 0, policy_version 505678 (0.0028) [2024-04-27 23:10:07,781][54818] Updated weights for policy 0, policy_version 505688 (0.0025) [2024-04-27 23:10:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.6, 300 sec: 59315.6). 
Total num frames: 8285257728. Throughput: 0: 59382.3. Samples: 1190514160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:10:09,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 23:10:10,661][54818] Updated weights for policy 0, policy_version 505698 (0.0026) [2024-04-27 23:10:13,324][54818] Updated weights for policy 0, policy_version 505708 (0.0026) [2024-04-27 23:10:14,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.3, 300 sec: 59315.6). Total num frames: 8285552640. Throughput: 0: 59185.6. Samples: 1190686380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:14,254][54587] Avg episode reward: [(0, '0.501')] [2024-04-27 23:10:16,298][54818] Updated weights for policy 0, policy_version 505718 (0.0026) [2024-04-27 23:10:18,825][54818] Updated weights for policy 0, policy_version 505728 (0.0026) [2024-04-27 23:10:19,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.5, 300 sec: 59426.7). Total num frames: 8285863936. Throughput: 0: 59210.8. Samples: 1191044000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:19,253][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 23:10:21,672][54818] Updated weights for policy 0, policy_version 505738 (0.0026) [2024-04-27 23:10:24,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59255.5, 300 sec: 59371.2). Total num frames: 8286158848. Throughput: 0: 59214.7. Samples: 1191397300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:24,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 23:10:24,430][54818] Updated weights for policy 0, policy_version 505748 (0.0026) [2024-04-27 23:10:27,292][54818] Updated weights for policy 0, policy_version 505758 (0.0023) [2024-04-27 23:10:29,253][54587] Fps is (10 sec: 58981.2, 60 sec: 59255.3, 300 sec: 59315.6). Total num frames: 8286453760. Throughput: 0: 59197.4. Samples: 1191580960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:29,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 23:10:30,101][54818] Updated weights for policy 0, policy_version 505768 (0.0024) [2024-04-27 23:10:32,721][54818] Updated weights for policy 0, policy_version 505778 (0.0024) [2024-04-27 23:10:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.5, 300 sec: 59371.2). Total num frames: 8286765056. Throughput: 0: 59128.5. Samples: 1191932400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:34,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 23:10:35,586][54818] Updated weights for policy 0, policy_version 505788 (0.0026) [2024-04-27 23:10:38,129][54818] Updated weights for policy 0, policy_version 505798 (0.0026) [2024-04-27 23:10:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.4, 300 sec: 59315.6). Total num frames: 8287059968. Throughput: 0: 59082.7. Samples: 1192284580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:39,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 23:10:41,386][54818] Updated weights for policy 0, policy_version 505808 (0.0026) [2024-04-27 23:10:43,555][54818] Updated weights for policy 0, policy_version 505818 (0.0026) [2024-04-27 23:10:44,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8287354880. Throughput: 0: 59259.0. Samples: 1192467240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:44,254][54587] Avg episode reward: [(0, '0.681')] [2024-04-27 23:10:46,795][54818] Updated weights for policy 0, policy_version 505828 (0.0026) [2024-04-27 23:10:48,560][54798] Signal inference workers to stop experience collection... (18500 times) [2024-04-27 23:10:48,594][54818] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-04-27 23:10:48,650][54798] Signal inference workers to resume experience collection... 
(18500 times) [2024-04-27 23:10:48,650][54818] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-04-27 23:10:49,178][54818] Updated weights for policy 0, policy_version 505838 (0.0025) [2024-04-27 23:10:49,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8287649792. Throughput: 0: 59345.1. Samples: 1192828880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:49,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 23:10:52,172][54818] Updated weights for policy 0, policy_version 505848 (0.0025) [2024-04-27 23:10:54,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8287928320. Throughput: 0: 59328.0. Samples: 1193183920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:54,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 23:10:54,767][54818] Updated weights for policy 0, policy_version 505858 (0.0025) [2024-04-27 23:10:57,606][54818] Updated weights for policy 0, policy_version 505868 (0.0027) [2024-04-27 23:10:59,253][54587] Fps is (10 sec: 55705.2, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8288206848. Throughput: 0: 59338.4. Samples: 1193356600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:10:59,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 23:11:00,284][54818] Updated weights for policy 0, policy_version 505878 (0.0029) [2024-04-27 23:11:03,354][54818] Updated weights for policy 0, policy_version 505888 (0.0026) [2024-04-27 23:11:04,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8288518144. Throughput: 0: 59378.4. Samples: 1193716040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:11:04,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 23:11:05,801][54818] Updated weights for policy 0, policy_version 505898 (0.0026) [2024-04-27 23:11:08,844][54818] Updated weights for policy 0, policy_version 505908 (0.0026) [2024-04-27 23:11:09,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8288796672. Throughput: 0: 59402.8. Samples: 1194070420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:11:09,254][54587] Avg episode reward: [(0, '0.480')] [2024-04-27 23:11:11,253][54818] Updated weights for policy 0, policy_version 505918 (0.0026) [2024-04-27 23:11:14,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8289107968. Throughput: 0: 59109.1. Samples: 1194240860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:11:14,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 23:11:14,639][54818] Updated weights for policy 0, policy_version 505928 (0.0026) [2024-04-27 23:11:16,687][54818] Updated weights for policy 0, policy_version 505938 (0.0025) [2024-04-27 23:11:19,253][54587] Fps is (10 sec: 62258.4, 60 sec: 59255.3, 300 sec: 59371.2). Total num frames: 8289419264. Throughput: 0: 59368.0. Samples: 1194603960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:11:19,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 23:11:20,163][54818] Updated weights for policy 0, policy_version 505948 (0.0026) [2024-04-27 23:11:22,326][54818] Updated weights for policy 0, policy_version 505958 (0.0025) [2024-04-27 23:11:24,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.4, 300 sec: 59260.1). Total num frames: 8289697792. Throughput: 0: 59364.1. Samples: 1194955960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:11:24,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:11:25,536][54818] Updated weights for policy 0, policy_version 505968 (0.0023) [2024-04-27 23:11:27,877][54818] Updated weights for policy 0, policy_version 505978 (0.0025) [2024-04-27 23:11:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.6, 300 sec: 59371.2). Total num frames: 8290009088. Throughput: 0: 59233.8. Samples: 1195132760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:11:29,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-27 23:11:30,943][54818] Updated weights for policy 0, policy_version 505988 (0.0027) [2024-04-27 23:11:33,351][54818] Updated weights for policy 0, policy_version 505998 (0.0026) [2024-04-27 23:11:34,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58982.3, 300 sec: 59315.6). Total num frames: 8290304000. Throughput: 0: 59014.4. Samples: 1195484540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:11:34,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 23:11:36,404][54818] Updated weights for policy 0, policy_version 506008 (0.0024) [2024-04-27 23:11:38,987][54818] Updated weights for policy 0, policy_version 506018 (0.0026) [2024-04-27 23:11:39,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.4, 300 sec: 59371.1). Total num frames: 8290598912. Throughput: 0: 59069.6. Samples: 1195842060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-27 23:11:39,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 23:11:41,322][54798] Signal inference workers to stop experience collection... (18550 times) [2024-04-27 23:11:41,359][54818] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-04-27 23:11:41,413][54798] Signal inference workers to resume experience collection... 
(18550 times) [2024-04-27 23:11:41,414][54818] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-04-27 23:11:42,055][54818] Updated weights for policy 0, policy_version 506028 (0.0024) [2024-04-27 23:11:44,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58982.5, 300 sec: 59315.6). Total num frames: 8290893824. Throughput: 0: 59423.6. Samples: 1196030660. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:11:44,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 23:11:44,684][54818] Updated weights for policy 0, policy_version 506038 (0.0025) [2024-04-27 23:11:47,455][54818] Updated weights for policy 0, policy_version 506048 (0.0026) [2024-04-27 23:11:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.2, 300 sec: 59315.6). Total num frames: 8291205120. Throughput: 0: 59240.8. Samples: 1196381880. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:11:49,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 23:11:49,274][54587] No heartbeat for components: RolloutWorker_w4 (6457 seconds) [2024-04-27 23:11:49,275][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000506055_8291205120.pth... [2024-04-27 23:11:49,328][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000505187_8276983808.pth [2024-04-27 23:11:50,164][54818] Updated weights for policy 0, policy_version 506058 (0.0026) [2024-04-27 23:11:53,005][54818] Updated weights for policy 0, policy_version 506068 (0.0024) [2024-04-27 23:11:54,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59528.6, 300 sec: 59315.7). Total num frames: 8291500032. Throughput: 0: 59073.4. Samples: 1196728720. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:11:54,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 23:11:55,721][54818] Updated weights for policy 0, policy_version 506078 (0.0024) [2024-04-27 23:11:58,408][54818] Updated weights for policy 0, policy_version 506088 (0.0025) [2024-04-27 23:11:59,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60074.6, 300 sec: 59260.1). Total num frames: 8291811328. Throughput: 0: 59580.4. Samples: 1196921980. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:11:59,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 23:12:01,166][54818] Updated weights for policy 0, policy_version 506098 (0.0026) [2024-04-27 23:12:03,822][54818] Updated weights for policy 0, policy_version 506108 (0.0024) [2024-04-27 23:12:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59801.7, 300 sec: 59315.7). Total num frames: 8292106240. Throughput: 0: 59421.0. Samples: 1197277900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 23:12:07,009][54818] Updated weights for policy 0, policy_version 506118 (0.0026) [2024-04-27 23:12:09,253][54587] Fps is (10 sec: 57343.3, 60 sec: 59801.4, 300 sec: 59260.1). Total num frames: 8292384768. Throughput: 0: 59412.2. Samples: 1197629520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:09,262][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 23:12:09,333][54818] Updated weights for policy 0, policy_version 506128 (0.0024) [2024-04-27 23:12:12,534][54818] Updated weights for policy 0, policy_version 506138 (0.0026) [2024-04-27 23:12:14,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8292679680. Throughput: 0: 59415.1. Samples: 1197806440. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:14,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 23:12:14,802][54818] Updated weights for policy 0, policy_version 506148 (0.0023) [2024-04-27 23:12:18,156][54818] Updated weights for policy 0, policy_version 506158 (0.0026) [2024-04-27 23:12:19,253][54587] Fps is (10 sec: 57345.1, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8292958208. Throughput: 0: 59582.0. Samples: 1198165720. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:19,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 23:12:20,224][54818] Updated weights for policy 0, policy_version 506168 (0.0025) [2024-04-27 23:12:23,157][54798] Signal inference workers to stop experience collection... (18600 times) [2024-04-27 23:12:23,195][54818] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-04-27 23:12:23,223][54798] Signal inference workers to resume experience collection... (18600 times) [2024-04-27 23:12:23,224][54818] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-04-27 23:12:23,786][54818] Updated weights for policy 0, policy_version 506178 (0.0026) [2024-04-27 23:12:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8293253120. Throughput: 0: 59544.5. Samples: 1198521560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:24,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 23:12:25,639][54818] Updated weights for policy 0, policy_version 506188 (0.0023) [2024-04-27 23:12:29,233][54818] Updated weights for policy 0, policy_version 506198 (0.0026) [2024-04-27 23:12:29,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8293548032. Throughput: 0: 58973.8. Samples: 1198684480. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:29,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 23:12:31,255][54818] Updated weights for policy 0, policy_version 506208 (0.0033) [2024-04-27 23:12:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.5, 300 sec: 59315.6). Total num frames: 8293842944. Throughput: 0: 59130.4. Samples: 1199042740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:34,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 23:12:34,925][54818] Updated weights for policy 0, policy_version 506218 (0.0026) [2024-04-27 23:12:36,786][54818] Updated weights for policy 0, policy_version 506228 (0.0023) [2024-04-27 23:12:39,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.4, 300 sec: 59204.6). Total num frames: 8294121472. Throughput: 0: 59327.4. Samples: 1199398460. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:39,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 23:12:40,294][54818] Updated weights for policy 0, policy_version 506238 (0.0025) [2024-04-27 23:12:42,377][54818] Updated weights for policy 0, policy_version 506248 (0.0026) [2024-04-27 23:12:44,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8294432768. Throughput: 0: 58687.9. Samples: 1199562940. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:44,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 23:12:45,850][54818] Updated weights for policy 0, policy_version 506258 (0.0026) [2024-04-27 23:12:48,474][54818] Updated weights for policy 0, policy_version 506268 (0.0026) [2024-04-27 23:12:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58709.5, 300 sec: 59260.1). Total num frames: 8294727680. Throughput: 0: 58675.0. Samples: 1199918280. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:49,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 23:12:51,197][54818] Updated weights for policy 0, policy_version 506278 (0.0029) [2024-04-27 23:12:54,062][54818] Updated weights for policy 0, policy_version 506288 (0.0025) [2024-04-27 23:12:54,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.2, 300 sec: 59260.1). Total num frames: 8295022592. Throughput: 0: 58902.3. Samples: 1200280120. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:54,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 23:12:56,786][54818] Updated weights for policy 0, policy_version 506298 (0.0026) [2024-04-27 23:12:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58709.2, 300 sec: 59260.1). Total num frames: 8295333888. Throughput: 0: 58979.5. Samples: 1200460520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:12:59,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 23:12:59,446][54818] Updated weights for policy 0, policy_version 506308 (0.0029) [2024-04-27 23:13:02,407][54818] Updated weights for policy 0, policy_version 506318 (0.0026) [2024-04-27 23:13:04,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58709.3, 300 sec: 59260.1). Total num frames: 8295628800. Throughput: 0: 58853.7. Samples: 1200814140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:04,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 23:13:04,793][54818] Updated weights for policy 0, policy_version 506328 (0.0024) [2024-04-27 23:13:07,797][54818] Updated weights for policy 0, policy_version 506338 (0.0027) [2024-04-27 23:13:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 59528.6, 300 sec: 59371.2). Total num frames: 8295956480. Throughput: 0: 58622.2. Samples: 1201159560. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:09,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 23:13:10,646][54818] Updated weights for policy 0, policy_version 506348 (0.0026) [2024-04-27 23:13:12,460][54798] Signal inference workers to stop experience collection... (18650 times) [2024-04-27 23:13:12,498][54818] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-04-27 23:13:12,552][54798] Signal inference workers to resume experience collection... (18650 times) [2024-04-27 23:13:12,552][54818] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-04-27 23:13:13,181][54818] Updated weights for policy 0, policy_version 506358 (0.0025) [2024-04-27 23:13:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 59528.7, 300 sec: 59315.7). Total num frames: 8296251392. Throughput: 0: 59331.2. Samples: 1201354380. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:14,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 23:13:16,486][54818] Updated weights for policy 0, policy_version 506368 (0.0027) [2024-04-27 23:13:18,743][54818] Updated weights for policy 0, policy_version 506378 (0.0026) [2024-04-27 23:13:19,253][54587] Fps is (10 sec: 57344.6, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8296529920. Throughput: 0: 59421.4. Samples: 1201716700. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:19,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 23:13:21,920][54818] Updated weights for policy 0, policy_version 506388 (0.0026) [2024-04-27 23:13:24,215][54818] Updated weights for policy 0, policy_version 506398 (0.0027) [2024-04-27 23:13:24,253][54587] Fps is (10 sec: 57343.2, 60 sec: 59528.6, 300 sec: 59315.6). Total num frames: 8296824832. Throughput: 0: 59416.0. Samples: 1202072180. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:24,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 23:13:27,368][54818] Updated weights for policy 0, policy_version 506408 (0.0026) [2024-04-27 23:13:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8297103360. Throughput: 0: 59466.5. Samples: 1202238920. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:29,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 23:13:29,691][54818] Updated weights for policy 0, policy_version 506418 (0.0024) [2024-04-27 23:13:32,929][54818] Updated weights for policy 0, policy_version 506428 (0.0025) [2024-04-27 23:13:34,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8297381888. Throughput: 0: 59573.3. Samples: 1202599080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:34,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 23:13:35,175][54818] Updated weights for policy 0, policy_version 506438 (0.0024) [2024-04-27 23:13:38,477][54818] Updated weights for policy 0, policy_version 506448 (0.0025) [2024-04-27 23:13:39,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8297676800. Throughput: 0: 59496.5. Samples: 1202957460. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:39,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-27 23:13:40,505][54818] Updated weights for policy 0, policy_version 506458 (0.0026) [2024-04-27 23:13:44,182][54818] Updated weights for policy 0, policy_version 506468 (0.0026) [2024-04-27 23:13:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.4, 300 sec: 59204.5). Total num frames: 8297971712. Throughput: 0: 59180.0. Samples: 1203123620. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:44,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:13:45,974][54818] Updated weights for policy 0, policy_version 506478 (0.0026) [2024-04-27 23:13:48,506][54798] Signal inference workers to stop experience collection... (18700 times) [2024-04-27 23:13:48,535][54818] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-04-27 23:13:48,556][54798] Signal inference workers to resume experience collection... (18700 times) [2024-04-27 23:13:48,557][54818] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-04-27 23:13:49,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59204.5). Total num frames: 8298266624. Throughput: 0: 59200.8. Samples: 1203478180. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:49,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 23:13:49,287][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000506487_8298283008.pth... [2024-04-27 23:13:49,339][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000505620_8284078080.pth [2024-04-27 23:13:49,773][54818] Updated weights for policy 0, policy_version 506488 (0.0026) [2024-04-27 23:13:51,445][54818] Updated weights for policy 0, policy_version 506498 (0.0025) [2024-04-27 23:13:54,253][54587] Fps is (10 sec: 58983.4, 60 sec: 58982.6, 300 sec: 59204.6). Total num frames: 8298561536. Throughput: 0: 59602.5. Samples: 1203841660. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:54,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 23:13:55,189][54818] Updated weights for policy 0, policy_version 506508 (0.0026) [2024-04-27 23:13:56,968][54818] Updated weights for policy 0, policy_version 506518 (0.0024) [2024-04-27 23:13:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8298872832. Throughput: 0: 59003.4. Samples: 1204009540. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:13:59,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 23:14:00,716][54818] Updated weights for policy 0, policy_version 506528 (0.0023) [2024-04-27 23:14:02,714][54818] Updated weights for policy 0, policy_version 506538 (0.0026) [2024-04-27 23:14:04,253][54587] Fps is (10 sec: 62257.9, 60 sec: 59255.3, 300 sec: 59260.1). Total num frames: 8299184128. Throughput: 0: 58825.2. Samples: 1204363840. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:14:04,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 23:14:06,154][54818] Updated weights for policy 0, policy_version 506548 (0.0026) [2024-04-27 23:14:08,417][54818] Updated weights for policy 0, policy_version 506558 (0.0024) [2024-04-27 23:14:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58709.5, 300 sec: 59260.1). Total num frames: 8299479040. Throughput: 0: 58967.7. Samples: 1204725720. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:14:09,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 23:14:11,528][54818] Updated weights for policy 0, policy_version 506568 (0.0024) [2024-04-27 23:14:14,026][54818] Updated weights for policy 0, policy_version 506578 (0.0026) [2024-04-27 23:14:14,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58709.3, 300 sec: 59204.6). Total num frames: 8299773952. Throughput: 0: 59300.0. Samples: 1204907420. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:14:14,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 23:14:17,095][54818] Updated weights for policy 0, policy_version 506588 (0.0026) [2024-04-27 23:14:19,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8300068864. Throughput: 0: 59126.0. Samples: 1205259740. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:14:19,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 23:14:19,544][54818] Updated weights for policy 0, policy_version 506598 (0.0028) [2024-04-27 23:14:22,672][54818] Updated weights for policy 0, policy_version 506608 (0.0026) [2024-04-27 23:14:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8300380160. Throughput: 0: 58824.4. Samples: 1205604560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:14:24,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 23:14:25,045][54818] Updated weights for policy 0, policy_version 506618 (0.0022) [2024-04-27 23:14:28,045][54818] Updated weights for policy 0, policy_version 506628 (0.0026) [2024-04-27 23:14:29,253][54587] Fps is (10 sec: 62258.7, 60 sec: 59801.6, 300 sec: 59260.1). Total num frames: 8300691456. Throughput: 0: 59366.8. Samples: 1205795120. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:14:29,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 23:14:30,589][54818] Updated weights for policy 0, policy_version 506638 (0.0026) [2024-04-27 23:14:33,491][54818] Updated weights for policy 0, policy_version 506648 (0.0024) [2024-04-27 23:14:34,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59801.5, 300 sec: 59204.5). Total num frames: 8300969984. Throughput: 0: 59378.5. Samples: 1206150220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-04-27 23:14:34,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 23:14:36,249][54818] Updated weights for policy 0, policy_version 506658 (0.0026) [2024-04-27 23:14:38,963][54798] Signal inference workers to stop experience collection... (18750 times) [2024-04-27 23:14:38,964][54798] Signal inference workers to resume experience collection... 
(18750 times) [2024-04-27 23:14:38,978][54818] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-04-27 23:14:38,979][54818] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-04-27 23:14:39,104][54818] Updated weights for policy 0, policy_version 506668 (0.0030) [2024-04-27 23:14:39,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59801.6, 300 sec: 59204.6). Total num frames: 8301264896. Throughput: 0: 59243.0. Samples: 1206507600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:14:39,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 23:14:41,722][54818] Updated weights for policy 0, policy_version 506678 (0.0023) [2024-04-27 23:14:44,253][54587] Fps is (10 sec: 57345.2, 60 sec: 59528.7, 300 sec: 59204.6). Total num frames: 8301543424. Throughput: 0: 59270.8. Samples: 1206676720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:14:44,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 23:14:44,565][54818] Updated weights for policy 0, policy_version 506688 (0.0028) [2024-04-27 23:14:47,491][54818] Updated weights for policy 0, policy_version 506698 (0.0027) [2024-04-27 23:14:49,253][54587] Fps is (10 sec: 55705.6, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8301821952. Throughput: 0: 59419.3. Samples: 1207037700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:14:49,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 23:14:49,261][54587] No heartbeat for components: RolloutWorker_w4 (6637 seconds) [2024-04-27 23:14:50,023][54818] Updated weights for policy 0, policy_version 506708 (0.0026) [2024-04-27 23:14:53,059][54818] Updated weights for policy 0, policy_version 506718 (0.0026) [2024-04-27 23:14:54,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8302133248. Throughput: 0: 59213.6. Samples: 1207390340. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:14:54,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 23:14:55,535][54818] Updated weights for policy 0, policy_version 506728 (0.0025) [2024-04-27 23:14:58,601][54818] Updated weights for policy 0, policy_version 506738 (0.0026) [2024-04-27 23:14:59,253][54587] Fps is (10 sec: 60619.9, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8302428160. Throughput: 0: 58993.1. Samples: 1207562120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:14:59,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 23:15:00,932][54818] Updated weights for policy 0, policy_version 506748 (0.0025) [2024-04-27 23:15:04,154][54818] Updated weights for policy 0, policy_version 506758 (0.0026) [2024-04-27 23:15:04,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8302723072. Throughput: 0: 59073.6. Samples: 1207918060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:04,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 23:15:06,414][54818] Updated weights for policy 0, policy_version 506768 (0.0026) [2024-04-27 23:15:09,253][54587] Fps is (10 sec: 58983.6, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8303017984. Throughput: 0: 59486.4. Samples: 1208281440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:09,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 23:15:09,624][54818] Updated weights for policy 0, policy_version 506778 (0.0024) [2024-04-27 23:15:11,993][54818] Updated weights for policy 0, policy_version 506788 (0.0025) [2024-04-27 23:15:14,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8303312896. Throughput: 0: 58990.3. Samples: 1208449680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:14,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:15:15,188][54818] Updated weights for policy 0, policy_version 506798 (0.0025) [2024-04-27 23:15:17,630][54818] Updated weights for policy 0, policy_version 506808 (0.0026) [2024-04-27 23:15:19,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59255.3, 300 sec: 59204.5). Total num frames: 8303624192. Throughput: 0: 58975.2. Samples: 1208804100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:19,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 23:15:20,589][54818] Updated weights for policy 0, policy_version 506818 (0.0026) [2024-04-27 23:15:23,242][54818] Updated weights for policy 0, policy_version 506828 (0.0026) [2024-04-27 23:15:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8303919104. Throughput: 0: 58943.2. Samples: 1209160040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:24,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 23:15:26,157][54818] Updated weights for policy 0, policy_version 506838 (0.0023) [2024-04-27 23:15:28,838][54818] Updated weights for policy 0, policy_version 506848 (0.0026) [2024-04-27 23:15:29,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.2, 300 sec: 59149.0). Total num frames: 8304214016. Throughput: 0: 59194.5. Samples: 1209340480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:29,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 23:15:31,750][54818] Updated weights for policy 0, policy_version 506858 (0.0026) [2024-04-27 23:15:33,302][54798] Signal inference workers to stop experience collection... (18800 times) [2024-04-27 23:15:33,305][54798] Signal inference workers to resume experience collection... 
(18800 times) [2024-04-27 23:15:33,316][54818] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-04-27 23:15:33,317][54818] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-04-27 23:15:34,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.6, 300 sec: 59149.0). Total num frames: 8304508928. Throughput: 0: 59077.4. Samples: 1209696180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:34,253][54587] Avg episode reward: [(0, '0.685')] [2024-04-27 23:15:34,306][54818] Updated weights for policy 0, policy_version 506868 (0.0026) [2024-04-27 23:15:37,288][54818] Updated weights for policy 0, policy_version 506878 (0.0023) [2024-04-27 23:15:39,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8304803840. Throughput: 0: 59108.5. Samples: 1210050220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:39,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 23:15:39,868][54818] Updated weights for policy 0, policy_version 506888 (0.0026) [2024-04-27 23:15:42,762][54818] Updated weights for policy 0, policy_version 506898 (0.0027) [2024-04-27 23:15:44,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8305098752. Throughput: 0: 59233.4. Samples: 1210227620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:44,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 23:15:45,462][54818] Updated weights for policy 0, policy_version 506908 (0.0025) [2024-04-27 23:15:48,263][54818] Updated weights for policy 0, policy_version 506918 (0.0026) [2024-04-27 23:15:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59801.5, 300 sec: 59260.1). Total num frames: 8305410048. Throughput: 0: 59157.2. Samples: 1210580140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:49,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:15:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000506922_8305410048.pth... [2024-04-27 23:15:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000506055_8291205120.pth [2024-04-27 23:15:51,214][54818] Updated weights for policy 0, policy_version 506928 (0.0027) [2024-04-27 23:15:53,905][54818] Updated weights for policy 0, policy_version 506938 (0.0025) [2024-04-27 23:15:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8305704960. Throughput: 0: 58973.6. Samples: 1210935260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:54,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 23:15:56,746][54818] Updated weights for policy 0, policy_version 506948 (0.0026) [2024-04-27 23:15:59,253][54587] Fps is (10 sec: 57344.6, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8305983488. Throughput: 0: 59264.7. Samples: 1211116600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-04-27 23:15:59,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 23:15:59,400][54818] Updated weights for policy 0, policy_version 506958 (0.0029) [2024-04-27 23:16:02,447][54818] Updated weights for policy 0, policy_version 506968 (0.0024) [2024-04-27 23:16:04,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8306278400. Throughput: 0: 59332.5. Samples: 1211474060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:04,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 23:16:04,850][54818] Updated weights for policy 0, policy_version 506978 (0.0026) [2024-04-27 23:16:07,967][54818] Updated weights for policy 0, policy_version 506988 (0.0026) [2024-04-27 23:16:09,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 59204.6). 
Total num frames: 8306573312. Throughput: 0: 59159.4. Samples: 1211822220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:09,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 23:16:10,435][54818] Updated weights for policy 0, policy_version 506998 (0.0026) [2024-04-27 23:16:13,311][54818] Updated weights for policy 0, policy_version 507008 (0.0026) [2024-04-27 23:16:14,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8306868224. Throughput: 0: 59023.3. Samples: 1211996520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:14,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:16:15,816][54818] Updated weights for policy 0, policy_version 507018 (0.0024) [2024-04-27 23:16:18,954][54818] Updated weights for policy 0, policy_version 507028 (0.0026) [2024-04-27 23:16:19,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8307163136. Throughput: 0: 59157.2. Samples: 1212358260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:19,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-27 23:16:21,670][54818] Updated weights for policy 0, policy_version 507038 (0.0026) [2024-04-27 23:16:24,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8307441664. Throughput: 0: 59139.7. Samples: 1212711500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:24,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 23:16:24,493][54818] Updated weights for policy 0, policy_version 507048 (0.0025) [2024-04-27 23:16:27,153][54818] Updated weights for policy 0, policy_version 507058 (0.0026) [2024-04-27 23:16:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8307736576. Throughput: 0: 58979.2. Samples: 1212881680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:29,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-27 23:16:29,912][54818] Updated weights for policy 0, policy_version 507068 (0.0026) [2024-04-27 23:16:32,710][54798] Signal inference workers to stop experience collection... (18850 times) [2024-04-27 23:16:32,756][54818] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-04-27 23:16:32,765][54798] Signal inference workers to resume experience collection... (18850 times) [2024-04-27 23:16:32,771][54818] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-04-27 23:16:32,774][54818] Updated weights for policy 0, policy_version 507078 (0.0024) [2024-04-27 23:16:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.4, 300 sec: 59149.1). Total num frames: 8308047872. Throughput: 0: 59306.0. Samples: 1213248900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:34,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 23:16:35,257][54818] Updated weights for policy 0, policy_version 507088 (0.0026) [2024-04-27 23:16:38,373][54818] Updated weights for policy 0, policy_version 507098 (0.0026) [2024-04-27 23:16:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8308342784. Throughput: 0: 59143.7. Samples: 1213596720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:39,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 23:16:40,750][54818] Updated weights for policy 0, policy_version 507108 (0.0026) [2024-04-27 23:16:43,799][54818] Updated weights for policy 0, policy_version 507118 (0.0026) [2024-04-27 23:16:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8308637696. Throughput: 0: 59043.2. Samples: 1213773540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:44,253][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 23:16:46,461][54818] Updated weights for policy 0, policy_version 507128 (0.0026) [2024-04-27 23:16:49,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8308932608. Throughput: 0: 59008.0. Samples: 1214129420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:49,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 23:16:49,300][54818] Updated weights for policy 0, policy_version 507138 (0.0026) [2024-04-27 23:16:51,921][54818] Updated weights for policy 0, policy_version 507148 (0.0025) [2024-04-27 23:16:54,253][54587] Fps is (10 sec: 58981.4, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8309227520. Throughput: 0: 59361.7. Samples: 1214493500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:54,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 23:16:54,875][54818] Updated weights for policy 0, policy_version 507158 (0.0032) [2024-04-27 23:16:57,511][54818] Updated weights for policy 0, policy_version 507168 (0.0026) [2024-04-27 23:16:59,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8309522432. Throughput: 0: 59288.7. Samples: 1214664520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:16:59,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 23:17:00,370][54818] Updated weights for policy 0, policy_version 507178 (0.0026) [2024-04-27 23:17:02,911][54818] Updated weights for policy 0, policy_version 507188 (0.0025) [2024-04-27 23:17:04,253][54587] Fps is (10 sec: 60621.7, 60 sec: 59255.5, 300 sec: 59149.1). Total num frames: 8309833728. Throughput: 0: 59189.0. Samples: 1215021760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:17:04,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 23:17:05,975][54818] Updated weights for policy 0, policy_version 507198 (0.0026) [2024-04-27 23:17:08,466][54818] Updated weights for policy 0, policy_version 507208 (0.0023) [2024-04-27 23:17:09,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8310145024. Throughput: 0: 59153.2. Samples: 1215373400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:17:09,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 23:17:11,568][54818] Updated weights for policy 0, policy_version 507218 (0.0026) [2024-04-27 23:17:13,911][54818] Updated weights for policy 0, policy_version 507228 (0.0026) [2024-04-27 23:17:14,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8310423552. Throughput: 0: 59426.8. Samples: 1215555880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:17:14,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 23:17:17,153][54818] Updated weights for policy 0, policy_version 507238 (0.0026) [2024-04-27 23:17:19,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8310734848. Throughput: 0: 59140.2. Samples: 1215910220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:17:19,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 23:17:19,419][54818] Updated weights for policy 0, policy_version 507248 (0.0026) [2024-04-27 23:17:22,791][54818] Updated weights for policy 0, policy_version 507258 (0.0026) [2024-04-27 23:17:24,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8311013376. Throughput: 0: 59219.1. Samples: 1216261580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:17:24,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 23:17:25,012][54818] Updated weights for policy 0, policy_version 507268 (0.0026) [2024-04-27 23:17:28,334][54818] Updated weights for policy 0, policy_version 507278 (0.0026) [2024-04-27 23:17:28,439][54798] Signal inference workers to stop experience collection... (18900 times) [2024-04-27 23:17:28,467][54818] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-04-27 23:17:28,496][54798] Signal inference workers to resume experience collection... (18900 times) [2024-04-27 23:17:28,505][54818] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-04-27 23:17:29,253][54587] Fps is (10 sec: 57344.9, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8311308288. Throughput: 0: 59133.3. Samples: 1216434540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-27 23:17:29,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 23:17:30,543][54818] Updated weights for policy 0, policy_version 507288 (0.0025) [2024-04-27 23:17:33,763][54818] Updated weights for policy 0, policy_version 507298 (0.0026) [2024-04-27 23:17:34,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8311603200. Throughput: 0: 59300.1. Samples: 1216797920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:17:34,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 23:17:36,053][54818] Updated weights for policy 0, policy_version 507308 (0.0027) [2024-04-27 23:17:39,247][54818] Updated weights for policy 0, policy_version 507318 (0.0026) [2024-04-27 23:17:39,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8311898112. Throughput: 0: 59097.5. Samples: 1217152880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:17:39,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 23:17:41,669][54818] Updated weights for policy 0, policy_version 507328 (0.0026) [2024-04-27 23:17:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8312176640. Throughput: 0: 58921.8. Samples: 1217316000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:17:44,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 23:17:44,832][54818] Updated weights for policy 0, policy_version 507338 (0.0024) [2024-04-27 23:17:47,487][54818] Updated weights for policy 0, policy_version 507348 (0.0026) [2024-04-27 23:17:49,253][54587] Fps is (10 sec: 57343.2, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8312471552. Throughput: 0: 58966.1. Samples: 1217675240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:17:49,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 23:17:49,264][54587] No heartbeat for components: RolloutWorker_w4 (6817 seconds) [2024-04-27 23:17:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000507353_8312471552.pth... [2024-04-27 23:17:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000506487_8298283008.pth [2024-04-27 23:17:50,407][54818] Updated weights for policy 0, policy_version 507358 (0.0026) [2024-04-27 23:17:53,037][54818] Updated weights for policy 0, policy_version 507368 (0.0026) [2024-04-27 23:17:54,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8312766464. Throughput: 0: 58971.1. Samples: 1218027100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:17:54,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 23:17:55,814][54818] Updated weights for policy 0, policy_version 507378 (0.0026) [2024-04-27 23:17:58,613][54818] Updated weights for policy 0, policy_version 507388 (0.0026) [2024-04-27 23:17:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8313077760. Throughput: 0: 58847.9. Samples: 1218204040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:17:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 23:18:01,443][54818] Updated weights for policy 0, policy_version 507398 (0.0025) [2024-04-27 23:18:03,940][54818] Updated weights for policy 0, policy_version 507408 (0.0028) [2024-04-27 23:18:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.3, 300 sec: 59038.0). Total num frames: 8313372672. Throughput: 0: 58821.9. Samples: 1218557200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:04,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 23:18:06,999][54818] Updated weights for policy 0, policy_version 507418 (0.0026) [2024-04-27 23:18:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8313683968. Throughput: 0: 58881.0. Samples: 1218911220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:09,262][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 23:18:09,603][54818] Updated weights for policy 0, policy_version 507428 (0.0026) [2024-04-27 23:18:12,511][54818] Updated weights for policy 0, policy_version 507438 (0.0026) [2024-04-27 23:18:14,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8313978880. Throughput: 0: 59163.8. Samples: 1219096920. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:14,262][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 23:18:15,137][54818] Updated weights for policy 0, policy_version 507448 (0.0026) [2024-04-27 23:18:17,955][54818] Updated weights for policy 0, policy_version 507458 (0.0026) [2024-04-27 23:18:19,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8314273792. Throughput: 0: 59021.3. Samples: 1219453880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:19,262][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 23:18:20,780][54818] Updated weights for policy 0, policy_version 507468 (0.0026) [2024-04-27 23:18:23,576][54818] Updated weights for policy 0, policy_version 507478 (0.0026) [2024-04-27 23:18:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8314585088. Throughput: 0: 58985.2. Samples: 1219807220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:24,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 23:18:26,359][54818] Updated weights for policy 0, policy_version 507488 (0.0022) [2024-04-27 23:18:29,144][54818] Updated weights for policy 0, policy_version 507498 (0.0026) [2024-04-27 23:18:29,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8314847232. Throughput: 0: 59404.6. Samples: 1219989200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:29,253][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 23:18:31,868][54818] Updated weights for policy 0, policy_version 507508 (0.0025) [2024-04-27 23:18:34,253][54587] Fps is (10 sec: 57345.0, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8315158528. Throughput: 0: 59184.7. Samples: 1220338540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:34,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 23:18:34,418][54818] Updated weights for policy 0, policy_version 507518 (0.0026) [2024-04-27 23:18:37,468][54818] Updated weights for policy 0, policy_version 507528 (0.0025) [2024-04-27 23:18:39,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8315437056. Throughput: 0: 59360.5. Samples: 1220698320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:39,253][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 23:18:39,969][54818] Updated weights for policy 0, policy_version 507538 (0.0026) [2024-04-27 23:18:40,532][54798] Signal inference workers to stop experience collection... (18950 times) [2024-04-27 23:18:40,533][54798] Signal inference workers to resume experience collection... (18950 times) [2024-04-27 23:18:40,549][54818] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-04-27 23:18:40,550][54818] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-04-27 23:18:42,940][54818] Updated weights for policy 0, policy_version 507548 (0.0024) [2024-04-27 23:18:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59528.7, 300 sec: 59260.1). Total num frames: 8315748352. Throughput: 0: 59368.6. Samples: 1220875620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:44,254][54587] Avg episode reward: [(0, '0.469')] [2024-04-27 23:18:45,574][54818] Updated weights for policy 0, policy_version 507558 (0.0026) [2024-04-27 23:18:48,383][54818] Updated weights for policy 0, policy_version 507568 (0.0026) [2024-04-27 23:18:49,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.6, 300 sec: 59204.5). Total num frames: 8316026880. Throughput: 0: 59292.9. Samples: 1221225380. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:49,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 23:18:50,919][54818] Updated weights for policy 0, policy_version 507578 (0.0026) [2024-04-27 23:18:53,885][54818] Updated weights for policy 0, policy_version 507588 (0.0026) [2024-04-27 23:18:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8316321792. Throughput: 0: 59343.9. Samples: 1221581700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-27 23:18:54,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 23:18:56,380][54818] Updated weights for policy 0, policy_version 507598 (0.0026) [2024-04-27 23:18:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.5, 300 sec: 59149.1). Total num frames: 8316633088. Throughput: 0: 59147.3. Samples: 1221758540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:18:59,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 23:18:59,291][54818] Updated weights for policy 0, policy_version 507608 (0.0025) [2024-04-27 23:19:01,943][54818] Updated weights for policy 0, policy_version 507618 (0.0026) [2024-04-27 23:19:04,253][54587] Fps is (10 sec: 60619.7, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8316928000. Throughput: 0: 59236.7. Samples: 1222119540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:04,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 23:19:04,821][54818] Updated weights for policy 0, policy_version 507628 (0.0024) [2024-04-27 23:19:07,626][54818] Updated weights for policy 0, policy_version 507638 (0.0022) [2024-04-27 23:19:09,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8317222912. Throughput: 0: 59202.2. Samples: 1222471320. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:09,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-27 23:19:10,399][54818] Updated weights for policy 0, policy_version 507648 (0.0022) [2024-04-27 23:19:13,352][54818] Updated weights for policy 0, policy_version 507658 (0.0026) [2024-04-27 23:19:14,253][54587] Fps is (10 sec: 58983.5, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8317517824. Throughput: 0: 59112.0. Samples: 1222649240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:14,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-27 23:19:15,874][54818] Updated weights for policy 0, policy_version 507668 (0.0025) [2024-04-27 23:19:18,840][54818] Updated weights for policy 0, policy_version 507678 (0.0026) [2024-04-27 23:19:19,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8317812736. Throughput: 0: 59222.2. Samples: 1223003540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:19,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 23:19:21,460][54818] Updated weights for policy 0, policy_version 507688 (0.0025) [2024-04-27 23:19:24,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8318107648. Throughput: 0: 59100.7. Samples: 1223357860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:24,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 23:19:24,313][54818] Updated weights for policy 0, policy_version 507698 (0.0026) [2024-04-27 23:19:26,826][54818] Updated weights for policy 0, policy_version 507708 (0.0026) [2024-04-27 23:19:29,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8318402560. Throughput: 0: 59176.9. Samples: 1223538580. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:29,253][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 23:19:29,963][54818] Updated weights for policy 0, policy_version 507718 (0.0025) [2024-04-27 23:19:32,682][54818] Updated weights for policy 0, policy_version 507728 (0.0026) [2024-04-27 23:19:34,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8318713856. Throughput: 0: 59386.5. Samples: 1223897780. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:34,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 23:19:35,429][54818] Updated weights for policy 0, policy_version 507738 (0.0023) [2024-04-27 23:19:38,154][54818] Updated weights for policy 0, policy_version 507748 (0.0027) [2024-04-27 23:19:38,677][54798] Signal inference workers to stop experience collection... (19000 times) [2024-04-27 23:19:38,703][54818] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-04-27 23:19:38,765][54798] Signal inference workers to resume experience collection... (19000 times) [2024-04-27 23:19:38,765][54818] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-04-27 23:19:39,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59801.6, 300 sec: 59260.1). Total num frames: 8319025152. Throughput: 0: 59371.6. Samples: 1224253420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:39,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 23:19:41,212][54818] Updated weights for policy 0, policy_version 507758 (0.0024) [2024-04-27 23:19:43,719][54818] Updated weights for policy 0, policy_version 507768 (0.0026) [2024-04-27 23:19:44,253][54587] Fps is (10 sec: 57345.1, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8319287296. Throughput: 0: 59358.7. Samples: 1224429680. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:44,253][54587] Avg episode reward: [(0, '0.521')] [2024-04-27 23:19:46,795][54818] Updated weights for policy 0, policy_version 507778 (0.0027) [2024-04-27 23:19:49,064][54818] Updated weights for policy 0, policy_version 507788 (0.0026) [2024-04-27 23:19:49,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8319598592. Throughput: 0: 59138.5. Samples: 1224780760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:49,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 23:19:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000507788_8319598592.pth... [2024-04-27 23:19:49,322][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000506922_8305410048.pth [2024-04-27 23:19:52,312][54818] Updated weights for policy 0, policy_version 507798 (0.0026) [2024-04-27 23:19:54,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.5, 300 sec: 59149.1). Total num frames: 8319877120. Throughput: 0: 59228.1. Samples: 1225136580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:54,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 23:19:54,738][54818] Updated weights for policy 0, policy_version 507808 (0.0027) [2024-04-27 23:19:57,692][54818] Updated weights for policy 0, policy_version 507818 (0.0022) [2024-04-27 23:19:59,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8320172032. Throughput: 0: 59255.9. Samples: 1225315760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:19:59,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 23:20:00,281][54818] Updated weights for policy 0, policy_version 507828 (0.0025) [2024-04-27 23:20:03,035][54818] Updated weights for policy 0, policy_version 507838 (0.0024) [2024-04-27 23:20:04,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59255.6, 300 sec: 59204.5). 
Total num frames: 8320483328. Throughput: 0: 59407.4. Samples: 1225676880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:20:04,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 23:20:05,750][54818] Updated weights for policy 0, policy_version 507848 (0.0026) [2024-04-27 23:20:08,577][54818] Updated weights for policy 0, policy_version 507858 (0.0023) [2024-04-27 23:20:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.5, 300 sec: 59204.5). Total num frames: 8320778240. Throughput: 0: 59207.7. Samples: 1226022200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:20:09,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 23:20:11,205][54818] Updated weights for policy 0, policy_version 507868 (0.0025) [2024-04-27 23:20:13,975][54818] Updated weights for policy 0, policy_version 507878 (0.0025) [2024-04-27 23:20:14,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8321073152. Throughput: 0: 59098.5. Samples: 1226198020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:20:14,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 23:20:16,681][54818] Updated weights for policy 0, policy_version 507888 (0.0026) [2024-04-27 23:20:19,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8321368064. Throughput: 0: 59073.5. Samples: 1226556080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:20:19,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 23:20:19,617][54818] Updated weights for policy 0, policy_version 507898 (0.0024) [2024-04-27 23:20:22,346][54818] Updated weights for policy 0, policy_version 507908 (0.0025) [2024-04-27 23:20:24,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8321662976. Throughput: 0: 59282.4. Samples: 1226921140. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-27 23:20:24,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 23:20:25,080][54818] Updated weights for policy 0, policy_version 507918 (0.0024) [2024-04-27 23:20:27,855][54818] Updated weights for policy 0, policy_version 507928 (0.0025) [2024-04-27 23:20:29,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8321941504. Throughput: 0: 59152.3. Samples: 1227091540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:20:29,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 23:20:29,684][54798] Signal inference workers to stop experience collection... (19050 times) [2024-04-27 23:20:29,729][54818] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-04-27 23:20:29,742][54798] Signal inference workers to resume experience collection... (19050 times) [2024-04-27 23:20:29,743][54818] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-04-27 23:20:30,627][54818] Updated weights for policy 0, policy_version 507938 (0.0025) [2024-04-27 23:20:33,659][54818] Updated weights for policy 0, policy_version 507948 (0.0023) [2024-04-27 23:20:34,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8322236416. Throughput: 0: 59088.2. Samples: 1227439740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:20:34,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 23:20:36,168][54818] Updated weights for policy 0, policy_version 507958 (0.0026) [2024-04-27 23:20:39,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58436.1, 300 sec: 59093.5). Total num frames: 8322531328. Throughput: 0: 58982.5. Samples: 1227790800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:20:39,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 23:20:39,561][54818] Updated weights for policy 0, policy_version 507968 (0.0023) [2024-04-27 23:20:41,746][54818] Updated weights for policy 0, policy_version 507978 (0.0024) [2024-04-27 23:20:44,253][54587] Fps is (10 sec: 62260.0, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8322859008. Throughput: 0: 58981.4. Samples: 1227969920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:20:44,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 23:20:45,029][54818] Updated weights for policy 0, policy_version 507988 (0.0025) [2024-04-27 23:20:47,549][54818] Updated weights for policy 0, policy_version 507998 (0.0022) [2024-04-27 23:20:49,253][54587] Fps is (10 sec: 60622.1, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8323137536. Throughput: 0: 58956.7. Samples: 1228329920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:20:49,253][54587] Avg episode reward: [(0, '0.671')] [2024-04-27 23:20:49,260][54587] No heartbeat for components: RolloutWorker_w4 (6997 seconds) [2024-04-27 23:20:50,578][54818] Updated weights for policy 0, policy_version 508008 (0.0026) [2024-04-27 23:20:53,112][54818] Updated weights for policy 0, policy_version 508018 (0.0025) [2024-04-27 23:20:54,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8323448832. Throughput: 0: 59085.3. Samples: 1228681040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:20:54,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:20:56,153][54818] Updated weights for policy 0, policy_version 508028 (0.0025) [2024-04-27 23:20:58,714][54818] Updated weights for policy 0, policy_version 508038 (0.0024) [2024-04-27 23:20:59,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8323727360. Throughput: 0: 58951.7. Samples: 1228850840. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:20:59,254][54587] Avg episode reward: [(0, '0.726')] [2024-04-27 23:21:01,599][54818] Updated weights for policy 0, policy_version 508048 (0.0025) [2024-04-27 23:21:04,253][54587] Fps is (10 sec: 55705.4, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8324005888. Throughput: 0: 58942.0. Samples: 1229208480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:04,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 23:21:04,307][54818] Updated weights for policy 0, policy_version 508058 (0.0025) [2024-04-27 23:21:07,077][54818] Updated weights for policy 0, policy_version 508068 (0.0025) [2024-04-27 23:21:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8324300800. Throughput: 0: 58674.5. Samples: 1229561480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:09,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:21:09,778][54818] Updated weights for policy 0, policy_version 508078 (0.0024) [2024-04-27 23:21:12,239][54798] Signal inference workers to stop experience collection... (19100 times) [2024-04-27 23:21:12,243][54798] Signal inference workers to resume experience collection... (19100 times) [2024-04-27 23:21:12,260][54818] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-04-27 23:21:12,261][54818] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-04-27 23:21:12,488][54818] Updated weights for policy 0, policy_version 508088 (0.0028) [2024-04-27 23:21:14,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8324612096. Throughput: 0: 59108.2. Samples: 1229751420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:14,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 23:21:15,248][54818] Updated weights for policy 0, policy_version 508098 (0.0024) [2024-04-27 23:21:18,028][54818] Updated weights for policy 0, policy_version 508108 (0.0023) [2024-04-27 23:21:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8324907008. Throughput: 0: 59184.2. Samples: 1230103020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:19,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-27 23:21:20,799][54818] Updated weights for policy 0, policy_version 508118 (0.0026) [2024-04-27 23:21:23,473][54818] Updated weights for policy 0, policy_version 508128 (0.0024) [2024-04-27 23:21:24,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.5, 300 sec: 59204.5). Total num frames: 8325201920. Throughput: 0: 58948.6. Samples: 1230443480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:24,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 23:21:26,116][54818] Updated weights for policy 0, policy_version 508138 (0.0026) [2024-04-27 23:21:28,971][54818] Updated weights for policy 0, policy_version 508148 (0.0025) [2024-04-27 23:21:29,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8325496832. Throughput: 0: 58965.0. Samples: 1230623340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:29,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 23:21:31,680][54818] Updated weights for policy 0, policy_version 508158 (0.0026) [2024-04-27 23:21:34,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.7, 300 sec: 59204.6). Total num frames: 8325808128. Throughput: 0: 58862.6. Samples: 1230978740. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:34,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 23:21:34,557][54818] Updated weights for policy 0, policy_version 508168 (0.0024) [2024-04-27 23:21:37,475][54818] Updated weights for policy 0, policy_version 508178 (0.0026) [2024-04-27 23:21:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8326070272. Throughput: 0: 59103.6. Samples: 1231340700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 23:21:39,968][54818] Updated weights for policy 0, policy_version 508188 (0.0026) [2024-04-27 23:21:43,003][54818] Updated weights for policy 0, policy_version 508198 (0.0026) [2024-04-27 23:21:44,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58436.3, 300 sec: 59093.5). Total num frames: 8326365184. Throughput: 0: 59085.8. Samples: 1231509700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:44,253][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 23:21:45,619][54818] Updated weights for policy 0, policy_version 508208 (0.0026) [2024-04-27 23:21:48,508][54818] Updated weights for policy 0, policy_version 508218 (0.0026) [2024-04-27 23:21:49,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.2, 300 sec: 59093.5). Total num frames: 8326660096. Throughput: 0: 58957.4. Samples: 1231861560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-27 23:21:49,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:21:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000508219_8326660096.pth... 
[2024-04-27 23:21:49,331][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000507353_8312471552.pth [2024-04-27 23:21:51,358][54818] Updated weights for policy 0, policy_version 508228 (0.0025) [2024-04-27 23:21:54,120][54818] Updated weights for policy 0, policy_version 508238 (0.0026) [2024-04-27 23:21:54,254][54587] Fps is (10 sec: 60619.3, 60 sec: 58709.2, 300 sec: 59149.0). Total num frames: 8326971392. Throughput: 0: 59125.0. Samples: 1232222120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:21:54,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 23:21:56,960][54818] Updated weights for policy 0, policy_version 508248 (0.0026) [2024-04-27 23:21:59,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8327249920. Throughput: 0: 58741.0. Samples: 1232394760. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:21:59,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 23:21:59,343][54798] Signal inference workers to stop experience collection... (19150 times) [2024-04-27 23:21:59,364][54818] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-04-27 23:21:59,400][54798] Signal inference workers to resume experience collection... (19150 times) [2024-04-27 23:21:59,400][54818] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-04-27 23:21:59,508][54818] Updated weights for policy 0, policy_version 508258 (0.0024) [2024-04-27 23:22:02,575][54818] Updated weights for policy 0, policy_version 508268 (0.0026) [2024-04-27 23:22:04,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8327544832. Throughput: 0: 58853.7. Samples: 1232751440. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:04,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 23:22:05,524][54818] Updated weights for policy 0, policy_version 508278 (0.0024) [2024-04-27 23:22:08,108][54818] Updated weights for policy 0, policy_version 508288 (0.0026) [2024-04-27 23:22:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8327872512. Throughput: 0: 59176.4. Samples: 1233106420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:09,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 23:22:10,906][54818] Updated weights for policy 0, policy_version 508298 (0.0026) [2024-04-27 23:22:13,531][54818] Updated weights for policy 0, policy_version 508308 (0.0026) [2024-04-27 23:22:14,253][54587] Fps is (10 sec: 62259.0, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8328167424. Throughput: 0: 59105.6. Samples: 1233283100. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:14,254][54587] Avg episode reward: [(0, '0.489')] [2024-04-27 23:22:16,445][54818] Updated weights for policy 0, policy_version 508318 (0.0026) [2024-04-27 23:22:19,157][54818] Updated weights for policy 0, policy_version 508328 (0.0026) [2024-04-27 23:22:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8328445952. Throughput: 0: 58947.9. Samples: 1233631400. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:19,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 23:22:21,935][54818] Updated weights for policy 0, policy_version 508338 (0.0025) [2024-04-27 23:22:24,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.3, 300 sec: 59093.4). Total num frames: 8328740864. Throughput: 0: 59064.3. Samples: 1233998600. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 23:22:24,710][54818] Updated weights for policy 0, policy_version 508348 (0.0025) [2024-04-27 23:22:27,442][54818] Updated weights for policy 0, policy_version 508358 (0.0026) [2024-04-27 23:22:29,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8329019392. Throughput: 0: 59200.3. Samples: 1234173720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:29,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 23:22:30,129][54818] Updated weights for policy 0, policy_version 508368 (0.0024) [2024-04-27 23:22:32,827][54818] Updated weights for policy 0, policy_version 508378 (0.0024) [2024-04-27 23:22:34,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8329330688. Throughput: 0: 59269.7. Samples: 1234528700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:34,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 23:22:35,578][54818] Updated weights for policy 0, policy_version 508388 (0.0025) [2024-04-27 23:22:38,437][54818] Updated weights for policy 0, policy_version 508398 (0.0026) [2024-04-27 23:22:39,253][54587] Fps is (10 sec: 63898.0, 60 sec: 59801.7, 300 sec: 59260.1). Total num frames: 8329658368. Throughput: 0: 59097.2. Samples: 1234881480. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:39,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:22:41,044][54818] Updated weights for policy 0, policy_version 508408 (0.0027) [2024-04-27 23:22:43,932][54818] Updated weights for policy 0, policy_version 508418 (0.0025) [2024-04-27 23:22:44,253][54587] Fps is (10 sec: 62259.8, 60 sec: 59801.6, 300 sec: 59260.1). Total num frames: 8329953280. Throughput: 0: 59181.0. Samples: 1235057900. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:44,253][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 23:22:46,507][54818] Updated weights for policy 0, policy_version 508428 (0.0026) [2024-04-27 23:22:47,556][54798] Signal inference workers to stop experience collection... (19200 times) [2024-04-27 23:22:47,597][54818] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-04-27 23:22:47,610][54798] Signal inference workers to resume experience collection... (19200 times) [2024-04-27 23:22:47,616][54818] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-04-27 23:22:49,253][54587] Fps is (10 sec: 57343.5, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8330231808. Throughput: 0: 59095.6. Samples: 1235410740. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:49,254][54587] Avg episode reward: [(0, '0.511')] [2024-04-27 23:22:49,353][54818] Updated weights for policy 0, policy_version 508438 (0.0026) [2024-04-27 23:22:52,082][54818] Updated weights for policy 0, policy_version 508448 (0.0026) [2024-04-27 23:22:54,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58982.7, 300 sec: 59093.5). Total num frames: 8330510336. Throughput: 0: 59217.0. Samples: 1235771180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:54,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 23:22:54,943][54818] Updated weights for policy 0, policy_version 508458 (0.0026) [2024-04-27 23:22:57,657][54818] Updated weights for policy 0, policy_version 508468 (0.0025) [2024-04-27 23:22:59,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8330805248. Throughput: 0: 59234.9. Samples: 1235948660. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:22:59,253][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 23:23:00,251][54818] Updated weights for policy 0, policy_version 508478 (0.0028) [2024-04-27 23:23:03,362][54818] Updated weights for policy 0, policy_version 508488 (0.0025) [2024-04-27 23:23:04,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 8331100160. Throughput: 0: 59331.2. Samples: 1236301300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:23:04,254][54587] Avg episode reward: [(0, '0.507')] [2024-04-27 23:23:05,754][54818] Updated weights for policy 0, policy_version 508498 (0.0026) [2024-04-27 23:23:08,714][54818] Updated weights for policy 0, policy_version 508508 (0.0025) [2024-04-27 23:23:09,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8331411456. Throughput: 0: 59028.1. Samples: 1236654860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:23:09,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 23:23:11,302][54818] Updated weights for policy 0, policy_version 508518 (0.0025) [2024-04-27 23:23:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.6, 300 sec: 59093.5). Total num frames: 8331706368. Throughput: 0: 59105.4. Samples: 1236833460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:23:14,253][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 23:23:14,353][54818] Updated weights for policy 0, policy_version 508528 (0.0026) [2024-04-27 23:23:17,271][54818] Updated weights for policy 0, policy_version 508538 (0.0026) [2024-04-27 23:23:19,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59255.6, 300 sec: 59038.0). Total num frames: 8332001280. Throughput: 0: 59196.1. Samples: 1237192520. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:23:19,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 23:23:19,804][54818] Updated weights for policy 0, policy_version 508548 (0.0025) [2024-04-27 23:23:22,764][54818] Updated weights for policy 0, policy_version 508558 (0.0025) [2024-04-27 23:23:24,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8332296192. Throughput: 0: 59079.4. Samples: 1237540060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:23:24,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 23:23:25,537][54818] Updated weights for policy 0, policy_version 508568 (0.0023) [2024-04-27 23:23:28,309][54818] Updated weights for policy 0, policy_version 508578 (0.0025) [2024-04-27 23:23:29,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59801.5, 300 sec: 59149.0). Total num frames: 8332607488. Throughput: 0: 59228.2. Samples: 1237723180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:23:29,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 23:23:31,083][54818] Updated weights for policy 0, policy_version 508588 (0.0025) [2024-04-27 23:23:33,835][54818] Updated weights for policy 0, policy_version 508598 (0.0023) [2024-04-27 23:23:34,253][54587] Fps is (10 sec: 58983.4, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8332886016. Throughput: 0: 59273.9. Samples: 1238078060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:23:34,253][54587] Avg episode reward: [(0, '0.497')] [2024-04-27 23:23:36,088][54798] Signal inference workers to stop experience collection... (19250 times) [2024-04-27 23:23:36,089][54798] Signal inference workers to resume experience collection... 
(19250 times) [2024-04-27 23:23:36,117][54818] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-04-27 23:23:36,118][54818] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-04-27 23:23:36,469][54818] Updated weights for policy 0, policy_version 508608 (0.0026) [2024-04-27 23:23:39,253][54587] Fps is (10 sec: 57345.0, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8333180928. Throughput: 0: 59252.9. Samples: 1238437560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:23:39,253][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 23:23:39,379][54818] Updated weights for policy 0, policy_version 508618 (0.0025) [2024-04-27 23:23:41,932][54818] Updated weights for policy 0, policy_version 508628 (0.0026) [2024-04-27 23:23:44,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8333475840. Throughput: 0: 59218.3. Samples: 1238613480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:23:44,253][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 23:23:44,914][54818] Updated weights for policy 0, policy_version 508638 (0.0026) [2024-04-27 23:23:47,577][54818] Updated weights for policy 0, policy_version 508648 (0.0025) [2024-04-27 23:23:49,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8333770752. Throughput: 0: 59255.0. Samples: 1238967780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:23:49,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 23:23:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000508653_8333770752.pth... 
[2024-04-27 23:23:49,265][54587] No heartbeat for components: RolloutWorker_w4 (7177 seconds) [2024-04-27 23:23:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000507788_8319598592.pth [2024-04-27 23:23:50,410][54818] Updated weights for policy 0, policy_version 508658 (0.0025) [2024-04-27 23:23:53,066][54818] Updated weights for policy 0, policy_version 508668 (0.0025) [2024-04-27 23:23:54,253][54587] Fps is (10 sec: 58981.3, 60 sec: 59255.3, 300 sec: 59093.4). Total num frames: 8334065664. Throughput: 0: 59312.0. Samples: 1239323900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:23:54,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 23:23:55,860][54818] Updated weights for policy 0, policy_version 508678 (0.0026) [2024-04-27 23:23:58,754][54818] Updated weights for policy 0, policy_version 508688 (0.0026) [2024-04-27 23:23:59,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.3, 300 sec: 59038.0). Total num frames: 8334344192. Throughput: 0: 59239.0. Samples: 1239499220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:23:59,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 23:24:01,379][54818] Updated weights for policy 0, policy_version 508698 (0.0025) [2024-04-27 23:24:04,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8334655488. Throughput: 0: 59190.1. Samples: 1239856080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:24:04,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 23:24:04,378][54818] Updated weights for policy 0, policy_version 508708 (0.0027) [2024-04-27 23:24:06,921][54818] Updated weights for policy 0, policy_version 508718 (0.0024) [2024-04-27 23:24:09,253][54587] Fps is (10 sec: 62259.4, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8334966784. Throughput: 0: 59115.2. Samples: 1240200240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:24:09,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-27 23:24:10,020][54818] Updated weights for policy 0, policy_version 508728 (0.0026) [2024-04-27 23:24:12,464][54818] Updated weights for policy 0, policy_version 508738 (0.0026) [2024-04-27 23:24:14,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59528.5, 300 sec: 59204.5). Total num frames: 8335278080. Throughput: 0: 59232.1. Samples: 1240388620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:24:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 23:24:15,606][54818] Updated weights for policy 0, policy_version 508748 (0.0025) [2024-04-27 23:24:17,933][54818] Updated weights for policy 0, policy_version 508758 (0.0025) [2024-04-27 23:24:19,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8335556608. Throughput: 0: 59006.2. Samples: 1240733340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:24:19,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 23:24:21,159][54818] Updated weights for policy 0, policy_version 508768 (0.0025) [2024-04-27 23:24:23,615][54818] Updated weights for policy 0, policy_version 508778 (0.0026) [2024-04-27 23:24:24,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8335835136. Throughput: 0: 58940.0. Samples: 1241089860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:24:24,254][54587] Avg episode reward: [(0, '0.697')] [2024-04-27 23:24:26,546][54818] Updated weights for policy 0, policy_version 508788 (0.0026) [2024-04-27 23:24:29,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8336130048. Throughput: 0: 59088.3. Samples: 1241272460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:24:29,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 23:24:29,288][54818] Updated weights for policy 0, policy_version 508798 (0.0026) [2024-04-27 23:24:32,110][54818] Updated weights for policy 0, policy_version 508808 (0.0026) [2024-04-27 23:24:34,253][54587] Fps is (10 sec: 62258.0, 60 sec: 59528.3, 300 sec: 59093.4). Total num frames: 8336457728. Throughput: 0: 59267.0. Samples: 1241634800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:24:34,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-27 23:24:34,784][54818] Updated weights for policy 0, policy_version 508818 (0.0022) [2024-04-27 23:24:37,683][54818] Updated weights for policy 0, policy_version 508828 (0.0026) [2024-04-27 23:24:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8336736256. Throughput: 0: 59063.3. Samples: 1241981740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:24:39,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-27 23:24:40,328][54818] Updated weights for policy 0, policy_version 508838 (0.0026) [2024-04-27 23:24:40,672][54798] Signal inference workers to stop experience collection... (19300 times) [2024-04-27 23:24:40,675][54798] Signal inference workers to resume experience collection... (19300 times) [2024-04-27 23:24:40,698][54818] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-04-27 23:24:40,698][54818] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-04-27 23:24:43,326][54818] Updated weights for policy 0, policy_version 508848 (0.0025) [2024-04-27 23:24:44,253][54587] Fps is (10 sec: 54068.3, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8336998400. Throughput: 0: 59060.6. Samples: 1242156940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-27 23:24:44,253][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 23:24:45,758][54818] Updated weights for policy 0, policy_version 508858 (0.0026) [2024-04-27 23:24:48,899][54818] Updated weights for policy 0, policy_version 508868 (0.0025) [2024-04-27 23:24:49,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8337309696. Throughput: 0: 59053.9. Samples: 1242513500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:24:49,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-27 23:24:51,357][54818] Updated weights for policy 0, policy_version 508878 (0.0025) [2024-04-27 23:24:54,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8337604608. Throughput: 0: 59185.3. Samples: 1242863580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:24:54,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 23:24:54,416][54818] Updated weights for policy 0, policy_version 508888 (0.0026) [2024-04-27 23:24:56,756][54818] Updated weights for policy 0, policy_version 508898 (0.0025) [2024-04-27 23:24:59,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 59037.9). Total num frames: 8337899520. Throughput: 0: 58817.7. Samples: 1243035420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:24:59,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 23:24:59,863][54818] Updated weights for policy 0, policy_version 508908 (0.0025) [2024-04-27 23:25:02,178][54818] Updated weights for policy 0, policy_version 508918 (0.0025) [2024-04-27 23:25:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8338194432. Throughput: 0: 58991.9. Samples: 1243387980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:04,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 23:25:05,536][54818] Updated weights for policy 0, policy_version 508928 (0.0026) [2024-04-27 23:25:07,676][54818] Updated weights for policy 0, policy_version 508938 (0.0025) [2024-04-27 23:25:09,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8338489344. Throughput: 0: 59106.1. Samples: 1243749640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:09,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 23:25:11,230][54818] Updated weights for policy 0, policy_version 508948 (0.0026) [2024-04-27 23:25:13,225][54818] Updated weights for policy 0, policy_version 508958 (0.0026) [2024-04-27 23:25:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8338800640. Throughput: 0: 59062.2. Samples: 1243930260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:14,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 23:25:16,641][54818] Updated weights for policy 0, policy_version 508968 (0.0025) [2024-04-27 23:25:18,785][54818] Updated weights for policy 0, policy_version 508978 (0.0026) [2024-04-27 23:25:19,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8339095552. Throughput: 0: 58748.6. Samples: 1244278480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:19,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 23:25:22,115][54818] Updated weights for policy 0, policy_version 508988 (0.0026) [2024-04-27 23:25:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8339406848. Throughput: 0: 58872.4. Samples: 1244631000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 23:25:24,484][54818] Updated weights for policy 0, policy_version 508998 (0.0026) [2024-04-27 23:25:27,592][54818] Updated weights for policy 0, policy_version 509008 (0.0025) [2024-04-27 23:25:29,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59801.7, 300 sec: 59260.1). Total num frames: 8339718144. Throughput: 0: 59167.1. Samples: 1244819460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:29,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 23:25:30,032][54818] Updated weights for policy 0, policy_version 509018 (0.0026) [2024-04-27 23:25:33,102][54818] Updated weights for policy 0, policy_version 509028 (0.0026) [2024-04-27 23:25:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8339996672. Throughput: 0: 59200.4. Samples: 1245177520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:34,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 23:25:35,883][54818] Updated weights for policy 0, policy_version 509038 (0.0026) [2024-04-27 23:25:38,525][54818] Updated weights for policy 0, policy_version 509048 (0.0025) [2024-04-27 23:25:38,540][54798] Signal inference workers to stop experience collection... (19350 times) [2024-04-27 23:25:38,540][54798] Signal inference workers to resume experience collection... (19350 times) [2024-04-27 23:25:38,554][54818] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-04-27 23:25:38,554][54818] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-04-27 23:25:39,253][54587] Fps is (10 sec: 57342.6, 60 sec: 59255.2, 300 sec: 59093.4). Total num frames: 8340291584. Throughput: 0: 59090.5. Samples: 1245522660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:39,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 23:25:41,307][54818] Updated weights for policy 0, policy_version 509058 (0.0024) [2024-04-27 23:25:43,977][54818] Updated weights for policy 0, policy_version 509068 (0.0024) [2024-04-27 23:25:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59801.5, 300 sec: 59149.0). Total num frames: 8340586496. Throughput: 0: 59417.3. Samples: 1245709200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:44,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 23:25:46,976][54818] Updated weights for policy 0, policy_version 509078 (0.0026) [2024-04-27 23:25:49,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8340881408. Throughput: 0: 59569.7. Samples: 1246068620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:49,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 23:25:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000509087_8340881408.pth... [2024-04-27 23:25:49,322][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000508219_8326660096.pth [2024-04-27 23:25:49,441][54818] Updated weights for policy 0, policy_version 509088 (0.0027) [2024-04-27 23:25:52,374][54818] Updated weights for policy 0, policy_version 509098 (0.0025) [2024-04-27 23:25:54,253][54587] Fps is (10 sec: 57344.4, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8341159936. Throughput: 0: 59482.7. Samples: 1246426360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:54,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 23:25:54,841][54818] Updated weights for policy 0, policy_version 509108 (0.0025) [2024-04-27 23:25:58,219][54818] Updated weights for policy 0, policy_version 509118 (0.0024) [2024-04-27 23:25:59,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.4, 300 sec: 59149.0). 
Total num frames: 8341454848. Throughput: 0: 59162.6. Samples: 1246592580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:25:59,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 23:26:00,454][54818] Updated weights for policy 0, policy_version 509128 (0.0026) [2024-04-27 23:26:03,705][54818] Updated weights for policy 0, policy_version 509138 (0.0024) [2024-04-27 23:26:04,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8341733376. Throughput: 0: 59327.9. Samples: 1246948240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:26:04,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:26:06,042][54818] Updated weights for policy 0, policy_version 509148 (0.0025) [2024-04-27 23:26:09,146][54818] Updated weights for policy 0, policy_version 509158 (0.0025) [2024-04-27 23:26:09,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8342044672. Throughput: 0: 59506.6. Samples: 1247308800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:26:09,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 23:26:11,528][54818] Updated weights for policy 0, policy_version 509168 (0.0026) [2024-04-27 23:26:14,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8342339584. Throughput: 0: 59110.2. Samples: 1247479420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-27 23:26:14,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 23:26:14,864][54818] Updated weights for policy 0, policy_version 509178 (0.0023) [2024-04-27 23:26:16,897][54818] Updated weights for policy 0, policy_version 509188 (0.0025) [2024-04-27 23:26:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8342650880. Throughput: 0: 58954.2. Samples: 1247830460. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:26:19,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 23:26:20,267][54818] Updated weights for policy 0, policy_version 509198 (0.0026) [2024-04-27 23:26:22,526][54818] Updated weights for policy 0, policy_version 509208 (0.0026) [2024-04-27 23:26:22,796][54798] Signal inference workers to stop experience collection... (19400 times) [2024-04-27 23:26:22,797][54798] Signal inference workers to resume experience collection... (19400 times) [2024-04-27 23:26:22,816][54818] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-04-27 23:26:22,816][54818] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-04-27 23:26:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8342929408. Throughput: 0: 59387.3. Samples: 1248195080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:26:24,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 23:26:25,922][54818] Updated weights for policy 0, policy_version 509218 (0.0026) [2024-04-27 23:26:27,941][54818] Updated weights for policy 0, policy_version 509228 (0.0026) [2024-04-27 23:26:29,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58436.2, 300 sec: 59037.9). Total num frames: 8343224320. Throughput: 0: 59079.2. Samples: 1248367760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:26:29,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 23:26:31,248][54818] Updated weights for policy 0, policy_version 509238 (0.0026) [2024-04-27 23:26:33,399][54818] Updated weights for policy 0, policy_version 509248 (0.0027) [2024-04-27 23:26:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8343535616. Throughput: 0: 58966.7. Samples: 1248722120. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:26:34,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 23:26:36,848][54818] Updated weights for policy 0, policy_version 509258 (0.0026) [2024-04-27 23:26:38,895][54818] Updated weights for policy 0, policy_version 509268 (0.0025) [2024-04-27 23:26:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8343846912. Throughput: 0: 58885.7. Samples: 1249076220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:26:39,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 23:26:42,268][54818] Updated weights for policy 0, policy_version 509278 (0.0026) [2024-04-27 23:26:44,253][54587] Fps is (10 sec: 62259.4, 60 sec: 59528.6, 300 sec: 59315.6). Total num frames: 8344158208. Throughput: 0: 59376.5. Samples: 1249264520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:26:44,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:26:44,945][54818] Updated weights for policy 0, policy_version 509288 (0.0024) [2024-04-27 23:26:47,702][54818] Updated weights for policy 0, policy_version 509298 (0.0024) [2024-04-27 23:26:49,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8344436736. Throughput: 0: 59302.3. Samples: 1249616840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:26:49,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-27 23:26:49,260][54587] No heartbeat for components: RolloutWorker_w4 (7357 seconds) [2024-04-27 23:26:50,338][54818] Updated weights for policy 0, policy_version 509308 (0.0025) [2024-04-27 23:26:53,133][54818] Updated weights for policy 0, policy_version 509318 (0.0024) [2024-04-27 23:26:54,253][54587] Fps is (10 sec: 57343.2, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8344731648. Throughput: 0: 59163.5. Samples: 1249971160. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:26:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 23:26:56,017][54818] Updated weights for policy 0, policy_version 509328 (0.0025) [2024-04-27 23:26:58,632][54818] Updated weights for policy 0, policy_version 509338 (0.0026) [2024-04-27 23:26:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8345026560. Throughput: 0: 59243.0. Samples: 1250145360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:26:59,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 23:27:01,563][54818] Updated weights for policy 0, policy_version 509348 (0.0026) [2024-04-27 23:27:04,104][54818] Updated weights for policy 0, policy_version 509358 (0.0025) [2024-04-27 23:27:04,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59801.7, 300 sec: 59149.0). Total num frames: 8345321472. Throughput: 0: 59369.8. Samples: 1250502100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:27:04,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 23:27:07,222][54818] Updated weights for policy 0, policy_version 509368 (0.0026) [2024-04-27 23:27:09,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8345600000. Throughput: 0: 59201.4. Samples: 1250859140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:27:09,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 23:27:09,823][54818] Updated weights for policy 0, policy_version 509378 (0.0026) [2024-04-27 23:27:10,329][54798] Signal inference workers to stop experience collection... (19450 times) [2024-04-27 23:27:10,329][54798] Signal inference workers to resume experience collection... 
(19450 times) [2024-04-27 23:27:10,341][54818] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-04-27 23:27:10,341][54818] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-04-27 23:27:13,000][54818] Updated weights for policy 0, policy_version 509388 (0.0025) [2024-04-27 23:27:14,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8345894912. Throughput: 0: 59148.0. Samples: 1251029420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:27:14,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 23:27:15,335][54818] Updated weights for policy 0, policy_version 509398 (0.0026) [2024-04-27 23:27:18,359][54818] Updated weights for policy 0, policy_version 509408 (0.0027) [2024-04-27 23:27:19,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8346173440. Throughput: 0: 59200.0. Samples: 1251386120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:27:19,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 23:27:20,711][54818] Updated weights for policy 0, policy_version 509418 (0.0026) [2024-04-27 23:27:24,098][54818] Updated weights for policy 0, policy_version 509428 (0.0026) [2024-04-27 23:27:24,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8346468352. Throughput: 0: 59216.4. Samples: 1251740960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:27:24,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 23:27:26,196][54818] Updated weights for policy 0, policy_version 509438 (0.0027) [2024-04-27 23:27:29,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8346746880. Throughput: 0: 58610.5. Samples: 1251902000. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:27:29,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 23:27:29,709][54818] Updated weights for policy 0, policy_version 509448 (0.0026) [2024-04-27 23:27:31,812][54818] Updated weights for policy 0, policy_version 509458 (0.0026) [2024-04-27 23:27:34,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8347058176. Throughput: 0: 58655.7. Samples: 1252256340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:27:34,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 23:27:35,347][54818] Updated weights for policy 0, policy_version 509468 (0.0026) [2024-04-27 23:27:37,376][54818] Updated weights for policy 0, policy_version 509478 (0.0029) [2024-04-27 23:27:39,253][54587] Fps is (10 sec: 60621.8, 60 sec: 58436.4, 300 sec: 58982.4). Total num frames: 8347353088. Throughput: 0: 58783.8. Samples: 1252616420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-27 23:27:39,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 23:27:40,834][54818] Updated weights for policy 0, policy_version 509488 (0.0026) [2024-04-27 23:27:43,310][54818] Updated weights for policy 0, policy_version 509498 (0.0025) [2024-04-27 23:27:44,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58163.1, 300 sec: 59037.9). Total num frames: 8347648000. Throughput: 0: 58686.1. Samples: 1252786240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:27:44,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 23:27:46,427][54818] Updated weights for policy 0, policy_version 509508 (0.0025) [2024-04-27 23:27:48,924][54818] Updated weights for policy 0, policy_version 509518 (0.0027) [2024-04-27 23:27:49,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58436.3, 300 sec: 59093.5). Total num frames: 8347942912. Throughput: 0: 58508.0. Samples: 1253134960. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:27:49,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 23:27:49,285][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000509519_8347959296.pth... [2024-04-27 23:27:49,332][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000508653_8333770752.pth [2024-04-27 23:27:51,863][54818] Updated weights for policy 0, policy_version 509528 (0.0025) [2024-04-27 23:27:54,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58709.5, 300 sec: 59149.0). Total num frames: 8348254208. Throughput: 0: 58448.9. Samples: 1253489340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:27:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 23:27:54,544][54818] Updated weights for policy 0, policy_version 509538 (0.0025) [2024-04-27 23:27:57,381][54818] Updated weights for policy 0, policy_version 509548 (0.0026) [2024-04-27 23:27:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8348549120. Throughput: 0: 58822.7. Samples: 1253676440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:27:59,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 23:28:00,333][54818] Updated weights for policy 0, policy_version 509558 (0.0026) [2024-04-27 23:28:02,049][54798] Signal inference workers to stop experience collection... (19500 times) [2024-04-27 23:28:02,088][54818] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-04-27 23:28:02,143][54798] Signal inference workers to resume experience collection... (19500 times) [2024-04-27 23:28:02,144][54818] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-04-27 23:28:02,803][54818] Updated weights for policy 0, policy_version 509568 (0.0025) [2024-04-27 23:28:04,253][54587] Fps is (10 sec: 62258.8, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8348876800. Throughput: 0: 58743.1. 
Samples: 1254029560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:04,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 23:28:05,822][54818] Updated weights for policy 0, policy_version 509578 (0.0030) [2024-04-27 23:28:08,238][54818] Updated weights for policy 0, policy_version 509588 (0.0025) [2024-04-27 23:28:09,253][54587] Fps is (10 sec: 62258.8, 60 sec: 59528.5, 300 sec: 59204.5). Total num frames: 8349171712. Throughput: 0: 58657.4. Samples: 1254380540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:09,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 23:28:11,215][54818] Updated weights for policy 0, policy_version 509598 (0.0026) [2024-04-27 23:28:13,728][54818] Updated weights for policy 0, policy_version 509608 (0.0024) [2024-04-27 23:28:14,253][54587] Fps is (10 sec: 55706.1, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8349433856. Throughput: 0: 59268.7. Samples: 1254569080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:14,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 23:28:16,879][54818] Updated weights for policy 0, policy_version 509618 (0.0023) [2024-04-27 23:28:19,200][54818] Updated weights for policy 0, policy_version 509628 (0.0026) [2024-04-27 23:28:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8349745152. Throughput: 0: 59215.9. Samples: 1254921060. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:19,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-27 23:28:22,407][54818] Updated weights for policy 0, policy_version 509638 (0.0025) [2024-04-27 23:28:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8350007296. Throughput: 0: 59043.1. Samples: 1255273360. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:24,253][54587] Avg episode reward: [(0, '0.533')] [2024-04-27 23:28:24,851][54818] Updated weights for policy 0, policy_version 509648 (0.0024) [2024-04-27 23:28:28,188][54818] Updated weights for policy 0, policy_version 509658 (0.0028) [2024-04-27 23:28:29,253][54587] Fps is (10 sec: 55705.4, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 8350302208. Throughput: 0: 59152.1. Samples: 1255448080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:29,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 23:28:30,316][54818] Updated weights for policy 0, policy_version 509668 (0.0023) [2024-04-27 23:28:33,673][54818] Updated weights for policy 0, policy_version 509678 (0.0027) [2024-04-27 23:28:34,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8350597120. Throughput: 0: 59285.0. Samples: 1255802780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:34,254][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 23:28:35,907][54818] Updated weights for policy 0, policy_version 509688 (0.0026) [2024-04-27 23:28:39,176][54818] Updated weights for policy 0, policy_version 509698 (0.0026) [2024-04-27 23:28:39,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.2, 300 sec: 59037.9). Total num frames: 8350892032. Throughput: 0: 59232.2. Samples: 1256154800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:39,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:28:41,412][54818] Updated weights for policy 0, policy_version 509708 (0.0023) [2024-04-27 23:28:44,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8351170560. Throughput: 0: 58733.3. Samples: 1256319440. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:44,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 23:28:44,274][54798] Signal inference workers to stop experience collection... (19550 times) [2024-04-27 23:28:44,276][54798] Signal inference workers to resume experience collection... (19550 times) [2024-04-27 23:28:44,283][54818] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-04-27 23:28:44,295][54818] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-04-27 23:28:44,861][54818] Updated weights for policy 0, policy_version 509718 (0.0026) [2024-04-27 23:28:46,851][54818] Updated weights for policy 0, policy_version 509728 (0.0025) [2024-04-27 23:28:49,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8351481856. Throughput: 0: 58802.7. Samples: 1256675680. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:49,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-27 23:28:50,472][54818] Updated weights for policy 0, policy_version 509738 (0.0025) [2024-04-27 23:28:52,344][54818] Updated weights for policy 0, policy_version 509748 (0.0023) [2024-04-27 23:28:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8351776768. Throughput: 0: 59084.0. Samples: 1257039320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:54,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 23:28:55,856][54818] Updated weights for policy 0, policy_version 509758 (0.0025) [2024-04-27 23:28:58,020][54818] Updated weights for policy 0, policy_version 509768 (0.0022) [2024-04-27 23:28:59,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8352071680. Throughput: 0: 58712.7. Samples: 1257211160. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:28:59,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 23:29:01,423][54818] Updated weights for policy 0, policy_version 509778 (0.0026) [2024-04-27 23:29:03,626][54818] Updated weights for policy 0, policy_version 509788 (0.0025) [2024-04-27 23:29:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58436.3, 300 sec: 59037.9). Total num frames: 8352382976. Throughput: 0: 58707.1. Samples: 1257562880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:29:04,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 23:29:06,906][54818] Updated weights for policy 0, policy_version 509798 (0.0026) [2024-04-27 23:29:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58436.2, 300 sec: 58982.4). Total num frames: 8352677888. Throughput: 0: 58814.5. Samples: 1257920020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:29:09,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 23:29:09,390][54818] Updated weights for policy 0, policy_version 509808 (0.0024) [2024-04-27 23:29:12,237][54818] Updated weights for policy 0, policy_version 509818 (0.0025) [2024-04-27 23:29:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8352989184. Throughput: 0: 59175.5. Samples: 1258110980. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:14,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 23:29:14,959][54818] Updated weights for policy 0, policy_version 509828 (0.0026) [2024-04-27 23:29:17,731][54818] Updated weights for policy 0, policy_version 509838 (0.0026) [2024-04-27 23:29:19,253][54587] Fps is (10 sec: 62259.7, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8353300480. Throughput: 0: 59177.7. Samples: 1258465780. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:19,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 23:29:20,346][54818] Updated weights for policy 0, policy_version 509848 (0.0025) [2024-04-27 23:29:23,164][54818] Updated weights for policy 0, policy_version 509858 (0.0026) [2024-04-27 23:29:24,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59801.4, 300 sec: 59204.5). Total num frames: 8353595392. Throughput: 0: 59104.0. Samples: 1258814480. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:24,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 23:29:25,962][54818] Updated weights for policy 0, policy_version 509868 (0.0025) [2024-04-27 23:29:28,698][54818] Updated weights for policy 0, policy_version 509878 (0.0026) [2024-04-27 23:29:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59801.6, 300 sec: 59093.5). Total num frames: 8353890304. Throughput: 0: 59621.8. Samples: 1259002420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:29,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 23:29:31,781][54818] Updated weights for policy 0, policy_version 509888 (0.0024) [2024-04-27 23:29:33,436][54798] Signal inference workers to stop experience collection... (19600 times) [2024-04-27 23:29:33,437][54798] Signal inference workers to resume experience collection... (19600 times) [2024-04-27 23:29:33,458][54818] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-04-27 23:29:33,458][54818] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-04-27 23:29:34,094][54818] Updated weights for policy 0, policy_version 509898 (0.0026) [2024-04-27 23:29:34,254][54587] Fps is (10 sec: 57343.5, 60 sec: 59528.3, 300 sec: 59093.4). Total num frames: 8354168832. Throughput: 0: 59698.0. Samples: 1259362100. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:34,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 23:29:37,149][54818] Updated weights for policy 0, policy_version 509908 (0.0024) [2024-04-27 23:29:39,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59528.7, 300 sec: 59204.6). Total num frames: 8354463744. Throughput: 0: 59465.9. Samples: 1259715280. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:39,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 23:29:39,603][54818] Updated weights for policy 0, policy_version 509918 (0.0025) [2024-04-27 23:29:42,622][54818] Updated weights for policy 0, policy_version 509928 (0.0027) [2024-04-27 23:29:44,253][54587] Fps is (10 sec: 57345.6, 60 sec: 59528.6, 300 sec: 59093.5). Total num frames: 8354742272. Throughput: 0: 59446.0. Samples: 1259886220. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:44,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-27 23:29:45,035][54818] Updated weights for policy 0, policy_version 509938 (0.0026) [2024-04-27 23:29:48,104][54818] Updated weights for policy 0, policy_version 509948 (0.0023) [2024-04-27 23:29:49,253][54587] Fps is (10 sec: 57343.5, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8355037184. Throughput: 0: 59682.2. Samples: 1260248580. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:49,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 23:29:49,264][54587] No heartbeat for components: RolloutWorker_w4 (7537 seconds) [2024-04-27 23:29:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000509951_8355037184.pth... 
[2024-04-27 23:29:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000509087_8340881408.pth [2024-04-27 23:29:50,357][54818] Updated weights for policy 0, policy_version 509958 (0.0021) [2024-04-27 23:29:53,724][54818] Updated weights for policy 0, policy_version 509968 (0.0026) [2024-04-27 23:29:54,253][54587] Fps is (10 sec: 58981.3, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8355332096. Throughput: 0: 59745.8. Samples: 1260608580. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:54,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 23:29:55,789][54818] Updated weights for policy 0, policy_version 509978 (0.0027) [2024-04-27 23:29:59,060][54818] Updated weights for policy 0, policy_version 509988 (0.0026) [2024-04-27 23:29:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8355643392. Throughput: 0: 59247.5. Samples: 1260777120. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:29:59,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 23:30:01,245][54818] Updated weights for policy 0, policy_version 509998 (0.0025) [2024-04-27 23:30:04,253][54587] Fps is (10 sec: 62260.1, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8355954688. Throughput: 0: 59367.6. Samples: 1261137320. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:30:04,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-27 23:30:04,886][54818] Updated weights for policy 0, policy_version 510008 (0.0026) [2024-04-27 23:30:06,738][54818] Updated weights for policy 0, policy_version 510018 (0.0025) [2024-04-27 23:30:09,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8356233216. Throughput: 0: 59703.6. Samples: 1261501140. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:30:09,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 23:30:10,307][54818] Updated weights for policy 0, policy_version 510028 (0.0025) [2024-04-27 23:30:12,132][54818] Updated weights for policy 0, policy_version 510038 (0.0026) [2024-04-27 23:30:14,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8356528128. Throughput: 0: 59207.9. Samples: 1261666780. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:30:14,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 23:30:15,728][54818] Updated weights for policy 0, policy_version 510048 (0.0026) [2024-04-27 23:30:16,595][54798] Signal inference workers to stop experience collection... (19650 times) [2024-04-27 23:30:16,599][54798] Signal inference workers to resume experience collection... (19650 times) [2024-04-27 23:30:16,624][54818] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-04-27 23:30:16,624][54818] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-04-27 23:30:17,587][54818] Updated weights for policy 0, policy_version 510058 (0.0026) [2024-04-27 23:30:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8356839424. Throughput: 0: 59085.1. Samples: 1262020920. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:30:19,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-27 23:30:21,175][54818] Updated weights for policy 0, policy_version 510068 (0.0025) [2024-04-27 23:30:23,393][54818] Updated weights for policy 0, policy_version 510078 (0.0025) [2024-04-27 23:30:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8357134336. Throughput: 0: 59397.7. Samples: 1262388180. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:30:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 23:30:26,856][54818] Updated weights for policy 0, policy_version 510088 (0.0024) [2024-04-27 23:30:28,996][54818] Updated weights for policy 0, policy_version 510098 (0.0025) [2024-04-27 23:30:29,254][54587] Fps is (10 sec: 60617.3, 60 sec: 59254.8, 300 sec: 59148.9). Total num frames: 8357445632. Throughput: 0: 59783.9. Samples: 1262576540. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:30:29,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 23:30:32,133][54818] Updated weights for policy 0, policy_version 510108 (0.0026) [2024-04-27 23:30:34,253][54587] Fps is (10 sec: 62258.9, 60 sec: 59801.8, 300 sec: 59204.6). Total num frames: 8357756928. Throughput: 0: 59405.8. Samples: 1262921840. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-27 23:30:34,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 23:30:34,852][54818] Updated weights for policy 0, policy_version 510118 (0.0025) [2024-04-27 23:30:37,544][54818] Updated weights for policy 0, policy_version 510128 (0.0024) [2024-04-27 23:30:39,253][54587] Fps is (10 sec: 60624.5, 60 sec: 59801.5, 300 sec: 59204.6). Total num frames: 8358051840. Throughput: 0: 59325.4. Samples: 1263278220. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:30:39,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 23:30:40,258][54818] Updated weights for policy 0, policy_version 510138 (0.0025) [2024-04-27 23:30:43,015][54818] Updated weights for policy 0, policy_version 510148 (0.0026) [2024-04-27 23:30:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60347.6, 300 sec: 59260.1). Total num frames: 8358363136. Throughput: 0: 59812.9. Samples: 1263468700. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:30:44,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 23:30:45,721][54818] Updated weights for policy 0, policy_version 510158 (0.0026) [2024-04-27 23:30:48,490][54818] Updated weights for policy 0, policy_version 510168 (0.0026) [2024-04-27 23:30:49,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60347.7, 300 sec: 59315.6). Total num frames: 8358658048. Throughput: 0: 59722.5. Samples: 1263824840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:30:49,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 23:30:51,503][54818] Updated weights for policy 0, policy_version 510178 (0.0026) [2024-04-27 23:30:53,865][54818] Updated weights for policy 0, policy_version 510188 (0.0023) [2024-04-27 23:30:54,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60347.9, 300 sec: 59315.7). Total num frames: 8358952960. Throughput: 0: 59471.3. Samples: 1264177340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:30:54,253][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 23:30:56,974][54818] Updated weights for policy 0, policy_version 510198 (0.0026) [2024-04-27 23:30:58,649][54798] Signal inference workers to stop experience collection... (19700 times) [2024-04-27 23:30:58,685][54818] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-04-27 23:30:58,698][54798] Signal inference workers to resume experience collection... (19700 times) [2024-04-27 23:30:58,703][54818] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-04-27 23:30:59,253][54587] Fps is (10 sec: 57343.4, 60 sec: 59801.5, 300 sec: 59315.6). Total num frames: 8359231488. Throughput: 0: 59867.0. Samples: 1264360800. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:30:59,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 23:30:59,328][54818] Updated weights for policy 0, policy_version 510208 (0.0026) [2024-04-27 23:31:02,331][54818] Updated weights for policy 0, policy_version 510218 (0.0027) [2024-04-27 23:31:04,253][54587] Fps is (10 sec: 55705.6, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8359510016. Throughput: 0: 59930.4. Samples: 1264717780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:04,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 23:31:04,694][54818] Updated weights for policy 0, policy_version 510228 (0.0026) [2024-04-27 23:31:07,779][54818] Updated weights for policy 0, policy_version 510238 (0.0026) [2024-04-27 23:31:09,253][54587] Fps is (10 sec: 57345.2, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8359804928. Throughput: 0: 59729.4. Samples: 1265076000. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:09,254][54587] Avg episode reward: [(0, '0.468')] [2024-04-27 23:31:10,049][54818] Updated weights for policy 0, policy_version 510248 (0.0030) [2024-04-27 23:31:13,407][54818] Updated weights for policy 0, policy_version 510258 (0.0024) [2024-04-27 23:31:14,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8360099840. Throughput: 0: 59240.8. Samples: 1265242340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:14,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 23:31:15,531][54818] Updated weights for policy 0, policy_version 510268 (0.0031) [2024-04-27 23:31:18,812][54818] Updated weights for policy 0, policy_version 510278 (0.0026) [2024-04-27 23:31:19,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8360394752. Throughput: 0: 59752.5. Samples: 1265610700. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:19,253][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 23:31:20,997][54818] Updated weights for policy 0, policy_version 510288 (0.0026) [2024-04-27 23:31:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8360706048. Throughput: 0: 59857.8. Samples: 1265971820. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:24,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 23:31:24,604][54818] Updated weights for policy 0, policy_version 510298 (0.0026) [2024-04-27 23:31:26,471][54818] Updated weights for policy 0, policy_version 510308 (0.0026) [2024-04-27 23:31:29,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59256.0, 300 sec: 59204.5). Total num frames: 8361000960. Throughput: 0: 59279.5. Samples: 1266136280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:29,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 23:31:30,042][54818] Updated weights for policy 0, policy_version 510318 (0.0023) [2024-04-27 23:31:32,498][54818] Updated weights for policy 0, policy_version 510328 (0.0027) [2024-04-27 23:31:34,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8361279488. Throughput: 0: 59512.5. Samples: 1266502900. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:34,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-27 23:31:35,462][54818] Updated weights for policy 0, policy_version 510338 (0.0025) [2024-04-27 23:31:37,896][54818] Updated weights for policy 0, policy_version 510348 (0.0026) [2024-04-27 23:31:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8361607168. Throughput: 0: 59664.2. Samples: 1266862240. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 23:31:40,735][54798] Signal inference workers to stop experience collection... (19750 times) [2024-04-27 23:31:40,735][54798] Signal inference workers to resume experience collection... (19750 times) [2024-04-27 23:31:40,761][54818] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-04-27 23:31:40,761][54818] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-04-27 23:31:40,842][54818] Updated weights for policy 0, policy_version 510358 (0.0026) [2024-04-27 23:31:43,282][54818] Updated weights for policy 0, policy_version 510368 (0.0026) [2024-04-27 23:31:44,253][54587] Fps is (10 sec: 62259.5, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8361902080. Throughput: 0: 59489.2. Samples: 1267037800. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:44,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 23:31:46,179][54818] Updated weights for policy 0, policy_version 510378 (0.0027) [2024-04-27 23:31:48,796][54818] Updated weights for policy 0, policy_version 510388 (0.0025) [2024-04-27 23:31:49,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8362213376. Throughput: 0: 59531.3. Samples: 1267396700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:49,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-27 23:31:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000510389_8362213376.pth... 
[2024-04-27 23:31:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000509519_8347959296.pth [2024-04-27 23:31:51,663][54818] Updated weights for policy 0, policy_version 510398 (0.0026) [2024-04-27 23:31:54,211][54818] Updated weights for policy 0, policy_version 510408 (0.0024) [2024-04-27 23:31:54,253][54587] Fps is (10 sec: 62258.3, 60 sec: 59528.4, 300 sec: 59315.6). Total num frames: 8362524672. Throughput: 0: 59509.6. Samples: 1267753940. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:54,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-27 23:31:57,033][54818] Updated weights for policy 0, policy_version 510418 (0.0025) [2024-04-27 23:31:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59801.6, 300 sec: 59315.6). Total num frames: 8362819584. Throughput: 0: 59935.1. Samples: 1267939420. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:31:59,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-27 23:32:00,023][54818] Updated weights for policy 0, policy_version 510428 (0.0026) [2024-04-27 23:32:02,516][54818] Updated weights for policy 0, policy_version 510438 (0.0025) [2024-04-27 23:32:04,253][54587] Fps is (10 sec: 57344.8, 60 sec: 59801.6, 300 sec: 59315.6). Total num frames: 8363098112. Throughput: 0: 59550.2. Samples: 1268290460. Policy #0 lag: (min: 1.0, avg: 11.4, max: 20.0) [2024-04-27 23:32:04,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 23:32:05,367][54818] Updated weights for policy 0, policy_version 510448 (0.0027) [2024-04-27 23:32:08,088][54818] Updated weights for policy 0, policy_version 510458 (0.0025) [2024-04-27 23:32:09,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60074.6, 300 sec: 59371.2). Total num frames: 8363409408. Throughput: 0: 59268.0. Samples: 1268638880. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:09,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-27 23:32:11,269][54818] Updated weights for policy 0, policy_version 510468 (0.0025) [2024-04-27 23:32:13,537][54818] Updated weights for policy 0, policy_version 510478 (0.0025) [2024-04-27 23:32:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60074.8, 300 sec: 59426.7). Total num frames: 8363704320. Throughput: 0: 59793.6. Samples: 1268826980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:14,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 23:32:16,677][54818] Updated weights for policy 0, policy_version 510488 (0.0025) [2024-04-27 23:32:18,865][54818] Updated weights for policy 0, policy_version 510498 (0.0027) [2024-04-27 23:32:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 60620.7, 300 sec: 59537.8). Total num frames: 8364032000. Throughput: 0: 59676.4. Samples: 1269188340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:19,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 23:32:22,060][54818] Updated weights for policy 0, policy_version 510508 (0.0025) [2024-04-27 23:32:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60074.7, 300 sec: 59537.8). Total num frames: 8364310528. Throughput: 0: 59462.4. Samples: 1269538040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 23:32:24,348][54818] Updated weights for policy 0, policy_version 510518 (0.0026) [2024-04-27 23:32:27,482][54818] Updated weights for policy 0, policy_version 510528 (0.0026) [2024-04-27 23:32:28,127][54798] Signal inference workers to stop experience collection... (19800 times) [2024-04-27 23:32:28,168][54818] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-04-27 23:32:28,182][54798] Signal inference workers to resume experience collection... 
(19800 times) [2024-04-27 23:32:28,188][54818] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-04-27 23:32:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 60074.7, 300 sec: 59482.2). Total num frames: 8364605440. Throughput: 0: 59561.2. Samples: 1269718060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:29,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 23:32:29,886][54818] Updated weights for policy 0, policy_version 510538 (0.0026) [2024-04-27 23:32:33,274][54818] Updated weights for policy 0, policy_version 510548 (0.0027) [2024-04-27 23:32:34,253][54587] Fps is (10 sec: 58982.0, 60 sec: 60347.7, 300 sec: 59482.2). Total num frames: 8364900352. Throughput: 0: 59557.0. Samples: 1270076760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:34,262][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 23:32:35,336][54818] Updated weights for policy 0, policy_version 510558 (0.0026) [2024-04-27 23:32:38,845][54818] Updated weights for policy 0, policy_version 510568 (0.0026) [2024-04-27 23:32:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59528.5, 300 sec: 59426.7). Total num frames: 8365178880. Throughput: 0: 59598.7. Samples: 1270435880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:39,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 23:32:41,135][54818] Updated weights for policy 0, policy_version 510578 (0.0027) [2024-04-27 23:32:44,200][54818] Updated weights for policy 0, policy_version 510588 (0.0027) [2024-04-27 23:32:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59528.5, 300 sec: 59426.7). Total num frames: 8365473792. Throughput: 0: 59413.9. Samples: 1270613040. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:44,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 23:32:47,126][54818] Updated weights for policy 0, policy_version 510598 (0.0026) [2024-04-27 23:32:49,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.5, 300 sec: 59315.6). Total num frames: 8365752320. Throughput: 0: 59524.7. Samples: 1270969080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:49,256][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 23:32:49,269][54587] No heartbeat for components: RolloutWorker_w4 (7717 seconds) [2024-04-27 23:32:49,664][54818] Updated weights for policy 0, policy_version 510608 (0.0026) [2024-04-27 23:32:52,565][54818] Updated weights for policy 0, policy_version 510618 (0.0026) [2024-04-27 23:32:54,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.5, 300 sec: 59371.2). Total num frames: 8366063616. Throughput: 0: 59498.7. Samples: 1271316320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:54,254][54587] Avg episode reward: [(0, '0.496')] [2024-04-27 23:32:55,181][54818] Updated weights for policy 0, policy_version 510628 (0.0024) [2024-04-27 23:32:58,049][54818] Updated weights for policy 0, policy_version 510638 (0.0025) [2024-04-27 23:32:59,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8366358528. Throughput: 0: 59115.0. Samples: 1271487160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:32:59,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 23:33:00,605][54818] Updated weights for policy 0, policy_version 510648 (0.0025) [2024-04-27 23:33:03,538][54818] Updated weights for policy 0, policy_version 510658 (0.0026) [2024-04-27 23:33:04,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59528.4, 300 sec: 59315.6). Total num frames: 8366669824. Throughput: 0: 59197.3. Samples: 1271852220. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:33:04,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 23:33:06,100][54818] Updated weights for policy 0, policy_version 510668 (0.0021) [2024-04-27 23:33:09,048][54818] Updated weights for policy 0, policy_version 510678 (0.0026) [2024-04-27 23:33:09,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.5, 300 sec: 59371.2). Total num frames: 8366948352. Throughput: 0: 59476.5. Samples: 1272214480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:33:09,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 23:33:11,524][54818] Updated weights for policy 0, policy_version 510688 (0.0026) [2024-04-27 23:33:14,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.3, 300 sec: 59315.6). Total num frames: 8367243264. Throughput: 0: 59165.4. Samples: 1272380500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:33:14,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 23:33:14,525][54818] Updated weights for policy 0, policy_version 510698 (0.0026) [2024-04-27 23:33:17,040][54818] Updated weights for policy 0, policy_version 510708 (0.0026) [2024-04-27 23:33:19,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58436.3, 300 sec: 59426.7). Total num frames: 8367538176. Throughput: 0: 58986.7. Samples: 1272731160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:33:19,254][54587] Avg episode reward: [(0, '0.497')] [2024-04-27 23:33:20,065][54818] Updated weights for policy 0, policy_version 510718 (0.0026) [2024-04-27 23:33:21,584][54798] Signal inference workers to stop experience collection... (19850 times) [2024-04-27 23:33:21,584][54798] Signal inference workers to resume experience collection... 
(19850 times) [2024-04-27 23:33:21,596][54818] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-04-27 23:33:21,596][54818] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-04-27 23:33:22,575][54818] Updated weights for policy 0, policy_version 510728 (0.0025) [2024-04-27 23:33:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.3, 300 sec: 59482.2). Total num frames: 8367849472. Throughput: 0: 58860.5. Samples: 1273084600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:33:24,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 23:33:25,606][54818] Updated weights for policy 0, policy_version 510738 (0.0025) [2024-04-27 23:33:28,092][54818] Updated weights for policy 0, policy_version 510748 (0.0024) [2024-04-27 23:33:29,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58982.5, 300 sec: 59482.3). Total num frames: 8368144384. Throughput: 0: 59120.1. Samples: 1273273440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-27 23:33:29,253][54587] Avg episode reward: [(0, '0.548')] [2024-04-27 23:33:31,219][54818] Updated weights for policy 0, policy_version 510758 (0.0026) [2024-04-27 23:33:33,649][54818] Updated weights for policy 0, policy_version 510768 (0.0024) [2024-04-27 23:33:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.5, 300 sec: 59537.8). Total num frames: 8368455680. Throughput: 0: 58958.7. Samples: 1273622220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:33:34,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 23:33:37,352][54818] Updated weights for policy 0, policy_version 510778 (0.0026) [2024-04-27 23:33:39,229][54818] Updated weights for policy 0, policy_version 510788 (0.0026) [2024-04-27 23:33:39,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59528.6, 300 sec: 59593.3). Total num frames: 8368750592. Throughput: 0: 58915.1. Samples: 1273967500. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:33:39,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 23:33:42,835][54818] Updated weights for policy 0, policy_version 510798 (0.0025) [2024-04-27 23:33:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.5, 300 sec: 59482.3). Total num frames: 8369029120. Throughput: 0: 59148.5. Samples: 1274148840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:33:44,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 23:33:44,852][54818] Updated weights for policy 0, policy_version 510808 (0.0026) [2024-04-27 23:33:48,338][54818] Updated weights for policy 0, policy_version 510818 (0.0025) [2024-04-27 23:33:49,253][54587] Fps is (10 sec: 55705.9, 60 sec: 59255.6, 300 sec: 59426.7). Total num frames: 8369307648. Throughput: 0: 58989.0. Samples: 1274506720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:33:49,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 23:33:49,400][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000510824_8369340416.pth... [2024-04-27 23:33:49,441][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000509951_8355037184.pth [2024-04-27 23:33:50,332][54818] Updated weights for policy 0, policy_version 510828 (0.0023) [2024-04-27 23:33:53,911][54818] Updated weights for policy 0, policy_version 510838 (0.0025) [2024-04-27 23:33:54,253][54587] Fps is (10 sec: 54067.6, 60 sec: 58436.3, 300 sec: 59315.7). Total num frames: 8369569792. Throughput: 0: 58780.0. Samples: 1274859580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:33:54,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 23:33:56,218][54818] Updated weights for policy 0, policy_version 510848 (0.0026) [2024-04-27 23:33:59,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58436.3, 300 sec: 59260.1). Total num frames: 8369864704. Throughput: 0: 58716.5. Samples: 1275022740. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:33:59,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-27 23:33:59,555][54818] Updated weights for policy 0, policy_version 510858 (0.0025) [2024-04-27 23:34:01,678][54818] Updated weights for policy 0, policy_version 510868 (0.0024) [2024-04-27 23:34:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58436.3, 300 sec: 59315.6). Total num frames: 8370176000. Throughput: 0: 58895.5. Samples: 1275381460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:04,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:34:05,122][54818] Updated weights for policy 0, policy_version 510878 (0.0026) [2024-04-27 23:34:05,315][54798] Signal inference workers to stop experience collection... (19900 times) [2024-04-27 23:34:05,362][54818] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-04-27 23:34:05,370][54798] Signal inference workers to resume experience collection... (19900 times) [2024-04-27 23:34:05,376][54818] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-04-27 23:34:07,111][54818] Updated weights for policy 0, policy_version 510888 (0.0026) [2024-04-27 23:34:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58709.2, 300 sec: 59260.1). Total num frames: 8370470912. Throughput: 0: 58862.2. Samples: 1275733400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:09,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 23:34:10,521][54818] Updated weights for policy 0, policy_version 510898 (0.0026) [2024-04-27 23:34:12,811][54818] Updated weights for policy 0, policy_version 510908 (0.0030) [2024-04-27 23:34:14,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.3, 300 sec: 59260.1). Total num frames: 8370782208. Throughput: 0: 58433.1. Samples: 1275902940. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:14,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 23:34:16,032][54818] Updated weights for policy 0, policy_version 510918 (0.0023) [2024-04-27 23:34:18,677][54818] Updated weights for policy 0, policy_version 510928 (0.0024) [2024-04-27 23:34:19,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58709.2, 300 sec: 59204.6). Total num frames: 8371060736. Throughput: 0: 58569.2. Samples: 1276257840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:19,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 23:34:21,417][54818] Updated weights for policy 0, policy_version 510938 (0.0024) [2024-04-27 23:34:24,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58436.2, 300 sec: 59204.5). Total num frames: 8371355648. Throughput: 0: 58747.9. Samples: 1276611160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:24,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-27 23:34:24,514][54818] Updated weights for policy 0, policy_version 510948 (0.0025) [2024-04-27 23:34:26,970][54818] Updated weights for policy 0, policy_version 510958 (0.0026) [2024-04-27 23:34:29,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58709.3, 300 sec: 59315.7). Total num frames: 8371666944. Throughput: 0: 58847.6. Samples: 1276796980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:29,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 23:34:29,948][54818] Updated weights for policy 0, policy_version 510968 (0.0025) [2024-04-27 23:34:32,483][54818] Updated weights for policy 0, policy_version 510978 (0.0025) [2024-04-27 23:34:34,253][54587] Fps is (10 sec: 62259.6, 60 sec: 58709.4, 300 sec: 59371.2). Total num frames: 8371978240. Throughput: 0: 58507.9. Samples: 1277139580. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:34,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-27 23:34:35,472][54818] Updated weights for policy 0, policy_version 510988 (0.0025) [2024-04-27 23:34:38,043][54818] Updated weights for policy 0, policy_version 510998 (0.0026) [2024-04-27 23:34:39,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58436.3, 300 sec: 59371.2). Total num frames: 8372256768. Throughput: 0: 58574.1. Samples: 1277495420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:39,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 23:34:40,962][54818] Updated weights for policy 0, policy_version 511008 (0.0026) [2024-04-27 23:34:43,673][54818] Updated weights for policy 0, policy_version 511018 (0.0026) [2024-04-27 23:34:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59426.7). Total num frames: 8372568064. Throughput: 0: 58936.4. Samples: 1277674880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:44,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 23:34:46,459][54818] Updated weights for policy 0, policy_version 511028 (0.0025) [2024-04-27 23:34:49,082][54818] Updated weights for policy 0, policy_version 511038 (0.0025) [2024-04-27 23:34:49,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58982.3, 300 sec: 59371.2). Total num frames: 8372846592. Throughput: 0: 58911.9. Samples: 1278032500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:49,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 23:34:49,393][54798] Signal inference workers to stop experience collection... (19950 times) [2024-04-27 23:34:49,427][54818] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-04-27 23:34:49,457][54798] Signal inference workers to resume experience collection... 
(19950 times) [2024-04-27 23:34:49,458][54818] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-04-27 23:34:52,021][54818] Updated weights for policy 0, policy_version 511048 (0.0027) [2024-04-27 23:34:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59528.5, 300 sec: 59315.7). Total num frames: 8373141504. Throughput: 0: 58904.0. Samples: 1278384080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:54,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 23:34:54,480][54818] Updated weights for policy 0, policy_version 511058 (0.0025) [2024-04-27 23:34:57,651][54818] Updated weights for policy 0, policy_version 511068 (0.0025) [2024-04-27 23:34:59,253][54587] Fps is (10 sec: 57344.6, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8373420032. Throughput: 0: 59093.5. Samples: 1278562140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-27 23:34:59,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 23:35:00,142][54818] Updated weights for policy 0, policy_version 511078 (0.0026) [2024-04-27 23:35:03,190][54818] Updated weights for policy 0, policy_version 511088 (0.0026) [2024-04-27 23:35:04,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.6, 300 sec: 59315.7). Total num frames: 8373731328. Throughput: 0: 58975.4. Samples: 1278911720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:04,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 23:35:05,726][54818] Updated weights for policy 0, policy_version 511098 (0.0026) [2024-04-27 23:35:08,690][54818] Updated weights for policy 0, policy_version 511108 (0.0026) [2024-04-27 23:35:09,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.4, 300 sec: 59204.6). Total num frames: 8373993472. Throughput: 0: 59038.3. Samples: 1279267880. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:09,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-27 23:35:11,249][54818] Updated weights for policy 0, policy_version 511118 (0.0026) [2024-04-27 23:35:14,253][54587] Fps is (10 sec: 57343.2, 60 sec: 58709.4, 300 sec: 59204.6). Total num frames: 8374304768. Throughput: 0: 58834.1. Samples: 1279444520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:14,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 23:35:14,364][54818] Updated weights for policy 0, policy_version 511128 (0.0026) [2024-04-27 23:35:16,817][54818] Updated weights for policy 0, policy_version 511138 (0.0026) [2024-04-27 23:35:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8374599680. Throughput: 0: 59180.4. Samples: 1279802700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:19,254][54587] Avg episode reward: [(0, '0.713')] [2024-04-27 23:35:19,691][54818] Updated weights for policy 0, policy_version 511148 (0.0026) [2024-04-27 23:35:22,461][54818] Updated weights for policy 0, policy_version 511158 (0.0022) [2024-04-27 23:35:24,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58982.5, 300 sec: 59149.2). Total num frames: 8374894592. Throughput: 0: 59145.4. Samples: 1280156960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 23:35:25,387][54818] Updated weights for policy 0, policy_version 511168 (0.0026) [2024-04-27 23:35:27,959][54818] Updated weights for policy 0, policy_version 511178 (0.0024) [2024-04-27 23:35:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8375205888. Throughput: 0: 59163.1. Samples: 1280337220. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:29,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-27 23:35:30,861][54818] Updated weights for policy 0, policy_version 511188 (0.0026) [2024-04-27 23:35:33,492][54818] Updated weights for policy 0, policy_version 511198 (0.0031) [2024-04-27 23:35:34,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58436.3, 300 sec: 59093.5). Total num frames: 8375484416. Throughput: 0: 59030.4. Samples: 1280688860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:34,253][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 23:35:36,288][54818] Updated weights for policy 0, policy_version 511208 (0.0026) [2024-04-27 23:35:36,468][54798] Signal inference workers to stop experience collection... (20000 times) [2024-04-27 23:35:36,506][54818] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-04-27 23:35:36,518][54798] Signal inference workers to resume experience collection... (20000 times) [2024-04-27 23:35:36,524][54818] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-04-27 23:35:39,159][54818] Updated weights for policy 0, policy_version 511218 (0.0026) [2024-04-27 23:35:39,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8375795712. Throughput: 0: 59146.1. Samples: 1281045660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:39,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 23:35:41,744][54818] Updated weights for policy 0, policy_version 511228 (0.0026) [2024-04-27 23:35:44,253][54587] Fps is (10 sec: 62258.3, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8376107008. Throughput: 0: 59151.0. Samples: 1281223940. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:44,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 23:35:44,767][54818] Updated weights for policy 0, policy_version 511238 (0.0027) [2024-04-27 23:35:47,331][54818] Updated weights for policy 0, policy_version 511248 (0.0025) [2024-04-27 23:35:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8376385536. Throughput: 0: 59330.5. Samples: 1281581600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:49,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 23:35:49,261][54587] No heartbeat for components: RolloutWorker_w4 (7897 seconds) [2024-04-27 23:35:49,332][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000511255_8376401920.pth... [2024-04-27 23:35:49,372][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000510389_8362213376.pth [2024-04-27 23:35:50,275][54818] Updated weights for policy 0, policy_version 511258 (0.0024) [2024-04-27 23:35:52,900][54818] Updated weights for policy 0, policy_version 511268 (0.0026) [2024-04-27 23:35:54,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8376680448. Throughput: 0: 59337.8. Samples: 1281938080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:54,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 23:35:55,787][54818] Updated weights for policy 0, policy_version 511278 (0.0025) [2024-04-27 23:35:58,997][54818] Updated weights for policy 0, policy_version 511288 (0.0026) [2024-04-27 23:35:59,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8376958976. Throughput: 0: 59095.6. Samples: 1282103820. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:35:59,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:36:01,337][54818] Updated weights for policy 0, policy_version 511298 (0.0022) [2024-04-27 23:36:04,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.2, 300 sec: 59149.0). Total num frames: 8377253888. Throughput: 0: 59005.3. Samples: 1282457940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:36:04,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 23:36:04,568][54818] Updated weights for policy 0, policy_version 511308 (0.0023) [2024-04-27 23:36:06,945][54818] Updated weights for policy 0, policy_version 511318 (0.0025) [2024-04-27 23:36:09,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8377565184. Throughput: 0: 58917.8. Samples: 1282808260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:36:09,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 23:36:10,025][54818] Updated weights for policy 0, policy_version 511328 (0.0026) [2024-04-27 23:36:12,409][54818] Updated weights for policy 0, policy_version 511338 (0.0026) [2024-04-27 23:36:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 59528.7, 300 sec: 59260.1). Total num frames: 8377876480. Throughput: 0: 59091.7. Samples: 1282996340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:36:14,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-27 23:36:15,505][54818] Updated weights for policy 0, policy_version 511348 (0.0027) [2024-04-27 23:36:18,006][54818] Updated weights for policy 0, policy_version 511358 (0.0025) [2024-04-27 23:36:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8378171392. Throughput: 0: 59111.9. Samples: 1283348900. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:36:19,254][54587] Avg episode reward: [(0, '0.706')] [2024-04-27 23:36:20,973][54818] Updated weights for policy 0, policy_version 511368 (0.0025) [2024-04-27 23:36:23,534][54818] Updated weights for policy 0, policy_version 511378 (0.0025) [2024-04-27 23:36:24,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59528.4, 300 sec: 59204.6). Total num frames: 8378466304. Throughput: 0: 58899.5. Samples: 1283696140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-27 23:36:24,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 23:36:24,730][54798] Signal inference workers to stop experience collection... (20050 times) [2024-04-27 23:36:24,731][54798] Signal inference workers to resume experience collection... (20050 times) [2024-04-27 23:36:24,741][54818] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-04-27 23:36:24,741][54818] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-04-27 23:36:26,445][54818] Updated weights for policy 0, policy_version 511388 (0.0023) [2024-04-27 23:36:28,989][54818] Updated weights for policy 0, policy_version 511398 (0.0025) [2024-04-27 23:36:29,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59528.6, 300 sec: 59315.6). Total num frames: 8378777600. Throughput: 0: 58993.0. Samples: 1283878620. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:36:29,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 23:36:32,010][54818] Updated weights for policy 0, policy_version 511408 (0.0027) [2024-04-27 23:36:34,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8379056128. Throughput: 0: 58935.2. Samples: 1284233680. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:36:34,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 23:36:34,349][54818] Updated weights for policy 0, policy_version 511418 (0.0025) [2024-04-27 23:36:37,467][54818] Updated weights for policy 0, policy_version 511428 (0.0027) [2024-04-27 23:36:39,253][54587] Fps is (10 sec: 55705.2, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8379334656. Throughput: 0: 58929.7. Samples: 1284589920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:36:39,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 23:36:39,818][54818] Updated weights for policy 0, policy_version 511438 (0.0024) [2024-04-27 23:36:43,093][54818] Updated weights for policy 0, policy_version 511448 (0.0026) [2024-04-27 23:36:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.4, 300 sec: 59038.0). Total num frames: 8379629568. Throughput: 0: 59235.6. Samples: 1284769420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:36:44,254][54587] Avg episode reward: [(0, '0.688')] [2024-04-27 23:36:45,440][54818] Updated weights for policy 0, policy_version 511458 (0.0022) [2024-04-27 23:36:48,775][54818] Updated weights for policy 0, policy_version 511468 (0.0025) [2024-04-27 23:36:49,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8379908096. Throughput: 0: 59253.3. Samples: 1285124340. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:36:49,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 23:36:50,853][54818] Updated weights for policy 0, policy_version 511478 (0.0025) [2024-04-27 23:36:54,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8380219392. Throughput: 0: 59369.8. Samples: 1285479900. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:36:54,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 23:36:54,255][54818] Updated weights for policy 0, policy_version 511488 (0.0025) [2024-04-27 23:36:56,430][54818] Updated weights for policy 0, policy_version 511498 (0.0024) [2024-04-27 23:36:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 8380514304. Throughput: 0: 58908.0. Samples: 1285647200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:36:59,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-27 23:36:59,752][54818] Updated weights for policy 0, policy_version 511508 (0.0025) [2024-04-27 23:37:01,947][54818] Updated weights for policy 0, policy_version 511518 (0.0029) [2024-04-27 23:37:04,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8380792832. Throughput: 0: 58902.8. Samples: 1285999520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:04,253][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 23:37:05,295][54818] Updated weights for policy 0, policy_version 511528 (0.0025) [2024-04-27 23:37:06,108][54798] Signal inference workers to stop experience collection... (20100 times) [2024-04-27 23:37:06,143][54818] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-04-27 23:37:06,198][54798] Signal inference workers to resume experience collection... (20100 times) [2024-04-27 23:37:06,198][54818] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-04-27 23:37:07,468][54818] Updated weights for policy 0, policy_version 511538 (0.0024) [2024-04-27 23:37:09,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8381104128. Throughput: 0: 59290.7. Samples: 1286364220. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:09,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 23:37:10,950][54818] Updated weights for policy 0, policy_version 511548 (0.0024) [2024-04-27 23:37:12,965][54818] Updated weights for policy 0, policy_version 511558 (0.0022) [2024-04-27 23:37:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8381399040. Throughput: 0: 59020.9. Samples: 1286534560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:14,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 23:37:16,484][54818] Updated weights for policy 0, policy_version 511568 (0.0026) [2024-04-27 23:37:18,490][54818] Updated weights for policy 0, policy_version 511578 (0.0026) [2024-04-27 23:37:19,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 58926.8). Total num frames: 8381693952. Throughput: 0: 58951.9. Samples: 1286886520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:19,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 23:37:21,887][54818] Updated weights for policy 0, policy_version 511588 (0.0027) [2024-04-27 23:37:24,253][54587] Fps is (10 sec: 60619.9, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8382005248. Throughput: 0: 58855.0. Samples: 1287238400. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:24,254][54587] Avg episode reward: [(0, '0.697')] [2024-04-27 23:37:24,586][54818] Updated weights for policy 0, policy_version 511598 (0.0026) [2024-04-27 23:37:27,324][54818] Updated weights for policy 0, policy_version 511608 (0.0026) [2024-04-27 23:37:29,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8382300160. Throughput: 0: 58985.4. Samples: 1287423760. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:29,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 23:37:30,422][54818] Updated weights for policy 0, policy_version 511618 (0.0027) [2024-04-27 23:37:33,076][54818] Updated weights for policy 0, policy_version 511628 (0.0026) [2024-04-27 23:37:34,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8382595072. Throughput: 0: 58979.9. Samples: 1287778440. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:34,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:37:35,762][54818] Updated weights for policy 0, policy_version 511638 (0.0026) [2024-04-27 23:37:38,485][54818] Updated weights for policy 0, policy_version 511648 (0.0027) [2024-04-27 23:37:39,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8382906368. Throughput: 0: 58922.5. Samples: 1288131420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:39,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-27 23:37:41,188][54818] Updated weights for policy 0, policy_version 511658 (0.0025) [2024-04-27 23:37:43,941][54818] Updated weights for policy 0, policy_version 511668 (0.0025) [2024-04-27 23:37:44,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8383201280. Throughput: 0: 59169.6. Samples: 1288309840. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:44,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 23:37:46,711][54818] Updated weights for policy 0, policy_version 511678 (0.0027) [2024-04-27 23:37:49,063][54798] Signal inference workers to stop experience collection... (20150 times) [2024-04-27 23:37:49,112][54818] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-04-27 23:37:49,113][54798] Signal inference workers to resume experience collection... 
(20150 times) [2024-04-27 23:37:49,122][54818] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-04-27 23:37:49,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59528.6, 300 sec: 59037.9). Total num frames: 8383479808. Throughput: 0: 59319.9. Samples: 1288668920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:49,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-27 23:37:49,357][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000511688_8383496192.pth... [2024-04-27 23:37:49,360][54818] Updated weights for policy 0, policy_version 511688 (0.0026) [2024-04-27 23:37:49,411][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000510824_8369340416.pth [2024-04-27 23:37:52,238][54818] Updated weights for policy 0, policy_version 511698 (0.0024) [2024-04-27 23:37:54,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58982.2, 300 sec: 58982.4). Total num frames: 8383758336. Throughput: 0: 58843.6. Samples: 1289012180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-04-27 23:37:54,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 23:37:54,978][54818] Updated weights for policy 0, policy_version 511708 (0.0026) [2024-04-27 23:37:58,086][54818] Updated weights for policy 0, policy_version 511718 (0.0026) [2024-04-27 23:37:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8384069632. Throughput: 0: 59014.6. Samples: 1289190220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:37:59,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 23:38:00,527][54818] Updated weights for policy 0, policy_version 511728 (0.0024) [2024-04-27 23:38:03,621][54818] Updated weights for policy 0, policy_version 511738 (0.0026) [2024-04-27 23:38:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.2, 300 sec: 58926.8). Total num frames: 8384331776. Throughput: 0: 59155.0. Samples: 1289548500. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:04,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 23:38:05,970][54818] Updated weights for policy 0, policy_version 511748 (0.0021) [2024-04-27 23:38:09,157][54818] Updated weights for policy 0, policy_version 511758 (0.0026) [2024-04-27 23:38:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8384643072. Throughput: 0: 59240.5. Samples: 1289904220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:09,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 23:38:11,469][54818] Updated weights for policy 0, policy_version 511768 (0.0024) [2024-04-27 23:38:14,253][54587] Fps is (10 sec: 60621.9, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8384937984. Throughput: 0: 58904.9. Samples: 1290074480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:14,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 23:38:14,589][54818] Updated weights for policy 0, policy_version 511778 (0.0025) [2024-04-27 23:38:16,946][54818] Updated weights for policy 0, policy_version 511788 (0.0026) [2024-04-27 23:38:19,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8385249280. Throughput: 0: 58924.5. Samples: 1290430040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:19,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-27 23:38:20,248][54818] Updated weights for policy 0, policy_version 511798 (0.0025) [2024-04-27 23:38:22,543][54818] Updated weights for policy 0, policy_version 511808 (0.0026) [2024-04-27 23:38:24,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8385544192. Throughput: 0: 59129.0. Samples: 1290792220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:24,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-27 23:38:25,816][54818] Updated weights for policy 0, policy_version 511818 (0.0028) [2024-04-27 23:38:28,138][54818] Updated weights for policy 0, policy_version 511828 (0.0025) [2024-04-27 23:38:29,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8385822720. Throughput: 0: 59032.6. Samples: 1290966300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:29,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 23:38:31,180][54818] Updated weights for policy 0, policy_version 511838 (0.0026) [2024-04-27 23:38:33,638][54818] Updated weights for policy 0, policy_version 511848 (0.0025) [2024-04-27 23:38:34,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8386117632. Throughput: 0: 58733.3. Samples: 1291311920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:34,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 23:38:36,725][54818] Updated weights for policy 0, policy_version 511858 (0.0027) [2024-04-27 23:38:37,437][54798] Signal inference workers to stop experience collection... (20200 times) [2024-04-27 23:38:37,443][54798] Signal inference workers to resume experience collection... (20200 times) [2024-04-27 23:38:37,465][54818] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-04-27 23:38:37,465][54818] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-04-27 23:38:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58709.5, 300 sec: 58982.4). Total num frames: 8386428928. Throughput: 0: 59106.9. Samples: 1291671980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:39,253][54587] Avg episode reward: [(0, '0.700')] [2024-04-27 23:38:39,264][54818] Updated weights for policy 0, policy_version 511868 (0.0026) [2024-04-27 23:38:42,079][54818] Updated weights for policy 0, policy_version 511878 (0.0026) [2024-04-27 23:38:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8386723840. Throughput: 0: 59249.8. Samples: 1291856460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:44,253][54587] Avg episode reward: [(0, '0.686')] [2024-04-27 23:38:44,951][54818] Updated weights for policy 0, policy_version 511888 (0.0024) [2024-04-27 23:38:47,449][54818] Updated weights for policy 0, policy_version 511898 (0.0025) [2024-04-27 23:38:49,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8387018752. Throughput: 0: 59056.6. Samples: 1292206040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:49,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 23:38:49,261][54587] No heartbeat for components: RolloutWorker_w4 (8077 seconds) [2024-04-27 23:38:50,741][54818] Updated weights for policy 0, policy_version 511908 (0.0026) [2024-04-27 23:38:53,117][54818] Updated weights for policy 0, policy_version 511918 (0.0030) [2024-04-27 23:38:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8387330048. Throughput: 0: 58944.9. Samples: 1292556740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:54,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 23:38:56,361][54818] Updated weights for policy 0, policy_version 511928 (0.0025) [2024-04-27 23:38:58,582][54818] Updated weights for policy 0, policy_version 511938 (0.0026) [2024-04-27 23:38:59,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8387624960. Throughput: 0: 59344.8. Samples: 1292745000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:38:59,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 23:39:01,780][54818] Updated weights for policy 0, policy_version 511948 (0.0023) [2024-04-27 23:39:04,016][54818] Updated weights for policy 0, policy_version 511958 (0.0026) [2024-04-27 23:39:04,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59801.7, 300 sec: 59149.0). Total num frames: 8387919872. Throughput: 0: 59369.3. Samples: 1293101660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:39:04,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 23:39:07,207][54818] Updated weights for policy 0, policy_version 511968 (0.0026) [2024-04-27 23:39:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59801.6, 300 sec: 59149.0). Total num frames: 8388231168. Throughput: 0: 59187.5. Samples: 1293455660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:39:09,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 23:39:09,488][54818] Updated weights for policy 0, policy_version 511978 (0.0025) [2024-04-27 23:39:12,749][54818] Updated weights for policy 0, policy_version 511988 (0.0025) [2024-04-27 23:39:14,253][54587] Fps is (10 sec: 57344.6, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8388493312. Throughput: 0: 59281.8. Samples: 1293633980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:39:14,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-27 23:39:15,001][54818] Updated weights for policy 0, policy_version 511998 (0.0026) [2024-04-27 23:39:18,153][54818] Updated weights for policy 0, policy_version 512008 (0.0026) [2024-04-27 23:39:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8388804608. Throughput: 0: 59557.0. Samples: 1293991980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-27 23:39:19,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 23:39:20,688][54818] Updated weights for policy 0, policy_version 512018 (0.0024) [2024-04-27 23:39:23,719][54818] Updated weights for policy 0, policy_version 512028 (0.0024) [2024-04-27 23:39:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8389083136. Throughput: 0: 59423.4. Samples: 1294346040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:39:24,254][54587] Avg episode reward: [(0, '0.692')] [2024-04-27 23:39:26,181][54818] Updated weights for policy 0, policy_version 512038 (0.0026) [2024-04-27 23:39:29,253][54587] Fps is (10 sec: 57343.2, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8389378048. Throughput: 0: 59310.5. Samples: 1294525440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:39:29,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 23:39:29,520][54818] Updated weights for policy 0, policy_version 512048 (0.0025) [2024-04-27 23:39:31,882][54818] Updated weights for policy 0, policy_version 512058 (0.0025) [2024-04-27 23:39:34,059][54798] Signal inference workers to stop experience collection... (20250 times) [2024-04-27 23:39:34,060][54798] Signal inference workers to resume experience collection... (20250 times) [2024-04-27 23:39:34,086][54818] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-04-27 23:39:34,086][54818] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-04-27 23:39:34,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59255.5, 300 sec: 59037.9). Total num frames: 8389672960. Throughput: 0: 59359.2. Samples: 1294877200. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:39:34,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 23:39:35,008][54818] Updated weights for policy 0, policy_version 512068 (0.0026) [2024-04-27 23:39:37,536][54818] Updated weights for policy 0, policy_version 512078 (0.0026) [2024-04-27 23:39:39,253][54587] Fps is (10 sec: 57345.0, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8389951488. Throughput: 0: 59389.8. Samples: 1295229280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:39:39,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 23:39:40,545][54818] Updated weights for policy 0, policy_version 512088 (0.0024) [2024-04-27 23:39:43,017][54818] Updated weights for policy 0, policy_version 512098 (0.0025) [2024-04-27 23:39:44,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8390262784. Throughput: 0: 59020.4. Samples: 1295400920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:39:44,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-27 23:39:45,903][54818] Updated weights for policy 0, policy_version 512108 (0.0025) [2024-04-27 23:39:48,559][54818] Updated weights for policy 0, policy_version 512118 (0.0027) [2024-04-27 23:39:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8390574080. Throughput: 0: 59097.1. Samples: 1295761020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:39:49,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-27 23:39:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000512120_8390574080.pth... 
[2024-04-27 23:39:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000511255_8376401920.pth [2024-04-27 23:39:51,467][54818] Updated weights for policy 0, policy_version 512128 (0.0026) [2024-04-27 23:39:54,135][54818] Updated weights for policy 0, policy_version 512138 (0.0026) [2024-04-27 23:39:54,253][54587] Fps is (10 sec: 60621.6, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8390868992. Throughput: 0: 59245.8. Samples: 1296121720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:39:54,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 23:39:56,969][54818] Updated weights for policy 0, policy_version 512148 (0.0027) [2024-04-27 23:39:59,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.4, 300 sec: 59093.4). Total num frames: 8391163904. Throughput: 0: 59186.5. Samples: 1296297380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:39:59,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 23:39:59,712][54818] Updated weights for policy 0, policy_version 512158 (0.0025) [2024-04-27 23:40:02,410][54818] Updated weights for policy 0, policy_version 512168 (0.0027) [2024-04-27 23:40:04,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8391442432. Throughput: 0: 58966.2. Samples: 1296645460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:04,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 23:40:05,301][54818] Updated weights for policy 0, policy_version 512178 (0.0026) [2024-04-27 23:40:07,879][54818] Updated weights for policy 0, policy_version 512188 (0.0024) [2024-04-27 23:40:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.3, 300 sec: 59204.6). Total num frames: 8391770112. Throughput: 0: 58944.9. Samples: 1296998560. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:09,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 23:40:10,873][54818] Updated weights for policy 0, policy_version 512198 (0.0026) [2024-04-27 23:40:13,417][54818] Updated weights for policy 0, policy_version 512208 (0.0026) [2024-04-27 23:40:14,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8392065024. Throughput: 0: 59052.2. Samples: 1297182780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:14,253][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 23:40:16,618][54818] Updated weights for policy 0, policy_version 512218 (0.0026) [2024-04-27 23:40:18,895][54818] Updated weights for policy 0, policy_version 512228 (0.0026) [2024-04-27 23:40:19,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59255.3, 300 sec: 59204.5). Total num frames: 8392359936. Throughput: 0: 59033.1. Samples: 1297533700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:19,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-27 23:40:22,048][54818] Updated weights for policy 0, policy_version 512238 (0.0026) [2024-04-27 23:40:24,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8392654848. Throughput: 0: 59067.9. Samples: 1297887340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:24,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 23:40:24,318][54818] Updated weights for policy 0, policy_version 512248 (0.0026) [2024-04-27 23:40:27,530][54818] Updated weights for policy 0, policy_version 512258 (0.0026) [2024-04-27 23:40:29,253][54587] Fps is (10 sec: 57344.9, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8392933376. Throughput: 0: 59226.3. Samples: 1298066100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:29,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 23:40:29,497][54798] Signal inference workers to stop experience collection... 
(20300 times) [2024-04-27 23:40:29,497][54798] Signal inference workers to resume experience collection... (20300 times) [2024-04-27 23:40:29,511][54818] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-04-27 23:40:29,511][54818] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-04-27 23:40:29,860][54818] Updated weights for policy 0, policy_version 512268 (0.0026) [2024-04-27 23:40:33,167][54818] Updated weights for policy 0, policy_version 512278 (0.0026) [2024-04-27 23:40:34,253][54587] Fps is (10 sec: 57343.4, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8393228288. Throughput: 0: 59148.7. Samples: 1298422720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:34,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-27 23:40:35,359][54818] Updated weights for policy 0, policy_version 512288 (0.0026) [2024-04-27 23:40:38,778][54818] Updated weights for policy 0, policy_version 512298 (0.0026) [2024-04-27 23:40:39,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59528.5, 300 sec: 59038.0). Total num frames: 8393523200. Throughput: 0: 58890.6. Samples: 1298771800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:39,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:40:40,924][54818] Updated weights for policy 0, policy_version 512308 (0.0025) [2024-04-27 23:40:44,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8393801728. Throughput: 0: 58901.9. Samples: 1298947960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:44,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 23:40:44,270][54818] Updated weights for policy 0, policy_version 512318 (0.0026) [2024-04-27 23:40:46,448][54818] Updated weights for policy 0, policy_version 512328 (0.0026) [2024-04-27 23:40:49,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8394096640. Throughput: 0: 59013.8. 
Samples: 1299301080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-27 23:40:49,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-27 23:40:49,770][54818] Updated weights for policy 0, policy_version 512338 (0.0023) [2024-04-27 23:40:52,373][54818] Updated weights for policy 0, policy_version 512348 (0.0026) [2024-04-27 23:40:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58436.2, 300 sec: 59038.0). Total num frames: 8394375168. Throughput: 0: 59185.9. Samples: 1299661920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:40:54,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 23:40:55,174][54818] Updated weights for policy 0, policy_version 512358 (0.0025) [2024-04-27 23:40:58,312][54818] Updated weights for policy 0, policy_version 512368 (0.0023) [2024-04-27 23:40:59,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58436.4, 300 sec: 59038.0). Total num frames: 8394670080. Throughput: 0: 58660.4. Samples: 1299822500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:40:59,253][54587] Avg episode reward: [(0, '0.558')] [2024-04-27 23:41:01,069][54818] Updated weights for policy 0, policy_version 512378 (0.0022) [2024-04-27 23:41:03,863][54818] Updated weights for policy 0, policy_version 512388 (0.0021) [2024-04-27 23:41:04,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8394964992. Throughput: 0: 58533.6. Samples: 1300167700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:04,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 23:41:06,530][54818] Updated weights for policy 0, policy_version 512398 (0.0026) [2024-04-27 23:41:09,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8395276288. Throughput: 0: 58768.3. Samples: 1300531920. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:09,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 23:41:09,600][54818] Updated weights for policy 0, policy_version 512408 (0.0026) [2024-04-27 23:41:12,048][54818] Updated weights for policy 0, policy_version 512418 (0.0026) [2024-04-27 23:41:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58436.2, 300 sec: 58982.4). Total num frames: 8395571200. Throughput: 0: 58830.7. Samples: 1300713480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:14,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 23:41:15,065][54818] Updated weights for policy 0, policy_version 512428 (0.0026) [2024-04-27 23:41:16,331][54798] Signal inference workers to stop experience collection... (20350 times) [2024-04-27 23:41:16,332][54798] Signal inference workers to resume experience collection... (20350 times) [2024-04-27 23:41:16,341][54818] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-04-27 23:41:16,359][54818] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-04-27 23:41:17,550][54818] Updated weights for policy 0, policy_version 512438 (0.0029) [2024-04-27 23:41:19,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58709.5, 300 sec: 59038.0). Total num frames: 8395882496. Throughput: 0: 58677.5. Samples: 1301063200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:19,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-27 23:41:20,567][54818] Updated weights for policy 0, policy_version 512448 (0.0024) [2024-04-27 23:41:23,158][54818] Updated weights for policy 0, policy_version 512458 (0.0026) [2024-04-27 23:41:24,253][54587] Fps is (10 sec: 62259.2, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8396193792. Throughput: 0: 58630.7. Samples: 1301410180. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:24,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 23:41:26,415][54818] Updated weights for policy 0, policy_version 512468 (0.0026) [2024-04-27 23:41:28,574][54818] Updated weights for policy 0, policy_version 512478 (0.0026) [2024-04-27 23:41:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8396488704. Throughput: 0: 58922.3. Samples: 1301599460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:29,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 23:41:31,871][54818] Updated weights for policy 0, policy_version 512488 (0.0024) [2024-04-27 23:41:34,086][54818] Updated weights for policy 0, policy_version 512498 (0.0026) [2024-04-27 23:41:34,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8396783616. Throughput: 0: 58826.7. Samples: 1301948280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:34,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-27 23:41:37,490][54818] Updated weights for policy 0, policy_version 512508 (0.0025) [2024-04-27 23:41:39,253][54587] Fps is (10 sec: 55705.0, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8397045760. Throughput: 0: 58680.0. Samples: 1302302520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:39,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 23:41:39,609][54818] Updated weights for policy 0, policy_version 512518 (0.0023) [2024-04-27 23:41:42,949][54818] Updated weights for policy 0, policy_version 512528 (0.0024) [2024-04-27 23:41:44,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8397340672. Throughput: 0: 59074.1. Samples: 1302480840. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:44,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 23:41:45,200][54818] Updated weights for policy 0, policy_version 512538 (0.0027) [2024-04-27 23:41:48,266][54818] Updated weights for policy 0, policy_version 512548 (0.0025) [2024-04-27 23:41:49,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8397651968. Throughput: 0: 59368.7. Samples: 1302839300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:49,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 23:41:49,265][54587] No heartbeat for components: RolloutWorker_w4 (8257 seconds) [2024-04-27 23:41:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000512552_8397651968.pth... [2024-04-27 23:41:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000511688_8383496192.pth [2024-04-27 23:41:50,710][54818] Updated weights for policy 0, policy_version 512558 (0.0026) [2024-04-27 23:41:53,755][54818] Updated weights for policy 0, policy_version 512568 (0.0025) [2024-04-27 23:41:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8397946880. Throughput: 0: 59016.5. Samples: 1303187660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 23:41:55,455][54798] Signal inference workers to stop experience collection... (20400 times) [2024-04-27 23:41:55,455][54798] Signal inference workers to resume experience collection... 
(20400 times) [2024-04-27 23:41:55,479][54818] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-04-27 23:41:55,479][54818] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-04-27 23:41:56,121][54818] Updated weights for policy 0, policy_version 512578 (0.0024) [2024-04-27 23:41:59,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8398225408. Throughput: 0: 58705.2. Samples: 1303355220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:41:59,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 23:41:59,307][54818] Updated weights for policy 0, policy_version 512588 (0.0025) [2024-04-27 23:42:01,647][54818] Updated weights for policy 0, policy_version 512598 (0.0025) [2024-04-27 23:42:04,253][54587] Fps is (10 sec: 57344.8, 60 sec: 59255.5, 300 sec: 59038.0). Total num frames: 8398520320. Throughput: 0: 58878.7. Samples: 1303712740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:42:04,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 23:42:04,775][54818] Updated weights for policy 0, policy_version 512608 (0.0025) [2024-04-27 23:42:07,318][54818] Updated weights for policy 0, policy_version 512618 (0.0022) [2024-04-27 23:42:09,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8398798848. Throughput: 0: 59193.6. Samples: 1304073900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:42:09,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 23:42:10,211][54818] Updated weights for policy 0, policy_version 512628 (0.0027) [2024-04-27 23:42:12,739][54818] Updated weights for policy 0, policy_version 512638 (0.0026) [2024-04-27 23:42:14,253][54587] Fps is (10 sec: 55705.0, 60 sec: 58436.2, 300 sec: 58926.9). Total num frames: 8399077376. Throughput: 0: 58816.7. Samples: 1304246220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:42:14,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 23:42:15,852][54818] Updated weights for policy 0, policy_version 512648 (0.0030) [2024-04-27 23:42:18,611][54818] Updated weights for policy 0, policy_version 512658 (0.0026) [2024-04-27 23:42:19,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58436.1, 300 sec: 58926.9). Total num frames: 8399388672. Throughput: 0: 58868.8. Samples: 1304597380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:42:19,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 23:42:21,265][54818] Updated weights for policy 0, policy_version 512668 (0.0027) [2024-04-27 23:42:24,253][54587] Fps is (10 sec: 62258.6, 60 sec: 58436.1, 300 sec: 58982.4). Total num frames: 8399699968. Throughput: 0: 58686.5. Samples: 1304943420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:42:24,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:42:24,451][54818] Updated weights for policy 0, policy_version 512678 (0.0027) [2024-04-27 23:42:26,939][54818] Updated weights for policy 0, policy_version 512688 (0.0026) [2024-04-27 23:42:29,253][54587] Fps is (10 sec: 63898.5, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8400027648. Throughput: 0: 58704.1. Samples: 1305122520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:42:29,253][54587] Avg episode reward: [(0, '0.591')] [2024-04-27 23:42:30,276][54818] Updated weights for policy 0, policy_version 512698 (0.0022) [2024-04-27 23:42:32,584][54818] Updated weights for policy 0, policy_version 512708 (0.0028) [2024-04-27 23:42:34,119][54798] Signal inference workers to stop experience collection... (20450 times) [2024-04-27 23:42:34,119][54798] Signal inference workers to resume experience collection... 
(20450 times) [2024-04-27 23:42:34,142][54818] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-04-27 23:42:34,142][54818] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-04-27 23:42:34,253][54587] Fps is (10 sec: 62259.7, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8400322560. Throughput: 0: 58710.3. Samples: 1305481260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:42:34,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 23:42:35,857][54818] Updated weights for policy 0, policy_version 512718 (0.0026) [2024-04-27 23:42:38,284][54818] Updated weights for policy 0, policy_version 512728 (0.0024) [2024-04-27 23:42:39,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8400601088. Throughput: 0: 59006.8. Samples: 1305842960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:42:39,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-27 23:42:41,239][54818] Updated weights for policy 0, policy_version 512738 (0.0026) [2024-04-27 23:42:43,887][54818] Updated weights for policy 0, policy_version 512748 (0.0026) [2024-04-27 23:42:44,253][54587] Fps is (10 sec: 54067.2, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8400863232. Throughput: 0: 59120.0. Samples: 1306015620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:42:44,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 23:42:46,699][54818] Updated weights for policy 0, policy_version 512758 (0.0026) [2024-04-27 23:42:49,253][54587] Fps is (10 sec: 57343.2, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8401174528. Throughput: 0: 58942.9. Samples: 1306365180. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:42:49,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 23:42:49,535][54818] Updated weights for policy 0, policy_version 512768 (0.0026) [2024-04-27 23:42:52,342][54818] Updated weights for policy 0, policy_version 512778 (0.0025) [2024-04-27 23:42:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8401469440. Throughput: 0: 58858.8. Samples: 1306722540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:42:54,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 23:42:54,992][54818] Updated weights for policy 0, policy_version 512788 (0.0026) [2024-04-27 23:42:57,734][54818] Updated weights for policy 0, policy_version 512798 (0.0025) [2024-04-27 23:42:59,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8401764352. Throughput: 0: 59005.9. Samples: 1306901480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:42:59,253][54587] Avg episode reward: [(0, '0.673')] [2024-04-27 23:43:00,461][54818] Updated weights for policy 0, policy_version 512808 (0.0026) [2024-04-27 23:43:03,220][54818] Updated weights for policy 0, policy_version 512818 (0.0024) [2024-04-27 23:43:04,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8402059264. Throughput: 0: 59127.3. Samples: 1307258100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:43:04,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 23:43:05,962][54818] Updated weights for policy 0, policy_version 512828 (0.0021) [2024-04-27 23:43:08,692][54818] Updated weights for policy 0, policy_version 512838 (0.0025) [2024-04-27 23:43:09,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 8402354176. Throughput: 0: 59144.6. Samples: 1307604920. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:43:09,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 23:43:11,433][54818] Updated weights for policy 0, policy_version 512848 (0.0025) [2024-04-27 23:43:14,137][54818] Updated weights for policy 0, policy_version 512858 (0.0024) [2024-04-27 23:43:14,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59801.6, 300 sec: 59037.9). Total num frames: 8402665472. Throughput: 0: 59006.5. Samples: 1307777820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:43:14,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 23:43:14,777][54798] Signal inference workers to stop experience collection... (20500 times) [2024-04-27 23:43:14,778][54798] Signal inference workers to resume experience collection... (20500 times) [2024-04-27 23:43:14,792][54818] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-04-27 23:43:14,792][54818] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-04-27 23:43:16,891][54818] Updated weights for policy 0, policy_version 512868 (0.0027) [2024-04-27 23:43:19,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8402944000. Throughput: 0: 58982.7. Samples: 1308135480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:43:19,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 23:43:19,656][54818] Updated weights for policy 0, policy_version 512878 (0.0026) [2024-04-27 23:43:22,370][54818] Updated weights for policy 0, policy_version 512888 (0.0024) [2024-04-27 23:43:24,253][54587] Fps is (10 sec: 55706.1, 60 sec: 58709.5, 300 sec: 58982.4). Total num frames: 8403222528. Throughput: 0: 59103.1. Samples: 1308502600. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:43:24,253][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 23:43:25,579][54818] Updated weights for policy 0, policy_version 512898 (0.0026) [2024-04-27 23:43:28,019][54818] Updated weights for policy 0, policy_version 512908 (0.0028) [2024-04-27 23:43:29,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58163.1, 300 sec: 58982.4). Total num frames: 8403517440. Throughput: 0: 58999.5. Samples: 1308670600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:43:29,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 23:43:31,135][54818] Updated weights for policy 0, policy_version 512918 (0.0026) [2024-04-27 23:43:33,410][54818] Updated weights for policy 0, policy_version 512928 (0.0026) [2024-04-27 23:43:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8403828736. Throughput: 0: 59171.7. Samples: 1309027900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:43:34,254][54587] Avg episode reward: [(0, '0.491')] [2024-04-27 23:43:36,563][54818] Updated weights for policy 0, policy_version 512938 (0.0025) [2024-04-27 23:43:38,882][54818] Updated weights for policy 0, policy_version 512948 (0.0027) [2024-04-27 23:43:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8404140032. Throughput: 0: 58864.4. Samples: 1309371440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:43:39,254][54587] Avg episode reward: [(0, '0.503')] [2024-04-27 23:43:41,957][54818] Updated weights for policy 0, policy_version 512958 (0.0028) [2024-04-27 23:43:43,664][54798] Signal inference workers to stop experience collection... (20550 times) [2024-04-27 23:43:43,707][54818] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-04-27 23:43:43,719][54798] Signal inference workers to resume experience collection... 
(20550 times) [2024-04-27 23:43:43,724][54818] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-04-27 23:43:44,253][54587] Fps is (10 sec: 62259.0, 60 sec: 59801.6, 300 sec: 59093.5). Total num frames: 8404451328. Throughput: 0: 58988.8. Samples: 1309555980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-27 23:43:44,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-27 23:43:44,629][54818] Updated weights for policy 0, policy_version 512968 (0.0026) [2024-04-27 23:43:47,509][54818] Updated weights for policy 0, policy_version 512978 (0.0024) [2024-04-27 23:43:49,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.4, 300 sec: 58926.8). Total num frames: 8404713472. Throughput: 0: 58696.3. Samples: 1309899440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:43:49,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-27 23:43:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000512983_8404713472.pth... [2024-04-27 23:43:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000512120_8390574080.pth [2024-04-27 23:43:50,871][54818] Updated weights for policy 0, policy_version 512988 (0.0025) [2024-04-27 23:43:53,055][54818] Updated weights for policy 0, policy_version 512998 (0.0026) [2024-04-27 23:43:54,253][54587] Fps is (10 sec: 54067.7, 60 sec: 58709.4, 300 sec: 58871.4). Total num frames: 8404992000. Throughput: 0: 59007.2. Samples: 1310260240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:43:54,253][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 23:43:56,339][54818] Updated weights for policy 0, policy_version 513008 (0.0027) [2024-04-27 23:43:59,159][54818] Updated weights for policy 0, policy_version 513018 (0.0024) [2024-04-27 23:43:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.2, 300 sec: 58871.3). Total num frames: 8405286912. Throughput: 0: 59225.8. Samples: 1310442980. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:43:59,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 23:44:02,067][54818] Updated weights for policy 0, policy_version 513028 (0.0025) [2024-04-27 23:44:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8405598208. Throughput: 0: 58862.2. Samples: 1310784280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:04,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-27 23:44:04,774][54818] Updated weights for policy 0, policy_version 513038 (0.0029) [2024-04-27 23:44:07,513][54818] Updated weights for policy 0, policy_version 513048 (0.0027) [2024-04-27 23:44:09,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8405893120. Throughput: 0: 58484.7. Samples: 1311134420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:09,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 23:44:10,245][54818] Updated weights for policy 0, policy_version 513058 (0.0026) [2024-04-27 23:44:13,051][54818] Updated weights for policy 0, policy_version 513068 (0.0024) [2024-04-27 23:44:14,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8406188032. Throughput: 0: 58776.5. Samples: 1311315540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:14,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:44:15,739][54818] Updated weights for policy 0, policy_version 513078 (0.0026) [2024-04-27 23:44:18,594][54818] Updated weights for policy 0, policy_version 513088 (0.0026) [2024-04-27 23:44:19,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8406482944. Throughput: 0: 58712.8. Samples: 1311669980. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:19,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-27 23:44:21,136][54818] Updated weights for policy 0, policy_version 513098 (0.0027) [2024-04-27 23:44:21,671][54798] Signal inference workers to stop experience collection... (20600 times) [2024-04-27 23:44:21,671][54798] Signal inference workers to resume experience collection... (20600 times) [2024-04-27 23:44:21,700][54818] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-04-27 23:44:21,700][54818] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-04-27 23:44:24,080][54818] Updated weights for policy 0, policy_version 513108 (0.0027) [2024-04-27 23:44:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.3, 300 sec: 58926.9). Total num frames: 8406761472. Throughput: 0: 59132.4. Samples: 1312032400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:24,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 23:44:26,748][54818] Updated weights for policy 0, policy_version 513118 (0.0026) [2024-04-27 23:44:29,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58982.4, 300 sec: 58926.8). Total num frames: 8407056384. Throughput: 0: 58791.4. Samples: 1312201600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:29,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 23:44:29,664][54818] Updated weights for policy 0, policy_version 513128 (0.0023) [2024-04-27 23:44:32,414][54818] Updated weights for policy 0, policy_version 513138 (0.0025) [2024-04-27 23:44:34,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58436.3, 300 sec: 58926.9). Total num frames: 8407334912. Throughput: 0: 58940.6. Samples: 1312551760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:34,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 23:44:35,228][54818] Updated weights for policy 0, policy_version 513148 (0.0026) [2024-04-27 23:44:37,833][54818] Updated weights for policy 0, policy_version 513158 (0.0022) [2024-04-27 23:44:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58436.2, 300 sec: 58926.9). Total num frames: 8407646208. Throughput: 0: 58766.4. Samples: 1312904740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:39,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-27 23:44:40,632][54818] Updated weights for policy 0, policy_version 513168 (0.0025) [2024-04-27 23:44:43,259][54818] Updated weights for policy 0, policy_version 513178 (0.0024) [2024-04-27 23:44:44,253][54587] Fps is (10 sec: 62258.5, 60 sec: 58436.2, 300 sec: 58926.8). Total num frames: 8407957504. Throughput: 0: 58565.3. Samples: 1313078420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:44,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 23:44:46,234][54818] Updated weights for policy 0, policy_version 513188 (0.0026) [2024-04-27 23:44:48,785][54818] Updated weights for policy 0, policy_version 513198 (0.0028) [2024-04-27 23:44:49,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8408252416. Throughput: 0: 58985.0. Samples: 1313438600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:49,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 23:44:49,260][54587] No heartbeat for components: RolloutWorker_w4 (8437 seconds) [2024-04-27 23:44:51,719][54818] Updated weights for policy 0, policy_version 513208 (0.0026) [2024-04-27 23:44:54,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.3, 300 sec: 58926.9). Total num frames: 8408547328. Throughput: 0: 58936.0. Samples: 1313786540. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:54,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 23:44:54,284][54818] Updated weights for policy 0, policy_version 513218 (0.0026) [2024-04-27 23:44:57,231][54818] Updated weights for policy 0, policy_version 513228 (0.0025) [2024-04-27 23:44:59,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8408842240. Throughput: 0: 58915.9. Samples: 1313966760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:44:59,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 23:44:59,749][54818] Updated weights for policy 0, policy_version 513238 (0.0025) [2024-04-27 23:45:02,601][54818] Updated weights for policy 0, policy_version 513248 (0.0026) [2024-04-27 23:45:04,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 8409137152. Throughput: 0: 58787.6. Samples: 1314315420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:45:04,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-27 23:45:05,337][54818] Updated weights for policy 0, policy_version 513258 (0.0026) [2024-04-27 23:45:08,152][54818] Updated weights for policy 0, policy_version 513268 (0.0024) [2024-04-27 23:45:09,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58709.5, 300 sec: 58815.8). Total num frames: 8409415680. Throughput: 0: 58710.3. Samples: 1314674360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:45:09,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 23:45:09,951][54798] Signal inference workers to stop experience collection... (20650 times) [2024-04-27 23:45:09,997][54818] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-04-27 23:45:10,006][54798] Signal inference workers to resume experience collection... 
(20650 times) [2024-04-27 23:45:10,013][54818] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-04-27 23:45:11,199][54818] Updated weights for policy 0, policy_version 513278 (0.0026) [2024-04-27 23:45:13,750][54818] Updated weights for policy 0, policy_version 513288 (0.0025) [2024-04-27 23:45:14,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 58815.8). Total num frames: 8409710592. Throughput: 0: 59085.0. Samples: 1314860420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-27 23:45:14,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 23:45:16,521][54818] Updated weights for policy 0, policy_version 513298 (0.0027) [2024-04-27 23:45:19,254][54587] Fps is (10 sec: 60619.2, 60 sec: 58982.2, 300 sec: 58871.3). Total num frames: 8410021888. Throughput: 0: 59174.3. Samples: 1315214620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:45:19,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-27 23:45:19,385][54818] Updated weights for policy 0, policy_version 513308 (0.0027) [2024-04-27 23:45:22,402][54818] Updated weights for policy 0, policy_version 513318 (0.0026) [2024-04-27 23:45:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.5, 300 sec: 58926.9). Total num frames: 8410316800. Throughput: 0: 59167.3. Samples: 1315567260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:45:24,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 23:45:24,900][54818] Updated weights for policy 0, policy_version 513328 (0.0025) [2024-04-27 23:45:28,079][54818] Updated weights for policy 0, policy_version 513338 (0.0027) [2024-04-27 23:45:29,253][54587] Fps is (10 sec: 60621.8, 60 sec: 59528.6, 300 sec: 58982.4). Total num frames: 8410628096. Throughput: 0: 59038.7. Samples: 1315735160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:45:29,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 23:45:30,714][54818] Updated weights for policy 0, policy_version 513348 (0.0024) [2024-04-27 23:45:33,541][54818] Updated weights for policy 0, policy_version 513358 (0.0025) [2024-04-27 23:45:34,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59528.4, 300 sec: 58926.8). Total num frames: 8410906624. Throughput: 0: 59073.2. Samples: 1316096900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:45:34,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-27 23:45:36,115][54818] Updated weights for policy 0, policy_version 513368 (0.0026) [2024-04-27 23:45:39,118][54818] Updated weights for policy 0, policy_version 513378 (0.0024) [2024-04-27 23:45:39,254][54587] Fps is (10 sec: 55704.6, 60 sec: 58982.3, 300 sec: 58926.8). Total num frames: 8411185152. Throughput: 0: 59298.9. Samples: 1316455000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:45:39,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-27 23:45:41,736][54818] Updated weights for policy 0, policy_version 513388 (0.0025) [2024-04-27 23:45:44,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8411496448. Throughput: 0: 59254.7. Samples: 1316633220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:45:44,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 23:45:44,599][54818] Updated weights for policy 0, policy_version 513398 (0.0026) [2024-04-27 23:45:47,045][54818] Updated weights for policy 0, policy_version 513408 (0.0025) [2024-04-27 23:45:49,253][54587] Fps is (10 sec: 58984.0, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8411774976. Throughput: 0: 59287.2. Samples: 1316983340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:45:49,253][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 23:45:49,297][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000513415_8411791360.pth... [2024-04-27 23:45:49,348][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000512552_8397651968.pth [2024-04-27 23:45:49,995][54818] Updated weights for policy 0, policy_version 513418 (0.0023) [2024-04-27 23:45:52,555][54818] Updated weights for policy 0, policy_version 513428 (0.0026) [2024-04-27 23:45:54,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8412086272. Throughput: 0: 59098.5. Samples: 1317333800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:45:54,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-27 23:45:55,602][54818] Updated weights for policy 0, policy_version 513438 (0.0025) [2024-04-27 23:45:58,214][54818] Updated weights for policy 0, policy_version 513448 (0.0024) [2024-04-27 23:45:59,253][54587] Fps is (10 sec: 62258.3, 60 sec: 59255.4, 300 sec: 59093.4). Total num frames: 8412397568. Throughput: 0: 58908.8. Samples: 1317511320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:45:59,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:46:01,069][54818] Updated weights for policy 0, policy_version 513458 (0.0025) [2024-04-27 23:46:01,259][54798] Signal inference workers to stop experience collection... (20700 times) [2024-04-27 23:46:01,260][54798] Signal inference workers to resume experience collection... 
(20700 times) [2024-04-27 23:46:01,273][54818] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-04-27 23:46:01,273][54818] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-04-27 23:46:03,653][54818] Updated weights for policy 0, policy_version 513468 (0.0026) [2024-04-27 23:46:04,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.4, 300 sec: 59038.0). Total num frames: 8412692480. Throughput: 0: 59151.8. Samples: 1317876440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:46:04,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 23:46:06,493][54818] Updated weights for policy 0, policy_version 513478 (0.0026) [2024-04-27 23:46:09,170][54818] Updated weights for policy 0, policy_version 513488 (0.0026) [2024-04-27 23:46:09,253][54587] Fps is (10 sec: 58983.4, 60 sec: 59528.5, 300 sec: 59037.9). Total num frames: 8412987392. Throughput: 0: 59129.9. Samples: 1318228100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:46:09,253][54587] Avg episode reward: [(0, '0.697')] [2024-04-27 23:46:11,814][54818] Updated weights for policy 0, policy_version 513498 (0.0025) [2024-04-27 23:46:14,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59528.6, 300 sec: 58982.4). Total num frames: 8413282304. Throughput: 0: 59395.3. Samples: 1318407940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:46:14,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 23:46:14,677][54818] Updated weights for policy 0, policy_version 513508 (0.0026) [2024-04-27 23:46:17,351][54818] Updated weights for policy 0, policy_version 513518 (0.0026) [2024-04-27 23:46:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.7, 300 sec: 58871.3). Total num frames: 8413560832. Throughput: 0: 59074.0. Samples: 1318755220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:46:19,253][54587] Avg episode reward: [(0, '0.561')] [2024-04-27 23:46:20,229][54818] Updated weights for policy 0, policy_version 513528 (0.0024) [2024-04-27 23:46:22,944][54818] Updated weights for policy 0, policy_version 513538 (0.0025) [2024-04-27 23:46:24,253][54587] Fps is (10 sec: 57343.1, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8413855744. Throughput: 0: 59178.9. Samples: 1319118040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:46:24,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 23:46:25,645][54818] Updated weights for policy 0, policy_version 513548 (0.0026) [2024-04-27 23:46:28,365][54818] Updated weights for policy 0, policy_version 513558 (0.0024) [2024-04-27 23:46:29,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58709.4, 300 sec: 58871.3). Total num frames: 8414150656. Throughput: 0: 59104.0. Samples: 1319292900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:46:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-27 23:46:31,298][54818] Updated weights for policy 0, policy_version 513568 (0.0024) [2024-04-27 23:46:34,144][54818] Updated weights for policy 0, policy_version 513578 (0.0025) [2024-04-27 23:46:34,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 8414461952. Throughput: 0: 59173.7. Samples: 1319646160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:46:34,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 23:46:36,865][54818] Updated weights for policy 0, policy_version 513588 (0.0026) [2024-04-27 23:46:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59528.8, 300 sec: 59038.0). Total num frames: 8414756864. Throughput: 0: 59267.7. Samples: 1320000840. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-27 23:46:39,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-27 23:46:40,020][54818] Updated weights for policy 0, policy_version 513598 (0.0025) [2024-04-27 23:46:42,452][54818] Updated weights for policy 0, policy_version 513608 (0.0025) [2024-04-27 23:46:44,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58982.3, 300 sec: 58926.9). Total num frames: 8415035392. Throughput: 0: 59104.9. Samples: 1320171040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:46:44,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 23:46:45,483][54818] Updated weights for policy 0, policy_version 513618 (0.0026) [2024-04-27 23:46:47,335][54798] Signal inference workers to stop experience collection... (20750 times) [2024-04-27 23:46:47,380][54818] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-04-27 23:46:47,381][54798] Signal inference workers to resume experience collection... (20750 times) [2024-04-27 23:46:47,394][54818] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-04-27 23:46:47,757][54818] Updated weights for policy 0, policy_version 513628 (0.0024) [2024-04-27 23:46:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.5, 300 sec: 58926.9). Total num frames: 8415330304. Throughput: 0: 58924.1. Samples: 1320528020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:46:49,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 23:46:50,995][54818] Updated weights for policy 0, policy_version 513638 (0.0026) [2024-04-27 23:46:53,645][54818] Updated weights for policy 0, policy_version 513648 (0.0026) [2024-04-27 23:46:54,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8415625216. Throughput: 0: 59094.5. Samples: 1320887360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:46:54,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 23:46:56,561][54818] Updated weights for policy 0, policy_version 513658 (0.0026) [2024-04-27 23:46:59,192][54818] Updated weights for policy 0, policy_version 513668 (0.0027) [2024-04-27 23:46:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8415936512. Throughput: 0: 59087.9. Samples: 1321066900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:46:59,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 23:47:02,102][54818] Updated weights for policy 0, policy_version 513678 (0.0023) [2024-04-27 23:47:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8416231424. Throughput: 0: 59217.2. Samples: 1321420000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:04,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 23:47:04,784][54818] Updated weights for policy 0, policy_version 513688 (0.0025) [2024-04-27 23:47:07,498][54818] Updated weights for policy 0, policy_version 513698 (0.0025) [2024-04-27 23:47:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.3, 300 sec: 59204.5). Total num frames: 8416542720. Throughput: 0: 58952.0. Samples: 1321770880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:09,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 23:47:10,250][54818] Updated weights for policy 0, policy_version 513708 (0.0025) [2024-04-27 23:47:13,161][54818] Updated weights for policy 0, policy_version 513718 (0.0024) [2024-04-27 23:47:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8416837632. Throughput: 0: 59090.6. Samples: 1321951980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:14,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 23:47:15,792][54818] Updated weights for policy 0, policy_version 513728 (0.0025) [2024-04-27 23:47:18,659][54818] Updated weights for policy 0, policy_version 513738 (0.0025) [2024-04-27 23:47:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.3, 300 sec: 59037.9). Total num frames: 8417116160. Throughput: 0: 59253.7. Samples: 1322312580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:19,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 23:47:21,283][54818] Updated weights for policy 0, policy_version 513748 (0.0026) [2024-04-27 23:47:24,239][54818] Updated weights for policy 0, policy_version 513758 (0.0026) [2024-04-27 23:47:24,253][54587] Fps is (10 sec: 57344.4, 60 sec: 59255.5, 300 sec: 58926.8). Total num frames: 8417411072. Throughput: 0: 59181.7. Samples: 1322664020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:24,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 23:47:26,906][54818] Updated weights for policy 0, policy_version 513768 (0.0024) [2024-04-27 23:47:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 58926.9). Total num frames: 8417705984. Throughput: 0: 59249.0. Samples: 1322837240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:29,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 23:47:29,784][54818] Updated weights for policy 0, policy_version 513778 (0.0028) [2024-04-27 23:47:32,246][54818] Updated weights for policy 0, policy_version 513788 (0.0026) [2024-04-27 23:47:34,253][54587] Fps is (10 sec: 55705.7, 60 sec: 58436.3, 300 sec: 58871.3). Total num frames: 8417968128. Throughput: 0: 59250.2. Samples: 1323194280. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:34,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-27 23:47:35,319][54818] Updated weights for policy 0, policy_version 513798 (0.0025) [2024-04-27 23:47:38,031][54798] Signal inference workers to stop experience collection... (20800 times) [2024-04-27 23:47:38,062][54818] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-04-27 23:47:38,095][54798] Signal inference workers to resume experience collection... (20800 times) [2024-04-27 23:47:38,095][54818] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-04-27 23:47:38,097][54818] Updated weights for policy 0, policy_version 513808 (0.0025) [2024-04-27 23:47:39,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.2, 300 sec: 59093.5). Total num frames: 8418295808. Throughput: 0: 59111.9. Samples: 1323547400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-27 23:47:40,722][54818] Updated weights for policy 0, policy_version 513818 (0.0023) [2024-04-27 23:47:43,550][54818] Updated weights for policy 0, policy_version 513828 (0.0027) [2024-04-27 23:47:44,253][54587] Fps is (10 sec: 62257.8, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8418590720. Throughput: 0: 58917.5. Samples: 1323718200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:44,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 23:47:46,194][54818] Updated weights for policy 0, policy_version 513838 (0.0024) [2024-04-27 23:47:48,930][54818] Updated weights for policy 0, policy_version 513848 (0.0026) [2024-04-27 23:47:49,253][54587] Fps is (10 sec: 58983.5, 60 sec: 59255.5, 300 sec: 59038.0). Total num frames: 8418885632. Throughput: 0: 59161.5. Samples: 1324082260. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:49,253][54587] Avg episode reward: [(0, '0.635')] [2024-04-27 23:47:49,261][54587] No heartbeat for components: RolloutWorker_w4 (8617 seconds) [2024-04-27 23:47:49,317][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000513849_8418902016.pth... [2024-04-27 23:47:49,371][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000512983_8404713472.pth [2024-04-27 23:47:51,639][54818] Updated weights for policy 0, policy_version 513858 (0.0025) [2024-04-27 23:47:54,253][54587] Fps is (10 sec: 60622.1, 60 sec: 59528.6, 300 sec: 59093.5). Total num frames: 8419196928. Throughput: 0: 59149.5. Samples: 1324432600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:54,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 23:47:54,435][54818] Updated weights for policy 0, policy_version 513868 (0.0025) [2024-04-27 23:47:57,200][54818] Updated weights for policy 0, policy_version 513878 (0.0026) [2024-04-27 23:47:59,253][54587] Fps is (10 sec: 60619.7, 60 sec: 59255.4, 300 sec: 59093.4). Total num frames: 8419491840. Throughput: 0: 59087.5. Samples: 1324610920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:47:59,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-27 23:48:00,234][54818] Updated weights for policy 0, policy_version 513888 (0.0028) [2024-04-27 23:48:02,671][54818] Updated weights for policy 0, policy_version 513898 (0.0026) [2024-04-27 23:48:04,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8419770368. Throughput: 0: 58724.1. Samples: 1324955160. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:48:04,253][54587] Avg episode reward: [(0, '0.676')] [2024-04-27 23:48:05,704][54818] Updated weights for policy 0, policy_version 513908 (0.0027) [2024-04-27 23:48:08,403][54818] Updated weights for policy 0, policy_version 513918 (0.0026) [2024-04-27 23:48:09,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58709.5, 300 sec: 58982.4). Total num frames: 8420065280. Throughput: 0: 58992.1. Samples: 1325318660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-27 23:48:09,253][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:48:11,301][54818] Updated weights for policy 0, policy_version 513928 (0.0026) [2024-04-27 23:48:13,887][54818] Updated weights for policy 0, policy_version 513938 (0.0025) [2024-04-27 23:48:14,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8420360192. Throughput: 0: 59215.9. Samples: 1325501960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:14,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 23:48:17,073][54818] Updated weights for policy 0, policy_version 513948 (0.0026) [2024-04-27 23:48:19,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8420655104. Throughput: 0: 59071.1. Samples: 1325852480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:19,253][54587] Avg episode reward: [(0, '0.685')] [2024-04-27 23:48:19,735][54818] Updated weights for policy 0, policy_version 513958 (0.0026) [2024-04-27 23:48:22,687][54818] Updated weights for policy 0, policy_version 513968 (0.0027) [2024-04-27 23:48:22,964][54798] Signal inference workers to stop experience collection... (20850 times) [2024-04-27 23:48:22,971][54798] Signal inference workers to resume experience collection... 
(20850 times) [2024-04-27 23:48:22,972][54818] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-04-27 23:48:22,984][54818] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-04-27 23:48:24,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8420966400. Throughput: 0: 59032.2. Samples: 1326203840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:24,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 23:48:25,247][54818] Updated weights for policy 0, policy_version 513978 (0.0027) [2024-04-27 23:48:28,189][54818] Updated weights for policy 0, policy_version 513988 (0.0023) [2024-04-27 23:48:29,253][54587] Fps is (10 sec: 60619.9, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8421261312. Throughput: 0: 59250.4. Samples: 1326384460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:29,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 23:48:30,713][54818] Updated weights for policy 0, policy_version 513998 (0.0025) [2024-04-27 23:48:33,508][54818] Updated weights for policy 0, policy_version 514008 (0.0025) [2024-04-27 23:48:34,253][54587] Fps is (10 sec: 57343.4, 60 sec: 59528.4, 300 sec: 58982.4). Total num frames: 8421539840. Throughput: 0: 58959.8. Samples: 1326735460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:34,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 23:48:36,223][54818] Updated weights for policy 0, policy_version 514018 (0.0025) [2024-04-27 23:48:39,059][54818] Updated weights for policy 0, policy_version 514028 (0.0023) [2024-04-27 23:48:39,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58982.6, 300 sec: 58926.9). Total num frames: 8421834752. Throughput: 0: 59140.5. Samples: 1327093920. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:39,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-27 23:48:41,713][54818] Updated weights for policy 0, policy_version 514038 (0.0026) [2024-04-27 23:48:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8422113280. Throughput: 0: 59019.6. Samples: 1327266800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:44,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-27 23:48:44,645][54818] Updated weights for policy 0, policy_version 514048 (0.0023) [2024-04-27 23:48:47,478][54818] Updated weights for policy 0, policy_version 514058 (0.0024) [2024-04-27 23:48:49,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8422424576. Throughput: 0: 59300.5. Samples: 1327623680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:49,253][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 23:48:50,032][54818] Updated weights for policy 0, policy_version 514068 (0.0027) [2024-04-27 23:48:53,011][54818] Updated weights for policy 0, policy_version 514078 (0.0025) [2024-04-27 23:48:54,253][54587] Fps is (10 sec: 60621.1, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8422719488. Throughput: 0: 59046.1. Samples: 1327975740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:54,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 23:48:55,524][54818] Updated weights for policy 0, policy_version 514088 (0.0025) [2024-04-27 23:48:58,551][54818] Updated weights for policy 0, policy_version 514098 (0.0027) [2024-04-27 23:48:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8423030784. Throughput: 0: 58787.3. Samples: 1328147380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:48:59,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 23:49:01,167][54818] Updated weights for policy 0, policy_version 514108 (0.0025) [2024-04-27 23:49:04,106][54818] Updated weights for policy 0, policy_version 514118 (0.0023) [2024-04-27 23:49:04,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8423309312. Throughput: 0: 58910.6. Samples: 1328503460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:49:04,254][54587] Avg episode reward: [(0, '0.689')] [2024-04-27 23:49:06,557][54818] Updated weights for policy 0, policy_version 514128 (0.0026) [2024-04-27 23:49:09,253][54587] Fps is (10 sec: 55705.7, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8423587840. Throughput: 0: 59008.0. Samples: 1328859200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:49:09,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-27 23:49:09,794][54818] Updated weights for policy 0, policy_version 514138 (0.0025) [2024-04-27 23:49:11,990][54818] Updated weights for policy 0, policy_version 514148 (0.0026) [2024-04-27 23:49:13,543][54798] Signal inference workers to stop experience collection... (20900 times) [2024-04-27 23:49:13,543][54798] Signal inference workers to resume experience collection... (20900 times) [2024-04-27 23:49:13,555][54818] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-04-27 23:49:13,555][54818] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-04-27 23:49:14,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8423899136. Throughput: 0: 58855.7. Samples: 1329032960. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:49:14,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-27 23:49:15,396][54818] Updated weights for policy 0, policy_version 514158 (0.0026) [2024-04-27 23:49:17,557][54818] Updated weights for policy 0, policy_version 514168 (0.0026) [2024-04-27 23:49:19,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8424194048. Throughput: 0: 58845.0. Samples: 1329383480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:49:19,253][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 23:49:20,733][54818] Updated weights for policy 0, policy_version 514178 (0.0026) [2024-04-27 23:49:23,349][54818] Updated weights for policy 0, policy_version 514188 (0.0027) [2024-04-27 23:49:24,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8424488960. Throughput: 0: 58872.5. Samples: 1329743180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:49:24,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-27 23:49:26,197][54818] Updated weights for policy 0, policy_version 514198 (0.0026) [2024-04-27 23:49:28,890][54818] Updated weights for policy 0, policy_version 514208 (0.0025) [2024-04-27 23:49:29,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8424783872. Throughput: 0: 59075.2. Samples: 1329925180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:49:29,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-27 23:49:31,965][54818] Updated weights for policy 0, policy_version 514218 (0.0026) [2024-04-27 23:49:34,253][54587] Fps is (10 sec: 60619.3, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8425095168. Throughput: 0: 58996.6. Samples: 1330278540. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:49:34,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-27 23:49:34,398][54818] Updated weights for policy 0, policy_version 514228 (0.0026) [2024-04-27 23:49:37,386][54818] Updated weights for policy 0, policy_version 514238 (0.0025) [2024-04-27 23:49:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8425390080. Throughput: 0: 58899.2. Samples: 1330626200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:49:39,253][54587] Avg episode reward: [(0, '0.537')] [2024-04-27 23:49:40,135][54818] Updated weights for policy 0, policy_version 514248 (0.0027) [2024-04-27 23:49:42,967][54818] Updated weights for policy 0, policy_version 514258 (0.0026) [2024-04-27 23:49:44,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59801.6, 300 sec: 59149.0). Total num frames: 8425701376. Throughput: 0: 59128.3. Samples: 1330808160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:49:44,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 23:49:45,651][54818] Updated weights for policy 0, policy_version 514268 (0.0026) [2024-04-27 23:49:48,585][54818] Updated weights for policy 0, policy_version 514278 (0.0026) [2024-04-27 23:49:49,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.3, 300 sec: 59038.0). Total num frames: 8425963520. Throughput: 0: 59129.7. Samples: 1331164300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:49:49,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 23:49:49,326][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000514281_8425979904.pth... 
[2024-04-27 23:49:49,391][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000513415_8411791360.pth [2024-04-27 23:49:51,151][54818] Updated weights for policy 0, policy_version 514288 (0.0026) [2024-04-27 23:49:54,178][54818] Updated weights for policy 0, policy_version 514298 (0.0025) [2024-04-27 23:49:54,253][54587] Fps is (10 sec: 55706.3, 60 sec: 58982.5, 300 sec: 59038.0). Total num frames: 8426258432. Throughput: 0: 59068.9. Samples: 1331517300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:49:54,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-27 23:49:56,674][54818] Updated weights for policy 0, policy_version 514308 (0.0025) [2024-04-27 23:49:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8426553344. Throughput: 0: 59074.3. Samples: 1331691300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:49:59,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-27 23:49:59,613][54818] Updated weights for policy 0, policy_version 514318 (0.0026) [2024-04-27 23:50:02,302][54818] Updated weights for policy 0, policy_version 514328 (0.0025) [2024-04-27 23:50:04,253][54587] Fps is (10 sec: 58981.3, 60 sec: 58982.2, 300 sec: 59093.4). Total num frames: 8426848256. Throughput: 0: 59285.5. Samples: 1332051340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:04,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 23:50:05,103][54818] Updated weights for policy 0, policy_version 514338 (0.0026) [2024-04-27 23:50:07,779][54818] Updated weights for policy 0, policy_version 514348 (0.0026) [2024-04-27 23:50:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8427143168. Throughput: 0: 59119.4. Samples: 1332403560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:09,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-27 23:50:10,585][54818] Updated weights for policy 0, policy_version 514358 (0.0025) [2024-04-27 23:50:13,364][54818] Updated weights for policy 0, policy_version 514368 (0.0023) [2024-04-27 23:50:14,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8427438080. Throughput: 0: 58854.7. Samples: 1332573640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:14,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-27 23:50:15,550][54798] Signal inference workers to stop experience collection... (20950 times) [2024-04-27 23:50:15,551][54798] Signal inference workers to resume experience collection... (20950 times) [2024-04-27 23:50:15,572][54818] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-04-27 23:50:15,573][54818] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-04-27 23:50:16,119][54818] Updated weights for policy 0, policy_version 514378 (0.0026) [2024-04-27 23:50:18,961][54818] Updated weights for policy 0, policy_version 514388 (0.0026) [2024-04-27 23:50:19,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.2, 300 sec: 59037.9). Total num frames: 8427732992. Throughput: 0: 58904.9. Samples: 1332929260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:19,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-27 23:50:21,522][54818] Updated weights for policy 0, policy_version 514398 (0.0026) [2024-04-27 23:50:24,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59255.3, 300 sec: 59037.9). Total num frames: 8428044288. Throughput: 0: 59069.7. Samples: 1333284340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:24,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-27 23:50:24,488][54818] Updated weights for policy 0, policy_version 514408 (0.0026) [2024-04-27 23:50:27,044][54818] Updated weights for policy 0, policy_version 514418 (0.0025) [2024-04-27 23:50:29,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8428339200. Throughput: 0: 58976.6. Samples: 1333462100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:29,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 23:50:30,231][54818] Updated weights for policy 0, policy_version 514428 (0.0026) [2024-04-27 23:50:32,789][54818] Updated weights for policy 0, policy_version 514438 (0.0027) [2024-04-27 23:50:34,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 59149.1). Total num frames: 8428634112. Throughput: 0: 58932.4. Samples: 1333816260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:34,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-27 23:50:35,795][54818] Updated weights for policy 0, policy_version 514448 (0.0026) [2024-04-27 23:50:38,212][54818] Updated weights for policy 0, policy_version 514458 (0.0026) [2024-04-27 23:50:39,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8428945408. Throughput: 0: 58947.8. Samples: 1334169960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:39,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-27 23:50:41,261][54818] Updated weights for policy 0, policy_version 514468 (0.0024) [2024-04-27 23:50:43,608][54818] Updated weights for policy 0, policy_version 514478 (0.0026) [2024-04-27 23:50:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.5, 300 sec: 59204.5). Total num frames: 8429240320. Throughput: 0: 59276.4. Samples: 1334358740. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:44,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 23:50:46,894][54818] Updated weights for policy 0, policy_version 514488 (0.0024) [2024-04-27 23:50:49,241][54818] Updated weights for policy 0, policy_version 514498 (0.0025) [2024-04-27 23:50:49,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59528.3, 300 sec: 59149.0). Total num frames: 8429535232. Throughput: 0: 59003.5. Samples: 1334706500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:49,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-27 23:50:49,266][54587] No heartbeat for components: RolloutWorker_w4 (8797 seconds) [2024-04-27 23:50:52,406][54818] Updated weights for policy 0, policy_version 514508 (0.0024) [2024-04-27 23:50:54,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59255.4, 300 sec: 59038.0). Total num frames: 8429813760. Throughput: 0: 59011.1. Samples: 1335059060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-27 23:50:54,710][54818] Updated weights for policy 0, policy_version 514518 (0.0026) [2024-04-27 23:50:57,953][54818] Updated weights for policy 0, policy_version 514528 (0.0026) [2024-04-27 23:50:59,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59255.3, 300 sec: 59037.9). Total num frames: 8430108672. Throughput: 0: 59310.9. Samples: 1335242640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:50:59,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 23:51:00,149][54818] Updated weights for policy 0, policy_version 514538 (0.0024) [2024-04-27 23:51:03,738][54818] Updated weights for policy 0, policy_version 514548 (0.0025) [2024-04-27 23:51:04,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 8430403584. Throughput: 0: 59327.2. Samples: 1335598980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-27 23:51:04,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 23:51:05,667][54818] Updated weights for policy 0, policy_version 514558 (0.0024) [2024-04-27 23:51:09,097][54818] Updated weights for policy 0, policy_version 514568 (0.0025) [2024-04-27 23:51:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8430682112. Throughput: 0: 59288.8. Samples: 1335952340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:09,254][54587] Avg episode reward: [(0, '0.503')] [2024-04-27 23:51:11,286][54818] Updated weights for policy 0, policy_version 514578 (0.0027) [2024-04-27 23:51:14,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8430977024. Throughput: 0: 59237.4. Samples: 1336127780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:14,253][54587] Avg episode reward: [(0, '0.671')] [2024-04-27 23:51:14,482][54818] Updated weights for policy 0, policy_version 514588 (0.0026) [2024-04-27 23:51:15,445][54798] Signal inference workers to stop experience collection... (21000 times) [2024-04-27 23:51:15,474][54818] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-04-27 23:51:15,505][54798] Signal inference workers to resume experience collection... (21000 times) [2024-04-27 23:51:15,506][54818] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-04-27 23:51:16,937][54818] Updated weights for policy 0, policy_version 514598 (0.0026) [2024-04-27 23:51:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8431288320. Throughput: 0: 59250.6. Samples: 1336482540. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:19,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-27 23:51:19,933][54818] Updated weights for policy 0, policy_version 514608 (0.0025) [2024-04-27 23:51:22,580][54818] Updated weights for policy 0, policy_version 514618 (0.0026) [2024-04-27 23:51:24,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8431583232. Throughput: 0: 59448.6. Samples: 1336845140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:24,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-27 23:51:25,476][54818] Updated weights for policy 0, policy_version 514628 (0.0026) [2024-04-27 23:51:28,072][54818] Updated weights for policy 0, policy_version 514638 (0.0024) [2024-04-27 23:51:29,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 8431861760. Throughput: 0: 58954.5. Samples: 1337011700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:29,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-27 23:51:30,950][54818] Updated weights for policy 0, policy_version 514648 (0.0026) [2024-04-27 23:51:33,430][54818] Updated weights for policy 0, policy_version 514658 (0.0026) [2024-04-27 23:51:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8432173056. Throughput: 0: 59090.5. Samples: 1337365560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:34,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 23:51:36,579][54818] Updated weights for policy 0, policy_version 514668 (0.0025) [2024-04-27 23:51:39,086][54818] Updated weights for policy 0, policy_version 514678 (0.0026) [2024-04-27 23:51:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8432484352. Throughput: 0: 59203.0. Samples: 1337723200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:39,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-27 23:51:41,998][54818] Updated weights for policy 0, policy_version 514688 (0.0026) [2024-04-27 23:51:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8432779264. Throughput: 0: 59116.6. Samples: 1337902880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:44,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:51:45,050][54818] Updated weights for policy 0, policy_version 514698 (0.0025) [2024-04-27 23:51:47,490][54818] Updated weights for policy 0, policy_version 514708 (0.0025) [2024-04-27 23:51:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.5, 300 sec: 59093.5). Total num frames: 8433057792. Throughput: 0: 59175.5. Samples: 1338261880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:49,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-27 23:51:49,316][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000514714_8433074176.pth... [2024-04-27 23:51:49,369][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000513849_8418902016.pth [2024-04-27 23:51:50,566][54818] Updated weights for policy 0, policy_version 514718 (0.0025) [2024-04-27 23:51:53,074][54818] Updated weights for policy 0, policy_version 514728 (0.0027) [2024-04-27 23:51:54,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8433369088. Throughput: 0: 59068.1. Samples: 1338610400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:54,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 23:51:56,183][54818] Updated weights for policy 0, policy_version 514738 (0.0026) [2024-04-27 23:51:58,507][54818] Updated weights for policy 0, policy_version 514748 (0.0026) [2024-04-27 23:51:59,253][54587] Fps is (10 sec: 62258.9, 60 sec: 59528.6, 300 sec: 59149.0). 
Total num frames: 8433680384. Throughput: 0: 59285.1. Samples: 1338795620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:51:59,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-27 23:52:01,712][54818] Updated weights for policy 0, policy_version 514758 (0.0024) [2024-04-27 23:52:04,070][54818] Updated weights for policy 0, policy_version 514768 (0.0025) [2024-04-27 23:52:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8433975296. Throughput: 0: 59166.2. Samples: 1339145020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:52:04,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 23:52:07,183][54818] Updated weights for policy 0, policy_version 514778 (0.0026) [2024-04-27 23:52:09,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59801.7, 300 sec: 59093.5). Total num frames: 8434270208. Throughput: 0: 59021.3. Samples: 1339501100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:52:09,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 23:52:09,492][54818] Updated weights for policy 0, policy_version 514788 (0.0026) [2024-04-27 23:52:12,723][54818] Updated weights for policy 0, policy_version 514798 (0.0026) [2024-04-27 23:52:13,064][54798] Signal inference workers to stop experience collection... (21050 times) [2024-04-27 23:52:13,064][54798] Signal inference workers to resume experience collection... (21050 times) [2024-04-27 23:52:13,074][54818] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-04-27 23:52:13,091][54818] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-04-27 23:52:14,253][54587] Fps is (10 sec: 57344.7, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8434548736. Throughput: 0: 59426.0. Samples: 1339685860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:52:14,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 23:52:15,014][54818] Updated weights for policy 0, policy_version 514808 (0.0025) [2024-04-27 23:52:18,254][54818] Updated weights for policy 0, policy_version 514818 (0.0027) [2024-04-27 23:52:19,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8434843648. Throughput: 0: 59432.4. Samples: 1340040020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:52:19,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-27 23:52:20,399][54818] Updated weights for policy 0, policy_version 514828 (0.0025) [2024-04-27 23:52:23,781][54818] Updated weights for policy 0, policy_version 514838 (0.0024) [2024-04-27 23:52:24,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8435138560. Throughput: 0: 59388.9. Samples: 1340395700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:52:24,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-27 23:52:25,798][54818] Updated weights for policy 0, policy_version 514848 (0.0027) [2024-04-27 23:52:29,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8435417088. Throughput: 0: 59159.1. Samples: 1340565040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:52:29,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-27 23:52:29,259][54818] Updated weights for policy 0, policy_version 514858 (0.0024) [2024-04-27 23:52:31,729][54818] Updated weights for policy 0, policy_version 514868 (0.0026) [2024-04-27 23:52:34,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8435728384. Throughput: 0: 58972.5. Samples: 1340915640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-27 23:52:34,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:52:34,581][54818] Updated weights for policy 0, policy_version 514878 (0.0026) [2024-04-27 23:52:37,261][54818] Updated weights for policy 0, policy_version 514888 (0.0025) [2024-04-27 23:52:39,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.3, 300 sec: 59038.0). Total num frames: 8436006912. Throughput: 0: 59403.0. Samples: 1341283540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:52:39,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 23:52:40,025][54818] Updated weights for policy 0, policy_version 514898 (0.0026) [2024-04-27 23:52:42,903][54818] Updated weights for policy 0, policy_version 514908 (0.0025) [2024-04-27 23:52:44,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8436301824. Throughput: 0: 58961.3. Samples: 1341448880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:52:44,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-27 23:52:45,832][54818] Updated weights for policy 0, policy_version 514918 (0.0027) [2024-04-27 23:52:48,504][54818] Updated weights for policy 0, policy_version 514928 (0.0026) [2024-04-27 23:52:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59255.5, 300 sec: 59037.9). Total num frames: 8436613120. Throughput: 0: 59064.0. Samples: 1341802900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:52:49,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 23:52:51,513][54818] Updated weights for policy 0, policy_version 514938 (0.0027) [2024-04-27 23:52:54,079][54818] Updated weights for policy 0, policy_version 514948 (0.0025) [2024-04-27 23:52:54,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8436908032. Throughput: 0: 59040.8. Samples: 1342157940. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:52:54,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-27 23:52:57,113][54818] Updated weights for policy 0, policy_version 514958 (0.0025) [2024-04-27 23:52:57,204][54798] Signal inference workers to stop experience collection... (21100 times) [2024-04-27 23:52:57,236][54818] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-04-27 23:52:57,265][54798] Signal inference workers to resume experience collection... (21100 times) [2024-04-27 23:52:57,265][54818] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-04-27 23:52:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8437219328. Throughput: 0: 58938.5. Samples: 1342338100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:52:59,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-27 23:52:59,672][54818] Updated weights for policy 0, policy_version 514968 (0.0031) [2024-04-27 23:53:02,445][54818] Updated weights for policy 0, policy_version 514978 (0.0022) [2024-04-27 23:53:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8437514240. Throughput: 0: 59048.4. Samples: 1342697200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:04,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-27 23:53:05,511][54818] Updated weights for policy 0, policy_version 514988 (0.0025) [2024-04-27 23:53:07,829][54818] Updated weights for policy 0, policy_version 514998 (0.0025) [2024-04-27 23:53:09,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8437825536. Throughput: 0: 58818.4. Samples: 1343042520. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:09,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-27 23:53:11,018][54818] Updated weights for policy 0, policy_version 515008 (0.0028) [2024-04-27 23:53:13,362][54818] Updated weights for policy 0, policy_version 515018 (0.0022) [2024-04-27 23:53:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8438120448. Throughput: 0: 59307.5. Samples: 1343233880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:14,263][54587] Avg episode reward: [(0, '0.571')] [2024-04-27 23:53:16,461][54818] Updated weights for policy 0, policy_version 515028 (0.0025) [2024-04-27 23:53:18,911][54818] Updated weights for policy 0, policy_version 515038 (0.0026) [2024-04-27 23:53:19,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8438382592. Throughput: 0: 59202.7. Samples: 1343579760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:19,254][54587] Avg episode reward: [(0, '0.711')] [2024-04-27 23:53:21,945][54818] Updated weights for policy 0, policy_version 515048 (0.0030) [2024-04-27 23:53:24,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8438693888. Throughput: 0: 59016.4. Samples: 1343939280. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:24,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 23:53:24,493][54818] Updated weights for policy 0, policy_version 515058 (0.0026) [2024-04-27 23:53:27,341][54818] Updated weights for policy 0, policy_version 515068 (0.0026) [2024-04-27 23:53:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8438988800. Throughput: 0: 59315.8. Samples: 1344118080. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:29,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-27 23:53:29,795][54818] Updated weights for policy 0, policy_version 515078 (0.0026) [2024-04-27 23:53:33,009][54818] Updated weights for policy 0, policy_version 515088 (0.0026) [2024-04-27 23:53:34,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8439267328. Throughput: 0: 59256.0. Samples: 1344469420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:34,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-27 23:53:35,407][54818] Updated weights for policy 0, policy_version 515098 (0.0025) [2024-04-27 23:53:38,316][54818] Updated weights for policy 0, policy_version 515108 (0.0026) [2024-04-27 23:53:39,253][54587] Fps is (10 sec: 58981.0, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8439578624. Throughput: 0: 59090.1. Samples: 1344817000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:39,255][54587] Avg episode reward: [(0, '0.493')] [2024-04-27 23:53:41,003][54818] Updated weights for policy 0, policy_version 515118 (0.0025) [2024-04-27 23:53:43,239][54798] Signal inference workers to stop experience collection... (21150 times) [2024-04-27 23:53:43,241][54798] Signal inference workers to resume experience collection... (21150 times) [2024-04-27 23:53:43,250][54818] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-04-27 23:53:43,270][54818] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-04-27 23:53:43,785][54818] Updated weights for policy 0, policy_version 515128 (0.0026) [2024-04-27 23:53:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8439873536. Throughput: 0: 59171.1. Samples: 1345000800. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:44,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-27 23:53:46,401][54818] Updated weights for policy 0, policy_version 515138 (0.0023) [2024-04-27 23:53:49,204][54818] Updated weights for policy 0, policy_version 515148 (0.0024) [2024-04-27 23:53:49,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8440184832. Throughput: 0: 59101.7. Samples: 1345356780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:49,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-27 23:53:49,266][54587] No heartbeat for components: RolloutWorker_w4 (8977 seconds) [2024-04-27 23:53:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000515148_8440184832.pth... [2024-04-27 23:53:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000514281_8425979904.pth [2024-04-27 23:53:51,888][54818] Updated weights for policy 0, policy_version 515158 (0.0025) [2024-04-27 23:53:54,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8440463360. Throughput: 0: 59439.9. Samples: 1345717320. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:54,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 23:53:54,621][54818] Updated weights for policy 0, policy_version 515168 (0.0026) [2024-04-27 23:53:57,388][54818] Updated weights for policy 0, policy_version 515178 (0.0026) [2024-04-27 23:53:59,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8440758272. Throughput: 0: 59078.7. Samples: 1345892420. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-04-27 23:53:59,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-27 23:54:00,419][54818] Updated weights for policy 0, policy_version 515188 (0.0022) [2024-04-27 23:54:02,882][54818] Updated weights for policy 0, policy_version 515198 (0.0025) [2024-04-27 23:54:04,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8441053184. Throughput: 0: 59125.0. Samples: 1346240380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:04,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 23:54:05,867][54818] Updated weights for policy 0, policy_version 515208 (0.0026) [2024-04-27 23:54:08,533][54818] Updated weights for policy 0, policy_version 515218 (0.0026) [2024-04-27 23:54:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8441364480. Throughput: 0: 59120.0. Samples: 1346599680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:09,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 23:54:11,596][54818] Updated weights for policy 0, policy_version 515228 (0.0026) [2024-04-27 23:54:14,005][54818] Updated weights for policy 0, policy_version 515238 (0.0023) [2024-04-27 23:54:14,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58982.4, 300 sec: 59204.5). Total num frames: 8441659392. Throughput: 0: 59204.7. Samples: 1346782300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:14,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-27 23:54:17,268][54818] Updated weights for policy 0, policy_version 515248 (0.0026) [2024-04-27 23:54:19,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59528.6, 300 sec: 59204.5). Total num frames: 8441954304. Throughput: 0: 59262.3. Samples: 1347136220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:19,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-27 23:54:19,668][54818] Updated weights for policy 0, policy_version 515258 (0.0025) [2024-04-27 23:54:22,747][54818] Updated weights for policy 0, policy_version 515268 (0.0024) [2024-04-27 23:54:24,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8442232832. Throughput: 0: 59285.9. Samples: 1347484860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:24,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-27 23:54:25,462][54818] Updated weights for policy 0, policy_version 515278 (0.0026) [2024-04-27 23:54:28,393][54818] Updated weights for policy 0, policy_version 515288 (0.0026) [2024-04-27 23:54:29,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8442527744. Throughput: 0: 59118.3. Samples: 1347661120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:29,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-27 23:54:31,324][54818] Updated weights for policy 0, policy_version 515298 (0.0026) [2024-04-27 23:54:32,728][54798] Signal inference workers to stop experience collection... (21200 times) [2024-04-27 23:54:32,771][54818] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-04-27 23:54:32,785][54798] Signal inference workers to resume experience collection... (21200 times) [2024-04-27 23:54:32,790][54818] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-04-27 23:54:33,790][54818] Updated weights for policy 0, policy_version 515308 (0.0024) [2024-04-27 23:54:34,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.3, 300 sec: 59093.4). Total num frames: 8442822656. Throughput: 0: 59170.2. Samples: 1348019440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:34,254][54587] Avg episode reward: [(0, '0.514')] [2024-04-27 23:54:36,885][54818] Updated weights for policy 0, policy_version 515318 (0.0025) [2024-04-27 23:54:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.6, 300 sec: 59038.0). Total num frames: 8443117568. Throughput: 0: 58947.2. Samples: 1348369940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:39,253][54587] Avg episode reward: [(0, '0.596')] [2024-04-27 23:54:39,279][54818] Updated weights for policy 0, policy_version 515328 (0.0023) [2024-04-27 23:54:42,383][54818] Updated weights for policy 0, policy_version 515338 (0.0025) [2024-04-27 23:54:44,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8443412480. Throughput: 0: 58934.2. Samples: 1348544460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:44,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-27 23:54:44,952][54818] Updated weights for policy 0, policy_version 515348 (0.0022) [2024-04-27 23:54:47,827][54818] Updated weights for policy 0, policy_version 515358 (0.0026) [2024-04-27 23:54:49,253][54587] Fps is (10 sec: 58981.4, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8443707392. Throughput: 0: 59170.4. Samples: 1348903060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:49,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-27 23:54:50,490][54818] Updated weights for policy 0, policy_version 515368 (0.0024) [2024-04-27 23:54:53,311][54818] Updated weights for policy 0, policy_version 515378 (0.0025) [2024-04-27 23:54:54,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8444018688. Throughput: 0: 58861.8. Samples: 1349248460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:54,254][54587] Avg episode reward: [(0, '0.704')] [2024-04-27 23:54:56,033][54818] Updated weights for policy 0, policy_version 515388 (0.0026) [2024-04-27 23:54:58,800][54818] Updated weights for policy 0, policy_version 515398 (0.0024) [2024-04-27 23:54:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8444313600. Throughput: 0: 58874.7. Samples: 1349431660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:54:59,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 23:55:01,797][54818] Updated weights for policy 0, policy_version 515408 (0.0026) [2024-04-27 23:55:04,115][54818] Updated weights for policy 0, policy_version 515418 (0.0023) [2024-04-27 23:55:04,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8444608512. Throughput: 0: 58819.1. Samples: 1349783080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:55:04,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-27 23:55:07,302][54818] Updated weights for policy 0, policy_version 515428 (0.0024) [2024-04-27 23:55:09,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8444903424. Throughput: 0: 59138.9. Samples: 1350146100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:55:09,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 23:55:09,570][54818] Updated weights for policy 0, policy_version 515438 (0.0026) [2024-04-27 23:55:12,727][54818] Updated weights for policy 0, policy_version 515448 (0.0025) [2024-04-27 23:55:14,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58436.3, 300 sec: 59093.5). Total num frames: 8445165568. Throughput: 0: 59029.8. Samples: 1350317460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:55:14,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-27 23:55:15,102][54818] Updated weights for policy 0, policy_version 515458 (0.0026) [2024-04-27 23:55:18,375][54818] Updated weights for policy 0, policy_version 515468 (0.0025) [2024-04-27 23:55:19,253][54587] Fps is (10 sec: 55704.7, 60 sec: 58436.1, 300 sec: 59037.9). Total num frames: 8445460480. Throughput: 0: 59047.1. Samples: 1350676560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:55:19,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-27 23:55:20,751][54818] Updated weights for policy 0, policy_version 515478 (0.0025) [2024-04-27 23:55:22,638][54798] Signal inference workers to stop experience collection... (21250 times) [2024-04-27 23:55:22,639][54798] Signal inference workers to resume experience collection... (21250 times) [2024-04-27 23:55:22,663][54818] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-04-27 23:55:22,663][54818] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-04-27 23:55:24,019][54818] Updated weights for policy 0, policy_version 515488 (0.0026) [2024-04-27 23:55:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8445771776. Throughput: 0: 59056.0. Samples: 1351027460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:55:24,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-27 23:55:26,202][54818] Updated weights for policy 0, policy_version 515498 (0.0026) [2024-04-27 23:55:29,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8446050304. Throughput: 0: 58839.2. Samples: 1351192220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-27 23:55:29,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-27 23:55:29,465][54818] Updated weights for policy 0, policy_version 515508 (0.0024) [2024-04-27 23:55:31,834][54818] Updated weights for policy 0, policy_version 515518 (0.0022) [2024-04-27 23:55:34,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8446345216. Throughput: 0: 58960.9. Samples: 1351556300. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:55:34,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 23:55:34,972][54818] Updated weights for policy 0, policy_version 515528 (0.0026) [2024-04-27 23:55:37,390][54818] Updated weights for policy 0, policy_version 515538 (0.0025) [2024-04-27 23:55:39,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8446640128. Throughput: 0: 59143.3. Samples: 1351909900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:55:39,253][54587] Avg episode reward: [(0, '0.520')] [2024-04-27 23:55:40,839][54818] Updated weights for policy 0, policy_version 515548 (0.0025) [2024-04-27 23:55:43,017][54818] Updated weights for policy 0, policy_version 515558 (0.0026) [2024-04-27 23:55:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8446951424. Throughput: 0: 58799.6. Samples: 1352077640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:55:44,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-27 23:55:46,267][54818] Updated weights for policy 0, policy_version 515568 (0.0026) [2024-04-27 23:55:48,899][54818] Updated weights for policy 0, policy_version 515578 (0.0025) [2024-04-27 23:55:49,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.5, 300 sec: 59037.9). Total num frames: 8447229952. Throughput: 0: 58900.5. Samples: 1352433600. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:55:49,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-27 23:55:49,272][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000515579_8447246336.pth... [2024-04-27 23:55:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000514714_8433074176.pth [2024-04-27 23:55:51,723][54818] Updated weights for policy 0, policy_version 515588 (0.0026) [2024-04-27 23:55:54,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8447541248. Throughput: 0: 58706.0. Samples: 1352787880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:55:54,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-27 23:55:54,524][54818] Updated weights for policy 0, policy_version 515598 (0.0028) [2024-04-27 23:55:57,217][54818] Updated weights for policy 0, policy_version 515608 (0.0024) [2024-04-27 23:55:59,253][54587] Fps is (10 sec: 62258.8, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8447852544. Throughput: 0: 59040.8. Samples: 1352974300. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:55:59,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-27 23:56:00,067][54818] Updated weights for policy 0, policy_version 515618 (0.0026) [2024-04-27 23:56:02,669][54818] Updated weights for policy 0, policy_version 515628 (0.0026) [2024-04-27 23:56:04,253][54587] Fps is (10 sec: 62260.1, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8448163840. Throughput: 0: 58830.4. Samples: 1353323920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:04,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-27 23:56:05,554][54818] Updated weights for policy 0, policy_version 515638 (0.0026) [2024-04-27 23:56:08,124][54818] Updated weights for policy 0, policy_version 515648 (0.0026) [2024-04-27 23:56:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.3, 300 sec: 59260.1). 
Total num frames: 8448458752. Throughput: 0: 58804.7. Samples: 1353673680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:09,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-27 23:56:11,036][54818] Updated weights for policy 0, policy_version 515658 (0.0026) [2024-04-27 23:56:13,262][54798] Signal inference workers to stop experience collection... (21300 times) [2024-04-27 23:56:13,295][54818] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-04-27 23:56:13,353][54798] Signal inference workers to resume experience collection... (21300 times) [2024-04-27 23:56:13,353][54818] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-04-27 23:56:13,606][54818] Updated weights for policy 0, policy_version 515668 (0.0025) [2024-04-27 23:56:14,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59801.6, 300 sec: 59204.6). Total num frames: 8448753664. Throughput: 0: 59489.4. Samples: 1353869240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:14,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-27 23:56:16,489][54818] Updated weights for policy 0, policy_version 515678 (0.0026) [2024-04-27 23:56:19,151][54818] Updated weights for policy 0, policy_version 515688 (0.0026) [2024-04-27 23:56:19,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8449032192. Throughput: 0: 59096.9. Samples: 1354215660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:19,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-27 23:56:22,601][54818] Updated weights for policy 0, policy_version 515698 (0.0025) [2024-04-27 23:56:24,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8449310720. Throughput: 0: 59023.1. Samples: 1354565940. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:24,253][54587] Avg episode reward: [(0, '0.569')] [2024-04-27 23:56:24,540][54818] Updated weights for policy 0, policy_version 515708 (0.0024) [2024-04-27 23:56:28,290][54818] Updated weights for policy 0, policy_version 515718 (0.0024) [2024-04-27 23:56:29,254][54587] Fps is (10 sec: 57343.0, 60 sec: 59255.3, 300 sec: 59093.4). Total num frames: 8449605632. Throughput: 0: 59248.7. Samples: 1354743840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:29,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-27 23:56:30,131][54818] Updated weights for policy 0, policy_version 515728 (0.0026) [2024-04-27 23:56:33,643][54818] Updated weights for policy 0, policy_version 515738 (0.0026) [2024-04-27 23:56:34,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8449916928. Throughput: 0: 59159.9. Samples: 1355095800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:34,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 23:56:35,596][54818] Updated weights for policy 0, policy_version 515748 (0.0026) [2024-04-27 23:56:39,199][54818] Updated weights for policy 0, policy_version 515758 (0.0026) [2024-04-27 23:56:39,253][54587] Fps is (10 sec: 57345.6, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8450179072. Throughput: 0: 59443.3. Samples: 1355462820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:39,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-27 23:56:41,193][54818] Updated weights for policy 0, policy_version 515768 (0.0026) [2024-04-27 23:56:44,253][54587] Fps is (10 sec: 54067.6, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8450457600. Throughput: 0: 58889.8. Samples: 1355624340. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:44,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 23:56:44,675][54818] Updated weights for policy 0, policy_version 515778 (0.0027) [2024-04-27 23:56:46,785][54818] Updated weights for policy 0, policy_version 515788 (0.0025) [2024-04-27 23:56:49,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8450768896. Throughput: 0: 58975.4. Samples: 1355977820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:49,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-27 23:56:49,266][54587] No heartbeat for components: RolloutWorker_w4 (9157 seconds) [2024-04-27 23:56:50,197][54818] Updated weights for policy 0, policy_version 515798 (0.0025) [2024-04-27 23:56:51,871][54798] Signal inference workers to stop experience collection... (21350 times) [2024-04-27 23:56:51,891][54818] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-04-27 23:56:51,961][54798] Signal inference workers to resume experience collection... (21350 times) [2024-04-27 23:56:51,961][54818] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-04-27 23:56:52,385][54818] Updated weights for policy 0, policy_version 515808 (0.0025) [2024-04-27 23:56:54,253][54587] Fps is (10 sec: 62258.8, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8451080192. Throughput: 0: 59136.1. Samples: 1356334800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:54,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-27 23:56:55,668][54818] Updated weights for policy 0, policy_version 515818 (0.0023) [2024-04-27 23:56:57,989][54818] Updated weights for policy 0, policy_version 515828 (0.0026) [2024-04-27 23:56:59,253][54587] Fps is (10 sec: 62259.1, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8451391488. Throughput: 0: 58562.5. Samples: 1356504560. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-27 23:56:59,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-27 23:57:01,071][54818] Updated weights for policy 0, policy_version 515838 (0.0024) [2024-04-27 23:57:03,554][54818] Updated weights for policy 0, policy_version 515848 (0.0026) [2024-04-27 23:57:04,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8451670016. Throughput: 0: 58808.1. Samples: 1356862020. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:04,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-27 23:57:06,622][54818] Updated weights for policy 0, policy_version 515858 (0.0026) [2024-04-27 23:57:09,225][54818] Updated weights for policy 0, policy_version 515868 (0.0025) [2024-04-27 23:57:09,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.5, 300 sec: 59093.5). Total num frames: 8451981312. Throughput: 0: 58884.8. Samples: 1357215760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:09,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-27 23:57:12,215][54818] Updated weights for policy 0, policy_version 515878 (0.0026) [2024-04-27 23:57:14,253][54587] Fps is (10 sec: 62259.1, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8452292608. Throughput: 0: 59136.3. Samples: 1357404960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:14,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-27 23:57:14,856][54818] Updated weights for policy 0, policy_version 515888 (0.0028) [2024-04-27 23:57:17,558][54818] Updated weights for policy 0, policy_version 515898 (0.0025) [2024-04-27 23:57:19,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58709.4, 300 sec: 59038.0). Total num frames: 8452554752. Throughput: 0: 59123.7. Samples: 1357756360. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:19,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-27 23:57:20,179][54818] Updated weights for policy 0, policy_version 515908 (0.0023) [2024-04-27 23:57:23,041][54818] Updated weights for policy 0, policy_version 515918 (0.0028) [2024-04-27 23:57:24,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8452866048. Throughput: 0: 58543.0. Samples: 1358097260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 23:57:25,734][54818] Updated weights for policy 0, policy_version 515928 (0.0026) [2024-04-27 23:57:28,558][54818] Updated weights for policy 0, policy_version 515938 (0.0023) [2024-04-27 23:57:28,691][54798] Signal inference workers to stop experience collection... (21400 times) [2024-04-27 23:57:28,692][54798] Signal inference workers to resume experience collection... (21400 times) [2024-04-27 23:57:28,716][54818] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-04-27 23:57:28,716][54818] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-04-27 23:57:29,253][54587] Fps is (10 sec: 62257.8, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8453177344. Throughput: 0: 59203.8. Samples: 1358288520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:29,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-27 23:57:31,336][54818] Updated weights for policy 0, policy_version 515948 (0.0022) [2024-04-27 23:57:34,011][54818] Updated weights for policy 0, policy_version 515958 (0.0024) [2024-04-27 23:57:34,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8453472256. Throughput: 0: 59359.3. Samples: 1358648980. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:34,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:57:36,965][54818] Updated weights for policy 0, policy_version 515968 (0.0026) [2024-04-27 23:57:39,253][54587] Fps is (10 sec: 57344.8, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8453750784. Throughput: 0: 59320.9. Samples: 1359004240. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:39,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-27 23:57:39,401][54818] Updated weights for policy 0, policy_version 515978 (0.0022) [2024-04-27 23:57:42,564][54818] Updated weights for policy 0, policy_version 515988 (0.0026) [2024-04-27 23:57:44,253][54587] Fps is (10 sec: 57343.4, 60 sec: 59801.6, 300 sec: 59093.5). Total num frames: 8454045696. Throughput: 0: 59219.6. Samples: 1359169440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:44,254][54587] Avg episode reward: [(0, '0.712')] [2024-04-27 23:57:44,826][54818] Updated weights for policy 0, policy_version 515998 (0.0026) [2024-04-27 23:57:48,568][54818] Updated weights for policy 0, policy_version 516008 (0.0025) [2024-04-27 23:57:49,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8454307840. Throughput: 0: 59067.5. Samples: 1359520060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:49,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-27 23:57:49,371][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000516011_8454324224.pth... [2024-04-27 23:57:49,429][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000515148_8440184832.pth [2024-04-27 23:57:50,368][54818] Updated weights for policy 0, policy_version 516018 (0.0025) [2024-04-27 23:57:53,969][54818] Updated weights for policy 0, policy_version 516028 (0.0027) [2024-04-27 23:57:54,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58709.4, 300 sec: 58926.9). 
Total num frames: 8454602752. Throughput: 0: 59235.1. Samples: 1359881340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:54,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-27 23:57:55,968][54818] Updated weights for policy 0, policy_version 516038 (0.0024) [2024-04-27 23:57:59,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58436.3, 300 sec: 58926.8). Total num frames: 8454897664. Throughput: 0: 58818.9. Samples: 1360051820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:57:59,262][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 23:57:59,573][54818] Updated weights for policy 0, policy_version 516048 (0.0026) [2024-04-27 23:58:01,374][54818] Updated weights for policy 0, policy_version 516058 (0.0025) [2024-04-27 23:58:04,253][54587] Fps is (10 sec: 60621.6, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8455208960. Throughput: 0: 58960.9. Samples: 1360409600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:58:04,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-27 23:58:05,198][54818] Updated weights for policy 0, policy_version 516068 (0.0026) [2024-04-27 23:58:05,931][54798] Signal inference workers to stop experience collection... (21450 times) [2024-04-27 23:58:05,944][54818] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-04-27 23:58:06,028][54798] Signal inference workers to resume experience collection... (21450 times) [2024-04-27 23:58:06,028][54818] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-04-27 23:58:07,364][54818] Updated weights for policy 0, policy_version 516078 (0.0022) [2024-04-27 23:58:09,253][54587] Fps is (10 sec: 62259.4, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8455520256. Throughput: 0: 59309.7. Samples: 1360766200. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:58:09,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-27 23:58:10,733][54818] Updated weights for policy 0, policy_version 516088 (0.0025) [2024-04-27 23:58:12,899][54818] Updated weights for policy 0, policy_version 516098 (0.0026) [2024-04-27 23:58:14,253][54587] Fps is (10 sec: 58981.2, 60 sec: 58436.1, 300 sec: 59037.9). Total num frames: 8455798784. Throughput: 0: 58837.4. Samples: 1360936200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:58:14,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-27 23:58:16,162][54818] Updated weights for policy 0, policy_version 516108 (0.0025) [2024-04-27 23:58:18,336][54818] Updated weights for policy 0, policy_version 516118 (0.0026) [2024-04-27 23:58:19,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.3, 300 sec: 59037.9). Total num frames: 8456110080. Throughput: 0: 58698.0. Samples: 1361290400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:58:19,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-27 23:58:21,743][54818] Updated weights for policy 0, policy_version 516128 (0.0027) [2024-04-27 23:58:23,966][54818] Updated weights for policy 0, policy_version 516138 (0.0026) [2024-04-27 23:58:24,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8456404992. Throughput: 0: 58681.4. Samples: 1361644900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-27 23:58:24,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-27 23:58:27,088][54818] Updated weights for policy 0, policy_version 516148 (0.0026) [2024-04-27 23:58:29,253][54587] Fps is (10 sec: 60621.6, 60 sec: 58982.6, 300 sec: 59149.0). Total num frames: 8456716288. Throughput: 0: 59305.0. Samples: 1361838160. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:58:29,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-27 23:58:29,382][54818] Updated weights for policy 0, policy_version 516158 (0.0022) [2024-04-27 23:58:32,679][54818] Updated weights for policy 0, policy_version 516168 (0.0026) [2024-04-27 23:58:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.3, 300 sec: 59038.0). Total num frames: 8456994816. Throughput: 0: 59278.2. Samples: 1362187580. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:58:34,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-27 23:58:34,878][54818] Updated weights for policy 0, policy_version 516178 (0.0026) [2024-04-27 23:58:38,061][54818] Updated weights for policy 0, policy_version 516188 (0.0026) [2024-04-27 23:58:39,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8457306112. Throughput: 0: 58917.9. Samples: 1362532640. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:58:39,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-27 23:58:40,969][54818] Updated weights for policy 0, policy_version 516198 (0.0026) [2024-04-27 23:58:43,529][54818] Updated weights for policy 0, policy_version 516208 (0.0026) [2024-04-27 23:58:44,253][54587] Fps is (10 sec: 62258.7, 60 sec: 59528.6, 300 sec: 59093.5). Total num frames: 8457617408. Throughput: 0: 59315.7. Samples: 1362721020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:58:44,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-27 23:58:46,567][54818] Updated weights for policy 0, policy_version 516218 (0.0027) [2024-04-27 23:58:49,056][54818] Updated weights for policy 0, policy_version 516228 (0.0025) [2024-04-27 23:58:49,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59528.5, 300 sec: 59037.9). Total num frames: 8457879552. Throughput: 0: 59392.7. Samples: 1363082280. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:58:49,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-27 23:58:52,201][54818] Updated weights for policy 0, policy_version 516238 (0.0026) [2024-04-27 23:58:54,053][54798] Signal inference workers to stop experience collection... (21500 times) [2024-04-27 23:58:54,054][54798] Signal inference workers to resume experience collection... (21500 times) [2024-04-27 23:58:54,065][54818] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-04-27 23:58:54,066][54818] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-04-27 23:58:54,253][54587] Fps is (10 sec: 55705.8, 60 sec: 59528.6, 300 sec: 59038.0). Total num frames: 8458174464. Throughput: 0: 59155.7. Samples: 1363428200. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:58:54,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-27 23:58:54,535][54818] Updated weights for policy 0, policy_version 516248 (0.0026) [2024-04-27 23:58:57,612][54818] Updated weights for policy 0, policy_version 516258 (0.0026) [2024-04-27 23:58:59,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59528.5, 300 sec: 59037.9). Total num frames: 8458469376. Throughput: 0: 59329.3. Samples: 1363606020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:58:59,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:58:59,951][54818] Updated weights for policy 0, policy_version 516268 (0.0026) [2024-04-27 23:59:03,008][54818] Updated weights for policy 0, policy_version 516278 (0.0031) [2024-04-27 23:59:04,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.3, 300 sec: 58926.9). Total num frames: 8458747904. Throughput: 0: 59489.0. Samples: 1363967400. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:04,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-27 23:59:05,406][54818] Updated weights for policy 0, policy_version 516288 (0.0026) [2024-04-27 23:59:08,726][54818] Updated weights for policy 0, policy_version 516298 (0.0026) [2024-04-27 23:59:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8459042816. Throughput: 0: 59471.8. Samples: 1364321140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:09,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-27 23:59:10,901][54818] Updated weights for policy 0, policy_version 516308 (0.0026) [2024-04-27 23:59:14,221][54818] Updated weights for policy 0, policy_version 516318 (0.0025) [2024-04-27 23:59:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8459354112. Throughput: 0: 58814.6. Samples: 1364484820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:14,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-27 23:59:16,401][54818] Updated weights for policy 0, policy_version 516328 (0.0024) [2024-04-27 23:59:19,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8459632640. Throughput: 0: 58853.2. Samples: 1364835980. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:19,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-27 23:59:19,686][54818] Updated weights for policy 0, policy_version 516338 (0.0025) [2024-04-27 23:59:21,874][54818] Updated weights for policy 0, policy_version 516348 (0.0026) [2024-04-27 23:59:24,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58436.2, 300 sec: 58926.9). Total num frames: 8459911168. Throughput: 0: 59275.5. Samples: 1365200040. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:24,254][54587] Avg episode reward: [(0, '0.496')] [2024-04-27 23:59:25,578][54818] Updated weights for policy 0, policy_version 516358 (0.0026) [2024-04-27 23:59:27,385][54818] Updated weights for policy 0, policy_version 516368 (0.0026) [2024-04-27 23:59:29,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8460222464. Throughput: 0: 58891.6. Samples: 1365371140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:29,253][54587] Avg episode reward: [(0, '0.637')] [2024-04-27 23:59:31,081][54818] Updated weights for policy 0, policy_version 516378 (0.0026) [2024-04-27 23:59:32,997][54818] Updated weights for policy 0, policy_version 516388 (0.0026) [2024-04-27 23:59:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8460533760. Throughput: 0: 58574.7. Samples: 1365718140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:34,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-27 23:59:36,366][54818] Updated weights for policy 0, policy_version 516398 (0.0026) [2024-04-27 23:59:38,527][54818] Updated weights for policy 0, policy_version 516408 (0.0026) [2024-04-27 23:59:39,253][54587] Fps is (10 sec: 62258.4, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8460845056. Throughput: 0: 58808.3. Samples: 1366074580. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:39,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-27 23:59:41,939][54818] Updated weights for policy 0, policy_version 516418 (0.0026) [2024-04-27 23:59:43,889][54818] Updated weights for policy 0, policy_version 516428 (0.0026) [2024-04-27 23:59:44,253][54587] Fps is (10 sec: 62259.3, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8461156352. Throughput: 0: 59045.5. Samples: 1366263060. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:44,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-27 23:59:47,438][54818] Updated weights for policy 0, policy_version 516438 (0.0026) [2024-04-27 23:59:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59801.5, 300 sec: 59149.0). Total num frames: 8461467648. Throughput: 0: 58824.8. Samples: 1366614520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:49,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-27 23:59:49,266][54587] No heartbeat for components: RolloutWorker_w4 (9337 seconds) [2024-04-27 23:59:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000516447_8461467648.pth... [2024-04-27 23:59:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000515579_8447246336.pth [2024-04-27 23:59:49,783][54818] Updated weights for policy 0, policy_version 516448 (0.0026) [2024-04-27 23:59:53,063][54818] Updated weights for policy 0, policy_version 516458 (0.0026) [2024-04-27 23:59:54,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8461729792. Throughput: 0: 58716.1. Samples: 1366963360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-04-27 23:59:54,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-27 23:59:55,615][54818] Updated weights for policy 0, policy_version 516468 (0.0026) [2024-04-27 23:59:58,614][54818] Updated weights for policy 0, policy_version 516478 (0.0027) [2024-04-27 23:59:58,615][54798] Signal inference workers to stop experience collection... (21550 times) [2024-04-27 23:59:58,615][54798] Signal inference workers to resume experience collection... 
(21550 times) [2024-04-27 23:59:58,642][54818] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-04-27 23:59:58,642][54818] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-04-27 23:59:59,253][54587] Fps is (10 sec: 55706.3, 60 sec: 59255.6, 300 sec: 59037.9). Total num frames: 8462024704. Throughput: 0: 59393.0. Samples: 1367157500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-27 23:59:59,254][54587] Avg episode reward: [(0, '0.704')] [2024-04-28 00:00:01,620][54818] Updated weights for policy 0, policy_version 516488 (0.0023) [2024-04-28 00:00:03,899][54818] Updated weights for policy 0, policy_version 516498 (0.0026) [2024-04-28 00:00:04,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8462303232. Throughput: 0: 59304.9. Samples: 1367504700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:04,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 00:00:07,213][54818] Updated weights for policy 0, policy_version 516508 (0.0026) [2024-04-28 00:00:09,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59528.7, 300 sec: 59149.0). Total num frames: 8462614528. Throughput: 0: 59149.4. Samples: 1367861760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:09,253][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 00:00:09,445][54818] Updated weights for policy 0, policy_version 516518 (0.0026) [2024-04-28 00:00:12,855][54818] Updated weights for policy 0, policy_version 516528 (0.0026) [2024-04-28 00:00:14,254][54587] Fps is (10 sec: 57342.5, 60 sec: 58709.1, 300 sec: 59037.9). Total num frames: 8462876672. Throughput: 0: 59101.8. Samples: 1368030740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:14,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 00:00:14,874][54818] Updated weights for policy 0, policy_version 516538 (0.0023) [2024-04-28 00:00:18,366][54818] Updated weights for policy 0, policy_version 516548 (0.0027) [2024-04-28 00:00:19,253][54587] Fps is (10 sec: 57343.2, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8463187968. Throughput: 0: 59540.8. Samples: 1368397480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:19,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 00:00:20,402][54818] Updated weights for policy 0, policy_version 516558 (0.0024) [2024-04-28 00:00:23,743][54818] Updated weights for policy 0, policy_version 516568 (0.0025) [2024-04-28 00:00:24,253][54587] Fps is (10 sec: 58984.5, 60 sec: 59255.5, 300 sec: 59038.0). Total num frames: 8463466496. Throughput: 0: 59528.6. Samples: 1368753360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:24,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 00:00:25,793][54818] Updated weights for policy 0, policy_version 516578 (0.0025) [2024-04-28 00:00:29,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.2, 300 sec: 59037.9). Total num frames: 8463761408. Throughput: 0: 58818.0. Samples: 1368909880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:29,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 00:00:29,312][54818] Updated weights for policy 0, policy_version 516588 (0.0024) [2024-04-28 00:00:31,310][54818] Updated weights for policy 0, policy_version 516598 (0.0021) [2024-04-28 00:00:34,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8464056320. Throughput: 0: 59071.7. Samples: 1369272740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:34,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 00:00:34,699][54818] Updated weights for policy 0, policy_version 516608 (0.0024) [2024-04-28 00:00:36,903][54818] Updated weights for policy 0, policy_version 516618 (0.0025) [2024-04-28 00:00:37,055][54798] Signal inference workers to stop experience collection... (21600 times) [2024-04-28 00:00:37,055][54798] Signal inference workers to resume experience collection... (21600 times) [2024-04-28 00:00:37,068][54818] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-04-28 00:00:37,068][54818] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-04-28 00:00:39,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8464351232. Throughput: 0: 59172.9. Samples: 1369626140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:39,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 00:00:40,390][54818] Updated weights for policy 0, policy_version 516628 (0.0026) [2024-04-28 00:00:42,457][54818] Updated weights for policy 0, policy_version 516638 (0.0027) [2024-04-28 00:00:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58436.2, 300 sec: 59093.5). Total num frames: 8464662528. Throughput: 0: 58776.3. Samples: 1369802440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:44,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 00:00:46,018][54818] Updated weights for policy 0, policy_version 516648 (0.0024) [2024-04-28 00:00:47,828][54818] Updated weights for policy 0, policy_version 516658 (0.0027) [2024-04-28 00:00:49,253][54587] Fps is (10 sec: 62258.1, 60 sec: 58436.2, 300 sec: 59093.5). Total num frames: 8464973824. Throughput: 0: 58840.7. Samples: 1370152540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:49,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 00:00:51,334][54818] Updated weights for policy 0, policy_version 516668 (0.0025) [2024-04-28 00:00:53,314][54818] Updated weights for policy 0, policy_version 516678 (0.0025) [2024-04-28 00:00:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8465268736. Throughput: 0: 58608.8. Samples: 1370499160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 00:00:57,040][54818] Updated weights for policy 0, policy_version 516688 (0.0026) [2024-04-28 00:00:58,964][54818] Updated weights for policy 0, policy_version 516698 (0.0026) [2024-04-28 00:00:59,253][54587] Fps is (10 sec: 60621.8, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8465580032. Throughput: 0: 59057.7. Samples: 1370688320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:00:59,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 00:01:02,382][54818] Updated weights for policy 0, policy_version 516708 (0.0024) [2024-04-28 00:01:04,253][54587] Fps is (10 sec: 62258.8, 60 sec: 59801.5, 300 sec: 59093.5). Total num frames: 8465891328. Throughput: 0: 58891.1. Samples: 1371047580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:01:04,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 00:01:04,599][54818] Updated weights for policy 0, policy_version 516718 (0.0028) [2024-04-28 00:01:08,044][54818] Updated weights for policy 0, policy_version 516728 (0.0030) [2024-04-28 00:01:09,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8466169856. Throughput: 0: 58747.9. Samples: 1371397020. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:01:09,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:01:10,500][54818] Updated weights for policy 0, policy_version 516738 (0.0027) [2024-04-28 00:01:13,597][54818] Updated weights for policy 0, policy_version 516748 (0.0024) [2024-04-28 00:01:14,253][54587] Fps is (10 sec: 55706.1, 60 sec: 59528.8, 300 sec: 59037.9). Total num frames: 8466448384. Throughput: 0: 59430.4. Samples: 1371584240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:01:14,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 00:01:15,869][54818] Updated weights for policy 0, policy_version 516758 (0.0026) [2024-04-28 00:01:17,016][54798] Signal inference workers to stop experience collection... (21650 times) [2024-04-28 00:01:17,022][54798] Signal inference workers to resume experience collection... (21650 times) [2024-04-28 00:01:17,034][54818] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-04-28 00:01:17,035][54818] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-04-28 00:01:19,045][54818] Updated weights for policy 0, policy_version 516768 (0.0027) [2024-04-28 00:01:19,253][54587] Fps is (10 sec: 55706.1, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8466726912. Throughput: 0: 59231.6. Samples: 1371938160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:01:19,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 00:01:21,872][54818] Updated weights for policy 0, policy_version 516778 (0.0023) [2024-04-28 00:01:24,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.4, 300 sec: 59038.0). Total num frames: 8467021824. Throughput: 0: 59068.4. Samples: 1372284220. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 00:01:24,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 00:01:24,476][54818] Updated weights for policy 0, policy_version 516788 (0.0026) [2024-04-28 00:01:27,549][54818] Updated weights for policy 0, policy_version 516798 (0.0026) [2024-04-28 00:01:29,253][54587] Fps is (10 sec: 60619.9, 60 sec: 59528.6, 300 sec: 59037.9). Total num frames: 8467333120. Throughput: 0: 59095.5. Samples: 1372461740. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:01:29,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 00:01:29,869][54818] Updated weights for policy 0, policy_version 516808 (0.0026) [2024-04-28 00:01:33,056][54818] Updated weights for policy 0, policy_version 516818 (0.0025) [2024-04-28 00:01:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8467628032. Throughput: 0: 59439.8. Samples: 1372827320. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:01:34,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 00:01:35,342][54818] Updated weights for policy 0, policy_version 516828 (0.0026) [2024-04-28 00:01:38,423][54818] Updated weights for policy 0, policy_version 516838 (0.0024) [2024-04-28 00:01:39,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8467922944. Throughput: 0: 59441.7. Samples: 1373174040. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:01:39,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 00:01:40,889][54818] Updated weights for policy 0, policy_version 516848 (0.0025) [2024-04-28 00:01:43,862][54818] Updated weights for policy 0, policy_version 516858 (0.0025) [2024-04-28 00:01:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8468201472. Throughput: 0: 59131.0. Samples: 1373349220. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:01:44,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-28 00:01:46,450][54818] Updated weights for policy 0, policy_version 516868 (0.0026) [2024-04-28 00:01:49,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8468512768. Throughput: 0: 59029.8. Samples: 1373703920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:01:49,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 00:01:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000516877_8468512768.pth... [2024-04-28 00:01:49,323][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000516011_8454324224.pth [2024-04-28 00:01:49,438][54818] Updated weights for policy 0, policy_version 516878 (0.0026) [2024-04-28 00:01:51,979][54818] Updated weights for policy 0, policy_version 516888 (0.0026) [2024-04-28 00:01:54,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8468791296. Throughput: 0: 59322.3. Samples: 1374066520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:01:54,253][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 00:01:54,864][54818] Updated weights for policy 0, policy_version 516898 (0.0025) [2024-04-28 00:01:57,444][54818] Updated weights for policy 0, policy_version 516908 (0.0026) [2024-04-28 00:01:59,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8469102592. Throughput: 0: 58986.6. Samples: 1374238640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:01:59,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 00:02:00,597][54818] Updated weights for policy 0, policy_version 516918 (0.0024) [2024-04-28 00:02:03,036][54818] Updated weights for policy 0, policy_version 516928 (0.0025) [2024-04-28 00:02:04,253][54587] Fps is (10 sec: 62258.9, 60 sec: 58709.4, 300 sec: 59093.5). 
Total num frames: 8469413888. Throughput: 0: 59013.7. Samples: 1374593780. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:04,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 00:02:05,876][54818] Updated weights for policy 0, policy_version 516938 (0.0024) [2024-04-28 00:02:06,303][54798] Signal inference workers to stop experience collection... (21700 times) [2024-04-28 00:02:06,303][54798] Signal inference workers to resume experience collection... (21700 times) [2024-04-28 00:02:06,315][54818] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-04-28 00:02:06,333][54818] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-04-28 00:02:08,651][54818] Updated weights for policy 0, policy_version 516948 (0.0026) [2024-04-28 00:02:09,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8469708800. Throughput: 0: 59248.1. Samples: 1374950380. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:09,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 00:02:11,389][54818] Updated weights for policy 0, policy_version 516958 (0.0026) [2024-04-28 00:02:14,034][54818] Updated weights for policy 0, policy_version 516968 (0.0026) [2024-04-28 00:02:14,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8470003712. Throughput: 0: 59345.1. Samples: 1375132260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:14,253][54587] Avg episode reward: [(0, '0.677')] [2024-04-28 00:02:17,021][54818] Updated weights for policy 0, policy_version 516978 (0.0025) [2024-04-28 00:02:19,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59528.4, 300 sec: 59093.5). Total num frames: 8470298624. Throughput: 0: 59035.5. Samples: 1375483920. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:19,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 00:02:19,468][54818] Updated weights for policy 0, policy_version 516988 (0.0027) [2024-04-28 00:02:22,448][54818] Updated weights for policy 0, policy_version 516998 (0.0026) [2024-04-28 00:02:24,253][54587] Fps is (10 sec: 58981.5, 60 sec: 59528.4, 300 sec: 59038.0). Total num frames: 8470593536. Throughput: 0: 59186.7. Samples: 1375837440. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:24,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-28 00:02:25,045][54818] Updated weights for policy 0, policy_version 517008 (0.0026) [2024-04-28 00:02:27,901][54818] Updated weights for policy 0, policy_version 517018 (0.0025) [2024-04-28 00:02:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.6, 300 sec: 59093.5). Total num frames: 8470904832. Throughput: 0: 59356.1. Samples: 1376020240. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:29,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 00:02:30,642][54818] Updated weights for policy 0, policy_version 517028 (0.0028) [2024-04-28 00:02:33,446][54818] Updated weights for policy 0, policy_version 517038 (0.0025) [2024-04-28 00:02:34,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8471199744. Throughput: 0: 59438.7. Samples: 1376378660. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:34,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 00:02:36,122][54818] Updated weights for policy 0, policy_version 517048 (0.0026) [2024-04-28 00:02:38,960][54818] Updated weights for policy 0, policy_version 517058 (0.0027) [2024-04-28 00:02:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59528.7, 300 sec: 59149.0). Total num frames: 8471494656. Throughput: 0: 59196.4. Samples: 1376730360. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:39,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 00:02:41,913][54818] Updated weights for policy 0, policy_version 517068 (0.0025) [2024-04-28 00:02:44,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59801.7, 300 sec: 59260.1). Total num frames: 8471789568. Throughput: 0: 59277.5. Samples: 1376906120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:44,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 00:02:44,433][54818] Updated weights for policy 0, policy_version 517078 (0.0025) [2024-04-28 00:02:47,365][54818] Updated weights for policy 0, policy_version 517088 (0.0025) [2024-04-28 00:02:49,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8472051712. Throughput: 0: 59383.6. Samples: 1377266040. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-04-28 00:02:49,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 00:02:49,262][54587] No heartbeat for components: RolloutWorker_w4 (9517 seconds) [2024-04-28 00:02:49,768][54818] Updated weights for policy 0, policy_version 517098 (0.0022) [2024-04-28 00:02:52,813][54818] Updated weights for policy 0, policy_version 517108 (0.0026) [2024-04-28 00:02:54,253][54587] Fps is (10 sec: 57343.5, 60 sec: 59528.4, 300 sec: 59204.6). Total num frames: 8472363008. Throughput: 0: 59536.3. Samples: 1377629520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:02:54,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 00:02:55,421][54818] Updated weights for policy 0, policy_version 517118 (0.0026) [2024-04-28 00:02:56,824][54798] Signal inference workers to stop experience collection... (21750 times) [2024-04-28 00:02:56,825][54798] Signal inference workers to resume experience collection... 
(21750 times) [2024-04-28 00:02:56,841][54818] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-04-28 00:02:56,841][54818] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-04-28 00:02:58,759][54818] Updated weights for policy 0, policy_version 517128 (0.0026) [2024-04-28 00:02:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8472657920. Throughput: 0: 59228.3. Samples: 1377797540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:02:59,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 00:03:01,059][54818] Updated weights for policy 0, policy_version 517138 (0.0026) [2024-04-28 00:03:04,224][54818] Updated weights for policy 0, policy_version 517148 (0.0026) [2024-04-28 00:03:04,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8472952832. Throughput: 0: 59232.1. Samples: 1378149360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:04,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 00:03:06,708][54818] Updated weights for policy 0, policy_version 517158 (0.0026) [2024-04-28 00:03:09,254][54587] Fps is (10 sec: 58976.7, 60 sec: 58981.4, 300 sec: 59148.8). Total num frames: 8473247744. Throughput: 0: 59238.8. Samples: 1378503240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:09,255][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 00:03:09,727][54818] Updated weights for policy 0, policy_version 517168 (0.0026) [2024-04-28 00:03:12,296][54818] Updated weights for policy 0, policy_version 517178 (0.0026) [2024-04-28 00:03:14,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8473559040. Throughput: 0: 59128.4. Samples: 1378681020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:14,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 00:03:15,314][54818] Updated weights for policy 0, policy_version 517188 (0.0025) [2024-04-28 00:03:17,862][54818] Updated weights for policy 0, policy_version 517198 (0.0026) [2024-04-28 00:03:19,253][54587] Fps is (10 sec: 60626.4, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8473853952. Throughput: 0: 59262.2. Samples: 1379045460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:19,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 00:03:20,757][54818] Updated weights for policy 0, policy_version 517208 (0.0026) [2024-04-28 00:03:23,479][54818] Updated weights for policy 0, policy_version 517218 (0.0026) [2024-04-28 00:03:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8474148864. Throughput: 0: 59319.1. Samples: 1379399720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:24,253][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 00:03:26,086][54818] Updated weights for policy 0, policy_version 517228 (0.0026) [2024-04-28 00:03:29,002][54818] Updated weights for policy 0, policy_version 517238 (0.0026) [2024-04-28 00:03:29,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8474427392. Throughput: 0: 59265.7. Samples: 1379573080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:29,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 00:03:31,611][54818] Updated weights for policy 0, policy_version 517248 (0.0027) [2024-04-28 00:03:34,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8474738688. Throughput: 0: 59003.4. Samples: 1379921200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:34,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 00:03:34,435][54818] Updated weights for policy 0, policy_version 517258 (0.0026) [2024-04-28 00:03:37,031][54818] Updated weights for policy 0, policy_version 517268 (0.0028) [2024-04-28 00:03:39,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8475017216. Throughput: 0: 58959.8. Samples: 1380282700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:39,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 00:03:39,945][54818] Updated weights for policy 0, policy_version 517278 (0.0028) [2024-04-28 00:03:42,521][54818] Updated weights for policy 0, policy_version 517288 (0.0026) [2024-04-28 00:03:44,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.2, 300 sec: 59149.0). Total num frames: 8475328512. Throughput: 0: 59259.4. Samples: 1380464220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:44,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 00:03:45,540][54818] Updated weights for policy 0, policy_version 517298 (0.0025) [2024-04-28 00:03:48,070][54818] Updated weights for policy 0, policy_version 517308 (0.0026) [2024-04-28 00:03:49,253][54587] Fps is (10 sec: 62257.7, 60 sec: 59801.5, 300 sec: 59204.5). Total num frames: 8475639808. Throughput: 0: 59378.0. Samples: 1380821380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:49,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 00:03:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000517312_8475639808.pth... 
[2024-04-28 00:03:49,324][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000516447_8461467648.pth [2024-04-28 00:03:51,056][54818] Updated weights for policy 0, policy_version 517318 (0.0025) [2024-04-28 00:03:53,568][54818] Updated weights for policy 0, policy_version 517328 (0.0026) [2024-04-28 00:03:54,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8475934720. Throughput: 0: 59199.0. Samples: 1381167140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:54,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 00:03:56,593][54818] Updated weights for policy 0, policy_version 517338 (0.0025) [2024-04-28 00:03:59,018][54818] Updated weights for policy 0, policy_version 517348 (0.0026) [2024-04-28 00:03:59,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8476229632. Throughput: 0: 59318.7. Samples: 1381350360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:03:59,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 00:04:02,090][54818] Updated weights for policy 0, policy_version 517358 (0.0026) [2024-04-28 00:04:02,907][54798] Signal inference workers to stop experience collection... (21800 times) [2024-04-28 00:04:02,919][54818] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-04-28 00:04:02,969][54798] Signal inference workers to resume experience collection... (21800 times) [2024-04-28 00:04:02,969][54818] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-04-28 00:04:04,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8476524544. Throughput: 0: 59077.8. Samples: 1381703960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:04:04,254][54587] Avg episode reward: [(0, '0.500')] [2024-04-28 00:04:04,497][54818] Updated weights for policy 0, policy_version 517368 (0.0023) [2024-04-28 00:04:07,539][54818] Updated weights for policy 0, policy_version 517378 (0.0025) [2024-04-28 00:04:09,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59256.4, 300 sec: 59149.0). Total num frames: 8476803072. Throughput: 0: 59013.7. Samples: 1382055340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:04:09,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 00:04:10,090][54818] Updated weights for policy 0, policy_version 517388 (0.0025) [2024-04-28 00:04:13,123][54818] Updated weights for policy 0, policy_version 517398 (0.0026) [2024-04-28 00:04:14,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8477114368. Throughput: 0: 59276.1. Samples: 1382240500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:04:14,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 00:04:15,631][54818] Updated weights for policy 0, policy_version 517408 (0.0026) [2024-04-28 00:04:18,720][54818] Updated weights for policy 0, policy_version 517418 (0.0026) [2024-04-28 00:04:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.4, 300 sec: 59315.6). Total num frames: 8477409280. Throughput: 0: 59573.8. Samples: 1382602020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 00:04:19,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 00:04:21,238][54818] Updated weights for policy 0, policy_version 517428 (0.0026) [2024-04-28 00:04:24,125][54818] Updated weights for policy 0, policy_version 517438 (0.0026) [2024-04-28 00:04:24,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8477704192. Throughput: 0: 59135.8. Samples: 1382943820. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:04:24,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 00:04:27,139][54818] Updated weights for policy 0, policy_version 517448 (0.0026) [2024-04-28 00:04:29,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8477982720. Throughput: 0: 59030.4. Samples: 1383120580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:04:29,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 00:04:29,773][54818] Updated weights for policy 0, policy_version 517458 (0.0026) [2024-04-28 00:04:32,584][54818] Updated weights for policy 0, policy_version 517468 (0.0026) [2024-04-28 00:04:34,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8478294016. Throughput: 0: 59045.4. Samples: 1383478420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:04:34,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 00:04:35,257][54818] Updated weights for policy 0, policy_version 517478 (0.0025) [2024-04-28 00:04:38,393][54818] Updated weights for policy 0, policy_version 517488 (0.0025) [2024-04-28 00:04:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.3, 300 sec: 59037.9). Total num frames: 8478572544. Throughput: 0: 59368.1. Samples: 1383838700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:04:39,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 00:04:40,851][54818] Updated weights for policy 0, policy_version 517498 (0.0030) [2024-04-28 00:04:43,999][54818] Updated weights for policy 0, policy_version 517508 (0.0026) [2024-04-28 00:04:44,254][54587] Fps is (10 sec: 55704.7, 60 sec: 58709.3, 300 sec: 58926.8). Total num frames: 8478851072. Throughput: 0: 58964.1. Samples: 1384003760. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:04:44,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 00:04:46,290][54818] Updated weights for policy 0, policy_version 517518 (0.0026) [2024-04-28 00:04:49,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8479162368. Throughput: 0: 59003.5. Samples: 1384359120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:04:49,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-28 00:04:49,524][54818] Updated weights for policy 0, policy_version 517528 (0.0027) [2024-04-28 00:04:51,884][54818] Updated weights for policy 0, policy_version 517538 (0.0024) [2024-04-28 00:04:54,253][54587] Fps is (10 sec: 60622.1, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8479457280. Throughput: 0: 59236.0. Samples: 1384720960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:04:54,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 00:04:55,087][54818] Updated weights for policy 0, policy_version 517548 (0.0024) [2024-04-28 00:04:57,442][54818] Updated weights for policy 0, policy_version 517558 (0.0026) [2024-04-28 00:04:57,625][54798] Signal inference workers to stop experience collection... (21850 times) [2024-04-28 00:04:57,655][54818] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-04-28 00:04:57,683][54798] Signal inference workers to resume experience collection... (21850 times) [2024-04-28 00:04:57,683][54818] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-04-28 00:04:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8479768576. Throughput: 0: 59073.3. Samples: 1384898800. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:04:59,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:05:00,568][54818] Updated weights for policy 0, policy_version 517568 (0.0026) [2024-04-28 00:05:02,994][54818] Updated weights for policy 0, policy_version 517578 (0.0027) [2024-04-28 00:05:04,253][54587] Fps is (10 sec: 62258.1, 60 sec: 59255.3, 300 sec: 59204.5). Total num frames: 8480079872. Throughput: 0: 58953.2. Samples: 1385254920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:05:04,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 00:05:05,968][54818] Updated weights for policy 0, policy_version 517588 (0.0026) [2024-04-28 00:05:08,582][54818] Updated weights for policy 0, policy_version 517598 (0.0022) [2024-04-28 00:05:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 59801.6, 300 sec: 59371.2). Total num frames: 8480391168. Throughput: 0: 59160.0. Samples: 1385606020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:05:09,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 00:05:11,479][54818] Updated weights for policy 0, policy_version 517608 (0.0025) [2024-04-28 00:05:14,124][54818] Updated weights for policy 0, policy_version 517618 (0.0025) [2024-04-28 00:05:14,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.3, 300 sec: 59260.1). Total num frames: 8480669696. Throughput: 0: 59232.7. Samples: 1385786060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:05:14,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-28 00:05:17,049][54818] Updated weights for policy 0, policy_version 517628 (0.0026) [2024-04-28 00:05:19,253][54587] Fps is (10 sec: 55705.7, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8480948224. Throughput: 0: 59077.9. Samples: 1386136920. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:05:19,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 00:05:19,433][54818] Updated weights for policy 0, policy_version 517638 (0.0023) [2024-04-28 00:05:22,522][54818] Updated weights for policy 0, policy_version 517648 (0.0025) [2024-04-28 00:05:24,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.3, 300 sec: 59260.1). Total num frames: 8481243136. Throughput: 0: 58982.6. Samples: 1386492920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:05:24,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:05:24,867][54818] Updated weights for policy 0, policy_version 517658 (0.0024) [2024-04-28 00:05:27,833][54818] Updated weights for policy 0, policy_version 517668 (0.0025) [2024-04-28 00:05:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8481538048. Throughput: 0: 59482.6. Samples: 1386680460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:05:29,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 00:05:30,337][54818] Updated weights for policy 0, policy_version 517678 (0.0022) [2024-04-28 00:05:33,358][54818] Updated weights for policy 0, policy_version 517688 (0.0026) [2024-04-28 00:05:34,253][54587] Fps is (10 sec: 62259.3, 60 sec: 59528.5, 300 sec: 59371.2). Total num frames: 8481865728. Throughput: 0: 59573.4. Samples: 1387039920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:05:34,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 00:05:35,906][54818] Updated weights for policy 0, policy_version 517698 (0.0026) [2024-04-28 00:05:38,890][54818] Updated weights for policy 0, policy_version 517708 (0.0026) [2024-04-28 00:05:39,253][54587] Fps is (10 sec: 62258.1, 60 sec: 59801.5, 300 sec: 59315.6). Total num frames: 8482160640. Throughput: 0: 59308.3. Samples: 1387389840. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:05:39,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 00:05:41,467][54818] Updated weights for policy 0, policy_version 517718 (0.0025) [2024-04-28 00:05:44,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59801.7, 300 sec: 59204.6). Total num frames: 8482439168. Throughput: 0: 59188.3. Samples: 1387562280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-04-28 00:05:44,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 00:05:44,337][54818] Updated weights for policy 0, policy_version 517728 (0.0026) [2024-04-28 00:05:47,191][54818] Updated weights for policy 0, policy_version 517738 (0.0025) [2024-04-28 00:05:49,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59528.5, 300 sec: 59204.5). Total num frames: 8482734080. Throughput: 0: 59237.0. Samples: 1387920580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:05:49,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:05:49,263][54587] No heartbeat for components: RolloutWorker_w4 (9697 seconds) [2024-04-28 00:05:49,371][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000517746_8482750464.pth... [2024-04-28 00:05:49,434][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000516877_8468512768.pth [2024-04-28 00:05:49,780][54818] Updated weights for policy 0, policy_version 517748 (0.0023) [2024-04-28 00:05:51,522][54798] Signal inference workers to stop experience collection... (21900 times) [2024-04-28 00:05:51,542][54818] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-04-28 00:05:51,580][54798] Signal inference workers to resume experience collection... 
(21900 times) [2024-04-28 00:05:51,580][54818] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-04-28 00:05:52,754][54818] Updated weights for policy 0, policy_version 517758 (0.0024) [2024-04-28 00:05:54,254][54587] Fps is (10 sec: 57343.6, 60 sec: 59255.3, 300 sec: 59093.4). Total num frames: 8483012608. Throughput: 0: 59418.8. Samples: 1388279880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:05:54,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 00:05:55,265][54818] Updated weights for policy 0, policy_version 517768 (0.0026) [2024-04-28 00:05:58,400][54818] Updated weights for policy 0, policy_version 517778 (0.0026) [2024-04-28 00:05:59,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8483307520. Throughput: 0: 59248.9. Samples: 1388452260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:05:59,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:06:00,906][54818] Updated weights for policy 0, policy_version 517788 (0.0026) [2024-04-28 00:06:03,840][54818] Updated weights for policy 0, policy_version 517798 (0.0025) [2024-04-28 00:06:04,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 59093.4). Total num frames: 8483602432. Throughput: 0: 59246.8. Samples: 1388803040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:04,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 00:06:06,343][54818] Updated weights for policy 0, policy_version 517808 (0.0022) [2024-04-28 00:06:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58709.3, 300 sec: 59204.6). Total num frames: 8483913728. Throughput: 0: 59253.4. Samples: 1389159320. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:09,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 00:06:09,634][54818] Updated weights for policy 0, policy_version 517818 (0.0024) [2024-04-28 00:06:12,193][54818] Updated weights for policy 0, policy_version 517828 (0.0026) [2024-04-28 00:06:14,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8484208640. Throughput: 0: 58862.1. Samples: 1389329260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:14,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 00:06:15,208][54818] Updated weights for policy 0, policy_version 517838 (0.0025) [2024-04-28 00:06:17,913][54818] Updated weights for policy 0, policy_version 517848 (0.0026) [2024-04-28 00:06:19,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8484503552. Throughput: 0: 58840.9. Samples: 1389687760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:19,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 00:06:20,749][54818] Updated weights for policy 0, policy_version 517858 (0.0026) [2024-04-28 00:06:23,523][54818] Updated weights for policy 0, policy_version 517868 (0.0026) [2024-04-28 00:06:24,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8484798464. Throughput: 0: 58975.1. Samples: 1390043720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:24,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 00:06:26,308][54818] Updated weights for policy 0, policy_version 517878 (0.0025) [2024-04-28 00:06:29,061][54818] Updated weights for policy 0, policy_version 517888 (0.0027) [2024-04-28 00:06:29,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8485076992. Throughput: 0: 59062.8. Samples: 1390220100. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:29,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 00:06:31,730][54818] Updated weights for policy 0, policy_version 517898 (0.0025) [2024-04-28 00:06:34,253][54587] Fps is (10 sec: 57344.9, 60 sec: 58436.4, 300 sec: 59149.0). Total num frames: 8485371904. Throughput: 0: 58895.7. Samples: 1390570880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:34,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 00:06:34,529][54818] Updated weights for policy 0, policy_version 517908 (0.0023) [2024-04-28 00:06:37,225][54818] Updated weights for policy 0, policy_version 517918 (0.0026) [2024-04-28 00:06:39,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58436.4, 300 sec: 59204.6). Total num frames: 8485666816. Throughput: 0: 58851.4. Samples: 1390928180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:39,253][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 00:06:39,981][54818] Updated weights for policy 0, policy_version 517928 (0.0026) [2024-04-28 00:06:42,789][54818] Updated weights for policy 0, policy_version 517938 (0.0023) [2024-04-28 00:06:44,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8485961728. Throughput: 0: 59059.6. Samples: 1391109940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:44,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 00:06:45,470][54818] Updated weights for policy 0, policy_version 517948 (0.0026) [2024-04-28 00:06:48,109][54818] Updated weights for policy 0, policy_version 517958 (0.0026) [2024-04-28 00:06:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 58982.4, 300 sec: 59260.1). Total num frames: 8486273024. Throughput: 0: 59113.5. Samples: 1391463140. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:49,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 00:06:51,018][54818] Updated weights for policy 0, policy_version 517968 (0.0026) [2024-04-28 00:06:52,369][54798] Signal inference workers to stop experience collection... (21950 times) [2024-04-28 00:06:52,405][54818] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-04-28 00:06:52,437][54798] Signal inference workers to resume experience collection... (21950 times) [2024-04-28 00:06:52,437][54818] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-04-28 00:06:53,651][54818] Updated weights for policy 0, policy_version 517978 (0.0026) [2024-04-28 00:06:54,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59528.7, 300 sec: 59260.1). Total num frames: 8486584320. Throughput: 0: 58859.1. Samples: 1391807980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:54,254][54587] Avg episode reward: [(0, '0.679')] [2024-04-28 00:06:56,613][54818] Updated weights for policy 0, policy_version 517988 (0.0026) [2024-04-28 00:06:59,199][54818] Updated weights for policy 0, policy_version 517998 (0.0026) [2024-04-28 00:06:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59528.6, 300 sec: 59204.5). Total num frames: 8486879232. Throughput: 0: 59038.6. Samples: 1391986000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:06:59,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 00:07:02,383][54818] Updated weights for policy 0, policy_version 518008 (0.0027) [2024-04-28 00:07:04,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.7, 300 sec: 59149.0). Total num frames: 8487157760. Throughput: 0: 59101.9. Samples: 1392347340. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:07:04,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-28 00:07:04,666][54818] Updated weights for policy 0, policy_version 518018 (0.0025) [2024-04-28 00:07:07,976][54818] Updated weights for policy 0, policy_version 518028 (0.0026) [2024-04-28 00:07:09,253][54587] Fps is (10 sec: 55706.0, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8487436288. Throughput: 0: 59028.6. Samples: 1392700000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:07:09,253][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 00:07:10,158][54818] Updated weights for policy 0, policy_version 518038 (0.0023) [2024-04-28 00:07:13,490][54818] Updated weights for policy 0, policy_version 518048 (0.0025) [2024-04-28 00:07:14,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8487731200. Throughput: 0: 59139.9. Samples: 1392881400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 00:07:14,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 00:07:15,668][54818] Updated weights for policy 0, policy_version 518058 (0.0024) [2024-04-28 00:07:18,957][54818] Updated weights for policy 0, policy_version 518068 (0.0024) [2024-04-28 00:07:19,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8488026112. Throughput: 0: 59146.0. Samples: 1393232460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:07:19,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 00:07:21,987][54818] Updated weights for policy 0, policy_version 518078 (0.0025) [2024-04-28 00:07:24,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8488321024. Throughput: 0: 59017.7. Samples: 1393583980. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:07:24,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 00:07:24,605][54818] Updated weights for policy 0, policy_version 518088 (0.0026) [2024-04-28 00:07:27,502][54818] Updated weights for policy 0, policy_version 518098 (0.0029) [2024-04-28 00:07:29,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8488632320. Throughput: 0: 58786.8. Samples: 1393755340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:07:29,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 00:07:29,938][54818] Updated weights for policy 0, policy_version 518108 (0.0026) [2024-04-28 00:07:33,104][54818] Updated weights for policy 0, policy_version 518118 (0.0026) [2024-04-28 00:07:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8488927232. Throughput: 0: 59133.4. Samples: 1394124140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:07:34,254][54587] Avg episode reward: [(0, '0.494')] [2024-04-28 00:07:35,544][54818] Updated weights for policy 0, policy_version 518128 (0.0023) [2024-04-28 00:07:35,843][54798] Signal inference workers to stop experience collection... (22000 times) [2024-04-28 00:07:35,844][54798] Signal inference workers to resume experience collection... (22000 times) [2024-04-28 00:07:35,873][54818] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-04-28 00:07:35,873][54818] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-04-28 00:07:38,514][54818] Updated weights for policy 0, policy_version 518138 (0.0026) [2024-04-28 00:07:39,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8489205760. Throughput: 0: 59266.3. Samples: 1394474960. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:07:39,253][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 00:07:41,084][54818] Updated weights for policy 0, policy_version 518148 (0.0025) [2024-04-28 00:07:44,058][54818] Updated weights for policy 0, policy_version 518158 (0.0026) [2024-04-28 00:07:44,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8489500672. Throughput: 0: 59156.8. Samples: 1394648060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:07:44,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 00:07:46,563][54818] Updated weights for policy 0, policy_version 518168 (0.0026) [2024-04-28 00:07:49,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8489795584. Throughput: 0: 58890.7. Samples: 1394997420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:07:49,253][54587] Avg episode reward: [(0, '0.675')] [2024-04-28 00:07:49,303][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000518177_8489811968.pth... [2024-04-28 00:07:49,357][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000517312_8475639808.pth [2024-04-28 00:07:49,802][54818] Updated weights for policy 0, policy_version 518178 (0.0026) [2024-04-28 00:07:52,048][54818] Updated weights for policy 0, policy_version 518188 (0.0024) [2024-04-28 00:07:54,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58436.2, 300 sec: 59093.5). Total num frames: 8490090496. Throughput: 0: 59053.2. Samples: 1395357400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:07:54,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 00:07:55,086][54818] Updated weights for policy 0, policy_version 518198 (0.0026) [2024-04-28 00:07:57,476][54818] Updated weights for policy 0, policy_version 518208 (0.0023) [2024-04-28 00:07:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58709.4, 300 sec: 59149.0). 
Total num frames: 8490401792. Throughput: 0: 59113.8. Samples: 1395541520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:07:59,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 00:08:00,668][54818] Updated weights for policy 0, policy_version 518218 (0.0025) [2024-04-28 00:08:03,144][54818] Updated weights for policy 0, policy_version 518228 (0.0025) [2024-04-28 00:08:04,253][54587] Fps is (10 sec: 60621.8, 60 sec: 58982.5, 300 sec: 59149.2). Total num frames: 8490696704. Throughput: 0: 58946.9. Samples: 1395885060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:08:04,253][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 00:08:06,200][54818] Updated weights for policy 0, policy_version 518238 (0.0025) [2024-04-28 00:08:08,611][54818] Updated weights for policy 0, policy_version 518248 (0.0024) [2024-04-28 00:08:09,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8491008000. Throughput: 0: 58975.0. Samples: 1396237860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:08:09,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 00:08:11,757][54818] Updated weights for policy 0, policy_version 518258 (0.0026) [2024-04-28 00:08:14,179][54818] Updated weights for policy 0, policy_version 518268 (0.0026) [2024-04-28 00:08:14,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8491302912. Throughput: 0: 59312.4. Samples: 1396424400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:08:14,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 00:08:17,308][54818] Updated weights for policy 0, policy_version 518278 (0.0025) [2024-04-28 00:08:19,254][54587] Fps is (10 sec: 58981.6, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8491597824. Throughput: 0: 58801.9. Samples: 1396770240. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:08:19,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 00:08:19,821][54818] Updated weights for policy 0, policy_version 518288 (0.0026) [2024-04-28 00:08:22,960][54818] Updated weights for policy 0, policy_version 518298 (0.0026) [2024-04-28 00:08:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8491892736. Throughput: 0: 58847.3. Samples: 1397123100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:08:24,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 00:08:25,376][54818] Updated weights for policy 0, policy_version 518308 (0.0028) [2024-04-28 00:08:28,395][54818] Updated weights for policy 0, policy_version 518318 (0.0026) [2024-04-28 00:08:28,740][54798] Signal inference workers to stop experience collection... (22050 times) [2024-04-28 00:08:28,741][54798] Signal inference workers to resume experience collection... (22050 times) [2024-04-28 00:08:28,754][54818] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-04-28 00:08:28,754][54818] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-04-28 00:08:29,253][54587] Fps is (10 sec: 57345.6, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8492171264. Throughput: 0: 58970.0. Samples: 1397301700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:08:29,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 00:08:30,789][54818] Updated weights for policy 0, policy_version 518328 (0.0026) [2024-04-28 00:08:33,791][54818] Updated weights for policy 0, policy_version 518338 (0.0024) [2024-04-28 00:08:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8492482560. Throughput: 0: 59185.2. Samples: 1397660760. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:08:34,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 00:08:36,729][54818] Updated weights for policy 0, policy_version 518348 (0.0026) [2024-04-28 00:08:39,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.3, 300 sec: 59038.0). Total num frames: 8492744704. Throughput: 0: 59005.8. Samples: 1398012660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:08:39,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 00:08:39,381][54818] Updated weights for policy 0, policy_version 518358 (0.0029) [2024-04-28 00:08:42,251][54818] Updated weights for policy 0, policy_version 518368 (0.0025) [2024-04-28 00:08:44,253][54587] Fps is (10 sec: 55706.3, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8493039616. Throughput: 0: 58551.7. Samples: 1398176340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 00:08:44,253][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 00:08:44,955][54818] Updated weights for policy 0, policy_version 518378 (0.0026) [2024-04-28 00:08:47,743][54818] Updated weights for policy 0, policy_version 518388 (0.0026) [2024-04-28 00:08:49,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8493334528. Throughput: 0: 58946.6. Samples: 1398537660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:08:49,253][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:08:49,259][54587] No heartbeat for components: RolloutWorker_w4 (9877 seconds) [2024-04-28 00:08:50,471][54818] Updated weights for policy 0, policy_version 518398 (0.0026) [2024-04-28 00:08:53,499][54818] Updated weights for policy 0, policy_version 518408 (0.0026) [2024-04-28 00:08:54,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8493629440. Throughput: 0: 59167.7. Samples: 1398900400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:08:54,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 00:08:56,138][54818] Updated weights for policy 0, policy_version 518418 (0.0025) [2024-04-28 00:08:59,146][54818] Updated weights for policy 0, policy_version 518428 (0.0022) [2024-04-28 00:08:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8493924352. Throughput: 0: 58735.7. Samples: 1399067500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:08:59,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:09:01,559][54818] Updated weights for policy 0, policy_version 518438 (0.0030) [2024-04-28 00:09:04,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8494235648. Throughput: 0: 58886.9. Samples: 1399420140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:04,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 00:09:04,632][54818] Updated weights for policy 0, policy_version 518448 (0.0025) [2024-04-28 00:09:07,059][54818] Updated weights for policy 0, policy_version 518458 (0.0026) [2024-04-28 00:09:09,253][54587] Fps is (10 sec: 60619.9, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8494530560. Throughput: 0: 59014.3. Samples: 1399778740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:09,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-28 00:09:10,118][54818] Updated weights for policy 0, policy_version 518468 (0.0025) [2024-04-28 00:09:12,576][54818] Updated weights for policy 0, policy_version 518478 (0.0026) [2024-04-28 00:09:14,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 59038.0). Total num frames: 8494825472. Throughput: 0: 59071.0. Samples: 1399959900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:14,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 00:09:15,338][54798] Signal inference workers to stop experience collection... (22100 times) [2024-04-28 00:09:15,343][54798] Signal inference workers to resume experience collection... (22100 times) [2024-04-28 00:09:15,361][54818] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-04-28 00:09:15,361][54818] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-04-28 00:09:15,598][54818] Updated weights for policy 0, policy_version 518488 (0.0026) [2024-04-28 00:09:18,022][54818] Updated weights for policy 0, policy_version 518498 (0.0024) [2024-04-28 00:09:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8495136768. Throughput: 0: 59008.0. Samples: 1400316120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:19,254][54587] Avg episode reward: [(0, '0.703')] [2024-04-28 00:09:21,203][54818] Updated weights for policy 0, policy_version 518508 (0.0025) [2024-04-28 00:09:23,593][54818] Updated weights for policy 0, policy_version 518518 (0.0025) [2024-04-28 00:09:24,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58982.6, 300 sec: 59149.0). Total num frames: 8495431680. Throughput: 0: 58752.2. Samples: 1400656500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:24,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 00:09:26,749][54818] Updated weights for policy 0, policy_version 518528 (0.0026) [2024-04-28 00:09:29,208][54818] Updated weights for policy 0, policy_version 518538 (0.0024) [2024-04-28 00:09:29,253][54587] Fps is (10 sec: 58983.4, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8495726592. Throughput: 0: 59274.2. Samples: 1400843680. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:29,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 00:09:32,283][54818] Updated weights for policy 0, policy_version 518548 (0.0026) [2024-04-28 00:09:34,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8496021504. Throughput: 0: 59142.2. Samples: 1401199060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:34,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 00:09:34,578][54818] Updated weights for policy 0, policy_version 518558 (0.0027) [2024-04-28 00:09:37,696][54818] Updated weights for policy 0, policy_version 518568 (0.0024) [2024-04-28 00:09:39,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59255.5, 300 sec: 59149.1). Total num frames: 8496300032. Throughput: 0: 58840.8. Samples: 1401548240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:39,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 00:09:40,426][54818] Updated weights for policy 0, policy_version 518578 (0.0024) [2024-04-28 00:09:43,539][54818] Updated weights for policy 0, policy_version 518588 (0.0026) [2024-04-28 00:09:44,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8496594944. Throughput: 0: 59027.4. Samples: 1401723740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:44,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 00:09:46,039][54818] Updated weights for policy 0, policy_version 518598 (0.0025) [2024-04-28 00:09:49,192][54818] Updated weights for policy 0, policy_version 518608 (0.0026) [2024-04-28 00:09:49,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8496873472. Throughput: 0: 59040.4. Samples: 1402076960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:49,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 00:09:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000518608_8496873472.pth... [2024-04-28 00:09:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000517746_8482750464.pth [2024-04-28 00:09:51,599][54818] Updated weights for policy 0, policy_version 518618 (0.0025) [2024-04-28 00:09:54,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8497152000. Throughput: 0: 58949.0. Samples: 1402431440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:54,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 00:09:54,727][54818] Updated weights for policy 0, policy_version 518628 (0.0026) [2024-04-28 00:09:57,164][54818] Updated weights for policy 0, policy_version 518638 (0.0026) [2024-04-28 00:09:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.3, 300 sec: 58982.4). Total num frames: 8497479680. Throughput: 0: 58707.0. Samples: 1402601720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:09:59,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 00:10:00,144][54818] Updated weights for policy 0, policy_version 518648 (0.0025) [2024-04-28 00:10:02,553][54818] Updated weights for policy 0, policy_version 518658 (0.0025) [2024-04-28 00:10:04,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58709.3, 300 sec: 58871.3). Total num frames: 8497758208. Throughput: 0: 58612.0. Samples: 1402953660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:10:04,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 00:10:05,299][54798] Signal inference workers to stop experience collection... (22150 times) [2024-04-28 00:10:05,302][54798] Signal inference workers to resume experience collection... 
(22150 times) [2024-04-28 00:10:05,314][54818] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-04-28 00:10:05,314][54818] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-04-28 00:10:05,552][54818] Updated weights for policy 0, policy_version 518668 (0.0026) [2024-04-28 00:10:08,293][54818] Updated weights for policy 0, policy_version 518678 (0.0028) [2024-04-28 00:10:09,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.3, 300 sec: 58926.9). Total num frames: 8498053120. Throughput: 0: 59013.1. Samples: 1403312100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 00:10:09,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-28 00:10:11,104][54818] Updated weights for policy 0, policy_version 518688 (0.0026) [2024-04-28 00:10:13,887][54818] Updated weights for policy 0, policy_version 518698 (0.0026) [2024-04-28 00:10:14,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8498348032. Throughput: 0: 58628.9. Samples: 1403481980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:14,253][54587] Avg episode reward: [(0, '0.517')] [2024-04-28 00:10:16,534][54818] Updated weights for policy 0, policy_version 518708 (0.0028) [2024-04-28 00:10:19,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8498659328. Throughput: 0: 58761.3. Samples: 1403843320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:19,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 00:10:19,427][54818] Updated weights for policy 0, policy_version 518718 (0.0026) [2024-04-28 00:10:21,958][54818] Updated weights for policy 0, policy_version 518728 (0.0027) [2024-04-28 00:10:24,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8498954240. Throughput: 0: 58903.0. Samples: 1404198880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:24,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-28 00:10:25,243][54818] Updated weights for policy 0, policy_version 518738 (0.0025) [2024-04-28 00:10:27,479][54818] Updated weights for policy 0, policy_version 518748 (0.0026) [2024-04-28 00:10:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.2, 300 sec: 58926.9). Total num frames: 8499249152. Throughput: 0: 59101.3. Samples: 1404383300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:29,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 00:10:30,581][54818] Updated weights for policy 0, policy_version 518758 (0.0026) [2024-04-28 00:10:32,946][54818] Updated weights for policy 0, policy_version 518768 (0.0026) [2024-04-28 00:10:34,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8499560448. Throughput: 0: 59066.2. Samples: 1404734940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:34,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 00:10:36,165][54818] Updated weights for policy 0, policy_version 518778 (0.0025) [2024-04-28 00:10:38,411][54818] Updated weights for policy 0, policy_version 518788 (0.0026) [2024-04-28 00:10:39,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8499855360. Throughput: 0: 59176.7. Samples: 1405094400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:39,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 00:10:41,672][54818] Updated weights for policy 0, policy_version 518798 (0.0026) [2024-04-28 00:10:43,990][54818] Updated weights for policy 0, policy_version 518808 (0.0025) [2024-04-28 00:10:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8500150272. Throughput: 0: 59542.2. Samples: 1405281120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:44,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 00:10:47,145][54818] Updated weights for policy 0, policy_version 518818 (0.0023) [2024-04-28 00:10:49,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59801.7, 300 sec: 59149.1). Total num frames: 8500461568. Throughput: 0: 59476.6. Samples: 1405630100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:49,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 00:10:49,601][54818] Updated weights for policy 0, policy_version 518828 (0.0025) [2024-04-28 00:10:52,727][54818] Updated weights for policy 0, policy_version 518838 (0.0026) [2024-04-28 00:10:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60074.6, 300 sec: 59149.0). Total num frames: 8500756480. Throughput: 0: 59397.8. Samples: 1405985000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:54,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 00:10:55,280][54818] Updated weights for policy 0, policy_version 518848 (0.0025) [2024-04-28 00:10:58,240][54818] Updated weights for policy 0, policy_version 518858 (0.0026) [2024-04-28 00:10:59,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8501051392. Throughput: 0: 59552.7. Samples: 1406161860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:10:59,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 00:11:00,750][54818] Updated weights for policy 0, policy_version 518868 (0.0025) [2024-04-28 00:11:03,650][54818] Updated weights for policy 0, policy_version 518878 (0.0026) [2024-04-28 00:11:04,241][54798] Signal inference workers to stop experience collection... (22200 times) [2024-04-28 00:11:04,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59528.5, 300 sec: 59037.9). Total num frames: 8501329920. Throughput: 0: 59641.3. Samples: 1406527180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:11:04,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 00:11:04,275][54818] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-04-28 00:11:04,292][54798] Signal inference workers to resume experience collection... (22200 times) [2024-04-28 00:11:04,293][54818] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-04-28 00:11:06,433][54818] Updated weights for policy 0, policy_version 518888 (0.0026) [2024-04-28 00:11:09,212][54818] Updated weights for policy 0, policy_version 518898 (0.0026) [2024-04-28 00:11:09,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59528.5, 300 sec: 59037.9). Total num frames: 8501624832. Throughput: 0: 59577.3. Samples: 1406879860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:11:09,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-28 00:11:11,865][54818] Updated weights for policy 0, policy_version 518908 (0.0026) [2024-04-28 00:11:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.4, 300 sec: 58982.4). Total num frames: 8501903360. Throughput: 0: 59358.2. Samples: 1407054420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:11:14,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-28 00:11:14,630][54818] Updated weights for policy 0, policy_version 518918 (0.0026) [2024-04-28 00:11:17,378][54818] Updated weights for policy 0, policy_version 518928 (0.0025) [2024-04-28 00:11:19,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8502198272. Throughput: 0: 59486.3. Samples: 1407411820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:11:19,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 00:11:20,120][54818] Updated weights for policy 0, policy_version 518938 (0.0025) [2024-04-28 00:11:22,929][54818] Updated weights for policy 0, policy_version 518948 (0.0026) [2024-04-28 00:11:24,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8502493184. Throughput: 0: 59385.0. Samples: 1407766720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:11:24,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 00:11:25,635][54818] Updated weights for policy 0, policy_version 518958 (0.0025) [2024-04-28 00:11:28,578][54818] Updated weights for policy 0, policy_version 518968 (0.0026) [2024-04-28 00:11:29,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.4, 300 sec: 59093.4). Total num frames: 8502804480. Throughput: 0: 59061.7. Samples: 1407938900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:11:29,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 00:11:31,139][54818] Updated weights for policy 0, policy_version 518978 (0.0026) [2024-04-28 00:11:33,963][54818] Updated weights for policy 0, policy_version 518988 (0.0026) [2024-04-28 00:11:34,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58982.3, 300 sec: 59093.4). Total num frames: 8503099392. Throughput: 0: 59244.3. Samples: 1408296100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:11:34,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 00:11:36,558][54818] Updated weights for policy 0, policy_version 518998 (0.0025) [2024-04-28 00:11:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8503410688. Throughput: 0: 59433.0. Samples: 1408659480. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 00:11:39,254][54587] Avg episode reward: [(0, '0.476')] [2024-04-28 00:11:39,377][54818] Updated weights for policy 0, policy_version 519008 (0.0026) [2024-04-28 00:11:41,953][54818] Updated weights for policy 0, policy_version 519018 (0.0024) [2024-04-28 00:11:44,253][54587] Fps is (10 sec: 57345.1, 60 sec: 58709.5, 300 sec: 58982.4). Total num frames: 8503672832. Throughput: 0: 59246.4. Samples: 1408827940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:11:44,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 00:11:44,953][54818] Updated weights for policy 0, policy_version 519028 (0.0026) [2024-04-28 00:11:47,530][54818] Updated weights for policy 0, policy_version 519038 (0.0026) [2024-04-28 00:11:49,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 8503984128. Throughput: 0: 58903.5. Samples: 1409177840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:11:49,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-28 00:11:49,264][54587] No heartbeat for components: RolloutWorker_w4 (10057 seconds) [2024-04-28 00:11:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000519042_8503984128.pth... [2024-04-28 00:11:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000518177_8489811968.pth [2024-04-28 00:11:50,590][54818] Updated weights for policy 0, policy_version 519048 (0.0025) [2024-04-28 00:11:53,177][54818] Updated weights for policy 0, policy_version 519058 (0.0023) [2024-04-28 00:11:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8504279040. Throughput: 0: 58824.6. Samples: 1409526960. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:11:54,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 00:11:56,540][54818] Updated weights for policy 0, policy_version 519068 (0.0026) [2024-04-28 00:11:57,794][54798] Signal inference workers to stop experience collection... (22250 times) [2024-04-28 00:11:57,795][54798] Signal inference workers to resume experience collection... (22250 times) [2024-04-28 00:11:57,805][54818] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-04-28 00:11:57,805][54818] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-04-28 00:11:58,521][54818] Updated weights for policy 0, policy_version 519078 (0.0025) [2024-04-28 00:11:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8504590336. Throughput: 0: 59142.2. Samples: 1409715820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:11:59,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 00:12:01,899][54818] Updated weights for policy 0, policy_version 519088 (0.0026) [2024-04-28 00:12:04,027][54818] Updated weights for policy 0, policy_version 519098 (0.0026) [2024-04-28 00:12:04,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59528.6, 300 sec: 59204.5). Total num frames: 8504901632. Throughput: 0: 59183.5. Samples: 1410075080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:04,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 00:12:07,363][54818] Updated weights for policy 0, policy_version 519108 (0.0026) [2024-04-28 00:12:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8505196544. Throughput: 0: 59017.8. Samples: 1410422520. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:09,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 00:12:09,729][54818] Updated weights for policy 0, policy_version 519118 (0.0025) [2024-04-28 00:12:12,960][54818] Updated weights for policy 0, policy_version 519128 (0.0026) [2024-04-28 00:12:14,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60074.7, 300 sec: 59260.1). Total num frames: 8505507840. Throughput: 0: 59300.0. Samples: 1410607400. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:14,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 00:12:15,635][54818] Updated weights for policy 0, policy_version 519138 (0.0024) [2024-04-28 00:12:18,488][54818] Updated weights for policy 0, policy_version 519148 (0.0026) [2024-04-28 00:12:19,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8505769984. Throughput: 0: 59322.9. Samples: 1410965620. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:19,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 00:12:20,934][54818] Updated weights for policy 0, policy_version 519158 (0.0026) [2024-04-28 00:12:23,991][54818] Updated weights for policy 0, policy_version 519168 (0.0026) [2024-04-28 00:12:24,253][54587] Fps is (10 sec: 55705.4, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8506064896. Throughput: 0: 59163.4. Samples: 1411321840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:24,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 00:12:26,740][54818] Updated weights for policy 0, policy_version 519178 (0.0025) [2024-04-28 00:12:29,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8506343424. Throughput: 0: 59411.5. Samples: 1411501460. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:29,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:12:29,371][54818] Updated weights for policy 0, policy_version 519188 (0.0025) [2024-04-28 00:12:32,309][54818] Updated weights for policy 0, policy_version 519198 (0.0025) [2024-04-28 00:12:34,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8506654720. Throughput: 0: 59372.5. Samples: 1411849600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:34,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 00:12:34,916][54818] Updated weights for policy 0, policy_version 519208 (0.0027) [2024-04-28 00:12:37,731][54818] Updated weights for policy 0, policy_version 519218 (0.0025) [2024-04-28 00:12:39,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8506933248. Throughput: 0: 59471.9. Samples: 1412203200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:39,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 00:12:40,570][54818] Updated weights for policy 0, policy_version 519228 (0.0024) [2024-04-28 00:12:43,284][54818] Updated weights for policy 0, policy_version 519238 (0.0027) [2024-04-28 00:12:44,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8507244544. Throughput: 0: 59093.0. Samples: 1412375000. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:44,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 00:12:45,897][54818] Updated weights for policy 0, policy_version 519248 (0.0027) [2024-04-28 00:12:48,263][54798] Signal inference workers to stop experience collection... (22300 times) [2024-04-28 00:12:48,264][54798] Signal inference workers to resume experience collection... 
(22300 times) [2024-04-28 00:12:48,274][54818] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-04-28 00:12:48,291][54818] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-04-28 00:12:48,756][54818] Updated weights for policy 0, policy_version 519258 (0.0025) [2024-04-28 00:12:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8507555840. Throughput: 0: 59177.7. Samples: 1412738080. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:49,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 00:12:51,297][54818] Updated weights for policy 0, policy_version 519268 (0.0025) [2024-04-28 00:12:54,211][54818] Updated weights for policy 0, policy_version 519278 (0.0023) [2024-04-28 00:12:54,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8507850752. Throughput: 0: 59355.9. Samples: 1413093540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:54,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 00:12:56,881][54818] Updated weights for policy 0, policy_version 519288 (0.0024) [2024-04-28 00:12:59,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8508145664. Throughput: 0: 59068.1. Samples: 1413265460. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:12:59,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 00:12:59,730][54818] Updated weights for policy 0, policy_version 519298 (0.0025) [2024-04-28 00:13:02,261][54818] Updated weights for policy 0, policy_version 519308 (0.0028) [2024-04-28 00:13:04,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58709.4, 300 sec: 59038.0). Total num frames: 8508424192. Throughput: 0: 58884.0. Samples: 1413615400. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:13:04,253][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 00:13:05,219][54818] Updated weights for policy 0, policy_version 519318 (0.0026) [2024-04-28 00:13:07,732][54818] Updated weights for policy 0, policy_version 519328 (0.0026) [2024-04-28 00:13:09,256][54587] Fps is (10 sec: 58969.5, 60 sec: 58980.2, 300 sec: 59093.0). Total num frames: 8508735488. Throughput: 0: 59065.7. Samples: 1413979920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-04-28 00:13:09,256][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 00:13:10,721][54818] Updated weights for policy 0, policy_version 519338 (0.0025) [2024-04-28 00:13:13,157][54818] Updated weights for policy 0, policy_version 519348 (0.0027) [2024-04-28 00:13:14,253][54587] Fps is (10 sec: 60619.9, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8509030400. Throughput: 0: 59248.8. Samples: 1414167660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:13:16,305][54818] Updated weights for policy 0, policy_version 519358 (0.0026) [2024-04-28 00:13:18,644][54818] Updated weights for policy 0, policy_version 519368 (0.0024) [2024-04-28 00:13:19,253][54587] Fps is (10 sec: 60633.6, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8509341696. Throughput: 0: 59336.0. Samples: 1414519720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:19,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 00:13:22,150][54818] Updated weights for policy 0, policy_version 519378 (0.0026) [2024-04-28 00:13:24,235][54818] Updated weights for policy 0, policy_version 519388 (0.0025) [2024-04-28 00:13:24,253][54587] Fps is (10 sec: 62259.4, 60 sec: 59801.7, 300 sec: 59260.1). Total num frames: 8509652992. Throughput: 0: 59179.6. Samples: 1414866280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:24,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 00:13:27,770][54818] Updated weights for policy 0, policy_version 519398 (0.0026) [2024-04-28 00:13:29,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59801.6, 300 sec: 59149.0). Total num frames: 8509931520. Throughput: 0: 59283.1. Samples: 1415042740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:29,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 00:13:29,859][54818] Updated weights for policy 0, policy_version 519408 (0.0026) [2024-04-28 00:13:33,431][54818] Updated weights for policy 0, policy_version 519418 (0.0025) [2024-04-28 00:13:34,253][54587] Fps is (10 sec: 55706.2, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8510210048. Throughput: 0: 59185.5. Samples: 1415401420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:34,253][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 00:13:35,590][54818] Updated weights for policy 0, policy_version 519428 (0.0026) [2024-04-28 00:13:38,859][54818] Updated weights for policy 0, policy_version 519438 (0.0026) [2024-04-28 00:13:39,253][54587] Fps is (10 sec: 55705.4, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8510488576. Throughput: 0: 59180.1. Samples: 1415756640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:39,262][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 00:13:41,411][54818] Updated weights for policy 0, policy_version 519448 (0.0024) [2024-04-28 00:13:42,191][54798] Signal inference workers to stop experience collection... (22350 times) [2024-04-28 00:13:42,236][54818] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-04-28 00:13:42,245][54798] Signal inference workers to resume experience collection... 
(22350 times) [2024-04-28 00:13:42,252][54818] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-04-28 00:13:44,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8510783488. Throughput: 0: 59168.8. Samples: 1415928060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:44,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 00:13:44,371][54818] Updated weights for policy 0, policy_version 519458 (0.0026) [2024-04-28 00:13:47,098][54818] Updated weights for policy 0, policy_version 519468 (0.0026) [2024-04-28 00:13:49,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8511078400. Throughput: 0: 59243.5. Samples: 1416281360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:49,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 00:13:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000519475_8511078400.pth... [2024-04-28 00:13:49,311][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000518608_8496873472.pth [2024-04-28 00:13:49,828][54818] Updated weights for policy 0, policy_version 519478 (0.0026) [2024-04-28 00:13:52,700][54818] Updated weights for policy 0, policy_version 519488 (0.0027) [2024-04-28 00:13:54,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8511373312. Throughput: 0: 59014.4. Samples: 1416635440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:54,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-28 00:13:55,388][54818] Updated weights for policy 0, policy_version 519498 (0.0025) [2024-04-28 00:13:58,206][54818] Updated weights for policy 0, policy_version 519508 (0.0026) [2024-04-28 00:13:59,259][54587] Fps is (10 sec: 60583.2, 60 sec: 58976.3, 300 sec: 59147.8). Total num frames: 8511684608. Throughput: 0: 58639.6. Samples: 1416806800. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:13:59,260][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 00:14:00,913][54818] Updated weights for policy 0, policy_version 519518 (0.0026) [2024-04-28 00:14:03,752][54818] Updated weights for policy 0, policy_version 519528 (0.0026) [2024-04-28 00:14:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8511979520. Throughput: 0: 58606.7. Samples: 1417157020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:14:04,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 00:14:06,410][54818] Updated weights for policy 0, policy_version 519538 (0.0026) [2024-04-28 00:14:09,229][54818] Updated weights for policy 0, policy_version 519548 (0.0026) [2024-04-28 00:14:09,253][54587] Fps is (10 sec: 59018.7, 60 sec: 58984.5, 300 sec: 59149.0). Total num frames: 8512274432. Throughput: 0: 59009.8. Samples: 1417521720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:14:09,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-28 00:14:11,909][54818] Updated weights for policy 0, policy_version 519558 (0.0024) [2024-04-28 00:14:14,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8512569344. Throughput: 0: 59051.1. Samples: 1417700040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:14:14,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:14:14,607][54818] Updated weights for policy 0, policy_version 519568 (0.0026) [2024-04-28 00:14:17,355][54818] Updated weights for policy 0, policy_version 519578 (0.0026) [2024-04-28 00:14:19,253][54587] Fps is (10 sec: 57343.0, 60 sec: 58436.2, 300 sec: 59037.9). Total num frames: 8512847872. Throughput: 0: 58731.2. Samples: 1418044340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:14:19,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 00:14:20,340][54818] Updated weights for policy 0, policy_version 519588 (0.0025) [2024-04-28 00:14:22,929][54818] Updated weights for policy 0, policy_version 519598 (0.0026) [2024-04-28 00:14:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58436.3, 300 sec: 59093.5). Total num frames: 8513159168. Throughput: 0: 58795.5. Samples: 1418402440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:14:24,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 00:14:25,870][54818] Updated weights for policy 0, policy_version 519608 (0.0026) [2024-04-28 00:14:28,479][54818] Updated weights for policy 0, policy_version 519618 (0.0026) [2024-04-28 00:14:29,253][54587] Fps is (10 sec: 60621.9, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8513454080. Throughput: 0: 59009.4. Samples: 1418583480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:14:29,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 00:14:31,412][54818] Updated weights for policy 0, policy_version 519628 (0.0026) [2024-04-28 00:14:34,067][54818] Updated weights for policy 0, policy_version 519638 (0.0026) [2024-04-28 00:14:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8513748992. Throughput: 0: 58935.5. Samples: 1418933460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 00:14:34,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 00:14:36,990][54818] Updated weights for policy 0, policy_version 519648 (0.0022) [2024-04-28 00:14:39,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8514043904. Throughput: 0: 58904.1. Samples: 1419286120. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:14:39,253][54587] Avg episode reward: [(0, '0.532')] [2024-04-28 00:14:39,496][54818] Updated weights for policy 0, policy_version 519658 (0.0026) [2024-04-28 00:14:42,526][54818] Updated weights for policy 0, policy_version 519668 (0.0025) [2024-04-28 00:14:44,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8514322432. Throughput: 0: 58994.2. Samples: 1419461180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:14:44,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 00:14:45,144][54818] Updated weights for policy 0, policy_version 519678 (0.0025) [2024-04-28 00:14:48,228][54818] Updated weights for policy 0, policy_version 519688 (0.0026) [2024-04-28 00:14:48,433][54798] Signal inference workers to stop experience collection... (22400 times) [2024-04-28 00:14:48,433][54798] Signal inference workers to resume experience collection... (22400 times) [2024-04-28 00:14:48,446][54818] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-04-28 00:14:48,456][54818] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-04-28 00:14:49,253][54587] Fps is (10 sec: 58981.4, 60 sec: 59255.3, 300 sec: 59260.1). Total num frames: 8514633728. Throughput: 0: 59097.7. Samples: 1419816420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:14:49,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 00:14:49,263][54587] No heartbeat for components: RolloutWorker_w4 (10237 seconds) [2024-04-28 00:14:50,881][54818] Updated weights for policy 0, policy_version 519698 (0.0027) [2024-04-28 00:14:53,741][54818] Updated weights for policy 0, policy_version 519708 (0.0023) [2024-04-28 00:14:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8514928640. Throughput: 0: 58932.4. Samples: 1420173680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:14:54,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 00:14:56,632][54818] Updated weights for policy 0, policy_version 519718 (0.0025) [2024-04-28 00:14:59,251][54818] Updated weights for policy 0, policy_version 519728 (0.0026) [2024-04-28 00:14:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58988.4, 300 sec: 59204.6). Total num frames: 8515223552. Throughput: 0: 58844.7. Samples: 1420348060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:14:59,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 00:15:02,117][54818] Updated weights for policy 0, policy_version 519738 (0.0026) [2024-04-28 00:15:04,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8515502080. Throughput: 0: 59127.8. Samples: 1420705080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:04,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:15:04,736][54818] Updated weights for policy 0, policy_version 519748 (0.0026) [2024-04-28 00:15:07,857][54818] Updated weights for policy 0, policy_version 519758 (0.0024) [2024-04-28 00:15:09,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8515796992. Throughput: 0: 59085.5. Samples: 1421061280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:09,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 00:15:10,146][54818] Updated weights for policy 0, policy_version 519768 (0.0027) [2024-04-28 00:15:13,185][54818] Updated weights for policy 0, policy_version 519778 (0.0026) [2024-04-28 00:15:14,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8516108288. Throughput: 0: 58886.6. Samples: 1421233380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:14,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 00:15:15,572][54818] Updated weights for policy 0, policy_version 519788 (0.0026) [2024-04-28 00:15:18,646][54818] Updated weights for policy 0, policy_version 519798 (0.0024) [2024-04-28 00:15:19,253][54587] Fps is (10 sec: 60619.9, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8516403200. Throughput: 0: 59149.3. Samples: 1421595180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:19,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 00:15:21,205][54818] Updated weights for policy 0, policy_version 519808 (0.0025) [2024-04-28 00:15:24,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8516681728. Throughput: 0: 59274.7. Samples: 1421953480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:24,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 00:15:24,292][54818] Updated weights for policy 0, policy_version 519818 (0.0024) [2024-04-28 00:15:26,691][54818] Updated weights for policy 0, policy_version 519828 (0.0026) [2024-04-28 00:15:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8516976640. Throughput: 0: 59144.5. Samples: 1422122680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:29,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 00:15:29,830][54818] Updated weights for policy 0, policy_version 519838 (0.0026) [2024-04-28 00:15:32,378][54818] Updated weights for policy 0, policy_version 519848 (0.0025) [2024-04-28 00:15:34,253][54587] Fps is (10 sec: 60619.8, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8517287936. Throughput: 0: 59220.9. Samples: 1422481360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:34,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 00:15:35,183][54818] Updated weights for policy 0, policy_version 519858 (0.0026) [2024-04-28 00:15:37,698][54818] Updated weights for policy 0, policy_version 519868 (0.0025) [2024-04-28 00:15:39,254][54587] Fps is (10 sec: 60615.8, 60 sec: 58981.5, 300 sec: 59093.3). Total num frames: 8517582848. Throughput: 0: 59154.5. Samples: 1422835680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:39,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:15:40,678][54818] Updated weights for policy 0, policy_version 519878 (0.0025) [2024-04-28 00:15:43,459][54818] Updated weights for policy 0, policy_version 519888 (0.0024) [2024-04-28 00:15:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59528.6, 300 sec: 59093.5). Total num frames: 8517894144. Throughput: 0: 59417.4. Samples: 1423021840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:44,254][54587] Avg episode reward: [(0, '0.489')] [2024-04-28 00:15:46,278][54818] Updated weights for policy 0, policy_version 519898 (0.0024) [2024-04-28 00:15:48,752][54818] Updated weights for policy 0, policy_version 519908 (0.0027) [2024-04-28 00:15:49,253][54587] Fps is (10 sec: 60625.3, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8518189056. Throughput: 0: 59348.2. Samples: 1423375760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:49,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 00:15:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000519909_8518189056.pth... [2024-04-28 00:15:49,323][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000519042_8503984128.pth [2024-04-28 00:15:50,452][54798] Signal inference workers to stop experience collection... 
(22450 times) [2024-04-28 00:15:50,488][54818] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-04-28 00:15:50,504][54798] Signal inference workers to resume experience collection... (22450 times) [2024-04-28 00:15:50,505][54818] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-04-28 00:15:51,882][54818] Updated weights for policy 0, policy_version 519918 (0.0025) [2024-04-28 00:15:54,222][54818] Updated weights for policy 0, policy_version 519928 (0.0026) [2024-04-28 00:15:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8518500352. Throughput: 0: 59296.8. Samples: 1423729640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:54,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 00:15:57,340][54818] Updated weights for policy 0, policy_version 519938 (0.0025) [2024-04-28 00:15:59,253][54587] Fps is (10 sec: 58983.5, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8518778880. Throughput: 0: 59435.3. Samples: 1423907960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:15:59,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 00:15:59,776][54818] Updated weights for policy 0, policy_version 519948 (0.0026) [2024-04-28 00:16:02,768][54818] Updated weights for policy 0, policy_version 519958 (0.0026) [2024-04-28 00:16:04,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8519073792. Throughput: 0: 59377.5. Samples: 1424267160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:16:04,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 00:16:05,430][54818] Updated weights for policy 0, policy_version 519968 (0.0024) [2024-04-28 00:16:08,270][54818] Updated weights for policy 0, policy_version 519978 (0.0026) [2024-04-28 00:16:09,253][54587] Fps is (10 sec: 60619.7, 60 sec: 59801.4, 300 sec: 59260.1). Total num frames: 8519385088. Throughput: 0: 59172.6. 
Samples: 1424616260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:09,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 00:16:11,029][54818] Updated weights for policy 0, policy_version 519988 (0.0026) [2024-04-28 00:16:13,662][54818] Updated weights for policy 0, policy_version 519998 (0.0029) [2024-04-28 00:16:14,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59255.5, 300 sec: 59204.5). Total num frames: 8519663616. Throughput: 0: 59347.5. Samples: 1424793320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:14,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 00:16:16,610][54818] Updated weights for policy 0, policy_version 520008 (0.0026) [2024-04-28 00:16:19,253][54818] Updated weights for policy 0, policy_version 520018 (0.0025) [2024-04-28 00:16:19,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8519974912. Throughput: 0: 59316.6. Samples: 1425150600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:19,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 00:16:22,258][54818] Updated weights for policy 0, policy_version 520028 (0.0026) [2024-04-28 00:16:24,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8520253440. Throughput: 0: 59317.7. Samples: 1425504920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:24,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 00:16:24,854][54818] Updated weights for policy 0, policy_version 520038 (0.0026) [2024-04-28 00:16:27,691][54818] Updated weights for policy 0, policy_version 520048 (0.0026) [2024-04-28 00:16:29,253][54587] Fps is (10 sec: 55706.0, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8520531968. Throughput: 0: 59003.2. Samples: 1425676980. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:29,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-28 00:16:30,465][54818] Updated weights for policy 0, policy_version 520058 (0.0026) [2024-04-28 00:16:33,329][54818] Updated weights for policy 0, policy_version 520068 (0.0026) [2024-04-28 00:16:34,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8520843264. Throughput: 0: 59051.4. Samples: 1426033060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:34,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 00:16:35,946][54818] Updated weights for policy 0, policy_version 520078 (0.0026) [2024-04-28 00:16:38,924][54818] Updated weights for policy 0, policy_version 520088 (0.0025) [2024-04-28 00:16:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58983.4, 300 sec: 59149.0). Total num frames: 8521121792. Throughput: 0: 59029.5. Samples: 1426385960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:39,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 00:16:41,416][54818] Updated weights for policy 0, policy_version 520098 (0.0026) [2024-04-28 00:16:44,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8521433088. Throughput: 0: 58965.3. Samples: 1426561400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:44,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 00:16:44,439][54818] Updated weights for policy 0, policy_version 520108 (0.0026) [2024-04-28 00:16:46,947][54818] Updated weights for policy 0, policy_version 520118 (0.0026) [2024-04-28 00:16:49,253][54587] Fps is (10 sec: 60619.8, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8521728000. Throughput: 0: 58925.6. Samples: 1426918820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:49,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 00:16:49,909][54818] Updated weights for policy 0, policy_version 520128 (0.0026) [2024-04-28 00:16:51,669][54798] Signal inference workers to stop experience collection... (22500 times) [2024-04-28 00:16:51,669][54798] Signal inference workers to resume experience collection... (22500 times) [2024-04-28 00:16:51,680][54818] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-04-28 00:16:51,680][54818] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-04-28 00:16:52,757][54818] Updated weights for policy 0, policy_version 520138 (0.0026) [2024-04-28 00:16:54,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58709.2, 300 sec: 59093.5). Total num frames: 8522022912. Throughput: 0: 58960.5. Samples: 1427269480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:54,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 00:16:55,499][54818] Updated weights for policy 0, policy_version 520148 (0.0026) [2024-04-28 00:16:58,412][54818] Updated weights for policy 0, policy_version 520158 (0.0026) [2024-04-28 00:16:59,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8522301440. Throughput: 0: 58951.3. Samples: 1427446120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:16:59,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 00:17:01,111][54818] Updated weights for policy 0, policy_version 520168 (0.0025) [2024-04-28 00:17:03,793][54818] Updated weights for policy 0, policy_version 520178 (0.0026) [2024-04-28 00:17:04,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8522612736. Throughput: 0: 58832.8. Samples: 1427798080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:17:04,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 00:17:06,892][54818] Updated weights for policy 0, policy_version 520188 (0.0026) [2024-04-28 00:17:09,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8522907648. Throughput: 0: 58908.7. Samples: 1428155820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:17:09,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 00:17:09,555][54818] Updated weights for policy 0, policy_version 520198 (0.0026) [2024-04-28 00:17:12,373][54818] Updated weights for policy 0, policy_version 520208 (0.0026) [2024-04-28 00:17:14,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8523202560. Throughput: 0: 58955.5. Samples: 1428329980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:17:14,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 00:17:14,925][54818] Updated weights for policy 0, policy_version 520218 (0.0026) [2024-04-28 00:17:17,903][54818] Updated weights for policy 0, policy_version 520228 (0.0024) [2024-04-28 00:17:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8523513856. Throughput: 0: 59029.0. Samples: 1428689380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:17:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:17:20,373][54818] Updated weights for policy 0, policy_version 520238 (0.0026) [2024-04-28 00:17:23,425][54818] Updated weights for policy 0, policy_version 520248 (0.0026) [2024-04-28 00:17:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8523808768. Throughput: 0: 59003.9. Samples: 1429041140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:17:24,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 00:17:25,989][54818] Updated weights for policy 0, policy_version 520258 (0.0026) [2024-04-28 00:17:28,865][54818] Updated weights for policy 0, policy_version 520268 (0.0024) [2024-04-28 00:17:29,253][54587] Fps is (10 sec: 57344.8, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8524087296. Throughput: 0: 59115.1. Samples: 1429221580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:17:29,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 00:17:31,711][54818] Updated weights for policy 0, policy_version 520278 (0.0026) [2024-04-28 00:17:34,253][54587] Fps is (10 sec: 57343.4, 60 sec: 58982.2, 300 sec: 59149.0). Total num frames: 8524382208. Throughput: 0: 58938.2. Samples: 1429571040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 00:17:34,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 00:17:34,321][54818] Updated weights for policy 0, policy_version 520288 (0.0026) [2024-04-28 00:17:37,284][54818] Updated weights for policy 0, policy_version 520298 (0.0025) [2024-04-28 00:17:39,254][54587] Fps is (10 sec: 58981.2, 60 sec: 59255.2, 300 sec: 59093.4). Total num frames: 8524677120. Throughput: 0: 58998.6. Samples: 1429924420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:17:39,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 00:17:39,740][54818] Updated weights for policy 0, policy_version 520308 (0.0024) [2024-04-28 00:17:42,931][54818] Updated weights for policy 0, policy_version 520318 (0.0026) [2024-04-28 00:17:44,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8524972032. Throughput: 0: 59005.4. Samples: 1430101360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:17:44,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 00:17:45,352][54818] Updated weights for policy 0, policy_version 520328 (0.0026) [2024-04-28 00:17:47,920][54798] Signal inference workers to stop experience collection... (22550 times) [2024-04-28 00:17:47,920][54798] Signal inference workers to resume experience collection... (22550 times) [2024-04-28 00:17:47,929][54818] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-04-28 00:17:47,939][54818] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-04-28 00:17:48,346][54818] Updated weights for policy 0, policy_version 520338 (0.0026) [2024-04-28 00:17:49,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8525283328. Throughput: 0: 59119.9. Samples: 1430458480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:17:49,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-28 00:17:49,266][54587] No heartbeat for components: RolloutWorker_w4 (10417 seconds) [2024-04-28 00:17:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000520342_8525283328.pth... [2024-04-28 00:17:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000519475_8511078400.pth [2024-04-28 00:17:50,867][54818] Updated weights for policy 0, policy_version 520348 (0.0026) [2024-04-28 00:17:53,929][54818] Updated weights for policy 0, policy_version 520358 (0.0025) [2024-04-28 00:17:54,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8525561856. Throughput: 0: 59009.4. Samples: 1430811240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:17:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 00:17:56,586][54818] Updated weights for policy 0, policy_version 520368 (0.0026) [2024-04-28 00:17:59,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8525840384. Throughput: 0: 58957.2. Samples: 1430983060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:17:59,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 00:17:59,486][54818] Updated weights for policy 0, policy_version 520378 (0.0025) [2024-04-28 00:18:01,949][54818] Updated weights for policy 0, policy_version 520388 (0.0024) [2024-04-28 00:18:04,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.5, 300 sec: 59038.4). Total num frames: 8526151680. Throughput: 0: 58964.2. Samples: 1431342760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:04,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 00:18:05,048][54818] Updated weights for policy 0, policy_version 520398 (0.0026) [2024-04-28 00:18:07,755][54818] Updated weights for policy 0, policy_version 520408 (0.0026) [2024-04-28 00:18:09,253][54587] Fps is (10 sec: 62260.0, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8526462976. Throughput: 0: 58934.2. Samples: 1431693180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:09,254][54587] Avg episode reward: [(0, '0.734')] [2024-04-28 00:18:10,490][54818] Updated weights for policy 0, policy_version 520418 (0.0024) [2024-04-28 00:18:13,231][54818] Updated weights for policy 0, policy_version 520428 (0.0026) [2024-04-28 00:18:14,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8526757888. Throughput: 0: 58983.5. Samples: 1431875840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:14,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:18:16,002][54818] Updated weights for policy 0, policy_version 520438 (0.0026) [2024-04-28 00:18:18,713][54818] Updated weights for policy 0, policy_version 520448 (0.0025) [2024-04-28 00:18:19,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.5, 300 sec: 58926.9). Total num frames: 8527036416. Throughput: 0: 59089.9. Samples: 1432230080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:19,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 00:18:21,532][54818] Updated weights for policy 0, policy_version 520458 (0.0024) [2024-04-28 00:18:24,254][54587] Fps is (10 sec: 57343.2, 60 sec: 58709.1, 300 sec: 58982.4). Total num frames: 8527331328. Throughput: 0: 59236.5. Samples: 1432590060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:24,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 00:18:24,351][54818] Updated weights for policy 0, policy_version 520468 (0.0026) [2024-04-28 00:18:27,139][54818] Updated weights for policy 0, policy_version 520478 (0.0028) [2024-04-28 00:18:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8527642624. Throughput: 0: 59300.7. Samples: 1432769900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:29,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 00:18:29,844][54818] Updated weights for policy 0, policy_version 520488 (0.0026) [2024-04-28 00:18:32,602][54818] Updated weights for policy 0, policy_version 520498 (0.0026) [2024-04-28 00:18:34,253][54587] Fps is (10 sec: 58983.6, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8527921152. Throughput: 0: 59166.4. Samples: 1433120960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:34,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-28 00:18:35,451][54818] Updated weights for policy 0, policy_version 520508 (0.0027) [2024-04-28 00:18:37,971][54818] Updated weights for policy 0, policy_version 520518 (0.0026) [2024-04-28 00:18:39,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8528232448. Throughput: 0: 59179.4. Samples: 1433474320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:39,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 00:18:40,770][54818] Updated weights for policy 0, policy_version 520528 (0.0024) [2024-04-28 00:18:43,531][54818] Updated weights for policy 0, policy_version 520538 (0.0026) [2024-04-28 00:18:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8528527360. Throughput: 0: 59273.0. Samples: 1433650340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:44,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 00:18:46,332][54818] Updated weights for policy 0, policy_version 520548 (0.0026) [2024-04-28 00:18:49,006][54818] Updated weights for policy 0, policy_version 520558 (0.0026) [2024-04-28 00:18:49,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8528822272. Throughput: 0: 59356.8. Samples: 1434013820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:49,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:18:50,163][54798] Signal inference workers to stop experience collection... (22600 times) [2024-04-28 00:18:50,198][54818] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-04-28 00:18:50,254][54798] Signal inference workers to resume experience collection... 
(22600 times) [2024-04-28 00:18:50,255][54818] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-04-28 00:18:51,826][54818] Updated weights for policy 0, policy_version 520568 (0.0025) [2024-04-28 00:18:54,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.4, 300 sec: 59094.7). Total num frames: 8529117184. Throughput: 0: 59495.4. Samples: 1434370480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:54,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 00:18:54,617][54818] Updated weights for policy 0, policy_version 520578 (0.0026) [2024-04-28 00:18:57,158][54818] Updated weights for policy 0, policy_version 520588 (0.0025) [2024-04-28 00:18:59,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59801.8, 300 sec: 59149.0). Total num frames: 8529428480. Throughput: 0: 59356.1. Samples: 1434546860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 00:18:59,253][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 00:19:00,068][54818] Updated weights for policy 0, policy_version 520598 (0.0026) [2024-04-28 00:19:02,722][54818] Updated weights for policy 0, policy_version 520608 (0.0025) [2024-04-28 00:19:04,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8529723392. Throughput: 0: 59242.7. Samples: 1434896000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:04,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-28 00:19:05,637][54818] Updated weights for policy 0, policy_version 520618 (0.0026) [2024-04-28 00:19:08,156][54818] Updated weights for policy 0, policy_version 520628 (0.0028) [2024-04-28 00:19:09,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8530001920. Throughput: 0: 59281.1. Samples: 1435257700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:09,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 00:19:11,331][54818] Updated weights for policy 0, policy_version 520638 (0.0026) [2024-04-28 00:19:14,253][54587] Fps is (10 sec: 55706.0, 60 sec: 58709.5, 300 sec: 59093.5). Total num frames: 8530280448. Throughput: 0: 59330.8. Samples: 1435439780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:14,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 00:19:14,260][54818] Updated weights for policy 0, policy_version 520648 (0.0026) [2024-04-28 00:19:17,102][54818] Updated weights for policy 0, policy_version 520658 (0.0027) [2024-04-28 00:19:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8530608128. Throughput: 0: 59349.7. Samples: 1435791700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:19,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 00:19:19,673][54818] Updated weights for policy 0, policy_version 520668 (0.0026) [2024-04-28 00:19:22,435][54818] Updated weights for policy 0, policy_version 520678 (0.0026) [2024-04-28 00:19:24,253][54587] Fps is (10 sec: 63896.7, 60 sec: 59801.7, 300 sec: 59204.5). Total num frames: 8530919424. Throughput: 0: 59234.3. Samples: 1436139860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:24,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 00:19:25,164][54818] Updated weights for policy 0, policy_version 520688 (0.0026) [2024-04-28 00:19:28,017][54818] Updated weights for policy 0, policy_version 520698 (0.0026) [2024-04-28 00:19:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8531181568. Throughput: 0: 59381.4. Samples: 1436322500. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:29,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 00:19:30,630][54818] Updated weights for policy 0, policy_version 520708 (0.0026) [2024-04-28 00:19:33,490][54818] Updated weights for policy 0, policy_version 520718 (0.0025) [2024-04-28 00:19:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8531492864. Throughput: 0: 59294.7. Samples: 1436682080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:34,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 00:19:36,067][54818] Updated weights for policy 0, policy_version 520728 (0.0024) [2024-04-28 00:19:36,459][54798] Signal inference workers to stop experience collection... (22650 times) [2024-04-28 00:19:36,507][54818] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-04-28 00:19:36,510][54798] Signal inference workers to resume experience collection... (22650 times) [2024-04-28 00:19:36,517][54818] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-04-28 00:19:39,100][54818] Updated weights for policy 0, policy_version 520738 (0.0026) [2024-04-28 00:19:39,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8531771392. Throughput: 0: 59313.3. Samples: 1437039580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:39,254][54587] Avg episode reward: [(0, '0.685')] [2024-04-28 00:19:41,717][54818] Updated weights for policy 0, policy_version 520748 (0.0025) [2024-04-28 00:19:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8532066304. Throughput: 0: 59232.8. Samples: 1437212340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:44,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:19:44,564][54818] Updated weights for policy 0, policy_version 520758 (0.0025) [2024-04-28 00:19:47,052][54818] Updated weights for policy 0, policy_version 520768 (0.0026) [2024-04-28 00:19:49,253][54587] Fps is (10 sec: 57345.1, 60 sec: 58709.4, 300 sec: 59038.0). Total num frames: 8532344832. Throughput: 0: 59288.9. Samples: 1437564000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:49,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 00:19:49,285][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000520774_8532361216.pth... [2024-04-28 00:19:49,338][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000519909_8518189056.pth [2024-04-28 00:19:50,143][54818] Updated weights for policy 0, policy_version 520778 (0.0024) [2024-04-28 00:19:52,525][54818] Updated weights for policy 0, policy_version 520788 (0.0026) [2024-04-28 00:19:54,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8532656128. Throughput: 0: 59075.1. Samples: 1437916080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:54,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 00:19:55,614][54818] Updated weights for policy 0, policy_version 520798 (0.0026) [2024-04-28 00:19:57,958][54818] Updated weights for policy 0, policy_version 520808 (0.0025) [2024-04-28 00:19:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8532967424. Throughput: 0: 59073.3. Samples: 1438098080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:19:59,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 00:20:01,058][54818] Updated weights for policy 0, policy_version 520818 (0.0026) [2024-04-28 00:20:03,349][54818] Updated weights for policy 0, policy_version 520828 (0.0025) [2024-04-28 00:20:04,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8533278720. Throughput: 0: 59255.2. Samples: 1438458180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:20:04,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 00:20:06,557][54818] Updated weights for policy 0, policy_version 520838 (0.0025) [2024-04-28 00:20:09,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8533557248. Throughput: 0: 59201.0. Samples: 1438803900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:20:09,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 00:20:09,260][54818] Updated weights for policy 0, policy_version 520848 (0.0023) [2024-04-28 00:20:12,097][54818] Updated weights for policy 0, policy_version 520858 (0.0024) [2024-04-28 00:20:14,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59801.5, 300 sec: 59204.6). Total num frames: 8533868544. Throughput: 0: 59309.4. Samples: 1438991420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:20:14,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 00:20:14,731][54818] Updated weights for policy 0, policy_version 520868 (0.0026) [2024-04-28 00:20:17,560][54818] Updated weights for policy 0, policy_version 520878 (0.0025) [2024-04-28 00:20:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8534163456. Throughput: 0: 59172.5. Samples: 1439344840. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:20:19,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 00:20:20,059][54818] Updated weights for policy 0, policy_version 520888 (0.0026) [2024-04-28 00:20:22,698][54798] Signal inference workers to stop experience collection... (22700 times) [2024-04-28 00:20:22,728][54818] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-04-28 00:20:22,755][54798] Signal inference workers to resume experience collection... (22700 times) [2024-04-28 00:20:22,755][54818] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-04-28 00:20:22,869][54818] Updated weights for policy 0, policy_version 520898 (0.0026) [2024-04-28 00:20:24,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8534458368. Throughput: 0: 59134.0. Samples: 1439700600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:20:24,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 00:20:25,605][54818] Updated weights for policy 0, policy_version 520908 (0.0024) [2024-04-28 00:20:28,597][54818] Updated weights for policy 0, policy_version 520918 (0.0023) [2024-04-28 00:20:29,253][54587] Fps is (10 sec: 57343.4, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8534736896. Throughput: 0: 59502.1. Samples: 1439889940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 00:20:29,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 00:20:31,669][54818] Updated weights for policy 0, policy_version 520928 (0.0023) [2024-04-28 00:20:34,078][54818] Updated weights for policy 0, policy_version 520938 (0.0022) [2024-04-28 00:20:34,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.5, 300 sec: 59204.8). Total num frames: 8535048192. Throughput: 0: 59395.2. Samples: 1440236780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:20:34,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 00:20:37,163][54818] Updated weights for policy 0, policy_version 520948 (0.0026) [2024-04-28 00:20:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59528.7, 300 sec: 59149.0). Total num frames: 8535343104. Throughput: 0: 59549.4. Samples: 1440595800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:20:39,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 00:20:39,700][54818] Updated weights for policy 0, policy_version 520958 (0.0026) [2024-04-28 00:20:42,772][54818] Updated weights for policy 0, policy_version 520968 (0.0025) [2024-04-28 00:20:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59528.5, 300 sec: 59149.1). Total num frames: 8535638016. Throughput: 0: 59280.9. Samples: 1440765720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:20:44,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 00:20:45,324][54818] Updated weights for policy 0, policy_version 520978 (0.0025) [2024-04-28 00:20:48,237][54818] Updated weights for policy 0, policy_version 520988 (0.0025) [2024-04-28 00:20:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59801.6, 300 sec: 59093.5). Total num frames: 8535932928. Throughput: 0: 59439.2. Samples: 1441132940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:20:49,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 00:20:49,264][54587] No heartbeat for components: RolloutWorker_w4 (10597 seconds) [2024-04-28 00:20:50,890][54818] Updated weights for policy 0, policy_version 520998 (0.0026) [2024-04-28 00:20:53,698][54818] Updated weights for policy 0, policy_version 521008 (0.0027) [2024-04-28 00:20:54,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8536227840. Throughput: 0: 59540.3. Samples: 1441483220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:20:54,254][54587] Avg episode reward: [(0, '0.508')] [2024-04-28 00:20:56,428][54818] Updated weights for policy 0, policy_version 521018 (0.0022) [2024-04-28 00:20:59,211][54818] Updated weights for policy 0, policy_version 521028 (0.0023) [2024-04-28 00:20:59,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8536522752. Throughput: 0: 59101.8. Samples: 1441651000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:20:59,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 00:21:01,482][54798] Signal inference workers to stop experience collection... (22750 times) [2024-04-28 00:21:01,525][54818] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-04-28 00:21:01,543][54798] Signal inference workers to resume experience collection... (22750 times) [2024-04-28 00:21:01,543][54818] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-04-28 00:21:01,782][54818] Updated weights for policy 0, policy_version 521038 (0.0027) [2024-04-28 00:21:04,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58709.3, 300 sec: 59038.0). Total num frames: 8536801280. Throughput: 0: 59002.1. Samples: 1441999940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:04,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 00:21:04,778][54818] Updated weights for policy 0, policy_version 521048 (0.0026) [2024-04-28 00:21:07,270][54818] Updated weights for policy 0, policy_version 521058 (0.0022) [2024-04-28 00:21:09,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8537096192. Throughput: 0: 59147.5. Samples: 1442362240. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:09,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 00:21:10,114][54818] Updated weights for policy 0, policy_version 521068 (0.0026) [2024-04-28 00:21:12,730][54818] Updated weights for policy 0, policy_version 521078 (0.0025) [2024-04-28 00:21:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8537407488. Throughput: 0: 58987.6. Samples: 1442544380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:14,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 00:21:15,731][54818] Updated weights for policy 0, policy_version 521088 (0.0026) [2024-04-28 00:21:18,172][54818] Updated weights for policy 0, policy_version 521098 (0.0025) [2024-04-28 00:21:19,253][54587] Fps is (10 sec: 62259.9, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8537718784. Throughput: 0: 59186.2. Samples: 1442900160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:19,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 00:21:21,251][54818] Updated weights for policy 0, policy_version 521108 (0.0025) [2024-04-28 00:21:23,781][54818] Updated weights for policy 0, policy_version 521118 (0.0026) [2024-04-28 00:21:24,253][54587] Fps is (10 sec: 62259.3, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8538030080. Throughput: 0: 58873.7. Samples: 1443245120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:24,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:21:26,897][54818] Updated weights for policy 0, policy_version 521128 (0.0026) [2024-04-28 00:21:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59528.7, 300 sec: 59204.6). Total num frames: 8538308608. Throughput: 0: 59248.0. Samples: 1443431880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:29,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 00:21:29,332][54818] Updated weights for policy 0, policy_version 521138 (0.0026) [2024-04-28 00:21:32,301][54818] Updated weights for policy 0, policy_version 521148 (0.0025) [2024-04-28 00:21:34,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59255.3, 300 sec: 59260.1). Total num frames: 8538603520. Throughput: 0: 58839.9. Samples: 1443780740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:34,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 00:21:34,748][54818] Updated weights for policy 0, policy_version 521158 (0.0025) [2024-04-28 00:21:37,816][54818] Updated weights for policy 0, policy_version 521168 (0.0026) [2024-04-28 00:21:39,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8538898432. Throughput: 0: 59032.7. Samples: 1444139680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:39,253][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 00:21:40,405][54818] Updated weights for policy 0, policy_version 521178 (0.0025) [2024-04-28 00:21:43,388][54818] Updated weights for policy 0, policy_version 521188 (0.0026) [2024-04-28 00:21:44,253][54587] Fps is (10 sec: 58983.5, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8539193344. Throughput: 0: 59356.6. Samples: 1444322040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:44,253][54587] Avg episode reward: [(0, '0.531')] [2024-04-28 00:21:46,015][54818] Updated weights for policy 0, policy_version 521198 (0.0026) [2024-04-28 00:21:48,760][54818] Updated weights for policy 0, policy_version 521208 (0.0025) [2024-04-28 00:21:49,235][54798] Signal inference workers to stop experience collection... (22800 times) [2024-04-28 00:21:49,235][54798] Signal inference workers to resume experience collection... 
(22800 times) [2024-04-28 00:21:49,245][54818] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-04-28 00:21:49,246][54818] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-04-28 00:21:49,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8539488256. Throughput: 0: 59465.9. Samples: 1444675900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:49,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 00:21:49,341][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000521210_8539504640.pth... [2024-04-28 00:21:49,388][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000520342_8525283328.pth [2024-04-28 00:21:51,725][54818] Updated weights for policy 0, policy_version 521218 (0.0025) [2024-04-28 00:21:54,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8539783168. Throughput: 0: 59152.6. Samples: 1445024100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 00:21:54,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 00:21:54,320][54818] Updated weights for policy 0, policy_version 521228 (0.0025) [2024-04-28 00:21:57,460][54818] Updated weights for policy 0, policy_version 521238 (0.0026) [2024-04-28 00:21:59,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8540078080. Throughput: 0: 59111.2. Samples: 1445204380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:21:59,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 00:22:00,032][54818] Updated weights for policy 0, policy_version 521248 (0.0026) [2024-04-28 00:22:02,973][54818] Updated weights for policy 0, policy_version 521258 (0.0026) [2024-04-28 00:22:04,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8540372992. Throughput: 0: 59132.3. Samples: 1445561120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:04,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 00:22:05,339][54818] Updated weights for policy 0, policy_version 521268 (0.0026) [2024-04-28 00:22:08,480][54818] Updated weights for policy 0, policy_version 521278 (0.0022) [2024-04-28 00:22:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8540651520. Throughput: 0: 59431.2. Samples: 1445919520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:09,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 00:22:10,908][54818] Updated weights for policy 0, policy_version 521288 (0.0026) [2024-04-28 00:22:13,978][54818] Updated weights for policy 0, policy_version 521298 (0.0026) [2024-04-28 00:22:14,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8540946432. Throughput: 0: 59083.4. Samples: 1446090640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:14,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 00:22:16,556][54818] Updated weights for policy 0, policy_version 521308 (0.0025) [2024-04-28 00:22:19,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58709.2, 300 sec: 59093.5). Total num frames: 8541241344. Throughput: 0: 59238.2. Samples: 1446446460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:19,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:22:19,633][54818] Updated weights for policy 0, policy_version 521318 (0.0026) [2024-04-28 00:22:22,049][54818] Updated weights for policy 0, policy_version 521328 (0.0025) [2024-04-28 00:22:24,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58163.2, 300 sec: 59093.5). Total num frames: 8541519872. Throughput: 0: 59217.2. Samples: 1446804460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:24,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 00:22:25,152][54818] Updated weights for policy 0, policy_version 521338 (0.0026) [2024-04-28 00:22:27,546][54818] Updated weights for policy 0, policy_version 521348 (0.0025) [2024-04-28 00:22:29,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58709.2, 300 sec: 59149.0). Total num frames: 8541831168. Throughput: 0: 58894.8. Samples: 1446972320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:29,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 00:22:30,551][54818] Updated weights for policy 0, policy_version 521358 (0.0026) [2024-04-28 00:22:33,013][54818] Updated weights for policy 0, policy_version 521368 (0.0026) [2024-04-28 00:22:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8542126080. Throughput: 0: 59056.3. Samples: 1447333440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:34,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 00:22:36,016][54818] Updated weights for policy 0, policy_version 521378 (0.0026) [2024-04-28 00:22:38,460][54818] Updated weights for policy 0, policy_version 521388 (0.0025) [2024-04-28 00:22:39,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8542437376. Throughput: 0: 59138.7. Samples: 1447685340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:39,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 00:22:41,580][54818] Updated weights for policy 0, policy_version 521398 (0.0026) [2024-04-28 00:22:44,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.2, 300 sec: 59149.0). Total num frames: 8542732288. Throughput: 0: 59130.1. Samples: 1447865240. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:44,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 00:22:44,477][54818] Updated weights for policy 0, policy_version 521408 (0.0026) [2024-04-28 00:22:47,134][54818] Updated weights for policy 0, policy_version 521418 (0.0026) [2024-04-28 00:22:47,934][54798] Signal inference workers to stop experience collection... (22850 times) [2024-04-28 00:22:47,935][54798] Signal inference workers to resume experience collection... (22850 times) [2024-04-28 00:22:47,947][54818] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-04-28 00:22:47,947][54818] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-04-28 00:22:49,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8543043584. Throughput: 0: 59024.5. Samples: 1448217220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:49,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 00:22:49,980][54818] Updated weights for policy 0, policy_version 521428 (0.0026) [2024-04-28 00:22:52,557][54818] Updated weights for policy 0, policy_version 521438 (0.0028) [2024-04-28 00:22:54,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.3, 300 sec: 59315.6). Total num frames: 8543338496. Throughput: 0: 58860.7. Samples: 1448568260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:54,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:22:55,694][54818] Updated weights for policy 0, policy_version 521448 (0.0026) [2024-04-28 00:22:58,162][54818] Updated weights for policy 0, policy_version 521458 (0.0026) [2024-04-28 00:22:59,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8543633408. Throughput: 0: 59341.8. Samples: 1448761020. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:22:59,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 00:23:01,145][54818] Updated weights for policy 0, policy_version 521468 (0.0026) [2024-04-28 00:23:03,685][54818] Updated weights for policy 0, policy_version 521478 (0.0026) [2024-04-28 00:23:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8543928320. Throughput: 0: 59178.5. Samples: 1449109500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:23:04,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 00:23:06,857][54818] Updated weights for policy 0, policy_version 521488 (0.0026) [2024-04-28 00:23:09,108][54818] Updated weights for policy 0, policy_version 521498 (0.0027) [2024-04-28 00:23:09,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8544223232. Throughput: 0: 58975.1. Samples: 1449458340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:23:09,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 00:23:12,417][54818] Updated weights for policy 0, policy_version 521508 (0.0026) [2024-04-28 00:23:14,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8544518144. Throughput: 0: 59283.5. Samples: 1449640080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:23:14,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 00:23:14,547][54818] Updated weights for policy 0, policy_version 521518 (0.0026) [2024-04-28 00:23:18,179][54818] Updated weights for policy 0, policy_version 521528 (0.0026) [2024-04-28 00:23:19,253][54587] Fps is (10 sec: 57343.5, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8544796672. Throughput: 0: 59282.7. Samples: 1450001160. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:23:19,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-28 00:23:20,032][54818] Updated weights for policy 0, policy_version 521538 (0.0029) [2024-04-28 00:23:23,755][54818] Updated weights for policy 0, policy_version 521548 (0.0026) [2024-04-28 00:23:24,253][54587] Fps is (10 sec: 55706.4, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8545075200. Throughput: 0: 59368.4. Samples: 1450356920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 00:23:24,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 00:23:25,645][54818] Updated weights for policy 0, policy_version 521558 (0.0025) [2024-04-28 00:23:29,249][54818] Updated weights for policy 0, policy_version 521568 (0.0025) [2024-04-28 00:23:29,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8545370112. Throughput: 0: 58800.1. Samples: 1450511240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:23:29,258][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 00:23:30,609][54798] Signal inference workers to stop experience collection... (22900 times) [2024-04-28 00:23:30,642][54818] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-04-28 00:23:30,661][54798] Signal inference workers to resume experience collection... (22900 times) [2024-04-28 00:23:30,662][54818] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-04-28 00:23:31,034][54818] Updated weights for policy 0, policy_version 521578 (0.0026) [2024-04-28 00:23:34,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8545665024. Throughput: 0: 59052.4. Samples: 1450874580. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:23:34,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 00:23:34,681][54818] Updated weights for policy 0, policy_version 521588 (0.0026) [2024-04-28 00:23:36,542][54818] Updated weights for policy 0, policy_version 521598 (0.0026) [2024-04-28 00:23:39,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58709.2, 300 sec: 59093.5). Total num frames: 8545959936. Throughput: 0: 59249.8. Samples: 1451234500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:23:39,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 00:23:40,087][54818] Updated weights for policy 0, policy_version 521608 (0.0026) [2024-04-28 00:23:42,320][54818] Updated weights for policy 0, policy_version 521618 (0.0026) [2024-04-28 00:23:44,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8546254848. Throughput: 0: 58748.4. Samples: 1451404700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:23:44,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 00:23:45,672][54818] Updated weights for policy 0, policy_version 521628 (0.0026) [2024-04-28 00:23:47,761][54818] Updated weights for policy 0, policy_version 521638 (0.0025) [2024-04-28 00:23:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 58982.3, 300 sec: 59204.6). Total num frames: 8546582528. Throughput: 0: 58843.2. Samples: 1451757440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:23:49,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 00:23:49,265][54587] No heartbeat for components: RolloutWorker_w4 (10777 seconds) [2024-04-28 00:23:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000521642_8546582528.pth... 
[2024-04-28 00:23:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000520774_8532361216.pth [2024-04-28 00:23:51,213][54818] Updated weights for policy 0, policy_version 521648 (0.0026) [2024-04-28 00:23:53,414][54818] Updated weights for policy 0, policy_version 521658 (0.0024) [2024-04-28 00:23:54,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58709.5, 300 sec: 59093.5). Total num frames: 8546861056. Throughput: 0: 58997.8. Samples: 1452113240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:23:54,253][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 00:23:56,715][54818] Updated weights for policy 0, policy_version 521668 (0.0026) [2024-04-28 00:23:58,857][54818] Updated weights for policy 0, policy_version 521678 (0.0026) [2024-04-28 00:23:59,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8547172352. Throughput: 0: 59130.4. Samples: 1452300940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:23:59,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-28 00:24:02,184][54818] Updated weights for policy 0, policy_version 521688 (0.0025) [2024-04-28 00:24:04,253][54587] Fps is (10 sec: 62258.0, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8547483648. Throughput: 0: 58864.8. Samples: 1452650080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:04,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 00:24:04,368][54818] Updated weights for policy 0, policy_version 521698 (0.0025) [2024-04-28 00:24:07,614][54818] Updated weights for policy 0, policy_version 521708 (0.0030) [2024-04-28 00:24:09,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.5, 300 sec: 59315.6). Total num frames: 8547778560. Throughput: 0: 58729.8. Samples: 1452999760. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:09,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 00:24:10,207][54818] Updated weights for policy 0, policy_version 521718 (0.0026) [2024-04-28 00:24:13,247][54818] Updated weights for policy 0, policy_version 521728 (0.0026) [2024-04-28 00:24:14,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8548073472. Throughput: 0: 59373.2. Samples: 1453183040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:24:15,735][54818] Updated weights for policy 0, policy_version 521738 (0.0029) [2024-04-28 00:24:18,704][54818] Updated weights for policy 0, policy_version 521748 (0.0024) [2024-04-28 00:24:19,055][54798] Signal inference workers to stop experience collection... (22950 times) [2024-04-28 00:24:19,055][54798] Signal inference workers to resume experience collection... (22950 times) [2024-04-28 00:24:19,067][54818] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-04-28 00:24:19,067][54818] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-04-28 00:24:19,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59528.7, 300 sec: 59149.0). Total num frames: 8548368384. Throughput: 0: 59442.4. Samples: 1453549480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:19,253][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 00:24:21,276][54818] Updated weights for policy 0, policy_version 521758 (0.0026) [2024-04-28 00:24:24,092][54818] Updated weights for policy 0, policy_version 521768 (0.0026) [2024-04-28 00:24:24,253][54587] Fps is (10 sec: 57344.7, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8548646912. Throughput: 0: 59192.7. Samples: 1453898160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:24,253][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 00:24:26,849][54818] Updated weights for policy 0, policy_version 521778 (0.0026) [2024-04-28 00:24:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8548941824. Throughput: 0: 59251.7. Samples: 1454071020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:29,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 00:24:29,587][54818] Updated weights for policy 0, policy_version 521788 (0.0026) [2024-04-28 00:24:32,513][54818] Updated weights for policy 0, policy_version 521798 (0.0025) [2024-04-28 00:24:34,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.6, 300 sec: 59149.1). Total num frames: 8549220352. Throughput: 0: 59423.4. Samples: 1454431480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:34,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 00:24:35,032][54818] Updated weights for policy 0, policy_version 521808 (0.0025) [2024-04-28 00:24:38,170][54818] Updated weights for policy 0, policy_version 521818 (0.0026) [2024-04-28 00:24:39,253][54587] Fps is (10 sec: 57344.4, 60 sec: 59255.7, 300 sec: 59149.0). Total num frames: 8549515264. Throughput: 0: 59569.0. Samples: 1454793840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:39,253][54587] Avg episode reward: [(0, '0.517')] [2024-04-28 00:24:40,510][54818] Updated weights for policy 0, policy_version 521828 (0.0026) [2024-04-28 00:24:43,658][54818] Updated weights for policy 0, policy_version 521838 (0.0026) [2024-04-28 00:24:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8549810176. Throughput: 0: 59124.5. Samples: 1454961540. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:44,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 00:24:45,945][54818] Updated weights for policy 0, policy_version 521848 (0.0026) [2024-04-28 00:24:49,221][54818] Updated weights for policy 0, policy_version 521858 (0.0026) [2024-04-28 00:24:49,253][54587] Fps is (10 sec: 60619.4, 60 sec: 58982.4, 300 sec: 59204.5). Total num frames: 8550121472. Throughput: 0: 59412.5. Samples: 1455323640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:49,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:24:51,528][54818] Updated weights for policy 0, policy_version 521868 (0.0027) [2024-04-28 00:24:54,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8550416384. Throughput: 0: 59504.3. Samples: 1455677460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 00:24:54,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 00:24:54,616][54818] Updated weights for policy 0, policy_version 521878 (0.0026) [2024-04-28 00:24:56,777][54818] Updated weights for policy 0, policy_version 521888 (0.0024) [2024-04-28 00:24:59,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8550727680. Throughput: 0: 59281.4. Samples: 1455850700. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:24:59,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 00:25:00,139][54818] Updated weights for policy 0, policy_version 521898 (0.0026) [2024-04-28 00:25:02,287][54818] Updated weights for policy 0, policy_version 521908 (0.0025) [2024-04-28 00:25:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58982.5, 300 sec: 59204.5). Total num frames: 8551022592. Throughput: 0: 59142.1. Samples: 1456210880. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:04,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 00:25:05,677][54818] Updated weights for policy 0, policy_version 521918 (0.0025) [2024-04-28 00:25:07,777][54818] Updated weights for policy 0, policy_version 521928 (0.0027) [2024-04-28 00:25:09,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8551317504. Throughput: 0: 59275.9. Samples: 1456565580. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:09,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 00:25:11,277][54818] Updated weights for policy 0, policy_version 521938 (0.0026) [2024-04-28 00:25:13,440][54818] Updated weights for policy 0, policy_version 521948 (0.0025) [2024-04-28 00:25:14,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8551612416. Throughput: 0: 59549.2. Samples: 1456750740. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:14,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 00:25:16,805][54818] Updated weights for policy 0, policy_version 521958 (0.0026) [2024-04-28 00:25:17,264][54798] Signal inference workers to stop experience collection... (23000 times) [2024-04-28 00:25:17,264][54798] Signal inference workers to resume experience collection... (23000 times) [2024-04-28 00:25:17,288][54818] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-04-28 00:25:17,288][54818] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-04-28 00:25:18,989][54818] Updated weights for policy 0, policy_version 521968 (0.0026) [2024-04-28 00:25:19,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8551923712. Throughput: 0: 59331.5. Samples: 1457101400. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:19,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 00:25:22,340][54818] Updated weights for policy 0, policy_version 521978 (0.0025) [2024-04-28 00:25:24,253][54587] Fps is (10 sec: 62259.2, 60 sec: 59801.5, 300 sec: 59315.6). Total num frames: 8552235008. Throughput: 0: 59121.1. Samples: 1457454300. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:24,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 00:25:24,632][54818] Updated weights for policy 0, policy_version 521988 (0.0026) [2024-04-28 00:25:27,574][54818] Updated weights for policy 0, policy_version 521998 (0.0025) [2024-04-28 00:25:29,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59528.5, 300 sec: 59204.5). Total num frames: 8552513536. Throughput: 0: 59642.6. Samples: 1457645460. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:29,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:25:30,389][54818] Updated weights for policy 0, policy_version 522008 (0.0025) [2024-04-28 00:25:33,097][54818] Updated weights for policy 0, policy_version 522018 (0.0024) [2024-04-28 00:25:34,253][54587] Fps is (10 sec: 57344.7, 60 sec: 59801.6, 300 sec: 59204.6). Total num frames: 8552808448. Throughput: 0: 59438.0. Samples: 1457998340. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:34,253][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:25:35,799][54818] Updated weights for policy 0, policy_version 522028 (0.0028) [2024-04-28 00:25:38,538][54818] Updated weights for policy 0, policy_version 522038 (0.0026) [2024-04-28 00:25:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59801.5, 300 sec: 59204.5). Total num frames: 8553103360. Throughput: 0: 59290.7. Samples: 1458345540. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:39,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 00:25:41,407][54818] Updated weights for policy 0, policy_version 522048 (0.0029) [2024-04-28 00:25:43,881][54818] Updated weights for policy 0, policy_version 522058 (0.0025) [2024-04-28 00:25:44,253][54587] Fps is (10 sec: 58981.2, 60 sec: 59801.4, 300 sec: 59204.5). Total num frames: 8553398272. Throughput: 0: 59613.1. Samples: 1458533300. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:44,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 00:25:46,927][54818] Updated weights for policy 0, policy_version 522068 (0.0026) [2024-04-28 00:25:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59801.6, 300 sec: 59260.1). Total num frames: 8553709568. Throughput: 0: 59537.7. Samples: 1458890080. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:49,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 00:25:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000522077_8553709568.pth... [2024-04-28 00:25:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000521210_8539504640.pth [2024-04-28 00:25:49,514][54818] Updated weights for policy 0, policy_version 522078 (0.0024) [2024-04-28 00:25:52,533][54818] Updated weights for policy 0, policy_version 522088 (0.0024) [2024-04-28 00:25:54,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59801.6, 300 sec: 59260.1). Total num frames: 8554004480. Throughput: 0: 59590.1. Samples: 1459247140. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:54,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 00:25:54,996][54818] Updated weights for policy 0, policy_version 522098 (0.0025) [2024-04-28 00:25:57,981][54818] Updated weights for policy 0, policy_version 522108 (0.0026) [2024-04-28 00:25:59,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.3, 300 sec: 59260.1). 
Total num frames: 8554283008. Throughput: 0: 59359.9. Samples: 1459421940. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:25:59,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 00:26:00,619][54818] Updated weights for policy 0, policy_version 522118 (0.0026) [2024-04-28 00:26:03,437][54818] Updated weights for policy 0, policy_version 522128 (0.0024) [2024-04-28 00:26:04,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8554577920. Throughput: 0: 59575.9. Samples: 1459782320. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:26:04,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 00:26:06,160][54818] Updated weights for policy 0, policy_version 522138 (0.0027) [2024-04-28 00:26:08,822][54818] Updated weights for policy 0, policy_version 522148 (0.0025) [2024-04-28 00:26:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8554889216. Throughput: 0: 59610.6. Samples: 1460136780. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:26:09,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 00:26:11,577][54818] Updated weights for policy 0, policy_version 522158 (0.0026) [2024-04-28 00:26:14,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8555184128. Throughput: 0: 59310.1. Samples: 1460314420. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:26:14,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 00:26:14,380][54818] Updated weights for policy 0, policy_version 522168 (0.0026) [2024-04-28 00:26:16,960][54818] Updated weights for policy 0, policy_version 522178 (0.0026) [2024-04-28 00:26:19,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8555479040. Throughput: 0: 59273.7. Samples: 1460665660. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:26:19,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 00:26:19,908][54818] Updated weights for policy 0, policy_version 522188 (0.0026) [2024-04-28 00:26:20,715][54798] Signal inference workers to stop experience collection... (23050 times) [2024-04-28 00:26:20,715][54798] Signal inference workers to resume experience collection... (23050 times) [2024-04-28 00:26:20,727][54818] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-04-28 00:26:20,727][54818] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-04-28 00:26:22,520][54818] Updated weights for policy 0, policy_version 522198 (0.0026) [2024-04-28 00:26:24,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.4, 300 sec: 59204.5). Total num frames: 8555773952. Throughput: 0: 59725.7. Samples: 1461033200. Policy #0 lag: (min: 2.0, avg: 9.8, max: 22.0) [2024-04-28 00:26:24,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 00:26:25,436][54818] Updated weights for policy 0, policy_version 522208 (0.0023) [2024-04-28 00:26:27,949][54818] Updated weights for policy 0, policy_version 522218 (0.0025) [2024-04-28 00:26:29,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8556068864. Throughput: 0: 59427.1. Samples: 1461207520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:26:29,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 00:26:31,085][54818] Updated weights for policy 0, policy_version 522228 (0.0025) [2024-04-28 00:26:33,418][54818] Updated weights for policy 0, policy_version 522238 (0.0023) [2024-04-28 00:26:34,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8556380160. Throughput: 0: 59446.4. Samples: 1461565160. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:26:34,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 00:26:36,486][54818] Updated weights for policy 0, policy_version 522248 (0.0025) [2024-04-28 00:26:38,910][54818] Updated weights for policy 0, policy_version 522258 (0.0024) [2024-04-28 00:26:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8556675072. Throughput: 0: 59341.7. Samples: 1461917520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:26:39,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 00:26:41,984][54818] Updated weights for policy 0, policy_version 522268 (0.0025) [2024-04-28 00:26:44,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59801.7, 300 sec: 59315.6). Total num frames: 8556986368. Throughput: 0: 59468.1. Samples: 1462098000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:26:44,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 00:26:44,489][54818] Updated weights for policy 0, policy_version 522278 (0.0026) [2024-04-28 00:26:47,283][54818] Updated weights for policy 0, policy_version 522288 (0.0025) [2024-04-28 00:26:49,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8557264896. Throughput: 0: 59445.3. Samples: 1462457360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:26:49,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 00:26:49,262][54587] No heartbeat for components: RolloutWorker_w4 (10957 seconds) [2024-04-28 00:26:50,220][54818] Updated weights for policy 0, policy_version 522298 (0.0026) [2024-04-28 00:26:52,719][54818] Updated weights for policy 0, policy_version 522308 (0.0026) [2024-04-28 00:26:54,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59528.6, 300 sec: 59315.6). Total num frames: 8557576192. Throughput: 0: 59450.9. Samples: 1462812060. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:26:54,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 00:26:55,677][54818] Updated weights for policy 0, policy_version 522318 (0.0026) [2024-04-28 00:26:58,303][54818] Updated weights for policy 0, policy_version 522328 (0.0026) [2024-04-28 00:26:59,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59801.8, 300 sec: 59315.7). Total num frames: 8557871104. Throughput: 0: 59625.6. Samples: 1462997560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:26:59,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 00:27:01,426][54818] Updated weights for policy 0, policy_version 522338 (0.0027) [2024-04-28 00:27:03,769][54818] Updated weights for policy 0, policy_version 522348 (0.0026) [2024-04-28 00:27:04,254][54587] Fps is (10 sec: 60617.5, 60 sec: 60074.2, 300 sec: 59426.6). Total num frames: 8558182400. Throughput: 0: 59727.8. Samples: 1463353440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 00:27:06,771][54818] Updated weights for policy 0, policy_version 522358 (0.0023) [2024-04-28 00:27:09,225][54818] Updated weights for policy 0, policy_version 522368 (0.0025) [2024-04-28 00:27:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59801.8, 300 sec: 59426.7). Total num frames: 8558477312. Throughput: 0: 59506.9. Samples: 1463711000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:09,253][54587] Avg episode reward: [(0, '0.689')] [2024-04-28 00:27:12,289][54818] Updated weights for policy 0, policy_version 522378 (0.0027) [2024-04-28 00:27:14,253][54587] Fps is (10 sec: 57347.4, 60 sec: 59528.8, 300 sec: 59371.2). Total num frames: 8558755840. Throughput: 0: 59647.0. Samples: 1463891620. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:14,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:27:14,675][54818] Updated weights for policy 0, policy_version 522388 (0.0026) [2024-04-28 00:27:17,798][54818] Updated weights for policy 0, policy_version 522398 (0.0026) [2024-04-28 00:27:19,253][54587] Fps is (10 sec: 55704.9, 60 sec: 59255.5, 300 sec: 59371.2). Total num frames: 8559034368. Throughput: 0: 59688.0. Samples: 1464251120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:19,254][54587] Avg episode reward: [(0, '0.681')] [2024-04-28 00:27:20,160][54818] Updated weights for policy 0, policy_version 522408 (0.0026) [2024-04-28 00:27:20,855][54798] Signal inference workers to stop experience collection... (23100 times) [2024-04-28 00:27:20,855][54798] Signal inference workers to resume experience collection... (23100 times) [2024-04-28 00:27:20,881][54818] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-04-28 00:27:20,881][54818] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-04-28 00:27:23,434][54818] Updated weights for policy 0, policy_version 522418 (0.0025) [2024-04-28 00:27:24,253][54587] Fps is (10 sec: 60619.5, 60 sec: 59801.6, 300 sec: 59426.7). Total num frames: 8559362048. Throughput: 0: 59861.7. Samples: 1464611300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:24,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 00:27:25,562][54818] Updated weights for policy 0, policy_version 522428 (0.0026) [2024-04-28 00:27:29,005][54818] Updated weights for policy 0, policy_version 522438 (0.0027) [2024-04-28 00:27:29,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59528.5, 300 sec: 59371.2). Total num frames: 8559640576. Throughput: 0: 59498.6. Samples: 1464775440. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:29,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-28 00:27:31,303][54818] Updated weights for policy 0, policy_version 522448 (0.0025) [2024-04-28 00:27:34,253][54587] Fps is (10 sec: 57345.0, 60 sec: 59255.6, 300 sec: 59315.6). Total num frames: 8559935488. Throughput: 0: 59644.6. Samples: 1465141360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:34,253][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 00:27:34,303][54818] Updated weights for policy 0, policy_version 522458 (0.0026) [2024-04-28 00:27:36,781][54818] Updated weights for policy 0, policy_version 522468 (0.0025) [2024-04-28 00:27:39,253][54587] Fps is (10 sec: 58983.5, 60 sec: 59255.6, 300 sec: 59315.7). Total num frames: 8560230400. Throughput: 0: 59840.9. Samples: 1465504900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:39,253][54587] Avg episode reward: [(0, '0.678')] [2024-04-28 00:27:39,672][54818] Updated weights for policy 0, policy_version 522478 (0.0026) [2024-04-28 00:27:42,049][54818] Updated weights for policy 0, policy_version 522488 (0.0025) [2024-04-28 00:27:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8560525312. Throughput: 0: 59480.4. Samples: 1465674180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:44,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-28 00:27:45,077][54818] Updated weights for policy 0, policy_version 522498 (0.0026) [2024-04-28 00:27:47,484][54818] Updated weights for policy 0, policy_version 522508 (0.0024) [2024-04-28 00:27:49,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8560836608. Throughput: 0: 59485.0. Samples: 1466030240. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:27:49,254][54587] Avg episode reward: [(0, '0.692')] [2024-04-28 00:27:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000522512_8560836608.pth... [2024-04-28 00:27:49,323][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000521642_8546582528.pth [2024-04-28 00:27:50,465][54818] Updated weights for policy 0, policy_version 522518 (0.0026) [2024-04-28 00:27:53,069][54818] Updated weights for policy 0, policy_version 522528 (0.0026) [2024-04-28 00:27:54,253][54587] Fps is (10 sec: 62259.0, 60 sec: 59528.5, 300 sec: 59371.2). Total num frames: 8561147904. Throughput: 0: 59622.5. Samples: 1466394020. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:27:54,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 00:27:56,007][54818] Updated weights for policy 0, policy_version 522538 (0.0026) [2024-04-28 00:27:58,654][54818] Updated weights for policy 0, policy_version 522548 (0.0026) [2024-04-28 00:27:59,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59528.4, 300 sec: 59371.2). Total num frames: 8561442816. Throughput: 0: 59680.2. Samples: 1466577240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:27:59,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 00:28:01,683][54818] Updated weights for policy 0, policy_version 522558 (0.0026) [2024-04-28 00:28:04,117][54818] Updated weights for policy 0, policy_version 522568 (0.0025) [2024-04-28 00:28:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59529.0, 300 sec: 59426.7). Total num frames: 8561754112. Throughput: 0: 59446.7. Samples: 1466926220. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:04,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 00:28:07,338][54818] Updated weights for policy 0, policy_version 522578 (0.0025) [2024-04-28 00:28:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59528.3, 300 sec: 59426.7). 
Total num frames: 8562049024. Throughput: 0: 59346.7. Samples: 1467281900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:09,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 00:28:09,756][54818] Updated weights for policy 0, policy_version 522588 (0.0026) [2024-04-28 00:28:12,747][54818] Updated weights for policy 0, policy_version 522598 (0.0028) [2024-04-28 00:28:14,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59528.4, 300 sec: 59426.7). Total num frames: 8562327552. Throughput: 0: 59690.8. Samples: 1467461520. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:14,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 00:28:15,202][54818] Updated weights for policy 0, policy_version 522608 (0.0026) [2024-04-28 00:28:16,500][54798] Signal inference workers to stop experience collection... (23150 times) [2024-04-28 00:28:16,500][54798] Signal inference workers to resume experience collection... (23150 times) [2024-04-28 00:28:16,509][54818] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-04-28 00:28:16,526][54818] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-04-28 00:28:18,302][54818] Updated weights for policy 0, policy_version 522618 (0.0027) [2024-04-28 00:28:19,253][54587] Fps is (10 sec: 58982.6, 60 sec: 60074.6, 300 sec: 59537.8). Total num frames: 8562638848. Throughput: 0: 59532.3. Samples: 1467820320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:19,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 00:28:20,754][54818] Updated weights for policy 0, policy_version 522628 (0.0025) [2024-04-28 00:28:23,899][54818] Updated weights for policy 0, policy_version 522638 (0.0025) [2024-04-28 00:28:24,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.5, 300 sec: 59482.2). Total num frames: 8562917376. Throughput: 0: 59487.8. Samples: 1468181860. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:24,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 00:28:26,395][54818] Updated weights for policy 0, policy_version 522648 (0.0026) [2024-04-28 00:28:29,254][54587] Fps is (10 sec: 57343.4, 60 sec: 59528.5, 300 sec: 59482.2). Total num frames: 8563212288. Throughput: 0: 59521.5. Samples: 1468352660. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:29,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 00:28:29,403][54818] Updated weights for policy 0, policy_version 522658 (0.0021) [2024-04-28 00:28:32,010][54818] Updated weights for policy 0, policy_version 522668 (0.0026) [2024-04-28 00:28:34,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59528.5, 300 sec: 59482.3). Total num frames: 8563507200. Throughput: 0: 59550.7. Samples: 1468710020. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:34,262][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 00:28:34,752][54818] Updated weights for policy 0, policy_version 522678 (0.0026) [2024-04-28 00:28:37,752][54818] Updated weights for policy 0, policy_version 522688 (0.0027) [2024-04-28 00:28:39,253][54587] Fps is (10 sec: 58983.7, 60 sec: 59528.5, 300 sec: 59482.3). Total num frames: 8563802112. Throughput: 0: 59252.1. Samples: 1469060360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:39,262][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 00:28:40,375][54818] Updated weights for policy 0, policy_version 522698 (0.0027) [2024-04-28 00:28:43,256][54818] Updated weights for policy 0, policy_version 522708 (0.0026) [2024-04-28 00:28:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59801.5, 300 sec: 59426.7). Total num frames: 8564113408. Throughput: 0: 59064.1. Samples: 1469235120. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:44,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 00:28:45,949][54818] Updated weights for policy 0, policy_version 522718 (0.0026) [2024-04-28 00:28:48,765][54818] Updated weights for policy 0, policy_version 522728 (0.0025) [2024-04-28 00:28:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 59801.7, 300 sec: 59537.8). Total num frames: 8564424704. Throughput: 0: 59185.8. Samples: 1469589580. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:49,262][54587] Avg episode reward: [(0, '0.672')] [2024-04-28 00:28:51,616][54818] Updated weights for policy 0, policy_version 522738 (0.0026) [2024-04-28 00:28:54,145][54818] Updated weights for policy 0, policy_version 522748 (0.0024) [2024-04-28 00:28:54,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59255.5, 300 sec: 59426.7). Total num frames: 8564703232. Throughput: 0: 59320.7. Samples: 1469951320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 00:28:57,129][54818] Updated weights for policy 0, policy_version 522758 (0.0027) [2024-04-28 00:28:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.6, 300 sec: 59371.2). Total num frames: 8564998144. Throughput: 0: 59268.1. Samples: 1470128580. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:28:59,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 00:28:59,266][54798] Signal inference workers to stop experience collection... (23200 times) [2024-04-28 00:28:59,271][54798] Signal inference workers to resume experience collection... 
(23200 times) [2024-04-28 00:28:59,284][54818] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-04-28 00:28:59,284][54818] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-04-28 00:28:59,610][54818] Updated weights for policy 0, policy_version 522768 (0.0026) [2024-04-28 00:29:02,504][54818] Updated weights for policy 0, policy_version 522778 (0.0024) [2024-04-28 00:29:04,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58709.3, 300 sec: 59315.6). Total num frames: 8565276672. Throughput: 0: 59026.8. Samples: 1470476520. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:29:04,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 00:29:05,102][54818] Updated weights for policy 0, policy_version 522788 (0.0026) [2024-04-28 00:29:07,949][54818] Updated weights for policy 0, policy_version 522798 (0.0026) [2024-04-28 00:29:09,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58982.5, 300 sec: 59371.2). Total num frames: 8565587968. Throughput: 0: 58819.6. Samples: 1470828740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:29:09,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 00:29:10,722][54818] Updated weights for policy 0, policy_version 522808 (0.0026) [2024-04-28 00:29:13,508][54818] Updated weights for policy 0, policy_version 522818 (0.0026) [2024-04-28 00:29:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.5, 300 sec: 59371.2). Total num frames: 8565882880. Throughput: 0: 58982.9. Samples: 1471006880. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:29:14,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 00:29:16,265][54818] Updated weights for policy 0, policy_version 522828 (0.0025) [2024-04-28 00:29:19,082][54818] Updated weights for policy 0, policy_version 522838 (0.0023) [2024-04-28 00:29:19,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.4, 300 sec: 59426.7). Total num frames: 8566177792. Throughput: 0: 58872.3. Samples: 1471359280. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 00:29:19,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 00:29:21,831][54818] Updated weights for policy 0, policy_version 522848 (0.0026) [2024-04-28 00:29:24,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.4, 300 sec: 59371.2). Total num frames: 8566456320. Throughput: 0: 58924.3. Samples: 1471711960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:29:24,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 00:29:24,513][54818] Updated weights for policy 0, policy_version 522858 (0.0024) [2024-04-28 00:29:27,967][54818] Updated weights for policy 0, policy_version 522868 (0.0026) [2024-04-28 00:29:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.5, 300 sec: 59426.7). Total num frames: 8566751232. Throughput: 0: 58880.4. Samples: 1471884740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:29:29,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 00:29:30,381][54818] Updated weights for policy 0, policy_version 522878 (0.0025) [2024-04-28 00:29:33,479][54818] Updated weights for policy 0, policy_version 522888 (0.0025) [2024-04-28 00:29:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.4, 300 sec: 59426.7). Total num frames: 8567046144. Throughput: 0: 58718.2. Samples: 1472231900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:29:34,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 00:29:35,900][54818] Updated weights for policy 0, policy_version 522898 (0.0025) [2024-04-28 00:29:38,861][54818] Updated weights for policy 0, policy_version 522908 (0.0026) [2024-04-28 00:29:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.3, 300 sec: 59426.7). Total num frames: 8567341056. Throughput: 0: 58651.9. Samples: 1472590660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:29:39,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 00:29:41,417][54818] Updated weights for policy 0, policy_version 522918 (0.0025) [2024-04-28 00:29:44,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58709.3, 300 sec: 59371.2). Total num frames: 8567635968. Throughput: 0: 58546.9. Samples: 1472763200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:29:44,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-28 00:29:44,754][54818] Updated weights for policy 0, policy_version 522928 (0.0025) [2024-04-28 00:29:47,031][54818] Updated weights for policy 0, policy_version 522938 (0.0025) [2024-04-28 00:29:49,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58436.2, 300 sec: 59371.2). Total num frames: 8567930880. Throughput: 0: 58676.3. Samples: 1473116960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:29:49,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:29:49,266][54587] No heartbeat for components: RolloutWorker_w4 (11137 seconds) [2024-04-28 00:29:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000522945_8567930880.pth... [2024-04-28 00:29:49,329][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000522077_8553709568.pth [2024-04-28 00:29:50,320][54818] Updated weights for policy 0, policy_version 522948 (0.0026) [2024-04-28 00:29:53,052][54818] Updated weights for policy 0, policy_version 522958 (0.0026) [2024-04-28 00:29:54,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.2, 300 sec: 59315.6). Total num frames: 8568225792. Throughput: 0: 58512.4. Samples: 1473461800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:29:54,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 00:29:55,646][54798] Signal inference workers to stop experience collection... (23250 times) [2024-04-28 00:29:55,647][54798] Signal inference workers to resume experience collection... 
(23250 times) [2024-04-28 00:29:55,673][54818] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-04-28 00:29:55,673][54818] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-04-28 00:29:55,760][54818] Updated weights for policy 0, policy_version 522968 (0.0027) [2024-04-28 00:29:58,391][54818] Updated weights for policy 0, policy_version 522978 (0.0026) [2024-04-28 00:29:59,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58436.2, 300 sec: 59260.1). Total num frames: 8568504320. Throughput: 0: 58562.2. Samples: 1473642180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:29:59,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 00:30:01,425][54818] Updated weights for policy 0, policy_version 522988 (0.0026) [2024-04-28 00:30:03,886][54818] Updated weights for policy 0, policy_version 522998 (0.0030) [2024-04-28 00:30:04,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.3, 300 sec: 59315.6). Total num frames: 8568815616. Throughput: 0: 58632.9. Samples: 1473997760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:04,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 00:30:07,029][54818] Updated weights for policy 0, policy_version 523008 (0.0026) [2024-04-28 00:30:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.3, 300 sec: 59315.6). Total num frames: 8569110528. Throughput: 0: 58611.1. Samples: 1474349460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:09,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 00:30:09,309][54818] Updated weights for policy 0, policy_version 523018 (0.0026) [2024-04-28 00:30:12,375][54818] Updated weights for policy 0, policy_version 523028 (0.0026) [2024-04-28 00:30:14,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.3, 300 sec: 59260.1). Total num frames: 8569405440. Throughput: 0: 58944.9. Samples: 1474537260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:14,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 00:30:14,783][54818] Updated weights for policy 0, policy_version 523038 (0.0027) [2024-04-28 00:30:17,967][54818] Updated weights for policy 0, policy_version 523048 (0.0026) [2024-04-28 00:30:19,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.4, 300 sec: 59204.6). Total num frames: 8569700352. Throughput: 0: 58960.9. Samples: 1474885140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:19,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 00:30:20,251][54818] Updated weights for policy 0, policy_version 523058 (0.0023) [2024-04-28 00:30:23,363][54818] Updated weights for policy 0, policy_version 523068 (0.0026) [2024-04-28 00:30:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.6, 300 sec: 59315.6). Total num frames: 8570011648. Throughput: 0: 58853.0. Samples: 1475239040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:24,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 00:30:25,864][54818] Updated weights for policy 0, policy_version 523078 (0.0026) [2024-04-28 00:30:28,885][54818] Updated weights for policy 0, policy_version 523088 (0.0026) [2024-04-28 00:30:29,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8570290176. Throughput: 0: 59175.8. Samples: 1475426100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:29,253][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 00:30:31,737][54818] Updated weights for policy 0, policy_version 523098 (0.0024) [2024-04-28 00:30:34,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.4, 300 sec: 59260.1). Total num frames: 8570585088. Throughput: 0: 59038.8. Samples: 1475773700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:34,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:30:34,405][54818] Updated weights for policy 0, policy_version 523108 (0.0029) [2024-04-28 00:30:37,418][54818] Updated weights for policy 0, policy_version 523118 (0.0026) [2024-04-28 00:30:39,253][54587] Fps is (10 sec: 57342.8, 60 sec: 58709.2, 300 sec: 59204.6). Total num frames: 8570863616. Throughput: 0: 59382.6. Samples: 1476134020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:39,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 00:30:39,825][54818] Updated weights for policy 0, policy_version 523128 (0.0025) [2024-04-28 00:30:42,852][54818] Updated weights for policy 0, policy_version 523138 (0.0026) [2024-04-28 00:30:44,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58436.4, 300 sec: 59093.5). Total num frames: 8571142144. Throughput: 0: 59028.1. Samples: 1476298440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:44,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 00:30:45,401][54818] Updated weights for policy 0, policy_version 523148 (0.0024) [2024-04-28 00:30:45,404][54798] Signal inference workers to stop experience collection... (23300 times) [2024-04-28 00:30:45,404][54798] Signal inference workers to resume experience collection... (23300 times) [2024-04-28 00:30:45,430][54818] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-04-28 00:30:45,430][54818] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-04-28 00:30:48,545][54818] Updated weights for policy 0, policy_version 523158 (0.0025) [2024-04-28 00:30:49,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8571453440. Throughput: 0: 59158.2. Samples: 1476659880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 00:30:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 00:30:50,778][54818] Updated weights for policy 0, policy_version 523168 (0.0027) [2024-04-28 00:30:54,137][54818] Updated weights for policy 0, policy_version 523178 (0.0024) [2024-04-28 00:30:54,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58709.3, 300 sec: 59204.6). Total num frames: 8571748352. Throughput: 0: 59392.4. Samples: 1477022120. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:30:54,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 00:30:56,236][54818] Updated weights for policy 0, policy_version 523188 (0.0023) [2024-04-28 00:30:59,253][54587] Fps is (10 sec: 57345.1, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8572026880. Throughput: 0: 59009.0. Samples: 1477192660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:30:59,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 00:30:59,735][54818] Updated weights for policy 0, policy_version 523198 (0.0026) [2024-04-28 00:31:01,763][54818] Updated weights for policy 0, policy_version 523208 (0.0027) [2024-04-28 00:31:04,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8572354560. Throughput: 0: 59094.3. Samples: 1477544380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:04,262][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 00:31:05,297][54818] Updated weights for policy 0, policy_version 523218 (0.0026) [2024-04-28 00:31:07,339][54818] Updated weights for policy 0, policy_version 523228 (0.0027) [2024-04-28 00:31:09,253][54587] Fps is (10 sec: 62258.4, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8572649472. Throughput: 0: 59060.7. Samples: 1477896780. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:09,254][54587] Avg episode reward: [(0, '0.483')] [2024-04-28 00:31:10,977][54818] Updated weights for policy 0, policy_version 523238 (0.0026) [2024-04-28 00:31:12,966][54818] Updated weights for policy 0, policy_version 523248 (0.0024) [2024-04-28 00:31:14,253][54587] Fps is (10 sec: 62258.9, 60 sec: 59528.6, 300 sec: 59315.6). Total num frames: 8572977152. Throughput: 0: 59169.7. Samples: 1478088740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:14,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 00:31:16,539][54818] Updated weights for policy 0, policy_version 523258 (0.0024) [2024-04-28 00:31:18,490][54818] Updated weights for policy 0, policy_version 523268 (0.0022) [2024-04-28 00:31:19,253][54587] Fps is (10 sec: 63897.6, 60 sec: 59801.5, 300 sec: 59371.2). Total num frames: 8573288448. Throughput: 0: 58999.0. Samples: 1478428660. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:19,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-28 00:31:20,833][54798] Signal inference workers to stop experience collection... (23350 times) [2024-04-28 00:31:20,867][54818] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-04-28 00:31:20,885][54798] Signal inference workers to resume experience collection... (23350 times) [2024-04-28 00:31:20,886][54818] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-04-28 00:31:22,029][54818] Updated weights for policy 0, policy_version 523278 (0.0024) [2024-04-28 00:31:24,039][54818] Updated weights for policy 0, policy_version 523288 (0.0023) [2024-04-28 00:31:24,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59528.6, 300 sec: 59371.2). Total num frames: 8573583360. Throughput: 0: 59056.8. Samples: 1478791560. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:24,253][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 00:31:27,409][54818] Updated weights for policy 0, policy_version 523298 (0.0025) [2024-04-28 00:31:29,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8573861888. Throughput: 0: 59617.2. Samples: 1478981220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:29,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 00:31:29,354][54818] Updated weights for policy 0, policy_version 523308 (0.0025) [2024-04-28 00:31:32,836][54818] Updated weights for policy 0, policy_version 523318 (0.0026) [2024-04-28 00:31:34,253][54587] Fps is (10 sec: 57343.2, 60 sec: 59528.5, 300 sec: 59260.1). Total num frames: 8574156800. Throughput: 0: 59446.8. Samples: 1479334980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:34,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 00:31:34,762][54818] Updated weights for policy 0, policy_version 523328 (0.0026) [2024-04-28 00:31:38,445][54818] Updated weights for policy 0, policy_version 523338 (0.0024) [2024-04-28 00:31:39,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59801.6, 300 sec: 59204.5). Total num frames: 8574451712. Throughput: 0: 59218.6. Samples: 1479686960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:39,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 00:31:40,092][54818] Updated weights for policy 0, policy_version 523348 (0.0024) [2024-04-28 00:31:43,820][54818] Updated weights for policy 0, policy_version 523358 (0.0026) [2024-04-28 00:31:44,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59801.5, 300 sec: 59204.6). Total num frames: 8574730240. Throughput: 0: 59364.8. Samples: 1479864080. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:44,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 00:31:45,599][54818] Updated weights for policy 0, policy_version 523368 (0.0025) [2024-04-28 00:31:49,253][54587] Fps is (10 sec: 54067.4, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8574992384. Throughput: 0: 59541.7. Samples: 1480223760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:49,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 00:31:49,363][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000523377_8575008768.pth... [2024-04-28 00:31:49,414][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000522512_8560836608.pth [2024-04-28 00:31:49,543][54818] Updated weights for policy 0, policy_version 523378 (0.0025) [2024-04-28 00:31:51,226][54818] Updated weights for policy 0, policy_version 523388 (0.0025) [2024-04-28 00:31:54,038][54798] Signal inference workers to stop experience collection... (23400 times) [2024-04-28 00:31:54,043][54798] Signal inference workers to resume experience collection... (23400 times) [2024-04-28 00:31:54,063][54818] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-04-28 00:31:54,064][54818] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-04-28 00:31:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8575303680. Throughput: 0: 59553.8. Samples: 1480576700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:54,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-28 00:31:54,989][54818] Updated weights for policy 0, policy_version 523398 (0.0026) [2024-04-28 00:31:56,753][54818] Updated weights for policy 0, policy_version 523408 (0.0025) [2024-04-28 00:31:59,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59255.5, 300 sec: 58982.5). Total num frames: 8575582208. Throughput: 0: 58777.4. Samples: 1480733720. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:31:59,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 00:32:00,376][54818] Updated weights for policy 0, policy_version 523418 (0.0024) [2024-04-28 00:32:02,950][54818] Updated weights for policy 0, policy_version 523428 (0.0025) [2024-04-28 00:32:04,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8575877120. Throughput: 0: 59240.2. Samples: 1481094460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:32:04,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 00:32:05,906][54818] Updated weights for policy 0, policy_version 523438 (0.0026) [2024-04-28 00:32:08,724][54818] Updated weights for policy 0, policy_version 523448 (0.0025) [2024-04-28 00:32:09,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.5, 300 sec: 59037.9). Total num frames: 8576172032. Throughput: 0: 59288.8. Samples: 1481459560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:32:09,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 00:32:11,399][54818] Updated weights for policy 0, policy_version 523458 (0.0024) [2024-04-28 00:32:14,222][54818] Updated weights for policy 0, policy_version 523468 (0.0025) [2024-04-28 00:32:14,253][54587] Fps is (10 sec: 62258.3, 60 sec: 58709.3, 300 sec: 59204.5). Total num frames: 8576499712. Throughput: 0: 58720.4. Samples: 1481623640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 00:32:14,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 00:32:16,969][54818] Updated weights for policy 0, policy_version 523478 (0.0025) [2024-04-28 00:32:19,253][54587] Fps is (10 sec: 62257.8, 60 sec: 58436.2, 300 sec: 59093.5). Total num frames: 8576794624. Throughput: 0: 58547.4. Samples: 1481969620. 
Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:32:19,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 00:32:20,165][54818] Updated weights for policy 0, policy_version 523488 (0.0023) [2024-04-28 00:32:22,378][54818] Updated weights for policy 0, policy_version 523498 (0.0021) [2024-04-28 00:32:22,821][54798] Signal inference workers to stop experience collection... (23450 times) [2024-04-28 00:32:22,824][54798] Signal inference workers to resume experience collection... (23450 times) [2024-04-28 00:32:22,845][54818] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-04-28 00:32:22,845][54818] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-04-28 00:32:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 58709.2, 300 sec: 59204.6). Total num frames: 8577105920. Throughput: 0: 58616.6. Samples: 1482324700. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:32:24,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 00:32:25,698][54818] Updated weights for policy 0, policy_version 523508 (0.0025) [2024-04-28 00:32:27,772][54818] Updated weights for policy 0, policy_version 523518 (0.0020) [2024-04-28 00:32:29,253][54587] Fps is (10 sec: 62260.4, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8577417216. Throughput: 0: 59147.6. Samples: 1482525720. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:32:29,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-28 00:32:31,049][54818] Updated weights for policy 0, policy_version 523528 (0.0026) [2024-04-28 00:32:33,282][54818] Updated weights for policy 0, policy_version 523538 (0.0022) [2024-04-28 00:32:34,253][54587] Fps is (10 sec: 62258.9, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8577728512. Throughput: 0: 59048.5. Samples: 1482880940. 
Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:32:34,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 00:32:36,576][54818] Updated weights for policy 0, policy_version 523548 (0.0026) [2024-04-28 00:32:38,798][54818] Updated weights for policy 0, policy_version 523558 (0.0022) [2024-04-28 00:32:39,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8578007040. Throughput: 0: 58705.0. Samples: 1483218420. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:32:39,254][54587] Avg episode reward: [(0, '0.683')] [2024-04-28 00:32:42,129][54818] Updated weights for policy 0, policy_version 523568 (0.0025) [2024-04-28 00:32:44,231][54818] Updated weights for policy 0, policy_version 523578 (0.0025) [2024-04-28 00:32:44,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8578301952. Throughput: 0: 59329.2. Samples: 1483403540. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:32:44,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-28 00:32:47,645][54818] Updated weights for policy 0, policy_version 523588 (0.0026) [2024-04-28 00:32:49,253][54587] Fps is (10 sec: 55705.6, 60 sec: 59528.6, 300 sec: 59037.9). Total num frames: 8578564096. Throughput: 0: 59280.3. Samples: 1483762080. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:32:49,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 00:32:49,263][54587] No heartbeat for components: RolloutWorker_w4 (11317 seconds) [2024-04-28 00:32:49,637][54818] Updated weights for policy 0, policy_version 523598 (0.0024) [2024-04-28 00:32:53,043][54818] Updated weights for policy 0, policy_version 523608 (0.0026) [2024-04-28 00:32:54,013][54798] Signal inference workers to stop experience collection... 
(23500 times) [2024-04-28 00:32:54,049][54818] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-04-28 00:32:54,061][54798] Signal inference workers to resume experience collection... (23500 times) [2024-04-28 00:32:54,067][54818] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-04-28 00:32:54,253][54587] Fps is (10 sec: 54067.5, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8578842624. Throughput: 0: 59267.5. Samples: 1484126600. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:32:54,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 00:32:55,053][54818] Updated weights for policy 0, policy_version 523618 (0.0025) [2024-04-28 00:32:59,241][54818] Updated weights for policy 0, policy_version 523628 (0.0025) [2024-04-28 00:32:59,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58982.3, 300 sec: 58871.3). Total num frames: 8579121152. Throughput: 0: 59228.1. Samples: 1484288900. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:32:59,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 00:33:00,704][54818] Updated weights for policy 0, policy_version 523638 (0.0023) [2024-04-28 00:33:04,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59255.3, 300 sec: 58926.9). Total num frames: 8579432448. Throughput: 0: 59534.3. Samples: 1484648660. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:33:04,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:33:04,514][54818] Updated weights for policy 0, policy_version 523648 (0.0026) [2024-04-28 00:33:06,123][54818] Updated weights for policy 0, policy_version 523658 (0.0027) [2024-04-28 00:33:09,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59528.5, 300 sec: 59038.0). Total num frames: 8579743744. Throughput: 0: 59611.6. Samples: 1485007220. 
Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:33:09,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 00:33:10,053][54818] Updated weights for policy 0, policy_version 523668 (0.0026) [2024-04-28 00:33:11,558][54818] Updated weights for policy 0, policy_version 523678 (0.0022) [2024-04-28 00:33:14,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8580038656. Throughput: 0: 58730.2. Samples: 1485168580. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:33:14,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 00:33:15,404][54818] Updated weights for policy 0, policy_version 523688 (0.0026) [2024-04-28 00:33:17,100][54818] Updated weights for policy 0, policy_version 523698 (0.0024) [2024-04-28 00:33:19,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.6, 300 sec: 59038.0). Total num frames: 8580333568. Throughput: 0: 58965.0. Samples: 1485534360. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:33:19,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 00:33:20,771][54818] Updated weights for policy 0, policy_version 523708 (0.0025) [2024-04-28 00:33:22,810][54818] Updated weights for policy 0, policy_version 523718 (0.0025) [2024-04-28 00:33:24,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8580612096. Throughput: 0: 59659.6. Samples: 1485903100. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:33:24,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 00:33:26,184][54818] Updated weights for policy 0, policy_version 523728 (0.0026) [2024-04-28 00:33:26,707][54798] Signal inference workers to stop experience collection... (23550 times) [2024-04-28 00:33:26,707][54798] Signal inference workers to resume experience collection... 
(23550 times) [2024-04-28 00:33:26,727][54818] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-04-28 00:33:26,727][54818] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-04-28 00:33:28,353][54818] Updated weights for policy 0, policy_version 523738 (0.0024) [2024-04-28 00:33:29,253][54587] Fps is (10 sec: 58981.4, 60 sec: 58436.1, 300 sec: 59037.9). Total num frames: 8580923392. Throughput: 0: 59335.9. Samples: 1486073660. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:33:29,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 00:33:31,618][54818] Updated weights for policy 0, policy_version 523748 (0.0026) [2024-04-28 00:33:34,253][54587] Fps is (10 sec: 62259.4, 60 sec: 58436.4, 300 sec: 59093.5). Total num frames: 8581234688. Throughput: 0: 59083.6. Samples: 1486420840. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:33:34,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-28 00:33:34,508][54818] Updated weights for policy 0, policy_version 523758 (0.0026) [2024-04-28 00:33:37,067][54818] Updated weights for policy 0, policy_version 523768 (0.0026) [2024-04-28 00:33:39,253][54587] Fps is (10 sec: 63897.7, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8581562368. Throughput: 0: 58986.5. Samples: 1486781000. Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:33:39,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:33:40,263][54818] Updated weights for policy 0, policy_version 523778 (0.0029) [2024-04-28 00:33:42,513][54818] Updated weights for policy 0, policy_version 523788 (0.0026) [2024-04-28 00:33:44,253][54587] Fps is (10 sec: 62259.2, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8581857280. Throughput: 0: 59678.3. Samples: 1486974420. 
Policy #0 lag: (min: 0.0, avg: 13.8, max: 21.0) [2024-04-28 00:33:44,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 00:33:45,762][54818] Updated weights for policy 0, policy_version 523798 (0.0027) [2024-04-28 00:33:47,878][54818] Updated weights for policy 0, policy_version 523808 (0.0026) [2024-04-28 00:33:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59801.5, 300 sec: 59149.0). Total num frames: 8582152192. Throughput: 0: 59464.0. Samples: 1487324540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:33:49,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 00:33:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000523813_8582152192.pth... [2024-04-28 00:33:49,329][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000522945_8567930880.pth [2024-04-28 00:33:51,078][54818] Updated weights for policy 0, policy_version 523818 (0.0025) [2024-04-28 00:33:53,398][54818] Updated weights for policy 0, policy_version 523828 (0.0027) [2024-04-28 00:33:54,253][54587] Fps is (10 sec: 62258.3, 60 sec: 60620.7, 300 sec: 59260.1). Total num frames: 8582479872. Throughput: 0: 59285.1. Samples: 1487675060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:33:54,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 00:33:56,560][54818] Updated weights for policy 0, policy_version 523838 (0.0026) [2024-04-28 00:33:58,834][54818] Updated weights for policy 0, policy_version 523848 (0.0024) [2024-04-28 00:33:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60620.7, 300 sec: 59260.1). Total num frames: 8582758400. Throughput: 0: 60011.8. Samples: 1487869120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:33:59,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 00:34:02,122][54818] Updated weights for policy 0, policy_version 523858 (0.0023) [2024-04-28 00:34:03,638][54798] Signal inference workers to stop experience collection... 
(23600 times) [2024-04-28 00:34:03,639][54798] Signal inference workers to resume experience collection... (23600 times) [2024-04-28 00:34:03,655][54818] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-04-28 00:34:03,656][54818] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-04-28 00:34:04,253][54587] Fps is (10 sec: 55705.8, 60 sec: 60074.7, 300 sec: 59149.0). Total num frames: 8583036928. Throughput: 0: 59694.5. Samples: 1488220620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:04,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 00:34:04,326][54818] Updated weights for policy 0, policy_version 523868 (0.0026) [2024-04-28 00:34:07,581][54818] Updated weights for policy 0, policy_version 523878 (0.0026) [2024-04-28 00:34:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59801.5, 300 sec: 59149.0). Total num frames: 8583331840. Throughput: 0: 59451.0. Samples: 1488578400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:09,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 00:34:09,638][54818] Updated weights for policy 0, policy_version 523888 (0.0026) [2024-04-28 00:34:13,111][54818] Updated weights for policy 0, policy_version 523898 (0.0026) [2024-04-28 00:34:14,253][54587] Fps is (10 sec: 57344.4, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8583610368. Throughput: 0: 59660.1. Samples: 1488758360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:14,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:34:15,059][54818] Updated weights for policy 0, policy_version 523908 (0.0027) [2024-04-28 00:34:18,635][54818] Updated weights for policy 0, policy_version 523918 (0.0025) [2024-04-28 00:34:19,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8583905280. Throughput: 0: 60094.4. Samples: 1489125100. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:19,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 00:34:20,459][54818] Updated weights for policy 0, policy_version 523928 (0.0025) [2024-04-28 00:34:24,149][54818] Updated weights for policy 0, policy_version 523938 (0.0026) [2024-04-28 00:34:24,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59801.6, 300 sec: 59149.0). Total num frames: 8584200192. Throughput: 0: 59882.3. Samples: 1489475700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:24,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 00:34:25,907][54818] Updated weights for policy 0, policy_version 523948 (0.0026) [2024-04-28 00:34:29,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8584495104. Throughput: 0: 59192.3. Samples: 1489638080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:29,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 00:34:29,707][54818] Updated weights for policy 0, policy_version 523958 (0.0024) [2024-04-28 00:34:31,280][54818] Updated weights for policy 0, policy_version 523968 (0.0028) [2024-04-28 00:34:34,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8584790016. Throughput: 0: 59579.1. Samples: 1490005600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:34,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 00:34:35,185][54818] Updated weights for policy 0, policy_version 523978 (0.0026) [2024-04-28 00:34:35,856][54798] Signal inference workers to stop experience collection... (23650 times) [2024-04-28 00:34:35,861][54798] Signal inference workers to resume experience collection... 
(23650 times) [2024-04-28 00:34:35,882][54818] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-04-28 00:34:35,882][54818] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-04-28 00:34:37,026][54818] Updated weights for policy 0, policy_version 523988 (0.0026) [2024-04-28 00:34:39,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58436.3, 300 sec: 59093.5). Total num frames: 8585068544. Throughput: 0: 59791.2. Samples: 1490365660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 00:34:40,536][54818] Updated weights for policy 0, policy_version 523998 (0.0025) [2024-04-28 00:34:42,933][54818] Updated weights for policy 0, policy_version 524008 (0.0022) [2024-04-28 00:34:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8585379840. Throughput: 0: 59110.8. Samples: 1490529100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:44,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 00:34:46,102][54818] Updated weights for policy 0, policy_version 524018 (0.0025) [2024-04-28 00:34:48,778][54818] Updated weights for policy 0, policy_version 524028 (0.0026) [2024-04-28 00:34:49,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58709.5, 300 sec: 59149.0). Total num frames: 8585674752. Throughput: 0: 58989.1. Samples: 1490875120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:49,253][54587] Avg episode reward: [(0, '0.514')] [2024-04-28 00:34:51,607][54818] Updated weights for policy 0, policy_version 524038 (0.0026) [2024-04-28 00:34:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58436.3, 300 sec: 59260.1). Total num frames: 8585986048. Throughput: 0: 59012.1. Samples: 1491233940. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:54,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 00:34:54,484][54818] Updated weights for policy 0, policy_version 524048 (0.0024) [2024-04-28 00:34:57,133][54818] Updated weights for policy 0, policy_version 524058 (0.0023) [2024-04-28 00:34:59,253][54587] Fps is (10 sec: 62258.9, 60 sec: 58982.6, 300 sec: 59260.1). Total num frames: 8586297344. Throughput: 0: 59291.1. Samples: 1491426460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:34:59,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 00:35:00,044][54818] Updated weights for policy 0, policy_version 524068 (0.0025) [2024-04-28 00:35:02,717][54818] Updated weights for policy 0, policy_version 524078 (0.0027) [2024-04-28 00:35:04,253][54587] Fps is (10 sec: 62259.1, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8586608640. Throughput: 0: 58909.4. Samples: 1491776020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:35:04,254][54587] Avg episode reward: [(0, '0.682')] [2024-04-28 00:35:06,181][54818] Updated weights for policy 0, policy_version 524088 (0.0024) [2024-04-28 00:35:08,077][54818] Updated weights for policy 0, policy_version 524098 (0.0025) [2024-04-28 00:35:08,106][54798] Signal inference workers to stop experience collection... (23700 times) [2024-04-28 00:35:08,144][54818] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-04-28 00:35:08,201][54798] Signal inference workers to resume experience collection... (23700 times) [2024-04-28 00:35:08,201][54818] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-04-28 00:35:09,253][54587] Fps is (10 sec: 62259.0, 60 sec: 59801.7, 300 sec: 59371.2). Total num frames: 8586919936. Throughput: 0: 58576.4. Samples: 1492111640. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:35:09,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 00:35:11,817][54818] Updated weights for policy 0, policy_version 524108 (0.0025) [2024-04-28 00:35:13,545][54818] Updated weights for policy 0, policy_version 524118 (0.0026) [2024-04-28 00:35:14,253][54587] Fps is (10 sec: 62260.0, 60 sec: 60347.8, 300 sec: 59426.7). Total num frames: 8587231232. Throughput: 0: 59562.8. Samples: 1492318400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:35:14,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 00:35:17,173][54818] Updated weights for policy 0, policy_version 524128 (0.0026) [2024-04-28 00:35:19,141][54818] Updated weights for policy 0, policy_version 524138 (0.0025) [2024-04-28 00:35:19,253][54587] Fps is (10 sec: 55705.8, 60 sec: 59528.7, 300 sec: 59204.5). Total num frames: 8587476992. Throughput: 0: 59027.6. Samples: 1492661840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:35:19,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 00:35:22,781][54818] Updated weights for policy 0, policy_version 524148 (0.0025) [2024-04-28 00:35:24,253][54587] Fps is (10 sec: 50790.2, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8587739136. Throughput: 0: 58695.2. Samples: 1493006940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:35:24,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 00:35:24,744][54818] Updated weights for policy 0, policy_version 524158 (0.0026) [2024-04-28 00:35:28,394][54818] Updated weights for policy 0, policy_version 524168 (0.0026) [2024-04-28 00:35:29,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59255.5, 300 sec: 59204.5). Total num frames: 8588050432. Throughput: 0: 58877.3. Samples: 1493178580. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:35:29,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:35:30,184][54818] Updated weights for policy 0, policy_version 524178 (0.0025) [2024-04-28 00:35:33,734][54818] Updated weights for policy 0, policy_version 524188 (0.0025) [2024-04-28 00:35:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8588328960. Throughput: 0: 59263.5. Samples: 1493541980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:35:34,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 00:35:35,627][54818] Updated weights for policy 0, policy_version 524198 (0.0026) [2024-04-28 00:35:39,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58982.4, 300 sec: 59204.5). Total num frames: 8588607488. Throughput: 0: 59247.6. Samples: 1493900080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:35:39,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 00:35:39,288][54818] Updated weights for policy 0, policy_version 524208 (0.0026) [2024-04-28 00:35:40,531][54798] Signal inference workers to stop experience collection... (23750 times) [2024-04-28 00:35:40,566][54818] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-04-28 00:35:40,626][54798] Signal inference workers to resume experience collection... (23750 times) [2024-04-28 00:35:40,626][54818] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-04-28 00:35:41,134][54818] Updated weights for policy 0, policy_version 524218 (0.0024) [2024-04-28 00:35:44,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8588902400. Throughput: 0: 58352.9. Samples: 1494052340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:35:44,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 00:35:44,869][54818] Updated weights for policy 0, policy_version 524228 (0.0026) [2024-04-28 00:35:46,732][54818] Updated weights for policy 0, policy_version 524238 (0.0023) [2024-04-28 00:35:49,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58709.1, 300 sec: 59149.0). Total num frames: 8589197312. Throughput: 0: 58409.2. Samples: 1494404440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:35:49,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 00:35:49,266][54587] No heartbeat for components: RolloutWorker_w4 (11497 seconds) [2024-04-28 00:35:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000524243_8589197312.pth... [2024-04-28 00:35:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000523377_8575008768.pth [2024-04-28 00:35:50,423][54818] Updated weights for policy 0, policy_version 524248 (0.0026) [2024-04-28 00:35:52,176][54818] Updated weights for policy 0, policy_version 524258 (0.0026) [2024-04-28 00:35:54,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58709.3, 300 sec: 59260.1). Total num frames: 8589508608. Throughput: 0: 58996.0. Samples: 1494766460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:35:54,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 00:35:56,020][54818] Updated weights for policy 0, policy_version 524268 (0.0027) [2024-04-28 00:35:57,691][54818] Updated weights for policy 0, policy_version 524278 (0.0025) [2024-04-28 00:35:59,253][54587] Fps is (10 sec: 62259.7, 60 sec: 58709.2, 300 sec: 59204.5). Total num frames: 8589819904. Throughput: 0: 58382.5. Samples: 1494945620. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:35:59,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 00:36:01,342][54818] Updated weights for policy 0, policy_version 524288 (0.0026) [2024-04-28 00:36:03,364][54818] Updated weights for policy 0, policy_version 524298 (0.0024) [2024-04-28 00:36:04,253][54587] Fps is (10 sec: 62259.3, 60 sec: 58709.4, 300 sec: 59260.1). Total num frames: 8590131200. Throughput: 0: 58593.7. Samples: 1495298560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:36:04,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 00:36:06,968][54818] Updated weights for policy 0, policy_version 524308 (0.0026) [2024-04-28 00:36:08,866][54818] Updated weights for policy 0, policy_version 524318 (0.0025) [2024-04-28 00:36:09,253][54587] Fps is (10 sec: 62260.1, 60 sec: 58709.4, 300 sec: 59204.6). Total num frames: 8590442496. Throughput: 0: 58551.6. Samples: 1495641760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:36:09,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 00:36:12,361][54818] Updated weights for policy 0, policy_version 524328 (0.0026) [2024-04-28 00:36:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58436.1, 300 sec: 59149.0). Total num frames: 8590737408. Throughput: 0: 59147.5. Samples: 1495840220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:36:14,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 00:36:14,540][54818] Updated weights for policy 0, policy_version 524338 (0.0027) [2024-04-28 00:36:17,879][54818] Updated weights for policy 0, policy_version 524348 (0.0026) [2024-04-28 00:36:19,000][54798] Signal inference workers to stop experience collection... (23800 times) [2024-04-28 00:36:19,000][54798] Signal inference workers to resume experience collection... 
(23800 times) [2024-04-28 00:36:19,030][54818] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-04-28 00:36:19,031][54818] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-04-28 00:36:19,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59528.6, 300 sec: 59204.5). Total num frames: 8591048704. Throughput: 0: 58940.1. Samples: 1496194280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:36:19,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:36:19,961][54818] Updated weights for policy 0, policy_version 524358 (0.0025) [2024-04-28 00:36:23,571][54818] Updated weights for policy 0, policy_version 524368 (0.0026) [2024-04-28 00:36:24,253][54587] Fps is (10 sec: 55706.0, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8591294464. Throughput: 0: 58846.3. Samples: 1496548160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:36:24,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 00:36:25,561][54818] Updated weights for policy 0, policy_version 524378 (0.0028) [2024-04-28 00:36:29,179][54818] Updated weights for policy 0, policy_version 524388 (0.0026) [2024-04-28 00:36:29,253][54587] Fps is (10 sec: 52428.9, 60 sec: 58709.5, 300 sec: 59038.0). Total num frames: 8591572992. Throughput: 0: 59441.0. Samples: 1496727180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:36:29,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 00:36:32,211][54818] Updated weights for policy 0, policy_version 524398 (0.0024) [2024-04-28 00:36:34,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8591867904. Throughput: 0: 59484.7. Samples: 1497081240. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:36:34,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 00:36:34,585][54818] Updated weights for policy 0, policy_version 524408 (0.0025) [2024-04-28 00:36:37,585][54818] Updated weights for policy 0, policy_version 524418 (0.0024) [2024-04-28 00:36:39,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8592179200. Throughput: 0: 59309.8. Samples: 1497435400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 00:36:39,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 00:36:40,122][54818] Updated weights for policy 0, policy_version 524428 (0.0025) [2024-04-28 00:36:43,049][54818] Updated weights for policy 0, policy_version 524438 (0.0025) [2024-04-28 00:36:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8592474112. Throughput: 0: 58951.8. Samples: 1497598440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:36:44,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 00:36:45,663][54818] Updated weights for policy 0, policy_version 524448 (0.0024) [2024-04-28 00:36:48,629][54818] Updated weights for policy 0, policy_version 524458 (0.0025) [2024-04-28 00:36:49,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59528.8, 300 sec: 59204.6). Total num frames: 8592769024. Throughput: 0: 59145.4. Samples: 1497960100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:36:49,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 00:36:51,039][54818] Updated weights for policy 0, policy_version 524468 (0.0025) [2024-04-28 00:36:51,761][54798] Signal inference workers to stop experience collection... (23850 times) [2024-04-28 00:36:51,807][54818] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-04-28 00:36:51,818][54798] Signal inference workers to resume experience collection... 
(23850 times) [2024-04-28 00:36:51,823][54818] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-04-28 00:36:53,975][54818] Updated weights for policy 0, policy_version 524478 (0.0024) [2024-04-28 00:36:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.5, 300 sec: 59204.5). Total num frames: 8593047552. Throughput: 0: 59517.7. Samples: 1498320060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:36:54,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 00:36:56,491][54818] Updated weights for policy 0, policy_version 524488 (0.0027) [2024-04-28 00:36:59,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58982.4, 300 sec: 59260.1). Total num frames: 8593358848. Throughput: 0: 58905.7. Samples: 1498490980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:36:59,254][54587] Avg episode reward: [(0, '0.694')] [2024-04-28 00:36:59,462][54818] Updated weights for policy 0, policy_version 524498 (0.0022) [2024-04-28 00:37:02,055][54818] Updated weights for policy 0, policy_version 524508 (0.0024) [2024-04-28 00:37:04,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58163.2, 300 sec: 59149.0). Total num frames: 8593620992. Throughput: 0: 58770.6. Samples: 1498838960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:04,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 00:37:04,906][54818] Updated weights for policy 0, policy_version 524518 (0.0028) [2024-04-28 00:37:07,607][54818] Updated weights for policy 0, policy_version 524528 (0.0026) [2024-04-28 00:37:09,253][54587] Fps is (10 sec: 58983.5, 60 sec: 58436.3, 300 sec: 59149.1). Total num frames: 8593948672. Throughput: 0: 58637.0. Samples: 1499186820. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:09,253][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 00:37:10,726][54818] Updated weights for policy 0, policy_version 524538 (0.0026) [2024-04-28 00:37:13,178][54818] Updated weights for policy 0, policy_version 524548 (0.0026) [2024-04-28 00:37:14,253][54587] Fps is (10 sec: 63897.7, 60 sec: 58709.4, 300 sec: 59204.6). Total num frames: 8594259968. Throughput: 0: 58863.9. Samples: 1499376060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:14,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 00:37:16,105][54818] Updated weights for policy 0, policy_version 524558 (0.0027) [2024-04-28 00:37:18,770][54818] Updated weights for policy 0, policy_version 524568 (0.0026) [2024-04-28 00:37:19,253][54587] Fps is (10 sec: 60619.8, 60 sec: 58436.1, 300 sec: 59149.0). Total num frames: 8594554880. Throughput: 0: 58777.2. Samples: 1499726220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:19,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 00:37:21,615][54818] Updated weights for policy 0, policy_version 524578 (0.0025) [2024-04-28 00:37:24,095][54818] Updated weights for policy 0, policy_version 524588 (0.0026) [2024-04-28 00:37:24,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8594849792. Throughput: 0: 58793.9. Samples: 1500081120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 00:37:27,255][54818] Updated weights for policy 0, policy_version 524598 (0.0026) [2024-04-28 00:37:29,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59528.4, 300 sec: 59037.9). Total num frames: 8595144704. Throughput: 0: 59172.3. Samples: 1500261200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:29,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 00:37:29,608][54818] Updated weights for policy 0, policy_version 524608 (0.0026) [2024-04-28 00:37:32,772][54818] Updated weights for policy 0, policy_version 524618 (0.0024) [2024-04-28 00:37:33,596][54798] Signal inference workers to stop experience collection... (23900 times) [2024-04-28 00:37:33,632][54818] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-04-28 00:37:33,645][54798] Signal inference workers to resume experience collection... (23900 times) [2024-04-28 00:37:33,649][54818] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-04-28 00:37:34,253][54587] Fps is (10 sec: 57343.2, 60 sec: 59255.3, 300 sec: 59037.9). Total num frames: 8595423232. Throughput: 0: 58893.6. Samples: 1500610320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:34,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 00:37:35,204][54818] Updated weights for policy 0, policy_version 524628 (0.0027) [2024-04-28 00:37:38,442][54818] Updated weights for policy 0, policy_version 524638 (0.0026) [2024-04-28 00:37:39,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8595718144. Throughput: 0: 58870.7. Samples: 1500969240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:39,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 00:37:40,874][54818] Updated weights for policy 0, policy_version 524648 (0.0025) [2024-04-28 00:37:44,220][54818] Updated weights for policy 0, policy_version 524658 (0.0026) [2024-04-28 00:37:44,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8595996672. Throughput: 0: 58956.2. Samples: 1501144000. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:44,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 00:37:46,499][54818] Updated weights for policy 0, policy_version 524668 (0.0028) [2024-04-28 00:37:49,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8596291584. Throughput: 0: 59150.7. Samples: 1501500740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:49,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 00:37:49,305][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000524677_8596307968.pth... [2024-04-28 00:37:49,352][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000523813_8582152192.pth [2024-04-28 00:37:49,630][54818] Updated weights for policy 0, policy_version 524678 (0.0026) [2024-04-28 00:37:52,354][54818] Updated weights for policy 0, policy_version 524688 (0.0027) [2024-04-28 00:37:54,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8596586496. Throughput: 0: 59117.7. Samples: 1501847120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:54,254][54587] Avg episode reward: [(0, '0.451')] [2024-04-28 00:37:55,106][54818] Updated weights for policy 0, policy_version 524698 (0.0025) [2024-04-28 00:37:57,761][54818] Updated weights for policy 0, policy_version 524708 (0.0025) [2024-04-28 00:37:59,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8596881408. Throughput: 0: 58793.3. Samples: 1502021760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:37:59,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 00:38:00,599][54818] Updated weights for policy 0, policy_version 524718 (0.0026) [2024-04-28 00:38:03,376][54818] Updated weights for policy 0, policy_version 524728 (0.0024) [2024-04-28 00:38:04,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.4, 300 sec: 59093.5). 
Total num frames: 8597176320. Throughput: 0: 58802.7. Samples: 1502372340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:38:04,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 00:38:06,089][54818] Updated weights for policy 0, policy_version 524738 (0.0025) [2024-04-28 00:38:09,062][54818] Updated weights for policy 0, policy_version 524748 (0.0024) [2024-04-28 00:38:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.2, 300 sec: 59093.5). Total num frames: 8597471232. Throughput: 0: 58816.7. Samples: 1502727880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 00:38:09,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 00:38:11,618][54818] Updated weights for policy 0, policy_version 524758 (0.0026) [2024-04-28 00:38:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8597782528. Throughput: 0: 58669.8. Samples: 1502901340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:14,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 00:38:14,801][54818] Updated weights for policy 0, policy_version 524768 (0.0025) [2024-04-28 00:38:17,190][54818] Updated weights for policy 0, policy_version 524778 (0.0027) [2024-04-28 00:38:19,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58436.3, 300 sec: 59149.0). Total num frames: 8598061056. Throughput: 0: 58724.5. Samples: 1503252920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:19,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 00:38:20,229][54818] Updated weights for policy 0, policy_version 524788 (0.0025) [2024-04-28 00:38:21,922][54798] Signal inference workers to stop experience collection... (23950 times) [2024-04-28 00:38:21,922][54798] Signal inference workers to resume experience collection... 
(23950 times) [2024-04-28 00:38:21,936][54818] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-04-28 00:38:21,936][54818] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-04-28 00:38:22,806][54818] Updated weights for policy 0, policy_version 524798 (0.0026) [2024-04-28 00:38:24,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58436.2, 300 sec: 59093.5). Total num frames: 8598355968. Throughput: 0: 58802.1. Samples: 1503615340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:24,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 00:38:25,650][54818] Updated weights for policy 0, policy_version 524808 (0.0024) [2024-04-28 00:38:28,156][54818] Updated weights for policy 0, policy_version 524818 (0.0024) [2024-04-28 00:38:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58436.3, 300 sec: 59037.9). Total num frames: 8598650880. Throughput: 0: 58841.7. Samples: 1503791880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:29,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 00:38:31,205][54818] Updated weights for policy 0, policy_version 524828 (0.0026) [2024-04-28 00:38:33,702][54818] Updated weights for policy 0, policy_version 524838 (0.0026) [2024-04-28 00:38:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8598945792. Throughput: 0: 58744.7. Samples: 1504144260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:34,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 00:38:36,734][54818] Updated weights for policy 0, policy_version 524848 (0.0025) [2024-04-28 00:38:39,253][54587] Fps is (10 sec: 60620.2, 60 sec: 58982.3, 300 sec: 58982.4). Total num frames: 8599257088. Throughput: 0: 58867.4. Samples: 1504496160. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:39,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 00:38:39,735][54818] Updated weights for policy 0, policy_version 524858 (0.0027) [2024-04-28 00:38:42,314][54818] Updated weights for policy 0, policy_version 524868 (0.0027) [2024-04-28 00:38:44,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8599552000. Throughput: 0: 59165.0. Samples: 1504684180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:44,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 00:38:45,098][54818] Updated weights for policy 0, policy_version 524878 (0.0027) [2024-04-28 00:38:47,884][54818] Updated weights for policy 0, policy_version 524888 (0.0026) [2024-04-28 00:38:49,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59528.5, 300 sec: 58926.9). Total num frames: 8599863296. Throughput: 0: 59252.9. Samples: 1505038720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:49,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 00:38:49,264][54587] No heartbeat for components: RolloutWorker_w4 (11677 seconds) [2024-04-28 00:38:50,691][54818] Updated weights for policy 0, policy_version 524898 (0.0025) [2024-04-28 00:38:53,492][54818] Updated weights for policy 0, policy_version 524908 (0.0025) [2024-04-28 00:38:54,253][54587] Fps is (10 sec: 58981.1, 60 sec: 59255.3, 300 sec: 58926.9). Total num frames: 8600141824. Throughput: 0: 59134.1. Samples: 1505388920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:54,254][54587] Avg episode reward: [(0, '0.689')] [2024-04-28 00:38:56,271][54818] Updated weights for policy 0, policy_version 524918 (0.0025) [2024-04-28 00:38:58,948][54818] Updated weights for policy 0, policy_version 524928 (0.0025) [2024-04-28 00:38:59,253][54587] Fps is (10 sec: 55706.2, 60 sec: 58982.5, 300 sec: 58926.9). Total num frames: 8600420352. Throughput: 0: 59142.3. Samples: 1505562740. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:38:59,253][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 00:39:01,798][54818] Updated weights for policy 0, policy_version 524938 (0.0027) [2024-04-28 00:39:04,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.3, 300 sec: 58926.9). Total num frames: 8600715264. Throughput: 0: 59145.3. Samples: 1505914460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:39:04,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 00:39:04,519][54818] Updated weights for policy 0, policy_version 524948 (0.0026) [2024-04-28 00:39:07,256][54818] Updated weights for policy 0, policy_version 524958 (0.0027) [2024-04-28 00:39:09,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.5, 300 sec: 58982.4). Total num frames: 8601010176. Throughput: 0: 59017.9. Samples: 1506271140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:39:09,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 00:39:10,245][54818] Updated weights for policy 0, policy_version 524968 (0.0025) [2024-04-28 00:39:10,248][54798] Signal inference workers to stop experience collection... (24000 times) [2024-04-28 00:39:10,248][54798] Signal inference workers to resume experience collection... (24000 times) [2024-04-28 00:39:10,270][54818] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-04-28 00:39:10,270][54818] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-04-28 00:39:12,758][54818] Updated weights for policy 0, policy_version 524978 (0.0025) [2024-04-28 00:39:14,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8601305088. Throughput: 0: 59119.6. Samples: 1506452260. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:39:14,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 00:39:15,641][54818] Updated weights for policy 0, policy_version 524988 (0.0026) [2024-04-28 00:39:18,281][54818] Updated weights for policy 0, policy_version 524998 (0.0026) [2024-04-28 00:39:19,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8601600000. Throughput: 0: 59146.7. Samples: 1506805860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:39:19,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 00:39:21,055][54818] Updated weights for policy 0, policy_version 525008 (0.0026) [2024-04-28 00:39:23,783][54818] Updated weights for policy 0, policy_version 525018 (0.0026) [2024-04-28 00:39:24,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59255.5, 300 sec: 59037.9). Total num frames: 8601911296. Throughput: 0: 59095.2. Samples: 1507155440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:39:24,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 00:39:26,530][54818] Updated weights for policy 0, policy_version 525028 (0.0025) [2024-04-28 00:39:29,217][54818] Updated weights for policy 0, policy_version 525038 (0.0024) [2024-04-28 00:39:29,253][54587] Fps is (10 sec: 62259.7, 60 sec: 59528.6, 300 sec: 59093.5). Total num frames: 8602222592. Throughput: 0: 58858.6. Samples: 1507332820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:39:29,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 00:39:32,098][54818] Updated weights for policy 0, policy_version 525048 (0.0026) [2024-04-28 00:39:34,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8602501120. Throughput: 0: 58677.0. Samples: 1507679180. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:39:34,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 00:39:34,749][54818] Updated weights for policy 0, policy_version 525058 (0.0026) [2024-04-28 00:39:37,581][54818] Updated weights for policy 0, policy_version 525068 (0.0024) [2024-04-28 00:39:39,253][54587] Fps is (10 sec: 57343.7, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8602796032. Throughput: 0: 59076.6. Samples: 1508047360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 00:39:39,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 00:39:40,792][54818] Updated weights for policy 0, policy_version 525078 (0.0026) [2024-04-28 00:39:43,038][54818] Updated weights for policy 0, policy_version 525088 (0.0024) [2024-04-28 00:39:44,253][54587] Fps is (10 sec: 57343.1, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 8603074560. Throughput: 0: 59061.6. Samples: 1508220520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:39:44,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:39:46,132][54818] Updated weights for policy 0, policy_version 525098 (0.0026) [2024-04-28 00:39:48,574][54818] Updated weights for policy 0, policy_version 525108 (0.0025) [2024-04-28 00:39:49,253][54587] Fps is (10 sec: 58981.2, 60 sec: 58709.2, 300 sec: 58982.4). Total num frames: 8603385856. Throughput: 0: 59294.5. Samples: 1508582720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:39:49,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 00:39:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000525109_8603385856.pth... 
[2024-04-28 00:39:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000524243_8589197312.pth [2024-04-28 00:39:51,832][54818] Updated weights for policy 0, policy_version 525118 (0.0026) [2024-04-28 00:39:54,113][54818] Updated weights for policy 0, policy_version 525128 (0.0025) [2024-04-28 00:39:54,253][54587] Fps is (10 sec: 62259.4, 60 sec: 59255.6, 300 sec: 58982.4). Total num frames: 8603697152. Throughput: 0: 58959.1. Samples: 1508924300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:39:54,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 00:39:57,551][54818] Updated weights for policy 0, policy_version 525138 (0.0026) [2024-04-28 00:39:59,253][54587] Fps is (10 sec: 58984.0, 60 sec: 59255.5, 300 sec: 58871.3). Total num frames: 8603975680. Throughput: 0: 58994.7. Samples: 1509107020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:39:59,253][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 00:39:59,607][54818] Updated weights for policy 0, policy_version 525148 (0.0025) [2024-04-28 00:40:00,632][54798] Signal inference workers to stop experience collection... (24050 times) [2024-04-28 00:40:00,633][54798] Signal inference workers to resume experience collection... (24050 times) [2024-04-28 00:40:00,645][54818] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-04-28 00:40:00,645][54818] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-04-28 00:40:03,050][54818] Updated weights for policy 0, policy_version 525158 (0.0025) [2024-04-28 00:40:04,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.5, 300 sec: 58815.8). Total num frames: 8604270592. Throughput: 0: 58998.2. Samples: 1509460780. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:04,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 00:40:05,665][54818] Updated weights for policy 0, policy_version 525168 (0.0026) [2024-04-28 00:40:08,587][54818] Updated weights for policy 0, policy_version 525178 (0.0026) [2024-04-28 00:40:09,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 58760.2). Total num frames: 8604565504. Throughput: 0: 59299.3. Samples: 1509823900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:09,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 00:40:11,008][54818] Updated weights for policy 0, policy_version 525188 (0.0026) [2024-04-28 00:40:14,124][54818] Updated weights for policy 0, policy_version 525198 (0.0025) [2024-04-28 00:40:14,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58982.5, 300 sec: 58871.3). Total num frames: 8604844032. Throughput: 0: 59036.1. Samples: 1509989440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:14,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 00:40:16,599][54818] Updated weights for policy 0, policy_version 525208 (0.0026) [2024-04-28 00:40:19,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8605138944. Throughput: 0: 59234.5. Samples: 1510344740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:19,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 00:40:19,514][54818] Updated weights for policy 0, policy_version 525218 (0.0028) [2024-04-28 00:40:22,197][54818] Updated weights for policy 0, policy_version 525228 (0.0026) [2024-04-28 00:40:24,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58709.4, 300 sec: 58926.9). Total num frames: 8605433856. Throughput: 0: 58939.1. Samples: 1510699620. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:24,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 00:40:24,992][54818] Updated weights for policy 0, policy_version 525238 (0.0026) [2024-04-28 00:40:27,519][54818] Updated weights for policy 0, policy_version 525248 (0.0026) [2024-04-28 00:40:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8605745152. Throughput: 0: 59070.2. Samples: 1510878680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:29,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 00:40:30,499][54818] Updated weights for policy 0, policy_version 525258 (0.0025) [2024-04-28 00:40:32,954][54818] Updated weights for policy 0, policy_version 525268 (0.0028) [2024-04-28 00:40:34,253][54587] Fps is (10 sec: 62258.9, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8606056448. Throughput: 0: 58951.7. Samples: 1511235540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:34,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 00:40:36,018][54818] Updated weights for policy 0, policy_version 525278 (0.0026) [2024-04-28 00:40:38,608][54818] Updated weights for policy 0, policy_version 525288 (0.0025) [2024-04-28 00:40:39,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8606351360. Throughput: 0: 59095.1. Samples: 1511583580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:39,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 00:40:41,389][54818] Updated weights for policy 0, policy_version 525298 (0.0025) [2024-04-28 00:40:44,148][54818] Updated weights for policy 0, policy_version 525308 (0.0026) [2024-04-28 00:40:44,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59528.7, 300 sec: 59149.1). Total num frames: 8606646272. Throughput: 0: 59110.2. Samples: 1511766980. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:44,253][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 00:40:46,906][54818] Updated weights for policy 0, policy_version 525318 (0.0024) [2024-04-28 00:40:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.7, 300 sec: 59093.5). Total num frames: 8606941184. Throughput: 0: 59189.4. Samples: 1512124300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:49,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 00:40:49,525][54818] Updated weights for policy 0, policy_version 525328 (0.0026) [2024-04-28 00:40:52,447][54818] Updated weights for policy 0, policy_version 525338 (0.0026) [2024-04-28 00:40:54,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.5, 300 sec: 59038.0). Total num frames: 8607236096. Throughput: 0: 59035.1. Samples: 1512480480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:54,253][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 00:40:55,162][54818] Updated weights for policy 0, policy_version 525348 (0.0027) [2024-04-28 00:40:57,994][54818] Updated weights for policy 0, policy_version 525358 (0.0026) [2024-04-28 00:40:59,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8607531008. Throughput: 0: 59477.7. Samples: 1512665940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:40:59,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 00:41:00,622][54818] Updated weights for policy 0, policy_version 525368 (0.0024) [2024-04-28 00:41:02,626][54798] Signal inference workers to stop experience collection... (24100 times) [2024-04-28 00:41:02,626][54798] Signal inference workers to resume experience collection... 
(24100 times) [2024-04-28 00:41:02,650][54818] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-04-28 00:41:02,650][54818] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-04-28 00:41:03,472][54818] Updated weights for policy 0, policy_version 525378 (0.0027) [2024-04-28 00:41:04,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.5, 300 sec: 58926.8). Total num frames: 8607825920. Throughput: 0: 59310.3. Samples: 1513013700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:41:04,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 00:41:06,208][54818] Updated weights for policy 0, policy_version 525388 (0.0025) [2024-04-28 00:41:08,864][54818] Updated weights for policy 0, policy_version 525398 (0.0026) [2024-04-28 00:41:09,254][54587] Fps is (10 sec: 58980.9, 60 sec: 59255.2, 300 sec: 58926.8). Total num frames: 8608120832. Throughput: 0: 59287.8. Samples: 1513367580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 00:41:09,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 00:41:11,618][54818] Updated weights for policy 0, policy_version 525408 (0.0025) [2024-04-28 00:41:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59801.5, 300 sec: 58926.8). Total num frames: 8608432128. Throughput: 0: 59261.4. Samples: 1513545440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:14,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 00:41:14,436][54818] Updated weights for policy 0, policy_version 525418 (0.0027) [2024-04-28 00:41:17,297][54818] Updated weights for policy 0, policy_version 525428 (0.0026) [2024-04-28 00:41:19,253][54587] Fps is (10 sec: 60621.9, 60 sec: 59801.6, 300 sec: 59093.5). Total num frames: 8608727040. Throughput: 0: 59419.6. Samples: 1513909420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:19,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 00:41:19,948][54818] Updated weights for policy 0, policy_version 525438 (0.0027) [2024-04-28 00:41:23,156][54818] Updated weights for policy 0, policy_version 525448 (0.0026) [2024-04-28 00:41:24,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59801.7, 300 sec: 59149.0). Total num frames: 8609021952. Throughput: 0: 59547.6. Samples: 1514263220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:24,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 00:41:25,606][54818] Updated weights for policy 0, policy_version 525458 (0.0026) [2024-04-28 00:41:28,748][54818] Updated weights for policy 0, policy_version 525468 (0.0025) [2024-04-28 00:41:29,253][54587] Fps is (10 sec: 55705.2, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8609284096. Throughput: 0: 59409.1. Samples: 1514440400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:29,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:41:31,151][54818] Updated weights for policy 0, policy_version 525478 (0.0026) [2024-04-28 00:41:34,253][54587] Fps is (10 sec: 55704.7, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8609579008. Throughput: 0: 59284.3. Samples: 1514792100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:34,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 00:41:34,313][54818] Updated weights for policy 0, policy_version 525488 (0.0027) [2024-04-28 00:41:36,565][54818] Updated weights for policy 0, policy_version 525498 (0.0023) [2024-04-28 00:41:39,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8609890304. Throughput: 0: 59312.0. Samples: 1515149520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:39,254][54587] Avg episode reward: [(0, '0.498')] [2024-04-28 00:41:39,809][54818] Updated weights for policy 0, policy_version 525508 (0.0026) [2024-04-28 00:41:42,537][54818] Updated weights for policy 0, policy_version 525518 (0.0026) [2024-04-28 00:41:44,253][54587] Fps is (10 sec: 60622.0, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8610185216. Throughput: 0: 58956.5. Samples: 1515318980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:44,253][54587] Avg episode reward: [(0, '0.676')] [2024-04-28 00:41:45,353][54818] Updated weights for policy 0, policy_version 525528 (0.0025) [2024-04-28 00:41:48,008][54818] Updated weights for policy 0, policy_version 525538 (0.0023) [2024-04-28 00:41:48,191][54798] Signal inference workers to stop experience collection... (24150 times) [2024-04-28 00:41:48,191][54798] Signal inference workers to resume experience collection... (24150 times) [2024-04-28 00:41:48,219][54818] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-04-28 00:41:48,219][54818] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-04-28 00:41:49,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8610480128. Throughput: 0: 59150.7. Samples: 1515675480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:49,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 00:41:49,263][54587] No heartbeat for components: RolloutWorker_w4 (11857 seconds) [2024-04-28 00:41:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000525543_8610496512.pth... 
[2024-04-28 00:41:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000524677_8596307968.pth [2024-04-28 00:41:50,738][54818] Updated weights for policy 0, policy_version 525548 (0.0026) [2024-04-28 00:41:53,395][54818] Updated weights for policy 0, policy_version 525558 (0.0026) [2024-04-28 00:41:54,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.3, 300 sec: 59038.0). Total num frames: 8610775040. Throughput: 0: 59167.8. Samples: 1516030120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:54,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 00:41:56,264][54818] Updated weights for policy 0, policy_version 525568 (0.0026) [2024-04-28 00:41:58,999][54818] Updated weights for policy 0, policy_version 525578 (0.0025) [2024-04-28 00:41:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8611086336. Throughput: 0: 59258.2. Samples: 1516212060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:41:59,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 00:42:01,764][54818] Updated weights for policy 0, policy_version 525588 (0.0025) [2024-04-28 00:42:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8611364864. Throughput: 0: 58852.8. Samples: 1516557800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:42:04,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 00:42:04,616][54818] Updated weights for policy 0, policy_version 525598 (0.0026) [2024-04-28 00:42:07,186][54818] Updated weights for policy 0, policy_version 525608 (0.0026) [2024-04-28 00:42:09,253][54587] Fps is (10 sec: 57343.2, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8611659776. Throughput: 0: 58863.7. Samples: 1516912100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:42:09,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 00:42:10,001][54818] Updated weights for policy 0, policy_version 525618 (0.0027) [2024-04-28 00:42:12,726][54818] Updated weights for policy 0, policy_version 525628 (0.0031) [2024-04-28 00:42:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8611971072. Throughput: 0: 58996.5. Samples: 1517095240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:42:14,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 00:42:15,625][54818] Updated weights for policy 0, policy_version 525638 (0.0026) [2024-04-28 00:42:18,401][54818] Updated weights for policy 0, policy_version 525648 (0.0024) [2024-04-28 00:42:19,253][54587] Fps is (10 sec: 60622.0, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8612265984. Throughput: 0: 59041.1. Samples: 1517448940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:42:19,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:42:21,202][54818] Updated weights for policy 0, policy_version 525658 (0.0026) [2024-04-28 00:42:23,825][54818] Updated weights for policy 0, policy_version 525668 (0.0026) [2024-04-28 00:42:24,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8612560896. Throughput: 0: 58815.2. Samples: 1517796200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:42:24,253][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 00:42:26,940][54818] Updated weights for policy 0, policy_version 525678 (0.0025) [2024-04-28 00:42:29,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8612855808. Throughput: 0: 59040.7. Samples: 1517975820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:42:29,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 00:42:29,322][54818] Updated weights for policy 0, policy_version 525688 (0.0026) [2024-04-28 00:42:32,480][54818] Updated weights for policy 0, policy_version 525698 (0.0026) [2024-04-28 00:42:34,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59528.6, 300 sec: 59093.5). Total num frames: 8613150720. Throughput: 0: 58996.4. Samples: 1518330320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 00:42:34,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 00:42:34,930][54818] Updated weights for policy 0, policy_version 525708 (0.0026) [2024-04-28 00:42:38,054][54818] Updated weights for policy 0, policy_version 525718 (0.0026) [2024-04-28 00:42:39,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8613429248. Throughput: 0: 59024.4. Samples: 1518686220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:42:39,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 00:42:40,417][54818] Updated weights for policy 0, policy_version 525728 (0.0025) [2024-04-28 00:42:43,480][54818] Updated weights for policy 0, policy_version 525738 (0.0030) [2024-04-28 00:42:44,253][54587] Fps is (10 sec: 57344.6, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8613724160. Throughput: 0: 58845.5. Samples: 1518860100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:42:44,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 00:42:45,973][54818] Updated weights for policy 0, policy_version 525748 (0.0025) [2024-04-28 00:42:46,633][54798] Signal inference workers to stop experience collection... (24200 times) [2024-04-28 00:42:46,634][54798] Signal inference workers to resume experience collection... 
(24200 times) [2024-04-28 00:42:46,649][54818] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-04-28 00:42:46,649][54818] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-04-28 00:42:48,992][54818] Updated weights for policy 0, policy_version 525758 (0.0026) [2024-04-28 00:42:49,254][54587] Fps is (10 sec: 58981.4, 60 sec: 58982.2, 300 sec: 59093.4). Total num frames: 8614019072. Throughput: 0: 59178.5. Samples: 1519220840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:42:49,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 00:42:51,542][54818] Updated weights for policy 0, policy_version 525768 (0.0026) [2024-04-28 00:42:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8614330368. Throughput: 0: 59038.6. Samples: 1519568820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:42:54,253][54587] Avg episode reward: [(0, '0.514')] [2024-04-28 00:42:54,632][54818] Updated weights for policy 0, policy_version 525778 (0.0026) [2024-04-28 00:42:57,256][54818] Updated weights for policy 0, policy_version 525788 (0.0026) [2024-04-28 00:42:59,253][54587] Fps is (10 sec: 58983.4, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8614608896. Throughput: 0: 58770.7. Samples: 1519739920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:42:59,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 00:43:00,079][54818] Updated weights for policy 0, policy_version 525798 (0.0025) [2024-04-28 00:43:02,662][54818] Updated weights for policy 0, policy_version 525808 (0.0027) [2024-04-28 00:43:04,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8614903808. Throughput: 0: 58834.7. Samples: 1520096500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:04,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 00:43:05,792][54818] Updated weights for policy 0, policy_version 525818 (0.0024) [2024-04-28 00:43:08,516][54818] Updated weights for policy 0, policy_version 525828 (0.0027) [2024-04-28 00:43:09,253][54587] Fps is (10 sec: 55706.1, 60 sec: 58436.5, 300 sec: 58926.9). Total num frames: 8615165952. Throughput: 0: 59165.3. Samples: 1520458640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:09,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-28 00:43:11,415][54818] Updated weights for policy 0, policy_version 525838 (0.0024) [2024-04-28 00:43:14,172][54818] Updated weights for policy 0, policy_version 525848 (0.0024) [2024-04-28 00:43:14,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8615493632. Throughput: 0: 59008.5. Samples: 1520631200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:14,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 00:43:16,917][54818] Updated weights for policy 0, policy_version 525858 (0.0023) [2024-04-28 00:43:19,253][54587] Fps is (10 sec: 62259.5, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8615788544. Throughput: 0: 59023.7. Samples: 1520986380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:19,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 00:43:19,812][54818] Updated weights for policy 0, policy_version 525868 (0.0025) [2024-04-28 00:43:22,365][54818] Updated weights for policy 0, policy_version 525878 (0.0024) [2024-04-28 00:43:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8616099840. Throughput: 0: 59054.7. Samples: 1521343680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:24,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 00:43:25,475][54818] Updated weights for policy 0, policy_version 525888 (0.0024) [2024-04-28 00:43:27,621][54798] Signal inference workers to stop experience collection... (24250 times) [2024-04-28 00:43:27,661][54818] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-04-28 00:43:27,677][54798] Signal inference workers to resume experience collection... (24250 times) [2024-04-28 00:43:27,678][54818] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-04-28 00:43:27,790][54818] Updated weights for policy 0, policy_version 525898 (0.0026) [2024-04-28 00:43:29,253][54587] Fps is (10 sec: 62258.8, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8616411136. Throughput: 0: 59259.9. Samples: 1521526800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:29,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 00:43:30,984][54818] Updated weights for policy 0, policy_version 525908 (0.0025) [2024-04-28 00:43:33,231][54818] Updated weights for policy 0, policy_version 525918 (0.0024) [2024-04-28 00:43:34,253][54587] Fps is (10 sec: 60619.8, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8616706048. Throughput: 0: 59172.1. Samples: 1521883580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:34,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 00:43:36,331][54818] Updated weights for policy 0, policy_version 525928 (0.0026) [2024-04-28 00:43:38,867][54818] Updated weights for policy 0, policy_version 525938 (0.0022) [2024-04-28 00:43:39,254][54587] Fps is (10 sec: 58981.0, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8617000960. Throughput: 0: 59298.7. Samples: 1522237280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:39,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-28 00:43:41,707][54818] Updated weights for policy 0, policy_version 525948 (0.0024) [2024-04-28 00:43:44,253][54587] Fps is (10 sec: 57345.1, 60 sec: 59255.4, 300 sec: 59038.0). Total num frames: 8617279488. Throughput: 0: 59381.9. Samples: 1522412100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:44,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 00:43:44,263][54818] Updated weights for policy 0, policy_version 525958 (0.0024) [2024-04-28 00:43:47,179][54818] Updated weights for policy 0, policy_version 525968 (0.0026) [2024-04-28 00:43:49,253][54587] Fps is (10 sec: 55706.9, 60 sec: 58982.6, 300 sec: 59038.0). Total num frames: 8617558016. Throughput: 0: 59376.5. Samples: 1522768440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:49,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 00:43:49,340][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000525975_8617574400.pth... [2024-04-28 00:43:49,398][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000525109_8603385856.pth [2024-04-28 00:43:49,781][54818] Updated weights for policy 0, policy_version 525978 (0.0023) [2024-04-28 00:43:52,691][54818] Updated weights for policy 0, policy_version 525988 (0.0027) [2024-04-28 00:43:54,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58709.2, 300 sec: 59093.5). Total num frames: 8617852928. Throughput: 0: 59030.6. Samples: 1523115020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:54,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 00:43:55,087][54818] Updated weights for policy 0, policy_version 525998 (0.0026) [2024-04-28 00:43:58,220][54818] Updated weights for policy 0, policy_version 526008 (0.0025) [2024-04-28 00:43:58,565][54798] Signal inference workers to stop experience collection... 
(24300 times) [2024-04-28 00:43:58,565][54798] Signal inference workers to resume experience collection... (24300 times) [2024-04-28 00:43:58,575][54818] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-04-28 00:43:58,575][54818] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-04-28 00:43:59,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8618164224. Throughput: 0: 59295.0. Samples: 1523299480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:43:59,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 00:44:00,609][54818] Updated weights for policy 0, policy_version 526018 (0.0027) [2024-04-28 00:44:03,793][54818] Updated weights for policy 0, policy_version 526028 (0.0027) [2024-04-28 00:44:04,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8618459136. Throughput: 0: 59442.2. Samples: 1523661280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 00:44:04,253][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 00:44:06,272][54818] Updated weights for policy 0, policy_version 526038 (0.0025) [2024-04-28 00:44:09,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59801.4, 300 sec: 59149.0). Total num frames: 8618754048. Throughput: 0: 59230.9. Samples: 1524009080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:09,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 00:44:09,269][54818] Updated weights for policy 0, policy_version 526048 (0.0026) [2024-04-28 00:44:11,973][54818] Updated weights for policy 0, policy_version 526058 (0.0023) [2024-04-28 00:44:14,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8619048960. Throughput: 0: 58897.3. Samples: 1524177180. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:14,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 00:44:14,828][54818] Updated weights for policy 0, policy_version 526068 (0.0024) [2024-04-28 00:44:17,461][54818] Updated weights for policy 0, policy_version 526078 (0.0024) [2024-04-28 00:44:19,253][54587] Fps is (10 sec: 58983.4, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8619343872. Throughput: 0: 58975.3. Samples: 1524537460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:19,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-28 00:44:20,239][54818] Updated weights for policy 0, policy_version 526088 (0.0026) [2024-04-28 00:44:23,045][54818] Updated weights for policy 0, policy_version 526098 (0.0023) [2024-04-28 00:44:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8619655168. Throughput: 0: 59234.6. Samples: 1524902820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:24,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 00:44:25,937][54818] Updated weights for policy 0, policy_version 526108 (0.0025) [2024-04-28 00:44:28,787][54818] Updated weights for policy 0, policy_version 526118 (0.0024) [2024-04-28 00:44:29,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.2, 300 sec: 59093.4). Total num frames: 8619933696. Throughput: 0: 59287.4. Samples: 1525080040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:29,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 00:44:31,808][54818] Updated weights for policy 0, policy_version 526128 (0.0027) [2024-04-28 00:44:34,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58709.5, 300 sec: 59093.5). Total num frames: 8620228608. Throughput: 0: 58905.8. Samples: 1525419200. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:34,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 00:44:34,543][54818] Updated weights for policy 0, policy_version 526138 (0.0026) [2024-04-28 00:44:36,281][54798] Signal inference workers to stop experience collection... (24350 times) [2024-04-28 00:44:36,317][54818] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-04-28 00:44:36,371][54798] Signal inference workers to resume experience collection... (24350 times) [2024-04-28 00:44:36,372][54818] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-04-28 00:44:37,385][54818] Updated weights for policy 0, policy_version 526148 (0.0025) [2024-04-28 00:44:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8620539904. Throughput: 0: 59074.6. Samples: 1525773380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:39,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 00:44:40,075][54818] Updated weights for policy 0, policy_version 526158 (0.0025) [2024-04-28 00:44:42,880][54818] Updated weights for policy 0, policy_version 526168 (0.0026) [2024-04-28 00:44:44,253][54587] Fps is (10 sec: 62258.4, 60 sec: 59528.4, 300 sec: 59204.6). Total num frames: 8620851200. Throughput: 0: 59054.3. Samples: 1525956920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:44,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 00:44:45,562][54818] Updated weights for policy 0, policy_version 526178 (0.0028) [2024-04-28 00:44:48,356][54818] Updated weights for policy 0, policy_version 526188 (0.0026) [2024-04-28 00:44:49,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59801.5, 300 sec: 59149.0). Total num frames: 8621146112. Throughput: 0: 59063.8. Samples: 1526319160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:49,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 00:44:49,266][54587] No heartbeat for components: RolloutWorker_w4 (12037 seconds) [2024-04-28 00:44:51,131][54818] Updated weights for policy 0, policy_version 526198 (0.0023) [2024-04-28 00:44:53,781][54818] Updated weights for policy 0, policy_version 526208 (0.0024) [2024-04-28 00:44:54,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59801.6, 300 sec: 59204.5). Total num frames: 8621441024. Throughput: 0: 59101.5. Samples: 1526668640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:54,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 00:44:56,865][54818] Updated weights for policy 0, policy_version 526218 (0.0025) [2024-04-28 00:44:59,182][54818] Updated weights for policy 0, policy_version 526228 (0.0025) [2024-04-28 00:44:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8621719552. Throughput: 0: 59389.3. Samples: 1526849700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:44:59,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 00:45:02,161][54818] Updated weights for policy 0, policy_version 526238 (0.0026) [2024-04-28 00:45:04,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8621998080. Throughput: 0: 59159.1. Samples: 1527199620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:45:04,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:45:04,679][54818] Updated weights for policy 0, policy_version 526248 (0.0024) [2024-04-28 00:45:07,695][54818] Updated weights for policy 0, policy_version 526258 (0.0026) [2024-04-28 00:45:09,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8622292992. Throughput: 0: 58894.5. Samples: 1527553080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:45:09,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 00:45:10,119][54818] Updated weights for policy 0, policy_version 526268 (0.0029) [2024-04-28 00:45:13,132][54818] Updated weights for policy 0, policy_version 526278 (0.0027) [2024-04-28 00:45:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8622604288. Throughput: 0: 58980.5. Samples: 1527734160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:45:14,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 00:45:15,133][54798] Signal inference workers to stop experience collection... (24400 times) [2024-04-28 00:45:15,161][54818] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-04-28 00:45:15,223][54798] Signal inference workers to resume experience collection... (24400 times) [2024-04-28 00:45:15,224][54818] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-04-28 00:45:15,551][54818] Updated weights for policy 0, policy_version 526288 (0.0021) [2024-04-28 00:45:18,640][54818] Updated weights for policy 0, policy_version 526298 (0.0026) [2024-04-28 00:45:19,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8622899200. Throughput: 0: 59423.5. Samples: 1528093260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:45:19,254][54587] Avg episode reward: [(0, '0.692')] [2024-04-28 00:45:21,007][54818] Updated weights for policy 0, policy_version 526308 (0.0021) [2024-04-28 00:45:24,029][54818] Updated weights for policy 0, policy_version 526318 (0.0025) [2024-04-28 00:45:24,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58982.4, 300 sec: 59149.1). Total num frames: 8623194112. Throughput: 0: 59493.6. Samples: 1528450580. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:45:24,253][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 00:45:26,396][54818] Updated weights for policy 0, policy_version 526328 (0.0027) [2024-04-28 00:45:29,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8623489024. Throughput: 0: 59097.7. Samples: 1528616320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:45:29,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 00:45:29,532][54818] Updated weights for policy 0, policy_version 526338 (0.0029) [2024-04-28 00:45:31,887][54818] Updated weights for policy 0, policy_version 526348 (0.0028) [2024-04-28 00:45:34,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8623767552. Throughput: 0: 59020.2. Samples: 1528975060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 00:45:34,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 00:45:35,003][54818] Updated weights for policy 0, policy_version 526358 (0.0025) [2024-04-28 00:45:37,517][54818] Updated weights for policy 0, policy_version 526368 (0.0025) [2024-04-28 00:45:39,253][54587] Fps is (10 sec: 55706.2, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8624046080. Throughput: 0: 59344.0. Samples: 1529339120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:45:39,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 00:45:40,604][54818] Updated weights for policy 0, policy_version 526378 (0.0023) [2024-04-28 00:45:43,345][54818] Updated weights for policy 0, policy_version 526388 (0.0023) [2024-04-28 00:45:44,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58163.3, 300 sec: 58982.4). Total num frames: 8624340992. Throughput: 0: 59211.2. Samples: 1529514200. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:45:44,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 00:45:46,158][54818] Updated weights for policy 0, policy_version 526398 (0.0025) [2024-04-28 00:45:49,136][54818] Updated weights for policy 0, policy_version 526408 (0.0025) [2024-04-28 00:45:49,253][54587] Fps is (10 sec: 62259.4, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8624668672. Throughput: 0: 59149.4. Samples: 1529861340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:45:49,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 00:45:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000526408_8624668672.pth... [2024-04-28 00:45:49,314][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000525543_8610496512.pth [2024-04-28 00:45:51,748][54818] Updated weights for policy 0, policy_version 526418 (0.0025) [2024-04-28 00:45:54,253][54587] Fps is (10 sec: 62258.3, 60 sec: 58709.2, 300 sec: 59093.4). Total num frames: 8624963584. Throughput: 0: 59063.9. Samples: 1530210960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:45:54,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 00:45:54,908][54818] Updated weights for policy 0, policy_version 526428 (0.0023) [2024-04-28 00:45:57,410][54818] Updated weights for policy 0, policy_version 526438 (0.0026) [2024-04-28 00:45:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8625274880. Throughput: 0: 59167.5. Samples: 1530396700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:45:59,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-28 00:46:00,500][54818] Updated weights for policy 0, policy_version 526448 (0.0025) [2024-04-28 00:46:00,920][54798] Signal inference workers to stop experience collection... (24450 times) [2024-04-28 00:46:00,922][54798] Signal inference workers to resume experience collection... 
(24450 times) [2024-04-28 00:46:00,949][54818] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-04-28 00:46:00,950][54818] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-04-28 00:46:02,822][54818] Updated weights for policy 0, policy_version 526458 (0.0025) [2024-04-28 00:46:04,253][54587] Fps is (10 sec: 62260.6, 60 sec: 59801.7, 300 sec: 59204.6). Total num frames: 8625586176. Throughput: 0: 59064.1. Samples: 1530751140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:04,253][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 00:46:06,005][54818] Updated weights for policy 0, policy_version 526468 (0.0026) [2024-04-28 00:46:08,347][54818] Updated weights for policy 0, policy_version 526478 (0.0026) [2024-04-28 00:46:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59801.6, 300 sec: 59149.0). Total num frames: 8625881088. Throughput: 0: 59203.0. Samples: 1531114720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:09,254][54587] Avg episode reward: [(0, '0.488')] [2024-04-28 00:46:11,483][54818] Updated weights for policy 0, policy_version 526488 (0.0025) [2024-04-28 00:46:13,930][54818] Updated weights for policy 0, policy_version 526498 (0.0025) [2024-04-28 00:46:14,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8626159616. Throughput: 0: 59440.6. Samples: 1531291140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:14,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 00:46:17,004][54818] Updated weights for policy 0, policy_version 526508 (0.0025) [2024-04-28 00:46:19,253][54587] Fps is (10 sec: 57343.3, 60 sec: 59255.3, 300 sec: 59093.4). Total num frames: 8626454528. Throughput: 0: 59285.1. Samples: 1531642900. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:19,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 00:46:19,392][54818] Updated weights for policy 0, policy_version 526518 (0.0025) [2024-04-28 00:46:22,465][54818] Updated weights for policy 0, policy_version 526528 (0.0025) [2024-04-28 00:46:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8626749440. Throughput: 0: 59020.1. Samples: 1531995020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:24,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 00:46:24,924][54818] Updated weights for policy 0, policy_version 526538 (0.0026) [2024-04-28 00:46:27,843][54818] Updated weights for policy 0, policy_version 526548 (0.0025) [2024-04-28 00:46:29,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8627044352. Throughput: 0: 59232.9. Samples: 1532179680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:29,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 00:46:30,447][54818] Updated weights for policy 0, policy_version 526558 (0.0024) [2024-04-28 00:46:33,388][54818] Updated weights for policy 0, policy_version 526568 (0.0028) [2024-04-28 00:46:34,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8627339264. Throughput: 0: 59488.0. Samples: 1532538300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:34,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 00:46:35,860][54818] Updated weights for policy 0, policy_version 526578 (0.0026) [2024-04-28 00:46:38,954][54818] Updated weights for policy 0, policy_version 526588 (0.0027) [2024-04-28 00:46:39,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8627617792. Throughput: 0: 59504.2. Samples: 1532888640. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:39,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 00:46:41,247][54818] Updated weights for policy 0, policy_version 526598 (0.0026) [2024-04-28 00:46:43,544][54798] Signal inference workers to stop experience collection... (24500 times) [2024-04-28 00:46:43,550][54798] Signal inference workers to resume experience collection... (24500 times) [2024-04-28 00:46:43,563][54818] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-04-28 00:46:43,563][54818] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-04-28 00:46:44,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59801.6, 300 sec: 59149.0). Total num frames: 8627929088. Throughput: 0: 58987.2. Samples: 1533051120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:44,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-28 00:46:44,564][54818] Updated weights for policy 0, policy_version 526608 (0.0025) [2024-04-28 00:46:46,729][54818] Updated weights for policy 0, policy_version 526618 (0.0024) [2024-04-28 00:46:49,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8628224000. Throughput: 0: 59307.4. Samples: 1533419980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:49,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:46:49,887][54818] Updated weights for policy 0, policy_version 526628 (0.0026) [2024-04-28 00:46:52,114][54818] Updated weights for policy 0, policy_version 526638 (0.0025) [2024-04-28 00:46:54,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8628518912. Throughput: 0: 59296.4. Samples: 1533783060. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:54,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 00:46:55,415][54818] Updated weights for policy 0, policy_version 526648 (0.0028) [2024-04-28 00:46:57,559][54818] Updated weights for policy 0, policy_version 526658 (0.0025) [2024-04-28 00:46:59,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8628797440. Throughput: 0: 59042.5. Samples: 1533948060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:46:59,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 00:47:01,027][54818] Updated weights for policy 0, policy_version 526668 (0.0024) [2024-04-28 00:47:03,602][54818] Updated weights for policy 0, policy_version 526678 (0.0025) [2024-04-28 00:47:04,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58709.3, 300 sec: 59149.1). Total num frames: 8629108736. Throughput: 0: 59050.5. Samples: 1534300160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 00:47:04,253][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 00:47:06,449][54818] Updated weights for policy 0, policy_version 526688 (0.0028) [2024-04-28 00:47:09,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8629403648. Throughput: 0: 59052.4. Samples: 1534652380. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:09,253][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 00:47:09,282][54818] Updated weights for policy 0, policy_version 526698 (0.0022) [2024-04-28 00:47:11,852][54818] Updated weights for policy 0, policy_version 526708 (0.0027) [2024-04-28 00:47:14,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8629714944. Throughput: 0: 59115.5. Samples: 1534839880. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:14,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 00:47:14,949][54818] Updated weights for policy 0, policy_version 526718 (0.0027) [2024-04-28 00:47:17,513][54818] Updated weights for policy 0, policy_version 526728 (0.0025) [2024-04-28 00:47:19,253][54587] Fps is (10 sec: 62258.8, 60 sec: 59528.7, 300 sec: 59204.5). Total num frames: 8630026240. Throughput: 0: 58950.2. Samples: 1535191060. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:19,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 00:47:20,633][54818] Updated weights for policy 0, policy_version 526738 (0.0026) [2024-04-28 00:47:22,999][54818] Updated weights for policy 0, policy_version 526748 (0.0025) [2024-04-28 00:47:24,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59528.4, 300 sec: 59204.6). Total num frames: 8630321152. Throughput: 0: 58941.3. Samples: 1535541000. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:24,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 00:47:26,091][54818] Updated weights for policy 0, policy_version 526758 (0.0026) [2024-04-28 00:47:27,612][54798] Signal inference workers to stop experience collection... (24550 times) [2024-04-28 00:47:27,618][54798] Signal inference workers to resume experience collection... (24550 times) [2024-04-28 00:47:27,630][54818] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-04-28 00:47:27,631][54818] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-04-28 00:47:28,678][54818] Updated weights for policy 0, policy_version 526768 (0.0024) [2024-04-28 00:47:29,253][54587] Fps is (10 sec: 55705.9, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8630583296. Throughput: 0: 59437.0. Samples: 1535725780. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:29,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 00:47:31,527][54818] Updated weights for policy 0, policy_version 526778 (0.0032) [2024-04-28 00:47:34,230][54818] Updated weights for policy 0, policy_version 526788 (0.0024) [2024-04-28 00:47:34,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8630894592. Throughput: 0: 59129.9. Samples: 1536080820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:34,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 00:47:37,135][54818] Updated weights for policy 0, policy_version 526798 (0.0025) [2024-04-28 00:47:39,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8631173120. Throughput: 0: 58851.2. Samples: 1536431360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:39,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 00:47:39,602][54818] Updated weights for policy 0, policy_version 526808 (0.0024) [2024-04-28 00:47:42,695][54818] Updated weights for policy 0, policy_version 526818 (0.0025) [2024-04-28 00:47:44,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58982.5, 300 sec: 59149.1). Total num frames: 8631468032. Throughput: 0: 58931.8. Samples: 1536599980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:44,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 00:47:45,398][54818] Updated weights for policy 0, policy_version 526828 (0.0025) [2024-04-28 00:47:48,226][54818] Updated weights for policy 0, policy_version 526838 (0.0026) [2024-04-28 00:47:49,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 59093.4). Total num frames: 8631762944. Throughput: 0: 59203.4. Samples: 1536964320. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:49,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 00:47:49,264][54587] No heartbeat for components: RolloutWorker_w4 (12217 seconds) [2024-04-28 00:47:49,384][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000526842_8631779328.pth... [2024-04-28 00:47:49,443][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000525975_8617574400.pth [2024-04-28 00:47:50,824][54818] Updated weights for policy 0, policy_version 526848 (0.0025) [2024-04-28 00:47:54,106][54818] Updated weights for policy 0, policy_version 526858 (0.0023) [2024-04-28 00:47:54,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.5, 300 sec: 59093.5). Total num frames: 8632041472. Throughput: 0: 59227.6. Samples: 1537317620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:54,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 00:47:56,417][54818] Updated weights for policy 0, policy_version 526868 (0.0030) [2024-04-28 00:47:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8632352768. Throughput: 0: 58828.8. Samples: 1537487180. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:47:59,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 00:47:59,657][54818] Updated weights for policy 0, policy_version 526878 (0.0025) [2024-04-28 00:48:01,971][54818] Updated weights for policy 0, policy_version 526888 (0.0024) [2024-04-28 00:48:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.4, 300 sec: 59260.1). Total num frames: 8632647680. Throughput: 0: 58904.5. Samples: 1537841760. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:48:04,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-28 00:48:05,156][54818] Updated weights for policy 0, policy_version 526898 (0.0025) [2024-04-28 00:48:07,299][54818] Updated weights for policy 0, policy_version 526908 (0.0026) [2024-04-28 00:48:09,253][54587] Fps is (10 sec: 58982.9, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8632942592. Throughput: 0: 59061.0. Samples: 1538198740. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:48:09,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 00:48:10,563][54818] Updated weights for policy 0, policy_version 526918 (0.0025) [2024-04-28 00:48:12,814][54818] Updated weights for policy 0, policy_version 526928 (0.0025) [2024-04-28 00:48:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.4, 300 sec: 59204.5). Total num frames: 8633253888. Throughput: 0: 59044.8. Samples: 1538382800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:48:14,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 00:48:16,100][54818] Updated weights for policy 0, policy_version 526938 (0.0026) [2024-04-28 00:48:16,122][54798] Signal inference workers to stop experience collection... (24600 times) [2024-04-28 00:48:16,123][54798] Signal inference workers to resume experience collection... (24600 times) [2024-04-28 00:48:16,144][54818] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-04-28 00:48:16,144][54818] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-04-28 00:48:18,241][54818] Updated weights for policy 0, policy_version 526948 (0.0026) [2024-04-28 00:48:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8633548800. Throughput: 0: 58950.6. Samples: 1538733600. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:48:19,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 00:48:21,597][54818] Updated weights for policy 0, policy_version 526958 (0.0023) [2024-04-28 00:48:24,159][54818] Updated weights for policy 0, policy_version 526968 (0.0027) [2024-04-28 00:48:24,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8633843712. Throughput: 0: 58897.9. Samples: 1539081760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:48:24,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 00:48:27,023][54818] Updated weights for policy 0, policy_version 526978 (0.0026) [2024-04-28 00:48:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8634155008. Throughput: 0: 59392.7. Samples: 1539272660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 00:48:29,254][54587] Avg episode reward: [(0, '0.484')] [2024-04-28 00:48:29,521][54818] Updated weights for policy 0, policy_version 526988 (0.0026) [2024-04-28 00:48:32,383][54818] Updated weights for policy 0, policy_version 526998 (0.0026) [2024-04-28 00:48:34,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.4, 300 sec: 59149.1). Total num frames: 8634449920. Throughput: 0: 59235.2. Samples: 1539629900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:48:34,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 00:48:35,059][54818] Updated weights for policy 0, policy_version 527008 (0.0028) [2024-04-28 00:48:37,810][54818] Updated weights for policy 0, policy_version 527018 (0.0025) [2024-04-28 00:48:39,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8634728448. Throughput: 0: 59126.6. Samples: 1539978320. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:48:39,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 00:48:40,786][54818] Updated weights for policy 0, policy_version 527028 (0.0026) [2024-04-28 00:48:43,558][54818] Updated weights for policy 0, policy_version 527038 (0.0024) [2024-04-28 00:48:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8635039744. Throughput: 0: 59268.5. Samples: 1540154260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:48:44,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 00:48:46,295][54818] Updated weights for policy 0, policy_version 527048 (0.0025) [2024-04-28 00:48:48,997][54818] Updated weights for policy 0, policy_version 527058 (0.0026) [2024-04-28 00:48:49,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59528.7, 300 sec: 59260.1). Total num frames: 8635334656. Throughput: 0: 59513.8. Samples: 1540519880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:48:49,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 00:48:51,616][54818] Updated weights for policy 0, policy_version 527068 (0.0026) [2024-04-28 00:48:54,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59801.5, 300 sec: 59204.6). Total num frames: 8635629568. Throughput: 0: 59496.0. Samples: 1540876060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:48:54,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 00:48:54,398][54818] Updated weights for policy 0, policy_version 527078 (0.0026) [2024-04-28 00:48:57,334][54818] Updated weights for policy 0, policy_version 527088 (0.0022) [2024-04-28 00:48:59,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59528.6, 300 sec: 59204.5). Total num frames: 8635924480. Throughput: 0: 59318.7. Samples: 1541052140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:48:59,262][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 00:48:59,718][54818] Updated weights for policy 0, policy_version 527098 (0.0026) [2024-04-28 00:49:02,808][54818] Updated weights for policy 0, policy_version 527108 (0.0025) [2024-04-28 00:49:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59528.4, 300 sec: 59204.6). Total num frames: 8636219392. Throughput: 0: 59395.4. Samples: 1541406400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:04,263][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 00:49:05,313][54818] Updated weights for policy 0, policy_version 527118 (0.0027) [2024-04-28 00:49:08,421][54818] Updated weights for policy 0, policy_version 527128 (0.0026) [2024-04-28 00:49:09,253][54587] Fps is (10 sec: 57343.1, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8636497920. Throughput: 0: 59575.8. Samples: 1541762680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:09,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 00:49:09,696][54798] Signal inference workers to stop experience collection... (24650 times) [2024-04-28 00:49:09,696][54798] Signal inference workers to resume experience collection... (24650 times) [2024-04-28 00:49:09,709][54818] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-04-28 00:49:09,709][54818] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-04-28 00:49:10,948][54818] Updated weights for policy 0, policy_version 527138 (0.0026) [2024-04-28 00:49:14,196][54818] Updated weights for policy 0, policy_version 527148 (0.0025) [2024-04-28 00:49:14,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8636792832. Throughput: 0: 59127.2. Samples: 1541933380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:14,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 00:49:16,748][54818] Updated weights for policy 0, policy_version 527158 (0.0026) [2024-04-28 00:49:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8637104128. Throughput: 0: 59045.3. Samples: 1542286940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:19,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 00:49:19,663][54818] Updated weights for policy 0, policy_version 527168 (0.0025) [2024-04-28 00:49:22,181][54818] Updated weights for policy 0, policy_version 527178 (0.0026) [2024-04-28 00:49:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8637399040. Throughput: 0: 59163.0. Samples: 1542640660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 00:49:25,200][54818] Updated weights for policy 0, policy_version 527188 (0.0026) [2024-04-28 00:49:27,660][54818] Updated weights for policy 0, policy_version 527198 (0.0026) [2024-04-28 00:49:29,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58982.4, 300 sec: 59204.5). Total num frames: 8637693952. Throughput: 0: 59244.5. Samples: 1542820260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:29,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 00:49:30,780][54818] Updated weights for policy 0, policy_version 527208 (0.0026) [2024-04-28 00:49:33,301][54818] Updated weights for policy 0, policy_version 527218 (0.0027) [2024-04-28 00:49:34,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8638005248. Throughput: 0: 59011.0. Samples: 1543175380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:34,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 00:49:36,471][54818] Updated weights for policy 0, policy_version 527228 (0.0026) [2024-04-28 00:49:38,706][54818] Updated weights for policy 0, policy_version 527238 (0.0026) [2024-04-28 00:49:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8638300160. Throughput: 0: 58922.1. Samples: 1543527560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:39,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 00:49:41,895][54818] Updated weights for policy 0, policy_version 527248 (0.0026) [2024-04-28 00:49:44,189][54818] Updated weights for policy 0, policy_version 527258 (0.0027) [2024-04-28 00:49:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8638595072. Throughput: 0: 59175.0. Samples: 1543715020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:44,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 00:49:47,346][54818] Updated weights for policy 0, policy_version 527268 (0.0024) [2024-04-28 00:49:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8638873600. Throughput: 0: 58976.0. Samples: 1544060320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:49,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 00:49:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000527275_8638873600.pth... [2024-04-28 00:49:49,327][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000526408_8624668672.pth [2024-04-28 00:49:49,936][54818] Updated weights for policy 0, policy_version 527278 (0.0026) [2024-04-28 00:49:53,021][54818] Updated weights for policy 0, policy_version 527288 (0.0027) [2024-04-28 00:49:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58982.3, 300 sec: 59149.0). 
Total num frames: 8639168512. Throughput: 0: 58844.0. Samples: 1544410660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:54,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 00:49:55,497][54818] Updated weights for policy 0, policy_version 527298 (0.0025) [2024-04-28 00:49:58,464][54818] Updated weights for policy 0, policy_version 527308 (0.0026) [2024-04-28 00:49:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8639479808. Throughput: 0: 59090.7. Samples: 1544592460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 00:49:59,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 00:50:00,993][54818] Updated weights for policy 0, policy_version 527318 (0.0026) [2024-04-28 00:50:04,067][54818] Updated weights for policy 0, policy_version 527328 (0.0027) [2024-04-28 00:50:04,253][54587] Fps is (10 sec: 58983.6, 60 sec: 58982.6, 300 sec: 59204.6). Total num frames: 8639758336. Throughput: 0: 59188.2. Samples: 1544950400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:04,253][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 00:50:06,490][54818] Updated weights for policy 0, policy_version 527338 (0.0025) [2024-04-28 00:50:09,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8640036864. Throughput: 0: 59254.7. Samples: 1545307120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:09,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 00:50:09,574][54818] Updated weights for policy 0, policy_version 527348 (0.0024) [2024-04-28 00:50:11,886][54818] Updated weights for policy 0, policy_version 527358 (0.0026) [2024-04-28 00:50:14,253][54587] Fps is (10 sec: 57342.8, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8640331776. Throughput: 0: 58892.8. Samples: 1545470440. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:14,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 00:50:14,989][54798] Signal inference workers to stop experience collection... (24700 times) [2024-04-28 00:50:14,991][54798] Signal inference workers to resume experience collection... (24700 times) [2024-04-28 00:50:15,009][54818] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-04-28 00:50:15,009][54818] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-04-28 00:50:15,110][54818] Updated weights for policy 0, policy_version 527368 (0.0026) [2024-04-28 00:50:17,364][54818] Updated weights for policy 0, policy_version 527378 (0.0026) [2024-04-28 00:50:19,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58709.3, 300 sec: 59093.4). Total num frames: 8640626688. Throughput: 0: 58940.3. Samples: 1545827700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:19,254][54587] Avg episode reward: [(0, '0.511')] [2024-04-28 00:50:20,491][54818] Updated weights for policy 0, policy_version 527388 (0.0023) [2024-04-28 00:50:23,022][54818] Updated weights for policy 0, policy_version 527398 (0.0026) [2024-04-28 00:50:24,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8640921600. Throughput: 0: 59174.7. Samples: 1546190420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:24,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 00:50:26,258][54818] Updated weights for policy 0, policy_version 527408 (0.0026) [2024-04-28 00:50:28,832][54818] Updated weights for policy 0, policy_version 527418 (0.0027) [2024-04-28 00:50:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8641216512. Throughput: 0: 58785.7. Samples: 1546360380. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:29,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 00:50:31,735][54818] Updated weights for policy 0, policy_version 527428 (0.0025) [2024-04-28 00:50:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.2, 300 sec: 59260.1). Total num frames: 8641527808. Throughput: 0: 58919.5. Samples: 1546711700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:34,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 00:50:34,392][54818] Updated weights for policy 0, policy_version 527438 (0.0026) [2024-04-28 00:50:37,363][54818] Updated weights for policy 0, policy_version 527448 (0.0026) [2024-04-28 00:50:39,253][54587] Fps is (10 sec: 62259.4, 60 sec: 58982.4, 300 sec: 59315.6). Total num frames: 8641839104. Throughput: 0: 58960.1. Samples: 1547063860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:39,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 00:50:39,897][54818] Updated weights for policy 0, policy_version 527458 (0.0025) [2024-04-28 00:50:42,825][54818] Updated weights for policy 0, policy_version 527468 (0.0024) [2024-04-28 00:50:44,253][54587] Fps is (10 sec: 60621.4, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8642134016. Throughput: 0: 59115.5. Samples: 1547252660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:44,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:50:45,789][54818] Updated weights for policy 0, policy_version 527478 (0.0025) [2024-04-28 00:50:48,368][54818] Updated weights for policy 0, policy_version 527488 (0.0026) [2024-04-28 00:50:49,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8642428928. Throughput: 0: 58952.7. Samples: 1547603280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:49,262][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 00:50:49,267][54587] No heartbeat for components: RolloutWorker_w4 (12397 seconds) [2024-04-28 00:50:51,266][54818] Updated weights for policy 0, policy_version 527498 (0.0025) [2024-04-28 00:50:53,817][54818] Updated weights for policy 0, policy_version 527508 (0.0026) [2024-04-28 00:50:54,254][54587] Fps is (10 sec: 58981.2, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8642723840. Throughput: 0: 58864.2. Samples: 1547956020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:54,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 00:50:56,836][54818] Updated weights for policy 0, policy_version 527518 (0.0025) [2024-04-28 00:50:59,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8643002368. Throughput: 0: 59263.8. Samples: 1548137300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:50:59,253][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 00:50:59,323][54818] Updated weights for policy 0, policy_version 527528 (0.0025) [2024-04-28 00:51:02,254][54818] Updated weights for policy 0, policy_version 527538 (0.0023) [2024-04-28 00:51:04,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58982.2, 300 sec: 59037.9). Total num frames: 8643297280. Throughput: 0: 59220.0. Samples: 1548492600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:51:04,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 00:51:04,847][54818] Updated weights for policy 0, policy_version 527548 (0.0026) [2024-04-28 00:51:07,995][54818] Updated weights for policy 0, policy_version 527558 (0.0026) [2024-04-28 00:51:09,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8643592192. Throughput: 0: 59097.0. Samples: 1548849780. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:51:09,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:51:10,443][54818] Updated weights for policy 0, policy_version 527568 (0.0026) [2024-04-28 00:51:13,637][54818] Updated weights for policy 0, policy_version 527578 (0.0026) [2024-04-28 00:51:14,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.5, 300 sec: 59038.0). Total num frames: 8643870720. Throughput: 0: 59008.5. Samples: 1549015760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:51:14,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 00:51:15,955][54818] Updated weights for policy 0, policy_version 527588 (0.0025) [2024-04-28 00:51:16,110][54798] Signal inference workers to stop experience collection... (24750 times) [2024-04-28 00:51:16,144][54818] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-04-28 00:51:16,199][54798] Signal inference workers to resume experience collection... (24750 times) [2024-04-28 00:51:16,199][54818] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-04-28 00:51:19,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8644149248. Throughput: 0: 59295.7. Samples: 1549380000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:51:19,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 00:51:19,282][54818] Updated weights for policy 0, policy_version 527598 (0.0024) [2024-04-28 00:51:21,388][54818] Updated weights for policy 0, policy_version 527608 (0.0025) [2024-04-28 00:51:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8644476928. Throughput: 0: 59250.8. Samples: 1549730140. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:51:24,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 00:51:24,728][54818] Updated weights for policy 0, policy_version 527618 (0.0023) [2024-04-28 00:51:26,870][54818] Updated weights for policy 0, policy_version 527628 (0.0024) [2024-04-28 00:51:29,253][54587] Fps is (10 sec: 62259.4, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8644771840. Throughput: 0: 59013.4. Samples: 1549908260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 00:51:29,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 00:51:30,161][54818] Updated weights for policy 0, policy_version 527638 (0.0026) [2024-04-28 00:51:32,313][54818] Updated weights for policy 0, policy_version 527648 (0.0026) [2024-04-28 00:51:34,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8645066752. Throughput: 0: 58895.1. Samples: 1550253560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:51:34,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 00:51:35,715][54818] Updated weights for policy 0, policy_version 527658 (0.0026) [2024-04-28 00:51:37,750][54818] Updated weights for policy 0, policy_version 527668 (0.0027) [2024-04-28 00:51:39,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8645378048. Throughput: 0: 59109.9. Samples: 1550615960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:51:39,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 00:51:41,184][54818] Updated weights for policy 0, policy_version 527678 (0.0025) [2024-04-28 00:51:43,183][54818] Updated weights for policy 0, policy_version 527688 (0.0026) [2024-04-28 00:51:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8645672960. Throughput: 0: 59152.7. Samples: 1550799180. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:51:44,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 00:51:46,600][54818] Updated weights for policy 0, policy_version 527698 (0.0027) [2024-04-28 00:51:48,931][54818] Updated weights for policy 0, policy_version 527708 (0.0026) [2024-04-28 00:51:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8645967872. Throughput: 0: 59135.6. Samples: 1551153700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:51:49,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 00:51:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000527708_8645967872.pth... [2024-04-28 00:51:49,351][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000526842_8631779328.pth [2024-04-28 00:51:52,069][54818] Updated weights for policy 0, policy_version 527718 (0.0023) [2024-04-28 00:51:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.6, 300 sec: 59260.1). Total num frames: 8646279168. Throughput: 0: 58923.9. Samples: 1551501360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:51:54,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 00:51:54,606][54818] Updated weights for policy 0, policy_version 527728 (0.0027) [2024-04-28 00:51:57,692][54818] Updated weights for policy 0, policy_version 527738 (0.0025) [2024-04-28 00:51:59,253][54587] Fps is (10 sec: 62259.6, 60 sec: 59801.5, 300 sec: 59260.1). Total num frames: 8646590464. Throughput: 0: 59508.4. Samples: 1551693640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:51:59,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 00:52:00,164][54818] Updated weights for policy 0, policy_version 527748 (0.0025) [2024-04-28 00:52:03,274][54818] Updated weights for policy 0, policy_version 527758 (0.0026) [2024-04-28 00:52:04,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59528.6, 300 sec: 59204.5). 
Total num frames: 8646868992. Throughput: 0: 59253.8. Samples: 1552046420. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:04,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 00:52:05,666][54818] Updated weights for policy 0, policy_version 527768 (0.0025) [2024-04-28 00:52:07,882][54798] Signal inference workers to stop experience collection... (24800 times) [2024-04-28 00:52:07,882][54798] Signal inference workers to resume experience collection... (24800 times) [2024-04-28 00:52:07,894][54818] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-04-28 00:52:07,895][54818] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-04-28 00:52:08,689][54818] Updated weights for policy 0, policy_version 527778 (0.0025) [2024-04-28 00:52:09,253][54587] Fps is (10 sec: 55706.1, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8647147520. Throughput: 0: 59284.9. Samples: 1552397960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:09,253][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 00:52:11,416][54818] Updated weights for policy 0, policy_version 527788 (0.0023) [2024-04-28 00:52:14,141][54818] Updated weights for policy 0, policy_version 527798 (0.0025) [2024-04-28 00:52:14,253][54587] Fps is (10 sec: 57344.7, 60 sec: 59528.6, 300 sec: 59038.0). Total num frames: 8647442432. Throughput: 0: 59085.4. Samples: 1552567100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:14,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 00:52:16,726][54818] Updated weights for policy 0, policy_version 527808 (0.0026) [2024-04-28 00:52:19,253][54587] Fps is (10 sec: 57343.2, 60 sec: 59528.5, 300 sec: 58982.4). Total num frames: 8647720960. Throughput: 0: 59384.0. Samples: 1552925840. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:19,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 00:52:19,706][54818] Updated weights for policy 0, policy_version 527818 (0.0027) [2024-04-28 00:52:22,362][54818] Updated weights for policy 0, policy_version 527828 (0.0026) [2024-04-28 00:52:24,253][54587] Fps is (10 sec: 57343.0, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8648015872. Throughput: 0: 59459.6. Samples: 1553291640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:24,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 00:52:25,113][54818] Updated weights for policy 0, policy_version 527838 (0.0025) [2024-04-28 00:52:27,946][54818] Updated weights for policy 0, policy_version 527848 (0.0026) [2024-04-28 00:52:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8648310784. Throughput: 0: 59021.4. Samples: 1553455140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:29,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 00:52:30,707][54818] Updated weights for policy 0, policy_version 527858 (0.0026) [2024-04-28 00:52:33,429][54818] Updated weights for policy 0, policy_version 527868 (0.0026) [2024-04-28 00:52:34,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8648605696. Throughput: 0: 58930.8. Samples: 1553805580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:34,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 00:52:36,259][54818] Updated weights for policy 0, policy_version 527878 (0.0025) [2024-04-28 00:52:39,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58709.3, 300 sec: 59093.4). Total num frames: 8648900608. Throughput: 0: 59184.8. Samples: 1554164680. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:39,254][54587] Avg episode reward: [(0, '0.699')] [2024-04-28 00:52:39,351][54818] Updated weights for policy 0, policy_version 527888 (0.0027) [2024-04-28 00:52:41,780][54818] Updated weights for policy 0, policy_version 527898 (0.0025) [2024-04-28 00:52:44,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.5, 300 sec: 59149.0). Total num frames: 8649211904. Throughput: 0: 58813.3. Samples: 1554340240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:44,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-28 00:52:44,798][54818] Updated weights for policy 0, policy_version 527908 (0.0026) [2024-04-28 00:52:47,246][54818] Updated weights for policy 0, policy_version 527918 (0.0025) [2024-04-28 00:52:49,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58982.5, 300 sec: 59204.5). Total num frames: 8649506816. Throughput: 0: 58905.8. Samples: 1554697180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:49,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 00:52:50,542][54818] Updated weights for policy 0, policy_version 527928 (0.0023) [2024-04-28 00:52:52,953][54818] Updated weights for policy 0, policy_version 527938 (0.0026) [2024-04-28 00:52:54,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8649801728. Throughput: 0: 58641.7. Samples: 1555036840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:54,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 00:52:56,180][54818] Updated weights for policy 0, policy_version 527948 (0.0025) [2024-04-28 00:52:58,441][54818] Updated weights for policy 0, policy_version 527958 (0.0025) [2024-04-28 00:52:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58709.2, 300 sec: 59204.5). Total num frames: 8650113024. Throughput: 0: 59110.4. Samples: 1555227080. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-04-28 00:52:59,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 00:53:01,503][54818] Updated weights for policy 0, policy_version 527968 (0.0025) [2024-04-28 00:53:03,898][54818] Updated weights for policy 0, policy_version 527978 (0.0026) [2024-04-28 00:53:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8650407936. Throughput: 0: 58996.1. Samples: 1555580660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:04,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 00:53:04,270][54798] Signal inference workers to stop experience collection... (24850 times) [2024-04-28 00:53:04,270][54798] Signal inference workers to resume experience collection... (24850 times) [2024-04-28 00:53:04,281][54818] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-04-28 00:53:04,281][54818] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-04-28 00:53:07,073][54818] Updated weights for policy 0, policy_version 527988 (0.0026) [2024-04-28 00:53:09,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8650702848. Throughput: 0: 58643.6. Samples: 1555930600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:09,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-28 00:53:09,405][54818] Updated weights for policy 0, policy_version 527998 (0.0026) [2024-04-28 00:53:12,547][54818] Updated weights for policy 0, policy_version 528008 (0.0026) [2024-04-28 00:53:14,253][54587] Fps is (10 sec: 57343.9, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8650981376. Throughput: 0: 59317.9. Samples: 1556124440. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:14,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 00:53:14,798][54818] Updated weights for policy 0, policy_version 528018 (0.0026) [2024-04-28 00:53:18,054][54818] Updated weights for policy 0, policy_version 528028 (0.0025) [2024-04-28 00:53:19,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8651292672. Throughput: 0: 59344.9. Samples: 1556476100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:19,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 00:53:20,297][54818] Updated weights for policy 0, policy_version 528038 (0.0025) [2024-04-28 00:53:23,548][54818] Updated weights for policy 0, policy_version 528048 (0.0026) [2024-04-28 00:53:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.7, 300 sec: 59093.5). Total num frames: 8651587584. Throughput: 0: 59166.5. Samples: 1556827160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:24,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-28 00:53:25,927][54818] Updated weights for policy 0, policy_version 528058 (0.0025) [2024-04-28 00:53:29,001][54818] Updated weights for policy 0, policy_version 528068 (0.0025) [2024-04-28 00:53:29,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59255.5, 300 sec: 59037.9). Total num frames: 8651866112. Throughput: 0: 59202.2. Samples: 1557004340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:29,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 00:53:31,456][54818] Updated weights for policy 0, policy_version 528078 (0.0025) [2024-04-28 00:53:34,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8652161024. Throughput: 0: 59257.0. Samples: 1557363740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:34,253][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 00:53:34,508][54818] Updated weights for policy 0, policy_version 528088 (0.0026) [2024-04-28 00:53:37,018][54818] Updated weights for policy 0, policy_version 528098 (0.0026) [2024-04-28 00:53:39,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58982.4, 300 sec: 58982.4). Total num frames: 8652439552. Throughput: 0: 59745.2. Samples: 1557725380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 00:53:39,960][54818] Updated weights for policy 0, policy_version 528108 (0.0026) [2024-04-28 00:53:42,961][54818] Updated weights for policy 0, policy_version 528118 (0.0026) [2024-04-28 00:53:44,253][54587] Fps is (10 sec: 57343.2, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8652734464. Throughput: 0: 59225.8. Samples: 1557892240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:44,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 00:53:45,491][54818] Updated weights for policy 0, policy_version 528128 (0.0026) [2024-04-28 00:53:48,855][54818] Updated weights for policy 0, policy_version 528138 (0.0024) [2024-04-28 00:53:49,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.2, 300 sec: 59037.9). Total num frames: 8653045760. Throughput: 0: 59168.7. Samples: 1558243260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:49,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 00:53:49,265][54587] No heartbeat for components: RolloutWorker_w4 (12577 seconds) [2024-04-28 00:53:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000528140_8653045760.pth... [2024-04-28 00:53:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000527275_8638873600.pth [2024-04-28 00:53:50,538][54798] Signal inference workers to stop experience collection... 
(24900 times) [2024-04-28 00:53:50,543][54798] Signal inference workers to resume experience collection... (24900 times) [2024-04-28 00:53:50,554][54818] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-04-28 00:53:50,554][54818] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-04-28 00:53:50,911][54818] Updated weights for policy 0, policy_version 528148 (0.0025) [2024-04-28 00:53:54,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8653324288. Throughput: 0: 59493.0. Samples: 1558607780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:54,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 00:53:54,464][54818] Updated weights for policy 0, policy_version 528158 (0.0027) [2024-04-28 00:53:56,430][54818] Updated weights for policy 0, policy_version 528168 (0.0026) [2024-04-28 00:53:59,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 59037.9). Total num frames: 8653635584. Throughput: 0: 58949.6. Samples: 1558777180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:53:59,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 00:53:59,864][54818] Updated weights for policy 0, policy_version 528178 (0.0025) [2024-04-28 00:54:01,784][54818] Updated weights for policy 0, policy_version 528188 (0.0025) [2024-04-28 00:54:04,253][54587] Fps is (10 sec: 62259.2, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8653946880. Throughput: 0: 59008.9. Samples: 1559131500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:54:04,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 00:54:05,289][54818] Updated weights for policy 0, policy_version 528198 (0.0027) [2024-04-28 00:54:07,508][54818] Updated weights for policy 0, policy_version 528208 (0.0027) [2024-04-28 00:54:09,253][54587] Fps is (10 sec: 62259.5, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8654258176. Throughput: 0: 59137.6. 
Samples: 1559488360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:54:09,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:54:10,913][54818] Updated weights for policy 0, policy_version 528218 (0.0028) [2024-04-28 00:54:13,196][54818] Updated weights for policy 0, policy_version 528228 (0.0025) [2024-04-28 00:54:14,253][54587] Fps is (10 sec: 62258.4, 60 sec: 59801.5, 300 sec: 59204.5). Total num frames: 8654569472. Throughput: 0: 59243.9. Samples: 1559670320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:54:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 00:54:16,437][54818] Updated weights for policy 0, policy_version 528238 (0.0024) [2024-04-28 00:54:18,721][54818] Updated weights for policy 0, policy_version 528248 (0.0025) [2024-04-28 00:54:19,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8654848000. Throughput: 0: 58972.4. Samples: 1560017500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:54:19,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 00:54:22,019][54818] Updated weights for policy 0, policy_version 528258 (0.0025) [2024-04-28 00:54:24,009][54818] Updated weights for policy 0, policy_version 528268 (0.0026) [2024-04-28 00:54:24,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8655142912. Throughput: 0: 58898.3. Samples: 1560375800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 00:54:24,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 00:54:27,513][54818] Updated weights for policy 0, policy_version 528278 (0.0026) [2024-04-28 00:54:29,253][54587] Fps is (10 sec: 58981.2, 60 sec: 59528.4, 300 sec: 59093.4). Total num frames: 8655437824. Throughput: 0: 59417.2. Samples: 1560566020. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:54:29,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 00:54:29,565][54818] Updated weights for policy 0, policy_version 528288 (0.0025) [2024-04-28 00:54:33,022][54818] Updated weights for policy 0, policy_version 528298 (0.0024) [2024-04-28 00:54:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8655716352. Throughput: 0: 59481.5. Samples: 1560919920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:54:34,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 00:54:35,221][54818] Updated weights for policy 0, policy_version 528308 (0.0023) [2024-04-28 00:54:38,233][54798] Signal inference workers to stop experience collection... (24950 times) [2024-04-28 00:54:38,239][54798] Signal inference workers to resume experience collection... (24950 times) [2024-04-28 00:54:38,256][54818] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-04-28 00:54:38,256][54818] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-04-28 00:54:38,368][54818] Updated weights for policy 0, policy_version 528318 (0.0026) [2024-04-28 00:54:39,253][54587] Fps is (10 sec: 58983.6, 60 sec: 59801.7, 300 sec: 59093.5). Total num frames: 8656027648. Throughput: 0: 59034.7. Samples: 1561264340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:54:39,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 00:54:40,639][54818] Updated weights for policy 0, policy_version 528328 (0.0023) [2024-04-28 00:54:43,882][54818] Updated weights for policy 0, policy_version 528338 (0.0024) [2024-04-28 00:54:44,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59801.6, 300 sec: 59149.0). Total num frames: 8656322560. Throughput: 0: 59192.9. Samples: 1561440860. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:54:44,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 00:54:46,066][54818] Updated weights for policy 0, policy_version 528348 (0.0025) [2024-04-28 00:54:49,253][54587] Fps is (10 sec: 57343.3, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8656601088. Throughput: 0: 59448.7. Samples: 1561806700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:54:49,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 00:54:49,398][54818] Updated weights for policy 0, policy_version 528358 (0.0027) [2024-04-28 00:54:51,446][54818] Updated weights for policy 0, policy_version 528368 (0.0023) [2024-04-28 00:54:54,253][54587] Fps is (10 sec: 55706.6, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8656879616. Throughput: 0: 59338.4. Samples: 1562158580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:54:54,253][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 00:54:54,853][54818] Updated weights for policy 0, policy_version 528378 (0.0026) [2024-04-28 00:54:57,079][54818] Updated weights for policy 0, policy_version 528388 (0.0026) [2024-04-28 00:54:59,253][54587] Fps is (10 sec: 55705.6, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8657158144. Throughput: 0: 58859.6. Samples: 1562319000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:54:59,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 00:55:00,385][54818] Updated weights for policy 0, policy_version 528398 (0.0026) [2024-04-28 00:55:02,901][54818] Updated weights for policy 0, policy_version 528408 (0.0026) [2024-04-28 00:55:04,253][54587] Fps is (10 sec: 58981.4, 60 sec: 58709.2, 300 sec: 59093.5). Total num frames: 8657469440. Throughput: 0: 59085.2. Samples: 1562676340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:04,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 00:55:05,973][54818] Updated weights for policy 0, policy_version 528418 (0.0024) [2024-04-28 00:55:08,981][54818] Updated weights for policy 0, policy_version 528428 (0.0027) [2024-04-28 00:55:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58436.2, 300 sec: 59093.5). Total num frames: 8657764352. Throughput: 0: 59133.7. Samples: 1563036820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:09,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 00:55:11,390][54818] Updated weights for policy 0, policy_version 528438 (0.0026) [2024-04-28 00:55:14,253][54587] Fps is (10 sec: 60620.9, 60 sec: 58436.3, 300 sec: 59149.0). Total num frames: 8658075648. Throughput: 0: 58819.2. Samples: 1563212880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:14,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 00:55:14,460][54818] Updated weights for policy 0, policy_version 528448 (0.0025) [2024-04-28 00:55:16,260][54798] Signal inference workers to stop experience collection... (25000 times) [2024-04-28 00:55:16,293][54818] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-04-28 00:55:16,325][54798] Signal inference workers to resume experience collection... (25000 times) [2024-04-28 00:55:16,325][54818] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-04-28 00:55:16,924][54818] Updated weights for policy 0, policy_version 528458 (0.0023) [2024-04-28 00:55:19,253][54587] Fps is (10 sec: 62259.6, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8658386944. Throughput: 0: 58799.0. Samples: 1563565880. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:19,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 00:55:20,064][54818] Updated weights for policy 0, policy_version 528468 (0.0025) [2024-04-28 00:55:22,388][54818] Updated weights for policy 0, policy_version 528478 (0.0026) [2024-04-28 00:55:24,253][54587] Fps is (10 sec: 62259.2, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8658698240. Throughput: 0: 58791.9. Samples: 1563909980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:24,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 00:55:25,782][54818] Updated weights for policy 0, policy_version 528488 (0.0025) [2024-04-28 00:55:27,981][54818] Updated weights for policy 0, policy_version 528498 (0.0025) [2024-04-28 00:55:29,253][54587] Fps is (10 sec: 60621.8, 60 sec: 59255.7, 300 sec: 59204.6). Total num frames: 8658993152. Throughput: 0: 59168.7. Samples: 1564103440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:29,253][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 00:55:31,443][54818] Updated weights for policy 0, policy_version 528508 (0.0026) [2024-04-28 00:55:33,693][54818] Updated weights for policy 0, policy_version 528518 (0.0025) [2024-04-28 00:55:34,253][54587] Fps is (10 sec: 57344.8, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8659271680. Throughput: 0: 58794.9. Samples: 1564452460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:34,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 00:55:37,043][54818] Updated weights for policy 0, policy_version 528528 (0.0026) [2024-04-28 00:55:39,253][54587] Fps is (10 sec: 55704.3, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8659550208. Throughput: 0: 58665.9. Samples: 1564798560. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:39,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 00:55:39,293][54818] Updated weights for policy 0, policy_version 528538 (0.0026) [2024-04-28 00:55:42,533][54818] Updated weights for policy 0, policy_version 528548 (0.0024) [2024-04-28 00:55:44,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8659861504. Throughput: 0: 59166.7. Samples: 1564981500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:44,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 00:55:44,768][54818] Updated weights for policy 0, policy_version 528558 (0.0025) [2024-04-28 00:55:48,038][54818] Updated weights for policy 0, policy_version 528568 (0.0027) [2024-04-28 00:55:49,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8660156416. Throughput: 0: 59272.0. Samples: 1565343580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:49,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 00:55:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000528574_8660156416.pth... [2024-04-28 00:55:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000527708_8645967872.pth [2024-04-28 00:55:50,167][54818] Updated weights for policy 0, policy_version 528578 (0.0027) [2024-04-28 00:55:53,556][54818] Updated weights for policy 0, policy_version 528588 (0.0026) [2024-04-28 00:55:54,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8660418560. Throughput: 0: 58992.1. Samples: 1565691460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-04-28 00:55:54,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 00:55:55,742][54818] Updated weights for policy 0, policy_version 528598 (0.0026) [2024-04-28 00:55:59,253][54587] Fps is (10 sec: 55706.2, 60 sec: 59255.6, 300 sec: 59038.0). 
Total num frames: 8660713472. Throughput: 0: 58734.4. Samples: 1565855920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:55:59,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 00:55:59,256][54818] Updated weights for policy 0, policy_version 528608 (0.0026) [2024-04-28 00:56:01,245][54818] Updated weights for policy 0, policy_version 528618 (0.0025) [2024-04-28 00:56:02,743][54798] Signal inference workers to stop experience collection... (25050 times) [2024-04-28 00:56:02,743][54798] Signal inference workers to resume experience collection... (25050 times) [2024-04-28 00:56:02,755][54818] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-04-28 00:56:02,755][54818] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-04-28 00:56:04,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58982.6, 300 sec: 59037.9). Total num frames: 8661008384. Throughput: 0: 58882.4. Samples: 1566215580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:04,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 00:56:04,610][54818] Updated weights for policy 0, policy_version 528628 (0.0025) [2024-04-28 00:56:06,698][54818] Updated weights for policy 0, policy_version 528638 (0.0026) [2024-04-28 00:56:09,253][54587] Fps is (10 sec: 57344.2, 60 sec: 58709.6, 300 sec: 59038.0). Total num frames: 8661286912. Throughput: 0: 59127.3. Samples: 1566570700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:09,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 00:56:10,123][54818] Updated weights for policy 0, policy_version 528648 (0.0025) [2024-04-28 00:56:12,528][54818] Updated weights for policy 0, policy_version 528658 (0.0024) [2024-04-28 00:56:14,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8661598208. Throughput: 0: 58616.3. Samples: 1566741180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:14,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 00:56:15,709][54818] Updated weights for policy 0, policy_version 528668 (0.0024) [2024-04-28 00:56:17,884][54818] Updated weights for policy 0, policy_version 528678 (0.0026) [2024-04-28 00:56:19,253][54587] Fps is (10 sec: 63896.6, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8661925888. Throughput: 0: 58683.4. Samples: 1567093220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:19,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 00:56:21,197][54818] Updated weights for policy 0, policy_version 528688 (0.0025) [2024-04-28 00:56:23,362][54818] Updated weights for policy 0, policy_version 528698 (0.0026) [2024-04-28 00:56:24,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58436.4, 300 sec: 59093.5). Total num frames: 8662204416. Throughput: 0: 58922.6. Samples: 1567450060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:24,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 00:56:26,561][54818] Updated weights for policy 0, policy_version 528708 (0.0027) [2024-04-28 00:56:29,174][54818] Updated weights for policy 0, policy_version 528718 (0.0026) [2024-04-28 00:56:29,253][54587] Fps is (10 sec: 58982.1, 60 sec: 58709.1, 300 sec: 59149.0). Total num frames: 8662515712. Throughput: 0: 58976.8. Samples: 1567635460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:29,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 00:56:31,962][54818] Updated weights for policy 0, policy_version 528728 (0.0022) [2024-04-28 00:56:34,253][54587] Fps is (10 sec: 60620.2, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8662810624. Throughput: 0: 58808.1. Samples: 1567989940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:34,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 00:56:34,701][54818] Updated weights for policy 0, policy_version 528738 (0.0026) [2024-04-28 00:56:37,457][54818] Updated weights for policy 0, policy_version 528748 (0.0024) [2024-04-28 00:56:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8663121920. Throughput: 0: 58841.3. Samples: 1568339320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:39,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 00:56:40,224][54818] Updated weights for policy 0, policy_version 528758 (0.0027) [2024-04-28 00:56:42,998][54818] Updated weights for policy 0, policy_version 528768 (0.0026) [2024-04-28 00:56:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8663416832. Throughput: 0: 59344.5. Samples: 1568526420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 00:56:45,814][54818] Updated weights for policy 0, policy_version 528778 (0.0025) [2024-04-28 00:56:48,419][54818] Updated weights for policy 0, policy_version 528788 (0.0026) [2024-04-28 00:56:49,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8663695360. Throughput: 0: 59246.0. Samples: 1568881660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:49,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 00:56:49,260][54587] No heartbeat for components: RolloutWorker_w4 (12757 seconds) [2024-04-28 00:56:51,551][54818] Updated weights for policy 0, policy_version 528798 (0.0026) [2024-04-28 00:56:53,970][54818] Updated weights for policy 0, policy_version 528808 (0.0026) [2024-04-28 00:56:54,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59528.6, 300 sec: 58982.4). Total num frames: 8663990272. Throughput: 0: 59153.2. Samples: 1569232600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:54,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 00:56:54,298][54798] Signal inference workers to stop experience collection... (25100 times) [2024-04-28 00:56:54,343][54818] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-04-28 00:56:54,353][54798] Signal inference workers to resume experience collection... (25100 times) [2024-04-28 00:56:54,359][54818] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-04-28 00:56:57,083][54818] Updated weights for policy 0, policy_version 528818 (0.0025) [2024-04-28 00:56:59,253][54587] Fps is (10 sec: 60621.0, 60 sec: 59801.4, 300 sec: 59093.5). Total num frames: 8664301568. Throughput: 0: 59494.1. Samples: 1569418420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:56:59,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 00:56:59,568][54818] Updated weights for policy 0, policy_version 528828 (0.0026) [2024-04-28 00:57:02,657][54818] Updated weights for policy 0, policy_version 528838 (0.0025) [2024-04-28 00:57:04,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8664580096. Throughput: 0: 59568.1. Samples: 1569773780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:57:04,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 00:57:05,138][54818] Updated weights for policy 0, policy_version 528848 (0.0026) [2024-04-28 00:57:08,156][54818] Updated weights for policy 0, policy_version 528858 (0.0026) [2024-04-28 00:57:09,253][54587] Fps is (10 sec: 57344.6, 60 sec: 59801.5, 300 sec: 59093.5). Total num frames: 8664875008. Throughput: 0: 59519.0. Samples: 1570128420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:57:09,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 00:57:10,559][54818] Updated weights for policy 0, policy_version 528868 (0.0026) [2024-04-28 00:57:13,653][54818] Updated weights for policy 0, policy_version 528878 (0.0031) [2024-04-28 00:57:14,253][54587] Fps is (10 sec: 58981.5, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8665169920. Throughput: 0: 59101.3. Samples: 1570295020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:57:14,254][54587] Avg episode reward: [(0, '0.479')] [2024-04-28 00:57:16,294][54818] Updated weights for policy 0, policy_version 528888 (0.0027) [2024-04-28 00:57:19,242][54818] Updated weights for policy 0, policy_version 528898 (0.0023) [2024-04-28 00:57:19,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8665464832. Throughput: 0: 59174.1. Samples: 1570652780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:57:19,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 00:57:21,765][54818] Updated weights for policy 0, policy_version 528908 (0.0026) [2024-04-28 00:57:24,253][54587] Fps is (10 sec: 57344.3, 60 sec: 58982.2, 300 sec: 59093.5). Total num frames: 8665743360. Throughput: 0: 59407.2. Samples: 1571012640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 00:57:24,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 00:57:24,799][54818] Updated weights for policy 0, policy_version 528918 (0.0026) [2024-04-28 00:57:27,406][54818] Updated weights for policy 0, policy_version 528928 (0.0026) [2024-04-28 00:57:29,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8666038272. Throughput: 0: 58984.3. Samples: 1571180720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:57:29,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:57:30,492][54818] Updated weights for policy 0, policy_version 528938 (0.0026) [2024-04-28 00:57:32,945][54818] Updated weights for policy 0, policy_version 528948 (0.0025) [2024-04-28 00:57:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8666333184. Throughput: 0: 59067.3. Samples: 1571539680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:57:34,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 00:57:35,806][54818] Updated weights for policy 0, policy_version 528958 (0.0026) [2024-04-28 00:57:38,327][54818] Updated weights for policy 0, policy_version 528968 (0.0025) [2024-04-28 00:57:39,253][54587] Fps is (10 sec: 60620.4, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8666644480. Throughput: 0: 59118.1. Samples: 1571892920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:57:39,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 00:57:41,273][54818] Updated weights for policy 0, policy_version 528978 (0.0026) [2024-04-28 00:57:43,839][54818] Updated weights for policy 0, policy_version 528988 (0.0026) [2024-04-28 00:57:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 58709.2, 300 sec: 59093.5). Total num frames: 8666939392. Throughput: 0: 58918.3. Samples: 1572069740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:57:44,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 00:57:46,829][54818] Updated weights for policy 0, policy_version 528998 (0.0025) [2024-04-28 00:57:49,253][54587] Fps is (10 sec: 58983.4, 60 sec: 58982.6, 300 sec: 59093.5). Total num frames: 8667234304. Throughput: 0: 58842.8. Samples: 1572421700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:57:49,253][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 00:57:49,346][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000529007_8667250688.pth... [2024-04-28 00:57:49,399][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000528140_8653045760.pth [2024-04-28 00:57:49,857][54818] Updated weights for policy 0, policy_version 529008 (0.0026) [2024-04-28 00:57:52,416][54818] Updated weights for policy 0, policy_version 529018 (0.0026) [2024-04-28 00:57:54,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.4, 300 sec: 59038.0). Total num frames: 8667529216. Throughput: 0: 58802.7. Samples: 1572774540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:57:54,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-28 00:57:55,402][54818] Updated weights for policy 0, policy_version 529028 (0.0025) [2024-04-28 00:57:57,991][54818] Updated weights for policy 0, policy_version 529038 (0.0026) [2024-04-28 00:57:59,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.5, 300 sec: 59037.9). Total num frames: 8667824128. Throughput: 0: 59191.8. Samples: 1572958640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:57:59,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 00:58:00,877][54818] Updated weights for policy 0, policy_version 529048 (0.0026) [2024-04-28 00:58:03,086][54798] Signal inference workers to stop experience collection... (25150 times) [2024-04-28 00:58:03,086][54798] Signal inference workers to resume experience collection... 
(25150 times) [2024-04-28 00:58:03,097][54818] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-04-28 00:58:03,116][54818] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-04-28 00:58:03,472][54818] Updated weights for policy 0, policy_version 529058 (0.0026) [2024-04-28 00:58:04,254][54587] Fps is (10 sec: 60616.7, 60 sec: 59254.8, 300 sec: 59093.4). Total num frames: 8668135424. Throughput: 0: 59141.9. Samples: 1573314200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:04,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 00:58:06,578][54818] Updated weights for policy 0, policy_version 529068 (0.0026) [2024-04-28 00:58:09,028][54818] Updated weights for policy 0, policy_version 529078 (0.0026) [2024-04-28 00:58:09,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8668413952. Throughput: 0: 58911.5. Samples: 1573663660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:09,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 00:58:11,893][54818] Updated weights for policy 0, policy_version 529088 (0.0025) [2024-04-28 00:58:14,253][54587] Fps is (10 sec: 58986.1, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8668725248. Throughput: 0: 59240.0. Samples: 1573846520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:14,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 00:58:14,543][54818] Updated weights for policy 0, policy_version 529098 (0.0026) [2024-04-28 00:58:17,457][54818] Updated weights for policy 0, policy_version 529108 (0.0030) [2024-04-28 00:58:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.5, 300 sec: 59093.4). Total num frames: 8669020160. Throughput: 0: 59191.9. Samples: 1574203320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:19,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 00:58:19,974][54818] Updated weights for policy 0, policy_version 529118 (0.0025) [2024-04-28 00:58:23,060][54818] Updated weights for policy 0, policy_version 529128 (0.0026) [2024-04-28 00:58:24,253][54587] Fps is (10 sec: 57344.6, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8669298688. Throughput: 0: 59123.7. Samples: 1574553480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:24,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 00:58:25,422][54818] Updated weights for policy 0, policy_version 529138 (0.0025) [2024-04-28 00:58:28,680][54818] Updated weights for policy 0, policy_version 529148 (0.0026) [2024-04-28 00:58:29,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.4, 300 sec: 59093.4). Total num frames: 8669593600. Throughput: 0: 59080.4. Samples: 1574728360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:29,254][54587] Avg episode reward: [(0, '0.706')] [2024-04-28 00:58:31,020][54818] Updated weights for policy 0, policy_version 529158 (0.0026) [2024-04-28 00:58:33,991][54818] Updated weights for policy 0, policy_version 529168 (0.0026) [2024-04-28 00:58:34,253][54587] Fps is (10 sec: 58981.5, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8669888512. Throughput: 0: 59168.6. Samples: 1575084300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:34,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 00:58:36,634][54818] Updated weights for policy 0, policy_version 529178 (0.0026) [2024-04-28 00:58:39,253][54587] Fps is (10 sec: 58983.5, 60 sec: 58982.6, 300 sec: 59149.1). Total num frames: 8670183424. Throughput: 0: 59359.7. Samples: 1575445720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:39,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 00:58:39,449][54818] Updated weights for policy 0, policy_version 529188 (0.0026) [2024-04-28 00:58:42,050][54818] Updated weights for policy 0, policy_version 529198 (0.0025) [2024-04-28 00:58:44,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8670478336. Throughput: 0: 59126.1. Samples: 1575619320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:44,254][54587] Avg episode reward: [(0, '0.677')] [2024-04-28 00:58:44,948][54818] Updated weights for policy 0, policy_version 529208 (0.0026) [2024-04-28 00:58:47,594][54818] Updated weights for policy 0, policy_version 529218 (0.0025) [2024-04-28 00:58:49,253][54587] Fps is (10 sec: 58981.4, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8670773248. Throughput: 0: 59129.3. Samples: 1575974980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:49,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 00:58:50,561][54818] Updated weights for policy 0, policy_version 529228 (0.0025) [2024-04-28 00:58:53,211][54818] Updated weights for policy 0, policy_version 529238 (0.0026) [2024-04-28 00:58:54,253][54587] Fps is (10 sec: 60621.6, 60 sec: 59255.5, 300 sec: 59149.1). Total num frames: 8671084544. Throughput: 0: 59359.8. Samples: 1576334840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 00:58:54,253][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 00:58:56,280][54818] Updated weights for policy 0, policy_version 529248 (0.0025) [2024-04-28 00:58:58,681][54818] Updated weights for policy 0, policy_version 529258 (0.0026) [2024-04-28 00:58:59,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59255.3, 300 sec: 59093.4). Total num frames: 8671379456. Throughput: 0: 59192.8. Samples: 1576510200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:58:59,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 00:59:01,842][54818] Updated weights for policy 0, policy_version 529268 (0.0025) [2024-04-28 00:59:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58983.1, 300 sec: 59038.0). Total num frames: 8671674368. Throughput: 0: 59112.2. Samples: 1576863360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:04,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 00:59:04,470][54818] Updated weights for policy 0, policy_version 529278 (0.0025) [2024-04-28 00:59:06,379][54798] Signal inference workers to stop experience collection... (25200 times) [2024-04-28 00:59:06,380][54798] Signal inference workers to resume experience collection... (25200 times) [2024-04-28 00:59:06,404][54818] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-04-28 00:59:06,405][54818] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-04-28 00:59:07,222][54818] Updated weights for policy 0, policy_version 529288 (0.0026) [2024-04-28 00:59:09,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.5, 300 sec: 58982.4). Total num frames: 8671969280. Throughput: 0: 59337.1. Samples: 1577223660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:09,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 00:59:09,993][54818] Updated weights for policy 0, policy_version 529298 (0.0027) [2024-04-28 00:59:12,693][54818] Updated weights for policy 0, policy_version 529308 (0.0026) [2024-04-28 00:59:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8672280576. Throughput: 0: 59330.9. Samples: 1577398240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:14,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 00:59:15,531][54818] Updated weights for policy 0, policy_version 529318 (0.0026) [2024-04-28 00:59:18,130][54818] Updated weights for policy 0, policy_version 529328 (0.0026) [2024-04-28 00:59:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8672575488. Throughput: 0: 59433.7. Samples: 1577758820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:19,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 00:59:20,887][54818] Updated weights for policy 0, policy_version 529338 (0.0026) [2024-04-28 00:59:23,649][54818] Updated weights for policy 0, policy_version 529348 (0.0027) [2024-04-28 00:59:24,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59528.4, 300 sec: 59093.5). Total num frames: 8672870400. Throughput: 0: 59157.5. Samples: 1578107820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:24,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 00:59:26,428][54818] Updated weights for policy 0, policy_version 529358 (0.0026) [2024-04-28 00:59:29,069][54818] Updated weights for policy 0, policy_version 529368 (0.0025) [2024-04-28 00:59:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59528.5, 300 sec: 59149.0). Total num frames: 8673165312. Throughput: 0: 59319.5. Samples: 1578288700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:29,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 00:59:32,007][54818] Updated weights for policy 0, policy_version 529378 (0.0025) [2024-04-28 00:59:34,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8673460224. Throughput: 0: 59222.2. Samples: 1578639980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:34,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 00:59:34,669][54818] Updated weights for policy 0, policy_version 529388 (0.0026) [2024-04-28 00:59:37,688][54818] Updated weights for policy 0, policy_version 529398 (0.0021) [2024-04-28 00:59:39,253][54587] Fps is (10 sec: 57344.9, 60 sec: 59255.4, 300 sec: 59038.0). Total num frames: 8673738752. Throughput: 0: 59106.2. Samples: 1578994620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:39,253][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 00:59:40,361][54818] Updated weights for policy 0, policy_version 529408 (0.0026) [2024-04-28 00:59:43,189][54818] Updated weights for policy 0, policy_version 529418 (0.0026) [2024-04-28 00:59:44,253][54587] Fps is (10 sec: 58983.0, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8674050048. Throughput: 0: 59237.5. Samples: 1579175880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:44,254][54587] Avg episode reward: [(0, '0.471')] [2024-04-28 00:59:45,920][54818] Updated weights for policy 0, policy_version 529428 (0.0025) [2024-04-28 00:59:48,708][54818] Updated weights for policy 0, policy_version 529438 (0.0026) [2024-04-28 00:59:49,253][54587] Fps is (10 sec: 60619.6, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8674344960. Throughput: 0: 59246.8. Samples: 1579529480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:49,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 00:59:49,265][54587] No heartbeat for components: RolloutWorker_w4 (12937 seconds) [2024-04-28 00:59:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000529440_8674344960.pth... 
[2024-04-28 00:59:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000528574_8660156416.pth [2024-04-28 00:59:51,392][54818] Updated weights for policy 0, policy_version 529448 (0.0024) [2024-04-28 00:59:54,253][54587] Fps is (10 sec: 57343.8, 60 sec: 58982.3, 300 sec: 59204.6). Total num frames: 8674623488. Throughput: 0: 59101.4. Samples: 1579883220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 00:59:54,286][54818] Updated weights for policy 0, policy_version 529458 (0.0026) [2024-04-28 00:59:56,889][54818] Updated weights for policy 0, policy_version 529468 (0.0027) [2024-04-28 00:59:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8674918400. Throughput: 0: 59114.4. Samples: 1580058400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 00:59:59,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 00:59:59,740][54818] Updated weights for policy 0, policy_version 529478 (0.0026) [2024-04-28 01:00:02,386][54818] Updated weights for policy 0, policy_version 529488 (0.0025) [2024-04-28 01:00:04,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8675213312. Throughput: 0: 58900.1. Samples: 1580409320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 01:00:04,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 01:00:05,252][54818] Updated weights for policy 0, policy_version 529498 (0.0027) [2024-04-28 01:00:06,199][54798] Signal inference workers to stop experience collection... (25250 times) [2024-04-28 01:00:06,199][54798] Signal inference workers to resume experience collection... 
(25250 times) [2024-04-28 01:00:06,212][54818] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-04-28 01:00:06,212][54818] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-04-28 01:00:07,855][54818] Updated weights for policy 0, policy_version 529508 (0.0025) [2024-04-28 01:00:09,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8675508224. Throughput: 0: 59225.4. Samples: 1580772960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 01:00:09,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 01:00:10,666][54818] Updated weights for policy 0, policy_version 529518 (0.0026) [2024-04-28 01:00:13,355][54818] Updated weights for policy 0, policy_version 529528 (0.0026) [2024-04-28 01:00:14,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8675803136. Throughput: 0: 59087.1. Samples: 1580947620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 01:00:14,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 01:00:16,289][54818] Updated weights for policy 0, policy_version 529538 (0.0026) [2024-04-28 01:00:18,845][54818] Updated weights for policy 0, policy_version 529548 (0.0026) [2024-04-28 01:00:19,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58982.4, 300 sec: 59037.9). Total num frames: 8676114432. Throughput: 0: 59123.5. Samples: 1581300540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 01:00:19,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 01:00:21,844][54818] Updated weights for policy 0, policy_version 529558 (0.0026) [2024-04-28 01:00:24,253][54587] Fps is (10 sec: 60621.7, 60 sec: 58982.5, 300 sec: 59037.9). Total num frames: 8676409344. Throughput: 0: 59088.0. Samples: 1581653580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 01:00:24,253][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 01:00:24,971][54818] Updated weights for policy 0, policy_version 529568 (0.0026) [2024-04-28 01:00:27,431][54818] Updated weights for policy 0, policy_version 529578 (0.0025) [2024-04-28 01:00:29,253][54587] Fps is (10 sec: 57344.7, 60 sec: 58709.4, 300 sec: 59037.9). Total num frames: 8676687872. Throughput: 0: 58961.8. Samples: 1581829160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:00:29,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 01:00:30,577][54818] Updated weights for policy 0, policy_version 529588 (0.0028) [2024-04-28 01:00:33,170][54818] Updated weights for policy 0, policy_version 529598 (0.0026) [2024-04-28 01:00:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.5, 300 sec: 59149.1). Total num frames: 8676999168. Throughput: 0: 58984.7. Samples: 1582183780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:00:34,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 01:00:36,006][54818] Updated weights for policy 0, policy_version 529608 (0.0026) [2024-04-28 01:00:38,581][54818] Updated weights for policy 0, policy_version 529618 (0.0026) [2024-04-28 01:00:39,254][54587] Fps is (10 sec: 60619.5, 60 sec: 59255.2, 300 sec: 59093.4). Total num frames: 8677294080. Throughput: 0: 58935.3. Samples: 1582535320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:00:39,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 01:00:41,552][54818] Updated weights for policy 0, policy_version 529628 (0.0025) [2024-04-28 01:00:44,052][54818] Updated weights for policy 0, policy_version 529638 (0.0026) [2024-04-28 01:00:44,253][54587] Fps is (10 sec: 58981.9, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8677588992. Throughput: 0: 58963.7. Samples: 1582711760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:00:44,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 01:00:47,059][54818] Updated weights for policy 0, policy_version 529648 (0.0026) [2024-04-28 01:00:49,253][54587] Fps is (10 sec: 58984.0, 60 sec: 58982.6, 300 sec: 59204.6). Total num frames: 8677883904. Throughput: 0: 59079.3. Samples: 1583067880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:00:49,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 01:00:49,615][54818] Updated weights for policy 0, policy_version 529658 (0.0025) [2024-04-28 01:00:52,489][54818] Updated weights for policy 0, policy_version 529668 (0.0026) [2024-04-28 01:00:54,253][54587] Fps is (10 sec: 58981.9, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8678178816. Throughput: 0: 58816.8. Samples: 1583419720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:00:54,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:00:55,099][54818] Updated weights for policy 0, policy_version 529678 (0.0026) [2024-04-28 01:00:57,898][54818] Updated weights for policy 0, policy_version 529688 (0.0025) [2024-04-28 01:00:59,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59255.6, 300 sec: 59204.5). Total num frames: 8678473728. Throughput: 0: 58848.5. Samples: 1583595800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:00:59,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 01:01:00,917][54818] Updated weights for policy 0, policy_version 529698 (0.0030) [2024-04-28 01:01:02,966][54798] Signal inference workers to stop experience collection... (25300 times) [2024-04-28 01:01:02,966][54798] Signal inference workers to resume experience collection... 
(25300 times) [2024-04-28 01:01:02,979][54818] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-04-28 01:01:02,979][54818] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-04-28 01:01:03,520][54818] Updated weights for policy 0, policy_version 529708 (0.0026) [2024-04-28 01:01:04,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58982.5, 300 sec: 59204.5). Total num frames: 8678752256. Throughput: 0: 59096.6. Samples: 1583959880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:04,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 01:01:06,448][54818] Updated weights for policy 0, policy_version 529718 (0.0026) [2024-04-28 01:01:09,221][54818] Updated weights for policy 0, policy_version 529728 (0.0026) [2024-04-28 01:01:09,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8679063552. Throughput: 0: 58940.2. Samples: 1584305900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:09,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 01:01:11,967][54818] Updated weights for policy 0, policy_version 529738 (0.0025) [2024-04-28 01:01:14,253][54587] Fps is (10 sec: 62258.6, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8679374848. Throughput: 0: 58975.5. Samples: 1584483060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:14,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:01:14,872][54818] Updated weights for policy 0, policy_version 529748 (0.0026) [2024-04-28 01:01:17,773][54818] Updated weights for policy 0, policy_version 529758 (0.0027) [2024-04-28 01:01:19,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8679653376. Throughput: 0: 59113.1. Samples: 1584843880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:19,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 01:01:20,425][54818] Updated weights for policy 0, policy_version 529768 (0.0026) [2024-04-28 01:01:23,293][54818] Updated weights for policy 0, policy_version 529778 (0.0025) [2024-04-28 01:01:24,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8679948288. Throughput: 0: 59072.2. Samples: 1585193560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 01:01:25,823][54818] Updated weights for policy 0, policy_version 529788 (0.0026) [2024-04-28 01:01:28,874][54818] Updated weights for policy 0, policy_version 529798 (0.0026) [2024-04-28 01:01:29,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.4, 300 sec: 59093.5). Total num frames: 8680243200. Throughput: 0: 59044.3. Samples: 1585368760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:29,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 01:01:31,352][54818] Updated weights for policy 0, policy_version 529808 (0.0025) [2024-04-28 01:01:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8680521728. Throughput: 0: 59034.6. Samples: 1585724440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:34,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 01:01:34,422][54818] Updated weights for policy 0, policy_version 529818 (0.0030) [2024-04-28 01:01:36,815][54818] Updated weights for policy 0, policy_version 529828 (0.0026) [2024-04-28 01:01:39,253][54587] Fps is (10 sec: 57344.8, 60 sec: 58709.6, 300 sec: 58982.4). Total num frames: 8680816640. Throughput: 0: 58994.9. Samples: 1586074480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:39,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 01:01:39,945][54818] Updated weights for policy 0, policy_version 529838 (0.0026) [2024-04-28 01:01:42,317][54818] Updated weights for policy 0, policy_version 529848 (0.0024) [2024-04-28 01:01:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.4, 300 sec: 59038.0). Total num frames: 8681111552. Throughput: 0: 59049.8. Samples: 1586253040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:44,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 01:01:45,511][54818] Updated weights for policy 0, policy_version 529858 (0.0021) [2024-04-28 01:01:47,934][54818] Updated weights for policy 0, policy_version 529868 (0.0022) [2024-04-28 01:01:49,253][54587] Fps is (10 sec: 60619.8, 60 sec: 58982.2, 300 sec: 59093.5). Total num frames: 8681422848. Throughput: 0: 58647.8. Samples: 1586599040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 01:01:49,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 01:01:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000529872_8681422848.pth... [2024-04-28 01:01:49,322][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000529007_8667250688.pth [2024-04-28 01:01:50,755][54798] Signal inference workers to stop experience collection... (25350 times) [2024-04-28 01:01:50,755][54798] Signal inference workers to resume experience collection... 
(25350 times) [2024-04-28 01:01:50,763][54818] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-04-28 01:01:50,764][54818] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-04-28 01:01:51,010][54818] Updated weights for policy 0, policy_version 529878 (0.0026) [2024-04-28 01:01:53,530][54818] Updated weights for policy 0, policy_version 529888 (0.0026) [2024-04-28 01:01:54,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.4, 300 sec: 58982.4). Total num frames: 8681701376. Throughput: 0: 58740.2. Samples: 1586949200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:01:54,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 01:01:56,474][54818] Updated weights for policy 0, policy_version 529898 (0.0025) [2024-04-28 01:01:59,253][54587] Fps is (10 sec: 57344.1, 60 sec: 58709.2, 300 sec: 59037.9). Total num frames: 8681996288. Throughput: 0: 58868.4. Samples: 1587132140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:01:59,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 01:01:59,459][54818] Updated weights for policy 0, policy_version 529908 (0.0026) [2024-04-28 01:02:02,109][54818] Updated weights for policy 0, policy_version 529918 (0.0026) [2024-04-28 01:02:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59255.3, 300 sec: 59093.5). Total num frames: 8682307584. Throughput: 0: 58625.8. Samples: 1587482040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:04,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 01:02:04,921][54818] Updated weights for policy 0, policy_version 529928 (0.0025) [2024-04-28 01:02:07,597][54818] Updated weights for policy 0, policy_version 529938 (0.0026) [2024-04-28 01:02:09,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58709.4, 300 sec: 59038.0). Total num frames: 8682586112. Throughput: 0: 58704.5. Samples: 1587835260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:09,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 01:02:10,526][54818] Updated weights for policy 0, policy_version 529948 (0.0026) [2024-04-28 01:02:12,982][54818] Updated weights for policy 0, policy_version 529958 (0.0026) [2024-04-28 01:02:14,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58709.3, 300 sec: 59093.5). Total num frames: 8682897408. Throughput: 0: 58978.7. Samples: 1588022800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:14,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 01:02:16,022][54818] Updated weights for policy 0, policy_version 529968 (0.0028) [2024-04-28 01:02:18,344][54818] Updated weights for policy 0, policy_version 529978 (0.0025) [2024-04-28 01:02:19,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8683175936. Throughput: 0: 58915.9. Samples: 1588375660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:19,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 01:02:21,532][54818] Updated weights for policy 0, policy_version 529988 (0.0026) [2024-04-28 01:02:24,013][54818] Updated weights for policy 0, policy_version 529998 (0.0025) [2024-04-28 01:02:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8683487232. Throughput: 0: 58834.1. Samples: 1588722020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:24,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 01:02:27,098][54818] Updated weights for policy 0, policy_version 530008 (0.0030) [2024-04-28 01:02:29,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58709.4, 300 sec: 59093.5). Total num frames: 8683765760. Throughput: 0: 58833.7. Samples: 1588900560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:29,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 01:02:29,860][54818] Updated weights for policy 0, policy_version 530018 (0.0026) [2024-04-28 01:02:32,909][54818] Updated weights for policy 0, policy_version 530028 (0.0026) [2024-04-28 01:02:34,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8684077056. Throughput: 0: 59179.3. Samples: 1589262100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:34,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 01:02:35,351][54818] Updated weights for policy 0, policy_version 530038 (0.0025) [2024-04-28 01:02:37,110][54798] Signal inference workers to stop experience collection... (25400 times) [2024-04-28 01:02:37,143][54818] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-04-28 01:02:37,197][54798] Signal inference workers to resume experience collection... (25400 times) [2024-04-28 01:02:37,198][54818] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-04-28 01:02:38,435][54818] Updated weights for policy 0, policy_version 530048 (0.0025) [2024-04-28 01:02:39,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8684355584. Throughput: 0: 59354.1. Samples: 1589620140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:39,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 01:02:40,897][54818] Updated weights for policy 0, policy_version 530058 (0.0025) [2024-04-28 01:02:44,100][54818] Updated weights for policy 0, policy_version 530068 (0.0026) [2024-04-28 01:02:44,253][54587] Fps is (10 sec: 55705.3, 60 sec: 58709.3, 300 sec: 58982.4). Total num frames: 8684634112. Throughput: 0: 58905.5. Samples: 1589782880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:44,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:02:46,294][54818] Updated weights for policy 0, policy_version 530078 (0.0026) [2024-04-28 01:02:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58436.3, 300 sec: 58982.4). Total num frames: 8684929024. Throughput: 0: 59241.4. Samples: 1590147900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:49,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 01:02:49,262][54587] No heartbeat for components: RolloutWorker_w4 (13117 seconds) [2024-04-28 01:02:49,468][54818] Updated weights for policy 0, policy_version 530088 (0.0024) [2024-04-28 01:02:51,646][54818] Updated weights for policy 0, policy_version 530098 (0.0030) [2024-04-28 01:02:54,253][54587] Fps is (10 sec: 60620.6, 60 sec: 58982.3, 300 sec: 59037.9). Total num frames: 8685240320. Throughput: 0: 59091.1. Samples: 1590494360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:54,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 01:02:55,177][54818] Updated weights for policy 0, policy_version 530108 (0.0026) [2024-04-28 01:02:57,244][54818] Updated weights for policy 0, policy_version 530118 (0.0026) [2024-04-28 01:02:59,253][54587] Fps is (10 sec: 62258.5, 60 sec: 59255.4, 300 sec: 59038.0). Total num frames: 8685551616. Throughput: 0: 58987.0. Samples: 1590677220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:02:59,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 01:03:00,610][54818] Updated weights for policy 0, policy_version 530128 (0.0025) [2024-04-28 01:03:02,824][54818] Updated weights for policy 0, policy_version 530138 (0.0026) [2024-04-28 01:03:04,253][54587] Fps is (10 sec: 62259.9, 60 sec: 59255.6, 300 sec: 59149.1). Total num frames: 8685862912. Throughput: 0: 58910.4. Samples: 1591026620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:03:04,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 01:03:06,172][54818] Updated weights for policy 0, policy_version 530148 (0.0026) [2024-04-28 01:03:08,197][54818] Updated weights for policy 0, policy_version 530158 (0.0025) [2024-04-28 01:03:09,253][54587] Fps is (10 sec: 58983.1, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8686141440. Throughput: 0: 59221.8. Samples: 1591387000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:03:09,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-28 01:03:11,556][54818] Updated weights for policy 0, policy_version 530168 (0.0027) [2024-04-28 01:03:13,997][54818] Updated weights for policy 0, policy_version 530178 (0.0022) [2024-04-28 01:03:14,253][54587] Fps is (10 sec: 57343.5, 60 sec: 58982.5, 300 sec: 59038.0). Total num frames: 8686436352. Throughput: 0: 59315.1. Samples: 1591569740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:03:14,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 01:03:17,074][54818] Updated weights for policy 0, policy_version 530188 (0.0025) [2024-04-28 01:03:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8686747648. Throughput: 0: 59161.7. Samples: 1591924380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 01:03:19,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 01:03:19,441][54818] Updated weights for policy 0, policy_version 530198 (0.0025) [2024-04-28 01:03:20,350][54798] Signal inference workers to stop experience collection... (25450 times) [2024-04-28 01:03:20,353][54798] Signal inference workers to resume experience collection... 
(25450 times) [2024-04-28 01:03:20,378][54818] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-04-28 01:03:20,378][54818] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-04-28 01:03:22,475][54818] Updated weights for policy 0, policy_version 530208 (0.0026) [2024-04-28 01:03:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8687042560. Throughput: 0: 59107.7. Samples: 1592279980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:03:24,253][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 01:03:25,042][54818] Updated weights for policy 0, policy_version 530218 (0.0026) [2024-04-28 01:03:27,984][54818] Updated weights for policy 0, policy_version 530228 (0.0026) [2024-04-28 01:03:29,253][54587] Fps is (10 sec: 58981.6, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8687337472. Throughput: 0: 59517.1. Samples: 1592461160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:03:29,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 01:03:30,693][54818] Updated weights for policy 0, policy_version 530238 (0.0025) [2024-04-28 01:03:33,612][54818] Updated weights for policy 0, policy_version 530248 (0.0025) [2024-04-28 01:03:34,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8687616000. Throughput: 0: 59281.9. Samples: 1592815580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:03:34,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 01:03:36,164][54818] Updated weights for policy 0, policy_version 530258 (0.0024) [2024-04-28 01:03:38,976][54818] Updated weights for policy 0, policy_version 530268 (0.0026) [2024-04-28 01:03:39,253][54587] Fps is (10 sec: 57344.9, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8687910912. Throughput: 0: 59484.0. Samples: 1593171140. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:03:39,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 01:03:41,568][54818] Updated weights for policy 0, policy_version 530278 (0.0026) [2024-04-28 01:03:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59801.6, 300 sec: 59149.0). Total num frames: 8688222208. Throughput: 0: 59247.8. Samples: 1593343360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:03:44,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 01:03:44,506][54818] Updated weights for policy 0, policy_version 530288 (0.0025) [2024-04-28 01:03:47,020][54818] Updated weights for policy 0, policy_version 530298 (0.0025) [2024-04-28 01:03:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 59801.6, 300 sec: 59093.4). Total num frames: 8688517120. Throughput: 0: 59486.0. Samples: 1593703500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:03:49,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 01:03:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000530305_8688517120.pth... [2024-04-28 01:03:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000529440_8674344960.pth [2024-04-28 01:03:50,095][54818] Updated weights for policy 0, policy_version 530308 (0.0026) [2024-04-28 01:03:52,457][54818] Updated weights for policy 0, policy_version 530318 (0.0026) [2024-04-28 01:03:54,253][54587] Fps is (10 sec: 57343.7, 60 sec: 59255.5, 300 sec: 59038.0). Total num frames: 8688795648. Throughput: 0: 59347.1. Samples: 1594057620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:03:54,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 01:03:55,597][54818] Updated weights for policy 0, policy_version 530328 (0.0026) [2024-04-28 01:03:58,215][54818] Updated weights for policy 0, policy_version 530338 (0.0026) [2024-04-28 01:03:59,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 59093.4). 
Total num frames: 8689106944. Throughput: 0: 59267.5. Samples: 1594236780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:03:59,254][54587] Avg episode reward: [(0, '0.698')] [2024-04-28 01:04:01,005][54818] Updated weights for policy 0, policy_version 530348 (0.0026) [2024-04-28 01:04:03,715][54818] Updated weights for policy 0, policy_version 530358 (0.0026) [2024-04-28 01:04:04,253][54587] Fps is (10 sec: 62259.5, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8689418240. Throughput: 0: 59310.3. Samples: 1594593340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:04,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 01:04:06,652][54818] Updated weights for policy 0, policy_version 530368 (0.0025) [2024-04-28 01:04:09,253][54587] Fps is (10 sec: 58982.9, 60 sec: 59255.5, 300 sec: 59037.9). Total num frames: 8689696768. Throughput: 0: 59244.4. Samples: 1594945980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:09,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 01:04:09,351][54818] Updated weights for policy 0, policy_version 530378 (0.0025) [2024-04-28 01:04:12,002][54818] Updated weights for policy 0, policy_version 530388 (0.0026) [2024-04-28 01:04:14,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.5, 300 sec: 59038.0). Total num frames: 8689991680. Throughput: 0: 59267.4. Samples: 1595128180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:14,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 01:04:14,793][54818] Updated weights for policy 0, policy_version 530398 (0.0026) [2024-04-28 01:04:17,379][54818] Updated weights for policy 0, policy_version 530408 (0.0024) [2024-04-28 01:04:19,253][54587] Fps is (10 sec: 62259.8, 60 sec: 59528.7, 300 sec: 59149.0). Total num frames: 8690319360. Throughput: 0: 59346.3. Samples: 1595486160. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:19,253][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 01:04:19,835][54798] Signal inference workers to stop experience collection... (25500 times) [2024-04-28 01:04:19,879][54818] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-04-28 01:04:19,895][54798] Signal inference workers to resume experience collection... (25500 times) [2024-04-28 01:04:19,896][54818] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-04-28 01:04:20,138][54818] Updated weights for policy 0, policy_version 530418 (0.0024) [2024-04-28 01:04:23,030][54818] Updated weights for policy 0, policy_version 530428 (0.0025) [2024-04-28 01:04:24,253][54587] Fps is (10 sec: 60619.9, 60 sec: 59255.3, 300 sec: 59093.5). Total num frames: 8690597888. Throughput: 0: 59399.5. Samples: 1595844120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:24,262][54587] Avg episode reward: [(0, '0.692')] [2024-04-28 01:04:25,506][54818] Updated weights for policy 0, policy_version 530438 (0.0025) [2024-04-28 01:04:28,656][54818] Updated weights for policy 0, policy_version 530448 (0.0026) [2024-04-28 01:04:29,253][54587] Fps is (10 sec: 57343.1, 60 sec: 59255.6, 300 sec: 59093.5). Total num frames: 8690892800. Throughput: 0: 59573.2. Samples: 1596024160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:29,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 01:04:31,426][54818] Updated weights for policy 0, policy_version 530458 (0.0027) [2024-04-28 01:04:34,072][54818] Updated weights for policy 0, policy_version 530468 (0.0027) [2024-04-28 01:04:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59528.4, 300 sec: 59149.0). Total num frames: 8691187712. Throughput: 0: 59468.0. Samples: 1596379560. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:34,254][54587] Avg episode reward: [(0, '0.681')] [2024-04-28 01:04:36,905][54818] Updated weights for policy 0, policy_version 530478 (0.0026) [2024-04-28 01:04:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59528.5, 300 sec: 59093.5). Total num frames: 8691482624. Throughput: 0: 59307.1. Samples: 1596726440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:39,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 01:04:39,568][54818] Updated weights for policy 0, policy_version 530488 (0.0026) [2024-04-28 01:04:42,549][54818] Updated weights for policy 0, policy_version 530498 (0.0025) [2024-04-28 01:04:44,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8691777536. Throughput: 0: 59247.8. Samples: 1596902920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:44,253][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 01:04:45,180][54818] Updated weights for policy 0, policy_version 530508 (0.0026) [2024-04-28 01:04:48,139][54818] Updated weights for policy 0, policy_version 530518 (0.0027) [2024-04-28 01:04:49,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59528.7, 300 sec: 59204.6). Total num frames: 8692088832. Throughput: 0: 59520.5. Samples: 1597271760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 01:04:49,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 01:04:50,728][54818] Updated weights for policy 0, policy_version 530528 (0.0023) [2024-04-28 01:04:53,556][54818] Updated weights for policy 0, policy_version 530538 (0.0025) [2024-04-28 01:04:54,253][54587] Fps is (10 sec: 58982.2, 60 sec: 59528.6, 300 sec: 59149.0). Total num frames: 8692367360. Throughput: 0: 59521.8. Samples: 1597624460. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:04:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 01:04:56,241][54818] Updated weights for policy 0, policy_version 530548 (0.0026) [2024-04-28 01:04:59,130][54818] Updated weights for policy 0, policy_version 530558 (0.0025) [2024-04-28 01:04:59,253][54587] Fps is (10 sec: 57343.3, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8692662272. Throughput: 0: 59536.3. Samples: 1597807320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:04:59,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 01:05:01,898][54818] Updated weights for policy 0, policy_version 530568 (0.0026) [2024-04-28 01:05:04,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8692957184. Throughput: 0: 59204.4. Samples: 1598150360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:04,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 01:05:04,700][54818] Updated weights for policy 0, policy_version 530578 (0.0028) [2024-04-28 01:05:07,381][54818] Updated weights for policy 0, policy_version 530588 (0.0025) [2024-04-28 01:05:09,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8693235712. Throughput: 0: 59248.1. Samples: 1598510280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:09,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 01:05:10,081][54818] Updated weights for policy 0, policy_version 530598 (0.0026) [2024-04-28 01:05:12,843][54798] Signal inference workers to stop experience collection... (25550 times) [2024-04-28 01:05:12,845][54798] Signal inference workers to resume experience collection... 
(25550 times) [2024-04-28 01:05:12,851][54818] Updated weights for policy 0, policy_version 530608 (0.0027) [2024-04-28 01:05:12,861][54818] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-04-28 01:05:12,861][54818] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-04-28 01:05:14,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8693547008. Throughput: 0: 59222.9. Samples: 1598689180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:14,253][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 01:05:15,661][54818] Updated weights for policy 0, policy_version 530618 (0.0026) [2024-04-28 01:05:18,284][54818] Updated weights for policy 0, policy_version 530628 (0.0026) [2024-04-28 01:05:19,253][54587] Fps is (10 sec: 63897.5, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8693874688. Throughput: 0: 59253.9. Samples: 1599045980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:19,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 01:05:21,049][54818] Updated weights for policy 0, policy_version 530638 (0.0025) [2024-04-28 01:05:23,677][54818] Updated weights for policy 0, policy_version 530648 (0.0025) [2024-04-28 01:05:24,253][54587] Fps is (10 sec: 62258.5, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8694169600. Throughput: 0: 59311.6. Samples: 1599395460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:24,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 01:05:26,627][54818] Updated weights for policy 0, policy_version 530658 (0.0026) [2024-04-28 01:05:29,043][54818] Updated weights for policy 0, policy_version 530668 (0.0027) [2024-04-28 01:05:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59528.6, 300 sec: 59204.5). Total num frames: 8694464512. Throughput: 0: 59403.5. Samples: 1599576080. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:29,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 01:05:32,167][54818] Updated weights for policy 0, policy_version 530678 (0.0025) [2024-04-28 01:05:34,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.6, 300 sec: 59149.1). Total num frames: 8694743040. Throughput: 0: 59053.8. Samples: 1599929180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:34,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 01:05:34,823][54818] Updated weights for policy 0, policy_version 530688 (0.0023) [2024-04-28 01:05:37,699][54818] Updated weights for policy 0, policy_version 530698 (0.0026) [2024-04-28 01:05:39,253][54587] Fps is (10 sec: 55705.5, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8695021568. Throughput: 0: 59157.3. Samples: 1600286540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:39,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 01:05:40,139][54818] Updated weights for policy 0, policy_version 530708 (0.0024) [2024-04-28 01:05:43,076][54818] Updated weights for policy 0, policy_version 530718 (0.0024) [2024-04-28 01:05:44,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59255.3, 300 sec: 59149.0). Total num frames: 8695332864. Throughput: 0: 59276.4. Samples: 1600474760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:44,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 01:05:45,611][54818] Updated weights for policy 0, policy_version 530728 (0.0025) [2024-04-28 01:05:48,506][54818] Updated weights for policy 0, policy_version 530738 (0.0026) [2024-04-28 01:05:49,253][54587] Fps is (10 sec: 60620.1, 60 sec: 58982.2, 300 sec: 59149.0). Total num frames: 8695627776. Throughput: 0: 59410.9. Samples: 1600823860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:49,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:05:49,263][54587] No heartbeat for components: RolloutWorker_w4 (13297 seconds) [2024-04-28 01:05:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000530739_8695627776.pth... [2024-04-28 01:05:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000529872_8681422848.pth [2024-04-28 01:05:51,687][54818] Updated weights for policy 0, policy_version 530748 (0.0025) [2024-04-28 01:05:53,500][54798] Signal inference workers to stop experience collection... (25600 times) [2024-04-28 01:05:53,534][54818] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-04-28 01:05:53,590][54798] Signal inference workers to resume experience collection... (25600 times) [2024-04-28 01:05:53,590][54818] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-04-28 01:05:54,153][54818] Updated weights for policy 0, policy_version 530758 (0.0024) [2024-04-28 01:05:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8695939072. Throughput: 0: 59176.7. Samples: 1601173240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:54,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 01:05:57,359][54818] Updated weights for policy 0, policy_version 530768 (0.0025) [2024-04-28 01:05:59,253][54587] Fps is (10 sec: 60621.9, 60 sec: 59528.7, 300 sec: 59260.1). Total num frames: 8696233984. Throughput: 0: 59034.6. Samples: 1601345740. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:05:59,253][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 01:05:59,785][54818] Updated weights for policy 0, policy_version 530778 (0.0025) [2024-04-28 01:06:03,077][54818] Updated weights for policy 0, policy_version 530788 (0.0026) [2024-04-28 01:06:04,253][54587] Fps is (10 sec: 57344.5, 60 sec: 59255.4, 300 sec: 59149.0). Total num frames: 8696512512. Throughput: 0: 59162.2. Samples: 1601708280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:06:04,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:06:05,551][54818] Updated weights for policy 0, policy_version 530798 (0.0024) [2024-04-28 01:06:08,616][54818] Updated weights for policy 0, policy_version 530808 (0.0024) [2024-04-28 01:06:09,253][54587] Fps is (10 sec: 55704.8, 60 sec: 59255.4, 300 sec: 59037.9). Total num frames: 8696791040. Throughput: 0: 59236.8. Samples: 1602061120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:06:09,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 01:06:11,118][54818] Updated weights for policy 0, policy_version 530818 (0.0026) [2024-04-28 01:06:14,066][54818] Updated weights for policy 0, policy_version 530828 (0.0027) [2024-04-28 01:06:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.3, 300 sec: 59093.5). Total num frames: 8697085952. Throughput: 0: 59054.2. Samples: 1602233520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:06:14,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-28 01:06:16,474][54818] Updated weights for policy 0, policy_version 530838 (0.0025) [2024-04-28 01:06:19,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58436.2, 300 sec: 59093.5). Total num frames: 8697380864. Throughput: 0: 58927.0. Samples: 1602580900. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-04-28 01:06:19,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 01:06:19,669][54818] Updated weights for policy 0, policy_version 530848 (0.0028) [2024-04-28 01:06:22,368][54818] Updated weights for policy 0, policy_version 530858 (0.0023) [2024-04-28 01:06:24,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58436.3, 300 sec: 59093.5). Total num frames: 8697675776. Throughput: 0: 59025.4. Samples: 1602942680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:06:24,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-28 01:06:25,176][54818] Updated weights for policy 0, policy_version 530868 (0.0028) [2024-04-28 01:06:27,719][54818] Updated weights for policy 0, policy_version 530878 (0.0023) [2024-04-28 01:06:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58709.3, 300 sec: 59204.6). Total num frames: 8697987072. Throughput: 0: 58717.9. Samples: 1603117060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:06:29,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 01:06:30,685][54818] Updated weights for policy 0, policy_version 530888 (0.0027) [2024-04-28 01:06:33,179][54818] Updated weights for policy 0, policy_version 530898 (0.0026) [2024-04-28 01:06:33,180][54798] Signal inference workers to stop experience collection... (25650 times) [2024-04-28 01:06:33,181][54798] Signal inference workers to resume experience collection... (25650 times) [2024-04-28 01:06:33,198][54818] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-04-28 01:06:33,198][54818] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-04-28 01:06:34,253][54587] Fps is (10 sec: 62258.7, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8698298368. Throughput: 0: 58958.8. Samples: 1603477000. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:06:34,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 01:06:36,216][54818] Updated weights for policy 0, policy_version 530908 (0.0026) [2024-04-28 01:06:38,592][54818] Updated weights for policy 0, policy_version 530918 (0.0025) [2024-04-28 01:06:39,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59528.4, 300 sec: 59260.1). Total num frames: 8698593280. Throughput: 0: 58916.4. Samples: 1603824480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:06:39,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 01:06:41,785][54818] Updated weights for policy 0, policy_version 530928 (0.0026) [2024-04-28 01:06:44,058][54818] Updated weights for policy 0, policy_version 530938 (0.0026) [2024-04-28 01:06:44,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59528.7, 300 sec: 59260.1). Total num frames: 8698904576. Throughput: 0: 59220.9. Samples: 1604010680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:06:44,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 01:06:47,319][54818] Updated weights for policy 0, policy_version 530948 (0.0025) [2024-04-28 01:06:49,253][54587] Fps is (10 sec: 58983.6, 60 sec: 59255.7, 300 sec: 59260.1). Total num frames: 8699183104. Throughput: 0: 58896.1. Samples: 1604358600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:06:49,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 01:06:49,418][54818] Updated weights for policy 0, policy_version 530958 (0.0026) [2024-04-28 01:06:52,780][54818] Updated weights for policy 0, policy_version 530968 (0.0026) [2024-04-28 01:06:54,253][54587] Fps is (10 sec: 55705.8, 60 sec: 58709.5, 300 sec: 59204.6). Total num frames: 8699461632. Throughput: 0: 58988.2. Samples: 1604715580. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:06:54,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 01:06:54,891][54818] Updated weights for policy 0, policy_version 530978 (0.0025) [2024-04-28 01:06:58,233][54818] Updated weights for policy 0, policy_version 530988 (0.0026) [2024-04-28 01:06:59,253][54587] Fps is (10 sec: 57343.6, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8699756544. Throughput: 0: 59260.5. Samples: 1604900240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:06:59,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:07:00,353][54818] Updated weights for policy 0, policy_version 530998 (0.0022) [2024-04-28 01:07:03,882][54818] Updated weights for policy 0, policy_version 531008 (0.0025) [2024-04-28 01:07:04,253][54587] Fps is (10 sec: 58981.4, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8700051456. Throughput: 0: 59335.4. Samples: 1605251000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:04,254][54587] Avg episode reward: [(0, '0.699')] [2024-04-28 01:07:06,240][54818] Updated weights for policy 0, policy_version 531018 (0.0024) [2024-04-28 01:07:09,218][54818] Updated weights for policy 0, policy_version 531028 (0.0026) [2024-04-28 01:07:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8700362752. Throughput: 0: 59212.0. Samples: 1605607220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:09,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 01:07:11,574][54818] Updated weights for policy 0, policy_version 531038 (0.0023) [2024-04-28 01:07:14,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8700641280. Throughput: 0: 58971.0. Samples: 1605770760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:14,254][54587] Avg episode reward: [(0, '0.493')] [2024-04-28 01:07:14,401][54798] Signal inference workers to stop experience collection... (25700 times) [2024-04-28 01:07:14,401][54798] Signal inference workers to resume experience collection... (25700 times) [2024-04-28 01:07:14,413][54818] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-04-28 01:07:14,413][54818] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-04-28 01:07:14,777][54818] Updated weights for policy 0, policy_version 531048 (0.0026) [2024-04-28 01:07:17,402][54818] Updated weights for policy 0, policy_version 531058 (0.0023) [2024-04-28 01:07:19,253][54587] Fps is (10 sec: 57343.9, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8700936192. Throughput: 0: 59082.7. Samples: 1606135720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:19,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 01:07:20,292][54818] Updated weights for policy 0, policy_version 531068 (0.0024) [2024-04-28 01:07:23,080][54818] Updated weights for policy 0, policy_version 531078 (0.0025) [2024-04-28 01:07:24,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8701231104. Throughput: 0: 59442.7. Samples: 1606499400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:24,254][54587] Avg episode reward: [(0, '0.682')] [2024-04-28 01:07:25,938][54818] Updated weights for policy 0, policy_version 531088 (0.0026) [2024-04-28 01:07:28,863][54818] Updated weights for policy 0, policy_version 531098 (0.0026) [2024-04-28 01:07:29,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58709.2, 300 sec: 59093.4). Total num frames: 8701509632. Throughput: 0: 59012.2. Samples: 1606666240. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:29,254][54587] Avg episode reward: [(0, '0.714')] [2024-04-28 01:07:31,435][54818] Updated weights for policy 0, policy_version 531108 (0.0026) [2024-04-28 01:07:34,253][54587] Fps is (10 sec: 58983.2, 60 sec: 58709.5, 300 sec: 59204.6). Total num frames: 8701820928. Throughput: 0: 59045.3. Samples: 1607015640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:34,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 01:07:34,345][54818] Updated weights for policy 0, policy_version 531118 (0.0026) [2024-04-28 01:07:36,928][54818] Updated weights for policy 0, policy_version 531128 (0.0025) [2024-04-28 01:07:39,253][54587] Fps is (10 sec: 62260.6, 60 sec: 58982.6, 300 sec: 59315.7). Total num frames: 8702132224. Throughput: 0: 59165.4. Samples: 1607378020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:39,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 01:07:39,855][54818] Updated weights for policy 0, policy_version 531138 (0.0027) [2024-04-28 01:07:42,384][54818] Updated weights for policy 0, policy_version 531148 (0.0026) [2024-04-28 01:07:44,253][54587] Fps is (10 sec: 62258.4, 60 sec: 58982.3, 300 sec: 59371.2). Total num frames: 8702443520. Throughput: 0: 59127.9. Samples: 1607561000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:44,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 01:07:45,373][54818] Updated weights for policy 0, policy_version 531158 (0.0026) [2024-04-28 01:07:47,744][54818] Updated weights for policy 0, policy_version 531168 (0.0025) [2024-04-28 01:07:49,253][54587] Fps is (10 sec: 60619.5, 60 sec: 59255.2, 300 sec: 59315.6). Total num frames: 8702738432. Throughput: 0: 59174.2. Samples: 1607913840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 01:07:49,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:07:49,306][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000531174_8702754816.pth... [2024-04-28 01:07:49,356][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000530305_8688517120.pth [2024-04-28 01:07:50,760][54818] Updated weights for policy 0, policy_version 531178 (0.0025) [2024-04-28 01:07:53,123][54818] Updated weights for policy 0, policy_version 531188 (0.0026) [2024-04-28 01:07:54,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59801.5, 300 sec: 59315.7). Total num frames: 8703049728. Throughput: 0: 59022.2. Samples: 1608263220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:07:54,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:07:56,574][54818] Updated weights for policy 0, policy_version 531198 (0.0026) [2024-04-28 01:07:58,631][54818] Updated weights for policy 0, policy_version 531208 (0.0026) [2024-04-28 01:07:59,253][54587] Fps is (10 sec: 60621.4, 60 sec: 59801.5, 300 sec: 59260.1). Total num frames: 8703344640. Throughput: 0: 59681.4. Samples: 1608456420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:07:59,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 01:08:01,386][54798] Signal inference workers to stop experience collection... (25750 times) [2024-04-28 01:08:01,391][54798] Signal inference workers to resume experience collection... 
(25750 times) [2024-04-28 01:08:01,405][54818] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-04-28 01:08:01,405][54818] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-04-28 01:08:01,960][54818] Updated weights for policy 0, policy_version 531218 (0.0027) [2024-04-28 01:08:04,180][54818] Updated weights for policy 0, policy_version 531228 (0.0028) [2024-04-28 01:08:04,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59801.8, 300 sec: 59315.7). Total num frames: 8703639552. Throughput: 0: 59466.3. Samples: 1608811700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:04,253][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 01:08:07,517][54818] Updated weights for policy 0, policy_version 531238 (0.0026) [2024-04-28 01:08:09,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8703918080. Throughput: 0: 59254.8. Samples: 1609165860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:09,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 01:08:09,667][54818] Updated weights for policy 0, policy_version 531248 (0.0025) [2024-04-28 01:08:13,021][54818] Updated weights for policy 0, policy_version 531258 (0.0027) [2024-04-28 01:08:14,253][54587] Fps is (10 sec: 57343.5, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8704212992. Throughput: 0: 59564.1. Samples: 1609346620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:14,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 01:08:15,275][54818] Updated weights for policy 0, policy_version 531268 (0.0026) [2024-04-28 01:08:18,440][54818] Updated weights for policy 0, policy_version 531278 (0.0026) [2024-04-28 01:08:19,253][54587] Fps is (10 sec: 58982.6, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8704507904. Throughput: 0: 59543.5. Samples: 1609695100. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:19,253][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 01:08:20,790][54818] Updated weights for policy 0, policy_version 531288 (0.0026) [2024-04-28 01:08:23,991][54818] Updated weights for policy 0, policy_version 531298 (0.0024) [2024-04-28 01:08:24,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59255.5, 300 sec: 59149.1). Total num frames: 8704786432. Throughput: 0: 59430.5. Samples: 1610052400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:24,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:08:26,408][54818] Updated weights for policy 0, policy_version 531308 (0.0025) [2024-04-28 01:08:29,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59528.7, 300 sec: 59204.5). Total num frames: 8705081344. Throughput: 0: 59144.5. Samples: 1610222500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:29,254][54587] Avg episode reward: [(0, '0.672')] [2024-04-28 01:08:29,520][54818] Updated weights for policy 0, policy_version 531318 (0.0023) [2024-04-28 01:08:31,944][54818] Updated weights for policy 0, policy_version 531328 (0.0025) [2024-04-28 01:08:34,253][54587] Fps is (10 sec: 58982.5, 60 sec: 59255.4, 300 sec: 59204.6). Total num frames: 8705376256. Throughput: 0: 59287.3. Samples: 1610581760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:34,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 01:08:35,067][54818] Updated weights for policy 0, policy_version 531338 (0.0025) [2024-04-28 01:08:37,272][54818] Updated weights for policy 0, policy_version 531348 (0.0025) [2024-04-28 01:08:39,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.3, 300 sec: 59149.0). Total num frames: 8705671168. Throughput: 0: 59518.2. Samples: 1610941540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:39,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 01:08:40,638][54818] Updated weights for policy 0, policy_version 531358 (0.0026) [2024-04-28 01:08:43,336][54818] Updated weights for policy 0, policy_version 531368 (0.0027) [2024-04-28 01:08:44,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8705966080. Throughput: 0: 59069.4. Samples: 1611114540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:44,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 01:08:46,162][54818] Updated weights for policy 0, policy_version 531378 (0.0026) [2024-04-28 01:08:48,699][54818] Updated weights for policy 0, policy_version 531388 (0.0025) [2024-04-28 01:08:49,253][54587] Fps is (10 sec: 60620.0, 60 sec: 58982.4, 300 sec: 59260.1). Total num frames: 8706277376. Throughput: 0: 59002.8. Samples: 1611466840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:49,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 01:08:49,265][54587] No heartbeat for components: RolloutWorker_w4 (13477 seconds) [2024-04-28 01:08:51,539][54818] Updated weights for policy 0, policy_version 531398 (0.0026) [2024-04-28 01:08:52,624][54798] Signal inference workers to stop experience collection... (25800 times) [2024-04-28 01:08:52,624][54798] Signal inference workers to resume experience collection... (25800 times) [2024-04-28 01:08:52,637][54818] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-04-28 01:08:52,654][54818] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-04-28 01:08:54,253][54587] Fps is (10 sec: 60620.3, 60 sec: 58709.2, 300 sec: 59204.6). Total num frames: 8706572288. Throughput: 0: 58987.0. Samples: 1611820280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:54,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 01:08:54,451][54818] Updated weights for policy 0, policy_version 531408 (0.0024) [2024-04-28 01:08:57,044][54818] Updated weights for policy 0, policy_version 531418 (0.0026) [2024-04-28 01:08:59,253][54587] Fps is (10 sec: 58983.3, 60 sec: 58709.3, 300 sec: 59149.0). Total num frames: 8706867200. Throughput: 0: 59040.0. Samples: 1612003420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:08:59,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 01:09:00,100][54818] Updated weights for policy 0, policy_version 531428 (0.0025) [2024-04-28 01:09:02,486][54818] Updated weights for policy 0, policy_version 531438 (0.0024) [2024-04-28 01:09:04,253][54587] Fps is (10 sec: 58982.6, 60 sec: 58709.2, 300 sec: 59204.5). Total num frames: 8707162112. Throughput: 0: 59009.2. Samples: 1612350520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:09:04,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 01:09:05,634][54818] Updated weights for policy 0, policy_version 531448 (0.0026) [2024-04-28 01:09:07,949][54818] Updated weights for policy 0, policy_version 531458 (0.0026) [2024-04-28 01:09:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59255.3, 300 sec: 59260.1). Total num frames: 8707473408. Throughput: 0: 59103.9. Samples: 1612712080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:09:09,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 01:09:11,111][54818] Updated weights for policy 0, policy_version 531468 (0.0025) [2024-04-28 01:09:13,372][54818] Updated weights for policy 0, policy_version 531478 (0.0026) [2024-04-28 01:09:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 59255.5, 300 sec: 59149.0). Total num frames: 8707768320. Throughput: 0: 59342.3. Samples: 1612892900. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:09:14,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 01:09:16,657][54818] Updated weights for policy 0, policy_version 531488 (0.0026) [2024-04-28 01:09:19,231][54818] Updated weights for policy 0, policy_version 531498 (0.0026) [2024-04-28 01:09:19,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59255.5, 300 sec: 59204.6). Total num frames: 8708063232. Throughput: 0: 59194.3. Samples: 1613245500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 01:09:19,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 01:09:22,124][54818] Updated weights for policy 0, policy_version 531508 (0.0026) [2024-04-28 01:09:24,253][54587] Fps is (10 sec: 60620.3, 60 sec: 59801.5, 300 sec: 59260.1). Total num frames: 8708374528. Throughput: 0: 59080.4. Samples: 1613600160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:09:24,262][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 01:09:24,631][54818] Updated weights for policy 0, policy_version 531518 (0.0026) [2024-04-28 01:09:27,706][54818] Updated weights for policy 0, policy_version 531528 (0.0027) [2024-04-28 01:09:29,253][54587] Fps is (10 sec: 58981.3, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8708653056. Throughput: 0: 59130.0. Samples: 1613775400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:09:29,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 01:09:30,219][54818] Updated weights for policy 0, policy_version 531538 (0.0026) [2024-04-28 01:09:33,270][54818] Updated weights for policy 0, policy_version 531548 (0.0026) [2024-04-28 01:09:34,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59528.5, 300 sec: 59204.6). Total num frames: 8708947968. Throughput: 0: 59402.4. Samples: 1614139940. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:09:34,263][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 01:09:35,712][54818] Updated weights for policy 0, policy_version 531558 (0.0025) [2024-04-28 01:09:38,779][54818] Updated weights for policy 0, policy_version 531568 (0.0026) [2024-04-28 01:09:39,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59528.4, 300 sec: 59204.5). Total num frames: 8709242880. Throughput: 0: 59448.8. Samples: 1614495480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:09:39,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 01:09:41,248][54818] Updated weights for policy 0, policy_version 531578 (0.0026) [2024-04-28 01:09:44,253][54587] Fps is (10 sec: 57344.3, 60 sec: 59255.5, 300 sec: 59093.5). Total num frames: 8709521408. Throughput: 0: 59307.1. Samples: 1614672240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:09:44,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-28 01:09:44,321][54818] Updated weights for policy 0, policy_version 531588 (0.0026) [2024-04-28 01:09:46,930][54818] Updated weights for policy 0, policy_version 531598 (0.0027) [2024-04-28 01:09:49,253][54587] Fps is (10 sec: 57345.1, 60 sec: 58982.6, 300 sec: 59149.0). Total num frames: 8709816320. Throughput: 0: 59461.0. Samples: 1615026260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:09:49,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 01:09:49,339][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000531606_8709832704.pth... [2024-04-28 01:09:49,389][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000530739_8695627776.pth [2024-04-28 01:09:49,756][54818] Updated weights for policy 0, policy_version 531608 (0.0026) [2024-04-28 01:09:49,995][54798] Signal inference workers to stop experience collection... 
(25850 times) [2024-04-28 01:09:50,039][54818] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-04-28 01:09:50,051][54798] Signal inference workers to resume experience collection... (25850 times) [2024-04-28 01:09:50,054][54818] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-04-28 01:09:52,555][54818] Updated weights for policy 0, policy_version 531618 (0.0026) [2024-04-28 01:09:54,253][54587] Fps is (10 sec: 58981.8, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8710111232. Throughput: 0: 59245.8. Samples: 1615378140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:09:54,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 01:09:55,244][54818] Updated weights for policy 0, policy_version 531628 (0.0033) [2024-04-28 01:09:58,252][54818] Updated weights for policy 0, policy_version 531638 (0.0025) [2024-04-28 01:09:59,253][54587] Fps is (10 sec: 58982.0, 60 sec: 58982.4, 300 sec: 59149.0). Total num frames: 8710406144. Throughput: 0: 59159.9. Samples: 1615555100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:09:59,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:10:00,611][54818] Updated weights for policy 0, policy_version 531648 (0.0025) [2024-04-28 01:10:03,765][54818] Updated weights for policy 0, policy_version 531658 (0.0025) [2024-04-28 01:10:04,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8710717440. Throughput: 0: 59295.1. Samples: 1615913780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:10:04,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 01:10:06,011][54818] Updated weights for policy 0, policy_version 531668 (0.0027) [2024-04-28 01:10:09,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8710995968. Throughput: 0: 59334.3. Samples: 1616270200. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:10:09,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 01:10:09,288][54818] Updated weights for policy 0, policy_version 531678 (0.0026) [2024-04-28 01:10:11,535][54818] Updated weights for policy 0, policy_version 531688 (0.0032) [2024-04-28 01:10:14,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58982.4, 300 sec: 59093.5). Total num frames: 8711307264. Throughput: 0: 59202.9. Samples: 1616439520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:10:14,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 01:10:14,596][54818] Updated weights for policy 0, policy_version 531698 (0.0026) [2024-04-28 01:10:17,086][54818] Updated weights for policy 0, policy_version 531708 (0.0024) [2024-04-28 01:10:19,253][54587] Fps is (10 sec: 60621.5, 60 sec: 58982.5, 300 sec: 59093.5). Total num frames: 8711602176. Throughput: 0: 58995.7. Samples: 1616794740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:10:19,253][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 01:10:20,215][54818] Updated weights for policy 0, policy_version 531718 (0.0026) [2024-04-28 01:10:22,437][54818] Updated weights for policy 0, policy_version 531728 (0.0026) [2024-04-28 01:10:24,253][54587] Fps is (10 sec: 58981.5, 60 sec: 58709.2, 300 sec: 59093.4). Total num frames: 8711897088. Throughput: 0: 59022.7. Samples: 1617151500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:10:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 01:10:25,805][54818] Updated weights for policy 0, policy_version 531738 (0.0027) [2024-04-28 01:10:28,009][54818] Updated weights for policy 0, policy_version 531748 (0.0026) [2024-04-28 01:10:29,253][54587] Fps is (10 sec: 62258.2, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8712224768. Throughput: 0: 59309.7. Samples: 1617341180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:10:29,254][54587] Avg episode reward: [(0, '0.701')] [2024-04-28 01:10:31,485][54818] Updated weights for policy 0, policy_version 531758 (0.0026) [2024-04-28 01:10:33,614][54818] Updated weights for policy 0, policy_version 531768 (0.0025) [2024-04-28 01:10:34,253][54587] Fps is (10 sec: 62259.5, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8712519680. Throughput: 0: 59222.9. Samples: 1617691300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:10:34,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 01:10:36,992][54818] Updated weights for policy 0, policy_version 531778 (0.0025) [2024-04-28 01:10:39,023][54818] Updated weights for policy 0, policy_version 531788 (0.0026) [2024-04-28 01:10:39,253][54587] Fps is (10 sec: 58982.7, 60 sec: 59528.7, 300 sec: 59260.1). Total num frames: 8712814592. Throughput: 0: 59177.4. Samples: 1618041120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:10:39,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 01:10:42,579][54818] Updated weights for policy 0, policy_version 531798 (0.0026) [2024-04-28 01:10:44,253][54587] Fps is (10 sec: 60621.8, 60 sec: 60074.7, 300 sec: 59315.7). Total num frames: 8713125888. Throughput: 0: 59480.5. Samples: 1618231720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:10:44,253][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 01:10:44,445][54818] Updated weights for policy 0, policy_version 531808 (0.0024) [2024-04-28 01:10:47,936][54818] Updated weights for policy 0, policy_version 531818 (0.0026) [2024-04-28 01:10:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60074.5, 300 sec: 59260.1). Total num frames: 8713420800. Throughput: 0: 59524.3. Samples: 1618592380. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:10:49,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 01:10:49,567][54798] Signal inference workers to stop experience collection... (25900 times) [2024-04-28 01:10:49,574][54798] Signal inference workers to resume experience collection... (25900 times) [2024-04-28 01:10:49,593][54818] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-04-28 01:10:49,593][54818] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-04-28 01:10:50,059][54818] Updated weights for policy 0, policy_version 531828 (0.0027) [2024-04-28 01:10:53,366][54818] Updated weights for policy 0, policy_version 531838 (0.0027) [2024-04-28 01:10:54,253][54587] Fps is (10 sec: 57343.1, 60 sec: 59801.6, 300 sec: 59204.5). Total num frames: 8713699328. Throughput: 0: 59413.2. Samples: 1618943800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:10:54,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 01:10:55,595][54818] Updated weights for policy 0, policy_version 531848 (0.0025) [2024-04-28 01:10:58,906][54818] Updated weights for policy 0, policy_version 531858 (0.0024) [2024-04-28 01:10:59,253][54587] Fps is (10 sec: 55706.5, 60 sec: 59528.6, 300 sec: 59204.6). Total num frames: 8713977856. Throughput: 0: 59524.1. Samples: 1619118100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:10:59,253][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 01:11:01,149][54818] Updated weights for policy 0, policy_version 531868 (0.0024) [2024-04-28 01:11:04,253][54587] Fps is (10 sec: 55706.5, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8714256384. Throughput: 0: 59553.3. Samples: 1619474640. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:04,253][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 01:11:04,438][54818] Updated weights for policy 0, policy_version 531878 (0.0025) [2024-04-28 01:11:06,834][54818] Updated weights for policy 0, policy_version 531888 (0.0026) [2024-04-28 01:11:09,253][54587] Fps is (10 sec: 57344.2, 60 sec: 59255.6, 300 sec: 59204.6). Total num frames: 8714551296. Throughput: 0: 59626.5. Samples: 1619834680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:09,253][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 01:11:09,783][54818] Updated weights for policy 0, policy_version 531898 (0.0024) [2024-04-28 01:11:12,443][54818] Updated weights for policy 0, policy_version 531908 (0.0026) [2024-04-28 01:11:14,253][54587] Fps is (10 sec: 58981.2, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8714846208. Throughput: 0: 59118.6. Samples: 1620001520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:14,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 01:11:15,377][54818] Updated weights for policy 0, policy_version 531918 (0.0026) [2024-04-28 01:11:17,945][54818] Updated weights for policy 0, policy_version 531928 (0.0025) [2024-04-28 01:11:19,253][54587] Fps is (10 sec: 62258.7, 60 sec: 59528.4, 300 sec: 59315.6). Total num frames: 8715173888. Throughput: 0: 59369.5. Samples: 1620362920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:19,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 01:11:20,725][54818] Updated weights for policy 0, policy_version 531938 (0.0024) [2024-04-28 01:11:23,511][54818] Updated weights for policy 0, policy_version 531948 (0.0025) [2024-04-28 01:11:24,253][54587] Fps is (10 sec: 63898.5, 60 sec: 59801.7, 300 sec: 59315.6). Total num frames: 8715485184. Throughput: 0: 59620.4. Samples: 1620724040. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:24,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 01:11:26,236][54818] Updated weights for policy 0, policy_version 531958 (0.0025) [2024-04-28 01:11:28,961][54818] Updated weights for policy 0, policy_version 531968 (0.0029) [2024-04-28 01:11:29,253][54587] Fps is (10 sec: 58982.7, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8715763712. Throughput: 0: 59214.7. Samples: 1620896380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:29,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 01:11:31,803][54818] Updated weights for policy 0, policy_version 531978 (0.0025) [2024-04-28 01:11:34,253][54587] Fps is (10 sec: 58982.0, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8716075008. Throughput: 0: 59007.6. Samples: 1621247720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:34,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 01:11:34,516][54818] Updated weights for policy 0, policy_version 531988 (0.0024) [2024-04-28 01:11:37,239][54818] Updated weights for policy 0, policy_version 531998 (0.0026) [2024-04-28 01:11:39,253][54587] Fps is (10 sec: 60620.0, 60 sec: 59255.4, 300 sec: 59204.5). Total num frames: 8716369920. Throughput: 0: 59176.9. Samples: 1621606760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:39,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 01:11:39,912][54818] Updated weights for policy 0, policy_version 532008 (0.0026) [2024-04-28 01:11:42,096][54798] Signal inference workers to stop experience collection... (25950 times) [2024-04-28 01:11:42,119][54818] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-04-28 01:11:42,188][54798] Signal inference workers to resume experience collection... 
(25950 times) [2024-04-28 01:11:42,189][54818] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-04-28 01:11:42,698][54818] Updated weights for policy 0, policy_version 532018 (0.0024) [2024-04-28 01:11:44,253][54587] Fps is (10 sec: 58982.5, 60 sec: 58982.3, 300 sec: 59260.1). Total num frames: 8716664832. Throughput: 0: 59494.5. Samples: 1621795360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:44,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-28 01:11:45,446][54818] Updated weights for policy 0, policy_version 532028 (0.0025) [2024-04-28 01:11:48,158][54818] Updated weights for policy 0, policy_version 532038 (0.0025) [2024-04-28 01:11:49,253][54587] Fps is (10 sec: 60621.3, 60 sec: 59255.6, 300 sec: 59371.2). Total num frames: 8716976128. Throughput: 0: 59291.9. Samples: 1622142780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:49,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:11:49,265][54587] No heartbeat for components: RolloutWorker_w4 (13657 seconds) [2024-04-28 01:11:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000532042_8716976128.pth... [2024-04-28 01:11:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000531174_8702754816.pth [2024-04-28 01:11:51,034][54818] Updated weights for policy 0, policy_version 532048 (0.0027) [2024-04-28 01:11:53,730][54818] Updated weights for policy 0, policy_version 532058 (0.0026) [2024-04-28 01:11:54,253][54587] Fps is (10 sec: 60621.1, 60 sec: 59528.6, 300 sec: 59371.2). Total num frames: 8717271040. Throughput: 0: 59151.9. Samples: 1622496520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:54,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 01:11:57,035][54818] Updated weights for policy 0, policy_version 532068 (0.0026) [2024-04-28 01:11:59,253][54587] Fps is (10 sec: 57344.7, 60 sec: 59528.6, 300 sec: 59315.7). 
Total num frames: 8717549568. Throughput: 0: 59474.1. Samples: 1622677840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:11:59,253][54587] Avg episode reward: [(0, '0.490')] [2024-04-28 01:11:59,309][54818] Updated weights for policy 0, policy_version 532078 (0.0026) [2024-04-28 01:12:02,724][54818] Updated weights for policy 0, policy_version 532088 (0.0026) [2024-04-28 01:12:04,253][54587] Fps is (10 sec: 57343.8, 60 sec: 59801.5, 300 sec: 59260.1). Total num frames: 8717844480. Throughput: 0: 59516.4. Samples: 1623041160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:12:04,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:12:04,725][54818] Updated weights for policy 0, policy_version 532098 (0.0025) [2024-04-28 01:12:08,129][54818] Updated weights for policy 0, policy_version 532108 (0.0026) [2024-04-28 01:12:09,253][54587] Fps is (10 sec: 58981.7, 60 sec: 59801.5, 300 sec: 59315.6). Total num frames: 8718139392. Throughput: 0: 59296.4. Samples: 1623392380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:12:09,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 01:12:10,154][54818] Updated weights for policy 0, policy_version 532118 (0.0027) [2024-04-28 01:12:13,445][54818] Updated weights for policy 0, policy_version 532128 (0.0027) [2024-04-28 01:12:14,253][54587] Fps is (10 sec: 58982.4, 60 sec: 59801.7, 300 sec: 59315.6). Total num frames: 8718434304. Throughput: 0: 59275.9. Samples: 1623563800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-04-28 01:12:14,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-28 01:12:15,671][54818] Updated weights for policy 0, policy_version 532138 (0.0025) [2024-04-28 01:12:18,802][54818] Updated weights for policy 0, policy_version 532148 (0.0026) [2024-04-28 01:12:19,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.4, 300 sec: 59315.6). Total num frames: 8718729216. Throughput: 0: 59352.5. Samples: 1623918580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:12:19,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 01:12:21,459][54818] Updated weights for policy 0, policy_version 532158 (0.0025) [2024-04-28 01:12:24,253][54587] Fps is (10 sec: 58982.3, 60 sec: 58982.4, 300 sec: 59371.2). Total num frames: 8719024128. Throughput: 0: 59357.4. Samples: 1624277840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:12:24,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 01:12:24,407][54818] Updated weights for policy 0, policy_version 532168 (0.0025) [2024-04-28 01:12:27,035][54818] Updated weights for policy 0, policy_version 532178 (0.0029) [2024-04-28 01:12:29,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58982.3, 300 sec: 59260.1). Total num frames: 8719302656. Throughput: 0: 58906.3. Samples: 1624446140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:12:29,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 01:12:29,830][54818] Updated weights for policy 0, policy_version 532188 (0.0026) [2024-04-28 01:12:32,600][54818] Updated weights for policy 0, policy_version 532198 (0.0027) [2024-04-28 01:12:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 58982.4, 300 sec: 59260.1). Total num frames: 8719613952. Throughput: 0: 59168.8. Samples: 1624805380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:12:34,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 01:12:34,967][54798] Signal inference workers to stop experience collection... (26000 times) [2024-04-28 01:12:34,967][54798] Signal inference workers to resume experience collection... 
(26000 times) [2024-04-28 01:12:34,979][54818] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-04-28 01:12:34,979][54818] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-04-28 01:12:35,481][54818] Updated weights for policy 0, policy_version 532208 (0.0026) [2024-04-28 01:12:37,963][54818] Updated weights for policy 0, policy_version 532218 (0.0026) [2024-04-28 01:12:39,253][54587] Fps is (10 sec: 60620.5, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8719908864. Throughput: 0: 59247.9. Samples: 1625162680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:12:39,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 01:12:41,072][54818] Updated weights for policy 0, policy_version 532228 (0.0026) [2024-04-28 01:12:43,684][54818] Updated weights for policy 0, policy_version 532238 (0.0025) [2024-04-28 01:12:44,253][54587] Fps is (10 sec: 58983.0, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8720203776. Throughput: 0: 59239.0. Samples: 1625343600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:12:44,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 01:12:46,444][54818] Updated weights for policy 0, policy_version 532248 (0.0026) [2024-04-28 01:12:49,253][54587] Fps is (10 sec: 58983.1, 60 sec: 58709.4, 300 sec: 59149.0). Total num frames: 8720498688. Throughput: 0: 58925.0. Samples: 1625692780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:12:49,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 01:12:49,269][54818] Updated weights for policy 0, policy_version 532258 (0.0026) [2024-04-28 01:12:51,977][54818] Updated weights for policy 0, policy_version 532268 (0.0027) [2024-04-28 01:12:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 58982.5, 300 sec: 59204.6). Total num frames: 8720809984. Throughput: 0: 59037.8. Samples: 1626049080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:12:54,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 01:12:55,036][54818] Updated weights for policy 0, policy_version 532278 (0.0025) [2024-04-28 01:12:57,594][54818] Updated weights for policy 0, policy_version 532288 (0.0024) [2024-04-28 01:12:59,253][54587] Fps is (10 sec: 62258.3, 60 sec: 59528.3, 300 sec: 59260.1). Total num frames: 8721121280. Throughput: 0: 59209.7. Samples: 1626228240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:12:59,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 01:13:00,606][54818] Updated weights for policy 0, policy_version 532298 (0.0025) [2024-04-28 01:13:03,131][54818] Updated weights for policy 0, policy_version 532308 (0.0025) [2024-04-28 01:13:04,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8721399808. Throughput: 0: 59308.9. Samples: 1626587480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:13:04,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 01:13:05,908][54818] Updated weights for policy 0, policy_version 532318 (0.0026) [2024-04-28 01:13:08,540][54818] Updated weights for policy 0, policy_version 532328 (0.0026) [2024-04-28 01:13:09,253][54587] Fps is (10 sec: 58982.8, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8721711104. Throughput: 0: 59150.7. Samples: 1626939620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:13:09,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 01:13:11,515][54818] Updated weights for policy 0, policy_version 532338 (0.0026) [2024-04-28 01:13:14,020][54818] Updated weights for policy 0, policy_version 532348 (0.0026) [2024-04-28 01:13:14,253][54587] Fps is (10 sec: 58982.3, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8721989632. Throughput: 0: 59461.8. Samples: 1627121920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:13:14,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 01:13:17,061][54818] Updated weights for policy 0, policy_version 532358 (0.0026) [2024-04-28 01:13:19,253][54587] Fps is (10 sec: 57343.6, 60 sec: 59255.4, 300 sec: 59315.6). Total num frames: 8722284544. Throughput: 0: 59360.4. Samples: 1627476600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:13:19,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 01:13:19,751][54818] Updated weights for policy 0, policy_version 532368 (0.0026) [2024-04-28 01:13:22,473][54818] Updated weights for policy 0, policy_version 532378 (0.0022) [2024-04-28 01:13:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59255.4, 300 sec: 59315.6). Total num frames: 8722579456. Throughput: 0: 59155.9. Samples: 1627824700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:13:24,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 01:13:25,055][54818] Updated weights for policy 0, policy_version 532388 (0.0025) [2024-04-28 01:13:27,978][54818] Updated weights for policy 0, policy_version 532398 (0.0026) [2024-04-28 01:13:29,254][54587] Fps is (10 sec: 58981.6, 60 sec: 59528.4, 300 sec: 59315.6). Total num frames: 8722874368. Throughput: 0: 59217.0. Samples: 1628008380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:13:29,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 01:13:30,524][54818] Updated weights for policy 0, policy_version 532408 (0.0026) [2024-04-28 01:13:33,451][54818] Updated weights for policy 0, policy_version 532418 (0.0026) [2024-04-28 01:13:34,253][54587] Fps is (10 sec: 60621.5, 60 sec: 59528.6, 300 sec: 59371.2). Total num frames: 8723185664. Throughput: 0: 59594.1. Samples: 1628374520. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:13:34,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 01:13:36,005][54818] Updated weights for policy 0, policy_version 532428 (0.0026) [2024-04-28 01:13:38,933][54818] Updated weights for policy 0, policy_version 532438 (0.0027) [2024-04-28 01:13:39,254][54587] Fps is (10 sec: 60620.8, 60 sec: 59528.4, 300 sec: 59371.1). Total num frames: 8723480576. Throughput: 0: 59380.6. Samples: 1628721220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:13:39,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 01:13:40,240][54798] Signal inference workers to stop experience collection... (26050 times) [2024-04-28 01:13:40,240][54798] Signal inference workers to resume experience collection... (26050 times) [2024-04-28 01:13:40,267][54818] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-04-28 01:13:40,267][54818] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-04-28 01:13:41,647][54818] Updated weights for policy 0, policy_version 532448 (0.0026) [2024-04-28 01:13:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 59528.4, 300 sec: 59315.7). Total num frames: 8723775488. Throughput: 0: 59227.2. Samples: 1628893460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:13:44,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 01:13:44,442][54818] Updated weights for policy 0, policy_version 532458 (0.0027) [2024-04-28 01:13:47,010][54818] Updated weights for policy 0, policy_version 532468 (0.0025) [2024-04-28 01:13:49,253][54587] Fps is (10 sec: 58983.2, 60 sec: 59528.4, 300 sec: 59315.6). Total num frames: 8724070400. Throughput: 0: 59279.0. Samples: 1629255040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:13:49,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 01:13:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000532475_8724070400.pth... 
[2024-04-28 01:13:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000531606_8709832704.pth [2024-04-28 01:13:50,060][54818] Updated weights for policy 0, policy_version 532478 (0.0024) [2024-04-28 01:13:52,580][54818] Updated weights for policy 0, policy_version 532488 (0.0025) [2024-04-28 01:13:54,253][54587] Fps is (10 sec: 55705.1, 60 sec: 58709.1, 300 sec: 59204.5). Total num frames: 8724332544. Throughput: 0: 59556.7. Samples: 1629619680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:13:54,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 01:13:55,463][54818] Updated weights for policy 0, policy_version 532498 (0.0024) [2024-04-28 01:13:58,028][54818] Updated weights for policy 0, policy_version 532508 (0.0026) [2024-04-28 01:13:59,253][54587] Fps is (10 sec: 55705.2, 60 sec: 58436.2, 300 sec: 59204.5). Total num frames: 8724627456. Throughput: 0: 59151.0. Samples: 1629783720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:13:59,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 01:14:01,070][54818] Updated weights for policy 0, policy_version 532518 (0.0026) [2024-04-28 01:14:03,749][54818] Updated weights for policy 0, policy_version 532528 (0.0027) [2024-04-28 01:14:04,253][54587] Fps is (10 sec: 60621.8, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8724938752. Throughput: 0: 59254.3. Samples: 1630143040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:04,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 01:14:06,422][54818] Updated weights for policy 0, policy_version 532538 (0.0027) [2024-04-28 01:14:09,253][54587] Fps is (10 sec: 62260.3, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8725250048. Throughput: 0: 59274.9. Samples: 1630492060. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:09,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-28 01:14:09,848][54818] Updated weights for policy 0, policy_version 532548 (0.0025) [2024-04-28 01:14:12,129][54818] Updated weights for policy 0, policy_version 532558 (0.0026) [2024-04-28 01:14:14,253][54587] Fps is (10 sec: 62259.4, 60 sec: 59528.6, 300 sec: 59315.6). Total num frames: 8725561344. Throughput: 0: 59251.0. Samples: 1630674660. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:14,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 01:14:15,388][54818] Updated weights for policy 0, policy_version 532568 (0.0025) [2024-04-28 01:14:17,714][54818] Updated weights for policy 0, policy_version 532578 (0.0026) [2024-04-28 01:14:19,253][54587] Fps is (10 sec: 60620.2, 60 sec: 59528.6, 300 sec: 59260.1). Total num frames: 8725856256. Throughput: 0: 59077.3. Samples: 1631033000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:19,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 01:14:20,869][54818] Updated weights for policy 0, policy_version 532588 (0.0030) [2024-04-28 01:14:23,278][54818] Updated weights for policy 0, policy_version 532598 (0.0024) [2024-04-28 01:14:24,253][54587] Fps is (10 sec: 58981.8, 60 sec: 59528.6, 300 sec: 59315.7). Total num frames: 8726151168. Throughput: 0: 59266.0. Samples: 1631388180. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:24,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 01:14:26,210][54818] Updated weights for policy 0, policy_version 532608 (0.0026) [2024-04-28 01:14:27,868][54798] Signal inference workers to stop experience collection... (26100 times) [2024-04-28 01:14:27,908][54818] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-04-28 01:14:27,927][54798] Signal inference workers to resume experience collection... 
(26100 times) [2024-04-28 01:14:27,927][54818] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-04-28 01:14:28,922][54818] Updated weights for policy 0, policy_version 532618 (0.0026) [2024-04-28 01:14:29,253][54587] Fps is (10 sec: 57343.2, 60 sec: 59255.5, 300 sec: 59260.1). Total num frames: 8726429696. Throughput: 0: 59372.3. Samples: 1631565220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:29,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 01:14:31,742][54818] Updated weights for policy 0, policy_version 532628 (0.0025) [2024-04-28 01:14:34,253][54587] Fps is (10 sec: 57344.4, 60 sec: 58982.4, 300 sec: 59260.1). Total num frames: 8726724608. Throughput: 0: 59223.7. Samples: 1631920100. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:34,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 01:14:34,509][54818] Updated weights for policy 0, policy_version 532638 (0.0026) [2024-04-28 01:14:37,247][54818] Updated weights for policy 0, policy_version 532648 (0.0023) [2024-04-28 01:14:39,253][54587] Fps is (10 sec: 58982.8, 60 sec: 58982.5, 300 sec: 59315.6). Total num frames: 8727019520. Throughput: 0: 59035.6. Samples: 1632276280. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:39,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 01:14:39,931][54818] Updated weights for policy 0, policy_version 532658 (0.0026) [2024-04-28 01:14:42,676][54818] Updated weights for policy 0, policy_version 532668 (0.0025) [2024-04-28 01:14:44,253][54587] Fps is (10 sec: 58981.6, 60 sec: 58982.3, 300 sec: 59315.6). Total num frames: 8727314432. Throughput: 0: 59382.7. Samples: 1632455940. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:44,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:14:45,382][54818] Updated weights for policy 0, policy_version 532678 (0.0026) [2024-04-28 01:14:48,191][54818] Updated weights for policy 0, policy_version 532688 (0.0027) [2024-04-28 01:14:49,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.3, 300 sec: 59260.1). Total num frames: 8727592960. Throughput: 0: 59223.8. Samples: 1632808120. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:49,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 01:14:49,272][54587] No heartbeat for components: RolloutWorker_w4 (13837 seconds) [2024-04-28 01:14:50,866][54818] Updated weights for policy 0, policy_version 532698 (0.0025) [2024-04-28 01:14:53,522][54818] Updated weights for policy 0, policy_version 532708 (0.0026) [2024-04-28 01:14:54,253][54587] Fps is (10 sec: 58983.3, 60 sec: 59528.7, 300 sec: 59315.6). Total num frames: 8727904256. Throughput: 0: 59187.5. Samples: 1633155500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:54,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 01:14:56,254][54818] Updated weights for policy 0, policy_version 532718 (0.0026) [2024-04-28 01:14:59,043][54818] Updated weights for policy 0, policy_version 532728 (0.0026) [2024-04-28 01:14:59,253][54587] Fps is (10 sec: 62259.7, 60 sec: 59801.7, 300 sec: 59315.6). Total num frames: 8728215552. Throughput: 0: 59067.0. Samples: 1633332680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:14:59,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 01:15:01,852][54818] Updated weights for policy 0, policy_version 532738 (0.0026) [2024-04-28 01:15:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59528.4, 300 sec: 59371.2). Total num frames: 8728510464. Throughput: 0: 59023.5. Samples: 1633689060. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:15:04,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 01:15:04,786][54818] Updated weights for policy 0, policy_version 532748 (0.0026) [2024-04-28 01:15:07,273][54818] Updated weights for policy 0, policy_version 532758 (0.0025) [2024-04-28 01:15:09,253][54587] Fps is (10 sec: 57344.5, 60 sec: 58982.4, 300 sec: 59260.1). Total num frames: 8728788992. Throughput: 0: 59153.9. Samples: 1634050100. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:15:09,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 01:15:10,759][54818] Updated weights for policy 0, policy_version 532768 (0.0025) [2024-04-28 01:15:12,543][54798] Signal inference workers to stop experience collection... (26150 times) [2024-04-28 01:15:12,543][54798] Signal inference workers to resume experience collection... (26150 times) [2024-04-28 01:15:12,571][54818] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-04-28 01:15:12,571][54818] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-04-28 01:15:12,658][54818] Updated weights for policy 0, policy_version 532778 (0.0024) [2024-04-28 01:15:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 58709.2, 300 sec: 59260.1). Total num frames: 8729083904. Throughput: 0: 59254.8. Samples: 1634231680. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-04-28 01:15:14,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:15:16,322][54818] Updated weights for policy 0, policy_version 532788 (0.0026) [2024-04-28 01:15:18,162][54818] Updated weights for policy 0, policy_version 532798 (0.0026) [2024-04-28 01:15:19,253][54587] Fps is (10 sec: 58981.7, 60 sec: 58709.3, 300 sec: 59260.1). Total num frames: 8729378816. Throughput: 0: 59039.9. Samples: 1634576900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:15:19,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:15:21,643][54818] Updated weights for policy 0, policy_version 532808 (0.0024) [2024-04-28 01:15:23,718][54818] Updated weights for policy 0, policy_version 532818 (0.0024) [2024-04-28 01:15:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 58982.4, 300 sec: 59204.6). Total num frames: 8729690112. Throughput: 0: 58855.7. Samples: 1634924780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:15:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 01:15:27,231][54818] Updated weights for policy 0, policy_version 532828 (0.0031) [2024-04-28 01:15:29,239][54818] Updated weights for policy 0, policy_version 532838 (0.0025) [2024-04-28 01:15:29,253][54587] Fps is (10 sec: 63897.8, 60 sec: 59801.8, 300 sec: 59315.6). Total num frames: 8730017792. Throughput: 0: 59138.8. Samples: 1635117180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:15:29,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 01:15:32,822][54818] Updated weights for policy 0, policy_version 532848 (0.0027) [2024-04-28 01:15:34,253][54587] Fps is (10 sec: 62258.7, 60 sec: 59801.5, 300 sec: 59315.6). Total num frames: 8730312704. Throughput: 0: 59340.5. Samples: 1635478440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:15:34,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 01:15:35,448][54818] Updated weights for policy 0, policy_version 532858 (0.0025) [2024-04-28 01:15:38,359][54818] Updated weights for policy 0, policy_version 532868 (0.0025) [2024-04-28 01:15:39,253][54587] Fps is (10 sec: 57344.1, 60 sec: 59528.6, 300 sec: 59204.5). Total num frames: 8730591232. Throughput: 0: 59377.3. Samples: 1635827480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:15:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:15:40,886][54818] Updated weights for policy 0, policy_version 532878 (0.0026) [2024-04-28 01:15:43,651][54818] Updated weights for policy 0, policy_version 532888 (0.0026) [2024-04-28 01:15:44,253][54587] Fps is (10 sec: 55706.0, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8730869760. Throughput: 0: 59303.6. Samples: 1636001340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:15:44,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 01:15:46,225][54818] Updated weights for policy 0, policy_version 532898 (0.0026) [2024-04-28 01:15:49,253][54587] Fps is (10 sec: 55705.9, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8731148288. Throughput: 0: 59314.8. Samples: 1636358220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:15:49,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 01:15:49,276][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000532908_8731164672.pth... [2024-04-28 01:15:49,280][54818] Updated weights for policy 0, policy_version 532908 (0.0025) [2024-04-28 01:15:49,333][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000532042_8716976128.pth [2024-04-28 01:15:49,631][54798] Signal inference workers to stop experience collection... (26200 times) [2024-04-28 01:15:49,631][54798] Signal inference workers to resume experience collection... (26200 times) [2024-04-28 01:15:49,643][54818] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-04-28 01:15:49,654][54818] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-04-28 01:15:51,739][54818] Updated weights for policy 0, policy_version 532918 (0.0025) [2024-04-28 01:15:54,253][54587] Fps is (10 sec: 57343.3, 60 sec: 58982.3, 300 sec: 59204.5). Total num frames: 8731443200. Throughput: 0: 59370.0. 
Samples: 1636721760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:15:54,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 01:15:54,753][54818] Updated weights for policy 0, policy_version 532928 (0.0025) [2024-04-28 01:15:57,365][54818] Updated weights for policy 0, policy_version 532938 (0.0025) [2024-04-28 01:15:59,253][54587] Fps is (10 sec: 58982.2, 60 sec: 58709.4, 300 sec: 59260.1). Total num frames: 8731738112. Throughput: 0: 59286.8. Samples: 1636899580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:15:59,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 01:16:00,028][54818] Updated weights for policy 0, policy_version 532948 (0.0025) [2024-04-28 01:16:02,778][54818] Updated weights for policy 0, policy_version 532958 (0.0025) [2024-04-28 01:16:04,253][54587] Fps is (10 sec: 62259.7, 60 sec: 59255.5, 300 sec: 59371.1). Total num frames: 8732065792. Throughput: 0: 59494.2. Samples: 1637254140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:16:04,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 01:16:05,672][54818] Updated weights for policy 0, policy_version 532968 (0.0026) [2024-04-28 01:16:08,735][54818] Updated weights for policy 0, policy_version 532978 (0.0028) [2024-04-28 01:16:09,253][54587] Fps is (10 sec: 60620.1, 60 sec: 59255.3, 300 sec: 59315.6). Total num frames: 8732344320. Throughput: 0: 59218.5. Samples: 1637589620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:16:09,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 01:16:11,395][54818] Updated weights for policy 0, policy_version 532988 (0.0018) [2024-04-28 01:16:14,196][54818] Updated weights for policy 0, policy_version 532998 (0.0016) [2024-04-28 01:16:14,253][54587] Fps is (10 sec: 57344.0, 60 sec: 59255.5, 300 sec: 59204.5). Total num frames: 8732639232. Throughput: 0: 58775.1. Samples: 1637762060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:16:14,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 01:16:16,599][54818] Updated weights for policy 0, policy_version 533008 (0.0017) [2024-04-28 01:16:19,253][54587] Fps is (10 sec: 58983.4, 60 sec: 59255.6, 300 sec: 59149.0). Total num frames: 8732934144. Throughput: 0: 59054.9. Samples: 1638135900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:16:19,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 01:16:19,760][54818] Updated weights for policy 0, policy_version 533018 (0.0016) [2024-04-28 01:16:21,806][54818] Updated weights for policy 0, policy_version 533028 (0.0016) [2024-04-28 01:16:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 59255.4, 300 sec: 59260.1). Total num frames: 8733245440. Throughput: 0: 59338.2. Samples: 1638497700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:16:24,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 01:16:24,998][54818] Updated weights for policy 0, policy_version 533038 (0.0016) [2024-04-28 01:16:27,113][54818] Updated weights for policy 0, policy_version 533048 (0.0025) [2024-04-28 01:16:29,253][54587] Fps is (10 sec: 60619.6, 60 sec: 58709.2, 300 sec: 59204.5). Total num frames: 8733540352. Throughput: 0: 59642.0. Samples: 1638685240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:16:29,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 01:16:30,469][54818] Updated weights for policy 0, policy_version 533058 (0.0016) [2024-04-28 01:16:31,951][54798] Signal inference workers to stop experience collection... (26250 times) [2024-04-28 01:16:31,968][54818] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-04-28 01:16:32,006][54798] Signal inference workers to resume experience collection... 
(26250 times) [2024-04-28 01:16:32,007][54818] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-04-28 01:16:32,607][54818] Updated weights for policy 0, policy_version 533068 (0.0019) [2024-04-28 01:16:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 58982.5, 300 sec: 59260.1). Total num frames: 8733851648. Throughput: 0: 59645.3. Samples: 1639042260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:16:34,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 01:16:35,782][54818] Updated weights for policy 0, policy_version 533078 (0.0017) [2024-04-28 01:16:38,073][54818] Updated weights for policy 0, policy_version 533088 (0.0024) [2024-04-28 01:16:39,253][54587] Fps is (10 sec: 62260.0, 60 sec: 59528.5, 300 sec: 59315.6). Total num frames: 8734162944. Throughput: 0: 59765.9. Samples: 1639411220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:16:39,255][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 01:16:41,052][54818] Updated weights for policy 0, policy_version 533098 (0.0019) [2024-04-28 01:16:43,295][54818] Updated weights for policy 0, policy_version 533108 (0.0016) [2024-04-28 01:16:44,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60074.7, 300 sec: 59315.6). Total num frames: 8734474240. Throughput: 0: 59996.4. Samples: 1639599420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:16:44,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 01:16:46,622][54818] Updated weights for policy 0, policy_version 533118 (0.0019) [2024-04-28 01:16:48,603][54818] Updated weights for policy 0, policy_version 533128 (0.0019) [2024-04-28 01:16:49,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60347.7, 300 sec: 59315.6). Total num frames: 8734769152. Throughput: 0: 60092.1. Samples: 1639958280. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:16:49,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 01:16:51,965][54818] Updated weights for policy 0, policy_version 533138 (0.0017) [2024-04-28 01:16:54,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60620.8, 300 sec: 59426.7). Total num frames: 8735080448. Throughput: 0: 60535.6. Samples: 1640313720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:16:54,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 01:16:54,760][54818] Updated weights for policy 0, policy_version 533148 (0.0021) [2024-04-28 01:16:57,105][54818] Updated weights for policy 0, policy_version 533158 (0.0020) [2024-04-28 01:16:59,253][54587] Fps is (10 sec: 62258.3, 60 sec: 60893.7, 300 sec: 59482.2). Total num frames: 8735391744. Throughput: 0: 61161.7. Samples: 1640514340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:16:59,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 01:17:00,277][54818] Updated weights for policy 0, policy_version 533168 (0.0018) [2024-04-28 01:17:02,572][54818] Updated weights for policy 0, policy_version 533178 (0.0015) [2024-04-28 01:17:04,253][54587] Fps is (10 sec: 62259.7, 60 sec: 60620.8, 300 sec: 59537.8). Total num frames: 8735703040. Throughput: 0: 60840.3. Samples: 1640873720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:04,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:17:05,403][54818] Updated weights for policy 0, policy_version 533188 (0.0016) [2024-04-28 01:17:07,834][54818] Updated weights for policy 0, policy_version 533198 (0.0018) [2024-04-28 01:17:09,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60894.0, 300 sec: 59537.8). Total num frames: 8735997952. Throughput: 0: 60644.9. Samples: 1641226720. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:09,255][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 01:17:10,918][54818] Updated weights for policy 0, policy_version 533208 (0.0019) [2024-04-28 01:17:13,068][54798] Signal inference workers to stop experience collection... (26300 times) [2024-04-28 01:17:13,093][54818] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-04-28 01:17:13,120][54798] Signal inference workers to resume experience collection... (26300 times) [2024-04-28 01:17:13,121][54818] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-04-28 01:17:13,124][54818] Updated weights for policy 0, policy_version 533218 (0.0017) [2024-04-28 01:17:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.0, 300 sec: 59593.3). Total num frames: 8736309248. Throughput: 0: 60798.4. Samples: 1641421160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:14,255][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 01:17:16,266][54818] Updated weights for policy 0, policy_version 533228 (0.0017) [2024-04-28 01:17:18,280][54818] Updated weights for policy 0, policy_version 533238 (0.0017) [2024-04-28 01:17:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 59593.3). Total num frames: 8736604160. Throughput: 0: 60877.8. Samples: 1641781760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:19,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 01:17:21,694][54818] Updated weights for policy 0, policy_version 533248 (0.0019) [2024-04-28 01:17:23,747][54818] Updated weights for policy 0, policy_version 533258 (0.0017) [2024-04-28 01:17:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 59704.4). Total num frames: 8736915456. Throughput: 0: 60629.4. Samples: 1642139540. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:24,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 01:17:27,044][54818] Updated weights for policy 0, policy_version 533268 (0.0016) [2024-04-28 01:17:29,207][54818] Updated weights for policy 0, policy_version 533278 (0.0017) [2024-04-28 01:17:29,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.1, 300 sec: 59704.4). Total num frames: 8737226752. Throughput: 0: 60871.9. Samples: 1642338660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:29,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 01:17:32,399][54818] Updated weights for policy 0, policy_version 533288 (0.0015) [2024-04-28 01:17:34,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 59704.4). Total num frames: 8737521664. Throughput: 0: 60686.8. Samples: 1642689180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:34,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:17:35,112][54818] Updated weights for policy 0, policy_version 533298 (0.0016) [2024-04-28 01:17:37,534][54818] Updated weights for policy 0, policy_version 533308 (0.0016) [2024-04-28 01:17:39,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.8, 300 sec: 59759.9). Total num frames: 8737832960. Throughput: 0: 60935.5. Samples: 1643055820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:39,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 01:17:40,788][54818] Updated weights for policy 0, policy_version 533318 (0.0016) [2024-04-28 01:17:42,784][54818] Updated weights for policy 0, policy_version 533328 (0.0016) [2024-04-28 01:17:44,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.0, 300 sec: 59815.5). Total num frames: 8738144256. Throughput: 0: 60747.8. Samples: 1643247980. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:44,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 01:17:45,983][54818] Updated weights for policy 0, policy_version 533338 (0.0016) [2024-04-28 01:17:47,972][54818] Updated weights for policy 0, policy_version 533348 (0.0018) [2024-04-28 01:17:49,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61167.0, 300 sec: 59759.9). Total num frames: 8738439168. Throughput: 0: 60778.3. Samples: 1643608740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:49,253][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 01:17:49,259][54587] No heartbeat for components: RolloutWorker_w4 (14017 seconds) [2024-04-28 01:17:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000533353_8738455552.pth... [2024-04-28 01:17:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000532475_8724070400.pth [2024-04-28 01:17:51,423][54818] Updated weights for policy 0, policy_version 533358 (0.0017) [2024-04-28 01:17:52,260][54798] Signal inference workers to stop experience collection... (26350 times) [2024-04-28 01:17:52,297][54818] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-04-28 01:17:52,311][54798] Signal inference workers to resume experience collection... (26350 times) [2024-04-28 01:17:52,312][54818] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-04-28 01:17:53,405][54818] Updated weights for policy 0, policy_version 533368 (0.0016) [2024-04-28 01:17:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 59760.0). Total num frames: 8738750464. Throughput: 0: 61096.0. Samples: 1643976040. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:54,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 01:17:56,717][54818] Updated weights for policy 0, policy_version 533378 (0.0016) [2024-04-28 01:17:58,740][54818] Updated weights for policy 0, policy_version 533388 (0.0015) [2024-04-28 01:17:59,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61167.1, 300 sec: 59871.0). Total num frames: 8739061760. Throughput: 0: 61173.7. Samples: 1644173980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:17:59,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 01:18:01,953][54818] Updated weights for policy 0, policy_version 533398 (0.0017) [2024-04-28 01:18:04,014][54818] Updated weights for policy 0, policy_version 533408 (0.0020) [2024-04-28 01:18:04,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 59815.5). Total num frames: 8739356672. Throughput: 0: 61012.8. Samples: 1644527340. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:18:04,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:18:07,226][54818] Updated weights for policy 0, policy_version 533418 (0.0018) [2024-04-28 01:18:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.8, 300 sec: 59926.5). Total num frames: 8739667968. Throughput: 0: 61090.0. Samples: 1644888600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 01:18:09,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 01:18:09,542][54818] Updated weights for policy 0, policy_version 533428 (0.0017) [2024-04-28 01:18:12,561][54818] Updated weights for policy 0, policy_version 533438 (0.0017) [2024-04-28 01:18:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.9, 300 sec: 59982.1). Total num frames: 8739979264. Throughput: 0: 61100.1. Samples: 1645088160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:14,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 01:18:15,343][54818] Updated weights for policy 0, policy_version 533448 (0.0018) [2024-04-28 01:18:17,684][54818] Updated weights for policy 0, policy_version 533458 (0.0016) [2024-04-28 01:18:19,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61167.0, 300 sec: 59982.1). Total num frames: 8740274176. Throughput: 0: 61018.7. Samples: 1645435020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:19,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 01:18:20,945][54818] Updated weights for policy 0, policy_version 533468 (0.0016) [2024-04-28 01:18:23,155][54818] Updated weights for policy 0, policy_version 533478 (0.0019) [2024-04-28 01:18:24,253][54587] Fps is (10 sec: 60619.8, 60 sec: 61166.7, 300 sec: 60037.6). Total num frames: 8740585472. Throughput: 0: 60804.4. Samples: 1645792020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:24,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 01:18:26,736][54818] Updated weights for policy 0, policy_version 533488 (0.0015) [2024-04-28 01:18:28,345][54798] Signal inference workers to stop experience collection... (26400 times) [2024-04-28 01:18:28,363][54818] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-04-28 01:18:28,402][54798] Signal inference workers to resume experience collection... (26400 times) [2024-04-28 01:18:28,402][54818] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-04-28 01:18:28,527][54818] Updated weights for policy 0, policy_version 533498 (0.0019) [2024-04-28 01:18:29,253][54587] Fps is (10 sec: 58981.8, 60 sec: 60620.8, 300 sec: 59926.6). Total num frames: 8740864000. Throughput: 0: 61038.6. Samples: 1645994720. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:29,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:18:32,145][54818] Updated weights for policy 0, policy_version 533508 (0.0016) [2024-04-28 01:18:33,653][54818] Updated weights for policy 0, policy_version 533518 (0.0018) [2024-04-28 01:18:34,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60893.7, 300 sec: 59982.1). Total num frames: 8741175296. Throughput: 0: 60962.5. Samples: 1646352060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:34,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 01:18:37,303][54818] Updated weights for policy 0, policy_version 533528 (0.0016) [2024-04-28 01:18:39,154][54818] Updated weights for policy 0, policy_version 533538 (0.0019) [2024-04-28 01:18:39,253][54587] Fps is (10 sec: 62259.9, 60 sec: 60894.1, 300 sec: 60037.7). Total num frames: 8741486592. Throughput: 0: 60684.6. Samples: 1646706840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:39,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 01:18:42,520][54818] Updated weights for policy 0, policy_version 533548 (0.0016) [2024-04-28 01:18:44,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60893.7, 300 sec: 60093.2). Total num frames: 8741797888. Throughput: 0: 60844.8. Samples: 1646912000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:44,255][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 01:18:44,534][54818] Updated weights for policy 0, policy_version 533558 (0.0016) [2024-04-28 01:18:47,713][54818] Updated weights for policy 0, policy_version 533568 (0.0016) [2024-04-28 01:18:49,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 60204.3). Total num frames: 8742092800. Throughput: 0: 60942.4. Samples: 1647269740. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:49,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:18:49,971][54818] Updated weights for policy 0, policy_version 533578 (0.0016) [2024-04-28 01:18:52,920][54818] Updated weights for policy 0, policy_version 533588 (0.0017) [2024-04-28 01:18:54,253][54587] Fps is (10 sec: 58983.2, 60 sec: 60620.9, 300 sec: 60204.3). Total num frames: 8742387712. Throughput: 0: 60878.5. Samples: 1647628120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:54,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 01:18:55,318][54818] Updated weights for policy 0, policy_version 533598 (0.0019) [2024-04-28 01:18:58,185][54818] Updated weights for policy 0, policy_version 533608 (0.0020) [2024-04-28 01:18:59,253][54587] Fps is (10 sec: 58982.7, 60 sec: 60347.8, 300 sec: 60148.7). Total num frames: 8742682624. Throughput: 0: 60795.7. Samples: 1647823960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:18:59,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 01:19:01,071][54818] Updated weights for policy 0, policy_version 533618 (0.0018) [2024-04-28 01:19:03,405][54818] Updated weights for policy 0, policy_version 533628 (0.0018) [2024-04-28 01:19:03,748][54798] Signal inference workers to stop experience collection... (26450 times) [2024-04-28 01:19:03,748][54798] Signal inference workers to resume experience collection... (26450 times) [2024-04-28 01:19:03,773][54818] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-04-28 01:19:03,773][54818] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-04-28 01:19:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60620.9, 300 sec: 60148.7). Total num frames: 8742993920. Throughput: 0: 61141.3. Samples: 1648186380. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:19:04,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 01:19:06,671][54818] Updated weights for policy 0, policy_version 533638 (0.0017) [2024-04-28 01:19:08,873][54818] Updated weights for policy 0, policy_version 533648 (0.0015) [2024-04-28 01:19:09,253][54587] Fps is (10 sec: 62259.0, 60 sec: 60620.9, 300 sec: 60148.7). Total num frames: 8743305216. Throughput: 0: 61118.5. Samples: 1648542340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:19:09,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 01:19:11,985][54818] Updated weights for policy 0, policy_version 533658 (0.0018) [2024-04-28 01:19:14,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60347.8, 300 sec: 60148.7). Total num frames: 8743600128. Throughput: 0: 60978.3. Samples: 1648738740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:19:14,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 01:19:14,262][54818] Updated weights for policy 0, policy_version 533668 (0.0017) [2024-04-28 01:19:17,278][54818] Updated weights for policy 0, policy_version 533678 (0.0016) [2024-04-28 01:19:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.7, 300 sec: 60204.3). Total num frames: 8743911424. Throughput: 0: 61088.1. Samples: 1649101020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:19:19,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 01:19:19,550][54818] Updated weights for policy 0, policy_version 533688 (0.0018) [2024-04-28 01:19:22,578][54818] Updated weights for policy 0, policy_version 533698 (0.0017) [2024-04-28 01:19:24,253][54587] Fps is (10 sec: 62259.0, 60 sec: 60621.0, 300 sec: 60315.4). Total num frames: 8744222720. Throughput: 0: 61151.9. Samples: 1649458680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:19:24,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 01:19:24,951][54818] Updated weights for policy 0, policy_version 533708 (0.0017) [2024-04-28 01:19:27,837][54818] Updated weights for policy 0, policy_version 533718 (0.0015) [2024-04-28 01:19:29,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 60370.9). Total num frames: 8744534016. Throughput: 0: 60833.4. Samples: 1649649500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:19:29,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 01:19:30,352][54818] Updated weights for policy 0, policy_version 533728 (0.0017) [2024-04-28 01:19:33,218][54818] Updated weights for policy 0, policy_version 533738 (0.0017) [2024-04-28 01:19:34,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60893.8, 300 sec: 60370.9). Total num frames: 8744828928. Throughput: 0: 60913.1. Samples: 1650010840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:19:34,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 01:19:35,995][54818] Updated weights for policy 0, policy_version 533748 (0.0016) [2024-04-28 01:19:38,436][54818] Updated weights for policy 0, policy_version 533758 (0.0016) [2024-04-28 01:19:39,253][54587] Fps is (10 sec: 58982.2, 60 sec: 60620.6, 300 sec: 60370.9). Total num frames: 8745123840. Throughput: 0: 61202.0. Samples: 1650382220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 01:19:39,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 01:19:41,396][54818] Updated weights for policy 0, policy_version 533768 (0.0016) [2024-04-28 01:19:43,825][54818] Updated weights for policy 0, policy_version 533778 (0.0017) [2024-04-28 01:19:44,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60620.8, 300 sec: 60482.0). Total num frames: 8745435136. Throughput: 0: 61050.1. Samples: 1650571220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:19:44,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 01:19:46,507][54818] Updated weights for policy 0, policy_version 533788 (0.0016) [2024-04-28 01:19:49,213][54818] Updated weights for policy 0, policy_version 533798 (0.0016) [2024-04-28 01:19:49,253][54587] Fps is (10 sec: 62259.6, 60 sec: 60893.8, 300 sec: 60481.9). Total num frames: 8745746432. Throughput: 0: 60976.8. Samples: 1650930340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:19:49,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:19:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000533798_8745746432.pth... [2024-04-28 01:19:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000532908_8731164672.pth [2024-04-28 01:19:49,497][54798] Signal inference workers to stop experience collection... (26500 times) [2024-04-28 01:19:49,502][54798] Signal inference workers to resume experience collection... (26500 times) [2024-04-28 01:19:49,513][54818] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-04-28 01:19:49,513][54818] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-04-28 01:19:51,610][54818] Updated weights for policy 0, policy_version 533808 (0.0017) [2024-04-28 01:19:54,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.9, 300 sec: 60482.0). Total num frames: 8746057728. Throughput: 0: 61260.9. Samples: 1651299080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:19:54,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:19:54,567][54818] Updated weights for policy 0, policy_version 533818 (0.0015) [2024-04-28 01:19:57,385][54818] Updated weights for policy 0, policy_version 533828 (0.0016) [2024-04-28 01:19:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61166.9, 300 sec: 60482.0). Total num frames: 8746352640. Throughput: 0: 60798.3. Samples: 1651474660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:19:59,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 01:19:59,802][54818] Updated weights for policy 0, policy_version 533838 (0.0016) [2024-04-28 01:20:02,812][54818] Updated weights for policy 0, policy_version 533848 (0.0016) [2024-04-28 01:20:04,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.8, 300 sec: 60593.0). Total num frames: 8746663936. Throughput: 0: 61227.5. Samples: 1651856260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:04,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 01:20:05,118][54818] Updated weights for policy 0, policy_version 533858 (0.0015) [2024-04-28 01:20:08,339][54818] Updated weights for policy 0, policy_version 533868 (0.0015) [2024-04-28 01:20:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 60593.0). Total num frames: 8746958848. Throughput: 0: 61316.9. Samples: 1652217940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:09,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 01:20:10,581][54818] Updated weights for policy 0, policy_version 533878 (0.0017) [2024-04-28 01:20:13,563][54818] Updated weights for policy 0, policy_version 533888 (0.0016) [2024-04-28 01:20:14,253][54587] Fps is (10 sec: 58983.1, 60 sec: 60893.9, 300 sec: 60593.1). Total num frames: 8747253760. Throughput: 0: 60802.3. Samples: 1652385600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:14,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 01:20:15,795][54818] Updated weights for policy 0, policy_version 533898 (0.0016) [2024-04-28 01:20:19,045][54818] Updated weights for policy 0, policy_version 533908 (0.0019) [2024-04-28 01:20:19,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60893.9, 300 sec: 60593.0). Total num frames: 8747565056. Throughput: 0: 61375.8. Samples: 1652772740. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:19,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-28 01:20:21,066][54818] Updated weights for policy 0, policy_version 533918 (0.0016) [2024-04-28 01:20:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60620.9, 300 sec: 60482.0). Total num frames: 8747859968. Throughput: 0: 61273.5. Samples: 1653139520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:24,253][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 01:20:24,312][54818] Updated weights for policy 0, policy_version 533928 (0.0017) [2024-04-28 01:20:26,109][54818] Updated weights for policy 0, policy_version 533938 (0.0016) [2024-04-28 01:20:29,253][54587] Fps is (10 sec: 58982.0, 60 sec: 60347.7, 300 sec: 60482.0). Total num frames: 8748154880. Throughput: 0: 60717.3. Samples: 1653303500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:29,254][54587] Avg episode reward: [(0, '0.677')] [2024-04-28 01:20:29,602][54818] Updated weights for policy 0, policy_version 533948 (0.0019) [2024-04-28 01:20:31,757][54818] Updated weights for policy 0, policy_version 533958 (0.0017) [2024-04-28 01:20:33,780][54798] Signal inference workers to stop experience collection... (26550 times) [2024-04-28 01:20:33,781][54798] Signal inference workers to resume experience collection... (26550 times) [2024-04-28 01:20:33,791][54818] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-04-28 01:20:33,792][54818] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-04-28 01:20:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60621.0, 300 sec: 60593.0). Total num frames: 8748466176. Throughput: 0: 61177.4. Samples: 1653683320. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:34,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 01:20:34,714][54818] Updated weights for policy 0, policy_version 533968 (0.0017) [2024-04-28 01:20:37,965][54818] Updated weights for policy 0, policy_version 533978 (0.0016) [2024-04-28 01:20:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 60893.8, 300 sec: 60704.1). Total num frames: 8748777472. Throughput: 0: 61116.7. Samples: 1654049340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:39,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 01:20:40,254][54818] Updated weights for policy 0, policy_version 533988 (0.0016) [2024-04-28 01:20:43,411][54818] Updated weights for policy 0, policy_version 533998 (0.0019) [2024-04-28 01:20:44,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.8, 300 sec: 60759.6). Total num frames: 8749072384. Throughput: 0: 60739.8. Samples: 1654207960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:44,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 01:20:45,567][54818] Updated weights for policy 0, policy_version 534008 (0.0016) [2024-04-28 01:20:49,017][54818] Updated weights for policy 0, policy_version 534018 (0.0018) [2024-04-28 01:20:49,253][54587] Fps is (10 sec: 58983.8, 60 sec: 60347.8, 300 sec: 60759.7). Total num frames: 8749367296. Throughput: 0: 60638.4. Samples: 1654584980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:49,253][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:20:49,262][54587] No heartbeat for components: RolloutWorker_w4 (14197 seconds), RolloutWorker_w5 (297 seconds) [2024-04-28 01:20:50,931][54818] Updated weights for policy 0, policy_version 534028 (0.0016) [2024-04-28 01:20:54,200][54818] Updated weights for policy 0, policy_version 534038 (0.0018) [2024-04-28 01:20:54,255][54587] Fps is (10 sec: 60612.8, 60 sec: 60346.3, 300 sec: 60814.9). Total num frames: 8749678592. Throughput: 0: 60973.7. 
Samples: 1654961840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:54,256][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 01:20:56,227][54818] Updated weights for policy 0, policy_version 534048 (0.0017) [2024-04-28 01:20:59,253][54587] Fps is (10 sec: 62258.5, 60 sec: 60620.7, 300 sec: 60759.7). Total num frames: 8749989888. Throughput: 0: 60800.4. Samples: 1655121620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:20:59,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 01:20:59,347][54818] Updated weights for policy 0, policy_version 534058 (0.0017) [2024-04-28 01:21:01,488][54818] Updated weights for policy 0, policy_version 534068 (0.0016) [2024-04-28 01:21:03,963][54798] Signal inference workers to stop experience collection... (26600 times) [2024-04-28 01:21:03,964][54798] Signal inference workers to resume experience collection... (26600 times) [2024-04-28 01:21:03,979][54818] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-04-28 01:21:03,980][54818] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-04-28 01:21:04,253][54587] Fps is (10 sec: 60628.7, 60 sec: 60347.7, 300 sec: 60815.2). Total num frames: 8750284800. Throughput: 0: 60630.6. Samples: 1655501120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:21:04,255][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 01:21:04,669][54818] Updated weights for policy 0, policy_version 534078 (0.0018) [2024-04-28 01:21:06,885][54818] Updated weights for policy 0, policy_version 534088 (0.0015) [2024-04-28 01:21:09,253][54587] Fps is (10 sec: 58981.6, 60 sec: 60347.6, 300 sec: 60815.2). Total num frames: 8750579712. Throughput: 0: 60832.6. Samples: 1655877000. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 01:21:09,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 01:21:09,937][54818] Updated weights for policy 0, policy_version 534098 (0.0018) [2024-04-28 01:21:12,531][54818] Updated weights for policy 0, policy_version 534108 (0.0018) [2024-04-28 01:21:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60620.8, 300 sec: 60870.7). Total num frames: 8750891008. Throughput: 0: 60723.2. Samples: 1656036040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:14,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 01:21:15,122][54818] Updated weights for policy 0, policy_version 534118 (0.0022) [2024-04-28 01:21:18,801][54818] Updated weights for policy 0, policy_version 534128 (0.0016) [2024-04-28 01:21:19,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60347.6, 300 sec: 60815.2). Total num frames: 8751185920. Throughput: 0: 60593.6. Samples: 1656410040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:19,255][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 01:21:20,494][54818] Updated weights for policy 0, policy_version 534138 (0.0016) [2024-04-28 01:21:24,047][54818] Updated weights for policy 0, policy_version 534148 (0.0017) [2024-04-28 01:21:24,253][54587] Fps is (10 sec: 58983.1, 60 sec: 60347.8, 300 sec: 60815.2). Total num frames: 8751480832. Throughput: 0: 60735.4. Samples: 1656782420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:24,253][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 01:21:25,767][54818] Updated weights for policy 0, policy_version 534158 (0.0020) [2024-04-28 01:21:29,253][54587] Fps is (10 sec: 60622.0, 60 sec: 60621.0, 300 sec: 60815.2). Total num frames: 8751792128. Throughput: 0: 60815.3. Samples: 1656944640. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:29,253][54587] Avg episode reward: [(0, '0.678')] [2024-04-28 01:21:29,319][54818] Updated weights for policy 0, policy_version 534168 (0.0018) [2024-04-28 01:21:31,117][54818] Updated weights for policy 0, policy_version 534178 (0.0016) [2024-04-28 01:21:34,253][54587] Fps is (10 sec: 62258.4, 60 sec: 60620.8, 300 sec: 60815.2). Total num frames: 8752103424. Throughput: 0: 60716.7. Samples: 1657317240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:34,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 01:21:34,584][54818] Updated weights for policy 0, policy_version 534188 (0.0017) [2024-04-28 01:21:35,381][54798] Signal inference workers to stop experience collection... (26650 times) [2024-04-28 01:21:35,414][54818] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-04-28 01:21:35,444][54798] Signal inference workers to resume experience collection... (26650 times) [2024-04-28 01:21:35,444][54818] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-04-28 01:21:36,370][54818] Updated weights for policy 0, policy_version 534198 (0.0016) [2024-04-28 01:21:39,253][54587] Fps is (10 sec: 60619.8, 60 sec: 60347.8, 300 sec: 60759.6). Total num frames: 8752398336. Throughput: 0: 60652.9. Samples: 1657691140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:39,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 01:21:39,867][54818] Updated weights for policy 0, policy_version 534208 (0.0017) [2024-04-28 01:21:42,095][54818] Updated weights for policy 0, policy_version 534218 (0.0020) [2024-04-28 01:21:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.8, 300 sec: 60815.2). Total num frames: 8752709632. Throughput: 0: 60740.4. Samples: 1657854940. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:44,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 01:21:45,175][54818] Updated weights for policy 0, policy_version 534228 (0.0017) [2024-04-28 01:21:47,615][54818] Updated weights for policy 0, policy_version 534238 (0.0018) [2024-04-28 01:21:49,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.7, 300 sec: 60815.2). Total num frames: 8753020928. Throughput: 0: 60610.7. Samples: 1658228600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:49,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 01:21:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000534242_8753020928.pth... [2024-04-28 01:21:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000533353_8738455552.pth [2024-04-28 01:21:50,482][54818] Updated weights for policy 0, policy_version 534248 (0.0018) [2024-04-28 01:21:53,655][54818] Updated weights for policy 0, policy_version 534258 (0.0017) [2024-04-28 01:21:54,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60349.2, 300 sec: 60704.2). Total num frames: 8753299456. Throughput: 0: 60678.5. Samples: 1658607520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:54,253][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 01:21:55,630][54818] Updated weights for policy 0, policy_version 534268 (0.0016) [2024-04-28 01:21:59,215][54818] Updated weights for policy 0, policy_version 534278 (0.0018) [2024-04-28 01:21:59,253][54587] Fps is (10 sec: 58981.8, 60 sec: 60347.6, 300 sec: 60704.1). Total num frames: 8753610752. Throughput: 0: 60644.3. Samples: 1658765040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:21:59,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 01:22:00,870][54818] Updated weights for policy 0, policy_version 534288 (0.0021) [2024-04-28 01:22:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60347.9, 300 sec: 60704.1). 
Total num frames: 8753905664. Throughput: 0: 60795.4. Samples: 1659145820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:22:04,253][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 01:22:04,608][54818] Updated weights for policy 0, policy_version 534298 (0.0016) [2024-04-28 01:22:04,752][54798] Signal inference workers to stop experience collection... (26700 times) [2024-04-28 01:22:04,753][54798] Signal inference workers to resume experience collection... (26700 times) [2024-04-28 01:22:04,764][54818] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-04-28 01:22:04,764][54818] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-04-28 01:22:06,138][54818] Updated weights for policy 0, policy_version 534308 (0.0017) [2024-04-28 01:22:09,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60620.9, 300 sec: 60704.1). Total num frames: 8754216960. Throughput: 0: 60750.0. Samples: 1659516180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:22:09,255][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 01:22:09,866][54818] Updated weights for policy 0, policy_version 534318 (0.0015) [2024-04-28 01:22:11,227][54818] Updated weights for policy 0, policy_version 534328 (0.0017) [2024-04-28 01:22:14,253][54587] Fps is (10 sec: 62258.2, 60 sec: 60620.7, 300 sec: 60759.6). Total num frames: 8754528256. Throughput: 0: 60774.9. Samples: 1659679520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:22:14,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-28 01:22:14,936][54818] Updated weights for policy 0, policy_version 534338 (0.0016) [2024-04-28 01:22:16,489][54818] Updated weights for policy 0, policy_version 534348 (0.0025) [2024-04-28 01:22:19,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60621.0, 300 sec: 60704.1). Total num frames: 8754823168. Throughput: 0: 60975.2. Samples: 1660061120. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:22:19,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 01:22:20,161][54818] Updated weights for policy 0, policy_version 534358 (0.0017) [2024-04-28 01:22:22,436][54818] Updated weights for policy 0, policy_version 534368 (0.0017) [2024-04-28 01:22:24,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.8, 300 sec: 60759.7). Total num frames: 8755150848. Throughput: 0: 60929.0. Samples: 1660432940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:22:24,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 01:22:25,333][54818] Updated weights for policy 0, policy_version 534378 (0.0016) [2024-04-28 01:22:28,010][54818] Updated weights for policy 0, policy_version 534388 (0.0015) [2024-04-28 01:22:29,253][54587] Fps is (10 sec: 62258.7, 60 sec: 60893.7, 300 sec: 60759.6). Total num frames: 8755445760. Throughput: 0: 60946.2. Samples: 1660597520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:22:29,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 01:22:30,597][54818] Updated weights for policy 0, policy_version 534398 (0.0016) [2024-04-28 01:22:33,429][54818] Updated weights for policy 0, policy_version 534408 (0.0016) [2024-04-28 01:22:34,252][54798] Signal inference workers to stop experience collection... (26750 times) [2024-04-28 01:22:34,253][54798] Signal inference workers to resume experience collection... (26750 times) [2024-04-28 01:22:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.8, 300 sec: 60759.7). Total num frames: 8755757056. Throughput: 0: 61093.8. Samples: 1660977820. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:22:34,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 01:22:34,267][54818] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-04-28 01:22:34,267][54818] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-04-28 01:22:35,770][54818] Updated weights for policy 0, policy_version 534418 (0.0018) [2024-04-28 01:22:39,110][54818] Updated weights for policy 0, policy_version 534428 (0.0015) [2024-04-28 01:22:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61166.9, 300 sec: 60759.6). Total num frames: 8756068352. Throughput: 0: 60816.2. Samples: 1661344260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:22:39,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 01:22:41,143][54818] Updated weights for policy 0, policy_version 534438 (0.0019) [2024-04-28 01:22:44,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61167.0, 300 sec: 60815.2). Total num frames: 8756379648. Throughput: 0: 61176.6. Samples: 1661517980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:22:44,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 01:22:44,485][54818] Updated weights for policy 0, policy_version 534448 (0.0016) [2024-04-28 01:22:46,530][54818] Updated weights for policy 0, policy_version 534458 (0.0017) [2024-04-28 01:22:49,253][54587] Fps is (10 sec: 60622.1, 60 sec: 60894.0, 300 sec: 60759.7). Total num frames: 8756674560. Throughput: 0: 60946.2. Samples: 1661888400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:22:49,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 01:22:49,727][54818] Updated weights for policy 0, policy_version 534468 (0.0016) [2024-04-28 01:22:52,137][54818] Updated weights for policy 0, policy_version 534478 (0.0016) [2024-04-28 01:22:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61439.9, 300 sec: 60759.6). Total num frames: 8756985856. Throughput: 0: 61109.0. Samples: 1662266080. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:22:54,254][54587] Avg episode reward: [(0, '0.513')] [2024-04-28 01:22:54,993][54818] Updated weights for policy 0, policy_version 534488 (0.0018) [2024-04-28 01:22:57,671][54818] Updated weights for policy 0, policy_version 534498 (0.0023) [2024-04-28 01:22:59,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61440.1, 300 sec: 60815.2). Total num frames: 8757297152. Throughput: 0: 61186.2. Samples: 1662432900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:22:59,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 01:23:00,180][54818] Updated weights for policy 0, policy_version 534508 (0.0017) [2024-04-28 01:23:03,048][54818] Updated weights for policy 0, policy_version 534518 (0.0015) [2024-04-28 01:23:04,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61713.0, 300 sec: 60815.2). Total num frames: 8757608448. Throughput: 0: 61064.9. Samples: 1662809040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:04,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 01:23:05,505][54818] Updated weights for policy 0, policy_version 534528 (0.0015) [2024-04-28 01:23:08,214][54818] Updated weights for policy 0, policy_version 534538 (0.0016) [2024-04-28 01:23:09,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.1, 300 sec: 60759.7). Total num frames: 8757903360. Throughput: 0: 60993.8. Samples: 1663177660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:09,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 01:23:10,869][54818] Updated weights for policy 0, policy_version 534548 (0.0020) [2024-04-28 01:23:13,817][54818] Updated weights for policy 0, policy_version 534558 (0.0016) [2024-04-28 01:23:14,253][54587] Fps is (10 sec: 58982.3, 60 sec: 61167.0, 300 sec: 60759.6). Total num frames: 8758198272. Throughput: 0: 61190.3. Samples: 1663351080. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:14,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 01:23:16,111][54798] Signal inference workers to stop experience collection... (26800 times) [2024-04-28 01:23:16,156][54818] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-04-28 01:23:16,166][54798] Signal inference workers to resume experience collection... (26800 times) [2024-04-28 01:23:16,172][54818] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-04-28 01:23:16,288][54818] Updated weights for policy 0, policy_version 534568 (0.0016) [2024-04-28 01:23:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.0, 300 sec: 60759.7). Total num frames: 8758509568. Throughput: 0: 60963.6. Samples: 1663721180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:19,255][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 01:23:19,278][54818] Updated weights for policy 0, policy_version 534578 (0.0017) [2024-04-28 01:23:21,487][54818] Updated weights for policy 0, policy_version 534588 (0.0019) [2024-04-28 01:23:24,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61167.0, 300 sec: 60870.7). Total num frames: 8758820864. Throughput: 0: 60868.3. Samples: 1664083320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:24,253][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 01:23:24,391][54818] Updated weights for policy 0, policy_version 534598 (0.0016) [2024-04-28 01:23:26,936][54818] Updated weights for policy 0, policy_version 534608 (0.0016) [2024-04-28 01:23:29,254][54587] Fps is (10 sec: 62257.9, 60 sec: 61439.8, 300 sec: 60870.7). Total num frames: 8759132160. Throughput: 0: 61055.3. Samples: 1664265480. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:29,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 01:23:29,859][54818] Updated weights for policy 0, policy_version 534618 (0.0018) [2024-04-28 01:23:32,115][54818] Updated weights for policy 0, policy_version 534628 (0.0019) [2024-04-28 01:23:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 60815.2). Total num frames: 8759427072. Throughput: 0: 61113.2. Samples: 1664638500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:34,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 01:23:35,193][54818] Updated weights for policy 0, policy_version 534638 (0.0018) [2024-04-28 01:23:37,743][54818] Updated weights for policy 0, policy_version 534648 (0.0017) [2024-04-28 01:23:39,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.0, 300 sec: 60870.7). Total num frames: 8759754752. Throughput: 0: 60827.5. Samples: 1665003320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:39,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 01:23:40,536][54818] Updated weights for policy 0, policy_version 534658 (0.0016) [2024-04-28 01:23:43,074][54818] Updated weights for policy 0, policy_version 534668 (0.0023) [2024-04-28 01:23:44,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.9, 300 sec: 60870.7). Total num frames: 8760049664. Throughput: 0: 61106.3. Samples: 1665182680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 01:23:46,154][54818] Updated weights for policy 0, policy_version 534678 (0.0018) [2024-04-28 01:23:48,707][54818] Updated weights for policy 0, policy_version 534688 (0.0016) [2024-04-28 01:23:49,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61439.9, 300 sec: 60926.2). Total num frames: 8760360960. Throughput: 0: 60783.0. Samples: 1665544280. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:49,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 01:23:49,263][54587] No heartbeat for components: RolloutWorker_w4 (14377 seconds), RolloutWorker_w5 (477 seconds) [2024-04-28 01:23:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000534690_8760360960.pth... [2024-04-28 01:23:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000533798_8745746432.pth [2024-04-28 01:23:51,348][54818] Updated weights for policy 0, policy_version 534698 (0.0018) [2024-04-28 01:23:53,928][54818] Updated weights for policy 0, policy_version 534708 (0.0016) [2024-04-28 01:23:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 60926.2). Total num frames: 8760655872. Throughput: 0: 60768.7. Samples: 1665912260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:54,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 01:23:56,544][54818] Updated weights for policy 0, policy_version 534718 (0.0016) [2024-04-28 01:23:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.0, 300 sec: 60926.3). Total num frames: 8760967168. Throughput: 0: 60952.4. Samples: 1666093940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:23:59,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 01:23:59,383][54818] Updated weights for policy 0, policy_version 534728 (0.0017) [2024-04-28 01:24:01,144][54798] Signal inference workers to stop experience collection... (26850 times) [2024-04-28 01:24:01,181][54818] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-04-28 01:24:01,200][54798] Signal inference workers to resume experience collection... 
(26850 times) [2024-04-28 01:24:01,200][54818] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-04-28 01:24:01,928][54818] Updated weights for policy 0, policy_version 534738 (0.0017) [2024-04-28 01:24:04,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 60870.7). Total num frames: 8761262080. Throughput: 0: 60824.1. Samples: 1666458260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:24:04,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-28 01:24:04,569][54818] Updated weights for policy 0, policy_version 534748 (0.0016) [2024-04-28 01:24:07,471][54818] Updated weights for policy 0, policy_version 534758 (0.0017) [2024-04-28 01:24:09,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.0, 300 sec: 60926.3). Total num frames: 8761573376. Throughput: 0: 60930.7. Samples: 1666825200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 01:24:09,253][54587] Avg episode reward: [(0, '0.691')] [2024-04-28 01:24:10,448][54818] Updated weights for policy 0, policy_version 534768 (0.0016) [2024-04-28 01:24:12,865][54818] Updated weights for policy 0, policy_version 534778 (0.0018) [2024-04-28 01:24:14,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.0, 300 sec: 60926.3). Total num frames: 8761884672. Throughput: 0: 61018.5. Samples: 1667011300. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:14,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 01:24:15,881][54818] Updated weights for policy 0, policy_version 534788 (0.0016) [2024-04-28 01:24:18,293][54818] Updated weights for policy 0, policy_version 534798 (0.0017) [2024-04-28 01:24:19,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.9, 300 sec: 60926.2). Total num frames: 8762195968. Throughput: 0: 60809.2. Samples: 1667374920. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:19,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 01:24:21,191][54818] Updated weights for policy 0, policy_version 534808 (0.0017) [2024-04-28 01:24:23,572][54818] Updated weights for policy 0, policy_version 534818 (0.0017) [2024-04-28 01:24:24,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 60926.3). Total num frames: 8762507264. Throughput: 0: 60780.0. Samples: 1667738420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:24,255][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 01:24:26,496][54818] Updated weights for policy 0, policy_version 534828 (0.0017) [2024-04-28 01:24:28,804][54818] Updated weights for policy 0, policy_version 534838 (0.0017) [2024-04-28 01:24:29,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61167.2, 300 sec: 60926.3). Total num frames: 8762802176. Throughput: 0: 60903.7. Samples: 1667923340. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:29,253][54587] Avg episode reward: [(0, '0.708')] [2024-04-28 01:24:31,733][54818] Updated weights for policy 0, policy_version 534848 (0.0017) [2024-04-28 01:24:34,159][54818] Updated weights for policy 0, policy_version 534858 (0.0017) [2024-04-28 01:24:34,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 60981.8). Total num frames: 8763113472. Throughput: 0: 60952.4. Samples: 1668287140. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:34,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 01:24:37,053][54818] Updated weights for policy 0, policy_version 534868 (0.0016) [2024-04-28 01:24:39,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61167.0, 300 sec: 60981.8). Total num frames: 8763424768. Throughput: 0: 60931.1. Samples: 1668654160. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:39,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 01:24:39,359][54818] Updated weights for policy 0, policy_version 534878 (0.0016) [2024-04-28 01:24:42,341][54818] Updated weights for policy 0, policy_version 534888 (0.0019) [2024-04-28 01:24:44,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 60981.8). Total num frames: 8763736064. Throughput: 0: 61243.0. Samples: 1668849880. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:44,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 01:24:44,911][54818] Updated weights for policy 0, policy_version 534898 (0.0015) [2024-04-28 01:24:47,707][54818] Updated weights for policy 0, policy_version 534908 (0.0016) [2024-04-28 01:24:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.9, 300 sec: 60926.3). Total num frames: 8764030976. Throughput: 0: 61004.7. Samples: 1669203480. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:49,254][54587] Avg episode reward: [(0, '0.481')] [2024-04-28 01:24:50,787][54818] Updated weights for policy 0, policy_version 534918 (0.0017) [2024-04-28 01:24:51,464][54798] Signal inference workers to stop experience collection... (26900 times) [2024-04-28 01:24:51,465][54798] Signal inference workers to resume experience collection... (26900 times) [2024-04-28 01:24:51,476][54818] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-04-28 01:24:51,476][54818] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-04-28 01:24:53,099][54818] Updated weights for policy 0, policy_version 534928 (0.0016) [2024-04-28 01:24:54,253][54587] Fps is (10 sec: 58982.5, 60 sec: 61166.9, 300 sec: 60926.2). Total num frames: 8764325888. Throughput: 0: 60918.5. Samples: 1669566540. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:54,255][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 01:24:56,122][54818] Updated weights for policy 0, policy_version 534938 (0.0016) [2024-04-28 01:24:58,291][54818] Updated weights for policy 0, policy_version 534948 (0.0016) [2024-04-28 01:24:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 60926.3). Total num frames: 8764637184. Throughput: 0: 61174.6. Samples: 1669764160. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:24:59,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 01:25:01,665][54818] Updated weights for policy 0, policy_version 534958 (0.0015) [2024-04-28 01:25:03,373][54818] Updated weights for policy 0, policy_version 534968 (0.0019) [2024-04-28 01:25:04,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61439.9, 300 sec: 60981.8). Total num frames: 8764948480. Throughput: 0: 61002.0. Samples: 1670120000. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:25:04,254][54587] Avg episode reward: [(0, '0.488')] [2024-04-28 01:25:06,875][54818] Updated weights for policy 0, policy_version 534978 (0.0017) [2024-04-28 01:25:09,050][54818] Updated weights for policy 0, policy_version 534988 (0.0018) [2024-04-28 01:25:09,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8765243392. Throughput: 0: 60862.9. Samples: 1670477240. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:25:09,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 01:25:12,178][54818] Updated weights for policy 0, policy_version 534998 (0.0019) [2024-04-28 01:25:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.0, 300 sec: 60981.8). Total num frames: 8765554688. Throughput: 0: 61308.9. Samples: 1670682240. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:25:14,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 01:25:14,320][54818] Updated weights for policy 0, policy_version 535008 (0.0017) [2024-04-28 01:25:17,526][54818] Updated weights for policy 0, policy_version 535018 (0.0018) [2024-04-28 01:25:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60894.0, 300 sec: 60981.8). Total num frames: 8765849600. Throughput: 0: 61104.1. Samples: 1671036820. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:25:19,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 01:25:19,646][54818] Updated weights for policy 0, policy_version 535028 (0.0019) [2024-04-28 01:25:22,813][54818] Updated weights for policy 0, policy_version 535038 (0.0019) [2024-04-28 01:25:24,253][54587] Fps is (10 sec: 58980.9, 60 sec: 60620.7, 300 sec: 60981.8). Total num frames: 8766144512. Throughput: 0: 60575.4. Samples: 1671380060. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:25:24,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 01:25:25,897][54818] Updated weights for policy 0, policy_version 535048 (0.0020) [2024-04-28 01:25:27,923][54798] Signal inference workers to stop experience collection... (26950 times) [2024-04-28 01:25:27,961][54818] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-04-28 01:25:27,971][54798] Signal inference workers to resume experience collection... (26950 times) [2024-04-28 01:25:27,975][54818] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-04-28 01:25:28,104][54818] Updated weights for policy 0, policy_version 535058 (0.0017) [2024-04-28 01:25:29,253][54587] Fps is (10 sec: 60619.7, 60 sec: 60893.7, 300 sec: 60981.8). Total num frames: 8766455808. Throughput: 0: 60819.4. Samples: 1671586760. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:25:29,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 01:25:31,211][54818] Updated weights for policy 0, policy_version 535068 (0.0018) [2024-04-28 01:25:33,289][54818] Updated weights for policy 0, policy_version 535078 (0.0020) [2024-04-28 01:25:34,253][54587] Fps is (10 sec: 63898.6, 60 sec: 61166.9, 300 sec: 61037.4). Total num frames: 8766783488. Throughput: 0: 61038.2. Samples: 1671950200. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:25:34,255][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:25:36,501][54818] Updated weights for policy 0, policy_version 535088 (0.0017) [2024-04-28 01:25:38,489][54818] Updated weights for policy 0, policy_version 535098 (0.0017) [2024-04-28 01:25:39,253][54587] Fps is (10 sec: 62260.0, 60 sec: 60893.9, 300 sec: 61037.4). Total num frames: 8767078400. Throughput: 0: 60780.9. Samples: 1672301680. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-04-28 01:25:39,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 01:25:42,071][54818] Updated weights for policy 0, policy_version 535108 (0.0016) [2024-04-28 01:25:43,889][54818] Updated weights for policy 0, policy_version 535118 (0.0019) [2024-04-28 01:25:44,254][54587] Fps is (10 sec: 58976.6, 60 sec: 60619.9, 300 sec: 61037.1). Total num frames: 8767373312. Throughput: 0: 60865.4. Samples: 1672503160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:25:44,255][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 01:25:47,293][54818] Updated weights for policy 0, policy_version 535128 (0.0016) [2024-04-28 01:25:49,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.9, 300 sec: 61037.6). Total num frames: 8767684608. Throughput: 0: 60831.9. Samples: 1672857440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:25:49,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 01:25:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000535137_8767684608.pth... [2024-04-28 01:25:49,329][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000534242_8753020928.pth [2024-04-28 01:25:49,481][54818] Updated weights for policy 0, policy_version 535138 (0.0017) [2024-04-28 01:25:52,403][54818] Updated weights for policy 0, policy_version 535148 (0.0017) [2024-04-28 01:25:54,253][54587] Fps is (10 sec: 62265.3, 60 sec: 61167.0, 300 sec: 61037.3). Total num frames: 8767995904. Throughput: 0: 60914.5. Samples: 1673218400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:25:54,255][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 01:25:54,773][54818] Updated weights for policy 0, policy_version 535158 (0.0015) [2024-04-28 01:25:57,652][54818] Updated weights for policy 0, policy_version 535168 (0.0020) [2024-04-28 01:25:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8768307200. Throughput: 0: 60711.4. Samples: 1673414260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:25:59,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 01:26:00,211][54818] Updated weights for policy 0, policy_version 535178 (0.0016) [2024-04-28 01:26:03,050][54818] Updated weights for policy 0, policy_version 535188 (0.0016) [2024-04-28 01:26:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8768602112. Throughput: 0: 60754.2. Samples: 1673770760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:04,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:26:06,190][54818] Updated weights for policy 0, policy_version 535198 (0.0015) [2024-04-28 01:26:07,983][54798] Signal inference workers to stop experience collection... 
(27000 times) [2024-04-28 01:26:07,987][54798] Signal inference workers to resume experience collection... (27000 times) [2024-04-28 01:26:08,000][54818] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-04-28 01:26:08,000][54818] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-04-28 01:26:08,445][54818] Updated weights for policy 0, policy_version 535208 (0.0017) [2024-04-28 01:26:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.7, 300 sec: 61092.9). Total num frames: 8768913408. Throughput: 0: 61105.0. Samples: 1674129780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:09,255][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 01:26:11,363][54818] Updated weights for policy 0, policy_version 535218 (0.0018) [2024-04-28 01:26:13,792][54818] Updated weights for policy 0, policy_version 535228 (0.0021) [2024-04-28 01:26:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8769208320. Throughput: 0: 60888.2. Samples: 1674326720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:14,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 01:26:16,973][54818] Updated weights for policy 0, policy_version 535238 (0.0022) [2024-04-28 01:26:19,051][54818] Updated weights for policy 0, policy_version 535248 (0.0016) [2024-04-28 01:26:19,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8769519616. Throughput: 0: 60865.4. Samples: 1674689140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:19,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 01:26:22,158][54818] Updated weights for policy 0, policy_version 535258 (0.0015) [2024-04-28 01:26:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 8769814528. Throughput: 0: 61052.8. Samples: 1675049060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:24,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 01:26:24,342][54818] Updated weights for policy 0, policy_version 535268 (0.0019) [2024-04-28 01:26:27,448][54818] Updated weights for policy 0, policy_version 535278 (0.0015) [2024-04-28 01:26:29,253][54587] Fps is (10 sec: 58982.4, 60 sec: 60894.1, 300 sec: 61037.4). Total num frames: 8770109440. Throughput: 0: 60808.9. Samples: 1675239500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:29,254][54587] Avg episode reward: [(0, '0.460')] [2024-04-28 01:26:29,609][54818] Updated weights for policy 0, policy_version 535288 (0.0017) [2024-04-28 01:26:32,579][54818] Updated weights for policy 0, policy_version 535298 (0.0017) [2024-04-28 01:26:34,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60620.9, 300 sec: 61092.9). Total num frames: 8770420736. Throughput: 0: 61003.3. Samples: 1675602580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:34,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 01:26:35,178][54818] Updated weights for policy 0, policy_version 535308 (0.0016) [2024-04-28 01:26:37,847][54818] Updated weights for policy 0, policy_version 535318 (0.0017) [2024-04-28 01:26:39,253][54587] Fps is (10 sec: 62258.0, 60 sec: 60893.7, 300 sec: 61092.9). Total num frames: 8770732032. Throughput: 0: 61066.5. Samples: 1675966400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:39,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-28 01:26:40,857][54818] Updated weights for policy 0, policy_version 535328 (0.0016) [2024-04-28 01:26:43,386][54818] Updated weights for policy 0, policy_version 535338 (0.0018) [2024-04-28 01:26:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60894.9, 300 sec: 61037.4). Total num frames: 8771026944. Throughput: 0: 60993.0. Samples: 1676158940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:44,253][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 01:26:46,093][54818] Updated weights for policy 0, policy_version 535348 (0.0016) [2024-04-28 01:26:48,518][54818] Updated weights for policy 0, policy_version 535358 (0.0016) [2024-04-28 01:26:49,253][54587] Fps is (10 sec: 60622.4, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 8771338240. Throughput: 0: 61149.0. Samples: 1676522460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:49,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 01:26:49,258][54587] No heartbeat for components: RolloutWorker_w4 (14557 seconds), RolloutWorker_w5 (657 seconds) [2024-04-28 01:26:51,659][54818] Updated weights for policy 0, policy_version 535368 (0.0016) [2024-04-28 01:26:53,720][54818] Updated weights for policy 0, policy_version 535378 (0.0016) [2024-04-28 01:26:54,254][54587] Fps is (10 sec: 62256.0, 60 sec: 60893.4, 300 sec: 61148.4). Total num frames: 8771649536. Throughput: 0: 61134.2. Samples: 1676880840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:54,255][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:26:56,984][54818] Updated weights for policy 0, policy_version 535388 (0.0017) [2024-04-28 01:26:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 8771944448. Throughput: 0: 60957.4. Samples: 1677069800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:26:59,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-28 01:26:59,350][54818] Updated weights for policy 0, policy_version 535398 (0.0017) [2024-04-28 01:27:02,365][54818] Updated weights for policy 0, policy_version 535408 (0.0018) [2024-04-28 01:27:02,790][54798] Signal inference workers to stop experience collection... 
(27050 times) [2024-04-28 01:27:02,832][54818] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-04-28 01:27:02,848][54798] Signal inference workers to resume experience collection... (27050 times) [2024-04-28 01:27:02,849][54818] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-04-28 01:27:04,253][54587] Fps is (10 sec: 60624.0, 60 sec: 60893.9, 300 sec: 61148.5). Total num frames: 8772255744. Throughput: 0: 61199.2. Samples: 1677443100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:27:04,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 01:27:04,657][54818] Updated weights for policy 0, policy_version 535418 (0.0019) [2024-04-28 01:27:07,814][54818] Updated weights for policy 0, policy_version 535428 (0.0017) [2024-04-28 01:27:09,253][54587] Fps is (10 sec: 62258.7, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8772567040. Throughput: 0: 61136.4. Samples: 1677800200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:27:09,255][54587] Avg episode reward: [(0, '0.725')] [2024-04-28 01:27:10,269][54818] Updated weights for policy 0, policy_version 535438 (0.0018) [2024-04-28 01:27:12,984][54818] Updated weights for policy 0, policy_version 535448 (0.0015) [2024-04-28 01:27:14,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 8772861952. Throughput: 0: 61059.6. Samples: 1677987180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:14,253][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:27:15,480][54818] Updated weights for policy 0, policy_version 535458 (0.0018) [2024-04-28 01:27:18,377][54818] Updated weights for policy 0, policy_version 535468 (0.0015) [2024-04-28 01:27:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8773173248. Throughput: 0: 61175.9. Samples: 1678355500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:19,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 01:27:20,651][54818] Updated weights for policy 0, policy_version 535478 (0.0017) [2024-04-28 01:27:23,592][54818] Updated weights for policy 0, policy_version 535488 (0.0017) [2024-04-28 01:27:24,253][54587] Fps is (10 sec: 60619.8, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8773468160. Throughput: 0: 61085.0. Samples: 1678715220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:24,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 01:27:26,063][54818] Updated weights for policy 0, policy_version 535498 (0.0019) [2024-04-28 01:27:29,044][54818] Updated weights for policy 0, policy_version 535508 (0.0017) [2024-04-28 01:27:29,253][54587] Fps is (10 sec: 58982.3, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 8773763072. Throughput: 0: 60879.0. Samples: 1678898500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:29,255][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 01:27:31,358][54818] Updated weights for policy 0, policy_version 535518 (0.0019) [2024-04-28 01:27:34,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.8, 300 sec: 61037.4). Total num frames: 8774074368. Throughput: 0: 61006.5. Samples: 1679267760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:34,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:27:34,584][54818] Updated weights for policy 0, policy_version 535528 (0.0016) [2024-04-28 01:27:37,150][54818] Updated weights for policy 0, policy_version 535538 (0.0016) [2024-04-28 01:27:39,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.9, 300 sec: 60981.8). Total num frames: 8774369280. Throughput: 0: 61207.7. Samples: 1679635160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:39,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 01:27:39,844][54818] Updated weights for policy 0, policy_version 535548 (0.0019) [2024-04-28 01:27:42,681][54818] Updated weights for policy 0, policy_version 535558 (0.0015) [2024-04-28 01:27:43,891][54798] Signal inference workers to stop experience collection... (27100 times) [2024-04-28 01:27:43,933][54818] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-04-28 01:27:43,950][54798] Signal inference workers to resume experience collection... (27100 times) [2024-04-28 01:27:43,951][54818] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-04-28 01:27:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 8774680576. Throughput: 0: 60870.6. Samples: 1679808980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:44,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 01:27:45,229][54818] Updated weights for policy 0, policy_version 535568 (0.0017) [2024-04-28 01:27:47,975][54818] Updated weights for policy 0, policy_version 535578 (0.0017) [2024-04-28 01:27:49,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60620.8, 300 sec: 60981.8). Total num frames: 8774975488. Throughput: 0: 60895.6. Samples: 1680183400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:49,253][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 01:27:49,291][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000535583_8774991872.pth... 
[2024-04-28 01:27:49,350][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000534690_8760360960.pth [2024-04-28 01:27:50,402][54818] Updated weights for policy 0, policy_version 535588 (0.0017) [2024-04-28 01:27:53,203][54818] Updated weights for policy 0, policy_version 535598 (0.0015) [2024-04-28 01:27:54,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60621.2, 300 sec: 60981.8). Total num frames: 8775286784. Throughput: 0: 61046.3. Samples: 1680547280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:54,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 01:27:55,735][54818] Updated weights for policy 0, policy_version 535608 (0.0018) [2024-04-28 01:27:58,597][54818] Updated weights for policy 0, policy_version 535618 (0.0015) [2024-04-28 01:27:59,253][54587] Fps is (10 sec: 62258.6, 60 sec: 60893.8, 300 sec: 60981.8). Total num frames: 8775598080. Throughput: 0: 60983.9. Samples: 1680731460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:27:59,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 01:28:01,036][54818] Updated weights for policy 0, policy_version 535628 (0.0020) [2024-04-28 01:28:04,074][54818] Updated weights for policy 0, policy_version 535638 (0.0015) [2024-04-28 01:28:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60620.7, 300 sec: 60981.8). Total num frames: 8775892992. Throughput: 0: 60919.1. Samples: 1681096860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:28:04,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 01:28:06,682][54818] Updated weights for policy 0, policy_version 535648 (0.0016) [2024-04-28 01:28:09,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60621.0, 300 sec: 61037.4). Total num frames: 8776204288. Throughput: 0: 60931.8. Samples: 1681457140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:28:09,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-28 01:28:09,491][54818] Updated weights for policy 0, policy_version 535658 (0.0018) [2024-04-28 01:28:12,075][54818] Updated weights for policy 0, policy_version 535668 (0.0017) [2024-04-28 01:28:14,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.6, 300 sec: 60981.8). Total num frames: 8776499200. Throughput: 0: 60949.3. Samples: 1681641220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:28:14,255][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 01:28:15,041][54818] Updated weights for policy 0, policy_version 535678 (0.0017) [2024-04-28 01:28:17,214][54818] Updated weights for policy 0, policy_version 535688 (0.0018) [2024-04-28 01:28:19,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60620.8, 300 sec: 60981.8). Total num frames: 8776810496. Throughput: 0: 60917.7. Samples: 1682009060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:28:19,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 01:28:20,178][54818] Updated weights for policy 0, policy_version 535698 (0.0019) [2024-04-28 01:28:22,974][54818] Updated weights for policy 0, policy_version 535708 (0.0016) [2024-04-28 01:28:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60620.9, 300 sec: 60926.3). Total num frames: 8777105408. Throughput: 0: 60862.3. Samples: 1682373960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:28:24,255][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:28:25,451][54818] Updated weights for policy 0, policy_version 535718 (0.0019) [2024-04-28 01:28:28,375][54818] Updated weights for policy 0, policy_version 535728 (0.0015) [2024-04-28 01:28:29,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.9, 300 sec: 60981.8). Total num frames: 8777416704. Throughput: 0: 60952.9. Samples: 1682551860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:28:29,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 01:28:30,719][54818] Updated weights for policy 0, policy_version 535738 (0.0017) [2024-04-28 01:28:33,687][54818] Updated weights for policy 0, policy_version 535748 (0.0016) [2024-04-28 01:28:34,253][54587] Fps is (10 sec: 62258.7, 60 sec: 60893.8, 300 sec: 60926.3). Total num frames: 8777728000. Throughput: 0: 60801.1. Samples: 1682919460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:28:34,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 01:28:34,661][54798] Signal inference workers to stop experience collection... (27150 times) [2024-04-28 01:28:34,667][54798] Signal inference workers to resume experience collection... (27150 times) [2024-04-28 01:28:34,677][54818] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-04-28 01:28:34,678][54818] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-04-28 01:28:35,952][54818] Updated weights for policy 0, policy_version 535758 (0.0017) [2024-04-28 01:28:38,947][54818] Updated weights for policy 0, policy_version 535768 (0.0017) [2024-04-28 01:28:39,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.9, 300 sec: 60926.3). Total num frames: 8778022912. Throughput: 0: 61032.0. Samples: 1683293720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 01:28:39,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 01:28:41,228][54818] Updated weights for policy 0, policy_version 535778 (0.0018) [2024-04-28 01:28:44,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.9, 300 sec: 60926.3). Total num frames: 8778334208. Throughput: 0: 60908.9. Samples: 1683472360. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:28:44,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 01:28:44,509][54818] Updated weights for policy 0, policy_version 535788 (0.0018) [2024-04-28 01:28:46,705][54818] Updated weights for policy 0, policy_version 535798 (0.0017) [2024-04-28 01:28:49,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 60926.3). Total num frames: 8778629120. Throughput: 0: 60918.9. Samples: 1683838200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:28:49,253][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 01:28:49,769][54818] Updated weights for policy 0, policy_version 535808 (0.0018) [2024-04-28 01:28:52,183][54818] Updated weights for policy 0, policy_version 535818 (0.0017) [2024-04-28 01:28:54,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60893.9, 300 sec: 60926.3). Total num frames: 8778940416. Throughput: 0: 61251.5. Samples: 1684213460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:28:54,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 01:28:54,916][54818] Updated weights for policy 0, policy_version 535828 (0.0017) [2024-04-28 01:28:57,594][54818] Updated weights for policy 0, policy_version 535838 (0.0016) [2024-04-28 01:28:59,253][54587] Fps is (10 sec: 62258.0, 60 sec: 60893.8, 300 sec: 60981.8). Total num frames: 8779251712. Throughput: 0: 60995.6. Samples: 1684386020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:28:59,255][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:29:00,335][54818] Updated weights for policy 0, policy_version 535848 (0.0017) [2024-04-28 01:29:02,814][54818] Updated weights for policy 0, policy_version 535858 (0.0017) [2024-04-28 01:29:04,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61167.1, 300 sec: 60981.8). Total num frames: 8779563008. Throughput: 0: 60977.1. Samples: 1684753020. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:04,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 01:29:05,721][54818] Updated weights for policy 0, policy_version 535868 (0.0018) [2024-04-28 01:29:08,469][54818] Updated weights for policy 0, policy_version 535878 (0.0016) [2024-04-28 01:29:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60893.7, 300 sec: 60926.3). Total num frames: 8779857920. Throughput: 0: 61193.7. Samples: 1685127680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:09,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 01:29:10,967][54818] Updated weights for policy 0, policy_version 535888 (0.0021) [2024-04-28 01:29:14,196][54818] Updated weights for policy 0, policy_version 535898 (0.0020) [2024-04-28 01:29:14,253][54587] Fps is (10 sec: 58982.0, 60 sec: 60894.0, 300 sec: 60870.8). Total num frames: 8780152832. Throughput: 0: 61064.6. Samples: 1685299760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:14,253][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 01:29:16,315][54818] Updated weights for policy 0, policy_version 535908 (0.0017) [2024-04-28 01:29:17,894][54798] Signal inference workers to stop experience collection... (27200 times) [2024-04-28 01:29:17,918][54818] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-04-28 01:29:17,956][54798] Signal inference workers to resume experience collection... (27200 times) [2024-04-28 01:29:17,956][54818] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-04-28 01:29:19,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.8, 300 sec: 60870.7). Total num frames: 8780464128. Throughput: 0: 60860.9. Samples: 1685658200. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:19,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 01:29:19,371][54818] Updated weights for policy 0, policy_version 535918 (0.0018) [2024-04-28 01:29:21,599][54818] Updated weights for policy 0, policy_version 535928 (0.0015) [2024-04-28 01:29:24,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.0, 300 sec: 60926.3). Total num frames: 8780775424. Throughput: 0: 60940.6. Samples: 1686036040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:24,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 01:29:24,786][54818] Updated weights for policy 0, policy_version 535938 (0.0017) [2024-04-28 01:29:26,948][54818] Updated weights for policy 0, policy_version 535948 (0.0028) [2024-04-28 01:29:29,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.9, 300 sec: 60870.7). Total num frames: 8781070336. Throughput: 0: 60808.9. Samples: 1686208760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:29,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 01:29:30,064][54818] Updated weights for policy 0, policy_version 535958 (0.0015) [2024-04-28 01:29:32,090][54818] Updated weights for policy 0, policy_version 535968 (0.0016) [2024-04-28 01:29:34,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.9, 300 sec: 60870.7). Total num frames: 8781381632. Throughput: 0: 60964.2. Samples: 1686581600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:34,255][54587] Avg episode reward: [(0, '0.518')] [2024-04-28 01:29:35,693][54818] Updated weights for policy 0, policy_version 535978 (0.0017) [2024-04-28 01:29:37,697][54818] Updated weights for policy 0, policy_version 535988 (0.0019) [2024-04-28 01:29:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61167.0, 300 sec: 60870.8). Total num frames: 8781692928. Throughput: 0: 60871.6. Samples: 1686952680. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:39,253][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:29:40,933][54818] Updated weights for policy 0, policy_version 535998 (0.0017) [2024-04-28 01:29:42,978][54818] Updated weights for policy 0, policy_version 536008 (0.0016) [2024-04-28 01:29:44,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 60926.3). Total num frames: 8782004224. Throughput: 0: 61025.4. Samples: 1687132160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:44,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 01:29:46,283][54818] Updated weights for policy 0, policy_version 536018 (0.0019) [2024-04-28 01:29:48,718][54818] Updated weights for policy 0, policy_version 536028 (0.0016) [2024-04-28 01:29:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.8, 300 sec: 60926.3). Total num frames: 8782299136. Throughput: 0: 60878.4. Samples: 1687492560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:49,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 01:29:49,262][54587] No heartbeat for components: RolloutWorker_w4 (14737 seconds), RolloutWorker_w5 (837 seconds) [2024-04-28 01:29:49,353][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000536030_8782315520.pth... [2024-04-28 01:29:49,407][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000535137_8767684608.pth [2024-04-28 01:29:51,440][54818] Updated weights for policy 0, policy_version 536038 (0.0018) [2024-04-28 01:29:54,112][54818] Updated weights for policy 0, policy_version 536048 (0.0019) [2024-04-28 01:29:54,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61166.9, 300 sec: 60926.3). Total num frames: 8782610432. Throughput: 0: 60838.3. Samples: 1687865400. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:54,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:29:56,326][54798] Signal inference workers to stop experience collection... (27250 times) [2024-04-28 01:29:56,327][54798] Signal inference workers to resume experience collection... (27250 times) [2024-04-28 01:29:56,342][54818] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-04-28 01:29:56,342][54818] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-04-28 01:29:56,710][54818] Updated weights for policy 0, policy_version 536058 (0.0018) [2024-04-28 01:29:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60894.0, 300 sec: 60870.7). Total num frames: 8782905344. Throughput: 0: 61067.0. Samples: 1688047780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:29:59,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 01:29:59,370][54818] Updated weights for policy 0, policy_version 536068 (0.0016) [2024-04-28 01:30:01,881][54818] Updated weights for policy 0, policy_version 536078 (0.0016) [2024-04-28 01:30:04,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.8, 300 sec: 60926.3). Total num frames: 8783216640. Throughput: 0: 61238.9. Samples: 1688413940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 19.0) [2024-04-28 01:30:04,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 01:30:04,847][54818] Updated weights for policy 0, policy_version 536088 (0.0016) [2024-04-28 01:30:07,037][54818] Updated weights for policy 0, policy_version 536098 (0.0021) [2024-04-28 01:30:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.0, 300 sec: 60926.3). Total num frames: 8783527936. Throughput: 0: 60993.7. Samples: 1688780760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:09,263][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 01:30:10,465][54818] Updated weights for policy 0, policy_version 536108 (0.0015) [2024-04-28 01:30:12,998][54818] Updated weights for policy 0, policy_version 536118 (0.0017) [2024-04-28 01:30:14,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.8, 300 sec: 60981.8). Total num frames: 8783839232. Throughput: 0: 61225.2. Samples: 1688963900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:14,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 01:30:15,913][54818] Updated weights for policy 0, policy_version 536128 (0.0016) [2024-04-28 01:30:18,311][54818] Updated weights for policy 0, policy_version 536138 (0.0016) [2024-04-28 01:30:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.1, 300 sec: 60981.9). Total num frames: 8784134144. Throughput: 0: 60979.7. Samples: 1689325680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:19,253][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 01:30:21,046][54818] Updated weights for policy 0, policy_version 536148 (0.0019) [2024-04-28 01:30:23,632][54818] Updated weights for policy 0, policy_version 536158 (0.0017) [2024-04-28 01:30:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.8, 300 sec: 60981.8). Total num frames: 8784445440. Throughput: 0: 60903.3. Samples: 1689693340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:24,255][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 01:30:26,435][54818] Updated weights for policy 0, policy_version 536168 (0.0017) [2024-04-28 01:30:29,004][54818] Updated weights for policy 0, policy_version 536178 (0.0015) [2024-04-28 01:30:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 60870.7). Total num frames: 8784740352. Throughput: 0: 61023.2. Samples: 1689878200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:29,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 01:30:31,703][54818] Updated weights for policy 0, policy_version 536188 (0.0016) [2024-04-28 01:30:34,253][54587] Fps is (10 sec: 58983.3, 60 sec: 60894.0, 300 sec: 60870.7). Total num frames: 8785035264. Throughput: 0: 60979.2. Samples: 1690236620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:34,253][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 01:30:34,508][54818] Updated weights for policy 0, policy_version 536198 (0.0017) [2024-04-28 01:30:37,068][54818] Updated weights for policy 0, policy_version 536208 (0.0018) [2024-04-28 01:30:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60893.9, 300 sec: 60926.5). Total num frames: 8785346560. Throughput: 0: 60958.7. Samples: 1690608540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:39,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 01:30:39,738][54818] Updated weights for policy 0, policy_version 536218 (0.0018) [2024-04-28 01:30:42,424][54818] Updated weights for policy 0, policy_version 536228 (0.0015) [2024-04-28 01:30:44,070][54798] Signal inference workers to stop experience collection... (27300 times) [2024-04-28 01:30:44,071][54798] Signal inference workers to resume experience collection... (27300 times) [2024-04-28 01:30:44,086][54818] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-04-28 01:30:44,087][54818] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-04-28 01:30:44,253][54587] Fps is (10 sec: 62259.6, 60 sec: 60894.0, 300 sec: 60926.3). Total num frames: 8785657856. Throughput: 0: 60898.3. Samples: 1690788200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:44,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 01:30:45,206][54818] Updated weights for policy 0, policy_version 536238 (0.0020) [2024-04-28 01:30:47,879][54818] Updated weights for policy 0, policy_version 536248 (0.0016) [2024-04-28 01:30:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.0, 300 sec: 60926.3). Total num frames: 8785969152. Throughput: 0: 60960.0. Samples: 1691157140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:49,253][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 01:30:50,454][54818] Updated weights for policy 0, policy_version 536258 (0.0016) [2024-04-28 01:30:53,341][54818] Updated weights for policy 0, policy_version 536268 (0.0016) [2024-04-28 01:30:54,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61166.9, 300 sec: 60926.3). Total num frames: 8786280448. Throughput: 0: 60979.0. Samples: 1691524820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:54,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 01:30:55,788][54818] Updated weights for policy 0, policy_version 536278 (0.0016) [2024-04-28 01:30:58,677][54818] Updated weights for policy 0, policy_version 536288 (0.0015) [2024-04-28 01:30:59,253][54587] Fps is (10 sec: 62257.9, 60 sec: 61439.8, 300 sec: 60981.8). Total num frames: 8786591744. Throughput: 0: 61009.3. Samples: 1691709320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:30:59,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 01:31:01,315][54818] Updated weights for policy 0, policy_version 536298 (0.0017) [2024-04-28 01:31:03,776][54818] Updated weights for policy 0, policy_version 536308 (0.0016) [2024-04-28 01:31:04,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61166.9, 300 sec: 60926.3). Total num frames: 8786886656. Throughput: 0: 61066.3. Samples: 1692073660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:31:04,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 01:31:06,801][54818] Updated weights for policy 0, policy_version 536318 (0.0016) [2024-04-28 01:31:09,221][54818] Updated weights for policy 0, policy_version 536328 (0.0016) [2024-04-28 01:31:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.8, 300 sec: 60981.8). Total num frames: 8787197952. Throughput: 0: 61095.0. Samples: 1692442620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:31:09,255][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 01:31:12,013][54818] Updated weights for policy 0, policy_version 536338 (0.0015) [2024-04-28 01:31:14,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60894.0, 300 sec: 60926.3). Total num frames: 8787492864. Throughput: 0: 61051.0. Samples: 1692625500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:31:14,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 01:31:14,481][54818] Updated weights for policy 0, policy_version 536348 (0.0015) [2024-04-28 01:31:17,332][54818] Updated weights for policy 0, policy_version 536358 (0.0017) [2024-04-28 01:31:19,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8787804160. Throughput: 0: 61143.5. Samples: 1692988080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:31:19,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 01:31:20,117][54818] Updated weights for policy 0, policy_version 536368 (0.0016) [2024-04-28 01:31:22,592][54818] Updated weights for policy 0, policy_version 536378 (0.0018) [2024-04-28 01:31:24,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61167.1, 300 sec: 61037.3). Total num frames: 8788115456. Throughput: 0: 61142.6. Samples: 1693359960. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:31:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 01:31:25,276][54818] Updated weights for policy 0, policy_version 536388 (0.0016) [2024-04-28 01:31:27,877][54818] Updated weights for policy 0, policy_version 536398 (0.0018) [2024-04-28 01:31:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8788410368. Throughput: 0: 61289.1. Samples: 1693546220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:31:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:31:30,556][54818] Updated weights for policy 0, policy_version 536408 (0.0017) [2024-04-28 01:31:33,215][54818] Updated weights for policy 0, policy_version 536418 (0.0015) [2024-04-28 01:31:34,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61439.9, 300 sec: 60981.8). Total num frames: 8788721664. Throughput: 0: 61165.1. Samples: 1693909580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 01:31:34,255][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:31:35,861][54818] Updated weights for policy 0, policy_version 536428 (0.0016) [2024-04-28 01:31:38,668][54818] Updated weights for policy 0, policy_version 536438 (0.0015) [2024-04-28 01:31:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61439.8, 300 sec: 61037.3). Total num frames: 8789032960. Throughput: 0: 61142.6. Samples: 1694276240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:31:39,254][54587] Avg episode reward: [(0, '0.688')] [2024-04-28 01:31:41,109][54818] Updated weights for policy 0, policy_version 536448 (0.0018) [2024-04-28 01:31:43,883][54818] Updated weights for policy 0, policy_version 536458 (0.0018) [2024-04-28 01:31:44,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61439.9, 300 sec: 61037.3). Total num frames: 8789344256. Throughput: 0: 61284.2. Samples: 1694467100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:31:44,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 01:31:46,983][54818] Updated weights for policy 0, policy_version 536468 (0.0017) [2024-04-28 01:31:47,735][54798] Signal inference workers to stop experience collection... (27350 times) [2024-04-28 01:31:47,767][54818] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-04-28 01:31:47,797][54798] Signal inference workers to resume experience collection... (27350 times) [2024-04-28 01:31:47,797][54818] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-04-28 01:31:49,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61166.9, 300 sec: 60981.9). Total num frames: 8789639168. Throughput: 0: 61295.1. Samples: 1694831940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:31:49,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 01:31:49,318][54818] Updated weights for policy 0, policy_version 536478 (0.0019) [2024-04-28 01:31:49,320][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000536478_8789655552.pth... [2024-04-28 01:31:49,375][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000535583_8774991872.pth [2024-04-28 01:31:52,474][54818] Updated weights for policy 0, policy_version 536488 (0.0017) [2024-04-28 01:31:54,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.9, 300 sec: 61037.3). Total num frames: 8789950464. Throughput: 0: 61257.0. Samples: 1695199180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:31:54,255][54587] Avg episode reward: [(0, '0.689')] [2024-04-28 01:31:54,619][54818] Updated weights for policy 0, policy_version 536498 (0.0020) [2024-04-28 01:31:57,742][54818] Updated weights for policy 0, policy_version 536508 (0.0018) [2024-04-28 01:31:59,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60894.1, 300 sec: 60981.8). Total num frames: 8790245376. Throughput: 0: 60982.8. 
Samples: 1695369720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:31:59,253][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 01:32:00,164][54818] Updated weights for policy 0, policy_version 536518 (0.0015) [2024-04-28 01:32:02,940][54818] Updated weights for policy 0, policy_version 536528 (0.0016) [2024-04-28 01:32:04,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61166.8, 300 sec: 60981.8). Total num frames: 8790556672. Throughput: 0: 61328.4. Samples: 1695747860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:04,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 01:32:05,444][54818] Updated weights for policy 0, policy_version 536538 (0.0016) [2024-04-28 01:32:08,182][54818] Updated weights for policy 0, policy_version 536548 (0.0016) [2024-04-28 01:32:09,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61167.0, 300 sec: 61037.3). Total num frames: 8790867968. Throughput: 0: 61201.2. Samples: 1696114020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:09,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 01:32:10,754][54818] Updated weights for policy 0, policy_version 536558 (0.0017) [2024-04-28 01:32:13,725][54818] Updated weights for policy 0, policy_version 536568 (0.0020) [2024-04-28 01:32:14,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.1, 300 sec: 61037.4). Total num frames: 8791179264. Throughput: 0: 61048.6. Samples: 1696293400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:14,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 01:32:16,104][54818] Updated weights for policy 0, policy_version 536578 (0.0016) [2024-04-28 01:32:18,891][54818] Updated weights for policy 0, policy_version 536588 (0.0017) [2024-04-28 01:32:19,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.0, 300 sec: 61037.4). Total num frames: 8791474176. Throughput: 0: 61142.9. Samples: 1696661000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:19,254][54587] Avg episode reward: [(0, '0.492')] [2024-04-28 01:32:21,936][54818] Updated weights for policy 0, policy_version 536598 (0.0015) [2024-04-28 01:32:24,010][54818] Updated weights for policy 0, policy_version 536608 (0.0019) [2024-04-28 01:32:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8791785472. Throughput: 0: 61188.6. Samples: 1697029720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:24,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 01:32:27,156][54818] Updated weights for policy 0, policy_version 536618 (0.0016) [2024-04-28 01:32:29,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.9, 300 sec: 61037.3). Total num frames: 8792080384. Throughput: 0: 61088.3. Samples: 1697216080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:29,255][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:32:29,465][54818] Updated weights for policy 0, policy_version 536628 (0.0016) [2024-04-28 01:32:31,106][54798] Signal inference workers to stop experience collection... (27400 times) [2024-04-28 01:32:31,147][54818] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-04-28 01:32:31,166][54798] Signal inference workers to resume experience collection... (27400 times) [2024-04-28 01:32:31,167][54818] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-04-28 01:32:32,510][54818] Updated weights for policy 0, policy_version 536638 (0.0016) [2024-04-28 01:32:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8792391680. Throughput: 0: 61182.1. Samples: 1697585140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:34,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 01:32:34,788][54818] Updated weights for policy 0, policy_version 536648 (0.0019) [2024-04-28 01:32:37,864][54818] Updated weights for policy 0, policy_version 536658 (0.0016) [2024-04-28 01:32:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 8792686592. Throughput: 0: 61148.9. Samples: 1697950880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:39,254][54587] Avg episode reward: [(0, '0.484')] [2024-04-28 01:32:40,124][54818] Updated weights for policy 0, policy_version 536668 (0.0017) [2024-04-28 01:32:43,268][54818] Updated weights for policy 0, policy_version 536678 (0.0017) [2024-04-28 01:32:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8792997888. Throughput: 0: 61335.4. Samples: 1698129820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:44,259][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 01:32:45,544][54818] Updated weights for policy 0, policy_version 536688 (0.0018) [2024-04-28 01:32:48,482][54818] Updated weights for policy 0, policy_version 536698 (0.0022) [2024-04-28 01:32:49,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60893.7, 300 sec: 61037.3). Total num frames: 8793292800. Throughput: 0: 61267.5. Samples: 1698504900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:49,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 01:32:49,263][54587] No heartbeat for components: RolloutWorker_w4 (14917 seconds), RolloutWorker_w5 (1017 seconds) [2024-04-28 01:32:51,098][54818] Updated weights for policy 0, policy_version 536708 (0.0018) [2024-04-28 01:32:53,897][54818] Updated weights for policy 0, policy_version 536718 (0.0016) [2024-04-28 01:32:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 8793604096. Throughput: 0: 61016.0. 
Samples: 1698859740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:54,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 01:32:56,392][54818] Updated weights for policy 0, policy_version 536728 (0.0016) [2024-04-28 01:32:59,224][54818] Updated weights for policy 0, policy_version 536738 (0.0016) [2024-04-28 01:32:59,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 8793915392. Throughput: 0: 61127.9. Samples: 1699044160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:32:59,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 01:33:02,055][54818] Updated weights for policy 0, policy_version 536748 (0.0020) [2024-04-28 01:33:04,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.8, 300 sec: 61092.8). Total num frames: 8794226688. Throughput: 0: 61122.8. Samples: 1699411540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-04-28 01:33:04,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 01:33:04,542][54818] Updated weights for policy 0, policy_version 536758 (0.0020) [2024-04-28 01:33:07,291][54818] Updated weights for policy 0, policy_version 536768 (0.0017) [2024-04-28 01:33:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8794521600. Throughput: 0: 60980.5. Samples: 1699773840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:09,253][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:33:10,064][54818] Updated weights for policy 0, policy_version 536778 (0.0016) [2024-04-28 01:33:12,896][54818] Updated weights for policy 0, policy_version 536788 (0.0016) [2024-04-28 01:33:14,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8794832896. Throughput: 0: 60896.1. Samples: 1699956400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:14,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 01:33:15,197][54818] Updated weights for policy 0, policy_version 536798 (0.0015) [2024-04-28 01:33:18,003][54818] Updated weights for policy 0, policy_version 536808 (0.0016) [2024-04-28 01:33:19,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8795127808. Throughput: 0: 60948.4. Samples: 1700327820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:19,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 01:33:20,564][54818] Updated weights for policy 0, policy_version 536818 (0.0017) [2024-04-28 01:33:23,333][54818] Updated weights for policy 0, policy_version 536828 (0.0016) [2024-04-28 01:33:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8795439104. Throughput: 0: 60813.0. Samples: 1700687460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:24,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 01:33:25,478][54798] Signal inference workers to stop experience collection... (27450 times) [2024-04-28 01:33:25,478][54798] Signal inference workers to resume experience collection... (27450 times) [2024-04-28 01:33:25,490][54818] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-04-28 01:33:25,490][54818] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-04-28 01:33:25,794][54818] Updated weights for policy 0, policy_version 536838 (0.0015) [2024-04-28 01:33:28,627][54818] Updated weights for policy 0, policy_version 536848 (0.0015) [2024-04-28 01:33:29,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8795750400. Throughput: 0: 60936.8. Samples: 1700871980. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:29,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 01:33:31,451][54818] Updated weights for policy 0, policy_version 536858 (0.0018) [2024-04-28 01:33:34,213][54818] Updated weights for policy 0, policy_version 536868 (0.0016) [2024-04-28 01:33:34,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8796045312. Throughput: 0: 60767.5. Samples: 1701239440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:34,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 01:33:36,874][54818] Updated weights for policy 0, policy_version 536878 (0.0016) [2024-04-28 01:33:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8796356608. Throughput: 0: 60935.6. Samples: 1701601840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 01:33:39,596][54818] Updated weights for policy 0, policy_version 536888 (0.0016) [2024-04-28 01:33:42,423][54818] Updated weights for policy 0, policy_version 536898 (0.0017) [2024-04-28 01:33:44,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60893.8, 300 sec: 61092.8). Total num frames: 8796651520. Throughput: 0: 60902.1. Samples: 1701784760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:44,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 01:33:44,815][54818] Updated weights for policy 0, policy_version 536908 (0.0016) [2024-04-28 01:33:47,881][54818] Updated weights for policy 0, policy_version 536918 (0.0016) [2024-04-28 01:33:49,253][54587] Fps is (10 sec: 58982.3, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 8796946432. Throughput: 0: 61057.0. Samples: 1702159100. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:49,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 01:33:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000536923_8796946432.pth... [2024-04-28 01:33:49,329][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000536030_8782315520.pth [2024-04-28 01:33:50,162][54818] Updated weights for policy 0, policy_version 536928 (0.0018) [2024-04-28 01:33:53,152][54818] Updated weights for policy 0, policy_version 536938 (0.0016) [2024-04-28 01:33:54,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.9, 300 sec: 61037.4). Total num frames: 8797257728. Throughput: 0: 60963.0. Samples: 1702517180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:54,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 01:33:55,630][54818] Updated weights for policy 0, policy_version 536948 (0.0016) [2024-04-28 01:33:58,482][54818] Updated weights for policy 0, policy_version 536958 (0.0015) [2024-04-28 01:33:59,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 8797569024. Throughput: 0: 60942.2. Samples: 1702698800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:33:59,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 01:34:00,785][54818] Updated weights for policy 0, policy_version 536968 (0.0021) [2024-04-28 01:34:03,720][54818] Updated weights for policy 0, policy_version 536978 (0.0018) [2024-04-28 01:34:04,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60621.0, 300 sec: 61037.4). Total num frames: 8797863936. Throughput: 0: 60912.6. Samples: 1703068880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:34:04,254][54587] Avg episode reward: [(0, '0.482')] [2024-04-28 01:34:06,220][54818] Updated weights for policy 0, policy_version 536988 (0.0018) [2024-04-28 01:34:09,166][54818] Updated weights for policy 0, policy_version 536998 (0.0022) [2024-04-28 01:34:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8798175232. Throughput: 0: 61075.2. Samples: 1703435840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:34:09,254][54587] Avg episode reward: [(0, '0.684')] [2024-04-28 01:34:11,556][54818] Updated weights for policy 0, policy_version 537008 (0.0015) [2024-04-28 01:34:12,000][54798] Signal inference workers to stop experience collection... (27500 times) [2024-04-28 01:34:12,000][54798] Signal inference workers to resume experience collection... (27500 times) [2024-04-28 01:34:12,022][54818] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-04-28 01:34:12,022][54818] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-04-28 01:34:14,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60620.9, 300 sec: 61037.4). Total num frames: 8798470144. Throughput: 0: 60794.1. Samples: 1703607700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:34:14,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:34:14,479][54818] Updated weights for policy 0, policy_version 537018 (0.0018) [2024-04-28 01:34:16,934][54818] Updated weights for policy 0, policy_version 537028 (0.0016) [2024-04-28 01:34:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 8798781440. Throughput: 0: 60863.8. Samples: 1703978300. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:34:19,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 01:34:19,816][54818] Updated weights for policy 0, policy_version 537038 (0.0017) [2024-04-28 01:34:22,375][54818] Updated weights for policy 0, policy_version 537048 (0.0017) [2024-04-28 01:34:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.9, 300 sec: 61037.3). Total num frames: 8799076352. Throughput: 0: 61157.4. Samples: 1704353920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:34:24,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-28 01:34:25,185][54818] Updated weights for policy 0, policy_version 537058 (0.0017) [2024-04-28 01:34:27,578][54818] Updated weights for policy 0, policy_version 537068 (0.0015) [2024-04-28 01:34:29,253][54587] Fps is (10 sec: 62258.3, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8799404032. Throughput: 0: 60997.3. Samples: 1704529640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:34:29,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 01:34:30,471][54818] Updated weights for policy 0, policy_version 537078 (0.0019) [2024-04-28 01:34:33,609][54818] Updated weights for policy 0, policy_version 537088 (0.0019) [2024-04-28 01:34:34,253][54587] Fps is (10 sec: 63897.4, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 8799715328. Throughput: 0: 60701.4. Samples: 1704890660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 01:34:34,254][54587] Avg episode reward: [(0, '0.513')] [2024-04-28 01:34:35,746][54818] Updated weights for policy 0, policy_version 537098 (0.0020) [2024-04-28 01:34:38,785][54818] Updated weights for policy 0, policy_version 537108 (0.0018) [2024-04-28 01:34:39,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 8800010240. Throughput: 0: 61051.0. Samples: 1705264480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:34:39,255][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:34:41,105][54818] Updated weights for policy 0, policy_version 537118 (0.0018) [2024-04-28 01:34:44,195][54818] Updated weights for policy 0, policy_version 537128 (0.0016) [2024-04-28 01:34:44,256][54587] Fps is (10 sec: 58965.4, 60 sec: 60891.1, 300 sec: 61036.8). Total num frames: 8800305152. Throughput: 0: 60844.6. Samples: 1705436980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:34:44,257][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 01:34:46,577][54818] Updated weights for policy 0, policy_version 537138 (0.0016) [2024-04-28 01:34:49,253][54587] Fps is (10 sec: 58983.5, 60 sec: 60894.0, 300 sec: 60981.8). Total num frames: 8800600064. Throughput: 0: 60760.9. Samples: 1705803120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:34:49,253][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 01:34:49,421][54818] Updated weights for policy 0, policy_version 537148 (0.0019) [2024-04-28 01:34:51,510][54798] Signal inference workers to stop experience collection... (27550 times) [2024-04-28 01:34:51,510][54798] Signal inference workers to resume experience collection... (27550 times) [2024-04-28 01:34:51,524][54818] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-04-28 01:34:51,524][54818] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-04-28 01:34:51,864][54818] Updated weights for policy 0, policy_version 537158 (0.0020) [2024-04-28 01:34:54,253][54587] Fps is (10 sec: 60637.7, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 8800911360. Throughput: 0: 60912.8. Samples: 1706176920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:34:54,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-28 01:34:54,662][54818] Updated weights for policy 0, policy_version 537168 (0.0021) [2024-04-28 01:34:57,165][54818] Updated weights for policy 0, policy_version 537178 (0.0016) [2024-04-28 01:34:59,253][54587] Fps is (10 sec: 60619.8, 60 sec: 60620.7, 300 sec: 60981.8). Total num frames: 8801206272. Throughput: 0: 60970.9. Samples: 1706351400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:34:59,255][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 01:35:00,063][54818] Updated weights for policy 0, policy_version 537188 (0.0017) [2024-04-28 01:35:02,690][54818] Updated weights for policy 0, policy_version 537198 (0.0016) [2024-04-28 01:35:04,253][54587] Fps is (10 sec: 58982.8, 60 sec: 60620.7, 300 sec: 60926.3). Total num frames: 8801501184. Throughput: 0: 60823.9. Samples: 1706715380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:04,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 01:35:05,593][54818] Updated weights for policy 0, policy_version 537208 (0.0021) [2024-04-28 01:35:08,500][54818] Updated weights for policy 0, policy_version 537218 (0.0018) [2024-04-28 01:35:09,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60620.8, 300 sec: 60926.3). Total num frames: 8801812480. Throughput: 0: 60752.5. Samples: 1707087780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:09,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 01:35:10,917][54818] Updated weights for policy 0, policy_version 537228 (0.0017) [2024-04-28 01:35:13,952][54818] Updated weights for policy 0, policy_version 537238 (0.0017) [2024-04-28 01:35:14,253][54587] Fps is (10 sec: 63897.4, 60 sec: 61166.8, 300 sec: 61037.3). Total num frames: 8802140160. Throughput: 0: 60826.8. Samples: 1707266840. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:14,255][54587] Avg episode reward: [(0, '0.522')] [2024-04-28 01:35:16,302][54818] Updated weights for policy 0, policy_version 537248 (0.0017) [2024-04-28 01:35:19,211][54818] Updated weights for policy 0, policy_version 537258 (0.0017) [2024-04-28 01:35:19,253][54587] Fps is (10 sec: 62258.8, 60 sec: 60893.8, 300 sec: 60981.8). Total num frames: 8802435072. Throughput: 0: 60928.0. Samples: 1707632420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:19,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 01:35:21,594][54818] Updated weights for policy 0, policy_version 537268 (0.0016) [2024-04-28 01:35:24,253][54587] Fps is (10 sec: 58982.9, 60 sec: 60893.9, 300 sec: 60981.8). Total num frames: 8802729984. Throughput: 0: 60730.0. Samples: 1707997320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:24,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-28 01:35:24,436][54818] Updated weights for policy 0, policy_version 537278 (0.0016) [2024-04-28 01:35:27,149][54818] Updated weights for policy 0, policy_version 537288 (0.0018) [2024-04-28 01:35:29,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60621.0, 300 sec: 61037.3). Total num frames: 8803041280. Throughput: 0: 60894.7. Samples: 1708177060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:29,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 01:35:29,809][54818] Updated weights for policy 0, policy_version 537298 (0.0017) [2024-04-28 01:35:32,565][54818] Updated weights for policy 0, policy_version 537308 (0.0016) [2024-04-28 01:35:34,253][54587] Fps is (10 sec: 62259.4, 60 sec: 60620.9, 300 sec: 61037.3). Total num frames: 8803352576. Throughput: 0: 60970.7. Samples: 1708546800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:34,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 01:35:34,991][54818] Updated weights for policy 0, policy_version 537318 (0.0017) [2024-04-28 01:35:38,375][54818] Updated weights for policy 0, policy_version 537328 (0.0017) [2024-04-28 01:35:38,673][54798] Signal inference workers to stop experience collection... (27600 times) [2024-04-28 01:35:38,674][54798] Signal inference workers to resume experience collection... (27600 times) [2024-04-28 01:35:38,688][54818] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-04-28 01:35:38,688][54818] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-04-28 01:35:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60621.0, 300 sec: 60981.8). Total num frames: 8803647488. Throughput: 0: 60934.8. Samples: 1708918980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:39,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 01:35:40,216][54818] Updated weights for policy 0, policy_version 537338 (0.0016) [2024-04-28 01:35:43,569][54818] Updated weights for policy 0, policy_version 537348 (0.0016) [2024-04-28 01:35:44,253][54587] Fps is (10 sec: 60619.6, 60 sec: 60896.7, 300 sec: 60981.8). Total num frames: 8803958784. Throughput: 0: 61016.0. Samples: 1709097120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:44,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 01:35:45,633][54818] Updated weights for policy 0, policy_version 537358 (0.0017) [2024-04-28 01:35:48,821][54818] Updated weights for policy 0, policy_version 537368 (0.0019) [2024-04-28 01:35:49,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8804270080. Throughput: 0: 61227.2. Samples: 1709470600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:49,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 01:35:49,258][54587] No heartbeat for components: RolloutWorker_w4 (15097 seconds), RolloutWorker_w5 (1197 seconds) [2024-04-28 01:35:49,341][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000537371_8804286464.pth... [2024-04-28 01:35:49,399][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000536478_8789655552.pth [2024-04-28 01:35:50,956][54818] Updated weights for policy 0, policy_version 537378 (0.0016) [2024-04-28 01:35:53,934][54818] Updated weights for policy 0, policy_version 537388 (0.0018) [2024-04-28 01:35:54,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60893.9, 300 sec: 60926.3). Total num frames: 8804564992. Throughput: 0: 61110.6. Samples: 1709837760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:54,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 01:35:56,340][54818] Updated weights for policy 0, policy_version 537398 (0.0017) [2024-04-28 01:35:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61167.0, 300 sec: 60981.8). Total num frames: 8804876288. Throughput: 0: 61051.6. Samples: 1710014160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:35:59,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-28 01:35:59,442][54818] Updated weights for policy 0, policy_version 537408 (0.0016) [2024-04-28 01:36:01,910][54818] Updated weights for policy 0, policy_version 537418 (0.0018) [2024-04-28 01:36:04,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.0, 300 sec: 60926.3). Total num frames: 8805171200. Throughput: 0: 61071.2. Samples: 1710380620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 01:36:04,253][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 01:36:04,774][54818] Updated weights for policy 0, policy_version 537428 (0.0016) [2024-04-28 01:36:07,198][54818] Updated weights for policy 0, policy_version 537438 (0.0016) [2024-04-28 01:36:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8805482496. Throughput: 0: 61119.5. Samples: 1710747700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:09,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-28 01:36:09,957][54818] Updated weights for policy 0, policy_version 537448 (0.0016) [2024-04-28 01:36:12,975][54818] Updated weights for policy 0, policy_version 537458 (0.0017) [2024-04-28 01:36:14,253][54587] Fps is (10 sec: 62258.7, 60 sec: 60893.9, 300 sec: 60981.8). Total num frames: 8805793792. Throughput: 0: 61159.0. Samples: 1710929220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:14,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 01:36:15,071][54818] Updated weights for policy 0, policy_version 537468 (0.0017) [2024-04-28 01:36:18,221][54818] Updated weights for policy 0, policy_version 537478 (0.0018) [2024-04-28 01:36:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8806105088. Throughput: 0: 61349.7. Samples: 1711307540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:19,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 01:36:20,766][54818] Updated weights for policy 0, policy_version 537488 (0.0015) [2024-04-28 01:36:21,295][54798] Signal inference workers to stop experience collection... (27650 times) [2024-04-28 01:36:21,296][54798] Signal inference workers to resume experience collection... 
(27650 times) [2024-04-28 01:36:21,314][54818] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-04-28 01:36:21,314][54818] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-04-28 01:36:23,645][54818] Updated weights for policy 0, policy_version 537498 (0.0019) [2024-04-28 01:36:24,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.8, 300 sec: 60981.8). Total num frames: 8806400000. Throughput: 0: 61049.1. Samples: 1711666200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:24,255][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 01:36:26,083][54818] Updated weights for policy 0, policy_version 537508 (0.0017) [2024-04-28 01:36:28,713][54818] Updated weights for policy 0, policy_version 537518 (0.0017) [2024-04-28 01:36:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.8, 300 sec: 60981.8). Total num frames: 8806711296. Throughput: 0: 61005.4. Samples: 1711842360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:29,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 01:36:31,220][54818] Updated weights for policy 0, policy_version 537528 (0.0016) [2024-04-28 01:36:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.7, 300 sec: 60926.3). Total num frames: 8807006208. Throughput: 0: 60942.9. Samples: 1712213040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:34,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 01:36:34,409][54818] Updated weights for policy 0, policy_version 537538 (0.0020) [2024-04-28 01:36:36,863][54818] Updated weights for policy 0, policy_version 537548 (0.0018) [2024-04-28 01:36:39,253][54587] Fps is (10 sec: 58983.1, 60 sec: 60893.9, 300 sec: 60870.7). Total num frames: 8807301120. Throughput: 0: 60777.4. Samples: 1712572740. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:39,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 01:36:39,629][54818] Updated weights for policy 0, policy_version 537558 (0.0017) [2024-04-28 01:36:42,447][54818] Updated weights for policy 0, policy_version 537568 (0.0017) [2024-04-28 01:36:44,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8807628800. Throughput: 0: 61094.0. Samples: 1712763400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:44,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 01:36:44,989][54818] Updated weights for policy 0, policy_version 537578 (0.0018) [2024-04-28 01:36:47,921][54818] Updated weights for policy 0, policy_version 537588 (0.0016) [2024-04-28 01:36:49,253][54587] Fps is (10 sec: 62257.8, 60 sec: 60893.6, 300 sec: 60926.2). Total num frames: 8807923712. Throughput: 0: 61103.7. Samples: 1713130300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:49,255][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 01:36:50,355][54818] Updated weights for policy 0, policy_version 537598 (0.0021) [2024-04-28 01:36:53,137][54818] Updated weights for policy 0, policy_version 537608 (0.0015) [2024-04-28 01:36:54,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61167.0, 300 sec: 60981.8). Total num frames: 8808235008. Throughput: 0: 60807.5. Samples: 1713484040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:54,255][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 01:36:55,470][54818] Updated weights for policy 0, policy_version 537618 (0.0018) [2024-04-28 01:36:58,380][54818] Updated weights for policy 0, policy_version 537628 (0.0016) [2024-04-28 01:36:58,403][54798] Signal inference workers to stop experience collection... (27700 times) [2024-04-28 01:36:58,407][54798] Signal inference workers to resume experience collection... 
(27700 times) [2024-04-28 01:36:58,419][54818] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-04-28 01:36:58,419][54818] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-04-28 01:36:59,253][54587] Fps is (10 sec: 60622.4, 60 sec: 60894.0, 300 sec: 60926.3). Total num frames: 8808529920. Throughput: 0: 60967.7. Samples: 1713672760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:36:59,253][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 01:37:00,778][54818] Updated weights for policy 0, policy_version 537638 (0.0017) [2024-04-28 01:37:03,726][54818] Updated weights for policy 0, policy_version 537648 (0.0017) [2024-04-28 01:37:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 60926.3). Total num frames: 8808841216. Throughput: 0: 60690.2. Samples: 1714038600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:37:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 01:37:06,251][54818] Updated weights for policy 0, policy_version 537658 (0.0019) [2024-04-28 01:37:09,253][54587] Fps is (10 sec: 60619.8, 60 sec: 60893.7, 300 sec: 60870.7). Total num frames: 8809136128. Throughput: 0: 60755.6. Samples: 1714400200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:37:09,255][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:37:09,332][54818] Updated weights for policy 0, policy_version 537668 (0.0017) [2024-04-28 01:37:12,088][54818] Updated weights for policy 0, policy_version 537678 (0.0016) [2024-04-28 01:37:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60894.0, 300 sec: 60926.3). Total num frames: 8809447424. Throughput: 0: 60950.8. Samples: 1714585140. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:37:14,253][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 01:37:14,593][54818] Updated weights for policy 0, policy_version 537688 (0.0017) [2024-04-28 01:37:17,303][54818] Updated weights for policy 0, policy_version 537698 (0.0016) [2024-04-28 01:37:19,253][54587] Fps is (10 sec: 62259.9, 60 sec: 60893.9, 300 sec: 60926.3). Total num frames: 8809758720. Throughput: 0: 60959.3. Samples: 1714956200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:37:19,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 01:37:19,949][54818] Updated weights for policy 0, policy_version 537708 (0.0017) [2024-04-28 01:37:22,624][54818] Updated weights for policy 0, policy_version 537718 (0.0015) [2024-04-28 01:37:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60894.1, 300 sec: 60926.3). Total num frames: 8810053632. Throughput: 0: 61168.5. Samples: 1715325320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:37:24,253][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 01:37:25,181][54818] Updated weights for policy 0, policy_version 537728 (0.0016) [2024-04-28 01:37:27,954][54818] Updated weights for policy 0, policy_version 537738 (0.0020) [2024-04-28 01:37:29,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60894.0, 300 sec: 60926.3). Total num frames: 8810364928. Throughput: 0: 60883.0. Samples: 1715503120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:37:29,253][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 01:37:30,360][54818] Updated weights for policy 0, policy_version 537748 (0.0017) [2024-04-28 01:37:33,371][54818] Updated weights for policy 0, policy_version 537758 (0.0017) [2024-04-28 01:37:34,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61167.0, 300 sec: 60981.8). Total num frames: 8810676224. Throughput: 0: 60851.2. Samples: 1715868600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 01:37:34,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 01:37:35,661][54818] Updated weights for policy 0, policy_version 537768 (0.0016) [2024-04-28 01:37:38,737][54818] Updated weights for policy 0, policy_version 537778 (0.0016) [2024-04-28 01:37:39,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 60981.8). Total num frames: 8810987520. Throughput: 0: 61173.2. Samples: 1716236840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:37:39,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 01:37:41,033][54818] Updated weights for policy 0, policy_version 537788 (0.0016) [2024-04-28 01:37:44,180][54818] Updated weights for policy 0, policy_version 537798 (0.0015) [2024-04-28 01:37:44,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60894.1, 300 sec: 60981.8). Total num frames: 8811282432. Throughput: 0: 60936.4. Samples: 1716414900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:37:44,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-28 01:37:46,523][54818] Updated weights for policy 0, policy_version 537808 (0.0016) [2024-04-28 01:37:49,234][54798] Signal inference workers to stop experience collection... (27750 times) [2024-04-28 01:37:49,237][54798] Signal inference workers to resume experience collection... (27750 times) [2024-04-28 01:37:49,248][54818] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-04-28 01:37:49,248][54818] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-04-28 01:37:49,253][54587] Fps is (10 sec: 58982.8, 60 sec: 60894.0, 300 sec: 60926.3). Total num frames: 8811577344. Throughput: 0: 61001.3. Samples: 1716783660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:37:49,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-28 01:37:49,350][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000537817_8811593728.pth... 
[2024-04-28 01:37:49,403][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000536923_8796946432.pth [2024-04-28 01:37:49,632][54818] Updated weights for policy 0, policy_version 537818 (0.0017) [2024-04-28 01:37:52,050][54818] Updated weights for policy 0, policy_version 537828 (0.0016) [2024-04-28 01:37:54,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60893.8, 300 sec: 60926.3). Total num frames: 8811888640. Throughput: 0: 61139.6. Samples: 1717151480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:37:54,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 01:37:54,875][54818] Updated weights for policy 0, policy_version 537838 (0.0016) [2024-04-28 01:37:57,447][54818] Updated weights for policy 0, policy_version 537848 (0.0020) [2024-04-28 01:37:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.7, 300 sec: 60870.7). Total num frames: 8812183552. Throughput: 0: 61002.4. Samples: 1717330260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:37:59,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 01:38:00,350][54818] Updated weights for policy 0, policy_version 537858 (0.0017) [2024-04-28 01:38:02,875][54818] Updated weights for policy 0, policy_version 537868 (0.0016) [2024-04-28 01:38:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.8, 300 sec: 60926.2). Total num frames: 8812494848. Throughput: 0: 60868.3. Samples: 1717695280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:04,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 01:38:05,519][54818] Updated weights for policy 0, policy_version 537878 (0.0018) [2024-04-28 01:38:08,184][54818] Updated weights for policy 0, policy_version 537888 (0.0016) [2024-04-28 01:38:09,253][54587] Fps is (10 sec: 62260.4, 60 sec: 61167.1, 300 sec: 60926.3). Total num frames: 8812806144. Throughput: 0: 60755.5. Samples: 1718059320. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:09,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 01:38:10,884][54818] Updated weights for policy 0, policy_version 537898 (0.0017) [2024-04-28 01:38:13,503][54818] Updated weights for policy 0, policy_version 537908 (0.0015) [2024-04-28 01:38:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8813117440. Throughput: 0: 60907.9. Samples: 1718243980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:14,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 01:38:16,130][54818] Updated weights for policy 0, policy_version 537918 (0.0016) [2024-04-28 01:38:18,726][54818] Updated weights for policy 0, policy_version 537928 (0.0018) [2024-04-28 01:38:19,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.8, 300 sec: 60926.3). Total num frames: 8813412352. Throughput: 0: 61083.2. Samples: 1718617340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:19,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-28 01:38:21,693][54818] Updated weights for policy 0, policy_version 537938 (0.0017) [2024-04-28 01:38:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 60926.3). Total num frames: 8813723648. Throughput: 0: 60853.1. Samples: 1718975220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:24,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 01:38:24,350][54818] Updated weights for policy 0, policy_version 537948 (0.0016) [2024-04-28 01:38:27,029][54818] Updated weights for policy 0, policy_version 537958 (0.0015) [2024-04-28 01:38:29,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.8, 300 sec: 60981.8). Total num frames: 8814034944. Throughput: 0: 61025.6. Samples: 1719161060. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:38:29,470][54818] Updated weights for policy 0, policy_version 537968 (0.0015) [2024-04-28 01:38:32,356][54818] Updated weights for policy 0, policy_version 537978 (0.0017) [2024-04-28 01:38:34,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8814346240. Throughput: 0: 60984.4. Samples: 1719527960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:34,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 01:38:34,867][54818] Updated weights for policy 0, policy_version 537988 (0.0016) [2024-04-28 01:38:37,715][54818] Updated weights for policy 0, policy_version 537998 (0.0017) [2024-04-28 01:38:39,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60894.0, 300 sec: 60981.8). Total num frames: 8814641152. Throughput: 0: 60889.9. Samples: 1719891520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:39,253][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 01:38:40,639][54818] Updated weights for policy 0, policy_version 538008 (0.0016) [2024-04-28 01:38:41,631][54798] Signal inference workers to stop experience collection... (27800 times) [2024-04-28 01:38:41,634][54798] Signal inference workers to resume experience collection... (27800 times) [2024-04-28 01:38:41,649][54818] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-04-28 01:38:41,650][54818] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-04-28 01:38:42,975][54818] Updated weights for policy 0, policy_version 538018 (0.0017) [2024-04-28 01:38:44,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.0, 300 sec: 61037.4). Total num frames: 8814952448. Throughput: 0: 61038.5. Samples: 1720076980. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:44,253][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 01:38:46,153][54818] Updated weights for policy 0, policy_version 538028 (0.0018) [2024-04-28 01:38:48,323][54818] Updated weights for policy 0, policy_version 538038 (0.0015) [2024-04-28 01:38:49,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 61037.3). Total num frames: 8815263744. Throughput: 0: 61068.1. Samples: 1720443340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:49,255][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 01:38:49,261][54587] No heartbeat for components: RolloutWorker_w4 (15277 seconds), RolloutWorker_w5 (1377 seconds) [2024-04-28 01:38:51,328][54818] Updated weights for policy 0, policy_version 538048 (0.0017) [2024-04-28 01:38:53,786][54818] Updated weights for policy 0, policy_version 538058 (0.0015) [2024-04-28 01:38:54,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61440.0, 300 sec: 61037.3). Total num frames: 8815575040. Throughput: 0: 61162.9. Samples: 1720811660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:54,255][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 01:38:56,593][54818] Updated weights for policy 0, policy_version 538068 (0.0017) [2024-04-28 01:38:59,156][54818] Updated weights for policy 0, policy_version 538078 (0.0015) [2024-04-28 01:38:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.1, 300 sec: 61037.3). Total num frames: 8815869952. Throughput: 0: 61119.5. Samples: 1720994360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:38:59,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 01:39:01,899][54818] Updated weights for policy 0, policy_version 538088 (0.0016) [2024-04-28 01:39:04,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61440.1, 300 sec: 61037.4). Total num frames: 8816181248. Throughput: 0: 60853.5. Samples: 1721355740. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 01:39:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 01:39:04,288][54818] Updated weights for policy 0, policy_version 538098 (0.0017) [2024-04-28 01:39:07,130][54818] Updated weights for policy 0, policy_version 538108 (0.0019) [2024-04-28 01:39:09,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8816492544. Throughput: 0: 61190.7. Samples: 1721728800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:09,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:39:09,919][54818] Updated weights for policy 0, policy_version 538118 (0.0016) [2024-04-28 01:39:12,428][54818] Updated weights for policy 0, policy_version 538128 (0.0022) [2024-04-28 01:39:14,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61166.8, 300 sec: 61037.3). Total num frames: 8816787456. Throughput: 0: 61058.2. Samples: 1721908680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:14,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 01:39:15,953][54818] Updated weights for policy 0, policy_version 538138 (0.0016) [2024-04-28 01:39:17,887][54818] Updated weights for policy 0, policy_version 538148 (0.0015) [2024-04-28 01:39:19,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8817098752. Throughput: 0: 60869.4. Samples: 1722267080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:19,255][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 01:39:21,076][54818] Updated weights for policy 0, policy_version 538158 (0.0018) [2024-04-28 01:39:21,436][54798] Signal inference workers to stop experience collection... (27850 times) [2024-04-28 01:39:21,467][54818] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-04-28 01:39:21,497][54798] Signal inference workers to resume experience collection... 
(27850 times) [2024-04-28 01:39:21,497][54818] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-04-28 01:39:23,543][54818] Updated weights for policy 0, policy_version 538168 (0.0018) [2024-04-28 01:39:24,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61166.8, 300 sec: 60981.8). Total num frames: 8817393664. Throughput: 0: 61036.8. Samples: 1722638180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:24,255][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 01:39:26,527][54818] Updated weights for policy 0, policy_version 538178 (0.0019) [2024-04-28 01:39:28,824][54818] Updated weights for policy 0, policy_version 538188 (0.0017) [2024-04-28 01:39:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 60981.8). Total num frames: 8817704960. Throughput: 0: 61128.7. Samples: 1722827780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:29,255][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 01:39:31,720][54818] Updated weights for policy 0, policy_version 538198 (0.0019) [2024-04-28 01:39:34,011][54818] Updated weights for policy 0, policy_version 538208 (0.0016) [2024-04-28 01:39:34,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60894.1, 300 sec: 60981.9). Total num frames: 8817999872. Throughput: 0: 60905.1. Samples: 1723184060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:34,253][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 01:39:36,943][54818] Updated weights for policy 0, policy_version 538218 (0.0016) [2024-04-28 01:39:39,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61037.9). Total num frames: 8818311168. Throughput: 0: 60947.1. Samples: 1723554280. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:39,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 01:39:39,463][54818] Updated weights for policy 0, policy_version 538228 (0.0016) [2024-04-28 01:39:42,062][54818] Updated weights for policy 0, policy_version 538238 (0.0015) [2024-04-28 01:39:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 8818606080. Throughput: 0: 61157.4. Samples: 1723746440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:44,253][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:39:44,789][54818] Updated weights for policy 0, policy_version 538248 (0.0018) [2024-04-28 01:39:47,328][54818] Updated weights for policy 0, policy_version 538258 (0.0017) [2024-04-28 01:39:49,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.9, 300 sec: 61037.4). Total num frames: 8818917376. Throughput: 0: 61067.1. Samples: 1724103760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:49,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 01:39:49,342][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000538265_8818933760.pth... [2024-04-28 01:39:49,394][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000537371_8804286464.pth [2024-04-28 01:39:50,187][54818] Updated weights for policy 0, policy_version 538268 (0.0018) [2024-04-28 01:39:52,653][54818] Updated weights for policy 0, policy_version 538278 (0.0017) [2024-04-28 01:39:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60620.9, 300 sec: 61037.4). Total num frames: 8819212288. Throughput: 0: 60852.5. Samples: 1724467160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:54,253][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 01:39:55,871][54818] Updated weights for policy 0, policy_version 538288 (0.0016) [2024-04-28 01:39:56,170][54798] Signal inference workers to stop experience collection... 
(27900 times) [2024-04-28 01:39:56,171][54798] Signal inference workers to resume experience collection... (27900 times) [2024-04-28 01:39:56,178][54818] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-04-28 01:39:56,188][54818] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-04-28 01:39:57,883][54818] Updated weights for policy 0, policy_version 538298 (0.0018) [2024-04-28 01:39:59,254][54587] Fps is (10 sec: 60619.6, 60 sec: 60893.7, 300 sec: 61092.8). Total num frames: 8819523584. Throughput: 0: 61195.1. Samples: 1724662460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:39:59,255][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 01:40:01,156][54818] Updated weights for policy 0, policy_version 538308 (0.0016) [2024-04-28 01:40:03,306][54818] Updated weights for policy 0, policy_version 538318 (0.0018) [2024-04-28 01:40:04,253][54587] Fps is (10 sec: 62259.4, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8819834880. Throughput: 0: 61063.8. Samples: 1725014940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:40:04,253][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 01:40:06,585][54818] Updated weights for policy 0, policy_version 538328 (0.0016) [2024-04-28 01:40:08,854][54818] Updated weights for policy 0, policy_version 538338 (0.0020) [2024-04-28 01:40:09,253][54587] Fps is (10 sec: 60621.9, 60 sec: 60620.7, 300 sec: 60981.8). Total num frames: 8820129792. Throughput: 0: 60973.4. Samples: 1725381980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:40:09,255][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 01:40:11,937][54818] Updated weights for policy 0, policy_version 538348 (0.0018) [2024-04-28 01:40:14,253][54587] Fps is (10 sec: 60620.0, 60 sec: 60894.0, 300 sec: 61037.3). Total num frames: 8820441088. Throughput: 0: 61149.4. Samples: 1725579500. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:40:14,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 01:40:14,525][54818] Updated weights for policy 0, policy_version 538358 (0.0018) [2024-04-28 01:40:17,235][54818] Updated weights for policy 0, policy_version 538368 (0.0016) [2024-04-28 01:40:19,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60620.9, 300 sec: 61037.3). Total num frames: 8820736000. Throughput: 0: 61030.1. Samples: 1725930420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:40:19,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 01:40:19,908][54818] Updated weights for policy 0, policy_version 538378 (0.0016) [2024-04-28 01:40:22,605][54818] Updated weights for policy 0, policy_version 538388 (0.0017) [2024-04-28 01:40:24,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61166.9, 300 sec: 61092.8). Total num frames: 8821063680. Throughput: 0: 60924.8. Samples: 1726295900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:40:24,254][54587] Avg episode reward: [(0, '0.689')] [2024-04-28 01:40:25,366][54818] Updated weights for policy 0, policy_version 538398 (0.0017) [2024-04-28 01:40:27,723][54818] Updated weights for policy 0, policy_version 538408 (0.0021) [2024-04-28 01:40:29,253][54587] Fps is (10 sec: 62259.4, 60 sec: 60894.0, 300 sec: 61037.3). Total num frames: 8821358592. Throughput: 0: 61024.9. Samples: 1726492560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:40:29,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 01:40:30,621][54818] Updated weights for policy 0, policy_version 538418 (0.0016) [2024-04-28 01:40:32,879][54818] Updated weights for policy 0, policy_version 538428 (0.0018) [2024-04-28 01:40:34,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 8821669888. Throughput: 0: 61055.1. Samples: 1726851240. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 01:40:34,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 01:40:36,220][54818] Updated weights for policy 0, policy_version 538438 (0.0016) [2024-04-28 01:40:36,235][54798] Signal inference workers to stop experience collection... (27950 times) [2024-04-28 01:40:36,236][54798] Signal inference workers to resume experience collection... (27950 times) [2024-04-28 01:40:36,251][54818] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-04-28 01:40:36,251][54818] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-04-28 01:40:38,173][54818] Updated weights for policy 0, policy_version 538448 (0.0018) [2024-04-28 01:40:39,256][54587] Fps is (10 sec: 62240.6, 60 sec: 61164.0, 300 sec: 61092.3). Total num frames: 8821981184. Throughput: 0: 61095.9. Samples: 1727216660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:40:39,257][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 01:40:41,219][54818] Updated weights for policy 0, policy_version 538458 (0.0017) [2024-04-28 01:40:43,403][54818] Updated weights for policy 0, policy_version 538468 (0.0017) [2024-04-28 01:40:44,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.8, 300 sec: 61092.8). Total num frames: 8822292480. Throughput: 0: 61243.6. Samples: 1727418420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:40:44,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:40:46,734][54818] Updated weights for policy 0, policy_version 538478 (0.0016) [2024-04-28 01:40:48,955][54818] Updated weights for policy 0, policy_version 538488 (0.0015) [2024-04-28 01:40:49,253][54587] Fps is (10 sec: 60638.5, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8822587392. Throughput: 0: 61407.8. Samples: 1727778300. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:40:49,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:40:52,020][54818] Updated weights for policy 0, policy_version 538498 (0.0016) [2024-04-28 01:40:54,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61439.8, 300 sec: 61092.9). Total num frames: 8822898688. Throughput: 0: 61383.5. Samples: 1728144240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:40:54,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 01:40:54,493][54818] Updated weights for policy 0, policy_version 538508 (0.0016) [2024-04-28 01:40:57,174][54818] Updated weights for policy 0, policy_version 538518 (0.0017) [2024-04-28 01:40:59,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 8823209984. Throughput: 0: 61247.4. Samples: 1728335640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:40:59,255][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 01:40:59,789][54818] Updated weights for policy 0, policy_version 538528 (0.0017) [2024-04-28 01:41:02,438][54818] Updated weights for policy 0, policy_version 538538 (0.0016) [2024-04-28 01:41:04,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 8823504896. Throughput: 0: 61621.7. Samples: 1728703400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:04,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 01:41:04,967][54818] Updated weights for policy 0, policy_version 538548 (0.0016) [2024-04-28 01:41:07,791][54818] Updated weights for policy 0, policy_version 538558 (0.0017) [2024-04-28 01:41:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 8823816192. Throughput: 0: 61439.5. Samples: 1729060680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:09,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 01:41:10,353][54818] Updated weights for policy 0, policy_version 538568 (0.0016) [2024-04-28 01:41:13,236][54818] Updated weights for policy 0, policy_version 538578 (0.0016) [2024-04-28 01:41:14,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.8, 300 sec: 61037.3). Total num frames: 8824111104. Throughput: 0: 61174.8. Samples: 1729245440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:14,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 01:41:15,610][54818] Updated weights for policy 0, policy_version 538588 (0.0020) [2024-04-28 01:41:18,640][54818] Updated weights for policy 0, policy_version 538598 (0.0019) [2024-04-28 01:41:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 8824422400. Throughput: 0: 61396.4. Samples: 1729614080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:19,255][54587] Avg episode reward: [(0, '0.540')] [2024-04-28 01:41:20,586][54798] Signal inference workers to stop experience collection... (28000 times) [2024-04-28 01:41:20,587][54798] Signal inference workers to resume experience collection... (28000 times) [2024-04-28 01:41:20,606][54818] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-04-28 01:41:20,606][54818] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-04-28 01:41:21,076][54818] Updated weights for policy 0, policy_version 538608 (0.0017) [2024-04-28 01:41:23,904][54818] Updated weights for policy 0, policy_version 538618 (0.0016) [2024-04-28 01:41:24,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 8824717312. Throughput: 0: 61570.2. Samples: 1729987140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:24,255][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 01:41:26,588][54818] Updated weights for policy 0, policy_version 538628 (0.0017) [2024-04-28 01:41:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.7, 300 sec: 61092.9). Total num frames: 8825028608. Throughput: 0: 60982.7. Samples: 1730162640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:29,255][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:41:29,425][54818] Updated weights for policy 0, policy_version 538638 (0.0017) [2024-04-28 01:41:31,914][54818] Updated weights for policy 0, policy_version 538648 (0.0016) [2024-04-28 01:41:34,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 8825339904. Throughput: 0: 60992.9. Samples: 1730522980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:34,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:41:34,735][54818] Updated weights for policy 0, policy_version 538658 (0.0016) [2024-04-28 01:41:37,421][54818] Updated weights for policy 0, policy_version 538668 (0.0017) [2024-04-28 01:41:39,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61169.9, 300 sec: 61092.9). Total num frames: 8825651200. Throughput: 0: 61258.8. Samples: 1730900880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:39,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 01:41:40,222][54818] Updated weights for policy 0, policy_version 538678 (0.0015) [2024-04-28 01:41:42,499][54818] Updated weights for policy 0, policy_version 538688 (0.0017) [2024-04-28 01:41:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60894.1, 300 sec: 61092.9). Total num frames: 8825946112. Throughput: 0: 60985.1. Samples: 1731079960. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:44,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 01:41:45,429][54818] Updated weights for policy 0, policy_version 538698 (0.0018) [2024-04-28 01:41:47,741][54818] Updated weights for policy 0, policy_version 538708 (0.0016) [2024-04-28 01:41:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8826257408. Throughput: 0: 60917.2. Samples: 1731444680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:49,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:41:49,264][54587] No heartbeat for components: RolloutWorker_w4 (15457 seconds), RolloutWorker_w5 (1557 seconds) [2024-04-28 01:41:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000538712_8826257408.pth... [2024-04-28 01:41:49,327][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000537817_8811593728.pth [2024-04-28 01:41:50,648][54818] Updated weights for policy 0, policy_version 538718 (0.0015) [2024-04-28 01:41:53,565][54818] Updated weights for policy 0, policy_version 538728 (0.0015) [2024-04-28 01:41:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8826552320. Throughput: 0: 61123.7. Samples: 1731811240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:54,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 01:41:56,021][54818] Updated weights for policy 0, policy_version 538738 (0.0016) [2024-04-28 01:41:58,758][54818] Updated weights for policy 0, policy_version 538748 (0.0017) [2024-04-28 01:41:59,253][54587] Fps is (10 sec: 58982.4, 60 sec: 60620.9, 300 sec: 61037.3). Total num frames: 8826847232. Throughput: 0: 61175.7. Samples: 1731998340. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:41:59,255][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:42:01,504][54818] Updated weights for policy 0, policy_version 538758 (0.0017) [2024-04-28 01:42:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8827158528. Throughput: 0: 61195.5. Samples: 1732367880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-04-28 01:42:04,255][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 01:42:04,346][54818] Updated weights for policy 0, policy_version 538768 (0.0015) [2024-04-28 01:42:06,834][54818] Updated weights for policy 0, policy_version 538778 (0.0017) [2024-04-28 01:42:09,253][54587] Fps is (10 sec: 62259.9, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8827469824. Throughput: 0: 60957.5. Samples: 1732730220. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:09,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 01:42:09,494][54818] Updated weights for policy 0, policy_version 538788 (0.0018) [2024-04-28 01:42:10,085][54798] Signal inference workers to stop experience collection... (28050 times) [2024-04-28 01:42:10,123][54818] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-04-28 01:42:10,141][54798] Signal inference workers to resume experience collection... (28050 times) [2024-04-28 01:42:10,142][54818] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-04-28 01:42:12,064][54818] Updated weights for policy 0, policy_version 538798 (0.0017) [2024-04-28 01:42:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8827781120. Throughput: 0: 61173.5. Samples: 1732915440. 
Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:14,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 01:42:14,784][54818] Updated weights for policy 0, policy_version 538808 (0.0017) [2024-04-28 01:42:17,458][54818] Updated weights for policy 0, policy_version 538818 (0.0015) [2024-04-28 01:42:19,253][54587] Fps is (10 sec: 58982.5, 60 sec: 60621.0, 300 sec: 61037.3). Total num frames: 8828059648. Throughput: 0: 61299.6. Samples: 1733281460. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:19,253][54587] Avg episode reward: [(0, '0.732')] [2024-04-28 01:42:20,032][54818] Updated weights for policy 0, policy_version 538828 (0.0017) [2024-04-28 01:42:22,994][54818] Updated weights for policy 0, policy_version 538838 (0.0015) [2024-04-28 01:42:24,253][54587] Fps is (10 sec: 58982.6, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 8828370944. Throughput: 0: 61092.0. Samples: 1733650020. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:24,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 01:42:25,237][54818] Updated weights for policy 0, policy_version 538848 (0.0018) [2024-04-28 01:42:28,325][54818] Updated weights for policy 0, policy_version 538858 (0.0015) [2024-04-28 01:42:29,253][54587] Fps is (10 sec: 62259.3, 60 sec: 60894.1, 300 sec: 61037.4). Total num frames: 8828682240. Throughput: 0: 61101.4. Samples: 1733829520. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:29,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:42:30,850][54818] Updated weights for policy 0, policy_version 538868 (0.0023) [2024-04-28 01:42:34,084][54818] Updated weights for policy 0, policy_version 538878 (0.0017) [2024-04-28 01:42:34,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.7, 300 sec: 60981.8). Total num frames: 8828977152. Throughput: 0: 61227.1. Samples: 1734199900. 
Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:34,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:42:36,235][54818] Updated weights for policy 0, policy_version 538888 (0.0017) [2024-04-28 01:42:39,253][54587] Fps is (10 sec: 60619.6, 60 sec: 60620.7, 300 sec: 61037.3). Total num frames: 8829288448. Throughput: 0: 61336.8. Samples: 1734571400. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:39,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 01:42:39,378][54818] Updated weights for policy 0, policy_version 538898 (0.0016) [2024-04-28 01:42:41,377][54818] Updated weights for policy 0, policy_version 538908 (0.0015) [2024-04-28 01:42:44,253][54587] Fps is (10 sec: 62260.3, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8829599744. Throughput: 0: 61027.8. Samples: 1734744580. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:44,253][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 01:42:44,479][54818] Updated weights for policy 0, policy_version 538918 (0.0021) [2024-04-28 01:42:46,728][54818] Updated weights for policy 0, policy_version 538928 (0.0016) [2024-04-28 01:42:49,253][54587] Fps is (10 sec: 62259.0, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8829911040. Throughput: 0: 61016.4. Samples: 1735113620. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:49,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 01:42:49,878][54818] Updated weights for policy 0, policy_version 538938 (0.0017) [2024-04-28 01:42:52,482][54818] Updated weights for policy 0, policy_version 538948 (0.0015) [2024-04-28 01:42:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8830205952. Throughput: 0: 61136.9. Samples: 1735481380. 
Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:54,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 01:42:55,227][54818] Updated weights for policy 0, policy_version 538958 (0.0016) [2024-04-28 01:42:56,803][54798] Signal inference workers to stop experience collection... (28100 times) [2024-04-28 01:42:56,807][54798] Signal inference workers to resume experience collection... (28100 times) [2024-04-28 01:42:56,820][54818] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-04-28 01:42:56,820][54818] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-04-28 01:42:57,746][54818] Updated weights for policy 0, policy_version 538968 (0.0015) [2024-04-28 01:42:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8830517248. Throughput: 0: 60896.0. Samples: 1735655760. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:42:59,258][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 01:43:00,417][54818] Updated weights for policy 0, policy_version 538978 (0.0016) [2024-04-28 01:43:03,348][54818] Updated weights for policy 0, policy_version 538988 (0.0017) [2024-04-28 01:43:04,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8830828544. Throughput: 0: 60927.4. Samples: 1736023200. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:43:04,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 01:43:05,764][54818] Updated weights for policy 0, policy_version 538998 (0.0015) [2024-04-28 01:43:08,792][54818] Updated weights for policy 0, policy_version 539008 (0.0017) [2024-04-28 01:43:09,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8831139840. Throughput: 0: 61088.0. Samples: 1736398980. 
Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:43:09,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 01:43:11,045][54818] Updated weights for policy 0, policy_version 539018 (0.0017) [2024-04-28 01:43:13,993][54818] Updated weights for policy 0, policy_version 539028 (0.0017) [2024-04-28 01:43:14,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8831434752. Throughput: 0: 60959.5. Samples: 1736572700. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:43:14,253][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 01:43:16,461][54818] Updated weights for policy 0, policy_version 539038 (0.0021) [2024-04-28 01:43:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8831746048. Throughput: 0: 60892.6. Samples: 1736940060. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:43:19,253][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 01:43:19,258][54818] Updated weights for policy 0, policy_version 539048 (0.0015) [2024-04-28 01:43:21,891][54818] Updated weights for policy 0, policy_version 539058 (0.0017) [2024-04-28 01:43:24,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8832057344. Throughput: 0: 60815.3. Samples: 1737308080. Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:43:24,253][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 01:43:24,821][54818] Updated weights for policy 0, policy_version 539068 (0.0017) [2024-04-28 01:43:27,065][54818] Updated weights for policy 0, policy_version 539078 (0.0015) [2024-04-28 01:43:29,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 8832368640. Throughput: 0: 60887.4. Samples: 1737484520. 
Policy #0 lag: (min: 2.0, avg: 9.6, max: 21.0) [2024-04-28 01:43:29,255][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 01:43:30,172][54818] Updated weights for policy 0, policy_version 539088 (0.0018) [2024-04-28 01:43:32,542][54818] Updated weights for policy 0, policy_version 539098 (0.0016) [2024-04-28 01:43:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 8832663552. Throughput: 0: 60934.4. Samples: 1737855660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:43:34,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 01:43:35,565][54818] Updated weights for policy 0, policy_version 539108 (0.0018) [2024-04-28 01:43:36,096][54798] Signal inference workers to stop experience collection... (28150 times) [2024-04-28 01:43:36,096][54798] Signal inference workers to resume experience collection... (28150 times) [2024-04-28 01:43:36,105][54818] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-04-28 01:43:36,105][54818] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-04-28 01:43:38,112][54818] Updated weights for policy 0, policy_version 539118 (0.0018) [2024-04-28 01:43:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 8832974848. Throughput: 0: 60986.5. Samples: 1738225780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:43:39,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:43:41,002][54818] Updated weights for policy 0, policy_version 539128 (0.0016) [2024-04-28 01:43:43,401][54818] Updated weights for policy 0, policy_version 539138 (0.0016) [2024-04-28 01:43:44,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.8, 300 sec: 61092.9). Total num frames: 8833286144. Throughput: 0: 61068.0. Samples: 1738403820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:43:44,255][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 01:43:46,188][54818] Updated weights for policy 0, policy_version 539148 (0.0015) [2024-04-28 01:43:48,738][54818] Updated weights for policy 0, policy_version 539158 (0.0017) [2024-04-28 01:43:49,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 8833597440. Throughput: 0: 61149.3. Samples: 1738774920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:43:49,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 01:43:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000539160_8833597440.pth... [2024-04-28 01:43:49,314][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000538265_8818933760.pth [2024-04-28 01:43:51,449][54818] Updated weights for policy 0, policy_version 539168 (0.0019) [2024-04-28 01:43:54,143][54818] Updated weights for policy 0, policy_version 539178 (0.0015) [2024-04-28 01:43:54,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 8833892352. Throughput: 0: 60925.4. Samples: 1739140620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:43:54,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 01:43:56,595][54818] Updated weights for policy 0, policy_version 539188 (0.0016) [2024-04-28 01:43:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8834203648. Throughput: 0: 61008.3. Samples: 1739318080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:43:59,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 01:43:59,807][54818] Updated weights for policy 0, policy_version 539198 (0.0017) [2024-04-28 01:44:01,972][54818] Updated weights for policy 0, policy_version 539208 (0.0017) [2024-04-28 01:44:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61167.0, 300 sec: 61037.3). 
Total num frames: 8834498560. Throughput: 0: 61143.0. Samples: 1739691500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:04,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 01:44:05,132][54818] Updated weights for policy 0, policy_version 539218 (0.0017) [2024-04-28 01:44:07,791][54818] Updated weights for policy 0, policy_version 539228 (0.0017) [2024-04-28 01:44:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8834809856. Throughput: 0: 61052.9. Samples: 1740055460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:09,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 01:44:10,370][54818] Updated weights for policy 0, policy_version 539238 (0.0017) [2024-04-28 01:44:13,267][54798] Signal inference workers to stop experience collection... (28200 times) [2024-04-28 01:44:13,295][54818] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-04-28 01:44:13,324][54798] Signal inference workers to resume experience collection... (28200 times) [2024-04-28 01:44:13,324][54818] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-04-28 01:44:13,327][54818] Updated weights for policy 0, policy_version 539248 (0.0017) [2024-04-28 01:44:14,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.8, 300 sec: 61037.3). Total num frames: 8835104768. Throughput: 0: 61064.8. Samples: 1740232440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:14,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 01:44:15,848][54818] Updated weights for policy 0, policy_version 539258 (0.0018) [2024-04-28 01:44:18,399][54818] Updated weights for policy 0, policy_version 539268 (0.0017) [2024-04-28 01:44:19,254][54587] Fps is (10 sec: 58981.2, 60 sec: 60893.6, 300 sec: 61037.3). Total num frames: 8835399680. Throughput: 0: 61051.3. Samples: 1740602980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:19,255][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 01:44:21,221][54818] Updated weights for policy 0, policy_version 539278 (0.0022) [2024-04-28 01:44:23,743][54818] Updated weights for policy 0, policy_version 539288 (0.0016) [2024-04-28 01:44:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 60620.7, 300 sec: 60981.8). Total num frames: 8835694592. Throughput: 0: 60963.5. Samples: 1740969140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:24,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-28 01:44:26,385][54818] Updated weights for policy 0, policy_version 539298 (0.0017) [2024-04-28 01:44:29,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60620.6, 300 sec: 61037.3). Total num frames: 8836005888. Throughput: 0: 60964.7. Samples: 1741147240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:29,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 01:44:29,361][54818] Updated weights for policy 0, policy_version 539308 (0.0017) [2024-04-28 01:44:31,664][54818] Updated weights for policy 0, policy_version 539318 (0.0019) [2024-04-28 01:44:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 8836317184. Throughput: 0: 60796.0. Samples: 1741510740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:34,255][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 01:44:34,786][54818] Updated weights for policy 0, policy_version 539328 (0.0016) [2024-04-28 01:44:36,875][54818] Updated weights for policy 0, policy_version 539338 (0.0016) [2024-04-28 01:44:39,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.7, 300 sec: 61092.8). Total num frames: 8836628480. Throughput: 0: 60859.3. Samples: 1741879300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:39,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-28 01:44:40,014][54818] Updated weights for policy 0, policy_version 539348 (0.0018) [2024-04-28 01:44:42,478][54818] Updated weights for policy 0, policy_version 539358 (0.0018) [2024-04-28 01:44:44,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60620.8, 300 sec: 61037.3). Total num frames: 8836923392. Throughput: 0: 60941.3. Samples: 1742060440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:44,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 01:44:45,466][54818] Updated weights for policy 0, policy_version 539368 (0.0016) [2024-04-28 01:44:48,010][54818] Updated weights for policy 0, policy_version 539378 (0.0018) [2024-04-28 01:44:49,253][54587] Fps is (10 sec: 60622.1, 60 sec: 60621.0, 300 sec: 61092.9). Total num frames: 8837234688. Throughput: 0: 60691.2. Samples: 1742422600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:49,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 01:44:49,259][54587] No heartbeat for components: RolloutWorker_w4 (15637 seconds), RolloutWorker_w5 (1737 seconds) [2024-04-28 01:44:50,967][54818] Updated weights for policy 0, policy_version 539388 (0.0016) [2024-04-28 01:44:51,220][54798] Signal inference workers to stop experience collection... (28250 times) [2024-04-28 01:44:51,222][54798] Signal inference workers to resume experience collection... (28250 times) [2024-04-28 01:44:51,232][54818] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-04-28 01:44:51,232][54818] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-04-28 01:44:53,419][54818] Updated weights for policy 0, policy_version 539398 (0.0015) [2024-04-28 01:44:54,253][54587] Fps is (10 sec: 62260.0, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8837545984. Throughput: 0: 60935.6. Samples: 1742797560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:54,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 01:44:56,193][54818] Updated weights for policy 0, policy_version 539408 (0.0018) [2024-04-28 01:44:58,756][54818] Updated weights for policy 0, policy_version 539418 (0.0017) [2024-04-28 01:44:59,253][54587] Fps is (10 sec: 62258.3, 60 sec: 60893.8, 300 sec: 61092.8). Total num frames: 8837857280. Throughput: 0: 60963.1. Samples: 1742975780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:44:59,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 01:45:01,437][54818] Updated weights for policy 0, policy_version 539428 (0.0019) [2024-04-28 01:45:04,057][54818] Updated weights for policy 0, policy_version 539438 (0.0015) [2024-04-28 01:45:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8838152192. Throughput: 0: 60908.8. Samples: 1743343860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:04,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 01:45:06,784][54818] Updated weights for policy 0, policy_version 539448 (0.0019) [2024-04-28 01:45:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8838463488. Throughput: 0: 60811.5. Samples: 1743705660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:09,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 01:45:09,373][54818] Updated weights for policy 0, policy_version 539458 (0.0016) [2024-04-28 01:45:12,167][54818] Updated weights for policy 0, policy_version 539468 (0.0016) [2024-04-28 01:45:14,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 8838774784. Throughput: 0: 61086.1. Samples: 1743896100. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:14,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 01:45:14,458][54818] Updated weights for policy 0, policy_version 539478 (0.0015) [2024-04-28 01:45:17,499][54818] Updated weights for policy 0, policy_version 539488 (0.0016) [2024-04-28 01:45:19,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.2, 300 sec: 61092.9). Total num frames: 8839086080. Throughput: 0: 61088.5. Samples: 1744259720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:19,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 01:45:20,076][54818] Updated weights for policy 0, policy_version 539498 (0.0017) [2024-04-28 01:45:23,059][54818] Updated weights for policy 0, policy_version 539508 (0.0016) [2024-04-28 01:45:24,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61439.9, 300 sec: 61092.8). Total num frames: 8839380992. Throughput: 0: 61012.5. Samples: 1744624860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:24,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 01:45:25,555][54818] Updated weights for policy 0, policy_version 539518 (0.0017) [2024-04-28 01:45:28,194][54818] Updated weights for policy 0, policy_version 539528 (0.0019) [2024-04-28 01:45:29,253][54587] Fps is (10 sec: 58981.9, 60 sec: 61167.0, 300 sec: 61037.3). Total num frames: 8839675904. Throughput: 0: 61068.8. Samples: 1744808540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:29,263][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 01:45:30,966][54818] Updated weights for policy 0, policy_version 539538 (0.0020) [2024-04-28 01:45:33,569][54818] Updated weights for policy 0, policy_version 539548 (0.0016) [2024-04-28 01:45:34,253][54587] Fps is (10 sec: 60622.1, 60 sec: 61167.1, 300 sec: 61038.0). Total num frames: 8839987200. Throughput: 0: 61263.6. Samples: 1745179460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:34,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 01:45:36,415][54818] Updated weights for policy 0, policy_version 539558 (0.0018) [2024-04-28 01:45:38,947][54818] Updated weights for policy 0, policy_version 539568 (0.0015) [2024-04-28 01:45:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60894.0, 300 sec: 60981.8). Total num frames: 8840282112. Throughput: 0: 61035.8. Samples: 1745544180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:39,255][54587] Avg episode reward: [(0, '0.540')] [2024-04-28 01:45:41,830][54818] Updated weights for policy 0, policy_version 539578 (0.0016) [2024-04-28 01:45:42,780][54798] Signal inference workers to stop experience collection... (28300 times) [2024-04-28 01:45:42,786][54798] Signal inference workers to resume experience collection... (28300 times) [2024-04-28 01:45:42,794][54818] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-04-28 01:45:42,794][54818] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-04-28 01:45:44,236][54818] Updated weights for policy 0, policy_version 539588 (0.0016) [2024-04-28 01:45:44,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8840609792. Throughput: 0: 61179.6. Samples: 1745728860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:44,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 01:45:47,023][54818] Updated weights for policy 0, policy_version 539598 (0.0017) [2024-04-28 01:45:49,254][54587] Fps is (10 sec: 62258.2, 60 sec: 61166.6, 300 sec: 61037.3). Total num frames: 8840904704. Throughput: 0: 61091.1. Samples: 1746092980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:49,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 01:45:49,305][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000539607_8840921088.pth... 
[2024-04-28 01:45:49,362][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000538712_8826257408.pth [2024-04-28 01:45:49,637][54818] Updated weights for policy 0, policy_version 539608 (0.0018) [2024-04-28 01:45:52,381][54818] Updated weights for policy 0, policy_version 539618 (0.0017) [2024-04-28 01:45:54,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61166.9, 300 sec: 61037.4). Total num frames: 8841216000. Throughput: 0: 61305.9. Samples: 1746464420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:54,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 01:45:55,023][54818] Updated weights for policy 0, policy_version 539628 (0.0015) [2024-04-28 01:45:57,529][54818] Updated weights for policy 0, policy_version 539638 (0.0016) [2024-04-28 01:45:59,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 8841510912. Throughput: 0: 61185.1. Samples: 1746649440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:45:59,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 01:46:00,238][54818] Updated weights for policy 0, policy_version 539648 (0.0017) [2024-04-28 01:46:02,891][54818] Updated weights for policy 0, policy_version 539658 (0.0017) [2024-04-28 01:46:04,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.8, 300 sec: 61037.4). Total num frames: 8841822208. Throughput: 0: 61097.7. Samples: 1747009120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:46:04,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 01:46:05,406][54818] Updated weights for policy 0, policy_version 539668 (0.0017) [2024-04-28 01:46:08,811][54818] Updated weights for policy 0, policy_version 539678 (0.0016) [2024-04-28 01:46:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.9, 300 sec: 61037.4). Total num frames: 8842117120. Throughput: 0: 61476.1. Samples: 1747391280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:46:09,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:46:10,695][54818] Updated weights for policy 0, policy_version 539688 (0.0016) [2024-04-28 01:46:13,908][54818] Updated weights for policy 0, policy_version 539698 (0.0017) [2024-04-28 01:46:14,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 61037.4). Total num frames: 8842428416. Throughput: 0: 61334.0. Samples: 1747568560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:46:14,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 01:46:16,556][54818] Updated weights for policy 0, policy_version 539708 (0.0018) [2024-04-28 01:46:19,166][54818] Updated weights for policy 0, policy_version 539718 (0.0018) [2024-04-28 01:46:19,253][54587] Fps is (10 sec: 62259.3, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8842739712. Throughput: 0: 61101.2. Samples: 1747929020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:46:19,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 01:46:21,912][54818] Updated weights for policy 0, policy_version 539728 (0.0016) [2024-04-28 01:46:24,253][54587] Fps is (10 sec: 60619.7, 60 sec: 60893.9, 300 sec: 61037.4). Total num frames: 8843034624. Throughput: 0: 61369.8. Samples: 1748305820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:46:24,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 01:46:24,441][54818] Updated weights for policy 0, policy_version 539738 (0.0016) [2024-04-28 01:46:24,923][54798] Signal inference workers to stop experience collection... (28350 times) [2024-04-28 01:46:24,923][54798] Signal inference workers to resume experience collection... 
(28350 times) [2024-04-28 01:46:24,931][54818] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-04-28 01:46:24,931][54818] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-04-28 01:46:26,935][54818] Updated weights for policy 0, policy_version 539748 (0.0016) [2024-04-28 01:46:29,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.1, 300 sec: 61037.4). Total num frames: 8843345920. Throughput: 0: 61328.2. Samples: 1748488620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 01:46:29,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 01:46:29,754][54818] Updated weights for policy 0, policy_version 539758 (0.0016) [2024-04-28 01:46:32,303][54818] Updated weights for policy 0, policy_version 539768 (0.0016) [2024-04-28 01:46:34,253][54587] Fps is (10 sec: 63897.5, 60 sec: 61439.8, 300 sec: 61092.9). Total num frames: 8843673600. Throughput: 0: 61278.4. Samples: 1748850500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:46:34,255][54587] Avg episode reward: [(0, '0.655')] [2024-04-28 01:46:34,939][54818] Updated weights for policy 0, policy_version 539778 (0.0017) [2024-04-28 01:46:37,490][54818] Updated weights for policy 0, policy_version 539788 (0.0017) [2024-04-28 01:46:39,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.9, 300 sec: 61092.8). Total num frames: 8843968512. Throughput: 0: 61371.3. Samples: 1749226140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:46:39,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 01:46:40,237][54818] Updated weights for policy 0, policy_version 539798 (0.0015) [2024-04-28 01:46:42,980][54818] Updated weights for policy 0, policy_version 539808 (0.0018) [2024-04-28 01:46:44,253][54587] Fps is (10 sec: 58983.3, 60 sec: 60894.0, 300 sec: 61037.4). Total num frames: 8844263424. Throughput: 0: 61386.0. Samples: 1749411800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:46:44,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-28 01:46:45,971][54818] Updated weights for policy 0, policy_version 539818 (0.0016) [2024-04-28 01:46:48,650][54818] Updated weights for policy 0, policy_version 539828 (0.0021) [2024-04-28 01:46:49,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 8844574720. Throughput: 0: 61403.6. Samples: 1749772280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:46:49,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 01:46:51,192][54818] Updated weights for policy 0, policy_version 539838 (0.0016) [2024-04-28 01:46:54,051][54818] Updated weights for policy 0, policy_version 539848 (0.0019) [2024-04-28 01:46:54,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 8844886016. Throughput: 0: 61146.7. Samples: 1750142880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:46:54,255][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 01:46:56,369][54818] Updated weights for policy 0, policy_version 539858 (0.0019) [2024-04-28 01:46:59,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8845180928. Throughput: 0: 61329.6. Samples: 1750328400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:46:59,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 01:46:59,322][54818] Updated weights for policy 0, policy_version 539868 (0.0018) [2024-04-28 01:47:01,749][54818] Updated weights for policy 0, policy_version 539878 (0.0017) [2024-04-28 01:47:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 8845475840. Throughput: 0: 61419.5. Samples: 1750692900. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:04,255][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 01:47:04,657][54818] Updated weights for policy 0, policy_version 539888 (0.0016) [2024-04-28 01:47:07,209][54818] Updated weights for policy 0, policy_version 539898 (0.0019) [2024-04-28 01:47:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 61037.3). Total num frames: 8845787136. Throughput: 0: 61209.8. Samples: 1751060260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:09,255][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 01:47:09,627][54798] Signal inference workers to stop experience collection... (28400 times) [2024-04-28 01:47:09,658][54818] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-04-28 01:47:09,686][54798] Signal inference workers to resume experience collection... (28400 times) [2024-04-28 01:47:09,686][54818] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-04-28 01:47:09,972][54818] Updated weights for policy 0, policy_version 539908 (0.0016) [2024-04-28 01:47:12,510][54818] Updated weights for policy 0, policy_version 539918 (0.0017) [2024-04-28 01:47:14,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61166.7, 300 sec: 61148.4). Total num frames: 8846098432. Throughput: 0: 61246.3. Samples: 1751244720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:14,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 01:47:15,181][54818] Updated weights for policy 0, policy_version 539928 (0.0016) [2024-04-28 01:47:17,801][54818] Updated weights for policy 0, policy_version 539938 (0.0018) [2024-04-28 01:47:19,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8846409728. Throughput: 0: 61332.9. Samples: 1751610480. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:19,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 01:47:20,512][54818] Updated weights for policy 0, policy_version 539948 (0.0016) [2024-04-28 01:47:23,058][54818] Updated weights for policy 0, policy_version 539958 (0.0017) [2024-04-28 01:47:24,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8846704640. Throughput: 0: 61178.8. Samples: 1751979180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:24,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 01:47:25,748][54818] Updated weights for policy 0, policy_version 539968 (0.0017) [2024-04-28 01:47:28,460][54818] Updated weights for policy 0, policy_version 539978 (0.0015) [2024-04-28 01:47:29,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61167.0, 300 sec: 61148.5). Total num frames: 8847015936. Throughput: 0: 61075.1. Samples: 1752160180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:29,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 01:47:31,043][54818] Updated weights for policy 0, policy_version 539988 (0.0016) [2024-04-28 01:47:34,096][54818] Updated weights for policy 0, policy_version 539998 (0.0015) [2024-04-28 01:47:34,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8847327232. Throughput: 0: 61150.7. Samples: 1752524060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:34,255][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 01:47:36,676][54818] Updated weights for policy 0, policy_version 540008 (0.0017) [2024-04-28 01:47:39,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 8847638528. Throughput: 0: 61196.8. Samples: 1752896740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:39,255][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 01:47:39,569][54818] Updated weights for policy 0, policy_version 540018 (0.0016) [2024-04-28 01:47:42,051][54818] Updated weights for policy 0, policy_version 540028 (0.0016) [2024-04-28 01:47:44,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 8847933440. Throughput: 0: 61068.8. Samples: 1753076500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:44,255][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 01:47:44,819][54818] Updated weights for policy 0, policy_version 540038 (0.0016) [2024-04-28 01:47:47,242][54818] Updated weights for policy 0, policy_version 540048 (0.0016) [2024-04-28 01:47:49,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 8848244736. Throughput: 0: 61125.4. Samples: 1753443540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:49,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 01:47:49,263][54587] No heartbeat for components: RolloutWorker_w4 (15817 seconds), RolloutWorker_w5 (1917 seconds) [2024-04-28 01:47:49,359][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000540055_8848261120.pth... [2024-04-28 01:47:49,411][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000539160_8833597440.pth [2024-04-28 01:47:50,122][54818] Updated weights for policy 0, policy_version 540058 (0.0016) [2024-04-28 01:47:52,615][54818] Updated weights for policy 0, policy_version 540068 (0.0015) [2024-04-28 01:47:54,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8848556032. Throughput: 0: 61050.6. Samples: 1753807540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:54,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-28 01:47:55,224][54798] Signal inference workers to stop experience collection... (28450 times) [2024-04-28 01:47:55,225][54798] Signal inference workers to resume experience collection... (28450 times) [2024-04-28 01:47:55,241][54818] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-04-28 01:47:55,241][54818] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-04-28 01:47:55,349][54818] Updated weights for policy 0, policy_version 540078 (0.0017) [2024-04-28 01:47:58,098][54818] Updated weights for policy 0, policy_version 540088 (0.0015) [2024-04-28 01:47:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8848867328. Throughput: 0: 61306.5. Samples: 1754003500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 01:47:59,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 01:48:00,746][54818] Updated weights for policy 0, policy_version 540098 (0.0017) [2024-04-28 01:48:03,264][54818] Updated weights for policy 0, policy_version 540108 (0.0017) [2024-04-28 01:48:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8849162240. Throughput: 0: 61113.8. Samples: 1754360600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:04,255][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 01:48:06,358][54818] Updated weights for policy 0, policy_version 540118 (0.0017) [2024-04-28 01:48:08,541][54818] Updated weights for policy 0, policy_version 540128 (0.0017) [2024-04-28 01:48:09,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8849473536. Throughput: 0: 61003.4. Samples: 1754724340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:09,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:48:11,764][54818] Updated weights for policy 0, policy_version 540138 (0.0016) [2024-04-28 01:48:13,742][54818] Updated weights for policy 0, policy_version 540148 (0.0021) [2024-04-28 01:48:14,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 8849784832. Throughput: 0: 61411.8. Samples: 1754923720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:14,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 01:48:17,020][54818] Updated weights for policy 0, policy_version 540158 (0.0015) [2024-04-28 01:48:19,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8850096128. Throughput: 0: 61403.5. Samples: 1755287220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:19,255][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 01:48:19,464][54818] Updated weights for policy 0, policy_version 540168 (0.0016) [2024-04-28 01:48:22,056][54818] Updated weights for policy 0, policy_version 540178 (0.0016) [2024-04-28 01:48:24,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.1, 300 sec: 61148.4). Total num frames: 8850407424. Throughput: 0: 61159.6. Samples: 1755648920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:24,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 01:48:24,725][54818] Updated weights for policy 0, policy_version 540188 (0.0018) [2024-04-28 01:48:27,302][54818] Updated weights for policy 0, policy_version 540198 (0.0017) [2024-04-28 01:48:29,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61712.9, 300 sec: 61203.9). Total num frames: 8850718720. Throughput: 0: 61395.1. Samples: 1755839280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:29,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 01:48:30,037][54818] Updated weights for policy 0, policy_version 540208 (0.0016) [2024-04-28 01:48:32,811][54818] Updated weights for policy 0, policy_version 540218 (0.0016) [2024-04-28 01:48:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8851013632. Throughput: 0: 61428.4. Samples: 1756207820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:34,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 01:48:35,396][54818] Updated weights for policy 0, policy_version 540228 (0.0016) [2024-04-28 01:48:36,255][54798] Signal inference workers to stop experience collection... (28500 times) [2024-04-28 01:48:36,299][54818] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-04-28 01:48:36,315][54798] Signal inference workers to resume experience collection... (28500 times) [2024-04-28 01:48:36,315][54818] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-04-28 01:48:38,184][54818] Updated weights for policy 0, policy_version 540238 (0.0017) [2024-04-28 01:48:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 8851324928. Throughput: 0: 61414.8. Samples: 1756571200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:39,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 01:48:40,739][54818] Updated weights for policy 0, policy_version 540248 (0.0016) [2024-04-28 01:48:43,394][54818] Updated weights for policy 0, policy_version 540258 (0.0016) [2024-04-28 01:48:44,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61713.2, 300 sec: 61148.5). Total num frames: 8851636224. Throughput: 0: 61181.0. Samples: 1756756640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:44,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 01:48:46,089][54818] Updated weights for policy 0, policy_version 540268 (0.0017) [2024-04-28 01:48:48,647][54818] Updated weights for policy 0, policy_version 540278 (0.0016) [2024-04-28 01:48:49,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61712.9, 300 sec: 61203.9). Total num frames: 8851947520. Throughput: 0: 61591.4. Samples: 1757132220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:49,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 01:48:51,214][54818] Updated weights for policy 0, policy_version 540288 (0.0017) [2024-04-28 01:48:53,817][54818] Updated weights for policy 0, policy_version 540298 (0.0018) [2024-04-28 01:48:54,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61713.1, 300 sec: 61204.0). Total num frames: 8852258816. Throughput: 0: 61523.7. Samples: 1757492900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:54,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 01:48:56,661][54818] Updated weights for policy 0, policy_version 540308 (0.0015) [2024-04-28 01:48:58,962][54818] Updated weights for policy 0, policy_version 540318 (0.0017) [2024-04-28 01:48:59,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 8852570112. Throughput: 0: 61210.1. Samples: 1757678180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:48:59,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 01:49:02,236][54818] Updated weights for policy 0, policy_version 540328 (0.0016) [2024-04-28 01:49:04,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61986.1, 300 sec: 61259.5). Total num frames: 8852881408. Throughput: 0: 61462.7. Samples: 1758053040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:49:04,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:49:04,872][54818] Updated weights for policy 0, policy_version 540338 (0.0016) [2024-04-28 01:49:07,461][54818] Updated weights for policy 0, policy_version 540348 (0.0017) [2024-04-28 01:49:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 8853176320. Throughput: 0: 61395.1. Samples: 1758411700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:49:09,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-28 01:49:10,246][54818] Updated weights for policy 0, policy_version 540358 (0.0018) [2024-04-28 01:49:12,909][54818] Updated weights for policy 0, policy_version 540368 (0.0016) [2024-04-28 01:49:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61713.1, 300 sec: 61315.1). Total num frames: 8853487616. Throughput: 0: 61410.7. Samples: 1758602760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:49:14,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 01:49:15,571][54818] Updated weights for policy 0, policy_version 540378 (0.0017) [2024-04-28 01:49:18,202][54818] Updated weights for policy 0, policy_version 540388 (0.0016) [2024-04-28 01:49:18,717][54798] Signal inference workers to stop experience collection... (28550 times) [2024-04-28 01:49:18,718][54798] Signal inference workers to resume experience collection... (28550 times) [2024-04-28 01:49:18,732][54818] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-04-28 01:49:18,732][54818] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-04-28 01:49:19,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61713.1, 300 sec: 61370.6). Total num frames: 8853798912. Throughput: 0: 61356.9. Samples: 1758968880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:49:19,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-28 01:49:20,739][54818] Updated weights for policy 0, policy_version 540398 (0.0015) [2024-04-28 01:49:23,401][54818] Updated weights for policy 0, policy_version 540408 (0.0018) [2024-04-28 01:49:24,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.1, 300 sec: 61370.6). Total num frames: 8854110208. Throughput: 0: 61215.1. Samples: 1759325880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:49:24,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 01:49:26,099][54818] Updated weights for policy 0, policy_version 540418 (0.0018) [2024-04-28 01:49:28,691][54818] Updated weights for policy 0, policy_version 540428 (0.0016) [2024-04-28 01:49:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 8854405120. Throughput: 0: 61360.3. Samples: 1759517860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:49:29,255][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 01:49:31,831][54818] Updated weights for policy 0, policy_version 540438 (0.0016) [2024-04-28 01:49:34,018][54818] Updated weights for policy 0, policy_version 540448 (0.0016) [2024-04-28 01:49:34,253][54587] Fps is (10 sec: 58981.5, 60 sec: 61439.8, 300 sec: 61259.5). Total num frames: 8854700032. Throughput: 0: 61247.1. Samples: 1759888340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:49:34,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 01:49:37,186][54818] Updated weights for policy 0, policy_version 540458 (0.0016) [2024-04-28 01:49:39,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 8855011328. Throughput: 0: 61196.3. Samples: 1760246740. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:49:39,255][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 01:49:39,384][54818] Updated weights for policy 0, policy_version 540468 (0.0018) [2024-04-28 01:49:42,471][54818] Updated weights for policy 0, policy_version 540478 (0.0016) [2024-04-28 01:49:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.7, 300 sec: 61259.5). Total num frames: 8855306240. Throughput: 0: 61132.4. Samples: 1760429140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:49:44,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 01:49:45,047][54818] Updated weights for policy 0, policy_version 540488 (0.0016) [2024-04-28 01:49:47,767][54818] Updated weights for policy 0, policy_version 540498 (0.0016) [2024-04-28 01:49:49,253][54587] Fps is (10 sec: 58983.6, 60 sec: 60894.1, 300 sec: 61204.0). Total num frames: 8855601152. Throughput: 0: 61053.2. Samples: 1760800420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:49:49,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 01:49:49,261][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000540504_8855617536.pth... [2024-04-28 01:49:49,299][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000539607_8840921088.pth [2024-04-28 01:49:50,250][54818] Updated weights for policy 0, policy_version 540508 (0.0016) [2024-04-28 01:49:53,086][54818] Updated weights for policy 0, policy_version 540518 (0.0017) [2024-04-28 01:49:54,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 8855912448. Throughput: 0: 61116.5. Samples: 1761161940. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:49:54,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 01:49:55,959][54818] Updated weights for policy 0, policy_version 540528 (0.0021) [2024-04-28 01:49:58,394][54818] Updated weights for policy 0, policy_version 540538 (0.0016) [2024-04-28 01:49:59,253][54587] Fps is (10 sec: 62257.9, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 8856223744. Throughput: 0: 60901.7. Samples: 1761343340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:49:59,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 01:50:01,103][54818] Updated weights for policy 0, policy_version 540548 (0.0016) [2024-04-28 01:50:03,888][54818] Updated weights for policy 0, policy_version 540558 (0.0017) [2024-04-28 01:50:04,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60621.0, 300 sec: 61204.0). Total num frames: 8856518656. Throughput: 0: 61038.3. Samples: 1761715600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:04,253][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 01:50:06,314][54818] Updated weights for policy 0, policy_version 540568 (0.0020) [2024-04-28 01:50:09,177][54818] Updated weights for policy 0, policy_version 540578 (0.0017) [2024-04-28 01:50:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61203.9). Total num frames: 8856829952. Throughput: 0: 61373.3. Samples: 1762087680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:09,255][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 01:50:11,608][54818] Updated weights for policy 0, policy_version 540588 (0.0022) [2024-04-28 01:50:14,253][54587] Fps is (10 sec: 60619.4, 60 sec: 60620.7, 300 sec: 61148.4). Total num frames: 8857124864. Throughput: 0: 60903.0. Samples: 1762258500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:14,262][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 01:50:14,263][54798] Signal inference workers to stop experience collection... 
(28600 times) [2024-04-28 01:50:14,305][54818] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-04-28 01:50:14,321][54798] Signal inference workers to resume experience collection... (28600 times) [2024-04-28 01:50:14,322][54818] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-04-28 01:50:14,444][54818] Updated weights for policy 0, policy_version 540598 (0.0016) [2024-04-28 01:50:17,216][54818] Updated weights for policy 0, policy_version 540608 (0.0017) [2024-04-28 01:50:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.7, 300 sec: 61204.0). Total num frames: 8857436160. Throughput: 0: 60944.1. Samples: 1762630820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:19,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 01:50:19,956][54818] Updated weights for policy 0, policy_version 540618 (0.0022) [2024-04-28 01:50:22,604][54818] Updated weights for policy 0, policy_version 540628 (0.0015) [2024-04-28 01:50:24,253][54587] Fps is (10 sec: 62260.4, 60 sec: 60620.8, 300 sec: 61259.5). Total num frames: 8857747456. Throughput: 0: 61298.8. Samples: 1763005180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:24,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 01:50:25,127][54818] Updated weights for policy 0, policy_version 540638 (0.0016) [2024-04-28 01:50:27,925][54818] Updated weights for policy 0, policy_version 540648 (0.0016) [2024-04-28 01:50:29,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60620.9, 300 sec: 61204.0). Total num frames: 8858042368. Throughput: 0: 61208.2. Samples: 1763183500. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:29,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 01:50:30,716][54818] Updated weights for policy 0, policy_version 540658 (0.0019) [2024-04-28 01:50:33,017][54818] Updated weights for policy 0, policy_version 540668 (0.0018) [2024-04-28 01:50:34,254][54587] Fps is (10 sec: 60615.8, 60 sec: 60893.2, 300 sec: 61259.3). Total num frames: 8858353664. Throughput: 0: 61190.8. Samples: 1763554060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:34,263][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 01:50:35,942][54818] Updated weights for policy 0, policy_version 540678 (0.0015) [2024-04-28 01:50:38,719][54818] Updated weights for policy 0, policy_version 540688 (0.0017) [2024-04-28 01:50:39,253][54587] Fps is (10 sec: 62258.0, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 8858664960. Throughput: 0: 61345.6. Samples: 1763922500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:39,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:50:41,175][54818] Updated weights for policy 0, policy_version 540698 (0.0020) [2024-04-28 01:50:43,994][54818] Updated weights for policy 0, policy_version 540708 (0.0017) [2024-04-28 01:50:44,253][54587] Fps is (10 sec: 60625.2, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 8858959872. Throughput: 0: 61192.0. Samples: 1764096980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:44,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 01:50:46,782][54818] Updated weights for policy 0, policy_version 540718 (0.0017) [2024-04-28 01:50:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.7, 300 sec: 61203.9). Total num frames: 8859271168. Throughput: 0: 61071.3. Samples: 1764463820. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:49,255][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 01:50:49,265][54587] No heartbeat for components: RolloutWorker_w4 (15997 seconds), RolloutWorker_w5 (2097 seconds) [2024-04-28 01:50:49,535][54818] Updated weights for policy 0, policy_version 540728 (0.0017) [2024-04-28 01:50:52,190][54818] Updated weights for policy 0, policy_version 540738 (0.0016) [2024-04-28 01:50:54,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 8859582464. Throughput: 0: 60937.3. Samples: 1764829860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:54,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-28 01:50:54,991][54818] Updated weights for policy 0, policy_version 540748 (0.0016) [2024-04-28 01:50:55,983][54798] Signal inference workers to stop experience collection... (28650 times) [2024-04-28 01:50:55,988][54798] Signal inference workers to resume experience collection... (28650 times) [2024-04-28 01:50:55,998][54818] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-04-28 01:50:56,007][54818] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-04-28 01:50:57,508][54818] Updated weights for policy 0, policy_version 540758 (0.0017) [2024-04-28 01:50:59,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8859893760. Throughput: 0: 61164.6. Samples: 1765010900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-04-28 01:50:59,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 01:51:00,376][54818] Updated weights for policy 0, policy_version 540768 (0.0016) [2024-04-28 01:51:02,950][54818] Updated weights for policy 0, policy_version 540778 (0.0016) [2024-04-28 01:51:04,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.7, 300 sec: 61259.5). Total num frames: 8860188672. Throughput: 0: 61026.6. Samples: 1765377020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:04,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 01:51:05,594][54818] Updated weights for policy 0, policy_version 540788 (0.0016) [2024-04-28 01:51:08,387][54818] Updated weights for policy 0, policy_version 540798 (0.0016) [2024-04-28 01:51:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8860499968. Throughput: 0: 60901.7. Samples: 1765745760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:09,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 01:51:10,805][54818] Updated weights for policy 0, policy_version 540808 (0.0016) [2024-04-28 01:51:13,581][54818] Updated weights for policy 0, policy_version 540818 (0.0016) [2024-04-28 01:51:14,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61203.9). Total num frames: 8860794880. Throughput: 0: 60972.7. Samples: 1765927280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:14,255][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 01:51:16,139][54818] Updated weights for policy 0, policy_version 540828 (0.0016) [2024-04-28 01:51:18,865][54818] Updated weights for policy 0, policy_version 540838 (0.0016) [2024-04-28 01:51:19,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 8861106176. Throughput: 0: 60867.2. Samples: 1766293040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 01:51:21,231][54818] Updated weights for policy 0, policy_version 540848 (0.0017) [2024-04-28 01:51:24,253][54587] Fps is (10 sec: 60621.8, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 8861401088. Throughput: 0: 60920.3. Samples: 1766663900. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:24,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 01:51:24,343][54818] Updated weights for policy 0, policy_version 540858 (0.0017) [2024-04-28 01:51:26,903][54818] Updated weights for policy 0, policy_version 540868 (0.0017) [2024-04-28 01:51:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 8861712384. Throughput: 0: 61053.7. Samples: 1766844400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:29,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 01:51:29,877][54818] Updated weights for policy 0, policy_version 540878 (0.0016) [2024-04-28 01:51:32,183][54818] Updated weights for policy 0, policy_version 540888 (0.0015) [2024-04-28 01:51:34,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61167.7, 300 sec: 61204.0). Total num frames: 8862023680. Throughput: 0: 60926.4. Samples: 1767205500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:34,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 01:51:35,228][54818] Updated weights for policy 0, policy_version 540898 (0.0017) [2024-04-28 01:51:37,469][54818] Updated weights for policy 0, policy_version 540908 (0.0015) [2024-04-28 01:51:39,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60894.0, 300 sec: 61203.9). Total num frames: 8862318592. Throughput: 0: 60944.5. Samples: 1767572360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:39,255][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:51:40,621][54818] Updated weights for policy 0, policy_version 540918 (0.0017) [2024-04-28 01:51:43,185][54818] Updated weights for policy 0, policy_version 540928 (0.0017) [2024-04-28 01:51:44,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 8862629888. Throughput: 0: 60983.0. Samples: 1767755140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:44,255][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 01:51:45,745][54798] Signal inference workers to stop experience collection... (28700 times) [2024-04-28 01:51:45,784][54818] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-04-28 01:51:45,802][54798] Signal inference workers to resume experience collection... (28700 times) [2024-04-28 01:51:45,802][54818] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-04-28 01:51:45,805][54818] Updated weights for policy 0, policy_version 540938 (0.0017) [2024-04-28 01:51:48,408][54818] Updated weights for policy 0, policy_version 540948 (0.0020) [2024-04-28 01:51:49,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8862941184. Throughput: 0: 61145.8. Samples: 1768128580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:49,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 01:51:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000540951_8862941184.pth... [2024-04-28 01:51:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000540055_8848261120.pth [2024-04-28 01:51:51,070][54818] Updated weights for policy 0, policy_version 540958 (0.0016) [2024-04-28 01:51:53,552][54818] Updated weights for policy 0, policy_version 540968 (0.0016) [2024-04-28 01:51:54,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8863252480. Throughput: 0: 60977.7. Samples: 1768489760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:54,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 01:51:56,272][54818] Updated weights for policy 0, policy_version 540978 (0.0021) [2024-04-28 01:51:58,888][54818] Updated weights for policy 0, policy_version 540988 (0.0016) [2024-04-28 01:51:59,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 8863547392. Throughput: 0: 61025.9. Samples: 1768673440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:51:59,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 01:52:01,692][54818] Updated weights for policy 0, policy_version 540998 (0.0016) [2024-04-28 01:52:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8863858688. Throughput: 0: 61164.5. Samples: 1769045440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:52:04,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 01:52:04,563][54818] Updated weights for policy 0, policy_version 541008 (0.0017) [2024-04-28 01:52:06,806][54818] Updated weights for policy 0, policy_version 541018 (0.0017) [2024-04-28 01:52:09,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8864169984. Throughput: 0: 61038.2. Samples: 1769410620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:52:09,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 01:52:09,795][54818] Updated weights for policy 0, policy_version 541028 (0.0016) [2024-04-28 01:52:12,345][54818] Updated weights for policy 0, policy_version 541038 (0.0016) [2024-04-28 01:52:14,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8864481280. Throughput: 0: 61168.9. Samples: 1769597000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:52:14,254][54587] Avg episode reward: [(0, '0.498')] [2024-04-28 01:52:15,335][54818] Updated weights for policy 0, policy_version 541048 (0.0015) [2024-04-28 01:52:17,791][54818] Updated weights for policy 0, policy_version 541058 (0.0023) [2024-04-28 01:52:19,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8864776192. Throughput: 0: 61302.7. Samples: 1769964120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:52:19,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 01:52:20,783][54818] Updated weights for policy 0, policy_version 541068 (0.0016) [2024-04-28 01:52:23,111][54818] Updated weights for policy 0, policy_version 541078 (0.0016) [2024-04-28 01:52:24,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61439.8, 300 sec: 61259.5). Total num frames: 8865087488. Throughput: 0: 61287.9. Samples: 1770330320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:52:24,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 01:52:25,975][54818] Updated weights for policy 0, policy_version 541088 (0.0016) [2024-04-28 01:52:28,573][54818] Updated weights for policy 0, policy_version 541098 (0.0017) [2024-04-28 01:52:29,255][54587] Fps is (10 sec: 60612.2, 60 sec: 61165.6, 300 sec: 61203.7). Total num frames: 8865382400. Throughput: 0: 61216.0. Samples: 1770509940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 01:52:29,256][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 01:52:31,264][54818] Updated weights for policy 0, policy_version 541108 (0.0016) [2024-04-28 01:52:34,111][54818] Updated weights for policy 0, policy_version 541118 (0.0018) [2024-04-28 01:52:34,254][54587] Fps is (10 sec: 58981.5, 60 sec: 60893.6, 300 sec: 61148.4). Total num frames: 8865677312. Throughput: 0: 61102.5. Samples: 1770878200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:52:34,255][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 01:52:34,281][54798] Signal inference workers to stop experience collection... (28750 times) [2024-04-28 01:52:34,281][54798] Signal inference workers to resume experience collection... (28750 times) [2024-04-28 01:52:34,289][54818] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-04-28 01:52:34,299][54818] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-04-28 01:52:36,678][54818] Updated weights for policy 0, policy_version 541128 (0.0016) [2024-04-28 01:52:39,253][54587] Fps is (10 sec: 60629.3, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8865988608. Throughput: 0: 61281.3. Samples: 1771247420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:52:39,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 01:52:39,304][54818] Updated weights for policy 0, policy_version 541138 (0.0016) [2024-04-28 01:52:41,898][54818] Updated weights for policy 0, policy_version 541148 (0.0016) [2024-04-28 01:52:44,253][54587] Fps is (10 sec: 62260.3, 60 sec: 61167.0, 300 sec: 61203.9). Total num frames: 8866299904. Throughput: 0: 61283.1. Samples: 1771431180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:52:44,255][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:52:44,518][54818] Updated weights for policy 0, policy_version 541158 (0.0020) [2024-04-28 01:52:47,194][54818] Updated weights for policy 0, policy_version 541168 (0.0016) [2024-04-28 01:52:49,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8866611200. Throughput: 0: 61090.8. Samples: 1771794520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:52:49,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:52:49,986][54818] Updated weights for policy 0, policy_version 541178 (0.0018) [2024-04-28 01:52:52,519][54818] Updated weights for policy 0, policy_version 541188 (0.0018) [2024-04-28 01:52:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 8866906112. Throughput: 0: 61149.1. Samples: 1772162340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:52:54,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 01:52:55,346][54818] Updated weights for policy 0, policy_version 541198 (0.0016) [2024-04-28 01:52:57,766][54818] Updated weights for policy 0, policy_version 541208 (0.0016) [2024-04-28 01:52:59,254][54587] Fps is (10 sec: 60619.3, 60 sec: 61166.7, 300 sec: 61203.9). Total num frames: 8867217408. Throughput: 0: 61219.8. Samples: 1772351900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:52:59,255][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 01:53:00,843][54818] Updated weights for policy 0, policy_version 541218 (0.0017) [2024-04-28 01:53:03,072][54818] Updated weights for policy 0, policy_version 541228 (0.0017) [2024-04-28 01:53:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 8867512320. Throughput: 0: 61057.6. Samples: 1772711720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:04,255][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 01:53:06,045][54818] Updated weights for policy 0, policy_version 541238 (0.0016) [2024-04-28 01:53:08,806][54818] Updated weights for policy 0, policy_version 541248 (0.0016) [2024-04-28 01:53:09,253][54587] Fps is (10 sec: 60621.9, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 8867823616. Throughput: 0: 61144.9. Samples: 1773081840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:09,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 01:53:11,745][54818] Updated weights for policy 0, policy_version 541258 (0.0016) [2024-04-28 01:53:13,905][54818] Updated weights for policy 0, policy_version 541268 (0.0016) [2024-04-28 01:53:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8868134912. Throughput: 0: 61425.9. Samples: 1773274020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:14,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 01:53:16,989][54818] Updated weights for policy 0, policy_version 541278 (0.0016) [2024-04-28 01:53:18,213][54798] Signal inference workers to stop experience collection... (28800 times) [2024-04-28 01:53:18,252][54818] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-04-28 01:53:18,271][54798] Signal inference workers to resume experience collection... (28800 times) [2024-04-28 01:53:18,272][54818] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-04-28 01:53:19,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8868446208. Throughput: 0: 61085.6. Samples: 1773627040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:19,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 01:53:19,548][54818] Updated weights for policy 0, policy_version 541288 (0.0018) [2024-04-28 01:53:22,116][54818] Updated weights for policy 0, policy_version 541298 (0.0016) [2024-04-28 01:53:24,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8868741120. Throughput: 0: 61100.1. Samples: 1773996920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:24,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:53:24,801][54818] Updated weights for policy 0, policy_version 541308 (0.0022) [2024-04-28 01:53:27,448][54818] Updated weights for policy 0, policy_version 541318 (0.0016) [2024-04-28 01:53:29,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61168.5, 300 sec: 61148.4). Total num frames: 8869052416. Throughput: 0: 61205.1. Samples: 1774185400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:29,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 01:53:30,300][54818] Updated weights for policy 0, policy_version 541328 (0.0015) [2024-04-28 01:53:32,742][54818] Updated weights for policy 0, policy_version 541338 (0.0017) [2024-04-28 01:53:34,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.2, 300 sec: 61148.4). Total num frames: 8869363712. Throughput: 0: 61144.4. Samples: 1774546020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:34,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 01:53:35,890][54818] Updated weights for policy 0, policy_version 541348 (0.0017) [2024-04-28 01:53:38,130][54818] Updated weights for policy 0, policy_version 541358 (0.0017) [2024-04-28 01:53:39,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8869675008. Throughput: 0: 61005.8. Samples: 1774907600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:39,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 01:53:41,099][54818] Updated weights for policy 0, policy_version 541368 (0.0016) [2024-04-28 01:53:43,399][54818] Updated weights for policy 0, policy_version 541378 (0.0017) [2024-04-28 01:53:44,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8869969920. Throughput: 0: 61115.5. Samples: 1775102080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:44,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:53:46,518][54818] Updated weights for policy 0, policy_version 541388 (0.0016) [2024-04-28 01:53:48,691][54818] Updated weights for policy 0, policy_version 541398 (0.0016) [2024-04-28 01:53:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.7, 300 sec: 61092.8). Total num frames: 8870281216. Throughput: 0: 61160.7. Samples: 1775463960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:49,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 01:53:49,265][54587] No heartbeat for components: RolloutWorker_w4 (16177 seconds), RolloutWorker_w5 (2277 seconds) [2024-04-28 01:53:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000541399_8870281216.pth... [2024-04-28 01:53:49,331][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000540504_8855617536.pth [2024-04-28 01:53:51,852][54818] Updated weights for policy 0, policy_version 541408 (0.0015) [2024-04-28 01:53:54,217][54818] Updated weights for policy 0, policy_version 541418 (0.0018) [2024-04-28 01:53:54,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 8870592512. Throughput: 0: 61052.9. Samples: 1775829220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:54,255][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 01:53:57,205][54818] Updated weights for policy 0, policy_version 541428 (0.0018) [2024-04-28 01:53:59,253][54587] Fps is (10 sec: 60622.4, 60 sec: 61167.2, 300 sec: 61037.4). Total num frames: 8870887424. Throughput: 0: 61017.9. Samples: 1776019820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 01:53:59,254][54587] Avg episode reward: [(0, '0.725')] [2024-04-28 01:53:59,678][54818] Updated weights for policy 0, policy_version 541438 (0.0017) [2024-04-28 01:54:02,559][54818] Updated weights for policy 0, policy_version 541448 (0.0016) [2024-04-28 01:54:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8871198720. Throughput: 0: 61219.0. Samples: 1776381900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:04,255][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 01:54:05,287][54818] Updated weights for policy 0, policy_version 541458 (0.0020) [2024-04-28 01:54:05,556][54798] Signal inference workers to stop experience collection... (28850 times) [2024-04-28 01:54:05,566][54798] Signal inference workers to resume experience collection... (28850 times) [2024-04-28 01:54:05,587][54818] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-04-28 01:54:05,587][54818] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-04-28 01:54:07,794][54818] Updated weights for policy 0, policy_version 541468 (0.0016) [2024-04-28 01:54:09,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 8871510016. Throughput: 0: 60918.8. Samples: 1776738280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:09,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 01:54:10,468][54818] Updated weights for policy 0, policy_version 541478 (0.0016) [2024-04-28 01:54:13,019][54818] Updated weights for policy 0, policy_version 541488 (0.0015) [2024-04-28 01:54:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61037.3). Total num frames: 8871804928. Throughput: 0: 61091.3. Samples: 1776934520. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:14,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 01:54:15,812][54818] Updated weights for policy 0, policy_version 541498 (0.0019) [2024-04-28 01:54:18,277][54818] Updated weights for policy 0, policy_version 541508 (0.0021) [2024-04-28 01:54:19,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61166.8, 300 sec: 61037.3). Total num frames: 8872116224. Throughput: 0: 61133.2. Samples: 1777297020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:19,255][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 01:54:21,383][54818] Updated weights for policy 0, policy_version 541518 (0.0016) [2024-04-28 01:54:23,722][54818] Updated weights for policy 0, policy_version 541528 (0.0018) [2024-04-28 01:54:24,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 8872427520. Throughput: 0: 61149.9. Samples: 1777659340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:24,255][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 01:54:26,738][54818] Updated weights for policy 0, policy_version 541538 (0.0017) [2024-04-28 01:54:29,049][54818] Updated weights for policy 0, policy_version 541548 (0.0016) [2024-04-28 01:54:29,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61166.7, 300 sec: 61092.9). Total num frames: 8872722432. Throughput: 0: 61047.8. Samples: 1777849240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:29,255][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 01:54:32,169][54818] Updated weights for policy 0, policy_version 541558 (0.0015) [2024-04-28 01:54:34,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 8873033728. Throughput: 0: 61005.0. Samples: 1778209180. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:34,255][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 01:54:34,710][54818] Updated weights for policy 0, policy_version 541568 (0.0015) [2024-04-28 01:54:37,281][54818] Updated weights for policy 0, policy_version 541578 (0.0019) [2024-04-28 01:54:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8873328640. Throughput: 0: 60983.2. Samples: 1778573460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:39,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-28 01:54:39,924][54818] Updated weights for policy 0, policy_version 541588 (0.0016) [2024-04-28 01:54:42,651][54818] Updated weights for policy 0, policy_version 541598 (0.0016) [2024-04-28 01:54:44,253][54587] Fps is (10 sec: 58983.5, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8873623552. Throughput: 0: 61032.0. Samples: 1778766260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:44,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 01:54:45,361][54818] Updated weights for policy 0, policy_version 541608 (0.0015) [2024-04-28 01:54:47,998][54818] Updated weights for policy 0, policy_version 541618 (0.0015) [2024-04-28 01:54:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 8873951232. Throughput: 0: 60999.2. Samples: 1779126860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:49,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 01:54:50,559][54818] Updated weights for policy 0, policy_version 541628 (0.0016) [2024-04-28 01:54:53,111][54818] Updated weights for policy 0, policy_version 541638 (0.0018) [2024-04-28 01:54:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8874246144. Throughput: 0: 61211.4. Samples: 1779492780. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:54,253][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 01:54:55,995][54818] Updated weights for policy 0, policy_version 541648 (0.0015) [2024-04-28 01:54:58,461][54818] Updated weights for policy 0, policy_version 541658 (0.0017) [2024-04-28 01:54:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 8874557440. Throughput: 0: 61148.9. Samples: 1779686220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:54:59,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 01:55:01,500][54818] Updated weights for policy 0, policy_version 541668 (0.0016) [2024-04-28 01:55:02,663][54798] Signal inference workers to stop experience collection... (28900 times) [2024-04-28 01:55:02,705][54818] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-04-28 01:55:02,725][54798] Signal inference workers to resume experience collection... (28900 times) [2024-04-28 01:55:02,725][54818] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-04-28 01:55:03,808][54818] Updated weights for policy 0, policy_version 541678 (0.0016) [2024-04-28 01:55:04,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8874868736. Throughput: 0: 61266.7. Samples: 1780054020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:55:04,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 01:55:06,685][54818] Updated weights for policy 0, policy_version 541688 (0.0016) [2024-04-28 01:55:09,139][54818] Updated weights for policy 0, policy_version 541698 (0.0015) [2024-04-28 01:55:09,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 8875180032. Throughput: 0: 61250.8. Samples: 1780415620. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:55:09,261][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 01:55:12,078][54818] Updated weights for policy 0, policy_version 541708 (0.0016) [2024-04-28 01:55:14,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 8875491328. Throughput: 0: 61279.6. Samples: 1780606820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:55:14,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 01:55:14,375][54818] Updated weights for policy 0, policy_version 541718 (0.0015) [2024-04-28 01:55:17,243][54818] Updated weights for policy 0, policy_version 541728 (0.0016) [2024-04-28 01:55:19,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61440.1, 300 sec: 61203.9). Total num frames: 8875802624. Throughput: 0: 61416.1. Samples: 1780972900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:55:19,255][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 01:55:20,288][54818] Updated weights for policy 0, policy_version 541738 (0.0021) [2024-04-28 01:55:22,676][54818] Updated weights for policy 0, policy_version 541748 (0.0015) [2024-04-28 01:55:24,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 8876097536. Throughput: 0: 61394.8. Samples: 1781336220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:55:24,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:55:25,409][54818] Updated weights for policy 0, policy_version 541758 (0.0016) [2024-04-28 01:55:28,291][54818] Updated weights for policy 0, policy_version 541768 (0.0019) [2024-04-28 01:55:29,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.0, 300 sec: 61204.1). Total num frames: 8876408832. Throughput: 0: 61167.3. Samples: 1781518800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 01:55:29,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-28 01:55:30,995][54818] Updated weights for policy 0, policy_version 541778 (0.0015) [2024-04-28 01:55:33,378][54818] Updated weights for policy 0, policy_version 541788 (0.0019) [2024-04-28 01:55:34,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61167.1, 300 sec: 61148.5). Total num frames: 8876703744. Throughput: 0: 61333.0. Samples: 1781886840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:55:34,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 01:55:36,187][54818] Updated weights for policy 0, policy_version 541798 (0.0018) [2024-04-28 01:55:38,684][54818] Updated weights for policy 0, policy_version 541808 (0.0016) [2024-04-28 01:55:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 8877015040. Throughput: 0: 61266.0. Samples: 1782249760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:55:39,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 01:55:41,475][54818] Updated weights for policy 0, policy_version 541818 (0.0017) [2024-04-28 01:55:44,038][54818] Updated weights for policy 0, policy_version 541828 (0.0015) [2024-04-28 01:55:44,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.0, 300 sec: 61204.0). Total num frames: 8877326336. Throughput: 0: 61140.1. Samples: 1782437520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:55:44,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 01:55:46,753][54818] Updated weights for policy 0, policy_version 541838 (0.0017) [2024-04-28 01:55:49,197][54818] Updated weights for policy 0, policy_version 541848 (0.0016) [2024-04-28 01:55:49,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 8877637632. Throughput: 0: 61201.5. Samples: 1782808080. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:55:49,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 01:55:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000541848_8877637632.pth... [2024-04-28 01:55:49,323][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000540951_8862941184.pth [2024-04-28 01:55:52,576][54818] Updated weights for policy 0, policy_version 541858 (0.0016) [2024-04-28 01:55:54,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8877932544. Throughput: 0: 61248.0. Samples: 1783171780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:55:54,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-28 01:55:54,579][54818] Updated weights for policy 0, policy_version 541868 (0.0017) [2024-04-28 01:55:56,128][54798] Signal inference workers to stop experience collection... (28950 times) [2024-04-28 01:55:56,137][54798] Signal inference workers to resume experience collection... (28950 times) [2024-04-28 01:55:56,155][54818] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-04-28 01:55:56,155][54818] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-04-28 01:55:57,722][54818] Updated weights for policy 0, policy_version 541878 (0.0018) [2024-04-28 01:55:59,253][54587] Fps is (10 sec: 58982.9, 60 sec: 61167.1, 300 sec: 61148.5). Total num frames: 8878227456. Throughput: 0: 60974.0. Samples: 1783350640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:55:59,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 01:56:00,070][54818] Updated weights for policy 0, policy_version 541888 (0.0016) [2024-04-28 01:56:02,923][54818] Updated weights for policy 0, policy_version 541898 (0.0017) [2024-04-28 01:56:04,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 8878538752. Throughput: 0: 61190.2. Samples: 1783726460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:04,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 01:56:05,753][54818] Updated weights for policy 0, policy_version 541908 (0.0017) [2024-04-28 01:56:08,552][54818] Updated weights for policy 0, policy_version 541918 (0.0016) [2024-04-28 01:56:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8878850048. Throughput: 0: 61292.0. Samples: 1784094360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:09,253][54587] Avg episode reward: [(0, '0.532')] [2024-04-28 01:56:11,027][54818] Updated weights for policy 0, policy_version 541928 (0.0018) [2024-04-28 01:56:13,757][54818] Updated weights for policy 0, policy_version 541938 (0.0017) [2024-04-28 01:56:14,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8879144960. Throughput: 0: 61119.1. Samples: 1784269160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:14,255][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 01:56:16,202][54818] Updated weights for policy 0, policy_version 541948 (0.0016) [2024-04-28 01:56:19,107][54818] Updated weights for policy 0, policy_version 541958 (0.0015) [2024-04-28 01:56:19,253][54587] Fps is (10 sec: 58981.9, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 8879439872. Throughput: 0: 61167.1. Samples: 1784639360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:19,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 01:56:21,617][54818] Updated weights for policy 0, policy_version 541968 (0.0018) [2024-04-28 01:56:24,241][54818] Updated weights for policy 0, policy_version 541978 (0.0016) [2024-04-28 01:56:24,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61166.7, 300 sec: 61203.9). Total num frames: 8879767552. Throughput: 0: 61284.3. Samples: 1785007560. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 01:56:27,479][54818] Updated weights for policy 0, policy_version 541988 (0.0015) [2024-04-28 01:56:29,253][54587] Fps is (10 sec: 62258.9, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8880062464. Throughput: 0: 61096.8. Samples: 1785186880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:29,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 01:56:29,529][54818] Updated weights for policy 0, policy_version 541998 (0.0015) [2024-04-28 01:56:32,710][54818] Updated weights for policy 0, policy_version 542008 (0.0017) [2024-04-28 01:56:34,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8880373760. Throughput: 0: 61159.5. Samples: 1785560260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:34,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 01:56:35,057][54818] Updated weights for policy 0, policy_version 542018 (0.0018) [2024-04-28 01:56:37,986][54818] Updated weights for policy 0, policy_version 542028 (0.0016) [2024-04-28 01:56:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61148.5). Total num frames: 8880668672. Throughput: 0: 61032.4. Samples: 1785918240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:39,253][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 01:56:40,350][54818] Updated weights for policy 0, policy_version 542038 (0.0017) [2024-04-28 01:56:43,077][54798] Signal inference workers to stop experience collection... (29000 times) [2024-04-28 01:56:43,117][54818] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-04-28 01:56:43,172][54798] Signal inference workers to resume experience collection... 
(29000 times) [2024-04-28 01:56:43,172][54818] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-04-28 01:56:43,174][54818] Updated weights for policy 0, policy_version 542048 (0.0017) [2024-04-28 01:56:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 8880979968. Throughput: 0: 61142.5. Samples: 1786102060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:44,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 01:56:45,801][54818] Updated weights for policy 0, policy_version 542058 (0.0016) [2024-04-28 01:56:48,453][54818] Updated weights for policy 0, policy_version 542068 (0.0016) [2024-04-28 01:56:49,253][54587] Fps is (10 sec: 62257.8, 60 sec: 60893.7, 300 sec: 61148.4). Total num frames: 8881291264. Throughput: 0: 61115.8. Samples: 1786476680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:49,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 01:56:49,264][54587] No heartbeat for components: RolloutWorker_w4 (16357 seconds), RolloutWorker_w5 (2457 seconds) [2024-04-28 01:56:51,193][54818] Updated weights for policy 0, policy_version 542078 (0.0018) [2024-04-28 01:56:53,940][54818] Updated weights for policy 0, policy_version 542088 (0.0018) [2024-04-28 01:56:54,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 8881586176. Throughput: 0: 60858.6. Samples: 1786833000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:54,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 01:56:56,952][54818] Updated weights for policy 0, policy_version 542098 (0.0016) [2024-04-28 01:56:59,052][54818] Updated weights for policy 0, policy_version 542108 (0.0019) [2024-04-28 01:56:59,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 8881897472. Throughput: 0: 61202.3. Samples: 1787023260. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 01:56:59,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 01:57:02,255][54818] Updated weights for policy 0, policy_version 542118 (0.0017) [2024-04-28 01:57:04,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8882208768. Throughput: 0: 61218.6. Samples: 1787394200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:04,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 01:57:04,396][54818] Updated weights for policy 0, policy_version 542128 (0.0016) [2024-04-28 01:57:07,617][54818] Updated weights for policy 0, policy_version 542138 (0.0019) [2024-04-28 01:57:09,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 8882520064. Throughput: 0: 60920.7. Samples: 1787748980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:09,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 01:57:09,874][54818] Updated weights for policy 0, policy_version 542148 (0.0015) [2024-04-28 01:57:12,721][54818] Updated weights for policy 0, policy_version 542158 (0.0017) [2024-04-28 01:57:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 8882831360. Throughput: 0: 61229.8. Samples: 1787942220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:14,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 01:57:15,360][54818] Updated weights for policy 0, policy_version 542168 (0.0016) [2024-04-28 01:57:18,236][54818] Updated weights for policy 0, policy_version 542178 (0.0017) [2024-04-28 01:57:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8883126272. Throughput: 0: 61079.0. Samples: 1788308820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 01:57:20,419][54818] Updated weights for policy 0, policy_version 542188 (0.0015) [2024-04-28 01:57:23,456][54818] Updated weights for policy 0, policy_version 542198 (0.0017) [2024-04-28 01:57:24,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61167.0, 300 sec: 61204.2). Total num frames: 8883437568. Throughput: 0: 61204.2. Samples: 1788672440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:24,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 01:57:25,894][54818] Updated weights for policy 0, policy_version 542208 (0.0023) [2024-04-28 01:57:28,223][54798] Signal inference workers to stop experience collection... (29050 times) [2024-04-28 01:57:28,228][54798] Signal inference workers to resume experience collection... (29050 times) [2024-04-28 01:57:28,242][54818] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-04-28 01:57:28,242][54818] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-04-28 01:57:28,707][54818] Updated weights for policy 0, policy_version 542218 (0.0019) [2024-04-28 01:57:29,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8883748864. Throughput: 0: 61093.3. Samples: 1788851260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:29,255][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 01:57:31,442][54818] Updated weights for policy 0, policy_version 542228 (0.0017) [2024-04-28 01:57:33,946][54818] Updated weights for policy 0, policy_version 542238 (0.0017) [2024-04-28 01:57:34,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8884043776. Throughput: 0: 61062.0. Samples: 1789224460. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:34,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 01:57:36,743][54818] Updated weights for policy 0, policy_version 542248 (0.0015) [2024-04-28 01:57:39,253][54587] Fps is (10 sec: 58982.1, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 8884338688. Throughput: 0: 61201.2. Samples: 1789587060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:39,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 01:57:39,396][54818] Updated weights for policy 0, policy_version 542258 (0.0018) [2024-04-28 01:57:42,403][54818] Updated weights for policy 0, policy_version 542268 (0.0016) [2024-04-28 01:57:44,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 8884649984. Throughput: 0: 61136.6. Samples: 1789774400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:44,253][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 01:57:44,790][54818] Updated weights for policy 0, policy_version 542278 (0.0017) [2024-04-28 01:57:48,068][54818] Updated weights for policy 0, policy_version 542288 (0.0017) [2024-04-28 01:57:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8884961280. Throughput: 0: 61063.0. Samples: 1790142040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:49,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 01:57:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000542295_8884961280.pth... [2024-04-28 01:57:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000541399_8870281216.pth [2024-04-28 01:57:50,079][54818] Updated weights for policy 0, policy_version 542298 (0.0016) [2024-04-28 01:57:53,348][54818] Updated weights for policy 0, policy_version 542308 (0.0018) [2024-04-28 01:57:54,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.9, 300 sec: 61148.5). 
Total num frames: 8885256192. Throughput: 0: 61087.0. Samples: 1790497900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:54,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 01:57:55,428][54818] Updated weights for policy 0, policy_version 542318 (0.0016) [2024-04-28 01:57:58,571][54818] Updated weights for policy 0, policy_version 542328 (0.0015) [2024-04-28 01:57:59,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8885567488. Throughput: 0: 60732.8. Samples: 1790675200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:57:59,255][54587] Avg episode reward: [(0, '0.512')] [2024-04-28 01:58:00,770][54818] Updated weights for policy 0, policy_version 542338 (0.0016) [2024-04-28 01:58:03,841][54818] Updated weights for policy 0, policy_version 542348 (0.0017) [2024-04-28 01:58:04,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8885878784. Throughput: 0: 61021.9. Samples: 1791054800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:58:04,255][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 01:58:06,302][54818] Updated weights for policy 0, policy_version 542358 (0.0018) [2024-04-28 01:58:08,598][54798] Signal inference workers to stop experience collection... (29100 times) [2024-04-28 01:58:08,599][54798] Signal inference workers to resume experience collection... (29100 times) [2024-04-28 01:58:08,611][54818] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-04-28 01:58:08,612][54818] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-04-28 01:58:09,075][54818] Updated weights for policy 0, policy_version 542368 (0.0017) [2024-04-28 01:58:09,253][54587] Fps is (10 sec: 60621.8, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8886173696. Throughput: 0: 61037.6. Samples: 1791419120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:58:09,253][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 01:58:11,583][54818] Updated weights for policy 0, policy_version 542378 (0.0019) [2024-04-28 01:58:14,212][54818] Updated weights for policy 0, policy_version 542388 (0.0016) [2024-04-28 01:58:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8886484992. Throughput: 0: 60978.8. Samples: 1791595300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:58:14,255][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 01:58:17,260][54818] Updated weights for policy 0, policy_version 542398 (0.0015) [2024-04-28 01:58:19,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 8886796288. Throughput: 0: 61057.7. Samples: 1791972060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:58:19,263][54587] Avg episode reward: [(0, '0.694')] [2024-04-28 01:58:19,448][54818] Updated weights for policy 0, policy_version 542408 (0.0016) [2024-04-28 01:58:22,658][54818] Updated weights for policy 0, policy_version 542418 (0.0018) [2024-04-28 01:58:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 8887091200. Throughput: 0: 61090.9. Samples: 1792336140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:58:24,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 01:58:24,754][54818] Updated weights for policy 0, policy_version 542428 (0.0018) [2024-04-28 01:58:27,791][54818] Updated weights for policy 0, policy_version 542438 (0.0016) [2024-04-28 01:58:29,253][54587] Fps is (10 sec: 57344.4, 60 sec: 60347.8, 300 sec: 61037.3). Total num frames: 8887369728. Throughput: 0: 61050.1. Samples: 1792521660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 01:58:29,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 01:58:29,976][54818] Updated weights for policy 0, policy_version 542448 (0.0019) [2024-04-28 01:58:33,311][54818] Updated weights for policy 0, policy_version 542458 (0.0015) [2024-04-28 01:58:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 60620.9, 300 sec: 61037.4). Total num frames: 8887681024. Throughput: 0: 60998.9. Samples: 1792886980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:58:34,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 01:58:35,503][54818] Updated weights for policy 0, policy_version 542468 (0.0017) [2024-04-28 01:58:38,608][54818] Updated weights for policy 0, policy_version 542478 (0.0018) [2024-04-28 01:58:39,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8887992320. Throughput: 0: 61189.4. Samples: 1793251420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:58:39,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 01:58:41,088][54818] Updated weights for policy 0, policy_version 542488 (0.0020) [2024-04-28 01:58:43,969][54818] Updated weights for policy 0, policy_version 542498 (0.0016) [2024-04-28 01:58:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.8, 300 sec: 61037.4). Total num frames: 8888287232. Throughput: 0: 61259.8. Samples: 1793431880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:58:44,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:58:44,522][54798] Signal inference workers to stop experience collection... (29150 times) [2024-04-28 01:58:44,564][54818] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-04-28 01:58:44,582][54798] Signal inference workers to resume experience collection... 
(29150 times) [2024-04-28 01:58:44,582][54818] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-04-28 01:58:46,499][54818] Updated weights for policy 0, policy_version 542508 (0.0018) [2024-04-28 01:58:49,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60621.0, 300 sec: 61037.4). Total num frames: 8888598528. Throughput: 0: 61010.8. Samples: 1793800280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:58:49,253][54587] Avg episode reward: [(0, '0.499')] [2024-04-28 01:58:49,327][54818] Updated weights for policy 0, policy_version 542518 (0.0015) [2024-04-28 01:58:51,762][54818] Updated weights for policy 0, policy_version 542528 (0.0018) [2024-04-28 01:58:54,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60620.8, 300 sec: 61037.3). Total num frames: 8888893440. Throughput: 0: 61117.2. Samples: 1794169400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:58:54,255][54587] Avg episode reward: [(0, '0.531')] [2024-04-28 01:58:54,583][54818] Updated weights for policy 0, policy_version 542538 (0.0017) [2024-04-28 01:58:57,142][54818] Updated weights for policy 0, policy_version 542548 (0.0016) [2024-04-28 01:58:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60620.9, 300 sec: 61037.4). Total num frames: 8889204736. Throughput: 0: 61172.9. Samples: 1794348080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:58:59,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 01:58:59,742][54818] Updated weights for policy 0, policy_version 542558 (0.0016) [2024-04-28 01:59:02,607][54818] Updated weights for policy 0, policy_version 542568 (0.0021) [2024-04-28 01:59:04,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60620.7, 300 sec: 61037.4). Total num frames: 8889516032. Throughput: 0: 61068.9. Samples: 1794720160. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:04,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 01:59:04,946][54818] Updated weights for policy 0, policy_version 542578 (0.0021) [2024-04-28 01:59:08,031][54818] Updated weights for policy 0, policy_version 542588 (0.0015) [2024-04-28 01:59:09,253][54587] Fps is (10 sec: 62258.8, 60 sec: 60893.7, 300 sec: 61092.9). Total num frames: 8889827328. Throughput: 0: 61170.5. Samples: 1795088820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:09,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 01:59:10,707][54818] Updated weights for policy 0, policy_version 542598 (0.0018) [2024-04-28 01:59:13,481][54818] Updated weights for policy 0, policy_version 542608 (0.0018) [2024-04-28 01:59:14,253][54587] Fps is (10 sec: 62259.4, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8890138624. Throughput: 0: 61040.4. Samples: 1795268480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:14,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 01:59:15,896][54818] Updated weights for policy 0, policy_version 542618 (0.0018) [2024-04-28 01:59:18,638][54818] Updated weights for policy 0, policy_version 542628 (0.0017) [2024-04-28 01:59:19,253][54587] Fps is (10 sec: 62260.1, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8890449920. Throughput: 0: 61130.2. Samples: 1795637840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:19,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 01:59:21,018][54818] Updated weights for policy 0, policy_version 542638 (0.0017) [2024-04-28 01:59:23,910][54818] Updated weights for policy 0, policy_version 542648 (0.0016) [2024-04-28 01:59:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8890744832. Throughput: 0: 61395.1. Samples: 1796014200. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:24,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 01:59:26,521][54818] Updated weights for policy 0, policy_version 542658 (0.0016) [2024-04-28 01:59:29,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8891056128. Throughput: 0: 61301.2. Samples: 1796190440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:29,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 01:59:29,457][54818] Updated weights for policy 0, policy_version 542668 (0.0016) [2024-04-28 01:59:31,874][54818] Updated weights for policy 0, policy_version 542678 (0.0018) [2024-04-28 01:59:32,193][54798] Signal inference workers to stop experience collection... (29200 times) [2024-04-28 01:59:32,231][54818] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-04-28 01:59:32,245][54798] Signal inference workers to resume experience collection... (29200 times) [2024-04-28 01:59:32,246][54818] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-04-28 01:59:34,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8891367424. Throughput: 0: 61256.9. Samples: 1796556840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:34,253][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 01:59:34,803][54818] Updated weights for policy 0, policy_version 542688 (0.0016) [2024-04-28 01:59:37,268][54818] Updated weights for policy 0, policy_version 542698 (0.0018) [2024-04-28 01:59:39,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 8891678720. Throughput: 0: 61264.4. Samples: 1796926300. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:39,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 01:59:40,079][54818] Updated weights for policy 0, policy_version 542708 (0.0016) [2024-04-28 01:59:42,478][54818] Updated weights for policy 0, policy_version 542718 (0.0016) [2024-04-28 01:59:44,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 8891973632. Throughput: 0: 61372.8. Samples: 1797109860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:44,255][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 01:59:45,288][54818] Updated weights for policy 0, policy_version 542728 (0.0016) [2024-04-28 01:59:48,046][54818] Updated weights for policy 0, policy_version 542738 (0.0018) [2024-04-28 01:59:49,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8892284928. Throughput: 0: 61148.1. Samples: 1797471820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:49,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 01:59:49,261][54587] No heartbeat for components: RolloutWorker_w4 (16537 seconds), RolloutWorker_w5 (2637 seconds) [2024-04-28 01:59:49,330][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000542743_8892301312.pth... [2024-04-28 01:59:49,384][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000541848_8877637632.pth [2024-04-28 01:59:50,589][54818] Updated weights for policy 0, policy_version 542748 (0.0017) [2024-04-28 01:59:53,244][54818] Updated weights for policy 0, policy_version 542758 (0.0018) [2024-04-28 01:59:54,253][54587] Fps is (10 sec: 62260.3, 60 sec: 61713.2, 300 sec: 61148.5). Total num frames: 8892596224. Throughput: 0: 61042.5. Samples: 1797835720. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:54,253][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 01:59:55,968][54818] Updated weights for policy 0, policy_version 542768 (0.0016) [2024-04-28 01:59:58,521][54818] Updated weights for policy 0, policy_version 542778 (0.0018) [2024-04-28 01:59:59,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61713.0, 300 sec: 61148.4). Total num frames: 8892907520. Throughput: 0: 61367.0. Samples: 1798030000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-04-28 01:59:59,255][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 02:00:01,514][54818] Updated weights for policy 0, policy_version 542788 (0.0017) [2024-04-28 02:00:04,093][54818] Updated weights for policy 0, policy_version 542798 (0.0015) [2024-04-28 02:00:04,253][54587] Fps is (10 sec: 60619.6, 60 sec: 61440.0, 300 sec: 61092.8). Total num frames: 8893202432. Throughput: 0: 61219.3. Samples: 1798392720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:04,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 02:00:06,874][54818] Updated weights for policy 0, policy_version 542808 (0.0017) [2024-04-28 02:00:09,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 8893513728. Throughput: 0: 60877.3. Samples: 1798753680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:09,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-28 02:00:09,793][54818] Updated weights for policy 0, policy_version 542818 (0.0016) [2024-04-28 02:00:12,239][54818] Updated weights for policy 0, policy_version 542828 (0.0016) [2024-04-28 02:00:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8893825024. Throughput: 0: 61288.0. Samples: 1798948400. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:14,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-28 02:00:15,174][54818] Updated weights for policy 0, policy_version 542838 (0.0015) [2024-04-28 02:00:15,527][54798] Signal inference workers to stop experience collection... (29250 times) [2024-04-28 02:00:15,528][54798] Signal inference workers to resume experience collection... (29250 times) [2024-04-28 02:00:15,545][54818] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-04-28 02:00:15,545][54818] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-04-28 02:00:17,434][54818] Updated weights for policy 0, policy_version 542848 (0.0016) [2024-04-28 02:00:19,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.8, 300 sec: 61148.4). Total num frames: 8894136320. Throughput: 0: 61190.3. Samples: 1799310420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:19,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 02:00:20,533][54818] Updated weights for policy 0, policy_version 542858 (0.0017) [2024-04-28 02:00:22,751][54818] Updated weights for policy 0, policy_version 542868 (0.0017) [2024-04-28 02:00:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8894431232. Throughput: 0: 60982.4. Samples: 1799670500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:24,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 02:00:25,800][54818] Updated weights for policy 0, policy_version 542878 (0.0016) [2024-04-28 02:00:28,125][54818] Updated weights for policy 0, policy_version 542888 (0.0017) [2024-04-28 02:00:29,253][54587] Fps is (10 sec: 58982.9, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8894726144. Throughput: 0: 61045.3. Samples: 1799856900. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:29,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-28 02:00:31,073][54818] Updated weights for policy 0, policy_version 542898 (0.0015) [2024-04-28 02:00:33,492][54818] Updated weights for policy 0, policy_version 542908 (0.0016) [2024-04-28 02:00:34,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.7, 300 sec: 61092.9). Total num frames: 8895037440. Throughput: 0: 60950.9. Samples: 1800214620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:34,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 02:00:36,422][54818] Updated weights for policy 0, policy_version 542918 (0.0017) [2024-04-28 02:00:38,952][54818] Updated weights for policy 0, policy_version 542928 (0.0016) [2024-04-28 02:00:39,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 8895348736. Throughput: 0: 61189.3. Samples: 1800589240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:39,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 02:00:41,751][54818] Updated weights for policy 0, policy_version 542938 (0.0018) [2024-04-28 02:00:44,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61167.0, 300 sec: 61037.3). Total num frames: 8895643648. Throughput: 0: 60989.0. Samples: 1800774500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:44,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 02:00:44,372][54818] Updated weights for policy 0, policy_version 542948 (0.0015) [2024-04-28 02:00:47,182][54818] Updated weights for policy 0, policy_version 542958 (0.0018) [2024-04-28 02:00:49,253][54587] Fps is (10 sec: 58981.9, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 8895938560. Throughput: 0: 60958.8. Samples: 1801135860. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:49,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-28 02:00:49,782][54818] Updated weights for policy 0, policy_version 542968 (0.0016) [2024-04-28 02:00:52,384][54818] Updated weights for policy 0, policy_version 542978 (0.0015) [2024-04-28 02:00:54,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61166.7, 300 sec: 61148.4). Total num frames: 8896266240. Throughput: 0: 61105.7. Samples: 1801503440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:54,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 02:00:55,272][54818] Updated weights for policy 0, policy_version 542988 (0.0017) [2024-04-28 02:00:57,701][54818] Updated weights for policy 0, policy_version 542998 (0.0017) [2024-04-28 02:00:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8896561152. Throughput: 0: 61035.6. Samples: 1801695000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:00:59,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 02:00:59,533][54798] Signal inference workers to stop experience collection... (29300 times) [2024-04-28 02:00:59,567][54818] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-04-28 02:00:59,595][54798] Signal inference workers to resume experience collection... (29300 times) [2024-04-28 02:00:59,595][54818] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-04-28 02:01:00,642][54818] Updated weights for policy 0, policy_version 543008 (0.0016) [2024-04-28 02:01:03,139][54818] Updated weights for policy 0, policy_version 543018 (0.0015) [2024-04-28 02:01:04,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 8896872448. Throughput: 0: 61053.2. Samples: 1802057800. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:01:04,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 02:01:06,035][54818] Updated weights for policy 0, policy_version 543028 (0.0016) [2024-04-28 02:01:08,392][54818] Updated weights for policy 0, policy_version 543038 (0.0018) [2024-04-28 02:01:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8897183744. Throughput: 0: 61252.8. Samples: 1802426880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:01:09,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 02:01:11,278][54818] Updated weights for policy 0, policy_version 543048 (0.0018) [2024-04-28 02:01:14,040][54818] Updated weights for policy 0, policy_version 543058 (0.0016) [2024-04-28 02:01:14,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8897478656. Throughput: 0: 61117.0. Samples: 1802607160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:01:14,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 02:01:16,480][54818] Updated weights for policy 0, policy_version 543068 (0.0017) [2024-04-28 02:01:19,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60621.1, 300 sec: 61037.4). Total num frames: 8897773568. Throughput: 0: 61379.9. Samples: 1802976700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:01:19,253][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 02:01:19,262][54818] Updated weights for policy 0, policy_version 543078 (0.0016) [2024-04-28 02:01:21,926][54818] Updated weights for policy 0, policy_version 543088 (0.0018) [2024-04-28 02:01:24,253][54587] Fps is (10 sec: 58982.1, 60 sec: 60620.8, 300 sec: 61037.4). Total num frames: 8898068480. Throughput: 0: 61164.3. Samples: 1803341640. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:01:24,254][54587] Avg episode reward: [(0, '0.526')] [2024-04-28 02:01:24,626][54818] Updated weights for policy 0, policy_version 543098 (0.0021) [2024-04-28 02:01:27,206][54818] Updated weights for policy 0, policy_version 543108 (0.0017) [2024-04-28 02:01:29,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8898396160. Throughput: 0: 61084.0. Samples: 1803523280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 02:01:29,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 02:01:29,949][54818] Updated weights for policy 0, policy_version 543118 (0.0017) [2024-04-28 02:01:32,403][54818] Updated weights for policy 0, policy_version 543128 (0.0017) [2024-04-28 02:01:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8898691072. Throughput: 0: 61092.0. Samples: 1803885000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:01:34,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:01:35,643][54818] Updated weights for policy 0, policy_version 543138 (0.0016) [2024-04-28 02:01:37,691][54818] Updated weights for policy 0, policy_version 543148 (0.0018) [2024-04-28 02:01:39,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.7, 300 sec: 61092.9). Total num frames: 8899002368. Throughput: 0: 61194.3. Samples: 1804257180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:01:39,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 02:01:41,050][54818] Updated weights for policy 0, policy_version 543158 (0.0016) [2024-04-28 02:01:43,113][54818] Updated weights for policy 0, policy_version 543168 (0.0016) [2024-04-28 02:01:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.8, 300 sec: 61037.4). Total num frames: 8899297280. Throughput: 0: 60976.9. Samples: 1804438960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:01:44,255][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 02:01:44,858][54798] Signal inference workers to stop experience collection... (29350 times) [2024-04-28 02:01:44,908][54818] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-04-28 02:01:44,915][54798] Signal inference workers to resume experience collection... (29350 times) [2024-04-28 02:01:44,916][54818] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-04-28 02:01:46,069][54818] Updated weights for policy 0, policy_version 543178 (0.0020) [2024-04-28 02:01:48,736][54818] Updated weights for policy 0, policy_version 543188 (0.0016) [2024-04-28 02:01:49,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8899608576. Throughput: 0: 60878.0. Samples: 1804797320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:01:49,255][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 02:01:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000543189_8899608576.pth... [2024-04-28 02:01:49,334][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000542295_8884961280.pth [2024-04-28 02:01:51,834][54818] Updated weights for policy 0, policy_version 543198 (0.0016) [2024-04-28 02:01:54,128][54818] Updated weights for policy 0, policy_version 543208 (0.0021) [2024-04-28 02:01:54,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8899919872. Throughput: 0: 60938.7. Samples: 1805169120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:01:54,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 02:01:56,985][54818] Updated weights for policy 0, policy_version 543218 (0.0017) [2024-04-28 02:01:59,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60894.0, 300 sec: 61037.4). Total num frames: 8900214784. Throughput: 0: 61145.4. Samples: 1805358700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:01:59,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 02:01:59,708][54818] Updated weights for policy 0, policy_version 543228 (0.0020) [2024-04-28 02:02:02,150][54818] Updated weights for policy 0, policy_version 543238 (0.0019) [2024-04-28 02:02:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 8900526080. Throughput: 0: 60925.6. Samples: 1805718360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:04,255][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 02:02:04,889][54818] Updated weights for policy 0, policy_version 543248 (0.0016) [2024-04-28 02:02:07,353][54818] Updated weights for policy 0, policy_version 543258 (0.0016) [2024-04-28 02:02:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 60894.0, 300 sec: 61037.4). Total num frames: 8900837376. Throughput: 0: 61123.7. Samples: 1806092200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:09,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 02:02:10,013][54818] Updated weights for policy 0, policy_version 543268 (0.0017) [2024-04-28 02:02:12,801][54818] Updated weights for policy 0, policy_version 543278 (0.0016) [2024-04-28 02:02:14,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8901148672. Throughput: 0: 61312.5. Samples: 1806282340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:14,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 02:02:15,294][54818] Updated weights for policy 0, policy_version 543288 (0.0017) [2024-04-28 02:02:18,232][54818] Updated weights for policy 0, policy_version 543298 (0.0015) [2024-04-28 02:02:19,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61439.8, 300 sec: 61092.9). Total num frames: 8901459968. Throughput: 0: 61347.9. Samples: 1806645660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:19,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 02:02:20,722][54818] Updated weights for policy 0, policy_version 543308 (0.0016) [2024-04-28 02:02:23,703][54818] Updated weights for policy 0, policy_version 543318 (0.0015) [2024-04-28 02:02:24,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.1, 300 sec: 61092.9). Total num frames: 8901771264. Throughput: 0: 61255.2. Samples: 1807013660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:24,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 02:02:26,230][54818] Updated weights for policy 0, policy_version 543328 (0.0019) [2024-04-28 02:02:28,843][54818] Updated weights for policy 0, policy_version 543338 (0.0015) [2024-04-28 02:02:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8902066176. Throughput: 0: 61294.6. Samples: 1807197220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:29,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 02:02:31,659][54818] Updated weights for policy 0, policy_version 543348 (0.0015) [2024-04-28 02:02:31,915][54798] Signal inference workers to stop experience collection... (29400 times) [2024-04-28 02:02:31,942][54818] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-04-28 02:02:31,972][54798] Signal inference workers to resume experience collection... (29400 times) [2024-04-28 02:02:31,972][54818] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-04-28 02:02:34,098][54818] Updated weights for policy 0, policy_version 543358 (0.0017) [2024-04-28 02:02:34,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61148.5). Total num frames: 8902377472. Throughput: 0: 61514.8. Samples: 1807565480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:34,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 02:02:36,822][54818] Updated weights for policy 0, policy_version 543368 (0.0016) [2024-04-28 02:02:39,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8902688768. Throughput: 0: 61405.7. Samples: 1807932380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:02:39,453][54818] Updated weights for policy 0, policy_version 543378 (0.0016) [2024-04-28 02:02:42,219][54818] Updated weights for policy 0, policy_version 543388 (0.0017) [2024-04-28 02:02:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 8902983680. Throughput: 0: 61157.2. Samples: 1808110780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 02:02:44,792][54818] Updated weights for policy 0, policy_version 543398 (0.0016) [2024-04-28 02:02:47,919][54818] Updated weights for policy 0, policy_version 543408 (0.0018) [2024-04-28 02:02:49,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8903294976. Throughput: 0: 61441.7. Samples: 1808483240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:49,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 02:02:49,259][54587] No heartbeat for components: RolloutWorker_w4 (16717 seconds), RolloutWorker_w5 (2817 seconds) [2024-04-28 02:02:50,285][54818] Updated weights for policy 0, policy_version 543418 (0.0016) [2024-04-28 02:02:53,133][54818] Updated weights for policy 0, policy_version 543428 (0.0016) [2024-04-28 02:02:54,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8903606272. Throughput: 0: 61245.2. Samples: 1808848240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:54,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 02:02:55,543][54818] Updated weights for policy 0, policy_version 543438 (0.0018) [2024-04-28 02:02:58,408][54818] Updated weights for policy 0, policy_version 543448 (0.0015) [2024-04-28 02:02:59,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.0, 300 sec: 61148.4). Total num frames: 8903917568. Throughput: 0: 60982.2. Samples: 1809026540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:02:59,255][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:03:00,981][54818] Updated weights for policy 0, policy_version 543458 (0.0017) [2024-04-28 02:03:03,557][54818] Updated weights for policy 0, policy_version 543468 (0.0016) [2024-04-28 02:03:04,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8904212480. Throughput: 0: 61341.9. Samples: 1809406040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:04,253][54587] Avg episode reward: [(0, '0.475')] [2024-04-28 02:03:06,127][54818] Updated weights for policy 0, policy_version 543478 (0.0018) [2024-04-28 02:03:08,976][54818] Updated weights for policy 0, policy_version 543488 (0.0016) [2024-04-28 02:03:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8904523776. Throughput: 0: 61196.8. Samples: 1809767520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:09,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 02:03:11,470][54818] Updated weights for policy 0, policy_version 543498 (0.0016) [2024-04-28 02:03:14,102][54818] Updated weights for policy 0, policy_version 543508 (0.0016) [2024-04-28 02:03:14,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8904835072. Throughput: 0: 61145.4. Samples: 1809948760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:14,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 02:03:16,812][54818] Updated weights for policy 0, policy_version 543518 (0.0016) [2024-04-28 02:03:19,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.1, 300 sec: 61203.9). Total num frames: 8905146368. Throughput: 0: 61166.6. Samples: 1810317980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:19,255][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 02:03:19,498][54818] Updated weights for policy 0, policy_version 543528 (0.0023) [2024-04-28 02:03:20,901][54798] Signal inference workers to stop experience collection... (29450 times) [2024-04-28 02:03:20,909][54818] InferenceWorker_p0-w0: stopping experience collection (29450 times) [2024-04-28 02:03:20,992][54798] Signal inference workers to resume experience collection... (29450 times) [2024-04-28 02:03:20,992][54818] InferenceWorker_p0-w0: resuming experience collection (29450 times) [2024-04-28 02:03:22,487][54818] Updated weights for policy 0, policy_version 543538 (0.0016) [2024-04-28 02:03:24,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8905441280. Throughput: 0: 61144.2. Samples: 1810683860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:24,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:03:24,813][54818] Updated weights for policy 0, policy_version 543548 (0.0017) [2024-04-28 02:03:27,716][54818] Updated weights for policy 0, policy_version 543558 (0.0018) [2024-04-28 02:03:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8905752576. Throughput: 0: 61132.0. Samples: 1810861720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:29,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 02:03:30,393][54818] Updated weights for policy 0, policy_version 543568 (0.0016) [2024-04-28 02:03:33,419][54818] Updated weights for policy 0, policy_version 543578 (0.0018) [2024-04-28 02:03:34,253][54587] Fps is (10 sec: 60619.2, 60 sec: 61166.7, 300 sec: 61203.9). Total num frames: 8906047488. Throughput: 0: 61049.2. Samples: 1811230460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:34,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 02:03:35,851][54818] Updated weights for policy 0, policy_version 543588 (0.0019) [2024-04-28 02:03:38,649][54818] Updated weights for policy 0, policy_version 543598 (0.0017) [2024-04-28 02:03:39,253][54587] Fps is (10 sec: 58982.3, 60 sec: 60893.9, 300 sec: 61203.9). Total num frames: 8906342400. Throughput: 0: 61197.3. Samples: 1811602120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:39,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 02:03:41,139][54818] Updated weights for policy 0, policy_version 543608 (0.0015) [2024-04-28 02:03:43,999][54818] Updated weights for policy 0, policy_version 543618 (0.0016) [2024-04-28 02:03:44,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 8906653696. Throughput: 0: 61260.9. Samples: 1811783280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:44,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:03:46,505][54818] Updated weights for policy 0, policy_version 543628 (0.0017) [2024-04-28 02:03:49,200][54818] Updated weights for policy 0, policy_version 543638 (0.0016) [2024-04-28 02:03:49,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8906964992. Throughput: 0: 60899.9. Samples: 1812146540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:49,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 02:03:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000543638_8906964992.pth... [2024-04-28 02:03:49,322][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000542743_8892301312.pth [2024-04-28 02:03:51,965][54818] Updated weights for policy 0, policy_version 543648 (0.0016) [2024-04-28 02:03:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 8907259904. Throughput: 0: 61178.6. Samples: 1812520560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:54,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 02:03:54,610][54818] Updated weights for policy 0, policy_version 543658 (0.0017) [2024-04-28 02:03:57,338][54818] Updated weights for policy 0, policy_version 543668 (0.0017) [2024-04-28 02:03:59,253][54587] Fps is (10 sec: 60620.0, 60 sec: 60893.7, 300 sec: 61203.9). Total num frames: 8907571200. Throughput: 0: 61182.9. Samples: 1812702000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:03:59,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 02:03:59,697][54818] Updated weights for policy 0, policy_version 543678 (0.0019) [2024-04-28 02:04:02,658][54818] Updated weights for policy 0, policy_version 543688 (0.0016) [2024-04-28 02:04:04,253][54587] Fps is (10 sec: 60621.8, 60 sec: 60893.9, 300 sec: 61148.5). Total num frames: 8907866112. Throughput: 0: 61165.5. Samples: 1813070420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:04:04,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 02:04:05,019][54818] Updated weights for policy 0, policy_version 543698 (0.0016) [2024-04-28 02:04:07,114][54798] Signal inference workers to stop experience collection... (29500 times) [2024-04-28 02:04:07,116][54798] Signal inference workers to resume experience collection... 
(29500 times) [2024-04-28 02:04:07,130][54818] InferenceWorker_p0-w0: stopping experience collection (29500 times) [2024-04-28 02:04:07,130][54818] InferenceWorker_p0-w0: resuming experience collection (29500 times) [2024-04-28 02:04:07,946][54818] Updated weights for policy 0, policy_version 543708 (0.0017) [2024-04-28 02:04:09,253][54587] Fps is (10 sec: 60622.1, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 8908177408. Throughput: 0: 61159.1. Samples: 1813436020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:04:09,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 02:04:10,609][54818] Updated weights for policy 0, policy_version 543718 (0.0018) [2024-04-28 02:04:13,313][54818] Updated weights for policy 0, policy_version 543728 (0.0016) [2024-04-28 02:04:14,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 8908488704. Throughput: 0: 61173.5. Samples: 1813614520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:04:14,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 02:04:15,737][54818] Updated weights for policy 0, policy_version 543738 (0.0015) [2024-04-28 02:04:18,905][54818] Updated weights for policy 0, policy_version 543748 (0.0021) [2024-04-28 02:04:19,253][54587] Fps is (10 sec: 62257.8, 60 sec: 60893.7, 300 sec: 61203.9). Total num frames: 8908800000. Throughput: 0: 61252.9. Samples: 1813986840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:04:19,255][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 02:04:21,435][54818] Updated weights for policy 0, policy_version 543758 (0.0016) [2024-04-28 02:04:24,253][54587] Fps is (10 sec: 58981.7, 60 sec: 60620.7, 300 sec: 61092.9). Total num frames: 8909078528. Throughput: 0: 61252.5. Samples: 1814358480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:04:24,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 02:04:24,327][54818] Updated weights for policy 0, policy_version 543768 (0.0018) [2024-04-28 02:04:26,895][54818] Updated weights for policy 0, policy_version 543778 (0.0015) [2024-04-28 02:04:29,253][54587] Fps is (10 sec: 58983.7, 60 sec: 60620.9, 300 sec: 61092.9). Total num frames: 8909389824. Throughput: 0: 61182.4. Samples: 1814536480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:04:29,253][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 02:04:29,598][54818] Updated weights for policy 0, policy_version 543788 (0.0020) [2024-04-28 02:04:32,111][54818] Updated weights for policy 0, policy_version 543798 (0.0018) [2024-04-28 02:04:34,253][54587] Fps is (10 sec: 62259.7, 60 sec: 60894.1, 300 sec: 61092.9). Total num frames: 8909701120. Throughput: 0: 61268.5. Samples: 1814903620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:04:34,253][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 02:04:34,911][54818] Updated weights for policy 0, policy_version 543808 (0.0018) [2024-04-28 02:04:37,421][54818] Updated weights for policy 0, policy_version 543818 (0.0017) [2024-04-28 02:04:39,253][54587] Fps is (10 sec: 60620.0, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8909996032. Throughput: 0: 61074.3. Samples: 1815268900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:04:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 02:04:40,112][54818] Updated weights for policy 0, policy_version 543828 (0.0017) [2024-04-28 02:04:42,747][54818] Updated weights for policy 0, policy_version 543838 (0.0016) [2024-04-28 02:04:44,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8910323712. Throughput: 0: 61182.4. Samples: 1815455200. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:04:44,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 02:04:45,347][54818] Updated weights for policy 0, policy_version 543848 (0.0018) [2024-04-28 02:04:47,520][54798] Signal inference workers to stop experience collection... (29550 times) [2024-04-28 02:04:47,549][54818] InferenceWorker_p0-w0: stopping experience collection (29550 times) [2024-04-28 02:04:47,580][54798] Signal inference workers to resume experience collection... (29550 times) [2024-04-28 02:04:47,581][54818] InferenceWorker_p0-w0: resuming experience collection (29550 times) [2024-04-28 02:04:48,204][54818] Updated weights for policy 0, policy_version 543858 (0.0016) [2024-04-28 02:04:49,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8910618624. Throughput: 0: 61195.4. Samples: 1815824220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:04:49,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 02:04:50,558][54818] Updated weights for policy 0, policy_version 543868 (0.0017) [2024-04-28 02:04:53,491][54818] Updated weights for policy 0, policy_version 543878 (0.0016) [2024-04-28 02:04:54,253][54587] Fps is (10 sec: 58982.6, 60 sec: 60894.0, 300 sec: 61037.4). Total num frames: 8910913536. Throughput: 0: 61168.4. Samples: 1816188600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:04:54,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 02:04:56,254][54818] Updated weights for policy 0, policy_version 543888 (0.0018) [2024-04-28 02:04:59,068][54818] Updated weights for policy 0, policy_version 543898 (0.0017) [2024-04-28 02:04:59,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8911224832. Throughput: 0: 61264.2. Samples: 1816371420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:04:59,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 02:05:01,482][54818] Updated weights for policy 0, policy_version 543908 (0.0016) [2024-04-28 02:05:04,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 8911536128. Throughput: 0: 61102.0. Samples: 1816736420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:04,256][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 02:05:04,354][54818] Updated weights for policy 0, policy_version 543918 (0.0019) [2024-04-28 02:05:06,923][54818] Updated weights for policy 0, policy_version 543928 (0.0021) [2024-04-28 02:05:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.7, 300 sec: 61037.3). Total num frames: 8911831040. Throughput: 0: 61122.1. Samples: 1817108980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:09,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 02:05:09,819][54818] Updated weights for policy 0, policy_version 543938 (0.0019) [2024-04-28 02:05:12,394][54818] Updated weights for policy 0, policy_version 543948 (0.0016) [2024-04-28 02:05:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.8, 300 sec: 61037.4). Total num frames: 8912142336. Throughput: 0: 61150.6. Samples: 1817288260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:14,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 02:05:14,994][54818] Updated weights for policy 0, policy_version 543958 (0.0017) [2024-04-28 02:05:17,651][54818] Updated weights for policy 0, policy_version 543968 (0.0017) [2024-04-28 02:05:19,253][54587] Fps is (10 sec: 62259.4, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 8912453632. Throughput: 0: 61129.2. Samples: 1817654440. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:19,255][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 02:05:20,199][54818] Updated weights for policy 0, policy_version 543978 (0.0016) [2024-04-28 02:05:22,911][54818] Updated weights for policy 0, policy_version 543988 (0.0018) [2024-04-28 02:05:24,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 8912748544. Throughput: 0: 61229.9. Samples: 1818024240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:24,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 02:05:25,515][54818] Updated weights for policy 0, policy_version 543998 (0.0016) [2024-04-28 02:05:28,258][54818] Updated weights for policy 0, policy_version 544008 (0.0016) [2024-04-28 02:05:29,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.0, 300 sec: 61148.5). Total num frames: 8913076224. Throughput: 0: 61160.1. Samples: 1818207400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:29,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:05:30,697][54818] Updated weights for policy 0, policy_version 544018 (0.0019) [2024-04-28 02:05:33,573][54818] Updated weights for policy 0, policy_version 544028 (0.0016) [2024-04-28 02:05:34,253][54587] Fps is (10 sec: 63897.6, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8913387520. Throughput: 0: 61250.7. Samples: 1818580500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:34,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 02:05:36,160][54818] Updated weights for policy 0, policy_version 544038 (0.0015) [2024-04-28 02:05:39,078][54818] Updated weights for policy 0, policy_version 544048 (0.0016) [2024-04-28 02:05:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 8913682432. Throughput: 0: 61272.5. Samples: 1818945860. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:39,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 02:05:39,749][54798] Signal inference workers to stop experience collection... (29600 times) [2024-04-28 02:05:39,767][54818] InferenceWorker_p0-w0: stopping experience collection (29600 times) [2024-04-28 02:05:39,807][54798] Signal inference workers to resume experience collection... (29600 times) [2024-04-28 02:05:39,808][54818] InferenceWorker_p0-w0: resuming experience collection (29600 times) [2024-04-28 02:05:41,796][54818] Updated weights for policy 0, policy_version 544058 (0.0016) [2024-04-28 02:05:44,253][54587] Fps is (10 sec: 58982.9, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 8913977344. Throughput: 0: 61315.3. Samples: 1819130600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:44,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 02:05:44,425][54818] Updated weights for policy 0, policy_version 544068 (0.0016) [2024-04-28 02:05:46,931][54818] Updated weights for policy 0, policy_version 544078 (0.0016) [2024-04-28 02:05:49,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8914305024. Throughput: 0: 61236.3. Samples: 1819492060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:49,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 02:05:49,262][54587] No heartbeat for components: RolloutWorker_w4 (16897 seconds), RolloutWorker_w5 (2997 seconds) [2024-04-28 02:05:49,365][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000544087_8914321408.pth... 
[2024-04-28 02:05:49,420][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000543189_8899608576.pth [2024-04-28 02:05:49,607][54818] Updated weights for policy 0, policy_version 544088 (0.0015) [2024-04-28 02:05:52,472][54818] Updated weights for policy 0, policy_version 544098 (0.0016) [2024-04-28 02:05:54,253][54587] Fps is (10 sec: 63896.4, 60 sec: 61713.0, 300 sec: 61203.9). Total num frames: 8914616320. Throughput: 0: 61119.1. Samples: 1819859340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:54,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:05:54,897][54818] Updated weights for policy 0, policy_version 544108 (0.0015) [2024-04-28 02:05:57,828][54818] Updated weights for policy 0, policy_version 544118 (0.0015) [2024-04-28 02:05:59,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 8914911232. Throughput: 0: 61284.8. Samples: 1820046080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-04-28 02:05:59,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 02:06:00,018][54818] Updated weights for policy 0, policy_version 544128 (0.0018) [2024-04-28 02:06:03,218][54818] Updated weights for policy 0, policy_version 544138 (0.0015) [2024-04-28 02:06:04,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8915222528. Throughput: 0: 61376.2. Samples: 1820416360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:04,253][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 02:06:05,493][54818] Updated weights for policy 0, policy_version 544148 (0.0017) [2024-04-28 02:06:08,453][54818] Updated weights for policy 0, policy_version 544158 (0.0015) [2024-04-28 02:06:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.1, 300 sec: 61203.9). Total num frames: 8915533824. Throughput: 0: 61475.5. Samples: 1820790640. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:09,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 02:06:10,862][54818] Updated weights for policy 0, policy_version 544168 (0.0016) [2024-04-28 02:06:13,662][54818] Updated weights for policy 0, policy_version 544178 (0.0016) [2024-04-28 02:06:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 8915828736. Throughput: 0: 61419.7. Samples: 1820971280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:14,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 02:06:16,334][54818] Updated weights for policy 0, policy_version 544188 (0.0017) [2024-04-28 02:06:19,039][54818] Updated weights for policy 0, policy_version 544198 (0.0016) [2024-04-28 02:06:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 8916140032. Throughput: 0: 61452.5. Samples: 1821345860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:19,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 02:06:21,647][54818] Updated weights for policy 0, policy_version 544208 (0.0017) [2024-04-28 02:06:24,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61713.2, 300 sec: 61204.0). Total num frames: 8916451328. Throughput: 0: 61423.2. Samples: 1821709900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:24,253][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 02:06:24,276][54818] Updated weights for policy 0, policy_version 544218 (0.0019) [2024-04-28 02:06:26,927][54818] Updated weights for policy 0, policy_version 544228 (0.0016) [2024-04-28 02:06:29,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 8916762624. Throughput: 0: 61455.3. Samples: 1821896100. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:29,255][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 02:06:29,596][54818] Updated weights for policy 0, policy_version 544238 (0.0016) [2024-04-28 02:06:32,157][54818] Updated weights for policy 0, policy_version 544248 (0.0016) [2024-04-28 02:06:34,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8917073920. Throughput: 0: 61672.2. Samples: 1822267300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:34,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-28 02:06:34,382][54798] Signal inference workers to stop experience collection... (29650 times) [2024-04-28 02:06:34,422][54818] InferenceWorker_p0-w0: stopping experience collection (29650 times) [2024-04-28 02:06:34,439][54798] Signal inference workers to resume experience collection... (29650 times) [2024-04-28 02:06:34,439][54818] InferenceWorker_p0-w0: resuming experience collection (29650 times) [2024-04-28 02:06:34,993][54818] Updated weights for policy 0, policy_version 544258 (0.0019) [2024-04-28 02:06:37,612][54818] Updated weights for policy 0, policy_version 544268 (0.0016) [2024-04-28 02:06:39,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 8917385216. Throughput: 0: 61639.7. Samples: 1822633120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:39,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:06:40,318][54818] Updated weights for policy 0, policy_version 544278 (0.0015) [2024-04-28 02:06:42,854][54818] Updated weights for policy 0, policy_version 544288 (0.0017) [2024-04-28 02:06:44,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61985.9, 300 sec: 61315.0). Total num frames: 8917696512. Throughput: 0: 61580.3. Samples: 1822817200. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:44,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 02:06:45,796][54818] Updated weights for policy 0, policy_version 544298 (0.0018) [2024-04-28 02:06:48,309][54818] Updated weights for policy 0, policy_version 544308 (0.0019) [2024-04-28 02:06:49,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 8918007808. Throughput: 0: 61557.2. Samples: 1823186440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:49,254][54587] Avg episode reward: [(0, '0.728')] [2024-04-28 02:06:50,978][54818] Updated weights for policy 0, policy_version 544318 (0.0017) [2024-04-28 02:06:53,471][54818] Updated weights for policy 0, policy_version 544328 (0.0018) [2024-04-28 02:06:54,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61713.1, 300 sec: 61370.6). Total num frames: 8918319104. Throughput: 0: 61336.4. Samples: 1823550780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:54,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 02:06:56,284][54818] Updated weights for policy 0, policy_version 544338 (0.0016) [2024-04-28 02:06:58,662][54818] Updated weights for policy 0, policy_version 544348 (0.0017) [2024-04-28 02:06:59,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 8918614016. Throughput: 0: 61681.5. Samples: 1823746960. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:06:59,255][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:07:01,826][54818] Updated weights for policy 0, policy_version 544358 (0.0017) [2024-04-28 02:07:04,178][54818] Updated weights for policy 0, policy_version 544368 (0.0017) [2024-04-28 02:07:04,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 8918925312. Throughput: 0: 61364.4. Samples: 1824107260. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:07:04,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-28 02:07:07,247][54818] Updated weights for policy 0, policy_version 544378 (0.0015) [2024-04-28 02:07:09,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 8919236608. Throughput: 0: 61409.6. Samples: 1824473340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:07:09,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 02:07:09,457][54818] Updated weights for policy 0, policy_version 544388 (0.0018) [2024-04-28 02:07:12,359][54818] Updated weights for policy 0, policy_version 544398 (0.0017) [2024-04-28 02:07:14,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 8919531520. Throughput: 0: 61540.9. Samples: 1824665440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:07:14,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 02:07:14,581][54818] Updated weights for policy 0, policy_version 544408 (0.0017) [2024-04-28 02:07:17,645][54818] Updated weights for policy 0, policy_version 544418 (0.0017) [2024-04-28 02:07:19,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61986.0, 300 sec: 61315.0). Total num frames: 8919859200. Throughput: 0: 61458.2. Samples: 1825032920. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:07:19,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 02:07:19,552][54798] Signal inference workers to stop experience collection... (29700 times) [2024-04-28 02:07:19,582][54818] InferenceWorker_p0-w0: stopping experience collection (29700 times) [2024-04-28 02:07:19,610][54798] Signal inference workers to resume experience collection... 
(29700 times) [2024-04-28 02:07:19,611][54818] InferenceWorker_p0-w0: resuming experience collection (29700 times) [2024-04-28 02:07:20,074][54818] Updated weights for policy 0, policy_version 544428 (0.0017) [2024-04-28 02:07:23,006][54818] Updated weights for policy 0, policy_version 544438 (0.0016) [2024-04-28 02:07:24,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61712.9, 300 sec: 61315.0). Total num frames: 8920154112. Throughput: 0: 61225.2. Samples: 1825388260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:07:24,255][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 02:07:25,984][54818] Updated weights for policy 0, policy_version 544448 (0.0023) [2024-04-28 02:07:28,238][54818] Updated weights for policy 0, policy_version 544458 (0.0018) [2024-04-28 02:07:29,253][54587] Fps is (10 sec: 58981.9, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 8920449024. Throughput: 0: 61325.7. Samples: 1825576860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-04-28 02:07:29,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 02:07:31,265][54818] Updated weights for policy 0, policy_version 544468 (0.0018) [2024-04-28 02:07:33,723][54818] Updated weights for policy 0, policy_version 544478 (0.0017) [2024-04-28 02:07:34,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8920760320. Throughput: 0: 61374.3. Samples: 1825948280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:07:34,254][54587] Avg episode reward: [(0, '0.708')] [2024-04-28 02:07:36,602][54818] Updated weights for policy 0, policy_version 544488 (0.0019) [2024-04-28 02:07:39,079][54818] Updated weights for policy 0, policy_version 544498 (0.0016) [2024-04-28 02:07:39,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.7, 300 sec: 61259.5). Total num frames: 8921055232. Throughput: 0: 61312.3. Samples: 1826309840. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:07:39,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 02:07:41,745][54818] Updated weights for policy 0, policy_version 544508 (0.0015) [2024-04-28 02:07:44,253][54587] Fps is (10 sec: 58982.1, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 8921350144. Throughput: 0: 60975.7. Samples: 1826490860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:07:44,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 02:07:44,502][54818] Updated weights for policy 0, policy_version 544518 (0.0017) [2024-04-28 02:07:47,271][54818] Updated weights for policy 0, policy_version 544528 (0.0016) [2024-04-28 02:07:49,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 8921661440. Throughput: 0: 61125.7. Samples: 1826857920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:07:49,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 02:07:49,352][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000544536_8921677824.pth... [2024-04-28 02:07:49,408][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000543638_8906964992.pth [2024-04-28 02:07:49,744][54818] Updated weights for policy 0, policy_version 544538 (0.0017) [2024-04-28 02:07:52,500][54818] Updated weights for policy 0, policy_version 544548 (0.0016) [2024-04-28 02:07:54,253][54587] Fps is (10 sec: 62258.3, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 8921972736. Throughput: 0: 61188.8. Samples: 1827226840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:07:54,254][54587] Avg episode reward: [(0, '0.508')] [2024-04-28 02:07:54,960][54818] Updated weights for policy 0, policy_version 544558 (0.0017) [2024-04-28 02:07:57,735][54818] Updated weights for policy 0, policy_version 544568 (0.0016) [2024-04-28 02:07:59,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61167.0, 300 sec: 61259.5). 
Total num frames: 8922284032. Throughput: 0: 61108.1. Samples: 1827415300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:07:59,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 02:08:00,394][54818] Updated weights for policy 0, policy_version 544578 (0.0016) [2024-04-28 02:08:03,068][54818] Updated weights for policy 0, policy_version 544588 (0.0022) [2024-04-28 02:08:04,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 8922595328. Throughput: 0: 60951.5. Samples: 1827775740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:04,255][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 02:08:06,006][54818] Updated weights for policy 0, policy_version 544598 (0.0016) [2024-04-28 02:08:06,729][54798] Signal inference workers to stop experience collection... (29750 times) [2024-04-28 02:08:06,751][54818] InferenceWorker_p0-w0: stopping experience collection (29750 times) [2024-04-28 02:08:06,821][54798] Signal inference workers to resume experience collection... (29750 times) [2024-04-28 02:08:06,821][54818] InferenceWorker_p0-w0: resuming experience collection (29750 times) [2024-04-28 02:08:08,517][54818] Updated weights for policy 0, policy_version 544608 (0.0016) [2024-04-28 02:08:09,253][54587] Fps is (10 sec: 60620.0, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 8922890240. Throughput: 0: 61337.7. Samples: 1828148460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:09,255][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 02:08:11,233][54818] Updated weights for policy 0, policy_version 544618 (0.0017) [2024-04-28 02:08:13,840][54818] Updated weights for policy 0, policy_version 544628 (0.0015) [2024-04-28 02:08:14,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8923217920. Throughput: 0: 61340.1. Samples: 1828337160. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:14,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 02:08:16,744][54818] Updated weights for policy 0, policy_version 544638 (0.0016) [2024-04-28 02:08:19,065][54818] Updated weights for policy 0, policy_version 544648 (0.0015) [2024-04-28 02:08:19,253][54587] Fps is (10 sec: 62259.3, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 8923512832. Throughput: 0: 61079.4. Samples: 1828696860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:19,255][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 02:08:21,986][54818] Updated weights for policy 0, policy_version 544658 (0.0017) [2024-04-28 02:08:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8923824128. Throughput: 0: 61311.8. Samples: 1829068860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:24,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 02:08:24,356][54818] Updated weights for policy 0, policy_version 544668 (0.0016) [2024-04-28 02:08:27,155][54818] Updated weights for policy 0, policy_version 544678 (0.0018) [2024-04-28 02:08:29,253][54587] Fps is (10 sec: 62260.4, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 8924135424. Throughput: 0: 61595.7. Samples: 1829262660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:29,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 02:08:29,700][54818] Updated weights for policy 0, policy_version 544688 (0.0016) [2024-04-28 02:08:32,371][54818] Updated weights for policy 0, policy_version 544698 (0.0016) [2024-04-28 02:08:33,617][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] [2024-04-28 02:08:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 8924446720. Throughput: 0: 61383.6. Samples: 1829620180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:34,255][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 02:08:34,255][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] [2024-04-28 02:08:34,970][54818] Updated weights for policy 0, policy_version 544708 (0.0016) [2024-04-28 02:08:37,657][54818] Updated weights for policy 0, policy_version 544718 (0.0016) [2024-04-28 02:08:39,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 8924741632. Throughput: 0: 61385.0. Samples: 1829989160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:39,256][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 02:08:40,420][54818] Updated weights for policy 0, policy_version 544728 (0.0022) [2024-04-28 02:08:43,137][54818] Updated weights for policy 0, policy_version 544738 (0.0015) [2024-04-28 02:08:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 8925052928. Throughput: 0: 61357.3. Samples: 1830176380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:44,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 02:08:46,282][54818] Updated weights for policy 0, policy_version 544748 (0.0016) [2024-04-28 02:08:48,404][54818] Updated weights for policy 0, policy_version 544758 (0.0018) [2024-04-28 02:08:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.0, 300 sec: 61370.6). Total num frames: 8925364224. Throughput: 0: 61420.0. Samples: 1830539640. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:49,255][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 02:08:49,263][54587] No heartbeat for components: RolloutWorker_w4 (17077 seconds), RolloutWorker_w5 (3177 seconds) [2024-04-28 02:08:51,375][54818] Updated weights for policy 0, policy_version 544768 (0.0015) [2024-04-28 02:08:53,654][54818] Updated weights for policy 0, policy_version 544778 (0.0016) [2024-04-28 02:08:54,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 8925659136. Throughput: 0: 61181.1. Samples: 1830901600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:54,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 02:08:56,813][54818] Updated weights for policy 0, policy_version 544788 (0.0017) [2024-04-28 02:08:59,253][54587] Fps is (10 sec: 58983.5, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 8925954048. Throughput: 0: 61306.0. Samples: 1831095920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 02:08:59,253][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 02:08:59,329][54818] Updated weights for policy 0, policy_version 544798 (0.0016) [2024-04-28 02:09:00,293][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (100 times) [2024-04-28 02:09:01,597][54798] Signal inference workers to stop experience collection... (29800 times) [2024-04-28 02:09:01,639][54818] InferenceWorker_p0-w0: stopping experience collection (29800 times) [2024-04-28 02:09:01,655][54798] Signal inference workers to resume experience collection... (29800 times) [2024-04-28 02:09:01,656][54818] InferenceWorker_p0-w0: resuming experience collection (29800 times) [2024-04-28 02:09:02,053][54818] Updated weights for policy 0, policy_version 544808 (0.0016) [2024-04-28 02:09:04,253][54587] Fps is (10 sec: 58982.5, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 8926248960. Throughput: 0: 61292.7. 
Samples: 1831455020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:04,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 02:09:04,631][54818] Updated weights for policy 0, policy_version 544818 (0.0016) [2024-04-28 02:09:07,412][54818] Updated weights for policy 0, policy_version 544828 (0.0016) [2024-04-28 02:09:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.2, 300 sec: 61259.5). Total num frames: 8926560256. Throughput: 0: 61147.3. Samples: 1831820480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:09,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 02:09:09,926][54818] Updated weights for policy 0, policy_version 544838 (0.0017) [2024-04-28 02:09:12,779][54818] Updated weights for policy 0, policy_version 544848 (0.0015) [2024-04-28 02:09:14,253][54587] Fps is (10 sec: 62258.6, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 8926871552. Throughput: 0: 61101.2. Samples: 1832012220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:14,255][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 02:09:15,063][54818] Updated weights for policy 0, policy_version 544858 (0.0019) [2024-04-28 02:09:18,054][54818] Updated weights for policy 0, policy_version 544868 (0.0016) [2024-04-28 02:09:19,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60894.0, 300 sec: 61315.0). Total num frames: 8927166464. Throughput: 0: 61279.6. Samples: 1832377760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:19,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 02:09:20,651][54818] Updated weights for policy 0, policy_version 544878 (0.0018) [2024-04-28 02:09:23,445][54818] Updated weights for policy 0, policy_version 544888 (0.0017) [2024-04-28 02:09:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.8, 300 sec: 61315.0). Total num frames: 8927477760. Throughput: 0: 61071.2. Samples: 1832737360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:24,255][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 02:09:26,082][54818] Updated weights for policy 0, policy_version 544898 (0.0018) [2024-04-28 02:09:27,264][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (200 times) [2024-04-28 02:09:28,652][54818] Updated weights for policy 0, policy_version 544908 (0.0018) [2024-04-28 02:09:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60620.7, 300 sec: 61259.5). Total num frames: 8927772672. Throughput: 0: 61078.7. Samples: 1832924920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:29,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 02:09:31,503][54818] Updated weights for policy 0, policy_version 544918 (0.0018) [2024-04-28 02:09:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.8, 300 sec: 61315.0). Total num frames: 8928083968. Throughput: 0: 61279.7. Samples: 1833297220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:34,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 02:09:34,411][54818] Updated weights for policy 0, policy_version 544928 (0.0017) [2024-04-28 02:09:36,729][54818] Updated weights for policy 0, policy_version 544938 (0.0016) [2024-04-28 02:09:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 8928395264. Throughput: 0: 61315.5. Samples: 1833660800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:39,253][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 02:09:39,628][54818] Updated weights for policy 0, policy_version 544948 (0.0015) [2024-04-28 02:09:41,890][54818] Updated weights for policy 0, policy_version 544958 (0.0017) [2024-04-28 02:09:43,663][54798] Signal inference workers to stop experience collection... 
(29850 times) [2024-04-28 02:09:43,682][54818] InferenceWorker_p0-w0: stopping experience collection (29850 times) [2024-04-28 02:09:43,720][54798] Signal inference workers to resume experience collection... (29850 times) [2024-04-28 02:09:43,720][54818] InferenceWorker_p0-w0: resuming experience collection (29850 times) [2024-04-28 02:09:44,253][54587] Fps is (10 sec: 62258.7, 60 sec: 60893.8, 300 sec: 61315.0). Total num frames: 8928706560. Throughput: 0: 61105.9. Samples: 1833845700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:44,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 02:09:44,778][54818] Updated weights for policy 0, policy_version 544968 (0.0019) [2024-04-28 02:09:47,232][54818] Updated weights for policy 0, policy_version 544978 (0.0016) [2024-04-28 02:09:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60894.0, 300 sec: 61370.6). Total num frames: 8929017856. Throughput: 0: 61557.7. Samples: 1834225120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:49,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 02:09:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000544984_8929017856.pth... [2024-04-28 02:09:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000544087_8914321408.pth [2024-04-28 02:09:50,000][54818] Updated weights for policy 0, policy_version 544988 (0.0018) [2024-04-28 02:09:52,807][54818] Updated weights for policy 0, policy_version 544998 (0.0016) [2024-04-28 02:09:54,131][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (300 times) [2024-04-28 02:09:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.7, 300 sec: 61370.6). Total num frames: 8929329152. Throughput: 0: 61570.3. Samples: 1834591160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:54,255][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 02:09:55,317][54818] Updated weights for policy 0, policy_version 545008 (0.0016) [2024-04-28 02:09:58,084][54818] Updated weights for policy 0, policy_version 545018 (0.0017) [2024-04-28 02:09:59,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61439.8, 300 sec: 61370.6). Total num frames: 8929640448. Throughput: 0: 61285.3. Samples: 1834770060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:09:59,254][54587] Avg episode reward: [(0, '0.508')] [2024-04-28 02:10:00,595][54818] Updated weights for policy 0, policy_version 545028 (0.0017) [2024-04-28 02:10:03,432][54818] Updated weights for policy 0, policy_version 545038 (0.0016) [2024-04-28 02:10:04,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 8929935360. Throughput: 0: 61433.7. Samples: 1835142280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:10:04,255][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 02:10:05,846][54818] Updated weights for policy 0, policy_version 545048 (0.0016) [2024-04-28 02:10:08,801][54818] Updated weights for policy 0, policy_version 545058 (0.0015) [2024-04-28 02:10:09,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 8930246656. Throughput: 0: 61801.0. Samples: 1835518400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:10:09,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 02:10:10,999][54818] Updated weights for policy 0, policy_version 545068 (0.0016) [2024-04-28 02:10:13,994][54818] Updated weights for policy 0, policy_version 545078 (0.0016) [2024-04-28 02:10:14,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 8930557952. Throughput: 0: 61653.4. Samples: 1835699320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:10:14,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 02:10:16,636][54818] Updated weights for policy 0, policy_version 545088 (0.0017) [2024-04-28 02:10:19,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61713.0, 300 sec: 61426.1). Total num frames: 8930869248. Throughput: 0: 61571.1. Samples: 1836067920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:10:19,262][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 02:10:19,532][54818] Updated weights for policy 0, policy_version 545098 (0.0016) [2024-04-28 02:10:20,359][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (400 times) [2024-04-28 02:10:22,178][54818] Updated weights for policy 0, policy_version 545108 (0.0017) [2024-04-28 02:10:24,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61713.0, 300 sec: 61370.5). Total num frames: 8931180544. Throughput: 0: 61623.4. Samples: 1836433860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:10:24,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 02:10:24,702][54818] Updated weights for policy 0, policy_version 545118 (0.0017) [2024-04-28 02:10:27,346][54818] Updated weights for policy 0, policy_version 545128 (0.0017) [2024-04-28 02:10:29,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61986.0, 300 sec: 61370.5). Total num frames: 8931491840. Throughput: 0: 61674.6. Samples: 1836621060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:10:29,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 02:10:29,842][54818] Updated weights for policy 0, policy_version 545138 (0.0017) [2024-04-28 02:10:33,007][54818] Updated weights for policy 0, policy_version 545148 (0.0016) [2024-04-28 02:10:34,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61713.2, 300 sec: 61370.6). Total num frames: 8931786752. Throughput: 0: 61510.3. Samples: 1836993080. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:10:34,253][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 02:10:35,087][54818] Updated weights for policy 0, policy_version 545158 (0.0017) [2024-04-28 02:10:35,838][54798] Signal inference workers to stop experience collection... (29900 times) [2024-04-28 02:10:35,888][54818] InferenceWorker_p0-w0: stopping experience collection (29900 times) [2024-04-28 02:10:35,890][54798] Signal inference workers to resume experience collection... (29900 times) [2024-04-28 02:10:35,895][54818] InferenceWorker_p0-w0: resuming experience collection (29900 times) [2024-04-28 02:10:38,245][54818] Updated weights for policy 0, policy_version 545168 (0.0017) [2024-04-28 02:10:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61712.9, 300 sec: 61426.1). Total num frames: 8932098048. Throughput: 0: 61580.0. Samples: 1837362260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:10:39,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 02:10:40,447][54818] Updated weights for policy 0, policy_version 545178 (0.0019) [2024-04-28 02:10:43,768][54818] Updated weights for policy 0, policy_version 545188 (0.0017) [2024-04-28 02:10:44,253][54587] Fps is (10 sec: 60619.6, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 8932392960. Throughput: 0: 61335.5. Samples: 1837530160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:10:44,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 02:10:45,744][54818] Updated weights for policy 0, policy_version 545198 (0.0015) [2024-04-28 02:10:47,494][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (500 times) [2024-04-28 02:10:48,840][54818] Updated weights for policy 0, policy_version 545208 (0.0016) [2024-04-28 02:10:49,253][54587] Fps is (10 sec: 58983.6, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8932687872. Throughput: 0: 61484.2. Samples: 1837909060. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:10:49,253][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 02:10:50,940][54818] Updated weights for policy 0, policy_version 545218 (0.0015) [2024-04-28 02:10:54,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61167.2, 300 sec: 61315.1). Total num frames: 8932999168. Throughput: 0: 61273.8. Samples: 1838275720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:10:54,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 02:10:54,336][54818] Updated weights for policy 0, policy_version 545228 (0.0016) [2024-04-28 02:10:56,218][54818] Updated weights for policy 0, policy_version 545238 (0.0016) [2024-04-28 02:10:59,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 8933310464. Throughput: 0: 61190.1. Samples: 1838452880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:10:59,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 02:10:59,717][54818] Updated weights for policy 0, policy_version 545248 (0.0015) [2024-04-28 02:11:02,292][54818] Updated weights for policy 0, policy_version 545258 (0.0018) [2024-04-28 02:11:04,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 8933621760. Throughput: 0: 61253.8. Samples: 1838824340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:04,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 02:11:04,904][54818] Updated weights for policy 0, policy_version 545268 (0.0015) [2024-04-28 02:11:07,629][54818] Updated weights for policy 0, policy_version 545278 (0.0016) [2024-04-28 02:11:09,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 8933916672. Throughput: 0: 61402.0. Samples: 1839196940. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:09,253][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 02:11:10,072][54818] Updated weights for policy 0, policy_version 545288 (0.0016) [2024-04-28 02:11:12,985][54818] Updated weights for policy 0, policy_version 545298 (0.0017) [2024-04-28 02:11:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.8, 300 sec: 61315.0). Total num frames: 8934227968. Throughput: 0: 60957.9. Samples: 1839364160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:14,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 02:11:14,330][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (600 times) [2024-04-28 02:11:15,396][54818] Updated weights for policy 0, policy_version 545308 (0.0016) [2024-04-28 02:11:18,422][54818] Updated weights for policy 0, policy_version 545318 (0.0017) [2024-04-28 02:11:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 8934539264. Throughput: 0: 60986.6. Samples: 1839737480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:19,254][54587] Avg episode reward: [(0, '0.512')] [2024-04-28 02:11:20,753][54818] Updated weights for policy 0, policy_version 545328 (0.0017) [2024-04-28 02:11:23,694][54818] Updated weights for policy 0, policy_version 545338 (0.0018) [2024-04-28 02:11:24,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61167.1, 300 sec: 61315.1). Total num frames: 8934850560. Throughput: 0: 60961.1. Samples: 1840105500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:24,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 02:11:26,041][54818] Updated weights for policy 0, policy_version 545348 (0.0018) [2024-04-28 02:11:29,026][54818] Updated weights for policy 0, policy_version 545358 (0.0016) [2024-04-28 02:11:29,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 8935161856. 
Throughput: 0: 61160.5. Samples: 1840282380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:29,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 02:11:29,817][54798] Signal inference workers to stop experience collection... (29950 times) [2024-04-28 02:11:29,818][54798] Signal inference workers to resume experience collection... (29950 times) [2024-04-28 02:11:29,833][54818] InferenceWorker_p0-w0: stopping experience collection (29950 times) [2024-04-28 02:11:29,834][54818] InferenceWorker_p0-w0: resuming experience collection (29950 times) [2024-04-28 02:11:31,319][54818] Updated weights for policy 0, policy_version 545368 (0.0015) [2024-04-28 02:11:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 8935456768. Throughput: 0: 61060.4. Samples: 1840656780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:34,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 02:11:34,264][54818] Updated weights for policy 0, policy_version 545378 (0.0015) [2024-04-28 02:11:36,618][54818] Updated weights for policy 0, policy_version 545388 (0.0018) [2024-04-28 02:11:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8935768064. Throughput: 0: 61119.3. Samples: 1841026100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:39,255][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 02:11:39,683][54818] Updated weights for policy 0, policy_version 545398 (0.0016) [2024-04-28 02:11:40,609][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (700 times) [2024-04-28 02:11:42,391][54818] Updated weights for policy 0, policy_version 545408 (0.0016) [2024-04-28 02:11:44,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 8936079360. Throughput: 0: 61134.3. Samples: 1841203920. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:44,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 02:11:44,937][54818] Updated weights for policy 0, policy_version 545418 (0.0017) [2024-04-28 02:11:47,866][54818] Updated weights for policy 0, policy_version 545428 (0.0018) [2024-04-28 02:11:49,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 8936390656. Throughput: 0: 61350.3. Samples: 1841585100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:49,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 02:11:49,261][54587] No heartbeat for components: RolloutWorker_w4 (17257 seconds), RolloutWorker_w5 (3357 seconds) [2024-04-28 02:11:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000545434_8936390656.pth... [2024-04-28 02:11:49,333][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000544536_8921677824.pth [2024-04-28 02:11:50,222][54818] Updated weights for policy 0, policy_version 545438 (0.0017) [2024-04-28 02:11:53,077][54818] Updated weights for policy 0, policy_version 545448 (0.0017) [2024-04-28 02:11:54,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8936685568. Throughput: 0: 60977.8. Samples: 1841940940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:54,253][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 02:11:55,395][54818] Updated weights for policy 0, policy_version 545458 (0.0019) [2024-04-28 02:11:58,395][54818] Updated weights for policy 0, policy_version 545468 (0.0015) [2024-04-28 02:11:59,253][54587] Fps is (10 sec: 58982.9, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 8936980480. Throughput: 0: 61320.7. Samples: 1842123580. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 02:11:59,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 02:12:00,728][54818] Updated weights for policy 0, policy_version 545478 (0.0023) [2024-04-28 02:12:03,782][54818] Updated weights for policy 0, policy_version 545488 (0.0017) [2024-04-28 02:12:04,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 8937308160. Throughput: 0: 61302.9. Samples: 1842496120. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:04,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 02:12:06,188][54818] Updated weights for policy 0, policy_version 545498 (0.0018) [2024-04-28 02:12:07,776][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (800 times) [2024-04-28 02:12:08,924][54818] Updated weights for policy 0, policy_version 545508 (0.0016) [2024-04-28 02:12:09,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 8937603072. Throughput: 0: 61353.9. Samples: 1842866420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:09,253][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 02:12:11,889][54818] Updated weights for policy 0, policy_version 545518 (0.0016) [2024-04-28 02:12:13,673][54798] Signal inference workers to stop experience collection... (30000 times) [2024-04-28 02:12:13,673][54798] Signal inference workers to resume experience collection... (30000 times) [2024-04-28 02:12:13,689][54818] InferenceWorker_p0-w0: stopping experience collection (30000 times) [2024-04-28 02:12:13,690][54818] InferenceWorker_p0-w0: resuming experience collection (30000 times) [2024-04-28 02:12:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 8937914368. Throughput: 0: 61535.5. Samples: 1843051480. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:14,255][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 02:12:14,469][54818] Updated weights for policy 0, policy_version 545528 (0.0018) [2024-04-28 02:12:17,155][54818] Updated weights for policy 0, policy_version 545538 (0.0015) [2024-04-28 02:12:19,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8938209280. Throughput: 0: 61389.8. Samples: 1843419320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:19,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 02:12:19,675][54818] Updated weights for policy 0, policy_version 545548 (0.0015) [2024-04-28 02:12:22,404][54818] Updated weights for policy 0, policy_version 545558 (0.0016) [2024-04-28 02:12:24,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8938520576. Throughput: 0: 61276.2. Samples: 1843783520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:24,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 02:12:24,980][54818] Updated weights for policy 0, policy_version 545568 (0.0017) [2024-04-28 02:12:27,710][54818] Updated weights for policy 0, policy_version 545578 (0.0017) [2024-04-28 02:12:29,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 8938831872. Throughput: 0: 61440.7. Samples: 1843968760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:29,254][54587] Avg episode reward: [(0, '0.681')] [2024-04-28 02:12:30,192][54818] Updated weights for policy 0, policy_version 545588 (0.0016) [2024-04-28 02:12:33,169][54818] Updated weights for policy 0, policy_version 545598 (0.0018) [2024-04-28 02:12:34,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 8939126784. Throughput: 0: 61043.1. Samples: 1844332040. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:34,254][54587] Avg episode reward: [(0, '0.479')] [2024-04-28 02:12:34,358][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (900 times) [2024-04-28 02:12:35,732][54818] Updated weights for policy 0, policy_version 545608 (0.0016) [2024-04-28 02:12:38,390][54818] Updated weights for policy 0, policy_version 545618 (0.0016) [2024-04-28 02:12:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 8939438080. Throughput: 0: 61352.7. Samples: 1844701820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:39,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 02:12:41,062][54818] Updated weights for policy 0, policy_version 545628 (0.0017) [2024-04-28 02:12:43,643][54818] Updated weights for policy 0, policy_version 545638 (0.0016) [2024-04-28 02:12:44,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 8939749376. Throughput: 0: 61328.3. Samples: 1844883360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:44,255][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 02:12:46,371][54818] Updated weights for policy 0, policy_version 545648 (0.0017) [2024-04-28 02:12:49,144][54818] Updated weights for policy 0, policy_version 545658 (0.0016) [2024-04-28 02:12:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61315.1). Total num frames: 8940060672. Throughput: 0: 61253.4. Samples: 1845252520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:49,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 02:12:51,626][54818] Updated weights for policy 0, policy_version 545668 (0.0016) [2024-04-28 02:12:54,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 8940371968. Throughput: 0: 61074.0. Samples: 1845614760. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:54,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 02:12:54,569][54818] Updated weights for policy 0, policy_version 545678 (0.0023) [2024-04-28 02:12:56,980][54818] Updated weights for policy 0, policy_version 545688 (0.0017) [2024-04-28 02:12:59,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8940666880. Throughput: 0: 61039.8. Samples: 1845798260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:12:59,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 02:12:59,912][54818] Updated weights for policy 0, policy_version 545698 (0.0016) [2024-04-28 02:13:01,099][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (1000 times) [2024-04-28 02:13:02,238][54818] Updated weights for policy 0, policy_version 545708 (0.0017) [2024-04-28 02:13:04,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.1, 300 sec: 61315.1). Total num frames: 8940978176. Throughput: 0: 61261.0. Samples: 1846176060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:13:04,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 02:13:05,261][54818] Updated weights for policy 0, policy_version 545718 (0.0016) [2024-04-28 02:13:06,022][54798] Signal inference workers to stop experience collection... (30050 times) [2024-04-28 02:13:06,053][54818] InferenceWorker_p0-w0: stopping experience collection (30050 times) [2024-04-28 02:13:06,081][54798] Signal inference workers to resume experience collection... (30050 times) [2024-04-28 02:13:06,081][54818] InferenceWorker_p0-w0: resuming experience collection (30050 times) [2024-04-28 02:13:07,702][54818] Updated weights for policy 0, policy_version 545728 (0.0017) [2024-04-28 02:13:09,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 8941289472. Throughput: 0: 61314.1. Samples: 1846542660. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:13:09,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 02:13:10,700][54818] Updated weights for policy 0, policy_version 545738 (0.0016) [2024-04-28 02:13:13,097][54818] Updated weights for policy 0, policy_version 545748 (0.0016) [2024-04-28 02:13:14,253][54587] Fps is (10 sec: 62257.9, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 8941600768. Throughput: 0: 61247.5. Samples: 1846724900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:13:14,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 02:13:15,977][54818] Updated weights for policy 0, policy_version 545758 (0.0016) [2024-04-28 02:13:18,316][54818] Updated weights for policy 0, policy_version 545768 (0.0018) [2024-04-28 02:13:19,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 8941912064. Throughput: 0: 61361.7. Samples: 1847093320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:13:19,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 02:13:21,201][54818] Updated weights for policy 0, policy_version 545778 (0.0015) [2024-04-28 02:13:23,781][54818] Updated weights for policy 0, policy_version 545788 (0.0018) [2024-04-28 02:13:24,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61712.9, 300 sec: 61315.0). Total num frames: 8942223360. Throughput: 0: 61177.7. Samples: 1847454820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:13:24,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 02:13:26,580][54818] Updated weights for policy 0, policy_version 545798 (0.0016) [2024-04-28 02:13:27,817][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (1100 times) [2024-04-28 02:13:29,043][54818] Updated weights for policy 0, policy_version 545808 (0.0016) [2024-04-28 02:13:29,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8942518272. 
Throughput: 0: 61303.0. Samples: 1847642000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 02:13:29,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 02:13:31,952][54818] Updated weights for policy 0, policy_version 545818 (0.0016) [2024-04-28 02:13:34,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 8942829568. Throughput: 0: 61161.3. Samples: 1848004780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:13:34,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 02:13:34,489][54818] Updated weights for policy 0, policy_version 545828 (0.0016) [2024-04-28 02:13:37,323][54818] Updated weights for policy 0, policy_version 545838 (0.0015) [2024-04-28 02:13:39,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 8943124480. Throughput: 0: 61230.4. Samples: 1848370120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:13:39,254][54587] Avg episode reward: [(0, '0.688')] [2024-04-28 02:13:39,970][54818] Updated weights for policy 0, policy_version 545848 (0.0016) [2024-04-28 02:13:42,566][54818] Updated weights for policy 0, policy_version 545858 (0.0016) [2024-04-28 02:13:44,253][54587] Fps is (10 sec: 58982.8, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8943419392. Throughput: 0: 61325.2. Samples: 1848557900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:13:44,254][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 02:13:45,267][54818] Updated weights for policy 0, policy_version 545868 (0.0016) [2024-04-28 02:13:47,961][54818] Updated weights for policy 0, policy_version 545878 (0.0018) [2024-04-28 02:13:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.1, 300 sec: 61259.5). Total num frames: 8943730688. Throughput: 0: 61035.5. Samples: 1848922660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:13:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:13:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000545883_8943747072.pth... [2024-04-28 02:13:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000544984_8929017856.pth [2024-04-28 02:13:50,662][54818] Updated weights for policy 0, policy_version 545888 (0.0018) [2024-04-28 02:13:53,296][54818] Updated weights for policy 0, policy_version 545898 (0.0019) [2024-04-28 02:13:54,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61167.1, 300 sec: 61315.0). Total num frames: 8944041984. Throughput: 0: 60987.7. Samples: 1849287100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:13:54,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 02:13:54,393][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (1200 times) [2024-04-28 02:13:56,024][54818] Updated weights for policy 0, policy_version 545908 (0.0015) [2024-04-28 02:13:58,529][54818] Updated weights for policy 0, policy_version 545918 (0.0019) [2024-04-28 02:13:59,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61439.8, 300 sec: 61370.5). Total num frames: 8944353280. Throughput: 0: 61100.9. Samples: 1849474440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:13:59,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 02:14:01,593][54818] Updated weights for policy 0, policy_version 545928 (0.0017) [2024-04-28 02:14:02,412][54798] Signal inference workers to stop experience collection... (30100 times) [2024-04-28 02:14:02,452][54818] InferenceWorker_p0-w0: stopping experience collection (30100 times) [2024-04-28 02:14:02,471][54798] Signal inference workers to resume experience collection... 
(30100 times) [2024-04-28 02:14:02,472][54818] InferenceWorker_p0-w0: resuming experience collection (30100 times) [2024-04-28 02:14:04,091][54818] Updated weights for policy 0, policy_version 545938 (0.0017) [2024-04-28 02:14:04,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 8944664576. Throughput: 0: 60887.7. Samples: 1849833260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:04,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 02:14:06,760][54818] Updated weights for policy 0, policy_version 545948 (0.0016) [2024-04-28 02:14:09,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 8944959488. Throughput: 0: 61024.2. Samples: 1850200900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:09,255][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 02:14:09,436][54818] Updated weights for policy 0, policy_version 545958 (0.0015) [2024-04-28 02:14:12,252][54818] Updated weights for policy 0, policy_version 545968 (0.0018) [2024-04-28 02:14:14,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 8945270784. Throughput: 0: 61046.7. Samples: 1850389100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:14,255][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 02:14:14,911][54818] Updated weights for policy 0, policy_version 545978 (0.0018) [2024-04-28 02:14:17,564][54818] Updated weights for policy 0, policy_version 545988 (0.0015) [2024-04-28 02:14:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 8945582080. Throughput: 0: 61089.0. Samples: 1850753780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:19,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 02:14:20,365][54818] Updated weights for policy 0, policy_version 545998 (0.0016) [2024-04-28 02:14:21,395][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (1300 times) [2024-04-28 02:14:22,946][54818] Updated weights for policy 0, policy_version 546008 (0.0020) [2024-04-28 02:14:24,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61370.6). Total num frames: 8945876992. Throughput: 0: 61109.7. Samples: 1851120060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:24,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:14:25,573][54818] Updated weights for policy 0, policy_version 546018 (0.0016) [2024-04-28 02:14:28,094][54818] Updated weights for policy 0, policy_version 546028 (0.0016) [2024-04-28 02:14:29,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.9, 300 sec: 61370.5). Total num frames: 8946188288. Throughput: 0: 61088.2. Samples: 1851306880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:29,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 02:14:31,116][54818] Updated weights for policy 0, policy_version 546038 (0.0017) [2024-04-28 02:14:33,581][54818] Updated weights for policy 0, policy_version 546048 (0.0017) [2024-04-28 02:14:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 8946499584. Throughput: 0: 61034.5. Samples: 1851669220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:34,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 02:14:36,343][54818] Updated weights for policy 0, policy_version 546058 (0.0016) [2024-04-28 02:14:38,831][54818] Updated weights for policy 0, policy_version 546068 (0.0017) [2024-04-28 02:14:39,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61166.9, 300 sec: 61315.1). Total num frames: 8946794496. 
Throughput: 0: 61107.9. Samples: 1852036960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:39,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 02:14:41,787][54818] Updated weights for policy 0, policy_version 546078 (0.0016) [2024-04-28 02:14:44,150][54818] Updated weights for policy 0, policy_version 546088 (0.0017) [2024-04-28 02:14:44,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 8947105792. Throughput: 0: 61159.7. Samples: 1852226620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:44,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 02:14:46,982][54818] Updated weights for policy 0, policy_version 546098 (0.0015) [2024-04-28 02:14:48,268][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (1400 times) [2024-04-28 02:14:49,254][54587] Fps is (10 sec: 62251.9, 60 sec: 61438.7, 300 sec: 61314.8). Total num frames: 8947417088. Throughput: 0: 61345.1. Samples: 1852593860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:49,255][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 02:14:49,265][54587] No heartbeat for components: RolloutWorker_w4 (17437 seconds), RolloutWorker_w5 (3537 seconds) [2024-04-28 02:14:49,680][54818] Updated weights for policy 0, policy_version 546108 (0.0016) [2024-04-28 02:14:51,946][54798] Signal inference workers to stop experience collection... (30150 times) [2024-04-28 02:14:51,951][54798] Signal inference workers to resume experience collection... (30150 times) [2024-04-28 02:14:51,958][54818] InferenceWorker_p0-w0: stopping experience collection (30150 times) [2024-04-28 02:14:51,958][54818] InferenceWorker_p0-w0: resuming experience collection (30150 times) [2024-04-28 02:14:52,306][54818] Updated weights for policy 0, policy_version 546118 (0.0021) [2024-04-28 02:14:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61439.9, 300 sec: 61315.1). 
Total num frames: 8947728384. Throughput: 0: 61380.8. Samples: 1852963040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:14:54,255][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 02:14:55,158][54818] Updated weights for policy 0, policy_version 546128 (0.0017) [2024-04-28 02:14:57,578][54818] Updated weights for policy 0, policy_version 546138 (0.0017) [2024-04-28 02:14:59,253][54587] Fps is (10 sec: 60628.0, 60 sec: 61167.1, 300 sec: 61315.1). Total num frames: 8948023296. Throughput: 0: 61141.5. Samples: 1853140460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:14:59,255][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 02:15:00,469][54818] Updated weights for policy 0, policy_version 546148 (0.0017) [2024-04-28 02:15:03,087][54818] Updated weights for policy 0, policy_version 546158 (0.0016) [2024-04-28 02:15:04,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 8948318208. Throughput: 0: 61081.5. Samples: 1853502440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:04,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 02:15:05,697][54818] Updated weights for policy 0, policy_version 546168 (0.0015) [2024-04-28 02:15:08,354][54818] Updated weights for policy 0, policy_version 546178 (0.0017) [2024-04-28 02:15:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 8948629504. Throughput: 0: 61255.4. Samples: 1853876560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:09,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 02:15:11,045][54818] Updated weights for policy 0, policy_version 546188 (0.0017) [2024-04-28 02:15:13,609][54818] Updated weights for policy 0, policy_version 546198 (0.0016) [2024-04-28 02:15:14,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8948940800. Throughput: 0: 61162.9. Samples: 1854059200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:14,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 02:15:14,937][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (1500 times) [2024-04-28 02:15:16,276][54818] Updated weights for policy 0, policy_version 546208 (0.0016) [2024-04-28 02:15:19,025][54818] Updated weights for policy 0, policy_version 546218 (0.0015) [2024-04-28 02:15:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 8949235712. Throughput: 0: 61398.6. Samples: 1854432160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:19,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 02:15:21,501][54818] Updated weights for policy 0, policy_version 546228 (0.0017) [2024-04-28 02:15:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8949547008. Throughput: 0: 61356.9. Samples: 1854798020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:24,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:15:24,312][54818] Updated weights for policy 0, policy_version 546238 (0.0017) [2024-04-28 02:15:27,186][54818] Updated weights for policy 0, policy_version 546248 (0.0016) [2024-04-28 02:15:29,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60894.1, 300 sec: 61204.0). Total num frames: 8949841920. Throughput: 0: 61121.5. Samples: 1854977080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:29,253][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 02:15:29,832][54818] Updated weights for policy 0, policy_version 546258 (0.0017) [2024-04-28 02:15:32,661][54818] Updated weights for policy 0, policy_version 546268 (0.0016) [2024-04-28 02:15:34,253][54587] Fps is (10 sec: 58981.4, 60 sec: 60620.7, 300 sec: 61148.4). Total num frames: 8950136832. Throughput: 0: 61299.2. Samples: 1855352260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:34,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 02:15:34,984][54818] Updated weights for policy 0, policy_version 546278 (0.0026) [2024-04-28 02:15:37,935][54818] Updated weights for policy 0, policy_version 546288 (0.0018) [2024-04-28 02:15:39,253][54587] Fps is (10 sec: 58982.1, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 8950431744. Throughput: 0: 61196.5. Samples: 1855716880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:39,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 02:15:40,418][54818] Updated weights for policy 0, policy_version 546298 (0.0016) [2024-04-28 02:15:41,533][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (1600 times) [2024-04-28 02:15:43,367][54818] Updated weights for policy 0, policy_version 546308 (0.0018) [2024-04-28 02:15:44,253][54587] Fps is (10 sec: 60622.2, 60 sec: 60620.9, 300 sec: 61204.0). Total num frames: 8950743040. Throughput: 0: 61109.5. Samples: 1855890380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:44,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 02:15:45,709][54818] Updated weights for policy 0, policy_version 546318 (0.0021) [2024-04-28 02:15:46,312][54798] Signal inference workers to stop experience collection... (30200 times) [2024-04-28 02:15:46,339][54818] InferenceWorker_p0-w0: stopping experience collection (30200 times) [2024-04-28 02:15:46,369][54798] Signal inference workers to resume experience collection... (30200 times) [2024-04-28 02:15:46,370][54818] InferenceWorker_p0-w0: resuming experience collection (30200 times) [2024-04-28 02:15:48,617][54818] Updated weights for policy 0, policy_version 546328 (0.0016) [2024-04-28 02:15:49,253][54587] Fps is (10 sec: 62258.5, 60 sec: 60621.9, 300 sec: 61203.9). Total num frames: 8951054336. Throughput: 0: 61447.8. Samples: 1856267600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:49,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 02:15:49,332][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000546330_8951070720.pth... [2024-04-28 02:15:49,387][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000545434_8936390656.pth [2024-04-28 02:15:51,159][54818] Updated weights for policy 0, policy_version 546338 (0.0016) [2024-04-28 02:15:54,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60347.8, 300 sec: 61148.4). Total num frames: 8951349248. Throughput: 0: 61154.3. Samples: 1856628500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:54,255][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 02:15:54,281][54818] Updated weights for policy 0, policy_version 546348 (0.0018) [2024-04-28 02:15:56,499][54818] Updated weights for policy 0, policy_version 546358 (0.0015) [2024-04-28 02:15:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.6, 300 sec: 61148.4). Total num frames: 8951660544. Throughput: 0: 61017.2. Samples: 1856804980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:15:59,255][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 02:15:59,540][54818] Updated weights for policy 0, policy_version 546368 (0.0017) [2024-04-28 02:16:01,707][54818] Updated weights for policy 0, policy_version 546378 (0.0018) [2024-04-28 02:16:04,257][54587] Fps is (10 sec: 60600.0, 60 sec: 60617.3, 300 sec: 61147.7). Total num frames: 8951955456. Throughput: 0: 61207.9. Samples: 1857186720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:16:04,257][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 02:16:04,805][54818] Updated weights for policy 0, policy_version 546388 (0.0019) [2024-04-28 02:16:07,152][54818] Updated weights for policy 0, policy_version 546398 (0.0026) [2024-04-28 02:16:09,025][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (1700 times) [2024-04-28 02:16:09,254][54587] Fps is (10 sec: 60620.4, 60 sec: 60620.7, 300 sec: 61148.4). Total num frames: 8952266752. Throughput: 0: 61012.5. Samples: 1857543600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:16:09,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:16:10,121][54818] Updated weights for policy 0, policy_version 546408 (0.0017) [2024-04-28 02:16:12,811][54818] Updated weights for policy 0, policy_version 546418 (0.0022) [2024-04-28 02:16:14,253][54587] Fps is (10 sec: 62280.0, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 8952578048. Throughput: 0: 61059.3. Samples: 1857724760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:16:14,255][54587] Avg episode reward: [(0, '0.527')] [2024-04-28 02:16:15,401][54818] Updated weights for policy 0, policy_version 546428 (0.0019) [2024-04-28 02:16:18,022][54818] Updated weights for policy 0, policy_version 546438 (0.0017) [2024-04-28 02:16:19,253][54587] Fps is (10 sec: 63899.9, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 8952905728. Throughput: 0: 60885.7. Samples: 1858092100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:16:19,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 02:16:20,774][54818] Updated weights for policy 0, policy_version 546448 (0.0016) [2024-04-28 02:16:23,320][54818] Updated weights for policy 0, policy_version 546458 (0.0016) [2024-04-28 02:16:24,253][54587] Fps is (10 sec: 62259.6, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 8953200640. Throughput: 0: 61137.3. Samples: 1858468060. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:16:24,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-28 02:16:25,958][54818] Updated weights for policy 0, policy_version 546468 (0.0018) [2024-04-28 02:16:28,911][54818] Updated weights for policy 0, policy_version 546478 (0.0018) [2024-04-28 02:16:29,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8953511936. Throughput: 0: 61332.8. Samples: 1858650360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:16:29,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 02:16:29,540][54798] Signal inference workers to stop experience collection... (30250 times) [2024-04-28 02:16:29,541][54798] Signal inference workers to resume experience collection... (30250 times) [2024-04-28 02:16:29,552][54818] InferenceWorker_p0-w0: stopping experience collection (30250 times) [2024-04-28 02:16:29,553][54818] InferenceWorker_p0-w0: resuming experience collection (30250 times) [2024-04-28 02:16:31,239][54818] Updated weights for policy 0, policy_version 546488 (0.0016) [2024-04-28 02:16:34,149][54818] Updated weights for policy 0, policy_version 546498 (0.0018) [2024-04-28 02:16:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 8953823232. Throughput: 0: 61218.7. Samples: 1859022440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:16:34,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 02:16:35,073][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (1800 times) [2024-04-28 02:16:36,941][54818] Updated weights for policy 0, policy_version 546508 (0.0016) [2024-04-28 02:16:39,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61713.0, 300 sec: 61204.0). Total num frames: 8954134528. Throughput: 0: 61203.5. Samples: 1859382660. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:16:39,255][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 02:16:39,422][54818] Updated weights for policy 0, policy_version 546518 (0.0016) [2024-04-28 02:16:42,110][54818] Updated weights for policy 0, policy_version 546528 (0.0016) [2024-04-28 02:16:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61439.8, 300 sec: 61148.4). Total num frames: 8954429440. Throughput: 0: 61454.8. Samples: 1859570440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:16:44,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:16:44,664][54818] Updated weights for policy 0, policy_version 546538 (0.0015) [2024-04-28 02:16:47,496][54818] Updated weights for policy 0, policy_version 546548 (0.0016) [2024-04-28 02:16:49,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61440.2, 300 sec: 61204.0). Total num frames: 8954740736. Throughput: 0: 61157.2. Samples: 1859938580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:16:49,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 02:16:49,976][54818] Updated weights for policy 0, policy_version 546558 (0.0015) [2024-04-28 02:16:52,747][54818] Updated weights for policy 0, policy_version 546568 (0.0017) [2024-04-28 02:16:54,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 8955052032. Throughput: 0: 61351.9. Samples: 1860304420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:16:54,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 02:16:54,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (100 times) [2024-04-28 02:16:55,410][54818] Updated weights for policy 0, policy_version 546578 (0.0016) [2024-04-28 02:16:58,152][54818] Updated weights for policy 0, policy_version 546588 (0.0017) [2024-04-28 02:16:59,253][54587] Fps is (10 sec: 62257.9, 60 sec: 61713.1, 300 sec: 61204.0). Total num frames: 8955363328. Throughput: 0: 61370.2. 
Samples: 1860486420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:16:59,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 02:17:00,856][54818] Updated weights for policy 0, policy_version 546598 (0.0016) [2024-04-28 02:17:02,243][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (1900 times) [2024-04-28 02:17:03,590][54818] Updated weights for policy 0, policy_version 546608 (0.0017) [2024-04-28 02:17:04,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61716.5, 300 sec: 61203.9). Total num frames: 8955658240. Throughput: 0: 61437.5. Samples: 1860856800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:04,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 02:17:06,052][54818] Updated weights for policy 0, policy_version 546618 (0.0015) [2024-04-28 02:17:08,870][54818] Updated weights for policy 0, policy_version 546628 (0.0015) [2024-04-28 02:17:09,253][54587] Fps is (10 sec: 58983.5, 60 sec: 61440.3, 300 sec: 61148.5). Total num frames: 8955953152. Throughput: 0: 61081.9. Samples: 1861216740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:09,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:17:11,552][54818] Updated weights for policy 0, policy_version 546638 (0.0017) [2024-04-28 02:17:14,215][54818] Updated weights for policy 0, policy_version 546648 (0.0017) [2024-04-28 02:17:14,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 8956280832. Throughput: 0: 61053.2. Samples: 1861397760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:14,255][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 02:17:16,750][54818] Updated weights for policy 0, policy_version 546658 (0.0018) [2024-04-28 02:17:19,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8956575744. Throughput: 0: 60997.1. Samples: 1861767300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:19,253][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 02:17:19,926][54818] Updated weights for policy 0, policy_version 546668 (0.0015) [2024-04-28 02:17:21,353][54798] Signal inference workers to stop experience collection... (30300 times) [2024-04-28 02:17:21,353][54798] Signal inference workers to resume experience collection... (30300 times) [2024-04-28 02:17:21,368][54818] InferenceWorker_p0-w0: stopping experience collection (30300 times) [2024-04-28 02:17:21,368][54818] InferenceWorker_p0-w0: resuming experience collection (30300 times) [2024-04-28 02:17:22,349][54818] Updated weights for policy 0, policy_version 546678 (0.0017) [2024-04-28 02:17:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 8956887040. Throughput: 0: 61077.3. Samples: 1862131140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:24,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 02:17:25,262][54818] Updated weights for policy 0, policy_version 546688 (0.0015) [2024-04-28 02:17:27,722][54818] Updated weights for policy 0, policy_version 546698 (0.0018) [2024-04-28 02:17:28,763][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (2000 times) [2024-04-28 02:17:29,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8957181952. Throughput: 0: 61018.9. Samples: 1862316280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:29,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 02:17:30,579][54818] Updated weights for policy 0, policy_version 546708 (0.0019) [2024-04-28 02:17:33,027][54818] Updated weights for policy 0, policy_version 546718 (0.0016) [2024-04-28 02:17:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8957509632. Throughput: 0: 60987.3. Samples: 1862683020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:34,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 02:17:35,787][54818] Updated weights for policy 0, policy_version 546728 (0.0016) [2024-04-28 02:17:38,248][54818] Updated weights for policy 0, policy_version 546738 (0.0016) [2024-04-28 02:17:39,253][54587] Fps is (10 sec: 63896.8, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8957820928. Throughput: 0: 61064.7. Samples: 1863052340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:39,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 02:17:41,081][54818] Updated weights for policy 0, policy_version 546748 (0.0018) [2024-04-28 02:17:43,687][54818] Updated weights for policy 0, policy_version 546758 (0.0018) [2024-04-28 02:17:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 8958115840. Throughput: 0: 61040.2. Samples: 1863233220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:17:46,492][54818] Updated weights for policy 0, policy_version 546768 (0.0015) [2024-04-28 02:17:48,914][54818] Updated weights for policy 0, policy_version 546778 (0.0018) [2024-04-28 02:17:49,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 8958427136. Throughput: 0: 60964.5. Samples: 1863600200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:49,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 02:17:49,264][54587] No heartbeat for components: RolloutWorker_w4 (17617 seconds), RolloutWorker_w5 (3717 seconds) [2024-04-28 02:17:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000546779_8958427136.pth... 
[2024-04-28 02:17:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000545883_8943747072.pth [2024-04-28 02:17:51,689][54818] Updated weights for policy 0, policy_version 546788 (0.0016) [2024-04-28 02:17:54,042][54818] Updated weights for policy 0, policy_version 546798 (0.0017) [2024-04-28 02:17:54,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8958738432. Throughput: 0: 61138.6. Samples: 1863967980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:54,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:17:55,719][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (2100 times) [2024-04-28 02:17:57,103][54818] Updated weights for policy 0, policy_version 546808 (0.0016) [2024-04-28 02:17:59,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 8959049728. Throughput: 0: 61236.1. Samples: 1864153380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:17:59,255][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 02:18:00,039][54818] Updated weights for policy 0, policy_version 546818 (0.0017) [2024-04-28 02:18:02,558][54818] Updated weights for policy 0, policy_version 546828 (0.0016) [2024-04-28 02:18:04,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 8959344640. Throughput: 0: 61270.5. Samples: 1864524480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:04,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 02:18:05,191][54818] Updated weights for policy 0, policy_version 546838 (0.0017) [2024-04-28 02:18:07,739][54818] Updated weights for policy 0, policy_version 546848 (0.0016) [2024-04-28 02:18:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61712.9, 300 sec: 61204.0). Total num frames: 8959655936. Throughput: 0: 61274.2. Samples: 1864888480. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:09,256][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 02:18:10,510][54818] Updated weights for policy 0, policy_version 546858 (0.0018) [2024-04-28 02:18:13,253][54818] Updated weights for policy 0, policy_version 546868 (0.0017) [2024-04-28 02:18:14,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 8959967232. Throughput: 0: 61257.5. Samples: 1865072880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:14,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:18:15,848][54818] Updated weights for policy 0, policy_version 546878 (0.0015) [2024-04-28 02:18:15,969][54798] Signal inference workers to stop experience collection... (30350 times) [2024-04-28 02:18:16,006][54818] InferenceWorker_p0-w0: stopping experience collection (30350 times) [2024-04-28 02:18:16,059][54798] Signal inference workers to resume experience collection... (30350 times) [2024-04-28 02:18:16,059][54818] InferenceWorker_p0-w0: resuming experience collection (30350 times) [2024-04-28 02:18:18,393][54818] Updated weights for policy 0, policy_version 546888 (0.0016) [2024-04-28 02:18:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8960262144. Throughput: 0: 61353.9. Samples: 1865443940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:19,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:18:21,070][54818] Updated weights for policy 0, policy_version 546898 (0.0016) [2024-04-28 02:18:22,019][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (2200 times) [2024-04-28 02:18:23,562][54818] Updated weights for policy 0, policy_version 546908 (0.0018) [2024-04-28 02:18:24,253][54587] Fps is (10 sec: 58983.2, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 8960557056. Throughput: 0: 61207.2. Samples: 1865806660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:24,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 02:18:26,392][54818] Updated weights for policy 0, policy_version 546918 (0.0015) [2024-04-28 02:18:28,976][54818] Updated weights for policy 0, policy_version 546928 (0.0024) [2024-04-28 02:18:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 8960868352. Throughput: 0: 61443.9. Samples: 1865998200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:29,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 02:18:31,862][54818] Updated weights for policy 0, policy_version 546938 (0.0016) [2024-04-28 02:18:34,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61167.0, 300 sec: 61203.9). Total num frames: 8961179648. Throughput: 0: 61284.5. Samples: 1866358000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:34,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 02:18:34,710][54818] Updated weights for policy 0, policy_version 546948 (0.0017) [2024-04-28 02:18:37,196][54818] Updated weights for policy 0, policy_version 546958 (0.0016) [2024-04-28 02:18:39,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 8961474560. Throughput: 0: 61180.0. Samples: 1866721080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:39,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 02:18:39,956][54818] Updated weights for policy 0, policy_version 546968 (0.0015) [2024-04-28 02:18:42,503][54818] Updated weights for policy 0, policy_version 546978 (0.0015) [2024-04-28 02:18:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 8961785856. Throughput: 0: 61347.9. Samples: 1866914040. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 02:18:45,296][54818] Updated weights for policy 0, policy_version 546988 (0.0015) [2024-04-28 02:18:48,002][54818] Updated weights for policy 0, policy_version 546998 (0.0019) [2024-04-28 02:18:48,754][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (2300 times) [2024-04-28 02:18:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.0, 300 sec: 61203.9). Total num frames: 8962097152. Throughput: 0: 61191.6. Samples: 1867278100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:49,255][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 02:18:50,556][54818] Updated weights for policy 0, policy_version 547008 (0.0021) [2024-04-28 02:18:53,039][54818] Updated weights for policy 0, policy_version 547018 (0.0016) [2024-04-28 02:18:54,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 8962408448. Throughput: 0: 61315.6. Samples: 1867647680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:54,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 02:18:56,000][54818] Updated weights for policy 0, policy_version 547028 (0.0017) [2024-04-28 02:18:57,952][54798] Signal inference workers to stop experience collection... (30400 times) [2024-04-28 02:18:57,959][54798] Signal inference workers to resume experience collection... (30400 times) [2024-04-28 02:18:57,968][54818] InferenceWorker_p0-w0: stopping experience collection (30400 times) [2024-04-28 02:18:57,984][54818] InferenceWorker_p0-w0: resuming experience collection (30400 times) [2024-04-28 02:18:58,392][54818] Updated weights for policy 0, policy_version 547038 (0.0016) [2024-04-28 02:18:59,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 8962703360. Throughput: 0: 61304.4. Samples: 1867831580. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:18:59,256][54587] Avg episode reward: [(0, '0.492')] [2024-04-28 02:19:01,473][54818] Updated weights for policy 0, policy_version 547048 (0.0015) [2024-04-28 02:19:03,717][54818] Updated weights for policy 0, policy_version 547058 (0.0019) [2024-04-28 02:19:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 8963014656. Throughput: 0: 61156.8. Samples: 1868196000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:19:04,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 02:19:06,839][54818] Updated weights for policy 0, policy_version 547068 (0.0016) [2024-04-28 02:19:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8963309568. Throughput: 0: 61215.5. Samples: 1868561360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:19:09,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 02:19:09,429][54818] Updated weights for policy 0, policy_version 547078 (0.0017) [2024-04-28 02:19:12,056][54818] Updated weights for policy 0, policy_version 547088 (0.0016) [2024-04-28 02:19:14,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 8963620864. Throughput: 0: 61130.3. Samples: 1868749060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:19:14,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 02:19:14,903][54818] Updated weights for policy 0, policy_version 547098 (0.0018) [2024-04-28 02:19:16,034][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (2400 times) [2024-04-28 02:19:17,417][54818] Updated weights for policy 0, policy_version 547108 (0.0017) [2024-04-28 02:19:19,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8963932160. Throughput: 0: 61259.6. Samples: 1869114680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:19:19,255][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 02:19:20,259][54818] Updated weights for policy 0, policy_version 547118 (0.0018) [2024-04-28 02:19:22,715][54818] Updated weights for policy 0, policy_version 547128 (0.0022) [2024-04-28 02:19:24,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 8964227072. Throughput: 0: 61154.1. Samples: 1869473020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:19:24,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-28 02:19:25,398][54818] Updated weights for policy 0, policy_version 547138 (0.0016) [2024-04-28 02:19:28,048][54818] Updated weights for policy 0, policy_version 547148 (0.0016) [2024-04-28 02:19:29,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8964538368. Throughput: 0: 61102.6. Samples: 1869663660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 02:19:29,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 02:19:31,087][54818] Updated weights for policy 0, policy_version 547158 (0.0016) [2024-04-28 02:19:33,314][54818] Updated weights for policy 0, policy_version 547168 (0.0016) [2024-04-28 02:19:34,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8964833280. Throughput: 0: 61078.7. Samples: 1870026640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:19:34,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 02:19:36,228][54818] Updated weights for policy 0, policy_version 547178 (0.0017) [2024-04-28 02:19:38,465][54818] Updated weights for policy 0, policy_version 547188 (0.0017) [2024-04-28 02:19:39,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8965144576. Throughput: 0: 60865.4. Samples: 1870386620. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:19:39,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 02:19:41,606][54818] Updated weights for policy 0, policy_version 547198 (0.0016) [2024-04-28 02:19:42,013][54798] Signal inference workers to stop experience collection... (30450 times) [2024-04-28 02:19:42,030][54818] InferenceWorker_p0-w0: stopping experience collection (30450 times) [2024-04-28 02:19:42,105][54798] Signal inference workers to resume experience collection... (30450 times) [2024-04-28 02:19:42,105][54818] InferenceWorker_p0-w0: resuming experience collection (30450 times) [2024-04-28 02:19:42,859][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (2500 times) [2024-04-28 02:19:43,791][54818] Updated weights for policy 0, policy_version 547208 (0.0025) [2024-04-28 02:19:44,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.9, 300 sec: 61148.6). Total num frames: 8965455872. Throughput: 0: 61175.6. Samples: 1870584480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:19:44,255][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 02:19:46,845][54818] Updated weights for policy 0, policy_version 547218 (0.0016) [2024-04-28 02:19:49,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 8965767168. Throughput: 0: 61242.2. Samples: 1870951900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:19:49,255][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 02:19:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000547227_8965767168.pth... 
[2024-04-28 02:19:49,317][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000546330_8951070720.pth [2024-04-28 02:19:49,536][54818] Updated weights for policy 0, policy_version 547228 (0.0016) [2024-04-28 02:19:52,127][54818] Updated weights for policy 0, policy_version 547238 (0.0018) [2024-04-28 02:19:54,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 8966078464. Throughput: 0: 61144.7. Samples: 1871312880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:19:54,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 02:19:54,891][54818] Updated weights for policy 0, policy_version 547248 (0.0015) [2024-04-28 02:19:57,538][54818] Updated weights for policy 0, policy_version 547258 (0.0019) [2024-04-28 02:19:59,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61203.9). Total num frames: 8966373376. Throughput: 0: 61214.6. Samples: 1871503720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:19:59,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 02:20:00,421][54818] Updated weights for policy 0, policy_version 547268 (0.0015) [2024-04-28 02:20:02,844][54818] Updated weights for policy 0, policy_version 547278 (0.0016) [2024-04-28 02:20:04,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8966684672. Throughput: 0: 61084.9. Samples: 1871863500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:04,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 02:20:05,691][54818] Updated weights for policy 0, policy_version 547288 (0.0016) [2024-04-28 02:20:08,001][54818] Updated weights for policy 0, policy_version 547298 (0.0016) [2024-04-28 02:20:09,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 8966995968. Throughput: 0: 61258.0. Samples: 1872229620. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:09,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 02:20:09,612][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (2600 times) [2024-04-28 02:20:11,226][54818] Updated weights for policy 0, policy_version 547308 (0.0017) [2024-04-28 02:20:13,438][54818] Updated weights for policy 0, policy_version 547318 (0.0018) [2024-04-28 02:20:14,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 8967290880. Throughput: 0: 61154.6. Samples: 1872415620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:14,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:20:16,522][54818] Updated weights for policy 0, policy_version 547328 (0.0016) [2024-04-28 02:20:18,765][54818] Updated weights for policy 0, policy_version 547338 (0.0015) [2024-04-28 02:20:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8967602176. Throughput: 0: 61199.6. Samples: 1872780620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:19,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 02:20:21,709][54818] Updated weights for policy 0, policy_version 547348 (0.0017) [2024-04-28 02:20:24,116][54818] Updated weights for policy 0, policy_version 547358 (0.0016) [2024-04-28 02:20:24,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 8967913472. Throughput: 0: 61477.3. Samples: 1873153100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:24,255][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 02:20:26,817][54818] Updated weights for policy 0, policy_version 547368 (0.0018) [2024-04-28 02:20:27,314][54798] Signal inference workers to stop experience collection... 
(30500 times) [2024-04-28 02:20:27,331][54818] InferenceWorker_p0-w0: stopping experience collection (30500 times) [2024-04-28 02:20:27,372][54798] Signal inference workers to resume experience collection... (30500 times) [2024-04-28 02:20:27,372][54818] InferenceWorker_p0-w0: resuming experience collection (30500 times) [2024-04-28 02:20:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.1, 300 sec: 61259.5). Total num frames: 8968208384. Throughput: 0: 61094.4. Samples: 1873333720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:29,253][54587] Avg episode reward: [(0, '0.522')] [2024-04-28 02:20:29,785][54818] Updated weights for policy 0, policy_version 547378 (0.0018) [2024-04-28 02:20:32,337][54818] Updated weights for policy 0, policy_version 547388 (0.0016) [2024-04-28 02:20:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 8968519680. Throughput: 0: 61034.2. Samples: 1873698440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:34,255][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 02:20:35,040][54818] Updated weights for policy 0, policy_version 547398 (0.0017) [2024-04-28 02:20:36,328][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (2700 times) [2024-04-28 02:20:37,733][54818] Updated weights for policy 0, policy_version 547408 (0.0019) [2024-04-28 02:20:39,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 8968830976. Throughput: 0: 61261.6. Samples: 1874069640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:39,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 02:20:40,658][54818] Updated weights for policy 0, policy_version 547418 (0.0018) [2024-04-28 02:20:42,954][54818] Updated weights for policy 0, policy_version 547428 (0.0015) [2024-04-28 02:20:44,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61166.9, 300 sec: 61259.5). 
Total num frames: 8969125888. Throughput: 0: 61110.7. Samples: 1874253700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:44,255][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 02:20:45,900][54818] Updated weights for policy 0, policy_version 547438 (0.0015) [2024-04-28 02:20:48,119][54818] Updated weights for policy 0, policy_version 547448 (0.0016) [2024-04-28 02:20:49,253][54587] Fps is (10 sec: 58981.7, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 8969420800. Throughput: 0: 61314.6. Samples: 1874622660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:49,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 02:20:49,264][54587] No heartbeat for components: RolloutWorker_w4 (17797 seconds), RolloutWorker_w5 (3897 seconds) [2024-04-28 02:20:50,994][54818] Updated weights for policy 0, policy_version 547458 (0.0017) [2024-04-28 02:20:53,679][54818] Updated weights for policy 0, policy_version 547468 (0.0015) [2024-04-28 02:20:54,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 8969748480. Throughput: 0: 61421.6. Samples: 1874993600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:54,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:20:56,208][54818] Updated weights for policy 0, policy_version 547478 (0.0019) [2024-04-28 02:20:59,137][54818] Updated weights for policy 0, policy_version 547488 (0.0018) [2024-04-28 02:20:59,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61167.1, 300 sec: 61315.8). Total num frames: 8970043392. Throughput: 0: 61348.7. Samples: 1875176300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 02:20:59,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 02:21:01,586][54818] Updated weights for policy 0, policy_version 547498 (0.0016) [2024-04-28 02:21:02,905][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (2800 times) [2024-04-28 02:21:04,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61166.9, 300 sec: 61315.1). Total num frames: 8970354688. Throughput: 0: 61474.6. Samples: 1875546980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:04,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:21:04,445][54818] Updated weights for policy 0, policy_version 547508 (0.0016) [2024-04-28 02:21:07,019][54818] Updated weights for policy 0, policy_version 547518 (0.0018) [2024-04-28 02:21:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 8970649600. Throughput: 0: 61270.2. Samples: 1875910260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:09,255][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 02:21:09,743][54818] Updated weights for policy 0, policy_version 547528 (0.0017) [2024-04-28 02:21:12,532][54818] Updated weights for policy 0, policy_version 547538 (0.0016) [2024-04-28 02:21:14,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60894.1, 300 sec: 61148.4). Total num frames: 8970944512. Throughput: 0: 61346.2. Samples: 1876094300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:14,253][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 02:21:15,002][54818] Updated weights for policy 0, policy_version 547548 (0.0017) [2024-04-28 02:21:18,245][54818] Updated weights for policy 0, policy_version 547558 (0.0017) [2024-04-28 02:21:19,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 8971255808. Throughput: 0: 61406.1. Samples: 1876461700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:19,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 02:21:20,569][54818] Updated weights for policy 0, policy_version 547568 (0.0018) [2024-04-28 02:21:23,226][54798] Signal inference workers to stop experience collection... 
(30550 times) [2024-04-28 02:21:23,227][54798] Signal inference workers to resume experience collection... (30550 times) [2024-04-28 02:21:23,239][54818] InferenceWorker_p0-w0: stopping experience collection (30550 times) [2024-04-28 02:21:23,240][54818] InferenceWorker_p0-w0: resuming experience collection (30550 times) [2024-04-28 02:21:23,349][54818] Updated weights for policy 0, policy_version 547578 (0.0016) [2024-04-28 02:21:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60620.9, 300 sec: 61148.4). Total num frames: 8971550720. Throughput: 0: 61185.4. Samples: 1876822980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:24,253][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 02:21:25,727][54818] Updated weights for policy 0, policy_version 547588 (0.0018) [2024-04-28 02:21:28,492][54818] Updated weights for policy 0, policy_version 547598 (0.0019) [2024-04-28 02:21:29,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60893.7, 300 sec: 61148.4). Total num frames: 8971862016. Throughput: 0: 61184.0. Samples: 1877006980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:29,255][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 02:21:29,676][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (2900 times) [2024-04-28 02:21:31,212][54818] Updated weights for policy 0, policy_version 547608 (0.0017) [2024-04-28 02:21:34,247][54818] Updated weights for policy 0, policy_version 547618 (0.0016) [2024-04-28 02:21:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 8972173312. Throughput: 0: 61245.5. Samples: 1877378700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:34,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 02:21:36,477][54818] Updated weights for policy 0, policy_version 547628 (0.0016) [2024-04-28 02:21:39,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60620.9, 300 sec: 61148.5). 
Total num frames: 8972468224. Throughput: 0: 61102.5. Samples: 1877743200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:39,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 02:21:39,422][54818] Updated weights for policy 0, policy_version 547638 (0.0015) [2024-04-28 02:21:41,853][54818] Updated weights for policy 0, policy_version 547648 (0.0015) [2024-04-28 02:21:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 8972779520. Throughput: 0: 61039.6. Samples: 1877923080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:44,262][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 02:21:44,737][54818] Updated weights for policy 0, policy_version 547658 (0.0017) [2024-04-28 02:21:47,214][54818] Updated weights for policy 0, policy_version 547668 (0.0024) [2024-04-28 02:21:49,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 8973074432. Throughput: 0: 61051.7. Samples: 1878294300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:49,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 02:21:49,331][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000547674_8973090816.pth... [2024-04-28 02:21:49,401][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000546779_8958427136.pth [2024-04-28 02:21:50,161][54818] Updated weights for policy 0, policy_version 547678 (0.0016) [2024-04-28 02:21:53,020][54818] Updated weights for policy 0, policy_version 547688 (0.0017) [2024-04-28 02:21:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60621.0, 300 sec: 61092.9). Total num frames: 8973385728. Throughput: 0: 60999.7. Samples: 1878655240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:54,253][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 02:21:55,351][54818] Updated weights for policy 0, policy_version 547698 (0.0017) [2024-04-28 02:21:56,510][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3000 times) [2024-04-28 02:21:58,164][54818] Updated weights for policy 0, policy_version 547708 (0.0018) [2024-04-28 02:21:59,253][54587] Fps is (10 sec: 63896.6, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 8973713408. Throughput: 0: 60946.9. Samples: 1878836920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:21:59,255][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 02:22:00,560][54818] Updated weights for policy 0, policy_version 547718 (0.0017) [2024-04-28 02:22:03,423][54818] Updated weights for policy 0, policy_version 547728 (0.0015) [2024-04-28 02:22:04,253][54587] Fps is (10 sec: 62258.6, 60 sec: 60893.9, 300 sec: 61203.9). Total num frames: 8974008320. Throughput: 0: 61053.2. Samples: 1879209100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:04,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 02:22:05,835][54818] Updated weights for policy 0, policy_version 547738 (0.0018) [2024-04-28 02:22:08,768][54818] Updated weights for policy 0, policy_version 547748 (0.0017) [2024-04-28 02:22:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 8974319616. Throughput: 0: 61114.4. Samples: 1879573140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:09,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:22:11,142][54818] Updated weights for policy 0, policy_version 547758 (0.0016) [2024-04-28 02:22:12,429][54798] Signal inference workers to stop experience collection... 
(30600 times) [2024-04-28 02:22:12,457][54818] InferenceWorker_p0-w0: stopping experience collection (30600 times) [2024-04-28 02:22:12,517][54798] Signal inference workers to resume experience collection... (30600 times) [2024-04-28 02:22:12,517][54818] InferenceWorker_p0-w0: resuming experience collection (30600 times) [2024-04-28 02:22:14,152][54818] Updated weights for policy 0, policy_version 547768 (0.0015) [2024-04-28 02:22:14,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 8974630912. Throughput: 0: 61075.1. Samples: 1879755360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:14,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 02:22:16,877][54818] Updated weights for policy 0, policy_version 547778 (0.0016) [2024-04-28 02:22:19,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 8974942208. Throughput: 0: 60972.4. Samples: 1880122460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:19,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 02:22:19,643][54818] Updated weights for policy 0, policy_version 547788 (0.0020) [2024-04-28 02:22:22,336][54818] Updated weights for policy 0, policy_version 547798 (0.0016) [2024-04-28 02:22:23,500][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3100 times) [2024-04-28 02:22:24,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 8975253504. Throughput: 0: 61159.0. Samples: 1880495360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:24,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 02:22:24,800][54818] Updated weights for policy 0, policy_version 547808 (0.0016) [2024-04-28 02:22:27,740][54818] Updated weights for policy 0, policy_version 547818 (0.0019) [2024-04-28 02:22:29,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.2, 300 sec: 61204.0). 
Total num frames: 8975564800. Throughput: 0: 61231.1. Samples: 1880678480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:29,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 02:22:30,107][54818] Updated weights for policy 0, policy_version 547828 (0.0016) [2024-04-28 02:22:33,100][54818] Updated weights for policy 0, policy_version 547838 (0.0020) [2024-04-28 02:22:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8975859712. Throughput: 0: 61255.5. Samples: 1881050800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:34,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:22:35,357][54818] Updated weights for policy 0, policy_version 547848 (0.0020) [2024-04-28 02:22:38,390][54818] Updated weights for policy 0, policy_version 547858 (0.0016) [2024-04-28 02:22:39,253][54587] Fps is (10 sec: 58982.6, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8976154624. Throughput: 0: 61360.4. Samples: 1881416460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:39,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 02:22:40,563][54818] Updated weights for policy 0, policy_version 547868 (0.0015) [2024-04-28 02:22:43,629][54818] Updated weights for policy 0, policy_version 547878 (0.0016) [2024-04-28 02:22:44,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61712.9, 300 sec: 61203.9). Total num frames: 8976482304. Throughput: 0: 61446.2. Samples: 1881602000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:44,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 02:22:46,314][54818] Updated weights for policy 0, policy_version 547888 (0.0015) [2024-04-28 02:22:49,012][54818] Updated weights for policy 0, policy_version 547898 (0.0018) [2024-04-28 02:22:49,253][54587] Fps is (10 sec: 63897.0, 60 sec: 61986.0, 300 sec: 61203.9). Total num frames: 8976793600. Throughput: 0: 61453.8. Samples: 1881974520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:49,255][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 02:22:50,013][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3200 times) [2024-04-28 02:22:51,621][54818] Updated weights for policy 0, policy_version 547908 (0.0016) [2024-04-28 02:22:54,242][54818] Updated weights for policy 0, policy_version 547918 (0.0019) [2024-04-28 02:22:54,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61713.0, 300 sec: 61148.4). Total num frames: 8977088512. Throughput: 0: 61486.4. Samples: 1882340020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:54,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 02:22:56,482][54798] Signal inference workers to stop experience collection... (30650 times) [2024-04-28 02:22:56,487][54798] Signal inference workers to resume experience collection... (30650 times) [2024-04-28 02:22:56,498][54818] InferenceWorker_p0-w0: stopping experience collection (30650 times) [2024-04-28 02:22:56,498][54818] InferenceWorker_p0-w0: resuming experience collection (30650 times) [2024-04-28 02:22:56,930][54818] Updated weights for policy 0, policy_version 547928 (0.0016) [2024-04-28 02:22:59,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 8977399808. Throughput: 0: 61343.5. Samples: 1882515820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:22:59,255][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 02:22:59,397][54818] Updated weights for policy 0, policy_version 547938 (0.0017) [2024-04-28 02:23:02,054][54818] Updated weights for policy 0, policy_version 547948 (0.0015) [2024-04-28 02:23:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 8977694720. Throughput: 0: 61569.8. Samples: 1882893100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:04,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 02:23:04,785][54818] Updated weights for policy 0, policy_version 547958 (0.0016) [2024-04-28 02:23:07,768][54818] Updated weights for policy 0, policy_version 547968 (0.0016) [2024-04-28 02:23:09,253][54587] Fps is (10 sec: 58983.4, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 8977989632. Throughput: 0: 61285.0. Samples: 1883253180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:09,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 02:23:10,235][54818] Updated weights for policy 0, policy_version 547978 (0.0016) [2024-04-28 02:23:13,166][54818] Updated weights for policy 0, policy_version 547988 (0.0015) [2024-04-28 02:23:14,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 8978300928. Throughput: 0: 61318.7. Samples: 1883437820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:14,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 02:23:15,502][54818] Updated weights for policy 0, policy_version 547998 (0.0018) [2024-04-28 02:23:16,858][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3300 times) [2024-04-28 02:23:18,427][54818] Updated weights for policy 0, policy_version 548008 (0.0016) [2024-04-28 02:23:19,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 8978612224. Throughput: 0: 61169.1. Samples: 1883803420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:19,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 02:23:20,759][54818] Updated weights for policy 0, policy_version 548018 (0.0016) [2024-04-28 02:23:23,713][54818] Updated weights for policy 0, policy_version 548028 (0.0016) [2024-04-28 02:23:24,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8978923520. 
Throughput: 0: 61323.9. Samples: 1884176040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:24,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 02:23:26,517][54818] Updated weights for policy 0, policy_version 548038 (0.0017) [2024-04-28 02:23:29,070][54818] Updated weights for policy 0, policy_version 548048 (0.0017) [2024-04-28 02:23:29,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60893.7, 300 sec: 61148.4). Total num frames: 8979218432. Throughput: 0: 61361.3. Samples: 1884363260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:29,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 02:23:31,793][54818] Updated weights for policy 0, policy_version 548058 (0.0020) [2024-04-28 02:23:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8979529728. Throughput: 0: 61040.5. Samples: 1884721340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:34,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 02:23:34,419][54818] Updated weights for policy 0, policy_version 548068 (0.0016) [2024-04-28 02:23:37,107][54818] Updated weights for policy 0, policy_version 548078 (0.0016) [2024-04-28 02:23:39,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 8979841024. Throughput: 0: 61235.4. Samples: 1885095620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:39,255][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 02:23:39,599][54818] Updated weights for policy 0, policy_version 548088 (0.0015) [2024-04-28 02:23:42,343][54818] Updated weights for policy 0, policy_version 548098 (0.0016) [2024-04-28 02:23:43,494][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3400 times) [2024-04-28 02:23:44,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61167.0, 300 sec: 61203.9). Total num frames: 8980152320. Throughput: 0: 61521.0. Samples: 1885284260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:44,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 02:23:44,803][54818] Updated weights for policy 0, policy_version 548108 (0.0016) [2024-04-28 02:23:47,611][54818] Updated weights for policy 0, policy_version 548118 (0.0016) [2024-04-28 02:23:49,253][54587] Fps is (10 sec: 60620.0, 60 sec: 60893.7, 300 sec: 61148.4). Total num frames: 8980447232. Throughput: 0: 61115.7. Samples: 1885643320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:49,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 02:23:49,263][54587] No heartbeat for components: RolloutWorker_w4 (17977 seconds), RolloutWorker_w5 (4077 seconds) [2024-04-28 02:23:49,391][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000548124_8980463616.pth... [2024-04-28 02:23:49,436][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000547227_8965767168.pth [2024-04-28 02:23:50,125][54818] Updated weights for policy 0, policy_version 548128 (0.0015) [2024-04-28 02:23:52,322][54798] Signal inference workers to stop experience collection... (30700 times) [2024-04-28 02:23:52,352][54818] InferenceWorker_p0-w0: stopping experience collection (30700 times) [2024-04-28 02:23:52,412][54798] Signal inference workers to resume experience collection... (30700 times) [2024-04-28 02:23:52,412][54818] InferenceWorker_p0-w0: resuming experience collection (30700 times) [2024-04-28 02:23:53,095][54818] Updated weights for policy 0, policy_version 548138 (0.0015) [2024-04-28 02:23:54,257][54587] Fps is (10 sec: 60598.5, 60 sec: 61163.1, 300 sec: 61203.2). Total num frames: 8980758528. Throughput: 0: 61252.6. Samples: 1886009780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:54,258][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 02:23:55,501][54818] Updated weights for policy 0, policy_version 548148 (0.0016) [2024-04-28 02:23:58,346][54818] Updated weights for policy 0, policy_version 548158 (0.0017) [2024-04-28 02:23:59,253][54587] Fps is (10 sec: 62261.0, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 8981069824. Throughput: 0: 61496.9. Samples: 1886205180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:23:59,253][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 02:24:01,041][54818] Updated weights for policy 0, policy_version 548168 (0.0016) [2024-04-28 02:24:03,613][54818] Updated weights for policy 0, policy_version 548178 (0.0015) [2024-04-28 02:24:04,253][54587] Fps is (10 sec: 62282.1, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8981381120. Throughput: 0: 61476.5. Samples: 1886569860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:04,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 02:24:06,618][54818] Updated weights for policy 0, policy_version 548188 (0.0017) [2024-04-28 02:24:08,997][54818] Updated weights for policy 0, policy_version 548198 (0.0016) [2024-04-28 02:24:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 8981676032. Throughput: 0: 61200.9. Samples: 1886930080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:09,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 02:24:10,022][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3500 times) [2024-04-28 02:24:11,803][54818] Updated weights for policy 0, policy_version 548208 (0.0017) [2024-04-28 02:24:14,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 8981987328. Throughput: 0: 61292.2. Samples: 1887121400. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:14,255][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 02:24:14,391][54818] Updated weights for policy 0, policy_version 548218 (0.0017) [2024-04-28 02:24:17,242][54818] Updated weights for policy 0, policy_version 548228 (0.0015) [2024-04-28 02:24:19,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 8982298624. Throughput: 0: 61358.2. Samples: 1887482460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:19,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 02:24:19,547][54818] Updated weights for policy 0, policy_version 548238 (0.0016) [2024-04-28 02:24:22,492][54818] Updated weights for policy 0, policy_version 548248 (0.0018) [2024-04-28 02:24:24,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8982609920. Throughput: 0: 61253.5. Samples: 1887852020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:24,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 02:24:24,788][54818] Updated weights for policy 0, policy_version 548258 (0.0017) [2024-04-28 02:24:27,987][54818] Updated weights for policy 0, policy_version 548268 (0.0016) [2024-04-28 02:24:29,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 8982921216. Throughput: 0: 61089.3. Samples: 1888033280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:29,255][54587] Avg episode reward: [(0, '0.488')] [2024-04-28 02:24:30,141][54818] Updated weights for policy 0, policy_version 548278 (0.0018) [2024-04-28 02:24:33,230][54818] Updated weights for policy 0, policy_version 548288 (0.0018) [2024-04-28 02:24:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8983216128. Throughput: 0: 61285.2. Samples: 1888401140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:34,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-28 02:24:35,711][54818] Updated weights for policy 0, policy_version 548298 (0.0021) [2024-04-28 02:24:37,175][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3600 times) [2024-04-28 02:24:38,793][54818] Updated weights for policy 0, policy_version 548308 (0.0015) [2024-04-28 02:24:39,253][54587] Fps is (10 sec: 58983.0, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8983511040. Throughput: 0: 61406.4. Samples: 1888772840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:39,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 02:24:41,417][54818] Updated weights for policy 0, policy_version 548318 (0.0016) [2024-04-28 02:24:43,995][54818] Updated weights for policy 0, policy_version 548328 (0.0015) [2024-04-28 02:24:44,257][54587] Fps is (10 sec: 60599.3, 60 sec: 61163.4, 300 sec: 61203.2). Total num frames: 8983822336. Throughput: 0: 60959.1. Samples: 1888948560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:44,258][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 02:24:46,580][54818] Updated weights for policy 0, policy_version 548338 (0.0016) [2024-04-28 02:24:46,715][54798] Signal inference workers to stop experience collection... (30750 times) [2024-04-28 02:24:46,747][54818] InferenceWorker_p0-w0: stopping experience collection (30750 times) [2024-04-28 02:24:46,807][54798] Signal inference workers to resume experience collection... (30750 times) [2024-04-28 02:24:46,807][54818] InferenceWorker_p0-w0: resuming experience collection (30750 times) [2024-04-28 02:24:49,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 8984117248. Throughput: 0: 60977.7. Samples: 1889313860. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:49,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:24:49,386][54818] Updated weights for policy 0, policy_version 548348 (0.0015) [2024-04-28 02:24:51,785][54818] Updated weights for policy 0, policy_version 548358 (0.0016) [2024-04-28 02:24:54,253][54587] Fps is (10 sec: 59003.0, 60 sec: 60897.6, 300 sec: 61148.4). Total num frames: 8984412160. Throughput: 0: 61274.1. Samples: 1889687420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:54,256][54587] Avg episode reward: [(0, '0.679')] [2024-04-28 02:24:54,761][54818] Updated weights for policy 0, policy_version 548368 (0.0018) [2024-04-28 02:24:57,171][54818] Updated weights for policy 0, policy_version 548378 (0.0020) [2024-04-28 02:24:59,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.7, 300 sec: 61148.4). Total num frames: 8984723456. Throughput: 0: 60910.6. Samples: 1889862380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:24:59,255][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 02:24:59,848][54818] Updated weights for policy 0, policy_version 548388 (0.0016) [2024-04-28 02:25:02,642][54818] Updated weights for policy 0, policy_version 548398 (0.0016) [2024-04-28 02:25:03,995][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3700 times) [2024-04-28 02:25:04,253][54587] Fps is (10 sec: 62259.0, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 8985034752. Throughput: 0: 61179.9. Samples: 1890235560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:25:04,254][54587] Avg episode reward: [(0, '0.672')] [2024-04-28 02:25:05,159][54818] Updated weights for policy 0, policy_version 548408 (0.0017) [2024-04-28 02:25:07,974][54818] Updated weights for policy 0, policy_version 548418 (0.0017) [2024-04-28 02:25:09,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 8985346048. 
Throughput: 0: 61187.9. Samples: 1890605480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:25:09,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 02:25:10,596][54818] Updated weights for policy 0, policy_version 548428 (0.0019) [2024-04-28 02:25:13,508][54818] Updated weights for policy 0, policy_version 548438 (0.0017) [2024-04-28 02:25:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8985640960. Throughput: 0: 61076.5. Samples: 1890781720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:25:14,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 02:25:14,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (200 times) [2024-04-28 02:25:16,101][54818] Updated weights for policy 0, policy_version 548448 (0.0016) [2024-04-28 02:25:18,626][54818] Updated weights for policy 0, policy_version 548458 (0.0016) [2024-04-28 02:25:19,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 8985968640. Throughput: 0: 61235.3. Samples: 1891156740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:25:19,255][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 02:25:21,406][54818] Updated weights for policy 0, policy_version 548468 (0.0016) [2024-04-28 02:25:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 8986247168. Throughput: 0: 61002.2. Samples: 1891517940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:25:24,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 02:25:24,298][54818] Updated weights for policy 0, policy_version 548478 (0.0016) [2024-04-28 02:25:26,712][54818] Updated weights for policy 0, policy_version 548488 (0.0017) [2024-04-28 02:25:29,253][54587] Fps is (10 sec: 60621.8, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 8986574848. Throughput: 0: 61096.8. Samples: 1891697700. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 02:25:29,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 02:25:29,565][54818] Updated weights for policy 0, policy_version 548498 (0.0016) [2024-04-28 02:25:30,888][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3800 times) [2024-04-28 02:25:31,488][54798] Signal inference workers to stop experience collection... (30800 times) [2024-04-28 02:25:31,519][54818] InferenceWorker_p0-w0: stopping experience collection (30800 times) [2024-04-28 02:25:31,549][54798] Signal inference workers to resume experience collection... (30800 times) [2024-04-28 02:25:31,549][54818] InferenceWorker_p0-w0: resuming experience collection (30800 times) [2024-04-28 02:25:31,968][54818] Updated weights for policy 0, policy_version 548508 (0.0023) [2024-04-28 02:25:34,253][54587] Fps is (10 sec: 63896.8, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 8986886144. Throughput: 0: 61104.4. Samples: 1892063560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:25:34,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 02:25:34,951][54818] Updated weights for policy 0, policy_version 548518 (0.0016) [2024-04-28 02:25:37,249][54818] Updated weights for policy 0, policy_version 548528 (0.0016) [2024-04-28 02:25:39,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 8987181056. Throughput: 0: 61048.9. Samples: 1892434620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:25:39,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 02:25:40,158][54818] Updated weights for policy 0, policy_version 548538 (0.0016) [2024-04-28 02:25:42,637][54818] Updated weights for policy 0, policy_version 548548 (0.0018) [2024-04-28 02:25:44,253][54587] Fps is (10 sec: 58983.5, 60 sec: 60897.5, 300 sec: 61204.0). Total num frames: 8987475968. Throughput: 0: 61290.4. Samples: 1892620440. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:25:44,253][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 02:25:45,438][54818] Updated weights for policy 0, policy_version 548558 (0.0020) [2024-04-28 02:25:48,428][54818] Updated weights for policy 0, policy_version 548568 (0.0017) [2024-04-28 02:25:49,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 8987787264. Throughput: 0: 60841.8. Samples: 1892973440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:25:49,255][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 02:25:49,321][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000548572_8987803648.pth... [2024-04-28 02:25:49,373][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000547674_8973090816.pth [2024-04-28 02:25:50,912][54818] Updated weights for policy 0, policy_version 548578 (0.0017) [2024-04-28 02:25:53,694][54818] Updated weights for policy 0, policy_version 548588 (0.0016) [2024-04-28 02:25:54,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 8988098560. Throughput: 0: 61122.2. Samples: 1893355980. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:25:54,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 02:25:56,076][54818] Updated weights for policy 0, policy_version 548598 (0.0016) [2024-04-28 02:25:57,168][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (3900 times) [2024-04-28 02:25:59,053][54818] Updated weights for policy 0, policy_version 548608 (0.0017) [2024-04-28 02:25:59,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 8988393472. Throughput: 0: 61158.4. Samples: 1893533840. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:25:59,253][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 02:26:01,461][54818] Updated weights for policy 0, policy_version 548618 (0.0015) [2024-04-28 02:26:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 8988704768. Throughput: 0: 60971.6. Samples: 1893900460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:04,255][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 02:26:04,457][54818] Updated weights for policy 0, policy_version 548628 (0.0021) [2024-04-28 02:26:06,788][54818] Updated weights for policy 0, policy_version 548638 (0.0015) [2024-04-28 02:26:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 8988999680. Throughput: 0: 61187.7. Samples: 1894271380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:09,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 02:26:09,825][54818] Updated weights for policy 0, policy_version 548648 (0.0016) [2024-04-28 02:26:12,121][54818] Updated weights for policy 0, policy_version 548658 (0.0016) [2024-04-28 02:26:14,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8989310976. Throughput: 0: 61198.8. Samples: 1894451640. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:14,260][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 02:26:15,107][54818] Updated weights for policy 0, policy_version 548668 (0.0016) [2024-04-28 02:26:17,558][54818] Updated weights for policy 0, policy_version 548678 (0.0015) [2024-04-28 02:26:19,253][54587] Fps is (10 sec: 62258.8, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 8989622272. Throughput: 0: 61198.0. Samples: 1894817460. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:19,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 02:26:20,412][54818] Updated weights for policy 0, policy_version 548688 (0.0016) [2024-04-28 02:26:23,069][54818] Updated weights for policy 0, policy_version 548698 (0.0016) [2024-04-28 02:26:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 8989917184. Throughput: 0: 61165.5. Samples: 1895187060. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:24,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:26:24,495][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4000 times) [2024-04-28 02:26:25,805][54818] Updated weights for policy 0, policy_version 548708 (0.0015) [2024-04-28 02:26:28,508][54818] Updated weights for policy 0, policy_version 548718 (0.0015) [2024-04-28 02:26:29,253][54587] Fps is (10 sec: 60619.8, 60 sec: 60893.7, 300 sec: 61203.9). Total num frames: 8990228480. Throughput: 0: 60984.2. Samples: 1895364740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:29,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:26:31,196][54818] Updated weights for policy 0, policy_version 548728 (0.0017) [2024-04-28 02:26:31,447][54798] Signal inference workers to stop experience collection... (30850 times) [2024-04-28 02:26:31,481][54818] InferenceWorker_p0-w0: stopping experience collection (30850 times) [2024-04-28 02:26:31,499][54798] Signal inference workers to resume experience collection... (30850 times) [2024-04-28 02:26:31,499][54818] InferenceWorker_p0-w0: resuming experience collection (30850 times) [2024-04-28 02:26:33,819][54818] Updated weights for policy 0, policy_version 548738 (0.0022) [2024-04-28 02:26:34,253][54587] Fps is (10 sec: 62258.1, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 8990539776. Throughput: 0: 61303.9. Samples: 1895732120. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:34,255][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 02:26:36,313][54818] Updated weights for policy 0, policy_version 548748 (0.0018) [2024-04-28 02:26:39,194][54818] Updated weights for policy 0, policy_version 548758 (0.0017) [2024-04-28 02:26:39,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 8990851072. Throughput: 0: 61109.3. Samples: 1896105900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:39,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 02:26:41,559][54818] Updated weights for policy 0, policy_version 548768 (0.0016) [2024-04-28 02:26:44,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 8991162368. Throughput: 0: 61078.5. Samples: 1896282380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:44,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 02:26:44,503][54818] Updated weights for policy 0, policy_version 548778 (0.0016) [2024-04-28 02:26:46,920][54818] Updated weights for policy 0, policy_version 548788 (0.0022) [2024-04-28 02:26:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 8991473664. Throughput: 0: 61176.0. Samples: 1896653380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:49,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 02:26:49,261][54587] No heartbeat for components: RolloutWorker_w4 (18157 seconds), RolloutWorker_w5 (4257 seconds) [2024-04-28 02:26:49,977][54818] Updated weights for policy 0, policy_version 548798 (0.0016) [2024-04-28 02:26:51,019][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4100 times) [2024-04-28 02:26:52,499][54818] Updated weights for policy 0, policy_version 548808 (0.0017) [2024-04-28 02:26:54,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.0, 300 sec: 61259.5). 
Total num frames: 8991784960. Throughput: 0: 61224.7. Samples: 1897026500. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:54,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 02:26:55,471][54818] Updated weights for policy 0, policy_version 548818 (0.0017) [2024-04-28 02:26:57,846][54818] Updated weights for policy 0, policy_version 548828 (0.0017) [2024-04-28 02:26:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61712.9, 300 sec: 61315.0). Total num frames: 8992096256. Throughput: 0: 61114.0. Samples: 1897201780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-04-28 02:26:59,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 02:27:00,685][54818] Updated weights for policy 0, policy_version 548838 (0.0015) [2024-04-28 02:27:03,088][54818] Updated weights for policy 0, policy_version 548848 (0.0017) [2024-04-28 02:27:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 8992391168. Throughput: 0: 61349.6. Samples: 1897578200. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:04,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 02:27:05,991][54818] Updated weights for policy 0, policy_version 548858 (0.0018) [2024-04-28 02:27:08,637][54818] Updated weights for policy 0, policy_version 548868 (0.0016) [2024-04-28 02:27:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 8992702464. Throughput: 0: 61216.7. Samples: 1897941820. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:09,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 02:27:11,290][54818] Updated weights for policy 0, policy_version 548878 (0.0018) [2024-04-28 02:27:11,932][54798] Signal inference workers to stop experience collection... (30900 times) [2024-04-28 02:27:11,933][54798] Signal inference workers to resume experience collection... 
(30900 times) [2024-04-28 02:27:11,953][54818] InferenceWorker_p0-w0: stopping experience collection (30900 times) [2024-04-28 02:27:11,954][54818] InferenceWorker_p0-w0: resuming experience collection (30900 times) [2024-04-28 02:27:13,755][54818] Updated weights for policy 0, policy_version 548888 (0.0018) [2024-04-28 02:27:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 8993013760. Throughput: 0: 61243.7. Samples: 1898120700. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:14,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 02:27:16,452][54818] Updated weights for policy 0, policy_version 548898 (0.0016) [2024-04-28 02:27:17,769][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4200 times) [2024-04-28 02:27:19,137][54818] Updated weights for policy 0, policy_version 548908 (0.0019) [2024-04-28 02:27:19,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 8993308672. Throughput: 0: 61247.8. Samples: 1898488260. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:19,253][54587] Avg episode reward: [(0, '0.687')] [2024-04-28 02:27:21,724][54818] Updated weights for policy 0, policy_version 548918 (0.0016) [2024-04-28 02:27:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61713.0, 300 sec: 61203.9). Total num frames: 8993619968. Throughput: 0: 61129.8. Samples: 1898856740. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:24,254][54587] Avg episode reward: [(0, '0.672')] [2024-04-28 02:27:24,353][54818] Updated weights for policy 0, policy_version 548928 (0.0016) [2024-04-28 02:27:27,126][54818] Updated weights for policy 0, policy_version 548938 (0.0017) [2024-04-28 02:27:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.2, 300 sec: 61204.0). Total num frames: 8993914880. Throughput: 0: 61221.0. Samples: 1899037320. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:29,253][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 02:27:29,961][54818] Updated weights for policy 0, policy_version 548948 (0.0016) [2024-04-28 02:27:32,580][54818] Updated weights for policy 0, policy_version 548958 (0.0018) [2024-04-28 02:27:34,253][54587] Fps is (10 sec: 58983.1, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 8994209792. Throughput: 0: 61119.7. Samples: 1899403760. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:34,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 02:27:35,688][54818] Updated weights for policy 0, policy_version 548968 (0.0017) [2024-04-28 02:27:37,924][54818] Updated weights for policy 0, policy_version 548978 (0.0017) [2024-04-28 02:27:39,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 8994521088. Throughput: 0: 60963.6. Samples: 1899769860. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:39,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 02:27:40,999][54818] Updated weights for policy 0, policy_version 548988 (0.0015) [2024-04-28 02:27:43,218][54818] Updated weights for policy 0, policy_version 548998 (0.0017) [2024-04-28 02:27:44,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 8994816000. Throughput: 0: 61243.1. Samples: 1899957720. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:44,255][54587] Avg episode reward: [(0, '0.507')] [2024-04-28 02:27:44,603][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4300 times) [2024-04-28 02:27:46,410][54818] Updated weights for policy 0, policy_version 549008 (0.0017) [2024-04-28 02:27:48,367][54818] Updated weights for policy 0, policy_version 549018 (0.0016) [2024-04-28 02:27:49,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8995127296. 
Throughput: 0: 60944.5. Samples: 1900320700. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:49,255][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 02:27:49,332][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000549020_8995143680.pth... [2024-04-28 02:27:49,392][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000548124_8980463616.pth [2024-04-28 02:27:51,643][54818] Updated weights for policy 0, policy_version 549028 (0.0017) [2024-04-28 02:27:52,880][54798] Signal inference workers to stop experience collection... (30950 times) [2024-04-28 02:27:52,920][54818] InferenceWorker_p0-w0: stopping experience collection (30950 times) [2024-04-28 02:27:52,930][54798] Signal inference workers to resume experience collection... (30950 times) [2024-04-28 02:27:52,934][54818] InferenceWorker_p0-w0: resuming experience collection (30950 times) [2024-04-28 02:27:53,953][54818] Updated weights for policy 0, policy_version 549038 (0.0016) [2024-04-28 02:27:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 8995438592. Throughput: 0: 60944.0. Samples: 1900684300. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:54,255][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 02:27:56,919][54818] Updated weights for policy 0, policy_version 549048 (0.0016) [2024-04-28 02:27:59,253][54587] Fps is (10 sec: 62258.8, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 8995749888. Throughput: 0: 61274.5. Samples: 1900878060. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:27:59,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 02:27:59,384][54818] Updated weights for policy 0, policy_version 549058 (0.0016) [2024-04-28 02:28:02,231][54818] Updated weights for policy 0, policy_version 549068 (0.0018) [2024-04-28 02:28:04,253][54587] Fps is (10 sec: 60622.0, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 8996044800. 
Throughput: 0: 61081.4. Samples: 1901236920. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:28:04,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 02:28:04,859][54818] Updated weights for policy 0, policy_version 549078 (0.0018) [2024-04-28 02:28:07,563][54818] Updated weights for policy 0, policy_version 549088 (0.0017) [2024-04-28 02:28:09,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60894.0, 300 sec: 61203.9). Total num frames: 8996356096. Throughput: 0: 60853.9. Samples: 1901595160. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:28:09,255][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 02:28:10,481][54818] Updated weights for policy 0, policy_version 549098 (0.0016) [2024-04-28 02:28:11,630][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4400 times) [2024-04-28 02:28:12,758][54818] Updated weights for policy 0, policy_version 549108 (0.0015) [2024-04-28 02:28:14,253][54587] Fps is (10 sec: 62258.0, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 8996667392. Throughput: 0: 61290.9. Samples: 1901795420. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:28:14,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 02:28:15,638][54818] Updated weights for policy 0, policy_version 549118 (0.0016) [2024-04-28 02:28:18,153][54818] Updated weights for policy 0, policy_version 549128 (0.0018) [2024-04-28 02:28:19,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 8996978688. Throughput: 0: 61114.5. Samples: 1902153920. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:28:19,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 02:28:21,246][54818] Updated weights for policy 0, policy_version 549138 (0.0015) [2024-04-28 02:28:23,427][54818] Updated weights for policy 0, policy_version 549148 (0.0018) [2024-04-28 02:28:24,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 8997289984. Throughput: 0: 60950.6. Samples: 1902512640. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:28:24,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 02:28:26,848][54818] Updated weights for policy 0, policy_version 549158 (0.0015) [2024-04-28 02:28:28,681][54818] Updated weights for policy 0, policy_version 549168 (0.0016) [2024-04-28 02:28:29,253][54587] Fps is (10 sec: 58983.2, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 8997568512. Throughput: 0: 61267.8. Samples: 1902714760. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 02:28:29,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:28:32,000][54818] Updated weights for policy 0, policy_version 549178 (0.0016) [2024-04-28 02:28:34,223][54818] Updated weights for policy 0, policy_version 549188 (0.0015) [2024-04-28 02:28:34,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 8997896192. Throughput: 0: 61214.7. Samples: 1903075360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:28:34,255][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 02:28:37,345][54818] Updated weights for policy 0, policy_version 549198 (0.0017) [2024-04-28 02:28:37,903][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4500 times) [2024-04-28 02:28:39,253][54587] Fps is (10 sec: 63897.2, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 8998207488. Throughput: 0: 61151.7. Samples: 1903436120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:28:39,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 02:28:39,566][54818] Updated weights for policy 0, policy_version 549208 (0.0020) [2024-04-28 02:28:42,563][54818] Updated weights for policy 0, policy_version 549218 (0.0016) [2024-04-28 02:28:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 8998502400. Throughput: 0: 61238.8. Samples: 1903633800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:28:44,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 02:28:44,406][54798] Signal inference workers to stop experience collection... (31000 times) [2024-04-28 02:28:44,407][54798] Signal inference workers to resume experience collection... (31000 times) [2024-04-28 02:28:44,425][54818] InferenceWorker_p0-w0: stopping experience collection (31000 times) [2024-04-28 02:28:44,425][54818] InferenceWorker_p0-w0: resuming experience collection (31000 times) [2024-04-28 02:28:44,966][54818] Updated weights for policy 0, policy_version 549228 (0.0017) [2024-04-28 02:28:47,791][54818] Updated weights for policy 0, policy_version 549238 (0.0017) [2024-04-28 02:28:49,254][54587] Fps is (10 sec: 60619.5, 60 sec: 61439.9, 300 sec: 61204.7). Total num frames: 8998813696. Throughput: 0: 61238.3. Samples: 1903992660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:28:49,255][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 02:28:50,168][54818] Updated weights for policy 0, policy_version 549248 (0.0015) [2024-04-28 02:28:52,973][54818] Updated weights for policy 0, policy_version 549258 (0.0015) [2024-04-28 02:28:54,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 8999108608. Throughput: 0: 61316.9. Samples: 1904354420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:28:54,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:28:55,613][54818] Updated weights for policy 0, policy_version 549268 (0.0016) [2024-04-28 02:28:58,619][54818] Updated weights for policy 0, policy_version 549278 (0.0021) [2024-04-28 02:28:59,253][54587] Fps is (10 sec: 60622.5, 60 sec: 61167.2, 300 sec: 61148.4). Total num frames: 8999419904. Throughput: 0: 61080.3. Samples: 1904544020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:28:59,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 02:29:01,143][54818] Updated weights for policy 0, policy_version 549288 (0.0016) [2024-04-28 02:29:03,987][54818] Updated weights for policy 0, policy_version 549298 (0.0016) [2024-04-28 02:29:04,253][54587] Fps is (10 sec: 60619.8, 60 sec: 61166.7, 300 sec: 61148.4). Total num frames: 8999714816. Throughput: 0: 61208.3. Samples: 1904908300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:04,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:29:05,013][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4600 times) [2024-04-28 02:29:07,064][54818] Updated weights for policy 0, policy_version 549308 (0.0017) [2024-04-28 02:29:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9000026112. Throughput: 0: 61332.1. Samples: 1905272580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:09,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 02:29:09,257][54818] Updated weights for policy 0, policy_version 549318 (0.0018) [2024-04-28 02:29:12,229][54818] Updated weights for policy 0, policy_version 549328 (0.0019) [2024-04-28 02:29:14,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9000321024. Throughput: 0: 60871.8. Samples: 1905454000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:14,254][54587] Avg episode reward: [(0, '0.476')] [2024-04-28 02:29:14,614][54818] Updated weights for policy 0, policy_version 549338 (0.0017) [2024-04-28 02:29:17,609][54818] Updated weights for policy 0, policy_version 549348 (0.0016) [2024-04-28 02:29:19,253][54587] Fps is (10 sec: 58982.8, 60 sec: 60620.9, 300 sec: 61037.4). Total num frames: 9000615936. Throughput: 0: 60983.7. Samples: 1905819620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:19,253][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 02:29:19,769][54818] Updated weights for policy 0, policy_version 549358 (0.0016) [2024-04-28 02:29:22,787][54818] Updated weights for policy 0, policy_version 549368 (0.0015) [2024-04-28 02:29:24,253][54587] Fps is (10 sec: 58982.4, 60 sec: 60347.8, 300 sec: 60981.8). Total num frames: 9000910848. Throughput: 0: 61071.9. Samples: 1906184360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:24,255][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 02:29:25,257][54818] Updated weights for policy 0, policy_version 549378 (0.0017) [2024-04-28 02:29:25,724][54798] Signal inference workers to stop experience collection... (31050 times) [2024-04-28 02:29:25,727][54798] Signal inference workers to resume experience collection... (31050 times) [2024-04-28 02:29:25,738][54818] InferenceWorker_p0-w0: stopping experience collection (31050 times) [2024-04-28 02:29:25,738][54818] InferenceWorker_p0-w0: resuming experience collection (31050 times) [2024-04-28 02:29:28,166][54818] Updated weights for policy 0, policy_version 549388 (0.0017) [2024-04-28 02:29:29,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9001238528. Throughput: 0: 60772.6. Samples: 1906368560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:29,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 02:29:30,605][54818] Updated weights for policy 0, policy_version 549398 (0.0018) [2024-04-28 02:29:32,035][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4700 times) [2024-04-28 02:29:33,441][54818] Updated weights for policy 0, policy_version 549408 (0.0018) [2024-04-28 02:29:34,253][54587] Fps is (10 sec: 63897.3, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9001549824. Throughput: 0: 61115.3. Samples: 1906742840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:34,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 02:29:35,931][54818] Updated weights for policy 0, policy_version 549418 (0.0018) [2024-04-28 02:29:38,949][54818] Updated weights for policy 0, policy_version 549428 (0.0017) [2024-04-28 02:29:39,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60620.7, 300 sec: 61093.6). Total num frames: 9001844736. Throughput: 0: 61099.4. Samples: 1907103900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:39,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 02:29:41,279][54818] Updated weights for policy 0, policy_version 549438 (0.0016) [2024-04-28 02:29:44,207][54818] Updated weights for policy 0, policy_version 549448 (0.0018) [2024-04-28 02:29:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9002156032. Throughput: 0: 60994.9. Samples: 1907288800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:44,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 02:29:46,931][54818] Updated weights for policy 0, policy_version 549458 (0.0016) [2024-04-28 02:29:49,253][54587] Fps is (10 sec: 62259.6, 60 sec: 60894.1, 300 sec: 61204.0). Total num frames: 9002467328. Throughput: 0: 61034.9. Samples: 1907654860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:49,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 02:29:49,261][54587] No heartbeat for components: RolloutWorker_w4 (18337 seconds), RolloutWorker_w5 (4437 seconds) [2024-04-28 02:29:49,261][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000549467_9002467328.pth... [2024-04-28 02:29:49,309][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000548572_8987803648.pth [2024-04-28 02:29:49,508][54818] Updated weights for policy 0, policy_version 549468 (0.0016) [2024-04-28 02:29:52,344][54818] Updated weights for policy 0, policy_version 549478 (0.0017) [2024-04-28 02:29:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9002762240. Throughput: 0: 60992.8. Samples: 1908017260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 02:29:54,965][54818] Updated weights for policy 0, policy_version 549488 (0.0022) [2024-04-28 02:29:57,550][54818] Updated weights for policy 0, policy_version 549498 (0.0015) [2024-04-28 02:29:58,840][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4800 times) [2024-04-28 02:29:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 60620.7, 300 sec: 61092.9). Total num frames: 9003057152. Throughput: 0: 61081.0. Samples: 1908202640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 02:29:59,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 02:30:00,249][54818] Updated weights for policy 0, policy_version 549508 (0.0016) [2024-04-28 02:30:03,307][54818] Updated weights for policy 0, policy_version 549518 (0.0017) [2024-04-28 02:30:04,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 9003384832. Throughput: 0: 60946.1. Samples: 1908562200. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:04,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 02:30:05,529][54818] Updated weights for policy 0, policy_version 549528 (0.0017) [2024-04-28 02:30:06,555][54798] Signal inference workers to stop experience collection... (31100 times) [2024-04-28 02:30:06,591][54818] InferenceWorker_p0-w0: stopping experience collection (31100 times) [2024-04-28 02:30:06,645][54798] Signal inference workers to resume experience collection... (31100 times) [2024-04-28 02:30:06,645][54818] InferenceWorker_p0-w0: resuming experience collection (31100 times) [2024-04-28 02:30:08,495][54818] Updated weights for policy 0, policy_version 549538 (0.0016) [2024-04-28 02:30:09,253][54587] Fps is (10 sec: 63896.6, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9003696128. Throughput: 0: 61179.4. Samples: 1908937440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:09,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 02:30:10,713][54818] Updated weights for policy 0, policy_version 549548 (0.0017) [2024-04-28 02:30:13,698][54818] Updated weights for policy 0, policy_version 549558 (0.0017) [2024-04-28 02:30:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9003991040. Throughput: 0: 61099.4. Samples: 1909118040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:14,255][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 02:30:16,502][54818] Updated weights for policy 0, policy_version 549568 (0.0020) [2024-04-28 02:30:19,026][54818] Updated weights for policy 0, policy_version 549578 (0.0015) [2024-04-28 02:30:19,253][54587] Fps is (10 sec: 58982.4, 60 sec: 61166.7, 300 sec: 61148.4). Total num frames: 9004285952. Throughput: 0: 60883.0. Samples: 1909482580. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:19,256][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 02:30:21,676][54818] Updated weights for policy 0, policy_version 549588 (0.0017) [2024-04-28 02:30:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 9004597248. Throughput: 0: 61120.6. Samples: 1909854320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:24,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 02:30:24,341][54818] Updated weights for policy 0, policy_version 549598 (0.0017) [2024-04-28 02:30:25,537][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (4900 times) [2024-04-28 02:30:26,825][54818] Updated weights for policy 0, policy_version 549608 (0.0018) [2024-04-28 02:30:29,253][54587] Fps is (10 sec: 62260.6, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9004908544. Throughput: 0: 60964.2. Samples: 1910032180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:29,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 02:30:29,648][54818] Updated weights for policy 0, policy_version 549618 (0.0020) [2024-04-28 02:30:32,206][54818] Updated weights for policy 0, policy_version 549628 (0.0016) [2024-04-28 02:30:34,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9005219840. Throughput: 0: 61152.4. Samples: 1910406720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:34,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 02:30:34,940][54818] Updated weights for policy 0, policy_version 549638 (0.0017) [2024-04-28 02:30:37,874][54818] Updated weights for policy 0, policy_version 549648 (0.0015) [2024-04-28 02:30:39,253][54587] Fps is (10 sec: 60619.2, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9005514752. Throughput: 0: 61230.1. Samples: 1910772620. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:39,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:30:40,242][54818] Updated weights for policy 0, policy_version 549658 (0.0018) [2024-04-28 02:30:43,404][54818] Updated weights for policy 0, policy_version 549668 (0.0016) [2024-04-28 02:30:44,253][54587] Fps is (10 sec: 58983.3, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9005809664. Throughput: 0: 61093.0. Samples: 1910951820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:44,253][54587] Avg episode reward: [(0, '0.655')] [2024-04-28 02:30:45,597][54818] Updated weights for policy 0, policy_version 549678 (0.0017) [2024-04-28 02:30:48,712][54818] Updated weights for policy 0, policy_version 549688 (0.0016) [2024-04-28 02:30:49,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9006120960. Throughput: 0: 61382.5. Samples: 1911324420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:49,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 02:30:50,877][54818] Updated weights for policy 0, policy_version 549698 (0.0017) [2024-04-28 02:30:52,121][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5000 times) [2024-04-28 02:30:53,830][54818] Updated weights for policy 0, policy_version 549708 (0.0015) [2024-04-28 02:30:54,253][54587] Fps is (10 sec: 60619.5, 60 sec: 60893.8, 300 sec: 61092.8). Total num frames: 9006415872. Throughput: 0: 61357.8. Samples: 1911698540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:54,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:30:56,223][54818] Updated weights for policy 0, policy_version 549718 (0.0015) [2024-04-28 02:30:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9006727168. Throughput: 0: 61194.5. Samples: 1911871800. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:30:59,255][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 02:30:59,405][54798] Signal inference workers to stop experience collection... (31150 times) [2024-04-28 02:30:59,406][54798] Signal inference workers to resume experience collection... (31150 times) [2024-04-28 02:30:59,414][54818] Updated weights for policy 0, policy_version 549728 (0.0018) [2024-04-28 02:30:59,422][54818] InferenceWorker_p0-w0: stopping experience collection (31150 times) [2024-04-28 02:30:59,423][54818] InferenceWorker_p0-w0: resuming experience collection (31150 times) [2024-04-28 02:31:01,497][54818] Updated weights for policy 0, policy_version 549738 (0.0018) [2024-04-28 02:31:04,253][54587] Fps is (10 sec: 62260.5, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 9007038464. Throughput: 0: 61343.4. Samples: 1912243020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:31:04,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:31:04,617][54818] Updated weights for policy 0, policy_version 549748 (0.0015) [2024-04-28 02:31:06,953][54818] Updated weights for policy 0, policy_version 549758 (0.0017) [2024-04-28 02:31:09,253][54587] Fps is (10 sec: 62260.1, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 9007349760. Throughput: 0: 61515.5. Samples: 1912622520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:31:09,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 02:31:09,896][54818] Updated weights for policy 0, policy_version 549768 (0.0017) [2024-04-28 02:31:12,384][54818] Updated weights for policy 0, policy_version 549778 (0.0017) [2024-04-28 02:31:14,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9007661056. Throughput: 0: 61461.2. Samples: 1912797940. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:31:14,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 02:31:15,129][54818] Updated weights for policy 0, policy_version 549788 (0.0015) [2024-04-28 02:31:17,572][54818] Updated weights for policy 0, policy_version 549798 (0.0016) [2024-04-28 02:31:19,156][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5100 times) [2024-04-28 02:31:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9007955968. Throughput: 0: 61192.4. Samples: 1913160380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:31:19,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 02:31:20,436][54818] Updated weights for policy 0, policy_version 549808 (0.0016) [2024-04-28 02:31:23,081][54818] Updated weights for policy 0, policy_version 549818 (0.0017) [2024-04-28 02:31:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61148.5). Total num frames: 9008267264. Throughput: 0: 61506.5. Samples: 1913540400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:31:24,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 02:31:25,594][54818] Updated weights for policy 0, policy_version 549828 (0.0018) [2024-04-28 02:31:28,722][54818] Updated weights for policy 0, policy_version 549838 (0.0015) [2024-04-28 02:31:29,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9008578560. Throughput: 0: 61399.4. Samples: 1913714800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-04-28 02:31:29,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 02:31:31,063][54818] Updated weights for policy 0, policy_version 549848 (0.0018) [2024-04-28 02:31:33,969][54818] Updated weights for policy 0, policy_version 549858 (0.0017) [2024-04-28 02:31:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9008889856. 
Throughput: 0: 61242.3. Samples: 1914080320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:31:34,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 02:31:36,390][54818] Updated weights for policy 0, policy_version 549868 (0.0017) [2024-04-28 02:31:39,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61167.2, 300 sec: 61092.9). Total num frames: 9009184768. Throughput: 0: 61226.6. Samples: 1914453720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:31:39,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 02:31:39,293][54818] Updated weights for policy 0, policy_version 549878 (0.0016) [2024-04-28 02:31:41,693][54818] Updated weights for policy 0, policy_version 549888 (0.0016) [2024-04-28 02:31:44,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61713.0, 300 sec: 61148.4). Total num frames: 9009512448. Throughput: 0: 61345.1. Samples: 1914632320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:31:44,255][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 02:31:44,608][54818] Updated weights for policy 0, policy_version 549898 (0.0016) [2024-04-28 02:31:45,714][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5200 times) [2024-04-28 02:31:47,219][54818] Updated weights for policy 0, policy_version 549908 (0.0016) [2024-04-28 02:31:48,127][54798] Signal inference workers to stop experience collection... (31200 times) [2024-04-28 02:31:48,127][54798] Signal inference workers to resume experience collection... (31200 times) [2024-04-28 02:31:48,142][54818] InferenceWorker_p0-w0: stopping experience collection (31200 times) [2024-04-28 02:31:48,142][54818] InferenceWorker_p0-w0: resuming experience collection (31200 times) [2024-04-28 02:31:49,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.2, 300 sec: 61092.9). Total num frames: 9009807360. Throughput: 0: 61283.5. Samples: 1915000780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:31:49,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 02:31:49,339][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000549916_9009823744.pth... [2024-04-28 02:31:49,398][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000549020_8995143680.pth [2024-04-28 02:31:50,091][54818] Updated weights for policy 0, policy_version 549918 (0.0015) [2024-04-28 02:31:52,421][54818] Updated weights for policy 0, policy_version 549928 (0.0016) [2024-04-28 02:31:54,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61713.1, 300 sec: 61092.9). Total num frames: 9010118656. Throughput: 0: 61044.7. Samples: 1915369540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:31:54,255][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 02:31:55,353][54818] Updated weights for policy 0, policy_version 549938 (0.0015) [2024-04-28 02:31:57,801][54818] Updated weights for policy 0, policy_version 549948 (0.0017) [2024-04-28 02:31:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.2, 300 sec: 61148.4). Total num frames: 9010429952. Throughput: 0: 61157.8. Samples: 1915550040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:31:59,254][54587] Avg episode reward: [(0, '0.678')] [2024-04-28 02:32:00,439][54818] Updated weights for policy 0, policy_version 549958 (0.0021) [2024-04-28 02:32:03,335][54818] Updated weights for policy 0, policy_version 549968 (0.0016) [2024-04-28 02:32:04,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 9010724864. Throughput: 0: 61438.7. Samples: 1915925120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:32:05,746][54818] Updated weights for policy 0, policy_version 549978 (0.0015) [2024-04-28 02:32:08,470][54818] Updated weights for policy 0, policy_version 549988 (0.0019) [2024-04-28 02:32:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 9011036160. Throughput: 0: 61227.0. Samples: 1916295620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:09,254][54587] Avg episode reward: [(0, '0.702')] [2024-04-28 02:32:10,992][54818] Updated weights for policy 0, policy_version 549998 (0.0017) [2024-04-28 02:32:12,040][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5300 times) [2024-04-28 02:32:14,248][54818] Updated weights for policy 0, policy_version 550008 (0.0020) [2024-04-28 02:32:14,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9011331072. Throughput: 0: 61367.7. Samples: 1916476340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:14,253][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 02:32:16,521][54818] Updated weights for policy 0, policy_version 550018 (0.0015) [2024-04-28 02:32:19,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 9011642368. Throughput: 0: 61423.0. Samples: 1916844360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:19,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:32:19,392][54818] Updated weights for policy 0, policy_version 550028 (0.0015) [2024-04-28 02:32:21,706][54818] Updated weights for policy 0, policy_version 550038 (0.0018) [2024-04-28 02:32:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9011937280. Throughput: 0: 61331.0. Samples: 1917213620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:24,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 02:32:24,529][54818] Updated weights for policy 0, policy_version 550048 (0.0018) [2024-04-28 02:32:27,065][54818] Updated weights for policy 0, policy_version 550058 (0.0015) [2024-04-28 02:32:29,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 9012248576. Throughput: 0: 61362.8. Samples: 1917393640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:29,253][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 02:32:30,118][54818] Updated weights for policy 0, policy_version 550068 (0.0019) [2024-04-28 02:32:32,411][54818] Updated weights for policy 0, policy_version 550078 (0.0017) [2024-04-28 02:32:34,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9012559872. Throughput: 0: 61302.4. Samples: 1917759400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:34,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 02:32:35,379][54818] Updated weights for policy 0, policy_version 550088 (0.0017) [2024-04-28 02:32:37,665][54818] Updated weights for policy 0, policy_version 550098 (0.0016) [2024-04-28 02:32:39,149][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5400 times) [2024-04-28 02:32:39,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9012871168. Throughput: 0: 61293.9. Samples: 1918127760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:39,254][54587] Avg episode reward: [(0, '0.497')] [2024-04-28 02:32:40,775][54818] Updated weights for policy 0, policy_version 550108 (0.0016) [2024-04-28 02:32:43,077][54818] Updated weights for policy 0, policy_version 550118 (0.0017) [2024-04-28 02:32:44,177][54798] Signal inference workers to stop experience collection... 
(31250 times) [2024-04-28 02:32:44,177][54798] Signal inference workers to resume experience collection... (31250 times) [2024-04-28 02:32:44,184][54818] InferenceWorker_p0-w0: stopping experience collection (31250 times) [2024-04-28 02:32:44,185][54818] InferenceWorker_p0-w0: resuming experience collection (31250 times) [2024-04-28 02:32:44,253][54587] Fps is (10 sec: 62260.6, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9013182464. Throughput: 0: 61377.8. Samples: 1918312040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:44,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 02:32:46,173][54818] Updated weights for policy 0, policy_version 550128 (0.0016) [2024-04-28 02:32:48,505][54818] Updated weights for policy 0, policy_version 550138 (0.0017) [2024-04-28 02:32:49,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.8, 300 sec: 61204.0). Total num frames: 9013493760. Throughput: 0: 61107.9. Samples: 1918674980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:49,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 02:32:49,263][54587] No heartbeat for components: RolloutWorker_w4 (18517 seconds), RolloutWorker_w5 (4617 seconds) [2024-04-28 02:32:51,396][54818] Updated weights for policy 0, policy_version 550148 (0.0016) [2024-04-28 02:32:54,088][54818] Updated weights for policy 0, policy_version 550158 (0.0016) [2024-04-28 02:32:54,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9013788672. Throughput: 0: 60919.1. Samples: 1919036980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:54,255][54587] Avg episode reward: [(0, '0.506')] [2024-04-28 02:32:56,736][54818] Updated weights for policy 0, policy_version 550168 (0.0016) [2024-04-28 02:32:59,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9014099968. Throughput: 0: 61189.3. Samples: 1919229860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 02:32:59,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 02:32:59,306][54818] Updated weights for policy 0, policy_version 550178 (0.0015) [2024-04-28 02:33:02,169][54818] Updated weights for policy 0, policy_version 550188 (0.0015) [2024-04-28 02:33:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9014394880. Throughput: 0: 60982.2. Samples: 1919588560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:04,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 02:33:04,742][54818] Updated weights for policy 0, policy_version 550198 (0.0019) [2024-04-28 02:33:06,005][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5500 times) [2024-04-28 02:33:07,527][54818] Updated weights for policy 0, policy_version 550208 (0.0017) [2024-04-28 02:33:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9014706176. Throughput: 0: 60972.8. Samples: 1919957400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:09,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-28 02:33:10,094][54818] Updated weights for policy 0, policy_version 550218 (0.0020) [2024-04-28 02:33:12,654][54818] Updated weights for policy 0, policy_version 550228 (0.0018) [2024-04-28 02:33:14,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9015017472. Throughput: 0: 61242.2. Samples: 1920149540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:14,253][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 02:33:15,434][54818] Updated weights for policy 0, policy_version 550238 (0.0018) [2024-04-28 02:33:17,828][54818] Updated weights for policy 0, policy_version 550248 (0.0017) [2024-04-28 02:33:19,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9015328768. 
Throughput: 0: 61139.8. Samples: 1920510680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:19,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 02:33:20,891][54818] Updated weights for policy 0, policy_version 550258 (0.0017) [2024-04-28 02:33:23,260][54818] Updated weights for policy 0, policy_version 550268 (0.0015) [2024-04-28 02:33:24,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9015640064. Throughput: 0: 61200.0. Samples: 1920881760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:24,255][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 02:33:26,256][54818] Updated weights for policy 0, policy_version 550278 (0.0018) [2024-04-28 02:33:26,873][54798] Signal inference workers to stop experience collection... (31300 times) [2024-04-28 02:33:26,902][54818] InferenceWorker_p0-w0: stopping experience collection (31300 times) [2024-04-28 02:33:26,969][54798] Signal inference workers to resume experience collection... (31300 times) [2024-04-28 02:33:26,969][54818] InferenceWorker_p0-w0: resuming experience collection (31300 times) [2024-04-28 02:33:28,964][54818] Updated weights for policy 0, policy_version 550288 (0.0018) [2024-04-28 02:33:29,257][54587] Fps is (10 sec: 60596.8, 60 sec: 61435.9, 300 sec: 61147.6). Total num frames: 9015934976. Throughput: 0: 61195.9. Samples: 1921066100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:29,258][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 02:33:31,571][54818] Updated weights for policy 0, policy_version 550298 (0.0018) [2024-04-28 02:33:32,775][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5600 times) [2024-04-28 02:33:34,142][54818] Updated weights for policy 0, policy_version 550308 (0.0016) [2024-04-28 02:33:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9016246272. 
Throughput: 0: 61126.3. Samples: 1921425660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:34,254][54587] Avg episode reward: [(0, '0.502')] [2024-04-28 02:33:34,255][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (300 times) [2024-04-28 02:33:37,108][54818] Updated weights for policy 0, policy_version 550318 (0.0020) [2024-04-28 02:33:39,253][54587] Fps is (10 sec: 59005.5, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9016524800. Throughput: 0: 61179.2. Samples: 1921790040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:39,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 02:33:39,508][54818] Updated weights for policy 0, policy_version 550328 (0.0021) [2024-04-28 02:33:42,314][54818] Updated weights for policy 0, policy_version 550338 (0.0015) [2024-04-28 02:33:44,254][54587] Fps is (10 sec: 58977.1, 60 sec: 60892.8, 300 sec: 61092.7). Total num frames: 9016836096. Throughput: 0: 61086.6. Samples: 1921978820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:44,264][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 02:33:45,310][54818] Updated weights for policy 0, policy_version 550348 (0.0019) [2024-04-28 02:33:47,800][54818] Updated weights for policy 0, policy_version 550358 (0.0018) [2024-04-28 02:33:49,253][54587] Fps is (10 sec: 62259.7, 60 sec: 60894.1, 300 sec: 61148.4). Total num frames: 9017147392. Throughput: 0: 61186.9. Samples: 1922341960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:49,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 02:33:49,345][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000550364_9017163776.pth... 
[2024-04-28 02:33:49,403][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000549467_9002467328.pth [2024-04-28 02:33:50,558][54818] Updated weights for policy 0, policy_version 550368 (0.0017) [2024-04-28 02:33:53,197][54818] Updated weights for policy 0, policy_version 550378 (0.0015) [2024-04-28 02:33:54,253][54587] Fps is (10 sec: 62264.7, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9017458688. Throughput: 0: 61039.1. Samples: 1922704160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:54,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 02:33:55,827][54818] Updated weights for policy 0, policy_version 550388 (0.0016) [2024-04-28 02:33:58,378][54818] Updated weights for policy 0, policy_version 550398 (0.0016) [2024-04-28 02:33:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.9, 300 sec: 61148.5). Total num frames: 9017753600. Throughput: 0: 60955.2. Samples: 1922892520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:33:59,253][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 02:33:59,632][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5700 times) [2024-04-28 02:34:01,060][54818] Updated weights for policy 0, policy_version 550408 (0.0017) [2024-04-28 02:34:03,789][54818] Updated weights for policy 0, policy_version 550418 (0.0017) [2024-04-28 02:34:04,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9018081280. Throughput: 0: 61061.4. Samples: 1923258440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:34:04,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 02:34:06,507][54818] Updated weights for policy 0, policy_version 550428 (0.0018) [2024-04-28 02:34:09,075][54818] Updated weights for policy 0, policy_version 550438 (0.0020) [2024-04-28 02:34:09,253][54587] Fps is (10 sec: 62257.8, 60 sec: 61166.9, 300 sec: 61203.9). 
Total num frames: 9018376192. Throughput: 0: 61026.5. Samples: 1923627960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:34:09,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 02:34:11,971][54818] Updated weights for policy 0, policy_version 550448 (0.0015) [2024-04-28 02:34:14,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9018687488. Throughput: 0: 60930.1. Samples: 1923807720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:34:14,255][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 02:34:14,470][54818] Updated weights for policy 0, policy_version 550458 (0.0018) [2024-04-28 02:34:17,207][54818] Updated weights for policy 0, policy_version 550468 (0.0016) [2024-04-28 02:34:19,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9018982400. Throughput: 0: 61197.0. Samples: 1924179520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:34:19,253][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 02:34:19,989][54818] Updated weights for policy 0, policy_version 550478 (0.0016) [2024-04-28 02:34:22,634][54818] Updated weights for policy 0, policy_version 550488 (0.0016) [2024-04-28 02:34:24,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9019293696. Throughput: 0: 61302.8. Samples: 1924548660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:34:24,253][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 02:34:25,218][54818] Updated weights for policy 0, policy_version 550498 (0.0016) [2024-04-28 02:34:26,292][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5800 times) [2024-04-28 02:34:27,842][54818] Updated weights for policy 0, policy_version 550508 (0.0019) [2024-04-28 02:34:29,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60897.9, 300 sec: 61148.5). Total num frames: 9019588608. Throughput: 0: 61122.7. 
Samples: 1924729280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 02:34:29,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-28 02:34:30,467][54818] Updated weights for policy 0, policy_version 550518 (0.0015) [2024-04-28 02:34:31,159][54798] Signal inference workers to stop experience collection... (31350 times) [2024-04-28 02:34:31,159][54798] Signal inference workers to resume experience collection... (31350 times) [2024-04-28 02:34:31,178][54818] InferenceWorker_p0-w0: stopping experience collection (31350 times) [2024-04-28 02:34:31,178][54818] InferenceWorker_p0-w0: resuming experience collection (31350 times) [2024-04-28 02:34:33,393][54818] Updated weights for policy 0, policy_version 550528 (0.0016) [2024-04-28 02:34:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9019899904. Throughput: 0: 61195.5. Samples: 1925095760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:34:34,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 02:34:35,895][54818] Updated weights for policy 0, policy_version 550538 (0.0017) [2024-04-28 02:34:38,836][54818] Updated weights for policy 0, policy_version 550548 (0.0017) [2024-04-28 02:34:39,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9020211200. Throughput: 0: 61429.4. Samples: 1925468480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:34:39,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 02:34:41,175][54818] Updated weights for policy 0, policy_version 550558 (0.0018) [2024-04-28 02:34:44,000][54818] Updated weights for policy 0, policy_version 550568 (0.0015) [2024-04-28 02:34:44,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61167.8, 300 sec: 61148.4). Total num frames: 9020506112. Throughput: 0: 61248.6. Samples: 1925648720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:34:44,255][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 02:34:46,525][54818] Updated weights for policy 0, policy_version 550578 (0.0017) [2024-04-28 02:34:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 9020817408. Throughput: 0: 61264.3. Samples: 1926015340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:34:49,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 02:34:49,519][54818] Updated weights for policy 0, policy_version 550588 (0.0018) [2024-04-28 02:34:51,883][54818] Updated weights for policy 0, policy_version 550598 (0.0016) [2024-04-28 02:34:53,150][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (5900 times) [2024-04-28 02:34:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.9, 300 sec: 61203.9). Total num frames: 9021112320. Throughput: 0: 61344.5. Samples: 1926388460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:34:54,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 02:34:54,804][54818] Updated weights for policy 0, policy_version 550608 (0.0018) [2024-04-28 02:34:57,102][54818] Updated weights for policy 0, policy_version 550618 (0.0018) [2024-04-28 02:34:59,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9021423616. Throughput: 0: 61183.3. Samples: 1926560960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:34:59,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-28 02:35:00,009][54818] Updated weights for policy 0, policy_version 550628 (0.0016) [2024-04-28 02:35:02,305][54818] Updated weights for policy 0, policy_version 550638 (0.0016) [2024-04-28 02:35:04,253][54587] Fps is (10 sec: 62259.9, 60 sec: 60893.9, 300 sec: 61148.5). Total num frames: 9021734912. Throughput: 0: 61250.2. Samples: 1926935780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:04,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 02:35:05,425][54818] Updated weights for policy 0, policy_version 550648 (0.0022) [2024-04-28 02:35:08,006][54818] Updated weights for policy 0, policy_version 550658 (0.0016) [2024-04-28 02:35:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60894.1, 300 sec: 61148.4). Total num frames: 9022029824. Throughput: 0: 61252.5. Samples: 1927305020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:09,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 02:35:10,686][54818] Updated weights for policy 0, policy_version 550668 (0.0015) [2024-04-28 02:35:13,307][54818] Updated weights for policy 0, policy_version 550678 (0.0019) [2024-04-28 02:35:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9022341120. Throughput: 0: 61096.4. Samples: 1927478620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:14,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 02:35:16,072][54818] Updated weights for policy 0, policy_version 550688 (0.0018) [2024-04-28 02:35:19,171][54818] Updated weights for policy 0, policy_version 550698 (0.0015) [2024-04-28 02:35:19,253][54587] Fps is (10 sec: 60619.5, 60 sec: 60893.7, 300 sec: 61148.4). Total num frames: 9022636032. Throughput: 0: 61133.1. Samples: 1927846760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:19,263][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 02:35:20,065][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6000 times) [2024-04-28 02:35:21,240][54818] Updated weights for policy 0, policy_version 550708 (0.0016) [2024-04-28 02:35:22,165][54798] Signal inference workers to stop experience collection... 
(31400 times) [2024-04-28 02:35:22,198][54818] InferenceWorker_p0-w0: stopping experience collection (31400 times) [2024-04-28 02:35:22,225][54798] Signal inference workers to resume experience collection... (31400 times) [2024-04-28 02:35:22,226][54818] InferenceWorker_p0-w0: resuming experience collection (31400 times) [2024-04-28 02:35:24,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9022947328. Throughput: 0: 61125.8. Samples: 1928219140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:24,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 02:35:24,420][54818] Updated weights for policy 0, policy_version 550718 (0.0018) [2024-04-28 02:35:26,473][54818] Updated weights for policy 0, policy_version 550728 (0.0017) [2024-04-28 02:35:29,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9023242240. Throughput: 0: 60946.3. Samples: 1928391300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:29,255][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 02:35:30,125][54818] Updated weights for policy 0, policy_version 550738 (0.0016) [2024-04-28 02:35:31,857][54818] Updated weights for policy 0, policy_version 550748 (0.0018) [2024-04-28 02:35:34,253][54587] Fps is (10 sec: 58982.7, 60 sec: 60620.8, 300 sec: 61092.9). Total num frames: 9023537152. Throughput: 0: 60924.2. Samples: 1928756920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:34,253][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 02:35:35,269][54818] Updated weights for policy 0, policy_version 550758 (0.0016) [2024-04-28 02:35:37,170][54818] Updated weights for policy 0, policy_version 550768 (0.0016) [2024-04-28 02:35:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 60893.9, 300 sec: 61203.9). Total num frames: 9023864832. Throughput: 0: 60902.4. Samples: 1929129060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:39,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 02:35:40,470][54818] Updated weights for policy 0, policy_version 550778 (0.0015) [2024-04-28 02:35:42,561][54818] Updated weights for policy 0, policy_version 550788 (0.0016) [2024-04-28 02:35:44,253][54587] Fps is (10 sec: 63897.2, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9024176128. Throughput: 0: 61053.7. Samples: 1929308380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:44,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 02:35:45,736][54818] Updated weights for policy 0, policy_version 550798 (0.0017) [2024-04-28 02:35:46,466][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6100 times) [2024-04-28 02:35:48,159][54818] Updated weights for policy 0, policy_version 550808 (0.0017) [2024-04-28 02:35:49,253][54587] Fps is (10 sec: 60619.7, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9024471040. Throughput: 0: 60940.6. Samples: 1929678120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:49,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 02:35:49,265][54587] No heartbeat for components: RolloutWorker_w4 (18697 seconds), RolloutWorker_w5 (4797 seconds) [2024-04-28 02:35:49,372][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000550811_9024487424.pth... [2024-04-28 02:35:49,426][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000549916_9009823744.pth [2024-04-28 02:35:50,915][54818] Updated weights for policy 0, policy_version 550818 (0.0015) [2024-04-28 02:35:53,404][54818] Updated weights for policy 0, policy_version 550828 (0.0017) [2024-04-28 02:35:54,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9024782336. Throughput: 0: 60965.6. Samples: 1930048480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:54,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:35:56,174][54818] Updated weights for policy 0, policy_version 550838 (0.0019) [2024-04-28 02:35:59,165][54818] Updated weights for policy 0, policy_version 550848 (0.0017) [2024-04-28 02:35:59,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9025093632. Throughput: 0: 61111.0. Samples: 1930228620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 02:35:59,255][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 02:36:01,473][54818] Updated weights for policy 0, policy_version 550858 (0.0015) [2024-04-28 02:36:04,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9025388544. Throughput: 0: 61074.0. Samples: 1930595080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:04,255][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:36:04,688][54818] Updated weights for policy 0, policy_version 550868 (0.0016) [2024-04-28 02:36:06,752][54818] Updated weights for policy 0, policy_version 550878 (0.0018) [2024-04-28 02:36:07,563][54798] Signal inference workers to stop experience collection... (31450 times) [2024-04-28 02:36:07,599][54818] InferenceWorker_p0-w0: stopping experience collection (31450 times) [2024-04-28 02:36:07,653][54798] Signal inference workers to resume experience collection... (31450 times) [2024-04-28 02:36:07,653][54818] InferenceWorker_p0-w0: resuming experience collection (31450 times) [2024-04-28 02:36:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.7, 300 sec: 61148.4). Total num frames: 9025699840. Throughput: 0: 60981.6. Samples: 1930963320. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:09,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 02:36:10,152][54818] Updated weights for policy 0, policy_version 550888 (0.0016) [2024-04-28 02:36:12,318][54818] Updated weights for policy 0, policy_version 550898 (0.0017) [2024-04-28 02:36:13,011][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6200 times) [2024-04-28 02:36:14,253][54587] Fps is (10 sec: 60619.5, 60 sec: 60893.6, 300 sec: 61148.4). Total num frames: 9025994752. Throughput: 0: 61215.8. Samples: 1931146020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:14,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 02:36:15,402][54818] Updated weights for policy 0, policy_version 550908 (0.0017) [2024-04-28 02:36:17,646][54818] Updated weights for policy 0, policy_version 550918 (0.0019) [2024-04-28 02:36:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9026306048. Throughput: 0: 61242.1. Samples: 1931512820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:19,255][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 02:36:20,726][54818] Updated weights for policy 0, policy_version 550928 (0.0019) [2024-04-28 02:36:22,950][54818] Updated weights for policy 0, policy_version 550938 (0.0015) [2024-04-28 02:36:24,253][54587] Fps is (10 sec: 62260.5, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9026617344. Throughput: 0: 61070.7. Samples: 1931877240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:24,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 02:36:25,856][54818] Updated weights for policy 0, policy_version 550948 (0.0018) [2024-04-28 02:36:28,168][54818] Updated weights for policy 0, policy_version 550958 (0.0016) [2024-04-28 02:36:29,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9026928640. 
Throughput: 0: 61167.9. Samples: 1932060940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:29,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 02:36:31,097][54818] Updated weights for policy 0, policy_version 550968 (0.0016) [2024-04-28 02:36:33,465][54818] Updated weights for policy 0, policy_version 550978 (0.0016) [2024-04-28 02:36:34,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9027223552. Throughput: 0: 61148.2. Samples: 1932429780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:34,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 02:36:36,417][54818] Updated weights for policy 0, policy_version 550988 (0.0016) [2024-04-28 02:36:39,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9027534848. Throughput: 0: 60908.8. Samples: 1932789380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:39,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 02:36:39,606][54818] Updated weights for policy 0, policy_version 550998 (0.0017) [2024-04-28 02:36:40,508][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6300 times) [2024-04-28 02:36:41,735][54818] Updated weights for policy 0, policy_version 551008 (0.0017) [2024-04-28 02:36:44,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9027846144. Throughput: 0: 61033.4. Samples: 1932975120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:44,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 02:36:44,783][54818] Updated weights for policy 0, policy_version 551018 (0.0015) [2024-04-28 02:36:47,124][54818] Updated weights for policy 0, policy_version 551028 (0.0016) [2024-04-28 02:36:49,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9028157440. Throughput: 0: 61128.2. Samples: 1933345860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:49,255][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 02:36:50,097][54818] Updated weights for policy 0, policy_version 551038 (0.0016) [2024-04-28 02:36:50,288][54798] Signal inference workers to stop experience collection... (31500 times) [2024-04-28 02:36:50,321][54818] InferenceWorker_p0-w0: stopping experience collection (31500 times) [2024-04-28 02:36:50,340][54798] Signal inference workers to resume experience collection... (31500 times) [2024-04-28 02:36:50,341][54818] InferenceWorker_p0-w0: resuming experience collection (31500 times) [2024-04-28 02:36:52,657][54818] Updated weights for policy 0, policy_version 551048 (0.0016) [2024-04-28 02:36:54,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9028452352. Throughput: 0: 61020.7. Samples: 1933709240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:54,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 02:36:55,458][54818] Updated weights for policy 0, policy_version 551058 (0.0018) [2024-04-28 02:36:57,987][54818] Updated weights for policy 0, policy_version 551068 (0.0018) [2024-04-28 02:36:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9028763648. Throughput: 0: 61085.0. Samples: 1933894840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:36:59,255][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 02:37:00,576][54818] Updated weights for policy 0, policy_version 551078 (0.0015) [2024-04-28 02:37:03,427][54818] Updated weights for policy 0, policy_version 551088 (0.0016) [2024-04-28 02:37:04,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9029074944. Throughput: 0: 61089.9. Samples: 1934261860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:37:04,253][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 02:37:06,101][54818] Updated weights for policy 0, policy_version 551098 (0.0016) [2024-04-28 02:37:07,048][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6400 times) [2024-04-28 02:37:08,629][54818] Updated weights for policy 0, policy_version 551108 (0.0019) [2024-04-28 02:37:09,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.2, 300 sec: 61203.9). Total num frames: 9029386240. Throughput: 0: 61128.0. Samples: 1934628000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:37:09,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 02:37:11,344][54818] Updated weights for policy 0, policy_version 551118 (0.0017) [2024-04-28 02:37:13,909][54818] Updated weights for policy 0, policy_version 551128 (0.0017) [2024-04-28 02:37:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.2, 300 sec: 61148.5). Total num frames: 9029681152. Throughput: 0: 61273.1. Samples: 1934818220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:37:14,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 02:37:16,845][54818] Updated weights for policy 0, policy_version 551138 (0.0015) [2024-04-28 02:37:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9029992448. Throughput: 0: 61128.4. Samples: 1935180560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:37:19,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 02:37:19,439][54818] Updated weights for policy 0, policy_version 551148 (0.0016) [2024-04-28 02:37:22,375][54818] Updated weights for policy 0, policy_version 551158 (0.0017) [2024-04-28 02:37:24,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9030303744. Throughput: 0: 61382.3. Samples: 1935551580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:37:24,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 02:37:24,648][54818] Updated weights for policy 0, policy_version 551168 (0.0016) [2024-04-28 02:37:27,478][54818] Updated weights for policy 0, policy_version 551178 (0.0015) [2024-04-28 02:37:29,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9030615040. Throughput: 0: 61352.8. Samples: 1935736000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 02:37:29,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 02:37:30,005][54818] Updated weights for policy 0, policy_version 551188 (0.0016) [2024-04-28 02:37:32,678][54818] Updated weights for policy 0, policy_version 551198 (0.0016) [2024-04-28 02:37:33,952][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6500 times) [2024-04-28 02:37:34,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9030909952. Throughput: 0: 61214.0. Samples: 1936100480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:37:34,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:37:35,098][54818] Updated weights for policy 0, policy_version 551208 (0.0017) [2024-04-28 02:37:37,699][54798] Signal inference workers to stop experience collection... (31550 times) [2024-04-28 02:37:37,699][54798] Signal inference workers to resume experience collection... (31550 times) [2024-04-28 02:37:37,716][54818] InferenceWorker_p0-w0: stopping experience collection (31550 times) [2024-04-28 02:37:37,716][54818] InferenceWorker_p0-w0: resuming experience collection (31550 times) [2024-04-28 02:37:38,068][54818] Updated weights for policy 0, policy_version 551218 (0.0019) [2024-04-28 02:37:39,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.2, 300 sec: 61148.4). Total num frames: 9031221248. Throughput: 0: 61392.0. Samples: 1936471880. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:37:39,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 02:37:40,848][54818] Updated weights for policy 0, policy_version 551228 (0.0022) [2024-04-28 02:37:43,677][54818] Updated weights for policy 0, policy_version 551238 (0.0017) [2024-04-28 02:37:44,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9031516160. Throughput: 0: 61332.2. Samples: 1936654780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:37:44,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 02:37:46,281][54818] Updated weights for policy 0, policy_version 551248 (0.0017) [2024-04-28 02:37:48,809][54818] Updated weights for policy 0, policy_version 551258 (0.0015) [2024-04-28 02:37:49,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.2, 300 sec: 61204.0). Total num frames: 9031843840. Throughput: 0: 61354.2. Samples: 1937022800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:37:49,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 02:37:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000551260_9031843840.pth... [2024-04-28 02:37:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000550364_9017163776.pth [2024-04-28 02:37:51,493][54818] Updated weights for policy 0, policy_version 551268 (0.0016) [2024-04-28 02:37:53,990][54818] Updated weights for policy 0, policy_version 551278 (0.0018) [2024-04-28 02:37:54,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9032138752. Throughput: 0: 61429.8. Samples: 1937392340. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:37:54,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:37:57,041][54818] Updated weights for policy 0, policy_version 551288 (0.0018) [2024-04-28 02:37:59,195][54818] Updated weights for policy 0, policy_version 551298 (0.0017) [2024-04-28 02:37:59,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9032466432. Throughput: 0: 61319.0. Samples: 1937577580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:37:59,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 02:38:00,851][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6600 times) [2024-04-28 02:38:02,310][54818] Updated weights for policy 0, policy_version 551308 (0.0017) [2024-04-28 02:38:04,253][54587] Fps is (10 sec: 63897.3, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9032777728. Throughput: 0: 61533.4. Samples: 1937949560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:04,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-28 02:38:04,568][54818] Updated weights for policy 0, policy_version 551318 (0.0019) [2024-04-28 02:38:07,666][54818] Updated weights for policy 0, policy_version 551328 (0.0018) [2024-04-28 02:38:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9033072640. Throughput: 0: 61381.0. Samples: 1938313720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:09,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 02:38:09,714][54818] Updated weights for policy 0, policy_version 551338 (0.0017) [2024-04-28 02:38:12,971][54818] Updated weights for policy 0, policy_version 551348 (0.0016) [2024-04-28 02:38:14,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61713.0, 300 sec: 61203.9). Total num frames: 9033383936. Throughput: 0: 61416.0. Samples: 1938499720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:14,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 02:38:14,895][54818] Updated weights for policy 0, policy_version 551358 (0.0017) [2024-04-28 02:38:17,404][54798] Signal inference workers to stop experience collection... (31600 times) [2024-04-28 02:38:17,436][54818] InferenceWorker_p0-w0: stopping experience collection (31600 times) [2024-04-28 02:38:17,467][54798] Signal inference workers to resume experience collection... (31600 times) [2024-04-28 02:38:17,467][54818] InferenceWorker_p0-w0: resuming experience collection (31600 times) [2024-04-28 02:38:18,247][54818] Updated weights for policy 0, policy_version 551368 (0.0018) [2024-04-28 02:38:19,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9033678848. Throughput: 0: 61501.8. Samples: 1938868060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:19,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 02:38:20,935][54818] Updated weights for policy 0, policy_version 551378 (0.0017) [2024-04-28 02:38:23,669][54818] Updated weights for policy 0, policy_version 551388 (0.0017) [2024-04-28 02:38:24,253][54587] Fps is (10 sec: 58982.9, 60 sec: 61167.0, 300 sec: 61149.2). Total num frames: 9033973760. Throughput: 0: 61228.0. Samples: 1939227140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:24,253][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 02:38:26,191][54818] Updated weights for policy 0, policy_version 551398 (0.0015) [2024-04-28 02:38:27,836][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6700 times) [2024-04-28 02:38:28,965][54818] Updated weights for policy 0, policy_version 551408 (0.0016) [2024-04-28 02:38:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 9034285056. Throughput: 0: 61292.0. Samples: 1939412920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:29,253][54587] Avg episode reward: [(0, '0.701')] [2024-04-28 02:38:31,583][54818] Updated weights for policy 0, policy_version 551418 (0.0016) [2024-04-28 02:38:34,088][54818] Updated weights for policy 0, policy_version 551428 (0.0016) [2024-04-28 02:38:34,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9034596352. Throughput: 0: 61514.3. Samples: 1939790940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:34,253][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 02:38:37,108][54818] Updated weights for policy 0, policy_version 551438 (0.0016) [2024-04-28 02:38:39,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.8, 300 sec: 61259.7). Total num frames: 9034907648. Throughput: 0: 61271.8. Samples: 1940149580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:39,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 02:38:39,286][54818] Updated weights for policy 0, policy_version 551448 (0.0016) [2024-04-28 02:38:42,721][54818] Updated weights for policy 0, policy_version 551458 (0.0017) [2024-04-28 02:38:44,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 9035218944. Throughput: 0: 61258.6. Samples: 1940334220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:44,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-28 02:38:44,652][54818] Updated weights for policy 0, policy_version 551468 (0.0018) [2024-04-28 02:38:48,048][54818] Updated weights for policy 0, policy_version 551478 (0.0016) [2024-04-28 02:38:49,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 9035513856. Throughput: 0: 61280.3. Samples: 1940707180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:49,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 02:38:49,261][54587] No heartbeat for components: RolloutWorker_w4 (18877 seconds), RolloutWorker_w5 (4977 seconds) [2024-04-28 02:38:49,960][54818] Updated weights for policy 0, policy_version 551488 (0.0019) [2024-04-28 02:38:53,205][54818] Updated weights for policy 0, policy_version 551498 (0.0016) [2024-04-28 02:38:54,167][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6800 times) [2024-04-28 02:38:54,253][54587] Fps is (10 sec: 58982.9, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9035808768. Throughput: 0: 61283.6. Samples: 1941071480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:54,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 02:38:55,234][54818] Updated weights for policy 0, policy_version 551508 (0.0018) [2024-04-28 02:38:58,575][54818] Updated weights for policy 0, policy_version 551518 (0.0016) [2024-04-28 02:38:59,253][54587] Fps is (10 sec: 60621.9, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 9036120064. Throughput: 0: 61092.2. Samples: 1941248860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:38:59,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 02:38:59,728][54798] Signal inference workers to stop experience collection... (31650 times) [2024-04-28 02:38:59,765][54818] InferenceWorker_p0-w0: stopping experience collection (31650 times) [2024-04-28 02:38:59,820][54798] Signal inference workers to resume experience collection... 
(31650 times) [2024-04-28 02:38:59,820][54818] InferenceWorker_p0-w0: resuming experience collection (31650 times) [2024-04-28 02:39:00,539][54818] Updated weights for policy 0, policy_version 551528 (0.0016) [2024-04-28 02:39:03,837][54818] Updated weights for policy 0, policy_version 551538 (0.0019) [2024-04-28 02:39:04,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60620.6, 300 sec: 61148.4). Total num frames: 9036414976. Throughput: 0: 61152.7. Samples: 1941619940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:04,255][54587] Avg episode reward: [(0, '0.491')] [2024-04-28 02:39:06,232][54818] Updated weights for policy 0, policy_version 551548 (0.0017) [2024-04-28 02:39:09,121][54818] Updated weights for policy 0, policy_version 551558 (0.0016) [2024-04-28 02:39:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9036726272. Throughput: 0: 61414.6. Samples: 1941990800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:09,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 02:39:11,447][54818] Updated weights for policy 0, policy_version 551568 (0.0016) [2024-04-28 02:39:14,253][54587] Fps is (10 sec: 62260.1, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9037037568. Throughput: 0: 61175.9. Samples: 1942165840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:14,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 02:39:14,487][54818] Updated weights for policy 0, policy_version 551578 (0.0017) [2024-04-28 02:39:17,436][54818] Updated weights for policy 0, policy_version 551588 (0.0017) [2024-04-28 02:39:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9037348864. Throughput: 0: 61112.7. Samples: 1942541020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:19,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 02:39:19,696][54818] Updated weights for policy 0, policy_version 551598 (0.0017) [2024-04-28 02:39:20,605][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (6900 times) [2024-04-28 02:39:22,766][54818] Updated weights for policy 0, policy_version 551608 (0.0017) [2024-04-28 02:39:24,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9037643776. Throughput: 0: 61243.2. Samples: 1942905520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:24,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:39:25,094][54818] Updated weights for policy 0, policy_version 551618 (0.0018) [2024-04-28 02:39:28,152][54818] Updated weights for policy 0, policy_version 551628 (0.0015) [2024-04-28 02:39:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9037955072. Throughput: 0: 60951.1. Samples: 1943077020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:29,263][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 02:39:30,260][54818] Updated weights for policy 0, policy_version 551638 (0.0016) [2024-04-28 02:39:33,555][54818] Updated weights for policy 0, policy_version 551648 (0.0015) [2024-04-28 02:39:34,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9038266368. Throughput: 0: 61042.4. Samples: 1943454080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:34,263][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 02:39:35,703][54818] Updated weights for policy 0, policy_version 551658 (0.0019) [2024-04-28 02:39:38,859][54818] Updated weights for policy 0, policy_version 551668 (0.0016) [2024-04-28 02:39:39,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9038561280. 
Throughput: 0: 61165.3. Samples: 1943823920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:39,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 02:39:39,953][54798] Signal inference workers to stop experience collection... (31700 times) [2024-04-28 02:39:39,959][54798] Signal inference workers to resume experience collection... (31700 times) [2024-04-28 02:39:39,972][54818] InferenceWorker_p0-w0: stopping experience collection (31700 times) [2024-04-28 02:39:39,972][54818] InferenceWorker_p0-w0: resuming experience collection (31700 times) [2024-04-28 02:39:40,905][54818] Updated weights for policy 0, policy_version 551678 (0.0018) [2024-04-28 02:39:44,128][54818] Updated weights for policy 0, policy_version 551688 (0.0016) [2024-04-28 02:39:44,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9038872576. Throughput: 0: 61004.9. Samples: 1943994080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:44,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:39:46,205][54818] Updated weights for policy 0, policy_version 551698 (0.0018) [2024-04-28 02:39:47,976][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7000 times) [2024-04-28 02:39:49,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9039167488. Throughput: 0: 60985.6. Samples: 1944364280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:49,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 02:39:49,339][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000551708_9039183872.pth... 
[2024-04-28 02:39:49,344][54818] Updated weights for policy 0, policy_version 551708 (0.0019) [2024-04-28 02:39:49,397][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000550811_9024487424.pth [2024-04-28 02:39:51,816][54818] Updated weights for policy 0, policy_version 551718 (0.0016) [2024-04-28 02:39:54,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9039478784. Throughput: 0: 61184.8. Samples: 1944744120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:54,263][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:39:54,514][54818] Updated weights for policy 0, policy_version 551728 (0.0020) [2024-04-28 02:39:57,130][54818] Updated weights for policy 0, policy_version 551738 (0.0016) [2024-04-28 02:39:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9039773696. Throughput: 0: 61030.7. Samples: 1944912220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:39:59,254][54587] Avg episode reward: [(0, '0.681')] [2024-04-28 02:39:59,689][54818] Updated weights for policy 0, policy_version 551748 (0.0017) [2024-04-28 02:40:02,788][54818] Updated weights for policy 0, policy_version 551758 (0.0023) [2024-04-28 02:40:04,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9040084992. Throughput: 0: 61033.5. Samples: 1945287520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:40:04,253][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 02:40:04,971][54818] Updated weights for policy 0, policy_version 551768 (0.0015) [2024-04-28 02:40:07,968][54818] Updated weights for policy 0, policy_version 551778 (0.0017) [2024-04-28 02:40:09,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9040396288. Throughput: 0: 61247.0. Samples: 1945661640. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:40:09,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 02:40:10,686][54818] Updated weights for policy 0, policy_version 551788 (0.0018) [2024-04-28 02:40:11,378][54798] Signal inference workers to stop experience collection... (31750 times) [2024-04-28 02:40:11,412][54818] InferenceWorker_p0-w0: stopping experience collection (31750 times) [2024-04-28 02:40:11,467][54798] Signal inference workers to resume experience collection... (31750 times) [2024-04-28 02:40:11,467][54818] InferenceWorker_p0-w0: resuming experience collection (31750 times) [2024-04-28 02:40:13,505][54818] Updated weights for policy 0, policy_version 551798 (0.0017) [2024-04-28 02:40:14,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9040707584. Throughput: 0: 61282.3. Samples: 1945834720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:40:14,255][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 02:40:14,366][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7100 times) [2024-04-28 02:40:16,262][54818] Updated weights for policy 0, policy_version 551808 (0.0019) [2024-04-28 02:40:18,945][54818] Updated weights for policy 0, policy_version 551818 (0.0022) [2024-04-28 02:40:19,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9041018880. Throughput: 0: 61111.9. Samples: 1946204120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:40:19,255][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 02:40:21,649][54818] Updated weights for policy 0, policy_version 551828 (0.0017) [2024-04-28 02:40:24,108][54818] Updated weights for policy 0, policy_version 551838 (0.0016) [2024-04-28 02:40:24,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9041313792. Throughput: 0: 61141.6. Samples: 1946575300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:40:24,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 02:40:26,719][54818] Updated weights for policy 0, policy_version 551848 (0.0017) [2024-04-28 02:40:29,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 9041608704. Throughput: 0: 61358.7. Samples: 1946755220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:40:29,253][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 02:40:29,375][54818] Updated weights for policy 0, policy_version 551858 (0.0016) [2024-04-28 02:40:31,929][54818] Updated weights for policy 0, policy_version 551868 (0.0017) [2024-04-28 02:40:34,253][54587] Fps is (10 sec: 62260.3, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9041936384. Throughput: 0: 61280.0. Samples: 1947121880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 02:40:34,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 02:40:34,604][54818] Updated weights for policy 0, policy_version 551878 (0.0015) [2024-04-28 02:40:37,385][54818] Updated weights for policy 0, policy_version 551888 (0.0016) [2024-04-28 02:40:39,253][54587] Fps is (10 sec: 62257.9, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9042231296. Throughput: 0: 60937.3. Samples: 1947486300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:40:39,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 02:40:40,084][54818] Updated weights for policy 0, policy_version 551898 (0.0016) [2024-04-28 02:40:41,499][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7200 times) [2024-04-28 02:40:42,877][54818] Updated weights for policy 0, policy_version 551908 (0.0016) [2024-04-28 02:40:44,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9042542592. Throughput: 0: 61351.0. Samples: 1947673020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:40:44,255][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:40:45,167][54818] Updated weights for policy 0, policy_version 551918 (0.0017) [2024-04-28 02:40:48,053][54818] Updated weights for policy 0, policy_version 551928 (0.0019) [2024-04-28 02:40:49,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61439.8, 300 sec: 61259.5). Total num frames: 9042853888. Throughput: 0: 61281.1. Samples: 1948045180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:40:49,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 02:40:50,743][54818] Updated weights for policy 0, policy_version 551938 (0.0016) [2024-04-28 02:40:52,691][54798] Signal inference workers to stop experience collection... (31800 times) [2024-04-28 02:40:52,710][54818] InferenceWorker_p0-w0: stopping experience collection (31800 times) [2024-04-28 02:40:52,750][54798] Signal inference workers to resume experience collection... (31800 times) [2024-04-28 02:40:52,750][54818] InferenceWorker_p0-w0: resuming experience collection (31800 times) [2024-04-28 02:40:53,368][54818] Updated weights for policy 0, policy_version 551948 (0.0015) [2024-04-28 02:40:54,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9043165184. Throughput: 0: 60922.0. Samples: 1948403120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:40:54,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 02:40:56,340][54818] Updated weights for policy 0, policy_version 551958 (0.0016) [2024-04-28 02:40:58,650][54818] Updated weights for policy 0, policy_version 551968 (0.0020) [2024-04-28 02:40:59,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9043476480. Throughput: 0: 61337.4. Samples: 1948594900. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:40:59,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-28 02:41:01,529][54818] Updated weights for policy 0, policy_version 551978 (0.0015) [2024-04-28 02:41:03,947][54818] Updated weights for policy 0, policy_version 551988 (0.0020) [2024-04-28 02:41:04,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9043771392. Throughput: 0: 61282.4. Samples: 1948961820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:04,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:41:07,116][54818] Updated weights for policy 0, policy_version 551998 (0.0015) [2024-04-28 02:41:07,898][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7300 times) [2024-04-28 02:41:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.1, 300 sec: 61315.1). Total num frames: 9044082688. Throughput: 0: 60994.4. Samples: 1949320040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:09,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 02:41:09,318][54818] Updated weights for policy 0, policy_version 552008 (0.0018) [2024-04-28 02:41:12,501][54818] Updated weights for policy 0, policy_version 552018 (0.0017) [2024-04-28 02:41:14,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.1, 300 sec: 61315.1). Total num frames: 9044393984. Throughput: 0: 61289.8. Samples: 1949513260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:14,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 02:41:15,053][54818] Updated weights for policy 0, policy_version 552028 (0.0015) [2024-04-28 02:41:17,749][54818] Updated weights for policy 0, policy_version 552038 (0.0016) [2024-04-28 02:41:19,253][54587] Fps is (10 sec: 62257.9, 60 sec: 61439.8, 300 sec: 61315.0). Total num frames: 9044705280. Throughput: 0: 61265.5. Samples: 1949878840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:19,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 02:41:20,386][54818] Updated weights for policy 0, policy_version 552048 (0.0018) [2024-04-28 02:41:22,870][54818] Updated weights for policy 0, policy_version 552058 (0.0017) [2024-04-28 02:41:24,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61713.2, 300 sec: 61315.1). Total num frames: 9045016576. Throughput: 0: 61299.3. Samples: 1950244760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 02:41:25,619][54818] Updated weights for policy 0, policy_version 552068 (0.0016) [2024-04-28 02:41:28,087][54818] Updated weights for policy 0, policy_version 552078 (0.0017) [2024-04-28 02:41:29,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61986.0, 300 sec: 61370.6). Total num frames: 9045327872. Throughput: 0: 61417.3. Samples: 1950436800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:29,255][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 02:41:30,952][54818] Updated weights for policy 0, policy_version 552088 (0.0016) [2024-04-28 02:41:33,449][54818] Updated weights for policy 0, policy_version 552098 (0.0015) [2024-04-28 02:41:34,053][54798] Signal inference workers to stop experience collection... (31850 times) [2024-04-28 02:41:34,053][54798] Signal inference workers to resume experience collection... (31850 times) [2024-04-28 02:41:34,061][54818] InferenceWorker_p0-w0: stopping experience collection (31850 times) [2024-04-28 02:41:34,061][54818] InferenceWorker_p0-w0: resuming experience collection (31850 times) [2024-04-28 02:41:34,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.0, 300 sec: 61370.6). Total num frames: 9045639168. Throughput: 0: 61157.0. Samples: 1950797240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:34,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-28 02:41:35,056][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7400 times) [2024-04-28 02:41:36,164][54818] Updated weights for policy 0, policy_version 552108 (0.0016) [2024-04-28 02:41:38,814][54818] Updated weights for policy 0, policy_version 552118 (0.0017) [2024-04-28 02:41:39,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61986.2, 300 sec: 61370.6). Total num frames: 9045950464. Throughput: 0: 61352.7. Samples: 1951164000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:39,256][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 02:41:41,997][54818] Updated weights for policy 0, policy_version 552128 (0.0022) [2024-04-28 02:41:43,926][54818] Updated weights for policy 0, policy_version 552138 (0.0017) [2024-04-28 02:41:44,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61713.2, 300 sec: 61315.1). Total num frames: 9046245376. Throughput: 0: 61235.7. Samples: 1951350500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:44,253][54587] Avg episode reward: [(0, '0.693')] [2024-04-28 02:41:47,214][54818] Updated weights for policy 0, policy_version 552148 (0.0016) [2024-04-28 02:41:49,188][54818] Updated weights for policy 0, policy_version 552158 (0.0016) [2024-04-28 02:41:49,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61713.1, 300 sec: 61370.6). Total num frames: 9046556672. Throughput: 0: 61177.6. Samples: 1951714820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:49,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 02:41:49,261][54587] No heartbeat for components: RolloutWorker_w4 (19057 seconds), RolloutWorker_w5 (5157 seconds) [2024-04-28 02:41:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000552158_9046556672.pth... 
[2024-04-28 02:41:49,314][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000551260_9031843840.pth [2024-04-28 02:41:52,537][54818] Updated weights for policy 0, policy_version 552168 (0.0016) [2024-04-28 02:41:54,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9046851584. Throughput: 0: 61234.3. Samples: 1952075580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:54,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:41:54,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (400 times) [2024-04-28 02:41:54,945][54818] Updated weights for policy 0, policy_version 552178 (0.0018) [2024-04-28 02:41:57,853][54818] Updated weights for policy 0, policy_version 552188 (0.0019) [2024-04-28 02:41:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9047162880. Throughput: 0: 61215.9. Samples: 1952267980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:41:59,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 02:42:00,549][54818] Updated weights for policy 0, policy_version 552198 (0.0017) [2024-04-28 02:42:01,788][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7500 times) [2024-04-28 02:42:03,110][54818] Updated weights for policy 0, policy_version 552208 (0.0017) [2024-04-28 02:42:04,255][54587] Fps is (10 sec: 60611.8, 60 sec: 61438.4, 300 sec: 61259.2). Total num frames: 9047457792. Throughput: 0: 61340.5. Samples: 1952639240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 02:42:04,255][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 02:42:06,053][54818] Updated weights for policy 0, policy_version 552218 (0.0017) [2024-04-28 02:42:08,304][54818] Updated weights for policy 0, policy_version 552228 (0.0016) [2024-04-28 02:42:09,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.1, 300 sec: 61370.6). Total num frames: 9047785472. Throughput: 0: 61147.1. Samples: 1952996380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:09,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:42:11,194][54818] Updated weights for policy 0, policy_version 552238 (0.0016) [2024-04-28 02:42:13,493][54818] Updated weights for policy 0, policy_version 552248 (0.0016) [2024-04-28 02:42:14,253][54587] Fps is (10 sec: 60630.0, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9048064000. Throughput: 0: 61090.0. Samples: 1953185840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:14,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:42:14,349][54798] Signal inference workers to stop experience collection... (31900 times) [2024-04-28 02:42:14,389][54818] InferenceWorker_p0-w0: stopping experience collection (31900 times) [2024-04-28 02:42:14,401][54798] Signal inference workers to resume experience collection... (31900 times) [2024-04-28 02:42:14,407][54818] InferenceWorker_p0-w0: resuming experience collection (31900 times) [2024-04-28 02:42:16,517][54818] Updated weights for policy 0, policy_version 552258 (0.0017) [2024-04-28 02:42:18,892][54818] Updated weights for policy 0, policy_version 552268 (0.0015) [2024-04-28 02:42:19,254][54587] Fps is (10 sec: 60619.4, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9048391680. Throughput: 0: 61409.9. Samples: 1953560700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:19,255][54587] Avg episode reward: [(0, '0.675')] [2024-04-28 02:42:21,811][54818] Updated weights for policy 0, policy_version 552278 (0.0019) [2024-04-28 02:42:24,112][54818] Updated weights for policy 0, policy_version 552288 (0.0019) [2024-04-28 02:42:24,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9048686592. Throughput: 0: 61150.8. Samples: 1953915780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:24,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 02:42:27,168][54818] Updated weights for policy 0, policy_version 552298 (0.0016) [2024-04-28 02:42:28,084][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7600 times) [2024-04-28 02:42:29,255][54587] Fps is (10 sec: 60611.1, 60 sec: 61165.2, 300 sec: 61314.6). Total num frames: 9048997888. Throughput: 0: 61335.6. Samples: 1954110720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:29,256][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 02:42:29,404][54818] Updated weights for policy 0, policy_version 552308 (0.0016) [2024-04-28 02:42:32,676][54818] Updated weights for policy 0, policy_version 552318 (0.0015) [2024-04-28 02:42:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9049292800. Throughput: 0: 61430.7. Samples: 1954479200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:34,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 02:42:34,751][54818] Updated weights for policy 0, policy_version 552328 (0.0019) [2024-04-28 02:42:37,863][54818] Updated weights for policy 0, policy_version 552338 (0.0016) [2024-04-28 02:42:39,253][54587] Fps is (10 sec: 60632.3, 60 sec: 60894.1, 300 sec: 61315.0). Total num frames: 9049604096. Throughput: 0: 61375.6. Samples: 1954837480. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:39,253][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 02:42:40,522][54818] Updated weights for policy 0, policy_version 552348 (0.0015) [2024-04-28 02:42:42,993][54818] Updated weights for policy 0, policy_version 552358 (0.0016) [2024-04-28 02:42:44,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9049915392. Throughput: 0: 61455.4. Samples: 1955033480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:44,255][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 02:42:45,755][54818] Updated weights for policy 0, policy_version 552368 (0.0016) [2024-04-28 02:42:48,168][54818] Updated weights for policy 0, policy_version 552378 (0.0017) [2024-04-28 02:42:49,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 9050210304. Throughput: 0: 61339.0. Samples: 1955399400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:49,253][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 02:42:51,660][54818] Updated weights for policy 0, policy_version 552388 (0.0016) [2024-04-28 02:42:51,859][54798] Signal inference workers to stop experience collection... (31950 times) [2024-04-28 02:42:51,879][54818] InferenceWorker_p0-w0: stopping experience collection (31950 times) [2024-04-28 02:42:51,918][54798] Signal inference workers to resume experience collection... (31950 times) [2024-04-28 02:42:51,918][54818] InferenceWorker_p0-w0: resuming experience collection (31950 times) [2024-04-28 02:42:53,533][54818] Updated weights for policy 0, policy_version 552398 (0.0021) [2024-04-28 02:42:54,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9050521600. Throughput: 0: 61334.2. Samples: 1955756420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:54,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 02:42:54,739][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7700 times) [2024-04-28 02:42:56,832][54818] Updated weights for policy 0, policy_version 552408 (0.0017) [2024-04-28 02:42:58,884][54818] Updated weights for policy 0, policy_version 552418 (0.0018) [2024-04-28 02:42:59,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9050816512. Throughput: 0: 61471.5. Samples: 1955952060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:42:59,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 02:43:02,157][54818] Updated weights for policy 0, policy_version 552428 (0.0016) [2024-04-28 02:43:04,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61168.3, 300 sec: 61203.9). Total num frames: 9051127808. Throughput: 0: 61110.4. Samples: 1956310660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:43:04,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:43:04,440][54818] Updated weights for policy 0, policy_version 552438 (0.0016) [2024-04-28 02:43:07,381][54818] Updated weights for policy 0, policy_version 552448 (0.0016) [2024-04-28 02:43:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9051439104. Throughput: 0: 61292.1. Samples: 1956673920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:43:09,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 02:43:09,766][54818] Updated weights for policy 0, policy_version 552458 (0.0015) [2024-04-28 02:43:12,803][54818] Updated weights for policy 0, policy_version 552468 (0.0017) [2024-04-28 02:43:14,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9051750400. Throughput: 0: 61157.1. Samples: 1956862680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:43:14,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 02:43:15,199][54818] Updated weights for policy 0, policy_version 552478 (0.0018) [2024-04-28 02:43:17,900][54818] Updated weights for policy 0, policy_version 552488 (0.0018) [2024-04-28 02:43:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60894.1, 300 sec: 61259.5). Total num frames: 9052045312. Throughput: 0: 61000.5. Samples: 1957224220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:43:19,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 02:43:20,467][54818] Updated weights for policy 0, policy_version 552498 (0.0016) [2024-04-28 02:43:22,250][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7800 times) [2024-04-28 02:43:23,238][54818] Updated weights for policy 0, policy_version 552508 (0.0016) [2024-04-28 02:43:24,253][54587] Fps is (10 sec: 58983.1, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9052340224. Throughput: 0: 61129.8. Samples: 1957588320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:43:24,253][54587] Avg episode reward: [(0, '0.723')] [2024-04-28 02:43:26,112][54818] Updated weights for policy 0, policy_version 552518 (0.0021) [2024-04-28 02:43:28,676][54818] Updated weights for policy 0, policy_version 552528 (0.0017) [2024-04-28 02:43:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60895.6, 300 sec: 61203.9). Total num frames: 9052651520. Throughput: 0: 60917.8. Samples: 1957774780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:43:29,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 02:43:31,564][54818] Updated weights for policy 0, policy_version 552538 (0.0015) [2024-04-28 02:43:34,152][54818] Updated weights for policy 0, policy_version 552548 (0.0015) [2024-04-28 02:43:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9052946432. 
Throughput: 0: 60825.2. Samples: 1958136540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 02:43:34,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 02:43:36,756][54818] Updated weights for policy 0, policy_version 552558 (0.0016) [2024-04-28 02:43:39,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9053257728. Throughput: 0: 61031.6. Samples: 1958502840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:43:39,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 02:43:39,498][54818] Updated weights for policy 0, policy_version 552568 (0.0017) [2024-04-28 02:43:39,895][54798] Signal inference workers to stop experience collection... (32000 times) [2024-04-28 02:43:39,895][54798] Signal inference workers to resume experience collection... (32000 times) [2024-04-28 02:43:39,923][54818] InferenceWorker_p0-w0: stopping experience collection (32000 times) [2024-04-28 02:43:39,923][54818] InferenceWorker_p0-w0: resuming experience collection (32000 times) [2024-04-28 02:43:42,334][54818] Updated weights for policy 0, policy_version 552578 (0.0017) [2024-04-28 02:43:44,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9053552640. Throughput: 0: 60785.2. Samples: 1958687400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:43:44,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 02:43:44,939][54818] Updated weights for policy 0, policy_version 552588 (0.0016) [2024-04-28 02:43:47,512][54818] Updated weights for policy 0, policy_version 552598 (0.0017) [2024-04-28 02:43:48,626][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (7900 times) [2024-04-28 02:43:49,253][54587] Fps is (10 sec: 58982.6, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9053847552. Throughput: 0: 61051.8. Samples: 1959057980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:43:49,253][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 02:43:49,284][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000552604_9053863936.pth... [2024-04-28 02:43:49,349][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000551708_9039183872.pth [2024-04-28 02:43:50,141][54818] Updated weights for policy 0, policy_version 552608 (0.0016) [2024-04-28 02:43:52,744][54818] Updated weights for policy 0, policy_version 552618 (0.0016) [2024-04-28 02:43:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9054158848. Throughput: 0: 61067.0. Samples: 1959421940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:43:54,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 02:43:55,423][54818] Updated weights for policy 0, policy_version 552628 (0.0016) [2024-04-28 02:43:57,860][54818] Updated weights for policy 0, policy_version 552638 (0.0016) [2024-04-28 02:43:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60620.7, 300 sec: 61148.4). Total num frames: 9054453760. Throughput: 0: 61006.6. Samples: 1959607980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:43:59,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 02:44:00,685][54818] Updated weights for policy 0, policy_version 552648 (0.0017) [2024-04-28 02:44:03,305][54818] Updated weights for policy 0, policy_version 552658 (0.0016) [2024-04-28 02:44:04,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60621.0, 300 sec: 61148.4). Total num frames: 9054765056. Throughput: 0: 61179.2. Samples: 1959977280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:04,253][54587] Avg episode reward: [(0, '0.532')] [2024-04-28 02:44:05,917][54818] Updated weights for policy 0, policy_version 552668 (0.0016) [2024-04-28 02:44:08,900][54818] Updated weights for policy 0, policy_version 552678 (0.0016) [2024-04-28 02:44:09,253][54587] Fps is (10 sec: 62259.1, 60 sec: 60620.7, 300 sec: 61148.4). Total num frames: 9055076352. Throughput: 0: 61369.6. Samples: 1960349960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:09,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 02:44:11,344][54818] Updated weights for policy 0, policy_version 552688 (0.0018) [2024-04-28 02:44:14,253][54587] Fps is (10 sec: 62258.5, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9055387648. Throughput: 0: 61319.2. Samples: 1960534140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:14,255][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 02:44:14,482][54818] Updated weights for policy 0, policy_version 552698 (0.0016) [2024-04-28 02:44:15,393][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8000 times) [2024-04-28 02:44:16,872][54818] Updated weights for policy 0, policy_version 552708 (0.0021) [2024-04-28 02:44:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.7, 300 sec: 61148.4). Total num frames: 9055682560. Throughput: 0: 61423.9. Samples: 1960900620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:19,255][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 02:44:19,791][54818] Updated weights for policy 0, policy_version 552718 (0.0015) [2024-04-28 02:44:20,081][54798] Signal inference workers to stop experience collection... (32050 times) [2024-04-28 02:44:20,086][54798] Signal inference workers to resume experience collection... 
(32050 times) [2024-04-28 02:44:20,098][54818] InferenceWorker_p0-w0: stopping experience collection (32050 times) [2024-04-28 02:44:20,098][54818] InferenceWorker_p0-w0: resuming experience collection (32050 times) [2024-04-28 02:44:22,070][54818] Updated weights for policy 0, policy_version 552728 (0.0018) [2024-04-28 02:44:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9055993856. Throughput: 0: 61312.9. Samples: 1961261920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:24,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 02:44:25,105][54818] Updated weights for policy 0, policy_version 552738 (0.0016) [2024-04-28 02:44:27,303][54818] Updated weights for policy 0, policy_version 552748 (0.0020) [2024-04-28 02:44:29,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9056305152. Throughput: 0: 61180.0. Samples: 1961440500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:29,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 02:44:30,310][54818] Updated weights for policy 0, policy_version 552758 (0.0017) [2024-04-28 02:44:33,040][54818] Updated weights for policy 0, policy_version 552768 (0.0016) [2024-04-28 02:44:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9056616448. Throughput: 0: 61208.4. Samples: 1961812360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:34,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 02:44:35,554][54818] Updated weights for policy 0, policy_version 552778 (0.0017) [2024-04-28 02:44:38,200][54818] Updated weights for policy 0, policy_version 552788 (0.0015) [2024-04-28 02:44:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9056911360. Throughput: 0: 61238.3. Samples: 1962177660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:39,253][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 02:44:40,947][54818] Updated weights for policy 0, policy_version 552798 (0.0016) [2024-04-28 02:44:41,810][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8100 times) [2024-04-28 02:44:43,456][54818] Updated weights for policy 0, policy_version 552808 (0.0015) [2024-04-28 02:44:44,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9057239040. Throughput: 0: 61202.3. Samples: 1962362080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:44,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:44:46,086][54818] Updated weights for policy 0, policy_version 552818 (0.0017) [2024-04-28 02:44:49,119][54818] Updated weights for policy 0, policy_version 552828 (0.0016) [2024-04-28 02:44:49,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9057533952. Throughput: 0: 61276.3. Samples: 1962734720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:49,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 02:44:49,264][54587] No heartbeat for components: RolloutWorker_w4 (19237 seconds), RolloutWorker_w5 (5337 seconds) [2024-04-28 02:44:51,469][54818] Updated weights for policy 0, policy_version 552838 (0.0017) [2024-04-28 02:44:54,254][54587] Fps is (10 sec: 60618.7, 60 sec: 61439.7, 300 sec: 61259.4). Total num frames: 9057845248. Throughput: 0: 61247.6. Samples: 1963106120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:54,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 02:44:54,685][54818] Updated weights for policy 0, policy_version 552848 (0.0015) [2024-04-28 02:44:56,587][54818] Updated weights for policy 0, policy_version 552858 (0.0016) [2024-04-28 02:44:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.1, 300 sec: 61259.5). 
Total num frames: 9058156544. Throughput: 0: 61034.2. Samples: 1963280680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:44:59,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 02:44:59,840][54818] Updated weights for policy 0, policy_version 552868 (0.0016) [2024-04-28 02:45:01,350][54798] Signal inference workers to stop experience collection... (32100 times) [2024-04-28 02:45:01,368][54818] InferenceWorker_p0-w0: stopping experience collection (32100 times) [2024-04-28 02:45:01,405][54798] Signal inference workers to resume experience collection... (32100 times) [2024-04-28 02:45:01,405][54818] InferenceWorker_p0-w0: resuming experience collection (32100 times) [2024-04-28 02:45:01,828][54818] Updated weights for policy 0, policy_version 552878 (0.0021) [2024-04-28 02:45:04,253][54587] Fps is (10 sec: 62260.8, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 9058467840. Throughput: 0: 61200.9. Samples: 1963654660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 02:45:04,255][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 02:45:05,291][54818] Updated weights for policy 0, policy_version 552888 (0.0016) [2024-04-28 02:45:07,600][54818] Updated weights for policy 0, policy_version 552898 (0.0023) [2024-04-28 02:45:08,794][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8200 times) [2024-04-28 02:45:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9058762752. Throughput: 0: 61495.9. Samples: 1964029240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:09,255][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 02:45:10,506][54818] Updated weights for policy 0, policy_version 552908 (0.0015) [2024-04-28 02:45:12,873][54818] Updated weights for policy 0, policy_version 552918 (0.0018) [2024-04-28 02:45:14,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.1, 300 sec: 61204.0). 
Total num frames: 9059074048. Throughput: 0: 61376.1. Samples: 1964202420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:14,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 02:45:15,705][54818] Updated weights for policy 0, policy_version 552928 (0.0016) [2024-04-28 02:45:18,256][54818] Updated weights for policy 0, policy_version 552938 (0.0017) [2024-04-28 02:45:19,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9059385344. Throughput: 0: 61518.6. Samples: 1964580700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:19,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 02:45:20,892][54818] Updated weights for policy 0, policy_version 552948 (0.0017) [2024-04-28 02:45:23,810][54818] Updated weights for policy 0, policy_version 552958 (0.0017) [2024-04-28 02:45:24,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9059696640. Throughput: 0: 61703.9. Samples: 1964954340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:24,255][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 02:45:26,049][54818] Updated weights for policy 0, policy_version 552968 (0.0017) [2024-04-28 02:45:28,966][54818] Updated weights for policy 0, policy_version 552978 (0.0016) [2024-04-28 02:45:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9059991552. Throughput: 0: 61447.1. Samples: 1965127200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:29,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-28 02:45:31,374][54818] Updated weights for policy 0, policy_version 552988 (0.0016) [2024-04-28 02:45:34,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9060302848. Throughput: 0: 61415.2. Samples: 1965498400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:34,253][54587] Avg episode reward: [(0, '0.741')] [2024-04-28 02:45:34,317][54818] Updated weights for policy 0, policy_version 552998 (0.0021) [2024-04-28 02:45:35,629][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8300 times) [2024-04-28 02:45:36,619][54818] Updated weights for policy 0, policy_version 553008 (0.0019) [2024-04-28 02:45:39,253][54587] Fps is (10 sec: 63897.5, 60 sec: 61986.1, 300 sec: 61315.0). Total num frames: 9060630528. Throughput: 0: 61566.7. Samples: 1965876600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:39,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-28 02:45:39,429][54818] Updated weights for policy 0, policy_version 553018 (0.0017) [2024-04-28 02:45:42,052][54818] Updated weights for policy 0, policy_version 553028 (0.0016) [2024-04-28 02:45:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9060909056. Throughput: 0: 61499.2. Samples: 1966048140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:44,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-28 02:45:45,463][54818] Updated weights for policy 0, policy_version 553038 (0.0018) [2024-04-28 02:45:46,264][54798] Signal inference workers to stop experience collection... (32150 times) [2024-04-28 02:45:46,267][54798] Signal inference workers to resume experience collection... (32150 times) [2024-04-28 02:45:46,275][54818] InferenceWorker_p0-w0: stopping experience collection (32150 times) [2024-04-28 02:45:46,275][54818] InferenceWorker_p0-w0: resuming experience collection (32150 times) [2024-04-28 02:45:47,387][54818] Updated weights for policy 0, policy_version 553048 (0.0019) [2024-04-28 02:45:49,253][54587] Fps is (10 sec: 58982.7, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9061220352. Throughput: 0: 61371.3. Samples: 1966416360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:49,255][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 02:45:49,400][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000553055_9061253120.pth... [2024-04-28 02:45:49,435][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000552158_9046556672.pth [2024-04-28 02:45:50,704][54818] Updated weights for policy 0, policy_version 553058 (0.0018) [2024-04-28 02:45:53,011][54818] Updated weights for policy 0, policy_version 553068 (0.0015) [2024-04-28 02:45:54,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.4, 300 sec: 61204.0). Total num frames: 9061531648. Throughput: 0: 61328.6. Samples: 1966789020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:54,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 02:45:55,887][54818] Updated weights for policy 0, policy_version 553078 (0.0020) [2024-04-28 02:45:58,333][54818] Updated weights for policy 0, policy_version 553088 (0.0019) [2024-04-28 02:45:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9061826560. Throughput: 0: 61288.4. Samples: 1966960400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:45:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 02:46:01,193][54818] Updated weights for policy 0, policy_version 553098 (0.0017) [2024-04-28 02:46:01,936][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8400 times) [2024-04-28 02:46:04,016][54818] Updated weights for policy 0, policy_version 553108 (0.0017) [2024-04-28 02:46:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9062137856. Throughput: 0: 61193.1. Samples: 1967334380. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:46:04,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 02:46:06,417][54818] Updated weights for policy 0, policy_version 553118 (0.0016) [2024-04-28 02:46:09,192][54818] Updated weights for policy 0, policy_version 553128 (0.0019) [2024-04-28 02:46:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9062449152. Throughput: 0: 61091.6. Samples: 1967703460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:46:09,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 02:46:11,657][54818] Updated weights for policy 0, policy_version 553138 (0.0016) [2024-04-28 02:46:14,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9062744064. Throughput: 0: 61120.3. Samples: 1967877620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:46:14,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 02:46:14,628][54818] Updated weights for policy 0, policy_version 553148 (0.0017) [2024-04-28 02:46:16,903][54818] Updated weights for policy 0, policy_version 553158 (0.0018) [2024-04-28 02:46:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9063055360. Throughput: 0: 61050.2. Samples: 1968245660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:46:19,254][54587] Avg episode reward: [(0, '0.513')] [2024-04-28 02:46:19,895][54818] Updated weights for policy 0, policy_version 553168 (0.0015) [2024-04-28 02:46:22,086][54818] Updated weights for policy 0, policy_version 553178 (0.0018) [2024-04-28 02:46:24,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9063350272. Throughput: 0: 60865.8. Samples: 1968615560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:46:24,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 02:46:25,397][54818] Updated weights for policy 0, policy_version 553188 (0.0016) [2024-04-28 02:46:25,966][54798] Signal inference workers to stop experience collection... (32200 times) [2024-04-28 02:46:25,967][54798] Signal inference workers to resume experience collection... (32200 times) [2024-04-28 02:46:25,981][54818] InferenceWorker_p0-w0: stopping experience collection (32200 times) [2024-04-28 02:46:25,981][54818] InferenceWorker_p0-w0: resuming experience collection (32200 times) [2024-04-28 02:46:27,933][54818] Updated weights for policy 0, policy_version 553198 (0.0017) [2024-04-28 02:46:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9063661568. Throughput: 0: 61111.3. Samples: 1968798160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:46:29,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 02:46:29,361][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8500 times) [2024-04-28 02:46:30,786][54818] Updated weights for policy 0, policy_version 553208 (0.0017) [2024-04-28 02:46:33,296][54818] Updated weights for policy 0, policy_version 553218 (0.0016) [2024-04-28 02:46:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.8, 300 sec: 61037.4). Total num frames: 9063956480. Throughput: 0: 61120.4. Samples: 1969166780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 02:46:34,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 02:46:35,991][54818] Updated weights for policy 0, policy_version 553228 (0.0016) [2024-04-28 02:46:38,706][54818] Updated weights for policy 0, policy_version 553238 (0.0016) [2024-04-28 02:46:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60620.8, 300 sec: 61092.9). Total num frames: 9064267776. Throughput: 0: 61029.2. Samples: 1969535340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:46:39,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 02:46:41,501][54818] Updated weights for policy 0, policy_version 553248 (0.0017) [2024-04-28 02:46:43,949][54818] Updated weights for policy 0, policy_version 553258 (0.0016) [2024-04-28 02:46:44,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9064579072. Throughput: 0: 61238.7. Samples: 1969716140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:46:44,255][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 02:46:46,548][54818] Updated weights for policy 0, policy_version 553268 (0.0017) [2024-04-28 02:46:49,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9064890368. Throughput: 0: 61007.8. Samples: 1970079740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:46:49,255][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 02:46:49,610][54818] Updated weights for policy 0, policy_version 553278 (0.0017) [2024-04-28 02:46:51,952][54818] Updated weights for policy 0, policy_version 553288 (0.0018) [2024-04-28 02:46:54,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.7, 300 sec: 61092.9). Total num frames: 9065185280. Throughput: 0: 61075.1. Samples: 1970451840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:46:54,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 02:46:54,919][54818] Updated weights for policy 0, policy_version 553298 (0.0017) [2024-04-28 02:46:55,983][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8600 times) [2024-04-28 02:46:57,284][54818] Updated weights for policy 0, policy_version 553308 (0.0016) [2024-04-28 02:46:59,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.0, 300 sec: 61148.7). Total num frames: 9065496576. Throughput: 0: 61345.0. Samples: 1970638140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:46:59,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 02:47:00,200][54818] Updated weights for policy 0, policy_version 553318 (0.0016) [2024-04-28 02:47:02,560][54818] Updated weights for policy 0, policy_version 553328 (0.0022) [2024-04-28 02:47:04,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9065807872. Throughput: 0: 61144.4. Samples: 1970997160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:04,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 02:47:05,437][54818] Updated weights for policy 0, policy_version 553338 (0.0018) [2024-04-28 02:47:07,889][54818] Updated weights for policy 0, policy_version 553348 (0.0015) [2024-04-28 02:47:09,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9066119168. Throughput: 0: 61183.8. Samples: 1971368840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:09,255][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 02:47:10,630][54818] Updated weights for policy 0, policy_version 553358 (0.0015) [2024-04-28 02:47:13,248][54818] Updated weights for policy 0, policy_version 553368 (0.0021) [2024-04-28 02:47:13,611][54798] Signal inference workers to stop experience collection... (32250 times) [2024-04-28 02:47:13,611][54798] Signal inference workers to resume experience collection... (32250 times) [2024-04-28 02:47:13,620][54818] InferenceWorker_p0-w0: stopping experience collection (32250 times) [2024-04-28 02:47:13,620][54818] InferenceWorker_p0-w0: resuming experience collection (32250 times) [2024-04-28 02:47:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9066414080. Throughput: 0: 61254.3. Samples: 1971554600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 02:47:15,993][54818] Updated weights for policy 0, policy_version 553378 (0.0016) [2024-04-28 02:47:18,446][54818] Updated weights for policy 0, policy_version 553388 (0.0016) [2024-04-28 02:47:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9066725376. Throughput: 0: 61178.6. Samples: 1971919820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:19,254][54587] Avg episode reward: [(0, '0.447')] [2024-04-28 02:47:21,212][54818] Updated weights for policy 0, policy_version 553398 (0.0018) [2024-04-28 02:47:22,666][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8700 times) [2024-04-28 02:47:24,073][54818] Updated weights for policy 0, policy_version 553408 (0.0016) [2024-04-28 02:47:24,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.9, 300 sec: 61148.8). Total num frames: 9067036672. Throughput: 0: 61124.9. Samples: 1972285960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:24,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 02:47:26,598][54818] Updated weights for policy 0, policy_version 553418 (0.0018) [2024-04-28 02:47:29,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9067347968. Throughput: 0: 61220.9. Samples: 1972471080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:29,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 02:47:29,692][54818] Updated weights for policy 0, policy_version 553428 (0.0016) [2024-04-28 02:47:32,110][54818] Updated weights for policy 0, policy_version 553438 (0.0019) [2024-04-28 02:47:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9067642880. Throughput: 0: 61369.8. Samples: 1972841380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:34,255][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 02:47:35,059][54818] Updated weights for policy 0, policy_version 553448 (0.0017) [2024-04-28 02:47:37,514][54818] Updated weights for policy 0, policy_version 553458 (0.0019) [2024-04-28 02:47:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9067954176. Throughput: 0: 61072.0. Samples: 1973200080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:39,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 02:47:40,408][54818] Updated weights for policy 0, policy_version 553468 (0.0015) [2024-04-28 02:47:42,886][54818] Updated weights for policy 0, policy_version 553478 (0.0020) [2024-04-28 02:47:44,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9068249088. Throughput: 0: 61160.1. Samples: 1973390340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:44,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:47:45,785][54818] Updated weights for policy 0, policy_version 553488 (0.0017) [2024-04-28 02:47:48,099][54818] Updated weights for policy 0, policy_version 553498 (0.0018) [2024-04-28 02:47:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9068560384. Throughput: 0: 61231.8. Samples: 1973752600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:49,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 02:47:49,266][54587] No heartbeat for components: RolloutWorker_w4 (19417 seconds), RolloutWorker_w5 (5517 seconds) [2024-04-28 02:47:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000553501_9068560384.pth... 
[2024-04-28 02:47:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000552604_9053863936.pth [2024-04-28 02:47:49,541][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8800 times) [2024-04-28 02:47:50,981][54818] Updated weights for policy 0, policy_version 553508 (0.0017) [2024-04-28 02:47:53,250][54818] Updated weights for policy 0, policy_version 553518 (0.0016) [2024-04-28 02:47:54,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9068871680. Throughput: 0: 61071.6. Samples: 1974117060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:54,255][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 02:47:56,243][54818] Updated weights for policy 0, policy_version 553528 (0.0020) [2024-04-28 02:47:56,757][54798] Signal inference workers to stop experience collection... (32300 times) [2024-04-28 02:47:56,758][54798] Signal inference workers to resume experience collection... (32300 times) [2024-04-28 02:47:56,771][54818] InferenceWorker_p0-w0: stopping experience collection (32300 times) [2024-04-28 02:47:56,771][54818] InferenceWorker_p0-w0: resuming experience collection (32300 times) [2024-04-28 02:47:58,887][54818] Updated weights for policy 0, policy_version 553538 (0.0016) [2024-04-28 02:47:59,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9069166592. Throughput: 0: 61137.8. Samples: 1974305800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:47:59,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 02:48:01,835][54818] Updated weights for policy 0, policy_version 553548 (0.0017) [2024-04-28 02:48:04,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9069477888. Throughput: 0: 60969.9. Samples: 1974663460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 02:48:04,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 02:48:04,301][54818] Updated weights for policy 0, policy_version 553558 (0.0017) [2024-04-28 02:48:07,163][54818] Updated weights for policy 0, policy_version 553568 (0.0016) [2024-04-28 02:48:09,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9069789184. Throughput: 0: 60994.7. Samples: 1975030720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:09,254][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 02:48:09,790][54818] Updated weights for policy 0, policy_version 553578 (0.0017) [2024-04-28 02:48:12,374][54818] Updated weights for policy 0, policy_version 553588 (0.0017) [2024-04-28 02:48:14,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9070100480. Throughput: 0: 61170.2. Samples: 1975223740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:14,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 02:48:15,037][54818] Updated weights for policy 0, policy_version 553598 (0.0016) [2024-04-28 02:48:16,152][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (8900 times) [2024-04-28 02:48:17,567][54818] Updated weights for policy 0, policy_version 553608 (0.0022) [2024-04-28 02:48:19,253][54587] Fps is (10 sec: 63896.8, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9070428160. Throughput: 0: 60962.6. Samples: 1975584700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:19,255][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:48:20,599][54818] Updated weights for policy 0, policy_version 553618 (0.0019) [2024-04-28 02:48:23,082][54818] Updated weights for policy 0, policy_version 553628 (0.0019) [2024-04-28 02:48:24,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9070706688. 
Throughput: 0: 61142.2. Samples: 1975951480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:24,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:48:25,990][54818] Updated weights for policy 0, policy_version 553638 (0.0020) [2024-04-28 02:48:28,476][54818] Updated weights for policy 0, policy_version 553648 (0.0018) [2024-04-28 02:48:29,253][54587] Fps is (10 sec: 58982.9, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9071017984. Throughput: 0: 61079.8. Samples: 1976138940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:29,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:48:31,321][54818] Updated weights for policy 0, policy_version 553658 (0.0017) [2024-04-28 02:48:33,765][54818] Updated weights for policy 0, policy_version 553668 (0.0017) [2024-04-28 02:48:34,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9071312896. Throughput: 0: 61073.5. Samples: 1976500900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:34,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 02:48:36,586][54818] Updated weights for policy 0, policy_version 553678 (0.0017) [2024-04-28 02:48:39,110][54818] Updated weights for policy 0, policy_version 553688 (0.0016) [2024-04-28 02:48:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9071624192. Throughput: 0: 61214.7. Samples: 1976871720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:39,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 02:48:41,851][54818] Updated weights for policy 0, policy_version 553698 (0.0017) [2024-04-28 02:48:42,887][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9000 times) [2024-04-28 02:48:44,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9071935488. Throughput: 0: 61120.8. Samples: 1977056240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:44,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 02:48:44,515][54818] Updated weights for policy 0, policy_version 553708 (0.0017) [2024-04-28 02:48:45,042][54798] Signal inference workers to stop experience collection... (32350 times) [2024-04-28 02:48:45,081][54818] InferenceWorker_p0-w0: stopping experience collection (32350 times) [2024-04-28 02:48:45,092][54798] Signal inference workers to resume experience collection... (32350 times) [2024-04-28 02:48:45,095][54818] InferenceWorker_p0-w0: resuming experience collection (32350 times) [2024-04-28 02:48:47,265][54818] Updated weights for policy 0, policy_version 553718 (0.0016) [2024-04-28 02:48:49,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.2, 300 sec: 61259.5). Total num frames: 9072230400. Throughput: 0: 61264.9. Samples: 1977420380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:49,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 02:48:49,977][54818] Updated weights for policy 0, policy_version 553728 (0.0017) [2024-04-28 02:48:52,625][54818] Updated weights for policy 0, policy_version 553738 (0.0018) [2024-04-28 02:48:54,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9072558080. Throughput: 0: 61307.6. Samples: 1977789560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:54,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 02:48:55,331][54818] Updated weights for policy 0, policy_version 553748 (0.0019) [2024-04-28 02:48:57,780][54818] Updated weights for policy 0, policy_version 553758 (0.0018) [2024-04-28 02:48:59,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9072852992. Throughput: 0: 61178.6. Samples: 1977976780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:48:59,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 02:49:00,594][54818] Updated weights for policy 0, policy_version 553768 (0.0015) [2024-04-28 02:49:03,041][54818] Updated weights for policy 0, policy_version 553778 (0.0015) [2024-04-28 02:49:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9073147904. Throughput: 0: 61158.0. Samples: 1978336800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:49:04,253][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 02:49:05,952][54818] Updated weights for policy 0, policy_version 553788 (0.0018) [2024-04-28 02:49:08,645][54818] Updated weights for policy 0, policy_version 553798 (0.0016) [2024-04-28 02:49:09,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9073459200. Throughput: 0: 61317.0. Samples: 1978710740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:49:09,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 02:49:09,772][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9100 times) [2024-04-28 02:49:11,260][54818] Updated weights for policy 0, policy_version 553808 (0.0016) [2024-04-28 02:49:14,068][54818] Updated weights for policy 0, policy_version 553818 (0.0016) [2024-04-28 02:49:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9073754112. Throughput: 0: 61358.3. Samples: 1978900060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:49:14,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 02:49:16,751][54818] Updated weights for policy 0, policy_version 553828 (0.0017) [2024-04-28 02:49:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.9, 300 sec: 61259.5). Total num frames: 9074065408. Throughput: 0: 61343.1. Samples: 1979261340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:49:19,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 02:49:19,325][54818] Updated weights for policy 0, policy_version 553838 (0.0017) [2024-04-28 02:49:22,045][54818] Updated weights for policy 0, policy_version 553848 (0.0016) [2024-04-28 02:49:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9074360320. Throughput: 0: 61244.5. Samples: 1979627720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:49:24,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 02:49:24,599][54818] Updated weights for policy 0, policy_version 553858 (0.0018) [2024-04-28 02:49:27,231][54818] Updated weights for policy 0, policy_version 553868 (0.0021) [2024-04-28 02:49:29,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9074671616. Throughput: 0: 61223.3. Samples: 1979811280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:49:29,254][54587] Avg episode reward: [(0, '0.507')] [2024-04-28 02:49:29,891][54818] Updated weights for policy 0, policy_version 553878 (0.0018) [2024-04-28 02:49:32,865][54818] Updated weights for policy 0, policy_version 553888 (0.0015) [2024-04-28 02:49:33,450][54798] Signal inference workers to stop experience collection... (32400 times) [2024-04-28 02:49:33,450][54798] Signal inference workers to resume experience collection... (32400 times) [2024-04-28 02:49:33,455][54818] InferenceWorker_p0-w0: stopping experience collection (32400 times) [2024-04-28 02:49:33,455][54818] InferenceWorker_p0-w0: resuming experience collection (32400 times) [2024-04-28 02:49:34,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9074982912. Throughput: 0: 61377.1. Samples: 1980182360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 02:49:34,254][54587] Avg episode reward: [(0, '0.493')] [2024-04-28 02:49:35,498][54818] Updated weights for policy 0, policy_version 553898 (0.0019) [2024-04-28 02:49:36,538][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9200 times) [2024-04-28 02:49:38,006][54818] Updated weights for policy 0, policy_version 553908 (0.0019) [2024-04-28 02:49:39,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9075294208. Throughput: 0: 61286.0. Samples: 1980547440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:49:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 02:49:40,548][54818] Updated weights for policy 0, policy_version 553918 (0.0017) [2024-04-28 02:49:43,201][54818] Updated weights for policy 0, policy_version 553928 (0.0017) [2024-04-28 02:49:44,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9075589120. Throughput: 0: 61122.8. Samples: 1980727300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:49:44,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 02:49:45,955][54818] Updated weights for policy 0, policy_version 553938 (0.0016) [2024-04-28 02:49:48,701][54818] Updated weights for policy 0, policy_version 553948 (0.0018) [2024-04-28 02:49:49,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 9075900416. Throughput: 0: 61371.9. Samples: 1981098540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:49:49,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 02:49:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000553949_9075900416.pth... 
[2024-04-28 02:49:49,343][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000553055_9061253120.pth [2024-04-28 02:49:51,305][54818] Updated weights for policy 0, policy_version 553958 (0.0016) [2024-04-28 02:49:54,129][54818] Updated weights for policy 0, policy_version 553968 (0.0017) [2024-04-28 02:49:54,253][54587] Fps is (10 sec: 62257.9, 60 sec: 60893.6, 300 sec: 61203.9). Total num frames: 9076211712. Throughput: 0: 61127.7. Samples: 1981461500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:49:54,255][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 02:49:56,520][54818] Updated weights for policy 0, policy_version 553978 (0.0018) [2024-04-28 02:49:59,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9076506624. Throughput: 0: 60955.9. Samples: 1981643080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:49:59,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 02:49:59,835][54818] Updated weights for policy 0, policy_version 553988 (0.0016) [2024-04-28 02:50:01,936][54818] Updated weights for policy 0, policy_version 553998 (0.0016) [2024-04-28 02:50:03,282][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9300 times) [2024-04-28 02:50:04,253][54587] Fps is (10 sec: 58983.3, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9076801536. Throughput: 0: 61247.6. Samples: 1982017480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:04,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 02:50:04,936][54818] Updated weights for policy 0, policy_version 554008 (0.0020) [2024-04-28 02:50:07,335][54818] Updated weights for policy 0, policy_version 554018 (0.0018) [2024-04-28 02:50:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9077112832. Throughput: 0: 61096.8. Samples: 1982377080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:09,255][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 02:50:10,154][54818] Updated weights for policy 0, policy_version 554028 (0.0018) [2024-04-28 02:50:12,772][54818] Updated weights for policy 0, policy_version 554038 (0.0017) [2024-04-28 02:50:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9077424128. Throughput: 0: 61125.3. Samples: 1982561920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:14,255][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 02:50:14,257][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (500 times) [2024-04-28 02:50:15,470][54818] Updated weights for policy 0, policy_version 554048 (0.0017) [2024-04-28 02:50:18,261][54818] Updated weights for policy 0, policy_version 554058 (0.0017) [2024-04-28 02:50:19,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9077719040. Throughput: 0: 61042.4. Samples: 1982929260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:19,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:50:20,350][54798] Signal inference workers to stop experience collection... (32450 times) [2024-04-28 02:50:20,382][54818] InferenceWorker_p0-w0: stopping experience collection (32450 times) [2024-04-28 02:50:20,402][54798] Signal inference workers to resume experience collection... (32450 times) [2024-04-28 02:50:20,403][54818] InferenceWorker_p0-w0: resuming experience collection (32450 times) [2024-04-28 02:50:20,759][54818] Updated weights for policy 0, policy_version 554068 (0.0021) [2024-04-28 02:50:23,499][54818] Updated weights for policy 0, policy_version 554078 (0.0020) [2024-04-28 02:50:24,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9078030336. Throughput: 0: 60990.8. Samples: 1983292020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:24,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 02:50:26,189][54818] Updated weights for policy 0, policy_version 554088 (0.0016) [2024-04-28 02:50:28,802][54818] Updated weights for policy 0, policy_version 554098 (0.0017) [2024-04-28 02:50:29,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61166.7, 300 sec: 61148.4). Total num frames: 9078341632. Throughput: 0: 61206.4. Samples: 1983481600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:29,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 02:50:30,208][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9400 times) [2024-04-28 02:50:31,540][54818] Updated weights for policy 0, policy_version 554108 (0.0015) [2024-04-28 02:50:34,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9078652928. Throughput: 0: 61064.0. Samples: 1983846420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:34,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 02:50:34,601][54818] Updated weights for policy 0, policy_version 554118 (0.0015) [2024-04-28 02:50:36,914][54818] Updated weights for policy 0, policy_version 554128 (0.0017) [2024-04-28 02:50:39,253][54587] Fps is (10 sec: 60621.9, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 9078947840. Throughput: 0: 61111.8. Samples: 1984211520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:39,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-28 02:50:39,865][54818] Updated weights for policy 0, policy_version 554138 (0.0016) [2024-04-28 02:50:42,154][54818] Updated weights for policy 0, policy_version 554148 (0.0016) [2024-04-28 02:50:44,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9079259136. Throughput: 0: 61285.2. Samples: 1984400920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:44,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 02:50:45,117][54818] Updated weights for policy 0, policy_version 554158 (0.0018) [2024-04-28 02:50:47,697][54818] Updated weights for policy 0, policy_version 554168 (0.0018) [2024-04-28 02:50:49,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9079570432. Throughput: 0: 61130.6. Samples: 1984768360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:49,255][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 02:50:49,264][54587] No heartbeat for components: RolloutWorker_w4 (19597 seconds), RolloutWorker_w5 (5697 seconds) [2024-04-28 02:50:50,215][54818] Updated weights for policy 0, policy_version 554178 (0.0016) [2024-04-28 02:50:52,964][54818] Updated weights for policy 0, policy_version 554188 (0.0018) [2024-04-28 02:50:54,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9079881728. Throughput: 0: 61283.0. Samples: 1985134820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:54,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 02:50:55,515][54818] Updated weights for policy 0, policy_version 554198 (0.0017) [2024-04-28 02:50:56,987][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9500 times) [2024-04-28 02:50:58,228][54818] Updated weights for policy 0, policy_version 554208 (0.0019) [2024-04-28 02:50:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9080176640. Throughput: 0: 61315.9. Samples: 1985321140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:50:59,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 02:51:00,845][54818] Updated weights for policy 0, policy_version 554218 (0.0019) [2024-04-28 02:51:02,977][54798] Signal inference workers to stop experience collection... 
(32500 times) [2024-04-28 02:51:02,978][54798] Signal inference workers to resume experience collection... (32500 times) [2024-04-28 02:51:02,992][54818] InferenceWorker_p0-w0: stopping experience collection (32500 times) [2024-04-28 02:51:02,992][54818] InferenceWorker_p0-w0: resuming experience collection (32500 times) [2024-04-28 02:51:03,539][54818] Updated weights for policy 0, policy_version 554228 (0.0016) [2024-04-28 02:51:04,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9080487936. Throughput: 0: 61478.7. Samples: 1985695800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 02:51:04,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 02:51:06,192][54818] Updated weights for policy 0, policy_version 554238 (0.0020) [2024-04-28 02:51:09,027][54818] Updated weights for policy 0, policy_version 554248 (0.0015) [2024-04-28 02:51:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9080799232. Throughput: 0: 61484.8. Samples: 1986058840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:09,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 02:51:11,535][54818] Updated weights for policy 0, policy_version 554258 (0.0017) [2024-04-28 02:51:14,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9081110528. Throughput: 0: 61357.9. Samples: 1986242700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:14,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 02:51:14,482][54818] Updated weights for policy 0, policy_version 554268 (0.0016) [2024-04-28 02:51:17,280][54818] Updated weights for policy 0, policy_version 554278 (0.0018) [2024-04-28 02:51:19,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61439.8, 300 sec: 61203.9). Total num frames: 9081405440. Throughput: 0: 61472.6. Samples: 1986612700. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:19,255][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 02:51:19,720][54818] Updated weights for policy 0, policy_version 554288 (0.0016) [2024-04-28 02:51:22,423][54818] Updated weights for policy 0, policy_version 554298 (0.0021) [2024-04-28 02:51:23,646][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9600 times) [2024-04-28 02:51:24,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9081716736. Throughput: 0: 61505.4. Samples: 1986979260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:24,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-28 02:51:24,791][54818] Updated weights for policy 0, policy_version 554308 (0.0017) [2024-04-28 02:51:27,924][54818] Updated weights for policy 0, policy_version 554318 (0.0017) [2024-04-28 02:51:29,253][54587] Fps is (10 sec: 60622.1, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9082011648. Throughput: 0: 61332.7. Samples: 1987160880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:29,253][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 02:51:30,267][54818] Updated weights for policy 0, policy_version 554328 (0.0018) [2024-04-28 02:51:33,202][54818] Updated weights for policy 0, policy_version 554338 (0.0015) [2024-04-28 02:51:34,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9082322944. Throughput: 0: 61367.6. Samples: 1987529900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:34,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 02:51:35,548][54818] Updated weights for policy 0, policy_version 554348 (0.0016) [2024-04-28 02:51:38,567][54818] Updated weights for policy 0, policy_version 554358 (0.0016) [2024-04-28 02:51:39,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9082634240. 
Throughput: 0: 61345.3. Samples: 1987895360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:39,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 02:51:40,860][54818] Updated weights for policy 0, policy_version 554368 (0.0016) [2024-04-28 02:51:43,865][54818] Updated weights for policy 0, policy_version 554378 (0.0016) [2024-04-28 02:51:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 9082929152. Throughput: 0: 61372.5. Samples: 1988082900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:44,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:51:46,572][54818] Updated weights for policy 0, policy_version 554388 (0.0016) [2024-04-28 02:51:49,199][54818] Updated weights for policy 0, policy_version 554398 (0.0018) [2024-04-28 02:51:49,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9083256832. Throughput: 0: 61167.5. Samples: 1988448340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:49,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 02:51:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000554398_9083256832.pth... [2024-04-28 02:51:49,322][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000553501_9068560384.pth [2024-04-28 02:51:49,880][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9700 times) [2024-04-28 02:51:50,855][54798] Signal inference workers to stop experience collection... (32550 times) [2024-04-28 02:51:50,855][54798] Signal inference workers to resume experience collection... 
(32550 times) [2024-04-28 02:51:50,868][54818] InferenceWorker_p0-w0: stopping experience collection (32550 times) [2024-04-28 02:51:50,868][54818] InferenceWorker_p0-w0: resuming experience collection (32550 times) [2024-04-28 02:51:51,757][54818] Updated weights for policy 0, policy_version 554408 (0.0020) [2024-04-28 02:51:54,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9083551744. Throughput: 0: 61289.9. Samples: 1988816880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:54,253][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 02:51:54,397][54818] Updated weights for policy 0, policy_version 554418 (0.0019) [2024-04-28 02:51:57,216][54818] Updated weights for policy 0, policy_version 554428 (0.0016) [2024-04-28 02:51:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9083863040. Throughput: 0: 61196.5. Samples: 1988996540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:51:59,255][54587] Avg episode reward: [(0, '0.509')] [2024-04-28 02:51:59,583][54818] Updated weights for policy 0, policy_version 554438 (0.0017) [2024-04-28 02:52:02,545][54818] Updated weights for policy 0, policy_version 554448 (0.0017) [2024-04-28 02:52:04,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9084174336. Throughput: 0: 61241.6. Samples: 1989368560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:52:04,254][54587] Avg episode reward: [(0, '0.462')] [2024-04-28 02:52:04,766][54818] Updated weights for policy 0, policy_version 554458 (0.0020) [2024-04-28 02:52:08,216][54818] Updated weights for policy 0, policy_version 554468 (0.0016) [2024-04-28 02:52:09,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9084485632. Throughput: 0: 61316.4. Samples: 1989738500. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:52:09,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 02:52:10,124][54818] Updated weights for policy 0, policy_version 554478 (0.0021) [2024-04-28 02:52:13,568][54818] Updated weights for policy 0, policy_version 554488 (0.0015) [2024-04-28 02:52:14,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9084796928. Throughput: 0: 61279.8. Samples: 1989918480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:52:14,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 02:52:15,723][54818] Updated weights for policy 0, policy_version 554498 (0.0021) [2024-04-28 02:52:17,269][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9800 times) [2024-04-28 02:52:18,726][54818] Updated weights for policy 0, policy_version 554508 (0.0016) [2024-04-28 02:52:19,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.3, 300 sec: 61259.5). Total num frames: 9085108224. Throughput: 0: 61270.4. Samples: 1990287060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:52:19,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 02:52:21,274][54818] Updated weights for policy 0, policy_version 554518 (0.0017) [2024-04-28 02:52:23,905][54818] Updated weights for policy 0, policy_version 554528 (0.0016) [2024-04-28 02:52:24,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 9085419520. Throughput: 0: 61405.7. Samples: 1990658620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:52:24,255][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 02:52:26,528][54818] Updated weights for policy 0, policy_version 554538 (0.0015) [2024-04-28 02:52:29,042][54818] Updated weights for policy 0, policy_version 554548 (0.0016) [2024-04-28 02:52:29,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9085714432. 
Throughput: 0: 61406.1. Samples: 1990846180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:52:29,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-28 02:52:29,465][54798] Signal inference workers to stop experience collection... (32600 times) [2024-04-28 02:52:29,508][54818] InferenceWorker_p0-w0: stopping experience collection (32600 times) [2024-04-28 02:52:29,525][54798] Signal inference workers to resume experience collection... (32600 times) [2024-04-28 02:52:29,525][54818] InferenceWorker_p0-w0: resuming experience collection (32600 times) [2024-04-28 02:52:31,924][54818] Updated weights for policy 0, policy_version 554558 (0.0018) [2024-04-28 02:52:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9086025728. Throughput: 0: 61615.9. Samples: 1991221060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 02:52:34,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 02:52:34,356][54818] Updated weights for policy 0, policy_version 554568 (0.0015) [2024-04-28 02:52:37,207][54818] Updated weights for policy 0, policy_version 554578 (0.0018) [2024-04-28 02:52:39,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9086337024. Throughput: 0: 61473.6. Samples: 1991583200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:52:39,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 02:52:39,585][54818] Updated weights for policy 0, policy_version 554588 (0.0015) [2024-04-28 02:52:42,660][54818] Updated weights for policy 0, policy_version 554598 (0.0016) [2024-04-28 02:52:43,598][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (9900 times) [2024-04-28 02:52:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9086631936. Throughput: 0: 61747.9. Samples: 1991775200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:52:44,255][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 02:52:44,841][54818] Updated weights for policy 0, policy_version 554608 (0.0017) [2024-04-28 02:52:47,904][54818] Updated weights for policy 0, policy_version 554618 (0.0017) [2024-04-28 02:52:49,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9086943232. Throughput: 0: 61464.1. Samples: 1992134440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:52:49,253][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 02:52:50,241][54818] Updated weights for policy 0, policy_version 554628 (0.0017) [2024-04-28 02:52:53,202][54818] Updated weights for policy 0, policy_version 554638 (0.0016) [2024-04-28 02:52:54,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9087238144. Throughput: 0: 61306.6. Samples: 1992497300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:52:54,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 02:52:56,009][54818] Updated weights for policy 0, policy_version 554648 (0.0018) [2024-04-28 02:52:58,531][54818] Updated weights for policy 0, policy_version 554658 (0.0019) [2024-04-28 02:52:59,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9087549440. Throughput: 0: 61635.4. Samples: 1992692060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:52:59,253][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 02:53:01,527][54818] Updated weights for policy 0, policy_version 554668 (0.0016) [2024-04-28 02:53:03,744][54818] Updated weights for policy 0, policy_version 554678 (0.0016) [2024-04-28 02:53:04,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9087844352. Throughput: 0: 61305.3. Samples: 1993045800. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:04,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 02:53:06,964][54818] Updated weights for policy 0, policy_version 554688 (0.0016) [2024-04-28 02:53:09,230][54818] Updated weights for policy 0, policy_version 554698 (0.0016) [2024-04-28 02:53:09,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9088172032. Throughput: 0: 61164.0. Samples: 1993411000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:09,254][54587] Avg episode reward: [(0, '0.497')] [2024-04-28 02:53:10,442][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10000 times) [2024-04-28 02:53:12,163][54818] Updated weights for policy 0, policy_version 554708 (0.0018) [2024-04-28 02:53:14,253][54587] Fps is (10 sec: 63897.4, 60 sec: 61440.2, 300 sec: 61204.0). Total num frames: 9088483328. Throughput: 0: 61337.4. Samples: 1993606360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:14,255][54587] Avg episode reward: [(0, '0.696')] [2024-04-28 02:53:14,474][54818] Updated weights for policy 0, policy_version 554718 (0.0016) [2024-04-28 02:53:17,459][54818] Updated weights for policy 0, policy_version 554728 (0.0020) [2024-04-28 02:53:18,535][54798] Signal inference workers to stop experience collection... (32650 times) [2024-04-28 02:53:18,535][54798] Signal inference workers to resume experience collection... (32650 times) [2024-04-28 02:53:18,551][54818] InferenceWorker_p0-w0: stopping experience collection (32650 times) [2024-04-28 02:53:18,551][54818] InferenceWorker_p0-w0: resuming experience collection (32650 times) [2024-04-28 02:53:19,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9088778240. Throughput: 0: 60974.3. Samples: 1993964900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:19,255][54587] Avg episode reward: [(0, '0.507')] [2024-04-28 02:53:19,738][54818] Updated weights for policy 0, policy_version 554738 (0.0016) [2024-04-28 02:53:22,990][54818] Updated weights for policy 0, policy_version 554748 (0.0016) [2024-04-28 02:53:24,253][54587] Fps is (10 sec: 58982.5, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9089073152. Throughput: 0: 60945.5. Samples: 1994325740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:24,254][54587] Avg episode reward: [(0, '0.496')] [2024-04-28 02:53:24,997][54818] Updated weights for policy 0, policy_version 554758 (0.0017) [2024-04-28 02:53:28,188][54818] Updated weights for policy 0, policy_version 554768 (0.0016) [2024-04-28 02:53:29,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9089384448. Throughput: 0: 60946.7. Samples: 1994517800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:29,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 02:53:30,687][54818] Updated weights for policy 0, policy_version 554778 (0.0016) [2024-04-28 02:53:33,484][54818] Updated weights for policy 0, policy_version 554788 (0.0018) [2024-04-28 02:53:34,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9089695744. Throughput: 0: 60935.2. Samples: 1994876540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:34,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-28 02:53:36,073][54818] Updated weights for policy 0, policy_version 554798 (0.0017) [2024-04-28 02:53:37,374][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10100 times) [2024-04-28 02:53:38,760][54818] Updated weights for policy 0, policy_version 554808 (0.0017) [2024-04-28 02:53:39,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9090007040. 
Throughput: 0: 61088.9. Samples: 1995246300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:39,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 02:53:41,719][54818] Updated weights for policy 0, policy_version 554818 (0.0016) [2024-04-28 02:53:43,943][54818] Updated weights for policy 0, policy_version 554828 (0.0016) [2024-04-28 02:53:44,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9090301952. Throughput: 0: 61013.7. Samples: 1995437680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:44,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 02:53:46,931][54818] Updated weights for policy 0, policy_version 554838 (0.0017) [2024-04-28 02:53:49,222][54818] Updated weights for policy 0, policy_version 554848 (0.0019) [2024-04-28 02:53:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9090629632. Throughput: 0: 61095.5. Samples: 1995795100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:49,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 02:53:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000554848_9090629632.pth... [2024-04-28 02:53:49,265][54587] No heartbeat for components: RolloutWorker_w4 (19777 seconds), RolloutWorker_w5 (5877 seconds) [2024-04-28 02:53:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000553949_9075900416.pth [2024-04-28 02:53:52,360][54818] Updated weights for policy 0, policy_version 554858 (0.0016) [2024-04-28 02:53:54,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9090924544. Throughput: 0: 61172.2. Samples: 1996163740. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:54,255][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 02:53:54,659][54818] Updated weights for policy 0, policy_version 554868 (0.0018) [2024-04-28 02:53:57,509][54818] Updated weights for policy 0, policy_version 554878 (0.0016) [2024-04-28 02:53:59,253][54587] Fps is (10 sec: 58983.0, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9091219456. Throughput: 0: 60972.1. Samples: 1996350100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:53:59,253][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 02:54:00,048][54818] Updated weights for policy 0, policy_version 554888 (0.0018) [2024-04-28 02:54:03,181][54818] Updated weights for policy 0, policy_version 554898 (0.0015) [2024-04-28 02:54:04,049][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10200 times) [2024-04-28 02:54:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9091530752. Throughput: 0: 60943.0. Samples: 1996707340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 02:54:04,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 02:54:05,489][54818] Updated weights for policy 0, policy_version 554908 (0.0016) [2024-04-28 02:54:06,757][54798] Signal inference workers to stop experience collection... (32700 times) [2024-04-28 02:54:06,798][54818] InferenceWorker_p0-w0: stopping experience collection (32700 times) [2024-04-28 02:54:06,815][54798] Signal inference workers to resume experience collection... (32700 times) [2024-04-28 02:54:06,816][54818] InferenceWorker_p0-w0: resuming experience collection (32700 times) [2024-04-28 02:54:08,417][54818] Updated weights for policy 0, policy_version 554918 (0.0018) [2024-04-28 02:54:09,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 9091842048. Throughput: 0: 61247.0. Samples: 1997081860. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:09,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-28 02:54:10,849][54818] Updated weights for policy 0, policy_version 554928 (0.0016) [2024-04-28 02:54:13,906][54818] Updated weights for policy 0, policy_version 554938 (0.0017) [2024-04-28 02:54:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9092136960. Throughput: 0: 61092.9. Samples: 1997266980. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:14,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 02:54:16,330][54818] Updated weights for policy 0, policy_version 554948 (0.0016) [2024-04-28 02:54:19,238][54818] Updated weights for policy 0, policy_version 554958 (0.0018) [2024-04-28 02:54:19,254][54587] Fps is (10 sec: 58981.4, 60 sec: 60893.6, 300 sec: 61259.4). Total num frames: 9092431872. Throughput: 0: 61122.6. Samples: 1997627060. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:19,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 02:54:21,839][54818] Updated weights for policy 0, policy_version 554968 (0.0017) [2024-04-28 02:54:24,253][54587] Fps is (10 sec: 58983.3, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9092726784. Throughput: 0: 61039.2. Samples: 1997993060. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:24,253][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 02:54:24,396][54818] Updated weights for policy 0, policy_version 554978 (0.0021) [2024-04-28 02:54:27,237][54818] Updated weights for policy 0, policy_version 554988 (0.0017) [2024-04-28 02:54:29,253][54587] Fps is (10 sec: 60622.3, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9093038080. Throughput: 0: 60955.2. Samples: 1998180660. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:29,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 02:54:29,744][54818] Updated weights for policy 0, policy_version 554998 (0.0015) [2024-04-28 02:54:30,880][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10300 times) [2024-04-28 02:54:32,411][54818] Updated weights for policy 0, policy_version 555008 (0.0016) [2024-04-28 02:54:34,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60620.9, 300 sec: 61148.4). Total num frames: 9093332992. Throughput: 0: 61124.0. Samples: 1998545680. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:34,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 02:54:34,996][54818] Updated weights for policy 0, policy_version 555018 (0.0019) [2024-04-28 02:54:37,704][54818] Updated weights for policy 0, policy_version 555028 (0.0018) [2024-04-28 02:54:39,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60620.8, 300 sec: 61203.9). Total num frames: 9093644288. Throughput: 0: 61080.3. Samples: 1998912360. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:39,256][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 02:54:40,295][54818] Updated weights for policy 0, policy_version 555038 (0.0015) [2024-04-28 02:54:43,279][54818] Updated weights for policy 0, policy_version 555048 (0.0015) [2024-04-28 02:54:44,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9093955584. Throughput: 0: 60953.2. Samples: 1999093000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:44,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-28 02:54:44,997][54798] Signal inference workers to stop experience collection... (32750 times) [2024-04-28 02:54:44,999][54798] Signal inference workers to resume experience collection... 
(32750 times) [2024-04-28 02:54:45,013][54818] InferenceWorker_p0-w0: stopping experience collection (32750 times) [2024-04-28 02:54:45,013][54818] InferenceWorker_p0-w0: resuming experience collection (32750 times) [2024-04-28 02:54:45,641][54818] Updated weights for policy 0, policy_version 555058 (0.0017) [2024-04-28 02:54:48,386][54818] Updated weights for policy 0, policy_version 555068 (0.0017) [2024-04-28 02:54:49,253][54587] Fps is (10 sec: 62259.4, 60 sec: 60620.8, 300 sec: 61204.0). Total num frames: 9094266880. Throughput: 0: 61336.1. Samples: 1999467460. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:49,255][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 02:54:51,273][54818] Updated weights for policy 0, policy_version 555078 (0.0019) [2024-04-28 02:54:53,764][54818] Updated weights for policy 0, policy_version 555088 (0.0016) [2024-04-28 02:54:54,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60620.7, 300 sec: 61204.0). Total num frames: 9094561792. Throughput: 0: 60975.5. Samples: 1999825760. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:54,255][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 02:54:56,512][54818] Updated weights for policy 0, policy_version 555098 (0.0019) [2024-04-28 02:54:57,782][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10400 times) [2024-04-28 02:54:59,253][54587] Fps is (10 sec: 60620.0, 60 sec: 60893.6, 300 sec: 61259.5). Total num frames: 9094873088. Throughput: 0: 60999.0. Samples: 2000011940. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:54:59,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 02:54:59,391][54818] Updated weights for policy 0, policy_version 555108 (0.0016) [2024-04-28 02:55:01,741][54818] Updated weights for policy 0, policy_version 555118 (0.0017) [2024-04-28 02:55:04,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9095184384. 
Throughput: 0: 61248.7. Samples: 2000383240. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:55:04,255][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 02:55:04,633][54818] Updated weights for policy 0, policy_version 555128 (0.0017) [2024-04-28 02:55:07,208][54818] Updated weights for policy 0, policy_version 555138 (0.0016) [2024-04-28 02:55:09,253][54587] Fps is (10 sec: 62259.6, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9095495680. Throughput: 0: 61178.5. Samples: 2000746100. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:55:09,255][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:55:09,964][54818] Updated weights for policy 0, policy_version 555148 (0.0016) [2024-04-28 02:55:12,547][54818] Updated weights for policy 0, policy_version 555158 (0.0016) [2024-04-28 02:55:14,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60894.1, 300 sec: 61259.5). Total num frames: 9095790592. Throughput: 0: 61188.1. Samples: 2000934120. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:55:14,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 02:55:15,217][54818] Updated weights for policy 0, policy_version 555168 (0.0015) [2024-04-28 02:55:17,764][54818] Updated weights for policy 0, policy_version 555178 (0.0016) [2024-04-28 02:55:19,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.2, 300 sec: 61315.0). Total num frames: 9096118272. Throughput: 0: 61313.7. Samples: 2001304800. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:55:19,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 02:55:20,489][54818] Updated weights for policy 0, policy_version 555188 (0.0015) [2024-04-28 02:55:23,129][54818] Updated weights for policy 0, policy_version 555198 (0.0018) [2024-04-28 02:55:24,253][54587] Fps is (10 sec: 63896.0, 60 sec: 61712.8, 300 sec: 61315.0). Total num frames: 9096429568. Throughput: 0: 61358.5. Samples: 2001673500. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:55:24,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 02:55:24,397][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10500 times) [2024-04-28 02:55:25,768][54818] Updated weights for policy 0, policy_version 555208 (0.0016) [2024-04-28 02:55:28,480][54818] Updated weights for policy 0, policy_version 555218 (0.0016) [2024-04-28 02:55:29,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9096724480. Throughput: 0: 61457.2. Samples: 2001858580. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:55:29,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:55:30,967][54818] Updated weights for policy 0, policy_version 555228 (0.0017) [2024-04-28 02:55:33,675][54818] Updated weights for policy 0, policy_version 555238 (0.0015) [2024-04-28 02:55:34,253][54587] Fps is (10 sec: 60622.3, 60 sec: 61713.2, 300 sec: 61315.1). Total num frames: 9097035776. Throughput: 0: 61426.8. Samples: 2002231660. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:55:34,253][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 02:55:36,335][54818] Updated weights for policy 0, policy_version 555248 (0.0019) [2024-04-28 02:55:38,929][54818] Updated weights for policy 0, policy_version 555258 (0.0016) [2024-04-28 02:55:39,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9097347072. Throughput: 0: 61609.3. Samples: 2002598180. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-04-28 02:55:39,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 02:55:40,523][54798] Signal inference workers to stop experience collection... (32800 times) [2024-04-28 02:55:40,562][54818] InferenceWorker_p0-w0: stopping experience collection (32800 times) [2024-04-28 02:55:40,579][54798] Signal inference workers to resume experience collection... 
(32800 times) [2024-04-28 02:55:40,580][54818] InferenceWorker_p0-w0: resuming experience collection (32800 times) [2024-04-28 02:55:41,737][54818] Updated weights for policy 0, policy_version 555268 (0.0017) [2024-04-28 02:55:44,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61713.1, 300 sec: 61315.1). Total num frames: 9097658368. Throughput: 0: 61719.8. Samples: 2002789320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:55:44,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 02:55:44,502][54818] Updated weights for policy 0, policy_version 555278 (0.0016) [2024-04-28 02:55:47,316][54818] Updated weights for policy 0, policy_version 555288 (0.0018) [2024-04-28 02:55:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9097969664. Throughput: 0: 61506.6. Samples: 2003151040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:55:49,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-28 02:55:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000555296_9097969664.pth... [2024-04-28 02:55:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000554398_9083256832.pth [2024-04-28 02:55:49,724][54818] Updated weights for policy 0, policy_version 555298 (0.0016) [2024-04-28 02:55:50,990][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10600 times) [2024-04-28 02:55:52,410][54818] Updated weights for policy 0, policy_version 555308 (0.0019) [2024-04-28 02:55:54,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61986.1, 300 sec: 61370.6). Total num frames: 9098280960. Throughput: 0: 61717.4. Samples: 2003523380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:55:54,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 02:55:55,194][54818] Updated weights for policy 0, policy_version 555318 (0.0016) [2024-04-28 02:55:57,583][54818] Updated weights for policy 0, policy_version 555328 (0.0018) [2024-04-28 02:55:59,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9098575872. Throughput: 0: 61670.4. Samples: 2003709300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:55:59,255][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 02:56:00,405][54818] Updated weights for policy 0, policy_version 555338 (0.0018) [2024-04-28 02:56:02,963][54818] Updated weights for policy 0, policy_version 555348 (0.0017) [2024-04-28 02:56:04,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9098887168. Throughput: 0: 61513.0. Samples: 2004072880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:04,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 02:56:05,710][54818] Updated weights for policy 0, policy_version 555358 (0.0016) [2024-04-28 02:56:08,397][54818] Updated weights for policy 0, policy_version 555368 (0.0017) [2024-04-28 02:56:09,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61440.2, 300 sec: 61259.5). Total num frames: 9099182080. Throughput: 0: 61608.8. Samples: 2004445880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:09,253][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 02:56:11,022][54818] Updated weights for policy 0, policy_version 555378 (0.0020) [2024-04-28 02:56:13,607][54818] Updated weights for policy 0, policy_version 555388 (0.0016) [2024-04-28 02:56:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61713.0, 300 sec: 61315.1). Total num frames: 9099493376. Throughput: 0: 61602.3. Samples: 2004630680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:14,254][54587] Avg episode reward: [(0, '0.525')] [2024-04-28 02:56:16,314][54818] Updated weights for policy 0, policy_version 555398 (0.0015) [2024-04-28 02:56:17,590][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10700 times) [2024-04-28 02:56:19,067][54818] Updated weights for policy 0, policy_version 555408 (0.0021) [2024-04-28 02:56:19,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9099804672. Throughput: 0: 61667.7. Samples: 2005006720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:19,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 02:56:21,672][54818] Updated weights for policy 0, policy_version 555418 (0.0020) [2024-04-28 02:56:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.1, 300 sec: 61315.0). Total num frames: 9100099584. Throughput: 0: 61561.5. Samples: 2005368440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:24,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 02:56:24,652][54818] Updated weights for policy 0, policy_version 555428 (0.0016) [2024-04-28 02:56:26,888][54818] Updated weights for policy 0, policy_version 555438 (0.0016) [2024-04-28 02:56:29,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9100410880. Throughput: 0: 61397.8. Samples: 2005552220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:29,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 02:56:29,839][54818] Updated weights for policy 0, policy_version 555448 (0.0019) [2024-04-28 02:56:32,426][54818] Updated weights for policy 0, policy_version 555458 (0.0017) [2024-04-28 02:56:33,415][54798] Signal inference workers to stop experience collection... 
(32850 times) [2024-04-28 02:56:33,465][54818] InferenceWorker_p0-w0: stopping experience collection (32850 times) [2024-04-28 02:56:33,468][54798] Signal inference workers to resume experience collection... (32850 times) [2024-04-28 02:56:33,473][54818] InferenceWorker_p0-w0: resuming experience collection (32850 times) [2024-04-28 02:56:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9100705792. Throughput: 0: 61543.8. Samples: 2005920500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:34,253][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 02:56:35,124][54818] Updated weights for policy 0, policy_version 555468 (0.0016) [2024-04-28 02:56:37,718][54818] Updated weights for policy 0, policy_version 555478 (0.0019) [2024-04-28 02:56:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.1, 300 sec: 61315.1). Total num frames: 9101017088. Throughput: 0: 61501.5. Samples: 2006290940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:39,253][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 02:56:40,457][54818] Updated weights for policy 0, policy_version 555488 (0.0019) [2024-04-28 02:56:43,077][54818] Updated weights for policy 0, policy_version 555498 (0.0019) [2024-04-28 02:56:44,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9101328384. Throughput: 0: 61319.8. Samples: 2006468680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:44,253][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 02:56:44,469][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10800 times) [2024-04-28 02:56:45,737][54818] Updated weights for policy 0, policy_version 555508 (0.0016) [2024-04-28 02:56:48,128][54818] Updated weights for policy 0, policy_version 555518 (0.0018) [2024-04-28 02:56:49,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61166.9, 300 sec: 61315.0). 
Total num frames: 9101639680. Throughput: 0: 61567.0. Samples: 2006843400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:49,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 02:56:49,264][54587] No heartbeat for components: RolloutWorker_w4 (19957 seconds), RolloutWorker_w5 (6057 seconds) [2024-04-28 02:56:51,195][54818] Updated weights for policy 0, policy_version 555528 (0.0017) [2024-04-28 02:56:53,962][54818] Updated weights for policy 0, policy_version 555538 (0.0018) [2024-04-28 02:56:54,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9101950976. Throughput: 0: 61478.5. Samples: 2007212420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:54,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 02:56:56,284][54818] Updated weights for policy 0, policy_version 555548 (0.0018) [2024-04-28 02:56:59,218][54818] Updated weights for policy 0, policy_version 555558 (0.0020) [2024-04-28 02:56:59,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9102262272. Throughput: 0: 61298.5. Samples: 2007389120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:56:59,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:57:01,843][54818] Updated weights for policy 0, policy_version 555568 (0.0016) [2024-04-28 02:57:04,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9102557184. Throughput: 0: 61073.2. Samples: 2007755000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:57:04,253][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 02:57:04,628][54818] Updated weights for policy 0, policy_version 555578 (0.0015) [2024-04-28 02:57:07,101][54818] Updated weights for policy 0, policy_version 555588 (0.0015) [2024-04-28 02:57:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61439.8, 300 sec: 61259.5). Total num frames: 9102868480. Throughput: 0: 61238.0. 
Samples: 2008124160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 02:57:09,255][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 02:57:09,993][54818] Updated weights for policy 0, policy_version 555598 (0.0016) [2024-04-28 02:57:11,108][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (10900 times) [2024-04-28 02:57:12,557][54818] Updated weights for policy 0, policy_version 555608 (0.0016) [2024-04-28 02:57:14,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9103163392. Throughput: 0: 61162.6. Samples: 2008304540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:14,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 02:57:15,354][54818] Updated weights for policy 0, policy_version 555618 (0.0017) [2024-04-28 02:57:16,330][54798] Signal inference workers to stop experience collection... (32900 times) [2024-04-28 02:57:16,330][54798] Signal inference workers to resume experience collection... (32900 times) [2024-04-28 02:57:16,338][54818] InferenceWorker_p0-w0: stopping experience collection (32900 times) [2024-04-28 02:57:16,346][54818] InferenceWorker_p0-w0: resuming experience collection (32900 times) [2024-04-28 02:57:17,678][54818] Updated weights for policy 0, policy_version 555628 (0.0016) [2024-04-28 02:57:19,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9103474688. Throughput: 0: 61158.2. Samples: 2008672620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:19,253][54587] Avg episode reward: [(0, '0.527')] [2024-04-28 02:57:20,889][54818] Updated weights for policy 0, policy_version 555638 (0.0015) [2024-04-28 02:57:22,967][54818] Updated weights for policy 0, policy_version 555648 (0.0016) [2024-04-28 02:57:24,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9103785984. Throughput: 0: 61279.3. Samples: 2009048520. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:24,255][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 02:57:25,992][54818] Updated weights for policy 0, policy_version 555658 (0.0016) [2024-04-28 02:57:28,562][54818] Updated weights for policy 0, policy_version 555668 (0.0017) [2024-04-28 02:57:29,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9104097280. Throughput: 0: 61285.8. Samples: 2009226540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:29,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 02:57:31,173][54818] Updated weights for policy 0, policy_version 555678 (0.0016) [2024-04-28 02:57:33,791][54818] Updated weights for policy 0, policy_version 555688 (0.0018) [2024-04-28 02:57:34,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 9104408576. Throughput: 0: 61214.2. Samples: 2009598040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:34,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 02:57:36,402][54818] Updated weights for policy 0, policy_version 555698 (0.0017) [2024-04-28 02:57:37,666][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11000 times) [2024-04-28 02:57:39,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61713.0, 300 sec: 61315.1). Total num frames: 9104719872. Throughput: 0: 61182.3. Samples: 2009965620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:39,254][54587] Avg episode reward: [(0, '0.524')] [2024-04-28 02:57:39,256][54818] Updated weights for policy 0, policy_version 555708 (0.0018) [2024-04-28 02:57:41,690][54818] Updated weights for policy 0, policy_version 555718 (0.0017) [2024-04-28 02:57:44,253][54587] Fps is (10 sec: 58983.2, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9104998400. Throughput: 0: 61252.7. Samples: 2010145480. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:44,253][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 02:57:44,577][54818] Updated weights for policy 0, policy_version 555728 (0.0016) [2024-04-28 02:57:47,104][54818] Updated weights for policy 0, policy_version 555738 (0.0018) [2024-04-28 02:57:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9105326080. Throughput: 0: 61381.2. Samples: 2010517160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:49,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 02:57:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000555745_9105326080.pth... [2024-04-28 02:57:49,335][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000554848_9090629632.pth [2024-04-28 02:57:50,229][54818] Updated weights for policy 0, policy_version 555748 (0.0017) [2024-04-28 02:57:52,696][54818] Updated weights for policy 0, policy_version 555758 (0.0018) [2024-04-28 02:57:54,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61167.1, 300 sec: 61259.5). Total num frames: 9105620992. Throughput: 0: 61230.5. Samples: 2010879520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:54,253][54587] Avg episode reward: [(0, '0.673')] [2024-04-28 02:57:55,610][54818] Updated weights for policy 0, policy_version 555768 (0.0018) [2024-04-28 02:57:57,957][54818] Updated weights for policy 0, policy_version 555778 (0.0016) [2024-04-28 02:57:59,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 9105932288. Throughput: 0: 61235.9. Samples: 2011060160. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:57:59,255][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 02:58:00,674][54818] Updated weights for policy 0, policy_version 555788 (0.0017) [2024-04-28 02:58:03,367][54818] Updated weights for policy 0, policy_version 555798 (0.0018) [2024-04-28 02:58:04,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9106243584. Throughput: 0: 61386.2. Samples: 2011435000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:58:04,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 02:58:04,770][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11100 times) [2024-04-28 02:58:06,027][54818] Updated weights for policy 0, policy_version 555808 (0.0017) [2024-04-28 02:58:06,409][54798] Signal inference workers to stop experience collection... (32950 times) [2024-04-28 02:58:06,449][54818] InferenceWorker_p0-w0: stopping experience collection (32950 times) [2024-04-28 02:58:06,498][54798] Signal inference workers to resume experience collection... (32950 times) [2024-04-28 02:58:06,498][54818] InferenceWorker_p0-w0: resuming experience collection (32950 times) [2024-04-28 02:58:08,625][54818] Updated weights for policy 0, policy_version 555818 (0.0016) [2024-04-28 02:58:09,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9106538496. Throughput: 0: 61115.3. Samples: 2011798700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:58:09,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 02:58:11,341][54818] Updated weights for policy 0, policy_version 555828 (0.0017) [2024-04-28 02:58:14,010][54818] Updated weights for policy 0, policy_version 555838 (0.0022) [2024-04-28 02:58:14,253][54587] Fps is (10 sec: 60619.8, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9106849792. Throughput: 0: 61134.8. Samples: 2011977620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:58:14,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 02:58:16,548][54818] Updated weights for policy 0, policy_version 555848 (0.0020) [2024-04-28 02:58:19,259][54587] Fps is (10 sec: 62226.4, 60 sec: 61434.6, 300 sec: 61313.9). Total num frames: 9107161088. Throughput: 0: 60988.5. Samples: 2012342840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:58:19,259][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 02:58:19,408][54818] Updated weights for policy 0, policy_version 555858 (0.0020) [2024-04-28 02:58:21,874][54818] Updated weights for policy 0, policy_version 555868 (0.0017) [2024-04-28 02:58:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9107456000. Throughput: 0: 61215.1. Samples: 2012720300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:58:24,255][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 02:58:24,906][54818] Updated weights for policy 0, policy_version 555878 (0.0015) [2024-04-28 02:58:27,271][54818] Updated weights for policy 0, policy_version 555888 (0.0019) [2024-04-28 02:58:29,253][54587] Fps is (10 sec: 62291.1, 60 sec: 61439.8, 300 sec: 61315.0). Total num frames: 9107783680. Throughput: 0: 61191.8. Samples: 2012899120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:58:29,255][54587] Avg episode reward: [(0, '0.682')] [2024-04-28 02:58:30,224][54818] Updated weights for policy 0, policy_version 555898 (0.0015) [2024-04-28 02:58:31,178][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11200 times) [2024-04-28 02:58:32,871][54818] Updated weights for policy 0, policy_version 555908 (0.0017) [2024-04-28 02:58:34,253][54587] Fps is (10 sec: 63898.5, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 9108094976. Throughput: 0: 61253.5. Samples: 2013273560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:58:34,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 02:58:34,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (600 times) [2024-04-28 02:58:35,497][54818] Updated weights for policy 0, policy_version 555918 (0.0017) [2024-04-28 02:58:37,936][54818] Updated weights for policy 0, policy_version 555928 (0.0018) [2024-04-28 02:58:39,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9108389888. Throughput: 0: 61372.3. Samples: 2013641280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 02:58:39,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 02:58:40,767][54818] Updated weights for policy 0, policy_version 555938 (0.0016) [2024-04-28 02:58:43,418][54818] Updated weights for policy 0, policy_version 555948 (0.0017) [2024-04-28 02:58:44,253][54587] Fps is (10 sec: 58982.3, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9108684800. Throughput: 0: 61366.5. Samples: 2013821640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:58:44,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 02:58:45,966][54818] Updated weights for policy 0, policy_version 555958 (0.0018) [2024-04-28 02:58:48,631][54818] Updated weights for policy 0, policy_version 555968 (0.0016) [2024-04-28 02:58:49,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9109012480. Throughput: 0: 61389.1. Samples: 2014197520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:58:49,255][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 02:58:51,192][54818] Updated weights for policy 0, policy_version 555978 (0.0016) [2024-04-28 02:58:54,007][54818] Updated weights for policy 0, policy_version 555988 (0.0017) [2024-04-28 02:58:54,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9109307392. Throughput: 0: 61324.0. 
Samples: 2014558280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:58:54,255][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 02:58:55,007][54798] Signal inference workers to stop experience collection... (33000 times) [2024-04-28 02:58:55,011][54798] Signal inference workers to resume experience collection... (33000 times) [2024-04-28 02:58:55,022][54818] InferenceWorker_p0-w0: stopping experience collection (33000 times) [2024-04-28 02:58:55,022][54818] InferenceWorker_p0-w0: resuming experience collection (33000 times) [2024-04-28 02:58:56,850][54818] Updated weights for policy 0, policy_version 555998 (0.0017) [2024-04-28 02:58:58,007][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11300 times) [2024-04-28 02:58:59,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9109618688. Throughput: 0: 61548.5. Samples: 2014747300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:58:59,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 02:58:59,517][54818] Updated weights for policy 0, policy_version 556008 (0.0018) [2024-04-28 02:59:02,100][54818] Updated weights for policy 0, policy_version 556018 (0.0017) [2024-04-28 02:59:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9109913600. Throughput: 0: 61587.6. Samples: 2015113960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:04,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 02:59:04,806][54818] Updated weights for policy 0, policy_version 556028 (0.0017) [2024-04-28 02:59:07,457][54818] Updated weights for policy 0, policy_version 556038 (0.0016) [2024-04-28 02:59:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9110224896. Throughput: 0: 61378.1. Samples: 2015482320. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:09,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 02:59:10,020][54818] Updated weights for policy 0, policy_version 556048 (0.0021) [2024-04-28 02:59:12,558][54818] Updated weights for policy 0, policy_version 556058 (0.0018) [2024-04-28 02:59:14,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9110536192. Throughput: 0: 61542.3. Samples: 2015668520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:14,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 02:59:15,406][54818] Updated weights for policy 0, policy_version 556068 (0.0020) [2024-04-28 02:59:18,212][54818] Updated weights for policy 0, policy_version 556078 (0.0016) [2024-04-28 02:59:19,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61445.4, 300 sec: 61426.1). Total num frames: 9110847488. Throughput: 0: 61218.2. Samples: 2016028380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:19,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 02:59:20,643][54818] Updated weights for policy 0, policy_version 556088 (0.0017) [2024-04-28 02:59:23,568][54818] Updated weights for policy 0, policy_version 556098 (0.0016) [2024-04-28 02:59:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9111142400. Throughput: 0: 61317.8. Samples: 2016400580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:24,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 02:59:24,714][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11400 times) [2024-04-28 02:59:26,009][54818] Updated weights for policy 0, policy_version 556108 (0.0017) [2024-04-28 02:59:28,935][54818] Updated weights for policy 0, policy_version 556118 (0.0017) [2024-04-28 02:59:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61167.1, 300 sec: 61426.1). Total num frames: 9111453696. 
Throughput: 0: 61422.1. Samples: 2016585640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:29,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 02:59:31,436][54818] Updated weights for policy 0, policy_version 556128 (0.0017) [2024-04-28 02:59:34,127][54818] Updated weights for policy 0, policy_version 556138 (0.0017) [2024-04-28 02:59:34,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.8, 300 sec: 61426.1). Total num frames: 9111764992. Throughput: 0: 61240.1. Samples: 2016953320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:34,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 02:59:37,062][54818] Updated weights for policy 0, policy_version 556148 (0.0016) [2024-04-28 02:59:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9112059904. Throughput: 0: 61280.5. Samples: 2017315900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:39,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 02:59:39,654][54818] Updated weights for policy 0, policy_version 556158 (0.0018) [2024-04-28 02:59:42,340][54818] Updated weights for policy 0, policy_version 556168 (0.0016) [2024-04-28 02:59:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9112371200. Throughput: 0: 61301.5. Samples: 2017505860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:44,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 02:59:44,845][54818] Updated weights for policy 0, policy_version 556178 (0.0015) [2024-04-28 02:59:47,696][54818] Updated weights for policy 0, policy_version 556188 (0.0019) [2024-04-28 02:59:49,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.9, 300 sec: 61370.6). Total num frames: 9112666112. Throughput: 0: 61366.1. Samples: 2017875440. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:49,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 02:59:49,261][54587] No heartbeat for components: RolloutWorker_w4 (20137 seconds), RolloutWorker_w5 (6237 seconds) [2024-04-28 02:59:49,284][54798] Signal inference workers to stop experience collection... (33050 times) [2024-04-28 02:59:49,324][54818] InferenceWorker_p0-w0: stopping experience collection (33050 times) [2024-04-28 02:59:49,344][54798] Signal inference workers to resume experience collection... (33050 times) [2024-04-28 02:59:49,344][54818] InferenceWorker_p0-w0: resuming experience collection (33050 times) [2024-04-28 02:59:49,345][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000556194_9112682496.pth... [2024-04-28 02:59:49,379][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000555296_9097969664.pth [2024-04-28 02:59:50,070][54818] Updated weights for policy 0, policy_version 556198 (0.0023) [2024-04-28 02:59:51,587][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11500 times) [2024-04-28 02:59:53,052][54818] Updated weights for policy 0, policy_version 556208 (0.0016) [2024-04-28 02:59:54,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9112977408. Throughput: 0: 61209.9. Samples: 2018236760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:54,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 02:59:55,568][54818] Updated weights for policy 0, policy_version 556218 (0.0016) [2024-04-28 02:59:58,330][54818] Updated weights for policy 0, policy_version 556228 (0.0018) [2024-04-28 02:59:59,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9113288704. Throughput: 0: 61171.1. Samples: 2018421220. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 02:59:59,255][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 03:00:00,936][54818] Updated weights for policy 0, policy_version 556238 (0.0015) [2024-04-28 03:00:03,638][54818] Updated weights for policy 0, policy_version 556248 (0.0018) [2024-04-28 03:00:04,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9113583616. Throughput: 0: 61457.8. Samples: 2018793980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 03:00:04,253][54587] Avg episode reward: [(0, '0.514')] [2024-04-28 03:00:06,397][54818] Updated weights for policy 0, policy_version 556258 (0.0018) [2024-04-28 03:00:09,121][54818] Updated weights for policy 0, policy_version 556268 (0.0018) [2024-04-28 03:00:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.1, 300 sec: 61370.6). Total num frames: 9113894912. Throughput: 0: 61205.9. Samples: 2019154840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 03:00:09,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 03:00:11,590][54818] Updated weights for policy 0, policy_version 556278 (0.0017) [2024-04-28 03:00:14,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9114206208. Throughput: 0: 61164.3. Samples: 2019338040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:14,255][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 03:00:14,378][54818] Updated weights for policy 0, policy_version 556288 (0.0017) [2024-04-28 03:00:17,108][54818] Updated weights for policy 0, policy_version 556298 (0.0016) [2024-04-28 03:00:18,166][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11600 times) [2024-04-28 03:00:19,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61166.7, 300 sec: 61315.0). Total num frames: 9114517504. Throughput: 0: 61219.8. Samples: 2019708220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:19,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 03:00:19,653][54818] Updated weights for policy 0, policy_version 556308 (0.0017) [2024-04-28 03:00:22,397][54818] Updated weights for policy 0, policy_version 556318 (0.0016) [2024-04-28 03:00:24,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.8, 300 sec: 61315.0). Total num frames: 9114812416. Throughput: 0: 61294.0. Samples: 2020074140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:24,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 03:00:24,843][54818] Updated weights for policy 0, policy_version 556328 (0.0017) [2024-04-28 03:00:27,884][54818] Updated weights for policy 0, policy_version 556338 (0.0017) [2024-04-28 03:00:29,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9115123712. Throughput: 0: 61211.4. Samples: 2020260380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:29,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 03:00:30,062][54818] Updated weights for policy 0, policy_version 556348 (0.0019) [2024-04-28 03:00:33,108][54818] Updated weights for policy 0, policy_version 556358 (0.0016) [2024-04-28 03:00:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61166.8, 300 sec: 61315.0). Total num frames: 9115435008. Throughput: 0: 61265.2. Samples: 2020632380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:34,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 03:00:35,567][54818] Updated weights for policy 0, policy_version 556368 (0.0016) [2024-04-28 03:00:38,041][54798] Signal inference workers to stop experience collection... (33100 times) [2024-04-28 03:00:38,074][54818] InferenceWorker_p0-w0: stopping experience collection (33100 times) [2024-04-28 03:00:38,094][54798] Signal inference workers to resume experience collection... 
(33100 times) [2024-04-28 03:00:38,095][54818] InferenceWorker_p0-w0: resuming experience collection (33100 times) [2024-04-28 03:00:38,339][54818] Updated weights for policy 0, policy_version 556378 (0.0021) [2024-04-28 03:00:39,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9115729920. Throughput: 0: 61279.9. Samples: 2020994360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:39,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 03:00:41,188][54818] Updated weights for policy 0, policy_version 556388 (0.0017) [2024-04-28 03:00:43,504][54818] Updated weights for policy 0, policy_version 556398 (0.0017) [2024-04-28 03:00:44,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9116057600. Throughput: 0: 61319.5. Samples: 2021180600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:44,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 03:00:44,899][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11700 times) [2024-04-28 03:00:46,349][54818] Updated weights for policy 0, policy_version 556408 (0.0015) [2024-04-28 03:00:49,128][54818] Updated weights for policy 0, policy_version 556418 (0.0017) [2024-04-28 03:00:49,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9116352512. Throughput: 0: 61284.6. Samples: 2021551800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:49,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 03:00:51,925][54818] Updated weights for policy 0, policy_version 556428 (0.0015) [2024-04-28 03:00:54,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61440.1, 300 sec: 61315.1). Total num frames: 9116663808. Throughput: 0: 61322.8. Samples: 2021914360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:54,253][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 03:00:54,277][54818] Updated weights for policy 0, policy_version 556438 (0.0017) [2024-04-28 03:00:57,091][54818] Updated weights for policy 0, policy_version 556448 (0.0016) [2024-04-28 03:00:59,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9116975104. Throughput: 0: 61475.2. Samples: 2022104420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:00:59,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 03:00:59,706][54818] Updated weights for policy 0, policy_version 556458 (0.0015) [2024-04-28 03:01:02,619][54818] Updated weights for policy 0, policy_version 556468 (0.0015) [2024-04-28 03:01:04,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61713.0, 300 sec: 61370.6). Total num frames: 9117286400. Throughput: 0: 61359.4. Samples: 2022469380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:01:04,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 03:01:04,962][54818] Updated weights for policy 0, policy_version 556478 (0.0016) [2024-04-28 03:01:07,908][54818] Updated weights for policy 0, policy_version 556488 (0.0016) [2024-04-28 03:01:09,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9117581312. Throughput: 0: 61351.8. Samples: 2022834960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:01:09,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 03:01:10,204][54818] Updated weights for policy 0, policy_version 556498 (0.0018) [2024-04-28 03:01:11,868][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11800 times) [2024-04-28 03:01:13,219][54818] Updated weights for policy 0, policy_version 556508 (0.0016) [2024-04-28 03:01:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 9117892608. 
Throughput: 0: 61296.6. Samples: 2023018720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:01:14,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 03:01:15,505][54818] Updated weights for policy 0, policy_version 556518 (0.0018) [2024-04-28 03:01:18,480][54818] Updated weights for policy 0, policy_version 556528 (0.0023) [2024-04-28 03:01:19,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9118203904. Throughput: 0: 61213.1. Samples: 2023386960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:01:19,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:01:21,108][54818] Updated weights for policy 0, policy_version 556538 (0.0016) [2024-04-28 03:01:23,707][54818] Updated weights for policy 0, policy_version 556548 (0.0018) [2024-04-28 03:01:24,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61713.3, 300 sec: 61370.6). Total num frames: 9118515200. Throughput: 0: 61236.3. Samples: 2023749980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:01:24,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 03:01:26,915][54818] Updated weights for policy 0, policy_version 556558 (0.0018) [2024-04-28 03:01:29,079][54818] Updated weights for policy 0, policy_version 556568 (0.0018) [2024-04-28 03:01:29,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61713.0, 300 sec: 61426.1). Total num frames: 9118826496. Throughput: 0: 61363.0. Samples: 2023941940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:01:29,255][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 03:01:29,959][54798] Signal inference workers to stop experience collection... (33150 times) [2024-04-28 03:01:29,963][54798] Signal inference workers to resume experience collection... 
(33150 times) [2024-04-28 03:01:29,977][54818] InferenceWorker_p0-w0: stopping experience collection (33150 times) [2024-04-28 03:01:29,977][54818] InferenceWorker_p0-w0: resuming experience collection (33150 times) [2024-04-28 03:01:32,150][54818] Updated weights for policy 0, policy_version 556578 (0.0015) [2024-04-28 03:01:34,191][54818] Updated weights for policy 0, policy_version 556588 (0.0018) [2024-04-28 03:01:34,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61713.2, 300 sec: 61426.1). Total num frames: 9119137792. Throughput: 0: 61210.9. Samples: 2024306280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:01:34,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 03:01:37,429][54818] Updated weights for policy 0, policy_version 556598 (0.0015) [2024-04-28 03:01:38,485][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (11900 times) [2024-04-28 03:01:39,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61986.2, 300 sec: 61426.1). Total num frames: 9119449088. Throughput: 0: 61318.9. Samples: 2024673720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 03:01:39,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 03:01:39,804][54818] Updated weights for policy 0, policy_version 556608 (0.0018) [2024-04-28 03:01:42,526][54818] Updated weights for policy 0, policy_version 556618 (0.0016) [2024-04-28 03:01:44,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9119744000. Throughput: 0: 61320.9. Samples: 2024863860. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:01:44,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 03:01:45,024][54818] Updated weights for policy 0, policy_version 556628 (0.0017) [2024-04-28 03:01:48,046][54818] Updated weights for policy 0, policy_version 556638 (0.0016) [2024-04-28 03:01:49,253][54587] Fps is (10 sec: 58983.3, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 9120038912. 
Throughput: 0: 61222.3. Samples: 2025224380. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:01:49,253][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 03:01:49,347][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000556644_9120055296.pth... [2024-04-28 03:01:49,400][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000555745_9105326080.pth [2024-04-28 03:01:50,261][54818] Updated weights for policy 0, policy_version 556648 (0.0018) [2024-04-28 03:01:53,386][54818] Updated weights for policy 0, policy_version 556658 (0.0018) [2024-04-28 03:01:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61439.8, 300 sec: 61315.1). Total num frames: 9120350208. Throughput: 0: 61298.9. Samples: 2025593420. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:01:54,255][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 03:01:55,433][54818] Updated weights for policy 0, policy_version 556668 (0.0018) [2024-04-28 03:01:58,499][54818] Updated weights for policy 0, policy_version 556678 (0.0017) [2024-04-28 03:01:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9120645120. Throughput: 0: 61370.6. Samples: 2025780400. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:01:59,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 03:02:00,951][54818] Updated weights for policy 0, policy_version 556688 (0.0016) [2024-04-28 03:02:03,788][54818] Updated weights for policy 0, policy_version 556698 (0.0017) [2024-04-28 03:02:04,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9120956416. Throughput: 0: 61402.8. Samples: 2026150080. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:04,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:02:04,975][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (12000 times) [2024-04-28 03:02:06,385][54818] Updated weights for policy 0, policy_version 556708 (0.0017) [2024-04-28 03:02:09,183][54818] Updated weights for policy 0, policy_version 556718 (0.0015) [2024-04-28 03:02:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9121267712. Throughput: 0: 61597.7. Samples: 2026521880. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:09,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 03:02:11,887][54818] Updated weights for policy 0, policy_version 556728 (0.0016) [2024-04-28 03:02:12,705][54798] Signal inference workers to stop experience collection... (33200 times) [2024-04-28 03:02:12,745][54818] InferenceWorker_p0-w0: stopping experience collection (33200 times) [2024-04-28 03:02:12,761][54798] Signal inference workers to resume experience collection... (33200 times) [2024-04-28 03:02:12,762][54818] InferenceWorker_p0-w0: resuming experience collection (33200 times) [2024-04-28 03:02:14,253][54587] Fps is (10 sec: 62257.5, 60 sec: 61439.7, 300 sec: 61370.5). Total num frames: 9121579008. Throughput: 0: 61364.0. Samples: 2026703320. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:14,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 03:02:14,446][54818] Updated weights for policy 0, policy_version 556738 (0.0017) [2024-04-28 03:02:17,214][54818] Updated weights for policy 0, policy_version 556748 (0.0022) [2024-04-28 03:02:19,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9121873920. Throughput: 0: 61453.2. Samples: 2027071680. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:19,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 03:02:19,872][54818] Updated weights for policy 0, policy_version 556758 (0.0015) [2024-04-28 03:02:22,547][54818] Updated weights for policy 0, policy_version 556768 (0.0016) [2024-04-28 03:02:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.7, 300 sec: 61315.0). Total num frames: 9122185216. Throughput: 0: 61499.5. Samples: 2027441200. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:24,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:02:25,031][54818] Updated weights for policy 0, policy_version 556778 (0.0022) [2024-04-28 03:02:27,820][54818] Updated weights for policy 0, policy_version 556788 (0.0016) [2024-04-28 03:02:29,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 9122496512. Throughput: 0: 61359.0. Samples: 2027625020. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:29,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 03:02:30,348][54818] Updated weights for policy 0, policy_version 556798 (0.0016) [2024-04-28 03:02:31,544][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (12100 times) [2024-04-28 03:02:33,023][54818] Updated weights for policy 0, policy_version 556808 (0.0018) [2024-04-28 03:02:34,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9122807808. Throughput: 0: 61662.9. Samples: 2027999220. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:34,255][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 03:02:35,584][54818] Updated weights for policy 0, policy_version 556818 (0.0018) [2024-04-28 03:02:38,651][54818] Updated weights for policy 0, policy_version 556828 (0.0017) [2024-04-28 03:02:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61370.6). Total num frames: 9123102720. 
Throughput: 0: 61652.9. Samples: 2028367800. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:39,255][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 03:02:40,945][54818] Updated weights for policy 0, policy_version 556838 (0.0020) [2024-04-28 03:02:43,789][54818] Updated weights for policy 0, policy_version 556848 (0.0017) [2024-04-28 03:02:44,255][54587] Fps is (10 sec: 60610.2, 60 sec: 61165.1, 300 sec: 61314.7). Total num frames: 9123414016. Throughput: 0: 61477.0. Samples: 2028546980. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:44,256][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 03:02:46,382][54818] Updated weights for policy 0, policy_version 556858 (0.0016) [2024-04-28 03:02:49,154][54818] Updated weights for policy 0, policy_version 556868 (0.0016) [2024-04-28 03:02:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61439.9, 300 sec: 61370.5). Total num frames: 9123725312. Throughput: 0: 61489.1. Samples: 2028917100. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:49,255][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 03:02:49,265][54587] No heartbeat for components: RolloutWorker_w4 (20317 seconds), RolloutWorker_w5 (6417 seconds) [2024-04-28 03:02:51,682][54818] Updated weights for policy 0, policy_version 556878 (0.0017) [2024-04-28 03:02:54,253][54587] Fps is (10 sec: 62270.5, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9124036608. Throughput: 0: 61464.4. Samples: 2029287780. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:54,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 03:02:54,437][54818] Updated weights for policy 0, policy_version 556888 (0.0017) [2024-04-28 03:02:57,000][54818] Updated weights for policy 0, policy_version 556898 (0.0016) [2024-04-28 03:02:58,269][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (12200 times) [2024-04-28 03:02:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9124331520. Throughput: 0: 61556.7. Samples: 2029473360. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:02:59,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 03:02:59,865][54818] Updated weights for policy 0, policy_version 556908 (0.0015) [2024-04-28 03:03:01,748][54798] Signal inference workers to stop experience collection... (33250 times) [2024-04-28 03:03:01,770][54818] InferenceWorker_p0-w0: stopping experience collection (33250 times) [2024-04-28 03:03:01,806][54798] Signal inference workers to resume experience collection... (33250 times) [2024-04-28 03:03:01,806][54818] InferenceWorker_p0-w0: resuming experience collection (33250 times) [2024-04-28 03:03:02,198][54818] Updated weights for policy 0, policy_version 556918 (0.0015) [2024-04-28 03:03:04,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.0, 300 sec: 61426.1). Total num frames: 9124659200. Throughput: 0: 61604.1. Samples: 2029843860. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:03:04,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 03:03:05,205][54818] Updated weights for policy 0, policy_version 556928 (0.0015) [2024-04-28 03:03:07,540][54818] Updated weights for policy 0, policy_version 556938 (0.0017) [2024-04-28 03:03:09,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9124954112. Throughput: 0: 61521.4. Samples: 2030209660. Policy #0 lag: (min: 1.0, avg: 8.6, max: 19.0) [2024-04-28 03:03:09,255][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 03:03:10,393][54818] Updated weights for policy 0, policy_version 556948 (0.0016) [2024-04-28 03:03:12,977][54818] Updated weights for policy 0, policy_version 556958 (0.0015) [2024-04-28 03:03:14,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.1, 300 sec: 61371.7). Total num frames: 9125265408. 
Throughput: 0: 61568.0. Samples: 2030395580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:14,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 03:03:15,765][54818] Updated weights for policy 0, policy_version 556968 (0.0016) [2024-04-28 03:03:18,166][54818] Updated weights for policy 0, policy_version 556978 (0.0015) [2024-04-28 03:03:19,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9125560320. Throughput: 0: 61448.1. Samples: 2030764380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:19,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 03:03:20,982][54818] Updated weights for policy 0, policy_version 556988 (0.0017) [2024-04-28 03:03:23,555][54818] Updated weights for policy 0, policy_version 556998 (0.0018) [2024-04-28 03:03:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61315.1). Total num frames: 9125871616. Throughput: 0: 61576.0. Samples: 2031138720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:24,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 03:03:24,959][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (12300 times) [2024-04-28 03:03:26,231][54818] Updated weights for policy 0, policy_version 557008 (0.0018) [2024-04-28 03:03:29,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9126166528. Throughput: 0: 61583.2. Samples: 2031318120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:29,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 03:03:29,377][54818] Updated weights for policy 0, policy_version 557018 (0.0017) [2024-04-28 03:03:31,403][54818] Updated weights for policy 0, policy_version 557028 (0.0020) [2024-04-28 03:03:34,253][54587] Fps is (10 sec: 58983.3, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 9126461440. Throughput: 0: 61561.1. Samples: 2031687340. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:34,253][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 03:03:34,628][54818] Updated weights for policy 0, policy_version 557038 (0.0017) [2024-04-28 03:03:36,627][54818] Updated weights for policy 0, policy_version 557048 (0.0017) [2024-04-28 03:03:39,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61370.5). Total num frames: 9126789120. Throughput: 0: 61610.6. Samples: 2032060260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:39,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 03:03:39,976][54818] Updated weights for policy 0, policy_version 557058 (0.0016) [2024-04-28 03:03:41,879][54818] Updated weights for policy 0, policy_version 557068 (0.0016) [2024-04-28 03:03:44,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61168.8, 300 sec: 61259.5). Total num frames: 9127084032. Throughput: 0: 61408.5. Samples: 2032236740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:44,253][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 03:03:45,180][54818] Updated weights for policy 0, policy_version 557078 (0.0017) [2024-04-28 03:03:46,693][54798] Signal inference workers to stop experience collection... (33300 times) [2024-04-28 03:03:46,732][54818] InferenceWorker_p0-w0: stopping experience collection (33300 times) [2024-04-28 03:03:46,749][54798] Signal inference workers to resume experience collection... (33300 times) [2024-04-28 03:03:46,749][54818] InferenceWorker_p0-w0: resuming experience collection (33300 times) [2024-04-28 03:03:47,254][54818] Updated weights for policy 0, policy_version 557088 (0.0018) [2024-04-28 03:03:49,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9127395328. Throughput: 0: 61259.0. Samples: 2032600520. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:49,255][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 03:03:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000557092_9127395328.pth... [2024-04-28 03:03:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000556194_9112682496.pth [2024-04-28 03:03:50,627][54818] Updated weights for policy 0, policy_version 557098 (0.0015) [2024-04-28 03:03:51,345][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (12400 times) [2024-04-28 03:03:52,682][54818] Updated weights for policy 0, policy_version 557108 (0.0016) [2024-04-28 03:03:54,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61166.8, 300 sec: 61315.0). Total num frames: 9127706624. Throughput: 0: 61561.3. Samples: 2032979920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:54,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 03:03:55,870][54818] Updated weights for policy 0, policy_version 557118 (0.0018) [2024-04-28 03:03:58,194][54818] Updated weights for policy 0, policy_version 557128 (0.0016) [2024-04-28 03:03:59,258][54587] Fps is (10 sec: 62230.7, 60 sec: 61435.2, 300 sec: 61369.6). Total num frames: 9128017920. Throughput: 0: 61299.5. Samples: 2033154340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:03:59,258][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 03:04:01,132][54818] Updated weights for policy 0, policy_version 557138 (0.0017) [2024-04-28 03:04:03,941][54818] Updated weights for policy 0, policy_version 557148 (0.0018) [2024-04-28 03:04:04,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.8, 300 sec: 61370.6). Total num frames: 9128329216. Throughput: 0: 61249.6. Samples: 2033520620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:04,255][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:04:06,307][54818] Updated weights for policy 0, policy_version 557158 (0.0018) [2024-04-28 03:04:09,114][54818] Updated weights for policy 0, policy_version 557168 (0.0018) [2024-04-28 03:04:09,253][54587] Fps is (10 sec: 62287.3, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9128640512. Throughput: 0: 61403.9. Samples: 2033901900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:09,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 03:04:11,563][54818] Updated weights for policy 0, policy_version 557178 (0.0017) [2024-04-28 03:04:14,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.0, 300 sec: 61370.5). Total num frames: 9128951808. Throughput: 0: 61341.8. Samples: 2034078500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:14,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:04:14,641][54818] Updated weights for policy 0, policy_version 557188 (0.0017) [2024-04-28 03:04:16,768][54818] Updated weights for policy 0, policy_version 557198 (0.0017) [2024-04-28 03:04:17,930][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (12500 times) [2024-04-28 03:04:19,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9129246720. Throughput: 0: 61178.0. Samples: 2034440360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:19,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:04:20,050][54818] Updated weights for policy 0, policy_version 557208 (0.0018) [2024-04-28 03:04:22,002][54818] Updated weights for policy 0, policy_version 557218 (0.0016) [2024-04-28 03:04:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 9129541632. Throughput: 0: 61386.7. Samples: 2034822660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:24,255][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 03:04:25,379][54818] Updated weights for policy 0, policy_version 557228 (0.0016) [2024-04-28 03:04:27,347][54818] Updated weights for policy 0, policy_version 557238 (0.0017) [2024-04-28 03:04:29,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.2, 300 sec: 61370.6). Total num frames: 9129869312. Throughput: 0: 61338.6. Samples: 2034996980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:29,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 03:04:30,651][54818] Updated weights for policy 0, policy_version 557248 (0.0017) [2024-04-28 03:04:32,704][54818] Updated weights for policy 0, policy_version 557258 (0.0017) [2024-04-28 03:04:34,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61713.1, 300 sec: 61370.6). Total num frames: 9130164224. Throughput: 0: 61358.9. Samples: 2035361660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:34,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 03:04:35,562][54798] Signal inference workers to stop experience collection... (33350 times) [2024-04-28 03:04:35,562][54798] Signal inference workers to resume experience collection... (33350 times) [2024-04-28 03:04:35,574][54818] InferenceWorker_p0-w0: stopping experience collection (33350 times) [2024-04-28 03:04:35,574][54818] InferenceWorker_p0-w0: resuming experience collection (33350 times) [2024-04-28 03:04:35,804][54818] Updated weights for policy 0, policy_version 557268 (0.0017) [2024-04-28 03:04:38,363][54818] Updated weights for policy 0, policy_version 557278 (0.0016) [2024-04-28 03:04:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9130475520. Throughput: 0: 61253.6. Samples: 2035736320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:39,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 03:04:41,077][54818] Updated weights for policy 0, policy_version 557288 (0.0018) [2024-04-28 03:04:43,740][54818] Updated weights for policy 0, policy_version 557298 (0.0019) [2024-04-28 03:04:44,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61713.0, 300 sec: 61426.1). Total num frames: 9130786816. Throughput: 0: 61367.2. Samples: 2035915580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:44,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 03:04:45,334][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (12600 times) [2024-04-28 03:04:46,403][54818] Updated weights for policy 0, policy_version 557308 (0.0017) [2024-04-28 03:04:49,138][54818] Updated weights for policy 0, policy_version 557318 (0.0016) [2024-04-28 03:04:49,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61713.1, 300 sec: 61426.1). Total num frames: 9131098112. Throughput: 0: 61324.6. Samples: 2036280220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:49,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:04:51,648][54818] Updated weights for policy 0, policy_version 557328 (0.0017) [2024-04-28 03:04:54,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61713.2, 300 sec: 61426.1). Total num frames: 9131409408. Throughput: 0: 61182.0. Samples: 2036655080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:54,254][54587] Avg episode reward: [(0, '0.701')] [2024-04-28 03:04:55,054][54818] Updated weights for policy 0, policy_version 557338 (0.0018) [2024-04-28 03:04:56,894][54818] Updated weights for policy 0, policy_version 557348 (0.0017) [2024-04-28 03:04:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61444.8, 300 sec: 61426.1). Total num frames: 9131704320. Throughput: 0: 61210.3. Samples: 2036832960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:04:59,254][54587] Avg episode reward: [(0, '0.695')] [2024-04-28 03:05:00,134][54818] Updated weights for policy 0, policy_version 557358 (0.0018) [2024-04-28 03:05:02,197][54818] Updated weights for policy 0, policy_version 557368 (0.0016) [2024-04-28 03:05:04,253][54587] Fps is (10 sec: 60619.8, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9132015616. Throughput: 0: 61271.0. Samples: 2037197560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:04,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 03:05:05,515][54818] Updated weights for policy 0, policy_version 557378 (0.0015) [2024-04-28 03:05:07,634][54818] Updated weights for policy 0, policy_version 557388 (0.0015) [2024-04-28 03:05:09,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.1, 300 sec: 61426.1). Total num frames: 9132326912. Throughput: 0: 60989.8. Samples: 2037567200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:09,255][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 03:05:10,768][54818] Updated weights for policy 0, policy_version 557398 (0.0018) [2024-04-28 03:05:11,709][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (12700 times) [2024-04-28 03:05:13,297][54818] Updated weights for policy 0, policy_version 557408 (0.0017) [2024-04-28 03:05:14,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61440.1, 300 sec: 61426.2). Total num frames: 9132638208. Throughput: 0: 61164.5. Samples: 2037749380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:14,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 03:05:16,057][54818] Updated weights for policy 0, policy_version 557418 (0.0017) [2024-04-28 03:05:16,247][54798] Signal inference workers to stop experience collection... 
(33400 times) [2024-04-28 03:05:16,285][54818] InferenceWorker_p0-w0: stopping experience collection (33400 times) [2024-04-28 03:05:16,299][54798] Signal inference workers to resume experience collection... (33400 times) [2024-04-28 03:05:16,300][54818] InferenceWorker_p0-w0: resuming experience collection (33400 times) [2024-04-28 03:05:18,618][54818] Updated weights for policy 0, policy_version 557428 (0.0016) [2024-04-28 03:05:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.1, 300 sec: 61426.2). Total num frames: 9132933120. Throughput: 0: 61229.3. Samples: 2038116980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:19,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 03:05:21,232][54818] Updated weights for policy 0, policy_version 557438 (0.0016) [2024-04-28 03:05:23,860][54818] Updated weights for policy 0, policy_version 557448 (0.0017) [2024-04-28 03:05:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61713.1, 300 sec: 61426.1). Total num frames: 9133244416. Throughput: 0: 61060.8. Samples: 2038484060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:24,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 03:05:26,544][54818] Updated weights for policy 0, policy_version 557458 (0.0017) [2024-04-28 03:05:29,081][54818] Updated weights for policy 0, policy_version 557468 (0.0018) [2024-04-28 03:05:29,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.9, 300 sec: 61426.1). Total num frames: 9133555712. Throughput: 0: 61204.3. Samples: 2038669780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:29,255][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 03:05:31,868][54818] Updated weights for policy 0, policy_version 557478 (0.0020) [2024-04-28 03:05:34,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.0, 300 sec: 61481.7). Total num frames: 9133867008. Throughput: 0: 61279.1. Samples: 2039037780. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:34,254][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 03:05:35,068][54818] Updated weights for policy 0, policy_version 557488 (0.0017) [2024-04-28 03:05:37,505][54818] Updated weights for policy 0, policy_version 557498 (0.0020) [2024-04-28 03:05:38,546][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (12800 times) [2024-04-28 03:05:39,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9134161920. Throughput: 0: 61118.2. Samples: 2039405400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:39,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 03:05:40,330][54818] Updated weights for policy 0, policy_version 557508 (0.0018) [2024-04-28 03:05:42,890][54818] Updated weights for policy 0, policy_version 557518 (0.0017) [2024-04-28 03:05:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9134473216. Throughput: 0: 61107.5. Samples: 2039582800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:44,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 03:05:45,513][54818] Updated weights for policy 0, policy_version 557528 (0.0017) [2024-04-28 03:05:48,259][54818] Updated weights for policy 0, policy_version 557538 (0.0019) [2024-04-28 03:05:49,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61439.9, 300 sec: 61426.1). Total num frames: 9134784512. Throughput: 0: 61361.8. Samples: 2039958840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:49,254][54587] Avg episode reward: [(0, '0.521')] [2024-04-28 03:05:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000557543_9134784512.pth... 
[2024-04-28 03:05:49,264][54587] No heartbeat for components: RolloutWorker_w4 (20497 seconds), RolloutWorker_w5 (6597 seconds) [2024-04-28 03:05:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000556644_9120055296.pth [2024-04-28 03:05:50,655][54818] Updated weights for policy 0, policy_version 557548 (0.0018) [2024-04-28 03:05:51,632][54798] Signal inference workers to stop experience collection... (33450 times) [2024-04-28 03:05:51,632][54798] Signal inference workers to resume experience collection... (33450 times) [2024-04-28 03:05:51,651][54818] InferenceWorker_p0-w0: stopping experience collection (33450 times) [2024-04-28 03:05:51,651][54818] InferenceWorker_p0-w0: resuming experience collection (33450 times) [2024-04-28 03:05:53,601][54818] Updated weights for policy 0, policy_version 557558 (0.0018) [2024-04-28 03:05:54,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9135079424. Throughput: 0: 61154.8. Samples: 2040319160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:54,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 03:05:56,340][54818] Updated weights for policy 0, policy_version 557568 (0.0016) [2024-04-28 03:05:58,793][54818] Updated weights for policy 0, policy_version 557578 (0.0016) [2024-04-28 03:05:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9135390720. Throughput: 0: 61195.9. Samples: 2040503200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:05:59,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 03:06:01,575][54818] Updated weights for policy 0, policy_version 557588 (0.0016) [2024-04-28 03:06:04,068][54818] Updated weights for policy 0, policy_version 557598 (0.0015) [2024-04-28 03:06:04,253][54587] Fps is (10 sec: 60619.2, 60 sec: 61166.9, 300 sec: 61370.5). Total num frames: 9135685632. Throughput: 0: 61196.2. Samples: 2040870820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:06:04,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 03:06:05,259][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (12900 times) [2024-04-28 03:06:06,933][54818] Updated weights for policy 0, policy_version 557608 (0.0017) [2024-04-28 03:06:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.8, 300 sec: 61370.5). Total num frames: 9135996928. Throughput: 0: 61306.9. Samples: 2041242880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:06:09,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 03:06:09,387][54818] Updated weights for policy 0, policy_version 557618 (0.0016) [2024-04-28 03:06:12,150][54818] Updated weights for policy 0, policy_version 557628 (0.0015) [2024-04-28 03:06:14,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9136308224. Throughput: 0: 61327.3. Samples: 2041429500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:06:14,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 03:06:14,866][54818] Updated weights for policy 0, policy_version 557638 (0.0017) [2024-04-28 03:06:17,614][54818] Updated weights for policy 0, policy_version 557648 (0.0016) [2024-04-28 03:06:19,253][54587] Fps is (10 sec: 60622.1, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 9136603136. Throughput: 0: 61300.1. Samples: 2041796280. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:06:19,253][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 03:06:20,000][54818] Updated weights for policy 0, policy_version 557658 (0.0016) [2024-04-28 03:06:22,954][54818] Updated weights for policy 0, policy_version 557668 (0.0016) [2024-04-28 03:06:24,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 9136898048. Throughput: 0: 61322.8. Samples: 2042164920. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:06:24,253][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 03:06:25,246][54818] Updated weights for policy 0, policy_version 557678 (0.0017) [2024-04-28 03:06:28,148][54818] Updated weights for policy 0, policy_version 557688 (0.0016) [2024-04-28 03:06:29,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9137209344. Throughput: 0: 61354.6. Samples: 2042343760. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:06:29,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-28 03:06:30,470][54818] Updated weights for policy 0, policy_version 557698 (0.0016) [2024-04-28 03:06:32,196][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (13000 times) [2024-04-28 03:06:33,746][54818] Updated weights for policy 0, policy_version 557708 (0.0016) [2024-04-28 03:06:34,253][54587] Fps is (10 sec: 62258.4, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9137520640. Throughput: 0: 61314.8. Samples: 2042718000. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:06:34,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 03:06:36,205][54818] Updated weights for policy 0, policy_version 557718 (0.0017) [2024-04-28 03:06:39,150][54818] Updated weights for policy 0, policy_version 557728 (0.0018) [2024-04-28 03:06:39,222][54798] Signal inference workers to stop experience collection... (33500 times) [2024-04-28 03:06:39,253][54818] InferenceWorker_p0-w0: stopping experience collection (33500 times) [2024-04-28 03:06:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9137815552. Throughput: 0: 61533.6. Samples: 2043088180. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:06:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 03:06:39,281][54798] Signal inference workers to resume experience collection... 
(33500 times) [2024-04-28 03:06:39,281][54818] InferenceWorker_p0-w0: resuming experience collection (33500 times) [2024-04-28 03:06:41,510][54818] Updated weights for policy 0, policy_version 557738 (0.0018) [2024-04-28 03:06:44,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60894.0, 300 sec: 61315.0). Total num frames: 9138126848. Throughput: 0: 61421.1. Samples: 2043267140. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:06:44,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:06:44,314][54818] Updated weights for policy 0, policy_version 557748 (0.0016) [2024-04-28 03:06:46,964][54818] Updated weights for policy 0, policy_version 557758 (0.0016) [2024-04-28 03:06:49,253][54587] Fps is (10 sec: 62260.0, 60 sec: 60894.1, 300 sec: 61315.1). Total num frames: 9138438144. Throughput: 0: 61434.1. Samples: 2043635340. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:06:49,253][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 03:06:49,523][54818] Updated weights for policy 0, policy_version 557768 (0.0017) [2024-04-28 03:06:52,398][54818] Updated weights for policy 0, policy_version 557778 (0.0017) [2024-04-28 03:06:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9138749440. Throughput: 0: 61419.9. Samples: 2044006760. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:06:54,253][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 03:06:54,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (700 times) [2024-04-28 03:06:54,647][54818] Updated weights for policy 0, policy_version 557788 (0.0018) [2024-04-28 03:06:57,597][54818] Updated weights for policy 0, policy_version 557798 (0.0016) [2024-04-28 03:06:58,853][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (13100 times) [2024-04-28 03:06:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60894.0, 300 sec: 61315.0). 
Total num frames: 9139044352. Throughput: 0: 61404.6. Samples: 2044192700. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:06:59,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 03:06:59,834][54818] Updated weights for policy 0, policy_version 557808 (0.0018) [2024-04-28 03:07:03,305][54818] Updated weights for policy 0, policy_version 557818 (0.0017) [2024-04-28 03:07:04,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61167.1, 300 sec: 61315.0). Total num frames: 9139355648. Throughput: 0: 61313.2. Samples: 2044555380. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:07:04,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 03:07:05,120][54818] Updated weights for policy 0, policy_version 557828 (0.0018) [2024-04-28 03:07:08,552][54818] Updated weights for policy 0, policy_version 557838 (0.0017) [2024-04-28 03:07:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61167.2, 300 sec: 61315.1). Total num frames: 9139666944. Throughput: 0: 61441.3. Samples: 2044929780. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:07:09,253][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 03:07:10,593][54818] Updated weights for policy 0, policy_version 557848 (0.0015) [2024-04-28 03:07:13,778][54818] Updated weights for policy 0, policy_version 557858 (0.0017) [2024-04-28 03:07:14,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9139978240. Throughput: 0: 61434.7. Samples: 2045108320. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:07:14,255][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 03:07:15,995][54818] Updated weights for policy 0, policy_version 557868 (0.0018) [2024-04-28 03:07:19,066][54818] Updated weights for policy 0, policy_version 557878 (0.0017) [2024-04-28 03:07:19,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61166.8, 300 sec: 61315.0). Total num frames: 9140273152. Throughput: 0: 61331.0. Samples: 2045477900. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:07:19,255][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 03:07:19,765][54798] Signal inference workers to stop experience collection... (33550 times) [2024-04-28 03:07:19,765][54798] Signal inference workers to resume experience collection... (33550 times) [2024-04-28 03:07:19,773][54818] InferenceWorker_p0-w0: stopping experience collection (33550 times) [2024-04-28 03:07:19,782][54818] InferenceWorker_p0-w0: resuming experience collection (33550 times) [2024-04-28 03:07:21,816][54818] Updated weights for policy 0, policy_version 557888 (0.0016) [2024-04-28 03:07:24,253][54587] Fps is (10 sec: 58983.3, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9140568064. Throughput: 0: 61372.6. Samples: 2045849940. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:07:24,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 03:07:24,409][54818] Updated weights for policy 0, policy_version 557898 (0.0017) [2024-04-28 03:07:25,161][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (13200 times) [2024-04-28 03:07:27,118][54818] Updated weights for policy 0, policy_version 557908 (0.0016) [2024-04-28 03:07:29,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9140879360. Throughput: 0: 61262.1. Samples: 2046023940. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:07:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 03:07:29,653][54818] Updated weights for policy 0, policy_version 557918 (0.0020) [2024-04-28 03:07:32,731][54818] Updated weights for policy 0, policy_version 557928 (0.0016) [2024-04-28 03:07:34,255][54587] Fps is (10 sec: 60612.7, 60 sec: 60892.6, 300 sec: 61259.2). Total num frames: 9141174272. Throughput: 0: 61408.9. Samples: 2046398820. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:07:34,255][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 03:07:34,824][54818] Updated weights for policy 0, policy_version 557938 (0.0016) [2024-04-28 03:07:38,114][54818] Updated weights for policy 0, policy_version 557948 (0.0016) [2024-04-28 03:07:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61259.9). Total num frames: 9141485568. Throughput: 0: 61240.3. Samples: 2046762580. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:07:39,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 03:07:40,123][54818] Updated weights for policy 0, policy_version 557958 (0.0017) [2024-04-28 03:07:43,582][54818] Updated weights for policy 0, policy_version 557968 (0.0018) [2024-04-28 03:07:44,253][54587] Fps is (10 sec: 60627.8, 60 sec: 60893.7, 300 sec: 61204.0). Total num frames: 9141780480. Throughput: 0: 60949.6. Samples: 2046935440. Policy #0 lag: (min: 2.0, avg: 11.0, max: 21.0) [2024-04-28 03:07:44,256][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 03:07:45,355][54818] Updated weights for policy 0, policy_version 557978 (0.0018) [2024-04-28 03:07:49,051][54818] Updated weights for policy 0, policy_version 557988 (0.0016) [2024-04-28 03:07:49,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9142091776. Throughput: 0: 61189.5. Samples: 2047308900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:07:49,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 03:07:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000557989_9142091776.pth... [2024-04-28 03:07:49,328][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000557092_9127395328.pth [2024-04-28 03:07:50,580][54818] Updated weights for policy 0, policy_version 557998 (0.0020) [2024-04-28 03:07:52,871][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (13300 times) [2024-04-28 03:07:54,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60620.7, 300 sec: 61204.0). Total num frames: 9142386688. Throughput: 0: 61055.0. Samples: 2047677260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:07:54,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:07:54,448][54818] Updated weights for policy 0, policy_version 558008 (0.0016) [2024-04-28 03:07:54,765][54798] Signal inference workers to stop experience collection... (33600 times) [2024-04-28 03:07:54,799][54818] InferenceWorker_p0-w0: stopping experience collection (33600 times) [2024-04-28 03:07:54,825][54798] Signal inference workers to resume experience collection... (33600 times) [2024-04-28 03:07:54,826][54818] InferenceWorker_p0-w0: resuming experience collection (33600 times) [2024-04-28 03:07:55,705][54818] Updated weights for policy 0, policy_version 558018 (0.0020) [2024-04-28 03:07:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9142697984. Throughput: 0: 60893.1. Samples: 2047848500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:07:59,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 03:07:59,488][54818] Updated weights for policy 0, policy_version 558028 (0.0018) [2024-04-28 03:08:01,763][54818] Updated weights for policy 0, policy_version 558038 (0.0015) [2024-04-28 03:08:04,253][54587] Fps is (10 sec: 62259.8, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9143009280. Throughput: 0: 61173.6. Samples: 2048230700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:04,253][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 03:08:04,795][54818] Updated weights for policy 0, policy_version 558048 (0.0020) [2024-04-28 03:08:07,840][54818] Updated weights for policy 0, policy_version 558058 (0.0018) [2024-04-28 03:08:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60620.8, 300 sec: 61148.4). 
Total num frames: 9143304192. Throughput: 0: 60951.1. Samples: 2048592740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:09,253][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 03:08:09,949][54818] Updated weights for policy 0, policy_version 558068 (0.0017) [2024-04-28 03:08:13,085][54818] Updated weights for policy 0, policy_version 558078 (0.0019) [2024-04-28 03:08:14,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.9, 300 sec: 61204.0). Total num frames: 9143615488. Throughput: 0: 60829.0. Samples: 2048761240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:14,253][54587] Avg episode reward: [(0, '0.655')] [2024-04-28 03:08:15,095][54818] Updated weights for policy 0, policy_version 558088 (0.0018) [2024-04-28 03:08:18,292][54818] Updated weights for policy 0, policy_version 558098 (0.0017) [2024-04-28 03:08:19,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60894.1, 300 sec: 61204.0). Total num frames: 9143926784. Throughput: 0: 60975.1. Samples: 2049142620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:19,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 03:08:19,449][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (13400 times) [2024-04-28 03:08:20,431][54818] Updated weights for policy 0, policy_version 558108 (0.0016) [2024-04-28 03:08:23,606][54818] Updated weights for policy 0, policy_version 558118 (0.0015) [2024-04-28 03:08:24,253][54587] Fps is (10 sec: 63897.8, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9144254464. Throughput: 0: 60984.2. Samples: 2049506860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:24,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 03:08:25,933][54818] Updated weights for policy 0, policy_version 558128 (0.0019) [2024-04-28 03:08:26,541][54798] Signal inference workers to stop experience collection... 
(33650 times) [2024-04-28 03:08:26,541][54798] Signal inference workers to resume experience collection... (33650 times) [2024-04-28 03:08:26,567][54818] InferenceWorker_p0-w0: stopping experience collection (33650 times) [2024-04-28 03:08:26,567][54818] InferenceWorker_p0-w0: resuming experience collection (33650 times) [2024-04-28 03:08:28,996][54818] Updated weights for policy 0, policy_version 558138 (0.0017) [2024-04-28 03:08:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9144532992. Throughput: 0: 60885.1. Samples: 2049675260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:29,255][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 03:08:31,086][54818] Updated weights for policy 0, policy_version 558148 (0.0018) [2024-04-28 03:08:34,253][54587] Fps is (10 sec: 58981.8, 60 sec: 61168.2, 300 sec: 61204.0). Total num frames: 9144844288. Throughput: 0: 60983.0. Samples: 2050053140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:34,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 03:08:34,468][54818] Updated weights for policy 0, policy_version 558158 (0.0016) [2024-04-28 03:08:36,376][54818] Updated weights for policy 0, policy_version 558168 (0.0018) [2024-04-28 03:08:39,253][54587] Fps is (10 sec: 60620.0, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 9145139200. Throughput: 0: 61058.1. Samples: 2050424880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:39,255][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 03:08:39,645][54818] Updated weights for policy 0, policy_version 558178 (0.0017) [2024-04-28 03:08:41,761][54818] Updated weights for policy 0, policy_version 558188 (0.0017) [2024-04-28 03:08:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9145450496. Throughput: 0: 61039.5. Samples: 2050595280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:44,255][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 03:08:44,800][54818] Updated weights for policy 0, policy_version 558198 (0.0016) [2024-04-28 03:08:45,827][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (13500 times) [2024-04-28 03:08:47,378][54818] Updated weights for policy 0, policy_version 558208 (0.0017) [2024-04-28 03:08:49,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9145761792. Throughput: 0: 60850.5. Samples: 2050968980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:49,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 03:08:49,262][54587] No heartbeat for components: RolloutWorker_w4 (20677 seconds), RolloutWorker_w5 (6777 seconds) [2024-04-28 03:08:50,016][54818] Updated weights for policy 0, policy_version 558218 (0.0017) [2024-04-28 03:08:53,292][54818] Updated weights for policy 0, policy_version 558228 (0.0017) [2024-04-28 03:08:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61149.4). Total num frames: 9146056704. Throughput: 0: 61173.6. Samples: 2051345560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:54,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 03:08:55,570][54818] Updated weights for policy 0, policy_version 558238 (0.0022) [2024-04-28 03:08:58,637][54818] Updated weights for policy 0, policy_version 558248 (0.0015) [2024-04-28 03:08:59,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61439.8, 300 sec: 61204.0). Total num frames: 9146384384. Throughput: 0: 61151.3. Samples: 2051513060. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:08:59,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-28 03:09:00,997][54818] Updated weights for policy 0, policy_version 558258 (0.0017) [2024-04-28 03:09:03,932][54818] Updated weights for policy 0, policy_version 558268 (0.0017) [2024-04-28 03:09:04,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.7, 300 sec: 61148.4). Total num frames: 9146679296. Throughput: 0: 60913.1. Samples: 2051883720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:09:04,255][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 03:09:06,114][54818] Updated weights for policy 0, policy_version 558278 (0.0017) [2024-04-28 03:09:08,911][54798] Signal inference workers to stop experience collection... (33700 times) [2024-04-28 03:09:08,944][54818] InferenceWorker_p0-w0: stopping experience collection (33700 times) [2024-04-28 03:09:08,963][54798] Signal inference workers to resume experience collection... (33700 times) [2024-04-28 03:09:08,963][54818] InferenceWorker_p0-w0: resuming experience collection (33700 times) [2024-04-28 03:09:09,216][54818] Updated weights for policy 0, policy_version 558288 (0.0016) [2024-04-28 03:09:09,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9146990592. Throughput: 0: 61163.1. Samples: 2052259200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:09:09,254][54587] Avg episode reward: [(0, '0.532')] [2024-04-28 03:09:11,418][54818] Updated weights for policy 0, policy_version 558298 (0.0018) [2024-04-28 03:09:13,297][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (13600 times) [2024-04-28 03:09:14,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.8, 300 sec: 61203.9). Total num frames: 9147301888. Throughput: 0: 61274.9. Samples: 2052432640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 03:09:14,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 03:09:14,463][54818] Updated weights for policy 0, policy_version 558308 (0.0018) [2024-04-28 03:09:17,015][54818] Updated weights for policy 0, policy_version 558318 (0.0016) [2024-04-28 03:09:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9147596800. Throughput: 0: 60854.3. Samples: 2052791580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:09:19,254][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 03:09:19,639][54818] Updated weights for policy 0, policy_version 558328 (0.0016) [2024-04-28 03:09:22,393][54818] Updated weights for policy 0, policy_version 558338 (0.0016) [2024-04-28 03:09:24,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9147908096. Throughput: 0: 60996.5. Samples: 2053169720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:09:24,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:09:24,908][54818] Updated weights for policy 0, policy_version 558348 (0.0017) [2024-04-28 03:09:27,683][54818] Updated weights for policy 0, policy_version 558358 (0.0016) [2024-04-28 03:09:29,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9148219392. Throughput: 0: 61122.9. Samples: 2053345820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:09:29,255][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 03:09:30,375][54818] Updated weights for policy 0, policy_version 558368 (0.0016) [2024-04-28 03:09:33,426][54818] Updated weights for policy 0, policy_version 558378 (0.0017) [2024-04-28 03:09:34,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9148530688. Throughput: 0: 60962.1. Samples: 2053712280. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:09:34,255][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 03:09:35,811][54818] Updated weights for policy 0, policy_version 558388 (0.0016) [2024-04-28 03:09:38,592][54818] Updated weights for policy 0, policy_version 558398 (0.0017) [2024-04-28 03:09:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9148825600. Throughput: 0: 60671.2. Samples: 2054075760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:09:39,255][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 03:09:39,655][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (13700 times) [2024-04-28 03:09:41,474][54818] Updated weights for policy 0, policy_version 558408 (0.0016) [2024-04-28 03:09:43,986][54818] Updated weights for policy 0, policy_version 558418 (0.0015) [2024-04-28 03:09:44,255][54587] Fps is (10 sec: 60612.7, 60 sec: 61438.5, 300 sec: 61148.1). Total num frames: 9149136896. Throughput: 0: 61207.6. Samples: 2054267480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:09:44,256][54587] Avg episode reward: [(0, '0.677')] [2024-04-28 03:09:46,723][54818] Updated weights for policy 0, policy_version 558428 (0.0017) [2024-04-28 03:09:49,182][54818] Updated weights for policy 0, policy_version 558438 (0.0016) [2024-04-28 03:09:49,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61439.8, 300 sec: 61148.4). Total num frames: 9149448192. Throughput: 0: 61170.6. Samples: 2054636400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:09:49,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 03:09:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000558438_9149448192.pth... 
[2024-04-28 03:09:49,329][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000557543_9134784512.pth [2024-04-28 03:09:51,976][54818] Updated weights for policy 0, policy_version 558448 (0.0017) [2024-04-28 03:09:54,020][54798] Signal inference workers to stop experience collection... (33750 times) [2024-04-28 03:09:54,024][54798] Signal inference workers to resume experience collection... (33750 times) [2024-04-28 03:09:54,036][54818] InferenceWorker_p0-w0: stopping experience collection (33750 times) [2024-04-28 03:09:54,036][54818] InferenceWorker_p0-w0: resuming experience collection (33750 times) [2024-04-28 03:09:54,253][54587] Fps is (10 sec: 62268.4, 60 sec: 61713.2, 300 sec: 61204.0). Total num frames: 9149759488. Throughput: 0: 60879.6. Samples: 2054998780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:09:54,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 03:09:54,427][54818] Updated weights for policy 0, policy_version 558458 (0.0015) [2024-04-28 03:09:57,229][54818] Updated weights for policy 0, policy_version 558468 (0.0017) [2024-04-28 03:09:59,253][54587] Fps is (10 sec: 60622.3, 60 sec: 61167.1, 300 sec: 61148.5). Total num frames: 9150054400. Throughput: 0: 61299.9. Samples: 2055191120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:09:59,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 03:09:59,960][54818] Updated weights for policy 0, policy_version 558478 (0.0017) [2024-04-28 03:10:02,956][54818] Updated weights for policy 0, policy_version 558488 (0.0016) [2024-04-28 03:10:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9150365696. Throughput: 0: 61326.6. Samples: 2055551280. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:10:04,253][54587] Avg episode reward: [(0, '0.502')] [2024-04-28 03:10:05,151][54818] Updated weights for policy 0, policy_version 558498 (0.0016) [2024-04-28 03:10:06,770][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (13800 times) [2024-04-28 03:10:08,050][54818] Updated weights for policy 0, policy_version 558508 (0.0016) [2024-04-28 03:10:09,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9150676992. Throughput: 0: 61016.5. Samples: 2055915460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:10:09,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 03:10:10,473][54818] Updated weights for policy 0, policy_version 558518 (0.0020) [2024-04-28 03:10:13,392][54818] Updated weights for policy 0, policy_version 558528 (0.0017) [2024-04-28 03:10:14,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.2, 300 sec: 61204.0). Total num frames: 9150988288. Throughput: 0: 61311.3. Samples: 2056104820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:10:14,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 03:10:16,089][54818] Updated weights for policy 0, policy_version 558538 (0.0017) [2024-04-28 03:10:18,556][54818] Updated weights for policy 0, policy_version 558548 (0.0017) [2024-04-28 03:10:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61713.0, 300 sec: 61204.0). Total num frames: 9151299584. Throughput: 0: 61293.4. Samples: 2056470480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:10:19,255][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 03:10:21,361][54818] Updated weights for policy 0, policy_version 558558 (0.0017) [2024-04-28 03:10:23,722][54818] Updated weights for policy 0, policy_version 558568 (0.0020) [2024-04-28 03:10:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.1, 300 sec: 61148.5). Total num frames: 9151594496. 
Throughput: 0: 61112.5. Samples: 2056825820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:10:24,253][54587] Avg episode reward: [(0, '0.498')] [2024-04-28 03:10:27,028][54818] Updated weights for policy 0, policy_version 558578 (0.0015) [2024-04-28 03:10:29,234][54818] Updated weights for policy 0, policy_version 558588 (0.0015) [2024-04-28 03:10:29,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9151905792. Throughput: 0: 61186.9. Samples: 2057020800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:10:29,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:10:32,238][54818] Updated weights for policy 0, policy_version 558598 (0.0017) [2024-04-28 03:10:33,261][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (13900 times) [2024-04-28 03:10:34,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9152217088. Throughput: 0: 61139.3. Samples: 2057387660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:10:34,254][54587] Avg episode reward: [(0, '0.680')] [2024-04-28 03:10:34,528][54818] Updated weights for policy 0, policy_version 558608 (0.0018) [2024-04-28 03:10:37,639][54818] Updated weights for policy 0, policy_version 558618 (0.0016) [2024-04-28 03:10:38,314][54798] Signal inference workers to stop experience collection... (33800 times) [2024-04-28 03:10:38,354][54818] InferenceWorker_p0-w0: stopping experience collection (33800 times) [2024-04-28 03:10:38,405][54798] Signal inference workers to resume experience collection... (33800 times) [2024-04-28 03:10:38,405][54818] InferenceWorker_p0-w0: resuming experience collection (33800 times) [2024-04-28 03:10:39,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61713.1, 300 sec: 61204.0). Total num frames: 9152528384. Throughput: 0: 61020.8. Samples: 2057744720. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:10:39,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 03:10:39,902][54818] Updated weights for policy 0, policy_version 558628 (0.0015) [2024-04-28 03:10:42,959][54818] Updated weights for policy 0, policy_version 558638 (0.0016) [2024-04-28 03:10:44,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61714.4, 300 sec: 61204.0). Total num frames: 9152839680. Throughput: 0: 61115.4. Samples: 2057941320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 03:10:44,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 03:10:45,393][54818] Updated weights for policy 0, policy_version 558648 (0.0019) [2024-04-28 03:10:48,205][54818] Updated weights for policy 0, policy_version 558658 (0.0020) [2024-04-28 03:10:49,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9153150976. Throughput: 0: 61304.7. Samples: 2058310000. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:10:49,255][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:10:50,877][54818] Updated weights for policy 0, policy_version 558668 (0.0016) [2024-04-28 03:10:53,390][54818] Updated weights for policy 0, policy_version 558678 (0.0016) [2024-04-28 03:10:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 9153462272. Throughput: 0: 61149.2. Samples: 2058667180. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:10:54,254][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 03:10:56,306][54818] Updated weights for policy 0, policy_version 558688 (0.0016) [2024-04-28 03:10:58,632][54818] Updated weights for policy 0, policy_version 558698 (0.0017) [2024-04-28 03:10:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9153757184. Throughput: 0: 61323.9. Samples: 2058864400. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:10:59,255][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 03:10:59,840][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14000 times) [2024-04-28 03:11:01,545][54818] Updated weights for policy 0, policy_version 558708 (0.0017) [2024-04-28 03:11:03,894][54818] Updated weights for policy 0, policy_version 558718 (0.0018) [2024-04-28 03:11:04,253][54587] Fps is (10 sec: 58983.1, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9154052096. Throughput: 0: 61273.9. Samples: 2059227800. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:04,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 03:11:07,332][54818] Updated weights for policy 0, policy_version 558728 (0.0016) [2024-04-28 03:11:09,216][54818] Updated weights for policy 0, policy_version 558738 (0.0015) [2024-04-28 03:11:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9154363392. Throughput: 0: 61272.7. Samples: 2059583100. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:09,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 03:11:12,704][54818] Updated weights for policy 0, policy_version 558748 (0.0016) [2024-04-28 03:11:14,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9154658304. Throughput: 0: 61204.9. Samples: 2059775020. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:14,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 03:11:14,468][54818] Updated weights for policy 0, policy_version 558758 (0.0018) [2024-04-28 03:11:18,029][54818] Updated weights for policy 0, policy_version 558768 (0.0016) [2024-04-28 03:11:18,236][54798] Signal inference workers to stop experience collection... 
(33850 times) [2024-04-28 03:11:18,276][54818] InferenceWorker_p0-w0: stopping experience collection (33850 times) [2024-04-28 03:11:18,331][54798] Signal inference workers to resume experience collection... (33850 times) [2024-04-28 03:11:18,331][54818] InferenceWorker_p0-w0: resuming experience collection (33850 times) [2024-04-28 03:11:19,254][54587] Fps is (10 sec: 60615.2, 60 sec: 61165.9, 300 sec: 61259.3). Total num frames: 9154969600. Throughput: 0: 61285.8. Samples: 2060145580. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:19,255][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 03:11:19,929][54818] Updated weights for policy 0, policy_version 558778 (0.0016) [2024-04-28 03:11:23,158][54818] Updated weights for policy 0, policy_version 558788 (0.0016) [2024-04-28 03:11:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9155264512. Throughput: 0: 61425.0. Samples: 2060508840. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:24,253][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 03:11:25,513][54818] Updated weights for policy 0, policy_version 558798 (0.0019) [2024-04-28 03:11:27,039][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14100 times) [2024-04-28 03:11:28,362][54818] Updated weights for policy 0, policy_version 558808 (0.0018) [2024-04-28 03:11:29,253][54587] Fps is (10 sec: 60626.5, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 9155575808. Throughput: 0: 61093.8. Samples: 2060690540. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:29,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 03:11:30,925][54818] Updated weights for policy 0, policy_version 558818 (0.0017) [2024-04-28 03:11:33,535][54818] Updated weights for policy 0, policy_version 558828 (0.0018) [2024-04-28 03:11:34,253][54587] Fps is (10 sec: 60619.3, 60 sec: 60893.7, 300 sec: 61203.9). 
Total num frames: 9155870720. Throughput: 0: 61268.8. Samples: 2061067100. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:34,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 03:11:36,484][54818] Updated weights for policy 0, policy_version 558838 (0.0016) [2024-04-28 03:11:38,807][54818] Updated weights for policy 0, policy_version 558848 (0.0017) [2024-04-28 03:11:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 9156182016. Throughput: 0: 61293.9. Samples: 2061425400. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:39,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 03:11:41,815][54818] Updated weights for policy 0, policy_version 558858 (0.0015) [2024-04-28 03:11:44,139][54818] Updated weights for policy 0, policy_version 558868 (0.0019) [2024-04-28 03:11:44,253][54587] Fps is (10 sec: 62260.4, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9156493312. Throughput: 0: 61097.9. Samples: 2061613800. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 03:11:47,102][54818] Updated weights for policy 0, policy_version 558878 (0.0016) [2024-04-28 03:11:49,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60621.0, 300 sec: 61148.4). Total num frames: 9156788224. Throughput: 0: 61338.7. Samples: 2061988040. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:49,253][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 03:11:49,262][54587] No heartbeat for components: RolloutWorker_w4 (20857 seconds), RolloutWorker_w5 (6957 seconds) [2024-04-28 03:11:49,280][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000558887_9156804608.pth... 
[2024-04-28 03:11:49,340][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000557989_9142091776.pth [2024-04-28 03:11:49,488][54818] Updated weights for policy 0, policy_version 558888 (0.0019) [2024-04-28 03:11:49,965][54798] Signal inference workers to stop experience collection... (33900 times) [2024-04-28 03:11:49,965][54798] Signal inference workers to resume experience collection... (33900 times) [2024-04-28 03:11:49,984][54818] InferenceWorker_p0-w0: stopping experience collection (33900 times) [2024-04-28 03:11:49,985][54818] InferenceWorker_p0-w0: resuming experience collection (33900 times) [2024-04-28 03:11:52,911][54818] Updated weights for policy 0, policy_version 558898 (0.0017) [2024-04-28 03:11:53,481][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14200 times) [2024-04-28 03:11:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.9, 300 sec: 61204.0). Total num frames: 9157099520. Throughput: 0: 61246.8. Samples: 2062339200. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:54,253][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 03:11:54,751][54818] Updated weights for policy 0, policy_version 558908 (0.0020) [2024-04-28 03:11:58,250][54818] Updated weights for policy 0, policy_version 558918 (0.0018) [2024-04-28 03:11:59,253][54587] Fps is (10 sec: 62259.3, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9157410816. Throughput: 0: 61202.7. Samples: 2062529140. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:11:59,254][54587] Avg episode reward: [(0, '0.715')] [2024-04-28 03:12:00,135][54818] Updated weights for policy 0, policy_version 558928 (0.0017) [2024-04-28 03:12:03,415][54818] Updated weights for policy 0, policy_version 558938 (0.0019) [2024-04-28 03:12:04,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9157722112. Throughput: 0: 61142.6. Samples: 2062896940. 
Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:12:04,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 03:12:05,616][54818] Updated weights for policy 0, policy_version 558948 (0.0019) [2024-04-28 03:12:08,603][54818] Updated weights for policy 0, policy_version 558958 (0.0015) [2024-04-28 03:12:09,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9158017024. Throughput: 0: 61137.6. Samples: 2063260040. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:12:09,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 03:12:11,084][54818] Updated weights for policy 0, policy_version 558968 (0.0016) [2024-04-28 03:12:13,771][54818] Updated weights for policy 0, policy_version 558978 (0.0017) [2024-04-28 03:12:14,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9158328320. Throughput: 0: 61167.7. Samples: 2063443080. Policy #0 lag: (min: 0.0, avg: 7.5, max: 20.0) [2024-04-28 03:12:14,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:12:16,621][54818] Updated weights for policy 0, policy_version 558988 (0.0017) [2024-04-28 03:12:19,162][54818] Updated weights for policy 0, policy_version 558998 (0.0018) [2024-04-28 03:12:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60894.8, 300 sec: 61203.9). Total num frames: 9158623232. Throughput: 0: 61099.3. Samples: 2063816560. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:12:19,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 03:12:19,994][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14300 times) [2024-04-28 03:12:21,949][54818] Updated weights for policy 0, policy_version 559008 (0.0018) [2024-04-28 03:12:24,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61166.7, 300 sec: 61203.9). Total num frames: 9158934528. Throughput: 0: 61189.6. Samples: 2064178940. 
Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:12:24,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 03:12:24,444][54818] Updated weights for policy 0, policy_version 559018 (0.0018) [2024-04-28 03:12:27,333][54818] Updated weights for policy 0, policy_version 559028 (0.0019) [2024-04-28 03:12:28,167][54798] Signal inference workers to stop experience collection... (33950 times) [2024-04-28 03:12:28,206][54818] InferenceWorker_p0-w0: stopping experience collection (33950 times) [2024-04-28 03:12:28,262][54798] Signal inference workers to resume experience collection... (33950 times) [2024-04-28 03:12:28,262][54818] InferenceWorker_p0-w0: resuming experience collection (33950 times) [2024-04-28 03:12:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61204.2). Total num frames: 9159229440. Throughput: 0: 60994.1. Samples: 2064358540. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:12:29,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 03:12:29,730][54818] Updated weights for policy 0, policy_version 559038 (0.0015) [2024-04-28 03:12:33,043][54818] Updated weights for policy 0, policy_version 559048 (0.0016) [2024-04-28 03:12:34,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9159540736. Throughput: 0: 60774.1. Samples: 2064722880. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:12:34,255][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 03:12:35,184][54818] Updated weights for policy 0, policy_version 559058 (0.0019) [2024-04-28 03:12:38,373][54818] Updated weights for policy 0, policy_version 559068 (0.0016) [2024-04-28 03:12:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9159835648. Throughput: 0: 61199.6. Samples: 2065093180. 
Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:12:39,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:12:40,486][54818] Updated weights for policy 0, policy_version 559078 (0.0016) [2024-04-28 03:12:43,602][54818] Updated weights for policy 0, policy_version 559088 (0.0017) [2024-04-28 03:12:44,253][54587] Fps is (10 sec: 58982.9, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9160130560. Throughput: 0: 60807.5. Samples: 2065265480. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:12:44,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:12:45,948][54818] Updated weights for policy 0, policy_version 559098 (0.0017) [2024-04-28 03:12:47,480][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14400 times) [2024-04-28 03:12:49,034][54818] Updated weights for policy 0, policy_version 559108 (0.0015) [2024-04-28 03:12:49,255][54587] Fps is (10 sec: 60607.4, 60 sec: 60891.6, 300 sec: 61203.5). Total num frames: 9160441856. Throughput: 0: 60942.1. Samples: 2065639460. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:12:49,256][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:12:51,291][54818] Updated weights for policy 0, policy_version 559118 (0.0016) [2024-04-28 03:12:54,178][54818] Updated weights for policy 0, policy_version 559128 (0.0015) [2024-04-28 03:12:54,253][54587] Fps is (10 sec: 62258.2, 60 sec: 60893.7, 300 sec: 61203.9). Total num frames: 9160753152. Throughput: 0: 61124.4. Samples: 2066010640. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:12:54,255][54587] Avg episode reward: [(0, '0.675')] [2024-04-28 03:12:56,666][54818] Updated weights for policy 0, policy_version 559138 (0.0018) [2024-04-28 03:12:59,253][54587] Fps is (10 sec: 62271.7, 60 sec: 60893.6, 300 sec: 61203.9). Total num frames: 9161064448. Throughput: 0: 60955.8. Samples: 2066186100. 
Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:12:59,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 03:12:59,386][54818] Updated weights for policy 0, policy_version 559148 (0.0016) [2024-04-28 03:13:02,000][54818] Updated weights for policy 0, policy_version 559158 (0.0018) [2024-04-28 03:13:04,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60620.7, 300 sec: 61203.9). Total num frames: 9161359360. Throughput: 0: 60999.4. Samples: 2066561540. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:04,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 03:13:04,775][54818] Updated weights for policy 0, policy_version 559168 (0.0017) [2024-04-28 03:13:07,465][54818] Updated weights for policy 0, policy_version 559178 (0.0016) [2024-04-28 03:13:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 9161670656. Throughput: 0: 61074.7. Samples: 2066927300. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:09,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 03:13:10,065][54818] Updated weights for policy 0, policy_version 559188 (0.0017) [2024-04-28 03:13:12,886][54818] Updated weights for policy 0, policy_version 559198 (0.0015) [2024-04-28 03:13:14,117][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14500 times) [2024-04-28 03:13:14,253][54587] Fps is (10 sec: 62260.0, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 9161981952. Throughput: 0: 61125.8. Samples: 2067109200. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:14,255][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 03:13:15,338][54818] Updated weights for policy 0, policy_version 559208 (0.0016) [2024-04-28 03:13:18,078][54818] Updated weights for policy 0, policy_version 559218 (0.0017) [2024-04-28 03:13:19,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9162276864. 
Throughput: 0: 61394.7. Samples: 2067485640. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:19,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 03:13:20,168][54798] Signal inference workers to stop experience collection... (34000 times) [2024-04-28 03:13:20,199][54818] InferenceWorker_p0-w0: stopping experience collection (34000 times) [2024-04-28 03:13:20,260][54798] Signal inference workers to resume experience collection... (34000 times) [2024-04-28 03:13:20,260][54818] InferenceWorker_p0-w0: resuming experience collection (34000 times) [2024-04-28 03:13:20,484][54818] Updated weights for policy 0, policy_version 559228 (0.0019) [2024-04-28 03:13:23,284][54818] Updated weights for policy 0, policy_version 559238 (0.0017) [2024-04-28 03:13:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9162588160. Throughput: 0: 61295.9. Samples: 2067851500. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:24,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 03:13:25,813][54818] Updated weights for policy 0, policy_version 559248 (0.0020) [2024-04-28 03:13:28,825][54818] Updated weights for policy 0, policy_version 559258 (0.0019) [2024-04-28 03:13:29,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9162883072. Throughput: 0: 61517.2. Samples: 2068033760. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:29,255][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 03:13:31,299][54818] Updated weights for policy 0, policy_version 559268 (0.0017) [2024-04-28 03:13:34,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9163194368. Throughput: 0: 61505.1. Samples: 2068407060. 
Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:34,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-28 03:13:34,386][54818] Updated weights for policy 0, policy_version 559278 (0.0015) [2024-04-28 03:13:36,516][54818] Updated weights for policy 0, policy_version 559288 (0.0016) [2024-04-28 03:13:39,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61166.7, 300 sec: 61203.9). Total num frames: 9163505664. Throughput: 0: 61529.7. Samples: 2068779480. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:39,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:13:39,630][54818] Updated weights for policy 0, policy_version 559298 (0.0018) [2024-04-28 03:13:40,424][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14600 times) [2024-04-28 03:13:41,912][54818] Updated weights for policy 0, policy_version 559308 (0.0016) [2024-04-28 03:13:44,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9163816960. Throughput: 0: 61510.9. Samples: 2068954080. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:44,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 03:13:44,957][54818] Updated weights for policy 0, policy_version 559318 (0.0017) [2024-04-28 03:13:47,174][54818] Updated weights for policy 0, policy_version 559328 (0.0016) [2024-04-28 03:13:49,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61169.0, 300 sec: 61204.0). Total num frames: 9164111872. Throughput: 0: 61471.2. Samples: 2069327740. Policy #0 lag: (min: 1.0, avg: 7.5, max: 19.0) [2024-04-28 03:13:49,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 03:13:49,387][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000559334_9164128256.pth... 
[2024-04-28 03:13:49,430][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000558438_9149448192.pth [2024-04-28 03:13:50,123][54818] Updated weights for policy 0, policy_version 559338 (0.0016) [2024-04-28 03:13:52,571][54818] Updated weights for policy 0, policy_version 559348 (0.0015) [2024-04-28 03:13:54,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9164423168. Throughput: 0: 61502.7. Samples: 2069694920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:13:54,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 03:13:55,387][54818] Updated weights for policy 0, policy_version 559358 (0.0017) [2024-04-28 03:13:57,906][54818] Updated weights for policy 0, policy_version 559368 (0.0020) [2024-04-28 03:13:59,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9164734464. Throughput: 0: 61465.4. Samples: 2069875140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:13:59,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 03:14:00,610][54818] Updated weights for policy 0, policy_version 559378 (0.0016) [2024-04-28 03:14:02,196][54798] Signal inference workers to stop experience collection... (34050 times) [2024-04-28 03:14:02,233][54818] InferenceWorker_p0-w0: stopping experience collection (34050 times) [2024-04-28 03:14:02,251][54798] Signal inference workers to resume experience collection... (34050 times) [2024-04-28 03:14:02,251][54818] InferenceWorker_p0-w0: resuming experience collection (34050 times) [2024-04-28 03:14:03,340][54818] Updated weights for policy 0, policy_version 559388 (0.0019) [2024-04-28 03:14:04,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.2, 300 sec: 61203.9). Total num frames: 9165045760. Throughput: 0: 61440.0. Samples: 2070250440. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:04,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 03:14:05,984][54818] Updated weights for policy 0, policy_version 559398 (0.0018) [2024-04-28 03:14:06,997][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14700 times) [2024-04-28 03:14:08,623][54818] Updated weights for policy 0, policy_version 559408 (0.0015) [2024-04-28 03:14:09,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9165340672. Throughput: 0: 61442.0. Samples: 2070616400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:09,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 03:14:11,287][54818] Updated weights for policy 0, policy_version 559418 (0.0016) [2024-04-28 03:14:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9165651968. Throughput: 0: 61572.0. Samples: 2070804500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:14,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 03:14:14,286][54818] Updated weights for policy 0, policy_version 559428 (0.0016) [2024-04-28 03:14:16,488][54818] Updated weights for policy 0, policy_version 559438 (0.0016) [2024-04-28 03:14:19,253][54587] Fps is (10 sec: 63898.3, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9165979648. Throughput: 0: 61525.8. Samples: 2071175720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:19,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:14:19,486][54818] Updated weights for policy 0, policy_version 559448 (0.0016) [2024-04-28 03:14:21,698][54818] Updated weights for policy 0, policy_version 559458 (0.0017) [2024-04-28 03:14:24,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9166274560. Throughput: 0: 61496.3. Samples: 2071546800. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:24,253][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 03:14:24,805][54818] Updated weights for policy 0, policy_version 559468 (0.0017) [2024-04-28 03:14:26,989][54818] Updated weights for policy 0, policy_version 559478 (0.0016) [2024-04-28 03:14:29,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61713.2, 300 sec: 61204.0). Total num frames: 9166585856. Throughput: 0: 61736.5. Samples: 2071732220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:29,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 03:14:30,085][54818] Updated weights for policy 0, policy_version 559488 (0.0017) [2024-04-28 03:14:32,259][54818] Updated weights for policy 0, policy_version 559498 (0.0019) [2024-04-28 03:14:33,983][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14800 times) [2024-04-28 03:14:34,253][54587] Fps is (10 sec: 62257.7, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9166897152. Throughput: 0: 61445.7. Samples: 2072092800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:34,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 03:14:35,568][54818] Updated weights for policy 0, policy_version 559508 (0.0018) [2024-04-28 03:14:37,529][54818] Updated weights for policy 0, policy_version 559518 (0.0015) [2024-04-28 03:14:39,253][54587] Fps is (10 sec: 63896.6, 60 sec: 61986.2, 300 sec: 61315.3). Total num frames: 9167224832. Throughput: 0: 61691.6. Samples: 2072471040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:39,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 03:14:40,799][54818] Updated weights for policy 0, policy_version 559528 (0.0017) [2024-04-28 03:14:42,704][54818] Updated weights for policy 0, policy_version 559538 (0.0017) [2024-04-28 03:14:44,253][54587] Fps is (10 sec: 62260.8, 60 sec: 61713.1, 300 sec: 61259.6). Total num frames: 9167519744. 
Throughput: 0: 61767.3. Samples: 2072654660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:44,253][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 03:14:46,055][54818] Updated weights for policy 0, policy_version 559548 (0.0016) [2024-04-28 03:14:48,312][54818] Updated weights for policy 0, policy_version 559558 (0.0016) [2024-04-28 03:14:49,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61986.2, 300 sec: 61259.5). Total num frames: 9167831040. Throughput: 0: 61474.2. Samples: 2073016780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:49,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:14:49,266][54587] No heartbeat for components: RolloutWorker_w4 (21037 seconds), RolloutWorker_w5 (7137 seconds) [2024-04-28 03:14:49,397][54798] Signal inference workers to stop experience collection... (34100 times) [2024-04-28 03:14:49,405][54818] InferenceWorker_p0-w0: stopping experience collection (34100 times) [2024-04-28 03:14:49,488][54798] Signal inference workers to resume experience collection... (34100 times) [2024-04-28 03:14:49,489][54818] InferenceWorker_p0-w0: resuming experience collection (34100 times) [2024-04-28 03:14:51,285][54818] Updated weights for policy 0, policy_version 559568 (0.0017) [2024-04-28 03:14:53,700][54818] Updated weights for policy 0, policy_version 559578 (0.0017) [2024-04-28 03:14:54,253][54587] Fps is (10 sec: 62257.7, 60 sec: 61986.1, 300 sec: 61315.0). Total num frames: 9168142336. Throughput: 0: 61821.3. Samples: 2073398360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:54,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 03:14:56,509][54818] Updated weights for policy 0, policy_version 559588 (0.0016) [2024-04-28 03:14:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9168437248. Throughput: 0: 61827.1. Samples: 2073586720. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:14:59,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 03:14:59,269][54818] Updated weights for policy 0, policy_version 559598 (0.0016) [2024-04-28 03:15:00,581][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (14900 times) [2024-04-28 03:15:01,862][54818] Updated weights for policy 0, policy_version 559608 (0.0016) [2024-04-28 03:15:04,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9168748544. Throughput: 0: 61470.7. Samples: 2073941900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:15:04,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 03:15:04,675][54818] Updated weights for policy 0, policy_version 559618 (0.0016) [2024-04-28 03:15:07,034][54818] Updated weights for policy 0, policy_version 559628 (0.0018) [2024-04-28 03:15:09,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61986.2, 300 sec: 61259.5). Total num frames: 9169059840. Throughput: 0: 61661.2. Samples: 2074321560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:15:09,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 03:15:10,025][54818] Updated weights for policy 0, policy_version 559638 (0.0019) [2024-04-28 03:15:12,123][54818] Updated weights for policy 0, policy_version 559648 (0.0016) [2024-04-28 03:15:14,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61986.0, 300 sec: 61259.5). Total num frames: 9169371136. Throughput: 0: 61603.2. Samples: 2074504380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:15:14,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 03:15:14,254][54587] Runner:signal_='update_training_info' queue is Full (). 
receivers=['RolloutWorker_w4'] (800 times) [2024-04-28 03:15:15,314][54818] Updated weights for policy 0, policy_version 559658 (0.0017) [2024-04-28 03:15:17,427][54818] Updated weights for policy 0, policy_version 559668 (0.0016) [2024-04-28 03:15:19,255][54587] Fps is (10 sec: 60613.1, 60 sec: 61438.7, 300 sec: 61259.2). Total num frames: 9169666048. Throughput: 0: 61756.2. Samples: 2074871900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 03:15:19,255][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:15:20,552][54818] Updated weights for policy 0, policy_version 559678 (0.0020) [2024-04-28 03:15:22,840][54818] Updated weights for policy 0, policy_version 559688 (0.0016) [2024-04-28 03:15:24,253][54587] Fps is (10 sec: 60622.3, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9169977344. Throughput: 0: 61643.3. Samples: 2075244980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:15:24,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 03:15:25,734][54818] Updated weights for policy 0, policy_version 559698 (0.0016) [2024-04-28 03:15:26,884][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (15000 times) [2024-04-28 03:15:27,935][54818] Updated weights for policy 0, policy_version 559708 (0.0016) [2024-04-28 03:15:29,253][54587] Fps is (10 sec: 63905.5, 60 sec: 61986.0, 300 sec: 61315.0). Total num frames: 9170305024. Throughput: 0: 61623.3. Samples: 2075427720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:15:29,255][54587] Avg episode reward: [(0, '0.513')] [2024-04-28 03:15:31,051][54818] Updated weights for policy 0, policy_version 559718 (0.0018) [2024-04-28 03:15:31,923][54798] Signal inference workers to stop experience collection... (34150 times) [2024-04-28 03:15:31,925][54798] Signal inference workers to resume experience collection... 
(34150 times) [2024-04-28 03:15:31,931][54818] InferenceWorker_p0-w0: stopping experience collection (34150 times) [2024-04-28 03:15:31,941][54818] InferenceWorker_p0-w0: resuming experience collection (34150 times) [2024-04-28 03:15:33,829][54818] Updated weights for policy 0, policy_version 559728 (0.0016) [2024-04-28 03:15:34,253][54587] Fps is (10 sec: 63897.0, 60 sec: 61986.3, 300 sec: 61315.0). Total num frames: 9170616320. Throughput: 0: 61853.8. Samples: 2075800200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:15:34,255][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 03:15:36,458][54818] Updated weights for policy 0, policy_version 559738 (0.0016) [2024-04-28 03:15:39,019][54818] Updated weights for policy 0, policy_version 559748 (0.0015) [2024-04-28 03:15:39,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61440.2, 300 sec: 61259.5). Total num frames: 9170911232. Throughput: 0: 61715.9. Samples: 2076175560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:15:39,253][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 03:15:41,643][54818] Updated weights for policy 0, policy_version 559758 (0.0016) [2024-04-28 03:15:44,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 9171222528. Throughput: 0: 61545.7. Samples: 2076356280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:15:44,255][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 03:15:44,399][54818] Updated weights for policy 0, policy_version 559768 (0.0016) [2024-04-28 03:15:46,894][54818] Updated weights for policy 0, policy_version 559778 (0.0016) [2024-04-28 03:15:49,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9171533824. Throughput: 0: 61816.1. Samples: 2076723620. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:15:49,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 03:15:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000559786_9171533824.pth... [2024-04-28 03:15:49,324][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000558887_9156804608.pth [2024-04-28 03:15:49,872][54818] Updated weights for policy 0, policy_version 559788 (0.0016) [2024-04-28 03:15:52,029][54818] Updated weights for policy 0, policy_version 559798 (0.0016) [2024-04-28 03:15:53,527][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (15100 times) [2024-04-28 03:15:54,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.2, 300 sec: 61315.0). Total num frames: 9171845120. Throughput: 0: 61684.4. Samples: 2077097360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:15:54,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 03:15:55,048][54818] Updated weights for policy 0, policy_version 559808 (0.0017) [2024-04-28 03:15:57,197][54818] Updated weights for policy 0, policy_version 559818 (0.0024) [2024-04-28 03:15:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9172140032. Throughput: 0: 61815.4. Samples: 2077286060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:15:59,254][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 03:16:00,267][54818] Updated weights for policy 0, policy_version 559828 (0.0017) [2024-04-28 03:16:03,002][54818] Updated weights for policy 0, policy_version 559838 (0.0019) [2024-04-28 03:16:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9172451328. Throughput: 0: 61671.4. Samples: 2077647040. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:04,255][54587] Avg episode reward: [(0, '0.504')] [2024-04-28 03:16:05,670][54818] Updated weights for policy 0, policy_version 559848 (0.0017) [2024-04-28 03:16:08,161][54818] Updated weights for policy 0, policy_version 559858 (0.0019) [2024-04-28 03:16:09,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61713.2, 300 sec: 61370.6). Total num frames: 9172762624. Throughput: 0: 61664.4. Samples: 2078019880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:09,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 03:16:10,745][54818] Updated weights for policy 0, policy_version 559868 (0.0015) [2024-04-28 03:16:13,413][54818] Updated weights for policy 0, policy_version 559878 (0.0016) [2024-04-28 03:16:14,253][54587] Fps is (10 sec: 63898.0, 60 sec: 61986.3, 300 sec: 61426.3). Total num frames: 9173090304. Throughput: 0: 61827.6. Samples: 2078209960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:14,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 03:16:16,229][54818] Updated weights for policy 0, policy_version 559888 (0.0016) [2024-04-28 03:16:18,719][54818] Updated weights for policy 0, policy_version 559898 (0.0015) [2024-04-28 03:16:19,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61987.6, 300 sec: 61426.1). Total num frames: 9173385216. Throughput: 0: 61732.1. Samples: 2078578140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:19,253][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:16:20,110][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (15200 times) [2024-04-28 03:16:21,687][54818] Updated weights for policy 0, policy_version 559908 (0.0015) [2024-04-28 03:16:24,072][54818] Updated weights for policy 0, policy_version 559918 (0.0016) [2024-04-28 03:16:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61986.1, 300 sec: 61426.1). Total num frames: 9173696512. 
Throughput: 0: 61491.9. Samples: 2078942700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:24,255][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 03:16:24,704][54798] Signal inference workers to stop experience collection... (34200 times) [2024-04-28 03:16:24,736][54818] InferenceWorker_p0-w0: stopping experience collection (34200 times) [2024-04-28 03:16:24,754][54798] Signal inference workers to resume experience collection... (34200 times) [2024-04-28 03:16:24,754][54818] InferenceWorker_p0-w0: resuming experience collection (34200 times) [2024-04-28 03:16:26,793][54818] Updated weights for policy 0, policy_version 559928 (0.0016) [2024-04-28 03:16:29,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61713.1, 300 sec: 61481.7). Total num frames: 9174007808. Throughput: 0: 61676.9. Samples: 2079131740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:29,254][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 03:16:29,507][54818] Updated weights for policy 0, policy_version 559938 (0.0016) [2024-04-28 03:16:32,013][54818] Updated weights for policy 0, policy_version 559948 (0.0016) [2024-04-28 03:16:34,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.1, 300 sec: 61426.1). Total num frames: 9174302720. Throughput: 0: 61764.1. Samples: 2079503000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:34,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 03:16:34,624][54818] Updated weights for policy 0, policy_version 559958 (0.0017) [2024-04-28 03:16:37,517][54818] Updated weights for policy 0, policy_version 559968 (0.0016) [2024-04-28 03:16:39,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61712.8, 300 sec: 61426.1). Total num frames: 9174614016. Throughput: 0: 61526.1. Samples: 2079866040. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:39,254][54587] Avg episode reward: [(0, '0.514')] [2024-04-28 03:16:40,099][54818] Updated weights for policy 0, policy_version 559978 (0.0016) [2024-04-28 03:16:42,929][54818] Updated weights for policy 0, policy_version 559988 (0.0017) [2024-04-28 03:16:44,253][54587] Fps is (10 sec: 62257.9, 60 sec: 61713.1, 300 sec: 61481.6). Total num frames: 9174925312. Throughput: 0: 61414.9. Samples: 2080049740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:44,255][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 03:16:45,408][54818] Updated weights for policy 0, policy_version 559998 (0.0016) [2024-04-28 03:16:46,765][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (15300 times) [2024-04-28 03:16:48,213][54818] Updated weights for policy 0, policy_version 560008 (0.0016) [2024-04-28 03:16:49,253][54587] Fps is (10 sec: 62260.4, 60 sec: 61713.0, 300 sec: 61481.7). Total num frames: 9175236608. Throughput: 0: 61612.2. Samples: 2080419580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 03:16:49,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 03:16:51,046][54818] Updated weights for policy 0, policy_version 560018 (0.0015) [2024-04-28 03:16:53,484][54818] Updated weights for policy 0, policy_version 560028 (0.0017) [2024-04-28 03:16:54,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61713.1, 300 sec: 61481.6). Total num frames: 9175547904. Throughput: 0: 61407.0. Samples: 2080783200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:16:54,255][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:16:56,328][54818] Updated weights for policy 0, policy_version 560038 (0.0018) [2024-04-28 03:16:58,921][54818] Updated weights for policy 0, policy_version 560048 (0.0016) [2024-04-28 03:16:59,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61986.0, 300 sec: 61481.6). Total num frames: 9175859200. 
Throughput: 0: 61374.1. Samples: 2080971800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:16:59,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 03:17:02,019][54818] Updated weights for policy 0, policy_version 560058 (0.0016) [2024-04-28 03:17:04,047][54818] Updated weights for policy 0, policy_version 560068 (0.0022) [2024-04-28 03:17:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61713.2, 300 sec: 61481.7). Total num frames: 9176154112. Throughput: 0: 61211.5. Samples: 2081332660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:04,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 03:17:07,066][54818] Updated weights for policy 0, policy_version 560078 (0.0016) [2024-04-28 03:17:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61712.9, 300 sec: 61481.6). Total num frames: 9176465408. Throughput: 0: 61193.6. Samples: 2081696420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:09,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 03:17:09,536][54818] Updated weights for policy 0, policy_version 560088 (0.0016) [2024-04-28 03:17:12,435][54818] Updated weights for policy 0, policy_version 560098 (0.0018) [2024-04-28 03:17:13,472][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (15400 times) [2024-04-28 03:17:14,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61481.7). Total num frames: 9176760320. Throughput: 0: 61335.2. Samples: 2081891820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:14,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 03:17:14,788][54818] Updated weights for policy 0, policy_version 560108 (0.0016) [2024-04-28 03:17:17,504][54798] Signal inference workers to stop experience collection... (34250 times) [2024-04-28 03:17:17,505][54798] Signal inference workers to resume experience collection... 
(34250 times) [2024-04-28 03:17:17,528][54818] InferenceWorker_p0-w0: stopping experience collection (34250 times) [2024-04-28 03:17:17,528][54818] InferenceWorker_p0-w0: resuming experience collection (34250 times) [2024-04-28 03:17:17,615][54818] Updated weights for policy 0, policy_version 560118 (0.0016) [2024-04-28 03:17:19,253][54587] Fps is (10 sec: 58983.4, 60 sec: 61167.0, 300 sec: 61426.2). Total num frames: 9177055232. Throughput: 0: 60960.9. Samples: 2082246240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:19,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 03:17:20,254][54818] Updated weights for policy 0, policy_version 560128 (0.0019) [2024-04-28 03:17:23,039][54818] Updated weights for policy 0, policy_version 560138 (0.0015) [2024-04-28 03:17:24,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.0, 300 sec: 61537.2). Total num frames: 9177382912. Throughput: 0: 61082.0. Samples: 2082614720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:24,255][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 03:17:25,604][54818] Updated weights for policy 0, policy_version 560148 (0.0018) [2024-04-28 03:17:28,276][54818] Updated weights for policy 0, policy_version 560158 (0.0016) [2024-04-28 03:17:29,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61167.0, 300 sec: 61481.6). Total num frames: 9177677824. Throughput: 0: 61332.9. Samples: 2082809720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:29,256][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 03:17:30,964][54818] Updated weights for policy 0, policy_version 560168 (0.0017) [2024-04-28 03:17:33,582][54818] Updated weights for policy 0, policy_version 560178 (0.0015) [2024-04-28 03:17:34,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61439.9, 300 sec: 61537.2). Total num frames: 9177989120. Throughput: 0: 61172.8. Samples: 2083172360. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:34,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 03:17:36,476][54818] Updated weights for policy 0, policy_version 560188 (0.0017) [2024-04-28 03:17:38,838][54818] Updated weights for policy 0, policy_version 560198 (0.0020) [2024-04-28 03:17:39,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.1, 300 sec: 61592.7). Total num frames: 9178300416. Throughput: 0: 61270.5. Samples: 2083540380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:39,255][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 03:17:39,963][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (15500 times) [2024-04-28 03:17:41,817][54818] Updated weights for policy 0, policy_version 560208 (0.0015) [2024-04-28 03:17:44,029][54818] Updated weights for policy 0, policy_version 560218 (0.0016) [2024-04-28 03:17:44,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.1, 300 sec: 61593.2). Total num frames: 9178611712. Throughput: 0: 61325.5. Samples: 2083731440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:44,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 03:17:47,175][54818] Updated weights for policy 0, policy_version 560228 (0.0018) [2024-04-28 03:17:49,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.0, 300 sec: 61592.8). Total num frames: 9178923008. Throughput: 0: 61434.6. Samples: 2084097220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:49,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 03:17:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000560237_9178923008.pth... 
[2024-04-28 03:17:49,262][54587] No heartbeat for components: RolloutWorker_w4 (21217 seconds), RolloutWorker_w5 (7317 seconds) [2024-04-28 03:17:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000559334_9164128256.pth [2024-04-28 03:17:49,471][54818] Updated weights for policy 0, policy_version 560238 (0.0018) [2024-04-28 03:17:52,489][54818] Updated weights for policy 0, policy_version 560248 (0.0018) [2024-04-28 03:17:54,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.0, 300 sec: 61537.2). Total num frames: 9179217920. Throughput: 0: 61565.6. Samples: 2084466860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:54,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 03:17:54,886][54818] Updated weights for policy 0, policy_version 560258 (0.0018) [2024-04-28 03:17:57,689][54818] Updated weights for policy 0, policy_version 560268 (0.0018) [2024-04-28 03:17:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61592.7). Total num frames: 9179529216. Throughput: 0: 61393.2. Samples: 2084654520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:17:59,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 03:18:00,010][54818] Updated weights for policy 0, policy_version 560278 (0.0016) [2024-04-28 03:18:02,866][54818] Updated weights for policy 0, policy_version 560288 (0.0016) [2024-04-28 03:18:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61537.2). Total num frames: 9179824128. Throughput: 0: 61591.6. Samples: 2085017860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:18:04,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 03:18:05,498][54818] Updated weights for policy 0, policy_version 560298 (0.0017) [2024-04-28 03:18:07,154][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (15600 times) [2024-04-28 03:18:08,323][54818] Updated weights for policy 0, policy_version 560308 (0.0019) [2024-04-28 03:18:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.9, 300 sec: 61537.2). Total num frames: 9180135424. Throughput: 0: 61705.2. Samples: 2085391460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:18:09,255][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 03:18:10,289][54798] Signal inference workers to stop experience collection... (34300 times) [2024-04-28 03:18:10,312][54818] InferenceWorker_p0-w0: stopping experience collection (34300 times) [2024-04-28 03:18:10,381][54798] Signal inference workers to resume experience collection... (34300 times) [2024-04-28 03:18:10,381][54818] InferenceWorker_p0-w0: resuming experience collection (34300 times) [2024-04-28 03:18:11,031][54818] Updated weights for policy 0, policy_version 560318 (0.0016) [2024-04-28 03:18:13,794][54818] Updated weights for policy 0, policy_version 560328 (0.0016) [2024-04-28 03:18:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61167.0, 300 sec: 61537.2). Total num frames: 9180430336. Throughput: 0: 61544.2. Samples: 2085579200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:18:14,253][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 03:18:16,714][54818] Updated weights for policy 0, policy_version 560338 (0.0017) [2024-04-28 03:18:19,057][54818] Updated weights for policy 0, policy_version 560348 (0.0015) [2024-04-28 03:18:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61439.9, 300 sec: 61537.2). Total num frames: 9180741632. Throughput: 0: 61557.7. Samples: 2085942460. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 03:18:19,254][54587] Avg episode reward: [(0, '0.522')] [2024-04-28 03:18:21,971][54818] Updated weights for policy 0, policy_version 560358 (0.0017) [2024-04-28 03:18:24,160][54818] Updated weights for policy 0, policy_version 560368 (0.0017) [2024-04-28 03:18:24,253][54587] Fps is (10 sec: 63897.2, 60 sec: 61440.0, 300 sec: 61648.3). Total num frames: 9181069312. Throughput: 0: 61398.3. Samples: 2086303300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:18:24,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:18:27,345][54818] Updated weights for policy 0, policy_version 560378 (0.0018) [2024-04-28 03:18:29,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61592.7). Total num frames: 9181364224. Throughput: 0: 61418.6. Samples: 2086495280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:18:29,255][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 03:18:29,699][54818] Updated weights for policy 0, policy_version 560388 (0.0016) [2024-04-28 03:18:32,661][54818] Updated weights for policy 0, policy_version 560398 (0.0017) [2024-04-28 03:18:33,587][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (15700 times) [2024-04-28 03:18:34,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.1, 300 sec: 61592.8). Total num frames: 9181675520. Throughput: 0: 61449.9. Samples: 2086862460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:18:34,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 03:18:34,835][54818] Updated weights for policy 0, policy_version 560408 (0.0017) [2024-04-28 03:18:37,959][54818] Updated weights for policy 0, policy_version 560418 (0.0016) [2024-04-28 03:18:39,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.1, 300 sec: 61592.7). Total num frames: 9181986816. Throughput: 0: 61293.2. Samples: 2087225060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:18:39,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 03:18:40,142][54818] Updated weights for policy 0, policy_version 560428 (0.0016) [2024-04-28 03:18:43,043][54818] Updated weights for policy 0, policy_version 560438 (0.0015) [2024-04-28 03:18:44,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.9, 300 sec: 61648.3). Total num frames: 9182298112. Throughput: 0: 61420.4. Samples: 2087418440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:18:44,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-28 03:18:45,813][54818] Updated weights for policy 0, policy_version 560448 (0.0018) [2024-04-28 03:18:48,257][54818] Updated weights for policy 0, policy_version 560458 (0.0019) [2024-04-28 03:18:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.0, 300 sec: 61648.3). Total num frames: 9182609408. Throughput: 0: 61582.1. Samples: 2087789060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:18:49,255][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 03:18:50,974][54818] Updated weights for policy 0, policy_version 560468 (0.0016) [2024-04-28 03:18:53,577][54818] Updated weights for policy 0, policy_version 560478 (0.0016) [2024-04-28 03:18:54,043][54798] Signal inference workers to stop experience collection... (34350 times) [2024-04-28 03:18:54,074][54818] InferenceWorker_p0-w0: stopping experience collection (34350 times) [2024-04-28 03:18:54,133][54798] Signal inference workers to resume experience collection... (34350 times) [2024-04-28 03:18:54,133][54818] InferenceWorker_p0-w0: resuming experience collection (34350 times) [2024-04-28 03:18:54,253][54587] Fps is (10 sec: 62260.4, 60 sec: 61713.0, 300 sec: 61648.3). Total num frames: 9182920704. Throughput: 0: 61207.7. Samples: 2088145800. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:18:54,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 03:18:56,225][54818] Updated weights for policy 0, policy_version 560488 (0.0021) [2024-04-28 03:18:58,947][54818] Updated weights for policy 0, policy_version 560498 (0.0016) [2024-04-28 03:18:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61440.2, 300 sec: 61592.7). Total num frames: 9183215616. Throughput: 0: 61273.4. Samples: 2088336500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:18:59,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 03:19:00,222][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (15800 times) [2024-04-28 03:19:01,867][54818] Updated weights for policy 0, policy_version 560508 (0.0016) [2024-04-28 03:19:04,166][54818] Updated weights for policy 0, policy_version 560518 (0.0016) [2024-04-28 03:19:04,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61713.0, 300 sec: 61648.3). Total num frames: 9183526912. Throughput: 0: 61393.4. Samples: 2088705160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:04,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 03:19:07,370][54818] Updated weights for policy 0, policy_version 560528 (0.0017) [2024-04-28 03:19:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.2, 300 sec: 61592.8). Total num frames: 9183821824. Throughput: 0: 61339.2. Samples: 2089063560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:09,253][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 03:19:09,485][54818] Updated weights for policy 0, policy_version 560538 (0.0016) [2024-04-28 03:19:12,624][54818] Updated weights for policy 0, policy_version 560548 (0.0016) [2024-04-28 03:19:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61713.0, 300 sec: 61537.2). Total num frames: 9184133120. Throughput: 0: 61265.8. Samples: 2089252240. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:14,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-28 03:19:14,778][54818] Updated weights for policy 0, policy_version 560558 (0.0017) [2024-04-28 03:19:17,995][54818] Updated weights for policy 0, policy_version 560568 (0.0016) [2024-04-28 03:19:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.2, 300 sec: 61537.2). Total num frames: 9184428032. Throughput: 0: 61261.8. Samples: 2089619240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:19,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 03:19:20,290][54818] Updated weights for policy 0, policy_version 560578 (0.0017) [2024-04-28 03:19:23,206][54818] Updated weights for policy 0, policy_version 560588 (0.0019) [2024-04-28 03:19:24,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.0, 300 sec: 61537.2). Total num frames: 9184739328. Throughput: 0: 61487.7. Samples: 2089992000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:24,253][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 03:19:25,813][54818] Updated weights for policy 0, policy_version 560598 (0.0018) [2024-04-28 03:19:27,128][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (15900 times) [2024-04-28 03:19:28,624][54818] Updated weights for policy 0, policy_version 560608 (0.0017) [2024-04-28 03:19:29,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61440.1, 300 sec: 61537.2). Total num frames: 9185050624. Throughput: 0: 61181.9. Samples: 2090171620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 03:19:31,092][54818] Updated weights for policy 0, policy_version 560618 (0.0019) [2024-04-28 03:19:33,897][54818] Updated weights for policy 0, policy_version 560628 (0.0015) [2024-04-28 03:19:34,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61166.9, 300 sec: 61426.1). Total num frames: 9185345536. 
Throughput: 0: 61074.2. Samples: 2090537400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:34,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 03:19:36,344][54818] Updated weights for policy 0, policy_version 560638 (0.0016) [2024-04-28 03:19:39,254][54587] Fps is (10 sec: 60615.7, 60 sec: 61166.0, 300 sec: 61481.5). Total num frames: 9185656832. Throughput: 0: 61389.9. Samples: 2090908400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:39,255][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 03:19:39,262][54818] Updated weights for policy 0, policy_version 560648 (0.0016) [2024-04-28 03:19:41,911][54818] Updated weights for policy 0, policy_version 560658 (0.0016) [2024-04-28 03:19:44,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60894.1, 300 sec: 61426.1). Total num frames: 9185951744. Throughput: 0: 61220.0. Samples: 2091091400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:44,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 03:19:44,453][54818] Updated weights for policy 0, policy_version 560668 (0.0017) [2024-04-28 03:19:45,339][54798] Signal inference workers to stop experience collection... (34400 times) [2024-04-28 03:19:45,340][54798] Signal inference workers to resume experience collection... (34400 times) [2024-04-28 03:19:45,366][54818] InferenceWorker_p0-w0: stopping experience collection (34400 times) [2024-04-28 03:19:45,366][54818] InferenceWorker_p0-w0: resuming experience collection (34400 times) [2024-04-28 03:19:47,129][54818] Updated weights for policy 0, policy_version 560678 (0.0017) [2024-04-28 03:19:49,253][54587] Fps is (10 sec: 60626.2, 60 sec: 60893.9, 300 sec: 61426.1). Total num frames: 9186263040. Throughput: 0: 61084.0. Samples: 2091453940. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 03:19:49,254][54587] Avg episode reward: [(0, '0.484')] [2024-04-28 03:19:49,353][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000560686_9186279424.pth... [2024-04-28 03:19:49,404][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000559786_9171533824.pth [2024-04-28 03:19:49,695][54818] Updated weights for policy 0, policy_version 560688 (0.0017) [2024-04-28 03:19:52,638][54818] Updated weights for policy 0, policy_version 560698 (0.0016) [2024-04-28 03:19:53,871][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (16000 times) [2024-04-28 03:19:54,253][54587] Fps is (10 sec: 62258.6, 60 sec: 60893.8, 300 sec: 61481.7). Total num frames: 9186574336. Throughput: 0: 61388.3. Samples: 2091826040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:19:54,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 03:19:55,190][54818] Updated weights for policy 0, policy_version 560708 (0.0016) [2024-04-28 03:19:57,935][54818] Updated weights for policy 0, policy_version 560718 (0.0019) [2024-04-28 03:19:59,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61166.8, 300 sec: 61481.6). Total num frames: 9186885632. Throughput: 0: 61239.9. Samples: 2092008040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:19:59,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 03:20:00,416][54818] Updated weights for policy 0, policy_version 560728 (0.0017) [2024-04-28 03:20:03,295][54818] Updated weights for policy 0, policy_version 560738 (0.0016) [2024-04-28 03:20:04,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60893.8, 300 sec: 61426.1). Total num frames: 9187180544. Throughput: 0: 61380.6. Samples: 2092381380. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:04,255][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 03:20:05,752][54818] Updated weights for policy 0, policy_version 560748 (0.0016) [2024-04-28 03:20:08,577][54818] Updated weights for policy 0, policy_version 560758 (0.0015) [2024-04-28 03:20:09,253][54587] Fps is (10 sec: 58983.1, 60 sec: 60893.8, 300 sec: 61370.6). Total num frames: 9187475456. Throughput: 0: 61339.5. Samples: 2092752280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:09,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 03:20:11,012][54818] Updated weights for policy 0, policy_version 560768 (0.0017) [2024-04-28 03:20:13,974][54818] Updated weights for policy 0, policy_version 560778 (0.0018) [2024-04-28 03:20:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.9, 300 sec: 61426.4). Total num frames: 9187786752. Throughput: 0: 61208.0. Samples: 2092925980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:14,254][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 03:20:16,188][54818] Updated weights for policy 0, policy_version 560788 (0.0018) [2024-04-28 03:20:19,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.9, 300 sec: 61426.1). Total num frames: 9188098048. Throughput: 0: 61427.2. Samples: 2093301620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:19,254][54587] Avg episode reward: [(0, '0.680')] [2024-04-28 03:20:19,314][54818] Updated weights for policy 0, policy_version 560798 (0.0015) [2024-04-28 03:20:20,490][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (16100 times) [2024-04-28 03:20:21,615][54818] Updated weights for policy 0, policy_version 560808 (0.0017) [2024-04-28 03:20:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60893.8, 300 sec: 61315.1). Total num frames: 9188392960. Throughput: 0: 61503.9. Samples: 2093676020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:24,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 03:20:24,667][54818] Updated weights for policy 0, policy_version 560818 (0.0019) [2024-04-28 03:20:26,949][54818] Updated weights for policy 0, policy_version 560828 (0.0016) [2024-04-28 03:20:29,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60893.8, 300 sec: 61315.0). Total num frames: 9188704256. Throughput: 0: 61142.0. Samples: 2093842800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:29,255][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 03:20:30,187][54818] Updated weights for policy 0, policy_version 560838 (0.0016) [2024-04-28 03:20:31,136][54798] Signal inference workers to stop experience collection... (34450 times) [2024-04-28 03:20:31,139][54798] Signal inference workers to resume experience collection... (34450 times) [2024-04-28 03:20:31,148][54818] InferenceWorker_p0-w0: stopping experience collection (34450 times) [2024-04-28 03:20:31,148][54818] InferenceWorker_p0-w0: resuming experience collection (34450 times) [2024-04-28 03:20:32,565][54818] Updated weights for policy 0, policy_version 560848 (0.0018) [2024-04-28 03:20:34,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9189015552. Throughput: 0: 61307.2. Samples: 2094212760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:34,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 03:20:35,370][54818] Updated weights for policy 0, policy_version 560858 (0.0016) [2024-04-28 03:20:37,638][54818] Updated weights for policy 0, policy_version 560868 (0.0017) [2024-04-28 03:20:39,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61167.8, 300 sec: 61370.6). Total num frames: 9189326848. Throughput: 0: 61302.7. Samples: 2094584660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:39,254][54587] Avg episode reward: [(0, '0.503')] [2024-04-28 03:20:40,642][54818] Updated weights for policy 0, policy_version 560878 (0.0018) [2024-04-28 03:20:43,177][54818] Updated weights for policy 0, policy_version 560888 (0.0016) [2024-04-28 03:20:44,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.7, 300 sec: 61315.0). Total num frames: 9189621760. Throughput: 0: 61281.7. Samples: 2094765720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:44,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 03:20:45,934][54818] Updated weights for policy 0, policy_version 560898 (0.0020) [2024-04-28 03:20:47,176][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (16200 times) [2024-04-28 03:20:48,665][54818] Updated weights for policy 0, policy_version 560908 (0.0016) [2024-04-28 03:20:49,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9189933056. Throughput: 0: 61068.6. Samples: 2095129460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:49,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 03:20:49,264][54587] No heartbeat for components: RolloutWorker_w4 (21397 seconds), RolloutWorker_w5 (7497 seconds) [2024-04-28 03:20:51,308][54818] Updated weights for policy 0, policy_version 560918 (0.0016) [2024-04-28 03:20:54,253][54587] Fps is (10 sec: 60622.0, 60 sec: 60894.0, 300 sec: 61315.0). Total num frames: 9190227968. Throughput: 0: 61130.7. Samples: 2095503160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:54,253][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 03:20:54,275][54818] Updated weights for policy 0, policy_version 560928 (0.0016) [2024-04-28 03:20:56,669][54818] Updated weights for policy 0, policy_version 560938 (0.0017) [2024-04-28 03:20:59,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61315.1). 
Total num frames: 9190539264. Throughput: 0: 61407.7. Samples: 2095689320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:20:59,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 03:20:59,512][54818] Updated weights for policy 0, policy_version 560948 (0.0016) [2024-04-28 03:21:01,971][54818] Updated weights for policy 0, policy_version 560958 (0.0016) [2024-04-28 03:21:04,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.1, 300 sec: 61315.0). Total num frames: 9190850560. Throughput: 0: 61127.2. Samples: 2096052340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:21:04,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 03:21:04,753][54818] Updated weights for policy 0, policy_version 560968 (0.0016) [2024-04-28 03:21:07,284][54818] Updated weights for policy 0, policy_version 560978 (0.0016) [2024-04-28 03:21:09,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9191161856. Throughput: 0: 61027.0. Samples: 2096422240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:21:09,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 03:21:10,035][54818] Updated weights for policy 0, policy_version 560988 (0.0016) [2024-04-28 03:21:12,488][54818] Updated weights for policy 0, policy_version 560998 (0.0017) [2024-04-28 03:21:14,009][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (16300 times) [2024-04-28 03:21:14,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9191473152. Throughput: 0: 61448.6. Samples: 2096607980. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:21:14,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 03:21:15,385][54818] Updated weights for policy 0, policy_version 561008 (0.0016) [2024-04-28 03:21:17,974][54818] Updated weights for policy 0, policy_version 561018 (0.0016) [2024-04-28 03:21:19,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9191784448. Throughput: 0: 61306.2. Samples: 2096971540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:21:19,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 03:21:20,640][54818] Updated weights for policy 0, policy_version 561028 (0.0017) [2024-04-28 03:21:23,175][54818] Updated weights for policy 0, policy_version 561038 (0.0016) [2024-04-28 03:21:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9192079360. Throughput: 0: 61329.3. Samples: 2097344480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 03:21:24,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 03:21:24,343][54798] Signal inference workers to stop experience collection... (34500 times) [2024-04-28 03:21:24,372][54818] InferenceWorker_p0-w0: stopping experience collection (34500 times) [2024-04-28 03:21:24,403][54798] Signal inference workers to resume experience collection... (34500 times) [2024-04-28 03:21:24,404][54818] InferenceWorker_p0-w0: resuming experience collection (34500 times) [2024-04-28 03:21:26,098][54818] Updated weights for policy 0, policy_version 561048 (0.0016) [2024-04-28 03:21:28,687][54818] Updated weights for policy 0, policy_version 561058 (0.0018) [2024-04-28 03:21:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9192390656. Throughput: 0: 61434.4. Samples: 2097530260. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:21:29,253][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 03:21:31,585][54818] Updated weights for policy 0, policy_version 561068 (0.0017) [2024-04-28 03:21:33,945][54818] Updated weights for policy 0, policy_version 561078 (0.0018) [2024-04-28 03:21:34,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61440.1, 300 sec: 61315.1). Total num frames: 9192701952. Throughput: 0: 61339.8. Samples: 2097889740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:21:34,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 03:21:36,919][54818] Updated weights for policy 0, policy_version 561088 (0.0020) [2024-04-28 03:21:39,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61439.9, 300 sec: 61315.1). Total num frames: 9193013248. Throughput: 0: 61183.0. Samples: 2098256400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:21:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 03:21:39,392][54818] Updated weights for policy 0, policy_version 561098 (0.0016) [2024-04-28 03:21:40,969][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (16400 times) [2024-04-28 03:21:42,150][54818] Updated weights for policy 0, policy_version 561108 (0.0018) [2024-04-28 03:21:44,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9193308160. Throughput: 0: 61346.6. Samples: 2098449920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:21:44,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 03:21:45,068][54818] Updated weights for policy 0, policy_version 561118 (0.0016) [2024-04-28 03:21:47,383][54818] Updated weights for policy 0, policy_version 561128 (0.0018) [2024-04-28 03:21:49,253][54587] Fps is (10 sec: 58983.0, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9193603072. Throughput: 0: 61072.0. Samples: 2098800580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:21:49,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 03:21:49,385][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000561135_9193635840.pth... [2024-04-28 03:21:49,432][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000560237_9178923008.pth [2024-04-28 03:21:50,281][54818] Updated weights for policy 0, policy_version 561138 (0.0017) [2024-04-28 03:21:52,817][54818] Updated weights for policy 0, policy_version 561148 (0.0019) [2024-04-28 03:21:54,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9193930752. Throughput: 0: 61138.0. Samples: 2099173440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:21:54,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 03:21:55,682][54818] Updated weights for policy 0, policy_version 561158 (0.0022) [2024-04-28 03:21:58,258][54818] Updated weights for policy 0, policy_version 561168 (0.0016) [2024-04-28 03:21:59,253][54587] Fps is (10 sec: 63896.8, 60 sec: 61712.9, 300 sec: 61315.0). Total num frames: 9194242048. Throughput: 0: 61238.1. Samples: 2099363700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:21:59,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 03:22:01,065][54818] Updated weights for policy 0, policy_version 561178 (0.0017) [2024-04-28 03:22:03,492][54818] Updated weights for policy 0, policy_version 561188 (0.0016) [2024-04-28 03:22:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9194536960. Throughput: 0: 60947.2. Samples: 2099714160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:04,253][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 03:22:06,570][54818] Updated weights for policy 0, policy_version 561198 (0.0021) [2024-04-28 03:22:07,347][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (16500 times) [2024-04-28 03:22:08,926][54818] Updated weights for policy 0, policy_version 561208 (0.0022) [2024-04-28 03:22:09,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 9194848256. Throughput: 0: 60966.8. Samples: 2100087980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:09,253][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 03:22:11,743][54818] Updated weights for policy 0, policy_version 561218 (0.0016) [2024-04-28 03:22:14,130][54818] Updated weights for policy 0, policy_version 561228 (0.0016) [2024-04-28 03:22:14,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9195159552. Throughput: 0: 60992.4. Samples: 2100274920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:14,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 03:22:16,990][54818] Updated weights for policy 0, policy_version 561238 (0.0015) [2024-04-28 03:22:19,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9195470848. Throughput: 0: 61142.0. Samples: 2100641140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:19,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 03:22:19,623][54818] Updated weights for policy 0, policy_version 561248 (0.0017) [2024-04-28 03:22:21,439][54798] Signal inference workers to stop experience collection... (34550 times) [2024-04-28 03:22:21,439][54798] Signal inference workers to resume experience collection... (34550 times) [2024-04-28 03:22:21,450][54818] InferenceWorker_p0-w0: stopping experience collection (34550 times) [2024-04-28 03:22:21,451][54818] InferenceWorker_p0-w0: resuming experience collection (34550 times) [2024-04-28 03:22:22,361][54818] Updated weights for policy 0, policy_version 561258 (0.0016) [2024-04-28 03:22:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9195765760. 
Throughput: 0: 61157.8. Samples: 2101008500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:24,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 03:22:25,348][54818] Updated weights for policy 0, policy_version 561268 (0.0015) [2024-04-28 03:22:27,552][54818] Updated weights for policy 0, policy_version 561278 (0.0016) [2024-04-28 03:22:29,253][54587] Fps is (10 sec: 58982.6, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9196060672. Throughput: 0: 60867.6. Samples: 2101188960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:29,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 03:22:30,811][54818] Updated weights for policy 0, policy_version 561288 (0.0018) [2024-04-28 03:22:33,089][54818] Updated weights for policy 0, policy_version 561298 (0.0016) [2024-04-28 03:22:34,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9196371968. Throughput: 0: 61257.6. Samples: 2101557180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:34,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 03:22:34,297][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (16600 times) [2024-04-28 03:22:36,057][54818] Updated weights for policy 0, policy_version 561308 (0.0016) [2024-04-28 03:22:38,314][54818] Updated weights for policy 0, policy_version 561318 (0.0019) [2024-04-28 03:22:39,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9196683264. Throughput: 0: 61088.2. Samples: 2101922420. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:39,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 03:22:41,370][54818] Updated weights for policy 0, policy_version 561328 (0.0017) [2024-04-28 03:22:43,652][54818] Updated weights for policy 0, policy_version 561338 (0.0016) [2024-04-28 03:22:44,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9196994560. Throughput: 0: 60984.5. Samples: 2102108000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:44,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:22:46,571][54818] Updated weights for policy 0, policy_version 561348 (0.0017) [2024-04-28 03:22:49,012][54818] Updated weights for policy 0, policy_version 561358 (0.0019) [2024-04-28 03:22:49,253][54587] Fps is (10 sec: 63897.6, 60 sec: 61986.0, 300 sec: 61370.5). Total num frames: 9197322240. Throughput: 0: 61410.5. Samples: 2102477640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:49,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 03:22:51,824][54818] Updated weights for policy 0, policy_version 561368 (0.0017) [2024-04-28 03:22:54,196][54818] Updated weights for policy 0, policy_version 561378 (0.0015) [2024-04-28 03:22:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61439.9, 300 sec: 61315.1). Total num frames: 9197617152. Throughput: 0: 61276.7. Samples: 2102845440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:22:54,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 03:22:57,043][54818] Updated weights for policy 0, policy_version 561388 (0.0018) [2024-04-28 03:22:59,253][54587] Fps is (10 sec: 58982.6, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 9197912064. Throughput: 0: 61296.4. Samples: 2103033260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:22:59,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 03:22:59,555][54818] Updated weights for policy 0, policy_version 561398 (0.0016) [2024-04-28 03:23:01,389][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (16700 times) [2024-04-28 03:23:02,426][54818] Updated weights for policy 0, policy_version 561408 (0.0018) [2024-04-28 03:23:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61439.9, 300 sec: 61315.1). Total num frames: 9198223360. Throughput: 0: 61289.8. Samples: 2103399180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:04,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 03:23:05,462][54818] Updated weights for policy 0, policy_version 561418 (0.0017) [2024-04-28 03:23:06,729][54798] Signal inference workers to stop experience collection... (34600 times) [2024-04-28 03:23:06,739][54798] Signal inference workers to resume experience collection... (34600 times) [2024-04-28 03:23:06,751][54818] InferenceWorker_p0-w0: stopping experience collection (34600 times) [2024-04-28 03:23:06,752][54818] InferenceWorker_p0-w0: resuming experience collection (34600 times) [2024-04-28 03:23:07,726][54818] Updated weights for policy 0, policy_version 561428 (0.0016) [2024-04-28 03:23:09,253][54587] Fps is (10 sec: 63897.5, 60 sec: 61712.9, 300 sec: 61426.1). Total num frames: 9198551040. Throughput: 0: 61275.0. Samples: 2103765880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:09,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 03:23:10,758][54818] Updated weights for policy 0, policy_version 561438 (0.0016) [2024-04-28 03:23:13,252][54818] Updated weights for policy 0, policy_version 561448 (0.0017) [2024-04-28 03:23:14,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9198845952. Throughput: 0: 61451.6. Samples: 2103954280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:14,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 03:23:16,037][54818] Updated weights for policy 0, policy_version 561458 (0.0018) [2024-04-28 03:23:18,538][54818] Updated weights for policy 0, policy_version 561468 (0.0015) [2024-04-28 03:23:19,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9199157248. Throughput: 0: 61425.4. Samples: 2104321320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:19,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 03:23:21,271][54818] Updated weights for policy 0, policy_version 561478 (0.0017) [2024-04-28 03:23:23,711][54818] Updated weights for policy 0, policy_version 561488 (0.0016) [2024-04-28 03:23:24,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9199452160. Throughput: 0: 61256.0. Samples: 2104678940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:24,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 03:23:26,698][54818] Updated weights for policy 0, policy_version 561498 (0.0015) [2024-04-28 03:23:27,763][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (16800 times) [2024-04-28 03:23:28,830][54818] Updated weights for policy 0, policy_version 561508 (0.0020) [2024-04-28 03:23:29,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9199763456. Throughput: 0: 61400.0. Samples: 2104871000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:29,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 03:23:32,174][54818] Updated weights for policy 0, policy_version 561518 (0.0015) [2024-04-28 03:23:34,218][54818] Updated weights for policy 0, policy_version 561528 (0.0015) [2024-04-28 03:23:34,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9200074752. 
Throughput: 0: 61366.8. Samples: 2105239140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:34,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:23:34,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (900 times) [2024-04-28 03:23:37,566][54818] Updated weights for policy 0, policy_version 561538 (0.0015) [2024-04-28 03:23:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9200369664. Throughput: 0: 61134.6. Samples: 2105596500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:39,255][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 03:23:40,175][54818] Updated weights for policy 0, policy_version 561548 (0.0017) [2024-04-28 03:23:42,894][54818] Updated weights for policy 0, policy_version 561558 (0.0017) [2024-04-28 03:23:44,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9200680960. Throughput: 0: 61127.0. Samples: 2105783980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:44,255][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 03:23:45,467][54818] Updated weights for policy 0, policy_version 561568 (0.0017) [2024-04-28 03:23:48,237][54818] Updated weights for policy 0, policy_version 561578 (0.0017) [2024-04-28 03:23:49,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9200975872. Throughput: 0: 61260.1. Samples: 2106155880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:49,253][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 03:23:49,260][54587] No heartbeat for components: RolloutWorker_w4 (21577 seconds), RolloutWorker_w5 (7677 seconds) [2024-04-28 03:23:49,334][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000561584_9200992256.pth... 
[2024-04-28 03:23:49,397][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000560686_9186279424.pth [2024-04-28 03:23:50,822][54818] Updated weights for policy 0, policy_version 561588 (0.0017) [2024-04-28 03:23:50,949][54798] Signal inference workers to stop experience collection... (34650 times) [2024-04-28 03:23:50,955][54818] InferenceWorker_p0-w0: stopping experience collection (34650 times) [2024-04-28 03:23:51,010][54798] Signal inference workers to resume experience collection... (34650 times) [2024-04-28 03:23:51,010][54818] InferenceWorker_p0-w0: resuming experience collection (34650 times) [2024-04-28 03:23:53,432][54818] Updated weights for policy 0, policy_version 561598 (0.0017) [2024-04-28 03:23:54,253][54587] Fps is (10 sec: 58982.0, 60 sec: 60893.7, 300 sec: 61203.9). Total num frames: 9201270784. Throughput: 0: 61070.1. Samples: 2106514040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:54,254][54587] Avg episode reward: [(0, '0.476')] [2024-04-28 03:23:54,421][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (16900 times) [2024-04-28 03:23:56,044][54818] Updated weights for policy 0, policy_version 561608 (0.0016) [2024-04-28 03:23:58,601][54818] Updated weights for policy 0, policy_version 561618 (0.0016) [2024-04-28 03:23:59,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9201582080. Throughput: 0: 61031.1. Samples: 2106700680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:23:59,255][54587] Avg episode reward: [(0, '0.687')] [2024-04-28 03:24:01,260][54818] Updated weights for policy 0, policy_version 561628 (0.0016) [2024-04-28 03:24:03,899][54818] Updated weights for policy 0, policy_version 561638 (0.0018) [2024-04-28 03:24:04,253][54587] Fps is (10 sec: 60622.3, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9201876992. Throughput: 0: 61217.5. Samples: 2107076100. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:24:04,253][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 03:24:06,601][54818] Updated weights for policy 0, policy_version 561648 (0.0017) [2024-04-28 03:24:09,193][54818] Updated weights for policy 0, policy_version 561658 (0.0017) [2024-04-28 03:24:09,253][54587] Fps is (10 sec: 62259.0, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9202204672. Throughput: 0: 61325.4. Samples: 2107438580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:24:09,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 03:24:12,349][54818] Updated weights for policy 0, policy_version 561668 (0.0016) [2024-04-28 03:24:14,253][54587] Fps is (10 sec: 62258.8, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9202499584. Throughput: 0: 61068.9. Samples: 2107619100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:24:14,253][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 03:24:14,639][54818] Updated weights for policy 0, policy_version 561678 (0.0015) [2024-04-28 03:24:17,592][54818] Updated weights for policy 0, policy_version 561688 (0.0017) [2024-04-28 03:24:19,253][54587] Fps is (10 sec: 57343.7, 60 sec: 60347.6, 300 sec: 61148.4). Total num frames: 9202778112. Throughput: 0: 61093.2. Samples: 2107988340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:24:19,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 03:24:20,084][54818] Updated weights for policy 0, policy_version 561698 (0.0016) [2024-04-28 03:24:21,075][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17000 times) [2024-04-28 03:24:22,879][54818] Updated weights for policy 0, policy_version 561708 (0.0019) [2024-04-28 03:24:24,253][54587] Fps is (10 sec: 58982.6, 60 sec: 60620.9, 300 sec: 61148.4). Total num frames: 9203089408. Throughput: 0: 61325.5. Samples: 2108356140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-04-28 03:24:24,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 03:24:25,470][54818] Updated weights for policy 0, policy_version 561718 (0.0016) [2024-04-28 03:24:28,283][54818] Updated weights for policy 0, policy_version 561728 (0.0016) [2024-04-28 03:24:29,253][54587] Fps is (10 sec: 62259.3, 60 sec: 60620.7, 300 sec: 61204.0). Total num frames: 9203400704. Throughput: 0: 61210.7. Samples: 2108538460. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:24:29,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 03:24:30,602][54818] Updated weights for policy 0, policy_version 561738 (0.0022) [2024-04-28 03:24:33,667][54818] Updated weights for policy 0, policy_version 561748 (0.0015) [2024-04-28 03:24:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60347.7, 300 sec: 61148.6). Total num frames: 9203695616. Throughput: 0: 61075.5. Samples: 2108904280. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:24:34,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 03:24:36,002][54818] Updated weights for policy 0, policy_version 561758 (0.0017) [2024-04-28 03:24:37,262][54798] Signal inference workers to stop experience collection... (34700 times) [2024-04-28 03:24:37,263][54798] Signal inference workers to resume experience collection... (34700 times) [2024-04-28 03:24:37,278][54818] InferenceWorker_p0-w0: stopping experience collection (34700 times) [2024-04-28 03:24:37,278][54818] InferenceWorker_p0-w0: resuming experience collection (34700 times) [2024-04-28 03:24:38,974][54818] Updated weights for policy 0, policy_version 561768 (0.0016) [2024-04-28 03:24:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60620.8, 300 sec: 61203.9). Total num frames: 9204006912. Throughput: 0: 61390.0. Samples: 2109276580. 
Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:24:39,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 03:24:41,331][54818] Updated weights for policy 0, policy_version 561778 (0.0019) [2024-04-28 03:24:44,253][54587] Fps is (10 sec: 62259.4, 60 sec: 60620.9, 300 sec: 61204.0). Total num frames: 9204318208. Throughput: 0: 61173.9. Samples: 2109453500. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:24:44,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 03:24:44,599][54818] Updated weights for policy 0, policy_version 561788 (0.0017) [2024-04-28 03:24:46,671][54818] Updated weights for policy 0, policy_version 561798 (0.0017) [2024-04-28 03:24:48,584][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17100 times) [2024-04-28 03:24:49,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.7, 300 sec: 61148.4). Total num frames: 9204613120. Throughput: 0: 60966.5. Samples: 2109819600. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:24:49,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 03:24:49,740][54818] Updated weights for policy 0, policy_version 561808 (0.0015) [2024-04-28 03:24:52,362][54818] Updated weights for policy 0, policy_version 561818 (0.0018) [2024-04-28 03:24:54,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 9204924416. Throughput: 0: 61349.8. Samples: 2110199320. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:24:54,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 03:24:55,072][54818] Updated weights for policy 0, policy_version 561828 (0.0022) [2024-04-28 03:24:57,452][54818] Updated weights for policy 0, policy_version 561838 (0.0016) [2024-04-28 03:24:59,253][54587] Fps is (10 sec: 62259.1, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9205235712. Throughput: 0: 61106.1. Samples: 2110368880. 
Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:24:59,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-28 03:25:00,215][54818] Updated weights for policy 0, policy_version 561848 (0.0015) [2024-04-28 03:25:02,668][54818] Updated weights for policy 0, policy_version 561858 (0.0016) [2024-04-28 03:25:04,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9205530624. Throughput: 0: 60956.6. Samples: 2110731380. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:04,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 03:25:05,525][54818] Updated weights for policy 0, policy_version 561868 (0.0018) [2024-04-28 03:25:08,747][54818] Updated weights for policy 0, policy_version 561878 (0.0016) [2024-04-28 03:25:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60620.8, 300 sec: 61204.0). Total num frames: 9205841920. Throughput: 0: 61378.5. Samples: 2111118180. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:09,255][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 03:25:10,889][54818] Updated weights for policy 0, policy_version 561888 (0.0016) [2024-04-28 03:25:13,899][54818] Updated weights for policy 0, policy_version 561898 (0.0016) [2024-04-28 03:25:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9206153216. Throughput: 0: 61199.3. Samples: 2111292420. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:14,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 03:25:15,024][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17200 times) [2024-04-28 03:25:16,357][54818] Updated weights for policy 0, policy_version 561908 (0.0019) [2024-04-28 03:25:16,839][54798] Signal inference workers to stop experience collection... 
(34750 times) [2024-04-28 03:25:16,875][54818] InferenceWorker_p0-w0: stopping experience collection (34750 times) [2024-04-28 03:25:16,930][54798] Signal inference workers to resume experience collection... (34750 times) [2024-04-28 03:25:16,930][54818] InferenceWorker_p0-w0: resuming experience collection (34750 times) [2024-04-28 03:25:19,064][54818] Updated weights for policy 0, policy_version 561918 (0.0015) [2024-04-28 03:25:19,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9206464512. Throughput: 0: 61234.6. Samples: 2111659840. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:19,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 03:25:21,644][54818] Updated weights for policy 0, policy_version 561928 (0.0017) [2024-04-28 03:25:24,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9206775808. Throughput: 0: 61369.3. Samples: 2112038200. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:24,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 03:25:24,699][54818] Updated weights for policy 0, policy_version 561938 (0.0017) [2024-04-28 03:25:27,074][54818] Updated weights for policy 0, policy_version 561948 (0.0015) [2024-04-28 03:25:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9207070720. Throughput: 0: 61424.7. Samples: 2112217620. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:29,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 03:25:29,923][54818] Updated weights for policy 0, policy_version 561958 (0.0022) [2024-04-28 03:25:32,156][54818] Updated weights for policy 0, policy_version 561968 (0.0021) [2024-04-28 03:25:34,253][54587] Fps is (10 sec: 58982.4, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9207365632. Throughput: 0: 61260.8. Samples: 2112576340. 
Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:34,255][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 03:25:35,154][54818] Updated weights for policy 0, policy_version 561978 (0.0016) [2024-04-28 03:25:37,351][54818] Updated weights for policy 0, policy_version 561988 (0.0019) [2024-04-28 03:25:39,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9207676928. Throughput: 0: 61266.3. Samples: 2112956300. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:39,254][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 03:25:40,392][54818] Updated weights for policy 0, policy_version 561998 (0.0018) [2024-04-28 03:25:41,736][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17300 times) [2024-04-28 03:25:42,733][54818] Updated weights for policy 0, policy_version 562008 (0.0015) [2024-04-28 03:25:44,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9207988224. Throughput: 0: 61492.1. Samples: 2113136020. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:44,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 03:25:45,771][54818] Updated weights for policy 0, policy_version 562018 (0.0016) [2024-04-28 03:25:48,221][54818] Updated weights for policy 0, policy_version 562028 (0.0017) [2024-04-28 03:25:49,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9208299520. Throughput: 0: 61410.6. Samples: 2113494860. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:49,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-28 03:25:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000562030_9208299520.pth... 
[2024-04-28 03:25:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000561135_9193635840.pth [2024-04-28 03:25:51,008][54818] Updated weights for policy 0, policy_version 562038 (0.0017) [2024-04-28 03:25:53,891][54818] Updated weights for policy 0, policy_version 562048 (0.0017) [2024-04-28 03:25:54,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9208594432. Throughput: 0: 61060.4. Samples: 2113865900. Policy #0 lag: (min: 2.0, avg: 11.3, max: 21.0) [2024-04-28 03:25:54,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 03:25:56,507][54818] Updated weights for policy 0, policy_version 562058 (0.0018) [2024-04-28 03:25:59,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9208905728. Throughput: 0: 61196.9. Samples: 2114046280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:25:59,253][54587] Avg episode reward: [(0, '0.672')] [2024-04-28 03:25:59,299][54818] Updated weights for policy 0, policy_version 562068 (0.0016) [2024-04-28 03:26:01,586][54818] Updated weights for policy 0, policy_version 562078 (0.0017) [2024-04-28 03:26:04,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9209200640. Throughput: 0: 61169.9. Samples: 2114412480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:04,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:26:04,819][54818] Updated weights for policy 0, policy_version 562088 (0.0015) [2024-04-28 03:26:05,591][54798] Signal inference workers to stop experience collection... (34800 times) [2024-04-28 03:26:05,591][54798] Signal inference workers to resume experience collection... 
(34800 times) [2024-04-28 03:26:05,618][54818] InferenceWorker_p0-w0: stopping experience collection (34800 times) [2024-04-28 03:26:05,618][54818] InferenceWorker_p0-w0: resuming experience collection (34800 times) [2024-04-28 03:26:07,064][54818] Updated weights for policy 0, policy_version 562098 (0.0017) [2024-04-28 03:26:08,543][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17400 times) [2024-04-28 03:26:09,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9209511936. Throughput: 0: 61124.9. Samples: 2114788820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:09,255][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:26:10,200][54818] Updated weights for policy 0, policy_version 562108 (0.0017) [2024-04-28 03:26:12,290][54818] Updated weights for policy 0, policy_version 562118 (0.0018) [2024-04-28 03:26:14,253][54587] Fps is (10 sec: 63896.4, 60 sec: 61439.7, 300 sec: 61203.9). Total num frames: 9209839616. Throughput: 0: 61065.7. Samples: 2114965580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:14,254][54587] Avg episode reward: [(0, '0.485')] [2024-04-28 03:26:15,440][54818] Updated weights for policy 0, policy_version 562128 (0.0016) [2024-04-28 03:26:17,581][54818] Updated weights for policy 0, policy_version 562138 (0.0015) [2024-04-28 03:26:19,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9210134528. Throughput: 0: 61184.9. Samples: 2115329660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:19,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 03:26:20,785][54818] Updated weights for policy 0, policy_version 562148 (0.0015) [2024-04-28 03:26:23,059][54818] Updated weights for policy 0, policy_version 562158 (0.0016) [2024-04-28 03:26:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9210445824. 
Throughput: 0: 61045.1. Samples: 2115703340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:24,255][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 03:26:25,954][54818] Updated weights for policy 0, policy_version 562168 (0.0015) [2024-04-28 03:26:28,587][54818] Updated weights for policy 0, policy_version 562178 (0.0018) [2024-04-28 03:26:29,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9210757120. Throughput: 0: 60993.2. Samples: 2115880720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:29,255][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 03:26:31,378][54818] Updated weights for policy 0, policy_version 562188 (0.0019) [2024-04-28 03:26:33,910][54818] Updated weights for policy 0, policy_version 562198 (0.0015) [2024-04-28 03:26:34,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61713.1, 300 sec: 61203.9). Total num frames: 9211068416. Throughput: 0: 61103.9. Samples: 2116244540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:34,255][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 03:26:35,468][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17500 times) [2024-04-28 03:26:36,598][54818] Updated weights for policy 0, policy_version 562208 (0.0016) [2024-04-28 03:26:39,094][54818] Updated weights for policy 0, policy_version 562218 (0.0017) [2024-04-28 03:26:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9211379712. Throughput: 0: 61234.3. Samples: 2116621440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:39,254][54587] Avg episode reward: [(0, '0.505')] [2024-04-28 03:26:41,809][54818] Updated weights for policy 0, policy_version 562228 (0.0017) [2024-04-28 03:26:44,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9211691008. Throughput: 0: 61184.0. Samples: 2116799560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:44,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-28 03:26:44,977][54818] Updated weights for policy 0, policy_version 562238 (0.0017) [2024-04-28 03:26:47,039][54818] Updated weights for policy 0, policy_version 562248 (0.0018) [2024-04-28 03:26:49,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9211985920. Throughput: 0: 61159.5. Samples: 2117164660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:49,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 03:26:49,267][54587] No heartbeat for components: RolloutWorker_w4 (21757 seconds), RolloutWorker_w5 (7857 seconds) [2024-04-28 03:26:50,215][54818] Updated weights for policy 0, policy_version 562258 (0.0016) [2024-04-28 03:26:51,969][54798] Signal inference workers to stop experience collection... (34850 times) [2024-04-28 03:26:51,988][54818] InferenceWorker_p0-w0: stopping experience collection (34850 times) [2024-04-28 03:26:52,025][54798] Signal inference workers to resume experience collection... (34850 times) [2024-04-28 03:26:52,026][54818] InferenceWorker_p0-w0: resuming experience collection (34850 times) [2024-04-28 03:26:52,442][54818] Updated weights for policy 0, policy_version 562268 (0.0019) [2024-04-28 03:26:54,253][54587] Fps is (10 sec: 58982.4, 60 sec: 61440.2, 300 sec: 61148.4). Total num frames: 9212280832. Throughput: 0: 60990.9. Samples: 2117533400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:54,253][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:26:55,694][54818] Updated weights for policy 0, policy_version 562278 (0.0018) [2024-04-28 03:26:57,618][54818] Updated weights for policy 0, policy_version 562288 (0.0017) [2024-04-28 03:26:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9212592128. Throughput: 0: 61309.2. Samples: 2117724480. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:26:59,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 03:27:00,987][54818] Updated weights for policy 0, policy_version 562298 (0.0016) [2024-04-28 03:27:01,865][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17600 times) [2024-04-28 03:27:03,035][54818] Updated weights for policy 0, policy_version 562308 (0.0018) [2024-04-28 03:27:04,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61712.9, 300 sec: 61203.9). Total num frames: 9212903424. Throughput: 0: 61127.5. Samples: 2118080400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:27:04,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 03:27:06,190][54818] Updated weights for policy 0, policy_version 562318 (0.0016) [2024-04-28 03:27:08,872][54818] Updated weights for policy 0, policy_version 562328 (0.0016) [2024-04-28 03:27:09,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61713.2, 300 sec: 61204.0). Total num frames: 9213214720. Throughput: 0: 61162.0. Samples: 2118455620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:27:09,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 03:27:11,572][54818] Updated weights for policy 0, policy_version 562338 (0.0015) [2024-04-28 03:27:13,998][54818] Updated weights for policy 0, policy_version 562348 (0.0018) [2024-04-28 03:27:14,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9213509632. Throughput: 0: 61311.5. Samples: 2118639740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:27:14,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 03:27:16,845][54818] Updated weights for policy 0, policy_version 562358 (0.0016) [2024-04-28 03:27:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9213820928. Throughput: 0: 61177.8. Samples: 2118997540. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:27:19,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 03:27:19,373][54818] Updated weights for policy 0, policy_version 562368 (0.0017) [2024-04-28 03:27:22,083][54818] Updated weights for policy 0, policy_version 562378 (0.0017) [2024-04-28 03:27:24,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.1, 300 sec: 61203.9). Total num frames: 9214115840. Throughput: 0: 61100.0. Samples: 2119370940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 19.0) [2024-04-28 03:27:24,255][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 03:27:25,000][54818] Updated weights for policy 0, policy_version 562388 (0.0016) [2024-04-28 03:27:27,379][54818] Updated weights for policy 0, policy_version 562398 (0.0016) [2024-04-28 03:27:28,624][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17700 times) [2024-04-28 03:27:29,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9214443520. Throughput: 0: 61267.1. Samples: 2119556580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:27:29,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 03:27:30,403][54818] Updated weights for policy 0, policy_version 562408 (0.0017) [2024-04-28 03:27:32,667][54818] Updated weights for policy 0, policy_version 562418 (0.0019) [2024-04-28 03:27:34,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9214738432. Throughput: 0: 61170.0. Samples: 2119917320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:27:34,255][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:27:35,908][54818] Updated weights for policy 0, policy_version 562428 (0.0015) [2024-04-28 03:27:38,113][54818] Updated weights for policy 0, policy_version 562438 (0.0020) [2024-04-28 03:27:39,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9215049728. 
Throughput: 0: 61106.1. Samples: 2120283180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:27:39,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:27:40,935][54798] Signal inference workers to stop experience collection... (34900 times) [2024-04-28 03:27:40,980][54818] InferenceWorker_p0-w0: stopping experience collection (34900 times) [2024-04-28 03:27:40,986][54798] Signal inference workers to resume experience collection... (34900 times) [2024-04-28 03:27:40,992][54818] InferenceWorker_p0-w0: resuming experience collection (34900 times) [2024-04-28 03:27:41,095][54818] Updated weights for policy 0, policy_version 562448 (0.0016) [2024-04-28 03:27:43,460][54818] Updated weights for policy 0, policy_version 562458 (0.0019) [2024-04-28 03:27:44,253][54587] Fps is (10 sec: 60622.1, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9215344640. Throughput: 0: 61076.9. Samples: 2120472940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:27:44,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 03:27:46,418][54818] Updated weights for policy 0, policy_version 562468 (0.0016) [2024-04-28 03:27:48,786][54818] Updated weights for policy 0, policy_version 562478 (0.0017) [2024-04-28 03:27:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9215655936. Throughput: 0: 61307.2. Samples: 2120839220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:27:49,255][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 03:27:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000562479_9215655936.pth... 
[2024-04-28 03:27:49,343][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000561584_9200992256.pth [2024-04-28 03:27:51,687][54818] Updated weights for policy 0, policy_version 562488 (0.0020) [2024-04-28 03:27:54,056][54818] Updated weights for policy 0, policy_version 562498 (0.0017) [2024-04-28 03:27:54,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9215967232. Throughput: 0: 60941.3. Samples: 2121197980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:27:54,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 03:27:55,861][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17800 times) [2024-04-28 03:27:56,969][54818] Updated weights for policy 0, policy_version 562508 (0.0017) [2024-04-28 03:27:59,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9216278528. Throughput: 0: 61114.4. Samples: 2121389880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:27:59,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 03:27:59,759][54818] Updated weights for policy 0, policy_version 562518 (0.0019) [2024-04-28 03:28:02,582][54818] Updated weights for policy 0, policy_version 562528 (0.0016) [2024-04-28 03:28:04,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 9216573440. Throughput: 0: 61244.1. Samples: 2121753520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:04,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 03:28:04,941][54818] Updated weights for policy 0, policy_version 562538 (0.0020) [2024-04-28 03:28:07,841][54818] Updated weights for policy 0, policy_version 562548 (0.0016) [2024-04-28 03:28:09,253][54587] Fps is (10 sec: 58981.9, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9216868352. Throughput: 0: 60931.5. Samples: 2122112860. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:09,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-28 03:28:10,777][54818] Updated weights for policy 0, policy_version 562558 (0.0015) [2024-04-28 03:28:13,356][54818] Updated weights for policy 0, policy_version 562568 (0.0018) [2024-04-28 03:28:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 9217179648. Throughput: 0: 60959.1. Samples: 2122299740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:14,254][54587] Avg episode reward: [(0, '0.693')] [2024-04-28 03:28:16,352][54818] Updated weights for policy 0, policy_version 562578 (0.0016) [2024-04-28 03:28:18,677][54818] Updated weights for policy 0, policy_version 562588 (0.0015) [2024-04-28 03:28:19,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9217490944. Throughput: 0: 61064.6. Samples: 2122665220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:19,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 03:28:21,537][54818] Updated weights for policy 0, policy_version 562598 (0.0017) [2024-04-28 03:28:22,538][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (17900 times) [2024-04-28 03:28:23,859][54818] Updated weights for policy 0, policy_version 562608 (0.0018) [2024-04-28 03:28:24,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9217785856. Throughput: 0: 61070.2. Samples: 2123031340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:24,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 03:28:26,796][54818] Updated weights for policy 0, policy_version 562618 (0.0016) [2024-04-28 03:28:27,410][54798] Signal inference workers to stop experience collection... (34950 times) [2024-04-28 03:28:27,411][54798] Signal inference workers to resume experience collection... 
(34950 times) [2024-04-28 03:28:27,436][54818] InferenceWorker_p0-w0: stopping experience collection (34950 times) [2024-04-28 03:28:27,436][54818] InferenceWorker_p0-w0: resuming experience collection (34950 times) [2024-04-28 03:28:29,253][54587] Fps is (10 sec: 58982.9, 60 sec: 60620.8, 300 sec: 61037.4). Total num frames: 9218080768. Throughput: 0: 60964.4. Samples: 2123216340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:29,253][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 03:28:29,303][54818] Updated weights for policy 0, policy_version 562628 (0.0017) [2024-04-28 03:28:32,013][54818] Updated weights for policy 0, policy_version 562638 (0.0018) [2024-04-28 03:28:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9218392064. Throughput: 0: 60870.5. Samples: 2123578400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:34,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 03:28:34,573][54818] Updated weights for policy 0, policy_version 562648 (0.0016) [2024-04-28 03:28:37,438][54818] Updated weights for policy 0, policy_version 562658 (0.0018) [2024-04-28 03:28:39,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60620.8, 300 sec: 61037.3). Total num frames: 9218686976. Throughput: 0: 61087.9. Samples: 2123946940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:39,255][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 03:28:39,762][54818] Updated weights for policy 0, policy_version 562668 (0.0016) [2024-04-28 03:28:42,715][54818] Updated weights for policy 0, policy_version 562678 (0.0015) [2024-04-28 03:28:44,254][54587] Fps is (10 sec: 60620.2, 60 sec: 60893.5, 300 sec: 61092.8). Total num frames: 9218998272. Throughput: 0: 61007.7. Samples: 2124135240. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:44,255][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 03:28:44,998][54818] Updated weights for policy 0, policy_version 562688 (0.0015) [2024-04-28 03:28:48,128][54818] Updated weights for policy 0, policy_version 562698 (0.0016) [2024-04-28 03:28:49,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60620.9, 300 sec: 61092.9). Total num frames: 9219293184. Throughput: 0: 60983.6. Samples: 2124497780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:49,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 03:28:49,371][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18000 times) [2024-04-28 03:28:50,841][54818] Updated weights for policy 0, policy_version 562708 (0.0018) [2024-04-28 03:28:53,585][54818] Updated weights for policy 0, policy_version 562718 (0.0018) [2024-04-28 03:28:54,253][54587] Fps is (10 sec: 60622.1, 60 sec: 60620.8, 300 sec: 61092.9). Total num frames: 9219604480. Throughput: 0: 61049.9. Samples: 2124860100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 03:28:54,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 03:28:56,311][54818] Updated weights for policy 0, policy_version 562728 (0.0018) [2024-04-28 03:28:58,873][54818] Updated weights for policy 0, policy_version 562738 (0.0017) [2024-04-28 03:28:59,253][54587] Fps is (10 sec: 62258.7, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9219915776. Throughput: 0: 61083.9. Samples: 2125048520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:28:59,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 03:29:01,538][54818] Updated weights for policy 0, policy_version 562748 (0.0020) [2024-04-28 03:29:04,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60620.8, 300 sec: 61037.4). Total num frames: 9220210688. Throughput: 0: 61149.5. Samples: 2125416940. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:04,253][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 03:29:04,272][54818] Updated weights for policy 0, policy_version 562758 (0.0015) [2024-04-28 03:29:06,711][54818] Updated weights for policy 0, policy_version 562768 (0.0016) [2024-04-28 03:29:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9220521984. Throughput: 0: 61126.7. Samples: 2125782040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:09,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 03:29:09,499][54818] Updated weights for policy 0, policy_version 562778 (0.0017) [2024-04-28 03:29:12,451][54818] Updated weights for policy 0, policy_version 562788 (0.0016) [2024-04-28 03:29:14,253][54587] Fps is (10 sec: 62258.5, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9220833280. Throughput: 0: 60977.2. Samples: 2125960320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:14,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 03:29:14,971][54818] Updated weights for policy 0, policy_version 562798 (0.0018) [2024-04-28 03:29:16,002][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18100 times) [2024-04-28 03:29:17,722][54818] Updated weights for policy 0, policy_version 562808 (0.0016) [2024-04-28 03:29:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9221128192. Throughput: 0: 61378.8. Samples: 2126340440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:19,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 03:29:20,121][54818] Updated weights for policy 0, policy_version 562818 (0.0018) [2024-04-28 03:29:22,917][54818] Updated weights for policy 0, policy_version 562828 (0.0015) [2024-04-28 03:29:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 9221439488. 
Throughput: 0: 61219.7. Samples: 2126701820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:24,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:29:25,457][54818] Updated weights for policy 0, policy_version 562838 (0.0017) [2024-04-28 03:29:28,276][54818] Updated weights for policy 0, policy_version 562848 (0.0016) [2024-04-28 03:29:29,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9221734400. Throughput: 0: 61092.4. Samples: 2126884380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:29,253][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 03:29:30,419][54798] Signal inference workers to stop experience collection... (35000 times) [2024-04-28 03:29:30,419][54798] Signal inference workers to resume experience collection... (35000 times) [2024-04-28 03:29:30,435][54818] InferenceWorker_p0-w0: stopping experience collection (35000 times) [2024-04-28 03:29:30,435][54818] InferenceWorker_p0-w0: resuming experience collection (35000 times) [2024-04-28 03:29:30,912][54818] Updated weights for policy 0, policy_version 562858 (0.0016) [2024-04-28 03:29:33,601][54818] Updated weights for policy 0, policy_version 562868 (0.0015) [2024-04-28 03:29:34,253][54587] Fps is (10 sec: 58981.7, 60 sec: 60620.9, 300 sec: 61092.9). Total num frames: 9222029312. Throughput: 0: 61313.1. Samples: 2127256880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:34,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 03:29:36,132][54818] Updated weights for policy 0, policy_version 562878 (0.0017) [2024-04-28 03:29:39,253][54587] Fps is (10 sec: 60619.3, 60 sec: 60893.8, 300 sec: 61092.8). Total num frames: 9222340608. Throughput: 0: 61201.1. Samples: 2127614160. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:39,255][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 03:29:39,375][54818] Updated weights for policy 0, policy_version 562888 (0.0017) [2024-04-28 03:29:41,438][54818] Updated weights for policy 0, policy_version 562898 (0.0015) [2024-04-28 03:29:42,790][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18200 times) [2024-04-28 03:29:44,255][54587] Fps is (10 sec: 62249.1, 60 sec: 60892.4, 300 sec: 61148.1). Total num frames: 9222651904. Throughput: 0: 61072.8. Samples: 2127796900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:44,255][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 03:29:44,677][54818] Updated weights for policy 0, policy_version 562908 (0.0016) [2024-04-28 03:29:46,804][54818] Updated weights for policy 0, policy_version 562918 (0.0017) [2024-04-28 03:29:49,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9222963200. Throughput: 0: 61123.3. Samples: 2128167500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:49,255][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 03:29:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000562925_9222963200.pth... [2024-04-28 03:29:49,271][54587] No heartbeat for components: RolloutWorker_w4 (21937 seconds), RolloutWorker_w5 (8037 seconds) [2024-04-28 03:29:49,327][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000562030_9208299520.pth [2024-04-28 03:29:49,973][54818] Updated weights for policy 0, policy_version 562928 (0.0017) [2024-04-28 03:29:52,581][54818] Updated weights for policy 0, policy_version 562938 (0.0016) [2024-04-28 03:29:54,253][54587] Fps is (10 sec: 60631.1, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9223258112. Throughput: 0: 61107.2. Samples: 2128531860. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:54,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 03:29:55,217][54818] Updated weights for policy 0, policy_version 562948 (0.0018) [2024-04-28 03:29:57,777][54818] Updated weights for policy 0, policy_version 562958 (0.0017) [2024-04-28 03:29:59,253][54587] Fps is (10 sec: 60621.8, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 9223569408. Throughput: 0: 61023.7. Samples: 2128706380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:29:59,253][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 03:30:00,555][54818] Updated weights for policy 0, policy_version 562968 (0.0017) [2024-04-28 03:30:03,076][54818] Updated weights for policy 0, policy_version 562978 (0.0017) [2024-04-28 03:30:04,253][54587] Fps is (10 sec: 63897.5, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9223897088. Throughput: 0: 61021.8. Samples: 2129086420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:30:04,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:30:05,824][54818] Updated weights for policy 0, policy_version 562988 (0.0017) [2024-04-28 03:30:08,236][54818] Updated weights for policy 0, policy_version 562998 (0.0017) [2024-04-28 03:30:09,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9224192000. Throughput: 0: 61069.3. Samples: 2129449940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:30:09,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 03:30:10,162][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18300 times) [2024-04-28 03:30:10,315][54798] Signal inference workers to stop experience collection... (35050 times) [2024-04-28 03:30:10,324][54798] Signal inference workers to resume experience collection... 
(35050 times) [2024-04-28 03:30:10,337][54818] InferenceWorker_p0-w0: stopping experience collection (35050 times) [2024-04-28 03:30:10,337][54818] InferenceWorker_p0-w0: resuming experience collection (35050 times) [2024-04-28 03:30:11,159][54818] Updated weights for policy 0, policy_version 563008 (0.0017) [2024-04-28 03:30:13,907][54818] Updated weights for policy 0, policy_version 563018 (0.0016) [2024-04-28 03:30:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9224503296. Throughput: 0: 60799.4. Samples: 2129620360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:30:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 03:30:16,308][54818] Updated weights for policy 0, policy_version 563028 (0.0018) [2024-04-28 03:30:19,204][54818] Updated weights for policy 0, policy_version 563038 (0.0015) [2024-04-28 03:30:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9224814592. Throughput: 0: 60856.6. Samples: 2129995420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:30:19,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:30:21,726][54818] Updated weights for policy 0, policy_version 563048 (0.0017) [2024-04-28 03:30:24,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9225125888. Throughput: 0: 61148.6. Samples: 2130365840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:30:24,255][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 03:30:25,091][54818] Updated weights for policy 0, policy_version 563058 (0.0016) [2024-04-28 03:30:27,122][54818] Updated weights for policy 0, policy_version 563068 (0.0018) [2024-04-28 03:30:29,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61439.8, 300 sec: 61203.9). Total num frames: 9225420800. Throughput: 0: 61066.6. Samples: 2130544800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:30:29,255][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 03:30:30,326][54818] Updated weights for policy 0, policy_version 563078 (0.0021) [2024-04-28 03:30:32,496][54818] Updated weights for policy 0, policy_version 563088 (0.0016) [2024-04-28 03:30:34,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61713.2, 300 sec: 61204.0). Total num frames: 9225732096. Throughput: 0: 60902.8. Samples: 2130908120. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:30:34,253][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 03:30:35,600][54818] Updated weights for policy 0, policy_version 563098 (0.0017) [2024-04-28 03:30:36,318][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18400 times) [2024-04-28 03:30:37,820][54818] Updated weights for policy 0, policy_version 563108 (0.0016) [2024-04-28 03:30:39,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61713.1, 300 sec: 61203.9). Total num frames: 9226043392. Throughput: 0: 61147.9. Samples: 2131283520. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:30:39,254][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 03:30:40,822][54818] Updated weights for policy 0, policy_version 563118 (0.0017) [2024-04-28 03:30:43,083][54818] Updated weights for policy 0, policy_version 563128 (0.0015) [2024-04-28 03:30:44,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61714.9, 300 sec: 61204.0). Total num frames: 9226354688. Throughput: 0: 61281.3. Samples: 2131464040. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:30:44,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 03:30:45,945][54818] Updated weights for policy 0, policy_version 563138 (0.0017) [2024-04-28 03:30:48,666][54818] Updated weights for policy 0, policy_version 563148 (0.0016) [2024-04-28 03:30:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9226649600. 
Throughput: 0: 60968.0. Samples: 2131829980. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:30:49,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 03:30:50,716][54798] Signal inference workers to stop experience collection... (35100 times) [2024-04-28 03:30:50,748][54818] InferenceWorker_p0-w0: stopping experience collection (35100 times) [2024-04-28 03:30:50,768][54798] Signal inference workers to resume experience collection... (35100 times) [2024-04-28 03:30:50,768][54818] InferenceWorker_p0-w0: resuming experience collection (35100 times) [2024-04-28 03:30:51,248][54818] Updated weights for policy 0, policy_version 563158 (0.0019) [2024-04-28 03:30:54,019][54818] Updated weights for policy 0, policy_version 563168 (0.0019) [2024-04-28 03:30:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61713.2, 300 sec: 61204.0). Total num frames: 9226960896. Throughput: 0: 61176.9. Samples: 2132202900. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:30:54,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 03:30:56,508][54818] Updated weights for policy 0, policy_version 563178 (0.0017) [2024-04-28 03:30:59,191][54818] Updated weights for policy 0, policy_version 563188 (0.0016) [2024-04-28 03:30:59,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9227272192. Throughput: 0: 61380.1. Samples: 2132382460. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:30:59,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 03:31:01,801][54818] Updated weights for policy 0, policy_version 563198 (0.0017) [2024-04-28 03:31:03,416][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18500 times) [2024-04-28 03:31:04,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9227583488. Throughput: 0: 61314.6. Samples: 2132754580. 
Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:04,255][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 03:31:04,841][54818] Updated weights for policy 0, policy_version 563208 (0.0016) [2024-04-28 03:31:07,164][54818] Updated weights for policy 0, policy_version 563218 (0.0016) [2024-04-28 03:31:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9227878400. Throughput: 0: 61269.8. Samples: 2133122980. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:09,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-28 03:31:10,372][54818] Updated weights for policy 0, policy_version 563228 (0.0016) [2024-04-28 03:31:12,568][54818] Updated weights for policy 0, policy_version 563238 (0.0018) [2024-04-28 03:31:14,253][54587] Fps is (10 sec: 58982.3, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9228173312. Throughput: 0: 61231.6. Samples: 2133300220. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:14,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 03:31:15,694][54818] Updated weights for policy 0, policy_version 563248 (0.0015) [2024-04-28 03:31:18,251][54818] Updated weights for policy 0, policy_version 563258 (0.0020) [2024-04-28 03:31:19,256][54587] Fps is (10 sec: 60607.3, 60 sec: 61164.6, 300 sec: 61148.0). Total num frames: 9228484608. Throughput: 0: 61307.5. Samples: 2133667100. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:19,256][54587] Avg episode reward: [(0, '0.692')] [2024-04-28 03:31:20,908][54818] Updated weights for policy 0, policy_version 563268 (0.0018) [2024-04-28 03:31:23,582][54818] Updated weights for policy 0, policy_version 563278 (0.0016) [2024-04-28 03:31:24,253][54587] Fps is (10 sec: 60621.9, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9228779520. Throughput: 0: 61200.7. Samples: 2134037540. 
Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:24,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 03:31:26,206][54818] Updated weights for policy 0, policy_version 563288 (0.0016) [2024-04-28 03:31:29,064][54818] Updated weights for policy 0, policy_version 563298 (0.0018) [2024-04-28 03:31:29,253][54587] Fps is (10 sec: 60634.6, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9229090816. Throughput: 0: 61233.2. Samples: 2134219540. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:29,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 03:31:30,352][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18600 times) [2024-04-28 03:31:31,542][54818] Updated weights for policy 0, policy_version 563308 (0.0015) [2024-04-28 03:31:34,185][54818] Updated weights for policy 0, policy_version 563318 (0.0015) [2024-04-28 03:31:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9229402112. Throughput: 0: 61177.5. Samples: 2134582960. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:34,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 03:31:36,711][54818] Updated weights for policy 0, policy_version 563328 (0.0016) [2024-04-28 03:31:37,013][54798] Signal inference workers to stop experience collection... (35150 times) [2024-04-28 03:31:37,013][54798] Signal inference workers to resume experience collection... (35150 times) [2024-04-28 03:31:37,020][54818] InferenceWorker_p0-w0: stopping experience collection (35150 times) [2024-04-28 03:31:37,028][54818] InferenceWorker_p0-w0: resuming experience collection (35150 times) [2024-04-28 03:31:39,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61166.9, 300 sec: 61092.8). Total num frames: 9229713408. Throughput: 0: 61234.4. Samples: 2134958460. 
Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:39,255][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 03:31:39,383][54818] Updated weights for policy 0, policy_version 563338 (0.0018) [2024-04-28 03:31:41,945][54818] Updated weights for policy 0, policy_version 563348 (0.0018) [2024-04-28 03:31:44,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9230008320. Throughput: 0: 61153.7. Samples: 2135134380. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:44,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 03:31:44,705][54818] Updated weights for policy 0, policy_version 563358 (0.0015) [2024-04-28 03:31:47,313][54818] Updated weights for policy 0, policy_version 563368 (0.0016) [2024-04-28 03:31:49,253][54587] Fps is (10 sec: 58982.8, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9230303232. Throughput: 0: 61137.8. Samples: 2135505780. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:49,254][54587] Avg episode reward: [(0, '0.686')] [2024-04-28 03:31:49,319][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000563374_9230319616.pth... [2024-04-28 03:31:49,357][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000562479_9215655936.pth [2024-04-28 03:31:50,319][54818] Updated weights for policy 0, policy_version 563378 (0.0020) [2024-04-28 03:31:52,981][54818] Updated weights for policy 0, policy_version 563388 (0.0021) [2024-04-28 03:31:54,253][54587] Fps is (10 sec: 58983.1, 60 sec: 60620.8, 300 sec: 61037.3). Total num frames: 9230598144. Throughput: 0: 61054.9. Samples: 2135870440. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:54,253][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 03:31:54,254][54587] Runner:signal_='update_training_info' queue is Full (). 
receivers=['RolloutWorker_w4'] (1000 times) [2024-04-28 03:31:55,632][54818] Updated weights for policy 0, policy_version 563398 (0.0018) [2024-04-28 03:31:56,463][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18700 times) [2024-04-28 03:31:58,487][54818] Updated weights for policy 0, policy_version 563408 (0.0018) [2024-04-28 03:31:59,254][54587] Fps is (10 sec: 62257.9, 60 sec: 60893.6, 300 sec: 61092.8). Total num frames: 9230925824. Throughput: 0: 61034.9. Samples: 2136046800. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-04-28 03:31:59,255][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 03:32:01,171][54818] Updated weights for policy 0, policy_version 563418 (0.0017) [2024-04-28 03:32:03,939][54818] Updated weights for policy 0, policy_version 563428 (0.0017) [2024-04-28 03:32:04,253][54587] Fps is (10 sec: 63896.7, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9231237120. Throughput: 0: 61255.5. Samples: 2136423460. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:04,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 03:32:06,227][54818] Updated weights for policy 0, policy_version 563438 (0.0015) [2024-04-28 03:32:09,183][54818] Updated weights for policy 0, policy_version 563448 (0.0016) [2024-04-28 03:32:09,253][54587] Fps is (10 sec: 60622.0, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9231532032. Throughput: 0: 61262.9. Samples: 2136794380. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:09,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 03:32:11,677][54818] Updated weights for policy 0, policy_version 563458 (0.0016) [2024-04-28 03:32:14,253][54587] Fps is (10 sec: 58982.4, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 9231826944. Throughput: 0: 61155.5. Samples: 2136971540. 
Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:14,255][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 03:32:14,507][54818] Updated weights for policy 0, policy_version 563468 (0.0017) [2024-04-28 03:32:16,923][54818] Updated weights for policy 0, policy_version 563478 (0.0018) [2024-04-28 03:32:19,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60896.2, 300 sec: 61092.9). Total num frames: 9232138240. Throughput: 0: 61322.6. Samples: 2137342480. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:19,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 03:32:19,865][54818] Updated weights for policy 0, policy_version 563488 (0.0018) [2024-04-28 03:32:19,893][54798] Signal inference workers to stop experience collection... (35200 times) [2024-04-28 03:32:19,894][54798] Signal inference workers to resume experience collection... (35200 times) [2024-04-28 03:32:19,912][54818] InferenceWorker_p0-w0: stopping experience collection (35200 times) [2024-04-28 03:32:19,913][54818] InferenceWorker_p0-w0: resuming experience collection (35200 times) [2024-04-28 03:32:22,361][54818] Updated weights for policy 0, policy_version 563498 (0.0018) [2024-04-28 03:32:23,829][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18800 times) [2024-04-28 03:32:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.7, 300 sec: 60981.8). Total num frames: 9232433152. Throughput: 0: 61118.8. Samples: 2137708800. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:24,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-28 03:32:25,081][54818] Updated weights for policy 0, policy_version 563508 (0.0018) [2024-04-28 03:32:27,869][54818] Updated weights for policy 0, policy_version 563518 (0.0016) [2024-04-28 03:32:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.9, 300 sec: 61037.4). Total num frames: 9232744448. Throughput: 0: 61340.0. Samples: 2137894680. 
Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:29,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 03:32:30,263][54818] Updated weights for policy 0, policy_version 563528 (0.0017) [2024-04-28 03:32:33,074][54818] Updated weights for policy 0, policy_version 563538 (0.0018) [2024-04-28 03:32:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.7, 300 sec: 60981.8). Total num frames: 9233039360. Throughput: 0: 61150.7. Samples: 2138257560. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:34,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 03:32:35,466][54818] Updated weights for policy 0, policy_version 563548 (0.0017) [2024-04-28 03:32:38,659][54818] Updated weights for policy 0, policy_version 563558 (0.0017) [2024-04-28 03:32:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60621.0, 300 sec: 61037.3). Total num frames: 9233350656. Throughput: 0: 61387.5. Samples: 2138632880. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:39,253][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 03:32:40,696][54818] Updated weights for policy 0, policy_version 563568 (0.0016) [2024-04-28 03:32:44,004][54818] Updated weights for policy 0, policy_version 563578 (0.0016) [2024-04-28 03:32:44,253][54587] Fps is (10 sec: 62258.2, 60 sec: 60893.7, 300 sec: 61037.3). Total num frames: 9233661952. Throughput: 0: 61479.2. Samples: 2138813360. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:44,254][54587] Avg episode reward: [(0, '0.753')] [2024-04-28 03:32:46,225][54818] Updated weights for policy 0, policy_version 563588 (0.0018) [2024-04-28 03:32:49,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61166.9, 300 sec: 61037.3). Total num frames: 9233973248. Throughput: 0: 61147.5. Samples: 2139175100. 
Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:49,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 03:32:49,268][54587] No heartbeat for components: RolloutWorker_w4 (22117 seconds), RolloutWorker_w5 (8217 seconds) [2024-04-28 03:32:49,471][54818] Updated weights for policy 0, policy_version 563598 (0.0017) [2024-04-28 03:32:50,332][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (18900 times) [2024-04-28 03:32:51,884][54818] Updated weights for policy 0, policy_version 563608 (0.0019) [2024-04-28 03:32:54,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61166.8, 300 sec: 60981.8). Total num frames: 9234268160. Throughput: 0: 61235.6. Samples: 2139549980. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:54,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 03:32:54,704][54818] Updated weights for policy 0, policy_version 563618 (0.0016) [2024-04-28 03:32:57,289][54818] Updated weights for policy 0, policy_version 563628 (0.0016) [2024-04-28 03:32:59,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60894.1, 300 sec: 61037.3). Total num frames: 9234579456. Throughput: 0: 61288.9. Samples: 2139729540. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:32:59,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 03:32:59,977][54818] Updated weights for policy 0, policy_version 563638 (0.0017) [2024-04-28 03:33:02,549][54818] Updated weights for policy 0, policy_version 563648 (0.0018) [2024-04-28 03:33:04,253][54587] Fps is (10 sec: 62259.9, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9234890752. Throughput: 0: 61207.2. Samples: 2140096800. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:33:04,253][54587] Avg episode reward: [(0, '0.553')] [2024-04-28 03:33:05,194][54818] Updated weights for policy 0, policy_version 563658 (0.0018) [2024-04-28 03:33:07,378][54798] Signal inference workers to stop experience collection... 
(35250 times) [2024-04-28 03:33:07,401][54818] InferenceWorker_p0-w0: stopping experience collection (35250 times) [2024-04-28 03:33:07,441][54798] Signal inference workers to resume experience collection... (35250 times) [2024-04-28 03:33:07,442][54818] InferenceWorker_p0-w0: resuming experience collection (35250 times) [2024-04-28 03:33:08,359][54818] Updated weights for policy 0, policy_version 563668 (0.0017) [2024-04-28 03:33:09,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60893.9, 300 sec: 61037.3). Total num frames: 9235185664. Throughput: 0: 61316.9. Samples: 2140468060. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:33:09,254][54587] Avg episode reward: [(0, '0.513')] [2024-04-28 03:33:10,484][54818] Updated weights for policy 0, policy_version 563678 (0.0018) [2024-04-28 03:33:13,744][54818] Updated weights for policy 0, policy_version 563688 (0.0017) [2024-04-28 03:33:14,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 9235513344. Throughput: 0: 60975.4. Samples: 2140638580. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:33:14,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 03:33:15,637][54818] Updated weights for policy 0, policy_version 563698 (0.0019) [2024-04-28 03:33:17,287][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19000 times) [2024-04-28 03:33:19,056][54818] Updated weights for policy 0, policy_version 563708 (0.0017) [2024-04-28 03:33:19,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9235808256. Throughput: 0: 61227.6. Samples: 2141012800. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:33:19,254][54587] Avg episode reward: [(0, '0.497')] [2024-04-28 03:33:20,955][54818] Updated weights for policy 0, policy_version 563718 (0.0016) [2024-04-28 03:33:24,253][54587] Fps is (10 sec: 58983.1, 60 sec: 61167.0, 300 sec: 61092.9). 
Total num frames: 9236103168. Throughput: 0: 61007.5. Samples: 2141378220. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:33:24,255][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 03:33:24,337][54818] Updated weights for policy 0, policy_version 563728 (0.0016) [2024-04-28 03:33:26,383][54818] Updated weights for policy 0, policy_version 563738 (0.0018) [2024-04-28 03:33:29,253][54587] Fps is (10 sec: 58981.4, 60 sec: 60893.7, 300 sec: 61037.3). Total num frames: 9236398080. Throughput: 0: 60882.7. Samples: 2141553080. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-04-28 03:33:29,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 03:33:29,579][54818] Updated weights for policy 0, policy_version 563748 (0.0020) [2024-04-28 03:33:32,193][54818] Updated weights for policy 0, policy_version 563758 (0.0020) [2024-04-28 03:33:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9236709376. Throughput: 0: 61003.7. Samples: 2141920260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:33:34,254][54587] Avg episode reward: [(0, '0.501')] [2024-04-28 03:33:34,906][54818] Updated weights for policy 0, policy_version 563768 (0.0015) [2024-04-28 03:33:37,353][54818] Updated weights for policy 0, policy_version 563778 (0.0017) [2024-04-28 03:33:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61166.7, 300 sec: 61092.9). Total num frames: 9237020672. Throughput: 0: 61022.6. Samples: 2142296000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:33:39,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 03:33:40,146][54818] Updated weights for policy 0, policy_version 563788 (0.0016) [2024-04-28 03:33:43,011][54818] Updated weights for policy 0, policy_version 563798 (0.0017) [2024-04-28 03:33:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60894.1, 300 sec: 61092.9). Total num frames: 9237315584. Throughput: 0: 60893.0. Samples: 2142469720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:33:44,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 03:33:44,426][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19100 times) [2024-04-28 03:33:44,636][54798] Signal inference workers to stop experience collection... (35300 times) [2024-04-28 03:33:44,662][54818] InferenceWorker_p0-w0: stopping experience collection (35300 times) [2024-04-28 03:33:44,691][54798] Signal inference workers to resume experience collection... (35300 times) [2024-04-28 03:33:44,692][54818] InferenceWorker_p0-w0: resuming experience collection (35300 times) [2024-04-28 03:33:45,314][54818] Updated weights for policy 0, policy_version 563808 (0.0017) [2024-04-28 03:33:48,614][54818] Updated weights for policy 0, policy_version 563818 (0.0016) [2024-04-28 03:33:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9237626880. Throughput: 0: 61015.8. Samples: 2142842520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:33:49,255][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 03:33:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000563820_9237626880.pth... [2024-04-28 03:33:49,327][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000562925_9222963200.pth [2024-04-28 03:33:50,538][54818] Updated weights for policy 0, policy_version 563828 (0.0017) [2024-04-28 03:33:53,846][54818] Updated weights for policy 0, policy_version 563838 (0.0016) [2024-04-28 03:33:54,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9237938176. Throughput: 0: 60977.9. Samples: 2143212060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:33:54,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 03:33:56,011][54818] Updated weights for policy 0, policy_version 563848 (0.0016) [2024-04-28 03:33:59,252][54818] Updated weights for policy 0, policy_version 563858 (0.0018) [2024-04-28 03:33:59,254][54587] Fps is (10 sec: 62258.4, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9238249472. Throughput: 0: 61079.4. Samples: 2143387160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:33:59,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 03:34:01,421][54818] Updated weights for policy 0, policy_version 563868 (0.0017) [2024-04-28 03:34:04,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60893.7, 300 sec: 61092.9). Total num frames: 9238544384. Throughput: 0: 60889.3. Samples: 2143752820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:04,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 03:34:04,546][54818] Updated weights for policy 0, policy_version 563878 (0.0016) [2024-04-28 03:34:06,729][54818] Updated weights for policy 0, policy_version 563888 (0.0016) [2024-04-28 03:34:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9238855680. Throughput: 0: 61069.6. Samples: 2144126360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:09,254][54587] Avg episode reward: [(0, '0.514')] [2024-04-28 03:34:09,892][54818] Updated weights for policy 0, policy_version 563898 (0.0017) [2024-04-28 03:34:10,433][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19200 times) [2024-04-28 03:34:12,297][54818] Updated weights for policy 0, policy_version 563908 (0.0018) [2024-04-28 03:34:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 9239166976. Throughput: 0: 61159.8. Samples: 2144305260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:14,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 03:34:15,015][54818] Updated weights for policy 0, policy_version 563918 (0.0017) [2024-04-28 03:34:17,658][54818] Updated weights for policy 0, policy_version 563928 (0.0015) [2024-04-28 03:34:19,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9239461888. Throughput: 0: 61235.9. Samples: 2144675880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:19,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:34:20,194][54818] Updated weights for policy 0, policy_version 563938 (0.0019) [2024-04-28 03:34:23,225][54818] Updated weights for policy 0, policy_version 563948 (0.0016) [2024-04-28 03:34:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9239773184. Throughput: 0: 61099.4. Samples: 2145045460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:24,253][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 03:34:25,498][54818] Updated weights for policy 0, policy_version 563958 (0.0017) [2024-04-28 03:34:25,743][54798] Signal inference workers to stop experience collection... (35350 times) [2024-04-28 03:34:25,782][54818] InferenceWorker_p0-w0: stopping experience collection (35350 times) [2024-04-28 03:34:25,838][54798] Signal inference workers to resume experience collection... (35350 times) [2024-04-28 03:34:25,838][54818] InferenceWorker_p0-w0: resuming experience collection (35350 times) [2024-04-28 03:34:28,647][54818] Updated weights for policy 0, policy_version 563968 (0.0019) [2024-04-28 03:34:29,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9240084480. Throughput: 0: 61089.7. Samples: 2145218760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:29,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 03:34:30,826][54818] Updated weights for policy 0, policy_version 563978 (0.0018) [2024-04-28 03:34:33,988][54818] Updated weights for policy 0, policy_version 563988 (0.0017) [2024-04-28 03:34:34,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9240395776. Throughput: 0: 61199.6. Samples: 2145596500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:34,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 03:34:36,196][54818] Updated weights for policy 0, policy_version 563998 (0.0016) [2024-04-28 03:34:37,837][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19300 times) [2024-04-28 03:34:39,185][54818] Updated weights for policy 0, policy_version 564008 (0.0018) [2024-04-28 03:34:39,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.1, 300 sec: 61204.3). Total num frames: 9240707072. Throughput: 0: 61139.4. Samples: 2145963340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:39,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 03:34:41,793][54818] Updated weights for policy 0, policy_version 564018 (0.0018) [2024-04-28 03:34:44,253][54587] Fps is (10 sec: 58982.8, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9240985600. Throughput: 0: 61102.5. Samples: 2146136760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:44,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:34:44,629][54818] Updated weights for policy 0, policy_version 564028 (0.0016) [2024-04-28 03:34:47,068][54818] Updated weights for policy 0, policy_version 564038 (0.0017) [2024-04-28 03:34:49,253][54587] Fps is (10 sec: 58982.1, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9241296896. Throughput: 0: 61018.2. Samples: 2146498640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:49,254][54587] Avg episode reward: [(0, '0.673')] [2024-04-28 03:34:50,081][54818] Updated weights for policy 0, policy_version 564048 (0.0018) [2024-04-28 03:34:52,488][54818] Updated weights for policy 0, policy_version 564058 (0.0018) [2024-04-28 03:34:54,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9241608192. Throughput: 0: 61039.3. Samples: 2146873120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:54,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:34:55,255][54818] Updated weights for policy 0, policy_version 564068 (0.0018) [2024-04-28 03:34:58,045][54818] Updated weights for policy 0, policy_version 564078 (0.0015) [2024-04-28 03:34:59,253][54587] Fps is (10 sec: 62260.1, 60 sec: 61167.2, 300 sec: 61092.9). Total num frames: 9241919488. Throughput: 0: 61072.5. Samples: 2147053520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-04-28 03:34:59,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 03:35:00,677][54818] Updated weights for policy 0, policy_version 564088 (0.0017) [2024-04-28 03:35:03,482][54818] Updated weights for policy 0, policy_version 564098 (0.0017) [2024-04-28 03:35:04,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9242230784. Throughput: 0: 60935.3. Samples: 2147417960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:04,253][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 03:35:04,573][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19400 times) [2024-04-28 03:35:06,031][54818] Updated weights for policy 0, policy_version 564108 (0.0018) [2024-04-28 03:35:06,949][54798] Signal inference workers to stop experience collection... 
(35400 times) [2024-04-28 03:35:06,982][54818] InferenceWorker_p0-w0: stopping experience collection (35400 times) [2024-04-28 03:35:07,037][54798] Signal inference workers to resume experience collection... (35400 times) [2024-04-28 03:35:07,037][54818] InferenceWorker_p0-w0: resuming experience collection (35400 times) [2024-04-28 03:35:08,675][54818] Updated weights for policy 0, policy_version 564118 (0.0018) [2024-04-28 03:35:09,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9242542080. Throughput: 0: 60940.3. Samples: 2147787780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:09,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 03:35:11,352][54818] Updated weights for policy 0, policy_version 564128 (0.0016) [2024-04-28 03:35:13,969][54818] Updated weights for policy 0, policy_version 564138 (0.0015) [2024-04-28 03:35:14,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9242836992. Throughput: 0: 61140.7. Samples: 2147970080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:14,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 03:35:16,770][54818] Updated weights for policy 0, policy_version 564148 (0.0016) [2024-04-28 03:35:19,247][54818] Updated weights for policy 0, policy_version 564158 (0.0015) [2024-04-28 03:35:19,254][54587] Fps is (10 sec: 62255.8, 60 sec: 61712.6, 300 sec: 61148.3). Total num frames: 9243164672. Throughput: 0: 60970.0. Samples: 2148340180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:19,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 03:35:22,073][54818] Updated weights for policy 0, policy_version 564168 (0.0018) [2024-04-28 03:35:24,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.8, 300 sec: 61148.4). Total num frames: 9243459584. Throughput: 0: 60892.8. Samples: 2148703520. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:24,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 03:35:24,748][54818] Updated weights for policy 0, policy_version 564178 (0.0018) [2024-04-28 03:35:27,506][54818] Updated weights for policy 0, policy_version 564188 (0.0019) [2024-04-28 03:35:29,253][54587] Fps is (10 sec: 60624.2, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9243770880. Throughput: 0: 61072.4. Samples: 2148885020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:29,254][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 03:35:30,085][54818] Updated weights for policy 0, policy_version 564198 (0.0015) [2024-04-28 03:35:31,268][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19500 times) [2024-04-28 03:35:32,848][54818] Updated weights for policy 0, policy_version 564208 (0.0017) [2024-04-28 03:35:34,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.1, 300 sec: 61148.5). Total num frames: 9244082176. Throughput: 0: 61277.1. Samples: 2149256100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:34,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 03:35:35,396][54818] Updated weights for policy 0, policy_version 564218 (0.0018) [2024-04-28 03:35:38,152][54818] Updated weights for policy 0, policy_version 564228 (0.0015) [2024-04-28 03:35:39,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9244393472. Throughput: 0: 61097.2. Samples: 2149622500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:39,255][54587] Avg episode reward: [(0, '0.527')] [2024-04-28 03:35:40,798][54818] Updated weights for policy 0, policy_version 564238 (0.0016) [2024-04-28 03:35:43,520][54818] Updated weights for policy 0, policy_version 564248 (0.0015) [2024-04-28 03:35:44,253][54587] Fps is (10 sec: 58982.5, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 9244672000. 
Throughput: 0: 61155.6. Samples: 2149805520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:44,253][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 03:35:46,043][54818] Updated weights for policy 0, policy_version 564258 (0.0016) [2024-04-28 03:35:48,823][54818] Updated weights for policy 0, policy_version 564268 (0.0016) [2024-04-28 03:35:49,253][54587] Fps is (10 sec: 58983.3, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 9244983296. Throughput: 0: 61253.3. Samples: 2150174360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:49,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 03:35:49,259][54587] No heartbeat for components: RolloutWorker_w4 (22297 seconds), RolloutWorker_w5 (8397 seconds) [2024-04-28 03:35:49,315][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000564270_9244999680.pth... [2024-04-28 03:35:49,367][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000563374_9230319616.pth [2024-04-28 03:35:51,591][54818] Updated weights for policy 0, policy_version 564278 (0.0016) [2024-04-28 03:35:54,157][54818] Updated weights for policy 0, policy_version 564288 (0.0018) [2024-04-28 03:35:54,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 9245294592. Throughput: 0: 61178.8. Samples: 2150540820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:54,253][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 03:35:56,818][54818] Updated weights for policy 0, policy_version 564298 (0.0015) [2024-04-28 03:35:58,044][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19600 times) [2024-04-28 03:35:59,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 9245605888. Throughput: 0: 61318.5. Samples: 2150729420. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:35:59,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 03:35:59,334][54818] Updated weights for policy 0, policy_version 564308 (0.0022) [2024-04-28 03:36:02,278][54818] Updated weights for policy 0, policy_version 564318 (0.0016) [2024-04-28 03:36:02,969][54798] Signal inference workers to stop experience collection... (35450 times) [2024-04-28 03:36:03,007][54818] InferenceWorker_p0-w0: stopping experience collection (35450 times) [2024-04-28 03:36:03,060][54798] Signal inference workers to resume experience collection... (35450 times) [2024-04-28 03:36:03,061][54818] InferenceWorker_p0-w0: resuming experience collection (35450 times) [2024-04-28 03:36:04,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9245900800. Throughput: 0: 61316.2. Samples: 2151099380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:36:04,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:36:04,693][54818] Updated weights for policy 0, policy_version 564328 (0.0018) [2024-04-28 03:36:07,714][54818] Updated weights for policy 0, policy_version 564338 (0.0019) [2024-04-28 03:36:09,253][54587] Fps is (10 sec: 58981.8, 60 sec: 60893.7, 300 sec: 61092.9). Total num frames: 9246195712. Throughput: 0: 61309.2. Samples: 2151462440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:36:09,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 03:36:10,122][54818] Updated weights for policy 0, policy_version 564348 (0.0017) [2024-04-28 03:36:13,088][54818] Updated weights for policy 0, policy_version 564358 (0.0016) [2024-04-28 03:36:14,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61439.8, 300 sec: 61148.9). Total num frames: 9246523392. Throughput: 0: 61515.5. Samples: 2151653220. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:36:14,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 03:36:15,396][54818] Updated weights for policy 0, policy_version 564368 (0.0017) [2024-04-28 03:36:18,163][54818] Updated weights for policy 0, policy_version 564378 (0.0015) [2024-04-28 03:36:19,253][54587] Fps is (10 sec: 63898.3, 60 sec: 61167.4, 300 sec: 61203.9). Total num frames: 9246834688. Throughput: 0: 61390.1. Samples: 2152018660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:36:19,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 03:36:20,746][54818] Updated weights for policy 0, policy_version 564388 (0.0019) [2024-04-28 03:36:23,409][54818] Updated weights for policy 0, policy_version 564398 (0.0016) [2024-04-28 03:36:24,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9247129600. Throughput: 0: 61219.2. Samples: 2152377360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:36:24,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 03:36:24,729][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19700 times) [2024-04-28 03:36:26,578][54818] Updated weights for policy 0, policy_version 564408 (0.0015) [2024-04-28 03:36:28,583][54818] Updated weights for policy 0, policy_version 564418 (0.0018) [2024-04-28 03:36:29,253][54587] Fps is (10 sec: 58983.1, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9247424512. Throughput: 0: 61440.4. Samples: 2152570340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:36:29,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:36:31,991][54818] Updated weights for policy 0, policy_version 564428 (0.0017) [2024-04-28 03:36:34,129][54818] Updated weights for policy 0, policy_version 564438 (0.0019) [2024-04-28 03:36:34,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9247752192. 
Throughput: 0: 61434.6. Samples: 2152938920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:36:34,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 03:36:37,123][54818] Updated weights for policy 0, policy_version 564448 (0.0016) [2024-04-28 03:36:39,253][54587] Fps is (10 sec: 63896.3, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9248063488. Throughput: 0: 61147.7. Samples: 2153292480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:36:39,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-28 03:36:39,421][54818] Updated weights for policy 0, policy_version 564458 (0.0017) [2024-04-28 03:36:42,326][54818] Updated weights for policy 0, policy_version 564468 (0.0015) [2024-04-28 03:36:43,593][54798] Signal inference workers to stop experience collection... (35500 times) [2024-04-28 03:36:43,594][54798] Signal inference workers to resume experience collection... (35500 times) [2024-04-28 03:36:43,603][54818] InferenceWorker_p0-w0: stopping experience collection (35500 times) [2024-04-28 03:36:43,615][54818] InferenceWorker_p0-w0: resuming experience collection (35500 times) [2024-04-28 03:36:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61439.8, 300 sec: 61204.0). Total num frames: 9248358400. Throughput: 0: 61297.3. Samples: 2153487800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:36:44,255][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 03:36:45,107][54818] Updated weights for policy 0, policy_version 564478 (0.0021) [2024-04-28 03:36:47,833][54818] Updated weights for policy 0, policy_version 564488 (0.0016) [2024-04-28 03:36:49,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9248669696. Throughput: 0: 61190.8. Samples: 2153852960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:36:49,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 03:36:50,191][54818] Updated weights for policy 0, policy_version 564498 (0.0018) [2024-04-28 03:36:51,921][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19800 times) [2024-04-28 03:36:53,113][54818] Updated weights for policy 0, policy_version 564508 (0.0017) [2024-04-28 03:36:54,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9248980992. Throughput: 0: 61071.8. Samples: 2154210660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:36:54,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 03:36:55,569][54818] Updated weights for policy 0, policy_version 564518 (0.0016) [2024-04-28 03:36:58,300][54818] Updated weights for policy 0, policy_version 564528 (0.0017) [2024-04-28 03:36:59,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9249292288. Throughput: 0: 61167.6. Samples: 2154405760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:36:59,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:37:00,965][54818] Updated weights for policy 0, policy_version 564538 (0.0017) [2024-04-28 03:37:03,556][54818] Updated weights for policy 0, policy_version 564548 (0.0016) [2024-04-28 03:37:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9249587200. Throughput: 0: 61117.8. Samples: 2154768960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:04,254][54587] Avg episode reward: [(0, '0.534')] [2024-04-28 03:37:06,548][54818] Updated weights for policy 0, policy_version 564558 (0.0016) [2024-04-28 03:37:08,977][54818] Updated weights for policy 0, policy_version 564568 (0.0017) [2024-04-28 03:37:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9249898496. 
Throughput: 0: 61104.3. Samples: 2155127060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:09,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 03:37:12,000][54818] Updated weights for policy 0, policy_version 564578 (0.0017) [2024-04-28 03:37:14,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9250193408. Throughput: 0: 61069.1. Samples: 2155318460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:14,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-28 03:37:14,359][54818] Updated weights for policy 0, policy_version 564588 (0.0018) [2024-04-28 03:37:17,442][54818] Updated weights for policy 0, policy_version 564598 (0.0017) [2024-04-28 03:37:18,366][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (19900 times) [2024-04-28 03:37:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9250504704. Throughput: 0: 61168.8. Samples: 2155691520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:19,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 03:37:19,680][54818] Updated weights for policy 0, policy_version 564608 (0.0018) [2024-04-28 03:37:22,717][54818] Updated weights for policy 0, policy_version 564618 (0.0015) [2024-04-28 03:37:24,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9250816000. Throughput: 0: 61158.3. Samples: 2156044600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:24,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:37:25,125][54818] Updated weights for policy 0, policy_version 564628 (0.0017) [2024-04-28 03:37:27,372][54798] Signal inference workers to stop experience collection... (35550 times) [2024-04-28 03:37:27,373][54798] Signal inference workers to resume experience collection... 
(35550 times) [2024-04-28 03:37:27,379][54818] InferenceWorker_p0-w0: stopping experience collection (35550 times) [2024-04-28 03:37:27,388][54818] InferenceWorker_p0-w0: resuming experience collection (35550 times) [2024-04-28 03:37:27,969][54818] Updated weights for policy 0, policy_version 564638 (0.0017) [2024-04-28 03:37:29,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9251110912. Throughput: 0: 60881.1. Samples: 2156227440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:29,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:37:30,850][54818] Updated weights for policy 0, policy_version 564648 (0.0017) [2024-04-28 03:37:33,230][54818] Updated weights for policy 0, policy_version 564658 (0.0016) [2024-04-28 03:37:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9251422208. Throughput: 0: 61096.3. Samples: 2156602300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:34,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 03:37:35,991][54818] Updated weights for policy 0, policy_version 564668 (0.0016) [2024-04-28 03:37:38,632][54818] Updated weights for policy 0, policy_version 564678 (0.0015) [2024-04-28 03:37:39,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9251733504. Throughput: 0: 61261.8. Samples: 2156967440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:39,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 03:37:41,243][54818] Updated weights for policy 0, policy_version 564688 (0.0016) [2024-04-28 03:37:43,928][54818] Updated weights for policy 0, policy_version 564698 (0.0018) [2024-04-28 03:37:44,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9252044800. Throughput: 0: 60980.5. Samples: 2157149880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:44,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 03:37:44,952][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20000 times) [2024-04-28 03:37:46,580][54818] Updated weights for policy 0, policy_version 564708 (0.0017) [2024-04-28 03:37:49,075][54818] Updated weights for policy 0, policy_version 564718 (0.0016) [2024-04-28 03:37:49,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9252339712. Throughput: 0: 61182.1. Samples: 2157522160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:49,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 03:37:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000564718_9252339712.pth... [2024-04-28 03:37:49,316][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000563820_9237626880.pth [2024-04-28 03:37:52,127][54818] Updated weights for policy 0, policy_version 564728 (0.0019) [2024-04-28 03:37:54,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9252651008. Throughput: 0: 61404.0. Samples: 2157890240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:54,254][54587] Avg episode reward: [(0, '0.687')] [2024-04-28 03:37:54,562][54818] Updated weights for policy 0, policy_version 564738 (0.0015) [2024-04-28 03:37:57,490][54818] Updated weights for policy 0, policy_version 564748 (0.0015) [2024-04-28 03:37:59,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9252962304. Throughput: 0: 61177.4. Samples: 2158071440. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:37:59,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 03:37:59,758][54818] Updated weights for policy 0, policy_version 564758 (0.0015) [2024-04-28 03:38:02,695][54818] Updated weights for policy 0, policy_version 564768 (0.0017) [2024-04-28 03:38:04,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9253273600. Throughput: 0: 61002.7. Samples: 2158436640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:38:04,255][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:38:05,399][54818] Updated weights for policy 0, policy_version 564778 (0.0016) [2024-04-28 03:38:08,299][54818] Updated weights for policy 0, policy_version 564788 (0.0018) [2024-04-28 03:38:09,253][54587] Fps is (10 sec: 60619.8, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9253568512. Throughput: 0: 61320.7. Samples: 2158804040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:09,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 03:38:10,937][54818] Updated weights for policy 0, policy_version 564798 (0.0015) [2024-04-28 03:38:12,095][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20100 times) [2024-04-28 03:38:13,424][54818] Updated weights for policy 0, policy_version 564808 (0.0016) [2024-04-28 03:38:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9253879808. Throughput: 0: 61400.8. Samples: 2158990480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:14,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 03:38:16,151][54818] Updated weights for policy 0, policy_version 564818 (0.0017) [2024-04-28 03:38:18,772][54818] Updated weights for policy 0, policy_version 564828 (0.0018) [2024-04-28 03:38:19,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9254174720. 
Throughput: 0: 61236.6. Samples: 2159357940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:19,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 03:38:21,500][54818] Updated weights for policy 0, policy_version 564838 (0.0015) [2024-04-28 03:38:23,975][54818] Updated weights for policy 0, policy_version 564848 (0.0017) [2024-04-28 03:38:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9254486016. Throughput: 0: 61287.6. Samples: 2159725380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:24,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 03:38:24,873][54798] Signal inference workers to stop experience collection... (35600 times) [2024-04-28 03:38:24,875][54798] Signal inference workers to resume experience collection... (35600 times) [2024-04-28 03:38:24,883][54818] InferenceWorker_p0-w0: stopping experience collection (35600 times) [2024-04-28 03:38:24,883][54818] InferenceWorker_p0-w0: resuming experience collection (35600 times) [2024-04-28 03:38:26,965][54818] Updated weights for policy 0, policy_version 564858 (0.0016) [2024-04-28 03:38:29,238][54818] Updated weights for policy 0, policy_version 564868 (0.0015) [2024-04-28 03:38:29,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9254797312. Throughput: 0: 61318.6. Samples: 2159909220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:29,255][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 03:38:32,043][54818] Updated weights for policy 0, policy_version 564878 (0.0018) [2024-04-28 03:38:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9255092224. Throughput: 0: 61339.5. Samples: 2160282440. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:34,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 03:38:34,856][54818] Updated weights for policy 0, policy_version 564888 (0.0016) [2024-04-28 03:38:37,501][54818] Updated weights for policy 0, policy_version 564898 (0.0018) [2024-04-28 03:38:38,491][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20200 times) [2024-04-28 03:38:39,253][54587] Fps is (10 sec: 58982.2, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9255387136. Throughput: 0: 61151.5. Samples: 2160642060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:39,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-28 03:38:40,084][54818] Updated weights for policy 0, policy_version 564908 (0.0016) [2024-04-28 03:38:42,726][54818] Updated weights for policy 0, policy_version 564918 (0.0017) [2024-04-28 03:38:44,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9255698432. Throughput: 0: 61374.7. Samples: 2160833300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:44,255][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 03:38:45,292][54818] Updated weights for policy 0, policy_version 564928 (0.0016) [2024-04-28 03:38:48,160][54818] Updated weights for policy 0, policy_version 564938 (0.0018) [2024-04-28 03:38:49,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9256009728. Throughput: 0: 61399.6. Samples: 2161199620. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:49,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 03:38:49,263][54587] No heartbeat for components: RolloutWorker_w4 (22477 seconds), RolloutWorker_w5 (8577 seconds) [2024-04-28 03:38:50,557][54818] Updated weights for policy 0, policy_version 564948 (0.0016) [2024-04-28 03:38:53,453][54818] Updated weights for policy 0, policy_version 564958 (0.0016) [2024-04-28 03:38:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9256304640. Throughput: 0: 61355.2. Samples: 2161565020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:54,255][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 03:38:55,789][54818] Updated weights for policy 0, policy_version 564968 (0.0016) [2024-04-28 03:38:58,658][54818] Updated weights for policy 0, policy_version 564978 (0.0015) [2024-04-28 03:38:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9256615936. Throughput: 0: 61356.7. Samples: 2161751540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:38:59,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 03:39:01,312][54818] Updated weights for policy 0, policy_version 564988 (0.0018) [2024-04-28 03:39:04,253][54587] Fps is (10 sec: 62260.1, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9256927232. Throughput: 0: 61396.0. Samples: 2162120760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:39:04,255][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 03:39:04,255][54818] Updated weights for policy 0, policy_version 564998 (0.0016) [2024-04-28 03:39:05,224][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20300 times) [2024-04-28 03:39:06,456][54818] Updated weights for policy 0, policy_version 565008 (0.0017) [2024-04-28 03:39:09,253][54587] Fps is (10 sec: 60622.1, 60 sec: 60894.1, 300 sec: 61204.0). 
Total num frames: 9257222144. Throughput: 0: 61339.2. Samples: 2162485640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:39:09,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 03:39:09,505][54818] Updated weights for policy 0, policy_version 565018 (0.0016) [2024-04-28 03:39:12,322][54818] Updated weights for policy 0, policy_version 565028 (0.0015) [2024-04-28 03:39:14,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9257533440. Throughput: 0: 61277.8. Samples: 2162666720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:39:14,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 03:39:14,861][54818] Updated weights for policy 0, policy_version 565038 (0.0017) [2024-04-28 03:39:17,482][54818] Updated weights for policy 0, policy_version 565048 (0.0015) [2024-04-28 03:39:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9257828352. Throughput: 0: 61268.6. Samples: 2163039520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:39:19,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 03:39:20,042][54798] Signal inference workers to stop experience collection... (35650 times) [2024-04-28 03:39:20,080][54818] InferenceWorker_p0-w0: stopping experience collection (35650 times) [2024-04-28 03:39:20,101][54798] Signal inference workers to resume experience collection... (35650 times) [2024-04-28 03:39:20,101][54818] InferenceWorker_p0-w0: resuming experience collection (35650 times) [2024-04-28 03:39:20,103][54818] Updated weights for policy 0, policy_version 565058 (0.0016) [2024-04-28 03:39:22,867][54818] Updated weights for policy 0, policy_version 565068 (0.0016) [2024-04-28 03:39:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9258139648. Throughput: 0: 61471.7. Samples: 2163408280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:39:24,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 03:39:25,338][54818] Updated weights for policy 0, policy_version 565078 (0.0016) [2024-04-28 03:39:28,198][54818] Updated weights for policy 0, policy_version 565088 (0.0017) [2024-04-28 03:39:29,253][54587] Fps is (10 sec: 62259.3, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9258450944. Throughput: 0: 61175.7. Samples: 2163586200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:39:29,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 03:39:30,796][54818] Updated weights for policy 0, policy_version 565098 (0.0016) [2024-04-28 03:39:32,023][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20400 times) [2024-04-28 03:39:33,576][54818] Updated weights for policy 0, policy_version 565108 (0.0016) [2024-04-28 03:39:34,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9258762240. Throughput: 0: 61229.1. Samples: 2163954920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:39:34,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 03:39:36,088][54818] Updated weights for policy 0, policy_version 565118 (0.0016) [2024-04-28 03:39:38,952][54818] Updated weights for policy 0, policy_version 565128 (0.0017) [2024-04-28 03:39:39,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61167.1, 300 sec: 61259.5). Total num frames: 9259057152. Throughput: 0: 61396.6. Samples: 2164327860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:39:39,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 03:39:41,387][54818] Updated weights for policy 0, policy_version 565138 (0.0019) [2024-04-28 03:39:44,253][54587] Fps is (10 sec: 60619.8, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9259368448. Throughput: 0: 61054.8. Samples: 2164499000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:39:44,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 03:39:44,439][54818] Updated weights for policy 0, policy_version 565148 (0.0015) [2024-04-28 03:39:46,533][54818] Updated weights for policy 0, policy_version 565158 (0.0019) [2024-04-28 03:39:49,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9259663360. Throughput: 0: 61136.8. Samples: 2164871920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:39:49,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 03:39:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000565165_9259663360.pth... [2024-04-28 03:39:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000564270_9244999680.pth [2024-04-28 03:39:49,759][54818] Updated weights for policy 0, policy_version 565168 (0.0015) [2024-04-28 03:39:52,364][54818] Updated weights for policy 0, policy_version 565178 (0.0016) [2024-04-28 03:39:54,253][54587] Fps is (10 sec: 58982.7, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9259958272. Throughput: 0: 61080.8. Samples: 2165234280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:39:54,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 03:39:55,093][54818] Updated weights for policy 0, policy_version 565188 (0.0018) [2024-04-28 03:39:57,586][54818] Updated weights for policy 0, policy_version 565198 (0.0017) [2024-04-28 03:39:59,168][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20500 times) [2024-04-28 03:39:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9260269568. Throughput: 0: 61020.3. Samples: 2165412640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:39:59,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 03:40:00,466][54818] Updated weights for policy 0, policy_version 565208 (0.0016) [2024-04-28 03:40:03,146][54818] Updated weights for policy 0, policy_version 565218 (0.0022) [2024-04-28 03:40:04,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60620.6, 300 sec: 61092.9). Total num frames: 9260564480. Throughput: 0: 60827.8. Samples: 2165776780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:04,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 03:40:05,821][54818] Updated weights for policy 0, policy_version 565228 (0.0017) [2024-04-28 03:40:08,603][54818] Updated weights for policy 0, policy_version 565238 (0.0020) [2024-04-28 03:40:09,218][54798] Signal inference workers to stop experience collection... (35700 times) [2024-04-28 03:40:09,250][54818] InferenceWorker_p0-w0: stopping experience collection (35700 times) [2024-04-28 03:40:09,253][54587] Fps is (10 sec: 58982.5, 60 sec: 60620.6, 300 sec: 61092.8). Total num frames: 9260859392. Throughput: 0: 60843.4. Samples: 2166146240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:09,255][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 03:40:09,275][54798] Signal inference workers to resume experience collection... (35700 times) [2024-04-28 03:40:09,276][54818] InferenceWorker_p0-w0: resuming experience collection (35700 times) [2024-04-28 03:40:11,122][54818] Updated weights for policy 0, policy_version 565248 (0.0016) [2024-04-28 03:40:14,215][54818] Updated weights for policy 0, policy_version 565258 (0.0016) [2024-04-28 03:40:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.8, 300 sec: 61093.0). Total num frames: 9261187072. Throughput: 0: 60842.0. Samples: 2166324100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:14,255][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 03:40:14,255][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (1100 times) [2024-04-28 03:40:16,540][54818] Updated weights for policy 0, policy_version 565268 (0.0016) [2024-04-28 03:40:19,253][54587] Fps is (10 sec: 62260.2, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9261481984. Throughput: 0: 60776.4. Samples: 2166689860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:19,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 03:40:19,450][54818] Updated weights for policy 0, policy_version 565278 (0.0016) [2024-04-28 03:40:21,718][54818] Updated weights for policy 0, policy_version 565288 (0.0016) [2024-04-28 03:40:24,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9261793280. Throughput: 0: 60958.9. Samples: 2167071020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:24,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 03:40:24,813][54818] Updated weights for policy 0, policy_version 565298 (0.0019) [2024-04-28 03:40:25,991][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20600 times) [2024-04-28 03:40:26,857][54818] Updated weights for policy 0, policy_version 565308 (0.0020) [2024-04-28 03:40:29,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60620.6, 300 sec: 61037.3). Total num frames: 9262088192. Throughput: 0: 60908.5. Samples: 2167239880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:29,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 03:40:30,280][54818] Updated weights for policy 0, policy_version 565318 (0.0016) [2024-04-28 03:40:32,311][54818] Updated weights for policy 0, policy_version 565328 (0.0016) [2024-04-28 03:40:34,253][54587] Fps is (10 sec: 60622.1, 60 sec: 60620.8, 300 sec: 61037.4). Total num frames: 9262399488. Throughput: 0: 60692.2. Samples: 2167603060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:34,253][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:40:35,634][54818] Updated weights for policy 0, policy_version 565338 (0.0015) [2024-04-28 03:40:37,524][54818] Updated weights for policy 0, policy_version 565348 (0.0016) [2024-04-28 03:40:39,253][54587] Fps is (10 sec: 62259.0, 60 sec: 60893.7, 300 sec: 61148.4). Total num frames: 9262710784. Throughput: 0: 61140.3. Samples: 2167985600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:39,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 03:40:40,817][54818] Updated weights for policy 0, policy_version 565358 (0.0018) [2024-04-28 03:40:43,113][54818] Updated weights for policy 0, policy_version 565368 (0.0016) [2024-04-28 03:40:44,253][54587] Fps is (10 sec: 60619.6, 60 sec: 60620.8, 300 sec: 61092.9). Total num frames: 9263005696. Throughput: 0: 61083.1. Samples: 2168161380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:44,255][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 03:40:46,085][54818] Updated weights for policy 0, policy_version 565378 (0.0018) [2024-04-28 03:40:49,146][54818] Updated weights for policy 0, policy_version 565388 (0.0018) [2024-04-28 03:40:49,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9263316992. Throughput: 0: 60935.7. Samples: 2168518880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:49,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 03:40:50,745][54798] Signal inference workers to stop experience collection... (35750 times) [2024-04-28 03:40:50,746][54798] Signal inference workers to resume experience collection... (35750 times) [2024-04-28 03:40:50,760][54818] InferenceWorker_p0-w0: stopping experience collection (35750 times) [2024-04-28 03:40:50,760][54818] InferenceWorker_p0-w0: resuming experience collection (35750 times) [2024-04-28 03:40:51,443][54818] Updated weights for policy 0, policy_version 565398 (0.0016) [2024-04-28 03:40:52,347][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20700 times) [2024-04-28 03:40:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9263628288. Throughput: 0: 61191.1. Samples: 2168899840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:54,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 03:40:54,408][54818] Updated weights for policy 0, policy_version 565408 (0.0019) [2024-04-28 03:40:56,740][54818] Updated weights for policy 0, policy_version 565418 (0.0017) [2024-04-28 03:40:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9263923200. Throughput: 0: 61228.6. Samples: 2169079380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:40:59,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 03:40:59,839][54818] Updated weights for policy 0, policy_version 565428 (0.0017) [2024-04-28 03:41:01,939][54818] Updated weights for policy 0, policy_version 565438 (0.0016) [2024-04-28 03:41:04,253][54587] Fps is (10 sec: 58983.3, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9264218112. Throughput: 0: 61038.2. Samples: 2169436580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 03:41:04,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 03:41:05,299][54818] Updated weights for policy 0, policy_version 565448 (0.0016) [2024-04-28 03:41:07,275][54818] Updated weights for policy 0, policy_version 565458 (0.0019) [2024-04-28 03:41:09,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 9264545792. Throughput: 0: 60885.3. Samples: 2169810860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:09,255][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 03:41:10,604][54818] Updated weights for policy 0, policy_version 565468 (0.0016) [2024-04-28 03:41:12,511][54818] Updated weights for policy 0, policy_version 565478 (0.0024) [2024-04-28 03:41:14,253][54587] Fps is (10 sec: 63896.7, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9264857088. Throughput: 0: 61249.8. Samples: 2169996120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:14,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 03:41:15,799][54818] Updated weights for policy 0, policy_version 565488 (0.0018) [2024-04-28 03:41:17,808][54818] Updated weights for policy 0, policy_version 565498 (0.0016) [2024-04-28 03:41:19,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9265152000. Throughput: 0: 61220.3. Samples: 2170357980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:19,254][54587] Avg episode reward: [(0, '0.662')] [2024-04-28 03:41:19,914][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20800 times) [2024-04-28 03:41:20,967][54818] Updated weights for policy 0, policy_version 565508 (0.0017) [2024-04-28 03:41:22,983][54818] Updated weights for policy 0, policy_version 565518 (0.0016) [2024-04-28 03:41:24,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9265446912. 
Throughput: 0: 60989.5. Samples: 2170730120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:24,254][54587] Avg episode reward: [(0, '0.702')] [2024-04-28 03:41:26,190][54818] Updated weights for policy 0, policy_version 565528 (0.0016) [2024-04-28 03:41:27,500][54798] Signal inference workers to stop experience collection... (35800 times) [2024-04-28 03:41:27,500][54798] Signal inference workers to resume experience collection... (35800 times) [2024-04-28 03:41:27,510][54818] InferenceWorker_p0-w0: stopping experience collection (35800 times) [2024-04-28 03:41:27,511][54818] InferenceWorker_p0-w0: resuming experience collection (35800 times) [2024-04-28 03:41:29,163][54818] Updated weights for policy 0, policy_version 565538 (0.0018) [2024-04-28 03:41:29,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 9265774592. Throughput: 0: 61293.1. Samples: 2170919560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:29,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 03:41:31,617][54818] Updated weights for policy 0, policy_version 565548 (0.0018) [2024-04-28 03:41:34,253][54587] Fps is (10 sec: 63897.1, 60 sec: 61439.8, 300 sec: 61092.9). Total num frames: 9266085888. Throughput: 0: 61208.4. Samples: 2171273260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:34,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:41:34,403][54818] Updated weights for policy 0, policy_version 565558 (0.0017) [2024-04-28 03:41:37,022][54818] Updated weights for policy 0, policy_version 565568 (0.0021) [2024-04-28 03:41:39,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9266380800. Throughput: 0: 60957.8. Samples: 2171642940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:39,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 03:41:40,128][54818] Updated weights for policy 0, policy_version 565578 (0.0018) [2024-04-28 03:41:42,272][54818] Updated weights for policy 0, policy_version 565588 (0.0018) [2024-04-28 03:41:44,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.2, 300 sec: 61092.9). Total num frames: 9266692096. Throughput: 0: 61240.9. Samples: 2171835220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:44,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 03:41:45,528][54818] Updated weights for policy 0, policy_version 565598 (0.0016) [2024-04-28 03:41:46,322][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (20900 times) [2024-04-28 03:41:47,452][54818] Updated weights for policy 0, policy_version 565608 (0.0018) [2024-04-28 03:41:49,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 9267003392. Throughput: 0: 61129.7. Samples: 2172187420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:49,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 03:41:49,261][54587] No heartbeat for components: RolloutWorker_w4 (22657 seconds), RolloutWorker_w5 (8757 seconds) [2024-04-28 03:41:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000565613_9267003392.pth... [2024-04-28 03:41:49,322][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000564718_9252339712.pth [2024-04-28 03:41:50,775][54818] Updated weights for policy 0, policy_version 565618 (0.0017) [2024-04-28 03:41:52,674][54818] Updated weights for policy 0, policy_version 565628 (0.0019) [2024-04-28 03:41:54,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 9267314688. Throughput: 0: 60963.6. Samples: 2172554220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:54,255][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 03:41:55,505][54798] Signal inference workers to stop experience collection... (35850 times) [2024-04-28 03:41:55,550][54818] InferenceWorker_p0-w0: stopping experience collection (35850 times) [2024-04-28 03:41:55,566][54798] Signal inference workers to resume experience collection... (35850 times) [2024-04-28 03:41:55,567][54818] InferenceWorker_p0-w0: resuming experience collection (35850 times) [2024-04-28 03:41:55,904][54818] Updated weights for policy 0, policy_version 565638 (0.0018) [2024-04-28 03:41:57,908][54818] Updated weights for policy 0, policy_version 565648 (0.0019) [2024-04-28 03:41:59,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61713.0, 300 sec: 61148.4). Total num frames: 9267625984. Throughput: 0: 61212.9. Samples: 2172750700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:41:59,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 03:42:01,335][54818] Updated weights for policy 0, policy_version 565658 (0.0017) [2024-04-28 03:42:03,172][54818] Updated weights for policy 0, policy_version 565668 (0.0018) [2024-04-28 03:42:04,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61712.9, 300 sec: 61092.9). Total num frames: 9267920896. Throughput: 0: 61062.6. Samples: 2173105800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:42:04,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 03:42:06,677][54818] Updated weights for policy 0, policy_version 565678 (0.0016) [2024-04-28 03:42:08,713][54818] Updated weights for policy 0, policy_version 565688 (0.0018) [2024-04-28 03:42:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9268232192. Throughput: 0: 60952.7. Samples: 2173473000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:42:09,255][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 03:42:11,919][54818] Updated weights for policy 0, policy_version 565698 (0.0016) [2024-04-28 03:42:12,821][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21000 times) [2024-04-28 03:42:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9268543488. Throughput: 0: 61095.9. Samples: 2173668880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:42:14,254][54587] Avg episode reward: [(0, '0.488')] [2024-04-28 03:42:14,714][54818] Updated weights for policy 0, policy_version 565708 (0.0017) [2024-04-28 03:42:17,120][54818] Updated weights for policy 0, policy_version 565718 (0.0016) [2024-04-28 03:42:19,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61713.0, 300 sec: 61148.4). Total num frames: 9268854784. Throughput: 0: 61240.9. Samples: 2174029100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:42:19,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 03:42:19,978][54818] Updated weights for policy 0, policy_version 565728 (0.0015) [2024-04-28 03:42:22,322][54818] Updated weights for policy 0, policy_version 565738 (0.0016) [2024-04-28 03:42:24,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61986.0, 300 sec: 61203.9). Total num frames: 9269166080. Throughput: 0: 61043.5. Samples: 2174389900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:42:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 03:42:25,382][54818] Updated weights for policy 0, policy_version 565748 (0.0018) [2024-04-28 03:42:27,832][54818] Updated weights for policy 0, policy_version 565758 (0.0016) [2024-04-28 03:42:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9269460992. Throughput: 0: 61130.6. Samples: 2174586100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:42:29,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:42:31,064][54818] Updated weights for policy 0, policy_version 565768 (0.0018) [2024-04-28 03:42:33,171][54818] Updated weights for policy 0, policy_version 565778 (0.0016) [2024-04-28 03:42:34,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9269772288. Throughput: 0: 61335.9. Samples: 2174947540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:42:34,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 03:42:36,322][54818] Updated weights for policy 0, policy_version 565788 (0.0017) [2024-04-28 03:42:38,480][54818] Updated weights for policy 0, policy_version 565798 (0.0017) [2024-04-28 03:42:39,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61713.2, 300 sec: 61148.4). Total num frames: 9270083584. Throughput: 0: 61104.2. Samples: 2175303900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 03:42:39,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 03:42:39,970][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21100 times) [2024-04-28 03:42:41,289][54798] Signal inference workers to stop experience collection... (35900 times) [2024-04-28 03:42:41,325][54818] InferenceWorker_p0-w0: stopping experience collection (35900 times) [2024-04-28 03:42:41,345][54798] Signal inference workers to resume experience collection... (35900 times) [2024-04-28 03:42:41,345][54818] InferenceWorker_p0-w0: resuming experience collection (35900 times) [2024-04-28 03:42:41,592][54818] Updated weights for policy 0, policy_version 565808 (0.0017) [2024-04-28 03:42:43,677][54818] Updated weights for policy 0, policy_version 565818 (0.0017) [2024-04-28 03:42:44,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61713.0, 300 sec: 61204.0). Total num frames: 9270394880. Throughput: 0: 61110.7. Samples: 2175500680. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:42:44,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 03:42:46,888][54818] Updated weights for policy 0, policy_version 565828 (0.0015) [2024-04-28 03:42:49,053][54818] Updated weights for policy 0, policy_version 565838 (0.0015) [2024-04-28 03:42:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9270689792. Throughput: 0: 61265.9. Samples: 2175862760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:42:49,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 03:42:52,080][54818] Updated weights for policy 0, policy_version 565848 (0.0015) [2024-04-28 03:42:54,253][54587] Fps is (10 sec: 58983.0, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 9270984704. Throughput: 0: 61161.1. Samples: 2176225240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:42:54,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 03:42:54,660][54818] Updated weights for policy 0, policy_version 565858 (0.0018) [2024-04-28 03:42:57,415][54818] Updated weights for policy 0, policy_version 565868 (0.0018) [2024-04-28 03:42:59,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9271312384. Throughput: 0: 61082.1. Samples: 2176417580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:42:59,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 03:42:59,990][54818] Updated weights for policy 0, policy_version 565878 (0.0016) [2024-04-28 03:43:02,822][54818] Updated weights for policy 0, policy_version 565888 (0.0021) [2024-04-28 03:43:04,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9271607296. Throughput: 0: 61094.6. Samples: 2176778360. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:04,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 03:43:05,650][54818] Updated weights for policy 0, policy_version 565898 (0.0018) [2024-04-28 03:43:06,941][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21200 times) [2024-04-28 03:43:08,249][54818] Updated weights for policy 0, policy_version 565908 (0.0017) [2024-04-28 03:43:09,253][54587] Fps is (10 sec: 58982.4, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9271902208. Throughput: 0: 61151.6. Samples: 2177141720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:09,255][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 03:43:11,229][54818] Updated weights for policy 0, policy_version 565918 (0.0016) [2024-04-28 03:43:13,569][54818] Updated weights for policy 0, policy_version 565928 (0.0017) [2024-04-28 03:43:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9272213504. Throughput: 0: 60970.2. Samples: 2177329760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:14,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 03:43:16,621][54818] Updated weights for policy 0, policy_version 565938 (0.0018) [2024-04-28 03:43:18,911][54818] Updated weights for policy 0, policy_version 565948 (0.0021) [2024-04-28 03:43:19,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9272524800. Throughput: 0: 60985.3. Samples: 2177691880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:19,255][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 03:43:21,622][54818] Updated weights for policy 0, policy_version 565958 (0.0016) [2024-04-28 03:43:24,050][54818] Updated weights for policy 0, policy_version 565968 (0.0018) [2024-04-28 03:43:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9272819712. 
Throughput: 0: 61217.3. Samples: 2178058680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:24,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 03:43:26,143][54798] Signal inference workers to stop experience collection... (35950 times) [2024-04-28 03:43:26,173][54818] InferenceWorker_p0-w0: stopping experience collection (35950 times) [2024-04-28 03:43:26,203][54798] Signal inference workers to resume experience collection... (35950 times) [2024-04-28 03:43:26,204][54818] InferenceWorker_p0-w0: resuming experience collection (35950 times) [2024-04-28 03:43:27,112][54818] Updated weights for policy 0, policy_version 565978 (0.0019) [2024-04-28 03:43:29,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9273131008. Throughput: 0: 60933.7. Samples: 2178242700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:29,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 03:43:29,629][54818] Updated weights for policy 0, policy_version 565988 (0.0016) [2024-04-28 03:43:32,326][54818] Updated weights for policy 0, policy_version 565998 (0.0017) [2024-04-28 03:43:33,639][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21300 times) [2024-04-28 03:43:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9273425920. Throughput: 0: 61060.9. Samples: 2178610500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:34,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 03:43:34,958][54818] Updated weights for policy 0, policy_version 566008 (0.0016) [2024-04-28 03:43:37,617][54818] Updated weights for policy 0, policy_version 566018 (0.0016) [2024-04-28 03:43:39,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9273737216. Throughput: 0: 61195.4. Samples: 2178979040. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:39,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 03:43:40,099][54818] Updated weights for policy 0, policy_version 566028 (0.0017) [2024-04-28 03:43:43,039][54818] Updated weights for policy 0, policy_version 566038 (0.0016) [2024-04-28 03:43:44,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60620.8, 300 sec: 61092.9). Total num frames: 9274032128. Throughput: 0: 61015.5. Samples: 2179163280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:44,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 03:43:45,756][54818] Updated weights for policy 0, policy_version 566048 (0.0016) [2024-04-28 03:43:48,368][54818] Updated weights for policy 0, policy_version 566058 (0.0015) [2024-04-28 03:43:49,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.9, 300 sec: 61148.5). Total num frames: 9274343424. Throughput: 0: 61112.6. Samples: 2179528420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:49,253][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 03:43:49,348][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000566062_9274359808.pth... [2024-04-28 03:43:49,401][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000565165_9259663360.pth [2024-04-28 03:43:51,231][54818] Updated weights for policy 0, policy_version 566068 (0.0015) [2024-04-28 03:43:53,939][54818] Updated weights for policy 0, policy_version 566078 (0.0016) [2024-04-28 03:43:54,253][54587] Fps is (10 sec: 60621.6, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9274638336. Throughput: 0: 61134.4. Samples: 2179892760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:54,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 03:43:56,770][54818] Updated weights for policy 0, policy_version 566088 (0.0017) [2024-04-28 03:43:59,253][54587] Fps is (10 sec: 58981.9, 60 sec: 60347.8, 300 sec: 61037.3). 
Total num frames: 9274933248. Throughput: 0: 61197.3. Samples: 2180083640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:43:59,254][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 03:43:59,270][54818] Updated weights for policy 0, policy_version 566098 (0.0018) [2024-04-28 03:44:00,392][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21400 times) [2024-04-28 03:44:02,045][54818] Updated weights for policy 0, policy_version 566108 (0.0022) [2024-04-28 03:44:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.9, 300 sec: 61092.9). Total num frames: 9275244544. Throughput: 0: 61171.3. Samples: 2180444580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:44:04,253][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:44:04,458][54818] Updated weights for policy 0, policy_version 566118 (0.0015) [2024-04-28 03:44:07,237][54818] Updated weights for policy 0, policy_version 566128 (0.0015) [2024-04-28 03:44:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9275555840. Throughput: 0: 61091.1. Samples: 2180807780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 03:44:09,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 03:44:09,793][54818] Updated weights for policy 0, policy_version 566138 (0.0015) [2024-04-28 03:44:12,681][54818] Updated weights for policy 0, policy_version 566148 (0.0016) [2024-04-28 03:44:14,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.8, 300 sec: 61092.9). Total num frames: 9275850752. Throughput: 0: 61185.0. Samples: 2180996020. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:14,256][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 03:44:15,184][54818] Updated weights for policy 0, policy_version 566158 (0.0017) [2024-04-28 03:44:18,087][54818] Updated weights for policy 0, policy_version 566168 (0.0016) [2024-04-28 03:44:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60620.8, 300 sec: 61092.9). Total num frames: 9276162048. Throughput: 0: 61171.0. Samples: 2181363200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:19,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 03:44:19,592][54798] Signal inference workers to stop experience collection... (36000 times) [2024-04-28 03:44:19,593][54798] Signal inference workers to resume experience collection... (36000 times) [2024-04-28 03:44:19,611][54818] InferenceWorker_p0-w0: stopping experience collection (36000 times) [2024-04-28 03:44:19,611][54818] InferenceWorker_p0-w0: resuming experience collection (36000 times) [2024-04-28 03:44:20,553][54818] Updated weights for policy 0, policy_version 566178 (0.0019) [2024-04-28 03:44:23,337][54818] Updated weights for policy 0, policy_version 566188 (0.0016) [2024-04-28 03:44:24,253][54587] Fps is (10 sec: 62259.0, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9276473344. Throughput: 0: 61045.4. Samples: 2181726080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:24,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 03:44:25,845][54818] Updated weights for policy 0, policy_version 566198 (0.0020) [2024-04-28 03:44:27,188][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21500 times) [2024-04-28 03:44:28,735][54818] Updated weights for policy 0, policy_version 566208 (0.0018) [2024-04-28 03:44:29,253][54587] Fps is (10 sec: 58982.5, 60 sec: 60347.7, 300 sec: 60981.8). Total num frames: 9276751872. Throughput: 0: 61081.8. Samples: 2181911960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:29,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 03:44:31,215][54818] Updated weights for policy 0, policy_version 566218 (0.0016) [2024-04-28 03:44:34,253][54818] Updated weights for policy 0, policy_version 566228 (0.0015) [2024-04-28 03:44:34,253][54587] Fps is (10 sec: 58982.8, 60 sec: 60620.8, 300 sec: 61037.3). Total num frames: 9277063168. Throughput: 0: 61204.9. Samples: 2182282640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:34,256][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 03:44:36,435][54818] Updated weights for policy 0, policy_version 566238 (0.0017) [2024-04-28 03:44:39,253][54587] Fps is (10 sec: 62259.9, 60 sec: 60620.9, 300 sec: 61037.4). Total num frames: 9277374464. Throughput: 0: 61178.6. Samples: 2182645800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:39,255][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 03:44:39,527][54818] Updated weights for policy 0, policy_version 566248 (0.0019) [2024-04-28 03:44:42,229][54818] Updated weights for policy 0, policy_version 566258 (0.0017) [2024-04-28 03:44:44,253][54587] Fps is (10 sec: 62258.8, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9277685760. Throughput: 0: 60948.4. Samples: 2182826320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:44,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 03:44:44,686][54818] Updated weights for policy 0, policy_version 566268 (0.0019) [2024-04-28 03:44:47,280][54818] Updated weights for policy 0, policy_version 566278 (0.0019) [2024-04-28 03:44:49,253][54587] Fps is (10 sec: 62258.1, 60 sec: 60893.7, 300 sec: 61148.4). Total num frames: 9277997056. Throughput: 0: 61345.1. Samples: 2183205120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:49,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 03:44:49,264][54587] No heartbeat for components: RolloutWorker_w4 (22837 seconds), RolloutWorker_w5 (8937 seconds) [2024-04-28 03:44:49,812][54818] Updated weights for policy 0, policy_version 566288 (0.0017) [2024-04-28 03:44:52,825][54818] Updated weights for policy 0, policy_version 566298 (0.0020) [2024-04-28 03:44:54,218][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21600 times) [2024-04-28 03:44:54,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9278291968. Throughput: 0: 61323.2. Samples: 2183567320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:54,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 03:44:55,038][54818] Updated weights for policy 0, policy_version 566308 (0.0017) [2024-04-28 03:44:58,163][54818] Updated weights for policy 0, policy_version 566318 (0.0020) [2024-04-28 03:44:59,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9278603264. Throughput: 0: 61137.8. Samples: 2183747220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:44:59,255][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 03:44:59,821][54798] Signal inference workers to stop experience collection... (36050 times) [2024-04-28 03:44:59,858][54818] InferenceWorker_p0-w0: stopping experience collection (36050 times) [2024-04-28 03:44:59,872][54798] Signal inference workers to resume experience collection... 
(36050 times) [2024-04-28 03:44:59,875][54818] InferenceWorker_p0-w0: resuming experience collection (36050 times) [2024-04-28 03:45:00,369][54818] Updated weights for policy 0, policy_version 566328 (0.0018) [2024-04-28 03:45:03,759][54818] Updated weights for policy 0, policy_version 566338 (0.0015) [2024-04-28 03:45:04,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 9278914560. Throughput: 0: 61121.3. Samples: 2184113660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:45:04,255][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 03:45:05,906][54818] Updated weights for policy 0, policy_version 566348 (0.0016) [2024-04-28 03:45:08,928][54818] Updated weights for policy 0, policy_version 566358 (0.0016) [2024-04-28 03:45:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9279225856. Throughput: 0: 61416.9. Samples: 2184489840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:45:09,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 03:45:11,333][54818] Updated weights for policy 0, policy_version 566368 (0.0018) [2024-04-28 03:45:14,238][54818] Updated weights for policy 0, policy_version 566378 (0.0018) [2024-04-28 03:45:14,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9279537152. Throughput: 0: 61307.7. Samples: 2184670800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:45:14,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 03:45:16,968][54818] Updated weights for policy 0, policy_version 566388 (0.0018) [2024-04-28 03:45:19,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9279832064. Throughput: 0: 61210.2. Samples: 2185037100. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:45:19,253][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 03:45:19,457][54818] Updated weights for policy 0, policy_version 566398 (0.0017) [2024-04-28 03:45:20,361][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21700 times) [2024-04-28 03:45:22,350][54818] Updated weights for policy 0, policy_version 566408 (0.0017) [2024-04-28 03:45:24,253][54587] Fps is (10 sec: 58982.4, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9280126976. Throughput: 0: 61388.9. Samples: 2185408300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:45:24,254][54587] Avg episode reward: [(0, '0.648')] [2024-04-28 03:45:24,856][54818] Updated weights for policy 0, policy_version 566418 (0.0020) [2024-04-28 03:45:27,704][54818] Updated weights for policy 0, policy_version 566428 (0.0016) [2024-04-28 03:45:29,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9280438272. Throughput: 0: 61254.3. Samples: 2185582760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:45:29,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 03:45:30,096][54818] Updated weights for policy 0, policy_version 566438 (0.0018) [2024-04-28 03:45:32,893][54818] Updated weights for policy 0, policy_version 566448 (0.0015) [2024-04-28 03:45:34,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9280749568. Throughput: 0: 61199.2. Samples: 2185959080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:45:34,254][54587] Avg episode reward: [(0, '0.726')] [2024-04-28 03:45:35,311][54818] Updated weights for policy 0, policy_version 566458 (0.0015) [2024-04-28 03:45:38,112][54818] Updated weights for policy 0, policy_version 566468 (0.0019) [2024-04-28 03:45:39,253][54587] Fps is (10 sec: 60619.6, 60 sec: 61166.7, 300 sec: 61148.4). Total num frames: 9281044480. 
Throughput: 0: 61310.4. Samples: 2186326300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 03:45:39,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 03:45:39,727][54798] Signal inference workers to stop experience collection... (36100 times) [2024-04-28 03:45:39,769][54818] InferenceWorker_p0-w0: stopping experience collection (36100 times) [2024-04-28 03:45:39,788][54798] Signal inference workers to resume experience collection... (36100 times) [2024-04-28 03:45:39,789][54818] InferenceWorker_p0-w0: resuming experience collection (36100 times) [2024-04-28 03:45:40,455][54818] Updated weights for policy 0, policy_version 566478 (0.0018) [2024-04-28 03:45:43,875][54818] Updated weights for policy 0, policy_version 566488 (0.0018) [2024-04-28 03:45:44,253][54587] Fps is (10 sec: 58982.2, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9281339392. Throughput: 0: 61140.7. Samples: 2186498560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:45:44,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 03:45:46,216][54818] Updated weights for policy 0, policy_version 566498 (0.0022) [2024-04-28 03:45:47,604][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21800 times) [2024-04-28 03:45:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9281650688. Throughput: 0: 61203.1. Samples: 2186867800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:45:49,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 03:45:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000566507_9281650688.pth... 
[2024-04-28 03:45:49,326][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000565613_9267003392.pth [2024-04-28 03:45:49,501][54818] Updated weights for policy 0, policy_version 566508 (0.0019) [2024-04-28 03:45:51,598][54818] Updated weights for policy 0, policy_version 566518 (0.0017) [2024-04-28 03:45:54,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9281961984. Throughput: 0: 60943.9. Samples: 2187232320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:45:54,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 03:45:54,445][54818] Updated weights for policy 0, policy_version 566528 (0.0015) [2024-04-28 03:45:57,130][54818] Updated weights for policy 0, policy_version 566538 (0.0016) [2024-04-28 03:45:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9282256896. Throughput: 0: 60858.5. Samples: 2187409440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:45:59,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 03:45:59,749][54818] Updated weights for policy 0, policy_version 566548 (0.0016) [2024-04-28 03:46:02,681][54818] Updated weights for policy 0, policy_version 566558 (0.0016) [2024-04-28 03:46:04,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9282568192. Throughput: 0: 61035.4. Samples: 2187783700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:04,255][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 03:46:05,169][54818] Updated weights for policy 0, policy_version 566568 (0.0017) [2024-04-28 03:46:07,985][54818] Updated weights for policy 0, policy_version 566578 (0.0016) [2024-04-28 03:46:09,253][54587] Fps is (10 sec: 62260.0, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9282879488. Throughput: 0: 60815.2. Samples: 2188144980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:09,253][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 03:46:10,437][54818] Updated weights for policy 0, policy_version 566588 (0.0016) [2024-04-28 03:46:13,171][54818] Updated weights for policy 0, policy_version 566598 (0.0017) [2024-04-28 03:46:14,253][54587] Fps is (10 sec: 63897.6, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9283207168. Throughput: 0: 61057.1. Samples: 2188330340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:14,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 03:46:14,488][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (21900 times) [2024-04-28 03:46:15,785][54818] Updated weights for policy 0, policy_version 566608 (0.0016) [2024-04-28 03:46:18,599][54818] Updated weights for policy 0, policy_version 566618 (0.0016) [2024-04-28 03:46:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9283502080. Throughput: 0: 60989.9. Samples: 2188703620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:19,253][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 03:46:21,007][54818] Updated weights for policy 0, policy_version 566628 (0.0016) [2024-04-28 03:46:23,760][54818] Updated weights for policy 0, policy_version 566638 (0.0018) [2024-04-28 03:46:24,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9283813376. Throughput: 0: 60900.3. Samples: 2189066800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:24,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 03:46:25,516][54798] Signal inference workers to stop experience collection... (36150 times) [2024-04-28 03:46:25,520][54798] Signal inference workers to resume experience collection... 
(36150 times) [2024-04-28 03:46:25,532][54818] InferenceWorker_p0-w0: stopping experience collection (36150 times) [2024-04-28 03:46:25,532][54818] InferenceWorker_p0-w0: resuming experience collection (36150 times) [2024-04-28 03:46:26,217][54818] Updated weights for policy 0, policy_version 566648 (0.0018) [2024-04-28 03:46:28,896][54818] Updated weights for policy 0, policy_version 566658 (0.0020) [2024-04-28 03:46:29,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9284124672. Throughput: 0: 61124.6. Samples: 2189249160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:29,254][54587] Avg episode reward: [(0, '0.504')] [2024-04-28 03:46:31,518][54818] Updated weights for policy 0, policy_version 566668 (0.0016) [2024-04-28 03:46:34,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9284435968. Throughput: 0: 61241.8. Samples: 2189623680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:34,256][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 03:46:34,553][54818] Updated weights for policy 0, policy_version 566678 (0.0016) [2024-04-28 03:46:37,052][54818] Updated weights for policy 0, policy_version 566688 (0.0017) [2024-04-28 03:46:39,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9284730880. Throughput: 0: 61288.0. Samples: 2189990280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:39,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 03:46:39,864][54818] Updated weights for policy 0, policy_version 566698 (0.0015) [2024-04-28 03:46:40,801][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (22000 times) [2024-04-28 03:46:42,383][54818] Updated weights for policy 0, policy_version 566708 (0.0015) [2024-04-28 03:46:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61713.2, 300 sec: 61148.4). Total num frames: 9285042176. 
Throughput: 0: 61406.2. Samples: 2190172720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:44,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 03:46:45,133][54818] Updated weights for policy 0, policy_version 566718 (0.0015) [2024-04-28 03:46:47,836][54818] Updated weights for policy 0, policy_version 566728 (0.0017) [2024-04-28 03:46:49,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61713.3, 300 sec: 61148.5). Total num frames: 9285353472. Throughput: 0: 61362.0. Samples: 2190544980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:49,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 03:46:50,594][54818] Updated weights for policy 0, policy_version 566738 (0.0016) [2024-04-28 03:46:53,092][54818] Updated weights for policy 0, policy_version 566748 (0.0015) [2024-04-28 03:46:54,253][54587] Fps is (10 sec: 63897.7, 60 sec: 61986.1, 300 sec: 61204.0). Total num frames: 9285681152. Throughput: 0: 61495.0. Samples: 2190912260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:54,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 03:46:55,689][54818] Updated weights for policy 0, policy_version 566758 (0.0015) [2024-04-28 03:46:58,288][54818] Updated weights for policy 0, policy_version 566768 (0.0016) [2024-04-28 03:46:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61986.3, 300 sec: 61204.0). Total num frames: 9285976064. Throughput: 0: 61491.4. Samples: 2191097440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:46:59,253][54587] Avg episode reward: [(0, '0.533')] [2024-04-28 03:47:01,162][54818] Updated weights for policy 0, policy_version 566778 (0.0016) [2024-04-28 03:47:03,785][54818] Updated weights for policy 0, policy_version 566788 (0.0017) [2024-04-28 03:47:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61986.2, 300 sec: 61204.0). Total num frames: 9286287360. Throughput: 0: 61464.8. Samples: 2191469540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:47:04,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 03:47:06,335][54818] Updated weights for policy 0, policy_version 566798 (0.0015) [2024-04-28 03:47:07,805][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (22100 times) [2024-04-28 03:47:08,985][54818] Updated weights for policy 0, policy_version 566808 (0.0016) [2024-04-28 03:47:09,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61986.0, 300 sec: 61203.9). Total num frames: 9286598656. Throughput: 0: 61603.4. Samples: 2191838960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:47:09,254][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 03:47:11,654][54818] Updated weights for policy 0, policy_version 566818 (0.0016) [2024-04-28 03:47:14,132][54818] Updated weights for policy 0, policy_version 566828 (0.0016) [2024-04-28 03:47:14,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61713.3, 300 sec: 61204.0). Total num frames: 9286909952. Throughput: 0: 61628.5. Samples: 2192022440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 03:47:14,254][54587] Avg episode reward: [(0, '0.492')] [2024-04-28 03:47:16,615][54798] Signal inference workers to stop experience collection... (36200 times) [2024-04-28 03:47:16,615][54798] Signal inference workers to resume experience collection... (36200 times) [2024-04-28 03:47:16,624][54818] InferenceWorker_p0-w0: stopping experience collection (36200 times) [2024-04-28 03:47:16,624][54818] InferenceWorker_p0-w0: resuming experience collection (36200 times) [2024-04-28 03:47:16,903][54818] Updated weights for policy 0, policy_version 566838 (0.0017) [2024-04-28 03:47:19,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61986.1, 300 sec: 61204.0). Total num frames: 9287221248. Throughput: 0: 61511.2. Samples: 2192391680. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:47:19,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 03:47:19,531][54818] Updated weights for policy 0, policy_version 566848 (0.0016) [2024-04-28 03:47:22,325][54818] Updated weights for policy 0, policy_version 566858 (0.0019) [2024-04-28 03:47:24,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61712.9, 300 sec: 61203.9). Total num frames: 9287516160. Throughput: 0: 61507.9. Samples: 2192758140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:47:24,255][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 03:47:25,387][54818] Updated weights for policy 0, policy_version 566868 (0.0018) [2024-04-28 03:47:27,866][54818] Updated weights for policy 0, policy_version 566878 (0.0021) [2024-04-28 03:47:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61713.1, 300 sec: 61204.0). Total num frames: 9287827456. Throughput: 0: 61601.4. Samples: 2192944780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:47:29,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 03:47:30,634][54818] Updated weights for policy 0, policy_version 566888 (0.0019) [2024-04-28 03:47:33,143][54818] Updated weights for policy 0, policy_version 566898 (0.0015) [2024-04-28 03:47:34,185][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (22200 times) [2024-04-28 03:47:34,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9288122368. Throughput: 0: 61391.5. Samples: 2193307600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:47:34,253][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 03:47:35,905][54818] Updated weights for policy 0, policy_version 566908 (0.0020) [2024-04-28 03:47:38,601][54818] Updated weights for policy 0, policy_version 566918 (0.0016) [2024-04-28 03:47:39,253][54587] Fps is (10 sec: 58981.9, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 9288417280. 
Throughput: 0: 61350.6. Samples: 2193673040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:47:39,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 03:47:41,217][54818] Updated weights for policy 0, policy_version 566928 (0.0015) [2024-04-28 03:47:43,894][54818] Updated weights for policy 0, policy_version 566938 (0.0021) [2024-04-28 03:47:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9288728576. Throughput: 0: 61345.2. Samples: 2193857980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:47:44,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 03:47:46,695][54818] Updated weights for policy 0, policy_version 566948 (0.0020) [2024-04-28 03:47:49,076][54818] Updated weights for policy 0, policy_version 566958 (0.0016) [2024-04-28 03:47:49,254][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.6, 300 sec: 61203.9). Total num frames: 9289039872. Throughput: 0: 61159.7. Samples: 2194221740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:47:49,254][54587] Avg episode reward: [(0, '0.517')] [2024-04-28 03:47:49,263][54587] No heartbeat for components: RolloutWorker_w4 (23017 seconds), RolloutWorker_w5 (9117 seconds) [2024-04-28 03:47:49,334][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000566959_9289056256.pth... [2024-04-28 03:47:49,388][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000566062_9274359808.pth [2024-04-28 03:47:51,743][54818] Updated weights for policy 0, policy_version 566968 (0.0017) [2024-04-28 03:47:54,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9289351168. Throughput: 0: 61261.5. Samples: 2194595720. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:47:54,254][54587] Avg episode reward: [(0, '0.503')] [2024-04-28 03:47:54,338][54818] Updated weights for policy 0, policy_version 566978 (0.0015) [2024-04-28 03:47:57,070][54818] Updated weights for policy 0, policy_version 566988 (0.0016) [2024-04-28 03:47:59,253][54587] Fps is (10 sec: 62260.8, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9289662464. Throughput: 0: 61343.4. Samples: 2194782900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:47:59,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 03:47:59,672][54818] Updated weights for policy 0, policy_version 566998 (0.0016) [2024-04-28 03:48:01,176][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (22300 times) [2024-04-28 03:48:02,192][54818] Updated weights for policy 0, policy_version 567008 (0.0017) [2024-04-28 03:48:04,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9289957376. Throughput: 0: 61211.5. Samples: 2195146200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:48:04,255][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:48:05,417][54818] Updated weights for policy 0, policy_version 567018 (0.0017) [2024-04-28 03:48:07,618][54818] Updated weights for policy 0, policy_version 567028 (0.0017) [2024-04-28 03:48:09,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9290268672. Throughput: 0: 61394.2. Samples: 2195520880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:48:09,255][54587] Avg episode reward: [(0, '0.523')] [2024-04-28 03:48:10,696][54818] Updated weights for policy 0, policy_version 567038 (0.0017) [2024-04-28 03:48:11,063][54798] Signal inference workers to stop experience collection... 
(36250 times) [2024-04-28 03:48:11,083][54818] InferenceWorker_p0-w0: stopping experience collection (36250 times) [2024-04-28 03:48:11,121][54798] Signal inference workers to resume experience collection... (36250 times) [2024-04-28 03:48:11,122][54818] InferenceWorker_p0-w0: resuming experience collection (36250 times) [2024-04-28 03:48:12,872][54818] Updated weights for policy 0, policy_version 567048 (0.0017) [2024-04-28 03:48:14,253][54587] Fps is (10 sec: 62260.1, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9290579968. Throughput: 0: 61387.7. Samples: 2195707220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:48:14,253][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 03:48:16,144][54818] Updated weights for policy 0, policy_version 567058 (0.0017) [2024-04-28 03:48:18,421][54818] Updated weights for policy 0, policy_version 567068 (0.0015) [2024-04-28 03:48:19,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9290874880. Throughput: 0: 61324.5. Samples: 2196067200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:48:19,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 03:48:21,402][54818] Updated weights for policy 0, policy_version 567078 (0.0018) [2024-04-28 03:48:23,869][54818] Updated weights for policy 0, policy_version 567088 (0.0016) [2024-04-28 03:48:24,253][54587] Fps is (10 sec: 62257.8, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9291202560. Throughput: 0: 61651.5. Samples: 2196447360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:48:24,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 03:48:26,397][54818] Updated weights for policy 0, policy_version 567098 (0.0017) [2024-04-28 03:48:27,428][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (22400 times) [2024-04-28 03:48:28,941][54818] Updated weights for policy 0, policy_version 567108 (0.0015) [2024-04-28 03:48:29,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9291497472. Throughput: 0: 61594.8. Samples: 2196629740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:48:29,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 03:48:31,572][54818] Updated weights for policy 0, policy_version 567118 (0.0017) [2024-04-28 03:48:34,253][54587] Fps is (10 sec: 60622.1, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9291808768. Throughput: 0: 61723.1. Samples: 2196999260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:48:34,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 03:48:34,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (1200 times) [2024-04-28 03:48:34,418][54818] Updated weights for policy 0, policy_version 567128 (0.0017) [2024-04-28 03:48:36,844][54818] Updated weights for policy 0, policy_version 567138 (0.0015) [2024-04-28 03:48:39,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9292120064. Throughput: 0: 61693.2. Samples: 2197371920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:48:39,255][54587] Avg episode reward: [(0, '0.445')] [2024-04-28 03:48:39,686][54818] Updated weights for policy 0, policy_version 567148 (0.0017) [2024-04-28 03:48:42,250][54818] Updated weights for policy 0, policy_version 567158 (0.0016) [2024-04-28 03:48:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9292414976. Throughput: 0: 61669.0. Samples: 2197558000. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 03:48:44,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 03:48:44,942][54818] Updated weights for policy 0, policy_version 567168 (0.0016) [2024-04-28 03:48:47,521][54818] Updated weights for policy 0, policy_version 567178 (0.0018) [2024-04-28 03:48:49,253][54587] Fps is (10 sec: 58982.6, 60 sec: 61167.2, 300 sec: 61259.5). Total num frames: 9292709888. Throughput: 0: 61760.1. Samples: 2197925400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:48:49,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 03:48:50,072][54818] Updated weights for policy 0, policy_version 567188 (0.0017) [2024-04-28 03:48:52,645][54798] Signal inference workers to stop experience collection... (36300 times) [2024-04-28 03:48:52,645][54798] Signal inference workers to resume experience collection... (36300 times) [2024-04-28 03:48:52,656][54818] InferenceWorker_p0-w0: stopping experience collection (36300 times) [2024-04-28 03:48:52,656][54818] InferenceWorker_p0-w0: resuming experience collection (36300 times) [2024-04-28 03:48:52,762][54818] Updated weights for policy 0, policy_version 567198 (0.0017) [2024-04-28 03:48:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9293037568. Throughput: 0: 61794.8. Samples: 2198301640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:48:54,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 03:48:54,372][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (22500 times) [2024-04-28 03:48:55,473][54818] Updated weights for policy 0, policy_version 567208 (0.0017) [2024-04-28 03:48:58,038][54818] Updated weights for policy 0, policy_version 567218 (0.0017) [2024-04-28 03:48:59,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9293332480. Throughput: 0: 61651.4. Samples: 2198481540. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:48:59,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 03:49:01,009][54818] Updated weights for policy 0, policy_version 567228 (0.0016) [2024-04-28 03:49:03,562][54818] Updated weights for policy 0, policy_version 567238 (0.0020) [2024-04-28 03:49:04,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.1, 300 sec: 61315.1). Total num frames: 9293643776. Throughput: 0: 61885.3. Samples: 2198852040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:04,253][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 03:49:06,262][54818] Updated weights for policy 0, policy_version 567248 (0.0015) [2024-04-28 03:49:09,224][54818] Updated weights for policy 0, policy_version 567258 (0.0017) [2024-04-28 03:49:09,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9293955072. Throughput: 0: 61621.0. Samples: 2199220300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:09,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 03:49:11,385][54818] Updated weights for policy 0, policy_version 567268 (0.0015) [2024-04-28 03:49:14,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61439.8, 300 sec: 61370.6). Total num frames: 9294266368. Throughput: 0: 61726.0. Samples: 2199407420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:14,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 03:49:14,375][54818] Updated weights for policy 0, policy_version 567278 (0.0016) [2024-04-28 03:49:16,716][54818] Updated weights for policy 0, policy_version 567288 (0.0016) [2024-04-28 03:49:19,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61712.9, 300 sec: 61370.6). Total num frames: 9294577664. Throughput: 0: 61680.7. Samples: 2199774900. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:19,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 03:49:19,494][54818] Updated weights for policy 0, policy_version 567298 (0.0017) [2024-04-28 03:49:20,834][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (22600 times) [2024-04-28 03:49:21,928][54818] Updated weights for policy 0, policy_version 567308 (0.0020) [2024-04-28 03:49:24,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.1, 300 sec: 61481.7). Total num frames: 9294888960. Throughput: 0: 61796.4. Samples: 2200152760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:24,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 03:49:24,811][54818] Updated weights for policy 0, policy_version 567318 (0.0017) [2024-04-28 03:49:27,296][54818] Updated weights for policy 0, policy_version 567328 (0.0016) [2024-04-28 03:49:29,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9295183872. Throughput: 0: 61679.1. Samples: 2200333560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:29,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 03:49:29,955][54818] Updated weights for policy 0, policy_version 567338 (0.0017) [2024-04-28 03:49:32,490][54818] Updated weights for policy 0, policy_version 567348 (0.0019) [2024-04-28 03:49:34,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9295495168. Throughput: 0: 61642.3. Samples: 2200699300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:34,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 03:49:35,411][54818] Updated weights for policy 0, policy_version 567358 (0.0016) [2024-04-28 03:49:36,290][54798] Signal inference workers to stop experience collection... 
(36350 times) [2024-04-28 03:49:36,329][54818] InferenceWorker_p0-w0: stopping experience collection (36350 times) [2024-04-28 03:49:36,347][54798] Signal inference workers to resume experience collection... (36350 times) [2024-04-28 03:49:36,348][54818] InferenceWorker_p0-w0: resuming experience collection (36350 times) [2024-04-28 03:49:37,915][54818] Updated weights for policy 0, policy_version 567368 (0.0018) [2024-04-28 03:49:39,253][54587] Fps is (10 sec: 63898.0, 60 sec: 61713.2, 300 sec: 61481.7). Total num frames: 9295822848. Throughput: 0: 61618.7. Samples: 2201074480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:39,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 03:49:40,901][54818] Updated weights for policy 0, policy_version 567378 (0.0022) [2024-04-28 03:49:43,229][54818] Updated weights for policy 0, policy_version 567388 (0.0016) [2024-04-28 03:49:44,253][54587] Fps is (10 sec: 63896.6, 60 sec: 61986.0, 300 sec: 61481.7). Total num frames: 9296134144. Throughput: 0: 61630.6. Samples: 2201254920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 03:49:46,067][54818] Updated weights for policy 0, policy_version 567398 (0.0016) [2024-04-28 03:49:46,982][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (22700 times) [2024-04-28 03:49:48,925][54818] Updated weights for policy 0, policy_version 567408 (0.0016) [2024-04-28 03:49:49,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61986.2, 300 sec: 61481.7). Total num frames: 9296429056. Throughput: 0: 61627.1. Samples: 2201625260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:49,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 03:49:49,307][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000567410_9296445440.pth... 
[2024-04-28 03:49:49,362][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000566507_9281650688.pth [2024-04-28 03:49:51,490][54818] Updated weights for policy 0, policy_version 567418 (0.0020) [2024-04-28 03:49:54,179][54818] Updated weights for policy 0, policy_version 567428 (0.0015) [2024-04-28 03:49:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61713.0, 300 sec: 61481.6). Total num frames: 9296740352. Throughput: 0: 61590.2. Samples: 2201991860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:54,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 03:49:56,627][54818] Updated weights for policy 0, policy_version 567438 (0.0016) [2024-04-28 03:49:59,253][54587] Fps is (10 sec: 62257.6, 60 sec: 61985.9, 300 sec: 61481.6). Total num frames: 9297051648. Throughput: 0: 61484.7. Samples: 2202174240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:49:59,255][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 03:49:59,538][54818] Updated weights for policy 0, policy_version 567448 (0.0016) [2024-04-28 03:50:01,867][54818] Updated weights for policy 0, policy_version 567458 (0.0019) [2024-04-28 03:50:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61712.9, 300 sec: 61426.1). Total num frames: 9297346560. Throughput: 0: 61644.9. Samples: 2202548920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:50:04,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 03:50:04,974][54818] Updated weights for policy 0, policy_version 567468 (0.0016) [2024-04-28 03:50:07,102][54818] Updated weights for policy 0, policy_version 567478 (0.0017) [2024-04-28 03:50:09,253][54587] Fps is (10 sec: 58983.8, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9297641472. Throughput: 0: 61358.3. Samples: 2202913880. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:50:09,253][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 03:50:10,223][54818] Updated weights for policy 0, policy_version 567488 (0.0016) [2024-04-28 03:50:13,065][54818] Updated weights for policy 0, policy_version 567498 (0.0017) [2024-04-28 03:50:14,084][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (22800 times) [2024-04-28 03:50:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61713.1, 300 sec: 61481.6). Total num frames: 9297969152. Throughput: 0: 61379.5. Samples: 2203095640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 03:50:14,255][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:50:15,408][54818] Updated weights for policy 0, policy_version 567508 (0.0016) [2024-04-28 03:50:18,300][54818] Updated weights for policy 0, policy_version 567518 (0.0019) [2024-04-28 03:50:19,253][54587] Fps is (10 sec: 63896.8, 60 sec: 61713.0, 300 sec: 61537.2). Total num frames: 9298280448. Throughput: 0: 61478.0. Samples: 2203465820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:50:19,255][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 03:50:20,721][54818] Updated weights for policy 0, policy_version 567528 (0.0015) [2024-04-28 03:50:23,559][54818] Updated weights for policy 0, policy_version 567538 (0.0015) [2024-04-28 03:50:24,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61440.1, 300 sec: 61481.7). Total num frames: 9298575360. Throughput: 0: 61357.3. Samples: 2203835560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:50:24,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 03:50:24,318][54798] Signal inference workers to stop experience collection... (36400 times) [2024-04-28 03:50:24,318][54798] Signal inference workers to resume experience collection... 
(36400 times) [2024-04-28 03:50:24,331][54818] InferenceWorker_p0-w0: stopping experience collection (36400 times) [2024-04-28 03:50:24,331][54818] InferenceWorker_p0-w0: resuming experience collection (36400 times) [2024-04-28 03:50:26,047][54818] Updated weights for policy 0, policy_version 567548 (0.0016) [2024-04-28 03:50:28,834][54818] Updated weights for policy 0, policy_version 567558 (0.0015) [2024-04-28 03:50:29,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61713.0, 300 sec: 61481.7). Total num frames: 9298886656. Throughput: 0: 61328.4. Samples: 2204014700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:50:29,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 03:50:31,261][54818] Updated weights for policy 0, policy_version 567568 (0.0015) [2024-04-28 03:50:34,228][54818] Updated weights for policy 0, policy_version 567578 (0.0018) [2024-04-28 03:50:34,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61713.0, 300 sec: 61537.2). Total num frames: 9299197952. Throughput: 0: 61359.9. Samples: 2204386460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:50:34,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:50:36,865][54818] Updated weights for policy 0, policy_version 567588 (0.0017) [2024-04-28 03:50:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61166.8, 300 sec: 61537.2). Total num frames: 9299492864. Throughput: 0: 61420.0. Samples: 2204755760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:50:39,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 03:50:39,389][54818] Updated weights for policy 0, policy_version 567598 (0.0017) [2024-04-28 03:50:40,710][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (22900 times) [2024-04-28 03:50:42,042][54818] Updated weights for policy 0, policy_version 567608 (0.0017) [2024-04-28 03:50:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61167.0, 300 sec: 61537.2). Total num frames: 9299804160. 
Throughput: 0: 61437.6. Samples: 2204938920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:50:44,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:50:44,716][54818] Updated weights for policy 0, policy_version 567618 (0.0019) [2024-04-28 03:50:47,483][54818] Updated weights for policy 0, policy_version 567628 (0.0017) [2024-04-28 03:50:49,253][54587] Fps is (10 sec: 63898.4, 60 sec: 61713.1, 300 sec: 61592.8). Total num frames: 9300131840. Throughput: 0: 61434.9. Samples: 2205313480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:50:49,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 03:50:49,263][54587] No heartbeat for components: RolloutWorker_w4 (23197 seconds), RolloutWorker_w5 (9297 seconds) [2024-04-28 03:50:49,933][54818] Updated weights for policy 0, policy_version 567638 (0.0016) [2024-04-28 03:50:52,931][54818] Updated weights for policy 0, policy_version 567648 (0.0016) [2024-04-28 03:50:54,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61592.7). Total num frames: 9300426752. Throughput: 0: 61606.9. Samples: 2205686200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:50:54,256][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 03:50:55,262][54818] Updated weights for policy 0, policy_version 567658 (0.0019) [2024-04-28 03:50:58,119][54818] Updated weights for policy 0, policy_version 567668 (0.0015) [2024-04-28 03:50:59,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.2, 300 sec: 61592.8). Total num frames: 9300738048. Throughput: 0: 61583.6. Samples: 2205866900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:50:59,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 03:51:00,860][54818] Updated weights for policy 0, policy_version 567678 (0.0022) [2024-04-28 03:51:03,423][54818] Updated weights for policy 0, policy_version 567688 (0.0015) [2024-04-28 03:51:04,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61713.1, 300 sec: 61592.7). 
Total num frames: 9301049344. Throughput: 0: 61493.3. Samples: 2206233020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:51:04,254][54587] Avg episode reward: [(0, '0.676')] [2024-04-28 03:51:06,140][54818] Updated weights for policy 0, policy_version 567698 (0.0016) [2024-04-28 03:51:07,743][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23000 times) [2024-04-28 03:51:08,883][54818] Updated weights for policy 0, policy_version 567708 (0.0016) [2024-04-28 03:51:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61713.0, 300 sec: 61481.7). Total num frames: 9301344256. Throughput: 0: 61306.5. Samples: 2206594360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:51:09,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 03:51:11,390][54818] Updated weights for policy 0, policy_version 567718 (0.0017) [2024-04-28 03:51:13,363][54798] Signal inference workers to stop experience collection... (36450 times) [2024-04-28 03:51:13,364][54798] Signal inference workers to resume experience collection... (36450 times) [2024-04-28 03:51:13,371][54818] InferenceWorker_p0-w0: stopping experience collection (36450 times) [2024-04-28 03:51:13,381][54818] InferenceWorker_p0-w0: resuming experience collection (36450 times) [2024-04-28 03:51:14,095][54818] Updated weights for policy 0, policy_version 567728 (0.0016) [2024-04-28 03:51:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61439.9, 300 sec: 61537.2). Total num frames: 9301655552. Throughput: 0: 61424.8. Samples: 2206778820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:51:14,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 03:51:16,622][54818] Updated weights for policy 0, policy_version 567738 (0.0015) [2024-04-28 03:51:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 61537.2). Total num frames: 9301966848. Throughput: 0: 61562.9. Samples: 2207156800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:51:19,254][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 03:51:19,364][54818] Updated weights for policy 0, policy_version 567748 (0.0017) [2024-04-28 03:51:22,015][54818] Updated weights for policy 0, policy_version 567758 (0.0016) [2024-04-28 03:51:24,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61440.0, 300 sec: 61481.7). Total num frames: 9302261760. Throughput: 0: 61638.4. Samples: 2207529480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:51:24,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:51:24,549][54818] Updated weights for policy 0, policy_version 567768 (0.0016) [2024-04-28 03:51:27,382][54818] Updated weights for policy 0, policy_version 567778 (0.0019) [2024-04-28 03:51:29,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61440.1, 300 sec: 61481.7). Total num frames: 9302573056. Throughput: 0: 61582.8. Samples: 2207710140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:51:29,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 03:51:29,745][54818] Updated weights for policy 0, policy_version 567788 (0.0017) [2024-04-28 03:51:32,961][54818] Updated weights for policy 0, policy_version 567798 (0.0016) [2024-04-28 03:51:33,823][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23100 times) [2024-04-28 03:51:34,253][54587] Fps is (10 sec: 63897.4, 60 sec: 61713.1, 300 sec: 61592.7). Total num frames: 9302900736. Throughput: 0: 61657.3. Samples: 2208088060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:51:34,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 03:51:35,222][54818] Updated weights for policy 0, policy_version 567808 (0.0019) [2024-04-28 03:51:38,249][54818] Updated weights for policy 0, policy_version 567818 (0.0017) [2024-04-28 03:51:39,253][54587] Fps is (10 sec: 63897.2, 60 sec: 61986.2, 300 sec: 61592.7). Total num frames: 9303212032. 
Throughput: 0: 61626.4. Samples: 2208459380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:51:39,255][54587] Avg episode reward: [(0, '0.505')] [2024-04-28 03:51:40,445][54818] Updated weights for policy 0, policy_version 567828 (0.0017) [2024-04-28 03:51:43,419][54818] Updated weights for policy 0, policy_version 567838 (0.0016) [2024-04-28 03:51:44,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61986.2, 300 sec: 61592.7). Total num frames: 9303523328. Throughput: 0: 61667.2. Samples: 2208641920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 03:51:44,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 03:51:45,896][54818] Updated weights for policy 0, policy_version 567848 (0.0016) [2024-04-28 03:51:48,666][54818] Updated weights for policy 0, policy_version 567858 (0.0016) [2024-04-28 03:51:49,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61439.7, 300 sec: 61481.6). Total num frames: 9303818240. Throughput: 0: 61847.8. Samples: 2209016180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:51:49,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 03:51:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000567861_9303834624.pth... [2024-04-28 03:51:49,320][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000566959_9289056256.pth [2024-04-28 03:51:51,134][54818] Updated weights for policy 0, policy_version 567868 (0.0019) [2024-04-28 03:51:53,910][54818] Updated weights for policy 0, policy_version 567878 (0.0017) [2024-04-28 03:51:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61713.3, 300 sec: 61537.2). Total num frames: 9304129536. Throughput: 0: 62070.4. Samples: 2209387520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:51:54,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 03:51:56,325][54818] Updated weights for policy 0, policy_version 567888 (0.0016) [2024-04-28 03:51:59,068][54798] Signal inference workers to stop experience collection... (36500 times) [2024-04-28 03:51:59,101][54818] InferenceWorker_p0-w0: stopping experience collection (36500 times) [2024-04-28 03:51:59,132][54798] Signal inference workers to resume experience collection... (36500 times) [2024-04-28 03:51:59,132][54818] InferenceWorker_p0-w0: resuming experience collection (36500 times) [2024-04-28 03:51:59,134][54818] Updated weights for policy 0, policy_version 567898 (0.0018) [2024-04-28 03:51:59,253][54587] Fps is (10 sec: 63898.5, 60 sec: 61986.0, 300 sec: 61592.7). Total num frames: 9304457216. Throughput: 0: 62139.6. Samples: 2209575100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:51:59,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 03:52:00,183][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23200 times) [2024-04-28 03:52:01,787][54818] Updated weights for policy 0, policy_version 567908 (0.0016) [2024-04-28 03:52:04,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61713.1, 300 sec: 61537.2). Total num frames: 9304752128. Throughput: 0: 61899.7. Samples: 2209942280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:04,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 03:52:04,364][54818] Updated weights for policy 0, policy_version 567918 (0.0016) [2024-04-28 03:52:07,011][54818] Updated weights for policy 0, policy_version 567928 (0.0015) [2024-04-28 03:52:09,253][54587] Fps is (10 sec: 58982.6, 60 sec: 61713.1, 300 sec: 61481.6). Total num frames: 9305047040. Throughput: 0: 61859.0. Samples: 2210313140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:09,254][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 03:52:09,493][54818] Updated weights for policy 0, policy_version 567938 (0.0021) [2024-04-28 03:52:12,459][54818] Updated weights for policy 0, policy_version 567948 (0.0017) [2024-04-28 03:52:14,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61713.3, 300 sec: 61481.7). Total num frames: 9305358336. Throughput: 0: 61933.4. Samples: 2210497140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:14,253][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 03:52:15,027][54818] Updated weights for policy 0, policy_version 567958 (0.0017) [2024-04-28 03:52:17,793][54818] Updated weights for policy 0, policy_version 567968 (0.0019) [2024-04-28 03:52:19,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61713.2, 300 sec: 61537.2). Total num frames: 9305669632. Throughput: 0: 61795.6. Samples: 2210868860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:19,254][54587] Avg episode reward: [(0, '0.547')] [2024-04-28 03:52:20,284][54818] Updated weights for policy 0, policy_version 567978 (0.0016) [2024-04-28 03:52:23,122][54818] Updated weights for policy 0, policy_version 567988 (0.0017) [2024-04-28 03:52:24,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61986.2, 300 sec: 61537.2). Total num frames: 9305980928. Throughput: 0: 61725.0. Samples: 2211237000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:24,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 03:52:25,542][54818] Updated weights for policy 0, policy_version 567998 (0.0016) [2024-04-28 03:52:26,946][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23300 times) [2024-04-28 03:52:28,236][54818] Updated weights for policy 0, policy_version 568008 (0.0015) [2024-04-28 03:52:29,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61986.0, 300 sec: 61592.7). Total num frames: 9306292224. 
Throughput: 0: 61719.8. Samples: 2211419320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:29,254][54587] Avg episode reward: [(0, '0.697')] [2024-04-28 03:52:30,997][54818] Updated weights for policy 0, policy_version 568018 (0.0020) [2024-04-28 03:52:33,579][54818] Updated weights for policy 0, policy_version 568028 (0.0020) [2024-04-28 03:52:34,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61440.0, 300 sec: 61592.7). Total num frames: 9306587136. Throughput: 0: 61822.9. Samples: 2211798200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:34,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 03:52:36,291][54818] Updated weights for policy 0, policy_version 568038 (0.0016) [2024-04-28 03:52:39,148][54818] Updated weights for policy 0, policy_version 568048 (0.0016) [2024-04-28 03:52:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61592.7). Total num frames: 9306898432. Throughput: 0: 61563.4. Samples: 2212157880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:39,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 03:52:41,459][54818] Updated weights for policy 0, policy_version 568058 (0.0016) [2024-04-28 03:52:44,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.0, 300 sec: 61592.8). Total num frames: 9307209728. Throughput: 0: 61523.3. Samples: 2212343640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:44,254][54587] Avg episode reward: [(0, '0.513')] [2024-04-28 03:52:44,363][54818] Updated weights for policy 0, policy_version 568068 (0.0016) [2024-04-28 03:52:46,902][54818] Updated weights for policy 0, policy_version 568078 (0.0015) [2024-04-28 03:52:49,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61713.3, 300 sec: 61592.7). Total num frames: 9307521024. Throughput: 0: 61604.1. Samples: 2212714460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:49,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 03:52:49,507][54818] Updated weights for policy 0, policy_version 568088 (0.0015) [2024-04-28 03:52:51,021][54798] Signal inference workers to stop experience collection... (36550 times) [2024-04-28 03:52:51,025][54798] Signal inference workers to resume experience collection... (36550 times) [2024-04-28 03:52:51,035][54818] InferenceWorker_p0-w0: stopping experience collection (36550 times) [2024-04-28 03:52:51,035][54818] InferenceWorker_p0-w0: resuming experience collection (36550 times) [2024-04-28 03:52:52,367][54818] Updated weights for policy 0, policy_version 568098 (0.0019) [2024-04-28 03:52:53,288][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23400 times) [2024-04-28 03:52:54,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61712.9, 300 sec: 61592.7). Total num frames: 9307832320. Throughput: 0: 61566.6. Samples: 2213083640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:54,255][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 03:52:55,081][54818] Updated weights for policy 0, policy_version 568108 (0.0016) [2024-04-28 03:52:57,531][54818] Updated weights for policy 0, policy_version 568118 (0.0016) [2024-04-28 03:52:59,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.9, 300 sec: 61648.3). Total num frames: 9308143616. Throughput: 0: 61691.7. Samples: 2213273280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:52:59,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 03:53:00,304][54818] Updated weights for policy 0, policy_version 568128 (0.0016) [2024-04-28 03:53:02,975][54818] Updated weights for policy 0, policy_version 568138 (0.0016) [2024-04-28 03:53:04,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61713.0, 300 sec: 61648.3). Total num frames: 9308454912. Throughput: 0: 61545.2. Samples: 2213638400. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:04,255][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 03:53:05,638][54818] Updated weights for policy 0, policy_version 568148 (0.0015) [2024-04-28 03:53:08,141][54818] Updated weights for policy 0, policy_version 568158 (0.0016) [2024-04-28 03:53:09,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61986.2, 300 sec: 61648.3). Total num frames: 9308766208. Throughput: 0: 61408.3. Samples: 2214000380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:09,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 03:53:11,003][54818] Updated weights for policy 0, policy_version 568168 (0.0015) [2024-04-28 03:53:13,277][54818] Updated weights for policy 0, policy_version 568178 (0.0015) [2024-04-28 03:53:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61712.9, 300 sec: 61648.2). Total num frames: 9309061120. Throughput: 0: 61672.5. Samples: 2214194580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:14,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 03:53:16,316][54818] Updated weights for policy 0, policy_version 568188 (0.0017) [2024-04-28 03:53:18,729][54818] Updated weights for policy 0, policy_version 568198 (0.0020) [2024-04-28 03:53:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61713.0, 300 sec: 61592.8). Total num frames: 9309372416. Throughput: 0: 61291.6. Samples: 2214556320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:19,255][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 03:53:20,224][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23500 times) [2024-04-28 03:53:21,590][54818] Updated weights for policy 0, policy_version 568208 (0.0017) [2024-04-28 03:53:24,018][54818] Updated weights for policy 0, policy_version 568218 (0.0016) [2024-04-28 03:53:24,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.0, 300 sec: 61648.3). Total num frames: 9309683712. 
Throughput: 0: 61371.6. Samples: 2214919600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:24,255][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 03:53:27,093][54818] Updated weights for policy 0, policy_version 568228 (0.0016) [2024-04-28 03:53:29,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61713.2, 300 sec: 61648.3). Total num frames: 9309995008. Throughput: 0: 61446.6. Samples: 2215108740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:29,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:53:29,699][54818] Updated weights for policy 0, policy_version 568238 (0.0017) [2024-04-28 03:53:32,437][54818] Updated weights for policy 0, policy_version 568248 (0.0018) [2024-04-28 03:53:34,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61713.0, 300 sec: 61592.7). Total num frames: 9310289920. Throughput: 0: 61479.3. Samples: 2215481040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:34,256][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 03:53:35,113][54818] Updated weights for policy 0, policy_version 568258 (0.0017) [2024-04-28 03:53:37,835][54818] Updated weights for policy 0, policy_version 568268 (0.0020) [2024-04-28 03:53:39,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61713.0, 300 sec: 61648.2). Total num frames: 9310601216. Throughput: 0: 61158.2. Samples: 2215835760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:39,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 03:53:40,533][54818] Updated weights for policy 0, policy_version 568278 (0.0014) [2024-04-28 03:53:40,892][54798] Signal inference workers to stop experience collection... (36600 times) [2024-04-28 03:53:40,898][54798] Signal inference workers to resume experience collection... 
(36600 times) [2024-04-28 03:53:40,900][54818] InferenceWorker_p0-w0: stopping experience collection (36600 times) [2024-04-28 03:53:40,909][54818] InferenceWorker_p0-w0: resuming experience collection (36600 times) [2024-04-28 03:53:42,981][54818] Updated weights for policy 0, policy_version 568288 (0.0015) [2024-04-28 03:53:44,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61713.0, 300 sec: 61703.8). Total num frames: 9310912512. Throughput: 0: 61206.4. Samples: 2216027560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:44,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 03:53:45,839][54818] Updated weights for policy 0, policy_version 568298 (0.0016) [2024-04-28 03:53:47,086][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23600 times) [2024-04-28 03:53:48,080][54818] Updated weights for policy 0, policy_version 568308 (0.0019) [2024-04-28 03:53:49,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.0, 300 sec: 61592.7). Total num frames: 9311207424. Throughput: 0: 61400.2. Samples: 2216401400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:49,253][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 03:53:49,258][54587] No heartbeat for components: RolloutWorker_w4 (23377 seconds), RolloutWorker_w5 (9477 seconds) [2024-04-28 03:53:49,297][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000568312_9311223808.pth... [2024-04-28 03:53:49,354][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000567410_9296445440.pth [2024-04-28 03:53:51,015][54818] Updated weights for policy 0, policy_version 568318 (0.0017) [2024-04-28 03:53:53,393][54818] Updated weights for policy 0, policy_version 568328 (0.0017) [2024-04-28 03:53:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61648.3). Total num frames: 9311518720. Throughput: 0: 61189.8. Samples: 2216753920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:54,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 03:53:56,353][54818] Updated weights for policy 0, policy_version 568338 (0.0023) [2024-04-28 03:53:58,679][54818] Updated weights for policy 0, policy_version 568348 (0.0019) [2024-04-28 03:53:59,253][54587] Fps is (10 sec: 60619.6, 60 sec: 61166.9, 300 sec: 61592.7). Total num frames: 9311813632. Throughput: 0: 61173.6. Samples: 2216947400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:53:59,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 03:54:01,959][54818] Updated weights for policy 0, policy_version 568358 (0.0017) [2024-04-28 03:54:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61592.7). Total num frames: 9312124928. Throughput: 0: 61182.1. Samples: 2217309520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:04,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:54:04,515][54818] Updated weights for policy 0, policy_version 568368 (0.0016) [2024-04-28 03:54:07,261][54818] Updated weights for policy 0, policy_version 568378 (0.0018) [2024-04-28 03:54:09,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61167.0, 300 sec: 61592.8). Total num frames: 9312436224. Throughput: 0: 61135.6. Samples: 2217670700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:09,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-28 03:54:09,871][54818] Updated weights for policy 0, policy_version 568388 (0.0016) [2024-04-28 03:54:12,694][54818] Updated weights for policy 0, policy_version 568398 (0.0018) [2024-04-28 03:54:13,532][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23700 times) [2024-04-28 03:54:14,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.0, 300 sec: 61592.7). Total num frames: 9312747520. Throughput: 0: 61091.9. Samples: 2217857880. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:54:15,482][54818] Updated weights for policy 0, policy_version 568408 (0.0016) [2024-04-28 03:54:17,830][54818] Updated weights for policy 0, policy_version 568418 (0.0016) [2024-04-28 03:54:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61537.2). Total num frames: 9313042432. Throughput: 0: 61041.0. Samples: 2218227880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:19,255][54587] Avg episode reward: [(0, '0.669')] [2024-04-28 03:54:20,856][54818] Updated weights for policy 0, policy_version 568428 (0.0016) [2024-04-28 03:54:22,971][54818] Updated weights for policy 0, policy_version 568438 (0.0016) [2024-04-28 03:54:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61592.7). Total num frames: 9313353728. Throughput: 0: 61214.3. Samples: 2218590400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:24,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 03:54:26,114][54818] Updated weights for policy 0, policy_version 568448 (0.0015) [2024-04-28 03:54:27,684][54798] Signal inference workers to stop experience collection... (36650 times) [2024-04-28 03:54:27,685][54798] Signal inference workers to resume experience collection... (36650 times) [2024-04-28 03:54:27,699][54818] InferenceWorker_p0-w0: stopping experience collection (36650 times) [2024-04-28 03:54:27,700][54818] InferenceWorker_p0-w0: resuming experience collection (36650 times) [2024-04-28 03:54:28,477][54818] Updated weights for policy 0, policy_version 568458 (0.0016) [2024-04-28 03:54:29,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61592.7). Total num frames: 9313665024. Throughput: 0: 61103.9. Samples: 2218777240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:29,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 03:54:31,339][54818] Updated weights for policy 0, policy_version 568468 (0.0017) [2024-04-28 03:54:33,685][54818] Updated weights for policy 0, policy_version 568478 (0.0015) [2024-04-28 03:54:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61481.6). Total num frames: 9313959936. Throughput: 0: 60813.2. Samples: 2219138000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:34,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 03:54:36,653][54818] Updated weights for policy 0, policy_version 568488 (0.0018) [2024-04-28 03:54:39,247][54818] Updated weights for policy 0, policy_version 568498 (0.0016) [2024-04-28 03:54:39,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.1, 300 sec: 61481.7). Total num frames: 9314271232. Throughput: 0: 61176.9. Samples: 2219506880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:39,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 03:54:40,744][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23800 times) [2024-04-28 03:54:42,052][54818] Updated weights for policy 0, policy_version 568508 (0.0017) [2024-04-28 03:54:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.8, 300 sec: 61481.6). Total num frames: 9314566144. Throughput: 0: 60999.3. Samples: 2219692360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:44,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-28 03:54:44,735][54818] Updated weights for policy 0, policy_version 568518 (0.0017) [2024-04-28 03:54:47,376][54818] Updated weights for policy 0, policy_version 568528 (0.0020) [2024-04-28 03:54:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61481.7). Total num frames: 9314877440. Throughput: 0: 61043.7. Samples: 2220056480. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 03:54:49,255][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:54:50,341][54818] Updated weights for policy 0, policy_version 568538 (0.0015) [2024-04-28 03:54:52,553][54818] Updated weights for policy 0, policy_version 568548 (0.0017) [2024-04-28 03:54:54,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61167.0, 300 sec: 61481.7). Total num frames: 9315188736. Throughput: 0: 61184.0. Samples: 2220423980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:54:54,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 03:54:55,720][54818] Updated weights for policy 0, policy_version 568558 (0.0017) [2024-04-28 03:54:57,790][54818] Updated weights for policy 0, policy_version 568568 (0.0016) [2024-04-28 03:54:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.2, 300 sec: 61481.7). Total num frames: 9315483648. Throughput: 0: 61327.7. Samples: 2220617620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:54:59,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 03:55:01,063][54818] Updated weights for policy 0, policy_version 568578 (0.0015) [2024-04-28 03:55:02,989][54818] Updated weights for policy 0, policy_version 568588 (0.0017) [2024-04-28 03:55:04,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.9, 300 sec: 61537.2). Total num frames: 9315794944. Throughput: 0: 61029.7. Samples: 2220974220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:04,255][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 03:55:06,436][54818] Updated weights for policy 0, policy_version 568598 (0.0016) [2024-04-28 03:55:07,146][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (23900 times) [2024-04-28 03:55:08,293][54818] Updated weights for policy 0, policy_version 568608 (0.0016) [2024-04-28 03:55:09,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61166.9, 300 sec: 61481.6). Total num frames: 9316106240. 
Throughput: 0: 61211.1. Samples: 2221344900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:09,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 03:55:11,655][54818] Updated weights for policy 0, policy_version 568618 (0.0016) [2024-04-28 03:55:14,062][54818] Updated weights for policy 0, policy_version 568628 (0.0016) [2024-04-28 03:55:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60893.9, 300 sec: 61426.1). Total num frames: 9316401152. Throughput: 0: 61327.1. Samples: 2221536960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:14,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 03:55:16,869][54818] Updated weights for policy 0, policy_version 568638 (0.0017) [2024-04-28 03:55:17,336][54798] Signal inference workers to stop experience collection... (36700 times) [2024-04-28 03:55:17,354][54818] InferenceWorker_p0-w0: stopping experience collection (36700 times) [2024-04-28 03:55:17,393][54798] Signal inference workers to resume experience collection... (36700 times) [2024-04-28 03:55:17,393][54818] InferenceWorker_p0-w0: resuming experience collection (36700 times) [2024-04-28 03:55:19,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61481.7). Total num frames: 9316712448. Throughput: 0: 61264.1. Samples: 2221894880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:19,253][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 03:55:19,320][54818] Updated weights for policy 0, policy_version 568648 (0.0016) [2024-04-28 03:55:22,124][54818] Updated weights for policy 0, policy_version 568658 (0.0016) [2024-04-28 03:55:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.8, 300 sec: 61426.1). Total num frames: 9317007360. Throughput: 0: 61243.8. Samples: 2222262860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:24,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 03:55:24,836][54818] Updated weights for policy 0, policy_version 568668 (0.0016) [2024-04-28 03:55:27,477][54818] Updated weights for policy 0, policy_version 568678 (0.0017) [2024-04-28 03:55:29,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60893.9, 300 sec: 61426.1). Total num frames: 9317318656. Throughput: 0: 61446.3. Samples: 2222457440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:29,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 03:55:30,357][54818] Updated weights for policy 0, policy_version 568688 (0.0017) [2024-04-28 03:55:32,577][54818] Updated weights for policy 0, policy_version 568698 (0.0016) [2024-04-28 03:55:33,652][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24000 times) [2024-04-28 03:55:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61481.6). Total num frames: 9317629952. Throughput: 0: 61381.2. Samples: 2222818640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:34,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 03:55:35,817][54818] Updated weights for policy 0, policy_version 568708 (0.0015) [2024-04-28 03:55:37,793][54818] Updated weights for policy 0, policy_version 568718 (0.0015) [2024-04-28 03:55:39,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61481.7). Total num frames: 9317941248. Throughput: 0: 61202.6. Samples: 2223178100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:39,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 03:55:41,107][54818] Updated weights for policy 0, policy_version 568728 (0.0016) [2024-04-28 03:55:43,035][54818] Updated weights for policy 0, policy_version 568738 (0.0020) [2024-04-28 03:55:44,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9318252544. 
Throughput: 0: 61338.9. Samples: 2223377880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:44,254][54587] Avg episode reward: [(0, '0.689')] [2024-04-28 03:55:46,286][54818] Updated weights for policy 0, policy_version 568748 (0.0016) [2024-04-28 03:55:48,275][54818] Updated weights for policy 0, policy_version 568758 (0.0018) [2024-04-28 03:55:49,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61439.9, 300 sec: 61481.7). Total num frames: 9318563840. Throughput: 0: 61427.1. Samples: 2223738440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:49,256][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 03:55:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000568760_9318563840.pth... [2024-04-28 03:55:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000567861_9303834624.pth [2024-04-28 03:55:51,496][54818] Updated weights for policy 0, policy_version 568768 (0.0016) [2024-04-28 03:55:53,707][54818] Updated weights for policy 0, policy_version 568778 (0.0017) [2024-04-28 03:55:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.8, 300 sec: 61426.1). Total num frames: 9318858752. Throughput: 0: 61284.8. Samples: 2224102720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:54,255][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 03:55:57,044][54818] Updated weights for policy 0, policy_version 568788 (0.0016) [2024-04-28 03:55:57,407][54798] Signal inference workers to stop experience collection... (36750 times) [2024-04-28 03:55:57,407][54798] Signal inference workers to resume experience collection... (36750 times) [2024-04-28 03:55:57,421][54818] InferenceWorker_p0-w0: stopping experience collection (36750 times) [2024-04-28 03:55:57,421][54818] InferenceWorker_p0-w0: resuming experience collection (36750 times) [2024-04-28 03:55:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61439.9, 300 sec: 61426.1). Total num frames: 9319170048. 
Throughput: 0: 61307.5. Samples: 2224295800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:55:59,255][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 03:55:59,877][54818] Updated weights for policy 0, policy_version 568798 (0.0016) [2024-04-28 03:56:00,851][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24100 times) [2024-04-28 03:56:02,229][54818] Updated weights for policy 0, policy_version 568808 (0.0016) [2024-04-28 03:56:04,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.0, 300 sec: 61481.6). Total num frames: 9319481344. Throughput: 0: 61314.9. Samples: 2224654060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:56:04,255][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 03:56:05,080][54818] Updated weights for policy 0, policy_version 568818 (0.0016) [2024-04-28 03:56:07,407][54818] Updated weights for policy 0, policy_version 568828 (0.0019) [2024-04-28 03:56:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61481.7). Total num frames: 9319792640. Throughput: 0: 61351.2. Samples: 2225023660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:56:09,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 03:56:10,530][54818] Updated weights for policy 0, policy_version 568838 (0.0017) [2024-04-28 03:56:12,602][54818] Updated weights for policy 0, policy_version 568848 (0.0018) [2024-04-28 03:56:14,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9320087552. Throughput: 0: 61426.1. Samples: 2225221620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:56:14,254][54587] Avg episode reward: [(0, '0.610')] [2024-04-28 03:56:15,742][54818] Updated weights for policy 0, policy_version 568858 (0.0015) [2024-04-28 03:56:17,904][54818] Updated weights for policy 0, policy_version 568868 (0.0020) [2024-04-28 03:56:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61439.9, 300 sec: 61481.6). Total num frames: 9320398848. Throughput: 0: 61330.3. Samples: 2225578500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 03:56:19,255][54587] Avg episode reward: [(0, '0.687')] [2024-04-28 03:56:21,115][54818] Updated weights for policy 0, policy_version 568878 (0.0017) [2024-04-28 03:56:23,278][54818] Updated weights for policy 0, policy_version 568888 (0.0018) [2024-04-28 03:56:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9320693760. Throughput: 0: 61411.9. Samples: 2225941640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:56:24,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 03:56:26,564][54818] Updated weights for policy 0, policy_version 568898 (0.0016) [2024-04-28 03:56:27,382][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24200 times) [2024-04-28 03:56:28,515][54818] Updated weights for policy 0, policy_version 568908 (0.0017) [2024-04-28 03:56:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9321005056. Throughput: 0: 61377.0. Samples: 2226139840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:56:29,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 03:56:31,735][54818] Updated weights for policy 0, policy_version 568918 (0.0015) [2024-04-28 03:56:33,986][54818] Updated weights for policy 0, policy_version 568928 (0.0018) [2024-04-28 03:56:34,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9321316352. 
Throughput: 0: 61320.2. Samples: 2226497840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:56:34,253][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:56:36,289][54798] Signal inference workers to stop experience collection... (36800 times) [2024-04-28 03:56:36,330][54818] InferenceWorker_p0-w0: stopping experience collection (36800 times) [2024-04-28 03:56:36,346][54798] Signal inference workers to resume experience collection... (36800 times) [2024-04-28 03:56:36,346][54818] InferenceWorker_p0-w0: resuming experience collection (36800 times) [2024-04-28 03:56:36,946][54818] Updated weights for policy 0, policy_version 568938 (0.0016) [2024-04-28 03:56:39,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9321627648. Throughput: 0: 61398.5. Samples: 2226865640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:56:39,253][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 03:56:39,277][54818] Updated weights for policy 0, policy_version 568948 (0.0016) [2024-04-28 03:56:42,276][54818] Updated weights for policy 0, policy_version 568958 (0.0021) [2024-04-28 03:56:44,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9321938944. Throughput: 0: 61442.7. Samples: 2227060720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:56:44,255][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 03:56:44,750][54818] Updated weights for policy 0, policy_version 568968 (0.0016) [2024-04-28 03:56:47,417][54818] Updated weights for policy 0, policy_version 568978 (0.0017) [2024-04-28 03:56:49,253][54587] Fps is (10 sec: 60619.2, 60 sec: 61166.9, 300 sec: 61370.5). Total num frames: 9322233856. Throughput: 0: 61440.8. Samples: 2227418900. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:56:49,254][54587] Avg episode reward: [(0, '0.516')] [2024-04-28 03:56:49,262][54587] No heartbeat for components: RolloutWorker_w4 (23557 seconds), RolloutWorker_w5 (9657 seconds) [2024-04-28 03:56:50,273][54818] Updated weights for policy 0, policy_version 568988 (0.0016) [2024-04-28 03:56:52,790][54818] Updated weights for policy 0, policy_version 568998 (0.0018) [2024-04-28 03:56:53,990][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24300 times) [2024-04-28 03:56:54,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9322545152. Throughput: 0: 61412.0. Samples: 2227787200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:56:54,255][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 03:56:54,256][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (1300 times) [2024-04-28 03:56:55,998][54818] Updated weights for policy 0, policy_version 569008 (0.0017) [2024-04-28 03:56:58,227][54818] Updated weights for policy 0, policy_version 569018 (0.0017) [2024-04-28 03:56:59,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9322856448. Throughput: 0: 61198.6. Samples: 2227975560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:56:59,256][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 03:57:01,160][54818] Updated weights for policy 0, policy_version 569028 (0.0019) [2024-04-28 03:57:03,726][54818] Updated weights for policy 0, policy_version 569038 (0.0017) [2024-04-28 03:57:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9323151360. Throughput: 0: 61203.6. Samples: 2228332660. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:04,255][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 03:57:06,553][54818] Updated weights for policy 0, policy_version 569048 (0.0016) [2024-04-28 03:57:08,932][54818] Updated weights for policy 0, policy_version 569058 (0.0016) [2024-04-28 03:57:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61166.9, 300 sec: 61370.5). Total num frames: 9323462656. Throughput: 0: 61405.7. Samples: 2228704900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:09,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:57:11,847][54818] Updated weights for policy 0, policy_version 569068 (0.0015) [2024-04-28 03:57:14,191][54818] Updated weights for policy 0, policy_version 569078 (0.0016) [2024-04-28 03:57:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9323773952. Throughput: 0: 61233.4. Samples: 2228895340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:14,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 03:57:17,139][54818] Updated weights for policy 0, policy_version 569088 (0.0021) [2024-04-28 03:57:18,942][54798] Signal inference workers to stop experience collection... (36850 times) [2024-04-28 03:57:18,943][54798] Signal inference workers to resume experience collection... (36850 times) [2024-04-28 03:57:18,967][54818] InferenceWorker_p0-w0: stopping experience collection (36850 times) [2024-04-28 03:57:18,967][54818] InferenceWorker_p0-w0: resuming experience collection (36850 times) [2024-04-28 03:57:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9324068864. Throughput: 0: 61318.5. Samples: 2229257180. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:19,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 03:57:19,412][54818] Updated weights for policy 0, policy_version 569098 (0.0018) [2024-04-28 03:57:21,189][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24400 times) [2024-04-28 03:57:22,527][54818] Updated weights for policy 0, policy_version 569108 (0.0016) [2024-04-28 03:57:24,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61713.0, 300 sec: 61370.6). Total num frames: 9324396544. Throughput: 0: 61344.6. Samples: 2229626160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:24,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 03:57:24,616][54818] Updated weights for policy 0, policy_version 569118 (0.0017) [2024-04-28 03:57:27,847][54818] Updated weights for policy 0, policy_version 569128 (0.0018) [2024-04-28 03:57:29,253][54587] Fps is (10 sec: 62260.1, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9324691456. Throughput: 0: 61157.1. Samples: 2229812780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:29,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 03:57:29,984][54818] Updated weights for policy 0, policy_version 569138 (0.0018) [2024-04-28 03:57:32,964][54818] Updated weights for policy 0, policy_version 569148 (0.0018) [2024-04-28 03:57:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61439.8, 300 sec: 61370.6). Total num frames: 9325002752. Throughput: 0: 61420.1. Samples: 2230182800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:34,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 03:57:35,678][54818] Updated weights for policy 0, policy_version 569158 (0.0020) [2024-04-28 03:57:38,373][54818] Updated weights for policy 0, policy_version 569168 (0.0016) [2024-04-28 03:57:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9325314048. 
Throughput: 0: 61408.9. Samples: 2230550600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:39,255][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 03:57:40,903][54818] Updated weights for policy 0, policy_version 569178 (0.0016) [2024-04-28 03:57:43,640][54818] Updated weights for policy 0, policy_version 569188 (0.0016) [2024-04-28 03:57:44,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9325625344. Throughput: 0: 61268.1. Samples: 2230732620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:44,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 03:57:46,478][54818] Updated weights for policy 0, policy_version 569198 (0.0016) [2024-04-28 03:57:47,591][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24500 times) [2024-04-28 03:57:48,983][54818] Updated weights for policy 0, policy_version 569208 (0.0018) [2024-04-28 03:57:49,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61440.3, 300 sec: 61315.1). Total num frames: 9325920256. Throughput: 0: 61598.4. Samples: 2231104580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:49,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 03:57:49,324][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000569210_9325936640.pth... [2024-04-28 03:57:49,379][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000568312_9311223808.pth [2024-04-28 03:57:51,915][54818] Updated weights for policy 0, policy_version 569218 (0.0018) [2024-04-28 03:57:54,168][54818] Updated weights for policy 0, policy_version 569228 (0.0019) [2024-04-28 03:57:54,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9326231552. Throughput: 0: 61406.3. Samples: 2231468180. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 19.0) [2024-04-28 03:57:54,254][54587] Avg episode reward: [(0, '0.515')] [2024-04-28 03:57:57,434][54818] Updated weights for policy 0, policy_version 569238 (0.0017) [2024-04-28 03:57:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9326526464. Throughput: 0: 61406.6. Samples: 2231658640. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:57:59,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:57:59,664][54818] Updated weights for policy 0, policy_version 569248 (0.0018) [2024-04-28 03:58:02,665][54818] Updated weights for policy 0, policy_version 569258 (0.0018) [2024-04-28 03:58:04,069][54798] Signal inference workers to stop experience collection... (36900 times) [2024-04-28 03:58:04,071][54798] Signal inference workers to resume experience collection... (36900 times) [2024-04-28 03:58:04,085][54818] InferenceWorker_p0-w0: stopping experience collection (36900 times) [2024-04-28 03:58:04,085][54818] InferenceWorker_p0-w0: resuming experience collection (36900 times) [2024-04-28 03:58:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9326837760. Throughput: 0: 61495.6. Samples: 2232024480. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:04,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 03:58:04,887][54818] Updated weights for policy 0, policy_version 569268 (0.0015) [2024-04-28 03:58:07,790][54818] Updated weights for policy 0, policy_version 569278 (0.0015) [2024-04-28 03:58:09,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9327149056. Throughput: 0: 61469.8. Samples: 2232392300. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:09,255][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 03:58:10,275][54818] Updated weights for policy 0, policy_version 569288 (0.0016) [2024-04-28 03:58:12,932][54818] Updated weights for policy 0, policy_version 569298 (0.0017) [2024-04-28 03:58:14,213][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24600 times) [2024-04-28 03:58:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9327443968. Throughput: 0: 61421.7. Samples: 2232576760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:14,253][54587] Avg episode reward: [(0, '0.618')] [2024-04-28 03:58:15,524][54818] Updated weights for policy 0, policy_version 569308 (0.0016) [2024-04-28 03:58:18,356][54818] Updated weights for policy 0, policy_version 569318 (0.0015) [2024-04-28 03:58:19,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9327755264. Throughput: 0: 61479.4. Samples: 2232949360. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:19,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 03:58:20,886][54818] Updated weights for policy 0, policy_version 569328 (0.0017) [2024-04-28 03:58:23,573][54818] Updated weights for policy 0, policy_version 569338 (0.0015) [2024-04-28 03:58:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9328050176. Throughput: 0: 61291.7. Samples: 2233308720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:24,253][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 03:58:26,597][54818] Updated weights for policy 0, policy_version 569348 (0.0016) [2024-04-28 03:58:28,933][54818] Updated weights for policy 0, policy_version 569358 (0.0016) [2024-04-28 03:58:29,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9328361472. 
Throughput: 0: 61538.2. Samples: 2233501840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:29,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 03:58:31,773][54818] Updated weights for policy 0, policy_version 569368 (0.0015) [2024-04-28 03:58:34,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9328672768. Throughput: 0: 61465.2. Samples: 2233870520. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:34,254][54587] Avg episode reward: [(0, '0.499')] [2024-04-28 03:58:34,480][54818] Updated weights for policy 0, policy_version 569378 (0.0019) [2024-04-28 03:58:37,346][54818] Updated weights for policy 0, policy_version 569388 (0.0016) [2024-04-28 03:58:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9328984064. Throughput: 0: 61398.0. Samples: 2234231100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:39,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 03:58:39,620][54818] Updated weights for policy 0, policy_version 569398 (0.0017) [2024-04-28 03:58:40,925][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24700 times) [2024-04-28 03:58:42,709][54818] Updated weights for policy 0, policy_version 569408 (0.0017) [2024-04-28 03:58:44,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9329295360. Throughput: 0: 61400.0. Samples: 2234421640. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:44,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 03:58:44,906][54818] Updated weights for policy 0, policy_version 569418 (0.0017) [2024-04-28 03:58:47,864][54818] Updated weights for policy 0, policy_version 569428 (0.0015) [2024-04-28 03:58:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61439.8, 300 sec: 61315.0). Total num frames: 9329606656. Throughput: 0: 61549.2. Samples: 2234794200. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:49,254][54587] Avg episode reward: [(0, '0.506')] [2024-04-28 03:58:50,319][54818] Updated weights for policy 0, policy_version 569438 (0.0016) [2024-04-28 03:58:53,085][54818] Updated weights for policy 0, policy_version 569448 (0.0018) [2024-04-28 03:58:53,237][54798] Signal inference workers to stop experience collection... (36950 times) [2024-04-28 03:58:53,238][54798] Signal inference workers to resume experience collection... (36950 times) [2024-04-28 03:58:53,254][54818] InferenceWorker_p0-w0: stopping experience collection (36950 times) [2024-04-28 03:58:53,254][54818] InferenceWorker_p0-w0: resuming experience collection (36950 times) [2024-04-28 03:58:54,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9329917952. Throughput: 0: 61299.6. Samples: 2235150780. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:54,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 03:58:55,388][54818] Updated weights for policy 0, policy_version 569458 (0.0017) [2024-04-28 03:58:58,267][54818] Updated weights for policy 0, policy_version 569468 (0.0019) [2024-04-28 03:58:59,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61713.1, 300 sec: 61370.6). Total num frames: 9330229248. Throughput: 0: 61458.2. Samples: 2235342380. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:58:59,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 03:59:01,174][54818] Updated weights for policy 0, policy_version 569478 (0.0016) [2024-04-28 03:59:03,440][54818] Updated weights for policy 0, policy_version 569488 (0.0017) [2024-04-28 03:59:04,255][54587] Fps is (10 sec: 60613.7, 60 sec: 61438.8, 300 sec: 61314.8). Total num frames: 9330524160. Throughput: 0: 61431.6. Samples: 2235713860. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:59:04,255][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 03:59:06,484][54818] Updated weights for policy 0, policy_version 569498 (0.0017) [2024-04-28 03:59:07,830][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24800 times) [2024-04-28 03:59:08,891][54818] Updated weights for policy 0, policy_version 569508 (0.0018) [2024-04-28 03:59:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9330835456. Throughput: 0: 61369.2. Samples: 2236070340. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:59:09,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 03:59:11,904][54818] Updated weights for policy 0, policy_version 569518 (0.0017) [2024-04-28 03:59:14,225][54818] Updated weights for policy 0, policy_version 569528 (0.0017) [2024-04-28 03:59:14,253][54587] Fps is (10 sec: 62266.5, 60 sec: 61713.0, 300 sec: 61370.6). Total num frames: 9331146752. Throughput: 0: 61365.4. Samples: 2236263280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:59:14,255][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 03:59:17,298][54818] Updated weights for policy 0, policy_version 569538 (0.0017) [2024-04-28 03:59:19,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61713.0, 300 sec: 61370.6). Total num frames: 9331458048. Throughput: 0: 61386.7. Samples: 2236632920. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:59:19,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 03:59:19,581][54818] Updated weights for policy 0, policy_version 569548 (0.0022) [2024-04-28 03:59:22,789][54818] Updated weights for policy 0, policy_version 569558 (0.0018) [2024-04-28 03:59:24,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9331752960. Throughput: 0: 61212.6. Samples: 2236985660. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 03:59:24,255][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 03:59:25,090][54818] Updated weights for policy 0, policy_version 569568 (0.0018) [2024-04-28 03:59:28,051][54818] Updated weights for policy 0, policy_version 569578 (0.0016) [2024-04-28 03:59:29,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61713.2, 300 sec: 61370.6). Total num frames: 9332064256. Throughput: 0: 61228.1. Samples: 2237176900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:59:29,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 03:59:30,599][54818] Updated weights for policy 0, policy_version 569588 (0.0019) [2024-04-28 03:59:33,274][54818] Updated weights for policy 0, policy_version 569598 (0.0018) [2024-04-28 03:59:34,143][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (24900 times) [2024-04-28 03:59:34,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9332359168. Throughput: 0: 61176.8. Samples: 2237547140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:59:34,253][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 03:59:35,898][54818] Updated weights for policy 0, policy_version 569608 (0.0018) [2024-04-28 03:59:38,608][54818] Updated weights for policy 0, policy_version 569618 (0.0019) [2024-04-28 03:59:39,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9332670464. Throughput: 0: 61379.5. Samples: 2237912860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:59:39,255][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 03:59:41,333][54818] Updated weights for policy 0, policy_version 569628 (0.0015) [2024-04-28 03:59:43,896][54818] Updated weights for policy 0, policy_version 569638 (0.0017) [2024-04-28 03:59:44,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9332981760. 
Throughput: 0: 61097.4. Samples: 2238091760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:59:44,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 03:59:46,466][54818] Updated weights for policy 0, policy_version 569648 (0.0019) [2024-04-28 03:59:47,927][54798] Signal inference workers to stop experience collection... (37000 times) [2024-04-28 03:59:47,958][54818] InferenceWorker_p0-w0: stopping experience collection (37000 times) [2024-04-28 03:59:47,986][54798] Signal inference workers to resume experience collection... (37000 times) [2024-04-28 03:59:47,987][54818] InferenceWorker_p0-w0: resuming experience collection (37000 times) [2024-04-28 03:59:49,095][54818] Updated weights for policy 0, policy_version 569658 (0.0016) [2024-04-28 03:59:49,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61440.2, 300 sec: 61370.6). Total num frames: 9333293056. Throughput: 0: 61107.1. Samples: 2238463600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:59:49,253][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 03:59:49,258][54587] No heartbeat for components: RolloutWorker_w4 (23737 seconds), RolloutWorker_w5 (9837 seconds) [2024-04-28 03:59:49,346][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000569660_9333309440.pth... [2024-04-28 03:59:49,401][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000568760_9318563840.pth [2024-04-28 03:59:51,935][54818] Updated weights for policy 0, policy_version 569668 (0.0020) [2024-04-28 03:59:54,220][54818] Updated weights for policy 0, policy_version 569678 (0.0017) [2024-04-28 03:59:54,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.9, 300 sec: 61426.1). Total num frames: 9333604352. Throughput: 0: 61363.9. Samples: 2238831720. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:59:54,255][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 03:59:57,305][54818] Updated weights for policy 0, policy_version 569688 (0.0016) [2024-04-28 03:59:59,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9333899264. Throughput: 0: 61033.8. Samples: 2239009800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 03:59:59,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 03:59:59,595][54818] Updated weights for policy 0, policy_version 569698 (0.0017) [2024-04-28 04:00:01,389][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (25000 times) [2024-04-28 04:00:02,788][54818] Updated weights for policy 0, policy_version 569708 (0.0021) [2024-04-28 04:00:04,253][54587] Fps is (10 sec: 58982.9, 60 sec: 61168.2, 300 sec: 61315.0). Total num frames: 9334194176. Throughput: 0: 61032.5. Samples: 2239379380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:04,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 04:00:05,358][54818] Updated weights for policy 0, policy_version 569718 (0.0018) [2024-04-28 04:00:08,112][54818] Updated weights for policy 0, policy_version 569728 (0.0015) [2024-04-28 04:00:09,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.8, 300 sec: 61370.6). Total num frames: 9334505472. Throughput: 0: 61264.4. Samples: 2239742560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:09,255][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 04:00:10,901][54818] Updated weights for policy 0, policy_version 569738 (0.0016) [2024-04-28 04:00:13,398][54818] Updated weights for policy 0, policy_version 569748 (0.0015) [2024-04-28 04:00:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9334816768. Throughput: 0: 61148.9. Samples: 2239928600. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:14,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 04:00:16,172][54818] Updated weights for policy 0, policy_version 569758 (0.0015) [2024-04-28 04:00:18,490][54818] Updated weights for policy 0, policy_version 569768 (0.0017) [2024-04-28 04:00:19,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.9, 300 sec: 61370.6). Total num frames: 9335111680. Throughput: 0: 61312.4. Samples: 2240306200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:19,254][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 04:00:21,366][54818] Updated weights for policy 0, policy_version 569778 (0.0018) [2024-04-28 04:00:23,848][54818] Updated weights for policy 0, policy_version 569788 (0.0018) [2024-04-28 04:00:24,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9335439360. Throughput: 0: 61109.4. Samples: 2240662780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:24,255][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 04:00:26,765][54818] Updated weights for policy 0, policy_version 569798 (0.0017) [2024-04-28 04:00:27,964][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (25100 times) [2024-04-28 04:00:29,093][54818] Updated weights for policy 0, policy_version 569808 (0.0016) [2024-04-28 04:00:29,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9335734272. Throughput: 0: 61365.8. Samples: 2240853220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:29,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 04:00:31,867][54818] Updated weights for policy 0, policy_version 569818 (0.0016) [2024-04-28 04:00:34,044][54798] Signal inference workers to stop experience collection... 
(37050 times) [2024-04-28 04:00:34,069][54818] InferenceWorker_p0-w0: stopping experience collection (37050 times) [2024-04-28 04:00:34,105][54798] Signal inference workers to resume experience collection... (37050 times) [2024-04-28 04:00:34,105][54818] InferenceWorker_p0-w0: resuming experience collection (37050 times) [2024-04-28 04:00:34,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61439.8, 300 sec: 61370.6). Total num frames: 9336045568. Throughput: 0: 61282.0. Samples: 2241221300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:34,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 04:00:34,331][54818] Updated weights for policy 0, policy_version 569828 (0.0016) [2024-04-28 04:00:37,129][54818] Updated weights for policy 0, policy_version 569838 (0.0017) [2024-04-28 04:00:39,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9336340480. Throughput: 0: 61332.0. Samples: 2241591660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:39,255][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 04:00:39,883][54818] Updated weights for policy 0, policy_version 569848 (0.0017) [2024-04-28 04:00:42,628][54818] Updated weights for policy 0, policy_version 569858 (0.0019) [2024-04-28 04:00:44,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9336651776. Throughput: 0: 61458.4. Samples: 2241775420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:44,253][54587] Avg episode reward: [(0, '0.514')] [2024-04-28 04:00:45,394][54818] Updated weights for policy 0, policy_version 569868 (0.0016) [2024-04-28 04:00:47,809][54818] Updated weights for policy 0, policy_version 569878 (0.0016) [2024-04-28 04:00:49,253][54587] Fps is (10 sec: 62260.1, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9336963072. Throughput: 0: 61521.0. Samples: 2242147820. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:49,253][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 04:00:50,590][54818] Updated weights for policy 0, policy_version 569888 (0.0017) [2024-04-28 04:00:53,092][54818] Updated weights for policy 0, policy_version 569898 (0.0016) [2024-04-28 04:00:54,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9337274368. Throughput: 0: 61684.0. Samples: 2242518340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:00:54,255][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 04:00:54,256][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (25200 times) [2024-04-28 04:00:55,864][54818] Updated weights for policy 0, policy_version 569908 (0.0017) [2024-04-28 04:00:58,498][54818] Updated weights for policy 0, policy_version 569918 (0.0019) [2024-04-28 04:00:59,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9337569280. Throughput: 0: 61571.5. Samples: 2242699320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:00:59,255][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 04:01:01,117][54818] Updated weights for policy 0, policy_version 569928 (0.0015) [2024-04-28 04:01:03,988][54818] Updated weights for policy 0, policy_version 569938 (0.0017) [2024-04-28 04:01:04,253][54587] Fps is (10 sec: 58982.4, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9337864192. Throughput: 0: 61270.9. Samples: 2243063400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:04,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 04:01:06,465][54818] Updated weights for policy 0, policy_version 569948 (0.0016) [2024-04-28 04:01:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.1, 300 sec: 61315.1). Total num frames: 9338175488. Throughput: 0: 61666.4. Samples: 2243437760. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:09,253][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 04:01:09,309][54818] Updated weights for policy 0, policy_version 569958 (0.0017) [2024-04-28 04:01:11,836][54818] Updated weights for policy 0, policy_version 569968 (0.0016) [2024-04-28 04:01:14,253][54587] Fps is (10 sec: 62260.1, 60 sec: 61166.9, 300 sec: 61315.1). Total num frames: 9338486784. Throughput: 0: 61481.8. Samples: 2243619900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:14,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 04:01:14,584][54818] Updated weights for policy 0, policy_version 569978 (0.0018) [2024-04-28 04:01:15,500][54798] Signal inference workers to stop experience collection... (37100 times) [2024-04-28 04:01:15,502][54798] Signal inference workers to resume experience collection... (37100 times) [2024-04-28 04:01:15,508][54818] InferenceWorker_p0-w0: stopping experience collection (37100 times) [2024-04-28 04:01:15,518][54818] InferenceWorker_p0-w0: resuming experience collection (37100 times) [2024-04-28 04:01:17,030][54818] Updated weights for policy 0, policy_version 569988 (0.0020) [2024-04-28 04:01:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9338781696. Throughput: 0: 61398.9. Samples: 2243984240. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:19,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:01:20,091][54818] Updated weights for policy 0, policy_version 569998 (0.0016) [2024-04-28 04:01:21,045][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (25300 times) [2024-04-28 04:01:22,444][54818] Updated weights for policy 0, policy_version 570008 (0.0016) [2024-04-28 04:01:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.9, 300 sec: 61315.0). Total num frames: 9339092992. Throughput: 0: 61501.0. Samples: 2244359200. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:24,255][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 04:01:25,380][54818] Updated weights for policy 0, policy_version 570018 (0.0018) [2024-04-28 04:01:28,086][54818] Updated weights for policy 0, policy_version 570028 (0.0016) [2024-04-28 04:01:29,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61166.8, 300 sec: 61315.0). Total num frames: 9339404288. Throughput: 0: 61378.9. Samples: 2244537480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:29,255][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:01:30,727][54818] Updated weights for policy 0, policy_version 570038 (0.0015) [2024-04-28 04:01:33,309][54818] Updated weights for policy 0, policy_version 570048 (0.0017) [2024-04-28 04:01:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 9339699200. Throughput: 0: 61307.5. Samples: 2244906660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:34,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 04:01:35,929][54818] Updated weights for policy 0, policy_version 570058 (0.0016) [2024-04-28 04:01:38,805][54818] Updated weights for policy 0, policy_version 570068 (0.0016) [2024-04-28 04:01:39,253][54587] Fps is (10 sec: 58983.0, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9339994112. Throughput: 0: 61379.7. Samples: 2245280420. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:39,254][54587] Avg episode reward: [(0, '0.692')] [2024-04-28 04:01:41,077][54818] Updated weights for policy 0, policy_version 570078 (0.0017) [2024-04-28 04:01:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9340305408. Throughput: 0: 61167.6. Samples: 2245451860. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:44,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 04:01:44,415][54818] Updated weights for policy 0, policy_version 570088 (0.0016) [2024-04-28 04:01:46,378][54818] Updated weights for policy 0, policy_version 570098 (0.0017) [2024-04-28 04:01:47,783][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (25400 times) [2024-04-28 04:01:49,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9340616704. Throughput: 0: 61351.8. Samples: 2245824220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:49,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 04:01:49,347][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000570107_9340633088.pth... [2024-04-28 04:01:49,402][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000569210_9325936640.pth [2024-04-28 04:01:49,573][54818] Updated weights for policy 0, policy_version 570108 (0.0017) [2024-04-28 04:01:51,600][54818] Updated weights for policy 0, policy_version 570118 (0.0017) [2024-04-28 04:01:54,253][54587] Fps is (10 sec: 62258.5, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9340928000. Throughput: 0: 61449.5. Samples: 2246203000. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:54,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 04:01:54,853][54818] Updated weights for policy 0, policy_version 570128 (0.0017) [2024-04-28 04:01:57,007][54818] Updated weights for policy 0, policy_version 570138 (0.0018) [2024-04-28 04:01:59,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.8, 300 sec: 61259.5). Total num frames: 9341222912. Throughput: 0: 61178.6. Samples: 2246372940. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:01:59,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 04:02:00,173][54818] Updated weights for policy 0, policy_version 570148 (0.0015) [2024-04-28 04:02:02,523][54818] Updated weights for policy 0, policy_version 570158 (0.0017) [2024-04-28 04:02:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9341534208. Throughput: 0: 61406.4. Samples: 2246747540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:02:04,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 04:02:05,138][54798] Signal inference workers to stop experience collection... (37150 times) [2024-04-28 04:02:05,179][54818] InferenceWorker_p0-w0: stopping experience collection (37150 times) [2024-04-28 04:02:05,191][54798] Signal inference workers to resume experience collection... (37150 times) [2024-04-28 04:02:05,195][54818] InferenceWorker_p0-w0: resuming experience collection (37150 times) [2024-04-28 04:02:05,302][54818] Updated weights for policy 0, policy_version 570168 (0.0018) [2024-04-28 04:02:07,939][54818] Updated weights for policy 0, policy_version 570178 (0.0017) [2024-04-28 04:02:09,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9341845504. Throughput: 0: 61530.6. Samples: 2247128080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:02:09,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 04:02:10,503][54818] Updated weights for policy 0, policy_version 570188 (0.0019) [2024-04-28 04:02:13,745][54818] Updated weights for policy 0, policy_version 570198 (0.0016) [2024-04-28 04:02:14,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.8, 300 sec: 61315.0). Total num frames: 9342156800. Throughput: 0: 61186.2. Samples: 2247290860. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:02:14,254][54587] Avg episode reward: [(0, '0.463')] [2024-04-28 04:02:14,832][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (25500 times) [2024-04-28 04:02:15,778][54818] Updated weights for policy 0, policy_version 570208 (0.0017) [2024-04-28 04:02:19,083][54818] Updated weights for policy 0, policy_version 570218 (0.0016) [2024-04-28 04:02:19,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.7, 300 sec: 61204.0). Total num frames: 9342451712. Throughput: 0: 61350.9. Samples: 2247667460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:02:19,255][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 04:02:20,993][54818] Updated weights for policy 0, policy_version 570228 (0.0018) [2024-04-28 04:02:24,253][54587] Fps is (10 sec: 58983.6, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9342746624. Throughput: 0: 61354.3. Samples: 2248041360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:02:24,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 04:02:24,386][54818] Updated weights for policy 0, policy_version 570238 (0.0018) [2024-04-28 04:02:26,264][54818] Updated weights for policy 0, policy_version 570248 (0.0018) [2024-04-28 04:02:29,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9343074304. Throughput: 0: 61414.0. Samples: 2248215500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 20.0) [2024-04-28 04:02:29,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 04:02:29,759][54818] Updated weights for policy 0, policy_version 570258 (0.0016) [2024-04-28 04:02:32,202][54818] Updated weights for policy 0, policy_version 570268 (0.0016) [2024-04-28 04:02:34,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9343369216. Throughput: 0: 61321.8. Samples: 2248583700. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:02:34,253][54587] Avg episode reward: [(0, '0.513')] [2024-04-28 04:02:34,950][54818] Updated weights for policy 0, policy_version 570278 (0.0017) [2024-04-28 04:02:37,300][54818] Updated weights for policy 0, policy_version 570288 (0.0017) [2024-04-28 04:02:39,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9343680512. Throughput: 0: 61264.0. Samples: 2248959880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:02:39,255][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 04:02:40,100][54818] Updated weights for policy 0, policy_version 570298 (0.0016) [2024-04-28 04:02:41,014][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (25600 times) [2024-04-28 04:02:42,930][54818] Updated weights for policy 0, policy_version 570308 (0.0015) [2024-04-28 04:02:44,253][54587] Fps is (10 sec: 60619.8, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9343975424. Throughput: 0: 61287.5. Samples: 2249130880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:02:44,254][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 04:02:45,301][54818] Updated weights for policy 0, policy_version 570318 (0.0017) [2024-04-28 04:02:48,223][54818] Updated weights for policy 0, policy_version 570328 (0.0017) [2024-04-28 04:02:49,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9344286720. Throughput: 0: 61308.3. Samples: 2249506400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:02:49,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 04:02:49,262][54587] No heartbeat for components: RolloutWorker_w4 (23917 seconds), RolloutWorker_w5 (10017 seconds) [2024-04-28 04:02:49,380][54798] Signal inference workers to stop experience collection... 
(37200 times) [2024-04-28 04:02:49,413][54818] InferenceWorker_p0-w0: stopping experience collection (37200 times) [2024-04-28 04:02:49,432][54798] Signal inference workers to resume experience collection... (37200 times) [2024-04-28 04:02:49,433][54818] InferenceWorker_p0-w0: resuming experience collection (37200 times) [2024-04-28 04:02:50,544][54818] Updated weights for policy 0, policy_version 570338 (0.0018) [2024-04-28 04:02:53,635][54818] Updated weights for policy 0, policy_version 570348 (0.0016) [2024-04-28 04:02:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9344598016. Throughput: 0: 61092.3. Samples: 2249877240. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:02:54,255][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 04:02:55,813][54818] Updated weights for policy 0, policy_version 570358 (0.0019) [2024-04-28 04:02:59,178][54818] Updated weights for policy 0, policy_version 570368 (0.0015) [2024-04-28 04:02:59,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9344909312. Throughput: 0: 61296.5. Samples: 2250049200. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:02:59,255][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 04:03:01,236][54818] Updated weights for policy 0, policy_version 570378 (0.0016) [2024-04-28 04:03:04,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9345220608. Throughput: 0: 61270.3. Samples: 2250424620. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:04,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-28 04:03:04,529][54818] Updated weights for policy 0, policy_version 570388 (0.0016) [2024-04-28 04:03:06,510][54818] Updated weights for policy 0, policy_version 570398 (0.0015) [2024-04-28 04:03:08,220][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (25700 times) [2024-04-28 04:03:09,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9345515520. Throughput: 0: 61233.8. Samples: 2250796880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:09,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 04:03:09,723][54818] Updated weights for policy 0, policy_version 570408 (0.0017) [2024-04-28 04:03:11,872][54818] Updated weights for policy 0, policy_version 570418 (0.0017) [2024-04-28 04:03:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.1, 300 sec: 61259.5). Total num frames: 9345826816. Throughput: 0: 61184.6. Samples: 2250968800. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:14,254][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 04:03:15,076][54818] Updated weights for policy 0, policy_version 570428 (0.0017) [2024-04-28 04:03:17,623][54818] Updated weights for policy 0, policy_version 570438 (0.0017) [2024-04-28 04:03:19,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9346138112. Throughput: 0: 61243.0. Samples: 2251339640. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:19,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 04:03:20,214][54818] Updated weights for policy 0, policy_version 570448 (0.0018) [2024-04-28 04:03:23,271][54818] Updated weights for policy 0, policy_version 570458 (0.0017) [2024-04-28 04:03:23,733][54798] Signal inference workers to stop experience collection... (37250 times) [2024-04-28 04:03:23,767][54818] InferenceWorker_p0-w0: stopping experience collection (37250 times) [2024-04-28 04:03:23,787][54798] Signal inference workers to resume experience collection... (37250 times) [2024-04-28 04:03:23,788][54818] InferenceWorker_p0-w0: resuming experience collection (37250 times) [2024-04-28 04:03:24,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9346449408. 
Throughput: 0: 61310.3. Samples: 2251718840. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:24,254][54587] Avg episode reward: [(0, '0.511')] [2024-04-28 04:03:25,459][54818] Updated weights for policy 0, policy_version 570468 (0.0018) [2024-04-28 04:03:28,807][54818] Updated weights for policy 0, policy_version 570478 (0.0016) [2024-04-28 04:03:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9346744320. Throughput: 0: 61192.9. Samples: 2251884560. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:29,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 04:03:30,614][54818] Updated weights for policy 0, policy_version 570488 (0.0017) [2024-04-28 04:03:34,019][54818] Updated weights for policy 0, policy_version 570498 (0.0017) [2024-04-28 04:03:34,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9347055616. Throughput: 0: 61137.7. Samples: 2252257600. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:34,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 04:03:34,838][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (25800 times) [2024-04-28 04:03:36,087][54818] Updated weights for policy 0, policy_version 570508 (0.0016) [2024-04-28 04:03:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9347350528. Throughput: 0: 61141.0. Samples: 2252628580. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:39,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 04:03:39,510][54818] Updated weights for policy 0, policy_version 570518 (0.0019) [2024-04-28 04:03:41,765][54818] Updated weights for policy 0, policy_version 570528 (0.0017) [2024-04-28 04:03:44,253][54587] Fps is (10 sec: 58982.7, 60 sec: 61167.1, 300 sec: 61148.5). Total num frames: 9347645440. Throughput: 0: 61130.8. Samples: 2252800080. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:44,254][54587] Avg episode reward: [(0, '0.563')] [2024-04-28 04:03:44,651][54818] Updated weights for policy 0, policy_version 570538 (0.0018) [2024-04-28 04:03:47,015][54818] Updated weights for policy 0, policy_version 570548 (0.0018) [2024-04-28 04:03:49,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9347973120. Throughput: 0: 61035.1. Samples: 2253171200. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:49,254][54587] Avg episode reward: [(0, '0.496')] [2024-04-28 04:03:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000570555_9347973120.pth... [2024-04-28 04:03:49,315][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000569660_9333309440.pth [2024-04-28 04:03:49,835][54818] Updated weights for policy 0, policy_version 570558 (0.0016) [2024-04-28 04:03:52,348][54818] Updated weights for policy 0, policy_version 570568 (0.0016) [2024-04-28 04:03:54,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9348268032. Throughput: 0: 61166.0. Samples: 2253549360. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:54,254][54587] Avg episode reward: [(0, '0.523')] [2024-04-28 04:03:55,099][54818] Updated weights for policy 0, policy_version 570578 (0.0017) [2024-04-28 04:03:56,674][54798] Signal inference workers to stop experience collection... (37300 times) [2024-04-28 04:03:56,674][54798] Signal inference workers to resume experience collection... 
(37300 times) [2024-04-28 04:03:56,699][54818] InferenceWorker_p0-w0: stopping experience collection (37300 times) [2024-04-28 04:03:56,699][54818] InferenceWorker_p0-w0: resuming experience collection (37300 times) [2024-04-28 04:03:57,895][54818] Updated weights for policy 0, policy_version 570588 (0.0016) [2024-04-28 04:03:59,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.0, 300 sec: 61204.2). Total num frames: 9348579328. Throughput: 0: 61055.9. Samples: 2253716320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-04-28 04:03:59,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 04:04:00,333][54818] Updated weights for policy 0, policy_version 570598 (0.0017) [2024-04-28 04:04:01,565][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (25900 times) [2024-04-28 04:04:03,241][54818] Updated weights for policy 0, policy_version 570608 (0.0017) [2024-04-28 04:04:04,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60894.0, 300 sec: 61148.4). Total num frames: 9348874240. Throughput: 0: 61118.3. Samples: 2254089960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:04,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 04:04:05,748][54818] Updated weights for policy 0, policy_version 570618 (0.0018) [2024-04-28 04:04:08,694][54818] Updated weights for policy 0, policy_version 570628 (0.0015) [2024-04-28 04:04:09,254][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.7, 300 sec: 61203.9). Total num frames: 9349201920. Throughput: 0: 61155.7. Samples: 2254470860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:09,254][54587] Avg episode reward: [(0, '0.709')] [2024-04-28 04:04:10,931][54818] Updated weights for policy 0, policy_version 570638 (0.0017) [2024-04-28 04:04:13,909][54818] Updated weights for policy 0, policy_version 570648 (0.0017) [2024-04-28 04:04:14,253][54587] Fps is (10 sec: 65536.1, 60 sec: 61713.1, 300 sec: 61259.5). Total num frames: 9349529600. 
Throughput: 0: 61324.2. Samples: 2254644140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:14,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 04:04:16,295][54818] Updated weights for policy 0, policy_version 570658 (0.0015) [2024-04-28 04:04:19,034][54818] Updated weights for policy 0, policy_version 570668 (0.0016) [2024-04-28 04:04:19,253][54587] Fps is (10 sec: 62261.1, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9349824512. Throughput: 0: 61347.2. Samples: 2255018220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:19,253][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 04:04:21,481][54818] Updated weights for policy 0, policy_version 570678 (0.0016) [2024-04-28 04:04:24,195][54818] Updated weights for policy 0, policy_version 570688 (0.0017) [2024-04-28 04:04:24,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9350152192. Throughput: 0: 61482.1. Samples: 2255395280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:24,254][54587] Avg episode reward: [(0, '0.508')] [2024-04-28 04:04:26,936][54818] Updated weights for policy 0, policy_version 570698 (0.0017) [2024-04-28 04:04:28,436][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26000 times) [2024-04-28 04:04:29,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9350447104. Throughput: 0: 61568.8. Samples: 2255570680. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:29,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 04:04:29,621][54818] Updated weights for policy 0, policy_version 570708 (0.0016) [2024-04-28 04:04:32,100][54818] Updated weights for policy 0, policy_version 570718 (0.0016) [2024-04-28 04:04:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61713.0, 300 sec: 61315.0). Total num frames: 9350758400. Throughput: 0: 61720.0. Samples: 2255948600. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:34,254][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 04:04:34,908][54818] Updated weights for policy 0, policy_version 570728 (0.0015) [2024-04-28 04:04:37,766][54818] Updated weights for policy 0, policy_version 570738 (0.0016) [2024-04-28 04:04:39,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9351053312. Throughput: 0: 61294.7. Samples: 2256307620. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:39,254][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 04:04:40,625][54818] Updated weights for policy 0, policy_version 570748 (0.0017) [2024-04-28 04:04:42,738][54798] Signal inference workers to stop experience collection... (37350 times) [2024-04-28 04:04:42,739][54798] Signal inference workers to resume experience collection... (37350 times) [2024-04-28 04:04:42,754][54818] InferenceWorker_p0-w0: stopping experience collection (37350 times) [2024-04-28 04:04:42,754][54818] InferenceWorker_p0-w0: resuming experience collection (37350 times) [2024-04-28 04:04:43,070][54818] Updated weights for policy 0, policy_version 570758 (0.0017) [2024-04-28 04:04:44,253][54587] Fps is (10 sec: 62259.2, 60 sec: 62259.1, 300 sec: 61315.0). Total num frames: 9351380992. Throughput: 0: 61712.4. Samples: 2256493380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:44,255][54587] Avg episode reward: [(0, '0.501')] [2024-04-28 04:04:46,027][54818] Updated weights for policy 0, policy_version 570768 (0.0016) [2024-04-28 04:04:48,277][54818] Updated weights for policy 0, policy_version 570778 (0.0016) [2024-04-28 04:04:49,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61713.2, 300 sec: 61259.5). Total num frames: 9351675904. Throughput: 0: 61668.9. Samples: 2256865060. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:49,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 04:04:51,242][54818] Updated weights for policy 0, policy_version 570788 (0.0019) [2024-04-28 04:04:53,609][54818] Updated weights for policy 0, policy_version 570798 (0.0019) [2024-04-28 04:04:54,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61986.1, 300 sec: 61315.0). Total num frames: 9351987200. Throughput: 0: 61262.0. Samples: 2257227640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:54,255][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 04:04:54,758][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26100 times) [2024-04-28 04:04:56,544][54818] Updated weights for policy 0, policy_version 570808 (0.0017) [2024-04-28 04:04:58,713][54818] Updated weights for policy 0, policy_version 570818 (0.0017) [2024-04-28 04:04:59,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61986.1, 300 sec: 61370.6). Total num frames: 9352298496. Throughput: 0: 61718.5. Samples: 2257421480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:04:59,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 04:05:01,933][54818] Updated weights for policy 0, policy_version 570828 (0.0017) [2024-04-28 04:05:03,982][54818] Updated weights for policy 0, policy_version 570838 (0.0017) [2024-04-28 04:05:04,253][54587] Fps is (10 sec: 62259.4, 60 sec: 62259.1, 300 sec: 61370.6). Total num frames: 9352609792. Throughput: 0: 61312.7. Samples: 2257777300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:05:04,255][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 04:05:07,146][54818] Updated weights for policy 0, policy_version 570848 (0.0016) [2024-04-28 04:05:09,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61986.4, 300 sec: 61370.6). Total num frames: 9352921088. Throughput: 0: 61058.4. Samples: 2258142900. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:05:09,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 04:05:09,399][54818] Updated weights for policy 0, policy_version 570858 (0.0017) [2024-04-28 04:05:12,276][54818] Updated weights for policy 0, policy_version 570868 (0.0019) [2024-04-28 04:05:14,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61439.8, 300 sec: 61370.5). Total num frames: 9353216000. Throughput: 0: 61551.0. Samples: 2258340480. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:05:14,254][54587] Avg episode reward: [(0, '0.635')] [2024-04-28 04:05:14,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (1400 times) [2024-04-28 04:05:15,127][54818] Updated weights for policy 0, policy_version 570878 (0.0016) [2024-04-28 04:05:17,846][54818] Updated weights for policy 0, policy_version 570888 (0.0016) [2024-04-28 04:05:19,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61712.9, 300 sec: 61315.0). Total num frames: 9353527296. Throughput: 0: 61238.2. Samples: 2258704320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:05:19,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 04:05:20,851][54818] Updated weights for policy 0, policy_version 570898 (0.0018) [2024-04-28 04:05:21,866][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26200 times) [2024-04-28 04:05:22,893][54818] Updated weights for policy 0, policy_version 570908 (0.0019) [2024-04-28 04:05:24,253][54587] Fps is (10 sec: 62260.3, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9353838592. Throughput: 0: 61144.2. Samples: 2259059100. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:05:24,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 04:05:26,070][54818] Updated weights for policy 0, policy_version 570918 (0.0017) [2024-04-28 04:05:27,609][54798] Signal inference workers to stop experience collection... 
(37400 times) [2024-04-28 04:05:27,618][54798] Signal inference workers to resume experience collection... (37400 times) [2024-04-28 04:05:27,656][54818] InferenceWorker_p0-w0: stopping experience collection (37400 times) [2024-04-28 04:05:27,656][54818] InferenceWorker_p0-w0: resuming experience collection (37400 times) [2024-04-28 04:05:28,244][54818] Updated weights for policy 0, policy_version 570928 (0.0022) [2024-04-28 04:05:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9354133504. Throughput: 0: 61577.8. Samples: 2259264380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 04:05:29,255][54587] Avg episode reward: [(0, '0.663')] [2024-04-28 04:05:31,305][54818] Updated weights for policy 0, policy_version 570938 (0.0016) [2024-04-28 04:05:33,543][54818] Updated weights for policy 0, policy_version 570948 (0.0018) [2024-04-28 04:05:34,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9354444800. Throughput: 0: 61197.8. Samples: 2259618960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:05:34,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 04:05:36,584][54818] Updated weights for policy 0, policy_version 570958 (0.0017) [2024-04-28 04:05:38,970][54818] Updated weights for policy 0, policy_version 570968 (0.0018) [2024-04-28 04:05:39,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61713.1, 300 sec: 61370.6). Total num frames: 9354756096. Throughput: 0: 61302.3. Samples: 2259986240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:05:39,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 04:05:41,960][54818] Updated weights for policy 0, policy_version 570978 (0.0017) [2024-04-28 04:05:44,074][54818] Updated weights for policy 0, policy_version 570988 (0.0017) [2024-04-28 04:05:44,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9355067392. Throughput: 0: 61378.3. 
Samples: 2260183500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:05:44,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 04:05:47,153][54818] Updated weights for policy 0, policy_version 570998 (0.0016) [2024-04-28 04:05:48,087][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26300 times) [2024-04-28 04:05:49,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61712.9, 300 sec: 61370.6). Total num frames: 9355378688. Throughput: 0: 61531.4. Samples: 2260546220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:05:49,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 04:05:49,265][54587] No heartbeat for components: RolloutWorker_w4 (24097 seconds), RolloutWorker_w5 (10197 seconds) [2024-04-28 04:05:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000571007_9355378688.pth... [2024-04-28 04:05:49,324][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000570107_9340633088.pth [2024-04-28 04:05:49,491][54818] Updated weights for policy 0, policy_version 571008 (0.0016) [2024-04-28 04:05:52,282][54818] Updated weights for policy 0, policy_version 571018 (0.0018) [2024-04-28 04:05:54,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9355673600. Throughput: 0: 61538.5. Samples: 2260912140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:05:54,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:05:55,385][54818] Updated weights for policy 0, policy_version 571028 (0.0017) [2024-04-28 04:05:57,695][54818] Updated weights for policy 0, policy_version 571038 (0.0017) [2024-04-28 04:05:59,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9355984896. Throughput: 0: 61278.8. Samples: 2261098020. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:05:59,254][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 04:06:00,972][54818] Updated weights for policy 0, policy_version 571048 (0.0017) [2024-04-28 04:06:03,042][54818] Updated weights for policy 0, policy_version 571058 (0.0018) [2024-04-28 04:06:04,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61439.9, 300 sec: 61426.1). Total num frames: 9356296192. Throughput: 0: 61268.8. Samples: 2261461420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:04,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 04:06:06,273][54818] Updated weights for policy 0, policy_version 571068 (0.0016) [2024-04-28 04:06:08,342][54818] Updated weights for policy 0, policy_version 571078 (0.0015) [2024-04-28 04:06:09,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9356591104. Throughput: 0: 61487.5. Samples: 2261826040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:09,255][54587] Avg episode reward: [(0, '0.667')] [2024-04-28 04:06:10,582][54798] Signal inference workers to stop experience collection... (37450 times) [2024-04-28 04:06:10,595][54818] InferenceWorker_p0-w0: stopping experience collection (37450 times) [2024-04-28 04:06:10,676][54798] Signal inference workers to resume experience collection... (37450 times) [2024-04-28 04:06:10,676][54818] InferenceWorker_p0-w0: resuming experience collection (37450 times) [2024-04-28 04:06:11,469][54818] Updated weights for policy 0, policy_version 571088 (0.0015) [2024-04-28 04:06:13,526][54818] Updated weights for policy 0, policy_version 571098 (0.0015) [2024-04-28 04:06:14,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61440.2, 300 sec: 61426.1). Total num frames: 9356902400. Throughput: 0: 61179.3. Samples: 2262017440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:14,253][54587] Avg episode reward: [(0, '0.689')] [2024-04-28 04:06:15,445][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26400 times) [2024-04-28 04:06:16,795][54818] Updated weights for policy 0, policy_version 571108 (0.0018) [2024-04-28 04:06:18,847][54818] Updated weights for policy 0, policy_version 571118 (0.0015) [2024-04-28 04:06:19,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9357213696. Throughput: 0: 61459.3. Samples: 2262384640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:19,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 04:06:21,962][54818] Updated weights for policy 0, policy_version 571128 (0.0017) [2024-04-28 04:06:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9357508608. Throughput: 0: 61383.2. Samples: 2262748480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:24,253][54587] Avg episode reward: [(0, '0.667')] [2024-04-28 04:06:24,300][54818] Updated weights for policy 0, policy_version 571138 (0.0020) [2024-04-28 04:06:27,249][54818] Updated weights for policy 0, policy_version 571148 (0.0021) [2024-04-28 04:06:29,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61713.1, 300 sec: 61481.6). Total num frames: 9357836288. Throughput: 0: 61262.2. Samples: 2262940300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:29,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 04:06:29,719][54818] Updated weights for policy 0, policy_version 571158 (0.0018) [2024-04-28 04:06:32,516][54818] Updated weights for policy 0, policy_version 571168 (0.0016) [2024-04-28 04:06:34,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.0, 300 sec: 61481.7). Total num frames: 9358131200. Throughput: 0: 61277.9. Samples: 2263303720. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:34,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 04:06:35,687][54818] Updated weights for policy 0, policy_version 571178 (0.0024) [2024-04-28 04:06:37,749][54818] Updated weights for policy 0, policy_version 571188 (0.0018) [2024-04-28 04:06:39,253][54587] Fps is (10 sec: 58983.1, 60 sec: 61167.0, 300 sec: 61426.1). Total num frames: 9358426112. Throughput: 0: 61062.8. Samples: 2263659960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:39,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 04:06:41,121][54818] Updated weights for policy 0, policy_version 571198 (0.0016) [2024-04-28 04:06:42,069][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26500 times) [2024-04-28 04:06:42,393][54798] Signal inference workers to stop experience collection... (37500 times) [2024-04-28 04:06:42,430][54818] InferenceWorker_p0-w0: stopping experience collection (37500 times) [2024-04-28 04:06:42,481][54798] Signal inference workers to resume experience collection... (37500 times) [2024-04-28 04:06:42,481][54818] InferenceWorker_p0-w0: resuming experience collection (37500 times) [2024-04-28 04:06:43,115][54818] Updated weights for policy 0, policy_version 571208 (0.0016) [2024-04-28 04:06:44,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.9, 300 sec: 61426.1). Total num frames: 9358737408. Throughput: 0: 61370.7. Samples: 2263859700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:44,255][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 04:06:46,259][54818] Updated weights for policy 0, policy_version 571218 (0.0016) [2024-04-28 04:06:48,263][54818] Updated weights for policy 0, policy_version 571228 (0.0016) [2024-04-28 04:06:49,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61167.0, 300 sec: 61426.1). Total num frames: 9359048704. Throughput: 0: 61366.8. Samples: 2264222920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:49,255][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 04:06:51,554][54818] Updated weights for policy 0, policy_version 571238 (0.0016) [2024-04-28 04:06:53,402][54818] Updated weights for policy 0, policy_version 571248 (0.0017) [2024-04-28 04:06:54,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61440.0, 300 sec: 61481.7). Total num frames: 9359360000. Throughput: 0: 61235.6. Samples: 2264581640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:54,255][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:06:56,727][54818] Updated weights for policy 0, policy_version 571258 (0.0018) [2024-04-28 04:06:58,746][54818] Updated weights for policy 0, policy_version 571268 (0.0019) [2024-04-28 04:06:59,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61426.1). Total num frames: 9359654912. Throughput: 0: 61323.4. Samples: 2264777000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:06:59,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 04:07:01,903][54818] Updated weights for policy 0, policy_version 571278 (0.0017) [2024-04-28 04:07:04,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.1, 300 sec: 61426.1). Total num frames: 9359966208. Throughput: 0: 61289.5. Samples: 2265142660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 04:07:04,256][54587] Avg episode reward: [(0, '0.542')] [2024-04-28 04:07:04,649][54818] Updated weights for policy 0, policy_version 571288 (0.0017) [2024-04-28 04:07:07,393][54818] Updated weights for policy 0, policy_version 571298 (0.0016) [2024-04-28 04:07:08,128][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26600 times) [2024-04-28 04:07:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61439.9, 300 sec: 61426.1). Total num frames: 9360277504. Throughput: 0: 61251.8. Samples: 2265504820. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:09,255][54587] Avg episode reward: [(0, '0.647')] [2024-04-28 04:07:09,945][54818] Updated weights for policy 0, policy_version 571308 (0.0016) [2024-04-28 04:07:12,842][54818] Updated weights for policy 0, policy_version 571318 (0.0016) [2024-04-28 04:07:14,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61439.9, 300 sec: 61481.7). Total num frames: 9360588800. Throughput: 0: 61198.2. Samples: 2265694220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:14,255][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 04:07:15,304][54818] Updated weights for policy 0, policy_version 571328 (0.0018) [2024-04-28 04:07:18,120][54818] Updated weights for policy 0, policy_version 571338 (0.0016) [2024-04-28 04:07:19,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61481.6). Total num frames: 9360883712. Throughput: 0: 61299.0. Samples: 2266062180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:19,255][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 04:07:20,011][54798] Signal inference workers to stop experience collection... (37550 times) [2024-04-28 04:07:20,044][54818] InferenceWorker_p0-w0: stopping experience collection (37550 times) [2024-04-28 04:07:20,073][54798] Signal inference workers to resume experience collection... (37550 times) [2024-04-28 04:07:20,073][54818] InferenceWorker_p0-w0: resuming experience collection (37550 times) [2024-04-28 04:07:20,737][54818] Updated weights for policy 0, policy_version 571348 (0.0017) [2024-04-28 04:07:23,475][54818] Updated weights for policy 0, policy_version 571358 (0.0016) [2024-04-28 04:07:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61439.9, 300 sec: 61426.1). Total num frames: 9361195008. Throughput: 0: 61459.0. Samples: 2266425620. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:24,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 04:07:26,090][54818] Updated weights for policy 0, policy_version 571368 (0.0016) [2024-04-28 04:07:28,581][54818] Updated weights for policy 0, policy_version 571378 (0.0020) [2024-04-28 04:07:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60893.8, 300 sec: 61426.1). Total num frames: 9361489920. Throughput: 0: 61142.2. Samples: 2266611100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:29,254][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 04:07:31,478][54818] Updated weights for policy 0, policy_version 571388 (0.0016) [2024-04-28 04:07:33,780][54818] Updated weights for policy 0, policy_version 571398 (0.0017) [2024-04-28 04:07:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61167.0, 300 sec: 61426.1). Total num frames: 9361801216. Throughput: 0: 61257.4. Samples: 2266979500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:34,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 04:07:35,501][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26700 times) [2024-04-28 04:07:36,938][54818] Updated weights for policy 0, policy_version 571408 (0.0016) [2024-04-28 04:07:39,171][54818] Updated weights for policy 0, policy_version 571418 (0.0015) [2024-04-28 04:07:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61439.9, 300 sec: 61481.7). Total num frames: 9362112512. Throughput: 0: 61364.9. Samples: 2267343060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:39,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 04:07:42,303][54818] Updated weights for policy 0, policy_version 571428 (0.0016) [2024-04-28 04:07:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 61426.1). Total num frames: 9362407424. Throughput: 0: 61051.6. Samples: 2267524320. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:44,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 04:07:44,704][54818] Updated weights for policy 0, policy_version 571438 (0.0018) [2024-04-28 04:07:47,565][54818] Updated weights for policy 0, policy_version 571448 (0.0015) [2024-04-28 04:07:49,254][54587] Fps is (10 sec: 60619.6, 60 sec: 61166.8, 300 sec: 61426.1). Total num frames: 9362718720. Throughput: 0: 61214.4. Samples: 2267897320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:49,254][54587] Avg episode reward: [(0, '0.667')] [2024-04-28 04:07:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000571455_9362718720.pth... [2024-04-28 04:07:49,319][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000570555_9347973120.pth [2024-04-28 04:07:50,373][54818] Updated weights for policy 0, policy_version 571458 (0.0015) [2024-04-28 04:07:53,113][54818] Updated weights for policy 0, policy_version 571468 (0.0018) [2024-04-28 04:07:54,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.9, 300 sec: 61426.1). Total num frames: 9363030016. Throughput: 0: 61285.0. Samples: 2268262640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:54,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 04:07:55,565][54818] Updated weights for policy 0, policy_version 571478 (0.0017) [2024-04-28 04:07:58,373][54818] Updated weights for policy 0, policy_version 571488 (0.0015) [2024-04-28 04:07:58,772][54798] Signal inference workers to stop experience collection... (37600 times) [2024-04-28 04:07:58,772][54798] Signal inference workers to resume experience collection... 
(37600 times) [2024-04-28 04:07:58,780][54818] InferenceWorker_p0-w0: stopping experience collection (37600 times) [2024-04-28 04:07:58,780][54818] InferenceWorker_p0-w0: resuming experience collection (37600 times) [2024-04-28 04:07:59,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9363324928. Throughput: 0: 61084.9. Samples: 2268443040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:07:59,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 04:08:00,910][54818] Updated weights for policy 0, policy_version 571498 (0.0016) [2024-04-28 04:08:01,957][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26800 times) [2024-04-28 04:08:03,632][54818] Updated weights for policy 0, policy_version 571508 (0.0015) [2024-04-28 04:08:04,253][54587] Fps is (10 sec: 60619.8, 60 sec: 61166.8, 300 sec: 61426.1). Total num frames: 9363636224. Throughput: 0: 61345.3. Samples: 2268822720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:08:04,254][54587] Avg episode reward: [(0, '0.680')] [2024-04-28 04:08:06,088][54818] Updated weights for policy 0, policy_version 571518 (0.0017) [2024-04-28 04:08:08,867][54818] Updated weights for policy 0, policy_version 571528 (0.0016) [2024-04-28 04:08:09,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61167.0, 300 sec: 61426.1). Total num frames: 9363947520. Throughput: 0: 61352.0. Samples: 2269186460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:08:09,255][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 04:08:11,299][54818] Updated weights for policy 0, policy_version 571538 (0.0016) [2024-04-28 04:08:14,172][54818] Updated weights for policy 0, policy_version 571548 (0.0019) [2024-04-28 04:08:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.8, 300 sec: 61370.5). Total num frames: 9364242432. Throughput: 0: 60985.2. Samples: 2269355440. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:08:14,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 04:08:16,496][54818] Updated weights for policy 0, policy_version 571558 (0.0018) [2024-04-28 04:08:19,253][54587] Fps is (10 sec: 58982.7, 60 sec: 60894.1, 300 sec: 61315.1). Total num frames: 9364537344. Throughput: 0: 61350.7. Samples: 2269740280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:08:19,253][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 04:08:19,500][54818] Updated weights for policy 0, policy_version 571568 (0.0016) [2024-04-28 04:08:21,959][54818] Updated weights for policy 0, policy_version 571578 (0.0017) [2024-04-28 04:08:24,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 61370.6). Total num frames: 9364848640. Throughput: 0: 61468.0. Samples: 2270109120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:08:24,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 04:08:24,900][54818] Updated weights for policy 0, policy_version 571588 (0.0017) [2024-04-28 04:08:28,056][54818] Updated weights for policy 0, policy_version 571598 (0.0015) [2024-04-28 04:08:28,923][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (26900 times) [2024-04-28 04:08:29,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.1, 300 sec: 61370.6). Total num frames: 9365159936. Throughput: 0: 61199.7. Samples: 2270278300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:08:29,253][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 04:08:30,277][54818] Updated weights for policy 0, policy_version 571608 (0.0017) [2024-04-28 04:08:33,345][54818] Updated weights for policy 0, policy_version 571618 (0.0015) [2024-04-28 04:08:34,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60893.9, 300 sec: 61370.6). Total num frames: 9365454848. Throughput: 0: 61512.0. Samples: 2270665340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 04:08:34,253][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 04:08:35,454][54818] Updated weights for policy 0, policy_version 571628 (0.0017) [2024-04-28 04:08:38,558][54818] Updated weights for policy 0, policy_version 571638 (0.0018) [2024-04-28 04:08:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60894.0, 300 sec: 61426.1). Total num frames: 9365766144. Throughput: 0: 61551.7. Samples: 2271032460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:08:39,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 04:08:39,340][54798] Signal inference workers to stop experience collection... (37650 times) [2024-04-28 04:08:39,340][54798] Signal inference workers to resume experience collection... (37650 times) [2024-04-28 04:08:39,361][54818] InferenceWorker_p0-w0: stopping experience collection (37650 times) [2024-04-28 04:08:39,361][54818] InferenceWorker_p0-w0: resuming experience collection (37650 times) [2024-04-28 04:08:40,608][54818] Updated weights for policy 0, policy_version 571648 (0.0018) [2024-04-28 04:08:43,894][54818] Updated weights for policy 0, policy_version 571658 (0.0016) [2024-04-28 04:08:44,253][54587] Fps is (10 sec: 62258.1, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9366077440. Throughput: 0: 61182.2. Samples: 2271196240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:08:44,262][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 04:08:45,842][54818] Updated weights for policy 0, policy_version 571668 (0.0017) [2024-04-28 04:08:49,021][54818] Updated weights for policy 0, policy_version 571678 (0.0015) [2024-04-28 04:08:49,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61167.2, 300 sec: 61426.1). Total num frames: 9366388736. Throughput: 0: 61346.5. Samples: 2271583300. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:08:49,253][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 04:08:49,269][54587] No heartbeat for components: RolloutWorker_w4 (24277 seconds), RolloutWorker_w5 (10377 seconds) [2024-04-28 04:08:50,975][54818] Updated weights for policy 0, policy_version 571688 (0.0016) [2024-04-28 04:08:54,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.9, 300 sec: 61370.6). Total num frames: 9366683648. Throughput: 0: 61650.2. Samples: 2271960720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:08:54,253][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 04:08:54,301][54818] Updated weights for policy 0, policy_version 571698 (0.0017) [2024-04-28 04:08:55,262][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27000 times) [2024-04-28 04:08:56,444][54818] Updated weights for policy 0, policy_version 571708 (0.0016) [2024-04-28 04:08:59,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.0, 300 sec: 61426.1). Total num frames: 9366994944. Throughput: 0: 61461.6. Samples: 2272121200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:08:59,253][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 04:08:59,529][54818] Updated weights for policy 0, policy_version 571718 (0.0017) [2024-04-28 04:09:02,520][54818] Updated weights for policy 0, policy_version 571728 (0.0017) [2024-04-28 04:09:04,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61167.2, 300 sec: 61370.6). Total num frames: 9367306240. Throughput: 0: 61460.4. Samples: 2272506000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:04,254][54587] Avg episode reward: [(0, '0.674')] [2024-04-28 04:09:04,933][54818] Updated weights for policy 0, policy_version 571738 (0.0018) [2024-04-28 04:09:07,861][54818] Updated weights for policy 0, policy_version 571748 (0.0017) [2024-04-28 04:09:09,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60893.8, 300 sec: 61259.5). 
Total num frames: 9367601152. Throughput: 0: 61434.6. Samples: 2272873680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:09,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 04:09:10,154][54818] Updated weights for policy 0, policy_version 571758 (0.0022) [2024-04-28 04:09:13,323][54818] Updated weights for policy 0, policy_version 571768 (0.0016) [2024-04-28 04:09:14,253][54587] Fps is (10 sec: 58981.9, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 9367896064. Throughput: 0: 61446.1. Samples: 2273043380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:14,254][54587] Avg episode reward: [(0, '0.560')] [2024-04-28 04:09:15,447][54818] Updated weights for policy 0, policy_version 571778 (0.0018) [2024-04-28 04:09:18,842][54818] Updated weights for policy 0, policy_version 571788 (0.0015) [2024-04-28 04:09:19,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9368207360. Throughput: 0: 61173.8. Samples: 2273418160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:19,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 04:09:20,311][54798] Signal inference workers to stop experience collection... (37700 times) [2024-04-28 04:09:20,315][54798] Signal inference workers to resume experience collection... (37700 times) [2024-04-28 04:09:20,333][54818] InferenceWorker_p0-w0: stopping experience collection (37700 times) [2024-04-28 04:09:20,333][54818] InferenceWorker_p0-w0: resuming experience collection (37700 times) [2024-04-28 04:09:20,576][54818] Updated weights for policy 0, policy_version 571798 (0.0018) [2024-04-28 04:09:22,319][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27100 times) [2024-04-28 04:09:24,136][54818] Updated weights for policy 0, policy_version 571808 (0.0017) [2024-04-28 04:09:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60893.9, 300 sec: 61204.0). 
Total num frames: 9368502272. Throughput: 0: 61309.2. Samples: 2273791380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:24,257][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 04:09:25,863][54818] Updated weights for policy 0, policy_version 571818 (0.0016) [2024-04-28 04:09:29,253][54587] Fps is (10 sec: 60619.3, 60 sec: 60893.6, 300 sec: 61203.9). Total num frames: 9368813568. Throughput: 0: 61291.9. Samples: 2273954380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:29,255][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 04:09:29,405][54818] Updated weights for policy 0, policy_version 571828 (0.0015) [2024-04-28 04:09:31,318][54818] Updated weights for policy 0, policy_version 571838 (0.0018) [2024-04-28 04:09:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9369108480. Throughput: 0: 61247.5. Samples: 2274339440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:34,253][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 04:09:34,554][54818] Updated weights for policy 0, policy_version 571848 (0.0016) [2024-04-28 04:09:37,035][54818] Updated weights for policy 0, policy_version 571858 (0.0015) [2024-04-28 04:09:39,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.7, 300 sec: 61204.0). Total num frames: 9369436160. Throughput: 0: 61019.9. Samples: 2274706620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:39,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 04:09:39,902][54818] Updated weights for policy 0, policy_version 571868 (0.0017) [2024-04-28 04:09:42,649][54818] Updated weights for policy 0, policy_version 571878 (0.0022) [2024-04-28 04:09:44,253][54587] Fps is (10 sec: 62258.2, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 9369731072. Throughput: 0: 61354.4. Samples: 2274882160. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:44,254][54587] Avg episode reward: [(0, '0.556')] [2024-04-28 04:09:45,055][54818] Updated weights for policy 0, policy_version 571888 (0.0018) [2024-04-28 04:09:48,291][54818] Updated weights for policy 0, policy_version 571898 (0.0019) [2024-04-28 04:09:49,232][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27200 times) [2024-04-28 04:09:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.7, 300 sec: 61203.9). Total num frames: 9370042368. Throughput: 0: 61121.1. Samples: 2275256460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:49,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 04:09:49,351][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000571903_9370058752.pth... [2024-04-28 04:09:49,403][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000571007_9355378688.pth [2024-04-28 04:09:50,198][54818] Updated weights for policy 0, policy_version 571908 (0.0017) [2024-04-28 04:09:53,532][54818] Updated weights for policy 0, policy_version 571918 (0.0017) [2024-04-28 04:09:54,253][54587] Fps is (10 sec: 60621.9, 60 sec: 60893.9, 300 sec: 61148.5). Total num frames: 9370337280. Throughput: 0: 61213.6. Samples: 2275628280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:54,253][54587] Avg episode reward: [(0, '0.672')] [2024-04-28 04:09:54,752][54798] Signal inference workers to stop experience collection... (37750 times) [2024-04-28 04:09:54,796][54818] InferenceWorker_p0-w0: stopping experience collection (37750 times) [2024-04-28 04:09:54,803][54798] Signal inference workers to resume experience collection... 
(37750 times) [2024-04-28 04:09:54,810][54818] InferenceWorker_p0-w0: resuming experience collection (37750 times) [2024-04-28 04:09:55,461][54818] Updated weights for policy 0, policy_version 571928 (0.0016) [2024-04-28 04:09:58,774][54818] Updated weights for policy 0, policy_version 571938 (0.0020) [2024-04-28 04:09:59,253][54587] Fps is (10 sec: 60622.1, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9370648576. Throughput: 0: 61150.4. Samples: 2275795140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:09:59,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 04:10:00,715][54818] Updated weights for policy 0, policy_version 571948 (0.0017) [2024-04-28 04:10:04,130][54818] Updated weights for policy 0, policy_version 571958 (0.0017) [2024-04-28 04:10:04,253][54587] Fps is (10 sec: 62258.8, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9370959872. Throughput: 0: 61243.5. Samples: 2276174120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-04-28 04:10:04,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 04:10:05,850][54818] Updated weights for policy 0, policy_version 571968 (0.0021) [2024-04-28 04:10:09,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9371271168. Throughput: 0: 61218.3. Samples: 2276546200. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:09,259][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:10:09,418][54818] Updated weights for policy 0, policy_version 571978 (0.0016) [2024-04-28 04:10:11,461][54818] Updated weights for policy 0, policy_version 571988 (0.0016) [2024-04-28 04:10:14,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61440.2, 300 sec: 61204.0). Total num frames: 9371582464. Throughput: 0: 61270.1. Samples: 2276711520. 
Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:14,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 04:10:14,565][54818] Updated weights for policy 0, policy_version 571998 (0.0016) [2024-04-28 04:10:15,519][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27300 times) [2024-04-28 04:10:17,445][54818] Updated weights for policy 0, policy_version 572008 (0.0017) [2024-04-28 04:10:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9371877376. Throughput: 0: 61083.5. Samples: 2277088200. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:19,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 04:10:19,884][54818] Updated weights for policy 0, policy_version 572018 (0.0018) [2024-04-28 04:10:23,120][54818] Updated weights for policy 0, policy_version 572028 (0.0020) [2024-04-28 04:10:24,253][54587] Fps is (10 sec: 60619.5, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9372188672. Throughput: 0: 61109.7. Samples: 2277456560. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:24,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 04:10:25,394][54818] Updated weights for policy 0, policy_version 572038 (0.0017) [2024-04-28 04:10:28,452][54818] Updated weights for policy 0, policy_version 572048 (0.0016) [2024-04-28 04:10:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61167.1, 300 sec: 61148.4). Total num frames: 9372483584. Throughput: 0: 61008.6. Samples: 2277627540. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:29,253][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 04:10:30,617][54818] Updated weights for policy 0, policy_version 572058 (0.0016) [2024-04-28 04:10:30,649][54798] Signal inference workers to stop experience collection... 
(37800 times) [2024-04-28 04:10:30,669][54818] InferenceWorker_p0-w0: stopping experience collection (37800 times) [2024-04-28 04:10:30,741][54798] Signal inference workers to resume experience collection... (37800 times) [2024-04-28 04:10:30,741][54818] InferenceWorker_p0-w0: resuming experience collection (37800 times) [2024-04-28 04:10:33,805][54818] Updated weights for policy 0, policy_version 572068 (0.0015) [2024-04-28 04:10:34,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9372794880. Throughput: 0: 60974.5. Samples: 2278000300. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:34,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 04:10:35,812][54818] Updated weights for policy 0, policy_version 572078 (0.0017) [2024-04-28 04:10:39,074][54818] Updated weights for policy 0, policy_version 572088 (0.0016) [2024-04-28 04:10:39,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9373089792. Throughput: 0: 60858.0. Samples: 2278366900. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:39,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:10:41,204][54818] Updated weights for policy 0, policy_version 572098 (0.0017) [2024-04-28 04:10:43,021][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27400 times) [2024-04-28 04:10:44,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61167.0, 300 sec: 61092.9). Total num frames: 9373401088. Throughput: 0: 60888.7. Samples: 2278535140. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:44,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 04:10:44,373][54818] Updated weights for policy 0, policy_version 572108 (0.0018) [2024-04-28 04:10:46,537][54818] Updated weights for policy 0, policy_version 572118 (0.0017) [2024-04-28 04:10:49,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61167.0, 300 sec: 61148.4). 
Total num frames: 9373712384. Throughput: 0: 60896.4. Samples: 2278914460. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:49,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-28 04:10:49,630][54818] Updated weights for policy 0, policy_version 572128 (0.0017) [2024-04-28 04:10:52,011][54818] Updated weights for policy 0, policy_version 572138 (0.0017) [2024-04-28 04:10:54,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9374007296. Throughput: 0: 60828.0. Samples: 2279283460. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:54,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 04:10:54,928][54818] Updated weights for policy 0, policy_version 572148 (0.0016) [2024-04-28 04:10:57,571][54818] Updated weights for policy 0, policy_version 572158 (0.0016) [2024-04-28 04:10:59,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9374318592. Throughput: 0: 60983.5. Samples: 2279455780. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:10:59,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:11:00,269][54818] Updated weights for policy 0, policy_version 572168 (0.0016) [2024-04-28 04:11:03,071][54818] Updated weights for policy 0, policy_version 572178 (0.0016) [2024-04-28 04:11:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9374613504. Throughput: 0: 60815.6. Samples: 2279824900. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:11:04,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 04:11:05,632][54818] Updated weights for policy 0, policy_version 572188 (0.0018) [2024-04-28 04:11:08,453][54818] Updated weights for policy 0, policy_version 572198 (0.0017) [2024-04-28 04:11:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9374924800. Throughput: 0: 60878.5. Samples: 2280196080. 
Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:11:09,253][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 04:11:09,646][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27500 times) [2024-04-28 04:11:11,072][54818] Updated weights for policy 0, policy_version 572208 (0.0016) [2024-04-28 04:11:13,961][54818] Updated weights for policy 0, policy_version 572218 (0.0015) [2024-04-28 04:11:14,253][54587] Fps is (10 sec: 62259.5, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9375236096. Throughput: 0: 60897.8. Samples: 2280367940. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:11:14,253][54587] Avg episode reward: [(0, '0.561')] [2024-04-28 04:11:16,016][54798] Signal inference workers to stop experience collection... (37850 times) [2024-04-28 04:11:16,021][54798] Signal inference workers to resume experience collection... (37850 times) [2024-04-28 04:11:16,030][54818] InferenceWorker_p0-w0: stopping experience collection (37850 times) [2024-04-28 04:11:16,030][54818] InferenceWorker_p0-w0: resuming experience collection (37850 times) [2024-04-28 04:11:16,445][54818] Updated weights for policy 0, policy_version 572228 (0.0017) [2024-04-28 04:11:19,201][54818] Updated weights for policy 0, policy_version 572238 (0.0016) [2024-04-28 04:11:19,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9375547392. Throughput: 0: 60854.1. Samples: 2280738740. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:11:19,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 04:11:21,834][54818] Updated weights for policy 0, policy_version 572248 (0.0015) [2024-04-28 04:11:24,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61167.1, 300 sec: 61092.9). Total num frames: 9375858688. Throughput: 0: 60888.1. Samples: 2281106860. 
Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:11:24,254][54587] Avg episode reward: [(0, '0.659')] [2024-04-28 04:11:24,573][54818] Updated weights for policy 0, policy_version 572258 (0.0020) [2024-04-28 04:11:26,955][54818] Updated weights for policy 0, policy_version 572268 (0.0019) [2024-04-28 04:11:29,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9376169984. Throughput: 0: 61194.6. Samples: 2281288900. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:11:29,254][54587] Avg episode reward: [(0, '0.646')] [2024-04-28 04:11:29,962][54818] Updated weights for policy 0, policy_version 572278 (0.0017) [2024-04-28 04:11:32,507][54818] Updated weights for policy 0, policy_version 572288 (0.0017) [2024-04-28 04:11:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9376481280. Throughput: 0: 60877.4. Samples: 2281653940. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:11:34,254][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 04:11:35,275][54818] Updated weights for policy 0, policy_version 572298 (0.0016) [2024-04-28 04:11:36,554][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27600 times) [2024-04-28 04:11:37,902][54818] Updated weights for policy 0, policy_version 572308 (0.0017) [2024-04-28 04:11:39,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61440.1, 300 sec: 61148.5). Total num frames: 9376776192. Throughput: 0: 60931.6. Samples: 2282025380. Policy #0 lag: (min: 0.0, avg: 7.0, max: 20.0) [2024-04-28 04:11:39,253][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 04:11:40,691][54818] Updated weights for policy 0, policy_version 572318 (0.0017) [2024-04-28 04:11:42,997][54818] Updated weights for policy 0, policy_version 572328 (0.0015) [2024-04-28 04:11:44,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9377087488. 
Throughput: 0: 61219.6. Samples: 2282210660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:11:44,253][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 04:11:45,934][54818] Updated weights for policy 0, policy_version 572338 (0.0018) [2024-04-28 04:11:48,477][54818] Updated weights for policy 0, policy_version 572348 (0.0019) [2024-04-28 04:11:49,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9377398784. Throughput: 0: 61048.7. Samples: 2282572100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:11:49,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 04:11:49,260][54587] No heartbeat for components: RolloutWorker_w4 (24457 seconds), RolloutWorker_w5 (10557 seconds) [2024-04-28 04:11:49,278][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000572352_9377415168.pth... [2024-04-28 04:11:49,330][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000571455_9362718720.pth [2024-04-28 04:11:51,549][54818] Updated weights for policy 0, policy_version 572358 (0.0017) [2024-04-28 04:11:53,815][54818] Updated weights for policy 0, policy_version 572368 (0.0019) [2024-04-28 04:11:54,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61713.0, 300 sec: 61204.0). Total num frames: 9377710080. Throughput: 0: 60970.1. Samples: 2282939740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:11:54,254][54587] Avg episode reward: [(0, '0.519')] [2024-04-28 04:11:56,808][54818] Updated weights for policy 0, policy_version 572378 (0.0018) [2024-04-28 04:11:58,969][54798] Signal inference workers to stop experience collection... (37900 times) [2024-04-28 04:11:58,970][54798] Signal inference workers to resume experience collection... 
(37900 times) [2024-04-28 04:11:58,994][54818] InferenceWorker_p0-w0: stopping experience collection (37900 times) [2024-04-28 04:11:58,994][54818] InferenceWorker_p0-w0: resuming experience collection (37900 times) [2024-04-28 04:11:59,095][54818] Updated weights for policy 0, policy_version 572388 (0.0017) [2024-04-28 04:11:59,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61713.0, 300 sec: 61204.0). Total num frames: 9378021376. Throughput: 0: 61460.3. Samples: 2283133660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:11:59,254][54587] Avg episode reward: [(0, '0.492')] [2024-04-28 04:12:02,116][54818] Updated weights for policy 0, policy_version 572398 (0.0016) [2024-04-28 04:12:02,849][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27700 times) [2024-04-28 04:12:04,234][54818] Updated weights for policy 0, policy_version 572408 (0.0015) [2024-04-28 04:12:04,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61986.0, 300 sec: 61204.0). Total num frames: 9378332672. Throughput: 0: 61141.3. Samples: 2283490100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:04,255][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 04:12:07,323][54818] Updated weights for policy 0, policy_version 572418 (0.0017) [2024-04-28 04:12:09,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61713.1, 300 sec: 61148.4). Total num frames: 9378627584. Throughput: 0: 61024.9. Samples: 2283852980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:09,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:12:10,244][54818] Updated weights for policy 0, policy_version 572428 (0.0018) [2024-04-28 04:12:12,736][54818] Updated weights for policy 0, policy_version 572438 (0.0016) [2024-04-28 04:12:14,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61712.9, 300 sec: 61204.0). Total num frames: 9378938880. Throughput: 0: 61248.4. Samples: 2284045080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:14,255][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 04:12:15,556][54818] Updated weights for policy 0, policy_version 572448 (0.0017) [2024-04-28 04:12:18,015][54818] Updated weights for policy 0, policy_version 572458 (0.0017) [2024-04-28 04:12:19,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9379233792. Throughput: 0: 61001.7. Samples: 2284399020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:19,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:12:20,951][54818] Updated weights for policy 0, policy_version 572468 (0.0016) [2024-04-28 04:12:23,478][54818] Updated weights for policy 0, policy_version 572478 (0.0016) [2024-04-28 04:12:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9379545088. Throughput: 0: 60962.0. Samples: 2284768680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:24,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 04:12:26,276][54818] Updated weights for policy 0, policy_version 572488 (0.0019) [2024-04-28 04:12:28,563][54818] Updated weights for policy 0, policy_version 572498 (0.0016) [2024-04-28 04:12:29,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9379856384. Throughput: 0: 61178.2. Samples: 2284963680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:29,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 04:12:30,015][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27800 times) [2024-04-28 04:12:31,862][54818] Updated weights for policy 0, policy_version 572508 (0.0016) [2024-04-28 04:12:33,796][54818] Updated weights for policy 0, policy_version 572518 (0.0017) [2024-04-28 04:12:34,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9380151296. 
Throughput: 0: 61158.4. Samples: 2285324220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:34,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 04:12:37,001][54798] Signal inference workers to stop experience collection... (37950 times) [2024-04-28 04:12:37,043][54818] InferenceWorker_p0-w0: stopping experience collection (37950 times) [2024-04-28 04:12:37,061][54798] Signal inference workers to resume experience collection... (37950 times) [2024-04-28 04:12:37,061][54818] InferenceWorker_p0-w0: resuming experience collection (37950 times) [2024-04-28 04:12:37,186][54818] Updated weights for policy 0, policy_version 572528 (0.0017) [2024-04-28 04:12:39,253][54587] Fps is (10 sec: 58981.7, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9380446208. Throughput: 0: 60945.6. Samples: 2285682300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:39,254][54587] Avg episode reward: [(0, '0.503')] [2024-04-28 04:12:39,370][54818] Updated weights for policy 0, policy_version 572538 (0.0016) [2024-04-28 04:12:42,353][54818] Updated weights for policy 0, policy_version 572548 (0.0017) [2024-04-28 04:12:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.8, 300 sec: 61148.5). Total num frames: 9380757504. Throughput: 0: 61155.1. Samples: 2285885640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:44,263][54587] Avg episode reward: [(0, '0.693')] [2024-04-28 04:12:44,696][54818] Updated weights for policy 0, policy_version 572558 (0.0015) [2024-04-28 04:12:47,543][54818] Updated weights for policy 0, policy_version 572568 (0.0019) [2024-04-28 04:12:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9381068800. Throughput: 0: 61255.5. Samples: 2286246600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:49,263][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 04:12:50,028][54818] Updated weights for policy 0, policy_version 572578 (0.0019) [2024-04-28 04:12:52,730][54818] Updated weights for policy 0, policy_version 572588 (0.0016) [2024-04-28 04:12:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.7, 300 sec: 61148.4). Total num frames: 9381363712. Throughput: 0: 61042.0. Samples: 2286599880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:54,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 04:12:55,725][54818] Updated weights for policy 0, policy_version 572598 (0.0017) [2024-04-28 04:12:57,069][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (27900 times) [2024-04-28 04:12:58,012][54818] Updated weights for policy 0, policy_version 572608 (0.0023) [2024-04-28 04:12:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.9, 300 sec: 61148.5). Total num frames: 9381675008. Throughput: 0: 61394.4. Samples: 2286807820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:12:59,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 04:13:01,294][54818] Updated weights for policy 0, policy_version 572618 (0.0017) [2024-04-28 04:13:03,243][54818] Updated weights for policy 0, policy_version 572628 (0.0019) [2024-04-28 04:13:04,253][54587] Fps is (10 sec: 62259.8, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9381986304. Throughput: 0: 61349.8. Samples: 2287159760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:13:04,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 04:13:06,714][54818] Updated weights for policy 0, policy_version 572638 (0.0016) [2024-04-28 04:13:08,441][54818] Updated weights for policy 0, policy_version 572648 (0.0020) [2024-04-28 04:13:09,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 9382297600. 
Throughput: 0: 61088.0. Samples: 2287517640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:13:09,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 04:13:12,043][54818] Updated weights for policy 0, policy_version 572658 (0.0016) [2024-04-28 04:13:12,185][54798] Signal inference workers to stop experience collection... (38000 times) [2024-04-28 04:13:12,226][54818] InferenceWorker_p0-w0: stopping experience collection (38000 times) [2024-04-28 04:13:12,275][54798] Signal inference workers to resume experience collection... (38000 times) [2024-04-28 04:13:12,276][54818] InferenceWorker_p0-w0: resuming experience collection (38000 times) [2024-04-28 04:13:13,756][54818] Updated weights for policy 0, policy_version 572668 (0.0019) [2024-04-28 04:13:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9382592512. Throughput: 0: 61279.1. Samples: 2287721240. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:14,253][54587] Avg episode reward: [(0, '0.675')] [2024-04-28 04:13:17,184][54818] Updated weights for policy 0, policy_version 572678 (0.0017) [2024-04-28 04:13:19,175][54818] Updated weights for policy 0, policy_version 572688 (0.0019) [2024-04-28 04:13:19,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9382920192. Throughput: 0: 61173.6. Samples: 2288077040. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:19,255][54587] Avg episode reward: [(0, '0.657')] [2024-04-28 04:13:22,428][54818] Updated weights for policy 0, policy_version 572698 (0.0016) [2024-04-28 04:13:23,192][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28000 times) [2024-04-28 04:13:24,253][54587] Fps is (10 sec: 63897.5, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9383231488. Throughput: 0: 61137.5. Samples: 2288433480. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:24,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 04:13:25,152][54818] Updated weights for policy 0, policy_version 572708 (0.0017) [2024-04-28 04:13:27,653][54818] Updated weights for policy 0, policy_version 572718 (0.0017) [2024-04-28 04:13:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9383526400. Throughput: 0: 61248.8. Samples: 2288641840. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:29,255][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:13:30,540][54818] Updated weights for policy 0, policy_version 572728 (0.0015) [2024-04-28 04:13:32,870][54818] Updated weights for policy 0, policy_version 572738 (0.0016) [2024-04-28 04:13:34,253][54587] Fps is (10 sec: 58982.2, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9383821312. Throughput: 0: 61191.6. Samples: 2289000220. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:34,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 04:13:34,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (1500 times) [2024-04-28 04:13:36,251][54818] Updated weights for policy 0, policy_version 572748 (0.0018) [2024-04-28 04:13:38,156][54818] Updated weights for policy 0, policy_version 572758 (0.0017) [2024-04-28 04:13:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9384148992. Throughput: 0: 61255.0. Samples: 2289356360. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:39,254][54587] Avg episode reward: [(0, '0.552')] [2024-04-28 04:13:41,444][54818] Updated weights for policy 0, policy_version 572768 (0.0016) [2024-04-28 04:13:43,348][54818] Updated weights for policy 0, policy_version 572778 (0.0015) [2024-04-28 04:13:44,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9384443904. Throughput: 0: 61142.6. 
Samples: 2289559240. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:44,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 04:13:46,599][54818] Updated weights for policy 0, policy_version 572788 (0.0018) [2024-04-28 04:13:48,620][54818] Updated weights for policy 0, policy_version 572798 (0.0021) [2024-04-28 04:13:49,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9384755200. Throughput: 0: 61291.4. Samples: 2289917880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:49,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 04:13:49,263][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000572800_9384755200.pth... [2024-04-28 04:13:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000571903_9370058752.pth [2024-04-28 04:13:50,726][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28100 times) [2024-04-28 04:13:51,432][54798] Signal inference workers to stop experience collection... (38050 times) [2024-04-28 04:13:51,433][54798] Signal inference workers to resume experience collection... (38050 times) [2024-04-28 04:13:51,444][54818] InferenceWorker_p0-w0: stopping experience collection (38050 times) [2024-04-28 04:13:51,444][54818] InferenceWorker_p0-w0: resuming experience collection (38050 times) [2024-04-28 04:13:51,917][54818] Updated weights for policy 0, policy_version 572808 (0.0016) [2024-04-28 04:13:53,890][54818] Updated weights for policy 0, policy_version 572818 (0.0017) [2024-04-28 04:13:54,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.2, 300 sec: 61204.0). Total num frames: 9385050112. Throughput: 0: 61300.2. Samples: 2290276140. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:54,255][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 04:13:57,273][54818] Updated weights for policy 0, policy_version 572828 (0.0017) [2024-04-28 04:13:59,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9385361408. Throughput: 0: 61136.0. Samples: 2290472360. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:13:59,254][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 04:13:59,377][54818] Updated weights for policy 0, policy_version 572838 (0.0021) [2024-04-28 04:14:02,433][54818] Updated weights for policy 0, policy_version 572848 (0.0017) [2024-04-28 04:14:04,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9385672704. Throughput: 0: 61334.4. Samples: 2290837080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:14:04,254][54587] Avg episode reward: [(0, '0.536')] [2024-04-28 04:14:05,020][54818] Updated weights for policy 0, policy_version 572858 (0.0018) [2024-04-28 04:14:07,685][54818] Updated weights for policy 0, policy_version 572868 (0.0018) [2024-04-28 04:14:09,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9385967616. Throughput: 0: 61373.6. Samples: 2291195300. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:14:09,254][54587] Avg episode reward: [(0, '0.693')] [2024-04-28 04:14:11,041][54818] Updated weights for policy 0, policy_version 572878 (0.0021) [2024-04-28 04:14:13,062][54818] Updated weights for policy 0, policy_version 572888 (0.0017) [2024-04-28 04:14:14,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61439.8, 300 sec: 61259.5). Total num frames: 9386278912. Throughput: 0: 61014.2. Samples: 2291387480. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:14:14,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 04:14:16,272][54818] Updated weights for policy 0, policy_version 572898 (0.0017) [2024-04-28 04:14:17,375][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28200 times) [2024-04-28 04:14:18,469][54818] Updated weights for policy 0, policy_version 572908 (0.0018) [2024-04-28 04:14:19,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61167.1, 300 sec: 61315.0). Total num frames: 9386590208. Throughput: 0: 61150.3. Samples: 2291751980. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:14:19,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 04:14:21,589][54818] Updated weights for policy 0, policy_version 572918 (0.0018) [2024-04-28 04:14:23,745][54818] Updated weights for policy 0, policy_version 572928 (0.0016) [2024-04-28 04:14:24,253][54587] Fps is (10 sec: 60621.9, 60 sec: 60893.9, 300 sec: 61259.5). Total num frames: 9386885120. Throughput: 0: 61323.9. Samples: 2292115920. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:14:24,253][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 04:14:25,965][54798] Signal inference workers to stop experience collection... (38100 times) [2024-04-28 04:14:26,007][54818] InferenceWorker_p0-w0: stopping experience collection (38100 times) [2024-04-28 04:14:26,023][54798] Signal inference workers to resume experience collection... (38100 times) [2024-04-28 04:14:26,023][54818] InferenceWorker_p0-w0: resuming experience collection (38100 times) [2024-04-28 04:14:26,747][54818] Updated weights for policy 0, policy_version 572938 (0.0016) [2024-04-28 04:14:29,007][54818] Updated weights for policy 0, policy_version 572948 (0.0016) [2024-04-28 04:14:29,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 9387196416. Throughput: 0: 61047.2. Samples: 2292306360. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:14:29,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:14:32,161][54818] Updated weights for policy 0, policy_version 572958 (0.0016) [2024-04-28 04:14:34,231][54818] Updated weights for policy 0, policy_version 572968 (0.0019) [2024-04-28 04:14:34,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9387507712. Throughput: 0: 61183.8. Samples: 2292671140. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:14:34,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 04:14:37,546][54818] Updated weights for policy 0, policy_version 572978 (0.0015) [2024-04-28 04:14:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60894.1, 300 sec: 61259.5). Total num frames: 9387802624. Throughput: 0: 61249.3. Samples: 2293032360. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-04-28 04:14:39,253][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 04:14:39,534][54818] Updated weights for policy 0, policy_version 572988 (0.0017) [2024-04-28 04:14:42,659][54818] Updated weights for policy 0, policy_version 572998 (0.0019) [2024-04-28 04:14:43,689][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28300 times) [2024-04-28 04:14:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9388113920. Throughput: 0: 61146.2. Samples: 2293223940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:14:44,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 04:14:44,960][54818] Updated weights for policy 0, policy_version 573008 (0.0017) [2024-04-28 04:14:48,108][54818] Updated weights for policy 0, policy_version 573018 (0.0017) [2024-04-28 04:14:49,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 9388408832. Throughput: 0: 61129.0. Samples: 2293587880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:14:49,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 04:14:49,259][54587] No heartbeat for components: RolloutWorker_w4 (24637 seconds), RolloutWorker_w5 (10737 seconds) [2024-04-28 04:14:50,641][54818] Updated weights for policy 0, policy_version 573028 (0.0018) [2024-04-28 04:14:53,331][54818] Updated weights for policy 0, policy_version 573038 (0.0016) [2024-04-28 04:14:54,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9388720128. Throughput: 0: 61248.5. Samples: 2293951480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:14:54,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 04:14:56,215][54818] Updated weights for policy 0, policy_version 573048 (0.0015) [2024-04-28 04:14:58,776][54818] Updated weights for policy 0, policy_version 573058 (0.0017) [2024-04-28 04:14:59,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9389031424. Throughput: 0: 61217.3. Samples: 2294142260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:14:59,254][54587] Avg episode reward: [(0, '0.720')] [2024-04-28 04:15:01,838][54818] Updated weights for policy 0, policy_version 573068 (0.0017) [2024-04-28 04:15:03,864][54818] Updated weights for policy 0, policy_version 573078 (0.0017) [2024-04-28 04:15:04,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9389326336. Throughput: 0: 61110.7. Samples: 2294501960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:04,253][54587] Avg episode reward: [(0, '0.682')] [2024-04-28 04:15:07,011][54818] Updated weights for policy 0, policy_version 573088 (0.0016) [2024-04-28 04:15:09,168][54818] Updated weights for policy 0, policy_version 573098 (0.0016) [2024-04-28 04:15:09,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61167.0, 300 sec: 61203.9). Total num frames: 9389637632. Throughput: 0: 61193.2. 
Samples: 2294869620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:09,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 04:15:10,843][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28400 times) [2024-04-28 04:15:12,247][54818] Updated weights for policy 0, policy_version 573108 (0.0015) [2024-04-28 04:15:14,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9389948928. Throughput: 0: 61132.0. Samples: 2295057300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:14,254][54587] Avg episode reward: [(0, '0.527')] [2024-04-28 04:15:14,399][54818] Updated weights for policy 0, policy_version 573118 (0.0017) [2024-04-28 04:15:17,742][54818] Updated weights for policy 0, policy_version 573128 (0.0015) [2024-04-28 04:15:18,141][54798] Signal inference workers to stop experience collection... (38150 times) [2024-04-28 04:15:18,145][54798] Signal inference workers to resume experience collection... (38150 times) [2024-04-28 04:15:18,154][54818] InferenceWorker_p0-w0: stopping experience collection (38150 times) [2024-04-28 04:15:18,154][54818] InferenceWorker_p0-w0: resuming experience collection (38150 times) [2024-04-28 04:15:19,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9390243840. Throughput: 0: 61252.5. Samples: 2295427500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:19,253][54587] Avg episode reward: [(0, '0.513')] [2024-04-28 04:15:19,814][54818] Updated weights for policy 0, policy_version 573138 (0.0018) [2024-04-28 04:15:23,080][54818] Updated weights for policy 0, policy_version 573148 (0.0017) [2024-04-28 04:15:24,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9390555136. Throughput: 0: 61267.1. Samples: 2295789380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:24,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 04:15:25,433][54818] Updated weights for policy 0, policy_version 573158 (0.0019) [2024-04-28 04:15:28,381][54818] Updated weights for policy 0, policy_version 573168 (0.0015) [2024-04-28 04:15:29,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9390866432. Throughput: 0: 61052.4. Samples: 2295971300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:29,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 04:15:30,813][54818] Updated weights for policy 0, policy_version 573178 (0.0017) [2024-04-28 04:15:33,520][54818] Updated weights for policy 0, policy_version 573188 (0.0015) [2024-04-28 04:15:34,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61166.8, 300 sec: 61315.0). Total num frames: 9391177728. Throughput: 0: 61332.2. Samples: 2296347840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:34,255][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 04:15:36,143][54818] Updated weights for policy 0, policy_version 573198 (0.0016) [2024-04-28 04:15:37,743][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28500 times) [2024-04-28 04:15:38,784][54818] Updated weights for policy 0, policy_version 573208 (0.0017) [2024-04-28 04:15:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9391472640. Throughput: 0: 61343.1. Samples: 2296711920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:39,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 04:15:41,498][54818] Updated weights for policy 0, policy_version 573218 (0.0016) [2024-04-28 04:15:44,098][54818] Updated weights for policy 0, policy_version 573228 (0.0015) [2024-04-28 04:15:44,253][54587] Fps is (10 sec: 58982.6, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9391767552. 
Throughput: 0: 61078.7. Samples: 2296890800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:44,255][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 04:15:46,610][54818] Updated weights for policy 0, policy_version 573238 (0.0018) [2024-04-28 04:15:49,253][54587] Fps is (10 sec: 58982.3, 60 sec: 60893.7, 300 sec: 61203.9). Total num frames: 9392062464. Throughput: 0: 61432.7. Samples: 2297266440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:49,254][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 04:15:49,321][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000573247_9392078848.pth... [2024-04-28 04:15:49,375][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000572352_9377415168.pth [2024-04-28 04:15:49,529][54818] Updated weights for policy 0, policy_version 573248 (0.0017) [2024-04-28 04:15:52,464][54818] Updated weights for policy 0, policy_version 573258 (0.0018) [2024-04-28 04:15:54,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9392373760. Throughput: 0: 61272.1. Samples: 2297626860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:54,253][54587] Avg episode reward: [(0, '0.710')] [2024-04-28 04:15:54,768][54818] Updated weights for policy 0, policy_version 573268 (0.0016) [2024-04-28 04:15:57,724][54818] Updated weights for policy 0, policy_version 573278 (0.0017) [2024-04-28 04:15:59,253][54587] Fps is (10 sec: 62259.8, 60 sec: 60894.0, 300 sec: 61259.5). Total num frames: 9392685056. Throughput: 0: 61077.4. Samples: 2297805780. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:15:59,253][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 04:16:00,147][54818] Updated weights for policy 0, policy_version 573288 (0.0016) [2024-04-28 04:16:03,116][54818] Updated weights for policy 0, policy_version 573298 (0.0015) [2024-04-28 04:16:04,196][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28600 times) [2024-04-28 04:16:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9392979968. Throughput: 0: 61280.5. Samples: 2298185120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:16:04,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 04:16:05,429][54818] Updated weights for policy 0, policy_version 573308 (0.0017) [2024-04-28 04:16:08,343][54818] Updated weights for policy 0, policy_version 573318 (0.0016) [2024-04-28 04:16:09,175][54798] Signal inference workers to stop experience collection... (38200 times) [2024-04-28 04:16:09,183][54818] InferenceWorker_p0-w0: stopping experience collection (38200 times) [2024-04-28 04:16:09,253][54587] Fps is (10 sec: 58982.1, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9393274880. Throughput: 0: 61486.2. Samples: 2298556260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:16:09,254][54587] Avg episode reward: [(0, '0.470')] [2024-04-28 04:16:09,266][54798] Signal inference workers to resume experience collection... (38200 times) [2024-04-28 04:16:09,266][54818] InferenceWorker_p0-w0: resuming experience collection (38200 times) [2024-04-28 04:16:10,592][54818] Updated weights for policy 0, policy_version 573328 (0.0018) [2024-04-28 04:16:13,902][54818] Updated weights for policy 0, policy_version 573338 (0.0017) [2024-04-28 04:16:14,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9393586176. Throughput: 0: 61325.3. Samples: 2298730940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-04-28 04:16:14,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 04:16:15,974][54818] Updated weights for policy 0, policy_version 573348 (0.0016) [2024-04-28 04:16:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.7, 300 sec: 61092.9). Total num frames: 9393881088. Throughput: 0: 61095.2. Samples: 2299097120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:16:19,254][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 04:16:19,296][54818] Updated weights for policy 0, policy_version 573358 (0.0016) [2024-04-28 04:16:21,168][54818] Updated weights for policy 0, policy_version 573368 (0.0016) [2024-04-28 04:16:24,254][54587] Fps is (10 sec: 62256.6, 60 sec: 60893.4, 300 sec: 61148.3). Total num frames: 9394208768. Throughput: 0: 61425.2. Samples: 2299476080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:16:24,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 04:16:24,368][54818] Updated weights for policy 0, policy_version 573378 (0.0017) [2024-04-28 04:16:26,698][54818] Updated weights for policy 0, policy_version 573388 (0.0017) [2024-04-28 04:16:29,253][54587] Fps is (10 sec: 62258.6, 60 sec: 60620.7, 300 sec: 61092.9). Total num frames: 9394503680. Throughput: 0: 61276.8. Samples: 2299648260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:16:29,254][54587] Avg episode reward: [(0, '0.518')] [2024-04-28 04:16:29,582][54818] Updated weights for policy 0, policy_version 573398 (0.0016) [2024-04-28 04:16:30,650][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28700 times) [2024-04-28 04:16:32,266][54818] Updated weights for policy 0, policy_version 573408 (0.0021) [2024-04-28 04:16:34,253][54587] Fps is (10 sec: 60623.6, 60 sec: 60620.9, 300 sec: 61148.4). Total num frames: 9394814976. Throughput: 0: 61054.8. Samples: 2300013900. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:16:34,254][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 04:16:34,973][54818] Updated weights for policy 0, policy_version 573418 (0.0016) [2024-04-28 04:16:37,498][54818] Updated weights for policy 0, policy_version 573428 (0.0016) [2024-04-28 04:16:39,253][54587] Fps is (10 sec: 60622.0, 60 sec: 60620.9, 300 sec: 61092.9). Total num frames: 9395109888. Throughput: 0: 61613.8. Samples: 2300399480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:16:39,253][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 04:16:40,180][54818] Updated weights for policy 0, policy_version 573438 (0.0017) [2024-04-28 04:16:43,168][54818] Updated weights for policy 0, policy_version 573448 (0.0016) [2024-04-28 04:16:44,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9395421184. Throughput: 0: 61399.6. Samples: 2300568760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:16:44,253][54587] Avg episode reward: [(0, '0.650')] [2024-04-28 04:16:45,365][54818] Updated weights for policy 0, policy_version 573458 (0.0019) [2024-04-28 04:16:48,714][54818] Updated weights for policy 0, policy_version 573468 (0.0019) [2024-04-28 04:16:49,254][54587] Fps is (10 sec: 62257.4, 60 sec: 61166.8, 300 sec: 61092.8). Total num frames: 9395732480. Throughput: 0: 61182.7. Samples: 2300938360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:16:49,254][54587] Avg episode reward: [(0, '0.511')] [2024-04-28 04:16:49,609][54798] Signal inference workers to stop experience collection... (38250 times) [2024-04-28 04:16:49,610][54798] Signal inference workers to resume experience collection... 
(38250 times) [2024-04-28 04:16:49,616][54818] InferenceWorker_p0-w0: stopping experience collection (38250 times) [2024-04-28 04:16:49,626][54818] InferenceWorker_p0-w0: resuming experience collection (38250 times) [2024-04-28 04:16:50,641][54818] Updated weights for policy 0, policy_version 573478 (0.0016) [2024-04-28 04:16:54,101][54818] Updated weights for policy 0, policy_version 573488 (0.0016) [2024-04-28 04:16:54,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 9396027392. Throughput: 0: 61426.2. Samples: 2301320440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:16:54,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 04:16:55,917][54818] Updated weights for policy 0, policy_version 573498 (0.0021) [2024-04-28 04:16:57,338][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28800 times) [2024-04-28 04:16:59,253][54587] Fps is (10 sec: 60622.2, 60 sec: 60893.8, 300 sec: 61037.4). Total num frames: 9396338688. Throughput: 0: 61168.9. Samples: 2301483540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:16:59,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 04:16:59,567][54818] Updated weights for policy 0, policy_version 573508 (0.0018) [2024-04-28 04:17:01,324][54818] Updated weights for policy 0, policy_version 573518 (0.0016) [2024-04-28 04:17:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 9396633600. Throughput: 0: 61227.2. Samples: 2301852340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:17:04,256][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 04:17:04,789][54818] Updated weights for policy 0, policy_version 573528 (0.0019) [2024-04-28 04:17:06,609][54818] Updated weights for policy 0, policy_version 573538 (0.0017) [2024-04-28 04:17:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.9, 300 sec: 61037.4). Total num frames: 9396944896. 
Throughput: 0: 61369.0. Samples: 2302237660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:17:09,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 04:17:09,940][54818] Updated weights for policy 0, policy_version 573548 (0.0017) [2024-04-28 04:17:12,127][54818] Updated weights for policy 0, policy_version 573558 (0.0020) [2024-04-28 04:17:14,253][54587] Fps is (10 sec: 60621.2, 60 sec: 60894.0, 300 sec: 61037.4). Total num frames: 9397239808. Throughput: 0: 61153.6. Samples: 2302400160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:17:14,253][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 04:17:15,165][54818] Updated weights for policy 0, policy_version 573568 (0.0016) [2024-04-28 04:17:17,370][54818] Updated weights for policy 0, policy_version 573578 (0.0017) [2024-04-28 04:17:19,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.0, 300 sec: 61037.4). Total num frames: 9397551104. Throughput: 0: 61335.6. Samples: 2302774000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:17:19,253][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 04:17:20,424][54818] Updated weights for policy 0, policy_version 573588 (0.0017) [2024-04-28 04:17:23,665][54818] Updated weights for policy 0, policy_version 573598 (0.0016) [2024-04-28 04:17:24,253][54587] Fps is (10 sec: 62258.9, 60 sec: 60894.3, 300 sec: 61037.3). Total num frames: 9397862400. Throughput: 0: 61372.4. Samples: 2303161240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:17:24,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 04:17:24,661][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (28900 times) [2024-04-28 04:17:25,406][54798] Signal inference workers to stop experience collection... 
(38300 times) [2024-04-28 04:17:25,446][54818] InferenceWorker_p0-w0: stopping experience collection (38300 times) [2024-04-28 04:17:25,500][54798] Signal inference workers to resume experience collection... (38300 times) [2024-04-28 04:17:25,501][54818] InferenceWorker_p0-w0: resuming experience collection (38300 times) [2024-04-28 04:17:25,620][54818] Updated weights for policy 0, policy_version 573608 (0.0020) [2024-04-28 04:17:29,160][54818] Updated weights for policy 0, policy_version 573618 (0.0017) [2024-04-28 04:17:29,253][54587] Fps is (10 sec: 60619.5, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 9398157312. Throughput: 0: 61101.0. Samples: 2303318320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:17:29,255][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 04:17:30,910][54818] Updated weights for policy 0, policy_version 573628 (0.0016) [2024-04-28 04:17:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9398468608. Throughput: 0: 61047.0. Samples: 2303685460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:17:34,255][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:17:34,405][54818] Updated weights for policy 0, policy_version 573638 (0.0018) [2024-04-28 04:17:36,267][54818] Updated weights for policy 0, policy_version 573648 (0.0017) [2024-04-28 04:17:39,253][54587] Fps is (10 sec: 62260.5, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9398779904. Throughput: 0: 61266.8. Samples: 2304077440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:17:39,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 04:17:39,459][54818] Updated weights for policy 0, policy_version 573658 (0.0018) [2024-04-28 04:17:41,372][54818] Updated weights for policy 0, policy_version 573668 (0.0015) [2024-04-28 04:17:44,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9399091200. Throughput: 0: 61151.6. 
Samples: 2304235360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 04:17:44,253][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 04:17:44,840][54818] Updated weights for policy 0, policy_version 573678 (0.0015) [2024-04-28 04:17:46,495][54818] Updated weights for policy 0, policy_version 573688 (0.0018) [2024-04-28 04:17:49,253][54587] Fps is (10 sec: 60619.6, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9399386112. Throughput: 0: 61259.8. Samples: 2304609040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:17:49,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 04:17:49,264][54587] No heartbeat for components: RolloutWorker_w4 (24817 seconds), RolloutWorker_w5 (10917 seconds) [2024-04-28 04:17:49,365][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000573694_9399402496.pth... [2024-04-28 04:17:49,422][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000572800_9384755200.pth [2024-04-28 04:17:50,055][54818] Updated weights for policy 0, policy_version 573698 (0.0018) [2024-04-28 04:17:50,844][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (29000 times) [2024-04-28 04:17:51,867][54818] Updated weights for policy 0, policy_version 573708 (0.0018) [2024-04-28 04:17:54,253][54587] Fps is (10 sec: 58982.6, 60 sec: 60894.0, 300 sec: 61037.4). Total num frames: 9399681024. Throughput: 0: 61149.5. Samples: 2304989380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:17:54,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 04:17:55,532][54818] Updated weights for policy 0, policy_version 573718 (0.0015) [2024-04-28 04:17:55,757][54798] Signal inference workers to stop experience collection... 
(38350 times) [2024-04-28 04:17:55,800][54818] InferenceWorker_p0-w0: stopping experience collection (38350 times) [2024-04-28 04:17:55,815][54798] Signal inference workers to resume experience collection... (38350 times) [2024-04-28 04:17:55,816][54818] InferenceWorker_p0-w0: resuming experience collection (38350 times) [2024-04-28 04:17:57,300][54818] Updated weights for policy 0, policy_version 573728 (0.0018) [2024-04-28 04:17:59,253][54587] Fps is (10 sec: 60621.3, 60 sec: 60893.8, 300 sec: 61037.3). Total num frames: 9399992320. Throughput: 0: 61205.2. Samples: 2305154400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:17:59,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 04:18:00,634][54818] Updated weights for policy 0, policy_version 573738 (0.0018) [2024-04-28 04:18:03,516][54818] Updated weights for policy 0, policy_version 573748 (0.0015) [2024-04-28 04:18:04,253][54587] Fps is (10 sec: 63897.0, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 9400320000. Throughput: 0: 61161.2. Samples: 2305526260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:04,254][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 04:18:05,851][54818] Updated weights for policy 0, policy_version 573758 (0.0017) [2024-04-28 04:18:09,223][54818] Updated weights for policy 0, policy_version 573768 (0.0017) [2024-04-28 04:18:09,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9400614912. Throughput: 0: 60966.6. Samples: 2305904740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:09,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 04:18:11,048][54818] Updated weights for policy 0, policy_version 573778 (0.0017) [2024-04-28 04:18:14,253][54587] Fps is (10 sec: 58981.7, 60 sec: 61166.7, 300 sec: 60981.8). Total num frames: 9400909824. Throughput: 0: 61182.7. Samples: 2306071540. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:14,257][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 04:18:14,605][54818] Updated weights for policy 0, policy_version 573788 (0.0016) [2024-04-28 04:18:16,342][54818] Updated weights for policy 0, policy_version 573798 (0.0019) [2024-04-28 04:18:17,390][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (29100 times) [2024-04-28 04:18:19,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.8, 300 sec: 60981.8). Total num frames: 9401221120. Throughput: 0: 61095.0. Samples: 2306434740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:19,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 04:18:19,939][54818] Updated weights for policy 0, policy_version 573808 (0.0016) [2024-04-28 04:18:21,632][54818] Updated weights for policy 0, policy_version 573818 (0.0017) [2024-04-28 04:18:24,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.8, 300 sec: 61037.3). Total num frames: 9401532416. Throughput: 0: 60891.4. Samples: 2306817560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:24,255][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 04:18:25,100][54818] Updated weights for policy 0, policy_version 573828 (0.0016) [2024-04-28 04:18:27,129][54818] Updated weights for policy 0, policy_version 573838 (0.0016) [2024-04-28 04:18:29,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61092.9). Total num frames: 9401843712. Throughput: 0: 61232.7. Samples: 2306990840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:29,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 04:18:29,971][54798] Signal inference workers to stop experience collection... (38400 times) [2024-04-28 04:18:29,972][54798] Signal inference workers to resume experience collection... 
(38400 times) [2024-04-28 04:18:29,984][54818] InferenceWorker_p0-w0: stopping experience collection (38400 times) [2024-04-28 04:18:29,984][54818] InferenceWorker_p0-w0: resuming experience collection (38400 times) [2024-04-28 04:18:30,382][54818] Updated weights for policy 0, policy_version 573848 (0.0018) [2024-04-28 04:18:32,481][54818] Updated weights for policy 0, policy_version 573858 (0.0015) [2024-04-28 04:18:34,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.0, 300 sec: 61037.4). Total num frames: 9402155008. Throughput: 0: 60899.7. Samples: 2307349520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:34,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 04:18:35,683][54818] Updated weights for policy 0, policy_version 573868 (0.0017) [2024-04-28 04:18:37,885][54818] Updated weights for policy 0, policy_version 573878 (0.0017) [2024-04-28 04:18:39,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.7, 300 sec: 61037.3). Total num frames: 9402449920. Throughput: 0: 60847.7. Samples: 2307727540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:39,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 04:18:40,845][54818] Updated weights for policy 0, policy_version 573888 (0.0016) [2024-04-28 04:18:43,355][54818] Updated weights for policy 0, policy_version 573898 (0.0015) [2024-04-28 04:18:44,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61037.4). Total num frames: 9402761216. Throughput: 0: 61184.6. Samples: 2307907700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:44,253][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 04:18:45,027][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (29200 times) [2024-04-28 04:18:46,264][54818] Updated weights for policy 0, policy_version 573908 (0.0019) [2024-04-28 04:18:48,907][54818] Updated weights for policy 0, policy_version 573918 (0.0016) [2024-04-28 04:18:49,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 9403072512. Throughput: 0: 60992.4. Samples: 2308270920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:49,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 04:18:51,538][54818] Updated weights for policy 0, policy_version 573928 (0.0018) [2024-04-28 04:18:54,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61713.0, 300 sec: 61092.9). Total num frames: 9403383808. Throughput: 0: 60839.2. Samples: 2308642500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:54,255][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 04:18:54,675][54818] Updated weights for policy 0, policy_version 573938 (0.0018) [2024-04-28 04:18:56,865][54818] Updated weights for policy 0, policy_version 573948 (0.0017) [2024-04-28 04:18:59,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.1, 300 sec: 61037.4). Total num frames: 9403678720. Throughput: 0: 61212.7. Samples: 2308826100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:18:59,253][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 04:18:59,743][54818] Updated weights for policy 0, policy_version 573958 (0.0016) [2024-04-28 04:19:02,171][54818] Updated weights for policy 0, policy_version 573968 (0.0016) [2024-04-28 04:19:04,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9404006400. Throughput: 0: 61178.3. Samples: 2309187760. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:19:04,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 04:19:04,909][54818] Updated weights for policy 0, policy_version 573978 (0.0015) [2024-04-28 04:19:07,459][54818] Updated weights for policy 0, policy_version 573988 (0.0017) [2024-04-28 04:19:09,253][54587] Fps is (10 sec: 62258.0, 60 sec: 61439.9, 300 sec: 61092.9). Total num frames: 9404301312. Throughput: 0: 60698.6. Samples: 2309549000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:19:09,255][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 04:19:10,658][54818] Updated weights for policy 0, policy_version 573998 (0.0016) [2024-04-28 04:19:11,781][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (29300 times) [2024-04-28 04:19:12,684][54818] Updated weights for policy 0, policy_version 574008 (0.0016) [2024-04-28 04:19:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61713.2, 300 sec: 61092.9). Total num frames: 9404612608. Throughput: 0: 61211.3. Samples: 2309745340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:19:14,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 04:19:15,822][54798] Signal inference workers to stop experience collection... (38450 times) [2024-04-28 04:19:15,822][54798] Signal inference workers to resume experience collection... (38450 times) [2024-04-28 04:19:15,843][54818] InferenceWorker_p0-w0: stopping experience collection (38450 times) [2024-04-28 04:19:15,843][54818] InferenceWorker_p0-w0: resuming experience collection (38450 times) [2024-04-28 04:19:15,929][54818] Updated weights for policy 0, policy_version 574018 (0.0020) [2024-04-28 04:19:18,180][54818] Updated weights for policy 0, policy_version 574028 (0.0016) [2024-04-28 04:19:19,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61713.2, 300 sec: 61148.4). Total num frames: 9404923904. Throughput: 0: 61176.9. Samples: 2310102480. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-04-28 04:19:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:19:21,343][54818] Updated weights for policy 0, policy_version 574038 (0.0016) [2024-04-28 04:19:23,332][54818] Updated weights for policy 0, policy_version 574048 (0.0017) [2024-04-28 04:19:24,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61713.0, 300 sec: 61148.4). Total num frames: 9405235200. Throughput: 0: 60997.8. Samples: 2310472440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:19:24,254][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 04:19:26,622][54818] Updated weights for policy 0, policy_version 574058 (0.0016) [2024-04-28 04:19:28,681][54818] Updated weights for policy 0, policy_version 574068 (0.0016) [2024-04-28 04:19:29,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61440.0, 300 sec: 61092.8). Total num frames: 9405530112. Throughput: 0: 61245.1. Samples: 2310663740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:19:29,254][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 04:19:31,928][54818] Updated weights for policy 0, policy_version 574078 (0.0017) [2024-04-28 04:19:34,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9405841408. Throughput: 0: 61194.6. Samples: 2311024680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:19:34,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 04:19:34,371][54818] Updated weights for policy 0, policy_version 574088 (0.0017) [2024-04-28 04:19:37,269][54818] Updated weights for policy 0, policy_version 574098 (0.0016) [2024-04-28 04:19:38,095][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (29400 times) [2024-04-28 04:19:39,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.1, 300 sec: 61148.4). Total num frames: 9406152704. Throughput: 0: 60904.7. Samples: 2311383220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:19:39,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 04:19:39,576][54818] Updated weights for policy 0, policy_version 574108 (0.0019) [2024-04-28 04:19:42,680][54818] Updated weights for policy 0, policy_version 574118 (0.0016) [2024-04-28 04:19:44,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61713.1, 300 sec: 61204.0). Total num frames: 9406464000. Throughput: 0: 61309.8. Samples: 2311585040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:19:44,253][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 04:19:45,309][54818] Updated weights for policy 0, policy_version 574128 (0.0017) [2024-04-28 04:19:47,891][54818] Updated weights for policy 0, policy_version 574138 (0.0018) [2024-04-28 04:19:49,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61712.9, 300 sec: 61203.9). Total num frames: 9406775296. Throughput: 0: 61273.2. Samples: 2311945060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:19:49,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 04:19:49,291][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000574145_9406791680.pth... [2024-04-28 04:19:49,342][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000573247_9392078848.pth [2024-04-28 04:19:50,631][54818] Updated weights for policy 0, policy_version 574148 (0.0015) [2024-04-28 04:19:53,221][54818] Updated weights for policy 0, policy_version 574158 (0.0017) [2024-04-28 04:19:54,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61713.0, 300 sec: 61204.0). Total num frames: 9407086592. Throughput: 0: 61271.2. Samples: 2312306200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:19:54,255][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 04:19:56,246][54818] Updated weights for policy 0, policy_version 574168 (0.0016) [2024-04-28 04:19:58,349][54818] Updated weights for policy 0, policy_version 574178 (0.0016) [2024-04-28 04:19:59,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61986.0, 300 sec: 61259.5). Total num frames: 9407397888. Throughput: 0: 61316.4. Samples: 2312504580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:19:59,254][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 04:20:01,564][54818] Updated weights for policy 0, policy_version 574188 (0.0017) [2024-04-28 04:20:03,234][54798] Signal inference workers to stop experience collection... (38500 times) [2024-04-28 04:20:03,278][54818] InferenceWorker_p0-w0: stopping experience collection (38500 times) [2024-04-28 04:20:03,329][54798] Signal inference workers to resume experience collection... (38500 times) [2024-04-28 04:20:03,329][54818] InferenceWorker_p0-w0: resuming experience collection (38500 times) [2024-04-28 04:20:03,607][54818] Updated weights for policy 0, policy_version 574198 (0.0018) [2024-04-28 04:20:04,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9407692800. Throughput: 0: 61485.8. Samples: 2312869340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:04,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 04:20:04,932][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (29500 times) [2024-04-28 04:20:06,945][54818] Updated weights for policy 0, policy_version 574208 (0.0015) [2024-04-28 04:20:08,895][54818] Updated weights for policy 0, policy_version 574218 (0.0017) [2024-04-28 04:20:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61713.1, 300 sec: 61204.0). Total num frames: 9408004096. Throughput: 0: 61197.0. Samples: 2313226300. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:09,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 04:20:12,194][54818] Updated weights for policy 0, policy_version 574228 (0.0015) [2024-04-28 04:20:14,121][54818] Updated weights for policy 0, policy_version 574238 (0.0017) [2024-04-28 04:20:14,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61712.9, 300 sec: 61259.5). Total num frames: 9408315392. Throughput: 0: 61260.0. Samples: 2313420440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:14,255][54587] Avg episode reward: [(0, '0.642')] [2024-04-28 04:20:17,415][54818] Updated weights for policy 0, policy_version 574248 (0.0017) [2024-04-28 04:20:19,254][54587] Fps is (10 sec: 62256.2, 60 sec: 61712.5, 300 sec: 61259.4). Total num frames: 9408626688. Throughput: 0: 61414.1. Samples: 2313788340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:19,255][54587] Avg episode reward: [(0, '0.670')] [2024-04-28 04:20:19,684][54818] Updated weights for policy 0, policy_version 574258 (0.0017) [2024-04-28 04:20:22,900][54818] Updated weights for policy 0, policy_version 574268 (0.0017) [2024-04-28 04:20:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.1, 300 sec: 61203.9). Total num frames: 9408921600. Throughput: 0: 61357.9. Samples: 2314144320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:24,254][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 04:20:24,991][54818] Updated weights for policy 0, policy_version 574278 (0.0016) [2024-04-28 04:20:28,147][54818] Updated weights for policy 0, policy_version 574288 (0.0016) [2024-04-28 04:20:29,253][54587] Fps is (10 sec: 60623.6, 60 sec: 61713.1, 300 sec: 61204.0). Total num frames: 9409232896. Throughput: 0: 61387.8. Samples: 2314347500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:29,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 04:20:30,822][54818] Updated weights for policy 0, policy_version 574298 (0.0018) [2024-04-28 04:20:32,194][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (29600 times) [2024-04-28 04:20:33,483][54818] Updated weights for policy 0, policy_version 574308 (0.0020) [2024-04-28 04:20:34,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9409544192. Throughput: 0: 61326.3. Samples: 2314704740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:34,255][54587] Avg episode reward: [(0, '0.486')] [2024-04-28 04:20:36,337][54818] Updated weights for policy 0, policy_version 574318 (0.0018) [2024-04-28 04:20:38,657][54818] Updated weights for policy 0, policy_version 574328 (0.0017) [2024-04-28 04:20:39,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.2, 300 sec: 61315.0). Total num frames: 9409855488. Throughput: 0: 61244.0. Samples: 2315062180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:39,255][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 04:20:41,833][54818] Updated weights for policy 0, policy_version 574338 (0.0016) [2024-04-28 04:20:42,267][54798] Signal inference workers to stop experience collection... (38550 times) [2024-04-28 04:20:42,309][54818] InferenceWorker_p0-w0: stopping experience collection (38550 times) [2024-04-28 04:20:42,325][54798] Signal inference workers to resume experience collection... (38550 times) [2024-04-28 04:20:42,325][54818] InferenceWorker_p0-w0: resuming experience collection (38550 times) [2024-04-28 04:20:43,763][54818] Updated weights for policy 0, policy_version 574348 (0.0020) [2024-04-28 04:20:44,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9410150400. Throughput: 0: 61202.9. Samples: 2315258700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:44,253][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 04:20:47,122][54818] Updated weights for policy 0, policy_version 574358 (0.0017) [2024-04-28 04:20:48,985][54818] Updated weights for policy 0, policy_version 574368 (0.0017) [2024-04-28 04:20:49,253][54587] Fps is (10 sec: 58982.1, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9410445312. Throughput: 0: 61169.6. Samples: 2315621980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 04:20:49,254][54587] Avg episode reward: [(0, '0.511')] [2024-04-28 04:20:49,263][54587] No heartbeat for components: RolloutWorker_w4 (24997 seconds), RolloutWorker_w5 (11097 seconds) [2024-04-28 04:20:52,484][54818] Updated weights for policy 0, policy_version 574378 (0.0017) [2024-04-28 04:20:54,253][54587] Fps is (10 sec: 60619.7, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9410756608. Throughput: 0: 61247.9. Samples: 2315982460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:20:54,255][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 04:20:54,337][54818] Updated weights for policy 0, policy_version 574388 (0.0017) [2024-04-28 04:20:57,687][54818] Updated weights for policy 0, policy_version 574398 (0.0016) [2024-04-28 04:20:58,509][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (29700 times) [2024-04-28 04:20:59,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 9411067904. Throughput: 0: 61337.5. Samples: 2316180620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:20:59,254][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 04:20:59,674][54818] Updated weights for policy 0, policy_version 574408 (0.0016) [2024-04-28 04:21:02,828][54818] Updated weights for policy 0, policy_version 574418 (0.0015) [2024-04-28 04:21:04,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61166.9, 300 sec: 61315.0). 
Total num frames: 9411362816. Throughput: 0: 61199.8. Samples: 2316542300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:04,255][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 04:21:05,012][54818] Updated weights for policy 0, policy_version 574428 (0.0015) [2024-04-28 04:21:08,130][54818] Updated weights for policy 0, policy_version 574438 (0.0019) [2024-04-28 04:21:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9411674112. Throughput: 0: 61390.2. Samples: 2316906880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:09,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 04:21:10,731][54818] Updated weights for policy 0, policy_version 574448 (0.0018) [2024-04-28 04:21:13,353][54818] Updated weights for policy 0, policy_version 574458 (0.0019) [2024-04-28 04:21:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61167.1, 300 sec: 61370.6). Total num frames: 9411985408. Throughput: 0: 61254.4. Samples: 2317103940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:14,253][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 04:21:16,122][54818] Updated weights for policy 0, policy_version 574468 (0.0016) [2024-04-28 04:21:18,671][54818] Updated weights for policy 0, policy_version 574478 (0.0016) [2024-04-28 04:21:19,243][54798] Signal inference workers to stop experience collection... (38600 times) [2024-04-28 04:21:19,247][54798] Signal inference workers to resume experience collection... (38600 times) [2024-04-28 04:21:19,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61167.4, 300 sec: 61315.1). Total num frames: 9412296704. Throughput: 0: 61449.0. Samples: 2317469940. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:19,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 04:21:19,260][54818] InferenceWorker_p0-w0: stopping experience collection (38600 times) [2024-04-28 04:21:19,260][54818] InferenceWorker_p0-w0: resuming experience collection (38600 times) [2024-04-28 04:21:21,771][54818] Updated weights for policy 0, policy_version 574488 (0.0018) [2024-04-28 04:21:23,861][54818] Updated weights for policy 0, policy_version 574498 (0.0017) [2024-04-28 04:21:24,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9412608000. Throughput: 0: 61388.5. Samples: 2317824660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:24,255][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 04:21:24,933][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (29800 times) [2024-04-28 04:21:27,252][54818] Updated weights for policy 0, policy_version 574508 (0.0019) [2024-04-28 04:21:29,210][54818] Updated weights for policy 0, policy_version 574518 (0.0016) [2024-04-28 04:21:29,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9412902912. Throughput: 0: 61427.3. Samples: 2318022940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:29,255][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 04:21:32,394][54818] Updated weights for policy 0, policy_version 574528 (0.0016) [2024-04-28 04:21:34,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61167.1, 300 sec: 61370.6). Total num frames: 9413214208. Throughput: 0: 61628.6. Samples: 2318395260. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:34,254][54587] Avg episode reward: [(0, '0.711')] [2024-04-28 04:21:34,282][54818] Updated weights for policy 0, policy_version 574538 (0.0017) [2024-04-28 04:21:37,812][54818] Updated weights for policy 0, policy_version 574548 (0.0016) [2024-04-28 04:21:39,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.8, 300 sec: 61370.5). Total num frames: 9413525504. Throughput: 0: 61622.2. Samples: 2318755460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:39,255][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 04:21:39,535][54818] Updated weights for policy 0, policy_version 574558 (0.0017) [2024-04-28 04:21:42,833][54818] Updated weights for policy 0, policy_version 574568 (0.0017) [2024-04-28 04:21:44,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9413836800. Throughput: 0: 61482.7. Samples: 2318947340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:44,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 04:21:44,766][54818] Updated weights for policy 0, policy_version 574578 (0.0018) [2024-04-28 04:21:48,179][54818] Updated weights for policy 0, policy_version 574588 (0.0016) [2024-04-28 04:21:49,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9414131712. Throughput: 0: 61685.0. Samples: 2319318120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:49,253][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 04:21:49,307][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000574594_9414148096.pth... [2024-04-28 04:21:49,363][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000573694_9399402496.pth [2024-04-28 04:21:50,493][54818] Updated weights for policy 0, policy_version 574598 (0.0015) [2024-04-28 04:21:52,231][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (29900 times) [2024-04-28 04:21:53,430][54818] Updated weights for policy 0, policy_version 574608 (0.0017) [2024-04-28 04:21:54,084][54798] Signal inference workers to stop experience collection... (38650 times) [2024-04-28 04:21:54,125][54818] InferenceWorker_p0-w0: stopping experience collection (38650 times) [2024-04-28 04:21:54,141][54798] Signal inference workers to resume experience collection... (38650 times) [2024-04-28 04:21:54,142][54818] InferenceWorker_p0-w0: resuming experience collection (38650 times) [2024-04-28 04:21:54,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.2, 300 sec: 61370.6). Total num frames: 9414443008. Throughput: 0: 61569.5. Samples: 2319677500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:54,255][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 04:21:54,256][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (1600 times) [2024-04-28 04:21:55,738][54818] Updated weights for policy 0, policy_version 574618 (0.0017) [2024-04-28 04:21:58,653][54818] Updated weights for policy 0, policy_version 574628 (0.0017) [2024-04-28 04:21:59,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61439.9, 300 sec: 61426.1). Total num frames: 9414754304. Throughput: 0: 61422.5. Samples: 2319867960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:21:59,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 04:22:01,577][54818] Updated weights for policy 0, policy_version 574638 (0.0016) [2024-04-28 04:22:04,046][54818] Updated weights for policy 0, policy_version 574648 (0.0018) [2024-04-28 04:22:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9415049216. Throughput: 0: 61417.9. Samples: 2320233740. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:22:04,253][54587] Avg episode reward: [(0, '0.486')] [2024-04-28 04:22:06,813][54818] Updated weights for policy 0, policy_version 574658 (0.0017) [2024-04-28 04:22:09,126][54818] Updated weights for policy 0, policy_version 574668 (0.0016) [2024-04-28 04:22:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9415360512. Throughput: 0: 61651.4. Samples: 2320598980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:22:09,254][54587] Avg episode reward: [(0, '0.638')] [2024-04-28 04:22:12,290][54818] Updated weights for policy 0, policy_version 574678 (0.0017) [2024-04-28 04:22:14,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61439.8, 300 sec: 61426.1). Total num frames: 9415671808. Throughput: 0: 61276.9. Samples: 2320780400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:22:14,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 04:22:14,403][54818] Updated weights for policy 0, policy_version 574688 (0.0016) [2024-04-28 04:22:17,623][54818] Updated weights for policy 0, policy_version 574698 (0.0016) [2024-04-28 04:22:18,719][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (30000 times) [2024-04-28 04:22:19,253][54587] Fps is (10 sec: 60621.7, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9415966720. Throughput: 0: 61233.4. Samples: 2321150760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 04:22:19,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 04:22:19,806][54818] Updated weights for policy 0, policy_version 574708 (0.0018) [2024-04-28 04:22:22,945][54818] Updated weights for policy 0, policy_version 574718 (0.0016) [2024-04-28 04:22:24,253][54587] Fps is (10 sec: 58982.6, 60 sec: 60893.8, 300 sec: 61370.6). Total num frames: 9416261632. Throughput: 0: 61515.6. Samples: 2321523660. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:22:24,255][54587] Avg episode reward: [(0, '0.555')] [2024-04-28 04:22:25,171][54818] Updated weights for policy 0, policy_version 574728 (0.0019) [2024-04-28 04:22:28,406][54818] Updated weights for policy 0, policy_version 574738 (0.0016) [2024-04-28 04:22:29,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.1, 300 sec: 61370.6). Total num frames: 9416572928. Throughput: 0: 61204.0. Samples: 2321701520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:22:29,253][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 04:22:30,263][54798] Signal inference workers to stop experience collection... (38700 times) [2024-04-28 04:22:30,264][54798] Signal inference workers to resume experience collection... (38700 times) [2024-04-28 04:22:30,278][54818] InferenceWorker_p0-w0: stopping experience collection (38700 times) [2024-04-28 04:22:30,278][54818] InferenceWorker_p0-w0: resuming experience collection (38700 times) [2024-04-28 04:22:30,382][54818] Updated weights for policy 0, policy_version 574748 (0.0016) [2024-04-28 04:22:33,668][54818] Updated weights for policy 0, policy_version 574758 (0.0020) [2024-04-28 04:22:34,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9416884224. Throughput: 0: 61270.6. Samples: 2322075300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:22:34,254][54587] Avg episode reward: [(0, '0.529')] [2024-04-28 04:22:35,689][54818] Updated weights for policy 0, policy_version 574768 (0.0017) [2024-04-28 04:22:38,760][54818] Updated weights for policy 0, policy_version 574778 (0.0019) [2024-04-28 04:22:39,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61167.0, 300 sec: 61370.5). Total num frames: 9417195520. Throughput: 0: 61487.4. Samples: 2322444440. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:22:39,255][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 04:22:41,242][54818] Updated weights for policy 0, policy_version 574788 (0.0017) [2024-04-28 04:22:44,233][54818] Updated weights for policy 0, policy_version 574798 (0.0016) [2024-04-28 04:22:44,253][54587] Fps is (10 sec: 60619.5, 60 sec: 60893.6, 300 sec: 61370.6). Total num frames: 9417490432. Throughput: 0: 61222.5. Samples: 2322622980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:22:44,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 04:22:45,001][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (30100 times) [2024-04-28 04:22:46,547][54818] Updated weights for policy 0, policy_version 574808 (0.0015) [2024-04-28 04:22:49,253][54587] Fps is (10 sec: 58982.8, 60 sec: 60893.8, 300 sec: 61370.6). Total num frames: 9417785344. Throughput: 0: 61460.4. Samples: 2322999460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:22:49,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:22:49,414][54818] Updated weights for policy 0, policy_version 574818 (0.0017) [2024-04-28 04:22:52,340][54818] Updated weights for policy 0, policy_version 574828 (0.0015) [2024-04-28 04:22:54,253][54587] Fps is (10 sec: 60622.3, 60 sec: 60893.9, 300 sec: 61370.6). Total num frames: 9418096640. Throughput: 0: 61501.1. Samples: 2323366520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:22:54,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 04:22:54,675][54818] Updated weights for policy 0, policy_version 574838 (0.0017) [2024-04-28 04:22:57,519][54818] Updated weights for policy 0, policy_version 574848 (0.0018) [2024-04-28 04:22:59,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60620.8, 300 sec: 61259.5). Total num frames: 9418391552. Throughput: 0: 61450.3. Samples: 2323545660. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:22:59,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 04:22:59,854][54818] Updated weights for policy 0, policy_version 574858 (0.0017) [2024-04-28 04:23:02,921][54818] Updated weights for policy 0, policy_version 574868 (0.0016) [2024-04-28 04:23:04,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9418719232. Throughput: 0: 61526.5. Samples: 2323919460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:04,255][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 04:23:05,214][54818] Updated weights for policy 0, policy_version 574878 (0.0017) [2024-04-28 04:23:08,071][54818] Updated weights for policy 0, policy_version 574888 (0.0016) [2024-04-28 04:23:09,253][54587] Fps is (10 sec: 62259.7, 60 sec: 60894.0, 300 sec: 61370.6). Total num frames: 9419014144. Throughput: 0: 61509.5. Samples: 2324291580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:09,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:23:10,456][54818] Updated weights for policy 0, policy_version 574898 (0.0018) [2024-04-28 04:23:11,817][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (30200 times) [2024-04-28 04:23:13,425][54818] Updated weights for policy 0, policy_version 574908 (0.0016) [2024-04-28 04:23:14,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60893.8, 300 sec: 61370.6). Total num frames: 9419325440. Throughput: 0: 61452.2. Samples: 2324466880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:14,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 04:23:15,659][54818] Updated weights for policy 0, policy_version 574918 (0.0016) [2024-04-28 04:23:15,664][54798] Signal inference workers to stop experience collection... (38750 times) [2024-04-28 04:23:15,668][54798] Signal inference workers to resume experience collection... 
(38750 times) [2024-04-28 04:23:15,682][54818] InferenceWorker_p0-w0: stopping experience collection (38750 times) [2024-04-28 04:23:15,682][54818] InferenceWorker_p0-w0: resuming experience collection (38750 times) [2024-04-28 04:23:18,874][54818] Updated weights for policy 0, policy_version 574928 (0.0018) [2024-04-28 04:23:19,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.8, 300 sec: 61370.6). Total num frames: 9419636736. Throughput: 0: 61581.2. Samples: 2324846460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:19,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 04:23:20,816][54818] Updated weights for policy 0, policy_version 574938 (0.0019) [2024-04-28 04:23:24,042][54818] Updated weights for policy 0, policy_version 574948 (0.0016) [2024-04-28 04:23:24,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9419948032. Throughput: 0: 61590.1. Samples: 2325216000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:24,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 04:23:26,229][54818] Updated weights for policy 0, policy_version 574958 (0.0017) [2024-04-28 04:23:29,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9420259328. Throughput: 0: 61601.6. Samples: 2325395040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:29,254][54587] Avg episode reward: [(0, '0.681')] [2024-04-28 04:23:29,508][54818] Updated weights for policy 0, policy_version 574968 (0.0016) [2024-04-28 04:23:31,780][54818] Updated weights for policy 0, policy_version 574978 (0.0016) [2024-04-28 04:23:34,253][54587] Fps is (10 sec: 62260.4, 60 sec: 61440.0, 300 sec: 61426.2). Total num frames: 9420570624. Throughput: 0: 61521.0. Samples: 2325767900. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:34,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 04:23:34,801][54818] Updated weights for policy 0, policy_version 574988 (0.0015) [2024-04-28 04:23:37,381][54818] Updated weights for policy 0, policy_version 574998 (0.0015) [2024-04-28 04:23:38,587][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (30300 times) [2024-04-28 04:23:39,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 61426.1). Total num frames: 9420881920. Throughput: 0: 61629.1. Samples: 2326139840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:39,255][54587] Avg episode reward: [(0, '0.655')] [2024-04-28 04:23:39,891][54818] Updated weights for policy 0, policy_version 575008 (0.0021) [2024-04-28 04:23:42,552][54818] Updated weights for policy 0, policy_version 575018 (0.0018) [2024-04-28 04:23:44,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9421176832. Throughput: 0: 61524.0. Samples: 2326314240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:44,255][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:23:45,248][54818] Updated weights for policy 0, policy_version 575028 (0.0016) [2024-04-28 04:23:48,044][54818] Updated weights for policy 0, policy_version 575038 (0.0017) [2024-04-28 04:23:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61713.0, 300 sec: 61370.5). Total num frames: 9421488128. Throughput: 0: 61564.3. Samples: 2326689860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:49,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 04:23:49,263][54587] No heartbeat for components: RolloutWorker_w4 (25177 seconds), RolloutWorker_w5 (11277 seconds) [2024-04-28 04:23:49,264][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000575042_9421488128.pth... 
[2024-04-28 04:23:49,322][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000574145_9406791680.pth [2024-04-28 04:23:50,482][54818] Updated weights for policy 0, policy_version 575048 (0.0015) [2024-04-28 04:23:53,359][54818] Updated weights for policy 0, policy_version 575058 (0.0016) [2024-04-28 04:23:54,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61712.9, 300 sec: 61426.1). Total num frames: 9421799424. Throughput: 0: 61512.7. Samples: 2327059660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 04:23:54,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 04:23:55,709][54818] Updated weights for policy 0, policy_version 575068 (0.0016) [2024-04-28 04:23:58,576][54818] Updated weights for policy 0, policy_version 575078 (0.0016) [2024-04-28 04:23:59,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61986.2, 300 sec: 61370.6). Total num frames: 9422110720. Throughput: 0: 61545.9. Samples: 2327236440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:23:59,255][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 04:24:00,952][54818] Updated weights for policy 0, policy_version 575088 (0.0016) [2024-04-28 04:24:03,962][54818] Updated weights for policy 0, policy_version 575098 (0.0016) [2024-04-28 04:24:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9422405632. Throughput: 0: 61630.2. Samples: 2327619820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:04,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 04:24:04,955][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (30400 times) [2024-04-28 04:24:04,960][54798] Signal inference workers to stop experience collection... (38800 times) [2024-04-28 04:24:04,960][54798] Signal inference workers to resume experience collection... 
(38800 times) [2024-04-28 04:24:04,968][54818] InferenceWorker_p0-w0: stopping experience collection (38800 times) [2024-04-28 04:24:04,968][54818] InferenceWorker_p0-w0: resuming experience collection (38800 times) [2024-04-28 04:24:06,301][54818] Updated weights for policy 0, policy_version 575108 (0.0016) [2024-04-28 04:24:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61713.1, 300 sec: 61370.6). Total num frames: 9422716928. Throughput: 0: 61733.2. Samples: 2327993980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:09,253][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 04:24:09,282][54818] Updated weights for policy 0, policy_version 575118 (0.0016) [2024-04-28 04:24:11,634][54818] Updated weights for policy 0, policy_version 575128 (0.0018) [2024-04-28 04:24:14,255][54587] Fps is (10 sec: 62251.0, 60 sec: 61711.8, 300 sec: 61370.3). Total num frames: 9423028224. Throughput: 0: 61612.8. Samples: 2328167700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:14,255][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 04:24:14,520][54818] Updated weights for policy 0, policy_version 575138 (0.0015) [2024-04-28 04:24:17,003][54818] Updated weights for policy 0, policy_version 575148 (0.0017) [2024-04-28 04:24:19,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.2, 300 sec: 61370.6). Total num frames: 9423339520. Throughput: 0: 61620.0. Samples: 2328540800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:19,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 04:24:19,773][54818] Updated weights for policy 0, policy_version 575158 (0.0017) [2024-04-28 04:24:22,346][54818] Updated weights for policy 0, policy_version 575168 (0.0015) [2024-04-28 04:24:24,253][54587] Fps is (10 sec: 62267.4, 60 sec: 61713.1, 300 sec: 61426.1). Total num frames: 9423650816. Throughput: 0: 61624.4. Samples: 2328912940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:24,254][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 04:24:25,135][54818] Updated weights for policy 0, policy_version 575178 (0.0017) [2024-04-28 04:24:27,643][54818] Updated weights for policy 0, policy_version 575188 (0.0016) [2024-04-28 04:24:29,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9423945728. Throughput: 0: 61773.8. Samples: 2329094060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:29,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 04:24:30,371][54818] Updated weights for policy 0, policy_version 575198 (0.0016) [2024-04-28 04:24:31,683][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (30500 times) [2024-04-28 04:24:32,762][54818] Updated weights for policy 0, policy_version 575208 (0.0016) [2024-04-28 04:24:34,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.0, 300 sec: 61426.1). Total num frames: 9424273408. Throughput: 0: 61497.9. Samples: 2329457260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:34,254][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 04:24:35,547][54818] Updated weights for policy 0, policy_version 575218 (0.0016) [2024-04-28 04:24:38,554][54818] Updated weights for policy 0, policy_version 575228 (0.0017) [2024-04-28 04:24:39,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.0, 300 sec: 61370.5). Total num frames: 9424568320. Throughput: 0: 61748.9. Samples: 2329838360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:39,255][54587] Avg episode reward: [(0, '0.531')] [2024-04-28 04:24:41,069][54818] Updated weights for policy 0, policy_version 575238 (0.0016) [2024-04-28 04:24:43,790][54818] Updated weights for policy 0, policy_version 575248 (0.0016) [2024-04-28 04:24:44,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61713.0, 300 sec: 61370.6). Total num frames: 9424879616. 
Throughput: 0: 61915.0. Samples: 2330022620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:44,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 04:24:46,174][54818] Updated weights for policy 0, policy_version 575258 (0.0018) [2024-04-28 04:24:49,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 9425174528. Throughput: 0: 61509.1. Samples: 2330387720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:49,253][54587] Avg episode reward: [(0, '0.667')] [2024-04-28 04:24:49,271][54818] Updated weights for policy 0, policy_version 575268 (0.0016) [2024-04-28 04:24:51,534][54818] Updated weights for policy 0, policy_version 575278 (0.0018) [2024-04-28 04:24:54,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.0, 300 sec: 61370.6). Total num frames: 9425502208. Throughput: 0: 61436.1. Samples: 2330758620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:54,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:24:54,593][54818] Updated weights for policy 0, policy_version 575288 (0.0017) [2024-04-28 04:24:57,027][54818] Updated weights for policy 0, policy_version 575298 (0.0017) [2024-04-28 04:24:58,276][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (30600 times) [2024-04-28 04:24:59,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61439.9, 300 sec: 61370.5). Total num frames: 9425797120. Throughput: 0: 61594.7. Samples: 2330939380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:24:59,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 04:24:59,663][54818] Updated weights for policy 0, policy_version 575308 (0.0021) [2024-04-28 04:25:02,190][54818] Updated weights for policy 0, policy_version 575318 (0.0020) [2024-04-28 04:25:04,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61986.1, 300 sec: 61426.1). Total num frames: 9426124800. Throughput: 0: 61488.2. Samples: 2331307780. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:25:04,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 04:25:05,090][54818] Updated weights for policy 0, policy_version 575328 (0.0016) [2024-04-28 04:25:06,695][54798] Signal inference workers to stop experience collection... (38850 times) [2024-04-28 04:25:06,696][54798] Signal inference workers to resume experience collection... (38850 times) [2024-04-28 04:25:06,718][54818] InferenceWorker_p0-w0: stopping experience collection (38850 times) [2024-04-28 04:25:06,718][54818] InferenceWorker_p0-w0: resuming experience collection (38850 times) [2024-04-28 04:25:07,557][54818] Updated weights for policy 0, policy_version 575338 (0.0019) [2024-04-28 04:25:09,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61712.9, 300 sec: 61370.6). Total num frames: 9426419712. Throughput: 0: 61414.2. Samples: 2331676580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:25:09,255][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 04:25:10,500][54818] Updated weights for policy 0, policy_version 575348 (0.0021) [2024-04-28 04:25:12,893][54818] Updated weights for policy 0, policy_version 575358 (0.0017) [2024-04-28 04:25:14,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61987.6, 300 sec: 61426.2). Total num frames: 9426747392. Throughput: 0: 61569.9. Samples: 2331864700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:25:14,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 04:25:15,662][54818] Updated weights for policy 0, policy_version 575368 (0.0017) [2024-04-28 04:25:18,110][54818] Updated weights for policy 0, policy_version 575378 (0.0017) [2024-04-28 04:25:19,253][54587] Fps is (10 sec: 63897.5, 60 sec: 61985.9, 300 sec: 61481.6). Total num frames: 9427058688. Throughput: 0: 61821.6. Samples: 2332239240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:25:19,255][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 04:25:20,927][54818] Updated weights for policy 0, policy_version 575388 (0.0018) [2024-04-28 04:25:23,369][54818] Updated weights for policy 0, policy_version 575398 (0.0017) [2024-04-28 04:25:24,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61713.0, 300 sec: 61426.1). Total num frames: 9427353600. Throughput: 0: 61518.6. Samples: 2332606700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:25:24,255][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 04:25:25,016][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (30700 times) [2024-04-28 04:25:26,095][54818] Updated weights for policy 0, policy_version 575408 (0.0017) [2024-04-28 04:25:28,710][54818] Updated weights for policy 0, policy_version 575418 (0.0016) [2024-04-28 04:25:29,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61986.2, 300 sec: 61426.1). Total num frames: 9427664896. Throughput: 0: 61569.9. Samples: 2332793260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:25:29,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 04:25:31,472][54818] Updated weights for policy 0, policy_version 575428 (0.0016) [2024-04-28 04:25:33,907][54818] Updated weights for policy 0, policy_version 575438 (0.0017) [2024-04-28 04:25:34,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61712.9, 300 sec: 61426.1). Total num frames: 9427976192. Throughput: 0: 61656.2. Samples: 2333162260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:25:34,255][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 04:25:36,769][54818] Updated weights for policy 0, policy_version 575448 (0.0018) [2024-04-28 04:25:39,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61986.1, 300 sec: 61481.6). Total num frames: 9428287488. Throughput: 0: 61610.2. Samples: 2333531080. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:25:39,255][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 04:25:39,791][54818] Updated weights for policy 0, policy_version 575458 (0.0018) [2024-04-28 04:25:42,413][54818] Updated weights for policy 0, policy_version 575468 (0.0016) [2024-04-28 04:25:44,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61986.2, 300 sec: 61537.2). Total num frames: 9428598784. Throughput: 0: 61817.3. Samples: 2333721160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:25:44,254][54587] Avg episode reward: [(0, '0.551')] [2024-04-28 04:25:44,980][54818] Updated weights for policy 0, policy_version 575478 (0.0016) [2024-04-28 04:25:46,126][54798] Signal inference workers to stop experience collection... (38900 times) [2024-04-28 04:25:46,175][54818] InferenceWorker_p0-w0: stopping experience collection (38900 times) [2024-04-28 04:25:46,175][54798] Signal inference workers to resume experience collection... (38900 times) [2024-04-28 04:25:46,186][54818] InferenceWorker_p0-w0: resuming experience collection (38900 times) [2024-04-28 04:25:47,626][54818] Updated weights for policy 0, policy_version 575488 (0.0017) [2024-04-28 04:25:49,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61986.1, 300 sec: 61481.7). Total num frames: 9428893696. Throughput: 0: 61869.1. Samples: 2334091880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:25:49,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 04:25:49,319][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000575495_9428910080.pth... [2024-04-28 04:25:49,371][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000574594_9414148096.pth [2024-04-28 04:25:50,309][54818] Updated weights for policy 0, policy_version 575498 (0.0015) [2024-04-28 04:25:51,255][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (30800 times) [2024-04-28 04:25:52,730][54818] Updated weights for policy 0, policy_version 575508 (0.0017) [2024-04-28 04:25:54,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61713.2, 300 sec: 61481.6). Total num frames: 9429204992. Throughput: 0: 61695.7. Samples: 2334452880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:25:54,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 04:25:55,423][54818] Updated weights for policy 0, policy_version 575518 (0.0019) [2024-04-28 04:25:57,895][54818] Updated weights for policy 0, policy_version 575528 (0.0017) [2024-04-28 04:25:59,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61986.2, 300 sec: 61537.2). Total num frames: 9429516288. Throughput: 0: 61808.3. Samples: 2334646080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:25:59,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 04:26:00,674][54818] Updated weights for policy 0, policy_version 575538 (0.0016) [2024-04-28 04:26:03,173][54818] Updated weights for policy 0, policy_version 575548 (0.0017) [2024-04-28 04:26:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61439.9, 300 sec: 61481.6). Total num frames: 9429811200. Throughput: 0: 61586.7. Samples: 2335010640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:04,255][54587] Avg episode reward: [(0, '0.582')] [2024-04-28 04:26:06,035][54818] Updated weights for policy 0, policy_version 575558 (0.0020) [2024-04-28 04:26:08,436][54818] Updated weights for policy 0, policy_version 575568 (0.0018) [2024-04-28 04:26:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61713.2, 300 sec: 61481.6). Total num frames: 9430122496. Throughput: 0: 61503.7. Samples: 2335374360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:09,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 04:26:11,288][54818] Updated weights for policy 0, policy_version 575578 (0.0020) [2024-04-28 04:26:13,873][54818] Updated weights for policy 0, policy_version 575588 (0.0016) [2024-04-28 04:26:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61439.9, 300 sec: 61481.7). Total num frames: 9430433792. Throughput: 0: 61602.6. Samples: 2335565380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:14,256][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 04:26:17,126][54818] Updated weights for policy 0, policy_version 575598 (0.0020) [2024-04-28 04:26:17,961][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (30900 times) [2024-04-28 04:26:19,253][54587] Fps is (10 sec: 62258.5, 60 sec: 61440.0, 300 sec: 61481.6). Total num frames: 9430745088. Throughput: 0: 61511.6. Samples: 2335930280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:19,256][54587] Avg episode reward: [(0, '0.649')] [2024-04-28 04:26:19,674][54818] Updated weights for policy 0, policy_version 575608 (0.0015) [2024-04-28 04:26:22,496][54818] Updated weights for policy 0, policy_version 575618 (0.0018) [2024-04-28 04:26:24,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.2, 300 sec: 61537.2). Total num frames: 9431056384. Throughput: 0: 61264.6. Samples: 2336287980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:24,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 04:26:25,055][54818] Updated weights for policy 0, policy_version 575628 (0.0017) [2024-04-28 04:26:25,257][54798] Signal inference workers to stop experience collection... (38950 times) [2024-04-28 04:26:25,289][54818] InferenceWorker_p0-w0: stopping experience collection (38950 times) [2024-04-28 04:26:25,315][54798] Signal inference workers to resume experience collection... 
(38950 times) [2024-04-28 04:26:25,316][54818] InferenceWorker_p0-w0: resuming experience collection (38950 times) [2024-04-28 04:26:27,726][54818] Updated weights for policy 0, policy_version 575638 (0.0018) [2024-04-28 04:26:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61481.6). Total num frames: 9431351296. Throughput: 0: 61295.2. Samples: 2336479440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:29,255][54587] Avg episode reward: [(0, '0.527')] [2024-04-28 04:26:30,243][54818] Updated weights for policy 0, policy_version 575648 (0.0018) [2024-04-28 04:26:32,925][54818] Updated weights for policy 0, policy_version 575658 (0.0016) [2024-04-28 04:26:34,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61440.1, 300 sec: 61481.7). Total num frames: 9431662592. Throughput: 0: 61228.7. Samples: 2336847180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:34,255][54587] Avg episode reward: [(0, '0.514')] [2024-04-28 04:26:35,424][54818] Updated weights for policy 0, policy_version 575668 (0.0017) [2024-04-28 04:26:38,099][54818] Updated weights for policy 0, policy_version 575678 (0.0016) [2024-04-28 04:26:39,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61481.6). Total num frames: 9431973888. Throughput: 0: 61463.9. Samples: 2337218760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:39,254][54587] Avg episode reward: [(0, '0.492')] [2024-04-28 04:26:40,754][54818] Updated weights for policy 0, policy_version 575688 (0.0016) [2024-04-28 04:26:43,658][54818] Updated weights for policy 0, policy_version 575698 (0.0016) [2024-04-28 04:26:44,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61537.2). Total num frames: 9432285184. Throughput: 0: 61200.9. Samples: 2337400120. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:44,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 04:26:44,757][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31000 times) [2024-04-28 04:26:46,387][54818] Updated weights for policy 0, policy_version 575708 (0.0016) [2024-04-28 04:26:48,866][54818] Updated weights for policy 0, policy_version 575718 (0.0016) [2024-04-28 04:26:49,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61713.0, 300 sec: 61537.2). Total num frames: 9432596480. Throughput: 0: 61203.7. Samples: 2337764800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:49,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:26:49,261][54587] No heartbeat for components: RolloutWorker_w4 (25357 seconds), RolloutWorker_w5 (11457 seconds) [2024-04-28 04:26:51,646][54818] Updated weights for policy 0, policy_version 575728 (0.0015) [2024-04-28 04:26:54,222][54818] Updated weights for policy 0, policy_version 575738 (0.0016) [2024-04-28 04:26:54,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.1, 300 sec: 61481.7). Total num frames: 9432891392. Throughput: 0: 61413.4. Samples: 2338137960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 04:26:54,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:26:56,983][54818] Updated weights for policy 0, policy_version 575748 (0.0020) [2024-04-28 04:26:59,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61439.9, 300 sec: 61537.2). Total num frames: 9433202688. Throughput: 0: 61135.5. Samples: 2338316480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:26:59,254][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 04:26:59,395][54818] Updated weights for policy 0, policy_version 575758 (0.0016) [2024-04-28 04:27:02,177][54818] Updated weights for policy 0, policy_version 575768 (0.0017) [2024-04-28 04:27:04,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.1, 300 sec: 61481.7). 
Total num frames: 9433497600. Throughput: 0: 61180.6. Samples: 2338683400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:04,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 04:27:04,604][54818] Updated weights for policy 0, policy_version 575778 (0.0017) [2024-04-28 04:27:07,392][54818] Updated weights for policy 0, policy_version 575788 (0.0016) [2024-04-28 04:27:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61439.9, 300 sec: 61481.6). Total num frames: 9433808896. Throughput: 0: 61571.4. Samples: 2339058700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:09,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 04:27:09,825][54818] Updated weights for policy 0, policy_version 575798 (0.0017) [2024-04-28 04:27:11,249][54798] Signal inference workers to stop experience collection... (39000 times) [2024-04-28 04:27:11,288][54818] InferenceWorker_p0-w0: stopping experience collection (39000 times) [2024-04-28 04:27:11,306][54798] Signal inference workers to resume experience collection... (39000 times) [2024-04-28 04:27:11,308][54818] InferenceWorker_p0-w0: resuming experience collection (39000 times) [2024-04-28 04:27:11,484][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31100 times) [2024-04-28 04:27:12,644][54818] Updated weights for policy 0, policy_version 575808 (0.0016) [2024-04-28 04:27:14,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61537.2). Total num frames: 9434120192. Throughput: 0: 61432.1. Samples: 2339243880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:14,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 04:27:15,692][54818] Updated weights for policy 0, policy_version 575818 (0.0017) [2024-04-28 04:27:18,361][54818] Updated weights for policy 0, policy_version 575828 (0.0017) [2024-04-28 04:27:19,253][54587] Fps is (10 sec: 60622.1, 60 sec: 61167.1, 300 sec: 61537.2). 
Total num frames: 9434415104. Throughput: 0: 61258.0. Samples: 2339603780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:19,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 04:27:20,980][54818] Updated weights for policy 0, policy_version 575838 (0.0017) [2024-04-28 04:27:23,679][54818] Updated weights for policy 0, policy_version 575848 (0.0016) [2024-04-28 04:27:24,253][54587] Fps is (10 sec: 58982.7, 60 sec: 60893.9, 300 sec: 61481.7). Total num frames: 9434710016. Throughput: 0: 61400.6. Samples: 2339981780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:24,253][54587] Avg episode reward: [(0, '0.652')] [2024-04-28 04:27:26,346][54818] Updated weights for policy 0, policy_version 575858 (0.0018) [2024-04-28 04:27:29,232][54818] Updated weights for policy 0, policy_version 575868 (0.0017) [2024-04-28 04:27:29,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61166.9, 300 sec: 61481.6). Total num frames: 9435021312. Throughput: 0: 61392.5. Samples: 2340162780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:29,254][54587] Avg episode reward: [(0, '0.597')] [2024-04-28 04:27:31,660][54818] Updated weights for policy 0, policy_version 575878 (0.0017) [2024-04-28 04:27:34,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61166.9, 300 sec: 61481.7). Total num frames: 9435332608. Throughput: 0: 61370.1. Samples: 2340526460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:34,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:27:34,388][54818] Updated weights for policy 0, policy_version 575888 (0.0017) [2024-04-28 04:27:36,688][54818] Updated weights for policy 0, policy_version 575898 (0.0017) [2024-04-28 04:27:38,455][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31200 times) [2024-04-28 04:27:39,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61166.8, 300 sec: 61537.2). Total num frames: 9435643904. Throughput: 0: 61531.2. 
Samples: 2340906880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:39,254][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 04:27:39,537][54818] Updated weights for policy 0, policy_version 575908 (0.0015) [2024-04-28 04:27:41,988][54818] Updated weights for policy 0, policy_version 575918 (0.0016) [2024-04-28 04:27:44,253][54587] Fps is (10 sec: 62260.1, 60 sec: 61167.1, 300 sec: 61592.8). Total num frames: 9435955200. Throughput: 0: 61405.6. Samples: 2341079720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:44,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 04:27:44,735][54818] Updated weights for policy 0, policy_version 575928 (0.0016) [2024-04-28 04:27:47,204][54818] Updated weights for policy 0, policy_version 575938 (0.0020) [2024-04-28 04:27:49,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60893.8, 300 sec: 61537.2). Total num frames: 9436250112. Throughput: 0: 61498.6. Samples: 2341450840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:49,254][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 04:27:49,289][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000575944_9436266496.pth... [2024-04-28 04:27:49,350][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000575042_9421488128.pth [2024-04-28 04:27:50,365][54818] Updated weights for policy 0, policy_version 575948 (0.0018) [2024-04-28 04:27:53,099][54818] Updated weights for policy 0, policy_version 575958 (0.0022) [2024-04-28 04:27:54,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.9, 300 sec: 61592.7). Total num frames: 9436561408. Throughput: 0: 61556.6. Samples: 2341828740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:54,254][54587] Avg episode reward: [(0, '0.706')] [2024-04-28 04:27:55,724][54818] Updated weights for policy 0, policy_version 575968 (0.0017) [2024-04-28 04:27:55,838][54798] Signal inference workers to stop experience collection... 
(39050 times) [2024-04-28 04:27:55,867][54818] InferenceWorker_p0-w0: stopping experience collection (39050 times) [2024-04-28 04:27:55,897][54798] Signal inference workers to resume experience collection... (39050 times) [2024-04-28 04:27:55,897][54818] InferenceWorker_p0-w0: resuming experience collection (39050 times) [2024-04-28 04:27:58,402][54818] Updated weights for policy 0, policy_version 575978 (0.0016) [2024-04-28 04:27:59,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60894.0, 300 sec: 61481.7). Total num frames: 9436856320. Throughput: 0: 61298.3. Samples: 2342002300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:27:59,253][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 04:28:01,060][54818] Updated weights for policy 0, policy_version 575988 (0.0016) [2024-04-28 04:28:03,876][54818] Updated weights for policy 0, policy_version 575998 (0.0016) [2024-04-28 04:28:04,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61592.7). Total num frames: 9437184000. Throughput: 0: 61515.8. Samples: 2342372000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:28:04,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 04:28:05,018][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31300 times) [2024-04-28 04:28:06,270][54818] Updated weights for policy 0, policy_version 576008 (0.0016) [2024-04-28 04:28:09,232][54818] Updated weights for policy 0, policy_version 576018 (0.0015) [2024-04-28 04:28:09,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61167.1, 300 sec: 61537.2). Total num frames: 9437478912. Throughput: 0: 61466.6. Samples: 2342747780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:28:09,253][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:28:11,403][54818] Updated weights for policy 0, policy_version 576028 (0.0020) [2024-04-28 04:28:14,253][54587] Fps is (10 sec: 58982.0, 60 sec: 60893.7, 300 sec: 61481.6). 
Total num frames: 9437773824. Throughput: 0: 61149.2. Samples: 2342914500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:28:14,254][54587] Avg episode reward: [(0, '0.482')] [2024-04-28 04:28:14,468][54818] Updated weights for policy 0, policy_version 576038 (0.0017) [2024-04-28 04:28:16,482][54818] Updated weights for policy 0, policy_version 576048 (0.0019) [2024-04-28 04:28:19,253][54587] Fps is (10 sec: 57344.1, 60 sec: 60620.8, 300 sec: 61370.6). Total num frames: 9438052352. Throughput: 0: 61260.2. Samples: 2343283160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:28:19,253][54587] Avg episode reward: [(0, '0.687')] [2024-04-28 04:28:19,823][54818] Updated weights for policy 0, policy_version 576058 (0.0016) [2024-04-28 04:28:22,185][54818] Updated weights for policy 0, policy_version 576068 (0.0019) [2024-04-28 04:28:24,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.8, 300 sec: 61426.1). Total num frames: 9438380032. Throughput: 0: 61176.2. Samples: 2343659800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:28:24,254][54587] Avg episode reward: [(0, '0.580')] [2024-04-28 04:28:25,335][54818] Updated weights for policy 0, policy_version 576078 (0.0016) [2024-04-28 04:28:27,531][54818] Updated weights for policy 0, policy_version 576088 (0.0019) [2024-04-28 04:28:29,253][54587] Fps is (10 sec: 63897.1, 60 sec: 61166.9, 300 sec: 61426.1). Total num frames: 9438691328. Throughput: 0: 60943.0. Samples: 2343822160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 04:28:29,254][54587] Avg episode reward: [(0, '0.630')] [2024-04-28 04:28:30,724][54818] Updated weights for policy 0, policy_version 576098 (0.0015) [2024-04-28 04:28:31,268][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31400 times) [2024-04-28 04:28:31,268][54798] Signal inference workers to stop experience collection... 
(39100 times) [2024-04-28 04:28:31,309][54818] InferenceWorker_p0-w0: stopping experience collection (39100 times) [2024-04-28 04:28:31,326][54798] Signal inference workers to resume experience collection... (39100 times) [2024-04-28 04:28:31,326][54818] InferenceWorker_p0-w0: resuming experience collection (39100 times) [2024-04-28 04:28:33,092][54818] Updated weights for policy 0, policy_version 576108 (0.0018) [2024-04-28 04:28:34,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.0, 300 sec: 61370.6). Total num frames: 9438986240. Throughput: 0: 60947.7. Samples: 2344193480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:28:34,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 04:28:35,846][54818] Updated weights for policy 0, policy_version 576118 (0.0015) [2024-04-28 04:28:38,850][54818] Updated weights for policy 0, policy_version 576128 (0.0017) [2024-04-28 04:28:39,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60894.0, 300 sec: 61426.1). Total num frames: 9439297536. Throughput: 0: 61035.5. Samples: 2344575340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:28:39,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 04:28:41,072][54818] Updated weights for policy 0, policy_version 576138 (0.0018) [2024-04-28 04:28:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60620.7, 300 sec: 61370.6). Total num frames: 9439592448. Throughput: 0: 60924.8. Samples: 2344743920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:28:44,255][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 04:28:44,331][54818] Updated weights for policy 0, policy_version 576148 (0.0018) [2024-04-28 04:28:46,216][54818] Updated weights for policy 0, policy_version 576158 (0.0022) [2024-04-28 04:28:49,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.7, 300 sec: 61370.6). Total num frames: 9439903744. Throughput: 0: 60935.8. Samples: 2345114120. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:28:49,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:28:49,659][54818] Updated weights for policy 0, policy_version 576168 (0.0017) [2024-04-28 04:28:51,624][54818] Updated weights for policy 0, policy_version 576178 (0.0016) [2024-04-28 04:28:54,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60893.8, 300 sec: 61370.6). Total num frames: 9440215040. Throughput: 0: 61062.1. Samples: 2345495580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:28:54,255][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 04:28:54,976][54818] Updated weights for policy 0, policy_version 576188 (0.0016) [2024-04-28 04:28:56,878][54818] Updated weights for policy 0, policy_version 576198 (0.0016) [2024-04-28 04:28:58,748][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31500 times) [2024-04-28 04:28:59,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61166.8, 300 sec: 61426.1). Total num frames: 9440526336. Throughput: 0: 61101.4. Samples: 2345664060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:28:59,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 04:29:00,341][54818] Updated weights for policy 0, policy_version 576208 (0.0017) [2024-04-28 04:29:02,063][54818] Updated weights for policy 0, policy_version 576218 (0.0019) [2024-04-28 04:29:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60620.8, 300 sec: 61370.5). Total num frames: 9440821248. Throughput: 0: 61138.5. Samples: 2346034400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:04,254][54587] Avg episode reward: [(0, '0.629')] [2024-04-28 04:29:05,456][54818] Updated weights for policy 0, policy_version 576228 (0.0018) [2024-04-28 04:29:07,767][54818] Updated weights for policy 0, policy_version 576238 (0.0019) [2024-04-28 04:29:07,771][54798] Signal inference workers to stop experience collection... 
(39150 times) [2024-04-28 04:29:07,771][54798] Signal inference workers to resume experience collection... (39150 times) [2024-04-28 04:29:07,785][54818] InferenceWorker_p0-w0: stopping experience collection (39150 times) [2024-04-28 04:29:07,785][54818] InferenceWorker_p0-w0: resuming experience collection (39150 times) [2024-04-28 04:29:09,253][54587] Fps is (10 sec: 60620.6, 60 sec: 60893.7, 300 sec: 61370.8). Total num frames: 9441132544. Throughput: 0: 61148.9. Samples: 2346411500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:09,254][54587] Avg episode reward: [(0, '0.598')] [2024-04-28 04:29:10,773][54818] Updated weights for policy 0, policy_version 576248 (0.0018) [2024-04-28 04:29:12,834][54818] Updated weights for policy 0, policy_version 576258 (0.0020) [2024-04-28 04:29:14,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61167.1, 300 sec: 61370.6). Total num frames: 9441443840. Throughput: 0: 61386.3. Samples: 2346584540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:14,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:29:15,903][54818] Updated weights for policy 0, policy_version 576268 (0.0018) [2024-04-28 04:29:18,633][54818] Updated weights for policy 0, policy_version 576278 (0.0019) [2024-04-28 04:29:19,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9441738752. Throughput: 0: 61318.1. Samples: 2346952800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:19,255][54587] Avg episode reward: [(0, '0.518')] [2024-04-28 04:29:21,083][54818] Updated weights for policy 0, policy_version 576288 (0.0017) [2024-04-28 04:29:24,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61167.0, 300 sec: 61370.6). Total num frames: 9442050048. Throughput: 0: 61028.1. Samples: 2347321600. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:24,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:29:24,528][54818] Updated weights for policy 0, policy_version 576298 (0.0015) [2024-04-28 04:29:25,420][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31600 times) [2024-04-28 04:29:26,591][54818] Updated weights for policy 0, policy_version 576308 (0.0017) [2024-04-28 04:29:29,253][54587] Fps is (10 sec: 62260.2, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9442361344. Throughput: 0: 61296.6. Samples: 2347502260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:29,253][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:29:29,816][54818] Updated weights for policy 0, policy_version 576318 (0.0016) [2024-04-28 04:29:31,933][54818] Updated weights for policy 0, policy_version 576328 (0.0019) [2024-04-28 04:29:34,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9442672640. Throughput: 0: 61214.5. Samples: 2347868760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:34,255][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 04:29:35,151][54818] Updated weights for policy 0, policy_version 576338 (0.0017) [2024-04-28 04:29:37,158][54818] Updated weights for policy 0, policy_version 576348 (0.0017) [2024-04-28 04:29:39,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9442967552. Throughput: 0: 60921.0. Samples: 2348237020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:39,254][54587] Avg episode reward: [(0, '0.577')] [2024-04-28 04:29:40,481][54818] Updated weights for policy 0, policy_version 576358 (0.0015) [2024-04-28 04:29:42,696][54818] Updated weights for policy 0, policy_version 576368 (0.0015) [2024-04-28 04:29:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9443278848. 
Throughput: 0: 61258.4. Samples: 2348420680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:44,254][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 04:29:45,628][54818] Updated weights for policy 0, policy_version 576378 (0.0018) [2024-04-28 04:29:45,684][54798] Signal inference workers to stop experience collection... (39200 times) [2024-04-28 04:29:45,720][54818] InferenceWorker_p0-w0: stopping experience collection (39200 times) [2024-04-28 04:29:45,775][54798] Signal inference workers to resume experience collection... (39200 times) [2024-04-28 04:29:45,775][54818] InferenceWorker_p0-w0: resuming experience collection (39200 times) [2024-04-28 04:29:47,934][54818] Updated weights for policy 0, policy_version 576388 (0.0017) [2024-04-28 04:29:49,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 9443590144. Throughput: 0: 61172.6. Samples: 2348787160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:49,255][54587] Avg episode reward: [(0, '0.583')] [2024-04-28 04:29:49,266][54587] No heartbeat for components: RolloutWorker_w4 (25537 seconds), RolloutWorker_w5 (11637 seconds) [2024-04-28 04:29:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000576391_9443590144.pth... [2024-04-28 04:29:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000575495_9428910080.pth [2024-04-28 04:29:50,821][54818] Updated weights for policy 0, policy_version 576398 (0.0016) [2024-04-28 04:29:51,983][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31700 times) [2024-04-28 04:29:53,340][54818] Updated weights for policy 0, policy_version 576408 (0.0016) [2024-04-28 04:29:54,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9443901440. Throughput: 0: 60965.9. Samples: 2349154960. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:54,255][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 04:29:56,219][54818] Updated weights for policy 0, policy_version 576418 (0.0016) [2024-04-28 04:29:58,583][54818] Updated weights for policy 0, policy_version 576428 (0.0019) [2024-04-28 04:29:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9444196352. Throughput: 0: 61264.8. Samples: 2349341460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:29:59,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 04:30:01,420][54818] Updated weights for policy 0, policy_version 576438 (0.0023) [2024-04-28 04:30:04,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 9444507648. Throughput: 0: 61300.3. Samples: 2349711300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:04,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 04:30:04,333][54818] Updated weights for policy 0, policy_version 576448 (0.0017) [2024-04-28 04:30:06,818][54818] Updated weights for policy 0, policy_version 576458 (0.0018) [2024-04-28 04:30:09,253][54587] Fps is (10 sec: 63898.5, 60 sec: 61713.3, 300 sec: 61315.1). Total num frames: 9444835328. Throughput: 0: 61271.7. Samples: 2350078820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:09,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 04:30:09,666][54818] Updated weights for policy 0, policy_version 576468 (0.0017) [2024-04-28 04:30:12,223][54818] Updated weights for policy 0, policy_version 576478 (0.0015) [2024-04-28 04:30:14,253][54587] Fps is (10 sec: 63896.4, 60 sec: 61713.0, 300 sec: 61315.1). Total num frames: 9445146624. Throughput: 0: 61489.2. Samples: 2350269280. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:14,254][54587] Avg episode reward: [(0, '0.640')] [2024-04-28 04:30:14,255][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (1700 times) [2024-04-28 04:30:14,860][54818] Updated weights for policy 0, policy_version 576488 (0.0015) [2024-04-28 04:30:17,532][54818] Updated weights for policy 0, policy_version 576498 (0.0016) [2024-04-28 04:30:18,705][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31800 times) [2024-04-28 04:30:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61713.2, 300 sec: 61315.1). Total num frames: 9445441536. Throughput: 0: 61406.7. Samples: 2350632060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:19,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:30:20,193][54818] Updated weights for policy 0, policy_version 576508 (0.0017) [2024-04-28 04:30:22,842][54818] Updated weights for policy 0, policy_version 576518 (0.0019) [2024-04-28 04:30:24,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9445752832. Throughput: 0: 61280.9. Samples: 2350994660. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:24,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 04:30:25,711][54818] Updated weights for policy 0, policy_version 576528 (0.0016) [2024-04-28 04:30:28,194][54818] Updated weights for policy 0, policy_version 576538 (0.0018) [2024-04-28 04:30:28,358][54798] Signal inference workers to stop experience collection... (39250 times) [2024-04-28 04:30:28,359][54798] Signal inference workers to resume experience collection... 
(39250 times) [2024-04-28 04:30:28,374][54818] InferenceWorker_p0-w0: stopping experience collection (39250 times) [2024-04-28 04:30:28,374][54818] InferenceWorker_p0-w0: resuming experience collection (39250 times) [2024-04-28 04:30:29,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61713.0, 300 sec: 61315.1). Total num frames: 9446064128. Throughput: 0: 61216.9. Samples: 2351175440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:29,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 04:30:30,938][54818] Updated weights for policy 0, policy_version 576548 (0.0016) [2024-04-28 04:30:33,490][54818] Updated weights for policy 0, policy_version 576558 (0.0016) [2024-04-28 04:30:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9446359040. Throughput: 0: 61434.2. Samples: 2351551700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:34,255][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:30:36,191][54818] Updated weights for policy 0, policy_version 576568 (0.0018) [2024-04-28 04:30:38,740][54818] Updated weights for policy 0, policy_version 576578 (0.0016) [2024-04-28 04:30:39,253][54587] Fps is (10 sec: 60619.9, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9446670336. Throughput: 0: 61392.8. Samples: 2351917640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:39,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 04:30:41,507][54818] Updated weights for policy 0, policy_version 576588 (0.0016) [2024-04-28 04:30:44,034][54818] Updated weights for policy 0, policy_version 576598 (0.0017) [2024-04-28 04:30:44,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61713.1, 300 sec: 61315.0). Total num frames: 9446981632. Throughput: 0: 61386.4. Samples: 2352103840. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:44,254][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 04:30:45,483][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (31900 times) [2024-04-28 04:30:46,880][54818] Updated weights for policy 0, policy_version 576608 (0.0016) [2024-04-28 04:30:49,253][54587] Fps is (10 sec: 62260.1, 60 sec: 61713.1, 300 sec: 61315.1). Total num frames: 9447292928. Throughput: 0: 61433.2. Samples: 2352475800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:49,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 04:30:49,683][54818] Updated weights for policy 0, policy_version 576618 (0.0016) [2024-04-28 04:30:52,406][54818] Updated weights for policy 0, policy_version 576628 (0.0016) [2024-04-28 04:30:54,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61713.2, 300 sec: 61315.1). Total num frames: 9447604224. Throughput: 0: 61368.9. Samples: 2352840420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:54,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 04:30:54,920][54818] Updated weights for policy 0, policy_version 576638 (0.0015) [2024-04-28 04:30:57,531][54818] Updated weights for policy 0, policy_version 576648 (0.0018) [2024-04-28 04:30:59,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61986.3, 300 sec: 61370.6). Total num frames: 9447915520. Throughput: 0: 61285.9. Samples: 2353027140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:30:59,254][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 04:31:00,084][54818] Updated weights for policy 0, policy_version 576658 (0.0015) [2024-04-28 04:31:02,818][54818] Updated weights for policy 0, policy_version 576668 (0.0016) [2024-04-28 04:31:04,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61712.9, 300 sec: 61315.0). Total num frames: 9448210432. Throughput: 0: 61483.0. Samples: 2353398800. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:31:04,254][54587] Avg episode reward: [(0, '0.546')] [2024-04-28 04:31:05,446][54818] Updated weights for policy 0, policy_version 576678 (0.0015) [2024-04-28 04:31:08,092][54818] Updated weights for policy 0, policy_version 576688 (0.0017) [2024-04-28 04:31:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9448521728. Throughput: 0: 61549.9. Samples: 2353764400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:31:09,254][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 04:31:10,892][54818] Updated weights for policy 0, policy_version 576698 (0.0016) [2024-04-28 04:31:12,085][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32000 times) [2024-04-28 04:31:13,555][54818] Updated weights for policy 0, policy_version 576708 (0.0016) [2024-04-28 04:31:14,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9448816640. Throughput: 0: 61611.1. Samples: 2353947940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:31:14,254][54587] Avg episode reward: [(0, '0.617')] [2024-04-28 04:31:16,146][54818] Updated weights for policy 0, policy_version 576718 (0.0015) [2024-04-28 04:31:18,895][54818] Updated weights for policy 0, policy_version 576728 (0.0017) [2024-04-28 04:31:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9449127936. Throughput: 0: 61522.3. Samples: 2354320200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:31:19,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 04:31:21,420][54818] Updated weights for policy 0, policy_version 576738 (0.0017) [2024-04-28 04:31:24,204][54818] Updated weights for policy 0, policy_version 576748 (0.0017) [2024-04-28 04:31:24,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9449439232. 
Throughput: 0: 61528.7. Samples: 2354686420. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:31:24,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 04:31:25,830][54798] Signal inference workers to stop experience collection... (39300 times) [2024-04-28 04:31:25,831][54798] Signal inference workers to resume experience collection... (39300 times) [2024-04-28 04:31:25,839][54818] InferenceWorker_p0-w0: stopping experience collection (39300 times) [2024-04-28 04:31:25,839][54818] InferenceWorker_p0-w0: resuming experience collection (39300 times) [2024-04-28 04:31:27,179][54818] Updated weights for policy 0, policy_version 576758 (0.0018) [2024-04-28 04:31:29,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9449750528. Throughput: 0: 61474.7. Samples: 2354870200. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:31:29,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 04:31:29,470][54818] Updated weights for policy 0, policy_version 576768 (0.0016) [2024-04-28 04:31:32,352][54818] Updated weights for policy 0, policy_version 576778 (0.0017) [2024-04-28 04:31:34,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9450045440. Throughput: 0: 61581.4. Samples: 2355246960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-04-28 04:31:34,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 04:31:34,613][54818] Updated weights for policy 0, policy_version 576788 (0.0017) [2024-04-28 04:31:37,723][54818] Updated weights for policy 0, policy_version 576798 (0.0016) [2024-04-28 04:31:38,935][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32100 times) [2024-04-28 04:31:39,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.2, 300 sec: 61259.5). Total num frames: 9450356736. Throughput: 0: 61493.7. Samples: 2355607640. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:31:39,255][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 04:31:39,837][54818] Updated weights for policy 0, policy_version 576808 (0.0016) [2024-04-28 04:31:43,024][54818] Updated weights for policy 0, policy_version 576818 (0.0017) [2024-04-28 04:31:44,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9450651648. Throughput: 0: 61608.0. Samples: 2355799500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:31:44,253][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 04:31:45,248][54818] Updated weights for policy 0, policy_version 576828 (0.0017) [2024-04-28 04:31:48,321][54818] Updated weights for policy 0, policy_version 576838 (0.0017) [2024-04-28 04:31:49,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9450979328. Throughput: 0: 61503.1. Samples: 2356166440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:31:49,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 04:31:49,266][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000576842_9450979328.pth... [2024-04-28 04:31:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000575944_9436266496.pth [2024-04-28 04:31:50,829][54818] Updated weights for policy 0, policy_version 576848 (0.0019) [2024-04-28 04:31:53,665][54818] Updated weights for policy 0, policy_version 576858 (0.0015) [2024-04-28 04:31:54,253][54587] Fps is (10 sec: 63897.3, 60 sec: 61439.9, 300 sec: 61315.1). Total num frames: 9451290624. Throughput: 0: 61376.3. Samples: 2356526340. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:31:54,254][54587] Avg episode reward: [(0, '0.614')] [2024-04-28 04:31:56,478][54818] Updated weights for policy 0, policy_version 576868 (0.0022) [2024-04-28 04:31:59,012][54818] Updated weights for policy 0, policy_version 576878 (0.0017) [2024-04-28 04:31:59,253][54587] Fps is (10 sec: 60621.9, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9451585536. Throughput: 0: 61412.1. Samples: 2356711480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:31:59,253][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 04:32:01,786][54818] Updated weights for policy 0, policy_version 576888 (0.0019) [2024-04-28 04:32:04,246][54818] Updated weights for policy 0, policy_version 576898 (0.0016) [2024-04-28 04:32:04,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.2, 300 sec: 61315.1). Total num frames: 9451896832. Throughput: 0: 61403.6. Samples: 2357083360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:04,253][54587] Avg episode reward: [(0, '0.603')] [2024-04-28 04:32:05,410][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32200 times) [2024-04-28 04:32:07,064][54818] Updated weights for policy 0, policy_version 576908 (0.0016) [2024-04-28 04:32:09,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9452208128. Throughput: 0: 61359.2. Samples: 2357447580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:09,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 04:32:09,343][54818] Updated weights for policy 0, policy_version 576918 (0.0017) [2024-04-28 04:32:12,524][54818] Updated weights for policy 0, policy_version 576928 (0.0017) [2024-04-28 04:32:14,015][54798] Signal inference workers to stop experience collection... (39350 times) [2024-04-28 04:32:14,016][54798] Signal inference workers to resume experience collection... 
(39350 times) [2024-04-28 04:32:14,028][54818] InferenceWorker_p0-w0: stopping experience collection (39350 times) [2024-04-28 04:32:14,028][54818] InferenceWorker_p0-w0: resuming experience collection (39350 times) [2024-04-28 04:32:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9452503040. Throughput: 0: 61248.1. Samples: 2357626360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:14,253][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 04:32:14,741][54818] Updated weights for policy 0, policy_version 576938 (0.0016) [2024-04-28 04:32:17,975][54818] Updated weights for policy 0, policy_version 576948 (0.0016) [2024-04-28 04:32:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9452814336. Throughput: 0: 61072.0. Samples: 2357995200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:19,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 04:32:20,071][54818] Updated weights for policy 0, policy_version 576958 (0.0018) [2024-04-28 04:32:23,238][54818] Updated weights for policy 0, policy_version 576968 (0.0016) [2024-04-28 04:32:24,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9453125632. Throughput: 0: 61372.5. Samples: 2358369400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:24,253][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 04:32:25,846][54818] Updated weights for policy 0, policy_version 576978 (0.0019) [2024-04-28 04:32:28,579][54818] Updated weights for policy 0, policy_version 576988 (0.0016) [2024-04-28 04:32:29,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9453436928. Throughput: 0: 61101.6. Samples: 2358549080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:29,255][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 04:32:31,109][54818] Updated weights for policy 0, policy_version 576998 (0.0016) [2024-04-28 04:32:32,301][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32300 times) [2024-04-28 04:32:33,790][54818] Updated weights for policy 0, policy_version 577008 (0.0016) [2024-04-28 04:32:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9453731840. Throughput: 0: 61124.2. Samples: 2358917020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:34,253][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 04:32:36,502][54818] Updated weights for policy 0, policy_version 577018 (0.0021) [2024-04-28 04:32:39,124][54818] Updated weights for policy 0, policy_version 577028 (0.0016) [2024-04-28 04:32:39,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9454043136. Throughput: 0: 61343.7. Samples: 2359286800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:39,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 04:32:41,753][54818] Updated weights for policy 0, policy_version 577038 (0.0016) [2024-04-28 04:32:44,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9454338048. Throughput: 0: 61120.0. Samples: 2359461880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:44,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 04:32:44,318][54818] Updated weights for policy 0, policy_version 577048 (0.0017) [2024-04-28 04:32:47,361][54818] Updated weights for policy 0, policy_version 577058 (0.0018) [2024-04-28 04:32:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61167.0, 300 sec: 61315.0). Total num frames: 9454649344. Throughput: 0: 61141.2. Samples: 2359834720. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:49,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 04:32:49,261][54587] No heartbeat for components: RolloutWorker_w4 (25717 seconds), RolloutWorker_w5 (11817 seconds) [2024-04-28 04:32:49,605][54818] Updated weights for policy 0, policy_version 577068 (0.0017) [2024-04-28 04:32:52,700][54818] Updated weights for policy 0, policy_version 577078 (0.0016) [2024-04-28 04:32:54,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9454960640. Throughput: 0: 61160.8. Samples: 2360199820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:54,254][54587] Avg episode reward: [(0, '0.660')] [2024-04-28 04:32:54,990][54818] Updated weights for policy 0, policy_version 577088 (0.0015) [2024-04-28 04:32:57,549][54798] Signal inference workers to stop experience collection... (39400 times) [2024-04-28 04:32:57,550][54798] Signal inference workers to resume experience collection... (39400 times) [2024-04-28 04:32:57,567][54818] InferenceWorker_p0-w0: stopping experience collection (39400 times) [2024-04-28 04:32:57,567][54818] InferenceWorker_p0-w0: resuming experience collection (39400 times) [2024-04-28 04:32:58,025][54818] Updated weights for policy 0, policy_version 577098 (0.0016) [2024-04-28 04:32:58,908][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32400 times) [2024-04-28 04:32:59,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9455255552. Throughput: 0: 61282.2. Samples: 2360384060. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:32:59,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 04:33:00,213][54818] Updated weights for policy 0, policy_version 577108 (0.0017) [2024-04-28 04:33:03,179][54818] Updated weights for policy 0, policy_version 577118 (0.0015) [2024-04-28 04:33:04,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9455566848. Throughput: 0: 61336.8. Samples: 2360755360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 04:33:04,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 04:33:05,700][54818] Updated weights for policy 0, policy_version 577128 (0.0016) [2024-04-28 04:33:08,473][54818] Updated weights for policy 0, policy_version 577138 (0.0016) [2024-04-28 04:33:09,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.7, 300 sec: 61315.1). Total num frames: 9455861760. Throughput: 0: 61124.3. Samples: 2361120000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:09,254][54587] Avg episode reward: [(0, '0.550')] [2024-04-28 04:33:11,244][54818] Updated weights for policy 0, policy_version 577148 (0.0016) [2024-04-28 04:33:13,750][54818] Updated weights for policy 0, policy_version 577158 (0.0016) [2024-04-28 04:33:14,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61166.9, 300 sec: 61426.1). Total num frames: 9456173056. Throughput: 0: 61259.2. Samples: 2361305740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:14,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 04:33:16,756][54818] Updated weights for policy 0, policy_version 577168 (0.0016) [2024-04-28 04:33:19,216][54818] Updated weights for policy 0, policy_version 577178 (0.0016) [2024-04-28 04:33:19,256][54587] Fps is (10 sec: 62242.1, 60 sec: 61164.0, 300 sec: 61370.0). Total num frames: 9456484352. Throughput: 0: 61180.6. Samples: 2361670320. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:19,257][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:33:22,325][54818] Updated weights for policy 0, policy_version 577188 (0.0018) [2024-04-28 04:33:24,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.8, 300 sec: 61370.6). Total num frames: 9456795648. Throughput: 0: 60982.1. Samples: 2362031000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:24,254][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 04:33:24,621][54818] Updated weights for policy 0, policy_version 577198 (0.0024) [2024-04-28 04:33:25,542][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32500 times) [2024-04-28 04:33:27,530][54818] Updated weights for policy 0, policy_version 577208 (0.0016) [2024-04-28 04:33:29,253][54587] Fps is (10 sec: 60637.7, 60 sec: 60893.9, 300 sec: 61370.6). Total num frames: 9457090560. Throughput: 0: 61446.6. Samples: 2362226980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:29,262][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 04:33:29,876][54818] Updated weights for policy 0, policy_version 577218 (0.0017) [2024-04-28 04:33:32,714][54818] Updated weights for policy 0, policy_version 577228 (0.0017) [2024-04-28 04:33:34,253][54587] Fps is (10 sec: 58982.9, 60 sec: 60893.9, 300 sec: 61315.1). Total num frames: 9457385472. Throughput: 0: 61246.4. Samples: 2362590800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:34,253][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 04:33:35,207][54818] Updated weights for policy 0, policy_version 577238 (0.0016) [2024-04-28 04:33:38,027][54818] Updated weights for policy 0, policy_version 577248 (0.0017) [2024-04-28 04:33:39,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61426.1). Total num frames: 9457713152. Throughput: 0: 61116.9. Samples: 2362950080. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:39,254][54587] Avg episode reward: [(0, '0.645')] [2024-04-28 04:33:40,397][54818] Updated weights for policy 0, policy_version 577258 (0.0021) [2024-04-28 04:33:43,264][54818] Updated weights for policy 0, policy_version 577268 (0.0015) [2024-04-28 04:33:44,253][54587] Fps is (10 sec: 63897.2, 60 sec: 61440.0, 300 sec: 61426.2). Total num frames: 9458024448. Throughput: 0: 61311.5. Samples: 2363143080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:44,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 04:33:45,782][54818] Updated weights for policy 0, policy_version 577278 (0.0016) [2024-04-28 04:33:48,443][54818] Updated weights for policy 0, policy_version 577288 (0.0018) [2024-04-28 04:33:49,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9458319360. Throughput: 0: 61178.6. Samples: 2363508400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:49,255][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 04:33:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000577290_9458319360.pth... [2024-04-28 04:33:49,322][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000576391_9443590144.pth [2024-04-28 04:33:51,287][54818] Updated weights for policy 0, policy_version 577298 (0.0016) [2024-04-28 04:33:52,556][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32600 times) [2024-04-28 04:33:54,230][54818] Updated weights for policy 0, policy_version 577308 (0.0017) [2024-04-28 04:33:54,253][54587] Fps is (10 sec: 58982.3, 60 sec: 60893.9, 300 sec: 61315.1). Total num frames: 9458614272. Throughput: 0: 61014.3. Samples: 2363865640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:54,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:33:56,967][54818] Updated weights for policy 0, policy_version 577318 (0.0016) [2024-04-28 04:33:57,108][54798] Signal inference workers to stop experience collection... (39450 times) [2024-04-28 04:33:57,142][54818] InferenceWorker_p0-w0: stopping experience collection (39450 times) [2024-04-28 04:33:57,161][54798] Signal inference workers to resume experience collection... (39450 times) [2024-04-28 04:33:57,161][54818] InferenceWorker_p0-w0: resuming experience collection (39450 times) [2024-04-28 04:33:59,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9458925568. Throughput: 0: 61144.4. Samples: 2364057240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:33:59,254][54587] Avg episode reward: [(0, '0.486')] [2024-04-28 04:33:59,497][54818] Updated weights for policy 0, policy_version 577328 (0.0017) [2024-04-28 04:34:02,251][54818] Updated weights for policy 0, policy_version 577338 (0.0016) [2024-04-28 04:34:04,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9459236864. Throughput: 0: 61129.5. Samples: 2364420980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:34:04,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 04:34:04,762][54818] Updated weights for policy 0, policy_version 577348 (0.0016) [2024-04-28 04:34:07,549][54818] Updated weights for policy 0, policy_version 577358 (0.0016) [2024-04-28 04:34:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61315.0). Total num frames: 9459531776. Throughput: 0: 61176.0. Samples: 2364783920. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:34:09,255][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 04:34:10,054][54818] Updated weights for policy 0, policy_version 577368 (0.0016) [2024-04-28 04:34:12,701][54818] Updated weights for policy 0, policy_version 577378 (0.0017) [2024-04-28 04:34:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.8, 300 sec: 61370.6). Total num frames: 9459843072. Throughput: 0: 61118.1. Samples: 2364977300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:34:14,254][54587] Avg episode reward: [(0, '0.540')] [2024-04-28 04:34:15,596][54818] Updated weights for policy 0, policy_version 577388 (0.0016) [2024-04-28 04:34:18,052][54818] Updated weights for policy 0, policy_version 577398 (0.0016) [2024-04-28 04:34:19,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61169.7, 300 sec: 61370.6). Total num frames: 9460154368. Throughput: 0: 61017.6. Samples: 2365336600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:34:19,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 04:34:19,665][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32700 times) [2024-04-28 04:34:20,759][54818] Updated weights for policy 0, policy_version 577408 (0.0017) [2024-04-28 04:34:23,248][54818] Updated weights for policy 0, policy_version 577418 (0.0018) [2024-04-28 04:34:24,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9460465664. Throughput: 0: 61029.7. Samples: 2365696420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:34:24,255][54587] Avg episode reward: [(0, '0.590')] [2024-04-28 04:34:26,450][54818] Updated weights for policy 0, policy_version 577428 (0.0015) [2024-04-28 04:34:28,649][54818] Updated weights for policy 0, policy_version 577438 (0.0018) [2024-04-28 04:34:29,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9460760576. 
Throughput: 0: 61088.5. Samples: 2365892060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:34:29,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 04:34:31,804][54818] Updated weights for policy 0, policy_version 577448 (0.0016) [2024-04-28 04:34:34,180][54818] Updated weights for policy 0, policy_version 577458 (0.0016) [2024-04-28 04:34:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61439.9, 300 sec: 61370.6). Total num frames: 9461071872. Throughput: 0: 60904.0. Samples: 2366249080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 04:34:34,255][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 04:34:37,208][54818] Updated weights for policy 0, policy_version 577468 (0.0015) [2024-04-28 04:34:39,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61166.9, 300 sec: 61370.6). Total num frames: 9461383168. Throughput: 0: 61038.2. Samples: 2366612360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:34:39,255][54587] Avg episode reward: [(0, '0.627')] [2024-04-28 04:34:39,635][54818] Updated weights for policy 0, policy_version 577478 (0.0016) [2024-04-28 04:34:42,327][54818] Updated weights for policy 0, policy_version 577488 (0.0022) [2024-04-28 04:34:44,253][54587] Fps is (10 sec: 60621.0, 60 sec: 60893.9, 300 sec: 61315.0). Total num frames: 9461678080. Throughput: 0: 61164.9. Samples: 2366809660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:34:44,254][54587] Avg episode reward: [(0, '0.655')] [2024-04-28 04:34:45,252][54818] Updated weights for policy 0, policy_version 577498 (0.0018) [2024-04-28 04:34:46,370][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32800 times) [2024-04-28 04:34:47,483][54818] Updated weights for policy 0, policy_version 577508 (0.0016) [2024-04-28 04:34:49,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61315.1). Total num frames: 9461989376. Throughput: 0: 61015.2. Samples: 2367166660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:34:49,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 04:34:50,416][54818] Updated weights for policy 0, policy_version 577518 (0.0017) [2024-04-28 04:34:52,819][54818] Updated weights for policy 0, policy_version 577528 (0.0016) [2024-04-28 04:34:54,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61370.6). Total num frames: 9462300672. Throughput: 0: 61053.8. Samples: 2367531340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:34:54,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 04:34:55,737][54818] Updated weights for policy 0, policy_version 577538 (0.0016) [2024-04-28 04:34:56,811][54798] Signal inference workers to stop experience collection... (39500 times) [2024-04-28 04:34:56,812][54798] Signal inference workers to resume experience collection... (39500 times) [2024-04-28 04:34:56,828][54818] InferenceWorker_p0-w0: stopping experience collection (39500 times) [2024-04-28 04:34:56,828][54818] InferenceWorker_p0-w0: resuming experience collection (39500 times) [2024-04-28 04:34:58,181][54818] Updated weights for policy 0, policy_version 577548 (0.0016) [2024-04-28 04:34:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.1, 300 sec: 61370.6). Total num frames: 9462611968. Throughput: 0: 61153.5. Samples: 2367729200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:34:59,253][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 04:35:01,309][54818] Updated weights for policy 0, policy_version 577558 (0.0015) [2024-04-28 04:35:03,396][54818] Updated weights for policy 0, policy_version 577568 (0.0017) [2024-04-28 04:35:04,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.1, 300 sec: 61315.0). Total num frames: 9462923264. Throughput: 0: 61268.1. Samples: 2368093660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:04,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 04:35:06,732][54818] Updated weights for policy 0, policy_version 577578 (0.0017) [2024-04-28 04:35:08,637][54818] Updated weights for policy 0, policy_version 577588 (0.0016) [2024-04-28 04:35:09,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61713.1, 300 sec: 61315.1). Total num frames: 9463234560. Throughput: 0: 61252.1. Samples: 2368452760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:09,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 04:35:11,865][54818] Updated weights for policy 0, policy_version 577598 (0.0017) [2024-04-28 04:35:12,739][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (32900 times) [2024-04-28 04:35:14,253][54587] Fps is (10 sec: 58982.8, 60 sec: 61167.1, 300 sec: 61259.5). Total num frames: 9463513088. Throughput: 0: 61213.8. Samples: 2368646680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:14,253][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 04:35:14,281][54818] Updated weights for policy 0, policy_version 577608 (0.0017) [2024-04-28 04:35:17,109][54818] Updated weights for policy 0, policy_version 577618 (0.0015) [2024-04-28 04:35:19,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9463840768. Throughput: 0: 61299.1. Samples: 2369007540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:19,254][54587] Avg episode reward: [(0, '0.691')] [2024-04-28 04:35:19,977][54818] Updated weights for policy 0, policy_version 577628 (0.0015) [2024-04-28 04:35:22,424][54818] Updated weights for policy 0, policy_version 577638 (0.0016) [2024-04-28 04:35:24,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9464135680. Throughput: 0: 61236.5. Samples: 2369368000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:24,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 04:35:25,397][54818] Updated weights for policy 0, policy_version 577648 (0.0016) [2024-04-28 04:35:27,578][54818] Updated weights for policy 0, policy_version 577658 (0.0017) [2024-04-28 04:35:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9464446976. Throughput: 0: 61136.0. Samples: 2369560780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:29,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 04:35:30,671][54818] Updated weights for policy 0, policy_version 577668 (0.0017) [2024-04-28 04:35:33,194][54818] Updated weights for policy 0, policy_version 577678 (0.0016) [2024-04-28 04:35:34,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.1, 300 sec: 61259.5). Total num frames: 9464741888. Throughput: 0: 61132.0. Samples: 2369917600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:34,253][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 04:35:36,139][54818] Updated weights for policy 0, policy_version 577688 (0.0018) [2024-04-28 04:35:38,047][54798] Signal inference workers to stop experience collection... (39550 times) [2024-04-28 04:35:38,047][54798] Signal inference workers to resume experience collection... (39550 times) [2024-04-28 04:35:38,067][54818] InferenceWorker_p0-w0: stopping experience collection (39550 times) [2024-04-28 04:35:38,067][54818] InferenceWorker_p0-w0: resuming experience collection (39550 times) [2024-04-28 04:35:38,448][54818] Updated weights for policy 0, policy_version 577698 (0.0017) [2024-04-28 04:35:39,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9465053184. Throughput: 0: 61119.5. Samples: 2370281720. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:39,254][54587] Avg episode reward: [(0, '0.575')] [2024-04-28 04:35:40,175][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (33000 times) [2024-04-28 04:35:41,580][54818] Updated weights for policy 0, policy_version 577708 (0.0015) [2024-04-28 04:35:43,673][54818] Updated weights for policy 0, policy_version 577718 (0.0017) [2024-04-28 04:35:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9465348096. Throughput: 0: 61094.3. Samples: 2370478440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:44,253][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 04:35:46,720][54818] Updated weights for policy 0, policy_version 577728 (0.0017) [2024-04-28 04:35:49,109][54818] Updated weights for policy 0, policy_version 577738 (0.0016) [2024-04-28 04:35:49,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9465659392. Throughput: 0: 60822.2. Samples: 2370830660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:49,253][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 04:35:49,264][54587] No heartbeat for components: RolloutWorker_w4 (25897 seconds), RolloutWorker_w5 (11997 seconds) [2024-04-28 04:35:49,265][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000577738_9465659392.pth... [2024-04-28 04:35:49,321][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000576842_9450979328.pth [2024-04-28 04:35:52,113][54818] Updated weights for policy 0, policy_version 577748 (0.0016) [2024-04-28 04:35:54,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61167.0, 300 sec: 61203.9). Total num frames: 9465970688. Throughput: 0: 61078.6. Samples: 2371201300. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:54,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 04:35:54,430][54818] Updated weights for policy 0, policy_version 577758 (0.0015) [2024-04-28 04:35:57,319][54818] Updated weights for policy 0, policy_version 577768 (0.0019) [2024-04-28 04:35:59,253][54587] Fps is (10 sec: 60620.5, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9466265600. Throughput: 0: 60947.9. Samples: 2371389340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:35:59,255][54587] Avg episode reward: [(0, '0.671')] [2024-04-28 04:36:00,159][54818] Updated weights for policy 0, policy_version 577778 (0.0015) [2024-04-28 04:36:02,480][54818] Updated weights for policy 0, policy_version 577788 (0.0020) [2024-04-28 04:36:04,253][54587] Fps is (10 sec: 58982.9, 60 sec: 60620.9, 300 sec: 61148.4). Total num frames: 9466560512. Throughput: 0: 60824.1. Samples: 2371744620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:36:04,253][54587] Avg episode reward: [(0, '0.626')] [2024-04-28 04:36:05,875][54818] Updated weights for policy 0, policy_version 577798 (0.0016) [2024-04-28 04:36:06,732][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (33100 times) [2024-04-28 04:36:07,762][54818] Updated weights for policy 0, policy_version 577808 (0.0017) [2024-04-28 04:36:09,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60620.6, 300 sec: 61203.9). Total num frames: 9466871808. Throughput: 0: 61060.7. Samples: 2372115740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 04:36:09,254][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 04:36:11,400][54818] Updated weights for policy 0, policy_version 577818 (0.0016) [2024-04-28 04:36:13,211][54818] Updated weights for policy 0, policy_version 577828 (0.0018) [2024-04-28 04:36:14,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9467183104. 
Throughput: 0: 61007.2. Samples: 2372306100. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:14,253][54587] Avg episode reward: [(0, '0.558')] [2024-04-28 04:36:16,669][54818] Updated weights for policy 0, policy_version 577838 (0.0016) [2024-04-28 04:36:19,000][54818] Updated weights for policy 0, policy_version 577848 (0.0019) [2024-04-28 04:36:19,253][54587] Fps is (10 sec: 60621.9, 60 sec: 60620.9, 300 sec: 61148.4). Total num frames: 9467478016. Throughput: 0: 60852.4. Samples: 2372655960. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:19,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 04:36:21,867][54818] Updated weights for policy 0, policy_version 577858 (0.0017) [2024-04-28 04:36:21,925][54798] Signal inference workers to stop experience collection... (39600 times) [2024-04-28 04:36:21,961][54818] InferenceWorker_p0-w0: stopping experience collection (39600 times) [2024-04-28 04:36:22,014][54798] Signal inference workers to resume experience collection... (39600 times) [2024-04-28 04:36:22,014][54818] InferenceWorker_p0-w0: resuming experience collection (39600 times) [2024-04-28 04:36:24,253][54587] Fps is (10 sec: 58981.6, 60 sec: 60620.7, 300 sec: 61092.9). Total num frames: 9467772928. Throughput: 0: 60972.0. Samples: 2373025460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:24,255][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 04:36:24,392][54818] Updated weights for policy 0, policy_version 577868 (0.0019) [2024-04-28 04:36:26,962][54818] Updated weights for policy 0, policy_version 577878 (0.0017) [2024-04-28 04:36:29,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60620.8, 300 sec: 61148.4). Total num frames: 9468084224. Throughput: 0: 60923.1. Samples: 2373219980. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:29,253][54587] Avg episode reward: [(0, '0.639')] [2024-04-28 04:36:29,584][54818] Updated weights for policy 0, policy_version 577888 (0.0019) [2024-04-28 04:36:32,116][54818] Updated weights for policy 0, policy_version 577898 (0.0018) [2024-04-28 04:36:33,118][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (33200 times) [2024-04-28 04:36:34,253][54587] Fps is (10 sec: 62260.1, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9468395520. Throughput: 0: 61039.6. Samples: 2373577440. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:34,253][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 04:36:34,767][54818] Updated weights for policy 0, policy_version 577908 (0.0016) [2024-04-28 04:36:37,499][54818] Updated weights for policy 0, policy_version 577918 (0.0017) [2024-04-28 04:36:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 60621.0, 300 sec: 61148.4). Total num frames: 9468690432. Throughput: 0: 60964.1. Samples: 2373944680. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:39,253][54587] Avg episode reward: [(0, '0.654')] [2024-04-28 04:36:40,346][54818] Updated weights for policy 0, policy_version 577928 (0.0020) [2024-04-28 04:36:42,941][54818] Updated weights for policy 0, policy_version 577938 (0.0017) [2024-04-28 04:36:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9469001728. Throughput: 0: 60963.3. Samples: 2374132680. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:44,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 04:36:46,179][54818] Updated weights for policy 0, policy_version 577948 (0.0017) [2024-04-28 04:36:48,170][54818] Updated weights for policy 0, policy_version 577958 (0.0016) [2024-04-28 04:36:49,253][54587] Fps is (10 sec: 62259.2, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9469313024. 
Throughput: 0: 61203.1. Samples: 2374498760. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:49,253][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 04:36:51,385][54818] Updated weights for policy 0, policy_version 577968 (0.0015) [2024-04-28 04:36:53,449][54818] Updated weights for policy 0, policy_version 577978 (0.0018) [2024-04-28 04:36:54,253][54587] Fps is (10 sec: 62258.7, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9469624320. Throughput: 0: 61028.2. Samples: 2374862000. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:54,254][54587] Avg episode reward: [(0, '0.562')] [2024-04-28 04:36:56,617][54818] Updated weights for policy 0, policy_version 577988 (0.0017) [2024-04-28 04:36:58,876][54818] Updated weights for policy 0, policy_version 577998 (0.0015) [2024-04-28 04:36:59,253][54587] Fps is (10 sec: 60619.5, 60 sec: 60893.8, 300 sec: 61092.8). Total num frames: 9469919232. Throughput: 0: 61039.7. Samples: 2375052900. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:36:59,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:37:00,340][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (33300 times) [2024-04-28 04:37:01,148][54798] Signal inference workers to stop experience collection... (39650 times) [2024-04-28 04:37:01,149][54798] Signal inference workers to resume experience collection... (39650 times) [2024-04-28 04:37:01,172][54818] InferenceWorker_p0-w0: stopping experience collection (39650 times) [2024-04-28 04:37:01,172][54818] InferenceWorker_p0-w0: resuming experience collection (39650 times) [2024-04-28 04:37:01,869][54818] Updated weights for policy 0, policy_version 578008 (0.0016) [2024-04-28 04:37:04,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9470230528. Throughput: 0: 61437.6. Samples: 2375420660. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:37:04,254][54587] Avg episode reward: [(0, '0.718')] [2024-04-28 04:37:04,465][54818] Updated weights for policy 0, policy_version 578018 (0.0017) [2024-04-28 04:37:07,144][54818] Updated weights for policy 0, policy_version 578028 (0.0015) [2024-04-28 04:37:09,253][54587] Fps is (10 sec: 62259.9, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9470541824. Throughput: 0: 61317.4. Samples: 2375784740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:37:09,254][54587] Avg episode reward: [(0, '0.664')] [2024-04-28 04:37:09,750][54818] Updated weights for policy 0, policy_version 578038 (0.0016) [2024-04-28 04:37:12,302][54818] Updated weights for policy 0, policy_version 578048 (0.0016) [2024-04-28 04:37:14,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9470853120. Throughput: 0: 61208.8. Samples: 2375974380. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:37:14,254][54587] Avg episode reward: [(0, '0.636')] [2024-04-28 04:37:15,000][54818] Updated weights for policy 0, policy_version 578058 (0.0016) [2024-04-28 04:37:17,836][54818] Updated weights for policy 0, policy_version 578068 (0.0018) [2024-04-28 04:37:19,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9471148032. Throughput: 0: 61303.1. Samples: 2376336080. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:37:19,253][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 04:37:20,456][54818] Updated weights for policy 0, policy_version 578078 (0.0016) [2024-04-28 04:37:23,263][54818] Updated weights for policy 0, policy_version 578088 (0.0015) [2024-04-28 04:37:24,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61440.2, 300 sec: 61092.9). Total num frames: 9471459328. Throughput: 0: 61207.6. Samples: 2376699020. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:37:24,253][54587] Avg episode reward: [(0, '0.566')] [2024-04-28 04:37:25,902][54818] Updated weights for policy 0, policy_version 578098 (0.0016) [2024-04-28 04:37:27,071][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (33400 times) [2024-04-28 04:37:28,479][54818] Updated weights for policy 0, policy_version 578108 (0.0016) [2024-04-28 04:37:29,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9471770624. Throughput: 0: 61249.2. Samples: 2376888900. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:37:29,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 04:37:31,108][54818] Updated weights for policy 0, policy_version 578118 (0.0016) [2024-04-28 04:37:33,729][54818] Updated weights for policy 0, policy_version 578128 (0.0018) [2024-04-28 04:37:34,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9472081920. Throughput: 0: 61226.1. Samples: 2377253940. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:37:34,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 04:37:36,712][54818] Updated weights for policy 0, policy_version 578138 (0.0016) [2024-04-28 04:37:39,011][54818] Updated weights for policy 0, policy_version 578148 (0.0017) [2024-04-28 04:37:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9472376832. Throughput: 0: 61282.3. Samples: 2377619700. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-04-28 04:37:39,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 04:37:42,020][54818] Updated weights for policy 0, policy_version 578158 (0.0017) [2024-04-28 04:37:44,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61439.8, 300 sec: 61148.4). Total num frames: 9472688128. Throughput: 0: 61187.1. Samples: 2377806320. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:37:44,254][54587] Avg episode reward: [(0, '0.510')] [2024-04-28 04:37:44,541][54818] Updated weights for policy 0, policy_version 578168 (0.0017) [2024-04-28 04:37:47,235][54818] Updated weights for policy 0, policy_version 578178 (0.0017) [2024-04-28 04:37:49,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9472983040. Throughput: 0: 61037.8. Samples: 2378167360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:37:49,254][54587] Avg episode reward: [(0, '0.537')] [2024-04-28 04:37:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000578185_9472983040.pth... [2024-04-28 04:37:49,318][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000577290_9458319360.pth [2024-04-28 04:37:49,907][54818] Updated weights for policy 0, policy_version 578188 (0.0019) [2024-04-28 04:37:52,398][54818] Updated weights for policy 0, policy_version 578198 (0.0020) [2024-04-28 04:37:53,259][54798] Signal inference workers to stop experience collection... (39700 times) [2024-04-28 04:37:53,297][54818] InferenceWorker_p0-w0: stopping experience collection (39700 times) [2024-04-28 04:37:53,351][54798] Signal inference workers to resume experience collection... (39700 times) [2024-04-28 04:37:53,351][54818] InferenceWorker_p0-w0: resuming experience collection (39700 times) [2024-04-28 04:37:53,622][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (33500 times) [2024-04-28 04:37:54,253][54587] Fps is (10 sec: 60621.8, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9473294336. Throughput: 0: 61089.8. Samples: 2378533780. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:37:54,254][54587] Avg episode reward: [(0, '0.641')] [2024-04-28 04:37:55,368][54818] Updated weights for policy 0, policy_version 578208 (0.0018) [2024-04-28 04:37:58,237][54818] Updated weights for policy 0, policy_version 578218 (0.0016) [2024-04-28 04:37:59,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9473605632. Throughput: 0: 61096.4. Samples: 2378723720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:37:59,255][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 04:38:00,609][54818] Updated weights for policy 0, policy_version 578228 (0.0017) [2024-04-28 04:38:03,482][54818] Updated weights for policy 0, policy_version 578238 (0.0016) [2024-04-28 04:38:04,253][54587] Fps is (10 sec: 62258.7, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9473916928. Throughput: 0: 61026.1. Samples: 2379082260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:04,254][54587] Avg episode reward: [(0, '0.653')] [2024-04-28 04:38:06,028][54818] Updated weights for policy 0, policy_version 578248 (0.0015) [2024-04-28 04:38:08,720][54818] Updated weights for policy 0, policy_version 578258 (0.0017) [2024-04-28 04:38:09,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9474211840. Throughput: 0: 61291.9. Samples: 2379457160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:09,255][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 04:38:11,380][54818] Updated weights for policy 0, policy_version 578268 (0.0016) [2024-04-28 04:38:14,094][54818] Updated weights for policy 0, policy_version 578278 (0.0017) [2024-04-28 04:38:14,253][54587] Fps is (10 sec: 58981.9, 60 sec: 60893.8, 300 sec: 61093.4). Total num frames: 9474506752. Throughput: 0: 61234.1. Samples: 2379644440. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:14,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 04:38:16,813][54818] Updated weights for policy 0, policy_version 578288 (0.0016) [2024-04-28 04:38:19,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.8, 300 sec: 61092.9). Total num frames: 9474818048. Throughput: 0: 61085.3. Samples: 2380002780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:19,254][54587] Avg episode reward: [(0, '0.666')] [2024-04-28 04:38:19,366][54818] Updated weights for policy 0, policy_version 578298 (0.0016) [2024-04-28 04:38:20,587][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (33600 times) [2024-04-28 04:38:21,992][54818] Updated weights for policy 0, policy_version 578308 (0.0019) [2024-04-28 04:38:24,253][54587] Fps is (10 sec: 62260.5, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9475129344. Throughput: 0: 61293.8. Samples: 2380377920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:24,253][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 04:38:24,738][54818] Updated weights for policy 0, policy_version 578318 (0.0016) [2024-04-28 04:38:27,402][54818] Updated weights for policy 0, policy_version 578328 (0.0016) [2024-04-28 04:38:29,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9475440640. Throughput: 0: 61137.9. Samples: 2380557520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:29,254][54587] Avg episode reward: [(0, '0.587')] [2024-04-28 04:38:30,143][54818] Updated weights for policy 0, policy_version 578338 (0.0016) [2024-04-28 04:38:32,762][54818] Updated weights for policy 0, policy_version 578348 (0.0017) [2024-04-28 04:38:34,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.8, 300 sec: 61092.9). Total num frames: 9475735552. Throughput: 0: 61331.5. Samples: 2380927280. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:34,254][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 04:38:34,254][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (1800 times) [2024-04-28 04:38:35,433][54818] Updated weights for policy 0, policy_version 578358 (0.0016) [2024-04-28 04:38:38,052][54818] Updated weights for policy 0, policy_version 578368 (0.0016) [2024-04-28 04:38:39,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9476046848. Throughput: 0: 61349.3. Samples: 2381294500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:39,254][54587] Avg episode reward: [(0, '0.584')] [2024-04-28 04:38:40,962][54818] Updated weights for policy 0, policy_version 578378 (0.0016) [2024-04-28 04:38:43,605][54818] Updated weights for policy 0, policy_version 578388 (0.0017) [2024-04-28 04:38:44,253][54587] Fps is (10 sec: 60621.4, 60 sec: 60894.1, 300 sec: 61092.9). Total num frames: 9476341760. Throughput: 0: 61157.1. Samples: 2381475780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:44,253][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 04:38:44,466][54798] Signal inference workers to stop experience collection... (39750 times) [2024-04-28 04:38:44,467][54798] Signal inference workers to resume experience collection... (39750 times) [2024-04-28 04:38:44,494][54818] InferenceWorker_p0-w0: stopping experience collection (39750 times) [2024-04-28 04:38:44,494][54818] InferenceWorker_p0-w0: resuming experience collection (39750 times) [2024-04-28 04:38:46,142][54818] Updated weights for policy 0, policy_version 578398 (0.0016) [2024-04-28 04:38:47,492][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (33700 times) [2024-04-28 04:38:48,791][54818] Updated weights for policy 0, policy_version 578408 (0.0018) [2024-04-28 04:38:49,253][54587] Fps is (10 sec: 58982.3, 60 sec: 60893.9, 300 sec: 61092.9). Total num frames: 9476636672. Throughput: 0: 61415.1. Samples: 2381845940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:49,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 04:38:49,262][54587] No heartbeat for components: RolloutWorker_w4 (26077 seconds), RolloutWorker_w5 (12177 seconds) [2024-04-28 04:38:51,432][54818] Updated weights for policy 0, policy_version 578418 (0.0018) [2024-04-28 04:38:54,109][54818] Updated weights for policy 0, policy_version 578428 (0.0016) [2024-04-28 04:38:54,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9476964352. Throughput: 0: 61333.8. Samples: 2382217180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:54,254][54587] Avg episode reward: [(0, '0.593')] [2024-04-28 04:38:56,772][54818] Updated weights for policy 0, policy_version 578438 (0.0021) [2024-04-28 04:38:59,253][54587] Fps is (10 sec: 63897.5, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9477275648. Throughput: 0: 61204.5. Samples: 2382398640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:38:59,254][54587] Avg episode reward: [(0, '0.637')] [2024-04-28 04:38:59,341][54818] Updated weights for policy 0, policy_version 578448 (0.0016) [2024-04-28 04:39:02,008][54818] Updated weights for policy 0, policy_version 578458 (0.0016) [2024-04-28 04:39:04,253][54587] Fps is (10 sec: 60620.1, 60 sec: 60893.8, 300 sec: 61148.4). Total num frames: 9477570560. Throughput: 0: 61383.9. Samples: 2382765060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:39:04,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 04:39:04,613][54818] Updated weights for policy 0, policy_version 578468 (0.0015) [2024-04-28 04:39:07,361][54818] Updated weights for policy 0, policy_version 578478 (0.0016) [2024-04-28 04:39:09,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9477881856. Throughput: 0: 61309.2. Samples: 2383136840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:39:09,254][54587] Avg episode reward: [(0, '0.509')] [2024-04-28 04:39:10,041][54818] Updated weights for policy 0, policy_version 578488 (0.0016) [2024-04-28 04:39:12,959][54818] Updated weights for policy 0, policy_version 578498 (0.0016) [2024-04-28 04:39:13,951][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (33800 times) [2024-04-28 04:39:14,253][54587] Fps is (10 sec: 62260.0, 60 sec: 61440.1, 300 sec: 61148.4). Total num frames: 9478193152. Throughput: 0: 61384.1. Samples: 2383319800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 04:39:14,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 04:39:15,606][54818] Updated weights for policy 0, policy_version 578508 (0.0017) [2024-04-28 04:39:18,359][54818] Updated weights for policy 0, policy_version 578518 (0.0017) [2024-04-28 04:39:19,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9478504448. Throughput: 0: 61348.0. Samples: 2383687940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:39:19,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 04:39:20,942][54818] Updated weights for policy 0, policy_version 578528 (0.0017) [2024-04-28 04:39:23,611][54818] Updated weights for policy 0, policy_version 578538 (0.0016) [2024-04-28 04:39:24,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9478815744. 
Throughput: 0: 61250.7. Samples: 2384050780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:39:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:39:26,450][54818] Updated weights for policy 0, policy_version 578548 (0.0016) [2024-04-28 04:39:28,925][54818] Updated weights for policy 0, policy_version 578558 (0.0018) [2024-04-28 04:39:29,253][54587] Fps is (10 sec: 58982.9, 60 sec: 60894.0, 300 sec: 61092.9). Total num frames: 9479094272. Throughput: 0: 61467.1. Samples: 2384241800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:39:29,253][54587] Avg episode reward: [(0, '0.522')] [2024-04-28 04:39:31,792][54818] Updated weights for policy 0, policy_version 578568 (0.0016) [2024-04-28 04:39:34,237][54818] Updated weights for policy 0, policy_version 578578 (0.0017) [2024-04-28 04:39:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9479421952. Throughput: 0: 61195.6. Samples: 2384599740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:39:34,254][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 04:39:37,002][54818] Updated weights for policy 0, policy_version 578588 (0.0017) [2024-04-28 04:39:37,266][54798] Signal inference workers to stop experience collection... (39800 times) [2024-04-28 04:39:37,307][54818] InferenceWorker_p0-w0: stopping experience collection (39800 times) [2024-04-28 04:39:37,326][54798] Signal inference workers to resume experience collection... (39800 times) [2024-04-28 04:39:37,326][54818] InferenceWorker_p0-w0: resuming experience collection (39800 times) [2024-04-28 04:39:39,253][54587] Fps is (10 sec: 63897.4, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9479733248. Throughput: 0: 61221.8. Samples: 2384972160. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:39:39,254][54587] Avg episode reward: [(0, '0.573')] [2024-04-28 04:39:39,351][54818] Updated weights for policy 0, policy_version 578598 (0.0021) [2024-04-28 04:39:40,900][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (33900 times) [2024-04-28 04:39:42,301][54818] Updated weights for policy 0, policy_version 578608 (0.0016) [2024-04-28 04:39:44,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61439.9, 300 sec: 61148.4). Total num frames: 9480028160. Throughput: 0: 61238.6. Samples: 2385154380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:39:44,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 04:39:44,820][54818] Updated weights for policy 0, policy_version 578618 (0.0016) [2024-04-28 04:39:47,632][54818] Updated weights for policy 0, policy_version 578628 (0.0016) [2024-04-28 04:39:49,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61713.2, 300 sec: 61148.4). Total num frames: 9480339456. Throughput: 0: 61316.3. Samples: 2385524280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:39:49,253][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 04:39:49,329][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000578635_9480355840.pth... [2024-04-28 04:39:49,390][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000577738_9465659392.pth [2024-04-28 04:39:50,270][54818] Updated weights for policy 0, policy_version 578638 (0.0017) [2024-04-28 04:39:53,197][54818] Updated weights for policy 0, policy_version 578648 (0.0018) [2024-04-28 04:39:54,253][54587] Fps is (10 sec: 62259.5, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9480650752. Throughput: 0: 61212.5. Samples: 2385891400. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:39:54,255][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 04:39:55,757][54818] Updated weights for policy 0, policy_version 578658 (0.0018) [2024-04-28 04:39:58,315][54818] Updated weights for policy 0, policy_version 578668 (0.0017) [2024-04-28 04:39:59,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61440.0, 300 sec: 61148.4). Total num frames: 9480962048. Throughput: 0: 61243.0. Samples: 2386075740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:39:59,255][54587] Avg episode reward: [(0, '0.569')] [2024-04-28 04:40:00,941][54818] Updated weights for policy 0, policy_version 578678 (0.0016) [2024-04-28 04:40:03,555][54818] Updated weights for policy 0, policy_version 578688 (0.0017) [2024-04-28 04:40:04,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61440.1, 300 sec: 61092.9). Total num frames: 9481256960. Throughput: 0: 61359.6. Samples: 2386449120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:40:04,255][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 04:40:06,225][54818] Updated weights for policy 0, policy_version 578698 (0.0017) [2024-04-28 04:40:07,755][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34000 times) [2024-04-28 04:40:08,765][54818] Updated weights for policy 0, policy_version 578708 (0.0017) [2024-04-28 04:40:09,253][54587] Fps is (10 sec: 58982.8, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9481551872. Throughput: 0: 61268.8. Samples: 2386807880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:40:09,254][54587] Avg episode reward: [(0, '0.643')] [2024-04-28 04:40:11,490][54818] Updated weights for policy 0, policy_version 578718 (0.0017) [2024-04-28 04:40:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61092.9). Total num frames: 9481863168. Throughput: 0: 61161.6. Samples: 2386994080. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:40:14,256][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:40:14,357][54818] Updated weights for policy 0, policy_version 578728 (0.0022) [2024-04-28 04:40:17,327][54818] Updated weights for policy 0, policy_version 578738 (0.0017) [2024-04-28 04:40:19,253][54587] Fps is (10 sec: 62259.8, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9482174464. Throughput: 0: 61575.2. Samples: 2387370620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:40:19,253][54587] Avg episode reward: [(0, '0.615')] [2024-04-28 04:40:19,633][54818] Updated weights for policy 0, policy_version 578748 (0.0015) [2024-04-28 04:40:20,480][54798] Signal inference workers to stop experience collection... (39850 times) [2024-04-28 04:40:20,486][54818] InferenceWorker_p0-w0: stopping experience collection (39850 times) [2024-04-28 04:40:20,487][54798] Signal inference workers to resume experience collection... (39850 times) [2024-04-28 04:40:20,497][54818] InferenceWorker_p0-w0: resuming experience collection (39850 times) [2024-04-28 04:40:22,623][54818] Updated weights for policy 0, policy_version 578758 (0.0017) [2024-04-28 04:40:24,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61166.9, 300 sec: 61148.4). Total num frames: 9482485760. Throughput: 0: 61204.4. Samples: 2387726360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:40:24,254][54587] Avg episode reward: [(0, '0.520')] [2024-04-28 04:40:24,861][54818] Updated weights for policy 0, policy_version 578768 (0.0017) [2024-04-28 04:40:27,837][54818] Updated weights for policy 0, policy_version 578778 (0.0016) [2024-04-28 04:40:29,253][54587] Fps is (10 sec: 60619.4, 60 sec: 61439.8, 300 sec: 61148.4). Total num frames: 9482780672. Throughput: 0: 61417.2. Samples: 2387918160. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:40:29,254][54587] Avg episode reward: [(0, '0.625')] [2024-04-28 04:40:30,285][54818] Updated weights for policy 0, policy_version 578788 (0.0016) [2024-04-28 04:40:33,046][54818] Updated weights for policy 0, policy_version 578798 (0.0016) [2024-04-28 04:40:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.0, 300 sec: 61148.4). Total num frames: 9483091968. Throughput: 0: 61538.2. Samples: 2388293500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:40:34,254][54587] Avg episode reward: [(0, '0.600')] [2024-04-28 04:40:34,264][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34100 times) [2024-04-28 04:40:35,687][54818] Updated weights for policy 0, policy_version 578808 (0.0017) [2024-04-28 04:40:38,263][54818] Updated weights for policy 0, policy_version 578818 (0.0017) [2024-04-28 04:40:39,253][54587] Fps is (10 sec: 62260.1, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9483403264. Throughput: 0: 61253.3. Samples: 2388647800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:40:39,254][54587] Avg episode reward: [(0, '0.619')] [2024-04-28 04:40:40,766][54818] Updated weights for policy 0, policy_version 578828 (0.0015) [2024-04-28 04:40:43,483][54818] Updated weights for policy 0, policy_version 578838 (0.0016) [2024-04-28 04:40:44,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61440.1, 300 sec: 61204.0). Total num frames: 9483714560. Throughput: 0: 61396.6. Samples: 2388838580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 04:40:44,254][54587] Avg episode reward: [(0, '0.651')] [2024-04-28 04:40:46,000][54818] Updated weights for policy 0, policy_version 578848 (0.0017) [2024-04-28 04:40:48,648][54818] Updated weights for policy 0, policy_version 578858 (0.0018) [2024-04-28 04:40:49,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.8, 300 sec: 61148.4). Total num frames: 9484009472. 
Throughput: 0: 61469.7. Samples: 2389215260. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:40:49,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 04:40:51,861][54818] Updated weights for policy 0, policy_version 578868 (0.0016) [2024-04-28 04:40:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9484320768. Throughput: 0: 61284.5. Samples: 2389565680. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:40:54,254][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 04:40:54,603][54818] Updated weights for policy 0, policy_version 578878 (0.0017) [2024-04-28 04:40:55,123][54798] Signal inference workers to stop experience collection... (39900 times) [2024-04-28 04:40:55,155][54818] InferenceWorker_p0-w0: stopping experience collection (39900 times) [2024-04-28 04:40:55,183][54798] Signal inference workers to resume experience collection... (39900 times) [2024-04-28 04:40:55,184][54818] InferenceWorker_p0-w0: resuming experience collection (39900 times) [2024-04-28 04:40:57,474][54818] Updated weights for policy 0, policy_version 578888 (0.0016) [2024-04-28 04:40:59,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9484632064. Throughput: 0: 61389.4. Samples: 2389756600. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:40:59,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 04:41:00,051][54818] Updated weights for policy 0, policy_version 578898 (0.0021) [2024-04-28 04:41:00,588][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34200 times) [2024-04-28 04:41:02,753][54818] Updated weights for policy 0, policy_version 578908 (0.0016) [2024-04-28 04:41:04,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9484943360. Throughput: 0: 61259.5. Samples: 2390127300. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:04,254][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 04:41:05,250][54818] Updated weights for policy 0, policy_version 578918 (0.0019) [2024-04-28 04:41:07,865][54818] Updated weights for policy 0, policy_version 578928 (0.0017) [2024-04-28 04:41:09,253][54587] Fps is (10 sec: 60620.5, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9485238272. Throughput: 0: 61364.4. Samples: 2390487760. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:09,263][54587] Avg episode reward: [(0, '0.616')] [2024-04-28 04:41:10,410][54818] Updated weights for policy 0, policy_version 578938 (0.0019) [2024-04-28 04:41:13,207][54818] Updated weights for policy 0, policy_version 578948 (0.0016) [2024-04-28 04:41:14,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9485549568. Throughput: 0: 61288.8. Samples: 2390676140. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:14,253][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 04:41:15,837][54818] Updated weights for policy 0, policy_version 578958 (0.0020) [2024-04-28 04:41:18,598][54818] Updated weights for policy 0, policy_version 578968 (0.0017) [2024-04-28 04:41:19,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9485860864. Throughput: 0: 61151.2. Samples: 2391045300. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:19,253][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:41:20,939][54818] Updated weights for policy 0, policy_version 578978 (0.0016) [2024-04-28 04:41:22,772][54798] Signal inference workers to stop experience collection... (39950 times) [2024-04-28 04:41:22,788][54818] InferenceWorker_p0-w0: stopping experience collection (39950 times) [2024-04-28 04:41:22,829][54798] Signal inference workers to resume experience collection... 
(39950 times) [2024-04-28 04:41:22,829][54818] InferenceWorker_p0-w0: resuming experience collection (39950 times) [2024-04-28 04:41:23,831][54818] Updated weights for policy 0, policy_version 578988 (0.0016) [2024-04-28 04:41:24,253][54587] Fps is (10 sec: 62258.3, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9486172160. Throughput: 0: 61499.0. Samples: 2391415260. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:24,254][54587] Avg episode reward: [(0, '0.594')] [2024-04-28 04:41:26,167][54818] Updated weights for policy 0, policy_version 578998 (0.0016) [2024-04-28 04:41:27,633][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34300 times) [2024-04-28 04:41:29,061][54818] Updated weights for policy 0, policy_version 579008 (0.0016) [2024-04-28 04:41:29,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61713.2, 300 sec: 61315.0). Total num frames: 9486483456. Throughput: 0: 61383.0. Samples: 2391600820. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:29,254][54587] Avg episode reward: [(0, '0.622')] [2024-04-28 04:41:32,110][54818] Updated weights for policy 0, policy_version 579018 (0.0016) [2024-04-28 04:41:34,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9486778368. Throughput: 0: 61012.4. Samples: 2391960820. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:34,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 04:41:34,314][54818] Updated weights for policy 0, policy_version 579028 (0.0017) [2024-04-28 04:41:37,258][54818] Updated weights for policy 0, policy_version 579038 (0.0020) [2024-04-28 04:41:39,253][54587] Fps is (10 sec: 58983.2, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9487073280. Throughput: 0: 61416.1. Samples: 2392329400. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:39,253][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 04:41:39,955][54818] Updated weights for policy 0, policy_version 579048 (0.0015) [2024-04-28 04:41:42,513][54818] Updated weights for policy 0, policy_version 579058 (0.0016) [2024-04-28 04:41:44,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9487384576. Throughput: 0: 61327.5. Samples: 2392516340. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:44,254][54587] Avg episode reward: [(0, '0.613')] [2024-04-28 04:41:45,082][54818] Updated weights for policy 0, policy_version 579068 (0.0016) [2024-04-28 04:41:47,673][54818] Updated weights for policy 0, policy_version 579078 (0.0018) [2024-04-28 04:41:49,253][54587] Fps is (10 sec: 60620.2, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9487679488. Throughput: 0: 61154.1. Samples: 2392879240. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:49,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 04:41:49,265][54587] No heartbeat for components: RolloutWorker_w4 (26257 seconds), RolloutWorker_w5 (12357 seconds) [2024-04-28 04:41:49,267][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000579083_9487695872.pth... [2024-04-28 04:41:49,333][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000578185_9472983040.pth [2024-04-28 04:41:50,383][54818] Updated weights for policy 0, policy_version 579088 (0.0015) [2024-04-28 04:41:53,330][54818] Updated weights for policy 0, policy_version 579098 (0.0015) [2024-04-28 04:41:54,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61439.9, 300 sec: 61315.1). Total num frames: 9488007168. Throughput: 0: 61447.1. Samples: 2393252880. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:54,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 04:41:54,467][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34400 times) [2024-04-28 04:41:55,660][54818] Updated weights for policy 0, policy_version 579108 (0.0016) [2024-04-28 04:41:58,654][54818] Updated weights for policy 0, policy_version 579118 (0.0015) [2024-04-28 04:41:59,253][54587] Fps is (10 sec: 62259.2, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9488302080. Throughput: 0: 61263.4. Samples: 2393433000. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:41:59,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 04:42:00,597][54798] Signal inference workers to stop experience collection... (40000 times) [2024-04-28 04:42:00,600][54798] Signal inference workers to resume experience collection... (40000 times) [2024-04-28 04:42:00,611][54818] InferenceWorker_p0-w0: stopping experience collection (40000 times) [2024-04-28 04:42:00,611][54818] InferenceWorker_p0-w0: resuming experience collection (40000 times) [2024-04-28 04:42:01,289][54818] Updated weights for policy 0, policy_version 579128 (0.0018) [2024-04-28 04:42:04,021][54818] Updated weights for policy 0, policy_version 579138 (0.0016) [2024-04-28 04:42:04,253][54587] Fps is (10 sec: 58982.5, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9488596992. Throughput: 0: 61146.6. Samples: 2393796900. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:42:04,255][54587] Avg episode reward: [(0, '0.574')] [2024-04-28 04:42:06,615][54818] Updated weights for policy 0, policy_version 579148 (0.0018) [2024-04-28 04:42:09,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9488908288. Throughput: 0: 61283.7. Samples: 2394173020. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:42:09,253][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 04:42:09,264][54818] Updated weights for policy 0, policy_version 579158 (0.0018) [2024-04-28 04:42:12,018][54818] Updated weights for policy 0, policy_version 579168 (0.0018) [2024-04-28 04:42:14,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9489219584. Throughput: 0: 61124.2. Samples: 2394351400. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:42:14,253][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 04:42:14,664][54818] Updated weights for policy 0, policy_version 579178 (0.0016) [2024-04-28 04:42:17,257][54818] Updated weights for policy 0, policy_version 579188 (0.0016) [2024-04-28 04:42:19,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 9489514496. Throughput: 0: 61135.2. Samples: 2394711900. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 04:42:19,254][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 04:42:19,865][54818] Updated weights for policy 0, policy_version 579198 (0.0018) [2024-04-28 04:42:21,144][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34500 times) [2024-04-28 04:42:22,453][54818] Updated weights for policy 0, policy_version 579208 (0.0017) [2024-04-28 04:42:24,253][54587] Fps is (10 sec: 60620.2, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9489825792. Throughput: 0: 61460.3. Samples: 2395095120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:42:24,254][54587] Avg episode reward: [(0, '0.544')] [2024-04-28 04:42:25,123][54818] Updated weights for policy 0, policy_version 579218 (0.0017) [2024-04-28 04:42:28,063][54818] Updated weights for policy 0, policy_version 579228 (0.0017) [2024-04-28 04:42:29,253][54587] Fps is (10 sec: 62259.4, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9490137088. 
Throughput: 0: 61133.8. Samples: 2395267360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:42:29,254][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 04:42:30,755][54818] Updated weights for policy 0, policy_version 579238 (0.0016) [2024-04-28 04:42:33,771][54818] Updated weights for policy 0, policy_version 579248 (0.0016) [2024-04-28 04:42:34,253][54587] Fps is (10 sec: 60621.5, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9490432000. Throughput: 0: 61053.0. Samples: 2395626620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:42:34,253][54587] Avg episode reward: [(0, '0.538')] [2024-04-28 04:42:36,146][54818] Updated weights for policy 0, policy_version 579258 (0.0015) [2024-04-28 04:42:39,025][54818] Updated weights for policy 0, policy_version 579268 (0.0016) [2024-04-28 04:42:39,253][54587] Fps is (10 sec: 60621.2, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9490743296. Throughput: 0: 61236.1. Samples: 2396008500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:42:39,253][54587] Avg episode reward: [(0, '0.628')] [2024-04-28 04:42:41,322][54818] Updated weights for policy 0, policy_version 579278 (0.0017) [2024-04-28 04:42:43,784][54798] Signal inference workers to stop experience collection... (40050 times) [2024-04-28 04:42:43,797][54818] InferenceWorker_p0-w0: stopping experience collection (40050 times) [2024-04-28 04:42:43,879][54798] Signal inference workers to resume experience collection... (40050 times) [2024-04-28 04:42:43,880][54818] InferenceWorker_p0-w0: resuming experience collection (40050 times) [2024-04-28 04:42:44,254][54587] Fps is (10 sec: 60618.3, 60 sec: 60893.5, 300 sec: 61203.9). Total num frames: 9491038208. Throughput: 0: 61075.1. Samples: 2396181400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:42:44,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 04:42:44,281][54818] Updated weights for policy 0, policy_version 579288 (0.0016) [2024-04-28 04:42:46,565][54818] Updated weights for policy 0, policy_version 579298 (0.0017) [2024-04-28 04:42:48,299][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34600 times) [2024-04-28 04:42:49,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9491349504. Throughput: 0: 61110.2. Samples: 2396546860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:42:49,255][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 04:42:49,511][54818] Updated weights for policy 0, policy_version 579308 (0.0018) [2024-04-28 04:42:51,811][54818] Updated weights for policy 0, policy_version 579318 (0.0018) [2024-04-28 04:42:54,253][54587] Fps is (10 sec: 62260.9, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9491660800. Throughput: 0: 61155.4. Samples: 2396925020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:42:54,254][54587] Avg episode reward: [(0, '0.548')] [2024-04-28 04:42:54,664][54818] Updated weights for policy 0, policy_version 579328 (0.0017) [2024-04-28 04:42:57,438][54818] Updated weights for policy 0, policy_version 579338 (0.0019) [2024-04-28 04:42:59,253][54587] Fps is (10 sec: 62258.8, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9491972096. Throughput: 0: 61107.8. Samples: 2397101260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:42:59,254][54587] Avg episode reward: [(0, '0.565')] [2024-04-28 04:43:00,189][54818] Updated weights for policy 0, policy_version 579348 (0.0018) [2024-04-28 04:43:02,558][54818] Updated weights for policy 0, policy_version 579358 (0.0016) [2024-04-28 04:43:04,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9492283392. 
Throughput: 0: 61364.4. Samples: 2397473300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:04,255][54587] Avg episode reward: [(0, '0.602')] [2024-04-28 04:43:05,469][54818] Updated weights for policy 0, policy_version 579368 (0.0016) [2024-04-28 04:43:08,273][54818] Updated weights for policy 0, policy_version 579378 (0.0016) [2024-04-28 04:43:09,253][54587] Fps is (10 sec: 60620.9, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9492578304. Throughput: 0: 61195.1. Samples: 2397848900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:09,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 04:43:10,794][54818] Updated weights for policy 0, policy_version 579388 (0.0018) [2024-04-28 04:43:13,580][54818] Updated weights for policy 0, policy_version 579398 (0.0018) [2024-04-28 04:43:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.8, 300 sec: 61259.5). Total num frames: 9492889600. Throughput: 0: 61184.8. Samples: 2398020680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:14,254][54587] Avg episode reward: [(0, '0.457')] [2024-04-28 04:43:14,620][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34700 times) [2024-04-28 04:43:16,058][54818] Updated weights for policy 0, policy_version 579408 (0.0016) [2024-04-28 04:43:18,823][54818] Updated weights for policy 0, policy_version 579418 (0.0016) [2024-04-28 04:43:19,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9493184512. Throughput: 0: 61422.8. Samples: 2398390660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:19,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 04:43:21,427][54818] Updated weights for policy 0, policy_version 579428 (0.0017) [2024-04-28 04:43:24,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9493495808. Throughput: 0: 61275.1. Samples: 2398765880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:24,253][54587] Avg episode reward: [(0, '0.576')] [2024-04-28 04:43:24,289][54818] Updated weights for policy 0, policy_version 579438 (0.0017) [2024-04-28 04:43:26,594][54798] Signal inference workers to stop experience collection... (40100 times) [2024-04-28 04:43:26,621][54818] InferenceWorker_p0-w0: stopping experience collection (40100 times) [2024-04-28 04:43:26,652][54798] Signal inference workers to resume experience collection... (40100 times) [2024-04-28 04:43:26,653][54818] InferenceWorker_p0-w0: resuming experience collection (40100 times) [2024-04-28 04:43:26,655][54818] Updated weights for policy 0, policy_version 579448 (0.0016) [2024-04-28 04:43:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 60893.8, 300 sec: 61204.0). Total num frames: 9493790720. Throughput: 0: 61366.6. Samples: 2398942880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:29,255][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 04:43:29,739][54818] Updated weights for policy 0, policy_version 579458 (0.0016) [2024-04-28 04:43:31,900][54818] Updated weights for policy 0, policy_version 579468 (0.0016) [2024-04-28 04:43:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9494102016. Throughput: 0: 61397.9. Samples: 2399309760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:34,253][54587] Avg episode reward: [(0, '0.631')] [2024-04-28 04:43:34,989][54818] Updated weights for policy 0, policy_version 579478 (0.0017) [2024-04-28 04:43:37,078][54818] Updated weights for policy 0, policy_version 579488 (0.0017) [2024-04-28 04:43:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9494413312. Throughput: 0: 61341.9. Samples: 2399685400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:39,254][54587] Avg episode reward: [(0, '0.592')] [2024-04-28 04:43:40,296][54818] Updated weights for policy 0, policy_version 579498 (0.0018) [2024-04-28 04:43:41,305][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34800 times) [2024-04-28 04:43:42,440][54818] Updated weights for policy 0, policy_version 579508 (0.0016) [2024-04-28 04:43:44,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61440.2, 300 sec: 61315.0). Total num frames: 9494724608. Throughput: 0: 61326.2. Samples: 2399860940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:44,254][54587] Avg episode reward: [(0, '0.567')] [2024-04-28 04:43:45,542][54818] Updated weights for policy 0, policy_version 579518 (0.0015) [2024-04-28 04:43:48,489][54818] Updated weights for policy 0, policy_version 579528 (0.0019) [2024-04-28 04:43:49,254][54587] Fps is (10 sec: 60618.8, 60 sec: 61166.6, 300 sec: 61203.9). Total num frames: 9495019520. Throughput: 0: 61095.2. Samples: 2400222600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 04:43:49,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 04:43:49,262][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000579531_9495035904.pth... [2024-04-28 04:43:49,313][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000578635_9480355840.pth [2024-04-28 04:43:51,139][54818] Updated weights for policy 0, policy_version 579538 (0.0017) [2024-04-28 04:43:53,842][54818] Updated weights for policy 0, policy_version 579548 (0.0018) [2024-04-28 04:43:54,253][54587] Fps is (10 sec: 60621.4, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9495330816. Throughput: 0: 61044.5. Samples: 2400595900. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:43:54,253][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 04:43:56,258][54818] Updated weights for policy 0, policy_version 579558 (0.0016) [2024-04-28 04:43:59,088][54818] Updated weights for policy 0, policy_version 579568 (0.0017) [2024-04-28 04:43:59,253][54587] Fps is (10 sec: 62260.8, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9495642112. Throughput: 0: 61308.8. Samples: 2400779580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:43:59,255][54587] Avg episode reward: [(0, '0.608')] [2024-04-28 04:44:01,492][54818] Updated weights for policy 0, policy_version 579578 (0.0017) [2024-04-28 04:44:04,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9495953408. Throughput: 0: 61129.0. Samples: 2401141460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:04,254][54587] Avg episode reward: [(0, '0.634')] [2024-04-28 04:44:04,870][54818] Updated weights for policy 0, policy_version 579588 (0.0017) [2024-04-28 04:44:06,773][54818] Updated weights for policy 0, policy_version 579598 (0.0020) [2024-04-28 04:44:08,226][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (34900 times) [2024-04-28 04:44:09,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9496248320. Throughput: 0: 61157.5. Samples: 2401517980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:09,254][54587] Avg episode reward: [(0, '0.668')] [2024-04-28 04:44:10,128][54818] Updated weights for policy 0, policy_version 579608 (0.0018) [2024-04-28 04:44:10,818][54798] Signal inference workers to stop experience collection... (40150 times) [2024-04-28 04:44:10,818][54798] Signal inference workers to resume experience collection... 
(40150 times) [2024-04-28 04:44:10,826][54818] InferenceWorker_p0-w0: stopping experience collection (40150 times) [2024-04-28 04:44:10,826][54818] InferenceWorker_p0-w0: resuming experience collection (40150 times) [2024-04-28 04:44:12,029][54818] Updated weights for policy 0, policy_version 579618 (0.0019) [2024-04-28 04:44:14,253][54587] Fps is (10 sec: 60620.6, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9496559616. Throughput: 0: 61068.0. Samples: 2401690940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:14,255][54587] Avg episode reward: [(0, '0.659')] [2024-04-28 04:44:15,282][54818] Updated weights for policy 0, policy_version 579628 (0.0015) [2024-04-28 04:44:17,308][54818] Updated weights for policy 0, policy_version 579638 (0.0018) [2024-04-28 04:44:19,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61440.0, 300 sec: 61203.9). Total num frames: 9496870912. Throughput: 0: 61104.7. Samples: 2402059480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:19,255][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 04:44:20,597][54818] Updated weights for policy 0, policy_version 579648 (0.0017) [2024-04-28 04:44:22,729][54818] Updated weights for policy 0, policy_version 579658 (0.0018) [2024-04-28 04:44:24,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61439.9, 300 sec: 61315.0). Total num frames: 9497182208. Throughput: 0: 60967.1. Samples: 2402428920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:24,254][54587] Avg episode reward: [(0, '0.585')] [2024-04-28 04:44:25,931][54818] Updated weights for policy 0, policy_version 579668 (0.0016) [2024-04-28 04:44:28,256][54818] Updated weights for policy 0, policy_version 579678 (0.0019) [2024-04-28 04:44:29,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61440.0, 300 sec: 61204.0). Total num frames: 9497477120. Throughput: 0: 61111.6. Samples: 2402610960. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:29,254][54587] Avg episode reward: [(0, '0.607')] [2024-04-28 04:44:31,236][54818] Updated weights for policy 0, policy_version 579688 (0.0015) [2024-04-28 04:44:33,559][54818] Updated weights for policy 0, policy_version 579698 (0.0016) [2024-04-28 04:44:34,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9497788416. Throughput: 0: 60965.3. Samples: 2402966020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:34,254][54587] Avg episode reward: [(0, '0.611')] [2024-04-28 04:44:35,333][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (35000 times) [2024-04-28 04:44:36,535][54818] Updated weights for policy 0, policy_version 579708 (0.0017) [2024-04-28 04:44:39,253][54587] Fps is (10 sec: 60621.3, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9498083328. Throughput: 0: 61008.5. Samples: 2403341280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:39,254][54587] Avg episode reward: [(0, '0.541')] [2024-04-28 04:44:39,261][54818] Updated weights for policy 0, policy_version 579718 (0.0016) [2024-04-28 04:44:41,791][54818] Updated weights for policy 0, policy_version 579728 (0.0016) [2024-04-28 04:44:44,253][54587] Fps is (10 sec: 62259.1, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9498411008. Throughput: 0: 61237.4. Samples: 2403535260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:44,255][54587] Avg episode reward: [(0, '0.581')] [2024-04-28 04:44:44,845][54818] Updated weights for policy 0, policy_version 579738 (0.0016) [2024-04-28 04:44:46,993][54818] Updated weights for policy 0, policy_version 579748 (0.0017) [2024-04-28 04:44:49,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61440.2, 300 sec: 61203.9). Total num frames: 9498705920. Throughput: 0: 61233.6. Samples: 2403896980. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:49,255][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 04:44:49,264][54587] No heartbeat for components: RolloutWorker_w4 (26437 seconds), RolloutWorker_w5 (12537 seconds) [2024-04-28 04:44:50,526][54818] Updated weights for policy 0, policy_version 579758 (0.0020) [2024-04-28 04:44:51,972][54798] Signal inference workers to stop experience collection... (40200 times) [2024-04-28 04:44:51,973][54798] Signal inference workers to resume experience collection... (40200 times) [2024-04-28 04:44:51,982][54818] InferenceWorker_p0-w0: stopping experience collection (40200 times) [2024-04-28 04:44:51,982][54818] InferenceWorker_p0-w0: resuming experience collection (40200 times) [2024-04-28 04:44:52,221][54818] Updated weights for policy 0, policy_version 579768 (0.0018) [2024-04-28 04:44:54,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61439.9, 300 sec: 61204.0). Total num frames: 9499017216. Throughput: 0: 60926.3. Samples: 2404259660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:54,254][54587] Avg episode reward: [(0, '0.596')] [2024-04-28 04:44:55,581][54818] Updated weights for policy 0, policy_version 579778 (0.0017) [2024-04-28 04:44:57,692][54818] Updated weights for policy 0, policy_version 579788 (0.0015) [2024-04-28 04:44:59,253][54587] Fps is (10 sec: 60622.0, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9499312128. Throughput: 0: 61331.7. Samples: 2404450860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:44:59,254][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 04:45:00,844][54818] Updated weights for policy 0, policy_version 579798 (0.0021) [2024-04-28 04:45:01,772][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). 
receivers=['RolloutWorker_w4'] (35100 times) [2024-04-28 04:45:02,810][54818] Updated weights for policy 0, policy_version 579808 (0.0019) [2024-04-28 04:45:04,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61166.9, 300 sec: 61259.5). Total num frames: 9499623424. Throughput: 0: 61133.4. Samples: 2404810480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:45:04,254][54587] Avg episode reward: [(0, '0.606')] [2024-04-28 04:45:06,117][54818] Updated weights for policy 0, policy_version 579818 (0.0017) [2024-04-28 04:45:08,315][54818] Updated weights for policy 0, policy_version 579828 (0.0017) [2024-04-28 04:45:09,253][54587] Fps is (10 sec: 62258.4, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9499934720. Throughput: 0: 61106.6. Samples: 2405178720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:45:09,254][54587] Avg episode reward: [(0, '0.612')] [2024-04-28 04:45:11,463][54818] Updated weights for policy 0, policy_version 579838 (0.0015) [2024-04-28 04:45:13,983][54818] Updated weights for policy 0, policy_version 579848 (0.0018) [2024-04-28 04:45:14,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.0, 300 sec: 61203.9). Total num frames: 9500229632. Throughput: 0: 61137.4. Samples: 2405362140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:45:14,254][54587] Avg episode reward: [(0, '0.568')] [2024-04-28 04:45:16,518][54818] Updated weights for policy 0, policy_version 579858 (0.0017) [2024-04-28 04:45:19,194][54818] Updated weights for policy 0, policy_version 579868 (0.0017) [2024-04-28 04:45:19,253][54587] Fps is (10 sec: 62258.6, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9500557312. Throughput: 0: 61447.4. Samples: 2405731160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-04-28 04:45:19,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 04:45:22,093][54818] Updated weights for policy 0, policy_version 579878 (0.0018) [2024-04-28 04:45:24,253][54587] Fps is (10 sec: 63897.5, 60 sec: 61440.0, 300 sec: 61315.1). Total num frames: 9500868608. Throughput: 0: 61278.1. Samples: 2406098800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:45:24,254][54587] Avg episode reward: [(0, '0.599')] [2024-04-28 04:45:24,530][54818] Updated weights for policy 0, policy_version 579888 (0.0023) [2024-04-28 04:45:27,312][54818] Updated weights for policy 0, policy_version 579898 (0.0021) [2024-04-28 04:45:28,454][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (35200 times) [2024-04-28 04:45:29,253][54587] Fps is (10 sec: 60621.0, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9501163520. Throughput: 0: 61023.0. Samples: 2406281300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:45:29,254][54587] Avg episode reward: [(0, '0.549')] [2024-04-28 04:45:30,370][54818] Updated weights for policy 0, policy_version 579908 (0.0016) [2024-04-28 04:45:32,769][54818] Updated weights for policy 0, policy_version 579918 (0.0016) [2024-04-28 04:45:34,253][54587] Fps is (10 sec: 58981.9, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9501458432. Throughput: 0: 61100.5. Samples: 2406646500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:45:34,254][54587] Avg episode reward: [(0, '0.531')] [2024-04-28 04:45:35,714][54818] Updated weights for policy 0, policy_version 579928 (0.0016) [2024-04-28 04:45:36,467][54798] Signal inference workers to stop experience collection... (40250 times) [2024-04-28 04:45:36,467][54798] Signal inference workers to resume experience collection... 
(40250 times) [2024-04-28 04:45:36,477][54818] InferenceWorker_p0-w0: stopping experience collection (40250 times) [2024-04-28 04:45:36,477][54818] InferenceWorker_p0-w0: resuming experience collection (40250 times) [2024-04-28 04:45:37,834][54818] Updated weights for policy 0, policy_version 579938 (0.0018) [2024-04-28 04:45:39,253][54587] Fps is (10 sec: 60621.1, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9501769728. Throughput: 0: 61158.2. Samples: 2407011780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:45:39,255][54587] Avg episode reward: [(0, '0.591')] [2024-04-28 04:45:40,945][54818] Updated weights for policy 0, policy_version 579948 (0.0015) [2024-04-28 04:45:43,107][54818] Updated weights for policy 0, policy_version 579958 (0.0017) [2024-04-28 04:45:44,253][54587] Fps is (10 sec: 62260.1, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9502081024. Throughput: 0: 61210.2. Samples: 2407205320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:45:44,254][54587] Avg episode reward: [(0, '0.623')] [2024-04-28 04:45:46,357][54818] Updated weights for policy 0, policy_version 579968 (0.0017) [2024-04-28 04:45:48,563][54818] Updated weights for policy 0, policy_version 579978 (0.0017) [2024-04-28 04:45:49,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9502375936. Throughput: 0: 61237.9. Samples: 2407566180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:45:49,253][54587] Avg episode reward: [(0, '0.633')] [2024-04-28 04:45:49,275][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000579980_9502392320.pth... 
[2024-04-28 04:45:49,329][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000579083_9487695872.pth [2024-04-28 04:45:51,854][54818] Updated weights for policy 0, policy_version 579988 (0.0019) [2024-04-28 04:45:54,084][54818] Updated weights for policy 0, policy_version 579998 (0.0016) [2024-04-28 04:45:54,253][54587] Fps is (10 sec: 60620.0, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9502687232. Throughput: 0: 61086.6. Samples: 2407927620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:45:54,254][54587] Avg episode reward: [(0, '0.620')] [2024-04-28 04:45:55,653][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (35300 times) [2024-04-28 04:45:57,042][54818] Updated weights for policy 0, policy_version 580008 (0.0018) [2024-04-28 04:45:59,253][54587] Fps is (10 sec: 62258.2, 60 sec: 61439.8, 300 sec: 61203.9). Total num frames: 9502998528. Throughput: 0: 61364.3. Samples: 2408123540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:45:59,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 04:45:59,393][54818] Updated weights for policy 0, policy_version 580018 (0.0021) [2024-04-28 04:46:02,342][54818] Updated weights for policy 0, policy_version 580028 (0.0016) [2024-04-28 04:46:04,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9503309824. Throughput: 0: 61246.8. Samples: 2408487260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:04,254][54587] Avg episode reward: [(0, '0.579')] [2024-04-28 04:46:04,749][54818] Updated weights for policy 0, policy_version 580038 (0.0018) [2024-04-28 04:46:07,624][54818] Updated weights for policy 0, policy_version 580048 (0.0015) [2024-04-28 04:46:09,253][54587] Fps is (10 sec: 62259.7, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9503621120. Throughput: 0: 61155.1. Samples: 2408850780. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:09,254][54587] Avg episode reward: [(0, '0.554')] [2024-04-28 04:46:09,981][54818] Updated weights for policy 0, policy_version 580058 (0.0017) [2024-04-28 04:46:12,878][54818] Updated weights for policy 0, policy_version 580068 (0.0018) [2024-04-28 04:46:14,253][54587] Fps is (10 sec: 60620.7, 60 sec: 61439.9, 300 sec: 61203.9). Total num frames: 9503916032. Throughput: 0: 61396.5. Samples: 2409044140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:14,254][54587] Avg episode reward: [(0, '0.570')] [2024-04-28 04:46:15,433][54818] Updated weights for policy 0, policy_version 580078 (0.0018) [2024-04-28 04:46:18,187][54818] Updated weights for policy 0, policy_version 580088 (0.0016) [2024-04-28 04:46:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61167.0, 300 sec: 61204.0). Total num frames: 9504227328. Throughput: 0: 61375.2. Samples: 2409408380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:19,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 04:46:20,763][54818] Updated weights for policy 0, policy_version 580098 (0.0016) [2024-04-28 04:46:22,409][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (35400 times) [2024-04-28 04:46:23,347][54798] Signal inference workers to stop experience collection... (40300 times) [2024-04-28 04:46:23,378][54818] InferenceWorker_p0-w0: stopping experience collection (40300 times) [2024-04-28 04:46:23,406][54798] Signal inference workers to resume experience collection... (40300 times) [2024-04-28 04:46:23,407][54818] InferenceWorker_p0-w0: resuming experience collection (40300 times) [2024-04-28 04:46:23,532][54818] Updated weights for policy 0, policy_version 580108 (0.0019) [2024-04-28 04:46:24,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9504538624. Throughput: 0: 61163.9. Samples: 2409764160. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:24,254][54587] Avg episode reward: [(0, '0.530')] [2024-04-28 04:46:26,901][54818] Updated weights for policy 0, policy_version 580118 (0.0016) [2024-04-28 04:46:28,762][54818] Updated weights for policy 0, policy_version 580128 (0.0016) [2024-04-28 04:46:29,253][54587] Fps is (10 sec: 60621.5, 60 sec: 61167.1, 300 sec: 61204.0). Total num frames: 9504833536. Throughput: 0: 61373.4. Samples: 2409967120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:29,253][54587] Avg episode reward: [(0, '0.658')] [2024-04-28 04:46:32,209][54818] Updated weights for policy 0, policy_version 580138 (0.0016) [2024-04-28 04:46:34,046][54818] Updated weights for policy 0, policy_version 580148 (0.0017) [2024-04-28 04:46:34,253][54587] Fps is (10 sec: 60622.1, 60 sec: 61440.2, 300 sec: 61259.5). Total num frames: 9505144832. Throughput: 0: 61335.6. Samples: 2410326280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:34,253][54587] Avg episode reward: [(0, '0.588')] [2024-04-28 04:46:37,362][54818] Updated weights for policy 0, policy_version 580158 (0.0020) [2024-04-28 04:46:39,253][54587] Fps is (10 sec: 62258.9, 60 sec: 61440.1, 300 sec: 61259.5). Total num frames: 9505456128. Throughput: 0: 61426.8. Samples: 2410691820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:39,254][54587] Avg episode reward: [(0, '0.632')] [2024-04-28 04:46:39,257][54818] Updated weights for policy 0, policy_version 580168 (0.0020) [2024-04-28 04:46:42,616][54818] Updated weights for policy 0, policy_version 580178 (0.0016) [2024-04-28 04:46:44,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9505767424. Throughput: 0: 61433.5. Samples: 2410888040. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:44,254][54587] Avg episode reward: [(0, '0.621')] [2024-04-28 04:46:44,745][54818] Updated weights for policy 0, policy_version 580188 (0.0016) [2024-04-28 04:46:47,800][54818] Updated weights for policy 0, policy_version 580198 (0.0016) [2024-04-28 04:46:48,645][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (35500 times) [2024-04-28 04:46:49,253][54587] Fps is (10 sec: 62259.0, 60 sec: 61713.0, 300 sec: 61259.5). Total num frames: 9506078720. Throughput: 0: 61397.4. Samples: 2411250140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:49,254][54587] Avg episode reward: [(0, '0.572')] [2024-04-28 04:46:50,067][54818] Updated weights for policy 0, policy_version 580208 (0.0018) [2024-04-28 04:46:53,090][54818] Updated weights for policy 0, policy_version 580218 (0.0017) [2024-04-28 04:46:54,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61440.0, 300 sec: 61259.5). Total num frames: 9506373632. Throughput: 0: 61362.2. Samples: 2411612080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-04-28 04:46:54,255][54587] Avg episode reward: [(0, '0.564')] [2024-04-28 04:46:54,255][54587] Runner:signal_='update_training_info' queue is Full (). receivers=['RolloutWorker_w4'] (1900 times) [2024-04-28 04:46:55,437][54818] Updated weights for policy 0, policy_version 580228 (0.0017) [2024-04-28 04:46:58,442][54818] Updated weights for policy 0, policy_version 580238 (0.0017) [2024-04-28 04:46:59,253][54587] Fps is (10 sec: 60620.3, 60 sec: 61440.0, 300 sec: 61315.0). Total num frames: 9506684928. Throughput: 0: 61335.6. Samples: 2411804240. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:46:59,254][54587] Avg episode reward: [(0, '0.494')] [2024-04-28 04:47:00,882][54818] Updated weights for policy 0, policy_version 580248 (0.0018) [2024-04-28 04:47:03,628][54818] Updated weights for policy 0, policy_version 580258 (0.0015) [2024-04-28 04:47:04,253][54587] Fps is (10 sec: 60621.6, 60 sec: 61167.1, 300 sec: 61259.5). Total num frames: 9506979840. Throughput: 0: 61334.8. Samples: 2412168440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:04,253][54587] Avg episode reward: [(0, '0.601')] [2024-04-28 04:47:06,678][54818] Updated weights for policy 0, policy_version 580268 (0.0015) [2024-04-28 04:47:08,375][54798] Signal inference workers to stop experience collection... (40350 times) [2024-04-28 04:47:08,379][54798] Signal inference workers to resume experience collection... (40350 times) [2024-04-28 04:47:08,389][54818] InferenceWorker_p0-w0: stopping experience collection (40350 times) [2024-04-28 04:47:08,389][54818] InferenceWorker_p0-w0: resuming experience collection (40350 times) [2024-04-28 04:47:08,905][54818] Updated weights for policy 0, policy_version 580278 (0.0019) [2024-04-28 04:47:09,253][54587] Fps is (10 sec: 58983.3, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9507274752. Throughput: 0: 61342.1. Samples: 2412524540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:09,253][54587] Avg episode reward: [(0, '0.578')] [2024-04-28 04:47:12,174][54818] Updated weights for policy 0, policy_version 580288 (0.0015) [2024-04-28 04:47:14,253][54587] Fps is (10 sec: 60620.4, 60 sec: 61167.0, 300 sec: 61259.5). Total num frames: 9507586048. Throughput: 0: 60997.2. Samples: 2412712000. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:14,254][54587] Avg episode reward: [(0, '0.665')] [2024-04-28 04:47:14,403][54818] Updated weights for policy 0, policy_version 580298 (0.0017) [2024-04-28 04:47:15,362][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (35600 times) [2024-04-28 04:47:17,618][54818] Updated weights for policy 0, policy_version 580308 (0.0017) [2024-04-28 04:47:19,253][54587] Fps is (10 sec: 60620.8, 60 sec: 60894.0, 300 sec: 61204.0). Total num frames: 9507880960. Throughput: 0: 61316.9. Samples: 2413085540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:19,253][54587] Avg episode reward: [(0, '0.589')] [2024-04-28 04:47:19,596][54818] Updated weights for policy 0, policy_version 580318 (0.0023) [2024-04-28 04:47:23,069][54818] Updated weights for policy 0, policy_version 580328 (0.0017) [2024-04-28 04:47:24,253][54587] Fps is (10 sec: 60620.4, 60 sec: 60893.9, 300 sec: 61203.9). Total num frames: 9508192256. Throughput: 0: 61096.8. Samples: 2413441180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:24,255][54587] Avg episode reward: [(0, '0.644')] [2024-04-28 04:47:24,968][54818] Updated weights for policy 0, policy_version 580338 (0.0016) [2024-04-28 04:47:28,323][54818] Updated weights for policy 0, policy_version 580348 (0.0018) [2024-04-28 04:47:29,253][54587] Fps is (10 sec: 60620.0, 60 sec: 60893.7, 300 sec: 61203.9). Total num frames: 9508487168. Throughput: 0: 60781.2. Samples: 2413623200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:29,255][54587] Avg episode reward: [(0, '0.604')] [2024-04-28 04:47:30,230][54818] Updated weights for policy 0, policy_version 580358 (0.0021) [2024-04-28 04:47:33,526][54818] Updated weights for policy 0, policy_version 580368 (0.0018) [2024-04-28 04:47:34,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9508798464. 
Throughput: 0: 61211.7. Samples: 2414004660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:34,253][54587] Avg episode reward: [(0, '0.539')] [2024-04-28 04:47:35,560][54818] Updated weights for policy 0, policy_version 580378 (0.0016) [2024-04-28 04:47:38,764][54818] Updated weights for policy 0, policy_version 580388 (0.0019) [2024-04-28 04:47:39,253][54587] Fps is (10 sec: 62259.6, 60 sec: 60893.9, 300 sec: 61259.6). Total num frames: 9509109760. Throughput: 0: 61085.9. Samples: 2414360940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:39,253][54587] Avg episode reward: [(0, '0.609')] [2024-04-28 04:47:41,269][54818] Updated weights for policy 0, policy_version 580398 (0.0016) [2024-04-28 04:47:42,999][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (35700 times) [2024-04-28 04:47:44,043][54818] Updated weights for policy 0, policy_version 580408 (0.0015) [2024-04-28 04:47:44,253][54587] Fps is (10 sec: 60620.3, 60 sec: 60620.8, 300 sec: 61204.0). Total num frames: 9509404672. Throughput: 0: 60838.8. Samples: 2414541980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:44,254][54587] Avg episode reward: [(0, '0.675')] [2024-04-28 04:47:46,438][54818] Updated weights for policy 0, policy_version 580418 (0.0015) [2024-04-28 04:47:49,253][54587] Fps is (10 sec: 58982.5, 60 sec: 60347.8, 300 sec: 61148.4). Total num frames: 9509699584. Throughput: 0: 60975.5. Samples: 2414912340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:49,253][54587] Avg episode reward: [(0, '0.656')] [2024-04-28 04:47:49,262][54587] No heartbeat for components: RolloutWorker_w4 (26617 seconds), RolloutWorker_w5 (12717 seconds) [2024-04-28 04:47:49,367][54798] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000580428_9509732352.pth... 
[2024-04-28 04:47:49,368][54818] Updated weights for policy 0, policy_version 580428 (0.0018) [2024-04-28 04:47:49,421][54798] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000579531_9495035904.pth [2024-04-28 04:47:49,616][54798] Signal inference workers to stop experience collection... (40400 times) [2024-04-28 04:47:49,656][54818] InferenceWorker_p0-w0: stopping experience collection (40400 times) [2024-04-28 04:47:49,668][54798] Signal inference workers to resume experience collection... (40400 times) [2024-04-28 04:47:49,671][54818] InferenceWorker_p0-w0: resuming experience collection (40400 times) [2024-04-28 04:47:51,788][54818] Updated weights for policy 0, policy_version 580438 (0.0017) [2024-04-28 04:47:54,253][54587] Fps is (10 sec: 62258.8, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9510027264. Throughput: 0: 61283.8. Samples: 2415282320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:54,254][54587] Avg episode reward: [(0, '0.661')] [2024-04-28 04:47:54,731][54818] Updated weights for policy 0, policy_version 580448 (0.0018) [2024-04-28 04:47:57,724][54818] Updated weights for policy 0, policy_version 580458 (0.0015) [2024-04-28 04:47:59,253][54587] Fps is (10 sec: 63896.5, 60 sec: 60893.8, 300 sec: 61203.9). Total num frames: 9510338560. Throughput: 0: 61048.7. Samples: 2415459200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:47:59,254][54587] Avg episode reward: [(0, '0.545')] [2024-04-28 04:48:00,059][54818] Updated weights for policy 0, policy_version 580468 (0.0018) [2024-04-28 04:48:03,256][54818] Updated weights for policy 0, policy_version 580478 (0.0016) [2024-04-28 04:48:04,253][54587] Fps is (10 sec: 60620.7, 60 sec: 60893.7, 300 sec: 61204.0). Total num frames: 9510633472. Throughput: 0: 60995.8. Samples: 2415830360. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:48:04,255][54587] Avg episode reward: [(0, '0.482')] [2024-04-28 04:48:05,156][54818] Updated weights for policy 0, policy_version 580488 (0.0015) [2024-04-28 04:48:08,479][54818] Updated weights for policy 0, policy_version 580498 (0.0016) [2024-04-28 04:48:09,253][54587] Fps is (10 sec: 58983.6, 60 sec: 60893.9, 300 sec: 61148.4). Total num frames: 9510928384. Throughput: 0: 61295.3. Samples: 2416199460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:48:09,253][54587] Avg episode reward: [(0, '0.535')] [2024-04-28 04:48:09,473][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (35800 times) [2024-04-28 04:48:10,461][54818] Updated weights for policy 0, policy_version 580508 (0.0015) [2024-04-28 04:48:13,654][54818] Updated weights for policy 0, policy_version 580518 (0.0017) [2024-04-28 04:48:14,253][54587] Fps is (10 sec: 60621.7, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9511239680. Throughput: 0: 61083.7. Samples: 2416371960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:48:14,253][54587] Avg episode reward: [(0, '0.664')] [2024-04-28 04:48:15,715][54818] Updated weights for policy 0, policy_version 580528 (0.0018) [2024-04-28 04:48:18,952][54818] Updated weights for policy 0, policy_version 580538 (0.0015) [2024-04-28 04:48:19,253][54587] Fps is (10 sec: 63896.7, 60 sec: 61439.9, 300 sec: 61259.5). Total num frames: 9511567360. Throughput: 0: 61050.0. Samples: 2416751920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:48:19,254][54587] Avg episode reward: [(0, '0.595')] [2024-04-28 04:48:21,111][54818] Updated weights for policy 0, policy_version 580548 (0.0016) [2024-04-28 04:48:24,253][54587] Fps is (10 sec: 60619.9, 60 sec: 60893.9, 300 sec: 61204.0). Total num frames: 9511845888. Throughput: 0: 61404.8. Samples: 2417124160. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:48:24,255][54587] Avg episode reward: [(0, '0.525')] [2024-04-28 04:48:24,317][54818] Updated weights for policy 0, policy_version 580558 (0.0020) [2024-04-28 04:48:26,589][54818] Updated weights for policy 0, policy_version 580568 (0.0017) [2024-04-28 04:48:29,253][54587] Fps is (10 sec: 58982.1, 60 sec: 61166.9, 300 sec: 61203.9). Total num frames: 9512157184. Throughput: 0: 61136.7. Samples: 2417293140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-04-28 04:48:29,254][54587] Avg episode reward: [(0, '0.528')] [2024-04-28 04:48:29,627][54818] Updated weights for policy 0, policy_version 580578 (0.0017) [2024-04-28 04:48:30,163][54798] Signal inference workers to stop experience collection... (40450 times) [2024-04-28 04:48:30,165][54798] Signal inference workers to resume experience collection... (40450 times) [2024-04-28 04:48:30,172][54818] InferenceWorker_p0-w0: stopping experience collection (40450 times) [2024-04-28 04:48:30,182][54818] InferenceWorker_p0-w0: resuming experience collection (40450 times) [2024-04-28 04:48:31,830][54818] Updated weights for policy 0, policy_version 580588 (0.0017) [2024-04-28 04:48:34,253][54587] Fps is (10 sec: 62259.3, 60 sec: 61166.8, 300 sec: 61203.9). Total num frames: 9512468480. Throughput: 0: 61260.3. Samples: 2417669060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:48:34,254][54587] Avg episode reward: [(0, '0.543')] [2024-04-28 04:48:34,750][54818] Updated weights for policy 0, policy_version 580598 (0.0018) [2024-04-28 04:48:35,834][54798] Batcher_0:signal_='trajectory_buffers_available' queue is Full (). receivers=['RolloutWorker_w4'] (35900 times) [2024-04-28 04:48:37,512][54818] Updated weights for policy 0, policy_version 580608 (0.0017) [2024-04-28 04:48:39,253][54587] Fps is (10 sec: 62259.4, 60 sec: 61166.8, 300 sec: 61204.0). Total num frames: 9512779776. Throughput: 0: 61329.3. Samples: 2418042140. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:48:39,254][54587] Avg episode reward: [(0, '0.559')] [2024-04-28 04:48:40,084][54818] Updated weights for policy 0, policy_version 580618 (0.0017) [2024-04-28 04:48:42,741][54818] Updated weights for policy 0, policy_version 580628 (0.0017) [2024-04-28 04:48:44,253][54587] Fps is (10 sec: 60620.8, 60 sec: 61166.9, 300 sec: 61204.0). Total num frames: 9513074688. Throughput: 0: 61107.2. Samples: 2418209020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:48:44,254][54587] Avg episode reward: [(0, '0.605')] [2024-04-28 04:48:45,326][54818] Updated weights for policy 0, policy_version 580638 (0.0018) [2024-04-28 04:48:48,139][54818] Updated weights for policy 0, policy_version 580648 (0.0016) [2024-04-28 04:48:49,253][54587] Fps is (10 sec: 60620.1, 60 sec: 61439.8, 300 sec: 61203.9). Total num frames: 9513385984. Throughput: 0: 61137.2. Samples: 2418581540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:48:49,254][54587] Avg episode reward: [(0, '0.624')] [2024-04-28 04:48:50,749][54818] Updated weights for policy 0, policy_version 580658 (0.0016) [2024-04-28 04:48:54,253][54587] Fps is (10 sec: 57344.3, 60 sec: 60347.8, 300 sec: 61037.4). Total num frames: 9513648128. Throughput: 0: 61385.7. Samples: 2418961820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:48:54,254][54587] Avg episode reward: [(0, '0.557')] [2024-04-28 04:48:59,253][54587] Fps is (10 sec: 26215.0, 60 sec: 55159.6, 300 sec: 59982.1). Total num frames: 9513648128. Throughput: 0: 62137.3. Samples: 2419168140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:48:59,254][54587] Avg episode reward: [(0, '0.571')] [2024-04-28 04:49:04,254][54587] Fps is (10 sec: 0.0, 60 sec: 50243.9, 300 sec: 58982.3). Total num frames: 9513648128. Throughput: 0: 59196.7. Samples: 2419415800. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:04,255][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:09,254][54587] Fps is (10 sec: 0.0, 60 sec: 45328.8, 300 sec: 57927.1). Total num frames: 9513648128. Throughput: 0: 50989.5. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:09,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:14,254][54587] Fps is (10 sec: 0.0, 60 sec: 40140.6, 300 sec: 56871.9). Total num frames: 9513648128. Throughput: 0: 47234.5. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:19,254][54587] Fps is (10 sec: 0.0, 60 sec: 34679.3, 300 sec: 55816.6). Total num frames: 9513648128. Throughput: 0: 38880.7. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:24,253][54587] Fps is (10 sec: 0.0, 60 sec: 30037.3, 300 sec: 54817.0). Total num frames: 9513648128. Throughput: 0: 30590.2. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:29,254][54587] Fps is (10 sec: 0.0, 60 sec: 24849.0, 300 sec: 53761.7). Total num frames: 9513648128. Throughput: 0: 26881.6. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:29,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:34,254][54587] Fps is (10 sec: 0.0, 60 sec: 19660.7, 300 sec: 52762.0). Total num frames: 9513648128. Throughput: 0: 18603.5. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:34,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:39,254][54587] Fps is (10 sec: 0.0, 60 sec: 14472.5, 300 sec: 51651.2). Total num frames: 9513648128. Throughput: 0: 10152.8. Samples: 2419418700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:44,254][54587] Fps is (10 sec: 0.0, 60 sec: 9557.3, 300 sec: 50651.5). Total num frames: 9513648128. Throughput: 0: 5568.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:49,254][54587] Fps is (10 sec: 0.0, 60 sec: 4369.1, 300 sec: 49596.3). Total num frames: 9513648128. Throughput: 0: 64.4. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:54,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 48596.6). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:54,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:49:59,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 47541.3). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:49:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:04,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 46486.1). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:09,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 45486.4). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:09,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:14,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 44375.6). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:19,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 43320.4). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:24,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 42320.7). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:29,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 41321.0). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:29,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:34,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 40265.7). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:34,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:39,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 39210.5). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:44,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 38210.8). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:49,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 37155.6). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:49,284][54587] No heartbeat for components: RolloutWorker_w4 (26797 seconds), RolloutWorker_w5 (12897 seconds) [2024-04-28 04:50:54,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 36100.3). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:54,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:50:59,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 35045.1). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:50:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:04,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 33989.8). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:09,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 32990.1). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:09,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:14,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 31934.9). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:19,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 30879.7). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:24,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 29879.9). Total num frames: 9513648128. 
Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:29,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 28824.7). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:29,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:34,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 27769.5). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:34,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:39,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 26714.2). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:44,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 25659.0). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:49,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 24659.3). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:54,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 23604.1). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:54,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:51:59,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 22604.3). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:51:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:04,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 21604.6). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:09,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 20549.4). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:09,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:14,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 19549.7). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:19,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 18494.5). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:24,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 17494.8). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:29,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 16439.5). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:29,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:34,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 15384.3). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:34,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:39,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 14384.6). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:44,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 13384.9). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:49,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 12274.1). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:54,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 11218.9). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:54,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:52:59,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 10219.2). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:52:59,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:04,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 9219.5). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:04,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:09,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 8164.2). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:09,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:14,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 7053.4). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:14,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:19,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 6109.3). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:19,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:24,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 5054.0). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:24,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:29,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 3998.8). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:29,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:34,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 2943.6). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:34,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:39,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 1943.9). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:39,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:44,253][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 888.6). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:44,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:49,254][54587] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 9513648128. Throughput: 0: 0.0. Samples: 2419418700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 04:53:49,254][54587] Avg episode reward: [(0, '0.586')] [2024-04-28 04:53:49,277][54587] No heartbeat for components: Batcher_0 (297 seconds), LearnerWorker_p0 (297 seconds), RolloutWorker_w4 (26977 seconds), RolloutWorker_w5 (13077 seconds) [2024-04-28 04:53:49,278][54587] Stopping training due to lack of heartbeats from , [2024-04-28 04:53:49,279][54587] Runner:signal_='stop' queue is Full (). receivers=['RolloutWorker_w4'] [2024-04-28 04:53:49,280][54819] Stopping RolloutWorker_w6... [2024-04-28 04:53:49,281][54819] Loop rollout_proc6_evt_loop terminating... [2024-04-28 04:53:49,280][54820] Stopping RolloutWorker_w1... [2024-04-28 04:53:49,281][54828] Stopping RolloutWorker_w0... [2024-04-28 04:53:49,280][54821] Stopping RolloutWorker_w2... [2024-04-28 04:53:49,281][54825] Stopping RolloutWorker_w9... [2024-04-28 04:53:49,280][54822] Stopping RolloutWorker_w3... [2024-04-28 04:53:49,281][54830] Stopping RolloutWorker_w11... [2024-04-28 04:53:49,281][54829] Stopping RolloutWorker_w10... [2024-04-28 04:53:49,281][54826] Stopping RolloutWorker_w7... [2024-04-28 04:53:49,280][54824] Stopping RolloutWorker_w8... [2024-04-28 04:53:49,281][54839] Stopping RolloutWorker_w19... [2024-04-28 04:53:49,280][54831] Stopping RolloutWorker_w12... [2024-04-28 04:53:49,280][54833] Stopping RolloutWorker_w13... [2024-04-28 04:53:49,281][54843] Stopping RolloutWorker_w24... [2024-04-28 04:53:49,281][54850] Stopping RolloutWorker_w31... [2024-04-28 04:53:49,281][54845] Stopping RolloutWorker_w25... [2024-04-28 04:53:49,281][54834] Stopping RolloutWorker_w16... [2024-04-28 04:53:49,281][54847] Stopping RolloutWorker_w27... 
[2024-04-28 04:53:49,281][54828] Loop rollout_proc0_evt_loop terminating... [2024-04-28 04:53:49,281][54820] Loop rollout_proc1_evt_loop terminating... [2024-04-28 04:53:49,281][54842] Stopping RolloutWorker_w23... [2024-04-28 04:53:49,281][54844] Stopping RolloutWorker_w26... [2024-04-28 04:53:49,281][54822] Loop rollout_proc3_evt_loop terminating... [2024-04-28 04:53:49,281][54821] Loop rollout_proc2_evt_loop terminating... [2024-04-28 04:53:49,281][54830] Loop rollout_proc11_evt_loop terminating... [2024-04-28 04:53:49,281][54832] Stopping RolloutWorker_w14... [2024-04-28 04:53:49,281][54829] Loop rollout_proc10_evt_loop terminating... [2024-04-28 04:53:49,281][54846] Stopping RolloutWorker_w28... [2024-04-28 04:53:49,281][54836] Stopping RolloutWorker_w15... [2024-04-28 04:53:49,281][54825] Loop rollout_proc9_evt_loop terminating... [2024-04-28 04:53:49,281][54824] Loop rollout_proc8_evt_loop terminating... [2024-04-28 04:53:49,281][54826] Loop rollout_proc7_evt_loop terminating... [2024-04-28 04:53:49,281][54837] Stopping RolloutWorker_w20... [2024-04-28 04:53:49,281][54838] Stopping RolloutWorker_w18... [2024-04-28 04:53:49,281][54841] Stopping RolloutWorker_w22... [2024-04-28 04:53:49,281][54587] Component LearnerWorker_p0 process died already! Don't wait for it. [2024-04-28 04:53:49,281][54849] Stopping RolloutWorker_w29... [2024-04-28 04:53:49,281][54839] Loop rollout_proc19_evt_loop terminating... [2024-04-28 04:53:49,282][54831] Loop rollout_proc12_evt_loop terminating... [2024-04-28 04:53:49,282][54833] Loop rollout_proc13_evt_loop terminating... [2024-04-28 04:53:49,281][54848] Stopping RolloutWorker_w30... [2024-04-28 04:53:49,281][54840] Stopping RolloutWorker_w21... [2024-04-28 04:53:49,282][54843] Loop rollout_proc24_evt_loop terminating... [2024-04-28 04:53:49,282][54850] Loop rollout_proc31_evt_loop terminating... [2024-04-28 04:53:49,282][54836] Loop rollout_proc15_evt_loop terminating... 
[2024-04-28 04:53:49,282][54832] Loop rollout_proc14_evt_loop terminating... [2024-04-28 04:53:49,282][54844] Loop rollout_proc26_evt_loop terminating... [2024-04-28 04:53:49,282][54845] Loop rollout_proc25_evt_loop terminating... [2024-04-28 04:53:49,282][54837] Loop rollout_proc20_evt_loop terminating... [2024-04-28 04:53:49,281][54835] Stopping RolloutWorker_w17... [2024-04-28 04:53:49,282][54838] Loop rollout_proc18_evt_loop terminating... [2024-04-28 04:53:49,282][54834] Loop rollout_proc16_evt_loop terminating... [2024-04-28 04:53:49,282][54849] Loop rollout_proc29_evt_loop terminating... [2024-04-28 04:53:49,282][54846] Loop rollout_proc28_evt_loop terminating... [2024-04-28 04:53:49,282][54847] Loop rollout_proc27_evt_loop terminating... [2024-04-28 04:53:49,282][54841] Loop rollout_proc22_evt_loop terminating... [2024-04-28 04:53:49,282][54587] Component RolloutWorker_w4 process died already! Don't wait for it. [2024-04-28 04:53:49,282][54848] Loop rollout_proc30_evt_loop terminating... [2024-04-28 04:53:49,282][54842] Loop rollout_proc23_evt_loop terminating... [2024-04-28 04:53:49,282][54840] Loop rollout_proc21_evt_loop terminating... [2024-04-28 04:53:49,283][54835] Loop rollout_proc17_evt_loop terminating... [2024-04-28 04:53:49,283][54587] Component RolloutWorker_w5 process died already! Don't wait for it. [2024-04-28 04:53:49,284][54587] Component RolloutWorker_w6 stopped! 
[2024-04-28 04:53:49,284][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w1', 'RolloutWorker_w2', 'RolloutWorker_w3', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,285][54587] Component RolloutWorker_w1 stopped! [2024-04-28 04:53:49,285][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w2', 'RolloutWorker_w3', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,285][54587] Component RolloutWorker_w2 stopped! 
[2024-04-28 04:53:49,286][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w3', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,286][54587] Component RolloutWorker_w3 stopped! [2024-04-28 04:53:49,287][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w12', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,287][54587] Component RolloutWorker_w12 stopped! 
[2024-04-28 04:53:49,288][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w13', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,288][54587] Component RolloutWorker_w13 stopped! [2024-04-28 04:53:49,288][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w7', 'RolloutWorker_w8', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,288][54587] Component RolloutWorker_w8 stopped! [2024-04-28 04:53:49,288][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w7', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w28', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,288][54587] Component RolloutWorker_w28 stopped! 
[2024-04-28 04:53:49,288][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w0', 'RolloutWorker_w7', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,289][54587] Component RolloutWorker_w0 stopped! [2024-04-28 04:53:49,289][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w11', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,289][54587] Component RolloutWorker_w11 stopped! [2024-04-28 04:53:49,289][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w19', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,289][54587] Component RolloutWorker_w19 stopped! 
[2024-04-28 04:53:49,289][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w9', 'RolloutWorker_w10', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,289][54587] Component RolloutWorker_w10 stopped! [2024-04-28 04:53:49,289][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w9', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,289][54587] Component RolloutWorker_w9 stopped! [2024-04-28 04:53:49,289][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w24', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,289][54587] Component RolloutWorker_w24 stopped! 
[2024-04-28 04:53:49,289][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w18', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,290][54587] Component RolloutWorker_w18 stopped! [2024-04-28 04:53:49,290][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w22', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,290][54587] Component RolloutWorker_w22 stopped! [2024-04-28 04:53:49,290][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w14', 'RolloutWorker_w15', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,290][54587] Component RolloutWorker_w15 stopped! [2024-04-28 04:53:49,290][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w14', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,290][54587] Component RolloutWorker_w14 stopped! 
[2024-04-28 04:53:49,290][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w7', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,290][54587] Component RolloutWorker_w7 stopped! [2024-04-28 04:53:49,291][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w16', 'RolloutWorker_w17', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,291][54587] Component RolloutWorker_w16 stopped! [2024-04-28 04:53:49,291][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w17', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30', 'RolloutWorker_w31'] to stop... [2024-04-28 04:53:49,291][54587] Component RolloutWorker_w31 stopped! [2024-04-28 04:53:49,291][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w17', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29', 'RolloutWorker_w30'] to stop... [2024-04-28 04:53:49,291][54587] Component RolloutWorker_w30 stopped! [2024-04-28 04:53:49,291][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w17', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29'] to stop... [2024-04-28 04:53:49,291][54587] Component RolloutWorker_w17 stopped! 
[2024-04-28 04:53:49,291][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w25', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29'] to stop... [2024-04-28 04:53:49,291][54587] Component RolloutWorker_w25 stopped! [2024-04-28 04:53:49,291][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w26', 'RolloutWorker_w27', 'RolloutWorker_w29'] to stop... [2024-04-28 04:53:49,291][54587] Component RolloutWorker_w27 stopped! [2024-04-28 04:53:49,291][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w26', 'RolloutWorker_w29'] to stop... [2024-04-28 04:53:49,291][54587] Component RolloutWorker_w26 stopped! [2024-04-28 04:53:49,292][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w23', 'RolloutWorker_w29'] to stop... [2024-04-28 04:53:49,292][54587] Component RolloutWorker_w23 stopped! [2024-04-28 04:53:49,292][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w20', 'RolloutWorker_w21', 'RolloutWorker_w29'] to stop... [2024-04-28 04:53:49,292][54587] Component RolloutWorker_w20 stopped! [2024-04-28 04:53:49,292][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w21', 'RolloutWorker_w29'] to stop... [2024-04-28 04:53:49,292][54587] Component RolloutWorker_w29 stopped! [2024-04-28 04:53:49,292][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0', 'RolloutWorker_w21'] to stop... [2024-04-28 04:53:49,292][54587] Component RolloutWorker_w21 stopped! [2024-04-28 04:53:49,292][54587] Waiting for ['Batcher_0', 'InferenceWorker_p0-w0'] to stop... [2024-04-28 04:53:49,380][54818] Weights refcount: 2 0 [2024-04-28 04:53:49,384][54818] Stopping InferenceWorker_p0-w0... 
[2024-04-28 04:53:49,385][54818] Loop inference_proc0-0_evt_loop terminating... [2024-04-28 04:53:49,386][54587] Component InferenceWorker_p0-w0 stopped! [2024-04-28 04:53:49,387][54587] Waiting for ['Batcher_0'] to stop... [2024-04-28 10:07:29,448][57108] Saving configuration to /workspace/metta/train_dir/p2.fa.clean/config.json... [2024-04-28 10:07:29,456][57108] Rollout worker 0 uses device cpu [2024-04-28 10:07:29,456][57108] Rollout worker 1 uses device cpu [2024-04-28 10:07:29,456][57108] Rollout worker 2 uses device cpu [2024-04-28 10:07:29,456][57108] Rollout worker 3 uses device cpu [2024-04-28 10:07:29,456][57108] Rollout worker 4 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 5 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 6 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 7 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 8 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 9 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 10 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 11 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 12 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 13 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 14 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 15 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 16 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 17 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 18 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 19 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 20 uses device cpu [2024-04-28 10:07:29,457][57108] Rollout worker 21 uses device cpu [2024-04-28 10:07:29,458][57108] Rollout worker 22 uses device cpu [2024-04-28 10:07:29,458][57108] Rollout worker 23 uses device cpu [2024-04-28 10:07:29,458][57108] Rollout worker 24 uses device cpu 
[2024-04-28 10:07:29,458][57108] Rollout worker 25 uses device cpu [2024-04-28 10:07:29,458][57108] Rollout worker 26 uses device cpu [2024-04-28 10:07:29,458][57108] Rollout worker 27 uses device cpu [2024-04-28 10:07:29,458][57108] Rollout worker 28 uses device cpu [2024-04-28 10:07:29,458][57108] Rollout worker 29 uses device cpu [2024-04-28 10:07:29,458][57108] Rollout worker 30 uses device cpu [2024-04-28 10:07:29,458][57108] Rollout worker 31 uses device cpu [2024-04-28 10:07:29,991][57108] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-28 10:07:29,991][57108] InferenceWorker_p0-w0: min num requests: 10 [2024-04-28 10:07:30,033][57108] Starting all processes... [2024-04-28 10:07:30,033][57108] Starting process learner_proc0 [2024-04-28 10:07:30,087][57108] Starting all processes... [2024-04-28 10:07:30,091][57108] Starting process inference_proc0-0 [2024-04-28 10:07:30,091][57108] Starting process rollout_proc0 [2024-04-28 10:07:30,091][57108] Starting process rollout_proc1 [2024-04-28 10:07:30,091][57108] Starting process rollout_proc2 [2024-04-28 10:07:30,092][57108] Starting process rollout_proc3 [2024-04-28 10:07:30,092][57108] Starting process rollout_proc4 [2024-04-28 10:07:30,092][57108] Starting process rollout_proc5 [2024-04-28 10:07:30,093][57108] Starting process rollout_proc6 [2024-04-28 10:07:30,093][57108] Starting process rollout_proc7 [2024-04-28 10:07:30,093][57108] Starting process rollout_proc8 [2024-04-28 10:07:30,093][57108] Starting process rollout_proc9 [2024-04-28 10:07:30,093][57108] Starting process rollout_proc10 [2024-04-28 10:07:30,093][57108] Starting process rollout_proc11 [2024-04-28 10:07:30,094][57108] Starting process rollout_proc12 [2024-04-28 10:07:30,094][57108] Starting process rollout_proc13 [2024-04-28 10:07:30,095][57108] Starting process rollout_proc14 [2024-04-28 10:07:30,095][57108] Starting process rollout_proc15 [2024-04-28 10:07:30,095][57108] Starting process rollout_proc16 [2024-04-28 
10:07:30,096][57108] Starting process rollout_proc17 [2024-04-28 10:07:30,098][57108] Starting process rollout_proc18 [2024-04-28 10:07:30,098][57108] Starting process rollout_proc19 [2024-04-28 10:07:30,099][57108] Starting process rollout_proc20 [2024-04-28 10:07:30,100][57108] Starting process rollout_proc21 [2024-04-28 10:07:30,104][57108] Starting process rollout_proc22 [2024-04-28 10:07:30,104][57108] Starting process rollout_proc23 [2024-04-28 10:07:30,110][57108] Starting process rollout_proc24 [2024-04-28 10:07:30,110][57108] Starting process rollout_proc25 [2024-04-28 10:07:30,110][57108] Starting process rollout_proc26 [2024-04-28 10:07:30,110][57108] Starting process rollout_proc27 [2024-04-28 10:07:30,116][57108] Starting process rollout_proc28 [2024-04-28 10:07:30,117][57108] Starting process rollout_proc29 [2024-04-28 10:07:30,118][57108] Starting process rollout_proc30 [2024-04-28 10:07:30,118][57108] Starting process rollout_proc31 [2024-04-28 10:07:33,499][57345] Worker 5 uses CPU cores [5] [2024-04-28 10:07:33,585][57349] Worker 9 uses CPU cores [9] [2024-04-28 10:07:33,606][57342] Worker 1 uses CPU cores [1] [2024-04-28 10:07:33,630][57363] Worker 24 uses CPU cores [24] [2024-04-28 10:07:33,631][57339] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-28 10:07:33,631][57339] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-04-28 10:07:33,651][57339] Num visible devices: 1 [2024-04-28 10:07:33,654][57368] Worker 28 uses CPU cores [28] [2024-04-28 10:07:33,696][57364] Worker 23 uses CPU cores [23] [2024-04-28 10:07:33,729][57340] Worker 0 uses CPU cores [0] [2024-04-28 10:07:33,730][57341] Worker 2 uses CPU cores [2] [2024-04-28 10:07:33,790][57353] Worker 14 uses CPU cores [14] [2024-04-28 10:07:33,794][57343] Worker 3 uses CPU cores [3] [2024-04-28 10:07:33,870][57367] Worker 25 uses CPU cores [25] [2024-04-28 10:07:33,902][57319] Using GPUs [0] for process 0 (actually maps to GPUs 
[0]) [2024-04-28 10:07:33,902][57319] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-04-28 10:07:33,910][57366] Worker 27 uses CPU cores [27] [2024-04-28 10:07:33,912][57319] Num visible devices: 1 [2024-04-28 10:07:33,938][57355] Worker 15 uses CPU cores [15] [2024-04-28 10:07:33,944][57319] Starting seed is not provided [2024-04-28 10:07:33,944][57319] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-28 10:07:33,945][57319] Initializing actor-critic model on device cuda:0 [2024-04-28 10:07:33,945][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,957][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,957][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,957][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,957][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,957][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,957][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,957][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,958][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,958][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,958][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,958][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,958][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,958][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,958][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,958][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,958][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,959][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,959][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,959][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,959][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,959][57319] RunningMeanStd input shape: (1,) 
[2024-04-28 10:07:33,959][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,959][57319] RunningMeanStd input shape: (1,) [2024-04-28 10:07:33,960][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,961][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,961][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,961][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,961][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,961][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,961][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,961][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,961][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,961][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,962][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,962][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,962][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,962][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,962][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,962][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,962][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,962][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,962][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,963][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,963][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,963][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,963][57319] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:33,970][57344] Worker 4 uses CPU cores [4] [2024-04-28 10:07:33,970][57360] Worker 20 uses CPU cores [20] [2024-04-28 10:07:33,977][57369] Worker 31 uses CPU cores [31] [2024-04-28 10:07:33,982][57371] Worker 30 uses CPU cores [30] 
[2024-04-28 10:07:33,986][57347] Worker 7 uses CPU cores [7] [2024-04-28 10:07:33,994][57370] Worker 29 uses CPU cores [29] [2024-04-28 10:07:34,002][57348] Worker 8 uses CPU cores [8] [2024-04-28 10:07:34,009][57350] Worker 11 uses CPU cores [11] [2024-04-28 10:07:34,009][57354] Worker 13 uses CPU cores [13] [2024-04-28 10:07:34,017][57361] Worker 21 uses CPU cores [21] [2024-04-28 10:07:34,020][57319] Created Actor Critic model with architecture:
[2024-04-28 10:07:34,021][57319] PredictingActorCritic(
  (obs_normalizer): ObservationNormalizer()
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): FeatureEncoder(
    (_global_normalizer): FeatureListNormalizer(
      (_norms_dict): ModuleDict(
        (conf:agent:energy:initial): RunningMeanStdInPlace()
        (conf:agent:energy:max): RunningMeanStdInPlace()
        (conf:agent:energy:regen): RunningMeanStdInPlace()
        (conf:altar:cooldown): RunningMeanStdInPlace()
        (conf:altar:cost): RunningMeanStdInPlace()
        (conf:attack:damage): RunningMeanStdInPlace()
        (conf:attack:freeze_duration): RunningMeanStdInPlace()
        (conf:charger:cooldown): RunningMeanStdInPlace()
        (conf:charger:energy): RunningMeanStdInPlace()
        (conf:cost:attack): RunningMeanStdInPlace()
        (conf:cost:frozen): RunningMeanStdInPlace()
        (conf:cost:jump): RunningMeanStdInPlace()
        (conf:cost:move): RunningMeanStdInPlace()
        (conf:cost:move:predator): RunningMeanStdInPlace()
        (conf:cost:move:prey): RunningMeanStdInPlace()
        (conf:cost:rotate): RunningMeanStdInPlace()
        (conf:cost:shield): RunningMeanStdInPlace()
        (conf:cost:shield:upkeep): RunningMeanStdInPlace()
        (conf:generator:cooldown): RunningMeanStdInPlace()
        (conf:gift:energy): RunningMeanStdInPlace()
        (last_action_id): RunningMeanStdInPlace()
        (last_action_val): RunningMeanStdInPlace()
        (last_reward): RunningMeanStdInPlace()
      )
    )
    (_grid_normalizer): FeatureListNormalizer(
      (_norms_dict): ModuleDict(
        (agent): RunningMeanStdInPlace()
        (altar): RunningMeanStdInPlace()
        (charger): RunningMeanStdInPlace()
        (generator): RunningMeanStdInPlace()
        (wall): RunningMeanStdInPlace()
        (agent:dir): RunningMeanStdInPlace()
        (agent:energy): RunningMeanStdInPlace()
        (agent:frozen): RunningMeanStdInPlace()
        (agent:id): RunningMeanStdInPlace()
        (agent:inv:1): RunningMeanStdInPlace()
        (agent:inv:2): RunningMeanStdInPlace()
        (agent:inv:3): RunningMeanStdInPlace()
        (agent:shield): RunningMeanStdInPlace()
        (agent:species): RunningMeanStdInPlace()
        (altar:ready): RunningMeanStdInPlace()
        (charger:bonus): RunningMeanStdInPlace()
        (charger:input:1): RunningMeanStdInPlace()
        (charger:input:2): RunningMeanStdInPlace()
        (charger:input:3): RunningMeanStdInPlace()
        (charger:output): RunningMeanStdInPlace()
        (charger:ready): RunningMeanStdInPlace()
        (generator:ready): RunningMeanStdInPlace()
        (generator:resource): RunningMeanStdInPlace()
      )
    )
    (_global_encoder): FeatureListEncoder(
      (_embedding_net): Sequential(
        (0): Linear(in_features=5, out_features=8, bias=True)
        (1): ELU(alpha=1.0)
        (2): Linear(in_features=8, out_features=8, bias=True)
        (3): ELU(alpha=1.0)
      )
    )
    (_grid_encoder): FeatureListEncoder(
      (_embedding_net): Sequential(
        (0): Linear(in_features=125, out_features=512, bias=True)
        (1): ELU(alpha=1.0)
        (2): Linear(in_features=512, out_features=512, bias=True)
        (3): ELU(alpha=1.0)
        (4): Linear(in_features=512, out_features=512, bias=True)
        (5): ELU(alpha=1.0)
        (6): Linear(in_features=512, out_features=512, bias=True)
        (7): ELU(alpha=1.0)
      )
    )
    (encoder_head): Sequential(
      (0): Linear(in_features=520, out_features=512, bias=True)
      (1): ELU(alpha=1.0)
      (2): Linear(in_features=512, out_features=512, bias=True)
      (3): ELU(alpha=1.0)
      (4): Linear(in_features=512, out_features=512, bias=True)
      (5): ELU(alpha=1.0)
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): FeatureDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=17, bias=True)
  )
)
[2024-04-28 10:07:34,021][57356] Worker 
16 uses CPU cores [16] [2024-04-28 10:07:34,038][57346] Worker 6 uses CPU cores [6] [2024-04-28 10:07:34,075][57362] Worker 22 uses CPU cores [22] [2024-04-28 10:07:34,105][57359] Worker 18 uses CPU cores [18] [2024-04-28 10:07:34,117][57357] Worker 17 uses CPU cores [17] [2024-04-28 10:07:34,121][57351] Worker 10 uses CPU cores [10] [2024-04-28 10:07:34,144][57358] Worker 19 uses CPU cores [19] [2024-04-28 10:07:34,175][57365] Worker 26 uses CPU cores [26] [2024-04-28 10:07:34,182][57352] Worker 12 uses CPU cores [12] [2024-04-28 10:07:34,240][57319] Using optimizer [2024-04-28 10:07:34,345][57319] Loading state from checkpoint /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000580428_9509732352.pth... [2024-04-28 10:07:34,364][57319] Loading model from checkpoint [2024-04-28 10:07:34,366][57319] Loaded experiment state at self.train_step=580428, self.env_steps=9509732352 [2024-04-28 10:07:34,366][57319] Initialized policy 0 weights for model version 580428 [2024-04-28 10:07:34,368][57319] LearnerWorker_p0 finished initialization! 
[2024-04-28 10:07:34,368][57319] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-04-28 10:07:34,449][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,454][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,454][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,455][57339] RunningMeanStd input shape: (1,) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] 
RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,456][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,457][57339] RunningMeanStd input shape: (11, 11) [2024-04-28 10:07:34,517][57108] Inference worker 0-0 is ready! [2024-04-28 10:07:34,517][57108] All inference workers are ready! Signal rollout workers to start! [2024-04-28 10:07:35,158][57354] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,279][57355] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,291][57348] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,292][57345] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,296][57341] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,299][57349] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,300][57353] Decorrelating experience for 0 frames... 
[2024-04-28 10:07:35,300][57346] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,302][57343] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,304][57347] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,311][57342] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,312][57344] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,313][57340] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,315][57350] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,317][57351] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,337][57363] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,340][57357] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,342][57371] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,342][57356] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,343][57358] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,343][57369] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,343][57368] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,344][57359] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,346][57367] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,347][57370] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,348][57361] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,350][57366] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,354][57362] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,355][57364] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,356][57360] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,364][57352] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,371][57365] Decorrelating experience for 0 frames... [2024-04-28 10:07:35,888][57354] Decorrelating experience for 256 frames... 
[2024-04-28 10:07:35,976][57355] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,027][57348] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,028][57353] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,028][57345] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,034][57341] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,041][57346] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,041][57349] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,042][57343] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,046][57347] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,046][57352] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,048][57344] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,053][57342] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,058][57340] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,058][57350] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,061][57351] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,122][57363] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,135][57357] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,160][57371] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,161][57368] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,164][57361] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,167][57359] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,167][57358] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,171][57356] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,172][57364] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,172][57367] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,172][57370] Decorrelating experience for 256 frames... 
[2024-04-28 10:07:36,173][57369] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,173][57365] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,173][57366] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,174][57362] Decorrelating experience for 256 frames... [2024-04-28 10:07:36,179][57360] Decorrelating experience for 256 frames... [2024-04-28 10:07:37,169][57108] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 9509732352. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-04-28 10:07:40,763][57354] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-04-28 10:07:40,771][57355] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-04-28 10:07:40,800][57345] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-04-28 10:07:40,809][57342] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-04-28 10:07:40,830][57349] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-04-28 10:07:40,838][57343] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-04-28 10:07:40,866][57350] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-04-28 10:07:40,879][57353] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-04-28 10:07:40,893][57352] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-04-28 10:07:40,894][57348] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-04-28 10:07:40,898][57347] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-04-28 10:07:40,910][57351] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-04-28 10:07:40,929][57319] Signal inference workers to stop experience collection... 
[2024-04-28 10:07:40,936][57341] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-04-28 10:07:40,936][57339] InferenceWorker_p0-w0: stopping experience collection [2024-04-28 10:07:41,412][57319] Signal inference workers to resume experience collection... [2024-04-28 10:07:41,412][57339] InferenceWorker_p0-w0: resuming experience collection [2024-04-28 10:07:41,671][57357] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-04-28 10:07:41,743][57359] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-04-28 10:07:41,743][57365] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-04-28 10:07:41,768][57364] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-04-28 10:07:41,772][57363] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-04-28 10:07:41,840][57360] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-04-28 10:07:41,840][57367] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-04-28 10:07:41,840][57361] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-04-28 10:07:41,845][57368] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-04-28 10:07:41,849][57371] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-04-28 10:07:41,850][57370] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-04-28 10:07:41,875][57366] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-04-28 10:07:41,887][57358] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-04-28 10:07:41,887][57369] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-04-28 10:07:41,939][57356] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-04-28 10:07:41,943][57362] Worker 22, sleep for 103.125 sec to decorrelate experience 
collection [2024-04-28 10:07:42,169][57108] Fps is (10 sec: 22938.3, 60 sec: 22938.3, 300 sec: 22938.3). Total num frames: 9509847040. Throughput: 0: 57873.7. Samples: 289360. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:07:42,637][57344] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-04-28 10:07:42,695][57346] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-04-28 10:07:42,774][57339] Updated weights for policy 0, policy_version 580438 (0.0020) [2024-04-28 10:07:45,520][57342] Worker 1 awakens! [2024-04-28 10:07:47,169][57108] Fps is (10 sec: 16384.1, 60 sec: 16384.1, 300 sec: 16384.1). Total num frames: 9509896192. Throughput: 0: 33422.1. Samples: 334220. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:07:49,988][57108] Heartbeat connected on Batcher_0 [2024-04-28 10:07:49,989][57108] Heartbeat connected on LearnerWorker_p0 [2024-04-28 10:07:49,993][57108] Heartbeat connected on RolloutWorker_w0 [2024-04-28 10:07:49,994][57108] Heartbeat connected on RolloutWorker_w1 [2024-04-28 10:07:50,042][57108] Heartbeat connected on InferenceWorker_p0-w0 [2024-04-28 10:07:50,358][57341] Worker 2 awakens! [2024-04-28 10:07:50,363][57108] Heartbeat connected on RolloutWorker_w2 [2024-04-28 10:07:52,169][57108] Fps is (10 sec: 6553.3, 60 sec: 12014.7, 300 sec: 12014.7). Total num frames: 9509912576. Throughput: 0: 22707.6. Samples: 340620. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:07:54,970][57343] Worker 3 awakens! [2024-04-28 10:07:54,975][57108] Heartbeat connected on RolloutWorker_w3 [2024-04-28 10:07:57,169][57108] Fps is (10 sec: 3276.7, 60 sec: 9830.3, 300 sec: 9830.3). Total num frames: 9509928960. Throughput: 0: 17974.8. Samples: 359500. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:01,478][57344] Worker 4 awakens! 
[2024-04-28 10:08:01,488][57108] Heartbeat connected on RolloutWorker_w4 [2024-04-28 10:08:02,169][57108] Fps is (10 sec: 4915.4, 60 sec: 9175.1, 300 sec: 9175.1). Total num frames: 9509961728. Throughput: 0: 15389.7. Samples: 384740. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:02,169][57108] Avg episode reward: [(0, '0.275')] [2024-04-28 10:08:04,338][57345] Worker 5 awakens! [2024-04-28 10:08:04,342][57108] Heartbeat connected on RolloutWorker_w5 [2024-04-28 10:08:07,169][57108] Fps is (10 sec: 11469.2, 60 sec: 10376.6, 300 sec: 10376.6). Total num frames: 9510043648. Throughput: 0: 14578.7. Samples: 437360. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:07,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 10:08:07,246][57339] Updated weights for policy 0, policy_version 580448 (0.0017) [2024-04-28 10:08:10,921][57346] Worker 6 awakens! [2024-04-28 10:08:10,925][57108] Heartbeat connected on RolloutWorker_w6 [2024-04-28 10:08:12,169][57108] Fps is (10 sec: 19660.8, 60 sec: 12171.0, 300 sec: 12171.0). Total num frames: 9510158336. Throughput: 0: 15981.8. Samples: 559360. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:12,175][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 10:08:13,810][57347] Worker 7 awakens! [2024-04-28 10:08:13,816][57108] Heartbeat connected on RolloutWorker_w7 [2024-04-28 10:08:13,931][57339] Updated weights for policy 0, policy_version 580458 (0.0013) [2024-04-28 10:08:17,169][57108] Fps is (10 sec: 26214.1, 60 sec: 14336.0, 300 sec: 14336.0). Total num frames: 9510305792. Throughput: 0: 17987.0. Samples: 719480. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:17,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 10:08:18,490][57348] Worker 8 awakens! 
[2024-04-28 10:08:18,494][57108] Heartbeat connected on RolloutWorker_w8 [2024-04-28 10:08:19,962][57339] Updated weights for policy 0, policy_version 580468 (0.0013) [2024-04-28 10:08:22,169][57108] Fps is (10 sec: 27852.6, 60 sec: 15655.8, 300 sec: 15655.8). Total num frames: 9510436864. Throughput: 0: 17825.8. Samples: 802160. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:22,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 10:08:23,118][57349] Worker 9 awakens! [2024-04-28 10:08:23,124][57108] Heartbeat connected on RolloutWorker_w9 [2024-04-28 10:08:25,549][57339] Updated weights for policy 0, policy_version 580478 (0.0013) [2024-04-28 10:08:27,169][57108] Fps is (10 sec: 29491.3, 60 sec: 17367.1, 300 sec: 17367.1). Total num frames: 9510600704. Throughput: 0: 15572.9. Samples: 990140. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:27,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 10:08:27,883][57351] Worker 10 awakens! [2024-04-28 10:08:27,887][57108] Heartbeat connected on RolloutWorker_w10 [2024-04-28 10:08:29,398][57339] Updated weights for policy 0, policy_version 580488 (0.0013) [2024-04-28 10:08:32,169][57108] Fps is (10 sec: 36044.9, 60 sec: 19362.9, 300 sec: 19362.9). Total num frames: 9510797312. Throughput: 0: 19609.8. Samples: 1216660. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:32,169][57108] Avg episode reward: [(0, '0.468')] [2024-04-28 10:08:32,526][57350] Worker 11 awakens! [2024-04-28 10:08:32,533][57108] Heartbeat connected on RolloutWorker_w11 [2024-04-28 10:08:33,911][57339] Updated weights for policy 0, policy_version 580498 (0.0014) [2024-04-28 10:08:37,169][57108] Fps is (10 sec: 40960.1, 60 sec: 21299.2, 300 sec: 21299.2). Total num frames: 9511010304. Throughput: 0: 22319.7. Samples: 1345000. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:37,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 10:08:37,242][57352] Worker 12 awakens! 
[2024-04-28 10:08:37,248][57108] Heartbeat connected on RolloutWorker_w12 [2024-04-28 10:08:37,365][57339] Updated weights for policy 0, policy_version 580508 (0.0015) [2024-04-28 10:08:41,383][57339] Updated weights for policy 0, policy_version 580518 (0.0017) [2024-04-28 10:08:41,801][57354] Worker 13 awakens! [2024-04-28 10:08:41,808][57108] Heartbeat connected on RolloutWorker_w13 [2024-04-28 10:08:42,169][57108] Fps is (10 sec: 42598.7, 60 sec: 22937.6, 300 sec: 22937.6). Total num frames: 9511223296. Throughput: 0: 27718.4. Samples: 1606820. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:42,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 10:08:45,303][57339] Updated weights for policy 0, policy_version 580528 (0.0016) [2024-04-28 10:08:46,606][57353] Worker 14 awakens! [2024-04-28 10:08:46,613][57108] Heartbeat connected on RolloutWorker_w14 [2024-04-28 10:08:47,169][57108] Fps is (10 sec: 44236.5, 60 sec: 25941.3, 300 sec: 24576.0). Total num frames: 9511452672. Throughput: 0: 33066.2. Samples: 1872720. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:47,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:08:49,156][57339] Updated weights for policy 0, policy_version 580538 (0.0018) [2024-04-28 10:08:51,184][57355] Worker 15 awakens! [2024-04-28 10:08:51,190][57108] Heartbeat connected on RolloutWorker_w15 [2024-04-28 10:08:52,169][57108] Fps is (10 sec: 42597.9, 60 sec: 28945.2, 300 sec: 25559.0). Total num frames: 9511649280. Throughput: 0: 34537.7. Samples: 1991560. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:52,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 10:08:52,948][57339] Updated weights for policy 0, policy_version 580548 (0.0019) [2024-04-28 10:08:56,838][57339] Updated weights for policy 0, policy_version 580558 (0.0023) [2024-04-28 10:08:56,955][57356] Worker 16 awakens! 
[2024-04-28 10:08:56,963][57108] Heartbeat connected on RolloutWorker_w16 [2024-04-28 10:08:57,169][57108] Fps is (10 sec: 42598.4, 60 sec: 32495.0, 300 sec: 26828.8). Total num frames: 9511878656. Throughput: 0: 37451.0. Samples: 2244660. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:08:57,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 10:09:00,584][57339] Updated weights for policy 0, policy_version 580568 (0.0026) [2024-04-28 10:09:01,458][57357] Worker 17 awakens! [2024-04-28 10:09:01,468][57108] Heartbeat connected on RolloutWorker_w17 [2024-04-28 10:09:02,169][57108] Fps is (10 sec: 44237.1, 60 sec: 35498.6, 300 sec: 27756.4). Total num frames: 9512091648. Throughput: 0: 39734.3. Samples: 2507520. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:09:02,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 10:09:04,174][57339] Updated weights for policy 0, policy_version 580578 (0.0022) [2024-04-28 10:09:06,218][57359] Worker 18 awakens! [2024-04-28 10:09:06,227][57108] Heartbeat connected on RolloutWorker_w18 [2024-04-28 10:09:07,169][57108] Fps is (10 sec: 45875.3, 60 sec: 38229.3, 300 sec: 28945.1). Total num frames: 9512337408. Throughput: 0: 40927.1. Samples: 2643880. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:09:07,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 10:09:08,040][57339] Updated weights for policy 0, policy_version 580588 (0.0024) [2024-04-28 10:09:10,958][57339] Updated weights for policy 0, policy_version 580598 (0.0021) [2024-04-28 10:09:11,046][57358] Worker 19 awakens! [2024-04-28 10:09:11,057][57108] Heartbeat connected on RolloutWorker_w19 [2024-04-28 10:09:12,169][57108] Fps is (10 sec: 45874.7, 60 sec: 39867.6, 300 sec: 29663.7). Total num frames: 9512550400. Throughput: 0: 42966.6. Samples: 2923640. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-04-28 10:09:12,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 10:09:14,581][57339] Updated weights for policy 0, policy_version 580608 (0.0028) [2024-04-28 10:09:15,690][57360] Worker 20 awakens! [2024-04-28 10:09:15,701][57108] Heartbeat connected on RolloutWorker_w20 [2024-04-28 10:09:17,169][57108] Fps is (10 sec: 45874.8, 60 sec: 41506.1, 300 sec: 30638.1). Total num frames: 9512796160. Throughput: 0: 44393.7. Samples: 3214380. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:09:17,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:09:17,802][57339] Updated weights for policy 0, policy_version 580618 (0.0025) [2024-04-28 10:09:20,378][57361] Worker 21 awakens! [2024-04-28 10:09:20,389][57108] Heartbeat connected on RolloutWorker_w21 [2024-04-28 10:09:21,171][57339] Updated weights for policy 0, policy_version 580628 (0.0026) [2024-04-28 10:09:22,169][57108] Fps is (10 sec: 50790.6, 60 sec: 43690.6, 300 sec: 31675.7). Total num frames: 9513058304. Throughput: 0: 44735.5. Samples: 3358100. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:09:22,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 10:09:24,492][57339] Updated weights for policy 0, policy_version 580638 (0.0023) [2024-04-28 10:09:25,094][57362] Worker 22 awakens! [2024-04-28 10:09:25,105][57108] Heartbeat connected on RolloutWorker_w22 [2024-04-28 10:09:27,169][57108] Fps is (10 sec: 50790.9, 60 sec: 45056.0, 300 sec: 32470.1). Total num frames: 9513304064. Throughput: 0: 45579.0. Samples: 3657880. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:09:27,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 10:09:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000580646_9513304064.pth... 
[2024-04-28 10:09:27,222][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000579980_9502392320.pth [2024-04-28 10:09:28,145][57339] Updated weights for policy 0, policy_version 580648 (0.0025) [2024-04-28 10:09:29,649][57364] Worker 23 awakens! [2024-04-28 10:09:29,662][57108] Heartbeat connected on RolloutWorker_w23 [2024-04-28 10:09:30,708][57339] Updated weights for policy 0, policy_version 580658 (0.0025) [2024-04-28 10:09:32,169][57108] Fps is (10 sec: 49152.1, 60 sec: 45875.2, 300 sec: 33195.4). Total num frames: 9513549824. Throughput: 0: 46400.5. Samples: 3960740. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:09:32,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 10:09:34,352][57339] Updated weights for policy 0, policy_version 580668 (0.0024) [2024-04-28 10:09:34,370][57363] Worker 24 awakens! [2024-04-28 10:09:34,382][57108] Heartbeat connected on RolloutWorker_w24 [2024-04-28 10:09:37,169][57108] Fps is (10 sec: 49151.4, 60 sec: 46421.2, 300 sec: 33860.2). Total num frames: 9513795584. Throughput: 0: 47159.0. Samples: 4113720. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:09:37,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 10:09:37,635][57339] Updated weights for policy 0, policy_version 580678 (0.0022) [2024-04-28 10:09:39,128][57367] Worker 25 awakens! [2024-04-28 10:09:39,138][57108] Heartbeat connected on RolloutWorker_w25 [2024-04-28 10:09:40,318][57339] Updated weights for policy 0, policy_version 580688 (0.0026) [2024-04-28 10:09:42,169][57108] Fps is (10 sec: 50790.3, 60 sec: 47240.5, 300 sec: 34603.0). Total num frames: 9514057728. Throughput: 0: 48315.1. Samples: 4418840. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:09:42,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 10:09:43,718][57365] Worker 26 awakens! 
[2024-04-28 10:09:43,731][57108] Heartbeat connected on RolloutWorker_w26 [2024-04-28 10:09:43,815][57339] Updated weights for policy 0, policy_version 580698 (0.0026) [2024-04-28 10:09:46,659][57339] Updated weights for policy 0, policy_version 580708 (0.0032) [2024-04-28 10:09:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 48059.6, 300 sec: 35414.6). Total num frames: 9514336256. Throughput: 0: 49763.8. Samples: 4746900. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:09:47,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 10:09:48,538][57366] Worker 27 awakens! [2024-04-28 10:09:48,551][57108] Heartbeat connected on RolloutWorker_w27 [2024-04-28 10:09:49,472][57339] Updated weights for policy 0, policy_version 580718 (0.0028) [2024-04-28 10:09:52,169][57108] Fps is (10 sec: 52429.0, 60 sec: 48879.0, 300 sec: 35923.4). Total num frames: 9514582016. Throughput: 0: 50116.0. Samples: 4899100. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:09:52,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 10:09:52,977][57339] Updated weights for policy 0, policy_version 580728 (0.0028) [2024-04-28 10:09:53,194][57368] Worker 28 awakens! [2024-04-28 10:09:53,208][57108] Heartbeat connected on RolloutWorker_w28 [2024-04-28 10:09:55,972][57339] Updated weights for policy 0, policy_version 580738 (0.0027) [2024-04-28 10:09:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 49971.2, 300 sec: 36747.0). Total num frames: 9514876928. Throughput: 0: 51181.8. Samples: 5226820. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:09:57,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 10:09:57,843][57370] Worker 29 awakens! 
[2024-04-28 10:09:57,860][57108] Heartbeat connected on RolloutWorker_w29 [2024-04-28 10:09:59,247][57339] Updated weights for policy 0, policy_version 580748 (0.0027) [2024-04-28 10:10:01,670][57339] Updated weights for policy 0, policy_version 580758 (0.0025) [2024-04-28 10:10:02,169][57108] Fps is (10 sec: 57342.8, 60 sec: 51063.3, 300 sec: 37400.7). Total num frames: 9515155456. Throughput: 0: 51970.1. Samples: 5553040. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:10:02,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 10:10:02,574][57371] Worker 30 awakens! [2024-04-28 10:10:02,589][57108] Heartbeat connected on RolloutWorker_w30 [2024-04-28 10:10:05,092][57339] Updated weights for policy 0, policy_version 580768 (0.0034) [2024-04-28 10:10:07,144][57319] Signal inference workers to stop experience collection... (50 times) [2024-04-28 10:10:07,162][57339] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-04-28 10:10:07,169][57108] Fps is (10 sec: 54068.2, 60 sec: 51336.6, 300 sec: 37901.7). Total num frames: 9515417600. Throughput: 0: 52635.7. Samples: 5726700. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:10:07,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 10:10:07,234][57319] Signal inference workers to resume experience collection... (50 times) [2024-04-28 10:10:07,234][57339] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-04-28 10:10:07,298][57369] Worker 31 awakens! [2024-04-28 10:10:07,312][57108] Heartbeat connected on RolloutWorker_w31 [2024-04-28 10:10:07,532][57339] Updated weights for policy 0, policy_version 580778 (0.0028) [2024-04-28 10:10:10,836][57339] Updated weights for policy 0, policy_version 580788 (0.0032) [2024-04-28 10:10:12,169][57108] Fps is (10 sec: 55706.8, 60 sec: 52701.9, 300 sec: 38581.7). Total num frames: 9515712512. Throughput: 0: 53399.1. Samples: 6060840. 
Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:10:12,169][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 10:10:13,371][57339] Updated weights for policy 0, policy_version 580798 (0.0028) [2024-04-28 10:10:16,791][57339] Updated weights for policy 0, policy_version 580808 (0.0030) [2024-04-28 10:10:17,169][57108] Fps is (10 sec: 55704.3, 60 sec: 52974.9, 300 sec: 39014.4). Total num frames: 9515974656. Throughput: 0: 54099.9. Samples: 6395240. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:10:17,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 10:10:19,363][57339] Updated weights for policy 0, policy_version 580818 (0.0029) [2024-04-28 10:10:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 53248.1, 300 sec: 39520.2). Total num frames: 9516253184. Throughput: 0: 54350.8. Samples: 6559500. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:10:22,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 10:10:22,527][57339] Updated weights for policy 0, policy_version 580828 (0.0031) [2024-04-28 10:10:25,249][57339] Updated weights for policy 0, policy_version 580838 (0.0027) [2024-04-28 10:10:27,169][57108] Fps is (10 sec: 55706.3, 60 sec: 53794.1, 300 sec: 39996.2). Total num frames: 9516531712. Throughput: 0: 55160.1. Samples: 6901040. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:10:27,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 10:10:28,328][57339] Updated weights for policy 0, policy_version 580848 (0.0042) [2024-04-28 10:10:31,018][57339] Updated weights for policy 0, policy_version 580858 (0.0032) [2024-04-28 10:10:32,169][57108] Fps is (10 sec: 57343.5, 60 sec: 54613.3, 300 sec: 40538.7). Total num frames: 9516826624. Throughput: 0: 55306.8. Samples: 7235700. 
Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:10:32,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 10:10:34,084][57339] Updated weights for policy 0, policy_version 580868 (0.0026) [2024-04-28 10:10:36,967][57339] Updated weights for policy 0, policy_version 580878 (0.0037) [2024-04-28 10:10:37,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55159.5, 300 sec: 40960.0). Total num frames: 9517105152. Throughput: 0: 55578.1. Samples: 7400120. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:10:37,170][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 10:10:39,995][57339] Updated weights for policy 0, policy_version 580888 (0.0034) [2024-04-28 10:10:42,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 41358.5). Total num frames: 9517383680. Throughput: 0: 55725.4. Samples: 7734460. Policy #0 lag: (min: 0.0, avg: 6.4, max: 13.0) [2024-04-28 10:10:42,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 10:10:42,923][57339] Updated weights for policy 0, policy_version 580898 (0.0033) [2024-04-28 10:10:45,953][57339] Updated weights for policy 0, policy_version 580908 (0.0032) [2024-04-28 10:10:47,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55705.8, 300 sec: 41822.3). Total num frames: 9517678592. Throughput: 0: 55788.4. Samples: 8063500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:10:47,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 10:10:48,783][57339] Updated weights for policy 0, policy_version 580918 (0.0030) [2024-04-28 10:10:51,804][57339] Updated weights for policy 0, policy_version 580928 (0.0026) [2024-04-28 10:10:52,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 42178.3). Total num frames: 9517957120. Throughput: 0: 55854.1. Samples: 8240140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:10:52,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 10:10:54,598][57339] Updated weights for policy 0, policy_version 580938 (0.0031) [2024-04-28 10:10:57,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 42434.6). Total num frames: 9518219264. Throughput: 0: 55823.6. Samples: 8572900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:10:57,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 10:10:57,662][57339] Updated weights for policy 0, policy_version 580948 (0.0032) [2024-04-28 10:11:00,612][57339] Updated weights for policy 0, policy_version 580958 (0.0037) [2024-04-28 10:11:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.8, 300 sec: 42758.3). Total num frames: 9518497792. Throughput: 0: 55836.6. Samples: 8907880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:02,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 10:11:03,540][57339] Updated weights for policy 0, policy_version 580968 (0.0032) [2024-04-28 10:11:06,558][57339] Updated weights for policy 0, policy_version 580978 (0.0026) [2024-04-28 10:11:07,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 43066.5). Total num frames: 9518776320. Throughput: 0: 55812.8. Samples: 9071080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:07,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 10:11:09,371][57339] Updated weights for policy 0, policy_version 580988 (0.0027) [2024-04-28 10:11:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 43284.3). Total num frames: 9519038464. Throughput: 0: 55658.7. Samples: 9405680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:12,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 10:11:12,399][57339] Updated weights for policy 0, policy_version 580998 (0.0032) [2024-04-28 10:11:15,197][57339] Updated weights for policy 0, policy_version 581008 (0.0040) [2024-04-28 10:11:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 43641.0). Total num frames: 9519333376. Throughput: 0: 55712.8. Samples: 9742780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:17,170][57108] Avg episode reward: [(0, '0.489')] [2024-04-28 10:11:18,197][57339] Updated weights for policy 0, policy_version 581018 (0.0035) [2024-04-28 10:11:19,440][57319] Signal inference workers to stop experience collection... (100 times) [2024-04-28 10:11:19,469][57339] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-04-28 10:11:19,498][57319] Signal inference workers to resume experience collection... (100 times) [2024-04-28 10:11:19,499][57339] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-04-28 10:11:21,086][57339] Updated weights for policy 0, policy_version 581028 (0.0028) [2024-04-28 10:11:22,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 43909.2). Total num frames: 9519611904. Throughput: 0: 55688.3. Samples: 9906080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:22,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 10:11:24,114][57339] Updated weights for policy 0, policy_version 581038 (0.0031) [2024-04-28 10:11:27,128][57339] Updated weights for policy 0, policy_version 581048 (0.0028) [2024-04-28 10:11:27,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55978.7, 300 sec: 44165.6). Total num frames: 9519890432. Throughput: 0: 55672.1. Samples: 10239700. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:27,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 10:11:27,270][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000581049_9519906816.pth... [2024-04-28 10:11:27,308][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000580428_9509732352.pth [2024-04-28 10:11:30,066][57339] Updated weights for policy 0, policy_version 581058 (0.0031) [2024-04-28 10:11:32,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55432.5, 300 sec: 44341.4). Total num frames: 9520152576. Throughput: 0: 55765.5. Samples: 10572960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:32,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 10:11:32,907][57339] Updated weights for policy 0, policy_version 581068 (0.0028) [2024-04-28 10:11:35,910][57339] Updated weights for policy 0, policy_version 581078 (0.0028) [2024-04-28 10:11:37,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55705.6, 300 sec: 44646.4). Total num frames: 9520447488. Throughput: 0: 55558.9. Samples: 10740300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:37,170][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 10:11:38,860][57339] Updated weights for policy 0, policy_version 581088 (0.0031) [2024-04-28 10:11:41,757][57339] Updated weights for policy 0, policy_version 581098 (0.0032) [2024-04-28 10:11:42,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 44805.2). Total num frames: 9520709632. Throughput: 0: 55584.0. Samples: 11074180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:42,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 10:11:44,729][57339] Updated weights for policy 0, policy_version 581108 (0.0032) [2024-04-28 10:11:47,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55159.4, 300 sec: 45023.3). Total num frames: 9520988160. Throughput: 0: 55556.9. Samples: 11407940. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:47,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 10:11:47,866][57339] Updated weights for policy 0, policy_version 581118 (0.0027) [2024-04-28 10:11:50,623][57339] Updated weights for policy 0, policy_version 581128 (0.0030) [2024-04-28 10:11:52,170][57108] Fps is (10 sec: 57335.9, 60 sec: 55431.2, 300 sec: 45296.7). Total num frames: 9521283072. Throughput: 0: 55533.9. Samples: 11570180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:52,171][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 10:11:53,707][57339] Updated weights for policy 0, policy_version 581138 (0.0024) [2024-04-28 10:11:56,431][57339] Updated weights for policy 0, policy_version 581148 (0.0028) [2024-04-28 10:11:57,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 45434.1). Total num frames: 9521545216. Throughput: 0: 55472.0. Samples: 11901920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:11:57,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 10:11:59,645][57339] Updated weights for policy 0, policy_version 581158 (0.0032) [2024-04-28 10:12:02,169][57108] Fps is (10 sec: 54074.7, 60 sec: 55432.5, 300 sec: 45627.9). Total num frames: 9521823744. Throughput: 0: 55342.4. Samples: 12233180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:12:02,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 10:12:02,482][57339] Updated weights for policy 0, policy_version 581168 (0.0035) [2024-04-28 10:12:05,595][57339] Updated weights for policy 0, policy_version 581178 (0.0036) [2024-04-28 10:12:07,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 45753.8). Total num frames: 9522085888. Throughput: 0: 55347.4. Samples: 12396720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:12:07,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 10:12:08,524][57339] Updated weights for policy 0, policy_version 581188 (0.0026) [2024-04-28 10:12:11,613][57339] Updated weights for policy 0, policy_version 581198 (0.0035) [2024-04-28 10:12:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 45934.8). Total num frames: 9522364416. Throughput: 0: 55237.3. Samples: 12725380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:12:12,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 10:12:14,450][57339] Updated weights for policy 0, policy_version 581208 (0.0024) [2024-04-28 10:12:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 46109.3). Total num frames: 9522642944. Throughput: 0: 55218.3. Samples: 13057780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 10:12:17,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 10:12:17,662][57339] Updated weights for policy 0, policy_version 581218 (0.0031) [2024-04-28 10:12:20,453][57339] Updated weights for policy 0, policy_version 581228 (0.0031) [2024-04-28 10:12:22,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 46335.1). Total num frames: 9522937856. Throughput: 0: 55168.3. Samples: 13222860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:12:22,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 10:12:23,510][57339] Updated weights for policy 0, policy_version 581238 (0.0028) [2024-04-28 10:12:26,266][57339] Updated weights for policy 0, policy_version 581248 (0.0030) [2024-04-28 10:12:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.5, 300 sec: 46440.2). Total num frames: 9523200000. Throughput: 0: 55132.5. Samples: 13555140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:12:27,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 10:12:29,498][57339] Updated weights for policy 0, policy_version 581258 (0.0027) [2024-04-28 10:12:32,140][57339] Updated weights for policy 0, policy_version 581268 (0.0032) [2024-04-28 10:12:32,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.7, 300 sec: 46652.8). Total num frames: 9523494912. Throughput: 0: 54956.4. Samples: 13880980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:12:32,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 10:12:33,503][57319] Signal inference workers to stop experience collection... (150 times) [2024-04-28 10:12:33,548][57339] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-04-28 10:12:33,557][57319] Signal inference workers to resume experience collection... (150 times) [2024-04-28 10:12:33,564][57339] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-04-28 10:12:35,395][57339] Updated weights for policy 0, policy_version 581278 (0.0030) [2024-04-28 10:12:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.6, 300 sec: 47152.6). Total num frames: 9523757056. Throughput: 0: 55259.6. Samples: 14056780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:12:37,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 10:12:38,051][57339] Updated weights for policy 0, policy_version 581288 (0.0028) [2024-04-28 10:12:41,421][57339] Updated weights for policy 0, policy_version 581298 (0.0032) [2024-04-28 10:12:42,169][57108] Fps is (10 sec: 50790.3, 60 sec: 54886.4, 300 sec: 47819.1). Total num frames: 9524002816. Throughput: 0: 55332.3. Samples: 14391880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:12:42,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 10:12:43,893][57339] Updated weights for policy 0, policy_version 581308 (0.0024) [2024-04-28 10:12:47,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 48763.3). Total num frames: 9524297728. Throughput: 0: 55233.7. Samples: 14718700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:12:47,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 10:12:47,271][57339] Updated weights for policy 0, policy_version 581318 (0.0024) [2024-04-28 10:12:49,845][57339] Updated weights for policy 0, policy_version 581328 (0.0034) [2024-04-28 10:12:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 54887.7, 300 sec: 49651.9). Total num frames: 9524576256. Throughput: 0: 55180.4. Samples: 14879840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:12:52,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 10:12:53,190][57339] Updated weights for policy 0, policy_version 581338 (0.0030) [2024-04-28 10:12:55,780][57339] Updated weights for policy 0, policy_version 581348 (0.0038) [2024-04-28 10:12:57,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55159.5, 300 sec: 50484.9). Total num frames: 9524854784. Throughput: 0: 55165.9. Samples: 15207840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:12:57,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:12:59,067][57339] Updated weights for policy 0, policy_version 581358 (0.0030) [2024-04-28 10:13:01,998][57339] Updated weights for policy 0, policy_version 581368 (0.0031) [2024-04-28 10:13:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 51151.4). Total num frames: 9525133312. Throughput: 0: 55043.1. Samples: 15534720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:02,170][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 10:13:05,020][57339] Updated weights for policy 0, policy_version 581378 (0.0031) [2024-04-28 10:13:07,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.5, 300 sec: 51651.2). Total num frames: 9525395456. Throughput: 0: 55075.0. Samples: 15701240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:13:07,839][57339] Updated weights for policy 0, policy_version 581388 (0.0029) [2024-04-28 10:13:11,111][57339] Updated weights for policy 0, policy_version 581398 (0.0025) [2024-04-28 10:13:12,169][57108] Fps is (10 sec: 52429.1, 60 sec: 54886.4, 300 sec: 52040.0). Total num frames: 9525657600. Throughput: 0: 55094.2. Samples: 16034380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:12,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 10:13:13,798][57339] Updated weights for policy 0, policy_version 581408 (0.0037) [2024-04-28 10:13:17,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.5, 300 sec: 52539.9). Total num frames: 9525936128. Throughput: 0: 55133.0. Samples: 16361960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:17,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 10:13:17,269][57339] Updated weights for policy 0, policy_version 581418 (0.0028) [2024-04-28 10:13:19,891][57339] Updated weights for policy 0, policy_version 581428 (0.0030) [2024-04-28 10:13:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54613.2, 300 sec: 52928.6). Total num frames: 9526214656. Throughput: 0: 54754.1. Samples: 16520720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:22,169][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 10:13:23,390][57339] Updated weights for policy 0, policy_version 581438 (0.0028) [2024-04-28 10:13:25,845][57339] Updated weights for policy 0, policy_version 581448 (0.0026) [2024-04-28 10:13:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 53261.9). Total num frames: 9526509568. Throughput: 0: 54600.6. Samples: 16848900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:27,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 10:13:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000581452_9526509568.pth... [2024-04-28 10:13:27,235][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000580646_9513304064.pth [2024-04-28 10:13:29,397][57339] Updated weights for policy 0, policy_version 581458 (0.0032) [2024-04-28 10:13:32,102][57339] Updated weights for policy 0, policy_version 581468 (0.0036) [2024-04-28 10:13:32,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54613.4, 300 sec: 53428.5). Total num frames: 9526771712. Throughput: 0: 54639.7. Samples: 17177480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:32,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 10:13:35,272][57339] Updated weights for policy 0, policy_version 581478 (0.0027) [2024-04-28 10:13:37,169][57108] Fps is (10 sec: 54066.5, 60 sec: 54886.3, 300 sec: 53650.6). Total num frames: 9527050240. Throughput: 0: 54829.7. Samples: 17347180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:37,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 10:13:38,055][57339] Updated weights for policy 0, policy_version 581488 (0.0032) [2024-04-28 10:13:41,041][57339] Updated weights for policy 0, policy_version 581498 (0.0032) [2024-04-28 10:13:42,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 53761.7). 
Total num frames: 9527312384. Throughput: 0: 54851.0. Samples: 17676140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:42,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 10:13:44,067][57339] Updated weights for policy 0, policy_version 581508 (0.0034) [2024-04-28 10:13:44,079][57319] Signal inference workers to stop experience collection... (200 times) [2024-04-28 10:13:44,117][57339] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-04-28 10:13:44,172][57319] Signal inference workers to resume experience collection... (200 times) [2024-04-28 10:13:44,172][57339] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-04-28 10:13:47,090][57339] Updated weights for policy 0, policy_version 581518 (0.0035) [2024-04-28 10:13:47,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 54039.4). Total num frames: 9527590912. Throughput: 0: 54945.4. Samples: 18007260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 10:13:47,170][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 10:13:49,892][57339] Updated weights for policy 0, policy_version 581528 (0.0038) [2024-04-28 10:13:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54613.3, 300 sec: 54150.5). Total num frames: 9527853056. Throughput: 0: 54977.4. Samples: 18175220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:13:52,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 10:13:53,050][57339] Updated weights for policy 0, policy_version 581538 (0.0034) [2024-04-28 10:13:55,786][57339] Updated weights for policy 0, policy_version 581548 (0.0026) [2024-04-28 10:13:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.4, 300 sec: 54483.7). Total num frames: 9528164352. Throughput: 0: 54884.5. Samples: 18504180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:13:57,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 10:13:59,067][57339] Updated weights for policy 0, policy_version 581558 (0.0032) [2024-04-28 10:14:01,835][57339] Updated weights for policy 0, policy_version 581568 (0.0026) [2024-04-28 10:14:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54886.5, 300 sec: 54539.3). Total num frames: 9528426496. Throughput: 0: 54839.1. Samples: 18829720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:02,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 10:14:04,899][57339] Updated weights for policy 0, policy_version 581578 (0.0030) [2024-04-28 10:14:07,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54886.3, 300 sec: 54705.9). Total num frames: 9528688640. Throughput: 0: 55042.2. Samples: 18997620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:07,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 10:14:07,690][57339] Updated weights for policy 0, policy_version 581588 (0.0031) [2024-04-28 10:14:10,997][57339] Updated weights for policy 0, policy_version 581598 (0.0029) [2024-04-28 10:14:12,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 54817.0). Total num frames: 9528967168. Throughput: 0: 55000.3. Samples: 19323920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:12,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:14:13,699][57339] Updated weights for policy 0, policy_version 581608 (0.0036) [2024-04-28 10:14:17,075][57339] Updated weights for policy 0, policy_version 581618 (0.0038) [2024-04-28 10:14:17,169][57108] Fps is (10 sec: 54067.7, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9529229312. Throughput: 0: 55093.3. Samples: 19656680. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:17,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:14:19,606][57339] Updated weights for policy 0, policy_version 581628 (0.0028) [2024-04-28 10:14:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 54928.0). Total num frames: 9529507840. Throughput: 0: 54847.1. Samples: 19815300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:22,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 10:14:23,106][57339] Updated weights for policy 0, policy_version 581638 (0.0030) [2024-04-28 10:14:25,581][57339] Updated weights for policy 0, policy_version 581648 (0.0026) [2024-04-28 10:14:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 54886.3, 300 sec: 55094.7). Total num frames: 9529802752. Throughput: 0: 54795.1. Samples: 20141920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:27,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 10:14:29,057][57339] Updated weights for policy 0, policy_version 581658 (0.0033) [2024-04-28 10:14:31,593][57339] Updated weights for policy 0, policy_version 581668 (0.0026) [2024-04-28 10:14:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54886.3, 300 sec: 55150.2). Total num frames: 9530064896. Throughput: 0: 54671.5. Samples: 20467480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:32,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 10:14:34,866][57339] Updated weights for policy 0, policy_version 581678 (0.0030) [2024-04-28 10:14:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54613.4, 300 sec: 55150.2). Total num frames: 9530327040. Throughput: 0: 54746.7. Samples: 20638820. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:37,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 10:14:37,601][57339] Updated weights for policy 0, policy_version 581688 (0.0033) [2024-04-28 10:14:40,864][57339] Updated weights for policy 0, policy_version 581698 (0.0035) [2024-04-28 10:14:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 54886.3, 300 sec: 55150.2). Total num frames: 9530605568. Throughput: 0: 54753.6. Samples: 20968100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:42,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 10:14:43,514][57339] Updated weights for policy 0, policy_version 581708 (0.0031) [2024-04-28 10:14:46,922][57339] Updated weights for policy 0, policy_version 581718 (0.0031) [2024-04-28 10:14:47,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54613.4, 300 sec: 55205.7). Total num frames: 9530867712. Throughput: 0: 54850.2. Samples: 21297980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:47,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 10:14:49,411][57339] Updated weights for policy 0, policy_version 581728 (0.0026) [2024-04-28 10:14:51,811][57319] Signal inference workers to stop experience collection... (250 times) [2024-04-28 10:14:51,837][57339] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-04-28 10:14:51,904][57319] Signal inference workers to resume experience collection... (250 times) [2024-04-28 10:14:51,904][57339] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-04-28 10:14:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.3, 300 sec: 55205.7). Total num frames: 9531162624. Throughput: 0: 54637.3. Samples: 21456300. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:52,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:14:52,711][57339] Updated weights for policy 0, policy_version 581738 (0.0024) [2024-04-28 10:14:55,402][57339] Updated weights for policy 0, policy_version 581748 (0.0028) [2024-04-28 10:14:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54340.2, 300 sec: 55150.2). Total num frames: 9531424768. Throughput: 0: 54738.2. Samples: 21787140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:14:57,178][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:14:58,696][57339] Updated weights for policy 0, policy_version 581758 (0.0030) [2024-04-28 10:15:01,275][57339] Updated weights for policy 0, policy_version 581768 (0.0028) [2024-04-28 10:15:02,169][57108] Fps is (10 sec: 55706.7, 60 sec: 54886.5, 300 sec: 55261.3). Total num frames: 9531719680. Throughput: 0: 54678.8. Samples: 22117220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:15:02,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 10:15:04,768][57339] Updated weights for policy 0, policy_version 581778 (0.0027) [2024-04-28 10:15:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 9531981824. Throughput: 0: 55021.4. Samples: 22291260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:15:07,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 10:15:07,372][57339] Updated weights for policy 0, policy_version 581788 (0.0032) [2024-04-28 10:15:10,808][57339] Updated weights for policy 0, policy_version 581798 (0.0029) [2024-04-28 10:15:12,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9532276736. Throughput: 0: 55106.1. Samples: 22621700. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:15:12,170][57108] Avg episode reward: [(0, '0.514')] [2024-04-28 10:15:13,285][57339] Updated weights for policy 0, policy_version 581808 (0.0029) [2024-04-28 10:15:16,581][57339] Updated weights for policy 0, policy_version 581818 (0.0036) [2024-04-28 10:15:17,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 9532522496. Throughput: 0: 55191.7. Samples: 22951100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:15:17,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 10:15:19,107][57339] Updated weights for policy 0, policy_version 581828 (0.0029) [2024-04-28 10:15:22,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 9532801024. Throughput: 0: 54921.6. Samples: 23110300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-04-28 10:15:22,170][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 10:15:22,540][57339] Updated weights for policy 0, policy_version 581838 (0.0027) [2024-04-28 10:15:25,225][57339] Updated weights for policy 0, policy_version 581848 (0.0025) [2024-04-28 10:15:27,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54613.3, 300 sec: 55094.7). Total num frames: 9533079552. Throughput: 0: 54918.7. Samples: 23439440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:15:27,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 10:15:27,183][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000581853_9533079552.pth... [2024-04-28 10:15:27,245][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000581049_9519906816.pth [2024-04-28 10:15:28,554][57339] Updated weights for policy 0, policy_version 581858 (0.0037) [2024-04-28 10:15:31,128][57339] Updated weights for policy 0, policy_version 581868 (0.0035) [2024-04-28 10:15:32,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 55150.2). 
Total num frames: 9533374464. Throughput: 0: 54930.1. Samples: 23769840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:15:32,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 10:15:34,508][57339] Updated weights for policy 0, policy_version 581878 (0.0032) [2024-04-28 10:15:37,038][57339] Updated weights for policy 0, policy_version 581888 (0.0030) [2024-04-28 10:15:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.4, 300 sec: 55150.2). Total num frames: 9533652992. Throughput: 0: 55191.6. Samples: 23939920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:15:37,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 10:15:40,509][57339] Updated weights for policy 0, policy_version 581898 (0.0032) [2024-04-28 10:15:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55039.1). Total num frames: 9533915136. Throughput: 0: 55182.5. Samples: 24270360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:15:42,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 10:15:42,947][57339] Updated weights for policy 0, policy_version 581908 (0.0028) [2024-04-28 10:15:46,418][57339] Updated weights for policy 0, policy_version 581918 (0.0030) [2024-04-28 10:15:47,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55039.1). Total num frames: 9534193664. Throughput: 0: 55094.6. Samples: 24596480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:15:47,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 10:15:49,045][57339] Updated weights for policy 0, policy_version 581928 (0.0033) [2024-04-28 10:15:52,169][57108] Fps is (10 sec: 52429.6, 60 sec: 54613.4, 300 sec: 54983.6). Total num frames: 9534439424. Throughput: 0: 54843.5. Samples: 24759220. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:15:52,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 10:15:52,328][57339] Updated weights for policy 0, policy_version 581938 (0.0028) [2024-04-28 10:15:55,110][57339] Updated weights for policy 0, policy_version 581948 (0.0040) [2024-04-28 10:15:57,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 9534717952. Throughput: 0: 54829.4. Samples: 25089020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:15:57,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 10:15:58,283][57339] Updated weights for policy 0, policy_version 581958 (0.0036) [2024-04-28 10:16:00,977][57319] Signal inference workers to stop experience collection... (300 times) [2024-04-28 10:16:00,978][57319] Signal inference workers to resume experience collection... (300 times) [2024-04-28 10:16:01,005][57339] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-04-28 10:16:01,005][57339] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-04-28 10:16:01,089][57339] Updated weights for policy 0, policy_version 581968 (0.0031) [2024-04-28 10:16:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54613.3, 300 sec: 54983.6). Total num frames: 9534996480. Throughput: 0: 54835.5. Samples: 25418700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:02,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 10:16:04,216][57339] Updated weights for policy 0, policy_version 581978 (0.0032) [2024-04-28 10:16:06,958][57339] Updated weights for policy 0, policy_version 581988 (0.0025) [2024-04-28 10:16:07,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55159.3, 300 sec: 55094.6). Total num frames: 9535291392. Throughput: 0: 54963.5. Samples: 25583660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:07,170][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 10:16:10,028][57339] Updated weights for policy 0, policy_version 581998 (0.0033) [2024-04-28 10:16:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54886.5, 300 sec: 55039.2). Total num frames: 9535569920. Throughput: 0: 55042.4. Samples: 25916340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:12,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 10:16:13,074][57339] Updated weights for policy 0, policy_version 582008 (0.0031) [2024-04-28 10:16:15,905][57339] Updated weights for policy 0, policy_version 582018 (0.0024) [2024-04-28 10:16:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.4, 300 sec: 55039.1). Total num frames: 9535848448. Throughput: 0: 55107.5. Samples: 26249680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:17,170][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 10:16:18,928][57339] Updated weights for policy 0, policy_version 582028 (0.0025) [2024-04-28 10:16:21,846][57339] Updated weights for policy 0, policy_version 582038 (0.0026) [2024-04-28 10:16:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.6, 300 sec: 54983.6). Total num frames: 9536110592. Throughput: 0: 54973.4. Samples: 26413720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:22,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 10:16:24,945][57339] Updated weights for policy 0, policy_version 582048 (0.0028) [2024-04-28 10:16:27,169][57108] Fps is (10 sec: 50791.0, 60 sec: 54613.4, 300 sec: 54928.1). Total num frames: 9536356352. Throughput: 0: 54866.9. Samples: 26739360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:27,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 10:16:27,846][57339] Updated weights for policy 0, policy_version 582058 (0.0033) [2024-04-28 10:16:30,878][57339] Updated weights for policy 0, policy_version 582068 (0.0030) [2024-04-28 10:16:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54613.4, 300 sec: 54928.1). Total num frames: 9536651264. Throughput: 0: 55019.5. Samples: 27072360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:32,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 10:16:34,010][57339] Updated weights for policy 0, policy_version 582078 (0.0034) [2024-04-28 10:16:36,863][57339] Updated weights for policy 0, policy_version 582088 (0.0028) [2024-04-28 10:16:37,169][57108] Fps is (10 sec: 57344.3, 60 sec: 54613.5, 300 sec: 54983.6). Total num frames: 9536929792. Throughput: 0: 55016.1. Samples: 27234940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:37,169][57108] Avg episode reward: [(0, '0.704')] [2024-04-28 10:16:39,850][57339] Updated weights for policy 0, policy_version 582098 (0.0037) [2024-04-28 10:16:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54886.6, 300 sec: 54983.6). Total num frames: 9537208320. Throughput: 0: 54930.8. Samples: 27560900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:42,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 10:16:42,779][57339] Updated weights for policy 0, policy_version 582108 (0.0027) [2024-04-28 10:16:45,856][57339] Updated weights for policy 0, policy_version 582118 (0.0031) [2024-04-28 10:16:47,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 54983.8). Total num frames: 9537503232. Throughput: 0: 54847.0. Samples: 27886820. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:47,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 10:16:48,844][57339] Updated weights for policy 0, policy_version 582128 (0.0034) [2024-04-28 10:16:51,835][57339] Updated weights for policy 0, policy_version 582138 (0.0027) [2024-04-28 10:16:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 54928.1). Total num frames: 9537748992. Throughput: 0: 54895.4. Samples: 28053940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 10:16:52,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 10:16:55,045][57339] Updated weights for policy 0, policy_version 582148 (0.0030) [2024-04-28 10:16:57,169][57108] Fps is (10 sec: 50790.3, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9538011136. Throughput: 0: 54747.4. Samples: 28379980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:16:57,170][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 10:16:57,895][57339] Updated weights for policy 0, policy_version 582158 (0.0035) [2024-04-28 10:17:01,020][57339] Updated weights for policy 0, policy_version 582168 (0.0034) [2024-04-28 10:17:02,169][57108] Fps is (10 sec: 52428.7, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9538273280. Throughput: 0: 54654.0. Samples: 28709100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:02,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 10:17:03,952][57339] Updated weights for policy 0, policy_version 582178 (0.0028) [2024-04-28 10:17:04,536][57319] Signal inference workers to stop experience collection... (350 times) [2024-04-28 10:17:04,587][57339] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-04-28 10:17:04,595][57319] Signal inference workers to resume experience collection... 
(350 times) [2024-04-28 10:17:04,603][57339] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-04-28 10:17:06,871][57339] Updated weights for policy 0, policy_version 582188 (0.0032) [2024-04-28 10:17:07,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54613.3, 300 sec: 54928.0). Total num frames: 9538568192. Throughput: 0: 54652.2. Samples: 28873080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:07,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 10:17:09,997][57339] Updated weights for policy 0, policy_version 582198 (0.0027) [2024-04-28 10:17:12,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54340.3, 300 sec: 54872.5). Total num frames: 9538830336. Throughput: 0: 54601.9. Samples: 29196440. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:12,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 10:17:13,029][57339] Updated weights for policy 0, policy_version 582208 (0.0032) [2024-04-28 10:17:16,073][57339] Updated weights for policy 0, policy_version 582218 (0.0031) [2024-04-28 10:17:17,169][57108] Fps is (10 sec: 55706.9, 60 sec: 54613.5, 300 sec: 54872.5). Total num frames: 9539125248. Throughput: 0: 54370.3. Samples: 29519020. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:17,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 10:17:19,118][57339] Updated weights for policy 0, policy_version 582228 (0.0027) [2024-04-28 10:17:22,078][57339] Updated weights for policy 0, policy_version 582238 (0.0033) [2024-04-28 10:17:22,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9539387392. Throughput: 0: 54567.5. Samples: 29690480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:22,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 10:17:25,112][57339] Updated weights for policy 0, policy_version 582248 (0.0027) [2024-04-28 10:17:27,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 54817.0). 
Total num frames: 9539665920. Throughput: 0: 54668.9. Samples: 30021000. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:27,169][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 10:17:27,236][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000582256_9539682304.pth... [2024-04-28 10:17:27,277][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000581452_9526509568.pth [2024-04-28 10:17:28,099][57339] Updated weights for policy 0, policy_version 582258 (0.0030) [2024-04-28 10:17:30,953][57339] Updated weights for policy 0, policy_version 582268 (0.0032) [2024-04-28 10:17:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9539928064. Throughput: 0: 54703.7. Samples: 30348480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:32,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 10:17:33,991][57339] Updated weights for policy 0, policy_version 582278 (0.0030) [2024-04-28 10:17:36,925][57339] Updated weights for policy 0, policy_version 582288 (0.0034) [2024-04-28 10:17:37,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54613.3, 300 sec: 54928.1). Total num frames: 9540206592. Throughput: 0: 54594.6. Samples: 30510700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:37,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 10:17:40,000][57339] Updated weights for policy 0, policy_version 582298 (0.0032) [2024-04-28 10:17:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9540485120. Throughput: 0: 54490.4. Samples: 30832040. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:42,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 10:17:42,845][57339] Updated weights for policy 0, policy_version 582308 (0.0028) [2024-04-28 10:17:45,875][57339] Updated weights for policy 0, policy_version 582318 (0.0027) [2024-04-28 10:17:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54340.4, 300 sec: 54872.5). Total num frames: 9540763648. Throughput: 0: 54555.2. Samples: 31164080. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:47,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 10:17:49,095][57339] Updated weights for policy 0, policy_version 582328 (0.0026) [2024-04-28 10:17:51,818][57339] Updated weights for policy 0, policy_version 582338 (0.0034) [2024-04-28 10:17:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9541025792. Throughput: 0: 54611.0. Samples: 31330560. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:52,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 10:17:55,030][57339] Updated weights for policy 0, policy_version 582348 (0.0027) [2024-04-28 10:17:56,220][57319] Signal inference workers to stop experience collection... (400 times) [2024-04-28 10:17:56,221][57319] Signal inference workers to resume experience collection... (400 times) [2024-04-28 10:17:56,244][57339] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-04-28 10:17:56,244][57339] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-04-28 10:17:57,169][57108] Fps is (10 sec: 54065.9, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9541304320. Throughput: 0: 54720.6. Samples: 31658880. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:17:57,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 10:17:57,788][57339] Updated weights for policy 0, policy_version 582358 (0.0026) [2024-04-28 10:18:01,129][57339] Updated weights for policy 0, policy_version 582368 (0.0031) [2024-04-28 10:18:02,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9541550080. Throughput: 0: 54822.6. Samples: 31986040. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:18:02,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 10:18:03,748][57339] Updated weights for policy 0, policy_version 582378 (0.0032) [2024-04-28 10:18:07,169][57108] Fps is (10 sec: 54068.1, 60 sec: 54613.5, 300 sec: 54872.5). Total num frames: 9541844992. Throughput: 0: 54672.5. Samples: 32150740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:18:07,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 10:18:07,176][57339] Updated weights for policy 0, policy_version 582388 (0.0034) [2024-04-28 10:18:09,690][57339] Updated weights for policy 0, policy_version 582398 (0.0032) [2024-04-28 10:18:12,169][57108] Fps is (10 sec: 57344.4, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9542123520. Throughput: 0: 54552.0. Samples: 32475840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:18:12,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 10:18:13,105][57339] Updated weights for policy 0, policy_version 582408 (0.0029) [2024-04-28 10:18:15,770][57339] Updated weights for policy 0, policy_version 582418 (0.0035) [2024-04-28 10:18:17,175][57108] Fps is (10 sec: 57308.8, 60 sec: 54880.7, 300 sec: 54926.9). Total num frames: 9542418432. Throughput: 0: 54685.0. Samples: 32809640. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:18:17,175][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 10:18:19,003][57339] Updated weights for policy 0, policy_version 582428 (0.0028) [2024-04-28 10:18:21,663][57339] Updated weights for policy 0, policy_version 582438 (0.0026) [2024-04-28 10:18:22,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9542680576. Throughput: 0: 54828.0. Samples: 32977960. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:18:22,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 10:18:25,138][57339] Updated weights for policy 0, policy_version 582448 (0.0026) [2024-04-28 10:18:27,169][57108] Fps is (10 sec: 52460.7, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 9542942720. Throughput: 0: 54902.1. Samples: 33302640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-04-28 10:18:27,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 10:18:27,553][57339] Updated weights for policy 0, policy_version 582458 (0.0030) [2024-04-28 10:18:31,307][57339] Updated weights for policy 0, policy_version 582468 (0.0027) [2024-04-28 10:18:32,169][57108] Fps is (10 sec: 52428.7, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9543204864. Throughput: 0: 54741.6. Samples: 33627460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:18:32,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 10:18:33,481][57339] Updated weights for policy 0, policy_version 582478 (0.0026) [2024-04-28 10:18:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54340.3, 300 sec: 54761.4). Total num frames: 9543467008. Throughput: 0: 54611.0. Samples: 33788060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:18:37,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 10:18:37,312][57339] Updated weights for policy 0, policy_version 582488 (0.0043) [2024-04-28 10:18:39,508][57339] Updated weights for policy 0, policy_version 582498 (0.0030) [2024-04-28 10:18:42,169][57108] Fps is (10 sec: 54067.7, 60 sec: 54340.3, 300 sec: 54761.5). Total num frames: 9543745536. Throughput: 0: 54485.6. Samples: 34110720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:18:42,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 10:18:43,222][57339] Updated weights for policy 0, policy_version 582508 (0.0032) [2024-04-28 10:18:45,404][57339] Updated weights for policy 0, policy_version 582518 (0.0031) [2024-04-28 10:18:47,169][57108] Fps is (10 sec: 57344.4, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9544040448. Throughput: 0: 54549.4. Samples: 34440760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:18:47,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 10:18:49,295][57339] Updated weights for policy 0, policy_version 582528 (0.0028) [2024-04-28 10:18:52,024][57339] Updated weights for policy 0, policy_version 582538 (0.0027) [2024-04-28 10:18:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9544302592. Throughput: 0: 54751.1. Samples: 34614540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:18:52,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 10:18:55,699][57339] Updated weights for policy 0, policy_version 582548 (0.0033) [2024-04-28 10:18:57,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55159.6, 300 sec: 54872.5). Total num frames: 9544613888. Throughput: 0: 54956.8. Samples: 34948900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:18:57,170][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 10:18:57,832][57339] Updated weights for policy 0, policy_version 582558 (0.0029) [2024-04-28 10:19:01,475][57339] Updated weights for policy 0, policy_version 582568 (0.0029) [2024-04-28 10:19:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9544843264. Throughput: 0: 54848.8. Samples: 35277500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:02,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 10:19:03,590][57339] Updated weights for policy 0, policy_version 582578 (0.0025) [2024-04-28 10:19:07,169][57108] Fps is (10 sec: 49151.5, 60 sec: 54340.1, 300 sec: 54705.9). Total num frames: 9545105408. Throughput: 0: 54475.9. Samples: 35429380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:07,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 10:19:07,311][57339] Updated weights for policy 0, policy_version 582588 (0.0036) [2024-04-28 10:19:07,895][57319] Signal inference workers to stop experience collection... (450 times) [2024-04-28 10:19:07,941][57339] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-04-28 10:19:07,954][57319] Signal inference workers to resume experience collection... (450 times) [2024-04-28 10:19:07,958][57339] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-04-28 10:19:09,792][57339] Updated weights for policy 0, policy_version 582598 (0.0027) [2024-04-28 10:19:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9545400320. Throughput: 0: 54543.1. Samples: 35757080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:12,170][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 10:19:13,221][57339] Updated weights for policy 0, policy_version 582608 (0.0031) [2024-04-28 10:19:15,846][57339] Updated weights for policy 0, policy_version 582618 (0.0026) [2024-04-28 10:19:17,169][57108] Fps is (10 sec: 58982.8, 60 sec: 54618.8, 300 sec: 54872.5). Total num frames: 9545695232. Throughput: 0: 54617.3. Samples: 36085240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:17,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 10:19:19,379][57339] Updated weights for policy 0, policy_version 582628 (0.0034) [2024-04-28 10:19:21,654][57339] Updated weights for policy 0, policy_version 582638 (0.0030) [2024-04-28 10:19:22,169][57108] Fps is (10 sec: 55704.8, 60 sec: 54613.2, 300 sec: 54761.4). Total num frames: 9545957376. Throughput: 0: 54902.0. Samples: 36258660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:22,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 10:19:25,132][57339] Updated weights for policy 0, policy_version 582648 (0.0034) [2024-04-28 10:19:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9546252288. Throughput: 0: 55175.4. Samples: 36593620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:27,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 10:19:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000582657_9546252288.pth... [2024-04-28 10:19:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000581853_9533079552.pth [2024-04-28 10:19:27,588][57339] Updated weights for policy 0, policy_version 582658 (0.0032) [2024-04-28 10:19:31,148][57339] Updated weights for policy 0, policy_version 582668 (0.0029) [2024-04-28 10:19:32,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.4, 300 sec: 54872.5). 
Total num frames: 9546514432. Throughput: 0: 55135.0. Samples: 36921840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:32,170][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 10:19:33,511][57339] Updated weights for policy 0, policy_version 582678 (0.0033) [2024-04-28 10:19:37,169][57108] Fps is (10 sec: 49152.8, 60 sec: 54613.4, 300 sec: 54705.9). Total num frames: 9546743808. Throughput: 0: 54692.6. Samples: 37075700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:37,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 10:19:37,249][57339] Updated weights for policy 0, policy_version 582688 (0.0029) [2024-04-28 10:19:39,503][57339] Updated weights for policy 0, policy_version 582698 (0.0025) [2024-04-28 10:19:42,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9547038720. Throughput: 0: 54516.0. Samples: 37402120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:42,170][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 10:19:43,232][57339] Updated weights for policy 0, policy_version 582708 (0.0039) [2024-04-28 10:19:45,257][57339] Updated weights for policy 0, policy_version 582718 (0.0027) [2024-04-28 10:19:47,169][57108] Fps is (10 sec: 58982.3, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9547333632. Throughput: 0: 54679.2. Samples: 37738060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:47,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 10:19:49,038][57339] Updated weights for policy 0, policy_version 582728 (0.0029) [2024-04-28 10:19:51,178][57339] Updated weights for policy 0, policy_version 582738 (0.0031) [2024-04-28 10:19:52,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55432.5, 300 sec: 54928.1). Total num frames: 9547628544. Throughput: 0: 55151.6. Samples: 37911200. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:52,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 10:19:55,083][57339] Updated weights for policy 0, policy_version 582748 (0.0031) [2024-04-28 10:19:57,169][57108] Fps is (10 sec: 55704.9, 60 sec: 54613.3, 300 sec: 54816.9). Total num frames: 9547890688. Throughput: 0: 55143.1. Samples: 38238520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 10:19:57,170][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 10:19:57,463][57339] Updated weights for policy 0, policy_version 582758 (0.0030) [2024-04-28 10:20:01,181][57339] Updated weights for policy 0, policy_version 582768 (0.0036) [2024-04-28 10:20:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 54872.5). Total num frames: 9548169216. Throughput: 0: 55132.9. Samples: 38566220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:02,170][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 10:20:03,298][57339] Updated weights for policy 0, policy_version 582778 (0.0031) [2024-04-28 10:20:07,021][57339] Updated weights for policy 0, policy_version 582788 (0.0027) [2024-04-28 10:20:07,029][57319] Signal inference workers to stop experience collection... (500 times) [2024-04-28 10:20:07,029][57319] Signal inference workers to resume experience collection... (500 times) [2024-04-28 10:20:07,042][57339] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-04-28 10:20:07,060][57339] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-04-28 10:20:07,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.6, 300 sec: 54705.9). Total num frames: 9548414976. Throughput: 0: 54874.0. Samples: 38727980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:07,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 10:20:09,060][57339] Updated weights for policy 0, policy_version 582798 (0.0029) [2024-04-28 10:20:12,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9548693504. Throughput: 0: 54850.3. Samples: 39061880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:12,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 10:20:12,861][57339] Updated weights for policy 0, policy_version 582808 (0.0032) [2024-04-28 10:20:15,233][57339] Updated weights for policy 0, policy_version 582818 (0.0027) [2024-04-28 10:20:17,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9548972032. Throughput: 0: 54843.5. Samples: 39389800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:17,170][57108] Avg episode reward: [(0, '0.517')] [2024-04-28 10:20:18,807][57339] Updated weights for policy 0, policy_version 582828 (0.0031) [2024-04-28 10:20:21,280][57339] Updated weights for policy 0, policy_version 582838 (0.0026) [2024-04-28 10:20:22,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55159.7, 300 sec: 54872.5). Total num frames: 9549266944. Throughput: 0: 55077.3. Samples: 39554180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:22,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 10:20:24,725][57339] Updated weights for policy 0, policy_version 582848 (0.0027) [2024-04-28 10:20:27,101][57339] Updated weights for policy 0, policy_version 582858 (0.0030) [2024-04-28 10:20:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9549545472. Throughput: 0: 55072.8. Samples: 39880400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:27,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 10:20:30,655][57339] Updated weights for policy 0, policy_version 582868 (0.0027) [2024-04-28 10:20:32,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 54872.5). Total num frames: 9549840384. Throughput: 0: 54965.3. Samples: 40211500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:32,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 10:20:32,909][57339] Updated weights for policy 0, policy_version 582878 (0.0029) [2024-04-28 10:20:36,579][57339] Updated weights for policy 0, policy_version 582888 (0.0030) [2024-04-28 10:20:37,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.5, 300 sec: 54817.0). Total num frames: 9550086144. Throughput: 0: 54810.8. Samples: 40377680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:37,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 10:20:38,933][57339] Updated weights for policy 0, policy_version 582898 (0.0031) [2024-04-28 10:20:42,169][57108] Fps is (10 sec: 49151.6, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9550331904. Throughput: 0: 54838.7. Samples: 40706260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:42,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 10:20:42,691][57339] Updated weights for policy 0, policy_version 582908 (0.0028) [2024-04-28 10:20:44,964][57339] Updated weights for policy 0, policy_version 582918 (0.0035) [2024-04-28 10:20:47,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 9550610432. Throughput: 0: 54853.7. Samples: 41034640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:47,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 10:20:48,528][57339] Updated weights for policy 0, policy_version 582928 (0.0026) [2024-04-28 10:20:50,769][57339] Updated weights for policy 0, policy_version 582938 (0.0030) [2024-04-28 10:20:52,169][57108] Fps is (10 sec: 58983.1, 60 sec: 54886.6, 300 sec: 54928.1). Total num frames: 9550921728. Throughput: 0: 54921.9. Samples: 41199460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:52,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 10:20:54,460][57339] Updated weights for policy 0, policy_version 582948 (0.0043) [2024-04-28 10:20:56,714][57339] Updated weights for policy 0, policy_version 582958 (0.0029) [2024-04-28 10:20:57,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55159.6, 300 sec: 54928.1). Total num frames: 9551200256. Throughput: 0: 54791.2. Samples: 41527480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:20:57,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 10:21:00,423][57339] Updated weights for policy 0, policy_version 582968 (0.0030) [2024-04-28 10:21:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9551462400. Throughput: 0: 54838.0. Samples: 41857500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:21:02,169][57108] Avg episode reward: [(0, '0.437')] [2024-04-28 10:21:02,634][57339] Updated weights for policy 0, policy_version 582978 (0.0032) [2024-04-28 10:21:06,296][57319] Signal inference workers to stop experience collection... (550 times) [2024-04-28 10:21:06,341][57339] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-04-28 10:21:06,354][57319] Signal inference workers to resume experience collection... 
(550 times) [2024-04-28 10:21:06,360][57339] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-04-28 10:21:06,363][57339] Updated weights for policy 0, policy_version 582988 (0.0030) [2024-04-28 10:21:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 54817.0). Total num frames: 9551740928. Throughput: 0: 54970.6. Samples: 42027860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:21:07,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 10:21:08,735][57339] Updated weights for policy 0, policy_version 582998 (0.0034) [2024-04-28 10:21:12,169][57108] Fps is (10 sec: 52427.9, 60 sec: 54886.3, 300 sec: 54705.9). Total num frames: 9551986688. Throughput: 0: 54938.6. Samples: 42352640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:21:12,170][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 10:21:12,231][57339] Updated weights for policy 0, policy_version 583008 (0.0030) [2024-04-28 10:21:14,654][57339] Updated weights for policy 0, policy_version 583018 (0.0031) [2024-04-28 10:21:17,169][57108] Fps is (10 sec: 50789.8, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9552248832. Throughput: 0: 54977.2. Samples: 42685480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:21:17,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 10:21:18,310][57339] Updated weights for policy 0, policy_version 583028 (0.0027) [2024-04-28 10:21:20,513][57339] Updated weights for policy 0, policy_version 583038 (0.0031) [2024-04-28 10:21:22,169][57108] Fps is (10 sec: 55706.5, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9552543744. Throughput: 0: 54839.6. Samples: 42845460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:21:22,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 10:21:24,130][57339] Updated weights for policy 0, policy_version 583048 (0.0029) [2024-04-28 10:21:26,786][57339] Updated weights for policy 0, policy_version 583058 (0.0034) [2024-04-28 10:21:27,169][57108] Fps is (10 sec: 57344.7, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9552822272. Throughput: 0: 54873.4. Samples: 43175560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:21:27,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 10:21:27,241][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000583059_9552838656.pth... [2024-04-28 10:21:27,285][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000582256_9539682304.pth [2024-04-28 10:21:30,092][57339] Updated weights for policy 0, policy_version 583068 (0.0036) [2024-04-28 10:21:32,169][57108] Fps is (10 sec: 57343.3, 60 sec: 54613.2, 300 sec: 54872.5). Total num frames: 9553117184. Throughput: 0: 54810.7. Samples: 43501120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 10:21:32,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 10:21:32,684][57339] Updated weights for policy 0, policy_version 583078 (0.0025) [2024-04-28 10:21:36,084][57339] Updated weights for policy 0, policy_version 583088 (0.0031) [2024-04-28 10:21:37,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9553362944. Throughput: 0: 54916.7. Samples: 43670720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:21:37,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 10:21:38,462][57339] Updated weights for policy 0, policy_version 583098 (0.0034) [2024-04-28 10:21:41,987][57339] Updated weights for policy 0, policy_version 583108 (0.0026) [2024-04-28 10:21:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 54761.4). 
Total num frames: 9553657856. Throughput: 0: 55049.7. Samples: 44004720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:21:42,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 10:21:44,844][57339] Updated weights for policy 0, policy_version 583118 (0.0029) [2024-04-28 10:21:47,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9553887232. Throughput: 0: 55058.5. Samples: 44335140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:21:47,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 10:21:47,933][57339] Updated weights for policy 0, policy_version 583128 (0.0036) [2024-04-28 10:21:50,990][57339] Updated weights for policy 0, policy_version 583138 (0.0030) [2024-04-28 10:21:52,169][57108] Fps is (10 sec: 52429.2, 60 sec: 54340.2, 300 sec: 54817.0). Total num frames: 9554182144. Throughput: 0: 54703.1. Samples: 44489500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:21:52,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 10:21:54,003][57339] Updated weights for policy 0, policy_version 583148 (0.0031) [2024-04-28 10:21:56,771][57339] Updated weights for policy 0, policy_version 583158 (0.0030) [2024-04-28 10:21:57,169][57108] Fps is (10 sec: 58982.4, 60 sec: 54613.2, 300 sec: 54928.0). Total num frames: 9554477056. Throughput: 0: 54769.4. Samples: 44817260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:21:57,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 10:21:59,922][57339] Updated weights for policy 0, policy_version 583168 (0.0027) [2024-04-28 10:22:02,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9554771968. Throughput: 0: 54631.3. Samples: 45143880. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:02,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 10:22:02,636][57339] Updated weights for policy 0, policy_version 583178 (0.0035) [2024-04-28 10:22:05,758][57339] Updated weights for policy 0, policy_version 583188 (0.0029) [2024-04-28 10:22:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54613.2, 300 sec: 54872.5). Total num frames: 9555017728. Throughput: 0: 55144.3. Samples: 45326960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:07,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 10:22:08,545][57339] Updated weights for policy 0, policy_version 583198 (0.0029) [2024-04-28 10:22:11,822][57339] Updated weights for policy 0, policy_version 583208 (0.0031) [2024-04-28 10:22:12,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9555296256. Throughput: 0: 55071.9. Samples: 45653800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:12,170][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 10:22:12,659][57319] Signal inference workers to stop experience collection... (600 times) [2024-04-28 10:22:12,659][57319] Signal inference workers to resume experience collection... (600 times) [2024-04-28 10:22:12,680][57339] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-04-28 10:22:12,681][57339] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-04-28 10:22:14,595][57339] Updated weights for policy 0, policy_version 583218 (0.0028) [2024-04-28 10:22:17,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55159.6, 300 sec: 54817.0). Total num frames: 9555558400. Throughput: 0: 55170.8. Samples: 45983800. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:17,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 10:22:17,679][57339] Updated weights for policy 0, policy_version 583228 (0.0029) [2024-04-28 10:22:20,396][57339] Updated weights for policy 0, policy_version 583238 (0.0028) [2024-04-28 10:22:22,169][57108] Fps is (10 sec: 50790.7, 60 sec: 54340.2, 300 sec: 54705.9). Total num frames: 9555804160. Throughput: 0: 54875.6. Samples: 46140120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:22,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 10:22:23,573][57339] Updated weights for policy 0, policy_version 583248 (0.0029) [2024-04-28 10:22:26,492][57339] Updated weights for policy 0, policy_version 583258 (0.0032) [2024-04-28 10:22:27,169][57108] Fps is (10 sec: 55704.7, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 9556115456. Throughput: 0: 54671.1. Samples: 46464920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:27,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 10:22:29,437][57339] Updated weights for policy 0, policy_version 583268 (0.0027) [2024-04-28 10:22:32,169][57108] Fps is (10 sec: 58982.5, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 9556393984. Throughput: 0: 54651.3. Samples: 46794440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:32,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 10:22:32,550][57339] Updated weights for policy 0, policy_version 583278 (0.0032) [2024-04-28 10:22:35,552][57339] Updated weights for policy 0, policy_version 583288 (0.0031) [2024-04-28 10:22:37,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9556672512. Throughput: 0: 55047.4. Samples: 46966640. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:37,178][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 10:22:38,444][57339] Updated weights for policy 0, policy_version 583298 (0.0042) [2024-04-28 10:22:41,583][57339] Updated weights for policy 0, policy_version 583308 (0.0032) [2024-04-28 10:22:42,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9556934656. Throughput: 0: 55053.9. Samples: 47294680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:42,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 10:22:44,555][57339] Updated weights for policy 0, policy_version 583318 (0.0031) [2024-04-28 10:22:47,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 54872.5). Total num frames: 9557213184. Throughput: 0: 55060.3. Samples: 47621600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:47,170][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 10:22:47,466][57339] Updated weights for policy 0, policy_version 583328 (0.0035) [2024-04-28 10:22:50,616][57339] Updated weights for policy 0, policy_version 583338 (0.0033) [2024-04-28 10:22:52,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54613.3, 300 sec: 54761.5). Total num frames: 9557458944. Throughput: 0: 54473.8. Samples: 47778280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:52,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:22:53,444][57339] Updated weights for policy 0, policy_version 583348 (0.0030) [2024-04-28 10:22:56,506][57339] Updated weights for policy 0, policy_version 583358 (0.0033) [2024-04-28 10:22:57,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54613.4, 300 sec: 54928.1). Total num frames: 9557753856. Throughput: 0: 54625.4. Samples: 48111940. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:22:57,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 10:22:59,516][57339] Updated weights for policy 0, policy_version 583368 (0.0029) [2024-04-28 10:23:02,169][57108] Fps is (10 sec: 57344.9, 60 sec: 54340.3, 300 sec: 54872.5). Total num frames: 9558032384. Throughput: 0: 54531.1. Samples: 48437700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 10:23:02,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 10:23:02,321][57339] Updated weights for policy 0, policy_version 583378 (0.0029) [2024-04-28 10:23:05,425][57339] Updated weights for policy 0, policy_version 583388 (0.0030) [2024-04-28 10:23:07,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54886.5, 300 sec: 54872.5). Total num frames: 9558310912. Throughput: 0: 54776.1. Samples: 48605040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:07,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:23:08,272][57339] Updated weights for policy 0, policy_version 583398 (0.0028) [2024-04-28 10:23:11,250][57339] Updated weights for policy 0, policy_version 583408 (0.0027) [2024-04-28 10:23:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 54613.4, 300 sec: 54762.6). Total num frames: 9558573056. Throughput: 0: 54929.0. Samples: 48936720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:12,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 10:23:14,296][57339] Updated weights for policy 0, policy_version 583418 (0.0028) [2024-04-28 10:23:16,002][57319] Signal inference workers to stop experience collection... (650 times) [2024-04-28 10:23:16,002][57319] Signal inference workers to resume experience collection... 
(650 times) [2024-04-28 10:23:16,015][57339] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-04-28 10:23:16,015][57339] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-04-28 10:23:17,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55159.3, 300 sec: 54872.5). Total num frames: 9558867968. Throughput: 0: 54923.9. Samples: 49266020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:17,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 10:23:17,301][57339] Updated weights for policy 0, policy_version 583428 (0.0028) [2024-04-28 10:23:20,421][57339] Updated weights for policy 0, policy_version 583438 (0.0030) [2024-04-28 10:23:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 54872.5). Total num frames: 9559130112. Throughput: 0: 54732.1. Samples: 49429580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:22,170][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 10:23:23,193][57339] Updated weights for policy 0, policy_version 583448 (0.0031) [2024-04-28 10:23:26,471][57339] Updated weights for policy 0, policy_version 583458 (0.0032) [2024-04-28 10:23:27,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 9559392256. Throughput: 0: 54721.7. Samples: 49757160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:27,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 10:23:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000583459_9559392256.pth... [2024-04-28 10:23:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000582657_9546252288.pth [2024-04-28 10:23:29,138][57339] Updated weights for policy 0, policy_version 583468 (0.0031) [2024-04-28 10:23:32,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55159.5, 300 sec: 55039.2). Total num frames: 9559703552. Throughput: 0: 54798.4. Samples: 50087520. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:32,169][57108] Avg episode reward: [(0, '0.505')] [2024-04-28 10:23:32,172][57339] Updated weights for policy 0, policy_version 583478 (0.0029) [2024-04-28 10:23:35,179][57339] Updated weights for policy 0, policy_version 583488 (0.0024) [2024-04-28 10:23:37,169][57108] Fps is (10 sec: 57344.3, 60 sec: 54886.5, 300 sec: 54983.6). Total num frames: 9559965696. Throughput: 0: 55115.2. Samples: 50258460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:37,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 10:23:38,245][57339] Updated weights for policy 0, policy_version 583498 (0.0030) [2024-04-28 10:23:41,069][57339] Updated weights for policy 0, policy_version 583508 (0.0031) [2024-04-28 10:23:42,169][57108] Fps is (10 sec: 52428.0, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9560227840. Throughput: 0: 54953.3. Samples: 50584840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:42,170][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 10:23:44,496][57339] Updated weights for policy 0, policy_version 583518 (0.0041) [2024-04-28 10:23:47,100][57339] Updated weights for policy 0, policy_version 583528 (0.0027) [2024-04-28 10:23:47,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 9560522752. Throughput: 0: 55043.0. Samples: 50914640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 10:23:50,408][57339] Updated weights for policy 0, policy_version 583538 (0.0030) [2024-04-28 10:23:52,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.7, 300 sec: 54817.0). Total num frames: 9560784896. Throughput: 0: 54844.0. Samples: 51073020. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:52,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 10:23:53,105][57339] Updated weights for policy 0, policy_version 583548 (0.0031) [2024-04-28 10:23:56,277][57339] Updated weights for policy 0, policy_version 583558 (0.0033) [2024-04-28 10:23:57,169][57108] Fps is (10 sec: 50790.2, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9561030656. Throughput: 0: 54760.4. Samples: 51400940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:23:57,170][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 10:23:58,960][57339] Updated weights for policy 0, policy_version 583568 (0.0031) [2024-04-28 10:24:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 9561325568. Throughput: 0: 54825.9. Samples: 51733180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:24:02,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 10:24:02,237][57339] Updated weights for policy 0, policy_version 583578 (0.0031) [2024-04-28 10:24:05,171][57339] Updated weights for policy 0, policy_version 583588 (0.0028) [2024-04-28 10:24:07,170][57108] Fps is (10 sec: 57338.9, 60 sec: 54885.5, 300 sec: 54927.9). Total num frames: 9561604096. Throughput: 0: 54854.0. Samples: 51898060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:24:07,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 10:24:08,306][57339] Updated weights for policy 0, policy_version 583598 (0.0032) [2024-04-28 10:24:11,205][57339] Updated weights for policy 0, policy_version 583608 (0.0027) [2024-04-28 10:24:12,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9561866240. Throughput: 0: 54908.8. Samples: 52228060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:24:12,170][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 10:24:14,389][57339] Updated weights for policy 0, policy_version 583618 (0.0032) [2024-04-28 10:24:17,157][57339] Updated weights for policy 0, policy_version 583628 (0.0029) [2024-04-28 10:24:17,169][57108] Fps is (10 sec: 55710.8, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 9562161152. Throughput: 0: 54785.2. Samples: 52552860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:24:17,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 10:24:20,515][57339] Updated weights for policy 0, policy_version 583638 (0.0026) [2024-04-28 10:24:21,147][57319] Signal inference workers to stop experience collection... (700 times) [2024-04-28 10:24:21,148][57319] Signal inference workers to resume experience collection... (700 times) [2024-04-28 10:24:21,178][57339] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-04-28 10:24:21,183][57339] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-04-28 10:24:22,169][57108] Fps is (10 sec: 55706.4, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9562423296. Throughput: 0: 54713.8. Samples: 52720580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:24:22,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:24:22,977][57339] Updated weights for policy 0, policy_version 583648 (0.0036) [2024-04-28 10:24:26,330][57339] Updated weights for policy 0, policy_version 583658 (0.0029) [2024-04-28 10:24:27,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9562701824. Throughput: 0: 54816.8. Samples: 53051600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:24:27,170][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 10:24:29,197][57339] Updated weights for policy 0, policy_version 583668 (0.0031) [2024-04-28 10:24:32,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54067.2, 300 sec: 54928.1). Total num frames: 9562947584. Throughput: 0: 54847.2. Samples: 53382760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:24:32,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 10:24:32,377][57339] Updated weights for policy 0, policy_version 583678 (0.0033) [2024-04-28 10:24:35,216][57339] Updated weights for policy 0, policy_version 583688 (0.0027) [2024-04-28 10:24:37,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54613.3, 300 sec: 54928.0). Total num frames: 9563242496. Throughput: 0: 54860.3. Samples: 53541740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 10:24:37,170][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 10:24:38,248][57339] Updated weights for policy 0, policy_version 583698 (0.0035) [2024-04-28 10:24:41,041][57339] Updated weights for policy 0, policy_version 583708 (0.0029) [2024-04-28 10:24:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9563504640. Throughput: 0: 54861.0. Samples: 53869680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:24:42,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 10:24:44,159][57339] Updated weights for policy 0, policy_version 583718 (0.0027) [2024-04-28 10:24:46,961][57339] Updated weights for policy 0, policy_version 583728 (0.0030) [2024-04-28 10:24:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9563799552. Throughput: 0: 54799.5. Samples: 54199160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:24:47,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 10:24:50,176][57339] Updated weights for policy 0, policy_version 583738 (0.0029) [2024-04-28 10:24:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9564061696. Throughput: 0: 54886.1. Samples: 54367880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:24:52,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 10:24:52,922][57339] Updated weights for policy 0, policy_version 583748 (0.0030) [2024-04-28 10:24:56,362][57339] Updated weights for policy 0, policy_version 583758 (0.0030) [2024-04-28 10:24:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9564340224. Throughput: 0: 54802.8. Samples: 54694180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:24:57,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 10:24:58,955][57339] Updated weights for policy 0, policy_version 583768 (0.0029) [2024-04-28 10:25:02,137][57339] Updated weights for policy 0, policy_version 583778 (0.0029) [2024-04-28 10:25:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 9564618752. Throughput: 0: 54804.5. Samples: 55019060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:02,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 10:25:04,926][57339] Updated weights for policy 0, policy_version 583788 (0.0029) [2024-04-28 10:25:07,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54614.3, 300 sec: 54872.5). Total num frames: 9564880896. Throughput: 0: 54615.1. Samples: 55178260. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:07,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 10:25:08,041][57339] Updated weights for policy 0, policy_version 583798 (0.0036) [2024-04-28 10:25:11,058][57339] Updated weights for policy 0, policy_version 583808 (0.0029) [2024-04-28 10:25:12,169][57108] Fps is (10 sec: 52428.1, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9565143040. Throughput: 0: 54591.1. Samples: 55508200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:12,178][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 10:25:14,105][57339] Updated weights for policy 0, policy_version 583818 (0.0029) [2024-04-28 10:25:17,058][57339] Updated weights for policy 0, policy_version 583828 (0.0029) [2024-04-28 10:25:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9565437952. Throughput: 0: 54522.7. Samples: 55836280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:17,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 10:25:19,984][57339] Updated weights for policy 0, policy_version 583838 (0.0028) [2024-04-28 10:25:22,169][57108] Fps is (10 sec: 57345.2, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9565716480. Throughput: 0: 54747.3. Samples: 56005360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:22,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 10:25:23,034][57339] Updated weights for policy 0, policy_version 583848 (0.0028) [2024-04-28 10:25:25,878][57339] Updated weights for policy 0, policy_version 583858 (0.0032) [2024-04-28 10:25:27,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54613.4, 300 sec: 54705.9). Total num frames: 9565978624. Throughput: 0: 54772.8. Samples: 56334460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:27,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 10:25:27,189][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000583861_9565978624.pth... [2024-04-28 10:25:27,240][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000583059_9552838656.pth [2024-04-28 10:25:28,817][57339] Updated weights for policy 0, policy_version 583868 (0.0027) [2024-04-28 10:25:31,828][57339] Updated weights for policy 0, policy_version 583878 (0.0028) [2024-04-28 10:25:32,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55159.3, 300 sec: 54817.0). Total num frames: 9566257152. Throughput: 0: 54797.3. Samples: 56665040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:32,178][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 10:25:34,941][57339] Updated weights for policy 0, policy_version 583888 (0.0029) [2024-04-28 10:25:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9566519296. Throughput: 0: 54630.6. Samples: 56826260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:37,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 10:25:38,011][57339] Updated weights for policy 0, policy_version 583898 (0.0029) [2024-04-28 10:25:41,065][57339] Updated weights for policy 0, policy_version 583908 (0.0036) [2024-04-28 10:25:42,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9566781440. Throughput: 0: 54656.8. Samples: 57153740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:42,170][57108] Avg episode reward: [(0, '0.517')] [2024-04-28 10:25:44,034][57339] Updated weights for policy 0, policy_version 583918 (0.0028) [2024-04-28 10:25:45,544][57319] Signal inference workers to stop experience collection... 
(750 times) [2024-04-28 10:25:45,581][57339] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-04-28 10:25:45,609][57319] Signal inference workers to resume experience collection... (750 times) [2024-04-28 10:25:45,609][57339] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-04-28 10:25:47,145][57339] Updated weights for policy 0, policy_version 583928 (0.0026) [2024-04-28 10:25:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54613.4, 300 sec: 54761.4). Total num frames: 9567076352. Throughput: 0: 54695.1. Samples: 57480340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:47,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 10:25:50,119][57339] Updated weights for policy 0, policy_version 583938 (0.0028) [2024-04-28 10:25:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9567354880. Throughput: 0: 54832.4. Samples: 57645720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:52,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 10:25:53,031][57339] Updated weights for policy 0, policy_version 583948 (0.0032) [2024-04-28 10:25:56,202][57339] Updated weights for policy 0, policy_version 583958 (0.0025) [2024-04-28 10:25:57,169][57108] Fps is (10 sec: 54066.7, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9567617024. Throughput: 0: 54776.1. Samples: 57973120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:25:57,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 10:25:58,978][57339] Updated weights for policy 0, policy_version 583968 (0.0036) [2024-04-28 10:26:02,056][57339] Updated weights for policy 0, policy_version 583978 (0.0027) [2024-04-28 10:26:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9567895552. Throughput: 0: 54917.7. Samples: 58307580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:26:02,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 10:26:04,965][57339] Updated weights for policy 0, policy_version 583988 (0.0026) [2024-04-28 10:26:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54886.2, 300 sec: 54872.5). Total num frames: 9568174080. Throughput: 0: 54731.7. Samples: 58468300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 10:26:07,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 10:26:08,008][57339] Updated weights for policy 0, policy_version 583998 (0.0030) [2024-04-28 10:26:11,000][57339] Updated weights for policy 0, policy_version 584008 (0.0032) [2024-04-28 10:26:12,169][57108] Fps is (10 sec: 54066.4, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9568436224. Throughput: 0: 54823.5. Samples: 58801520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:12,170][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 10:26:13,936][57339] Updated weights for policy 0, policy_version 584018 (0.0033) [2024-04-28 10:26:17,050][57339] Updated weights for policy 0, policy_version 584028 (0.0030) [2024-04-28 10:26:17,169][57108] Fps is (10 sec: 54068.2, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9568714752. Throughput: 0: 54834.8. Samples: 59132600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:17,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 10:26:19,863][57339] Updated weights for policy 0, policy_version 584038 (0.0030) [2024-04-28 10:26:22,169][57108] Fps is (10 sec: 57344.3, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 9569009664. Throughput: 0: 54966.2. Samples: 59299740. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:22,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 10:26:23,124][57339] Updated weights for policy 0, policy_version 584048 (0.0030) [2024-04-28 10:26:25,723][57339] Updated weights for policy 0, policy_version 584058 (0.0028) [2024-04-28 10:26:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9569288192. Throughput: 0: 55024.1. Samples: 59629820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:27,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 10:26:29,012][57339] Updated weights for policy 0, policy_version 584068 (0.0031) [2024-04-28 10:26:31,732][57339] Updated weights for policy 0, policy_version 584078 (0.0026) [2024-04-28 10:26:32,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54886.5, 300 sec: 54872.5). Total num frames: 9569550336. Throughput: 0: 55113.8. Samples: 59960460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:32,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 10:26:34,865][57339] Updated weights for policy 0, policy_version 584088 (0.0028) [2024-04-28 10:26:37,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54886.5, 300 sec: 54761.5). Total num frames: 9569812480. Throughput: 0: 55120.0. Samples: 60126120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:37,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:26:37,805][57339] Updated weights for policy 0, policy_version 584098 (0.0030) [2024-04-28 10:26:40,768][57339] Updated weights for policy 0, policy_version 584108 (0.0030) [2024-04-28 10:26:42,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9570091008. Throughput: 0: 55061.4. Samples: 60450880. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:42,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 10:26:43,754][57339] Updated weights for policy 0, policy_version 584118 (0.0029) [2024-04-28 10:26:46,751][57339] Updated weights for policy 0, policy_version 584128 (0.0027) [2024-04-28 10:26:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9570369536. Throughput: 0: 54967.6. Samples: 60781120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:47,178][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:26:49,646][57339] Updated weights for policy 0, policy_version 584138 (0.0036) [2024-04-28 10:26:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9570648064. Throughput: 0: 55081.5. Samples: 60946960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:52,178][57108] Avg episode reward: [(0, '0.702')] [2024-04-28 10:26:52,683][57339] Updated weights for policy 0, policy_version 584148 (0.0032) [2024-04-28 10:26:55,663][57339] Updated weights for policy 0, policy_version 584158 (0.0037) [2024-04-28 10:26:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55159.6, 300 sec: 54761.4). Total num frames: 9570926592. Throughput: 0: 54912.7. Samples: 61272580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:26:57,178][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 10:26:58,712][57339] Updated weights for policy 0, policy_version 584168 (0.0030) [2024-04-28 10:27:01,540][57339] Updated weights for policy 0, policy_version 584178 (0.0030) [2024-04-28 10:27:02,169][57108] Fps is (10 sec: 54067.7, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9571188736. Throughput: 0: 54826.7. Samples: 61599800. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:27:02,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:27:04,896][57339] Updated weights for policy 0, policy_version 584188 (0.0029) [2024-04-28 10:27:07,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54613.4, 300 sec: 54761.4). Total num frames: 9571450880. Throughput: 0: 54881.0. Samples: 61769380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:27:07,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:27:07,441][57339] Updated weights for policy 0, policy_version 584198 (0.0029) [2024-04-28 10:27:10,651][57339] Updated weights for policy 0, policy_version 584208 (0.0032) [2024-04-28 10:27:12,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.6, 300 sec: 54872.5). Total num frames: 9571745792. Throughput: 0: 54938.2. Samples: 62102040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:27:12,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 10:27:13,364][57339] Updated weights for policy 0, policy_version 584218 (0.0031) [2024-04-28 10:27:13,376][57319] Signal inference workers to stop experience collection... (800 times) [2024-04-28 10:27:13,376][57319] Signal inference workers to resume experience collection... (800 times) [2024-04-28 10:27:13,401][57339] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-04-28 10:27:13,401][57339] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-04-28 10:27:16,453][57339] Updated weights for policy 0, policy_version 584228 (0.0026) [2024-04-28 10:27:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 9572007936. Throughput: 0: 54816.4. Samples: 62427200. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:27:17,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 10:27:19,391][57339] Updated weights for policy 0, policy_version 584238 (0.0029) [2024-04-28 10:27:22,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54886.5, 300 sec: 54872.5). Total num frames: 9572302848. Throughput: 0: 54687.6. Samples: 62587060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:27:22,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 10:27:22,315][57339] Updated weights for policy 0, policy_version 584248 (0.0028) [2024-04-28 10:27:25,360][57339] Updated weights for policy 0, policy_version 584258 (0.0034) [2024-04-28 10:27:27,169][57108] Fps is (10 sec: 55704.3, 60 sec: 54613.1, 300 sec: 54816.9). Total num frames: 9572564992. Throughput: 0: 54976.2. Samples: 62924820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:27:27,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 10:27:27,246][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000584264_9572581376.pth... [2024-04-28 10:27:27,292][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000583459_9559392256.pth [2024-04-28 10:27:28,397][57339] Updated weights for policy 0, policy_version 584268 (0.0032) [2024-04-28 10:27:31,223][57339] Updated weights for policy 0, policy_version 584278 (0.0026) [2024-04-28 10:27:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9572843520. Throughput: 0: 54947.1. Samples: 63253740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:27:32,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 10:27:34,332][57339] Updated weights for policy 0, policy_version 584288 (0.0029) [2024-04-28 10:27:37,073][57339] Updated weights for policy 0, policy_version 584298 (0.0029) [2024-04-28 10:27:37,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.4, 300 sec: 54928.0). 
Total num frames: 9573138432. Throughput: 0: 54947.9. Samples: 63419620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:27:37,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 10:27:40,137][57339] Updated weights for policy 0, policy_version 584308 (0.0029) [2024-04-28 10:27:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9573384192. Throughput: 0: 55086.6. Samples: 63751480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 10:27:42,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 10:27:43,108][57339] Updated weights for policy 0, policy_version 584318 (0.0029) [2024-04-28 10:27:46,445][57339] Updated weights for policy 0, policy_version 584328 (0.0030) [2024-04-28 10:27:47,169][57108] Fps is (10 sec: 52429.7, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 9573662720. Throughput: 0: 55153.3. Samples: 64081700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:27:47,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 10:27:49,056][57339] Updated weights for policy 0, policy_version 584338 (0.0033) [2024-04-28 10:27:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9573941248. Throughput: 0: 55024.0. Samples: 64245460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:27:52,170][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 10:27:52,520][57339] Updated weights for policy 0, policy_version 584348 (0.0027) [2024-04-28 10:27:55,200][57339] Updated weights for policy 0, policy_version 584358 (0.0034) [2024-04-28 10:27:57,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 54928.0). Total num frames: 9574236160. Throughput: 0: 54912.4. Samples: 64573100. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:27:57,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 10:27:58,343][57339] Updated weights for policy 0, policy_version 584368 (0.0032) [2024-04-28 10:28:01,210][57339] Updated weights for policy 0, policy_version 584378 (0.0028) [2024-04-28 10:28:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9574498304. Throughput: 0: 55158.1. Samples: 64909320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:02,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:28:04,178][57339] Updated weights for policy 0, policy_version 584388 (0.0026) [2024-04-28 10:28:07,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 54872.5). Total num frames: 9574760448. Throughput: 0: 55223.5. Samples: 65072120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:07,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 10:28:07,377][57339] Updated weights for policy 0, policy_version 584398 (0.0032) [2024-04-28 10:28:10,178][57339] Updated weights for policy 0, policy_version 584408 (0.0026) [2024-04-28 10:28:12,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9575038976. Throughput: 0: 54966.4. Samples: 65398300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:12,170][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 10:28:13,198][57339] Updated weights for policy 0, policy_version 584418 (0.0032) [2024-04-28 10:28:16,197][57339] Updated weights for policy 0, policy_version 584428 (0.0033) [2024-04-28 10:28:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9575317504. Throughput: 0: 55002.6. Samples: 65728860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:17,178][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 10:28:18,916][57319] Signal inference workers to stop experience collection... 
(850 times) [2024-04-28 10:28:18,917][57319] Signal inference workers to resume experience collection... (850 times) [2024-04-28 10:28:18,943][57339] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-04-28 10:28:18,943][57339] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-04-28 10:28:19,027][57339] Updated weights for policy 0, policy_version 584438 (0.0032) [2024-04-28 10:28:22,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9575579648. Throughput: 0: 55117.9. Samples: 65899920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:22,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 10:28:22,174][57339] Updated weights for policy 0, policy_version 584448 (0.0024) [2024-04-28 10:28:25,040][57339] Updated weights for policy 0, policy_version 584458 (0.0030) [2024-04-28 10:28:27,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 54872.5). Total num frames: 9575890944. Throughput: 0: 55046.5. Samples: 66228580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:27,178][57108] Avg episode reward: [(0, '0.703')] [2024-04-28 10:28:28,135][57339] Updated weights for policy 0, policy_version 584468 (0.0029) [2024-04-28 10:28:31,081][57339] Updated weights for policy 0, policy_version 584478 (0.0029) [2024-04-28 10:28:32,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55432.5, 300 sec: 54928.0). Total num frames: 9576169472. Throughput: 0: 54995.9. Samples: 66556520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:32,178][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 10:28:33,995][57339] Updated weights for policy 0, policy_version 584488 (0.0027) [2024-04-28 10:28:36,954][57339] Updated weights for policy 0, policy_version 584498 (0.0025) [2024-04-28 10:28:37,169][57108] Fps is (10 sec: 52429.9, 60 sec: 54613.5, 300 sec: 54872.5). Total num frames: 9576415232. Throughput: 0: 55025.4. Samples: 66721600. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:37,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 10:28:39,939][57339] Updated weights for policy 0, policy_version 584508 (0.0031) [2024-04-28 10:28:42,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9576693760. Throughput: 0: 55081.0. Samples: 67051740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:42,169][57108] Avg episode reward: [(0, '0.520')] [2024-04-28 10:28:42,853][57339] Updated weights for policy 0, policy_version 584518 (0.0028) [2024-04-28 10:28:45,956][57339] Updated weights for policy 0, policy_version 584528 (0.0026) [2024-04-28 10:28:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9576955904. Throughput: 0: 54877.5. Samples: 67378800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:47,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 10:28:49,159][57339] Updated weights for policy 0, policy_version 584538 (0.0031) [2024-04-28 10:28:51,910][57339] Updated weights for policy 0, policy_version 584548 (0.0029) [2024-04-28 10:28:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 9577250816. Throughput: 0: 54818.6. Samples: 67538960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:52,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 10:28:55,006][57339] Updated weights for policy 0, policy_version 584558 (0.0026) [2024-04-28 10:28:57,169][57108] Fps is (10 sec: 54066.4, 60 sec: 54340.2, 300 sec: 54817.0). Total num frames: 9577496576. Throughput: 0: 54908.9. Samples: 67869200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:28:57,170][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 10:28:57,811][57339] Updated weights for policy 0, policy_version 584568 (0.0028) [2024-04-28 10:29:00,973][57339] Updated weights for policy 0, policy_version 584578 (0.0029) [2024-04-28 10:29:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.4, 300 sec: 54872.7). Total num frames: 9577791488. Throughput: 0: 54745.4. Samples: 68192400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:29:02,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 10:29:03,926][57339] Updated weights for policy 0, policy_version 584588 (0.0032) [2024-04-28 10:29:06,960][57339] Updated weights for policy 0, policy_version 584598 (0.0031) [2024-04-28 10:29:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9578070016. Throughput: 0: 54634.2. Samples: 68358460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:29:07,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 10:29:09,924][57339] Updated weights for policy 0, policy_version 584608 (0.0033) [2024-04-28 10:29:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9578332160. Throughput: 0: 54662.3. Samples: 68688380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-04-28 10:29:12,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 10:29:12,860][57339] Updated weights for policy 0, policy_version 584618 (0.0028) [2024-04-28 10:29:15,772][57339] Updated weights for policy 0, policy_version 584628 (0.0030) [2024-04-28 10:29:17,169][57108] Fps is (10 sec: 52428.7, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9578594304. Throughput: 0: 54776.5. Samples: 69021460. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:29:17,170][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 10:29:18,726][57339] Updated weights for policy 0, policy_version 584638 (0.0031) [2024-04-28 10:29:21,709][57339] Updated weights for policy 0, policy_version 584648 (0.0028) [2024-04-28 10:29:22,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55432.6, 300 sec: 54928.1). Total num frames: 9578905600. Throughput: 0: 54660.0. Samples: 69181300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:29:22,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 10:29:24,688][57339] Updated weights for policy 0, policy_version 584658 (0.0032) [2024-04-28 10:29:27,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54340.4, 300 sec: 54928.0). Total num frames: 9579151360. Throughput: 0: 54584.8. Samples: 69508060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:29:27,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 10:29:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000584665_9579151360.pth... [2024-04-28 10:29:27,233][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000583861_9565978624.pth [2024-04-28 10:29:27,622][57319] Signal inference workers to stop experience collection... (900 times) [2024-04-28 10:29:27,623][57319] Signal inference workers to resume experience collection... (900 times) [2024-04-28 10:29:27,641][57339] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-04-28 10:29:27,641][57339] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-04-28 10:29:27,890][57339] Updated weights for policy 0, policy_version 584668 (0.0032) [2024-04-28 10:29:30,757][57339] Updated weights for policy 0, policy_version 584678 (0.0029) [2024-04-28 10:29:32,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54340.3, 300 sec: 54872.5). Total num frames: 9579429888. Throughput: 0: 54482.6. Samples: 69830520. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:29:32,170][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 10:29:33,838][57339] Updated weights for policy 0, policy_version 584688 (0.0030) [2024-04-28 10:29:36,764][57339] Updated weights for policy 0, policy_version 584698 (0.0027) [2024-04-28 10:29:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54886.3, 300 sec: 54928.0). Total num frames: 9579708416. Throughput: 0: 54720.9. Samples: 70001400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:29:37,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 10:29:39,736][57339] Updated weights for policy 0, policy_version 584708 (0.0040) [2024-04-28 10:29:42,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9579970560. Throughput: 0: 54717.9. Samples: 70331500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:29:42,169][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 10:29:42,828][57339] Updated weights for policy 0, policy_version 584718 (0.0029) [2024-04-28 10:29:45,594][57339] Updated weights for policy 0, policy_version 584728 (0.0026) [2024-04-28 10:29:47,169][57108] Fps is (10 sec: 54068.0, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9580249088. Throughput: 0: 54817.4. Samples: 70659180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:29:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 10:29:48,624][57339] Updated weights for policy 0, policy_version 584738 (0.0032) [2024-04-28 10:29:51,694][57339] Updated weights for policy 0, policy_version 584748 (0.0028) [2024-04-28 10:29:52,169][57108] Fps is (10 sec: 55704.7, 60 sec: 54613.2, 300 sec: 54872.5). Total num frames: 9580527616. Throughput: 0: 54723.4. Samples: 70821020. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:29:52,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 10:29:54,487][57339] Updated weights for policy 0, policy_version 584758 (0.0027) [2024-04-28 10:29:57,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9580789760. Throughput: 0: 54694.3. Samples: 71149620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:29:57,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 10:29:57,799][57339] Updated weights for policy 0, policy_version 584768 (0.0033) [2024-04-28 10:30:00,455][57339] Updated weights for policy 0, policy_version 584778 (0.0033) [2024-04-28 10:30:02,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9581068288. Throughput: 0: 54611.1. Samples: 71478960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:02,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 10:30:03,802][57339] Updated weights for policy 0, policy_version 584788 (0.0036) [2024-04-28 10:30:06,472][57339] Updated weights for policy 0, policy_version 584798 (0.0027) [2024-04-28 10:30:07,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55159.5, 300 sec: 55039.2). Total num frames: 9581379584. Throughput: 0: 54937.7. Samples: 71653500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:30:09,821][57339] Updated weights for policy 0, policy_version 584808 (0.0032) [2024-04-28 10:30:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9581625344. Throughput: 0: 54908.4. Samples: 71978940. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:12,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 10:30:12,427][57339] Updated weights for policy 0, policy_version 584818 (0.0032) [2024-04-28 10:30:15,731][57339] Updated weights for policy 0, policy_version 584828 (0.0026) [2024-04-28 10:30:17,169][57108] Fps is (10 sec: 49151.6, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9581871104. Throughput: 0: 55084.8. Samples: 72309340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:17,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 10:30:18,398][57339] Updated weights for policy 0, policy_version 584838 (0.0037) [2024-04-28 10:30:21,550][57339] Updated weights for policy 0, policy_version 584848 (0.0030) [2024-04-28 10:30:22,169][57108] Fps is (10 sec: 54066.5, 60 sec: 54340.1, 300 sec: 54872.5). Total num frames: 9582166016. Throughput: 0: 54930.6. Samples: 72473280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:22,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 10:30:24,312][57339] Updated weights for policy 0, policy_version 584858 (0.0031) [2024-04-28 10:30:27,169][57108] Fps is (10 sec: 57344.6, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9582444544. Throughput: 0: 54934.2. Samples: 72803540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:27,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 10:30:27,399][57339] Updated weights for policy 0, policy_version 584868 (0.0032) [2024-04-28 10:30:30,147][57339] Updated weights for policy 0, policy_version 584878 (0.0027) [2024-04-28 10:30:32,169][57108] Fps is (10 sec: 54068.6, 60 sec: 54613.5, 300 sec: 54872.5). Total num frames: 9582706688. Throughput: 0: 54900.9. Samples: 73129720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:32,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 10:30:33,296][57319] Signal inference workers to stop experience collection... 
(950 times) [2024-04-28 10:30:33,324][57339] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-04-28 10:30:33,354][57319] Signal inference workers to resume experience collection... (950 times) [2024-04-28 10:30:33,354][57339] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-04-28 10:30:33,468][57339] Updated weights for policy 0, policy_version 584888 (0.0033) [2024-04-28 10:30:36,021][57339] Updated weights for policy 0, policy_version 584898 (0.0024) [2024-04-28 10:30:37,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 9583001600. Throughput: 0: 55117.4. Samples: 73301300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:37,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 10:30:39,962][57339] Updated weights for policy 0, policy_version 584908 (0.0033) [2024-04-28 10:30:42,118][57339] Updated weights for policy 0, policy_version 584918 (0.0030) [2024-04-28 10:30:42,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55432.5, 300 sec: 54983.6). Total num frames: 9583296512. Throughput: 0: 55034.6. Samples: 73626180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:42,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 10:30:45,992][57339] Updated weights for policy 0, policy_version 584928 (0.0029) [2024-04-28 10:30:47,170][57108] Fps is (10 sec: 52425.1, 60 sec: 54612.6, 300 sec: 54816.8). Total num frames: 9583525888. Throughput: 0: 55047.5. Samples: 73956140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 10:30:47,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 10:30:48,104][57339] Updated weights for policy 0, policy_version 584938 (0.0031) [2024-04-28 10:30:52,084][57339] Updated weights for policy 0, policy_version 584948 (0.0031) [2024-04-28 10:30:52,169][57108] Fps is (10 sec: 49152.2, 60 sec: 54340.4, 300 sec: 54817.0). Total num frames: 9583788032. Throughput: 0: 54670.7. Samples: 74113680. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:30:52,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 10:30:54,184][57339] Updated weights for policy 0, policy_version 584958 (0.0031) [2024-04-28 10:30:57,169][57108] Fps is (10 sec: 55709.6, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 9584082944. Throughput: 0: 54695.1. Samples: 74440220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:30:57,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 10:30:57,866][57339] Updated weights for policy 0, policy_version 584968 (0.0034) [2024-04-28 10:31:00,387][57339] Updated weights for policy 0, policy_version 584978 (0.0032) [2024-04-28 10:31:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9584361472. Throughput: 0: 54596.6. Samples: 74766180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:02,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 10:31:03,666][57339] Updated weights for policy 0, policy_version 584988 (0.0031) [2024-04-28 10:31:06,585][57339] Updated weights for policy 0, policy_version 584998 (0.0027) [2024-04-28 10:31:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54067.2, 300 sec: 54872.5). Total num frames: 9584623616. Throughput: 0: 54731.2. Samples: 74936180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:07,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 10:31:09,662][57339] Updated weights for policy 0, policy_version 585008 (0.0025) [2024-04-28 10:31:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 9584918528. Throughput: 0: 54716.5. Samples: 75265780. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:12,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 10:31:12,325][57339] Updated weights for policy 0, policy_version 585018 (0.0032) [2024-04-28 10:31:15,572][57339] Updated weights for policy 0, policy_version 585028 (0.0029) [2024-04-28 10:31:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.6, 300 sec: 54817.0). Total num frames: 9585180672. Throughput: 0: 54760.3. Samples: 75593940. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:17,169][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 10:31:18,281][57339] Updated weights for policy 0, policy_version 585038 (0.0032) [2024-04-28 10:31:21,633][57339] Updated weights for policy 0, policy_version 585048 (0.0036) [2024-04-28 10:31:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.6, 300 sec: 54817.0). Total num frames: 9585459200. Throughput: 0: 54586.4. Samples: 75757680. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:22,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:31:24,155][57339] Updated weights for policy 0, policy_version 585058 (0.0033) [2024-04-28 10:31:27,169][57108] Fps is (10 sec: 52428.1, 60 sec: 54340.2, 300 sec: 54761.4). Total num frames: 9585704960. Throughput: 0: 54620.4. Samples: 76084100. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:27,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 10:31:27,197][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000585066_9585721344.pth... [2024-04-28 10:31:27,247][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000584264_9572581376.pth [2024-04-28 10:31:27,825][57339] Updated weights for policy 0, policy_version 585068 (0.0027) [2024-04-28 10:31:30,225][57339] Updated weights for policy 0, policy_version 585078 (0.0031) [2024-04-28 10:31:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.4, 300 sec: 54928.1). 
Total num frames: 9586016256. Throughput: 0: 54552.1. Samples: 76410940. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:32,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 10:31:33,879][57339] Updated weights for policy 0, policy_version 585088 (0.0027) [2024-04-28 10:31:36,053][57339] Updated weights for policy 0, policy_version 585098 (0.0034) [2024-04-28 10:31:37,169][57108] Fps is (10 sec: 57344.9, 60 sec: 54613.5, 300 sec: 54872.5). Total num frames: 9586278400. Throughput: 0: 54817.8. Samples: 76580480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:37,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 10:31:39,654][57339] Updated weights for policy 0, policy_version 585108 (0.0027) [2024-04-28 10:31:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54340.2, 300 sec: 54872.5). Total num frames: 9586556928. Throughput: 0: 54868.9. Samples: 76909320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:42,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 10:31:42,446][57339] Updated weights for policy 0, policy_version 585118 (0.0027) [2024-04-28 10:31:45,423][57319] Signal inference workers to stop experience collection... (1000 times) [2024-04-28 10:31:45,423][57319] Signal inference workers to resume experience collection... (1000 times) [2024-04-28 10:31:45,441][57339] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-04-28 10:31:45,441][57339] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-04-28 10:31:45,545][57339] Updated weights for policy 0, policy_version 585128 (0.0034) [2024-04-28 10:31:47,169][57108] Fps is (10 sec: 54066.1, 60 sec: 54887.0, 300 sec: 54817.0). Total num frames: 9586819072. Throughput: 0: 54906.9. Samples: 77237000. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:47,178][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 10:31:48,309][57339] Updated weights for policy 0, policy_version 585138 (0.0026) [2024-04-28 10:31:51,384][57339] Updated weights for policy 0, policy_version 585148 (0.0030) [2024-04-28 10:31:52,169][57108] Fps is (10 sec: 52429.2, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9587081216. Throughput: 0: 54619.6. Samples: 77394060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:52,169][57108] Avg episode reward: [(0, '0.697')] [2024-04-28 10:31:54,194][57339] Updated weights for policy 0, policy_version 585158 (0.0028) [2024-04-28 10:31:57,169][57108] Fps is (10 sec: 54068.3, 60 sec: 54613.5, 300 sec: 54817.0). Total num frames: 9587359744. Throughput: 0: 54684.0. Samples: 77726560. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:31:57,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 10:31:57,319][57339] Updated weights for policy 0, policy_version 585168 (0.0032) [2024-04-28 10:32:00,253][57339] Updated weights for policy 0, policy_version 585178 (0.0034) [2024-04-28 10:32:02,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54340.2, 300 sec: 54817.0). Total num frames: 9587621888. Throughput: 0: 54595.0. Samples: 78050720. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:32:02,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 10:32:03,367][57339] Updated weights for policy 0, policy_version 585188 (0.0031) [2024-04-28 10:32:06,077][57339] Updated weights for policy 0, policy_version 585198 (0.0037) [2024-04-28 10:32:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9587916800. Throughput: 0: 54709.3. Samples: 78219600. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:32:07,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 10:32:09,442][57339] Updated weights for policy 0, policy_version 585208 (0.0031) [2024-04-28 10:32:12,049][57339] Updated weights for policy 0, policy_version 585218 (0.0036) [2024-04-28 10:32:12,169][57108] Fps is (10 sec: 58982.5, 60 sec: 54886.3, 300 sec: 54928.0). Total num frames: 9588211712. Throughput: 0: 54701.0. Samples: 78545640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:32:12,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 10:32:15,259][57339] Updated weights for policy 0, policy_version 585228 (0.0026) [2024-04-28 10:32:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9588473856. Throughput: 0: 54847.1. Samples: 78879060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 10:32:17,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 10:32:17,922][57339] Updated weights for policy 0, policy_version 585238 (0.0030) [2024-04-28 10:32:21,268][57339] Updated weights for policy 0, policy_version 585248 (0.0029) [2024-04-28 10:32:22,169][57108] Fps is (10 sec: 52429.2, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9588736000. Throughput: 0: 54707.1. Samples: 79042300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:32:22,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 10:32:23,841][57339] Updated weights for policy 0, policy_version 585258 (0.0028) [2024-04-28 10:32:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 54817.0). Total num frames: 9589014528. Throughput: 0: 54768.6. Samples: 79373900. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:32:27,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 10:32:27,351][57339] Updated weights for policy 0, policy_version 585268 (0.0028) [2024-04-28 10:32:30,019][57339] Updated weights for policy 0, policy_version 585278 (0.0028) [2024-04-28 10:32:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54613.3, 300 sec: 54761.5). Total num frames: 9589293056. Throughput: 0: 54845.5. Samples: 79705040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:32:32,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 10:32:33,130][57339] Updated weights for policy 0, policy_version 585288 (0.0031) [2024-04-28 10:32:35,876][57339] Updated weights for policy 0, policy_version 585298 (0.0029) [2024-04-28 10:32:37,169][57108] Fps is (10 sec: 55704.6, 60 sec: 54886.2, 300 sec: 54872.5). Total num frames: 9589571584. Throughput: 0: 55034.0. Samples: 79870600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:32:37,170][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 10:32:39,081][57339] Updated weights for policy 0, policy_version 585308 (0.0031) [2024-04-28 10:32:41,796][57339] Updated weights for policy 0, policy_version 585318 (0.0025) [2024-04-28 10:32:42,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9589850112. Throughput: 0: 54923.8. Samples: 80198140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:32:42,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 10:32:45,151][57339] Updated weights for policy 0, policy_version 585328 (0.0029) [2024-04-28 10:32:47,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55159.6, 300 sec: 54872.5). Total num frames: 9590128640. Throughput: 0: 55051.6. Samples: 80528040. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:32:47,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 10:32:47,808][57339] Updated weights for policy 0, policy_version 585338 (0.0034) [2024-04-28 10:32:51,180][57339] Updated weights for policy 0, policy_version 585348 (0.0027) [2024-04-28 10:32:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.4, 300 sec: 54761.4). Total num frames: 9590390784. Throughput: 0: 55186.1. Samples: 80702980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:32:52,178][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 10:32:53,881][57339] Updated weights for policy 0, policy_version 585358 (0.0029) [2024-04-28 10:32:57,142][57339] Updated weights for policy 0, policy_version 585368 (0.0033) [2024-04-28 10:32:57,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.3, 300 sec: 54817.0). Total num frames: 9590669312. Throughput: 0: 55251.0. Samples: 81031940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:32:57,178][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 10:32:59,789][57339] Updated weights for policy 0, policy_version 585378 (0.0025) [2024-04-28 10:33:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 54872.5). Total num frames: 9590947840. Throughput: 0: 55195.9. Samples: 81362880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:02,175][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 10:33:03,032][57339] Updated weights for policy 0, policy_version 585388 (0.0032) [2024-04-28 10:33:05,527][57319] Signal inference workers to stop experience collection... (1050 times) [2024-04-28 10:33:05,529][57319] Signal inference workers to resume experience collection... 
(1050 times) [2024-04-28 10:33:05,561][57339] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-04-28 10:33:05,561][57339] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-04-28 10:33:05,638][57339] Updated weights for policy 0, policy_version 585398 (0.0027) [2024-04-28 10:33:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9591209984. Throughput: 0: 55013.6. Samples: 81517920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:07,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 10:33:09,009][57339] Updated weights for policy 0, policy_version 585408 (0.0039) [2024-04-28 10:33:11,747][57339] Updated weights for policy 0, policy_version 585418 (0.0036) [2024-04-28 10:33:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 9591504896. Throughput: 0: 54986.9. Samples: 81848320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:12,178][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 10:33:14,877][57339] Updated weights for policy 0, policy_version 585428 (0.0025) [2024-04-28 10:33:17,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9591750656. Throughput: 0: 54989.3. Samples: 82179560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:17,178][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 10:33:17,920][57339] Updated weights for policy 0, policy_version 585438 (0.0033) [2024-04-28 10:33:20,684][57339] Updated weights for policy 0, policy_version 585448 (0.0031) [2024-04-28 10:33:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.3, 300 sec: 54817.0). Total num frames: 9592061952. Throughput: 0: 55039.1. Samples: 82347360. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:22,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 10:33:23,837][57339] Updated weights for policy 0, policy_version 585458 (0.0031) [2024-04-28 10:33:26,574][57339] Updated weights for policy 0, policy_version 585468 (0.0030) [2024-04-28 10:33:27,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55159.3, 300 sec: 54761.4). Total num frames: 9592324096. Throughput: 0: 55073.7. Samples: 82676460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:27,170][57108] Avg episode reward: [(0, '0.485')] [2024-04-28 10:33:27,262][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000585470_9592340480.pth... [2024-04-28 10:33:27,314][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000584665_9579151360.pth [2024-04-28 10:33:29,717][57339] Updated weights for policy 0, policy_version 585478 (0.0029) [2024-04-28 10:33:32,169][57108] Fps is (10 sec: 52429.9, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9592586240. Throughput: 0: 55091.6. Samples: 83007160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:32,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 10:33:32,600][57339] Updated weights for policy 0, policy_version 585488 (0.0034) [2024-04-28 10:33:35,592][57339] Updated weights for policy 0, policy_version 585498 (0.0032) [2024-04-28 10:33:37,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9592881152. Throughput: 0: 54978.1. Samples: 83177000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:37,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 10:33:38,523][57339] Updated weights for policy 0, policy_version 585508 (0.0028) [2024-04-28 10:33:41,599][57339] Updated weights for policy 0, policy_version 585518 (0.0028) [2024-04-28 10:33:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.5, 300 sec: 54872.5). 
Total num frames: 9593143296. Throughput: 0: 55013.9. Samples: 83507560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:42,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 10:33:44,573][57339] Updated weights for policy 0, policy_version 585528 (0.0028) [2024-04-28 10:33:47,169][57108] Fps is (10 sec: 52430.1, 60 sec: 54613.4, 300 sec: 54761.5). Total num frames: 9593405440. Throughput: 0: 54913.5. Samples: 83833980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:47,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 10:33:47,692][57339] Updated weights for policy 0, policy_version 585538 (0.0028) [2024-04-28 10:33:50,690][57339] Updated weights for policy 0, policy_version 585548 (0.0036) [2024-04-28 10:33:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9593700352. Throughput: 0: 55162.2. Samples: 84000220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 10:33:52,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 10:33:53,708][57339] Updated weights for policy 0, policy_version 585558 (0.0029) [2024-04-28 10:33:56,513][57339] Updated weights for policy 0, policy_version 585568 (0.0026) [2024-04-28 10:33:57,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55432.6, 300 sec: 54928.1). Total num frames: 9593995264. Throughput: 0: 55091.2. Samples: 84327420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:33:57,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 10:33:59,645][57339] Updated weights for policy 0, policy_version 585578 (0.0028) [2024-04-28 10:34:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9594241024. Throughput: 0: 55035.6. Samples: 84656160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:02,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 10:34:02,182][57319] Signal inference workers to stop experience collection... 
(1100 times) [2024-04-28 10:34:02,182][57319] Signal inference workers to resume experience collection... (1100 times) [2024-04-28 10:34:02,204][57339] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-04-28 10:34:02,204][57339] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-04-28 10:34:02,425][57339] Updated weights for policy 0, policy_version 585588 (0.0029) [2024-04-28 10:34:05,857][57339] Updated weights for policy 0, policy_version 585598 (0.0028) [2024-04-28 10:34:07,169][57108] Fps is (10 sec: 50790.7, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9594503168. Throughput: 0: 54891.8. Samples: 84817480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:07,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:34:08,317][57339] Updated weights for policy 0, policy_version 585608 (0.0027) [2024-04-28 10:34:11,727][57339] Updated weights for policy 0, policy_version 585618 (0.0036) [2024-04-28 10:34:12,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54613.5, 300 sec: 54872.5). Total num frames: 9594781696. Throughput: 0: 54814.9. Samples: 85143120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:12,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 10:34:14,393][57339] Updated weights for policy 0, policy_version 585628 (0.0032) [2024-04-28 10:34:17,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55159.4, 300 sec: 54761.4). Total num frames: 9595060224. Throughput: 0: 54836.3. Samples: 85474800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:17,170][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 10:34:17,726][57339] Updated weights for policy 0, policy_version 585638 (0.0038) [2024-04-28 10:34:20,543][57339] Updated weights for policy 0, policy_version 585648 (0.0033) [2024-04-28 10:34:22,169][57108] Fps is (10 sec: 55703.8, 60 sec: 54613.2, 300 sec: 54872.5). Total num frames: 9595338752. Throughput: 0: 54816.3. Samples: 85643740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:22,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 10:34:23,707][57339] Updated weights for policy 0, policy_version 585658 (0.0035) [2024-04-28 10:34:26,580][57339] Updated weights for policy 0, policy_version 585668 (0.0031) [2024-04-28 10:34:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.6, 300 sec: 54928.0). Total num frames: 9595633664. Throughput: 0: 54719.9. Samples: 85969960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:27,170][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:34:29,628][57339] Updated weights for policy 0, policy_version 585678 (0.0027) [2024-04-28 10:34:32,169][57108] Fps is (10 sec: 54068.9, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9595879424. Throughput: 0: 54733.7. Samples: 86297000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:32,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 10:34:32,364][57339] Updated weights for policy 0, policy_version 585688 (0.0028) [2024-04-28 10:34:35,551][57339] Updated weights for policy 0, policy_version 585698 (0.0031) [2024-04-28 10:34:37,169][57108] Fps is (10 sec: 50790.5, 60 sec: 54340.4, 300 sec: 54817.0). Total num frames: 9596141568. Throughput: 0: 54636.5. Samples: 86458860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:37,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 10:34:38,385][57339] Updated weights for policy 0, policy_version 585708 (0.0029) [2024-04-28 10:34:41,545][57339] Updated weights for policy 0, policy_version 585718 (0.0027) [2024-04-28 10:34:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9596436480. Throughput: 0: 54767.6. Samples: 86791960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:42,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 10:34:44,144][57339] Updated weights for policy 0, policy_version 585728 (0.0026) [2024-04-28 10:34:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9596698624. Throughput: 0: 54767.1. Samples: 87120680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:47,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 10:34:47,811][57339] Updated weights for policy 0, policy_version 585738 (0.0030) [2024-04-28 10:34:50,060][57339] Updated weights for policy 0, policy_version 585748 (0.0027) [2024-04-28 10:34:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 9596993536. Throughput: 0: 54856.9. Samples: 87286040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:52,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 10:34:53,598][57339] Updated weights for policy 0, policy_version 585758 (0.0029) [2024-04-28 10:34:55,978][57339] Updated weights for policy 0, policy_version 585768 (0.0035) [2024-04-28 10:34:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54613.3, 300 sec: 54928.0). Total num frames: 9597272064. Throughput: 0: 54899.4. Samples: 87613600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:34:57,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 10:34:59,509][57339] Updated weights for policy 0, policy_version 585778 (0.0028) [2024-04-28 10:35:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9597534208. Throughput: 0: 54881.9. Samples: 87944480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:35:02,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:35:02,198][57339] Updated weights for policy 0, policy_version 585788 (0.0027) [2024-04-28 10:35:04,657][57319] Signal inference workers to stop experience collection... 
(1150 times) [2024-04-28 10:35:04,690][57339] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-04-28 10:35:04,715][57319] Signal inference workers to resume experience collection... (1150 times) [2024-04-28 10:35:04,718][57339] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-04-28 10:35:05,476][57339] Updated weights for policy 0, policy_version 585798 (0.0027) [2024-04-28 10:35:07,169][57108] Fps is (10 sec: 52429.2, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9597796352. Throughput: 0: 54917.7. Samples: 88115020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:35:07,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 10:35:08,112][57339] Updated weights for policy 0, policy_version 585808 (0.0030) [2024-04-28 10:35:11,451][57339] Updated weights for policy 0, policy_version 585818 (0.0032) [2024-04-28 10:35:12,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9598058496. Throughput: 0: 54900.6. Samples: 88440480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:35:12,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 10:35:14,046][57339] Updated weights for policy 0, policy_version 585828 (0.0028) [2024-04-28 10:35:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9598353408. Throughput: 0: 54903.9. Samples: 88767680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:35:17,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 10:35:17,388][57339] Updated weights for policy 0, policy_version 585838 (0.0028) [2024-04-28 10:35:20,009][57339] Updated weights for policy 0, policy_version 585848 (0.0027) [2024-04-28 10:35:22,169][57108] Fps is (10 sec: 57343.2, 60 sec: 54886.6, 300 sec: 54872.5). Total num frames: 9598631936. Throughput: 0: 55097.3. Samples: 88938240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 10:35:22,170][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 10:35:23,324][57339] Updated weights for policy 0, policy_version 585858 (0.0031) [2024-04-28 10:35:25,836][57339] Updated weights for policy 0, policy_version 585868 (0.0029) [2024-04-28 10:35:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 9598926848. Throughput: 0: 54939.0. Samples: 89264220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:35:27,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 10:35:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000585872_9598926848.pth... [2024-04-28 10:35:27,232][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000585066_9585721344.pth [2024-04-28 10:35:29,428][57339] Updated weights for policy 0, policy_version 585878 (0.0028) [2024-04-28 10:35:31,863][57339] Updated weights for policy 0, policy_version 585888 (0.0030) [2024-04-28 10:35:32,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.4, 300 sec: 54928.1). Total num frames: 9599205376. Throughput: 0: 54891.0. Samples: 89590780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:35:32,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 10:35:35,524][57339] Updated weights for policy 0, policy_version 585898 (0.0031) [2024-04-28 10:35:37,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.4, 300 sec: 54761.4). Total num frames: 9599451136. Throughput: 0: 55003.4. Samples: 89761200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:35:37,170][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 10:35:37,865][57339] Updated weights for policy 0, policy_version 585908 (0.0026) [2024-04-28 10:35:41,579][57339] Updated weights for policy 0, policy_version 585918 (0.0030) [2024-04-28 10:35:42,169][57108] Fps is (10 sec: 50791.1, 60 sec: 54613.3, 300 sec: 54872.7). 
Total num frames: 9599713280. Throughput: 0: 54992.2. Samples: 90088240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:35:42,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 10:35:43,873][57339] Updated weights for policy 0, policy_version 585928 (0.0033) [2024-04-28 10:35:47,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9599975424. Throughput: 0: 54933.3. Samples: 90416480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:35:47,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 10:35:47,467][57339] Updated weights for policy 0, policy_version 585938 (0.0027) [2024-04-28 10:35:49,735][57339] Updated weights for policy 0, policy_version 585948 (0.0030) [2024-04-28 10:35:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9600270336. Throughput: 0: 54773.3. Samples: 90579820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:35:52,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 10:35:53,389][57339] Updated weights for policy 0, policy_version 585958 (0.0034) [2024-04-28 10:35:55,645][57339] Updated weights for policy 0, policy_version 585968 (0.0033) [2024-04-28 10:35:57,169][57108] Fps is (10 sec: 58982.1, 60 sec: 54886.4, 300 sec: 54928.0). Total num frames: 9600565248. Throughput: 0: 54811.4. Samples: 90907000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:35:57,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 10:35:59,290][57339] Updated weights for policy 0, policy_version 585978 (0.0027) [2024-04-28 10:36:01,180][57319] Signal inference workers to stop experience collection... (1200 times) [2024-04-28 10:36:01,181][57319] Signal inference workers to resume experience collection... 
(1200 times) [2024-04-28 10:36:01,200][57339] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-04-28 10:36:01,200][57339] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-04-28 10:36:01,697][57339] Updated weights for policy 0, policy_version 585988 (0.0031) [2024-04-28 10:36:02,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 9600843776. Throughput: 0: 54882.2. Samples: 91237380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:02,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 10:36:05,281][57339] Updated weights for policy 0, policy_version 585998 (0.0030) [2024-04-28 10:36:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9601105920. Throughput: 0: 54966.3. Samples: 91411720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:07,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 10:36:07,552][57339] Updated weights for policy 0, policy_version 586008 (0.0027) [2024-04-28 10:36:11,145][57339] Updated weights for policy 0, policy_version 586018 (0.0027) [2024-04-28 10:36:12,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.5, 300 sec: 54928.1). Total num frames: 9601384448. Throughput: 0: 55199.2. Samples: 91748180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:12,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:36:13,453][57339] Updated weights for policy 0, policy_version 586028 (0.0028) [2024-04-28 10:36:17,169][57108] Fps is (10 sec: 52429.5, 60 sec: 54613.5, 300 sec: 54817.0). Total num frames: 9601630208. Throughput: 0: 55197.1. Samples: 92074640. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:17,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 10:36:17,189][57339] Updated weights for policy 0, policy_version 586038 (0.0030) [2024-04-28 10:36:19,371][57339] Updated weights for policy 0, policy_version 586048 (0.0030) [2024-04-28 10:36:22,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54613.4, 300 sec: 54928.1). Total num frames: 9601908736. Throughput: 0: 54769.8. Samples: 92225840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:22,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 10:36:23,200][57339] Updated weights for policy 0, policy_version 586058 (0.0031) [2024-04-28 10:36:25,268][57339] Updated weights for policy 0, policy_version 586068 (0.0037) [2024-04-28 10:36:27,169][57108] Fps is (10 sec: 58981.5, 60 sec: 54886.4, 300 sec: 54928.0). Total num frames: 9602220032. Throughput: 0: 54804.7. Samples: 92554460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:27,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 10:36:29,230][57339] Updated weights for policy 0, policy_version 586078 (0.0034) [2024-04-28 10:36:31,355][57339] Updated weights for policy 0, policy_version 586088 (0.0031) [2024-04-28 10:36:32,169][57108] Fps is (10 sec: 58982.5, 60 sec: 54886.5, 300 sec: 54983.6). Total num frames: 9602498560. Throughput: 0: 54818.7. Samples: 92883320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:32,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 10:36:35,046][57339] Updated weights for policy 0, policy_version 586098 (0.0025) [2024-04-28 10:36:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 54983.6). Total num frames: 9602777088. Throughput: 0: 55151.1. Samples: 93061620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:37,169][57108] Avg episode reward: [(0, '0.707')] [2024-04-28 10:36:37,274][57339] Updated weights for policy 0, policy_version 586108 (0.0031) [2024-04-28 10:36:40,897][57339] Updated weights for policy 0, policy_version 586118 (0.0030) [2024-04-28 10:36:42,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.4, 300 sec: 54928.1). Total num frames: 9603022848. Throughput: 0: 55126.8. Samples: 93387700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:42,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 10:36:43,292][57339] Updated weights for policy 0, policy_version 586128 (0.0029) [2024-04-28 10:36:46,938][57339] Updated weights for policy 0, policy_version 586138 (0.0032) [2024-04-28 10:36:47,169][57108] Fps is (10 sec: 50790.9, 60 sec: 55159.6, 300 sec: 54928.1). Total num frames: 9603284992. Throughput: 0: 55138.4. Samples: 93718600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:47,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 10:36:49,186][57339] Updated weights for policy 0, policy_version 586148 (0.0033) [2024-04-28 10:36:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55159.4, 300 sec: 54983.6). Total num frames: 9603579904. Throughput: 0: 54831.5. Samples: 93879140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:52,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 10:36:52,917][57339] Updated weights for policy 0, policy_version 586158 (0.0027) [2024-04-28 10:36:55,282][57339] Updated weights for policy 0, policy_version 586168 (0.0034) [2024-04-28 10:36:57,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54886.5, 300 sec: 55039.2). Total num frames: 9603858432. Throughput: 0: 54663.6. Samples: 94208040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 10:36:57,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 10:36:58,660][57319] Signal inference workers to stop experience collection... 
(1250 times) [2024-04-28 10:36:58,683][57339] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-04-28 10:36:58,749][57319] Signal inference workers to resume experience collection... (1250 times) [2024-04-28 10:36:58,749][57339] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-04-28 10:36:58,855][57339] Updated weights for policy 0, policy_version 586178 (0.0025) [2024-04-28 10:37:01,327][57339] Updated weights for policy 0, policy_version 586188 (0.0031) [2024-04-28 10:37:02,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55432.4, 300 sec: 55094.6). Total num frames: 9604169728. Throughput: 0: 54662.3. Samples: 94534460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:02,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 10:37:05,043][57339] Updated weights for policy 0, policy_version 586198 (0.0033) [2024-04-28 10:37:07,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9604415488. Throughput: 0: 55196.9. Samples: 94709700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:07,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 10:37:07,282][57339] Updated weights for policy 0, policy_version 586208 (0.0029) [2024-04-28 10:37:10,966][57339] Updated weights for policy 0, policy_version 586218 (0.0031) [2024-04-28 10:37:12,169][57108] Fps is (10 sec: 50791.1, 60 sec: 54886.3, 300 sec: 54928.0). Total num frames: 9604677632. Throughput: 0: 55238.2. Samples: 95040180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:12,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 10:37:13,203][57339] Updated weights for policy 0, policy_version 586228 (0.0029) [2024-04-28 10:37:16,879][57339] Updated weights for policy 0, policy_version 586238 (0.0026) [2024-04-28 10:37:17,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.3, 300 sec: 54928.0). Total num frames: 9604939776. Throughput: 0: 55237.7. Samples: 95369020. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:17,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 10:37:19,325][57339] Updated weights for policy 0, policy_version 586248 (0.0031) [2024-04-28 10:37:22,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55159.6, 300 sec: 54928.1). Total num frames: 9605218304. Throughput: 0: 54674.8. Samples: 95521980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:22,169][57108] Avg episode reward: [(0, '0.520')] [2024-04-28 10:37:22,918][57339] Updated weights for policy 0, policy_version 586258 (0.0029) [2024-04-28 10:37:25,186][57339] Updated weights for policy 0, policy_version 586268 (0.0032) [2024-04-28 10:37:27,169][57108] Fps is (10 sec: 55706.2, 60 sec: 54613.4, 300 sec: 54928.1). Total num frames: 9605496832. Throughput: 0: 54773.8. Samples: 95852520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:27,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 10:37:27,257][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000586274_9605513216.pth... [2024-04-28 10:37:27,305][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000585470_9592340480.pth [2024-04-28 10:37:28,825][57339] Updated weights for policy 0, policy_version 586278 (0.0030) [2024-04-28 10:37:31,166][57339] Updated weights for policy 0, policy_version 586288 (0.0030) [2024-04-28 10:37:32,169][57108] Fps is (10 sec: 57343.1, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 9605791744. Throughput: 0: 54712.7. Samples: 96180680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:32,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 10:37:34,798][57339] Updated weights for policy 0, policy_version 586298 (0.0029) [2024-04-28 10:37:37,047][57339] Updated weights for policy 0, policy_version 586308 (0.0031) [2024-04-28 10:37:37,169][57108] Fps is (10 sec: 57344.1, 60 sec: 54886.5, 300 sec: 54983.6). 
Total num frames: 9606070272. Throughput: 0: 55032.2. Samples: 96355580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:37,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 10:37:40,640][57339] Updated weights for policy 0, policy_version 586318 (0.0029) [2024-04-28 10:37:42,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9606332416. Throughput: 0: 55034.7. Samples: 96684600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:42,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 10:37:43,092][57339] Updated weights for policy 0, policy_version 586328 (0.0027) [2024-04-28 10:37:46,699][57339] Updated weights for policy 0, policy_version 586338 (0.0032) [2024-04-28 10:37:47,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55159.3, 300 sec: 54928.1). Total num frames: 9606594560. Throughput: 0: 55174.8. Samples: 97017320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:47,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 10:37:48,898][57339] Updated weights for policy 0, policy_version 586348 (0.0029) [2024-04-28 10:37:52,169][57108] Fps is (10 sec: 50790.2, 60 sec: 54340.4, 300 sec: 54817.0). Total num frames: 9606840320. Throughput: 0: 54621.4. Samples: 97167660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:52,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 10:37:52,660][57339] Updated weights for policy 0, policy_version 586358 (0.0027) [2024-04-28 10:37:54,704][57339] Updated weights for policy 0, policy_version 586368 (0.0029) [2024-04-28 10:37:57,169][57108] Fps is (10 sec: 55706.3, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 9607151616. Throughput: 0: 54672.2. Samples: 97500420. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:37:57,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 10:37:58,618][57339] Updated weights for policy 0, policy_version 586378 (0.0031) [2024-04-28 10:38:00,994][57339] Updated weights for policy 0, policy_version 586388 (0.0028) [2024-04-28 10:38:02,169][57108] Fps is (10 sec: 60620.8, 60 sec: 54613.5, 300 sec: 55039.1). Total num frames: 9607446528. Throughput: 0: 54651.6. Samples: 97828340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:38:02,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 10:38:04,549][57339] Updated weights for policy 0, policy_version 586398 (0.0032) [2024-04-28 10:38:06,786][57319] Signal inference workers to stop experience collection... (1300 times) [2024-04-28 10:38:06,811][57339] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-04-28 10:38:06,883][57319] Signal inference workers to resume experience collection... (1300 times) [2024-04-28 10:38:06,883][57339] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-04-28 10:38:06,994][57339] Updated weights for policy 0, policy_version 586408 (0.0030) [2024-04-28 10:38:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 9607708672. Throughput: 0: 55150.7. Samples: 98003760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:38:07,169][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 10:38:10,393][57339] Updated weights for policy 0, policy_version 586418 (0.0027) [2024-04-28 10:38:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.6, 300 sec: 55039.2). Total num frames: 9607987200. Throughput: 0: 55133.8. Samples: 98333540. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:38:12,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:38:12,959][57339] Updated weights for policy 0, policy_version 586428 (0.0032) [2024-04-28 10:38:16,460][57339] Updated weights for policy 0, policy_version 586438 (0.0028) [2024-04-28 10:38:17,169][57108] Fps is (10 sec: 54065.4, 60 sec: 55159.3, 300 sec: 54872.5). Total num frames: 9608249344. Throughput: 0: 55139.3. Samples: 98661960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:38:17,170][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 10:38:18,734][57339] Updated weights for policy 0, policy_version 586448 (0.0036) [2024-04-28 10:38:22,169][57108] Fps is (10 sec: 50790.1, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9608495104. Throughput: 0: 54874.6. Samples: 98824940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:38:22,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 10:38:22,378][57339] Updated weights for policy 0, policy_version 586458 (0.0031) [2024-04-28 10:38:24,687][57339] Updated weights for policy 0, policy_version 586468 (0.0026) [2024-04-28 10:38:27,169][57108] Fps is (10 sec: 52429.8, 60 sec: 54613.2, 300 sec: 54872.5). Total num frames: 9608773632. Throughput: 0: 54974.5. Samples: 99158460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 19.0) [2024-04-28 10:38:27,170][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 10:38:28,299][57339] Updated weights for policy 0, policy_version 586478 (0.0027) [2024-04-28 10:38:30,633][57339] Updated weights for policy 0, policy_version 586488 (0.0033) [2024-04-28 10:38:32,169][57108] Fps is (10 sec: 58981.9, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 9609084928. Throughput: 0: 54942.7. Samples: 99489740. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:38:32,170][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 10:38:34,224][57339] Updated weights for policy 0, policy_version 586498 (0.0030) [2024-04-28 10:38:36,407][57339] Updated weights for policy 0, policy_version 586508 (0.0025) [2024-04-28 10:38:37,169][57108] Fps is (10 sec: 58982.7, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 9609363456. Throughput: 0: 55414.6. Samples: 99661320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:38:37,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 10:38:40,273][57339] Updated weights for policy 0, policy_version 586518 (0.0033) [2024-04-28 10:38:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.5, 300 sec: 55039.1). Total num frames: 9609641984. Throughput: 0: 55236.0. Samples: 99986040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:38:42,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 10:38:42,406][57339] Updated weights for policy 0, policy_version 586528 (0.0023) [2024-04-28 10:38:46,153][57339] Updated weights for policy 0, policy_version 586538 (0.0032) [2024-04-28 10:38:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9609904128. Throughput: 0: 55201.7. Samples: 100312420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:38:47,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 10:38:48,656][57339] Updated weights for policy 0, policy_version 586548 (0.0030) [2024-04-28 10:38:52,169][57108] Fps is (10 sec: 50790.3, 60 sec: 55159.5, 300 sec: 54761.4). Total num frames: 9610149888. Throughput: 0: 54634.2. Samples: 100462300. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:38:52,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 10:38:52,189][57339] Updated weights for policy 0, policy_version 586558 (0.0036) [2024-04-28 10:38:54,627][57339] Updated weights for policy 0, policy_version 586568 (0.0023) [2024-04-28 10:38:57,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9610428416. Throughput: 0: 54758.1. Samples: 100797660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:38:57,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 10:38:58,247][57339] Updated weights for policy 0, policy_version 586578 (0.0029) [2024-04-28 10:38:58,646][57319] Signal inference workers to stop experience collection... (1350 times) [2024-04-28 10:38:58,686][57339] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-04-28 10:38:58,741][57319] Signal inference workers to resume experience collection... (1350 times) [2024-04-28 10:38:58,741][57339] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-04-28 10:39:00,732][57339] Updated weights for policy 0, policy_version 586588 (0.0033) [2024-04-28 10:39:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54067.3, 300 sec: 54872.5). Total num frames: 9610690560. Throughput: 0: 54680.8. Samples: 101122580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:02,169][57108] Avg episode reward: [(0, '0.709')] [2024-04-28 10:39:04,310][57339] Updated weights for policy 0, policy_version 586598 (0.0030) [2024-04-28 10:39:06,780][57339] Updated weights for policy 0, policy_version 586608 (0.0030) [2024-04-28 10:39:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 9611001856. Throughput: 0: 54741.4. Samples: 101288300. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:07,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 10:39:10,185][57339] Updated weights for policy 0, policy_version 586618 (0.0031) [2024-04-28 10:39:12,169][57108] Fps is (10 sec: 58981.8, 60 sec: 54886.3, 300 sec: 54983.6). Total num frames: 9611280384. Throughput: 0: 54612.0. Samples: 101616000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:12,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 10:39:12,675][57339] Updated weights for policy 0, policy_version 586628 (0.0031) [2024-04-28 10:39:16,386][57339] Updated weights for policy 0, policy_version 586638 (0.0029) [2024-04-28 10:39:17,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55159.6, 300 sec: 54983.6). Total num frames: 9611558912. Throughput: 0: 54526.1. Samples: 101943420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:17,170][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 10:39:18,729][57339] Updated weights for policy 0, policy_version 586648 (0.0026) [2024-04-28 10:39:22,169][57108] Fps is (10 sec: 50791.1, 60 sec: 54886.5, 300 sec: 54761.5). Total num frames: 9611788288. Throughput: 0: 54393.5. Samples: 102109020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:22,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 10:39:22,278][57339] Updated weights for policy 0, policy_version 586658 (0.0028) [2024-04-28 10:39:24,681][57339] Updated weights for policy 0, policy_version 586668 (0.0028) [2024-04-28 10:39:27,169][57108] Fps is (10 sec: 49152.9, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9612050432. Throughput: 0: 54365.8. Samples: 102432500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:27,169][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 10:39:27,246][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000586674_9612066816.pth... 
[2024-04-28 10:39:27,294][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000585872_9598926848.pth [2024-04-28 10:39:28,242][57339] Updated weights for policy 0, policy_version 586678 (0.0031) [2024-04-28 10:39:30,588][57339] Updated weights for policy 0, policy_version 586688 (0.0033) [2024-04-28 10:39:32,169][57108] Fps is (10 sec: 55704.3, 60 sec: 54340.2, 300 sec: 54928.0). Total num frames: 9612345344. Throughput: 0: 54391.9. Samples: 102760060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:32,170][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 10:39:34,271][57339] Updated weights for policy 0, policy_version 586698 (0.0033) [2024-04-28 10:39:36,762][57339] Updated weights for policy 0, policy_version 586708 (0.0025) [2024-04-28 10:39:37,169][57108] Fps is (10 sec: 58981.8, 60 sec: 54613.3, 300 sec: 54928.0). Total num frames: 9612640256. Throughput: 0: 54742.1. Samples: 102925700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:37,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 10:39:40,167][57339] Updated weights for policy 0, policy_version 586718 (0.0033) [2024-04-28 10:39:42,169][57108] Fps is (10 sec: 57344.7, 60 sec: 54613.3, 300 sec: 54983.6). Total num frames: 9612918784. Throughput: 0: 54433.8. Samples: 103247180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:42,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 10:39:42,901][57339] Updated weights for policy 0, policy_version 586728 (0.0033) [2024-04-28 10:39:45,967][57339] Updated weights for policy 0, policy_version 586738 (0.0028) [2024-04-28 10:39:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 54928.0). Total num frames: 9613197312. Throughput: 0: 54601.2. Samples: 103579640. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:47,169][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 10:39:48,868][57339] Updated weights for policy 0, policy_version 586748 (0.0029) [2024-04-28 10:39:52,005][57339] Updated weights for policy 0, policy_version 586758 (0.0029) [2024-04-28 10:39:52,169][57108] Fps is (10 sec: 52429.4, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9613443072. Throughput: 0: 54700.5. Samples: 103749820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:52,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:39:54,680][57339] Updated weights for policy 0, policy_version 586768 (0.0026) [2024-04-28 10:39:57,169][57108] Fps is (10 sec: 49152.5, 60 sec: 54340.4, 300 sec: 54761.4). Total num frames: 9613688832. Throughput: 0: 54788.6. Samples: 104081480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:39:57,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 10:39:57,955][57339] Updated weights for policy 0, policy_version 586778 (0.0029) [2024-04-28 10:40:00,766][57339] Updated weights for policy 0, policy_version 586788 (0.0027) [2024-04-28 10:40:02,169][57108] Fps is (10 sec: 52428.0, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 9613967360. Throughput: 0: 54731.2. Samples: 104406320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-04-28 10:40:02,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 10:40:02,692][57319] Signal inference workers to stop experience collection... (1400 times) [2024-04-28 10:40:02,732][57339] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-04-28 10:40:02,787][57319] Signal inference workers to resume experience collection... 
(1400 times) [2024-04-28 10:40:02,788][57339] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-04-28 10:40:03,929][57339] Updated weights for policy 0, policy_version 586798 (0.0024) [2024-04-28 10:40:06,782][57339] Updated weights for policy 0, policy_version 586808 (0.0037) [2024-04-28 10:40:07,169][57108] Fps is (10 sec: 58982.1, 60 sec: 54613.3, 300 sec: 54983.6). Total num frames: 9614278656. Throughput: 0: 54573.7. Samples: 104564840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:07,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 10:40:09,963][57339] Updated weights for policy 0, policy_version 586818 (0.0032) [2024-04-28 10:40:12,169][57108] Fps is (10 sec: 58983.1, 60 sec: 54613.4, 300 sec: 54928.1). Total num frames: 9614557184. Throughput: 0: 54681.4. Samples: 104893160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:12,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 10:40:12,594][57339] Updated weights for policy 0, policy_version 586828 (0.0033) [2024-04-28 10:40:15,968][57339] Updated weights for policy 0, policy_version 586838 (0.0030) [2024-04-28 10:40:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54613.5, 300 sec: 54928.1). Total num frames: 9614835712. Throughput: 0: 54675.8. Samples: 105220460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:17,169][57108] Avg episode reward: [(0, '0.510')] [2024-04-28 10:40:18,858][57339] Updated weights for policy 0, policy_version 586848 (0.0026) [2024-04-28 10:40:21,867][57339] Updated weights for policy 0, policy_version 586858 (0.0027) [2024-04-28 10:40:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 54817.0). Total num frames: 9615097856. Throughput: 0: 54796.1. Samples: 105391520. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:22,169][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 10:40:24,941][57339] Updated weights for policy 0, policy_version 586868 (0.0035) [2024-04-28 10:40:27,169][57108] Fps is (10 sec: 50789.7, 60 sec: 54886.3, 300 sec: 54705.9). Total num frames: 9615343616. Throughput: 0: 54958.6. Samples: 105720320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:27,170][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 10:40:27,701][57339] Updated weights for policy 0, policy_version 586878 (0.0031) [2024-04-28 10:40:31,174][57339] Updated weights for policy 0, policy_version 586888 (0.0031) [2024-04-28 10:40:32,169][57108] Fps is (10 sec: 50789.6, 60 sec: 54340.3, 300 sec: 54761.4). Total num frames: 9615605760. Throughput: 0: 54889.2. Samples: 106049660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:32,170][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 10:40:33,783][57339] Updated weights for policy 0, policy_version 586898 (0.0030) [2024-04-28 10:40:37,088][57339] Updated weights for policy 0, policy_version 586908 (0.0030) [2024-04-28 10:40:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54340.3, 300 sec: 54872.5). Total num frames: 9615900672. Throughput: 0: 54440.7. Samples: 106199660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:37,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 10:40:39,861][57339] Updated weights for policy 0, policy_version 586918 (0.0027) [2024-04-28 10:40:42,169][57108] Fps is (10 sec: 58982.7, 60 sec: 54613.3, 300 sec: 54983.6). Total num frames: 9616195584. Throughput: 0: 54399.8. Samples: 106529480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:42,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 10:40:42,927][57339] Updated weights for policy 0, policy_version 586928 (0.0031) [2024-04-28 10:40:45,714][57339] Updated weights for policy 0, policy_version 586938 (0.0030) [2024-04-28 10:40:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 54613.3, 300 sec: 54928.0). Total num frames: 9616474112. Throughput: 0: 54601.7. Samples: 106863400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:47,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 10:40:48,979][57339] Updated weights for policy 0, policy_version 586948 (0.0031) [2024-04-28 10:40:51,609][57339] Updated weights for policy 0, policy_version 586958 (0.0028) [2024-04-28 10:40:52,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9616736256. Throughput: 0: 55030.3. Samples: 107041200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:52,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 10:40:54,920][57339] Updated weights for policy 0, policy_version 586968 (0.0029) [2024-04-28 10:40:57,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.3, 300 sec: 54761.4). Total num frames: 9616998400. Throughput: 0: 54987.8. Samples: 107367620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:40:57,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 10:40:57,701][57339] Updated weights for policy 0, policy_version 586978 (0.0032) [2024-04-28 10:41:00,852][57339] Updated weights for policy 0, policy_version 586988 (0.0026) [2024-04-28 10:41:02,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9617260544. Throughput: 0: 55033.3. Samples: 107696960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:41:02,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 10:41:03,668][57339] Updated weights for policy 0, policy_version 586998 (0.0036) [2024-04-28 10:41:04,284][57319] Signal inference workers to stop experience collection... (1450 times) [2024-04-28 10:41:04,289][57319] Signal inference workers to resume experience collection... (1450 times) [2024-04-28 10:41:04,315][57339] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-04-28 10:41:04,315][57339] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-04-28 10:41:06,633][57339] Updated weights for policy 0, policy_version 587008 (0.0030) [2024-04-28 10:41:07,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54613.2, 300 sec: 54816.9). Total num frames: 9617555456. Throughput: 0: 54598.9. Samples: 107848480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:41:07,170][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 10:41:09,497][57339] Updated weights for policy 0, policy_version 587018 (0.0033) [2024-04-28 10:41:12,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54340.3, 300 sec: 54872.5). Total num frames: 9617817600. Throughput: 0: 54634.0. Samples: 108178840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:41:12,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 10:41:12,920][57339] Updated weights for policy 0, policy_version 587028 (0.0028) [2024-04-28 10:41:15,386][57339] Updated weights for policy 0, policy_version 587038 (0.0029) [2024-04-28 10:41:17,169][57108] Fps is (10 sec: 55706.5, 60 sec: 54613.3, 300 sec: 54928.1). Total num frames: 9618112512. Throughput: 0: 54718.0. Samples: 108511960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:41:17,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 10:41:18,930][57339] Updated weights for policy 0, policy_version 587048 (0.0027) [2024-04-28 10:41:21,392][57339] Updated weights for policy 0, policy_version 587058 (0.0030) [2024-04-28 10:41:22,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55159.5, 300 sec: 54872.5). Total num frames: 9618407424. Throughput: 0: 55372.1. Samples: 108691400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:41:22,169][57108] Avg episode reward: [(0, '0.697')] [2024-04-28 10:41:24,738][57339] Updated weights for policy 0, policy_version 587068 (0.0038) [2024-04-28 10:41:27,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 54761.5). Total num frames: 9618653184. Throughput: 0: 55258.4. Samples: 109016100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:41:27,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 10:41:27,234][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000587077_9618669568.pth... [2024-04-28 10:41:27,283][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000586274_9605513216.pth [2024-04-28 10:41:27,440][57339] Updated weights for policy 0, policy_version 587078 (0.0034) [2024-04-28 10:41:30,648][57339] Updated weights for policy 0, policy_version 587088 (0.0028) [2024-04-28 10:41:32,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55159.6, 300 sec: 54705.9). Total num frames: 9618915328. Throughput: 0: 55177.5. Samples: 109346380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 19.0) [2024-04-28 10:41:32,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 10:41:33,340][57339] Updated weights for policy 0, policy_version 587098 (0.0033) [2024-04-28 10:41:36,714][57339] Updated weights for policy 0, policy_version 587108 (0.0032) [2024-04-28 10:41:37,169][57108] Fps is (10 sec: 54066.3, 60 sec: 54886.4, 300 sec: 54817.0). 
Total num frames: 9619193856. Throughput: 0: 54597.1. Samples: 109498080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:41:37,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 10:41:39,240][57339] Updated weights for policy 0, policy_version 587118 (0.0030) [2024-04-28 10:41:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54613.5, 300 sec: 54872.5). Total num frames: 9619472384. Throughput: 0: 54684.2. Samples: 109828400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:41:42,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 10:41:42,871][57339] Updated weights for policy 0, policy_version 587128 (0.0031) [2024-04-28 10:41:45,263][57339] Updated weights for policy 0, policy_version 587138 (0.0034) [2024-04-28 10:41:47,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9619750912. Throughput: 0: 54809.2. Samples: 110163380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:41:47,169][57108] Avg episode reward: [(0, '0.701')] [2024-04-28 10:41:48,704][57339] Updated weights for policy 0, policy_version 587148 (0.0029) [2024-04-28 10:41:51,111][57339] Updated weights for policy 0, policy_version 587158 (0.0025) [2024-04-28 10:41:52,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9620045824. Throughput: 0: 55227.7. Samples: 110333720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:41:52,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 10:41:54,763][57339] Updated weights for policy 0, policy_version 587168 (0.0028) [2024-04-28 10:41:57,004][57339] Updated weights for policy 0, policy_version 587178 (0.0031) [2024-04-28 10:41:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 54761.5). Total num frames: 9620324352. Throughput: 0: 55167.8. Samples: 110661400. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:41:57,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 10:42:00,726][57339] Updated weights for policy 0, policy_version 587188 (0.0031) [2024-04-28 10:42:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 54817.0). Total num frames: 9620586496. Throughput: 0: 55210.1. Samples: 110996420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:02,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 10:42:02,961][57339] Updated weights for policy 0, policy_version 587198 (0.0029) [2024-04-28 10:42:06,733][57339] Updated weights for policy 0, policy_version 587208 (0.0035) [2024-04-28 10:42:07,169][57108] Fps is (10 sec: 50790.7, 60 sec: 54613.4, 300 sec: 54761.5). Total num frames: 9620832256. Throughput: 0: 54720.9. Samples: 111153840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:07,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 10:42:08,928][57339] Updated weights for policy 0, policy_version 587218 (0.0032) [2024-04-28 10:42:12,169][57108] Fps is (10 sec: 52429.3, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9621110784. Throughput: 0: 54811.6. Samples: 111482620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:12,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 10:42:12,524][57339] Updated weights for policy 0, policy_version 587228 (0.0029) [2024-04-28 10:42:13,678][57319] Signal inference workers to stop experience collection... (1500 times) [2024-04-28 10:42:13,716][57339] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-04-28 10:42:13,745][57319] Signal inference workers to resume experience collection... 
(1500 times) [2024-04-28 10:42:13,746][57339] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-04-28 10:42:14,873][57339] Updated weights for policy 0, policy_version 587238 (0.0027) [2024-04-28 10:42:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9621389312. Throughput: 0: 54821.3. Samples: 111813340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:17,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 10:42:18,488][57339] Updated weights for policy 0, policy_version 587248 (0.0030) [2024-04-28 10:42:21,028][57339] Updated weights for policy 0, policy_version 587258 (0.0035) [2024-04-28 10:42:22,169][57108] Fps is (10 sec: 57343.2, 60 sec: 54613.2, 300 sec: 54872.5). Total num frames: 9621684224. Throughput: 0: 55046.7. Samples: 111975180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:22,169][57108] Avg episode reward: [(0, '0.514')] [2024-04-28 10:42:24,532][57339] Updated weights for policy 0, policy_version 587268 (0.0031) [2024-04-28 10:42:26,955][57339] Updated weights for policy 0, policy_version 587278 (0.0026) [2024-04-28 10:42:27,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55432.5, 300 sec: 54872.5). Total num frames: 9621979136. Throughput: 0: 55011.5. Samples: 112303920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:27,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 10:42:30,326][57339] Updated weights for policy 0, policy_version 587288 (0.0026) [2024-04-28 10:42:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 54761.4). Total num frames: 9622224896. Throughput: 0: 54846.7. Samples: 112631480. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:32,178][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 10:42:32,838][57339] Updated weights for policy 0, policy_version 587298 (0.0026) [2024-04-28 10:42:36,253][57339] Updated weights for policy 0, policy_version 587308 (0.0033) [2024-04-28 10:42:37,169][57108] Fps is (10 sec: 50790.2, 60 sec: 54886.5, 300 sec: 54761.4). Total num frames: 9622487040. Throughput: 0: 54715.6. Samples: 112795920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:37,178][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 10:42:38,946][57339] Updated weights for policy 0, policy_version 587318 (0.0030) [2024-04-28 10:42:42,169][57108] Fps is (10 sec: 54066.7, 60 sec: 54886.2, 300 sec: 54817.0). Total num frames: 9622765568. Throughput: 0: 54729.2. Samples: 113124220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:42,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 10:42:42,653][57339] Updated weights for policy 0, policy_version 587328 (0.0033) [2024-04-28 10:42:44,760][57339] Updated weights for policy 0, policy_version 587338 (0.0028) [2024-04-28 10:42:47,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54340.3, 300 sec: 54817.0). Total num frames: 9623011328. Throughput: 0: 54553.3. Samples: 113451320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:47,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 10:42:48,628][57339] Updated weights for policy 0, policy_version 587348 (0.0028) [2024-04-28 10:42:50,864][57339] Updated weights for policy 0, policy_version 587358 (0.0030) [2024-04-28 10:42:52,169][57108] Fps is (10 sec: 52430.0, 60 sec: 54067.3, 300 sec: 54705.9). Total num frames: 9623289856. Throughput: 0: 54557.0. Samples: 113608900. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:52,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 10:42:54,715][57339] Updated weights for policy 0, policy_version 587368 (0.0036) [2024-04-28 10:42:56,781][57339] Updated weights for policy 0, policy_version 587378 (0.0035) [2024-04-28 10:42:57,169][57108] Fps is (10 sec: 58982.3, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9623601152. Throughput: 0: 54570.9. Samples: 113938320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:42:57,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 10:43:00,706][57339] Updated weights for policy 0, policy_version 587388 (0.0027) [2024-04-28 10:43:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54340.3, 300 sec: 54705.9). Total num frames: 9623846912. Throughput: 0: 54391.9. Samples: 114260980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:43:02,169][57108] Avg episode reward: [(0, '0.487')] [2024-04-28 10:43:02,937][57339] Updated weights for policy 0, policy_version 587398 (0.0032) [2024-04-28 10:43:06,599][57339] Updated weights for policy 0, policy_version 587408 (0.0032) [2024-04-28 10:43:07,169][57108] Fps is (10 sec: 52429.5, 60 sec: 54886.5, 300 sec: 54705.9). Total num frames: 9624125440. Throughput: 0: 54577.5. Samples: 114431160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 10:43:07,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 10:43:09,162][57339] Updated weights for policy 0, policy_version 587418 (0.0036) [2024-04-28 10:43:12,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54613.2, 300 sec: 54705.9). Total num frames: 9624387584. Throughput: 0: 54629.2. Samples: 114762240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:12,170][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 10:43:12,434][57339] Updated weights for policy 0, policy_version 587428 (0.0029) [2024-04-28 10:43:15,312][57339] Updated weights for policy 0, policy_version 587438 (0.0024) [2024-04-28 10:43:17,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 9624666112. Throughput: 0: 54617.3. Samples: 115089260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:17,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 10:43:18,418][57339] Updated weights for policy 0, policy_version 587448 (0.0028) [2024-04-28 10:43:21,357][57339] Updated weights for policy 0, policy_version 587458 (0.0028) [2024-04-28 10:43:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54067.2, 300 sec: 54761.4). Total num frames: 9624928256. Throughput: 0: 54442.1. Samples: 115245820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:22,170][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 10:43:23,040][57319] Signal inference workers to stop experience collection... (1550 times) [2024-04-28 10:43:23,080][57339] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-04-28 10:43:23,090][57319] Signal inference workers to resume experience collection... (1550 times) [2024-04-28 10:43:23,096][57339] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-04-28 10:43:24,708][57339] Updated weights for policy 0, policy_version 587468 (0.0030) [2024-04-28 10:43:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54067.2, 300 sec: 54705.9). Total num frames: 9625223168. Throughput: 0: 54352.6. Samples: 115570080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:27,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 10:43:27,215][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000587478_9625239552.pth... 
[2024-04-28 10:43:27,221][57339] Updated weights for policy 0, policy_version 587478 (0.0033) [2024-04-28 10:43:27,264][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000586674_9612066816.pth [2024-04-28 10:43:30,718][57339] Updated weights for policy 0, policy_version 587488 (0.0030) [2024-04-28 10:43:32,169][57108] Fps is (10 sec: 58983.6, 60 sec: 54886.5, 300 sec: 54761.5). Total num frames: 9625518080. Throughput: 0: 54356.2. Samples: 115897340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:32,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 10:43:33,108][57339] Updated weights for policy 0, policy_version 587498 (0.0025) [2024-04-28 10:43:36,757][57339] Updated weights for policy 0, policy_version 587508 (0.0031) [2024-04-28 10:43:37,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54613.3, 300 sec: 54650.4). Total num frames: 9625763840. Throughput: 0: 54769.6. Samples: 116073540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:37,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 10:43:39,122][57339] Updated weights for policy 0, policy_version 587518 (0.0035) [2024-04-28 10:43:42,169][57108] Fps is (10 sec: 50789.8, 60 sec: 54340.4, 300 sec: 54650.4). Total num frames: 9626025984. Throughput: 0: 54669.4. Samples: 116398440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:42,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 10:43:42,643][57339] Updated weights for policy 0, policy_version 587528 (0.0033) [2024-04-28 10:43:45,287][57339] Updated weights for policy 0, policy_version 587538 (0.0035) [2024-04-28 10:43:47,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54886.5, 300 sec: 54761.4). Total num frames: 9626304512. Throughput: 0: 54686.7. Samples: 116721880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:47,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 10:43:48,585][57339] Updated weights for policy 0, policy_version 587548 (0.0028) [2024-04-28 10:43:51,130][57339] Updated weights for policy 0, policy_version 587558 (0.0027) [2024-04-28 10:43:52,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54340.1, 300 sec: 54650.3). Total num frames: 9626550272. Throughput: 0: 54525.1. Samples: 116884800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:52,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 10:43:54,506][57339] Updated weights for policy 0, policy_version 587568 (0.0033) [2024-04-28 10:43:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54340.3, 300 sec: 54817.0). Total num frames: 9626861568. Throughput: 0: 54352.1. Samples: 117208080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:43:57,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 10:43:57,616][57339] Updated weights for policy 0, policy_version 587578 (0.0027) [2024-04-28 10:44:00,415][57339] Updated weights for policy 0, policy_version 587588 (0.0030) [2024-04-28 10:44:02,169][57108] Fps is (10 sec: 60621.4, 60 sec: 55159.5, 300 sec: 54761.4). Total num frames: 9627156480. Throughput: 0: 54436.5. Samples: 117538900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:44:02,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:44:03,626][57339] Updated weights for policy 0, policy_version 587598 (0.0029) [2024-04-28 10:44:06,413][57339] Updated weights for policy 0, policy_version 587608 (0.0031) [2024-04-28 10:44:07,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.3, 300 sec: 54705.9). Total num frames: 9627418624. Throughput: 0: 54858.7. Samples: 117714460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:44:07,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 10:44:09,675][57339] Updated weights for policy 0, policy_version 587618 (0.0030) [2024-04-28 10:44:12,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54886.3, 300 sec: 54650.4). Total num frames: 9627680768. Throughput: 0: 54950.1. Samples: 118042840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:44:12,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 10:44:12,279][57339] Updated weights for policy 0, policy_version 587628 (0.0025) [2024-04-28 10:44:15,616][57339] Updated weights for policy 0, policy_version 587638 (0.0028) [2024-04-28 10:44:17,169][57108] Fps is (10 sec: 52429.4, 60 sec: 54613.4, 300 sec: 54761.4). Total num frames: 9627942912. Throughput: 0: 55041.7. Samples: 118374220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:44:17,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 10:44:18,286][57339] Updated weights for policy 0, policy_version 587648 (0.0028) [2024-04-28 10:44:18,729][57319] Signal inference workers to stop experience collection... (1600 times) [2024-04-28 10:44:18,769][57339] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-04-28 10:44:18,778][57319] Signal inference workers to resume experience collection... (1600 times) [2024-04-28 10:44:18,783][57339] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-04-28 10:44:21,417][57339] Updated weights for policy 0, policy_version 587658 (0.0031) [2024-04-28 10:44:22,169][57108] Fps is (10 sec: 52429.4, 60 sec: 54613.4, 300 sec: 54761.4). Total num frames: 9628205056. Throughput: 0: 54488.0. Samples: 118525500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:44:22,169][57108] Avg episode reward: [(0, '0.515')] [2024-04-28 10:44:24,265][57339] Updated weights for policy 0, policy_version 587668 (0.0030) [2024-04-28 10:44:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54613.3, 300 sec: 54761.5). Total num frames: 9628499968. Throughput: 0: 54685.4. Samples: 118859280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:44:27,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 10:44:27,189][57339] Updated weights for policy 0, policy_version 587678 (0.0028) [2024-04-28 10:44:30,236][57339] Updated weights for policy 0, policy_version 587688 (0.0030) [2024-04-28 10:44:32,169][57108] Fps is (10 sec: 58982.2, 60 sec: 54613.2, 300 sec: 54761.4). Total num frames: 9628794880. Throughput: 0: 54774.6. Samples: 119186740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:44:32,169][57108] Avg episode reward: [(0, '0.703')] [2024-04-28 10:44:33,457][57339] Updated weights for policy 0, policy_version 587698 (0.0034) [2024-04-28 10:44:36,152][57339] Updated weights for policy 0, policy_version 587708 (0.0028) [2024-04-28 10:44:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.5, 300 sec: 54761.4). Total num frames: 9629073408. Throughput: 0: 55054.4. Samples: 119362240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:44:37,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 10:44:39,344][57339] Updated weights for policy 0, policy_version 587718 (0.0032) [2024-04-28 10:44:42,031][57339] Updated weights for policy 0, policy_version 587728 (0.0033) [2024-04-28 10:44:42,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 54705.9). Total num frames: 9629335552. Throughput: 0: 55288.5. Samples: 119696060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-04-28 10:44:42,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 10:44:45,257][57339] Updated weights for policy 0, policy_version 587738 (0.0030) [2024-04-28 10:44:47,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54886.3, 300 sec: 54761.4). Total num frames: 9629597696. Throughput: 0: 55180.4. Samples: 120022020. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:44:47,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 10:44:48,042][57339] Updated weights for policy 0, policy_version 587748 (0.0027) [2024-04-28 10:44:51,389][57339] Updated weights for policy 0, policy_version 587758 (0.0030) [2024-04-28 10:44:52,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55159.5, 300 sec: 54816.9). Total num frames: 9629859840. Throughput: 0: 54660.4. Samples: 120174180. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:44:52,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 10:44:54,016][57339] Updated weights for policy 0, policy_version 587768 (0.0030) [2024-04-28 10:44:57,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9630138368. Throughput: 0: 54603.2. Samples: 120499980. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:44:57,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 10:44:57,418][57339] Updated weights for policy 0, policy_version 587778 (0.0027) [2024-04-28 10:44:59,971][57339] Updated weights for policy 0, policy_version 587788 (0.0028) [2024-04-28 10:45:02,169][57108] Fps is (10 sec: 58982.2, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9630449664. Throughput: 0: 54548.7. Samples: 120828920. 
Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:02,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 10:45:03,365][57339] Updated weights for policy 0, policy_version 587798 (0.0038) [2024-04-28 10:45:06,011][57339] Updated weights for policy 0, policy_version 587808 (0.0033) [2024-04-28 10:45:07,169][57108] Fps is (10 sec: 57343.4, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9630711808. Throughput: 0: 55184.3. Samples: 121008800. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:07,170][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 10:45:09,206][57339] Updated weights for policy 0, policy_version 587818 (0.0030) [2024-04-28 10:45:11,951][57339] Updated weights for policy 0, policy_version 587828 (0.0029) [2024-04-28 10:45:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 54761.4). Total num frames: 9630990336. Throughput: 0: 54959.9. Samples: 121332480. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:12,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 10:45:15,138][57339] Updated weights for policy 0, policy_version 587838 (0.0027) [2024-04-28 10:45:17,169][57108] Fps is (10 sec: 50790.4, 60 sec: 54613.2, 300 sec: 54650.3). Total num frames: 9631219712. Throughput: 0: 55046.1. Samples: 121663820. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:17,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 10:45:17,887][57339] Updated weights for policy 0, policy_version 587848 (0.0032) [2024-04-28 10:45:18,825][57319] Signal inference workers to stop experience collection... (1650 times) [2024-04-28 10:45:18,826][57319] Signal inference workers to resume experience collection... 
(1650 times) [2024-04-28 10:45:18,841][57339] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-04-28 10:45:18,841][57339] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-04-28 10:45:21,252][57339] Updated weights for policy 0, policy_version 587858 (0.0031) [2024-04-28 10:45:22,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9631514624. Throughput: 0: 54571.6. Samples: 121817960. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:22,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 10:45:23,826][57339] Updated weights for policy 0, policy_version 587868 (0.0028) [2024-04-28 10:45:27,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 9631776768. Throughput: 0: 54516.3. Samples: 122149300. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:27,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 10:45:27,197][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000587878_9631793152.pth... [2024-04-28 10:45:27,204][57339] Updated weights for policy 0, policy_version 587878 (0.0030) [2024-04-28 10:45:27,242][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000587077_9618669568.pth [2024-04-28 10:45:29,691][57339] Updated weights for policy 0, policy_version 587888 (0.0029) [2024-04-28 10:45:32,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54886.5, 300 sec: 54872.5). Total num frames: 9632088064. Throughput: 0: 54489.9. Samples: 122474060. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:32,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 10:45:33,381][57339] Updated weights for policy 0, policy_version 587898 (0.0030) [2024-04-28 10:45:35,801][57339] Updated weights for policy 0, policy_version 587908 (0.0028) [2024-04-28 10:45:37,169][57108] Fps is (10 sec: 60621.7, 60 sec: 55159.5, 300 sec: 54872.5). 
Total num frames: 9632382976. Throughput: 0: 54909.1. Samples: 122645080. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:37,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 10:45:39,212][57339] Updated weights for policy 0, policy_version 587918 (0.0021) [2024-04-28 10:45:41,926][57339] Updated weights for policy 0, policy_version 587928 (0.0031) [2024-04-28 10:45:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 54761.5). Total num frames: 9632628736. Throughput: 0: 54971.2. Samples: 122973680. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:42,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 10:45:45,176][57339] Updated weights for policy 0, policy_version 587938 (0.0025) [2024-04-28 10:45:47,169][57108] Fps is (10 sec: 47513.1, 60 sec: 54340.3, 300 sec: 54650.3). Total num frames: 9632858112. Throughput: 0: 54974.3. Samples: 123302760. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:47,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 10:45:48,066][57339] Updated weights for policy 0, policy_version 587948 (0.0026) [2024-04-28 10:45:51,167][57339] Updated weights for policy 0, policy_version 587958 (0.0034) [2024-04-28 10:45:52,169][57108] Fps is (10 sec: 49152.0, 60 sec: 54340.4, 300 sec: 54650.4). Total num frames: 9633120256. Throughput: 0: 54338.5. Samples: 123454020. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:52,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 10:45:54,008][57339] Updated weights for policy 0, policy_version 587968 (0.0026) [2024-04-28 10:45:57,147][57339] Updated weights for policy 0, policy_version 587978 (0.0035) [2024-04-28 10:45:57,169][57108] Fps is (10 sec: 57344.6, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9633431552. Throughput: 0: 54405.0. Samples: 123780700. 
Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:45:57,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 10:45:59,872][57339] Updated weights for policy 0, policy_version 587988 (0.0023) [2024-04-28 10:46:02,169][57108] Fps is (10 sec: 58981.3, 60 sec: 54340.3, 300 sec: 54761.4). Total num frames: 9633710080. Throughput: 0: 54349.4. Samples: 124109540. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:46:02,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 10:46:03,222][57339] Updated weights for policy 0, policy_version 587998 (0.0025) [2024-04-28 10:46:04,650][57319] Signal inference workers to stop experience collection... (1700 times) [2024-04-28 10:46:04,650][57319] Signal inference workers to resume experience collection... (1700 times) [2024-04-28 10:46:04,681][57339] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-04-28 10:46:04,681][57339] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-04-28 10:46:06,086][57339] Updated weights for policy 0, policy_version 588008 (0.0029) [2024-04-28 10:46:07,169][57108] Fps is (10 sec: 55704.9, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9633988608. Throughput: 0: 54685.2. Samples: 124278800. Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:46:07,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 10:46:09,246][57339] Updated weights for policy 0, policy_version 588018 (0.0031) [2024-04-28 10:46:12,144][57339] Updated weights for policy 0, policy_version 588028 (0.0033) [2024-04-28 10:46:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54340.3, 300 sec: 54705.9). Total num frames: 9634250752. Throughput: 0: 54576.0. Samples: 124605220. 
Policy #0 lag: (min: 1.0, avg: 12.4, max: 21.0) [2024-04-28 10:46:12,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 10:46:15,323][57339] Updated weights for policy 0, policy_version 588038 (0.0035) [2024-04-28 10:46:17,169][57108] Fps is (10 sec: 50790.5, 60 sec: 54613.4, 300 sec: 54539.3). Total num frames: 9634496512. Throughput: 0: 54623.9. Samples: 124932140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:46:17,169][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 10:46:18,093][57339] Updated weights for policy 0, policy_version 588048 (0.0026) [2024-04-28 10:46:21,209][57339] Updated weights for policy 0, policy_version 588058 (0.0026) [2024-04-28 10:46:22,169][57108] Fps is (10 sec: 50790.3, 60 sec: 54067.1, 300 sec: 54594.8). Total num frames: 9634758656. Throughput: 0: 54204.7. Samples: 125084300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:46:22,170][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 10:46:24,128][57339] Updated weights for policy 0, policy_version 588068 (0.0033) [2024-04-28 10:46:27,080][57339] Updated weights for policy 0, policy_version 588078 (0.0029) [2024-04-28 10:46:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 54886.5, 300 sec: 54761.4). Total num frames: 9635069952. Throughput: 0: 54103.5. Samples: 125408340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:46:27,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 10:46:30,072][57339] Updated weights for policy 0, policy_version 588088 (0.0031) [2024-04-28 10:46:32,169][57108] Fps is (10 sec: 58982.5, 60 sec: 54340.2, 300 sec: 54761.5). Total num frames: 9635348480. Throughput: 0: 54170.2. Samples: 125740420. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:46:32,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 10:46:33,140][57339] Updated weights for policy 0, policy_version 588098 (0.0028) [2024-04-28 10:46:36,162][57339] Updated weights for policy 0, policy_version 588108 (0.0025) [2024-04-28 10:46:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54340.2, 300 sec: 54817.0). Total num frames: 9635643392. Throughput: 0: 54677.2. Samples: 125914500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:46:37,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 10:46:39,134][57339] Updated weights for policy 0, policy_version 588118 (0.0025) [2024-04-28 10:46:42,107][57339] Updated weights for policy 0, policy_version 588128 (0.0034) [2024-04-28 10:46:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 54340.2, 300 sec: 54705.9). Total num frames: 9635889152. Throughput: 0: 54689.2. Samples: 126241720. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:46:42,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 10:46:45,337][57339] Updated weights for policy 0, policy_version 588138 (0.0030) [2024-04-28 10:46:47,169][57108] Fps is (10 sec: 50790.9, 60 sec: 54886.5, 300 sec: 54594.8). Total num frames: 9636151296. Throughput: 0: 54702.4. Samples: 126571140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:46:47,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 10:46:48,046][57339] Updated weights for policy 0, policy_version 588148 (0.0029) [2024-04-28 10:46:51,140][57339] Updated weights for policy 0, policy_version 588158 (0.0035) [2024-04-28 10:46:52,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54886.3, 300 sec: 54539.3). Total num frames: 9636413440. Throughput: 0: 54426.2. Samples: 126727980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:46:52,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 10:46:53,951][57319] Signal inference workers to stop experience collection... 
(1750 times) [2024-04-28 10:46:53,951][57319] Signal inference workers to resume experience collection... (1750 times) [2024-04-28 10:46:53,968][57339] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-04-28 10:46:53,968][57339] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-04-28 10:46:54,099][57339] Updated weights for policy 0, policy_version 588168 (0.0031) [2024-04-28 10:46:57,080][57339] Updated weights for policy 0, policy_version 588178 (0.0030) [2024-04-28 10:46:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54613.2, 300 sec: 54650.4). Total num frames: 9636708352. Throughput: 0: 54429.8. Samples: 127054560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:46:57,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 10:46:59,869][57339] Updated weights for policy 0, policy_version 588188 (0.0028) [2024-04-28 10:47:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54340.3, 300 sec: 54705.9). Total num frames: 9636970496. Throughput: 0: 54504.8. Samples: 127384860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:02,169][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 10:47:03,509][57339] Updated weights for policy 0, policy_version 588198 (0.0032) [2024-04-28 10:47:05,929][57339] Updated weights for policy 0, policy_version 588208 (0.0032) [2024-04-28 10:47:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9637265408. Throughput: 0: 54983.5. Samples: 127558560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:07,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 10:47:09,340][57339] Updated weights for policy 0, policy_version 588218 (0.0034) [2024-04-28 10:47:11,907][57339] Updated weights for policy 0, policy_version 588228 (0.0031) [2024-04-28 10:47:12,169][57108] Fps is (10 sec: 57344.8, 60 sec: 54886.5, 300 sec: 54761.4). Total num frames: 9637543936. Throughput: 0: 55068.5. Samples: 127886420. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:12,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 10:47:15,355][57339] Updated weights for policy 0, policy_version 588238 (0.0030) [2024-04-28 10:47:17,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54886.4, 300 sec: 54594.8). Total num frames: 9637789696. Throughput: 0: 54958.2. Samples: 128213540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:17,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 10:47:17,903][57339] Updated weights for policy 0, policy_version 588248 (0.0028) [2024-04-28 10:47:21,307][57339] Updated weights for policy 0, policy_version 588258 (0.0028) [2024-04-28 10:47:22,169][57108] Fps is (10 sec: 50790.0, 60 sec: 54886.4, 300 sec: 54483.7). Total num frames: 9638051840. Throughput: 0: 54533.8. Samples: 128368520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:22,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 10:47:23,775][57339] Updated weights for policy 0, policy_version 588268 (0.0027) [2024-04-28 10:47:27,108][57339] Updated weights for policy 0, policy_version 588278 (0.0024) [2024-04-28 10:47:27,169][57108] Fps is (10 sec: 55704.9, 60 sec: 54613.1, 300 sec: 54650.3). Total num frames: 9638346752. Throughput: 0: 54627.8. Samples: 128699980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:27,170][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 10:47:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000588278_9638346752.pth... [2024-04-28 10:47:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000587478_9625239552.pth [2024-04-28 10:47:29,663][57339] Updated weights for policy 0, policy_version 588288 (0.0028) [2024-04-28 10:47:32,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9638625280. Throughput: 0: 54702.5. Samples: 129032760. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:32,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 10:47:33,099][57339] Updated weights for policy 0, policy_version 588298 (0.0029) [2024-04-28 10:47:35,626][57339] Updated weights for policy 0, policy_version 588308 (0.0039) [2024-04-28 10:47:37,169][57108] Fps is (10 sec: 55706.5, 60 sec: 54340.3, 300 sec: 54705.9). Total num frames: 9638903808. Throughput: 0: 54839.6. Samples: 129195760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:37,170][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 10:47:39,193][57339] Updated weights for policy 0, policy_version 588318 (0.0035) [2024-04-28 10:47:41,558][57339] Updated weights for policy 0, policy_version 588328 (0.0027) [2024-04-28 10:47:42,169][57108] Fps is (10 sec: 55706.5, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9639182336. Throughput: 0: 54818.8. Samples: 129521400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:42,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 10:47:45,218][57339] Updated weights for policy 0, policy_version 588338 (0.0036) [2024-04-28 10:47:47,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.3, 300 sec: 54761.4). Total num frames: 9639444480. Throughput: 0: 54778.8. Samples: 129849900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-04-28 10:47:47,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 10:47:47,645][57339] Updated weights for policy 0, policy_version 588348 (0.0024) [2024-04-28 10:47:51,155][57339] Updated weights for policy 0, policy_version 588358 (0.0036) [2024-04-28 10:47:52,169][57108] Fps is (10 sec: 50790.5, 60 sec: 54613.5, 300 sec: 54539.3). Total num frames: 9639690240. Throughput: 0: 54587.8. Samples: 130015000. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:47:52,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 10:47:53,462][57339] Updated weights for policy 0, policy_version 588368 (0.0030) [2024-04-28 10:47:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54613.4, 300 sec: 54705.9). Total num frames: 9639985152. Throughput: 0: 54613.3. Samples: 130344020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:47:57,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 10:47:57,174][57339] Updated weights for policy 0, policy_version 588378 (0.0028) [2024-04-28 10:47:59,046][57319] Signal inference workers to stop experience collection... (1800 times) [2024-04-28 10:47:59,047][57319] Signal inference workers to resume experience collection... (1800 times) [2024-04-28 10:47:59,070][57339] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-04-28 10:47:59,070][57339] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-04-28 10:47:59,484][57339] Updated weights for policy 0, policy_version 588388 (0.0036) [2024-04-28 10:48:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54613.5, 300 sec: 54650.4). Total num frames: 9640247296. Throughput: 0: 54691.3. Samples: 130674640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:02,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 10:48:03,227][57339] Updated weights for policy 0, policy_version 588398 (0.0028) [2024-04-28 10:48:05,387][57339] Updated weights for policy 0, policy_version 588408 (0.0031) [2024-04-28 10:48:07,169][57108] Fps is (10 sec: 55704.5, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9640542208. Throughput: 0: 54887.9. Samples: 130838480. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:07,170][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 10:48:09,085][57339] Updated weights for policy 0, policy_version 588418 (0.0029) [2024-04-28 10:48:11,237][57339] Updated weights for policy 0, policy_version 588428 (0.0028) [2024-04-28 10:48:12,169][57108] Fps is (10 sec: 57343.8, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9640820736. Throughput: 0: 54844.2. Samples: 131167960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:12,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 10:48:15,132][57339] Updated weights for policy 0, policy_version 588438 (0.0026) [2024-04-28 10:48:17,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.4, 300 sec: 54872.5). Total num frames: 9641115648. Throughput: 0: 54658.1. Samples: 131492380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:17,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 10:48:17,307][57339] Updated weights for policy 0, policy_version 588448 (0.0029) [2024-04-28 10:48:21,106][57339] Updated weights for policy 0, policy_version 588458 (0.0031) [2024-04-28 10:48:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 54705.9). Total num frames: 9641361408. Throughput: 0: 54828.5. Samples: 131663040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:22,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 10:48:23,372][57339] Updated weights for policy 0, policy_version 588468 (0.0031) [2024-04-28 10:48:27,169][57108] Fps is (10 sec: 49152.5, 60 sec: 54340.4, 300 sec: 54539.3). Total num frames: 9641607168. Throughput: 0: 54973.2. Samples: 131995200. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 10:48:27,272][57339] Updated weights for policy 0, policy_version 588478 (0.0034) [2024-04-28 10:48:29,521][57339] Updated weights for policy 0, policy_version 588488 (0.0026) [2024-04-28 10:48:32,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54340.4, 300 sec: 54650.4). Total num frames: 9641885696. Throughput: 0: 54854.7. Samples: 132318360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:32,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 10:48:33,236][57339] Updated weights for policy 0, policy_version 588498 (0.0028) [2024-04-28 10:48:35,285][57339] Updated weights for policy 0, policy_version 588508 (0.0027) [2024-04-28 10:48:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54340.3, 300 sec: 54705.9). Total num frames: 9642164224. Throughput: 0: 54741.2. Samples: 132478360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:37,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 10:48:39,067][57339] Updated weights for policy 0, policy_version 588518 (0.0036) [2024-04-28 10:48:41,255][57339] Updated weights for policy 0, policy_version 588528 (0.0030) [2024-04-28 10:48:42,169][57108] Fps is (10 sec: 57343.5, 60 sec: 54613.2, 300 sec: 54761.4). Total num frames: 9642459136. Throughput: 0: 54700.7. Samples: 132805560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:42,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 10:48:45,052][57339] Updated weights for policy 0, policy_version 588538 (0.0032) [2024-04-28 10:48:47,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9642754048. Throughput: 0: 54665.8. Samples: 133134600. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:47,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 10:48:47,276][57339] Updated weights for policy 0, policy_version 588548 (0.0031) [2024-04-28 10:48:51,031][57339] Updated weights for policy 0, policy_version 588558 (0.0026) [2024-04-28 10:48:52,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55432.5, 300 sec: 54761.4). Total num frames: 9643016192. Throughput: 0: 54807.8. Samples: 133304820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:52,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 10:48:52,604][57319] Signal inference workers to stop experience collection... (1850 times) [2024-04-28 10:48:52,605][57319] Signal inference workers to resume experience collection... (1850 times) [2024-04-28 10:48:52,630][57339] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-04-28 10:48:52,631][57339] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-04-28 10:48:53,477][57339] Updated weights for policy 0, policy_version 588568 (0.0032) [2024-04-28 10:48:57,028][57339] Updated weights for policy 0, policy_version 588578 (0.0039) [2024-04-28 10:48:57,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54886.4, 300 sec: 54650.4). Total num frames: 9643278336. Throughput: 0: 54863.6. Samples: 133636820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:48:57,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 10:48:59,449][57339] Updated weights for policy 0, policy_version 588588 (0.0030) [2024-04-28 10:49:02,169][57108] Fps is (10 sec: 50790.3, 60 sec: 54613.3, 300 sec: 54594.8). Total num frames: 9643524096. Throughput: 0: 54857.1. Samples: 133960940. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:49:02,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 10:49:02,989][57339] Updated weights for policy 0, policy_version 588598 (0.0029) [2024-04-28 10:49:05,366][57339] Updated weights for policy 0, policy_version 588608 (0.0026) [2024-04-28 10:49:07,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54340.3, 300 sec: 54650.4). Total num frames: 9643802624. Throughput: 0: 54503.5. Samples: 134115700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:49:07,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 10:49:08,970][57339] Updated weights for policy 0, policy_version 588618 (0.0033) [2024-04-28 10:49:11,510][57339] Updated weights for policy 0, policy_version 588628 (0.0033) [2024-04-28 10:49:12,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9644097536. Throughput: 0: 54552.1. Samples: 134450040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:49:12,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 10:49:14,901][57339] Updated weights for policy 0, policy_version 588638 (0.0026) [2024-04-28 10:49:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54340.4, 300 sec: 54817.0). Total num frames: 9644376064. Throughput: 0: 54545.3. Samples: 134772900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 10:49:17,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 10:49:17,572][57339] Updated weights for policy 0, policy_version 588648 (0.0032) [2024-04-28 10:49:20,865][57339] Updated weights for policy 0, policy_version 588658 (0.0041) [2024-04-28 10:49:22,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9644654592. Throughput: 0: 54885.8. Samples: 134948220. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:49:22,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 10:49:23,546][57339] Updated weights for policy 0, policy_version 588668 (0.0026) [2024-04-28 10:49:26,870][57339] Updated weights for policy 0, policy_version 588678 (0.0027) [2024-04-28 10:49:27,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55159.6, 300 sec: 54650.4). Total num frames: 9644916736. Throughput: 0: 54915.3. Samples: 135276740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:49:27,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 10:49:27,344][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000588681_9644949504.pth... [2024-04-28 10:49:27,390][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000587878_9631793152.pth [2024-04-28 10:49:29,406][57339] Updated weights for policy 0, policy_version 588688 (0.0031) [2024-04-28 10:49:32,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54886.4, 300 sec: 54594.8). Total num frames: 9645178880. Throughput: 0: 54816.8. Samples: 135601360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:49:32,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 10:49:32,846][57339] Updated weights for policy 0, policy_version 588698 (0.0035) [2024-04-28 10:49:35,506][57339] Updated weights for policy 0, policy_version 588708 (0.0032) [2024-04-28 10:49:37,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54613.3, 300 sec: 54594.8). Total num frames: 9645441024. Throughput: 0: 54381.6. Samples: 135752000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:49:37,170][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 10:49:38,941][57339] Updated weights for policy 0, policy_version 588718 (0.0024) [2024-04-28 10:49:41,672][57339] Updated weights for policy 0, policy_version 588728 (0.0028) [2024-04-28 10:49:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54613.5, 300 sec: 54705.9). 
Total num frames: 9645735936. Throughput: 0: 54299.6. Samples: 136080300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:49:42,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 10:49:44,836][57339] Updated weights for policy 0, policy_version 588738 (0.0031) [2024-04-28 10:49:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54340.2, 300 sec: 54761.4). Total num frames: 9646014464. Throughput: 0: 54413.2. Samples: 136409540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:49:47,170][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 10:49:47,520][57339] Updated weights for policy 0, policy_version 588748 (0.0026) [2024-04-28 10:49:50,759][57339] Updated weights for policy 0, policy_version 588758 (0.0031) [2024-04-28 10:49:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9646309376. Throughput: 0: 54935.1. Samples: 136587780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:49:52,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 10:49:53,839][57339] Updated weights for policy 0, policy_version 588768 (0.0032) [2024-04-28 10:49:56,653][57339] Updated weights for policy 0, policy_version 588778 (0.0029) [2024-04-28 10:49:57,169][57108] Fps is (10 sec: 54067.7, 60 sec: 54613.3, 300 sec: 54594.8). Total num frames: 9646555136. Throughput: 0: 54811.1. Samples: 136916540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:49:57,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 10:49:57,245][57319] Signal inference workers to stop experience collection... (1900 times) [2024-04-28 10:49:57,246][57319] Signal inference workers to resume experience collection... 
(1900 times) [2024-04-28 10:49:57,256][57339] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-04-28 10:49:57,267][57339] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-04-28 10:49:59,821][57339] Updated weights for policy 0, policy_version 588788 (0.0033) [2024-04-28 10:50:02,169][57108] Fps is (10 sec: 50790.0, 60 sec: 54886.3, 300 sec: 54594.8). Total num frames: 9646817280. Throughput: 0: 54873.3. Samples: 137242200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:02,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 10:50:02,699][57339] Updated weights for policy 0, policy_version 588798 (0.0039) [2024-04-28 10:50:05,867][57339] Updated weights for policy 0, policy_version 588808 (0.0028) [2024-04-28 10:50:07,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54613.3, 300 sec: 54539.3). Total num frames: 9647079424. Throughput: 0: 54455.0. Samples: 137398700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:07,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 10:50:08,693][57339] Updated weights for policy 0, policy_version 588818 (0.0029) [2024-04-28 10:50:11,954][57339] Updated weights for policy 0, policy_version 588828 (0.0033) [2024-04-28 10:50:12,169][57108] Fps is (10 sec: 54068.1, 60 sec: 54340.3, 300 sec: 54705.9). Total num frames: 9647357952. Throughput: 0: 54396.0. Samples: 137724560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:12,169][57108] Avg episode reward: [(0, '0.501')] [2024-04-28 10:50:14,685][57339] Updated weights for policy 0, policy_version 588838 (0.0028) [2024-04-28 10:50:17,169][57108] Fps is (10 sec: 57343.5, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9647652864. Throughput: 0: 54387.5. Samples: 138048800. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:17,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 10:50:17,871][57339] Updated weights for policy 0, policy_version 588848 (0.0031) [2024-04-28 10:50:20,703][57339] Updated weights for policy 0, policy_version 588858 (0.0035) [2024-04-28 10:50:22,169][57108] Fps is (10 sec: 58982.5, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9647947776. Throughput: 0: 54982.4. Samples: 138226200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:22,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 10:50:23,763][57339] Updated weights for policy 0, policy_version 588868 (0.0026) [2024-04-28 10:50:26,684][57339] Updated weights for policy 0, policy_version 588878 (0.0033) [2024-04-28 10:50:27,169][57108] Fps is (10 sec: 54067.8, 60 sec: 54613.3, 300 sec: 54594.8). Total num frames: 9648193536. Throughput: 0: 54904.4. Samples: 138551000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:27,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 10:50:29,882][57339] Updated weights for policy 0, policy_version 588888 (0.0026) [2024-04-28 10:50:32,169][57108] Fps is (10 sec: 49151.8, 60 sec: 54340.3, 300 sec: 54428.2). Total num frames: 9648439296. Throughput: 0: 54747.7. Samples: 138873180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:32,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 10:50:32,697][57339] Updated weights for policy 0, policy_version 588898 (0.0026) [2024-04-28 10:50:36,114][57339] Updated weights for policy 0, policy_version 588908 (0.0029) [2024-04-28 10:50:37,169][57108] Fps is (10 sec: 52427.9, 60 sec: 54613.2, 300 sec: 54539.2). Total num frames: 9648717824. Throughput: 0: 54221.6. Samples: 139027760. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:37,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 10:50:38,770][57339] Updated weights for policy 0, policy_version 588918 (0.0035) [2024-04-28 10:50:42,117][57339] Updated weights for policy 0, policy_version 588928 (0.0029) [2024-04-28 10:50:42,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54340.2, 300 sec: 54705.9). Total num frames: 9648996352. Throughput: 0: 54211.9. Samples: 139356080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:42,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 10:50:44,327][57319] Signal inference workers to stop experience collection... (1950 times) [2024-04-28 10:50:44,377][57319] Signal inference workers to resume experience collection... (1950 times) [2024-04-28 10:50:44,377][57339] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-04-28 10:50:44,402][57339] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-04-28 10:50:44,636][57339] Updated weights for policy 0, policy_version 588938 (0.0036) [2024-04-28 10:50:47,169][57108] Fps is (10 sec: 55706.7, 60 sec: 54340.3, 300 sec: 54761.4). Total num frames: 9649274880. Throughput: 0: 54314.4. Samples: 139686340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 10:50:48,152][57339] Updated weights for policy 0, policy_version 588948 (0.0033) [2024-04-28 10:50:50,697][57339] Updated weights for policy 0, policy_version 588958 (0.0029) [2024-04-28 10:50:52,169][57108] Fps is (10 sec: 58982.3, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9649586176. Throughput: 0: 54602.2. Samples: 139855800. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 10:50:52,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 10:50:54,223][57339] Updated weights for policy 0, policy_version 588968 (0.0035) [2024-04-28 10:50:56,790][57339] Updated weights for policy 0, policy_version 588978 (0.0028) [2024-04-28 10:50:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54613.4, 300 sec: 54650.4). Total num frames: 9649831936. Throughput: 0: 54690.2. Samples: 140185620. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:50:57,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 10:51:00,135][57339] Updated weights for policy 0, policy_version 588988 (0.0028) [2024-04-28 10:51:02,169][57108] Fps is (10 sec: 50790.8, 60 sec: 54613.4, 300 sec: 54594.8). Total num frames: 9650094080. Throughput: 0: 54792.2. Samples: 140514440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:02,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 10:51:02,699][57339] Updated weights for policy 0, policy_version 588998 (0.0038) [2024-04-28 10:51:05,979][57339] Updated weights for policy 0, policy_version 589008 (0.0028) [2024-04-28 10:51:07,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.5, 300 sec: 54650.4). Total num frames: 9650372608. Throughput: 0: 54396.0. Samples: 140674020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:07,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 10:51:08,767][57339] Updated weights for policy 0, policy_version 589018 (0.0029) [2024-04-28 10:51:11,872][57339] Updated weights for policy 0, policy_version 589028 (0.0031) [2024-04-28 10:51:12,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54613.2, 300 sec: 54705.9). Total num frames: 9650634752. Throughput: 0: 54443.5. Samples: 141000960. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:12,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 10:51:14,812][57339] Updated weights for policy 0, policy_version 589038 (0.0032) [2024-04-28 10:51:17,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54340.3, 300 sec: 54761.4). Total num frames: 9650913280. Throughput: 0: 54687.9. Samples: 141334140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:17,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 10:51:17,890][57339] Updated weights for policy 0, policy_version 589048 (0.0034) [2024-04-28 10:51:20,935][57339] Updated weights for policy 0, policy_version 589058 (0.0035) [2024-04-28 10:51:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54340.2, 300 sec: 54705.9). Total num frames: 9651208192. Throughput: 0: 54977.9. Samples: 141501760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:22,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 10:51:24,033][57339] Updated weights for policy 0, policy_version 589068 (0.0027) [2024-04-28 10:51:26,751][57339] Updated weights for policy 0, policy_version 589078 (0.0032) [2024-04-28 10:51:27,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54613.3, 300 sec: 54650.4). Total num frames: 9651470336. Throughput: 0: 54977.4. Samples: 141830060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:27,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 10:51:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000589080_9651486720.pth... [2024-04-28 10:51:27,243][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000588278_9638346752.pth [2024-04-28 10:51:29,903][57339] Updated weights for policy 0, policy_version 589088 (0.0026) [2024-04-28 10:51:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 54594.8). Total num frames: 9651748864. Throughput: 0: 54934.6. Samples: 142158400. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:32,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 10:51:32,673][57339] Updated weights for policy 0, policy_version 589098 (0.0029) [2024-04-28 10:51:36,015][57339] Updated weights for policy 0, policy_version 589108 (0.0029) [2024-04-28 10:51:37,019][57319] Signal inference workers to stop experience collection... (2000 times) [2024-04-28 10:51:37,020][57319] Signal inference workers to resume experience collection... (2000 times) [2024-04-28 10:51:37,037][57339] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-04-28 10:51:37,043][57339] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-04-28 10:51:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.6, 300 sec: 54705.9). Total num frames: 9652027392. Throughput: 0: 54714.7. Samples: 142317960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:37,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 10:51:38,634][57339] Updated weights for policy 0, policy_version 589118 (0.0035) [2024-04-28 10:51:41,939][57339] Updated weights for policy 0, policy_version 589128 (0.0034) [2024-04-28 10:51:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54886.3, 300 sec: 54705.9). Total num frames: 9652289536. Throughput: 0: 54712.7. Samples: 142647700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:42,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 10:51:44,556][57339] Updated weights for policy 0, policy_version 589138 (0.0034) [2024-04-28 10:51:47,169][57108] Fps is (10 sec: 50790.8, 60 sec: 54340.3, 300 sec: 54650.4). Total num frames: 9652535296. Throughput: 0: 54593.8. Samples: 142971160. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:47,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 10:51:48,046][57339] Updated weights for policy 0, policy_version 589148 (0.0028) [2024-04-28 10:51:50,680][57339] Updated weights for policy 0, policy_version 589158 (0.0029) [2024-04-28 10:51:52,169][57108] Fps is (10 sec: 57344.9, 60 sec: 54613.4, 300 sec: 54761.4). Total num frames: 9652862976. Throughput: 0: 54806.2. Samples: 143140300. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:52,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 10:51:54,091][57339] Updated weights for policy 0, policy_version 589168 (0.0041) [2024-04-28 10:51:56,766][57339] Updated weights for policy 0, policy_version 589178 (0.0030) [2024-04-28 10:51:57,169][57108] Fps is (10 sec: 58982.5, 60 sec: 54886.4, 300 sec: 54761.5). Total num frames: 9653125120. Throughput: 0: 54826.8. Samples: 143468160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:51:57,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 10:51:59,965][57339] Updated weights for policy 0, policy_version 589188 (0.0030) [2024-04-28 10:52:02,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54886.3, 300 sec: 54650.4). Total num frames: 9653387264. Throughput: 0: 54645.8. Samples: 143793200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:52:02,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 10:52:02,675][57339] Updated weights for policy 0, policy_version 589198 (0.0034) [2024-04-28 10:52:05,833][57339] Updated weights for policy 0, policy_version 589208 (0.0026) [2024-04-28 10:52:07,169][57108] Fps is (10 sec: 52427.4, 60 sec: 54613.1, 300 sec: 54594.8). Total num frames: 9653649408. Throughput: 0: 54548.7. Samples: 143956460. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:52:07,170][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 10:52:08,590][57339] Updated weights for policy 0, policy_version 589218 (0.0028) [2024-04-28 10:52:11,898][57339] Updated weights for policy 0, policy_version 589228 (0.0029) [2024-04-28 10:52:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9653927936. Throughput: 0: 54573.8. Samples: 144285880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:52:12,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 10:52:14,582][57339] Updated weights for policy 0, policy_version 589238 (0.0031) [2024-04-28 10:52:17,169][57108] Fps is (10 sec: 52429.9, 60 sec: 54340.3, 300 sec: 54650.4). Total num frames: 9654173696. Throughput: 0: 54558.3. Samples: 144613520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:52:17,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 10:52:18,081][57339] Updated weights for policy 0, policy_version 589248 (0.0029) [2024-04-28 10:52:20,614][57339] Updated weights for policy 0, policy_version 589258 (0.0032) [2024-04-28 10:52:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54340.3, 300 sec: 54650.4). Total num frames: 9654468608. Throughput: 0: 54613.3. Samples: 144775560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:52:22,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 10:52:23,954][57339] Updated weights for policy 0, policy_version 589268 (0.0035) [2024-04-28 10:52:26,536][57339] Updated weights for policy 0, policy_version 589278 (0.0037) [2024-04-28 10:52:27,169][57108] Fps is (10 sec: 60619.7, 60 sec: 55159.3, 300 sec: 54761.4). Total num frames: 9654779904. Throughput: 0: 54514.2. Samples: 145100840. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-04-28 10:52:27,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 10:52:29,865][57339] Updated weights for policy 0, policy_version 589288 (0.0028) [2024-04-28 10:52:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54613.3, 300 sec: 54650.4). Total num frames: 9655025664. Throughput: 0: 54571.9. Samples: 145426900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:52:32,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 10:52:32,450][57319] Signal inference workers to stop experience collection... (2050 times) [2024-04-28 10:52:32,450][57319] Signal inference workers to resume experience collection... (2050 times) [2024-04-28 10:52:32,465][57339] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-04-28 10:52:32,465][57339] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-04-28 10:52:32,567][57339] Updated weights for policy 0, policy_version 589298 (0.0034) [2024-04-28 10:52:35,889][57339] Updated weights for policy 0, policy_version 589308 (0.0041) [2024-04-28 10:52:37,169][57108] Fps is (10 sec: 49152.9, 60 sec: 54067.3, 300 sec: 54539.3). Total num frames: 9655271424. Throughput: 0: 54428.0. Samples: 145589560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:52:37,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 10:52:38,604][57339] Updated weights for policy 0, policy_version 589318 (0.0037) [2024-04-28 10:52:41,815][57339] Updated weights for policy 0, policy_version 589328 (0.0035) [2024-04-28 10:52:42,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54613.5, 300 sec: 54650.4). Total num frames: 9655566336. Throughput: 0: 54444.4. Samples: 145918160. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:52:42,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 10:52:44,480][57339] Updated weights for policy 0, policy_version 589338 (0.0033) [2024-04-28 10:52:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9655828480. Throughput: 0: 54621.5. Samples: 146251160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:52:47,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 10:52:47,615][57339] Updated weights for policy 0, policy_version 589348 (0.0030) [2024-04-28 10:52:50,481][57339] Updated weights for policy 0, policy_version 589358 (0.0027) [2024-04-28 10:52:52,169][57108] Fps is (10 sec: 52428.4, 60 sec: 53794.1, 300 sec: 54594.8). Total num frames: 9656090624. Throughput: 0: 54490.9. Samples: 146408540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:52:52,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 10:52:53,719][57339] Updated weights for policy 0, policy_version 589368 (0.0030) [2024-04-28 10:52:56,509][57339] Updated weights for policy 0, policy_version 589378 (0.0031) [2024-04-28 10:52:57,169][57108] Fps is (10 sec: 57343.1, 60 sec: 54613.2, 300 sec: 54761.4). Total num frames: 9656401920. Throughput: 0: 54457.2. Samples: 146736460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:52:57,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 10:52:59,746][57339] Updated weights for policy 0, policy_version 589388 (0.0033) [2024-04-28 10:53:02,169][57108] Fps is (10 sec: 57344.2, 60 sec: 54613.4, 300 sec: 54650.4). Total num frames: 9656664064. Throughput: 0: 54459.1. Samples: 147064180. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:02,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 10:53:02,359][57339] Updated weights for policy 0, policy_version 589398 (0.0031) [2024-04-28 10:53:05,742][57339] Updated weights for policy 0, policy_version 589408 (0.0035) [2024-04-28 10:53:07,169][57108] Fps is (10 sec: 50791.0, 60 sec: 54340.5, 300 sec: 54539.3). Total num frames: 9656909824. Throughput: 0: 54713.4. Samples: 147237660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:07,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 10:53:08,211][57339] Updated weights for policy 0, policy_version 589418 (0.0029) [2024-04-28 10:53:11,757][57339] Updated weights for policy 0, policy_version 589428 (0.0030) [2024-04-28 10:53:12,169][57108] Fps is (10 sec: 52428.1, 60 sec: 54340.1, 300 sec: 54483.7). Total num frames: 9657188352. Throughput: 0: 54808.0. Samples: 147567200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:12,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 10:53:14,260][57339] Updated weights for policy 0, policy_version 589438 (0.0031) [2024-04-28 10:53:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.5, 300 sec: 54650.4). Total num frames: 9657483264. Throughput: 0: 54819.7. Samples: 147893780. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:17,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 10:53:17,586][57339] Updated weights for policy 0, policy_version 589448 (0.0029) [2024-04-28 10:53:20,331][57339] Updated weights for policy 0, policy_version 589458 (0.0031) [2024-04-28 10:53:22,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9657745408. Throughput: 0: 54768.7. Samples: 148054160. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:22,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 10:53:23,688][57339] Updated weights for policy 0, policy_version 589468 (0.0033) [2024-04-28 10:53:26,271][57339] Updated weights for policy 0, policy_version 589478 (0.0036) [2024-04-28 10:53:27,169][57108] Fps is (10 sec: 54066.4, 60 sec: 54067.3, 300 sec: 54705.9). Total num frames: 9658023936. Throughput: 0: 54755.0. Samples: 148382140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:27,170][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 10:53:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000589479_9658023936.pth... [2024-04-28 10:53:27,238][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000588681_9644949504.pth [2024-04-28 10:53:29,844][57339] Updated weights for policy 0, policy_version 589488 (0.0030) [2024-04-28 10:53:31,592][57319] Signal inference workers to stop experience collection... (2100 times) [2024-04-28 10:53:31,631][57339] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-04-28 10:53:31,642][57319] Signal inference workers to resume experience collection... (2100 times) [2024-04-28 10:53:31,650][57339] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-04-28 10:53:32,169][57108] Fps is (10 sec: 57345.2, 60 sec: 54886.5, 300 sec: 54761.5). Total num frames: 9658318848. Throughput: 0: 54639.6. Samples: 148709940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:32,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 10:53:32,349][57339] Updated weights for policy 0, policy_version 589498 (0.0030) [2024-04-28 10:53:35,735][57339] Updated weights for policy 0, policy_version 589508 (0.0030) [2024-04-28 10:53:37,169][57108] Fps is (10 sec: 54068.1, 60 sec: 54886.5, 300 sec: 54594.9). Total num frames: 9658564608. Throughput: 0: 54853.5. Samples: 148876940. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:37,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 10:53:38,303][57339] Updated weights for policy 0, policy_version 589518 (0.0032) [2024-04-28 10:53:42,092][57339] Updated weights for policy 0, policy_version 589528 (0.0025) [2024-04-28 10:53:42,169][57108] Fps is (10 sec: 50790.4, 60 sec: 54340.3, 300 sec: 54483.7). Total num frames: 9658826752. Throughput: 0: 54813.6. Samples: 149203060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:42,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 10:53:44,245][57339] Updated weights for policy 0, policy_version 589538 (0.0033) [2024-04-28 10:53:47,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54886.3, 300 sec: 54594.8). Total num frames: 9659121664. Throughput: 0: 54920.0. Samples: 149535580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:47,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 10:53:47,904][57339] Updated weights for policy 0, policy_version 589548 (0.0030) [2024-04-28 10:53:50,030][57339] Updated weights for policy 0, policy_version 589558 (0.0028) [2024-04-28 10:53:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.5, 300 sec: 54594.8). Total num frames: 9659383808. Throughput: 0: 54643.2. Samples: 149696600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:52,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 10:53:53,683][57339] Updated weights for policy 0, policy_version 589568 (0.0030) [2024-04-28 10:53:56,112][57339] Updated weights for policy 0, policy_version 589578 (0.0029) [2024-04-28 10:53:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54340.4, 300 sec: 54705.9). Total num frames: 9659662336. Throughput: 0: 54612.2. Samples: 150024740. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-04-28 10:53:57,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:53:59,867][57339] Updated weights for policy 0, policy_version 589588 (0.0029) [2024-04-28 10:54:01,948][57339] Updated weights for policy 0, policy_version 589598 (0.0034) [2024-04-28 10:54:02,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9659973632. Throughput: 0: 54557.3. Samples: 150348860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:02,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 10:54:05,807][57339] Updated weights for policy 0, policy_version 589608 (0.0028) [2024-04-28 10:54:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 54650.4). Total num frames: 9660219392. Throughput: 0: 54745.5. Samples: 150517700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:07,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 10:54:08,028][57339] Updated weights for policy 0, policy_version 589618 (0.0026) [2024-04-28 10:54:11,625][57339] Updated weights for policy 0, policy_version 589628 (0.0031) [2024-04-28 10:54:12,169][57108] Fps is (10 sec: 50790.2, 60 sec: 54886.5, 300 sec: 54594.8). Total num frames: 9660481536. Throughput: 0: 54793.8. Samples: 150847860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:12,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 10:54:14,327][57339] Updated weights for policy 0, policy_version 589638 (0.0026) [2024-04-28 10:54:17,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54613.2, 300 sec: 54594.8). Total num frames: 9660760064. Throughput: 0: 54997.5. Samples: 151184840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:17,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 10:54:17,436][57339] Updated weights for policy 0, policy_version 589648 (0.0038) [2024-04-28 10:54:20,242][57339] Updated weights for policy 0, policy_version 589658 (0.0031) [2024-04-28 10:54:22,169][57108] Fps is (10 sec: 55706.3, 60 sec: 54886.6, 300 sec: 54650.4). Total num frames: 9661038592. Throughput: 0: 54801.8. Samples: 151343020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:22,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 10:54:23,914][57339] Updated weights for policy 0, policy_version 589668 (0.0029) [2024-04-28 10:54:26,290][57339] Updated weights for policy 0, policy_version 589678 (0.0028) [2024-04-28 10:54:27,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9661317120. Throughput: 0: 54769.2. Samples: 151667680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:27,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 10:54:29,886][57339] Updated weights for policy 0, policy_version 589688 (0.0028) [2024-04-28 10:54:32,080][57339] Updated weights for policy 0, policy_version 589698 (0.0031) [2024-04-28 10:54:32,169][57108] Fps is (10 sec: 57343.1, 60 sec: 54886.2, 300 sec: 54817.0). Total num frames: 9661612032. Throughput: 0: 54727.1. Samples: 151998300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:32,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 10:54:35,685][57339] Updated weights for policy 0, policy_version 589708 (0.0033) [2024-04-28 10:54:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.4, 300 sec: 54705.9). Total num frames: 9661874176. Throughput: 0: 54967.5. Samples: 152170140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:37,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 10:54:38,250][57339] Updated weights for policy 0, policy_version 589718 (0.0033) [2024-04-28 10:54:41,668][57319] Signal inference workers to stop experience collection... (2150 times) [2024-04-28 10:54:41,700][57339] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-04-28 10:54:41,728][57319] Signal inference workers to resume experience collection... (2150 times) [2024-04-28 10:54:41,728][57339] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-04-28 10:54:41,860][57339] Updated weights for policy 0, policy_version 589728 (0.0035) [2024-04-28 10:54:42,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.4, 300 sec: 54650.4). Total num frames: 9662136320. Throughput: 0: 54984.4. Samples: 152499040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:42,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 10:54:44,067][57339] Updated weights for policy 0, policy_version 589738 (0.0030) [2024-04-28 10:54:47,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54613.4, 300 sec: 54539.3). Total num frames: 9662398464. Throughput: 0: 55140.1. Samples: 152830160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:47,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 10:54:47,718][57339] Updated weights for policy 0, policy_version 589748 (0.0027) [2024-04-28 10:54:49,857][57339] Updated weights for policy 0, policy_version 589758 (0.0028) [2024-04-28 10:54:52,169][57108] Fps is (10 sec: 54066.7, 60 sec: 54886.2, 300 sec: 54650.3). Total num frames: 9662676992. Throughput: 0: 54933.6. Samples: 152989720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:52,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 10:54:53,593][57339] Updated weights for policy 0, policy_version 589768 (0.0028) [2024-04-28 10:54:55,797][57339] Updated weights for policy 0, policy_version 589778 (0.0035) [2024-04-28 10:54:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55159.5, 300 sec: 54761.5). Total num frames: 9662971904. Throughput: 0: 54872.6. Samples: 153317120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:54:57,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 10:54:59,609][57339] Updated weights for policy 0, policy_version 589788 (0.0026) [2024-04-28 10:55:02,089][57339] Updated weights for policy 0, policy_version 589798 (0.0028) [2024-04-28 10:55:02,169][57108] Fps is (10 sec: 57344.7, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9663250432. Throughput: 0: 54787.7. Samples: 153650280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:55:02,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 10:55:05,637][57339] Updated weights for policy 0, policy_version 589808 (0.0034) [2024-04-28 10:55:07,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 54872.5). Total num frames: 9663545344. Throughput: 0: 55151.1. Samples: 153824820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:55:07,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 10:55:07,999][57339] Updated weights for policy 0, policy_version 589818 (0.0026) [2024-04-28 10:55:11,478][57339] Updated weights for policy 0, policy_version 589828 (0.0030) [2024-04-28 10:55:12,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 54705.9). Total num frames: 9663791104. Throughput: 0: 55309.4. Samples: 154156600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:55:12,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 10:55:13,943][57339] Updated weights for policy 0, policy_version 589838 (0.0030) [2024-04-28 10:55:17,169][57108] Fps is (10 sec: 49151.8, 60 sec: 54613.4, 300 sec: 54539.3). Total num frames: 9664036864. Throughput: 0: 55317.0. Samples: 154487560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:55:17,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 10:55:17,429][57339] Updated weights for policy 0, policy_version 589848 (0.0030) [2024-04-28 10:55:19,729][57339] Updated weights for policy 0, policy_version 589858 (0.0030) [2024-04-28 10:55:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54886.3, 300 sec: 54705.9). Total num frames: 9664331776. Throughput: 0: 54852.7. Samples: 154638520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:55:22,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 10:55:23,441][57339] Updated weights for policy 0, policy_version 589868 (0.0036) [2024-04-28 10:55:25,665][57339] Updated weights for policy 0, policy_version 589878 (0.0033) [2024-04-28 10:55:27,169][57108] Fps is (10 sec: 58980.8, 60 sec: 55159.3, 300 sec: 54872.5). Total num frames: 9664626688. Throughput: 0: 54786.4. Samples: 154964440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:55:27,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 10:55:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000589882_9664626688.pth... [2024-04-28 10:55:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000589080_9651486720.pth [2024-04-28 10:55:29,561][57339] Updated weights for policy 0, policy_version 589888 (0.0029) [2024-04-28 10:55:31,724][57339] Updated weights for policy 0, policy_version 589898 (0.0034) [2024-04-28 10:55:32,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54886.4, 300 sec: 54872.5). 
Total num frames: 9664905216. Throughput: 0: 54723.4. Samples: 155292720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 10:55:32,170][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 10:55:35,406][57339] Updated weights for policy 0, policy_version 589908 (0.0031) [2024-04-28 10:55:37,169][57108] Fps is (10 sec: 55707.1, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9665183744. Throughput: 0: 55195.2. Samples: 155473500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:55:37,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 10:55:37,696][57339] Updated weights for policy 0, policy_version 589918 (0.0033) [2024-04-28 10:55:41,346][57339] Updated weights for policy 0, policy_version 589928 (0.0035) [2024-04-28 10:55:42,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9665445888. Throughput: 0: 55231.0. Samples: 155802520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:55:42,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 10:55:43,571][57339] Updated weights for policy 0, policy_version 589938 (0.0031) [2024-04-28 10:55:47,169][57108] Fps is (10 sec: 50790.6, 60 sec: 54886.4, 300 sec: 54594.8). Total num frames: 9665691648. Throughput: 0: 55093.8. Samples: 156129500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:55:47,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 10:55:47,374][57339] Updated weights for policy 0, policy_version 589948 (0.0033) [2024-04-28 10:55:47,532][57319] Signal inference workers to stop experience collection... (2200 times) [2024-04-28 10:55:47,563][57339] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-04-28 10:55:47,619][57319] Signal inference workers to resume experience collection... 
(2200 times) [2024-04-28 10:55:47,619][57339] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-04-28 10:55:49,393][57339] Updated weights for policy 0, policy_version 589958 (0.0025) [2024-04-28 10:55:52,169][57108] Fps is (10 sec: 50790.1, 60 sec: 54613.4, 300 sec: 54650.3). Total num frames: 9665953792. Throughput: 0: 54478.1. Samples: 156276340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:55:52,170][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 10:55:53,368][57339] Updated weights for policy 0, policy_version 589968 (0.0026) [2024-04-28 10:55:55,427][57339] Updated weights for policy 0, policy_version 589978 (0.0033) [2024-04-28 10:55:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9666248704. Throughput: 0: 54384.0. Samples: 156603880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:55:57,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 10:55:59,273][57339] Updated weights for policy 0, policy_version 589988 (0.0033) [2024-04-28 10:56:01,431][57339] Updated weights for policy 0, policy_version 589998 (0.0032) [2024-04-28 10:56:02,169][57108] Fps is (10 sec: 58982.7, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9666543616. Throughput: 0: 54370.2. Samples: 156934220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:02,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 10:56:05,224][57339] Updated weights for policy 0, policy_version 590008 (0.0033) [2024-04-28 10:56:07,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9666822144. Throughput: 0: 55033.9. Samples: 157115040. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:07,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 10:56:07,507][57339] Updated weights for policy 0, policy_version 590018 (0.0027) [2024-04-28 10:56:11,172][57339] Updated weights for policy 0, policy_version 590028 (0.0039) [2024-04-28 10:56:12,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9667100672. Throughput: 0: 55202.4. Samples: 157448540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:12,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 10:56:13,360][57339] Updated weights for policy 0, policy_version 590038 (0.0034) [2024-04-28 10:56:17,103][57339] Updated weights for policy 0, policy_version 590048 (0.0024) [2024-04-28 10:56:17,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55159.4, 300 sec: 54705.9). Total num frames: 9667346432. Throughput: 0: 55189.4. Samples: 157776240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:17,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 10:56:19,165][57339] Updated weights for policy 0, policy_version 590058 (0.0030) [2024-04-28 10:56:22,169][57108] Fps is (10 sec: 50791.1, 60 sec: 54613.4, 300 sec: 54705.9). Total num frames: 9667608576. Throughput: 0: 54408.9. Samples: 157921900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:22,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 10:56:23,021][57339] Updated weights for policy 0, policy_version 590068 (0.0029) [2024-04-28 10:56:25,306][57339] Updated weights for policy 0, policy_version 590078 (0.0032) [2024-04-28 10:56:27,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54340.5, 300 sec: 54705.9). Total num frames: 9667887104. Throughput: 0: 54444.4. Samples: 158252520. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:27,170][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 10:56:28,887][57339] Updated weights for policy 0, policy_version 590088 (0.0030) [2024-04-28 10:56:31,336][57339] Updated weights for policy 0, policy_version 590098 (0.0029) [2024-04-28 10:56:32,169][57108] Fps is (10 sec: 58982.3, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9668198400. Throughput: 0: 54561.8. Samples: 158584780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:32,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 10:56:34,931][57339] Updated weights for policy 0, policy_version 590108 (0.0029) [2024-04-28 10:56:37,169][57108] Fps is (10 sec: 58983.2, 60 sec: 54886.5, 300 sec: 54872.6). Total num frames: 9668476928. Throughput: 0: 55209.1. Samples: 158760740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:37,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 10:56:37,173][57339] Updated weights for policy 0, policy_version 590118 (0.0025) [2024-04-28 10:56:40,246][57319] Signal inference workers to stop experience collection... (2250 times) [2024-04-28 10:56:40,268][57339] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-04-28 10:56:40,341][57319] Signal inference workers to resume experience collection... (2250 times) [2024-04-28 10:56:40,341][57339] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-04-28 10:56:40,934][57339] Updated weights for policy 0, policy_version 590128 (0.0026) [2024-04-28 10:56:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 9668739072. Throughput: 0: 55221.4. Samples: 159088840. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 10:56:42,976][57339] Updated weights for policy 0, policy_version 590138 (0.0030) [2024-04-28 10:56:46,674][57339] Updated weights for policy 0, policy_version 590148 (0.0030) [2024-04-28 10:56:47,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55159.4, 300 sec: 54705.9). Total num frames: 9669001216. Throughput: 0: 55111.0. Samples: 159414220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:47,170][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 10:56:49,233][57339] Updated weights for policy 0, policy_version 590158 (0.0033) [2024-04-28 10:56:52,169][57108] Fps is (10 sec: 50790.5, 60 sec: 54886.5, 300 sec: 54650.4). Total num frames: 9669246976. Throughput: 0: 54554.3. Samples: 159569980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:52,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 10:56:52,670][57339] Updated weights for policy 0, policy_version 590168 (0.0027) [2024-04-28 10:56:55,167][57339] Updated weights for policy 0, policy_version 590178 (0.0034) [2024-04-28 10:56:57,169][57108] Fps is (10 sec: 52429.1, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9669525504. Throughput: 0: 54622.8. Samples: 159906560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:56:57,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:56:58,615][57339] Updated weights for policy 0, policy_version 590188 (0.0026) [2024-04-28 10:57:01,209][57339] Updated weights for policy 0, policy_version 590198 (0.0027) [2024-04-28 10:57:02,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9669820416. Throughput: 0: 54701.8. Samples: 160237820. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 10:57:02,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 10:57:04,637][57339] Updated weights for policy 0, policy_version 590208 (0.0026) [2024-04-28 10:57:07,031][57339] Updated weights for policy 0, policy_version 590218 (0.0035) [2024-04-28 10:57:07,169][57108] Fps is (10 sec: 60620.5, 60 sec: 55159.4, 300 sec: 54928.0). Total num frames: 9670131712. Throughput: 0: 55198.1. Samples: 160405820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:07,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 10:57:10,389][57339] Updated weights for policy 0, policy_version 590228 (0.0033) [2024-04-28 10:57:12,169][57108] Fps is (10 sec: 57343.2, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 9670393856. Throughput: 0: 55061.7. Samples: 160730300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:12,170][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 10:57:13,107][57339] Updated weights for policy 0, policy_version 590238 (0.0039) [2024-04-28 10:57:16,680][57339] Updated weights for policy 0, policy_version 590248 (0.0031) [2024-04-28 10:57:17,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.6, 300 sec: 54928.1). Total num frames: 9670672384. Throughput: 0: 55066.3. Samples: 161062760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:17,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 10:57:19,129][57339] Updated weights for policy 0, policy_version 590258 (0.0028) [2024-04-28 10:57:22,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.4, 300 sec: 54705.9). Total num frames: 9670918144. Throughput: 0: 54731.0. Samples: 161223640. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:22,170][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 10:57:22,434][57339] Updated weights for policy 0, policy_version 590268 (0.0027) [2024-04-28 10:57:25,172][57339] Updated weights for policy 0, policy_version 590278 (0.0030) [2024-04-28 10:57:27,169][57108] Fps is (10 sec: 47513.1, 60 sec: 54340.3, 300 sec: 54650.4). Total num frames: 9671147520. Throughput: 0: 54777.7. Samples: 161553840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:27,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 10:57:27,264][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000590281_9671163904.pth... [2024-04-28 10:57:27,312][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000589479_9658023936.pth [2024-04-28 10:57:28,197][57319] Signal inference workers to stop experience collection... (2300 times) [2024-04-28 10:57:28,225][57339] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-04-28 10:57:28,255][57319] Signal inference workers to resume experience collection... (2300 times) [2024-04-28 10:57:28,256][57339] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-04-28 10:57:28,370][57339] Updated weights for policy 0, policy_version 590288 (0.0028) [2024-04-28 10:57:31,037][57339] Updated weights for policy 0, policy_version 590298 (0.0028) [2024-04-28 10:57:32,169][57108] Fps is (10 sec: 54067.7, 60 sec: 54340.3, 300 sec: 54872.5). Total num frames: 9671458816. Throughput: 0: 54841.1. Samples: 161882060. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:32,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 10:57:34,276][57339] Updated weights for policy 0, policy_version 590308 (0.0030) [2024-04-28 10:57:37,018][57339] Updated weights for policy 0, policy_version 590318 (0.0026) [2024-04-28 10:57:37,169][57108] Fps is (10 sec: 62259.4, 60 sec: 54886.3, 300 sec: 54928.1). Total num frames: 9671770112. Throughput: 0: 55004.4. Samples: 162045180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:37,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 10:57:40,516][57339] Updated weights for policy 0, policy_version 590328 (0.0031) [2024-04-28 10:57:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 9672015872. Throughput: 0: 54771.7. Samples: 162371280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:42,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 10:57:43,069][57339] Updated weights for policy 0, policy_version 590338 (0.0033) [2024-04-28 10:57:46,345][57339] Updated weights for policy 0, policy_version 590348 (0.0033) [2024-04-28 10:57:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 54983.6). Total num frames: 9672310784. Throughput: 0: 54553.8. Samples: 162692740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:47,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 10:57:49,106][57339] Updated weights for policy 0, policy_version 590358 (0.0029) [2024-04-28 10:57:52,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 54761.5). Total num frames: 9672556544. Throughput: 0: 54602.3. Samples: 162862920. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:52,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 10:57:52,342][57339] Updated weights for policy 0, policy_version 590368 (0.0026) [2024-04-28 10:57:55,159][57339] Updated weights for policy 0, policy_version 590378 (0.0027) [2024-04-28 10:57:57,169][57108] Fps is (10 sec: 49152.6, 60 sec: 54613.5, 300 sec: 54705.9). Total num frames: 9672802304. Throughput: 0: 54739.0. Samples: 163193540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:57:57,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 10:57:58,354][57339] Updated weights for policy 0, policy_version 590388 (0.0033) [2024-04-28 10:58:01,255][57339] Updated weights for policy 0, policy_version 590398 (0.0031) [2024-04-28 10:58:02,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9673097216. Throughput: 0: 54589.6. Samples: 163519300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:58:02,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 10:58:04,182][57339] Updated weights for policy 0, policy_version 590408 (0.0032) [2024-04-28 10:58:07,169][57108] Fps is (10 sec: 58981.9, 60 sec: 54340.4, 300 sec: 54928.1). Total num frames: 9673392128. Throughput: 0: 54668.5. Samples: 163683720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:58:07,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 10:58:07,363][57339] Updated weights for policy 0, policy_version 590418 (0.0028) [2024-04-28 10:58:10,039][57339] Updated weights for policy 0, policy_version 590428 (0.0029) [2024-04-28 10:58:12,169][57108] Fps is (10 sec: 55706.2, 60 sec: 54340.4, 300 sec: 54817.0). Total num frames: 9673654272. Throughput: 0: 54569.0. Samples: 164009440. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:58:12,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 10:58:13,313][57339] Updated weights for policy 0, policy_version 590438 (0.0027) [2024-04-28 10:58:16,018][57339] Updated weights for policy 0, policy_version 590448 (0.0028) [2024-04-28 10:58:17,169][57108] Fps is (10 sec: 55704.8, 60 sec: 54613.2, 300 sec: 54928.1). Total num frames: 9673949184. Throughput: 0: 54551.8. Samples: 164336900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:58:17,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 10:58:19,179][57339] Updated weights for policy 0, policy_version 590458 (0.0033) [2024-04-28 10:58:22,101][57339] Updated weights for policy 0, policy_version 590468 (0.0032) [2024-04-28 10:58:22,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55159.6, 300 sec: 54928.1). Total num frames: 9674227712. Throughput: 0: 54726.3. Samples: 164507860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:58:22,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 10:58:24,566][57319] Signal inference workers to stop experience collection... (2350 times) [2024-04-28 10:58:24,566][57319] Signal inference workers to resume experience collection... (2350 times) [2024-04-28 10:58:24,575][57339] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-04-28 10:58:24,593][57339] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-04-28 10:58:25,171][57339] Updated weights for policy 0, policy_version 590478 (0.0036) [2024-04-28 10:58:27,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 54761.4). Total num frames: 9674473472. Throughput: 0: 54777.6. Samples: 164836280. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:58:27,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 10:58:27,900][57339] Updated weights for policy 0, policy_version 590488 (0.0035) [2024-04-28 10:58:31,229][57339] Updated weights for policy 0, policy_version 590498 (0.0025) [2024-04-28 10:58:32,169][57108] Fps is (10 sec: 49152.0, 60 sec: 54340.3, 300 sec: 54761.4). Total num frames: 9674719232. Throughput: 0: 55047.7. Samples: 165169880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:58:32,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 10:58:33,925][57339] Updated weights for policy 0, policy_version 590508 (0.0026) [2024-04-28 10:58:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54340.3, 300 sec: 54928.0). Total num frames: 9675030528. Throughput: 0: 54803.2. Samples: 165329060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 10:58:37,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 10:58:37,214][57339] Updated weights for policy 0, policy_version 590518 (0.0031) [2024-04-28 10:58:39,980][57339] Updated weights for policy 0, policy_version 590528 (0.0037) [2024-04-28 10:58:42,169][57108] Fps is (10 sec: 58982.2, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 9675309056. Throughput: 0: 54789.2. Samples: 165659060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:58:42,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 10:58:43,339][57339] Updated weights for policy 0, policy_version 590538 (0.0032) [2024-04-28 10:58:45,898][57339] Updated weights for policy 0, policy_version 590548 (0.0027) [2024-04-28 10:58:47,169][57108] Fps is (10 sec: 55704.5, 60 sec: 54613.2, 300 sec: 54928.0). Total num frames: 9675587584. Throughput: 0: 54917.2. Samples: 165990580. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:58:47,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 10:58:49,359][57339] Updated weights for policy 0, policy_version 590558 (0.0028) [2024-04-28 10:58:51,860][57339] Updated weights for policy 0, policy_version 590568 (0.0028) [2024-04-28 10:58:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 54983.6). Total num frames: 9675882496. Throughput: 0: 55043.1. Samples: 166160660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:58:52,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 10:58:55,436][57339] Updated weights for policy 0, policy_version 590578 (0.0030) [2024-04-28 10:58:57,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55705.6, 300 sec: 54817.0). Total num frames: 9676144640. Throughput: 0: 55109.8. Samples: 166489380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:58:57,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 10:58:57,883][57339] Updated weights for policy 0, policy_version 590588 (0.0033) [2024-04-28 10:59:01,292][57339] Updated weights for policy 0, policy_version 590598 (0.0025) [2024-04-28 10:59:02,169][57108] Fps is (10 sec: 50790.1, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9676390400. Throughput: 0: 55133.0. Samples: 166817880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:02,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 10:59:03,770][57339] Updated weights for policy 0, policy_version 590608 (0.0034) [2024-04-28 10:59:07,169][57108] Fps is (10 sec: 52428.7, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9676668928. Throughput: 0: 54944.0. Samples: 166980340. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:07,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 10:59:07,406][57339] Updated weights for policy 0, policy_version 590618 (0.0026) [2024-04-28 10:59:09,723][57339] Updated weights for policy 0, policy_version 590628 (0.0027) [2024-04-28 10:59:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9676947456. Throughput: 0: 55023.6. Samples: 167312340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:12,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 10:59:13,388][57339] Updated weights for policy 0, policy_version 590638 (0.0029) [2024-04-28 10:59:15,615][57339] Updated weights for policy 0, policy_version 590648 (0.0029) [2024-04-28 10:59:17,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54886.5, 300 sec: 54928.0). Total num frames: 9677242368. Throughput: 0: 54963.0. Samples: 167643220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:17,169][57108] Avg episode reward: [(0, '0.709')] [2024-04-28 10:59:19,421][57339] Updated weights for policy 0, policy_version 590658 (0.0031) [2024-04-28 10:59:21,528][57339] Updated weights for policy 0, policy_version 590668 (0.0029) [2024-04-28 10:59:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54886.3, 300 sec: 54928.1). Total num frames: 9677520896. Throughput: 0: 55186.1. Samples: 167812440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:22,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 10:59:25,527][57339] Updated weights for policy 0, policy_version 590678 (0.0031) [2024-04-28 10:59:27,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9677783040. Throughput: 0: 55096.8. Samples: 168138420. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:27,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 10:59:27,210][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000590686_9677799424.pth... [2024-04-28 10:59:27,253][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000589882_9664626688.pth [2024-04-28 10:59:27,621][57339] Updated weights for policy 0, policy_version 590688 (0.0028) [2024-04-28 10:59:31,357][57339] Updated weights for policy 0, policy_version 590698 (0.0027) [2024-04-28 10:59:31,900][57319] Signal inference workers to stop experience collection... (2400 times) [2024-04-28 10:59:31,900][57319] Signal inference workers to resume experience collection... (2400 times) [2024-04-28 10:59:31,934][57339] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-04-28 10:59:31,934][57339] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-04-28 10:59:32,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.4, 300 sec: 54816.9). Total num frames: 9678045184. Throughput: 0: 55003.1. Samples: 168465720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:32,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 10:59:33,678][57339] Updated weights for policy 0, policy_version 590708 (0.0032) [2024-04-28 10:59:37,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 9678307328. Throughput: 0: 54610.9. Samples: 168618160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:37,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 10:59:37,251][57339] Updated weights for policy 0, policy_version 590718 (0.0031) [2024-04-28 10:59:39,653][57339] Updated weights for policy 0, policy_version 590728 (0.0025) [2024-04-28 10:59:42,169][57108] Fps is (10 sec: 54068.0, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9678585856. Throughput: 0: 54738.2. Samples: 168952600. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:42,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 10:59:43,534][57339] Updated weights for policy 0, policy_version 590738 (0.0027) [2024-04-28 10:59:45,679][57339] Updated weights for policy 0, policy_version 590748 (0.0032) [2024-04-28 10:59:47,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 9678864384. Throughput: 0: 54753.2. Samples: 169281780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:47,169][57108] Avg episode reward: [(0, '0.714')] [2024-04-28 10:59:49,348][57339] Updated weights for policy 0, policy_version 590758 (0.0031) [2024-04-28 10:59:51,607][57339] Updated weights for policy 0, policy_version 590768 (0.0032) [2024-04-28 10:59:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54613.2, 300 sec: 54872.5). Total num frames: 9679159296. Throughput: 0: 54878.6. Samples: 169449880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:52,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 10:59:55,139][57339] Updated weights for policy 0, policy_version 590778 (0.0024) [2024-04-28 10:59:57,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54886.2, 300 sec: 54872.5). Total num frames: 9679437824. Throughput: 0: 54772.7. Samples: 169777120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 10:59:57,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 10:59:57,663][57339] Updated weights for policy 0, policy_version 590788 (0.0034) [2024-04-28 11:00:01,192][57339] Updated weights for policy 0, policy_version 590798 (0.0031) [2024-04-28 11:00:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 54761.4). Total num frames: 9679699968. Throughput: 0: 54850.7. Samples: 170111500. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 11:00:02,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:00:03,589][57339] Updated weights for policy 0, policy_version 590808 (0.0026) [2024-04-28 11:00:07,112][57339] Updated weights for policy 0, policy_version 590818 (0.0027) [2024-04-28 11:00:07,169][57108] Fps is (10 sec: 52429.6, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9679962112. Throughput: 0: 54571.2. Samples: 170268140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 11:00:07,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 11:00:09,414][57339] Updated weights for policy 0, policy_version 590828 (0.0031) [2024-04-28 11:00:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 9680240640. Throughput: 0: 54704.5. Samples: 170600120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 11:00:12,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:00:12,981][57339] Updated weights for policy 0, policy_version 590838 (0.0031) [2024-04-28 11:00:15,296][57339] Updated weights for policy 0, policy_version 590848 (0.0037) [2024-04-28 11:00:17,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54340.2, 300 sec: 54817.0). Total num frames: 9680502784. Throughput: 0: 54732.5. Samples: 170928680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:00:17,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 11:00:18,927][57339] Updated weights for policy 0, policy_version 590858 (0.0030) [2024-04-28 11:00:21,447][57339] Updated weights for policy 0, policy_version 590868 (0.0031) [2024-04-28 11:00:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9680797696. Throughput: 0: 55029.5. Samples: 171094480. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:00:22,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 11:00:25,020][57339] Updated weights for policy 0, policy_version 590878 (0.0033) [2024-04-28 11:00:27,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55159.5, 300 sec: 54872.5). Total num frames: 9681092608. Throughput: 0: 54887.1. Samples: 171422520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:00:27,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 11:00:27,377][57339] Updated weights for policy 0, policy_version 590888 (0.0026) [2024-04-28 11:00:30,884][57339] Updated weights for policy 0, policy_version 590898 (0.0026) [2024-04-28 11:00:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.6, 300 sec: 54761.5). Total num frames: 9681338368. Throughput: 0: 54800.2. Samples: 171747780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:00:32,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 11:00:33,410][57339] Updated weights for policy 0, policy_version 590908 (0.0030) [2024-04-28 11:00:35,703][57319] Signal inference workers to stop experience collection... (2450 times) [2024-04-28 11:00:35,703][57319] Signal inference workers to resume experience collection... (2450 times) [2024-04-28 11:00:35,732][57339] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-04-28 11:00:35,732][57339] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-04-28 11:00:36,860][57339] Updated weights for policy 0, policy_version 590918 (0.0032) [2024-04-28 11:00:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.6, 300 sec: 54817.0). Total num frames: 9681616896. Throughput: 0: 54687.2. Samples: 171910800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:00:37,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 11:00:39,542][57339] Updated weights for policy 0, policy_version 590928 (0.0033) [2024-04-28 11:00:42,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9681862656. Throughput: 0: 54707.3. Samples: 172238940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:00:42,170][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 11:00:42,945][57339] Updated weights for policy 0, policy_version 590938 (0.0026) [2024-04-28 11:00:45,544][57339] Updated weights for policy 0, policy_version 590948 (0.0032) [2024-04-28 11:00:47,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54613.5, 300 sec: 54872.5). Total num frames: 9682141184. Throughput: 0: 54597.9. Samples: 172568400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:00:47,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 11:00:48,819][57339] Updated weights for policy 0, policy_version 590958 (0.0034) [2024-04-28 11:00:51,487][57339] Updated weights for policy 0, policy_version 590968 (0.0026) [2024-04-28 11:00:52,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54340.3, 300 sec: 54817.0). Total num frames: 9682419712. Throughput: 0: 54777.9. Samples: 172733140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:00:52,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 11:00:54,753][57339] Updated weights for policy 0, policy_version 590978 (0.0031) [2024-04-28 11:00:57,169][57108] Fps is (10 sec: 57343.4, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9682714624. Throughput: 0: 54728.8. Samples: 173062920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:00:57,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 11:00:57,734][57339] Updated weights for policy 0, policy_version 590988 (0.0033) [2024-04-28 11:01:00,729][57339] Updated weights for policy 0, policy_version 590998 (0.0033) [2024-04-28 11:01:02,169][57108] Fps is (10 sec: 57343.1, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9682993152. Throughput: 0: 54599.5. Samples: 173385660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:01:02,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 11:01:03,594][57339] Updated weights for policy 0, policy_version 591008 (0.0031) [2024-04-28 11:01:06,770][57339] Updated weights for policy 0, policy_version 591018 (0.0038) [2024-04-28 11:01:07,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54886.4, 300 sec: 54761.5). Total num frames: 9683255296. Throughput: 0: 54648.5. Samples: 173553660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:01:07,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 11:01:09,416][57339] Updated weights for policy 0, policy_version 591028 (0.0026) [2024-04-28 11:01:12,169][57108] Fps is (10 sec: 52429.6, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9683517440. Throughput: 0: 54716.1. Samples: 173884740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:01:12,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 11:01:12,834][57339] Updated weights for policy 0, policy_version 591038 (0.0028) [2024-04-28 11:01:15,525][57339] Updated weights for policy 0, policy_version 591048 (0.0038) [2024-04-28 11:01:17,169][57108] Fps is (10 sec: 52427.7, 60 sec: 54613.3, 300 sec: 54816.9). Total num frames: 9683779584. Throughput: 0: 54819.7. Samples: 174214680. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:01:17,170][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 11:01:18,754][57339] Updated weights for policy 0, policy_version 591058 (0.0027) [2024-04-28 11:01:21,660][57339] Updated weights for policy 0, policy_version 591068 (0.0027) [2024-04-28 11:01:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9684074496. Throughput: 0: 54735.5. Samples: 174373900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:01:22,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 11:01:24,690][57339] Updated weights for policy 0, policy_version 591078 (0.0032) [2024-04-28 11:01:27,169][57108] Fps is (10 sec: 57345.4, 60 sec: 54340.3, 300 sec: 54761.4). Total num frames: 9684353024. Throughput: 0: 54726.3. Samples: 174701620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:01:27,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 11:01:27,203][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000591087_9684369408.pth... [2024-04-28 11:01:27,249][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000590281_9671163904.pth [2024-04-28 11:01:27,436][57339] Updated weights for policy 0, policy_version 591088 (0.0034) [2024-04-28 11:01:30,789][57339] Updated weights for policy 0, policy_version 591098 (0.0033) [2024-04-28 11:01:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9684631552. Throughput: 0: 54740.4. Samples: 175031720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:01:32,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 11:01:33,725][57339] Updated weights for policy 0, policy_version 591108 (0.0030) [2024-04-28 11:01:36,878][57339] Updated weights for policy 0, policy_version 591118 (0.0030) [2024-04-28 11:01:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54886.4, 300 sec: 54817.0). 
Total num frames: 9684910080. Throughput: 0: 54802.2. Samples: 175199240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:01:37,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 11:01:39,656][57339] Updated weights for policy 0, policy_version 591128 (0.0028) [2024-04-28 11:01:42,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9685172224. Throughput: 0: 54764.5. Samples: 175527320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 11:01:42,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 11:01:42,732][57339] Updated weights for policy 0, policy_version 591138 (0.0028) [2024-04-28 11:01:45,874][57339] Updated weights for policy 0, policy_version 591148 (0.0032) [2024-04-28 11:01:47,169][57108] Fps is (10 sec: 50790.6, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9685417984. Throughput: 0: 54885.5. Samples: 175855500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:01:47,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 11:01:48,587][57339] Updated weights for policy 0, policy_version 591158 (0.0034) [2024-04-28 11:01:51,892][57339] Updated weights for policy 0, policy_version 591168 (0.0032) [2024-04-28 11:01:51,919][57319] Signal inference workers to stop experience collection... (2500 times) [2024-04-28 11:01:51,919][57319] Signal inference workers to resume experience collection... (2500 times) [2024-04-28 11:01:51,940][57339] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-04-28 11:01:51,940][57339] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-04-28 11:01:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9685729280. Throughput: 0: 54732.9. Samples: 176016640. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:01:52,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 11:01:54,653][57339] Updated weights for policy 0, policy_version 591178 (0.0024) [2024-04-28 11:01:57,169][57108] Fps is (10 sec: 58982.4, 60 sec: 54886.5, 300 sec: 54872.5). Total num frames: 9686007808. Throughput: 0: 54681.3. Samples: 176345400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:01:57,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 11:01:57,688][57339] Updated weights for policy 0, policy_version 591188 (0.0026) [2024-04-28 11:02:00,649][57339] Updated weights for policy 0, policy_version 591198 (0.0032) [2024-04-28 11:02:02,169][57108] Fps is (10 sec: 55704.9, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9686286336. Throughput: 0: 54726.8. Samples: 176677380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:02,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 11:02:03,474][57339] Updated weights for policy 0, policy_version 591208 (0.0028) [2024-04-28 11:02:06,433][57339] Updated weights for policy 0, policy_version 591218 (0.0029) [2024-04-28 11:02:07,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 54817.0). Total num frames: 9686564864. Throughput: 0: 54996.4. Samples: 176848740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:07,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 11:02:09,552][57339] Updated weights for policy 0, policy_version 591228 (0.0031) [2024-04-28 11:02:12,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54886.2, 300 sec: 54705.9). Total num frames: 9686810624. Throughput: 0: 54982.0. Samples: 177175820. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:12,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 11:02:12,351][57339] Updated weights for policy 0, policy_version 591238 (0.0029) [2024-04-28 11:02:15,464][57339] Updated weights for policy 0, policy_version 591248 (0.0032) [2024-04-28 11:02:17,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55159.6, 300 sec: 54817.0). Total num frames: 9687089152. Throughput: 0: 55013.2. Samples: 177507320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:17,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 11:02:18,349][57339] Updated weights for policy 0, policy_version 591258 (0.0028) [2024-04-28 11:02:21,331][57339] Updated weights for policy 0, policy_version 591268 (0.0030) [2024-04-28 11:02:22,169][57108] Fps is (10 sec: 55706.7, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 9687367680. Throughput: 0: 54753.4. Samples: 177663140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:22,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 11:02:24,358][57339] Updated weights for policy 0, policy_version 591278 (0.0028) [2024-04-28 11:02:27,169][57108] Fps is (10 sec: 55706.4, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9687646208. Throughput: 0: 54793.0. Samples: 177993000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:27,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 11:02:27,329][57339] Updated weights for policy 0, policy_version 591288 (0.0026) [2024-04-28 11:02:30,314][57339] Updated weights for policy 0, policy_version 591298 (0.0030) [2024-04-28 11:02:32,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9687924736. Throughput: 0: 54867.9. Samples: 178324560. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:32,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 11:02:33,304][57339] Updated weights for policy 0, policy_version 591308 (0.0031) [2024-04-28 11:02:36,178][57339] Updated weights for policy 0, policy_version 591318 (0.0032) [2024-04-28 11:02:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9688219648. Throughput: 0: 55157.4. Samples: 178498720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:37,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 11:02:39,264][57339] Updated weights for policy 0, policy_version 591328 (0.0033) [2024-04-28 11:02:42,169][57339] Updated weights for policy 0, policy_version 591338 (0.0030) [2024-04-28 11:02:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9688481792. Throughput: 0: 55245.7. Samples: 178831460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:42,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 11:02:45,211][57339] Updated weights for policy 0, policy_version 591348 (0.0035) [2024-04-28 11:02:47,169][57108] Fps is (10 sec: 50789.7, 60 sec: 55159.4, 300 sec: 54817.0). Total num frames: 9688727552. Throughput: 0: 55147.6. Samples: 179159020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:47,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 11:02:48,214][57339] Updated weights for policy 0, policy_version 591358 (0.0037) [2024-04-28 11:02:51,192][57339] Updated weights for policy 0, policy_version 591368 (0.0027) [2024-04-28 11:02:52,169][57108] Fps is (10 sec: 50790.2, 60 sec: 54340.2, 300 sec: 54872.5). Total num frames: 9688989696. Throughput: 0: 54728.4. Samples: 179311520. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:52,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 11:02:54,190][57339] Updated weights for policy 0, policy_version 591378 (0.0031) [2024-04-28 11:02:57,169][57108] Fps is (10 sec: 55706.2, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9689284608. Throughput: 0: 54799.8. Samples: 179641800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:02:57,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 11:02:57,261][57339] Updated weights for policy 0, policy_version 591388 (0.0031) [2024-04-28 11:03:00,215][57339] Updated weights for policy 0, policy_version 591398 (0.0039) [2024-04-28 11:03:02,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9689563136. Throughput: 0: 54665.8. Samples: 179967280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:03:02,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 11:03:03,435][57339] Updated weights for policy 0, policy_version 591408 (0.0026) [2024-04-28 11:03:06,004][57339] Updated weights for policy 0, policy_version 591418 (0.0035) [2024-04-28 11:03:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9689841664. Throughput: 0: 55110.1. Samples: 180143100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:03:07,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 11:03:09,429][57339] Updated weights for policy 0, policy_version 591428 (0.0033) [2024-04-28 11:03:11,941][57319] Signal inference workers to stop experience collection... (2550 times) [2024-04-28 11:03:11,979][57339] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-04-28 11:03:12,030][57319] Signal inference workers to resume experience collection... 
(2550 times) [2024-04-28 11:03:12,030][57339] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-04-28 11:03:12,032][57339] Updated weights for policy 0, policy_version 591438 (0.0029) [2024-04-28 11:03:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.7, 300 sec: 54872.5). Total num frames: 9690136576. Throughput: 0: 55053.7. Samples: 180470420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:03:12,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 11:03:15,402][57339] Updated weights for policy 0, policy_version 591448 (0.0031) [2024-04-28 11:03:17,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54613.4, 300 sec: 54705.9). Total num frames: 9690365952. Throughput: 0: 54961.8. Samples: 180797840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:03:17,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 11:03:18,180][57339] Updated weights for policy 0, policy_version 591458 (0.0028) [2024-04-28 11:03:21,585][57339] Updated weights for policy 0, policy_version 591468 (0.0026) [2024-04-28 11:03:22,169][57108] Fps is (10 sec: 49151.7, 60 sec: 54340.1, 300 sec: 54761.4). Total num frames: 9690628096. Throughput: 0: 54502.1. Samples: 180951320. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:03:22,170][57108] Avg episode reward: [(0, '0.709')] [2024-04-28 11:03:23,951][57339] Updated weights for policy 0, policy_version 591478 (0.0034) [2024-04-28 11:03:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54613.2, 300 sec: 54928.0). Total num frames: 9690923008. Throughput: 0: 54403.5. Samples: 181279620. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:03:27,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 11:03:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000591487_9690923008.pth... 
[2024-04-28 11:03:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000590686_9677799424.pth [2024-04-28 11:03:27,577][57339] Updated weights for policy 0, policy_version 591488 (0.0031) [2024-04-28 11:03:29,946][57339] Updated weights for policy 0, policy_version 591498 (0.0032) [2024-04-28 11:03:32,169][57108] Fps is (10 sec: 58982.8, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9691217920. Throughput: 0: 54391.6. Samples: 181606640. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:03:32,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 11:03:33,481][57339] Updated weights for policy 0, policy_version 591508 (0.0025) [2024-04-28 11:03:35,942][57339] Updated weights for policy 0, policy_version 591518 (0.0027) [2024-04-28 11:03:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54613.2, 300 sec: 54872.5). Total num frames: 9691496448. Throughput: 0: 54860.9. Samples: 181780260. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:03:37,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 11:03:39,503][57339] Updated weights for policy 0, policy_version 591528 (0.0031) [2024-04-28 11:03:42,011][57339] Updated weights for policy 0, policy_version 591538 (0.0028) [2024-04-28 11:03:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9691774976. Throughput: 0: 54791.1. Samples: 182107400. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:03:42,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 11:03:45,696][57339] Updated weights for policy 0, policy_version 591548 (0.0032) [2024-04-28 11:03:47,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54886.3, 300 sec: 54705.9). Total num frames: 9692020736. Throughput: 0: 54926.1. Samples: 182438960. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:03:47,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 11:03:47,959][57339] Updated weights for policy 0, policy_version 591558 (0.0027) [2024-04-28 11:03:51,830][57339] Updated weights for policy 0, policy_version 591568 (0.0038) [2024-04-28 11:03:52,169][57108] Fps is (10 sec: 49151.6, 60 sec: 54613.3, 300 sec: 54650.3). Total num frames: 9692266496. Throughput: 0: 54378.2. Samples: 182590120. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:03:52,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 11:03:53,888][57339] Updated weights for policy 0, policy_version 591578 (0.0025) [2024-04-28 11:03:57,169][57108] Fps is (10 sec: 52429.1, 60 sec: 54340.1, 300 sec: 54761.4). Total num frames: 9692545024. Throughput: 0: 54408.3. Samples: 182918800. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:03:57,170][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 11:03:57,609][57339] Updated weights for policy 0, policy_version 591588 (0.0034) [2024-04-28 11:03:59,802][57339] Updated weights for policy 0, policy_version 591598 (0.0034) [2024-04-28 11:04:02,169][57108] Fps is (10 sec: 57344.6, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9692839936. Throughput: 0: 54471.6. Samples: 183249060. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:02,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 11:04:03,532][57339] Updated weights for policy 0, policy_version 591608 (0.0035) [2024-04-28 11:04:05,957][57339] Updated weights for policy 0, policy_version 591618 (0.0029) [2024-04-28 11:04:07,169][57108] Fps is (10 sec: 58982.5, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 9693134848. Throughput: 0: 54942.6. Samples: 183423740. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:07,170][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 11:04:09,635][57339] Updated weights for policy 0, policy_version 591628 (0.0030) [2024-04-28 11:04:11,865][57339] Updated weights for policy 0, policy_version 591638 (0.0042) [2024-04-28 11:04:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9693413376. Throughput: 0: 54932.0. Samples: 183751560. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:12,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 11:04:15,532][57339] Updated weights for policy 0, policy_version 591648 (0.0027) [2024-04-28 11:04:16,827][57319] Signal inference workers to stop experience collection... (2600 times) [2024-04-28 11:04:16,828][57319] Signal inference workers to resume experience collection... (2600 times) [2024-04-28 11:04:16,843][57339] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-04-28 11:04:16,843][57339] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-04-28 11:04:17,169][57108] Fps is (10 sec: 52429.7, 60 sec: 54886.5, 300 sec: 54705.9). Total num frames: 9693659136. Throughput: 0: 55029.9. Samples: 184082980. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:04:17,832][57339] Updated weights for policy 0, policy_version 591658 (0.0031) [2024-04-28 11:04:21,509][57339] Updated weights for policy 0, policy_version 591668 (0.0028) [2024-04-28 11:04:22,169][57108] Fps is (10 sec: 50790.0, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9693921280. Throughput: 0: 54462.6. Samples: 184231080. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:22,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:04:23,789][57339] Updated weights for policy 0, policy_version 591678 (0.0036) [2024-04-28 11:04:27,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54613.4, 300 sec: 54761.5). Total num frames: 9694199808. Throughput: 0: 54617.8. Samples: 184565200. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:27,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:04:27,387][57339] Updated weights for policy 0, policy_version 591688 (0.0029) [2024-04-28 11:04:29,716][57339] Updated weights for policy 0, policy_version 591698 (0.0027) [2024-04-28 11:04:32,169][57108] Fps is (10 sec: 57344.3, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9694494720. Throughput: 0: 54573.5. Samples: 184894760. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:32,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 11:04:33,491][57339] Updated weights for policy 0, policy_version 591708 (0.0036) [2024-04-28 11:04:35,660][57339] Updated weights for policy 0, policy_version 591718 (0.0027) [2024-04-28 11:04:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 9694773248. Throughput: 0: 54991.6. Samples: 185064740. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:37,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 11:04:39,568][57339] Updated weights for policy 0, policy_version 591728 (0.0033) [2024-04-28 11:04:41,711][57339] Updated weights for policy 0, policy_version 591738 (0.0025) [2024-04-28 11:04:42,169][57108] Fps is (10 sec: 54064.5, 60 sec: 54339.8, 300 sec: 54816.9). Total num frames: 9695035392. Throughput: 0: 54893.7. Samples: 185389040. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:42,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 11:04:45,390][57339] Updated weights for policy 0, policy_version 591748 (0.0031) [2024-04-28 11:04:47,169][57108] Fps is (10 sec: 52428.0, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9695297536. Throughput: 0: 54881.5. Samples: 185718740. Policy #0 lag: (min: 0.0, avg: 13.1, max: 22.0) [2024-04-28 11:04:47,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 11:04:47,589][57339] Updated weights for policy 0, policy_version 591758 (0.0035) [2024-04-28 11:04:51,390][57339] Updated weights for policy 0, policy_version 591768 (0.0027) [2024-04-28 11:04:52,169][57108] Fps is (10 sec: 52431.6, 60 sec: 54886.5, 300 sec: 54650.4). Total num frames: 9695559680. Throughput: 0: 54597.0. Samples: 185880600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:04:52,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 11:04:53,597][57339] Updated weights for policy 0, policy_version 591778 (0.0032) [2024-04-28 11:04:57,169][57108] Fps is (10 sec: 52429.3, 60 sec: 54613.4, 300 sec: 54650.3). Total num frames: 9695821824. Throughput: 0: 54545.7. Samples: 186206120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:04:57,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 11:04:57,396][57339] Updated weights for policy 0, policy_version 591788 (0.0038) [2024-04-28 11:04:59,731][57339] Updated weights for policy 0, policy_version 591798 (0.0034) [2024-04-28 11:05:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9696116736. Throughput: 0: 54561.7. Samples: 186538260. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:02,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 11:05:03,198][57339] Updated weights for policy 0, policy_version 591808 (0.0027) [2024-04-28 11:05:05,614][57339] Updated weights for policy 0, policy_version 591818 (0.0026) [2024-04-28 11:05:07,169][57108] Fps is (10 sec: 60619.8, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 9696428032. Throughput: 0: 55073.2. Samples: 186709380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:07,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 11:05:09,023][57339] Updated weights for policy 0, policy_version 591828 (0.0032) [2024-04-28 11:05:11,697][57339] Updated weights for policy 0, policy_version 591838 (0.0033) [2024-04-28 11:05:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9696690176. Throughput: 0: 54968.8. Samples: 187038800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:12,170][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 11:05:15,142][57339] Updated weights for policy 0, policy_version 591848 (0.0032) [2024-04-28 11:05:17,169][57108] Fps is (10 sec: 54068.8, 60 sec: 55159.4, 300 sec: 54817.0). Total num frames: 9696968704. Throughput: 0: 54893.9. Samples: 187364980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:17,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 11:05:17,757][57339] Updated weights for policy 0, policy_version 591858 (0.0027) [2024-04-28 11:05:21,228][57339] Updated weights for policy 0, policy_version 591868 (0.0029) [2024-04-28 11:05:22,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54886.4, 300 sec: 54650.3). Total num frames: 9697214464. Throughput: 0: 54667.0. Samples: 187524760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:22,170][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 11:05:23,598][57339] Updated weights for policy 0, policy_version 591878 (0.0030) [2024-04-28 11:05:26,964][57319] Signal inference workers to stop experience collection... (2650 times) [2024-04-28 11:05:26,966][57319] Signal inference workers to resume experience collection... (2650 times) [2024-04-28 11:05:26,998][57339] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-04-28 11:05:26,999][57339] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-04-28 11:05:27,075][57339] Updated weights for policy 0, policy_version 591888 (0.0032) [2024-04-28 11:05:27,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9697492992. Throughput: 0: 54817.6. Samples: 187855800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:27,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 11:05:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000591888_9697492992.pth... [2024-04-28 11:05:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000591087_9684369408.pth [2024-04-28 11:05:29,534][57339] Updated weights for policy 0, policy_version 591898 (0.0031) [2024-04-28 11:05:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54340.2, 300 sec: 54705.9). Total num frames: 9697755136. Throughput: 0: 54817.5. Samples: 188185520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:32,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:05:33,165][57339] Updated weights for policy 0, policy_version 591908 (0.0031) [2024-04-28 11:05:35,303][57339] Updated weights for policy 0, policy_version 591918 (0.0029) [2024-04-28 11:05:37,169][57108] Fps is (10 sec: 57344.1, 60 sec: 54886.4, 300 sec: 54928.1). Total num frames: 9698066432. Throughput: 0: 54925.8. Samples: 188352260. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:37,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 11:05:39,281][57339] Updated weights for policy 0, policy_version 591928 (0.0028) [2024-04-28 11:05:41,359][57339] Updated weights for policy 0, policy_version 591938 (0.0029) [2024-04-28 11:05:42,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55159.9, 300 sec: 54928.0). Total num frames: 9698344960. Throughput: 0: 54976.0. Samples: 188680040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:42,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 11:05:45,194][57339] Updated weights for policy 0, policy_version 591948 (0.0029) [2024-04-28 11:05:47,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55159.5, 300 sec: 54872.5). Total num frames: 9698607104. Throughput: 0: 54791.4. Samples: 189003880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:47,170][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 11:05:47,431][57339] Updated weights for policy 0, policy_version 591958 (0.0033) [2024-04-28 11:05:51,059][57339] Updated weights for policy 0, policy_version 591968 (0.0027) [2024-04-28 11:05:52,169][57108] Fps is (10 sec: 50790.5, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9698852864. Throughput: 0: 54582.9. Samples: 189165600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:52,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 11:05:53,372][57339] Updated weights for policy 0, policy_version 591978 (0.0029) [2024-04-28 11:05:57,129][57339] Updated weights for policy 0, policy_version 591988 (0.0025) [2024-04-28 11:05:57,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.4, 300 sec: 54705.9). Total num frames: 9699131392. Throughput: 0: 54607.0. Samples: 189496120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:05:57,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 11:05:59,545][57339] Updated weights for policy 0, policy_version 591998 (0.0033) [2024-04-28 11:06:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9699393536. Throughput: 0: 54612.8. Samples: 189822560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:06:02,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 11:06:03,209][57339] Updated weights for policy 0, policy_version 592008 (0.0031) [2024-04-28 11:06:05,594][57339] Updated weights for policy 0, policy_version 592018 (0.0034) [2024-04-28 11:06:07,169][57108] Fps is (10 sec: 57344.8, 60 sec: 54613.6, 300 sec: 54872.5). Total num frames: 9699704832. Throughput: 0: 54632.6. Samples: 189983220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:06:07,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 11:06:09,385][57339] Updated weights for policy 0, policy_version 592028 (0.0029) [2024-04-28 11:06:11,828][57339] Updated weights for policy 0, policy_version 592038 (0.0030) [2024-04-28 11:06:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9699966976. Throughput: 0: 54485.2. Samples: 190307640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:06:12,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 11:06:15,297][57339] Updated weights for policy 0, policy_version 592048 (0.0030) [2024-04-28 11:06:17,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 9700261888. Throughput: 0: 54496.0. Samples: 190637840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:06:17,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 11:06:17,785][57339] Updated weights for policy 0, policy_version 592058 (0.0033) [2024-04-28 11:06:18,631][57319] Signal inference workers to stop experience collection... 
(2700 times) [2024-04-28 11:06:18,651][57339] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-04-28 11:06:18,722][57319] Signal inference workers to resume experience collection... (2700 times) [2024-04-28 11:06:18,723][57339] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-04-28 11:06:21,155][57339] Updated weights for policy 0, policy_version 592068 (0.0025) [2024-04-28 11:06:22,169][57108] Fps is (10 sec: 54068.0, 60 sec: 54886.5, 300 sec: 54761.4). Total num frames: 9700507648. Throughput: 0: 54522.2. Samples: 190805760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 11:06:22,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 11:06:23,874][57339] Updated weights for policy 0, policy_version 592078 (0.0031) [2024-04-28 11:06:27,089][57339] Updated weights for policy 0, policy_version 592088 (0.0028) [2024-04-28 11:06:27,169][57108] Fps is (10 sec: 50790.9, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9700769792. Throughput: 0: 54531.2. Samples: 191133940. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:06:27,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:06:29,886][57339] Updated weights for policy 0, policy_version 592098 (0.0030) [2024-04-28 11:06:32,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54613.4, 300 sec: 54650.4). Total num frames: 9701031936. Throughput: 0: 54528.1. Samples: 191457640. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:06:32,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 11:06:33,176][57339] Updated weights for policy 0, policy_version 592108 (0.0025) [2024-04-28 11:06:36,005][57339] Updated weights for policy 0, policy_version 592118 (0.0026) [2024-04-28 11:06:37,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54340.2, 300 sec: 54761.4). Total num frames: 9701326848. Throughput: 0: 54447.5. Samples: 191615740. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:06:37,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 11:06:39,048][57339] Updated weights for policy 0, policy_version 592128 (0.0038) [2024-04-28 11:06:42,009][57339] Updated weights for policy 0, policy_version 592138 (0.0033) [2024-04-28 11:06:42,169][57108] Fps is (10 sec: 57344.5, 60 sec: 54340.4, 300 sec: 54872.5). Total num frames: 9701605376. Throughput: 0: 54415.8. Samples: 191944820. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:06:42,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 11:06:45,072][57339] Updated weights for policy 0, policy_version 592148 (0.0028) [2024-04-28 11:06:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54613.4, 300 sec: 54761.4). Total num frames: 9701883904. Throughput: 0: 54478.7. Samples: 192274100. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:06:47,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 11:06:47,918][57339] Updated weights for policy 0, policy_version 592158 (0.0026) [2024-04-28 11:06:51,101][57339] Updated weights for policy 0, policy_version 592168 (0.0032) [2024-04-28 11:06:52,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9702146048. Throughput: 0: 54701.3. Samples: 192444780. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:06:52,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 11:06:54,065][57339] Updated weights for policy 0, policy_version 592178 (0.0026) [2024-04-28 11:06:57,168][57339] Updated weights for policy 0, policy_version 592188 (0.0028) [2024-04-28 11:06:57,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54613.4, 300 sec: 54650.4). Total num frames: 9702408192. Throughput: 0: 54784.0. Samples: 192772920. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:06:57,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 11:06:59,922][57339] Updated weights for policy 0, policy_version 592198 (0.0030) [2024-04-28 11:07:02,169][57108] Fps is (10 sec: 52428.7, 60 sec: 54613.4, 300 sec: 54594.8). Total num frames: 9702670336. Throughput: 0: 54854.8. Samples: 193106300. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:02,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 11:07:03,155][57339] Updated weights for policy 0, policy_version 592208 (0.0030) [2024-04-28 11:07:05,963][57339] Updated weights for policy 0, policy_version 592218 (0.0031) [2024-04-28 11:07:07,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54340.2, 300 sec: 54761.5). Total num frames: 9702965248. Throughput: 0: 54615.9. Samples: 193263480. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:07,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 11:07:09,088][57339] Updated weights for policy 0, policy_version 592228 (0.0029) [2024-04-28 11:07:12,020][57339] Updated weights for policy 0, policy_version 592238 (0.0035) [2024-04-28 11:07:12,169][57108] Fps is (10 sec: 57343.5, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9703243776. Throughput: 0: 54579.9. Samples: 193590040. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:12,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 11:07:14,866][57339] Updated weights for policy 0, policy_version 592248 (0.0031) [2024-04-28 11:07:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54340.3, 300 sec: 54761.4). Total num frames: 9703522304. Throughput: 0: 54774.6. Samples: 193922500. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:17,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 11:07:18,001][57339] Updated weights for policy 0, policy_version 592258 (0.0030) [2024-04-28 11:07:21,034][57339] Updated weights for policy 0, policy_version 592268 (0.0035) [2024-04-28 11:07:22,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.3, 300 sec: 54761.4). Total num frames: 9703800832. Throughput: 0: 54992.5. Samples: 194090400. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:22,169][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 11:07:23,905][57339] Updated weights for policy 0, policy_version 592278 (0.0027) [2024-04-28 11:07:24,234][57319] Signal inference workers to stop experience collection... (2750 times) [2024-04-28 11:07:24,265][57339] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-04-28 11:07:24,291][57319] Signal inference workers to resume experience collection... (2750 times) [2024-04-28 11:07:24,294][57339] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-04-28 11:07:26,855][57339] Updated weights for policy 0, policy_version 592288 (0.0032) [2024-04-28 11:07:27,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9704062976. Throughput: 0: 54976.4. Samples: 194418760. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:27,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 11:07:27,199][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000592290_9704079360.pth... [2024-04-28 11:07:27,244][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000591487_9690923008.pth [2024-04-28 11:07:29,769][57339] Updated weights for policy 0, policy_version 592298 (0.0028) [2024-04-28 11:07:32,169][57108] Fps is (10 sec: 52429.4, 60 sec: 54886.5, 300 sec: 54594.8). Total num frames: 9704325120. Throughput: 0: 54963.7. Samples: 194747460. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:32,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 11:07:32,716][57339] Updated weights for policy 0, policy_version 592308 (0.0032) [2024-04-28 11:07:35,943][57339] Updated weights for policy 0, policy_version 592318 (0.0030) [2024-04-28 11:07:37,169][57108] Fps is (10 sec: 54066.4, 60 sec: 54613.3, 300 sec: 54650.3). Total num frames: 9704603648. Throughput: 0: 54707.8. Samples: 194906640. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:37,170][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 11:07:38,775][57339] Updated weights for policy 0, policy_version 592328 (0.0025) [2024-04-28 11:07:41,810][57339] Updated weights for policy 0, policy_version 592338 (0.0027) [2024-04-28 11:07:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 54340.3, 300 sec: 54705.9). Total num frames: 9704865792. Throughput: 0: 54647.3. Samples: 195232040. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:42,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 11:07:44,893][57339] Updated weights for policy 0, policy_version 592348 (0.0038) [2024-04-28 11:07:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9705160704. Throughput: 0: 54482.6. Samples: 195558020. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:47,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 11:07:47,794][57339] Updated weights for policy 0, policy_version 592358 (0.0033) [2024-04-28 11:07:50,946][57339] Updated weights for policy 0, policy_version 592368 (0.0031) [2024-04-28 11:07:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9705439232. Throughput: 0: 54676.0. Samples: 195723900. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:52,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 11:07:53,690][57339] Updated weights for policy 0, policy_version 592378 (0.0030) [2024-04-28 11:07:56,792][57339] Updated weights for policy 0, policy_version 592388 (0.0029) [2024-04-28 11:07:57,169][57108] Fps is (10 sec: 54067.7, 60 sec: 54886.5, 300 sec: 54705.9). Total num frames: 9705701376. Throughput: 0: 54767.3. Samples: 196054560. Policy #0 lag: (min: 1.0, avg: 12.2, max: 23.0) [2024-04-28 11:07:57,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 11:07:59,661][57339] Updated weights for policy 0, policy_version 592398 (0.0026) [2024-04-28 11:08:02,169][57108] Fps is (10 sec: 52429.3, 60 sec: 54886.5, 300 sec: 54650.4). Total num frames: 9705963520. Throughput: 0: 54616.6. Samples: 196380240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:02,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:08:02,665][57339] Updated weights for policy 0, policy_version 592408 (0.0031) [2024-04-28 11:08:05,956][57339] Updated weights for policy 0, policy_version 592418 (0.0029) [2024-04-28 11:08:07,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54340.4, 300 sec: 54539.3). Total num frames: 9706225664. Throughput: 0: 54445.9. Samples: 196540460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:07,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 11:08:08,680][57339] Updated weights for policy 0, policy_version 592428 (0.0033) [2024-04-28 11:08:12,111][57339] Updated weights for policy 0, policy_version 592438 (0.0039) [2024-04-28 11:08:12,169][57108] Fps is (10 sec: 54066.5, 60 sec: 54340.3, 300 sec: 54705.9). Total num frames: 9706504192. Throughput: 0: 54401.2. Samples: 196866820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:12,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 11:08:14,781][57339] Updated weights for policy 0, policy_version 592448 (0.0028) [2024-04-28 11:08:17,169][57108] Fps is (10 sec: 54065.5, 60 sec: 54067.0, 300 sec: 54705.9). Total num frames: 9706766336. Throughput: 0: 54333.4. Samples: 197192480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:17,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:08:18,337][57339] Updated weights for policy 0, policy_version 592458 (0.0028) [2024-04-28 11:08:20,642][57339] Updated weights for policy 0, policy_version 592468 (0.0026) [2024-04-28 11:08:22,169][57108] Fps is (10 sec: 57344.4, 60 sec: 54613.4, 300 sec: 54761.5). Total num frames: 9707077632. Throughput: 0: 54535.7. Samples: 197360740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:22,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 11:08:24,315][57339] Updated weights for policy 0, policy_version 592478 (0.0038) [2024-04-28 11:08:26,759][57339] Updated weights for policy 0, policy_version 592488 (0.0029) [2024-04-28 11:08:27,169][57108] Fps is (10 sec: 57345.3, 60 sec: 54613.3, 300 sec: 54650.4). Total num frames: 9707339776. Throughput: 0: 54557.2. Samples: 197687120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:27,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 11:08:30,354][57339] Updated weights for policy 0, policy_version 592498 (0.0024) [2024-04-28 11:08:32,169][57108] Fps is (10 sec: 52429.1, 60 sec: 54613.4, 300 sec: 54594.8). Total num frames: 9707601920. Throughput: 0: 54532.6. Samples: 198011980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:32,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 11:08:32,920][57339] Updated weights for policy 0, policy_version 592508 (0.0029) [2024-04-28 11:08:36,452][57339] Updated weights for policy 0, policy_version 592518 (0.0031) [2024-04-28 11:08:37,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54340.3, 300 sec: 54539.3). Total num frames: 9707864064. Throughput: 0: 54390.5. Samples: 198171480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:37,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 11:08:37,194][57319] Signal inference workers to stop experience collection... (2800 times) [2024-04-28 11:08:37,204][57339] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-04-28 11:08:37,290][57319] Signal inference workers to resume experience collection... (2800 times) [2024-04-28 11:08:37,290][57339] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-04-28 11:08:38,964][57339] Updated weights for policy 0, policy_version 592528 (0.0028) [2024-04-28 11:08:42,169][57108] Fps is (10 sec: 52428.0, 60 sec: 54340.2, 300 sec: 54594.8). Total num frames: 9708126208. Throughput: 0: 54289.2. Samples: 198497580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:42,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 11:08:42,258][57339] Updated weights for policy 0, policy_version 592538 (0.0025) [2024-04-28 11:08:44,862][57339] Updated weights for policy 0, policy_version 592548 (0.0038) [2024-04-28 11:08:47,169][57108] Fps is (10 sec: 54068.1, 60 sec: 54067.3, 300 sec: 54705.9). Total num frames: 9708404736. Throughput: 0: 54443.5. Samples: 198830200. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:47,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 11:08:48,138][57339] Updated weights for policy 0, policy_version 592558 (0.0029) [2024-04-28 11:08:50,766][57339] Updated weights for policy 0, policy_version 592568 (0.0031) [2024-04-28 11:08:52,169][57108] Fps is (10 sec: 58982.2, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 9708716032. Throughput: 0: 54473.1. Samples: 198991760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:52,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 11:08:54,430][57339] Updated weights for policy 0, policy_version 592578 (0.0027) [2024-04-28 11:08:56,815][57339] Updated weights for policy 0, policy_version 592588 (0.0026) [2024-04-28 11:08:57,169][57108] Fps is (10 sec: 57344.2, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9708978176. Throughput: 0: 54523.7. Samples: 199320380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:08:57,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 11:09:00,462][57339] Updated weights for policy 0, policy_version 592598 (0.0033) [2024-04-28 11:09:02,169][57108] Fps is (10 sec: 52429.4, 60 sec: 54613.3, 300 sec: 54594.8). Total num frames: 9709240320. Throughput: 0: 54574.1. Samples: 199648300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:09:02,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 11:09:02,695][57339] Updated weights for policy 0, policy_version 592608 (0.0030) [2024-04-28 11:09:06,280][57339] Updated weights for policy 0, policy_version 592618 (0.0029) [2024-04-28 11:09:07,169][57108] Fps is (10 sec: 54066.5, 60 sec: 54886.3, 300 sec: 54594.8). Total num frames: 9709518848. Throughput: 0: 54529.7. Samples: 199814580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:09:07,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 11:09:08,661][57339] Updated weights for policy 0, policy_version 592628 (0.0038) [2024-04-28 11:09:12,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54340.3, 300 sec: 54594.8). Total num frames: 9709764608. Throughput: 0: 54667.6. Samples: 200147160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:09:12,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 11:09:12,192][57339] Updated weights for policy 0, policy_version 592638 (0.0030) [2024-04-28 11:09:14,569][57339] Updated weights for policy 0, policy_version 592648 (0.0026) [2024-04-28 11:09:17,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54886.7, 300 sec: 54705.9). Total num frames: 9710059520. Throughput: 0: 54834.2. Samples: 200479520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:09:17,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 11:09:18,040][57339] Updated weights for policy 0, policy_version 592658 (0.0030) [2024-04-28 11:09:20,617][57339] Updated weights for policy 0, policy_version 592668 (0.0035) [2024-04-28 11:09:22,169][57108] Fps is (10 sec: 58982.1, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9710354432. Throughput: 0: 54959.2. Samples: 200644640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:09:22,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 11:09:24,020][57339] Updated weights for policy 0, policy_version 592678 (0.0031) [2024-04-28 11:09:26,525][57339] Updated weights for policy 0, policy_version 592688 (0.0027) [2024-04-28 11:09:27,169][57108] Fps is (10 sec: 55704.0, 60 sec: 54613.1, 300 sec: 54650.3). Total num frames: 9710616576. Throughput: 0: 55027.4. Samples: 200973820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-04-28 11:09:27,170][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 11:09:27,243][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000592690_9710632960.pth... [2024-04-28 11:09:27,288][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000591888_9697492992.pth [2024-04-28 11:09:29,981][57339] Updated weights for policy 0, policy_version 592698 (0.0033) [2024-04-28 11:09:32,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54886.4, 300 sec: 54650.4). Total num frames: 9710895104. Throughput: 0: 54993.8. Samples: 201304920. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:09:32,169][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 11:09:32,375][57339] Updated weights for policy 0, policy_version 592708 (0.0027) [2024-04-28 11:09:35,857][57339] Updated weights for policy 0, policy_version 592718 (0.0032) [2024-04-28 11:09:37,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55432.6, 300 sec: 54761.5). Total num frames: 9711190016. Throughput: 0: 55098.3. Samples: 201471180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:09:37,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 11:09:38,413][57339] Updated weights for policy 0, policy_version 592728 (0.0034) [2024-04-28 11:09:41,671][57319] Signal inference workers to stop experience collection... (2850 times) [2024-04-28 11:09:41,671][57319] Signal inference workers to resume experience collection... (2850 times) [2024-04-28 11:09:41,698][57339] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-04-28 11:09:41,698][57339] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-04-28 11:09:41,787][57339] Updated weights for policy 0, policy_version 592738 (0.0040) [2024-04-28 11:09:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 54705.9). Total num frames: 9711435776. Throughput: 0: 55042.2. Samples: 201797280. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:09:42,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 11:09:44,386][57339] Updated weights for policy 0, policy_version 592748 (0.0029) [2024-04-28 11:09:47,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.5, 300 sec: 54761.5). Total num frames: 9711714304. Throughput: 0: 55156.1. Samples: 202130320. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:09:47,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 11:09:47,633][57339] Updated weights for policy 0, policy_version 592758 (0.0026) [2024-04-28 11:09:50,410][57339] Updated weights for policy 0, policy_version 592768 (0.0027) [2024-04-28 11:09:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54613.4, 300 sec: 54817.0). Total num frames: 9711992832. Throughput: 0: 55169.8. Samples: 202297220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:09:52,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 11:09:53,613][57339] Updated weights for policy 0, policy_version 592778 (0.0027) [2024-04-28 11:09:56,279][57339] Updated weights for policy 0, policy_version 592788 (0.0031) [2024-04-28 11:09:57,169][57108] Fps is (10 sec: 57342.4, 60 sec: 55159.2, 300 sec: 54816.9). Total num frames: 9712287744. Throughput: 0: 55135.3. Samples: 202628260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:09:57,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 11:09:59,453][57339] Updated weights for policy 0, policy_version 592798 (0.0030) [2024-04-28 11:10:02,169][57108] Fps is (10 sec: 54067.7, 60 sec: 54886.5, 300 sec: 54594.9). Total num frames: 9712533504. Throughput: 0: 55093.3. Samples: 202958720. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:02,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 11:10:02,320][57339] Updated weights for policy 0, policy_version 592808 (0.0026) [2024-04-28 11:10:05,215][57339] Updated weights for policy 0, policy_version 592818 (0.0031) [2024-04-28 11:10:07,169][57108] Fps is (10 sec: 52429.5, 60 sec: 54886.4, 300 sec: 54650.4). Total num frames: 9712812032. Throughput: 0: 55125.3. Samples: 203125280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:07,170][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 11:10:08,286][57339] Updated weights for policy 0, policy_version 592828 (0.0029) [2024-04-28 11:10:11,151][57339] Updated weights for policy 0, policy_version 592838 (0.0026) [2024-04-28 11:10:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 54650.4). Total num frames: 9713090560. Throughput: 0: 55154.1. Samples: 203455740. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:12,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 11:10:14,229][57339] Updated weights for policy 0, policy_version 592848 (0.0025) [2024-04-28 11:10:17,085][57339] Updated weights for policy 0, policy_version 592858 (0.0032) [2024-04-28 11:10:17,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 54817.0). Total num frames: 9713385472. Throughput: 0: 55136.4. Samples: 203786060. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:17,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 11:10:20,142][57339] Updated weights for policy 0, policy_version 592868 (0.0028) [2024-04-28 11:10:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54613.4, 300 sec: 54705.9). Total num frames: 9713631232. Throughput: 0: 55018.3. Samples: 203947000. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:22,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:10:23,157][57339] Updated weights for policy 0, policy_version 592878 (0.0032) [2024-04-28 11:10:26,168][57339] Updated weights for policy 0, policy_version 592888 (0.0028) [2024-04-28 11:10:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.7, 300 sec: 54817.0). Total num frames: 9713926144. Throughput: 0: 55038.6. Samples: 204274020. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:27,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:10:29,118][57339] Updated weights for policy 0, policy_version 592898 (0.0031) [2024-04-28 11:10:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 54650.4). Total num frames: 9714188288. Throughput: 0: 54955.9. Samples: 204603340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:32,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 11:10:32,224][57339] Updated weights for policy 0, policy_version 592908 (0.0026) [2024-04-28 11:10:35,058][57339] Updated weights for policy 0, policy_version 592918 (0.0032) [2024-04-28 11:10:37,169][57108] Fps is (10 sec: 54066.4, 60 sec: 54613.2, 300 sec: 54650.3). Total num frames: 9714466816. Throughput: 0: 54974.5. Samples: 204771080. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:37,170][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 11:10:38,150][57339] Updated weights for policy 0, policy_version 592928 (0.0027) [2024-04-28 11:10:40,839][57339] Updated weights for policy 0, policy_version 592938 (0.0030) [2024-04-28 11:10:42,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.3, 300 sec: 54650.4). Total num frames: 9714728960. Throughput: 0: 54933.1. Samples: 205100240. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:42,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 11:10:44,006][57339] Updated weights for policy 0, policy_version 592948 (0.0033) [2024-04-28 11:10:46,844][57339] Updated weights for policy 0, policy_version 592958 (0.0032) [2024-04-28 11:10:47,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55159.4, 300 sec: 54817.0). Total num frames: 9715023872. Throughput: 0: 54850.6. Samples: 205427000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:47,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 11:10:49,954][57319] Signal inference workers to stop experience collection... (2900 times) [2024-04-28 11:10:49,954][57319] Signal inference workers to resume experience collection... (2900 times) [2024-04-28 11:10:49,979][57339] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-04-28 11:10:49,979][57339] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-04-28 11:10:50,062][57339] Updated weights for policy 0, policy_version 592968 (0.0029) [2024-04-28 11:10:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54886.3, 300 sec: 54761.4). Total num frames: 9715286016. Throughput: 0: 54794.2. Samples: 205591020. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:52,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 11:10:52,900][57339] Updated weights for policy 0, policy_version 592978 (0.0027) [2024-04-28 11:10:56,107][57339] Updated weights for policy 0, policy_version 592988 (0.0026) [2024-04-28 11:10:57,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54613.5, 300 sec: 54817.0). Total num frames: 9715564544. Throughput: 0: 54794.6. Samples: 205921500. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:10:57,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 11:10:58,742][57339] Updated weights for policy 0, policy_version 592998 (0.0030) [2024-04-28 11:11:01,955][57339] Updated weights for policy 0, policy_version 593008 (0.0031) [2024-04-28 11:11:02,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 54761.4). Total num frames: 9715859456. Throughput: 0: 54853.8. Samples: 206254480. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 11:11:02,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 11:11:04,847][57339] Updated weights for policy 0, policy_version 593018 (0.0031) [2024-04-28 11:11:07,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 54761.5). Total num frames: 9716121600. Throughput: 0: 54808.8. Samples: 206413400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:07,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 11:11:07,887][57339] Updated weights for policy 0, policy_version 593028 (0.0026) [2024-04-28 11:11:10,904][57339] Updated weights for policy 0, policy_version 593038 (0.0031) [2024-04-28 11:11:12,169][57108] Fps is (10 sec: 50790.9, 60 sec: 54613.3, 300 sec: 54594.9). Total num frames: 9716367360. Throughput: 0: 54928.1. Samples: 206745780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:12,169][57108] Avg episode reward: [(0, '0.708')] [2024-04-28 11:11:13,943][57339] Updated weights for policy 0, policy_version 593048 (0.0027) [2024-04-28 11:11:16,808][57339] Updated weights for policy 0, policy_version 593058 (0.0034) [2024-04-28 11:11:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9716662272. Throughput: 0: 54916.4. Samples: 207074580. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:17,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 11:11:20,020][57339] Updated weights for policy 0, policy_version 593068 (0.0032) [2024-04-28 11:11:22,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55432.5, 300 sec: 54872.5). Total num frames: 9716957184. Throughput: 0: 54886.9. Samples: 207240980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:22,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 11:11:22,700][57339] Updated weights for policy 0, policy_version 593078 (0.0031) [2024-04-28 11:11:25,860][57339] Updated weights for policy 0, policy_version 593088 (0.0033) [2024-04-28 11:11:27,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.3, 300 sec: 54872.5). Total num frames: 9717219328. Throughput: 0: 54922.1. Samples: 207571740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:27,170][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 11:11:27,287][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000593093_9717235712.pth... [2024-04-28 11:11:27,336][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000592290_9704079360.pth [2024-04-28 11:11:28,663][57339] Updated weights for policy 0, policy_version 593098 (0.0030) [2024-04-28 11:11:31,851][57339] Updated weights for policy 0, policy_version 593108 (0.0028) [2024-04-28 11:11:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9717497856. Throughput: 0: 54972.5. Samples: 207900760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:32,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 11:11:34,612][57339] Updated weights for policy 0, policy_version 593118 (0.0032) [2024-04-28 11:11:37,169][57108] Fps is (10 sec: 52429.3, 60 sec: 54613.4, 300 sec: 54705.9). Total num frames: 9717743616. Throughput: 0: 54869.4. Samples: 208060140. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:37,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 11:11:37,986][57339] Updated weights for policy 0, policy_version 593128 (0.0033) [2024-04-28 11:11:40,812][57339] Updated weights for policy 0, policy_version 593138 (0.0030) [2024-04-28 11:11:42,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9718022144. Throughput: 0: 54838.2. Samples: 208389220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:42,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:11:43,927][57339] Updated weights for policy 0, policy_version 593148 (0.0025) [2024-04-28 11:11:46,845][57339] Updated weights for policy 0, policy_version 593158 (0.0025) [2024-04-28 11:11:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9718300672. Throughput: 0: 54793.0. Samples: 208720160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:47,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 11:11:48,245][57319] Signal inference workers to stop experience collection... (2950 times) [2024-04-28 11:11:48,249][57319] Signal inference workers to resume experience collection... (2950 times) [2024-04-28 11:11:48,273][57339] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-04-28 11:11:48,273][57339] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-04-28 11:11:49,851][57339] Updated weights for policy 0, policy_version 593168 (0.0038) [2024-04-28 11:11:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9718579200. Throughput: 0: 55011.0. Samples: 208888900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:52,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 11:11:52,925][57339] Updated weights for policy 0, policy_version 593178 (0.0032) [2024-04-28 11:11:55,836][57339] Updated weights for policy 0, policy_version 593188 (0.0036) [2024-04-28 11:11:57,169][57108] Fps is (10 sec: 57342.8, 60 sec: 55159.3, 300 sec: 54928.0). Total num frames: 9718874112. Throughput: 0: 54817.9. Samples: 209212600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:11:57,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:11:58,820][57339] Updated weights for policy 0, policy_version 593198 (0.0028) [2024-04-28 11:12:01,941][57339] Updated weights for policy 0, policy_version 593208 (0.0028) [2024-04-28 11:12:02,169][57108] Fps is (10 sec: 54068.1, 60 sec: 54340.3, 300 sec: 54761.5). Total num frames: 9719119872. Throughput: 0: 54866.3. Samples: 209543560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:12:02,169][57108] Avg episode reward: [(0, '0.719')] [2024-04-28 11:12:04,639][57339] Updated weights for policy 0, policy_version 593218 (0.0026) [2024-04-28 11:12:07,169][57108] Fps is (10 sec: 50791.2, 60 sec: 54340.3, 300 sec: 54705.9). Total num frames: 9719382016. Throughput: 0: 54614.2. Samples: 209698620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:12:07,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 11:12:07,977][57339] Updated weights for policy 0, policy_version 593228 (0.0027) [2024-04-28 11:12:10,766][57339] Updated weights for policy 0, policy_version 593238 (0.0029) [2024-04-28 11:12:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 54761.4). Total num frames: 9719676928. Throughput: 0: 54587.7. Samples: 210028180. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:12:12,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 11:12:13,884][57339] Updated weights for policy 0, policy_version 593248 (0.0024) [2024-04-28 11:12:16,768][57339] Updated weights for policy 0, policy_version 593258 (0.0027) [2024-04-28 11:12:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9719939072. Throughput: 0: 54597.6. Samples: 210357660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:12:17,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 11:12:19,766][57339] Updated weights for policy 0, policy_version 593268 (0.0036) [2024-04-28 11:12:22,169][57108] Fps is (10 sec: 55704.9, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 9720233984. Throughput: 0: 54679.9. Samples: 210520740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:12:22,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:12:22,903][57339] Updated weights for policy 0, policy_version 593278 (0.0031) [2024-04-28 11:12:25,847][57339] Updated weights for policy 0, policy_version 593288 (0.0023) [2024-04-28 11:12:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54886.5, 300 sec: 54872.5). Total num frames: 9720512512. Throughput: 0: 54763.1. Samples: 210853560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:12:27,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 11:12:29,372][57339] Updated weights for policy 0, policy_version 593298 (0.0031) [2024-04-28 11:12:31,829][57339] Updated weights for policy 0, policy_version 593308 (0.0032) [2024-04-28 11:12:32,169][57108] Fps is (10 sec: 54067.8, 60 sec: 54613.2, 300 sec: 54817.0). Total num frames: 9720774656. Throughput: 0: 54628.4. Samples: 211178440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 11:12:32,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 11:12:35,184][57339] Updated weights for policy 0, policy_version 593318 (0.0028) [2024-04-28 11:12:37,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54886.3, 300 sec: 54816.9). Total num frames: 9721036800. Throughput: 0: 54523.0. Samples: 211342440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:12:37,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 11:12:37,667][57339] Updated weights for policy 0, policy_version 593328 (0.0027) [2024-04-28 11:12:41,027][57339] Updated weights for policy 0, policy_version 593338 (0.0027) [2024-04-28 11:12:41,663][57319] Signal inference workers to stop experience collection... (3000 times) [2024-04-28 11:12:41,664][57319] Signal inference workers to resume experience collection... (3000 times) [2024-04-28 11:12:41,695][57339] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-04-28 11:12:41,695][57339] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-04-28 11:12:42,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54613.4, 300 sec: 54705.9). Total num frames: 9721298944. Throughput: 0: 54596.2. Samples: 211669420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:12:42,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 11:12:43,705][57339] Updated weights for policy 0, policy_version 593348 (0.0027) [2024-04-28 11:12:47,056][57339] Updated weights for policy 0, policy_version 593358 (0.0028) [2024-04-28 11:12:47,169][57108] Fps is (10 sec: 54068.2, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9721577472. Throughput: 0: 54500.8. Samples: 211996100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:12:47,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 11:12:49,712][57339] Updated weights for policy 0, policy_version 593368 (0.0038) [2024-04-28 11:12:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 54886.6, 300 sec: 54817.0). Total num frames: 9721872384. Throughput: 0: 54702.8. Samples: 212160240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:12:52,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 11:12:53,080][57339] Updated weights for policy 0, policy_version 593378 (0.0030) [2024-04-28 11:12:55,649][57339] Updated weights for policy 0, policy_version 593388 (0.0031) [2024-04-28 11:12:57,169][57108] Fps is (10 sec: 55704.7, 60 sec: 54340.3, 300 sec: 54816.9). Total num frames: 9722134528. Throughput: 0: 54647.4. Samples: 212487320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:12:57,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 11:12:58,911][57339] Updated weights for policy 0, policy_version 593398 (0.0033) [2024-04-28 11:13:01,512][57339] Updated weights for policy 0, policy_version 593408 (0.0029) [2024-04-28 11:13:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9722413056. Throughput: 0: 54537.0. Samples: 212811820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:02,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 11:13:04,829][57339] Updated weights for policy 0, policy_version 593418 (0.0033) [2024-04-28 11:13:07,169][57108] Fps is (10 sec: 54067.8, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9722675200. Throughput: 0: 54796.5. Samples: 212986580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:07,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 11:13:07,482][57339] Updated weights for policy 0, policy_version 593428 (0.0030) [2024-04-28 11:13:11,060][57339] Updated weights for policy 0, policy_version 593438 (0.0029) [2024-04-28 11:13:12,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54340.2, 300 sec: 54817.0). Total num frames: 9722937344. Throughput: 0: 54693.4. Samples: 213314760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:12,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 11:13:13,641][57339] Updated weights for policy 0, policy_version 593448 (0.0033) [2024-04-28 11:13:16,992][57339] Updated weights for policy 0, policy_version 593458 (0.0033) [2024-04-28 11:13:17,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54613.3, 300 sec: 54705.9). Total num frames: 9723215872. Throughput: 0: 54755.5. Samples: 213642440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:17,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 11:13:19,640][57339] Updated weights for policy 0, policy_version 593468 (0.0031) [2024-04-28 11:13:22,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54340.4, 300 sec: 54761.4). Total num frames: 9723494400. Throughput: 0: 54613.6. Samples: 213800040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:22,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 11:13:23,013][57339] Updated weights for policy 0, policy_version 593478 (0.0039) [2024-04-28 11:13:25,751][57339] Updated weights for policy 0, policy_version 593488 (0.0027) [2024-04-28 11:13:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 54613.3, 300 sec: 54872.5). Total num frames: 9723789312. Throughput: 0: 54696.8. Samples: 214130780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:27,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 11:13:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000593493_9723789312.pth... [2024-04-28 11:13:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000592690_9710632960.pth [2024-04-28 11:13:28,867][57339] Updated weights for policy 0, policy_version 593498 (0.0027) [2024-04-28 11:13:31,604][57339] Updated weights for policy 0, policy_version 593508 (0.0028) [2024-04-28 11:13:32,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54613.4, 300 sec: 54872.6). Total num frames: 9724051456. Throughput: 0: 54589.9. Samples: 214452640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:32,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 11:13:35,100][57339] Updated weights for policy 0, policy_version 593518 (0.0026) [2024-04-28 11:13:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.5, 300 sec: 54928.1). Total num frames: 9724329984. Throughput: 0: 54764.3. Samples: 214624640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:37,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:13:37,627][57339] Updated weights for policy 0, policy_version 593528 (0.0025) [2024-04-28 11:13:41,060][57339] Updated weights for policy 0, policy_version 593538 (0.0029) [2024-04-28 11:13:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9724592128. Throughput: 0: 54818.4. Samples: 214954140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:42,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 11:13:43,632][57339] Updated weights for policy 0, policy_version 593548 (0.0032) [2024-04-28 11:13:47,007][57339] Updated weights for policy 0, policy_version 593558 (0.0033) [2024-04-28 11:13:47,169][57108] Fps is (10 sec: 52429.2, 60 sec: 54613.3, 300 sec: 54705.9). 
Total num frames: 9724854272. Throughput: 0: 54915.1. Samples: 215283000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:47,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 11:13:49,567][57339] Updated weights for policy 0, policy_version 593568 (0.0027) [2024-04-28 11:13:52,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54340.2, 300 sec: 54761.4). Total num frames: 9725132800. Throughput: 0: 54524.5. Samples: 215440180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:52,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 11:13:53,118][57339] Updated weights for policy 0, policy_version 593578 (0.0032) [2024-04-28 11:13:55,513][57339] Updated weights for policy 0, policy_version 593588 (0.0028) [2024-04-28 11:13:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54613.5, 300 sec: 54817.0). Total num frames: 9725411328. Throughput: 0: 54453.9. Samples: 215765180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:13:57,169][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 11:13:58,967][57339] Updated weights for policy 0, policy_version 593598 (0.0032) [2024-04-28 11:14:00,751][57319] Signal inference workers to stop experience collection... (3050 times) [2024-04-28 11:14:00,751][57319] Signal inference workers to resume experience collection... (3050 times) [2024-04-28 11:14:00,773][57339] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-04-28 11:14:00,773][57339] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-04-28 11:14:01,488][57339] Updated weights for policy 0, policy_version 593608 (0.0030) [2024-04-28 11:14:02,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54613.3, 300 sec: 54817.0). Total num frames: 9725689856. Throughput: 0: 54490.8. Samples: 216094520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:14:02,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 11:14:04,806][57339] Updated weights for policy 0, policy_version 593618 (0.0027) [2024-04-28 11:14:07,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54886.4, 300 sec: 54928.0). Total num frames: 9725968384. Throughput: 0: 54843.5. Samples: 216268000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 11:14:07,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 11:14:07,395][57339] Updated weights for policy 0, policy_version 593628 (0.0027) [2024-04-28 11:14:10,731][57339] Updated weights for policy 0, policy_version 593638 (0.0034) [2024-04-28 11:14:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 54928.1). Total num frames: 9726263296. Throughput: 0: 54746.4. Samples: 216594360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:12,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 11:14:13,468][57339] Updated weights for policy 0, policy_version 593648 (0.0030) [2024-04-28 11:14:16,816][57339] Updated weights for policy 0, policy_version 593658 (0.0027) [2024-04-28 11:14:17,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54886.5, 300 sec: 54761.5). Total num frames: 9726509056. Throughput: 0: 55078.6. Samples: 216931180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:17,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 11:14:19,277][57339] Updated weights for policy 0, policy_version 593668 (0.0032) [2024-04-28 11:14:22,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54886.4, 300 sec: 54817.0). Total num frames: 9726787584. Throughput: 0: 54666.3. Samples: 217084620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:22,169][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 11:14:22,796][57339] Updated weights for policy 0, policy_version 593678 (0.0032) [2024-04-28 11:14:25,309][57339] Updated weights for policy 0, policy_version 593688 (0.0034) [2024-04-28 11:14:27,169][57108] Fps is (10 sec: 54066.2, 60 sec: 54340.2, 300 sec: 54761.4). Total num frames: 9727049728. Throughput: 0: 54688.3. Samples: 217415120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:27,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:14:28,806][57339] Updated weights for policy 0, policy_version 593698 (0.0032) [2024-04-28 11:14:31,292][57339] Updated weights for policy 0, policy_version 593708 (0.0025) [2024-04-28 11:14:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54613.2, 300 sec: 54705.9). Total num frames: 9727328256. Throughput: 0: 54719.9. Samples: 217745400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:32,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 11:14:34,681][57339] Updated weights for policy 0, policy_version 593718 (0.0027) [2024-04-28 11:14:37,169][57108] Fps is (10 sec: 57344.5, 60 sec: 54886.4, 300 sec: 54872.5). Total num frames: 9727623168. Throughput: 0: 55019.6. Samples: 217916060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:37,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 11:14:37,271][57339] Updated weights for policy 0, policy_version 593728 (0.0026) [2024-04-28 11:14:40,659][57339] Updated weights for policy 0, policy_version 593738 (0.0033) [2024-04-28 11:14:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.5, 300 sec: 54872.5). Total num frames: 9727901696. Throughput: 0: 55059.0. Samples: 218242840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:42,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 11:14:43,156][57339] Updated weights for policy 0, policy_version 593748 (0.0025) [2024-04-28 11:14:46,639][57339] Updated weights for policy 0, policy_version 593758 (0.0034) [2024-04-28 11:14:47,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54886.4, 300 sec: 54761.4). Total num frames: 9728147456. Throughput: 0: 55047.5. Samples: 218571660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:47,170][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 11:14:49,218][57339] Updated weights for policy 0, policy_version 593768 (0.0033) [2024-04-28 11:14:52,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54886.4, 300 sec: 54705.9). Total num frames: 9728425984. Throughput: 0: 54814.3. Samples: 218734640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:52,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 11:14:52,550][57339] Updated weights for policy 0, policy_version 593778 (0.0025) [2024-04-28 11:14:55,142][57339] Updated weights for policy 0, policy_version 593788 (0.0034) [2024-04-28 11:14:57,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9728704512. Throughput: 0: 55006.5. Samples: 219069660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:14:57,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 11:14:58,334][57339] Updated weights for policy 0, policy_version 593798 (0.0036) [2024-04-28 11:15:01,192][57339] Updated weights for policy 0, policy_version 593808 (0.0030) [2024-04-28 11:15:02,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.5, 300 sec: 54872.5). Total num frames: 9728999424. Throughput: 0: 54866.6. Samples: 219400180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:15:02,169][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 11:15:04,325][57339] Updated weights for policy 0, policy_version 593818 (0.0030) [2024-04-28 11:15:07,159][57339] Updated weights for policy 0, policy_version 593828 (0.0038) [2024-04-28 11:15:07,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 54872.5). Total num frames: 9729277952. Throughput: 0: 55106.1. Samples: 219564400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:15:07,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 11:15:10,240][57339] Updated weights for policy 0, policy_version 593838 (0.0034) [2024-04-28 11:15:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.3, 300 sec: 54817.0). Total num frames: 9729556480. Throughput: 0: 55058.3. Samples: 219892740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:15:12,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 11:15:13,047][57339] Updated weights for policy 0, policy_version 593848 (0.0041) [2024-04-28 11:15:16,058][57339] Updated weights for policy 0, policy_version 593858 (0.0023) [2024-04-28 11:15:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.3, 300 sec: 54872.5). Total num frames: 9729818624. Throughput: 0: 55155.5. Samples: 220227400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:15:17,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:15:19,003][57339] Updated weights for policy 0, policy_version 593868 (0.0032) [2024-04-28 11:15:21,937][57339] Updated weights for policy 0, policy_version 593878 (0.0029) [2024-04-28 11:15:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9730097152. Throughput: 0: 55096.9. Samples: 220395420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:15:22,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 11:15:24,810][57339] Updated weights for policy 0, policy_version 593888 (0.0027) [2024-04-28 11:15:26,746][57319] Signal inference workers to stop experience collection... (3100 times) [2024-04-28 11:15:26,746][57319] Signal inference workers to resume experience collection... (3100 times) [2024-04-28 11:15:26,773][57339] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-04-28 11:15:26,773][57339] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-04-28 11:15:27,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9730359296. Throughput: 0: 55228.3. Samples: 220728120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:15:27,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 11:15:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000593894_9730359296.pth... [2024-04-28 11:15:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000593093_9717235712.pth [2024-04-28 11:15:27,845][57339] Updated weights for policy 0, policy_version 593898 (0.0027) [2024-04-28 11:15:30,749][57339] Updated weights for policy 0, policy_version 593908 (0.0028) [2024-04-28 11:15:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 54817.0). Total num frames: 9730637824. Throughput: 0: 55164.8. Samples: 221054080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:15:32,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 11:15:33,917][57339] Updated weights for policy 0, policy_version 593918 (0.0028) [2024-04-28 11:15:36,624][57339] Updated weights for policy 0, policy_version 593928 (0.0025) [2024-04-28 11:15:37,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.4, 300 sec: 54928.0). Total num frames: 9730932736. Throughput: 0: 55342.1. Samples: 221225040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:15:37,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 11:15:39,848][57339] Updated weights for policy 0, policy_version 593938 (0.0033) [2024-04-28 11:15:42,169][57108] Fps is (10 sec: 54066.7, 60 sec: 54613.2, 300 sec: 54761.4). Total num frames: 9731178496. Throughput: 0: 55167.0. Samples: 221552180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 11:15:42,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 11:15:42,676][57339] Updated weights for policy 0, policy_version 593948 (0.0028) [2024-04-28 11:15:45,883][57339] Updated weights for policy 0, policy_version 593958 (0.0031) [2024-04-28 11:15:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 54872.5). Total num frames: 9731473408. Throughput: 0: 55116.8. Samples: 221880440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:15:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 11:15:48,584][57339] Updated weights for policy 0, policy_version 593968 (0.0030) [2024-04-28 11:15:51,742][57339] Updated weights for policy 0, policy_version 593978 (0.0027) [2024-04-28 11:15:52,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.4, 300 sec: 54817.0). Total num frames: 9731735552. Throughput: 0: 55219.7. Samples: 222049280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:15:52,170][57108] Avg episode reward: [(0, '0.513')] [2024-04-28 11:15:54,409][57339] Updated weights for policy 0, policy_version 593988 (0.0028) [2024-04-28 11:15:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 54761.4). Total num frames: 9732014080. Throughput: 0: 55267.2. Samples: 222379760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:15:57,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 11:15:57,794][57339] Updated weights for policy 0, policy_version 593998 (0.0029) [2024-04-28 11:16:00,383][57339] Updated weights for policy 0, policy_version 594008 (0.0029) [2024-04-28 11:16:02,169][57108] Fps is (10 sec: 54067.2, 60 sec: 54613.3, 300 sec: 54761.4). Total num frames: 9732276224. Throughput: 0: 55183.7. Samples: 222710660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:02,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 11:16:03,811][57339] Updated weights for policy 0, policy_version 594018 (0.0028) [2024-04-28 11:16:06,266][57339] Updated weights for policy 0, policy_version 594028 (0.0034) [2024-04-28 11:16:07,169][57108] Fps is (10 sec: 55704.5, 60 sec: 54886.3, 300 sec: 54928.0). Total num frames: 9732571136. Throughput: 0: 55030.9. Samples: 222871820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:07,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 11:16:09,813][57339] Updated weights for policy 0, policy_version 594038 (0.0027) [2024-04-28 11:16:12,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9732866048. Throughput: 0: 54938.3. Samples: 223200340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:12,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 11:16:12,221][57339] Updated weights for policy 0, policy_version 594048 (0.0029) [2024-04-28 11:16:15,719][57339] Updated weights for policy 0, policy_version 594058 (0.0034) [2024-04-28 11:16:17,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55159.6, 300 sec: 54817.0). Total num frames: 9733128192. Throughput: 0: 55033.4. Samples: 223530580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:17,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 11:16:18,151][57339] Updated weights for policy 0, policy_version 594068 (0.0034) [2024-04-28 11:16:21,693][57339] Updated weights for policy 0, policy_version 594078 (0.0029) [2024-04-28 11:16:22,169][57108] Fps is (10 sec: 52429.2, 60 sec: 54886.5, 300 sec: 54817.0). Total num frames: 9733390336. Throughput: 0: 55023.3. Samples: 223701080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:22,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 11:16:24,253][57339] Updated weights for policy 0, policy_version 594088 (0.0026) [2024-04-28 11:16:27,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55159.5, 300 sec: 54816.9). Total num frames: 9733668864. Throughput: 0: 55076.5. Samples: 224030620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:27,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 11:16:27,556][57339] Updated weights for policy 0, policy_version 594098 (0.0030) [2024-04-28 11:16:29,549][57319] Signal inference workers to stop experience collection... (3150 times) [2024-04-28 11:16:29,571][57339] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-04-28 11:16:29,607][57319] Signal inference workers to resume experience collection... (3150 times) [2024-04-28 11:16:29,607][57339] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-04-28 11:16:30,178][57339] Updated weights for policy 0, policy_version 594108 (0.0025) [2024-04-28 11:16:32,169][57108] Fps is (10 sec: 55704.2, 60 sec: 55159.3, 300 sec: 54928.0). Total num frames: 9733947392. Throughput: 0: 55016.3. Samples: 224356180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:32,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 11:16:33,449][57339] Updated weights for policy 0, policy_version 594118 (0.0028) [2024-04-28 11:16:36,148][57339] Updated weights for policy 0, policy_version 594128 (0.0035) [2024-04-28 11:16:37,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54613.4, 300 sec: 54872.5). Total num frames: 9734209536. Throughput: 0: 54844.5. Samples: 224517280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:37,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 11:16:39,331][57339] Updated weights for policy 0, policy_version 594138 (0.0026) [2024-04-28 11:16:42,094][57339] Updated weights for policy 0, policy_version 594148 (0.0030) [2024-04-28 11:16:42,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 54983.6). Total num frames: 9734520832. Throughput: 0: 54937.2. Samples: 224851940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:42,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 11:16:45,263][57339] Updated weights for policy 0, policy_version 594158 (0.0032) [2024-04-28 11:16:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.5, 300 sec: 54928.1). Total num frames: 9734782976. Throughput: 0: 55023.6. Samples: 225186720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:47,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 11:16:48,144][57339] Updated weights for policy 0, policy_version 594168 (0.0027) [2024-04-28 11:16:51,367][57339] Updated weights for policy 0, policy_version 594178 (0.0037) [2024-04-28 11:16:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 54872.5). Total num frames: 9735061504. Throughput: 0: 55239.2. Samples: 225357580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:52,170][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 11:16:54,252][57339] Updated weights for policy 0, policy_version 594188 (0.0030) [2024-04-28 11:16:57,094][57339] Updated weights for policy 0, policy_version 594198 (0.0030) [2024-04-28 11:16:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 54983.6). Total num frames: 9735340032. Throughput: 0: 55321.3. Samples: 225689800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:16:57,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 11:17:00,032][57339] Updated weights for policy 0, policy_version 594208 (0.0028) [2024-04-28 11:17:02,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.6, 300 sec: 54983.6). Total num frames: 9735602176. Throughput: 0: 55314.7. Samples: 226019740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:17:02,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 11:17:02,977][57339] Updated weights for policy 0, policy_version 594218 (0.0029) [2024-04-28 11:17:05,900][57339] Updated weights for policy 0, policy_version 594228 (0.0033) [2024-04-28 11:17:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.6, 300 sec: 54928.0). Total num frames: 9735880704. Throughput: 0: 55104.3. Samples: 226180780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:17:07,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 11:17:08,925][57339] Updated weights for policy 0, policy_version 594238 (0.0037) [2024-04-28 11:17:11,937][57339] Updated weights for policy 0, policy_version 594248 (0.0035) [2024-04-28 11:17:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54886.4, 300 sec: 54983.6). Total num frames: 9736159232. Throughput: 0: 55164.1. Samples: 226513000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 26.0) [2024-04-28 11:17:12,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 11:17:15,002][57339] Updated weights for policy 0, policy_version 594258 (0.0026) [2024-04-28 11:17:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 54983.6). Total num frames: 9736454144. Throughput: 0: 55413.6. Samples: 226849780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:17:17,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 11:17:17,818][57339] Updated weights for policy 0, policy_version 594268 (0.0027) [2024-04-28 11:17:20,717][57339] Updated weights for policy 0, policy_version 594278 (0.0033) [2024-04-28 11:17:22,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 54983.6). Total num frames: 9736732672. Throughput: 0: 55653.2. Samples: 227021680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:17:22,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 11:17:23,775][57339] Updated weights for policy 0, policy_version 594288 (0.0034) [2024-04-28 11:17:26,575][57339] Updated weights for policy 0, policy_version 594298 (0.0030) [2024-04-28 11:17:27,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 54983.6). Total num frames: 9736994816. Throughput: 0: 55592.0. Samples: 227353580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:17:27,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 11:17:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000594299_9736994816.pth... [2024-04-28 11:17:27,245][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000593493_9723789312.pth [2024-04-28 11:17:29,747][57339] Updated weights for policy 0, policy_version 594308 (0.0028) [2024-04-28 11:17:31,714][57319] Signal inference workers to stop experience collection... (3200 times) [2024-04-28 11:17:31,714][57319] Signal inference workers to resume experience collection... 
(3200 times) [2024-04-28 11:17:31,727][57339] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-04-28 11:17:31,727][57339] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-04-28 11:17:32,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.7, 300 sec: 55039.2). Total num frames: 9737273344. Throughput: 0: 55503.6. Samples: 227684380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:17:32,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 11:17:32,588][57339] Updated weights for policy 0, policy_version 594318 (0.0030) [2024-04-28 11:17:35,660][57339] Updated weights for policy 0, policy_version 594328 (0.0030) [2024-04-28 11:17:37,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55094.6). Total num frames: 9737551872. Throughput: 0: 55381.8. Samples: 227849760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:17:37,170][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 11:17:38,481][57339] Updated weights for policy 0, policy_version 594338 (0.0026) [2024-04-28 11:17:41,594][57339] Updated weights for policy 0, policy_version 594348 (0.0027) [2024-04-28 11:17:42,169][57108] Fps is (10 sec: 54066.1, 60 sec: 54886.3, 300 sec: 55039.1). Total num frames: 9737814016. Throughput: 0: 55383.4. Samples: 228182060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:17:42,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 11:17:44,330][57339] Updated weights for policy 0, policy_version 594358 (0.0031) [2024-04-28 11:17:47,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55039.1). Total num frames: 9738108928. Throughput: 0: 55423.5. Samples: 228513800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:17:47,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 11:17:47,591][57339] Updated weights for policy 0, policy_version 594368 (0.0029) [2024-04-28 11:17:50,223][57339] Updated weights for policy 0, policy_version 594378 (0.0027) [2024-04-28 11:17:52,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.6, 300 sec: 55094.7). Total num frames: 9738387456. Throughput: 0: 55508.5. Samples: 228678660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:17:52,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:17:53,560][57339] Updated weights for policy 0, policy_version 594388 (0.0037) [2024-04-28 11:17:56,343][57339] Updated weights for policy 0, policy_version 594398 (0.0025) [2024-04-28 11:17:57,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55094.7). Total num frames: 9738665984. Throughput: 0: 55521.7. Samples: 229011480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:17:57,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 11:17:59,435][57339] Updated weights for policy 0, policy_version 594408 (0.0033) [2024-04-28 11:18:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55094.7). Total num frames: 9738928128. Throughput: 0: 55379.6. Samples: 229341860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:02,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 11:18:02,199][57339] Updated weights for policy 0, policy_version 594418 (0.0032) [2024-04-28 11:18:05,298][57339] Updated weights for policy 0, policy_version 594428 (0.0028) [2024-04-28 11:18:07,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55150.2). Total num frames: 9739206656. Throughput: 0: 55193.9. Samples: 229505400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:07,178][57108] Avg episode reward: [(0, '0.705')] [2024-04-28 11:18:08,145][57339] Updated weights for policy 0, policy_version 594438 (0.0032) [2024-04-28 11:18:11,341][57339] Updated weights for policy 0, policy_version 594448 (0.0026) [2024-04-28 11:18:12,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55432.4, 300 sec: 55150.2). Total num frames: 9739485184. Throughput: 0: 55123.0. Samples: 229834120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:12,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 11:18:14,068][57339] Updated weights for policy 0, policy_version 594458 (0.0032) [2024-04-28 11:18:17,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54886.4, 300 sec: 55094.7). Total num frames: 9739747328. Throughput: 0: 55184.4. Samples: 230167680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:17,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 11:18:17,370][57339] Updated weights for policy 0, policy_version 594468 (0.0030) [2024-04-28 11:18:19,952][57339] Updated weights for policy 0, policy_version 594478 (0.0040) [2024-04-28 11:18:22,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55159.6, 300 sec: 55094.7). Total num frames: 9740042240. Throughput: 0: 55320.7. Samples: 230339180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:22,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 11:18:23,383][57339] Updated weights for policy 0, policy_version 594488 (0.0029) [2024-04-28 11:18:25,800][57339] Updated weights for policy 0, policy_version 594498 (0.0031) [2024-04-28 11:18:27,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55150.2). Total num frames: 9740320768. Throughput: 0: 55302.3. Samples: 230670660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:27,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 11:18:29,293][57339] Updated weights for policy 0, policy_version 594508 (0.0032) [2024-04-28 11:18:31,717][57339] Updated weights for policy 0, policy_version 594518 (0.0023) [2024-04-28 11:18:32,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55150.2). Total num frames: 9740599296. Throughput: 0: 55207.5. Samples: 230998140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:32,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 11:18:35,180][57339] Updated weights for policy 0, policy_version 594528 (0.0037) [2024-04-28 11:18:37,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55159.6, 300 sec: 55150.2). Total num frames: 9740861440. Throughput: 0: 55356.0. Samples: 231169680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:37,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 11:18:37,555][57339] Updated weights for policy 0, policy_version 594538 (0.0024) [2024-04-28 11:18:41,213][57339] Updated weights for policy 0, policy_version 594548 (0.0033) [2024-04-28 11:18:42,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.7, 300 sec: 55205.8). Total num frames: 9741139968. Throughput: 0: 55300.6. Samples: 231500000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:42,169][57108] Avg episode reward: [(0, '0.499')] [2024-04-28 11:18:42,963][57319] Signal inference workers to stop experience collection... (3250 times) [2024-04-28 11:18:42,963][57319] Signal inference workers to resume experience collection... 
(3250 times) [2024-04-28 11:18:42,971][57339] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-04-28 11:18:42,971][57339] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-04-28 11:18:43,506][57339] Updated weights for policy 0, policy_version 594558 (0.0029) [2024-04-28 11:18:47,063][57339] Updated weights for policy 0, policy_version 594568 (0.0027) [2024-04-28 11:18:47,169][57108] Fps is (10 sec: 54067.8, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9741402112. Throughput: 0: 55345.4. Samples: 231832400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 11:18:47,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 11:18:49,393][57339] Updated weights for policy 0, policy_version 594578 (0.0032) [2024-04-28 11:18:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55205.7). Total num frames: 9741697024. Throughput: 0: 55198.2. Samples: 231989320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:18:52,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 11:18:52,845][57339] Updated weights for policy 0, policy_version 594588 (0.0029) [2024-04-28 11:18:55,416][57339] Updated weights for policy 0, policy_version 594598 (0.0027) [2024-04-28 11:18:57,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55159.5, 300 sec: 55205.7). Total num frames: 9741975552. Throughput: 0: 55264.9. Samples: 232321040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:18:57,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 11:18:58,857][57339] Updated weights for policy 0, policy_version 594608 (0.0029) [2024-04-28 11:19:01,447][57339] Updated weights for policy 0, policy_version 594618 (0.0026) [2024-04-28 11:19:02,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 55150.2). Total num frames: 9742237696. Throughput: 0: 55184.0. Samples: 232650960. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:02,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 11:19:04,787][57339] Updated weights for policy 0, policy_version 594628 (0.0027) [2024-04-28 11:19:07,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55432.5, 300 sec: 55150.2). Total num frames: 9742532608. Throughput: 0: 55278.6. Samples: 232826720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:07,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 11:19:07,236][57339] Updated weights for policy 0, policy_version 594638 (0.0025) [2024-04-28 11:19:10,674][57339] Updated weights for policy 0, policy_version 594648 (0.0032) [2024-04-28 11:19:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55205.7). Total num frames: 9742794752. Throughput: 0: 55188.5. Samples: 233154140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:12,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 11:19:13,103][57339] Updated weights for policy 0, policy_version 594658 (0.0034) [2024-04-28 11:19:16,554][57339] Updated weights for policy 0, policy_version 594668 (0.0034) [2024-04-28 11:19:17,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9743056896. Throughput: 0: 55204.0. Samples: 233482320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:17,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 11:19:19,032][57339] Updated weights for policy 0, policy_version 594678 (0.0036) [2024-04-28 11:19:22,169][57108] Fps is (10 sec: 52429.2, 60 sec: 54613.3, 300 sec: 55150.2). Total num frames: 9743319040. Throughput: 0: 55040.5. Samples: 233646500. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:22,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 11:19:22,607][57339] Updated weights for policy 0, policy_version 594688 (0.0031) [2024-04-28 11:19:25,129][57339] Updated weights for policy 0, policy_version 594698 (0.0033) [2024-04-28 11:19:27,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54886.5, 300 sec: 55205.8). Total num frames: 9743613952. Throughput: 0: 55074.6. Samples: 233978360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:27,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 11:19:27,210][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000594704_9743630336.pth... [2024-04-28 11:19:27,255][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000593894_9730359296.pth [2024-04-28 11:19:28,533][57339] Updated weights for policy 0, policy_version 594708 (0.0030) [2024-04-28 11:19:30,910][57339] Updated weights for policy 0, policy_version 594718 (0.0028) [2024-04-28 11:19:32,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 9743925248. Throughput: 0: 55135.9. Samples: 234313520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:32,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 11:19:34,314][57339] Updated weights for policy 0, policy_version 594728 (0.0026) [2024-04-28 11:19:36,837][57339] Updated weights for policy 0, policy_version 594738 (0.0037) [2024-04-28 11:19:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55205.7). Total num frames: 9744187392. Throughput: 0: 55376.8. Samples: 234481280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:37,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 11:19:40,282][57339] Updated weights for policy 0, policy_version 594748 (0.0026) [2024-04-28 11:19:42,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.4, 300 sec: 55261.3). 
Total num frames: 9744449536. Throughput: 0: 55354.7. Samples: 234812000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:42,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 11:19:42,947][57339] Updated weights for policy 0, policy_version 594758 (0.0027) [2024-04-28 11:19:46,184][57339] Updated weights for policy 0, policy_version 594768 (0.0026) [2024-04-28 11:19:46,220][57319] Signal inference workers to stop experience collection... (3300 times) [2024-04-28 11:19:46,221][57319] Signal inference workers to resume experience collection... (3300 times) [2024-04-28 11:19:46,252][57339] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-04-28 11:19:46,252][57339] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-04-28 11:19:47,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.4, 300 sec: 55316.8). Total num frames: 9744744448. Throughput: 0: 55404.8. Samples: 235144180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:47,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 11:19:48,779][57339] Updated weights for policy 0, policy_version 594778 (0.0033) [2024-04-28 11:19:52,094][57339] Updated weights for policy 0, policy_version 594788 (0.0026) [2024-04-28 11:19:52,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9745006592. Throughput: 0: 55106.7. Samples: 235306520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:52,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 11:19:54,636][57339] Updated weights for policy 0, policy_version 594798 (0.0027) [2024-04-28 11:19:57,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 9745268736. Throughput: 0: 55184.5. Samples: 235637440. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:19:57,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 11:19:58,111][57339] Updated weights for policy 0, policy_version 594808 (0.0027) [2024-04-28 11:20:00,402][57339] Updated weights for policy 0, policy_version 594818 (0.0030) [2024-04-28 11:20:02,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9745547264. Throughput: 0: 55261.8. Samples: 235969100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:20:02,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 11:20:04,144][57339] Updated weights for policy 0, policy_version 594828 (0.0027) [2024-04-28 11:20:06,416][57339] Updated weights for policy 0, policy_version 594838 (0.0027) [2024-04-28 11:20:07,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 9745858560. Throughput: 0: 55298.2. Samples: 236134920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:20:07,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 11:20:10,071][57339] Updated weights for policy 0, policy_version 594848 (0.0028) [2024-04-28 11:20:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55261.3). Total num frames: 9746120704. Throughput: 0: 55339.9. Samples: 236468660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:20:12,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 11:20:12,384][57339] Updated weights for policy 0, policy_version 594858 (0.0028) [2024-04-28 11:20:15,880][57339] Updated weights for policy 0, policy_version 594868 (0.0031) [2024-04-28 11:20:17,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55261.3). Total num frames: 9746399232. Throughput: 0: 55339.6. Samples: 236803800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:20:17,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 11:20:18,160][57339] Updated weights for policy 0, policy_version 594878 (0.0025) [2024-04-28 11:20:21,800][57339] Updated weights for policy 0, policy_version 594888 (0.0031) [2024-04-28 11:20:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55261.3). Total num frames: 9746661376. Throughput: 0: 55104.9. Samples: 236961000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 11:20:22,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 11:20:24,261][57339] Updated weights for policy 0, policy_version 594898 (0.0028) [2024-04-28 11:20:27,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55205.8). Total num frames: 9746923520. Throughput: 0: 55160.1. Samples: 237294200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:20:27,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 11:20:27,701][57339] Updated weights for policy 0, policy_version 594908 (0.0033) [2024-04-28 11:20:30,037][57339] Updated weights for policy 0, policy_version 594918 (0.0024) [2024-04-28 11:20:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54613.3, 300 sec: 55150.2). Total num frames: 9747202048. Throughput: 0: 55237.4. Samples: 237629860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:20:32,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 11:20:33,621][57339] Updated weights for policy 0, policy_version 594928 (0.0031) [2024-04-28 11:20:35,341][57319] Signal inference workers to stop experience collection... (3350 times) [2024-04-28 11:20:35,341][57319] Signal inference workers to resume experience collection... 
(3350 times) [2024-04-28 11:20:35,355][57339] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-04-28 11:20:35,355][57339] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-04-28 11:20:36,001][57339] Updated weights for policy 0, policy_version 594938 (0.0033) [2024-04-28 11:20:37,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9747496960. Throughput: 0: 55244.7. Samples: 237792540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:20:37,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:20:39,493][57339] Updated weights for policy 0, policy_version 594948 (0.0028) [2024-04-28 11:20:42,026][57339] Updated weights for policy 0, policy_version 594958 (0.0027) [2024-04-28 11:20:42,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 9747791872. Throughput: 0: 55258.3. Samples: 238124060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:20:42,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:20:45,450][57339] Updated weights for policy 0, policy_version 594968 (0.0029) [2024-04-28 11:20:47,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9748070400. Throughput: 0: 55389.8. Samples: 238461640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:20:47,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 11:20:47,739][57339] Updated weights for policy 0, policy_version 594978 (0.0028) [2024-04-28 11:20:51,319][57339] Updated weights for policy 0, policy_version 594988 (0.0027) [2024-04-28 11:20:52,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9748332544. Throughput: 0: 55483.6. Samples: 238631680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:20:52,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 11:20:53,540][57339] Updated weights for policy 0, policy_version 594998 (0.0029) [2024-04-28 11:20:57,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9748594688. Throughput: 0: 55490.6. Samples: 238965740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:20:57,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 11:20:57,238][57339] Updated weights for policy 0, policy_version 595008 (0.0030) [2024-04-28 11:20:59,624][57339] Updated weights for policy 0, policy_version 595018 (0.0028) [2024-04-28 11:21:02,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55159.4, 300 sec: 55205.8). Total num frames: 9748856832. Throughput: 0: 55431.4. Samples: 239298220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:02,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 11:21:03,312][57339] Updated weights for policy 0, policy_version 595028 (0.0033) [2024-04-28 11:21:05,600][57339] Updated weights for policy 0, policy_version 595038 (0.0034) [2024-04-28 11:21:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54886.4, 300 sec: 55205.7). Total num frames: 9749151744. Throughput: 0: 55426.2. Samples: 239455180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:07,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 11:21:09,139][57339] Updated weights for policy 0, policy_version 595048 (0.0026) [2024-04-28 11:21:11,408][57339] Updated weights for policy 0, policy_version 595058 (0.0029) [2024-04-28 11:21:12,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9749446656. Throughput: 0: 55470.2. Samples: 239790360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:12,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 11:21:14,922][57339] Updated weights for policy 0, policy_version 595068 (0.0032) [2024-04-28 11:21:17,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 9749741568. Throughput: 0: 55426.3. Samples: 240124040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:17,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 11:21:17,200][57339] Updated weights for policy 0, policy_version 595078 (0.0024) [2024-04-28 11:21:20,881][57339] Updated weights for policy 0, policy_version 595088 (0.0028) [2024-04-28 11:21:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 9750003712. Throughput: 0: 55775.6. Samples: 240302440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:22,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 11:21:23,112][57339] Updated weights for policy 0, policy_version 595098 (0.0031) [2024-04-28 11:21:26,845][57339] Updated weights for policy 0, policy_version 595108 (0.0027) [2024-04-28 11:21:26,964][57319] Signal inference workers to stop experience collection... (3400 times) [2024-04-28 11:21:26,965][57319] Signal inference workers to resume experience collection... (3400 times) [2024-04-28 11:21:26,995][57339] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-04-28 11:21:26,996][57339] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-04-28 11:21:27,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55978.5, 300 sec: 55372.4). Total num frames: 9750282240. Throughput: 0: 55873.6. Samples: 240638380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:27,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 11:21:27,291][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000595111_9750298624.pth... 
[2024-04-28 11:21:27,334][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000594299_9736994816.pth [2024-04-28 11:21:29,094][57339] Updated weights for policy 0, policy_version 595118 (0.0034) [2024-04-28 11:21:32,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.6, 300 sec: 55316.8). Total num frames: 9750528000. Throughput: 0: 55670.8. Samples: 240966820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:32,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 11:21:32,648][57339] Updated weights for policy 0, policy_version 595128 (0.0033) [2024-04-28 11:21:34,970][57339] Updated weights for policy 0, policy_version 595138 (0.0025) [2024-04-28 11:21:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.4, 300 sec: 55205.7). Total num frames: 9750806528. Throughput: 0: 55294.1. Samples: 241119920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:37,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 11:21:38,491][57339] Updated weights for policy 0, policy_version 595148 (0.0028) [2024-04-28 11:21:40,820][57339] Updated weights for policy 0, policy_version 595158 (0.0027) [2024-04-28 11:21:42,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.4, 300 sec: 55261.3). Total num frames: 9751085056. Throughput: 0: 55260.5. Samples: 241452460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:42,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 11:21:44,493][57339] Updated weights for policy 0, policy_version 595168 (0.0030) [2024-04-28 11:21:46,983][57339] Updated weights for policy 0, policy_version 595178 (0.0027) [2024-04-28 11:21:47,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9751396352. Throughput: 0: 55226.0. Samples: 241783380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:47,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 11:21:50,440][57339] Updated weights for policy 0, policy_version 595188 (0.0032) [2024-04-28 11:21:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.4, 300 sec: 55316.8). Total num frames: 9751658496. Throughput: 0: 55580.8. Samples: 241956320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 11:21:52,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 11:21:52,972][57339] Updated weights for policy 0, policy_version 595198 (0.0030) [2024-04-28 11:21:56,300][57339] Updated weights for policy 0, policy_version 595208 (0.0033) [2024-04-28 11:21:57,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.6, 300 sec: 55372.3). Total num frames: 9751937024. Throughput: 0: 55418.5. Samples: 242284200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:21:57,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 11:21:58,762][57339] Updated weights for policy 0, policy_version 595218 (0.0026) [2024-04-28 11:22:02,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55316.8). Total num frames: 9752199168. Throughput: 0: 55353.7. Samples: 242614960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:02,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 11:22:02,189][57339] Updated weights for policy 0, policy_version 595228 (0.0034) [2024-04-28 11:22:04,582][57339] Updated weights for policy 0, policy_version 595238 (0.0030) [2024-04-28 11:22:07,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55316.8). Total num frames: 9752477696. Throughput: 0: 54825.8. Samples: 242769600. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:07,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 11:22:08,020][57339] Updated weights for policy 0, policy_version 595248 (0.0032) [2024-04-28 11:22:10,646][57339] Updated weights for policy 0, policy_version 595258 (0.0030) [2024-04-28 11:22:12,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54613.4, 300 sec: 55150.2). Total num frames: 9752723456. Throughput: 0: 54792.2. Samples: 243104020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:12,170][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 11:22:14,080][57339] Updated weights for policy 0, policy_version 595268 (0.0033) [2024-04-28 11:22:16,740][57339] Updated weights for policy 0, policy_version 595278 (0.0033) [2024-04-28 11:22:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54886.3, 300 sec: 55261.3). Total num frames: 9753034752. Throughput: 0: 54877.2. Samples: 243436300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:17,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 11:22:19,884][57339] Updated weights for policy 0, policy_version 595288 (0.0036) [2024-04-28 11:22:22,169][57108] Fps is (10 sec: 60620.1, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 9753329664. Throughput: 0: 55252.5. Samples: 243606280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:22,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 11:22:22,558][57339] Updated weights for policy 0, policy_version 595298 (0.0031) [2024-04-28 11:22:25,856][57339] Updated weights for policy 0, policy_version 595308 (0.0027) [2024-04-28 11:22:27,169][57108] Fps is (10 sec: 54064.9, 60 sec: 54886.1, 300 sec: 55261.2). Total num frames: 9753575424. Throughput: 0: 55219.9. Samples: 243937380. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:27,170][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 11:22:28,687][57339] Updated weights for policy 0, policy_version 595318 (0.0029) [2024-04-28 11:22:31,626][57319] Signal inference workers to stop experience collection... (3450 times) [2024-04-28 11:22:31,626][57319] Signal inference workers to resume experience collection... (3450 times) [2024-04-28 11:22:31,640][57339] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-04-28 11:22:31,640][57339] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-04-28 11:22:31,740][57339] Updated weights for policy 0, policy_version 595328 (0.0034) [2024-04-28 11:22:32,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 9753853952. Throughput: 0: 55200.9. Samples: 244267420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:32,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 11:22:35,034][57339] Updated weights for policy 0, policy_version 595338 (0.0026) [2024-04-28 11:22:37,169][57108] Fps is (10 sec: 55708.3, 60 sec: 55432.6, 300 sec: 55316.9). Total num frames: 9754132480. Throughput: 0: 55075.7. Samples: 244434720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:37,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 11:22:37,657][57339] Updated weights for policy 0, policy_version 595348 (0.0026) [2024-04-28 11:22:41,333][57339] Updated weights for policy 0, policy_version 595358 (0.0034) [2024-04-28 11:22:42,169][57108] Fps is (10 sec: 54065.8, 60 sec: 55159.3, 300 sec: 55205.7). Total num frames: 9754394624. Throughput: 0: 55193.6. Samples: 244767920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:42,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 11:22:43,416][57339] Updated weights for policy 0, policy_version 595368 (0.0028) [2024-04-28 11:22:47,134][57339] Updated weights for policy 0, policy_version 595378 (0.0027) [2024-04-28 11:22:47,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54613.3, 300 sec: 55205.7). Total num frames: 9754673152. Throughput: 0: 55286.6. Samples: 245102860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:47,170][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 11:22:49,561][57339] Updated weights for policy 0, policy_version 595388 (0.0027) [2024-04-28 11:22:52,169][57108] Fps is (10 sec: 57345.3, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9754968064. Throughput: 0: 55414.6. Samples: 245263260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:52,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 11:22:52,958][57339] Updated weights for policy 0, policy_version 595398 (0.0037) [2024-04-28 11:22:55,387][57339] Updated weights for policy 0, policy_version 595408 (0.0029) [2024-04-28 11:22:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55159.5, 300 sec: 55316.8). Total num frames: 9755246592. Throughput: 0: 55386.6. Samples: 245596420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:22:57,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:22:58,876][57339] Updated weights for policy 0, policy_version 595418 (0.0027) [2024-04-28 11:23:01,327][57339] Updated weights for policy 0, policy_version 595428 (0.0028) [2024-04-28 11:23:02,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.3, 300 sec: 55261.3). Total num frames: 9755508736. Throughput: 0: 55385.7. Samples: 245928660. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:23:02,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 11:23:04,724][57339] Updated weights for policy 0, policy_version 595438 (0.0029) [2024-04-28 11:23:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9755803648. Throughput: 0: 55490.7. Samples: 246103360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:23:07,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 11:23:07,272][57339] Updated weights for policy 0, policy_version 595448 (0.0035) [2024-04-28 11:23:10,679][57339] Updated weights for policy 0, policy_version 595458 (0.0028) [2024-04-28 11:23:12,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 55372.4). Total num frames: 9756082176. Throughput: 0: 55464.2. Samples: 246433240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:23:12,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 11:23:13,388][57339] Updated weights for policy 0, policy_version 595468 (0.0026) [2024-04-28 11:23:16,680][57339] Updated weights for policy 0, policy_version 595478 (0.0031) [2024-04-28 11:23:17,169][57108] Fps is (10 sec: 52429.3, 60 sec: 54886.5, 300 sec: 55205.7). Total num frames: 9756327936. Throughput: 0: 55519.6. Samples: 246765800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:23:17,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 11:23:19,178][57339] Updated weights for policy 0, policy_version 595488 (0.0033) [2024-04-28 11:23:22,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54613.4, 300 sec: 55205.8). Total num frames: 9756606464. Throughput: 0: 55311.5. Samples: 246923740. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:23:22,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 11:23:22,620][57339] Updated weights for policy 0, policy_version 595498 (0.0026) [2024-04-28 11:23:25,000][57339] Updated weights for policy 0, policy_version 595508 (0.0028) [2024-04-28 11:23:27,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55706.1, 300 sec: 55316.8). Total num frames: 9756917760. Throughput: 0: 55354.6. Samples: 247258860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:23:27,169][57108] Avg episode reward: [(0, '0.718')] [2024-04-28 11:23:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000595515_9756917760.pth... [2024-04-28 11:23:27,234][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000594704_9743630336.pth [2024-04-28 11:23:28,685][57339] Updated weights for policy 0, policy_version 595518 (0.0026) [2024-04-28 11:23:31,046][57339] Updated weights for policy 0, policy_version 595528 (0.0031) [2024-04-28 11:23:32,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 55316.8). Total num frames: 9757179904. Throughput: 0: 55312.6. Samples: 247591920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:23:32,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 11:23:34,463][57339] Updated weights for policy 0, policy_version 595538 (0.0025) [2024-04-28 11:23:37,037][57339] Updated weights for policy 0, policy_version 595548 (0.0029) [2024-04-28 11:23:37,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9757458432. Throughput: 0: 55513.3. Samples: 247761360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:23:37,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 11:23:38,185][57319] Signal inference workers to stop experience collection... (3500 times) [2024-04-28 11:23:38,185][57319] Signal inference workers to resume experience collection... 
(3500 times) [2024-04-28 11:23:38,215][57339] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-04-28 11:23:38,215][57339] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-04-28 11:23:40,289][57339] Updated weights for policy 0, policy_version 595558 (0.0026) [2024-04-28 11:23:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55979.0, 300 sec: 55427.9). Total num frames: 9757753344. Throughput: 0: 55498.9. Samples: 248093860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:23:42,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:23:42,903][57339] Updated weights for policy 0, policy_version 595568 (0.0034) [2024-04-28 11:23:45,991][57339] Updated weights for policy 0, policy_version 595578 (0.0029) [2024-04-28 11:23:47,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.8, 300 sec: 55316.8). Total num frames: 9758015488. Throughput: 0: 55640.3. Samples: 248432460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:23:47,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:23:48,657][57339] Updated weights for policy 0, policy_version 595588 (0.0032) [2024-04-28 11:23:51,882][57339] Updated weights for policy 0, policy_version 595598 (0.0029) [2024-04-28 11:23:52,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9758294016. Throughput: 0: 55385.8. Samples: 248595720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:23:52,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 11:23:54,594][57339] Updated weights for policy 0, policy_version 595608 (0.0030) [2024-04-28 11:23:57,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9758556160. Throughput: 0: 55446.1. Samples: 248928320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:23:57,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 11:23:57,757][57339] Updated weights for policy 0, policy_version 595618 (0.0033) [2024-04-28 11:24:00,583][57339] Updated weights for policy 0, policy_version 595628 (0.0027) [2024-04-28 11:24:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55316.8). Total num frames: 9758851072. Throughput: 0: 55456.4. Samples: 249261340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:02,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 11:24:03,608][57339] Updated weights for policy 0, policy_version 595638 (0.0033) [2024-04-28 11:24:06,278][57339] Updated weights for policy 0, policy_version 595648 (0.0027) [2024-04-28 11:24:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 55316.8). Total num frames: 9759113216. Throughput: 0: 55685.3. Samples: 249429580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:07,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 11:24:09,446][57339] Updated weights for policy 0, policy_version 595658 (0.0030) [2024-04-28 11:24:12,006][57339] Updated weights for policy 0, policy_version 595668 (0.0026) [2024-04-28 11:24:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 9759424512. Throughput: 0: 55686.6. Samples: 249764760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:12,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 11:24:15,408][57339] Updated weights for policy 0, policy_version 595678 (0.0029) [2024-04-28 11:24:17,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 55483.4). Total num frames: 9759686656. Throughput: 0: 55607.3. Samples: 250094260. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:17,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 11:24:18,374][57339] Updated weights for policy 0, policy_version 595688 (0.0027) [2024-04-28 11:24:21,422][57339] Updated weights for policy 0, policy_version 595698 (0.0030) [2024-04-28 11:24:22,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55372.4). Total num frames: 9759948800. Throughput: 0: 55425.5. Samples: 250255500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:22,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 11:24:24,364][57339] Updated weights for policy 0, policy_version 595708 (0.0034) [2024-04-28 11:24:26,307][57319] Signal inference workers to stop experience collection... (3550 times) [2024-04-28 11:24:26,361][57319] Signal inference workers to resume experience collection... (3550 times) [2024-04-28 11:24:26,362][57339] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-04-28 11:24:26,373][57339] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-04-28 11:24:27,140][57339] Updated weights for policy 0, policy_version 595718 (0.0027) [2024-04-28 11:24:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.4, 300 sec: 55316.8). Total num frames: 9760243712. Throughput: 0: 55499.6. Samples: 250591360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:27,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 11:24:30,172][57339] Updated weights for policy 0, policy_version 595728 (0.0027) [2024-04-28 11:24:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9760505856. Throughput: 0: 55538.1. Samples: 250931680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:32,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 11:24:32,845][57339] Updated weights for policy 0, policy_version 595738 (0.0034) [2024-04-28 11:24:35,934][57339] Updated weights for policy 0, policy_version 595748 (0.0029) [2024-04-28 11:24:37,169][57108] Fps is (10 sec: 54068.5, 60 sec: 55432.7, 300 sec: 55372.4). Total num frames: 9760784384. Throughput: 0: 55445.4. Samples: 251090760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:37,169][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 11:24:38,913][57339] Updated weights for policy 0, policy_version 595758 (0.0026) [2024-04-28 11:24:41,861][57339] Updated weights for policy 0, policy_version 595768 (0.0026) [2024-04-28 11:24:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 9761079296. Throughput: 0: 55511.3. Samples: 251426320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:42,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 11:24:44,918][57339] Updated weights for policy 0, policy_version 595778 (0.0025) [2024-04-28 11:24:47,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55705.4, 300 sec: 55427.9). Total num frames: 9761357824. Throughput: 0: 55480.7. Samples: 251757980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:47,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 11:24:47,784][57339] Updated weights for policy 0, policy_version 595788 (0.0029) [2024-04-28 11:24:50,864][57339] Updated weights for policy 0, policy_version 595798 (0.0035) [2024-04-28 11:24:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 9761636352. Throughput: 0: 55407.6. Samples: 251922920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:52,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 11:24:53,655][57339] Updated weights for policy 0, policy_version 595808 (0.0026) [2024-04-28 11:24:56,716][57339] Updated weights for policy 0, policy_version 595818 (0.0023) [2024-04-28 11:24:57,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 9761898496. Throughput: 0: 55379.0. Samples: 252256820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:24:57,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 11:24:59,578][57339] Updated weights for policy 0, policy_version 595828 (0.0031) [2024-04-28 11:25:02,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9762160640. Throughput: 0: 55416.1. Samples: 252587980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-04-28 11:25:02,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 11:25:02,565][57339] Updated weights for policy 0, policy_version 595838 (0.0023) [2024-04-28 11:25:05,597][57339] Updated weights for policy 0, policy_version 595848 (0.0029) [2024-04-28 11:25:07,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55372.4). Total num frames: 9762455552. Throughput: 0: 55572.9. Samples: 252756280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:07,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 11:25:08,587][57339] Updated weights for policy 0, policy_version 595858 (0.0027) [2024-04-28 11:25:11,556][57339] Updated weights for policy 0, policy_version 595868 (0.0025) [2024-04-28 11:25:12,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55159.4, 300 sec: 55372.3). Total num frames: 9762734080. Throughput: 0: 55455.6. Samples: 253086860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:12,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 11:25:14,383][57339] Updated weights for policy 0, policy_version 595878 (0.0027) [2024-04-28 11:25:17,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 9762996224. Throughput: 0: 55314.2. Samples: 253420820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:17,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 11:25:17,402][57339] Updated weights for policy 0, policy_version 595888 (0.0027) [2024-04-28 11:25:20,439][57339] Updated weights for policy 0, policy_version 595898 (0.0028) [2024-04-28 11:25:22,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 9763291136. Throughput: 0: 55432.8. Samples: 253585240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:22,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 11:25:23,381][57339] Updated weights for policy 0, policy_version 595908 (0.0033) [2024-04-28 11:25:24,697][57319] Signal inference workers to stop experience collection... (3600 times) [2024-04-28 11:25:24,697][57319] Signal inference workers to resume experience collection... (3600 times) [2024-04-28 11:25:24,718][57339] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-04-28 11:25:24,718][57339] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-04-28 11:25:26,622][57339] Updated weights for policy 0, policy_version 595918 (0.0032) [2024-04-28 11:25:27,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 9763553280. Throughput: 0: 55285.6. Samples: 253914180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:27,169][57108] Avg episode reward: [(0, '0.693')] [2024-04-28 11:25:27,211][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000595921_9763569664.pth... 
[2024-04-28 11:25:27,264][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000595111_9750298624.pth [2024-04-28 11:25:29,332][57339] Updated weights for policy 0, policy_version 595928 (0.0030) [2024-04-28 11:25:32,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.5, 300 sec: 55316.9). Total num frames: 9763815424. Throughput: 0: 55278.0. Samples: 254245480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:32,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 11:25:32,502][57339] Updated weights for policy 0, policy_version 595938 (0.0029) [2024-04-28 11:25:35,396][57339] Updated weights for policy 0, policy_version 595948 (0.0024) [2024-04-28 11:25:37,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.3, 300 sec: 55261.3). Total num frames: 9764093952. Throughput: 0: 55231.9. Samples: 254408360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:37,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:25:38,516][57339] Updated weights for policy 0, policy_version 595958 (0.0027) [2024-04-28 11:25:41,185][57339] Updated weights for policy 0, policy_version 595968 (0.0025) [2024-04-28 11:25:42,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55159.5, 300 sec: 55316.8). Total num frames: 9764388864. Throughput: 0: 55249.5. Samples: 254743040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:42,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 11:25:44,448][57339] Updated weights for policy 0, policy_version 595978 (0.0027) [2024-04-28 11:25:47,079][57339] Updated weights for policy 0, policy_version 595988 (0.0027) [2024-04-28 11:25:47,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 9764667392. Throughput: 0: 55292.0. Samples: 255076120. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:47,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 11:25:50,157][57339] Updated weights for policy 0, policy_version 595998 (0.0027) [2024-04-28 11:25:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 9764929536. Throughput: 0: 55147.6. Samples: 255237920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:52,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 11:25:52,939][57339] Updated weights for policy 0, policy_version 596008 (0.0029) [2024-04-28 11:25:56,112][57339] Updated weights for policy 0, policy_version 596018 (0.0033) [2024-04-28 11:25:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 9765224448. Throughput: 0: 55201.5. Samples: 255570920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:25:57,170][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 11:25:58,831][57339] Updated weights for policy 0, policy_version 596028 (0.0029) [2024-04-28 11:26:01,953][57339] Updated weights for policy 0, policy_version 596038 (0.0032) [2024-04-28 11:26:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9765486592. Throughput: 0: 55116.6. Samples: 255901060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:26:02,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 11:26:04,901][57339] Updated weights for policy 0, policy_version 596048 (0.0027) [2024-04-28 11:26:07,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54886.4, 300 sec: 55261.3). Total num frames: 9765748736. Throughput: 0: 55127.2. Samples: 256065960. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:26:07,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 11:26:07,864][57339] Updated weights for policy 0, policy_version 596058 (0.0027) [2024-04-28 11:26:10,701][57339] Updated weights for policy 0, policy_version 596068 (0.0032) [2024-04-28 11:26:12,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55159.6, 300 sec: 55261.3). Total num frames: 9766043648. Throughput: 0: 55176.6. Samples: 256397120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:26:12,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 11:26:13,695][57339] Updated weights for policy 0, policy_version 596078 (0.0032) [2024-04-28 11:26:16,390][57319] Signal inference workers to stop experience collection... (3650 times) [2024-04-28 11:26:16,390][57319] Signal inference workers to resume experience collection... (3650 times) [2024-04-28 11:26:16,402][57339] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-04-28 11:26:16,402][57339] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-04-28 11:26:16,500][57339] Updated weights for policy 0, policy_version 596088 (0.0028) [2024-04-28 11:26:17,169][57108] Fps is (10 sec: 60620.4, 60 sec: 55978.6, 300 sec: 55427.9). Total num frames: 9766354944. Throughput: 0: 55083.4. Samples: 256724240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:26:17,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 11:26:19,723][57339] Updated weights for policy 0, policy_version 596098 (0.0028) [2024-04-28 11:26:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9766600704. Throughput: 0: 55364.9. Samples: 256899780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:26:22,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 11:26:22,477][57339] Updated weights for policy 0, policy_version 596108 (0.0033) [2024-04-28 11:26:25,749][57339] Updated weights for policy 0, policy_version 596118 (0.0027) [2024-04-28 11:26:27,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 9766862848. Throughput: 0: 55283.5. Samples: 257230800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:26:27,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 11:26:28,466][57339] Updated weights for policy 0, policy_version 596128 (0.0034) [2024-04-28 11:26:31,549][57339] Updated weights for policy 0, policy_version 596138 (0.0028) [2024-04-28 11:26:32,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.3, 300 sec: 55372.4). Total num frames: 9767141376. Throughput: 0: 55159.4. Samples: 257558300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-04-28 11:26:32,170][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 11:26:34,399][57339] Updated weights for policy 0, policy_version 596148 (0.0035) [2024-04-28 11:26:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9767419904. Throughput: 0: 55160.8. Samples: 257720160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:26:37,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 11:26:37,530][57339] Updated weights for policy 0, policy_version 596158 (0.0028) [2024-04-28 11:26:40,227][57339] Updated weights for policy 0, policy_version 596168 (0.0027) [2024-04-28 11:26:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.3, 300 sec: 55261.3). Total num frames: 9767698432. Throughput: 0: 55165.7. Samples: 258053380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:26:42,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 11:26:43,430][57339] Updated weights for policy 0, policy_version 596178 (0.0028) [2024-04-28 11:26:46,124][57339] Updated weights for policy 0, policy_version 596188 (0.0029) [2024-04-28 11:26:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55316.9). Total num frames: 9767976960. Throughput: 0: 55260.8. Samples: 258387800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:26:47,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:26:49,449][57339] Updated weights for policy 0, policy_version 596198 (0.0028) [2024-04-28 11:26:52,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 9768271872. Throughput: 0: 55422.3. Samples: 258559960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:26:52,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 11:26:52,171][57339] Updated weights for policy 0, policy_version 596208 (0.0038) [2024-04-28 11:26:55,555][57339] Updated weights for policy 0, policy_version 596218 (0.0032) [2024-04-28 11:26:57,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9768550400. Throughput: 0: 55477.4. Samples: 258893600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:26:57,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 11:26:57,968][57339] Updated weights for policy 0, policy_version 596228 (0.0028) [2024-04-28 11:27:01,304][57339] Updated weights for policy 0, policy_version 596238 (0.0027) [2024-04-28 11:27:02,169][57108] Fps is (10 sec: 52427.6, 60 sec: 55159.3, 300 sec: 55316.8). Total num frames: 9768796160. Throughput: 0: 55613.2. Samples: 259226840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:02,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 11:27:03,851][57339] Updated weights for policy 0, policy_version 596248 (0.0026) [2024-04-28 11:27:07,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9769074688. Throughput: 0: 55143.3. Samples: 259381220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:07,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 11:27:07,236][57339] Updated weights for policy 0, policy_version 596258 (0.0033) [2024-04-28 11:27:09,822][57339] Updated weights for policy 0, policy_version 596268 (0.0038) [2024-04-28 11:27:12,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.3, 300 sec: 55316.8). Total num frames: 9769353216. Throughput: 0: 55262.0. Samples: 259717600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:12,170][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 11:27:13,290][57339] Updated weights for policy 0, policy_version 596278 (0.0028) [2024-04-28 11:27:15,613][57339] Updated weights for policy 0, policy_version 596288 (0.0025) [2024-04-28 11:27:17,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54613.4, 300 sec: 55261.3). Total num frames: 9769631744. Throughput: 0: 55498.8. Samples: 260055740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:17,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 11:27:19,225][57339] Updated weights for policy 0, policy_version 596298 (0.0025) [2024-04-28 11:27:21,439][57339] Updated weights for policy 0, policy_version 596308 (0.0031) [2024-04-28 11:27:22,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 55428.0). Total num frames: 9769926656. Throughput: 0: 55580.4. Samples: 260221280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:22,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 11:27:24,974][57339] Updated weights for policy 0, policy_version 596318 (0.0028) [2024-04-28 11:27:26,649][57319] Signal inference workers to stop experience collection... (3700 times) [2024-04-28 11:27:26,689][57339] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-04-28 11:27:26,715][57319] Signal inference workers to resume experience collection... (3700 times) [2024-04-28 11:27:26,716][57339] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-04-28 11:27:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 9770205184. Throughput: 0: 55544.5. Samples: 260552880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:27,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 11:27:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000596326_9770205184.pth... [2024-04-28 11:27:27,238][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000595515_9756917760.pth [2024-04-28 11:27:27,508][57339] Updated weights for policy 0, policy_version 596328 (0.0038) [2024-04-28 11:27:30,875][57339] Updated weights for policy 0, policy_version 596338 (0.0032) [2024-04-28 11:27:32,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.6, 300 sec: 55316.8). Total num frames: 9770450944. Throughput: 0: 55396.4. Samples: 260880640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:32,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 11:27:33,419][57339] Updated weights for policy 0, policy_version 596348 (0.0027) [2024-04-28 11:27:37,169][57108] Fps is (10 sec: 50790.7, 60 sec: 54886.5, 300 sec: 55316.9). Total num frames: 9770713088. Throughput: 0: 55213.3. Samples: 261044560. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:37,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 11:27:37,190][57339] Updated weights for policy 0, policy_version 596358 (0.0027) [2024-04-28 11:27:39,217][57339] Updated weights for policy 0, policy_version 596368 (0.0028) [2024-04-28 11:27:42,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 9771008000. Throughput: 0: 55115.9. Samples: 261373820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:42,170][57108] Avg episode reward: [(0, '0.750')] [2024-04-28 11:27:43,213][57339] Updated weights for policy 0, policy_version 596378 (0.0026) [2024-04-28 11:27:45,043][57339] Updated weights for policy 0, policy_version 596388 (0.0032) [2024-04-28 11:27:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54886.4, 300 sec: 55261.3). Total num frames: 9771270144. Throughput: 0: 55097.9. Samples: 261706240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:47,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 11:27:48,984][57339] Updated weights for policy 0, policy_version 596398 (0.0032) [2024-04-28 11:27:51,134][57339] Updated weights for policy 0, policy_version 596408 (0.0036) [2024-04-28 11:27:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54886.2, 300 sec: 55316.8). Total num frames: 9771565056. Throughput: 0: 55306.8. Samples: 261870040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:52,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 11:27:55,017][57339] Updated weights for policy 0, policy_version 596418 (0.0035) [2024-04-28 11:27:57,144][57339] Updated weights for policy 0, policy_version 596428 (0.0026) [2024-04-28 11:27:57,169][57108] Fps is (10 sec: 60620.9, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 9771876352. Throughput: 0: 55179.0. Samples: 262200640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:27:57,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 11:28:00,836][57339] Updated weights for policy 0, policy_version 596438 (0.0027) [2024-04-28 11:28:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9772122112. Throughput: 0: 55032.8. Samples: 262532220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:28:02,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 11:28:03,152][57339] Updated weights for policy 0, policy_version 596448 (0.0027) [2024-04-28 11:28:06,597][57339] Updated weights for policy 0, policy_version 596458 (0.0036) [2024-04-28 11:28:07,169][57108] Fps is (10 sec: 50790.0, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9772384256. Throughput: 0: 54965.8. Samples: 262694740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 11:28:07,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 11:28:09,478][57339] Updated weights for policy 0, policy_version 596468 (0.0024) [2024-04-28 11:28:12,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.7, 300 sec: 55372.4). Total num frames: 9772662784. Throughput: 0: 54938.2. Samples: 263025100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:12,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 11:28:12,548][57339] Updated weights for policy 0, policy_version 596478 (0.0027) [2024-04-28 11:28:15,540][57339] Updated weights for policy 0, policy_version 596488 (0.0032) [2024-04-28 11:28:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 9772941312. Throughput: 0: 55065.4. Samples: 263358580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:28:18,454][57339] Updated weights for policy 0, policy_version 596498 (0.0032) [2024-04-28 11:28:21,303][57339] Updated weights for policy 0, policy_version 596508 (0.0025) [2024-04-28 11:28:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.5, 300 sec: 55261.3). Total num frames: 9773219840. Throughput: 0: 55000.4. Samples: 263519580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:22,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 11:28:24,309][57339] Updated weights for policy 0, policy_version 596518 (0.0026) [2024-04-28 11:28:27,033][57339] Updated weights for policy 0, policy_version 596528 (0.0029) [2024-04-28 11:28:27,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55159.3, 300 sec: 55372.3). Total num frames: 9773514752. Throughput: 0: 55144.8. Samples: 263855340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:27,170][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 11:28:30,302][57339] Updated weights for policy 0, policy_version 596538 (0.0025) [2024-04-28 11:28:32,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55978.7, 300 sec: 55427.9). Total num frames: 9773809664. Throughput: 0: 55088.5. Samples: 264185220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:32,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 11:28:32,978][57339] Updated weights for policy 0, policy_version 596548 (0.0034) [2024-04-28 11:28:36,227][57339] Updated weights for policy 0, policy_version 596558 (0.0035) [2024-04-28 11:28:37,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55978.6, 300 sec: 55316.8). Total num frames: 9774071808. Throughput: 0: 55294.9. Samples: 264358300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:37,178][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 11:28:39,158][57339] Updated weights for policy 0, policy_version 596568 (0.0029) [2024-04-28 11:28:41,587][57319] Signal inference workers to stop experience collection... (3750 times) [2024-04-28 11:28:41,588][57319] Signal inference workers to resume experience collection... (3750 times) [2024-04-28 11:28:41,611][57339] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-04-28 11:28:41,611][57339] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-04-28 11:28:42,125][57339] Updated weights for policy 0, policy_version 596578 (0.0031) [2024-04-28 11:28:42,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9774333952. Throughput: 0: 55277.2. Samples: 264688120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:42,178][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 11:28:45,003][57339] Updated weights for policy 0, policy_version 596588 (0.0027) [2024-04-28 11:28:47,169][57108] Fps is (10 sec: 50789.5, 60 sec: 55159.3, 300 sec: 55205.7). Total num frames: 9774579712. Throughput: 0: 55317.8. Samples: 265021520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:47,178][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 11:28:48,126][57339] Updated weights for policy 0, policy_version 596598 (0.0031) [2024-04-28 11:28:50,951][57339] Updated weights for policy 0, policy_version 596608 (0.0030) [2024-04-28 11:28:52,169][57108] Fps is (10 sec: 52429.4, 60 sec: 54886.6, 300 sec: 55261.3). Total num frames: 9774858240. Throughput: 0: 55128.1. Samples: 265175500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:52,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 11:28:54,051][57339] Updated weights for policy 0, policy_version 596618 (0.0027) [2024-04-28 11:28:56,996][57339] Updated weights for policy 0, policy_version 596628 (0.0030) [2024-04-28 11:28:57,169][57108] Fps is (10 sec: 58982.9, 60 sec: 54886.3, 300 sec: 55316.8). Total num frames: 9775169536. Throughput: 0: 55144.4. Samples: 265506600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:28:57,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 11:28:59,848][57339] Updated weights for policy 0, policy_version 596638 (0.0027) [2024-04-28 11:29:02,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55432.7, 300 sec: 55372.4). Total num frames: 9775448064. Throughput: 0: 55105.3. Samples: 265838320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:29:02,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 11:29:02,911][57339] Updated weights for policy 0, policy_version 596648 (0.0033) [2024-04-28 11:29:05,710][57339] Updated weights for policy 0, policy_version 596658 (0.0036) [2024-04-28 11:29:07,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55316.8). Total num frames: 9775742976. Throughput: 0: 55441.7. Samples: 266014460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:29:07,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 11:29:08,665][57339] Updated weights for policy 0, policy_version 596668 (0.0028) [2024-04-28 11:29:11,647][57339] Updated weights for policy 0, policy_version 596678 (0.0030) [2024-04-28 11:29:12,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55205.8). Total num frames: 9775972352. Throughput: 0: 55275.3. Samples: 266342720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:29:12,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 11:29:14,562][57339] Updated weights for policy 0, policy_version 596688 (0.0029) [2024-04-28 11:29:17,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9776250880. Throughput: 0: 55350.1. Samples: 266675980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:29:17,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 11:29:17,646][57339] Updated weights for policy 0, policy_version 596698 (0.0027) [2024-04-28 11:29:20,587][57339] Updated weights for policy 0, policy_version 596708 (0.0024) [2024-04-28 11:29:22,169][57108] Fps is (10 sec: 54066.7, 60 sec: 54886.3, 300 sec: 55150.2). Total num frames: 9776513024. Throughput: 0: 55075.4. Samples: 266836700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:29:22,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 11:29:23,776][57339] Updated weights for policy 0, policy_version 596718 (0.0023) [2024-04-28 11:29:26,794][57339] Updated weights for policy 0, policy_version 596728 (0.0026) [2024-04-28 11:29:27,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54886.5, 300 sec: 55261.3). Total num frames: 9776807936. Throughput: 0: 55053.4. Samples: 267165520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:29:27,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 11:29:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000596729_9776807936.pth... [2024-04-28 11:29:27,229][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000595921_9763569664.pth [2024-04-28 11:29:29,758][57339] Updated weights for policy 0, policy_version 596738 (0.0034) [2024-04-28 11:29:32,169][57108] Fps is (10 sec: 58983.1, 60 sec: 54886.4, 300 sec: 55316.8). Total num frames: 9777102848. Throughput: 0: 54986.5. Samples: 267495900. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:29:32,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 11:29:32,560][57339] Updated weights for policy 0, policy_version 596748 (0.0033) [2024-04-28 11:29:35,795][57339] Updated weights for policy 0, policy_version 596758 (0.0037) [2024-04-28 11:29:35,998][57319] Signal inference workers to stop experience collection... (3800 times) [2024-04-28 11:29:35,998][57319] Signal inference workers to resume experience collection... (3800 times) [2024-04-28 11:29:36,018][57339] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-04-28 11:29:36,018][57339] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-04-28 11:29:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54886.3, 300 sec: 55205.7). Total num frames: 9777364992. Throughput: 0: 55326.6. Samples: 267665200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:29:37,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 11:29:38,501][57339] Updated weights for policy 0, policy_version 596768 (0.0028) [2024-04-28 11:29:41,575][57339] Updated weights for policy 0, policy_version 596778 (0.0036) [2024-04-28 11:29:42,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55261.3). Total num frames: 9777659904. Throughput: 0: 55374.2. Samples: 267998440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-04-28 11:29:42,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:29:44,481][57339] Updated weights for policy 0, policy_version 596788 (0.0030) [2024-04-28 11:29:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55150.2). Total num frames: 9777905664. Throughput: 0: 55283.1. Samples: 268326060. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:29:47,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 11:29:47,659][57339] Updated weights for policy 0, policy_version 596798 (0.0027) [2024-04-28 11:29:50,556][57339] Updated weights for policy 0, policy_version 596808 (0.0033) [2024-04-28 11:29:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55261.3). Total num frames: 9778200576. Throughput: 0: 55082.7. Samples: 268493180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:29:52,169][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 11:29:53,620][57339] Updated weights for policy 0, policy_version 596818 (0.0028) [2024-04-28 11:29:56,334][57339] Updated weights for policy 0, policy_version 596828 (0.0027) [2024-04-28 11:29:57,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54613.4, 300 sec: 55205.7). Total num frames: 9778446336. Throughput: 0: 55111.9. Samples: 268822760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:29:57,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 11:29:59,410][57339] Updated weights for policy 0, policy_version 596838 (0.0027) [2024-04-28 11:30:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54886.4, 300 sec: 55205.7). Total num frames: 9778741248. Throughput: 0: 55114.4. Samples: 269156120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:02,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 11:30:02,415][57339] Updated weights for policy 0, policy_version 596848 (0.0025) [2024-04-28 11:30:05,161][57339] Updated weights for policy 0, policy_version 596858 (0.0030) [2024-04-28 11:30:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54613.3, 300 sec: 55205.8). Total num frames: 9779019776. Throughput: 0: 55203.6. Samples: 269320860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:07,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 11:30:08,391][57339] Updated weights for policy 0, policy_version 596868 (0.0029) [2024-04-28 11:30:11,112][57339] Updated weights for policy 0, policy_version 596878 (0.0039) [2024-04-28 11:30:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 9779314688. Throughput: 0: 55251.1. Samples: 269651820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:12,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 11:30:14,359][57339] Updated weights for policy 0, policy_version 596888 (0.0032) [2024-04-28 11:30:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9779560448. Throughput: 0: 55263.0. Samples: 269982740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:17,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 11:30:17,178][57339] Updated weights for policy 0, policy_version 596898 (0.0025) [2024-04-28 11:30:20,156][57339] Updated weights for policy 0, policy_version 596908 (0.0028) [2024-04-28 11:30:22,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55205.8). Total num frames: 9779838976. Throughput: 0: 55192.0. Samples: 270148840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:22,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 11:30:23,202][57339] Updated weights for policy 0, policy_version 596918 (0.0026) [2024-04-28 11:30:25,905][57339] Updated weights for policy 0, policy_version 596928 (0.0029) [2024-04-28 11:30:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9780133888. Throughput: 0: 55188.4. Samples: 270481920. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:27,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 11:30:29,004][57339] Updated weights for policy 0, policy_version 596938 (0.0040) [2024-04-28 11:30:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54613.3, 300 sec: 55205.8). Total num frames: 9780379648. Throughput: 0: 55235.6. Samples: 270811660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:32,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 11:30:32,241][57339] Updated weights for policy 0, policy_version 596948 (0.0035) [2024-04-28 11:30:34,925][57339] Updated weights for policy 0, policy_version 596958 (0.0033) [2024-04-28 11:30:37,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55205.7). Total num frames: 9780674560. Throughput: 0: 55235.0. Samples: 270978760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:37,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 11:30:38,016][57339] Updated weights for policy 0, policy_version 596968 (0.0026) [2024-04-28 11:30:40,807][57339] Updated weights for policy 0, policy_version 596978 (0.0026) [2024-04-28 11:30:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54886.4, 300 sec: 55205.8). Total num frames: 9780953088. Throughput: 0: 55273.8. Samples: 271310080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:42,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 11:30:43,881][57339] Updated weights for policy 0, policy_version 596988 (0.0026) [2024-04-28 11:30:46,668][57319] Signal inference workers to stop experience collection... (3850 times) [2024-04-28 11:30:46,670][57319] Signal inference workers to resume experience collection... 
(3850 times) [2024-04-28 11:30:46,697][57339] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-04-28 11:30:46,697][57339] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-04-28 11:30:46,782][57339] Updated weights for policy 0, policy_version 596998 (0.0028) [2024-04-28 11:30:47,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 9781248000. Throughput: 0: 55221.8. Samples: 271641100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:47,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 11:30:49,819][57339] Updated weights for policy 0, policy_version 597008 (0.0026) [2024-04-28 11:30:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.4, 300 sec: 55205.7). Total num frames: 9781510144. Throughput: 0: 55313.7. Samples: 271809980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:52,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 11:30:52,579][57339] Updated weights for policy 0, policy_version 597018 (0.0034) [2024-04-28 11:30:55,746][57339] Updated weights for policy 0, policy_version 597028 (0.0028) [2024-04-28 11:30:57,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55261.3). Total num frames: 9781788672. Throughput: 0: 55243.0. Samples: 272137760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:30:57,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 11:30:58,536][57339] Updated weights for policy 0, policy_version 597038 (0.0027) [2024-04-28 11:31:01,682][57339] Updated weights for policy 0, policy_version 597048 (0.0027) [2024-04-28 11:31:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9782050816. Throughput: 0: 55309.3. Samples: 272471660. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:31:02,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 11:31:04,454][57339] Updated weights for policy 0, policy_version 597058 (0.0034) [2024-04-28 11:31:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 55205.7). Total num frames: 9782329344. Throughput: 0: 55160.3. Samples: 272631060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:31:07,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 11:31:07,563][57339] Updated weights for policy 0, policy_version 597068 (0.0028) [2024-04-28 11:31:10,405][57339] Updated weights for policy 0, policy_version 597078 (0.0024) [2024-04-28 11:31:12,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54886.3, 300 sec: 55094.7). Total num frames: 9782607872. Throughput: 0: 55077.3. Samples: 272960400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-04-28 11:31:12,170][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 11:31:13,596][57339] Updated weights for policy 0, policy_version 597088 (0.0032) [2024-04-28 11:31:16,312][57339] Updated weights for policy 0, policy_version 597098 (0.0026) [2024-04-28 11:31:17,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55432.6, 300 sec: 55205.8). Total num frames: 9782886400. Throughput: 0: 55086.7. Samples: 273290560. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:31:17,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 11:31:19,493][57339] Updated weights for policy 0, policy_version 597108 (0.0031) [2024-04-28 11:31:22,173][57108] Fps is (10 sec: 55682.1, 60 sec: 55428.5, 300 sec: 55260.5). Total num frames: 9783164928. Throughput: 0: 55130.4. Samples: 273459860. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:31:22,174][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 11:31:22,257][57339] Updated weights for policy 0, policy_version 597118 (0.0028) [2024-04-28 11:31:25,495][57339] Updated weights for policy 0, policy_version 597128 (0.0041) [2024-04-28 11:31:27,169][57108] Fps is (10 sec: 54066.4, 60 sec: 54886.4, 300 sec: 55205.8). Total num frames: 9783427072. Throughput: 0: 55142.6. Samples: 273791500. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:31:27,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 11:31:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000597133_9783427072.pth... [2024-04-28 11:31:27,242][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000596326_9770205184.pth [2024-04-28 11:31:28,063][57339] Updated weights for policy 0, policy_version 597138 (0.0031) [2024-04-28 11:31:31,483][57339] Updated weights for policy 0, policy_version 597148 (0.0025) [2024-04-28 11:31:32,169][57108] Fps is (10 sec: 52451.5, 60 sec: 55159.4, 300 sec: 55150.2). Total num frames: 9783689216. Throughput: 0: 55105.8. Samples: 274120860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:31:32,169][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 11:31:34,020][57339] Updated weights for policy 0, policy_version 597158 (0.0032) [2024-04-28 11:31:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55205.7). Total num frames: 9783984128. Throughput: 0: 54950.7. Samples: 274282760. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:31:37,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 11:31:37,211][57339] Updated weights for policy 0, policy_version 597168 (0.0031) [2024-04-28 11:31:40,109][57339] Updated weights for policy 0, policy_version 597178 (0.0031) [2024-04-28 11:31:42,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.3, 300 sec: 55150.2). 
Total num frames: 9784246272. Throughput: 0: 54993.4. Samples: 274612460. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:31:42,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 11:31:43,104][57339] Updated weights for policy 0, policy_version 597188 (0.0028) [2024-04-28 11:31:46,133][57339] Updated weights for policy 0, policy_version 597198 (0.0027) [2024-04-28 11:31:47,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54886.2, 300 sec: 55150.2). Total num frames: 9784541184. Throughput: 0: 54934.9. Samples: 274943740. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:31:47,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 11:31:49,054][57339] Updated weights for policy 0, policy_version 597208 (0.0033) [2024-04-28 11:31:51,980][57339] Updated weights for policy 0, policy_version 597218 (0.0039) [2024-04-28 11:31:52,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55159.6, 300 sec: 55150.2). Total num frames: 9784819712. Throughput: 0: 55139.3. Samples: 275112320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:31:52,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 11:31:55,011][57339] Updated weights for policy 0, policy_version 597228 (0.0028) [2024-04-28 11:31:57,169][57108] Fps is (10 sec: 54067.8, 60 sec: 54886.5, 300 sec: 55205.8). Total num frames: 9785081856. Throughput: 0: 55129.8. Samples: 275441240. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:31:57,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 11:31:57,970][57339] Updated weights for policy 0, policy_version 597238 (0.0035) [2024-04-28 11:32:01,233][57339] Updated weights for policy 0, policy_version 597248 (0.0030) [2024-04-28 11:32:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55205.7). Total num frames: 9785360384. Throughput: 0: 55110.2. Samples: 275770520. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:02,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 11:32:03,950][57339] Updated weights for policy 0, policy_version 597258 (0.0025) [2024-04-28 11:32:05,543][57319] Signal inference workers to stop experience collection... (3900 times) [2024-04-28 11:32:05,592][57339] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-04-28 11:32:05,602][57319] Signal inference workers to resume experience collection... (3900 times) [2024-04-28 11:32:05,610][57339] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-04-28 11:32:07,169][57108] Fps is (10 sec: 54067.7, 60 sec: 54886.5, 300 sec: 55150.3). Total num frames: 9785622528. Throughput: 0: 54991.1. Samples: 275934220. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:07,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:32:07,263][57339] Updated weights for policy 0, policy_version 597268 (0.0032) [2024-04-28 11:32:09,925][57339] Updated weights for policy 0, policy_version 597278 (0.0025) [2024-04-28 11:32:12,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9785901056. Throughput: 0: 54968.5. Samples: 276265080. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 11:32:13,382][57339] Updated weights for policy 0, policy_version 597288 (0.0031) [2024-04-28 11:32:15,761][57339] Updated weights for policy 0, policy_version 597298 (0.0032) [2024-04-28 11:32:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 54886.3, 300 sec: 55094.7). Total num frames: 9786179584. Throughput: 0: 55024.0. Samples: 276596940. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:17,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 11:32:19,385][57339] Updated weights for policy 0, policy_version 597308 (0.0026) [2024-04-28 11:32:21,810][57339] Updated weights for policy 0, policy_version 597318 (0.0031) [2024-04-28 11:32:22,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55163.3, 300 sec: 55150.2). Total num frames: 9786474496. Throughput: 0: 55147.0. Samples: 276764380. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:22,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 11:32:25,280][57339] Updated weights for policy 0, policy_version 597328 (0.0033) [2024-04-28 11:32:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9786720256. Throughput: 0: 55095.2. Samples: 277091740. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:27,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 11:32:27,910][57339] Updated weights for policy 0, policy_version 597338 (0.0034) [2024-04-28 11:32:31,185][57339] Updated weights for policy 0, policy_version 597348 (0.0029) [2024-04-28 11:32:32,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 9787015168. Throughput: 0: 55025.0. Samples: 277419860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:32,169][57108] Avg episode reward: [(0, '0.504')] [2024-04-28 11:32:33,729][57339] Updated weights for policy 0, policy_version 597358 (0.0026) [2024-04-28 11:32:37,145][57339] Updated weights for policy 0, policy_version 597368 (0.0030) [2024-04-28 11:32:37,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9787277312. Throughput: 0: 55013.7. Samples: 277587940. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:37,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 11:32:39,563][57339] Updated weights for policy 0, policy_version 597378 (0.0027) [2024-04-28 11:32:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55261.3). Total num frames: 9787572224. Throughput: 0: 54928.4. Samples: 277913020. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:42,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 11:32:42,953][57339] Updated weights for policy 0, policy_version 597388 (0.0030) [2024-04-28 11:32:45,576][57339] Updated weights for policy 0, policy_version 597398 (0.0031) [2024-04-28 11:32:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9787834368. Throughput: 0: 54975.5. Samples: 278244420. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-04-28 11:32:47,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:32:48,806][57339] Updated weights for policy 0, policy_version 597408 (0.0028) [2024-04-28 11:32:51,440][57339] Updated weights for policy 0, policy_version 597418 (0.0030) [2024-04-28 11:32:52,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.3, 300 sec: 55039.1). Total num frames: 9788112896. Throughput: 0: 55188.7. Samples: 278417720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:32:52,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 11:32:54,631][57339] Updated weights for policy 0, policy_version 597428 (0.0028) [2024-04-28 11:32:56,795][57319] Signal inference workers to stop experience collection... (3950 times) [2024-04-28 11:32:56,844][57339] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-04-28 11:32:56,855][57319] Signal inference workers to resume experience collection... 
(3950 times) [2024-04-28 11:32:56,861][57339] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-04-28 11:32:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55205.8). Total num frames: 9788407808. Throughput: 0: 55169.0. Samples: 278747680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:32:57,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 11:32:57,240][57339] Updated weights for policy 0, policy_version 597438 (0.0024) [2024-04-28 11:33:00,679][57339] Updated weights for policy 0, policy_version 597448 (0.0026) [2024-04-28 11:33:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54886.3, 300 sec: 55150.2). Total num frames: 9788653568. Throughput: 0: 55175.1. Samples: 279079820. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:02,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 11:33:03,380][57339] Updated weights for policy 0, policy_version 597458 (0.0025) [2024-04-28 11:33:06,700][57339] Updated weights for policy 0, policy_version 597468 (0.0029) [2024-04-28 11:33:07,169][57108] Fps is (10 sec: 50789.9, 60 sec: 54886.3, 300 sec: 55094.7). Total num frames: 9788915712. Throughput: 0: 55007.2. Samples: 279239700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:07,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 11:33:09,305][57339] Updated weights for policy 0, policy_version 597478 (0.0026) [2024-04-28 11:33:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9789210624. Throughput: 0: 55073.7. Samples: 279570060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:12,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 11:33:12,455][57339] Updated weights for policy 0, policy_version 597488 (0.0026) [2024-04-28 11:33:15,190][57339] Updated weights for policy 0, policy_version 597498 (0.0035) [2024-04-28 11:33:17,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55432.5, 300 sec: 55205.7). 
Total num frames: 9789505536. Throughput: 0: 55255.5. Samples: 279906360. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:17,170][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 11:33:18,194][57339] Updated weights for policy 0, policy_version 597508 (0.0031) [2024-04-28 11:33:21,050][57339] Updated weights for policy 0, policy_version 597518 (0.0026) [2024-04-28 11:33:22,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9789784064. Throughput: 0: 55411.4. Samples: 280081460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:22,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 11:33:24,109][57339] Updated weights for policy 0, policy_version 597528 (0.0031) [2024-04-28 11:33:26,955][57339] Updated weights for policy 0, policy_version 597538 (0.0027) [2024-04-28 11:33:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55094.6). Total num frames: 9790062592. Throughput: 0: 55627.5. Samples: 280416260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:27,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:33:27,286][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000597539_9790078976.pth... [2024-04-28 11:33:27,335][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000596729_9776807936.pth [2024-04-28 11:33:30,239][57339] Updated weights for policy 0, policy_version 597548 (0.0032) [2024-04-28 11:33:32,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 9790324736. Throughput: 0: 55633.3. Samples: 280747920. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:32,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 11:33:32,876][57339] Updated weights for policy 0, policy_version 597558 (0.0030) [2024-04-28 11:33:35,909][57339] Updated weights for policy 0, policy_version 597568 (0.0027) [2024-04-28 11:33:37,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 9790586880. Throughput: 0: 55310.7. Samples: 280906700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:37,169][57108] Avg episode reward: [(0, '0.708')] [2024-04-28 11:33:38,727][57339] Updated weights for policy 0, policy_version 597578 (0.0029) [2024-04-28 11:33:41,988][57339] Updated weights for policy 0, policy_version 597588 (0.0031) [2024-04-28 11:33:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9790881792. Throughput: 0: 55383.8. Samples: 281239960. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:42,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 11:33:44,559][57339] Updated weights for policy 0, policy_version 597598 (0.0026) [2024-04-28 11:33:47,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.4, 300 sec: 55261.3). Total num frames: 9791160320. Throughput: 0: 55386.6. Samples: 281572220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:47,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:33:47,914][57339] Updated weights for policy 0, policy_version 597608 (0.0040) [2024-04-28 11:33:50,379][57339] Updated weights for policy 0, policy_version 597618 (0.0028) [2024-04-28 11:33:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55205.7). Total num frames: 9791455232. Throughput: 0: 55707.1. Samples: 281746520. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:52,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 11:33:53,715][57339] Updated weights for policy 0, policy_version 597628 (0.0028) [2024-04-28 11:33:56,405][57339] Updated weights for policy 0, policy_version 597638 (0.0026) [2024-04-28 11:33:57,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.3, 300 sec: 55205.7). Total num frames: 9791733760. Throughput: 0: 55761.6. Samples: 282079340. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:33:57,178][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 11:33:58,744][57319] Signal inference workers to stop experience collection... (4000 times) [2024-04-28 11:33:58,763][57339] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-04-28 11:33:58,830][57319] Signal inference workers to resume experience collection... (4000 times) [2024-04-28 11:33:58,830][57339] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-04-28 11:33:59,483][57339] Updated weights for policy 0, policy_version 597648 (0.0026) [2024-04-28 11:34:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55094.7). Total num frames: 9791995904. Throughput: 0: 55585.3. Samples: 282407700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:34:02,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 11:34:02,354][57339] Updated weights for policy 0, policy_version 597658 (0.0026) [2024-04-28 11:34:05,734][57339] Updated weights for policy 0, policy_version 597668 (0.0029) [2024-04-28 11:34:07,169][57108] Fps is (10 sec: 50791.1, 60 sec: 55432.6, 300 sec: 55150.2). Total num frames: 9792241664. Throughput: 0: 55324.5. Samples: 282571060. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:34:07,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 11:34:08,371][57339] Updated weights for policy 0, policy_version 597678 (0.0032) [2024-04-28 11:34:11,800][57339] Updated weights for policy 0, policy_version 597688 (0.0033) [2024-04-28 11:34:12,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9792520192. Throughput: 0: 55319.2. Samples: 282905620. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:34:12,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 11:34:14,165][57339] Updated weights for policy 0, policy_version 597698 (0.0028) [2024-04-28 11:34:17,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9792831488. Throughput: 0: 55292.0. Samples: 283236060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:34:17,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 11:34:17,811][57339] Updated weights for policy 0, policy_version 597708 (0.0038) [2024-04-28 11:34:20,129][57339] Updated weights for policy 0, policy_version 597718 (0.0029) [2024-04-28 11:34:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55159.5, 300 sec: 55205.7). Total num frames: 9793093632. Throughput: 0: 55355.0. Samples: 283397680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-04-28 11:34:22,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 11:34:23,607][57339] Updated weights for policy 0, policy_version 597728 (0.0028) [2024-04-28 11:34:26,067][57339] Updated weights for policy 0, policy_version 597738 (0.0031) [2024-04-28 11:34:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55205.7). Total num frames: 9793388544. Throughput: 0: 55357.9. Samples: 283731060. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:34:27,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 11:34:29,343][57339] Updated weights for policy 0, policy_version 597748 (0.0031) [2024-04-28 11:34:31,870][57339] Updated weights for policy 0, policy_version 597758 (0.0028) [2024-04-28 11:34:32,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55261.3). Total num frames: 9793667072. Throughput: 0: 55293.4. Samples: 284060420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:34:32,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 11:34:35,503][57339] Updated weights for policy 0, policy_version 597768 (0.0026) [2024-04-28 11:34:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55205.8). Total num frames: 9793945600. Throughput: 0: 55131.2. Samples: 284227420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:34:37,169][57108] Avg episode reward: [(0, '0.693')] [2024-04-28 11:34:37,880][57339] Updated weights for policy 0, policy_version 597778 (0.0026) [2024-04-28 11:34:41,448][57339] Updated weights for policy 0, policy_version 597788 (0.0029) [2024-04-28 11:34:42,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55205.8). Total num frames: 9794191360. Throughput: 0: 55127.7. Samples: 284560080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:34:42,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 11:34:43,657][57339] Updated weights for policy 0, policy_version 597798 (0.0031) [2024-04-28 11:34:47,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9794469888. Throughput: 0: 55367.2. Samples: 284899220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:34:47,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 11:34:47,216][57339] Updated weights for policy 0, policy_version 597808 (0.0025) [2024-04-28 11:34:49,590][57339] Updated weights for policy 0, policy_version 597818 (0.0029) [2024-04-28 11:34:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55159.6, 300 sec: 55316.8). Total num frames: 9794764800. Throughput: 0: 55269.0. Samples: 285058160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:34:52,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:34:53,426][57339] Updated weights for policy 0, policy_version 597828 (0.0030) [2024-04-28 11:34:55,389][57319] Signal inference workers to stop experience collection... (4050 times) [2024-04-28 11:34:55,390][57319] Signal inference workers to resume experience collection... (4050 times) [2024-04-28 11:34:55,409][57339] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-04-28 11:34:55,410][57339] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-04-28 11:34:55,514][57339] Updated weights for policy 0, policy_version 597838 (0.0034) [2024-04-28 11:34:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55159.6, 300 sec: 55261.3). Total num frames: 9795043328. Throughput: 0: 55183.6. Samples: 285388880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:34:57,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 11:34:59,285][57339] Updated weights for policy 0, policy_version 597848 (0.0040) [2024-04-28 11:35:01,440][57339] Updated weights for policy 0, policy_version 597858 (0.0029) [2024-04-28 11:35:02,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 9795321856. Throughput: 0: 55347.9. Samples: 285726720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:02,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 11:35:05,122][57339] Updated weights for policy 0, policy_version 597868 (0.0027) [2024-04-28 11:35:07,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55205.7). Total num frames: 9795600384. Throughput: 0: 55690.2. Samples: 285903740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:07,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 11:35:07,446][57339] Updated weights for policy 0, policy_version 597878 (0.0026) [2024-04-28 11:35:10,915][57339] Updated weights for policy 0, policy_version 597888 (0.0034) [2024-04-28 11:35:12,169][57108] Fps is (10 sec: 57345.0, 60 sec: 56251.8, 300 sec: 55372.4). Total num frames: 9795895296. Throughput: 0: 55708.9. Samples: 286237960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:12,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 11:35:13,317][57339] Updated weights for policy 0, policy_version 597898 (0.0032) [2024-04-28 11:35:16,815][57339] Updated weights for policy 0, policy_version 597908 (0.0028) [2024-04-28 11:35:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9796141056. Throughput: 0: 55833.2. Samples: 286572920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:17,170][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 11:35:19,202][57339] Updated weights for policy 0, policy_version 597918 (0.0031) [2024-04-28 11:35:22,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 55205.8). Total num frames: 9796419584. Throughput: 0: 55667.6. Samples: 286732460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:22,169][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 11:35:22,536][57339] Updated weights for policy 0, policy_version 597928 (0.0027) [2024-04-28 11:35:25,217][57339] Updated weights for policy 0, policy_version 597938 (0.0029) [2024-04-28 11:35:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.3, 300 sec: 55316.8). Total num frames: 9796698112. Throughput: 0: 55651.0. Samples: 287064380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:27,170][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 11:35:27,288][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000597944_9796714496.pth... [2024-04-28 11:35:27,334][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000597133_9783427072.pth [2024-04-28 11:35:28,423][57339] Updated weights for policy 0, policy_version 597948 (0.0033) [2024-04-28 11:35:31,024][57339] Updated weights for policy 0, policy_version 597958 (0.0033) [2024-04-28 11:35:32,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9796993024. Throughput: 0: 55593.2. Samples: 287400920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:32,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 11:35:34,327][57339] Updated weights for policy 0, policy_version 597968 (0.0024) [2024-04-28 11:35:36,992][57339] Updated weights for policy 0, policy_version 597978 (0.0038) [2024-04-28 11:35:37,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9797271552. Throughput: 0: 55841.7. Samples: 287571040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:37,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 11:35:40,226][57339] Updated weights for policy 0, policy_version 597988 (0.0027) [2024-04-28 11:35:42,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 55316.8). 
Total num frames: 9797566464. Throughput: 0: 55882.1. Samples: 287903580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:42,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 11:35:42,863][57339] Updated weights for policy 0, policy_version 597998 (0.0033) [2024-04-28 11:35:46,014][57339] Updated weights for policy 0, policy_version 598008 (0.0027) [2024-04-28 11:35:47,072][57319] Signal inference workers to stop experience collection... (4100 times) [2024-04-28 11:35:47,073][57319] Signal inference workers to resume experience collection... (4100 times) [2024-04-28 11:35:47,083][57339] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-04-28 11:35:47,113][57339] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-04-28 11:35:47,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55316.9). Total num frames: 9797828608. Throughput: 0: 55734.0. Samples: 288234740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:47,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 11:35:48,889][57339] Updated weights for policy 0, policy_version 598018 (0.0026) [2024-04-28 11:35:52,000][57339] Updated weights for policy 0, policy_version 598028 (0.0029) [2024-04-28 11:35:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.4, 300 sec: 55261.3). Total num frames: 9798090752. Throughput: 0: 55454.6. Samples: 288399200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:35:52,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:35:54,791][57339] Updated weights for policy 0, policy_version 598038 (0.0028) [2024-04-28 11:35:57,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9798352896. Throughput: 0: 55332.8. Samples: 288727940. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:35:57,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 11:35:57,912][57339] Updated weights for policy 0, policy_version 598048 (0.0025) [2024-04-28 11:36:00,577][57339] Updated weights for policy 0, policy_version 598058 (0.0026) [2024-04-28 11:36:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55316.8). Total num frames: 9798647808. Throughput: 0: 55323.6. Samples: 289062480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:02,170][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 11:36:03,705][57339] Updated weights for policy 0, policy_version 598068 (0.0029) [2024-04-28 11:36:06,534][57339] Updated weights for policy 0, policy_version 598078 (0.0028) [2024-04-28 11:36:07,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55705.7, 300 sec: 55372.4). Total num frames: 9798942720. Throughput: 0: 55524.9. Samples: 289231080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:07,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 11:36:09,518][57339] Updated weights for policy 0, policy_version 598088 (0.0030) [2024-04-28 11:36:12,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55159.2, 300 sec: 55316.8). Total num frames: 9799204864. Throughput: 0: 55484.8. Samples: 289561200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:12,170][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 11:36:12,519][57339] Updated weights for policy 0, policy_version 598098 (0.0030) [2024-04-28 11:36:15,492][57339] Updated weights for policy 0, policy_version 598108 (0.0032) [2024-04-28 11:36:17,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 55373.2). Total num frames: 9799499776. Throughput: 0: 55388.4. Samples: 289893400. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:17,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 11:36:18,428][57339] Updated weights for policy 0, policy_version 598118 (0.0024) [2024-04-28 11:36:21,558][57339] Updated weights for policy 0, policy_version 598128 (0.0029) [2024-04-28 11:36:22,169][57108] Fps is (10 sec: 55706.9, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 9799761920. Throughput: 0: 55467.6. Samples: 290067080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:22,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 11:36:24,237][57339] Updated weights for policy 0, policy_version 598138 (0.0033) [2024-04-28 11:36:27,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 9800040448. Throughput: 0: 55443.6. Samples: 290398540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:27,178][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 11:36:27,589][57339] Updated weights for policy 0, policy_version 598148 (0.0028) [2024-04-28 11:36:30,233][57339] Updated weights for policy 0, policy_version 598158 (0.0031) [2024-04-28 11:36:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55316.9). Total num frames: 9800302592. Throughput: 0: 55509.8. Samples: 290732680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:32,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 11:36:33,281][57339] Updated weights for policy 0, policy_version 598168 (0.0030) [2024-04-28 11:36:36,174][57339] Updated weights for policy 0, policy_version 598178 (0.0025) [2024-04-28 11:36:37,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 9800581120. Throughput: 0: 55284.1. Samples: 290886980. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:37,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 11:36:39,260][57339] Updated weights for policy 0, policy_version 598188 (0.0035) [2024-04-28 11:36:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.5, 300 sec: 55316.9). Total num frames: 9800859648. Throughput: 0: 55434.3. Samples: 291222480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:36:42,197][57339] Updated weights for policy 0, policy_version 598198 (0.0030) [2024-04-28 11:36:45,022][57339] Updated weights for policy 0, policy_version 598208 (0.0035) [2024-04-28 11:36:47,169][57108] Fps is (10 sec: 58981.2, 60 sec: 55705.4, 300 sec: 55427.9). Total num frames: 9801170944. Throughput: 0: 55483.0. Samples: 291559220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:47,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 11:36:47,972][57339] Updated weights for policy 0, policy_version 598218 (0.0025) [2024-04-28 11:36:51,044][57339] Updated weights for policy 0, policy_version 598228 (0.0031) [2024-04-28 11:36:52,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55978.7, 300 sec: 55483.5). Total num frames: 9801449472. Throughput: 0: 55416.4. Samples: 291724820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:52,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 11:36:53,889][57339] Updated weights for policy 0, policy_version 598238 (0.0031) [2024-04-28 11:36:56,177][57319] Signal inference workers to stop experience collection... (4150 times) [2024-04-28 11:36:56,177][57319] Signal inference workers to resume experience collection... 
(4150 times) [2024-04-28 11:36:56,191][57339] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-04-28 11:36:56,192][57339] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-04-28 11:36:56,805][57339] Updated weights for policy 0, policy_version 598248 (0.0030) [2024-04-28 11:36:57,169][57108] Fps is (10 sec: 54068.4, 60 sec: 55978.8, 300 sec: 55427.9). Total num frames: 9801711616. Throughput: 0: 55521.2. Samples: 292059640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:36:57,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 11:36:59,822][57339] Updated weights for policy 0, policy_version 598258 (0.0029) [2024-04-28 11:37:02,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9801973760. Throughput: 0: 55548.6. Samples: 292393080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:37:02,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 11:37:02,917][57339] Updated weights for policy 0, policy_version 598268 (0.0031) [2024-04-28 11:37:05,773][57339] Updated weights for policy 0, policy_version 598278 (0.0027) [2024-04-28 11:37:07,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 9802235904. Throughput: 0: 55299.1. Samples: 292555540. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:37:07,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 11:37:08,689][57339] Updated weights for policy 0, policy_version 598288 (0.0026) [2024-04-28 11:37:11,802][57339] Updated weights for policy 0, policy_version 598298 (0.0034) [2024-04-28 11:37:12,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.7, 300 sec: 55372.4). Total num frames: 9802514432. Throughput: 0: 55335.6. Samples: 292888640. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:37:12,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 11:37:14,500][57339] Updated weights for policy 0, policy_version 598308 (0.0025) [2024-04-28 11:37:17,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 9802809344. Throughput: 0: 55281.7. Samples: 293220360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:37:17,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 11:37:17,670][57339] Updated weights for policy 0, policy_version 598318 (0.0026) [2024-04-28 11:37:20,487][57339] Updated weights for policy 0, policy_version 598328 (0.0030) [2024-04-28 11:37:22,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9803104256. Throughput: 0: 55558.6. Samples: 293387120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:37:22,170][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:37:23,771][57339] Updated weights for policy 0, policy_version 598338 (0.0026) [2024-04-28 11:37:26,348][57339] Updated weights for policy 0, policy_version 598348 (0.0034) [2024-04-28 11:37:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 9803382784. Throughput: 0: 55512.8. Samples: 293720560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 11:37:27,170][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 11:37:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000598351_9803382784.pth... [2024-04-28 11:37:27,229][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000597539_9790078976.pth [2024-04-28 11:37:29,611][57339] Updated weights for policy 0, policy_version 598358 (0.0027) [2024-04-28 11:37:32,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 9803644928. Throughput: 0: 55505.5. Samples: 294056960. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:37:32,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 11:37:32,241][57339] Updated weights for policy 0, policy_version 598368 (0.0028) [2024-04-28 11:37:35,458][57339] Updated weights for policy 0, policy_version 598378 (0.0027) [2024-04-28 11:37:37,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.4, 300 sec: 55372.4). Total num frames: 9803907072. Throughput: 0: 55487.9. Samples: 294221780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:37:37,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 11:37:38,085][57339] Updated weights for policy 0, policy_version 598388 (0.0036) [2024-04-28 11:37:41,427][57339] Updated weights for policy 0, policy_version 598398 (0.0027) [2024-04-28 11:37:42,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9804185600. Throughput: 0: 55423.6. Samples: 294553700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:37:42,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 11:37:44,073][57339] Updated weights for policy 0, policy_version 598408 (0.0027) [2024-04-28 11:37:47,169][57108] Fps is (10 sec: 54067.8, 60 sec: 54613.5, 300 sec: 55372.4). Total num frames: 9804447744. Throughput: 0: 55304.0. Samples: 294881760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:37:47,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 11:37:47,467][57339] Updated weights for policy 0, policy_version 598418 (0.0026) [2024-04-28 11:37:49,945][57339] Updated weights for policy 0, policy_version 598428 (0.0035) [2024-04-28 11:37:52,169][57108] Fps is (10 sec: 55704.7, 60 sec: 54886.3, 300 sec: 55372.3). Total num frames: 9804742656. Throughput: 0: 55218.1. Samples: 295040360. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:37:52,170][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 11:37:53,278][57339] Updated weights for policy 0, policy_version 598438 (0.0023) [2024-04-28 11:37:54,083][57319] Signal inference workers to stop experience collection... (4200 times) [2024-04-28 11:37:54,083][57319] Signal inference workers to resume experience collection... (4200 times) [2024-04-28 11:37:54,117][57339] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-04-28 11:37:54,118][57339] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-04-28 11:37:55,843][57339] Updated weights for policy 0, policy_version 598448 (0.0028) [2024-04-28 11:37:57,169][57108] Fps is (10 sec: 57342.7, 60 sec: 55159.2, 300 sec: 55483.4). Total num frames: 9805021184. Throughput: 0: 55184.6. Samples: 295371960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:37:57,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:37:59,120][57339] Updated weights for policy 0, policy_version 598458 (0.0027) [2024-04-28 11:38:02,131][57339] Updated weights for policy 0, policy_version 598468 (0.0028) [2024-04-28 11:38:02,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9805299712. Throughput: 0: 55177.3. Samples: 295703340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:02,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:38:05,188][57339] Updated weights for policy 0, policy_version 598478 (0.0032) [2024-04-28 11:38:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 9805594624. Throughput: 0: 55323.0. Samples: 295876660. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:07,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 11:38:08,066][57339] Updated weights for policy 0, policy_version 598488 (0.0030) [2024-04-28 11:38:10,952][57339] Updated weights for policy 0, policy_version 598498 (0.0031) [2024-04-28 11:38:12,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 9805840384. Throughput: 0: 55288.9. Samples: 296208560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:12,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 11:38:13,981][57339] Updated weights for policy 0, policy_version 598508 (0.0027) [2024-04-28 11:38:16,901][57339] Updated weights for policy 0, policy_version 598518 (0.0031) [2024-04-28 11:38:17,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 9806118912. Throughput: 0: 55167.6. Samples: 296539500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:17,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 11:38:19,784][57339] Updated weights for policy 0, policy_version 598528 (0.0025) [2024-04-28 11:38:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54613.3, 300 sec: 55316.8). Total num frames: 9806381056. Throughput: 0: 55049.8. Samples: 296699020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:22,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 11:38:22,800][57339] Updated weights for policy 0, policy_version 598538 (0.0037) [2024-04-28 11:38:25,580][57339] Updated weights for policy 0, policy_version 598548 (0.0029) [2024-04-28 11:38:27,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 9806675968. Throughput: 0: 55063.1. Samples: 297031540. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:27,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 11:38:28,768][57339] Updated weights for policy 0, policy_version 598558 (0.0031) [2024-04-28 11:38:31,801][57339] Updated weights for policy 0, policy_version 598568 (0.0033) [2024-04-28 11:38:32,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 9806954496. Throughput: 0: 55150.5. Samples: 297363540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:32,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 11:38:34,615][57339] Updated weights for policy 0, policy_version 598578 (0.0026) [2024-04-28 11:38:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9807233024. Throughput: 0: 55325.9. Samples: 297530020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:37,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 11:38:37,575][57339] Updated weights for policy 0, policy_version 598588 (0.0032) [2024-04-28 11:38:40,563][57339] Updated weights for policy 0, policy_version 598598 (0.0032) [2024-04-28 11:38:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 9807511552. Throughput: 0: 55282.3. Samples: 297859660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:42,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 11:38:43,372][57339] Updated weights for policy 0, policy_version 598608 (0.0023) [2024-04-28 11:38:46,325][57339] Updated weights for policy 0, policy_version 598618 (0.0028) [2024-04-28 11:38:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55316.9). Total num frames: 9807773696. Throughput: 0: 55351.6. Samples: 298194160. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:47,169][57108] Avg episode reward: [(0, '0.501')] [2024-04-28 11:38:49,461][57339] Updated weights for policy 0, policy_version 598628 (0.0026) [2024-04-28 11:38:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9808052224. Throughput: 0: 55130.2. Samples: 298357520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:52,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 11:38:52,520][57339] Updated weights for policy 0, policy_version 598638 (0.0028) [2024-04-28 11:38:55,360][57339] Updated weights for policy 0, policy_version 598648 (0.0026) [2024-04-28 11:38:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.7, 300 sec: 55372.4). Total num frames: 9808330752. Throughput: 0: 55109.8. Samples: 298688500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:38:57,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 11:38:58,330][57339] Updated weights for policy 0, policy_version 598658 (0.0030) [2024-04-28 11:38:59,108][57319] Signal inference workers to stop experience collection... (4250 times) [2024-04-28 11:38:59,138][57339] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-04-28 11:38:59,165][57319] Signal inference workers to resume experience collection... (4250 times) [2024-04-28 11:38:59,170][57339] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-04-28 11:39:01,159][57339] Updated weights for policy 0, policy_version 598668 (0.0028) [2024-04-28 11:39:02,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 9808609280. Throughput: 0: 55180.9. Samples: 299022640. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 11:39:02,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 11:39:04,292][57339] Updated weights for policy 0, policy_version 598678 (0.0030) [2024-04-28 11:39:07,074][57339] Updated weights for policy 0, policy_version 598688 (0.0028) [2024-04-28 11:39:07,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 9808904192. Throughput: 0: 55479.0. Samples: 299195580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:07,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 11:39:10,047][57339] Updated weights for policy 0, policy_version 598698 (0.0025) [2024-04-28 11:39:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9809166336. Throughput: 0: 55437.7. Samples: 299526240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:12,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 11:39:13,037][57339] Updated weights for policy 0, policy_version 598708 (0.0028) [2024-04-28 11:39:15,955][57339] Updated weights for policy 0, policy_version 598718 (0.0029) [2024-04-28 11:39:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 9809477632. Throughput: 0: 55562.2. Samples: 299863840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:17,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 11:39:18,915][57339] Updated weights for policy 0, policy_version 598728 (0.0027) [2024-04-28 11:39:21,902][57339] Updated weights for policy 0, policy_version 598738 (0.0025) [2024-04-28 11:39:22,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55372.3). Total num frames: 9809723392. Throughput: 0: 55512.7. Samples: 300028100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:22,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 11:39:24,688][57339] Updated weights for policy 0, policy_version 598748 (0.0030) [2024-04-28 11:39:27,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55432.3, 300 sec: 55372.3). Total num frames: 9810001920. Throughput: 0: 55633.2. Samples: 300363160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:27,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 11:39:27,285][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000598756_9810018304.pth... [2024-04-28 11:39:27,330][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000597944_9796714496.pth [2024-04-28 11:39:27,674][57339] Updated weights for policy 0, policy_version 598758 (0.0030) [2024-04-28 11:39:30,636][57339] Updated weights for policy 0, policy_version 598768 (0.0033) [2024-04-28 11:39:32,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9810264064. Throughput: 0: 55508.7. Samples: 300692060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:32,170][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 11:39:33,504][57339] Updated weights for policy 0, policy_version 598778 (0.0034) [2024-04-28 11:39:36,547][57339] Updated weights for policy 0, policy_version 598788 (0.0027) [2024-04-28 11:39:37,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 9810558976. Throughput: 0: 55639.3. Samples: 300861280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:37,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 11:39:39,422][57339] Updated weights for policy 0, policy_version 598798 (0.0027) [2024-04-28 11:39:42,169][57108] Fps is (10 sec: 58983.6, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 9810853888. Throughput: 0: 55731.6. Samples: 301196420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:42,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 11:39:42,496][57339] Updated weights for policy 0, policy_version 598808 (0.0035) [2024-04-28 11:39:45,283][57339] Updated weights for policy 0, policy_version 598818 (0.0031) [2024-04-28 11:39:47,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 9811116032. Throughput: 0: 55795.9. Samples: 301533460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:47,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 11:39:48,378][57339] Updated weights for policy 0, policy_version 598828 (0.0030) [2024-04-28 11:39:51,224][57339] Updated weights for policy 0, policy_version 598838 (0.0028) [2024-04-28 11:39:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.8, 300 sec: 55483.4). Total num frames: 9811410944. Throughput: 0: 55669.5. Samples: 301700700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:52,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:39:54,345][57339] Updated weights for policy 0, policy_version 598848 (0.0028) [2024-04-28 11:39:56,983][57339] Updated weights for policy 0, policy_version 598858 (0.0025) [2024-04-28 11:39:57,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55483.5). Total num frames: 9811689472. Throughput: 0: 55748.8. Samples: 302034940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:39:57,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 11:40:00,284][57339] Updated weights for policy 0, policy_version 598868 (0.0028) [2024-04-28 11:40:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 9811951616. Throughput: 0: 55720.6. Samples: 302371260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:40:02,169][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 11:40:02,992][57339] Updated weights for policy 0, policy_version 598878 (0.0031) [2024-04-28 11:40:05,673][57319] Signal inference workers to stop experience collection... (4300 times) [2024-04-28 11:40:05,712][57339] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-04-28 11:40:05,760][57319] Signal inference workers to resume experience collection... (4300 times) [2024-04-28 11:40:05,761][57339] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-04-28 11:40:06,064][57339] Updated weights for policy 0, policy_version 598888 (0.0030) [2024-04-28 11:40:07,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.6, 300 sec: 55316.8). Total num frames: 9812213760. Throughput: 0: 55549.5. Samples: 302527820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:40:07,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 11:40:09,072][57339] Updated weights for policy 0, policy_version 598898 (0.0025) [2024-04-28 11:40:11,899][57339] Updated weights for policy 0, policy_version 598908 (0.0035) [2024-04-28 11:40:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 9812508672. Throughput: 0: 55423.4. Samples: 302857200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:40:12,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 11:40:14,964][57339] Updated weights for policy 0, policy_version 598918 (0.0031) [2024-04-28 11:40:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 9812770816. Throughput: 0: 55614.9. Samples: 303194720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:40:17,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 11:40:17,821][57339] Updated weights for policy 0, policy_version 598928 (0.0030) [2024-04-28 11:40:20,767][57339] Updated weights for policy 0, policy_version 598938 (0.0032) [2024-04-28 11:40:22,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 9813049344. Throughput: 0: 55658.0. Samples: 303365900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:40:22,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 11:40:23,777][57339] Updated weights for policy 0, policy_version 598948 (0.0028) [2024-04-28 11:40:26,507][57339] Updated weights for policy 0, policy_version 598958 (0.0029) [2024-04-28 11:40:27,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.8, 300 sec: 55427.9). Total num frames: 9813344256. Throughput: 0: 55616.8. Samples: 303699180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:40:27,169][57108] Avg episode reward: [(0, '0.498')] [2024-04-28 11:40:29,770][57339] Updated weights for policy 0, policy_version 598968 (0.0028) [2024-04-28 11:40:32,169][57108] Fps is (10 sec: 58983.1, 60 sec: 56251.8, 300 sec: 55483.5). Total num frames: 9813639168. Throughput: 0: 55439.1. Samples: 304028220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:40:32,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 11:40:32,407][57339] Updated weights for policy 0, policy_version 598978 (0.0027) [2024-04-28 11:40:35,504][57339] Updated weights for policy 0, policy_version 598988 (0.0028) [2024-04-28 11:40:37,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9813884928. Throughput: 0: 55620.9. Samples: 304203640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 11:40:37,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 11:40:38,367][57339] Updated weights for policy 0, policy_version 598998 (0.0030) [2024-04-28 11:40:41,400][57339] Updated weights for policy 0, policy_version 599008 (0.0034) [2024-04-28 11:40:42,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 9814163456. Throughput: 0: 55611.2. Samples: 304537440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:40:42,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 11:40:44,173][57339] Updated weights for policy 0, policy_version 599018 (0.0024) [2024-04-28 11:40:47,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 9814458368. Throughput: 0: 55522.5. Samples: 304869780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:40:47,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 11:40:47,377][57339] Updated weights for policy 0, policy_version 599028 (0.0028) [2024-04-28 11:40:50,016][57339] Updated weights for policy 0, policy_version 599038 (0.0035) [2024-04-28 11:40:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 9814720512. Throughput: 0: 55556.4. Samples: 305027860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:40:52,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 11:40:53,110][57339] Updated weights for policy 0, policy_version 599048 (0.0030) [2024-04-28 11:40:56,282][57339] Updated weights for policy 0, policy_version 599058 (0.0032) [2024-04-28 11:40:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 9815015424. Throughput: 0: 55662.5. Samples: 305362020. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:40:57,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 11:40:58,973][57339] Updated weights for policy 0, policy_version 599068 (0.0026) [2024-04-28 11:41:01,900][57319] Signal inference workers to stop experience collection... (4350 times) [2024-04-28 11:41:01,926][57339] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-04-28 11:41:01,986][57319] Signal inference workers to resume experience collection... (4350 times) [2024-04-28 11:41:01,986][57339] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-04-28 11:41:02,104][57339] Updated weights for policy 0, policy_version 599078 (0.0029) [2024-04-28 11:41:02,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 9815293952. Throughput: 0: 55579.0. Samples: 305695780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:02,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 11:41:04,902][57339] Updated weights for policy 0, policy_version 599088 (0.0029) [2024-04-28 11:41:07,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 55428.0). Total num frames: 9815556096. Throughput: 0: 55458.4. Samples: 305861520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:07,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 11:41:07,865][57339] Updated weights for policy 0, policy_version 599098 (0.0030) [2024-04-28 11:41:10,825][57339] Updated weights for policy 0, policy_version 599108 (0.0024) [2024-04-28 11:41:12,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.5, 300 sec: 55316.9). Total num frames: 9815818240. Throughput: 0: 55402.8. Samples: 306192300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:12,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 11:41:13,736][57339] Updated weights for policy 0, policy_version 599118 (0.0031) [2024-04-28 11:41:16,857][57339] Updated weights for policy 0, policy_version 599128 (0.0033) [2024-04-28 11:41:17,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 9816113152. Throughput: 0: 55459.9. Samples: 306523920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:17,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 11:41:19,757][57339] Updated weights for policy 0, policy_version 599138 (0.0028) [2024-04-28 11:41:22,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.8, 300 sec: 55483.5). Total num frames: 9816408064. Throughput: 0: 55253.8. Samples: 306690060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:22,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 11:41:22,785][57339] Updated weights for policy 0, policy_version 599148 (0.0033) [2024-04-28 11:41:25,541][57339] Updated weights for policy 0, policy_version 599158 (0.0024) [2024-04-28 11:41:27,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 9816670208. Throughput: 0: 55177.3. Samples: 307020420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:27,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:41:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000599162_9816670208.pth... [2024-04-28 11:41:27,238][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000598351_9803382784.pth [2024-04-28 11:41:28,582][57339] Updated weights for policy 0, policy_version 599168 (0.0025) [2024-04-28 11:41:31,359][57339] Updated weights for policy 0, policy_version 599178 (0.0028) [2024-04-28 11:41:32,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 55483.4). 
Total num frames: 9816948736. Throughput: 0: 55201.8. Samples: 307353860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:32,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 11:41:34,376][57339] Updated weights for policy 0, policy_version 599188 (0.0029) [2024-04-28 11:41:37,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 9817243648. Throughput: 0: 55465.0. Samples: 307523780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:37,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 11:41:37,211][57339] Updated weights for policy 0, policy_version 599198 (0.0027) [2024-04-28 11:41:40,297][57339] Updated weights for policy 0, policy_version 599208 (0.0032) [2024-04-28 11:41:42,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55316.9). Total num frames: 9817489408. Throughput: 0: 55440.1. Samples: 307856820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:42,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 11:41:43,178][57339] Updated weights for policy 0, policy_version 599218 (0.0027) [2024-04-28 11:41:46,196][57339] Updated weights for policy 0, policy_version 599228 (0.0034) [2024-04-28 11:41:47,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.6, 300 sec: 55316.8). Total num frames: 9817767936. Throughput: 0: 55521.4. Samples: 308194240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:47,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 11:41:49,081][57339] Updated weights for policy 0, policy_version 599238 (0.0026) [2024-04-28 11:41:50,293][57319] Signal inference workers to stop experience collection... (4400 times) [2024-04-28 11:41:50,293][57319] Signal inference workers to resume experience collection... 
(4400 times) [2024-04-28 11:41:50,303][57339] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-04-28 11:41:50,313][57339] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-04-28 11:41:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 9818062848. Throughput: 0: 55587.5. Samples: 308362960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:52,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:41:52,182][57339] Updated weights for policy 0, policy_version 599248 (0.0028) [2024-04-28 11:41:55,006][57339] Updated weights for policy 0, policy_version 599258 (0.0031) [2024-04-28 11:41:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 9818341376. Throughput: 0: 55642.6. Samples: 308696220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:41:57,169][57108] Avg episode reward: [(0, '0.496')] [2024-04-28 11:41:58,361][57339] Updated weights for policy 0, policy_version 599268 (0.0028) [2024-04-28 11:42:00,924][57339] Updated weights for policy 0, policy_version 599278 (0.0027) [2024-04-28 11:42:02,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 9818603520. Throughput: 0: 55611.9. Samples: 309026460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:42:02,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:42:04,068][57339] Updated weights for policy 0, policy_version 599288 (0.0032) [2024-04-28 11:42:06,934][57339] Updated weights for policy 0, policy_version 599298 (0.0029) [2024-04-28 11:42:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 9818914816. Throughput: 0: 55792.0. Samples: 309200700. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-04-28 11:42:07,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 11:42:09,842][57339] Updated weights for policy 0, policy_version 599308 (0.0026) [2024-04-28 11:42:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.4, 300 sec: 55427.9). Total num frames: 9819160576. Throughput: 0: 55717.7. Samples: 309527720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:12,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 11:42:12,941][57339] Updated weights for policy 0, policy_version 599318 (0.0030) [2024-04-28 11:42:15,788][57339] Updated weights for policy 0, policy_version 599328 (0.0031) [2024-04-28 11:42:17,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.7, 300 sec: 55372.4). Total num frames: 9819439104. Throughput: 0: 55756.2. Samples: 309862880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:17,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 11:42:19,002][57339] Updated weights for policy 0, policy_version 599338 (0.0029) [2024-04-28 11:42:21,705][57339] Updated weights for policy 0, policy_version 599348 (0.0025) [2024-04-28 11:42:22,169][57108] Fps is (10 sec: 57345.2, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9819734016. Throughput: 0: 55637.4. Samples: 310027460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:22,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:42:24,837][57339] Updated weights for policy 0, policy_version 599358 (0.0030) [2024-04-28 11:42:27,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 9820012544. Throughput: 0: 55745.2. Samples: 310365360. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:27,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 11:42:27,457][57339] Updated weights for policy 0, policy_version 599368 (0.0032) [2024-04-28 11:42:30,862][57339] Updated weights for policy 0, policy_version 599378 (0.0034) [2024-04-28 11:42:32,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9820291072. Throughput: 0: 55623.5. Samples: 310697300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:32,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 11:42:33,425][57339] Updated weights for policy 0, policy_version 599388 (0.0031) [2024-04-28 11:42:36,719][57339] Updated weights for policy 0, policy_version 599398 (0.0034) [2024-04-28 11:42:37,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 9820553216. Throughput: 0: 55509.1. Samples: 310860880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:37,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 11:42:39,409][57339] Updated weights for policy 0, policy_version 599408 (0.0036) [2024-04-28 11:42:42,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9820831744. Throughput: 0: 55450.3. Samples: 311191480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:42,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 11:42:42,546][57339] Updated weights for policy 0, policy_version 599418 (0.0030) [2024-04-28 11:42:45,414][57339] Updated weights for policy 0, policy_version 599428 (0.0035) [2024-04-28 11:42:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 9821110272. Throughput: 0: 55521.4. Samples: 311524920. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:47,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:42:48,263][57339] Updated weights for policy 0, policy_version 599438 (0.0032) [2024-04-28 11:42:51,377][57339] Updated weights for policy 0, policy_version 599448 (0.0028) [2024-04-28 11:42:52,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 9821372416. Throughput: 0: 55247.4. Samples: 311686840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:52,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 11:42:53,989][57319] Signal inference workers to stop experience collection... (4450 times) [2024-04-28 11:42:53,990][57319] Signal inference workers to resume experience collection... (4450 times) [2024-04-28 11:42:54,020][57339] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-04-28 11:42:54,020][57339] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-04-28 11:42:54,105][57339] Updated weights for policy 0, policy_version 599458 (0.0026) [2024-04-28 11:42:57,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 9821667328. Throughput: 0: 55337.5. Samples: 312017900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:42:57,169][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 11:42:57,275][57339] Updated weights for policy 0, policy_version 599468 (0.0031) [2024-04-28 11:43:00,050][57339] Updated weights for policy 0, policy_version 599478 (0.0031) [2024-04-28 11:43:02,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.8, 300 sec: 55427.9). Total num frames: 9821945856. Throughput: 0: 55243.9. Samples: 312348860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:43:02,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 11:43:03,184][57339] Updated weights for policy 0, policy_version 599488 (0.0032) [2024-04-28 11:43:06,189][57339] Updated weights for policy 0, policy_version 599498 (0.0028) [2024-04-28 11:43:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 9822224384. Throughput: 0: 55398.2. Samples: 312520380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:43:07,169][57108] Avg episode reward: [(0, '0.515')] [2024-04-28 11:43:09,062][57339] Updated weights for policy 0, policy_version 599508 (0.0028) [2024-04-28 11:43:11,970][57339] Updated weights for policy 0, policy_version 599518 (0.0030) [2024-04-28 11:43:12,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 9822502912. Throughput: 0: 55310.4. Samples: 312854320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:43:12,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 11:43:14,747][57339] Updated weights for policy 0, policy_version 599528 (0.0027) [2024-04-28 11:43:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9822765056. Throughput: 0: 55411.6. Samples: 313190820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:43:17,169][57108] Avg episode reward: [(0, '0.507')] [2024-04-28 11:43:17,792][57339] Updated weights for policy 0, policy_version 599538 (0.0031) [2024-04-28 11:43:20,804][57339] Updated weights for policy 0, policy_version 599548 (0.0035) [2024-04-28 11:43:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 9823059968. Throughput: 0: 55309.5. Samples: 313349800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:43:22,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 11:43:23,660][57339] Updated weights for policy 0, policy_version 599558 (0.0029) [2024-04-28 11:43:26,987][57339] Updated weights for policy 0, policy_version 599568 (0.0024) [2024-04-28 11:43:27,169][57108] Fps is (10 sec: 57342.3, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 9823338496. Throughput: 0: 55461.0. Samples: 313687240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:43:27,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 11:43:27,186][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000599569_9823338496.pth... [2024-04-28 11:43:27,238][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000598756_9810018304.pth [2024-04-28 11:43:29,633][57339] Updated weights for policy 0, policy_version 599578 (0.0031) [2024-04-28 11:43:32,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9823617024. Throughput: 0: 55467.1. Samples: 314020940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:43:32,170][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 11:43:32,826][57339] Updated weights for policy 0, policy_version 599588 (0.0027) [2024-04-28 11:43:35,321][57339] Updated weights for policy 0, policy_version 599598 (0.0032) [2024-04-28 11:43:37,169][57108] Fps is (10 sec: 54068.7, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 9823879168. Throughput: 0: 55538.4. Samples: 314186060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:43:37,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 11:43:38,796][57339] Updated weights for policy 0, policy_version 599608 (0.0029) [2024-04-28 11:43:41,313][57339] Updated weights for policy 0, policy_version 599618 (0.0031) [2024-04-28 11:43:42,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.5, 300 sec: 55539.0). 
Total num frames: 9824157696. Throughput: 0: 55623.2. Samples: 314520940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 11:43:42,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 11:43:44,648][57339] Updated weights for policy 0, policy_version 599628 (0.0031) [2024-04-28 11:43:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 9824452608. Throughput: 0: 55664.5. Samples: 314853760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:43:47,169][57108] Avg episode reward: [(0, '0.475')] [2024-04-28 11:43:47,202][57339] Updated weights for policy 0, policy_version 599638 (0.0027) [2024-04-28 11:43:50,511][57339] Updated weights for policy 0, policy_version 599648 (0.0024) [2024-04-28 11:43:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 9824714752. Throughput: 0: 55456.8. Samples: 315015940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:43:52,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:43:53,250][57339] Updated weights for policy 0, policy_version 599658 (0.0028) [2024-04-28 11:43:55,419][57319] Signal inference workers to stop experience collection... (4500 times) [2024-04-28 11:43:55,423][57319] Signal inference workers to resume experience collection... (4500 times) [2024-04-28 11:43:55,448][57339] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-04-28 11:43:55,448][57339] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-04-28 11:43:56,371][57339] Updated weights for policy 0, policy_version 599668 (0.0031) [2024-04-28 11:43:57,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 9824976896. Throughput: 0: 55232.5. Samples: 315339780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:43:57,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 11:43:59,527][57339] Updated weights for policy 0, policy_version 599678 (0.0029) [2024-04-28 11:44:02,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 9825255424. Throughput: 0: 55205.5. Samples: 315675080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:02,170][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 11:44:02,579][57339] Updated weights for policy 0, policy_version 599688 (0.0026) [2024-04-28 11:44:05,618][57339] Updated weights for policy 0, policy_version 599698 (0.0033) [2024-04-28 11:44:07,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 9825550336. Throughput: 0: 55245.6. Samples: 315835860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:07,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 11:44:08,627][57339] Updated weights for policy 0, policy_version 599708 (0.0031) [2024-04-28 11:44:11,370][57339] Updated weights for policy 0, policy_version 599718 (0.0028) [2024-04-28 11:44:12,169][57108] Fps is (10 sec: 54068.1, 60 sec: 54886.4, 300 sec: 55316.8). Total num frames: 9825796096. Throughput: 0: 55080.8. Samples: 316165860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:12,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 11:44:14,422][57339] Updated weights for policy 0, policy_version 599728 (0.0031) [2024-04-28 11:44:17,169][57339] Updated weights for policy 0, policy_version 599738 (0.0028) [2024-04-28 11:44:17,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 9826107392. Throughput: 0: 55006.8. Samples: 316496240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:17,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 11:44:20,466][57339] Updated weights for policy 0, policy_version 599748 (0.0030) [2024-04-28 11:44:22,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9826385920. Throughput: 0: 55172.8. Samples: 316668840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:22,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 11:44:23,103][57339] Updated weights for policy 0, policy_version 599758 (0.0025) [2024-04-28 11:44:26,443][57339] Updated weights for policy 0, policy_version 599768 (0.0031) [2024-04-28 11:44:27,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.7, 300 sec: 55539.0). Total num frames: 9826648064. Throughput: 0: 55098.2. Samples: 317000360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:27,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 11:44:29,226][57339] Updated weights for policy 0, policy_version 599778 (0.0036) [2024-04-28 11:44:32,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 9826910208. Throughput: 0: 55091.9. Samples: 317332900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:32,169][57108] Avg episode reward: [(0, '0.693')] [2024-04-28 11:44:32,284][57339] Updated weights for policy 0, policy_version 599788 (0.0032) [2024-04-28 11:44:35,222][57339] Updated weights for policy 0, policy_version 599798 (0.0028) [2024-04-28 11:44:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 9827188736. Throughput: 0: 55064.5. Samples: 317493840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:37,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 11:44:38,196][57339] Updated weights for policy 0, policy_version 599808 (0.0031) [2024-04-28 11:44:41,072][57339] Updated weights for policy 0, policy_version 599818 (0.0026) [2024-04-28 11:44:42,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 9827467264. Throughput: 0: 55261.6. Samples: 317826560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:42,178][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 11:44:44,117][57339] Updated weights for policy 0, policy_version 599828 (0.0025) [2024-04-28 11:44:47,000][57339] Updated weights for policy 0, policy_version 599838 (0.0026) [2024-04-28 11:44:47,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 9827745792. Throughput: 0: 55280.6. Samples: 318162700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:47,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 11:44:50,023][57339] Updated weights for policy 0, policy_version 599848 (0.0028) [2024-04-28 11:44:52,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 9828024320. Throughput: 0: 55503.2. Samples: 318333500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:52,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 11:44:52,820][57339] Updated weights for policy 0, policy_version 599858 (0.0032) [2024-04-28 11:44:55,832][57339] Updated weights for policy 0, policy_version 599868 (0.0028) [2024-04-28 11:44:56,326][57319] Signal inference workers to stop experience collection... (4550 times) [2024-04-28 11:44:56,366][57339] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-04-28 11:44:56,380][57319] Signal inference workers to resume experience collection... 
(4550 times) [2024-04-28 11:44:56,386][57339] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-04-28 11:44:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 9828319232. Throughput: 0: 55536.4. Samples: 318665000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:44:57,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 11:44:58,800][57339] Updated weights for policy 0, policy_version 599878 (0.0037) [2024-04-28 11:45:01,699][57339] Updated weights for policy 0, policy_version 599888 (0.0028) [2024-04-28 11:45:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 9828581376. Throughput: 0: 55445.8. Samples: 318991300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:45:02,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 11:45:04,816][57339] Updated weights for policy 0, policy_version 599898 (0.0033) [2024-04-28 11:45:07,169][57108] Fps is (10 sec: 50790.5, 60 sec: 54613.4, 300 sec: 55316.8). Total num frames: 9828827136. Throughput: 0: 55356.0. Samples: 319159860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:45:07,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 11:45:07,802][57339] Updated weights for policy 0, policy_version 599908 (0.0031) [2024-04-28 11:45:10,698][57339] Updated weights for policy 0, policy_version 599918 (0.0026) [2024-04-28 11:45:12,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 9829138432. Throughput: 0: 55285.2. Samples: 319488200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:45:12,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 11:45:13,638][57339] Updated weights for policy 0, policy_version 599928 (0.0026) [2024-04-28 11:45:16,444][57339] Updated weights for policy 0, policy_version 599938 (0.0032) [2024-04-28 11:45:17,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 9829400576. Throughput: 0: 55119.1. Samples: 319813260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 11:45:17,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 11:45:19,499][57339] Updated weights for policy 0, policy_version 599948 (0.0027) [2024-04-28 11:45:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 9829679104. Throughput: 0: 55250.0. Samples: 319980100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:45:22,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 11:45:22,414][57339] Updated weights for policy 0, policy_version 599958 (0.0032) [2024-04-28 11:45:25,289][57339] Updated weights for policy 0, policy_version 599968 (0.0026) [2024-04-28 11:45:27,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9829957632. Throughput: 0: 55210.9. Samples: 320311040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:45:27,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 11:45:27,244][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000599974_9829974016.pth... [2024-04-28 11:45:27,295][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000599162_9816670208.pth [2024-04-28 11:45:28,518][57339] Updated weights for policy 0, policy_version 599978 (0.0027) [2024-04-28 11:45:31,325][57339] Updated weights for policy 0, policy_version 599988 (0.0029) [2024-04-28 11:45:32,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 55483.4). 
Total num frames: 9830252544. Throughput: 0: 55116.4. Samples: 320642940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:45:32,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 11:45:34,280][57339] Updated weights for policy 0, policy_version 599998 (0.0033) [2024-04-28 11:45:37,149][57339] Updated weights for policy 0, policy_version 600008 (0.0033) [2024-04-28 11:45:37,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 9830531072. Throughput: 0: 55181.3. Samples: 320816660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:45:37,170][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 11:45:40,222][57339] Updated weights for policy 0, policy_version 600018 (0.0026) [2024-04-28 11:45:42,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9830776832. Throughput: 0: 55106.5. Samples: 321144800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:45:42,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 11:45:42,915][57339] Updated weights for policy 0, policy_version 600028 (0.0027) [2024-04-28 11:45:46,066][57339] Updated weights for policy 0, policy_version 600038 (0.0032) [2024-04-28 11:45:47,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 9831071744. Throughput: 0: 55372.9. Samples: 321483080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:45:47,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 11:45:48,333][57319] Signal inference workers to stop experience collection... (4600 times) [2024-04-28 11:45:48,386][57339] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-04-28 11:45:48,392][57319] Signal inference workers to resume experience collection... 
(4600 times) [2024-04-28 11:45:48,400][57339] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-04-28 11:45:48,957][57339] Updated weights for policy 0, policy_version 600048 (0.0030) [2024-04-28 11:45:51,904][57339] Updated weights for policy 0, policy_version 600058 (0.0035) [2024-04-28 11:45:52,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 9831350272. Throughput: 0: 55107.6. Samples: 321639700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:45:52,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:45:54,723][57339] Updated weights for policy 0, policy_version 600068 (0.0034) [2024-04-28 11:45:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 9831628800. Throughput: 0: 55259.6. Samples: 321974880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:45:57,169][57108] Avg episode reward: [(0, '0.702')] [2024-04-28 11:45:57,670][57339] Updated weights for policy 0, policy_version 600078 (0.0034) [2024-04-28 11:46:00,743][57339] Updated weights for policy 0, policy_version 600088 (0.0025) [2024-04-28 11:46:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 9831890944. Throughput: 0: 55410.7. Samples: 322306740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:02,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 11:46:03,698][57339] Updated weights for policy 0, policy_version 600098 (0.0030) [2024-04-28 11:46:06,700][57339] Updated weights for policy 0, policy_version 600108 (0.0032) [2024-04-28 11:46:07,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55483.4). Total num frames: 9832185856. Throughput: 0: 55565.1. Samples: 322480520. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:07,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 11:46:09,725][57339] Updated weights for policy 0, policy_version 600118 (0.0033) [2024-04-28 11:46:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9832464384. Throughput: 0: 55588.4. Samples: 322812520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:12,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 11:46:12,554][57339] Updated weights for policy 0, policy_version 600128 (0.0027) [2024-04-28 11:46:15,641][57339] Updated weights for policy 0, policy_version 600138 (0.0035) [2024-04-28 11:46:17,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.6, 300 sec: 55261.3). Total num frames: 9832710144. Throughput: 0: 55657.4. Samples: 323147520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:17,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 11:46:18,311][57339] Updated weights for policy 0, policy_version 600148 (0.0035) [2024-04-28 11:46:21,345][57339] Updated weights for policy 0, policy_version 600158 (0.0028) [2024-04-28 11:46:22,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.7, 300 sec: 55372.4). Total num frames: 9833005056. Throughput: 0: 55483.3. Samples: 323313400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:22,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 11:46:24,200][57339] Updated weights for policy 0, policy_version 600168 (0.0034) [2024-04-28 11:46:27,169][57108] Fps is (10 sec: 58981.4, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 9833299968. Throughput: 0: 55542.8. Samples: 323644220. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:27,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 11:46:27,262][57339] Updated weights for policy 0, policy_version 600178 (0.0031) [2024-04-28 11:46:30,194][57339] Updated weights for policy 0, policy_version 600188 (0.0033) [2024-04-28 11:46:32,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9833562112. Throughput: 0: 55431.4. Samples: 323977500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:32,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 11:46:33,411][57339] Updated weights for policy 0, policy_version 600198 (0.0028) [2024-04-28 11:46:36,032][57339] Updated weights for policy 0, policy_version 600208 (0.0027) [2024-04-28 11:46:37,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 9833857024. Throughput: 0: 55712.8. Samples: 324146780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:37,170][57108] Avg episode reward: [(0, '0.520')] [2024-04-28 11:46:39,208][57339] Updated weights for policy 0, policy_version 600218 (0.0029) [2024-04-28 11:46:41,772][57339] Updated weights for policy 0, policy_version 600228 (0.0028) [2024-04-28 11:46:42,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.9, 300 sec: 55483.4). Total num frames: 9834135552. Throughput: 0: 55641.5. Samples: 324478740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:42,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:46:45,144][57339] Updated weights for policy 0, policy_version 600238 (0.0030) [2024-04-28 11:46:47,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9834397696. Throughput: 0: 55736.5. Samples: 324814880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 11:46:47,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 11:46:47,872][57339] Updated weights for policy 0, policy_version 600248 (0.0030) [2024-04-28 11:46:50,204][57319] Signal inference workers to stop experience collection... (4650 times) [2024-04-28 11:46:50,241][57339] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-04-28 11:46:50,266][57319] Signal inference workers to resume experience collection... (4650 times) [2024-04-28 11:46:50,269][57339] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-04-28 11:46:50,976][57339] Updated weights for policy 0, policy_version 600258 (0.0027) [2024-04-28 11:46:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9834676224. Throughput: 0: 55532.9. Samples: 324979500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:46:52,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 11:46:53,749][57339] Updated weights for policy 0, policy_version 600268 (0.0026) [2024-04-28 11:46:57,144][57339] Updated weights for policy 0, policy_version 600278 (0.0031) [2024-04-28 11:46:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 9834954752. Throughput: 0: 55476.9. Samples: 325308980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:46:57,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 11:46:59,669][57339] Updated weights for policy 0, policy_version 600288 (0.0031) [2024-04-28 11:47:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 9835216896. Throughput: 0: 55500.4. Samples: 325645040. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:02,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 11:47:02,919][57339] Updated weights for policy 0, policy_version 600298 (0.0029) [2024-04-28 11:47:05,658][57339] Updated weights for policy 0, policy_version 600308 (0.0033) [2024-04-28 11:47:07,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 9835495424. Throughput: 0: 55474.0. Samples: 325809740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:07,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:47:08,851][57339] Updated weights for policy 0, policy_version 600318 (0.0029) [2024-04-28 11:47:11,411][57339] Updated weights for policy 0, policy_version 600328 (0.0031) [2024-04-28 11:47:12,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 9835790336. Throughput: 0: 55434.7. Samples: 326138780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:12,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:47:14,955][57339] Updated weights for policy 0, policy_version 600338 (0.0030) [2024-04-28 11:47:17,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.6, 300 sec: 55427.9). Total num frames: 9836085248. Throughput: 0: 55388.9. Samples: 326470000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:17,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 11:47:17,558][57339] Updated weights for policy 0, policy_version 600348 (0.0026) [2024-04-28 11:47:20,789][57339] Updated weights for policy 0, policy_version 600358 (0.0029) [2024-04-28 11:47:22,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 55427.9). Total num frames: 9836363776. Throughput: 0: 55487.1. Samples: 326643700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:22,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 11:47:23,405][57339] Updated weights for policy 0, policy_version 600368 (0.0038) [2024-04-28 11:47:26,534][57339] Updated weights for policy 0, policy_version 600378 (0.0027) [2024-04-28 11:47:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55372.3). Total num frames: 9836625920. Throughput: 0: 55550.9. Samples: 326978540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:27,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 11:47:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000600380_9836625920.pth... [2024-04-28 11:47:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000599569_9823338496.pth [2024-04-28 11:47:29,125][57339] Updated weights for policy 0, policy_version 600388 (0.0031) [2024-04-28 11:47:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 9836904448. Throughput: 0: 55552.3. Samples: 327314740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:32,170][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 11:47:32,399][57339] Updated weights for policy 0, policy_version 600398 (0.0032) [2024-04-28 11:47:35,070][57339] Updated weights for policy 0, policy_version 600408 (0.0026) [2024-04-28 11:47:37,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 9837166592. Throughput: 0: 55397.3. Samples: 327472380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:37,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 11:47:38,485][57339] Updated weights for policy 0, policy_version 600418 (0.0028) [2024-04-28 11:47:41,136][57339] Updated weights for policy 0, policy_version 600428 (0.0026) [2024-04-28 11:47:42,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55432.6, 300 sec: 55427.9). 
Total num frames: 9837461504. Throughput: 0: 55475.7. Samples: 327805380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:42,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 11:47:44,367][57339] Updated weights for policy 0, policy_version 600438 (0.0032) [2024-04-28 11:47:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 9837723648. Throughput: 0: 55303.6. Samples: 328133700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:47,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:47:47,192][57339] Updated weights for policy 0, policy_version 600448 (0.0026) [2024-04-28 11:47:49,848][57319] Signal inference workers to stop experience collection... (4700 times) [2024-04-28 11:47:49,867][57339] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-04-28 11:47:49,941][57319] Signal inference workers to resume experience collection... (4700 times) [2024-04-28 11:47:49,941][57339] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-04-28 11:47:50,193][57339] Updated weights for policy 0, policy_version 600458 (0.0035) [2024-04-28 11:47:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 9838018560. Throughput: 0: 55599.1. Samples: 328311700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:52,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 11:47:52,976][57339] Updated weights for policy 0, policy_version 600468 (0.0041) [2024-04-28 11:47:56,180][57339] Updated weights for policy 0, policy_version 600478 (0.0031) [2024-04-28 11:47:57,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 9838280704. Throughput: 0: 55655.1. Samples: 328643260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:47:57,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 11:47:58,797][57339] Updated weights for policy 0, policy_version 600488 (0.0033) [2024-04-28 11:48:02,040][57339] Updated weights for policy 0, policy_version 600498 (0.0024) [2024-04-28 11:48:02,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55372.4). Total num frames: 9838559232. Throughput: 0: 55709.8. Samples: 328976940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:48:02,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 11:48:04,778][57339] Updated weights for policy 0, policy_version 600508 (0.0030) [2024-04-28 11:48:07,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9838804992. Throughput: 0: 55242.2. Samples: 329129600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:48:07,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 11:48:07,945][57339] Updated weights for policy 0, policy_version 600518 (0.0024) [2024-04-28 11:48:10,691][57339] Updated weights for policy 0, policy_version 600528 (0.0035) [2024-04-28 11:48:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 9839099904. Throughput: 0: 55141.1. Samples: 329459880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:48:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 11:48:13,910][57339] Updated weights for policy 0, policy_version 600538 (0.0027) [2024-04-28 11:48:16,493][57339] Updated weights for policy 0, policy_version 600548 (0.0031) [2024-04-28 11:48:17,169][57108] Fps is (10 sec: 57344.1, 60 sec: 54886.4, 300 sec: 55316.8). Total num frames: 9839378432. Throughput: 0: 54966.6. Samples: 329788240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:48:17,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 11:48:20,039][57339] Updated weights for policy 0, policy_version 600558 (0.0024) [2024-04-28 11:48:22,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 9839689728. Throughput: 0: 55417.6. Samples: 329966180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 11:48:22,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 11:48:22,630][57339] Updated weights for policy 0, policy_version 600568 (0.0030) [2024-04-28 11:48:26,102][57339] Updated weights for policy 0, policy_version 600578 (0.0031) [2024-04-28 11:48:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9839951872. Throughput: 0: 55290.9. Samples: 330293480. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:48:27,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 11:48:28,756][57339] Updated weights for policy 0, policy_version 600588 (0.0027) [2024-04-28 11:48:31,951][57339] Updated weights for policy 0, policy_version 600598 (0.0031) [2024-04-28 11:48:32,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 9840214016. Throughput: 0: 55377.8. Samples: 330625700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:48:32,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 11:48:34,764][57339] Updated weights for policy 0, policy_version 600608 (0.0024) [2024-04-28 11:48:37,169][57108] Fps is (10 sec: 50790.9, 60 sec: 54886.4, 300 sec: 55261.3). Total num frames: 9840459776. Throughput: 0: 54972.1. Samples: 330785440. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:48:37,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 11:48:37,853][57339] Updated weights for policy 0, policy_version 600618 (0.0032) [2024-04-28 11:48:40,684][57339] Updated weights for policy 0, policy_version 600628 (0.0037) [2024-04-28 11:48:42,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.3, 300 sec: 55261.3). Total num frames: 9840754688. Throughput: 0: 55102.7. Samples: 331122880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:48:42,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 11:48:43,731][57339] Updated weights for policy 0, policy_version 600638 (0.0033) [2024-04-28 11:48:45,457][57319] Signal inference workers to stop experience collection... (4750 times) [2024-04-28 11:48:45,458][57319] Signal inference workers to resume experience collection... (4750 times) [2024-04-28 11:48:45,466][57339] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-04-28 11:48:45,467][57339] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-04-28 11:48:46,527][57339] Updated weights for policy 0, policy_version 600648 (0.0035) [2024-04-28 11:48:47,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.3, 300 sec: 55316.8). Total num frames: 9841033216. Throughput: 0: 55014.1. Samples: 331452580. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:48:47,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:48:49,653][57339] Updated weights for policy 0, policy_version 600658 (0.0041) [2024-04-28 11:48:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 9841328128. Throughput: 0: 55431.7. Samples: 331624020. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:48:52,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:48:52,310][57339] Updated weights for policy 0, policy_version 600668 (0.0029) [2024-04-28 11:48:55,536][57339] Updated weights for policy 0, policy_version 600678 (0.0026) [2024-04-28 11:48:57,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 9841623040. Throughput: 0: 55461.7. Samples: 331955660. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:48:57,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 11:48:58,293][57339] Updated weights for policy 0, policy_version 600688 (0.0033) [2024-04-28 11:49:01,459][57339] Updated weights for policy 0, policy_version 600698 (0.0035) [2024-04-28 11:49:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 9841885184. Throughput: 0: 55536.5. Samples: 332287380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:02,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 11:49:04,182][57339] Updated weights for policy 0, policy_version 600708 (0.0028) [2024-04-28 11:49:07,169][57108] Fps is (10 sec: 50790.3, 60 sec: 55432.6, 300 sec: 55372.3). Total num frames: 9842130944. Throughput: 0: 55352.0. Samples: 332457020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:07,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 11:49:07,364][57339] Updated weights for policy 0, policy_version 600718 (0.0038) [2024-04-28 11:49:10,111][57339] Updated weights for policy 0, policy_version 600728 (0.0035) [2024-04-28 11:49:12,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9842409472. Throughput: 0: 55417.0. Samples: 332787240. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:12,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 11:49:13,270][57339] Updated weights for policy 0, policy_version 600738 (0.0031) [2024-04-28 11:49:16,117][57339] Updated weights for policy 0, policy_version 600748 (0.0029) [2024-04-28 11:49:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.6, 300 sec: 55261.3). Total num frames: 9842688000. Throughput: 0: 55389.8. Samples: 333118240. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:17,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:49:19,121][57339] Updated weights for policy 0, policy_version 600758 (0.0031) [2024-04-28 11:49:22,138][57339] Updated weights for policy 0, policy_version 600768 (0.0028) [2024-04-28 11:49:22,169][57108] Fps is (10 sec: 57343.0, 60 sec: 54886.4, 300 sec: 55372.3). Total num frames: 9842982912. Throughput: 0: 55324.3. Samples: 333275040. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:22,170][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 11:49:24,982][57339] Updated weights for policy 0, policy_version 600778 (0.0031) [2024-04-28 11:49:27,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 9843277824. Throughput: 0: 55325.7. Samples: 333612540. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:27,169][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 11:49:27,176][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000600786_9843277824.pth... [2024-04-28 11:49:27,221][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000599974_9829974016.pth [2024-04-28 11:49:28,278][57339] Updated weights for policy 0, policy_version 600788 (0.0029) [2024-04-28 11:49:30,963][57339] Updated weights for policy 0, policy_version 600798 (0.0027) [2024-04-28 11:49:32,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55705.6, 300 sec: 55483.4). 
Total num frames: 9843556352. Throughput: 0: 55338.4. Samples: 333942800. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:32,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 11:49:34,056][57339] Updated weights for policy 0, policy_version 600808 (0.0029) [2024-04-28 11:49:36,725][57339] Updated weights for policy 0, policy_version 600818 (0.0028) [2024-04-28 11:49:37,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.6, 300 sec: 55427.9). Total num frames: 9843818496. Throughput: 0: 55437.2. Samples: 334118700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:37,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 11:49:39,817][57339] Updated weights for policy 0, policy_version 600828 (0.0026) [2024-04-28 11:49:42,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9844080640. Throughput: 0: 55496.1. Samples: 334452980. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:42,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 11:49:42,669][57339] Updated weights for policy 0, policy_version 600838 (0.0031) [2024-04-28 11:49:45,751][57339] Updated weights for policy 0, policy_version 600848 (0.0028) [2024-04-28 11:49:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 9844375552. Throughput: 0: 55504.5. Samples: 334785080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:47,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 11:49:48,507][57339] Updated weights for policy 0, policy_version 600858 (0.0028) [2024-04-28 11:49:49,121][57319] Signal inference workers to stop experience collection... (4800 times) [2024-04-28 11:49:49,121][57319] Signal inference workers to resume experience collection... 
(4800 times) [2024-04-28 11:49:49,132][57339] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-04-28 11:49:49,132][57339] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-04-28 11:49:51,671][57339] Updated weights for policy 0, policy_version 600868 (0.0028) [2024-04-28 11:49:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9844654080. Throughput: 0: 55091.7. Samples: 334936140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:52,169][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 11:49:54,483][57339] Updated weights for policy 0, policy_version 600878 (0.0028) [2024-04-28 11:49:57,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.5, 300 sec: 55372.4). Total num frames: 9844916224. Throughput: 0: 55269.8. Samples: 335274380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-04-28 11:49:57,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 11:49:57,635][57339] Updated weights for policy 0, policy_version 600888 (0.0024) [2024-04-28 11:50:00,367][57339] Updated weights for policy 0, policy_version 600898 (0.0029) [2024-04-28 11:50:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9845211136. Throughput: 0: 55280.4. Samples: 335605860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:02,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 11:50:03,482][57339] Updated weights for policy 0, policy_version 600908 (0.0029) [2024-04-28 11:50:06,229][57339] Updated weights for policy 0, policy_version 600918 (0.0027) [2024-04-28 11:50:07,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 55427.9). Total num frames: 9845489664. Throughput: 0: 55587.3. Samples: 335776460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:07,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 11:50:09,578][57339] Updated weights for policy 0, policy_version 600928 (0.0031) [2024-04-28 11:50:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 9845751808. Throughput: 0: 55456.1. Samples: 336108060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:12,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 11:50:12,198][57339] Updated weights for policy 0, policy_version 600938 (0.0034) [2024-04-28 11:50:15,559][57339] Updated weights for policy 0, policy_version 600948 (0.0030) [2024-04-28 11:50:17,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55159.5, 300 sec: 55316.9). Total num frames: 9845997568. Throughput: 0: 55449.3. Samples: 336438020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:17,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 11:50:18,248][57339] Updated weights for policy 0, policy_version 600958 (0.0032) [2024-04-28 11:50:21,439][57339] Updated weights for policy 0, policy_version 600968 (0.0032) [2024-04-28 11:50:22,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 9846292480. Throughput: 0: 54953.5. Samples: 336591600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:22,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 11:50:24,160][57339] Updated weights for policy 0, policy_version 600978 (0.0025) [2024-04-28 11:50:27,169][57108] Fps is (10 sec: 57343.0, 60 sec: 54886.3, 300 sec: 55316.8). Total num frames: 9846571008. Throughput: 0: 54890.0. Samples: 336923040. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:27,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 11:50:27,467][57339] Updated weights for policy 0, policy_version 600988 (0.0030) [2024-04-28 11:50:30,229][57339] Updated weights for policy 0, policy_version 600998 (0.0033) [2024-04-28 11:50:32,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54886.4, 300 sec: 55316.8). Total num frames: 9846849536. Throughput: 0: 54857.8. Samples: 337253680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:32,170][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 11:50:33,372][57339] Updated weights for policy 0, policy_version 601008 (0.0025) [2024-04-28 11:50:36,122][57339] Updated weights for policy 0, policy_version 601018 (0.0028) [2024-04-28 11:50:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 9847128064. Throughput: 0: 55390.6. Samples: 337428720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:37,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 11:50:39,266][57339] Updated weights for policy 0, policy_version 601028 (0.0030) [2024-04-28 11:50:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9847390208. Throughput: 0: 55211.5. Samples: 337758900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:42,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:50:42,197][57339] Updated weights for policy 0, policy_version 601038 (0.0030) [2024-04-28 11:50:45,248][57339] Updated weights for policy 0, policy_version 601048 (0.0027) [2024-04-28 11:50:47,169][57108] Fps is (10 sec: 54067.8, 60 sec: 54886.5, 300 sec: 55316.8). Total num frames: 9847668736. Throughput: 0: 55137.9. Samples: 338087060. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:47,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 11:50:48,063][57339] Updated weights for policy 0, policy_version 601058 (0.0029) [2024-04-28 11:50:51,117][57339] Updated weights for policy 0, policy_version 601068 (0.0031) [2024-04-28 11:50:52,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54886.4, 300 sec: 55316.8). Total num frames: 9847947264. Throughput: 0: 54966.7. Samples: 338249960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:52,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 11:50:53,885][57339] Updated weights for policy 0, policy_version 601078 (0.0027) [2024-04-28 11:50:57,051][57339] Updated weights for policy 0, policy_version 601088 (0.0033) [2024-04-28 11:50:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 9848225792. Throughput: 0: 55002.2. Samples: 338583160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:50:57,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 11:50:59,962][57339] Updated weights for policy 0, policy_version 601098 (0.0029) [2024-04-28 11:51:00,311][57319] Signal inference workers to stop experience collection... (4850 times) [2024-04-28 11:51:00,316][57319] Signal inference workers to resume experience collection... (4850 times) [2024-04-28 11:51:00,340][57339] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-04-28 11:51:00,340][57339] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-04-28 11:51:02,169][57108] Fps is (10 sec: 55704.4, 60 sec: 54886.3, 300 sec: 55316.8). Total num frames: 9848504320. Throughput: 0: 55023.3. Samples: 338914080. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:51:02,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:51:02,847][57339] Updated weights for policy 0, policy_version 601108 (0.0024) [2024-04-28 11:51:05,836][57339] Updated weights for policy 0, policy_version 601118 (0.0040) [2024-04-28 11:51:07,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.3, 300 sec: 55372.3). Total num frames: 9848799232. Throughput: 0: 55382.9. Samples: 339083840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:51:07,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 11:51:08,774][57339] Updated weights for policy 0, policy_version 601128 (0.0031) [2024-04-28 11:51:11,661][57339] Updated weights for policy 0, policy_version 601138 (0.0033) [2024-04-28 11:51:12,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 9849077760. Throughput: 0: 55479.7. Samples: 339419620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:51:12,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 11:51:14,656][57339] Updated weights for policy 0, policy_version 601148 (0.0034) [2024-04-28 11:51:17,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9849323520. Throughput: 0: 55452.0. Samples: 339749020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:51:17,169][57108] Avg episode reward: [(0, '0.515')] [2024-04-28 11:51:17,664][57339] Updated weights for policy 0, policy_version 601158 (0.0034) [2024-04-28 11:51:20,760][57339] Updated weights for policy 0, policy_version 601168 (0.0036) [2024-04-28 11:51:22,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55316.9). Total num frames: 9849618432. Throughput: 0: 55203.7. Samples: 339912880. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:51:22,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 11:51:23,538][57339] Updated weights for policy 0, policy_version 601178 (0.0030) [2024-04-28 11:51:26,610][57339] Updated weights for policy 0, policy_version 601188 (0.0031) [2024-04-28 11:51:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.6, 300 sec: 55316.8). Total num frames: 9849880576. Throughput: 0: 55237.3. Samples: 340244580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:51:27,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 11:51:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000601189_9849880576.pth... [2024-04-28 11:51:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000600380_9836625920.pth [2024-04-28 11:51:29,375][57339] Updated weights for policy 0, policy_version 601198 (0.0027) [2024-04-28 11:51:32,169][57108] Fps is (10 sec: 52428.0, 60 sec: 54886.3, 300 sec: 55205.7). Total num frames: 9850142720. Throughput: 0: 55430.5. Samples: 340581440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 11:51:32,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:51:32,464][57339] Updated weights for policy 0, policy_version 601208 (0.0032) [2024-04-28 11:51:35,293][57339] Updated weights for policy 0, policy_version 601218 (0.0033) [2024-04-28 11:51:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9850437632. Throughput: 0: 55435.1. Samples: 340744540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:51:37,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 11:51:38,477][57339] Updated weights for policy 0, policy_version 601228 (0.0028) [2024-04-28 11:51:41,316][57339] Updated weights for policy 0, policy_version 601238 (0.0028) [2024-04-28 11:51:42,169][57108] Fps is (10 sec: 60621.8, 60 sec: 55978.7, 300 sec: 55427.9). 
Total num frames: 9850748928. Throughput: 0: 55438.3. Samples: 341077880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:51:42,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 11:51:44,382][57339] Updated weights for policy 0, policy_version 601248 (0.0025) [2024-04-28 11:51:47,102][57339] Updated weights for policy 0, policy_version 601258 (0.0027) [2024-04-28 11:51:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55372.4). Total num frames: 9851011072. Throughput: 0: 55490.0. Samples: 341411120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:51:47,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 11:51:50,236][57339] Updated weights for policy 0, policy_version 601268 (0.0032) [2024-04-28 11:51:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9851273216. Throughput: 0: 55497.1. Samples: 341581200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:51:52,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 11:51:52,915][57339] Updated weights for policy 0, policy_version 601278 (0.0038) [2024-04-28 11:51:56,054][57339] Updated weights for policy 0, policy_version 601288 (0.0029) [2024-04-28 11:51:57,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.5, 300 sec: 55316.8). Total num frames: 9851535360. Throughput: 0: 55418.7. Samples: 341913460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:51:57,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 11:51:59,033][57339] Updated weights for policy 0, policy_version 601298 (0.0033) [2024-04-28 11:52:02,029][57339] Updated weights for policy 0, policy_version 601308 (0.0034) [2024-04-28 11:52:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.7, 300 sec: 55372.4). Total num frames: 9851830272. Throughput: 0: 55473.0. Samples: 342245300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:02,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:52:04,854][57339] Updated weights for policy 0, policy_version 601318 (0.0030) [2024-04-28 11:52:07,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54886.5, 300 sec: 55261.3). Total num frames: 9852092416. Throughput: 0: 55463.0. Samples: 342408720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:07,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:52:07,479][57319] Signal inference workers to stop experience collection... (4900 times) [2024-04-28 11:52:07,479][57319] Signal inference workers to resume experience collection... (4900 times) [2024-04-28 11:52:07,500][57339] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-04-28 11:52:07,500][57339] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-04-28 11:52:07,868][57339] Updated weights for policy 0, policy_version 601328 (0.0031) [2024-04-28 11:52:10,760][57339] Updated weights for policy 0, policy_version 601338 (0.0030) [2024-04-28 11:52:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9852387328. Throughput: 0: 55451.2. Samples: 342739880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:12,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 11:52:13,812][57339] Updated weights for policy 0, policy_version 601348 (0.0026) [2024-04-28 11:52:16,482][57339] Updated weights for policy 0, policy_version 601358 (0.0027) [2024-04-28 11:52:17,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55978.7, 300 sec: 55316.8). Total num frames: 9852682240. Throughput: 0: 55281.4. Samples: 343069100. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:17,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 11:52:19,795][57339] Updated weights for policy 0, policy_version 601368 (0.0030) [2024-04-28 11:52:22,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55316.8). Total num frames: 9852944384. Throughput: 0: 55599.5. Samples: 343246520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:22,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 11:52:22,503][57339] Updated weights for policy 0, policy_version 601378 (0.0031) [2024-04-28 11:52:25,742][57339] Updated weights for policy 0, policy_version 601388 (0.0032) [2024-04-28 11:52:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 9853222912. Throughput: 0: 55667.8. Samples: 343582940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:27,169][57108] Avg episode reward: [(0, '0.520')] [2024-04-28 11:52:28,263][57339] Updated weights for policy 0, policy_version 601398 (0.0027) [2024-04-28 11:52:31,603][57339] Updated weights for policy 0, policy_version 601408 (0.0027) [2024-04-28 11:52:32,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 9853485056. Throughput: 0: 55482.1. Samples: 343907820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:32,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 11:52:34,127][57339] Updated weights for policy 0, policy_version 601418 (0.0033) [2024-04-28 11:52:37,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55261.2). Total num frames: 9853763584. Throughput: 0: 55277.1. Samples: 344068680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:37,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 11:52:37,476][57339] Updated weights for policy 0, policy_version 601428 (0.0032) [2024-04-28 11:52:40,173][57339] Updated weights for policy 0, policy_version 601438 (0.0030) [2024-04-28 11:52:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.3, 300 sec: 55316.8). Total num frames: 9854042112. Throughput: 0: 55266.9. Samples: 344400480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:42,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 11:52:43,514][57339] Updated weights for policy 0, policy_version 601448 (0.0027) [2024-04-28 11:52:46,053][57339] Updated weights for policy 0, policy_version 601458 (0.0029) [2024-04-28 11:52:47,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9854337024. Throughput: 0: 55280.4. Samples: 344732920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:47,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 11:52:49,426][57339] Updated weights for policy 0, policy_version 601468 (0.0026) [2024-04-28 11:52:51,820][57339] Updated weights for policy 0, policy_version 601478 (0.0028) [2024-04-28 11:52:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.4, 300 sec: 55372.3). Total num frames: 9854615552. Throughput: 0: 55547.0. Samples: 344908340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:52,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 11:52:55,496][57339] Updated weights for policy 0, policy_version 601488 (0.0032) [2024-04-28 11:52:55,771][57319] Signal inference workers to stop experience collection... (4950 times) [2024-04-28 11:52:55,775][57319] Signal inference workers to resume experience collection... 
(4950 times) [2024-04-28 11:52:55,793][57339] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-04-28 11:52:55,793][57339] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-04-28 11:52:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55372.4). Total num frames: 9854894080. Throughput: 0: 55546.7. Samples: 345239480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:52:57,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 11:52:57,689][57339] Updated weights for policy 0, policy_version 601498 (0.0029) [2024-04-28 11:53:01,424][57339] Updated weights for policy 0, policy_version 601508 (0.0026) [2024-04-28 11:53:02,169][57108] Fps is (10 sec: 54068.3, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 9855156224. Throughput: 0: 55608.5. Samples: 345571480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 11:53:02,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 11:53:03,615][57339] Updated weights for policy 0, policy_version 601518 (0.0031) [2024-04-28 11:53:07,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9855401984. Throughput: 0: 55164.1. Samples: 345728900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:07,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 11:53:07,303][57339] Updated weights for policy 0, policy_version 601528 (0.0025) [2024-04-28 11:53:09,608][57339] Updated weights for policy 0, policy_version 601538 (0.0033) [2024-04-28 11:53:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55316.9). Total num frames: 9855696896. Throughput: 0: 55065.9. Samples: 346060900. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:12,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 11:53:13,223][57339] Updated weights for policy 0, policy_version 601548 (0.0025) [2024-04-28 11:53:15,553][57339] Updated weights for policy 0, policy_version 601558 (0.0028) [2024-04-28 11:53:17,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9855991808. Throughput: 0: 55264.5. Samples: 346394720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:17,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:53:19,106][57339] Updated weights for policy 0, policy_version 601568 (0.0027) [2024-04-28 11:53:21,469][57339] Updated weights for policy 0, policy_version 601578 (0.0030) [2024-04-28 11:53:22,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55705.7, 300 sec: 55372.4). Total num frames: 9856286720. Throughput: 0: 55555.8. Samples: 346568680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:22,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 11:53:24,993][57339] Updated weights for policy 0, policy_version 601588 (0.0027) [2024-04-28 11:53:27,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 9856565248. Throughput: 0: 55576.4. Samples: 346901420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:27,170][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 11:53:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000601597_9856565248.pth... [2024-04-28 11:53:27,235][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000600786_9843277824.pth [2024-04-28 11:53:27,353][57339] Updated weights for policy 0, policy_version 601598 (0.0027) [2024-04-28 11:53:30,869][57339] Updated weights for policy 0, policy_version 601608 (0.0031) [2024-04-28 11:53:32,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 55427.9). 
Total num frames: 9856811008. Throughput: 0: 55520.0. Samples: 347231320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:32,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 11:53:33,265][57339] Updated weights for policy 0, policy_version 601618 (0.0028) [2024-04-28 11:53:36,838][57339] Updated weights for policy 0, policy_version 601628 (0.0029) [2024-04-28 11:53:37,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9857089536. Throughput: 0: 55259.3. Samples: 347395000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:37,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 11:53:39,123][57339] Updated weights for policy 0, policy_version 601638 (0.0031) [2024-04-28 11:53:42,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54886.4, 300 sec: 55261.3). Total num frames: 9857335296. Throughput: 0: 55226.5. Samples: 347724680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:42,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 11:53:42,774][57339] Updated weights for policy 0, policy_version 601648 (0.0030) [2024-04-28 11:53:44,901][57339] Updated weights for policy 0, policy_version 601658 (0.0035) [2024-04-28 11:53:47,169][57108] Fps is (10 sec: 52428.1, 60 sec: 54613.2, 300 sec: 55205.7). Total num frames: 9857613824. Throughput: 0: 55187.3. Samples: 348054920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:47,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 11:53:48,796][57339] Updated weights for policy 0, policy_version 601668 (0.0027) [2024-04-28 11:53:50,684][57319] Signal inference workers to stop experience collection... (5000 times) [2024-04-28 11:53:50,684][57319] Signal inference workers to resume experience collection... 
(5000 times) [2024-04-28 11:53:50,707][57339] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-04-28 11:53:50,711][57339] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-04-28 11:53:50,808][57339] Updated weights for policy 0, policy_version 601678 (0.0029) [2024-04-28 11:53:52,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55159.6, 300 sec: 55261.3). Total num frames: 9857925120. Throughput: 0: 55478.2. Samples: 348225420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:52,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 11:53:54,890][57339] Updated weights for policy 0, policy_version 601688 (0.0034) [2024-04-28 11:53:56,954][57339] Updated weights for policy 0, policy_version 601698 (0.0030) [2024-04-28 11:53:57,169][57108] Fps is (10 sec: 60620.8, 60 sec: 55432.3, 300 sec: 55372.3). Total num frames: 9858220032. Throughput: 0: 55340.2. Samples: 348551220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:53:57,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 11:54:00,631][57339] Updated weights for policy 0, policy_version 601708 (0.0032) [2024-04-28 11:54:02,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 9858482176. Throughput: 0: 55359.8. Samples: 348885920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:54:02,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 11:54:02,923][57339] Updated weights for policy 0, policy_version 601718 (0.0031) [2024-04-28 11:54:06,324][57339] Updated weights for policy 0, policy_version 601728 (0.0028) [2024-04-28 11:54:07,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55978.7, 300 sec: 55427.9). Total num frames: 9858760704. Throughput: 0: 55262.2. Samples: 349055480. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:54:07,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 11:54:08,585][57339] Updated weights for policy 0, policy_version 601738 (0.0030) [2024-04-28 11:54:12,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 9859022848. Throughput: 0: 55382.0. Samples: 349393600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:54:12,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 11:54:12,364][57339] Updated weights for policy 0, policy_version 601748 (0.0027) [2024-04-28 11:54:14,378][57339] Updated weights for policy 0, policy_version 601758 (0.0035) [2024-04-28 11:54:17,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54886.4, 300 sec: 55261.3). Total num frames: 9859284992. Throughput: 0: 55401.3. Samples: 349724380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:54:17,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 11:54:18,256][57339] Updated weights for policy 0, policy_version 601768 (0.0026) [2024-04-28 11:54:20,259][57339] Updated weights for policy 0, policy_version 601778 (0.0025) [2024-04-28 11:54:22,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54886.3, 300 sec: 55261.3). Total num frames: 9859579904. Throughput: 0: 55313.3. Samples: 349884100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:54:22,170][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 11:54:24,254][57339] Updated weights for policy 0, policy_version 601788 (0.0026) [2024-04-28 11:54:26,098][57339] Updated weights for policy 0, policy_version 601798 (0.0026) [2024-04-28 11:54:27,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55159.6, 300 sec: 55316.8). Total num frames: 9859874816. Throughput: 0: 55374.3. Samples: 350216520. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:54:27,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 11:54:30,354][57339] Updated weights for policy 0, policy_version 601808 (0.0032) [2024-04-28 11:54:31,950][57339] Updated weights for policy 0, policy_version 601818 (0.0034) [2024-04-28 11:54:32,169][57108] Fps is (10 sec: 60620.3, 60 sec: 56251.6, 300 sec: 55483.4). Total num frames: 9860186112. Throughput: 0: 55383.1. Samples: 350547160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:54:32,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 11:54:36,098][57339] Updated weights for policy 0, policy_version 601828 (0.0031) [2024-04-28 11:54:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 9860431872. Throughput: 0: 55494.7. Samples: 350722680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-04-28 11:54:37,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 11:54:38,113][57339] Updated weights for policy 0, policy_version 601838 (0.0029) [2024-04-28 11:54:41,891][57339] Updated weights for policy 0, policy_version 601848 (0.0027) [2024-04-28 11:54:42,169][57108] Fps is (10 sec: 49152.4, 60 sec: 55705.5, 300 sec: 55261.3). Total num frames: 9860677632. Throughput: 0: 55531.2. Samples: 351050120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:54:42,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 11:54:44,245][57339] Updated weights for policy 0, policy_version 601858 (0.0027) [2024-04-28 11:54:45,398][57319] Signal inference workers to stop experience collection... (5050 times) [2024-04-28 11:54:45,402][57319] Signal inference workers to resume experience collection... 
(5050 times) [2024-04-28 11:54:45,414][57339] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-04-28 11:54:45,414][57339] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-04-28 11:54:47,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.8, 300 sec: 55261.3). Total num frames: 9860956160. Throughput: 0: 55438.9. Samples: 351380660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:54:47,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 11:54:47,951][57339] Updated weights for policy 0, policy_version 601868 (0.0034) [2024-04-28 11:54:50,257][57339] Updated weights for policy 0, policy_version 601878 (0.0025) [2024-04-28 11:54:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55316.8). Total num frames: 9861234688. Throughput: 0: 55103.0. Samples: 351535120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:54:52,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 11:54:53,867][57339] Updated weights for policy 0, policy_version 601888 (0.0033) [2024-04-28 11:54:56,228][57339] Updated weights for policy 0, policy_version 601898 (0.0031) [2024-04-28 11:54:57,169][57108] Fps is (10 sec: 55704.9, 60 sec: 54886.5, 300 sec: 55261.3). Total num frames: 9861513216. Throughput: 0: 55001.6. Samples: 351868680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:54:57,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 11:54:59,638][57339] Updated weights for policy 0, policy_version 601908 (0.0029) [2024-04-28 11:55:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.7, 300 sec: 55316.8). Total num frames: 9861808128. Throughput: 0: 55041.8. Samples: 352201260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:02,169][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 11:55:02,238][57339] Updated weights for policy 0, policy_version 601918 (0.0031) [2024-04-28 11:55:05,547][57339] Updated weights for policy 0, policy_version 601928 (0.0027) [2024-04-28 11:55:07,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55705.4, 300 sec: 55427.9). Total num frames: 9862103040. Throughput: 0: 55474.1. Samples: 352380440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:07,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 11:55:08,071][57339] Updated weights for policy 0, policy_version 601938 (0.0027) [2024-04-28 11:55:11,517][57339] Updated weights for policy 0, policy_version 601948 (0.0024) [2024-04-28 11:55:12,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55159.3, 300 sec: 55372.3). Total num frames: 9862332416. Throughput: 0: 55337.6. Samples: 352706720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:12,170][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 11:55:13,929][57339] Updated weights for policy 0, policy_version 601958 (0.0045) [2024-04-28 11:55:17,169][57108] Fps is (10 sec: 50791.5, 60 sec: 55432.6, 300 sec: 55316.8). Total num frames: 9862610944. Throughput: 0: 55388.7. Samples: 353039640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:17,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 11:55:17,443][57339] Updated weights for policy 0, policy_version 601968 (0.0026) [2024-04-28 11:55:20,049][57339] Updated weights for policy 0, policy_version 601978 (0.0027) [2024-04-28 11:55:22,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54886.5, 300 sec: 55261.3). Total num frames: 9862873088. Throughput: 0: 54944.4. Samples: 353195180. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:22,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 11:55:23,283][57339] Updated weights for policy 0, policy_version 601988 (0.0026) [2024-04-28 11:55:26,160][57339] Updated weights for policy 0, policy_version 601998 (0.0031) [2024-04-28 11:55:27,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 9863184384. Throughput: 0: 55149.3. Samples: 353531840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:27,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 11:55:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000602001_9863184384.pth... [2024-04-28 11:55:27,229][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000601189_9849880576.pth [2024-04-28 11:55:29,360][57339] Updated weights for policy 0, policy_version 602008 (0.0026) [2024-04-28 11:55:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54340.4, 300 sec: 55316.8). Total num frames: 9863446528. Throughput: 0: 55079.9. Samples: 353859260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:32,170][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 11:55:32,290][57339] Updated weights for policy 0, policy_version 602018 (0.0029) [2024-04-28 11:55:35,189][57339] Updated weights for policy 0, policy_version 602028 (0.0034) [2024-04-28 11:55:36,121][57319] Signal inference workers to stop experience collection... (5100 times) [2024-04-28 11:55:36,152][57339] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-04-28 11:55:36,180][57319] Signal inference workers to resume experience collection... (5100 times) [2024-04-28 11:55:36,183][57339] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-04-28 11:55:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 9863741440. Throughput: 0: 55351.9. Samples: 354025960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:37,170][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 11:55:38,219][57339] Updated weights for policy 0, policy_version 602038 (0.0037) [2024-04-28 11:55:40,900][57339] Updated weights for policy 0, policy_version 602048 (0.0026) [2024-04-28 11:55:42,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 9864019968. Throughput: 0: 55323.7. Samples: 354358240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:42,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 11:55:44,000][57339] Updated weights for policy 0, policy_version 602058 (0.0029) [2024-04-28 11:55:47,163][57339] Updated weights for policy 0, policy_version 602068 (0.0026) [2024-04-28 11:55:47,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 9864282112. Throughput: 0: 55232.9. Samples: 354686740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:47,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 11:55:49,970][57339] Updated weights for policy 0, policy_version 602078 (0.0028) [2024-04-28 11:55:52,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.6, 300 sec: 55316.8). Total num frames: 9864544256. Throughput: 0: 54907.0. Samples: 354851240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:52,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 11:55:53,187][57339] Updated weights for policy 0, policy_version 602088 (0.0027) [2024-04-28 11:55:56,091][57339] Updated weights for policy 0, policy_version 602098 (0.0029) [2024-04-28 11:55:57,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54886.5, 300 sec: 55261.3). Total num frames: 9864806400. Throughput: 0: 55023.8. Samples: 355182780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:55:57,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 11:55:59,221][57339] Updated weights for policy 0, policy_version 602108 (0.0023) [2024-04-28 11:56:01,972][57339] Updated weights for policy 0, policy_version 602118 (0.0025) [2024-04-28 11:56:02,169][57108] Fps is (10 sec: 57342.5, 60 sec: 55159.3, 300 sec: 55316.8). Total num frames: 9865117696. Throughput: 0: 54962.0. Samples: 355512940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:56:02,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 11:56:05,074][57339] Updated weights for policy 0, policy_version 602128 (0.0039) [2024-04-28 11:56:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54613.6, 300 sec: 55261.3). Total num frames: 9865379840. Throughput: 0: 55231.1. Samples: 355680580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:56:07,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 11:56:08,160][57339] Updated weights for policy 0, policy_version 602138 (0.0032) [2024-04-28 11:56:10,818][57339] Updated weights for policy 0, policy_version 602148 (0.0029) [2024-04-28 11:56:12,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 9865674752. Throughput: 0: 55043.2. Samples: 356008780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 11:56:12,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:56:13,987][57339] Updated weights for policy 0, policy_version 602158 (0.0028) [2024-04-28 11:56:16,723][57339] Updated weights for policy 0, policy_version 602168 (0.0028) [2024-04-28 11:56:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55316.8). Total num frames: 9865936896. Throughput: 0: 55147.2. Samples: 356340880. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:56:17,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 11:56:19,947][57339] Updated weights for policy 0, policy_version 602178 (0.0034) [2024-04-28 11:56:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55372.4). Total num frames: 9866215424. Throughput: 0: 55227.2. Samples: 356511180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:56:22,178][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 11:56:22,686][57339] Updated weights for policy 0, policy_version 602188 (0.0034) [2024-04-28 11:56:26,064][57339] Updated weights for policy 0, policy_version 602198 (0.0025) [2024-04-28 11:56:27,169][57108] Fps is (10 sec: 54066.3, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 9866477568. Throughput: 0: 55207.3. Samples: 356842580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:56:27,178][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 11:56:28,574][57339] Updated weights for policy 0, policy_version 602208 (0.0030) [2024-04-28 11:56:29,566][57319] Signal inference workers to stop experience collection... (5150 times) [2024-04-28 11:56:29,610][57339] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-04-28 11:56:29,620][57319] Signal inference workers to resume experience collection... (5150 times) [2024-04-28 11:56:29,627][57339] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-04-28 11:56:32,121][57339] Updated weights for policy 0, policy_version 602218 (0.0032) [2024-04-28 11:56:32,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54886.3, 300 sec: 55261.3). Total num frames: 9866739712. Throughput: 0: 55300.7. Samples: 357175280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:56:32,178][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 11:56:34,737][57339] Updated weights for policy 0, policy_version 602228 (0.0028) [2024-04-28 11:56:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54613.4, 300 sec: 55150.2). Total num frames: 9867018240. Throughput: 0: 55198.0. Samples: 357335160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:56:37,178][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 11:56:37,973][57339] Updated weights for policy 0, policy_version 602238 (0.0025) [2024-04-28 11:56:40,720][57339] Updated weights for policy 0, policy_version 602248 (0.0039) [2024-04-28 11:56:42,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55159.3, 300 sec: 55316.8). Total num frames: 9867329536. Throughput: 0: 55131.4. Samples: 357663700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:56:42,170][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 11:56:43,876][57339] Updated weights for policy 0, policy_version 602258 (0.0027) [2024-04-28 11:56:46,621][57339] Updated weights for policy 0, policy_version 602268 (0.0031) [2024-04-28 11:56:47,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55159.3, 300 sec: 55316.8). Total num frames: 9867591680. Throughput: 0: 55153.3. Samples: 357994840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:56:47,178][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 11:56:49,899][57339] Updated weights for policy 0, policy_version 602278 (0.0031) [2024-04-28 11:56:52,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.4, 300 sec: 55372.3). Total num frames: 9867870208. Throughput: 0: 55257.2. Samples: 358167160. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:56:52,178][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 11:56:52,465][57339] Updated weights for policy 0, policy_version 602288 (0.0033) [2024-04-28 11:56:55,879][57339] Updated weights for policy 0, policy_version 602298 (0.0029) [2024-04-28 11:56:57,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 9868148736. Throughput: 0: 55322.2. Samples: 358498280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:56:57,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 11:56:58,504][57339] Updated weights for policy 0, policy_version 602308 (0.0037) [2024-04-28 11:57:01,919][57339] Updated weights for policy 0, policy_version 602318 (0.0026) [2024-04-28 11:57:02,169][57108] Fps is (10 sec: 50790.4, 60 sec: 54340.4, 300 sec: 55205.7). Total num frames: 9868378112. Throughput: 0: 55237.2. Samples: 358826560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:02,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 11:57:04,482][57339] Updated weights for policy 0, policy_version 602328 (0.0028) [2024-04-28 11:57:07,169][57108] Fps is (10 sec: 50790.0, 60 sec: 54613.2, 300 sec: 55150.2). Total num frames: 9868656640. Throughput: 0: 54901.7. Samples: 358981760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:07,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 11:57:07,938][57339] Updated weights for policy 0, policy_version 602338 (0.0034) [2024-04-28 11:57:10,250][57339] Updated weights for policy 0, policy_version 602348 (0.0031) [2024-04-28 11:57:12,169][57108] Fps is (10 sec: 58982.4, 60 sec: 54886.4, 300 sec: 55205.7). Total num frames: 9868967936. Throughput: 0: 54876.6. Samples: 359312020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:12,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 11:57:13,684][57339] Updated weights for policy 0, policy_version 602358 (0.0029) [2024-04-28 11:57:16,270][57339] Updated weights for policy 0, policy_version 602368 (0.0034) [2024-04-28 11:57:17,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9869246464. Throughput: 0: 54809.9. Samples: 359641720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:17,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 11:57:19,607][57339] Updated weights for policy 0, policy_version 602378 (0.0035) [2024-04-28 11:57:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 55205.8). Total num frames: 9869508608. Throughput: 0: 55139.6. Samples: 359816440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:22,170][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 11:57:22,257][57339] Updated weights for policy 0, policy_version 602388 (0.0031) [2024-04-28 11:57:25,506][57339] Updated weights for policy 0, policy_version 602398 (0.0026) [2024-04-28 11:57:27,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9869787136. Throughput: 0: 55093.3. Samples: 360142900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:27,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 11:57:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000602404_9869787136.pth... [2024-04-28 11:57:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000601597_9856565248.pth [2024-04-28 11:57:28,146][57339] Updated weights for policy 0, policy_version 602408 (0.0026) [2024-04-28 11:57:31,273][57339] Updated weights for policy 0, policy_version 602418 (0.0029) [2024-04-28 11:57:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55205.8). 
Total num frames: 9870049280. Throughput: 0: 55066.4. Samples: 360472820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:32,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 11:57:34,061][57339] Updated weights for policy 0, policy_version 602428 (0.0025) [2024-04-28 11:57:37,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.6, 300 sec: 55205.8). Total num frames: 9870327808. Throughput: 0: 54833.0. Samples: 360634640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:37,169][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 11:57:37,195][57339] Updated weights for policy 0, policy_version 602438 (0.0027) [2024-04-28 11:57:40,075][57339] Updated weights for policy 0, policy_version 602448 (0.0028) [2024-04-28 11:57:42,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54340.4, 300 sec: 55094.7). Total num frames: 9870589952. Throughput: 0: 54852.6. Samples: 360966640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:42,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 11:57:43,410][57319] Signal inference workers to stop experience collection... (5200 times) [2024-04-28 11:57:43,410][57319] Signal inference workers to resume experience collection... (5200 times) [2024-04-28 11:57:43,439][57339] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-04-28 11:57:43,439][57339] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-04-28 11:57:43,526][57339] Updated weights for policy 0, policy_version 602458 (0.0030) [2024-04-28 11:57:45,907][57339] Updated weights for policy 0, policy_version 602468 (0.0025) [2024-04-28 11:57:47,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55432.8, 300 sec: 55261.3). Total num frames: 9870917632. Throughput: 0: 54815.7. Samples: 361293260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 11:57:47,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 11:57:49,590][57339] Updated weights for policy 0, policy_version 602478 (0.0030) [2024-04-28 11:57:51,802][57339] Updated weights for policy 0, policy_version 602488 (0.0028) [2024-04-28 11:57:52,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55159.5, 300 sec: 55205.7). Total num frames: 9871179776. Throughput: 0: 55309.5. Samples: 361470680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:57:52,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 11:57:55,467][57339] Updated weights for policy 0, policy_version 602498 (0.0027) [2024-04-28 11:57:57,169][57108] Fps is (10 sec: 50790.1, 60 sec: 54613.3, 300 sec: 55150.2). Total num frames: 9871425536. Throughput: 0: 55269.4. Samples: 361799140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:57:57,169][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 11:57:57,753][57339] Updated weights for policy 0, policy_version 602508 (0.0034) [2024-04-28 11:58:01,465][57339] Updated weights for policy 0, policy_version 602518 (0.0029) [2024-04-28 11:58:02,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 9871704064. Throughput: 0: 55182.6. Samples: 362124940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:02,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 11:58:03,841][57339] Updated weights for policy 0, policy_version 602528 (0.0031) [2024-04-28 11:58:07,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.6, 300 sec: 55150.2). Total num frames: 9871966208. Throughput: 0: 54807.2. Samples: 362282760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 11:58:07,231][57339] Updated weights for policy 0, policy_version 602538 (0.0032) [2024-04-28 11:58:09,671][57339] Updated weights for policy 0, policy_version 602548 (0.0030) [2024-04-28 11:58:12,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54613.3, 300 sec: 55094.7). Total num frames: 9872244736. Throughput: 0: 55077.0. Samples: 362621360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:12,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 11:58:13,108][57339] Updated weights for policy 0, policy_version 602558 (0.0029) [2024-04-28 11:58:15,651][57339] Updated weights for policy 0, policy_version 602568 (0.0031) [2024-04-28 11:58:17,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55159.4, 300 sec: 55150.2). Total num frames: 9872556032. Throughput: 0: 55123.5. Samples: 362953380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:17,170][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 11:58:19,028][57339] Updated weights for policy 0, policy_version 602578 (0.0036) [2024-04-28 11:58:21,429][57339] Updated weights for policy 0, policy_version 602588 (0.0037) [2024-04-28 11:58:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 9872818176. Throughput: 0: 55223.6. Samples: 363119700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:22,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 11:58:24,917][57339] Updated weights for policy 0, policy_version 602598 (0.0027) [2024-04-28 11:58:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55205.7). Total num frames: 9873096704. Throughput: 0: 55139.4. Samples: 363447920. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:27,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 11:58:27,312][57339] Updated weights for policy 0, policy_version 602608 (0.0031) [2024-04-28 11:58:30,713][57339] Updated weights for policy 0, policy_version 602618 (0.0031) [2024-04-28 11:58:32,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55432.4, 300 sec: 55205.7). Total num frames: 9873375232. Throughput: 0: 55267.3. Samples: 363780300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:32,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 11:58:33,278][57339] Updated weights for policy 0, policy_version 602628 (0.0029) [2024-04-28 11:58:36,611][57339] Updated weights for policy 0, policy_version 602638 (0.0028) [2024-04-28 11:58:37,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9873637376. Throughput: 0: 54923.0. Samples: 363942220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:37,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 11:58:38,535][57319] Signal inference workers to stop experience collection... (5250 times) [2024-04-28 11:58:38,536][57319] Signal inference workers to resume experience collection... (5250 times) [2024-04-28 11:58:38,545][57339] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-04-28 11:58:38,556][57339] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-04-28 11:58:39,227][57339] Updated weights for policy 0, policy_version 602648 (0.0031) [2024-04-28 11:58:42,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.4, 300 sec: 55261.3). Total num frames: 9873915904. Throughput: 0: 55046.2. Samples: 364276220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 11:58:42,640][57339] Updated weights for policy 0, policy_version 602658 (0.0030) [2024-04-28 11:58:44,989][57339] Updated weights for policy 0, policy_version 602668 (0.0028) [2024-04-28 11:58:47,169][57108] Fps is (10 sec: 55706.2, 60 sec: 54613.3, 300 sec: 55150.2). Total num frames: 9874194432. Throughput: 0: 55308.5. Samples: 364613820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:47,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 11:58:48,339][57339] Updated weights for policy 0, policy_version 602678 (0.0032) [2024-04-28 11:58:50,941][57339] Updated weights for policy 0, policy_version 602688 (0.0029) [2024-04-28 11:58:52,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55150.2). Total num frames: 9874489344. Throughput: 0: 55669.2. Samples: 364787880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:52,170][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 11:58:54,169][57339] Updated weights for policy 0, policy_version 602698 (0.0032) [2024-04-28 11:58:56,907][57339] Updated weights for policy 0, policy_version 602708 (0.0029) [2024-04-28 11:58:57,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55205.8). Total num frames: 9874767872. Throughput: 0: 55573.7. Samples: 365122180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:58:57,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 11:59:00,000][57339] Updated weights for policy 0, policy_version 602718 (0.0025) [2024-04-28 11:59:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55205.7). Total num frames: 9875046400. Throughput: 0: 55515.1. Samples: 365451560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:02,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 11:59:02,719][57339] Updated weights for policy 0, policy_version 602728 (0.0033) [2024-04-28 11:59:06,012][57339] Updated weights for policy 0, policy_version 602738 (0.0029) [2024-04-28 11:59:07,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.3, 300 sec: 55205.7). Total num frames: 9875308544. Throughput: 0: 55539.7. Samples: 365619000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:07,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 11:59:08,567][57339] Updated weights for policy 0, policy_version 602748 (0.0027) [2024-04-28 11:59:11,809][57339] Updated weights for policy 0, policy_version 602758 (0.0030) [2024-04-28 11:59:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55316.8). Total num frames: 9875603456. Throughput: 0: 55726.2. Samples: 365955600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:12,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 11:59:14,454][57339] Updated weights for policy 0, policy_version 602768 (0.0031) [2024-04-28 11:59:17,169][57108] Fps is (10 sec: 54068.3, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9875849216. Throughput: 0: 55643.7. Samples: 366284260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:17,178][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 11:59:17,614][57339] Updated weights for policy 0, policy_version 602778 (0.0030) [2024-04-28 11:59:20,420][57339] Updated weights for policy 0, policy_version 602788 (0.0030) [2024-04-28 11:59:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.4, 300 sec: 55150.2). Total num frames: 9876144128. Throughput: 0: 55711.1. Samples: 366449220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:22,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 11:59:23,481][57339] Updated weights for policy 0, policy_version 602798 (0.0034) [2024-04-28 11:59:26,487][57339] Updated weights for policy 0, policy_version 602808 (0.0031) [2024-04-28 11:59:27,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.6, 300 sec: 55094.7). Total num frames: 9876439040. Throughput: 0: 55721.4. Samples: 366783680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:27,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 11:59:27,252][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000602811_9876455424.pth... [2024-04-28 11:59:27,300][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000602001_9863184384.pth [2024-04-28 11:59:29,357][57339] Updated weights for policy 0, policy_version 602818 (0.0034) [2024-04-28 11:59:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55205.7). Total num frames: 9876717568. Throughput: 0: 55568.3. Samples: 367114400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:32,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 11:59:32,303][57339] Updated weights for policy 0, policy_version 602828 (0.0027) [2024-04-28 11:59:35,095][57319] Signal inference workers to stop experience collection... (5300 times) [2024-04-28 11:59:35,095][57319] Signal inference workers to resume experience collection... (5300 times) [2024-04-28 11:59:35,109][57339] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-04-28 11:59:35,127][57339] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-04-28 11:59:35,209][57339] Updated weights for policy 0, policy_version 602838 (0.0030) [2024-04-28 11:59:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55261.3). Total num frames: 9876979712. Throughput: 0: 55516.9. Samples: 367286140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:37,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 11:59:38,035][57339] Updated weights for policy 0, policy_version 602848 (0.0030) [2024-04-28 11:59:41,075][57339] Updated weights for policy 0, policy_version 602858 (0.0028) [2024-04-28 11:59:42,169][57108] Fps is (10 sec: 52429.9, 60 sec: 55432.7, 300 sec: 55205.8). Total num frames: 9877241856. Throughput: 0: 55470.5. Samples: 367618340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:42,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 11:59:43,975][57339] Updated weights for policy 0, policy_version 602868 (0.0032) [2024-04-28 11:59:47,079][57339] Updated weights for policy 0, policy_version 602878 (0.0028) [2024-04-28 11:59:47,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55316.8). Total num frames: 9877553152. Throughput: 0: 55548.1. Samples: 367951220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:47,170][57108] Avg episode reward: [(0, '0.500')] [2024-04-28 11:59:50,028][57339] Updated weights for policy 0, policy_version 602888 (0.0028) [2024-04-28 11:59:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.6, 300 sec: 55205.8). Total num frames: 9877798912. Throughput: 0: 55264.3. Samples: 368105880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:52,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 11:59:53,052][57339] Updated weights for policy 0, policy_version 602898 (0.0037) [2024-04-28 11:59:55,806][57339] Updated weights for policy 0, policy_version 602908 (0.0028) [2024-04-28 11:59:57,169][57108] Fps is (10 sec: 52429.7, 60 sec: 55159.7, 300 sec: 55150.2). Total num frames: 9878077440. Throughput: 0: 55174.4. Samples: 368438440. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 11:59:57,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 11:59:59,091][57339] Updated weights for policy 0, policy_version 602918 (0.0032) [2024-04-28 12:00:01,598][57339] Updated weights for policy 0, policy_version 602928 (0.0032) [2024-04-28 12:00:02,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55432.6, 300 sec: 55150.2). Total num frames: 9878372352. Throughput: 0: 55135.9. Samples: 368765380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:02,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:00:05,243][57339] Updated weights for policy 0, policy_version 602938 (0.0033) [2024-04-28 12:00:07,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55705.8, 300 sec: 55316.8). Total num frames: 9878650880. Throughput: 0: 55279.6. Samples: 368936800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:07,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 12:00:07,746][57339] Updated weights for policy 0, policy_version 602948 (0.0034) [2024-04-28 12:00:11,144][57339] Updated weights for policy 0, policy_version 602958 (0.0028) [2024-04-28 12:00:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55261.3). Total num frames: 9878913024. Throughput: 0: 55163.5. Samples: 369266040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:12,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 12:00:13,866][57339] Updated weights for policy 0, policy_version 602968 (0.0037) [2024-04-28 12:00:16,989][57339] Updated weights for policy 0, policy_version 602978 (0.0034) [2024-04-28 12:00:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 9879191552. Throughput: 0: 55235.2. Samples: 369599980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:17,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 12:00:19,827][57339] Updated weights for policy 0, policy_version 602988 (0.0032) [2024-04-28 12:00:22,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55205.8). Total num frames: 9879470080. Throughput: 0: 55053.8. Samples: 369763560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:22,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 12:00:22,800][57339] Updated weights for policy 0, policy_version 602998 (0.0027) [2024-04-28 12:00:25,706][57339] Updated weights for policy 0, policy_version 603008 (0.0038) [2024-04-28 12:00:27,169][57108] Fps is (10 sec: 54067.8, 60 sec: 54886.5, 300 sec: 55205.8). Total num frames: 9879732224. Throughput: 0: 55102.7. Samples: 370097960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:27,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 12:00:28,829][57339] Updated weights for policy 0, policy_version 603018 (0.0028) [2024-04-28 12:00:31,510][57339] Updated weights for policy 0, policy_version 603028 (0.0028) [2024-04-28 12:00:32,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 9880010752. Throughput: 0: 54915.5. Samples: 370422420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:32,170][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 12:00:34,861][57339] Updated weights for policy 0, policy_version 603038 (0.0026) [2024-04-28 12:00:37,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.7, 300 sec: 55261.3). Total num frames: 9880322048. Throughput: 0: 55338.3. Samples: 370596100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:37,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 12:00:37,473][57339] Updated weights for policy 0, policy_version 603048 (0.0032) [2024-04-28 12:00:40,105][57319] Signal inference workers to stop experience collection... (5350 times) [2024-04-28 12:00:40,111][57319] Signal inference workers to resume experience collection... (5350 times) [2024-04-28 12:00:40,133][57339] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-04-28 12:00:40,134][57339] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-04-28 12:00:40,746][57339] Updated weights for policy 0, policy_version 603058 (0.0028) [2024-04-28 12:00:42,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.5, 300 sec: 55261.3). Total num frames: 9880584192. Throughput: 0: 55350.1. Samples: 370929200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:42,178][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 12:00:43,371][57339] Updated weights for policy 0, policy_version 603068 (0.0030) [2024-04-28 12:00:46,623][57339] Updated weights for policy 0, policy_version 603078 (0.0028) [2024-04-28 12:00:47,169][57108] Fps is (10 sec: 50790.0, 60 sec: 54613.4, 300 sec: 55205.7). Total num frames: 9880829952. Throughput: 0: 55380.5. Samples: 371257500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:47,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 12:00:49,555][57339] Updated weights for policy 0, policy_version 603088 (0.0040) [2024-04-28 12:00:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55316.8). Total num frames: 9881124864. Throughput: 0: 55192.1. Samples: 371420440. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:00:52,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 12:00:52,577][57339] Updated weights for policy 0, policy_version 603098 (0.0025) [2024-04-28 12:00:55,654][57339] Updated weights for policy 0, policy_version 603108 (0.0028) [2024-04-28 12:00:57,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55432.3, 300 sec: 55205.7). Total num frames: 9881403392. Throughput: 0: 55396.4. Samples: 371758880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:00:57,170][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 12:00:58,573][57339] Updated weights for policy 0, policy_version 603118 (0.0025) [2024-04-28 12:01:01,958][57339] Updated weights for policy 0, policy_version 603128 (0.0026) [2024-04-28 12:01:02,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54613.4, 300 sec: 55150.2). Total num frames: 9881649152. Throughput: 0: 55423.7. Samples: 372094040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:02,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 12:01:04,424][57339] Updated weights for policy 0, policy_version 603138 (0.0035) [2024-04-28 12:01:07,169][57108] Fps is (10 sec: 54068.1, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9881944064. Throughput: 0: 55281.4. Samples: 372251220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:07,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 12:01:07,738][57339] Updated weights for policy 0, policy_version 603148 (0.0027) [2024-04-28 12:01:10,506][57339] Updated weights for policy 0, policy_version 603158 (0.0031) [2024-04-28 12:01:12,169][57108] Fps is (10 sec: 60620.3, 60 sec: 55705.7, 300 sec: 55316.8). Total num frames: 9882255360. Throughput: 0: 55242.1. Samples: 372583860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:12,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 12:01:13,634][57339] Updated weights for policy 0, policy_version 603168 (0.0029) [2024-04-28 12:01:16,369][57339] Updated weights for policy 0, policy_version 603178 (0.0024) [2024-04-28 12:01:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.7, 300 sec: 55261.3). Total num frames: 9882517504. Throughput: 0: 55407.8. Samples: 372915760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:17,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 12:01:19,533][57339] Updated weights for policy 0, policy_version 603188 (0.0038) [2024-04-28 12:01:22,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9882779648. Throughput: 0: 55326.1. Samples: 373085780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:22,170][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 12:01:22,340][57339] Updated weights for policy 0, policy_version 603198 (0.0031) [2024-04-28 12:01:25,254][57339] Updated weights for policy 0, policy_version 603208 (0.0027) [2024-04-28 12:01:27,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55316.9). Total num frames: 9883058176. Throughput: 0: 55338.3. Samples: 373419420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:27,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 12:01:27,214][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000603215_9883074560.pth... [2024-04-28 12:01:27,254][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000602404_9869787136.pth [2024-04-28 12:01:28,178][57339] Updated weights for policy 0, policy_version 603218 (0.0032) [2024-04-28 12:01:31,038][57339] Updated weights for policy 0, policy_version 603228 (0.0027) [2024-04-28 12:01:32,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54886.5, 300 sec: 55205.8). 
Total num frames: 9883303936. Throughput: 0: 55447.5. Samples: 373752640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:32,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 12:01:34,210][57339] Updated weights for policy 0, policy_version 603238 (0.0029) [2024-04-28 12:01:37,052][57339] Updated weights for policy 0, policy_version 603248 (0.0029) [2024-04-28 12:01:37,169][57108] Fps is (10 sec: 55704.8, 60 sec: 54886.2, 300 sec: 55205.7). Total num frames: 9883615232. Throughput: 0: 55376.7. Samples: 373912400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:37,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 12:01:40,049][57339] Updated weights for policy 0, policy_version 603258 (0.0030) [2024-04-28 12:01:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 54886.5, 300 sec: 55205.8). Total num frames: 9883877376. Throughput: 0: 55119.8. Samples: 374239260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:42,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 12:01:43,321][57339] Updated weights for policy 0, policy_version 603268 (0.0029) [2024-04-28 12:01:44,626][57319] Signal inference workers to stop experience collection... (5400 times) [2024-04-28 12:01:44,627][57319] Signal inference workers to resume experience collection... (5400 times) [2024-04-28 12:01:44,646][57339] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-04-28 12:01:44,665][57339] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-04-28 12:01:45,807][57339] Updated weights for policy 0, policy_version 603278 (0.0027) [2024-04-28 12:01:47,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55316.8). Total num frames: 9884188672. Throughput: 0: 55024.8. Samples: 374570160. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:47,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 12:01:49,367][57339] Updated weights for policy 0, policy_version 603288 (0.0024) [2024-04-28 12:01:51,809][57339] Updated weights for policy 0, policy_version 603298 (0.0027) [2024-04-28 12:01:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55261.3). Total num frames: 9884450816. Throughput: 0: 55446.7. Samples: 374746320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:52,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 12:01:55,297][57339] Updated weights for policy 0, policy_version 603308 (0.0033) [2024-04-28 12:01:57,169][57108] Fps is (10 sec: 50790.9, 60 sec: 54886.6, 300 sec: 55316.9). Total num frames: 9884696576. Throughput: 0: 55479.3. Samples: 375080420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:01:57,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 12:01:57,760][57339] Updated weights for policy 0, policy_version 603318 (0.0025) [2024-04-28 12:02:01,161][57339] Updated weights for policy 0, policy_version 603328 (0.0025) [2024-04-28 12:02:02,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55316.9). Total num frames: 9884975104. Throughput: 0: 55473.7. Samples: 375412080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:02:02,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 12:02:03,648][57339] Updated weights for policy 0, policy_version 603338 (0.0032) [2024-04-28 12:02:07,085][57339] Updated weights for policy 0, policy_version 603348 (0.0036) [2024-04-28 12:02:07,169][57108] Fps is (10 sec: 55704.3, 60 sec: 55159.3, 300 sec: 55205.7). Total num frames: 9885253632. Throughput: 0: 55163.8. Samples: 375568160. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:02:07,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 12:02:09,442][57339] Updated weights for policy 0, policy_version 603358 (0.0029) [2024-04-28 12:02:12,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54886.4, 300 sec: 55261.3). Total num frames: 9885548544. Throughput: 0: 54991.5. Samples: 375894040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:02:12,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 12:02:12,863][57339] Updated weights for policy 0, policy_version 603368 (0.0026) [2024-04-28 12:02:15,302][57339] Updated weights for policy 0, policy_version 603378 (0.0032) [2024-04-28 12:02:17,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55432.4, 300 sec: 55372.4). Total num frames: 9885843456. Throughput: 0: 55057.7. Samples: 376230240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:02:17,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 12:02:18,709][57339] Updated weights for policy 0, policy_version 603388 (0.0026) [2024-04-28 12:02:21,183][57339] Updated weights for policy 0, policy_version 603398 (0.0029) [2024-04-28 12:02:22,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 9886121984. Throughput: 0: 55390.7. Samples: 376404980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:02:22,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 12:02:24,914][57339] Updated weights for policy 0, policy_version 603408 (0.0024) [2024-04-28 12:02:27,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9886384128. Throughput: 0: 55425.8. Samples: 376733420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-04-28 12:02:27,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 12:02:27,177][57339] Updated weights for policy 0, policy_version 603418 (0.0033) [2024-04-28 12:02:30,913][57339] Updated weights for policy 0, policy_version 603428 (0.0026) [2024-04-28 12:02:32,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 9886646272. Throughput: 0: 55624.9. Samples: 377073280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:02:32,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 12:02:32,932][57339] Updated weights for policy 0, policy_version 603438 (0.0032) [2024-04-28 12:02:36,897][57339] Updated weights for policy 0, policy_version 603448 (0.0037) [2024-04-28 12:02:37,169][57108] Fps is (10 sec: 52427.4, 60 sec: 54886.3, 300 sec: 55316.8). Total num frames: 9886908416. Throughput: 0: 55110.4. Samples: 377226300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:02:37,170][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 12:02:38,937][57339] Updated weights for policy 0, policy_version 603458 (0.0032) [2024-04-28 12:02:40,089][57319] Signal inference workers to stop experience collection... (5450 times) [2024-04-28 12:02:40,094][57319] Signal inference workers to resume experience collection... (5450 times) [2024-04-28 12:02:40,104][57339] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-04-28 12:02:40,122][57339] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-04-28 12:02:42,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55159.3, 300 sec: 55150.2). Total num frames: 9887186944. Throughput: 0: 55077.9. Samples: 377558940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:02:42,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 12:02:42,765][57339] Updated weights for policy 0, policy_version 603468 (0.0029) [2024-04-28 12:02:44,920][57339] Updated weights for policy 0, policy_version 603478 (0.0029) [2024-04-28 12:02:47,169][57108] Fps is (10 sec: 57344.5, 60 sec: 54886.3, 300 sec: 55261.3). Total num frames: 9887481856. Throughput: 0: 55010.1. Samples: 377887540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:02:47,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 12:02:48,667][57339] Updated weights for policy 0, policy_version 603488 (0.0033) [2024-04-28 12:02:50,936][57339] Updated weights for policy 0, policy_version 603498 (0.0035) [2024-04-28 12:02:52,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55432.3, 300 sec: 55427.9). Total num frames: 9887776768. Throughput: 0: 55487.1. Samples: 378065080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:02:52,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 12:02:54,654][57339] Updated weights for policy 0, policy_version 603508 (0.0024) [2024-04-28 12:02:56,937][57339] Updated weights for policy 0, policy_version 603518 (0.0030) [2024-04-28 12:02:57,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55978.7, 300 sec: 55427.9). Total num frames: 9888055296. Throughput: 0: 55507.7. Samples: 378391880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:02:57,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:03:00,642][57339] Updated weights for policy 0, policy_version 603528 (0.0036) [2024-04-28 12:03:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 9888317440. Throughput: 0: 55308.8. Samples: 378719140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:02,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 12:03:02,936][57339] Updated weights for policy 0, policy_version 603538 (0.0036) [2024-04-28 12:03:06,527][57339] Updated weights for policy 0, policy_version 603548 (0.0026) [2024-04-28 12:03:07,169][57108] Fps is (10 sec: 49151.2, 60 sec: 54886.5, 300 sec: 55261.3). Total num frames: 9888546816. Throughput: 0: 54968.9. Samples: 378878580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:07,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 12:03:08,856][57339] Updated weights for policy 0, policy_version 603558 (0.0031) [2024-04-28 12:03:12,169][57108] Fps is (10 sec: 50790.6, 60 sec: 54613.3, 300 sec: 55150.2). Total num frames: 9888825344. Throughput: 0: 55084.7. Samples: 379212240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:12,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 12:03:12,545][57339] Updated weights for policy 0, policy_version 603568 (0.0025) [2024-04-28 12:03:14,795][57339] Updated weights for policy 0, policy_version 603578 (0.0036) [2024-04-28 12:03:17,169][57108] Fps is (10 sec: 57344.7, 60 sec: 54613.4, 300 sec: 55261.3). Total num frames: 9889120256. Throughput: 0: 54805.0. Samples: 379539500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:17,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 12:03:18,480][57339] Updated weights for policy 0, policy_version 603588 (0.0030) [2024-04-28 12:03:20,564][57339] Updated weights for policy 0, policy_version 603598 (0.0028) [2024-04-28 12:03:22,169][57108] Fps is (10 sec: 58982.6, 60 sec: 54886.5, 300 sec: 55316.8). Total num frames: 9889415168. Throughput: 0: 55111.8. Samples: 379706320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:22,169][57108] Avg episode reward: [(0, '0.515')] [2024-04-28 12:03:24,658][57339] Updated weights for policy 0, policy_version 603608 (0.0027) [2024-04-28 12:03:26,792][57339] Updated weights for policy 0, policy_version 603618 (0.0026) [2024-04-28 12:03:27,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55159.3, 300 sec: 55316.8). Total num frames: 9889693696. Throughput: 0: 55087.2. Samples: 380037860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:27,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 12:03:27,225][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000603620_9889710080.pth... [2024-04-28 12:03:27,267][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000602811_9876455424.pth [2024-04-28 12:03:30,463][57339] Updated weights for policy 0, policy_version 603628 (0.0030) [2024-04-28 12:03:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9889972224. Throughput: 0: 55050.8. Samples: 380364820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:32,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 12:03:32,862][57339] Updated weights for policy 0, policy_version 603638 (0.0028) [2024-04-28 12:03:36,234][57339] Updated weights for policy 0, policy_version 603648 (0.0033) [2024-04-28 12:03:37,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55159.7, 300 sec: 55261.3). Total num frames: 9890217984. Throughput: 0: 54722.5. Samples: 380527580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:37,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 12:03:38,634][57339] Updated weights for policy 0, policy_version 603658 (0.0027) [2024-04-28 12:03:42,169][57108] Fps is (10 sec: 50790.5, 60 sec: 54886.6, 300 sec: 55205.8). Total num frames: 9890480128. Throughput: 0: 54891.5. Samples: 380862000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:42,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 12:03:42,243][57339] Updated weights for policy 0, policy_version 603668 (0.0028) [2024-04-28 12:03:43,729][57319] Signal inference workers to stop experience collection... (5500 times) [2024-04-28 12:03:43,730][57319] Signal inference workers to resume experience collection... (5500 times) [2024-04-28 12:03:43,754][57339] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-04-28 12:03:43,754][57339] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-04-28 12:03:44,449][57339] Updated weights for policy 0, policy_version 603678 (0.0036) [2024-04-28 12:03:47,169][57108] Fps is (10 sec: 55704.8, 60 sec: 54886.4, 300 sec: 55205.7). Total num frames: 9890775040. Throughput: 0: 54951.5. Samples: 381191960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:47,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 12:03:48,023][57339] Updated weights for policy 0, policy_version 603688 (0.0036) [2024-04-28 12:03:50,486][57339] Updated weights for policy 0, policy_version 603698 (0.0024) [2024-04-28 12:03:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54340.5, 300 sec: 55150.2). Total num frames: 9891037184. Throughput: 0: 55044.6. Samples: 381355580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:52,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 12:03:54,026][57339] Updated weights for policy 0, policy_version 603708 (0.0029) [2024-04-28 12:03:56,564][57339] Updated weights for policy 0, policy_version 603718 (0.0032) [2024-04-28 12:03:57,169][57108] Fps is (10 sec: 57344.6, 60 sec: 54886.3, 300 sec: 55261.3). Total num frames: 9891348480. Throughput: 0: 54984.5. Samples: 381686540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:03:57,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 12:04:00,037][57339] Updated weights for policy 0, policy_version 603728 (0.0033) [2024-04-28 12:04:02,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55159.7, 300 sec: 55316.9). Total num frames: 9891627008. Throughput: 0: 54997.0. Samples: 382014360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:04:02,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 12:04:02,363][57339] Updated weights for policy 0, policy_version 603738 (0.0029) [2024-04-28 12:04:06,006][57339] Updated weights for policy 0, policy_version 603748 (0.0031) [2024-04-28 12:04:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55261.3). Total num frames: 9891905536. Throughput: 0: 55038.3. Samples: 382183040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:07,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 12:04:08,550][57339] Updated weights for policy 0, policy_version 603758 (0.0034) [2024-04-28 12:04:11,963][57339] Updated weights for policy 0, policy_version 603768 (0.0028) [2024-04-28 12:04:12,169][57108] Fps is (10 sec: 50789.9, 60 sec: 55159.5, 300 sec: 55205.8). Total num frames: 9892134912. Throughput: 0: 54923.3. Samples: 382509400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:12,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:04:14,560][57339] Updated weights for policy 0, policy_version 603778 (0.0031) [2024-04-28 12:04:17,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55159.4, 300 sec: 55205.8). Total num frames: 9892429824. Throughput: 0: 54995.9. Samples: 382839640. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:17,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 12:04:17,748][57339] Updated weights for policy 0, policy_version 603788 (0.0027) [2024-04-28 12:04:20,558][57339] Updated weights for policy 0, policy_version 603798 (0.0028) [2024-04-28 12:04:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 9892708352. Throughput: 0: 54920.8. Samples: 382999020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:22,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:04:23,648][57339] Updated weights for policy 0, policy_version 603808 (0.0028) [2024-04-28 12:04:26,485][57339] Updated weights for policy 0, policy_version 603818 (0.0032) [2024-04-28 12:04:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 9892986880. Throughput: 0: 54886.5. Samples: 383331900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:27,170][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 12:04:29,586][57339] Updated weights for policy 0, policy_version 603828 (0.0028) [2024-04-28 12:04:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.4, 300 sec: 55205.8). Total num frames: 9893265408. Throughput: 0: 54921.9. Samples: 383663440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:32,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 12:04:32,312][57339] Updated weights for policy 0, policy_version 603838 (0.0025) [2024-04-28 12:04:35,704][57339] Updated weights for policy 0, policy_version 603848 (0.0030) [2024-04-28 12:04:37,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55316.8). Total num frames: 9893560320. Throughput: 0: 55277.2. Samples: 383843060. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:37,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:04:38,271][57339] Updated weights for policy 0, policy_version 603858 (0.0026) [2024-04-28 12:04:41,535][57339] Updated weights for policy 0, policy_version 603868 (0.0031) [2024-04-28 12:04:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55094.7). Total num frames: 9893806080. Throughput: 0: 55284.9. Samples: 384174360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:42,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 12:04:42,434][57319] Signal inference workers to stop experience collection... (5550 times) [2024-04-28 12:04:42,434][57319] Signal inference workers to resume experience collection... (5550 times) [2024-04-28 12:04:42,459][57339] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-04-28 12:04:42,459][57339] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-04-28 12:04:44,240][57339] Updated weights for policy 0, policy_version 603878 (0.0036) [2024-04-28 12:04:47,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55205.7). Total num frames: 9894084608. Throughput: 0: 55294.9. Samples: 384502640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:47,170][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 12:04:47,396][57339] Updated weights for policy 0, policy_version 603888 (0.0033) [2024-04-28 12:04:50,154][57339] Updated weights for policy 0, policy_version 603898 (0.0031) [2024-04-28 12:04:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55150.2). Total num frames: 9894346752. Throughput: 0: 55042.1. Samples: 384659940. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:52,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 12:04:53,413][57339] Updated weights for policy 0, policy_version 603908 (0.0029) [2024-04-28 12:04:56,264][57339] Updated weights for policy 0, policy_version 603918 (0.0029) [2024-04-28 12:04:57,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54886.4, 300 sec: 55150.2). Total num frames: 9894641664. Throughput: 0: 55110.7. Samples: 384989380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:04:57,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 12:04:59,405][57339] Updated weights for policy 0, policy_version 603928 (0.0030) [2024-04-28 12:05:02,117][57339] Updated weights for policy 0, policy_version 603938 (0.0027) [2024-04-28 12:05:02,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54886.2, 300 sec: 55150.2). Total num frames: 9894920192. Throughput: 0: 55146.5. Samples: 385321240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:05:02,170][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 12:05:05,188][57339] Updated weights for policy 0, policy_version 603948 (0.0031) [2024-04-28 12:05:07,169][57108] Fps is (10 sec: 55704.1, 60 sec: 54886.1, 300 sec: 55205.7). Total num frames: 9895198720. Throughput: 0: 55373.0. Samples: 385490820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:05:07,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 12:05:08,206][57339] Updated weights for policy 0, policy_version 603958 (0.0029) [2024-04-28 12:05:11,043][57339] Updated weights for policy 0, policy_version 603968 (0.0031) [2024-04-28 12:05:12,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55205.8). Total num frames: 9895477248. Throughput: 0: 55268.6. Samples: 385818980. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:05:12,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 12:05:14,075][57339] Updated weights for policy 0, policy_version 603978 (0.0027) [2024-04-28 12:05:17,103][57339] Updated weights for policy 0, policy_version 603988 (0.0030) [2024-04-28 12:05:17,169][57108] Fps is (10 sec: 54068.4, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9895739392. Throughput: 0: 55333.3. Samples: 386153440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:05:17,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 12:05:20,063][57339] Updated weights for policy 0, policy_version 603998 (0.0028) [2024-04-28 12:05:22,169][57108] Fps is (10 sec: 52427.9, 60 sec: 54886.3, 300 sec: 55150.2). Total num frames: 9896001536. Throughput: 0: 54935.5. Samples: 386315160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:05:22,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 12:05:23,006][57339] Updated weights for policy 0, policy_version 604008 (0.0030) [2024-04-28 12:05:25,977][57339] Updated weights for policy 0, policy_version 604018 (0.0024) [2024-04-28 12:05:27,169][57108] Fps is (10 sec: 52429.3, 60 sec: 54613.5, 300 sec: 55094.7). Total num frames: 9896263680. Throughput: 0: 54859.2. Samples: 386643020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:05:27,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 12:05:27,298][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000604022_9896296448.pth... [2024-04-28 12:05:27,348][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000603215_9883074560.pth [2024-04-28 12:05:29,046][57339] Updated weights for policy 0, policy_version 604028 (0.0029) [2024-04-28 12:05:31,775][57339] Updated weights for policy 0, policy_version 604038 (0.0031) [2024-04-28 12:05:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54886.4, 300 sec: 55039.1). 
Total num frames: 9896558592. Throughput: 0: 54941.3. Samples: 386975000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:05:32,170][57108] Avg episode reward: [(0, '0.705')] [2024-04-28 12:05:34,807][57339] Updated weights for policy 0, policy_version 604048 (0.0026) [2024-04-28 12:05:37,169][57108] Fps is (10 sec: 58982.3, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9896853504. Throughput: 0: 55245.0. Samples: 387145960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-04-28 12:05:37,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 12:05:37,790][57339] Updated weights for policy 0, policy_version 604058 (0.0028) [2024-04-28 12:05:40,841][57339] Updated weights for policy 0, policy_version 604068 (0.0030) [2024-04-28 12:05:42,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 55261.3). Total num frames: 9897132032. Throughput: 0: 55264.0. Samples: 387476260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:05:42,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 12:05:43,658][57339] Updated weights for policy 0, policy_version 604078 (0.0039) [2024-04-28 12:05:46,718][57339] Updated weights for policy 0, policy_version 604088 (0.0023) [2024-04-28 12:05:47,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9897394176. Throughput: 0: 55188.6. Samples: 387804720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:05:47,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 12:05:49,702][57339] Updated weights for policy 0, policy_version 604098 (0.0028) [2024-04-28 12:05:52,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.6, 300 sec: 55094.7). Total num frames: 9897656320. Throughput: 0: 55161.7. Samples: 387973080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:05:52,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 12:05:52,592][57339] Updated weights for policy 0, policy_version 604108 (0.0040) [2024-04-28 12:05:55,729][57339] Updated weights for policy 0, policy_version 604118 (0.0029) [2024-04-28 12:05:56,792][57319] Signal inference workers to stop experience collection... (5600 times) [2024-04-28 12:05:56,798][57319] Signal inference workers to resume experience collection... (5600 times) [2024-04-28 12:05:56,824][57339] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-04-28 12:05:56,824][57339] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-04-28 12:05:57,169][57108] Fps is (10 sec: 54068.1, 60 sec: 54886.5, 300 sec: 55205.8). Total num frames: 9897934848. Throughput: 0: 55198.8. Samples: 388302920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:05:57,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 12:05:58,591][57339] Updated weights for policy 0, policy_version 604128 (0.0031) [2024-04-28 12:06:01,783][57339] Updated weights for policy 0, policy_version 604138 (0.0029) [2024-04-28 12:06:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9898213376. Throughput: 0: 55132.0. Samples: 388634380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:02,178][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 12:06:04,370][57339] Updated weights for policy 0, policy_version 604148 (0.0026) [2024-04-28 12:06:07,169][57108] Fps is (10 sec: 55704.5, 60 sec: 54886.6, 300 sec: 55039.1). Total num frames: 9898491904. Throughput: 0: 55081.0. Samples: 388793800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:07,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 12:06:07,532][57339] Updated weights for policy 0, policy_version 604158 (0.0028) [2024-04-28 12:06:10,344][57339] Updated weights for policy 0, policy_version 604168 (0.0023) [2024-04-28 12:06:12,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.4, 300 sec: 55150.2). Total num frames: 9898786816. Throughput: 0: 55204.8. Samples: 389127240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:12,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 12:06:13,479][57339] Updated weights for policy 0, policy_version 604178 (0.0026) [2024-04-28 12:06:16,230][57339] Updated weights for policy 0, policy_version 604188 (0.0028) [2024-04-28 12:06:17,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55205.7). Total num frames: 9899065344. Throughput: 0: 55304.1. Samples: 389463680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:17,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 12:06:19,479][57339] Updated weights for policy 0, policy_version 604198 (0.0027) [2024-04-28 12:06:22,169][57339] Updated weights for policy 0, policy_version 604208 (0.0031) [2024-04-28 12:06:22,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55205.7). Total num frames: 9899343872. Throughput: 0: 55167.1. Samples: 389628480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:22,170][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 12:06:25,347][57339] Updated weights for policy 0, policy_version 604218 (0.0029) [2024-04-28 12:06:27,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.5, 300 sec: 55205.8). Total num frames: 9899589632. Throughput: 0: 55239.5. Samples: 389962040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:27,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 12:06:28,081][57339] Updated weights for policy 0, policy_version 604228 (0.0026) [2024-04-28 12:06:31,136][57339] Updated weights for policy 0, policy_version 604238 (0.0030) [2024-04-28 12:06:32,169][57108] Fps is (10 sec: 50790.5, 60 sec: 54886.5, 300 sec: 55039.2). Total num frames: 9899851776. Throughput: 0: 55340.5. Samples: 390295040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:32,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 12:06:33,956][57339] Updated weights for policy 0, policy_version 604248 (0.0034) [2024-04-28 12:06:37,145][57339] Updated weights for policy 0, policy_version 604258 (0.0034) [2024-04-28 12:06:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.4, 300 sec: 55205.7). Total num frames: 9900163072. Throughput: 0: 54965.7. Samples: 390446540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:37,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 12:06:39,976][57339] Updated weights for policy 0, policy_version 604268 (0.0025) [2024-04-28 12:06:42,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55159.4, 300 sec: 55094.7). Total num frames: 9900441600. Throughput: 0: 54995.3. Samples: 390777720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:42,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 12:06:43,143][57339] Updated weights for policy 0, policy_version 604278 (0.0029) [2024-04-28 12:06:45,871][57339] Updated weights for policy 0, policy_version 604288 (0.0030) [2024-04-28 12:06:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55150.2). Total num frames: 9900720128. Throughput: 0: 55124.8. Samples: 391115000. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:47,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:06:48,969][57339] Updated weights for policy 0, policy_version 604298 (0.0026) [2024-04-28 12:06:51,732][57339] Updated weights for policy 0, policy_version 604308 (0.0036) [2024-04-28 12:06:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55261.3). Total num frames: 9900998656. Throughput: 0: 55525.8. Samples: 391292460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:52,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 12:06:54,779][57339] Updated weights for policy 0, policy_version 604318 (0.0033) [2024-04-28 12:06:57,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.3, 300 sec: 55205.7). Total num frames: 9901260800. Throughput: 0: 55521.2. Samples: 391625700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:06:57,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 12:06:57,607][57319] Signal inference workers to stop experience collection... (5650 times) [2024-04-28 12:06:57,607][57319] Signal inference workers to resume experience collection... (5650 times) [2024-04-28 12:06:57,621][57339] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-04-28 12:06:57,622][57339] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-04-28 12:06:57,744][57339] Updated weights for policy 0, policy_version 604328 (0.0030) [2024-04-28 12:07:00,696][57339] Updated weights for policy 0, policy_version 604338 (0.0033) [2024-04-28 12:07:02,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.5, 300 sec: 55150.2). Total num frames: 9901522944. Throughput: 0: 55374.3. Samples: 391955520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:07:02,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 12:07:03,665][57339] Updated weights for policy 0, policy_version 604348 (0.0030) [2024-04-28 12:07:06,603][57339] Updated weights for policy 0, policy_version 604358 (0.0033) [2024-04-28 12:07:07,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55150.2). Total num frames: 9901817856. Throughput: 0: 55258.1. Samples: 392115100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 12:07:07,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:07:09,417][57339] Updated weights for policy 0, policy_version 604368 (0.0030) [2024-04-28 12:07:12,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 9902096384. Throughput: 0: 55286.2. Samples: 392449920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:12,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 12:07:12,635][57339] Updated weights for policy 0, policy_version 604378 (0.0032) [2024-04-28 12:07:15,239][57339] Updated weights for policy 0, policy_version 604388 (0.0025) [2024-04-28 12:07:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55159.4, 300 sec: 55094.7). Total num frames: 9902374912. Throughput: 0: 55287.0. Samples: 392782960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:17,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 12:07:18,727][57339] Updated weights for policy 0, policy_version 604398 (0.0028) [2024-04-28 12:07:21,175][57339] Updated weights for policy 0, policy_version 604408 (0.0031) [2024-04-28 12:07:22,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 55205.7). Total num frames: 9902669824. Throughput: 0: 55683.5. Samples: 392952300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:22,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 12:07:24,409][57339] Updated weights for policy 0, policy_version 604418 (0.0028) [2024-04-28 12:07:27,137][57339] Updated weights for policy 0, policy_version 604428 (0.0026) [2024-04-28 12:07:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55261.3). Total num frames: 9902948352. Throughput: 0: 55835.1. Samples: 393290300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:27,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 12:07:27,234][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000604429_9902964736.pth... [2024-04-28 12:07:27,276][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000603620_9889710080.pth [2024-04-28 12:07:30,238][57339] Updated weights for policy 0, policy_version 604438 (0.0030) [2024-04-28 12:07:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.5, 300 sec: 55261.3). Total num frames: 9903210496. Throughput: 0: 55819.1. Samples: 393626860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:32,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 12:07:32,917][57339] Updated weights for policy 0, policy_version 604448 (0.0026) [2024-04-28 12:07:36,181][57339] Updated weights for policy 0, policy_version 604458 (0.0026) [2024-04-28 12:07:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 9903489024. Throughput: 0: 55449.3. Samples: 393787680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:37,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 12:07:38,956][57339] Updated weights for policy 0, policy_version 604468 (0.0032) [2024-04-28 12:07:41,920][57339] Updated weights for policy 0, policy_version 604478 (0.0027) [2024-04-28 12:07:42,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55432.7, 300 sec: 55205.8). 
Total num frames: 9903767552. Throughput: 0: 55450.0. Samples: 394120940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:42,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 12:07:44,917][57339] Updated weights for policy 0, policy_version 604488 (0.0026) [2024-04-28 12:07:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55094.7). Total num frames: 9904029696. Throughput: 0: 55609.7. Samples: 394457960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:47,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 12:07:47,701][57339] Updated weights for policy 0, policy_version 604498 (0.0026) [2024-04-28 12:07:50,711][57339] Updated weights for policy 0, policy_version 604508 (0.0025) [2024-04-28 12:07:52,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.5, 300 sec: 55094.6). Total num frames: 9904308224. Throughput: 0: 55638.7. Samples: 394618840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:52,170][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 12:07:54,231][57339] Updated weights for policy 0, policy_version 604518 (0.0029) [2024-04-28 12:07:56,525][57339] Updated weights for policy 0, policy_version 604528 (0.0035) [2024-04-28 12:07:57,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55205.8). Total num frames: 9904603136. Throughput: 0: 55575.6. Samples: 394950820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:07:57,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 12:08:00,279][57319] Signal inference workers to stop experience collection... (5700 times) [2024-04-28 12:08:00,279][57319] Signal inference workers to resume experience collection... 
(5700 times) [2024-04-28 12:08:00,282][57339] Updated weights for policy 0, policy_version 604538 (0.0031) [2024-04-28 12:08:00,291][57339] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-04-28 12:08:00,292][57339] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-04-28 12:08:02,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 55427.9). Total num frames: 9904898048. Throughput: 0: 55595.6. Samples: 395284760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:08:02,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 12:08:02,404][57339] Updated weights for policy 0, policy_version 604548 (0.0027) [2024-04-28 12:08:06,075][57339] Updated weights for policy 0, policy_version 604558 (0.0028) [2024-04-28 12:08:07,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.6, 300 sec: 55261.3). Total num frames: 9905127424. Throughput: 0: 55391.3. Samples: 395444900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:08:07,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 12:08:08,445][57339] Updated weights for policy 0, policy_version 604568 (0.0031) [2024-04-28 12:08:11,906][57339] Updated weights for policy 0, policy_version 604578 (0.0036) [2024-04-28 12:08:12,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55261.3). Total num frames: 9905422336. Throughput: 0: 55300.0. Samples: 395778800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:08:12,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 12:08:14,348][57339] Updated weights for policy 0, policy_version 604588 (0.0028) [2024-04-28 12:08:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.7, 300 sec: 55205.8). Total num frames: 9905700864. Throughput: 0: 55291.3. Samples: 396114960. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:08:17,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 12:08:17,873][57339] Updated weights for policy 0, policy_version 604598 (0.0034) [2024-04-28 12:08:20,027][57339] Updated weights for policy 0, policy_version 604608 (0.0029) [2024-04-28 12:08:22,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55432.4, 300 sec: 55261.3). Total num frames: 9905995776. Throughput: 0: 55353.2. Samples: 396278580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:08:22,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 12:08:23,621][57339] Updated weights for policy 0, policy_version 604618 (0.0028) [2024-04-28 12:08:26,031][57339] Updated weights for policy 0, policy_version 604628 (0.0026) [2024-04-28 12:08:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54886.5, 300 sec: 55150.2). Total num frames: 9906241536. Throughput: 0: 55343.0. Samples: 396611380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:08:27,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:08:29,377][57339] Updated weights for policy 0, policy_version 604638 (0.0038) [2024-04-28 12:08:32,033][57339] Updated weights for policy 0, policy_version 604648 (0.0023) [2024-04-28 12:08:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55372.3). Total num frames: 9906552832. Throughput: 0: 55181.7. Samples: 396941140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:08:32,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 12:08:35,449][57339] Updated weights for policy 0, policy_version 604658 (0.0028) [2024-04-28 12:08:37,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55372.4). Total num frames: 9906814976. Throughput: 0: 55417.8. Samples: 397112640. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:08:37,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 12:08:37,824][57339] Updated weights for policy 0, policy_version 604668 (0.0027) [2024-04-28 12:08:41,345][57339] Updated weights for policy 0, policy_version 604678 (0.0027) [2024-04-28 12:08:42,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.4, 300 sec: 55316.8). Total num frames: 9907093504. Throughput: 0: 55504.3. Samples: 397448520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:08:42,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 12:08:43,556][57339] Updated weights for policy 0, policy_version 604688 (0.0035) [2024-04-28 12:08:47,089][57339] Updated weights for policy 0, policy_version 604698 (0.0030) [2024-04-28 12:08:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55372.3). Total num frames: 9907372032. Throughput: 0: 55451.5. Samples: 397780080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:08:47,170][57108] Avg episode reward: [(0, '0.508')] [2024-04-28 12:08:48,392][57319] Signal inference workers to stop experience collection... (5750 times) [2024-04-28 12:08:48,425][57339] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-04-28 12:08:48,451][57319] Signal inference workers to resume experience collection... (5750 times) [2024-04-28 12:08:48,451][57339] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-04-28 12:08:49,524][57339] Updated weights for policy 0, policy_version 604708 (0.0035) [2024-04-28 12:08:52,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55205.8). Total num frames: 9907634176. Throughput: 0: 55444.9. Samples: 397939920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:08:52,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 12:08:52,756][57339] Updated weights for policy 0, policy_version 604718 (0.0026) [2024-04-28 12:08:55,535][57339] Updated weights for policy 0, policy_version 604728 (0.0028) [2024-04-28 12:08:57,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.3, 300 sec: 55261.2). Total num frames: 9907929088. Throughput: 0: 55578.1. Samples: 398279820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:08:57,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 12:08:58,733][57339] Updated weights for policy 0, policy_version 604738 (0.0031) [2024-04-28 12:09:01,419][57339] Updated weights for policy 0, policy_version 604748 (0.0029) [2024-04-28 12:09:02,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55261.3). Total num frames: 9908207616. Throughput: 0: 55565.3. Samples: 398615400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 12:09:04,692][57339] Updated weights for policy 0, policy_version 604758 (0.0031) [2024-04-28 12:09:07,169][57108] Fps is (10 sec: 57344.9, 60 sec: 56251.6, 300 sec: 55483.4). Total num frames: 9908502528. Throughput: 0: 55807.7. Samples: 398789920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:07,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 12:09:07,364][57339] Updated weights for policy 0, policy_version 604768 (0.0033) [2024-04-28 12:09:10,402][57339] Updated weights for policy 0, policy_version 604778 (0.0030) [2024-04-28 12:09:12,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55427.9). Total num frames: 9908781056. Throughput: 0: 55830.6. Samples: 399123760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:12,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 12:09:13,204][57339] Updated weights for policy 0, policy_version 604788 (0.0032) [2024-04-28 12:09:16,191][57339] Updated weights for policy 0, policy_version 604798 (0.0036) [2024-04-28 12:09:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55372.4). Total num frames: 9909043200. Throughput: 0: 55867.6. Samples: 399455180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:17,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 12:09:19,028][57339] Updated weights for policy 0, policy_version 604808 (0.0029) [2024-04-28 12:09:22,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.7, 300 sec: 55372.4). Total num frames: 9909321728. Throughput: 0: 55856.5. Samples: 399626180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:22,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 12:09:22,193][57339] Updated weights for policy 0, policy_version 604818 (0.0033) [2024-04-28 12:09:24,843][57339] Updated weights for policy 0, policy_version 604828 (0.0031) [2024-04-28 12:09:27,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55316.8). Total num frames: 9909583872. Throughput: 0: 55867.1. Samples: 399962540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:27,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 12:09:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000604834_9909600256.pth... [2024-04-28 12:09:27,238][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000604022_9896296448.pth [2024-04-28 12:09:28,315][57339] Updated weights for policy 0, policy_version 604838 (0.0030) [2024-04-28 12:09:30,541][57339] Updated weights for policy 0, policy_version 604848 (0.0029) [2024-04-28 12:09:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.6, 300 sec: 55261.3). 
Total num frames: 9909862400. Throughput: 0: 55976.1. Samples: 400299000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:32,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 12:09:34,074][57339] Updated weights for policy 0, policy_version 604858 (0.0024) [2024-04-28 12:09:36,385][57339] Updated weights for policy 0, policy_version 604868 (0.0028) [2024-04-28 12:09:37,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55483.4). Total num frames: 9910173696. Throughput: 0: 56274.4. Samples: 400472280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:37,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:09:39,832][57339] Updated weights for policy 0, policy_version 604878 (0.0027) [2024-04-28 12:09:42,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.8, 300 sec: 55483.5). Total num frames: 9910452224. Throughput: 0: 56071.0. Samples: 400803000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:42,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 12:09:42,491][57339] Updated weights for policy 0, policy_version 604888 (0.0032) [2024-04-28 12:09:45,728][57339] Updated weights for policy 0, policy_version 604898 (0.0031) [2024-04-28 12:09:47,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 9910730752. Throughput: 0: 55988.8. Samples: 401134900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:47,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 12:09:48,604][57339] Updated weights for policy 0, policy_version 604908 (0.0029) [2024-04-28 12:09:49,546][57319] Signal inference workers to stop experience collection... (5800 times) [2024-04-28 12:09:49,582][57339] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-04-28 12:09:49,596][57319] Signal inference workers to resume experience collection... 
(5800 times) [2024-04-28 12:09:49,599][57339] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-04-28 12:09:51,564][57339] Updated weights for policy 0, policy_version 604918 (0.0029) [2024-04-28 12:09:52,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55978.5, 300 sec: 55427.9). Total num frames: 9910992896. Throughput: 0: 56044.8. Samples: 401311940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:52,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 12:09:54,334][57339] Updated weights for policy 0, policy_version 604928 (0.0027) [2024-04-28 12:09:57,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.8, 300 sec: 55427.9). Total num frames: 9911271424. Throughput: 0: 56089.9. Samples: 401647800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:09:57,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 12:09:57,360][57339] Updated weights for policy 0, policy_version 604938 (0.0039) [2024-04-28 12:10:00,250][57339] Updated weights for policy 0, policy_version 604948 (0.0028) [2024-04-28 12:10:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 9911549952. Throughput: 0: 56113.3. Samples: 401980280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:10:02,170][57108] Avg episode reward: [(0, '0.514')] [2024-04-28 12:10:03,362][57339] Updated weights for policy 0, policy_version 604958 (0.0025) [2024-04-28 12:10:06,043][57339] Updated weights for policy 0, policy_version 604968 (0.0029) [2024-04-28 12:10:07,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 9911812096. Throughput: 0: 55726.2. Samples: 402133860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:10:07,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 12:10:09,353][57339] Updated weights for policy 0, policy_version 604978 (0.0028) [2024-04-28 12:10:11,867][57339] Updated weights for policy 0, policy_version 604988 (0.0028) [2024-04-28 12:10:12,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 9912123392. Throughput: 0: 55692.1. Samples: 402468680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:10:12,169][57108] Avg episode reward: [(0, '0.729')] [2024-04-28 12:10:15,301][57339] Updated weights for policy 0, policy_version 604998 (0.0030) [2024-04-28 12:10:17,169][57108] Fps is (10 sec: 60620.5, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 9912418304. Throughput: 0: 55686.2. Samples: 402804880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:10:17,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 12:10:17,647][57339] Updated weights for policy 0, policy_version 605008 (0.0029) [2024-04-28 12:10:21,231][57339] Updated weights for policy 0, policy_version 605018 (0.0031) [2024-04-28 12:10:22,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 9912680448. Throughput: 0: 55674.3. Samples: 402977620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:10:22,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 12:10:23,577][57339] Updated weights for policy 0, policy_version 605028 (0.0028) [2024-04-28 12:10:27,094][57339] Updated weights for policy 0, policy_version 605038 (0.0031) [2024-04-28 12:10:27,169][57108] Fps is (10 sec: 52427.5, 60 sec: 55978.5, 300 sec: 55538.9). Total num frames: 9912942592. Throughput: 0: 55691.2. Samples: 403309120. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:10:27,170][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 12:10:29,699][57339] Updated weights for policy 0, policy_version 605048 (0.0031) [2024-04-28 12:10:32,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.7, 300 sec: 55483.4). Total num frames: 9913221120. Throughput: 0: 55681.0. Samples: 403640540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:10:32,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 12:10:32,905][57339] Updated weights for policy 0, policy_version 605058 (0.0034) [2024-04-28 12:10:35,564][57339] Updated weights for policy 0, policy_version 605068 (0.0032) [2024-04-28 12:10:37,169][57108] Fps is (10 sec: 55707.0, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 9913499648. Throughput: 0: 55373.4. Samples: 403803740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:10:37,169][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 12:10:38,768][57339] Updated weights for policy 0, policy_version 605078 (0.0030) [2024-04-28 12:10:41,269][57339] Updated weights for policy 0, policy_version 605088 (0.0027) [2024-04-28 12:10:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 9913761792. Throughput: 0: 55193.7. Samples: 404131520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:10:42,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 12:10:42,803][57319] Signal inference workers to stop experience collection... (5850 times) [2024-04-28 12:10:42,842][57339] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-04-28 12:10:42,855][57319] Signal inference workers to resume experience collection... 
(5850 times) [2024-04-28 12:10:42,861][57339] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-04-28 12:10:44,778][57339] Updated weights for policy 0, policy_version 605098 (0.0027) [2024-04-28 12:10:47,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 9914073088. Throughput: 0: 55231.7. Samples: 404465700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:10:47,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 12:10:47,216][57339] Updated weights for policy 0, policy_version 605108 (0.0033) [2024-04-28 12:10:50,621][57339] Updated weights for policy 0, policy_version 605118 (0.0033) [2024-04-28 12:10:52,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.8, 300 sec: 55650.0). Total num frames: 9914351616. Throughput: 0: 55775.6. Samples: 404643760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:10:52,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 12:10:53,027][57339] Updated weights for policy 0, policy_version 605128 (0.0030) [2024-04-28 12:10:56,376][57339] Updated weights for policy 0, policy_version 605138 (0.0029) [2024-04-28 12:10:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9914613760. Throughput: 0: 55752.5. Samples: 404977540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:10:57,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:10:58,844][57339] Updated weights for policy 0, policy_version 605148 (0.0025) [2024-04-28 12:11:02,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 9914892288. Throughput: 0: 55674.2. Samples: 405310220. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:02,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 12:11:02,258][57339] Updated weights for policy 0, policy_version 605158 (0.0035) [2024-04-28 12:11:05,012][57339] Updated weights for policy 0, policy_version 605168 (0.0025) [2024-04-28 12:11:07,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 9915154432. Throughput: 0: 55305.3. Samples: 405466360. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:07,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 12:11:08,185][57339] Updated weights for policy 0, policy_version 605178 (0.0027) [2024-04-28 12:11:11,140][57339] Updated weights for policy 0, policy_version 605188 (0.0029) [2024-04-28 12:11:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 9915432960. Throughput: 0: 55408.6. Samples: 405802500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:12,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 12:11:14,087][57339] Updated weights for policy 0, policy_version 605198 (0.0027) [2024-04-28 12:11:17,031][57339] Updated weights for policy 0, policy_version 605208 (0.0035) [2024-04-28 12:11:17,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.3, 300 sec: 55538.9). Total num frames: 9915727872. Throughput: 0: 55527.2. Samples: 406139280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:17,170][57108] Avg episode reward: [(0, '0.505')] [2024-04-28 12:11:19,783][57339] Updated weights for policy 0, policy_version 605218 (0.0026) [2024-04-28 12:11:22,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 9916022784. Throughput: 0: 55766.2. Samples: 406313220. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:22,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 12:11:22,947][57339] Updated weights for policy 0, policy_version 605228 (0.0027) [2024-04-28 12:11:25,658][57339] Updated weights for policy 0, policy_version 605238 (0.0029) [2024-04-28 12:11:27,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 9916301312. Throughput: 0: 55884.9. Samples: 406646340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:27,178][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:11:27,230][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000605244_9916317696.pth... [2024-04-28 12:11:27,287][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000604429_9902964736.pth [2024-04-28 12:11:28,835][57339] Updated weights for policy 0, policy_version 605248 (0.0027) [2024-04-28 12:11:31,694][57339] Updated weights for policy 0, policy_version 605258 (0.0029) [2024-04-28 12:11:32,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 9916596224. Throughput: 0: 55916.4. Samples: 406981940. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:32,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 12:11:34,577][57339] Updated weights for policy 0, policy_version 605268 (0.0029) [2024-04-28 12:11:37,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9916825600. Throughput: 0: 55616.7. Samples: 407146520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:37,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 12:11:37,530][57339] Updated weights for policy 0, policy_version 605278 (0.0026) [2024-04-28 12:11:40,265][57339] Updated weights for policy 0, policy_version 605288 (0.0031) [2024-04-28 12:11:42,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55705.6, 300 sec: 55539.0). 
Total num frames: 9917104128. Throughput: 0: 55620.8. Samples: 407480480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:42,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 12:11:42,907][57319] Signal inference workers to stop experience collection... (5900 times) [2024-04-28 12:11:42,947][57339] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-04-28 12:11:42,973][57319] Signal inference workers to resume experience collection... (5900 times) [2024-04-28 12:11:42,974][57339] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-04-28 12:11:43,321][57339] Updated weights for policy 0, policy_version 605298 (0.0030) [2024-04-28 12:11:46,136][57339] Updated weights for policy 0, policy_version 605308 (0.0030) [2024-04-28 12:11:47,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 9917399040. Throughput: 0: 55576.9. Samples: 407811180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:47,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:11:49,285][57339] Updated weights for policy 0, policy_version 605318 (0.0030) [2024-04-28 12:11:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 9917677568. Throughput: 0: 55722.7. Samples: 407973880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 12:11:52,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 12:11:52,645][57339] Updated weights for policy 0, policy_version 605328 (0.0033) [2024-04-28 12:11:55,204][57339] Updated weights for policy 0, policy_version 605338 (0.0028) [2024-04-28 12:11:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 9917956096. Throughput: 0: 55606.4. Samples: 408304780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:11:57,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 12:11:58,695][57339] Updated weights for policy 0, policy_version 605348 (0.0030) [2024-04-28 12:12:01,021][57339] Updated weights for policy 0, policy_version 605358 (0.0027) [2024-04-28 12:12:02,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 9918251008. Throughput: 0: 55455.8. Samples: 408634780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:02,170][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 12:12:04,531][57339] Updated weights for policy 0, policy_version 605368 (0.0026) [2024-04-28 12:12:06,863][57339] Updated weights for policy 0, policy_version 605378 (0.0026) [2024-04-28 12:12:07,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 9918529536. Throughput: 0: 55572.8. Samples: 408814000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:07,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 12:12:10,426][57339] Updated weights for policy 0, policy_version 605388 (0.0025) [2024-04-28 12:12:12,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9918775296. Throughput: 0: 55601.8. Samples: 409148420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:12,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:12:12,782][57339] Updated weights for policy 0, policy_version 605398 (0.0034) [2024-04-28 12:12:16,393][57339] Updated weights for policy 0, policy_version 605408 (0.0032) [2024-04-28 12:12:17,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 9919053824. Throughput: 0: 55412.8. Samples: 409475520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:17,170][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 12:12:18,811][57339] Updated weights for policy 0, policy_version 605418 (0.0030) [2024-04-28 12:12:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 9919315968. Throughput: 0: 55191.6. Samples: 409630140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:22,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:12:22,184][57339] Updated weights for policy 0, policy_version 605428 (0.0031) [2024-04-28 12:12:24,613][57339] Updated weights for policy 0, policy_version 605438 (0.0030) [2024-04-28 12:12:27,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 9919610880. Throughput: 0: 55222.9. Samples: 409965520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:27,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 12:12:28,068][57339] Updated weights for policy 0, policy_version 605448 (0.0028) [2024-04-28 12:12:30,495][57339] Updated weights for policy 0, policy_version 605458 (0.0027) [2024-04-28 12:12:32,169][57108] Fps is (10 sec: 57343.8, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 9919889408. Throughput: 0: 55226.6. Samples: 410296380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:32,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 12:12:34,096][57339] Updated weights for policy 0, policy_version 605468 (0.0025) [2024-04-28 12:12:35,711][57319] Signal inference workers to stop experience collection... (5950 times) [2024-04-28 12:12:35,711][57319] Signal inference workers to resume experience collection... 
(5950 times) [2024-04-28 12:12:35,741][57339] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-04-28 12:12:35,741][57339] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-04-28 12:12:36,517][57339] Updated weights for policy 0, policy_version 605478 (0.0026) [2024-04-28 12:12:37,169][57108] Fps is (10 sec: 58983.1, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 9920200704. Throughput: 0: 55312.0. Samples: 410462920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:37,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 12:12:40,031][57339] Updated weights for policy 0, policy_version 605488 (0.0028) [2024-04-28 12:12:42,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 9920446464. Throughput: 0: 55402.1. Samples: 410797880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:42,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 12:12:42,455][57339] Updated weights for policy 0, policy_version 605498 (0.0032) [2024-04-28 12:12:45,953][57339] Updated weights for policy 0, policy_version 605508 (0.0030) [2024-04-28 12:12:47,169][57108] Fps is (10 sec: 49152.4, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 9920692224. Throughput: 0: 55454.8. Samples: 411130240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:47,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:12:48,305][57339] Updated weights for policy 0, policy_version 605518 (0.0035) [2024-04-28 12:12:51,887][57339] Updated weights for policy 0, policy_version 605528 (0.0029) [2024-04-28 12:12:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 9920987136. Throughput: 0: 54985.1. Samples: 411288320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:52,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 12:12:54,384][57339] Updated weights for policy 0, policy_version 605538 (0.0026) [2024-04-28 12:12:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 9921249280. Throughput: 0: 54920.1. Samples: 411619820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:12:57,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 12:12:57,784][57339] Updated weights for policy 0, policy_version 605548 (0.0029) [2024-04-28 12:13:00,233][57339] Updated weights for policy 0, policy_version 605558 (0.0032) [2024-04-28 12:13:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 9921544192. Throughput: 0: 55102.4. Samples: 411955120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:13:02,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 12:13:03,572][57339] Updated weights for policy 0, policy_version 605568 (0.0026) [2024-04-28 12:13:06,015][57339] Updated weights for policy 0, policy_version 605578 (0.0027) [2024-04-28 12:13:07,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 9921839104. Throughput: 0: 55493.4. Samples: 412127340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:13:07,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 12:13:09,554][57339] Updated weights for policy 0, policy_version 605588 (0.0027) [2024-04-28 12:13:11,855][57339] Updated weights for policy 0, policy_version 605598 (0.0031) [2024-04-28 12:13:12,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 9922134016. Throughput: 0: 55403.6. Samples: 412458680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:13:12,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 12:13:15,441][57339] Updated weights for policy 0, policy_version 605608 (0.0029) [2024-04-28 12:13:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 9922379776. Throughput: 0: 55398.4. Samples: 412789300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:13:17,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 12:13:17,820][57339] Updated weights for policy 0, policy_version 605618 (0.0026) [2024-04-28 12:13:21,360][57339] Updated weights for policy 0, policy_version 605628 (0.0027) [2024-04-28 12:13:22,169][57108] Fps is (10 sec: 50791.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 9922641920. Throughput: 0: 55414.8. Samples: 412956580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:13:22,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 12:13:23,532][57339] Updated weights for policy 0, policy_version 605638 (0.0027) [2024-04-28 12:13:27,169][57108] Fps is (10 sec: 52427.6, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 9922904064. Throughput: 0: 55366.0. Samples: 413289360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:13:27,170][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 12:13:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000605646_9922904064.pth... [2024-04-28 12:13:27,220][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000604834_9909600256.pth [2024-04-28 12:13:27,467][57339] Updated weights for policy 0, policy_version 605648 (0.0029) [2024-04-28 12:13:29,511][57339] Updated weights for policy 0, policy_version 605658 (0.0029) [2024-04-28 12:13:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 9923198976. Throughput: 0: 55386.2. Samples: 413622620. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:13:32,178][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:13:33,235][57339] Updated weights for policy 0, policy_version 605668 (0.0027) [2024-04-28 12:13:35,478][57339] Updated weights for policy 0, policy_version 605678 (0.0035) [2024-04-28 12:13:35,784][57319] Signal inference workers to stop experience collection... (6000 times) [2024-04-28 12:13:35,785][57319] Signal inference workers to resume experience collection... (6000 times) [2024-04-28 12:13:35,802][57339] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-04-28 12:13:35,802][57339] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-04-28 12:13:37,169][57108] Fps is (10 sec: 58983.6, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 9923493888. Throughput: 0: 55624.0. Samples: 413791400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:13:37,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 12:13:39,203][57339] Updated weights for policy 0, policy_version 605688 (0.0029) [2024-04-28 12:13:41,528][57339] Updated weights for policy 0, policy_version 605698 (0.0026) [2024-04-28 12:13:42,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 9923788800. Throughput: 0: 55586.1. Samples: 414121200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:13:42,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:13:45,103][57339] Updated weights for policy 0, policy_version 605708 (0.0032) [2024-04-28 12:13:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 9924050944. Throughput: 0: 55474.7. Samples: 414451480. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:13:47,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 12:13:47,374][57339] Updated weights for policy 0, policy_version 605718 (0.0027) [2024-04-28 12:13:50,937][57339] Updated weights for policy 0, policy_version 605728 (0.0028) [2024-04-28 12:13:52,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 9924296704. Throughput: 0: 55447.0. Samples: 414622460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:13:52,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 12:13:53,304][57339] Updated weights for policy 0, policy_version 605738 (0.0027) [2024-04-28 12:13:56,797][57339] Updated weights for policy 0, policy_version 605748 (0.0025) [2024-04-28 12:13:57,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 9924591616. Throughput: 0: 55443.6. Samples: 414953640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:13:57,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 12:13:59,161][57339] Updated weights for policy 0, policy_version 605758 (0.0028) [2024-04-28 12:14:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.3, 300 sec: 55372.4). Total num frames: 9924837376. Throughput: 0: 55453.3. Samples: 415284700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:02,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 12:14:02,803][57339] Updated weights for policy 0, policy_version 605768 (0.0028) [2024-04-28 12:14:05,073][57339] Updated weights for policy 0, policy_version 605778 (0.0033) [2024-04-28 12:14:07,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54613.2, 300 sec: 55372.4). Total num frames: 9925115904. Throughput: 0: 55198.9. Samples: 415440540. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:07,170][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 12:14:08,698][57339] Updated weights for policy 0, policy_version 605788 (0.0028) [2024-04-28 12:14:11,082][57339] Updated weights for policy 0, policy_version 605798 (0.0025) [2024-04-28 12:14:12,169][57108] Fps is (10 sec: 58983.0, 60 sec: 54886.6, 300 sec: 55539.0). Total num frames: 9925427200. Throughput: 0: 55103.9. Samples: 415769020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 12:14:14,729][57339] Updated weights for policy 0, policy_version 605808 (0.0027) [2024-04-28 12:14:17,011][57339] Updated weights for policy 0, policy_version 605818 (0.0023) [2024-04-28 12:14:17,169][57108] Fps is (10 sec: 60621.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9925722112. Throughput: 0: 54963.6. Samples: 416095980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:17,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:14:20,548][57339] Updated weights for policy 0, policy_version 605828 (0.0033) [2024-04-28 12:14:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 9925967872. Throughput: 0: 55212.1. Samples: 416275940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:22,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 12:14:23,001][57339] Updated weights for policy 0, policy_version 605838 (0.0026) [2024-04-28 12:14:26,281][57339] Updated weights for policy 0, policy_version 605848 (0.0027) [2024-04-28 12:14:27,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 9926230016. Throughput: 0: 55349.9. Samples: 416611940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:27,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 12:14:28,731][57319] Signal inference workers to stop experience collection... 
(6050 times) [2024-04-28 12:14:28,731][57319] Signal inference workers to resume experience collection... (6050 times) [2024-04-28 12:14:28,743][57339] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-04-28 12:14:28,743][57339] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-04-28 12:14:28,842][57339] Updated weights for policy 0, policy_version 605858 (0.0027) [2024-04-28 12:14:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9926524928. Throughput: 0: 55311.6. Samples: 416940500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:32,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 12:14:32,178][57339] Updated weights for policy 0, policy_version 605868 (0.0030) [2024-04-28 12:14:34,727][57339] Updated weights for policy 0, policy_version 605878 (0.0029) [2024-04-28 12:14:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54613.4, 300 sec: 55316.8). Total num frames: 9926770688. Throughput: 0: 54918.8. Samples: 417093800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:37,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 12:14:38,272][57339] Updated weights for policy 0, policy_version 605888 (0.0028) [2024-04-28 12:14:40,446][57339] Updated weights for policy 0, policy_version 605898 (0.0031) [2024-04-28 12:14:42,169][57108] Fps is (10 sec: 54066.2, 60 sec: 54613.3, 300 sec: 55372.3). Total num frames: 9927065600. Throughput: 0: 54897.7. Samples: 417424040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:42,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:14:44,110][57339] Updated weights for policy 0, policy_version 605908 (0.0028) [2024-04-28 12:14:46,258][57339] Updated weights for policy 0, policy_version 605918 (0.0028) [2024-04-28 12:14:47,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 9927360512. Throughput: 0: 55042.2. Samples: 417761600. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:47,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 12:14:50,088][57339] Updated weights for policy 0, policy_version 605928 (0.0032) [2024-04-28 12:14:52,169][57108] Fps is (10 sec: 60621.7, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 9927671808. Throughput: 0: 55526.8. Samples: 417939240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:52,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:14:52,197][57339] Updated weights for policy 0, policy_version 605938 (0.0033) [2024-04-28 12:14:56,161][57339] Updated weights for policy 0, policy_version 605948 (0.0026) [2024-04-28 12:14:57,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 9927917568. Throughput: 0: 55606.5. Samples: 418271320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-04-28 12:14:57,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 12:14:58,443][57339] Updated weights for policy 0, policy_version 605958 (0.0035) [2024-04-28 12:15:01,867][57339] Updated weights for policy 0, policy_version 605968 (0.0029) [2024-04-28 12:15:02,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55705.7, 300 sec: 55483.4). Total num frames: 9928179712. Throughput: 0: 55781.8. Samples: 418606160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:02,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 12:15:04,187][57339] Updated weights for policy 0, policy_version 605978 (0.0030) [2024-04-28 12:15:07,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55705.8, 300 sec: 55372.4). Total num frames: 9928458240. Throughput: 0: 55330.3. Samples: 418765800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:07,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 12:15:07,742][57339] Updated weights for policy 0, policy_version 605988 (0.0029) [2024-04-28 12:15:10,184][57339] Updated weights for policy 0, policy_version 605998 (0.0028) [2024-04-28 12:15:12,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54886.3, 300 sec: 55261.3). Total num frames: 9928720384. Throughput: 0: 55303.9. Samples: 419100620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:12,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 12:15:13,710][57339] Updated weights for policy 0, policy_version 606008 (0.0025) [2024-04-28 12:15:16,397][57339] Updated weights for policy 0, policy_version 606018 (0.0030) [2024-04-28 12:15:17,169][57108] Fps is (10 sec: 55704.5, 60 sec: 54886.3, 300 sec: 55372.4). Total num frames: 9929015296. Throughput: 0: 55509.1. Samples: 419438420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:17,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 12:15:19,432][57339] Updated weights for policy 0, policy_version 606028 (0.0026) [2024-04-28 12:15:22,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55483.5). Total num frames: 9929310208. Throughput: 0: 55752.3. Samples: 419602660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:22,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:15:22,232][57339] Updated weights for policy 0, policy_version 606038 (0.0028) [2024-04-28 12:15:25,188][57339] Updated weights for policy 0, policy_version 606048 (0.0025) [2024-04-28 12:15:27,169][57108] Fps is (10 sec: 60621.6, 60 sec: 56524.8, 300 sec: 55594.5). Total num frames: 9929621504. Throughput: 0: 55841.5. Samples: 419936900. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:27,169][57108] Avg episode reward: [(0, '0.513')] [2024-04-28 12:15:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000606056_9929621504.pth... [2024-04-28 12:15:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000605244_9916317696.pth [2024-04-28 12:15:27,923][57339] Updated weights for policy 0, policy_version 606058 (0.0028) [2024-04-28 12:15:30,332][57319] Signal inference workers to stop experience collection... (6100 times) [2024-04-28 12:15:30,368][57339] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-04-28 12:15:30,422][57319] Signal inference workers to resume experience collection... (6100 times) [2024-04-28 12:15:30,422][57339] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-04-28 12:15:31,233][57339] Updated weights for policy 0, policy_version 606068 (0.0034) [2024-04-28 12:15:32,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 9929883648. Throughput: 0: 55876.8. Samples: 420276060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:32,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 12:15:33,684][57339] Updated weights for policy 0, policy_version 606078 (0.0025) [2024-04-28 12:15:37,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55978.7, 300 sec: 55483.5). Total num frames: 9930129408. Throughput: 0: 55739.2. Samples: 420447500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:37,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 12:15:37,210][57339] Updated weights for policy 0, policy_version 606088 (0.0029) [2024-04-28 12:15:39,541][57339] Updated weights for policy 0, policy_version 606098 (0.0032) [2024-04-28 12:15:42,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55372.4). Total num frames: 9930407936. Throughput: 0: 55791.2. Samples: 420781920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:42,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 12:15:43,210][57339] Updated weights for policy 0, policy_version 606108 (0.0028) [2024-04-28 12:15:45,716][57339] Updated weights for policy 0, policy_version 606118 (0.0026) [2024-04-28 12:15:47,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 55316.8). Total num frames: 9930670080. Throughput: 0: 55753.3. Samples: 421115060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:47,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 12:15:48,974][57339] Updated weights for policy 0, policy_version 606128 (0.0029) [2024-04-28 12:15:51,641][57339] Updated weights for policy 0, policy_version 606138 (0.0031) [2024-04-28 12:15:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54886.3, 300 sec: 55427.9). Total num frames: 9930964992. Throughput: 0: 55738.4. Samples: 421274040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:52,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 12:15:54,804][57339] Updated weights for policy 0, policy_version 606148 (0.0025) [2024-04-28 12:15:57,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 9931259904. Throughput: 0: 55640.1. Samples: 421604420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:15:57,169][57108] Avg episode reward: [(0, '0.520')] [2024-04-28 12:15:57,367][57339] Updated weights for policy 0, policy_version 606158 (0.0032) [2024-04-28 12:16:00,624][57339] Updated weights for policy 0, policy_version 606168 (0.0035) [2024-04-28 12:16:02,169][57108] Fps is (10 sec: 58983.1, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 9931554816. Throughput: 0: 55553.9. Samples: 421938340. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:16:02,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:16:03,835][57339] Updated weights for policy 0, policy_version 606178 (0.0026) [2024-04-28 12:16:06,316][57339] Updated weights for policy 0, policy_version 606188 (0.0027) [2024-04-28 12:16:07,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 9931833344. Throughput: 0: 55947.6. Samples: 422120300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:16:07,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 12:16:09,660][57339] Updated weights for policy 0, policy_version 606198 (0.0029) [2024-04-28 12:16:12,169][57108] Fps is (10 sec: 54067.2, 60 sec: 56251.8, 300 sec: 55483.5). Total num frames: 9932095488. Throughput: 0: 56017.8. Samples: 422457700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:16:12,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 12:16:12,636][57339] Updated weights for policy 0, policy_version 606208 (0.0022) [2024-04-28 12:16:15,358][57339] Updated weights for policy 0, policy_version 606218 (0.0029) [2024-04-28 12:16:17,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55372.4). Total num frames: 9932357632. Throughput: 0: 55960.9. Samples: 422794300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:16:17,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 12:16:18,265][57339] Updated weights for policy 0, policy_version 606228 (0.0029) [2024-04-28 12:16:21,217][57339] Updated weights for policy 0, policy_version 606238 (0.0027) [2024-04-28 12:16:22,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.5, 300 sec: 55316.8). Total num frames: 9932619776. Throughput: 0: 55494.2. Samples: 422944740. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:16:22,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 12:16:24,012][57339] Updated weights for policy 0, policy_version 606248 (0.0029) [2024-04-28 12:16:24,528][57319] Signal inference workers to stop experience collection... (6150 times) [2024-04-28 12:16:24,528][57319] Signal inference workers to resume experience collection... (6150 times) [2024-04-28 12:16:24,541][57339] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-04-28 12:16:24,541][57339] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-04-28 12:16:26,922][57339] Updated weights for policy 0, policy_version 606258 (0.0027) [2024-04-28 12:16:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 9932931072. Throughput: 0: 55608.8. Samples: 423284320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:16:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 12:16:29,946][57339] Updated weights for policy 0, policy_version 606268 (0.0026) [2024-04-28 12:16:32,169][57108] Fps is (10 sec: 60620.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9933225984. Throughput: 0: 55611.1. Samples: 423617560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-04-28 12:16:32,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 12:16:32,745][57339] Updated weights for policy 0, policy_version 606278 (0.0033) [2024-04-28 12:16:35,872][57339] Updated weights for policy 0, policy_version 606288 (0.0024) [2024-04-28 12:16:37,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 9933504512. Throughput: 0: 56070.2. Samples: 423797200. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:16:37,170][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 12:16:38,491][57339] Updated weights for policy 0, policy_version 606298 (0.0029) [2024-04-28 12:16:41,585][57339] Updated weights for policy 0, policy_version 606308 (0.0026) [2024-04-28 12:16:42,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56524.9, 300 sec: 55594.5). Total num frames: 9933799424. Throughput: 0: 56112.5. Samples: 424129480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:16:42,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 12:16:44,839][57339] Updated weights for policy 0, policy_version 606318 (0.0030) [2024-04-28 12:16:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56524.7, 300 sec: 55539.0). Total num frames: 9934061568. Throughput: 0: 56119.9. Samples: 424463740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:16:47,169][57108] Avg episode reward: [(0, '0.693')] [2024-04-28 12:16:47,307][57339] Updated weights for policy 0, policy_version 606328 (0.0031) [2024-04-28 12:16:50,717][57339] Updated weights for policy 0, policy_version 606338 (0.0025) [2024-04-28 12:16:52,169][57108] Fps is (10 sec: 52427.3, 60 sec: 55978.6, 300 sec: 55483.4). Total num frames: 9934323712. Throughput: 0: 55700.7. Samples: 424626840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:16:52,170][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 12:16:53,280][57339] Updated weights for policy 0, policy_version 606348 (0.0028) [2024-04-28 12:16:56,567][57339] Updated weights for policy 0, policy_version 606358 (0.0023) [2024-04-28 12:16:57,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 9934585856. Throughput: 0: 55688.8. Samples: 424963700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:16:57,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:16:59,210][57339] Updated weights for policy 0, policy_version 606368 (0.0031) [2024-04-28 12:17:02,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 9934880768. Throughput: 0: 55603.1. Samples: 425296440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:02,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 12:17:02,329][57339] Updated weights for policy 0, policy_version 606378 (0.0032) [2024-04-28 12:17:05,226][57339] Updated weights for policy 0, policy_version 606388 (0.0029) [2024-04-28 12:17:07,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9935175680. Throughput: 0: 56193.6. Samples: 425473460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:07,170][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 12:17:08,173][57339] Updated weights for policy 0, policy_version 606398 (0.0029) [2024-04-28 12:17:11,051][57339] Updated weights for policy 0, policy_version 606408 (0.0030) [2024-04-28 12:17:12,169][57108] Fps is (10 sec: 58982.1, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 9935470592. Throughput: 0: 55943.1. Samples: 425801760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:12,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 12:17:13,936][57339] Updated weights for policy 0, policy_version 606418 (0.0027) [2024-04-28 12:17:16,931][57339] Updated weights for policy 0, policy_version 606428 (0.0028) [2024-04-28 12:17:17,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 9935716352. Throughput: 0: 55938.3. Samples: 426134780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:17,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 12:17:19,839][57339] Updated weights for policy 0, policy_version 606438 (0.0027) [2024-04-28 12:17:22,169][57108] Fps is (10 sec: 52429.2, 60 sec: 56251.8, 300 sec: 55539.0). Total num frames: 9935994880. Throughput: 0: 55776.6. Samples: 426307140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:22,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 12:17:22,758][57339] Updated weights for policy 0, policy_version 606448 (0.0028) [2024-04-28 12:17:25,694][57339] Updated weights for policy 0, policy_version 606458 (0.0024) [2024-04-28 12:17:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 9936273408. Throughput: 0: 55968.4. Samples: 426648060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:27,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:17:27,276][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000606463_9936289792.pth... [2024-04-28 12:17:27,319][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000605646_9922904064.pth [2024-04-28 12:17:28,497][57339] Updated weights for policy 0, policy_version 606468 (0.0028) [2024-04-28 12:17:31,735][57339] Updated weights for policy 0, policy_version 606478 (0.0032) [2024-04-28 12:17:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9936551936. Throughput: 0: 55894.7. Samples: 426979000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:32,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 12:17:34,408][57339] Updated weights for policy 0, policy_version 606488 (0.0028) [2024-04-28 12:17:37,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 9936846848. Throughput: 0: 55834.5. Samples: 427139380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:37,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 12:17:37,589][57339] Updated weights for policy 0, policy_version 606498 (0.0036) [2024-04-28 12:17:40,384][57339] Updated weights for policy 0, policy_version 606508 (0.0027) [2024-04-28 12:17:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 9937125376. Throughput: 0: 55757.4. Samples: 427472780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:42,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 12:17:43,440][57339] Updated weights for policy 0, policy_version 606518 (0.0028) [2024-04-28 12:17:44,900][57319] Signal inference workers to stop experience collection... (6200 times) [2024-04-28 12:17:44,900][57319] Signal inference workers to resume experience collection... (6200 times) [2024-04-28 12:17:44,910][57339] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-04-28 12:17:44,911][57339] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-04-28 12:17:46,124][57339] Updated weights for policy 0, policy_version 606528 (0.0026) [2024-04-28 12:17:47,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 9937403904. Throughput: 0: 55812.7. Samples: 427808020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:47,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:17:49,271][57339] Updated weights for policy 0, policy_version 606538 (0.0028) [2024-04-28 12:17:52,169][57108] Fps is (10 sec: 54066.0, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 9937666048. Throughput: 0: 55621.7. Samples: 427976440. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:52,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:17:52,210][57339] Updated weights for policy 0, policy_version 606548 (0.0029) [2024-04-28 12:17:55,038][57339] Updated weights for policy 0, policy_version 606558 (0.0026) [2024-04-28 12:17:57,169][57108] Fps is (10 sec: 54068.5, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 9937944576. Throughput: 0: 55710.8. Samples: 428308740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:17:57,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 12:17:58,177][57339] Updated weights for policy 0, policy_version 606568 (0.0030) [2024-04-28 12:18:01,002][57339] Updated weights for policy 0, policy_version 606578 (0.0025) [2024-04-28 12:18:02,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9938223104. Throughput: 0: 55644.8. Samples: 428638800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:18:02,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 12:18:03,919][57339] Updated weights for policy 0, policy_version 606588 (0.0030) [2024-04-28 12:18:06,883][57339] Updated weights for policy 0, policy_version 606598 (0.0026) [2024-04-28 12:18:07,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9938518016. Throughput: 0: 55590.1. Samples: 428808700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-04-28 12:18:07,170][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 12:18:09,770][57339] Updated weights for policy 0, policy_version 606608 (0.0028) [2024-04-28 12:18:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 9938780160. Throughput: 0: 55328.9. Samples: 429137860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:12,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 12:18:12,896][57339] Updated weights for policy 0, policy_version 606618 (0.0029) [2024-04-28 12:18:15,727][57339] Updated weights for policy 0, policy_version 606628 (0.0027) [2024-04-28 12:18:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 9939075072. Throughput: 0: 55335.5. Samples: 429469100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:17,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:18:18,854][57339] Updated weights for policy 0, policy_version 606638 (0.0033) [2024-04-28 12:18:21,679][57339] Updated weights for policy 0, policy_version 606648 (0.0028) [2024-04-28 12:18:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 9939337216. Throughput: 0: 55705.8. Samples: 429646140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:22,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 12:18:24,692][57339] Updated weights for policy 0, policy_version 606658 (0.0026) [2024-04-28 12:18:27,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 9939615744. Throughput: 0: 55692.4. Samples: 429978940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:27,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 12:18:27,414][57339] Updated weights for policy 0, policy_version 606668 (0.0025) [2024-04-28 12:18:30,510][57339] Updated weights for policy 0, policy_version 606678 (0.0026) [2024-04-28 12:18:32,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9939877888. Throughput: 0: 55546.3. Samples: 430307600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:32,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 12:18:33,478][57339] Updated weights for policy 0, policy_version 606688 (0.0027) [2024-04-28 12:18:36,346][57339] Updated weights for policy 0, policy_version 606698 (0.0033) [2024-04-28 12:18:37,169][57108] Fps is (10 sec: 54065.9, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 9940156416. Throughput: 0: 55343.1. Samples: 430466880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:37,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:18:39,342][57339] Updated weights for policy 0, policy_version 606708 (0.0030) [2024-04-28 12:18:42,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 9940451328. Throughput: 0: 55272.8. Samples: 430796020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:42,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 12:18:42,338][57339] Updated weights for policy 0, policy_version 606718 (0.0032) [2024-04-28 12:18:45,199][57339] Updated weights for policy 0, policy_version 606728 (0.0030) [2024-04-28 12:18:47,169][57108] Fps is (10 sec: 57345.9, 60 sec: 55432.8, 300 sec: 55705.6). Total num frames: 9940729856. Throughput: 0: 55364.2. Samples: 431130180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:47,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 12:18:48,352][57339] Updated weights for policy 0, policy_version 606738 (0.0033) [2024-04-28 12:18:50,972][57339] Updated weights for policy 0, policy_version 606748 (0.0026) [2024-04-28 12:18:52,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 9941024768. Throughput: 0: 55484.4. Samples: 431305500. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:52,170][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 12:18:54,208][57339] Updated weights for policy 0, policy_version 606758 (0.0032) [2024-04-28 12:18:56,513][57319] Signal inference workers to stop experience collection... (6250 times) [2024-04-28 12:18:56,561][57339] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-04-28 12:18:56,573][57319] Signal inference workers to resume experience collection... (6250 times) [2024-04-28 12:18:56,578][57339] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-04-28 12:18:56,802][57339] Updated weights for policy 0, policy_version 606768 (0.0026) [2024-04-28 12:18:57,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 9941303296. Throughput: 0: 55593.3. Samples: 431639560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:18:57,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 12:19:00,131][57339] Updated weights for policy 0, policy_version 606778 (0.0026) [2024-04-28 12:19:02,169][57108] Fps is (10 sec: 50791.1, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 9941532672. Throughput: 0: 55695.7. Samples: 431975400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:19:02,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 12:19:02,815][57339] Updated weights for policy 0, policy_version 606788 (0.0024) [2024-04-28 12:19:06,038][57339] Updated weights for policy 0, policy_version 606798 (0.0028) [2024-04-28 12:19:07,169][57108] Fps is (10 sec: 50790.0, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 9941811200. Throughput: 0: 55287.9. Samples: 432134100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:19:07,170][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:19:08,633][57339] Updated weights for policy 0, policy_version 606808 (0.0040) [2024-04-28 12:19:11,812][57339] Updated weights for policy 0, policy_version 606818 (0.0033) [2024-04-28 12:19:12,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 9942106112. Throughput: 0: 55306.7. Samples: 432467740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:19:12,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:19:14,539][57339] Updated weights for policy 0, policy_version 606828 (0.0029) [2024-04-28 12:19:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 9942368256. Throughput: 0: 55382.4. Samples: 432799800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:19:17,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 12:19:17,738][57339] Updated weights for policy 0, policy_version 606838 (0.0027) [2024-04-28 12:19:20,385][57339] Updated weights for policy 0, policy_version 606848 (0.0033) [2024-04-28 12:19:22,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 9942679552. Throughput: 0: 55684.1. Samples: 432972660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:19:22,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 12:19:23,781][57339] Updated weights for policy 0, policy_version 606858 (0.0031) [2024-04-28 12:19:26,188][57339] Updated weights for policy 0, policy_version 606868 (0.0027) [2024-04-28 12:19:27,169][57108] Fps is (10 sec: 60619.9, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 9942974464. Throughput: 0: 55841.6. Samples: 433308900. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:19:27,178][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 12:19:27,189][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000606871_9942974464.pth... [2024-04-28 12:19:27,239][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000606056_9929621504.pth [2024-04-28 12:19:29,758][57339] Updated weights for policy 0, policy_version 606878 (0.0034) [2024-04-28 12:19:32,166][57339] Updated weights for policy 0, policy_version 606888 (0.0028) [2024-04-28 12:19:32,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 9943252992. Throughput: 0: 55752.7. Samples: 433639060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:19:32,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 12:19:35,753][57339] Updated weights for policy 0, policy_version 606898 (0.0031) [2024-04-28 12:19:37,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 9943498752. Throughput: 0: 55663.2. Samples: 433810340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:19:37,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 12:19:37,969][57339] Updated weights for policy 0, policy_version 606908 (0.0029) [2024-04-28 12:19:41,578][57339] Updated weights for policy 0, policy_version 606918 (0.0033) [2024-04-28 12:19:42,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 9943777280. Throughput: 0: 55676.0. Samples: 434144980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 12:19:42,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 12:19:43,892][57339] Updated weights for policy 0, policy_version 606928 (0.0027) [2024-04-28 12:19:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 9944039424. Throughput: 0: 55538.1. Samples: 434474620. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:19:47,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 12:19:47,664][57339] Updated weights for policy 0, policy_version 606938 (0.0028) [2024-04-28 12:19:49,735][57339] Updated weights for policy 0, policy_version 606948 (0.0028) [2024-04-28 12:19:52,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 9944334336. Throughput: 0: 55644.0. Samples: 434638080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:19:52,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 12:19:53,453][57339] Updated weights for policy 0, policy_version 606958 (0.0028) [2024-04-28 12:19:55,681][57339] Updated weights for policy 0, policy_version 606968 (0.0029) [2024-04-28 12:19:57,169][57108] Fps is (10 sec: 60620.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 9944645632. Throughput: 0: 55569.7. Samples: 434968380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:19:57,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 12:19:59,336][57339] Updated weights for policy 0, policy_version 606978 (0.0027) [2024-04-28 12:20:01,662][57339] Updated weights for policy 0, policy_version 606988 (0.0027) [2024-04-28 12:20:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 9944907776. Throughput: 0: 55642.6. Samples: 435303720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:02,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 12:20:05,080][57339] Updated weights for policy 0, policy_version 606998 (0.0030) [2024-04-28 12:20:07,169][57108] Fps is (10 sec: 54067.0, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 9945186304. Throughput: 0: 55721.4. Samples: 435480120. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:07,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 12:20:07,546][57339] Updated weights for policy 0, policy_version 607008 (0.0028) [2024-04-28 12:20:08,518][57319] Signal inference workers to stop experience collection... (6300 times) [2024-04-28 12:20:08,522][57319] Signal inference workers to resume experience collection... (6300 times) [2024-04-28 12:20:08,532][57339] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-04-28 12:20:08,532][57339] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-04-28 12:20:11,242][57339] Updated weights for policy 0, policy_version 607018 (0.0030) [2024-04-28 12:20:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 9945464832. Throughput: 0: 55611.2. Samples: 435811400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:12,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:20:13,444][57339] Updated weights for policy 0, policy_version 607028 (0.0032) [2024-04-28 12:20:17,169][57108] Fps is (10 sec: 50790.9, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9945694208. Throughput: 0: 55615.7. Samples: 436141760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:17,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 12:20:17,184][57339] Updated weights for policy 0, policy_version 607038 (0.0031) [2024-04-28 12:20:19,311][57339] Updated weights for policy 0, policy_version 607048 (0.0027) [2024-04-28 12:20:22,169][57108] Fps is (10 sec: 50790.8, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 9945972736. Throughput: 0: 55309.8. Samples: 436299280. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:22,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 12:20:23,060][57339] Updated weights for policy 0, policy_version 607058 (0.0026) [2024-04-28 12:20:25,387][57339] Updated weights for policy 0, policy_version 607068 (0.0035) [2024-04-28 12:20:27,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 9946284032. Throughput: 0: 55197.7. Samples: 436628880. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:27,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 12:20:28,902][57339] Updated weights for policy 0, policy_version 607078 (0.0024) [2024-04-28 12:20:31,325][57339] Updated weights for policy 0, policy_version 607088 (0.0026) [2024-04-28 12:20:32,169][57108] Fps is (10 sec: 62258.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 9946595328. Throughput: 0: 55295.0. Samples: 436962900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:32,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 12:20:34,867][57339] Updated weights for policy 0, policy_version 607098 (0.0027) [2024-04-28 12:20:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 9946841088. Throughput: 0: 55703.2. Samples: 437144720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:37,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 12:20:37,211][57339] Updated weights for policy 0, policy_version 607108 (0.0034) [2024-04-28 12:20:40,628][57339] Updated weights for policy 0, policy_version 607118 (0.0027) [2024-04-28 12:20:42,169][57108] Fps is (10 sec: 50791.1, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 9947103232. Throughput: 0: 55646.3. Samples: 437472460. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:42,169][57108] Avg episode reward: [(0, '0.500')] [2024-04-28 12:20:43,082][57339] Updated weights for policy 0, policy_version 607128 (0.0036) [2024-04-28 12:20:46,390][57339] Updated weights for policy 0, policy_version 607138 (0.0026) [2024-04-28 12:20:47,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 9947381760. Throughput: 0: 55492.1. Samples: 437800860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:47,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 12:20:49,088][57339] Updated weights for policy 0, policy_version 607148 (0.0031) [2024-04-28 12:20:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 9947660288. Throughput: 0: 55124.1. Samples: 437960700. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:52,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 12:20:52,354][57339] Updated weights for policy 0, policy_version 607158 (0.0033) [2024-04-28 12:20:54,935][57339] Updated weights for policy 0, policy_version 607168 (0.0028) [2024-04-28 12:20:57,169][57108] Fps is (10 sec: 54066.2, 60 sec: 54613.2, 300 sec: 55483.4). Total num frames: 9947922432. Throughput: 0: 55115.9. Samples: 438291620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:20:57,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 12:20:58,288][57339] Updated weights for policy 0, policy_version 607178 (0.0029) [2024-04-28 12:21:00,905][57339] Updated weights for policy 0, policy_version 607188 (0.0028) [2024-04-28 12:21:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 9948217344. Throughput: 0: 54989.7. Samples: 438616300. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:21:02,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 12:21:04,279][57339] Updated weights for policy 0, policy_version 607198 (0.0030) [2024-04-28 12:21:06,716][57319] Signal inference workers to stop experience collection... (6350 times) [2024-04-28 12:21:06,716][57319] Signal inference workers to resume experience collection... (6350 times) [2024-04-28 12:21:06,744][57339] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-04-28 12:21:06,744][57339] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-04-28 12:21:06,829][57339] Updated weights for policy 0, policy_version 607208 (0.0030) [2024-04-28 12:21:07,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 9948512256. Throughput: 0: 55419.0. Samples: 438793140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:21:07,170][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 12:21:10,081][57339] Updated weights for policy 0, policy_version 607218 (0.0030) [2024-04-28 12:21:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 9948790784. Throughput: 0: 55605.4. Samples: 439131120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:21:12,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 12:21:12,764][57339] Updated weights for policy 0, policy_version 607228 (0.0032) [2024-04-28 12:21:15,908][57339] Updated weights for policy 0, policy_version 607238 (0.0028) [2024-04-28 12:21:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 9949052928. Throughput: 0: 55565.4. Samples: 439463340. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-04-28 12:21:17,170][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:21:18,854][57339] Updated weights for policy 0, policy_version 607248 (0.0027) [2024-04-28 12:21:21,854][57339] Updated weights for policy 0, policy_version 607258 (0.0026) [2024-04-28 12:21:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 9949331456. Throughput: 0: 54976.9. Samples: 439618680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:21:22,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 12:21:24,722][57339] Updated weights for policy 0, policy_version 607268 (0.0028) [2024-04-28 12:21:27,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 9949593600. Throughput: 0: 55193.5. Samples: 439956180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:21:27,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 12:21:27,202][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000607276_9949609984.pth... [2024-04-28 12:21:27,254][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000606463_9936289792.pth [2024-04-28 12:21:27,737][57339] Updated weights for policy 0, policy_version 607278 (0.0027) [2024-04-28 12:21:30,581][57339] Updated weights for policy 0, policy_version 607288 (0.0031) [2024-04-28 12:21:32,169][57108] Fps is (10 sec: 54068.1, 60 sec: 54613.5, 300 sec: 55483.5). Total num frames: 9949872128. Throughput: 0: 55259.2. Samples: 440287520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:21:32,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 12:21:33,583][57339] Updated weights for policy 0, policy_version 607298 (0.0029) [2024-04-28 12:21:36,538][57339] Updated weights for policy 0, policy_version 607308 (0.0028) [2024-04-28 12:21:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55427.9). 
Total num frames: 9950150656. Throughput: 0: 55340.3. Samples: 440451020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:21:37,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 12:21:39,459][57339] Updated weights for policy 0, policy_version 607318 (0.0027) [2024-04-28 12:21:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 9950429184. Throughput: 0: 55491.4. Samples: 440788720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:21:42,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 12:21:42,333][57339] Updated weights for policy 0, policy_version 607328 (0.0028) [2024-04-28 12:21:45,319][57339] Updated weights for policy 0, policy_version 607338 (0.0030) [2024-04-28 12:21:47,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 9950724096. Throughput: 0: 55708.8. Samples: 441123200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:21:47,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 12:21:48,320][57339] Updated weights for policy 0, policy_version 607348 (0.0025) [2024-04-28 12:21:51,291][57339] Updated weights for policy 0, policy_version 607358 (0.0031) [2024-04-28 12:21:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 9950986240. Throughput: 0: 55410.2. Samples: 441286600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:21:52,170][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 12:21:54,269][57339] Updated weights for policy 0, policy_version 607368 (0.0028) [2024-04-28 12:21:57,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 9951264768. Throughput: 0: 55311.1. Samples: 441620120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:21:57,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 12:21:57,244][57339] Updated weights for policy 0, policy_version 607378 (0.0031) [2024-04-28 12:22:00,096][57339] Updated weights for policy 0, policy_version 607388 (0.0025) [2024-04-28 12:22:02,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 9951543296. Throughput: 0: 55488.8. Samples: 441960340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:02,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 12:22:02,910][57339] Updated weights for policy 0, policy_version 607398 (0.0027) [2024-04-28 12:22:06,009][57339] Updated weights for policy 0, policy_version 607408 (0.0028) [2024-04-28 12:22:07,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 9951838208. Throughput: 0: 55680.5. Samples: 442124300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:07,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:22:08,878][57339] Updated weights for policy 0, policy_version 607418 (0.0025) [2024-04-28 12:22:11,880][57339] Updated weights for policy 0, policy_version 607428 (0.0035) [2024-04-28 12:22:12,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.3, 300 sec: 55594.5). Total num frames: 9952116736. Throughput: 0: 55589.3. Samples: 442457700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:12,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 12:22:14,423][57319] Signal inference workers to stop experience collection... (6400 times) [2024-04-28 12:22:14,423][57319] Signal inference workers to resume experience collection... 
(6400 times) [2024-04-28 12:22:14,437][57339] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-04-28 12:22:14,437][57339] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-04-28 12:22:14,676][57339] Updated weights for policy 0, policy_version 607438 (0.0036) [2024-04-28 12:22:17,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 9952362496. Throughput: 0: 55653.2. Samples: 442791920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:17,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 12:22:17,844][57339] Updated weights for policy 0, policy_version 607448 (0.0031) [2024-04-28 12:22:20,591][57339] Updated weights for policy 0, policy_version 607458 (0.0029) [2024-04-28 12:22:22,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9952673792. Throughput: 0: 55672.4. Samples: 442956280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:22,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 12:22:23,672][57339] Updated weights for policy 0, policy_version 607468 (0.0027) [2024-04-28 12:22:26,456][57339] Updated weights for policy 0, policy_version 607478 (0.0028) [2024-04-28 12:22:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 9952935936. Throughput: 0: 55815.9. Samples: 443300440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:27,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 12:22:29,430][57339] Updated weights for policy 0, policy_version 607488 (0.0037) [2024-04-28 12:22:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 9953230848. Throughput: 0: 55821.0. Samples: 443635140. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:32,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 12:22:32,603][57339] Updated weights for policy 0, policy_version 607498 (0.0025) [2024-04-28 12:22:35,430][57339] Updated weights for policy 0, policy_version 607508 (0.0030) [2024-04-28 12:22:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 9953492992. Throughput: 0: 55765.3. Samples: 443796040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:37,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 12:22:38,466][57339] Updated weights for policy 0, policy_version 607518 (0.0032) [2024-04-28 12:22:41,335][57339] Updated weights for policy 0, policy_version 607528 (0.0032) [2024-04-28 12:22:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 9953771520. Throughput: 0: 55890.6. Samples: 444135200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:42,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 12:22:44,201][57339] Updated weights for policy 0, policy_version 607538 (0.0031) [2024-04-28 12:22:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9954050048. Throughput: 0: 55792.9. Samples: 444471020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 19.0) [2024-04-28 12:22:47,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 12:22:47,268][57339] Updated weights for policy 0, policy_version 607548 (0.0029) [2024-04-28 12:22:49,903][57339] Updated weights for policy 0, policy_version 607558 (0.0038) [2024-04-28 12:22:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.7, 300 sec: 55483.4). Total num frames: 9954312192. Throughput: 0: 55701.0. Samples: 444630840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:22:52,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 12:22:53,095][57339] Updated weights for policy 0, policy_version 607568 (0.0029) [2024-04-28 12:22:55,878][57339] Updated weights for policy 0, policy_version 607578 (0.0026) [2024-04-28 12:22:57,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9954607104. Throughput: 0: 55665.6. Samples: 444962640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:22:57,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 12:22:59,094][57339] Updated weights for policy 0, policy_version 607588 (0.0030) [2024-04-28 12:23:01,746][57339] Updated weights for policy 0, policy_version 607598 (0.0028) [2024-04-28 12:23:02,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 9954902016. Throughput: 0: 55793.7. Samples: 445302640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:02,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 12:23:04,931][57339] Updated weights for policy 0, policy_version 607608 (0.0030) [2024-04-28 12:23:07,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9955180544. Throughput: 0: 55912.5. Samples: 445472340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:07,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:23:07,600][57339] Updated weights for policy 0, policy_version 607618 (0.0034) [2024-04-28 12:23:10,830][57339] Updated weights for policy 0, policy_version 607628 (0.0028) [2024-04-28 12:23:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 9955442688. Throughput: 0: 55768.9. Samples: 445810040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:12,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 12:23:13,379][57339] Updated weights for policy 0, policy_version 607638 (0.0026) [2024-04-28 12:23:16,647][57339] Updated weights for policy 0, policy_version 607648 (0.0030) [2024-04-28 12:23:17,169][57108] Fps is (10 sec: 55705.0, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 9955737600. Throughput: 0: 55714.5. Samples: 446142300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:17,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 12:23:19,270][57339] Updated weights for policy 0, policy_version 607658 (0.0029) [2024-04-28 12:23:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 9955999744. Throughput: 0: 55749.8. Samples: 446304780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:22,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 12:23:22,537][57339] Updated weights for policy 0, policy_version 607668 (0.0026) [2024-04-28 12:23:24,352][57319] Signal inference workers to stop experience collection... (6450 times) [2024-04-28 12:23:24,393][57339] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-04-28 12:23:24,443][57319] Signal inference workers to resume experience collection... (6450 times) [2024-04-28 12:23:24,443][57339] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-04-28 12:23:25,260][57339] Updated weights for policy 0, policy_version 607678 (0.0033) [2024-04-28 12:23:27,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9956278272. Throughput: 0: 55771.9. Samples: 446644940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:27,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 12:23:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000607683_9956278272.pth... 
[2024-04-28 12:23:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000606871_9942974464.pth [2024-04-28 12:23:28,274][57339] Updated weights for policy 0, policy_version 607688 (0.0033) [2024-04-28 12:23:30,955][57339] Updated weights for policy 0, policy_version 607698 (0.0027) [2024-04-28 12:23:32,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 9956573184. Throughput: 0: 55669.9. Samples: 446976160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:32,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 12:23:34,177][57339] Updated weights for policy 0, policy_version 607708 (0.0026) [2024-04-28 12:23:36,837][57339] Updated weights for policy 0, policy_version 607718 (0.0029) [2024-04-28 12:23:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 9956851712. Throughput: 0: 55862.8. Samples: 447144680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:37,170][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 12:23:40,001][57339] Updated weights for policy 0, policy_version 607728 (0.0025) [2024-04-28 12:23:42,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 9957130240. Throughput: 0: 55893.2. Samples: 447477840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:42,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 12:23:42,776][57339] Updated weights for policy 0, policy_version 607738 (0.0037) [2024-04-28 12:23:46,005][57339] Updated weights for policy 0, policy_version 607748 (0.0028) [2024-04-28 12:23:47,169][57108] Fps is (10 sec: 54068.5, 60 sec: 55705.8, 300 sec: 55483.5). Total num frames: 9957392384. Throughput: 0: 55773.0. Samples: 447812420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:47,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 12:23:48,688][57339] Updated weights for policy 0, policy_version 607758 (0.0027) [2024-04-28 12:23:51,799][57339] Updated weights for policy 0, policy_version 607768 (0.0032) [2024-04-28 12:23:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 55483.4). Total num frames: 9957670912. Throughput: 0: 55598.7. Samples: 447974280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:52,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:23:54,478][57339] Updated weights for policy 0, policy_version 607778 (0.0029) [2024-04-28 12:23:57,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 9957933056. Throughput: 0: 55563.2. Samples: 448310380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:23:57,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 12:23:57,817][57339] Updated weights for policy 0, policy_version 607788 (0.0035) [2024-04-28 12:24:00,315][57339] Updated weights for policy 0, policy_version 607798 (0.0032) [2024-04-28 12:24:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 9958227968. Throughput: 0: 55582.9. Samples: 448643520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:24:02,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 12:24:03,703][57339] Updated weights for policy 0, policy_version 607808 (0.0029) [2024-04-28 12:24:06,671][57339] Updated weights for policy 0, policy_version 607818 (0.0026) [2024-04-28 12:24:07,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 9958522880. Throughput: 0: 55663.2. Samples: 448809620. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:24:07,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 12:24:09,575][57339] Updated weights for policy 0, policy_version 607828 (0.0034) [2024-04-28 12:24:12,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 9958801408. Throughput: 0: 55732.2. Samples: 449152880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:24:12,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 12:24:12,344][57339] Updated weights for policy 0, policy_version 607838 (0.0030) [2024-04-28 12:24:15,377][57339] Updated weights for policy 0, policy_version 607848 (0.0028) [2024-04-28 12:24:17,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9959079936. Throughput: 0: 55762.0. Samples: 449485460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:24:17,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:24:18,132][57339] Updated weights for policy 0, policy_version 607858 (0.0029) [2024-04-28 12:24:21,171][57339] Updated weights for policy 0, policy_version 607868 (0.0031) [2024-04-28 12:24:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 9959342080. Throughput: 0: 55696.2. Samples: 449651000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 12:24:22,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 12:24:24,033][57319] Signal inference workers to stop experience collection... (6500 times) [2024-04-28 12:24:24,068][57339] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-04-28 12:24:24,121][57319] Signal inference workers to resume experience collection... 
(6500 times) [2024-04-28 12:24:24,122][57339] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-04-28 12:24:24,230][57339] Updated weights for policy 0, policy_version 607878 (0.0025) [2024-04-28 12:24:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 9959620608. Throughput: 0: 55655.5. Samples: 449982340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:24:27,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 12:24:27,208][57339] Updated weights for policy 0, policy_version 607888 (0.0032) [2024-04-28 12:24:30,364][57339] Updated weights for policy 0, policy_version 607898 (0.0027) [2024-04-28 12:24:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 9959899136. Throughput: 0: 55479.9. Samples: 450309020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:24:32,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 12:24:33,092][57339] Updated weights for policy 0, policy_version 607908 (0.0027) [2024-04-28 12:24:36,042][57339] Updated weights for policy 0, policy_version 607918 (0.0030) [2024-04-28 12:24:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 9960177664. Throughput: 0: 55515.1. Samples: 450472460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:24:37,169][57108] Avg episode reward: [(0, '0.512')] [2024-04-28 12:24:38,979][57339] Updated weights for policy 0, policy_version 607928 (0.0027) [2024-04-28 12:24:41,766][57339] Updated weights for policy 0, policy_version 607938 (0.0027) [2024-04-28 12:24:42,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 9960456192. Throughput: 0: 55519.5. Samples: 450808760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:24:42,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 12:24:44,841][57339] Updated weights for policy 0, policy_version 607948 (0.0026) [2024-04-28 12:24:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 9960734720. Throughput: 0: 55492.4. Samples: 451140680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:24:47,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 12:24:47,902][57339] Updated weights for policy 0, policy_version 607958 (0.0029) [2024-04-28 12:24:50,743][57339] Updated weights for policy 0, policy_version 607968 (0.0027) [2024-04-28 12:24:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 9961029632. Throughput: 0: 55530.7. Samples: 451308500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:24:52,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 12:24:53,848][57339] Updated weights for policy 0, policy_version 607978 (0.0033) [2024-04-28 12:24:56,534][57339] Updated weights for policy 0, policy_version 607988 (0.0027) [2024-04-28 12:24:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 9961291776. Throughput: 0: 55338.6. Samples: 451643120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:24:57,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 12:24:59,607][57339] Updated weights for policy 0, policy_version 607998 (0.0028) [2024-04-28 12:25:02,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 9961570304. Throughput: 0: 55379.9. Samples: 451977540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:02,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 12:25:02,504][57339] Updated weights for policy 0, policy_version 608008 (0.0033) [2024-04-28 12:25:05,702][57339] Updated weights for policy 0, policy_version 608018 (0.0033) [2024-04-28 12:25:07,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9961848832. Throughput: 0: 55358.2. Samples: 452142120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:07,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 12:25:08,419][57339] Updated weights for policy 0, policy_version 608028 (0.0031) [2024-04-28 12:25:11,965][57339] Updated weights for policy 0, policy_version 608038 (0.0025) [2024-04-28 12:25:12,169][57108] Fps is (10 sec: 54065.6, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 9962110976. Throughput: 0: 55435.4. Samples: 452476940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:12,170][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 12:25:14,351][57339] Updated weights for policy 0, policy_version 608048 (0.0024) [2024-04-28 12:25:17,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 9962389504. Throughput: 0: 55494.6. Samples: 452806280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:17,170][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 12:25:17,710][57339] Updated weights for policy 0, policy_version 608058 (0.0025) [2024-04-28 12:25:20,188][57339] Updated weights for policy 0, policy_version 608068 (0.0030) [2024-04-28 12:25:22,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 9962684416. Throughput: 0: 55608.9. Samples: 452974860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:22,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 12:25:23,449][57339] Updated weights for policy 0, policy_version 608078 (0.0027) [2024-04-28 12:25:25,587][57319] Signal inference workers to stop experience collection... (6550 times) [2024-04-28 12:25:25,587][57319] Signal inference workers to resume experience collection... (6550 times) [2024-04-28 12:25:25,615][57339] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-04-28 12:25:25,616][57339] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-04-28 12:25:26,134][57339] Updated weights for policy 0, policy_version 608088 (0.0029) [2024-04-28 12:25:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 9962946560. Throughput: 0: 55528.0. Samples: 453307520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:27,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 12:25:27,235][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000608091_9962962944.pth... [2024-04-28 12:25:27,282][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000607276_9949609984.pth [2024-04-28 12:25:29,314][57339] Updated weights for policy 0, policy_version 608098 (0.0032) [2024-04-28 12:25:32,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 9963225088. Throughput: 0: 55535.4. Samples: 453639780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:32,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 12:25:32,209][57339] Updated weights for policy 0, policy_version 608108 (0.0031) [2024-04-28 12:25:35,355][57339] Updated weights for policy 0, policy_version 608118 (0.0026) [2024-04-28 12:25:37,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 9963487232. Throughput: 0: 55418.5. Samples: 453802340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:37,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 12:25:38,078][57339] Updated weights for policy 0, policy_version 608128 (0.0026) [2024-04-28 12:25:41,209][57339] Updated weights for policy 0, policy_version 608138 (0.0033) [2024-04-28 12:25:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 9963782144. Throughput: 0: 55323.8. Samples: 454132700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:42,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 12:25:43,987][57339] Updated weights for policy 0, policy_version 608148 (0.0029) [2024-04-28 12:25:47,072][57339] Updated weights for policy 0, policy_version 608158 (0.0033) [2024-04-28 12:25:47,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 9964060672. Throughput: 0: 55451.8. Samples: 454472880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:47,170][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 12:25:49,912][57339] Updated weights for policy 0, policy_version 608168 (0.0031) [2024-04-28 12:25:52,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 9964339200. Throughput: 0: 55374.2. Samples: 454633960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:52,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:25:52,976][57339] Updated weights for policy 0, policy_version 608178 (0.0033) [2024-04-28 12:25:55,834][57339] Updated weights for policy 0, policy_version 608188 (0.0026) [2024-04-28 12:25:57,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 9964634112. Throughput: 0: 55422.3. Samples: 454970940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 12:25:57,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 12:25:58,834][57339] Updated weights for policy 0, policy_version 608198 (0.0028) [2024-04-28 12:26:01,669][57339] Updated weights for policy 0, policy_version 608208 (0.0032) [2024-04-28 12:26:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 9964896256. Throughput: 0: 55493.3. Samples: 455303480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:02,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 12:26:04,726][57339] Updated weights for policy 0, policy_version 608218 (0.0028) [2024-04-28 12:26:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9965174784. Throughput: 0: 55536.4. Samples: 455474000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:07,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 12:26:07,472][57339] Updated weights for policy 0, policy_version 608228 (0.0028) [2024-04-28 12:26:10,508][57339] Updated weights for policy 0, policy_version 608238 (0.0032) [2024-04-28 12:26:12,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 9965453312. Throughput: 0: 55588.0. Samples: 455808980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:12,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 12:26:13,555][57339] Updated weights for policy 0, policy_version 608248 (0.0026) [2024-04-28 12:26:16,243][57339] Updated weights for policy 0, policy_version 608258 (0.0028) [2024-04-28 12:26:17,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 9965748224. Throughput: 0: 55634.5. Samples: 456143320. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:17,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 12:26:19,364][57339] Updated weights for policy 0, policy_version 608268 (0.0028) [2024-04-28 12:26:22,095][57339] Updated weights for policy 0, policy_version 608278 (0.0028) [2024-04-28 12:26:22,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 9966026752. Throughput: 0: 55819.2. Samples: 456314200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:22,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:26:25,215][57339] Updated weights for policy 0, policy_version 608288 (0.0027) [2024-04-28 12:26:27,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 9966288896. Throughput: 0: 55865.9. Samples: 456646660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:27,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 12:26:27,979][57339] Updated weights for policy 0, policy_version 608298 (0.0026) [2024-04-28 12:26:31,077][57319] Signal inference workers to stop experience collection... (6600 times) [2024-04-28 12:26:31,081][57319] Signal inference workers to resume experience collection... (6600 times) [2024-04-28 12:26:31,110][57339] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-04-28 12:26:31,110][57339] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-04-28 12:26:31,191][57339] Updated weights for policy 0, policy_version 608308 (0.0031) [2024-04-28 12:26:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 9966567424. Throughput: 0: 55794.2. Samples: 456983620. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:32,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 12:26:33,943][57339] Updated weights for policy 0, policy_version 608318 (0.0035) [2024-04-28 12:26:37,062][57339] Updated weights for policy 0, policy_version 608328 (0.0027) [2024-04-28 12:26:37,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 9966862336. Throughput: 0: 55875.2. Samples: 457148340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:37,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 12:26:39,695][57339] Updated weights for policy 0, policy_version 608338 (0.0031) [2024-04-28 12:26:42,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 9967124480. Throughput: 0: 55848.0. Samples: 457484100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:42,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 12:26:43,030][57339] Updated weights for policy 0, policy_version 608348 (0.0027) [2024-04-28 12:26:45,647][57339] Updated weights for policy 0, policy_version 608358 (0.0033) [2024-04-28 12:26:47,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 9967403008. Throughput: 0: 55715.9. Samples: 457810700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:47,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 12:26:49,045][57339] Updated weights for policy 0, policy_version 608368 (0.0029) [2024-04-28 12:26:51,602][57339] Updated weights for policy 0, policy_version 608378 (0.0035) [2024-04-28 12:26:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 9967697920. Throughput: 0: 55738.2. Samples: 457982220. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:52,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 12:26:54,883][57339] Updated weights for policy 0, policy_version 608388 (0.0026) [2024-04-28 12:26:57,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 9967960064. Throughput: 0: 55709.3. Samples: 458315900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:26:57,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 12:26:57,355][57339] Updated weights for policy 0, policy_version 608398 (0.0027) [2024-04-28 12:27:00,690][57339] Updated weights for policy 0, policy_version 608408 (0.0032) [2024-04-28 12:27:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9968238592. Throughput: 0: 55834.1. Samples: 458655860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:27:02,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:27:03,147][57339] Updated weights for policy 0, policy_version 608418 (0.0029) [2024-04-28 12:27:06,569][57339] Updated weights for policy 0, policy_version 608428 (0.0025) [2024-04-28 12:27:07,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9968500736. Throughput: 0: 55572.4. Samples: 458814960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:27:07,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 12:27:09,354][57339] Updated weights for policy 0, policy_version 608438 (0.0033) [2024-04-28 12:27:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 9968779264. Throughput: 0: 55449.8. Samples: 459141900. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:27:12,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 12:27:12,506][57339] Updated weights for policy 0, policy_version 608448 (0.0038) [2024-04-28 12:27:15,134][57339] Updated weights for policy 0, policy_version 608458 (0.0024) [2024-04-28 12:27:17,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 9969074176. Throughput: 0: 55360.1. Samples: 459474820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:27:17,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 12:27:18,335][57339] Updated weights for policy 0, policy_version 608468 (0.0024) [2024-04-28 12:27:20,889][57339] Updated weights for policy 0, policy_version 608478 (0.0033) [2024-04-28 12:27:22,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 9969352704. Throughput: 0: 55523.9. Samples: 459646920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:27:22,169][57108] Avg episode reward: [(0, '0.511')] [2024-04-28 12:27:24,151][57339] Updated weights for policy 0, policy_version 608488 (0.0027) [2024-04-28 12:27:26,752][57339] Updated weights for policy 0, policy_version 608498 (0.0033) [2024-04-28 12:27:27,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 9969647616. Throughput: 0: 55455.5. Samples: 459979600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:27:27,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 12:27:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000608499_9969647616.pth... [2024-04-28 12:27:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000607683_9956278272.pth [2024-04-28 12:27:29,697][57319] Signal inference workers to stop experience collection... (6650 times) [2024-04-28 12:27:29,698][57319] Signal inference workers to resume experience collection... 
(6650 times) [2024-04-28 12:27:29,713][57339] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-04-28 12:27:29,713][57339] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-04-28 12:27:30,299][57339] Updated weights for policy 0, policy_version 608508 (0.0038) [2024-04-28 12:27:32,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 9969909760. Throughput: 0: 55533.1. Samples: 460309680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-04-28 12:27:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 12:27:32,703][57339] Updated weights for policy 0, policy_version 608518 (0.0024) [2024-04-28 12:27:36,264][57339] Updated weights for policy 0, policy_version 608528 (0.0037) [2024-04-28 12:27:37,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 9970171904. Throughput: 0: 55425.8. Samples: 460476380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:27:37,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 12:27:38,495][57339] Updated weights for policy 0, policy_version 608538 (0.0031) [2024-04-28 12:27:42,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 9970434048. Throughput: 0: 55472.5. Samples: 460812160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:27:42,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 12:27:42,193][57339] Updated weights for policy 0, policy_version 608548 (0.0027) [2024-04-28 12:27:44,270][57339] Updated weights for policy 0, policy_version 608558 (0.0029) [2024-04-28 12:27:47,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 9970712576. Throughput: 0: 55363.2. Samples: 461147200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:27:47,169][57108] Avg episode reward: [(0, '0.717')] [2024-04-28 12:27:48,067][57339] Updated weights for policy 0, policy_version 608568 (0.0034) [2024-04-28 12:27:50,176][57339] Updated weights for policy 0, policy_version 608578 (0.0024) [2024-04-28 12:27:52,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 9971007488. Throughput: 0: 55536.1. Samples: 461314080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:27:52,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:27:53,800][57339] Updated weights for policy 0, policy_version 608588 (0.0032) [2024-04-28 12:27:56,082][57339] Updated weights for policy 0, policy_version 608598 (0.0028) [2024-04-28 12:27:57,169][57108] Fps is (10 sec: 58981.8, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 9971302400. Throughput: 0: 55556.3. Samples: 461641940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:27:57,170][57108] Avg episode reward: [(0, '0.702')] [2024-04-28 12:27:59,722][57339] Updated weights for policy 0, policy_version 608608 (0.0034) [2024-04-28 12:28:01,900][57339] Updated weights for policy 0, policy_version 608618 (0.0029) [2024-04-28 12:28:02,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 9971597312. Throughput: 0: 55497.0. Samples: 461972180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:02,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 12:28:05,559][57339] Updated weights for policy 0, policy_version 608628 (0.0027) [2024-04-28 12:28:07,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 9971859456. Throughput: 0: 55642.4. Samples: 462150820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:07,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 12:28:07,817][57339] Updated weights for policy 0, policy_version 608638 (0.0027) [2024-04-28 12:28:11,423][57339] Updated weights for policy 0, policy_version 608648 (0.0028) [2024-04-28 12:28:12,170][57108] Fps is (10 sec: 50784.9, 60 sec: 55431.6, 300 sec: 55483.3). Total num frames: 9972105216. Throughput: 0: 55663.8. Samples: 462484520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:12,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 12:28:13,709][57339] Updated weights for policy 0, policy_version 608658 (0.0027) [2024-04-28 12:28:17,169][57108] Fps is (10 sec: 52427.7, 60 sec: 55159.3, 300 sec: 55539.0). Total num frames: 9972383744. Throughput: 0: 55702.4. Samples: 462816300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:17,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 12:28:17,546][57339] Updated weights for policy 0, policy_version 608668 (0.0030) [2024-04-28 12:28:19,760][57339] Updated weights for policy 0, policy_version 608678 (0.0031) [2024-04-28 12:28:22,169][57108] Fps is (10 sec: 54073.0, 60 sec: 54886.5, 300 sec: 55483.5). Total num frames: 9972645888. Throughput: 0: 55321.9. Samples: 462965860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:22,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 12:28:23,388][57339] Updated weights for policy 0, policy_version 608688 (0.0028) [2024-04-28 12:28:24,950][57319] Signal inference workers to stop experience collection... (6700 times) [2024-04-28 12:28:24,951][57319] Signal inference workers to resume experience collection... 
(6700 times) [2024-04-28 12:28:24,969][57339] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-04-28 12:28:24,969][57339] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-04-28 12:28:25,706][57339] Updated weights for policy 0, policy_version 608698 (0.0030) [2024-04-28 12:28:27,169][57108] Fps is (10 sec: 55706.7, 60 sec: 54886.5, 300 sec: 55483.4). Total num frames: 9972940800. Throughput: 0: 55271.1. Samples: 463299360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:27,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 12:28:29,276][57339] Updated weights for policy 0, policy_version 608708 (0.0028) [2024-04-28 12:28:31,661][57339] Updated weights for policy 0, policy_version 608718 (0.0027) [2024-04-28 12:28:32,169][57108] Fps is (10 sec: 60620.6, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 9973252096. Throughput: 0: 55329.8. Samples: 463637040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:32,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 12:28:35,217][57339] Updated weights for policy 0, policy_version 608728 (0.0029) [2024-04-28 12:28:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 9973514240. Throughput: 0: 55444.5. Samples: 463809080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:37,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 12:28:37,871][57339] Updated weights for policy 0, policy_version 608738 (0.0024) [2024-04-28 12:28:40,956][57339] Updated weights for policy 0, policy_version 608748 (0.0030) [2024-04-28 12:28:42,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9973776384. Throughput: 0: 55525.0. Samples: 464140560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:42,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 12:28:43,877][57339] Updated weights for policy 0, policy_version 608758 (0.0035) [2024-04-28 12:28:46,805][57339] Updated weights for policy 0, policy_version 608768 (0.0026) [2024-04-28 12:28:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 9974071296. Throughput: 0: 55567.5. Samples: 464472720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:47,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 12:28:49,631][57339] Updated weights for policy 0, policy_version 608778 (0.0030) [2024-04-28 12:28:52,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 9974333440. Throughput: 0: 55204.3. Samples: 464635020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:52,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 12:28:52,723][57339] Updated weights for policy 0, policy_version 608788 (0.0029) [2024-04-28 12:28:55,465][57339] Updated weights for policy 0, policy_version 608798 (0.0034) [2024-04-28 12:28:57,169][57108] Fps is (10 sec: 52428.7, 60 sec: 54886.5, 300 sec: 55483.4). Total num frames: 9974595584. Throughput: 0: 55217.2. Samples: 464969240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:28:57,169][57108] Avg episode reward: [(0, '0.477')] [2024-04-28 12:28:58,686][57339] Updated weights for policy 0, policy_version 608808 (0.0029) [2024-04-28 12:29:01,317][57339] Updated weights for policy 0, policy_version 608818 (0.0026) [2024-04-28 12:29:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.2, 300 sec: 55483.4). Total num frames: 9974890496. Throughput: 0: 55253.9. Samples: 465302720. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:29:02,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 12:29:04,751][57339] Updated weights for policy 0, policy_version 608828 (0.0031) [2024-04-28 12:29:07,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9975185408. Throughput: 0: 55715.5. Samples: 465473060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:29:07,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 12:29:07,189][57339] Updated weights for policy 0, policy_version 608838 (0.0037) [2024-04-28 12:29:10,500][57339] Updated weights for policy 0, policy_version 608848 (0.0032) [2024-04-28 12:29:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55979.5, 300 sec: 55539.0). Total num frames: 9975463936. Throughput: 0: 55573.1. Samples: 465800160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:12,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 12:29:13,190][57339] Updated weights for policy 0, policy_version 608858 (0.0030) [2024-04-28 12:29:16,444][57339] Updated weights for policy 0, policy_version 608868 (0.0027) [2024-04-28 12:29:17,169][57108] Fps is (10 sec: 57343.2, 60 sec: 56251.8, 300 sec: 55650.0). Total num frames: 9975758848. Throughput: 0: 55607.8. Samples: 466139400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:17,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 12:29:19,078][57339] Updated weights for policy 0, policy_version 608878 (0.0031) [2024-04-28 12:29:22,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 9976004608. Throughput: 0: 55467.6. Samples: 466305120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:22,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 12:29:22,230][57339] Updated weights for policy 0, policy_version 608888 (0.0032) [2024-04-28 12:29:22,657][57319] Signal inference workers to stop experience collection... 
(6750 times) [2024-04-28 12:29:22,657][57319] Signal inference workers to resume experience collection... (6750 times) [2024-04-28 12:29:22,687][57339] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-04-28 12:29:22,687][57339] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-04-28 12:29:25,738][57339] Updated weights for policy 0, policy_version 608898 (0.0027) [2024-04-28 12:29:27,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9976283136. Throughput: 0: 55650.2. Samples: 466644820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:27,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 12:29:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000608904_9976283136.pth... [2024-04-28 12:29:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000608091_9962962944.pth [2024-04-28 12:29:28,084][57339] Updated weights for policy 0, policy_version 608908 (0.0025) [2024-04-28 12:29:31,565][57339] Updated weights for policy 0, policy_version 608918 (0.0032) [2024-04-28 12:29:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.4, 300 sec: 55483.5). Total num frames: 9976545280. Throughput: 0: 55583.9. Samples: 466974000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:32,170][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 12:29:34,148][57339] Updated weights for policy 0, policy_version 608928 (0.0028) [2024-04-28 12:29:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 9976823808. Throughput: 0: 55527.6. Samples: 467133760. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:37,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 12:29:37,627][57339] Updated weights for policy 0, policy_version 608938 (0.0032) [2024-04-28 12:29:39,996][57339] Updated weights for policy 0, policy_version 608948 (0.0033) [2024-04-28 12:29:42,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 9977135104. Throughput: 0: 55432.4. Samples: 467463700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:42,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 12:29:43,467][57339] Updated weights for policy 0, policy_version 608958 (0.0032) [2024-04-28 12:29:45,987][57339] Updated weights for policy 0, policy_version 608968 (0.0027) [2024-04-28 12:29:47,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 9977413632. Throughput: 0: 55390.2. Samples: 467795280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:47,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:29:49,207][57339] Updated weights for policy 0, policy_version 608978 (0.0024) [2024-04-28 12:29:51,793][57339] Updated weights for policy 0, policy_version 608988 (0.0031) [2024-04-28 12:29:52,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 9977675776. Throughput: 0: 55602.3. Samples: 467975160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:52,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 12:29:54,944][57339] Updated weights for policy 0, policy_version 608998 (0.0032) [2024-04-28 12:29:57,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 9977954304. Throughput: 0: 55699.7. Samples: 468306640. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:29:57,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 12:29:57,721][57339] Updated weights for policy 0, policy_version 609008 (0.0027) [2024-04-28 12:30:01,024][57339] Updated weights for policy 0, policy_version 609018 (0.0028) [2024-04-28 12:30:02,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9978232832. Throughput: 0: 55579.6. Samples: 468640480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:30:02,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 12:30:03,500][57339] Updated weights for policy 0, policy_version 609028 (0.0029) [2024-04-28 12:30:06,960][57339] Updated weights for policy 0, policy_version 609038 (0.0033) [2024-04-28 12:30:07,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 9978494976. Throughput: 0: 55459.5. Samples: 468800800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:30:07,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 12:30:09,462][57339] Updated weights for policy 0, policy_version 609048 (0.0026) [2024-04-28 12:30:12,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54886.5, 300 sec: 55483.4). Total num frames: 9978757120. Throughput: 0: 55216.4. Samples: 469129560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:30:12,170][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 12:30:12,833][57339] Updated weights for policy 0, policy_version 609058 (0.0032) [2024-04-28 12:30:15,242][57339] Updated weights for policy 0, policy_version 609068 (0.0027) [2024-04-28 12:30:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 9979068416. Throughput: 0: 55337.8. Samples: 469464200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:30:17,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 12:30:18,640][57339] Updated weights for policy 0, policy_version 609078 (0.0034) [2024-04-28 12:30:21,180][57339] Updated weights for policy 0, policy_version 609088 (0.0027) [2024-04-28 12:30:22,169][57108] Fps is (10 sec: 60620.0, 60 sec: 55978.4, 300 sec: 55650.0). Total num frames: 9979363328. Throughput: 0: 55666.0. Samples: 469638740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:30:22,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 12:30:24,542][57319] Signal inference workers to stop experience collection... (6800 times) [2024-04-28 12:30:24,543][57319] Signal inference workers to resume experience collection... (6800 times) [2024-04-28 12:30:24,552][57339] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-04-28 12:30:24,555][57339] Updated weights for policy 0, policy_version 609098 (0.0028) [2024-04-28 12:30:24,572][57339] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-04-28 12:30:27,088][57339] Updated weights for policy 0, policy_version 609108 (0.0034) [2024-04-28 12:30:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 9979625472. Throughput: 0: 55773.0. Samples: 469973480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:30:27,178][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 12:30:30,382][57339] Updated weights for policy 0, policy_version 609118 (0.0025) [2024-04-28 12:30:32,169][57108] Fps is (10 sec: 52430.0, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 9979887616. Throughput: 0: 55783.7. Samples: 470305540. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:30:32,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 12:30:32,995][57339] Updated weights for policy 0, policy_version 609128 (0.0028) [2024-04-28 12:30:36,198][57339] Updated weights for policy 0, policy_version 609138 (0.0030) [2024-04-28 12:30:37,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 9980166144. Throughput: 0: 55282.3. Samples: 470462880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:30:37,170][57108] Avg episode reward: [(0, '0.693')] [2024-04-28 12:30:38,973][57339] Updated weights for policy 0, policy_version 609148 (0.0033) [2024-04-28 12:30:42,037][57339] Updated weights for policy 0, policy_version 609158 (0.0027) [2024-04-28 12:30:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 9980444672. Throughput: 0: 55467.2. Samples: 470802660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 12:30:42,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:30:44,746][57339] Updated weights for policy 0, policy_version 609168 (0.0026) [2024-04-28 12:30:47,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 9980706816. Throughput: 0: 55528.3. Samples: 471139260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:30:47,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 12:30:48,182][57339] Updated weights for policy 0, policy_version 609178 (0.0030) [2024-04-28 12:30:50,705][57339] Updated weights for policy 0, policy_version 609188 (0.0027) [2024-04-28 12:30:52,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 9981018112. Throughput: 0: 55592.8. Samples: 471302480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:30:52,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 12:30:53,928][57339] Updated weights for policy 0, policy_version 609198 (0.0027) [2024-04-28 12:30:56,583][57339] Updated weights for policy 0, policy_version 609208 (0.0031) [2024-04-28 12:30:57,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9981296640. Throughput: 0: 55748.0. Samples: 471638220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:30:57,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:30:59,755][57339] Updated weights for policy 0, policy_version 609218 (0.0039) [2024-04-28 12:31:02,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9981558784. Throughput: 0: 55838.6. Samples: 471976940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:02,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:31:02,494][57339] Updated weights for policy 0, policy_version 609228 (0.0027) [2024-04-28 12:31:05,683][57339] Updated weights for policy 0, policy_version 609238 (0.0028) [2024-04-28 12:31:07,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55538.9). Total num frames: 9981837312. Throughput: 0: 55623.6. Samples: 472141800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:07,170][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 12:31:08,475][57339] Updated weights for policy 0, policy_version 609248 (0.0027) [2024-04-28 12:31:11,451][57339] Updated weights for policy 0, policy_version 609258 (0.0040) [2024-04-28 12:31:12,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 55539.0). Total num frames: 9982132224. Throughput: 0: 55638.2. Samples: 472477200. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:12,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 12:31:14,288][57339] Updated weights for policy 0, policy_version 609268 (0.0025) [2024-04-28 12:31:17,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 9982394368. Throughput: 0: 55675.1. Samples: 472810920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:17,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:31:17,215][57339] Updated weights for policy 0, policy_version 609278 (0.0030) [2024-04-28 12:31:20,222][57339] Updated weights for policy 0, policy_version 609288 (0.0026) [2024-04-28 12:31:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 9982672896. Throughput: 0: 55969.5. Samples: 472981500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:22,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 12:31:23,082][57339] Updated weights for policy 0, policy_version 609298 (0.0022) [2024-04-28 12:31:26,071][57339] Updated weights for policy 0, policy_version 609308 (0.0022) [2024-04-28 12:31:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9982951424. Throughput: 0: 55908.9. Samples: 473318560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:27,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 12:31:27,196][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000609312_9982967808.pth... [2024-04-28 12:31:27,243][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000608499_9969647616.pth [2024-04-28 12:31:28,804][57319] Signal inference workers to stop experience collection... (6850 times) [2024-04-28 12:31:28,856][57339] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-04-28 12:31:28,866][57319] Signal inference workers to resume experience collection... 
(6850 times) [2024-04-28 12:31:28,872][57339] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-04-28 12:31:28,973][57339] Updated weights for policy 0, policy_version 609318 (0.0034) [2024-04-28 12:31:31,820][57339] Updated weights for policy 0, policy_version 609328 (0.0025) [2024-04-28 12:31:32,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 9983229952. Throughput: 0: 55761.6. Samples: 473648520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:32,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 12:31:34,914][57339] Updated weights for policy 0, policy_version 609338 (0.0027) [2024-04-28 12:31:37,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9983508480. Throughput: 0: 55878.6. Samples: 473817020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:37,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 12:31:37,739][57339] Updated weights for policy 0, policy_version 609348 (0.0025) [2024-04-28 12:31:40,734][57339] Updated weights for policy 0, policy_version 609358 (0.0029) [2024-04-28 12:31:42,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 9983787008. Throughput: 0: 55763.5. Samples: 474147580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:42,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:31:43,736][57339] Updated weights for policy 0, policy_version 609368 (0.0028) [2024-04-28 12:31:46,694][57339] Updated weights for policy 0, policy_version 609378 (0.0026) [2024-04-28 12:31:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55483.4). Total num frames: 9984065536. Throughput: 0: 55523.9. Samples: 474475520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:47,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 12:31:49,790][57339] Updated weights for policy 0, policy_version 609388 (0.0034) [2024-04-28 12:31:52,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 9984344064. Throughput: 0: 55609.5. Samples: 474644220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:52,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 12:31:52,535][57339] Updated weights for policy 0, policy_version 609398 (0.0029) [2024-04-28 12:31:55,626][57339] Updated weights for policy 0, policy_version 609408 (0.0031) [2024-04-28 12:31:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9984638976. Throughput: 0: 55609.7. Samples: 474979640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:31:57,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 12:31:58,345][57339] Updated weights for policy 0, policy_version 609418 (0.0028) [2024-04-28 12:32:01,587][57339] Updated weights for policy 0, policy_version 609428 (0.0027) [2024-04-28 12:32:02,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9984884736. Throughput: 0: 55560.3. Samples: 475311140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:32:02,170][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 12:32:04,284][57339] Updated weights for policy 0, policy_version 609438 (0.0029) [2024-04-28 12:32:07,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 9985163264. Throughput: 0: 55263.3. Samples: 475468340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:32:07,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 12:32:07,592][57339] Updated weights for policy 0, policy_version 609448 (0.0027) [2024-04-28 12:32:10,259][57339] Updated weights for policy 0, policy_version 609458 (0.0027) [2024-04-28 12:32:12,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 9985441792. Throughput: 0: 55201.6. Samples: 475802640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:32:12,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 12:32:13,388][57339] Updated weights for policy 0, policy_version 609468 (0.0027) [2024-04-28 12:32:16,249][57339] Updated weights for policy 0, policy_version 609478 (0.0028) [2024-04-28 12:32:17,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 9985736704. Throughput: 0: 55356.7. Samples: 476139580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 12:32:17,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 12:32:19,186][57339] Updated weights for policy 0, policy_version 609488 (0.0028) [2024-04-28 12:32:22,127][57339] Updated weights for policy 0, policy_version 609498 (0.0029) [2024-04-28 12:32:22,169][57108] Fps is (10 sec: 57341.0, 60 sec: 55705.1, 300 sec: 55483.3). Total num frames: 9986015232. Throughput: 0: 55352.3. Samples: 476307900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:32:22,170][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 12:32:25,117][57339] Updated weights for policy 0, policy_version 609508 (0.0030) [2024-04-28 12:32:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 9986293760. Throughput: 0: 55335.6. Samples: 476637680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:32:27,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 12:32:27,981][57339] Updated weights for policy 0, policy_version 609518 (0.0027) [2024-04-28 12:32:31,167][57339] Updated weights for policy 0, policy_version 609528 (0.0023) [2024-04-28 12:32:32,169][57108] Fps is (10 sec: 57347.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 9986588672. Throughput: 0: 55416.2. Samples: 476969240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:32:32,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 12:32:33,801][57339] Updated weights for policy 0, policy_version 609538 (0.0030) [2024-04-28 12:32:36,902][57339] Updated weights for policy 0, policy_version 609548 (0.0029) [2024-04-28 12:32:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 9986850816. Throughput: 0: 55569.7. Samples: 477144860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:32:37,169][57108] Avg episode reward: [(0, '0.514')] [2024-04-28 12:32:39,625][57339] Updated weights for policy 0, policy_version 609558 (0.0037) [2024-04-28 12:32:42,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 9987112960. Throughput: 0: 55582.2. Samples: 477480840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:32:42,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 12:32:42,375][57319] Signal inference workers to stop experience collection... (6900 times) [2024-04-28 12:32:42,379][57319] Signal inference workers to resume experience collection... 
(6900 times) [2024-04-28 12:32:42,404][57339] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-04-28 12:32:42,404][57339] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-04-28 12:32:42,666][57339] Updated weights for policy 0, policy_version 609568 (0.0030) [2024-04-28 12:32:45,348][57339] Updated weights for policy 0, policy_version 609578 (0.0031) [2024-04-28 12:32:47,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9987407872. Throughput: 0: 55627.2. Samples: 477814360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:32:47,169][57108] Avg episode reward: [(0, '0.716')] [2024-04-28 12:32:48,502][57339] Updated weights for policy 0, policy_version 609588 (0.0028) [2024-04-28 12:32:51,284][57339] Updated weights for policy 0, policy_version 609598 (0.0035) [2024-04-28 12:32:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 9987670016. Throughput: 0: 55834.8. Samples: 477980920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:32:52,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 12:32:54,419][57339] Updated weights for policy 0, policy_version 609608 (0.0030) [2024-04-28 12:32:57,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 9987964928. Throughput: 0: 55825.0. Samples: 478314760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:32:57,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 12:32:57,180][57339] Updated weights for policy 0, policy_version 609618 (0.0027) [2024-04-28 12:33:00,276][57339] Updated weights for policy 0, policy_version 609628 (0.0031) [2024-04-28 12:33:02,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 9988243456. Throughput: 0: 55805.8. Samples: 478650840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:02,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 12:33:03,108][57339] Updated weights for policy 0, policy_version 609638 (0.0031) [2024-04-28 12:33:06,056][57339] Updated weights for policy 0, policy_version 609648 (0.0035) [2024-04-28 12:33:07,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55705.8). Total num frames: 9988538368. Throughput: 0: 55882.1. Samples: 478822560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:07,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 12:33:09,178][57339] Updated weights for policy 0, policy_version 609658 (0.0025) [2024-04-28 12:33:11,811][57339] Updated weights for policy 0, policy_version 609668 (0.0031) [2024-04-28 12:33:12,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 9988800512. Throughput: 0: 55961.8. Samples: 479155960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:12,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 12:33:15,034][57339] Updated weights for policy 0, policy_version 609678 (0.0027) [2024-04-28 12:33:17,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 9989062656. Throughput: 0: 55999.8. Samples: 479489240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:17,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 12:33:17,918][57339] Updated weights for policy 0, policy_version 609688 (0.0025) [2024-04-28 12:33:20,839][57339] Updated weights for policy 0, policy_version 609698 (0.0026) [2024-04-28 12:33:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55433.0, 300 sec: 55594.5). Total num frames: 9989341184. Throughput: 0: 55573.7. Samples: 479645680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:22,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 12:33:23,799][57339] Updated weights for policy 0, policy_version 609708 (0.0033) [2024-04-28 12:33:26,910][57339] Updated weights for policy 0, policy_version 609718 (0.0033) [2024-04-28 12:33:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 9989636096. Throughput: 0: 55629.3. Samples: 479984160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:27,169][57108] Avg episode reward: [(0, '0.708')] [2024-04-28 12:33:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000609719_9989636096.pth... [2024-04-28 12:33:27,232][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000608904_9976283136.pth [2024-04-28 12:33:29,588][57339] Updated weights for policy 0, policy_version 609728 (0.0033) [2024-04-28 12:33:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 9989898240. Throughput: 0: 55651.2. Samples: 480318660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:32,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 12:33:32,912][57339] Updated weights for policy 0, policy_version 609738 (0.0032) [2024-04-28 12:33:35,619][57339] Updated weights for policy 0, policy_version 609748 (0.0026) [2024-04-28 12:33:37,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 9990193152. Throughput: 0: 55776.5. Samples: 480490860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:37,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 12:33:38,843][57339] Updated weights for policy 0, policy_version 609758 (0.0033) [2024-04-28 12:33:41,393][57339] Updated weights for policy 0, policy_version 609768 (0.0030) [2024-04-28 12:33:42,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55594.5). 
Total num frames: 9990471680. Throughput: 0: 55640.8. Samples: 480818600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:42,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 12:33:44,718][57339] Updated weights for policy 0, policy_version 609778 (0.0031) [2024-04-28 12:33:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 9990750208. Throughput: 0: 55553.7. Samples: 481150760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 12:33:47,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 12:33:47,229][57339] Updated weights for policy 0, policy_version 609788 (0.0030) [2024-04-28 12:33:50,593][57339] Updated weights for policy 0, policy_version 609798 (0.0023) [2024-04-28 12:33:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 9991012352. Throughput: 0: 55537.0. Samples: 481321720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:33:52,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 12:33:53,159][57339] Updated weights for policy 0, policy_version 609808 (0.0026) [2024-04-28 12:33:56,506][57339] Updated weights for policy 0, policy_version 609818 (0.0031) [2024-04-28 12:33:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 9991290880. Throughput: 0: 55543.6. Samples: 481655420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:33:57,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 12:33:57,752][57319] Signal inference workers to stop experience collection... (6950 times) [2024-04-28 12:33:57,752][57319] Signal inference workers to resume experience collection... 
(6950 times) [2024-04-28 12:33:57,776][57339] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-04-28 12:33:57,777][57339] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-04-28 12:33:59,149][57339] Updated weights for policy 0, policy_version 609828 (0.0031) [2024-04-28 12:34:02,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 9991553024. Throughput: 0: 55425.2. Samples: 481983380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:02,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:34:02,535][57339] Updated weights for policy 0, policy_version 609838 (0.0033) [2024-04-28 12:34:05,148][57339] Updated weights for policy 0, policy_version 609848 (0.0026) [2024-04-28 12:34:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 9991847936. Throughput: 0: 55493.4. Samples: 482142880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:07,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:34:08,349][57339] Updated weights for policy 0, policy_version 609858 (0.0035) [2024-04-28 12:34:10,928][57339] Updated weights for policy 0, policy_version 609868 (0.0028) [2024-04-28 12:34:12,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 9992126464. Throughput: 0: 55354.2. Samples: 482475100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:12,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 12:34:14,129][57339] Updated weights for policy 0, policy_version 609878 (0.0026) [2024-04-28 12:34:16,738][57339] Updated weights for policy 0, policy_version 609888 (0.0038) [2024-04-28 12:34:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 9992404992. Throughput: 0: 55274.7. Samples: 482806020. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:17,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:34:20,061][57339] Updated weights for policy 0, policy_version 609898 (0.0029) [2024-04-28 12:34:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 9992683520. Throughput: 0: 55390.6. Samples: 482983440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:22,170][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 12:34:22,683][57339] Updated weights for policy 0, policy_version 609908 (0.0033) [2024-04-28 12:34:26,222][57339] Updated weights for policy 0, policy_version 609918 (0.0028) [2024-04-28 12:34:27,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 9992962048. Throughput: 0: 55477.6. Samples: 483315100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:27,170][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 12:34:28,611][57339] Updated weights for policy 0, policy_version 609928 (0.0031) [2024-04-28 12:34:32,060][57339] Updated weights for policy 0, policy_version 609938 (0.0025) [2024-04-28 12:34:32,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 9993224192. Throughput: 0: 55483.6. Samples: 483647520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:32,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:34:34,359][57339] Updated weights for policy 0, policy_version 609948 (0.0032) [2024-04-28 12:34:37,169][57108] Fps is (10 sec: 52430.0, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 9993486336. Throughput: 0: 55114.2. Samples: 483801860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:37,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:34:38,057][57339] Updated weights for policy 0, policy_version 609958 (0.0031) [2024-04-28 12:34:40,380][57339] Updated weights for policy 0, policy_version 609968 (0.0030) [2024-04-28 12:34:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 9993797632. Throughput: 0: 55094.6. Samples: 484134680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:42,178][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 12:34:43,816][57339] Updated weights for policy 0, policy_version 609978 (0.0027) [2024-04-28 12:34:46,450][57339] Updated weights for policy 0, policy_version 609988 (0.0028) [2024-04-28 12:34:47,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55159.4, 300 sec: 55538.9). Total num frames: 9994059776. Throughput: 0: 55164.5. Samples: 484465780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:47,178][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 12:34:49,734][57339] Updated weights for policy 0, policy_version 609998 (0.0030) [2024-04-28 12:34:52,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9994354688. Throughput: 0: 55498.3. Samples: 484640300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:52,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:34:52,175][57339] Updated weights for policy 0, policy_version 610008 (0.0027) [2024-04-28 12:34:55,530][57339] Updated weights for policy 0, policy_version 610018 (0.0026) [2024-04-28 12:34:57,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 9994616832. Throughput: 0: 55443.4. Samples: 484970060. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:34:57,178][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 12:34:58,160][57339] Updated weights for policy 0, policy_version 610028 (0.0031) [2024-04-28 12:34:59,325][57319] Signal inference workers to stop experience collection... (7000 times) [2024-04-28 12:34:59,326][57319] Signal inference workers to resume experience collection... (7000 times) [2024-04-28 12:34:59,351][57339] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-04-28 12:34:59,351][57339] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-04-28 12:35:01,386][57339] Updated weights for policy 0, policy_version 610038 (0.0029) [2024-04-28 12:35:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 9994895360. Throughput: 0: 55598.3. Samples: 485307940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:35:02,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 12:35:03,915][57339] Updated weights for policy 0, policy_version 610048 (0.0028) [2024-04-28 12:35:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 9995157504. Throughput: 0: 55246.7. Samples: 485469540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:35:07,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 12:35:07,366][57339] Updated weights for policy 0, policy_version 610058 (0.0024) [2024-04-28 12:35:09,772][57339] Updated weights for policy 0, policy_version 610068 (0.0032) [2024-04-28 12:35:12,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 9995436032. Throughput: 0: 55294.4. Samples: 485803340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:35:12,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 12:35:13,294][57339] Updated weights for policy 0, policy_version 610078 (0.0030) [2024-04-28 12:35:15,675][57339] Updated weights for policy 0, policy_version 610088 (0.0028) [2024-04-28 12:35:17,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 9995714560. Throughput: 0: 55312.5. Samples: 486136580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:35:17,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 12:35:19,195][57339] Updated weights for policy 0, policy_version 610098 (0.0027) [2024-04-28 12:35:21,672][57339] Updated weights for policy 0, policy_version 610108 (0.0033) [2024-04-28 12:35:22,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 9996025856. Throughput: 0: 55662.2. Samples: 486306660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 12:35:22,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 12:35:25,102][57339] Updated weights for policy 0, policy_version 610118 (0.0034) [2024-04-28 12:35:27,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 9996304384. Throughput: 0: 55638.4. Samples: 486638400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:35:27,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 12:35:27,176][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000610126_9996304384.pth... [2024-04-28 12:35:27,223][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000609312_9982967808.pth [2024-04-28 12:35:27,531][57339] Updated weights for policy 0, policy_version 610128 (0.0029) [2024-04-28 12:35:30,971][57339] Updated weights for policy 0, policy_version 610138 (0.0032) [2024-04-28 12:35:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55594.6). 
Total num frames: 9996566528. Throughput: 0: 55563.3. Samples: 486966120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:35:32,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 12:35:33,922][57339] Updated weights for policy 0, policy_version 610148 (0.0029) [2024-04-28 12:35:36,846][57339] Updated weights for policy 0, policy_version 610158 (0.0028) [2024-04-28 12:35:37,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 9996845056. Throughput: 0: 55515.9. Samples: 487138520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:35:37,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 12:35:39,779][57339] Updated weights for policy 0, policy_version 610168 (0.0031) [2024-04-28 12:35:42,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 9997123584. Throughput: 0: 55744.1. Samples: 487478540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:35:42,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 12:35:42,677][57339] Updated weights for policy 0, policy_version 610178 (0.0027) [2024-04-28 12:35:45,854][57339] Updated weights for policy 0, policy_version 610188 (0.0030) [2024-04-28 12:35:47,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 9997369344. Throughput: 0: 55735.8. Samples: 487816060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:35:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 12:35:48,614][57339] Updated weights for policy 0, policy_version 610198 (0.0029) [2024-04-28 12:35:51,864][57339] Updated weights for policy 0, policy_version 610208 (0.0026) [2024-04-28 12:35:52,169][57108] Fps is (10 sec: 52429.6, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 9997647872. Throughput: 0: 55516.2. Samples: 487967760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:35:52,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 12:35:54,442][57339] Updated weights for policy 0, policy_version 610218 (0.0035) [2024-04-28 12:35:57,169][57108] Fps is (10 sec: 58983.5, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 9997959168. Throughput: 0: 55505.0. Samples: 488301060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:35:57,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 12:35:57,828][57339] Updated weights for policy 0, policy_version 610228 (0.0029) [2024-04-28 12:36:00,416][57339] Updated weights for policy 0, policy_version 610238 (0.0025) [2024-04-28 12:36:01,297][57319] Signal inference workers to stop experience collection... (7050 times) [2024-04-28 12:36:01,299][57319] Signal inference workers to resume experience collection... (7050 times) [2024-04-28 12:36:01,323][57339] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-04-28 12:36:01,323][57339] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-04-28 12:36:02,169][57108] Fps is (10 sec: 60619.4, 60 sec: 55978.4, 300 sec: 55650.0). Total num frames: 9998254080. Throughput: 0: 55489.9. Samples: 488633640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:02,170][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 12:36:03,731][57339] Updated weights for policy 0, policy_version 610248 (0.0032) [2024-04-28 12:36:06,162][57339] Updated weights for policy 0, policy_version 610258 (0.0030) [2024-04-28 12:36:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.9, 300 sec: 55594.5). Total num frames: 9998532608. Throughput: 0: 55712.0. Samples: 488813700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:07,169][57108] Avg episode reward: [(0, '0.708')] [2024-04-28 12:36:09,485][57339] Updated weights for policy 0, policy_version 610268 (0.0029) [2024-04-28 12:36:12,018][57339] Updated weights for policy 0, policy_version 610278 (0.0028) [2024-04-28 12:36:12,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 9998794752. Throughput: 0: 55886.6. Samples: 489153300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:12,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 12:36:15,516][57339] Updated weights for policy 0, policy_version 610288 (0.0030) [2024-04-28 12:36:17,169][57108] Fps is (10 sec: 52427.6, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 9999056896. Throughput: 0: 55856.6. Samples: 489479680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:17,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 12:36:17,939][57339] Updated weights for policy 0, policy_version 610298 (0.0027) [2024-04-28 12:36:21,354][57339] Updated weights for policy 0, policy_version 610308 (0.0029) [2024-04-28 12:36:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 9999335424. Throughput: 0: 55595.6. Samples: 489640320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:22,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 12:36:23,739][57339] Updated weights for policy 0, policy_version 610318 (0.0031) [2024-04-28 12:36:27,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54886.3, 300 sec: 55483.4). Total num frames: 9999597568. Throughput: 0: 55480.5. Samples: 489975160. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:27,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 12:36:27,441][57339] Updated weights for policy 0, policy_version 610328 (0.0026) [2024-04-28 12:36:29,533][57339] Updated weights for policy 0, policy_version 610338 (0.0029) [2024-04-28 12:36:32,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 9999908864. Throughput: 0: 55448.6. Samples: 490311240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:32,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 12:36:33,363][57339] Updated weights for policy 0, policy_version 610348 (0.0034) [2024-04-28 12:36:35,382][57339] Updated weights for policy 0, policy_version 610358 (0.0030) [2024-04-28 12:36:37,169][57108] Fps is (10 sec: 62259.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10000220160. Throughput: 0: 56035.0. Samples: 490489340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:37,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 12:36:39,102][57339] Updated weights for policy 0, policy_version 610368 (0.0032) [2024-04-28 12:36:41,489][57339] Updated weights for policy 0, policy_version 610378 (0.0027) [2024-04-28 12:36:42,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10000482304. Throughput: 0: 56021.7. Samples: 490822040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:42,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 12:36:44,771][57339] Updated weights for policy 0, policy_version 610388 (0.0025) [2024-04-28 12:36:47,169][57108] Fps is (10 sec: 52429.4, 60 sec: 56251.9, 300 sec: 55594.5). Total num frames: 10000744448. Throughput: 0: 56018.5. Samples: 491154460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:47,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 12:36:47,288][57339] Updated weights for policy 0, policy_version 610398 (0.0030) [2024-04-28 12:36:50,693][57339] Updated weights for policy 0, policy_version 610408 (0.0026) [2024-04-28 12:36:52,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55978.6, 300 sec: 55483.4). Total num frames: 10001006592. Throughput: 0: 55700.4. Samples: 491320220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:52,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:36:52,954][57339] Updated weights for policy 0, policy_version 610418 (0.0028) [2024-04-28 12:36:56,580][57339] Updated weights for policy 0, policy_version 610428 (0.0031) [2024-04-28 12:36:57,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10001285120. Throughput: 0: 55681.3. Samples: 491658960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:36:57,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 12:36:58,460][57319] Signal inference workers to stop experience collection... (7100 times) [2024-04-28 12:36:58,487][57339] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-04-28 12:36:58,511][57319] Signal inference workers to resume experience collection... (7100 times) [2024-04-28 12:36:58,517][57339] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-04-28 12:36:58,750][57339] Updated weights for policy 0, policy_version 610438 (0.0024) [2024-04-28 12:37:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.6, 300 sec: 55539.0). Total num frames: 10001547264. Throughput: 0: 55735.4. Samples: 491987760. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:02,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 12:37:02,518][57339] Updated weights for policy 0, policy_version 610448 (0.0031) [2024-04-28 12:37:04,735][57339] Updated weights for policy 0, policy_version 610458 (0.0032) [2024-04-28 12:37:07,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 10001842176. Throughput: 0: 55795.7. Samples: 492151140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:07,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 12:37:08,468][57339] Updated weights for policy 0, policy_version 610468 (0.0026) [2024-04-28 12:37:10,519][57339] Updated weights for policy 0, policy_version 610478 (0.0031) [2024-04-28 12:37:12,169][57108] Fps is (10 sec: 62259.3, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10002169856. Throughput: 0: 55735.7. Samples: 492483260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:12,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 12:37:14,143][57339] Updated weights for policy 0, policy_version 610488 (0.0035) [2024-04-28 12:37:16,455][57339] Updated weights for policy 0, policy_version 610498 (0.0036) [2024-04-28 12:37:17,169][57108] Fps is (10 sec: 60622.1, 60 sec: 56525.0, 300 sec: 55705.7). Total num frames: 10002448384. Throughput: 0: 55852.4. Samples: 492824600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:17,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 12:37:19,892][57339] Updated weights for policy 0, policy_version 610508 (0.0031) [2024-04-28 12:37:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10002710528. Throughput: 0: 55730.7. Samples: 492997220. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:22,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 12:37:22,403][57339] Updated weights for policy 0, policy_version 610518 (0.0034) [2024-04-28 12:37:25,800][57339] Updated weights for policy 0, policy_version 610528 (0.0028) [2024-04-28 12:37:27,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55978.8, 300 sec: 55483.4). Total num frames: 10002956288. Throughput: 0: 55789.8. Samples: 493332580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:27,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 12:37:27,200][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000610533_10002972672.pth... [2024-04-28 12:37:27,247][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000609719_9989636096.pth [2024-04-28 12:37:28,500][57339] Updated weights for policy 0, policy_version 610538 (0.0031) [2024-04-28 12:37:31,803][57339] Updated weights for policy 0, policy_version 610548 (0.0033) [2024-04-28 12:37:32,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10003234816. Throughput: 0: 55845.3. Samples: 493667500. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:32,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 12:37:34,303][57339] Updated weights for policy 0, policy_version 610558 (0.0026) [2024-04-28 12:37:37,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 10003513344. Throughput: 0: 55583.0. Samples: 493821460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:37,169][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 12:37:37,578][57339] Updated weights for policy 0, policy_version 610568 (0.0030) [2024-04-28 12:37:40,060][57339] Updated weights for policy 0, policy_version 610578 (0.0032) [2024-04-28 12:37:42,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55432.5, 300 sec: 55594.5). 
Total num frames: 10003808256. Throughput: 0: 55579.6. Samples: 494160040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:42,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 12:37:43,731][57339] Updated weights for policy 0, policy_version 610588 (0.0028) [2024-04-28 12:37:46,096][57339] Updated weights for policy 0, policy_version 610598 (0.0033) [2024-04-28 12:37:47,169][57108] Fps is (10 sec: 60621.3, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 10004119552. Throughput: 0: 55616.0. Samples: 494490480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:47,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 12:37:49,634][57339] Updated weights for policy 0, policy_version 610608 (0.0031) [2024-04-28 12:37:51,466][57319] Signal inference workers to stop experience collection... (7150 times) [2024-04-28 12:37:51,507][57339] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-04-28 12:37:51,558][57319] Signal inference workers to resume experience collection... (7150 times) [2024-04-28 12:37:51,558][57339] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-04-28 12:37:52,065][57339] Updated weights for policy 0, policy_version 610618 (0.0028) [2024-04-28 12:37:52,169][57108] Fps is (10 sec: 55703.6, 60 sec: 55978.3, 300 sec: 55594.4). Total num frames: 10004365312. Throughput: 0: 55865.9. Samples: 494665120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:52,170][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 12:37:55,436][57339] Updated weights for policy 0, policy_version 610628 (0.0025) [2024-04-28 12:37:57,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10004627456. Throughput: 0: 55863.5. Samples: 494997120. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:37:57,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 12:37:57,917][57339] Updated weights for policy 0, policy_version 610638 (0.0025) [2024-04-28 12:38:01,299][57339] Updated weights for policy 0, policy_version 610648 (0.0037) [2024-04-28 12:38:02,169][57108] Fps is (10 sec: 54069.2, 60 sec: 55978.6, 300 sec: 55483.4). Total num frames: 10004905984. Throughput: 0: 55768.8. Samples: 495334200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:38:02,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 12:38:03,818][57339] Updated weights for policy 0, policy_version 610658 (0.0030) [2024-04-28 12:38:07,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.8, 300 sec: 55483.5). Total num frames: 10005168128. Throughput: 0: 55253.0. Samples: 495483600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:38:07,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 12:38:07,245][57339] Updated weights for policy 0, policy_version 610668 (0.0027) [2024-04-28 12:38:09,936][57339] Updated weights for policy 0, policy_version 610678 (0.0029) [2024-04-28 12:38:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10005479424. Throughput: 0: 55059.9. Samples: 495810280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:38:12,170][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 12:38:13,081][57339] Updated weights for policy 0, policy_version 610688 (0.0029) [2024-04-28 12:38:15,819][57339] Updated weights for policy 0, policy_version 610698 (0.0027) [2024-04-28 12:38:17,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10005757952. Throughput: 0: 55048.8. Samples: 496144700. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:38:17,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 12:38:19,000][57339] Updated weights for policy 0, policy_version 610708 (0.0026) [2024-04-28 12:38:21,598][57339] Updated weights for policy 0, policy_version 610718 (0.0027) [2024-04-28 12:38:22,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10006036480. Throughput: 0: 55407.1. Samples: 496314780. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:38:22,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 12:38:25,096][57339] Updated weights for policy 0, policy_version 610728 (0.0037) [2024-04-28 12:38:27,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10006282240. Throughput: 0: 55268.6. Samples: 496647120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:38:27,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 12:38:27,547][57339] Updated weights for policy 0, policy_version 610738 (0.0030) [2024-04-28 12:38:30,799][57339] Updated weights for policy 0, policy_version 610748 (0.0030) [2024-04-28 12:38:32,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 10006560768. Throughput: 0: 55332.0. Samples: 496980420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-04-28 12:38:32,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:38:33,507][57339] Updated weights for policy 0, policy_version 610758 (0.0030) [2024-04-28 12:38:36,731][57339] Updated weights for policy 0, policy_version 610768 (0.0026) [2024-04-28 12:38:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 10006839296. Throughput: 0: 54974.8. Samples: 497138960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:38:37,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 12:38:39,454][57339] Updated weights for policy 0, policy_version 610778 (0.0028) [2024-04-28 12:38:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.5, 300 sec: 55427.9). Total num frames: 10007101440. Throughput: 0: 54927.2. Samples: 497468840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:38:42,169][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 12:38:42,579][57339] Updated weights for policy 0, policy_version 610788 (0.0027) [2024-04-28 12:38:45,296][57339] Updated weights for policy 0, policy_version 610798 (0.0026) [2024-04-28 12:38:47,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 10007412736. Throughput: 0: 55083.7. Samples: 497812960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:38:47,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 12:38:48,464][57339] Updated weights for policy 0, policy_version 610808 (0.0032) [2024-04-28 12:38:51,197][57339] Updated weights for policy 0, policy_version 610818 (0.0023) [2024-04-28 12:38:52,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55433.0, 300 sec: 55594.5). Total num frames: 10007691264. Throughput: 0: 55530.2. Samples: 497982460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:38:52,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 12:38:53,974][57319] Signal inference workers to stop experience collection... (7200 times) [2024-04-28 12:38:54,016][57339] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-04-28 12:38:54,023][57319] Signal inference workers to resume experience collection... 
(7200 times) [2024-04-28 12:38:54,031][57339] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-04-28 12:38:54,141][57339] Updated weights for policy 0, policy_version 610828 (0.0025) [2024-04-28 12:38:57,091][57339] Updated weights for policy 0, policy_version 610838 (0.0033) [2024-04-28 12:38:57,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10007969792. Throughput: 0: 55704.4. Samples: 498316980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:38:57,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 12:39:00,013][57339] Updated weights for policy 0, policy_version 610848 (0.0024) [2024-04-28 12:39:02,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 10008215552. Throughput: 0: 55664.9. Samples: 498649620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:02,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 12:39:03,053][57339] Updated weights for policy 0, policy_version 610858 (0.0026) [2024-04-28 12:39:05,877][57339] Updated weights for policy 0, policy_version 610868 (0.0029) [2024-04-28 12:39:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10008510464. Throughput: 0: 55551.1. Samples: 498814580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:07,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 12:39:08,848][57339] Updated weights for policy 0, policy_version 610878 (0.0026) [2024-04-28 12:39:11,799][57339] Updated weights for policy 0, policy_version 610888 (0.0032) [2024-04-28 12:39:12,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10008805376. Throughput: 0: 55656.8. Samples: 499151680. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 12:39:14,755][57339] Updated weights for policy 0, policy_version 610898 (0.0029) [2024-04-28 12:39:17,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10009067520. Throughput: 0: 55685.8. Samples: 499486280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:17,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 12:39:17,686][57339] Updated weights for policy 0, policy_version 610908 (0.0028) [2024-04-28 12:39:20,730][57339] Updated weights for policy 0, policy_version 610918 (0.0026) [2024-04-28 12:39:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10009346048. Throughput: 0: 55940.3. Samples: 499656280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:22,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 12:39:23,453][57339] Updated weights for policy 0, policy_version 610928 (0.0030) [2024-04-28 12:39:26,468][57339] Updated weights for policy 0, policy_version 610938 (0.0027) [2024-04-28 12:39:27,169][57108] Fps is (10 sec: 57342.8, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 10009640960. Throughput: 0: 56022.8. Samples: 499989880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:27,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 12:39:27,244][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000610941_10009657344.pth... [2024-04-28 12:39:27,299][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000610126_9996304384.pth [2024-04-28 12:39:29,337][57339] Updated weights for policy 0, policy_version 610948 (0.0029) [2024-04-28 12:39:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10009919488. Throughput: 0: 55870.0. Samples: 500327120. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:32,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:39:32,272][57339] Updated weights for policy 0, policy_version 610958 (0.0030) [2024-04-28 12:39:35,092][57339] Updated weights for policy 0, policy_version 610968 (0.0035) [2024-04-28 12:39:37,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 10010165248. Throughput: 0: 55706.5. Samples: 500489260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:37,170][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 12:39:38,118][57339] Updated weights for policy 0, policy_version 610978 (0.0023) [2024-04-28 12:39:41,004][57339] Updated weights for policy 0, policy_version 610988 (0.0030) [2024-04-28 12:39:42,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10010460160. Throughput: 0: 55643.6. Samples: 500820940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:42,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 12:39:44,082][57339] Updated weights for policy 0, policy_version 610998 (0.0030) [2024-04-28 12:39:46,888][57339] Updated weights for policy 0, policy_version 611008 (0.0027) [2024-04-28 12:39:47,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 10010755072. Throughput: 0: 55516.3. Samples: 501147860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:47,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 12:39:50,032][57339] Updated weights for policy 0, policy_version 611018 (0.0029) [2024-04-28 12:39:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10011033600. Throughput: 0: 55705.5. Samples: 501321320. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:52,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 12:39:52,767][57339] Updated weights for policy 0, policy_version 611028 (0.0027) [2024-04-28 12:39:55,852][57339] Updated weights for policy 0, policy_version 611038 (0.0026) [2024-04-28 12:39:57,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10011295744. Throughput: 0: 55606.6. Samples: 501653980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:39:57,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:39:58,717][57339] Updated weights for policy 0, policy_version 611048 (0.0027) [2024-04-28 12:40:01,082][57319] Signal inference workers to stop experience collection... (7250 times) [2024-04-28 12:40:01,106][57339] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-04-28 12:40:01,139][57319] Signal inference workers to resume experience collection... (7250 times) [2024-04-28 12:40:01,140][57339] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-04-28 12:40:01,792][57339] Updated weights for policy 0, policy_version 611058 (0.0024) [2024-04-28 12:40:02,169][57108] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10011590656. Throughput: 0: 55583.9. Samples: 501987560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:40:02,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 12:40:04,816][57339] Updated weights for policy 0, policy_version 611068 (0.0026) [2024-04-28 12:40:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10011836416. Throughput: 0: 55381.3. Samples: 502148440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:40:07,170][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 12:40:07,767][57339] Updated weights for policy 0, policy_version 611078 (0.0025) [2024-04-28 12:40:10,628][57339] Updated weights for policy 0, policy_version 611088 (0.0030) [2024-04-28 12:40:12,169][57108] Fps is (10 sec: 50790.7, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 10012098560. Throughput: 0: 55332.6. Samples: 502479840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:12,169][57108] Avg episode reward: [(0, '0.705')] [2024-04-28 12:40:13,571][57339] Updated weights for policy 0, policy_version 611098 (0.0031) [2024-04-28 12:40:16,425][57339] Updated weights for policy 0, policy_version 611108 (0.0027) [2024-04-28 12:40:17,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10012409856. Throughput: 0: 55208.2. Samples: 502811480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:17,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:40:19,448][57339] Updated weights for policy 0, policy_version 611118 (0.0024) [2024-04-28 12:40:22,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10012688384. Throughput: 0: 55319.3. Samples: 502978620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:22,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 12:40:22,411][57339] Updated weights for policy 0, policy_version 611128 (0.0034) [2024-04-28 12:40:25,392][57339] Updated weights for policy 0, policy_version 611138 (0.0031) [2024-04-28 12:40:27,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10012966912. Throughput: 0: 55334.6. Samples: 503311000. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:27,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 12:40:28,483][57339] Updated weights for policy 0, policy_version 611148 (0.0028) [2024-04-28 12:40:31,361][57339] Updated weights for policy 0, policy_version 611158 (0.0027) [2024-04-28 12:40:32,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10013229056. Throughput: 0: 55408.0. Samples: 503641220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:32,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 12:40:34,624][57339] Updated weights for policy 0, policy_version 611168 (0.0031) [2024-04-28 12:40:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 10013523968. Throughput: 0: 55298.2. Samples: 503809740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:37,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 12:40:37,222][57339] Updated weights for policy 0, policy_version 611178 (0.0025) [2024-04-28 12:40:40,467][57339] Updated weights for policy 0, policy_version 611188 (0.0026) [2024-04-28 12:40:42,169][57108] Fps is (10 sec: 52429.5, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 10013753344. Throughput: 0: 55225.4. Samples: 504139120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:42,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 12:40:43,055][57339] Updated weights for policy 0, policy_version 611198 (0.0025) [2024-04-28 12:40:46,503][57339] Updated weights for policy 0, policy_version 611208 (0.0027) [2024-04-28 12:40:47,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 10014048256. Throughput: 0: 55226.7. Samples: 504472760. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:47,169][57108] Avg episode reward: [(0, '0.503')] [2024-04-28 12:40:49,240][57339] Updated weights for policy 0, policy_version 611218 (0.0033) [2024-04-28 12:40:52,169][57108] Fps is (10 sec: 58981.4, 60 sec: 55159.3, 300 sec: 55538.9). Total num frames: 10014343168. Throughput: 0: 55386.5. Samples: 504640840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:52,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 12:40:52,528][57339] Updated weights for policy 0, policy_version 611228 (0.0036) [2024-04-28 12:40:55,506][57339] Updated weights for policy 0, policy_version 611238 (0.0040) [2024-04-28 12:40:57,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 10014621696. Throughput: 0: 55371.2. Samples: 504971540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:40:57,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 12:40:58,469][57339] Updated weights for policy 0, policy_version 611248 (0.0030) [2024-04-28 12:41:01,239][57339] Updated weights for policy 0, policy_version 611258 (0.0031) [2024-04-28 12:41:02,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 10014900224. Throughput: 0: 55308.3. Samples: 505300360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:41:02,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 12:41:04,232][57339] Updated weights for policy 0, policy_version 611268 (0.0031) [2024-04-28 12:41:05,857][57319] Signal inference workers to stop experience collection... (7300 times) [2024-04-28 12:41:05,857][57319] Signal inference workers to resume experience collection... 
(7300 times) [2024-04-28 12:41:05,889][57339] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-04-28 12:41:05,890][57339] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-04-28 12:41:07,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 10015162368. Throughput: 0: 55469.8. Samples: 505474760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:41:07,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 12:41:07,255][57339] Updated weights for policy 0, policy_version 611278 (0.0031) [2024-04-28 12:41:10,191][57339] Updated weights for policy 0, policy_version 611288 (0.0031) [2024-04-28 12:41:12,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10015440896. Throughput: 0: 55438.3. Samples: 505805720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:41:12,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 12:41:13,041][57339] Updated weights for policy 0, policy_version 611298 (0.0028) [2024-04-28 12:41:16,259][57339] Updated weights for policy 0, policy_version 611308 (0.0028) [2024-04-28 12:41:17,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54613.3, 300 sec: 55427.9). Total num frames: 10015686656. Throughput: 0: 55435.3. Samples: 506135800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:41:17,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 12:41:18,888][57339] Updated weights for policy 0, policy_version 611318 (0.0031) [2024-04-28 12:41:22,075][57339] Updated weights for policy 0, policy_version 611328 (0.0025) [2024-04-28 12:41:22,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10015997952. Throughput: 0: 55096.3. Samples: 506289080. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:41:22,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 12:41:24,844][57339] Updated weights for policy 0, policy_version 611338 (0.0027) [2024-04-28 12:41:27,169][57108] Fps is (10 sec: 60620.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10016292864. Throughput: 0: 55211.9. Samples: 506623660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:41:27,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 12:41:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000611346_10016292864.pth... [2024-04-28 12:41:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000610533_10002972672.pth [2024-04-28 12:41:27,890][57339] Updated weights for policy 0, policy_version 611348 (0.0028) [2024-04-28 12:41:30,721][57339] Updated weights for policy 0, policy_version 611358 (0.0027) [2024-04-28 12:41:32,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 10016571392. Throughput: 0: 55236.0. Samples: 506958380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:41:32,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 12:41:33,677][57339] Updated weights for policy 0, policy_version 611368 (0.0028) [2024-04-28 12:41:36,608][57339] Updated weights for policy 0, policy_version 611378 (0.0025) [2024-04-28 12:41:37,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 10016833536. Throughput: 0: 55425.4. Samples: 507134980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:41:37,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 12:41:39,629][57339] Updated weights for policy 0, policy_version 611388 (0.0027) [2024-04-28 12:41:42,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55483.4). Total num frames: 10017112064. Throughput: 0: 55424.0. Samples: 507465620. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 27.0) [2024-04-28 12:41:42,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 12:41:42,334][57339] Updated weights for policy 0, policy_version 611398 (0.0027) [2024-04-28 12:41:45,950][57339] Updated weights for policy 0, policy_version 611408 (0.0029) [2024-04-28 12:41:47,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 10017357824. Throughput: 0: 55611.2. Samples: 507802860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:41:47,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 12:41:48,280][57339] Updated weights for policy 0, policy_version 611418 (0.0029) [2024-04-28 12:41:51,674][57339] Updated weights for policy 0, policy_version 611428 (0.0029) [2024-04-28 12:41:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 54886.6, 300 sec: 55427.9). Total num frames: 10017636352. Throughput: 0: 55255.5. Samples: 507961260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:41:52,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 12:41:54,184][57339] Updated weights for policy 0, policy_version 611438 (0.0027) [2024-04-28 12:41:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10017931264. Throughput: 0: 55297.4. Samples: 508294100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:41:57,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 12:41:57,700][57339] Updated weights for policy 0, policy_version 611448 (0.0029) [2024-04-28 12:41:59,956][57339] Updated weights for policy 0, policy_version 611458 (0.0029) [2024-04-28 12:42:02,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10018226176. Throughput: 0: 55243.0. Samples: 508621740. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:02,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 12:42:03,419][57319] Signal inference workers to stop experience collection... (7350 times) [2024-04-28 12:42:03,428][57339] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-04-28 12:42:03,511][57319] Signal inference workers to resume experience collection... (7350 times) [2024-04-28 12:42:03,511][57339] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-04-28 12:42:03,616][57339] Updated weights for policy 0, policy_version 611468 (0.0030) [2024-04-28 12:42:05,927][57339] Updated weights for policy 0, policy_version 611478 (0.0032) [2024-04-28 12:42:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55372.4). Total num frames: 10018504704. Throughput: 0: 55737.8. Samples: 508797280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:07,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 12:42:09,467][57339] Updated weights for policy 0, policy_version 611488 (0.0022) [2024-04-28 12:42:11,980][57339] Updated weights for policy 0, policy_version 611498 (0.0038) [2024-04-28 12:42:12,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55372.4). Total num frames: 10018783232. Throughput: 0: 55764.5. Samples: 509133060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:12,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 12:42:15,327][57339] Updated weights for policy 0, policy_version 611508 (0.0027) [2024-04-28 12:42:17,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55372.4). Total num frames: 10019045376. Throughput: 0: 55685.0. Samples: 509464200. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:17,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 12:42:18,046][57339] Updated weights for policy 0, policy_version 611518 (0.0025) [2024-04-28 12:42:21,071][57339] Updated weights for policy 0, policy_version 611528 (0.0028) [2024-04-28 12:42:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 10019323904. Throughput: 0: 55239.3. Samples: 509620740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:22,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 12:42:23,967][57339] Updated weights for policy 0, policy_version 611538 (0.0030) [2024-04-28 12:42:27,097][57339] Updated weights for policy 0, policy_version 611548 (0.0024) [2024-04-28 12:42:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 10019602432. Throughput: 0: 55326.7. Samples: 509955320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:27,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 12:42:29,744][57339] Updated weights for policy 0, policy_version 611558 (0.0027) [2024-04-28 12:42:32,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 10019880960. Throughput: 0: 55176.8. Samples: 510285820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:32,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 12:42:33,133][57339] Updated weights for policy 0, policy_version 611568 (0.0028) [2024-04-28 12:42:35,501][57339] Updated weights for policy 0, policy_version 611578 (0.0033) [2024-04-28 12:42:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55483.4). Total num frames: 10020175872. Throughput: 0: 55557.3. Samples: 510461340. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:37,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 12:42:39,354][57339] Updated weights for policy 0, policy_version 611588 (0.0025) [2024-04-28 12:42:41,464][57339] Updated weights for policy 0, policy_version 611598 (0.0030) [2024-04-28 12:42:42,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55432.6, 300 sec: 55316.8). Total num frames: 10020438016. Throughput: 0: 55429.4. Samples: 510788420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:42,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:42:45,236][57339] Updated weights for policy 0, policy_version 611608 (0.0038) [2024-04-28 12:42:47,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55483.5). Total num frames: 10020732928. Throughput: 0: 55572.9. Samples: 511122520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:47,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 12:42:47,293][57339] Updated weights for policy 0, policy_version 611618 (0.0030) [2024-04-28 12:42:51,078][57339] Updated weights for policy 0, policy_version 611628 (0.0027) [2024-04-28 12:42:52,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55978.5, 300 sec: 55483.4). Total num frames: 10020995072. Throughput: 0: 55410.1. Samples: 511290740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:52,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 12:42:53,257][57339] Updated weights for policy 0, policy_version 611638 (0.0026) [2024-04-28 12:42:56,768][57339] Updated weights for policy 0, policy_version 611648 (0.0026) [2024-04-28 12:42:57,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 10021257216. Throughput: 0: 55323.6. Samples: 511622620. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:42:57,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 12:42:59,382][57339] Updated weights for policy 0, policy_version 611658 (0.0031) [2024-04-28 12:43:02,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 10021535744. Throughput: 0: 55502.6. Samples: 511961820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:43:02,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 12:43:02,585][57339] Updated weights for policy 0, policy_version 611668 (0.0033) [2024-04-28 12:43:05,474][57339] Updated weights for policy 0, policy_version 611678 (0.0029) [2024-04-28 12:43:07,124][57319] Signal inference workers to stop experience collection... (7400 times) [2024-04-28 12:43:07,125][57319] Signal inference workers to resume experience collection... (7400 times) [2024-04-28 12:43:07,135][57339] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-04-28 12:43:07,155][57339] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-04-28 12:43:07,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 10021847040. Throughput: 0: 55779.6. Samples: 512130820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:43:07,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 12:43:08,477][57339] Updated weights for policy 0, policy_version 611688 (0.0034) [2024-04-28 12:43:11,270][57339] Updated weights for policy 0, policy_version 611698 (0.0031) [2024-04-28 12:43:12,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 10022125568. Throughput: 0: 55757.7. Samples: 512464420. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:43:12,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 12:43:14,432][57339] Updated weights for policy 0, policy_version 611708 (0.0029) [2024-04-28 12:43:17,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55372.4). Total num frames: 10022371328. Throughput: 0: 55906.0. Samples: 512801580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 12:43:17,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 12:43:17,292][57339] Updated weights for policy 0, policy_version 611718 (0.0026) [2024-04-28 12:43:20,211][57339] Updated weights for policy 0, policy_version 611728 (0.0025) [2024-04-28 12:43:22,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 10022699008. Throughput: 0: 56019.0. Samples: 512982200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:43:22,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 12:43:22,965][57339] Updated weights for policy 0, policy_version 611738 (0.0026) [2024-04-28 12:43:26,006][57339] Updated weights for policy 0, policy_version 611748 (0.0024) [2024-04-28 12:43:27,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10022961152. Throughput: 0: 56280.3. Samples: 513321040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:43:27,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 12:43:27,228][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000611754_10022977536.pth... [2024-04-28 12:43:27,273][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000610941_10009657344.pth [2024-04-28 12:43:28,883][57339] Updated weights for policy 0, policy_version 611758 (0.0025) [2024-04-28 12:43:31,929][57339] Updated weights for policy 0, policy_version 611768 (0.0030) [2024-04-28 12:43:32,169][57108] Fps is (10 sec: 54068.3, 60 sec: 55978.8, 300 sec: 55594.5). 
Total num frames: 10023239680. Throughput: 0: 56193.1. Samples: 513651200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:43:32,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 12:43:34,552][57339] Updated weights for policy 0, policy_version 611778 (0.0029) [2024-04-28 12:43:37,169][57108] Fps is (10 sec: 52429.7, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 10023485440. Throughput: 0: 56036.7. Samples: 513812380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:43:37,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 12:43:37,646][57339] Updated weights for policy 0, policy_version 611788 (0.0026) [2024-04-28 12:43:40,574][57339] Updated weights for policy 0, policy_version 611798 (0.0027) [2024-04-28 12:43:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10023796736. Throughput: 0: 56203.0. Samples: 514151760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:43:42,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:43:43,565][57339] Updated weights for policy 0, policy_version 611808 (0.0026) [2024-04-28 12:43:46,324][57339] Updated weights for policy 0, policy_version 611818 (0.0025) [2024-04-28 12:43:47,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10024075264. Throughput: 0: 56150.7. Samples: 514488600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:43:47,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 12:43:49,309][57339] Updated weights for policy 0, policy_version 611828 (0.0026) [2024-04-28 12:43:52,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.8, 300 sec: 55483.5). Total num frames: 10024337408. Throughput: 0: 55952.9. Samples: 514648700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:43:52,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 12:43:52,218][57339] Updated weights for policy 0, policy_version 611838 (0.0036) [2024-04-28 12:43:55,173][57339] Updated weights for policy 0, policy_version 611848 (0.0024) [2024-04-28 12:43:55,464][57319] Signal inference workers to stop experience collection... (7450 times) [2024-04-28 12:43:55,464][57319] Signal inference workers to resume experience collection... (7450 times) [2024-04-28 12:43:55,490][57339] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-04-28 12:43:55,491][57339] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-04-28 12:43:57,169][57108] Fps is (10 sec: 55704.9, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 10024632320. Throughput: 0: 55938.2. Samples: 514981640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:43:57,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 12:43:58,077][57339] Updated weights for policy 0, policy_version 611858 (0.0028) [2024-04-28 12:44:01,041][57339] Updated weights for policy 0, policy_version 611868 (0.0030) [2024-04-28 12:44:02,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 10024927232. Throughput: 0: 55926.6. Samples: 515318280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:02,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 12:44:03,805][57339] Updated weights for policy 0, policy_version 611878 (0.0032) [2024-04-28 12:44:06,893][57339] Updated weights for policy 0, policy_version 611888 (0.0028) [2024-04-28 12:44:07,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10025189376. Throughput: 0: 55692.7. Samples: 515488360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:07,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 12:44:09,537][57339] Updated weights for policy 0, policy_version 611898 (0.0033) [2024-04-28 12:44:12,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 10025435136. Throughput: 0: 55607.2. Samples: 515823360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:12,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 12:44:12,672][57339] Updated weights for policy 0, policy_version 611908 (0.0036) [2024-04-28 12:44:15,525][57339] Updated weights for policy 0, policy_version 611918 (0.0030) [2024-04-28 12:44:17,169][57108] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 10025746432. Throughput: 0: 55710.1. Samples: 516158160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:17,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:44:18,549][57339] Updated weights for policy 0, policy_version 611928 (0.0027) [2024-04-28 12:44:21,444][57339] Updated weights for policy 0, policy_version 611938 (0.0024) [2024-04-28 12:44:22,169][57108] Fps is (10 sec: 60620.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10026041344. Throughput: 0: 55776.6. Samples: 516322340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:22,170][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 12:44:24,341][57339] Updated weights for policy 0, policy_version 611948 (0.0026) [2024-04-28 12:44:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10026303488. Throughput: 0: 55660.8. Samples: 516656500. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:27,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 12:44:27,348][57339] Updated weights for policy 0, policy_version 611958 (0.0028) [2024-04-28 12:44:30,198][57339] Updated weights for policy 0, policy_version 611968 (0.0026) [2024-04-28 12:44:32,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.4, 300 sec: 55650.1). Total num frames: 10026582016. Throughput: 0: 55674.0. Samples: 516993940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:32,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 12:44:33,324][57339] Updated weights for policy 0, policy_version 611978 (0.0032) [2024-04-28 12:44:36,078][57339] Updated weights for policy 0, policy_version 611988 (0.0032) [2024-04-28 12:44:37,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56524.7, 300 sec: 55650.1). Total num frames: 10026876928. Throughput: 0: 55928.4. Samples: 517165480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:37,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 12:44:39,044][57339] Updated weights for policy 0, policy_version 611998 (0.0028) [2024-04-28 12:44:41,983][57339] Updated weights for policy 0, policy_version 612008 (0.0026) [2024-04-28 12:44:42,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 10027155456. Throughput: 0: 55865.9. Samples: 517495600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:42,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 12:44:44,973][57339] Updated weights for policy 0, policy_version 612018 (0.0031) [2024-04-28 12:44:47,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 10027401216. Throughput: 0: 55900.8. Samples: 517833820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 12:44:47,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 12:44:47,928][57339] Updated weights for policy 0, policy_version 612028 (0.0031) [2024-04-28 12:44:50,694][57339] Updated weights for policy 0, policy_version 612038 (0.0027) [2024-04-28 12:44:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10027712512. Throughput: 0: 55768.0. Samples: 517997920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:44:52,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 12:44:53,667][57339] Updated weights for policy 0, policy_version 612048 (0.0026) [2024-04-28 12:44:56,507][57339] Updated weights for policy 0, policy_version 612058 (0.0028) [2024-04-28 12:44:57,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10027974656. Throughput: 0: 55855.8. Samples: 518336880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:44:57,178][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 12:44:59,585][57339] Updated weights for policy 0, policy_version 612068 (0.0029) [2024-04-28 12:45:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10028269568. Throughput: 0: 55878.2. Samples: 518672680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:02,170][57108] Avg episode reward: [(0, '0.703')] [2024-04-28 12:45:02,314][57339] Updated weights for policy 0, policy_version 612078 (0.0027) [2024-04-28 12:45:05,487][57339] Updated weights for policy 0, policy_version 612088 (0.0036) [2024-04-28 12:45:07,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10028531712. Throughput: 0: 55955.7. Samples: 518840340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:07,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 12:45:08,279][57339] Updated weights for policy 0, policy_version 612098 (0.0033) [2024-04-28 12:45:08,842][57319] Signal inference workers to stop experience collection... (7500 times) [2024-04-28 12:45:08,843][57319] Signal inference workers to resume experience collection... (7500 times) [2024-04-28 12:45:08,871][57339] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-04-28 12:45:08,871][57339] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-04-28 12:45:11,229][57339] Updated weights for policy 0, policy_version 612108 (0.0024) [2024-04-28 12:45:12,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56524.8, 300 sec: 55650.0). Total num frames: 10028826624. Throughput: 0: 55968.1. Samples: 519175060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:12,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 12:45:14,079][57339] Updated weights for policy 0, policy_version 612118 (0.0029) [2024-04-28 12:45:16,997][57339] Updated weights for policy 0, policy_version 612128 (0.0028) [2024-04-28 12:45:17,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10029105152. Throughput: 0: 55834.7. Samples: 519506500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:17,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 12:45:19,952][57339] Updated weights for policy 0, policy_version 612138 (0.0028) [2024-04-28 12:45:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10029367296. Throughput: 0: 55727.5. Samples: 519673220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:22,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 12:45:23,005][57339] Updated weights for policy 0, policy_version 612148 (0.0028) [2024-04-28 12:45:25,879][57339] Updated weights for policy 0, policy_version 612158 (0.0030) [2024-04-28 12:45:27,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10029662208. Throughput: 0: 55788.8. Samples: 520006100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:27,178][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 12:45:27,188][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000612162_10029662208.pth... [2024-04-28 12:45:27,242][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000611346_10016292864.pth [2024-04-28 12:45:29,040][57339] Updated weights for policy 0, policy_version 612168 (0.0027) [2024-04-28 12:45:31,606][57339] Updated weights for policy 0, policy_version 612178 (0.0025) [2024-04-28 12:45:32,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10029940736. Throughput: 0: 55643.7. Samples: 520337780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:32,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 12:45:34,826][57339] Updated weights for policy 0, policy_version 612188 (0.0031) [2024-04-28 12:45:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10030219264. Throughput: 0: 55884.1. Samples: 520512700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:37,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 12:45:37,555][57339] Updated weights for policy 0, policy_version 612198 (0.0025) [2024-04-28 12:45:40,655][57339] Updated weights for policy 0, policy_version 612208 (0.0031) [2024-04-28 12:45:42,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55705.6). 
Total num frames: 10030481408. Throughput: 0: 55791.3. Samples: 520847480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:42,170][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 12:45:43,342][57339] Updated weights for policy 0, policy_version 612218 (0.0027) [2024-04-28 12:45:46,590][57339] Updated weights for policy 0, policy_version 612228 (0.0032) [2024-04-28 12:45:47,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10030759936. Throughput: 0: 55646.7. Samples: 521176780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:47,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 12:45:49,403][57339] Updated weights for policy 0, policy_version 612238 (0.0030) [2024-04-28 12:45:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10031038464. Throughput: 0: 55494.6. Samples: 521337600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:52,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 12:45:52,525][57339] Updated weights for policy 0, policy_version 612248 (0.0027) [2024-04-28 12:45:55,268][57339] Updated weights for policy 0, policy_version 612258 (0.0023) [2024-04-28 12:45:57,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10031316992. Throughput: 0: 55516.8. Samples: 521673320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:45:57,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 12:45:58,219][57339] Updated weights for policy 0, policy_version 612268 (0.0030) [2024-04-28 12:46:01,194][57339] Updated weights for policy 0, policy_version 612278 (0.0030) [2024-04-28 12:46:02,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10031611904. Throughput: 0: 55683.6. Samples: 522012260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:46:02,170][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 12:46:04,098][57339] Updated weights for policy 0, policy_version 612288 (0.0029) [2024-04-28 12:46:07,000][57339] Updated weights for policy 0, policy_version 612298 (0.0026) [2024-04-28 12:46:07,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10031890432. Throughput: 0: 55534.2. Samples: 522172260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:46:07,170][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 12:46:10,041][57339] Updated weights for policy 0, policy_version 612308 (0.0027) [2024-04-28 12:46:12,169][57108] Fps is (10 sec: 50790.8, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10032119808. Throughput: 0: 55581.0. Samples: 522507240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:46:12,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 12:46:12,945][57339] Updated weights for policy 0, policy_version 612318 (0.0023) [2024-04-28 12:46:16,038][57339] Updated weights for policy 0, policy_version 612328 (0.0026) [2024-04-28 12:46:17,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10032431104. Throughput: 0: 55520.0. Samples: 522836180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:46:17,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 12:46:18,796][57339] Updated weights for policy 0, policy_version 612338 (0.0026) [2024-04-28 12:46:21,220][57319] Signal inference workers to stop experience collection... (7550 times) [2024-04-28 12:46:21,254][57339] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-04-28 12:46:21,281][57319] Signal inference workers to resume experience collection... 
(7550 times) [2024-04-28 12:46:21,282][57339] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-04-28 12:46:21,839][57339] Updated weights for policy 0, policy_version 612348 (0.0030) [2024-04-28 12:46:22,169][57108] Fps is (10 sec: 60619.7, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10032726016. Throughput: 0: 55317.0. Samples: 523001980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 12:46:22,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:46:24,691][57339] Updated weights for policy 0, policy_version 612358 (0.0026) [2024-04-28 12:46:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10032988160. Throughput: 0: 55286.2. Samples: 523335360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:46:27,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 12:46:27,565][57339] Updated weights for policy 0, policy_version 612368 (0.0029) [2024-04-28 12:46:30,619][57339] Updated weights for policy 0, policy_version 612378 (0.0027) [2024-04-28 12:46:32,169][57108] Fps is (10 sec: 52430.1, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10033250304. Throughput: 0: 55371.6. Samples: 523668500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:46:32,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 12:46:33,528][57339] Updated weights for policy 0, policy_version 612388 (0.0028) [2024-04-28 12:46:36,634][57339] Updated weights for policy 0, policy_version 612398 (0.0033) [2024-04-28 12:46:37,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 10033561600. Throughput: 0: 55567.4. Samples: 523838140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:46:37,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 12:46:39,418][57339] Updated weights for policy 0, policy_version 612408 (0.0025) [2024-04-28 12:46:42,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10033823744. Throughput: 0: 55586.8. Samples: 524174720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:46:42,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 12:46:42,547][57339] Updated weights for policy 0, policy_version 612418 (0.0028) [2024-04-28 12:46:45,159][57339] Updated weights for policy 0, policy_version 612428 (0.0034) [2024-04-28 12:46:47,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10034085888. Throughput: 0: 55491.1. Samples: 524509360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:46:47,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 12:46:48,435][57339] Updated weights for policy 0, policy_version 612438 (0.0031) [2024-04-28 12:46:51,059][57339] Updated weights for policy 0, policy_version 612448 (0.0027) [2024-04-28 12:46:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10034380800. Throughput: 0: 55776.6. Samples: 524682200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:46:52,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 12:46:54,361][57339] Updated weights for policy 0, policy_version 612458 (0.0025) [2024-04-28 12:46:57,008][57339] Updated weights for policy 0, policy_version 612468 (0.0028) [2024-04-28 12:46:57,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10034675712. Throughput: 0: 55697.7. Samples: 525013640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:46:57,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 12:47:00,235][57339] Updated weights for policy 0, policy_version 612478 (0.0026) [2024-04-28 12:47:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10034937856. Throughput: 0: 55873.7. Samples: 525350500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:02,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:47:02,949][57339] Updated weights for policy 0, policy_version 612488 (0.0027) [2024-04-28 12:47:06,039][57339] Updated weights for policy 0, policy_version 612498 (0.0030) [2024-04-28 12:47:07,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10035200000. Throughput: 0: 55911.4. Samples: 525517980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:07,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 12:47:08,639][57339] Updated weights for policy 0, policy_version 612508 (0.0030) [2024-04-28 12:47:12,056][57339] Updated weights for policy 0, policy_version 612518 (0.0031) [2024-04-28 12:47:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10035494912. Throughput: 0: 55964.0. Samples: 525853740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:12,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 12:47:14,539][57339] Updated weights for policy 0, policy_version 612528 (0.0027) [2024-04-28 12:47:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10035757056. Throughput: 0: 55823.5. Samples: 526180560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:17,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 12:47:18,030][57339] Updated weights for policy 0, policy_version 612538 (0.0036) [2024-04-28 12:47:20,502][57339] Updated weights for policy 0, policy_version 612548 (0.0032) [2024-04-28 12:47:22,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10036035584. Throughput: 0: 55740.8. Samples: 526346480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:22,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 12:47:23,824][57339] Updated weights for policy 0, policy_version 612558 (0.0026) [2024-04-28 12:47:26,452][57339] Updated weights for policy 0, policy_version 612568 (0.0029) [2024-04-28 12:47:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10036330496. Throughput: 0: 55633.4. Samples: 526678220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:27,169][57108] Avg episode reward: [(0, '0.465')] [2024-04-28 12:47:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000612569_10036330496.pth... [2024-04-28 12:47:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000611754_10022977536.pth [2024-04-28 12:47:29,851][57339] Updated weights for policy 0, policy_version 612578 (0.0028) [2024-04-28 12:47:32,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.5, 300 sec: 55761.1). Total num frames: 10036625408. Throughput: 0: 55563.0. Samples: 527009700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:32,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 12:47:32,194][57339] Updated weights for policy 0, policy_version 612588 (0.0023) [2024-04-28 12:47:34,003][57319] Signal inference workers to stop experience collection... (7600 times) [2024-04-28 12:47:34,004][57319] Signal inference workers to resume experience collection... 
(7600 times) [2024-04-28 12:47:34,028][57339] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-04-28 12:47:34,028][57339] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-04-28 12:47:35,903][57339] Updated weights for policy 0, policy_version 612598 (0.0026) [2024-04-28 12:47:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10036871168. Throughput: 0: 55507.6. Samples: 527180040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:37,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 12:47:38,125][57339] Updated weights for policy 0, policy_version 612608 (0.0026) [2024-04-28 12:47:41,682][57339] Updated weights for policy 0, policy_version 612618 (0.0023) [2024-04-28 12:47:42,169][57108] Fps is (10 sec: 50791.4, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10037133312. Throughput: 0: 55545.5. Samples: 527513180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:42,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 12:47:44,560][57339] Updated weights for policy 0, policy_version 612628 (0.0027) [2024-04-28 12:47:47,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10037428224. Throughput: 0: 55525.7. Samples: 527849160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:47,170][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 12:47:47,453][57339] Updated weights for policy 0, policy_version 612638 (0.0032) [2024-04-28 12:47:50,673][57339] Updated weights for policy 0, policy_version 612648 (0.0034) [2024-04-28 12:47:52,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10037723136. Throughput: 0: 55298.6. Samples: 528006420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:52,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 12:47:53,478][57339] Updated weights for policy 0, policy_version 612658 (0.0026) [2024-04-28 12:47:56,565][57339] Updated weights for policy 0, policy_version 612668 (0.0026) [2024-04-28 12:47:57,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10037968896. Throughput: 0: 55295.6. Samples: 528342040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 12:47:57,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 12:47:59,682][57339] Updated weights for policy 0, policy_version 612678 (0.0027) [2024-04-28 12:48:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10038263808. Throughput: 0: 55350.3. Samples: 528671320. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:02,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 12:48:02,299][57339] Updated weights for policy 0, policy_version 612688 (0.0033) [2024-04-28 12:48:05,472][57339] Updated weights for policy 0, policy_version 612698 (0.0032) [2024-04-28 12:48:07,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10038558720. Throughput: 0: 55636.7. Samples: 528850120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:07,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 12:48:07,966][57339] Updated weights for policy 0, policy_version 612708 (0.0027) [2024-04-28 12:48:11,445][57339] Updated weights for policy 0, policy_version 612718 (0.0032) [2024-04-28 12:48:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10038820864. Throughput: 0: 55716.0. Samples: 529185440. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:12,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 12:48:14,040][57339] Updated weights for policy 0, policy_version 612728 (0.0031) [2024-04-28 12:48:17,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10039083008. Throughput: 0: 55759.8. Samples: 529518880. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:17,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:48:17,257][57339] Updated weights for policy 0, policy_version 612738 (0.0032) [2024-04-28 12:48:20,038][57339] Updated weights for policy 0, policy_version 612748 (0.0027) [2024-04-28 12:48:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10039361536. Throughput: 0: 55464.8. Samples: 529675960. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:22,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 12:48:23,113][57339] Updated weights for policy 0, policy_version 612758 (0.0035) [2024-04-28 12:48:25,740][57339] Updated weights for policy 0, policy_version 612768 (0.0029) [2024-04-28 12:48:26,684][57319] Signal inference workers to stop experience collection... (7650 times) [2024-04-28 12:48:26,721][57339] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-04-28 12:48:26,749][57319] Signal inference workers to resume experience collection... (7650 times) [2024-04-28 12:48:26,749][57339] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-04-28 12:48:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10039656448. Throughput: 0: 55416.4. Samples: 530006920. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:27,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 12:48:29,035][57339] Updated weights for policy 0, policy_version 612778 (0.0033) [2024-04-28 12:48:31,907][57339] Updated weights for policy 0, policy_version 612788 (0.0033) [2024-04-28 12:48:32,169][57108] Fps is (10 sec: 55706.3, 60 sec: 54886.6, 300 sec: 55705.6). Total num frames: 10039918592. Throughput: 0: 55299.3. Samples: 530337620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:32,169][57108] Avg episode reward: [(0, '0.720')] [2024-04-28 12:48:34,915][57339] Updated weights for policy 0, policy_version 612798 (0.0032) [2024-04-28 12:48:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10040213504. Throughput: 0: 55622.3. Samples: 530509420. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:37,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 12:48:37,893][57339] Updated weights for policy 0, policy_version 612808 (0.0030) [2024-04-28 12:48:40,928][57339] Updated weights for policy 0, policy_version 612818 (0.0029) [2024-04-28 12:48:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10040492032. Throughput: 0: 55590.8. Samples: 530843620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:42,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 12:48:43,604][57339] Updated weights for policy 0, policy_version 612828 (0.0027) [2024-04-28 12:48:46,853][57339] Updated weights for policy 0, policy_version 612838 (0.0026) [2024-04-28 12:48:47,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10040754176. Throughput: 0: 55716.3. Samples: 531178560. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:47,170][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 12:48:49,313][57339] Updated weights for policy 0, policy_version 612848 (0.0035) [2024-04-28 12:48:52,169][57108] Fps is (10 sec: 52428.0, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 10041016320. Throughput: 0: 55229.7. Samples: 531335460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:52,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 12:48:52,733][57339] Updated weights for policy 0, policy_version 612858 (0.0028) [2024-04-28 12:48:55,097][57339] Updated weights for policy 0, policy_version 612868 (0.0035) [2024-04-28 12:48:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10041311232. Throughput: 0: 55327.1. Samples: 531675160. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:48:57,169][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 12:48:58,498][57339] Updated weights for policy 0, policy_version 612878 (0.0028) [2024-04-28 12:49:01,362][57339] Updated weights for policy 0, policy_version 612888 (0.0033) [2024-04-28 12:49:02,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10041606144. Throughput: 0: 55357.7. Samples: 532009980. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:49:02,169][57108] Avg episode reward: [(0, '0.716')] [2024-04-28 12:49:04,367][57339] Updated weights for policy 0, policy_version 612898 (0.0027) [2024-04-28 12:49:07,169][57339] Updated weights for policy 0, policy_version 612908 (0.0036) [2024-04-28 12:49:07,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10041884672. Throughput: 0: 55599.2. Samples: 532177920. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:49:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 12:49:10,194][57339] Updated weights for policy 0, policy_version 612918 (0.0028) [2024-04-28 12:49:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10042179584. Throughput: 0: 55705.3. Samples: 532513660. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:49:12,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 12:49:13,081][57339] Updated weights for policy 0, policy_version 612928 (0.0031) [2024-04-28 12:49:15,999][57339] Updated weights for policy 0, policy_version 612938 (0.0027) [2024-04-28 12:49:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 10042458112. Throughput: 0: 55841.7. Samples: 532850500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:49:17,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 12:49:18,988][57339] Updated weights for policy 0, policy_version 612948 (0.0033) [2024-04-28 12:49:21,785][57339] Updated weights for policy 0, policy_version 612958 (0.0029) [2024-04-28 12:49:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10042720256. Throughput: 0: 55624.8. Samples: 533012540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:49:22,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 12:49:24,636][57339] Updated weights for policy 0, policy_version 612968 (0.0030) [2024-04-28 12:49:27,169][57108] Fps is (10 sec: 50790.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10042966016. Throughput: 0: 55679.5. Samples: 533349200. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:49:27,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 12:49:27,196][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000612975_10042982400.pth... 
[2024-04-28 12:49:27,242][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000612162_10029662208.pth [2024-04-28 12:49:27,722][57339] Updated weights for policy 0, policy_version 612978 (0.0034) [2024-04-28 12:49:29,556][57319] Signal inference workers to stop experience collection... (7700 times) [2024-04-28 12:49:29,557][57319] Signal inference workers to resume experience collection... (7700 times) [2024-04-28 12:49:29,597][57339] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-04-28 12:49:29,598][57339] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-04-28 12:49:30,430][57339] Updated weights for policy 0, policy_version 612988 (0.0027) [2024-04-28 12:49:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10043260928. Throughput: 0: 55699.2. Samples: 533685020. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-04-28 12:49:32,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 12:49:33,532][57339] Updated weights for policy 0, policy_version 612998 (0.0038) [2024-04-28 12:49:36,341][57339] Updated weights for policy 0, policy_version 613008 (0.0031) [2024-04-28 12:49:37,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10043572224. Throughput: 0: 55860.1. Samples: 533849160. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:49:37,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 12:49:39,545][57339] Updated weights for policy 0, policy_version 613018 (0.0027) [2024-04-28 12:49:42,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10043834368. Throughput: 0: 55677.8. Samples: 534180660. 
Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:49:42,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 12:49:42,215][57339] Updated weights for policy 0, policy_version 613028 (0.0031) [2024-04-28 12:49:45,390][57339] Updated weights for policy 0, policy_version 613038 (0.0030) [2024-04-28 12:49:47,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 10044112896. Throughput: 0: 55707.7. Samples: 534516820. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:49:47,169][57108] Avg episode reward: [(0, '0.479')] [2024-04-28 12:49:48,048][57339] Updated weights for policy 0, policy_version 613048 (0.0029) [2024-04-28 12:49:51,274][57339] Updated weights for policy 0, policy_version 613058 (0.0028) [2024-04-28 12:49:52,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56524.7, 300 sec: 55705.6). Total num frames: 10044407808. Throughput: 0: 55834.1. Samples: 534690460. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:49:52,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 12:49:53,985][57339] Updated weights for policy 0, policy_version 613068 (0.0032) [2024-04-28 12:49:57,075][57339] Updated weights for policy 0, policy_version 613078 (0.0029) [2024-04-28 12:49:57,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10044669952. Throughput: 0: 55794.1. Samples: 535024400. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:49:57,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 12:50:00,255][57339] Updated weights for policy 0, policy_version 613088 (0.0028) [2024-04-28 12:50:02,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10044932096. Throughput: 0: 55747.5. Samples: 535359140. 
Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:02,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 12:50:02,962][57339] Updated weights for policy 0, policy_version 613098 (0.0027) [2024-04-28 12:50:06,054][57339] Updated weights for policy 0, policy_version 613108 (0.0028) [2024-04-28 12:50:07,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 10045227008. Throughput: 0: 55812.6. Samples: 535524100. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:07,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 12:50:08,955][57339] Updated weights for policy 0, policy_version 613118 (0.0028) [2024-04-28 12:50:11,805][57339] Updated weights for policy 0, policy_version 613128 (0.0031) [2024-04-28 12:50:12,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55432.6, 300 sec: 55594.6). Total num frames: 10045505536. Throughput: 0: 55708.9. Samples: 535856100. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:12,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 12:50:14,766][57339] Updated weights for policy 0, policy_version 613138 (0.0023) [2024-04-28 12:50:17,169][57108] Fps is (10 sec: 55704.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10045784064. Throughput: 0: 55706.0. Samples: 536191800. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:17,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 12:50:17,528][57339] Updated weights for policy 0, policy_version 613148 (0.0030) [2024-04-28 12:50:20,549][57339] Updated weights for policy 0, policy_version 613158 (0.0030) [2024-04-28 12:50:22,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10046062592. Throughput: 0: 55850.2. Samples: 536362420. 
Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:22,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 12:50:23,613][57339] Updated weights for policy 0, policy_version 613168 (0.0029) [2024-04-28 12:50:26,565][57339] Updated weights for policy 0, policy_version 613178 (0.0025) [2024-04-28 12:50:27,169][57108] Fps is (10 sec: 57345.2, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 10046357504. Throughput: 0: 55867.6. Samples: 536694700. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:27,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 12:50:29,618][57339] Updated weights for policy 0, policy_version 613188 (0.0039) [2024-04-28 12:50:31,535][57319] Signal inference workers to stop experience collection... (7750 times) [2024-04-28 12:50:31,571][57339] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-04-28 12:50:31,599][57319] Signal inference workers to resume experience collection... (7750 times) [2024-04-28 12:50:31,602][57339] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-04-28 12:50:32,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10046603264. Throughput: 0: 55829.2. Samples: 537029140. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:32,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 12:50:32,463][57339] Updated weights for policy 0, policy_version 613198 (0.0028) [2024-04-28 12:50:35,414][57339] Updated weights for policy 0, policy_version 613208 (0.0028) [2024-04-28 12:50:37,170][57108] Fps is (10 sec: 50782.4, 60 sec: 54885.0, 300 sec: 55538.7). Total num frames: 10046865408. Throughput: 0: 55490.8. Samples: 537187620. 
Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:37,171][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 12:50:38,331][57339] Updated weights for policy 0, policy_version 613218 (0.0030) [2024-04-28 12:50:41,388][57339] Updated weights for policy 0, policy_version 613228 (0.0032) [2024-04-28 12:50:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10047160320. Throughput: 0: 55384.6. Samples: 537516700. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:42,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 12:50:44,272][57339] Updated weights for policy 0, policy_version 613238 (0.0029) [2024-04-28 12:50:47,169][57108] Fps is (10 sec: 57352.3, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10047438848. Throughput: 0: 55458.7. Samples: 537854780. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:47,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 12:50:47,372][57339] Updated weights for policy 0, policy_version 613248 (0.0026) [2024-04-28 12:50:50,119][57339] Updated weights for policy 0, policy_version 613258 (0.0031) [2024-04-28 12:50:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10047717376. Throughput: 0: 55438.5. Samples: 538018840. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:52,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 12:50:53,272][57339] Updated weights for policy 0, policy_version 613268 (0.0031) [2024-04-28 12:50:55,964][57339] Updated weights for policy 0, policy_version 613278 (0.0031) [2024-04-28 12:50:57,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 10047995904. Throughput: 0: 55511.2. Samples: 538354100. 
Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:50:57,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:50:59,146][57339] Updated weights for policy 0, policy_version 613288 (0.0027) [2024-04-28 12:51:01,754][57339] Updated weights for policy 0, policy_version 613298 (0.0026) [2024-04-28 12:51:02,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 10048290816. Throughput: 0: 55483.3. Samples: 538688540. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:51:02,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 12:51:04,936][57339] Updated weights for policy 0, policy_version 613308 (0.0036) [2024-04-28 12:51:07,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10048552960. Throughput: 0: 55445.5. Samples: 538857460. Policy #0 lag: (min: 2.0, avg: 13.0, max: 25.0) [2024-04-28 12:51:07,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 12:51:07,765][57339] Updated weights for policy 0, policy_version 613318 (0.0026) [2024-04-28 12:51:10,826][57339] Updated weights for policy 0, policy_version 613328 (0.0032) [2024-04-28 12:51:12,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10048815104. Throughput: 0: 55528.8. Samples: 539193500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:12,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 12:51:13,607][57339] Updated weights for policy 0, policy_version 613338 (0.0028) [2024-04-28 12:51:16,653][57339] Updated weights for policy 0, policy_version 613348 (0.0026) [2024-04-28 12:51:17,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10049110016. Throughput: 0: 55486.6. Samples: 539526040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:17,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 12:51:19,654][57339] Updated weights for policy 0, policy_version 613358 (0.0034) [2024-04-28 12:51:22,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10049388544. Throughput: 0: 55619.3. Samples: 539690400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:22,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 12:51:22,540][57339] Updated weights for policy 0, policy_version 613368 (0.0031) [2024-04-28 12:51:25,462][57339] Updated weights for policy 0, policy_version 613378 (0.0026) [2024-04-28 12:51:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 10049667072. Throughput: 0: 55649.2. Samples: 540020920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:27,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 12:51:27,242][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000613384_10049683456.pth... [2024-04-28 12:51:27,284][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000612569_10036330496.pth [2024-04-28 12:51:28,490][57339] Updated weights for policy 0, policy_version 613388 (0.0026) [2024-04-28 12:51:31,461][57339] Updated weights for policy 0, policy_version 613398 (0.0029) [2024-04-28 12:51:32,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 10049961984. Throughput: 0: 55481.8. Samples: 540351460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:32,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 12:51:34,486][57339] Updated weights for policy 0, policy_version 613408 (0.0035) [2024-04-28 12:51:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55980.1, 300 sec: 55594.5). Total num frames: 10050224128. Throughput: 0: 55622.2. Samples: 540521840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:37,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 12:51:37,269][57339] Updated weights for policy 0, policy_version 613418 (0.0029) [2024-04-28 12:51:37,735][57319] Signal inference workers to stop experience collection... (7800 times) [2024-04-28 12:51:37,753][57339] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-04-28 12:51:37,826][57319] Signal inference workers to resume experience collection... (7800 times) [2024-04-28 12:51:37,826][57339] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-04-28 12:51:40,282][57339] Updated weights for policy 0, policy_version 613428 (0.0026) [2024-04-28 12:51:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 10050502656. Throughput: 0: 55678.4. Samples: 540859640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:42,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 12:51:43,143][57339] Updated weights for policy 0, policy_version 613438 (0.0032) [2024-04-28 12:51:46,297][57339] Updated weights for policy 0, policy_version 613448 (0.0034) [2024-04-28 12:51:47,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 10050748416. Throughput: 0: 55664.8. Samples: 541193460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:47,170][57108] Avg episode reward: [(0, '0.511')] [2024-04-28 12:51:49,112][57339] Updated weights for policy 0, policy_version 613458 (0.0027) [2024-04-28 12:51:52,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.5, 300 sec: 55483.5). Total num frames: 10051043328. Throughput: 0: 55391.5. Samples: 541350080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:52,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 12:51:52,279][57339] Updated weights for policy 0, policy_version 613468 (0.0036) [2024-04-28 12:51:55,004][57339] Updated weights for policy 0, policy_version 613478 (0.0026) [2024-04-28 12:51:57,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10051338240. Throughput: 0: 55313.8. Samples: 541682620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:51:57,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 12:51:58,272][57339] Updated weights for policy 0, policy_version 613488 (0.0030) [2024-04-28 12:52:00,878][57339] Updated weights for policy 0, policy_version 613498 (0.0026) [2024-04-28 12:52:02,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 10051600384. Throughput: 0: 55330.1. Samples: 542015900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:52:02,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 12:52:04,163][57339] Updated weights for policy 0, policy_version 613508 (0.0026) [2024-04-28 12:52:06,657][57339] Updated weights for policy 0, policy_version 613518 (0.0031) [2024-04-28 12:52:07,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10051895296. Throughput: 0: 55511.9. Samples: 542188440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:52:07,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 12:52:10,019][57339] Updated weights for policy 0, policy_version 613528 (0.0025) [2024-04-28 12:52:12,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10052157440. Throughput: 0: 55600.2. Samples: 542522920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:52:12,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 12:52:12,772][57339] Updated weights for policy 0, policy_version 613538 (0.0032) [2024-04-28 12:52:15,913][57339] Updated weights for policy 0, policy_version 613548 (0.0035) [2024-04-28 12:52:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10052452352. Throughput: 0: 55495.2. Samples: 542848740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:52:17,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 12:52:18,637][57339] Updated weights for policy 0, policy_version 613558 (0.0027) [2024-04-28 12:52:21,811][57339] Updated weights for policy 0, policy_version 613568 (0.0028) [2024-04-28 12:52:22,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 10052714496. Throughput: 0: 55401.2. Samples: 543014900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:52:22,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 12:52:24,559][57339] Updated weights for policy 0, policy_version 613578 (0.0028) [2024-04-28 12:52:27,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 10052976640. Throughput: 0: 55284.9. Samples: 543347460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:52:27,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 12:52:27,803][57339] Updated weights for policy 0, policy_version 613588 (0.0023) [2024-04-28 12:52:30,403][57339] Updated weights for policy 0, policy_version 613598 (0.0031) [2024-04-28 12:52:32,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10053271552. Throughput: 0: 55197.0. Samples: 543677320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:52:32,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 12:52:33,863][57339] Updated weights for policy 0, policy_version 613608 (0.0028) [2024-04-28 12:52:36,605][57339] Updated weights for policy 0, policy_version 613618 (0.0028) [2024-04-28 12:52:37,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10053533696. Throughput: 0: 55404.0. Samples: 543843260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:52:37,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:52:39,601][57339] Updated weights for policy 0, policy_version 613628 (0.0030) [2024-04-28 12:52:42,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 10053812224. Throughput: 0: 55336.9. Samples: 544172780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 12:52:42,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 12:52:42,335][57339] Updated weights for policy 0, policy_version 613638 (0.0027) [2024-04-28 12:52:45,482][57339] Updated weights for policy 0, policy_version 613648 (0.0029) [2024-04-28 12:52:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 10054107136. Throughput: 0: 55390.4. Samples: 544508460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:52:47,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 12:52:48,337][57339] Updated weights for policy 0, policy_version 613658 (0.0031) [2024-04-28 12:52:51,240][57339] Updated weights for policy 0, policy_version 613668 (0.0032) [2024-04-28 12:52:52,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10054385664. Throughput: 0: 55345.9. Samples: 544679000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:52:52,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 12:52:54,188][57319] Signal inference workers to stop experience collection... (7850 times) [2024-04-28 12:52:54,194][57319] Signal inference workers to resume experience collection... (7850 times) [2024-04-28 12:52:54,216][57339] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-04-28 12:52:54,216][57339] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-04-28 12:52:54,313][57339] Updated weights for policy 0, policy_version 613678 (0.0031) [2024-04-28 12:52:57,106][57339] Updated weights for policy 0, policy_version 613688 (0.0033) [2024-04-28 12:52:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10054664192. Throughput: 0: 55355.1. Samples: 545013900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:52:57,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 12:53:00,238][57339] Updated weights for policy 0, policy_version 613698 (0.0027) [2024-04-28 12:53:02,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.7, 300 sec: 55483.4). Total num frames: 10054926336. Throughput: 0: 55519.5. Samples: 545347120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:02,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 12:53:02,911][57339] Updated weights for policy 0, policy_version 613708 (0.0030) [2024-04-28 12:53:06,076][57339] Updated weights for policy 0, policy_version 613718 (0.0025) [2024-04-28 12:53:07,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10055204864. Throughput: 0: 55528.6. Samples: 545513680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:07,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 12:53:08,732][57339] Updated weights for policy 0, policy_version 613728 (0.0029) [2024-04-28 12:53:11,870][57339] Updated weights for policy 0, policy_version 613738 (0.0028) [2024-04-28 12:53:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10055483392. Throughput: 0: 55556.7. Samples: 545847500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:12,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 12:53:14,671][57339] Updated weights for policy 0, policy_version 613748 (0.0027) [2024-04-28 12:53:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10055778304. Throughput: 0: 55745.2. Samples: 546185860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:17,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 12:53:17,597][57339] Updated weights for policy 0, policy_version 613758 (0.0031) [2024-04-28 12:53:20,512][57339] Updated weights for policy 0, policy_version 613768 (0.0027) [2024-04-28 12:53:22,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10056056832. Throughput: 0: 55790.2. Samples: 546353820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:22,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 12:53:23,490][57339] Updated weights for policy 0, policy_version 613778 (0.0026) [2024-04-28 12:53:26,298][57339] Updated weights for policy 0, policy_version 613788 (0.0027) [2024-04-28 12:53:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10056335360. Throughput: 0: 55884.4. Samples: 546687580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:27,170][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 12:53:27,223][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000613791_10056351744.pth... [2024-04-28 12:53:27,267][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000612975_10042982400.pth [2024-04-28 12:53:29,419][57339] Updated weights for policy 0, policy_version 613798 (0.0026) [2024-04-28 12:53:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10056613888. Throughput: 0: 55827.0. Samples: 547020680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 12:53:32,210][57339] Updated weights for policy 0, policy_version 613808 (0.0032) [2024-04-28 12:53:35,364][57339] Updated weights for policy 0, policy_version 613818 (0.0024) [2024-04-28 12:53:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10056892416. Throughput: 0: 55956.7. Samples: 547197060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:37,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 12:53:38,025][57339] Updated weights for policy 0, policy_version 613828 (0.0032) [2024-04-28 12:53:41,216][57339] Updated weights for policy 0, policy_version 613838 (0.0034) [2024-04-28 12:53:42,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10057170944. Throughput: 0: 55922.1. Samples: 547530400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:42,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 12:53:43,847][57339] Updated weights for policy 0, policy_version 613848 (0.0025) [2024-04-28 12:53:47,043][57339] Updated weights for policy 0, policy_version 613858 (0.0032) [2024-04-28 12:53:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55705.6). 
Total num frames: 10057449472. Throughput: 0: 55960.5. Samples: 547865340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:47,169][57108] Avg episode reward: [(0, '0.487')] [2024-04-28 12:53:49,816][57339] Updated weights for policy 0, policy_version 613868 (0.0024) [2024-04-28 12:53:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10057728000. Throughput: 0: 55875.2. Samples: 548028060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:52,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 12:53:52,997][57339] Updated weights for policy 0, policy_version 613878 (0.0028) [2024-04-28 12:53:55,712][57339] Updated weights for policy 0, policy_version 613888 (0.0030) [2024-04-28 12:53:57,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10058022912. Throughput: 0: 56038.1. Samples: 548369220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:53:57,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 12:53:58,660][57339] Updated weights for policy 0, policy_version 613898 (0.0027) [2024-04-28 12:54:01,549][57339] Updated weights for policy 0, policy_version 613908 (0.0030) [2024-04-28 12:54:02,169][57108] Fps is (10 sec: 58981.6, 60 sec: 56524.7, 300 sec: 55705.6). Total num frames: 10058317824. Throughput: 0: 55913.2. Samples: 548701960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:54:02,170][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 12:54:04,494][57339] Updated weights for policy 0, policy_version 613918 (0.0026) [2024-04-28 12:54:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10058563584. Throughput: 0: 55879.0. Samples: 548868380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:54:07,169][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 12:54:07,543][57319] Signal inference workers to stop experience collection... 
(7900 times) [2024-04-28 12:54:07,545][57319] Signal inference workers to resume experience collection... (7900 times) [2024-04-28 12:54:07,557][57339] Updated weights for policy 0, policy_version 613928 (0.0028) [2024-04-28 12:54:07,586][57339] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-04-28 12:54:07,586][57339] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-04-28 12:54:10,650][57339] Updated weights for policy 0, policy_version 613938 (0.0038) [2024-04-28 12:54:12,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 10058842112. Throughput: 0: 56097.7. Samples: 549211980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:54:12,170][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 12:54:13,268][57339] Updated weights for policy 0, policy_version 613948 (0.0033) [2024-04-28 12:54:16,430][57339] Updated weights for policy 0, policy_version 613958 (0.0027) [2024-04-28 12:54:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10059120640. Throughput: 0: 56122.7. Samples: 549546200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 12:54:17,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 12:54:19,007][57339] Updated weights for policy 0, policy_version 613968 (0.0027) [2024-04-28 12:54:22,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10059399168. Throughput: 0: 55769.9. Samples: 549706700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:54:22,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 12:54:22,200][57339] Updated weights for policy 0, policy_version 613978 (0.0023) [2024-04-28 12:54:24,920][57339] Updated weights for policy 0, policy_version 613988 (0.0035) [2024-04-28 12:54:27,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10059677696. Throughput: 0: 55803.6. 
Samples: 550041560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:54:27,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 12:54:28,106][57339] Updated weights for policy 0, policy_version 613998 (0.0033) [2024-04-28 12:54:30,704][57339] Updated weights for policy 0, policy_version 614008 (0.0026) [2024-04-28 12:54:32,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 10059988992. Throughput: 0: 55792.5. Samples: 550376000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:54:32,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 12:54:34,068][57339] Updated weights for policy 0, policy_version 614018 (0.0034) [2024-04-28 12:54:36,602][57339] Updated weights for policy 0, policy_version 614028 (0.0025) [2024-04-28 12:54:37,169][57108] Fps is (10 sec: 58981.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10060267520. Throughput: 0: 55982.0. Samples: 550547260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:54:37,170][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 12:54:39,862][57339] Updated weights for policy 0, policy_version 614038 (0.0041) [2024-04-28 12:54:42,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10060529664. Throughput: 0: 55860.4. Samples: 550882940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:54:42,170][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 12:54:42,584][57339] Updated weights for policy 0, policy_version 614048 (0.0031) [2024-04-28 12:54:45,552][57339] Updated weights for policy 0, policy_version 614058 (0.0034) [2024-04-28 12:54:47,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10060808192. Throughput: 0: 55858.8. Samples: 551215600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:54:47,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 12:54:48,475][57339] Updated weights for policy 0, policy_version 614068 (0.0033) [2024-04-28 12:54:51,397][57339] Updated weights for policy 0, policy_version 614078 (0.0028) [2024-04-28 12:54:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10061086720. Throughput: 0: 55867.2. Samples: 551382400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:54:52,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 12:54:54,329][57339] Updated weights for policy 0, policy_version 614088 (0.0028) [2024-04-28 12:54:57,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10061348864. Throughput: 0: 55659.7. Samples: 551716660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:54:57,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 12:54:57,422][57339] Updated weights for policy 0, policy_version 614098 (0.0033) [2024-04-28 12:55:00,207][57339] Updated weights for policy 0, policy_version 614108 (0.0028) [2024-04-28 12:55:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10061627392. Throughput: 0: 55574.1. Samples: 552047040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:02,169][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 12:55:03,245][57339] Updated weights for policy 0, policy_version 614118 (0.0028) [2024-04-28 12:55:06,115][57339] Updated weights for policy 0, policy_version 614128 (0.0030) [2024-04-28 12:55:07,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10061938688. Throughput: 0: 55774.6. Samples: 552216560. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:07,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 12:55:08,967][57339] Updated weights for policy 0, policy_version 614138 (0.0029) [2024-04-28 12:55:11,900][57339] Updated weights for policy 0, policy_version 614148 (0.0031) [2024-04-28 12:55:12,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10062200832. Throughput: 0: 56008.4. Samples: 552561940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:12,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 12:55:14,829][57339] Updated weights for policy 0, policy_version 614158 (0.0031) [2024-04-28 12:55:17,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10062462976. Throughput: 0: 55918.6. Samples: 552892340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:17,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 12:55:17,856][57339] Updated weights for policy 0, policy_version 614168 (0.0028) [2024-04-28 12:55:20,706][57339] Updated weights for policy 0, policy_version 614178 (0.0031) [2024-04-28 12:55:21,446][57319] Signal inference workers to stop experience collection... (7950 times) [2024-04-28 12:55:21,446][57319] Signal inference workers to resume experience collection... (7950 times) [2024-04-28 12:55:21,468][57339] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-04-28 12:55:21,469][57339] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-04-28 12:55:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 10062774272. Throughput: 0: 55896.1. Samples: 553062580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:22,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 12:55:23,766][57339] Updated weights for policy 0, policy_version 614188 (0.0032) [2024-04-28 12:55:26,428][57339] Updated weights for policy 0, policy_version 614198 (0.0034) [2024-04-28 12:55:27,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10063036416. Throughput: 0: 55922.9. Samples: 553399460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:27,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 12:55:27,221][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000614200_10063052800.pth... [2024-04-28 12:55:27,283][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000613384_10049683456.pth [2024-04-28 12:55:29,529][57339] Updated weights for policy 0, policy_version 614208 (0.0039) [2024-04-28 12:55:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55817.0). Total num frames: 10063331328. Throughput: 0: 56027.5. Samples: 553736840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:32,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 12:55:32,231][57339] Updated weights for policy 0, policy_version 614218 (0.0025) [2024-04-28 12:55:35,385][57339] Updated weights for policy 0, policy_version 614228 (0.0028) [2024-04-28 12:55:37,169][57108] Fps is (10 sec: 54065.9, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 10063577088. Throughput: 0: 55950.5. Samples: 553900180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:37,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 12:55:38,133][57339] Updated weights for policy 0, policy_version 614238 (0.0025) [2024-04-28 12:55:41,328][57339] Updated weights for policy 0, policy_version 614248 (0.0029) [2024-04-28 12:55:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55761.2). 
Total num frames: 10063888384. Throughput: 0: 55853.8. Samples: 554230080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:42,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:55:44,364][57339] Updated weights for policy 0, policy_version 614258 (0.0031) [2024-04-28 12:55:47,086][57339] Updated weights for policy 0, policy_version 614268 (0.0030) [2024-04-28 12:55:47,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10064166912. Throughput: 0: 55975.1. Samples: 554565920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:47,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 12:55:50,394][57339] Updated weights for policy 0, policy_version 614278 (0.0031) [2024-04-28 12:55:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10064429056. Throughput: 0: 56027.1. Samples: 554737780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 12:55:52,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 12:55:52,983][57339] Updated weights for policy 0, policy_version 614288 (0.0036) [2024-04-28 12:55:56,121][57339] Updated weights for policy 0, policy_version 614298 (0.0026) [2024-04-28 12:55:57,169][57108] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10064723968. Throughput: 0: 55744.5. Samples: 555070440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:55:57,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:55:58,839][57339] Updated weights for policy 0, policy_version 614308 (0.0032) [2024-04-28 12:56:01,955][57339] Updated weights for policy 0, policy_version 614318 (0.0032) [2024-04-28 12:56:02,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10064986112. Throughput: 0: 55837.6. Samples: 555405040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:02,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 12:56:04,754][57339] Updated weights for policy 0, policy_version 614328 (0.0038) [2024-04-28 12:56:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10065281024. Throughput: 0: 55780.4. Samples: 555572700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:07,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 12:56:07,889][57339] Updated weights for policy 0, policy_version 614338 (0.0027) [2024-04-28 12:56:10,660][57339] Updated weights for policy 0, policy_version 614348 (0.0029) [2024-04-28 12:56:12,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10065559552. Throughput: 0: 55818.6. Samples: 555911300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:12,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 12:56:13,573][57339] Updated weights for policy 0, policy_version 614358 (0.0034) [2024-04-28 12:56:16,442][57339] Updated weights for policy 0, policy_version 614368 (0.0027) [2024-04-28 12:56:17,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10065821696. Throughput: 0: 55760.7. Samples: 556246080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:17,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 12:56:19,407][57339] Updated weights for policy 0, policy_version 614378 (0.0030) [2024-04-28 12:56:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10066100224. Throughput: 0: 55902.0. Samples: 556415760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:22,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 12:56:22,349][57339] Updated weights for policy 0, policy_version 614388 (0.0026) [2024-04-28 12:56:25,502][57339] Updated weights for policy 0, policy_version 614398 (0.0033) [2024-04-28 12:56:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.4, 300 sec: 55705.6). Total num frames: 10066395136. Throughput: 0: 55967.3. Samples: 556748620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:27,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 12:56:28,260][57339] Updated weights for policy 0, policy_version 614408 (0.0025) [2024-04-28 12:56:29,564][57319] Signal inference workers to stop experience collection... (8000 times) [2024-04-28 12:56:29,573][57339] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-04-28 12:56:29,655][57319] Signal inference workers to resume experience collection... (8000 times) [2024-04-28 12:56:29,656][57339] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-04-28 12:56:31,422][57339] Updated weights for policy 0, policy_version 614418 (0.0025) [2024-04-28 12:56:32,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10066673664. Throughput: 0: 55956.5. Samples: 557083960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:32,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 12:56:34,157][57339] Updated weights for policy 0, policy_version 614428 (0.0031) [2024-04-28 12:56:37,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10066935808. Throughput: 0: 55749.7. Samples: 557246520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:37,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 12:56:37,192][57339] Updated weights for policy 0, policy_version 614438 (0.0034) [2024-04-28 12:56:40,030][57339] Updated weights for policy 0, policy_version 614448 (0.0029) [2024-04-28 12:56:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10067247104. Throughput: 0: 55828.5. Samples: 557582720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:42,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 12:56:42,995][57339] Updated weights for policy 0, policy_version 614458 (0.0028) [2024-04-28 12:56:46,005][57339] Updated weights for policy 0, policy_version 614468 (0.0031) [2024-04-28 12:56:47,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10067509248. Throughput: 0: 55738.7. Samples: 557913280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:47,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 12:56:48,881][57339] Updated weights for policy 0, policy_version 614478 (0.0024) [2024-04-28 12:56:51,919][57339] Updated weights for policy 0, policy_version 614488 (0.0025) [2024-04-28 12:56:52,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10067787776. Throughput: 0: 55804.8. Samples: 558083920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:52,170][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 12:56:54,761][57339] Updated weights for policy 0, policy_version 614498 (0.0032) [2024-04-28 12:56:57,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.4, 300 sec: 55761.2). Total num frames: 10068049920. Throughput: 0: 55627.9. Samples: 558414560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:56:57,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 12:56:57,810][57339] Updated weights for policy 0, policy_version 614508 (0.0027) [2024-04-28 12:57:00,447][57339] Updated weights for policy 0, policy_version 614518 (0.0028) [2024-04-28 12:57:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10068361216. Throughput: 0: 55696.6. Samples: 558752420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:57:02,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 12:57:03,557][57339] Updated weights for policy 0, policy_version 614528 (0.0028) [2024-04-28 12:57:06,351][57339] Updated weights for policy 0, policy_version 614538 (0.0027) [2024-04-28 12:57:07,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10068623360. Throughput: 0: 55672.4. Samples: 558921020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:57:07,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 12:57:09,464][57339] Updated weights for policy 0, policy_version 614548 (0.0028) [2024-04-28 12:57:12,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10068885504. Throughput: 0: 55715.3. Samples: 559255800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:57:12,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 12:57:12,404][57339] Updated weights for policy 0, policy_version 614558 (0.0028) [2024-04-28 12:57:15,294][57339] Updated weights for policy 0, policy_version 614568 (0.0029) [2024-04-28 12:57:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 10069164032. Throughput: 0: 55777.4. Samples: 559593940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:57:17,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 12:57:18,280][57339] Updated weights for policy 0, policy_version 614578 (0.0024) [2024-04-28 12:57:21,094][57339] Updated weights for policy 0, policy_version 614588 (0.0029) [2024-04-28 12:57:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55872.3). Total num frames: 10069458944. Throughput: 0: 55689.4. Samples: 559752540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:57:22,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 12:57:24,047][57339] Updated weights for policy 0, policy_version 614598 (0.0030) [2024-04-28 12:57:27,080][57339] Updated weights for policy 0, policy_version 614608 (0.0029) [2024-04-28 12:57:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 10069737472. Throughput: 0: 55712.0. Samples: 560089760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 12:57:27,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 12:57:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000614609_10069753856.pth... [2024-04-28 12:57:27,222][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000613791_10056351744.pth [2024-04-28 12:57:29,995][57339] Updated weights for policy 0, policy_version 614618 (0.0031) [2024-04-28 12:57:32,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10069999616. Throughput: 0: 55765.5. Samples: 560422720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:57:32,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 12:57:32,721][57319] Signal inference workers to stop experience collection... (8050 times) [2024-04-28 12:57:32,723][57319] Signal inference workers to resume experience collection... 
(8050 times) [2024-04-28 12:57:32,749][57339] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-04-28 12:57:32,749][57339] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-04-28 12:57:32,835][57339] Updated weights for policy 0, policy_version 614628 (0.0032) [2024-04-28 12:57:36,003][57339] Updated weights for policy 0, policy_version 614638 (0.0030) [2024-04-28 12:57:37,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10070310912. Throughput: 0: 55769.8. Samples: 560593560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:57:37,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 12:57:38,778][57339] Updated weights for policy 0, policy_version 614648 (0.0030) [2024-04-28 12:57:41,781][57339] Updated weights for policy 0, policy_version 614658 (0.0033) [2024-04-28 12:57:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10070573056. Throughput: 0: 55871.7. Samples: 560928780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:57:42,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 12:57:44,551][57339] Updated weights for policy 0, policy_version 614668 (0.0028) [2024-04-28 12:57:47,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10070835200. Throughput: 0: 55817.3. Samples: 561264200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:57:47,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 12:57:47,631][57339] Updated weights for policy 0, policy_version 614678 (0.0026) [2024-04-28 12:57:50,372][57339] Updated weights for policy 0, policy_version 614688 (0.0031) [2024-04-28 12:57:52,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.7, 300 sec: 55705.6). Total num frames: 10071097344. Throughput: 0: 55703.6. Samples: 561427680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:57:52,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 12:57:53,551][57339] Updated weights for policy 0, policy_version 614698 (0.0027) [2024-04-28 12:57:56,242][57339] Updated weights for policy 0, policy_version 614708 (0.0028) [2024-04-28 12:57:57,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10071392256. Throughput: 0: 55653.6. Samples: 561760220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:57:57,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 12:57:59,541][57339] Updated weights for policy 0, policy_version 614718 (0.0026) [2024-04-28 12:58:02,045][57339] Updated weights for policy 0, policy_version 614728 (0.0029) [2024-04-28 12:58:02,169][57108] Fps is (10 sec: 60620.5, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10071703552. Throughput: 0: 55637.3. Samples: 562097620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:02,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 12:58:05,369][57339] Updated weights for policy 0, policy_version 614738 (0.0027) [2024-04-28 12:58:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10071965696. Throughput: 0: 55878.5. Samples: 562267080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:07,170][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 12:58:07,961][57339] Updated weights for policy 0, policy_version 614748 (0.0029) [2024-04-28 12:58:11,264][57339] Updated weights for policy 0, policy_version 614758 (0.0028) [2024-04-28 12:58:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10072244224. Throughput: 0: 55856.9. Samples: 562603320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:12,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 12:58:13,835][57339] Updated weights for policy 0, policy_version 614768 (0.0030) [2024-04-28 12:58:17,028][57339] Updated weights for policy 0, policy_version 614778 (0.0027) [2024-04-28 12:58:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10072522752. Throughput: 0: 55771.1. Samples: 562932420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 12:58:19,790][57339] Updated weights for policy 0, policy_version 614788 (0.0028) [2024-04-28 12:58:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 10072784896. Throughput: 0: 55604.5. Samples: 563095760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:22,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 12:58:22,869][57339] Updated weights for policy 0, policy_version 614798 (0.0027) [2024-04-28 12:58:25,725][57339] Updated weights for policy 0, policy_version 614808 (0.0026) [2024-04-28 12:58:27,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 10073063424. Throughput: 0: 55513.4. Samples: 563426880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:27,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 12:58:28,813][57339] Updated weights for policy 0, policy_version 614818 (0.0027) [2024-04-28 12:58:31,539][57339] Updated weights for policy 0, policy_version 614828 (0.0028) [2024-04-28 12:58:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10073341952. Throughput: 0: 55514.8. Samples: 563762360. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:32,169][57108] Avg episode reward: [(0, '0.456')] [2024-04-28 12:58:32,705][57319] Signal inference workers to stop experience collection... (8100 times) [2024-04-28 12:58:32,757][57339] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-04-28 12:58:32,760][57319] Signal inference workers to resume experience collection... (8100 times) [2024-04-28 12:58:32,772][57339] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-04-28 12:58:34,485][57339] Updated weights for policy 0, policy_version 614838 (0.0026) [2024-04-28 12:58:37,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10073653248. Throughput: 0: 55616.7. Samples: 563930440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:37,169][57108] Avg episode reward: [(0, '0.504')] [2024-04-28 12:58:37,240][57339] Updated weights for policy 0, policy_version 614848 (0.0028) [2024-04-28 12:58:40,399][57339] Updated weights for policy 0, policy_version 614858 (0.0029) [2024-04-28 12:58:42,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10073915392. Throughput: 0: 55642.9. Samples: 564264140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:42,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 12:58:43,162][57339] Updated weights for policy 0, policy_version 614868 (0.0028) [2024-04-28 12:58:46,623][57339] Updated weights for policy 0, policy_version 614878 (0.0030) [2024-04-28 12:58:47,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10074177536. Throughput: 0: 55641.3. Samples: 564601480. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:47,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:58:49,178][57339] Updated weights for policy 0, policy_version 614888 (0.0026) [2024-04-28 12:58:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10074472448. Throughput: 0: 55517.4. Samples: 564765360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:52,169][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 12:58:52,672][57339] Updated weights for policy 0, policy_version 614898 (0.0028) [2024-04-28 12:58:55,105][57339] Updated weights for policy 0, policy_version 614908 (0.0032) [2024-04-28 12:58:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 10074734592. Throughput: 0: 55527.9. Samples: 565102080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:58:57,169][57108] Avg episode reward: [(0, '0.520')] [2024-04-28 12:58:58,411][57339] Updated weights for policy 0, policy_version 614918 (0.0024) [2024-04-28 12:59:01,076][57339] Updated weights for policy 0, policy_version 614928 (0.0028) [2024-04-28 12:59:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10075013120. Throughput: 0: 55627.1. Samples: 565435640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 27.0) [2024-04-28 12:59:02,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 12:59:04,168][57339] Updated weights for policy 0, policy_version 614938 (0.0028) [2024-04-28 12:59:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 10075291648. Throughput: 0: 55610.7. Samples: 565598240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:07,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 12:59:07,302][57339] Updated weights for policy 0, policy_version 614948 (0.0029) [2024-04-28 12:59:10,101][57339] Updated weights for policy 0, policy_version 614958 (0.0029) [2024-04-28 12:59:12,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10075570176. Throughput: 0: 55643.9. Samples: 565930860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:12,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 12:59:13,134][57339] Updated weights for policy 0, policy_version 614968 (0.0028) [2024-04-28 12:59:16,059][57339] Updated weights for policy 0, policy_version 614978 (0.0031) [2024-04-28 12:59:17,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10075865088. Throughput: 0: 55688.8. Samples: 566268360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:17,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 12:59:18,932][57339] Updated weights for policy 0, policy_version 614988 (0.0028) [2024-04-28 12:59:21,836][57339] Updated weights for policy 0, policy_version 614998 (0.0031) [2024-04-28 12:59:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10076127232. Throughput: 0: 55681.3. Samples: 566436100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:22,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 12:59:24,922][57339] Updated weights for policy 0, policy_version 615008 (0.0028) [2024-04-28 12:59:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10076422144. Throughput: 0: 55698.1. Samples: 566770560. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:27,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 12:59:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000615016_10076422144.pth... [2024-04-28 12:59:27,223][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000614200_10063052800.pth [2024-04-28 12:59:27,674][57339] Updated weights for policy 0, policy_version 615018 (0.0029) [2024-04-28 12:59:30,933][57339] Updated weights for policy 0, policy_version 615028 (0.0028) [2024-04-28 12:59:32,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10076684288. Throughput: 0: 55454.7. Samples: 567096940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:32,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 12:59:33,586][57339] Updated weights for policy 0, policy_version 615038 (0.0029) [2024-04-28 12:59:35,991][57319] Signal inference workers to stop experience collection... (8150 times) [2024-04-28 12:59:35,991][57319] Signal inference workers to resume experience collection... (8150 times) [2024-04-28 12:59:36,014][57339] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-04-28 12:59:36,015][57339] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-04-28 12:59:36,674][57339] Updated weights for policy 0, policy_version 615048 (0.0026) [2024-04-28 12:59:37,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10076962816. Throughput: 0: 55493.5. Samples: 567262560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:37,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 12:59:39,566][57339] Updated weights for policy 0, policy_version 615058 (0.0029) [2024-04-28 12:59:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10077241344. Throughput: 0: 55583.5. Samples: 567603340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:42,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 12:59:42,370][57339] Updated weights for policy 0, policy_version 615068 (0.0025) [2024-04-28 12:59:45,356][57339] Updated weights for policy 0, policy_version 615078 (0.0034) [2024-04-28 12:59:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10077519872. Throughput: 0: 55636.1. Samples: 567939260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:47,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 12:59:48,303][57339] Updated weights for policy 0, policy_version 615088 (0.0028) [2024-04-28 12:59:51,067][57339] Updated weights for policy 0, policy_version 615098 (0.0035) [2024-04-28 12:59:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10077798400. Throughput: 0: 55577.6. Samples: 568099240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:52,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 12:59:54,278][57339] Updated weights for policy 0, policy_version 615108 (0.0026) [2024-04-28 12:59:57,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10078076928. Throughput: 0: 55608.0. Samples: 568433220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 12:59:57,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 12:59:57,263][57339] Updated weights for policy 0, policy_version 615118 (0.0028) [2024-04-28 13:00:00,234][57339] Updated weights for policy 0, policy_version 615128 (0.0026) [2024-04-28 13:00:02,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10078388224. Throughput: 0: 55530.7. Samples: 568767240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:00:02,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 13:00:02,964][57339] Updated weights for policy 0, policy_version 615138 (0.0027) [2024-04-28 13:00:05,968][57339] Updated weights for policy 0, policy_version 615148 (0.0035) [2024-04-28 13:00:07,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10078650368. Throughput: 0: 55760.6. Samples: 568945320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:00:07,178][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 13:00:08,665][57339] Updated weights for policy 0, policy_version 615158 (0.0032) [2024-04-28 13:00:11,729][57339] Updated weights for policy 0, policy_version 615168 (0.0027) [2024-04-28 13:00:12,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10078928896. Throughput: 0: 55798.3. Samples: 569281480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:00:12,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 13:00:14,625][57339] Updated weights for policy 0, policy_version 615178 (0.0025) [2024-04-28 13:00:17,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10079174656. Throughput: 0: 55946.7. Samples: 569614540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:00:17,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 13:00:17,625][57339] Updated weights for policy 0, policy_version 615188 (0.0032) [2024-04-28 13:00:20,651][57339] Updated weights for policy 0, policy_version 615198 (0.0029) [2024-04-28 13:00:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10079469568. Throughput: 0: 55830.1. Samples: 569774920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:00:22,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 13:00:23,322][57339] Updated weights for policy 0, policy_version 615208 (0.0026) [2024-04-28 13:00:26,539][57339] Updated weights for policy 0, policy_version 615218 (0.0027) [2024-04-28 13:00:27,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10079748096. Throughput: 0: 55699.4. Samples: 570109820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:00:27,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 13:00:29,237][57339] Updated weights for policy 0, policy_version 615228 (0.0027) [2024-04-28 13:00:32,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10080043008. Throughput: 0: 55647.5. Samples: 570443400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:00:32,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 13:00:32,280][57339] Updated weights for policy 0, policy_version 615238 (0.0033) [2024-04-28 13:00:35,214][57339] Updated weights for policy 0, policy_version 615248 (0.0027) [2024-04-28 13:00:37,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10080321536. Throughput: 0: 55985.8. Samples: 570618600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:00:37,178][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 13:00:38,117][57339] Updated weights for policy 0, policy_version 615258 (0.0030) [2024-04-28 13:00:40,364][57319] Signal inference workers to stop experience collection... (8200 times) [2024-04-28 13:00:40,396][57339] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-04-28 13:00:40,424][57319] Signal inference workers to resume experience collection... 
(8200 times) [2024-04-28 13:00:40,425][57339] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-04-28 13:00:41,012][57339] Updated weights for policy 0, policy_version 615268 (0.0032) [2024-04-28 13:00:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10080600064. Throughput: 0: 55956.5. Samples: 570951260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:00:42,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 13:00:44,177][57339] Updated weights for policy 0, policy_version 615278 (0.0029) [2024-04-28 13:00:46,859][57339] Updated weights for policy 0, policy_version 615288 (0.0033) [2024-04-28 13:00:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10080878592. Throughput: 0: 55947.6. Samples: 571284880. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:00:47,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 13:00:50,112][57339] Updated weights for policy 0, policy_version 615298 (0.0029) [2024-04-28 13:00:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10081157120. Throughput: 0: 55700.4. Samples: 571451840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:00:52,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 13:00:52,754][57339] Updated weights for policy 0, policy_version 615308 (0.0026) [2024-04-28 13:00:55,975][57339] Updated weights for policy 0, policy_version 615318 (0.0029) [2024-04-28 13:00:57,169][57108] Fps is (10 sec: 50789.5, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10081386496. Throughput: 0: 55611.9. Samples: 571784020. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:00:57,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 13:00:58,562][57339] Updated weights for policy 0, policy_version 615328 (0.0028) [2024-04-28 13:01:01,801][57339] Updated weights for policy 0, policy_version 615338 (0.0034) [2024-04-28 13:01:02,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10081697792. Throughput: 0: 55769.7. Samples: 572124180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:02,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 13:01:04,754][57339] Updated weights for policy 0, policy_version 615348 (0.0026) [2024-04-28 13:01:07,169][57108] Fps is (10 sec: 60621.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10081992704. Throughput: 0: 55854.3. Samples: 572288360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:07,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 13:01:07,532][57339] Updated weights for policy 0, policy_version 615358 (0.0030) [2024-04-28 13:01:10,599][57339] Updated weights for policy 0, policy_version 615368 (0.0029) [2024-04-28 13:01:12,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10082287616. Throughput: 0: 55919.3. Samples: 572626180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:12,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 13:01:13,527][57339] Updated weights for policy 0, policy_version 615378 (0.0031) [2024-04-28 13:01:16,462][57339] Updated weights for policy 0, policy_version 615388 (0.0033) [2024-04-28 13:01:17,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 10082566144. Throughput: 0: 55832.8. Samples: 572955880. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:17,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 13:01:19,493][57339] Updated weights for policy 0, policy_version 615398 (0.0028) [2024-04-28 13:01:22,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10082811904. Throughput: 0: 55616.1. Samples: 573121320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:22,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 13:01:22,358][57339] Updated weights for policy 0, policy_version 615408 (0.0024) [2024-04-28 13:01:26,031][57339] Updated weights for policy 0, policy_version 615418 (0.0027) [2024-04-28 13:01:27,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 10083090432. Throughput: 0: 55769.0. Samples: 573460860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:27,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 13:01:27,267][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000615424_10083106816.pth... [2024-04-28 13:01:27,311][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000614609_10069753856.pth [2024-04-28 13:01:28,316][57339] Updated weights for policy 0, policy_version 615428 (0.0033) [2024-04-28 13:01:31,771][57339] Updated weights for policy 0, policy_version 615438 (0.0027) [2024-04-28 13:01:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10083352576. Throughput: 0: 55908.4. Samples: 573800760. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:32,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 13:01:33,923][57319] Signal inference workers to stop experience collection... 
(8250 times) [2024-04-28 13:01:33,946][57339] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-04-28 13:01:33,984][57319] Signal inference workers to resume experience collection... (8250 times) [2024-04-28 13:01:33,984][57339] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-04-28 13:01:34,098][57339] Updated weights for policy 0, policy_version 615448 (0.0023) [2024-04-28 13:01:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10083647488. Throughput: 0: 55624.5. Samples: 573954940. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:37,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 13:01:37,525][57339] Updated weights for policy 0, policy_version 615458 (0.0031) [2024-04-28 13:01:39,966][57339] Updated weights for policy 0, policy_version 615468 (0.0030) [2024-04-28 13:01:42,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10083942400. Throughput: 0: 55649.1. Samples: 574288220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:42,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 13:01:43,299][57339] Updated weights for policy 0, policy_version 615478 (0.0029) [2024-04-28 13:01:45,860][57339] Updated weights for policy 0, policy_version 615488 (0.0028) [2024-04-28 13:01:47,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10084237312. Throughput: 0: 55488.4. Samples: 574621160. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:47,170][57108] Avg episode reward: [(0, '0.486')] [2024-04-28 13:01:49,162][57339] Updated weights for policy 0, policy_version 615498 (0.0035) [2024-04-28 13:01:51,701][57339] Updated weights for policy 0, policy_version 615508 (0.0032) [2024-04-28 13:01:52,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10084515840. Throughput: 0: 55912.9. 
Samples: 574804440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:52,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 13:01:54,971][57339] Updated weights for policy 0, policy_version 615518 (0.0027) [2024-04-28 13:01:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56797.9, 300 sec: 55705.6). Total num frames: 10084794368. Throughput: 0: 55914.5. Samples: 575142340. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:01:57,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 13:01:57,381][57339] Updated weights for policy 0, policy_version 615528 (0.0033) [2024-04-28 13:02:00,666][57339] Updated weights for policy 0, policy_version 615538 (0.0025) [2024-04-28 13:02:02,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10085040128. Throughput: 0: 55999.2. Samples: 575475840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:02:02,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:02:03,251][57339] Updated weights for policy 0, policy_version 615548 (0.0026) [2024-04-28 13:02:06,413][57339] Updated weights for policy 0, policy_version 615558 (0.0028) [2024-04-28 13:02:07,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10085318656. Throughput: 0: 55734.0. Samples: 575629360. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-04-28 13:02:07,170][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 13:02:09,167][57339] Updated weights for policy 0, policy_version 615568 (0.0027) [2024-04-28 13:02:12,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10085613568. Throughput: 0: 55653.6. Samples: 575965280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:12,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 13:02:12,634][57339] Updated weights for policy 0, policy_version 615578 (0.0026) [2024-04-28 13:02:14,890][57339] Updated weights for policy 0, policy_version 615588 (0.0027) [2024-04-28 13:02:17,169][57108] Fps is (10 sec: 60621.1, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 10085924864. Throughput: 0: 55607.8. Samples: 576303120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:17,170][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 13:02:18,629][57339] Updated weights for policy 0, policy_version 615598 (0.0030) [2024-04-28 13:02:20,608][57339] Updated weights for policy 0, policy_version 615608 (0.0029) [2024-04-28 13:02:22,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 10086203392. Throughput: 0: 56227.8. Samples: 576485200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:22,170][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 13:02:24,530][57339] Updated weights for policy 0, policy_version 615618 (0.0028) [2024-04-28 13:02:26,432][57339] Updated weights for policy 0, policy_version 615628 (0.0035) [2024-04-28 13:02:27,169][57108] Fps is (10 sec: 55706.4, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 10086481920. Throughput: 0: 56248.4. Samples: 576819400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:27,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 13:02:30,267][57339] Updated weights for policy 0, policy_version 615638 (0.0027) [2024-04-28 13:02:32,169][57108] Fps is (10 sec: 54068.1, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 10086744064. Throughput: 0: 56288.1. Samples: 577154120. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:32,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 13:02:32,318][57339] Updated weights for policy 0, policy_version 615648 (0.0032) [2024-04-28 13:02:35,964][57319] Signal inference workers to stop experience collection... (8300 times) [2024-04-28 13:02:35,965][57319] Signal inference workers to resume experience collection... (8300 times) [2024-04-28 13:02:35,994][57339] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-04-28 13:02:35,994][57339] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-04-28 13:02:36,077][57339] Updated weights for policy 0, policy_version 615658 (0.0026) [2024-04-28 13:02:37,169][57108] Fps is (10 sec: 52427.9, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10087006208. Throughput: 0: 55725.2. Samples: 577312080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:37,170][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 13:02:38,183][57339] Updated weights for policy 0, policy_version 615668 (0.0029) [2024-04-28 13:02:41,993][57339] Updated weights for policy 0, policy_version 615678 (0.0026) [2024-04-28 13:02:42,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10087284736. Throughput: 0: 55705.8. Samples: 577649100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:42,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 13:02:43,888][57339] Updated weights for policy 0, policy_version 615688 (0.0025) [2024-04-28 13:02:47,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10087563264. Throughput: 0: 55798.3. Samples: 577986760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:47,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 13:02:47,662][57339] Updated weights for policy 0, policy_version 615698 (0.0035) [2024-04-28 13:02:49,801][57339] Updated weights for policy 0, policy_version 615708 (0.0025) [2024-04-28 13:02:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10087858176. Throughput: 0: 56080.9. Samples: 578153000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:52,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 13:02:53,632][57339] Updated weights for policy 0, policy_version 615718 (0.0026) [2024-04-28 13:02:55,834][57339] Updated weights for policy 0, policy_version 615728 (0.0031) [2024-04-28 13:02:57,169][57108] Fps is (10 sec: 60620.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10088169472. Throughput: 0: 55965.8. Samples: 578483740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:02:57,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 13:02:59,317][57339] Updated weights for policy 0, policy_version 615738 (0.0032) [2024-04-28 13:03:01,783][57339] Updated weights for policy 0, policy_version 615748 (0.0027) [2024-04-28 13:03:02,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 10088431616. Throughput: 0: 55865.8. Samples: 578817080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:03:02,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 13:03:05,187][57339] Updated weights for policy 0, policy_version 615758 (0.0034) [2024-04-28 13:03:07,169][57108] Fps is (10 sec: 54066.3, 60 sec: 56524.8, 300 sec: 55816.6). Total num frames: 10088710144. Throughput: 0: 55717.7. Samples: 578992500. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:03:07,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 13:03:07,660][57339] Updated weights for policy 0, policy_version 615768 (0.0031) [2024-04-28 13:03:11,036][57339] Updated weights for policy 0, policy_version 615778 (0.0034) [2024-04-28 13:03:12,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10088972288. Throughput: 0: 55782.2. Samples: 579329600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:03:12,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 13:03:13,344][57339] Updated weights for policy 0, policy_version 615788 (0.0038) [2024-04-28 13:03:16,808][57339] Updated weights for policy 0, policy_version 615798 (0.0025) [2024-04-28 13:03:17,169][57108] Fps is (10 sec: 52429.7, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 10089234432. Throughput: 0: 55763.0. Samples: 579663460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:03:17,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 13:03:19,225][57339] Updated weights for policy 0, policy_version 615808 (0.0027) [2024-04-28 13:03:22,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10089529344. Throughput: 0: 55823.6. Samples: 579824140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:03:22,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 13:03:22,828][57339] Updated weights for policy 0, policy_version 615818 (0.0029) [2024-04-28 13:03:25,136][57339] Updated weights for policy 0, policy_version 615828 (0.0027) [2024-04-28 13:03:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10089807872. Throughput: 0: 55890.8. Samples: 580164180. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:03:27,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 13:03:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000615833_10089807872.pth... [2024-04-28 13:03:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000615016_10076422144.pth [2024-04-28 13:03:28,813][57339] Updated weights for policy 0, policy_version 615838 (0.0029) [2024-04-28 13:03:30,890][57339] Updated weights for policy 0, policy_version 615848 (0.0026) [2024-04-28 13:03:31,551][57319] Signal inference workers to stop experience collection... (8350 times) [2024-04-28 13:03:31,589][57339] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-04-28 13:03:31,598][57319] Signal inference workers to resume experience collection... (8350 times) [2024-04-28 13:03:31,607][57339] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-04-28 13:03:32,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10090102784. Throughput: 0: 55721.8. Samples: 580494240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:03:32,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 13:03:34,620][57339] Updated weights for policy 0, policy_version 615858 (0.0030) [2024-04-28 13:03:36,576][57339] Updated weights for policy 0, policy_version 615868 (0.0028) [2024-04-28 13:03:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10090381312. Throughput: 0: 55924.5. Samples: 580669600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:03:37,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 13:03:40,613][57339] Updated weights for policy 0, policy_version 615878 (0.0031) [2024-04-28 13:03:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10090659840. Throughput: 0: 55953.3. Samples: 581001640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-04-28 13:03:42,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 13:03:43,063][57339] Updated weights for policy 0, policy_version 615888 (0.0027) [2024-04-28 13:03:46,605][57339] Updated weights for policy 0, policy_version 615898 (0.0026) [2024-04-28 13:03:47,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10090905600. Throughput: 0: 55951.7. Samples: 581334900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:03:47,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 13:03:49,260][57339] Updated weights for policy 0, policy_version 615908 (0.0030) [2024-04-28 13:03:52,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10091184128. Throughput: 0: 55624.6. Samples: 581495600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:03:52,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 13:03:52,343][57339] Updated weights for policy 0, policy_version 615918 (0.0033) [2024-04-28 13:03:55,081][57339] Updated weights for policy 0, policy_version 615928 (0.0028) [2024-04-28 13:03:57,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.3, 300 sec: 55816.7). Total num frames: 10091479040. Throughput: 0: 55602.9. Samples: 581831740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:03:57,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 13:03:58,185][57339] Updated weights for policy 0, policy_version 615938 (0.0025) [2024-04-28 13:04:00,859][57339] Updated weights for policy 0, policy_version 615948 (0.0026) [2024-04-28 13:04:02,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10091757568. Throughput: 0: 55529.7. Samples: 582162300. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:02,170][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 13:04:04,147][57339] Updated weights for policy 0, policy_version 615958 (0.0025) [2024-04-28 13:04:06,896][57339] Updated weights for policy 0, policy_version 615968 (0.0032) [2024-04-28 13:04:07,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10092036096. Throughput: 0: 55701.9. Samples: 582330720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:07,178][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 13:04:10,142][57339] Updated weights for policy 0, policy_version 615978 (0.0031) [2024-04-28 13:04:12,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10092314624. Throughput: 0: 55414.3. Samples: 582657820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:12,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 13:04:12,810][57339] Updated weights for policy 0, policy_version 615988 (0.0024) [2024-04-28 13:04:16,323][57339] Updated weights for policy 0, policy_version 615998 (0.0035) [2024-04-28 13:04:17,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10092576768. Throughput: 0: 55517.8. Samples: 582992540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:17,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 13:04:18,581][57339] Updated weights for policy 0, policy_version 616008 (0.0026) [2024-04-28 13:04:22,158][57339] Updated weights for policy 0, policy_version 616018 (0.0029) [2024-04-28 13:04:22,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10092838912. Throughput: 0: 55129.4. Samples: 583150420. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:22,178][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 13:04:24,585][57339] Updated weights for policy 0, policy_version 616028 (0.0029) [2024-04-28 13:04:27,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54886.4, 300 sec: 55650.0). Total num frames: 10093101056. Throughput: 0: 55121.7. Samples: 583482120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:27,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 13:04:27,972][57339] Updated weights for policy 0, policy_version 616038 (0.0034) [2024-04-28 13:04:30,158][57319] Signal inference workers to stop experience collection... (8400 times) [2024-04-28 13:04:30,158][57319] Signal inference workers to resume experience collection... (8400 times) [2024-04-28 13:04:30,173][57339] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-04-28 13:04:30,173][57339] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-04-28 13:04:30,531][57339] Updated weights for policy 0, policy_version 616048 (0.0026) [2024-04-28 13:04:32,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10093412352. Throughput: 0: 55055.6. Samples: 583812400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:32,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 13:04:33,802][57339] Updated weights for policy 0, policy_version 616058 (0.0032) [2024-04-28 13:04:36,416][57339] Updated weights for policy 0, policy_version 616068 (0.0029) [2024-04-28 13:04:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10093674496. Throughput: 0: 55407.1. Samples: 583988920. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:37,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 13:04:39,713][57339] Updated weights for policy 0, policy_version 616078 (0.0030) [2024-04-28 13:04:42,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10093969408. Throughput: 0: 55296.6. Samples: 584320080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:42,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:04:42,302][57339] Updated weights for policy 0, policy_version 616088 (0.0033) [2024-04-28 13:04:45,536][57339] Updated weights for policy 0, policy_version 616098 (0.0033) [2024-04-28 13:04:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10094231552. Throughput: 0: 55364.0. Samples: 584653680. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:47,170][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 13:04:48,148][57339] Updated weights for policy 0, policy_version 616108 (0.0031) [2024-04-28 13:04:51,315][57339] Updated weights for policy 0, policy_version 616118 (0.0037) [2024-04-28 13:04:52,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10094493696. Throughput: 0: 55350.7. Samples: 584821500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:52,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:04:53,970][57339] Updated weights for policy 0, policy_version 616128 (0.0028) [2024-04-28 13:04:57,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10094788608. Throughput: 0: 55415.6. Samples: 585151520. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:04:57,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 13:04:57,256][57339] Updated weights for policy 0, policy_version 616138 (0.0028) [2024-04-28 13:04:59,783][57339] Updated weights for policy 0, policy_version 616148 (0.0026) [2024-04-28 13:05:02,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 10095067136. Throughput: 0: 55416.7. Samples: 585486300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:05:02,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 13:05:03,040][57339] Updated weights for policy 0, policy_version 616158 (0.0026) [2024-04-28 13:05:05,585][57339] Updated weights for policy 0, policy_version 616168 (0.0040) [2024-04-28 13:05:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10095345664. Throughput: 0: 55686.7. Samples: 585656320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:05:07,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 13:05:08,993][57339] Updated weights for policy 0, policy_version 616178 (0.0026) [2024-04-28 13:05:11,496][57339] Updated weights for policy 0, policy_version 616188 (0.0025) [2024-04-28 13:05:12,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 10095656960. Throughput: 0: 55774.5. Samples: 585991980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:05:12,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:05:14,996][57339] Updated weights for policy 0, policy_version 616198 (0.0026) [2024-04-28 13:05:17,063][57319] Signal inference workers to stop experience collection... (8450 times) [2024-04-28 13:05:17,064][57319] Signal inference workers to resume experience collection... 
(8450 times) [2024-04-28 13:05:17,111][57339] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-04-28 13:05:17,112][57339] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-04-28 13:05:17,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10095919104. Throughput: 0: 55905.8. Samples: 586328160. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 13:05:17,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 13:05:17,370][57339] Updated weights for policy 0, policy_version 616208 (0.0034) [2024-04-28 13:05:20,751][57339] Updated weights for policy 0, policy_version 616218 (0.0027) [2024-04-28 13:05:22,169][57108] Fps is (10 sec: 50791.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10096164864. Throughput: 0: 55601.8. Samples: 586491000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:05:22,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 13:05:23,148][57339] Updated weights for policy 0, policy_version 616228 (0.0030) [2024-04-28 13:05:26,506][57339] Updated weights for policy 0, policy_version 616238 (0.0031) [2024-04-28 13:05:27,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10096459776. Throughput: 0: 55737.0. Samples: 586828240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:05:27,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 13:05:27,262][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000616240_10096476160.pth... [2024-04-28 13:05:27,310][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000615424_10083106816.pth [2024-04-28 13:05:28,978][57339] Updated weights for policy 0, policy_version 616248 (0.0029) [2024-04-28 13:05:32,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10096738304. Throughput: 0: 55738.6. Samples: 587161920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:05:32,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:05:32,473][57339] Updated weights for policy 0, policy_version 616258 (0.0030) [2024-04-28 13:05:34,871][57339] Updated weights for policy 0, policy_version 616268 (0.0030) [2024-04-28 13:05:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10097016832. Throughput: 0: 55589.3. Samples: 587323020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:05:37,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 13:05:38,526][57339] Updated weights for policy 0, policy_version 616278 (0.0032) [2024-04-28 13:05:40,706][57339] Updated weights for policy 0, policy_version 616288 (0.0026) [2024-04-28 13:05:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10097311744. Throughput: 0: 55602.4. Samples: 587653640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:05:42,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 13:05:44,660][57339] Updated weights for policy 0, policy_version 616298 (0.0027) [2024-04-28 13:05:46,592][57339] Updated weights for policy 0, policy_version 616308 (0.0027) [2024-04-28 13:05:47,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10097606656. Throughput: 0: 55538.3. Samples: 587985520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:05:47,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 13:05:50,579][57339] Updated weights for policy 0, policy_version 616318 (0.0027) [2024-04-28 13:05:52,169][57108] Fps is (10 sec: 55706.3, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10097868800. Throughput: 0: 55755.0. Samples: 588165300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:05:52,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 13:05:52,551][57339] Updated weights for policy 0, policy_version 616328 (0.0026) [2024-04-28 13:05:56,504][57339] Updated weights for policy 0, policy_version 616338 (0.0035) [2024-04-28 13:05:57,169][57108] Fps is (10 sec: 49151.7, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10098098176. Throughput: 0: 55618.8. Samples: 588494820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:05:57,170][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 13:05:58,597][57339] Updated weights for policy 0, policy_version 616348 (0.0027) [2024-04-28 13:06:02,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10098393088. Throughput: 0: 55373.7. Samples: 588819980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:02,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:06:02,470][57339] Updated weights for policy 0, policy_version 616358 (0.0034) [2024-04-28 13:06:04,798][57339] Updated weights for policy 0, policy_version 616368 (0.0026) [2024-04-28 13:06:07,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 10098688000. Throughput: 0: 55435.0. Samples: 588985580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:07,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 13:06:08,269][57339] Updated weights for policy 0, policy_version 616378 (0.0028) [2024-04-28 13:06:10,639][57339] Updated weights for policy 0, policy_version 616388 (0.0032) [2024-04-28 13:06:12,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55159.7, 300 sec: 55594.5). Total num frames: 10098966528. Throughput: 0: 55322.3. Samples: 589317740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:12,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 13:06:13,145][57319] Signal inference workers to stop experience collection... 
(8500 times) [2024-04-28 13:06:13,145][57319] Signal inference workers to resume experience collection... (8500 times) [2024-04-28 13:06:13,170][57339] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-04-28 13:06:13,170][57339] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-04-28 13:06:14,010][57339] Updated weights for policy 0, policy_version 616398 (0.0039) [2024-04-28 13:06:16,600][57339] Updated weights for policy 0, policy_version 616408 (0.0027) [2024-04-28 13:06:17,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10099245056. Throughput: 0: 55320.1. Samples: 589651320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:17,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 13:06:19,989][57339] Updated weights for policy 0, policy_version 616418 (0.0029) [2024-04-28 13:06:22,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10099523584. Throughput: 0: 55529.1. Samples: 589821840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:22,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 13:06:22,580][57339] Updated weights for policy 0, policy_version 616428 (0.0029) [2024-04-28 13:06:26,090][57339] Updated weights for policy 0, policy_version 616438 (0.0031) [2024-04-28 13:06:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10099802112. Throughput: 0: 55570.8. Samples: 590154320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:27,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 13:06:28,344][57339] Updated weights for policy 0, policy_version 616448 (0.0030) [2024-04-28 13:06:32,028][57339] Updated weights for policy 0, policy_version 616458 (0.0032) [2024-04-28 13:06:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 10100064256. Throughput: 0: 55668.4. Samples: 590490600. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:32,170][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 13:06:34,167][57339] Updated weights for policy 0, policy_version 616468 (0.0031) [2024-04-28 13:06:37,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10100342784. Throughput: 0: 55145.0. Samples: 590646820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:37,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 13:06:37,935][57339] Updated weights for policy 0, policy_version 616478 (0.0030) [2024-04-28 13:06:40,264][57339] Updated weights for policy 0, policy_version 616488 (0.0028) [2024-04-28 13:06:42,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.7, 300 sec: 55539.0). Total num frames: 10100621312. Throughput: 0: 55182.8. Samples: 590978040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:42,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 13:06:43,740][57339] Updated weights for policy 0, policy_version 616498 (0.0027) [2024-04-28 13:06:46,170][57339] Updated weights for policy 0, policy_version 616508 (0.0030) [2024-04-28 13:06:47,169][57108] Fps is (10 sec: 58981.1, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 10100932608. Throughput: 0: 55386.9. Samples: 591312400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:47,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 13:06:49,507][57339] Updated weights for policy 0, policy_version 616518 (0.0040) [2024-04-28 13:06:52,090][57339] Updated weights for policy 0, policy_version 616528 (0.0027) [2024-04-28 13:06:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10101194752. Throughput: 0: 55542.3. Samples: 591484980. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 13:06:52,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 13:06:55,332][57339] Updated weights for policy 0, policy_version 616538 (0.0030) [2024-04-28 13:06:57,169][57108] Fps is (10 sec: 55706.4, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 10101489664. Throughput: 0: 55737.2. Samples: 591825920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:06:57,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 13:06:58,038][57339] Updated weights for policy 0, policy_version 616548 (0.0033) [2024-04-28 13:07:01,280][57339] Updated weights for policy 0, policy_version 616558 (0.0031) [2024-04-28 13:07:02,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10101751808. Throughput: 0: 55694.7. Samples: 592157580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:02,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 13:07:03,889][57339] Updated weights for policy 0, policy_version 616568 (0.0028) [2024-04-28 13:07:07,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 10101997568. Throughput: 0: 55405.5. Samples: 592315080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:07,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 13:07:07,253][57339] Updated weights for policy 0, policy_version 616578 (0.0025) [2024-04-28 13:07:09,628][57339] Updated weights for policy 0, policy_version 616588 (0.0036) [2024-04-28 13:07:12,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 10102276096. Throughput: 0: 55452.0. Samples: 592649660. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:12,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 13:07:13,019][57339] Updated weights for policy 0, policy_version 616598 (0.0027) [2024-04-28 13:07:15,404][57339] Updated weights for policy 0, policy_version 616608 (0.0030) [2024-04-28 13:07:17,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 10102571008. Throughput: 0: 55523.7. Samples: 592989160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:17,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 13:07:18,991][57339] Updated weights for policy 0, policy_version 616618 (0.0027) [2024-04-28 13:07:21,395][57339] Updated weights for policy 0, policy_version 616628 (0.0032) [2024-04-28 13:07:22,169][57108] Fps is (10 sec: 60620.9, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 10102882304. Throughput: 0: 55808.4. Samples: 593158200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:22,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 13:07:24,721][57339] Updated weights for policy 0, policy_version 616638 (0.0036) [2024-04-28 13:07:25,267][57319] Signal inference workers to stop experience collection... (8550 times) [2024-04-28 13:07:25,267][57319] Signal inference workers to resume experience collection... (8550 times) [2024-04-28 13:07:25,295][57339] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-04-28 13:07:25,295][57339] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-04-28 13:07:27,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10103144448. Throughput: 0: 55907.5. Samples: 593493880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:27,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 13:07:27,192][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000616648_10103160832.pth... 
[2024-04-28 13:07:27,203][57339] Updated weights for policy 0, policy_version 616648 (0.0029) [2024-04-28 13:07:27,241][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000615833_10089807872.pth [2024-04-28 13:07:30,621][57339] Updated weights for policy 0, policy_version 616658 (0.0026) [2024-04-28 13:07:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10103422976. Throughput: 0: 55788.3. Samples: 593822860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:32,169][57108] Avg episode reward: [(0, '0.520')] [2024-04-28 13:07:33,074][57339] Updated weights for policy 0, policy_version 616668 (0.0033) [2024-04-28 13:07:36,400][57339] Updated weights for policy 0, policy_version 616678 (0.0033) [2024-04-28 13:07:37,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 10103701504. Throughput: 0: 55874.6. Samples: 593999340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:37,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 13:07:38,846][57339] Updated weights for policy 0, policy_version 616688 (0.0031) [2024-04-28 13:07:42,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 10103963648. Throughput: 0: 55699.0. Samples: 594332380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:42,170][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 13:07:42,319][57339] Updated weights for policy 0, policy_version 616698 (0.0026) [2024-04-28 13:07:44,620][57339] Updated weights for policy 0, policy_version 616708 (0.0026) [2024-04-28 13:07:47,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55159.7, 300 sec: 55539.0). Total num frames: 10104242176. Throughput: 0: 55780.4. Samples: 594667700. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:47,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 13:07:48,155][57339] Updated weights for policy 0, policy_version 616718 (0.0031) [2024-04-28 13:07:50,617][57339] Updated weights for policy 0, policy_version 616728 (0.0027) [2024-04-28 13:07:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 10104520704. Throughput: 0: 55806.1. Samples: 594826360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:52,170][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 13:07:54,037][57339] Updated weights for policy 0, policy_version 616738 (0.0029) [2024-04-28 13:07:56,590][57339] Updated weights for policy 0, policy_version 616748 (0.0027) [2024-04-28 13:07:57,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10104832000. Throughput: 0: 55944.5. Samples: 595167160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:07:57,169][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 13:08:00,053][57339] Updated weights for policy 0, policy_version 616758 (0.0035) [2024-04-28 13:08:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10105094144. Throughput: 0: 55860.7. Samples: 595502900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:08:02,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 13:08:02,477][57339] Updated weights for policy 0, policy_version 616768 (0.0028) [2024-04-28 13:08:05,791][57339] Updated weights for policy 0, policy_version 616778 (0.0028) [2024-04-28 13:08:07,169][57108] Fps is (10 sec: 55704.9, 60 sec: 56524.7, 300 sec: 55650.0). Total num frames: 10105389056. Throughput: 0: 55915.5. Samples: 595674400. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:08:07,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 13:08:08,221][57339] Updated weights for policy 0, policy_version 616788 (0.0032) [2024-04-28 13:08:11,598][57339] Updated weights for policy 0, policy_version 616798 (0.0030) [2024-04-28 13:08:12,169][57108] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10105651200. Throughput: 0: 55940.0. Samples: 596011180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:08:12,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 13:08:14,098][57339] Updated weights for policy 0, policy_version 616808 (0.0034) [2024-04-28 13:08:17,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10105913344. Throughput: 0: 56055.3. Samples: 596345360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:08:17,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 13:08:17,521][57339] Updated weights for policy 0, policy_version 616818 (0.0027) [2024-04-28 13:08:20,019][57339] Updated weights for policy 0, policy_version 616828 (0.0027) [2024-04-28 13:08:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10106191872. Throughput: 0: 55739.3. Samples: 596507600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:08:22,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 13:08:23,279][57339] Updated weights for policy 0, policy_version 616838 (0.0031) [2024-04-28 13:08:25,804][57339] Updated weights for policy 0, policy_version 616848 (0.0033) [2024-04-28 13:08:27,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 10106470400. Throughput: 0: 55869.4. Samples: 596846500. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-04-28 13:08:27,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 13:08:29,024][57339] Updated weights for policy 0, policy_version 616858 (0.0031) [2024-04-28 13:08:31,551][57319] Signal inference workers to stop experience collection... (8600 times) [2024-04-28 13:08:31,578][57339] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-04-28 13:08:31,605][57319] Signal inference workers to resume experience collection... (8600 times) [2024-04-28 13:08:31,619][57339] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-04-28 13:08:31,724][57339] Updated weights for policy 0, policy_version 616868 (0.0033) [2024-04-28 13:08:32,169][57108] Fps is (10 sec: 58981.8, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 10106781696. Throughput: 0: 55838.0. Samples: 597180420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:08:32,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 13:08:34,958][57339] Updated weights for policy 0, policy_version 616878 (0.0027) [2024-04-28 13:08:37,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 10107060224. Throughput: 0: 56074.8. Samples: 597349720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:08:37,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 13:08:37,726][57339] Updated weights for policy 0, policy_version 616888 (0.0029) [2024-04-28 13:08:40,920][57339] Updated weights for policy 0, policy_version 616898 (0.0025) [2024-04-28 13:08:42,169][57108] Fps is (10 sec: 55706.2, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 10107338752. Throughput: 0: 55987.5. Samples: 597686600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:08:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 13:08:43,474][57339] Updated weights for policy 0, policy_version 616908 (0.0029) [2024-04-28 13:08:46,792][57339] Updated weights for policy 0, policy_version 616918 (0.0027) [2024-04-28 13:08:47,169][57108] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 10107617280. Throughput: 0: 55967.6. Samples: 598021440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:08:47,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 13:08:49,327][57339] Updated weights for policy 0, policy_version 616928 (0.0030) [2024-04-28 13:08:52,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10107863040. Throughput: 0: 55753.7. Samples: 598183320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:08:52,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 13:08:52,698][57339] Updated weights for policy 0, policy_version 616938 (0.0035) [2024-04-28 13:08:55,261][57339] Updated weights for policy 0, policy_version 616948 (0.0029) [2024-04-28 13:08:57,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10108157952. Throughput: 0: 55694.7. Samples: 598517440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:08:57,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 13:08:58,489][57339] Updated weights for policy 0, policy_version 616958 (0.0033) [2024-04-28 13:09:01,698][57339] Updated weights for policy 0, policy_version 616968 (0.0036) [2024-04-28 13:09:02,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 10108420096. Throughput: 0: 55774.0. Samples: 598855180. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:02,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 13:09:04,428][57339] Updated weights for policy 0, policy_version 616978 (0.0030) [2024-04-28 13:09:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10108715008. Throughput: 0: 55657.3. Samples: 599012180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:07,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:09:07,671][57339] Updated weights for policy 0, policy_version 616988 (0.0029) [2024-04-28 13:09:10,191][57339] Updated weights for policy 0, policy_version 616998 (0.0028) [2024-04-28 13:09:12,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10109009920. Throughput: 0: 55657.4. Samples: 599351080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 13:09:13,398][57339] Updated weights for policy 0, policy_version 617008 (0.0026) [2024-04-28 13:09:16,124][57339] Updated weights for policy 0, policy_version 617018 (0.0024) [2024-04-28 13:09:17,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 10109304832. Throughput: 0: 55666.4. Samples: 599685400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:17,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:09:19,438][57339] Updated weights for policy 0, policy_version 617028 (0.0033) [2024-04-28 13:09:22,048][57339] Updated weights for policy 0, policy_version 617038 (0.0028) [2024-04-28 13:09:22,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10109550592. Throughput: 0: 55674.7. Samples: 599855080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:22,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 13:09:25,171][57339] Updated weights for policy 0, policy_version 617048 (0.0030) [2024-04-28 13:09:27,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10109812736. Throughput: 0: 55543.6. Samples: 600186060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:27,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 13:09:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000617054_10109812736.pth... [2024-04-28 13:09:27,223][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000616240_10096476160.pth [2024-04-28 13:09:27,888][57339] Updated weights for policy 0, policy_version 617058 (0.0027) [2024-04-28 13:09:31,020][57339] Updated weights for policy 0, policy_version 617068 (0.0030) [2024-04-28 13:09:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10110091264. Throughput: 0: 55451.2. Samples: 600516740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:32,169][57108] Avg episode reward: [(0, '0.700')] [2024-04-28 13:09:33,731][57339] Updated weights for policy 0, policy_version 617078 (0.0043) [2024-04-28 13:09:36,914][57319] Signal inference workers to stop experience collection... (8650 times) [2024-04-28 13:09:36,950][57339] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-04-28 13:09:36,977][57319] Signal inference workers to resume experience collection... (8650 times) [2024-04-28 13:09:36,977][57339] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-04-28 13:09:36,980][57339] Updated weights for policy 0, policy_version 617088 (0.0028) [2024-04-28 13:09:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10110386176. Throughput: 0: 55525.4. Samples: 600681960. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:37,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 13:09:39,509][57339] Updated weights for policy 0, policy_version 617098 (0.0027) [2024-04-28 13:09:42,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 10110648320. Throughput: 0: 55578.5. Samples: 601018480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:42,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 13:09:42,870][57339] Updated weights for policy 0, policy_version 617108 (0.0026) [2024-04-28 13:09:45,484][57339] Updated weights for policy 0, policy_version 617118 (0.0025) [2024-04-28 13:09:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10110959616. Throughput: 0: 55387.9. Samples: 601347640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:47,170][57108] Avg episode reward: [(0, '0.515')] [2024-04-28 13:09:48,717][57339] Updated weights for policy 0, policy_version 617128 (0.0031) [2024-04-28 13:09:51,532][57339] Updated weights for policy 0, policy_version 617138 (0.0027) [2024-04-28 13:09:52,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10111238144. Throughput: 0: 55727.6. Samples: 601519920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:52,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 13:09:54,444][57339] Updated weights for policy 0, policy_version 617148 (0.0028) [2024-04-28 13:09:57,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10111483904. Throughput: 0: 55640.0. Samples: 601854880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:09:57,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 13:09:57,380][57339] Updated weights for policy 0, policy_version 617158 (0.0029) [2024-04-28 13:10:00,292][57339] Updated weights for policy 0, policy_version 617168 (0.0030) [2024-04-28 13:10:02,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10111762432. Throughput: 0: 55686.1. Samples: 602191280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 13:10:02,170][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 13:10:03,183][57339] Updated weights for policy 0, policy_version 617178 (0.0028) [2024-04-28 13:10:06,135][57339] Updated weights for policy 0, policy_version 617188 (0.0028) [2024-04-28 13:10:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 10112040960. Throughput: 0: 55399.1. Samples: 602348040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:07,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 13:10:09,158][57339] Updated weights for policy 0, policy_version 617198 (0.0030) [2024-04-28 13:10:12,098][57339] Updated weights for policy 0, policy_version 617208 (0.0033) [2024-04-28 13:10:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10112335872. Throughput: 0: 55377.6. Samples: 602678060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:12,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 13:10:15,044][57339] Updated weights for policy 0, policy_version 617218 (0.0031) [2024-04-28 13:10:17,169][57108] Fps is (10 sec: 55704.8, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10112598016. Throughput: 0: 55515.5. Samples: 603014940. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:17,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 13:10:18,094][57339] Updated weights for policy 0, policy_version 617228 (0.0030) [2024-04-28 13:10:20,865][57339] Updated weights for policy 0, policy_version 617238 (0.0026) [2024-04-28 13:10:22,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10112892928. Throughput: 0: 55571.7. Samples: 603182680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:22,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 13:10:23,817][57339] Updated weights for policy 0, policy_version 617248 (0.0031) [2024-04-28 13:10:26,660][57339] Updated weights for policy 0, policy_version 617258 (0.0032) [2024-04-28 13:10:27,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10113171456. Throughput: 0: 55598.9. Samples: 603520420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:27,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 13:10:29,800][57339] Updated weights for policy 0, policy_version 617268 (0.0028) [2024-04-28 13:10:32,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10113433600. Throughput: 0: 55716.9. Samples: 603854900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:32,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 13:10:32,560][57339] Updated weights for policy 0, policy_version 617278 (0.0027) [2024-04-28 13:10:35,713][57339] Updated weights for policy 0, policy_version 617288 (0.0031) [2024-04-28 13:10:37,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10113712128. Throughput: 0: 55544.4. Samples: 604019420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:37,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 13:10:38,535][57339] Updated weights for policy 0, policy_version 617298 (0.0025) [2024-04-28 13:10:41,388][57339] Updated weights for policy 0, policy_version 617308 (0.0035) [2024-04-28 13:10:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 10114007040. Throughput: 0: 55644.9. Samples: 604358900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:42,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 13:10:44,261][57339] Updated weights for policy 0, policy_version 617318 (0.0029) [2024-04-28 13:10:47,129][57339] Updated weights for policy 0, policy_version 617328 (0.0030) [2024-04-28 13:10:47,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10114301952. Throughput: 0: 55613.9. Samples: 604693900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:47,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 13:10:50,021][57339] Updated weights for policy 0, policy_version 617338 (0.0029) [2024-04-28 13:10:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10114564096. Throughput: 0: 55953.6. Samples: 604865960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:52,178][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 13:10:53,101][57339] Updated weights for policy 0, policy_version 617348 (0.0032) [2024-04-28 13:10:55,094][57319] Signal inference workers to stop experience collection... (8700 times) [2024-04-28 13:10:55,134][57339] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-04-28 13:10:55,146][57319] Signal inference workers to resume experience collection... 
(8700 times) [2024-04-28 13:10:55,153][57339] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-04-28 13:10:56,191][57339] Updated weights for policy 0, policy_version 617358 (0.0035) [2024-04-28 13:10:57,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10114842624. Throughput: 0: 55950.7. Samples: 605195840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:10:57,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 13:10:59,061][57339] Updated weights for policy 0, policy_version 617368 (0.0030) [2024-04-28 13:11:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10115104768. Throughput: 0: 55830.7. Samples: 605527320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:11:02,170][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 13:11:02,203][57339] Updated weights for policy 0, policy_version 617378 (0.0027) [2024-04-28 13:11:04,961][57339] Updated weights for policy 0, policy_version 617388 (0.0027) [2024-04-28 13:11:07,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10115399680. Throughput: 0: 55878.6. Samples: 605697220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:11:07,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:11:08,008][57339] Updated weights for policy 0, policy_version 617398 (0.0034) [2024-04-28 13:11:10,874][57339] Updated weights for policy 0, policy_version 617408 (0.0029) [2024-04-28 13:11:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10115661824. Throughput: 0: 55929.6. Samples: 606037260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:11:12,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 13:11:13,794][57339] Updated weights for policy 0, policy_version 617418 (0.0037) [2024-04-28 13:11:16,719][57339] Updated weights for policy 0, policy_version 617428 (0.0034) [2024-04-28 13:11:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10115956736. Throughput: 0: 55793.4. Samples: 606365600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:11:17,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 13:11:19,689][57339] Updated weights for policy 0, policy_version 617438 (0.0027) [2024-04-28 13:11:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10116235264. Throughput: 0: 55991.7. Samples: 606539040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:11:22,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 13:11:22,452][57339] Updated weights for policy 0, policy_version 617448 (0.0027) [2024-04-28 13:11:25,695][57339] Updated weights for policy 0, policy_version 617458 (0.0028) [2024-04-28 13:11:27,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10116530176. Throughput: 0: 55912.6. Samples: 606874980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:11:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 13:11:27,187][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000617464_10116530176.pth... [2024-04-28 13:11:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000616648_10103160832.pth [2024-04-28 13:11:28,214][57339] Updated weights for policy 0, policy_version 617468 (0.0029) [2024-04-28 13:11:31,624][57339] Updated weights for policy 0, policy_version 617478 (0.0030) [2024-04-28 13:11:32,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 55816.7). 
Total num frames: 10116808704. Throughput: 0: 55959.1. Samples: 607212060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:11:32,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 13:11:34,089][57339] Updated weights for policy 0, policy_version 617488 (0.0024) [2024-04-28 13:11:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10117070848. Throughput: 0: 55859.9. Samples: 607379660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:11:37,169][57108] Avg episode reward: [(0, '0.497')] [2024-04-28 13:11:37,271][57339] Updated weights for policy 0, policy_version 617498 (0.0043) [2024-04-28 13:11:40,112][57339] Updated weights for policy 0, policy_version 617508 (0.0030) [2024-04-28 13:11:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.4, 300 sec: 55650.1). Total num frames: 10117349376. Throughput: 0: 55953.2. Samples: 607713740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:11:42,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:11:43,161][57339] Updated weights for policy 0, policy_version 617518 (0.0032) [2024-04-28 13:11:46,210][57339] Updated weights for policy 0, policy_version 617528 (0.0028) [2024-04-28 13:11:47,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10117611520. Throughput: 0: 55947.3. Samples: 608044940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:11:47,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 13:11:49,111][57339] Updated weights for policy 0, policy_version 617538 (0.0029) [2024-04-28 13:11:52,029][57339] Updated weights for policy 0, policy_version 617548 (0.0028) [2024-04-28 13:11:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10117906432. Throughput: 0: 55779.4. Samples: 608207300. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:11:52,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:11:55,004][57339] Updated weights for policy 0, policy_version 617558 (0.0030) [2024-04-28 13:11:57,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10118184960. Throughput: 0: 55569.5. Samples: 608537880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:11:57,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 13:11:57,951][57339] Updated weights for policy 0, policy_version 617568 (0.0031) [2024-04-28 13:11:59,365][57319] Signal inference workers to stop experience collection... (8750 times) [2024-04-28 13:11:59,366][57319] Signal inference workers to resume experience collection... (8750 times) [2024-04-28 13:11:59,391][57339] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-04-28 13:11:59,391][57339] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-04-28 13:12:00,807][57339] Updated weights for policy 0, policy_version 617578 (0.0040) [2024-04-28 13:12:02,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10118479872. Throughput: 0: 55564.8. Samples: 608866020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:02,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 13:12:03,950][57339] Updated weights for policy 0, policy_version 617588 (0.0027) [2024-04-28 13:12:06,864][57339] Updated weights for policy 0, policy_version 617598 (0.0026) [2024-04-28 13:12:07,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10118758400. Throughput: 0: 55699.6. Samples: 609045520. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:07,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 13:12:09,907][57339] Updated weights for policy 0, policy_version 617608 (0.0037) [2024-04-28 13:12:12,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10119004160. Throughput: 0: 55626.9. Samples: 609378180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:12,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 13:12:12,720][57339] Updated weights for policy 0, policy_version 617618 (0.0032) [2024-04-28 13:12:15,732][57339] Updated weights for policy 0, policy_version 617628 (0.0028) [2024-04-28 13:12:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10119299072. Throughput: 0: 55438.8. Samples: 609706800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:17,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 13:12:18,647][57339] Updated weights for policy 0, policy_version 617638 (0.0030) [2024-04-28 13:12:21,646][57339] Updated weights for policy 0, policy_version 617648 (0.0029) [2024-04-28 13:12:22,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10119561216. Throughput: 0: 55140.0. Samples: 609860960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:22,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 13:12:24,591][57339] Updated weights for policy 0, policy_version 617658 (0.0037) [2024-04-28 13:12:27,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10119839744. Throughput: 0: 55088.2. Samples: 610192700. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:27,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 13:12:27,589][57339] Updated weights for policy 0, policy_version 617668 (0.0035) [2024-04-28 13:12:30,528][57339] Updated weights for policy 0, policy_version 617678 (0.0025) [2024-04-28 13:12:32,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10120118272. Throughput: 0: 55173.8. Samples: 610527760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:32,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 13:12:33,435][57339] Updated weights for policy 0, policy_version 617688 (0.0024) [2024-04-28 13:12:36,296][57339] Updated weights for policy 0, policy_version 617698 (0.0026) [2024-04-28 13:12:37,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10120429568. Throughput: 0: 55457.9. Samples: 610702900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:37,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 13:12:39,351][57339] Updated weights for policy 0, policy_version 617708 (0.0027) [2024-04-28 13:12:42,040][57339] Updated weights for policy 0, policy_version 617718 (0.0026) [2024-04-28 13:12:42,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 10120708096. Throughput: 0: 55477.9. Samples: 611034380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:42,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 13:12:45,203][57339] Updated weights for policy 0, policy_version 617728 (0.0033) [2024-04-28 13:12:47,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10120953856. Throughput: 0: 55657.7. Samples: 611370620. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:47,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 13:12:47,850][57339] Updated weights for policy 0, policy_version 617738 (0.0032) [2024-04-28 13:12:48,816][57319] Signal inference workers to stop experience collection... (8800 times) [2024-04-28 13:12:48,817][57319] Signal inference workers to resume experience collection... (8800 times) [2024-04-28 13:12:48,846][57339] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-04-28 13:12:48,846][57339] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-04-28 13:12:51,280][57339] Updated weights for policy 0, policy_version 617748 (0.0026) [2024-04-28 13:12:52,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10121248768. Throughput: 0: 55482.2. Samples: 611542220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:52,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 13:12:53,810][57339] Updated weights for policy 0, policy_version 617758 (0.0029) [2024-04-28 13:12:57,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 10121494528. Throughput: 0: 55525.6. Samples: 611876840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:12:57,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 13:12:57,250][57339] Updated weights for policy 0, policy_version 617768 (0.0027) [2024-04-28 13:12:59,739][57339] Updated weights for policy 0, policy_version 617778 (0.0031) [2024-04-28 13:13:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10121805824. Throughput: 0: 55716.3. Samples: 612214040. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:13:02,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:13:03,087][57339] Updated weights for policy 0, policy_version 617788 (0.0031) [2024-04-28 13:13:05,584][57339] Updated weights for policy 0, policy_version 617798 (0.0029) [2024-04-28 13:13:07,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.2, 300 sec: 55650.0). Total num frames: 10122067968. Throughput: 0: 56025.6. Samples: 612382120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:13:07,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 13:13:08,847][57339] Updated weights for policy 0, policy_version 617808 (0.0030) [2024-04-28 13:13:11,363][57339] Updated weights for policy 0, policy_version 617818 (0.0025) [2024-04-28 13:13:12,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10122379264. Throughput: 0: 56136.0. Samples: 612718820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-04-28 13:13:12,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 13:13:14,646][57339] Updated weights for policy 0, policy_version 617828 (0.0029) [2024-04-28 13:13:17,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10122641408. Throughput: 0: 56042.5. Samples: 613049680. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:13:17,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 13:13:17,321][57339] Updated weights for policy 0, policy_version 617838 (0.0023) [2024-04-28 13:13:20,518][57339] Updated weights for policy 0, policy_version 617848 (0.0031) [2024-04-28 13:13:22,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10122903552. Throughput: 0: 55932.6. Samples: 613219860. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:13:22,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 13:13:23,055][57339] Updated weights for policy 0, policy_version 617858 (0.0028) [2024-04-28 13:13:26,392][57339] Updated weights for policy 0, policy_version 617868 (0.0030) [2024-04-28 13:13:27,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10123214848. Throughput: 0: 55923.0. Samples: 613550920. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:13:27,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:13:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000617872_10123214848.pth... [2024-04-28 13:13:27,219][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000617054_10109812736.pth [2024-04-28 13:13:28,963][57339] Updated weights for policy 0, policy_version 617878 (0.0025) [2024-04-28 13:13:32,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10123460608. Throughput: 0: 55921.8. Samples: 613887100. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:13:32,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 13:13:32,228][57339] Updated weights for policy 0, policy_version 617888 (0.0027) [2024-04-28 13:13:34,837][57339] Updated weights for policy 0, policy_version 617898 (0.0028) [2024-04-28 13:13:37,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10123739136. Throughput: 0: 55737.3. Samples: 614050400. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:13:37,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 13:13:38,298][57339] Updated weights for policy 0, policy_version 617908 (0.0030) [2024-04-28 13:13:38,361][57319] Signal inference workers to stop experience collection... 
(8850 times) [2024-04-28 13:13:38,410][57339] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-04-28 13:13:38,421][57319] Signal inference workers to resume experience collection... (8850 times) [2024-04-28 13:13:38,426][57339] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-04-28 13:13:40,846][57339] Updated weights for policy 0, policy_version 617918 (0.0029) [2024-04-28 13:13:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 10124017664. Throughput: 0: 55724.5. Samples: 614384440. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:13:42,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 13:13:44,042][57339] Updated weights for policy 0, policy_version 617928 (0.0037) [2024-04-28 13:13:46,666][57339] Updated weights for policy 0, policy_version 617938 (0.0029) [2024-04-28 13:13:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 10124312576. Throughput: 0: 55641.0. Samples: 614717880. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:13:47,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 13:13:49,857][57339] Updated weights for policy 0, policy_version 617948 (0.0023) [2024-04-28 13:13:52,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10124591104. Throughput: 0: 55713.3. Samples: 614889200. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:13:52,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 13:13:52,497][57339] Updated weights for policy 0, policy_version 617958 (0.0040) [2024-04-28 13:13:55,665][57339] Updated weights for policy 0, policy_version 617968 (0.0029) [2024-04-28 13:13:57,169][57108] Fps is (10 sec: 55704.0, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10124869632. Throughput: 0: 55496.5. Samples: 615216180. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:13:57,170][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 13:13:58,424][57339] Updated weights for policy 0, policy_version 617978 (0.0030) [2024-04-28 13:14:01,765][57339] Updated weights for policy 0, policy_version 617988 (0.0025) [2024-04-28 13:14:02,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10125148160. Throughput: 0: 55676.9. Samples: 615555140. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:02,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 13:14:04,347][57339] Updated weights for policy 0, policy_version 617998 (0.0035) [2024-04-28 13:14:07,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.6, 300 sec: 55538.9). Total num frames: 10125393920. Throughput: 0: 55428.7. Samples: 615714160. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:07,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 13:14:07,592][57339] Updated weights for policy 0, policy_version 618008 (0.0026) [2024-04-28 13:14:10,072][57339] Updated weights for policy 0, policy_version 618018 (0.0035) [2024-04-28 13:14:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10125688832. Throughput: 0: 55435.5. Samples: 616045520. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:12,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 13:14:13,414][57339] Updated weights for policy 0, policy_version 618028 (0.0031) [2024-04-28 13:14:15,892][57339] Updated weights for policy 0, policy_version 618038 (0.0032) [2024-04-28 13:14:17,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10125950976. Throughput: 0: 55342.8. Samples: 616377520. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:17,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 13:14:19,332][57339] Updated weights for policy 0, policy_version 618048 (0.0029) [2024-04-28 13:14:22,045][57339] Updated weights for policy 0, policy_version 618058 (0.0032) [2024-04-28 13:14:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10126262272. Throughput: 0: 55413.3. Samples: 616544000. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:22,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 13:14:25,382][57339] Updated weights for policy 0, policy_version 618068 (0.0026) [2024-04-28 13:14:27,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10126524416. Throughput: 0: 55450.8. Samples: 616879720. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:27,169][57108] Avg episode reward: [(0, '0.713')] [2024-04-28 13:14:27,896][57339] Updated weights for policy 0, policy_version 618078 (0.0028) [2024-04-28 13:14:31,152][57339] Updated weights for policy 0, policy_version 618088 (0.0023) [2024-04-28 13:14:32,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10126819328. Throughput: 0: 55495.1. Samples: 617215160. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:32,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 13:14:33,701][57339] Updated weights for policy 0, policy_version 618098 (0.0027) [2024-04-28 13:14:36,795][57319] Signal inference workers to stop experience collection... (8900 times) [2024-04-28 13:14:36,795][57319] Signal inference workers to resume experience collection... 
(8900 times) [2024-04-28 13:14:36,819][57339] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-04-28 13:14:36,819][57339] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-04-28 13:14:36,904][57339] Updated weights for policy 0, policy_version 618108 (0.0029) [2024-04-28 13:14:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10127081472. Throughput: 0: 55408.8. Samples: 617382600. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:37,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 13:14:39,702][57339] Updated weights for policy 0, policy_version 618118 (0.0026) [2024-04-28 13:14:42,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10127343616. Throughput: 0: 55614.9. Samples: 617718840. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:42,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 13:14:42,926][57339] Updated weights for policy 0, policy_version 618128 (0.0034) [2024-04-28 13:14:45,707][57339] Updated weights for policy 0, policy_version 618138 (0.0027) [2024-04-28 13:14:47,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10127622144. Throughput: 0: 55428.0. Samples: 618049400. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-04-28 13:14:47,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 13:14:48,865][57339] Updated weights for policy 0, policy_version 618148 (0.0034) [2024-04-28 13:14:51,729][57339] Updated weights for policy 0, policy_version 618158 (0.0024) [2024-04-28 13:14:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10127917056. Throughput: 0: 55492.1. Samples: 618211300. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:14:52,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 13:14:54,643][57339] Updated weights for policy 0, policy_version 618168 (0.0034) [2024-04-28 13:14:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.8, 300 sec: 55705.6). Total num frames: 10128195584. Throughput: 0: 55527.6. Samples: 618544260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:14:57,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 13:14:57,491][57339] Updated weights for policy 0, policy_version 618178 (0.0031) [2024-04-28 13:15:00,491][57339] Updated weights for policy 0, policy_version 618188 (0.0026) [2024-04-28 13:15:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10128474112. Throughput: 0: 55610.6. Samples: 618880000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:02,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 13:15:03,314][57339] Updated weights for policy 0, policy_version 618198 (0.0024) [2024-04-28 13:15:06,302][57339] Updated weights for policy 0, policy_version 618208 (0.0034) [2024-04-28 13:15:07,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 10128769024. Throughput: 0: 55850.3. Samples: 619057260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:07,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 13:15:09,427][57339] Updated weights for policy 0, policy_version 618218 (0.0031) [2024-04-28 13:15:12,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10129031168. Throughput: 0: 55798.9. Samples: 619390680. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:12,170][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 13:15:12,296][57339] Updated weights for policy 0, policy_version 618228 (0.0037) [2024-04-28 13:15:15,118][57339] Updated weights for policy 0, policy_version 618238 (0.0029) [2024-04-28 13:15:17,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10129293312. Throughput: 0: 55793.3. Samples: 619725860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:17,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 13:15:18,050][57339] Updated weights for policy 0, policy_version 618248 (0.0028) [2024-04-28 13:15:20,954][57339] Updated weights for policy 0, policy_version 618258 (0.0026) [2024-04-28 13:15:22,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10129571840. Throughput: 0: 55743.6. Samples: 619891060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:22,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 13:15:23,765][57339] Updated weights for policy 0, policy_version 618268 (0.0030) [2024-04-28 13:15:26,841][57339] Updated weights for policy 0, policy_version 618278 (0.0030) [2024-04-28 13:15:27,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10129883136. Throughput: 0: 55633.0. Samples: 620222320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:27,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 13:15:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000618279_10129883136.pth... [2024-04-28 13:15:27,229][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000617464_10116530176.pth [2024-04-28 13:15:29,710][57339] Updated weights for policy 0, policy_version 618288 (0.0028) [2024-04-28 13:15:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). 
Total num frames: 10130145280. Throughput: 0: 55718.3. Samples: 620556720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:32,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 13:15:32,932][57339] Updated weights for policy 0, policy_version 618298 (0.0026) [2024-04-28 13:15:35,667][57339] Updated weights for policy 0, policy_version 618308 (0.0029) [2024-04-28 13:15:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10130423808. Throughput: 0: 56009.0. Samples: 620731700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:37,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 13:15:38,604][57339] Updated weights for policy 0, policy_version 618318 (0.0025) [2024-04-28 13:15:40,219][57319] Signal inference workers to stop experience collection... (8950 times) [2024-04-28 13:15:40,252][57339] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-04-28 13:15:40,277][57319] Signal inference workers to resume experience collection... (8950 times) [2024-04-28 13:15:40,282][57339] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-04-28 13:15:41,395][57339] Updated weights for policy 0, policy_version 618328 (0.0027) [2024-04-28 13:15:42,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 10130718720. Throughput: 0: 56034.3. Samples: 621065800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:42,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 13:15:44,354][57339] Updated weights for policy 0, policy_version 618338 (0.0025) [2024-04-28 13:15:47,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10130997248. Throughput: 0: 55972.4. Samples: 621398760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:47,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 13:15:47,193][57339] Updated weights for policy 0, policy_version 618348 (0.0031) [2024-04-28 13:15:50,378][57339] Updated weights for policy 0, policy_version 618358 (0.0038) [2024-04-28 13:15:52,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10131243008. Throughput: 0: 55720.9. Samples: 621564700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:52,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 13:15:53,125][57339] Updated weights for policy 0, policy_version 618368 (0.0029) [2024-04-28 13:15:56,233][57339] Updated weights for policy 0, policy_version 618378 (0.0031) [2024-04-28 13:15:57,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10131537920. Throughput: 0: 55702.3. Samples: 621897280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:15:57,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 13:15:59,110][57339] Updated weights for policy 0, policy_version 618388 (0.0029) [2024-04-28 13:16:01,989][57339] Updated weights for policy 0, policy_version 618398 (0.0028) [2024-04-28 13:16:02,169][57108] Fps is (10 sec: 58981.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10131832832. Throughput: 0: 55779.4. Samples: 622235940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:16:02,170][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 13:16:04,843][57339] Updated weights for policy 0, policy_version 618408 (0.0030) [2024-04-28 13:16:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10132111360. Throughput: 0: 55793.3. Samples: 622401760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:16:07,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 13:16:07,847][57339] Updated weights for policy 0, policy_version 618418 (0.0029) [2024-04-28 13:16:10,750][57339] Updated weights for policy 0, policy_version 618428 (0.0033) [2024-04-28 13:16:12,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10132389888. Throughput: 0: 55834.6. Samples: 622734880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:16:12,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 13:16:13,802][57339] Updated weights for policy 0, policy_version 618438 (0.0029) [2024-04-28 13:16:16,583][57339] Updated weights for policy 0, policy_version 618448 (0.0031) [2024-04-28 13:16:17,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 10132668416. Throughput: 0: 55718.9. Samples: 623064080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:16:17,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 13:16:19,819][57339] Updated weights for policy 0, policy_version 618458 (0.0027) [2024-04-28 13:16:22,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 10132930560. Throughput: 0: 55630.0. Samples: 623235060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:16:22,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 13:16:22,537][57339] Updated weights for policy 0, policy_version 618468 (0.0027) [2024-04-28 13:16:25,507][57339] Updated weights for policy 0, policy_version 618478 (0.0027) [2024-04-28 13:16:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10133225472. Throughput: 0: 55576.8. Samples: 623566760. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:16:27,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 13:16:28,303][57339] Updated weights for policy 0, policy_version 618488 (0.0032) [2024-04-28 13:16:31,568][57339] Updated weights for policy 0, policy_version 618498 (0.0032) [2024-04-28 13:16:32,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10133487616. Throughput: 0: 55756.1. Samples: 623907780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:16:32,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 13:16:34,102][57339] Updated weights for policy 0, policy_version 618508 (0.0029) [2024-04-28 13:16:35,750][57319] Signal inference workers to stop experience collection... (9000 times) [2024-04-28 13:16:35,750][57319] Signal inference workers to resume experience collection... (9000 times) [2024-04-28 13:16:35,783][57339] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-04-28 13:16:35,784][57339] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-04-28 13:16:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10133766144. Throughput: 0: 55584.4. Samples: 624066000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:16:37,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 13:16:37,400][57339] Updated weights for policy 0, policy_version 618518 (0.0031) [2024-04-28 13:16:40,180][57339] Updated weights for policy 0, policy_version 618528 (0.0030) [2024-04-28 13:16:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10134044672. Throughput: 0: 55657.9. Samples: 624401880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:16:42,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 13:16:43,269][57339] Updated weights for policy 0, policy_version 618538 (0.0032) [2024-04-28 13:16:45,992][57339] Updated weights for policy 0, policy_version 618548 (0.0027) [2024-04-28 13:16:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10134339584. Throughput: 0: 55719.6. Samples: 624743320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:16:47,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 13:16:49,101][57339] Updated weights for policy 0, policy_version 618558 (0.0028) [2024-04-28 13:16:52,013][57339] Updated weights for policy 0, policy_version 618568 (0.0028) [2024-04-28 13:16:52,169][57108] Fps is (10 sec: 57342.9, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 10134618112. Throughput: 0: 55715.4. Samples: 624908960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:16:52,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:16:54,883][57339] Updated weights for policy 0, policy_version 618578 (0.0024) [2024-04-28 13:16:57,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10134896640. Throughput: 0: 55780.0. Samples: 625244980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:16:57,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 13:16:58,147][57339] Updated weights for policy 0, policy_version 618588 (0.0027) [2024-04-28 13:17:00,851][57339] Updated weights for policy 0, policy_version 618598 (0.0027) [2024-04-28 13:17:02,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10135158784. Throughput: 0: 55740.2. Samples: 625572380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:02,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 13:17:04,101][57339] Updated weights for policy 0, policy_version 618608 (0.0034) [2024-04-28 13:17:06,852][57339] Updated weights for policy 0, policy_version 618618 (0.0032) [2024-04-28 13:17:07,169][57108] Fps is (10 sec: 55704.0, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 10135453696. Throughput: 0: 55796.7. Samples: 625745920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:07,170][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 13:17:09,951][57339] Updated weights for policy 0, policy_version 618628 (0.0030) [2024-04-28 13:17:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10135732224. Throughput: 0: 55872.2. Samples: 626081000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:12,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 13:17:12,689][57339] Updated weights for policy 0, policy_version 618638 (0.0030) [2024-04-28 13:17:15,696][57339] Updated weights for policy 0, policy_version 618648 (0.0025) [2024-04-28 13:17:17,169][57108] Fps is (10 sec: 52430.1, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10135977984. Throughput: 0: 55720.9. Samples: 626415220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:17,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 13:17:18,696][57339] Updated weights for policy 0, policy_version 618658 (0.0025) [2024-04-28 13:17:21,592][57339] Updated weights for policy 0, policy_version 618668 (0.0027) [2024-04-28 13:17:22,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10136289280. Throughput: 0: 55958.6. Samples: 626584140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:22,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 13:17:24,406][57339] Updated weights for policy 0, policy_version 618678 (0.0030) [2024-04-28 13:17:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10136551424. Throughput: 0: 55900.4. Samples: 626917400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:27,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 13:17:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000618686_10136551424.pth... [2024-04-28 13:17:27,239][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000617872_10123214848.pth [2024-04-28 13:17:27,579][57339] Updated weights for policy 0, policy_version 618688 (0.0031) [2024-04-28 13:17:30,182][57339] Updated weights for policy 0, policy_version 618698 (0.0035) [2024-04-28 13:17:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10136846336. Throughput: 0: 55778.7. Samples: 627253360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:32,169][57108] Avg episode reward: [(0, '0.706')] [2024-04-28 13:17:33,458][57339] Updated weights for policy 0, policy_version 618708 (0.0029) [2024-04-28 13:17:36,044][57339] Updated weights for policy 0, policy_version 618718 (0.0028) [2024-04-28 13:17:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10137124864. Throughput: 0: 55872.6. Samples: 627423220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:37,169][57108] Avg episode reward: [(0, '0.513')] [2024-04-28 13:17:39,201][57339] Updated weights for policy 0, policy_version 618728 (0.0030) [2024-04-28 13:17:41,911][57339] Updated weights for policy 0, policy_version 618738 (0.0026) [2024-04-28 13:17:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 55816.7). 
Total num frames: 10137419776. Throughput: 0: 55796.9. Samples: 627755840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:42,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 13:17:45,142][57339] Updated weights for policy 0, policy_version 618748 (0.0026) [2024-04-28 13:17:47,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10137649152. Throughput: 0: 55896.3. Samples: 628087720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:47,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 13:17:47,846][57339] Updated weights for policy 0, policy_version 618758 (0.0027) [2024-04-28 13:17:51,136][57339] Updated weights for policy 0, policy_version 618768 (0.0037) [2024-04-28 13:17:52,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10137927680. Throughput: 0: 55628.3. Samples: 628249180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:52,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:17:53,474][57319] Signal inference workers to stop experience collection... (9050 times) [2024-04-28 13:17:53,501][57339] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-04-28 13:17:53,530][57319] Signal inference workers to resume experience collection... (9050 times) [2024-04-28 13:17:53,530][57339] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-04-28 13:17:53,644][57339] Updated weights for policy 0, policy_version 618778 (0.0026) [2024-04-28 13:17:56,942][57339] Updated weights for policy 0, policy_version 618788 (0.0023) [2024-04-28 13:17:57,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10138222592. Throughput: 0: 55635.0. Samples: 628584580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 13:17:57,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 13:17:59,609][57339] Updated weights for policy 0, policy_version 618798 (0.0028) [2024-04-28 13:18:02,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55978.5, 300 sec: 55761.2). Total num frames: 10138517504. Throughput: 0: 55584.3. Samples: 628916520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:02,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 13:18:02,865][57339] Updated weights for policy 0, policy_version 618808 (0.0027) [2024-04-28 13:18:05,647][57339] Updated weights for policy 0, policy_version 618818 (0.0027) [2024-04-28 13:18:07,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.8, 300 sec: 55650.0). Total num frames: 10138796032. Throughput: 0: 55655.6. Samples: 629088640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:07,170][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 13:18:08,888][57339] Updated weights for policy 0, policy_version 618828 (0.0027) [2024-04-28 13:18:11,438][57339] Updated weights for policy 0, policy_version 618838 (0.0034) [2024-04-28 13:18:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 10139058176. Throughput: 0: 55542.6. Samples: 629416820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:12,178][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 13:18:14,847][57339] Updated weights for policy 0, policy_version 618848 (0.0036) [2024-04-28 13:18:17,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10139353088. Throughput: 0: 55381.8. Samples: 629745540. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:17,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 13:18:17,245][57339] Updated weights for policy 0, policy_version 618858 (0.0026) [2024-04-28 13:18:20,682][57339] Updated weights for policy 0, policy_version 618868 (0.0030) [2024-04-28 13:18:22,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 10139598848. Throughput: 0: 55280.2. Samples: 629910820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:22,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 13:18:23,145][57339] Updated weights for policy 0, policy_version 618878 (0.0026) [2024-04-28 13:18:26,707][57339] Updated weights for policy 0, policy_version 618888 (0.0036) [2024-04-28 13:18:27,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10139877376. Throughput: 0: 55326.6. Samples: 630245540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:27,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 13:18:29,091][57339] Updated weights for policy 0, policy_version 618898 (0.0034) [2024-04-28 13:18:32,169][57108] Fps is (10 sec: 54066.4, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 10140139520. Throughput: 0: 55256.0. Samples: 630574240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:32,170][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 13:18:32,722][57339] Updated weights for policy 0, policy_version 618908 (0.0029) [2024-04-28 13:18:35,128][57339] Updated weights for policy 0, policy_version 618918 (0.0028) [2024-04-28 13:18:37,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10140450816. Throughput: 0: 55397.7. Samples: 630742080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:37,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 13:18:38,648][57339] Updated weights for policy 0, policy_version 618928 (0.0026) [2024-04-28 13:18:41,183][57339] Updated weights for policy 0, policy_version 618938 (0.0034) [2024-04-28 13:18:42,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10140729344. Throughput: 0: 55309.7. Samples: 631073520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:42,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 13:18:44,344][57339] Updated weights for policy 0, policy_version 618948 (0.0029) [2024-04-28 13:18:46,929][57339] Updated weights for policy 0, policy_version 618958 (0.0029) [2024-04-28 13:18:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10141007872. Throughput: 0: 55397.3. Samples: 631409400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:47,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 13:18:50,087][57339] Updated weights for policy 0, policy_version 618968 (0.0025) [2024-04-28 13:18:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 10141270016. Throughput: 0: 55427.6. Samples: 631582880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:52,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 13:18:52,769][57339] Updated weights for policy 0, policy_version 618978 (0.0030) [2024-04-28 13:18:55,965][57339] Updated weights for policy 0, policy_version 618988 (0.0031) [2024-04-28 13:18:57,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10141564928. Throughput: 0: 55537.4. Samples: 631916000. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:18:57,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 13:18:58,572][57339] Updated weights for policy 0, policy_version 618998 (0.0029) [2024-04-28 13:19:02,076][57339] Updated weights for policy 0, policy_version 619008 (0.0035) [2024-04-28 13:19:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10141827072. Throughput: 0: 55692.4. Samples: 632251700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:19:02,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 13:19:04,493][57339] Updated weights for policy 0, policy_version 619018 (0.0027) [2024-04-28 13:19:05,974][57319] Signal inference workers to stop experience collection... (9100 times) [2024-04-28 13:19:06,015][57339] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-04-28 13:19:06,027][57319] Signal inference workers to resume experience collection... (9100 times) [2024-04-28 13:19:06,033][57339] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-04-28 13:19:07,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 10142089216. Throughput: 0: 55494.6. Samples: 632408080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:19:07,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 13:19:07,906][57339] Updated weights for policy 0, policy_version 619028 (0.0028) [2024-04-28 13:19:10,291][57339] Updated weights for policy 0, policy_version 619038 (0.0027) [2024-04-28 13:19:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10142400512. Throughput: 0: 55616.5. Samples: 632748280. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:19:12,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:19:13,689][57339] Updated weights for policy 0, policy_version 619048 (0.0035) [2024-04-28 13:19:16,113][57339] Updated weights for policy 0, policy_version 619058 (0.0027) [2024-04-28 13:19:17,169][57108] Fps is (10 sec: 58981.3, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10142679040. Throughput: 0: 55731.9. Samples: 633082180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:19:17,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 13:19:19,739][57339] Updated weights for policy 0, policy_version 619068 (0.0026) [2024-04-28 13:19:22,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10142957568. Throughput: 0: 55907.1. Samples: 633257900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:19:22,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 13:19:22,184][57339] Updated weights for policy 0, policy_version 619078 (0.0036) [2024-04-28 13:19:25,610][57339] Updated weights for policy 0, policy_version 619088 (0.0031) [2024-04-28 13:19:27,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10143219712. Throughput: 0: 55955.1. Samples: 633591500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:19:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 13:19:27,175][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000619094_10143236096.pth... [2024-04-28 13:19:27,219][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000618279_10129883136.pth [2024-04-28 13:19:28,093][57339] Updated weights for policy 0, policy_version 619098 (0.0033) [2024-04-28 13:19:31,375][57339] Updated weights for policy 0, policy_version 619108 (0.0031) [2024-04-28 13:19:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55705.6). 
Total num frames: 10143514624. Throughput: 0: 55832.1. Samples: 633921840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 13:19:32,170][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 13:19:33,882][57339] Updated weights for policy 0, policy_version 619118 (0.0027) [2024-04-28 13:19:37,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10143776768. Throughput: 0: 55762.6. Samples: 634092200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:19:37,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 13:19:37,187][57339] Updated weights for policy 0, policy_version 619128 (0.0030) [2024-04-28 13:19:39,736][57339] Updated weights for policy 0, policy_version 619138 (0.0034) [2024-04-28 13:19:42,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10144055296. Throughput: 0: 55762.1. Samples: 634425300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:19:42,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 13:19:42,941][57339] Updated weights for policy 0, policy_version 619148 (0.0028) [2024-04-28 13:19:45,530][57339] Updated weights for policy 0, policy_version 619158 (0.0027) [2024-04-28 13:19:47,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10144350208. Throughput: 0: 55699.9. Samples: 634758200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:19:47,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:19:48,915][57339] Updated weights for policy 0, policy_version 619168 (0.0027) [2024-04-28 13:19:51,460][57339] Updated weights for policy 0, policy_version 619178 (0.0026) [2024-04-28 13:19:52,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10144628736. Throughput: 0: 55878.7. Samples: 634922620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:19:52,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 13:19:54,791][57339] Updated weights for policy 0, policy_version 619188 (0.0028) [2024-04-28 13:19:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10144923648. Throughput: 0: 55786.1. Samples: 635258660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:19:57,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 13:19:57,308][57339] Updated weights for policy 0, policy_version 619198 (0.0031) [2024-04-28 13:20:00,637][57339] Updated weights for policy 0, policy_version 619208 (0.0033) [2024-04-28 13:20:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10145185792. Throughput: 0: 55692.1. Samples: 635588320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:02,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 13:20:03,256][57339] Updated weights for policy 0, policy_version 619218 (0.0032) [2024-04-28 13:20:06,595][57339] Updated weights for policy 0, policy_version 619228 (0.0029) [2024-04-28 13:20:07,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10145447936. Throughput: 0: 55447.6. Samples: 635753040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:07,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 13:20:09,548][57339] Updated weights for policy 0, policy_version 619238 (0.0025) [2024-04-28 13:20:12,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10145726464. Throughput: 0: 55344.0. Samples: 636081980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:12,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 13:20:12,604][57339] Updated weights for policy 0, policy_version 619248 (0.0025) [2024-04-28 13:20:13,858][57319] Signal inference workers to stop experience collection... (9150 times) [2024-04-28 13:20:13,858][57319] Signal inference workers to resume experience collection... (9150 times) [2024-04-28 13:20:13,880][57339] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-04-28 13:20:13,881][57339] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-04-28 13:20:15,438][57339] Updated weights for policy 0, policy_version 619258 (0.0029) [2024-04-28 13:20:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10146004992. Throughput: 0: 55428.9. Samples: 636416140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:17,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 13:20:18,385][57339] Updated weights for policy 0, policy_version 619268 (0.0024) [2024-04-28 13:20:21,239][57339] Updated weights for policy 0, policy_version 619278 (0.0033) [2024-04-28 13:20:22,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10146299904. Throughput: 0: 55448.1. Samples: 636587360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:22,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 13:20:24,298][57339] Updated weights for policy 0, policy_version 619288 (0.0027) [2024-04-28 13:20:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10146562048. Throughput: 0: 55291.7. Samples: 636913420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:27,169][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 13:20:27,302][57339] Updated weights for policy 0, policy_version 619298 (0.0029) [2024-04-28 13:20:30,286][57339] Updated weights for policy 0, policy_version 619308 (0.0024) [2024-04-28 13:20:32,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10146840576. Throughput: 0: 55242.8. Samples: 637244120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:32,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 13:20:33,210][57339] Updated weights for policy 0, policy_version 619318 (0.0029) [2024-04-28 13:20:36,278][57339] Updated weights for policy 0, policy_version 619328 (0.0027) [2024-04-28 13:20:37,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10147119104. Throughput: 0: 55364.8. Samples: 637414040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:37,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 13:20:39,056][57339] Updated weights for policy 0, policy_version 619338 (0.0030) [2024-04-28 13:20:42,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10147397632. Throughput: 0: 55405.3. Samples: 637751900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:42,169][57339] Updated weights for policy 0, policy_version 619348 (0.0029) [2024-04-28 13:20:42,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 13:20:44,907][57339] Updated weights for policy 0, policy_version 619358 (0.0025) [2024-04-28 13:20:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10147659776. Throughput: 0: 55434.7. Samples: 638082880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:47,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 13:20:48,053][57339] Updated weights for policy 0, policy_version 619368 (0.0026) [2024-04-28 13:20:50,914][57339] Updated weights for policy 0, policy_version 619378 (0.0029) [2024-04-28 13:20:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10147938304. Throughput: 0: 55399.1. Samples: 638246000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:52,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 13:20:54,024][57339] Updated weights for policy 0, policy_version 619388 (0.0044) [2024-04-28 13:20:56,798][57339] Updated weights for policy 0, policy_version 619398 (0.0029) [2024-04-28 13:20:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 10148233216. Throughput: 0: 55513.0. Samples: 638580060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:20:57,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 13:20:59,829][57339] Updated weights for policy 0, policy_version 619408 (0.0031) [2024-04-28 13:21:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.5, 300 sec: 55483.5). Total num frames: 10148478976. Throughput: 0: 55432.5. Samples: 638910600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:21:02,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 13:21:02,593][57339] Updated weights for policy 0, policy_version 619418 (0.0029) [2024-04-28 13:21:05,712][57339] Updated weights for policy 0, policy_version 619428 (0.0026) [2024-04-28 13:21:07,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 10148757504. Throughput: 0: 55296.4. Samples: 639075700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-04-28 13:21:07,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 13:21:08,490][57339] Updated weights for policy 0, policy_version 619438 (0.0023) [2024-04-28 13:21:11,725][57339] Updated weights for policy 0, policy_version 619448 (0.0027) [2024-04-28 13:21:12,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10149068800. Throughput: 0: 55347.9. Samples: 639404080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:12,178][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 13:21:13,019][57319] Signal inference workers to stop experience collection... (9200 times) [2024-04-28 13:21:13,065][57339] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-04-28 13:21:13,077][57319] Signal inference workers to resume experience collection... (9200 times) [2024-04-28 13:21:13,084][57339] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-04-28 13:21:14,565][57339] Updated weights for policy 0, policy_version 619458 (0.0028) [2024-04-28 13:21:17,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10149330944. Throughput: 0: 55497.1. Samples: 639741500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:17,178][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 13:21:17,582][57339] Updated weights for policy 0, policy_version 619468 (0.0030) [2024-04-28 13:21:20,397][57339] Updated weights for policy 0, policy_version 619478 (0.0032) [2024-04-28 13:21:22,169][57108] Fps is (10 sec: 52429.1, 60 sec: 54886.4, 300 sec: 55483.5). Total num frames: 10149593088. Throughput: 0: 55350.2. Samples: 639904800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:22,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 13:21:23,730][57339] Updated weights for policy 0, policy_version 619488 (0.0028) [2024-04-28 13:21:26,189][57339] Updated weights for policy 0, policy_version 619498 (0.0029) [2024-04-28 13:21:27,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10149888000. Throughput: 0: 55307.6. Samples: 640240740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:27,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 13:21:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000619500_10149888000.pth... [2024-04-28 13:21:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000618686_10136551424.pth [2024-04-28 13:21:29,572][57339] Updated weights for policy 0, policy_version 619508 (0.0027) [2024-04-28 13:21:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10150150144. Throughput: 0: 55175.1. Samples: 640565760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:32,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 13:21:32,355][57339] Updated weights for policy 0, policy_version 619518 (0.0027) [2024-04-28 13:21:35,461][57339] Updated weights for policy 0, policy_version 619528 (0.0030) [2024-04-28 13:21:37,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10150428672. Throughput: 0: 55315.1. Samples: 640735180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:37,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 13:21:38,303][57339] Updated weights for policy 0, policy_version 619538 (0.0029) [2024-04-28 13:21:41,274][57339] Updated weights for policy 0, policy_version 619548 (0.0026) [2024-04-28 13:21:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55483.5). 
Total num frames: 10150707200. Throughput: 0: 55303.0. Samples: 641068700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:42,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 13:21:44,062][57339] Updated weights for policy 0, policy_version 619558 (0.0025) [2024-04-28 13:21:47,146][57339] Updated weights for policy 0, policy_version 619568 (0.0028) [2024-04-28 13:21:47,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10151002112. Throughput: 0: 55400.9. Samples: 641403640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:47,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 13:21:50,192][57339] Updated weights for policy 0, policy_version 619578 (0.0029) [2024-04-28 13:21:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10151280640. Throughput: 0: 55504.9. Samples: 641573420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:52,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 13:21:52,968][57339] Updated weights for policy 0, policy_version 619588 (0.0028) [2024-04-28 13:21:55,962][57339] Updated weights for policy 0, policy_version 619598 (0.0029) [2024-04-28 13:21:57,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55432.3, 300 sec: 55594.5). Total num frames: 10151559168. Throughput: 0: 55599.0. Samples: 641906040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:21:57,170][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 13:21:58,821][57339] Updated weights for policy 0, policy_version 619608 (0.0031) [2024-04-28 13:22:01,831][57339] Updated weights for policy 0, policy_version 619618 (0.0031) [2024-04-28 13:22:02,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 10151837696. Throughput: 0: 55577.8. Samples: 642242500. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:22:02,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 13:22:04,751][57339] Updated weights for policy 0, policy_version 619628 (0.0030) [2024-04-28 13:22:07,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10152116224. Throughput: 0: 55701.3. Samples: 642411360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:22:07,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:22:07,556][57339] Updated weights for policy 0, policy_version 619638 (0.0027) [2024-04-28 13:22:10,797][57339] Updated weights for policy 0, policy_version 619648 (0.0031) [2024-04-28 13:22:12,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10152394752. Throughput: 0: 55649.9. Samples: 642744980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:22:12,169][57108] Avg episode reward: [(0, '0.726')] [2024-04-28 13:22:13,530][57339] Updated weights for policy 0, policy_version 619658 (0.0026) [2024-04-28 13:22:16,845][57339] Updated weights for policy 0, policy_version 619668 (0.0028) [2024-04-28 13:22:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 10152656896. Throughput: 0: 55859.9. Samples: 643079460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:22:17,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 13:22:19,328][57339] Updated weights for policy 0, policy_version 619678 (0.0031) [2024-04-28 13:22:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10152935424. Throughput: 0: 55716.5. Samples: 643242420. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:22:22,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:22:22,883][57339] Updated weights for policy 0, policy_version 619688 (0.0037) [2024-04-28 13:22:25,266][57339] Updated weights for policy 0, policy_version 619698 (0.0029) [2024-04-28 13:22:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10153230336. Throughput: 0: 55644.8. Samples: 643572720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:22:27,169][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 13:22:28,664][57339] Updated weights for policy 0, policy_version 619708 (0.0027) [2024-04-28 13:22:30,491][57319] Signal inference workers to stop experience collection... (9250 times) [2024-04-28 13:22:30,491][57319] Signal inference workers to resume experience collection... (9250 times) [2024-04-28 13:22:30,514][57339] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-04-28 13:22:30,518][57339] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-04-28 13:22:31,066][57339] Updated weights for policy 0, policy_version 619718 (0.0025) [2024-04-28 13:22:32,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10153508864. Throughput: 0: 55710.1. Samples: 643910600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:22:32,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:22:34,542][57339] Updated weights for policy 0, policy_version 619728 (0.0030) [2024-04-28 13:22:37,024][57339] Updated weights for policy 0, policy_version 619738 (0.0025) [2024-04-28 13:22:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55483.4). Total num frames: 10153787392. Throughput: 0: 55832.0. Samples: 644085860. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:22:37,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 13:22:40,284][57339] Updated weights for policy 0, policy_version 619748 (0.0025) [2024-04-28 13:22:42,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 10154049536. Throughput: 0: 55847.0. Samples: 644419140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-04-28 13:22:42,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 13:22:42,829][57339] Updated weights for policy 0, policy_version 619758 (0.0021) [2024-04-28 13:22:46,175][57339] Updated weights for policy 0, policy_version 619768 (0.0028) [2024-04-28 13:22:47,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10154328064. Throughput: 0: 55630.7. Samples: 644745880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:22:47,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 13:22:48,835][57339] Updated weights for policy 0, policy_version 619778 (0.0029) [2024-04-28 13:22:52,068][57339] Updated weights for policy 0, policy_version 619788 (0.0029) [2024-04-28 13:22:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10154606592. Throughput: 0: 55624.6. Samples: 644914460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:22:52,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 13:22:54,598][57339] Updated weights for policy 0, policy_version 619798 (0.0028) [2024-04-28 13:22:57,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 10154868736. Throughput: 0: 55599.3. Samples: 645246960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:22:57,170][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 13:22:57,849][57339] Updated weights for policy 0, policy_version 619808 (0.0032) [2024-04-28 13:23:00,404][57339] Updated weights for policy 0, policy_version 619818 (0.0028) [2024-04-28 13:23:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10155180032. Throughput: 0: 55676.6. Samples: 645584900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:02,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 13:23:03,692][57339] Updated weights for policy 0, policy_version 619828 (0.0030) [2024-04-28 13:23:06,338][57339] Updated weights for policy 0, policy_version 619838 (0.0034) [2024-04-28 13:23:07,169][57108] Fps is (10 sec: 60621.4, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10155474944. Throughput: 0: 55799.0. Samples: 645753380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:07,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 13:23:09,569][57339] Updated weights for policy 0, policy_version 619848 (0.0030) [2024-04-28 13:23:12,113][57339] Updated weights for policy 0, policy_version 619858 (0.0031) [2024-04-28 13:23:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 10155753472. Throughput: 0: 55932.1. Samples: 646089660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:12,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 13:23:15,301][57339] Updated weights for policy 0, policy_version 619868 (0.0034) [2024-04-28 13:23:17,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10156015616. Throughput: 0: 55932.5. Samples: 646427560. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:17,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 13:23:17,989][57339] Updated weights for policy 0, policy_version 619878 (0.0026) [2024-04-28 13:23:21,170][57339] Updated weights for policy 0, policy_version 619888 (0.0034) [2024-04-28 13:23:22,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10156277760. Throughput: 0: 55690.3. Samples: 646591920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:22,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:23:23,883][57339] Updated weights for policy 0, policy_version 619898 (0.0030) [2024-04-28 13:23:27,046][57339] Updated weights for policy 0, policy_version 619908 (0.0027) [2024-04-28 13:23:27,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10156572672. Throughput: 0: 55651.7. Samples: 646923480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:27,169][57108] Avg episode reward: [(0, '0.709')] [2024-04-28 13:23:27,223][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000619909_10156589056.pth... [2024-04-28 13:23:27,268][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000619094_10143236096.pth [2024-04-28 13:23:29,864][57339] Updated weights for policy 0, policy_version 619918 (0.0029) [2024-04-28 13:23:32,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10156834816. Throughput: 0: 55868.4. Samples: 647259960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:32,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 13:23:32,901][57339] Updated weights for policy 0, policy_version 619928 (0.0023) [2024-04-28 13:23:35,743][57339] Updated weights for policy 0, policy_version 619938 (0.0029) [2024-04-28 13:23:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55594.5). 
Total num frames: 10157129728. Throughput: 0: 55830.9. Samples: 647426860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:37,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 13:23:38,665][57339] Updated weights for policy 0, policy_version 619948 (0.0033) [2024-04-28 13:23:41,516][57339] Updated weights for policy 0, policy_version 619958 (0.0028) [2024-04-28 13:23:42,169][57108] Fps is (10 sec: 58983.3, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10157424640. Throughput: 0: 55841.1. Samples: 647759800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:42,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 13:23:43,977][57319] Signal inference workers to stop experience collection... (9300 times) [2024-04-28 13:23:43,979][57319] Signal inference workers to resume experience collection... (9300 times) [2024-04-28 13:23:43,989][57339] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-04-28 13:23:44,023][57339] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-04-28 13:23:44,587][57339] Updated weights for policy 0, policy_version 619968 (0.0032) [2024-04-28 13:23:47,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10157686784. Throughput: 0: 55852.4. Samples: 648098260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:47,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 13:23:47,361][57339] Updated weights for policy 0, policy_version 619978 (0.0031) [2024-04-28 13:23:50,523][57339] Updated weights for policy 0, policy_version 619988 (0.0030) [2024-04-28 13:23:52,169][57108] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 10157981696. Throughput: 0: 55840.8. Samples: 648266220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:52,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 13:23:53,456][57339] Updated weights for policy 0, policy_version 619998 (0.0027) [2024-04-28 13:23:56,412][57339] Updated weights for policy 0, policy_version 620008 (0.0032) [2024-04-28 13:23:57,169][57108] Fps is (10 sec: 55705.0, 60 sec: 56251.8, 300 sec: 55650.0). Total num frames: 10158243840. Throughput: 0: 55748.3. Samples: 648598340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:23:57,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 13:23:59,221][57339] Updated weights for policy 0, policy_version 620018 (0.0027) [2024-04-28 13:24:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10158522368. Throughput: 0: 55802.6. Samples: 648938680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:24:02,169][57108] Avg episode reward: [(0, '0.458')] [2024-04-28 13:24:02,294][57339] Updated weights for policy 0, policy_version 620028 (0.0028) [2024-04-28 13:24:05,081][57339] Updated weights for policy 0, policy_version 620038 (0.0028) [2024-04-28 13:24:07,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10158800896. Throughput: 0: 55802.2. Samples: 649103020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:24:07,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 13:24:08,141][57339] Updated weights for policy 0, policy_version 620048 (0.0031) [2024-04-28 13:24:10,918][57339] Updated weights for policy 0, policy_version 620058 (0.0032) [2024-04-28 13:24:12,170][57108] Fps is (10 sec: 55700.6, 60 sec: 55431.6, 300 sec: 55594.4). Total num frames: 10159079424. Throughput: 0: 55931.0. Samples: 649440420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:24:12,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 13:24:14,077][57339] Updated weights for policy 0, policy_version 620068 (0.0029) [2024-04-28 13:24:16,836][57339] Updated weights for policy 0, policy_version 620078 (0.0033) [2024-04-28 13:24:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10159357952. Throughput: 0: 55953.4. Samples: 649777860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 13:24:17,169][57108] Avg episode reward: [(0, '0.705')] [2024-04-28 13:24:19,821][57339] Updated weights for policy 0, policy_version 620088 (0.0031) [2024-04-28 13:24:22,169][57108] Fps is (10 sec: 57349.8, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10159652864. Throughput: 0: 55997.1. Samples: 649946720. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:24:22,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 13:24:22,748][57339] Updated weights for policy 0, policy_version 620098 (0.0023) [2024-04-28 13:24:25,702][57339] Updated weights for policy 0, policy_version 620108 (0.0032) [2024-04-28 13:24:27,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10159931392. Throughput: 0: 56060.7. Samples: 650282540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:24:27,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 13:24:28,496][57339] Updated weights for policy 0, policy_version 620118 (0.0035) [2024-04-28 13:24:31,415][57339] Updated weights for policy 0, policy_version 620128 (0.0025) [2024-04-28 13:24:32,174][57108] Fps is (10 sec: 55679.1, 60 sec: 56247.4, 300 sec: 55704.7). Total num frames: 10160209920. Throughput: 0: 55966.5. Samples: 650617020. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:24:32,174][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 13:24:34,315][57339] Updated weights for policy 0, policy_version 620138 (0.0031) [2024-04-28 13:24:37,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10160472064. Throughput: 0: 55998.4. Samples: 650786140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:24:37,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 13:24:37,298][57339] Updated weights for policy 0, policy_version 620148 (0.0027) [2024-04-28 13:24:40,295][57339] Updated weights for policy 0, policy_version 620158 (0.0030) [2024-04-28 13:24:42,169][57108] Fps is (10 sec: 55731.6, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10160766976. Throughput: 0: 56051.6. Samples: 651120660. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:24:42,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:24:43,129][57339] Updated weights for policy 0, policy_version 620168 (0.0030) [2024-04-28 13:24:46,128][57339] Updated weights for policy 0, policy_version 620178 (0.0026) [2024-04-28 13:24:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10161045504. Throughput: 0: 55889.5. Samples: 651453700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:24:47,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 13:24:49,015][57339] Updated weights for policy 0, policy_version 620188 (0.0032) [2024-04-28 13:24:51,943][57339] Updated weights for policy 0, policy_version 620198 (0.0028) [2024-04-28 13:24:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10161324032. Throughput: 0: 55859.8. Samples: 651616720. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:24:52,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 13:24:54,977][57339] Updated weights for policy 0, policy_version 620208 (0.0036) [2024-04-28 13:24:57,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10161586176. Throughput: 0: 55839.9. Samples: 651953160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:24:57,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 13:24:57,923][57339] Updated weights for policy 0, policy_version 620218 (0.0031) [2024-04-28 13:25:00,871][57339] Updated weights for policy 0, policy_version 620228 (0.0028) [2024-04-28 13:25:01,812][57319] Signal inference workers to stop experience collection... (9350 times) [2024-04-28 13:25:01,813][57319] Signal inference workers to resume experience collection... (9350 times) [2024-04-28 13:25:01,837][57339] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-04-28 13:25:01,837][57339] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-04-28 13:25:02,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10161881088. Throughput: 0: 55842.8. Samples: 652290780. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:02,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 13:25:04,027][57339] Updated weights for policy 0, policy_version 620238 (0.0026) [2024-04-28 13:25:06,580][57339] Updated weights for policy 0, policy_version 620248 (0.0033) [2024-04-28 13:25:07,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10162159616. Throughput: 0: 55578.6. Samples: 652447760. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:07,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 13:25:09,792][57339] Updated weights for policy 0, policy_version 620258 (0.0026) [2024-04-28 13:25:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55979.6, 300 sec: 55705.6). Total num frames: 10162438144. Throughput: 0: 55597.5. Samples: 652784420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:12,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 13:25:12,562][57339] Updated weights for policy 0, policy_version 620268 (0.0028) [2024-04-28 13:25:15,570][57339] Updated weights for policy 0, policy_version 620278 (0.0032) [2024-04-28 13:25:17,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10162700288. Throughput: 0: 55619.1. Samples: 653119620. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:17,172][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:25:18,637][57339] Updated weights for policy 0, policy_version 620288 (0.0028) [2024-04-28 13:25:21,559][57339] Updated weights for policy 0, policy_version 620298 (0.0036) [2024-04-28 13:25:22,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10162978816. Throughput: 0: 55587.6. Samples: 653287580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:22,169][57108] Avg episode reward: [(0, '0.515')] [2024-04-28 13:25:24,472][57339] Updated weights for policy 0, policy_version 620308 (0.0035) [2024-04-28 13:25:27,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 10163257344. Throughput: 0: 55548.5. Samples: 653620340. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:27,169][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 13:25:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000620316_10163257344.pth... 
[2024-04-28 13:25:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000619500_10149888000.pth [2024-04-28 13:25:27,496][57339] Updated weights for policy 0, policy_version 620318 (0.0030) [2024-04-28 13:25:30,271][57339] Updated weights for policy 0, policy_version 620328 (0.0029) [2024-04-28 13:25:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55710.0, 300 sec: 55705.6). Total num frames: 10163552256. Throughput: 0: 55607.9. Samples: 653956060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:32,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 13:25:33,302][57339] Updated weights for policy 0, policy_version 620338 (0.0026) [2024-04-28 13:25:36,064][57339] Updated weights for policy 0, policy_version 620348 (0.0036) [2024-04-28 13:25:37,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10163814400. Throughput: 0: 55809.4. Samples: 654128140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:37,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 13:25:39,135][57339] Updated weights for policy 0, policy_version 620358 (0.0023) [2024-04-28 13:25:42,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10164092928. Throughput: 0: 55651.6. Samples: 654457480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:42,169][57108] Avg episode reward: [(0, '0.486')] [2024-04-28 13:25:42,262][57339] Updated weights for policy 0, policy_version 620368 (0.0031) [2024-04-28 13:25:44,920][57339] Updated weights for policy 0, policy_version 620378 (0.0028) [2024-04-28 13:25:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10164371456. Throughput: 0: 55507.8. Samples: 654788640. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:47,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 13:25:48,285][57339] Updated weights for policy 0, policy_version 620388 (0.0031) [2024-04-28 13:25:50,934][57339] Updated weights for policy 0, policy_version 620398 (0.0027) [2024-04-28 13:25:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 10164649984. Throughput: 0: 55639.1. Samples: 654951520. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-04-28 13:25:52,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 13:25:54,044][57339] Updated weights for policy 0, policy_version 620408 (0.0028) [2024-04-28 13:25:56,776][57339] Updated weights for policy 0, policy_version 620418 (0.0032) [2024-04-28 13:25:57,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10164928512. Throughput: 0: 55632.5. Samples: 655287880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:25:57,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 13:25:59,661][57319] Signal inference workers to stop experience collection... (9400 times) [2024-04-28 13:25:59,663][57319] Signal inference workers to resume experience collection... (9400 times) [2024-04-28 13:25:59,676][57339] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-04-28 13:25:59,676][57339] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-04-28 13:25:59,777][57339] Updated weights for policy 0, policy_version 620428 (0.0034) [2024-04-28 13:26:02,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10165223424. Throughput: 0: 55671.5. Samples: 655624840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:02,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 13:26:02,539][57339] Updated weights for policy 0, policy_version 620438 (0.0027) [2024-04-28 13:26:05,625][57339] Updated weights for policy 0, policy_version 620448 (0.0029) [2024-04-28 13:26:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10165485568. Throughput: 0: 55658.2. Samples: 655792200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:07,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 13:26:08,760][57339] Updated weights for policy 0, policy_version 620458 (0.0028) [2024-04-28 13:26:11,545][57339] Updated weights for policy 0, policy_version 620468 (0.0035) [2024-04-28 13:26:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10165764096. Throughput: 0: 55684.4. Samples: 656126140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:12,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 13:26:14,475][57339] Updated weights for policy 0, policy_version 620478 (0.0033) [2024-04-28 13:26:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10166042624. Throughput: 0: 55631.6. Samples: 656459480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:17,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 13:26:17,359][57339] Updated weights for policy 0, policy_version 620488 (0.0026) [2024-04-28 13:26:20,332][57339] Updated weights for policy 0, policy_version 620498 (0.0026) [2024-04-28 13:26:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10166337536. Throughput: 0: 55612.6. Samples: 656630700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:22,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 13:26:23,267][57339] Updated weights for policy 0, policy_version 620508 (0.0034) [2024-04-28 13:26:26,113][57339] Updated weights for policy 0, policy_version 620518 (0.0025) [2024-04-28 13:26:27,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10166616064. Throughput: 0: 55774.0. Samples: 656967320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:27,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 13:26:29,195][57339] Updated weights for policy 0, policy_version 620528 (0.0026) [2024-04-28 13:26:32,085][57339] Updated weights for policy 0, policy_version 620538 (0.0031) [2024-04-28 13:26:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10166894592. Throughput: 0: 55950.0. Samples: 657306380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:32,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 13:26:35,102][57339] Updated weights for policy 0, policy_version 620548 (0.0030) [2024-04-28 13:26:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10167173120. Throughput: 0: 56062.2. Samples: 657474320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:37,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 13:26:37,912][57339] Updated weights for policy 0, policy_version 620558 (0.0030) [2024-04-28 13:26:41,013][57339] Updated weights for policy 0, policy_version 620568 (0.0028) [2024-04-28 13:26:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10167451648. Throughput: 0: 56010.6. Samples: 657808360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:42,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 13:26:43,775][57339] Updated weights for policy 0, policy_version 620578 (0.0031) [2024-04-28 13:26:46,748][57339] Updated weights for policy 0, policy_version 620588 (0.0024) [2024-04-28 13:26:47,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10167713792. Throughput: 0: 55975.9. Samples: 658143760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:47,170][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 13:26:49,746][57339] Updated weights for policy 0, policy_version 620598 (0.0031) [2024-04-28 13:26:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10168008704. Throughput: 0: 55890.5. Samples: 658307280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:52,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 13:26:52,633][57339] Updated weights for policy 0, policy_version 620608 (0.0030) [2024-04-28 13:26:55,590][57339] Updated weights for policy 0, policy_version 620618 (0.0034) [2024-04-28 13:26:56,456][57319] Signal inference workers to stop experience collection... (9450 times) [2024-04-28 13:26:56,456][57319] Signal inference workers to resume experience collection... (9450 times) [2024-04-28 13:26:56,466][57339] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-04-28 13:26:56,470][57339] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-04-28 13:26:57,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10168287232. Throughput: 0: 55797.2. Samples: 658637020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:26:57,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 13:26:58,758][57339] Updated weights for policy 0, policy_version 620628 (0.0028) [2024-04-28 13:27:01,401][57339] Updated weights for policy 0, policy_version 620638 (0.0026) [2024-04-28 13:27:02,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10168565760. Throughput: 0: 55723.1. Samples: 658967020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:27:02,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 13:27:04,865][57339] Updated weights for policy 0, policy_version 620648 (0.0033) [2024-04-28 13:27:07,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10168844288. Throughput: 0: 55746.1. Samples: 659139280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:27:07,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 13:27:07,286][57339] Updated weights for policy 0, policy_version 620658 (0.0033) [2024-04-28 13:27:10,782][57339] Updated weights for policy 0, policy_version 620668 (0.0030) [2024-04-28 13:27:12,172][57108] Fps is (10 sec: 54051.3, 60 sec: 55703.0, 300 sec: 55760.6). Total num frames: 10169106432. Throughput: 0: 55751.7. Samples: 659476300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:27:12,172][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 13:27:13,183][57339] Updated weights for policy 0, policy_version 620678 (0.0024) [2024-04-28 13:27:16,795][57339] Updated weights for policy 0, policy_version 620688 (0.0034) [2024-04-28 13:27:17,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10169368576. Throughput: 0: 55623.6. Samples: 659809440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:27:17,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 13:27:19,062][57339] Updated weights for policy 0, policy_version 620698 (0.0028) [2024-04-28 13:27:22,169][57108] Fps is (10 sec: 54082.7, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10169647104. Throughput: 0: 55318.7. Samples: 659963660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:27:22,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 13:27:22,705][57339] Updated weights for policy 0, policy_version 620708 (0.0030) [2024-04-28 13:27:24,832][57339] Updated weights for policy 0, policy_version 620718 (0.0028) [2024-04-28 13:27:27,169][57108] Fps is (10 sec: 57342.2, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10169942016. Throughput: 0: 55353.9. Samples: 660299300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 13:27:27,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 13:27:27,271][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000620725_10169958400.pth... [2024-04-28 13:27:27,317][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000619909_10156589056.pth [2024-04-28 13:27:28,589][57339] Updated weights for policy 0, policy_version 620728 (0.0031) [2024-04-28 13:27:30,893][57339] Updated weights for policy 0, policy_version 620738 (0.0029) [2024-04-28 13:27:32,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10170236928. Throughput: 0: 55249.4. Samples: 660629980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:27:32,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 13:27:34,377][57339] Updated weights for policy 0, policy_version 620748 (0.0030) [2024-04-28 13:27:36,877][57339] Updated weights for policy 0, policy_version 620758 (0.0027) [2024-04-28 13:27:37,169][57108] Fps is (10 sec: 55707.3, 60 sec: 55432.6, 300 sec: 55761.1). 
Total num frames: 10170499072. Throughput: 0: 55537.0. Samples: 660806440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:27:37,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 13:27:40,225][57339] Updated weights for policy 0, policy_version 620768 (0.0025) [2024-04-28 13:27:42,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10170793984. Throughput: 0: 55620.3. Samples: 661139920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:27:42,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 13:27:42,630][57339] Updated weights for policy 0, policy_version 620778 (0.0028) [2024-04-28 13:27:46,073][57339] Updated weights for policy 0, policy_version 620788 (0.0033) [2024-04-28 13:27:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10171056128. Throughput: 0: 55675.0. Samples: 661472400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:27:47,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 13:27:48,599][57339] Updated weights for policy 0, policy_version 620798 (0.0028) [2024-04-28 13:27:52,086][57339] Updated weights for policy 0, policy_version 620808 (0.0026) [2024-04-28 13:27:52,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 10171318272. Throughput: 0: 55390.7. Samples: 661631860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:27:52,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 13:27:54,385][57339] Updated weights for policy 0, policy_version 620818 (0.0025) [2024-04-28 13:27:57,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 10171580416. Throughput: 0: 55330.1. Samples: 661966000. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:27:57,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 13:27:57,926][57339] Updated weights for policy 0, policy_version 620828 (0.0028) [2024-04-28 13:28:00,359][57339] Updated weights for policy 0, policy_version 620838 (0.0025) [2024-04-28 13:28:02,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 10171891712. Throughput: 0: 55220.3. Samples: 662294360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:02,170][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 13:28:04,051][57339] Updated weights for policy 0, policy_version 620848 (0.0026) [2024-04-28 13:28:06,163][57339] Updated weights for policy 0, policy_version 620858 (0.0030) [2024-04-28 13:28:07,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10172170240. Throughput: 0: 55585.7. Samples: 662465020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:07,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 13:28:09,844][57339] Updated weights for policy 0, policy_version 620868 (0.0036) [2024-04-28 13:28:12,041][57339] Updated weights for policy 0, policy_version 620878 (0.0037) [2024-04-28 13:28:12,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55981.2, 300 sec: 55761.1). Total num frames: 10172465152. Throughput: 0: 55540.6. Samples: 662798620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:12,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 13:28:15,640][57339] Updated weights for policy 0, policy_version 620888 (0.0031) [2024-04-28 13:28:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10172727296. Throughput: 0: 55590.2. Samples: 663131540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:17,169][57108] Avg episode reward: [(0, '0.510')] [2024-04-28 13:28:17,988][57339] Updated weights for policy 0, policy_version 620898 (0.0036) [2024-04-28 13:28:20,789][57319] Signal inference workers to stop experience collection... (9500 times) [2024-04-28 13:28:20,790][57319] Signal inference workers to resume experience collection... (9500 times) [2024-04-28 13:28:20,803][57339] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-04-28 13:28:20,803][57339] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-04-28 13:28:21,573][57339] Updated weights for policy 0, policy_version 620908 (0.0030) [2024-04-28 13:28:22,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10172989440. Throughput: 0: 55411.1. Samples: 663299940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:22,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 13:28:23,937][57339] Updated weights for policy 0, policy_version 620918 (0.0029) [2024-04-28 13:28:27,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10173267968. Throughput: 0: 55295.9. Samples: 663628240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:27,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 13:28:27,472][57339] Updated weights for policy 0, policy_version 620928 (0.0026) [2024-04-28 13:28:29,875][57339] Updated weights for policy 0, policy_version 620938 (0.0030) [2024-04-28 13:28:32,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10173546496. Throughput: 0: 55371.9. Samples: 663964140. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:32,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 13:28:33,366][57339] Updated weights for policy 0, policy_version 620948 (0.0029) [2024-04-28 13:28:35,578][57339] Updated weights for policy 0, policy_version 620958 (0.0025) [2024-04-28 13:28:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10173825024. Throughput: 0: 55508.9. Samples: 664129760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:37,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 13:28:39,131][57339] Updated weights for policy 0, policy_version 620968 (0.0026) [2024-04-28 13:28:41,283][57339] Updated weights for policy 0, policy_version 620978 (0.0027) [2024-04-28 13:28:42,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10174119936. Throughput: 0: 55542.3. Samples: 664465400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:42,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 13:28:44,840][57339] Updated weights for policy 0, policy_version 620988 (0.0026) [2024-04-28 13:28:47,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10174414848. Throughput: 0: 55728.5. Samples: 664802140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:47,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:28:47,572][57339] Updated weights for policy 0, policy_version 620998 (0.0029) [2024-04-28 13:28:50,745][57339] Updated weights for policy 0, policy_version 621008 (0.0021) [2024-04-28 13:28:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10174676992. Throughput: 0: 55632.6. Samples: 664968480. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:52,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 13:28:53,326][57339] Updated weights for policy 0, policy_version 621018 (0.0026) [2024-04-28 13:28:56,593][57339] Updated weights for policy 0, policy_version 621028 (0.0033) [2024-04-28 13:28:57,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10174939136. Throughput: 0: 55736.9. Samples: 665306780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:28:57,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 13:28:59,010][57339] Updated weights for policy 0, policy_version 621038 (0.0027) [2024-04-28 13:29:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10175217664. Throughput: 0: 55807.2. Samples: 665642860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-04-28 13:29:02,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 13:29:02,366][57339] Updated weights for policy 0, policy_version 621048 (0.0030) [2024-04-28 13:29:04,933][57339] Updated weights for policy 0, policy_version 621058 (0.0027) [2024-04-28 13:29:07,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55705.8). Total num frames: 10175512576. Throughput: 0: 55661.7. Samples: 665804720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:07,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 13:29:08,200][57339] Updated weights for policy 0, policy_version 621068 (0.0027) [2024-04-28 13:29:11,006][57339] Updated weights for policy 0, policy_version 621078 (0.0028) [2024-04-28 13:29:12,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10175791104. Throughput: 0: 55857.4. Samples: 666141820. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:12,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 13:29:14,010][57339] Updated weights for policy 0, policy_version 621088 (0.0033) [2024-04-28 13:29:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10176069632. Throughput: 0: 55840.9. Samples: 666476980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:17,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 13:29:17,172][57339] Updated weights for policy 0, policy_version 621098 (0.0036) [2024-04-28 13:29:19,345][57319] Signal inference workers to stop experience collection... (9550 times) [2024-04-28 13:29:19,392][57339] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-04-28 13:29:19,403][57319] Signal inference workers to resume experience collection... (9550 times) [2024-04-28 13:29:19,409][57339] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-04-28 13:29:19,789][57339] Updated weights for policy 0, policy_version 621108 (0.0028) [2024-04-28 13:29:22,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 10176364544. Throughput: 0: 55949.2. Samples: 666647480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:22,170][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 13:29:22,881][57339] Updated weights for policy 0, policy_version 621118 (0.0036) [2024-04-28 13:29:25,654][57339] Updated weights for policy 0, policy_version 621128 (0.0036) [2024-04-28 13:29:27,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.6, 300 sec: 55706.5). Total num frames: 10176643072. Throughput: 0: 55851.3. Samples: 666978720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:27,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 13:29:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000621133_10176643072.pth... 
[2024-04-28 13:29:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000620316_10163257344.pth [2024-04-28 13:29:28,879][57339] Updated weights for policy 0, policy_version 621138 (0.0035) [2024-04-28 13:29:31,506][57339] Updated weights for policy 0, policy_version 621148 (0.0033) [2024-04-28 13:29:32,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10176888832. Throughput: 0: 55677.8. Samples: 667307640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:32,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 13:29:34,805][57339] Updated weights for policy 0, policy_version 621158 (0.0026) [2024-04-28 13:29:37,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10177183744. Throughput: 0: 55956.4. Samples: 667486520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:37,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 13:29:37,513][57339] Updated weights for policy 0, policy_version 621168 (0.0028) [2024-04-28 13:29:40,639][57339] Updated weights for policy 0, policy_version 621178 (0.0028) [2024-04-28 13:29:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10177445888. Throughput: 0: 55794.3. Samples: 667817520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:42,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 13:29:43,586][57339] Updated weights for policy 0, policy_version 621188 (0.0026) [2024-04-28 13:29:46,735][57339] Updated weights for policy 0, policy_version 621198 (0.0028) [2024-04-28 13:29:47,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10177740800. Throughput: 0: 55843.1. Samples: 668155800. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:47,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 13:29:49,551][57339] Updated weights for policy 0, policy_version 621208 (0.0032) [2024-04-28 13:29:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10178002944. Throughput: 0: 55843.6. Samples: 668317680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:52,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 13:29:52,715][57339] Updated weights for policy 0, policy_version 621218 (0.0032) [2024-04-28 13:29:55,523][57339] Updated weights for policy 0, policy_version 621228 (0.0026) [2024-04-28 13:29:57,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10178297856. Throughput: 0: 55706.6. Samples: 668648620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:29:57,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 13:29:58,476][57339] Updated weights for policy 0, policy_version 621238 (0.0029) [2024-04-28 13:30:01,290][57339] Updated weights for policy 0, policy_version 621248 (0.0026) [2024-04-28 13:30:02,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10178576384. Throughput: 0: 55748.9. Samples: 668985680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:30:02,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 13:30:04,240][57339] Updated weights for policy 0, policy_version 621258 (0.0031) [2024-04-28 13:30:07,040][57339] Updated weights for policy 0, policy_version 621268 (0.0032) [2024-04-28 13:30:07,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10178854912. Throughput: 0: 55692.0. Samples: 669153620. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:30:07,170][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 13:30:10,268][57339] Updated weights for policy 0, policy_version 621278 (0.0031) [2024-04-28 13:30:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10179117056. Throughput: 0: 55742.9. Samples: 669487140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:30:12,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 13:30:12,873][57339] Updated weights for policy 0, policy_version 621288 (0.0039) [2024-04-28 13:30:16,229][57339] Updated weights for policy 0, policy_version 621298 (0.0034) [2024-04-28 13:30:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10179411968. Throughput: 0: 55883.4. Samples: 669822400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:30:17,170][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 13:30:17,952][57319] Signal inference workers to stop experience collection... (9600 times) [2024-04-28 13:30:17,987][57339] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-04-28 13:30:18,009][57319] Signal inference workers to resume experience collection... (9600 times) [2024-04-28 13:30:18,010][57339] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-04-28 13:30:18,727][57339] Updated weights for policy 0, policy_version 621308 (0.0026) [2024-04-28 13:30:21,961][57339] Updated weights for policy 0, policy_version 621318 (0.0029) [2024-04-28 13:30:22,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10179674112. Throughput: 0: 55501.4. Samples: 669984080. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:30:22,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 13:30:24,592][57339] Updated weights for policy 0, policy_version 621328 (0.0030) [2024-04-28 13:30:27,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10179952640. Throughput: 0: 55487.4. Samples: 670314460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:30:27,169][57108] Avg episode reward: [(0, '0.509')] [2024-04-28 13:30:28,015][57339] Updated weights for policy 0, policy_version 621338 (0.0031) [2024-04-28 13:30:30,668][57339] Updated weights for policy 0, policy_version 621348 (0.0029) [2024-04-28 13:30:32,169][57108] Fps is (10 sec: 58981.9, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 10180263936. Throughput: 0: 55283.1. Samples: 670643540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:30:32,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 13:30:33,845][57339] Updated weights for policy 0, policy_version 621358 (0.0028) [2024-04-28 13:30:36,591][57339] Updated weights for policy 0, policy_version 621368 (0.0029) [2024-04-28 13:30:37,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10180526080. Throughput: 0: 55632.5. Samples: 670821140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-04-28 13:30:37,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 13:30:39,848][57339] Updated weights for policy 0, policy_version 621378 (0.0026) [2024-04-28 13:30:42,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10180788224. Throughput: 0: 55546.7. Samples: 671148220. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:30:42,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 13:30:42,471][57339] Updated weights for policy 0, policy_version 621388 (0.0032) [2024-04-28 13:30:45,617][57339] Updated weights for policy 0, policy_version 621398 (0.0029) [2024-04-28 13:30:47,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10181066752. Throughput: 0: 55527.6. Samples: 671484420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:30:47,169][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 13:30:48,201][57339] Updated weights for policy 0, policy_version 621408 (0.0028) [2024-04-28 13:30:51,493][57339] Updated weights for policy 0, policy_version 621418 (0.0026) [2024-04-28 13:30:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10181361664. Throughput: 0: 55503.2. Samples: 671651260. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:30:52,170][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 13:30:53,998][57339] Updated weights for policy 0, policy_version 621428 (0.0027) [2024-04-28 13:30:57,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10181623808. Throughput: 0: 55639.5. Samples: 671990920. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:30:57,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 13:30:57,286][57339] Updated weights for policy 0, policy_version 621438 (0.0030) [2024-04-28 13:30:59,726][57339] Updated weights for policy 0, policy_version 621448 (0.0029) [2024-04-28 13:31:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10181902336. Throughput: 0: 55773.1. Samples: 672332180. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:02,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 13:31:03,198][57339] Updated weights for policy 0, policy_version 621458 (0.0022) [2024-04-28 13:31:05,766][57339] Updated weights for policy 0, policy_version 621468 (0.0036) [2024-04-28 13:31:07,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10182213632. Throughput: 0: 55929.6. Samples: 672500920. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:07,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 13:31:09,023][57339] Updated weights for policy 0, policy_version 621478 (0.0028) [2024-04-28 13:31:11,522][57339] Updated weights for policy 0, policy_version 621488 (0.0030) [2024-04-28 13:31:12,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10182492160. Throughput: 0: 55985.6. Samples: 672833800. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:12,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 13:31:14,846][57339] Updated weights for policy 0, policy_version 621498 (0.0024) [2024-04-28 13:31:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10182770688. Throughput: 0: 56207.4. Samples: 673172880. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:17,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 13:31:17,247][57339] Updated weights for policy 0, policy_version 621508 (0.0027) [2024-04-28 13:31:20,518][57339] Updated weights for policy 0, policy_version 621518 (0.0030) [2024-04-28 13:31:22,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10183016448. Throughput: 0: 56047.1. Samples: 673343260. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:22,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 13:31:23,101][57339] Updated weights for policy 0, policy_version 621528 (0.0029) [2024-04-28 13:31:23,817][57319] Signal inference workers to stop experience collection... (9650 times) [2024-04-28 13:31:23,817][57319] Signal inference workers to resume experience collection... (9650 times) [2024-04-28 13:31:23,833][57339] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-04-28 13:31:23,834][57339] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-04-28 13:31:26,397][57339] Updated weights for policy 0, policy_version 621538 (0.0028) [2024-04-28 13:31:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10183327744. Throughput: 0: 56182.5. Samples: 673676440. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:27,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:31:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000621541_10183327744.pth... [2024-04-28 13:31:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000620725_10169958400.pth [2024-04-28 13:31:28,956][57339] Updated weights for policy 0, policy_version 621548 (0.0025) [2024-04-28 13:31:32,145][57339] Updated weights for policy 0, policy_version 621558 (0.0030) [2024-04-28 13:31:32,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10183606272. Throughput: 0: 56310.2. Samples: 674018380. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:32,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 13:31:34,807][57339] Updated weights for policy 0, policy_version 621568 (0.0027) [2024-04-28 13:31:37,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10183868416. Throughput: 0: 56136.1. Samples: 674177380. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:37,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 13:31:38,059][57339] Updated weights for policy 0, policy_version 621578 (0.0029) [2024-04-28 13:31:40,507][57339] Updated weights for policy 0, policy_version 621588 (0.0032) [2024-04-28 13:31:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 10184179712. Throughput: 0: 55964.1. Samples: 674509300. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:42,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 13:31:43,951][57339] Updated weights for policy 0, policy_version 621598 (0.0030) [2024-04-28 13:31:46,302][57339] Updated weights for policy 0, policy_version 621608 (0.0024) [2024-04-28 13:31:47,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56524.9, 300 sec: 55761.2). Total num frames: 10184458240. Throughput: 0: 55934.3. Samples: 674849220. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:47,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:31:49,761][57339] Updated weights for policy 0, policy_version 621618 (0.0027) [2024-04-28 13:31:52,132][57339] Updated weights for policy 0, policy_version 621628 (0.0028) [2024-04-28 13:31:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 10184753152. Throughput: 0: 56067.1. Samples: 675023940. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:52,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:31:55,504][57339] Updated weights for policy 0, policy_version 621638 (0.0032) [2024-04-28 13:31:57,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10184982528. Throughput: 0: 56051.9. Samples: 675356140. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:31:57,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 13:31:58,286][57339] Updated weights for policy 0, policy_version 621648 (0.0029) [2024-04-28 13:32:01,337][57339] Updated weights for policy 0, policy_version 621658 (0.0025) [2024-04-28 13:32:02,169][57108] Fps is (10 sec: 52428.4, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 10185277440. Throughput: 0: 56064.8. Samples: 675695800. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:32:02,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 13:32:04,286][57339] Updated weights for policy 0, policy_version 621668 (0.0026) [2024-04-28 13:32:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55761.7). Total num frames: 10185555968. Throughput: 0: 55904.9. Samples: 675858980. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:32:07,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 13:32:07,220][57339] Updated weights for policy 0, policy_version 621678 (0.0031) [2024-04-28 13:32:10,007][57339] Updated weights for policy 0, policy_version 621688 (0.0030) [2024-04-28 13:32:12,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10185850880. Throughput: 0: 55961.5. Samples: 676194700. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-04-28 13:32:12,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 13:32:13,007][57339] Updated weights for policy 0, policy_version 621698 (0.0034) [2024-04-28 13:32:15,935][57339] Updated weights for policy 0, policy_version 621708 (0.0026) [2024-04-28 13:32:17,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10186129408. Throughput: 0: 55776.6. Samples: 676528320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:32:17,169][57108] Avg episode reward: [(0, '0.712')] [2024-04-28 13:32:18,853][57339] Updated weights for policy 0, policy_version 621718 (0.0025) [2024-04-28 13:32:21,672][57339] Updated weights for policy 0, policy_version 621728 (0.0029) [2024-04-28 13:32:22,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 55872.3). Total num frames: 10186424320. Throughput: 0: 56128.9. Samples: 676703180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:32:22,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 13:32:24,657][57339] Updated weights for policy 0, policy_version 621738 (0.0026) [2024-04-28 13:32:27,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10186686464. Throughput: 0: 56335.4. Samples: 677044400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:32:27,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 13:32:27,531][57339] Updated weights for policy 0, policy_version 621748 (0.0028) [2024-04-28 13:32:30,743][57339] Updated weights for policy 0, policy_version 621758 (0.0029) [2024-04-28 13:32:31,896][57319] Signal inference workers to stop experience collection... (9700 times) [2024-04-28 13:32:31,949][57339] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-04-28 13:32:31,949][57319] Signal inference workers to resume experience collection... (9700 times) [2024-04-28 13:32:31,961][57339] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-04-28 13:32:32,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10186948608. Throughput: 0: 56266.6. Samples: 677381220. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:32:32,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 13:32:33,427][57339] Updated weights for policy 0, policy_version 621768 (0.0027) [2024-04-28 13:32:36,516][57339] Updated weights for policy 0, policy_version 621778 (0.0026) [2024-04-28 13:32:37,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10187227136. Throughput: 0: 55934.1. Samples: 677540980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:32:37,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:32:39,126][57339] Updated weights for policy 0, policy_version 621788 (0.0035) [2024-04-28 13:32:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10187522048. Throughput: 0: 56000.8. Samples: 677876180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:32:42,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 13:32:42,273][57339] Updated weights for policy 0, policy_version 621798 (0.0028) [2024-04-28 13:32:45,112][57339] Updated weights for policy 0, policy_version 621808 (0.0023) [2024-04-28 13:32:47,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10187800576. Throughput: 0: 55864.7. Samples: 678209700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:32:47,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 13:32:48,072][57339] Updated weights for policy 0, policy_version 621818 (0.0023) [2024-04-28 13:32:50,942][57339] Updated weights for policy 0, policy_version 621828 (0.0029) [2024-04-28 13:32:52,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 10188095488. Throughput: 0: 56106.4. Samples: 678383780. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:32:52,170][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 13:32:54,123][57339] Updated weights for policy 0, policy_version 621838 (0.0032) [2024-04-28 13:32:56,950][57339] Updated weights for policy 0, policy_version 621848 (0.0026) [2024-04-28 13:32:57,169][57108] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10188357632. Throughput: 0: 55957.2. Samples: 678712780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:32:57,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 13:32:59,886][57339] Updated weights for policy 0, policy_version 621858 (0.0027) [2024-04-28 13:33:02,169][57108] Fps is (10 sec: 54068.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10188636160. Throughput: 0: 55848.9. Samples: 679041520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 13:33:02,912][57339] Updated weights for policy 0, policy_version 621868 (0.0029) [2024-04-28 13:33:05,853][57339] Updated weights for policy 0, policy_version 621878 (0.0033) [2024-04-28 13:33:07,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10188898304. Throughput: 0: 55721.1. Samples: 679210640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:07,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:33:08,730][57339] Updated weights for policy 0, policy_version 621888 (0.0024) [2024-04-28 13:33:11,790][57339] Updated weights for policy 0, policy_version 621898 (0.0039) [2024-04-28 13:33:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10189176832. Throughput: 0: 55599.7. Samples: 679546380. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:12,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 13:33:14,465][57339] Updated weights for policy 0, policy_version 621908 (0.0028) [2024-04-28 13:33:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10189471744. Throughput: 0: 55489.8. Samples: 679878260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:17,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 13:33:17,694][57339] Updated weights for policy 0, policy_version 621918 (0.0030) [2024-04-28 13:33:20,112][57319] Signal inference workers to stop experience collection... (9750 times) [2024-04-28 13:33:20,115][57319] Signal inference workers to resume experience collection... (9750 times) [2024-04-28 13:33:20,139][57339] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-04-28 13:33:20,140][57339] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-04-28 13:33:20,227][57339] Updated weights for policy 0, policy_version 621928 (0.0025) [2024-04-28 13:33:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 10189750272. Throughput: 0: 55673.0. Samples: 680046260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:22,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 13:33:23,530][57339] Updated weights for policy 0, policy_version 621938 (0.0026) [2024-04-28 13:33:26,348][57339] Updated weights for policy 0, policy_version 621948 (0.0026) [2024-04-28 13:33:27,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10190045184. Throughput: 0: 55648.8. Samples: 680380380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:27,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 13:33:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000621951_10190045184.pth... 
[2024-04-28 13:33:27,217][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000621133_10176643072.pth [2024-04-28 13:33:29,432][57339] Updated weights for policy 0, policy_version 621958 (0.0026) [2024-04-28 13:33:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10190307328. Throughput: 0: 55632.0. Samples: 680713140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:32,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 13:33:32,256][57339] Updated weights for policy 0, policy_version 621968 (0.0028) [2024-04-28 13:33:35,476][57339] Updated weights for policy 0, policy_version 621978 (0.0023) [2024-04-28 13:33:37,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10190569472. Throughput: 0: 55372.6. Samples: 680875540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:37,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:33:38,319][57339] Updated weights for policy 0, policy_version 621988 (0.0033) [2024-04-28 13:33:41,241][57339] Updated weights for policy 0, policy_version 621998 (0.0024) [2024-04-28 13:33:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10190848000. Throughput: 0: 55496.1. Samples: 681210100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:42,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 13:33:44,110][57339] Updated weights for policy 0, policy_version 622008 (0.0027) [2024-04-28 13:33:46,953][57339] Updated weights for policy 0, policy_version 622018 (0.0026) [2024-04-28 13:33:47,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10191142912. Throughput: 0: 55750.2. Samples: 681550280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 13:33:47,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 13:33:49,795][57339] Updated weights for policy 0, policy_version 622028 (0.0032) [2024-04-28 13:33:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.7, 300 sec: 55872.2). Total num frames: 10191421440. Throughput: 0: 55506.4. Samples: 681708420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:33:52,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:33:53,319][57339] Updated weights for policy 0, policy_version 622038 (0.0029) [2024-04-28 13:33:55,501][57339] Updated weights for policy 0, policy_version 622048 (0.0030) [2024-04-28 13:33:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10191699968. Throughput: 0: 55438.2. Samples: 682041100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:33:57,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 13:33:59,217][57339] Updated weights for policy 0, policy_version 622058 (0.0025) [2024-04-28 13:34:01,278][57339] Updated weights for policy 0, policy_version 622068 (0.0033) [2024-04-28 13:34:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 10191962112. Throughput: 0: 55594.3. Samples: 682380000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:02,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 13:34:05,153][57339] Updated weights for policy 0, policy_version 622078 (0.0027) [2024-04-28 13:34:07,130][57339] Updated weights for policy 0, policy_version 622088 (0.0028) [2024-04-28 13:34:07,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 10192289792. Throughput: 0: 55792.6. Samples: 682556920. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:07,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 13:34:10,937][57339] Updated weights for policy 0, policy_version 622098 (0.0026) [2024-04-28 13:34:12,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10192535552. Throughput: 0: 55850.7. Samples: 682893660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:12,170][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 13:34:13,087][57339] Updated weights for policy 0, policy_version 622108 (0.0035) [2024-04-28 13:34:16,672][57339] Updated weights for policy 0, policy_version 622118 (0.0030) [2024-04-28 13:34:17,169][57108] Fps is (10 sec: 50790.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10192797696. Throughput: 0: 55979.1. Samples: 683232200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:17,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:34:19,126][57339] Updated weights for policy 0, policy_version 622128 (0.0026) [2024-04-28 13:34:22,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10193076224. Throughput: 0: 55996.1. Samples: 683395360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:22,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 13:34:22,609][57339] Updated weights for policy 0, policy_version 622138 (0.0026) [2024-04-28 13:34:25,328][57339] Updated weights for policy 0, policy_version 622148 (0.0029) [2024-04-28 13:34:27,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.7, 300 sec: 55872.2). Total num frames: 10193371136. Throughput: 0: 56051.6. Samples: 683732420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:27,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 13:34:28,449][57339] Updated weights for policy 0, policy_version 622158 (0.0036) [2024-04-28 13:34:30,021][57319] Signal inference workers to stop experience collection... (9800 times) [2024-04-28 13:34:30,054][57339] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-04-28 13:34:30,080][57319] Signal inference workers to resume experience collection... (9800 times) [2024-04-28 13:34:30,081][57339] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-04-28 13:34:31,119][57339] Updated weights for policy 0, policy_version 622168 (0.0025) [2024-04-28 13:34:32,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10193649664. Throughput: 0: 55891.8. Samples: 684065420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:32,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 13:34:34,284][57339] Updated weights for policy 0, policy_version 622178 (0.0027) [2024-04-28 13:34:36,800][57339] Updated weights for policy 0, policy_version 622188 (0.0036) [2024-04-28 13:34:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10193928192. Throughput: 0: 56084.8. Samples: 684232240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:37,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 13:34:40,132][57339] Updated weights for policy 0, policy_version 622198 (0.0029) [2024-04-28 13:34:42,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10194239488. Throughput: 0: 56128.9. Samples: 684566900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:42,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 13:34:42,714][57339] Updated weights for policy 0, policy_version 622208 (0.0026) [2024-04-28 13:34:46,057][57339] Updated weights for policy 0, policy_version 622218 (0.0024) [2024-04-28 13:34:47,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10194501632. Throughput: 0: 56110.6. Samples: 684904980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:47,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 13:34:48,620][57339] Updated weights for policy 0, policy_version 622228 (0.0032) [2024-04-28 13:34:51,914][57339] Updated weights for policy 0, policy_version 622238 (0.0033) [2024-04-28 13:34:52,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10194763776. Throughput: 0: 55747.1. Samples: 685065540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:52,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 13:34:54,354][57339] Updated weights for policy 0, policy_version 622248 (0.0026) [2024-04-28 13:34:57,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10195025920. Throughput: 0: 55701.0. Samples: 685400200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:34:57,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 13:34:57,824][57339] Updated weights for policy 0, policy_version 622258 (0.0026) [2024-04-28 13:35:00,620][57339] Updated weights for policy 0, policy_version 622268 (0.0034) [2024-04-28 13:35:02,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10195320832. Throughput: 0: 55496.0. Samples: 685729520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:35:02,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:35:03,663][57339] Updated weights for policy 0, policy_version 622278 (0.0033) [2024-04-28 13:35:06,679][57339] Updated weights for policy 0, policy_version 622288 (0.0025) [2024-04-28 13:35:07,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.4, 300 sec: 55872.2). Total num frames: 10195599360. Throughput: 0: 55670.2. Samples: 685900520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:35:07,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 13:35:09,552][57339] Updated weights for policy 0, policy_version 622298 (0.0034) [2024-04-28 13:35:12,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10195861504. Throughput: 0: 55446.3. Samples: 686227500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:35:12,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 13:35:12,726][57339] Updated weights for policy 0, policy_version 622308 (0.0026) [2024-04-28 13:35:15,372][57339] Updated weights for policy 0, policy_version 622318 (0.0035) [2024-04-28 13:35:17,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10196172800. Throughput: 0: 55393.1. Samples: 686558100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:35:17,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 13:35:18,507][57339] Updated weights for policy 0, policy_version 622328 (0.0029) [2024-04-28 13:35:21,222][57339] Updated weights for policy 0, policy_version 622338 (0.0026) [2024-04-28 13:35:21,550][57319] Signal inference workers to stop experience collection... (9850 times) [2024-04-28 13:35:21,604][57339] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-04-28 13:35:21,605][57319] Signal inference workers to resume experience collection... 
(9850 times) [2024-04-28 13:35:21,619][57339] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-04-28 13:35:22,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56251.6, 300 sec: 55927.8). Total num frames: 10196451328. Throughput: 0: 55658.2. Samples: 686736860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 13:35:22,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 13:35:24,277][57339] Updated weights for policy 0, policy_version 622348 (0.0026) [2024-04-28 13:35:27,169][57108] Fps is (10 sec: 50790.3, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10196680704. Throughput: 0: 55549.0. Samples: 687066600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:35:27,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:35:27,251][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000622357_10196697088.pth... [2024-04-28 13:35:27,296][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000621541_10183327744.pth [2024-04-28 13:35:27,415][57339] Updated weights for policy 0, policy_version 622358 (0.0032) [2024-04-28 13:35:30,272][57339] Updated weights for policy 0, policy_version 622368 (0.0037) [2024-04-28 13:35:32,169][57108] Fps is (10 sec: 50791.0, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10196959232. Throughput: 0: 55392.0. Samples: 687397620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:35:32,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 13:35:33,275][57339] Updated weights for policy 0, policy_version 622378 (0.0029) [2024-04-28 13:35:36,129][57339] Updated weights for policy 0, policy_version 622388 (0.0030) [2024-04-28 13:35:37,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10197254144. Throughput: 0: 55245.2. Samples: 687551580. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:35:37,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 13:35:39,228][57339] Updated weights for policy 0, policy_version 622398 (0.0031) [2024-04-28 13:35:41,884][57339] Updated weights for policy 0, policy_version 622408 (0.0035) [2024-04-28 13:35:42,169][57108] Fps is (10 sec: 57343.3, 60 sec: 54886.4, 300 sec: 55816.7). Total num frames: 10197532672. Throughput: 0: 55334.5. Samples: 687890260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:35:42,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:35:45,057][57339] Updated weights for policy 0, policy_version 622418 (0.0033) [2024-04-28 13:35:47,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10197827584. Throughput: 0: 55507.7. Samples: 688227360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:35:47,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 13:35:47,642][57339] Updated weights for policy 0, policy_version 622428 (0.0034) [2024-04-28 13:35:50,922][57339] Updated weights for policy 0, policy_version 622438 (0.0025) [2024-04-28 13:35:52,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10198122496. Throughput: 0: 55499.5. Samples: 688398000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:35:52,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 13:35:53,705][57339] Updated weights for policy 0, policy_version 622448 (0.0025) [2024-04-28 13:35:56,712][57339] Updated weights for policy 0, policy_version 622458 (0.0027) [2024-04-28 13:35:57,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10198384640. Throughput: 0: 55759.4. Samples: 688736680. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:35:57,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 13:35:59,680][57339] Updated weights for policy 0, policy_version 622468 (0.0025) [2024-04-28 13:36:02,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10198646784. Throughput: 0: 55743.4. Samples: 689066560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:02,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 13:36:02,579][57339] Updated weights for policy 0, policy_version 622478 (0.0026) [2024-04-28 13:36:05,569][57339] Updated weights for policy 0, policy_version 622488 (0.0030) [2024-04-28 13:36:07,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10198925312. Throughput: 0: 55419.3. Samples: 689230720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:07,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 13:36:08,337][57339] Updated weights for policy 0, policy_version 622498 (0.0033) [2024-04-28 13:36:11,267][57339] Updated weights for policy 0, policy_version 622508 (0.0025) [2024-04-28 13:36:12,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 10199203840. Throughput: 0: 55557.5. Samples: 689566700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:12,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 13:36:14,223][57339] Updated weights for policy 0, policy_version 622518 (0.0026) [2024-04-28 13:36:17,066][57339] Updated weights for policy 0, policy_version 622528 (0.0026) [2024-04-28 13:36:17,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 10199498752. Throughput: 0: 55550.9. Samples: 689897420. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:17,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 13:36:20,151][57339] Updated weights for policy 0, policy_version 622538 (0.0032) [2024-04-28 13:36:22,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10199777280. Throughput: 0: 55976.0. Samples: 690070500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:22,170][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 13:36:23,004][57339] Updated weights for policy 0, policy_version 622548 (0.0039) [2024-04-28 13:36:25,449][57319] Signal inference workers to stop experience collection... (9900 times) [2024-04-28 13:36:25,449][57319] Signal inference workers to resume experience collection... (9900 times) [2024-04-28 13:36:25,474][57339] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-04-28 13:36:25,474][57339] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-04-28 13:36:26,005][57339] Updated weights for policy 0, policy_version 622558 (0.0029) [2024-04-28 13:36:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10200039424. Throughput: 0: 55790.7. Samples: 690400840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:27,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 13:36:28,803][57339] Updated weights for policy 0, policy_version 622568 (0.0033) [2024-04-28 13:36:31,736][57339] Updated weights for policy 0, policy_version 622578 (0.0028) [2024-04-28 13:36:32,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10200317952. Throughput: 0: 55699.1. Samples: 690733820. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:32,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 13:36:34,717][57339] Updated weights for policy 0, policy_version 622588 (0.0032) [2024-04-28 13:36:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10200596480. Throughput: 0: 55610.7. Samples: 690900480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:37,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 13:36:37,623][57339] Updated weights for policy 0, policy_version 622598 (0.0027) [2024-04-28 13:36:40,486][57339] Updated weights for policy 0, policy_version 622608 (0.0026) [2024-04-28 13:36:42,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10200891392. Throughput: 0: 55440.0. Samples: 691231480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:42,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 13:36:43,568][57339] Updated weights for policy 0, policy_version 622618 (0.0026) [2024-04-28 13:36:46,425][57339] Updated weights for policy 0, policy_version 622628 (0.0026) [2024-04-28 13:36:47,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10201153536. Throughput: 0: 55598.3. Samples: 691568480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:47,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 13:36:49,396][57339] Updated weights for policy 0, policy_version 622638 (0.0030) [2024-04-28 13:36:52,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10201432064. Throughput: 0: 55722.4. Samples: 691738240. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:52,170][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 13:36:52,373][57339] Updated weights for policy 0, policy_version 622648 (0.0024) [2024-04-28 13:36:55,148][57339] Updated weights for policy 0, policy_version 622658 (0.0029) [2024-04-28 13:36:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10201726976. Throughput: 0: 55652.7. Samples: 692071060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 13:36:57,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 13:36:58,081][57339] Updated weights for policy 0, policy_version 622668 (0.0026) [2024-04-28 13:37:01,265][57339] Updated weights for policy 0, policy_version 622678 (0.0029) [2024-04-28 13:37:02,169][57108] Fps is (10 sec: 55707.2, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 10201989120. Throughput: 0: 55770.9. Samples: 692407100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:02,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 13:37:03,799][57339] Updated weights for policy 0, policy_version 622688 (0.0032) [2024-04-28 13:37:07,158][57339] Updated weights for policy 0, policy_version 622698 (0.0030) [2024-04-28 13:37:07,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10202284032. Throughput: 0: 55640.3. Samples: 692574320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:07,170][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 13:37:09,706][57339] Updated weights for policy 0, policy_version 622708 (0.0028) [2024-04-28 13:37:12,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 10202546176. Throughput: 0: 55722.7. Samples: 692908360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:12,170][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 13:37:12,936][57339] Updated weights for policy 0, policy_version 622718 (0.0026) [2024-04-28 13:37:15,785][57339] Updated weights for policy 0, policy_version 622728 (0.0029) [2024-04-28 13:37:17,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10202841088. Throughput: 0: 55663.9. Samples: 693238700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:17,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 13:37:18,862][57339] Updated weights for policy 0, policy_version 622738 (0.0024) [2024-04-28 13:37:21,814][57339] Updated weights for policy 0, policy_version 622748 (0.0032) [2024-04-28 13:37:22,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10203119616. Throughput: 0: 55841.8. Samples: 693413360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:22,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 13:37:24,792][57339] Updated weights for policy 0, policy_version 622758 (0.0028) [2024-04-28 13:37:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10203398144. Throughput: 0: 55863.0. Samples: 693745320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:27,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 13:37:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000622766_10203398144.pth... [2024-04-28 13:37:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000621951_10190045184.pth [2024-04-28 13:37:27,695][57339] Updated weights for policy 0, policy_version 622768 (0.0034) [2024-04-28 13:37:30,621][57339] Updated weights for policy 0, policy_version 622778 (0.0024) [2024-04-28 13:37:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55705.6). 
Total num frames: 10203660288. Throughput: 0: 55687.6. Samples: 694074420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:32,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:37:33,801][57339] Updated weights for policy 0, policy_version 622788 (0.0030) [2024-04-28 13:37:36,385][57339] Updated weights for policy 0, policy_version 622798 (0.0023) [2024-04-28 13:37:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10203922432. Throughput: 0: 55568.1. Samples: 694238800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:37,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 13:37:39,773][57339] Updated weights for policy 0, policy_version 622808 (0.0029) [2024-04-28 13:37:41,782][57319] Signal inference workers to stop experience collection... (9950 times) [2024-04-28 13:37:41,782][57319] Signal inference workers to resume experience collection... (9950 times) [2024-04-28 13:37:41,793][57339] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-04-28 13:37:41,793][57339] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-04-28 13:37:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10204233728. Throughput: 0: 55508.4. Samples: 694568940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:42,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 13:37:42,457][57339] Updated weights for policy 0, policy_version 622818 (0.0026) [2024-04-28 13:37:45,565][57339] Updated weights for policy 0, policy_version 622828 (0.0028) [2024-04-28 13:37:47,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 10204479488. Throughput: 0: 55475.6. Samples: 694903520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:47,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 13:37:48,536][57339] Updated weights for policy 0, policy_version 622838 (0.0025) [2024-04-28 13:37:51,366][57339] Updated weights for policy 0, policy_version 622848 (0.0033) [2024-04-28 13:37:52,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 10204790784. Throughput: 0: 55356.3. Samples: 695065340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:52,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 13:37:54,613][57339] Updated weights for policy 0, policy_version 622858 (0.0027) [2024-04-28 13:37:57,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10205036544. Throughput: 0: 55334.7. Samples: 695398420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:37:57,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 13:37:57,346][57339] Updated weights for policy 0, policy_version 622868 (0.0033) [2024-04-28 13:38:00,589][57339] Updated weights for policy 0, policy_version 622878 (0.0024) [2024-04-28 13:38:02,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 10205331456. Throughput: 0: 55522.6. Samples: 695737220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:38:02,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 13:38:03,273][57339] Updated weights for policy 0, policy_version 622888 (0.0030) [2024-04-28 13:38:06,485][57339] Updated weights for policy 0, policy_version 622898 (0.0025) [2024-04-28 13:38:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10205609984. Throughput: 0: 55316.0. Samples: 695902580. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:38:07,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:38:09,148][57339] Updated weights for policy 0, policy_version 622908 (0.0027) [2024-04-28 13:38:12,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10205872128. Throughput: 0: 55373.6. Samples: 696237120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:38:12,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 13:38:12,261][57339] Updated weights for policy 0, policy_version 622918 (0.0036) [2024-04-28 13:38:15,009][57339] Updated weights for policy 0, policy_version 622928 (0.0026) [2024-04-28 13:38:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10206167040. Throughput: 0: 55481.7. Samples: 696571100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:38:17,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 13:38:18,206][57339] Updated weights for policy 0, policy_version 622938 (0.0033) [2024-04-28 13:38:20,847][57339] Updated weights for policy 0, policy_version 622948 (0.0034) [2024-04-28 13:38:22,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10206429184. Throughput: 0: 55485.7. Samples: 696735660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:38:22,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 13:38:23,908][57339] Updated weights for policy 0, policy_version 622958 (0.0026) [2024-04-28 13:38:26,764][57339] Updated weights for policy 0, policy_version 622968 (0.0033) [2024-04-28 13:38:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10206724096. Throughput: 0: 55621.8. Samples: 697071920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:38:27,169][57108] Avg episode reward: [(0, '0.485')] [2024-04-28 13:38:29,818][57339] Updated weights for policy 0, policy_version 622978 (0.0029) [2024-04-28 13:38:32,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10207002624. Throughput: 0: 55489.6. Samples: 697400540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 13:38:32,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 13:38:32,499][57339] Updated weights for policy 0, policy_version 622988 (0.0025) [2024-04-28 13:38:35,789][57339] Updated weights for policy 0, policy_version 622998 (0.0028) [2024-04-28 13:38:37,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10207264768. Throughput: 0: 55766.5. Samples: 697574840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:38:37,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 13:38:38,216][57339] Updated weights for policy 0, policy_version 623008 (0.0037) [2024-04-28 13:38:41,598][57339] Updated weights for policy 0, policy_version 623018 (0.0031) [2024-04-28 13:38:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10207543296. Throughput: 0: 55753.8. Samples: 697907340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:38:42,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 13:38:44,120][57339] Updated weights for policy 0, policy_version 623028 (0.0029) [2024-04-28 13:38:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10207838208. Throughput: 0: 55668.4. Samples: 698242300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:38:47,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 13:38:47,521][57339] Updated weights for policy 0, policy_version 623038 (0.0027) [2024-04-28 13:38:50,022][57339] Updated weights for policy 0, policy_version 623048 (0.0028) [2024-04-28 13:38:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10208100352. Throughput: 0: 55576.1. Samples: 698403500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:38:52,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 13:38:53,232][57339] Updated weights for policy 0, policy_version 623058 (0.0031) [2024-04-28 13:38:56,144][57339] Updated weights for policy 0, policy_version 623068 (0.0029) [2024-04-28 13:38:57,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10208378880. Throughput: 0: 55623.7. Samples: 698740200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:38:57,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 13:38:59,005][57339] Updated weights for policy 0, policy_version 623078 (0.0031) [2024-04-28 13:39:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.7, 300 sec: 55483.4). Total num frames: 10208657408. Throughput: 0: 55674.3. Samples: 699076440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:02,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 13:39:02,179][57339] Updated weights for policy 0, policy_version 623088 (0.0032) [2024-04-28 13:39:04,185][57319] Signal inference workers to stop experience collection... (10000 times) [2024-04-28 13:39:04,222][57339] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-04-28 13:39:04,235][57319] Signal inference workers to resume experience collection... 
(10000 times) [2024-04-28 13:39:04,242][57339] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-04-28 13:39:04,845][57339] Updated weights for policy 0, policy_version 623098 (0.0032) [2024-04-28 13:39:07,169][57108] Fps is (10 sec: 58983.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10208968704. Throughput: 0: 55754.4. Samples: 699244600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:07,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 13:39:08,044][57339] Updated weights for policy 0, policy_version 623108 (0.0030) [2024-04-28 13:39:10,767][57339] Updated weights for policy 0, policy_version 623118 (0.0025) [2024-04-28 13:39:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10209214464. Throughput: 0: 55692.1. Samples: 699578060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:12,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 13:39:13,778][57339] Updated weights for policy 0, policy_version 623128 (0.0028) [2024-04-28 13:39:16,519][57339] Updated weights for policy 0, policy_version 623138 (0.0029) [2024-04-28 13:39:17,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10209509376. Throughput: 0: 55800.5. Samples: 699911560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:17,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:39:19,668][57339] Updated weights for policy 0, policy_version 623148 (0.0032) [2024-04-28 13:39:22,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.9, 300 sec: 55650.1). Total num frames: 10209787904. Throughput: 0: 55763.4. Samples: 700084180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:22,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 13:39:22,694][57339] Updated weights for policy 0, policy_version 623158 (0.0028) [2024-04-28 13:39:25,521][57339] Updated weights for policy 0, policy_version 623168 (0.0026) [2024-04-28 13:39:27,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10210050048. Throughput: 0: 55802.2. Samples: 700418440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:27,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 13:39:27,264][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000623173_10210066432.pth... [2024-04-28 13:39:27,320][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000622357_10196697088.pth [2024-04-28 13:39:28,727][57339] Updated weights for policy 0, policy_version 623178 (0.0030) [2024-04-28 13:39:31,259][57339] Updated weights for policy 0, policy_version 623188 (0.0027) [2024-04-28 13:39:32,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10210328576. Throughput: 0: 55878.8. Samples: 700756840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:32,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 13:39:34,473][57339] Updated weights for policy 0, policy_version 623198 (0.0029) [2024-04-28 13:39:37,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.9, 300 sec: 55594.5). Total num frames: 10210639872. Throughput: 0: 56096.0. Samples: 700927820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:37,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 13:39:37,175][57339] Updated weights for policy 0, policy_version 623208 (0.0026) [2024-04-28 13:39:40,136][57339] Updated weights for policy 0, policy_version 623218 (0.0031) [2024-04-28 13:39:42,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 55650.1). 
Total num frames: 10210918400. Throughput: 0: 56114.9. Samples: 701265360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:42,169][57108] Avg episode reward: [(0, '0.707')] [2024-04-28 13:39:43,167][57339] Updated weights for policy 0, policy_version 623228 (0.0036) [2024-04-28 13:39:46,048][57339] Updated weights for policy 0, policy_version 623238 (0.0025) [2024-04-28 13:39:47,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10211213312. Throughput: 0: 55957.7. Samples: 701594540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:47,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 13:39:49,304][57339] Updated weights for policy 0, policy_version 623248 (0.0026) [2024-04-28 13:39:51,861][57339] Updated weights for policy 0, policy_version 623258 (0.0031) [2024-04-28 13:39:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10211475456. Throughput: 0: 56102.2. Samples: 701769200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:52,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 13:39:55,213][57339] Updated weights for policy 0, policy_version 623268 (0.0028) [2024-04-28 13:39:57,043][57319] Signal inference workers to stop experience collection... (10050 times) [2024-04-28 13:39:57,043][57319] Signal inference workers to resume experience collection... (10050 times) [2024-04-28 13:39:57,067][57339] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-04-28 13:39:57,067][57339] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-04-28 13:39:57,169][57108] Fps is (10 sec: 54067.5, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 10211753984. Throughput: 0: 56263.6. Samples: 702109920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:39:57,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 13:39:57,589][57339] Updated weights for policy 0, policy_version 623278 (0.0026) [2024-04-28 13:40:01,015][57339] Updated weights for policy 0, policy_version 623288 (0.0025) [2024-04-28 13:40:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10212016128. Throughput: 0: 56335.1. Samples: 702446640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:40:02,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 13:40:03,747][57339] Updated weights for policy 0, policy_version 623298 (0.0028) [2024-04-28 13:40:06,979][57339] Updated weights for policy 0, policy_version 623308 (0.0026) [2024-04-28 13:40:07,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 10212278272. Throughput: 0: 55696.6. Samples: 702590540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:40:07,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 13:40:09,530][57339] Updated weights for policy 0, policy_version 623318 (0.0028) [2024-04-28 13:40:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10212573184. Throughput: 0: 55678.7. Samples: 702923980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:12,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 13:40:12,797][57339] Updated weights for policy 0, policy_version 623328 (0.0032) [2024-04-28 13:40:15,419][57339] Updated weights for policy 0, policy_version 623338 (0.0028) [2024-04-28 13:40:17,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10212868096. Throughput: 0: 55695.6. Samples: 703263140. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:17,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 13:40:18,519][57339] Updated weights for policy 0, policy_version 623348 (0.0027) [2024-04-28 13:40:21,219][57339] Updated weights for policy 0, policy_version 623358 (0.0031) [2024-04-28 13:40:22,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10213163008. Throughput: 0: 55773.7. Samples: 703437640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:22,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 13:40:24,239][57339] Updated weights for policy 0, policy_version 623368 (0.0032) [2024-04-28 13:40:27,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10213408768. Throughput: 0: 55770.8. Samples: 703775040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:27,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 13:40:27,177][57339] Updated weights for policy 0, policy_version 623378 (0.0029) [2024-04-28 13:40:30,306][57339] Updated weights for policy 0, policy_version 623388 (0.0023) [2024-04-28 13:40:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 10213703680. Throughput: 0: 55848.1. Samples: 704107700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:32,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 13:40:32,921][57339] Updated weights for policy 0, policy_version 623398 (0.0029) [2024-04-28 13:40:36,175][57339] Updated weights for policy 0, policy_version 623408 (0.0028) [2024-04-28 13:40:37,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10213949440. Throughput: 0: 55672.9. Samples: 704274480. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:37,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:40:38,865][57339] Updated weights for policy 0, policy_version 623418 (0.0030) [2024-04-28 13:40:41,889][57339] Updated weights for policy 0, policy_version 623428 (0.0025) [2024-04-28 13:40:42,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10214244352. Throughput: 0: 55580.8. Samples: 704611060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:42,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 13:40:44,640][57339] Updated weights for policy 0, policy_version 623438 (0.0029) [2024-04-28 13:40:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10214522880. Throughput: 0: 55523.0. Samples: 704945180. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:47,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 13:40:47,604][57319] Signal inference workers to stop experience collection... (10100 times) [2024-04-28 13:40:47,605][57319] Signal inference workers to resume experience collection... (10100 times) [2024-04-28 13:40:47,633][57339] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-04-28 13:40:47,634][57339] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-04-28 13:40:47,717][57339] Updated weights for policy 0, policy_version 623448 (0.0029) [2024-04-28 13:40:50,659][57339] Updated weights for policy 0, policy_version 623458 (0.0027) [2024-04-28 13:40:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10214817792. Throughput: 0: 56170.7. Samples: 705118220. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:52,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 13:40:53,668][57339] Updated weights for policy 0, policy_version 623468 (0.0030) [2024-04-28 13:40:56,553][57339] Updated weights for policy 0, policy_version 623478 (0.0033) [2024-04-28 13:40:57,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10215112704. Throughput: 0: 55999.6. Samples: 705443960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:40:57,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 13:40:59,420][57339] Updated weights for policy 0, policy_version 623488 (0.0028) [2024-04-28 13:41:02,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10215358464. Throughput: 0: 55821.7. Samples: 705775120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:41:02,170][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 13:41:02,535][57339] Updated weights for policy 0, policy_version 623498 (0.0026) [2024-04-28 13:41:05,355][57339] Updated weights for policy 0, policy_version 623508 (0.0026) [2024-04-28 13:41:07,169][57108] Fps is (10 sec: 54066.7, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 10215653376. Throughput: 0: 55618.2. Samples: 705940460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:41:07,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 13:41:08,345][57339] Updated weights for policy 0, policy_version 623518 (0.0025) [2024-04-28 13:41:11,219][57339] Updated weights for policy 0, policy_version 623528 (0.0027) [2024-04-28 13:41:12,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10215915520. Throughput: 0: 55579.7. Samples: 706276140. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:41:12,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 13:41:14,288][57339] Updated weights for policy 0, policy_version 623538 (0.0030) [2024-04-28 13:41:17,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10216194048. Throughput: 0: 55659.1. Samples: 706612360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:41:17,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 13:41:17,245][57339] Updated weights for policy 0, policy_version 623548 (0.0028) [2024-04-28 13:41:20,094][57339] Updated weights for policy 0, policy_version 623558 (0.0025) [2024-04-28 13:41:22,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10216472576. Throughput: 0: 55513.8. Samples: 706772600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:41:22,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 13:41:23,596][57339] Updated weights for policy 0, policy_version 623568 (0.0027) [2024-04-28 13:41:26,050][57339] Updated weights for policy 0, policy_version 623578 (0.0027) [2024-04-28 13:41:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10216751104. Throughput: 0: 55506.7. Samples: 707108860. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:41:27,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 13:41:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000623581_10216751104.pth... [2024-04-28 13:41:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000622766_10203398144.pth [2024-04-28 13:41:29,495][57339] Updated weights for policy 0, policy_version 623588 (0.0027) [2024-04-28 13:41:31,852][57339] Updated weights for policy 0, policy_version 623598 (0.0029) [2024-04-28 13:41:32,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55761.2). 
Total num frames: 10217046016. Throughput: 0: 55480.2. Samples: 707441780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:41:32,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 13:41:35,244][57339] Updated weights for policy 0, policy_version 623608 (0.0034) [2024-04-28 13:41:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10217308160. Throughput: 0: 55279.1. Samples: 707605780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:41:37,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 13:41:37,788][57339] Updated weights for policy 0, policy_version 623618 (0.0029) [2024-04-28 13:41:40,013][57319] Signal inference workers to stop experience collection... (10150 times) [2024-04-28 13:41:40,067][57339] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-04-28 13:41:40,070][57319] Signal inference workers to resume experience collection... (10150 times) [2024-04-28 13:41:40,078][57339] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-04-28 13:41:41,037][57339] Updated weights for policy 0, policy_version 623628 (0.0031) [2024-04-28 13:41:42,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10217603072. Throughput: 0: 55451.0. Samples: 707939260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 13:41:42,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 13:41:43,724][57339] Updated weights for policy 0, policy_version 623638 (0.0027) [2024-04-28 13:41:46,883][57339] Updated weights for policy 0, policy_version 623648 (0.0030) [2024-04-28 13:41:47,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10217865216. Throughput: 0: 55587.6. Samples: 708276560. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:41:47,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 13:41:49,548][57339] Updated weights for policy 0, policy_version 623658 (0.0030) [2024-04-28 13:41:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10218160128. Throughput: 0: 55596.0. Samples: 708442280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:41:52,169][57108] Avg episode reward: [(0, '0.703')] [2024-04-28 13:41:52,595][57339] Updated weights for policy 0, policy_version 623668 (0.0025) [2024-04-28 13:41:55,409][57339] Updated weights for policy 0, policy_version 623678 (0.0031) [2024-04-28 13:41:57,169][57108] Fps is (10 sec: 52429.1, 60 sec: 54613.4, 300 sec: 55594.5). Total num frames: 10218389504. Throughput: 0: 55430.0. Samples: 708770480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:41:57,169][57108] Avg episode reward: [(0, '0.693')] [2024-04-28 13:41:58,460][57339] Updated weights for policy 0, policy_version 623688 (0.0025) [2024-04-28 13:42:01,243][57339] Updated weights for policy 0, policy_version 623698 (0.0030) [2024-04-28 13:42:02,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55432.6, 300 sec: 55594.6). Total num frames: 10218684416. Throughput: 0: 55364.1. Samples: 709103740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:02,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 13:42:04,608][57339] Updated weights for policy 0, policy_version 623708 (0.0026) [2024-04-28 13:42:07,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10218979328. Throughput: 0: 55639.2. Samples: 709276360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:07,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 13:42:07,177][57339] Updated weights for policy 0, policy_version 623718 (0.0030) [2024-04-28 13:42:10,392][57339] Updated weights for policy 0, policy_version 623728 (0.0028) [2024-04-28 13:42:12,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 10219257856. Throughput: 0: 55588.6. Samples: 709610340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:12,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 13:42:13,159][57339] Updated weights for policy 0, policy_version 623738 (0.0026) [2024-04-28 13:42:16,293][57339] Updated weights for policy 0, policy_version 623748 (0.0031) [2024-04-28 13:42:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10219536384. Throughput: 0: 55454.1. Samples: 709937220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:17,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 13:42:18,901][57339] Updated weights for policy 0, policy_version 623758 (0.0025) [2024-04-28 13:42:22,125][57339] Updated weights for policy 0, policy_version 623768 (0.0027) [2024-04-28 13:42:22,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10219814912. Throughput: 0: 55576.2. Samples: 710106700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:22,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:42:25,252][57339] Updated weights for policy 0, policy_version 623778 (0.0027) [2024-04-28 13:42:27,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10220077056. Throughput: 0: 55515.0. Samples: 710437440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:27,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 13:42:28,105][57339] Updated weights for policy 0, policy_version 623788 (0.0026) [2024-04-28 13:42:31,025][57339] Updated weights for policy 0, policy_version 623798 (0.0027) [2024-04-28 13:42:32,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 10220339200. Throughput: 0: 55555.6. Samples: 710776560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:32,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 13:42:33,877][57339] Updated weights for policy 0, policy_version 623808 (0.0025) [2024-04-28 13:42:36,867][57339] Updated weights for policy 0, policy_version 623818 (0.0030) [2024-04-28 13:42:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10220634112. Throughput: 0: 55336.9. Samples: 710932440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:37,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 13:42:39,795][57339] Updated weights for policy 0, policy_version 623828 (0.0030) [2024-04-28 13:42:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10220912640. Throughput: 0: 55391.0. Samples: 711263080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:42,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 13:42:42,854][57339] Updated weights for policy 0, policy_version 623838 (0.0026) [2024-04-28 13:42:45,770][57339] Updated weights for policy 0, policy_version 623848 (0.0030) [2024-04-28 13:42:46,087][57319] Signal inference workers to stop experience collection... (10200 times) [2024-04-28 13:42:46,126][57339] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-04-28 13:42:46,151][57319] Signal inference workers to resume experience collection... 
(10200 times) [2024-04-28 13:42:46,157][57339] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-04-28 13:42:47,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10221207552. Throughput: 0: 55347.8. Samples: 711594400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:47,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 13:42:48,868][57339] Updated weights for policy 0, policy_version 623858 (0.0031) [2024-04-28 13:42:51,629][57339] Updated weights for policy 0, policy_version 623868 (0.0026) [2024-04-28 13:42:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10221486080. Throughput: 0: 55480.7. Samples: 711773000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:52,169][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 13:42:54,660][57339] Updated weights for policy 0, policy_version 623878 (0.0028) [2024-04-28 13:42:57,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10221731840. Throughput: 0: 55417.7. Samples: 712104140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:42:57,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 13:42:57,545][57339] Updated weights for policy 0, policy_version 623888 (0.0031) [2024-04-28 13:43:00,510][57339] Updated weights for policy 0, policy_version 623898 (0.0031) [2024-04-28 13:43:02,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55432.3, 300 sec: 55594.5). Total num frames: 10222010368. Throughput: 0: 55571.3. Samples: 712437940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:43:02,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 13:43:03,602][57339] Updated weights for policy 0, policy_version 623908 (0.0027) [2024-04-28 13:43:06,526][57339] Updated weights for policy 0, policy_version 623918 (0.0026) [2024-04-28 13:43:07,169][57108] Fps is (10 sec: 54066.5, 60 sec: 54886.2, 300 sec: 55594.5). Total num frames: 10222272512. Throughput: 0: 55289.5. Samples: 712594740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:43:07,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 13:43:09,498][57339] Updated weights for policy 0, policy_version 623928 (0.0031) [2024-04-28 13:43:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.2, 300 sec: 55594.5). Total num frames: 10222567424. Throughput: 0: 55311.0. Samples: 712926440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:43:12,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 13:43:12,660][57339] Updated weights for policy 0, policy_version 623938 (0.0027) [2024-04-28 13:43:15,242][57339] Updated weights for policy 0, policy_version 623948 (0.0028) [2024-04-28 13:43:17,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10222845952. Throughput: 0: 55160.4. Samples: 713258780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 13:43:17,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 13:43:18,697][57339] Updated weights for policy 0, policy_version 623958 (0.0027) [2024-04-28 13:43:21,025][57339] Updated weights for policy 0, policy_version 623968 (0.0035) [2024-04-28 13:43:22,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 10223140864. Throughput: 0: 55553.8. Samples: 713432360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:43:22,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 13:43:24,564][57339] Updated weights for policy 0, policy_version 623978 (0.0030) [2024-04-28 13:43:26,914][57339] Updated weights for policy 0, policy_version 623988 (0.0029) [2024-04-28 13:43:27,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10223435776. Throughput: 0: 55590.6. Samples: 713764660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:43:27,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 13:43:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000623989_10223435776.pth... [2024-04-28 13:43:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000623173_10210066432.pth [2024-04-28 13:43:30,422][57339] Updated weights for policy 0, policy_version 623998 (0.0029) [2024-04-28 13:43:32,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10223665152. Throughput: 0: 55618.3. Samples: 714097220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:43:32,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 13:43:32,997][57339] Updated weights for policy 0, policy_version 624008 (0.0028) [2024-04-28 13:43:36,138][57339] Updated weights for policy 0, policy_version 624018 (0.0025) [2024-04-28 13:43:37,169][57108] Fps is (10 sec: 50791.0, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10223943680. Throughput: 0: 55175.3. Samples: 714255880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:43:37,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 13:43:38,828][57339] Updated weights for policy 0, policy_version 624028 (0.0025) [2024-04-28 13:43:41,873][57339] Updated weights for policy 0, policy_version 624038 (0.0026) [2024-04-28 13:43:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.6, 300 sec: 55594.6). 
Total num frames: 10224238592. Throughput: 0: 55289.4. Samples: 714592160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:43:42,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 13:43:44,455][57319] Signal inference workers to stop experience collection... (10250 times) [2024-04-28 13:43:44,455][57319] Signal inference workers to resume experience collection... (10250 times) [2024-04-28 13:43:44,479][57339] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-04-28 13:43:44,479][57339] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-04-28 13:43:44,564][57339] Updated weights for policy 0, policy_version 624048 (0.0036) [2024-04-28 13:43:47,169][57108] Fps is (10 sec: 58981.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10224533504. Throughput: 0: 55478.7. Samples: 714934480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:43:47,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:43:47,701][57339] Updated weights for policy 0, policy_version 624058 (0.0027) [2024-04-28 13:43:50,422][57339] Updated weights for policy 0, policy_version 624068 (0.0029) [2024-04-28 13:43:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10224795648. Throughput: 0: 55674.4. Samples: 715100080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:43:52,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 13:43:53,573][57339] Updated weights for policy 0, policy_version 624078 (0.0028) [2024-04-28 13:43:56,372][57339] Updated weights for policy 0, policy_version 624088 (0.0024) [2024-04-28 13:43:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10225074176. Throughput: 0: 55660.6. Samples: 715431160. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:43:57,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 13:43:59,828][57339] Updated weights for policy 0, policy_version 624098 (0.0026) [2024-04-28 13:44:02,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.9, 300 sec: 55594.5). Total num frames: 10225369088. Throughput: 0: 55681.9. Samples: 715764460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:02,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 13:44:02,215][57339] Updated weights for policy 0, policy_version 624108 (0.0030) [2024-04-28 13:44:05,704][57339] Updated weights for policy 0, policy_version 624118 (0.0031) [2024-04-28 13:44:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10225647616. Throughput: 0: 55688.4. Samples: 715938340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:07,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 13:44:08,044][57339] Updated weights for policy 0, policy_version 624128 (0.0036) [2024-04-28 13:44:11,672][57339] Updated weights for policy 0, policy_version 624138 (0.0027) [2024-04-28 13:44:12,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 10225893376. Throughput: 0: 55721.9. Samples: 716272140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:12,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 13:44:14,030][57339] Updated weights for policy 0, policy_version 624148 (0.0029) [2024-04-28 13:44:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10226188288. Throughput: 0: 55698.1. Samples: 716603640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:17,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 13:44:17,357][57339] Updated weights for policy 0, policy_version 624158 (0.0024) [2024-04-28 13:44:20,125][57339] Updated weights for policy 0, policy_version 624168 (0.0033) [2024-04-28 13:44:22,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10226483200. Throughput: 0: 55919.9. Samples: 716772280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:22,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 13:44:23,009][57339] Updated weights for policy 0, policy_version 624178 (0.0031) [2024-04-28 13:44:25,852][57339] Updated weights for policy 0, policy_version 624188 (0.0029) [2024-04-28 13:44:27,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10226745344. Throughput: 0: 55930.6. Samples: 717109040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:27,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 13:44:28,894][57339] Updated weights for policy 0, policy_version 624198 (0.0032) [2024-04-28 13:44:31,638][57339] Updated weights for policy 0, policy_version 624208 (0.0027) [2024-04-28 13:44:32,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 10227040256. Throughput: 0: 55813.0. Samples: 717446060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:32,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 13:44:34,801][57339] Updated weights for policy 0, policy_version 624218 (0.0029) [2024-04-28 13:44:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 10227318784. Throughput: 0: 55854.2. Samples: 717613520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:37,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 13:44:37,634][57339] Updated weights for policy 0, policy_version 624228 (0.0026) [2024-04-28 13:44:40,560][57339] Updated weights for policy 0, policy_version 624238 (0.0031) [2024-04-28 13:44:42,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 10227580928. Throughput: 0: 55857.9. Samples: 717944760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:42,169][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 13:44:43,532][57339] Updated weights for policy 0, policy_version 624248 (0.0026) [2024-04-28 13:44:46,896][57339] Updated weights for policy 0, policy_version 624258 (0.0025) [2024-04-28 13:44:47,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10227859456. Throughput: 0: 56094.4. Samples: 718288720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:47,178][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 13:44:49,417][57339] Updated weights for policy 0, policy_version 624268 (0.0031) [2024-04-28 13:44:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10228137984. Throughput: 0: 55703.2. Samples: 718444980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 13:44:52,178][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 13:44:52,693][57339] Updated weights for policy 0, policy_version 624278 (0.0036) [2024-04-28 13:44:55,233][57339] Updated weights for policy 0, policy_version 624288 (0.0032) [2024-04-28 13:44:57,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10228432896. Throughput: 0: 55720.0. Samples: 718779540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:44:57,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:44:58,598][57339] Updated weights for policy 0, policy_version 624298 (0.0027) [2024-04-28 13:45:01,105][57339] Updated weights for policy 0, policy_version 624308 (0.0033) [2024-04-28 13:45:02,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10228711424. Throughput: 0: 55786.5. Samples: 719114020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:02,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 13:45:04,299][57319] Signal inference workers to stop experience collection... (10300 times) [2024-04-28 13:45:04,341][57339] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-04-28 13:45:04,354][57319] Signal inference workers to resume experience collection... (10300 times) [2024-04-28 13:45:04,360][57339] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-04-28 13:45:04,459][57339] Updated weights for policy 0, policy_version 624318 (0.0026) [2024-04-28 13:45:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10228973568. Throughput: 0: 55810.3. Samples: 719283740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:07,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 13:45:07,285][57339] Updated weights for policy 0, policy_version 624328 (0.0028) [2024-04-28 13:45:10,262][57339] Updated weights for policy 0, policy_version 624338 (0.0030) [2024-04-28 13:45:12,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10229252096. Throughput: 0: 55666.7. Samples: 719614040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 13:45:13,279][57339] Updated weights for policy 0, policy_version 624348 (0.0028) [2024-04-28 13:45:16,014][57339] Updated weights for policy 0, policy_version 624358 (0.0025) [2024-04-28 13:45:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.8, 300 sec: 55483.5). Total num frames: 10229530624. Throughput: 0: 55525.0. Samples: 719944680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:17,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 13:45:19,233][57339] Updated weights for policy 0, policy_version 624368 (0.0028) [2024-04-28 13:45:21,878][57339] Updated weights for policy 0, policy_version 624378 (0.0028) [2024-04-28 13:45:22,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10229825536. Throughput: 0: 55680.8. Samples: 720119160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:22,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 13:45:25,202][57339] Updated weights for policy 0, policy_version 624388 (0.0028) [2024-04-28 13:45:27,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10230087680. Throughput: 0: 55594.6. Samples: 720446520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:27,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 13:45:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000624395_10230087680.pth... [2024-04-28 13:45:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000623581_10216751104.pth [2024-04-28 13:45:27,778][57339] Updated weights for policy 0, policy_version 624398 (0.0033) [2024-04-28 13:45:31,021][57339] Updated weights for policy 0, policy_version 624408 (0.0032) [2024-04-28 13:45:32,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55705.6). 
Total num frames: 10230382592. Throughput: 0: 55335.2. Samples: 720778800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:32,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 13:45:33,783][57339] Updated weights for policy 0, policy_version 624418 (0.0024) [2024-04-28 13:45:36,769][57339] Updated weights for policy 0, policy_version 624428 (0.0028) [2024-04-28 13:45:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10230644736. Throughput: 0: 55609.0. Samples: 720947380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:37,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 13:45:39,640][57339] Updated weights for policy 0, policy_version 624438 (0.0027) [2024-04-28 13:45:42,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10230923264. Throughput: 0: 55642.9. Samples: 721283480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:42,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 13:45:42,773][57339] Updated weights for policy 0, policy_version 624448 (0.0029) [2024-04-28 13:45:45,581][57339] Updated weights for policy 0, policy_version 624458 (0.0033) [2024-04-28 13:45:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10231201792. Throughput: 0: 55520.4. Samples: 721612440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:47,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 13:45:48,832][57339] Updated weights for policy 0, policy_version 624468 (0.0031) [2024-04-28 13:45:49,001][57319] Signal inference workers to stop experience collection... (10350 times) [2024-04-28 13:45:49,041][57339] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-04-28 13:45:49,096][57319] Signal inference workers to resume experience collection... 
(10350 times) [2024-04-28 13:45:49,096][57339] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-04-28 13:45:51,462][57339] Updated weights for policy 0, policy_version 624478 (0.0020) [2024-04-28 13:45:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55483.4). Total num frames: 10231480320. Throughput: 0: 55282.5. Samples: 721771460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:52,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 13:45:54,571][57339] Updated weights for policy 0, policy_version 624488 (0.0031) [2024-04-28 13:45:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10231758848. Throughput: 0: 55396.9. Samples: 722106900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:45:57,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 13:45:57,386][57339] Updated weights for policy 0, policy_version 624498 (0.0031) [2024-04-28 13:46:00,460][57339] Updated weights for policy 0, policy_version 624508 (0.0030) [2024-04-28 13:46:02,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10232037376. Throughput: 0: 55509.8. Samples: 722442620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:46:02,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 13:46:03,115][57339] Updated weights for policy 0, policy_version 624518 (0.0027) [2024-04-28 13:46:06,363][57339] Updated weights for policy 0, policy_version 624528 (0.0025) [2024-04-28 13:46:07,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10232332288. Throughput: 0: 55441.4. Samples: 722614020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:46:07,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 13:46:09,188][57339] Updated weights for policy 0, policy_version 624538 (0.0033) [2024-04-28 13:46:12,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10232578048. Throughput: 0: 55580.0. Samples: 722947620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:46:12,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 13:46:12,214][57339] Updated weights for policy 0, policy_version 624548 (0.0028) [2024-04-28 13:46:15,128][57339] Updated weights for policy 0, policy_version 624558 (0.0028) [2024-04-28 13:46:17,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 10232856576. Throughput: 0: 55635.5. Samples: 723282400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:46:17,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 13:46:17,989][57339] Updated weights for policy 0, policy_version 624568 (0.0031) [2024-04-28 13:46:21,060][57339] Updated weights for policy 0, policy_version 624578 (0.0022) [2024-04-28 13:46:22,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10233151488. Throughput: 0: 55518.1. Samples: 723445700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:46:22,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 13:46:23,806][57339] Updated weights for policy 0, policy_version 624588 (0.0026) [2024-04-28 13:46:26,836][57339] Updated weights for policy 0, policy_version 624598 (0.0034) [2024-04-28 13:46:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 10233413632. Throughput: 0: 55520.9. Samples: 723781920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 13:46:27,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 13:46:29,799][57339] Updated weights for policy 0, policy_version 624608 (0.0028) [2024-04-28 13:46:32,169][57108] Fps is (10 sec: 55702.9, 60 sec: 55432.0, 300 sec: 55594.4). Total num frames: 10233708544. Throughput: 0: 55557.0. Samples: 724112540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:46:32,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:46:32,756][57339] Updated weights for policy 0, policy_version 624618 (0.0025) [2024-04-28 13:46:35,667][57339] Updated weights for policy 0, policy_version 624628 (0.0031) [2024-04-28 13:46:37,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 10234003456. Throughput: 0: 55803.5. Samples: 724282620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:46:37,170][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 13:46:38,741][57339] Updated weights for policy 0, policy_version 624638 (0.0025) [2024-04-28 13:46:41,581][57339] Updated weights for policy 0, policy_version 624648 (0.0023) [2024-04-28 13:46:42,169][57108] Fps is (10 sec: 57347.9, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10234281984. Throughput: 0: 55809.8. Samples: 724618340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:46:42,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 13:46:44,505][57339] Updated weights for policy 0, policy_version 624658 (0.0030) [2024-04-28 13:46:47,169][57108] Fps is (10 sec: 52429.9, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 10234527744. Throughput: 0: 55846.7. Samples: 724955720. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:46:47,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 13:46:47,428][57339] Updated weights for policy 0, policy_version 624668 (0.0029) [2024-04-28 13:46:47,550][57319] Signal inference workers to stop experience collection... (10400 times) [2024-04-28 13:46:47,596][57339] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-04-28 13:46:47,609][57319] Signal inference workers to resume experience collection... (10400 times) [2024-04-28 13:46:47,615][57339] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-04-28 13:46:50,252][57339] Updated weights for policy 0, policy_version 624678 (0.0030) [2024-04-28 13:46:52,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 10234806272. Throughput: 0: 55772.9. Samples: 725123800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:46:52,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 13:46:53,292][57339] Updated weights for policy 0, policy_version 624688 (0.0028) [2024-04-28 13:46:56,423][57339] Updated weights for policy 0, policy_version 624698 (0.0027) [2024-04-28 13:46:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10235101184. Throughput: 0: 55719.6. Samples: 725455000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:46:57,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 13:46:58,992][57339] Updated weights for policy 0, policy_version 624708 (0.0033) [2024-04-28 13:47:02,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55538.9). Total num frames: 10235363328. Throughput: 0: 55682.2. Samples: 725788100. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:02,170][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 13:47:02,424][57339] Updated weights for policy 0, policy_version 624718 (0.0027) [2024-04-28 13:47:04,905][57339] Updated weights for policy 0, policy_version 624728 (0.0027) [2024-04-28 13:47:07,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10235658240. Throughput: 0: 55796.1. Samples: 725956520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:07,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 13:47:08,114][57339] Updated weights for policy 0, policy_version 624738 (0.0027) [2024-04-28 13:47:10,698][57339] Updated weights for policy 0, policy_version 624748 (0.0031) [2024-04-28 13:47:12,169][57108] Fps is (10 sec: 58983.3, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 10235953152. Throughput: 0: 55866.8. Samples: 726295920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:12,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 13:47:14,069][57339] Updated weights for policy 0, policy_version 624758 (0.0027) [2024-04-28 13:47:16,576][57339] Updated weights for policy 0, policy_version 624768 (0.0025) [2024-04-28 13:47:17,169][57108] Fps is (10 sec: 58981.9, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 10236248064. Throughput: 0: 55892.2. Samples: 726627660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:17,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 13:47:20,026][57339] Updated weights for policy 0, policy_version 624778 (0.0032) [2024-04-28 13:47:22,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10236510208. Throughput: 0: 55970.8. Samples: 726801300. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:22,169][57108] Avg episode reward: [(0, '0.520')] [2024-04-28 13:47:22,311][57339] Updated weights for policy 0, policy_version 624788 (0.0033) [2024-04-28 13:47:25,795][57339] Updated weights for policy 0, policy_version 624798 (0.0030) [2024-04-28 13:47:27,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10236772352. Throughput: 0: 56047.0. Samples: 727140460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:27,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:47:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000624803_10236772352.pth... [2024-04-28 13:47:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000623989_10223435776.pth [2024-04-28 13:47:28,195][57339] Updated weights for policy 0, policy_version 624808 (0.0028) [2024-04-28 13:47:31,508][57339] Updated weights for policy 0, policy_version 624818 (0.0028) [2024-04-28 13:47:32,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55433.1, 300 sec: 55594.5). Total num frames: 10237034496. Throughput: 0: 55841.3. Samples: 727468580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:32,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 13:47:34,158][57339] Updated weights for policy 0, policy_version 624828 (0.0025) [2024-04-28 13:47:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10237313024. Throughput: 0: 55619.1. Samples: 727626660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:37,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 13:47:37,414][57339] Updated weights for policy 0, policy_version 624838 (0.0027) [2024-04-28 13:47:39,924][57339] Updated weights for policy 0, policy_version 624848 (0.0029) [2024-04-28 13:47:42,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.4, 300 sec: 55594.5). 
Total num frames: 10237607936. Throughput: 0: 55756.8. Samples: 727964060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:42,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:47:43,358][57339] Updated weights for policy 0, policy_version 624858 (0.0025) [2024-04-28 13:47:45,727][57339] Updated weights for policy 0, policy_version 624868 (0.0030) [2024-04-28 13:47:47,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 10237886464. Throughput: 0: 55704.0. Samples: 728294780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:47,169][57108] Avg episode reward: [(0, '0.728')] [2024-04-28 13:47:49,287][57339] Updated weights for policy 0, policy_version 624878 (0.0027) [2024-04-28 13:47:51,381][57319] Signal inference workers to stop experience collection... (10450 times) [2024-04-28 13:47:51,421][57339] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-04-28 13:47:51,433][57319] Signal inference workers to resume experience collection... (10450 times) [2024-04-28 13:47:51,439][57339] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-04-28 13:47:51,548][57339] Updated weights for policy 0, policy_version 624888 (0.0032) [2024-04-28 13:47:52,169][57108] Fps is (10 sec: 58983.1, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 10238197760. Throughput: 0: 55877.0. Samples: 728470980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:52,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 13:47:55,053][57339] Updated weights for policy 0, policy_version 624898 (0.0027) [2024-04-28 13:47:57,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10238443520. Throughput: 0: 55867.5. Samples: 728809960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:47:57,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 13:47:57,503][57339] Updated weights for policy 0, policy_version 624908 (0.0039) [2024-04-28 13:48:01,048][57339] Updated weights for policy 0, policy_version 624918 (0.0025) [2024-04-28 13:48:02,169][57108] Fps is (10 sec: 54066.6, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10238738432. Throughput: 0: 55959.6. Samples: 729145840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-04-28 13:48:02,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 13:48:03,411][57339] Updated weights for policy 0, policy_version 624928 (0.0027) [2024-04-28 13:48:06,822][57339] Updated weights for policy 0, policy_version 624938 (0.0034) [2024-04-28 13:48:07,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 10238984192. Throughput: 0: 55704.7. Samples: 729308020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:07,170][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:48:09,105][57339] Updated weights for policy 0, policy_version 624948 (0.0024) [2024-04-28 13:48:12,169][57108] Fps is (10 sec: 50790.9, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 10239246336. Throughput: 0: 55615.2. Samples: 729643140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:12,170][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 13:48:12,612][57339] Updated weights for policy 0, policy_version 624958 (0.0032) [2024-04-28 13:48:14,969][57339] Updated weights for policy 0, policy_version 624968 (0.0029) [2024-04-28 13:48:17,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10239557632. Throughput: 0: 55828.1. Samples: 729980840. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:17,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 13:48:18,445][57339] Updated weights for policy 0, policy_version 624978 (0.0029) [2024-04-28 13:48:20,881][57339] Updated weights for policy 0, policy_version 624988 (0.0033) [2024-04-28 13:48:22,169][57108] Fps is (10 sec: 60620.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10239852544. Throughput: 0: 55995.5. Samples: 730146460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:22,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 13:48:24,315][57339] Updated weights for policy 0, policy_version 624998 (0.0027) [2024-04-28 13:48:26,662][57339] Updated weights for policy 0, policy_version 625008 (0.0027) [2024-04-28 13:48:27,169][57108] Fps is (10 sec: 58981.5, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10240147456. Throughput: 0: 55882.2. Samples: 730478760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:27,170][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 13:48:30,197][57339] Updated weights for policy 0, policy_version 625018 (0.0027) [2024-04-28 13:48:32,169][57108] Fps is (10 sec: 55706.3, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10240409600. Throughput: 0: 56012.7. Samples: 730815340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:32,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 13:48:32,442][57339] Updated weights for policy 0, policy_version 625028 (0.0030) [2024-04-28 13:48:35,952][57339] Updated weights for policy 0, policy_version 625038 (0.0031) [2024-04-28 13:48:35,968][57319] Signal inference workers to stop experience collection... (10500 times) [2024-04-28 13:48:35,969][57319] Signal inference workers to resume experience collection... 
(10500 times) [2024-04-28 13:48:35,980][57339] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-04-28 13:48:35,992][57339] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-04-28 13:48:37,169][57108] Fps is (10 sec: 54067.9, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10240688128. Throughput: 0: 55886.6. Samples: 730985880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:37,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 13:48:38,259][57339] Updated weights for policy 0, policy_version 625048 (0.0027) [2024-04-28 13:48:41,777][57339] Updated weights for policy 0, policy_version 625058 (0.0029) [2024-04-28 13:48:42,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 10240983040. Throughput: 0: 55905.3. Samples: 731325700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:42,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 13:48:44,230][57339] Updated weights for policy 0, policy_version 625068 (0.0025) [2024-04-28 13:48:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10241245184. Throughput: 0: 56040.5. Samples: 731667660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:47,169][57108] Avg episode reward: [(0, '0.497')] [2024-04-28 13:48:47,639][57339] Updated weights for policy 0, policy_version 625078 (0.0026) [2024-04-28 13:48:49,946][57339] Updated weights for policy 0, policy_version 625088 (0.0024) [2024-04-28 13:48:52,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10241523712. Throughput: 0: 55925.0. Samples: 731824640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:52,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 13:48:53,356][57339] Updated weights for policy 0, policy_version 625098 (0.0029) [2024-04-28 13:48:55,772][57339] Updated weights for policy 0, policy_version 625108 (0.0025) [2024-04-28 13:48:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10241802240. Throughput: 0: 55893.8. Samples: 732158360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:48:57,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 13:48:59,213][57339] Updated weights for policy 0, policy_version 625118 (0.0031) [2024-04-28 13:49:01,833][57339] Updated weights for policy 0, policy_version 625128 (0.0027) [2024-04-28 13:49:02,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10242113536. Throughput: 0: 55860.3. Samples: 732494560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:49:02,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:49:05,098][57339] Updated weights for policy 0, policy_version 625138 (0.0034) [2024-04-28 13:49:07,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56525.0, 300 sec: 55872.2). Total num frames: 10242375680. Throughput: 0: 56200.6. Samples: 732675480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:49:07,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 13:49:07,617][57339] Updated weights for policy 0, policy_version 625148 (0.0028) [2024-04-28 13:49:11,044][57339] Updated weights for policy 0, policy_version 625158 (0.0026) [2024-04-28 13:49:12,169][57108] Fps is (10 sec: 54066.5, 60 sec: 56797.7, 300 sec: 55816.7). Total num frames: 10242654208. Throughput: 0: 56167.9. Samples: 733006320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:49:12,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 13:49:13,440][57339] Updated weights for policy 0, policy_version 625168 (0.0025) [2024-04-28 13:49:16,796][57339] Updated weights for policy 0, policy_version 625178 (0.0036) [2024-04-28 13:49:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10242932736. Throughput: 0: 56109.8. Samples: 733340280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:49:17,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 13:49:19,427][57339] Updated weights for policy 0, policy_version 625188 (0.0028) [2024-04-28 13:49:22,169][57108] Fps is (10 sec: 54068.5, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10243194880. Throughput: 0: 55918.7. Samples: 733502220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:49:22,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 13:49:22,627][57339] Updated weights for policy 0, policy_version 625198 (0.0031) [2024-04-28 13:49:25,124][57339] Updated weights for policy 0, policy_version 625208 (0.0031) [2024-04-28 13:49:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10243473408. Throughput: 0: 55933.5. Samples: 733842700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:49:27,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 13:49:27,176][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000625213_10243489792.pth... [2024-04-28 13:49:27,220][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000624395_10230087680.pth [2024-04-28 13:49:28,547][57339] Updated weights for policy 0, policy_version 625218 (0.0026) [2024-04-28 13:49:31,026][57339] Updated weights for policy 0, policy_version 625228 (0.0028) [2024-04-28 13:49:32,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55705.6). 
Total num frames: 10243751936. Throughput: 0: 55853.9. Samples: 734181080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:49:32,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 13:49:34,376][57339] Updated weights for policy 0, policy_version 625238 (0.0027) [2024-04-28 13:49:37,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10244046848. Throughput: 0: 56096.6. Samples: 734348980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 13:49:37,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 13:49:37,323][57339] Updated weights for policy 0, policy_version 625248 (0.0032) [2024-04-28 13:49:39,382][57319] Signal inference workers to stop experience collection... (10550 times) [2024-04-28 13:49:39,437][57339] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-04-28 13:49:39,438][57319] Signal inference workers to resume experience collection... (10550 times) [2024-04-28 13:49:39,452][57339] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-04-28 13:49:40,126][57339] Updated weights for policy 0, policy_version 625258 (0.0035) [2024-04-28 13:49:42,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10244341760. Throughput: 0: 56104.4. Samples: 734683060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:49:42,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 13:49:43,242][57339] Updated weights for policy 0, policy_version 625268 (0.0024) [2024-04-28 13:49:46,066][57339] Updated weights for policy 0, policy_version 625278 (0.0033) [2024-04-28 13:49:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10244603904. Throughput: 0: 56016.6. Samples: 735015300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:49:47,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 13:49:49,158][57339] Updated weights for policy 0, policy_version 625288 (0.0025) [2024-04-28 13:49:51,926][57339] Updated weights for policy 0, policy_version 625298 (0.0028) [2024-04-28 13:49:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10244882432. Throughput: 0: 55758.6. Samples: 735184620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:49:52,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 13:49:55,038][57339] Updated weights for policy 0, policy_version 625308 (0.0025) [2024-04-28 13:49:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10245177344. Throughput: 0: 55825.5. Samples: 735518460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:49:57,169][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 13:49:57,922][57339] Updated weights for policy 0, policy_version 625318 (0.0026) [2024-04-28 13:50:01,029][57339] Updated weights for policy 0, policy_version 625328 (0.0033) [2024-04-28 13:50:02,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10245439488. Throughput: 0: 55822.5. Samples: 735852300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:02,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 13:50:03,783][57339] Updated weights for policy 0, policy_version 625338 (0.0030) [2024-04-28 13:50:06,959][57339] Updated weights for policy 0, policy_version 625348 (0.0032) [2024-04-28 13:50:07,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10245718016. Throughput: 0: 55807.1. Samples: 736013540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:07,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 13:50:09,630][57339] Updated weights for policy 0, policy_version 625358 (0.0029) [2024-04-28 13:50:12,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10245996544. Throughput: 0: 55706.5. Samples: 736349500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:12,178][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 13:50:12,708][57339] Updated weights for policy 0, policy_version 625368 (0.0027) [2024-04-28 13:50:15,599][57339] Updated weights for policy 0, policy_version 625378 (0.0029) [2024-04-28 13:50:17,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10246291456. Throughput: 0: 55623.0. Samples: 736684120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:17,169][57108] Avg episode reward: [(0, '0.496')] [2024-04-28 13:50:18,655][57339] Updated weights for policy 0, policy_version 625388 (0.0036) [2024-04-28 13:50:21,626][57339] Updated weights for policy 0, policy_version 625398 (0.0027) [2024-04-28 13:50:22,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10246553600. Throughput: 0: 55590.0. Samples: 736850540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:22,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 13:50:24,789][57339] Updated weights for policy 0, policy_version 625408 (0.0033) [2024-04-28 13:50:27,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10246832128. Throughput: 0: 55470.6. Samples: 737179240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:27,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 13:50:27,534][57339] Updated weights for policy 0, policy_version 625418 (0.0025) [2024-04-28 13:50:30,550][57339] Updated weights for policy 0, policy_version 625428 (0.0031) [2024-04-28 13:50:32,169][57108] Fps is (10 sec: 54068.4, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10247094272. Throughput: 0: 55520.1. Samples: 737513700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:32,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 13:50:33,300][57339] Updated weights for policy 0, policy_version 625438 (0.0029) [2024-04-28 13:50:36,385][57339] Updated weights for policy 0, policy_version 625448 (0.0030) [2024-04-28 13:50:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10247389184. Throughput: 0: 55521.3. Samples: 737683080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:37,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 13:50:39,039][57339] Updated weights for policy 0, policy_version 625458 (0.0025) [2024-04-28 13:50:42,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.5, 300 sec: 55705.6). Total num frames: 10247634944. Throughput: 0: 55400.6. Samples: 738011480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:42,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 13:50:42,343][57339] Updated weights for policy 0, policy_version 625468 (0.0026) [2024-04-28 13:50:45,170][57339] Updated weights for policy 0, policy_version 625478 (0.0033) [2024-04-28 13:50:47,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55761.2). Total num frames: 10247929856. Throughput: 0: 55303.6. Samples: 738340960. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:47,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 13:50:48,229][57339] Updated weights for policy 0, policy_version 625488 (0.0026) [2024-04-28 13:50:50,959][57339] Updated weights for policy 0, policy_version 625498 (0.0028) [2024-04-28 13:50:52,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10248208384. Throughput: 0: 55524.8. Samples: 738512160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:52,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 13:50:53,962][57339] Updated weights for policy 0, policy_version 625508 (0.0035) [2024-04-28 13:50:56,917][57339] Updated weights for policy 0, policy_version 625518 (0.0027) [2024-04-28 13:50:57,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10248503296. Throughput: 0: 55450.8. Samples: 738844780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:50:57,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 13:50:59,902][57339] Updated weights for policy 0, policy_version 625528 (0.0032) [2024-04-28 13:51:02,011][57319] Signal inference workers to stop experience collection... (10600 times) [2024-04-28 13:51:02,011][57319] Signal inference workers to resume experience collection... (10600 times) [2024-04-28 13:51:02,032][57339] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-04-28 13:51:02,032][57339] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-04-28 13:51:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10248765440. Throughput: 0: 55453.8. Samples: 739179540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:51:02,170][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 13:51:02,872][57339] Updated weights for policy 0, policy_version 625538 (0.0029) [2024-04-28 13:51:05,822][57339] Updated weights for policy 0, policy_version 625548 (0.0030) [2024-04-28 13:51:07,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 10249027584. Throughput: 0: 55353.2. Samples: 739341420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:51:07,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 13:51:08,774][57339] Updated weights for policy 0, policy_version 625558 (0.0028) [2024-04-28 13:51:11,675][57339] Updated weights for policy 0, policy_version 625568 (0.0032) [2024-04-28 13:51:12,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10249322496. Throughput: 0: 55566.4. Samples: 739679720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 13:51:12,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 13:51:14,578][57339] Updated weights for policy 0, policy_version 625578 (0.0031) [2024-04-28 13:51:17,169][57108] Fps is (10 sec: 55704.9, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10249584640. Throughput: 0: 55495.9. Samples: 740011020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:51:17,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 13:51:17,582][57339] Updated weights for policy 0, policy_version 625588 (0.0030) [2024-04-28 13:51:20,575][57339] Updated weights for policy 0, policy_version 625598 (0.0026) [2024-04-28 13:51:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10249879552. Throughput: 0: 55454.2. Samples: 740178520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:51:22,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 13:51:23,425][57339] Updated weights for policy 0, policy_version 625608 (0.0029) [2024-04-28 13:51:26,510][57339] Updated weights for policy 0, policy_version 625618 (0.0027) [2024-04-28 13:51:27,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.7, 300 sec: 55761.3). Total num frames: 10250158080. Throughput: 0: 55641.8. Samples: 740515360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:51:27,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 13:51:27,189][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000625621_10250174464.pth... [2024-04-28 13:51:27,234][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000624803_10236772352.pth [2024-04-28 13:51:29,229][57339] Updated weights for policy 0, policy_version 625628 (0.0024) [2024-04-28 13:51:32,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10250420224. Throughput: 0: 55779.3. Samples: 740851020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:51:32,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 13:51:32,368][57339] Updated weights for policy 0, policy_version 625638 (0.0030) [2024-04-28 13:51:34,975][57339] Updated weights for policy 0, policy_version 625648 (0.0030) [2024-04-28 13:51:37,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10250715136. Throughput: 0: 55635.9. Samples: 741015780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:51:37,170][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 13:51:38,306][57339] Updated weights for policy 0, policy_version 625658 (0.0030) [2024-04-28 13:51:40,706][57339] Updated weights for policy 0, policy_version 625668 (0.0028) [2024-04-28 13:51:42,169][57108] Fps is (10 sec: 57342.2, 60 sec: 55978.4, 300 sec: 55816.6). 
Total num frames: 10250993664. Throughput: 0: 55686.4. Samples: 741350680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:51:42,178][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 13:51:44,402][57339] Updated weights for policy 0, policy_version 625678 (0.0024) [2024-04-28 13:51:46,566][57339] Updated weights for policy 0, policy_version 625688 (0.0034) [2024-04-28 13:51:47,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10251288576. Throughput: 0: 55647.1. Samples: 741683660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:51:47,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 13:51:50,156][57339] Updated weights for policy 0, policy_version 625698 (0.0029) [2024-04-28 13:51:52,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10251567104. Throughput: 0: 56028.7. Samples: 741862720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:51:52,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:51:52,498][57339] Updated weights for policy 0, policy_version 625708 (0.0030) [2024-04-28 13:51:55,883][57339] Updated weights for policy 0, policy_version 625718 (0.0025) [2024-04-28 13:51:57,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10251845632. Throughput: 0: 56049.6. Samples: 742201960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:51:57,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 13:51:58,273][57339] Updated weights for policy 0, policy_version 625728 (0.0030) [2024-04-28 13:52:01,760][57339] Updated weights for policy 0, policy_version 625738 (0.0027) [2024-04-28 13:52:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10252107776. Throughput: 0: 56018.3. Samples: 742531840. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:02,169][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 13:52:04,241][57339] Updated weights for policy 0, policy_version 625748 (0.0029) [2024-04-28 13:52:07,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10252386304. Throughput: 0: 55880.8. Samples: 742693160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:07,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 13:52:07,740][57339] Updated weights for policy 0, policy_version 625758 (0.0029) [2024-04-28 13:52:09,929][57339] Updated weights for policy 0, policy_version 625768 (0.0032) [2024-04-28 13:52:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10252664832. Throughput: 0: 55849.3. Samples: 743028580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:12,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 13:52:13,482][57339] Updated weights for policy 0, policy_version 625778 (0.0027) [2024-04-28 13:52:15,820][57339] Updated weights for policy 0, policy_version 625788 (0.0029) [2024-04-28 13:52:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10252943360. Throughput: 0: 55749.2. Samples: 743359740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:17,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 13:52:19,500][57339] Updated weights for policy 0, policy_version 625798 (0.0026) [2024-04-28 13:52:20,987][57319] Signal inference workers to stop experience collection... (10650 times) [2024-04-28 13:52:21,036][57339] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-04-28 13:52:21,043][57319] Signal inference workers to resume experience collection... 
(10650 times) [2024-04-28 13:52:21,052][57339] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-04-28 13:52:21,704][57339] Updated weights for policy 0, policy_version 625808 (0.0028) [2024-04-28 13:52:22,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10253254656. Throughput: 0: 55904.3. Samples: 743531460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:22,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 13:52:25,411][57339] Updated weights for policy 0, policy_version 625818 (0.0026) [2024-04-28 13:52:27,169][57108] Fps is (10 sec: 60620.1, 60 sec: 56524.6, 300 sec: 55983.3). Total num frames: 10253549568. Throughput: 0: 55885.5. Samples: 743865520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:27,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 13:52:27,471][57339] Updated weights for policy 0, policy_version 625828 (0.0028) [2024-04-28 13:52:31,321][57339] Updated weights for policy 0, policy_version 625838 (0.0030) [2024-04-28 13:52:32,173][57108] Fps is (10 sec: 54042.7, 60 sec: 56247.5, 300 sec: 55871.4). Total num frames: 10253795328. Throughput: 0: 55887.8. Samples: 744198860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:32,174][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 13:52:33,196][57339] Updated weights for policy 0, policy_version 625848 (0.0027) [2024-04-28 13:52:37,169][57108] Fps is (10 sec: 49152.1, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10254041088. Throughput: 0: 55521.7. Samples: 744361200. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:37,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 13:52:37,205][57339] Updated weights for policy 0, policy_version 625858 (0.0028) [2024-04-28 13:52:39,267][57339] Updated weights for policy 0, policy_version 625868 (0.0029) [2024-04-28 13:52:42,169][57108] Fps is (10 sec: 50813.3, 60 sec: 55159.7, 300 sec: 55650.1). Total num frames: 10254303232. Throughput: 0: 55309.5. Samples: 744690880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:42,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 13:52:43,073][57339] Updated weights for policy 0, policy_version 625878 (0.0032) [2024-04-28 13:52:45,408][57339] Updated weights for policy 0, policy_version 625888 (0.0025) [2024-04-28 13:52:47,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10254614528. Throughput: 0: 55335.6. Samples: 745021940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 13:52:47,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 13:52:49,014][57339] Updated weights for policy 0, policy_version 625898 (0.0032) [2024-04-28 13:52:51,367][57339] Updated weights for policy 0, policy_version 625908 (0.0027) [2024-04-28 13:52:52,169][57108] Fps is (10 sec: 60620.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10254909440. Throughput: 0: 55572.9. Samples: 745193940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:52:52,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 13:52:54,877][57339] Updated weights for policy 0, policy_version 625918 (0.0027) [2024-04-28 13:52:57,147][57339] Updated weights for policy 0, policy_version 625928 (0.0029) [2024-04-28 13:52:57,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10255204352. Throughput: 0: 55637.7. Samples: 745532280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:52:57,169][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 13:53:00,716][57339] Updated weights for policy 0, policy_version 625938 (0.0037) [2024-04-28 13:53:02,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10255482880. Throughput: 0: 55545.7. Samples: 745859300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:02,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 13:53:02,839][57339] Updated weights for policy 0, policy_version 625948 (0.0035) [2024-04-28 13:53:06,437][57339] Updated weights for policy 0, policy_version 625958 (0.0025) [2024-04-28 13:53:07,169][57108] Fps is (10 sec: 55706.2, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10255761408. Throughput: 0: 55729.7. Samples: 746039300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:07,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 13:53:08,659][57339] Updated weights for policy 0, policy_version 625968 (0.0029) [2024-04-28 13:53:12,169][57108] Fps is (10 sec: 50790.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10255990784. Throughput: 0: 55671.6. Samples: 746370740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:12,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 13:53:12,424][57339] Updated weights for policy 0, policy_version 625978 (0.0036) [2024-04-28 13:53:14,631][57339] Updated weights for policy 0, policy_version 625988 (0.0032) [2024-04-28 13:53:17,169][57108] Fps is (10 sec: 50790.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10256269312. Throughput: 0: 55810.8. Samples: 746710100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:17,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 13:53:18,252][57339] Updated weights for policy 0, policy_version 625998 (0.0029) [2024-04-28 13:53:20,577][57339] Updated weights for policy 0, policy_version 626008 (0.0025) [2024-04-28 13:53:22,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10256564224. Throughput: 0: 55530.8. Samples: 746860080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:22,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 13:53:24,028][57339] Updated weights for policy 0, policy_version 626018 (0.0031) [2024-04-28 13:53:25,801][57319] Signal inference workers to stop experience collection... (10700 times) [2024-04-28 13:53:25,829][57339] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-04-28 13:53:25,856][57319] Signal inference workers to resume experience collection... (10700 times) [2024-04-28 13:53:25,857][57339] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-04-28 13:53:26,482][57339] Updated weights for policy 0, policy_version 626028 (0.0026) [2024-04-28 13:53:27,169][57108] Fps is (10 sec: 58981.8, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10256859136. Throughput: 0: 55766.4. Samples: 747200380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:27,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 13:53:27,225][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000626030_10256875520.pth... [2024-04-28 13:53:27,271][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000625213_10243489792.pth [2024-04-28 13:53:29,944][57339] Updated weights for policy 0, policy_version 626038 (0.0032) [2024-04-28 13:53:32,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55709.7, 300 sec: 55761.1). Total num frames: 10257137664. Throughput: 0: 55871.1. 
Samples: 747536140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:32,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 13:53:32,594][57339] Updated weights for policy 0, policy_version 626048 (0.0024) [2024-04-28 13:53:35,798][57339] Updated weights for policy 0, policy_version 626058 (0.0033) [2024-04-28 13:53:37,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 10257432576. Throughput: 0: 56044.9. Samples: 747715960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:37,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 13:53:38,599][57339] Updated weights for policy 0, policy_version 626068 (0.0027) [2024-04-28 13:53:41,554][57339] Updated weights for policy 0, policy_version 626078 (0.0030) [2024-04-28 13:53:42,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56797.7, 300 sec: 55816.7). Total num frames: 10257711104. Throughput: 0: 55994.7. Samples: 748052040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:42,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 13:53:44,355][57339] Updated weights for policy 0, policy_version 626088 (0.0030) [2024-04-28 13:53:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10257973248. Throughput: 0: 56172.4. Samples: 748387060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:47,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 13:53:47,270][57339] Updated weights for policy 0, policy_version 626098 (0.0031) [2024-04-28 13:53:50,223][57339] Updated weights for policy 0, policy_version 626108 (0.0029) [2024-04-28 13:53:52,169][57108] Fps is (10 sec: 50791.1, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10258219008. Throughput: 0: 55708.9. Samples: 748546200. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:52,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 13:53:53,209][57339] Updated weights for policy 0, policy_version 626118 (0.0027) [2024-04-28 13:53:56,071][57339] Updated weights for policy 0, policy_version 626128 (0.0025) [2024-04-28 13:53:57,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10258513920. Throughput: 0: 55810.3. Samples: 748882200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:53:57,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 13:53:59,146][57339] Updated weights for policy 0, policy_version 626138 (0.0025) [2024-04-28 13:54:01,788][57339] Updated weights for policy 0, policy_version 626148 (0.0029) [2024-04-28 13:54:02,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10258808832. Throughput: 0: 55744.9. Samples: 749218620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:54:02,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 13:54:04,920][57339] Updated weights for policy 0, policy_version 626158 (0.0029) [2024-04-28 13:54:07,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55705.5, 300 sec: 55761.2). Total num frames: 10259103744. Throughput: 0: 56365.2. Samples: 749396520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:54:07,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 13:54:07,502][57339] Updated weights for policy 0, policy_version 626168 (0.0025) [2024-04-28 13:54:10,729][57339] Updated weights for policy 0, policy_version 626178 (0.0029) [2024-04-28 13:54:12,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56797.9, 300 sec: 55816.7). Total num frames: 10259398656. Throughput: 0: 56180.2. Samples: 749728480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:54:12,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:54:13,451][57339] Updated weights for policy 0, policy_version 626188 (0.0028) [2024-04-28 13:54:16,477][57339] Updated weights for policy 0, policy_version 626198 (0.0031) [2024-04-28 13:54:17,169][57108] Fps is (10 sec: 57345.1, 60 sec: 56798.0, 300 sec: 55872.2). Total num frames: 10259677184. Throughput: 0: 56140.1. Samples: 750062440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:54:17,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 13:54:19,515][57339] Updated weights for policy 0, policy_version 626208 (0.0034) [2024-04-28 13:54:22,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10259922944. Throughput: 0: 55897.7. Samples: 750231360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:54:22,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 13:54:22,329][57319] Signal inference workers to stop experience collection... (10750 times) [2024-04-28 13:54:22,329][57319] Signal inference workers to resume experience collection... (10750 times) [2024-04-28 13:54:22,347][57339] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-04-28 13:54:22,347][57339] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-04-28 13:54:22,455][57339] Updated weights for policy 0, policy_version 626218 (0.0031) [2024-04-28 13:54:25,451][57339] Updated weights for policy 0, policy_version 626228 (0.0030) [2024-04-28 13:54:27,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10260201472. Throughput: 0: 55996.0. Samples: 750571860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 13:54:27,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 13:54:28,336][57339] Updated weights for policy 0, policy_version 626238 (0.0032) [2024-04-28 13:54:31,351][57339] Updated weights for policy 0, policy_version 626248 (0.0033) [2024-04-28 13:54:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10260480000. Throughput: 0: 56065.8. Samples: 750910020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:54:32,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 13:54:34,124][57339] Updated weights for policy 0, policy_version 626258 (0.0028) [2024-04-28 13:54:37,130][57339] Updated weights for policy 0, policy_version 626268 (0.0033) [2024-04-28 13:54:37,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10260774912. Throughput: 0: 56104.8. Samples: 751070920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:54:37,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 13:54:39,841][57339] Updated weights for policy 0, policy_version 626278 (0.0028) [2024-04-28 13:54:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10261053440. Throughput: 0: 56039.6. Samples: 751403980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:54:42,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 13:54:42,860][57339] Updated weights for policy 0, policy_version 626288 (0.0026) [2024-04-28 13:54:45,878][57339] Updated weights for policy 0, policy_version 626298 (0.0026) [2024-04-28 13:54:47,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10261364736. Throughput: 0: 56067.1. Samples: 751741640. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:54:47,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 13:54:48,676][57339] Updated weights for policy 0, policy_version 626308 (0.0029) [2024-04-28 13:54:51,765][57339] Updated weights for policy 0, policy_version 626318 (0.0028) [2024-04-28 13:54:52,169][57108] Fps is (10 sec: 58981.5, 60 sec: 57070.8, 300 sec: 55816.7). Total num frames: 10261643264. Throughput: 0: 56154.2. Samples: 751923460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:54:52,170][57108] Avg episode reward: [(0, '0.494')] [2024-04-28 13:54:54,560][57339] Updated weights for policy 0, policy_version 626328 (0.0021) [2024-04-28 13:54:57,169][57108] Fps is (10 sec: 52428.8, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 10261889024. Throughput: 0: 56245.7. Samples: 752259540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:54:57,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 13:54:57,555][57339] Updated weights for policy 0, policy_version 626338 (0.0040) [2024-04-28 13:55:00,447][57339] Updated weights for policy 0, policy_version 626348 (0.0028) [2024-04-28 13:55:02,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10262167552. Throughput: 0: 56197.6. Samples: 752591340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:02,169][57108] Avg episode reward: [(0, '0.715')] [2024-04-28 13:55:03,314][57339] Updated weights for policy 0, policy_version 626358 (0.0031) [2024-04-28 13:55:06,286][57339] Updated weights for policy 0, policy_version 626368 (0.0031) [2024-04-28 13:55:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10262446080. Throughput: 0: 55941.7. Samples: 752748740. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:07,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:55:09,061][57339] Updated weights for policy 0, policy_version 626378 (0.0032) [2024-04-28 13:55:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10262724608. Throughput: 0: 55982.6. Samples: 753091080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:12,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 13:55:12,385][57339] Updated weights for policy 0, policy_version 626388 (0.0031) [2024-04-28 13:55:14,821][57339] Updated weights for policy 0, policy_version 626398 (0.0029) [2024-04-28 13:55:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10263003136. Throughput: 0: 55955.5. Samples: 753428020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:17,170][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 13:55:18,236][57339] Updated weights for policy 0, policy_version 626408 (0.0036) [2024-04-28 13:55:20,673][57339] Updated weights for policy 0, policy_version 626418 (0.0033) [2024-04-28 13:55:22,169][57108] Fps is (10 sec: 57344.9, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10263298048. Throughput: 0: 56208.5. Samples: 753600300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:22,169][57108] Avg episode reward: [(0, '0.700')] [2024-04-28 13:55:24,122][57339] Updated weights for policy 0, policy_version 626428 (0.0030) [2024-04-28 13:55:26,181][57319] Signal inference workers to stop experience collection... (10800 times) [2024-04-28 13:55:26,186][57319] Signal inference workers to resume experience collection... 
(10800 times) [2024-04-28 13:55:26,193][57339] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-04-28 13:55:26,223][57339] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-04-28 13:55:26,555][57339] Updated weights for policy 0, policy_version 626438 (0.0028) [2024-04-28 13:55:27,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 10263592960. Throughput: 0: 56128.1. Samples: 753929760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:27,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 13:55:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000626440_10263592960.pth... [2024-04-28 13:55:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000625621_10250174464.pth [2024-04-28 13:55:29,968][57339] Updated weights for policy 0, policy_version 626448 (0.0030) [2024-04-28 13:55:32,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10263855104. Throughput: 0: 56089.2. Samples: 754265660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:32,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 13:55:32,595][57339] Updated weights for policy 0, policy_version 626458 (0.0028) [2024-04-28 13:55:35,685][57339] Updated weights for policy 0, policy_version 626468 (0.0025) [2024-04-28 13:55:37,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10264133632. Throughput: 0: 55697.4. Samples: 754429840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:37,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 13:55:38,432][57339] Updated weights for policy 0, policy_version 626478 (0.0028) [2024-04-28 13:55:41,449][57339] Updated weights for policy 0, policy_version 626488 (0.0025) [2024-04-28 13:55:42,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55705.5, 300 sec: 55816.7). 
Total num frames: 10264395776. Throughput: 0: 55713.8. Samples: 754766660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:42,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 13:55:44,185][57339] Updated weights for policy 0, policy_version 626498 (0.0025) [2024-04-28 13:55:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10264690688. Throughput: 0: 55792.2. Samples: 755101980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:47,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 13:55:47,222][57339] Updated weights for policy 0, policy_version 626508 (0.0030) [2024-04-28 13:55:49,992][57339] Updated weights for policy 0, policy_version 626518 (0.0029) [2024-04-28 13:55:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10264952832. Throughput: 0: 55945.0. Samples: 755266260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:52,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 13:55:53,226][57339] Updated weights for policy 0, policy_version 626528 (0.0026) [2024-04-28 13:55:55,896][57339] Updated weights for policy 0, policy_version 626538 (0.0032) [2024-04-28 13:55:57,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10265247744. Throughput: 0: 55902.6. Samples: 755606700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:55:57,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 13:55:59,155][57339] Updated weights for policy 0, policy_version 626548 (0.0034) [2024-04-28 13:56:01,830][57339] Updated weights for policy 0, policy_version 626558 (0.0025) [2024-04-28 13:56:02,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10265542656. Throughput: 0: 55886.3. Samples: 755942900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 13:56:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 13:56:04,912][57339] Updated weights for policy 0, policy_version 626568 (0.0033) [2024-04-28 13:56:07,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55927.7). Total num frames: 10265821184. Throughput: 0: 55823.1. Samples: 756112340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:07,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 13:56:07,613][57339] Updated weights for policy 0, policy_version 626578 (0.0026) [2024-04-28 13:56:10,805][57339] Updated weights for policy 0, policy_version 626588 (0.0031) [2024-04-28 13:56:12,169][57108] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 10266099712. Throughput: 0: 56029.3. Samples: 756451080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:12,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:56:13,439][57339] Updated weights for policy 0, policy_version 626598 (0.0030) [2024-04-28 13:56:16,508][57339] Updated weights for policy 0, policy_version 626608 (0.0028) [2024-04-28 13:56:17,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10266361856. Throughput: 0: 55995.7. Samples: 756785460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:17,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 13:56:19,476][57339] Updated weights for policy 0, policy_version 626618 (0.0032) [2024-04-28 13:56:22,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10266656768. Throughput: 0: 56088.4. Samples: 756953820. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:22,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 13:56:22,262][57339] Updated weights for policy 0, policy_version 626628 (0.0032) [2024-04-28 13:56:25,467][57339] Updated weights for policy 0, policy_version 626638 (0.0027) [2024-04-28 13:56:27,169][57108] Fps is (10 sec: 58981.8, 60 sec: 55978.8, 300 sec: 56038.8). Total num frames: 10266951680. Throughput: 0: 56123.9. Samples: 757292240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:27,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 13:56:28,036][57339] Updated weights for policy 0, policy_version 626648 (0.0027) [2024-04-28 13:56:31,323][57339] Updated weights for policy 0, policy_version 626658 (0.0031) [2024-04-28 13:56:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10267213824. Throughput: 0: 56127.5. Samples: 757627720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:32,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 13:56:33,306][57319] Signal inference workers to stop experience collection... (10850 times) [2024-04-28 13:56:33,355][57339] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-04-28 13:56:33,355][57319] Signal inference workers to resume experience collection... (10850 times) [2024-04-28 13:56:33,368][57339] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-04-28 13:56:33,779][57339] Updated weights for policy 0, policy_version 626668 (0.0027) [2024-04-28 13:56:36,982][57339] Updated weights for policy 0, policy_version 626678 (0.0028) [2024-04-28 13:56:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10267492352. Throughput: 0: 56263.0. Samples: 757798100. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:37,175][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 13:56:39,970][57339] Updated weights for policy 0, policy_version 626688 (0.0025) [2024-04-28 13:56:42,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10267770880. Throughput: 0: 56103.7. Samples: 758131360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:42,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 13:56:42,706][57339] Updated weights for policy 0, policy_version 626698 (0.0026) [2024-04-28 13:56:45,953][57339] Updated weights for policy 0, policy_version 626708 (0.0029) [2024-04-28 13:56:47,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10268065792. Throughput: 0: 56006.3. Samples: 758463180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:47,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:56:48,668][57339] Updated weights for policy 0, policy_version 626718 (0.0031) [2024-04-28 13:56:51,926][57339] Updated weights for policy 0, policy_version 626728 (0.0029) [2024-04-28 13:56:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10268327936. Throughput: 0: 56050.3. Samples: 758634600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:52,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 13:56:54,655][57339] Updated weights for policy 0, policy_version 626738 (0.0029) [2024-04-28 13:56:57,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 10268606464. Throughput: 0: 55833.5. Samples: 758963580. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:56:57,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 13:56:57,699][57339] Updated weights for policy 0, policy_version 626748 (0.0027) [2024-04-28 13:57:00,402][57339] Updated weights for policy 0, policy_version 626758 (0.0029) [2024-04-28 13:57:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10268884992. Throughput: 0: 55886.2. Samples: 759300340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:57:02,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 13:57:03,416][57339] Updated weights for policy 0, policy_version 626768 (0.0034) [2024-04-28 13:57:06,418][57339] Updated weights for policy 0, policy_version 626778 (0.0026) [2024-04-28 13:57:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 10269163520. Throughput: 0: 55879.0. Samples: 759468380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:57:07,179][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 13:57:09,356][57339] Updated weights for policy 0, policy_version 626788 (0.0025) [2024-04-28 13:57:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.8, 300 sec: 55927.7). Total num frames: 10269442048. Throughput: 0: 55698.3. Samples: 759798660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:57:12,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 13:57:12,393][57339] Updated weights for policy 0, policy_version 626798 (0.0025) [2024-04-28 13:57:15,118][57339] Updated weights for policy 0, policy_version 626808 (0.0032) [2024-04-28 13:57:17,169][57108] Fps is (10 sec: 54068.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10269704192. Throughput: 0: 55712.6. Samples: 760134780. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:57:17,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 13:57:18,283][57339] Updated weights for policy 0, policy_version 626818 (0.0027) [2024-04-28 13:57:20,925][57339] Updated weights for policy 0, policy_version 626828 (0.0026) [2024-04-28 13:57:22,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10270031872. Throughput: 0: 55762.7. Samples: 760307420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:57:22,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 13:57:24,195][57339] Updated weights for policy 0, policy_version 626838 (0.0023) [2024-04-28 13:57:26,868][57339] Updated weights for policy 0, policy_version 626848 (0.0028) [2024-04-28 13:57:27,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55705.6, 300 sec: 55928.6). Total num frames: 10270294016. Throughput: 0: 55731.9. Samples: 760639300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:57:27,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 13:57:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000626849_10270294016.pth... [2024-04-28 13:57:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000626030_10256875520.pth [2024-04-28 13:57:29,975][57339] Updated weights for policy 0, policy_version 626858 (0.0029) [2024-04-28 13:57:32,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 10270556160. Throughput: 0: 55739.6. Samples: 760971460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:57:32,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:57:32,806][57339] Updated weights for policy 0, policy_version 626868 (0.0024) [2024-04-28 13:57:35,828][57339] Updated weights for policy 0, policy_version 626878 (0.0034) [2024-04-28 13:57:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 56038.8). 
Total num frames: 10270834688. Throughput: 0: 55608.8. Samples: 761137000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-04-28 13:57:37,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 13:57:38,739][57339] Updated weights for policy 0, policy_version 626888 (0.0031) [2024-04-28 13:57:41,310][57319] Signal inference workers to stop experience collection... (10900 times) [2024-04-28 13:57:41,339][57339] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-04-28 13:57:41,360][57319] Signal inference workers to resume experience collection... (10900 times) [2024-04-28 13:57:41,360][57339] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-04-28 13:57:41,687][57339] Updated weights for policy 0, policy_version 626898 (0.0027) [2024-04-28 13:57:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10271129600. Throughput: 0: 55783.3. Samples: 761473820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:57:42,169][57108] Avg episode reward: [(0, '0.446')] [2024-04-28 13:57:44,457][57339] Updated weights for policy 0, policy_version 626908 (0.0026) [2024-04-28 13:57:47,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 10271391744. Throughput: 0: 55768.7. Samples: 761809940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:57:47,170][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 13:57:47,515][57339] Updated weights for policy 0, policy_version 626918 (0.0031) [2024-04-28 13:57:50,201][57339] Updated weights for policy 0, policy_version 626928 (0.0029) [2024-04-28 13:57:52,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10271670272. Throughput: 0: 55627.2. Samples: 761971600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:57:52,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 13:57:53,389][57339] Updated weights for policy 0, policy_version 626938 (0.0031) [2024-04-28 13:57:56,110][57339] Updated weights for policy 0, policy_version 626948 (0.0029) [2024-04-28 13:57:57,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10271965184. Throughput: 0: 55776.9. Samples: 762308620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:57:57,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 13:57:59,128][57339] Updated weights for policy 0, policy_version 626958 (0.0030) [2024-04-28 13:58:02,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10272227328. Throughput: 0: 55664.9. Samples: 762639700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:02,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 13:58:02,213][57339] Updated weights for policy 0, policy_version 626968 (0.0033) [2024-04-28 13:58:04,925][57339] Updated weights for policy 0, policy_version 626978 (0.0026) [2024-04-28 13:58:07,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 56038.8). Total num frames: 10272522240. Throughput: 0: 55659.1. Samples: 762812080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:07,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 13:58:07,957][57339] Updated weights for policy 0, policy_version 626988 (0.0029) [2024-04-28 13:58:10,612][57339] Updated weights for policy 0, policy_version 626998 (0.0027) [2024-04-28 13:58:12,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 10272784384. Throughput: 0: 55687.5. Samples: 763145240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:12,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 13:58:13,756][57339] Updated weights for policy 0, policy_version 627008 (0.0033) [2024-04-28 13:58:16,382][57339] Updated weights for policy 0, policy_version 627018 (0.0031) [2024-04-28 13:58:17,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56524.6, 300 sec: 56038.8). Total num frames: 10273095680. Throughput: 0: 55732.6. Samples: 763479440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:17,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 13:58:19,583][57339] Updated weights for policy 0, policy_version 627028 (0.0031) [2024-04-28 13:58:22,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 10273357824. Throughput: 0: 55808.1. Samples: 763648360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:22,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 13:58:22,318][57339] Updated weights for policy 0, policy_version 627038 (0.0027) [2024-04-28 13:58:25,520][57339] Updated weights for policy 0, policy_version 627048 (0.0031) [2024-04-28 13:58:27,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10273619968. Throughput: 0: 55701.7. Samples: 763980400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:27,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 13:58:28,329][57339] Updated weights for policy 0, policy_version 627058 (0.0025) [2024-04-28 13:58:31,362][57339] Updated weights for policy 0, policy_version 627068 (0.0029) [2024-04-28 13:58:32,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10273898496. Throughput: 0: 55752.7. Samples: 764318800. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:32,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 13:58:34,386][57339] Updated weights for policy 0, policy_version 627078 (0.0028) [2024-04-28 13:58:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10274193408. Throughput: 0: 55780.0. Samples: 764481700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:37,178][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 13:58:37,327][57339] Updated weights for policy 0, policy_version 627088 (0.0024) [2024-04-28 13:58:40,090][57339] Updated weights for policy 0, policy_version 627098 (0.0028) [2024-04-28 13:58:42,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 10274455552. Throughput: 0: 55740.7. Samples: 764816960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:42,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 13:58:43,161][57339] Updated weights for policy 0, policy_version 627108 (0.0028) [2024-04-28 13:58:45,936][57339] Updated weights for policy 0, policy_version 627118 (0.0032) [2024-04-28 13:58:47,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 56038.8). Total num frames: 10274750464. Throughput: 0: 55796.8. Samples: 765150560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:47,169][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 13:58:48,991][57339] Updated weights for policy 0, policy_version 627128 (0.0031) [2024-04-28 13:58:51,383][57319] Signal inference workers to stop experience collection... (10950 times) [2024-04-28 13:58:51,387][57319] Signal inference workers to resume experience collection... 
(10950 times) [2024-04-28 13:58:51,404][57339] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-04-28 13:58:51,404][57339] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-04-28 13:58:51,779][57339] Updated weights for policy 0, policy_version 627138 (0.0020) [2024-04-28 13:58:52,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10275028992. Throughput: 0: 55692.8. Samples: 765318260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:52,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 13:58:54,999][57339] Updated weights for policy 0, policy_version 627148 (0.0029) [2024-04-28 13:58:57,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10275323904. Throughput: 0: 55911.7. Samples: 765661260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:58:57,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 13:58:57,489][57339] Updated weights for policy 0, policy_version 627158 (0.0031) [2024-04-28 13:59:00,914][57339] Updated weights for policy 0, policy_version 627168 (0.0030) [2024-04-28 13:59:02,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10275569664. Throughput: 0: 55992.2. Samples: 765999080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:59:02,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 13:59:03,255][57339] Updated weights for policy 0, policy_version 627178 (0.0040) [2024-04-28 13:59:06,694][57339] Updated weights for policy 0, policy_version 627188 (0.0027) [2024-04-28 13:59:07,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 10275864576. Throughput: 0: 55831.3. Samples: 766160780. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:59:07,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 13:59:09,193][57339] Updated weights for policy 0, policy_version 627198 (0.0025) [2024-04-28 13:59:12,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10276143104. Throughput: 0: 55975.6. Samples: 766499300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-04-28 13:59:12,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 13:59:12,524][57339] Updated weights for policy 0, policy_version 627208 (0.0033) [2024-04-28 13:59:15,399][57339] Updated weights for policy 0, policy_version 627218 (0.0028) [2024-04-28 13:59:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55927.7). Total num frames: 10276421632. Throughput: 0: 56033.9. Samples: 766840340. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 13:59:17,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 13:59:18,393][57339] Updated weights for policy 0, policy_version 627228 (0.0029) [2024-04-28 13:59:21,133][57339] Updated weights for policy 0, policy_version 627238 (0.0031) [2024-04-28 13:59:22,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10276716544. Throughput: 0: 56187.3. Samples: 767010120. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 13:59:22,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 13:59:24,424][57339] Updated weights for policy 0, policy_version 627248 (0.0030) [2024-04-28 13:59:26,821][57339] Updated weights for policy 0, policy_version 627258 (0.0028) [2024-04-28 13:59:27,169][57108] Fps is (10 sec: 57345.3, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10276995072. Throughput: 0: 56132.2. Samples: 767342900. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 13:59:27,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 13:59:27,199][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000627259_10277011456.pth... [2024-04-28 13:59:27,245][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000626440_10263592960.pth [2024-04-28 13:59:30,141][57339] Updated weights for policy 0, policy_version 627268 (0.0029) [2024-04-28 13:59:32,169][57108] Fps is (10 sec: 57343.0, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 10277289984. Throughput: 0: 56155.8. Samples: 767677580. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 13:59:32,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 13:59:32,628][57339] Updated weights for policy 0, policy_version 627278 (0.0028) [2024-04-28 13:59:36,107][57339] Updated weights for policy 0, policy_version 627288 (0.0026) [2024-04-28 13:59:37,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 10277552128. Throughput: 0: 56293.9. Samples: 767851480. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 13:59:37,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 13:59:38,416][57339] Updated weights for policy 0, policy_version 627298 (0.0029) [2024-04-28 13:59:41,956][57339] Updated weights for policy 0, policy_version 627308 (0.0035) [2024-04-28 13:59:42,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10277814272. Throughput: 0: 56134.0. Samples: 768187300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 13:59:42,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 13:59:44,109][57339] Updated weights for policy 0, policy_version 627318 (0.0028) [2024-04-28 13:59:47,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10278076416. Throughput: 0: 56059.4. Samples: 768521760. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 13:59:47,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 13:59:47,984][57339] Updated weights for policy 0, policy_version 627328 (0.0030) [2024-04-28 13:59:48,005][57319] Signal inference workers to stop experience collection... (11000 times) [2024-04-28 13:59:48,005][57319] Signal inference workers to resume experience collection... (11000 times) [2024-04-28 13:59:48,014][57339] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-04-28 13:59:48,024][57339] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-04-28 13:59:49,946][57339] Updated weights for policy 0, policy_version 627338 (0.0029) [2024-04-28 13:59:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10278371328. Throughput: 0: 55946.7. Samples: 768678380. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 13:59:52,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 13:59:53,807][57339] Updated weights for policy 0, policy_version 627348 (0.0037) [2024-04-28 13:59:56,022][57339] Updated weights for policy 0, policy_version 627358 (0.0027) [2024-04-28 13:59:57,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10278649856. Throughput: 0: 55760.9. Samples: 769008540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 13:59:57,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 13:59:59,678][57339] Updated weights for policy 0, policy_version 627368 (0.0029) [2024-04-28 14:00:02,135][57339] Updated weights for policy 0, policy_version 627378 (0.0029) [2024-04-28 14:00:02,169][57108] Fps is (10 sec: 58983.2, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 10278961152. Throughput: 0: 55667.3. Samples: 769345360. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:02,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 14:00:05,594][57339] Updated weights for policy 0, policy_version 627388 (0.0028) [2024-04-28 14:00:07,169][57108] Fps is (10 sec: 60620.4, 60 sec: 56524.9, 300 sec: 56038.8). Total num frames: 10279256064. Throughput: 0: 55859.4. Samples: 769523800. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:07,178][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 14:00:08,008][57339] Updated weights for policy 0, policy_version 627398 (0.0032) [2024-04-28 14:00:11,333][57339] Updated weights for policy 0, policy_version 627408 (0.0028) [2024-04-28 14:00:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10279501824. Throughput: 0: 55891.5. Samples: 769858020. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:12,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 14:00:13,911][57339] Updated weights for policy 0, policy_version 627418 (0.0030) [2024-04-28 14:00:17,112][57339] Updated weights for policy 0, policy_version 627428 (0.0032) [2024-04-28 14:00:17,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10279780352. Throughput: 0: 55645.3. Samples: 770181620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:17,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 14:00:19,957][57339] Updated weights for policy 0, policy_version 627438 (0.0029) [2024-04-28 14:00:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55761.2). Total num frames: 10280042496. Throughput: 0: 55404.8. Samples: 770344700. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:22,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 14:00:23,084][57339] Updated weights for policy 0, policy_version 627448 (0.0025) [2024-04-28 14:00:25,714][57339] Updated weights for policy 0, policy_version 627458 (0.0027) [2024-04-28 14:00:27,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10280321024. Throughput: 0: 55432.0. Samples: 770681740. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:27,170][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 14:00:28,988][57339] Updated weights for policy 0, policy_version 627468 (0.0033) [2024-04-28 14:00:31,475][57339] Updated weights for policy 0, policy_version 627478 (0.0027) [2024-04-28 14:00:32,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 10280599552. Throughput: 0: 55392.2. Samples: 771014400. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:32,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 14:00:34,931][57339] Updated weights for policy 0, policy_version 627488 (0.0025) [2024-04-28 14:00:37,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 10280910848. Throughput: 0: 55799.6. Samples: 771189360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:37,170][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 14:00:37,324][57339] Updated weights for policy 0, policy_version 627498 (0.0030) [2024-04-28 14:00:39,919][57319] Signal inference workers to stop experience collection... (11050 times) [2024-04-28 14:00:39,920][57319] Signal inference workers to resume experience collection... 
(11050 times) [2024-04-28 14:00:39,942][57339] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-04-28 14:00:39,942][57339] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-04-28 14:00:40,931][57339] Updated weights for policy 0, policy_version 627508 (0.0029) [2024-04-28 14:00:42,169][57108] Fps is (10 sec: 60620.2, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10281205760. Throughput: 0: 55868.0. Samples: 771522600. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:42,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 14:00:43,407][57339] Updated weights for policy 0, policy_version 627518 (0.0027) [2024-04-28 14:00:46,643][57339] Updated weights for policy 0, policy_version 627528 (0.0031) [2024-04-28 14:00:47,169][57108] Fps is (10 sec: 50791.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10281418752. Throughput: 0: 55800.1. Samples: 771856360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-04-28 14:00:47,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 14:00:49,390][57339] Updated weights for policy 0, policy_version 627538 (0.0031) [2024-04-28 14:00:52,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10281730048. Throughput: 0: 55459.4. Samples: 772019480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:00:52,170][57108] Avg episode reward: [(0, '0.702')] [2024-04-28 14:00:52,461][57339] Updated weights for policy 0, policy_version 627548 (0.0027) [2024-04-28 14:00:55,840][57339] Updated weights for policy 0, policy_version 627558 (0.0039) [2024-04-28 14:00:57,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10281992192. Throughput: 0: 55568.8. Samples: 772358620. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:00:57,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 14:00:58,407][57339] Updated weights for policy 0, policy_version 627568 (0.0027) [2024-04-28 14:01:01,805][57339] Updated weights for policy 0, policy_version 627578 (0.0031) [2024-04-28 14:01:02,169][57108] Fps is (10 sec: 52429.4, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10282254336. Throughput: 0: 55797.9. Samples: 772692520. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:02,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 14:01:04,413][57339] Updated weights for policy 0, policy_version 627588 (0.0029) [2024-04-28 14:01:07,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54886.4, 300 sec: 55761.2). Total num frames: 10282549248. Throughput: 0: 55773.4. Samples: 772854500. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:07,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 14:01:07,570][57339] Updated weights for policy 0, policy_version 627598 (0.0028) [2024-04-28 14:01:10,256][57339] Updated weights for policy 0, policy_version 627608 (0.0027) [2024-04-28 14:01:12,169][57108] Fps is (10 sec: 58981.2, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 10282844160. Throughput: 0: 55729.2. Samples: 773189560. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:12,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 14:01:13,331][57339] Updated weights for policy 0, policy_version 627618 (0.0026) [2024-04-28 14:01:16,059][57339] Updated weights for policy 0, policy_version 627628 (0.0028) [2024-04-28 14:01:17,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10283139072. Throughput: 0: 55725.8. Samples: 773522060. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:17,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 14:01:19,344][57339] Updated weights for policy 0, policy_version 627638 (0.0029) [2024-04-28 14:01:22,007][57339] Updated weights for policy 0, policy_version 627648 (0.0033) [2024-04-28 14:01:22,169][57108] Fps is (10 sec: 54068.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10283384832. Throughput: 0: 55608.6. Samples: 773691740. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:22,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 14:01:23,091][57319] Signal inference workers to stop experience collection... (11100 times) [2024-04-28 14:01:23,091][57319] Signal inference workers to resume experience collection... (11100 times) [2024-04-28 14:01:23,103][57339] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-04-28 14:01:23,103][57339] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-04-28 14:01:25,337][57339] Updated weights for policy 0, policy_version 627658 (0.0025) [2024-04-28 14:01:27,169][57108] Fps is (10 sec: 52427.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10283663360. Throughput: 0: 55585.7. Samples: 774023960. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:27,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 14:01:27,260][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000627666_10283679744.pth... [2024-04-28 14:01:27,306][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000626849_10270294016.pth [2024-04-28 14:01:27,890][57339] Updated weights for policy 0, policy_version 627668 (0.0028) [2024-04-28 14:01:31,036][57339] Updated weights for policy 0, policy_version 627678 (0.0025) [2024-04-28 14:01:32,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10283925504. Throughput: 0: 55597.8. 
Samples: 774358260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:32,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 14:01:33,652][57339] Updated weights for policy 0, policy_version 627688 (0.0036) [2024-04-28 14:01:36,802][57339] Updated weights for policy 0, policy_version 627698 (0.0026) [2024-04-28 14:01:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10284204032. Throughput: 0: 55389.4. Samples: 774512000. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:37,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:01:39,456][57339] Updated weights for policy 0, policy_version 627708 (0.0026) [2024-04-28 14:01:42,169][57108] Fps is (10 sec: 55705.1, 60 sec: 54613.3, 300 sec: 55650.0). Total num frames: 10284482560. Throughput: 0: 55322.8. Samples: 774848140. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:42,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 14:01:42,942][57339] Updated weights for policy 0, policy_version 627718 (0.0032) [2024-04-28 14:01:45,203][57339] Updated weights for policy 0, policy_version 627728 (0.0025) [2024-04-28 14:01:47,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 10284793856. Throughput: 0: 55248.0. Samples: 775178680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:47,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 14:01:48,911][57339] Updated weights for policy 0, policy_version 627738 (0.0029) [2024-04-28 14:01:51,336][57339] Updated weights for policy 0, policy_version 627748 (0.0026) [2024-04-28 14:01:52,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10285072384. Throughput: 0: 55675.6. Samples: 775359900. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:52,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 14:01:54,737][57339] Updated weights for policy 0, policy_version 627758 (0.0030) [2024-04-28 14:01:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10285334528. Throughput: 0: 55597.6. Samples: 775691440. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:01:57,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 14:01:57,239][57339] Updated weights for policy 0, policy_version 627768 (0.0031) [2024-04-28 14:02:00,601][57339] Updated weights for policy 0, policy_version 627778 (0.0026) [2024-04-28 14:02:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10285613056. Throughput: 0: 55616.9. Samples: 776024820. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:02:02,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 14:02:03,062][57339] Updated weights for policy 0, policy_version 627788 (0.0030) [2024-04-28 14:02:06,500][57339] Updated weights for policy 0, policy_version 627798 (0.0025) [2024-04-28 14:02:07,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10285907968. Throughput: 0: 55473.8. Samples: 776188060. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:02:07,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 14:02:09,133][57339] Updated weights for policy 0, policy_version 627808 (0.0031) [2024-04-28 14:02:12,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54886.7, 300 sec: 55705.6). Total num frames: 10286137344. Throughput: 0: 55472.7. Samples: 776520220. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:02:12,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:02:12,433][57339] Updated weights for policy 0, policy_version 627818 (0.0032) [2024-04-28 14:02:14,911][57339] Updated weights for policy 0, policy_version 627828 (0.0030) [2024-04-28 14:02:17,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 10286432256. Throughput: 0: 55494.6. Samples: 776855520. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:02:17,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 14:02:18,271][57339] Updated weights for policy 0, policy_version 627838 (0.0030) [2024-04-28 14:02:20,676][57339] Updated weights for policy 0, policy_version 627848 (0.0029) [2024-04-28 14:02:22,169][57108] Fps is (10 sec: 60620.4, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10286743552. Throughput: 0: 55785.9. Samples: 777022360. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:02:22,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 14:02:24,098][57339] Updated weights for policy 0, policy_version 627858 (0.0032) [2024-04-28 14:02:26,712][57339] Updated weights for policy 0, policy_version 627868 (0.0031) [2024-04-28 14:02:27,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10287022080. Throughput: 0: 55735.6. Samples: 777356240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:02:27,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 14:02:29,724][57319] Signal inference workers to stop experience collection... (11150 times) [2024-04-28 14:02:29,757][57339] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-04-28 14:02:29,783][57319] Signal inference workers to resume experience collection... 
(11150 times) [2024-04-28 14:02:29,783][57339] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-04-28 14:02:29,898][57339] Updated weights for policy 0, policy_version 627878 (0.0027) [2024-04-28 14:02:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10287284224. Throughput: 0: 55743.7. Samples: 777687140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:02:32,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 14:02:32,645][57339] Updated weights for policy 0, policy_version 627888 (0.0027) [2024-04-28 14:02:35,827][57339] Updated weights for policy 0, policy_version 627898 (0.0026) [2024-04-28 14:02:37,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10287579136. Throughput: 0: 55544.4. Samples: 777859400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:02:37,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 14:02:38,427][57339] Updated weights for policy 0, policy_version 627908 (0.0029) [2024-04-28 14:02:41,550][57339] Updated weights for policy 0, policy_version 627918 (0.0025) [2024-04-28 14:02:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10287824896. Throughput: 0: 55730.2. Samples: 778199300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:02:42,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 14:02:44,251][57339] Updated weights for policy 0, policy_version 627928 (0.0027) [2024-04-28 14:02:47,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10288119808. Throughput: 0: 55807.5. Samples: 778536160. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:02:47,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 14:02:47,248][57339] Updated weights for policy 0, policy_version 627938 (0.0025) [2024-04-28 14:02:50,144][57339] Updated weights for policy 0, policy_version 627948 (0.0030) [2024-04-28 14:02:52,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10288398336. Throughput: 0: 55839.8. Samples: 778700860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:02:52,170][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 14:02:53,088][57339] Updated weights for policy 0, policy_version 627958 (0.0030) [2024-04-28 14:02:56,000][57339] Updated weights for policy 0, policy_version 627968 (0.0029) [2024-04-28 14:02:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10288676864. Throughput: 0: 55981.3. Samples: 779039380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:02:57,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 14:02:58,907][57339] Updated weights for policy 0, policy_version 627978 (0.0029) [2024-04-28 14:03:01,775][57339] Updated weights for policy 0, policy_version 627988 (0.0036) [2024-04-28 14:03:02,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10288971776. Throughput: 0: 55970.1. Samples: 779374180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:02,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 14:03:04,751][57339] Updated weights for policy 0, policy_version 627998 (0.0035) [2024-04-28 14:03:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10289250304. Throughput: 0: 55984.5. Samples: 779541660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:07,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 14:03:07,637][57339] Updated weights for policy 0, policy_version 628008 (0.0032) [2024-04-28 14:03:10,521][57339] Updated weights for policy 0, policy_version 628018 (0.0030) [2024-04-28 14:03:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56524.7, 300 sec: 55705.6). Total num frames: 10289528832. Throughput: 0: 55846.7. Samples: 779869340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:12,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 14:03:13,516][57339] Updated weights for policy 0, policy_version 628028 (0.0024) [2024-04-28 14:03:16,519][57339] Updated weights for policy 0, policy_version 628038 (0.0027) [2024-04-28 14:03:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10289807360. Throughput: 0: 55979.9. Samples: 780206240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:17,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 14:03:19,414][57339] Updated weights for policy 0, policy_version 628048 (0.0031) [2024-04-28 14:03:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10290069504. Throughput: 0: 55903.7. Samples: 780375060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:22,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 14:03:22,371][57339] Updated weights for policy 0, policy_version 628058 (0.0030) [2024-04-28 14:03:25,191][57339] Updated weights for policy 0, policy_version 628068 (0.0036) [2024-04-28 14:03:27,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10290348032. Throughput: 0: 55822.6. Samples: 780711320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:27,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 14:03:27,213][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000628074_10290364416.pth... [2024-04-28 14:03:27,262][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000627259_10277011456.pth [2024-04-28 14:03:28,080][57339] Updated weights for policy 0, policy_version 628078 (0.0028) [2024-04-28 14:03:30,918][57339] Updated weights for policy 0, policy_version 628088 (0.0031) [2024-04-28 14:03:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10290610176. Throughput: 0: 55801.8. Samples: 781047240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:32,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 14:03:33,971][57339] Updated weights for policy 0, policy_version 628098 (0.0027) [2024-04-28 14:03:36,982][57339] Updated weights for policy 0, policy_version 628108 (0.0024) [2024-04-28 14:03:37,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10290921472. Throughput: 0: 55829.3. Samples: 781213180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:37,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 14:03:39,861][57339] Updated weights for policy 0, policy_version 628118 (0.0030) [2024-04-28 14:03:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10291183616. Throughput: 0: 55797.3. Samples: 781550260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:42,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 14:03:42,994][57339] Updated weights for policy 0, policy_version 628128 (0.0035) [2024-04-28 14:03:45,789][57339] Updated weights for policy 0, policy_version 628138 (0.0034) [2024-04-28 14:03:47,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.6, 300 sec: 55705.6). 
Total num frames: 10291462144. Throughput: 0: 55711.3. Samples: 781881180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:47,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 14:03:48,791][57339] Updated weights for policy 0, policy_version 628148 (0.0034) [2024-04-28 14:03:51,552][57339] Updated weights for policy 0, policy_version 628158 (0.0031) [2024-04-28 14:03:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10291757056. Throughput: 0: 55808.4. Samples: 782053040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:52,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 14:03:54,739][57339] Updated weights for policy 0, policy_version 628168 (0.0030) [2024-04-28 14:03:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10292035584. Throughput: 0: 56045.4. Samples: 782391380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 14:03:57,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 14:03:57,346][57339] Updated weights for policy 0, policy_version 628178 (0.0024) [2024-04-28 14:03:59,061][57319] Signal inference workers to stop experience collection... (11200 times) [2024-04-28 14:03:59,094][57339] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-04-28 14:03:59,123][57319] Signal inference workers to resume experience collection... (11200 times) [2024-04-28 14:03:59,124][57339] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-04-28 14:04:00,744][57339] Updated weights for policy 0, policy_version 628188 (0.0028) [2024-04-28 14:04:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10292314112. Throughput: 0: 55933.4. Samples: 782723240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:02,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 14:04:03,331][57339] Updated weights for policy 0, policy_version 628198 (0.0030) [2024-04-28 14:04:06,692][57339] Updated weights for policy 0, policy_version 628208 (0.0029) [2024-04-28 14:04:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10292576256. Throughput: 0: 55686.7. Samples: 782880960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 14:04:09,121][57339] Updated weights for policy 0, policy_version 628218 (0.0027) [2024-04-28 14:04:12,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55761.2). Total num frames: 10292871168. Throughput: 0: 55730.5. Samples: 783219200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:12,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 14:04:12,360][57339] Updated weights for policy 0, policy_version 628228 (0.0037) [2024-04-28 14:04:14,905][57339] Updated weights for policy 0, policy_version 628238 (0.0026) [2024-04-28 14:04:17,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10293149696. Throughput: 0: 55804.2. Samples: 783558440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:17,170][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 14:04:18,175][57339] Updated weights for policy 0, policy_version 628248 (0.0026) [2024-04-28 14:04:20,695][57339] Updated weights for policy 0, policy_version 628258 (0.0024) [2024-04-28 14:04:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10293411840. Throughput: 0: 55665.3. Samples: 783718120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:22,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 14:04:24,144][57339] Updated weights for policy 0, policy_version 628268 (0.0033) [2024-04-28 14:04:26,735][57339] Updated weights for policy 0, policy_version 628278 (0.0028) [2024-04-28 14:04:27,169][57108] Fps is (10 sec: 57344.9, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10293723136. Throughput: 0: 55567.1. Samples: 784050780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:27,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 14:04:30,423][57339] Updated weights for policy 0, policy_version 628288 (0.0030) [2024-04-28 14:04:32,169][57108] Fps is (10 sec: 58983.2, 60 sec: 56524.7, 300 sec: 55761.1). Total num frames: 10294001664. Throughput: 0: 55560.8. Samples: 784381420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:32,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 14:04:32,567][57339] Updated weights for policy 0, policy_version 628298 (0.0027) [2024-04-28 14:04:36,339][57339] Updated weights for policy 0, policy_version 628308 (0.0031) [2024-04-28 14:04:37,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10294247424. Throughput: 0: 55703.6. Samples: 784559700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:37,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 14:04:38,269][57339] Updated weights for policy 0, policy_version 628318 (0.0032) [2024-04-28 14:04:42,156][57339] Updated weights for policy 0, policy_version 628328 (0.0028) [2024-04-28 14:04:42,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10294525952. Throughput: 0: 55723.8. Samples: 784898960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:42,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 14:04:44,444][57339] Updated weights for policy 0, policy_version 628338 (0.0027) [2024-04-28 14:04:47,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10294788096. Throughput: 0: 55629.8. Samples: 785226580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:47,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 14:04:47,959][57339] Updated weights for policy 0, policy_version 628348 (0.0029) [2024-04-28 14:04:50,324][57339] Updated weights for policy 0, policy_version 628358 (0.0027) [2024-04-28 14:04:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 10295066624. Throughput: 0: 55543.8. Samples: 785380440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:52,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 14:04:53,963][57339] Updated weights for policy 0, policy_version 628368 (0.0026) [2024-04-28 14:04:54,988][57319] Signal inference workers to stop experience collection... (11250 times) [2024-04-28 14:04:55,040][57339] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-04-28 14:04:55,040][57319] Signal inference workers to resume experience collection... (11250 times) [2024-04-28 14:04:55,062][57339] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-04-28 14:04:56,192][57339] Updated weights for policy 0, policy_version 628378 (0.0023) [2024-04-28 14:04:57,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10295361536. Throughput: 0: 55317.3. Samples: 785708480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:04:57,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:04:59,665][57339] Updated weights for policy 0, policy_version 628388 (0.0032) [2024-04-28 14:05:01,984][57339] Updated weights for policy 0, policy_version 628398 (0.0029) [2024-04-28 14:05:02,169][57108] Fps is (10 sec: 60621.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10295672832. Throughput: 0: 55219.7. Samples: 786043320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:05:02,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 14:05:05,532][57339] Updated weights for policy 0, policy_version 628408 (0.0029) [2024-04-28 14:05:07,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10295934976. Throughput: 0: 55751.7. Samples: 786226940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:05:07,169][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 14:05:07,885][57339] Updated weights for policy 0, policy_version 628418 (0.0029) [2024-04-28 14:05:11,550][57339] Updated weights for policy 0, policy_version 628428 (0.0028) [2024-04-28 14:05:12,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10296197120. Throughput: 0: 55661.1. Samples: 786555540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:05:12,170][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 14:05:14,200][57339] Updated weights for policy 0, policy_version 628438 (0.0031) [2024-04-28 14:05:17,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10296459264. Throughput: 0: 55757.4. Samples: 786890500. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:05:17,178][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 14:05:17,458][57339] Updated weights for policy 0, policy_version 628448 (0.0032) [2024-04-28 14:05:20,146][57339] Updated weights for policy 0, policy_version 628458 (0.0028) [2024-04-28 14:05:22,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10296737792. Throughput: 0: 55167.4. Samples: 787042240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:05:22,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 14:05:23,294][57339] Updated weights for policy 0, policy_version 628468 (0.0026) [2024-04-28 14:05:26,055][57339] Updated weights for policy 0, policy_version 628478 (0.0027) [2024-04-28 14:05:27,169][57108] Fps is (10 sec: 54067.2, 60 sec: 54613.3, 300 sec: 55594.5). Total num frames: 10296999936. Throughput: 0: 55017.9. Samples: 787374760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:05:27,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 14:05:27,185][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000628479_10296999936.pth... [2024-04-28 14:05:27,234][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000627666_10283679744.pth [2024-04-28 14:05:29,181][57339] Updated weights for policy 0, policy_version 628488 (0.0028) [2024-04-28 14:05:31,950][57339] Updated weights for policy 0, policy_version 628498 (0.0033) [2024-04-28 14:05:32,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10297311232. Throughput: 0: 55118.2. Samples: 787706900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 14:05:32,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 14:05:35,122][57339] Updated weights for policy 0, policy_version 628508 (0.0027) [2024-04-28 14:05:37,169][57108] Fps is (10 sec: 60621.1, 60 sec: 55978.7, 300 sec: 55594.5). 
Total num frames: 10297606144. Throughput: 0: 55738.0. Samples: 787888640. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:05:37,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 14:05:37,737][57339] Updated weights for policy 0, policy_version 628518 (0.0031) [2024-04-28 14:05:41,142][57339] Updated weights for policy 0, policy_version 628528 (0.0032) [2024-04-28 14:05:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10297884672. Throughput: 0: 55752.2. Samples: 788217320. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:05:42,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 14:05:43,680][57339] Updated weights for policy 0, policy_version 628538 (0.0028) [2024-04-28 14:05:46,883][57339] Updated weights for policy 0, policy_version 628548 (0.0026) [2024-04-28 14:05:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10298146816. Throughput: 0: 55774.7. Samples: 788553180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:05:47,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:05:49,608][57339] Updated weights for policy 0, policy_version 628558 (0.0026) [2024-04-28 14:05:52,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10298408960. Throughput: 0: 55318.2. Samples: 788716260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:05:52,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 14:05:52,641][57339] Updated weights for policy 0, policy_version 628568 (0.0024) [2024-04-28 14:05:54,426][57319] Signal inference workers to stop experience collection... (11300 times) [2024-04-28 14:05:54,427][57319] Signal inference workers to resume experience collection... 
(11300 times) [2024-04-28 14:05:54,451][57339] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-04-28 14:05:54,451][57339] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-04-28 14:05:55,342][57339] Updated weights for policy 0, policy_version 628578 (0.0025) [2024-04-28 14:05:57,169][57108] Fps is (10 sec: 50790.3, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 10298654720. Throughput: 0: 55404.7. Samples: 789048740. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:05:57,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 14:05:58,497][57339] Updated weights for policy 0, policy_version 628588 (0.0034) [2024-04-28 14:06:01,283][57339] Updated weights for policy 0, policy_version 628598 (0.0027) [2024-04-28 14:06:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 10298966016. Throughput: 0: 55448.9. Samples: 789385700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:02,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 14:06:04,382][57339] Updated weights for policy 0, policy_version 628608 (0.0027) [2024-04-28 14:06:07,169][57108] Fps is (10 sec: 60619.8, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 10299260928. Throughput: 0: 55799.9. Samples: 789553240. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:07,170][57108] Avg episode reward: [(0, '0.716')] [2024-04-28 14:06:07,567][57339] Updated weights for policy 0, policy_version 628618 (0.0030) [2024-04-28 14:06:10,109][57339] Updated weights for policy 0, policy_version 628628 (0.0025) [2024-04-28 14:06:12,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10299555840. Throughput: 0: 55900.8. Samples: 789890300. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:12,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 14:06:13,313][57339] Updated weights for policy 0, policy_version 628638 (0.0030) [2024-04-28 14:06:15,874][57339] Updated weights for policy 0, policy_version 628648 (0.0028) [2024-04-28 14:06:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10299834368. Throughput: 0: 55871.0. Samples: 790221100. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:17,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 14:06:19,279][57339] Updated weights for policy 0, policy_version 628658 (0.0031) [2024-04-28 14:06:21,922][57339] Updated weights for policy 0, policy_version 628668 (0.0032) [2024-04-28 14:06:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10300096512. Throughput: 0: 55689.2. Samples: 790394660. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:22,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 14:06:25,182][57339] Updated weights for policy 0, policy_version 628678 (0.0030) [2024-04-28 14:06:27,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10300375040. Throughput: 0: 55876.7. Samples: 790731780. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:27,170][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 14:06:27,846][57339] Updated weights for policy 0, policy_version 628688 (0.0032) [2024-04-28 14:06:31,157][57339] Updated weights for policy 0, policy_version 628698 (0.0030) [2024-04-28 14:06:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10300637184. Throughput: 0: 55819.0. Samples: 791065040. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:32,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 14:06:33,553][57339] Updated weights for policy 0, policy_version 628708 (0.0030) [2024-04-28 14:06:36,843][57339] Updated weights for policy 0, policy_version 628718 (0.0034) [2024-04-28 14:06:37,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 10300915712. Throughput: 0: 55819.5. Samples: 791228140. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:37,178][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 14:06:39,267][57339] Updated weights for policy 0, policy_version 628728 (0.0031) [2024-04-28 14:06:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10301210624. Throughput: 0: 55859.4. Samples: 791562420. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:42,178][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 14:06:42,589][57339] Updated weights for policy 0, policy_version 628738 (0.0030) [2024-04-28 14:06:45,200][57339] Updated weights for policy 0, policy_version 628748 (0.0029) [2024-04-28 14:06:47,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10301505536. Throughput: 0: 55808.5. Samples: 791897080. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:47,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 14:06:48,680][57339] Updated weights for policy 0, policy_version 628758 (0.0030) [2024-04-28 14:06:51,198][57339] Updated weights for policy 0, policy_version 628768 (0.0027) [2024-04-28 14:06:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10301784064. Throughput: 0: 55891.6. Samples: 792068360. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:52,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 14:06:54,414][57339] Updated weights for policy 0, policy_version 628778 (0.0026) [2024-04-28 14:06:56,757][57319] Signal inference workers to stop experience collection... (11350 times) [2024-04-28 14:06:56,783][57339] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-04-28 14:06:56,812][57319] Signal inference workers to resume experience collection... (11350 times) [2024-04-28 14:06:56,816][57339] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-04-28 14:06:56,922][57339] Updated weights for policy 0, policy_version 628788 (0.0028) [2024-04-28 14:06:57,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56797.7, 300 sec: 55761.1). Total num frames: 10302062592. Throughput: 0: 55982.6. Samples: 792409520. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:06:57,178][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 14:07:00,094][57339] Updated weights for policy 0, policy_version 628798 (0.0026) [2024-04-28 14:07:02,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10302341120. Throughput: 0: 56047.3. Samples: 792743220. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:07:02,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 14:07:02,776][57339] Updated weights for policy 0, policy_version 628808 (0.0028) [2024-04-28 14:07:05,997][57339] Updated weights for policy 0, policy_version 628818 (0.0029) [2024-04-28 14:07:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 10302603264. Throughput: 0: 55898.6. Samples: 792910100. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-04-28 14:07:07,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 14:07:08,736][57339] Updated weights for policy 0, policy_version 628828 (0.0028) [2024-04-28 14:07:11,884][57339] Updated weights for policy 0, policy_version 628838 (0.0029) [2024-04-28 14:07:12,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10302881792. Throughput: 0: 55878.3. Samples: 793246300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:12,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 14:07:14,438][57339] Updated weights for policy 0, policy_version 628848 (0.0029) [2024-04-28 14:07:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10303160320. Throughput: 0: 56014.5. Samples: 793585700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:17,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 14:07:17,800][57339] Updated weights for policy 0, policy_version 628858 (0.0027) [2024-04-28 14:07:20,396][57339] Updated weights for policy 0, policy_version 628868 (0.0027) [2024-04-28 14:07:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10303422464. Throughput: 0: 56129.8. Samples: 793753980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:22,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 14:07:23,473][57339] Updated weights for policy 0, policy_version 628878 (0.0028) [2024-04-28 14:07:26,122][57339] Updated weights for policy 0, policy_version 628888 (0.0026) [2024-04-28 14:07:27,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10303733760. Throughput: 0: 56107.7. Samples: 794087260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:27,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 14:07:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000628890_10303733760.pth... [2024-04-28 14:07:27,244][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000628074_10290364416.pth [2024-04-28 14:07:29,353][57339] Updated weights for policy 0, policy_version 628898 (0.0033) [2024-04-28 14:07:31,982][57339] Updated weights for policy 0, policy_version 628908 (0.0029) [2024-04-28 14:07:32,169][57108] Fps is (10 sec: 60620.8, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 10304028672. Throughput: 0: 56218.1. Samples: 794426900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:32,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 14:07:35,265][57339] Updated weights for policy 0, policy_version 628918 (0.0026) [2024-04-28 14:07:37,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 55816.6). Total num frames: 10304290816. Throughput: 0: 56168.8. Samples: 794595960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:37,170][57108] Avg episode reward: [(0, '0.503')] [2024-04-28 14:07:37,898][57339] Updated weights for policy 0, policy_version 628928 (0.0034) [2024-04-28 14:07:40,913][57339] Updated weights for policy 0, policy_version 628938 (0.0028) [2024-04-28 14:07:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10304602112. Throughput: 0: 56068.5. Samples: 794932600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:07:43,734][57339] Updated weights for policy 0, policy_version 628948 (0.0027) [2024-04-28 14:07:46,799][57339] Updated weights for policy 0, policy_version 628958 (0.0025) [2024-04-28 14:07:47,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55816.7). 
Total num frames: 10304864256. Throughput: 0: 56044.8. Samples: 795265240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:47,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 14:07:49,607][57339] Updated weights for policy 0, policy_version 628968 (0.0030) [2024-04-28 14:07:52,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10305126400. Throughput: 0: 56078.9. Samples: 795433640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:52,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 14:07:52,622][57339] Updated weights for policy 0, policy_version 628978 (0.0032) [2024-04-28 14:07:55,831][57339] Updated weights for policy 0, policy_version 628988 (0.0028) [2024-04-28 14:07:56,403][57319] Signal inference workers to stop experience collection... (11400 times) [2024-04-28 14:07:56,444][57339] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-04-28 14:07:56,454][57319] Signal inference workers to resume experience collection... (11400 times) [2024-04-28 14:07:56,460][57339] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-04-28 14:07:57,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 10305388544. Throughput: 0: 56140.5. Samples: 795772620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:07:57,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:07:58,436][57339] Updated weights for policy 0, policy_version 628998 (0.0025) [2024-04-28 14:08:01,541][57339] Updated weights for policy 0, policy_version 629008 (0.0026) [2024-04-28 14:08:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10305683456. Throughput: 0: 56026.5. Samples: 796106880. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:08:02,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 14:08:04,412][57339] Updated weights for policy 0, policy_version 629018 (0.0026) [2024-04-28 14:08:07,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10305978368. Throughput: 0: 55961.7. Samples: 796272260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:08:07,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 14:08:07,501][57339] Updated weights for policy 0, policy_version 629028 (0.0028) [2024-04-28 14:08:10,254][57339] Updated weights for policy 0, policy_version 629038 (0.0025) [2024-04-28 14:08:12,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10306256896. Throughput: 0: 55946.6. Samples: 796604860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:08:12,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 14:08:13,494][57339] Updated weights for policy 0, policy_version 629048 (0.0029) [2024-04-28 14:08:16,030][57339] Updated weights for policy 0, policy_version 629058 (0.0030) [2024-04-28 14:08:17,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 10306551808. Throughput: 0: 55704.4. Samples: 796933600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:08:17,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 14:08:19,280][57339] Updated weights for policy 0, policy_version 629068 (0.0026) [2024-04-28 14:08:21,839][57339] Updated weights for policy 0, policy_version 629078 (0.0030) [2024-04-28 14:08:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56797.9, 300 sec: 55872.2). Total num frames: 10306830336. Throughput: 0: 55831.2. Samples: 797108360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:08:22,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 14:08:25,004][57339] Updated weights for policy 0, policy_version 629088 (0.0032) [2024-04-28 14:08:27,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10307076096. Throughput: 0: 55810.8. Samples: 797444080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:08:27,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 14:08:27,751][57339] Updated weights for policy 0, policy_version 629098 (0.0026) [2024-04-28 14:08:31,188][57339] Updated weights for policy 0, policy_version 629108 (0.0030) [2024-04-28 14:08:32,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10307338240. Throughput: 0: 55844.0. Samples: 797778220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:08:32,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 14:08:33,517][57339] Updated weights for policy 0, policy_version 629118 (0.0030) [2024-04-28 14:08:37,016][57339] Updated weights for policy 0, policy_version 629128 (0.0031) [2024-04-28 14:08:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 10307633152. Throughput: 0: 55673.8. Samples: 797938960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:08:37,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 14:08:39,386][57339] Updated weights for policy 0, policy_version 629138 (0.0025) [2024-04-28 14:08:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10307911680. Throughput: 0: 55551.1. Samples: 798272420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:08:42,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 14:08:42,808][57339] Updated weights for policy 0, policy_version 629148 (0.0027) [2024-04-28 14:08:45,196][57339] Updated weights for policy 0, policy_version 629158 (0.0036) [2024-04-28 14:08:47,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10308206592. Throughput: 0: 55633.9. Samples: 798610400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:08:47,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 14:08:48,810][57339] Updated weights for policy 0, policy_version 629168 (0.0025) [2024-04-28 14:08:51,113][57339] Updated weights for policy 0, policy_version 629178 (0.0026) [2024-04-28 14:08:52,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10308501504. Throughput: 0: 55857.0. Samples: 798785820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:08:52,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 14:08:54,672][57339] Updated weights for policy 0, policy_version 629188 (0.0025) [2024-04-28 14:08:56,717][57319] Signal inference workers to stop experience collection... (11450 times) [2024-04-28 14:08:56,751][57339] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-04-28 14:08:56,777][57319] Signal inference workers to resume experience collection... (11450 times) [2024-04-28 14:08:56,777][57339] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-04-28 14:08:57,009][57339] Updated weights for policy 0, policy_version 629198 (0.0027) [2024-04-28 14:08:57,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 10308780032. Throughput: 0: 55901.8. Samples: 799120440. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:08:57,169][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 14:09:00,566][57339] Updated weights for policy 0, policy_version 629208 (0.0032) [2024-04-28 14:09:02,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10309025792. Throughput: 0: 55982.7. Samples: 799452820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:02,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 14:09:02,918][57339] Updated weights for policy 0, policy_version 629218 (0.0026) [2024-04-28 14:09:06,427][57339] Updated weights for policy 0, policy_version 629228 (0.0035) [2024-04-28 14:09:07,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10309287936. Throughput: 0: 55701.3. Samples: 799614920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:07,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 14:09:08,721][57339] Updated weights for policy 0, policy_version 629238 (0.0035) [2024-04-28 14:09:12,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10309582848. Throughput: 0: 55752.8. Samples: 799952960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:12,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 14:09:12,219][57339] Updated weights for policy 0, policy_version 629248 (0.0024) [2024-04-28 14:09:14,683][57339] Updated weights for policy 0, policy_version 629258 (0.0027) [2024-04-28 14:09:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 10309861376. Throughput: 0: 55717.4. Samples: 800285500. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:17,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 14:09:18,188][57339] Updated weights for policy 0, policy_version 629268 (0.0034) [2024-04-28 14:09:20,592][57339] Updated weights for policy 0, policy_version 629278 (0.0028) [2024-04-28 14:09:22,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 10310139904. Throughput: 0: 55647.0. Samples: 800443080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:22,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 14:09:24,131][57339] Updated weights for policy 0, policy_version 629288 (0.0031) [2024-04-28 14:09:26,362][57339] Updated weights for policy 0, policy_version 629298 (0.0034) [2024-04-28 14:09:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10310434816. Throughput: 0: 55637.3. Samples: 800776100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:27,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 14:09:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000629300_10310451200.pth... [2024-04-28 14:09:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000628479_10296999936.pth [2024-04-28 14:09:30,081][57339] Updated weights for policy 0, policy_version 629308 (0.0027) [2024-04-28 14:09:32,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10310729728. Throughput: 0: 55589.6. Samples: 801111940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 14:09:32,209][57339] Updated weights for policy 0, policy_version 629318 (0.0027) [2024-04-28 14:09:35,972][57339] Updated weights for policy 0, policy_version 629328 (0.0025) [2024-04-28 14:09:37,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 55872.2). 
Total num frames: 10311008256. Throughput: 0: 55563.6. Samples: 801286180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:37,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 14:09:38,186][57339] Updated weights for policy 0, policy_version 629338 (0.0028) [2024-04-28 14:09:41,737][57339] Updated weights for policy 0, policy_version 629348 (0.0026) [2024-04-28 14:09:42,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10311254016. Throughput: 0: 55588.5. Samples: 801621920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:42,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 14:09:43,986][57339] Updated weights for policy 0, policy_version 629358 (0.0031) [2024-04-28 14:09:47,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10311532544. Throughput: 0: 55728.4. Samples: 801960600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:47,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:09:47,559][57339] Updated weights for policy 0, policy_version 629368 (0.0028) [2024-04-28 14:09:49,845][57339] Updated weights for policy 0, policy_version 629378 (0.0026) [2024-04-28 14:09:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10311811072. Throughput: 0: 55644.9. Samples: 802118940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:52,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 14:09:53,443][57339] Updated weights for policy 0, policy_version 629388 (0.0026) [2024-04-28 14:09:55,579][57339] Updated weights for policy 0, policy_version 629398 (0.0027) [2024-04-28 14:09:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10312105984. Throughput: 0: 55603.0. Samples: 802455100. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:09:57,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 14:09:59,129][57339] Updated weights for policy 0, policy_version 629408 (0.0026) [2024-04-28 14:10:01,290][57319] Signal inference workers to stop experience collection... (11500 times) [2024-04-28 14:10:01,315][57339] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-04-28 14:10:01,347][57319] Signal inference workers to resume experience collection... (11500 times) [2024-04-28 14:10:01,348][57339] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-04-28 14:10:01,464][57339] Updated weights for policy 0, policy_version 629418 (0.0027) [2024-04-28 14:10:02,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10312400896. Throughput: 0: 55620.9. Samples: 802788440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:10:02,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:10:05,009][57339] Updated weights for policy 0, policy_version 629428 (0.0027) [2024-04-28 14:10:07,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56524.9, 300 sec: 55872.3). Total num frames: 10312679424. Throughput: 0: 56212.1. Samples: 802972620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:10:07,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 14:10:07,395][57339] Updated weights for policy 0, policy_version 629438 (0.0036) [2024-04-28 14:10:10,813][57339] Updated weights for policy 0, policy_version 629448 (0.0030) [2024-04-28 14:10:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10312957952. Throughput: 0: 56158.7. Samples: 803303240. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:10:12,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 14:10:13,156][57339] Updated weights for policy 0, policy_version 629458 (0.0029) [2024-04-28 14:10:16,764][57339] Updated weights for policy 0, policy_version 629468 (0.0031) [2024-04-28 14:10:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10313236480. Throughput: 0: 56187.1. Samples: 803640360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:10:17,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 14:10:19,184][57339] Updated weights for policy 0, policy_version 629478 (0.0024) [2024-04-28 14:10:22,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 10313498624. Throughput: 0: 55914.7. Samples: 803802340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-04-28 14:10:22,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 14:10:22,430][57339] Updated weights for policy 0, policy_version 629488 (0.0026) [2024-04-28 14:10:24,889][57339] Updated weights for policy 0, policy_version 629498 (0.0025) [2024-04-28 14:10:27,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10313744384. Throughput: 0: 55968.0. Samples: 804140480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:10:27,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:10:28,397][57339] Updated weights for policy 0, policy_version 629508 (0.0027) [2024-04-28 14:10:30,821][57339] Updated weights for policy 0, policy_version 629518 (0.0030) [2024-04-28 14:10:32,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10314039296. Throughput: 0: 55843.6. Samples: 804473560. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:10:32,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 14:10:34,197][57339] Updated weights for policy 0, policy_version 629528 (0.0028) [2024-04-28 14:10:36,643][57339] Updated weights for policy 0, policy_version 629538 (0.0027) [2024-04-28 14:10:37,169][57108] Fps is (10 sec: 60620.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10314350592. Throughput: 0: 56061.0. Samples: 804641680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:10:37,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 14:10:40,003][57339] Updated weights for policy 0, policy_version 629548 (0.0027) [2024-04-28 14:10:42,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10314629120. Throughput: 0: 56062.8. Samples: 804977920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:10:42,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 14:10:42,570][57339] Updated weights for policy 0, policy_version 629558 (0.0028) [2024-04-28 14:10:46,023][57339] Updated weights for policy 0, policy_version 629568 (0.0027) [2024-04-28 14:10:47,169][57108] Fps is (10 sec: 57343.2, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10314924032. Throughput: 0: 55979.9. Samples: 805307540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:10:47,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 14:10:48,609][57339] Updated weights for policy 0, policy_version 629578 (0.0033) [2024-04-28 14:10:51,874][57339] Updated weights for policy 0, policy_version 629588 (0.0035) [2024-04-28 14:10:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 10315169792. Throughput: 0: 55696.1. Samples: 805478940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:10:52,169][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 14:10:54,407][57339] Updated weights for policy 0, policy_version 629598 (0.0029) [2024-04-28 14:10:57,169][57108] Fps is (10 sec: 50790.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10315431936. Throughput: 0: 55732.5. Samples: 805811200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:10:57,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 14:10:57,687][57339] Updated weights for policy 0, policy_version 629608 (0.0027) [2024-04-28 14:11:00,325][57339] Updated weights for policy 0, policy_version 629618 (0.0034) [2024-04-28 14:11:02,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10315710464. Throughput: 0: 55669.2. Samples: 806145480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:02,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 14:11:03,016][57319] Signal inference workers to stop experience collection... (11550 times) [2024-04-28 14:11:03,016][57319] Signal inference workers to resume experience collection... (11550 times) [2024-04-28 14:11:03,038][57339] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-04-28 14:11:03,038][57339] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-04-28 14:11:03,650][57339] Updated weights for policy 0, policy_version 629628 (0.0025) [2024-04-28 14:11:06,286][57339] Updated weights for policy 0, policy_version 629638 (0.0031) [2024-04-28 14:11:07,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10316005376. Throughput: 0: 55684.7. Samples: 806308160. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:07,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 14:11:09,641][57339] Updated weights for policy 0, policy_version 629648 (0.0028) [2024-04-28 14:11:12,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10316300288. Throughput: 0: 55571.0. Samples: 806641180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:12,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 14:11:12,198][57339] Updated weights for policy 0, policy_version 629658 (0.0028) [2024-04-28 14:11:15,468][57339] Updated weights for policy 0, policy_version 629668 (0.0032) [2024-04-28 14:11:17,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10316562432. Throughput: 0: 55572.1. Samples: 806974300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:17,169][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 14:11:18,097][57339] Updated weights for policy 0, policy_version 629678 (0.0025) [2024-04-28 14:11:21,273][57339] Updated weights for policy 0, policy_version 629688 (0.0027) [2024-04-28 14:11:22,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10316857344. Throughput: 0: 55637.1. Samples: 807145360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:22,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 14:11:24,403][57339] Updated weights for policy 0, policy_version 629698 (0.0031) [2024-04-28 14:11:27,138][57339] Updated weights for policy 0, policy_version 629708 (0.0026) [2024-04-28 14:11:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56524.7, 300 sec: 55927.8). Total num frames: 10317135872. Throughput: 0: 55624.8. Samples: 807481040. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:27,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 14:11:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000629708_10317135872.pth... [2024-04-28 14:11:27,229][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000628890_10303733760.pth [2024-04-28 14:11:30,268][57339] Updated weights for policy 0, policy_version 629718 (0.0033) [2024-04-28 14:11:32,169][57108] Fps is (10 sec: 55706.4, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10317414400. Throughput: 0: 55741.0. Samples: 807815880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:32,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 14:11:33,270][57339] Updated weights for policy 0, policy_version 629728 (0.0034) [2024-04-28 14:11:36,055][57339] Updated weights for policy 0, policy_version 629738 (0.0029) [2024-04-28 14:11:37,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10317660160. Throughput: 0: 55413.6. Samples: 807972560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:37,170][57108] Avg episode reward: [(0, '0.509')] [2024-04-28 14:11:39,076][57339] Updated weights for policy 0, policy_version 629748 (0.0035) [2024-04-28 14:11:42,026][57339] Updated weights for policy 0, policy_version 629758 (0.0037) [2024-04-28 14:11:42,169][57108] Fps is (10 sec: 54065.1, 60 sec: 55432.1, 300 sec: 55761.1). Total num frames: 10317955072. Throughput: 0: 55488.0. Samples: 808308180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:42,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 14:11:44,947][57339] Updated weights for policy 0, policy_version 629768 (0.0034) [2024-04-28 14:11:47,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10318233600. Throughput: 0: 55515.2. Samples: 808643660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:47,170][57108] Avg episode reward: [(0, '0.726')] [2024-04-28 14:11:47,901][57339] Updated weights for policy 0, policy_version 629778 (0.0026) [2024-04-28 14:11:50,883][57339] Updated weights for policy 0, policy_version 629788 (0.0031) [2024-04-28 14:11:52,169][57108] Fps is (10 sec: 57346.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10318528512. Throughput: 0: 55629.1. Samples: 808811460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:52,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 14:11:53,859][57339] Updated weights for policy 0, policy_version 629798 (0.0026) [2024-04-28 14:11:56,695][57339] Updated weights for policy 0, policy_version 629808 (0.0029) [2024-04-28 14:11:57,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10318790656. Throughput: 0: 55657.3. Samples: 809145760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-04-28 14:11:57,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 14:11:59,789][57339] Updated weights for policy 0, policy_version 629818 (0.0031) [2024-04-28 14:12:02,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10319069184. Throughput: 0: 55687.9. Samples: 809480260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:02,169][57108] Avg episode reward: [(0, '0.455')] [2024-04-28 14:12:02,483][57339] Updated weights for policy 0, policy_version 629828 (0.0028) [2024-04-28 14:12:05,759][57339] Updated weights for policy 0, policy_version 629838 (0.0028) [2024-04-28 14:12:07,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10319364096. Throughput: 0: 55681.1. Samples: 809651000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:07,178][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 14:12:08,327][57339] Updated weights for policy 0, policy_version 629848 (0.0030) [2024-04-28 14:12:11,440][57339] Updated weights for policy 0, policy_version 629858 (0.0033) [2024-04-28 14:12:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10319626240. Throughput: 0: 55748.1. Samples: 809989700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:12,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 14:12:14,231][57319] Signal inference workers to stop experience collection... (11600 times) [2024-04-28 14:12:14,270][57339] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-04-28 14:12:14,284][57319] Signal inference workers to resume experience collection... (11600 times) [2024-04-28 14:12:14,292][57339] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-04-28 14:12:14,391][57339] Updated weights for policy 0, policy_version 629868 (0.0033) [2024-04-28 14:12:17,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 10319904768. Throughput: 0: 55737.2. Samples: 810324060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:17,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 14:12:17,266][57339] Updated weights for policy 0, policy_version 629878 (0.0024) [2024-04-28 14:12:20,220][57339] Updated weights for policy 0, policy_version 629888 (0.0039) [2024-04-28 14:12:22,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.8, 300 sec: 55761.2). Total num frames: 10320183296. Throughput: 0: 55884.2. Samples: 810487340. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:22,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 14:12:23,107][57339] Updated weights for policy 0, policy_version 629898 (0.0026) [2024-04-28 14:12:26,063][57339] Updated weights for policy 0, policy_version 629908 (0.0031) [2024-04-28 14:12:27,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10320461824. Throughput: 0: 55986.8. Samples: 810827560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:27,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 14:12:28,877][57339] Updated weights for policy 0, policy_version 629918 (0.0032) [2024-04-28 14:12:31,958][57339] Updated weights for policy 0, policy_version 629928 (0.0028) [2024-04-28 14:12:32,170][57108] Fps is (10 sec: 57338.6, 60 sec: 55704.8, 300 sec: 55816.5). Total num frames: 10320756736. Throughput: 0: 56046.6. Samples: 811165800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:32,171][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 14:12:34,709][57339] Updated weights for policy 0, policy_version 629938 (0.0034) [2024-04-28 14:12:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10321018880. Throughput: 0: 55976.9. Samples: 811330420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:37,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 14:12:37,808][57339] Updated weights for policy 0, policy_version 629948 (0.0027) [2024-04-28 14:12:40,637][57339] Updated weights for policy 0, policy_version 629958 (0.0031) [2024-04-28 14:12:42,169][57108] Fps is (10 sec: 55710.6, 60 sec: 55979.1, 300 sec: 55761.2). Total num frames: 10321313792. Throughput: 0: 55975.6. Samples: 811664660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:42,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 14:12:43,718][57339] Updated weights for policy 0, policy_version 629968 (0.0026) [2024-04-28 14:12:46,569][57339] Updated weights for policy 0, policy_version 629978 (0.0036) [2024-04-28 14:12:47,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10321608704. Throughput: 0: 55927.9. Samples: 811997020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:47,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 14:12:49,573][57339] Updated weights for policy 0, policy_version 629988 (0.0026) [2024-04-28 14:12:52,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10321870848. Throughput: 0: 55928.6. Samples: 812167780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:52,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 14:12:52,263][57339] Updated weights for policy 0, policy_version 629998 (0.0026) [2024-04-28 14:12:55,340][57339] Updated weights for policy 0, policy_version 630008 (0.0027) [2024-04-28 14:12:57,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10322149376. Throughput: 0: 55943.6. Samples: 812507160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:12:57,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 14:12:58,057][57339] Updated weights for policy 0, policy_version 630018 (0.0028) [2024-04-28 14:13:01,351][57339] Updated weights for policy 0, policy_version 630028 (0.0026) [2024-04-28 14:13:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 10322427904. Throughput: 0: 55996.8. Samples: 812843900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:13:02,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 14:13:03,885][57339] Updated weights for policy 0, policy_version 630038 (0.0030) [2024-04-28 14:13:07,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10322690048. Throughput: 0: 55962.6. Samples: 813005660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:13:07,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 14:13:07,265][57339] Updated weights for policy 0, policy_version 630048 (0.0026) [2024-04-28 14:13:09,765][57339] Updated weights for policy 0, policy_version 630058 (0.0025) [2024-04-28 14:13:12,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10322968576. Throughput: 0: 55791.6. Samples: 813338180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:13:12,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 14:13:13,034][57339] Updated weights for policy 0, policy_version 630068 (0.0027) [2024-04-28 14:13:15,663][57339] Updated weights for policy 0, policy_version 630078 (0.0028) [2024-04-28 14:13:17,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 10323279872. Throughput: 0: 55658.9. Samples: 813670400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:13:17,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 14:13:18,811][57339] Updated weights for policy 0, policy_version 630088 (0.0027) [2024-04-28 14:13:21,386][57339] Updated weights for policy 0, policy_version 630098 (0.0028) [2024-04-28 14:13:22,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10323558400. Throughput: 0: 55916.6. Samples: 813846660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:13:22,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 14:13:22,226][57319] Signal inference workers to stop experience collection... (11650 times) [2024-04-28 14:13:22,231][57319] Signal inference workers to resume experience collection... (11650 times) [2024-04-28 14:13:22,245][57339] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-04-28 14:13:22,262][57339] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-04-28 14:13:24,842][57339] Updated weights for policy 0, policy_version 630108 (0.0031) [2024-04-28 14:13:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10323836928. Throughput: 0: 55910.8. Samples: 814180640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:13:27,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:13:27,247][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000630118_10323853312.pth... [2024-04-28 14:13:27,252][57339] Updated weights for policy 0, policy_version 630118 (0.0029) [2024-04-28 14:13:27,294][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000629300_10310451200.pth [2024-04-28 14:13:30,667][57339] Updated weights for policy 0, policy_version 630128 (0.0025) [2024-04-28 14:13:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55706.5, 300 sec: 55816.7). Total num frames: 10324099072. Throughput: 0: 55978.5. Samples: 814516040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-04-28 14:13:32,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:13:33,220][57339] Updated weights for policy 0, policy_version 630138 (0.0028) [2024-04-28 14:13:36,504][57339] Updated weights for policy 0, policy_version 630148 (0.0027) [2024-04-28 14:13:37,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10324361216. Throughput: 0: 55730.2. 
Samples: 814675640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:13:37,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 14:13:39,058][57339] Updated weights for policy 0, policy_version 630158 (0.0026) [2024-04-28 14:13:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10324656128. Throughput: 0: 55696.9. Samples: 815013520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:13:42,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 14:13:42,194][57339] Updated weights for policy 0, policy_version 630168 (0.0026) [2024-04-28 14:13:44,942][57339] Updated weights for policy 0, policy_version 630178 (0.0030) [2024-04-28 14:13:47,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10324918272. Throughput: 0: 55710.1. Samples: 815350860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:13:47,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 14:13:48,470][57339] Updated weights for policy 0, policy_version 630188 (0.0031) [2024-04-28 14:13:50,833][57339] Updated weights for policy 0, policy_version 630198 (0.0025) [2024-04-28 14:13:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10325213184. Throughput: 0: 55789.8. Samples: 815516200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:13:52,169][57108] Avg episode reward: [(0, '0.710')] [2024-04-28 14:13:54,178][57339] Updated weights for policy 0, policy_version 630208 (0.0031) [2024-04-28 14:13:56,733][57339] Updated weights for policy 0, policy_version 630218 (0.0031) [2024-04-28 14:13:57,169][57108] Fps is (10 sec: 58983.4, 60 sec: 55978.7, 300 sec: 55872.3). Total num frames: 10325508096. Throughput: 0: 55764.1. Samples: 815847560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:13:57,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 14:13:59,869][57339] Updated weights for policy 0, policy_version 630228 (0.0027) [2024-04-28 14:14:02,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10325786624. Throughput: 0: 55841.0. Samples: 816183240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:02,170][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 14:14:02,658][57339] Updated weights for policy 0, policy_version 630238 (0.0024) [2024-04-28 14:14:05,981][57339] Updated weights for policy 0, policy_version 630248 (0.0029) [2024-04-28 14:14:07,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10326065152. Throughput: 0: 55810.9. Samples: 816358160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:07,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 14:14:08,493][57339] Updated weights for policy 0, policy_version 630258 (0.0035) [2024-04-28 14:14:11,885][57339] Updated weights for policy 0, policy_version 630268 (0.0029) [2024-04-28 14:14:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10326327296. Throughput: 0: 55858.2. Samples: 816694260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:12,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 14:14:14,461][57339] Updated weights for policy 0, policy_version 630278 (0.0031) [2024-04-28 14:14:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10326605824. Throughput: 0: 55788.3. Samples: 817026520. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:17,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 14:14:17,612][57339] Updated weights for policy 0, policy_version 630288 (0.0027) [2024-04-28 14:14:20,350][57339] Updated weights for policy 0, policy_version 630298 (0.0032) [2024-04-28 14:14:22,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 10326884352. Throughput: 0: 55864.5. Samples: 817189540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:22,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:14:23,418][57339] Updated weights for policy 0, policy_version 630308 (0.0031) [2024-04-28 14:14:26,232][57339] Updated weights for policy 0, policy_version 630318 (0.0034) [2024-04-28 14:14:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10327179264. Throughput: 0: 55797.7. Samples: 817524420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:27,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 14:14:29,196][57339] Updated weights for policy 0, policy_version 630328 (0.0026) [2024-04-28 14:14:31,937][57339] Updated weights for policy 0, policy_version 630338 (0.0030) [2024-04-28 14:14:32,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10327457792. Throughput: 0: 55813.0. Samples: 817862440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:32,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 14:14:35,067][57339] Updated weights for policy 0, policy_version 630348 (0.0029) [2024-04-28 14:14:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10327736320. Throughput: 0: 55894.3. Samples: 818031440. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:37,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 14:14:37,754][57339] Updated weights for policy 0, policy_version 630358 (0.0027) [2024-04-28 14:14:41,150][57339] Updated weights for policy 0, policy_version 630368 (0.0026) [2024-04-28 14:14:42,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 55927.8). Total num frames: 10328031232. Throughput: 0: 56045.6. Samples: 818369620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:42,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 14:14:43,424][57319] Signal inference workers to stop experience collection... (11700 times) [2024-04-28 14:14:43,464][57339] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-04-28 14:14:43,488][57319] Signal inference workers to resume experience collection... (11700 times) [2024-04-28 14:14:43,488][57339] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-04-28 14:14:43,615][57339] Updated weights for policy 0, policy_version 630378 (0.0025) [2024-04-28 14:14:47,102][57339] Updated weights for policy 0, policy_version 630388 (0.0029) [2024-04-28 14:14:47,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10328276992. Throughput: 0: 56011.2. Samples: 818703740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:47,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:14:49,405][57339] Updated weights for policy 0, policy_version 630398 (0.0032) [2024-04-28 14:14:52,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10328571904. Throughput: 0: 55712.6. Samples: 818865220. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:52,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 14:14:52,923][57339] Updated weights for policy 0, policy_version 630408 (0.0029) [2024-04-28 14:14:55,280][57339] Updated weights for policy 0, policy_version 630418 (0.0032) [2024-04-28 14:14:57,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10328866816. Throughput: 0: 55772.4. Samples: 819204020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:14:57,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 14:14:58,737][57339] Updated weights for policy 0, policy_version 630428 (0.0033) [2024-04-28 14:15:01,192][57339] Updated weights for policy 0, policy_version 630438 (0.0028) [2024-04-28 14:15:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10329128960. Throughput: 0: 55913.9. Samples: 819542640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:15:02,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 14:15:04,374][57339] Updated weights for policy 0, policy_version 630448 (0.0030) [2024-04-28 14:15:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10329407488. Throughput: 0: 56027.9. Samples: 819710800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 14:15:07,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 14:15:07,277][57339] Updated weights for policy 0, policy_version 630458 (0.0029) [2024-04-28 14:15:10,028][57339] Updated weights for policy 0, policy_version 630468 (0.0035) [2024-04-28 14:15:12,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10329686016. Throughput: 0: 56065.9. Samples: 820047380. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:12,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 14:15:12,973][57339] Updated weights for policy 0, policy_version 630478 (0.0033) [2024-04-28 14:15:15,908][57339] Updated weights for policy 0, policy_version 630488 (0.0027) [2024-04-28 14:15:17,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10329980928. Throughput: 0: 55990.1. Samples: 820382000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:17,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 14:15:18,811][57339] Updated weights for policy 0, policy_version 630498 (0.0027) [2024-04-28 14:15:21,785][57339] Updated weights for policy 0, policy_version 630508 (0.0031) [2024-04-28 14:15:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 10330259456. Throughput: 0: 56088.9. Samples: 820555440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:22,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 14:15:24,719][57339] Updated weights for policy 0, policy_version 630518 (0.0023) [2024-04-28 14:15:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10330537984. Throughput: 0: 56060.4. Samples: 820892340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:27,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 14:15:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000630526_10330537984.pth... [2024-04-28 14:15:27,225][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000629708_10317135872.pth [2024-04-28 14:15:27,632][57339] Updated weights for policy 0, policy_version 630528 (0.0029) [2024-04-28 14:15:30,458][57339] Updated weights for policy 0, policy_version 630538 (0.0034) [2024-04-28 14:15:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55816.7). 
Total num frames: 10330816512. Throughput: 0: 56033.2. Samples: 821225240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:32,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 14:15:33,359][57339] Updated weights for policy 0, policy_version 630548 (0.0032) [2024-04-28 14:15:36,159][57339] Updated weights for policy 0, policy_version 630558 (0.0027) [2024-04-28 14:15:37,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10331078656. Throughput: 0: 56148.4. Samples: 821391900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:37,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 14:15:39,210][57339] Updated weights for policy 0, policy_version 630568 (0.0030) [2024-04-28 14:15:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10331373568. Throughput: 0: 56119.6. Samples: 821729400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:42,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 14:15:42,181][57339] Updated weights for policy 0, policy_version 630578 (0.0029) [2024-04-28 14:15:45,099][57339] Updated weights for policy 0, policy_version 630588 (0.0028) [2024-04-28 14:15:47,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10331652096. Throughput: 0: 55892.0. Samples: 822057780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:47,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 14:15:48,457][57339] Updated weights for policy 0, policy_version 630598 (0.0033) [2024-04-28 14:15:50,836][57339] Updated weights for policy 0, policy_version 630608 (0.0030) [2024-04-28 14:15:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10331930624. Throughput: 0: 55876.5. Samples: 822225240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:52,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 14:15:54,619][57339] Updated weights for policy 0, policy_version 630618 (0.0036) [2024-04-28 14:15:56,592][57339] Updated weights for policy 0, policy_version 630628 (0.0029) [2024-04-28 14:15:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10332209152. Throughput: 0: 55800.9. Samples: 822558420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:15:57,169][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 14:16:00,352][57339] Updated weights for policy 0, policy_version 630638 (0.0027) [2024-04-28 14:16:02,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10332504064. Throughput: 0: 55771.6. Samples: 822891720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:16:02,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 14:16:02,553][57339] Updated weights for policy 0, policy_version 630648 (0.0027) [2024-04-28 14:16:03,749][57319] Signal inference workers to stop experience collection... (11750 times) [2024-04-28 14:16:03,785][57339] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-04-28 14:16:03,812][57319] Signal inference workers to resume experience collection... (11750 times) [2024-04-28 14:16:03,817][57339] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-04-28 14:16:06,151][57339] Updated weights for policy 0, policy_version 630658 (0.0029) [2024-04-28 14:16:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10332749824. Throughput: 0: 55653.4. Samples: 823059840. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:16:07,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 14:16:08,386][57339] Updated weights for policy 0, policy_version 630668 (0.0028) [2024-04-28 14:16:12,003][57339] Updated weights for policy 0, policy_version 630678 (0.0033) [2024-04-28 14:16:12,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10333028352. Throughput: 0: 55585.4. Samples: 823393680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:16:12,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 14:16:15,105][57339] Updated weights for policy 0, policy_version 630688 (0.0030) [2024-04-28 14:16:17,169][57108] Fps is (10 sec: 57342.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10333323264. Throughput: 0: 55625.6. Samples: 823728400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:16:17,169][57108] Avg episode reward: [(0, '0.493')] [2024-04-28 14:16:17,785][57339] Updated weights for policy 0, policy_version 630698 (0.0028) [2024-04-28 14:16:20,884][57339] Updated weights for policy 0, policy_version 630708 (0.0027) [2024-04-28 14:16:22,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10333585408. Throughput: 0: 55500.9. Samples: 823889440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:16:22,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 14:16:23,490][57339] Updated weights for policy 0, policy_version 630718 (0.0027) [2024-04-28 14:16:26,648][57339] Updated weights for policy 0, policy_version 630728 (0.0033) [2024-04-28 14:16:27,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10333847552. Throughput: 0: 55326.6. Samples: 824219100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:16:27,169][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 14:16:29,749][57339] Updated weights for policy 0, policy_version 630738 (0.0031) [2024-04-28 14:16:32,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10334158848. Throughput: 0: 55391.1. Samples: 824550380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:16:32,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 14:16:32,629][57339] Updated weights for policy 0, policy_version 630748 (0.0026) [2024-04-28 14:16:35,724][57339] Updated weights for policy 0, policy_version 630758 (0.0030) [2024-04-28 14:16:37,169][57108] Fps is (10 sec: 60621.0, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10334453760. Throughput: 0: 55631.1. Samples: 824728640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:16:37,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 14:16:38,735][57339] Updated weights for policy 0, policy_version 630768 (0.0033) [2024-04-28 14:16:41,721][57339] Updated weights for policy 0, policy_version 630778 (0.0029) [2024-04-28 14:16:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10334699520. Throughput: 0: 55586.1. Samples: 825059800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 14:16:42,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 14:16:44,432][57339] Updated weights for policy 0, policy_version 630788 (0.0026) [2024-04-28 14:16:47,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10334961664. Throughput: 0: 55573.8. Samples: 825392540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:16:47,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 14:16:47,517][57339] Updated weights for policy 0, policy_version 630798 (0.0029) [2024-04-28 14:16:50,439][57339] Updated weights for policy 0, policy_version 630808 (0.0026) [2024-04-28 14:16:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10335256576. Throughput: 0: 55440.6. Samples: 825554680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:16:52,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 14:16:53,377][57339] Updated weights for policy 0, policy_version 630818 (0.0027) [2024-04-28 14:16:56,480][57339] Updated weights for policy 0, policy_version 630828 (0.0028) [2024-04-28 14:16:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55761.2). Total num frames: 10335518720. Throughput: 0: 55471.1. Samples: 825889880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:16:57,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 14:16:59,126][57339] Updated weights for policy 0, policy_version 630838 (0.0028) [2024-04-28 14:17:02,169][57108] Fps is (10 sec: 52429.5, 60 sec: 54613.4, 300 sec: 55650.1). Total num frames: 10335780864. Throughput: 0: 55502.4. Samples: 826226000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:02,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 14:17:02,561][57339] Updated weights for policy 0, policy_version 630848 (0.0026) [2024-04-28 14:17:05,009][57339] Updated weights for policy 0, policy_version 630858 (0.0036) [2024-04-28 14:17:07,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10336092160. Throughput: 0: 55511.4. Samples: 826387460. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:07,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 14:17:08,636][57339] Updated weights for policy 0, policy_version 630868 (0.0027) [2024-04-28 14:17:08,659][57319] Signal inference workers to stop experience collection... (11800 times) [2024-04-28 14:17:08,663][57319] Signal inference workers to resume experience collection... (11800 times) [2024-04-28 14:17:08,688][57339] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-04-28 14:17:08,688][57339] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-04-28 14:17:10,774][57339] Updated weights for policy 0, policy_version 630878 (0.0031) [2024-04-28 14:17:12,169][57108] Fps is (10 sec: 62258.8, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10336403456. Throughput: 0: 55641.3. Samples: 826722960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:12,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 14:17:14,353][57339] Updated weights for policy 0, policy_version 630888 (0.0027) [2024-04-28 14:17:16,669][57339] Updated weights for policy 0, policy_version 630898 (0.0028) [2024-04-28 14:17:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10336649216. Throughput: 0: 55631.9. Samples: 827053820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:17,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 14:17:20,377][57339] Updated weights for policy 0, policy_version 630908 (0.0031) [2024-04-28 14:17:22,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10336927744. Throughput: 0: 55477.7. Samples: 827225140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:22,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 14:17:22,635][57339] Updated weights for policy 0, policy_version 630918 (0.0030) [2024-04-28 14:17:26,241][57339] Updated weights for policy 0, policy_version 630928 (0.0027) [2024-04-28 14:17:27,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55705.8). Total num frames: 10337189888. Throughput: 0: 55592.9. Samples: 827561480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:27,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 14:17:27,345][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000630934_10337222656.pth... [2024-04-28 14:17:27,390][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000630118_10323853312.pth [2024-04-28 14:17:28,446][57339] Updated weights for policy 0, policy_version 630938 (0.0031) [2024-04-28 14:17:32,061][57339] Updated weights for policy 0, policy_version 630948 (0.0029) [2024-04-28 14:17:32,169][57108] Fps is (10 sec: 52429.4, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10337452032. Throughput: 0: 55777.0. Samples: 827902500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:32,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 14:17:34,221][57339] Updated weights for policy 0, policy_version 630958 (0.0028) [2024-04-28 14:17:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10337746944. Throughput: 0: 55578.7. Samples: 828055720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:37,175][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 14:17:37,817][57339] Updated weights for policy 0, policy_version 630968 (0.0027) [2024-04-28 14:17:39,910][57339] Updated weights for policy 0, policy_version 630978 (0.0029) [2024-04-28 14:17:42,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55705.6, 300 sec: 55705.6). 
Total num frames: 10338041856. Throughput: 0: 55561.8. Samples: 828390160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:42,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 14:17:43,911][57339] Updated weights for policy 0, policy_version 630988 (0.0030) [2024-04-28 14:17:45,845][57339] Updated weights for policy 0, policy_version 630998 (0.0036) [2024-04-28 14:17:47,169][57108] Fps is (10 sec: 60621.6, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 10338353152. Throughput: 0: 55423.6. Samples: 828720060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:47,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 14:17:49,794][57339] Updated weights for policy 0, policy_version 631008 (0.0032) [2024-04-28 14:17:51,589][57339] Updated weights for policy 0, policy_version 631018 (0.0032) [2024-04-28 14:17:52,169][57108] Fps is (10 sec: 60620.8, 60 sec: 56524.9, 300 sec: 55927.7). Total num frames: 10338648064. Throughput: 0: 56070.7. Samples: 828910640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:52,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 14:17:55,585][57339] Updated weights for policy 0, policy_version 631028 (0.0029) [2024-04-28 14:17:57,169][57108] Fps is (10 sec: 54066.8, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10338893824. Throughput: 0: 56039.1. Samples: 829244720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:17:57,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 14:17:57,298][57319] Signal inference workers to stop experience collection... (11850 times) [2024-04-28 14:17:57,332][57339] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-04-28 14:17:57,358][57319] Signal inference workers to resume experience collection... 
(11850 times) [2024-04-28 14:17:57,363][57339] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-04-28 14:17:57,471][57339] Updated weights for policy 0, policy_version 631038 (0.0031) [2024-04-28 14:18:01,614][57339] Updated weights for policy 0, policy_version 631048 (0.0032) [2024-04-28 14:18:02,169][57108] Fps is (10 sec: 49152.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10339139584. Throughput: 0: 56095.2. Samples: 829578100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:18:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:18:03,262][57339] Updated weights for policy 0, policy_version 631058 (0.0027) [2024-04-28 14:18:07,169][57108] Fps is (10 sec: 49151.9, 60 sec: 54886.4, 300 sec: 55650.0). Total num frames: 10339385344. Throughput: 0: 55547.6. Samples: 829724780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:18:07,169][57108] Avg episode reward: [(0, '0.481')] [2024-04-28 14:18:07,491][57339] Updated weights for policy 0, policy_version 631068 (0.0033) [2024-04-28 14:18:09,386][57339] Updated weights for policy 0, policy_version 631078 (0.0032) [2024-04-28 14:18:12,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54613.4, 300 sec: 55594.5). Total num frames: 10339680256. Throughput: 0: 55393.9. Samples: 830054200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:18:12,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 14:18:13,292][57339] Updated weights for policy 0, policy_version 631088 (0.0028) [2024-04-28 14:18:15,201][57339] Updated weights for policy 0, policy_version 631098 (0.0032) [2024-04-28 14:18:17,169][57108] Fps is (10 sec: 60621.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10339991552. Throughput: 0: 55284.4. Samples: 830390300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 14:18:17,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 14:18:19,125][57339] Updated weights for policy 0, policy_version 631108 (0.0024) [2024-04-28 14:18:20,961][57339] Updated weights for policy 0, policy_version 631118 (0.0030) [2024-04-28 14:18:22,173][57108] Fps is (10 sec: 60595.0, 60 sec: 55974.8, 300 sec: 55760.3). Total num frames: 10340286464. Throughput: 0: 55965.1. Samples: 830574380. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:18:22,173][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 14:18:24,988][57339] Updated weights for policy 0, policy_version 631128 (0.0034) [2024-04-28 14:18:26,966][57339] Updated weights for policy 0, policy_version 631138 (0.0028) [2024-04-28 14:18:27,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10340581376. Throughput: 0: 55809.3. Samples: 830901580. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:18:27,170][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 14:18:30,951][57339] Updated weights for policy 0, policy_version 631148 (0.0032) [2024-04-28 14:18:32,169][57108] Fps is (10 sec: 54089.7, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 10340827136. Throughput: 0: 55875.4. Samples: 831234460. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:18:32,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 14:18:32,719][57339] Updated weights for policy 0, policy_version 631158 (0.0031) [2024-04-28 14:18:36,752][57339] Updated weights for policy 0, policy_version 631168 (0.0029) [2024-04-28 14:18:37,169][57108] Fps is (10 sec: 49152.3, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10341072896. Throughput: 0: 55232.5. Samples: 831396100. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:18:37,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 14:18:38,492][57339] Updated weights for policy 0, policy_version 631178 (0.0028) [2024-04-28 14:18:42,060][57319] Signal inference workers to stop experience collection... (11900 times) [2024-04-28 14:18:42,092][57339] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-04-28 14:18:42,119][57319] Signal inference workers to resume experience collection... (11900 times) [2024-04-28 14:18:42,120][57339] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-04-28 14:18:42,169][57108] Fps is (10 sec: 50790.6, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 10341335040. Throughput: 0: 55354.2. Samples: 831735660. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:18:42,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 14:18:42,742][57339] Updated weights for policy 0, policy_version 631188 (0.0034) [2024-04-28 14:18:44,387][57339] Updated weights for policy 0, policy_version 631198 (0.0025) [2024-04-28 14:18:47,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54613.3, 300 sec: 55650.1). Total num frames: 10341629952. Throughput: 0: 55425.8. Samples: 832072260. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:18:47,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 14:18:48,419][57339] Updated weights for policy 0, policy_version 631208 (0.0026) [2024-04-28 14:18:50,465][57339] Updated weights for policy 0, policy_version 631218 (0.0031) [2024-04-28 14:18:52,169][57108] Fps is (10 sec: 62259.4, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10341957632. Throughput: 0: 55821.9. Samples: 832236760. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:18:52,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 14:18:54,242][57339] Updated weights for policy 0, policy_version 631228 (0.0032) [2024-04-28 14:18:56,308][57339] Updated weights for policy 0, policy_version 631238 (0.0026) [2024-04-28 14:18:57,169][57108] Fps is (10 sec: 62259.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10342252544. Throughput: 0: 55972.4. Samples: 832572960. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:18:57,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 14:19:00,009][57339] Updated weights for policy 0, policy_version 631248 (0.0023) [2024-04-28 14:19:02,128][57339] Updated weights for policy 0, policy_version 631258 (0.0026) [2024-04-28 14:19:02,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 10342531072. Throughput: 0: 55883.2. Samples: 832905040. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:02,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 14:19:05,928][57339] Updated weights for policy 0, policy_version 631268 (0.0031) [2024-04-28 14:19:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 56797.9, 300 sec: 55816.7). Total num frames: 10342793216. Throughput: 0: 55742.1. Samples: 833082540. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:07,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 14:19:07,857][57339] Updated weights for policy 0, policy_version 631278 (0.0033) [2024-04-28 14:19:11,757][57339] Updated weights for policy 0, policy_version 631288 (0.0036) [2024-04-28 14:19:12,169][57108] Fps is (10 sec: 52428.8, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 10343055360. Throughput: 0: 55973.5. Samples: 833420380. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:12,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 14:19:13,777][57339] Updated weights for policy 0, policy_version 631298 (0.0026) [2024-04-28 14:19:17,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10343301120. Throughput: 0: 56192.6. Samples: 833763120. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:17,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 14:19:17,519][57339] Updated weights for policy 0, policy_version 631308 (0.0035) [2024-04-28 14:19:19,687][57339] Updated weights for policy 0, policy_version 631318 (0.0035) [2024-04-28 14:19:22,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54890.2, 300 sec: 55594.5). Total num frames: 10343579648. Throughput: 0: 55814.1. Samples: 833907740. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:22,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 14:19:23,270][57339] Updated weights for policy 0, policy_version 631328 (0.0029) [2024-04-28 14:19:25,449][57339] Updated weights for policy 0, policy_version 631338 (0.0030) [2024-04-28 14:19:27,169][57108] Fps is (10 sec: 60620.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10343907328. Throughput: 0: 55748.8. Samples: 834244360. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:27,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 14:19:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000631342_10343907328.pth... [2024-04-28 14:19:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000630526_10330537984.pth [2024-04-28 14:19:28,972][57319] Signal inference workers to stop experience collection... (11950 times) [2024-04-28 14:19:28,972][57319] Signal inference workers to resume experience collection... 
(11950 times) [2024-04-28 14:19:28,994][57339] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-04-28 14:19:29,020][57339] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-04-28 14:19:29,103][57339] Updated weights for policy 0, policy_version 631348 (0.0027) [2024-04-28 14:19:31,438][57339] Updated weights for policy 0, policy_version 631358 (0.0027) [2024-04-28 14:19:32,169][57108] Fps is (10 sec: 62259.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10344202240. Throughput: 0: 55720.9. Samples: 834579700. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:32,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 14:19:34,928][57339] Updated weights for policy 0, policy_version 631368 (0.0033) [2024-04-28 14:19:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 10344464384. Throughput: 0: 56172.9. Samples: 834764540. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:37,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 14:19:37,545][57339] Updated weights for policy 0, policy_version 631378 (0.0031) [2024-04-28 14:19:40,696][57339] Updated weights for policy 0, policy_version 631388 (0.0029) [2024-04-28 14:19:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 56797.8, 300 sec: 55816.7). Total num frames: 10344742912. Throughput: 0: 56019.0. Samples: 835093820. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:42,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 14:19:43,363][57339] Updated weights for policy 0, policy_version 631398 (0.0030) [2024-04-28 14:19:46,488][57339] Updated weights for policy 0, policy_version 631408 (0.0028) [2024-04-28 14:19:47,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 10345021440. Throughput: 0: 55947.5. Samples: 835422680. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:47,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 14:19:49,126][57339] Updated weights for policy 0, policy_version 631418 (0.0027) [2024-04-28 14:19:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10345283584. Throughput: 0: 55616.4. Samples: 835585280. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-04-28 14:19:52,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 14:19:52,403][57339] Updated weights for policy 0, policy_version 631428 (0.0028) [2024-04-28 14:19:55,101][57339] Updated weights for policy 0, policy_version 631438 (0.0030) [2024-04-28 14:19:57,169][57108] Fps is (10 sec: 50790.6, 60 sec: 54613.3, 300 sec: 55594.5). Total num frames: 10345529344. Throughput: 0: 55495.5. Samples: 835917680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:19:57,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 14:19:58,302][57339] Updated weights for policy 0, policy_version 631448 (0.0032) [2024-04-28 14:20:01,161][57339] Updated weights for policy 0, policy_version 631458 (0.0026) [2024-04-28 14:20:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10345840640. Throughput: 0: 55246.5. Samples: 836249220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:02,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 14:20:04,299][57339] Updated weights for policy 0, policy_version 631468 (0.0023) [2024-04-28 14:20:06,962][57339] Updated weights for policy 0, policy_version 631478 (0.0031) [2024-04-28 14:20:07,169][57108] Fps is (10 sec: 60621.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10346135552. Throughput: 0: 55763.7. Samples: 836417100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:07,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 14:20:10,116][57339] Updated weights for policy 0, policy_version 631488 (0.0030) [2024-04-28 14:20:12,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10346430464. Throughput: 0: 55792.4. Samples: 836755020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:12,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 14:20:12,813][57339] Updated weights for policy 0, policy_version 631498 (0.0029) [2024-04-28 14:20:15,965][57339] Updated weights for policy 0, policy_version 631508 (0.0031) [2024-04-28 14:20:15,974][57319] Signal inference workers to stop experience collection... (12000 times) [2024-04-28 14:20:15,979][57319] Signal inference workers to resume experience collection... (12000 times) [2024-04-28 14:20:16,002][57339] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-04-28 14:20:16,002][57339] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-04-28 14:20:17,169][57108] Fps is (10 sec: 54066.6, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 10346676224. Throughput: 0: 55736.3. Samples: 837087840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:17,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 14:20:18,828][57339] Updated weights for policy 0, policy_version 631518 (0.0030) [2024-04-28 14:20:21,802][57339] Updated weights for policy 0, policy_version 631528 (0.0026) [2024-04-28 14:20:22,169][57108] Fps is (10 sec: 54067.4, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 10346971136. Throughput: 0: 55363.5. Samples: 837255900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:22,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 14:20:24,914][57339] Updated weights for policy 0, policy_version 631538 (0.0028) [2024-04-28 14:20:27,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10347216896. Throughput: 0: 55360.2. Samples: 837585020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:27,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 14:20:27,740][57339] Updated weights for policy 0, policy_version 631548 (0.0031) [2024-04-28 14:20:30,897][57339] Updated weights for policy 0, policy_version 631558 (0.0026) [2024-04-28 14:20:32,169][57108] Fps is (10 sec: 50790.2, 60 sec: 54613.2, 300 sec: 55594.5). Total num frames: 10347479040. Throughput: 0: 55528.8. Samples: 837921480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:32,169][57108] Avg episode reward: [(0, '0.517')] [2024-04-28 14:20:33,515][57339] Updated weights for policy 0, policy_version 631568 (0.0028) [2024-04-28 14:20:36,736][57339] Updated weights for policy 0, policy_version 631578 (0.0031) [2024-04-28 14:20:37,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10347773952. Throughput: 0: 55520.0. Samples: 838083680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:37,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 14:20:39,410][57339] Updated weights for policy 0, policy_version 631588 (0.0034) [2024-04-28 14:20:42,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10348052480. Throughput: 0: 55484.9. Samples: 838414500. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:20:42,849][57339] Updated weights for policy 0, policy_version 631598 (0.0029) [2024-04-28 14:20:45,387][57339] Updated weights for policy 0, policy_version 631608 (0.0030) [2024-04-28 14:20:47,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10348331008. Throughput: 0: 55531.7. Samples: 838748140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:47,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 14:20:48,810][57339] Updated weights for policy 0, policy_version 631618 (0.0029) [2024-04-28 14:20:51,212][57339] Updated weights for policy 0, policy_version 631628 (0.0028) [2024-04-28 14:20:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10348609536. Throughput: 0: 55680.8. Samples: 838922740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:52,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 14:20:54,711][57339] Updated weights for policy 0, policy_version 631638 (0.0024) [2024-04-28 14:20:56,984][57339] Updated weights for policy 0, policy_version 631648 (0.0034) [2024-04-28 14:20:57,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 10348920832. Throughput: 0: 55581.0. Samples: 839256160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:20:57,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 14:21:00,709][57339] Updated weights for policy 0, policy_version 631658 (0.0031) [2024-04-28 14:21:02,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10349182976. Throughput: 0: 55502.7. Samples: 839585460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:21:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:21:02,832][57339] Updated weights for policy 0, policy_version 631668 (0.0028) [2024-04-28 14:21:06,543][57339] Updated weights for policy 0, policy_version 631678 (0.0030) [2024-04-28 14:21:07,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 10349445120. Throughput: 0: 55537.2. Samples: 839755080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:21:07,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 14:21:08,614][57339] Updated weights for policy 0, policy_version 631688 (0.0027) [2024-04-28 14:21:12,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54613.3, 300 sec: 55539.0). Total num frames: 10349707264. Throughput: 0: 55597.1. Samples: 840086900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:21:12,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 14:21:12,478][57339] Updated weights for policy 0, policy_version 631698 (0.0029) [2024-04-28 14:21:13,986][57319] Signal inference workers to stop experience collection... (12050 times) [2024-04-28 14:21:13,991][57319] Signal inference workers to resume experience collection... (12050 times) [2024-04-28 14:21:14,017][57339] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-04-28 14:21:14,017][57339] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-04-28 14:21:14,376][57339] Updated weights for policy 0, policy_version 631708 (0.0027) [2024-04-28 14:21:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10350002176. Throughput: 0: 55532.9. Samples: 840420460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:21:17,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 14:21:18,411][57339] Updated weights for policy 0, policy_version 631718 (0.0029) [2024-04-28 14:21:20,264][57339] Updated weights for policy 0, policy_version 631728 (0.0027) [2024-04-28 14:21:22,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10350280704. Throughput: 0: 55678.8. Samples: 840589220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:21:22,169][57108] Avg episode reward: [(0, '0.701')] [2024-04-28 14:21:24,401][57339] Updated weights for policy 0, policy_version 631738 (0.0033) [2024-04-28 14:21:26,158][57339] Updated weights for policy 0, policy_version 631748 (0.0030) [2024-04-28 14:21:27,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 10350575616. Throughput: 0: 55671.3. Samples: 840919720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:21:27,178][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 14:21:27,188][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000631749_10350575616.pth... [2024-04-28 14:21:27,233][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000630934_10337222656.pth [2024-04-28 14:21:30,284][57339] Updated weights for policy 0, policy_version 631758 (0.0033) [2024-04-28 14:21:32,118][57339] Updated weights for policy 0, policy_version 631768 (0.0029) [2024-04-28 14:21:32,169][57108] Fps is (10 sec: 60620.1, 60 sec: 56797.8, 300 sec: 55705.6). Total num frames: 10350886912. Throughput: 0: 55621.1. Samples: 841251100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:21:32,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 14:21:36,065][57339] Updated weights for policy 0, policy_version 631778 (0.0026) [2024-04-28 14:21:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55705.6). 
Total num frames: 10351132672. Throughput: 0: 55529.8. Samples: 841421580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:21:37,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 14:21:38,233][57339] Updated weights for policy 0, policy_version 631788 (0.0025) [2024-04-28 14:21:41,977][57339] Updated weights for policy 0, policy_version 631798 (0.0028) [2024-04-28 14:21:42,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10351394816. Throughput: 0: 55597.2. Samples: 841758040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:21:42,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 14:21:44,498][57339] Updated weights for policy 0, policy_version 631808 (0.0028) [2024-04-28 14:21:47,169][57108] Fps is (10 sec: 50790.8, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10351640576. Throughput: 0: 55746.7. Samples: 842094060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:21:47,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 14:21:47,814][57339] Updated weights for policy 0, policy_version 631818 (0.0028) [2024-04-28 14:21:50,471][57339] Updated weights for policy 0, policy_version 631828 (0.0027) [2024-04-28 14:21:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10351951872. Throughput: 0: 55365.9. Samples: 842246540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:21:52,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 14:21:53,618][57339] Updated weights for policy 0, policy_version 631838 (0.0029) [2024-04-28 14:21:56,338][57339] Updated weights for policy 0, policy_version 631848 (0.0030) [2024-04-28 14:21:57,169][57108] Fps is (10 sec: 57343.7, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10352214016. Throughput: 0: 55396.1. Samples: 842579720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:21:57,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 14:21:59,345][57339] Updated weights for policy 0, policy_version 631858 (0.0028) [2024-04-28 14:22:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10352508928. Throughput: 0: 55428.8. Samples: 842914760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:02,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 14:22:02,257][57339] Updated weights for policy 0, policy_version 631868 (0.0035) [2024-04-28 14:22:03,028][57319] Signal inference workers to stop experience collection... (12100 times) [2024-04-28 14:22:03,029][57319] Signal inference workers to resume experience collection... (12100 times) [2024-04-28 14:22:03,043][57339] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-04-28 14:22:03,043][57339] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-04-28 14:22:05,343][57339] Updated weights for policy 0, policy_version 631878 (0.0028) [2024-04-28 14:22:07,169][57108] Fps is (10 sec: 62259.3, 60 sec: 56524.9, 300 sec: 55705.6). Total num frames: 10352836608. Throughput: 0: 55729.7. Samples: 843097060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:07,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 14:22:07,937][57339] Updated weights for policy 0, policy_version 631888 (0.0029) [2024-04-28 14:22:11,180][57339] Updated weights for policy 0, policy_version 631898 (0.0034) [2024-04-28 14:22:12,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10353082368. Throughput: 0: 55848.6. Samples: 843432900. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:12,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 14:22:13,822][57339] Updated weights for policy 0, policy_version 631908 (0.0032) [2024-04-28 14:22:16,915][57339] Updated weights for policy 0, policy_version 631918 (0.0030) [2024-04-28 14:22:17,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10353344512. Throughput: 0: 55797.8. Samples: 843762000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:17,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 14:22:19,810][57339] Updated weights for policy 0, policy_version 631928 (0.0031) [2024-04-28 14:22:22,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10353606656. Throughput: 0: 55627.7. Samples: 843924820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:22,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:22:22,721][57339] Updated weights for policy 0, policy_version 631938 (0.0032) [2024-04-28 14:22:25,600][57339] Updated weights for policy 0, policy_version 631948 (0.0034) [2024-04-28 14:22:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10353901568. Throughput: 0: 55599.5. Samples: 844260020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:27,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 14:22:28,494][57339] Updated weights for policy 0, policy_version 631958 (0.0029) [2024-04-28 14:22:31,863][57339] Updated weights for policy 0, policy_version 631968 (0.0027) [2024-04-28 14:22:32,169][57108] Fps is (10 sec: 57344.2, 60 sec: 54886.5, 300 sec: 55705.6). Total num frames: 10354180096. Throughput: 0: 55684.9. Samples: 844599880. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:32,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 14:22:34,361][57339] Updated weights for policy 0, policy_version 631978 (0.0031) [2024-04-28 14:22:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10354475008. Throughput: 0: 55968.7. Samples: 844765140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:37,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 14:22:37,794][57339] Updated weights for policy 0, policy_version 631988 (0.0027) [2024-04-28 14:22:40,385][57339] Updated weights for policy 0, policy_version 631998 (0.0031) [2024-04-28 14:22:42,169][57108] Fps is (10 sec: 58981.6, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 10354769920. Throughput: 0: 55973.3. Samples: 845098520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:42,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 14:22:43,627][57339] Updated weights for policy 0, policy_version 632008 (0.0031) [2024-04-28 14:22:46,152][57339] Updated weights for policy 0, policy_version 632018 (0.0025) [2024-04-28 14:22:47,169][57108] Fps is (10 sec: 55706.4, 60 sec: 56524.8, 300 sec: 55539.0). Total num frames: 10355032064. Throughput: 0: 55974.8. Samples: 845433620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:47,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 14:22:49,350][57339] Updated weights for policy 0, policy_version 632028 (0.0030) [2024-04-28 14:22:52,145][57339] Updated weights for policy 0, policy_version 632038 (0.0030) [2024-04-28 14:22:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10355310592. Throughput: 0: 55742.2. Samples: 845605460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:52,178][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 14:22:55,150][57339] Updated weights for policy 0, policy_version 632048 (0.0027) [2024-04-28 14:22:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10355572736. Throughput: 0: 55719.6. Samples: 845940280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:22:57,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:22:57,945][57339] Updated weights for policy 0, policy_version 632058 (0.0026) [2024-04-28 14:22:59,022][57319] Signal inference workers to stop experience collection... (12150 times) [2024-04-28 14:22:59,026][57319] Signal inference workers to resume experience collection... (12150 times) [2024-04-28 14:22:59,049][57339] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-04-28 14:22:59,050][57339] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-04-28 14:23:01,071][57339] Updated weights for policy 0, policy_version 632068 (0.0036) [2024-04-28 14:23:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10355851264. Throughput: 0: 55872.4. Samples: 846276260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:23:02,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 14:23:03,736][57339] Updated weights for policy 0, policy_version 632078 (0.0026) [2024-04-28 14:23:06,962][57339] Updated weights for policy 0, policy_version 632088 (0.0031) [2024-04-28 14:23:07,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 10356146176. Throughput: 0: 55950.1. Samples: 846442580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 14:23:07,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 14:23:09,511][57339] Updated weights for policy 0, policy_version 632098 (0.0026) [2024-04-28 14:23:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10356424704. Throughput: 0: 55961.0. Samples: 846778260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:12,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 14:23:12,838][57339] Updated weights for policy 0, policy_version 632108 (0.0027) [2024-04-28 14:23:15,385][57339] Updated weights for policy 0, policy_version 632118 (0.0027) [2024-04-28 14:23:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55650.9). Total num frames: 10356703232. Throughput: 0: 55735.5. Samples: 847107980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:17,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:23:18,671][57339] Updated weights for policy 0, policy_version 632128 (0.0039) [2024-04-28 14:23:21,365][57339] Updated weights for policy 0, policy_version 632138 (0.0025) [2024-04-28 14:23:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10356965376. Throughput: 0: 55830.9. Samples: 847277520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:22,169][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 14:23:24,612][57339] Updated weights for policy 0, policy_version 632148 (0.0029) [2024-04-28 14:23:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10357260288. Throughput: 0: 55822.3. Samples: 847610520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:27,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:23:27,186][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000632158_10357276672.pth... 
[2024-04-28 14:23:27,191][57339] Updated weights for policy 0, policy_version 632158 (0.0033) [2024-04-28 14:23:27,232][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000631342_10343907328.pth [2024-04-28 14:23:30,386][57339] Updated weights for policy 0, policy_version 632168 (0.0031) [2024-04-28 14:23:32,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10357538816. Throughput: 0: 55759.0. Samples: 847942780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:32,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 14:23:33,160][57339] Updated weights for policy 0, policy_version 632178 (0.0030) [2024-04-28 14:23:36,305][57339] Updated weights for policy 0, policy_version 632188 (0.0028) [2024-04-28 14:23:37,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10357817344. Throughput: 0: 55680.8. Samples: 848111100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:37,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 14:23:39,285][57339] Updated weights for policy 0, policy_version 632198 (0.0034) [2024-04-28 14:23:42,023][57339] Updated weights for policy 0, policy_version 632208 (0.0037) [2024-04-28 14:23:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10358095872. Throughput: 0: 55596.4. Samples: 848442120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:42,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 14:23:44,999][57339] Updated weights for policy 0, policy_version 632218 (0.0029) [2024-04-28 14:23:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 10358374400. Throughput: 0: 55642.1. Samples: 848780160. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:47,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 14:23:47,940][57339] Updated weights for policy 0, policy_version 632228 (0.0029) [2024-04-28 14:23:50,980][57339] Updated weights for policy 0, policy_version 632238 (0.0026) [2024-04-28 14:23:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10358636544. Throughput: 0: 55456.5. Samples: 848938120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:52,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 14:23:53,845][57339] Updated weights for policy 0, policy_version 632248 (0.0032) [2024-04-28 14:23:56,962][57339] Updated weights for policy 0, policy_version 632258 (0.0024) [2024-04-28 14:23:57,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.4, 300 sec: 55538.9). Total num frames: 10358915072. Throughput: 0: 55450.9. Samples: 849273560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:23:57,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 14:23:59,950][57339] Updated weights for policy 0, policy_version 632268 (0.0027) [2024-04-28 14:24:02,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10359209984. Throughput: 0: 55571.8. Samples: 849608720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:24:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:24:02,707][57339] Updated weights for policy 0, policy_version 632278 (0.0027) [2024-04-28 14:24:05,863][57339] Updated weights for policy 0, policy_version 632288 (0.0025) [2024-04-28 14:24:06,448][57319] Signal inference workers to stop experience collection... (12200 times) [2024-04-28 14:24:06,490][57339] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-04-28 14:24:06,546][57319] Signal inference workers to resume experience collection... 
(12200 times) [2024-04-28 14:24:06,546][57339] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-04-28 14:24:07,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10359488512. Throughput: 0: 55636.8. Samples: 849781180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:24:07,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 14:24:08,528][57339] Updated weights for policy 0, policy_version 632298 (0.0035) [2024-04-28 14:24:11,796][57339] Updated weights for policy 0, policy_version 632308 (0.0025) [2024-04-28 14:24:12,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10359767040. Throughput: 0: 55702.2. Samples: 850117120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:24:12,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 14:24:14,503][57339] Updated weights for policy 0, policy_version 632318 (0.0026) [2024-04-28 14:24:17,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10360029184. Throughput: 0: 55782.6. Samples: 850453000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:24:17,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 14:24:17,640][57339] Updated weights for policy 0, policy_version 632328 (0.0026) [2024-04-28 14:24:20,583][57339] Updated weights for policy 0, policy_version 632338 (0.0028) [2024-04-28 14:24:22,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10360307712. Throughput: 0: 55751.1. Samples: 850619900. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:24:22,178][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 14:24:23,410][57339] Updated weights for policy 0, policy_version 632348 (0.0028) [2024-04-28 14:24:26,476][57339] Updated weights for policy 0, policy_version 632358 (0.0027) [2024-04-28 14:24:27,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10360586240. Throughput: 0: 55842.8. Samples: 850955040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:24:27,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 14:24:29,307][57339] Updated weights for policy 0, policy_version 632368 (0.0031) [2024-04-28 14:24:32,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10360864768. Throughput: 0: 55653.9. Samples: 851284580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:24:32,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 14:24:32,411][57339] Updated weights for policy 0, policy_version 632378 (0.0030) [2024-04-28 14:24:35,071][57339] Updated weights for policy 0, policy_version 632388 (0.0027) [2024-04-28 14:24:37,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10361143296. Throughput: 0: 55716.1. Samples: 851445340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:24:37,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:24:38,267][57339] Updated weights for policy 0, policy_version 632398 (0.0030) [2024-04-28 14:24:40,940][57339] Updated weights for policy 0, policy_version 632408 (0.0032) [2024-04-28 14:24:42,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10361454592. Throughput: 0: 55757.8. Samples: 851782660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 14:24:42,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 14:24:44,073][57339] Updated weights for policy 0, policy_version 632418 (0.0032) [2024-04-28 14:24:46,746][57339] Updated weights for policy 0, policy_version 632428 (0.0024) [2024-04-28 14:24:47,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10361716736. Throughput: 0: 55824.5. Samples: 852120820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:24:47,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 14:24:49,808][57339] Updated weights for policy 0, policy_version 632438 (0.0028) [2024-04-28 14:24:52,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10361978880. Throughput: 0: 55783.2. Samples: 852291420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:24:52,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 14:24:52,482][57339] Updated weights for policy 0, policy_version 632448 (0.0030) [2024-04-28 14:24:55,785][57339] Updated weights for policy 0, policy_version 632458 (0.0027) [2024-04-28 14:24:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10362257408. Throughput: 0: 55814.2. Samples: 852628760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:24:57,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 14:24:58,202][57339] Updated weights for policy 0, policy_version 632468 (0.0028) [2024-04-28 14:25:01,697][57339] Updated weights for policy 0, policy_version 632478 (0.0028) [2024-04-28 14:25:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10362535936. Throughput: 0: 55748.1. Samples: 852961660. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:02,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 14:25:04,036][57339] Updated weights for policy 0, policy_version 632488 (0.0026) [2024-04-28 14:25:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10362814464. Throughput: 0: 55581.0. Samples: 853121040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:07,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 14:25:07,452][57339] Updated weights for policy 0, policy_version 632498 (0.0026) [2024-04-28 14:25:09,236][57319] Signal inference workers to stop experience collection... (12250 times) [2024-04-28 14:25:09,237][57319] Signal inference workers to resume experience collection... (12250 times) [2024-04-28 14:25:09,249][57339] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-04-28 14:25:09,261][57339] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-04-28 14:25:09,893][57339] Updated weights for policy 0, policy_version 632508 (0.0027) [2024-04-28 14:25:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10363092992. Throughput: 0: 55686.6. Samples: 853460940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:12,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 14:25:13,253][57339] Updated weights for policy 0, policy_version 632518 (0.0029) [2024-04-28 14:25:15,610][57339] Updated weights for policy 0, policy_version 632528 (0.0037) [2024-04-28 14:25:17,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10363387904. Throughput: 0: 55782.3. Samples: 853794780. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:17,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 14:25:19,221][57339] Updated weights for policy 0, policy_version 632538 (0.0024) [2024-04-28 14:25:21,599][57339] Updated weights for policy 0, policy_version 632548 (0.0027) [2024-04-28 14:25:22,169][57108] Fps is (10 sec: 60621.4, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 10363699200. Throughput: 0: 56208.9. Samples: 853974740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:22,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 14:25:25,035][57339] Updated weights for policy 0, policy_version 632558 (0.0026) [2024-04-28 14:25:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10363961344. Throughput: 0: 56196.6. Samples: 854311500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:27,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 14:25:27,213][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000632567_10363977728.pth... [2024-04-28 14:25:27,257][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000631749_10350575616.pth [2024-04-28 14:25:27,406][57339] Updated weights for policy 0, policy_version 632568 (0.0026) [2024-04-28 14:25:30,797][57339] Updated weights for policy 0, policy_version 632578 (0.0031) [2024-04-28 14:25:32,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10364223488. Throughput: 0: 56155.2. Samples: 854647800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:32,170][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 14:25:33,479][57339] Updated weights for policy 0, policy_version 632588 (0.0028) [2024-04-28 14:25:36,600][57339] Updated weights for policy 0, policy_version 632598 (0.0027) [2024-04-28 14:25:37,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55978.5, 300 sec: 55761.1). 
Total num frames: 10364502016. Throughput: 0: 55931.8. Samples: 854808360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:37,169][57108] Avg episode reward: [(0, '0.508')] [2024-04-28 14:25:39,296][57339] Updated weights for policy 0, policy_version 632608 (0.0033) [2024-04-28 14:25:42,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10364764160. Throughput: 0: 55921.7. Samples: 855145240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:42,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 14:25:42,477][57339] Updated weights for policy 0, policy_version 632618 (0.0025) [2024-04-28 14:25:44,958][57339] Updated weights for policy 0, policy_version 632628 (0.0028) [2024-04-28 14:25:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10365059072. Throughput: 0: 56044.8. Samples: 855483680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:47,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 14:25:48,385][57339] Updated weights for policy 0, policy_version 632638 (0.0034) [2024-04-28 14:25:50,704][57339] Updated weights for policy 0, policy_version 632648 (0.0026) [2024-04-28 14:25:52,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10365337600. Throughput: 0: 56178.2. Samples: 855649060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:52,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:25:54,326][57339] Updated weights for policy 0, policy_version 632658 (0.0028) [2024-04-28 14:25:56,613][57339] Updated weights for policy 0, policy_version 632668 (0.0027) [2024-04-28 14:25:57,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 10365648896. Throughput: 0: 55988.5. Samples: 855980420. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:25:57,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 14:26:00,395][57339] Updated weights for policy 0, policy_version 632678 (0.0029) [2024-04-28 14:26:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10365911040. Throughput: 0: 55926.7. Samples: 856311480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:26:02,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:26:02,601][57339] Updated weights for policy 0, policy_version 632688 (0.0029) [2024-04-28 14:26:06,242][57339] Updated weights for policy 0, policy_version 632698 (0.0028) [2024-04-28 14:26:07,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56524.7, 300 sec: 55927.8). Total num frames: 10366205952. Throughput: 0: 55699.0. Samples: 856481200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:26:07,169][57108] Avg episode reward: [(0, '0.711')] [2024-04-28 14:26:08,203][57319] Signal inference workers to stop experience collection... (12300 times) [2024-04-28 14:26:08,203][57319] Signal inference workers to resume experience collection... (12300 times) [2024-04-28 14:26:08,219][57339] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-04-28 14:26:08,220][57339] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-04-28 14:26:08,323][57339] Updated weights for policy 0, policy_version 632708 (0.0025) [2024-04-28 14:26:11,966][57339] Updated weights for policy 0, policy_version 632718 (0.0034) [2024-04-28 14:26:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10366451712. Throughput: 0: 55720.0. Samples: 856818900. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:26:12,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 14:26:14,199][57339] Updated weights for policy 0, policy_version 632728 (0.0027) [2024-04-28 14:26:17,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10366713856. Throughput: 0: 55728.9. Samples: 857155600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-04-28 14:26:17,169][57108] Avg episode reward: [(0, '0.514')] [2024-04-28 14:26:17,798][57339] Updated weights for policy 0, policy_version 632738 (0.0027) [2024-04-28 14:26:20,085][57339] Updated weights for policy 0, policy_version 632748 (0.0030) [2024-04-28 14:26:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10367008768. Throughput: 0: 55752.1. Samples: 857317200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:26:22,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 14:26:23,662][57339] Updated weights for policy 0, policy_version 632758 (0.0028) [2024-04-28 14:26:25,823][57339] Updated weights for policy 0, policy_version 632768 (0.0027) [2024-04-28 14:26:27,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10367287296. Throughput: 0: 55748.5. Samples: 857653920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:26:27,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 14:26:29,544][57339] Updated weights for policy 0, policy_version 632778 (0.0030) [2024-04-28 14:26:31,646][57339] Updated weights for policy 0, policy_version 632788 (0.0025) [2024-04-28 14:26:32,169][57108] Fps is (10 sec: 58982.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10367598592. Throughput: 0: 55447.6. Samples: 857978820. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:26:32,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:26:35,410][57339] Updated weights for policy 0, policy_version 632798 (0.0027) [2024-04-28 14:26:37,169][57108] Fps is (10 sec: 60620.3, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10367893504. Throughput: 0: 56047.9. Samples: 858171220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:26:37,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 14:26:37,532][57339] Updated weights for policy 0, policy_version 632808 (0.0025) [2024-04-28 14:26:41,211][57339] Updated weights for policy 0, policy_version 632818 (0.0030) [2024-04-28 14:26:42,169][57108] Fps is (10 sec: 55706.2, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10368155648. Throughput: 0: 55972.5. Samples: 858499180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:26:42,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:26:43,153][57319] Signal inference workers to stop experience collection... (12350 times) [2024-04-28 14:26:43,190][57339] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-04-28 14:26:43,241][57319] Signal inference workers to resume experience collection... (12350 times) [2024-04-28 14:26:43,241][57339] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-04-28 14:26:43,347][57339] Updated weights for policy 0, policy_version 632828 (0.0023) [2024-04-28 14:26:47,169][57108] Fps is (10 sec: 50791.2, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10368401408. Throughput: 0: 55955.1. Samples: 858829460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:26:47,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 14:26:47,210][57339] Updated weights for policy 0, policy_version 632838 (0.0026) [2024-04-28 14:26:49,385][57339] Updated weights for policy 0, policy_version 632848 (0.0027) [2024-04-28 14:26:52,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10368663552. Throughput: 0: 55520.9. Samples: 858979640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:26:52,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 14:26:53,015][57339] Updated weights for policy 0, policy_version 632858 (0.0027) [2024-04-28 14:26:55,167][57339] Updated weights for policy 0, policy_version 632868 (0.0034) [2024-04-28 14:26:57,169][57108] Fps is (10 sec: 55703.8, 60 sec: 55159.2, 300 sec: 55761.1). Total num frames: 10368958464. Throughput: 0: 55484.5. Samples: 859315720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:26:57,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 14:26:58,846][57339] Updated weights for policy 0, policy_version 632878 (0.0025) [2024-04-28 14:27:01,582][57339] Updated weights for policy 0, policy_version 632888 (0.0025) [2024-04-28 14:27:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10369236992. Throughput: 0: 55528.5. Samples: 859654380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:02,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 14:27:04,645][57339] Updated weights for policy 0, policy_version 632898 (0.0028) [2024-04-28 14:27:07,169][57108] Fps is (10 sec: 58984.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10369548288. Throughput: 0: 55739.6. Samples: 859825480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:07,170][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 14:27:07,743][57339] Updated weights for policy 0, policy_version 632908 (0.0026) [2024-04-28 14:27:10,540][57339] Updated weights for policy 0, policy_version 632918 (0.0032) [2024-04-28 14:27:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10369810432. Throughput: 0: 55606.2. Samples: 860156200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:12,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 14:27:13,435][57339] Updated weights for policy 0, policy_version 632928 (0.0032) [2024-04-28 14:27:16,372][57339] Updated weights for policy 0, policy_version 632938 (0.0037) [2024-04-28 14:27:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10370105344. Throughput: 0: 55833.4. Samples: 860491320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:17,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 14:27:19,161][57339] Updated weights for policy 0, policy_version 632948 (0.0026) [2024-04-28 14:27:22,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10370367488. Throughput: 0: 55381.6. Samples: 860663380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:22,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 14:27:22,272][57339] Updated weights for policy 0, policy_version 632958 (0.0029) [2024-04-28 14:27:24,956][57339] Updated weights for policy 0, policy_version 632968 (0.0033) [2024-04-28 14:27:27,169][57108] Fps is (10 sec: 50790.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10370613248. Throughput: 0: 55606.0. Samples: 861001460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:27,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:27:27,294][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000632973_10370629632.pth... [2024-04-28 14:27:27,340][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000632158_10357276672.pth [2024-04-28 14:27:27,693][57319] Signal inference workers to stop experience collection... (12400 times) [2024-04-28 14:27:27,730][57339] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-04-28 14:27:27,757][57319] Signal inference workers to resume experience collection... (12400 times) [2024-04-28 14:27:27,759][57339] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-04-28 14:27:28,133][57339] Updated weights for policy 0, policy_version 632978 (0.0032) [2024-04-28 14:27:30,820][57339] Updated weights for policy 0, policy_version 632988 (0.0034) [2024-04-28 14:27:32,169][57108] Fps is (10 sec: 52427.8, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 10370891776. Throughput: 0: 55672.7. Samples: 861334740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:32,170][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 14:27:34,049][57339] Updated weights for policy 0, policy_version 632998 (0.0027) [2024-04-28 14:27:36,679][57339] Updated weights for policy 0, policy_version 633008 (0.0030) [2024-04-28 14:27:37,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 10371203072. Throughput: 0: 55863.7. Samples: 861493520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:37,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 14:27:39,965][57339] Updated weights for policy 0, policy_version 633018 (0.0026) [2024-04-28 14:27:42,169][57108] Fps is (10 sec: 60621.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10371497984. Throughput: 0: 55792.0. 
Samples: 861826340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:42,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 14:27:43,085][57339] Updated weights for policy 0, policy_version 633028 (0.0033) [2024-04-28 14:27:45,749][57339] Updated weights for policy 0, policy_version 633038 (0.0030) [2024-04-28 14:27:47,169][57108] Fps is (10 sec: 57345.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10371776512. Throughput: 0: 55778.6. Samples: 862164420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:47,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 14:27:49,036][57339] Updated weights for policy 0, policy_version 633048 (0.0029) [2024-04-28 14:27:51,455][57339] Updated weights for policy 0, policy_version 633058 (0.0025) [2024-04-28 14:27:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56797.9, 300 sec: 55927.7). Total num frames: 10372071424. Throughput: 0: 55957.3. Samples: 862343560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-04-28 14:27:52,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 14:27:55,125][57339] Updated weights for policy 0, policy_version 633068 (0.0028) [2024-04-28 14:27:57,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 10372317184. Throughput: 0: 56060.1. Samples: 862678900. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:27:57,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 14:27:57,345][57339] Updated weights for policy 0, policy_version 633078 (0.0025) [2024-04-28 14:28:00,829][57339] Updated weights for policy 0, policy_version 633088 (0.0026) [2024-04-28 14:28:02,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10372595712. Throughput: 0: 56037.3. Samples: 863013000. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:02,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 14:28:03,325][57339] Updated weights for policy 0, policy_version 633098 (0.0030) [2024-04-28 14:28:06,538][57339] Updated weights for policy 0, policy_version 633108 (0.0026) [2024-04-28 14:28:07,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10372857856. Throughput: 0: 55832.2. Samples: 863175840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:07,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 14:28:09,152][57339] Updated weights for policy 0, policy_version 633118 (0.0026) [2024-04-28 14:28:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10373136384. Throughput: 0: 55696.1. Samples: 863507780. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:12,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 14:28:12,524][57339] Updated weights for policy 0, policy_version 633128 (0.0026) [2024-04-28 14:28:14,935][57339] Updated weights for policy 0, policy_version 633138 (0.0030) [2024-04-28 14:28:17,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10373464064. Throughput: 0: 55862.2. Samples: 863848540. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:17,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 14:28:18,299][57339] Updated weights for policy 0, policy_version 633148 (0.0028) [2024-04-28 14:28:20,710][57339] Updated weights for policy 0, policy_version 633158 (0.0027) [2024-04-28 14:28:22,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10373709824. Throughput: 0: 56107.1. Samples: 864018320. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:22,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 14:28:24,055][57339] Updated weights for policy 0, policy_version 633168 (0.0031) [2024-04-28 14:28:24,501][57319] Signal inference workers to stop experience collection... (12450 times) [2024-04-28 14:28:24,538][57339] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-04-28 14:28:24,563][57319] Signal inference workers to resume experience collection... (12450 times) [2024-04-28 14:28:24,564][57339] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-04-28 14:28:26,620][57339] Updated weights for policy 0, policy_version 633178 (0.0034) [2024-04-28 14:28:27,169][57108] Fps is (10 sec: 54067.1, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 10374004736. Throughput: 0: 56230.0. Samples: 864356700. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:27,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 14:28:29,925][57339] Updated weights for policy 0, policy_version 633188 (0.0035) [2024-04-28 14:28:32,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 10374283264. Throughput: 0: 56043.6. Samples: 864686380. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:32,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:28:32,457][57339] Updated weights for policy 0, policy_version 633198 (0.0034) [2024-04-28 14:28:35,869][57339] Updated weights for policy 0, policy_version 633208 (0.0032) [2024-04-28 14:28:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 10374561792. Throughput: 0: 55866.2. Samples: 864857540. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:37,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 14:28:38,122][57339] Updated weights for policy 0, policy_version 633218 (0.0036) [2024-04-28 14:28:41,752][57339] Updated weights for policy 0, policy_version 633228 (0.0027) [2024-04-28 14:28:42,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10374856704. Throughput: 0: 55905.2. Samples: 865194640. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:42,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 14:28:44,028][57339] Updated weights for policy 0, policy_version 633238 (0.0029) [2024-04-28 14:28:47,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10375102464. Throughput: 0: 55882.8. Samples: 865527720. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:47,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 14:28:47,561][57339] Updated weights for policy 0, policy_version 633248 (0.0038) [2024-04-28 14:28:50,131][57339] Updated weights for policy 0, policy_version 633258 (0.0027) [2024-04-28 14:28:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 10375397376. Throughput: 0: 56079.1. Samples: 865699400. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:52,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 14:28:53,362][57339] Updated weights for policy 0, policy_version 633268 (0.0025) [2024-04-28 14:28:56,126][57339] Updated weights for policy 0, policy_version 633278 (0.0023) [2024-04-28 14:28:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10375675904. Throughput: 0: 56028.5. Samples: 866029060. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:28:57,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:28:59,154][57339] Updated weights for policy 0, policy_version 633288 (0.0028) [2024-04-28 14:29:01,930][57339] Updated weights for policy 0, policy_version 633298 (0.0034) [2024-04-28 14:29:02,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10375954432. Throughput: 0: 55964.2. Samples: 866366920. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:29:02,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 14:29:04,883][57339] Updated weights for policy 0, policy_version 633308 (0.0027) [2024-04-28 14:29:07,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10376216576. Throughput: 0: 55921.6. Samples: 866534800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:29:07,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 14:29:07,768][57339] Updated weights for policy 0, policy_version 633318 (0.0031) [2024-04-28 14:29:10,766][57339] Updated weights for policy 0, policy_version 633328 (0.0027) [2024-04-28 14:29:12,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10376495104. Throughput: 0: 55815.6. Samples: 866868400. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:29:12,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 14:29:13,613][57339] Updated weights for policy 0, policy_version 633338 (0.0032) [2024-04-28 14:29:16,661][57339] Updated weights for policy 0, policy_version 633348 (0.0031) [2024-04-28 14:29:16,900][57319] Signal inference workers to stop experience collection... (12500 times) [2024-04-28 14:29:16,942][57339] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-04-28 14:29:16,993][57319] Signal inference workers to resume experience collection... 
(12500 times) [2024-04-28 14:29:16,994][57339] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-04-28 14:29:17,169][57108] Fps is (10 sec: 60620.2, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10376822784. Throughput: 0: 55942.6. Samples: 867203800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:29:17,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 14:29:19,350][57339] Updated weights for policy 0, policy_version 633358 (0.0028) [2024-04-28 14:29:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10377068544. Throughput: 0: 55780.8. Samples: 867367680. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:29:22,170][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 14:29:22,526][57339] Updated weights for policy 0, policy_version 633368 (0.0026) [2024-04-28 14:29:25,186][57339] Updated weights for policy 0, policy_version 633378 (0.0029) [2024-04-28 14:29:27,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10377347072. Throughput: 0: 55810.6. Samples: 867706120. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-04-28 14:29:27,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 14:29:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000633383_10377347072.pth... [2024-04-28 14:29:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000632567_10363977728.pth [2024-04-28 14:29:28,189][57339] Updated weights for policy 0, policy_version 633388 (0.0029) [2024-04-28 14:29:31,144][57339] Updated weights for policy 0, policy_version 633398 (0.0034) [2024-04-28 14:29:32,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10377625600. Throughput: 0: 55763.6. Samples: 868037080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:29:32,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 14:29:33,914][57339] Updated weights for policy 0, policy_version 633408 (0.0027) [2024-04-28 14:29:37,056][57339] Updated weights for policy 0, policy_version 633418 (0.0025) [2024-04-28 14:29:37,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10377920512. Throughput: 0: 55735.3. Samples: 868207480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:29:37,169][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 14:29:39,813][57339] Updated weights for policy 0, policy_version 633428 (0.0027) [2024-04-28 14:29:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10378182656. Throughput: 0: 55823.6. Samples: 868541120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:29:42,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:29:43,108][57339] Updated weights for policy 0, policy_version 633438 (0.0027) [2024-04-28 14:29:45,620][57339] Updated weights for policy 0, policy_version 633448 (0.0035) [2024-04-28 14:29:47,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10378461184. Throughput: 0: 55869.7. Samples: 868881060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:29:47,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 14:29:48,780][57339] Updated weights for policy 0, policy_version 633458 (0.0028) [2024-04-28 14:29:51,619][57339] Updated weights for policy 0, policy_version 633468 (0.0026) [2024-04-28 14:29:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 10378756096. Throughput: 0: 56068.8. Samples: 869057900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:29:52,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 14:29:54,577][57339] Updated weights for policy 0, policy_version 633478 (0.0031) [2024-04-28 14:29:57,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10379034624. Throughput: 0: 55920.0. Samples: 869384800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:29:57,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 14:29:57,509][57339] Updated weights for policy 0, policy_version 633488 (0.0024) [2024-04-28 14:30:00,563][57339] Updated weights for policy 0, policy_version 633498 (0.0033) [2024-04-28 14:30:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10379296768. Throughput: 0: 55876.6. Samples: 869718240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:02,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 14:30:03,434][57339] Updated weights for policy 0, policy_version 633508 (0.0027) [2024-04-28 14:30:06,238][57339] Updated weights for policy 0, policy_version 633518 (0.0027) [2024-04-28 14:30:07,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10379591680. Throughput: 0: 56096.6. Samples: 869892020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:07,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 14:30:09,339][57339] Updated weights for policy 0, policy_version 633528 (0.0033) [2024-04-28 14:30:11,969][57339] Updated weights for policy 0, policy_version 633538 (0.0026) [2024-04-28 14:30:12,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10379886592. Throughput: 0: 56064.0. Samples: 870229000. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:12,169][57108] Avg episode reward: [(0, '0.502')] [2024-04-28 14:30:15,286][57339] Updated weights for policy 0, policy_version 633548 (0.0030) [2024-04-28 14:30:17,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10380132352. Throughput: 0: 56165.6. Samples: 870564540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:17,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 14:30:17,299][57319] Signal inference workers to stop experience collection... (12550 times) [2024-04-28 14:30:17,299][57319] Signal inference workers to resume experience collection... (12550 times) [2024-04-28 14:30:17,330][57339] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-04-28 14:30:17,330][57339] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-04-28 14:30:17,904][57339] Updated weights for policy 0, policy_version 633558 (0.0026) [2024-04-28 14:30:21,033][57339] Updated weights for policy 0, policy_version 633568 (0.0029) [2024-04-28 14:30:22,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10380410880. Throughput: 0: 55964.0. Samples: 870725860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:22,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 14:30:24,132][57339] Updated weights for policy 0, policy_version 633578 (0.0028) [2024-04-28 14:30:26,796][57339] Updated weights for policy 0, policy_version 633588 (0.0030) [2024-04-28 14:30:27,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10380705792. Throughput: 0: 55992.9. Samples: 871060800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:27,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 14:30:30,012][57339] Updated weights for policy 0, policy_version 633598 (0.0035) [2024-04-28 14:30:32,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10380984320. Throughput: 0: 55913.3. Samples: 871397160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:32,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 14:30:32,697][57339] Updated weights for policy 0, policy_version 633608 (0.0029) [2024-04-28 14:30:35,818][57339] Updated weights for policy 0, policy_version 633618 (0.0026) [2024-04-28 14:30:37,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 55983.3). Total num frames: 10381279232. Throughput: 0: 55820.4. Samples: 871569820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:37,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 14:30:38,387][57339] Updated weights for policy 0, policy_version 633628 (0.0022) [2024-04-28 14:30:41,644][57339] Updated weights for policy 0, policy_version 633638 (0.0027) [2024-04-28 14:30:42,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10381574144. Throughput: 0: 55952.1. Samples: 871902640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:42,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:30:44,271][57339] Updated weights for policy 0, policy_version 633648 (0.0027) [2024-04-28 14:30:47,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10381819904. Throughput: 0: 56035.3. Samples: 872239840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:47,169][57108] Avg episode reward: [(0, '0.511')] [2024-04-28 14:30:47,421][57339] Updated weights for policy 0, policy_version 633658 (0.0027) [2024-04-28 14:30:50,465][57339] Updated weights for policy 0, policy_version 633668 (0.0030) [2024-04-28 14:30:52,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10382114816. Throughput: 0: 56076.3. Samples: 872415460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:52,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 14:30:53,281][57339] Updated weights for policy 0, policy_version 633678 (0.0029) [2024-04-28 14:30:56,689][57339] Updated weights for policy 0, policy_version 633688 (0.0026) [2024-04-28 14:30:57,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10382393344. Throughput: 0: 55866.4. Samples: 872742980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:30:57,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 14:30:59,000][57339] Updated weights for policy 0, policy_version 633698 (0.0040) [2024-04-28 14:31:02,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10382639104. Throughput: 0: 55707.7. Samples: 873071380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:31:02,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 14:31:02,532][57339] Updated weights for policy 0, policy_version 633708 (0.0028) [2024-04-28 14:31:04,916][57339] Updated weights for policy 0, policy_version 633718 (0.0029) [2024-04-28 14:31:07,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10382934016. Throughput: 0: 55868.0. Samples: 873239920. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:07,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 14:31:08,314][57339] Updated weights for policy 0, policy_version 633728 (0.0027) [2024-04-28 14:31:09,757][57319] Signal inference workers to stop experience collection... (12600 times) [2024-04-28 14:31:09,763][57319] Signal inference workers to resume experience collection... (12600 times) [2024-04-28 14:31:09,777][57339] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-04-28 14:31:09,777][57339] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-04-28 14:31:10,924][57339] Updated weights for policy 0, policy_version 633738 (0.0031) [2024-04-28 14:31:12,169][57108] Fps is (10 sec: 60621.2, 60 sec: 55978.8, 300 sec: 56038.8). Total num frames: 10383245312. Throughput: 0: 55929.8. Samples: 873577640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:12,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 14:31:14,070][57339] Updated weights for policy 0, policy_version 633748 (0.0030) [2024-04-28 14:31:16,908][57339] Updated weights for policy 0, policy_version 633758 (0.0033) [2024-04-28 14:31:17,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10383523840. Throughput: 0: 55883.0. Samples: 873911900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:17,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 14:31:20,009][57339] Updated weights for policy 0, policy_version 633768 (0.0026) [2024-04-28 14:31:22,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10383769600. Throughput: 0: 55756.1. Samples: 874078840. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:22,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 14:31:22,687][57339] Updated weights for policy 0, policy_version 633778 (0.0033) [2024-04-28 14:31:26,039][57339] Updated weights for policy 0, policy_version 633788 (0.0027) [2024-04-28 14:31:27,169][57108] Fps is (10 sec: 50790.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10384031744. Throughput: 0: 55701.3. Samples: 874409200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:27,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 14:31:27,188][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000633792_10384048128.pth... [2024-04-28 14:31:27,246][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000632973_10370629632.pth [2024-04-28 14:31:28,464][57339] Updated weights for policy 0, policy_version 633798 (0.0026) [2024-04-28 14:31:31,798][57339] Updated weights for policy 0, policy_version 633808 (0.0027) [2024-04-28 14:31:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10384310272. Throughput: 0: 55586.1. Samples: 874741200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:32,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 14:31:34,389][57339] Updated weights for policy 0, policy_version 633818 (0.0027) [2024-04-28 14:31:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 10384605184. Throughput: 0: 55224.1. Samples: 874900540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:37,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 14:31:38,038][57339] Updated weights for policy 0, policy_version 633828 (0.0032) [2024-04-28 14:31:40,126][57339] Updated weights for policy 0, policy_version 633838 (0.0030) [2024-04-28 14:31:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55872.2). 
Total num frames: 10384883712. Throughput: 0: 55361.8. Samples: 875234260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:42,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 14:31:43,783][57339] Updated weights for policy 0, policy_version 633848 (0.0029) [2024-04-28 14:31:45,918][57339] Updated weights for policy 0, policy_version 633858 (0.0027) [2024-04-28 14:31:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.9, 300 sec: 55983.3). Total num frames: 10385178624. Throughput: 0: 55410.8. Samples: 875564860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:47,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 14:31:49,766][57339] Updated weights for policy 0, policy_version 633868 (0.0028) [2024-04-28 14:31:51,894][57339] Updated weights for policy 0, policy_version 633878 (0.0029) [2024-04-28 14:31:52,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55978.8, 300 sec: 55983.4). Total num frames: 10385473536. Throughput: 0: 55775.5. Samples: 875749820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:52,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 14:31:55,598][57339] Updated weights for policy 0, policy_version 633888 (0.0026) [2024-04-28 14:31:57,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10385719296. Throughput: 0: 55701.8. Samples: 876084220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:31:57,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 14:31:57,763][57339] Updated weights for policy 0, policy_version 633898 (0.0032) [2024-04-28 14:32:01,483][57339] Updated weights for policy 0, policy_version 633908 (0.0024) [2024-04-28 14:32:02,169][57108] Fps is (10 sec: 49151.5, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10385965056. Throughput: 0: 55588.4. Samples: 876413380. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:32:02,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 14:32:03,041][57319] Signal inference workers to stop experience collection... (12650 times) [2024-04-28 14:32:03,064][57339] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-04-28 14:32:03,131][57319] Signal inference workers to resume experience collection... (12650 times) [2024-04-28 14:32:03,131][57339] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-04-28 14:32:03,553][57339] Updated weights for policy 0, policy_version 633918 (0.0028) [2024-04-28 14:32:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10386259968. Throughput: 0: 55354.3. Samples: 876569780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:32:07,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 14:32:07,259][57339] Updated weights for policy 0, policy_version 633928 (0.0035) [2024-04-28 14:32:09,373][57339] Updated weights for policy 0, policy_version 633938 (0.0031) [2024-04-28 14:32:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10386538496. Throughput: 0: 55424.8. Samples: 876903320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:32:12,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 14:32:13,057][57339] Updated weights for policy 0, policy_version 633948 (0.0028) [2024-04-28 14:32:15,410][57339] Updated weights for policy 0, policy_version 633958 (0.0033) [2024-04-28 14:32:17,169][57108] Fps is (10 sec: 55704.5, 60 sec: 54886.3, 300 sec: 55761.1). Total num frames: 10386817024. Throughput: 0: 55404.7. Samples: 877234420. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:32:17,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 14:32:18,907][57339] Updated weights for policy 0, policy_version 633968 (0.0036) [2024-04-28 14:32:21,262][57339] Updated weights for policy 0, policy_version 633978 (0.0025) [2024-04-28 14:32:22,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10387128320. Throughput: 0: 55875.1. Samples: 877414920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:32:22,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 14:32:24,896][57339] Updated weights for policy 0, policy_version 633988 (0.0034) [2024-04-28 14:32:27,142][57339] Updated weights for policy 0, policy_version 633998 (0.0030) [2024-04-28 14:32:27,169][57108] Fps is (10 sec: 60621.1, 60 sec: 56524.7, 300 sec: 56038.8). Total num frames: 10387423232. Throughput: 0: 55867.9. Samples: 877748320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:32:27,170][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 14:32:30,950][57339] Updated weights for policy 0, policy_version 634008 (0.0025) [2024-04-28 14:32:32,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10387668992. Throughput: 0: 55936.8. Samples: 878082020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:32:32,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 14:32:33,106][57339] Updated weights for policy 0, policy_version 634018 (0.0030) [2024-04-28 14:32:36,969][57339] Updated weights for policy 0, policy_version 634028 (0.0031) [2024-04-28 14:32:37,169][57108] Fps is (10 sec: 49152.4, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10387914752. Throughput: 0: 55314.2. Samples: 878238960. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:32:37,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 14:32:38,997][57339] Updated weights for policy 0, policy_version 634038 (0.0028) [2024-04-28 14:32:42,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10388209664. Throughput: 0: 55386.6. Samples: 878576620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-04-28 14:32:42,169][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 14:32:42,817][57339] Updated weights for policy 0, policy_version 634048 (0.0027) [2024-04-28 14:32:44,776][57339] Updated weights for policy 0, policy_version 634058 (0.0028) [2024-04-28 14:32:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10388488192. Throughput: 0: 55509.9. Samples: 878911320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:32:47,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 14:32:48,643][57339] Updated weights for policy 0, policy_version 634068 (0.0026) [2024-04-28 14:32:50,689][57339] Updated weights for policy 0, policy_version 634078 (0.0030) [2024-04-28 14:32:52,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 10388783104. Throughput: 0: 55782.6. Samples: 879080000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:32:52,169][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 14:32:54,350][57339] Updated weights for policy 0, policy_version 634088 (0.0025) [2024-04-28 14:32:56,625][57339] Updated weights for policy 0, policy_version 634098 (0.0032) [2024-04-28 14:32:57,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10389078016. Throughput: 0: 55730.7. Samples: 879411200. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:32:57,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 14:33:00,010][57339] Updated weights for policy 0, policy_version 634108 (0.0027) [2024-04-28 14:33:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 10389356544. Throughput: 0: 55900.1. Samples: 879749920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:02,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 14:33:02,373][57339] Updated weights for policy 0, policy_version 634118 (0.0027) [2024-04-28 14:33:05,477][57319] Signal inference workers to stop experience collection... (12700 times) [2024-04-28 14:33:05,478][57319] Signal inference workers to resume experience collection... (12700 times) [2024-04-28 14:33:05,497][57339] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-04-28 14:33:05,497][57339] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-04-28 14:33:05,826][57339] Updated weights for policy 0, policy_version 634128 (0.0028) [2024-04-28 14:33:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 10389635072. Throughput: 0: 55817.7. Samples: 879926720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:07,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:33:08,106][57339] Updated weights for policy 0, policy_version 634138 (0.0025) [2024-04-28 14:33:11,787][57339] Updated weights for policy 0, policy_version 634148 (0.0026) [2024-04-28 14:33:12,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10389897216. Throughput: 0: 56109.4. Samples: 880273240. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:12,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 14:33:13,906][57339] Updated weights for policy 0, policy_version 634158 (0.0034) [2024-04-28 14:33:17,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10390159360. Throughput: 0: 56057.7. Samples: 880604620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:17,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 14:33:17,644][57339] Updated weights for policy 0, policy_version 634168 (0.0030) [2024-04-28 14:33:19,942][57339] Updated weights for policy 0, policy_version 634178 (0.0030) [2024-04-28 14:33:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10390437888. Throughput: 0: 55810.2. Samples: 880750420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:22,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 14:33:23,472][57339] Updated weights for policy 0, policy_version 634188 (0.0030) [2024-04-28 14:33:25,788][57339] Updated weights for policy 0, policy_version 634198 (0.0026) [2024-04-28 14:33:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10390732800. Throughput: 0: 55893.7. Samples: 881091840. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:27,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 14:33:27,253][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000634201_10390749184.pth... [2024-04-28 14:33:27,301][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000633383_10377347072.pth [2024-04-28 14:33:29,294][57339] Updated weights for policy 0, policy_version 634208 (0.0026) [2024-04-28 14:33:31,656][57339] Updated weights for policy 0, policy_version 634218 (0.0032) [2024-04-28 14:33:32,169][57108] Fps is (10 sec: 60620.3, 60 sec: 56251.7, 300 sec: 55872.2). 
Total num frames: 10391044096. Throughput: 0: 55936.8. Samples: 881428480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:32,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 14:33:35,011][57339] Updated weights for policy 0, policy_version 634228 (0.0029) [2024-04-28 14:33:37,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56797.8, 300 sec: 55816.7). Total num frames: 10391322624. Throughput: 0: 56208.9. Samples: 881609400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:37,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 14:33:37,555][57339] Updated weights for policy 0, policy_version 634238 (0.0029) [2024-04-28 14:33:40,881][57339] Updated weights for policy 0, policy_version 634248 (0.0027) [2024-04-28 14:33:42,169][57108] Fps is (10 sec: 54067.8, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10391584768. Throughput: 0: 56149.8. Samples: 881937940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:33:43,493][57339] Updated weights for policy 0, policy_version 634258 (0.0029) [2024-04-28 14:33:46,759][57339] Updated weights for policy 0, policy_version 634268 (0.0026) [2024-04-28 14:33:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10391863296. Throughput: 0: 56036.0. Samples: 882271540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:47,169][57108] Avg episode reward: [(0, '0.708')] [2024-04-28 14:33:49,485][57339] Updated weights for policy 0, policy_version 634278 (0.0026) [2024-04-28 14:33:52,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10392125440. Throughput: 0: 55774.8. Samples: 882436580. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:52,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 14:33:52,525][57339] Updated weights for policy 0, policy_version 634288 (0.0026) [2024-04-28 14:33:55,238][57339] Updated weights for policy 0, policy_version 634298 (0.0026) [2024-04-28 14:33:57,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10392387584. Throughput: 0: 55662.7. Samples: 882778060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:33:57,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 14:33:57,832][57319] Signal inference workers to stop experience collection... (12750 times) [2024-04-28 14:33:57,833][57319] Signal inference workers to resume experience collection... (12750 times) [2024-04-28 14:33:57,844][57339] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-04-28 14:33:57,844][57339] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-04-28 14:33:58,356][57339] Updated weights for policy 0, policy_version 634308 (0.0028) [2024-04-28 14:34:01,021][57339] Updated weights for policy 0, policy_version 634318 (0.0025) [2024-04-28 14:34:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10392682496. Throughput: 0: 55861.4. Samples: 883118380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:34:02,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 14:34:04,215][57339] Updated weights for policy 0, policy_version 634328 (0.0030) [2024-04-28 14:34:06,965][57339] Updated weights for policy 0, policy_version 634338 (0.0029) [2024-04-28 14:34:07,169][57108] Fps is (10 sec: 60620.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10392993792. Throughput: 0: 56263.9. Samples: 883282300. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:34:07,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 14:34:10,168][57339] Updated weights for policy 0, policy_version 634348 (0.0029) [2024-04-28 14:34:12,169][57108] Fps is (10 sec: 60620.6, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 10393288704. Throughput: 0: 56054.7. Samples: 883614300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:34:12,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 14:34:12,896][57339] Updated weights for policy 0, policy_version 634358 (0.0032) [2024-04-28 14:34:15,824][57339] Updated weights for policy 0, policy_version 634368 (0.0026) [2024-04-28 14:34:17,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56797.9, 300 sec: 55927.8). Total num frames: 10393567232. Throughput: 0: 56204.1. Samples: 883957660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-04-28 14:34:17,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 14:34:18,939][57339] Updated weights for policy 0, policy_version 634378 (0.0026) [2024-04-28 14:34:21,715][57339] Updated weights for policy 0, policy_version 634388 (0.0024) [2024-04-28 14:34:22,169][57108] Fps is (10 sec: 54066.7, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 10393829376. Throughput: 0: 56002.5. Samples: 884129520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:34:22,170][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 14:34:24,849][57339] Updated weights for policy 0, policy_version 634398 (0.0027) [2024-04-28 14:34:27,169][57108] Fps is (10 sec: 54067.1, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10394107904. Throughput: 0: 56222.2. Samples: 884467940. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:34:27,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 14:34:27,644][57339] Updated weights for policy 0, policy_version 634408 (0.0033) [2024-04-28 14:34:30,757][57339] Updated weights for policy 0, policy_version 634418 (0.0027) [2024-04-28 14:34:32,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 10394370048. Throughput: 0: 56307.2. Samples: 884805360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:34:32,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 14:34:33,346][57339] Updated weights for policy 0, policy_version 634428 (0.0025) [2024-04-28 14:34:36,459][57339] Updated weights for policy 0, policy_version 634438 (0.0030) [2024-04-28 14:34:37,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10394632192. Throughput: 0: 56064.9. Samples: 884959500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:34:37,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:34:39,324][57339] Updated weights for policy 0, policy_version 634448 (0.0028) [2024-04-28 14:34:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10394943488. Throughput: 0: 55889.8. Samples: 885293100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:34:42,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 14:34:42,206][57339] Updated weights for policy 0, policy_version 634458 (0.0028) [2024-04-28 14:34:45,398][57339] Updated weights for policy 0, policy_version 634468 (0.0029) [2024-04-28 14:34:47,169][57108] Fps is (10 sec: 60619.9, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10395238400. Throughput: 0: 55758.1. Samples: 885627500. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:34:47,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:34:48,115][57339] Updated weights for policy 0, policy_version 634478 (0.0031) [2024-04-28 14:34:51,184][57339] Updated weights for policy 0, policy_version 634488 (0.0029) [2024-04-28 14:34:52,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56797.8, 300 sec: 55927.8). Total num frames: 10395533312. Throughput: 0: 56141.3. Samples: 885808660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:34:52,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 14:34:54,062][57339] Updated weights for policy 0, policy_version 634498 (0.0026) [2024-04-28 14:34:56,992][57339] Updated weights for policy 0, policy_version 634508 (0.0033) [2024-04-28 14:34:57,169][57108] Fps is (10 sec: 55706.3, 60 sec: 56797.8, 300 sec: 55927.8). Total num frames: 10395795456. Throughput: 0: 56060.9. Samples: 886137040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:34:57,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 14:34:59,617][57319] Signal inference workers to stop experience collection... (12800 times) [2024-04-28 14:34:59,637][57339] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-04-28 14:34:59,677][57319] Signal inference workers to resume experience collection... (12800 times) [2024-04-28 14:34:59,677][57339] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-04-28 14:34:59,789][57339] Updated weights for policy 0, policy_version 634518 (0.0026) [2024-04-28 14:35:02,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10396041216. Throughput: 0: 55825.4. Samples: 886469800. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:02,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 14:35:02,846][57339] Updated weights for policy 0, policy_version 634528 (0.0028) [2024-04-28 14:35:05,951][57339] Updated weights for policy 0, policy_version 634538 (0.0034) [2024-04-28 14:35:07,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10396336128. Throughput: 0: 55716.0. Samples: 886636740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:07,170][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 14:35:08,866][57339] Updated weights for policy 0, policy_version 634548 (0.0027) [2024-04-28 14:35:12,037][57339] Updated weights for policy 0, policy_version 634558 (0.0026) [2024-04-28 14:35:12,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 10396598272. Throughput: 0: 55630.3. Samples: 886971300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:35:14,773][57339] Updated weights for policy 0, policy_version 634568 (0.0027) [2024-04-28 14:35:17,169][57108] Fps is (10 sec: 52429.8, 60 sec: 54886.5, 300 sec: 55761.1). Total num frames: 10396860416. Throughput: 0: 55525.4. Samples: 887304000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:17,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 14:35:17,826][57339] Updated weights for policy 0, policy_version 634578 (0.0026) [2024-04-28 14:35:20,727][57339] Updated weights for policy 0, policy_version 634588 (0.0033) [2024-04-28 14:35:22,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10397188096. Throughput: 0: 55797.2. Samples: 887470380. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:22,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:35:23,884][57339] Updated weights for policy 0, policy_version 634598 (0.0030) [2024-04-28 14:35:26,547][57339] Updated weights for policy 0, policy_version 634608 (0.0033) [2024-04-28 14:35:27,169][57108] Fps is (10 sec: 62258.6, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10397483008. Throughput: 0: 55907.9. Samples: 887808960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:27,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 14:35:27,275][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000634613_10397499392.pth... [2024-04-28 14:35:27,321][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000633792_10384048128.pth [2024-04-28 14:35:29,546][57339] Updated weights for policy 0, policy_version 634618 (0.0033) [2024-04-28 14:35:32,169][57108] Fps is (10 sec: 52429.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10397712384. Throughput: 0: 55939.4. Samples: 888144760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:32,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 14:35:32,330][57339] Updated weights for policy 0, policy_version 634628 (0.0027) [2024-04-28 14:35:35,366][57339] Updated weights for policy 0, policy_version 634638 (0.0027) [2024-04-28 14:35:37,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10397990912. Throughput: 0: 55696.0. Samples: 888314980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:37,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 14:35:38,335][57339] Updated weights for policy 0, policy_version 634648 (0.0028) [2024-04-28 14:35:41,167][57339] Updated weights for policy 0, policy_version 634658 (0.0025) [2024-04-28 14:35:42,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55978.5, 300 sec: 55872.2). 
Total num frames: 10398302208. Throughput: 0: 55841.7. Samples: 888649920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:42,170][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 14:35:44,155][57339] Updated weights for policy 0, policy_version 634668 (0.0031) [2024-04-28 14:35:46,462][57319] Signal inference workers to stop experience collection... (12850 times) [2024-04-28 14:35:46,504][57339] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-04-28 14:35:46,554][57319] Signal inference workers to resume experience collection... (12850 times) [2024-04-28 14:35:46,554][57339] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-04-28 14:35:46,971][57339] Updated weights for policy 0, policy_version 634678 (0.0025) [2024-04-28 14:35:47,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 10398564352. Throughput: 0: 55774.2. Samples: 888979640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:47,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 14:35:49,965][57339] Updated weights for policy 0, policy_version 634688 (0.0030) [2024-04-28 14:35:52,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 10398842880. Throughput: 0: 55959.5. Samples: 889154920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 14:35:52,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 14:35:52,833][57339] Updated weights for policy 0, policy_version 634698 (0.0035) [2024-04-28 14:35:55,816][57339] Updated weights for policy 0, policy_version 634708 (0.0026) [2024-04-28 14:35:57,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 10399137792. Throughput: 0: 55896.7. Samples: 889486660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:35:57,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 14:35:58,550][57339] Updated weights for policy 0, policy_version 634718 (0.0028) [2024-04-28 14:36:01,640][57339] Updated weights for policy 0, policy_version 634728 (0.0027) [2024-04-28 14:36:02,169][57108] Fps is (10 sec: 58983.4, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10399432704. Throughput: 0: 55857.2. Samples: 889817580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:02,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 14:36:04,472][57339] Updated weights for policy 0, policy_version 634738 (0.0030) [2024-04-28 14:36:07,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10399662080. Throughput: 0: 55725.9. Samples: 889978040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:07,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 14:36:07,461][57339] Updated weights for policy 0, policy_version 634748 (0.0024) [2024-04-28 14:36:10,296][57339] Updated weights for policy 0, policy_version 634758 (0.0032) [2024-04-28 14:36:12,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10399956992. Throughput: 0: 55755.5. Samples: 890317960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:12,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:36:13,348][57339] Updated weights for policy 0, policy_version 634768 (0.0034) [2024-04-28 14:36:15,977][57339] Updated weights for policy 0, policy_version 634778 (0.0029) [2024-04-28 14:36:17,169][57108] Fps is (10 sec: 58981.9, 60 sec: 56524.6, 300 sec: 55872.2). Total num frames: 10400251904. Throughput: 0: 55762.0. Samples: 890654060. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:17,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 14:36:19,263][57339] Updated weights for policy 0, policy_version 634788 (0.0029) [2024-04-28 14:36:21,822][57339] Updated weights for policy 0, policy_version 634798 (0.0024) [2024-04-28 14:36:22,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55927.7). Total num frames: 10400530432. Throughput: 0: 55669.4. Samples: 890820100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:22,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 14:36:25,084][57339] Updated weights for policy 0, policy_version 634808 (0.0031) [2024-04-28 14:36:27,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 10400792576. Throughput: 0: 55610.3. Samples: 891152380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:27,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 14:36:27,762][57319] Signal inference workers to stop experience collection... (12900 times) [2024-04-28 14:36:27,807][57339] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-04-28 14:36:27,827][57319] Signal inference workers to resume experience collection... (12900 times) [2024-04-28 14:36:27,833][57339] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-04-28 14:36:27,835][57339] Updated weights for policy 0, policy_version 634818 (0.0027) [2024-04-28 14:36:30,989][57339] Updated weights for policy 0, policy_version 634828 (0.0027) [2024-04-28 14:36:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10401087488. Throughput: 0: 55726.1. Samples: 891487320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:32,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 14:36:33,512][57339] Updated weights for policy 0, policy_version 634838 (0.0027) [2024-04-28 14:36:36,925][57339] Updated weights for policy 0, policy_version 634848 (0.0031) [2024-04-28 14:36:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10401366016. Throughput: 0: 55572.6. Samples: 891655680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:37,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 14:36:39,480][57339] Updated weights for policy 0, policy_version 634858 (0.0031) [2024-04-28 14:36:42,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10401611776. Throughput: 0: 55622.0. Samples: 891989640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:42,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 14:36:42,880][57339] Updated weights for policy 0, policy_version 634868 (0.0030) [2024-04-28 14:36:45,423][57339] Updated weights for policy 0, policy_version 634878 (0.0025) [2024-04-28 14:36:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10401923072. Throughput: 0: 55618.6. Samples: 892320420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:47,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 14:36:48,693][57339] Updated weights for policy 0, policy_version 634888 (0.0024) [2024-04-28 14:36:51,455][57339] Updated weights for policy 0, policy_version 634898 (0.0023) [2024-04-28 14:36:52,169][57108] Fps is (10 sec: 60620.4, 60 sec: 56251.9, 300 sec: 55927.7). Total num frames: 10402217984. Throughput: 0: 55775.9. Samples: 892487960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:52,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 14:36:54,593][57339] Updated weights for policy 0, policy_version 634908 (0.0028) [2024-04-28 14:36:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 10402480128. Throughput: 0: 55689.8. Samples: 892824000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:36:57,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 14:36:57,447][57339] Updated weights for policy 0, policy_version 634918 (0.0027) [2024-04-28 14:37:00,551][57339] Updated weights for policy 0, policy_version 634928 (0.0029) [2024-04-28 14:37:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55927.7). Total num frames: 10402758656. Throughput: 0: 55684.9. Samples: 893159880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:37:02,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 14:37:03,222][57339] Updated weights for policy 0, policy_version 634938 (0.0028) [2024-04-28 14:37:06,427][57339] Updated weights for policy 0, policy_version 634948 (0.0031) [2024-04-28 14:37:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 10403037184. Throughput: 0: 55650.1. Samples: 893324360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:37:07,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 14:37:08,963][57339] Updated weights for policy 0, policy_version 634958 (0.0029) [2024-04-28 14:37:09,562][57319] Signal inference workers to stop experience collection... (12950 times) [2024-04-28 14:37:09,563][57319] Signal inference workers to resume experience collection... 
(12950 times) [2024-04-28 14:37:09,577][57339] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-04-28 14:37:09,577][57339] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-04-28 14:37:12,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10403299328. Throughput: 0: 55760.0. Samples: 893661580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:37:12,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 14:37:12,267][57339] Updated weights for policy 0, policy_version 634968 (0.0035) [2024-04-28 14:37:14,781][57339] Updated weights for policy 0, policy_version 634978 (0.0027) [2024-04-28 14:37:17,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10403561472. Throughput: 0: 55847.2. Samples: 894000440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:37:17,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 14:37:18,079][57339] Updated weights for policy 0, policy_version 634988 (0.0026) [2024-04-28 14:37:20,777][57339] Updated weights for policy 0, policy_version 634998 (0.0032) [2024-04-28 14:37:22,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10403856384. Throughput: 0: 55596.4. Samples: 894157520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:37:22,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 14:37:24,007][57339] Updated weights for policy 0, policy_version 635008 (0.0028) [2024-04-28 14:37:26,802][57339] Updated weights for policy 0, policy_version 635018 (0.0030) [2024-04-28 14:37:27,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10404134912. Throughput: 0: 55535.2. Samples: 894488720. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 14:37:27,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 14:37:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000635019_10404151296.pth... [2024-04-28 14:37:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000634201_10390749184.pth [2024-04-28 14:37:29,854][57339] Updated weights for policy 0, policy_version 635028 (0.0036) [2024-04-28 14:37:32,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 10404446208. Throughput: 0: 55666.7. Samples: 894825420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:37:32,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 14:37:32,633][57339] Updated weights for policy 0, policy_version 635038 (0.0031) [2024-04-28 14:37:35,847][57339] Updated weights for policy 0, policy_version 635048 (0.0031) [2024-04-28 14:37:37,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10404708352. Throughput: 0: 55998.2. Samples: 895007880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:37:37,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 14:37:38,584][57339] Updated weights for policy 0, policy_version 635058 (0.0027) [2024-04-28 14:37:41,566][57339] Updated weights for policy 0, policy_version 635068 (0.0025) [2024-04-28 14:37:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 10405003264. Throughput: 0: 55944.5. Samples: 895341500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:37:42,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 14:37:44,431][57339] Updated weights for policy 0, policy_version 635078 (0.0031) [2024-04-28 14:37:47,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10405249024. Throughput: 0: 55929.7. Samples: 895676720. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:37:47,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 14:37:47,514][57339] Updated weights for policy 0, policy_version 635088 (0.0024) [2024-04-28 14:37:50,408][57339] Updated weights for policy 0, policy_version 635098 (0.0041) [2024-04-28 14:37:52,169][57108] Fps is (10 sec: 47514.1, 60 sec: 54340.3, 300 sec: 55594.5). Total num frames: 10405478400. Throughput: 0: 55697.5. Samples: 895830740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:37:52,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 14:37:53,305][57339] Updated weights for policy 0, policy_version 635108 (0.0034) [2024-04-28 14:37:56,085][57339] Updated weights for policy 0, policy_version 635118 (0.0033) [2024-04-28 14:37:57,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10405789696. Throughput: 0: 55695.5. Samples: 896167880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:37:57,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 14:37:59,199][57339] Updated weights for policy 0, policy_version 635128 (0.0029) [2024-04-28 14:38:01,800][57339] Updated weights for policy 0, policy_version 635138 (0.0025) [2024-04-28 14:38:02,169][57108] Fps is (10 sec: 62258.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10406100992. Throughput: 0: 55625.3. Samples: 896503580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:02,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 14:38:04,834][57319] Signal inference workers to stop experience collection... (13000 times) [2024-04-28 14:38:04,836][57319] Signal inference workers to resume experience collection... 
(13000 times) [2024-04-28 14:38:04,850][57339] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-04-28 14:38:04,850][57339] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-04-28 14:38:04,951][57339] Updated weights for policy 0, policy_version 635148 (0.0027) [2024-04-28 14:38:07,169][57108] Fps is (10 sec: 60620.9, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 10406395904. Throughput: 0: 56049.0. Samples: 896679720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:07,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 14:38:07,890][57339] Updated weights for policy 0, policy_version 635158 (0.0027) [2024-04-28 14:38:10,705][57339] Updated weights for policy 0, policy_version 635168 (0.0033) [2024-04-28 14:38:12,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56524.8, 300 sec: 56038.8). Total num frames: 10406690816. Throughput: 0: 56114.6. Samples: 897013880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:12,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 14:38:13,855][57339] Updated weights for policy 0, policy_version 635178 (0.0027) [2024-04-28 14:38:16,562][57339] Updated weights for policy 0, policy_version 635188 (0.0028) [2024-04-28 14:38:17,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56797.8, 300 sec: 56038.8). Total num frames: 10406969344. Throughput: 0: 55982.6. Samples: 897344640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:17,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 14:38:19,566][57339] Updated weights for policy 0, policy_version 635198 (0.0025) [2024-04-28 14:38:22,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10407215104. Throughput: 0: 55763.5. Samples: 897517240. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:22,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 14:38:22,485][57339] Updated weights for policy 0, policy_version 635208 (0.0028) [2024-04-28 14:38:25,809][57339] Updated weights for policy 0, policy_version 635218 (0.0030) [2024-04-28 14:38:27,169][57108] Fps is (10 sec: 49151.6, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 10407460864. Throughput: 0: 56003.0. Samples: 897861640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:27,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 14:38:28,285][57339] Updated weights for policy 0, policy_version 635228 (0.0029) [2024-04-28 14:38:31,568][57339] Updated weights for policy 0, policy_version 635238 (0.0032) [2024-04-28 14:38:32,169][57108] Fps is (10 sec: 52429.1, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 10407739392. Throughput: 0: 55989.1. Samples: 898196220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:32,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:38:34,045][57339] Updated weights for policy 0, policy_version 635248 (0.0031) [2024-04-28 14:38:37,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10408050688. Throughput: 0: 55970.1. Samples: 898349400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:37,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 14:38:37,567][57339] Updated weights for policy 0, policy_version 635258 (0.0031) [2024-04-28 14:38:39,958][57339] Updated weights for policy 0, policy_version 635268 (0.0021) [2024-04-28 14:38:42,169][57108] Fps is (10 sec: 62258.7, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10408361984. Throughput: 0: 55904.4. Samples: 898683580. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:42,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 14:38:43,622][57339] Updated weights for policy 0, policy_version 635278 (0.0036) [2024-04-28 14:38:44,799][57319] Signal inference workers to stop experience collection... (13050 times) [2024-04-28 14:38:44,817][57339] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-04-28 14:38:44,891][57319] Signal inference workers to resume experience collection... (13050 times) [2024-04-28 14:38:44,891][57339] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-04-28 14:38:45,882][57339] Updated weights for policy 0, policy_version 635288 (0.0026) [2024-04-28 14:38:47,169][57108] Fps is (10 sec: 60621.0, 60 sec: 56798.0, 300 sec: 56038.8). Total num frames: 10408656896. Throughput: 0: 55826.2. Samples: 899015760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:47,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 14:38:49,447][57339] Updated weights for policy 0, policy_version 635298 (0.0038) [2024-04-28 14:38:51,646][57339] Updated weights for policy 0, policy_version 635308 (0.0025) [2024-04-28 14:38:52,169][57108] Fps is (10 sec: 57344.6, 60 sec: 57617.0, 300 sec: 56094.4). Total num frames: 10408935424. Throughput: 0: 56097.4. Samples: 899204100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:52,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 14:38:55,199][57339] Updated weights for policy 0, policy_version 635318 (0.0023) [2024-04-28 14:38:57,169][57108] Fps is (10 sec: 52429.3, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 10409181184. Throughput: 0: 56154.8. Samples: 899540840. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:38:57,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:38:57,411][57339] Updated weights for policy 0, policy_version 635328 (0.0026) [2024-04-28 14:39:01,224][57339] Updated weights for policy 0, policy_version 635338 (0.0028) [2024-04-28 14:39:02,169][57108] Fps is (10 sec: 47514.0, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10409410560. Throughput: 0: 56310.0. Samples: 899878580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-04-28 14:39:02,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 14:39:03,328][57339] Updated weights for policy 0, policy_version 635348 (0.0028) [2024-04-28 14:39:07,169][57108] Fps is (10 sec: 50790.2, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 10409689088. Throughput: 0: 55627.2. Samples: 900020460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:07,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 14:39:07,181][57339] Updated weights for policy 0, policy_version 635358 (0.0026) [2024-04-28 14:39:09,149][57339] Updated weights for policy 0, policy_version 635368 (0.0030) [2024-04-28 14:39:12,169][57108] Fps is (10 sec: 58981.4, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10410000384. Throughput: 0: 55449.9. Samples: 900356880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:12,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 14:39:12,907][57339] Updated weights for policy 0, policy_version 635378 (0.0028) [2024-04-28 14:39:15,069][57339] Updated weights for policy 0, policy_version 635388 (0.0030) [2024-04-28 14:39:17,169][57108] Fps is (10 sec: 62258.4, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10410311680. Throughput: 0: 55616.3. Samples: 900698960. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:39:18,828][57339] Updated weights for policy 0, policy_version 635398 (0.0029) [2024-04-28 14:39:20,874][57339] Updated weights for policy 0, policy_version 635408 (0.0032) [2024-04-28 14:39:20,891][57319] Signal inference workers to stop experience collection... (13100 times) [2024-04-28 14:39:20,891][57319] Signal inference workers to resume experience collection... (13100 times) [2024-04-28 14:39:20,916][57339] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-04-28 14:39:20,916][57339] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-04-28 14:39:22,169][57108] Fps is (10 sec: 62259.7, 60 sec: 56797.9, 300 sec: 55983.3). Total num frames: 10410622976. Throughput: 0: 56365.4. Samples: 900885840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:22,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 14:39:24,638][57339] Updated weights for policy 0, policy_version 635418 (0.0037) [2024-04-28 14:39:26,788][57339] Updated weights for policy 0, policy_version 635428 (0.0026) [2024-04-28 14:39:27,169][57108] Fps is (10 sec: 57343.9, 60 sec: 57071.0, 300 sec: 55983.3). Total num frames: 10410885120. Throughput: 0: 56366.6. Samples: 901220080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:27,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 14:39:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000635430_10410885120.pth... [2024-04-28 14:39:27,237][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000634613_10397499392.pth [2024-04-28 14:39:30,566][57339] Updated weights for policy 0, policy_version 635438 (0.0030) [2024-04-28 14:39:32,169][57108] Fps is (10 sec: 49151.6, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10411114496. Throughput: 0: 56259.5. 
Samples: 901547440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:32,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 14:39:32,682][57339] Updated weights for policy 0, policy_version 635448 (0.0034) [2024-04-28 14:39:36,284][57339] Updated weights for policy 0, policy_version 635458 (0.0034) [2024-04-28 14:39:37,169][57108] Fps is (10 sec: 49152.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10411376640. Throughput: 0: 55690.3. Samples: 901710160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:37,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 14:39:38,438][57339] Updated weights for policy 0, policy_version 635468 (0.0023) [2024-04-28 14:39:42,039][57339] Updated weights for policy 0, policy_version 635478 (0.0026) [2024-04-28 14:39:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10411671552. Throughput: 0: 55707.0. Samples: 902047660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:42,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 14:39:44,213][57339] Updated weights for policy 0, policy_version 635488 (0.0026) [2024-04-28 14:39:47,169][57108] Fps is (10 sec: 57343.4, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 10411950080. Throughput: 0: 55538.0. Samples: 902377800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:47,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 14:39:48,066][57339] Updated weights for policy 0, policy_version 635498 (0.0032) [2024-04-28 14:39:50,092][57339] Updated weights for policy 0, policy_version 635508 (0.0029) [2024-04-28 14:39:52,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10412261376. Throughput: 0: 56234.5. Samples: 902551020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:52,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:39:53,944][57339] Updated weights for policy 0, policy_version 635518 (0.0029) [2024-04-28 14:39:56,137][57339] Updated weights for policy 0, policy_version 635528 (0.0030) [2024-04-28 14:39:57,169][57108] Fps is (10 sec: 62258.6, 60 sec: 56524.6, 300 sec: 56038.8). Total num frames: 10412572672. Throughput: 0: 56116.3. Samples: 902882120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:39:57,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 14:39:59,689][57339] Updated weights for policy 0, policy_version 635538 (0.0030) [2024-04-28 14:40:01,960][57339] Updated weights for policy 0, policy_version 635548 (0.0032) [2024-04-28 14:40:02,169][57108] Fps is (10 sec: 57344.5, 60 sec: 57070.8, 300 sec: 55927.8). Total num frames: 10412834816. Throughput: 0: 56009.4. Samples: 903219380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:40:02,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 14:40:05,631][57339] Updated weights for policy 0, policy_version 635558 (0.0027) [2024-04-28 14:40:06,501][57319] Signal inference workers to stop experience collection... (13150 times) [2024-04-28 14:40:06,504][57319] Signal inference workers to resume experience collection... (13150 times) [2024-04-28 14:40:06,535][57339] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-04-28 14:40:06,535][57339] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-04-28 14:40:07,169][57108] Fps is (10 sec: 50790.8, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 10413080576. Throughput: 0: 55628.8. Samples: 903389140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:40:07,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 14:40:07,846][57339] Updated weights for policy 0, policy_version 635568 (0.0025) [2024-04-28 14:40:11,475][57339] Updated weights for policy 0, policy_version 635578 (0.0036) [2024-04-28 14:40:12,169][57108] Fps is (10 sec: 49152.2, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10413326336. Throughput: 0: 55539.3. Samples: 903719340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:40:12,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 14:40:13,701][57339] Updated weights for policy 0, policy_version 635588 (0.0025) [2024-04-28 14:40:17,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10413621248. Throughput: 0: 55710.7. Samples: 904054420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:40:17,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 14:40:17,182][57339] Updated weights for policy 0, policy_version 635598 (0.0031) [2024-04-28 14:40:19,613][57339] Updated weights for policy 0, policy_version 635608 (0.0030) [2024-04-28 14:40:22,169][57108] Fps is (10 sec: 58982.0, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10413916160. Throughput: 0: 55747.5. Samples: 904218800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:40:22,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 14:40:22,932][57339] Updated weights for policy 0, policy_version 635618 (0.0028) [2024-04-28 14:40:25,455][57339] Updated weights for policy 0, policy_version 635628 (0.0034) [2024-04-28 14:40:27,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55432.6, 300 sec: 55927.7). Total num frames: 10414211072. Throughput: 0: 55594.2. Samples: 904549400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:40:27,169][57108] Avg episode reward: [(0, '0.725')] [2024-04-28 14:40:28,862][57339] Updated weights for policy 0, policy_version 635638 (0.0027) [2024-04-28 14:40:31,215][57339] Updated weights for policy 0, policy_version 635648 (0.0027) [2024-04-28 14:40:32,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10414505984. Throughput: 0: 55688.1. Samples: 904883760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:40:32,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:40:34,652][57339] Updated weights for policy 0, policy_version 635658 (0.0034) [2024-04-28 14:40:37,071][57339] Updated weights for policy 0, policy_version 635668 (0.0027) [2024-04-28 14:40:37,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56797.8, 300 sec: 55872.2). Total num frames: 10414784512. Throughput: 0: 55784.5. Samples: 905061320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:40:37,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 14:40:40,302][57339] Updated weights for policy 0, policy_version 635678 (0.0033) [2024-04-28 14:40:42,169][57108] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10415046656. Throughput: 0: 55883.3. Samples: 905396860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 14:40:42,169][57108] Avg episode reward: [(0, '0.702')] [2024-04-28 14:40:42,944][57339] Updated weights for policy 0, policy_version 635688 (0.0030) [2024-04-28 14:40:46,646][57339] Updated weights for policy 0, policy_version 635698 (0.0026) [2024-04-28 14:40:47,169][57108] Fps is (10 sec: 50789.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10415292416. Throughput: 0: 56019.3. Samples: 905740260. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:40:47,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:40:48,795][57339] Updated weights for policy 0, policy_version 635708 (0.0034) [2024-04-28 14:40:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10415587328. Throughput: 0: 55585.8. Samples: 905890500. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:40:52,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 14:40:52,451][57339] Updated weights for policy 0, policy_version 635718 (0.0027) [2024-04-28 14:40:54,775][57339] Updated weights for policy 0, policy_version 635728 (0.0027) [2024-04-28 14:40:57,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 10415882240. Throughput: 0: 55726.1. Samples: 906227020. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:40:57,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 14:40:58,236][57339] Updated weights for policy 0, policy_version 635738 (0.0030) [2024-04-28 14:41:00,672][57339] Updated weights for policy 0, policy_version 635748 (0.0034) [2024-04-28 14:41:02,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 10416160768. Throughput: 0: 55756.5. Samples: 906563460. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:02,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 14:41:02,995][57319] Signal inference workers to stop experience collection... (13200 times) [2024-04-28 14:41:03,027][57339] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-04-28 14:41:03,086][57319] Signal inference workers to resume experience collection... 
(13200 times) [2024-04-28 14:41:03,086][57339] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-04-28 14:41:03,926][57339] Updated weights for policy 0, policy_version 635758 (0.0030) [2024-04-28 14:41:06,514][57339] Updated weights for policy 0, policy_version 635768 (0.0024) [2024-04-28 14:41:07,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10416455680. Throughput: 0: 56169.9. Samples: 906746440. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:07,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 14:41:09,787][57339] Updated weights for policy 0, policy_version 635778 (0.0027) [2024-04-28 14:41:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 10416717824. Throughput: 0: 56232.0. Samples: 907079840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:12,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 14:41:12,308][57339] Updated weights for policy 0, policy_version 635788 (0.0029) [2024-04-28 14:41:15,634][57339] Updated weights for policy 0, policy_version 635798 (0.0032) [2024-04-28 14:41:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10417012736. Throughput: 0: 56314.1. Samples: 907417900. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:17,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 14:41:18,138][57339] Updated weights for policy 0, policy_version 635808 (0.0030) [2024-04-28 14:41:21,606][57339] Updated weights for policy 0, policy_version 635818 (0.0030) [2024-04-28 14:41:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10417258496. Throughput: 0: 55915.1. Samples: 907577500. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:22,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 14:41:23,982][57339] Updated weights for policy 0, policy_version 635828 (0.0027) [2024-04-28 14:41:27,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10417553408. Throughput: 0: 56027.0. Samples: 907918080. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:27,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:41:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000635837_10417553408.pth... [2024-04-28 14:41:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000635019_10404151296.pth [2024-04-28 14:41:27,623][57339] Updated weights for policy 0, policy_version 635838 (0.0030) [2024-04-28 14:41:29,941][57339] Updated weights for policy 0, policy_version 635848 (0.0028) [2024-04-28 14:41:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10417831936. Throughput: 0: 55883.2. Samples: 908255000. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:32,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 14:41:33,351][57339] Updated weights for policy 0, policy_version 635858 (0.0031) [2024-04-28 14:41:35,842][57339] Updated weights for policy 0, policy_version 635868 (0.0031) [2024-04-28 14:41:37,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 10418126848. Throughput: 0: 56227.2. Samples: 908420720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:37,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 14:41:39,256][57339] Updated weights for policy 0, policy_version 635878 (0.0032) [2024-04-28 14:41:41,779][57339] Updated weights for policy 0, policy_version 635888 (0.0036) [2024-04-28 14:41:42,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 55872.2). 
Total num frames: 10418405376. Throughput: 0: 56157.0. Samples: 908754080. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:42,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 14:41:45,198][57339] Updated weights for policy 0, policy_version 635898 (0.0029) [2024-04-28 14:41:47,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 10418683904. Throughput: 0: 56207.8. Samples: 909092820. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:47,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 14:41:47,451][57339] Updated weights for policy 0, policy_version 635908 (0.0031) [2024-04-28 14:41:51,042][57339] Updated weights for policy 0, policy_version 635918 (0.0031) [2024-04-28 14:41:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10418962432. Throughput: 0: 55942.6. Samples: 909263860. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:52,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 14:41:53,204][57339] Updated weights for policy 0, policy_version 635928 (0.0030) [2024-04-28 14:41:56,827][57339] Updated weights for policy 0, policy_version 635938 (0.0027) [2024-04-28 14:41:57,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10419224576. Throughput: 0: 56047.5. Samples: 909601980. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:41:57,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 14:41:57,302][57319] Signal inference workers to stop experience collection... (13250 times) [2024-04-28 14:41:57,324][57339] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-04-28 14:41:57,358][57319] Signal inference workers to resume experience collection... 
(13250 times) [2024-04-28 14:41:57,358][57339] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-04-28 14:41:59,233][57339] Updated weights for policy 0, policy_version 635948 (0.0027) [2024-04-28 14:42:02,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10419503104. Throughput: 0: 56076.0. Samples: 909941320. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:42:02,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 14:42:02,705][57339] Updated weights for policy 0, policy_version 635958 (0.0027) [2024-04-28 14:42:05,173][57339] Updated weights for policy 0, policy_version 635968 (0.0027) [2024-04-28 14:42:07,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 10419798016. Throughput: 0: 56207.5. Samples: 910106840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:42:07,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 14:42:08,506][57339] Updated weights for policy 0, policy_version 635978 (0.0023) [2024-04-28 14:42:10,889][57339] Updated weights for policy 0, policy_version 635988 (0.0032) [2024-04-28 14:42:12,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 10420092928. Throughput: 0: 56184.6. Samples: 910446380. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:42:12,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 14:42:14,298][57339] Updated weights for policy 0, policy_version 635998 (0.0035) [2024-04-28 14:42:16,743][57339] Updated weights for policy 0, policy_version 636008 (0.0023) [2024-04-28 14:42:17,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 10420387840. Throughput: 0: 55992.5. Samples: 910774660. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-04-28 14:42:17,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 14:42:20,052][57339] Updated weights for policy 0, policy_version 636018 (0.0033) [2024-04-28 14:42:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 10420633600. Throughput: 0: 56290.1. Samples: 910953780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:42:22,169][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 14:42:22,591][57339] Updated weights for policy 0, policy_version 636028 (0.0031) [2024-04-28 14:42:25,924][57339] Updated weights for policy 0, policy_version 636038 (0.0029) [2024-04-28 14:42:27,169][57108] Fps is (10 sec: 54067.2, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10420928512. Throughput: 0: 56267.9. Samples: 911286140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:42:27,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 14:42:28,497][57339] Updated weights for policy 0, policy_version 636048 (0.0027) [2024-04-28 14:42:31,900][57339] Updated weights for policy 0, policy_version 636058 (0.0033) [2024-04-28 14:42:32,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.9, 300 sec: 55927.8). Total num frames: 10421207040. Throughput: 0: 56257.5. Samples: 911624400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:42:32,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 14:42:34,234][57339] Updated weights for policy 0, policy_version 636068 (0.0033) [2024-04-28 14:42:37,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10421452800. Throughput: 0: 55954.5. Samples: 911781820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:42:37,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 14:42:37,670][57339] Updated weights for policy 0, policy_version 636078 (0.0029) [2024-04-28 14:42:37,824][57319] Signal inference workers to stop experience collection... (13300 times) [2024-04-28 14:42:37,868][57339] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-04-28 14:42:37,880][57319] Signal inference workers to resume experience collection... (13300 times) [2024-04-28 14:42:37,885][57339] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-04-28 14:42:39,903][57339] Updated weights for policy 0, policy_version 636088 (0.0024) [2024-04-28 14:42:42,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10421747712. Throughput: 0: 55956.0. Samples: 912120000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:42:42,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 14:42:43,405][57339] Updated weights for policy 0, policy_version 636098 (0.0032) [2024-04-28 14:42:45,769][57339] Updated weights for policy 0, policy_version 636108 (0.0035) [2024-04-28 14:42:47,169][57108] Fps is (10 sec: 62259.4, 60 sec: 56524.9, 300 sec: 56261.0). Total num frames: 10422075392. Throughput: 0: 55940.9. Samples: 912458660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:42:47,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 14:42:49,370][57339] Updated weights for policy 0, policy_version 636118 (0.0035) [2024-04-28 14:42:51,705][57339] Updated weights for policy 0, policy_version 636128 (0.0026) [2024-04-28 14:42:52,169][57108] Fps is (10 sec: 58982.1, 60 sec: 56251.6, 300 sec: 56094.4). Total num frames: 10422337536. Throughput: 0: 56052.0. Samples: 912629180. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:42:52,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 14:42:55,186][57339] Updated weights for policy 0, policy_version 636138 (0.0032) [2024-04-28 14:42:57,169][57108] Fps is (10 sec: 54066.9, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 10422616064. Throughput: 0: 55885.6. Samples: 912961240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:42:57,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 14:42:57,442][57339] Updated weights for policy 0, policy_version 636148 (0.0028) [2024-04-28 14:43:00,995][57339] Updated weights for policy 0, policy_version 636158 (0.0028) [2024-04-28 14:43:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10422878208. Throughput: 0: 56101.7. Samples: 913299240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:02,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 14:43:03,299][57339] Updated weights for policy 0, policy_version 636168 (0.0026) [2024-04-28 14:43:06,739][57339] Updated weights for policy 0, policy_version 636178 (0.0026) [2024-04-28 14:43:07,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 10423156736. Throughput: 0: 55924.6. Samples: 913470380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:07,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 14:43:09,209][57339] Updated weights for policy 0, policy_version 636188 (0.0026) [2024-04-28 14:43:12,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10423418880. Throughput: 0: 55881.8. Samples: 913800820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:12,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 14:43:12,690][57339] Updated weights for policy 0, policy_version 636198 (0.0032) [2024-04-28 14:43:14,973][57339] Updated weights for policy 0, policy_version 636208 (0.0025) [2024-04-28 14:43:17,169][57108] Fps is (10 sec: 57342.7, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 10423730176. Throughput: 0: 55845.1. Samples: 914137440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:17,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 14:43:18,609][57339] Updated weights for policy 0, policy_version 636218 (0.0028) [2024-04-28 14:43:20,670][57339] Updated weights for policy 0, policy_version 636228 (0.0030) [2024-04-28 14:43:22,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56251.8, 300 sec: 56094.4). Total num frames: 10424008704. Throughput: 0: 56163.2. Samples: 914309160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:22,169][57108] Avg episode reward: [(0, '0.517')] [2024-04-28 14:43:24,482][57339] Updated weights for policy 0, policy_version 636238 (0.0027) [2024-04-28 14:43:24,837][57319] Signal inference workers to stop experience collection... (13350 times) [2024-04-28 14:43:24,843][57319] Signal inference workers to resume experience collection... (13350 times) [2024-04-28 14:43:24,868][57339] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-04-28 14:43:24,868][57339] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-04-28 14:43:26,646][57339] Updated weights for policy 0, policy_version 636248 (0.0028) [2024-04-28 14:43:27,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56524.7, 300 sec: 56205.4). Total num frames: 10424320000. Throughput: 0: 56077.3. Samples: 914643480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:27,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 14:43:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000636250_10424320000.pth... [2024-04-28 14:43:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000635430_10410885120.pth [2024-04-28 14:43:30,367][57339] Updated weights for policy 0, policy_version 636258 (0.0030) [2024-04-28 14:43:32,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 10424582144. Throughput: 0: 56002.8. Samples: 914978780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:32,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 14:43:32,473][57339] Updated weights for policy 0, policy_version 636268 (0.0030) [2024-04-28 14:43:36,175][57339] Updated weights for policy 0, policy_version 636278 (0.0025) [2024-04-28 14:43:37,169][57108] Fps is (10 sec: 52429.4, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 10424844288. Throughput: 0: 55988.6. Samples: 915148660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:37,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 14:43:38,344][57339] Updated weights for policy 0, policy_version 636288 (0.0036) [2024-04-28 14:43:42,041][57339] Updated weights for policy 0, policy_version 636298 (0.0032) [2024-04-28 14:43:42,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 10425106432. Throughput: 0: 56140.2. Samples: 915487540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:42,169][57108] Avg episode reward: [(0, '0.710')] [2024-04-28 14:43:44,024][57339] Updated weights for policy 0, policy_version 636308 (0.0025) [2024-04-28 14:43:47,169][57108] Fps is (10 sec: 52428.1, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10425368576. Throughput: 0: 55971.1. Samples: 915817940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:47,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 14:43:47,854][57339] Updated weights for policy 0, policy_version 636318 (0.0030) [2024-04-28 14:43:49,827][57339] Updated weights for policy 0, policy_version 636328 (0.0030) [2024-04-28 14:43:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.7, 300 sec: 55927.7). Total num frames: 10425679872. Throughput: 0: 55765.7. Samples: 915979840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 14:43:52,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 14:43:53,798][57339] Updated weights for policy 0, policy_version 636338 (0.0029) [2024-04-28 14:43:55,935][57339] Updated weights for policy 0, policy_version 636348 (0.0033) [2024-04-28 14:43:57,169][57108] Fps is (10 sec: 60621.7, 60 sec: 55978.8, 300 sec: 56149.9). Total num frames: 10425974784. Throughput: 0: 55860.9. Samples: 916314560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:43:57,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 14:43:59,581][57339] Updated weights for policy 0, policy_version 636358 (0.0026) [2024-04-28 14:44:01,799][57339] Updated weights for policy 0, policy_version 636368 (0.0027) [2024-04-28 14:44:02,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.9, 300 sec: 56149.9). Total num frames: 10426253312. Throughput: 0: 55808.2. Samples: 916648800. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:02,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 14:44:05,411][57339] Updated weights for policy 0, policy_version 636378 (0.0025) [2024-04-28 14:44:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 56094.4). Total num frames: 10426548224. Throughput: 0: 56076.9. Samples: 916832620. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:07,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 14:44:07,505][57339] Updated weights for policy 0, policy_version 636388 (0.0026) [2024-04-28 14:44:11,218][57339] Updated weights for policy 0, policy_version 636398 (0.0031) [2024-04-28 14:44:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56797.9, 300 sec: 55983.3). Total num frames: 10426826752. Throughput: 0: 56273.0. Samples: 917175760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:12,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 14:44:13,219][57339] Updated weights for policy 0, policy_version 636408 (0.0032) [2024-04-28 14:44:16,782][57319] Signal inference workers to stop experience collection... (13400 times) [2024-04-28 14:44:16,787][57319] Signal inference workers to resume experience collection... (13400 times) [2024-04-28 14:44:16,803][57339] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-04-28 14:44:16,803][57339] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-04-28 14:44:17,047][57339] Updated weights for policy 0, policy_version 636418 (0.0031) [2024-04-28 14:44:17,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10427088896. Throughput: 0: 56199.4. Samples: 917507760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:17,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 14:44:19,077][57339] Updated weights for policy 0, policy_version 636428 (0.0026) [2024-04-28 14:44:22,169][57108] Fps is (10 sec: 49152.2, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10427318272. Throughput: 0: 55866.7. Samples: 917662660. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:22,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 14:44:22,819][57339] Updated weights for policy 0, policy_version 636438 (0.0034) [2024-04-28 14:44:25,160][57339] Updated weights for policy 0, policy_version 636448 (0.0025) [2024-04-28 14:44:27,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 56038.8). Total num frames: 10427645952. Throughput: 0: 55874.5. Samples: 918001900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:27,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 14:44:28,530][57339] Updated weights for policy 0, policy_version 636458 (0.0026) [2024-04-28 14:44:31,242][57339] Updated weights for policy 0, policy_version 636468 (0.0027) [2024-04-28 14:44:32,169][57108] Fps is (10 sec: 63896.7, 60 sec: 56251.6, 300 sec: 56205.4). Total num frames: 10427957248. Throughput: 0: 56045.8. Samples: 918340000. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:32,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 14:44:34,440][57339] Updated weights for policy 0, policy_version 636478 (0.0029) [2024-04-28 14:44:36,972][57339] Updated weights for policy 0, policy_version 636488 (0.0025) [2024-04-28 14:44:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.6, 300 sec: 56094.4). Total num frames: 10428219392. Throughput: 0: 56300.8. Samples: 918513380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:37,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 14:44:40,297][57339] Updated weights for policy 0, policy_version 636498 (0.0030) [2024-04-28 14:44:42,169][57108] Fps is (10 sec: 54067.8, 60 sec: 56524.7, 300 sec: 56094.4). Total num frames: 10428497920. Throughput: 0: 56256.9. Samples: 918846120. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:42,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 14:44:42,954][57339] Updated weights for policy 0, policy_version 636508 (0.0025) [2024-04-28 14:44:46,016][57339] Updated weights for policy 0, policy_version 636518 (0.0029) [2024-04-28 14:44:47,169][57108] Fps is (10 sec: 54067.4, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 10428760064. Throughput: 0: 56150.1. Samples: 919175560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:47,178][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 14:44:48,682][57339] Updated weights for policy 0, policy_version 636528 (0.0026) [2024-04-28 14:44:51,900][57339] Updated weights for policy 0, policy_version 636538 (0.0035) [2024-04-28 14:44:52,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10429054976. Throughput: 0: 55802.5. Samples: 919343740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:52,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 14:44:54,479][57339] Updated weights for policy 0, policy_version 636548 (0.0033) [2024-04-28 14:44:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10429300736. Throughput: 0: 55644.0. Samples: 919679740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:44:57,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 14:44:57,694][57339] Updated weights for policy 0, policy_version 636558 (0.0027) [2024-04-28 14:45:00,255][57319] Signal inference workers to stop experience collection... (13450 times) [2024-04-28 14:45:00,257][57319] Signal inference workers to resume experience collection... 
(13450 times) [2024-04-28 14:45:00,275][57339] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-04-28 14:45:00,276][57339] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-04-28 14:45:00,380][57339] Updated weights for policy 0, policy_version 636568 (0.0026) [2024-04-28 14:45:02,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 10429595648. Throughput: 0: 55694.3. Samples: 920014000. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:45:02,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 14:45:03,560][57339] Updated weights for policy 0, policy_version 636578 (0.0026) [2024-04-28 14:45:06,155][57339] Updated weights for policy 0, policy_version 636588 (0.0026) [2024-04-28 14:45:07,169][57108] Fps is (10 sec: 60621.1, 60 sec: 55978.7, 300 sec: 56205.4). Total num frames: 10429906944. Throughput: 0: 55984.9. Samples: 920181980. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:45:07,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 14:45:09,523][57339] Updated weights for policy 0, policy_version 636598 (0.0029) [2024-04-28 14:45:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 56094.4). Total num frames: 10430169088. Throughput: 0: 55840.1. Samples: 920514700. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:45:12,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 14:45:12,446][57339] Updated weights for policy 0, policy_version 636608 (0.0029) [2024-04-28 14:45:15,407][57339] Updated weights for policy 0, policy_version 636618 (0.0028) [2024-04-28 14:45:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 10430447616. Throughput: 0: 55629.0. Samples: 920843300. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:45:17,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 14:45:18,550][57339] Updated weights for policy 0, policy_version 636628 (0.0028) [2024-04-28 14:45:21,143][57339] Updated weights for policy 0, policy_version 636638 (0.0027) [2024-04-28 14:45:22,169][57108] Fps is (10 sec: 57343.7, 60 sec: 57070.9, 300 sec: 56038.8). Total num frames: 10430742528. Throughput: 0: 55611.6. Samples: 921015900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:45:22,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:45:24,311][57339] Updated weights for policy 0, policy_version 636648 (0.0026) [2024-04-28 14:45:26,974][57339] Updated weights for policy 0, policy_version 636658 (0.0031) [2024-04-28 14:45:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 10431004672. Throughput: 0: 55803.5. Samples: 921357280. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-04-28 14:45:27,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 14:45:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000636658_10431004672.pth... [2024-04-28 14:45:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000635837_10417553408.pth [2024-04-28 14:45:30,240][57339] Updated weights for policy 0, policy_version 636668 (0.0035) [2024-04-28 14:45:32,169][57108] Fps is (10 sec: 50790.3, 60 sec: 54886.5, 300 sec: 55816.7). Total num frames: 10431250432. Throughput: 0: 55926.6. Samples: 921692260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:45:32,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:45:32,937][57339] Updated weights for policy 0, policy_version 636678 (0.0025) [2024-04-28 14:45:36,151][57339] Updated weights for policy 0, policy_version 636688 (0.0027) [2024-04-28 14:45:37,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55927.8). 
Total num frames: 10431545344. Throughput: 0: 55650.3. Samples: 921848000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:45:37,178][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 14:45:38,928][57339] Updated weights for policy 0, policy_version 636698 (0.0025) [2024-04-28 14:45:42,005][57339] Updated weights for policy 0, policy_version 636708 (0.0030) [2024-04-28 14:45:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 56038.9). Total num frames: 10431823872. Throughput: 0: 55685.7. Samples: 922185600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:45:42,169][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 14:45:44,731][57339] Updated weights for policy 0, policy_version 636718 (0.0031) [2024-04-28 14:45:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 10432118784. Throughput: 0: 55709.3. Samples: 922520920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:45:47,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:45:47,871][57339] Updated weights for policy 0, policy_version 636728 (0.0033) [2024-04-28 14:45:50,502][57339] Updated weights for policy 0, policy_version 636738 (0.0029) [2024-04-28 14:45:50,509][57319] Signal inference workers to stop experience collection... (13500 times) [2024-04-28 14:45:50,509][57319] Signal inference workers to resume experience collection... (13500 times) [2024-04-28 14:45:50,524][57339] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-04-28 14:45:50,524][57339] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-04-28 14:45:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 10432397312. Throughput: 0: 55796.8. Samples: 922692840. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:45:52,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 14:45:53,767][57339] Updated weights for policy 0, policy_version 636748 (0.0028) [2024-04-28 14:45:56,329][57339] Updated weights for policy 0, policy_version 636758 (0.0025) [2024-04-28 14:45:57,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10432659456. Throughput: 0: 55870.1. Samples: 923028860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:45:57,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 14:45:59,621][57339] Updated weights for policy 0, policy_version 636768 (0.0024) [2024-04-28 14:46:02,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 10432954368. Throughput: 0: 56044.1. Samples: 923365280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:02,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 14:46:02,182][57339] Updated weights for policy 0, policy_version 636778 (0.0033) [2024-04-28 14:46:05,569][57339] Updated weights for policy 0, policy_version 636788 (0.0027) [2024-04-28 14:46:07,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.4, 300 sec: 55927.8). Total num frames: 10433216512. Throughput: 0: 55846.7. Samples: 923529000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:07,169][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 14:46:08,056][57339] Updated weights for policy 0, policy_version 636798 (0.0030) [2024-04-28 14:46:11,405][57339] Updated weights for policy 0, policy_version 636808 (0.0032) [2024-04-28 14:46:12,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 10433511424. Throughput: 0: 55767.4. Samples: 923866820. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:12,169][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 14:46:13,757][57339] Updated weights for policy 0, policy_version 636818 (0.0030) [2024-04-28 14:46:17,156][57339] Updated weights for policy 0, policy_version 636828 (0.0030) [2024-04-28 14:46:17,169][57108] Fps is (10 sec: 57342.8, 60 sec: 55705.4, 300 sec: 56038.8). Total num frames: 10433789952. Throughput: 0: 55844.3. Samples: 924205260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:17,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 14:46:19,612][57339] Updated weights for policy 0, policy_version 636838 (0.0029) [2024-04-28 14:46:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55927.8). Total num frames: 10434052096. Throughput: 0: 55900.8. Samples: 924363540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:22,170][57108] Avg episode reward: [(0, '0.459')] [2024-04-28 14:46:23,015][57339] Updated weights for policy 0, policy_version 636848 (0.0037) [2024-04-28 14:46:25,772][57339] Updated weights for policy 0, policy_version 636858 (0.0027) [2024-04-28 14:46:27,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 10434363392. Throughput: 0: 55980.0. Samples: 924704700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:27,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 14:46:28,933][57339] Updated weights for policy 0, policy_version 636868 (0.0033) [2024-04-28 14:46:31,479][57339] Updated weights for policy 0, policy_version 636878 (0.0025) [2024-04-28 14:46:32,169][57108] Fps is (10 sec: 60621.6, 60 sec: 56797.9, 300 sec: 56038.8). Total num frames: 10434658304. Throughput: 0: 55888.9. Samples: 925035920. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:32,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 14:46:34,776][57339] Updated weights for policy 0, policy_version 636888 (0.0030) [2024-04-28 14:46:36,577][57319] Signal inference workers to stop experience collection... (13550 times) [2024-04-28 14:46:36,577][57319] Signal inference workers to resume experience collection... (13550 times) [2024-04-28 14:46:36,590][57339] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-04-28 14:46:36,590][57339] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-04-28 14:46:37,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 10434920448. Throughput: 0: 55950.2. Samples: 925210600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:37,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 14:46:37,265][57339] Updated weights for policy 0, policy_version 636898 (0.0029) [2024-04-28 14:46:40,616][57339] Updated weights for policy 0, policy_version 636908 (0.0026) [2024-04-28 14:46:42,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10435182592. Throughput: 0: 55904.1. Samples: 925544540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:42,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 14:46:43,191][57339] Updated weights for policy 0, policy_version 636918 (0.0028) [2024-04-28 14:46:46,314][57339] Updated weights for policy 0, policy_version 636928 (0.0031) [2024-04-28 14:46:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10435477504. Throughput: 0: 55851.8. Samples: 925878620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:47,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 14:46:49,090][57339] Updated weights for policy 0, policy_version 636938 (0.0027) [2024-04-28 14:46:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 10435739648. Throughput: 0: 55988.3. Samples: 926048480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:52,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 14:46:52,283][57339] Updated weights for policy 0, policy_version 636948 (0.0025) [2024-04-28 14:46:54,911][57339] Updated weights for policy 0, policy_version 636958 (0.0025) [2024-04-28 14:46:57,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10436001792. Throughput: 0: 55870.0. Samples: 926380960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:46:57,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 14:46:58,125][57339] Updated weights for policy 0, policy_version 636968 (0.0027) [2024-04-28 14:47:00,646][57339] Updated weights for policy 0, policy_version 636978 (0.0025) [2024-04-28 14:47:02,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10436296704. Throughput: 0: 55710.5. Samples: 926712220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:47:02,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 14:47:03,884][57339] Updated weights for policy 0, policy_version 636988 (0.0034) [2024-04-28 14:47:06,360][57339] Updated weights for policy 0, policy_version 636998 (0.0030) [2024-04-28 14:47:07,169][57108] Fps is (10 sec: 58981.3, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 10436591616. Throughput: 0: 56072.0. Samples: 926886780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-04-28 14:47:07,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 14:47:09,692][57339] Updated weights for policy 0, policy_version 637008 (0.0023) [2024-04-28 14:47:12,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10436886528. Throughput: 0: 55988.9. Samples: 927224200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:12,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 14:47:12,338][57339] Updated weights for policy 0, policy_version 637018 (0.0029) [2024-04-28 14:47:15,641][57339] Updated weights for policy 0, policy_version 637028 (0.0029) [2024-04-28 14:47:17,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 10437148672. Throughput: 0: 55997.7. Samples: 927555820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:17,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 14:47:18,461][57339] Updated weights for policy 0, policy_version 637038 (0.0026) [2024-04-28 14:47:21,333][57339] Updated weights for policy 0, policy_version 637048 (0.0026) [2024-04-28 14:47:22,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10437410816. Throughput: 0: 55814.2. Samples: 927722240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:22,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 14:47:24,400][57339] Updated weights for policy 0, policy_version 637058 (0.0027) [2024-04-28 14:47:26,840][57319] Signal inference workers to stop experience collection... (13600 times) [2024-04-28 14:47:26,846][57319] Signal inference workers to resume experience collection... 
(13600 times) [2024-04-28 14:47:26,860][57339] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-04-28 14:47:26,861][57339] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-04-28 14:47:27,128][57339] Updated weights for policy 0, policy_version 637068 (0.0030) [2024-04-28 14:47:27,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 10437722112. Throughput: 0: 55875.0. Samples: 928058920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:27,170][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 14:47:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000637068_10437722112.pth... [2024-04-28 14:47:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000636250_10424320000.pth [2024-04-28 14:47:30,102][57339] Updated weights for policy 0, policy_version 637078 (0.0030) [2024-04-28 14:47:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.4, 300 sec: 55983.3). Total num frames: 10437967872. Throughput: 0: 56045.8. Samples: 928400680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:32,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 14:47:32,945][57339] Updated weights for policy 0, policy_version 637088 (0.0027) [2024-04-28 14:47:35,779][57339] Updated weights for policy 0, policy_version 637098 (0.0028) [2024-04-28 14:47:37,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 10438246400. Throughput: 0: 55908.5. Samples: 928564360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:37,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 14:47:39,048][57339] Updated weights for policy 0, policy_version 637108 (0.0030) [2024-04-28 14:47:41,789][57339] Updated weights for policy 0, policy_version 637118 (0.0032) [2024-04-28 14:47:42,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 55872.2). 
Total num frames: 10438557696. Throughput: 0: 55875.9. Samples: 928895380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:42,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 14:47:45,054][57339] Updated weights for policy 0, policy_version 637128 (0.0025) [2024-04-28 14:47:47,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10438836224. Throughput: 0: 56027.4. Samples: 929233460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:47,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 14:47:47,632][57339] Updated weights for policy 0, policy_version 637138 (0.0027) [2024-04-28 14:47:50,821][57339] Updated weights for policy 0, policy_version 637148 (0.0026) [2024-04-28 14:47:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10439114752. Throughput: 0: 56002.0. Samples: 929406860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:52,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 14:47:53,377][57339] Updated weights for policy 0, policy_version 637158 (0.0032) [2024-04-28 14:47:56,529][57339] Updated weights for policy 0, policy_version 637168 (0.0027) [2024-04-28 14:47:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10439376896. Throughput: 0: 55864.4. Samples: 929738100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:47:57,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 14:47:59,156][57339] Updated weights for policy 0, policy_version 637178 (0.0025) [2024-04-28 14:48:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 10439655424. Throughput: 0: 55883.2. Samples: 930070560. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:48:02,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 14:48:02,416][57339] Updated weights for policy 0, policy_version 637188 (0.0028) [2024-04-28 14:48:05,145][57339] Updated weights for policy 0, policy_version 637198 (0.0031) [2024-04-28 14:48:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 10439950336. Throughput: 0: 55994.2. Samples: 930241980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:48:07,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 14:48:08,382][57339] Updated weights for policy 0, policy_version 637208 (0.0026) [2024-04-28 14:48:11,096][57339] Updated weights for policy 0, policy_version 637218 (0.0028) [2024-04-28 14:48:12,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 10440196096. Throughput: 0: 55934.4. Samples: 930575960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:48:12,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 14:48:14,061][57339] Updated weights for policy 0, policy_version 637228 (0.0033) [2024-04-28 14:48:16,957][57339] Updated weights for policy 0, policy_version 637238 (0.0026) [2024-04-28 14:48:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10440507392. Throughput: 0: 55755.1. Samples: 930909660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:48:17,169][57108] Avg episode reward: [(0, '0.735')] [2024-04-28 14:48:19,769][57339] Updated weights for policy 0, policy_version 637248 (0.0030) [2024-04-28 14:48:22,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 10440785920. Throughput: 0: 55864.5. Samples: 931078260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:48:22,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 14:48:22,791][57339] Updated weights for policy 0, policy_version 637258 (0.0035) [2024-04-28 14:48:25,713][57339] Updated weights for policy 0, policy_version 637268 (0.0026) [2024-04-28 14:48:27,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 10441064448. Throughput: 0: 56041.5. Samples: 931417240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:48:27,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 14:48:28,488][57339] Updated weights for policy 0, policy_version 637278 (0.0032) [2024-04-28 14:48:30,641][57319] Signal inference workers to stop experience collection... (13650 times) [2024-04-28 14:48:30,642][57319] Signal inference workers to resume experience collection... (13650 times) [2024-04-28 14:48:30,663][57339] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-04-28 14:48:30,664][57339] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-04-28 14:48:31,728][57339] Updated weights for policy 0, policy_version 637288 (0.0032) [2024-04-28 14:48:32,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10441326592. Throughput: 0: 55983.6. Samples: 931752720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:48:32,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 14:48:34,474][57339] Updated weights for policy 0, policy_version 637298 (0.0028) [2024-04-28 14:48:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10441621504. Throughput: 0: 55723.6. Samples: 931914420. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:48:37,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 14:48:37,834][57339] Updated weights for policy 0, policy_version 637308 (0.0028) [2024-04-28 14:48:40,176][57339] Updated weights for policy 0, policy_version 637318 (0.0029) [2024-04-28 14:48:42,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 56038.8). Total num frames: 10441900032. Throughput: 0: 55929.3. Samples: 932254920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:48:42,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 14:48:43,885][57339] Updated weights for policy 0, policy_version 637328 (0.0033) [2024-04-28 14:48:46,202][57339] Updated weights for policy 0, policy_version 637338 (0.0028) [2024-04-28 14:48:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10442178560. Throughput: 0: 55996.0. Samples: 932590380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:48:47,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 14:48:49,811][57339] Updated weights for policy 0, policy_version 637348 (0.0031) [2024-04-28 14:48:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10442457088. Throughput: 0: 55989.3. Samples: 932761500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:48:52,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 14:48:52,309][57339] Updated weights for policy 0, policy_version 637358 (0.0025) [2024-04-28 14:48:55,586][57339] Updated weights for policy 0, policy_version 637368 (0.0029) [2024-04-28 14:48:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10442735616. Throughput: 0: 56013.8. Samples: 933096580. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:48:57,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:48:58,440][57339] Updated weights for policy 0, policy_version 637378 (0.0033) [2024-04-28 14:49:01,466][57339] Updated weights for policy 0, policy_version 637388 (0.0034) [2024-04-28 14:49:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10443014144. Throughput: 0: 55932.5. Samples: 933426620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:02,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 14:49:04,225][57339] Updated weights for policy 0, policy_version 637398 (0.0027) [2024-04-28 14:49:07,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10443276288. Throughput: 0: 55844.6. Samples: 933591280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:07,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 14:49:07,442][57339] Updated weights for policy 0, policy_version 637408 (0.0026) [2024-04-28 14:49:10,183][57339] Updated weights for policy 0, policy_version 637418 (0.0029) [2024-04-28 14:49:12,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 10443587584. Throughput: 0: 55669.2. Samples: 933922360. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:12,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 14:49:13,386][57339] Updated weights for policy 0, policy_version 637428 (0.0030) [2024-04-28 14:49:16,010][57339] Updated weights for policy 0, policy_version 637438 (0.0034) [2024-04-28 14:49:17,169][57108] Fps is (10 sec: 57345.2, 60 sec: 55705.7, 300 sec: 56038.8). Total num frames: 10443849728. Throughput: 0: 55587.6. Samples: 934254160. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:17,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 14:49:19,206][57339] Updated weights for policy 0, policy_version 637448 (0.0030) [2024-04-28 14:49:21,758][57339] Updated weights for policy 0, policy_version 637458 (0.0027) [2024-04-28 14:49:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10444128256. Throughput: 0: 55765.2. Samples: 934423860. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:22,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 14:49:25,073][57339] Updated weights for policy 0, policy_version 637468 (0.0028) [2024-04-28 14:49:27,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10444406784. Throughput: 0: 55660.0. Samples: 934759620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:27,170][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 14:49:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000637476_10444406784.pth... [2024-04-28 14:49:27,235][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000636658_10431004672.pth [2024-04-28 14:49:27,563][57339] Updated weights for policy 0, policy_version 637478 (0.0027) [2024-04-28 14:49:31,005][57339] Updated weights for policy 0, policy_version 637488 (0.0037) [2024-04-28 14:49:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10444685312. Throughput: 0: 55671.0. Samples: 935095580. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:32,170][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:49:33,322][57339] Updated weights for policy 0, policy_version 637498 (0.0026) [2024-04-28 14:49:34,637][57319] Signal inference workers to stop experience collection... (13700 times) [2024-04-28 14:49:34,638][57319] Signal inference workers to resume experience collection... 
(13700 times) [2024-04-28 14:49:34,663][57339] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-04-28 14:49:34,663][57339] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-04-28 14:49:36,860][57339] Updated weights for policy 0, policy_version 637508 (0.0025) [2024-04-28 14:49:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10444947456. Throughput: 0: 55475.6. Samples: 935257900. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:37,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 14:49:39,207][57339] Updated weights for policy 0, policy_version 637518 (0.0030) [2024-04-28 14:49:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10445225984. Throughput: 0: 55426.1. Samples: 935590760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:42,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 14:49:42,676][57339] Updated weights for policy 0, policy_version 637528 (0.0027) [2024-04-28 14:49:45,262][57339] Updated weights for policy 0, policy_version 637538 (0.0029) [2024-04-28 14:49:47,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10445537280. Throughput: 0: 55523.6. Samples: 935925180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:47,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 14:49:48,521][57339] Updated weights for policy 0, policy_version 637548 (0.0024) [2024-04-28 14:49:51,288][57339] Updated weights for policy 0, policy_version 637558 (0.0030) [2024-04-28 14:49:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 10445799424. Throughput: 0: 55609.4. Samples: 936093700. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:52,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 14:49:54,392][57339] Updated weights for policy 0, policy_version 637568 (0.0034) [2024-04-28 14:49:56,974][57339] Updated weights for policy 0, policy_version 637578 (0.0029) [2024-04-28 14:49:57,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10446077952. Throughput: 0: 55641.7. Samples: 936426240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:49:57,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 14:50:00,297][57339] Updated weights for policy 0, policy_version 637588 (0.0032) [2024-04-28 14:50:02,169][57108] Fps is (10 sec: 57345.5, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10446372864. Throughput: 0: 55711.7. Samples: 936761180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:50:02,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 14:50:02,856][57339] Updated weights for policy 0, policy_version 637598 (0.0026) [2024-04-28 14:50:06,015][57339] Updated weights for policy 0, policy_version 637608 (0.0027) [2024-04-28 14:50:07,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 10446651392. Throughput: 0: 55757.8. Samples: 936932960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:50:07,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 14:50:08,683][57339] Updated weights for policy 0, policy_version 637618 (0.0027) [2024-04-28 14:50:12,162][57339] Updated weights for policy 0, policy_version 637628 (0.0026) [2024-04-28 14:50:12,169][57108] Fps is (10 sec: 52427.4, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10446897152. Throughput: 0: 55711.9. Samples: 937266660. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:50:12,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 14:50:14,384][57339] Updated weights for policy 0, policy_version 637638 (0.0035) [2024-04-28 14:50:17,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10447175680. Throughput: 0: 55767.9. Samples: 937605140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-04-28 14:50:17,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:50:18,085][57339] Updated weights for policy 0, policy_version 637648 (0.0031) [2024-04-28 14:50:20,091][57339] Updated weights for policy 0, policy_version 637658 (0.0025) [2024-04-28 14:50:22,169][57108] Fps is (10 sec: 58983.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10447486976. Throughput: 0: 55722.8. Samples: 937765420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:50:22,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 14:50:23,860][57339] Updated weights for policy 0, policy_version 637668 (0.0027) [2024-04-28 14:50:26,037][57339] Updated weights for policy 0, policy_version 637678 (0.0029) [2024-04-28 14:50:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 10447749120. Throughput: 0: 55828.4. Samples: 938103040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:50:27,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 14:50:29,729][57339] Updated weights for policy 0, policy_version 637688 (0.0028) [2024-04-28 14:50:30,617][57319] Signal inference workers to stop experience collection... (13750 times) [2024-04-28 14:50:30,618][57319] Signal inference workers to resume experience collection... 
(13750 times) [2024-04-28 14:50:30,658][57339] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-04-28 14:50:30,658][57339] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-04-28 14:50:32,022][57339] Updated weights for policy 0, policy_version 637698 (0.0028) [2024-04-28 14:50:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10448044032. Throughput: 0: 55831.6. Samples: 938437600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:50:32,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 14:50:35,654][57339] Updated weights for policy 0, policy_version 637708 (0.0030) [2024-04-28 14:50:37,169][57108] Fps is (10 sec: 57345.0, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10448322560. Throughput: 0: 55935.4. Samples: 938610780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:50:37,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 14:50:37,894][57339] Updated weights for policy 0, policy_version 637718 (0.0031) [2024-04-28 14:50:41,547][57339] Updated weights for policy 0, policy_version 637728 (0.0029) [2024-04-28 14:50:42,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10448584704. Throughput: 0: 55960.7. Samples: 938944460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:50:42,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 14:50:43,653][57339] Updated weights for policy 0, policy_version 637738 (0.0026) [2024-04-28 14:50:47,169][57108] Fps is (10 sec: 50790.3, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10448830464. Throughput: 0: 55937.2. Samples: 939278360. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:50:47,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 14:50:47,377][57339] Updated weights for policy 0, policy_version 637748 (0.0034) [2024-04-28 14:50:49,458][57339] Updated weights for policy 0, policy_version 637758 (0.0027) [2024-04-28 14:50:52,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10449125376. Throughput: 0: 55607.1. Samples: 939435280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:50:52,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 14:50:53,207][57339] Updated weights for policy 0, policy_version 637768 (0.0027) [2024-04-28 14:50:55,263][57339] Updated weights for policy 0, policy_version 637778 (0.0030) [2024-04-28 14:50:57,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10449420288. Throughput: 0: 55652.6. Samples: 939771020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:50:57,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 14:50:59,168][57339] Updated weights for policy 0, policy_version 637788 (0.0029) [2024-04-28 14:51:01,029][57339] Updated weights for policy 0, policy_version 637798 (0.0026) [2024-04-28 14:51:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 10449682432. Throughput: 0: 55606.4. Samples: 940107420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:02,178][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 14:51:05,031][57339] Updated weights for policy 0, policy_version 637808 (0.0027) [2024-04-28 14:51:07,064][57339] Updated weights for policy 0, policy_version 637818 (0.0025) [2024-04-28 14:51:07,169][57108] Fps is (10 sec: 58981.3, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 10450010112. Throughput: 0: 56071.3. Samples: 940288640. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:07,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 14:51:10,645][57319] Signal inference workers to stop experience collection... (13800 times) [2024-04-28 14:51:10,663][57339] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-04-28 14:51:10,703][57319] Signal inference workers to resume experience collection... (13800 times) [2024-04-28 14:51:10,704][57339] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-04-28 14:51:10,822][57339] Updated weights for policy 0, policy_version 637828 (0.0026) [2024-04-28 14:51:12,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56251.9, 300 sec: 55872.3). Total num frames: 10450272256. Throughput: 0: 55957.5. Samples: 940621120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:12,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 14:51:13,166][57339] Updated weights for policy 0, policy_version 637838 (0.0030) [2024-04-28 14:51:16,556][57339] Updated weights for policy 0, policy_version 637848 (0.0026) [2024-04-28 14:51:17,169][57108] Fps is (10 sec: 54068.8, 60 sec: 56252.0, 300 sec: 55927.8). Total num frames: 10450550784. Throughput: 0: 55878.8. Samples: 940952140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:17,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 14:51:19,281][57339] Updated weights for policy 0, policy_version 637858 (0.0027) [2024-04-28 14:51:22,169][57108] Fps is (10 sec: 50790.7, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 10450780160. Throughput: 0: 55856.5. Samples: 941124320. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:22,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 14:51:22,452][57339] Updated weights for policy 0, policy_version 637868 (0.0027) [2024-04-28 14:51:25,739][57339] Updated weights for policy 0, policy_version 637878 (0.0025) [2024-04-28 14:51:27,169][57108] Fps is (10 sec: 54065.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10451091456. Throughput: 0: 56025.5. Samples: 941465620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:27,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 14:51:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000637884_10451091456.pth... [2024-04-28 14:51:27,225][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000637068_10437722112.pth [2024-04-28 14:51:28,276][57339] Updated weights for policy 0, policy_version 637888 (0.0022) [2024-04-28 14:51:31,533][57339] Updated weights for policy 0, policy_version 637898 (0.0027) [2024-04-28 14:51:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10451353600. Throughput: 0: 55819.1. Samples: 941790220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:32,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 14:51:34,081][57339] Updated weights for policy 0, policy_version 637908 (0.0031) [2024-04-28 14:51:37,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10451632128. Throughput: 0: 55788.4. Samples: 941945760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:37,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 14:51:37,242][57339] Updated weights for policy 0, policy_version 637918 (0.0026) [2024-04-28 14:51:40,005][57339] Updated weights for policy 0, policy_version 637928 (0.0027) [2024-04-28 14:51:42,169][57108] Fps is (10 sec: 60620.2, 60 sec: 56251.6, 300 sec: 55872.2). 
Total num frames: 10451959808. Throughput: 0: 55703.9. Samples: 942277700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:42,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 14:51:43,067][57339] Updated weights for policy 0, policy_version 637938 (0.0029) [2024-04-28 14:51:45,770][57339] Updated weights for policy 0, policy_version 637948 (0.0033) [2024-04-28 14:51:46,947][57319] Signal inference workers to stop experience collection... (13850 times) [2024-04-28 14:51:46,949][57319] Signal inference workers to resume experience collection... (13850 times) [2024-04-28 14:51:46,964][57339] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-04-28 14:51:46,964][57339] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-04-28 14:51:47,169][57108] Fps is (10 sec: 62259.3, 60 sec: 57070.9, 300 sec: 55983.3). Total num frames: 10452254720. Throughput: 0: 55780.9. Samples: 942617560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:47,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 14:51:48,879][57339] Updated weights for policy 0, policy_version 637958 (0.0040) [2024-04-28 14:51:51,590][57339] Updated weights for policy 0, policy_version 637968 (0.0026) [2024-04-28 14:51:52,169][57108] Fps is (10 sec: 55706.1, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10452516864. Throughput: 0: 55757.1. Samples: 942797700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-04-28 14:51:52,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 14:51:54,555][57339] Updated weights for policy 0, policy_version 637978 (0.0031) [2024-04-28 14:51:57,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10452762624. Throughput: 0: 55798.7. Samples: 943132060. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:51:57,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 14:51:57,419][57339] Updated weights for policy 0, policy_version 637988 (0.0027) [2024-04-28 14:52:00,307][57339] Updated weights for policy 0, policy_version 637998 (0.0032) [2024-04-28 14:52:02,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10453024768. Throughput: 0: 55893.6. Samples: 943467360. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:02,170][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 14:52:03,316][57339] Updated weights for policy 0, policy_version 638008 (0.0040) [2024-04-28 14:52:06,504][57339] Updated weights for policy 0, policy_version 638018 (0.0024) [2024-04-28 14:52:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10453319680. Throughput: 0: 55539.5. Samples: 943623600. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:07,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 14:52:09,045][57339] Updated weights for policy 0, policy_version 638028 (0.0028) [2024-04-28 14:52:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10453581824. Throughput: 0: 55494.0. Samples: 943962840. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:12,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 14:52:12,387][57339] Updated weights for policy 0, policy_version 638038 (0.0033) [2024-04-28 14:52:14,927][57339] Updated weights for policy 0, policy_version 638048 (0.0025) [2024-04-28 14:52:17,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10453909504. Throughput: 0: 55663.1. Samples: 944295060. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:17,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 14:52:18,188][57339] Updated weights for policy 0, policy_version 638058 (0.0028) [2024-04-28 14:52:20,761][57339] Updated weights for policy 0, policy_version 638068 (0.0026) [2024-04-28 14:52:22,169][57108] Fps is (10 sec: 62258.2, 60 sec: 57070.7, 300 sec: 55872.2). Total num frames: 10454204416. Throughput: 0: 56295.4. Samples: 944479060. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:22,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 14:52:23,883][57339] Updated weights for policy 0, policy_version 638078 (0.0029) [2024-04-28 14:52:26,625][57339] Updated weights for policy 0, policy_version 638088 (0.0029) [2024-04-28 14:52:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56525.0, 300 sec: 55983.3). Total num frames: 10454482944. Throughput: 0: 56406.4. Samples: 944815980. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:27,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 14:52:29,592][57339] Updated weights for policy 0, policy_version 638098 (0.0028) [2024-04-28 14:52:32,169][57108] Fps is (10 sec: 50791.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10454712320. Throughput: 0: 56332.9. Samples: 945152540. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:32,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 14:52:32,496][57339] Updated weights for policy 0, policy_version 638108 (0.0027) [2024-04-28 14:52:33,244][57319] Signal inference workers to stop experience collection... (13900 times) [2024-04-28 14:52:33,276][57339] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-04-28 14:52:33,305][57319] Signal inference workers to resume experience collection... 
(13900 times) [2024-04-28 14:52:33,309][57339] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-04-28 14:52:35,526][57339] Updated weights for policy 0, policy_version 638118 (0.0028) [2024-04-28 14:52:37,169][57108] Fps is (10 sec: 52428.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10455007232. Throughput: 0: 55787.5. Samples: 945308140. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:37,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 14:52:38,391][57339] Updated weights for policy 0, policy_version 638128 (0.0026) [2024-04-28 14:52:41,481][57339] Updated weights for policy 0, policy_version 638138 (0.0029) [2024-04-28 14:52:42,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 10455252992. Throughput: 0: 55731.0. Samples: 945639960. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:42,169][57108] Avg episode reward: [(0, '0.703')] [2024-04-28 14:52:44,315][57339] Updated weights for policy 0, policy_version 638148 (0.0033) [2024-04-28 14:52:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10455564288. Throughput: 0: 55670.2. Samples: 945972520. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:47,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:52:47,216][57339] Updated weights for policy 0, policy_version 638158 (0.0029) [2024-04-28 14:52:50,171][57339] Updated weights for policy 0, policy_version 638168 (0.0027) [2024-04-28 14:52:52,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10455842816. Throughput: 0: 56200.5. Samples: 946152620. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:52,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 14:52:53,136][57339] Updated weights for policy 0, policy_version 638178 (0.0031) [2024-04-28 14:52:55,935][57339] Updated weights for policy 0, policy_version 638188 (0.0033) [2024-04-28 14:52:57,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 10456154112. Throughput: 0: 56034.2. Samples: 946484380. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:52:57,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 14:52:59,133][57339] Updated weights for policy 0, policy_version 638198 (0.0028) [2024-04-28 14:53:01,831][57339] Updated weights for policy 0, policy_version 638208 (0.0030) [2024-04-28 14:53:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 10456416256. Throughput: 0: 56012.5. Samples: 946815620. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:53:02,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 14:53:05,436][57339] Updated weights for policy 0, policy_version 638218 (0.0031) [2024-04-28 14:53:07,169][57108] Fps is (10 sec: 50789.3, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 10456662016. Throughput: 0: 55549.3. Samples: 946978780. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:53:07,170][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 14:53:07,739][57339] Updated weights for policy 0, policy_version 638228 (0.0027) [2024-04-28 14:53:11,334][57339] Updated weights for policy 0, policy_version 638238 (0.0029) [2024-04-28 14:53:12,169][57108] Fps is (10 sec: 52427.8, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10456940544. Throughput: 0: 55486.0. Samples: 947312860. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:53:12,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 14:53:13,574][57339] Updated weights for policy 0, policy_version 638248 (0.0028) [2024-04-28 14:53:17,101][57339] Updated weights for policy 0, policy_version 638258 (0.0031) [2024-04-28 14:53:17,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10457219072. Throughput: 0: 55447.9. Samples: 947647700. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:53:17,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 14:53:19,643][57339] Updated weights for policy 0, policy_version 638268 (0.0025) [2024-04-28 14:53:22,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 10457513984. Throughput: 0: 55593.4. Samples: 947809840. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:53:22,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 14:53:22,915][57339] Updated weights for policy 0, policy_version 638278 (0.0029) [2024-04-28 14:53:25,677][57339] Updated weights for policy 0, policy_version 638288 (0.0024) [2024-04-28 14:53:27,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 10457808896. Throughput: 0: 55596.0. Samples: 948141780. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:53:27,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 14:53:27,184][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000638294_10457808896.pth... [2024-04-28 14:53:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000637476_10444406784.pth [2024-04-28 14:53:28,885][57339] Updated weights for policy 0, policy_version 638298 (0.0030) [2024-04-28 14:53:31,375][57319] Signal inference workers to stop experience collection... (13950 times) [2024-04-28 14:53:31,375][57319] Signal inference workers to resume experience collection... 
(13950 times) [2024-04-28 14:53:31,402][57339] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-04-28 14:53:31,402][57339] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-04-28 14:53:31,484][57339] Updated weights for policy 0, policy_version 638308 (0.0027) [2024-04-28 14:53:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10458071040. Throughput: 0: 55469.9. Samples: 948468660. Policy #0 lag: (min: 0.0, avg: 7.2, max: 20.0) [2024-04-28 14:53:32,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 14:53:34,668][57339] Updated weights for policy 0, policy_version 638318 (0.0031) [2024-04-28 14:53:37,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10458349568. Throughput: 0: 55306.9. Samples: 948641440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:53:37,170][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 14:53:37,351][57339] Updated weights for policy 0, policy_version 638328 (0.0026) [2024-04-28 14:53:40,501][57339] Updated weights for policy 0, policy_version 638338 (0.0030) [2024-04-28 14:53:42,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10458611712. Throughput: 0: 55324.0. Samples: 948973960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:53:42,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 14:53:43,153][57339] Updated weights for policy 0, policy_version 638348 (0.0028) [2024-04-28 14:53:46,525][57339] Updated weights for policy 0, policy_version 638358 (0.0034) [2024-04-28 14:53:47,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10458890240. Throughput: 0: 55444.5. Samples: 949310620. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:53:47,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 14:53:49,037][57339] Updated weights for policy 0, policy_version 638368 (0.0027) [2024-04-28 14:53:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10459152384. Throughput: 0: 55350.1. Samples: 949469520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:53:52,169][57108] Avg episode reward: [(0, '0.697')] [2024-04-28 14:53:52,414][57339] Updated weights for policy 0, policy_version 638378 (0.0037) [2024-04-28 14:53:54,908][57339] Updated weights for policy 0, policy_version 638388 (0.0038) [2024-04-28 14:53:57,169][57108] Fps is (10 sec: 54066.3, 60 sec: 54613.2, 300 sec: 55650.0). Total num frames: 10459430912. Throughput: 0: 55354.3. Samples: 949803800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:53:57,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 14:53:58,151][57339] Updated weights for policy 0, policy_version 638398 (0.0029) [2024-04-28 14:54:00,781][57339] Updated weights for policy 0, policy_version 638408 (0.0028) [2024-04-28 14:54:02,169][57108] Fps is (10 sec: 60620.2, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10459758592. Throughput: 0: 55236.4. Samples: 950133340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:02,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 14:54:03,890][57339] Updated weights for policy 0, policy_version 638418 (0.0033) [2024-04-28 14:54:06,688][57339] Updated weights for policy 0, policy_version 638428 (0.0026) [2024-04-28 14:54:07,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 10460020736. Throughput: 0: 55613.3. Samples: 950312440. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:07,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 14:54:09,751][57339] Updated weights for policy 0, policy_version 638438 (0.0026) [2024-04-28 14:54:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10460299264. Throughput: 0: 55752.1. Samples: 950650620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:12,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 14:54:12,668][57339] Updated weights for policy 0, policy_version 638448 (0.0027) [2024-04-28 14:54:15,548][57339] Updated weights for policy 0, policy_version 638458 (0.0031) [2024-04-28 14:54:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10460577792. Throughput: 0: 55863.9. Samples: 950982540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:17,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 14:54:18,395][57339] Updated weights for policy 0, policy_version 638468 (0.0027) [2024-04-28 14:54:21,299][57339] Updated weights for policy 0, policy_version 638478 (0.0032) [2024-04-28 14:54:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10460839936. Throughput: 0: 55685.0. Samples: 951147260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:22,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 14:54:24,288][57339] Updated weights for policy 0, policy_version 638488 (0.0028) [2024-04-28 14:54:27,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10461134848. Throughput: 0: 55807.7. Samples: 951485320. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:27,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 14:54:27,310][57339] Updated weights for policy 0, policy_version 638498 (0.0028) [2024-04-28 14:54:30,077][57339] Updated weights for policy 0, policy_version 638508 (0.0032) [2024-04-28 14:54:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10461396992. Throughput: 0: 55767.4. Samples: 951820160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:32,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 14:54:33,654][57339] Updated weights for policy 0, policy_version 638518 (0.0024) [2024-04-28 14:54:35,931][57339] Updated weights for policy 0, policy_version 638528 (0.0028) [2024-04-28 14:54:37,169][57108] Fps is (10 sec: 54068.8, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 10461675520. Throughput: 0: 55822.7. Samples: 951981540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:37,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 14:54:39,616][57319] Signal inference workers to stop experience collection... (14000 times) [2024-04-28 14:54:39,618][57319] Signal inference workers to resume experience collection... (14000 times) [2024-04-28 14:54:39,626][57339] Updated weights for policy 0, policy_version 638538 (0.0028) [2024-04-28 14:54:39,644][57339] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-04-28 14:54:39,644][57339] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-04-28 14:54:41,684][57339] Updated weights for policy 0, policy_version 638548 (0.0029) [2024-04-28 14:54:42,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10461986816. Throughput: 0: 55938.8. Samples: 952321040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:42,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 14:54:45,298][57339] Updated weights for policy 0, policy_version 638558 (0.0028) [2024-04-28 14:54:47,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10462248960. Throughput: 0: 56136.9. Samples: 952659500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:47,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 14:54:47,511][57339] Updated weights for policy 0, policy_version 638568 (0.0026) [2024-04-28 14:54:51,048][57339] Updated weights for policy 0, policy_version 638578 (0.0033) [2024-04-28 14:54:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10462511104. Throughput: 0: 55865.3. Samples: 952826380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:52,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 14:54:53,390][57339] Updated weights for policy 0, policy_version 638588 (0.0034) [2024-04-28 14:54:57,046][57339] Updated weights for policy 0, policy_version 638598 (0.0030) [2024-04-28 14:54:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10462806016. Throughput: 0: 55831.0. Samples: 953163020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:54:57,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 14:54:59,285][57339] Updated weights for policy 0, policy_version 638608 (0.0027) [2024-04-28 14:55:02,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10463084544. Throughput: 0: 55863.1. Samples: 953496380. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:55:02,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 14:55:02,880][57339] Updated weights for policy 0, policy_version 638618 (0.0024) [2024-04-28 14:55:05,133][57339] Updated weights for policy 0, policy_version 638628 (0.0031) [2024-04-28 14:55:07,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10463363072. Throughput: 0: 55964.4. Samples: 953665660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-04-28 14:55:07,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 14:55:08,643][57339] Updated weights for policy 0, policy_version 638638 (0.0027) [2024-04-28 14:55:10,855][57339] Updated weights for policy 0, policy_version 638648 (0.0033) [2024-04-28 14:55:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10463641600. Throughput: 0: 55817.0. Samples: 953997080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:12,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 14:55:14,404][57339] Updated weights for policy 0, policy_version 638658 (0.0033) [2024-04-28 14:55:16,727][57339] Updated weights for policy 0, policy_version 638668 (0.0027) [2024-04-28 14:55:17,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10463952896. Throughput: 0: 55784.4. Samples: 954330460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:17,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 14:55:20,364][57339] Updated weights for policy 0, policy_version 638678 (0.0025) [2024-04-28 14:55:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10464215040. Throughput: 0: 56154.5. Samples: 954508500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:22,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 14:55:22,498][57339] Updated weights for policy 0, policy_version 638688 (0.0031) [2024-04-28 14:55:26,272][57339] Updated weights for policy 0, policy_version 638698 (0.0028) [2024-04-28 14:55:27,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10464477184. Throughput: 0: 56022.0. Samples: 954842040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:27,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 14:55:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000638701_10464477184.pth... [2024-04-28 14:55:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000637884_10451091456.pth [2024-04-28 14:55:28,018][57319] Signal inference workers to stop experience collection... (14050 times) [2024-04-28 14:55:28,024][57319] Signal inference workers to resume experience collection... (14050 times) [2024-04-28 14:55:28,039][57339] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-04-28 14:55:28,039][57339] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-04-28 14:55:28,267][57339] Updated weights for policy 0, policy_version 638708 (0.0031) [2024-04-28 14:55:32,159][57339] Updated weights for policy 0, policy_version 638718 (0.0028) [2024-04-28 14:55:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10464755712. Throughput: 0: 56053.3. Samples: 955181900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:32,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 14:55:34,194][57339] Updated weights for policy 0, policy_version 638728 (0.0035) [2024-04-28 14:55:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10465034240. Throughput: 0: 55775.5. 
Samples: 955336280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:37,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 14:55:37,956][57339] Updated weights for policy 0, policy_version 638738 (0.0030) [2024-04-28 14:55:40,084][57339] Updated weights for policy 0, policy_version 638748 (0.0029) [2024-04-28 14:55:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10465312768. Throughput: 0: 55836.6. Samples: 955675660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:42,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 14:55:43,890][57339] Updated weights for policy 0, policy_version 638758 (0.0036) [2024-04-28 14:55:46,032][57339] Updated weights for policy 0, policy_version 638768 (0.0033) [2024-04-28 14:55:47,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10465607680. Throughput: 0: 55882.7. Samples: 956011100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:47,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 14:55:49,739][57339] Updated weights for policy 0, policy_version 638778 (0.0026) [2024-04-28 14:55:51,755][57339] Updated weights for policy 0, policy_version 638788 (0.0031) [2024-04-28 14:55:52,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10465902592. Throughput: 0: 56038.3. Samples: 956187380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:52,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 14:55:55,459][57339] Updated weights for policy 0, policy_version 638798 (0.0027) [2024-04-28 14:55:57,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10466197504. Throughput: 0: 56125.8. Samples: 956522740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:55:57,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 14:55:57,427][57339] Updated weights for policy 0, policy_version 638808 (0.0027) [2024-04-28 14:56:01,196][57339] Updated weights for policy 0, policy_version 638818 (0.0026) [2024-04-28 14:56:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10466443264. Throughput: 0: 56220.1. Samples: 956860360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:56:02,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 14:56:03,399][57339] Updated weights for policy 0, policy_version 638828 (0.0030) [2024-04-28 14:56:07,169][57108] Fps is (10 sec: 50790.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10466705408. Throughput: 0: 55849.2. Samples: 957021720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:56:07,170][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 14:56:07,348][57339] Updated weights for policy 0, policy_version 638838 (0.0030) [2024-04-28 14:56:09,287][57339] Updated weights for policy 0, policy_version 638848 (0.0032) [2024-04-28 14:56:12,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 10466983936. Throughput: 0: 55937.2. Samples: 957359200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:56:12,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 14:56:13,165][57339] Updated weights for policy 0, policy_version 638858 (0.0025) [2024-04-28 14:56:14,417][57319] Signal inference workers to stop experience collection... (14100 times) [2024-04-28 14:56:14,417][57319] Signal inference workers to resume experience collection... 
(14100 times) [2024-04-28 14:56:14,444][57339] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-04-28 14:56:14,444][57339] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-04-28 14:56:14,940][57339] Updated weights for policy 0, policy_version 638868 (0.0023) [2024-04-28 14:56:17,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 10467295232. Throughput: 0: 55939.1. Samples: 957699160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:56:17,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 14:56:19,266][57339] Updated weights for policy 0, policy_version 638878 (0.0029) [2024-04-28 14:56:21,143][57339] Updated weights for policy 0, policy_version 638888 (0.0030) [2024-04-28 14:56:22,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 55872.3). Total num frames: 10467573760. Throughput: 0: 56360.6. Samples: 957872500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:56:22,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 14:56:25,109][57339] Updated weights for policy 0, policy_version 638898 (0.0033) [2024-04-28 14:56:27,067][57339] Updated weights for policy 0, policy_version 638908 (0.0027) [2024-04-28 14:56:27,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10467868672. Throughput: 0: 56156.0. Samples: 958202680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:56:27,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 14:56:30,916][57339] Updated weights for policy 0, policy_version 638918 (0.0030) [2024-04-28 14:56:32,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10468147200. Throughput: 0: 56112.4. Samples: 958536160. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:56:32,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 14:56:32,900][57339] Updated weights for policy 0, policy_version 638928 (0.0026) [2024-04-28 14:56:36,719][57339] Updated weights for policy 0, policy_version 638938 (0.0029) [2024-04-28 14:56:37,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10468376576. Throughput: 0: 55888.1. Samples: 958702340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:56:37,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 14:56:38,950][57339] Updated weights for policy 0, policy_version 638948 (0.0025) [2024-04-28 14:56:42,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10468655104. Throughput: 0: 55913.4. Samples: 959038840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 14:56:42,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 14:56:42,513][57339] Updated weights for policy 0, policy_version 638958 (0.0026) [2024-04-28 14:56:44,862][57339] Updated weights for policy 0, policy_version 638968 (0.0025) [2024-04-28 14:56:47,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10468950016. Throughput: 0: 55741.7. Samples: 959368740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:56:47,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 14:56:48,393][57339] Updated weights for policy 0, policy_version 638978 (0.0033) [2024-04-28 14:56:50,599][57339] Updated weights for policy 0, policy_version 638988 (0.0030) [2024-04-28 14:56:52,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10469244928. Throughput: 0: 55753.9. Samples: 959530640. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:56:52,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 14:56:54,336][57339] Updated weights for policy 0, policy_version 638998 (0.0025) [2024-04-28 14:56:56,287][57339] Updated weights for policy 0, policy_version 639008 (0.0027) [2024-04-28 14:56:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 10469523456. Throughput: 0: 55682.1. Samples: 959864900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:56:57,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 14:57:00,093][57339] Updated weights for policy 0, policy_version 639018 (0.0026) [2024-04-28 14:57:00,876][57319] Signal inference workers to stop experience collection... (14150 times) [2024-04-28 14:57:00,908][57339] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-04-28 14:57:00,966][57319] Signal inference workers to resume experience collection... (14150 times) [2024-04-28 14:57:00,967][57339] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-04-28 14:57:02,116][57339] Updated weights for policy 0, policy_version 639028 (0.0025) [2024-04-28 14:57:02,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10469834752. Throughput: 0: 55501.9. Samples: 960196740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:02,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 14:57:06,016][57339] Updated weights for policy 0, policy_version 639038 (0.0028) [2024-04-28 14:57:07,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10470096896. Throughput: 0: 55748.8. Samples: 960381200. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:07,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 14:57:07,971][57339] Updated weights for policy 0, policy_version 639048 (0.0032) [2024-04-28 14:57:11,833][57339] Updated weights for policy 0, policy_version 639058 (0.0027) [2024-04-28 14:57:12,169][57108] Fps is (10 sec: 49151.8, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10470326272. Throughput: 0: 55901.3. Samples: 960718240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:12,170][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 14:57:13,758][57339] Updated weights for policy 0, policy_version 639068 (0.0030) [2024-04-28 14:57:17,169][57108] Fps is (10 sec: 50790.9, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 10470604800. Throughput: 0: 55917.9. Samples: 961052460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 14:57:17,675][57339] Updated weights for policy 0, policy_version 639078 (0.0025) [2024-04-28 14:57:20,076][57339] Updated weights for policy 0, policy_version 639088 (0.0027) [2024-04-28 14:57:22,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10470899712. Throughput: 0: 55691.1. Samples: 961208440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:22,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 14:57:23,351][57339] Updated weights for policy 0, policy_version 639098 (0.0034) [2024-04-28 14:57:25,947][57339] Updated weights for policy 0, policy_version 639108 (0.0030) [2024-04-28 14:57:27,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10471194624. Throughput: 0: 55788.5. Samples: 961549320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:27,169][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 14:57:27,207][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000639112_10471211008.pth... [2024-04-28 14:57:27,254][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000638294_10457808896.pth [2024-04-28 14:57:29,106][57339] Updated weights for policy 0, policy_version 639118 (0.0030) [2024-04-28 14:57:31,692][57339] Updated weights for policy 0, policy_version 639128 (0.0026) [2024-04-28 14:57:32,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10471489536. Throughput: 0: 55919.2. Samples: 961885100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:32,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 14:57:34,966][57339] Updated weights for policy 0, policy_version 639138 (0.0035) [2024-04-28 14:57:37,169][57108] Fps is (10 sec: 58981.4, 60 sec: 56797.7, 300 sec: 56038.8). Total num frames: 10471784448. Throughput: 0: 56312.8. Samples: 962064720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:37,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 14:57:37,521][57339] Updated weights for policy 0, policy_version 639148 (0.0033) [2024-04-28 14:57:40,854][57339] Updated weights for policy 0, policy_version 639158 (0.0025) [2024-04-28 14:57:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56797.9, 300 sec: 55927.8). Total num frames: 10472062976. Throughput: 0: 56309.9. Samples: 962398840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:42,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 14:57:43,453][57339] Updated weights for policy 0, policy_version 639168 (0.0031) [2024-04-28 14:57:46,425][57319] Signal inference workers to stop experience collection... 
(14200 times) [2024-04-28 14:57:46,464][57339] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-04-28 14:57:46,518][57319] Signal inference workers to resume experience collection... (14200 times) [2024-04-28 14:57:46,518][57339] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-04-28 14:57:46,791][57339] Updated weights for policy 0, policy_version 639178 (0.0025) [2024-04-28 14:57:47,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10472308736. Throughput: 0: 56336.4. Samples: 962731880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:47,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 14:57:49,407][57339] Updated weights for policy 0, policy_version 639188 (0.0030) [2024-04-28 14:57:52,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10472570880. Throughput: 0: 55747.3. Samples: 962889820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:52,169][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 14:57:52,633][57339] Updated weights for policy 0, policy_version 639198 (0.0030) [2024-04-28 14:57:55,111][57339] Updated weights for policy 0, policy_version 639208 (0.0027) [2024-04-28 14:57:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10472865792. Throughput: 0: 55604.4. Samples: 963220440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:57:57,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 14:57:58,559][57339] Updated weights for policy 0, policy_version 639218 (0.0031) [2024-04-28 14:58:00,962][57339] Updated weights for policy 0, policy_version 639228 (0.0028) [2024-04-28 14:58:02,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.5, 300 sec: 55872.3). Total num frames: 10473144320. Throughput: 0: 55539.6. Samples: 963551740. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:58:02,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 14:58:04,426][57339] Updated weights for policy 0, policy_version 639238 (0.0029) [2024-04-28 14:58:07,101][57339] Updated weights for policy 0, policy_version 639248 (0.0027) [2024-04-28 14:58:07,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10473439232. Throughput: 0: 56028.8. Samples: 963729740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:58:07,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 14:58:10,355][57339] Updated weights for policy 0, policy_version 639258 (0.0027) [2024-04-28 14:58:12,169][57108] Fps is (10 sec: 57343.0, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10473717760. Throughput: 0: 55938.9. Samples: 964066580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:58:12,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 14:58:12,902][57339] Updated weights for policy 0, policy_version 639268 (0.0024) [2024-04-28 14:58:16,379][57339] Updated weights for policy 0, policy_version 639278 (0.0037) [2024-04-28 14:58:17,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56797.9, 300 sec: 55927.8). Total num frames: 10474012672. Throughput: 0: 55821.0. Samples: 964397040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 14:58:17,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 14:58:18,762][57339] Updated weights for policy 0, policy_version 639288 (0.0031) [2024-04-28 14:58:22,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10474242048. Throughput: 0: 55551.3. Samples: 964564520. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:58:22,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 14:58:22,254][57339] Updated weights for policy 0, policy_version 639298 (0.0026) [2024-04-28 14:58:24,842][57339] Updated weights for policy 0, policy_version 639308 (0.0025) [2024-04-28 14:58:27,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10474520576. Throughput: 0: 55568.9. Samples: 964899440. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:58:27,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 14:58:28,044][57339] Updated weights for policy 0, policy_version 639318 (0.0030) [2024-04-28 14:58:30,803][57339] Updated weights for policy 0, policy_version 639328 (0.0028) [2024-04-28 14:58:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 10474799104. Throughput: 0: 55540.0. Samples: 965231180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:58:32,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 14:58:34,000][57339] Updated weights for policy 0, policy_version 639338 (0.0030) [2024-04-28 14:58:35,277][57319] Signal inference workers to stop experience collection... (14250 times) [2024-04-28 14:58:35,333][57319] Signal inference workers to resume experience collection... (14250 times) [2024-04-28 14:58:35,333][57339] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-04-28 14:58:35,358][57339] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-04-28 14:58:36,750][57339] Updated weights for policy 0, policy_version 639348 (0.0028) [2024-04-28 14:58:37,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55159.6, 300 sec: 55872.2). Total num frames: 10475094016. Throughput: 0: 55590.2. Samples: 965391380. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:58:37,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 14:58:39,944][57339] Updated weights for policy 0, policy_version 639358 (0.0031) [2024-04-28 14:58:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55159.4, 300 sec: 55872.2). Total num frames: 10475372544. Throughput: 0: 55699.6. Samples: 965726920. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:58:42,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 14:58:42,744][57339] Updated weights for policy 0, policy_version 639368 (0.0035) [2024-04-28 14:58:45,740][57339] Updated weights for policy 0, policy_version 639378 (0.0033) [2024-04-28 14:58:47,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55978.5, 300 sec: 55983.2). Total num frames: 10475667456. Throughput: 0: 55609.0. Samples: 966054160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:58:47,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 14:58:48,569][57339] Updated weights for policy 0, policy_version 639388 (0.0033) [2024-04-28 14:58:51,604][57339] Updated weights for policy 0, policy_version 639398 (0.0028) [2024-04-28 14:58:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10475929600. Throughput: 0: 55535.2. Samples: 966228820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:58:52,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 14:58:54,520][57339] Updated weights for policy 0, policy_version 639408 (0.0025) [2024-04-28 14:58:57,169][57108] Fps is (10 sec: 52430.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10476191744. Throughput: 0: 55529.1. Samples: 966565380. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:58:57,169][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 14:58:57,483][57339] Updated weights for policy 0, policy_version 639418 (0.0028) [2024-04-28 14:59:00,513][57339] Updated weights for policy 0, policy_version 639428 (0.0029) [2024-04-28 14:59:02,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10476470272. Throughput: 0: 55615.0. Samples: 966899720. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:02,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 14:59:03,267][57339] Updated weights for policy 0, policy_version 639438 (0.0027) [2024-04-28 14:59:06,431][57339] Updated weights for policy 0, policy_version 639448 (0.0029) [2024-04-28 14:59:07,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10476748800. Throughput: 0: 55330.9. Samples: 967054420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:07,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 14:59:09,115][57339] Updated weights for policy 0, policy_version 639458 (0.0027) [2024-04-28 14:59:12,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 10477027328. Throughput: 0: 55295.2. Samples: 967387720. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:12,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 14:59:12,227][57339] Updated weights for policy 0, policy_version 639468 (0.0026) [2024-04-28 14:59:14,907][57339] Updated weights for policy 0, policy_version 639478 (0.0025) [2024-04-28 14:59:17,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55159.4, 300 sec: 55872.2). Total num frames: 10477322240. Throughput: 0: 55378.6. Samples: 967723220. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:17,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 14:59:18,049][57339] Updated weights for policy 0, policy_version 639488 (0.0029) [2024-04-28 14:59:21,015][57339] Updated weights for policy 0, policy_version 639498 (0.0034) [2024-04-28 14:59:22,169][57108] Fps is (10 sec: 60619.7, 60 sec: 56524.7, 300 sec: 55927.8). Total num frames: 10477633536. Throughput: 0: 55859.9. Samples: 967905080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:22,170][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 14:59:23,871][57339] Updated weights for policy 0, policy_version 639508 (0.0028) [2024-04-28 14:59:25,753][57319] Signal inference workers to stop experience collection... (14300 times) [2024-04-28 14:59:25,754][57319] Signal inference workers to resume experience collection... (14300 times) [2024-04-28 14:59:25,769][57339] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-04-28 14:59:25,769][57339] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-04-28 14:59:27,077][57339] Updated weights for policy 0, policy_version 639518 (0.0027) [2024-04-28 14:59:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10477862912. Throughput: 0: 55823.5. Samples: 968238980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:27,170][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 14:59:27,308][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000639520_10477895680.pth... [2024-04-28 14:59:27,366][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000638701_10464477184.pth [2024-04-28 14:59:29,888][57339] Updated weights for policy 0, policy_version 639528 (0.0023) [2024-04-28 14:59:32,169][57108] Fps is (10 sec: 50790.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10478141440. Throughput: 0: 55764.7. 
Samples: 968563560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:32,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 14:59:32,831][57339] Updated weights for policy 0, policy_version 639538 (0.0028) [2024-04-28 14:59:35,680][57339] Updated weights for policy 0, policy_version 639548 (0.0026) [2024-04-28 14:59:37,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10478403584. Throughput: 0: 55438.6. Samples: 968723560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:37,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 14:59:38,645][57339] Updated weights for policy 0, policy_version 639558 (0.0028) [2024-04-28 14:59:41,608][57339] Updated weights for policy 0, policy_version 639568 (0.0028) [2024-04-28 14:59:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10478698496. Throughput: 0: 55433.3. Samples: 969059880. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:42,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 14:59:44,590][57339] Updated weights for policy 0, policy_version 639578 (0.0029) [2024-04-28 14:59:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.6, 300 sec: 55761.1). Total num frames: 10478960640. Throughput: 0: 55441.9. Samples: 969394600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:47,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 14:59:47,582][57339] Updated weights for policy 0, policy_version 639588 (0.0029) [2024-04-28 14:59:50,289][57339] Updated weights for policy 0, policy_version 639598 (0.0030) [2024-04-28 14:59:52,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10479255552. Throughput: 0: 55733.7. Samples: 969562440. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:52,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 14:59:53,571][57339] Updated weights for policy 0, policy_version 639608 (0.0025) [2024-04-28 14:59:56,076][57339] Updated weights for policy 0, policy_version 639618 (0.0025) [2024-04-28 14:59:57,169][57108] Fps is (10 sec: 62259.2, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 10479583232. Throughput: 0: 55801.2. Samples: 969898780. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-04-28 14:59:57,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 14:59:59,494][57339] Updated weights for policy 0, policy_version 639628 (0.0033) [2024-04-28 15:00:02,022][57339] Updated weights for policy 0, policy_version 639638 (0.0033) [2024-04-28 15:00:02,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10479828992. Throughput: 0: 55712.4. Samples: 970230280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:02,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 15:00:05,279][57339] Updated weights for policy 0, policy_version 639648 (0.0029) [2024-04-28 15:00:07,169][57108] Fps is (10 sec: 49151.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10480074752. Throughput: 0: 55363.9. Samples: 970396460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:07,170][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 15:00:08,057][57339] Updated weights for policy 0, policy_version 639658 (0.0023) [2024-04-28 15:00:11,117][57339] Updated weights for policy 0, policy_version 639668 (0.0027) [2024-04-28 15:00:12,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10480353280. Throughput: 0: 55308.2. Samples: 970727840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:12,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 15:00:13,953][57339] Updated weights for policy 0, policy_version 639678 (0.0032) [2024-04-28 15:00:14,284][57319] Signal inference workers to stop experience collection... (14350 times) [2024-04-28 15:00:14,324][57339] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-04-28 15:00:14,335][57319] Signal inference workers to resume experience collection... (14350 times) [2024-04-28 15:00:14,339][57339] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-04-28 15:00:17,030][57339] Updated weights for policy 0, policy_version 639688 (0.0030) [2024-04-28 15:00:17,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10480648192. Throughput: 0: 55415.1. Samples: 971057240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:17,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 15:00:19,771][57339] Updated weights for policy 0, policy_version 639698 (0.0029) [2024-04-28 15:00:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54886.5, 300 sec: 55761.2). Total num frames: 10480926720. Throughput: 0: 55576.9. Samples: 971224520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:22,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 15:00:22,996][57339] Updated weights for policy 0, policy_version 639708 (0.0032) [2024-04-28 15:00:25,597][57339] Updated weights for policy 0, policy_version 639718 (0.0031) [2024-04-28 15:00:27,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10481188864. Throughput: 0: 55480.4. Samples: 971556500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:27,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:00:28,739][57339] Updated weights for policy 0, policy_version 639728 (0.0031) [2024-04-28 15:00:31,461][57339] Updated weights for policy 0, policy_version 639738 (0.0023) [2024-04-28 15:00:32,169][57108] Fps is (10 sec: 58981.7, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10481516544. Throughput: 0: 55446.9. Samples: 971889720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:32,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 15:00:34,724][57339] Updated weights for policy 0, policy_version 639748 (0.0033) [2024-04-28 15:00:37,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10481762304. Throughput: 0: 55577.6. Samples: 972063420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:37,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 15:00:37,324][57339] Updated weights for policy 0, policy_version 639758 (0.0028) [2024-04-28 15:00:40,862][57339] Updated weights for policy 0, policy_version 639768 (0.0023) [2024-04-28 15:00:42,169][57108] Fps is (10 sec: 50791.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10482024448. Throughput: 0: 55594.3. Samples: 972400520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:42,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:00:43,065][57339] Updated weights for policy 0, policy_version 639778 (0.0028) [2024-04-28 15:00:46,706][57339] Updated weights for policy 0, policy_version 639788 (0.0037) [2024-04-28 15:00:47,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10482302976. Throughput: 0: 55658.6. Samples: 972734920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:47,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 15:00:49,006][57339] Updated weights for policy 0, policy_version 639798 (0.0033) [2024-04-28 15:00:52,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10482597888. Throughput: 0: 55276.5. Samples: 972883900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:52,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 15:00:52,541][57339] Updated weights for policy 0, policy_version 639808 (0.0031) [2024-04-28 15:00:54,987][57339] Updated weights for policy 0, policy_version 639818 (0.0026) [2024-04-28 15:00:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10482876416. Throughput: 0: 55434.1. Samples: 973222380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:00:57,169][57108] Avg episode reward: [(0, '0.484')] [2024-04-28 15:00:58,243][57339] Updated weights for policy 0, policy_version 639828 (0.0024) [2024-04-28 15:01:00,903][57339] Updated weights for policy 0, policy_version 639838 (0.0025) [2024-04-28 15:01:02,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10483171328. Throughput: 0: 55537.5. Samples: 973556440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:01:02,170][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 15:01:03,944][57339] Updated weights for policy 0, policy_version 639848 (0.0031) [2024-04-28 15:01:06,536][57319] Signal inference workers to stop experience collection... (14400 times) [2024-04-28 15:01:06,580][57339] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-04-28 15:01:06,592][57319] Signal inference workers to resume experience collection... 
(14400 times) [2024-04-28 15:01:06,598][57339] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-04-28 15:01:06,703][57339] Updated weights for policy 0, policy_version 639858 (0.0029) [2024-04-28 15:01:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10483449856. Throughput: 0: 55742.6. Samples: 973732940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:01:07,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 15:01:09,831][57339] Updated weights for policy 0, policy_version 639868 (0.0034) [2024-04-28 15:01:12,169][57108] Fps is (10 sec: 54068.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10483712000. Throughput: 0: 55861.4. Samples: 974070260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:01:12,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 15:01:12,483][57339] Updated weights for policy 0, policy_version 639878 (0.0026) [2024-04-28 15:01:15,735][57339] Updated weights for policy 0, policy_version 639888 (0.0026) [2024-04-28 15:01:17,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10483974144. Throughput: 0: 55964.0. Samples: 974408100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:01:17,170][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 15:01:18,460][57339] Updated weights for policy 0, policy_version 639898 (0.0031) [2024-04-28 15:01:21,727][57339] Updated weights for policy 0, policy_version 639908 (0.0028) [2024-04-28 15:01:22,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10484269056. Throughput: 0: 55604.2. Samples: 974565620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:01:22,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 15:01:24,474][57339] Updated weights for policy 0, policy_version 639918 (0.0030) [2024-04-28 15:01:27,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10484547584. Throughput: 0: 55519.4. Samples: 974898900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:01:27,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 15:01:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000639926_10484547584.pth... [2024-04-28 15:01:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000639112_10471211008.pth [2024-04-28 15:01:27,765][57339] Updated weights for policy 0, policy_version 639928 (0.0028) [2024-04-28 15:01:30,223][57339] Updated weights for policy 0, policy_version 639938 (0.0034) [2024-04-28 15:01:32,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 10484826112. Throughput: 0: 55500.1. Samples: 975232420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-04-28 15:01:32,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 15:01:33,638][57339] Updated weights for policy 0, policy_version 639948 (0.0024) [2024-04-28 15:01:36,012][57339] Updated weights for policy 0, policy_version 639958 (0.0029) [2024-04-28 15:01:37,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10485121024. Throughput: 0: 56063.6. Samples: 975406760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:01:37,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 15:01:39,400][57339] Updated weights for policy 0, policy_version 639968 (0.0029) [2024-04-28 15:01:42,036][57339] Updated weights for policy 0, policy_version 639978 (0.0024) [2024-04-28 15:01:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 55761.2). 
Total num frames: 10485399552. Throughput: 0: 55981.9. Samples: 975741560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:01:42,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:01:45,160][57339] Updated weights for policy 0, policy_version 639988 (0.0026) [2024-04-28 15:01:47,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10485645312. Throughput: 0: 55940.9. Samples: 976073780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:01:47,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 15:01:48,085][57339] Updated weights for policy 0, policy_version 639998 (0.0031) [2024-04-28 15:01:51,191][57339] Updated weights for policy 0, policy_version 640008 (0.0025) [2024-04-28 15:01:52,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10485940224. Throughput: 0: 55584.0. Samples: 976234220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:01:52,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 15:01:53,984][57339] Updated weights for policy 0, policy_version 640018 (0.0031) [2024-04-28 15:01:56,929][57339] Updated weights for policy 0, policy_version 640028 (0.0026) [2024-04-28 15:01:57,169][57108] Fps is (10 sec: 57345.2, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10486218752. Throughput: 0: 55688.5. Samples: 976576240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:01:57,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 15:01:58,948][57319] Signal inference workers to stop experience collection... (14450 times) [2024-04-28 15:01:58,948][57319] Signal inference workers to resume experience collection... 
(14450 times) [2024-04-28 15:01:58,969][57339] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-04-28 15:01:58,969][57339] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-04-28 15:01:59,696][57339] Updated weights for policy 0, policy_version 640038 (0.0032) [2024-04-28 15:02:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10486513664. Throughput: 0: 55639.3. Samples: 976911860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:02,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 15:02:02,781][57339] Updated weights for policy 0, policy_version 640048 (0.0026) [2024-04-28 15:02:05,397][57339] Updated weights for policy 0, policy_version 640058 (0.0026) [2024-04-28 15:02:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10486759424. Throughput: 0: 55858.9. Samples: 977079260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:07,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 15:02:08,709][57339] Updated weights for policy 0, policy_version 640068 (0.0030) [2024-04-28 15:02:11,312][57339] Updated weights for policy 0, policy_version 640078 (0.0025) [2024-04-28 15:02:12,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10487054336. Throughput: 0: 55959.3. Samples: 977417060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:12,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 15:02:14,508][57339] Updated weights for policy 0, policy_version 640088 (0.0026) [2024-04-28 15:02:17,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.9, 300 sec: 55761.1). Total num frames: 10487349248. Throughput: 0: 55992.0. Samples: 977752060. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:17,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 15:02:17,196][57339] Updated weights for policy 0, policy_version 640098 (0.0032) [2024-04-28 15:02:20,342][57339] Updated weights for policy 0, policy_version 640108 (0.0026) [2024-04-28 15:02:22,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 10487611392. Throughput: 0: 55863.6. Samples: 977920620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:22,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 15:02:22,991][57339] Updated weights for policy 0, policy_version 640118 (0.0026) [2024-04-28 15:02:26,088][57339] Updated weights for policy 0, policy_version 640128 (0.0024) [2024-04-28 15:02:27,169][57108] Fps is (10 sec: 52427.8, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10487873536. Throughput: 0: 55757.5. Samples: 978250660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:27,170][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 15:02:29,003][57339] Updated weights for policy 0, policy_version 640138 (0.0028) [2024-04-28 15:02:31,789][57339] Updated weights for policy 0, policy_version 640148 (0.0025) [2024-04-28 15:02:32,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10488184832. Throughput: 0: 55853.5. Samples: 978587180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:32,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 15:02:35,070][57339] Updated weights for policy 0, policy_version 640158 (0.0029) [2024-04-28 15:02:37,169][57108] Fps is (10 sec: 60621.3, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10488479744. Throughput: 0: 56054.3. Samples: 978756660. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:37,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 15:02:37,676][57339] Updated weights for policy 0, policy_version 640168 (0.0025) [2024-04-28 15:02:40,755][57339] Updated weights for policy 0, policy_version 640178 (0.0033) [2024-04-28 15:02:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10488741888. Throughput: 0: 55943.6. Samples: 979093700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:42,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 15:02:43,647][57339] Updated weights for policy 0, policy_version 640188 (0.0026) [2024-04-28 15:02:46,496][57339] Updated weights for policy 0, policy_version 640198 (0.0031) [2024-04-28 15:02:47,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55978.6, 300 sec: 55705.5). Total num frames: 10489004032. Throughput: 0: 55803.3. Samples: 979423020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:47,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:02:47,815][57319] Signal inference workers to stop experience collection... (14500 times) [2024-04-28 15:02:47,815][57319] Signal inference workers to resume experience collection... (14500 times) [2024-04-28 15:02:47,842][57339] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-04-28 15:02:47,842][57339] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-04-28 15:02:49,419][57339] Updated weights for policy 0, policy_version 640208 (0.0029) [2024-04-28 15:02:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10489315328. Throughput: 0: 55833.3. Samples: 979591760. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:52,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 15:02:52,320][57339] Updated weights for policy 0, policy_version 640218 (0.0036) [2024-04-28 15:02:55,510][57339] Updated weights for policy 0, policy_version 640228 (0.0030) [2024-04-28 15:02:57,169][57108] Fps is (10 sec: 57345.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10489577472. Throughput: 0: 55804.8. Samples: 979928280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:02:57,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 15:02:58,214][57339] Updated weights for policy 0, policy_version 640238 (0.0031) [2024-04-28 15:03:01,479][57339] Updated weights for policy 0, policy_version 640248 (0.0033) [2024-04-28 15:03:02,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10489839616. Throughput: 0: 55911.0. Samples: 980268060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:03:02,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 15:03:03,990][57339] Updated weights for policy 0, policy_version 640258 (0.0025) [2024-04-28 15:03:07,141][57339] Updated weights for policy 0, policy_version 640268 (0.0030) [2024-04-28 15:03:07,169][57108] Fps is (10 sec: 57343.0, 60 sec: 56524.6, 300 sec: 55705.6). Total num frames: 10490150912. Throughput: 0: 56018.1. Samples: 980441440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 15:03:07,170][57108] Avg episode reward: [(0, '0.703')] [2024-04-28 15:03:09,728][57339] Updated weights for policy 0, policy_version 640278 (0.0024) [2024-04-28 15:03:12,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10490413056. Throughput: 0: 56118.5. Samples: 980775980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:12,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 15:03:13,082][57339] Updated weights for policy 0, policy_version 640288 (0.0031) [2024-04-28 15:03:15,984][57339] Updated weights for policy 0, policy_version 640298 (0.0028) [2024-04-28 15:03:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.4, 300 sec: 55816.6). Total num frames: 10490707968. Throughput: 0: 56019.3. Samples: 981108060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:17,170][57108] Avg episode reward: [(0, '0.513')] [2024-04-28 15:03:18,992][57339] Updated weights for policy 0, policy_version 640308 (0.0028) [2024-04-28 15:03:21,884][57339] Updated weights for policy 0, policy_version 640318 (0.0032) [2024-04-28 15:03:22,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10490970112. Throughput: 0: 56024.0. Samples: 981277740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:22,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 15:03:24,793][57339] Updated weights for policy 0, policy_version 640328 (0.0034) [2024-04-28 15:03:27,169][57108] Fps is (10 sec: 55706.5, 60 sec: 56524.9, 300 sec: 55816.7). Total num frames: 10491265024. Throughput: 0: 56032.3. Samples: 981615160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:27,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:03:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000640336_10491265024.pth... [2024-04-28 15:03:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000639520_10477895680.pth [2024-04-28 15:03:27,716][57339] Updated weights for policy 0, policy_version 640338 (0.0027) [2024-04-28 15:03:30,592][57339] Updated weights for policy 0, policy_version 640348 (0.0027) [2024-04-28 15:03:32,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55761.1). 
Total num frames: 10491543552. Throughput: 0: 56180.8. Samples: 981951140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 15:03:33,428][57339] Updated weights for policy 0, policy_version 640358 (0.0025) [2024-04-28 15:03:36,618][57339] Updated weights for policy 0, policy_version 640368 (0.0030) [2024-04-28 15:03:37,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10491805696. Throughput: 0: 56066.8. Samples: 982114760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:37,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 15:03:39,227][57339] Updated weights for policy 0, policy_version 640378 (0.0030) [2024-04-28 15:03:42,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10492084224. Throughput: 0: 56027.9. Samples: 982449540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:42,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 15:03:42,745][57339] Updated weights for policy 0, policy_version 640388 (0.0032) [2024-04-28 15:03:45,017][57339] Updated weights for policy 0, policy_version 640398 (0.0031) [2024-04-28 15:03:47,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 10492362752. Throughput: 0: 55913.9. Samples: 982784180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:47,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 15:03:48,546][57339] Updated weights for policy 0, policy_version 640408 (0.0034) [2024-04-28 15:03:50,810][57339] Updated weights for policy 0, policy_version 640418 (0.0033) [2024-04-28 15:03:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10492657664. Throughput: 0: 55773.1. Samples: 982951220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:52,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 15:03:54,261][57339] Updated weights for policy 0, policy_version 640428 (0.0028) [2024-04-28 15:03:56,107][57319] Signal inference workers to stop experience collection... (14550 times) [2024-04-28 15:03:56,129][57339] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-04-28 15:03:56,199][57319] Signal inference workers to resume experience collection... (14550 times) [2024-04-28 15:03:56,199][57339] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-04-28 15:03:56,560][57339] Updated weights for policy 0, policy_version 640438 (0.0030) [2024-04-28 15:03:57,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10492936192. Throughput: 0: 55810.5. Samples: 983287460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:03:57,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:04:00,247][57339] Updated weights for policy 0, policy_version 640448 (0.0034) [2024-04-28 15:04:02,169][57108] Fps is (10 sec: 55706.3, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 10493214720. Throughput: 0: 55824.8. Samples: 983620160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:04:02,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 15:04:02,948][57339] Updated weights for policy 0, policy_version 640458 (0.0030) [2024-04-28 15:04:06,096][57339] Updated weights for policy 0, policy_version 640468 (0.0032) [2024-04-28 15:04:07,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 10493493248. Throughput: 0: 55792.7. Samples: 983788420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:04:07,170][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 15:04:08,787][57339] Updated weights for policy 0, policy_version 640478 (0.0027) [2024-04-28 15:04:11,888][57339] Updated weights for policy 0, policy_version 640488 (0.0038) [2024-04-28 15:04:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10493755392. Throughput: 0: 55803.2. Samples: 984126300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:04:12,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 15:04:14,574][57339] Updated weights for policy 0, policy_version 640498 (0.0031) [2024-04-28 15:04:17,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10494050304. Throughput: 0: 55861.2. Samples: 984464900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:04:17,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 15:04:17,640][57339] Updated weights for policy 0, policy_version 640508 (0.0030) [2024-04-28 15:04:20,259][57339] Updated weights for policy 0, policy_version 640518 (0.0027) [2024-04-28 15:04:22,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10494296064. Throughput: 0: 55703.1. Samples: 984621400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:04:22,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 15:04:23,491][57339] Updated weights for policy 0, policy_version 640528 (0.0028) [2024-04-28 15:04:25,996][57339] Updated weights for policy 0, policy_version 640538 (0.0027) [2024-04-28 15:04:27,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10494607360. Throughput: 0: 55732.6. Samples: 984957500. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:04:27,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 15:04:29,511][57339] Updated weights for policy 0, policy_version 640548 (0.0028) [2024-04-28 15:04:31,935][57339] Updated weights for policy 0, policy_version 640558 (0.0027) [2024-04-28 15:04:32,169][57108] Fps is (10 sec: 60619.8, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 10494902272. Throughput: 0: 55755.4. Samples: 985293180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:04:32,170][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 15:04:35,663][57339] Updated weights for policy 0, policy_version 640568 (0.0031) [2024-04-28 15:04:37,169][57108] Fps is (10 sec: 58981.7, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 10495197184. Throughput: 0: 55882.1. Samples: 985465920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:04:37,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 15:04:37,852][57339] Updated weights for policy 0, policy_version 640578 (0.0026) [2024-04-28 15:04:41,483][57339] Updated weights for policy 0, policy_version 640588 (0.0023) [2024-04-28 15:04:42,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10495442944. Throughput: 0: 55904.0. Samples: 985803140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-04-28 15:04:42,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 15:04:43,785][57339] Updated weights for policy 0, policy_version 640598 (0.0031) [2024-04-28 15:04:47,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55705.5, 300 sec: 55761.2). Total num frames: 10495705088. Throughput: 0: 55892.7. Samples: 986135340. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:04:47,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 15:04:47,231][57339] Updated weights for policy 0, policy_version 640608 (0.0027) [2024-04-28 15:04:49,739][57339] Updated weights for policy 0, policy_version 640618 (0.0026) [2024-04-28 15:04:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10495983616. Throughput: 0: 55647.0. Samples: 986292520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:04:52,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 15:04:53,179][57339] Updated weights for policy 0, policy_version 640628 (0.0028) [2024-04-28 15:04:55,763][57339] Updated weights for policy 0, policy_version 640638 (0.0027) [2024-04-28 15:04:57,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10496262144. Throughput: 0: 55578.7. Samples: 986627340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:04:57,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 15:04:59,135][57339] Updated weights for policy 0, policy_version 640648 (0.0025) [2024-04-28 15:04:59,883][57319] Signal inference workers to stop experience collection... (14600 times) [2024-04-28 15:04:59,884][57319] Signal inference workers to resume experience collection... (14600 times) [2024-04-28 15:04:59,919][57339] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-04-28 15:04:59,919][57339] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-04-28 15:05:01,521][57339] Updated weights for policy 0, policy_version 640658 (0.0030) [2024-04-28 15:05:02,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 10496557056. Throughput: 0: 55452.0. Samples: 986960240. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:02,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 15:05:05,172][57339] Updated weights for policy 0, policy_version 640668 (0.0030) [2024-04-28 15:05:07,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.9, 300 sec: 55927.8). Total num frames: 10496851968. Throughput: 0: 55835.5. Samples: 987134000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:07,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:05:07,235][57339] Updated weights for policy 0, policy_version 640678 (0.0030) [2024-04-28 15:05:10,976][57339] Updated weights for policy 0, policy_version 640688 (0.0031) [2024-04-28 15:05:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10497130496. Throughput: 0: 55807.0. Samples: 987468820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 15:05:13,121][57339] Updated weights for policy 0, policy_version 640698 (0.0035) [2024-04-28 15:05:16,900][57339] Updated weights for policy 0, policy_version 640708 (0.0026) [2024-04-28 15:05:17,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10497376256. Throughput: 0: 55798.2. Samples: 987804100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:17,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 15:05:19,088][57339] Updated weights for policy 0, policy_version 640718 (0.0029) [2024-04-28 15:05:22,169][57108] Fps is (10 sec: 49152.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10497622016. Throughput: 0: 55419.2. Samples: 987959780. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:22,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 15:05:22,715][57339] Updated weights for policy 0, policy_version 640728 (0.0028) [2024-04-28 15:05:24,987][57339] Updated weights for policy 0, policy_version 640738 (0.0036) [2024-04-28 15:05:27,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10497933312. Throughput: 0: 55322.6. Samples: 988292660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:27,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 15:05:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000640743_10497933312.pth... [2024-04-28 15:05:27,234][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000639926_10484547584.pth [2024-04-28 15:05:28,710][57339] Updated weights for policy 0, policy_version 640748 (0.0032) [2024-04-28 15:05:31,132][57339] Updated weights for policy 0, policy_version 640758 (0.0030) [2024-04-28 15:05:32,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 10498211840. Throughput: 0: 55372.6. Samples: 988627100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:32,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 15:05:34,621][57339] Updated weights for policy 0, policy_version 640768 (0.0028) [2024-04-28 15:05:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54886.5, 300 sec: 55816.7). Total num frames: 10498490368. Throughput: 0: 55741.8. Samples: 988800900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:37,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 15:05:37,177][57339] Updated weights for policy 0, policy_version 640778 (0.0034) [2024-04-28 15:05:40,412][57339] Updated weights for policy 0, policy_version 640788 (0.0028) [2024-04-28 15:05:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55872.2). 
Total num frames: 10498785280. Throughput: 0: 55639.5. Samples: 989131120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:42,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 15:05:42,889][57339] Updated weights for policy 0, policy_version 640798 (0.0033) [2024-04-28 15:05:46,183][57339] Updated weights for policy 0, policy_version 640808 (0.0028) [2024-04-28 15:05:47,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.9, 300 sec: 55872.3). Total num frames: 10499080192. Throughput: 0: 55732.6. Samples: 989468200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:47,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 15:05:48,916][57339] Updated weights for policy 0, policy_version 640818 (0.0028) [2024-04-28 15:05:51,704][57319] Signal inference workers to stop experience collection... (14650 times) [2024-04-28 15:05:51,704][57319] Signal inference workers to resume experience collection... (14650 times) [2024-04-28 15:05:51,717][57339] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-04-28 15:05:51,718][57339] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-04-28 15:05:51,939][57339] Updated weights for policy 0, policy_version 640828 (0.0026) [2024-04-28 15:05:52,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10499325952. Throughput: 0: 55520.9. Samples: 989632440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:52,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:05:54,805][57339] Updated weights for policy 0, policy_version 640838 (0.0026) [2024-04-28 15:05:57,169][57108] Fps is (10 sec: 50789.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10499588096. Throughput: 0: 55668.0. Samples: 989973880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:05:57,169][57108] Avg episode reward: [(0, '0.708')] [2024-04-28 15:05:57,915][57339] Updated weights for policy 0, policy_version 640848 (0.0036) [2024-04-28 15:06:00,523][57339] Updated weights for policy 0, policy_version 640858 (0.0029) [2024-04-28 15:06:02,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10499899392. Throughput: 0: 55702.2. Samples: 990310700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:06:02,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 15:06:03,693][57339] Updated weights for policy 0, policy_version 640868 (0.0032) [2024-04-28 15:06:06,433][57339] Updated weights for policy 0, policy_version 640878 (0.0034) [2024-04-28 15:06:07,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10500161536. Throughput: 0: 55736.0. Samples: 990467900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:06:07,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:06:09,544][57339] Updated weights for policy 0, policy_version 640888 (0.0029) [2024-04-28 15:06:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10500456448. Throughput: 0: 55766.2. Samples: 990802140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:06:12,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 15:06:12,188][57339] Updated weights for policy 0, policy_version 640898 (0.0037) [2024-04-28 15:06:15,444][57339] Updated weights for policy 0, policy_version 640908 (0.0026) [2024-04-28 15:06:17,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56251.9, 300 sec: 55872.3). Total num frames: 10500751360. Throughput: 0: 55684.0. Samples: 991132880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:06:17,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 15:06:18,212][57339] Updated weights for policy 0, policy_version 640918 (0.0032) [2024-04-28 15:06:21,455][57339] Updated weights for policy 0, policy_version 640928 (0.0027) [2024-04-28 15:06:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56797.8, 300 sec: 55872.2). Total num frames: 10501029888. Throughput: 0: 55749.6. Samples: 991309640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 15:06:22,170][57108] Avg episode reward: [(0, '0.454')] [2024-04-28 15:06:24,027][57339] Updated weights for policy 0, policy_version 640938 (0.0032) [2024-04-28 15:06:27,169][57108] Fps is (10 sec: 52427.6, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10501275648. Throughput: 0: 55872.2. Samples: 991645380. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:06:27,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 15:06:27,307][57339] Updated weights for policy 0, policy_version 640948 (0.0038) [2024-04-28 15:06:29,950][57339] Updated weights for policy 0, policy_version 640958 (0.0032) [2024-04-28 15:06:32,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10501554176. Throughput: 0: 55818.0. Samples: 991980020. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:06:32,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 15:06:33,068][57339] Updated weights for policy 0, policy_version 640968 (0.0035) [2024-04-28 15:06:35,986][57339] Updated weights for policy 0, policy_version 640978 (0.0032) [2024-04-28 15:06:37,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10501832704. Throughput: 0: 55723.5. Samples: 992140000. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:06:37,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 15:06:39,017][57339] Updated weights for policy 0, policy_version 640988 (0.0030) [2024-04-28 15:06:41,685][57339] Updated weights for policy 0, policy_version 640998 (0.0029) [2024-04-28 15:06:42,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10502111232. Throughput: 0: 55623.1. Samples: 992476920. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:06:42,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 15:06:44,115][57319] Signal inference workers to stop experience collection... (14700 times) [2024-04-28 15:06:44,116][57319] Signal inference workers to resume experience collection... (14700 times) [2024-04-28 15:06:44,128][57339] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-04-28 15:06:44,128][57339] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-04-28 15:06:44,844][57339] Updated weights for policy 0, policy_version 641008 (0.0030) [2024-04-28 15:06:47,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10502406144. Throughput: 0: 55656.5. Samples: 992815240. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:06:47,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:06:47,580][57339] Updated weights for policy 0, policy_version 641018 (0.0033) [2024-04-28 15:06:50,694][57339] Updated weights for policy 0, policy_version 641028 (0.0027) [2024-04-28 15:06:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10502684672. Throughput: 0: 55990.7. Samples: 992987480. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:06:52,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 15:06:53,436][57339] Updated weights for policy 0, policy_version 641038 (0.0026) [2024-04-28 15:06:56,496][57339] Updated weights for policy 0, policy_version 641048 (0.0030) [2024-04-28 15:06:57,169][57108] Fps is (10 sec: 58981.9, 60 sec: 56797.8, 300 sec: 55872.2). Total num frames: 10502995968. Throughput: 0: 56076.3. Samples: 993325580. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:06:57,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 15:06:59,408][57339] Updated weights for policy 0, policy_version 641058 (0.0024) [2024-04-28 15:07:02,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10503241728. Throughput: 0: 56165.1. Samples: 993660320. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:02,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:07:02,307][57339] Updated weights for policy 0, policy_version 641068 (0.0034) [2024-04-28 15:07:05,245][57339] Updated weights for policy 0, policy_version 641078 (0.0029) [2024-04-28 15:07:07,169][57108] Fps is (10 sec: 50791.2, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10503503872. Throughput: 0: 55863.7. Samples: 993823500. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:07,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 15:07:08,129][57339] Updated weights for policy 0, policy_version 641088 (0.0024) [2024-04-28 15:07:11,079][57339] Updated weights for policy 0, policy_version 641098 (0.0029) [2024-04-28 15:07:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10503798784. Throughput: 0: 55937.8. Samples: 994162580. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:12,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 15:07:13,848][57339] Updated weights for policy 0, policy_version 641108 (0.0028) [2024-04-28 15:07:16,937][57339] Updated weights for policy 0, policy_version 641118 (0.0030) [2024-04-28 15:07:17,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10504093696. Throughput: 0: 55954.7. Samples: 994497980. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:17,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 15:07:19,748][57339] Updated weights for policy 0, policy_version 641128 (0.0026) [2024-04-28 15:07:22,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55432.7, 300 sec: 55872.3). Total num frames: 10504355840. Throughput: 0: 56056.5. Samples: 994662540. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:22,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 15:07:22,846][57339] Updated weights for policy 0, policy_version 641138 (0.0033) [2024-04-28 15:07:25,715][57339] Updated weights for policy 0, policy_version 641148 (0.0027) [2024-04-28 15:07:27,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 10504667136. Throughput: 0: 56107.1. Samples: 995001740. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:27,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 15:07:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000641154_10504667136.pth... [2024-04-28 15:07:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000640336_10491265024.pth [2024-04-28 15:07:28,771][57339] Updated weights for policy 0, policy_version 641158 (0.0026) [2024-04-28 15:07:31,370][57339] Updated weights for policy 0, policy_version 641168 (0.0034) [2024-04-28 15:07:32,169][57108] Fps is (10 sec: 58981.9, 60 sec: 56524.8, 300 sec: 55816.7). 
Total num frames: 10504945664. Throughput: 0: 55989.7. Samples: 995334780. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:32,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:07:34,564][57339] Updated weights for policy 0, policy_version 641178 (0.0026) [2024-04-28 15:07:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10505207808. Throughput: 0: 56020.1. Samples: 995508380. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:37,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 15:07:37,207][57339] Updated weights for policy 0, policy_version 641188 (0.0030) [2024-04-28 15:07:40,289][57339] Updated weights for policy 0, policy_version 641198 (0.0030) [2024-04-28 15:07:42,169][57108] Fps is (10 sec: 54067.5, 60 sec: 56251.8, 300 sec: 55872.3). Total num frames: 10505486336. Throughput: 0: 55953.5. Samples: 995843480. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:42,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 15:07:42,966][57339] Updated weights for policy 0, policy_version 641208 (0.0027) [2024-04-28 15:07:46,063][57339] Updated weights for policy 0, policy_version 641218 (0.0026) [2024-04-28 15:07:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10505764864. Throughput: 0: 56095.7. Samples: 996184620. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:47,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 15:07:48,870][57339] Updated weights for policy 0, policy_version 641228 (0.0029) [2024-04-28 15:07:51,978][57339] Updated weights for policy 0, policy_version 641238 (0.0034) [2024-04-28 15:07:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10506043392. Throughput: 0: 55919.5. Samples: 996339880. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:52,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 15:07:54,817][57339] Updated weights for policy 0, policy_version 641248 (0.0033) [2024-04-28 15:07:57,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 10506305536. Throughput: 0: 55801.9. Samples: 996673660. Policy #0 lag: (min: 1.0, avg: 12.6, max: 25.0) [2024-04-28 15:07:57,169][57108] Avg episode reward: [(0, '0.701')] [2024-04-28 15:07:57,877][57339] Updated weights for policy 0, policy_version 641258 (0.0025) [2024-04-28 15:08:00,559][57319] Signal inference workers to stop experience collection... (14750 times) [2024-04-28 15:08:00,613][57339] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-04-28 15:08:00,613][57319] Signal inference workers to resume experience collection... (14750 times) [2024-04-28 15:08:00,617][57339] Updated weights for policy 0, policy_version 641268 (0.0027) [2024-04-28 15:08:00,628][57339] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-04-28 15:08:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10506600448. Throughput: 0: 55784.5. Samples: 997008280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:02,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 15:08:03,747][57339] Updated weights for policy 0, policy_version 641278 (0.0026) [2024-04-28 15:08:06,402][57339] Updated weights for policy 0, policy_version 641288 (0.0029) [2024-04-28 15:08:07,169][57108] Fps is (10 sec: 60621.1, 60 sec: 56797.8, 300 sec: 55927.7). Total num frames: 10506911744. Throughput: 0: 56120.9. Samples: 997187980. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:07,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 15:08:09,722][57339] Updated weights for policy 0, policy_version 641298 (0.0032) [2024-04-28 15:08:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10507173888. Throughput: 0: 56027.6. Samples: 997522980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:12,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 15:08:12,270][57339] Updated weights for policy 0, policy_version 641308 (0.0030) [2024-04-28 15:08:15,628][57339] Updated weights for policy 0, policy_version 641318 (0.0028) [2024-04-28 15:08:17,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10507436032. Throughput: 0: 56047.5. Samples: 997856920. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:17,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 15:08:18,192][57339] Updated weights for policy 0, policy_version 641328 (0.0026) [2024-04-28 15:08:21,334][57339] Updated weights for policy 0, policy_version 641338 (0.0029) [2024-04-28 15:08:22,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10507698176. Throughput: 0: 55841.8. Samples: 998021260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:22,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 15:08:23,943][57339] Updated weights for policy 0, policy_version 641348 (0.0027) [2024-04-28 15:08:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10507993088. Throughput: 0: 55826.7. Samples: 998355680. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:27,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 15:08:27,232][57339] Updated weights for policy 0, policy_version 641358 (0.0025) [2024-04-28 15:08:29,817][57339] Updated weights for policy 0, policy_version 641368 (0.0036) [2024-04-28 15:08:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10508271616. Throughput: 0: 55714.2. Samples: 998691760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:32,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 15:08:33,161][57339] Updated weights for policy 0, policy_version 641378 (0.0037) [2024-04-28 15:08:35,742][57339] Updated weights for policy 0, policy_version 641388 (0.0024) [2024-04-28 15:08:37,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10508566528. Throughput: 0: 56020.0. Samples: 998860780. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:37,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 15:08:39,042][57339] Updated weights for policy 0, policy_version 641398 (0.0028) [2024-04-28 15:08:41,480][57339] Updated weights for policy 0, policy_version 641408 (0.0025) [2024-04-28 15:08:42,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 10508861440. Throughput: 0: 56104.0. Samples: 999198340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:42,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 15:08:44,769][57339] Updated weights for policy 0, policy_version 641418 (0.0027) [2024-04-28 15:08:47,169][57108] Fps is (10 sec: 57342.2, 60 sec: 56251.5, 300 sec: 55872.2). Total num frames: 10509139968. Throughput: 0: 56143.3. Samples: 999534740. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:47,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 15:08:47,300][57339] Updated weights for policy 0, policy_version 641428 (0.0028) [2024-04-28 15:08:50,444][57339] Updated weights for policy 0, policy_version 641438 (0.0029) [2024-04-28 15:08:52,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10509385728. Throughput: 0: 56047.4. Samples: 999710120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:52,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 15:08:53,023][57339] Updated weights for policy 0, policy_version 641448 (0.0031) [2024-04-28 15:08:53,928][57319] Signal inference workers to stop experience collection... (14800 times) [2024-04-28 15:08:53,972][57339] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-04-28 15:08:53,982][57319] Signal inference workers to resume experience collection... (14800 times) [2024-04-28 15:08:53,988][57339] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-04-28 15:08:56,421][57339] Updated weights for policy 0, policy_version 641458 (0.0026) [2024-04-28 15:08:57,169][57108] Fps is (10 sec: 52429.7, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10509664256. Throughput: 0: 56045.2. Samples: 1000045020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:08:57,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 15:08:58,891][57339] Updated weights for policy 0, policy_version 641468 (0.0027) [2024-04-28 15:09:02,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10509959168. Throughput: 0: 56016.9. Samples: 1000377680. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:09:02,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 15:09:02,394][57339] Updated weights for policy 0, policy_version 641478 (0.0026) [2024-04-28 15:09:04,582][57339] Updated weights for policy 0, policy_version 641488 (0.0025) [2024-04-28 15:09:07,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 10510237696. Throughput: 0: 55922.0. Samples: 1000537760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:09:07,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 15:09:08,059][57339] Updated weights for policy 0, policy_version 641498 (0.0027) [2024-04-28 15:09:10,363][57339] Updated weights for policy 0, policy_version 641508 (0.0027) [2024-04-28 15:09:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10510516224. Throughput: 0: 56034.6. Samples: 1000877240. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:09:12,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 15:09:13,810][57339] Updated weights for policy 0, policy_version 641518 (0.0027) [2024-04-28 15:09:16,242][57339] Updated weights for policy 0, policy_version 641528 (0.0034) [2024-04-28 15:09:17,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 10510811136. Throughput: 0: 56061.7. Samples: 1001214540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:09:17,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 15:09:19,660][57339] Updated weights for policy 0, policy_version 641538 (0.0030) [2024-04-28 15:09:22,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56797.8, 300 sec: 55927.7). Total num frames: 10511106048. Throughput: 0: 56145.6. Samples: 1001387340. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:09:22,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:09:22,247][57339] Updated weights for policy 0, policy_version 641548 (0.0027) [2024-04-28 15:09:25,530][57339] Updated weights for policy 0, policy_version 641558 (0.0027) [2024-04-28 15:09:27,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10511351808. Throughput: 0: 56042.7. Samples: 1001720260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:09:27,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 15:09:27,200][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000641563_10511368192.pth... [2024-04-28 15:09:27,259][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000640743_10497933312.pth [2024-04-28 15:09:28,045][57339] Updated weights for policy 0, policy_version 641568 (0.0026) [2024-04-28 15:09:31,326][57339] Updated weights for policy 0, policy_version 641578 (0.0024) [2024-04-28 15:09:32,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10511630336. Throughput: 0: 56020.6. Samples: 1002055660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-04-28 15:09:32,170][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 15:09:33,943][57339] Updated weights for policy 0, policy_version 641588 (0.0027) [2024-04-28 15:09:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10511925248. Throughput: 0: 55753.4. Samples: 1002219020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:09:37,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 15:09:37,178][57339] Updated weights for policy 0, policy_version 641598 (0.0029) [2024-04-28 15:09:39,995][57339] Updated weights for policy 0, policy_version 641608 (0.0028) [2024-04-28 15:09:42,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 55927.8). 
Total num frames: 10512203776. Throughput: 0: 55699.7. Samples: 1002551500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:09:42,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 15:09:43,128][57339] Updated weights for policy 0, policy_version 641618 (0.0026) [2024-04-28 15:09:44,933][57319] Signal inference workers to stop experience collection... (14850 times) [2024-04-28 15:09:44,933][57319] Signal inference workers to resume experience collection... (14850 times) [2024-04-28 15:09:44,945][57339] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-04-28 15:09:44,945][57339] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-04-28 15:09:45,810][57339] Updated weights for policy 0, policy_version 641628 (0.0028) [2024-04-28 15:09:47,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.7, 300 sec: 55927.7). Total num frames: 10512482304. Throughput: 0: 55724.4. Samples: 1002885280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:09:47,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 15:09:48,934][57339] Updated weights for policy 0, policy_version 641638 (0.0028) [2024-04-28 15:09:51,720][57339] Updated weights for policy 0, policy_version 641648 (0.0023) [2024-04-28 15:09:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10512777216. Throughput: 0: 56000.6. Samples: 1003057780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:09:52,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 15:09:54,711][57339] Updated weights for policy 0, policy_version 641658 (0.0026) [2024-04-28 15:09:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10513039360. Throughput: 0: 55910.1. Samples: 1003393200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:09:57,170][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 15:09:57,699][57339] Updated weights for policy 0, policy_version 641668 (0.0031) [2024-04-28 15:10:00,779][57339] Updated weights for policy 0, policy_version 641678 (0.0033) [2024-04-28 15:10:02,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10513285120. Throughput: 0: 55790.3. Samples: 1003725100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:02,169][57108] Avg episode reward: [(0, '0.514')] [2024-04-28 15:10:03,703][57339] Updated weights for policy 0, policy_version 641688 (0.0025) [2024-04-28 15:10:07,004][57339] Updated weights for policy 0, policy_version 641698 (0.0028) [2024-04-28 15:10:07,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10513580032. Throughput: 0: 55470.7. Samples: 1003883520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:07,178][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 15:10:09,460][57339] Updated weights for policy 0, policy_version 641708 (0.0035) [2024-04-28 15:10:12,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10513874944. Throughput: 0: 55544.8. Samples: 1004219780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:12,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 15:10:12,771][57339] Updated weights for policy 0, policy_version 641718 (0.0027) [2024-04-28 15:10:15,381][57339] Updated weights for policy 0, policy_version 641728 (0.0025) [2024-04-28 15:10:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 10514137088. Throughput: 0: 55564.2. Samples: 1004556040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:17,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 15:10:18,542][57339] Updated weights for policy 0, policy_version 641738 (0.0025) [2024-04-28 15:10:21,387][57339] Updated weights for policy 0, policy_version 641748 (0.0029) [2024-04-28 15:10:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 10514415616. Throughput: 0: 55569.3. Samples: 1004719640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:22,170][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 15:10:24,910][57339] Updated weights for policy 0, policy_version 641758 (0.0029) [2024-04-28 15:10:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10514710528. Throughput: 0: 55664.8. Samples: 1005056420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:27,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 15:10:27,377][57339] Updated weights for policy 0, policy_version 641768 (0.0027) [2024-04-28 15:10:30,655][57339] Updated weights for policy 0, policy_version 641778 (0.0029) [2024-04-28 15:10:32,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55927.7). Total num frames: 10514989056. Throughput: 0: 55673.4. Samples: 1005390580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:32,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 15:10:33,262][57339] Updated weights for policy 0, policy_version 641788 (0.0035) [2024-04-28 15:10:36,669][57339] Updated weights for policy 0, policy_version 641798 (0.0029) [2024-04-28 15:10:37,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 10515234816. Throughput: 0: 55333.5. Samples: 1005547780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:37,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 15:10:39,068][57339] Updated weights for policy 0, policy_version 641808 (0.0029) [2024-04-28 15:10:42,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55432.3, 300 sec: 55761.1). Total num frames: 10515529728. Throughput: 0: 55377.7. Samples: 1005885200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:42,170][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 15:10:42,484][57339] Updated weights for policy 0, policy_version 641818 (0.0029) [2024-04-28 15:10:44,932][57339] Updated weights for policy 0, policy_version 641828 (0.0029) [2024-04-28 15:10:47,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10515808256. Throughput: 0: 55416.9. Samples: 1006218860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:47,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 15:10:48,330][57339] Updated weights for policy 0, policy_version 641838 (0.0027) [2024-04-28 15:10:50,822][57339] Updated weights for policy 0, policy_version 641848 (0.0029) [2024-04-28 15:10:52,169][57108] Fps is (10 sec: 55706.9, 60 sec: 55159.5, 300 sec: 55927.8). Total num frames: 10516086784. Throughput: 0: 55545.4. Samples: 1006383060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:52,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 15:10:54,212][57339] Updated weights for policy 0, policy_version 641858 (0.0031) [2024-04-28 15:10:56,891][57339] Updated weights for policy 0, policy_version 641868 (0.0027) [2024-04-28 15:10:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10516365312. Throughput: 0: 55477.3. Samples: 1006716260. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:10:57,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 15:10:58,323][57319] Signal inference workers to stop experience collection... (14900 times) [2024-04-28 15:10:58,323][57319] Signal inference workers to resume experience collection... (14900 times) [2024-04-28 15:10:58,345][57339] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-04-28 15:10:58,345][57339] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-04-28 15:11:00,080][57339] Updated weights for policy 0, policy_version 641878 (0.0037) [2024-04-28 15:11:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10516643840. Throughput: 0: 55416.0. Samples: 1007049760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:11:02,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 15:11:02,778][57339] Updated weights for policy 0, policy_version 641888 (0.0032) [2024-04-28 15:11:05,922][57339] Updated weights for policy 0, policy_version 641898 (0.0028) [2024-04-28 15:11:07,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10516922368. Throughput: 0: 55522.3. Samples: 1007218140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:11:07,169][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 15:11:08,739][57339] Updated weights for policy 0, policy_version 641908 (0.0032) [2024-04-28 15:11:11,867][57339] Updated weights for policy 0, policy_version 641918 (0.0034) [2024-04-28 15:11:12,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10517184512. Throughput: 0: 55415.7. Samples: 1007550120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-04-28 15:11:12,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 15:11:14,887][57339] Updated weights for policy 0, policy_version 641928 (0.0033) [2024-04-28 15:11:17,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10517463040. Throughput: 0: 55283.0. Samples: 1007878320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:11:17,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 15:11:17,837][57339] Updated weights for policy 0, policy_version 641938 (0.0031) [2024-04-28 15:11:20,802][57339] Updated weights for policy 0, policy_version 641948 (0.0033) [2024-04-28 15:11:22,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55705.7, 300 sec: 55872.3). Total num frames: 10517757952. Throughput: 0: 55407.5. Samples: 1008041120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:11:22,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 15:11:23,544][57339] Updated weights for policy 0, policy_version 641958 (0.0032) [2024-04-28 15:11:26,675][57339] Updated weights for policy 0, policy_version 641968 (0.0030) [2024-04-28 15:11:27,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 10518020096. Throughput: 0: 55420.7. Samples: 1008379120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:11:27,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:11:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000641969_10518020096.pth... [2024-04-28 15:11:27,232][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000641154_10504667136.pth [2024-04-28 15:11:29,442][57339] Updated weights for policy 0, policy_version 641978 (0.0028) [2024-04-28 15:11:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 10518298624. Throughput: 0: 55455.7. Samples: 1008714360. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:11:32,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 15:11:32,402][57339] Updated weights for policy 0, policy_version 641988 (0.0028) [2024-04-28 15:11:35,341][57339] Updated weights for policy 0, policy_version 641998 (0.0023) [2024-04-28 15:11:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10518577152. Throughput: 0: 55582.7. Samples: 1008884280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:11:37,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 15:11:38,609][57339] Updated weights for policy 0, policy_version 642008 (0.0034) [2024-04-28 15:11:41,277][57339] Updated weights for policy 0, policy_version 642018 (0.0031) [2024-04-28 15:11:42,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.7, 300 sec: 55705.6). Total num frames: 10518839296. Throughput: 0: 55490.3. Samples: 1009213320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:11:42,169][57108] Avg episode reward: [(0, '0.722')] [2024-04-28 15:11:44,321][57339] Updated weights for policy 0, policy_version 642028 (0.0026) [2024-04-28 15:11:46,945][57339] Updated weights for policy 0, policy_version 642038 (0.0030) [2024-04-28 15:11:47,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10519150592. Throughput: 0: 55391.0. Samples: 1009542360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:11:47,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 15:11:50,210][57339] Updated weights for policy 0, policy_version 642048 (0.0023) [2024-04-28 15:11:52,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10519429120. Throughput: 0: 55390.1. Samples: 1009710700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:11:52,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 15:11:52,928][57339] Updated weights for policy 0, policy_version 642058 (0.0026) [2024-04-28 15:11:56,060][57339] Updated weights for policy 0, policy_version 642068 (0.0029) [2024-04-28 15:11:57,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10519691264. Throughput: 0: 55479.4. Samples: 1010046700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:11:57,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 15:11:58,670][57339] Updated weights for policy 0, policy_version 642078 (0.0023) [2024-04-28 15:12:02,057][57339] Updated weights for policy 0, policy_version 642088 (0.0027) [2024-04-28 15:12:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10519969792. Throughput: 0: 55800.1. Samples: 1010389320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:02,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 15:12:04,597][57339] Updated weights for policy 0, policy_version 642098 (0.0029) [2024-04-28 15:12:07,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55705.4, 300 sec: 55816.7). Total num frames: 10520264704. Throughput: 0: 55800.2. Samples: 1010552140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:07,170][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 15:12:07,891][57339] Updated weights for policy 0, policy_version 642108 (0.0025) [2024-04-28 15:12:10,470][57339] Updated weights for policy 0, policy_version 642118 (0.0030) [2024-04-28 15:12:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10520543232. Throughput: 0: 55776.5. Samples: 1010889060. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:12,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 15:12:12,394][57319] Signal inference workers to stop experience collection... (14950 times) [2024-04-28 15:12:12,395][57319] Signal inference workers to resume experience collection... (14950 times) [2024-04-28 15:12:12,407][57339] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-04-28 15:12:12,408][57339] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-04-28 15:12:13,547][57339] Updated weights for policy 0, policy_version 642128 (0.0025) [2024-04-28 15:12:16,135][57339] Updated weights for policy 0, policy_version 642138 (0.0024) [2024-04-28 15:12:17,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10520805376. Throughput: 0: 55939.9. Samples: 1011231660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:17,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 15:12:19,403][57339] Updated weights for policy 0, policy_version 642148 (0.0039) [2024-04-28 15:12:22,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10521100288. Throughput: 0: 55951.1. Samples: 1011402080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:22,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 15:12:22,248][57339] Updated weights for policy 0, policy_version 642158 (0.0028) [2024-04-28 15:12:25,180][57339] Updated weights for policy 0, policy_version 642168 (0.0033) [2024-04-28 15:12:27,169][57108] Fps is (10 sec: 58982.1, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10521395200. Throughput: 0: 56105.6. Samples: 1011738080. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:27,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 15:12:28,009][57339] Updated weights for policy 0, policy_version 642178 (0.0032) [2024-04-28 15:12:31,237][57339] Updated weights for policy 0, policy_version 642188 (0.0031) [2024-04-28 15:12:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10521640960. Throughput: 0: 56221.0. Samples: 1012072300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:32,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 15:12:33,947][57339] Updated weights for policy 0, policy_version 642198 (0.0030) [2024-04-28 15:12:36,967][57339] Updated weights for policy 0, policy_version 642208 (0.0032) [2024-04-28 15:12:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10521935872. Throughput: 0: 56225.4. Samples: 1012240840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:37,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 15:12:39,790][57339] Updated weights for policy 0, policy_version 642218 (0.0027) [2024-04-28 15:12:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10522214400. Throughput: 0: 56158.3. Samples: 1012573820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:42,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 15:12:42,862][57339] Updated weights for policy 0, policy_version 642228 (0.0035) [2024-04-28 15:12:45,648][57339] Updated weights for policy 0, policy_version 642238 (0.0029) [2024-04-28 15:12:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10522509312. Throughput: 0: 55948.9. Samples: 1012907020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:12:47,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 15:12:48,874][57339] Updated weights for policy 0, policy_version 642248 (0.0028) [2024-04-28 15:12:51,442][57339] Updated weights for policy 0, policy_version 642258 (0.0028) [2024-04-28 15:12:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10522771456. Throughput: 0: 56256.7. Samples: 1013083680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:12:52,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 15:12:54,750][57339] Updated weights for policy 0, policy_version 642268 (0.0027) [2024-04-28 15:12:57,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10523049984. Throughput: 0: 56233.7. Samples: 1013419580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:12:57,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 15:12:57,527][57339] Updated weights for policy 0, policy_version 642278 (0.0029) [2024-04-28 15:13:00,554][57339] Updated weights for policy 0, policy_version 642288 (0.0030) [2024-04-28 15:13:01,829][57319] Signal inference workers to stop experience collection... (15000 times) [2024-04-28 15:13:01,869][57339] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-04-28 15:13:01,920][57319] Signal inference workers to resume experience collection... (15000 times) [2024-04-28 15:13:01,920][57339] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-04-28 15:13:02,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10523344896. Throughput: 0: 55890.6. Samples: 1013746740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:02,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 15:13:03,416][57339] Updated weights for policy 0, policy_version 642298 (0.0026) [2024-04-28 15:13:06,584][57339] Updated weights for policy 0, policy_version 642308 (0.0027) [2024-04-28 15:13:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 10523590656. Throughput: 0: 55752.8. Samples: 1013910960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:07,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 15:13:09,364][57339] Updated weights for policy 0, policy_version 642318 (0.0032) [2024-04-28 15:13:12,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10523885568. Throughput: 0: 55670.8. Samples: 1014243260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:12,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 15:13:12,440][57339] Updated weights for policy 0, policy_version 642328 (0.0030) [2024-04-28 15:13:15,235][57339] Updated weights for policy 0, policy_version 642338 (0.0027) [2024-04-28 15:13:17,169][57108] Fps is (10 sec: 58982.1, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10524180480. Throughput: 0: 55670.6. Samples: 1014577480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:17,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:13:18,205][57339] Updated weights for policy 0, policy_version 642348 (0.0028) [2024-04-28 15:13:21,067][57339] Updated weights for policy 0, policy_version 642358 (0.0029) [2024-04-28 15:13:22,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10524459008. Throughput: 0: 55689.4. Samples: 1014746860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:22,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 15:13:24,278][57339] Updated weights for policy 0, policy_version 642368 (0.0030) [2024-04-28 15:13:26,843][57339] Updated weights for policy 0, policy_version 642378 (0.0028) [2024-04-28 15:13:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10524737536. Throughput: 0: 55743.4. Samples: 1015082280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:27,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 15:13:27,275][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000642380_10524753920.pth... [2024-04-28 15:13:27,330][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000641563_10511368192.pth [2024-04-28 15:13:30,190][57339] Updated weights for policy 0, policy_version 642388 (0.0029) [2024-04-28 15:13:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10524999680. Throughput: 0: 55739.6. Samples: 1015415300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:32,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 15:13:32,741][57339] Updated weights for policy 0, policy_version 642398 (0.0030) [2024-04-28 15:13:36,104][57339] Updated weights for policy 0, policy_version 642408 (0.0025) [2024-04-28 15:13:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10525294592. Throughput: 0: 55478.7. Samples: 1015580220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:37,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 15:13:38,603][57339] Updated weights for policy 0, policy_version 642418 (0.0030) [2024-04-28 15:13:42,075][57339] Updated weights for policy 0, policy_version 642428 (0.0025) [2024-04-28 15:13:42,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55594.6). 
Total num frames: 10525540352. Throughput: 0: 55477.0. Samples: 1015916040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:42,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 15:13:44,420][57339] Updated weights for policy 0, policy_version 642438 (0.0028) [2024-04-28 15:13:47,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10525835264. Throughput: 0: 55669.7. Samples: 1016251880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:47,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 15:13:47,871][57339] Updated weights for policy 0, policy_version 642448 (0.0028) [2024-04-28 15:13:50,227][57339] Updated weights for policy 0, policy_version 642458 (0.0030) [2024-04-28 15:13:52,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10526130176. Throughput: 0: 55875.9. Samples: 1016425380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:52,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 15:13:53,697][57339] Updated weights for policy 0, policy_version 642468 (0.0028) [2024-04-28 15:13:56,197][57339] Updated weights for policy 0, policy_version 642478 (0.0025) [2024-04-28 15:13:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10526408704. Throughput: 0: 55888.8. Samples: 1016758260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:13:57,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 15:13:59,332][57339] Updated weights for policy 0, policy_version 642488 (0.0027) [2024-04-28 15:14:02,098][57339] Updated weights for policy 0, policy_version 642498 (0.0029) [2024-04-28 15:14:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10526687232. Throughput: 0: 55857.8. Samples: 1017091080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:14:02,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 15:14:05,230][57319] Signal inference workers to stop experience collection... (15050 times) [2024-04-28 15:14:05,230][57319] Signal inference workers to resume experience collection... (15050 times) [2024-04-28 15:14:05,242][57339] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-04-28 15:14:05,242][57339] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-04-28 15:14:05,341][57339] Updated weights for policy 0, policy_version 642508 (0.0027) [2024-04-28 15:14:07,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10526949376. Throughput: 0: 55767.2. Samples: 1017256380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:14:07,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 15:14:07,894][57339] Updated weights for policy 0, policy_version 642518 (0.0033) [2024-04-28 15:14:11,217][57339] Updated weights for policy 0, policy_version 642528 (0.0033) [2024-04-28 15:14:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10527244288. Throughput: 0: 55749.8. Samples: 1017591020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:14:12,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 15:14:13,638][57339] Updated weights for policy 0, policy_version 642538 (0.0031) [2024-04-28 15:14:17,038][57339] Updated weights for policy 0, policy_version 642548 (0.0032) [2024-04-28 15:14:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10527506432. Throughput: 0: 55782.2. Samples: 1017925500. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:14:17,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 15:14:19,589][57339] Updated weights for policy 0, policy_version 642558 (0.0028) [2024-04-28 15:14:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10527784960. Throughput: 0: 55742.6. Samples: 1018088640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-04-28 15:14:22,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:14:22,943][57339] Updated weights for policy 0, policy_version 642568 (0.0033) [2024-04-28 15:14:25,668][57339] Updated weights for policy 0, policy_version 642578 (0.0028) [2024-04-28 15:14:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10528079872. Throughput: 0: 55862.6. Samples: 1018429860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:14:27,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 15:14:28,613][57339] Updated weights for policy 0, policy_version 642588 (0.0030) [2024-04-28 15:14:31,462][57339] Updated weights for policy 0, policy_version 642598 (0.0023) [2024-04-28 15:14:32,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10528358400. Throughput: 0: 55772.6. Samples: 1018761640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:14:32,169][57108] Avg episode reward: [(0, '0.505')] [2024-04-28 15:14:34,498][57339] Updated weights for policy 0, policy_version 642608 (0.0026) [2024-04-28 15:14:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10528636928. Throughput: 0: 55732.2. Samples: 1018933320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:14:37,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 15:14:37,272][57339] Updated weights for policy 0, policy_version 642618 (0.0029) [2024-04-28 15:14:40,284][57339] Updated weights for policy 0, policy_version 642628 (0.0036) [2024-04-28 15:14:42,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10528915456. Throughput: 0: 55844.4. Samples: 1019271260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:14:42,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 15:14:43,285][57339] Updated weights for policy 0, policy_version 642638 (0.0027) [2024-04-28 15:14:46,145][57339] Updated weights for policy 0, policy_version 642648 (0.0026) [2024-04-28 15:14:47,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 10529210368. Throughput: 0: 55774.9. Samples: 1019600940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:14:47,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 15:14:49,059][57339] Updated weights for policy 0, policy_version 642658 (0.0027) [2024-04-28 15:14:51,929][57339] Updated weights for policy 0, policy_version 642668 (0.0028) [2024-04-28 15:14:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10529472512. Throughput: 0: 55859.0. Samples: 1019770040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:14:52,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:14:54,839][57339] Updated weights for policy 0, policy_version 642678 (0.0034) [2024-04-28 15:14:57,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10529751040. Throughput: 0: 55804.4. Samples: 1020102220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:14:57,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 15:14:57,764][57339] Updated weights for policy 0, policy_version 642688 (0.0032) [2024-04-28 15:15:00,762][57339] Updated weights for policy 0, policy_version 642698 (0.0029) [2024-04-28 15:15:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10530029568. Throughput: 0: 55822.6. Samples: 1020437520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:02,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 15:15:03,958][57339] Updated weights for policy 0, policy_version 642708 (0.0032) [2024-04-28 15:15:06,688][57339] Updated weights for policy 0, policy_version 642718 (0.0032) [2024-04-28 15:15:07,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10530308096. Throughput: 0: 55854.8. Samples: 1020602100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:07,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 15:15:09,705][57339] Updated weights for policy 0, policy_version 642728 (0.0025) [2024-04-28 15:15:12,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10530570240. Throughput: 0: 55758.8. Samples: 1020939000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:12,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 15:15:12,536][57339] Updated weights for policy 0, policy_version 642738 (0.0031) [2024-04-28 15:15:15,443][57339] Updated weights for policy 0, policy_version 642748 (0.0032) [2024-04-28 15:15:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10530865152. Throughput: 0: 55847.1. Samples: 1021274760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:17,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 15:15:18,388][57339] Updated weights for policy 0, policy_version 642758 (0.0031) [2024-04-28 15:15:19,934][57319] Signal inference workers to stop experience collection... (15100 times) [2024-04-28 15:15:19,972][57339] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-04-28 15:15:19,983][57319] Signal inference workers to resume experience collection... (15100 times) [2024-04-28 15:15:19,989][57339] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-04-28 15:15:21,394][57339] Updated weights for policy 0, policy_version 642768 (0.0028) [2024-04-28 15:15:22,169][57108] Fps is (10 sec: 58981.6, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10531160064. Throughput: 0: 55741.6. Samples: 1021441700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:22,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 15:15:24,281][57339] Updated weights for policy 0, policy_version 642778 (0.0038) [2024-04-28 15:15:27,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10531422208. Throughput: 0: 55645.4. Samples: 1021775300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:27,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 15:15:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000642787_10531422208.pth... [2024-04-28 15:15:27,236][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000641969_10518020096.pth [2024-04-28 15:15:27,373][57339] Updated weights for policy 0, policy_version 642788 (0.0030) [2024-04-28 15:15:30,287][57339] Updated weights for policy 0, policy_version 642798 (0.0031) [2024-04-28 15:15:32,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10531684352. Throughput: 0: 55590.1. 
Samples: 1022102500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:32,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 15:15:33,258][57339] Updated weights for policy 0, policy_version 642808 (0.0035) [2024-04-28 15:15:36,187][57339] Updated weights for policy 0, policy_version 642818 (0.0030) [2024-04-28 15:15:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10531962880. Throughput: 0: 55544.4. Samples: 1022269540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:37,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 15:15:38,946][57339] Updated weights for policy 0, policy_version 642828 (0.0029) [2024-04-28 15:15:42,091][57339] Updated weights for policy 0, policy_version 642838 (0.0030) [2024-04-28 15:15:42,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10532257792. Throughput: 0: 55673.4. Samples: 1022607520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:42,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 15:15:44,844][57339] Updated weights for policy 0, policy_version 642848 (0.0033) [2024-04-28 15:15:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.3, 300 sec: 55761.1). Total num frames: 10532536320. Throughput: 0: 55631.1. Samples: 1022940920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:47,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 15:15:47,893][57339] Updated weights for policy 0, policy_version 642858 (0.0032) [2024-04-28 15:15:50,805][57339] Updated weights for policy 0, policy_version 642868 (0.0030) [2024-04-28 15:15:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10532814848. Throughput: 0: 55726.0. Samples: 1023109780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:52,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 15:15:53,807][57339] Updated weights for policy 0, policy_version 642878 (0.0029) [2024-04-28 15:15:56,551][57339] Updated weights for policy 0, policy_version 642888 (0.0029) [2024-04-28 15:15:57,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10533093376. Throughput: 0: 55745.0. Samples: 1023447540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 15:15:57,170][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 15:15:59,663][57339] Updated weights for policy 0, policy_version 642898 (0.0030) [2024-04-28 15:16:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10533371904. Throughput: 0: 55735.8. Samples: 1023782880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:02,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 15:16:02,418][57339] Updated weights for policy 0, policy_version 642908 (0.0030) [2024-04-28 15:16:05,583][57339] Updated weights for policy 0, policy_version 642918 (0.0031) [2024-04-28 15:16:07,169][57108] Fps is (10 sec: 52430.0, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10533617664. Throughput: 0: 55565.9. Samples: 1023942160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:07,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 15:16:08,335][57339] Updated weights for policy 0, policy_version 642928 (0.0031) [2024-04-28 15:16:11,498][57339] Updated weights for policy 0, policy_version 642938 (0.0029) [2024-04-28 15:16:12,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10533928960. Throughput: 0: 55688.9. Samples: 1024281300. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:12,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 15:16:14,180][57339] Updated weights for policy 0, policy_version 642948 (0.0032) [2024-04-28 15:16:17,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10534207488. Throughput: 0: 55795.9. Samples: 1024613320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:17,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 15:16:17,431][57339] Updated weights for policy 0, policy_version 642958 (0.0031) [2024-04-28 15:16:20,285][57339] Updated weights for policy 0, policy_version 642968 (0.0031) [2024-04-28 15:16:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10534469632. Throughput: 0: 55730.7. Samples: 1024777420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:22,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 15:16:23,333][57339] Updated weights for policy 0, policy_version 642978 (0.0030) [2024-04-28 15:16:26,109][57319] Signal inference workers to stop experience collection... (15150 times) [2024-04-28 15:16:26,110][57319] Signal inference workers to resume experience collection... (15150 times) [2024-04-28 15:16:26,139][57339] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-04-28 15:16:26,139][57339] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-04-28 15:16:26,222][57339] Updated weights for policy 0, policy_version 642988 (0.0034) [2024-04-28 15:16:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 10534764544. Throughput: 0: 55547.0. Samples: 1025107140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:27,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 15:16:29,289][57339] Updated weights for policy 0, policy_version 642998 (0.0032) [2024-04-28 15:16:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10535026688. Throughput: 0: 55561.8. Samples: 1025441200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:32,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 15:16:32,184][57339] Updated weights for policy 0, policy_version 643008 (0.0027) [2024-04-28 15:16:35,076][57339] Updated weights for policy 0, policy_version 643018 (0.0023) [2024-04-28 15:16:37,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10535305216. Throughput: 0: 55543.8. Samples: 1025609240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:37,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 15:16:38,058][57339] Updated weights for policy 0, policy_version 643028 (0.0031) [2024-04-28 15:16:40,793][57339] Updated weights for policy 0, policy_version 643038 (0.0026) [2024-04-28 15:16:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10535583744. Throughput: 0: 55590.9. Samples: 1025949120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:42,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 15:16:43,868][57339] Updated weights for policy 0, policy_version 643048 (0.0030) [2024-04-28 15:16:46,849][57339] Updated weights for policy 0, policy_version 643058 (0.0025) [2024-04-28 15:16:47,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10535878656. Throughput: 0: 55532.5. Samples: 1026281840. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:47,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 15:16:49,730][57339] Updated weights for policy 0, policy_version 643068 (0.0026) [2024-04-28 15:16:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10536157184. Throughput: 0: 55819.4. Samples: 1026454040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:52,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 15:16:52,606][57339] Updated weights for policy 0, policy_version 643078 (0.0027) [2024-04-28 15:16:55,643][57339] Updated weights for policy 0, policy_version 643088 (0.0026) [2024-04-28 15:16:57,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10536419328. Throughput: 0: 55678.5. Samples: 1026786840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:16:57,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 15:16:58,624][57339] Updated weights for policy 0, policy_version 643098 (0.0030) [2024-04-28 15:17:01,507][57339] Updated weights for policy 0, policy_version 643108 (0.0026) [2024-04-28 15:17:02,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10536730624. Throughput: 0: 55711.0. Samples: 1027120320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:17:02,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 15:17:04,496][57339] Updated weights for policy 0, policy_version 643118 (0.0029) [2024-04-28 15:17:07,169][57108] Fps is (10 sec: 55706.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10536976384. Throughput: 0: 55824.1. Samples: 1027289500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:17:07,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:17:07,443][57339] Updated weights for policy 0, policy_version 643128 (0.0029) [2024-04-28 15:17:10,204][57339] Updated weights for policy 0, policy_version 643138 (0.0029) [2024-04-28 15:17:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10537271296. Throughput: 0: 55969.7. Samples: 1027625780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:17:12,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 15:17:13,364][57339] Updated weights for policy 0, policy_version 643148 (0.0027) [2024-04-28 15:17:16,074][57339] Updated weights for policy 0, policy_version 643158 (0.0032) [2024-04-28 15:17:17,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10537549824. Throughput: 0: 55973.3. Samples: 1027960000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:17:17,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 15:17:19,231][57339] Updated weights for policy 0, policy_version 643168 (0.0027) [2024-04-28 15:17:21,908][57339] Updated weights for policy 0, policy_version 643178 (0.0027) [2024-04-28 15:17:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10537844736. Throughput: 0: 55970.4. Samples: 1028127920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:17:22,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 15:17:25,007][57339] Updated weights for policy 0, policy_version 643188 (0.0030) [2024-04-28 15:17:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10538123264. Throughput: 0: 55859.1. Samples: 1028462780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:17:27,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 15:17:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000643196_10538123264.pth... [2024-04-28 15:17:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000642380_10524753920.pth [2024-04-28 15:17:27,847][57339] Updated weights for policy 0, policy_version 643198 (0.0027) [2024-04-28 15:17:30,966][57339] Updated weights for policy 0, policy_version 643208 (0.0028) [2024-04-28 15:17:32,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10538385408. Throughput: 0: 55910.7. Samples: 1028797820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:17:32,169][57108] Avg episode reward: [(0, '0.697')] [2024-04-28 15:17:33,666][57339] Updated weights for policy 0, policy_version 643218 (0.0028) [2024-04-28 15:17:36,917][57339] Updated weights for policy 0, policy_version 643228 (0.0031) [2024-04-28 15:17:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10538680320. Throughput: 0: 55791.7. Samples: 1028964660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-04-28 15:17:37,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 15:17:39,507][57339] Updated weights for policy 0, policy_version 643238 (0.0028) [2024-04-28 15:17:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10538926080. Throughput: 0: 55857.1. Samples: 1029300400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:17:42,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 15:17:42,643][57339] Updated weights for policy 0, policy_version 643248 (0.0026) [2024-04-28 15:17:45,425][57339] Updated weights for policy 0, policy_version 643258 (0.0028) [2024-04-28 15:17:47,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55816.7). 
Total num frames: 10539237376. Throughput: 0: 55881.0. Samples: 1029634960. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:17:47,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:17:48,493][57339] Updated weights for policy 0, policy_version 643268 (0.0028) [2024-04-28 15:17:48,497][57319] Signal inference workers to stop experience collection... (15200 times) [2024-04-28 15:17:48,497][57319] Signal inference workers to resume experience collection... (15200 times) [2024-04-28 15:17:48,513][57339] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-04-28 15:17:48,513][57339] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-04-28 15:17:51,446][57339] Updated weights for policy 0, policy_version 643278 (0.0032) [2024-04-28 15:17:52,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10539515904. Throughput: 0: 55886.2. Samples: 1029804380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:17:52,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 15:17:54,376][57339] Updated weights for policy 0, policy_version 643288 (0.0031) [2024-04-28 15:17:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 10539778048. Throughput: 0: 55757.2. Samples: 1030134840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:17:57,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 15:17:57,176][57339] Updated weights for policy 0, policy_version 643298 (0.0034) [2024-04-28 15:18:00,361][57339] Updated weights for policy 0, policy_version 643308 (0.0029) [2024-04-28 15:18:02,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10540056576. Throughput: 0: 55774.6. Samples: 1030469860. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:02,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 15:18:03,013][57339] Updated weights for policy 0, policy_version 643318 (0.0032) [2024-04-28 15:18:06,176][57339] Updated weights for policy 0, policy_version 643328 (0.0029) [2024-04-28 15:18:07,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10540335104. Throughput: 0: 55817.8. Samples: 1030639720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:07,169][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 15:18:08,921][57339] Updated weights for policy 0, policy_version 643338 (0.0028) [2024-04-28 15:18:11,928][57339] Updated weights for policy 0, policy_version 643348 (0.0027) [2024-04-28 15:18:12,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.9, 300 sec: 55761.2). Total num frames: 10540630016. Throughput: 0: 55893.9. Samples: 1030978000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:12,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:18:14,778][57339] Updated weights for policy 0, policy_version 643358 (0.0031) [2024-04-28 15:18:17,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10540875776. Throughput: 0: 55860.5. Samples: 1031311540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:17,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 15:18:17,854][57339] Updated weights for policy 0, policy_version 643368 (0.0027) [2024-04-28 15:18:20,469][57339] Updated weights for policy 0, policy_version 643378 (0.0026) [2024-04-28 15:18:22,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10541187072. Throughput: 0: 55864.4. Samples: 1031478560. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:22,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 15:18:23,770][57339] Updated weights for policy 0, policy_version 643388 (0.0036) [2024-04-28 15:18:26,378][57339] Updated weights for policy 0, policy_version 643398 (0.0035) [2024-04-28 15:18:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10541449216. Throughput: 0: 55858.3. Samples: 1031814020. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:27,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 15:18:29,573][57339] Updated weights for policy 0, policy_version 643408 (0.0032) [2024-04-28 15:18:32,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 10541744128. Throughput: 0: 55863.3. Samples: 1032148800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:32,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 15:18:32,224][57339] Updated weights for policy 0, policy_version 643418 (0.0026) [2024-04-28 15:18:35,477][57339] Updated weights for policy 0, policy_version 643428 (0.0028) [2024-04-28 15:18:37,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10542006272. Throughput: 0: 55838.6. Samples: 1032317120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:37,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 15:18:38,180][57339] Updated weights for policy 0, policy_version 643438 (0.0030) [2024-04-28 15:18:40,971][57319] Signal inference workers to stop experience collection... (15250 times) [2024-04-28 15:18:40,997][57339] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-04-28 15:18:41,031][57319] Signal inference workers to resume experience collection... 
(15250 times) [2024-04-28 15:18:41,031][57339] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-04-28 15:18:41,302][57339] Updated weights for policy 0, policy_version 643448 (0.0026) [2024-04-28 15:18:42,169][57108] Fps is (10 sec: 55704.7, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10542301184. Throughput: 0: 55995.4. Samples: 1032654640. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:42,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 15:18:43,882][57339] Updated weights for policy 0, policy_version 643458 (0.0028) [2024-04-28 15:18:47,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10542563328. Throughput: 0: 56018.1. Samples: 1032990680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:47,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 15:18:47,279][57339] Updated weights for policy 0, policy_version 643468 (0.0029) [2024-04-28 15:18:49,809][57339] Updated weights for policy 0, policy_version 643478 (0.0036) [2024-04-28 15:18:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10542841856. Throughput: 0: 55806.4. Samples: 1033151000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:52,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 15:18:53,010][57339] Updated weights for policy 0, policy_version 643488 (0.0033) [2024-04-28 15:18:55,782][57339] Updated weights for policy 0, policy_version 643498 (0.0028) [2024-04-28 15:18:57,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.5, 300 sec: 55816.7). Total num frames: 10543153152. Throughput: 0: 55815.3. Samples: 1033489700. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:18:57,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 15:18:58,716][57339] Updated weights for policy 0, policy_version 643508 (0.0025) [2024-04-28 15:19:01,591][57339] Updated weights for policy 0, policy_version 643518 (0.0038) [2024-04-28 15:19:02,169][57108] Fps is (10 sec: 60620.7, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10543448064. Throughput: 0: 55985.7. Samples: 1033830900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:19:02,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 15:19:04,787][57339] Updated weights for policy 0, policy_version 643528 (0.0033) [2024-04-28 15:19:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10543710208. Throughput: 0: 56019.1. Samples: 1033999420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:19:07,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 15:19:07,417][57339] Updated weights for policy 0, policy_version 643538 (0.0028) [2024-04-28 15:19:10,534][57339] Updated weights for policy 0, policy_version 643548 (0.0024) [2024-04-28 15:19:12,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10543972352. Throughput: 0: 55918.2. Samples: 1034330340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-04-28 15:19:12,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 15:19:13,156][57339] Updated weights for policy 0, policy_version 643558 (0.0027) [2024-04-28 15:19:16,183][57339] Updated weights for policy 0, policy_version 643568 (0.0030) [2024-04-28 15:19:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56524.6, 300 sec: 55872.2). Total num frames: 10544267264. Throughput: 0: 56122.4. Samples: 1034674320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:19:17,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 15:19:18,891][57339] Updated weights for policy 0, policy_version 643578 (0.0029) [2024-04-28 15:19:22,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10544529408. Throughput: 0: 55912.6. Samples: 1034833180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:19:22,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 15:19:22,204][57339] Updated weights for policy 0, policy_version 643588 (0.0031) [2024-04-28 15:19:24,943][57339] Updated weights for policy 0, policy_version 643598 (0.0028) [2024-04-28 15:19:27,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10544791552. Throughput: 0: 55940.5. Samples: 1035171960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:19:27,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 15:19:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000643603_10544791552.pth... [2024-04-28 15:19:27,254][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000642787_10531422208.pth [2024-04-28 15:19:28,119][57339] Updated weights for policy 0, policy_version 643608 (0.0032) [2024-04-28 15:19:30,828][57339] Updated weights for policy 0, policy_version 643618 (0.0028) [2024-04-28 15:19:32,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 10545086464. Throughput: 0: 55781.0. Samples: 1035500820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:19:32,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 15:19:33,885][57339] Updated weights for policy 0, policy_version 643628 (0.0030) [2024-04-28 15:19:36,549][57339] Updated weights for policy 0, policy_version 643638 (0.0031) [2024-04-28 15:19:37,169][57108] Fps is (10 sec: 60621.2, 60 sec: 56524.9, 300 sec: 55872.2). 
Total num frames: 10545397760. Throughput: 0: 56070.3. Samples: 1035674160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:19:37,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 15:19:39,784][57339] Updated weights for policy 0, policy_version 643648 (0.0027) [2024-04-28 15:19:42,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10545659904. Throughput: 0: 56051.5. Samples: 1036012020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:19:42,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 15:19:42,508][57339] Updated weights for policy 0, policy_version 643658 (0.0029) [2024-04-28 15:19:45,700][57339] Updated weights for policy 0, policy_version 643668 (0.0031) [2024-04-28 15:19:47,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10545922048. Throughput: 0: 55858.6. Samples: 1036344540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:19:47,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 15:19:48,257][57339] Updated weights for policy 0, policy_version 643678 (0.0026) [2024-04-28 15:19:50,729][57319] Signal inference workers to stop experience collection... (15300 times) [2024-04-28 15:19:50,729][57319] Signal inference workers to resume experience collection... (15300 times) [2024-04-28 15:19:50,744][57339] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-04-28 15:19:50,761][57339] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-04-28 15:19:51,611][57339] Updated weights for policy 0, policy_version 643688 (0.0028) [2024-04-28 15:19:52,169][57108] Fps is (10 sec: 55706.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10546216960. Throughput: 0: 55924.6. Samples: 1036516020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:19:52,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:19:54,115][57339] Updated weights for policy 0, policy_version 643698 (0.0028) [2024-04-28 15:19:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10546495488. Throughput: 0: 56007.9. Samples: 1036850700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:19:57,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 15:19:57,463][57339] Updated weights for policy 0, policy_version 643708 (0.0029) [2024-04-28 15:20:00,161][57339] Updated weights for policy 0, policy_version 643718 (0.0027) [2024-04-28 15:20:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10546757632. Throughput: 0: 55816.2. Samples: 1037186040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:02,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 15:20:03,190][57339] Updated weights for policy 0, policy_version 643728 (0.0028) [2024-04-28 15:20:06,002][57339] Updated weights for policy 0, policy_version 643738 (0.0029) [2024-04-28 15:20:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10547052544. Throughput: 0: 55950.6. Samples: 1037350960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:07,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 15:20:09,339][57339] Updated weights for policy 0, policy_version 643748 (0.0027) [2024-04-28 15:20:11,870][57339] Updated weights for policy 0, policy_version 643758 (0.0029) [2024-04-28 15:20:12,169][57108] Fps is (10 sec: 58981.6, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10547347456. Throughput: 0: 55782.6. Samples: 1037682180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:12,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 15:20:15,191][57339] Updated weights for policy 0, policy_version 643768 (0.0034) [2024-04-28 15:20:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10547609600. Throughput: 0: 55851.7. Samples: 1038014140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:17,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 15:20:17,776][57339] Updated weights for policy 0, policy_version 643778 (0.0034) [2024-04-28 15:20:20,932][57339] Updated weights for policy 0, policy_version 643788 (0.0025) [2024-04-28 15:20:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10547888128. Throughput: 0: 55971.8. Samples: 1038192900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:22,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 15:20:23,519][57339] Updated weights for policy 0, policy_version 643798 (0.0033) [2024-04-28 15:20:26,842][57339] Updated weights for policy 0, policy_version 643808 (0.0030) [2024-04-28 15:20:27,169][57108] Fps is (10 sec: 55705.0, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10548166656. Throughput: 0: 55833.9. Samples: 1038524540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:27,170][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 15:20:29,347][57339] Updated weights for policy 0, policy_version 643818 (0.0028) [2024-04-28 15:20:32,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10548428800. Throughput: 0: 55821.0. Samples: 1038856480. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:32,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 15:20:32,681][57339] Updated weights for policy 0, policy_version 643828 (0.0031) [2024-04-28 15:20:35,223][57339] Updated weights for policy 0, policy_version 643838 (0.0030) [2024-04-28 15:20:37,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10548690944. Throughput: 0: 55662.5. Samples: 1039020840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:37,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 15:20:38,593][57339] Updated weights for policy 0, policy_version 643848 (0.0025) [2024-04-28 15:20:41,075][57339] Updated weights for policy 0, policy_version 643858 (0.0027) [2024-04-28 15:20:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 10548985856. Throughput: 0: 55550.2. Samples: 1039350460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:42,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 15:20:44,497][57339] Updated weights for policy 0, policy_version 643868 (0.0030) [2024-04-28 15:20:47,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10549280768. Throughput: 0: 55539.6. Samples: 1039685320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:20:47,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 15:20:47,197][57339] Updated weights for policy 0, policy_version 643878 (0.0029) [2024-04-28 15:20:50,247][57339] Updated weights for policy 0, policy_version 643888 (0.0028) [2024-04-28 15:20:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10549559296. Throughput: 0: 55796.0. Samples: 1039861780. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:20:52,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 15:20:53,113][57339] Updated weights for policy 0, policy_version 643898 (0.0032) [2024-04-28 15:20:55,665][57319] Signal inference workers to stop experience collection... (15350 times) [2024-04-28 15:20:55,696][57339] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-04-28 15:20:55,724][57319] Signal inference workers to resume experience collection... (15350 times) [2024-04-28 15:20:55,730][57339] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-04-28 15:20:56,093][57339] Updated weights for policy 0, policy_version 643908 (0.0027) [2024-04-28 15:20:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10549837824. Throughput: 0: 55840.1. Samples: 1040194980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:20:57,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 15:20:59,051][57339] Updated weights for policy 0, policy_version 643918 (0.0028) [2024-04-28 15:21:02,046][57339] Updated weights for policy 0, policy_version 643928 (0.0033) [2024-04-28 15:21:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10550116352. Throughput: 0: 55754.2. Samples: 1040523080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:02,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 15:21:04,934][57339] Updated weights for policy 0, policy_version 643938 (0.0026) [2024-04-28 15:21:07,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10550378496. Throughput: 0: 55309.5. Samples: 1040681820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:07,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 15:21:07,999][57339] Updated weights for policy 0, policy_version 643948 (0.0036) [2024-04-28 15:21:10,813][57339] Updated weights for policy 0, policy_version 643958 (0.0028) [2024-04-28 15:21:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 10550657024. Throughput: 0: 55539.7. Samples: 1041023820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:12,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 15:21:13,876][57339] Updated weights for policy 0, policy_version 643968 (0.0033) [2024-04-28 15:21:16,824][57339] Updated weights for policy 0, policy_version 643978 (0.0026) [2024-04-28 15:21:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10550935552. Throughput: 0: 55522.6. Samples: 1041355000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:17,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:21:19,744][57339] Updated weights for policy 0, policy_version 643988 (0.0032) [2024-04-28 15:21:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10551230464. Throughput: 0: 55533.4. Samples: 1041519840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:22,170][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 15:21:22,753][57339] Updated weights for policy 0, policy_version 643998 (0.0026) [2024-04-28 15:21:25,624][57339] Updated weights for policy 0, policy_version 644008 (0.0029) [2024-04-28 15:21:27,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10551508992. Throughput: 0: 55574.3. Samples: 1041851300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:27,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 15:21:27,344][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000644015_10551541760.pth... [2024-04-28 15:21:27,397][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000643196_10538123264.pth [2024-04-28 15:21:28,469][57339] Updated weights for policy 0, policy_version 644018 (0.0026) [2024-04-28 15:21:31,596][57339] Updated weights for policy 0, policy_version 644028 (0.0032) [2024-04-28 15:21:32,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 10551803904. Throughput: 0: 55585.3. Samples: 1042186660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:32,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 15:21:34,387][57339] Updated weights for policy 0, policy_version 644038 (0.0032) [2024-04-28 15:21:37,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10552049664. Throughput: 0: 55428.8. Samples: 1042356080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:37,170][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 15:21:37,536][57339] Updated weights for policy 0, policy_version 644048 (0.0024) [2024-04-28 15:21:40,449][57339] Updated weights for policy 0, policy_version 644058 (0.0030) [2024-04-28 15:21:42,170][57108] Fps is (10 sec: 50786.4, 60 sec: 55431.8, 300 sec: 55705.5). Total num frames: 10552311808. Throughput: 0: 55460.8. Samples: 1042690760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:42,170][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 15:21:43,297][57339] Updated weights for policy 0, policy_version 644068 (0.0026) [2024-04-28 15:21:46,876][57339] Updated weights for policy 0, policy_version 644078 (0.0026) [2024-04-28 15:21:47,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55705.6). 
Total num frames: 10552590336. Throughput: 0: 55736.0. Samples: 1043031200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 15:21:49,148][57339] Updated weights for policy 0, policy_version 644088 (0.0026) [2024-04-28 15:21:52,169][57108] Fps is (10 sec: 57347.8, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10552885248. Throughput: 0: 55656.3. Samples: 1043186360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:52,170][57108] Avg episode reward: [(0, '0.503')] [2024-04-28 15:21:52,579][57339] Updated weights for policy 0, policy_version 644098 (0.0031) [2024-04-28 15:21:54,567][57319] Signal inference workers to stop experience collection... (15400 times) [2024-04-28 15:21:54,567][57319] Signal inference workers to resume experience collection... (15400 times) [2024-04-28 15:21:54,591][57339] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-04-28 15:21:54,591][57339] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-04-28 15:21:55,076][57339] Updated weights for policy 0, policy_version 644108 (0.0030) [2024-04-28 15:21:57,169][57108] Fps is (10 sec: 60620.1, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10553196544. Throughput: 0: 55607.4. Samples: 1043526160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:21:57,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 15:21:58,341][57339] Updated weights for policy 0, policy_version 644118 (0.0028) [2024-04-28 15:22:00,793][57339] Updated weights for policy 0, policy_version 644128 (0.0031) [2024-04-28 15:22:02,169][57108] Fps is (10 sec: 60621.0, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 10553491456. Throughput: 0: 55716.9. Samples: 1043862260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:22:02,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 15:22:04,267][57339] Updated weights for policy 0, policy_version 644138 (0.0035) [2024-04-28 15:22:06,768][57339] Updated weights for policy 0, policy_version 644148 (0.0031) [2024-04-28 15:22:07,169][57108] Fps is (10 sec: 55706.2, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10553753600. Throughput: 0: 55976.5. Samples: 1044038780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:22:07,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 15:22:10,151][57339] Updated weights for policy 0, policy_version 644158 (0.0030) [2024-04-28 15:22:12,169][57108] Fps is (10 sec: 50791.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10553999360. Throughput: 0: 56082.3. Samples: 1044375000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:22:12,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 15:22:12,550][57339] Updated weights for policy 0, policy_version 644168 (0.0033) [2024-04-28 15:22:15,905][57339] Updated weights for policy 0, policy_version 644178 (0.0029) [2024-04-28 15:22:17,169][57108] Fps is (10 sec: 50790.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10554261504. Throughput: 0: 56032.3. Samples: 1044708120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:22:17,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 15:22:18,518][57339] Updated weights for policy 0, policy_version 644188 (0.0027) [2024-04-28 15:22:21,623][57339] Updated weights for policy 0, policy_version 644198 (0.0027) [2024-04-28 15:22:22,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10554572800. Throughput: 0: 55680.9. Samples: 1044861720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:22:22,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 15:22:24,459][57339] Updated weights for policy 0, policy_version 644208 (0.0029) [2024-04-28 15:22:27,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10554834944. Throughput: 0: 55734.3. Samples: 1045198760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:22:27,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 15:22:27,490][57339] Updated weights for policy 0, policy_version 644218 (0.0030) [2024-04-28 15:22:30,372][57339] Updated weights for policy 0, policy_version 644228 (0.0037) [2024-04-28 15:22:32,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10555162624. Throughput: 0: 55512.9. Samples: 1045529280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:22:32,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 15:22:33,586][57339] Updated weights for policy 0, policy_version 644238 (0.0032) [2024-04-28 15:22:36,111][57339] Updated weights for policy 0, policy_version 644248 (0.0026) [2024-04-28 15:22:37,169][57108] Fps is (10 sec: 60620.8, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10555441152. Throughput: 0: 56122.8. Samples: 1045711880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:22:37,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 15:22:39,396][57339] Updated weights for policy 0, policy_version 644258 (0.0037) [2024-04-28 15:22:42,051][57339] Updated weights for policy 0, policy_version 644268 (0.0034) [2024-04-28 15:22:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56525.6, 300 sec: 55816.7). Total num frames: 10555703296. Throughput: 0: 56090.9. Samples: 1046050240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:22:42,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 15:22:45,150][57339] Updated weights for policy 0, policy_version 644278 (0.0025) [2024-04-28 15:22:46,334][57319] Signal inference workers to stop experience collection... (15450 times) [2024-04-28 15:22:46,335][57319] Signal inference workers to resume experience collection... (15450 times) [2024-04-28 15:22:46,362][57339] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-04-28 15:22:46,362][57339] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-04-28 15:22:47,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10555949056. Throughput: 0: 56014.7. Samples: 1046382920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:22:47,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 15:22:47,798][57339] Updated weights for policy 0, policy_version 644288 (0.0025) [2024-04-28 15:22:51,024][57339] Updated weights for policy 0, policy_version 644298 (0.0037) [2024-04-28 15:22:52,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10556211200. Throughput: 0: 55510.4. Samples: 1046536740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:22:52,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 15:22:53,753][57339] Updated weights for policy 0, policy_version 644308 (0.0032) [2024-04-28 15:22:56,880][57339] Updated weights for policy 0, policy_version 644318 (0.0026) [2024-04-28 15:22:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10556506112. Throughput: 0: 55503.4. Samples: 1046872660. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:22:57,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 15:22:59,609][57339] Updated weights for policy 0, policy_version 644328 (0.0036) [2024-04-28 15:23:02,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 10556801024. Throughput: 0: 55536.1. Samples: 1047207240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:02,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 15:23:02,870][57339] Updated weights for policy 0, policy_version 644338 (0.0024) [2024-04-28 15:23:05,575][57339] Updated weights for policy 0, policy_version 644348 (0.0028) [2024-04-28 15:23:07,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10557095936. Throughput: 0: 56062.3. Samples: 1047384520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:07,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 15:23:08,549][57339] Updated weights for policy 0, policy_version 644358 (0.0025) [2024-04-28 15:23:11,462][57339] Updated weights for policy 0, policy_version 644368 (0.0026) [2024-04-28 15:23:12,169][57108] Fps is (10 sec: 60621.3, 60 sec: 56797.9, 300 sec: 56038.8). Total num frames: 10557407232. Throughput: 0: 56049.4. Samples: 1047720980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:12,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 15:23:14,394][57339] Updated weights for policy 0, policy_version 644378 (0.0031) [2024-04-28 15:23:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10557636608. Throughput: 0: 56128.9. Samples: 1048055080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:17,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 15:23:17,341][57339] Updated weights for policy 0, policy_version 644388 (0.0030) [2024-04-28 15:23:20,303][57339] Updated weights for policy 0, policy_version 644398 (0.0025) [2024-04-28 15:23:22,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10557915136. Throughput: 0: 55778.3. Samples: 1048221900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:22,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 15:23:23,096][57339] Updated weights for policy 0, policy_version 644408 (0.0028) [2024-04-28 15:23:26,442][57339] Updated weights for policy 0, policy_version 644418 (0.0031) [2024-04-28 15:23:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10558193664. Throughput: 0: 55699.9. Samples: 1048556740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:27,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 15:23:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000644421_10558193664.pth... [2024-04-28 15:23:27,232][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000643603_10544791552.pth [2024-04-28 15:23:28,899][57339] Updated weights for policy 0, policy_version 644428 (0.0030) [2024-04-28 15:23:32,100][57319] Signal inference workers to stop experience collection... (15500 times) [2024-04-28 15:23:32,119][57339] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-04-28 15:23:32,159][57319] Signal inference workers to resume experience collection... (15500 times) [2024-04-28 15:23:32,159][57339] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-04-28 15:23:32,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54886.4, 300 sec: 55761.1). Total num frames: 10558455808. Throughput: 0: 55652.9. 
Samples: 1048887300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:32,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 15:23:32,265][57339] Updated weights for policy 0, policy_version 644438 (0.0026) [2024-04-28 15:23:34,858][57339] Updated weights for policy 0, policy_version 644448 (0.0025) [2024-04-28 15:23:37,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10558734336. Throughput: 0: 55971.9. Samples: 1049055480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:37,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 15:23:38,299][57339] Updated weights for policy 0, policy_version 644458 (0.0028) [2024-04-28 15:23:40,735][57339] Updated weights for policy 0, policy_version 644468 (0.0027) [2024-04-28 15:23:42,169][57108] Fps is (10 sec: 60620.3, 60 sec: 55978.5, 300 sec: 55927.8). Total num frames: 10559062016. Throughput: 0: 55843.9. Samples: 1049385640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:42,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:23:44,311][57339] Updated weights for policy 0, policy_version 644478 (0.0029) [2024-04-28 15:23:46,637][57339] Updated weights for policy 0, policy_version 644488 (0.0031) [2024-04-28 15:23:47,169][57108] Fps is (10 sec: 60620.3, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 10559340544. Throughput: 0: 55659.9. Samples: 1049711940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:47,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 15:23:50,108][57339] Updated weights for policy 0, policy_version 644498 (0.0026) [2024-04-28 15:23:52,169][57108] Fps is (10 sec: 50791.2, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10559569920. Throughput: 0: 55661.4. Samples: 1049889280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:52,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 15:23:52,538][57339] Updated weights for policy 0, policy_version 644508 (0.0022) [2024-04-28 15:23:55,972][57339] Updated weights for policy 0, policy_version 644518 (0.0031) [2024-04-28 15:23:57,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10559848448. Throughput: 0: 55582.9. Samples: 1050222220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:23:57,170][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 15:23:58,618][57339] Updated weights for policy 0, policy_version 644528 (0.0030) [2024-04-28 15:24:01,973][57339] Updated weights for policy 0, policy_version 644538 (0.0024) [2024-04-28 15:24:02,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10560126976. Throughput: 0: 55534.1. Samples: 1050554120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 15:24:02,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 15:24:04,384][57339] Updated weights for policy 0, policy_version 644548 (0.0032) [2024-04-28 15:24:07,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54613.2, 300 sec: 55594.5). Total num frames: 10560372736. Throughput: 0: 55194.8. Samples: 1050705680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:07,170][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 15:24:07,987][57339] Updated weights for policy 0, policy_version 644558 (0.0031) [2024-04-28 15:24:08,232][57319] Signal inference workers to stop experience collection... (15550 times) [2024-04-28 15:24:08,239][57319] Signal inference workers to resume experience collection... 
(15550 times) [2024-04-28 15:24:08,246][57339] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-04-28 15:24:08,270][57339] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-04-28 15:24:10,319][57339] Updated weights for policy 0, policy_version 644568 (0.0028) [2024-04-28 15:24:12,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54613.2, 300 sec: 55650.1). Total num frames: 10560684032. Throughput: 0: 55198.2. Samples: 1051040660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:12,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 15:24:13,644][57339] Updated weights for policy 0, policy_version 644578 (0.0026) [2024-04-28 15:24:16,237][57339] Updated weights for policy 0, policy_version 644588 (0.0026) [2024-04-28 15:24:17,169][57108] Fps is (10 sec: 63897.4, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10561011712. Throughput: 0: 55224.3. Samples: 1051372400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:17,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:24:19,451][57339] Updated weights for policy 0, policy_version 644598 (0.0027) [2024-04-28 15:24:22,021][57339] Updated weights for policy 0, policy_version 644608 (0.0033) [2024-04-28 15:24:22,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10561273856. Throughput: 0: 55448.8. Samples: 1051550680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:22,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 15:24:25,535][57339] Updated weights for policy 0, policy_version 644618 (0.0029) [2024-04-28 15:24:27,169][57108] Fps is (10 sec: 50791.3, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10561519616. Throughput: 0: 55541.0. Samples: 1051884980. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:27,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:24:27,791][57339] Updated weights for policy 0, policy_version 644628 (0.0023) [2024-04-28 15:24:31,381][57339] Updated weights for policy 0, policy_version 644638 (0.0028) [2024-04-28 15:24:32,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10561798144. Throughput: 0: 55761.0. Samples: 1052221180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:32,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 15:24:33,728][57339] Updated weights for policy 0, policy_version 644648 (0.0032) [2024-04-28 15:24:37,144][57339] Updated weights for policy 0, policy_version 644658 (0.0027) [2024-04-28 15:24:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10562076672. Throughput: 0: 55247.1. Samples: 1052375400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:37,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 15:24:39,590][57339] Updated weights for policy 0, policy_version 644668 (0.0031) [2024-04-28 15:24:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54613.4, 300 sec: 55650.1). Total num frames: 10562338816. Throughput: 0: 55369.0. Samples: 1052713820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:42,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 15:24:43,121][57339] Updated weights for policy 0, policy_version 644678 (0.0024) [2024-04-28 15:24:45,494][57339] Updated weights for policy 0, policy_version 644688 (0.0034) [2024-04-28 15:24:47,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10562650112. Throughput: 0: 55408.0. Samples: 1053047480. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:47,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 15:24:48,964][57339] Updated weights for policy 0, policy_version 644698 (0.0027) [2024-04-28 15:24:51,197][57339] Updated weights for policy 0, policy_version 644708 (0.0026) [2024-04-28 15:24:52,169][57108] Fps is (10 sec: 60620.3, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10562945024. Throughput: 0: 55854.2. Samples: 1053219120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:52,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 15:24:54,304][57319] Signal inference workers to stop experience collection... (15600 times) [2024-04-28 15:24:54,353][57339] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-04-28 15:24:54,362][57319] Signal inference workers to resume experience collection... (15600 times) [2024-04-28 15:24:54,369][57339] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-04-28 15:24:54,865][57339] Updated weights for policy 0, policy_version 644718 (0.0026) [2024-04-28 15:24:57,005][57339] Updated weights for policy 0, policy_version 644728 (0.0028) [2024-04-28 15:24:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10563223552. Throughput: 0: 55833.3. Samples: 1053553160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:24:57,170][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:25:00,692][57339] Updated weights for policy 0, policy_version 644738 (0.0029) [2024-04-28 15:25:02,169][57108] Fps is (10 sec: 54068.3, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10563485696. Throughput: 0: 55937.2. Samples: 1053889560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:25:02,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 15:25:02,956][57339] Updated weights for policy 0, policy_version 644748 (0.0027) [2024-04-28 15:25:06,701][57339] Updated weights for policy 0, policy_version 644758 (0.0024) [2024-04-28 15:25:07,169][57108] Fps is (10 sec: 52428.9, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 10563747840. Throughput: 0: 55592.0. Samples: 1054052320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:25:07,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 15:25:08,854][57339] Updated weights for policy 0, policy_version 644768 (0.0031) [2024-04-28 15:25:12,169][57108] Fps is (10 sec: 52427.9, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10564009984. Throughput: 0: 55617.2. Samples: 1054387760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:25:12,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 15:25:12,724][57339] Updated weights for policy 0, policy_version 644778 (0.0026) [2024-04-28 15:25:14,636][57339] Updated weights for policy 0, policy_version 644788 (0.0030) [2024-04-28 15:25:17,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54613.5, 300 sec: 55594.5). Total num frames: 10564288512. Throughput: 0: 55540.9. Samples: 1054720520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:25:17,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 15:25:18,494][57339] Updated weights for policy 0, policy_version 644798 (0.0031) [2024-04-28 15:25:20,706][57339] Updated weights for policy 0, policy_version 644808 (0.0025) [2024-04-28 15:25:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10564583424. Throughput: 0: 55777.2. Samples: 1054885380. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:25:22,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 15:25:24,438][57339] Updated weights for policy 0, policy_version 644818 (0.0032) [2024-04-28 15:25:26,511][57339] Updated weights for policy 0, policy_version 644828 (0.0028) [2024-04-28 15:25:27,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10564878336. Throughput: 0: 55740.1. Samples: 1055222120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:25:27,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 15:25:27,201][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000644830_10564894720.pth... [2024-04-28 15:25:27,256][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000644015_10551541760.pth [2024-04-28 15:25:30,376][57339] Updated weights for policy 0, policy_version 644838 (0.0031) [2024-04-28 15:25:32,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10565156864. Throughput: 0: 55773.8. Samples: 1055557300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:25:32,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 15:25:32,448][57339] Updated weights for policy 0, policy_version 644848 (0.0036) [2024-04-28 15:25:36,118][57339] Updated weights for policy 0, policy_version 644858 (0.0027) [2024-04-28 15:25:36,935][57319] Signal inference workers to stop experience collection... (15650 times) [2024-04-28 15:25:36,977][57339] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-04-28 15:25:37,001][57319] Signal inference workers to resume experience collection... (15650 times) [2024-04-28 15:25:37,001][57339] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-04-28 15:25:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10565435392. Throughput: 0: 55638.8. 
Samples: 1055722860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 15:25:37,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 15:25:38,235][57339] Updated weights for policy 0, policy_version 644868 (0.0026) [2024-04-28 15:25:41,967][57339] Updated weights for policy 0, policy_version 644878 (0.0026) [2024-04-28 15:25:42,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10565681152. Throughput: 0: 55655.7. Samples: 1056057660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:25:42,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:25:44,146][57339] Updated weights for policy 0, policy_version 644888 (0.0028) [2024-04-28 15:25:47,169][57108] Fps is (10 sec: 50790.4, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 10565943296. Throughput: 0: 55649.2. Samples: 1056393780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:25:47,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 15:25:47,746][57339] Updated weights for policy 0, policy_version 644898 (0.0025) [2024-04-28 15:25:49,964][57339] Updated weights for policy 0, policy_version 644908 (0.0033) [2024-04-28 15:25:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10566254592. Throughput: 0: 55680.5. Samples: 1056557940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:25:52,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 15:25:53,624][57339] Updated weights for policy 0, policy_version 644918 (0.0035) [2024-04-28 15:25:55,899][57339] Updated weights for policy 0, policy_version 644928 (0.0029) [2024-04-28 15:25:57,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10566549504. Throughput: 0: 55541.9. Samples: 1056887140. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:25:57,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 15:25:59,564][57339] Updated weights for policy 0, policy_version 644938 (0.0026) [2024-04-28 15:26:01,792][57339] Updated weights for policy 0, policy_version 644948 (0.0039) [2024-04-28 15:26:02,169][57108] Fps is (10 sec: 58980.9, 60 sec: 55978.4, 300 sec: 55816.6). Total num frames: 10566844416. Throughput: 0: 55470.4. Samples: 1057216700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:02,170][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 15:26:05,426][57339] Updated weights for policy 0, policy_version 644958 (0.0032) [2024-04-28 15:26:07,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10567106560. Throughput: 0: 55698.3. Samples: 1057391800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:07,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 15:26:07,804][57339] Updated weights for policy 0, policy_version 644968 (0.0024) [2024-04-28 15:26:11,276][57339] Updated weights for policy 0, policy_version 644978 (0.0028) [2024-04-28 15:26:12,169][57108] Fps is (10 sec: 52429.7, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10567368704. Throughput: 0: 55783.9. Samples: 1057732400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:12,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 15:26:13,478][57339] Updated weights for policy 0, policy_version 644988 (0.0029) [2024-04-28 15:26:17,087][57339] Updated weights for policy 0, policy_version 644998 (0.0026) [2024-04-28 15:26:17,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10567647232. Throughput: 0: 55765.8. Samples: 1058066760. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:17,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:26:19,372][57339] Updated weights for policy 0, policy_version 645008 (0.0028) [2024-04-28 15:26:22,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10567909376. Throughput: 0: 55526.5. Samples: 1058221560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:22,170][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 15:26:23,023][57339] Updated weights for policy 0, policy_version 645018 (0.0034) [2024-04-28 15:26:25,270][57339] Updated weights for policy 0, policy_version 645028 (0.0032) [2024-04-28 15:26:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10568204288. Throughput: 0: 55559.9. Samples: 1058557860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:27,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 15:26:28,990][57339] Updated weights for policy 0, policy_version 645038 (0.0029) [2024-04-28 15:26:30,971][57339] Updated weights for policy 0, policy_version 645048 (0.0029) [2024-04-28 15:26:32,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10568482816. Throughput: 0: 55630.5. Samples: 1058897160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:32,170][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 15:26:34,719][57339] Updated weights for policy 0, policy_version 645058 (0.0029) [2024-04-28 15:26:36,646][57319] Signal inference workers to stop experience collection... (15700 times) [2024-04-28 15:26:36,687][57339] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-04-28 15:26:36,713][57319] Signal inference workers to resume experience collection... 
(15700 times) [2024-04-28 15:26:36,714][57339] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-04-28 15:26:36,831][57339] Updated weights for policy 0, policy_version 645068 (0.0024) [2024-04-28 15:26:37,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55872.4). Total num frames: 10568794112. Throughput: 0: 55824.3. Samples: 1059070040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:37,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 15:26:40,412][57339] Updated weights for policy 0, policy_version 645078 (0.0032) [2024-04-28 15:26:42,169][57108] Fps is (10 sec: 58983.7, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10569072640. Throughput: 0: 55970.3. Samples: 1059405800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:42,169][57108] Avg episode reward: [(0, '0.517')] [2024-04-28 15:26:42,652][57339] Updated weights for policy 0, policy_version 645088 (0.0025) [2024-04-28 15:26:46,589][57339] Updated weights for policy 0, policy_version 645098 (0.0033) [2024-04-28 15:26:47,169][57108] Fps is (10 sec: 54067.4, 60 sec: 56524.8, 300 sec: 55761.2). Total num frames: 10569334784. Throughput: 0: 56301.6. Samples: 1059750260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:47,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 15:26:48,581][57339] Updated weights for policy 0, policy_version 645108 (0.0029) [2024-04-28 15:26:52,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10569596928. Throughput: 0: 55940.4. Samples: 1059909120. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:52,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 15:26:52,366][57339] Updated weights for policy 0, policy_version 645118 (0.0032) [2024-04-28 15:26:54,278][57339] Updated weights for policy 0, policy_version 645128 (0.0033) [2024-04-28 15:26:57,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10569875456. Throughput: 0: 55936.4. Samples: 1060249540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:26:57,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 15:26:58,029][57339] Updated weights for policy 0, policy_version 645138 (0.0036) [2024-04-28 15:27:00,279][57339] Updated weights for policy 0, policy_version 645148 (0.0027) [2024-04-28 15:27:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.7, 300 sec: 55594.5). Total num frames: 10570153984. Throughput: 0: 55961.3. Samples: 1060585020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:27:02,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 15:27:03,897][57339] Updated weights for policy 0, policy_version 645158 (0.0034) [2024-04-28 15:27:06,580][57339] Updated weights for policy 0, policy_version 645168 (0.0043) [2024-04-28 15:27:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10570448896. Throughput: 0: 56129.1. Samples: 1060747360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:27:07,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 15:27:09,868][57339] Updated weights for policy 0, policy_version 645178 (0.0026) [2024-04-28 15:27:12,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10570743808. Throughput: 0: 56076.4. Samples: 1061081300. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:27:12,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 15:27:12,294][57339] Updated weights for policy 0, policy_version 645188 (0.0028) [2024-04-28 15:27:15,708][57339] Updated weights for policy 0, policy_version 645198 (0.0030) [2024-04-28 15:27:17,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10571022336. Throughput: 0: 56004.0. Samples: 1061417340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 15:27:17,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 15:27:18,001][57339] Updated weights for policy 0, policy_version 645208 (0.0027) [2024-04-28 15:27:21,425][57339] Updated weights for policy 0, policy_version 645218 (0.0027) [2024-04-28 15:27:22,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 10571268096. Throughput: 0: 56031.2. Samples: 1061591440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:27:22,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 15:27:23,739][57339] Updated weights for policy 0, policy_version 645228 (0.0030) [2024-04-28 15:27:27,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 10571563008. Throughput: 0: 55907.4. Samples: 1061921640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:27:27,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 15:27:27,198][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000645238_10571579392.pth... [2024-04-28 15:27:27,203][57339] Updated weights for policy 0, policy_version 645238 (0.0033) [2024-04-28 15:27:27,242][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000644421_10558193664.pth [2024-04-28 15:27:28,240][57319] Signal inference workers to stop experience collection... (15750 times) [2024-04-28 15:27:28,241][57319] Signal inference workers to resume experience collection... 
(15750 times) [2024-04-28 15:27:28,265][57339] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-04-28 15:27:28,266][57339] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-04-28 15:27:29,813][57339] Updated weights for policy 0, policy_version 645248 (0.0026) [2024-04-28 15:27:32,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10571825152. Throughput: 0: 55722.6. Samples: 1062257780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:27:32,169][57108] Avg episode reward: [(0, '0.509')] [2024-04-28 15:27:33,361][57339] Updated weights for policy 0, policy_version 645258 (0.0031) [2024-04-28 15:27:35,613][57339] Updated weights for policy 0, policy_version 645268 (0.0028) [2024-04-28 15:27:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10572120064. Throughput: 0: 55705.0. Samples: 1062415840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:27:37,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:27:39,070][57339] Updated weights for policy 0, policy_version 645278 (0.0029) [2024-04-28 15:27:41,413][57339] Updated weights for policy 0, policy_version 645288 (0.0034) [2024-04-28 15:27:42,169][57108] Fps is (10 sec: 58983.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10572414976. Throughput: 0: 55586.8. Samples: 1062750940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:27:42,169][57108] Avg episode reward: [(0, '0.492')] [2024-04-28 15:27:45,012][57339] Updated weights for policy 0, policy_version 645298 (0.0026) [2024-04-28 15:27:47,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10572709888. Throughput: 0: 55597.0. Samples: 1063086880. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:27:47,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 15:27:47,254][57339] Updated weights for policy 0, policy_version 645308 (0.0027) [2024-04-28 15:27:51,099][57339] Updated weights for policy 0, policy_version 645318 (0.0031) [2024-04-28 15:27:52,169][57108] Fps is (10 sec: 55704.7, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 10572972032. Throughput: 0: 55929.2. Samples: 1063264180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:27:52,170][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 15:27:53,674][57339] Updated weights for policy 0, policy_version 645328 (0.0029) [2024-04-28 15:27:56,869][57339] Updated weights for policy 0, policy_version 645338 (0.0031) [2024-04-28 15:27:57,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10573217792. Throughput: 0: 55954.3. Samples: 1063599240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:27:57,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 15:27:59,636][57339] Updated weights for policy 0, policy_version 645348 (0.0034) [2024-04-28 15:28:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10573512704. Throughput: 0: 55902.7. Samples: 1063932960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:02,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 15:28:02,654][57339] Updated weights for policy 0, policy_version 645358 (0.0027) [2024-04-28 15:28:05,627][57339] Updated weights for policy 0, policy_version 645368 (0.0034) [2024-04-28 15:28:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 10573774848. Throughput: 0: 55304.3. Samples: 1064080140. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:07,170][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 15:28:08,608][57339] Updated weights for policy 0, policy_version 645378 (0.0029) [2024-04-28 15:28:11,445][57339] Updated weights for policy 0, policy_version 645388 (0.0030) [2024-04-28 15:28:12,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10574053376. Throughput: 0: 55428.4. Samples: 1064415920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:12,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 15:28:14,503][57339] Updated weights for policy 0, policy_version 645398 (0.0029) [2024-04-28 15:28:17,165][57339] Updated weights for policy 0, policy_version 645408 (0.0027) [2024-04-28 15:28:17,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10574364672. Throughput: 0: 55356.5. Samples: 1064748820. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:17,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 15:28:20,747][57339] Updated weights for policy 0, policy_version 645418 (0.0027) [2024-04-28 15:28:22,169][57108] Fps is (10 sec: 60620.8, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 10574659584. Throughput: 0: 55829.2. Samples: 1064928160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:22,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 15:28:23,099][57339] Updated weights for policy 0, policy_version 645428 (0.0029) [2024-04-28 15:28:26,484][57339] Updated weights for policy 0, policy_version 645438 (0.0033) [2024-04-28 15:28:27,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10574905344. Throughput: 0: 55810.2. Samples: 1065262400. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:27,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 15:28:29,053][57339] Updated weights for policy 0, policy_version 645448 (0.0030) [2024-04-28 15:28:32,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10575167488. Throughput: 0: 55731.5. Samples: 1065594800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:32,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 15:28:32,442][57339] Updated weights for policy 0, policy_version 645458 (0.0029) [2024-04-28 15:28:34,993][57339] Updated weights for policy 0, policy_version 645468 (0.0031) [2024-04-28 15:28:37,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 10575446016. Throughput: 0: 55357.8. Samples: 1065755280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:37,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 15:28:38,345][57339] Updated weights for policy 0, policy_version 645478 (0.0026) [2024-04-28 15:28:38,376][57319] Signal inference workers to stop experience collection... (15800 times) [2024-04-28 15:28:38,394][57339] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-04-28 15:28:38,468][57319] Signal inference workers to resume experience collection... (15800 times) [2024-04-28 15:28:38,468][57339] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-04-28 15:28:40,728][57339] Updated weights for policy 0, policy_version 645488 (0.0030) [2024-04-28 15:28:42,169][57108] Fps is (10 sec: 52428.8, 60 sec: 54613.3, 300 sec: 55427.9). Total num frames: 10575691776. Throughput: 0: 55458.2. Samples: 1066094860. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:42,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 15:28:44,180][57339] Updated weights for policy 0, policy_version 645498 (0.0033) [2024-04-28 15:28:46,606][57339] Updated weights for policy 0, policy_version 645508 (0.0037) [2024-04-28 15:28:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10576019456. Throughput: 0: 55341.4. Samples: 1066423320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:47,169][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 15:28:50,004][57339] Updated weights for policy 0, policy_version 645518 (0.0028) [2024-04-28 15:28:52,169][57108] Fps is (10 sec: 60620.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10576297984. Throughput: 0: 55985.7. Samples: 1066599500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-04-28 15:28:52,170][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 15:28:52,477][57339] Updated weights for policy 0, policy_version 645528 (0.0027) [2024-04-28 15:28:55,891][57339] Updated weights for policy 0, policy_version 645538 (0.0025) [2024-04-28 15:28:57,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10576609280. Throughput: 0: 56074.2. Samples: 1066939260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:28:57,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 15:28:58,298][57339] Updated weights for policy 0, policy_version 645548 (0.0025) [2024-04-28 15:29:01,699][57339] Updated weights for policy 0, policy_version 645558 (0.0024) [2024-04-28 15:29:02,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10576871424. Throughput: 0: 56031.6. Samples: 1067270240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:02,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 15:29:03,983][57339] Updated weights for policy 0, policy_version 645568 (0.0029) [2024-04-28 15:29:07,169][57108] Fps is (10 sec: 49152.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10577100800. Throughput: 0: 55703.7. Samples: 1067434820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:07,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 15:29:07,467][57339] Updated weights for policy 0, policy_version 645578 (0.0026) [2024-04-28 15:29:09,899][57339] Updated weights for policy 0, policy_version 645588 (0.0030) [2024-04-28 15:29:12,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10577395712. Throughput: 0: 55711.2. Samples: 1067769400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:12,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 15:29:13,374][57339] Updated weights for policy 0, policy_version 645598 (0.0026) [2024-04-28 15:29:16,140][57339] Updated weights for policy 0, policy_version 645608 (0.0023) [2024-04-28 15:29:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10577674240. Throughput: 0: 55726.3. Samples: 1068102480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:17,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 15:29:19,157][57339] Updated weights for policy 0, policy_version 645618 (0.0031) [2024-04-28 15:29:22,130][57339] Updated weights for policy 0, policy_version 645628 (0.0027) [2024-04-28 15:29:22,169][57108] Fps is (10 sec: 57342.5, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 10577969152. Throughput: 0: 55829.2. Samples: 1068267600. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:22,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 15:29:24,837][57319] Signal inference workers to stop experience collection... (15850 times) [2024-04-28 15:29:24,879][57339] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-04-28 15:29:24,902][57319] Signal inference workers to resume experience collection... (15850 times) [2024-04-28 15:29:24,903][57339] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-04-28 15:29:25,036][57339] Updated weights for policy 0, policy_version 645638 (0.0024) [2024-04-28 15:29:27,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10578264064. Throughput: 0: 55578.2. Samples: 1068595880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:27,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:29:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000645646_10578264064.pth... [2024-04-28 15:29:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000644830_10564894720.pth [2024-04-28 15:29:27,935][57339] Updated weights for policy 0, policy_version 645648 (0.0032) [2024-04-28 15:29:30,832][57339] Updated weights for policy 0, policy_version 645658 (0.0028) [2024-04-28 15:29:32,169][57108] Fps is (10 sec: 58983.1, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 10578558976. Throughput: 0: 55721.3. Samples: 1068930780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:32,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 15:29:34,050][57339] Updated weights for policy 0, policy_version 645668 (0.0029) [2024-04-28 15:29:36,695][57339] Updated weights for policy 0, policy_version 645678 (0.0029) [2024-04-28 15:29:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 10578821120. Throughput: 0: 55800.7. 
Samples: 1069110520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:37,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 15:29:39,977][57339] Updated weights for policy 0, policy_version 645688 (0.0026) [2024-04-28 15:29:42,169][57108] Fps is (10 sec: 50791.0, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 10579066880. Throughput: 0: 55628.5. Samples: 1069442540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:42,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:29:42,571][57339] Updated weights for policy 0, policy_version 645698 (0.0028) [2024-04-28 15:29:45,748][57339] Updated weights for policy 0, policy_version 645708 (0.0029) [2024-04-28 15:29:47,169][57108] Fps is (10 sec: 50789.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10579329024. Throughput: 0: 55702.6. Samples: 1069776860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:47,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 15:29:48,463][57339] Updated weights for policy 0, policy_version 645718 (0.0028) [2024-04-28 15:29:51,562][57339] Updated weights for policy 0, policy_version 645728 (0.0028) [2024-04-28 15:29:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10579623936. Throughput: 0: 55486.1. Samples: 1069931700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:52,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 15:29:54,264][57339] Updated weights for policy 0, policy_version 645738 (0.0034) [2024-04-28 15:29:57,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10579918848. Throughput: 0: 55486.0. Samples: 1070266280. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:29:57,178][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 15:29:57,436][57339] Updated weights for policy 0, policy_version 645748 (0.0032) [2024-04-28 15:30:00,318][57339] Updated weights for policy 0, policy_version 645758 (0.0030) [2024-04-28 15:30:02,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10580213760. Throughput: 0: 55540.4. Samples: 1070601800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:30:02,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 15:30:03,249][57339] Updated weights for policy 0, policy_version 645768 (0.0025) [2024-04-28 15:30:06,082][57339] Updated weights for policy 0, policy_version 645778 (0.0027) [2024-04-28 15:30:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 10580492288. Throughput: 0: 55751.7. Samples: 1070776420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:30:07,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:30:09,187][57339] Updated weights for policy 0, policy_version 645788 (0.0032) [2024-04-28 15:30:12,060][57339] Updated weights for policy 0, policy_version 645798 (0.0026) [2024-04-28 15:30:12,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10580754432. Throughput: 0: 55859.9. Samples: 1071109580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:30:12,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 15:30:14,923][57339] Updated weights for policy 0, policy_version 645808 (0.0030) [2024-04-28 15:30:17,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10581000192. Throughput: 0: 55876.5. Samples: 1071445220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:30:17,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 15:30:17,915][57339] Updated weights for policy 0, policy_version 645818 (0.0033) [2024-04-28 15:30:18,256][57319] Signal inference workers to stop experience collection... (15900 times) [2024-04-28 15:30:18,256][57319] Signal inference workers to resume experience collection... (15900 times) [2024-04-28 15:30:18,266][57339] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-04-28 15:30:18,266][57339] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-04-28 15:30:21,060][57339] Updated weights for policy 0, policy_version 645828 (0.0030) [2024-04-28 15:30:22,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.7, 300 sec: 55594.5). Total num frames: 10581278720. Throughput: 0: 55199.1. Samples: 1071594480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:30:22,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 15:30:23,940][57339] Updated weights for policy 0, policy_version 645838 (0.0029) [2024-04-28 15:30:26,960][57339] Updated weights for policy 0, policy_version 645848 (0.0026) [2024-04-28 15:30:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10581573632. Throughput: 0: 55236.4. Samples: 1071928180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 15:30:27,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 15:30:29,711][57339] Updated weights for policy 0, policy_version 645858 (0.0031) [2024-04-28 15:30:32,169][57108] Fps is (10 sec: 57343.0, 60 sec: 54886.3, 300 sec: 55650.0). Total num frames: 10581852160. Throughput: 0: 55221.3. Samples: 1072261820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:30:32,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 15:30:32,824][57339] Updated weights for policy 0, policy_version 645868 (0.0030) [2024-04-28 15:30:35,664][57339] Updated weights for policy 0, policy_version 645878 (0.0030) [2024-04-28 15:30:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10582147072. Throughput: 0: 55777.3. Samples: 1072441680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:30:37,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 15:30:38,613][57339] Updated weights for policy 0, policy_version 645888 (0.0026) [2024-04-28 15:30:41,592][57339] Updated weights for policy 0, policy_version 645898 (0.0029) [2024-04-28 15:30:42,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 10582441984. Throughput: 0: 55768.9. Samples: 1072775880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:30:42,169][57108] Avg episode reward: [(0, '0.724')] [2024-04-28 15:30:44,397][57339] Updated weights for policy 0, policy_version 645908 (0.0029) [2024-04-28 15:30:47,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10582687744. Throughput: 0: 55625.4. Samples: 1073104940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:30:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 15:30:47,560][57339] Updated weights for policy 0, policy_version 645918 (0.0031) [2024-04-28 15:30:50,319][57339] Updated weights for policy 0, policy_version 645928 (0.0023) [2024-04-28 15:30:52,169][57108] Fps is (10 sec: 49152.8, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 10582933504. Throughput: 0: 55306.4. Samples: 1073265200. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:30:52,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 15:30:53,253][57339] Updated weights for policy 0, policy_version 645938 (0.0026) [2024-04-28 15:30:56,260][57339] Updated weights for policy 0, policy_version 645948 (0.0024) [2024-04-28 15:30:57,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10583228416. Throughput: 0: 55243.1. Samples: 1073595520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:30:57,170][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 15:30:59,099][57339] Updated weights for policy 0, policy_version 645958 (0.0025) [2024-04-28 15:30:59,269][57319] Signal inference workers to stop experience collection... (15950 times) [2024-04-28 15:30:59,269][57319] Signal inference workers to resume experience collection... (15950 times) [2024-04-28 15:30:59,280][57339] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-04-28 15:30:59,297][57339] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-04-28 15:31:02,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10583523328. Throughput: 0: 55360.0. Samples: 1073936420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:02,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 15:31:02,249][57339] Updated weights for policy 0, policy_version 645968 (0.0028) [2024-04-28 15:31:04,969][57339] Updated weights for policy 0, policy_version 645978 (0.0036) [2024-04-28 15:31:07,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10583818240. Throughput: 0: 55829.2. Samples: 1074106800. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:07,170][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 15:31:08,136][57339] Updated weights for policy 0, policy_version 645988 (0.0028) [2024-04-28 15:31:10,736][57339] Updated weights for policy 0, policy_version 645998 (0.0026) [2024-04-28 15:31:12,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10584096768. Throughput: 0: 55886.9. Samples: 1074443100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:12,170][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 15:31:14,431][57339] Updated weights for policy 0, policy_version 646008 (0.0026) [2024-04-28 15:31:16,644][57339] Updated weights for policy 0, policy_version 646018 (0.0029) [2024-04-28 15:31:17,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56797.9, 300 sec: 55927.8). Total num frames: 10584408064. Throughput: 0: 55839.7. Samples: 1074774600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:17,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 15:31:20,215][57339] Updated weights for policy 0, policy_version 646028 (0.0029) [2024-04-28 15:31:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10584637440. Throughput: 0: 55534.2. Samples: 1074940720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:22,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 15:31:22,626][57339] Updated weights for policy 0, policy_version 646038 (0.0025) [2024-04-28 15:31:25,934][57339] Updated weights for policy 0, policy_version 646048 (0.0029) [2024-04-28 15:31:27,169][57108] Fps is (10 sec: 47514.0, 60 sec: 55159.5, 300 sec: 55594.6). Total num frames: 10584883200. Throughput: 0: 55500.6. Samples: 1075273400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:27,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 15:31:27,242][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000646051_10584899584.pth... [2024-04-28 15:31:27,291][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000645238_10571579392.pth [2024-04-28 15:31:28,528][57339] Updated weights for policy 0, policy_version 646058 (0.0031) [2024-04-28 15:31:31,783][57339] Updated weights for policy 0, policy_version 646068 (0.0028) [2024-04-28 15:31:32,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.4, 300 sec: 55538.9). Total num frames: 10585178112. Throughput: 0: 55650.7. Samples: 1075609240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:32,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 15:31:34,289][57339] Updated weights for policy 0, policy_version 646078 (0.0037) [2024-04-28 15:31:37,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10585473024. Throughput: 0: 55560.0. Samples: 1075765400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:37,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 15:31:37,610][57339] Updated weights for policy 0, policy_version 646088 (0.0029) [2024-04-28 15:31:40,153][57339] Updated weights for policy 0, policy_version 646098 (0.0027) [2024-04-28 15:31:42,169][57108] Fps is (10 sec: 58983.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10585767936. Throughput: 0: 55629.8. Samples: 1076098860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:42,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:31:43,562][57339] Updated weights for policy 0, policy_version 646108 (0.0027) [2024-04-28 15:31:44,493][57319] Signal inference workers to stop experience collection... 
(16000 times) [2024-04-28 15:31:44,497][57319] Signal inference workers to resume experience collection... (16000 times) [2024-04-28 15:31:44,520][57339] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-04-28 15:31:44,520][57339] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-04-28 15:31:45,924][57339] Updated weights for policy 0, policy_version 646118 (0.0026) [2024-04-28 15:31:47,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10586046464. Throughput: 0: 55545.7. Samples: 1076435980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:47,178][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 15:31:49,477][57339] Updated weights for policy 0, policy_version 646128 (0.0034) [2024-04-28 15:31:51,831][57339] Updated weights for policy 0, policy_version 646138 (0.0028) [2024-04-28 15:31:52,169][57108] Fps is (10 sec: 58983.1, 60 sec: 57070.9, 300 sec: 55872.2). Total num frames: 10586357760. Throughput: 0: 55808.1. Samples: 1076618160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:52,178][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 15:31:55,347][57339] Updated weights for policy 0, policy_version 646148 (0.0025) [2024-04-28 15:31:57,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10586587136. Throughput: 0: 55863.4. Samples: 1076956940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:31:57,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 15:31:57,742][57339] Updated weights for policy 0, policy_version 646158 (0.0031) [2024-04-28 15:32:01,100][57339] Updated weights for policy 0, policy_version 646168 (0.0030) [2024-04-28 15:32:02,169][57108] Fps is (10 sec: 49152.2, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10586849280. Throughput: 0: 55976.6. Samples: 1077293540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:32:02,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 15:32:03,500][57339] Updated weights for policy 0, policy_version 646178 (0.0027) [2024-04-28 15:32:07,006][57339] Updated weights for policy 0, policy_version 646188 (0.0028) [2024-04-28 15:32:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10587144192. Throughput: 0: 55698.4. Samples: 1077447140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-04-28 15:32:07,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 15:32:09,369][57339] Updated weights for policy 0, policy_version 646198 (0.0025) [2024-04-28 15:32:12,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10587439104. Throughput: 0: 55763.0. Samples: 1077782740. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:12,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 15:32:13,013][57339] Updated weights for policy 0, policy_version 646208 (0.0024) [2024-04-28 15:32:15,278][57339] Updated weights for policy 0, policy_version 646218 (0.0027) [2024-04-28 15:32:17,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10587717632. Throughput: 0: 55840.3. Samples: 1078122040. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:17,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 15:32:18,713][57339] Updated weights for policy 0, policy_version 646228 (0.0032) [2024-04-28 15:32:21,070][57339] Updated weights for policy 0, policy_version 646238 (0.0026) [2024-04-28 15:32:22,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.9, 300 sec: 55761.1). Total num frames: 10588012544. Throughput: 0: 56311.5. Samples: 1078299420. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:22,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 15:32:24,737][57339] Updated weights for policy 0, policy_version 646248 (0.0026) [2024-04-28 15:32:26,606][57319] Signal inference workers to stop experience collection... (16050 times) [2024-04-28 15:32:26,607][57319] Signal inference workers to resume experience collection... (16050 times) [2024-04-28 15:32:26,618][57339] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-04-28 15:32:26,618][57339] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-04-28 15:32:26,859][57339] Updated weights for policy 0, policy_version 646258 (0.0031) [2024-04-28 15:32:27,169][57108] Fps is (10 sec: 58981.0, 60 sec: 57070.6, 300 sec: 55872.2). Total num frames: 10588307456. Throughput: 0: 56352.6. Samples: 1078634740. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:27,170][57108] Avg episode reward: [(0, '0.509')] [2024-04-28 15:32:30,593][57339] Updated weights for policy 0, policy_version 646268 (0.0031) [2024-04-28 15:32:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56525.1, 300 sec: 55761.1). Total num frames: 10588569600. Throughput: 0: 56297.9. Samples: 1078969380. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:32,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 15:32:32,676][57339] Updated weights for policy 0, policy_version 646278 (0.0025) [2024-04-28 15:32:36,313][57339] Updated weights for policy 0, policy_version 646288 (0.0031) [2024-04-28 15:32:37,169][57108] Fps is (10 sec: 50792.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10588815360. Throughput: 0: 56056.5. Samples: 1079140700. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:37,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:32:38,615][57339] Updated weights for policy 0, policy_version 646298 (0.0030) [2024-04-28 15:32:42,044][57339] Updated weights for policy 0, policy_version 646308 (0.0031) [2024-04-28 15:32:42,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10589110272. Throughput: 0: 55914.9. Samples: 1079473120. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:42,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 15:32:44,367][57339] Updated weights for policy 0, policy_version 646318 (0.0023) [2024-04-28 15:32:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10589388800. Throughput: 0: 55859.9. Samples: 1079807240. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:47,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 15:32:48,486][57339] Updated weights for policy 0, policy_version 646328 (0.0026) [2024-04-28 15:32:50,091][57339] Updated weights for policy 0, policy_version 646338 (0.0036) [2024-04-28 15:32:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 10589667328. Throughput: 0: 56034.5. Samples: 1079968700. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:52,170][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 15:32:54,319][57339] Updated weights for policy 0, policy_version 646348 (0.0029) [2024-04-28 15:32:55,992][57339] Updated weights for policy 0, policy_version 646358 (0.0029) [2024-04-28 15:32:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.6, 300 sec: 55761.2). Total num frames: 10589962240. Throughput: 0: 55981.4. Samples: 1080301900. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:32:57,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 15:33:00,059][57339] Updated weights for policy 0, policy_version 646368 (0.0030) [2024-04-28 15:33:01,844][57339] Updated weights for policy 0, policy_version 646378 (0.0030) [2024-04-28 15:33:02,169][57108] Fps is (10 sec: 60621.6, 60 sec: 57070.9, 300 sec: 55927.8). Total num frames: 10590273536. Throughput: 0: 55945.8. Samples: 1080639600. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:33:02,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 15:33:06,036][57339] Updated weights for policy 0, policy_version 646388 (0.0029) [2024-04-28 15:33:07,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10590519296. Throughput: 0: 55918.7. Samples: 1080815760. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:33:07,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 15:33:07,618][57339] Updated weights for policy 0, policy_version 646398 (0.0030) [2024-04-28 15:33:10,243][57319] Signal inference workers to stop experience collection... (16100 times) [2024-04-28 15:33:10,244][57319] Signal inference workers to resume experience collection... (16100 times) [2024-04-28 15:33:10,266][57339] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-04-28 15:33:10,266][57339] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-04-28 15:33:11,885][57339] Updated weights for policy 0, policy_version 646408 (0.0026) [2024-04-28 15:33:12,169][57108] Fps is (10 sec: 50790.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10590781440. Throughput: 0: 55877.8. Samples: 1081149220. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:33:12,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 15:33:13,379][57339] Updated weights for policy 0, policy_version 646418 (0.0033) [2024-04-28 15:33:17,169][57108] Fps is (10 sec: 52427.8, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 10591043584. Throughput: 0: 55954.4. Samples: 1081487340. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:33:17,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 15:33:17,742][57339] Updated weights for policy 0, policy_version 646428 (0.0025) [2024-04-28 15:33:19,307][57339] Updated weights for policy 0, policy_version 646438 (0.0030) [2024-04-28 15:33:22,169][57108] Fps is (10 sec: 55704.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10591338496. Throughput: 0: 55486.8. Samples: 1081637620. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:33:22,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 15:33:23,500][57339] Updated weights for policy 0, policy_version 646448 (0.0028) [2024-04-28 15:33:25,116][57339] Updated weights for policy 0, policy_version 646458 (0.0026) [2024-04-28 15:33:27,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10591633408. Throughput: 0: 55663.6. Samples: 1081977980. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:33:27,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 15:33:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000646462_10591633408.pth... [2024-04-28 15:33:27,234][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000645646_10578264064.pth [2024-04-28 15:33:29,389][57339] Updated weights for policy 0, policy_version 646468 (0.0023) [2024-04-28 15:33:30,947][57339] Updated weights for policy 0, policy_version 646478 (0.0027) [2024-04-28 15:33:32,169][57108] Fps is (10 sec: 57345.2, 60 sec: 55705.6, 300 sec: 55816.7). 
Total num frames: 10591911936. Throughput: 0: 55658.7. Samples: 1082311880. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:33:32,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 15:33:35,418][57339] Updated weights for policy 0, policy_version 646488 (0.0029) [2024-04-28 15:33:36,790][57339] Updated weights for policy 0, policy_version 646498 (0.0031) [2024-04-28 15:33:37,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56797.8, 300 sec: 56038.8). Total num frames: 10592223232. Throughput: 0: 56092.5. Samples: 1082492860. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:33:37,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 15:33:41,208][57339] Updated weights for policy 0, policy_version 646508 (0.0033) [2024-04-28 15:33:42,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10592485376. Throughput: 0: 56140.0. Samples: 1082828200. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-04-28 15:33:42,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 15:33:42,694][57339] Updated weights for policy 0, policy_version 646518 (0.0030) [2024-04-28 15:33:46,997][57339] Updated weights for policy 0, policy_version 646528 (0.0032) [2024-04-28 15:33:47,011][57319] Signal inference workers to stop experience collection... (16150 times) [2024-04-28 15:33:47,011][57319] Signal inference workers to resume experience collection... (16150 times) [2024-04-28 15:33:47,024][57339] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-04-28 15:33:47,024][57339] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-04-28 15:33:47,169][57108] Fps is (10 sec: 50791.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10592731136. Throughput: 0: 56180.1. Samples: 1083167700. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:33:47,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 15:33:48,593][57339] Updated weights for policy 0, policy_version 646538 (0.0033) [2024-04-28 15:33:52,169][57108] Fps is (10 sec: 49152.3, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 10592976896. Throughput: 0: 55600.0. Samples: 1083317760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:33:52,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 15:33:52,877][57339] Updated weights for policy 0, policy_version 646548 (0.0031) [2024-04-28 15:33:54,464][57339] Updated weights for policy 0, policy_version 646558 (0.0025) [2024-04-28 15:33:57,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10593288192. Throughput: 0: 55659.3. Samples: 1083653900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:33:57,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 15:33:58,768][57339] Updated weights for policy 0, policy_version 646568 (0.0030) [2024-04-28 15:34:00,941][57339] Updated weights for policy 0, policy_version 646578 (0.0029) [2024-04-28 15:34:02,169][57108] Fps is (10 sec: 60621.0, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 10593583104. Throughput: 0: 55682.0. Samples: 1083993020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:02,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 15:34:04,453][57339] Updated weights for policy 0, policy_version 646588 (0.0028) [2024-04-28 15:34:06,672][57339] Updated weights for policy 0, policy_version 646598 (0.0025) [2024-04-28 15:34:07,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10593861632. Throughput: 0: 56181.5. Samples: 1084165780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:07,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 15:34:10,213][57339] Updated weights for policy 0, policy_version 646608 (0.0026) [2024-04-28 15:34:12,169][57108] Fps is (10 sec: 58981.6, 60 sec: 56524.6, 300 sec: 55927.7). Total num frames: 10594172928. Throughput: 0: 56006.6. Samples: 1084498280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:12,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 15:34:12,489][57339] Updated weights for policy 0, policy_version 646618 (0.0026) [2024-04-28 15:34:16,211][57339] Updated weights for policy 0, policy_version 646628 (0.0028) [2024-04-28 15:34:17,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56798.0, 300 sec: 55872.2). Total num frames: 10594451456. Throughput: 0: 56046.6. Samples: 1084833980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:17,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 15:34:18,345][57339] Updated weights for policy 0, policy_version 646638 (0.0032) [2024-04-28 15:34:22,061][57339] Updated weights for policy 0, policy_version 646648 (0.0029) [2024-04-28 15:34:22,169][57108] Fps is (10 sec: 50791.0, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 10594680832. Throughput: 0: 55637.4. Samples: 1084996540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:22,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 15:34:24,037][57339] Updated weights for policy 0, policy_version 646658 (0.0029) [2024-04-28 15:34:27,169][57108] Fps is (10 sec: 49151.6, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10594942976. Throughput: 0: 55692.8. Samples: 1085334380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:27,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 15:34:27,931][57339] Updated weights for policy 0, policy_version 646668 (0.0027) [2024-04-28 15:34:29,712][57339] Updated weights for policy 0, policy_version 646678 (0.0024) [2024-04-28 15:34:32,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10595237888. Throughput: 0: 55513.6. Samples: 1085665820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:32,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 15:34:33,813][57339] Updated weights for policy 0, policy_version 646688 (0.0031) [2024-04-28 15:34:34,093][57319] Signal inference workers to stop experience collection... (16200 times) [2024-04-28 15:34:34,143][57339] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-04-28 15:34:34,148][57319] Signal inference workers to resume experience collection... (16200 times) [2024-04-28 15:34:34,152][57339] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-04-28 15:34:35,756][57339] Updated weights for policy 0, policy_version 646698 (0.0029) [2024-04-28 15:34:37,169][57108] Fps is (10 sec: 60621.9, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10595549184. Throughput: 0: 55818.7. Samples: 1085829600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:37,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 15:34:39,582][57339] Updated weights for policy 0, policy_version 646708 (0.0031) [2024-04-28 15:34:41,959][57339] Updated weights for policy 0, policy_version 646718 (0.0027) [2024-04-28 15:34:42,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 10595827712. Throughput: 0: 55670.6. Samples: 1086159080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:42,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 15:34:45,525][57339] Updated weights for policy 0, policy_version 646728 (0.0027) [2024-04-28 15:34:47,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 10596122624. Throughput: 0: 55525.2. Samples: 1086491660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:47,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 15:34:48,224][57339] Updated weights for policy 0, policy_version 646738 (0.0030) [2024-04-28 15:34:51,305][57339] Updated weights for policy 0, policy_version 646748 (0.0035) [2024-04-28 15:34:52,169][57108] Fps is (10 sec: 55706.5, 60 sec: 56797.9, 300 sec: 55816.7). Total num frames: 10596384768. Throughput: 0: 55719.7. Samples: 1086673160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:52,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 15:34:53,967][57339] Updated weights for policy 0, policy_version 646758 (0.0036) [2024-04-28 15:34:57,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10596646912. Throughput: 0: 55830.8. Samples: 1087010660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:34:57,170][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 15:34:57,171][57339] Updated weights for policy 0, policy_version 646768 (0.0027) [2024-04-28 15:34:59,710][57339] Updated weights for policy 0, policy_version 646778 (0.0025) [2024-04-28 15:35:02,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 10596925440. Throughput: 0: 55965.2. Samples: 1087352420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:35:02,170][57108] Avg episode reward: [(0, '0.473')] [2024-04-28 15:35:02,933][57339] Updated weights for policy 0, policy_version 646788 (0.0024) [2024-04-28 15:35:05,645][57339] Updated weights for policy 0, policy_version 646798 (0.0028) [2024-04-28 15:35:07,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10597171200. Throughput: 0: 55716.8. Samples: 1087503800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:35:07,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 15:35:08,854][57339] Updated weights for policy 0, policy_version 646808 (0.0029) [2024-04-28 15:35:11,360][57339] Updated weights for policy 0, policy_version 646818 (0.0031) [2024-04-28 15:35:12,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 10597482496. Throughput: 0: 55533.0. Samples: 1087833360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:35:12,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 15:35:14,746][57339] Updated weights for policy 0, policy_version 646828 (0.0030) [2024-04-28 15:35:17,063][57339] Updated weights for policy 0, policy_version 646838 (0.0023) [2024-04-28 15:35:17,169][57108] Fps is (10 sec: 62259.2, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 10597793792. Throughput: 0: 55656.0. Samples: 1088170340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:35:17,170][57108] Avg episode reward: [(0, '0.500')] [2024-04-28 15:35:20,630][57339] Updated weights for policy 0, policy_version 646848 (0.0026) [2024-04-28 15:35:22,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 10598072320. Throughput: 0: 55994.1. Samples: 1088349340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:35:22,170][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 15:35:22,915][57339] Updated weights for policy 0, policy_version 646858 (0.0030) [2024-04-28 15:35:26,585][57339] Updated weights for policy 0, policy_version 646868 (0.0027) [2024-04-28 15:35:26,711][57319] Signal inference workers to stop experience collection... (16250 times) [2024-04-28 15:35:26,711][57319] Signal inference workers to resume experience collection... (16250 times) [2024-04-28 15:35:26,729][57339] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-04-28 15:35:26,730][57339] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-04-28 15:35:27,169][57108] Fps is (10 sec: 55706.5, 60 sec: 56798.1, 300 sec: 55927.8). Total num frames: 10598350848. Throughput: 0: 56170.5. Samples: 1088686740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:35:27,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 15:35:27,278][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000646873_10598367232.pth... [2024-04-28 15:35:27,327][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000646051_10584899584.pth [2024-04-28 15:35:28,830][57339] Updated weights for policy 0, policy_version 646878 (0.0032) [2024-04-28 15:35:32,169][57108] Fps is (10 sec: 50790.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10598580224. Throughput: 0: 56115.6. Samples: 1089016860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:35:32,170][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 15:35:32,420][57339] Updated weights for policy 0, policy_version 646888 (0.0032) [2024-04-28 15:35:35,223][57339] Updated weights for policy 0, policy_version 646898 (0.0026) [2024-04-28 15:35:37,174][57108] Fps is (10 sec: 52399.6, 60 sec: 55427.4, 300 sec: 55704.6). Total num frames: 10598875136. Throughput: 0: 55619.8. 
Samples: 1089176360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:35:37,175][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 15:35:38,220][57339] Updated weights for policy 0, policy_version 646908 (0.0035) [2024-04-28 15:35:41,280][57339] Updated weights for policy 0, policy_version 646918 (0.0032) [2024-04-28 15:35:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10599137280. Throughput: 0: 55536.3. Samples: 1089509800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:35:42,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 15:35:44,117][57339] Updated weights for policy 0, policy_version 646928 (0.0029) [2024-04-28 15:35:46,991][57339] Updated weights for policy 0, policy_version 646938 (0.0031) [2024-04-28 15:35:47,169][57108] Fps is (10 sec: 55736.0, 60 sec: 55159.5, 300 sec: 55927.7). Total num frames: 10599432192. Throughput: 0: 55462.4. Samples: 1089848220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:35:47,170][57108] Avg episode reward: [(0, '0.496')] [2024-04-28 15:35:49,953][57339] Updated weights for policy 0, policy_version 646948 (0.0031) [2024-04-28 15:35:52,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 10599727104. Throughput: 0: 55895.1. Samples: 1090019080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:35:52,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 15:35:53,038][57339] Updated weights for policy 0, policy_version 646958 (0.0027) [2024-04-28 15:35:55,826][57339] Updated weights for policy 0, policy_version 646968 (0.0031) [2024-04-28 15:35:57,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10600022016. Throughput: 0: 56055.2. Samples: 1090355840. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:35:57,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 15:35:58,934][57339] Updated weights for policy 0, policy_version 646978 (0.0032) [2024-04-28 15:36:01,782][57339] Updated weights for policy 0, policy_version 646988 (0.0036) [2024-04-28 15:36:02,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 10600300544. Throughput: 0: 56054.3. Samples: 1090692780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:02,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 15:36:04,770][57339] Updated weights for policy 0, policy_version 646998 (0.0028) [2024-04-28 15:36:07,169][57108] Fps is (10 sec: 50789.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10600529920. Throughput: 0: 55615.9. Samples: 1090852060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:07,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:36:07,451][57319] Signal inference workers to stop experience collection... (16300 times) [2024-04-28 15:36:07,451][57319] Signal inference workers to resume experience collection... (16300 times) [2024-04-28 15:36:07,473][57339] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-04-28 15:36:07,473][57339] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-04-28 15:36:07,715][57339] Updated weights for policy 0, policy_version 647008 (0.0025) [2024-04-28 15:36:10,449][57339] Updated weights for policy 0, policy_version 647018 (0.0027) [2024-04-28 15:36:12,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10600824832. Throughput: 0: 55505.6. Samples: 1091184500. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:12,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 15:36:13,497][57339] Updated weights for policy 0, policy_version 647028 (0.0027) [2024-04-28 15:36:16,391][57339] Updated weights for policy 0, policy_version 647038 (0.0030) [2024-04-28 15:36:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54886.5, 300 sec: 55761.2). Total num frames: 10601086976. Throughput: 0: 55680.5. Samples: 1091522480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:17,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:36:19,229][57339] Updated weights for policy 0, policy_version 647048 (0.0027) [2024-04-28 15:36:22,169][57108] Fps is (10 sec: 54066.7, 60 sec: 54886.3, 300 sec: 55872.2). Total num frames: 10601365504. Throughput: 0: 55717.7. Samples: 1091683360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:22,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 15:36:22,422][57339] Updated weights for policy 0, policy_version 647058 (0.0030) [2024-04-28 15:36:25,138][57339] Updated weights for policy 0, policy_version 647068 (0.0030) [2024-04-28 15:36:27,169][57108] Fps is (10 sec: 58981.3, 60 sec: 55432.3, 300 sec: 55927.8). Total num frames: 10601676800. Throughput: 0: 55667.4. Samples: 1092014840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:27,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 15:36:28,313][57339] Updated weights for policy 0, policy_version 647078 (0.0029) [2024-04-28 15:36:30,932][57339] Updated weights for policy 0, policy_version 647088 (0.0029) [2024-04-28 15:36:32,169][57108] Fps is (10 sec: 60621.3, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10601971712. Throughput: 0: 55655.1. Samples: 1092352700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:32,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 15:36:34,209][57339] Updated weights for policy 0, policy_version 647098 (0.0030) [2024-04-28 15:36:36,718][57339] Updated weights for policy 0, policy_version 647108 (0.0028) [2024-04-28 15:36:37,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56256.8, 300 sec: 55872.2). Total num frames: 10602250240. Throughput: 0: 55832.5. Samples: 1092531540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:37,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 15:36:40,139][57339] Updated weights for policy 0, policy_version 647118 (0.0033) [2024-04-28 15:36:42,169][57108] Fps is (10 sec: 50789.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10602479616. Throughput: 0: 55665.6. Samples: 1092860800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:42,170][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 15:36:42,744][57339] Updated weights for policy 0, policy_version 647128 (0.0027) [2024-04-28 15:36:46,081][57339] Updated weights for policy 0, policy_version 647138 (0.0033) [2024-04-28 15:36:47,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10602758144. Throughput: 0: 55490.2. Samples: 1093189840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:47,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:36:48,760][57339] Updated weights for policy 0, policy_version 647148 (0.0034) [2024-04-28 15:36:51,943][57339] Updated weights for policy 0, policy_version 647158 (0.0026) [2024-04-28 15:36:52,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10603036672. Throughput: 0: 55249.4. Samples: 1093338280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:52,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 15:36:54,086][57319] Signal inference workers to stop experience collection... (16350 times) [2024-04-28 15:36:54,087][57319] Signal inference workers to resume experience collection... (16350 times) [2024-04-28 15:36:54,102][57339] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-04-28 15:36:54,102][57339] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-04-28 15:36:54,594][57339] Updated weights for policy 0, policy_version 647168 (0.0030) [2024-04-28 15:36:57,169][57108] Fps is (10 sec: 55705.0, 60 sec: 54886.3, 300 sec: 55816.6). Total num frames: 10603315200. Throughput: 0: 55341.7. Samples: 1093674880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 15:36:57,170][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 15:36:57,949][57339] Updated weights for policy 0, policy_version 647178 (0.0029) [2024-04-28 15:37:00,539][57339] Updated weights for policy 0, policy_version 647188 (0.0024) [2024-04-28 15:37:02,169][57108] Fps is (10 sec: 57342.4, 60 sec: 55159.2, 300 sec: 55816.6). Total num frames: 10603610112. Throughput: 0: 55165.4. Samples: 1094004940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:02,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 15:37:03,942][57339] Updated weights for policy 0, policy_version 647198 (0.0021) [2024-04-28 15:37:06,381][57339] Updated weights for policy 0, policy_version 647208 (0.0029) [2024-04-28 15:37:07,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10603905024. Throughput: 0: 55396.9. Samples: 1094176220. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:07,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 15:37:09,879][57339] Updated weights for policy 0, policy_version 647218 (0.0028) [2024-04-28 15:37:12,169][57108] Fps is (10 sec: 54068.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10604150784. Throughput: 0: 55534.4. Samples: 1094513880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:12,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 15:37:12,344][57339] Updated weights for policy 0, policy_version 647228 (0.0036) [2024-04-28 15:37:15,868][57339] Updated weights for policy 0, policy_version 647238 (0.0028) [2024-04-28 15:37:17,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10604429312. Throughput: 0: 55494.7. Samples: 1094849960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:17,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 15:37:18,082][57339] Updated weights for policy 0, policy_version 647248 (0.0027) [2024-04-28 15:37:21,593][57339] Updated weights for policy 0, policy_version 647258 (0.0028) [2024-04-28 15:37:22,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10604691456. Throughput: 0: 55079.0. Samples: 1095010100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:22,170][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 15:37:23,980][57339] Updated weights for policy 0, policy_version 647268 (0.0032) [2024-04-28 15:37:27,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54886.6, 300 sec: 55594.5). Total num frames: 10604969984. Throughput: 0: 55163.3. Samples: 1095343140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:27,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 15:37:27,274][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000647277_10604986368.pth... 
[2024-04-28 15:37:27,317][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000646462_10591633408.pth [2024-04-28 15:37:27,438][57339] Updated weights for policy 0, policy_version 647278 (0.0024) [2024-04-28 15:37:29,850][57339] Updated weights for policy 0, policy_version 647288 (0.0025) [2024-04-28 15:37:32,169][57108] Fps is (10 sec: 57344.4, 60 sec: 54886.4, 300 sec: 55761.1). Total num frames: 10605264896. Throughput: 0: 55335.9. Samples: 1095679960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:32,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 15:37:33,199][57339] Updated weights for policy 0, policy_version 647298 (0.0025) [2024-04-28 15:37:35,692][57319] Signal inference workers to stop experience collection... (16400 times) [2024-04-28 15:37:35,726][57339] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-04-28 15:37:35,781][57319] Signal inference workers to resume experience collection... (16400 times) [2024-04-28 15:37:35,781][57339] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-04-28 15:37:35,783][57339] Updated weights for policy 0, policy_version 647308 (0.0023) [2024-04-28 15:37:37,169][57108] Fps is (10 sec: 60620.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10605576192. Throughput: 0: 55885.4. Samples: 1095853120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:37,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 15:37:39,177][57339] Updated weights for policy 0, policy_version 647318 (0.0029) [2024-04-28 15:37:41,626][57339] Updated weights for policy 0, policy_version 647328 (0.0036) [2024-04-28 15:37:42,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10605854720. Throughput: 0: 55766.8. Samples: 1096184380. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:42,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 15:37:44,848][57339] Updated weights for policy 0, policy_version 647338 (0.0028) [2024-04-28 15:37:47,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10606116864. Throughput: 0: 55878.0. Samples: 1096519440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:47,170][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 15:37:47,423][57339] Updated weights for policy 0, policy_version 647348 (0.0027) [2024-04-28 15:37:50,781][57339] Updated weights for policy 0, policy_version 647358 (0.0035) [2024-04-28 15:37:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10606395392. Throughput: 0: 55731.2. Samples: 1096684120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:52,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 15:37:53,272][57339] Updated weights for policy 0, policy_version 647368 (0.0027) [2024-04-28 15:37:56,843][57339] Updated weights for policy 0, policy_version 647378 (0.0029) [2024-04-28 15:37:57,169][57108] Fps is (10 sec: 52429.8, 60 sec: 55432.7, 300 sec: 55483.4). Total num frames: 10606641152. Throughput: 0: 55826.3. Samples: 1097026060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:37:57,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 15:37:59,096][57339] Updated weights for policy 0, policy_version 647388 (0.0038) [2024-04-28 15:38:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.8, 300 sec: 55650.0). Total num frames: 10606936064. Throughput: 0: 55714.2. Samples: 1097357100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:38:02,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 15:38:02,645][57339] Updated weights for policy 0, policy_version 647398 (0.0028) [2024-04-28 15:38:04,950][57339] Updated weights for policy 0, policy_version 647408 (0.0033) [2024-04-28 15:38:07,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10607214592. Throughput: 0: 55785.6. Samples: 1097520440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:38:07,170][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 15:38:08,841][57339] Updated weights for policy 0, policy_version 647418 (0.0030) [2024-04-28 15:38:10,830][57339] Updated weights for policy 0, policy_version 647428 (0.0031) [2024-04-28 15:38:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10607509504. Throughput: 0: 55796.0. Samples: 1097853960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:38:12,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 15:38:14,577][57339] Updated weights for policy 0, policy_version 647438 (0.0029) [2024-04-28 15:38:16,778][57339] Updated weights for policy 0, policy_version 647448 (0.0028) [2024-04-28 15:38:17,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10607804416. Throughput: 0: 55674.4. Samples: 1098185300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:38:17,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 15:38:20,366][57339] Updated weights for policy 0, policy_version 647458 (0.0033) [2024-04-28 15:38:22,150][57319] Signal inference workers to stop experience collection... (16450 times) [2024-04-28 15:38:22,151][57319] Signal inference workers to resume experience collection... (16450 times) [2024-04-28 15:38:22,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 10608066560. Throughput: 0: 55857.8. 
Samples: 1098366720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:38:22,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 15:38:22,191][57339] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-04-28 15:38:22,191][57339] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-04-28 15:38:22,659][57339] Updated weights for policy 0, policy_version 647468 (0.0026) [2024-04-28 15:38:26,269][57339] Updated weights for policy 0, policy_version 647478 (0.0028) [2024-04-28 15:38:27,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10608328704. Throughput: 0: 55864.8. Samples: 1098698300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:38:27,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 15:38:28,506][57339] Updated weights for policy 0, policy_version 647488 (0.0023) [2024-04-28 15:38:32,075][57339] Updated weights for policy 0, policy_version 647498 (0.0028) [2024-04-28 15:38:32,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10608607232. Throughput: 0: 55758.7. Samples: 1099028580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-04-28 15:38:32,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 15:38:34,343][57339] Updated weights for policy 0, policy_version 647508 (0.0030) [2024-04-28 15:38:37,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 10608869376. Throughput: 0: 55538.7. Samples: 1099183360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:38:37,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 15:38:38,094][57339] Updated weights for policy 0, policy_version 647518 (0.0028) [2024-04-28 15:38:40,206][57339] Updated weights for policy 0, policy_version 647528 (0.0024) [2024-04-28 15:38:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10609180672. 
Throughput: 0: 55411.8. Samples: 1099519600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:38:42,170][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 15:38:43,993][57339] Updated weights for policy 0, policy_version 647538 (0.0026) [2024-04-28 15:38:46,039][57339] Updated weights for policy 0, policy_version 647548 (0.0029) [2024-04-28 15:38:47,169][57108] Fps is (10 sec: 58981.4, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10609459200. Throughput: 0: 55582.1. Samples: 1099858300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:38:47,170][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 15:38:49,854][57339] Updated weights for policy 0, policy_version 647558 (0.0027) [2024-04-28 15:38:51,923][57339] Updated weights for policy 0, policy_version 647568 (0.0027) [2024-04-28 15:38:52,169][57108] Fps is (10 sec: 58983.3, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10609770496. Throughput: 0: 55911.5. Samples: 1100036460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:38:52,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 15:38:55,788][57339] Updated weights for policy 0, policy_version 647578 (0.0027) [2024-04-28 15:38:57,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10609999872. Throughput: 0: 55979.5. Samples: 1100373040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:38:57,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 15:38:57,794][57339] Updated weights for policy 0, policy_version 647588 (0.0032) [2024-04-28 15:39:01,675][57339] Updated weights for policy 0, policy_version 647598 (0.0026) [2024-04-28 15:39:02,169][57108] Fps is (10 sec: 50790.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10610278400. Throughput: 0: 56112.1. Samples: 1100710340. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:02,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 15:39:03,637][57339] Updated weights for policy 0, policy_version 647608 (0.0032) [2024-04-28 15:39:07,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10610556928. Throughput: 0: 55645.2. Samples: 1100870760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:07,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 15:39:07,438][57339] Updated weights for policy 0, policy_version 647618 (0.0026) [2024-04-28 15:39:09,482][57339] Updated weights for policy 0, policy_version 647628 (0.0032) [2024-04-28 15:39:12,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10610851840. Throughput: 0: 55659.7. Samples: 1101202980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:12,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 15:39:13,360][57339] Updated weights for policy 0, policy_version 647638 (0.0026) [2024-04-28 15:39:14,894][57319] Signal inference workers to stop experience collection... (16500 times) [2024-04-28 15:39:14,907][57339] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-04-28 15:39:14,957][57319] Signal inference workers to resume experience collection... (16500 times) [2024-04-28 15:39:14,958][57339] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-04-28 15:39:15,517][57339] Updated weights for policy 0, policy_version 647648 (0.0027) [2024-04-28 15:39:17,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10611130368. Throughput: 0: 55633.0. Samples: 1101532060. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:17,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 15:39:19,247][57339] Updated weights for policy 0, policy_version 647658 (0.0033) [2024-04-28 15:39:21,295][57339] Updated weights for policy 0, policy_version 647668 (0.0027) [2024-04-28 15:39:22,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10611408896. Throughput: 0: 56158.3. Samples: 1101710480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:22,169][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 15:39:25,067][57339] Updated weights for policy 0, policy_version 647678 (0.0024) [2024-04-28 15:39:27,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 10611703808. Throughput: 0: 56123.4. Samples: 1102045140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:27,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 15:39:27,270][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000647688_10611720192.pth... [2024-04-28 15:39:27,279][57339] Updated weights for policy 0, policy_version 647688 (0.0026) [2024-04-28 15:39:27,322][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000646873_10598367232.pth [2024-04-28 15:39:30,845][57339] Updated weights for policy 0, policy_version 647698 (0.0029) [2024-04-28 15:39:32,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10611949568. Throughput: 0: 56057.1. Samples: 1102380860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:32,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 15:39:33,097][57339] Updated weights for policy 0, policy_version 647708 (0.0033) [2024-04-28 15:39:36,685][57339] Updated weights for policy 0, policy_version 647718 (0.0032) [2024-04-28 15:39:37,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55978.6, 300 sec: 55594.5). 
Total num frames: 10612228096. Throughput: 0: 55738.1. Samples: 1102544680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:37,170][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 15:39:38,810][57339] Updated weights for policy 0, policy_version 647728 (0.0028) [2024-04-28 15:39:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 10612506624. Throughput: 0: 55659.6. Samples: 1102877720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:42,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 15:39:42,608][57339] Updated weights for policy 0, policy_version 647738 (0.0027) [2024-04-28 15:39:45,073][57339] Updated weights for policy 0, policy_version 647748 (0.0028) [2024-04-28 15:39:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10612801536. Throughput: 0: 55471.8. Samples: 1103206580. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:47,170][57108] Avg episode reward: [(0, '0.718')] [2024-04-28 15:39:48,409][57339] Updated weights for policy 0, policy_version 647758 (0.0026) [2024-04-28 15:39:50,779][57339] Updated weights for policy 0, policy_version 647768 (0.0024) [2024-04-28 15:39:52,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10613080064. Throughput: 0: 55930.4. Samples: 1103387620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:52,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 15:39:54,107][57339] Updated weights for policy 0, policy_version 647778 (0.0034) [2024-04-28 15:39:56,632][57339] Updated weights for policy 0, policy_version 647788 (0.0025) [2024-04-28 15:39:57,169][57108] Fps is (10 sec: 57345.1, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 10613374976. Throughput: 0: 56011.2. Samples: 1103723480. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:39:57,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 15:39:59,892][57339] Updated weights for policy 0, policy_version 647798 (0.0028) [2024-04-28 15:40:02,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10613653504. Throughput: 0: 56028.0. Samples: 1104053320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:40:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 15:40:02,556][57339] Updated weights for policy 0, policy_version 647808 (0.0028) [2024-04-28 15:40:05,841][57339] Updated weights for policy 0, policy_version 647818 (0.0022) [2024-04-28 15:40:07,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10613915648. Throughput: 0: 55815.0. Samples: 1104222160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-04-28 15:40:07,169][57108] Avg episode reward: [(0, '0.726')] [2024-04-28 15:40:07,223][57319] Signal inference workers to stop experience collection... (16550 times) [2024-04-28 15:40:07,224][57319] Signal inference workers to resume experience collection... (16550 times) [2024-04-28 15:40:07,255][57339] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-04-28 15:40:07,256][57339] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-04-28 15:40:08,264][57339] Updated weights for policy 0, policy_version 647828 (0.0032) [2024-04-28 15:40:11,722][57339] Updated weights for policy 0, policy_version 647838 (0.0025) [2024-04-28 15:40:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10614194176. Throughput: 0: 55776.0. Samples: 1104555060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:12,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 15:40:13,990][57339] Updated weights for policy 0, policy_version 647848 (0.0028) [2024-04-28 15:40:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10614472704. Throughput: 0: 55801.3. Samples: 1104891920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:17,170][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 15:40:17,621][57339] Updated weights for policy 0, policy_version 647858 (0.0028) [2024-04-28 15:40:19,873][57339] Updated weights for policy 0, policy_version 647868 (0.0033) [2024-04-28 15:40:22,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55978.4, 300 sec: 55650.0). Total num frames: 10614767616. Throughput: 0: 55640.8. Samples: 1105048520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:22,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 15:40:23,471][57339] Updated weights for policy 0, policy_version 647878 (0.0029) [2024-04-28 15:40:25,642][57339] Updated weights for policy 0, policy_version 647888 (0.0029) [2024-04-28 15:40:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10615046144. Throughput: 0: 55738.2. Samples: 1105385940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:27,169][57108] Avg episode reward: [(0, '0.499')] [2024-04-28 15:40:29,378][57339] Updated weights for policy 0, policy_version 647898 (0.0027) [2024-04-28 15:40:31,970][57339] Updated weights for policy 0, policy_version 647908 (0.0027) [2024-04-28 15:40:32,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.6, 300 sec: 55762.2). Total num frames: 10615324672. Throughput: 0: 55884.4. Samples: 1105721380. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:32,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 15:40:35,422][57339] Updated weights for policy 0, policy_version 647918 (0.0031) [2024-04-28 15:40:37,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 10615619584. Throughput: 0: 55866.2. Samples: 1105901600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:37,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 15:40:37,923][57339] Updated weights for policy 0, policy_version 647928 (0.0034) [2024-04-28 15:40:41,370][57339] Updated weights for policy 0, policy_version 647938 (0.0029) [2024-04-28 15:40:42,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10615881728. Throughput: 0: 55862.5. Samples: 1106237300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:42,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 15:40:43,714][57339] Updated weights for policy 0, policy_version 647948 (0.0025) [2024-04-28 15:40:47,090][57339] Updated weights for policy 0, policy_version 647958 (0.0027) [2024-04-28 15:40:47,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10616143872. Throughput: 0: 56004.9. Samples: 1106573540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:47,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 15:40:49,480][57339] Updated weights for policy 0, policy_version 647968 (0.0030) [2024-04-28 15:40:50,358][57319] Signal inference workers to stop experience collection... (16600 times) [2024-04-28 15:40:50,358][57319] Signal inference workers to resume experience collection... 
(16600 times) [2024-04-28 15:40:50,396][57339] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-04-28 15:40:50,397][57339] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-04-28 15:40:52,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55432.3, 300 sec: 55538.9). Total num frames: 10616406016. Throughput: 0: 55757.5. Samples: 1106731260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:52,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 15:40:53,042][57339] Updated weights for policy 0, policy_version 647978 (0.0028) [2024-04-28 15:40:55,595][57339] Updated weights for policy 0, policy_version 647988 (0.0031) [2024-04-28 15:40:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10616717312. Throughput: 0: 55937.4. Samples: 1107072240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:40:57,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 15:40:58,920][57339] Updated weights for policy 0, policy_version 647998 (0.0028) [2024-04-28 15:41:01,391][57339] Updated weights for policy 0, policy_version 648008 (0.0031) [2024-04-28 15:41:02,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.4, 300 sec: 55816.7). Total num frames: 10616995840. Throughput: 0: 55922.5. Samples: 1107408440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:02,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 15:41:04,721][57339] Updated weights for policy 0, policy_version 648018 (0.0024) [2024-04-28 15:41:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10617274368. Throughput: 0: 56389.1. Samples: 1107586020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:07,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 15:41:07,175][57339] Updated weights for policy 0, policy_version 648028 (0.0030) [2024-04-28 15:41:10,557][57339] Updated weights for policy 0, policy_version 648038 (0.0029) [2024-04-28 15:41:12,169][57108] Fps is (10 sec: 57345.4, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10617569280. Throughput: 0: 56223.2. Samples: 1107915980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:12,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 15:41:13,094][57339] Updated weights for policy 0, policy_version 648048 (0.0031) [2024-04-28 15:41:16,419][57339] Updated weights for policy 0, policy_version 648058 (0.0031) [2024-04-28 15:41:17,169][57108] Fps is (10 sec: 58982.1, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 10617864192. Throughput: 0: 56147.2. Samples: 1108248000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:17,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 15:41:19,077][57339] Updated weights for policy 0, policy_version 648068 (0.0033) [2024-04-28 15:41:22,169][57108] Fps is (10 sec: 50789.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10618077184. Throughput: 0: 55820.7. Samples: 1108413540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:22,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 15:41:22,443][57339] Updated weights for policy 0, policy_version 648078 (0.0029) [2024-04-28 15:41:24,956][57339] Updated weights for policy 0, policy_version 648088 (0.0027) [2024-04-28 15:41:27,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10618372096. Throughput: 0: 55696.5. Samples: 1108743640. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:27,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 15:41:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000648094_10618372096.pth... [2024-04-28 15:41:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000647277_10604986368.pth [2024-04-28 15:41:28,181][57339] Updated weights for policy 0, policy_version 648098 (0.0028) [2024-04-28 15:41:30,719][57339] Updated weights for policy 0, policy_version 648108 (0.0028) [2024-04-28 15:41:32,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55432.7, 300 sec: 55594.6). Total num frames: 10618650624. Throughput: 0: 55642.8. Samples: 1109077460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:32,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 15:41:33,714][57319] Signal inference workers to stop experience collection... (16650 times) [2024-04-28 15:41:33,714][57319] Signal inference workers to resume experience collection... (16650 times) [2024-04-28 15:41:33,728][57339] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-04-28 15:41:33,728][57339] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-04-28 15:41:33,957][57339] Updated weights for policy 0, policy_version 648118 (0.0028) [2024-04-28 15:41:36,524][57339] Updated weights for policy 0, policy_version 648128 (0.0030) [2024-04-28 15:41:37,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10618945536. Throughput: 0: 55835.8. Samples: 1109243860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:37,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 15:41:39,936][57339] Updated weights for policy 0, policy_version 648138 (0.0028) [2024-04-28 15:41:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10619224064. Throughput: 0: 55464.9. 
Samples: 1109568160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:42,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 15:41:42,456][57339] Updated weights for policy 0, policy_version 648148 (0.0028) [2024-04-28 15:41:45,909][57339] Updated weights for policy 0, policy_version 648158 (0.0028) [2024-04-28 15:41:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10619518976. Throughput: 0: 55459.3. Samples: 1109904100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-04-28 15:41:47,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 15:41:48,455][57339] Updated weights for policy 0, policy_version 648168 (0.0026) [2024-04-28 15:41:51,691][57339] Updated weights for policy 0, policy_version 648178 (0.0030) [2024-04-28 15:41:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56525.0, 300 sec: 55872.2). Total num frames: 10619797504. Throughput: 0: 55361.3. Samples: 1110077280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:41:52,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 15:41:54,448][57339] Updated weights for policy 0, policy_version 648188 (0.0033) [2024-04-28 15:41:57,169][57108] Fps is (10 sec: 49152.0, 60 sec: 54886.3, 300 sec: 55594.6). Total num frames: 10620010496. Throughput: 0: 55431.4. Samples: 1110410400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:41:57,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:41:57,647][57339] Updated weights for policy 0, policy_version 648198 (0.0030) [2024-04-28 15:42:00,341][57339] Updated weights for policy 0, policy_version 648208 (0.0031) [2024-04-28 15:42:02,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10620305408. Throughput: 0: 55316.5. Samples: 1110737240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:02,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 15:42:03,720][57339] Updated weights for policy 0, policy_version 648218 (0.0028) [2024-04-28 15:42:06,194][57339] Updated weights for policy 0, policy_version 648228 (0.0027) [2024-04-28 15:42:07,169][57108] Fps is (10 sec: 60619.9, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 10620616704. Throughput: 0: 55054.6. Samples: 1110891000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:07,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 15:42:09,427][57339] Updated weights for policy 0, policy_version 648238 (0.0028) [2024-04-28 15:42:12,169][57339] Updated weights for policy 0, policy_version 648248 (0.0027) [2024-04-28 15:42:12,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10620895232. Throughput: 0: 55203.1. Samples: 1111227780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:12,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 15:42:14,950][57319] Signal inference workers to stop experience collection... (16700 times) [2024-04-28 15:42:14,955][57319] Signal inference workers to resume experience collection... (16700 times) [2024-04-28 15:42:14,984][57339] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-04-28 15:42:14,984][57339] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-04-28 15:42:15,351][57339] Updated weights for policy 0, policy_version 648258 (0.0036) [2024-04-28 15:42:17,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 10621173760. Throughput: 0: 55194.5. Samples: 1111561220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:17,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 15:42:18,129][57339] Updated weights for policy 0, policy_version 648268 (0.0025) [2024-04-28 15:42:21,317][57339] Updated weights for policy 0, policy_version 648278 (0.0029) [2024-04-28 15:42:22,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10621452288. Throughput: 0: 55398.2. Samples: 1111736780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:22,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 15:42:24,182][57339] Updated weights for policy 0, policy_version 648288 (0.0028) [2024-04-28 15:42:27,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10621698048. Throughput: 0: 55515.9. Samples: 1112066380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:27,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 15:42:27,197][57339] Updated weights for policy 0, policy_version 648298 (0.0029) [2024-04-28 15:42:30,247][57339] Updated weights for policy 0, policy_version 648308 (0.0030) [2024-04-28 15:42:32,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10621960192. Throughput: 0: 55514.7. Samples: 1112402260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:32,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:42:33,076][57339] Updated weights for policy 0, policy_version 648318 (0.0028) [2024-04-28 15:42:35,898][57339] Updated weights for policy 0, policy_version 648328 (0.0025) [2024-04-28 15:42:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10622255104. Throughput: 0: 55158.7. Samples: 1112559420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:37,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 15:42:39,043][57339] Updated weights for policy 0, policy_version 648338 (0.0030) [2024-04-28 15:42:41,784][57339] Updated weights for policy 0, policy_version 648348 (0.0029) [2024-04-28 15:42:42,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10622550016. Throughput: 0: 55099.2. Samples: 1112889860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:42,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 15:42:44,832][57339] Updated weights for policy 0, policy_version 648358 (0.0027) [2024-04-28 15:42:47,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10622828544. Throughput: 0: 55359.5. Samples: 1113228420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:47,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 15:42:47,737][57339] Updated weights for policy 0, policy_version 648368 (0.0036) [2024-04-28 15:42:50,811][57339] Updated weights for policy 0, policy_version 648378 (0.0024) [2024-04-28 15:42:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10623123456. Throughput: 0: 55779.7. Samples: 1113401080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:52,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 15:42:53,547][57339] Updated weights for policy 0, policy_version 648388 (0.0025) [2024-04-28 15:42:56,853][57339] Updated weights for policy 0, policy_version 648398 (0.0026) [2024-04-28 15:42:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10623385600. Throughput: 0: 55752.0. Samples: 1113736620. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:42:57,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 15:42:59,185][57339] Updated weights for policy 0, policy_version 648408 (0.0031) [2024-04-28 15:43:02,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10623647744. Throughput: 0: 55908.0. Samples: 1114077080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:43:02,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 15:43:02,542][57339] Updated weights for policy 0, policy_version 648418 (0.0028) [2024-04-28 15:43:05,116][57339] Updated weights for policy 0, policy_version 648428 (0.0030) [2024-04-28 15:43:06,908][57319] Signal inference workers to stop experience collection... (16750 times) [2024-04-28 15:43:06,943][57339] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-04-28 15:43:06,955][57319] Signal inference workers to resume experience collection... (16750 times) [2024-04-28 15:43:06,959][57339] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-04-28 15:43:07,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55650.0). Total num frames: 10623926272. Throughput: 0: 55555.5. Samples: 1114236780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:43:07,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 15:43:08,239][57339] Updated weights for policy 0, policy_version 648438 (0.0024) [2024-04-28 15:43:10,911][57339] Updated weights for policy 0, policy_version 648448 (0.0029) [2024-04-28 15:43:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10624204800. Throughput: 0: 55709.3. Samples: 1114573300. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:43:12,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:43:14,125][57339] Updated weights for policy 0, policy_version 648458 (0.0028) [2024-04-28 15:43:16,641][57339] Updated weights for policy 0, policy_version 648468 (0.0028) [2024-04-28 15:43:17,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10624499712. Throughput: 0: 55611.0. Samples: 1114904760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:43:17,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 15:43:20,030][57339] Updated weights for policy 0, policy_version 648478 (0.0028) [2024-04-28 15:43:22,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10624778240. Throughput: 0: 56016.0. Samples: 1115080140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 15:43:22,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 15:43:22,517][57339] Updated weights for policy 0, policy_version 648488 (0.0023) [2024-04-28 15:43:25,931][57339] Updated weights for policy 0, policy_version 648498 (0.0029) [2024-04-28 15:43:27,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 10625089536. Throughput: 0: 56141.7. Samples: 1115416240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:43:27,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 15:43:27,174][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000648504_10625089536.pth... [2024-04-28 15:43:27,221][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000647688_10611720192.pth [2024-04-28 15:43:28,392][57339] Updated weights for policy 0, policy_version 648508 (0.0026) [2024-04-28 15:43:31,662][57339] Updated weights for policy 0, policy_version 648518 (0.0033) [2024-04-28 15:43:32,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56524.9, 300 sec: 55872.2). 
Total num frames: 10625351680. Throughput: 0: 55963.7. Samples: 1115746780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:43:32,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:43:34,355][57339] Updated weights for policy 0, policy_version 648528 (0.0028) [2024-04-28 15:43:37,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10625613824. Throughput: 0: 55710.7. Samples: 1115908060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:43:37,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 15:43:37,599][57339] Updated weights for policy 0, policy_version 648538 (0.0033) [2024-04-28 15:43:40,435][57339] Updated weights for policy 0, policy_version 648548 (0.0029) [2024-04-28 15:43:42,169][57108] Fps is (10 sec: 50789.0, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 10625859584. Throughput: 0: 55730.1. Samples: 1116244480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:43:42,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 15:43:43,550][57339] Updated weights for policy 0, policy_version 648558 (0.0028) [2024-04-28 15:43:46,199][57339] Updated weights for policy 0, policy_version 648568 (0.0027) [2024-04-28 15:43:47,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 10626170880. Throughput: 0: 55637.1. Samples: 1116580740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:43:47,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 15:43:49,304][57339] Updated weights for policy 0, policy_version 648578 (0.0027) [2024-04-28 15:43:49,317][57319] Signal inference workers to stop experience collection... (16800 times) [2024-04-28 15:43:49,322][57319] Signal inference workers to resume experience collection... 
(16800 times) [2024-04-28 15:43:49,335][57339] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-04-28 15:43:49,336][57339] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-04-28 15:43:52,169][57108] Fps is (10 sec: 58984.0, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10626449408. Throughput: 0: 55659.3. Samples: 1116741440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:43:52,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 15:43:52,209][57339] Updated weights for policy 0, policy_version 648588 (0.0030) [2024-04-28 15:43:55,126][57339] Updated weights for policy 0, policy_version 648598 (0.0024) [2024-04-28 15:43:57,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10626727936. Throughput: 0: 55488.9. Samples: 1117070300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:43:57,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 15:43:58,164][57339] Updated weights for policy 0, policy_version 648608 (0.0032) [2024-04-28 15:44:01,071][57339] Updated weights for policy 0, policy_version 648618 (0.0029) [2024-04-28 15:44:02,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10627022848. Throughput: 0: 55487.1. Samples: 1117401680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:02,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 15:44:04,140][57339] Updated weights for policy 0, policy_version 648628 (0.0025) [2024-04-28 15:44:07,002][57339] Updated weights for policy 0, policy_version 648638 (0.0030) [2024-04-28 15:44:07,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 10627301376. Throughput: 0: 55560.0. Samples: 1117580340. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:07,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 15:44:09,895][57339] Updated weights for policy 0, policy_version 648648 (0.0038) [2024-04-28 15:44:12,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10627547136. Throughput: 0: 55480.7. Samples: 1117912880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:12,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 15:44:12,835][57339] Updated weights for policy 0, policy_version 648658 (0.0025) [2024-04-28 15:44:15,821][57339] Updated weights for policy 0, policy_version 648668 (0.0030) [2024-04-28 15:44:17,169][57108] Fps is (10 sec: 49151.1, 60 sec: 54886.4, 300 sec: 55538.9). Total num frames: 10627792896. Throughput: 0: 55586.4. Samples: 1118248180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:17,170][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 15:44:18,668][57339] Updated weights for policy 0, policy_version 648678 (0.0026) [2024-04-28 15:44:21,777][57339] Updated weights for policy 0, policy_version 648688 (0.0030) [2024-04-28 15:44:22,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10628120576. Throughput: 0: 55567.2. Samples: 1118408580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:22,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 15:44:24,553][57339] Updated weights for policy 0, policy_version 648698 (0.0028) [2024-04-28 15:44:27,169][57108] Fps is (10 sec: 60621.2, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10628399104. Throughput: 0: 55501.1. Samples: 1118742020. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:27,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 15:44:27,610][57339] Updated weights for policy 0, policy_version 648708 (0.0026) [2024-04-28 15:44:30,522][57339] Updated weights for policy 0, policy_version 648718 (0.0033) [2024-04-28 15:44:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.4, 300 sec: 55761.2). Total num frames: 10628677632. Throughput: 0: 55468.3. Samples: 1119076820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:32,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 15:44:33,374][57339] Updated weights for policy 0, policy_version 648728 (0.0031) [2024-04-28 15:44:35,023][57319] Signal inference workers to stop experience collection... (16850 times) [2024-04-28 15:44:35,024][57319] Signal inference workers to resume experience collection... (16850 times) [2024-04-28 15:44:35,037][57339] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-04-28 15:44:35,038][57339] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-04-28 15:44:36,352][57339] Updated weights for policy 0, policy_version 648738 (0.0029) [2024-04-28 15:44:37,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 10628972544. Throughput: 0: 55862.8. Samples: 1119255280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:37,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 15:44:39,331][57339] Updated weights for policy 0, policy_version 648748 (0.0028) [2024-04-28 15:44:42,109][57339] Updated weights for policy 0, policy_version 648758 (0.0032) [2024-04-28 15:44:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56524.9, 300 sec: 55761.2). Total num frames: 10629251072. Throughput: 0: 56080.9. Samples: 1119593940. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:42,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 15:44:45,029][57339] Updated weights for policy 0, policy_version 648768 (0.0031) [2024-04-28 15:44:47,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 10629496832. Throughput: 0: 56179.1. Samples: 1119929740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:47,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 15:44:48,005][57339] Updated weights for policy 0, policy_version 648778 (0.0021) [2024-04-28 15:44:50,991][57339] Updated weights for policy 0, policy_version 648788 (0.0039) [2024-04-28 15:44:52,169][57108] Fps is (10 sec: 49152.5, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 10629742592. Throughput: 0: 55621.8. Samples: 1120083320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:52,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:44:53,861][57339] Updated weights for policy 0, policy_version 648798 (0.0024) [2024-04-28 15:44:56,880][57339] Updated weights for policy 0, policy_version 648808 (0.0027) [2024-04-28 15:44:57,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10630070272. Throughput: 0: 55740.2. Samples: 1120421180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:44:57,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 15:44:59,585][57339] Updated weights for policy 0, policy_version 648818 (0.0027) [2024-04-28 15:45:02,169][57108] Fps is (10 sec: 60620.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10630348800. Throughput: 0: 55804.9. Samples: 1120759400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-04-28 15:45:02,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 15:45:02,762][57339] Updated weights for policy 0, policy_version 648828 (0.0028) [2024-04-28 15:45:05,445][57339] Updated weights for policy 0, policy_version 648838 (0.0024) [2024-04-28 15:45:07,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 10630610944. Throughput: 0: 55928.8. Samples: 1120925380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:07,170][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 15:45:08,719][57339] Updated weights for policy 0, policy_version 648848 (0.0026) [2024-04-28 15:45:11,276][57339] Updated weights for policy 0, policy_version 648858 (0.0032) [2024-04-28 15:45:12,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.9, 300 sec: 55761.2). Total num frames: 10630922240. Throughput: 0: 56008.1. Samples: 1121262380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:12,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 15:45:14,737][57339] Updated weights for policy 0, policy_version 648868 (0.0025) [2024-04-28 15:45:15,858][57319] Signal inference workers to stop experience collection... (16900 times) [2024-04-28 15:45:15,892][57339] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-04-28 15:45:15,908][57319] Signal inference workers to resume experience collection... (16900 times) [2024-04-28 15:45:15,911][57339] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-04-28 15:45:17,062][57339] Updated weights for policy 0, policy_version 648878 (0.0033) [2024-04-28 15:45:17,169][57108] Fps is (10 sec: 60620.8, 60 sec: 57071.0, 300 sec: 55761.2). Total num frames: 10631217152. Throughput: 0: 56111.0. Samples: 1121601820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:17,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 15:45:21,015][57339] Updated weights for policy 0, policy_version 648888 (0.0030) [2024-04-28 15:45:22,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10631479296. Throughput: 0: 55808.7. Samples: 1121766660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:22,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 15:45:22,849][57339] Updated weights for policy 0, policy_version 648898 (0.0030) [2024-04-28 15:45:26,865][57339] Updated weights for policy 0, policy_version 648908 (0.0030) [2024-04-28 15:45:27,169][57108] Fps is (10 sec: 49152.4, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10631708672. Throughput: 0: 55617.9. Samples: 1122096740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:27,169][57108] Avg episode reward: [(0, '0.704')] [2024-04-28 15:45:27,210][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000648909_10631725056.pth... [2024-04-28 15:45:27,268][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000648094_10618372096.pth [2024-04-28 15:45:28,804][57339] Updated weights for policy 0, policy_version 648918 (0.0029) [2024-04-28 15:45:32,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 10632003584. Throughput: 0: 55544.4. Samples: 1122429240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:32,170][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 15:45:32,853][57339] Updated weights for policy 0, policy_version 648928 (0.0033) [2024-04-28 15:45:34,894][57339] Updated weights for policy 0, policy_version 648938 (0.0033) [2024-04-28 15:45:37,169][57108] Fps is (10 sec: 60620.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10632314880. Throughput: 0: 55877.6. Samples: 1122597820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:37,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:45:38,680][57339] Updated weights for policy 0, policy_version 648948 (0.0029) [2024-04-28 15:45:40,628][57339] Updated weights for policy 0, policy_version 648958 (0.0032) [2024-04-28 15:45:42,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10632593408. Throughput: 0: 55829.6. Samples: 1122933520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:42,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 15:45:44,494][57339] Updated weights for policy 0, policy_version 648968 (0.0028) [2024-04-28 15:45:46,495][57339] Updated weights for policy 0, policy_version 648978 (0.0030) [2024-04-28 15:45:47,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10632871936. Throughput: 0: 55670.7. Samples: 1123264580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:47,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 15:45:50,341][57339] Updated weights for policy 0, policy_version 648988 (0.0024) [2024-04-28 15:45:52,169][57108] Fps is (10 sec: 55706.9, 60 sec: 56797.9, 300 sec: 55705.6). Total num frames: 10633150464. Throughput: 0: 55913.5. Samples: 1123441480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:52,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:45:52,402][57339] Updated weights for policy 0, policy_version 648998 (0.0026) [2024-04-28 15:45:56,046][57339] Updated weights for policy 0, policy_version 649008 (0.0032) [2024-04-28 15:45:57,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10633428992. Throughput: 0: 56006.1. Samples: 1123782660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:45:57,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 15:45:57,430][57319] Signal inference workers to stop experience collection... (16950 times) [2024-04-28 15:45:57,465][57339] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-04-28 15:45:57,493][57319] Signal inference workers to resume experience collection... (16950 times) [2024-04-28 15:45:57,493][57339] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-04-28 15:45:58,369][57339] Updated weights for policy 0, policy_version 649018 (0.0031) [2024-04-28 15:46:01,843][57339] Updated weights for policy 0, policy_version 649028 (0.0035) [2024-04-28 15:46:02,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10633674752. Throughput: 0: 55999.2. Samples: 1124121780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:46:02,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 15:46:04,100][57339] Updated weights for policy 0, policy_version 649038 (0.0023) [2024-04-28 15:46:07,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55705.5, 300 sec: 55538.9). Total num frames: 10633953280. Throughput: 0: 55776.7. Samples: 1124276620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:46:07,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 15:46:07,689][57339] Updated weights for policy 0, policy_version 649048 (0.0025) [2024-04-28 15:46:09,879][57339] Updated weights for policy 0, policy_version 649058 (0.0024) [2024-04-28 15:46:12,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10634248192. Throughput: 0: 55837.3. Samples: 1124609420. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:46:12,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 15:46:13,508][57339] Updated weights for policy 0, policy_version 649068 (0.0030) [2024-04-28 15:46:15,775][57339] Updated weights for policy 0, policy_version 649078 (0.0028) [2024-04-28 15:46:17,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10634543104. Throughput: 0: 55923.1. Samples: 1124945780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:46:17,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 15:46:19,355][57339] Updated weights for policy 0, policy_version 649088 (0.0031) [2024-04-28 15:46:21,784][57339] Updated weights for policy 0, policy_version 649098 (0.0025) [2024-04-28 15:46:22,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10634838016. Throughput: 0: 56048.4. Samples: 1125120000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:46:22,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:46:25,434][57339] Updated weights for policy 0, policy_version 649108 (0.0029) [2024-04-28 15:46:27,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56797.6, 300 sec: 55816.6). Total num frames: 10635116544. Throughput: 0: 56176.8. Samples: 1125461480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:46:27,170][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 15:46:27,648][57339] Updated weights for policy 0, policy_version 649118 (0.0028) [2024-04-28 15:46:31,286][57339] Updated weights for policy 0, policy_version 649128 (0.0029) [2024-04-28 15:46:32,169][57108] Fps is (10 sec: 54068.0, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 10635378688. Throughput: 0: 56338.9. Samples: 1125799820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:46:32,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 15:46:33,463][57339] Updated weights for policy 0, policy_version 649138 (0.0025) [2024-04-28 15:46:37,114][57339] Updated weights for policy 0, policy_version 649148 (0.0026) [2024-04-28 15:46:37,169][57108] Fps is (10 sec: 52430.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10635640832. Throughput: 0: 55813.3. Samples: 1125953080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 15:46:37,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 15:46:39,343][57339] Updated weights for policy 0, policy_version 649158 (0.0025) [2024-04-28 15:46:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 10635935744. Throughput: 0: 55641.8. Samples: 1126286540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:46:42,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 15:46:42,975][57339] Updated weights for policy 0, policy_version 649168 (0.0028) [2024-04-28 15:46:45,118][57339] Updated weights for policy 0, policy_version 649178 (0.0034) [2024-04-28 15:46:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10636214272. Throughput: 0: 55636.4. Samples: 1126625420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:46:47,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 15:46:48,744][57339] Updated weights for policy 0, policy_version 649188 (0.0033) [2024-04-28 15:46:49,626][57319] Signal inference workers to stop experience collection... (17000 times) [2024-04-28 15:46:49,657][57339] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-04-28 15:46:49,681][57319] Signal inference workers to resume experience collection... 
(17000 times) [2024-04-28 15:46:49,682][57339] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-04-28 15:46:50,968][57339] Updated weights for policy 0, policy_version 649198 (0.0025) [2024-04-28 15:46:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10636509184. Throughput: 0: 56047.7. Samples: 1126798760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:46:52,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 15:46:54,851][57339] Updated weights for policy 0, policy_version 649208 (0.0029) [2024-04-28 15:46:56,953][57339] Updated weights for policy 0, policy_version 649218 (0.0028) [2024-04-28 15:46:57,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10636787712. Throughput: 0: 55978.1. Samples: 1127128440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:46:57,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 15:47:00,668][57339] Updated weights for policy 0, policy_version 649228 (0.0025) [2024-04-28 15:47:02,169][57108] Fps is (10 sec: 55706.2, 60 sec: 56524.8, 300 sec: 55761.2). Total num frames: 10637066240. Throughput: 0: 55862.0. Samples: 1127459560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:02,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 15:47:02,745][57339] Updated weights for policy 0, policy_version 649238 (0.0026) [2024-04-28 15:47:06,514][57339] Updated weights for policy 0, policy_version 649248 (0.0030) [2024-04-28 15:47:07,169][57108] Fps is (10 sec: 54067.7, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 10637328384. Throughput: 0: 55761.0. Samples: 1127629240. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:07,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 15:47:08,748][57339] Updated weights for policy 0, policy_version 649258 (0.0024) [2024-04-28 15:47:12,169][57108] Fps is (10 sec: 50789.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10637574144. Throughput: 0: 55554.8. Samples: 1127961440. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:12,169][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 15:47:12,397][57339] Updated weights for policy 0, policy_version 649268 (0.0033) [2024-04-28 15:47:14,529][57339] Updated weights for policy 0, policy_version 649278 (0.0029) [2024-04-28 15:47:17,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 10637869056. Throughput: 0: 55526.5. Samples: 1128298520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:17,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 15:47:18,359][57339] Updated weights for policy 0, policy_version 649288 (0.0031) [2024-04-28 15:47:20,418][57339] Updated weights for policy 0, policy_version 649298 (0.0033) [2024-04-28 15:47:22,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10638163968. Throughput: 0: 55836.8. Samples: 1128465740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:22,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 15:47:24,070][57339] Updated weights for policy 0, policy_version 649308 (0.0033) [2024-04-28 15:47:26,243][57339] Updated weights for policy 0, policy_version 649318 (0.0029) [2024-04-28 15:47:27,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.7, 300 sec: 55927.7). Total num frames: 10638458880. Throughput: 0: 55895.4. Samples: 1128801840. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:27,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 15:47:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000649320_10638458880.pth... [2024-04-28 15:47:27,235][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000648504_10625089536.pth [2024-04-28 15:47:29,791][57339] Updated weights for policy 0, policy_version 649328 (0.0030) [2024-04-28 15:47:32,138][57339] Updated weights for policy 0, policy_version 649338 (0.0027) [2024-04-28 15:47:32,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 10638753792. Throughput: 0: 55702.2. Samples: 1129132020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:32,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 15:47:35,816][57339] Updated weights for policy 0, policy_version 649348 (0.0031) [2024-04-28 15:47:37,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10638999552. Throughput: 0: 55588.0. Samples: 1129300220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:37,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 15:47:38,077][57339] Updated weights for policy 0, policy_version 649358 (0.0030) [2024-04-28 15:47:41,772][57339] Updated weights for policy 0, policy_version 649368 (0.0027) [2024-04-28 15:47:42,169][57108] Fps is (10 sec: 50790.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10639261696. Throughput: 0: 55648.2. Samples: 1129632600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:42,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 15:47:43,912][57339] Updated weights for policy 0, policy_version 649378 (0.0030) [2024-04-28 15:47:47,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10639540224. Throughput: 0: 55702.9. Samples: 1129966200. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:47,171][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 15:47:47,592][57339] Updated weights for policy 0, policy_version 649388 (0.0033) [2024-04-28 15:47:49,849][57339] Updated weights for policy 0, policy_version 649398 (0.0027) [2024-04-28 15:47:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10639835136. Throughput: 0: 55483.2. Samples: 1130125980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:52,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 15:47:53,396][57339] Updated weights for policy 0, policy_version 649408 (0.0027) [2024-04-28 15:47:55,852][57339] Updated weights for policy 0, policy_version 649418 (0.0029) [2024-04-28 15:47:57,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10640130048. Throughput: 0: 55602.6. Samples: 1130463560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:47:57,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 15:47:59,126][57339] Updated weights for policy 0, policy_version 649428 (0.0028) [2024-04-28 15:48:01,701][57339] Updated weights for policy 0, policy_version 649438 (0.0033) [2024-04-28 15:48:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10640408576. Throughput: 0: 55521.0. Samples: 1130796960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:48:02,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 15:48:04,524][57319] Signal inference workers to stop experience collection... (17050 times) [2024-04-28 15:48:04,524][57319] Signal inference workers to resume experience collection... 
(17050 times) [2024-04-28 15:48:04,554][57339] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-04-28 15:48:04,555][57339] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-04-28 15:48:05,095][57339] Updated weights for policy 0, policy_version 649448 (0.0026) [2024-04-28 15:48:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10640687104. Throughput: 0: 55720.3. Samples: 1130973160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:48:07,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 15:48:07,460][57339] Updated weights for policy 0, policy_version 649458 (0.0029) [2024-04-28 15:48:11,020][57339] Updated weights for policy 0, policy_version 649468 (0.0031) [2024-04-28 15:48:12,169][57108] Fps is (10 sec: 54067.0, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10640949248. Throughput: 0: 55665.4. Samples: 1131306780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-04-28 15:48:12,169][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 15:48:13,519][57339] Updated weights for policy 0, policy_version 649478 (0.0027) [2024-04-28 15:48:16,718][57339] Updated weights for policy 0, policy_version 649488 (0.0024) [2024-04-28 15:48:17,169][57108] Fps is (10 sec: 52430.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10641211392. Throughput: 0: 55773.0. Samples: 1131641800. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:48:17,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 15:48:19,265][57339] Updated weights for policy 0, policy_version 649498 (0.0032) [2024-04-28 15:48:22,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10641473536. Throughput: 0: 55637.3. Samples: 1131803900. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:48:22,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 15:48:22,643][57339] Updated weights for policy 0, policy_version 649508 (0.0032) [2024-04-28 15:48:25,164][57339] Updated weights for policy 0, policy_version 649518 (0.0035) [2024-04-28 15:48:27,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10641784832. Throughput: 0: 55676.3. Samples: 1132138040. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:48:27,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 15:48:28,482][57339] Updated weights for policy 0, policy_version 649528 (0.0024) [2024-04-28 15:48:31,102][57339] Updated weights for policy 0, policy_version 649538 (0.0026) [2024-04-28 15:48:32,169][57108] Fps is (10 sec: 60620.8, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10642079744. Throughput: 0: 55611.6. Samples: 1132468720. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:48:32,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 15:48:34,292][57339] Updated weights for policy 0, policy_version 649548 (0.0033) [2024-04-28 15:48:36,870][57339] Updated weights for policy 0, policy_version 649558 (0.0024) [2024-04-28 15:48:37,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10642358272. Throughput: 0: 55918.6. Samples: 1132642320. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:48:37,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 15:48:40,404][57339] Updated weights for policy 0, policy_version 649568 (0.0026) [2024-04-28 15:48:42,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10642604032. Throughput: 0: 55962.5. Samples: 1132981860. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:48:42,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 15:48:42,953][57339] Updated weights for policy 0, policy_version 649578 (0.0027) [2024-04-28 15:48:46,350][57339] Updated weights for policy 0, policy_version 649588 (0.0028) [2024-04-28 15:48:47,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10642898944. Throughput: 0: 55881.3. Samples: 1133311620. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:48:47,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 15:48:48,602][57339] Updated weights for policy 0, policy_version 649598 (0.0029) [2024-04-28 15:48:51,491][57319] Signal inference workers to stop experience collection... (17100 times) [2024-04-28 15:48:51,491][57319] Signal inference workers to resume experience collection... (17100 times) [2024-04-28 15:48:51,514][57339] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-04-28 15:48:51,514][57339] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-04-28 15:48:52,040][57339] Updated weights for policy 0, policy_version 649608 (0.0024) [2024-04-28 15:48:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10643177472. Throughput: 0: 55649.2. Samples: 1133477360. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:48:52,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 15:48:54,446][57339] Updated weights for policy 0, policy_version 649618 (0.0029) [2024-04-28 15:48:57,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10643456000. Throughput: 0: 55703.5. Samples: 1133813440. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:48:57,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 15:48:57,872][57339] Updated weights for policy 0, policy_version 649628 (0.0032) [2024-04-28 15:49:00,371][57339] Updated weights for policy 0, policy_version 649638 (0.0028) [2024-04-28 15:49:02,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 10643718144. Throughput: 0: 55666.6. Samples: 1134146800. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:02,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 15:49:03,578][57339] Updated weights for policy 0, policy_version 649648 (0.0031) [2024-04-28 15:49:06,240][57339] Updated weights for policy 0, policy_version 649658 (0.0027) [2024-04-28 15:49:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10644013056. Throughput: 0: 55821.2. Samples: 1134315860. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:07,170][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 15:49:09,527][57339] Updated weights for policy 0, policy_version 649668 (0.0025) [2024-04-28 15:49:12,078][57339] Updated weights for policy 0, policy_version 649678 (0.0027) [2024-04-28 15:49:12,169][57108] Fps is (10 sec: 60621.3, 60 sec: 56251.8, 300 sec: 56038.9). Total num frames: 10644324352. Throughput: 0: 55871.6. Samples: 1134652260. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:12,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 15:49:15,414][57339] Updated weights for policy 0, policy_version 649688 (0.0028) [2024-04-28 15:49:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10644570112. Throughput: 0: 55977.3. Samples: 1134987700. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:17,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 15:49:17,914][57339] Updated weights for policy 0, policy_version 649698 (0.0028) [2024-04-28 15:49:21,083][57339] Updated weights for policy 0, policy_version 649708 (0.0028) [2024-04-28 15:49:22,169][57108] Fps is (10 sec: 54066.5, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 10644865024. Throughput: 0: 55925.7. Samples: 1135158980. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:22,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 15:49:23,885][57339] Updated weights for policy 0, policy_version 649718 (0.0025) [2024-04-28 15:49:27,123][57339] Updated weights for policy 0, policy_version 649728 (0.0026) [2024-04-28 15:49:27,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10645143552. Throughput: 0: 55920.7. Samples: 1135498300. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:27,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 15:49:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000649728_10645143552.pth... [2024-04-28 15:49:27,237][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000648909_10631725056.pth [2024-04-28 15:49:29,851][57339] Updated weights for policy 0, policy_version 649738 (0.0035) [2024-04-28 15:49:32,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10645405696. Throughput: 0: 55973.8. Samples: 1135830440. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:32,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:49:33,098][57339] Updated weights for policy 0, policy_version 649748 (0.0027) [2024-04-28 15:49:34,447][57319] Signal inference workers to stop experience collection... (17150 times) [2024-04-28 15:49:34,447][57319] Signal inference workers to resume experience collection... 
(17150 times) [2024-04-28 15:49:34,462][57339] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-04-28 15:49:34,492][57339] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-04-28 15:49:35,827][57339] Updated weights for policy 0, policy_version 649758 (0.0027) [2024-04-28 15:49:37,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10645684224. Throughput: 0: 56024.2. Samples: 1135998460. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:37,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:49:38,796][57339] Updated weights for policy 0, policy_version 649768 (0.0024) [2024-04-28 15:49:41,697][57339] Updated weights for policy 0, policy_version 649778 (0.0029) [2024-04-28 15:49:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10645979136. Throughput: 0: 56010.7. Samples: 1136333920. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:42,170][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 15:49:44,495][57339] Updated weights for policy 0, policy_version 649788 (0.0026) [2024-04-28 15:49:47,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 10646274048. Throughput: 0: 56097.3. Samples: 1136671180. Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:47,170][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 15:49:47,616][57339] Updated weights for policy 0, policy_version 649798 (0.0025) [2024-04-28 15:49:50,472][57339] Updated weights for policy 0, policy_version 649808 (0.0039) [2024-04-28 15:49:52,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10646536192. Throughput: 0: 55935.8. Samples: 1136832960. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 21.0) [2024-04-28 15:49:52,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 15:49:53,547][57339] Updated weights for policy 0, policy_version 649818 (0.0040) [2024-04-28 15:49:56,357][57339] Updated weights for policy 0, policy_version 649828 (0.0030) [2024-04-28 15:49:57,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10646814720. Throughput: 0: 55875.0. Samples: 1137166640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:49:57,170][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 15:49:59,299][57339] Updated weights for policy 0, policy_version 649838 (0.0032) [2024-04-28 15:50:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10647093248. Throughput: 0: 55963.1. Samples: 1137506040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:02,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 15:50:02,188][57339] Updated weights for policy 0, policy_version 649848 (0.0028) [2024-04-28 15:50:05,148][57339] Updated weights for policy 0, policy_version 649858 (0.0032) [2024-04-28 15:50:07,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10647404544. Throughput: 0: 56022.3. Samples: 1137679980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:07,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 15:50:07,923][57339] Updated weights for policy 0, policy_version 649868 (0.0032) [2024-04-28 15:50:11,021][57339] Updated weights for policy 0, policy_version 649878 (0.0028) [2024-04-28 15:50:12,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 10647650304. Throughput: 0: 55985.1. Samples: 1138017640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:12,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 15:50:13,785][57339] Updated weights for policy 0, policy_version 649888 (0.0031) [2024-04-28 15:50:16,888][57339] Updated weights for policy 0, policy_version 649898 (0.0035) [2024-04-28 15:50:17,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10647928832. Throughput: 0: 56069.9. Samples: 1138353580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:17,169][57108] Avg episode reward: [(0, '0.494')] [2024-04-28 15:50:19,827][57339] Updated weights for policy 0, policy_version 649908 (0.0029) [2024-04-28 15:50:22,169][57108] Fps is (10 sec: 55707.3, 60 sec: 55705.8, 300 sec: 55927.8). Total num frames: 10648207360. Throughput: 0: 55993.6. Samples: 1138518160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:22,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 15:50:22,778][57339] Updated weights for policy 0, policy_version 649918 (0.0030) [2024-04-28 15:50:25,513][57339] Updated weights for policy 0, policy_version 649928 (0.0029) [2024-04-28 15:50:27,169][57108] Fps is (10 sec: 58981.7, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 10648518656. Throughput: 0: 56004.5. Samples: 1138854120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:27,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 15:50:28,472][57339] Updated weights for policy 0, policy_version 649938 (0.0026) [2024-04-28 15:50:31,432][57339] Updated weights for policy 0, policy_version 649948 (0.0036) [2024-04-28 15:50:32,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56524.9, 300 sec: 55872.3). Total num frames: 10648797184. Throughput: 0: 56002.5. Samples: 1139191280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:32,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 15:50:34,276][57339] Updated weights for policy 0, policy_version 649958 (0.0031) [2024-04-28 15:50:37,170][57108] Fps is (10 sec: 54062.5, 60 sec: 56251.0, 300 sec: 55816.5). Total num frames: 10649059328. Throughput: 0: 56061.4. Samples: 1139355780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:37,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 15:50:37,317][57339] Updated weights for policy 0, policy_version 649968 (0.0025) [2024-04-28 15:50:37,683][57319] Signal inference workers to stop experience collection... (17200 times) [2024-04-28 15:50:37,684][57319] Signal inference workers to resume experience collection... (17200 times) [2024-04-28 15:50:37,697][57339] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-04-28 15:50:37,697][57339] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-04-28 15:50:40,159][57339] Updated weights for policy 0, policy_version 649978 (0.0031) [2024-04-28 15:50:42,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10649337856. Throughput: 0: 56108.9. Samples: 1139691540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:42,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 15:50:43,214][57339] Updated weights for policy 0, policy_version 649988 (0.0032) [2024-04-28 15:50:46,160][57339] Updated weights for policy 0, policy_version 649998 (0.0028) [2024-04-28 15:50:47,169][57108] Fps is (10 sec: 55710.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10649616384. Throughput: 0: 56039.1. Samples: 1140027800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:47,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 15:50:48,953][57339] Updated weights for policy 0, policy_version 650008 (0.0030) [2024-04-28 15:50:52,027][57339] Updated weights for policy 0, policy_version 650018 (0.0036) [2024-04-28 15:50:52,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10649894912. Throughput: 0: 55850.4. Samples: 1140193240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:52,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 15:50:54,728][57339] Updated weights for policy 0, policy_version 650028 (0.0031) [2024-04-28 15:50:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 10650173440. Throughput: 0: 55766.9. Samples: 1140527140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:50:57,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 15:50:58,119][57339] Updated weights for policy 0, policy_version 650038 (0.0033) [2024-04-28 15:51:00,630][57339] Updated weights for policy 0, policy_version 650048 (0.0028) [2024-04-28 15:51:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10650468352. Throughput: 0: 55778.6. Samples: 1140863620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:51:02,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 15:51:03,824][57339] Updated weights for policy 0, policy_version 650058 (0.0033) [2024-04-28 15:51:06,729][57339] Updated weights for policy 0, policy_version 650068 (0.0028) [2024-04-28 15:51:07,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10650746880. Throughput: 0: 55897.2. Samples: 1141033540. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:51:07,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 15:51:09,573][57339] Updated weights for policy 0, policy_version 650078 (0.0026) [2024-04-28 15:51:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10651009024. Throughput: 0: 55872.4. Samples: 1141368380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:51:12,170][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 15:51:12,529][57339] Updated weights for policy 0, policy_version 650088 (0.0032) [2024-04-28 15:51:15,369][57339] Updated weights for policy 0, policy_version 650098 (0.0029) [2024-04-28 15:51:17,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10651287552. Throughput: 0: 55811.8. Samples: 1141702820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:51:17,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 15:51:18,493][57339] Updated weights for policy 0, policy_version 650108 (0.0029) [2024-04-28 15:51:21,195][57339] Updated weights for policy 0, policy_version 650118 (0.0032) [2024-04-28 15:51:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10651549696. Throughput: 0: 55662.9. Samples: 1141860560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:51:22,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 15:51:24,472][57339] Updated weights for policy 0, policy_version 650128 (0.0028) [2024-04-28 15:51:27,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10651844608. Throughput: 0: 55557.9. Samples: 1142191640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-04-28 15:51:27,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 15:51:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000650137_10651844608.pth... 
[2024-04-28 15:51:27,222][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000649320_10638458880.pth [2024-04-28 15:51:27,422][57339] Updated weights for policy 0, policy_version 650138 (0.0025) [2024-04-28 15:51:30,376][57339] Updated weights for policy 0, policy_version 650148 (0.0032) [2024-04-28 15:51:32,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.2, 300 sec: 55816.6). Total num frames: 10652106752. Throughput: 0: 55508.4. Samples: 1142525680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:51:32,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 15:51:33,326][57339] Updated weights for policy 0, policy_version 650158 (0.0026) [2024-04-28 15:51:36,031][57319] Signal inference workers to stop experience collection... (17250 times) [2024-04-28 15:51:36,031][57319] Signal inference workers to resume experience collection... (17250 times) [2024-04-28 15:51:36,051][57339] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-04-28 15:51:36,051][57339] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-04-28 15:51:36,147][57339] Updated weights for policy 0, policy_version 650168 (0.0036) [2024-04-28 15:51:37,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55706.4, 300 sec: 55816.7). Total num frames: 10652401664. Throughput: 0: 55602.1. Samples: 1142695340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:51:37,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 15:51:39,197][57339] Updated weights for policy 0, policy_version 650178 (0.0029) [2024-04-28 15:51:42,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10652663808. Throughput: 0: 55522.3. Samples: 1143025640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:51:42,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 15:51:42,197][57339] Updated weights for policy 0, policy_version 650188 (0.0028) [2024-04-28 15:51:45,030][57339] Updated weights for policy 0, policy_version 650198 (0.0030) [2024-04-28 15:51:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10652942336. Throughput: 0: 55470.6. Samples: 1143359800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:51:47,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 15:51:48,019][57339] Updated weights for policy 0, policy_version 650208 (0.0026) [2024-04-28 15:51:50,885][57339] Updated weights for policy 0, policy_version 650218 (0.0030) [2024-04-28 15:51:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10653220864. Throughput: 0: 55416.9. Samples: 1143527300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:51:52,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 15:51:53,946][57339] Updated weights for policy 0, policy_version 650228 (0.0028) [2024-04-28 15:51:56,656][57339] Updated weights for policy 0, policy_version 650238 (0.0027) [2024-04-28 15:51:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10653499392. Throughput: 0: 55377.8. Samples: 1143860380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:51:57,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 15:51:59,764][57339] Updated weights for policy 0, policy_version 650248 (0.0026) [2024-04-28 15:52:02,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10653794304. Throughput: 0: 55372.5. Samples: 1144194580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:02,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 15:52:02,656][57339] Updated weights for policy 0, policy_version 650258 (0.0025) [2024-04-28 15:52:05,825][57339] Updated weights for policy 0, policy_version 650268 (0.0029) [2024-04-28 15:52:07,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 10654072832. Throughput: 0: 55630.7. Samples: 1144363940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:07,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 15:52:08,344][57339] Updated weights for policy 0, policy_version 650278 (0.0031) [2024-04-28 15:52:11,812][57339] Updated weights for policy 0, policy_version 650288 (0.0026) [2024-04-28 15:52:12,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10654334976. Throughput: 0: 55792.9. Samples: 1144702320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:12,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 15:52:14,428][57339] Updated weights for policy 0, policy_version 650298 (0.0031) [2024-04-28 15:52:17,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10654613504. Throughput: 0: 55792.6. Samples: 1145036340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:17,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 15:52:17,785][57339] Updated weights for policy 0, policy_version 650308 (0.0029) [2024-04-28 15:52:20,629][57339] Updated weights for policy 0, policy_version 650318 (0.0031) [2024-04-28 15:52:22,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10654908416. Throughput: 0: 55672.8. Samples: 1145200620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:22,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 15:52:23,598][57339] Updated weights for policy 0, policy_version 650328 (0.0029) [2024-04-28 15:52:26,460][57339] Updated weights for policy 0, policy_version 650338 (0.0029) [2024-04-28 15:52:27,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10655186944. Throughput: 0: 55662.5. Samples: 1145530460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:27,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 15:52:29,398][57339] Updated weights for policy 0, policy_version 650348 (0.0027) [2024-04-28 15:52:32,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10655432704. Throughput: 0: 55585.0. Samples: 1145861120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:32,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 15:52:32,332][57339] Updated weights for policy 0, policy_version 650358 (0.0030) [2024-04-28 15:52:35,390][57339] Updated weights for policy 0, policy_version 650368 (0.0025) [2024-04-28 15:52:37,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10655727616. Throughput: 0: 55688.0. Samples: 1146033260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:37,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 15:52:38,385][57339] Updated weights for policy 0, policy_version 650378 (0.0025) [2024-04-28 15:52:41,335][57339] Updated weights for policy 0, policy_version 650388 (0.0027) [2024-04-28 15:52:42,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10656022528. Throughput: 0: 55533.3. Samples: 1146359380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:42,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:52:42,974][57319] Signal inference workers to stop experience collection... (17300 times) [2024-04-28 15:52:42,984][57319] Signal inference workers to resume experience collection... (17300 times) [2024-04-28 15:52:42,993][57339] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-04-28 15:52:42,994][57339] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-04-28 15:52:44,162][57339] Updated weights for policy 0, policy_version 650398 (0.0034) [2024-04-28 15:52:47,150][57339] Updated weights for policy 0, policy_version 650408 (0.0031) [2024-04-28 15:52:47,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10656284672. Throughput: 0: 55681.7. Samples: 1146700260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:47,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 15:52:49,893][57339] Updated weights for policy 0, policy_version 650418 (0.0032) [2024-04-28 15:52:52,169][57108] Fps is (10 sec: 52429.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10656546816. Throughput: 0: 55561.4. Samples: 1146864200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:52,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 15:52:53,148][57339] Updated weights for policy 0, policy_version 650428 (0.0030) [2024-04-28 15:52:55,748][57339] Updated weights for policy 0, policy_version 650438 (0.0034) [2024-04-28 15:52:57,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10656858112. Throughput: 0: 55371.9. Samples: 1147194060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:52:57,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 15:52:58,977][57339] Updated weights for policy 0, policy_version 650448 (0.0027) [2024-04-28 15:53:01,758][57339] Updated weights for policy 0, policy_version 650458 (0.0033) [2024-04-28 15:53:02,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10657120256. Throughput: 0: 55320.4. Samples: 1147525760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:53:02,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 15:53:04,803][57339] Updated weights for policy 0, policy_version 650468 (0.0028) [2024-04-28 15:53:07,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10657398784. Throughput: 0: 55471.1. Samples: 1147696820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 15:53:07,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 15:53:07,702][57339] Updated weights for policy 0, policy_version 650478 (0.0030) [2024-04-28 15:53:10,585][57339] Updated weights for policy 0, policy_version 650488 (0.0027) [2024-04-28 15:53:12,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10657660928. Throughput: 0: 55500.9. Samples: 1148028000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:12,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 15:53:13,525][57339] Updated weights for policy 0, policy_version 650498 (0.0031) [2024-04-28 15:53:16,638][57339] Updated weights for policy 0, policy_version 650508 (0.0028) [2024-04-28 15:53:17,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10657955840. Throughput: 0: 55622.1. Samples: 1148364120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:17,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 15:53:19,417][57339] Updated weights for policy 0, policy_version 650518 (0.0026) [2024-04-28 15:53:22,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10658217984. Throughput: 0: 55403.1. Samples: 1148526400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:22,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 15:53:22,421][57339] Updated weights for policy 0, policy_version 650528 (0.0029) [2024-04-28 15:53:25,225][57339] Updated weights for policy 0, policy_version 650538 (0.0028) [2024-04-28 15:53:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10658496512. Throughput: 0: 55557.9. Samples: 1148859480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:27,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 15:53:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000650543_10658496512.pth... [2024-04-28 15:53:27,243][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000649728_10645143552.pth [2024-04-28 15:53:28,390][57339] Updated weights for policy 0, policy_version 650548 (0.0029) [2024-04-28 15:53:31,231][57339] Updated weights for policy 0, policy_version 650558 (0.0026) [2024-04-28 15:53:32,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10658791424. Throughput: 0: 55439.2. Samples: 1149195020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:32,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 15:53:34,293][57339] Updated weights for policy 0, policy_version 650568 (0.0029) [2024-04-28 15:53:36,924][57339] Updated weights for policy 0, policy_version 650578 (0.0026) [2024-04-28 15:53:37,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 55816.6). 
Total num frames: 10659069952. Throughput: 0: 55427.8. Samples: 1149358460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:37,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 15:53:40,250][57339] Updated weights for policy 0, policy_version 650588 (0.0035) [2024-04-28 15:53:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10659364864. Throughput: 0: 55607.9. Samples: 1149696420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:42,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 15:53:42,973][57339] Updated weights for policy 0, policy_version 650598 (0.0031) [2024-04-28 15:53:46,043][57339] Updated weights for policy 0, policy_version 650608 (0.0028) [2024-04-28 15:53:47,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10659610624. Throughput: 0: 55622.3. Samples: 1150028760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:47,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 15:53:48,850][57339] Updated weights for policy 0, policy_version 650618 (0.0033) [2024-04-28 15:53:51,905][57339] Updated weights for policy 0, policy_version 650628 (0.0028) [2024-04-28 15:53:52,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10659889152. Throughput: 0: 55442.8. Samples: 1150191740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:52,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 15:53:53,163][57319] Signal inference workers to stop experience collection... (17350 times) [2024-04-28 15:53:53,201][57339] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-04-28 15:53:53,255][57319] Signal inference workers to resume experience collection... 
(17350 times) [2024-04-28 15:53:53,255][57339] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-04-28 15:53:54,858][57339] Updated weights for policy 0, policy_version 650638 (0.0029) [2024-04-28 15:53:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10660167680. Throughput: 0: 55561.0. Samples: 1150528240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:53:57,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 15:53:57,616][57339] Updated weights for policy 0, policy_version 650648 (0.0031) [2024-04-28 15:54:00,661][57339] Updated weights for policy 0, policy_version 650658 (0.0034) [2024-04-28 15:54:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10660446208. Throughput: 0: 55659.6. Samples: 1150868800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:54:02,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 15:54:03,395][57339] Updated weights for policy 0, policy_version 650668 (0.0030) [2024-04-28 15:54:06,414][57339] Updated weights for policy 0, policy_version 650678 (0.0031) [2024-04-28 15:54:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10660724736. Throughput: 0: 55706.1. Samples: 1151033180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:54:07,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 15:54:09,350][57339] Updated weights for policy 0, policy_version 650688 (0.0030) [2024-04-28 15:54:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10661019648. Throughput: 0: 55808.9. Samples: 1151370880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:54:12,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 15:54:12,283][57339] Updated weights for policy 0, policy_version 650698 (0.0028) [2024-04-28 15:54:15,332][57339] Updated weights for policy 0, policy_version 650708 (0.0028) [2024-04-28 15:54:17,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10661298176. Throughput: 0: 55883.7. Samples: 1151709780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:54:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 15:54:18,042][57339] Updated weights for policy 0, policy_version 650718 (0.0026) [2024-04-28 15:54:21,129][57339] Updated weights for policy 0, policy_version 650728 (0.0031) [2024-04-28 15:54:22,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10661576704. Throughput: 0: 56044.1. Samples: 1151880440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:54:22,170][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 15:54:23,773][57339] Updated weights for policy 0, policy_version 650738 (0.0029) [2024-04-28 15:54:27,102][57339] Updated weights for policy 0, policy_version 650748 (0.0028) [2024-04-28 15:54:27,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10661855232. Throughput: 0: 55951.5. Samples: 1152214240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:54:27,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 15:54:29,621][57339] Updated weights for policy 0, policy_version 650758 (0.0028) [2024-04-28 15:54:32,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10662133760. Throughput: 0: 55870.9. Samples: 1152542960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:54:32,170][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 15:54:32,849][57339] Updated weights for policy 0, policy_version 650768 (0.0034) [2024-04-28 15:54:35,692][57339] Updated weights for policy 0, policy_version 650778 (0.0034) [2024-04-28 15:54:37,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 10662395904. Throughput: 0: 56004.9. Samples: 1152711960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:54:37,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 15:54:38,657][57339] Updated weights for policy 0, policy_version 650788 (0.0029) [2024-04-28 15:54:41,550][57339] Updated weights for policy 0, policy_version 650798 (0.0027) [2024-04-28 15:54:42,169][57108] Fps is (10 sec: 55706.9, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 10662690816. Throughput: 0: 56089.9. Samples: 1153052280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-04-28 15:54:42,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 15:54:44,489][57339] Updated weights for policy 0, policy_version 650808 (0.0032) [2024-04-28 15:54:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10662969344. Throughput: 0: 55968.5. Samples: 1153387380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:54:47,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 15:54:47,609][57339] Updated weights for policy 0, policy_version 650818 (0.0027) [2024-04-28 15:54:50,364][57339] Updated weights for policy 0, policy_version 650828 (0.0027) [2024-04-28 15:54:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10663247872. Throughput: 0: 55935.7. Samples: 1153550280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:54:52,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 15:54:53,483][57339] Updated weights for policy 0, policy_version 650838 (0.0031) [2024-04-28 15:54:56,079][57339] Updated weights for policy 0, policy_version 650848 (0.0027) [2024-04-28 15:54:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10663526400. Throughput: 0: 55978.7. Samples: 1153889920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:54:57,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 15:54:59,214][57339] Updated weights for policy 0, policy_version 650858 (0.0032) [2024-04-28 15:55:02,055][57339] Updated weights for policy 0, policy_version 650868 (0.0025) [2024-04-28 15:55:02,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10663821312. Throughput: 0: 55843.9. Samples: 1154222760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:02,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:55:04,924][57339] Updated weights for policy 0, policy_version 650878 (0.0029) [2024-04-28 15:55:07,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 10664099840. Throughput: 0: 55827.2. Samples: 1154392660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:07,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 15:55:07,949][57339] Updated weights for policy 0, policy_version 650888 (0.0029) [2024-04-28 15:55:10,898][57339] Updated weights for policy 0, policy_version 650898 (0.0031) [2024-04-28 15:55:11,963][57319] Signal inference workers to stop experience collection... (17400 times) [2024-04-28 15:55:12,014][57339] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-04-28 15:55:12,024][57319] Signal inference workers to resume experience collection... 
(17400 times) [2024-04-28 15:55:12,029][57339] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-04-28 15:55:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10664361984. Throughput: 0: 55782.3. Samples: 1154724440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:12,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 15:55:13,977][57339] Updated weights for policy 0, policy_version 650908 (0.0030) [2024-04-28 15:55:16,820][57339] Updated weights for policy 0, policy_version 650918 (0.0037) [2024-04-28 15:55:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10664656896. Throughput: 0: 55944.0. Samples: 1155060440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:17,170][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 15:55:19,694][57339] Updated weights for policy 0, policy_version 650928 (0.0028) [2024-04-28 15:55:22,170][57108] Fps is (10 sec: 55697.8, 60 sec: 55704.4, 300 sec: 55594.3). Total num frames: 10664919040. Throughput: 0: 55840.5. Samples: 1155224860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:22,171][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 15:55:22,714][57339] Updated weights for policy 0, policy_version 650938 (0.0027) [2024-04-28 15:55:25,482][57339] Updated weights for policy 0, policy_version 650948 (0.0026) [2024-04-28 15:55:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 10665213952. Throughput: 0: 55653.9. Samples: 1155556720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:27,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 15:55:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000650953_10665213952.pth... 
[2024-04-28 15:55:27,234][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000650137_10651844608.pth [2024-04-28 15:55:28,530][57339] Updated weights for policy 0, policy_version 650958 (0.0028) [2024-04-28 15:55:31,409][57339] Updated weights for policy 0, policy_version 650968 (0.0026) [2024-04-28 15:55:32,169][57108] Fps is (10 sec: 55712.8, 60 sec: 55705.7, 300 sec: 55650.2). Total num frames: 10665476096. Throughput: 0: 55694.6. Samples: 1155893640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:32,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 15:55:34,613][57339] Updated weights for policy 0, policy_version 650978 (0.0025) [2024-04-28 15:55:37,169][57108] Fps is (10 sec: 55706.3, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 10665771008. Throughput: 0: 55930.5. Samples: 1156067160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:37,169][57108] Avg episode reward: [(0, '0.512')] [2024-04-28 15:55:37,320][57339] Updated weights for policy 0, policy_version 650988 (0.0031) [2024-04-28 15:55:40,670][57339] Updated weights for policy 0, policy_version 650998 (0.0022) [2024-04-28 15:55:42,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10666049536. Throughput: 0: 55892.0. Samples: 1156405060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:42,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 15:55:42,984][57339] Updated weights for policy 0, policy_version 651008 (0.0034) [2024-04-28 15:55:46,340][57339] Updated weights for policy 0, policy_version 651018 (0.0028) [2024-04-28 15:55:47,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10666328064. Throughput: 0: 55913.7. Samples: 1156738880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:47,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 15:55:48,763][57339] Updated weights for policy 0, policy_version 651028 (0.0024) [2024-04-28 15:55:52,051][57339] Updated weights for policy 0, policy_version 651038 (0.0028) [2024-04-28 15:55:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10666606592. Throughput: 0: 55827.2. Samples: 1156904880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:52,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:55:54,846][57339] Updated weights for policy 0, policy_version 651048 (0.0027) [2024-04-28 15:55:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10666901504. Throughput: 0: 56000.8. Samples: 1157244480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:55:57,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 15:55:57,998][57339] Updated weights for policy 0, policy_version 651058 (0.0030) [2024-04-28 15:56:00,710][57339] Updated weights for policy 0, policy_version 651068 (0.0034) [2024-04-28 15:56:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10667163648. Throughput: 0: 55975.3. Samples: 1157579320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:56:02,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 15:56:03,736][57339] Updated weights for policy 0, policy_version 651078 (0.0030) [2024-04-28 15:56:06,398][57339] Updated weights for policy 0, policy_version 651088 (0.0026) [2024-04-28 15:56:07,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10667425792. Throughput: 0: 55938.1. Samples: 1157742000. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:56:07,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 15:56:07,488][57319] Signal inference workers to stop experience collection... (17450 times) [2024-04-28 15:56:07,489][57319] Signal inference workers to resume experience collection... (17450 times) [2024-04-28 15:56:07,511][57339] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-04-28 15:56:07,511][57339] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-04-28 15:56:09,610][57339] Updated weights for policy 0, policy_version 651098 (0.0028) [2024-04-28 15:56:12,169][57108] Fps is (10 sec: 57343.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10667737088. Throughput: 0: 56030.3. Samples: 1158078080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:56:12,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 15:56:12,357][57339] Updated weights for policy 0, policy_version 651108 (0.0026) [2024-04-28 15:56:15,486][57339] Updated weights for policy 0, policy_version 651118 (0.0037) [2024-04-28 15:56:17,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10668015616. Throughput: 0: 55999.5. Samples: 1158413620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 15:56:17,170][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 15:56:18,097][57339] Updated weights for policy 0, policy_version 651128 (0.0029) [2024-04-28 15:56:21,229][57339] Updated weights for policy 0, policy_version 651138 (0.0032) [2024-04-28 15:56:22,169][57108] Fps is (10 sec: 52430.0, 60 sec: 55707.0, 300 sec: 55650.1). Total num frames: 10668261376. Throughput: 0: 56068.2. Samples: 1158590220. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:56:22,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 15:56:23,998][57339] Updated weights for policy 0, policy_version 651148 (0.0023) [2024-04-28 15:56:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10668556288. Throughput: 0: 55944.7. Samples: 1158922580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:56:27,170][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 15:56:27,320][57339] Updated weights for policy 0, policy_version 651158 (0.0026) [2024-04-28 15:56:29,718][57339] Updated weights for policy 0, policy_version 651168 (0.0030) [2024-04-28 15:56:32,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10668834816. Throughput: 0: 55853.0. Samples: 1159252260. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:56:32,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 15:56:33,031][57339] Updated weights for policy 0, policy_version 651178 (0.0024) [2024-04-28 15:56:35,661][57339] Updated weights for policy 0, policy_version 651188 (0.0026) [2024-04-28 15:56:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10669113344. Throughput: 0: 55936.9. Samples: 1159422040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:56:37,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 15:56:38,936][57339] Updated weights for policy 0, policy_version 651198 (0.0031) [2024-04-28 15:56:42,009][57339] Updated weights for policy 0, policy_version 651208 (0.0030) [2024-04-28 15:56:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10669408256. Throughput: 0: 55855.9. Samples: 1159758000. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:56:42,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 15:56:44,757][57339] Updated weights for policy 0, policy_version 651218 (0.0029) [2024-04-28 15:56:47,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10669686784. Throughput: 0: 55779.2. Samples: 1160089380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:56:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 15:56:47,876][57339] Updated weights for policy 0, policy_version 651228 (0.0024) [2024-04-28 15:56:50,427][57339] Updated weights for policy 0, policy_version 651238 (0.0028) [2024-04-28 15:56:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10669965312. Throughput: 0: 55984.4. Samples: 1160261300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:56:52,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 15:56:53,726][57339] Updated weights for policy 0, policy_version 651248 (0.0029) [2024-04-28 15:56:56,266][57339] Updated weights for policy 0, policy_version 651258 (0.0026) [2024-04-28 15:56:57,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10670227456. Throughput: 0: 56032.3. Samples: 1160599520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:56:57,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 15:56:59,476][57339] Updated weights for policy 0, policy_version 651268 (0.0028) [2024-04-28 15:57:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10670522368. Throughput: 0: 56081.4. Samples: 1160937280. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:02,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 15:57:02,211][57339] Updated weights for policy 0, policy_version 651278 (0.0035) [2024-04-28 15:57:05,260][57339] Updated weights for policy 0, policy_version 651288 (0.0028) [2024-04-28 15:57:06,466][57319] Signal inference workers to stop experience collection... (17500 times) [2024-04-28 15:57:06,466][57319] Signal inference workers to resume experience collection... (17500 times) [2024-04-28 15:57:06,477][57339] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-04-28 15:57:06,477][57339] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-04-28 15:57:07,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10670784512. Throughput: 0: 55617.2. Samples: 1161093000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:07,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 15:57:08,286][57339] Updated weights for policy 0, policy_version 651298 (0.0030) [2024-04-28 15:57:11,156][57339] Updated weights for policy 0, policy_version 651308 (0.0029) [2024-04-28 15:57:12,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 10671079424. Throughput: 0: 55734.3. Samples: 1161430620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:12,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 15:57:14,039][57339] Updated weights for policy 0, policy_version 651318 (0.0028) [2024-04-28 15:57:17,089][57339] Updated weights for policy 0, policy_version 651328 (0.0026) [2024-04-28 15:57:17,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10671357952. Throughput: 0: 55817.0. Samples: 1161764020. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:17,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 15:57:19,894][57339] Updated weights for policy 0, policy_version 651338 (0.0030) [2024-04-28 15:57:22,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56524.6, 300 sec: 55816.7). Total num frames: 10671652864. Throughput: 0: 55801.6. Samples: 1161933120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:22,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 15:57:22,800][57339] Updated weights for policy 0, policy_version 651348 (0.0034) [2024-04-28 15:57:25,722][57339] Updated weights for policy 0, policy_version 651358 (0.0030) [2024-04-28 15:57:27,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10671898624. Throughput: 0: 55818.7. Samples: 1162269840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:27,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 15:57:27,293][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000651362_10671915008.pth... [2024-04-28 15:57:27,349][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000650543_10658496512.pth [2024-04-28 15:57:28,846][57339] Updated weights for policy 0, policy_version 651368 (0.0030) [2024-04-28 15:57:31,464][57339] Updated weights for policy 0, policy_version 651378 (0.0028) [2024-04-28 15:57:32,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10672177152. Throughput: 0: 55765.3. Samples: 1162598820. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:32,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 15:57:34,774][57339] Updated weights for policy 0, policy_version 651388 (0.0027) [2024-04-28 15:57:37,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10672472064. Throughput: 0: 55882.7. Samples: 1162776020. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:37,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:57:37,304][57339] Updated weights for policy 0, policy_version 651398 (0.0025) [2024-04-28 15:57:40,501][57339] Updated weights for policy 0, policy_version 651408 (0.0031) [2024-04-28 15:57:42,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10672750592. Throughput: 0: 55628.7. Samples: 1163102820. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:42,170][57108] Avg episode reward: [(0, '0.700')] [2024-04-28 15:57:43,150][57339] Updated weights for policy 0, policy_version 651418 (0.0031) [2024-04-28 15:57:46,302][57339] Updated weights for policy 0, policy_version 651428 (0.0029) [2024-04-28 15:57:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10673029120. Throughput: 0: 55621.9. Samples: 1163440260. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:47,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 15:57:49,149][57339] Updated weights for policy 0, policy_version 651438 (0.0026) [2024-04-28 15:57:52,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10673307648. Throughput: 0: 55920.2. Samples: 1163609400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:52,169][57108] Avg episode reward: [(0, '0.700')] [2024-04-28 15:57:52,238][57339] Updated weights for policy 0, policy_version 651448 (0.0025) [2024-04-28 15:57:55,262][57339] Updated weights for policy 0, policy_version 651458 (0.0034) [2024-04-28 15:57:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10673586176. Throughput: 0: 55816.0. Samples: 1163942340. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-04-28 15:57:57,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 15:57:58,163][57339] Updated weights for policy 0, policy_version 651468 (0.0027) [2024-04-28 15:58:00,994][57339] Updated weights for policy 0, policy_version 651478 (0.0029) [2024-04-28 15:58:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10673864704. Throughput: 0: 55779.9. Samples: 1164274120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:02,169][57108] Avg episode reward: [(0, '0.702')] [2024-04-28 15:58:03,949][57339] Updated weights for policy 0, policy_version 651488 (0.0030) [2024-04-28 15:58:06,938][57339] Updated weights for policy 0, policy_version 651498 (0.0032) [2024-04-28 15:58:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10674143232. Throughput: 0: 55765.0. Samples: 1164442540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:07,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 15:58:09,797][57339] Updated weights for policy 0, policy_version 651508 (0.0033) [2024-04-28 15:58:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10674405376. Throughput: 0: 55629.5. Samples: 1164773160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:12,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 15:58:12,909][57339] Updated weights for policy 0, policy_version 651518 (0.0025) [2024-04-28 15:58:15,504][57319] Signal inference workers to stop experience collection... (17550 times) [2024-04-28 15:58:15,504][57319] Signal inference workers to resume experience collection... 
(17550 times) [2024-04-28 15:58:15,542][57339] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-04-28 15:58:15,542][57339] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-04-28 15:58:15,761][57339] Updated weights for policy 0, policy_version 651528 (0.0025) [2024-04-28 15:58:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10674700288. Throughput: 0: 55704.0. Samples: 1165105500. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:17,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 15:58:18,636][57339] Updated weights for policy 0, policy_version 651538 (0.0027) [2024-04-28 15:58:22,146][57339] Updated weights for policy 0, policy_version 651548 (0.0025) [2024-04-28 15:58:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 10674962432. Throughput: 0: 55561.5. Samples: 1165276280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:22,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 15:58:24,436][57339] Updated weights for policy 0, policy_version 651558 (0.0031) [2024-04-28 15:58:27,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10675240960. Throughput: 0: 55633.8. Samples: 1165606340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:27,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 15:58:27,924][57339] Updated weights for policy 0, policy_version 651568 (0.0028) [2024-04-28 15:58:30,568][57339] Updated weights for policy 0, policy_version 651578 (0.0029) [2024-04-28 15:58:32,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10675519488. Throughput: 0: 55559.1. Samples: 1165940420. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:32,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 15:58:33,727][57339] Updated weights for policy 0, policy_version 651588 (0.0029) [2024-04-28 15:58:36,719][57339] Updated weights for policy 0, policy_version 651598 (0.0030) [2024-04-28 15:58:37,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10675798016. Throughput: 0: 55425.5. Samples: 1166103560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:37,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 15:58:39,713][57339] Updated weights for policy 0, policy_version 651608 (0.0028) [2024-04-28 15:58:42,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10676076544. Throughput: 0: 55349.2. Samples: 1166433060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:42,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 15:58:42,509][57339] Updated weights for policy 0, policy_version 651618 (0.0033) [2024-04-28 15:58:45,571][57339] Updated weights for policy 0, policy_version 651628 (0.0030) [2024-04-28 15:58:47,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10676355072. Throughput: 0: 55269.8. Samples: 1166761260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:47,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 15:58:48,547][57339] Updated weights for policy 0, policy_version 651638 (0.0029) [2024-04-28 15:58:51,432][57339] Updated weights for policy 0, policy_version 651648 (0.0032) [2024-04-28 15:58:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 10676649984. Throughput: 0: 55511.8. Samples: 1166940580. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:52,170][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 15:58:54,378][57339] Updated weights for policy 0, policy_version 651658 (0.0031) [2024-04-28 15:58:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10676912128. Throughput: 0: 55667.5. Samples: 1167278200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:58:57,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 15:58:57,221][57339] Updated weights for policy 0, policy_version 651668 (0.0026) [2024-04-28 15:59:00,128][57339] Updated weights for policy 0, policy_version 651678 (0.0032) [2024-04-28 15:59:02,169][57108] Fps is (10 sec: 52429.9, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 10677174272. Throughput: 0: 55578.8. Samples: 1167606540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:59:02,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 15:59:03,125][57339] Updated weights for policy 0, policy_version 651688 (0.0032) [2024-04-28 15:59:05,984][57339] Updated weights for policy 0, policy_version 651698 (0.0033) [2024-04-28 15:59:07,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10677469184. Throughput: 0: 55514.1. Samples: 1167774420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:59:07,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 15:59:09,151][57339] Updated weights for policy 0, policy_version 651708 (0.0033) [2024-04-28 15:59:11,858][57339] Updated weights for policy 0, policy_version 651718 (0.0027) [2024-04-28 15:59:12,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10677764096. Throughput: 0: 55626.2. Samples: 1168109520. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:59:12,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 15:59:14,876][57339] Updated weights for policy 0, policy_version 651728 (0.0028) [2024-04-28 15:59:17,169][57108] Fps is (10 sec: 55704.3, 60 sec: 55432.3, 300 sec: 55761.1). Total num frames: 10678026240. Throughput: 0: 55673.9. Samples: 1168445760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:59:17,170][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 15:59:17,640][57339] Updated weights for policy 0, policy_version 651738 (0.0034) [2024-04-28 15:59:18,332][57319] Signal inference workers to stop experience collection... (17600 times) [2024-04-28 15:59:18,339][57319] Signal inference workers to resume experience collection... (17600 times) [2024-04-28 15:59:18,358][57339] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-04-28 15:59:18,358][57339] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-04-28 15:59:20,686][57339] Updated weights for policy 0, policy_version 651748 (0.0035) [2024-04-28 15:59:22,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 10678304768. Throughput: 0: 55840.0. Samples: 1168616360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:59:22,170][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 15:59:23,585][57339] Updated weights for policy 0, policy_version 651758 (0.0030) [2024-04-28 15:59:26,484][57339] Updated weights for policy 0, policy_version 651768 (0.0029) [2024-04-28 15:59:27,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10678599680. Throughput: 0: 55918.2. Samples: 1168949380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:59:27,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 15:59:27,199][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000651771_10678616064.pth... 
[2024-04-28 15:59:27,246][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000650953_10665213952.pth [2024-04-28 15:59:29,461][57339] Updated weights for policy 0, policy_version 651778 (0.0033) [2024-04-28 15:59:32,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10678861824. Throughput: 0: 56085.4. Samples: 1169285100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-04-28 15:59:32,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 15:59:32,318][57339] Updated weights for policy 0, policy_version 651788 (0.0029) [2024-04-28 15:59:35,241][57339] Updated weights for policy 0, policy_version 651798 (0.0028) [2024-04-28 15:59:37,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10679140352. Throughput: 0: 55857.8. Samples: 1169454180. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 15:59:37,170][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 15:59:38,060][57339] Updated weights for policy 0, policy_version 651808 (0.0020) [2024-04-28 15:59:41,027][57339] Updated weights for policy 0, policy_version 651818 (0.0027) [2024-04-28 15:59:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10679418880. Throughput: 0: 55802.7. Samples: 1169789320. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 15:59:42,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 15:59:43,836][57339] Updated weights for policy 0, policy_version 651828 (0.0037) [2024-04-28 15:59:46,890][57339] Updated weights for policy 0, policy_version 651838 (0.0029) [2024-04-28 15:59:47,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 10679713792. Throughput: 0: 55912.6. Samples: 1170122620. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 15:59:47,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 15:59:49,733][57339] Updated weights for policy 0, policy_version 651848 (0.0035) [2024-04-28 15:59:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 10679992320. Throughput: 0: 55752.0. Samples: 1170283260. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 15:59:52,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 15:59:52,842][57339] Updated weights for policy 0, policy_version 651858 (0.0025) [2024-04-28 15:59:55,900][57339] Updated weights for policy 0, policy_version 651868 (0.0028) [2024-04-28 15:59:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 10680254464. Throughput: 0: 55647.9. Samples: 1170613680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 15:59:57,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 15:59:58,615][57339] Updated weights for policy 0, policy_version 651878 (0.0030) [2024-04-28 16:00:01,860][57339] Updated weights for policy 0, policy_version 651888 (0.0027) [2024-04-28 16:00:02,169][57108] Fps is (10 sec: 55705.0, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10680549376. Throughput: 0: 55596.2. Samples: 1170947580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:02,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 16:00:04,622][57339] Updated weights for policy 0, policy_version 651898 (0.0027) [2024-04-28 16:00:07,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10680827904. Throughput: 0: 55534.9. Samples: 1171115420. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:07,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 16:00:07,648][57339] Updated weights for policy 0, policy_version 651908 (0.0036) [2024-04-28 16:00:10,495][57339] Updated weights for policy 0, policy_version 651918 (0.0030) [2024-04-28 16:00:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10681090048. Throughput: 0: 55383.4. Samples: 1171441640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:12,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 16:00:13,456][57339] Updated weights for policy 0, policy_version 651928 (0.0030) [2024-04-28 16:00:16,686][57339] Updated weights for policy 0, policy_version 651938 (0.0039) [2024-04-28 16:00:17,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.8, 300 sec: 55705.9). Total num frames: 10681352192. Throughput: 0: 55378.7. Samples: 1171777140. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:17,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 16:00:19,316][57339] Updated weights for policy 0, policy_version 651948 (0.0036) [2024-04-28 16:00:22,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 10681647104. Throughput: 0: 55205.1. Samples: 1171938400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:22,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 16:00:22,834][57339] Updated weights for policy 0, policy_version 651958 (0.0027) [2024-04-28 16:00:25,153][57339] Updated weights for policy 0, policy_version 651968 (0.0032) [2024-04-28 16:00:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10681925632. Throughput: 0: 55084.0. Samples: 1172268100. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:27,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:00:28,742][57339] Updated weights for policy 0, policy_version 651978 (0.0025) [2024-04-28 16:00:29,943][57319] Signal inference workers to stop experience collection... (17650 times) [2024-04-28 16:00:29,943][57319] Signal inference workers to resume experience collection... (17650 times) [2024-04-28 16:00:29,969][57339] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-04-28 16:00:29,969][57339] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-04-28 16:00:31,192][57339] Updated weights for policy 0, policy_version 651988 (0.0029) [2024-04-28 16:00:32,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 10682204160. Throughput: 0: 55119.2. Samples: 1172602980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:32,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 16:00:34,830][57339] Updated weights for policy 0, policy_version 651998 (0.0031) [2024-04-28 16:00:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10682482688. Throughput: 0: 55396.9. Samples: 1172776120. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:37,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 16:00:37,183][57339] Updated weights for policy 0, policy_version 652008 (0.0026) [2024-04-28 16:00:40,637][57339] Updated weights for policy 0, policy_version 652018 (0.0028) [2024-04-28 16:00:42,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10682744832. Throughput: 0: 55392.6. Samples: 1173106340. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:42,169][57108] Avg episode reward: [(0, '0.706')] [2024-04-28 16:00:43,217][57339] Updated weights for policy 0, policy_version 652028 (0.0030) [2024-04-28 16:00:46,396][57339] Updated weights for policy 0, policy_version 652038 (0.0028) [2024-04-28 16:00:47,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.6, 300 sec: 55650.0). Total num frames: 10683023360. Throughput: 0: 55324.0. Samples: 1173437160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:47,170][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 16:00:49,249][57339] Updated weights for policy 0, policy_version 652048 (0.0037) [2024-04-28 16:00:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 10683301888. Throughput: 0: 55312.7. Samples: 1173604500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:52,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 16:00:52,353][57339] Updated weights for policy 0, policy_version 652058 (0.0026) [2024-04-28 16:00:55,158][57339] Updated weights for policy 0, policy_version 652068 (0.0028) [2024-04-28 16:00:57,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10683580416. Throughput: 0: 55439.1. Samples: 1173936400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:00:57,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 16:00:58,328][57339] Updated weights for policy 0, policy_version 652078 (0.0029) [2024-04-28 16:01:00,924][57339] Updated weights for policy 0, policy_version 652088 (0.0033) [2024-04-28 16:01:02,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10683858944. Throughput: 0: 55337.7. Samples: 1174267340. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:01:02,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 16:01:04,292][57339] Updated weights for policy 0, policy_version 652098 (0.0038) [2024-04-28 16:01:06,770][57339] Updated weights for policy 0, policy_version 652108 (0.0025) [2024-04-28 16:01:07,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10684137472. Throughput: 0: 55446.5. Samples: 1174433500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-04-28 16:01:07,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:01:10,101][57339] Updated weights for policy 0, policy_version 652118 (0.0028) [2024-04-28 16:01:12,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10684432384. Throughput: 0: 55624.2. Samples: 1174771200. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:12,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 16:01:12,514][57339] Updated weights for policy 0, policy_version 652128 (0.0025) [2024-04-28 16:01:15,883][57339] Updated weights for policy 0, policy_version 652138 (0.0030) [2024-04-28 16:01:17,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10684694528. Throughput: 0: 55766.8. Samples: 1175112480. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:17,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 16:01:18,451][57339] Updated weights for policy 0, policy_version 652148 (0.0031) [2024-04-28 16:01:21,887][57339] Updated weights for policy 0, policy_version 652158 (0.0033) [2024-04-28 16:01:22,169][57108] Fps is (10 sec: 52430.1, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10684956672. Throughput: 0: 55414.3. Samples: 1175269760. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:22,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 16:01:24,629][57339] Updated weights for policy 0, policy_version 652168 (0.0026) [2024-04-28 16:01:27,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10685251584. Throughput: 0: 55496.5. Samples: 1175603680. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:27,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 16:01:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000652176_10685251584.pth... [2024-04-28 16:01:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000651362_10671915008.pth [2024-04-28 16:01:27,743][57339] Updated weights for policy 0, policy_version 652178 (0.0031) [2024-04-28 16:01:30,614][57339] Updated weights for policy 0, policy_version 652188 (0.0031) [2024-04-28 16:01:32,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 10685530112. Throughput: 0: 55592.0. Samples: 1175938800. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:32,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 16:01:33,553][57339] Updated weights for policy 0, policy_version 652198 (0.0026) [2024-04-28 16:01:36,355][57339] Updated weights for policy 0, policy_version 652208 (0.0028) [2024-04-28 16:01:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10685808640. Throughput: 0: 55560.6. Samples: 1176104720. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:37,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 16:01:39,390][57339] Updated weights for policy 0, policy_version 652218 (0.0029) [2024-04-28 16:01:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10686087168. Throughput: 0: 55646.4. Samples: 1176440480. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:42,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 16:01:42,282][57339] Updated weights for policy 0, policy_version 652228 (0.0029) [2024-04-28 16:01:45,219][57339] Updated weights for policy 0, policy_version 652238 (0.0025) [2024-04-28 16:01:47,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10686365696. Throughput: 0: 55599.4. Samples: 1176769320. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:47,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 16:01:47,466][57319] Signal inference workers to stop experience collection... (17700 times) [2024-04-28 16:01:47,466][57319] Signal inference workers to resume experience collection... (17700 times) [2024-04-28 16:01:47,476][57339] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-04-28 16:01:47,476][57339] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-04-28 16:01:48,109][57339] Updated weights for policy 0, policy_version 652248 (0.0027) [2024-04-28 16:01:51,132][57339] Updated weights for policy 0, policy_version 652258 (0.0033) [2024-04-28 16:01:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 10686644224. Throughput: 0: 55787.1. Samples: 1176943920. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:52,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 16:01:53,965][57339] Updated weights for policy 0, policy_version 652268 (0.0028) [2024-04-28 16:01:56,997][57339] Updated weights for policy 0, policy_version 652278 (0.0030) [2024-04-28 16:01:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10686922752. Throughput: 0: 55755.5. Samples: 1177280200. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:01:57,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 16:01:59,726][57339] Updated weights for policy 0, policy_version 652288 (0.0047) [2024-04-28 16:02:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10687184896. Throughput: 0: 55603.5. Samples: 1177614640. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:02,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:02:02,843][57339] Updated weights for policy 0, policy_version 652298 (0.0027) [2024-04-28 16:02:05,607][57339] Updated weights for policy 0, policy_version 652308 (0.0024) [2024-04-28 16:02:07,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10687463424. Throughput: 0: 55598.9. Samples: 1177771720. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:07,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 16:02:08,667][57339] Updated weights for policy 0, policy_version 652318 (0.0034) [2024-04-28 16:02:11,702][57339] Updated weights for policy 0, policy_version 652328 (0.0028) [2024-04-28 16:02:12,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10687758336. Throughput: 0: 55594.3. Samples: 1178105420. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:12,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 16:02:14,474][57339] Updated weights for policy 0, policy_version 652338 (0.0026) [2024-04-28 16:02:17,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10688036864. Throughput: 0: 55579.5. Samples: 1178439880. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:17,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 16:02:17,511][57339] Updated weights for policy 0, policy_version 652348 (0.0025) [2024-04-28 16:02:20,400][57339] Updated weights for policy 0, policy_version 652358 (0.0028) [2024-04-28 16:02:22,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 10688315392. Throughput: 0: 55760.8. Samples: 1178613960. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:22,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 16:02:23,285][57339] Updated weights for policy 0, policy_version 652368 (0.0026) [2024-04-28 16:02:26,229][57339] Updated weights for policy 0, policy_version 652378 (0.0025) [2024-04-28 16:02:27,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10688577536. Throughput: 0: 55730.8. Samples: 1178948360. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:27,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 16:02:29,230][57339] Updated weights for policy 0, policy_version 652388 (0.0029) [2024-04-28 16:02:32,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10688872448. Throughput: 0: 55763.1. Samples: 1179278660. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:32,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 16:02:32,440][57339] Updated weights for policy 0, policy_version 652398 (0.0029) [2024-04-28 16:02:35,202][57339] Updated weights for policy 0, policy_version 652408 (0.0032) [2024-04-28 16:02:37,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 10689118208. Throughput: 0: 55307.5. Samples: 1179432760. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:37,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 16:02:38,463][57339] Updated weights for policy 0, policy_version 652418 (0.0033) [2024-04-28 16:02:41,079][57339] Updated weights for policy 0, policy_version 652428 (0.0029) [2024-04-28 16:02:42,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 10689396736. Throughput: 0: 55405.8. Samples: 1179773460. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:42,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 16:02:44,373][57339] Updated weights for policy 0, policy_version 652438 (0.0032) [2024-04-28 16:02:45,854][57319] Signal inference workers to stop experience collection... (17750 times) [2024-04-28 16:02:45,877][57339] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-04-28 16:02:45,947][57319] Signal inference workers to resume experience collection... (17750 times) [2024-04-28 16:02:45,947][57339] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-04-28 16:02:46,846][57339] Updated weights for policy 0, policy_version 652448 (0.0024) [2024-04-28 16:02:47,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10689708032. Throughput: 0: 55465.7. Samples: 1180110600. Policy #0 lag: (min: 1.0, avg: 12.1, max: 26.0) [2024-04-28 16:02:47,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 16:02:50,294][57339] Updated weights for policy 0, policy_version 652458 (0.0024) [2024-04-28 16:02:52,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10689970176. Throughput: 0: 55690.8. Samples: 1180277800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:02:52,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 16:02:52,724][57339] Updated weights for policy 0, policy_version 652468 (0.0028) [2024-04-28 16:02:56,029][57339] Updated weights for policy 0, policy_version 652478 (0.0027) [2024-04-28 16:02:57,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 10690265088. Throughput: 0: 55776.9. Samples: 1180615380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:02:57,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 16:02:58,895][57339] Updated weights for policy 0, policy_version 652488 (0.0028) [2024-04-28 16:03:01,765][57339] Updated weights for policy 0, policy_version 652498 (0.0026) [2024-04-28 16:03:02,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10690527232. Throughput: 0: 55737.4. Samples: 1180948060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:02,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 16:03:04,882][57339] Updated weights for policy 0, policy_version 652508 (0.0030) [2024-04-28 16:03:07,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 10690822144. Throughput: 0: 55686.7. Samples: 1181119860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:07,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 16:03:07,728][57339] Updated weights for policy 0, policy_version 652518 (0.0031) [2024-04-28 16:03:10,571][57339] Updated weights for policy 0, policy_version 652528 (0.0029) [2024-04-28 16:03:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 10691100672. Throughput: 0: 55685.1. Samples: 1181454200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:12,170][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 16:03:13,651][57339] Updated weights for policy 0, policy_version 652538 (0.0029) [2024-04-28 16:03:16,402][57339] Updated weights for policy 0, policy_version 652548 (0.0040) [2024-04-28 16:03:17,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10691362816. Throughput: 0: 55754.4. Samples: 1181787600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:17,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 16:03:19,426][57339] Updated weights for policy 0, policy_version 652558 (0.0027) [2024-04-28 16:03:22,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10691657728. Throughput: 0: 56030.7. Samples: 1181954140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:22,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 16:03:22,390][57339] Updated weights for policy 0, policy_version 652568 (0.0031) [2024-04-28 16:03:25,284][57339] Updated weights for policy 0, policy_version 652578 (0.0027) [2024-04-28 16:03:27,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10691952640. Throughput: 0: 55991.4. Samples: 1182293060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:27,169][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 16:03:27,278][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000652586_10691969024.pth... [2024-04-28 16:03:27,322][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000651771_10678616064.pth [2024-04-28 16:03:28,252][57339] Updated weights for policy 0, policy_version 652588 (0.0029) [2024-04-28 16:03:31,214][57339] Updated weights for policy 0, policy_version 652598 (0.0026) [2024-04-28 16:03:32,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56251.8, 300 sec: 55761.1). 
Total num frames: 10692247552. Throughput: 0: 55972.5. Samples: 1182629360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:32,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 16:03:34,137][57339] Updated weights for policy 0, policy_version 652608 (0.0033) [2024-04-28 16:03:37,009][57339] Updated weights for policy 0, policy_version 652618 (0.0030) [2024-04-28 16:03:37,169][57108] Fps is (10 sec: 54066.3, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10692493312. Throughput: 0: 55948.3. Samples: 1182795480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:37,170][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 16:03:40,203][57339] Updated weights for policy 0, policy_version 652628 (0.0026) [2024-04-28 16:03:42,169][57108] Fps is (10 sec: 49152.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10692739072. Throughput: 0: 55776.8. Samples: 1183125340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:42,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 16:03:42,997][57339] Updated weights for policy 0, policy_version 652638 (0.0031) [2024-04-28 16:03:45,952][57339] Updated weights for policy 0, policy_version 652648 (0.0029) [2024-04-28 16:03:47,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 10693033984. Throughput: 0: 55808.1. Samples: 1183459420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:47,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:03:48,692][57339] Updated weights for policy 0, policy_version 652658 (0.0028) [2024-04-28 16:03:49,135][57319] Signal inference workers to stop experience collection... (17800 times) [2024-04-28 16:03:49,169][57339] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-04-28 16:03:49,225][57319] Signal inference workers to resume experience collection... 
(17800 times) [2024-04-28 16:03:49,225][57339] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-04-28 16:03:51,704][57339] Updated weights for policy 0, policy_version 652668 (0.0025) [2024-04-28 16:03:52,169][57108] Fps is (10 sec: 60620.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 10693345280. Throughput: 0: 55662.6. Samples: 1183624680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:52,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 16:03:54,560][57339] Updated weights for policy 0, policy_version 652678 (0.0029) [2024-04-28 16:03:57,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10693607424. Throughput: 0: 55695.6. Samples: 1183960500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:03:57,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 16:03:57,513][57339] Updated weights for policy 0, policy_version 652688 (0.0023) [2024-04-28 16:04:00,400][57339] Updated weights for policy 0, policy_version 652698 (0.0031) [2024-04-28 16:04:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10693902336. Throughput: 0: 55766.6. Samples: 1184297100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:04:02,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 16:04:03,307][57339] Updated weights for policy 0, policy_version 652708 (0.0030) [2024-04-28 16:04:06,258][57339] Updated weights for policy 0, policy_version 652718 (0.0030) [2024-04-28 16:04:07,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 10694197248. Throughput: 0: 55964.8. Samples: 1184472560. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:04:07,170][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 16:04:09,170][57339] Updated weights for policy 0, policy_version 652728 (0.0028) [2024-04-28 16:04:12,069][57339] Updated weights for policy 0, policy_version 652738 (0.0035) [2024-04-28 16:04:12,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10694459392. Throughput: 0: 55930.2. Samples: 1184809920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:04:12,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 16:04:14,858][57339] Updated weights for policy 0, policy_version 652748 (0.0032) [2024-04-28 16:04:17,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10694705152. Throughput: 0: 56013.8. Samples: 1185149980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:04:17,170][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 16:04:18,008][57339] Updated weights for policy 0, policy_version 652758 (0.0030) [2024-04-28 16:04:20,593][57339] Updated weights for policy 0, policy_version 652768 (0.0028) [2024-04-28 16:04:22,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10695000064. Throughput: 0: 55811.5. Samples: 1185307000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 16:04:22,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 16:04:23,946][57339] Updated weights for policy 0, policy_version 652778 (0.0032) [2024-04-28 16:04:26,502][57339] Updated weights for policy 0, policy_version 652788 (0.0030) [2024-04-28 16:04:27,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10695294976. Throughput: 0: 55914.3. Samples: 1185641480. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:04:27,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 16:04:29,727][57339] Updated weights for policy 0, policy_version 652798 (0.0027) [2024-04-28 16:04:32,169][57108] Fps is (10 sec: 58983.3, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10695589888. Throughput: 0: 56036.8. Samples: 1185981080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:04:32,169][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 16:04:32,321][57339] Updated weights for policy 0, policy_version 652808 (0.0033) [2024-04-28 16:04:35,501][57339] Updated weights for policy 0, policy_version 652818 (0.0026) [2024-04-28 16:04:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10695852032. Throughput: 0: 56181.9. Samples: 1186152860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:04:37,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 16:04:38,467][57339] Updated weights for policy 0, policy_version 652828 (0.0025) [2024-04-28 16:04:41,415][57339] Updated weights for policy 0, policy_version 652838 (0.0026) [2024-04-28 16:04:42,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56797.7, 300 sec: 55705.6). Total num frames: 10696146944. Throughput: 0: 56146.5. Samples: 1186487100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:04:42,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 16:04:44,173][57339] Updated weights for policy 0, policy_version 652848 (0.0027) [2024-04-28 16:04:46,362][57319] Signal inference workers to stop experience collection... (17850 times) [2024-04-28 16:04:46,366][57319] Signal inference workers to resume experience collection... 
(17850 times) [2024-04-28 16:04:46,388][57339] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-04-28 16:04:46,388][57339] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-04-28 16:04:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 10696409088. Throughput: 0: 55978.7. Samples: 1186816140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:04:47,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 16:04:47,412][57339] Updated weights for policy 0, policy_version 652858 (0.0030) [2024-04-28 16:04:50,016][57339] Updated weights for policy 0, policy_version 652868 (0.0026) [2024-04-28 16:04:52,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10696687616. Throughput: 0: 55696.2. Samples: 1186978880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:04:52,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 16:04:53,146][57339] Updated weights for policy 0, policy_version 652878 (0.0026) [2024-04-28 16:04:55,862][57339] Updated weights for policy 0, policy_version 652888 (0.0028) [2024-04-28 16:04:57,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10696933376. Throughput: 0: 55695.8. Samples: 1187316240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:04:57,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 16:04:59,099][57339] Updated weights for policy 0, policy_version 652898 (0.0026) [2024-04-28 16:05:01,715][57339] Updated weights for policy 0, policy_version 652908 (0.0031) [2024-04-28 16:05:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10697244672. Throughput: 0: 55479.7. Samples: 1187646560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:02,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 16:05:04,891][57339] Updated weights for policy 0, policy_version 652918 (0.0029) [2024-04-28 16:05:07,169][57108] Fps is (10 sec: 60621.1, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10697539584. Throughput: 0: 55893.4. Samples: 1187822200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:07,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 16:05:07,622][57339] Updated weights for policy 0, policy_version 652928 (0.0027) [2024-04-28 16:05:11,038][57339] Updated weights for policy 0, policy_version 652938 (0.0033) [2024-04-28 16:05:12,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10697801728. Throughput: 0: 55825.0. Samples: 1188153600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:12,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 16:05:13,618][57339] Updated weights for policy 0, policy_version 652948 (0.0025) [2024-04-28 16:05:16,852][57339] Updated weights for policy 0, policy_version 652958 (0.0029) [2024-04-28 16:05:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10698080256. Throughput: 0: 55690.6. Samples: 1188487160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:17,170][57108] Avg episode reward: [(0, '0.493')] [2024-04-28 16:05:19,635][57339] Updated weights for policy 0, policy_version 652968 (0.0029) [2024-04-28 16:05:22,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.7, 300 sec: 55650.0). Total num frames: 10698342400. Throughput: 0: 55506.6. Samples: 1188650660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:22,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 16:05:22,605][57339] Updated weights for policy 0, policy_version 652978 (0.0031) [2024-04-28 16:05:25,410][57339] Updated weights for policy 0, policy_version 652988 (0.0028) [2024-04-28 16:05:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10698637312. Throughput: 0: 55422.0. Samples: 1188981080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:27,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 16:05:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000652993_10698637312.pth... [2024-04-28 16:05:27,240][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000652176_10685251584.pth [2024-04-28 16:05:28,593][57339] Updated weights for policy 0, policy_version 652998 (0.0031) [2024-04-28 16:05:31,884][57339] Updated weights for policy 0, policy_version 653008 (0.0027) [2024-04-28 16:05:32,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 10698883072. Throughput: 0: 55603.1. Samples: 1189318280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:32,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 16:05:34,592][57339] Updated weights for policy 0, policy_version 653018 (0.0027) [2024-04-28 16:05:37,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10699177984. Throughput: 0: 55627.4. Samples: 1189482120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:37,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 16:05:37,884][57339] Updated weights for policy 0, policy_version 653028 (0.0025) [2024-04-28 16:05:40,457][57339] Updated weights for policy 0, policy_version 653038 (0.0024) [2024-04-28 16:05:42,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55432.6, 300 sec: 55761.1). 
Total num frames: 10699472896. Throughput: 0: 55512.0. Samples: 1189814280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:42,169][57108] Avg episode reward: [(0, '0.503')] [2024-04-28 16:05:43,782][57339] Updated weights for policy 0, policy_version 653048 (0.0030) [2024-04-28 16:05:46,340][57339] Updated weights for policy 0, policy_version 653058 (0.0038) [2024-04-28 16:05:47,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10699767808. Throughput: 0: 55600.9. Samples: 1190148600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:47,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 16:05:49,491][57339] Updated weights for policy 0, policy_version 653068 (0.0034) [2024-04-28 16:05:52,135][57339] Updated weights for policy 0, policy_version 653078 (0.0027) [2024-04-28 16:05:52,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10700029952. Throughput: 0: 55555.2. Samples: 1190322180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:52,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 16:05:54,265][57319] Signal inference workers to stop experience collection... (17900 times) [2024-04-28 16:05:54,271][57319] Signal inference workers to resume experience collection... (17900 times) [2024-04-28 16:05:54,286][57339] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-04-28 16:05:54,286][57339] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-04-28 16:05:55,184][57339] Updated weights for policy 0, policy_version 653088 (0.0027) [2024-04-28 16:05:57,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10700292096. Throughput: 0: 55638.6. Samples: 1190657340. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:05:57,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 16:05:58,039][57339] Updated weights for policy 0, policy_version 653098 (0.0030) [2024-04-28 16:06:00,985][57339] Updated weights for policy 0, policy_version 653108 (0.0027) [2024-04-28 16:06:02,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10700587008. Throughput: 0: 55540.8. Samples: 1190986500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:06:02,170][57108] Avg episode reward: [(0, '0.706')] [2024-04-28 16:06:03,934][57339] Updated weights for policy 0, policy_version 653118 (0.0031) [2024-04-28 16:06:07,169][57108] Fps is (10 sec: 54067.2, 60 sec: 54886.5, 300 sec: 55594.6). Total num frames: 10700832768. Throughput: 0: 55489.4. Samples: 1191147680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:07,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 16:06:07,175][57339] Updated weights for policy 0, policy_version 653128 (0.0032) [2024-04-28 16:06:09,814][57339] Updated weights for policy 0, policy_version 653138 (0.0027) [2024-04-28 16:06:12,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10701127680. Throughput: 0: 55594.7. Samples: 1191482840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:12,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 16:06:12,890][57339] Updated weights for policy 0, policy_version 653148 (0.0033) [2024-04-28 16:06:15,684][57339] Updated weights for policy 0, policy_version 653158 (0.0028) [2024-04-28 16:06:17,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10701422592. Throughput: 0: 55528.5. Samples: 1191817060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:17,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:06:18,823][57339] Updated weights for policy 0, policy_version 653168 (0.0028) [2024-04-28 16:06:21,538][57339] Updated weights for policy 0, policy_version 653178 (0.0031) [2024-04-28 16:06:22,169][57108] Fps is (10 sec: 58981.6, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 10701717504. Throughput: 0: 55739.0. Samples: 1191990380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:22,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 16:06:24,592][57339] Updated weights for policy 0, policy_version 653188 (0.0035) [2024-04-28 16:06:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10701963264. Throughput: 0: 55713.4. Samples: 1192321380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:27,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 16:06:27,451][57339] Updated weights for policy 0, policy_version 653198 (0.0029) [2024-04-28 16:06:30,484][57339] Updated weights for policy 0, policy_version 653208 (0.0033) [2024-04-28 16:06:32,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10702241792. Throughput: 0: 55722.1. Samples: 1192656100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:32,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:06:33,203][57339] Updated weights for policy 0, policy_version 653218 (0.0028) [2024-04-28 16:06:36,303][57339] Updated weights for policy 0, policy_version 653228 (0.0029) [2024-04-28 16:06:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10702520320. Throughput: 0: 55537.3. Samples: 1192821360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:37,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 16:06:39,118][57339] Updated weights for policy 0, policy_version 653238 (0.0026) [2024-04-28 16:06:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10702798848. Throughput: 0: 55517.6. Samples: 1193155640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:42,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:06:42,357][57339] Updated weights for policy 0, policy_version 653248 (0.0028) [2024-04-28 16:06:45,115][57339] Updated weights for policy 0, policy_version 653258 (0.0024) [2024-04-28 16:06:47,169][57108] Fps is (10 sec: 54066.5, 60 sec: 54886.2, 300 sec: 55650.0). Total num frames: 10703060992. Throughput: 0: 55598.2. Samples: 1193488420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:47,170][57108] Avg episode reward: [(0, '0.495')] [2024-04-28 16:06:47,520][57319] Signal inference workers to stop experience collection... (17950 times) [2024-04-28 16:06:47,521][57319] Signal inference workers to resume experience collection... (17950 times) [2024-04-28 16:06:47,530][57339] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-04-28 16:06:47,546][57339] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-04-28 16:06:48,097][57339] Updated weights for policy 0, policy_version 653268 (0.0033) [2024-04-28 16:06:50,925][57339] Updated weights for policy 0, policy_version 653278 (0.0029) [2024-04-28 16:06:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10703355904. Throughput: 0: 55767.9. Samples: 1193657240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:52,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 16:06:54,338][57339] Updated weights for policy 0, policy_version 653288 (0.0028) [2024-04-28 16:06:56,627][57339] Updated weights for policy 0, policy_version 653298 (0.0027) [2024-04-28 16:06:57,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10703650816. Throughput: 0: 55658.1. Samples: 1193987460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:06:57,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 16:07:00,246][57339] Updated weights for policy 0, policy_version 653308 (0.0028) [2024-04-28 16:07:02,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10703929344. Throughput: 0: 55662.6. Samples: 1194321880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:07:02,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 16:07:02,559][57339] Updated weights for policy 0, policy_version 653318 (0.0035) [2024-04-28 16:07:06,070][57339] Updated weights for policy 0, policy_version 653328 (0.0026) [2024-04-28 16:07:07,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 10704175104. Throughput: 0: 55655.1. Samples: 1194494860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:07:07,178][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:07:08,414][57339] Updated weights for policy 0, policy_version 653338 (0.0025) [2024-04-28 16:07:12,042][57339] Updated weights for policy 0, policy_version 653348 (0.0026) [2024-04-28 16:07:12,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10704453632. Throughput: 0: 55671.2. Samples: 1194826580. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:07:12,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 16:07:14,223][57339] Updated weights for policy 0, policy_version 653358 (0.0030) [2024-04-28 16:07:17,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10704748544. Throughput: 0: 55685.4. Samples: 1195161940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:07:17,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 16:07:17,780][57339] Updated weights for policy 0, policy_version 653368 (0.0030) [2024-04-28 16:07:20,203][57339] Updated weights for policy 0, policy_version 653378 (0.0031) [2024-04-28 16:07:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10705027072. Throughput: 0: 55652.0. Samples: 1195325700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:07:22,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 16:07:23,727][57339] Updated weights for policy 0, policy_version 653388 (0.0026) [2024-04-28 16:07:26,134][57339] Updated weights for policy 0, policy_version 653398 (0.0034) [2024-04-28 16:07:27,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10705305600. Throughput: 0: 55551.3. Samples: 1195655440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:07:27,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 16:07:27,261][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000653401_10705321984.pth... [2024-04-28 16:07:27,309][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000652586_10691969024.pth [2024-04-28 16:07:29,730][57339] Updated weights for policy 0, policy_version 653408 (0.0031) [2024-04-28 16:07:32,056][57339] Updated weights for policy 0, policy_version 653418 (0.0029) [2024-04-28 16:07:32,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55872.2). 
Total num frames: 10705600512. Throughput: 0: 55432.9. Samples: 1195982900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:07:32,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 16:07:35,553][57339] Updated weights for policy 0, policy_version 653428 (0.0030) [2024-04-28 16:07:37,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10705862656. Throughput: 0: 55641.3. Samples: 1196161100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-04-28 16:07:37,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:07:38,029][57339] Updated weights for policy 0, policy_version 653438 (0.0034) [2024-04-28 16:07:41,344][57339] Updated weights for policy 0, policy_version 653448 (0.0029) [2024-04-28 16:07:42,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10706141184. Throughput: 0: 55769.4. Samples: 1196497080. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:07:42,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 16:07:43,737][57339] Updated weights for policy 0, policy_version 653458 (0.0026) [2024-04-28 16:07:47,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10706403328. Throughput: 0: 55710.2. Samples: 1196828840. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:07:47,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 16:07:47,424][57339] Updated weights for policy 0, policy_version 653468 (0.0026) [2024-04-28 16:07:49,538][57339] Updated weights for policy 0, policy_version 653478 (0.0028) [2024-04-28 16:07:52,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 10706681856. Throughput: 0: 55327.7. Samples: 1196984600. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:07:52,170][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 16:07:53,208][57339] Updated weights for policy 0, policy_version 653488 (0.0027) [2024-04-28 16:07:54,425][57319] Signal inference workers to stop experience collection... (18000 times) [2024-04-28 16:07:54,431][57319] Signal inference workers to resume experience collection... (18000 times) [2024-04-28 16:07:54,444][57339] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-04-28 16:07:54,445][57339] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-04-28 16:07:55,443][57339] Updated weights for policy 0, policy_version 653498 (0.0030) [2024-04-28 16:07:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10706976768. Throughput: 0: 55348.9. Samples: 1197317280. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:07:57,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 16:07:59,222][57339] Updated weights for policy 0, policy_version 653508 (0.0027) [2024-04-28 16:08:01,392][57339] Updated weights for policy 0, policy_version 653518 (0.0028) [2024-04-28 16:08:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 10707238912. Throughput: 0: 55377.2. Samples: 1197653920. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:02,170][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 16:08:05,170][57339] Updated weights for policy 0, policy_version 653528 (0.0028) [2024-04-28 16:08:07,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10707550208. Throughput: 0: 55600.8. Samples: 1197827740. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:07,170][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 16:08:07,559][57339] Updated weights for policy 0, policy_version 653538 (0.0027) [2024-04-28 16:08:11,089][57339] Updated weights for policy 0, policy_version 653548 (0.0030) [2024-04-28 16:08:12,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10707812352. Throughput: 0: 55714.2. Samples: 1198162580. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:12,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 16:08:13,466][57339] Updated weights for policy 0, policy_version 653558 (0.0028) [2024-04-28 16:08:16,810][57339] Updated weights for policy 0, policy_version 653568 (0.0028) [2024-04-28 16:08:17,169][57108] Fps is (10 sec: 50790.3, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10708058112. Throughput: 0: 55860.0. Samples: 1198496600. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:17,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 16:08:19,502][57339] Updated weights for policy 0, policy_version 653578 (0.0032) [2024-04-28 16:08:22,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 10708336640. Throughput: 0: 55409.1. Samples: 1198654500. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:22,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:08:22,673][57339] Updated weights for policy 0, policy_version 653588 (0.0031) [2024-04-28 16:08:25,491][57339] Updated weights for policy 0, policy_version 653598 (0.0023) [2024-04-28 16:08:27,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 10708615168. Throughput: 0: 55447.2. Samples: 1198992200. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:27,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 16:08:28,604][57339] Updated weights for policy 0, policy_version 653608 (0.0028) [2024-04-28 16:08:31,262][57339] Updated weights for policy 0, policy_version 653618 (0.0036) [2024-04-28 16:08:32,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.7, 300 sec: 55650.1). Total num frames: 10708910080. Throughput: 0: 55492.7. Samples: 1199326000. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:32,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 16:08:34,523][57339] Updated weights for policy 0, policy_version 653628 (0.0040) [2024-04-28 16:08:36,994][57339] Updated weights for policy 0, policy_version 653638 (0.0026) [2024-04-28 16:08:37,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10709204992. Throughput: 0: 55749.7. Samples: 1199493340. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:37,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 16:08:40,494][57339] Updated weights for policy 0, policy_version 653648 (0.0032) [2024-04-28 16:08:42,169][57108] Fps is (10 sec: 58981.2, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 10709499904. Throughput: 0: 55713.6. Samples: 1199824400. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:42,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 16:08:42,940][57339] Updated weights for policy 0, policy_version 653658 (0.0031) [2024-04-28 16:08:46,440][57339] Updated weights for policy 0, policy_version 653668 (0.0032) [2024-04-28 16:08:47,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.8, 300 sec: 55594.6). Total num frames: 10709745664. Throughput: 0: 55708.3. Samples: 1200160780. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:47,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 16:08:48,815][57339] Updated weights for policy 0, policy_version 653678 (0.0029) [2024-04-28 16:08:52,169][57108] Fps is (10 sec: 50791.6, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10710007808. Throughput: 0: 55475.4. Samples: 1200324120. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:52,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 16:08:52,259][57339] Updated weights for policy 0, policy_version 653688 (0.0028) [2024-04-28 16:08:54,625][57339] Updated weights for policy 0, policy_version 653698 (0.0033) [2024-04-28 16:08:57,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10710286336. Throughput: 0: 55510.2. Samples: 1200660540. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:08:57,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 16:08:57,452][57319] Signal inference workers to stop experience collection... (18050 times) [2024-04-28 16:08:57,452][57319] Signal inference workers to resume experience collection... (18050 times) [2024-04-28 16:08:57,476][57339] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-04-28 16:08:57,476][57339] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-04-28 16:08:58,062][57339] Updated weights for policy 0, policy_version 653708 (0.0032) [2024-04-28 16:09:00,433][57339] Updated weights for policy 0, policy_version 653718 (0.0030) [2024-04-28 16:09:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 10710581248. Throughput: 0: 55576.3. Samples: 1200997520. 
Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:09:02,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 16:09:03,923][57339] Updated weights for policy 0, policy_version 653728 (0.0028) [2024-04-28 16:09:06,335][57339] Updated weights for policy 0, policy_version 653738 (0.0034) [2024-04-28 16:09:07,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10710859776. Throughput: 0: 55665.9. Samples: 1201159480. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:09:07,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:09:09,606][57339] Updated weights for policy 0, policy_version 653748 (0.0027) [2024-04-28 16:09:12,108][57339] Updated weights for policy 0, policy_version 653758 (0.0035) [2024-04-28 16:09:12,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10711171072. Throughput: 0: 55687.0. Samples: 1201498120. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:09:12,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 16:09:15,413][57339] Updated weights for policy 0, policy_version 653768 (0.0031) [2024-04-28 16:09:17,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10711433216. Throughput: 0: 55818.5. Samples: 1201837840. Policy #0 lag: (min: 1.0, avg: 8.7, max: 21.0) [2024-04-28 16:09:17,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 16:09:17,773][57339] Updated weights for policy 0, policy_version 653778 (0.0028) [2024-04-28 16:09:21,224][57339] Updated weights for policy 0, policy_version 653788 (0.0029) [2024-04-28 16:09:22,169][57108] Fps is (10 sec: 54066.7, 60 sec: 56251.5, 300 sec: 55650.0). Total num frames: 10711711744. Throughput: 0: 55907.5. Samples: 1202009180. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:09:22,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 16:09:23,501][57339] Updated weights for policy 0, policy_version 653798 (0.0023) [2024-04-28 16:09:27,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 10711973888. Throughput: 0: 55938.5. Samples: 1202341620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:09:27,169][57108] Avg episode reward: [(0, '0.503')] [2024-04-28 16:09:27,190][57339] Updated weights for policy 0, policy_version 653808 (0.0023) [2024-04-28 16:09:27,287][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000653809_10712006656.pth... [2024-04-28 16:09:27,333][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000652993_10698637312.pth [2024-04-28 16:09:29,365][57339] Updated weights for policy 0, policy_version 653818 (0.0026) [2024-04-28 16:09:32,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10712252416. Throughput: 0: 55904.4. Samples: 1202676480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:09:32,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 16:09:32,915][57339] Updated weights for policy 0, policy_version 653828 (0.0025) [2024-04-28 16:09:35,325][57339] Updated weights for policy 0, policy_version 653838 (0.0037) [2024-04-28 16:09:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 10712547328. Throughput: 0: 55971.1. Samples: 1202842820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:09:37,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 16:09:38,785][57339] Updated weights for policy 0, policy_version 653848 (0.0029) [2024-04-28 16:09:41,671][57339] Updated weights for policy 0, policy_version 653858 (0.0031) [2024-04-28 16:09:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.6, 300 sec: 55594.5). 
Total num frames: 10712809472. Throughput: 0: 55897.7. Samples: 1203175940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:09:42,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 16:09:44,519][57339] Updated weights for policy 0, policy_version 653868 (0.0030) [2024-04-28 16:09:47,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 10713104384. Throughput: 0: 55832.3. Samples: 1203509980. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:09:47,169][57108] Avg episode reward: [(0, '0.494')] [2024-04-28 16:09:47,847][57339] Updated weights for policy 0, policy_version 653878 (0.0034) [2024-04-28 16:09:50,362][57339] Updated weights for policy 0, policy_version 653888 (0.0028) [2024-04-28 16:09:52,169][57108] Fps is (10 sec: 60620.8, 60 sec: 56797.7, 300 sec: 55872.2). Total num frames: 10713415680. Throughput: 0: 56265.1. Samples: 1203691400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:09:52,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 16:09:53,783][57339] Updated weights for policy 0, policy_version 653898 (0.0024) [2024-04-28 16:09:56,108][57339] Updated weights for policy 0, policy_version 653908 (0.0032) [2024-04-28 16:09:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10713661440. Throughput: 0: 56148.0. Samples: 1204024780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:09:57,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 16:09:59,668][57339] Updated weights for policy 0, policy_version 653918 (0.0033) [2024-04-28 16:10:01,980][57339] Updated weights for policy 0, policy_version 653928 (0.0029) [2024-04-28 16:10:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10713956352. Throughput: 0: 56051.6. Samples: 1204360160. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:02,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 16:10:03,074][57319] Signal inference workers to stop experience collection... (18100 times) [2024-04-28 16:10:03,122][57339] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-04-28 16:10:03,131][57319] Signal inference workers to resume experience collection... (18100 times) [2024-04-28 16:10:03,137][57339] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-04-28 16:10:05,385][57339] Updated weights for policy 0, policy_version 653938 (0.0025) [2024-04-28 16:10:07,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 10714234880. Throughput: 0: 55975.3. Samples: 1204528060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:07,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 16:10:08,039][57339] Updated weights for policy 0, policy_version 653948 (0.0028) [2024-04-28 16:10:11,278][57339] Updated weights for policy 0, policy_version 653958 (0.0026) [2024-04-28 16:10:12,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10714513408. Throughput: 0: 56049.2. Samples: 1204863840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:12,169][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 16:10:13,906][57339] Updated weights for policy 0, policy_version 653968 (0.0024) [2024-04-28 16:10:17,151][57339] Updated weights for policy 0, policy_version 653978 (0.0029) [2024-04-28 16:10:17,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10714775552. Throughput: 0: 56092.4. Samples: 1205200640. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:17,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 16:10:19,701][57339] Updated weights for policy 0, policy_version 653988 (0.0032) [2024-04-28 16:10:22,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10715070464. Throughput: 0: 56074.7. Samples: 1205366180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:22,169][57108] Avg episode reward: [(0, '0.498')] [2024-04-28 16:10:22,939][57339] Updated weights for policy 0, policy_version 653998 (0.0033) [2024-04-28 16:10:25,652][57339] Updated weights for policy 0, policy_version 654008 (0.0033) [2024-04-28 16:10:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 10715348992. Throughput: 0: 56038.7. Samples: 1205697680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:27,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 16:10:28,713][57339] Updated weights for policy 0, policy_version 654018 (0.0031) [2024-04-28 16:10:31,478][57339] Updated weights for policy 0, policy_version 654028 (0.0030) [2024-04-28 16:10:32,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10715627520. Throughput: 0: 56000.5. Samples: 1206030000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:32,169][57108] Avg episode reward: [(0, '0.508')] [2024-04-28 16:10:34,588][57339] Updated weights for policy 0, policy_version 654038 (0.0026) [2024-04-28 16:10:37,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10715906048. Throughput: 0: 55851.2. Samples: 1206204700. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:37,169][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 16:10:37,219][57339] Updated weights for policy 0, policy_version 654048 (0.0028) [2024-04-28 16:10:40,554][57339] Updated weights for policy 0, policy_version 654058 (0.0028) [2024-04-28 16:10:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 10716184576. Throughput: 0: 55959.1. Samples: 1206542940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:42,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 16:10:43,050][57339] Updated weights for policy 0, policy_version 654068 (0.0034) [2024-04-28 16:10:46,501][57339] Updated weights for policy 0, policy_version 654078 (0.0031) [2024-04-28 16:10:47,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10716446720. Throughput: 0: 55936.8. Samples: 1206877320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:47,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 16:10:49,053][57339] Updated weights for policy 0, policy_version 654088 (0.0031) [2024-04-28 16:10:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 10716708864. Throughput: 0: 55535.1. Samples: 1207027140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-04-28 16:10:52,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 16:10:52,420][57339] Updated weights for policy 0, policy_version 654098 (0.0026) [2024-04-28 16:10:54,925][57339] Updated weights for policy 0, policy_version 654108 (0.0024) [2024-04-28 16:10:57,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 10717003776. Throughput: 0: 55522.4. Samples: 1207362360. 
Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:10:57,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 16:10:58,296][57339] Updated weights for policy 0, policy_version 654118 (0.0033) [2024-04-28 16:11:00,829][57319] Signal inference workers to stop experience collection... (18150 times) [2024-04-28 16:11:00,867][57339] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-04-28 16:11:00,892][57319] Signal inference workers to resume experience collection... (18150 times) [2024-04-28 16:11:00,893][57339] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-04-28 16:11:00,896][57339] Updated weights for policy 0, policy_version 654128 (0.0028) [2024-04-28 16:11:02,169][57108] Fps is (10 sec: 60620.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10717315072. Throughput: 0: 55527.5. Samples: 1207699380. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:02,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 16:11:04,177][57339] Updated weights for policy 0, policy_version 654138 (0.0032) [2024-04-28 16:11:06,606][57339] Updated weights for policy 0, policy_version 654148 (0.0028) [2024-04-28 16:11:07,169][57108] Fps is (10 sec: 58984.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10717593600. Throughput: 0: 55742.2. Samples: 1207874580. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:07,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 16:11:10,064][57339] Updated weights for policy 0, policy_version 654158 (0.0032) [2024-04-28 16:11:12,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10717839360. Throughput: 0: 55721.0. Samples: 1208205120. 
Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:12,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 16:11:12,493][57339] Updated weights for policy 0, policy_version 654168 (0.0029) [2024-04-28 16:11:15,817][57339] Updated weights for policy 0, policy_version 654178 (0.0025) [2024-04-28 16:11:17,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10718134272. Throughput: 0: 55648.3. Samples: 1208534180. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:17,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 16:11:18,395][57339] Updated weights for policy 0, policy_version 654188 (0.0036) [2024-04-28 16:11:21,799][57339] Updated weights for policy 0, policy_version 654198 (0.0034) [2024-04-28 16:11:22,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10718396416. Throughput: 0: 55465.5. Samples: 1208700660. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:22,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 16:11:24,175][57339] Updated weights for policy 0, policy_version 654208 (0.0024) [2024-04-28 16:11:27,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10718658560. Throughput: 0: 55462.6. Samples: 1209038760. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:27,170][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 16:11:27,244][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000654216_10718674944.pth... [2024-04-28 16:11:27,286][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000653401_10705321984.pth [2024-04-28 16:11:27,680][57339] Updated weights for policy 0, policy_version 654218 (0.0030) [2024-04-28 16:11:29,949][57339] Updated weights for policy 0, policy_version 654228 (0.0030) [2024-04-28 16:11:32,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.5, 300 sec: 55650.1). 
Total num frames: 10718937088. Throughput: 0: 55466.3. Samples: 1209373300. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:32,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 16:11:33,520][57339] Updated weights for policy 0, policy_version 654238 (0.0029) [2024-04-28 16:11:35,822][57339] Updated weights for policy 0, policy_version 654248 (0.0040) [2024-04-28 16:11:37,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10719248384. Throughput: 0: 55867.5. Samples: 1209541180. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:37,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 16:11:39,450][57339] Updated weights for policy 0, policy_version 654258 (0.0033) [2024-04-28 16:11:41,728][57339] Updated weights for policy 0, policy_version 654268 (0.0027) [2024-04-28 16:11:42,169][57108] Fps is (10 sec: 60620.9, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10719543296. Throughput: 0: 55801.2. Samples: 1209873400. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:42,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:11:45,213][57339] Updated weights for policy 0, policy_version 654278 (0.0034) [2024-04-28 16:11:47,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10719789056. Throughput: 0: 55677.7. Samples: 1210204880. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:47,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 16:11:47,582][57339] Updated weights for policy 0, policy_version 654288 (0.0030) [2024-04-28 16:11:50,338][57319] Signal inference workers to stop experience collection... (18200 times) [2024-04-28 16:11:50,338][57319] Signal inference workers to resume experience collection... 
(18200 times) [2024-04-28 16:11:50,350][57339] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-04-28 16:11:50,350][57339] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-04-28 16:11:51,175][57339] Updated weights for policy 0, policy_version 654298 (0.0025) [2024-04-28 16:11:52,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 10720051200. Throughput: 0: 55545.8. Samples: 1210374140. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:52,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 16:11:53,456][57339] Updated weights for policy 0, policy_version 654308 (0.0023) [2024-04-28 16:11:57,005][57339] Updated weights for policy 0, policy_version 654318 (0.0026) [2024-04-28 16:11:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10720346112. Throughput: 0: 55632.3. Samples: 1210708580. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:11:57,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 16:11:59,432][57339] Updated weights for policy 0, policy_version 654328 (0.0026) [2024-04-28 16:12:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 10720624640. Throughput: 0: 55870.9. Samples: 1211048360. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:12:02,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 16:12:02,854][57339] Updated weights for policy 0, policy_version 654338 (0.0024) [2024-04-28 16:12:05,183][57339] Updated weights for policy 0, policy_version 654348 (0.0031) [2024-04-28 16:12:07,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54886.2, 300 sec: 55705.6). Total num frames: 10720886784. Throughput: 0: 55557.3. Samples: 1211200740. 
Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:12:07,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:12:08,934][57339] Updated weights for policy 0, policy_version 654358 (0.0024) [2024-04-28 16:12:10,955][57339] Updated weights for policy 0, policy_version 654368 (0.0030) [2024-04-28 16:12:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10721198080. Throughput: 0: 55389.9. Samples: 1211531300. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:12:12,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 16:12:15,006][57339] Updated weights for policy 0, policy_version 654378 (0.0029) [2024-04-28 16:12:16,886][57339] Updated weights for policy 0, policy_version 654388 (0.0032) [2024-04-28 16:12:17,169][57108] Fps is (10 sec: 60621.2, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10721492992. Throughput: 0: 55339.5. Samples: 1211863580. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:12:17,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 16:12:20,866][57339] Updated weights for policy 0, policy_version 654398 (0.0030) [2024-04-28 16:12:22,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10721755136. Throughput: 0: 55579.0. Samples: 1212042240. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:12:22,170][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 16:12:22,972][57339] Updated weights for policy 0, policy_version 654408 (0.0025) [2024-04-28 16:12:26,721][57339] Updated weights for policy 0, policy_version 654418 (0.0028) [2024-04-28 16:12:27,169][57108] Fps is (10 sec: 49152.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10721984512. Throughput: 0: 55472.0. Samples: 1212369640. 
Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:12:27,169][57108] Avg episode reward: [(0, '0.502')] [2024-04-28 16:12:28,965][57339] Updated weights for policy 0, policy_version 654428 (0.0025) [2024-04-28 16:12:32,169][57108] Fps is (10 sec: 50790.9, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10722263040. Throughput: 0: 55519.7. Samples: 1212703260. Policy #0 lag: (min: 2.0, avg: 12.2, max: 25.0) [2024-04-28 16:12:32,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 16:12:32,755][57339] Updated weights for policy 0, policy_version 654438 (0.0031) [2024-04-28 16:12:34,958][57339] Updated weights for policy 0, policy_version 654448 (0.0033) [2024-04-28 16:12:37,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 10722557952. Throughput: 0: 55254.8. Samples: 1212860620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:12:37,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 16:12:38,661][57339] Updated weights for policy 0, policy_version 654458 (0.0031) [2024-04-28 16:12:40,793][57339] Updated weights for policy 0, policy_version 654468 (0.0030) [2024-04-28 16:12:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 54886.5, 300 sec: 55705.6). Total num frames: 10722836480. Throughput: 0: 55209.1. Samples: 1213192980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:12:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:12:44,564][57339] Updated weights for policy 0, policy_version 654478 (0.0038) [2024-04-28 16:12:46,783][57339] Updated weights for policy 0, policy_version 654488 (0.0030) [2024-04-28 16:12:47,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10723131392. Throughput: 0: 55152.3. Samples: 1213530220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:12:47,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 16:12:50,350][57339] Updated weights for policy 0, policy_version 654498 (0.0026) [2024-04-28 16:12:52,169][57108] Fps is (10 sec: 58981.3, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10723426304. Throughput: 0: 55586.3. Samples: 1213702120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:12:52,170][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 16:12:52,705][57339] Updated weights for policy 0, policy_version 654508 (0.0026) [2024-04-28 16:12:56,207][57339] Updated weights for policy 0, policy_version 654518 (0.0031) [2024-04-28 16:12:57,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10723704832. Throughput: 0: 55803.9. Samples: 1214042480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:12:57,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 16:12:58,565][57339] Updated weights for policy 0, policy_version 654528 (0.0026) [2024-04-28 16:13:02,039][57339] Updated weights for policy 0, policy_version 654538 (0.0032) [2024-04-28 16:13:02,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55432.4, 300 sec: 55594.6). Total num frames: 10723950592. Throughput: 0: 55980.1. Samples: 1214382680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:02,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 16:13:03,234][57319] Signal inference workers to stop experience collection... (18250 times) [2024-04-28 16:13:03,234][57319] Signal inference workers to resume experience collection... 
(18250 times) [2024-04-28 16:13:03,245][57339] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-04-28 16:13:03,245][57339] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-04-28 16:13:04,373][57339] Updated weights for policy 0, policy_version 654548 (0.0026) [2024-04-28 16:13:07,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10724212736. Throughput: 0: 55471.7. Samples: 1214538460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:07,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 16:13:07,800][57339] Updated weights for policy 0, policy_version 654558 (0.0030) [2024-04-28 16:13:10,179][57339] Updated weights for policy 0, policy_version 654568 (0.0025) [2024-04-28 16:13:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10724524032. Throughput: 0: 55672.0. Samples: 1214874880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:12,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 16:13:13,629][57339] Updated weights for policy 0, policy_version 654578 (0.0027) [2024-04-28 16:13:16,039][57339] Updated weights for policy 0, policy_version 654588 (0.0028) [2024-04-28 16:13:17,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 10724802560. Throughput: 0: 55781.4. Samples: 1215213420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:17,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 16:13:19,475][57339] Updated weights for policy 0, policy_version 654598 (0.0029) [2024-04-28 16:13:22,119][57339] Updated weights for policy 0, policy_version 654608 (0.0033) [2024-04-28 16:13:22,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10725097472. Throughput: 0: 56109.5. Samples: 1215385540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:22,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 16:13:25,146][57339] Updated weights for policy 0, policy_version 654618 (0.0033) [2024-04-28 16:13:27,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56797.9, 300 sec: 55872.2). Total num frames: 10725392384. Throughput: 0: 56247.5. Samples: 1215724120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:27,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 16:13:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000654626_10725392384.pth... [2024-04-28 16:13:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000653809_10712006656.pth [2024-04-28 16:13:28,222][57339] Updated weights for policy 0, policy_version 654628 (0.0028) [2024-04-28 16:13:31,072][57339] Updated weights for policy 0, policy_version 654638 (0.0027) [2024-04-28 16:13:32,169][57108] Fps is (10 sec: 54067.6, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10725638144. Throughput: 0: 56127.6. Samples: 1216055960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:32,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 16:13:34,091][57339] Updated weights for policy 0, policy_version 654648 (0.0030) [2024-04-28 16:13:36,915][57339] Updated weights for policy 0, policy_version 654658 (0.0033) [2024-04-28 16:13:37,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10725916672. Throughput: 0: 56036.6. Samples: 1216223760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:37,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 16:13:39,959][57339] Updated weights for policy 0, policy_version 654668 (0.0022) [2024-04-28 16:13:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10726178816. Throughput: 0: 55893.4. Samples: 1216557680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:42,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 16:13:42,730][57339] Updated weights for policy 0, policy_version 654678 (0.0030) [2024-04-28 16:13:45,737][57339] Updated weights for policy 0, policy_version 654688 (0.0032) [2024-04-28 16:13:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10726457344. Throughput: 0: 55614.6. Samples: 1216885340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:47,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 16:13:48,668][57339] Updated weights for policy 0, policy_version 654698 (0.0030) [2024-04-28 16:13:51,771][57339] Updated weights for policy 0, policy_version 654708 (0.0028) [2024-04-28 16:13:52,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 10726768640. Throughput: 0: 55931.1. Samples: 1217055360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:52,169][57108] Avg episode reward: [(0, '0.720')] [2024-04-28 16:13:54,717][57339] Updated weights for policy 0, policy_version 654718 (0.0032) [2024-04-28 16:13:57,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10727047168. Throughput: 0: 55839.5. Samples: 1217387660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:13:57,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 16:13:57,563][57339] Updated weights for policy 0, policy_version 654728 (0.0030) [2024-04-28 16:14:00,597][57339] Updated weights for policy 0, policy_version 654738 (0.0030) [2024-04-28 16:14:01,217][57319] Signal inference workers to stop experience collection... (18300 times) [2024-04-28 16:14:01,239][57339] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-04-28 16:14:01,276][57319] Signal inference workers to resume experience collection... 
(18300 times) [2024-04-28 16:14:01,276][57339] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-04-28 16:14:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10727325696. Throughput: 0: 55746.3. Samples: 1217722000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:14:02,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 16:14:03,284][57339] Updated weights for policy 0, policy_version 654748 (0.0025) [2024-04-28 16:14:06,390][57339] Updated weights for policy 0, policy_version 654758 (0.0032) [2024-04-28 16:14:07,169][57108] Fps is (10 sec: 54067.7, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 10727587840. Throughput: 0: 55716.2. Samples: 1217892760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-04-28 16:14:07,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 16:14:09,164][57339] Updated weights for policy 0, policy_version 654768 (0.0026) [2024-04-28 16:14:12,115][57339] Updated weights for policy 0, policy_version 654778 (0.0027) [2024-04-28 16:14:12,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10727882752. Throughput: 0: 55655.0. Samples: 1218228600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:12,170][57108] Avg episode reward: [(0, '0.504')] [2024-04-28 16:14:15,180][57339] Updated weights for policy 0, policy_version 654788 (0.0026) [2024-04-28 16:14:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10728144896. Throughput: 0: 55670.3. Samples: 1218561120. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:17,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 16:14:18,035][57339] Updated weights for policy 0, policy_version 654798 (0.0029) [2024-04-28 16:14:21,061][57339] Updated weights for policy 0, policy_version 654808 (0.0029) [2024-04-28 16:14:22,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10728423424. Throughput: 0: 55528.5. Samples: 1218722540. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:22,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 16:14:24,027][57339] Updated weights for policy 0, policy_version 654818 (0.0027) [2024-04-28 16:14:26,882][57339] Updated weights for policy 0, policy_version 654828 (0.0028) [2024-04-28 16:14:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10728718336. Throughput: 0: 55525.3. Samples: 1219056320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:27,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 16:14:29,934][57339] Updated weights for policy 0, policy_version 654838 (0.0027) [2024-04-28 16:14:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10728996864. Throughput: 0: 55642.6. Samples: 1219389260. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:32,170][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 16:14:32,586][57339] Updated weights for policy 0, policy_version 654848 (0.0033) [2024-04-28 16:14:35,926][57339] Updated weights for policy 0, policy_version 654858 (0.0030) [2024-04-28 16:14:37,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10729291776. Throughput: 0: 55722.1. Samples: 1219562860. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:37,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 16:14:38,461][57339] Updated weights for policy 0, policy_version 654868 (0.0029) [2024-04-28 16:14:42,062][57339] Updated weights for policy 0, policy_version 654878 (0.0026) [2024-04-28 16:14:42,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10729521152. Throughput: 0: 55630.1. Samples: 1219891020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:42,170][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 16:14:44,383][57339] Updated weights for policy 0, policy_version 654888 (0.0027) [2024-04-28 16:14:47,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10729799680. Throughput: 0: 55629.1. Samples: 1220225320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:47,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 16:14:47,804][57339] Updated weights for policy 0, policy_version 654898 (0.0026) [2024-04-28 16:14:50,294][57339] Updated weights for policy 0, policy_version 654908 (0.0030) [2024-04-28 16:14:52,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10730110976. Throughput: 0: 55651.3. Samples: 1220397080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:52,169][57108] Avg episode reward: [(0, '0.513')] [2024-04-28 16:14:53,560][57339] Updated weights for policy 0, policy_version 654918 (0.0028) [2024-04-28 16:14:56,170][57339] Updated weights for policy 0, policy_version 654928 (0.0029) [2024-04-28 16:14:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10730373120. Throughput: 0: 55636.6. Samples: 1220732240. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:14:57,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 16:14:59,409][57339] Updated weights for policy 0, policy_version 654938 (0.0024) [2024-04-28 16:15:02,073][57339] Updated weights for policy 0, policy_version 654948 (0.0030) [2024-04-28 16:15:02,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10730668032. Throughput: 0: 55711.1. Samples: 1221068120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:15:02,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 16:15:05,193][57339] Updated weights for policy 0, policy_version 654958 (0.0027) [2024-04-28 16:15:05,638][57319] Signal inference workers to stop experience collection... (18350 times) [2024-04-28 16:15:05,668][57339] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-04-28 16:15:05,694][57319] Signal inference workers to resume experience collection... (18350 times) [2024-04-28 16:15:05,698][57339] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-04-28 16:15:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10730930176. Throughput: 0: 55995.1. Samples: 1221242320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:15:07,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:15:08,241][57339] Updated weights for policy 0, policy_version 654968 (0.0035) [2024-04-28 16:15:11,054][57339] Updated weights for policy 0, policy_version 654978 (0.0030) [2024-04-28 16:15:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10731225088. Throughput: 0: 55961.3. Samples: 1221574580. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:15:12,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 16:15:13,961][57339] Updated weights for policy 0, policy_version 654988 (0.0028) [2024-04-28 16:15:16,882][57339] Updated weights for policy 0, policy_version 654998 (0.0027) [2024-04-28 16:15:17,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10731520000. Throughput: 0: 56038.7. Samples: 1221911000. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:15:17,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 16:15:19,996][57339] Updated weights for policy 0, policy_version 655008 (0.0038) [2024-04-28 16:15:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10731765760. Throughput: 0: 55831.6. Samples: 1222075280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:15:22,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 16:15:22,684][57339] Updated weights for policy 0, policy_version 655018 (0.0032) [2024-04-28 16:15:25,661][57339] Updated weights for policy 0, policy_version 655028 (0.0028) [2024-04-28 16:15:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10732060672. Throughput: 0: 56098.2. Samples: 1222415440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:15:27,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 16:15:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000655033_10732060672.pth... [2024-04-28 16:15:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000654216_10718674944.pth [2024-04-28 16:15:28,532][57339] Updated weights for policy 0, policy_version 655038 (0.0030) [2024-04-28 16:15:31,633][57339] Updated weights for policy 0, policy_version 655048 (0.0031) [2024-04-28 16:15:32,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55650.0). 
Total num frames: 10732322816. Throughput: 0: 56094.7. Samples: 1222749580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:15:32,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 16:15:34,454][57339] Updated weights for policy 0, policy_version 655058 (0.0028) [2024-04-28 16:15:37,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10732617728. Throughput: 0: 55941.3. Samples: 1222914440. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:15:37,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 16:15:37,326][57339] Updated weights for policy 0, policy_version 655068 (0.0030) [2024-04-28 16:15:40,344][57339] Updated weights for policy 0, policy_version 655078 (0.0023) [2024-04-28 16:15:42,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10732879872. Throughput: 0: 55913.9. Samples: 1223248360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-04-28 16:15:42,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 16:15:43,238][57339] Updated weights for policy 0, policy_version 655088 (0.0030) [2024-04-28 16:15:46,044][57339] Updated weights for policy 0, policy_version 655098 (0.0035) [2024-04-28 16:15:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10733191168. Throughput: 0: 55982.4. Samples: 1223587340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:15:47,170][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 16:15:49,001][57339] Updated weights for policy 0, policy_version 655108 (0.0031) [2024-04-28 16:15:51,804][57339] Updated weights for policy 0, policy_version 655118 (0.0024) [2024-04-28 16:15:52,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10733469696. Throughput: 0: 55853.4. Samples: 1223755720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:15:52,169][57108] Avg episode reward: [(0, '0.707')] [2024-04-28 16:15:54,858][57339] Updated weights for policy 0, policy_version 655128 (0.0024) [2024-04-28 16:15:57,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10733731840. Throughput: 0: 55990.7. Samples: 1224094160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:15:57,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 16:15:57,820][57339] Updated weights for policy 0, policy_version 655138 (0.0024) [2024-04-28 16:16:00,591][57339] Updated weights for policy 0, policy_version 655148 (0.0040) [2024-04-28 16:16:02,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10734010368. Throughput: 0: 55930.3. Samples: 1224427860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:02,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 16:16:03,727][57339] Updated weights for policy 0, policy_version 655158 (0.0029) [2024-04-28 16:16:06,433][57339] Updated weights for policy 0, policy_version 655168 (0.0027) [2024-04-28 16:16:07,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10734288896. Throughput: 0: 55821.6. Samples: 1224587260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:07,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 16:16:09,535][57339] Updated weights for policy 0, policy_version 655178 (0.0024) [2024-04-28 16:16:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10734583808. Throughput: 0: 55729.0. Samples: 1224923240. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:12,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 16:16:12,343][57339] Updated weights for policy 0, policy_version 655188 (0.0029) [2024-04-28 16:16:13,328][57319] Signal inference workers to stop experience collection... (18400 times) [2024-04-28 16:16:13,329][57319] Signal inference workers to resume experience collection... (18400 times) [2024-04-28 16:16:13,353][57339] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-04-28 16:16:13,353][57339] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-04-28 16:16:15,457][57339] Updated weights for policy 0, policy_version 655198 (0.0029) [2024-04-28 16:16:17,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10734845952. Throughput: 0: 55800.1. Samples: 1225260580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:17,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 16:16:18,037][57339] Updated weights for policy 0, policy_version 655208 (0.0025) [2024-04-28 16:16:21,247][57339] Updated weights for policy 0, policy_version 655218 (0.0027) [2024-04-28 16:16:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10735140864. Throughput: 0: 56006.7. Samples: 1225434740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:22,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 16:16:24,077][57339] Updated weights for policy 0, policy_version 655228 (0.0028) [2024-04-28 16:16:27,120][57339] Updated weights for policy 0, policy_version 655238 (0.0027) [2024-04-28 16:16:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10735419392. Throughput: 0: 56052.5. Samples: 1225770720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:27,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 16:16:29,872][57339] Updated weights for policy 0, policy_version 655248 (0.0028) [2024-04-28 16:16:32,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10735681536. Throughput: 0: 55976.2. Samples: 1226106260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:32,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 16:16:32,908][57339] Updated weights for policy 0, policy_version 655258 (0.0032) [2024-04-28 16:16:35,739][57339] Updated weights for policy 0, policy_version 655268 (0.0027) [2024-04-28 16:16:37,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10735960064. Throughput: 0: 55899.9. Samples: 1226271220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:37,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 16:16:38,885][57339] Updated weights for policy 0, policy_version 655278 (0.0026) [2024-04-28 16:16:41,822][57339] Updated weights for policy 0, policy_version 655288 (0.0030) [2024-04-28 16:16:42,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10736238592. Throughput: 0: 55682.6. Samples: 1226599880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:42,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 16:16:44,641][57339] Updated weights for policy 0, policy_version 655298 (0.0038) [2024-04-28 16:16:47,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 10736549888. Throughput: 0: 55769.4. Samples: 1226937480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:47,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 16:16:47,525][57339] Updated weights for policy 0, policy_version 655308 (0.0025) [2024-04-28 16:16:50,538][57339] Updated weights for policy 0, policy_version 655318 (0.0025) [2024-04-28 16:16:52,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55978.4, 300 sec: 55872.2). Total num frames: 10736828416. Throughput: 0: 56146.2. Samples: 1227113840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:52,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 16:16:53,515][57339] Updated weights for policy 0, policy_version 655328 (0.0025) [2024-04-28 16:16:56,377][57339] Updated weights for policy 0, policy_version 655338 (0.0029) [2024-04-28 16:16:57,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 10737090560. Throughput: 0: 56140.3. Samples: 1227449560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:16:57,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 16:16:59,540][57339] Updated weights for policy 0, policy_version 655348 (0.0028) [2024-04-28 16:16:59,856][57319] Signal inference workers to stop experience collection... (18450 times) [2024-04-28 16:16:59,856][57319] Signal inference workers to resume experience collection... (18450 times) [2024-04-28 16:16:59,867][57339] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-04-28 16:16:59,868][57339] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-04-28 16:17:02,169][57108] Fps is (10 sec: 54068.7, 60 sec: 55978.8, 300 sec: 55872.3). Total num frames: 10737369088. Throughput: 0: 56069.9. Samples: 1227783720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:17:02,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 16:17:02,287][57339] Updated weights for policy 0, policy_version 655358 (0.0032) [2024-04-28 16:17:05,227][57339] Updated weights for policy 0, policy_version 655368 (0.0029) [2024-04-28 16:17:07,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10737631232. Throughput: 0: 55652.9. Samples: 1227939120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:17:07,170][57108] Avg episode reward: [(0, '0.491')] [2024-04-28 16:17:08,059][57339] Updated weights for policy 0, policy_version 655378 (0.0030) [2024-04-28 16:17:10,981][57339] Updated weights for policy 0, policy_version 655388 (0.0027) [2024-04-28 16:17:12,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10737926144. Throughput: 0: 55793.6. Samples: 1228281440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:17:12,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 16:17:13,833][57339] Updated weights for policy 0, policy_version 655398 (0.0033) [2024-04-28 16:17:16,908][57339] Updated weights for policy 0, policy_version 655408 (0.0035) [2024-04-28 16:17:17,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10738204672. Throughput: 0: 55798.2. Samples: 1228617180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:17:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:17:19,727][57339] Updated weights for policy 0, policy_version 655418 (0.0027) [2024-04-28 16:17:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10738499584. Throughput: 0: 55903.5. Samples: 1228786880. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-04-28 16:17:22,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 16:17:22,716][57339] Updated weights for policy 0, policy_version 655428 (0.0025) [2024-04-28 16:17:25,585][57339] Updated weights for policy 0, policy_version 655438 (0.0034) [2024-04-28 16:17:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 10738778112. Throughput: 0: 55978.8. Samples: 1229118920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:17:27,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 16:17:27,235][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000655444_10738794496.pth... [2024-04-28 16:17:27,280][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000654626_10725392384.pth [2024-04-28 16:17:28,911][57339] Updated weights for policy 0, policy_version 655448 (0.0035) [2024-04-28 16:17:31,494][57339] Updated weights for policy 0, policy_version 655458 (0.0028) [2024-04-28 16:17:32,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55872.3). Total num frames: 10739040256. Throughput: 0: 55951.6. Samples: 1229455300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:17:32,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 16:17:34,731][57339] Updated weights for policy 0, policy_version 655468 (0.0025) [2024-04-28 16:17:37,169][57108] Fps is (10 sec: 55705.0, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 10739335168. Throughput: 0: 55897.9. Samples: 1229629240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:17:37,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 16:17:37,334][57339] Updated weights for policy 0, policy_version 655478 (0.0029) [2024-04-28 16:17:40,457][57339] Updated weights for policy 0, policy_version 655488 (0.0026) [2024-04-28 16:17:42,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55761.1). 
Total num frames: 10739580928. Throughput: 0: 55739.3. Samples: 1229957820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:17:42,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 16:17:43,108][57339] Updated weights for policy 0, policy_version 655498 (0.0027) [2024-04-28 16:17:46,152][57339] Updated weights for policy 0, policy_version 655508 (0.0028) [2024-04-28 16:17:47,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 10739875840. Throughput: 0: 55766.1. Samples: 1230293200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:17:47,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:17:49,237][57339] Updated weights for policy 0, policy_version 655518 (0.0028) [2024-04-28 16:17:52,053][57339] Updated weights for policy 0, policy_version 655528 (0.0027) [2024-04-28 16:17:52,169][57108] Fps is (10 sec: 58981.4, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 10740170752. Throughput: 0: 56181.7. Samples: 1230467300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:17:52,170][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 16:17:55,195][57339] Updated weights for policy 0, policy_version 655538 (0.0028) [2024-04-28 16:17:57,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 10740449280. Throughput: 0: 56030.3. Samples: 1230802800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:17:57,170][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 16:17:58,163][57339] Updated weights for policy 0, policy_version 655548 (0.0025) [2024-04-28 16:18:01,040][57339] Updated weights for policy 0, policy_version 655558 (0.0030) [2024-04-28 16:18:02,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.6, 300 sec: 56038.8). Total num frames: 10740744192. Throughput: 0: 55897.3. Samples: 1231132560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:02,170][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 16:18:03,865][57339] Updated weights for policy 0, policy_version 655568 (0.0024) [2024-04-28 16:18:06,780][57339] Updated weights for policy 0, policy_version 655578 (0.0030) [2024-04-28 16:18:07,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10741006336. Throughput: 0: 56071.9. Samples: 1231310120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:07,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 16:18:09,721][57339] Updated weights for policy 0, policy_version 655588 (0.0025) [2024-04-28 16:18:12,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10741284864. Throughput: 0: 56112.5. Samples: 1231643980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:12,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 16:18:12,519][57339] Updated weights for policy 0, policy_version 655598 (0.0035) [2024-04-28 16:18:15,667][57339] Updated weights for policy 0, policy_version 655608 (0.0029) [2024-04-28 16:18:16,571][57319] Signal inference workers to stop experience collection... (18500 times) [2024-04-28 16:18:16,572][57319] Signal inference workers to resume experience collection... (18500 times) [2024-04-28 16:18:16,585][57339] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-04-28 16:18:16,585][57339] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-04-28 16:18:17,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10741563392. Throughput: 0: 56194.3. Samples: 1231984040. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:17,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 16:18:18,428][57339] Updated weights for policy 0, policy_version 655618 (0.0028) [2024-04-28 16:18:21,488][57339] Updated weights for policy 0, policy_version 655628 (0.0026) [2024-04-28 16:18:22,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10741825536. Throughput: 0: 55726.1. Samples: 1232136920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:22,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 16:18:24,260][57339] Updated weights for policy 0, policy_version 655638 (0.0026) [2024-04-28 16:18:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10742120448. Throughput: 0: 55915.2. Samples: 1232474000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:27,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 16:18:27,195][57339] Updated weights for policy 0, policy_version 655648 (0.0026) [2024-04-28 16:18:30,006][57339] Updated weights for policy 0, policy_version 655658 (0.0035) [2024-04-28 16:18:32,169][57108] Fps is (10 sec: 58983.4, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10742415360. Throughput: 0: 55992.0. Samples: 1232812840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:32,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 16:18:32,879][57339] Updated weights for policy 0, policy_version 655668 (0.0033) [2024-04-28 16:18:35,948][57339] Updated weights for policy 0, policy_version 655678 (0.0028) [2024-04-28 16:18:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 10742693888. Throughput: 0: 56006.5. Samples: 1232987580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:37,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 16:18:38,898][57339] Updated weights for policy 0, policy_version 655688 (0.0029) [2024-04-28 16:18:41,886][57339] Updated weights for policy 0, policy_version 655698 (0.0032) [2024-04-28 16:18:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10742972416. Throughput: 0: 55963.7. Samples: 1233321160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:42,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 16:18:44,989][57339] Updated weights for policy 0, policy_version 655708 (0.0029) [2024-04-28 16:18:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10743250944. Throughput: 0: 55973.5. Samples: 1233651360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:47,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 16:18:47,646][57339] Updated weights for policy 0, policy_version 655718 (0.0034) [2024-04-28 16:18:50,719][57339] Updated weights for policy 0, policy_version 655728 (0.0027) [2024-04-28 16:18:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 10743496704. Throughput: 0: 55754.8. Samples: 1233819080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:52,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 16:18:53,472][57339] Updated weights for policy 0, policy_version 655738 (0.0032) [2024-04-28 16:18:56,441][57339] Updated weights for policy 0, policy_version 655748 (0.0028) [2024-04-28 16:18:57,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10743791616. Throughput: 0: 55808.8. Samples: 1234155380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 16:18:57,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 16:18:59,298][57339] Updated weights for policy 0, policy_version 655758 (0.0030) [2024-04-28 16:19:02,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10744070144. Throughput: 0: 55762.5. Samples: 1234493360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:02,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 16:19:02,543][57339] Updated weights for policy 0, policy_version 655768 (0.0029) [2024-04-28 16:19:05,231][57339] Updated weights for policy 0, policy_version 655778 (0.0028) [2024-04-28 16:19:07,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10744365056. Throughput: 0: 56096.7. Samples: 1234661260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:07,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 16:19:08,326][57339] Updated weights for policy 0, policy_version 655788 (0.0026) [2024-04-28 16:19:10,956][57339] Updated weights for policy 0, policy_version 655798 (0.0032) [2024-04-28 16:19:12,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10744643584. Throughput: 0: 55990.5. Samples: 1234993580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:12,178][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 16:19:14,099][57339] Updated weights for policy 0, policy_version 655808 (0.0027) [2024-04-28 16:19:16,850][57339] Updated weights for policy 0, policy_version 655818 (0.0028) [2024-04-28 16:19:17,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 10744938496. Throughput: 0: 55928.3. Samples: 1235329620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:17,178][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 16:19:20,037][57339] Updated weights for policy 0, policy_version 655828 (0.0029) [2024-04-28 16:19:22,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10745217024. Throughput: 0: 55862.0. Samples: 1235501380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:22,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 16:19:22,838][57339] Updated weights for policy 0, policy_version 655838 (0.0025) [2024-04-28 16:19:25,600][57319] Signal inference workers to stop experience collection... (18550 times) [2024-04-28 16:19:25,601][57319] Signal inference workers to resume experience collection... (18550 times) [2024-04-28 16:19:25,616][57339] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-04-28 16:19:25,617][57339] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-04-28 16:19:25,855][57339] Updated weights for policy 0, policy_version 655848 (0.0030) [2024-04-28 16:19:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.6, 300 sec: 55927.8). Total num frames: 10745495552. Throughput: 0: 55975.9. Samples: 1235840080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:27,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 16:19:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000655853_10745495552.pth... [2024-04-28 16:19:27,222][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000655033_10732060672.pth [2024-04-28 16:19:28,704][57339] Updated weights for policy 0, policy_version 655858 (0.0026) [2024-04-28 16:19:31,848][57339] Updated weights for policy 0, policy_version 655868 (0.0031) [2024-04-28 16:19:32,169][57108] Fps is (10 sec: 54068.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10745757696. Throughput: 0: 55842.2. 
Samples: 1236164260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:32,169][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 16:19:34,546][57339] Updated weights for policy 0, policy_version 655878 (0.0025) [2024-04-28 16:19:37,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 10746036224. Throughput: 0: 55761.7. Samples: 1236328360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:37,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 16:19:37,619][57339] Updated weights for policy 0, policy_version 655888 (0.0028) [2024-04-28 16:19:40,379][57339] Updated weights for policy 0, policy_version 655898 (0.0029) [2024-04-28 16:19:42,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 10746298368. Throughput: 0: 55779.2. Samples: 1236665440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:42,169][57108] Avg episode reward: [(0, '0.735')] [2024-04-28 16:19:43,482][57339] Updated weights for policy 0, policy_version 655908 (0.0025) [2024-04-28 16:19:46,248][57339] Updated weights for policy 0, policy_version 655918 (0.0026) [2024-04-28 16:19:47,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.5, 300 sec: 55927.8). Total num frames: 10746609664. Throughput: 0: 55839.6. Samples: 1237006140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:47,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 16:19:49,290][57339] Updated weights for policy 0, policy_version 655928 (0.0028) [2024-04-28 16:19:52,169][57108] Fps is (10 sec: 57343.2, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 10746871808. Throughput: 0: 55860.3. Samples: 1237174980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:52,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 16:19:52,285][57339] Updated weights for policy 0, policy_version 655938 (0.0030) [2024-04-28 16:19:55,227][57339] Updated weights for policy 0, policy_version 655948 (0.0026) [2024-04-28 16:19:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10747183104. Throughput: 0: 55880.5. Samples: 1237508200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:19:57,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:19:58,051][57339] Updated weights for policy 0, policy_version 655958 (0.0031) [2024-04-28 16:20:01,033][57339] Updated weights for policy 0, policy_version 655968 (0.0025) [2024-04-28 16:20:02,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 10747428864. Throughput: 0: 55873.1. Samples: 1237843920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:20:02,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 16:20:03,862][57339] Updated weights for policy 0, policy_version 655978 (0.0026) [2024-04-28 16:20:06,840][57339] Updated weights for policy 0, policy_version 655988 (0.0026) [2024-04-28 16:20:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10747723776. Throughput: 0: 55776.7. Samples: 1238011320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:20:07,178][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 16:20:09,786][57339] Updated weights for policy 0, policy_version 655998 (0.0040) [2024-04-28 16:20:12,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10747985920. Throughput: 0: 55881.7. Samples: 1238354760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:20:12,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 16:20:12,681][57339] Updated weights for policy 0, policy_version 656008 (0.0026) [2024-04-28 16:20:15,562][57339] Updated weights for policy 0, policy_version 656018 (0.0029) [2024-04-28 16:20:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 10748280832. Throughput: 0: 56271.0. Samples: 1238696460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:20:17,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 16:20:18,505][57339] Updated weights for policy 0, policy_version 656028 (0.0031) [2024-04-28 16:20:21,400][57339] Updated weights for policy 0, policy_version 656038 (0.0027) [2024-04-28 16:20:22,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55872.2). Total num frames: 10748542976. Throughput: 0: 56068.5. Samples: 1238851440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:20:22,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 16:20:24,243][57319] Signal inference workers to stop experience collection... (18600 times) [2024-04-28 16:20:24,245][57319] Signal inference workers to resume experience collection... (18600 times) [2024-04-28 16:20:24,276][57339] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-04-28 16:20:24,276][57339] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-04-28 16:20:24,354][57339] Updated weights for policy 0, policy_version 656048 (0.0038) [2024-04-28 16:20:27,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 10748837888. Throughput: 0: 56087.1. Samples: 1239189360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:20:27,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 16:20:27,193][57339] Updated weights for policy 0, policy_version 656058 (0.0036) [2024-04-28 16:20:30,382][57339] Updated weights for policy 0, policy_version 656068 (0.0038) [2024-04-28 16:20:32,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 10749132800. Throughput: 0: 55990.3. Samples: 1239525700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:20:32,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 16:20:33,042][57339] Updated weights for policy 0, policy_version 656078 (0.0026) [2024-04-28 16:20:36,375][57339] Updated weights for policy 0, policy_version 656088 (0.0030) [2024-04-28 16:20:37,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56524.8, 300 sec: 56094.4). Total num frames: 10749427712. Throughput: 0: 56069.0. Samples: 1239698080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 16:20:37,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:20:38,949][57339] Updated weights for policy 0, policy_version 656098 (0.0030) [2024-04-28 16:20:42,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10749657088. Throughput: 0: 56108.5. Samples: 1240033080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:20:42,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 16:20:42,230][57339] Updated weights for policy 0, policy_version 656108 (0.0026) [2024-04-28 16:20:44,821][57339] Updated weights for policy 0, policy_version 656118 (0.0029) [2024-04-28 16:20:47,169][57108] Fps is (10 sec: 50789.6, 60 sec: 55432.4, 300 sec: 55816.6). Total num frames: 10749935616. Throughput: 0: 56001.0. Samples: 1240363960. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:20:47,170][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 16:20:47,982][57339] Updated weights for policy 0, policy_version 656128 (0.0029) [2024-04-28 16:20:50,697][57339] Updated weights for policy 0, policy_version 656138 (0.0027) [2024-04-28 16:20:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 10750214144. Throughput: 0: 55881.4. Samples: 1240525980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:20:52,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 16:20:53,887][57339] Updated weights for policy 0, policy_version 656148 (0.0026) [2024-04-28 16:20:56,703][57339] Updated weights for policy 0, policy_version 656158 (0.0033) [2024-04-28 16:20:57,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 10750509056. Throughput: 0: 55720.1. Samples: 1240862160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:20:57,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 16:20:59,846][57339] Updated weights for policy 0, policy_version 656168 (0.0034) [2024-04-28 16:21:02,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 10750787584. Throughput: 0: 55566.2. Samples: 1241196940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:02,169][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 16:21:02,455][57339] Updated weights for policy 0, policy_version 656178 (0.0029) [2024-04-28 16:21:05,576][57339] Updated weights for policy 0, policy_version 656188 (0.0032) [2024-04-28 16:21:07,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10751082496. Throughput: 0: 55911.9. Samples: 1241367480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:07,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 16:21:08,329][57339] Updated weights for policy 0, policy_version 656198 (0.0032) [2024-04-28 16:21:11,433][57339] Updated weights for policy 0, policy_version 656208 (0.0029) [2024-04-28 16:21:12,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10751361024. Throughput: 0: 55872.4. Samples: 1241703620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:12,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 16:21:14,111][57339] Updated weights for policy 0, policy_version 656218 (0.0032) [2024-04-28 16:21:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10751623168. Throughput: 0: 55975.1. Samples: 1242044580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:17,170][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 16:21:17,319][57339] Updated weights for policy 0, policy_version 656228 (0.0026) [2024-04-28 16:21:18,101][57319] Signal inference workers to stop experience collection... (18650 times) [2024-04-28 16:21:18,138][57339] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-04-28 16:21:18,194][57319] Signal inference workers to resume experience collection... (18650 times) [2024-04-28 16:21:18,195][57339] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-04-28 16:21:20,297][57339] Updated weights for policy 0, policy_version 656238 (0.0028) [2024-04-28 16:21:22,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10751901696. Throughput: 0: 55702.0. Samples: 1242204680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:22,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:21:23,199][57339] Updated weights for policy 0, policy_version 656248 (0.0030) [2024-04-28 16:21:26,046][57339] Updated weights for policy 0, policy_version 656258 (0.0028) [2024-04-28 16:21:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 10752180224. Throughput: 0: 55799.8. Samples: 1242544080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:27,170][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 16:21:27,308][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000656262_10752196608.pth... [2024-04-28 16:21:27,357][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000655444_10738794496.pth [2024-04-28 16:21:29,082][57339] Updated weights for policy 0, policy_version 656268 (0.0029) [2024-04-28 16:21:31,953][57339] Updated weights for policy 0, policy_version 656278 (0.0028) [2024-04-28 16:21:32,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 10752458752. Throughput: 0: 55729.1. Samples: 1242871760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:32,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 16:21:34,984][57339] Updated weights for policy 0, policy_version 656288 (0.0027) [2024-04-28 16:21:37,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.4, 300 sec: 55983.3). Total num frames: 10752753664. Throughput: 0: 55991.7. Samples: 1243045620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:37,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 16:21:37,895][57339] Updated weights for policy 0, policy_version 656298 (0.0023) [2024-04-28 16:21:40,730][57339] Updated weights for policy 0, policy_version 656308 (0.0029) [2024-04-28 16:21:42,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56524.8, 300 sec: 55927.8). 
Total num frames: 10753048576. Throughput: 0: 56075.2. Samples: 1243385540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:42,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 16:21:43,854][57339] Updated weights for policy 0, policy_version 656318 (0.0032) [2024-04-28 16:21:46,572][57339] Updated weights for policy 0, policy_version 656328 (0.0030) [2024-04-28 16:21:47,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 10753327104. Throughput: 0: 56046.2. Samples: 1243719020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:47,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 16:21:49,624][57339] Updated weights for policy 0, policy_version 656338 (0.0031) [2024-04-28 16:21:52,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10753572864. Throughput: 0: 56018.8. Samples: 1243888320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:52,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 16:21:52,438][57339] Updated weights for policy 0, policy_version 656348 (0.0026) [2024-04-28 16:21:55,440][57339] Updated weights for policy 0, policy_version 656358 (0.0029) [2024-04-28 16:21:57,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10753851392. Throughput: 0: 55907.9. Samples: 1244219480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:21:57,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 16:21:58,240][57339] Updated weights for policy 0, policy_version 656368 (0.0030) [2024-04-28 16:22:01,202][57339] Updated weights for policy 0, policy_version 656378 (0.0025) [2024-04-28 16:22:02,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 10754146304. Throughput: 0: 55807.5. Samples: 1244555920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:22:02,170][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 16:22:04,137][57339] Updated weights for policy 0, policy_version 656388 (0.0028) [2024-04-28 16:22:07,165][57339] Updated weights for policy 0, policy_version 656398 (0.0026) [2024-04-28 16:22:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10754424832. Throughput: 0: 55860.2. Samples: 1244718380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:22:07,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 16:22:09,906][57339] Updated weights for policy 0, policy_version 656408 (0.0030) [2024-04-28 16:22:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 10754719744. Throughput: 0: 55788.5. Samples: 1245054560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-04-28 16:22:12,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 16:22:13,137][57339] Updated weights for policy 0, policy_version 656418 (0.0031) [2024-04-28 16:22:15,842][57339] Updated weights for policy 0, policy_version 656428 (0.0033) [2024-04-28 16:22:17,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10754965504. Throughput: 0: 55919.2. Samples: 1245388120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:22:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:22:18,962][57339] Updated weights for policy 0, policy_version 656438 (0.0032) [2024-04-28 16:22:21,576][57339] Updated weights for policy 0, policy_version 656448 (0.0028) [2024-04-28 16:22:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10755260416. Throughput: 0: 55829.8. Samples: 1245557960. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:22:22,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 16:22:23,766][57319] Signal inference workers to stop experience collection... (18700 times) [2024-04-28 16:22:23,798][57339] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-04-28 16:22:23,858][57319] Signal inference workers to resume experience collection... (18700 times) [2024-04-28 16:22:23,858][57339] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-04-28 16:22:24,823][57339] Updated weights for policy 0, policy_version 656458 (0.0028) [2024-04-28 16:22:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 10755538944. Throughput: 0: 55759.6. Samples: 1245894720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:22:27,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 16:22:27,495][57339] Updated weights for policy 0, policy_version 656468 (0.0027) [2024-04-28 16:22:30,837][57339] Updated weights for policy 0, policy_version 656478 (0.0033) [2024-04-28 16:22:32,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10755817472. Throughput: 0: 55770.3. Samples: 1246228680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:22:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 16:22:33,291][57339] Updated weights for policy 0, policy_version 656488 (0.0029) [2024-04-28 16:22:36,737][57339] Updated weights for policy 0, policy_version 656498 (0.0025) [2024-04-28 16:22:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.8, 300 sec: 55983.3). Total num frames: 10756096000. Throughput: 0: 55780.0. Samples: 1246398420. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:22:37,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 16:22:39,180][57339] Updated weights for policy 0, policy_version 656508 (0.0030) [2024-04-28 16:22:42,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55159.3, 300 sec: 55872.2). Total num frames: 10756358144. Throughput: 0: 55839.9. Samples: 1246732280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:22:42,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 16:22:42,449][57339] Updated weights for policy 0, policy_version 656518 (0.0025) [2024-04-28 16:22:44,928][57339] Updated weights for policy 0, policy_version 656528 (0.0026) [2024-04-28 16:22:47,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10756653056. Throughput: 0: 55762.3. Samples: 1247065220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:22:47,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 16:22:48,297][57339] Updated weights for policy 0, policy_version 656538 (0.0028) [2024-04-28 16:22:50,821][57339] Updated weights for policy 0, policy_version 656548 (0.0028) [2024-04-28 16:22:52,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10756931584. Throughput: 0: 55931.9. Samples: 1247235320. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:22:52,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 16:22:54,218][57339] Updated weights for policy 0, policy_version 656558 (0.0029) [2024-04-28 16:22:56,811][57339] Updated weights for policy 0, policy_version 656568 (0.0027) [2024-04-28 16:22:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10757210112. Throughput: 0: 55786.2. Samples: 1247564940. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:22:57,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 16:23:00,259][57339] Updated weights for policy 0, policy_version 656578 (0.0032) [2024-04-28 16:23:02,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10757488640. Throughput: 0: 55749.0. Samples: 1247896840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:02,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 16:23:02,690][57339] Updated weights for policy 0, policy_version 656588 (0.0026) [2024-04-28 16:23:06,133][57339] Updated weights for policy 0, policy_version 656598 (0.0030) [2024-04-28 16:23:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10757750784. Throughput: 0: 55707.2. Samples: 1248064780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:07,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 16:23:08,709][57339] Updated weights for policy 0, policy_version 656608 (0.0026) [2024-04-28 16:23:11,926][57339] Updated weights for policy 0, policy_version 656618 (0.0034) [2024-04-28 16:23:12,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10758045696. Throughput: 0: 55689.3. Samples: 1248400740. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:12,170][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 16:23:14,489][57339] Updated weights for policy 0, policy_version 656628 (0.0030) [2024-04-28 16:23:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 10758307840. Throughput: 0: 55518.1. Samples: 1248727000. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:17,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 16:23:17,763][57339] Updated weights for policy 0, policy_version 656638 (0.0026) [2024-04-28 16:23:20,689][57339] Updated weights for policy 0, policy_version 656648 (0.0030) [2024-04-28 16:23:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10758602752. Throughput: 0: 55485.2. Samples: 1248895260. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:22,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:23:23,878][57339] Updated weights for policy 0, policy_version 656658 (0.0033) [2024-04-28 16:23:26,716][57339] Updated weights for policy 0, policy_version 656668 (0.0030) [2024-04-28 16:23:27,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10758848512. Throughput: 0: 55434.3. Samples: 1249226820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:27,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 16:23:27,189][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000656669_10758864896.pth... [2024-04-28 16:23:27,236][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000655853_10745495552.pth [2024-04-28 16:23:29,631][57319] Signal inference workers to stop experience collection... (18750 times) [2024-04-28 16:23:29,631][57319] Signal inference workers to resume experience collection... (18750 times) [2024-04-28 16:23:29,643][57339] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-04-28 16:23:29,643][57339] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-04-28 16:23:29,745][57339] Updated weights for policy 0, policy_version 656678 (0.0031) [2024-04-28 16:23:32,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10759143424. Throughput: 0: 55310.7. 
Samples: 1249554200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 16:23:32,578][57339] Updated weights for policy 0, policy_version 656688 (0.0028) [2024-04-28 16:23:35,529][57339] Updated weights for policy 0, policy_version 656698 (0.0024) [2024-04-28 16:23:37,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10759421952. Throughput: 0: 55350.2. Samples: 1249726080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:37,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 16:23:38,351][57339] Updated weights for policy 0, policy_version 656708 (0.0030) [2024-04-28 16:23:41,469][57339] Updated weights for policy 0, policy_version 656718 (0.0030) [2024-04-28 16:23:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10759716864. Throughput: 0: 55484.1. Samples: 1250061720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:42,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 16:23:44,268][57339] Updated weights for policy 0, policy_version 656728 (0.0032) [2024-04-28 16:23:47,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 10759962624. Throughput: 0: 55594.9. Samples: 1250398600. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:47,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 16:23:47,376][57339] Updated weights for policy 0, policy_version 656738 (0.0026) [2024-04-28 16:23:50,491][57339] Updated weights for policy 0, policy_version 656748 (0.0028) [2024-04-28 16:23:52,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10760257536. Throughput: 0: 55371.0. Samples: 1250556480. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-04-28 16:23:52,170][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 16:23:53,326][57339] Updated weights for policy 0, policy_version 656758 (0.0025) [2024-04-28 16:23:56,173][57339] Updated weights for policy 0, policy_version 656768 (0.0033) [2024-04-28 16:23:57,169][57108] Fps is (10 sec: 60619.4, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 10760568832. Throughput: 0: 55479.8. Samples: 1250897340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:23:57,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 16:23:59,083][57339] Updated weights for policy 0, policy_version 656778 (0.0025) [2024-04-28 16:24:02,002][57339] Updated weights for policy 0, policy_version 656788 (0.0030) [2024-04-28 16:24:02,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 10760814592. Throughput: 0: 55780.1. Samples: 1251237100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:02,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 16:24:04,816][57339] Updated weights for policy 0, policy_version 656798 (0.0030) [2024-04-28 16:24:07,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10761109504. Throughput: 0: 55801.7. Samples: 1251406340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:07,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 16:24:07,959][57339] Updated weights for policy 0, policy_version 656808 (0.0030) [2024-04-28 16:24:10,653][57339] Updated weights for policy 0, policy_version 656818 (0.0030) [2024-04-28 16:24:12,169][57108] Fps is (10 sec: 57342.7, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 10761388032. Throughput: 0: 55833.5. Samples: 1251739340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:12,170][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 16:24:13,699][57339] Updated weights for policy 0, policy_version 656828 (0.0029) [2024-04-28 16:24:16,662][57339] Updated weights for policy 0, policy_version 656838 (0.0029) [2024-04-28 16:24:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 10761666560. Throughput: 0: 56112.5. Samples: 1252079260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:17,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 16:24:19,449][57339] Updated weights for policy 0, policy_version 656848 (0.0029) [2024-04-28 16:24:22,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 10761928704. Throughput: 0: 55910.1. Samples: 1252242040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:22,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 16:24:22,458][57339] Updated weights for policy 0, policy_version 656858 (0.0027) [2024-04-28 16:24:22,976][57319] Signal inference workers to stop experience collection... (18800 times) [2024-04-28 16:24:23,012][57339] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-04-28 16:24:23,025][57319] Signal inference workers to resume experience collection... (18800 times) [2024-04-28 16:24:23,029][57339] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-04-28 16:24:25,198][57339] Updated weights for policy 0, policy_version 656868 (0.0026) [2024-04-28 16:24:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10762223616. Throughput: 0: 55869.7. Samples: 1252575860. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:27,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 16:24:28,303][57339] Updated weights for policy 0, policy_version 656878 (0.0030) [2024-04-28 16:24:31,067][57339] Updated weights for policy 0, policy_version 656888 (0.0026) [2024-04-28 16:24:32,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10762502144. Throughput: 0: 55872.0. Samples: 1252912840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:32,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 16:24:34,241][57339] Updated weights for policy 0, policy_version 656898 (0.0023) [2024-04-28 16:24:36,926][57339] Updated weights for policy 0, policy_version 656908 (0.0036) [2024-04-28 16:24:37,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 10762797056. Throughput: 0: 56077.0. Samples: 1253079940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:37,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 16:24:39,985][57339] Updated weights for policy 0, policy_version 656918 (0.0028) [2024-04-28 16:24:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10763059200. Throughput: 0: 56061.2. Samples: 1253420080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:42,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 16:24:42,608][57339] Updated weights for policy 0, policy_version 656928 (0.0028) [2024-04-28 16:24:45,801][57339] Updated weights for policy 0, policy_version 656938 (0.0030) [2024-04-28 16:24:47,169][57108] Fps is (10 sec: 54067.9, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10763337728. Throughput: 0: 56000.9. Samples: 1253757140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:47,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 16:24:48,523][57339] Updated weights for policy 0, policy_version 656948 (0.0024) [2024-04-28 16:24:51,509][57339] Updated weights for policy 0, policy_version 656958 (0.0022) [2024-04-28 16:24:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10763616256. Throughput: 0: 55809.9. Samples: 1253917780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:52,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:24:54,342][57339] Updated weights for policy 0, policy_version 656968 (0.0029) [2024-04-28 16:24:57,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10763894784. Throughput: 0: 55887.3. Samples: 1254254260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:24:57,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 16:24:57,408][57339] Updated weights for policy 0, policy_version 656978 (0.0025) [2024-04-28 16:25:00,037][57339] Updated weights for policy 0, policy_version 656988 (0.0031) [2024-04-28 16:25:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10764173312. Throughput: 0: 55826.6. Samples: 1254591460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:25:02,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 16:25:03,259][57339] Updated weights for policy 0, policy_version 656998 (0.0035) [2024-04-28 16:25:05,745][57339] Updated weights for policy 0, policy_version 657008 (0.0030) [2024-04-28 16:25:07,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10764468224. Throughput: 0: 55997.3. Samples: 1254761920. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:25:07,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 16:25:09,120][57339] Updated weights for policy 0, policy_version 657018 (0.0028) [2024-04-28 16:25:11,680][57339] Updated weights for policy 0, policy_version 657028 (0.0028) [2024-04-28 16:25:12,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 10764746752. Throughput: 0: 56077.0. Samples: 1255099320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:25:12,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 16:25:15,191][57339] Updated weights for policy 0, policy_version 657038 (0.0026) [2024-04-28 16:25:16,543][57319] Signal inference workers to stop experience collection... (18850 times) [2024-04-28 16:25:16,582][57339] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-04-28 16:25:16,635][57319] Signal inference workers to resume experience collection... (18850 times) [2024-04-28 16:25:16,636][57339] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-04-28 16:25:17,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10765025280. Throughput: 0: 55988.4. Samples: 1255432320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:25:17,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 16:25:17,762][57339] Updated weights for policy 0, policy_version 657048 (0.0031) [2024-04-28 16:25:21,139][57339] Updated weights for policy 0, policy_version 657058 (0.0036) [2024-04-28 16:25:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10765287424. Throughput: 0: 55845.8. Samples: 1255593000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:25:22,170][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 16:25:23,676][57339] Updated weights for policy 0, policy_version 657068 (0.0025) [2024-04-28 16:25:27,073][57339] Updated weights for policy 0, policy_version 657078 (0.0024) [2024-04-28 16:25:27,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10765565952. Throughput: 0: 55746.0. Samples: 1255928660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:25:27,170][57108] Avg episode reward: [(0, '0.509')] [2024-04-28 16:25:27,187][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000657078_10765565952.pth... [2024-04-28 16:25:27,236][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000656262_10752196608.pth [2024-04-28 16:25:29,328][57339] Updated weights for policy 0, policy_version 657088 (0.0030) [2024-04-28 16:25:32,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10765860864. Throughput: 0: 55805.7. Samples: 1256268400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:25:32,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 16:25:33,127][57339] Updated weights for policy 0, policy_version 657098 (0.0029) [2024-04-28 16:25:35,255][57339] Updated weights for policy 0, policy_version 657108 (0.0030) [2024-04-28 16:25:37,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10766123008. Throughput: 0: 55773.8. Samples: 1256427600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:25:37,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 16:25:38,890][57339] Updated weights for policy 0, policy_version 657118 (0.0027) [2024-04-28 16:25:41,211][57339] Updated weights for policy 0, policy_version 657128 (0.0028) [2024-04-28 16:25:42,169][57108] Fps is (10 sec: 57342.9, 60 sec: 56251.5, 300 sec: 55927.8). 
Total num frames: 10766434304. Throughput: 0: 55711.8. Samples: 1256761300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:25:42,170][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 16:25:44,865][57339] Updated weights for policy 0, policy_version 657138 (0.0031) [2024-04-28 16:25:47,088][57339] Updated weights for policy 0, policy_version 657148 (0.0028) [2024-04-28 16:25:47,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10766712832. Throughput: 0: 55617.9. Samples: 1257094260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:25:47,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 16:25:50,553][57339] Updated weights for policy 0, policy_version 657158 (0.0032) [2024-04-28 16:25:52,169][57108] Fps is (10 sec: 55707.2, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10766991360. Throughput: 0: 55823.4. Samples: 1257273960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:25:52,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 16:25:52,744][57339] Updated weights for policy 0, policy_version 657168 (0.0027) [2024-04-28 16:25:56,305][57339] Updated weights for policy 0, policy_version 657178 (0.0034) [2024-04-28 16:25:57,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10767237120. Throughput: 0: 55758.1. Samples: 1257608440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:25:57,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 16:25:58,302][57319] Signal inference workers to stop experience collection... (18900 times) [2024-04-28 16:25:58,333][57339] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-04-28 16:25:58,360][57319] Signal inference workers to resume experience collection... 
(18900 times) [2024-04-28 16:25:58,365][57339] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-04-28 16:25:58,469][57339] Updated weights for policy 0, policy_version 657188 (0.0023) [2024-04-28 16:26:02,099][57339] Updated weights for policy 0, policy_version 657198 (0.0031) [2024-04-28 16:26:02,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10767532032. Throughput: 0: 55813.7. Samples: 1257943940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:02,170][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 16:26:04,727][57339] Updated weights for policy 0, policy_version 657208 (0.0030) [2024-04-28 16:26:07,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 10767810560. Throughput: 0: 55905.3. Samples: 1258108740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:07,169][57108] Avg episode reward: [(0, '0.490')] [2024-04-28 16:26:07,998][57339] Updated weights for policy 0, policy_version 657218 (0.0026) [2024-04-28 16:26:10,449][57339] Updated weights for policy 0, policy_version 657228 (0.0029) [2024-04-28 16:26:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10768072704. Throughput: 0: 55828.6. Samples: 1258440940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:12,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 16:26:13,804][57339] Updated weights for policy 0, policy_version 657238 (0.0029) [2024-04-28 16:26:16,165][57339] Updated weights for policy 0, policy_version 657248 (0.0030) [2024-04-28 16:26:17,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10768384000. Throughput: 0: 55798.5. Samples: 1258779340. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:17,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 16:26:19,677][57339] Updated weights for policy 0, policy_version 657258 (0.0028) [2024-04-28 16:26:22,055][57339] Updated weights for policy 0, policy_version 657268 (0.0028) [2024-04-28 16:26:22,169][57108] Fps is (10 sec: 60621.4, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 10768678912. Throughput: 0: 56196.1. Samples: 1258956420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:22,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 16:26:25,457][57339] Updated weights for policy 0, policy_version 657278 (0.0028) [2024-04-28 16:26:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10768941056. Throughput: 0: 56157.9. Samples: 1259288400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:27,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 16:26:28,023][57339] Updated weights for policy 0, policy_version 657288 (0.0032) [2024-04-28 16:26:31,368][57339] Updated weights for policy 0, policy_version 657298 (0.0028) [2024-04-28 16:26:32,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10769219584. Throughput: 0: 56295.0. Samples: 1259627540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:32,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 16:26:33,788][57339] Updated weights for policy 0, policy_version 657308 (0.0025) [2024-04-28 16:26:37,143][57339] Updated weights for policy 0, policy_version 657318 (0.0026) [2024-04-28 16:26:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10769498112. Throughput: 0: 56035.8. Samples: 1259795580. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:37,170][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 16:26:39,428][57339] Updated weights for policy 0, policy_version 657328 (0.0031) [2024-04-28 16:26:42,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10769760256. Throughput: 0: 56063.2. Samples: 1260131280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:42,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 16:26:43,053][57339] Updated weights for policy 0, policy_version 657338 (0.0035) [2024-04-28 16:26:45,345][57339] Updated weights for policy 0, policy_version 657348 (0.0036) [2024-04-28 16:26:47,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10770055168. Throughput: 0: 56088.9. Samples: 1260467940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:47,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 16:26:48,863][57339] Updated weights for policy 0, policy_version 657358 (0.0030) [2024-04-28 16:26:51,402][57339] Updated weights for policy 0, policy_version 657368 (0.0028) [2024-04-28 16:26:52,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55978.4, 300 sec: 55927.7). Total num frames: 10770350080. Throughput: 0: 56168.3. Samples: 1260636320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:52,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:26:54,702][57339] Updated weights for policy 0, policy_version 657378 (0.0024) [2024-04-28 16:26:57,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10770628608. Throughput: 0: 56283.9. Samples: 1260973720. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:26:57,169][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 16:26:57,296][57339] Updated weights for policy 0, policy_version 657388 (0.0031) [2024-04-28 16:27:00,521][57339] Updated weights for policy 0, policy_version 657398 (0.0033) [2024-04-28 16:27:02,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10770923520. Throughput: 0: 56296.6. Samples: 1261312680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:27:02,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 16:27:03,064][57339] Updated weights for policy 0, policy_version 657408 (0.0030) [2024-04-28 16:27:06,242][57339] Updated weights for policy 0, policy_version 657418 (0.0024) [2024-04-28 16:27:07,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10771169280. Throughput: 0: 56141.6. Samples: 1261482800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-04-28 16:27:07,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 16:27:07,689][57319] Signal inference workers to stop experience collection... (18950 times) [2024-04-28 16:27:07,694][57319] Signal inference workers to resume experience collection... (18950 times) [2024-04-28 16:27:07,718][57339] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-04-28 16:27:07,718][57339] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-04-28 16:27:09,005][57339] Updated weights for policy 0, policy_version 657428 (0.0027) [2024-04-28 16:27:12,027][57339] Updated weights for policy 0, policy_version 657438 (0.0028) [2024-04-28 16:27:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 10771464192. Throughput: 0: 56141.8. Samples: 1261814780. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:12,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 16:27:14,976][57339] Updated weights for policy 0, policy_version 657448 (0.0027) [2024-04-28 16:27:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10771726336. Throughput: 0: 56101.7. Samples: 1262152120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:17,170][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 16:27:17,846][57339] Updated weights for policy 0, policy_version 657458 (0.0029) [2024-04-28 16:27:20,712][57339] Updated weights for policy 0, policy_version 657468 (0.0030) [2024-04-28 16:27:22,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10772004864. Throughput: 0: 55998.0. Samples: 1262315480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:22,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 16:27:23,804][57339] Updated weights for policy 0, policy_version 657478 (0.0028) [2024-04-28 16:27:26,483][57339] Updated weights for policy 0, policy_version 657488 (0.0026) [2024-04-28 16:27:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10772283392. Throughput: 0: 55947.9. Samples: 1262648940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:27,170][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 16:27:27,183][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000657488_10772283392.pth... [2024-04-28 16:27:27,246][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000656669_10758864896.pth [2024-04-28 16:27:29,852][57339] Updated weights for policy 0, policy_version 657498 (0.0025) [2024-04-28 16:27:32,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10772594688. Throughput: 0: 55857.5. Samples: 1262981520. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:32,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 16:27:32,325][57339] Updated weights for policy 0, policy_version 657508 (0.0026) [2024-04-28 16:27:35,609][57339] Updated weights for policy 0, policy_version 657518 (0.0032) [2024-04-28 16:27:37,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10772856832. Throughput: 0: 56050.2. Samples: 1263158580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:37,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 16:27:38,320][57339] Updated weights for policy 0, policy_version 657528 (0.0028) [2024-04-28 16:27:41,303][57339] Updated weights for policy 0, policy_version 657538 (0.0029) [2024-04-28 16:27:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10773135360. Throughput: 0: 55995.3. Samples: 1263493500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:42,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 16:27:44,005][57339] Updated weights for policy 0, policy_version 657548 (0.0030) [2024-04-28 16:27:47,165][57339] Updated weights for policy 0, policy_version 657558 (0.0024) [2024-04-28 16:27:47,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 10773430272. Throughput: 0: 55980.9. Samples: 1263831820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:47,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 16:27:50,254][57339] Updated weights for policy 0, policy_version 657568 (0.0026) [2024-04-28 16:27:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 10773692416. Throughput: 0: 55721.5. Samples: 1263990260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:52,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 16:27:52,930][57339] Updated weights for policy 0, policy_version 657578 (0.0025) [2024-04-28 16:27:56,675][57339] Updated weights for policy 0, policy_version 657588 (0.0030) [2024-04-28 16:27:57,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10773954560. Throughput: 0: 55818.7. Samples: 1264326620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:27:57,169][57108] Avg episode reward: [(0, '0.702')] [2024-04-28 16:27:58,780][57339] Updated weights for policy 0, policy_version 657598 (0.0030) [2024-04-28 16:28:02,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 10774233088. Throughput: 0: 55726.7. Samples: 1264659820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:28:02,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 16:28:02,398][57339] Updated weights for policy 0, policy_version 657608 (0.0025) [2024-04-28 16:28:04,760][57339] Updated weights for policy 0, policy_version 657618 (0.0029) [2024-04-28 16:28:07,169][57108] Fps is (10 sec: 58981.4, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 10774544384. Throughput: 0: 55897.8. Samples: 1264830900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:28:07,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 16:28:08,121][57339] Updated weights for policy 0, policy_version 657628 (0.0031) [2024-04-28 16:28:10,619][57339] Updated weights for policy 0, policy_version 657638 (0.0025) [2024-04-28 16:28:12,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10774806528. Throughput: 0: 55979.1. Samples: 1265168000. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:28:12,170][57108] Avg episode reward: [(0, '0.488')] [2024-04-28 16:28:13,962][57339] Updated weights for policy 0, policy_version 657648 (0.0033) [2024-04-28 16:28:14,874][57319] Signal inference workers to stop experience collection... (19000 times) [2024-04-28 16:28:14,875][57319] Signal inference workers to resume experience collection... (19000 times) [2024-04-28 16:28:14,885][57339] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-04-28 16:28:14,885][57339] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-04-28 16:28:16,429][57339] Updated weights for policy 0, policy_version 657658 (0.0027) [2024-04-28 16:28:17,169][57108] Fps is (10 sec: 55707.1, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10775101440. Throughput: 0: 55957.8. Samples: 1265499620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:28:17,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 16:28:19,805][57339] Updated weights for policy 0, policy_version 657668 (0.0026) [2024-04-28 16:28:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.5, 300 sec: 56038.8). Total num frames: 10775379968. Throughput: 0: 55912.9. Samples: 1265674660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:28:22,178][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 16:28:22,353][57339] Updated weights for policy 0, policy_version 657678 (0.0032) [2024-04-28 16:28:25,551][57339] Updated weights for policy 0, policy_version 657688 (0.0027) [2024-04-28 16:28:27,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56524.8, 300 sec: 56038.8). Total num frames: 10775674880. Throughput: 0: 55909.2. Samples: 1266009420. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:28:27,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 16:28:28,060][57339] Updated weights for policy 0, policy_version 657698 (0.0030) [2024-04-28 16:28:31,482][57339] Updated weights for policy 0, policy_version 657708 (0.0030) [2024-04-28 16:28:32,169][57108] Fps is (10 sec: 54068.6, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 10775920640. Throughput: 0: 55915.8. Samples: 1266348020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:28:32,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 16:28:33,965][57339] Updated weights for policy 0, policy_version 657718 (0.0025) [2024-04-28 16:28:37,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10776199168. Throughput: 0: 56021.7. Samples: 1266511240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:28:37,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 16:28:37,358][57339] Updated weights for policy 0, policy_version 657728 (0.0030) [2024-04-28 16:28:39,669][57339] Updated weights for policy 0, policy_version 657738 (0.0025) [2024-04-28 16:28:42,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 10776494080. Throughput: 0: 55940.0. Samples: 1266843920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 16:28:42,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 16:28:43,502][57339] Updated weights for policy 0, policy_version 657748 (0.0035) [2024-04-28 16:28:45,498][57339] Updated weights for policy 0, policy_version 657758 (0.0025) [2024-04-28 16:28:47,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 10776772608. Throughput: 0: 55989.2. Samples: 1267179340. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:28:47,170][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 16:28:49,207][57339] Updated weights for policy 0, policy_version 657768 (0.0025) [2024-04-28 16:28:51,395][57339] Updated weights for policy 0, policy_version 657778 (0.0025) [2024-04-28 16:28:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10777067520. Throughput: 0: 56128.3. Samples: 1267356660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:28:52,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 16:28:55,035][57339] Updated weights for policy 0, policy_version 657788 (0.0030) [2024-04-28 16:28:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 10777329664. Throughput: 0: 56006.2. Samples: 1267688280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:28:57,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 16:28:57,383][57339] Updated weights for policy 0, policy_version 657798 (0.0029) [2024-04-28 16:29:00,818][57339] Updated weights for policy 0, policy_version 657808 (0.0026) [2024-04-28 16:29:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10777608192. Throughput: 0: 56065.3. Samples: 1268022560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:02,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:29:03,188][57339] Updated weights for policy 0, policy_version 657818 (0.0030) [2024-04-28 16:29:06,599][57339] Updated weights for policy 0, policy_version 657828 (0.0028) [2024-04-28 16:29:07,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55927.8). Total num frames: 10777886720. Throughput: 0: 55813.9. Samples: 1268186280. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:07,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:29:08,882][57339] Updated weights for policy 0, policy_version 657838 (0.0030) [2024-04-28 16:29:12,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 10778165248. Throughput: 0: 56089.7. Samples: 1268533460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:12,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 16:29:12,377][57339] Updated weights for policy 0, policy_version 657848 (0.0041) [2024-04-28 16:29:14,674][57339] Updated weights for policy 0, policy_version 657858 (0.0029) [2024-04-28 16:29:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 10778443776. Throughput: 0: 56037.1. Samples: 1268869700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:17,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 16:29:18,169][57339] Updated weights for policy 0, policy_version 657868 (0.0027) [2024-04-28 16:29:20,625][57339] Updated weights for policy 0, policy_version 657878 (0.0029) [2024-04-28 16:29:22,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 10778738688. Throughput: 0: 55951.6. Samples: 1269029060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:22,178][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 16:29:23,547][57319] Signal inference workers to stop experience collection... (19050 times) [2024-04-28 16:29:23,587][57339] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-04-28 16:29:23,600][57319] Signal inference workers to resume experience collection... 
(19050 times) [2024-04-28 16:29:23,604][57339] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-04-28 16:29:23,972][57339] Updated weights for policy 0, policy_version 657888 (0.0028) [2024-04-28 16:29:26,403][57339] Updated weights for policy 0, policy_version 657898 (0.0035) [2024-04-28 16:29:27,169][57108] Fps is (10 sec: 58981.8, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 10779033600. Throughput: 0: 56055.9. Samples: 1269366440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:27,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 16:29:27,186][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000657900_10779033600.pth... [2024-04-28 16:29:27,239][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000657078_10765565952.pth [2024-04-28 16:29:29,713][57339] Updated weights for policy 0, policy_version 657908 (0.0031) [2024-04-28 16:29:32,169][57108] Fps is (10 sec: 57343.0, 60 sec: 56524.6, 300 sec: 55983.3). Total num frames: 10779312128. Throughput: 0: 56019.6. Samples: 1269700220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:32,178][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 16:29:32,321][57339] Updated weights for policy 0, policy_version 657918 (0.0024) [2024-04-28 16:29:35,615][57339] Updated weights for policy 0, policy_version 657928 (0.0030) [2024-04-28 16:29:37,169][57108] Fps is (10 sec: 54068.1, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10779574272. Throughput: 0: 56074.7. Samples: 1269880020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:37,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 16:29:38,240][57339] Updated weights for policy 0, policy_version 657938 (0.0028) [2024-04-28 16:29:41,497][57339] Updated weights for policy 0, policy_version 657948 (0.0027) [2024-04-28 16:29:42,169][57108] Fps is (10 sec: 54068.3, 60 sec: 55978.8, 300 sec: 55983.3). 
Total num frames: 10779852800. Throughput: 0: 56079.7. Samples: 1270211860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:42,169][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 16:29:44,187][57339] Updated weights for policy 0, policy_version 657958 (0.0027) [2024-04-28 16:29:47,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10780131328. Throughput: 0: 56075.0. Samples: 1270545940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:47,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 16:29:47,557][57339] Updated weights for policy 0, policy_version 657968 (0.0029) [2024-04-28 16:29:50,231][57339] Updated weights for policy 0, policy_version 657978 (0.0028) [2024-04-28 16:29:52,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 10780393472. Throughput: 0: 56019.0. Samples: 1270707140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:52,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 16:29:53,386][57339] Updated weights for policy 0, policy_version 657988 (0.0032) [2024-04-28 16:29:56,029][57339] Updated weights for policy 0, policy_version 657998 (0.0027) [2024-04-28 16:29:57,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 10780688384. Throughput: 0: 55705.5. Samples: 1271040200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:29:57,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 16:29:59,164][57339] Updated weights for policy 0, policy_version 658008 (0.0027) [2024-04-28 16:30:01,853][57339] Updated weights for policy 0, policy_version 658018 (0.0025) [2024-04-28 16:30:02,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 10780983296. Throughput: 0: 55733.3. Samples: 1271377700. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:30:02,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 16:30:05,067][57339] Updated weights for policy 0, policy_version 658028 (0.0030) [2024-04-28 16:30:07,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 10781261824. Throughput: 0: 56042.0. Samples: 1271550960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:30:07,170][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 16:30:07,906][57339] Updated weights for policy 0, policy_version 658038 (0.0029) [2024-04-28 16:30:11,018][57339] Updated weights for policy 0, policy_version 658048 (0.0035) [2024-04-28 16:30:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 10781540352. Throughput: 0: 56019.6. Samples: 1271887320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:30:12,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 16:30:13,760][57339] Updated weights for policy 0, policy_version 658058 (0.0030) [2024-04-28 16:30:16,835][57339] Updated weights for policy 0, policy_version 658068 (0.0027) [2024-04-28 16:30:17,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10781802496. Throughput: 0: 56129.5. Samples: 1272226040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:30:17,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 16:30:19,479][57339] Updated weights for policy 0, policy_version 658078 (0.0031) [2024-04-28 16:30:22,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 10782081024. Throughput: 0: 55697.4. Samples: 1272386400. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-04-28 16:30:22,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 16:30:22,622][57339] Updated weights for policy 0, policy_version 658088 (0.0031) [2024-04-28 16:30:25,495][57339] Updated weights for policy 0, policy_version 658098 (0.0024) [2024-04-28 16:30:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 10782359552. Throughput: 0: 55768.8. Samples: 1272721460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:30:27,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 16:30:28,478][57339] Updated weights for policy 0, policy_version 658108 (0.0029) [2024-04-28 16:30:31,427][57339] Updated weights for policy 0, policy_version 658118 (0.0029) [2024-04-28 16:30:32,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.6, 300 sec: 55983.3). Total num frames: 10782638080. Throughput: 0: 55764.5. Samples: 1273055340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:30:32,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 16:30:34,394][57339] Updated weights for policy 0, policy_version 658128 (0.0030) [2024-04-28 16:30:37,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10782900224. Throughput: 0: 55800.0. Samples: 1273218140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:30:37,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 16:30:37,401][57339] Updated weights for policy 0, policy_version 658138 (0.0028) [2024-04-28 16:30:40,111][57339] Updated weights for policy 0, policy_version 658148 (0.0035) [2024-04-28 16:30:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10783195136. Throughput: 0: 55855.6. Samples: 1273553700. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:30:42,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 16:30:43,273][57339] Updated weights for policy 0, policy_version 658158 (0.0034) [2024-04-28 16:30:45,992][57339] Updated weights for policy 0, policy_version 658168 (0.0032) [2024-04-28 16:30:47,169][57108] Fps is (10 sec: 60621.2, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10783506432. Throughput: 0: 55862.3. Samples: 1273891500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:30:47,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 16:30:48,995][57339] Updated weights for policy 0, policy_version 658178 (0.0035) [2024-04-28 16:30:49,868][57319] Signal inference workers to stop experience collection... (19100 times) [2024-04-28 16:30:49,869][57319] Signal inference workers to resume experience collection... (19100 times) [2024-04-28 16:30:49,899][57339] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-04-28 16:30:49,900][57339] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-04-28 16:30:51,831][57339] Updated weights for policy 0, policy_version 658188 (0.0033) [2024-04-28 16:30:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10783752192. Throughput: 0: 55935.3. Samples: 1274068040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:30:52,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 16:30:54,856][57339] Updated weights for policy 0, policy_version 658198 (0.0024) [2024-04-28 16:30:57,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10784030720. Throughput: 0: 55796.6. Samples: 1274398160. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:30:57,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 16:30:57,761][57339] Updated weights for policy 0, policy_version 658208 (0.0033) [2024-04-28 16:31:00,805][57339] Updated weights for policy 0, policy_version 658218 (0.0032) [2024-04-28 16:31:02,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 10784325632. Throughput: 0: 55649.2. Samples: 1274730260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:02,170][57108] Avg episode reward: [(0, '0.509')] [2024-04-28 16:31:03,536][57339] Updated weights for policy 0, policy_version 658228 (0.0028) [2024-04-28 16:31:06,569][57339] Updated weights for policy 0, policy_version 658238 (0.0026) [2024-04-28 16:31:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 56038.8). Total num frames: 10784604160. Throughput: 0: 55908.3. Samples: 1274902280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:07,169][57108] Avg episode reward: [(0, '0.706')] [2024-04-28 16:31:09,363][57339] Updated weights for policy 0, policy_version 658248 (0.0031) [2024-04-28 16:31:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10784866304. Throughput: 0: 55951.8. Samples: 1275239300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:12,170][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 16:31:12,391][57339] Updated weights for policy 0, policy_version 658258 (0.0028) [2024-04-28 16:31:15,143][57339] Updated weights for policy 0, policy_version 658268 (0.0027) [2024-04-28 16:31:17,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10785161216. Throughput: 0: 55979.0. Samples: 1275574400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:17,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 16:31:18,428][57339] Updated weights for policy 0, policy_version 658278 (0.0032) [2024-04-28 16:31:20,907][57339] Updated weights for policy 0, policy_version 658288 (0.0030) [2024-04-28 16:31:22,169][57108] Fps is (10 sec: 57345.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10785439744. Throughput: 0: 55978.4. Samples: 1275737160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:22,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 16:31:24,316][57339] Updated weights for policy 0, policy_version 658298 (0.0028) [2024-04-28 16:31:26,751][57339] Updated weights for policy 0, policy_version 658308 (0.0033) [2024-04-28 16:31:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 10785734656. Throughput: 0: 56116.1. Samples: 1276078940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:27,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 16:31:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000658309_10785734656.pth... [2024-04-28 16:31:27,239][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000657488_10772283392.pth [2024-04-28 16:31:30,099][57339] Updated weights for policy 0, policy_version 658318 (0.0028) [2024-04-28 16:31:32,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10785980416. Throughput: 0: 56056.5. Samples: 1276414040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:32,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 16:31:32,619][57339] Updated weights for policy 0, policy_version 658328 (0.0029) [2024-04-28 16:31:35,845][57339] Updated weights for policy 0, policy_version 658338 (0.0032) [2024-04-28 16:31:37,169][57108] Fps is (10 sec: 55706.5, 60 sec: 56524.8, 300 sec: 56038.8). 
Total num frames: 10786291712. Throughput: 0: 55817.8. Samples: 1276579840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:37,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 16:31:38,632][57339] Updated weights for policy 0, policy_version 658348 (0.0034) [2024-04-28 16:31:41,697][57339] Updated weights for policy 0, policy_version 658358 (0.0028) [2024-04-28 16:31:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10786553856. Throughput: 0: 55970.2. Samples: 1276916820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:42,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 16:31:44,380][57339] Updated weights for policy 0, policy_version 658368 (0.0026) [2024-04-28 16:31:47,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 10786832384. Throughput: 0: 56059.1. Samples: 1277252920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:47,178][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:31:47,584][57339] Updated weights for policy 0, policy_version 658378 (0.0029) [2024-04-28 16:31:50,293][57339] Updated weights for policy 0, policy_version 658388 (0.0028) [2024-04-28 16:31:50,928][57319] Signal inference workers to stop experience collection... (19150 times) [2024-04-28 16:31:50,932][57319] Signal inference workers to resume experience collection... (19150 times) [2024-04-28 16:31:50,948][57339] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-04-28 16:31:50,948][57339] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-04-28 16:31:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10787110912. Throughput: 0: 56045.7. Samples: 1277424340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:52,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 16:31:53,377][57339] Updated weights for policy 0, policy_version 658398 (0.0030) [2024-04-28 16:31:56,121][57339] Updated weights for policy 0, policy_version 658408 (0.0028) [2024-04-28 16:31:57,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10787405824. Throughput: 0: 55946.3. Samples: 1277756880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:31:57,178][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 16:31:59,156][57339] Updated weights for policy 0, policy_version 658418 (0.0032) [2024-04-28 16:32:01,812][57339] Updated weights for policy 0, policy_version 658428 (0.0027) [2024-04-28 16:32:02,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10787684352. Throughput: 0: 55906.3. Samples: 1278090180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:02,178][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 16:32:05,107][57339] Updated weights for policy 0, policy_version 658438 (0.0031) [2024-04-28 16:32:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 10787979264. Throughput: 0: 56243.7. Samples: 1278268140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:07,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 16:32:07,490][57339] Updated weights for policy 0, policy_version 658448 (0.0028) [2024-04-28 16:32:10,946][57339] Updated weights for policy 0, policy_version 658458 (0.0028) [2024-04-28 16:32:12,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.9, 300 sec: 55983.3). Total num frames: 10788241408. Throughput: 0: 56108.3. Samples: 1278603800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:12,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 16:32:13,377][57339] Updated weights for policy 0, policy_version 658468 (0.0026) [2024-04-28 16:32:16,635][57339] Updated weights for policy 0, policy_version 658478 (0.0031) [2024-04-28 16:32:17,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56524.9, 300 sec: 56094.4). Total num frames: 10788552704. Throughput: 0: 56170.6. Samples: 1278941720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:17,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 16:32:19,250][57339] Updated weights for policy 0, policy_version 658488 (0.0032) [2024-04-28 16:32:22,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 10788798464. Throughput: 0: 56153.4. Samples: 1279106740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:22,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 16:32:22,458][57339] Updated weights for policy 0, policy_version 658498 (0.0031) [2024-04-28 16:32:25,580][57339] Updated weights for policy 0, policy_version 658508 (0.0028) [2024-04-28 16:32:27,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 10789076992. Throughput: 0: 56157.3. Samples: 1279443900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:27,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 16:32:28,447][57339] Updated weights for policy 0, policy_version 658518 (0.0032) [2024-04-28 16:32:31,334][57339] Updated weights for policy 0, policy_version 658528 (0.0025) [2024-04-28 16:32:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10789355520. Throughput: 0: 56158.8. Samples: 1279780060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:32,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 16:32:34,206][57339] Updated weights for policy 0, policy_version 658538 (0.0030) [2024-04-28 16:32:37,057][57339] Updated weights for policy 0, policy_version 658548 (0.0030) [2024-04-28 16:32:37,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55978.5, 300 sec: 55983.3). Total num frames: 10789650432. Throughput: 0: 55949.2. Samples: 1279942060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:37,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 16:32:39,947][57339] Updated weights for policy 0, policy_version 658558 (0.0028) [2024-04-28 16:32:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10789928960. Throughput: 0: 55900.1. Samples: 1280272380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:42,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 16:32:42,800][57339] Updated weights for policy 0, policy_version 658568 (0.0030) [2024-04-28 16:32:45,859][57339] Updated weights for policy 0, policy_version 658578 (0.0029) [2024-04-28 16:32:47,169][57108] Fps is (10 sec: 55706.7, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10790207488. Throughput: 0: 56024.1. Samples: 1280611260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:47,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 16:32:48,706][57339] Updated weights for policy 0, policy_version 658588 (0.0027) [2024-04-28 16:32:51,852][57339] Updated weights for policy 0, policy_version 658598 (0.0025) [2024-04-28 16:32:52,169][57108] Fps is (10 sec: 55704.5, 60 sec: 56251.6, 300 sec: 56038.8). Total num frames: 10790486016. Throughput: 0: 55986.1. Samples: 1280787520. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:52,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 16:32:54,610][57339] Updated weights for policy 0, policy_version 658608 (0.0025) [2024-04-28 16:32:57,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 10790748160. Throughput: 0: 55869.7. Samples: 1281117940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:32:57,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 16:32:57,609][57339] Updated weights for policy 0, policy_version 658618 (0.0027) [2024-04-28 16:33:00,380][57339] Updated weights for policy 0, policy_version 658628 (0.0032) [2024-04-28 16:33:02,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10791043072. Throughput: 0: 55902.1. Samples: 1281457320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:33:02,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 16:33:03,330][57339] Updated weights for policy 0, policy_version 658638 (0.0035) [2024-04-28 16:33:06,334][57339] Updated weights for policy 0, policy_version 658648 (0.0027) [2024-04-28 16:33:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 10791305216. Throughput: 0: 55938.6. Samples: 1281623980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:33:07,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 16:33:07,587][57319] Signal inference workers to stop experience collection... (19200 times) [2024-04-28 16:33:07,588][57319] Signal inference workers to resume experience collection... 
(19200 times) [2024-04-28 16:33:07,611][57339] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-04-28 16:33:07,612][57339] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-04-28 16:33:09,193][57339] Updated weights for policy 0, policy_version 658658 (0.0029) [2024-04-28 16:33:12,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10791600128. Throughput: 0: 55757.0. Samples: 1281952960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:33:12,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 16:33:12,241][57339] Updated weights for policy 0, policy_version 658668 (0.0026) [2024-04-28 16:33:15,105][57339] Updated weights for policy 0, policy_version 658678 (0.0034) [2024-04-28 16:33:17,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 10791878656. Throughput: 0: 55676.0. Samples: 1282285480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:33:17,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 16:33:18,340][57339] Updated weights for policy 0, policy_version 658688 (0.0027) [2024-04-28 16:33:20,932][57339] Updated weights for policy 0, policy_version 658698 (0.0033) [2024-04-28 16:33:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10792173568. Throughput: 0: 55949.2. Samples: 1282459760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:33:22,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 16:33:24,269][57339] Updated weights for policy 0, policy_version 658708 (0.0031) [2024-04-28 16:33:26,696][57339] Updated weights for policy 0, policy_version 658718 (0.0032) [2024-04-28 16:33:27,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.8, 300 sec: 56038.8). Total num frames: 10792452096. Throughput: 0: 56077.4. Samples: 1282795860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:33:27,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 16:33:27,255][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000658720_10792468480.pth... [2024-04-28 16:33:27,319][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000657900_10779033600.pth [2024-04-28 16:33:30,064][57339] Updated weights for policy 0, policy_version 658728 (0.0029) [2024-04-28 16:33:32,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10792697856. Throughput: 0: 56097.3. Samples: 1283135640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:33:32,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 16:33:32,577][57339] Updated weights for policy 0, policy_version 658738 (0.0030) [2024-04-28 16:33:36,061][57339] Updated weights for policy 0, policy_version 658748 (0.0027) [2024-04-28 16:33:37,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10792992768. Throughput: 0: 55841.5. Samples: 1283300380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-04-28 16:33:37,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 16:33:38,545][57339] Updated weights for policy 0, policy_version 658758 (0.0027) [2024-04-28 16:33:41,786][57339] Updated weights for policy 0, policy_version 658768 (0.0030) [2024-04-28 16:33:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10793271296. Throughput: 0: 55847.1. Samples: 1283631060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:33:42,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 16:33:44,478][57339] Updated weights for policy 0, policy_version 658778 (0.0028) [2024-04-28 16:33:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10793549824. Throughput: 0: 55806.2. Samples: 1283968600. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:33:47,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:33:47,556][57339] Updated weights for policy 0, policy_version 658788 (0.0032) [2024-04-28 16:33:50,308][57339] Updated weights for policy 0, policy_version 658798 (0.0037) [2024-04-28 16:33:52,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.9, 300 sec: 55927.8). Total num frames: 10793828352. Throughput: 0: 55849.5. Samples: 1284137200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:33:52,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 16:33:53,622][57339] Updated weights for policy 0, policy_version 658808 (0.0023) [2024-04-28 16:33:56,048][57339] Updated weights for policy 0, policy_version 658818 (0.0033) [2024-04-28 16:33:57,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10794123264. Throughput: 0: 55933.3. Samples: 1284469960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:33:57,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 16:33:59,473][57339] Updated weights for policy 0, policy_version 658828 (0.0026) [2024-04-28 16:34:02,106][57339] Updated weights for policy 0, policy_version 658838 (0.0027) [2024-04-28 16:34:02,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10794401792. Throughput: 0: 55831.5. Samples: 1284797900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:02,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 16:34:05,599][57339] Updated weights for policy 0, policy_version 658848 (0.0029) [2024-04-28 16:34:07,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10794663936. Throughput: 0: 55748.8. Samples: 1284968460. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:07,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 16:34:07,977][57339] Updated weights for policy 0, policy_version 658858 (0.0033) [2024-04-28 16:34:11,424][57339] Updated weights for policy 0, policy_version 658868 (0.0032) [2024-04-28 16:34:12,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10794926080. Throughput: 0: 55696.8. Samples: 1285302220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:12,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 16:34:13,786][57339] Updated weights for policy 0, policy_version 658878 (0.0024) [2024-04-28 16:34:17,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55816.6). Total num frames: 10795204608. Throughput: 0: 55396.7. Samples: 1285628500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:17,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 16:34:17,518][57339] Updated weights for policy 0, policy_version 658888 (0.0028) [2024-04-28 16:34:19,535][57319] Signal inference workers to stop experience collection... (19250 times) [2024-04-28 16:34:19,536][57319] Signal inference workers to resume experience collection... (19250 times) [2024-04-28 16:34:19,549][57339] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-04-28 16:34:19,549][57339] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-04-28 16:34:19,650][57339] Updated weights for policy 0, policy_version 658898 (0.0030) [2024-04-28 16:34:22,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10795499520. Throughput: 0: 55383.3. Samples: 1285792620. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:22,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 16:34:23,311][57339] Updated weights for policy 0, policy_version 658908 (0.0033) [2024-04-28 16:34:25,668][57339] Updated weights for policy 0, policy_version 658918 (0.0028) [2024-04-28 16:34:27,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 10795794432. Throughput: 0: 55509.7. Samples: 1286129000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:27,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 16:34:29,080][57339] Updated weights for policy 0, policy_version 658928 (0.0032) [2024-04-28 16:34:31,493][57339] Updated weights for policy 0, policy_version 658938 (0.0030) [2024-04-28 16:34:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10796056576. Throughput: 0: 55335.7. Samples: 1286458700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:32,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 16:34:35,023][57339] Updated weights for policy 0, policy_version 658948 (0.0029) [2024-04-28 16:34:37,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 10796351488. Throughput: 0: 55434.2. Samples: 1286631740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:37,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 16:34:37,305][57339] Updated weights for policy 0, policy_version 658958 (0.0028) [2024-04-28 16:34:40,951][57339] Updated weights for policy 0, policy_version 658968 (0.0025) [2024-04-28 16:34:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10796613632. Throughput: 0: 55537.2. Samples: 1286969140. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:42,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 16:34:43,206][57339] Updated weights for policy 0, policy_version 658978 (0.0030) [2024-04-28 16:34:46,633][57339] Updated weights for policy 0, policy_version 658988 (0.0027) [2024-04-28 16:34:47,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10796875776. Throughput: 0: 55687.6. Samples: 1287303840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:47,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 16:34:49,092][57339] Updated weights for policy 0, policy_version 658998 (0.0025) [2024-04-28 16:34:52,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10797154304. Throughput: 0: 55500.9. Samples: 1287466000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:52,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 16:34:52,395][57339] Updated weights for policy 0, policy_version 659008 (0.0037) [2024-04-28 16:34:54,851][57339] Updated weights for policy 0, policy_version 659018 (0.0032) [2024-04-28 16:34:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 10797449216. Throughput: 0: 55593.7. Samples: 1287803940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:34:57,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 16:34:58,465][57339] Updated weights for policy 0, policy_version 659028 (0.0031) [2024-04-28 16:35:00,679][57339] Updated weights for policy 0, policy_version 659038 (0.0028) [2024-04-28 16:35:02,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10797727744. Throughput: 0: 55692.4. Samples: 1288134660. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:35:02,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 16:35:04,262][57339] Updated weights for policy 0, policy_version 659048 (0.0026) [2024-04-28 16:35:06,516][57339] Updated weights for policy 0, policy_version 659058 (0.0033) [2024-04-28 16:35:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10798022656. Throughput: 0: 55998.5. Samples: 1288312560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:35:07,170][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 16:35:10,232][57339] Updated weights for policy 0, policy_version 659068 (0.0028) [2024-04-28 16:35:12,169][57108] Fps is (10 sec: 55707.0, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10798284800. Throughput: 0: 55918.1. Samples: 1288645300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-04-28 16:35:12,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 16:35:12,658][57339] Updated weights for policy 0, policy_version 659078 (0.0025) [2024-04-28 16:35:15,931][57339] Updated weights for policy 0, policy_version 659088 (0.0029) [2024-04-28 16:35:17,169][57108] Fps is (10 sec: 52429.7, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 10798546944. Throughput: 0: 56029.4. Samples: 1288980020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:35:17,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 16:35:18,055][57319] Signal inference workers to stop experience collection... (19300 times) [2024-04-28 16:35:18,055][57319] Signal inference workers to resume experience collection... 
(19300 times) [2024-04-28 16:35:18,079][57339] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-04-28 16:35:18,079][57339] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-04-28 16:35:18,413][57339] Updated weights for policy 0, policy_version 659098 (0.0026) [2024-04-28 16:35:21,711][57339] Updated weights for policy 0, policy_version 659108 (0.0026) [2024-04-28 16:35:22,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10798841856. Throughput: 0: 55874.6. Samples: 1289146100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:35:22,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:35:24,341][57339] Updated weights for policy 0, policy_version 659118 (0.0029) [2024-04-28 16:35:27,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10799120384. Throughput: 0: 55773.7. Samples: 1289478960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:35:27,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 16:35:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000659126_10799120384.pth... [2024-04-28 16:35:27,223][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000658309_10785734656.pth [2024-04-28 16:35:27,626][57339] Updated weights for policy 0, policy_version 659128 (0.0026) [2024-04-28 16:35:30,266][57339] Updated weights for policy 0, policy_version 659138 (0.0032) [2024-04-28 16:35:32,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 10799398912. Throughput: 0: 55839.5. Samples: 1289816620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:35:32,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 16:35:33,564][57339] Updated weights for policy 0, policy_version 659148 (0.0030) [2024-04-28 16:35:35,948][57339] Updated weights for policy 0, policy_version 659158 (0.0030) [2024-04-28 16:35:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 10799693824. Throughput: 0: 55974.6. Samples: 1289984860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:35:37,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:35:39,332][57339] Updated weights for policy 0, policy_version 659168 (0.0030) [2024-04-28 16:35:41,645][57339] Updated weights for policy 0, policy_version 659178 (0.0026) [2024-04-28 16:35:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10799972352. Throughput: 0: 56019.2. Samples: 1290324800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:35:42,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 16:35:45,248][57339] Updated weights for policy 0, policy_version 659188 (0.0026) [2024-04-28 16:35:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10800250880. Throughput: 0: 56136.2. Samples: 1290660780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:35:47,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 16:35:47,661][57339] Updated weights for policy 0, policy_version 659198 (0.0032) [2024-04-28 16:35:51,018][57339] Updated weights for policy 0, policy_version 659208 (0.0032) [2024-04-28 16:35:52,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10800513024. Throughput: 0: 55842.7. Samples: 1290825480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:35:52,169][57108] Avg episode reward: [(0, '0.718')] [2024-04-28 16:35:53,548][57339] Updated weights for policy 0, policy_version 659218 (0.0028) [2024-04-28 16:35:56,765][57339] Updated weights for policy 0, policy_version 659228 (0.0033) [2024-04-28 16:35:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10800807936. Throughput: 0: 55803.9. Samples: 1291156480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:35:57,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 16:35:59,491][57339] Updated weights for policy 0, policy_version 659238 (0.0026) [2024-04-28 16:36:02,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10801086464. Throughput: 0: 55889.6. Samples: 1291495060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:02,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 16:36:02,533][57339] Updated weights for policy 0, policy_version 659248 (0.0027) [2024-04-28 16:36:05,433][57339] Updated weights for policy 0, policy_version 659258 (0.0025) [2024-04-28 16:36:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10801364992. Throughput: 0: 55975.5. Samples: 1291665000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:07,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 16:36:08,436][57339] Updated weights for policy 0, policy_version 659268 (0.0029) [2024-04-28 16:36:11,125][57339] Updated weights for policy 0, policy_version 659278 (0.0028) [2024-04-28 16:36:12,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55978.6, 300 sec: 55872.3). Total num frames: 10801643520. Throughput: 0: 56015.2. Samples: 1291999640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:12,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:36:14,385][57339] Updated weights for policy 0, policy_version 659288 (0.0030) [2024-04-28 16:36:16,916][57339] Updated weights for policy 0, policy_version 659298 (0.0024) [2024-04-28 16:36:17,169][57108] Fps is (10 sec: 57343.2, 60 sec: 56524.6, 300 sec: 55927.7). Total num frames: 10801938432. Throughput: 0: 55861.7. Samples: 1292330400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:17,170][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 16:36:20,070][57339] Updated weights for policy 0, policy_version 659308 (0.0028) [2024-04-28 16:36:22,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10802200576. Throughput: 0: 55874.7. Samples: 1292499220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:22,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 16:36:22,920][57339] Updated weights for policy 0, policy_version 659318 (0.0027) [2024-04-28 16:36:25,924][57339] Updated weights for policy 0, policy_version 659328 (0.0031) [2024-04-28 16:36:27,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10802479104. Throughput: 0: 55738.1. Samples: 1292833020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:27,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 16:36:28,809][57339] Updated weights for policy 0, policy_version 659338 (0.0030) [2024-04-28 16:36:31,884][57339] Updated weights for policy 0, policy_version 659348 (0.0027) [2024-04-28 16:36:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10802757632. Throughput: 0: 55795.5. Samples: 1293171580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 16:36:34,788][57339] Updated weights for policy 0, policy_version 659358 (0.0031) [2024-04-28 16:36:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10803036160. Throughput: 0: 55899.6. Samples: 1293340960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:37,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 16:36:37,693][57339] Updated weights for policy 0, policy_version 659368 (0.0024) [2024-04-28 16:36:39,793][57319] Signal inference workers to stop experience collection... (19350 times) [2024-04-28 16:36:39,793][57319] Signal inference workers to resume experience collection... (19350 times) [2024-04-28 16:36:39,827][57339] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-04-28 16:36:39,827][57339] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-04-28 16:36:40,820][57339] Updated weights for policy 0, policy_version 659378 (0.0028) [2024-04-28 16:36:42,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10803314688. Throughput: 0: 55836.2. Samples: 1293669120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:42,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 16:36:43,744][57339] Updated weights for policy 0, policy_version 659388 (0.0035) [2024-04-28 16:36:46,677][57339] Updated weights for policy 0, policy_version 659398 (0.0026) [2024-04-28 16:36:47,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10803609600. Throughput: 0: 55746.4. Samples: 1294003640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:47,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 16:36:49,457][57339] Updated weights for policy 0, policy_version 659408 (0.0028) [2024-04-28 16:36:52,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10803871744. Throughput: 0: 55651.2. Samples: 1294169300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 16:36:52,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 16:36:52,533][57339] Updated weights for policy 0, policy_version 659418 (0.0028) [2024-04-28 16:36:55,385][57339] Updated weights for policy 0, policy_version 659428 (0.0031) [2024-04-28 16:36:57,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10804133888. Throughput: 0: 55580.4. Samples: 1294500760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:36:57,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 16:36:58,553][57339] Updated weights for policy 0, policy_version 659438 (0.0030) [2024-04-28 16:37:01,359][57339] Updated weights for policy 0, policy_version 659448 (0.0029) [2024-04-28 16:37:02,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10804412416. Throughput: 0: 55572.5. Samples: 1294831160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:02,170][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 16:37:04,441][57339] Updated weights for policy 0, policy_version 659458 (0.0031) [2024-04-28 16:37:07,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10804707328. Throughput: 0: 55518.8. Samples: 1294997560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:07,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 16:37:07,218][57339] Updated weights for policy 0, policy_version 659468 (0.0030) [2024-04-28 16:37:10,310][57339] Updated weights for policy 0, policy_version 659478 (0.0026) [2024-04-28 16:37:12,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10804985856. Throughput: 0: 55482.0. Samples: 1295329700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:12,169][57108] Avg episode reward: [(0, '0.701')] [2024-04-28 16:37:13,032][57339] Updated weights for policy 0, policy_version 659488 (0.0028) [2024-04-28 16:37:16,160][57339] Updated weights for policy 0, policy_version 659498 (0.0027) [2024-04-28 16:37:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10805264384. Throughput: 0: 55441.4. Samples: 1295666440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:17,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 16:37:18,991][57339] Updated weights for policy 0, policy_version 659508 (0.0034) [2024-04-28 16:37:22,113][57339] Updated weights for policy 0, policy_version 659518 (0.0028) [2024-04-28 16:37:22,172][57108] Fps is (10 sec: 55688.9, 60 sec: 55702.9, 300 sec: 55816.1). Total num frames: 10805542912. Throughput: 0: 55470.6. Samples: 1295837300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:22,172][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 16:37:25,041][57339] Updated weights for policy 0, policy_version 659528 (0.0026) [2024-04-28 16:37:27,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10805805056. Throughput: 0: 55558.3. Samples: 1296169240. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:27,170][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 16:37:27,215][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000659535_10805821440.pth... [2024-04-28 16:37:27,260][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000658720_10792468480.pth [2024-04-28 16:37:27,849][57339] Updated weights for policy 0, policy_version 659538 (0.0032) [2024-04-28 16:37:30,854][57339] Updated weights for policy 0, policy_version 659548 (0.0029) [2024-04-28 16:37:32,169][57108] Fps is (10 sec: 52444.7, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10806067200. Throughput: 0: 55558.7. Samples: 1296503780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:32,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 16:37:33,904][57339] Updated weights for policy 0, policy_version 659558 (0.0031) [2024-04-28 16:37:36,833][57339] Updated weights for policy 0, policy_version 659568 (0.0033) [2024-04-28 16:37:37,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10806362112. Throughput: 0: 55381.4. Samples: 1296661460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:37,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 16:37:39,905][57339] Updated weights for policy 0, policy_version 659578 (0.0030) [2024-04-28 16:37:42,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10806640640. Throughput: 0: 55504.6. Samples: 1296998460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:42,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 16:37:42,909][57339] Updated weights for policy 0, policy_version 659588 (0.0028) [2024-04-28 16:37:45,586][57339] Updated weights for policy 0, policy_version 659598 (0.0029) [2024-04-28 16:37:47,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55159.4, 300 sec: 55705.6). 
Total num frames: 10806919168. Throughput: 0: 55544.0. Samples: 1297330640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:47,170][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 16:37:48,922][57339] Updated weights for policy 0, policy_version 659608 (0.0024) [2024-04-28 16:37:51,402][57339] Updated weights for policy 0, policy_version 659618 (0.0029) [2024-04-28 16:37:52,169][57108] Fps is (10 sec: 57342.0, 60 sec: 55705.3, 300 sec: 55816.6). Total num frames: 10807214080. Throughput: 0: 55796.5. Samples: 1297508420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:52,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 16:37:54,793][57339] Updated weights for policy 0, policy_version 659628 (0.0029) [2024-04-28 16:37:57,145][57319] Signal inference workers to stop experience collection... (19400 times) [2024-04-28 16:37:57,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10807476224. Throughput: 0: 55900.0. Samples: 1297845200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:37:57,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 16:37:57,181][57339] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-04-28 16:37:57,235][57319] Signal inference workers to resume experience collection... (19400 times) [2024-04-28 16:37:57,235][57339] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-04-28 16:37:57,347][57339] Updated weights for policy 0, policy_version 659638 (0.0028) [2024-04-28 16:38:00,663][57339] Updated weights for policy 0, policy_version 659648 (0.0030) [2024-04-28 16:38:02,169][57108] Fps is (10 sec: 54069.1, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 10807754752. Throughput: 0: 55695.2. Samples: 1298172720. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:38:02,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 16:38:03,385][57339] Updated weights for policy 0, policy_version 659658 (0.0025) [2024-04-28 16:38:06,395][57339] Updated weights for policy 0, policy_version 659668 (0.0025) [2024-04-28 16:38:07,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 10808016896. Throughput: 0: 55598.2. Samples: 1298339060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:38:07,178][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 16:38:09,184][57339] Updated weights for policy 0, policy_version 659678 (0.0034) [2024-04-28 16:38:12,118][57339] Updated weights for policy 0, policy_version 659688 (0.0031) [2024-04-28 16:38:12,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10808328192. Throughput: 0: 55610.3. Samples: 1298671700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:38:12,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 16:38:14,925][57339] Updated weights for policy 0, policy_version 659698 (0.0027) [2024-04-28 16:38:17,169][57108] Fps is (10 sec: 58983.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10808606720. Throughput: 0: 55629.3. Samples: 1299007100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:38:17,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:38:18,080][57339] Updated weights for policy 0, policy_version 659708 (0.0026) [2024-04-28 16:38:20,885][57339] Updated weights for policy 0, policy_version 659718 (0.0031) [2024-04-28 16:38:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55708.3, 300 sec: 55705.6). Total num frames: 10808885248. Throughput: 0: 55979.8. Samples: 1299180560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:38:22,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 16:38:23,824][57339] Updated weights for policy 0, policy_version 659728 (0.0029) [2024-04-28 16:38:26,729][57339] Updated weights for policy 0, policy_version 659738 (0.0024) [2024-04-28 16:38:27,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10809163776. Throughput: 0: 55822.5. Samples: 1299510480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 16:38:27,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 16:38:29,930][57339] Updated weights for policy 0, policy_version 659748 (0.0031) [2024-04-28 16:38:32,170][57108] Fps is (10 sec: 54062.7, 60 sec: 55977.8, 300 sec: 55705.5). Total num frames: 10809425920. Throughput: 0: 56001.7. Samples: 1299850760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:38:32,171][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 16:38:32,470][57339] Updated weights for policy 0, policy_version 659758 (0.0030) [2024-04-28 16:38:35,828][57339] Updated weights for policy 0, policy_version 659768 (0.0027) [2024-04-28 16:38:37,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.4, 300 sec: 55761.1). Total num frames: 10809720832. Throughput: 0: 55610.8. Samples: 1300010900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:38:37,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 16:38:38,248][57339] Updated weights for policy 0, policy_version 659778 (0.0028) [2024-04-28 16:38:41,537][57339] Updated weights for policy 0, policy_version 659788 (0.0025) [2024-04-28 16:38:42,169][57108] Fps is (10 sec: 55710.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10809982976. Throughput: 0: 55554.7. Samples: 1300345160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:38:42,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:38:44,033][57339] Updated weights for policy 0, policy_version 659798 (0.0030) [2024-04-28 16:38:47,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10810277888. Throughput: 0: 55855.4. Samples: 1300686220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:38:47,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 16:38:47,216][57339] Updated weights for policy 0, policy_version 659808 (0.0028) [2024-04-28 16:38:50,005][57339] Updated weights for policy 0, policy_version 659818 (0.0031) [2024-04-28 16:38:51,590][57319] Signal inference workers to stop experience collection... (19450 times) [2024-04-28 16:38:51,590][57319] Signal inference workers to resume experience collection... (19450 times) [2024-04-28 16:38:51,618][57339] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-04-28 16:38:51,619][57339] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-04-28 16:38:52,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10810572800. Throughput: 0: 56004.4. Samples: 1300859260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:38:52,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 16:38:53,076][57339] Updated weights for policy 0, policy_version 659828 (0.0028) [2024-04-28 16:38:55,890][57339] Updated weights for policy 0, policy_version 659838 (0.0028) [2024-04-28 16:38:57,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10810834944. Throughput: 0: 56033.2. Samples: 1301193200. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:38:57,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 16:38:59,254][57339] Updated weights for policy 0, policy_version 659848 (0.0032) [2024-04-28 16:39:01,685][57339] Updated weights for policy 0, policy_version 659858 (0.0024) [2024-04-28 16:39:02,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10811113472. Throughput: 0: 55987.1. Samples: 1301526520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:02,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 16:39:05,076][57339] Updated weights for policy 0, policy_version 659868 (0.0032) [2024-04-28 16:39:07,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10811408384. Throughput: 0: 55995.1. Samples: 1301700340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:07,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 16:39:07,468][57339] Updated weights for policy 0, policy_version 659878 (0.0028) [2024-04-28 16:39:10,779][57339] Updated weights for policy 0, policy_version 659888 (0.0025) [2024-04-28 16:39:12,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10811670528. Throughput: 0: 55999.0. Samples: 1302030440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:12,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 16:39:13,487][57339] Updated weights for policy 0, policy_version 659898 (0.0028) [2024-04-28 16:39:16,685][57339] Updated weights for policy 0, policy_version 659908 (0.0029) [2024-04-28 16:39:17,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10811949056. Throughput: 0: 55960.3. Samples: 1302368920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:17,169][57108] Avg episode reward: [(0, '0.734')] [2024-04-28 16:39:19,552][57339] Updated weights for policy 0, policy_version 659918 (0.0026) [2024-04-28 16:39:22,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10812227584. Throughput: 0: 56033.5. Samples: 1302532400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:22,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 16:39:22,513][57339] Updated weights for policy 0, policy_version 659928 (0.0029) [2024-04-28 16:39:25,321][57339] Updated weights for policy 0, policy_version 659938 (0.0027) [2024-04-28 16:39:27,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10812522496. Throughput: 0: 56035.4. Samples: 1302866760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:27,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 16:39:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000659944_10812522496.pth... [2024-04-28 16:39:27,236][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000659126_10799120384.pth [2024-04-28 16:39:28,341][57339] Updated weights for policy 0, policy_version 659948 (0.0027) [2024-04-28 16:39:31,037][57339] Updated weights for policy 0, policy_version 659958 (0.0026) [2024-04-28 16:39:32,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55706.3, 300 sec: 55650.0). Total num frames: 10812768256. Throughput: 0: 55956.8. Samples: 1303204280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:32,170][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 16:39:34,074][57339] Updated weights for policy 0, policy_version 659968 (0.0037) [2024-04-28 16:39:37,077][57339] Updated weights for policy 0, policy_version 659978 (0.0029) [2024-04-28 16:39:37,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55978.9, 300 sec: 55816.7). 
Total num frames: 10813079552. Throughput: 0: 55856.7. Samples: 1303372800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:37,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:39:39,815][57339] Updated weights for policy 0, policy_version 659988 (0.0026) [2024-04-28 16:39:42,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10813358080. Throughput: 0: 55929.4. Samples: 1303710020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:42,170][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 16:39:42,986][57339] Updated weights for policy 0, policy_version 659998 (0.0030) [2024-04-28 16:39:46,006][57339] Updated weights for policy 0, policy_version 660008 (0.0028) [2024-04-28 16:39:47,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10813636608. Throughput: 0: 55969.1. Samples: 1304045140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:47,170][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 16:39:48,841][57339] Updated weights for policy 0, policy_version 660018 (0.0024) [2024-04-28 16:39:51,740][57339] Updated weights for policy 0, policy_version 660028 (0.0026) [2024-04-28 16:39:52,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10813898752. Throughput: 0: 55867.6. Samples: 1304214380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:52,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 16:39:54,575][57339] Updated weights for policy 0, policy_version 660038 (0.0029) [2024-04-28 16:39:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10814193664. Throughput: 0: 55973.0. Samples: 1304549220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:39:57,170][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 16:39:57,692][57339] Updated weights for policy 0, policy_version 660048 (0.0029) [2024-04-28 16:39:59,710][57319] Signal inference workers to stop experience collection... (19500 times) [2024-04-28 16:39:59,749][57339] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-04-28 16:39:59,772][57319] Signal inference workers to resume experience collection... (19500 times) [2024-04-28 16:39:59,773][57339] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-04-28 16:40:00,369][57339] Updated weights for policy 0, policy_version 660058 (0.0026) [2024-04-28 16:40:02,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 10814472192. Throughput: 0: 55878.6. Samples: 1304883460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:40:02,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 16:40:03,640][57339] Updated weights for policy 0, policy_version 660068 (0.0025) [2024-04-28 16:40:06,340][57339] Updated weights for policy 0, policy_version 660078 (0.0027) [2024-04-28 16:40:07,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 10814750720. Throughput: 0: 55925.3. Samples: 1305049040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 16:40:07,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:40:09,548][57339] Updated weights for policy 0, policy_version 660088 (0.0030) [2024-04-28 16:40:12,101][57339] Updated weights for policy 0, policy_version 660098 (0.0029) [2024-04-28 16:40:12,169][57108] Fps is (10 sec: 57343.0, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 10815045632. Throughput: 0: 55946.2. Samples: 1305384340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:12,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 16:40:15,287][57339] Updated weights for policy 0, policy_version 660108 (0.0028) [2024-04-28 16:40:17,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56251.5, 300 sec: 55872.2). Total num frames: 10815324160. Throughput: 0: 55843.0. Samples: 1305717220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:17,170][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 16:40:17,916][57339] Updated weights for policy 0, policy_version 660118 (0.0032) [2024-04-28 16:40:21,207][57339] Updated weights for policy 0, policy_version 660128 (0.0028) [2024-04-28 16:40:22,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10815586304. Throughput: 0: 55935.0. Samples: 1305889880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:22,169][57108] Avg episode reward: [(0, '0.693')] [2024-04-28 16:40:23,789][57339] Updated weights for policy 0, policy_version 660138 (0.0030) [2024-04-28 16:40:27,003][57339] Updated weights for policy 0, policy_version 660148 (0.0028) [2024-04-28 16:40:27,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10815864832. Throughput: 0: 55932.5. Samples: 1306226980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:27,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 16:40:29,708][57339] Updated weights for policy 0, policy_version 660158 (0.0033) [2024-04-28 16:40:32,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10816126976. Throughput: 0: 55917.8. Samples: 1306561440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:32,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 16:40:33,008][57339] Updated weights for policy 0, policy_version 660168 (0.0031) [2024-04-28 16:40:35,583][57339] Updated weights for policy 0, policy_version 660178 (0.0026) [2024-04-28 16:40:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10816421888. Throughput: 0: 55773.0. Samples: 1306724160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:37,169][57108] Avg episode reward: [(0, '0.509')] [2024-04-28 16:40:38,859][57339] Updated weights for policy 0, policy_version 660188 (0.0030) [2024-04-28 16:40:41,324][57339] Updated weights for policy 0, policy_version 660198 (0.0029) [2024-04-28 16:40:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10816700416. Throughput: 0: 55722.1. Samples: 1307056720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:42,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 16:40:44,657][57339] Updated weights for policy 0, policy_version 660208 (0.0026) [2024-04-28 16:40:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.9, 300 sec: 55872.2). Total num frames: 10816995328. Throughput: 0: 55774.7. Samples: 1307393320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:47,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 16:40:47,196][57339] Updated weights for policy 0, policy_version 660218 (0.0028) [2024-04-28 16:40:50,501][57339] Updated weights for policy 0, policy_version 660228 (0.0027) [2024-04-28 16:40:52,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10817273856. Throughput: 0: 55902.2. Samples: 1307564640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:52,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 16:40:53,167][57339] Updated weights for policy 0, policy_version 660238 (0.0028) [2024-04-28 16:40:56,356][57339] Updated weights for policy 0, policy_version 660248 (0.0029) [2024-04-28 16:40:57,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10817552384. Throughput: 0: 55945.8. Samples: 1307901900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:40:57,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:40:58,903][57339] Updated weights for policy 0, policy_version 660258 (0.0034) [2024-04-28 16:41:02,149][57339] Updated weights for policy 0, policy_version 660268 (0.0034) [2024-04-28 16:41:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10817830912. Throughput: 0: 56120.6. Samples: 1308242640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:41:02,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 16:41:02,584][57319] Signal inference workers to stop experience collection... (19550 times) [2024-04-28 16:41:02,584][57319] Signal inference workers to resume experience collection... (19550 times) [2024-04-28 16:41:02,599][57339] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-04-28 16:41:02,617][57339] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-04-28 16:41:04,799][57339] Updated weights for policy 0, policy_version 660278 (0.0026) [2024-04-28 16:41:07,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10818093056. Throughput: 0: 55747.5. Samples: 1308398520. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:41:07,170][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 16:41:07,900][57339] Updated weights for policy 0, policy_version 660288 (0.0031) [2024-04-28 16:41:10,654][57339] Updated weights for policy 0, policy_version 660298 (0.0033) [2024-04-28 16:41:12,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 10818387968. Throughput: 0: 55755.6. Samples: 1308735980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:41:12,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 16:41:13,775][57339] Updated weights for policy 0, policy_version 660308 (0.0028) [2024-04-28 16:41:16,439][57339] Updated weights for policy 0, policy_version 660318 (0.0024) [2024-04-28 16:41:17,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10818650112. Throughput: 0: 55700.1. Samples: 1309067940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:41:17,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 16:41:19,608][57339] Updated weights for policy 0, policy_version 660328 (0.0031) [2024-04-28 16:41:22,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10818961408. Throughput: 0: 55969.7. Samples: 1309242800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:41:22,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:41:22,201][57339] Updated weights for policy 0, policy_version 660338 (0.0029) [2024-04-28 16:41:25,625][57339] Updated weights for policy 0, policy_version 660348 (0.0031) [2024-04-28 16:41:27,169][57108] Fps is (10 sec: 58981.5, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10819239936. Throughput: 0: 56160.4. Samples: 1309583940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:41:27,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 16:41:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000660354_10819239936.pth... [2024-04-28 16:41:27,242][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000659535_10805821440.pth [2024-04-28 16:41:27,995][57339] Updated weights for policy 0, policy_version 660358 (0.0032) [2024-04-28 16:41:31,445][57339] Updated weights for policy 0, policy_version 660368 (0.0024) [2024-04-28 16:41:32,169][57108] Fps is (10 sec: 55704.4, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10819518464. Throughput: 0: 56131.2. Samples: 1309919240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:41:32,170][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 16:41:33,908][57339] Updated weights for policy 0, policy_version 660378 (0.0028) [2024-04-28 16:41:37,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 10819764224. Throughput: 0: 56004.4. Samples: 1310084840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:41:37,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 16:41:37,422][57339] Updated weights for policy 0, policy_version 660388 (0.0029) [2024-04-28 16:41:39,864][57339] Updated weights for policy 0, policy_version 660398 (0.0032) [2024-04-28 16:41:42,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10820059136. Throughput: 0: 55853.8. Samples: 1310415320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-04-28 16:41:42,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 16:41:43,226][57339] Updated weights for policy 0, policy_version 660408 (0.0028) [2024-04-28 16:41:46,191][57339] Updated weights for policy 0, policy_version 660418 (0.0037) [2024-04-28 16:41:47,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55816.7). 
Total num frames: 10820337664. Throughput: 0: 55772.5. Samples: 1310752400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:41:47,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 16:41:49,017][57339] Updated weights for policy 0, policy_version 660428 (0.0024) [2024-04-28 16:41:51,925][57339] Updated weights for policy 0, policy_version 660438 (0.0028) [2024-04-28 16:41:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10820616192. Throughput: 0: 56016.5. Samples: 1310919260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:41:52,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 16:41:54,933][57339] Updated weights for policy 0, policy_version 660448 (0.0029) [2024-04-28 16:41:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10820894720. Throughput: 0: 55852.8. Samples: 1311249360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:41:57,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 16:41:57,849][57339] Updated weights for policy 0, policy_version 660458 (0.0027) [2024-04-28 16:42:00,781][57339] Updated weights for policy 0, policy_version 660468 (0.0028) [2024-04-28 16:42:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55816.6). Total num frames: 10821173248. Throughput: 0: 55861.3. Samples: 1311581700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:02,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 16:42:03,775][57339] Updated weights for policy 0, policy_version 660478 (0.0025) [2024-04-28 16:42:06,702][57339] Updated weights for policy 0, policy_version 660488 (0.0034) [2024-04-28 16:42:07,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10821451776. Throughput: 0: 55772.0. Samples: 1311752540. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:07,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 16:42:08,346][57319] Signal inference workers to stop experience collection... (19600 times) [2024-04-28 16:42:08,386][57339] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-04-28 16:42:08,437][57319] Signal inference workers to resume experience collection... (19600 times) [2024-04-28 16:42:08,437][57339] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-04-28 16:42:09,549][57339] Updated weights for policy 0, policy_version 660498 (0.0025) [2024-04-28 16:42:12,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10821713920. Throughput: 0: 55613.2. Samples: 1312086520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:12,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 16:42:12,534][57339] Updated weights for policy 0, policy_version 660508 (0.0036) [2024-04-28 16:42:15,400][57339] Updated weights for policy 0, policy_version 660518 (0.0032) [2024-04-28 16:42:17,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55978.6, 300 sec: 55817.2). Total num frames: 10822008832. Throughput: 0: 55536.5. Samples: 1312418380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:17,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 16:42:18,647][57339] Updated weights for policy 0, policy_version 660528 (0.0033) [2024-04-28 16:42:21,490][57339] Updated weights for policy 0, policy_version 660538 (0.0028) [2024-04-28 16:42:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 10822270976. Throughput: 0: 55371.7. Samples: 1312576560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:22,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 16:42:24,559][57339] Updated weights for policy 0, policy_version 660548 (0.0027) [2024-04-28 16:42:27,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.7, 300 sec: 55927.7). Total num frames: 10822565888. Throughput: 0: 55518.3. Samples: 1312913640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:27,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 16:42:27,398][57339] Updated weights for policy 0, policy_version 660558 (0.0031) [2024-04-28 16:42:30,412][57339] Updated weights for policy 0, policy_version 660568 (0.0029) [2024-04-28 16:42:32,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55159.5, 300 sec: 55816.6). Total num frames: 10822828032. Throughput: 0: 55359.5. Samples: 1313243580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:32,170][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 16:42:33,200][57339] Updated weights for policy 0, policy_version 660578 (0.0029) [2024-04-28 16:42:36,163][57339] Updated weights for policy 0, policy_version 660588 (0.0030) [2024-04-28 16:42:37,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10823122944. Throughput: 0: 55351.6. Samples: 1313410080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:37,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:42:39,275][57339] Updated weights for policy 0, policy_version 660598 (0.0026) [2024-04-28 16:42:42,112][57339] Updated weights for policy 0, policy_version 660608 (0.0029) [2024-04-28 16:42:42,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10823401472. Throughput: 0: 55359.6. Samples: 1313740540. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:42,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 16:42:45,033][57339] Updated weights for policy 0, policy_version 660618 (0.0026) [2024-04-28 16:42:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 10823663616. Throughput: 0: 55379.6. Samples: 1314073780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:47,170][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 16:42:48,083][57339] Updated weights for policy 0, policy_version 660628 (0.0033) [2024-04-28 16:42:51,063][57339] Updated weights for policy 0, policy_version 660638 (0.0028) [2024-04-28 16:42:52,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 10823925760. Throughput: 0: 55193.3. Samples: 1314236240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:52,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 16:42:53,877][57339] Updated weights for policy 0, policy_version 660648 (0.0031) [2024-04-28 16:42:56,940][57339] Updated weights for policy 0, policy_version 660658 (0.0034) [2024-04-28 16:42:57,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10824237056. Throughput: 0: 55259.5. Samples: 1314573200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:42:57,170][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 16:42:59,870][57339] Updated weights for policy 0, policy_version 660668 (0.0027) [2024-04-28 16:43:02,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10824499200. Throughput: 0: 55246.3. Samples: 1314904460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:43:02,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 16:43:02,903][57339] Updated weights for policy 0, policy_version 660678 (0.0031) [2024-04-28 16:43:05,779][57339] Updated weights for policy 0, policy_version 660688 (0.0025) [2024-04-28 16:43:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10824794112. Throughput: 0: 55619.5. Samples: 1315079440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:43:07,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 16:43:08,805][57339] Updated weights for policy 0, policy_version 660698 (0.0029) [2024-04-28 16:43:11,621][57339] Updated weights for policy 0, policy_version 660708 (0.0030) [2024-04-28 16:43:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10825056256. Throughput: 0: 55472.5. Samples: 1315409900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:43:12,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 16:43:14,711][57339] Updated weights for policy 0, policy_version 660718 (0.0030) [2024-04-28 16:43:17,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10825318400. Throughput: 0: 55533.1. Samples: 1315742560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:43:17,169][57108] Avg episode reward: [(0, '0.741')] [2024-04-28 16:43:17,476][57339] Updated weights for policy 0, policy_version 660728 (0.0025) [2024-04-28 16:43:20,655][57339] Updated weights for policy 0, policy_version 660738 (0.0031) [2024-04-28 16:43:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10825596928. Throughput: 0: 55460.9. Samples: 1315905820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:43:22,169][57108] Avg episode reward: [(0, '0.511')] [2024-04-28 16:43:22,850][57319] Signal inference workers to stop experience collection... (19650 times) [2024-04-28 16:43:22,850][57319] Signal inference workers to resume experience collection... (19650 times) [2024-04-28 16:43:22,874][57339] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-04-28 16:43:22,874][57339] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-04-28 16:43:23,311][57339] Updated weights for policy 0, policy_version 660748 (0.0034) [2024-04-28 16:43:26,504][57339] Updated weights for policy 0, policy_version 660758 (0.0034) [2024-04-28 16:43:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55816.8). Total num frames: 10825891840. Throughput: 0: 55591.2. Samples: 1316242140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:43:27,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 16:43:27,278][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000660761_10825908224.pth... [2024-04-28 16:43:27,318][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000659944_10812522496.pth [2024-04-28 16:43:29,269][57339] Updated weights for policy 0, policy_version 660768 (0.0033) [2024-04-28 16:43:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10826153984. Throughput: 0: 55558.8. Samples: 1316573920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:43:32,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 16:43:32,481][57339] Updated weights for policy 0, policy_version 660778 (0.0033) [2024-04-28 16:43:35,063][57339] Updated weights for policy 0, policy_version 660788 (0.0032) [2024-04-28 16:43:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 10826432512. Throughput: 0: 55534.2. 
Samples: 1316735280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:43:37,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 16:43:38,381][57339] Updated weights for policy 0, policy_version 660798 (0.0027) [2024-04-28 16:43:40,884][57339] Updated weights for policy 0, policy_version 660808 (0.0030) [2024-04-28 16:43:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10826727424. Throughput: 0: 55447.2. Samples: 1317068320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:43:42,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 16:43:44,120][57339] Updated weights for policy 0, policy_version 660818 (0.0027) [2024-04-28 16:43:47,000][57339] Updated weights for policy 0, policy_version 660828 (0.0031) [2024-04-28 16:43:47,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10827005952. Throughput: 0: 55502.6. Samples: 1317402080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:43:47,169][57108] Avg episode reward: [(0, '0.701')] [2024-04-28 16:43:49,897][57339] Updated weights for policy 0, policy_version 660838 (0.0030) [2024-04-28 16:43:52,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10827268096. Throughput: 0: 55417.9. Samples: 1317573240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:43:52,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 16:43:53,018][57339] Updated weights for policy 0, policy_version 660848 (0.0031) [2024-04-28 16:43:55,875][57339] Updated weights for policy 0, policy_version 660858 (0.0028) [2024-04-28 16:43:57,169][57108] Fps is (10 sec: 52429.6, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 10827530240. Throughput: 0: 55438.4. Samples: 1317904620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:43:57,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 16:43:58,808][57339] Updated weights for policy 0, policy_version 660868 (0.0025) [2024-04-28 16:44:01,942][57339] Updated weights for policy 0, policy_version 660878 (0.0033) [2024-04-28 16:44:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10827825152. Throughput: 0: 55450.3. Samples: 1318237820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:02,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 16:44:04,756][57339] Updated weights for policy 0, policy_version 660888 (0.0029) [2024-04-28 16:44:07,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10828120064. Throughput: 0: 55544.0. Samples: 1318405300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:07,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 16:44:07,796][57339] Updated weights for policy 0, policy_version 660898 (0.0026) [2024-04-28 16:44:10,667][57339] Updated weights for policy 0, policy_version 660908 (0.0025) [2024-04-28 16:44:12,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10828365824. Throughput: 0: 55532.0. Samples: 1318741080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:12,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 16:44:13,859][57339] Updated weights for policy 0, policy_version 660918 (0.0029) [2024-04-28 16:44:16,450][57339] Updated weights for policy 0, policy_version 660928 (0.0030) [2024-04-28 16:44:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 10828677120. Throughput: 0: 55437.3. Samples: 1319068600. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:17,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 16:44:19,880][57339] Updated weights for policy 0, policy_version 660938 (0.0024) [2024-04-28 16:44:22,159][57339] Updated weights for policy 0, policy_version 660948 (0.0028) [2024-04-28 16:44:22,169][57108] Fps is (10 sec: 60620.0, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10828972032. Throughput: 0: 55766.9. Samples: 1319244800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:22,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 16:44:25,900][57339] Updated weights for policy 0, policy_version 660958 (0.0029) [2024-04-28 16:44:26,845][57319] Signal inference workers to stop experience collection... (19700 times) [2024-04-28 16:44:26,879][57339] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-04-28 16:44:26,905][57319] Signal inference workers to resume experience collection... (19700 times) [2024-04-28 16:44:26,906][57339] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-04-28 16:44:27,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10829234176. Throughput: 0: 55871.5. Samples: 1319582540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:27,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 16:44:28,186][57339] Updated weights for policy 0, policy_version 660968 (0.0032) [2024-04-28 16:44:31,741][57339] Updated weights for policy 0, policy_version 660978 (0.0037) [2024-04-28 16:44:32,169][57108] Fps is (10 sec: 50791.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10829479936. Throughput: 0: 55930.8. Samples: 1319918960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:32,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 16:44:34,142][57339] Updated weights for policy 0, policy_version 660988 (0.0034) [2024-04-28 16:44:37,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10829774848. Throughput: 0: 55752.7. Samples: 1320082120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:37,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 16:44:37,558][57339] Updated weights for policy 0, policy_version 660998 (0.0029) [2024-04-28 16:44:40,040][57339] Updated weights for policy 0, policy_version 661008 (0.0033) [2024-04-28 16:44:42,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10830069760. Throughput: 0: 55733.5. Samples: 1320412640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:42,170][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 16:44:43,441][57339] Updated weights for policy 0, policy_version 661018 (0.0031) [2024-04-28 16:44:45,992][57339] Updated weights for policy 0, policy_version 661028 (0.0026) [2024-04-28 16:44:47,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10830331904. Throughput: 0: 55770.5. Samples: 1320747500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:47,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 16:44:49,169][57339] Updated weights for policy 0, policy_version 661038 (0.0031) [2024-04-28 16:44:51,936][57339] Updated weights for policy 0, policy_version 661048 (0.0037) [2024-04-28 16:44:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10830626816. Throughput: 0: 55856.9. Samples: 1320918860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:52,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 16:44:55,151][57339] Updated weights for policy 0, policy_version 661058 (0.0032) [2024-04-28 16:44:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 10830905344. Throughput: 0: 55748.3. Samples: 1321249760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 16:44:57,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 16:44:57,715][57339] Updated weights for policy 0, policy_version 661068 (0.0031) [2024-04-28 16:45:01,050][57339] Updated weights for policy 0, policy_version 661078 (0.0028) [2024-04-28 16:45:02,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10831167488. Throughput: 0: 55949.8. Samples: 1321586340. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:02,170][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 16:45:03,514][57339] Updated weights for policy 0, policy_version 661088 (0.0035) [2024-04-28 16:45:07,161][57339] Updated weights for policy 0, policy_version 661098 (0.0025) [2024-04-28 16:45:07,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10831429632. Throughput: 0: 55620.5. Samples: 1321747720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:07,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 16:45:09,327][57339] Updated weights for policy 0, policy_version 661108 (0.0026) [2024-04-28 16:45:12,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10831724544. Throughput: 0: 55564.8. Samples: 1322082960. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:12,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 16:45:13,146][57339] Updated weights for policy 0, policy_version 661118 (0.0026) [2024-04-28 16:45:15,266][57339] Updated weights for policy 0, policy_version 661128 (0.0032) [2024-04-28 16:45:16,121][57319] Signal inference workers to stop experience collection... (19750 times) [2024-04-28 16:45:16,122][57319] Signal inference workers to resume experience collection... (19750 times) [2024-04-28 16:45:16,148][57339] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-04-28 16:45:16,149][57339] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-04-28 16:45:17,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10832003072. Throughput: 0: 55517.0. Samples: 1322417220. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:17,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 16:45:19,040][57339] Updated weights for policy 0, policy_version 661138 (0.0034) [2024-04-28 16:45:21,167][57339] Updated weights for policy 0, policy_version 661148 (0.0036) [2024-04-28 16:45:22,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 10832281600. Throughput: 0: 55572.4. Samples: 1322582880. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:22,170][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 16:45:24,934][57339] Updated weights for policy 0, policy_version 661158 (0.0030) [2024-04-28 16:45:27,095][57339] Updated weights for policy 0, policy_version 661168 (0.0026) [2024-04-28 16:45:27,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10832576512. Throughput: 0: 55527.2. Samples: 1322911360. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:27,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 16:45:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000661168_10832576512.pth... [2024-04-28 16:45:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000660354_10819239936.pth [2024-04-28 16:45:30,869][57339] Updated weights for policy 0, policy_version 661178 (0.0034) [2024-04-28 16:45:32,169][57108] Fps is (10 sec: 55706.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10832838656. Throughput: 0: 55378.8. Samples: 1323239540. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:32,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 16:45:33,000][57339] Updated weights for policy 0, policy_version 661188 (0.0029) [2024-04-28 16:45:36,672][57339] Updated weights for policy 0, policy_version 661198 (0.0026) [2024-04-28 16:45:37,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10833117184. Throughput: 0: 55421.9. Samples: 1323412840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:37,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 16:45:38,939][57339] Updated weights for policy 0, policy_version 661208 (0.0026) [2024-04-28 16:45:42,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54886.5, 300 sec: 55483.4). Total num frames: 10833362944. Throughput: 0: 55562.7. Samples: 1323750080. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:42,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 16:45:42,633][57339] Updated weights for policy 0, policy_version 661218 (0.0026) [2024-04-28 16:45:44,700][57339] Updated weights for policy 0, policy_version 661228 (0.0026) [2024-04-28 16:45:47,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10833674240. Throughput: 0: 55391.5. Samples: 1324078960. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:47,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 16:45:48,398][57339] Updated weights for policy 0, policy_version 661238 (0.0025) [2024-04-28 16:45:50,549][57339] Updated weights for policy 0, policy_version 661248 (0.0025) [2024-04-28 16:45:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10833936384. Throughput: 0: 55548.5. Samples: 1324247400. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:52,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 16:45:54,346][57339] Updated weights for policy 0, policy_version 661258 (0.0030) [2024-04-28 16:45:56,435][57339] Updated weights for policy 0, policy_version 661268 (0.0029) [2024-04-28 16:45:57,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10834231296. Throughput: 0: 55543.6. Samples: 1324582420. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:45:57,169][57108] Avg episode reward: [(0, '0.492')] [2024-04-28 16:46:00,005][57319] Signal inference workers to stop experience collection... (19800 times) [2024-04-28 16:46:00,050][57339] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-04-28 16:46:00,062][57319] Signal inference workers to resume experience collection... (19800 times) [2024-04-28 16:46:00,071][57339] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-04-28 16:46:00,074][57339] Updated weights for policy 0, policy_version 661278 (0.0026) [2024-04-28 16:46:02,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10834509824. Throughput: 0: 55555.0. Samples: 1324917200. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:46:02,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 16:46:02,352][57339] Updated weights for policy 0, policy_version 661288 (0.0030) [2024-04-28 16:46:05,979][57339] Updated weights for policy 0, policy_version 661298 (0.0038) [2024-04-28 16:46:07,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 10834788352. Throughput: 0: 55533.1. Samples: 1325081860. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:46:07,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 16:46:08,270][57339] Updated weights for policy 0, policy_version 661308 (0.0033) [2024-04-28 16:46:11,754][57339] Updated weights for policy 0, policy_version 661318 (0.0029) [2024-04-28 16:46:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10835050496. Throughput: 0: 55712.6. Samples: 1325418420. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:46:12,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 16:46:14,275][57339] Updated weights for policy 0, policy_version 661328 (0.0030) [2024-04-28 16:46:17,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.4, 300 sec: 55427.9). Total num frames: 10835312640. Throughput: 0: 55866.6. Samples: 1325753540. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:46:17,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 16:46:17,669][57339] Updated weights for policy 0, policy_version 661338 (0.0029) [2024-04-28 16:46:20,232][57339] Updated weights for policy 0, policy_version 661348 (0.0043) [2024-04-28 16:46:22,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10835623936. Throughput: 0: 55565.7. Samples: 1325913300. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:46:22,170][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 16:46:23,458][57339] Updated weights for policy 0, policy_version 661358 (0.0026) [2024-04-28 16:46:26,293][57339] Updated weights for policy 0, policy_version 661368 (0.0029) [2024-04-28 16:46:27,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10835902464. Throughput: 0: 55616.8. Samples: 1326252840. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:46:27,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 16:46:29,374][57339] Updated weights for policy 0, policy_version 661378 (0.0033) [2024-04-28 16:46:32,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10836164608. Throughput: 0: 55673.1. Samples: 1326584260. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:46:32,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 16:46:32,334][57339] Updated weights for policy 0, policy_version 661388 (0.0026) [2024-04-28 16:46:35,245][57339] Updated weights for policy 0, policy_version 661398 (0.0029) [2024-04-28 16:46:37,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10836475904. Throughput: 0: 55891.1. Samples: 1326762500. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 16:46:37,169][57108] Avg episode reward: [(0, '0.505')] [2024-04-28 16:46:37,987][57339] Updated weights for policy 0, policy_version 661408 (0.0027) [2024-04-28 16:46:41,083][57339] Updated weights for policy 0, policy_version 661418 (0.0034) [2024-04-28 16:46:42,169][57108] Fps is (10 sec: 58983.0, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 10836754432. Throughput: 0: 56015.6. Samples: 1327103120. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:46:42,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 16:46:43,607][57339] Updated weights for policy 0, policy_version 661428 (0.0026) [2024-04-28 16:46:44,800][57319] Signal inference workers to stop experience collection... (19850 times) [2024-04-28 16:46:44,800][57319] Signal inference workers to resume experience collection... (19850 times) [2024-04-28 16:46:44,817][57339] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-04-28 16:46:44,818][57339] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-04-28 16:46:46,723][57339] Updated weights for policy 0, policy_version 661438 (0.0032) [2024-04-28 16:46:47,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10837016576. Throughput: 0: 56052.1. Samples: 1327439540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:46:47,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 16:46:49,572][57339] Updated weights for policy 0, policy_version 661448 (0.0025) [2024-04-28 16:46:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10837278720. Throughput: 0: 55948.4. Samples: 1327599540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:46:52,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 16:46:52,644][57339] Updated weights for policy 0, policy_version 661458 (0.0030) [2024-04-28 16:46:55,390][57339] Updated weights for policy 0, policy_version 661468 (0.0028) [2024-04-28 16:46:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 10837573632. Throughput: 0: 55876.9. Samples: 1327932880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:46:57,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 16:46:58,641][57339] Updated weights for policy 0, policy_version 661478 (0.0029) [2024-04-28 16:47:01,152][57339] Updated weights for policy 0, policy_version 661488 (0.0025) [2024-04-28 16:47:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10837835776. Throughput: 0: 55931.1. Samples: 1328270440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:02,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 16:47:04,334][57339] Updated weights for policy 0, policy_version 661498 (0.0026) [2024-04-28 16:47:07,041][57339] Updated weights for policy 0, policy_version 661508 (0.0026) [2024-04-28 16:47:07,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10838147072. Throughput: 0: 56096.4. Samples: 1328437640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:07,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 16:47:10,004][57339] Updated weights for policy 0, policy_version 661518 (0.0022) [2024-04-28 16:47:12,169][57108] Fps is (10 sec: 60620.9, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 10838441984. Throughput: 0: 55991.7. Samples: 1328772460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:12,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 16:47:12,875][57339] Updated weights for policy 0, policy_version 661528 (0.0031) [2024-04-28 16:47:15,924][57339] Updated weights for policy 0, policy_version 661538 (0.0028) [2024-04-28 16:47:17,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56797.8, 300 sec: 55761.1). Total num frames: 10838720512. Throughput: 0: 56084.1. Samples: 1329108040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:17,169][57108] Avg episode reward: [(0, '0.497')] [2024-04-28 16:47:18,779][57339] Updated weights for policy 0, policy_version 661548 (0.0027) [2024-04-28 16:47:21,907][57339] Updated weights for policy 0, policy_version 661558 (0.0036) [2024-04-28 16:47:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10838982656. Throughput: 0: 55954.6. Samples: 1329280460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:22,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 16:47:24,467][57339] Updated weights for policy 0, policy_version 661568 (0.0032) [2024-04-28 16:47:27,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10839244800. Throughput: 0: 55899.0. Samples: 1329618580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:27,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 16:47:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000661575_10839244800.pth... [2024-04-28 16:47:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000660761_10825908224.pth [2024-04-28 16:47:27,684][57339] Updated weights for policy 0, policy_version 661578 (0.0028) [2024-04-28 16:47:30,194][57339] Updated weights for policy 0, policy_version 661588 (0.0028) [2024-04-28 16:47:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 10839523328. Throughput: 0: 55719.0. Samples: 1329946900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:32,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 16:47:33,456][57339] Updated weights for policy 0, policy_version 661598 (0.0028) [2024-04-28 16:47:36,580][57339] Updated weights for policy 0, policy_version 661608 (0.0029) [2024-04-28 16:47:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55594.5). 
Total num frames: 10839801856. Throughput: 0: 55827.6. Samples: 1330111780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:37,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 16:47:39,327][57339] Updated weights for policy 0, policy_version 661618 (0.0029) [2024-04-28 16:47:42,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10840096768. Throughput: 0: 55871.8. Samples: 1330447120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:42,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 16:47:42,626][57339] Updated weights for policy 0, policy_version 661628 (0.0029) [2024-04-28 16:47:45,280][57339] Updated weights for policy 0, policy_version 661638 (0.0026) [2024-04-28 16:47:47,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10840391680. Throughput: 0: 55777.8. Samples: 1330780440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:47,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 16:47:48,368][57339] Updated weights for policy 0, policy_version 661648 (0.0033) [2024-04-28 16:47:51,029][57339] Updated weights for policy 0, policy_version 661658 (0.0026) [2024-04-28 16:47:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 10840670208. Throughput: 0: 55954.7. Samples: 1330955600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:52,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 16:47:54,360][57339] Updated weights for policy 0, policy_version 661668 (0.0027) [2024-04-28 16:47:56,276][57319] Signal inference workers to stop experience collection... (19900 times) [2024-04-28 16:47:56,277][57319] Signal inference workers to resume experience collection... 
(19900 times) [2024-04-28 16:47:56,299][57339] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-04-28 16:47:56,299][57339] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-04-28 16:47:56,775][57339] Updated weights for policy 0, policy_version 661678 (0.0027) [2024-04-28 16:47:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10840948736. Throughput: 0: 56001.7. Samples: 1331292540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:47:57,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 16:48:00,372][57339] Updated weights for policy 0, policy_version 661688 (0.0027) [2024-04-28 16:48:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 10841210880. Throughput: 0: 55971.5. Samples: 1331626760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:48:02,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 16:48:02,692][57339] Updated weights for policy 0, policy_version 661698 (0.0029) [2024-04-28 16:48:06,088][57339] Updated weights for policy 0, policy_version 661708 (0.0027) [2024-04-28 16:48:07,171][57108] Fps is (10 sec: 52416.3, 60 sec: 55430.4, 300 sec: 55649.6). Total num frames: 10841473024. Throughput: 0: 55604.6. Samples: 1331782800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:48:07,172][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 16:48:08,558][57339] Updated weights for policy 0, policy_version 661718 (0.0028) [2024-04-28 16:48:11,867][57339] Updated weights for policy 0, policy_version 661728 (0.0034) [2024-04-28 16:48:12,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10841751552. Throughput: 0: 55531.1. Samples: 1332117480. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:48:12,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 16:48:14,500][57339] Updated weights for policy 0, policy_version 661738 (0.0037) [2024-04-28 16:48:17,169][57108] Fps is (10 sec: 57357.2, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10842046464. Throughput: 0: 55728.0. Samples: 1332454660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-04-28 16:48:17,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 16:48:17,859][57339] Updated weights for policy 0, policy_version 661748 (0.0025) [2024-04-28 16:48:20,228][57339] Updated weights for policy 0, policy_version 661758 (0.0033) [2024-04-28 16:48:22,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10842341376. Throughput: 0: 55953.7. Samples: 1332629700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:48:22,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 16:48:23,612][57339] Updated weights for policy 0, policy_version 661768 (0.0029) [2024-04-28 16:48:26,191][57339] Updated weights for policy 0, policy_version 661778 (0.0024) [2024-04-28 16:48:27,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10842619904. Throughput: 0: 55898.8. Samples: 1332962560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:48:27,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 16:48:29,503][57339] Updated weights for policy 0, policy_version 661788 (0.0029) [2024-04-28 16:48:31,961][57339] Updated weights for policy 0, policy_version 661798 (0.0028) [2024-04-28 16:48:32,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10842898432. Throughput: 0: 55911.0. Samples: 1333296440. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:48:32,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 16:48:35,338][57339] Updated weights for policy 0, policy_version 661808 (0.0031) [2024-04-28 16:48:37,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10843160576. Throughput: 0: 55821.4. Samples: 1333467560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:48:37,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 16:48:37,817][57339] Updated weights for policy 0, policy_version 661818 (0.0025) [2024-04-28 16:48:41,163][57339] Updated weights for policy 0, policy_version 661828 (0.0030) [2024-04-28 16:48:42,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10843439104. Throughput: 0: 55870.5. Samples: 1333806720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:48:42,170][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 16:48:43,645][57339] Updated weights for policy 0, policy_version 661838 (0.0025) [2024-04-28 16:48:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10843701248. Throughput: 0: 56005.4. Samples: 1334147000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:48:47,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 16:48:47,247][57339] Updated weights for policy 0, policy_version 661848 (0.0030) [2024-04-28 16:48:49,480][57339] Updated weights for policy 0, policy_version 661858 (0.0030) [2024-04-28 16:48:52,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10843996160. Throughput: 0: 55869.3. Samples: 1334296780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:48:52,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 16:48:53,055][57339] Updated weights for policy 0, policy_version 661868 (0.0034) [2024-04-28 16:48:55,290][57339] Updated weights for policy 0, policy_version 661878 (0.0027) [2024-04-28 16:48:57,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10844291072. Throughput: 0: 55909.4. Samples: 1334633400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:48:57,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 16:48:58,959][57339] Updated weights for policy 0, policy_version 661888 (0.0036) [2024-04-28 16:49:00,242][57319] Signal inference workers to stop experience collection... (19950 times) [2024-04-28 16:49:00,242][57319] Signal inference workers to resume experience collection... (19950 times) [2024-04-28 16:49:00,256][57339] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-04-28 16:49:00,257][57339] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-04-28 16:49:01,242][57339] Updated weights for policy 0, policy_version 661898 (0.0031) [2024-04-28 16:49:02,169][57108] Fps is (10 sec: 58981.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10844585984. Throughput: 0: 55778.1. Samples: 1334964680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:02,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 16:49:04,867][57339] Updated weights for policy 0, policy_version 661908 (0.0030) [2024-04-28 16:49:07,083][57339] Updated weights for policy 0, policy_version 661918 (0.0029) [2024-04-28 16:49:07,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56527.1, 300 sec: 55927.8). Total num frames: 10844864512. Throughput: 0: 55843.7. Samples: 1335142660. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 16:49:10,682][57339] Updated weights for policy 0, policy_version 661928 (0.0027) [2024-04-28 16:49:12,169][57108] Fps is (10 sec: 52429.8, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10845110272. Throughput: 0: 55938.6. Samples: 1335479800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:12,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 16:49:12,923][57339] Updated weights for policy 0, policy_version 661938 (0.0031) [2024-04-28 16:49:16,636][57339] Updated weights for policy 0, policy_version 661948 (0.0029) [2024-04-28 16:49:17,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10845405184. Throughput: 0: 56029.4. Samples: 1335817760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:17,169][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 16:49:18,771][57339] Updated weights for policy 0, policy_version 661958 (0.0026) [2024-04-28 16:49:22,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 10845650944. Throughput: 0: 55670.9. Samples: 1335972760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:22,170][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 16:49:22,502][57339] Updated weights for policy 0, policy_version 661968 (0.0037) [2024-04-28 16:49:24,532][57339] Updated weights for policy 0, policy_version 661978 (0.0025) [2024-04-28 16:49:27,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10845945856. Throughput: 0: 55567.8. Samples: 1336307260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:27,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 16:49:27,176][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000661984_10845945856.pth... 
[2024-04-28 16:49:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000661168_10832576512.pth [2024-04-28 16:49:28,305][57339] Updated weights for policy 0, policy_version 661988 (0.0027) [2024-04-28 16:49:30,681][57339] Updated weights for policy 0, policy_version 661998 (0.0028) [2024-04-28 16:49:32,169][57108] Fps is (10 sec: 58983.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10846240768. Throughput: 0: 55420.1. Samples: 1336640900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 16:49:34,175][57339] Updated weights for policy 0, policy_version 662008 (0.0027) [2024-04-28 16:49:36,612][57339] Updated weights for policy 0, policy_version 662018 (0.0028) [2024-04-28 16:49:37,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 10846519296. Throughput: 0: 56008.4. Samples: 1336817160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:37,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 16:49:39,983][57339] Updated weights for policy 0, policy_version 662028 (0.0030) [2024-04-28 16:49:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 10846814208. Throughput: 0: 55936.8. Samples: 1337150560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:42,178][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 16:49:42,400][57339] Updated weights for policy 0, policy_version 662038 (0.0034) [2024-04-28 16:49:45,955][57339] Updated weights for policy 0, policy_version 662048 (0.0033) [2024-04-28 16:49:47,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 10847092736. Throughput: 0: 55961.9. Samples: 1337482960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:47,178][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 16:49:48,215][57339] Updated weights for policy 0, policy_version 662058 (0.0026) [2024-04-28 16:49:51,757][57339] Updated weights for policy 0, policy_version 662068 (0.0029) [2024-04-28 16:49:52,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10847338496. Throughput: 0: 55653.3. Samples: 1337647060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-04-28 16:49:52,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 16:49:54,092][57339] Updated weights for policy 0, policy_version 662078 (0.0029) [2024-04-28 16:49:57,169][57108] Fps is (10 sec: 50790.8, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10847600640. Throughput: 0: 55612.9. Samples: 1337982380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:49:57,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 16:49:57,760][57339] Updated weights for policy 0, policy_version 662088 (0.0026) [2024-04-28 16:49:59,944][57339] Updated weights for policy 0, policy_version 662098 (0.0027) [2024-04-28 16:50:02,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 10847895552. Throughput: 0: 55527.3. Samples: 1338316500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:02,170][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 16:50:03,502][57339] Updated weights for policy 0, policy_version 662108 (0.0031) [2024-04-28 16:50:05,770][57339] Updated weights for policy 0, policy_version 662118 (0.0030) [2024-04-28 16:50:07,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10848157696. Throughput: 0: 55626.1. Samples: 1338475920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 16:50:09,403][57339] Updated weights for policy 0, policy_version 662128 (0.0026) [2024-04-28 16:50:11,553][57339] Updated weights for policy 0, policy_version 662138 (0.0025) [2024-04-28 16:50:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 10848468992. Throughput: 0: 55597.1. Samples: 1338809140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:12,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 16:50:15,175][57319] Signal inference workers to stop experience collection... (20000 times) [2024-04-28 16:50:15,175][57319] Signal inference workers to resume experience collection... (20000 times) [2024-04-28 16:50:15,202][57339] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-04-28 16:50:15,202][57339] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-04-28 16:50:15,284][57339] Updated weights for policy 0, policy_version 662148 (0.0023) [2024-04-28 16:50:17,169][57108] Fps is (10 sec: 60620.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10848763904. Throughput: 0: 55598.6. Samples: 1339142840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:17,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 16:50:17,455][57339] Updated weights for policy 0, policy_version 662158 (0.0029) [2024-04-28 16:50:21,242][57339] Updated weights for policy 0, policy_version 662168 (0.0027) [2024-04-28 16:50:22,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10849026048. Throughput: 0: 55563.0. Samples: 1339317500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:22,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 16:50:23,486][57339] Updated weights for policy 0, policy_version 662178 (0.0037) [2024-04-28 16:50:27,139][57339] Updated weights for policy 0, policy_version 662188 (0.0030) [2024-04-28 16:50:27,169][57108] Fps is (10 sec: 52427.9, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 10849288192. Throughput: 0: 55483.4. Samples: 1339647320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:27,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 16:50:29,630][57339] Updated weights for policy 0, policy_version 662198 (0.0027) [2024-04-28 16:50:32,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10849550336. Throughput: 0: 55468.2. Samples: 1339979020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:32,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 16:50:32,987][57339] Updated weights for policy 0, policy_version 662208 (0.0032) [2024-04-28 16:50:35,452][57339] Updated weights for policy 0, policy_version 662218 (0.0030) [2024-04-28 16:50:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.3, 300 sec: 55816.7). Total num frames: 10849828864. Throughput: 0: 55310.0. Samples: 1340136020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:37,170][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 16:50:38,854][57339] Updated weights for policy 0, policy_version 662228 (0.0027) [2024-04-28 16:50:41,900][57339] Updated weights for policy 0, policy_version 662238 (0.0027) [2024-04-28 16:50:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10850107392. Throughput: 0: 55229.3. Samples: 1340467700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:42,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 16:50:44,752][57339] Updated weights for policy 0, policy_version 662248 (0.0029) [2024-04-28 16:50:47,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10850418688. Throughput: 0: 55205.9. Samples: 1340800760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:47,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 16:50:47,702][57339] Updated weights for policy 0, policy_version 662258 (0.0027) [2024-04-28 16:50:50,591][57339] Updated weights for policy 0, policy_version 662268 (0.0027) [2024-04-28 16:50:52,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10850697216. Throughput: 0: 55638.7. Samples: 1340979660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:52,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 16:50:53,590][57339] Updated weights for policy 0, policy_version 662278 (0.0029) [2024-04-28 16:50:56,450][57339] Updated weights for policy 0, policy_version 662288 (0.0030) [2024-04-28 16:50:57,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10850959360. Throughput: 0: 55727.3. Samples: 1341316860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:50:57,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 16:50:59,472][57339] Updated weights for policy 0, policy_version 662298 (0.0025) [2024-04-28 16:51:02,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 10851221504. Throughput: 0: 55752.4. Samples: 1341651700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:51:02,169][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 16:51:02,373][57339] Updated weights for policy 0, policy_version 662308 (0.0032) [2024-04-28 16:51:05,192][57339] Updated weights for policy 0, policy_version 662318 (0.0029) [2024-04-28 16:51:07,169][57108] Fps is (10 sec: 50790.0, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 10851467264. Throughput: 0: 55226.3. Samples: 1341802680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:51:07,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 16:51:08,352][57339] Updated weights for policy 0, policy_version 662328 (0.0029) [2024-04-28 16:51:09,242][57319] Signal inference workers to stop experience collection... (20050 times) [2024-04-28 16:51:09,242][57319] Signal inference workers to resume experience collection... (20050 times) [2024-04-28 16:51:09,262][57339] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-04-28 16:51:09,291][57339] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-04-28 16:51:11,617][57339] Updated weights for policy 0, policy_version 662338 (0.0034) [2024-04-28 16:51:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.5, 300 sec: 55761.1). Total num frames: 10851762176. Throughput: 0: 55292.6. Samples: 1342135480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:51:12,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 16:51:14,318][57339] Updated weights for policy 0, policy_version 662348 (0.0029) [2024-04-28 16:51:17,169][57108] Fps is (10 sec: 58982.1, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10852057088. Throughput: 0: 55327.4. Samples: 1342468760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:51:17,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 16:51:17,366][57339] Updated weights for policy 0, policy_version 662358 (0.0025) [2024-04-28 16:51:20,108][57339] Updated weights for policy 0, policy_version 662368 (0.0026) [2024-04-28 16:51:22,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10852352000. Throughput: 0: 55624.1. Samples: 1342639100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:51:22,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 16:51:23,398][57339] Updated weights for policy 0, policy_version 662378 (0.0024) [2024-04-28 16:51:25,900][57339] Updated weights for policy 0, policy_version 662388 (0.0025) [2024-04-28 16:51:27,169][57108] Fps is (10 sec: 60620.8, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10852663296. Throughput: 0: 55741.2. Samples: 1342976060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:51:27,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 16:51:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000662394_10852663296.pth... [2024-04-28 16:51:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000661575_10839244800.pth [2024-04-28 16:51:29,355][57339] Updated weights for policy 0, policy_version 662398 (0.0034) [2024-04-28 16:51:31,923][57339] Updated weights for policy 0, policy_version 662408 (0.0032) [2024-04-28 16:51:32,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10852892672. Throughput: 0: 55607.3. Samples: 1343303080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 16:51:32,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:51:35,344][57339] Updated weights for policy 0, policy_version 662418 (0.0027) [2024-04-28 16:51:37,169][57108] Fps is (10 sec: 49152.3, 60 sec: 55432.6, 300 sec: 55594.5). 
Total num frames: 10853154816. Throughput: 0: 55280.8. Samples: 1343467300. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:51:37,169][57108] Avg episode reward: [(0, '0.513')] [2024-04-28 16:51:38,004][57339] Updated weights for policy 0, policy_version 662428 (0.0025) [2024-04-28 16:51:41,023][57339] Updated weights for policy 0, policy_version 662438 (0.0031) [2024-04-28 16:51:42,169][57108] Fps is (10 sec: 54066.0, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10853433344. Throughput: 0: 55293.6. Samples: 1343805080. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:51:42,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 16:51:43,823][57339] Updated weights for policy 0, policy_version 662448 (0.0028) [2024-04-28 16:51:46,878][57339] Updated weights for policy 0, policy_version 662458 (0.0034) [2024-04-28 16:51:47,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10853711872. Throughput: 0: 55339.6. Samples: 1344141980. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:51:47,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 16:51:49,651][57339] Updated weights for policy 0, policy_version 662468 (0.0026) [2024-04-28 16:51:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 10854006784. Throughput: 0: 55676.8. Samples: 1344308140. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:51:52,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 16:51:52,755][57339] Updated weights for policy 0, policy_version 662478 (0.0033) [2024-04-28 16:51:55,423][57339] Updated weights for policy 0, policy_version 662488 (0.0033) [2024-04-28 16:51:57,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10854301696. Throughput: 0: 55574.2. Samples: 1344636320. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:51:57,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 16:51:58,563][57339] Updated weights for policy 0, policy_version 662498 (0.0031) [2024-04-28 16:52:01,162][57319] Signal inference workers to stop experience collection... (20100 times) [2024-04-28 16:52:01,162][57319] Signal inference workers to resume experience collection... (20100 times) [2024-04-28 16:52:01,184][57339] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-04-28 16:52:01,184][57339] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-04-28 16:52:01,274][57339] Updated weights for policy 0, policy_version 662508 (0.0029) [2024-04-28 16:52:02,169][57108] Fps is (10 sec: 58983.2, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 10854596608. Throughput: 0: 55610.8. Samples: 1344971240. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:02,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 16:52:04,402][57339] Updated weights for policy 0, policy_version 662518 (0.0030) [2024-04-28 16:52:07,169][57108] Fps is (10 sec: 54066.9, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 10854842368. Throughput: 0: 55555.6. Samples: 1345139100. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:07,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 16:52:07,205][57339] Updated weights for policy 0, policy_version 662528 (0.0033) [2024-04-28 16:52:10,470][57339] Updated weights for policy 0, policy_version 662538 (0.0028) [2024-04-28 16:52:12,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10855104512. Throughput: 0: 55501.0. Samples: 1345473600. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:12,169][57108] Avg episode reward: [(0, '0.499')] [2024-04-28 16:52:13,027][57339] Updated weights for policy 0, policy_version 662548 (0.0027) [2024-04-28 16:52:16,385][57339] Updated weights for policy 0, policy_version 662558 (0.0030) [2024-04-28 16:52:17,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 10855366656. Throughput: 0: 55668.8. Samples: 1345808180. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:17,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 16:52:18,774][57339] Updated weights for policy 0, policy_version 662568 (0.0026) [2024-04-28 16:52:22,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10855661568. Throughput: 0: 55622.6. Samples: 1345970320. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:22,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 16:52:22,430][57339] Updated weights for policy 0, policy_version 662578 (0.0026) [2024-04-28 16:52:24,731][57339] Updated weights for policy 0, policy_version 662588 (0.0035) [2024-04-28 16:52:27,169][57108] Fps is (10 sec: 58981.1, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10855956480. Throughput: 0: 55498.6. Samples: 1346302520. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:27,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 16:52:28,270][57339] Updated weights for policy 0, policy_version 662598 (0.0028) [2024-04-28 16:52:30,517][57339] Updated weights for policy 0, policy_version 662608 (0.0037) [2024-04-28 16:52:32,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10856251392. Throughput: 0: 55437.2. Samples: 1346636660. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:32,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 16:52:34,054][57339] Updated weights for policy 0, policy_version 662618 (0.0028) [2024-04-28 16:52:36,378][57339] Updated weights for policy 0, policy_version 662628 (0.0026) [2024-04-28 16:52:37,169][57108] Fps is (10 sec: 57345.2, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10856529920. Throughput: 0: 55733.5. Samples: 1346816140. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:37,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 16:52:39,830][57339] Updated weights for policy 0, policy_version 662638 (0.0032) [2024-04-28 16:52:42,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55650.0). Total num frames: 10856808448. Throughput: 0: 55913.8. Samples: 1347152440. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:42,178][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 16:52:42,507][57339] Updated weights for policy 0, policy_version 662648 (0.0029) [2024-04-28 16:52:45,738][57339] Updated weights for policy 0, policy_version 662658 (0.0030) [2024-04-28 16:52:47,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10857054208. Throughput: 0: 55868.3. Samples: 1347485320. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:47,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 16:52:48,475][57339] Updated weights for policy 0, policy_version 662668 (0.0028) [2024-04-28 16:52:51,637][57339] Updated weights for policy 0, policy_version 662678 (0.0030) [2024-04-28 16:52:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10857332736. Throughput: 0: 55662.7. Samples: 1347643920. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:52,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 16:52:54,193][57339] Updated weights for policy 0, policy_version 662688 (0.0029) [2024-04-28 16:52:57,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10857611264. Throughput: 0: 55714.6. Samples: 1347980760. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:52:57,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 16:52:57,313][57319] Signal inference workers to stop experience collection... (20150 times) [2024-04-28 16:52:57,347][57339] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-04-28 16:52:57,374][57319] Signal inference workers to resume experience collection... (20150 times) [2024-04-28 16:52:57,378][57339] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-04-28 16:52:57,483][57339] Updated weights for policy 0, policy_version 662698 (0.0024) [2024-04-28 16:53:00,190][57339] Updated weights for policy 0, policy_version 662708 (0.0030) [2024-04-28 16:53:02,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55159.3, 300 sec: 55706.0). Total num frames: 10857906176. Throughput: 0: 55747.8. Samples: 1348316840. Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:53:02,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 16:53:03,230][57339] Updated weights for policy 0, policy_version 662718 (0.0027) [2024-04-28 16:53:05,973][57339] Updated weights for policy 0, policy_version 662728 (0.0032) [2024-04-28 16:53:07,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10858184704. Throughput: 0: 55938.8. Samples: 1348487560. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 23.0) [2024-04-28 16:53:07,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 16:53:09,126][57339] Updated weights for policy 0, policy_version 662738 (0.0027) [2024-04-28 16:53:11,850][57339] Updated weights for policy 0, policy_version 662748 (0.0024) [2024-04-28 16:53:12,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10858479616. Throughput: 0: 55972.2. Samples: 1348821260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:12,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 16:53:14,818][57339] Updated weights for policy 0, policy_version 662758 (0.0024) [2024-04-28 16:53:17,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 10858758144. Throughput: 0: 55974.8. Samples: 1349155520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:17,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 16:53:17,660][57339] Updated weights for policy 0, policy_version 662768 (0.0027) [2024-04-28 16:53:20,776][57339] Updated weights for policy 0, policy_version 662778 (0.0029) [2024-04-28 16:53:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 10859036672. Throughput: 0: 55720.0. Samples: 1349323540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:22,170][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 16:53:23,537][57339] Updated weights for policy 0, policy_version 662788 (0.0031) [2024-04-28 16:53:26,527][57339] Updated weights for policy 0, policy_version 662798 (0.0029) [2024-04-28 16:53:27,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 10859282432. Throughput: 0: 55741.3. Samples: 1349660800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:27,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 16:53:27,185][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000662799_10859298816.pth... [2024-04-28 16:53:27,241][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000661984_10845945856.pth [2024-04-28 16:53:29,388][57339] Updated weights for policy 0, policy_version 662808 (0.0031) [2024-04-28 16:53:32,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10859593728. Throughput: 0: 55783.6. Samples: 1349995580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:32,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 16:53:32,308][57339] Updated weights for policy 0, policy_version 662818 (0.0031) [2024-04-28 16:53:35,248][57339] Updated weights for policy 0, policy_version 662828 (0.0027) [2024-04-28 16:53:37,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10859872256. Throughput: 0: 56001.4. Samples: 1350163980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:37,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 16:53:38,425][57339] Updated weights for policy 0, policy_version 662838 (0.0028) [2024-04-28 16:53:41,087][57339] Updated weights for policy 0, policy_version 662848 (0.0026) [2024-04-28 16:53:42,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10860150784. Throughput: 0: 55997.3. Samples: 1350500640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:42,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 16:53:44,185][57339] Updated weights for policy 0, policy_version 662858 (0.0029) [2024-04-28 16:53:46,949][57339] Updated weights for policy 0, policy_version 662868 (0.0027) [2024-04-28 16:53:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55705.6). 
Total num frames: 10860429312. Throughput: 0: 55887.3. Samples: 1350831760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:47,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 16:53:50,020][57339] Updated weights for policy 0, policy_version 662878 (0.0025) [2024-04-28 16:53:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 10860707840. Throughput: 0: 55911.5. Samples: 1351003580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:52,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 16:53:52,745][57339] Updated weights for policy 0, policy_version 662888 (0.0026) [2024-04-28 16:53:56,093][57339] Updated weights for policy 0, policy_version 662898 (0.0031) [2024-04-28 16:53:57,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 10860969984. Throughput: 0: 55954.7. Samples: 1351339220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:53:57,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 16:53:58,620][57339] Updated weights for policy 0, policy_version 662908 (0.0034) [2024-04-28 16:54:01,947][57339] Updated weights for policy 0, policy_version 662918 (0.0023) [2024-04-28 16:54:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 10861248512. Throughput: 0: 55992.1. Samples: 1351675160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:02,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 16:54:04,511][57339] Updated weights for policy 0, policy_version 662928 (0.0029) [2024-04-28 16:54:07,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10861543424. Throughput: 0: 55796.5. Samples: 1351834380. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:07,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 16:54:07,857][57339] Updated weights for policy 0, policy_version 662938 (0.0028) [2024-04-28 16:54:10,488][57339] Updated weights for policy 0, policy_version 662948 (0.0036) [2024-04-28 16:54:12,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10861821952. Throughput: 0: 55702.8. Samples: 1352167420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:12,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:54:13,575][57339] Updated weights for policy 0, policy_version 662958 (0.0030) [2024-04-28 16:54:16,652][57339] Updated weights for policy 0, policy_version 662968 (0.0029) [2024-04-28 16:54:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10862100480. Throughput: 0: 55823.6. Samples: 1352507640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:17,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 16:54:19,402][57339] Updated weights for policy 0, policy_version 662978 (0.0029) [2024-04-28 16:54:20,298][57319] Signal inference workers to stop experience collection... (20200 times) [2024-04-28 16:54:20,338][57339] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-04-28 16:54:20,392][57319] Signal inference workers to resume experience collection... (20200 times) [2024-04-28 16:54:20,392][57339] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-04-28 16:54:22,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10862379008. Throughput: 0: 55754.1. Samples: 1352672920. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:22,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 16:54:22,496][57339] Updated weights for policy 0, policy_version 662988 (0.0029) [2024-04-28 16:54:25,255][57339] Updated weights for policy 0, policy_version 662998 (0.0031) [2024-04-28 16:54:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 10862657536. Throughput: 0: 55832.8. Samples: 1353013120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:27,170][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 16:54:28,318][57339] Updated weights for policy 0, policy_version 663008 (0.0034) [2024-04-28 16:54:31,041][57339] Updated weights for policy 0, policy_version 663018 (0.0027) [2024-04-28 16:54:32,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10862936064. Throughput: 0: 55887.6. Samples: 1353346700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:32,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 16:54:34,151][57339] Updated weights for policy 0, policy_version 663028 (0.0030) [2024-04-28 16:54:36,748][57339] Updated weights for policy 0, policy_version 663038 (0.0036) [2024-04-28 16:54:37,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10863214592. Throughput: 0: 55888.9. Samples: 1353518580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:37,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 16:54:39,839][57339] Updated weights for policy 0, policy_version 663048 (0.0027) [2024-04-28 16:54:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10863493120. Throughput: 0: 55808.9. Samples: 1353850620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:42,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:54:42,717][57339] Updated weights for policy 0, policy_version 663058 (0.0025) [2024-04-28 16:54:45,913][57339] Updated weights for policy 0, policy_version 663068 (0.0033) [2024-04-28 16:54:47,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10863788032. Throughput: 0: 55714.7. Samples: 1354182320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:54:47,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 16:54:48,742][57339] Updated weights for policy 0, policy_version 663078 (0.0035) [2024-04-28 16:54:51,632][57339] Updated weights for policy 0, policy_version 663088 (0.0034) [2024-04-28 16:54:52,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 10864050176. Throughput: 0: 56078.9. Samples: 1354357940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:54:52,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 16:54:54,471][57339] Updated weights for policy 0, policy_version 663098 (0.0031) [2024-04-28 16:54:57,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10864328704. Throughput: 0: 56059.9. Samples: 1354690120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:54:57,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 16:54:57,545][57339] Updated weights for policy 0, policy_version 663108 (0.0025) [2024-04-28 16:55:00,405][57339] Updated weights for policy 0, policy_version 663118 (0.0025) [2024-04-28 16:55:02,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 10864590848. Throughput: 0: 55977.6. Samples: 1355026640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:02,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 16:55:03,599][57339] Updated weights for policy 0, policy_version 663128 (0.0030) [2024-04-28 16:55:06,264][57339] Updated weights for policy 0, policy_version 663138 (0.0029) [2024-04-28 16:55:07,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10864885760. Throughput: 0: 55801.4. Samples: 1355183980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:07,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 16:55:09,346][57339] Updated weights for policy 0, policy_version 663148 (0.0027) [2024-04-28 16:55:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10865164288. Throughput: 0: 55753.8. Samples: 1355522040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:12,170][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 16:55:12,440][57339] Updated weights for policy 0, policy_version 663158 (0.0027) [2024-04-28 16:55:15,207][57339] Updated weights for policy 0, policy_version 663168 (0.0029) [2024-04-28 16:55:17,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10865459200. Throughput: 0: 55813.4. Samples: 1355858300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:17,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 16:55:18,162][57339] Updated weights for policy 0, policy_version 663178 (0.0024) [2024-04-28 16:55:20,957][57339] Updated weights for policy 0, policy_version 663188 (0.0028) [2024-04-28 16:55:22,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10865754112. Throughput: 0: 55846.5. Samples: 1356031680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:22,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 16:55:23,888][57339] Updated weights for policy 0, policy_version 663198 (0.0024) [2024-04-28 16:55:26,509][57319] Signal inference workers to stop experience collection... (20250 times) [2024-04-28 16:55:26,533][57339] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-04-28 16:55:26,606][57319] Signal inference workers to resume experience collection... (20250 times) [2024-04-28 16:55:26,607][57339] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-04-28 16:55:26,714][57339] Updated weights for policy 0, policy_version 663208 (0.0026) [2024-04-28 16:55:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10866016256. Throughput: 0: 55921.4. Samples: 1356367080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:27,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 16:55:27,213][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000663210_10866032640.pth... [2024-04-28 16:55:27,266][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000662394_10852663296.pth [2024-04-28 16:55:30,053][57339] Updated weights for policy 0, policy_version 663218 (0.0028) [2024-04-28 16:55:32,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10866278400. Throughput: 0: 55967.9. Samples: 1356700880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:32,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 16:55:32,856][57339] Updated weights for policy 0, policy_version 663228 (0.0027) [2024-04-28 16:55:35,838][57339] Updated weights for policy 0, policy_version 663238 (0.0028) [2024-04-28 16:55:37,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 10866556928. Throughput: 0: 55625.6. 
Samples: 1356861080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:37,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 16:55:38,585][57339] Updated weights for policy 0, policy_version 663248 (0.0031) [2024-04-28 16:55:41,817][57339] Updated weights for policy 0, policy_version 663258 (0.0029) [2024-04-28 16:55:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10866835456. Throughput: 0: 55662.2. Samples: 1357194920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:42,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 16:55:44,361][57339] Updated weights for policy 0, policy_version 663268 (0.0030) [2024-04-28 16:55:47,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10867130368. Throughput: 0: 55664.1. Samples: 1357531520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:47,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 16:55:47,681][57339] Updated weights for policy 0, policy_version 663278 (0.0027) [2024-04-28 16:55:50,295][57339] Updated weights for policy 0, policy_version 663288 (0.0030) [2024-04-28 16:55:52,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 10867425280. Throughput: 0: 56026.6. Samples: 1357705180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:52,170][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 16:55:53,430][57339] Updated weights for policy 0, policy_version 663298 (0.0028) [2024-04-28 16:55:56,160][57339] Updated weights for policy 0, policy_version 663308 (0.0034) [2024-04-28 16:55:57,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10867687424. Throughput: 0: 55931.7. Samples: 1358038960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:55:57,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 16:55:59,168][57339] Updated weights for policy 0, policy_version 663318 (0.0028) [2024-04-28 16:56:01,993][57339] Updated weights for policy 0, policy_version 663328 (0.0030) [2024-04-28 16:56:02,169][57108] Fps is (10 sec: 54066.7, 60 sec: 56251.8, 300 sec: 55927.7). Total num frames: 10867965952. Throughput: 0: 55929.1. Samples: 1358375120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:56:02,170][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 16:56:05,041][57339] Updated weights for policy 0, policy_version 663338 (0.0030) [2024-04-28 16:56:07,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10868244480. Throughput: 0: 55779.9. Samples: 1358541780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:56:07,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 16:56:07,803][57339] Updated weights for policy 0, policy_version 663348 (0.0032) [2024-04-28 16:56:11,069][57339] Updated weights for policy 0, policy_version 663358 (0.0026) [2024-04-28 16:56:12,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10868506624. Throughput: 0: 55779.4. Samples: 1358877160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:56:12,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 16:56:13,560][57339] Updated weights for policy 0, policy_version 663368 (0.0033) [2024-04-28 16:56:17,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10868768768. Throughput: 0: 55852.9. Samples: 1359214260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:56:17,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 16:56:17,286][57339] Updated weights for policy 0, policy_version 663378 (0.0032) [2024-04-28 16:56:19,720][57339] Updated weights for policy 0, policy_version 663388 (0.0031) [2024-04-28 16:56:20,104][57319] Signal inference workers to stop experience collection... (20300 times) [2024-04-28 16:56:20,149][57339] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-04-28 16:56:20,162][57319] Signal inference workers to resume experience collection... (20300 times) [2024-04-28 16:56:20,165][57339] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-04-28 16:56:22,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10869080064. Throughput: 0: 55895.3. Samples: 1359376380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-04-28 16:56:22,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 16:56:23,040][57339] Updated weights for policy 0, policy_version 663398 (0.0027) [2024-04-28 16:56:25,487][57339] Updated weights for policy 0, policy_version 663408 (0.0027) [2024-04-28 16:56:27,169][57108] Fps is (10 sec: 60620.4, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10869374976. Throughput: 0: 55904.8. Samples: 1359710640. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:56:27,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 16:56:28,902][57339] Updated weights for policy 0, policy_version 663418 (0.0025) [2024-04-28 16:56:31,528][57339] Updated weights for policy 0, policy_version 663428 (0.0029) [2024-04-28 16:56:32,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 10869653504. Throughput: 0: 55918.2. Samples: 1360047840. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:56:32,170][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 16:56:34,708][57339] Updated weights for policy 0, policy_version 663438 (0.0026) [2024-04-28 16:56:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10869899264. Throughput: 0: 55900.9. Samples: 1360220720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:56:37,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 16:56:37,388][57339] Updated weights for policy 0, policy_version 663448 (0.0029) [2024-04-28 16:56:40,410][57339] Updated weights for policy 0, policy_version 663458 (0.0026) [2024-04-28 16:56:42,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10870177792. Throughput: 0: 55881.8. Samples: 1360553640. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:56:42,169][57108] Avg episode reward: [(0, '0.502')] [2024-04-28 16:56:43,214][57339] Updated weights for policy 0, policy_version 663468 (0.0028) [2024-04-28 16:56:46,292][57339] Updated weights for policy 0, policy_version 663478 (0.0025) [2024-04-28 16:56:47,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10870472704. Throughput: 0: 55890.4. Samples: 1360890180. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:56:47,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 16:56:48,913][57339] Updated weights for policy 0, policy_version 663488 (0.0030) [2024-04-28 16:56:52,105][57339] Updated weights for policy 0, policy_version 663498 (0.0026) [2024-04-28 16:56:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10870751232. Throughput: 0: 55742.4. Samples: 1361050180. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:56:52,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 16:56:54,692][57339] Updated weights for policy 0, policy_version 663508 (0.0032) [2024-04-28 16:56:57,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 10871013376. Throughput: 0: 55735.8. Samples: 1361385280. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:56:57,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 16:56:58,191][57339] Updated weights for policy 0, policy_version 663518 (0.0026) [2024-04-28 16:57:00,543][57339] Updated weights for policy 0, policy_version 663528 (0.0031) [2024-04-28 16:57:02,169][57108] Fps is (10 sec: 58981.7, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 10871341056. Throughput: 0: 55660.8. Samples: 1361719000. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:02,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 16:57:04,062][57339] Updated weights for policy 0, policy_version 663538 (0.0031) [2024-04-28 16:57:06,628][57339] Updated weights for policy 0, policy_version 663548 (0.0030) [2024-04-28 16:57:07,169][57108] Fps is (10 sec: 60622.5, 60 sec: 56251.9, 300 sec: 55983.3). Total num frames: 10871619584. Throughput: 0: 55920.7. Samples: 1361892800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:07,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 16:57:09,751][57339] Updated weights for policy 0, policy_version 663558 (0.0022) [2024-04-28 16:57:12,169][57108] Fps is (10 sec: 52427.8, 60 sec: 55978.4, 300 sec: 55927.7). Total num frames: 10871865344. Throughput: 0: 56032.2. Samples: 1362232100. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:12,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 16:57:12,474][57339] Updated weights for policy 0, policy_version 663568 (0.0030) [2024-04-28 16:57:15,521][57339] Updated weights for policy 0, policy_version 663578 (0.0031) [2024-04-28 16:57:17,169][57108] Fps is (10 sec: 52427.5, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10872143872. Throughput: 0: 56034.2. Samples: 1362569380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:17,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 16:57:18,180][57319] Signal inference workers to stop experience collection... (20350 times) [2024-04-28 16:57:18,182][57319] Signal inference workers to resume experience collection... (20350 times) [2024-04-28 16:57:18,220][57339] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-04-28 16:57:18,220][57339] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-04-28 16:57:18,296][57339] Updated weights for policy 0, policy_version 663588 (0.0033) [2024-04-28 16:57:21,374][57339] Updated weights for policy 0, policy_version 663598 (0.0027) [2024-04-28 16:57:22,169][57108] Fps is (10 sec: 57345.8, 60 sec: 55978.8, 300 sec: 55872.3). Total num frames: 10872438784. Throughput: 0: 55761.9. Samples: 1362730000. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:22,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 16:57:24,168][57339] Updated weights for policy 0, policy_version 663608 (0.0028) [2024-04-28 16:57:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10872700928. Throughput: 0: 55942.5. Samples: 1363071060. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:27,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 16:57:27,247][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000663618_10872717312.pth... 
[2024-04-28 16:57:27,252][57339] Updated weights for policy 0, policy_version 663618 (0.0029) [2024-04-28 16:57:27,303][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000662799_10859298816.pth [2024-04-28 16:57:29,947][57339] Updated weights for policy 0, policy_version 663628 (0.0028) [2024-04-28 16:57:32,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 10872979456. Throughput: 0: 55916.8. Samples: 1363406440. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:32,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 16:57:33,055][57339] Updated weights for policy 0, policy_version 663638 (0.0027) [2024-04-28 16:57:35,787][57339] Updated weights for policy 0, policy_version 663648 (0.0031) [2024-04-28 16:57:37,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 10873290752. Throughput: 0: 56203.4. Samples: 1363579340. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:37,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 16:57:38,838][57339] Updated weights for policy 0, policy_version 663658 (0.0025) [2024-04-28 16:57:41,646][57339] Updated weights for policy 0, policy_version 663668 (0.0033) [2024-04-28 16:57:42,169][57108] Fps is (10 sec: 58983.2, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 10873569280. Throughput: 0: 56103.0. Samples: 1363909900. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:42,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 16:57:44,678][57339] Updated weights for policy 0, policy_version 663678 (0.0026) [2024-04-28 16:57:47,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.5, 300 sec: 55927.7). Total num frames: 10873831424. Throughput: 0: 56279.5. Samples: 1364251580. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:47,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 16:57:47,492][57339] Updated weights for policy 0, policy_version 663688 (0.0028) [2024-04-28 16:57:50,617][57339] Updated weights for policy 0, policy_version 663698 (0.0031) [2024-04-28 16:57:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 10874126336. Throughput: 0: 56104.0. Samples: 1364417480. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:52,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 16:57:53,492][57339] Updated weights for policy 0, policy_version 663708 (0.0027) [2024-04-28 16:57:56,369][57339] Updated weights for policy 0, policy_version 663718 (0.0024) [2024-04-28 16:57:57,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 10874404864. Throughput: 0: 56057.6. Samples: 1364754680. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:57:57,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 16:57:59,322][57339] Updated weights for policy 0, policy_version 663728 (0.0028) [2024-04-28 16:58:02,079][57339] Updated weights for policy 0, policy_version 663738 (0.0027) [2024-04-28 16:58:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.7, 300 sec: 55927.7). Total num frames: 10874683392. Throughput: 0: 56002.4. Samples: 1365089480. Policy #0 lag: (min: 0.0, avg: 12.8, max: 24.0) [2024-04-28 16:58:02,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 16:58:05,199][57339] Updated weights for policy 0, policy_version 663748 (0.0031) [2024-04-28 16:58:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10874961920. Throughput: 0: 56207.1. Samples: 1365259320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:07,169][57108] Avg episode reward: [(0, '0.475')] [2024-04-28 16:58:07,843][57339] Updated weights for policy 0, policy_version 663758 (0.0030) [2024-04-28 16:58:08,479][57319] Signal inference workers to stop experience collection... (20400 times) [2024-04-28 16:58:08,479][57319] Signal inference workers to resume experience collection... (20400 times) [2024-04-28 16:58:08,503][57339] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-04-28 16:58:08,503][57339] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-04-28 16:58:11,062][57339] Updated weights for policy 0, policy_version 663768 (0.0026) [2024-04-28 16:58:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56252.0, 300 sec: 55872.2). Total num frames: 10875240448. Throughput: 0: 56163.2. Samples: 1365598400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:12,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 16:58:13,956][57339] Updated weights for policy 0, policy_version 663778 (0.0028) [2024-04-28 16:58:16,804][57339] Updated weights for policy 0, policy_version 663788 (0.0026) [2024-04-28 16:58:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10875518976. Throughput: 0: 56073.4. Samples: 1365929740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:17,169][57108] Avg episode reward: [(0, '0.511')] [2024-04-28 16:58:19,917][57339] Updated weights for policy 0, policy_version 663798 (0.0027) [2024-04-28 16:58:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 10875781120. Throughput: 0: 55879.6. Samples: 1366093920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:22,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 16:58:22,676][57339] Updated weights for policy 0, policy_version 663808 (0.0032) [2024-04-28 16:58:25,615][57339] Updated weights for policy 0, policy_version 663818 (0.0026) [2024-04-28 16:58:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10876076032. Throughput: 0: 56034.5. Samples: 1366431460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:27,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 16:58:28,478][57339] Updated weights for policy 0, policy_version 663828 (0.0025) [2024-04-28 16:58:31,407][57339] Updated weights for policy 0, policy_version 663838 (0.0026) [2024-04-28 16:58:32,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 10876370944. Throughput: 0: 55846.8. Samples: 1366764680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:32,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 16:58:34,314][57339] Updated weights for policy 0, policy_version 663848 (0.0027) [2024-04-28 16:58:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10876633088. Throughput: 0: 55887.4. Samples: 1366932420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:37,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 16:58:37,252][57339] Updated weights for policy 0, policy_version 663858 (0.0030) [2024-04-28 16:58:40,435][57339] Updated weights for policy 0, policy_version 663868 (0.0026) [2024-04-28 16:58:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10876911616. Throughput: 0: 55937.0. Samples: 1367271840. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:42,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 16:58:43,119][57339] Updated weights for policy 0, policy_version 663878 (0.0034) [2024-04-28 16:58:46,197][57339] Updated weights for policy 0, policy_version 663888 (0.0025) [2024-04-28 16:58:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 55927.7). Total num frames: 10877206528. Throughput: 0: 55974.2. Samples: 1367608320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:47,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 16:58:48,913][57339] Updated weights for policy 0, policy_version 663898 (0.0029) [2024-04-28 16:58:52,096][57339] Updated weights for policy 0, policy_version 663908 (0.0026) [2024-04-28 16:58:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 10877468672. Throughput: 0: 55739.9. Samples: 1367767620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:52,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 16:58:54,610][57339] Updated weights for policy 0, policy_version 663918 (0.0035) [2024-04-28 16:58:57,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10877730816. Throughput: 0: 55717.3. Samples: 1368105680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:58:57,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 16:58:57,821][57339] Updated weights for policy 0, policy_version 663928 (0.0026) [2024-04-28 16:59:00,629][57339] Updated weights for policy 0, policy_version 663938 (0.0024) [2024-04-28 16:59:02,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10878025728. Throughput: 0: 55859.1. Samples: 1368443400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:59:02,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 16:59:03,732][57339] Updated weights for policy 0, policy_version 663948 (0.0032) [2024-04-28 16:59:06,530][57339] Updated weights for policy 0, policy_version 663958 (0.0035) [2024-04-28 16:59:07,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10878320640. Throughput: 0: 55928.0. Samples: 1368610680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:59:07,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 16:59:09,786][57339] Updated weights for policy 0, policy_version 663968 (0.0024) [2024-04-28 16:59:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10878599168. Throughput: 0: 55995.2. Samples: 1368951240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:59:12,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 16:59:12,270][57339] Updated weights for policy 0, policy_version 663978 (0.0033) [2024-04-28 16:59:15,598][57339] Updated weights for policy 0, policy_version 663988 (0.0026) [2024-04-28 16:59:17,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10878861312. Throughput: 0: 56005.8. Samples: 1369284940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:59:17,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 16:59:17,985][57339] Updated weights for policy 0, policy_version 663998 (0.0028) [2024-04-28 16:59:21,543][57339] Updated weights for policy 0, policy_version 664008 (0.0031) [2024-04-28 16:59:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10879156224. Throughput: 0: 56151.5. Samples: 1369459240. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:59:22,169][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 16:59:23,822][57339] Updated weights for policy 0, policy_version 664018 (0.0028) [2024-04-28 16:59:26,146][57319] Signal inference workers to stop experience collection... (20450 times) [2024-04-28 16:59:26,146][57319] Signal inference workers to resume experience collection... (20450 times) [2024-04-28 16:59:26,169][57339] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-04-28 16:59:26,170][57339] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-04-28 16:59:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10879401984. Throughput: 0: 56052.0. Samples: 1369794180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:59:27,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 16:59:27,322][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000664028_10879434752.pth... [2024-04-28 16:59:27,328][57339] Updated weights for policy 0, policy_version 664028 (0.0028) [2024-04-28 16:59:27,369][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000663210_10866032640.pth [2024-04-28 16:59:29,674][57339] Updated weights for policy 0, policy_version 664038 (0.0030) [2024-04-28 16:59:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 10879713280. Throughput: 0: 56030.1. Samples: 1370129680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:59:32,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 16:59:33,216][57339] Updated weights for policy 0, policy_version 664048 (0.0029) [2024-04-28 16:59:35,401][57339] Updated weights for policy 0, policy_version 664058 (0.0042) [2024-04-28 16:59:37,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10879975424. Throughput: 0: 56056.9. 
Samples: 1370290180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:59:37,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 16:59:39,189][57339] Updated weights for policy 0, policy_version 664068 (0.0033) [2024-04-28 16:59:41,233][57339] Updated weights for policy 0, policy_version 664078 (0.0032) [2024-04-28 16:59:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 10880286720. Throughput: 0: 55950.2. Samples: 1370623440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 16:59:42,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 16:59:44,898][57339] Updated weights for policy 0, policy_version 664088 (0.0035) [2024-04-28 16:59:47,169][57108] Fps is (10 sec: 60621.4, 60 sec: 56251.8, 300 sec: 56038.9). Total num frames: 10880581632. Throughput: 0: 55904.6. Samples: 1370959100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:59:47,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 16:59:47,172][57339] Updated weights for policy 0, policy_version 664098 (0.0027) [2024-04-28 16:59:50,760][57339] Updated weights for policy 0, policy_version 664108 (0.0028) [2024-04-28 16:59:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10880827392. Throughput: 0: 55899.1. Samples: 1371126140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:59:52,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 16:59:53,049][57339] Updated weights for policy 0, policy_version 664118 (0.0031) [2024-04-28 16:59:56,497][57339] Updated weights for policy 0, policy_version 664128 (0.0028) [2024-04-28 16:59:57,169][57108] Fps is (10 sec: 50789.5, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10881089536. Throughput: 0: 55771.5. Samples: 1371460960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 16:59:57,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 16:59:58,983][57339] Updated weights for policy 0, policy_version 664138 (0.0028) [2024-04-28 17:00:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 10881384448. Throughput: 0: 55779.6. Samples: 1371795020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:02,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:00:02,238][57339] Updated weights for policy 0, policy_version 664148 (0.0028) [2024-04-28 17:00:04,707][57339] Updated weights for policy 0, policy_version 664158 (0.0031) [2024-04-28 17:00:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55927.8). Total num frames: 10881662976. Throughput: 0: 55662.6. Samples: 1371964060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:07,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 17:00:08,209][57339] Updated weights for policy 0, policy_version 664168 (0.0030) [2024-04-28 17:00:10,574][57339] Updated weights for policy 0, policy_version 664178 (0.0032) [2024-04-28 17:00:12,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10881941504. Throughput: 0: 55531.7. Samples: 1372293100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:12,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 17:00:14,129][57339] Updated weights for policy 0, policy_version 664188 (0.0029) [2024-04-28 17:00:16,518][57339] Updated weights for policy 0, policy_version 664198 (0.0028) [2024-04-28 17:00:17,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10882236416. Throughput: 0: 55524.5. Samples: 1372628280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:17,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 17:00:19,848][57339] Updated weights for policy 0, policy_version 664208 (0.0028) [2024-04-28 17:00:20,660][57319] Signal inference workers to stop experience collection... (20500 times) [2024-04-28 17:00:20,701][57339] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-04-28 17:00:20,710][57319] Signal inference workers to resume experience collection... (20500 times) [2024-04-28 17:00:20,716][57339] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-04-28 17:00:22,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 10882514944. Throughput: 0: 55839.6. Samples: 1372802960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:22,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 17:00:22,351][57339] Updated weights for policy 0, policy_version 664218 (0.0031) [2024-04-28 17:00:25,765][57339] Updated weights for policy 0, policy_version 664228 (0.0028) [2024-04-28 17:00:27,169][57108] Fps is (10 sec: 55705.1, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 10882793472. Throughput: 0: 55924.4. Samples: 1373140040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:27,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 17:00:28,325][57339] Updated weights for policy 0, policy_version 664238 (0.0030) [2024-04-28 17:00:31,740][57339] Updated weights for policy 0, policy_version 664248 (0.0028) [2024-04-28 17:00:32,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 10883055616. Throughput: 0: 56037.3. Samples: 1373480780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:32,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 17:00:34,276][57339] Updated weights for policy 0, policy_version 664258 (0.0028) [2024-04-28 17:00:37,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10883334144. Throughput: 0: 55927.1. Samples: 1373642860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:37,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 17:00:37,612][57339] Updated weights for policy 0, policy_version 664268 (0.0025) [2024-04-28 17:00:40,015][57339] Updated weights for policy 0, policy_version 664278 (0.0033) [2024-04-28 17:00:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10883612672. Throughput: 0: 56016.6. Samples: 1373981700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:42,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 17:00:43,304][57339] Updated weights for policy 0, policy_version 664288 (0.0026) [2024-04-28 17:00:45,849][57339] Updated weights for policy 0, policy_version 664298 (0.0027) [2024-04-28 17:00:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10883907584. Throughput: 0: 56102.6. Samples: 1374319640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:47,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 17:00:49,142][57339] Updated weights for policy 0, policy_version 664308 (0.0033) [2024-04-28 17:00:51,972][57339] Updated weights for policy 0, policy_version 664318 (0.0032) [2024-04-28 17:00:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10884186112. Throughput: 0: 56154.7. Samples: 1374491020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:52,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 17:00:54,993][57339] Updated weights for policy 0, policy_version 664328 (0.0031) [2024-04-28 17:00:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10884481024. Throughput: 0: 56362.1. Samples: 1374829400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:00:57,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 17:00:57,748][57339] Updated weights for policy 0, policy_version 664338 (0.0032) [2024-04-28 17:01:00,855][57339] Updated weights for policy 0, policy_version 664348 (0.0034) [2024-04-28 17:01:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 10884743168. Throughput: 0: 56318.7. Samples: 1375162620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:01:02,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:01:03,624][57339] Updated weights for policy 0, policy_version 664358 (0.0029) [2024-04-28 17:01:06,753][57339] Updated weights for policy 0, policy_version 664368 (0.0031) [2024-04-28 17:01:07,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10885021696. Throughput: 0: 55939.5. Samples: 1375320240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:01:07,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:01:09,437][57339] Updated weights for policy 0, policy_version 664378 (0.0026) [2024-04-28 17:01:12,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 10885300224. Throughput: 0: 55970.0. Samples: 1375658680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:01:12,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 17:01:12,682][57339] Updated weights for policy 0, policy_version 664388 (0.0034) [2024-04-28 17:01:15,564][57339] Updated weights for policy 0, policy_version 664398 (0.0028) [2024-04-28 17:01:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10885595136. Throughput: 0: 55873.3. Samples: 1375995080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:01:17,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 17:01:18,596][57339] Updated weights for policy 0, policy_version 664408 (0.0035) [2024-04-28 17:01:21,417][57339] Updated weights for policy 0, policy_version 664418 (0.0028) [2024-04-28 17:01:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10885857280. Throughput: 0: 55930.7. Samples: 1376159740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:01:22,169][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 17:01:24,406][57339] Updated weights for policy 0, policy_version 664428 (0.0027) [2024-04-28 17:01:27,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 10886135808. Throughput: 0: 55912.5. Samples: 1376497760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:01:27,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 17:01:27,235][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000664438_10886152192.pth... [2024-04-28 17:01:27,241][57339] Updated weights for policy 0, policy_version 664438 (0.0030) [2024-04-28 17:01:27,278][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000663618_10872717312.pth [2024-04-28 17:01:30,307][57339] Updated weights for policy 0, policy_version 664448 (0.0029) [2024-04-28 17:01:32,157][57319] Signal inference workers to stop experience collection... 
(20550 times) [2024-04-28 17:01:32,161][57319] Signal inference workers to resume experience collection... (20550 times) [2024-04-28 17:01:32,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 10886414336. Throughput: 0: 55829.8. Samples: 1376831980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:01:32,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 17:01:32,186][57339] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-04-28 17:01:32,187][57339] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-04-28 17:01:32,977][57339] Updated weights for policy 0, policy_version 664458 (0.0027) [2024-04-28 17:01:36,157][57339] Updated weights for policy 0, policy_version 664468 (0.0026) [2024-04-28 17:01:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 10886692864. Throughput: 0: 55743.2. Samples: 1376999460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:01:37,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 17:01:38,869][57339] Updated weights for policy 0, policy_version 664478 (0.0030) [2024-04-28 17:01:42,035][57339] Updated weights for policy 0, policy_version 664488 (0.0030) [2024-04-28 17:01:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 10886971392. Throughput: 0: 55667.1. Samples: 1377334420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:01:42,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 17:01:44,865][57339] Updated weights for policy 0, policy_version 664498 (0.0029) [2024-04-28 17:01:47,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 55983.3). Total num frames: 10887266304. Throughput: 0: 55614.5. Samples: 1377665280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:01:47,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:01:47,825][57339] Updated weights for policy 0, policy_version 664508 (0.0034) [2024-04-28 17:01:50,817][57339] Updated weights for policy 0, policy_version 664518 (0.0037) [2024-04-28 17:01:52,169][57108] Fps is (10 sec: 58981.5, 60 sec: 56251.7, 300 sec: 56094.4). Total num frames: 10887561216. Throughput: 0: 55933.7. Samples: 1377837260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:01:52,170][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 17:01:53,661][57339] Updated weights for policy 0, policy_version 664528 (0.0028) [2024-04-28 17:01:56,510][57339] Updated weights for policy 0, policy_version 664538 (0.0025) [2024-04-28 17:01:57,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10887823360. Throughput: 0: 55835.5. Samples: 1378171280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:01:57,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 17:01:59,495][57339] Updated weights for policy 0, policy_version 664548 (0.0027) [2024-04-28 17:02:02,169][57108] Fps is (10 sec: 52429.8, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10888085504. Throughput: 0: 55767.6. Samples: 1378504620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:02,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 17:02:02,320][57339] Updated weights for policy 0, policy_version 664558 (0.0033) [2024-04-28 17:02:05,353][57339] Updated weights for policy 0, policy_version 664568 (0.0026) [2024-04-28 17:02:07,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55872.3). Total num frames: 10888347648. Throughput: 0: 55755.9. Samples: 1378668760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:07,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 17:02:08,311][57339] Updated weights for policy 0, policy_version 664578 (0.0024) [2024-04-28 17:02:11,323][57339] Updated weights for policy 0, policy_version 664588 (0.0027) [2024-04-28 17:02:12,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55705.4, 300 sec: 55927.8). Total num frames: 10888642560. Throughput: 0: 55624.6. Samples: 1379000880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:12,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 17:02:14,242][57339] Updated weights for policy 0, policy_version 664598 (0.0026) [2024-04-28 17:02:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 10888921088. Throughput: 0: 55662.3. Samples: 1379336780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:17,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 17:02:17,238][57339] Updated weights for policy 0, policy_version 664608 (0.0033) [2024-04-28 17:02:19,977][57339] Updated weights for policy 0, policy_version 664618 (0.0028) [2024-04-28 17:02:22,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 10889199616. Throughput: 0: 55612.5. Samples: 1379502020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:22,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 17:02:22,978][57339] Updated weights for policy 0, policy_version 664628 (0.0037) [2024-04-28 17:02:25,672][57339] Updated weights for policy 0, policy_version 664638 (0.0030) [2024-04-28 17:02:27,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 10889494528. Throughput: 0: 55556.4. Samples: 1379834460. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:27,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 17:02:28,862][57339] Updated weights for policy 0, policy_version 664648 (0.0031) [2024-04-28 17:02:31,581][57319] Signal inference workers to stop experience collection... (20600 times) [2024-04-28 17:02:31,582][57319] Signal inference workers to resume experience collection... (20600 times) [2024-04-28 17:02:31,594][57339] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-04-28 17:02:31,595][57339] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-04-28 17:02:31,698][57339] Updated weights for policy 0, policy_version 664658 (0.0028) [2024-04-28 17:02:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10889756672. Throughput: 0: 55555.4. Samples: 1380165260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:32,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 17:02:34,882][57339] Updated weights for policy 0, policy_version 664668 (0.0033) [2024-04-28 17:02:37,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10890035200. Throughput: 0: 55510.0. Samples: 1380335200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:37,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:02:37,684][57339] Updated weights for policy 0, policy_version 664678 (0.0031) [2024-04-28 17:02:40,871][57339] Updated weights for policy 0, policy_version 664688 (0.0031) [2024-04-28 17:02:42,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 10890280960. Throughput: 0: 55487.2. Samples: 1380668200. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:42,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 17:02:43,537][57339] Updated weights for policy 0, policy_version 664698 (0.0029) [2024-04-28 17:02:46,589][57339] Updated weights for policy 0, policy_version 664708 (0.0029) [2024-04-28 17:02:47,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 10890592256. Throughput: 0: 55556.3. Samples: 1381004660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:47,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 17:02:49,387][57339] Updated weights for policy 0, policy_version 664718 (0.0031) [2024-04-28 17:02:52,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 10890870784. Throughput: 0: 55580.0. Samples: 1381169860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:52,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 17:02:52,603][57339] Updated weights for policy 0, policy_version 664728 (0.0030) [2024-04-28 17:02:55,207][57339] Updated weights for policy 0, policy_version 664738 (0.0032) [2024-04-28 17:02:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10891149312. Throughput: 0: 55601.9. Samples: 1381502960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:02:57,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:02:58,752][57339] Updated weights for policy 0, policy_version 664748 (0.0029) [2024-04-28 17:03:01,020][57339] Updated weights for policy 0, policy_version 664758 (0.0031) [2024-04-28 17:03:02,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 10891444224. Throughput: 0: 55625.2. Samples: 1381839920. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:02,169][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 17:03:04,500][57339] Updated weights for policy 0, policy_version 664768 (0.0025) [2024-04-28 17:03:06,830][57339] Updated weights for policy 0, policy_version 664778 (0.0028) [2024-04-28 17:03:07,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10891722752. Throughput: 0: 55813.1. Samples: 1382013620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:07,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 17:03:10,227][57339] Updated weights for policy 0, policy_version 664788 (0.0035) [2024-04-28 17:03:12,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 10892001280. Throughput: 0: 55978.2. Samples: 1382353480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:12,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 17:03:12,677][57339] Updated weights for policy 0, policy_version 664798 (0.0029) [2024-04-28 17:03:16,186][57339] Updated weights for policy 0, policy_version 664808 (0.0029) [2024-04-28 17:03:17,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 10892263424. Throughput: 0: 56173.2. Samples: 1382693060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:17,170][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 17:03:18,494][57339] Updated weights for policy 0, policy_version 664818 (0.0027) [2024-04-28 17:03:22,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10892525568. Throughput: 0: 55835.8. Samples: 1382847820. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:22,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 17:03:22,184][57339] Updated weights for policy 0, policy_version 664828 (0.0028) [2024-04-28 17:03:24,170][57339] Updated weights for policy 0, policy_version 664838 (0.0022) [2024-04-28 17:03:27,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 10892836864. Throughput: 0: 55953.0. Samples: 1383186100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:27,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:03:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000664846_10892836864.pth... [2024-04-28 17:03:27,222][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000664028_10879434752.pth [2024-04-28 17:03:27,890][57339] Updated weights for policy 0, policy_version 664848 (0.0031) [2024-04-28 17:03:30,155][57339] Updated weights for policy 0, policy_version 664858 (0.0025) [2024-04-28 17:03:30,965][57319] Signal inference workers to stop experience collection... (20650 times) [2024-04-28 17:03:30,965][57319] Signal inference workers to resume experience collection... (20650 times) [2024-04-28 17:03:30,993][57339] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-04-28 17:03:30,993][57339] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-04-28 17:03:32,169][57108] Fps is (10 sec: 60621.6, 60 sec: 56251.6, 300 sec: 55927.8). Total num frames: 10893131776. Throughput: 0: 56031.2. Samples: 1383526060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:32,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 17:03:33,620][57339] Updated weights for policy 0, policy_version 664868 (0.0033) [2024-04-28 17:03:36,147][57339] Updated weights for policy 0, policy_version 664878 (0.0038) [2024-04-28 17:03:37,169][57108] Fps is (10 sec: 58983.2, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 10893426688. Throughput: 0: 56192.8. Samples: 1383698540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:37,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 17:03:39,668][57339] Updated weights for policy 0, policy_version 664888 (0.0033) [2024-04-28 17:03:41,885][57339] Updated weights for policy 0, policy_version 664898 (0.0025) [2024-04-28 17:03:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56797.8, 300 sec: 55872.2). Total num frames: 10893688832. Throughput: 0: 56292.1. Samples: 1384036100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:42,170][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 17:03:45,545][57339] Updated weights for policy 0, policy_version 664908 (0.0030) [2024-04-28 17:03:47,169][57108] Fps is (10 sec: 54068.1, 60 sec: 56251.9, 300 sec: 55927.8). Total num frames: 10893967360. Throughput: 0: 56184.7. Samples: 1384368220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:47,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 17:03:47,809][57339] Updated weights for policy 0, policy_version 664918 (0.0031) [2024-04-28 17:03:51,325][57339] Updated weights for policy 0, policy_version 664928 (0.0036) [2024-04-28 17:03:52,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 10894229504. Throughput: 0: 55952.0. Samples: 1384531460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:52,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 17:03:53,673][57339] Updated weights for policy 0, policy_version 664938 (0.0027) [2024-04-28 17:03:57,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10894491648. Throughput: 0: 55872.0. Samples: 1384867720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:03:57,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 17:03:57,235][57339] Updated weights for policy 0, policy_version 664948 (0.0028) [2024-04-28 17:03:59,422][57339] Updated weights for policy 0, policy_version 664958 (0.0037) [2024-04-28 17:04:02,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10894753792. Throughput: 0: 55769.3. Samples: 1385202680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:04:02,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 17:04:03,141][57339] Updated weights for policy 0, policy_version 664968 (0.0032) [2024-04-28 17:04:05,134][57339] Updated weights for policy 0, policy_version 664978 (0.0028) [2024-04-28 17:04:07,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10895081472. Throughput: 0: 56042.7. Samples: 1385369740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:04:07,170][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 17:04:08,991][57339] Updated weights for policy 0, policy_version 664988 (0.0032) [2024-04-28 17:04:11,223][57339] Updated weights for policy 0, policy_version 664998 (0.0032) [2024-04-28 17:04:12,169][57108] Fps is (10 sec: 62258.6, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 10895376384. Throughput: 0: 55898.3. Samples: 1385701520. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:04:12,170][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 17:04:14,830][57339] Updated weights for policy 0, policy_version 665008 (0.0030) [2024-04-28 17:04:16,970][57339] Updated weights for policy 0, policy_version 665018 (0.0028) [2024-04-28 17:04:17,169][57108] Fps is (10 sec: 57345.1, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 10895654912. Throughput: 0: 55862.3. Samples: 1386039860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:04:17,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 17:04:20,595][57339] Updated weights for policy 0, policy_version 665028 (0.0032) [2024-04-28 17:04:21,664][57319] Signal inference workers to stop experience collection... (20700 times) [2024-04-28 17:04:21,684][57339] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-04-28 17:04:21,724][57319] Signal inference workers to resume experience collection... (20700 times) [2024-04-28 17:04:21,724][57339] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-04-28 17:04:22,169][57108] Fps is (10 sec: 54067.8, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10895917056. Throughput: 0: 55997.8. Samples: 1386218440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:04:22,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 17:04:23,010][57339] Updated weights for policy 0, policy_version 665038 (0.0021) [2024-04-28 17:04:26,515][57339] Updated weights for policy 0, policy_version 665048 (0.0031) [2024-04-28 17:04:27,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 10896179200. Throughput: 0: 55972.0. Samples: 1386554840. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:04:27,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 17:04:28,904][57339] Updated weights for policy 0, policy_version 665058 (0.0030) [2024-04-28 17:04:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 10896457728. Throughput: 0: 56004.6. Samples: 1386888440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 17:04:32,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 17:04:32,349][57339] Updated weights for policy 0, policy_version 665068 (0.0028) [2024-04-28 17:04:34,639][57339] Updated weights for policy 0, policy_version 665078 (0.0032) [2024-04-28 17:04:37,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 10896719872. Throughput: 0: 55829.9. Samples: 1387043800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:04:37,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:04:38,188][57339] Updated weights for policy 0, policy_version 665088 (0.0030) [2024-04-28 17:04:40,338][57339] Updated weights for policy 0, policy_version 665098 (0.0036) [2024-04-28 17:04:42,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10897031168. Throughput: 0: 55769.3. Samples: 1387377340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:04:42,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 17:04:44,026][57339] Updated weights for policy 0, policy_version 665108 (0.0026) [2024-04-28 17:04:46,254][57339] Updated weights for policy 0, policy_version 665118 (0.0031) [2024-04-28 17:04:47,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 10897309696. Throughput: 0: 55728.8. Samples: 1387710480. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:04:47,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 17:04:50,029][57339] Updated weights for policy 0, policy_version 665128 (0.0026) [2024-04-28 17:04:52,089][57339] Updated weights for policy 0, policy_version 665138 (0.0034) [2024-04-28 17:04:52,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56524.9, 300 sec: 56038.8). Total num frames: 10897620992. Throughput: 0: 56095.2. Samples: 1387894020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:04:52,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 17:04:55,792][57339] Updated weights for policy 0, policy_version 665148 (0.0035) [2024-04-28 17:04:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 10897883136. Throughput: 0: 56155.7. Samples: 1388228520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:04:57,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 17:04:58,021][57339] Updated weights for policy 0, policy_version 665158 (0.0034) [2024-04-28 17:05:01,561][57339] Updated weights for policy 0, policy_version 665168 (0.0037) [2024-04-28 17:05:02,169][57108] Fps is (10 sec: 52429.1, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 10898145280. Throughput: 0: 56015.5. Samples: 1388560560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:02,169][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 17:05:03,960][57339] Updated weights for policy 0, policy_version 665178 (0.0030) [2024-04-28 17:05:07,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 10898407424. Throughput: 0: 55646.3. Samples: 1388722520. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:07,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 17:05:07,423][57339] Updated weights for policy 0, policy_version 665188 (0.0030) [2024-04-28 17:05:09,799][57339] Updated weights for policy 0, policy_version 665198 (0.0031) [2024-04-28 17:05:12,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 10898685952. Throughput: 0: 55642.6. Samples: 1389058760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:12,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:05:13,297][57339] Updated weights for policy 0, policy_version 665208 (0.0031) [2024-04-28 17:05:15,858][57339] Updated weights for policy 0, policy_version 665218 (0.0030) [2024-04-28 17:05:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10898964480. Throughput: 0: 55629.9. Samples: 1389391780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:17,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 17:05:19,111][57339] Updated weights for policy 0, policy_version 665228 (0.0026) [2024-04-28 17:05:21,624][57339] Updated weights for policy 0, policy_version 665238 (0.0029) [2024-04-28 17:05:22,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10899259392. Throughput: 0: 55849.2. Samples: 1389557020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:22,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 17:05:24,954][57339] Updated weights for policy 0, policy_version 665248 (0.0027) [2024-04-28 17:05:26,362][57319] Signal inference workers to stop experience collection... (20750 times) [2024-04-28 17:05:26,363][57319] Signal inference workers to resume experience collection... 
(20750 times) [2024-04-28 17:05:26,387][57339] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-04-28 17:05:26,387][57339] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-04-28 17:05:27,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 10899554304. Throughput: 0: 55778.7. Samples: 1389887380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:27,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 17:05:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000665256_10899554304.pth... [2024-04-28 17:05:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000664438_10886152192.pth [2024-04-28 17:05:27,626][57339] Updated weights for policy 0, policy_version 665258 (0.0032) [2024-04-28 17:05:30,824][57339] Updated weights for policy 0, policy_version 665268 (0.0028) [2024-04-28 17:05:32,169][57108] Fps is (10 sec: 58983.2, 60 sec: 56524.9, 300 sec: 55983.3). Total num frames: 10899849216. Throughput: 0: 55805.9. Samples: 1390221740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:32,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:05:33,511][57339] Updated weights for policy 0, policy_version 665278 (0.0029) [2024-04-28 17:05:36,629][57339] Updated weights for policy 0, policy_version 665288 (0.0030) [2024-04-28 17:05:37,169][57108] Fps is (10 sec: 55704.3, 60 sec: 56524.6, 300 sec: 55927.7). Total num frames: 10900111360. Throughput: 0: 55731.3. Samples: 1390401940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:37,170][57108] Avg episode reward: [(0, '0.499')] [2024-04-28 17:05:39,225][57339] Updated weights for policy 0, policy_version 665298 (0.0027) [2024-04-28 17:05:42,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10900373504. Throughput: 0: 55695.6. Samples: 1390734820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:42,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 17:05:42,398][57339] Updated weights for policy 0, policy_version 665308 (0.0025) [2024-04-28 17:05:45,057][57339] Updated weights for policy 0, policy_version 665318 (0.0030) [2024-04-28 17:05:47,169][57108] Fps is (10 sec: 50791.4, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 10900619264. Throughput: 0: 55715.5. Samples: 1391067760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:47,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 17:05:48,306][57339] Updated weights for policy 0, policy_version 665328 (0.0028) [2024-04-28 17:05:51,368][57339] Updated weights for policy 0, policy_version 665338 (0.0027) [2024-04-28 17:05:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10900930560. Throughput: 0: 55602.5. Samples: 1391224640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:52,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 17:05:54,289][57339] Updated weights for policy 0, policy_version 665348 (0.0032) [2024-04-28 17:05:57,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10901192704. Throughput: 0: 55504.0. Samples: 1391556440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:05:57,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 17:05:57,432][57339] Updated weights for policy 0, policy_version 665358 (0.0037) [2024-04-28 17:06:00,084][57339] Updated weights for policy 0, policy_version 665368 (0.0026) [2024-04-28 17:06:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 10901471232. Throughput: 0: 55471.9. Samples: 1391888020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:06:02,170][57108] Avg episode reward: [(0, '0.480')] [2024-04-28 17:06:03,395][57339] Updated weights for policy 0, policy_version 665378 (0.0027) [2024-04-28 17:06:05,966][57339] Updated weights for policy 0, policy_version 665388 (0.0024) [2024-04-28 17:06:06,893][57319] Signal inference workers to stop experience collection... (20800 times) [2024-04-28 17:06:06,935][57339] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-04-28 17:06:06,948][57319] Signal inference workers to resume experience collection... (20800 times) [2024-04-28 17:06:06,954][57339] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-04-28 17:06:07,169][57108] Fps is (10 sec: 60620.1, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 10901798912. Throughput: 0: 55851.6. Samples: 1392070340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:06:07,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 17:06:09,379][57339] Updated weights for policy 0, policy_version 665398 (0.0039) [2024-04-28 17:06:11,891][57339] Updated weights for policy 0, policy_version 665408 (0.0029) [2024-04-28 17:06:12,169][57108] Fps is (10 sec: 58983.3, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10902061056. Throughput: 0: 55897.8. Samples: 1392402780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 17:06:12,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 17:06:15,146][57339] Updated weights for policy 0, policy_version 665418 (0.0029) [2024-04-28 17:06:17,169][57108] Fps is (10 sec: 49152.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10902290432. Throughput: 0: 55953.0. Samples: 1392739620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:06:17,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 17:06:17,760][57339] Updated weights for policy 0, policy_version 665428 (0.0024) [2024-04-28 17:06:21,023][57339] Updated weights for policy 0, policy_version 665438 (0.0025) [2024-04-28 17:06:22,169][57108] Fps is (10 sec: 49151.6, 60 sec: 54886.5, 300 sec: 55650.0). Total num frames: 10902552576. Throughput: 0: 55304.2. Samples: 1392890620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:06:22,178][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 17:06:23,711][57339] Updated weights for policy 0, policy_version 665448 (0.0031) [2024-04-28 17:06:26,872][57339] Updated weights for policy 0, policy_version 665458 (0.0036) [2024-04-28 17:06:27,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10902863872. Throughput: 0: 55336.0. Samples: 1393224940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:06:27,178][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 17:06:29,479][57339] Updated weights for policy 0, policy_version 665468 (0.0025) [2024-04-28 17:06:32,169][57108] Fps is (10 sec: 57344.5, 60 sec: 54613.4, 300 sec: 55705.6). Total num frames: 10903126016. Throughput: 0: 55441.0. Samples: 1393562600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:06:32,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:06:32,904][57339] Updated weights for policy 0, policy_version 665478 (0.0027) [2024-04-28 17:06:35,465][57339] Updated weights for policy 0, policy_version 665488 (0.0027) [2024-04-28 17:06:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.7, 300 sec: 55761.1). Total num frames: 10903420928. Throughput: 0: 55661.5. Samples: 1393729400. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:06:37,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 17:06:38,780][57339] Updated weights for policy 0, policy_version 665498 (0.0029) [2024-04-28 17:06:41,297][57339] Updated weights for policy 0, policy_version 665508 (0.0026) [2024-04-28 17:06:42,169][57108] Fps is (10 sec: 60620.9, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10903732224. Throughput: 0: 55630.7. Samples: 1394059820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:06:42,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 17:06:44,560][57339] Updated weights for policy 0, policy_version 665518 (0.0025) [2024-04-28 17:06:46,471][57319] Signal inference workers to stop experience collection... (20850 times) [2024-04-28 17:06:46,471][57319] Signal inference workers to resume experience collection... (20850 times) [2024-04-28 17:06:46,497][57339] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-04-28 17:06:46,497][57339] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-04-28 17:06:47,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10903994368. Throughput: 0: 55617.6. Samples: 1394390800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:06:47,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 17:06:47,200][57339] Updated weights for policy 0, policy_version 665528 (0.0034) [2024-04-28 17:06:50,711][57339] Updated weights for policy 0, policy_version 665538 (0.0029) [2024-04-28 17:06:52,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 10904272896. Throughput: 0: 55518.9. Samples: 1394568680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:06:52,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 17:06:52,975][57339] Updated weights for policy 0, policy_version 665548 (0.0029) [2024-04-28 17:06:56,622][57339] Updated weights for policy 0, policy_version 665558 (0.0025) [2024-04-28 17:06:57,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10904518656. Throughput: 0: 55614.7. Samples: 1394905440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:06:57,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 17:06:58,792][57339] Updated weights for policy 0, policy_version 665568 (0.0027) [2024-04-28 17:07:02,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 10904797184. Throughput: 0: 55563.1. Samples: 1395239960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:02,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 17:07:02,478][57339] Updated weights for policy 0, policy_version 665578 (0.0034) [2024-04-28 17:07:04,709][57339] Updated weights for policy 0, policy_version 665588 (0.0024) [2024-04-28 17:07:07,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54613.5, 300 sec: 55705.6). Total num frames: 10905075712. Throughput: 0: 55700.5. Samples: 1395397140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:07,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 17:07:08,341][57339] Updated weights for policy 0, policy_version 665598 (0.0026) [2024-04-28 17:07:10,639][57339] Updated weights for policy 0, policy_version 665608 (0.0031) [2024-04-28 17:07:12,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10905370624. Throughput: 0: 55616.4. Samples: 1395727680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:12,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 17:07:14,270][57339] Updated weights for policy 0, policy_version 665618 (0.0026) [2024-04-28 17:07:16,375][57339] Updated weights for policy 0, policy_version 665628 (0.0027) [2024-04-28 17:07:17,169][57108] Fps is (10 sec: 60619.8, 60 sec: 56524.6, 300 sec: 55872.2). Total num frames: 10905681920. Throughput: 0: 55527.8. Samples: 1396061360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:17,178][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 17:07:20,134][57339] Updated weights for policy 0, policy_version 665638 (0.0031) [2024-04-28 17:07:22,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56797.9, 300 sec: 55816.7). Total num frames: 10905960448. Throughput: 0: 55808.8. Samples: 1396240800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:22,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 17:07:22,224][57339] Updated weights for policy 0, policy_version 665648 (0.0026) [2024-04-28 17:07:25,945][57339] Updated weights for policy 0, policy_version 665658 (0.0029) [2024-04-28 17:07:27,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10906222592. Throughput: 0: 55895.1. Samples: 1396575100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:27,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 17:07:27,269][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000665664_10906238976.pth... [2024-04-28 17:07:27,319][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000664846_10892836864.pth [2024-04-28 17:07:28,109][57339] Updated weights for policy 0, policy_version 665668 (0.0031) [2024-04-28 17:07:31,631][57339] Updated weights for policy 0, policy_version 665678 (0.0026) [2024-04-28 17:07:32,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55978.6, 300 sec: 55761.1). 
Total num frames: 10906484736. Throughput: 0: 55909.1. Samples: 1396906720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:32,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 17:07:33,957][57339] Updated weights for policy 0, policy_version 665688 (0.0031) [2024-04-28 17:07:37,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55432.4, 300 sec: 55816.6). Total num frames: 10906746880. Throughput: 0: 55462.9. Samples: 1397064520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:37,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 17:07:37,530][57339] Updated weights for policy 0, policy_version 665698 (0.0024) [2024-04-28 17:07:39,778][57339] Updated weights for policy 0, policy_version 665708 (0.0028) [2024-04-28 17:07:42,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10907025408. Throughput: 0: 55458.6. Samples: 1397401080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:42,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 17:07:43,425][57339] Updated weights for policy 0, policy_version 665718 (0.0024) [2024-04-28 17:07:45,687][57339] Updated weights for policy 0, policy_version 665728 (0.0026) [2024-04-28 17:07:47,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10907320320. Throughput: 0: 55585.7. Samples: 1397741320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:47,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 17:07:49,308][57339] Updated weights for policy 0, policy_version 665738 (0.0033) [2024-04-28 17:07:49,760][57319] Signal inference workers to stop experience collection... (20900 times) [2024-04-28 17:07:49,760][57319] Signal inference workers to resume experience collection... 
(20900 times) [2024-04-28 17:07:49,773][57339] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-04-28 17:07:49,773][57339] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-04-28 17:07:51,694][57339] Updated weights for policy 0, policy_version 665748 (0.0027) [2024-04-28 17:07:52,169][57108] Fps is (10 sec: 60620.3, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 10907631616. Throughput: 0: 55743.4. Samples: 1397905600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 17:07:52,170][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 17:07:55,129][57339] Updated weights for policy 0, policy_version 665758 (0.0027) [2024-04-28 17:07:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 10907893760. Throughput: 0: 55784.0. Samples: 1398237960. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:07:57,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 17:07:57,662][57339] Updated weights for policy 0, policy_version 665768 (0.0030) [2024-04-28 17:08:01,010][57339] Updated weights for policy 0, policy_version 665778 (0.0024) [2024-04-28 17:08:02,169][57108] Fps is (10 sec: 54066.7, 60 sec: 56251.5, 300 sec: 55761.1). Total num frames: 10908172288. Throughput: 0: 55770.1. Samples: 1398571020. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:02,170][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 17:08:03,530][57339] Updated weights for policy 0, policy_version 665788 (0.0025) [2024-04-28 17:08:06,960][57339] Updated weights for policy 0, policy_version 665798 (0.0040) [2024-04-28 17:08:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 10908450816. Throughput: 0: 55600.3. Samples: 1398742820. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:07,170][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 17:08:09,331][57339] Updated weights for policy 0, policy_version 665808 (0.0029) [2024-04-28 17:08:12,169][57108] Fps is (10 sec: 50791.6, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10908680192. Throughput: 0: 55584.5. Samples: 1399076400. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:12,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 17:08:12,815][57339] Updated weights for policy 0, policy_version 665818 (0.0024) [2024-04-28 17:08:15,279][57339] Updated weights for policy 0, policy_version 665828 (0.0034) [2024-04-28 17:08:17,169][57108] Fps is (10 sec: 52429.6, 60 sec: 54886.5, 300 sec: 55761.2). Total num frames: 10908975104. Throughput: 0: 55567.2. Samples: 1399407240. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:17,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 17:08:18,738][57339] Updated weights for policy 0, policy_version 665838 (0.0027) [2024-04-28 17:08:21,200][57339] Updated weights for policy 0, policy_version 665848 (0.0027) [2024-04-28 17:08:22,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10909270016. Throughput: 0: 55579.3. Samples: 1399565580. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:22,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 17:08:24,520][57339] Updated weights for policy 0, policy_version 665858 (0.0027) [2024-04-28 17:08:27,037][57339] Updated weights for policy 0, policy_version 665868 (0.0030) [2024-04-28 17:08:27,169][57108] Fps is (10 sec: 60619.8, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 10909581312. Throughput: 0: 55601.2. Samples: 1399903140. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:27,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 17:08:30,513][57319] Signal inference workers to stop experience collection... (20950 times) [2024-04-28 17:08:30,514][57319] Signal inference workers to resume experience collection... (20950 times) [2024-04-28 17:08:30,525][57339] Updated weights for policy 0, policy_version 665878 (0.0032) [2024-04-28 17:08:30,558][57339] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-04-28 17:08:30,559][57339] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-04-28 17:08:32,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10909843456. Throughput: 0: 55419.1. Samples: 1400235180. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:32,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 17:08:32,862][57339] Updated weights for policy 0, policy_version 665888 (0.0027) [2024-04-28 17:08:36,407][57339] Updated weights for policy 0, policy_version 665898 (0.0034) [2024-04-28 17:08:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10910121984. Throughput: 0: 55653.8. Samples: 1400410020. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:37,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 17:08:38,746][57339] Updated weights for policy 0, policy_version 665908 (0.0026) [2024-04-28 17:08:42,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10910367744. Throughput: 0: 55590.3. Samples: 1400739520. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:42,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 17:08:42,404][57339] Updated weights for policy 0, policy_version 665918 (0.0029) [2024-04-28 17:08:44,823][57339] Updated weights for policy 0, policy_version 665928 (0.0034) [2024-04-28 17:08:47,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 10910646272. Throughput: 0: 55736.1. Samples: 1401079140. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:47,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 17:08:48,193][57339] Updated weights for policy 0, policy_version 665938 (0.0023) [2024-04-28 17:08:50,723][57339] Updated weights for policy 0, policy_version 665948 (0.0025) [2024-04-28 17:08:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54613.4, 300 sec: 55650.1). Total num frames: 10910908416. Throughput: 0: 55465.9. Samples: 1401238780. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:52,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 17:08:53,934][57339] Updated weights for policy 0, policy_version 665958 (0.0028) [2024-04-28 17:08:56,487][57339] Updated weights for policy 0, policy_version 665968 (0.0029) [2024-04-28 17:08:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 10911219712. Throughput: 0: 55473.2. Samples: 1401572700. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:08:57,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 17:08:59,851][57339] Updated weights for policy 0, policy_version 665978 (0.0030) [2024-04-28 17:09:02,169][57108] Fps is (10 sec: 60620.9, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 10911514624. Throughput: 0: 55522.2. Samples: 1401905740. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:09:02,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 17:09:02,584][57339] Updated weights for policy 0, policy_version 665988 (0.0029) [2024-04-28 17:09:05,692][57339] Updated weights for policy 0, policy_version 665998 (0.0028) [2024-04-28 17:09:07,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10911793152. Throughput: 0: 55877.3. Samples: 1402080060. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:09:07,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 17:09:08,555][57339] Updated weights for policy 0, policy_version 666008 (0.0027) [2024-04-28 17:09:11,777][57339] Updated weights for policy 0, policy_version 666018 (0.0032) [2024-04-28 17:09:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56524.7, 300 sec: 55650.0). Total num frames: 10912071680. Throughput: 0: 55785.9. Samples: 1402413500. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:09:12,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 17:09:14,366][57339] Updated weights for policy 0, policy_version 666028 (0.0024) [2024-04-28 17:09:17,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10912317440. Throughput: 0: 55791.5. Samples: 1402745800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:09:17,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:09:17,620][57339] Updated weights for policy 0, policy_version 666038 (0.0027) [2024-04-28 17:09:20,169][57339] Updated weights for policy 0, policy_version 666048 (0.0028) [2024-04-28 17:09:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10912612352. Throughput: 0: 55534.3. Samples: 1402909060. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:09:22,170][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 17:09:23,388][57339] Updated weights for policy 0, policy_version 666058 (0.0026) [2024-04-28 17:09:26,813][57339] Updated weights for policy 0, policy_version 666068 (0.0029) [2024-04-28 17:09:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 10912874496. Throughput: 0: 55561.8. Samples: 1403239800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-04-28 17:09:27,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 17:09:27,243][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000666070_10912890880.pth... [2024-04-28 17:09:27,287][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000665256_10899554304.pth [2024-04-28 17:09:29,336][57339] Updated weights for policy 0, policy_version 666078 (0.0026) [2024-04-28 17:09:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10913169408. Throughput: 0: 55457.9. Samples: 1403574740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:09:32,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 17:09:32,575][57339] Updated weights for policy 0, policy_version 666088 (0.0033) [2024-04-28 17:09:35,237][57339] Updated weights for policy 0, policy_version 666098 (0.0031) [2024-04-28 17:09:37,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10913464320. Throughput: 0: 55664.5. Samples: 1403743680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:09:37,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 17:09:38,526][57339] Updated weights for policy 0, policy_version 666108 (0.0031) [2024-04-28 17:09:40,961][57339] Updated weights for policy 0, policy_version 666118 (0.0024) [2024-04-28 17:09:41,656][57319] Signal inference workers to stop experience collection... 
(21000 times) [2024-04-28 17:09:41,704][57339] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-04-28 17:09:41,715][57319] Signal inference workers to resume experience collection... (21000 times) [2024-04-28 17:09:41,721][57339] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-04-28 17:09:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10913742848. Throughput: 0: 55721.9. Samples: 1404080180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:09:42,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 17:09:44,511][57339] Updated weights for policy 0, policy_version 666128 (0.0031) [2024-04-28 17:09:47,002][57339] Updated weights for policy 0, policy_version 666138 (0.0028) [2024-04-28 17:09:47,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10914004992. Throughput: 0: 55805.1. Samples: 1404416980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:09:47,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 17:09:50,202][57339] Updated weights for policy 0, policy_version 666148 (0.0030) [2024-04-28 17:09:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 10914283520. Throughput: 0: 55669.5. Samples: 1404585180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:09:52,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 17:09:52,800][57339] Updated weights for policy 0, policy_version 666158 (0.0027) [2024-04-28 17:09:55,914][57339] Updated weights for policy 0, policy_version 666168 (0.0029) [2024-04-28 17:09:57,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10914562048. Throughput: 0: 55882.7. Samples: 1404928220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:09:57,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 17:09:58,644][57339] Updated weights for policy 0, policy_version 666178 (0.0031) [2024-04-28 17:10:01,805][57339] Updated weights for policy 0, policy_version 666188 (0.0027) [2024-04-28 17:10:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10914840576. Throughput: 0: 55982.2. Samples: 1405265000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:02,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 17:10:04,771][57339] Updated weights for policy 0, policy_version 666198 (0.0028) [2024-04-28 17:10:07,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10915102720. Throughput: 0: 55982.4. Samples: 1405428260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:07,173][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 17:10:07,610][57339] Updated weights for policy 0, policy_version 666208 (0.0031) [2024-04-28 17:10:10,644][57339] Updated weights for policy 0, policy_version 666218 (0.0033) [2024-04-28 17:10:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10915397632. Throughput: 0: 56051.6. Samples: 1405762120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:12,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 17:10:13,669][57339] Updated weights for policy 0, policy_version 666228 (0.0027) [2024-04-28 17:10:16,363][57339] Updated weights for policy 0, policy_version 666238 (0.0030) [2024-04-28 17:10:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10915659776. Throughput: 0: 55956.0. Samples: 1406092760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:17,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 17:10:19,448][57339] Updated weights for policy 0, policy_version 666248 (0.0027) [2024-04-28 17:10:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10915954688. Throughput: 0: 55991.0. Samples: 1406263280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:22,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 17:10:22,280][57339] Updated weights for policy 0, policy_version 666258 (0.0029) [2024-04-28 17:10:25,310][57339] Updated weights for policy 0, policy_version 666268 (0.0026) [2024-04-28 17:10:27,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10916233216. Throughput: 0: 55910.1. Samples: 1406596140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:27,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:10:28,299][57339] Updated weights for policy 0, policy_version 666278 (0.0031) [2024-04-28 17:10:31,108][57339] Updated weights for policy 0, policy_version 666288 (0.0034) [2024-04-28 17:10:32,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10916528128. Throughput: 0: 55897.5. Samples: 1406932360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:32,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 17:10:34,145][57339] Updated weights for policy 0, policy_version 666298 (0.0026) [2024-04-28 17:10:37,072][57339] Updated weights for policy 0, policy_version 666308 (0.0026) [2024-04-28 17:10:37,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10916790272. Throughput: 0: 55890.9. Samples: 1407100280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:37,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:10:40,001][57339] Updated weights for policy 0, policy_version 666318 (0.0024) [2024-04-28 17:10:42,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10917068800. Throughput: 0: 55699.1. Samples: 1407434680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:42,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 17:10:42,962][57339] Updated weights for policy 0, policy_version 666328 (0.0034) [2024-04-28 17:10:45,791][57339] Updated weights for policy 0, policy_version 666338 (0.0034) [2024-04-28 17:10:47,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10917363712. Throughput: 0: 55637.3. Samples: 1407768680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:47,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 17:10:47,525][57319] Signal inference workers to stop experience collection... (21050 times) [2024-04-28 17:10:47,562][57339] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-04-28 17:10:47,621][57319] Signal inference workers to resume experience collection... (21050 times) [2024-04-28 17:10:47,622][57339] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-04-28 17:10:48,704][57339] Updated weights for policy 0, policy_version 666348 (0.0030) [2024-04-28 17:10:51,794][57339] Updated weights for policy 0, policy_version 666358 (0.0033) [2024-04-28 17:10:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10917609472. Throughput: 0: 55635.5. Samples: 1407931860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:52,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 17:10:54,512][57339] Updated weights for policy 0, policy_version 666368 (0.0034) [2024-04-28 17:10:57,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10917904384. Throughput: 0: 55462.2. Samples: 1408257920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:10:57,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 17:10:57,786][57339] Updated weights for policy 0, policy_version 666378 (0.0027) [2024-04-28 17:11:00,469][57339] Updated weights for policy 0, policy_version 666388 (0.0028) [2024-04-28 17:11:02,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10918199296. Throughput: 0: 55431.0. Samples: 1408587160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:11:02,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 17:11:03,688][57339] Updated weights for policy 0, policy_version 666398 (0.0028) [2024-04-28 17:11:06,378][57339] Updated weights for policy 0, policy_version 666408 (0.0032) [2024-04-28 17:11:07,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 10918477824. Throughput: 0: 55582.6. Samples: 1408764500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:11:07,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 17:11:09,618][57339] Updated weights for policy 0, policy_version 666418 (0.0031) [2024-04-28 17:11:12,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10918723584. Throughput: 0: 55521.9. Samples: 1409094620. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:12,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 17:11:12,355][57339] Updated weights for policy 0, policy_version 666428 (0.0033) [2024-04-28 17:11:15,575][57339] Updated weights for policy 0, policy_version 666438 (0.0031) [2024-04-28 17:11:17,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10919018496. Throughput: 0: 55354.3. Samples: 1409423300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:17,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 17:11:18,105][57339] Updated weights for policy 0, policy_version 666448 (0.0029) [2024-04-28 17:11:21,540][57339] Updated weights for policy 0, policy_version 666458 (0.0026) [2024-04-28 17:11:22,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 10919280640. Throughput: 0: 55263.4. Samples: 1409587120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:22,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 17:11:24,071][57339] Updated weights for policy 0, policy_version 666468 (0.0026) [2024-04-28 17:11:27,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 10919542784. Throughput: 0: 55326.2. Samples: 1409924360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:27,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:11:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000666476_10919542784.pth... [2024-04-28 17:11:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000665664_10906238976.pth [2024-04-28 17:11:27,599][57339] Updated weights for policy 0, policy_version 666478 (0.0032) [2024-04-28 17:11:30,025][57339] Updated weights for policy 0, policy_version 666488 (0.0030) [2024-04-28 17:11:32,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55159.4, 300 sec: 55650.0). 
Total num frames: 10919837696. Throughput: 0: 55201.3. Samples: 1410252740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:32,170][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 17:11:33,386][57339] Updated weights for policy 0, policy_version 666498 (0.0025) [2024-04-28 17:11:35,798][57339] Updated weights for policy 0, policy_version 666508 (0.0029) [2024-04-28 17:11:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55538.9). Total num frames: 10920116224. Throughput: 0: 55333.2. Samples: 1410421860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:37,178][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:11:39,320][57339] Updated weights for policy 0, policy_version 666518 (0.0027) [2024-04-28 17:11:41,717][57339] Updated weights for policy 0, policy_version 666528 (0.0029) [2024-04-28 17:11:42,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10920427520. Throughput: 0: 55501.8. Samples: 1410755500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:42,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 17:11:45,305][57339] Updated weights for policy 0, policy_version 666538 (0.0030) [2024-04-28 17:11:47,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10920689664. Throughput: 0: 55521.9. Samples: 1411085640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:47,178][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 17:11:47,757][57339] Updated weights for policy 0, policy_version 666548 (0.0033) [2024-04-28 17:11:51,105][57339] Updated weights for policy 0, policy_version 666558 (0.0023) [2024-04-28 17:11:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10920968192. Throughput: 0: 55329.0. Samples: 1411254300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:52,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:11:53,500][57339] Updated weights for policy 0, policy_version 666568 (0.0029) [2024-04-28 17:11:53,891][57319] Signal inference workers to stop experience collection... (21100 times) [2024-04-28 17:11:53,891][57319] Signal inference workers to resume experience collection... (21100 times) [2024-04-28 17:11:53,918][57339] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-04-28 17:11:53,918][57339] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-04-28 17:11:56,871][57339] Updated weights for policy 0, policy_version 666578 (0.0028) [2024-04-28 17:11:57,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10921230336. Throughput: 0: 55524.8. Samples: 1411593240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:11:57,170][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 17:11:59,398][57339] Updated weights for policy 0, policy_version 666588 (0.0028) [2024-04-28 17:12:02,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54886.4, 300 sec: 55650.0). Total num frames: 10921492480. Throughput: 0: 55769.1. Samples: 1411932920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:02,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 17:12:02,808][57339] Updated weights for policy 0, policy_version 666598 (0.0031) [2024-04-28 17:12:05,306][57339] Updated weights for policy 0, policy_version 666608 (0.0031) [2024-04-28 17:12:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10921787392. Throughput: 0: 55757.6. Samples: 1412096220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:07,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 17:12:08,700][57339] Updated weights for policy 0, policy_version 666618 (0.0032) [2024-04-28 17:12:11,113][57339] Updated weights for policy 0, policy_version 666628 (0.0026) [2024-04-28 17:12:12,169][57108] Fps is (10 sec: 60620.9, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 10922098688. Throughput: 0: 55694.2. Samples: 1412430600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:12,170][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 17:12:14,388][57339] Updated weights for policy 0, policy_version 666638 (0.0033) [2024-04-28 17:12:16,937][57339] Updated weights for policy 0, policy_version 666648 (0.0027) [2024-04-28 17:12:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10922360832. Throughput: 0: 55936.1. Samples: 1412769860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:17,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 17:12:20,269][57339] Updated weights for policy 0, policy_version 666658 (0.0027) [2024-04-28 17:12:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.4, 300 sec: 55650.0). Total num frames: 10922639360. Throughput: 0: 56049.4. Samples: 1412944080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:22,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:12:22,716][57339] Updated weights for policy 0, policy_version 666668 (0.0026) [2024-04-28 17:12:26,253][57339] Updated weights for policy 0, policy_version 666678 (0.0025) [2024-04-28 17:12:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56524.9, 300 sec: 55761.2). Total num frames: 10922934272. Throughput: 0: 56135.6. Samples: 1413281600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:27,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 17:12:28,696][57339] Updated weights for policy 0, policy_version 666688 (0.0025) [2024-04-28 17:12:32,023][57339] Updated weights for policy 0, policy_version 666698 (0.0025) [2024-04-28 17:12:32,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10923180032. Throughput: 0: 56133.5. Samples: 1413611660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:32,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 17:12:34,593][57339] Updated weights for policy 0, policy_version 666708 (0.0027) [2024-04-28 17:12:37,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 10923442176. Throughput: 0: 55887.6. Samples: 1413769240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:37,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 17:12:37,818][57339] Updated weights for policy 0, policy_version 666718 (0.0032) [2024-04-28 17:12:40,618][57339] Updated weights for policy 0, policy_version 666728 (0.0033) [2024-04-28 17:12:42,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10923753472. Throughput: 0: 55857.0. Samples: 1414106800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:42,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 17:12:43,657][57339] Updated weights for policy 0, policy_version 666738 (0.0030) [2024-04-28 17:12:43,928][57319] Signal inference workers to stop experience collection... (21150 times) [2024-04-28 17:12:43,931][57319] Signal inference workers to resume experience collection... 
(21150 times) [2024-04-28 17:12:43,957][57339] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-04-28 17:12:43,957][57339] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-04-28 17:12:46,447][57339] Updated weights for policy 0, policy_version 666748 (0.0030) [2024-04-28 17:12:47,169][57108] Fps is (10 sec: 60620.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10924048384. Throughput: 0: 55665.5. Samples: 1414437860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 17:12:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 17:12:49,598][57339] Updated weights for policy 0, policy_version 666758 (0.0030) [2024-04-28 17:12:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10924310528. Throughput: 0: 55777.7. Samples: 1414606220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:12:52,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 17:12:52,492][57339] Updated weights for policy 0, policy_version 666768 (0.0035) [2024-04-28 17:12:55,533][57339] Updated weights for policy 0, policy_version 666778 (0.0029) [2024-04-28 17:12:57,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10924589056. Throughput: 0: 55663.1. Samples: 1414935440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:12:57,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 17:12:58,223][57339] Updated weights for policy 0, policy_version 666788 (0.0028) [2024-04-28 17:13:01,309][57339] Updated weights for policy 0, policy_version 666798 (0.0031) [2024-04-28 17:13:02,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10924867584. Throughput: 0: 55586.9. Samples: 1415271280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:02,170][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 17:13:04,157][57339] Updated weights for policy 0, policy_version 666808 (0.0032) [2024-04-28 17:13:07,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 10925129728. Throughput: 0: 55367.6. Samples: 1415435620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:07,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 17:13:07,275][57339] Updated weights for policy 0, policy_version 666818 (0.0034) [2024-04-28 17:13:10,135][57339] Updated weights for policy 0, policy_version 666828 (0.0029) [2024-04-28 17:13:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10925408256. Throughput: 0: 55208.3. Samples: 1415765980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:12,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 17:13:13,272][57339] Updated weights for policy 0, policy_version 666838 (0.0027) [2024-04-28 17:13:16,155][57339] Updated weights for policy 0, policy_version 666848 (0.0026) [2024-04-28 17:13:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 10925703168. Throughput: 0: 55243.6. Samples: 1416097620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:17,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 17:13:19,086][57339] Updated weights for policy 0, policy_version 666858 (0.0026) [2024-04-28 17:13:21,961][57339] Updated weights for policy 0, policy_version 666868 (0.0036) [2024-04-28 17:13:22,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10925965312. Throughput: 0: 55438.0. Samples: 1416263960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:22,169][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 17:13:24,924][57339] Updated weights for policy 0, policy_version 666878 (0.0030) [2024-04-28 17:13:27,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 10926243840. Throughput: 0: 55364.7. Samples: 1416598220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:27,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 17:13:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000666885_10926243840.pth... [2024-04-28 17:13:27,237][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000666070_10912890880.pth [2024-04-28 17:13:27,857][57339] Updated weights for policy 0, policy_version 666888 (0.0027) [2024-04-28 17:13:31,019][57339] Updated weights for policy 0, policy_version 666898 (0.0029) [2024-04-28 17:13:32,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10926538752. Throughput: 0: 55327.1. Samples: 1416927580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:32,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 17:13:33,806][57339] Updated weights for policy 0, policy_version 666908 (0.0030) [2024-04-28 17:13:36,858][57339] Updated weights for policy 0, policy_version 666918 (0.0030) [2024-04-28 17:13:37,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10926800896. Throughput: 0: 55283.2. Samples: 1417093960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:37,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 17:13:39,836][57339] Updated weights for policy 0, policy_version 666928 (0.0030) [2024-04-28 17:13:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10927079424. Throughput: 0: 55341.1. Samples: 1417425780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:42,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 17:13:42,673][57339] Updated weights for policy 0, policy_version 666938 (0.0030) [2024-04-28 17:13:45,757][57339] Updated weights for policy 0, policy_version 666948 (0.0030) [2024-04-28 17:13:47,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.3, 300 sec: 55761.1). Total num frames: 10927357952. Throughput: 0: 55346.7. Samples: 1417761880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:47,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 17:13:48,486][57339] Updated weights for policy 0, policy_version 666958 (0.0030) [2024-04-28 17:13:51,498][57339] Updated weights for policy 0, policy_version 666968 (0.0035) [2024-04-28 17:13:52,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10927620096. Throughput: 0: 55427.0. Samples: 1417929840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:52,170][57108] Avg episode reward: [(0, '0.503')] [2024-04-28 17:13:54,272][57339] Updated weights for policy 0, policy_version 666978 (0.0026) [2024-04-28 17:13:57,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55159.7, 300 sec: 55539.0). Total num frames: 10927898624. Throughput: 0: 55394.4. Samples: 1418258720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:13:57,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 17:13:57,303][57339] Updated weights for policy 0, policy_version 666988 (0.0025) [2024-04-28 17:14:00,268][57339] Updated weights for policy 0, policy_version 666998 (0.0026) [2024-04-28 17:14:02,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10928193536. Throughput: 0: 55404.6. Samples: 1418590820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:02,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 17:14:03,397][57339] Updated weights for policy 0, policy_version 667008 (0.0026) [2024-04-28 17:14:06,086][57339] Updated weights for policy 0, policy_version 667018 (0.0028) [2024-04-28 17:14:07,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10928472064. Throughput: 0: 55598.4. Samples: 1418765880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:07,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 17:14:09,277][57339] Updated weights for policy 0, policy_version 667028 (0.0032) [2024-04-28 17:14:11,872][57319] Signal inference workers to stop experience collection... (21200 times) [2024-04-28 17:14:11,872][57319] Signal inference workers to resume experience collection... (21200 times) [2024-04-28 17:14:11,888][57339] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-04-28 17:14:11,888][57339] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-04-28 17:14:11,990][57339] Updated weights for policy 0, policy_version 667038 (0.0029) [2024-04-28 17:14:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10928750592. Throughput: 0: 55610.3. Samples: 1419100680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:12,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:14:15,218][57339] Updated weights for policy 0, policy_version 667048 (0.0027) [2024-04-28 17:14:17,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10929012736. Throughput: 0: 55586.5. Samples: 1419428980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:17,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 17:14:18,031][57339] Updated weights for policy 0, policy_version 667058 (0.0028) [2024-04-28 17:14:21,187][57339] Updated weights for policy 0, policy_version 667068 (0.0033) [2024-04-28 17:14:22,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10929274880. Throughput: 0: 55479.9. Samples: 1419590560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:22,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 17:14:24,128][57339] Updated weights for policy 0, policy_version 667078 (0.0034) [2024-04-28 17:14:26,973][57339] Updated weights for policy 0, policy_version 667088 (0.0030) [2024-04-28 17:14:27,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10929569792. Throughput: 0: 55488.9. Samples: 1419922780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:27,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 17:14:30,176][57339] Updated weights for policy 0, policy_version 667098 (0.0035) [2024-04-28 17:14:32,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10929848320. Throughput: 0: 55473.4. Samples: 1420258180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:32,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 17:14:32,966][57339] Updated weights for policy 0, policy_version 667108 (0.0037) [2024-04-28 17:14:35,929][57339] Updated weights for policy 0, policy_version 667118 (0.0034) [2024-04-28 17:14:37,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 10930126848. Throughput: 0: 55403.1. Samples: 1420422980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:37,170][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 17:14:38,754][57339] Updated weights for policy 0, policy_version 667128 (0.0030) [2024-04-28 17:14:41,896][57339] Updated weights for policy 0, policy_version 667138 (0.0032) [2024-04-28 17:14:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55594.6). Total num frames: 10930405376. Throughput: 0: 55510.6. Samples: 1420756700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:42,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 17:14:44,733][57339] Updated weights for policy 0, policy_version 667148 (0.0032) [2024-04-28 17:14:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10930683904. Throughput: 0: 55564.8. Samples: 1421091240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:47,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 17:14:47,609][57339] Updated weights for policy 0, policy_version 667158 (0.0038) [2024-04-28 17:14:50,716][57339] Updated weights for policy 0, policy_version 667168 (0.0032) [2024-04-28 17:14:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10930962432. Throughput: 0: 55345.3. Samples: 1421256420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:52,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 17:14:53,585][57339] Updated weights for policy 0, policy_version 667178 (0.0026) [2024-04-28 17:14:56,587][57339] Updated weights for policy 0, policy_version 667188 (0.0025) [2024-04-28 17:14:57,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10931240960. Throughput: 0: 55363.6. Samples: 1421592040. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:14:57,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 17:14:59,487][57339] Updated weights for policy 0, policy_version 667198 (0.0028) [2024-04-28 17:15:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10931503104. Throughput: 0: 55454.8. Samples: 1421924440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:02,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 17:15:02,582][57339] Updated weights for policy 0, policy_version 667208 (0.0027) [2024-04-28 17:15:05,374][57339] Updated weights for policy 0, policy_version 667218 (0.0024) [2024-04-28 17:15:07,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10931798016. Throughput: 0: 55567.0. Samples: 1422091080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:07,170][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 17:15:08,340][57339] Updated weights for policy 0, policy_version 667228 (0.0026) [2024-04-28 17:15:11,330][57339] Updated weights for policy 0, policy_version 667238 (0.0030) [2024-04-28 17:15:12,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 10932043776. Throughput: 0: 55547.0. Samples: 1422422400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:12,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 17:15:14,209][57339] Updated weights for policy 0, policy_version 667248 (0.0025) [2024-04-28 17:15:17,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10932338688. Throughput: 0: 55524.7. Samples: 1422756800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:17,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 17:15:17,388][57339] Updated weights for policy 0, policy_version 667258 (0.0034) [2024-04-28 17:15:20,100][57339] Updated weights for policy 0, policy_version 667268 (0.0032) [2024-04-28 17:15:22,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10932617216. Throughput: 0: 55521.3. Samples: 1422921440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:22,170][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 17:15:23,143][57339] Updated weights for policy 0, policy_version 667278 (0.0025) [2024-04-28 17:15:25,877][57339] Updated weights for policy 0, policy_version 667288 (0.0028) [2024-04-28 17:15:27,169][57108] Fps is (10 sec: 57345.3, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10932912128. Throughput: 0: 55527.1. Samples: 1423255420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:27,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 17:15:27,258][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000667293_10932928512.pth... [2024-04-28 17:15:27,303][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000666476_10919542784.pth [2024-04-28 17:15:28,908][57339] Updated weights for policy 0, policy_version 667298 (0.0034) [2024-04-28 17:15:31,912][57339] Updated weights for policy 0, policy_version 667308 (0.0029) [2024-04-28 17:15:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10933174272. Throughput: 0: 55436.5. Samples: 1423585880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:32,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 17:15:34,849][57339] Updated weights for policy 0, policy_version 667318 (0.0028) [2024-04-28 17:15:37,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55539.0). 
Total num frames: 10933452800. Throughput: 0: 55494.2. Samples: 1423753660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:37,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 17:15:37,748][57339] Updated weights for policy 0, policy_version 667328 (0.0025) [2024-04-28 17:15:40,739][57339] Updated weights for policy 0, policy_version 667338 (0.0031) [2024-04-28 17:15:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 10933731328. Throughput: 0: 55436.3. Samples: 1424086680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:42,170][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 17:15:43,578][57339] Updated weights for policy 0, policy_version 667348 (0.0034) [2024-04-28 17:15:46,555][57339] Updated weights for policy 0, policy_version 667358 (0.0032) [2024-04-28 17:15:47,043][57319] Signal inference workers to stop experience collection... (21250 times) [2024-04-28 17:15:47,063][57339] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-04-28 17:15:47,101][57319] Signal inference workers to resume experience collection... (21250 times) [2024-04-28 17:15:47,108][57339] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-04-28 17:15:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.7, 300 sec: 55594.5). Total num frames: 10934009856. Throughput: 0: 55495.1. Samples: 1424421720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:47,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 17:15:49,371][57339] Updated weights for policy 0, policy_version 667368 (0.0029) [2024-04-28 17:15:52,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10934288384. Throughput: 0: 55493.5. Samples: 1424588280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:52,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 17:15:52,416][57339] Updated weights for policy 0, policy_version 667378 (0.0031) [2024-04-28 17:15:55,149][57339] Updated weights for policy 0, policy_version 667388 (0.0033) [2024-04-28 17:15:57,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 10934566912. Throughput: 0: 55545.3. Samples: 1424921940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:15:57,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 17:15:58,556][57339] Updated weights for policy 0, policy_version 667398 (0.0029) [2024-04-28 17:16:01,066][57339] Updated weights for policy 0, policy_version 667408 (0.0028) [2024-04-28 17:16:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10934861824. Throughput: 0: 55598.9. Samples: 1425258740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 17:16:02,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 17:16:04,268][57339] Updated weights for policy 0, policy_version 667418 (0.0028) [2024-04-28 17:16:06,868][57339] Updated weights for policy 0, policy_version 667428 (0.0024) [2024-04-28 17:16:07,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10935140352. Throughput: 0: 55804.6. Samples: 1425432640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:07,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:16:10,177][57339] Updated weights for policy 0, policy_version 667438 (0.0024) [2024-04-28 17:16:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 10935418880. Throughput: 0: 55888.8. Samples: 1425770420. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:12,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:16:12,715][57339] Updated weights for policy 0, policy_version 667448 (0.0026) [2024-04-28 17:16:16,094][57339] Updated weights for policy 0, policy_version 667458 (0.0031) [2024-04-28 17:16:17,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 10935681024. Throughput: 0: 56003.2. Samples: 1426106020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:17,169][57108] Avg episode reward: [(0, '0.481')] [2024-04-28 17:16:18,501][57339] Updated weights for policy 0, policy_version 667468 (0.0031) [2024-04-28 17:16:21,904][57339] Updated weights for policy 0, policy_version 667478 (0.0025) [2024-04-28 17:16:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10935959552. Throughput: 0: 55713.4. Samples: 1426260760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:22,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 17:16:24,369][57339] Updated weights for policy 0, policy_version 667488 (0.0030) [2024-04-28 17:16:27,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10936221696. Throughput: 0: 55886.4. Samples: 1426601560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:27,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:16:27,931][57339] Updated weights for policy 0, policy_version 667498 (0.0023) [2024-04-28 17:16:30,191][57339] Updated weights for policy 0, policy_version 667508 (0.0033) [2024-04-28 17:16:32,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10936500224. Throughput: 0: 55844.8. Samples: 1426934740. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:32,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 17:16:33,998][57339] Updated weights for policy 0, policy_version 667518 (0.0026) [2024-04-28 17:16:36,064][57339] Updated weights for policy 0, policy_version 667528 (0.0031) [2024-04-28 17:16:37,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 10936811520. Throughput: 0: 55994.6. Samples: 1427108040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:37,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 17:16:39,960][57339] Updated weights for policy 0, policy_version 667538 (0.0028) [2024-04-28 17:16:41,874][57339] Updated weights for policy 0, policy_version 667548 (0.0029) [2024-04-28 17:16:42,169][57108] Fps is (10 sec: 62259.5, 60 sec: 56524.9, 300 sec: 55705.6). Total num frames: 10937122816. Throughput: 0: 55932.1. Samples: 1427438880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:42,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 17:16:45,790][57339] Updated weights for policy 0, policy_version 667558 (0.0026) [2024-04-28 17:16:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10937368576. Throughput: 0: 55832.0. Samples: 1427771180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:47,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 17:16:47,800][57339] Updated weights for policy 0, policy_version 667568 (0.0027) [2024-04-28 17:16:50,490][57319] Signal inference workers to stop experience collection... (21300 times) [2024-04-28 17:16:50,529][57339] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-04-28 17:16:50,538][57319] Signal inference workers to resume experience collection... 
(21300 times) [2024-04-28 17:16:50,542][57339] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-04-28 17:16:51,727][57339] Updated weights for policy 0, policy_version 667578 (0.0028) [2024-04-28 17:16:52,169][57108] Fps is (10 sec: 49152.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10937614336. Throughput: 0: 55759.6. Samples: 1427941820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:52,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 17:16:53,602][57339] Updated weights for policy 0, policy_version 667588 (0.0030) [2024-04-28 17:16:57,169][57108] Fps is (10 sec: 52427.1, 60 sec: 55432.3, 300 sec: 55594.5). Total num frames: 10937892864. Throughput: 0: 55792.1. Samples: 1428281080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:16:57,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:16:57,516][57339] Updated weights for policy 0, policy_version 667598 (0.0026) [2024-04-28 17:16:59,454][57339] Updated weights for policy 0, policy_version 667608 (0.0029) [2024-04-28 17:17:02,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10938204160. Throughput: 0: 55751.1. Samples: 1428614820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:17:02,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 17:17:03,399][57339] Updated weights for policy 0, policy_version 667618 (0.0036) [2024-04-28 17:17:05,384][57339] Updated weights for policy 0, policy_version 667628 (0.0030) [2024-04-28 17:17:07,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 10938449920. Throughput: 0: 55722.4. Samples: 1428768280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:17:07,170][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 17:17:09,349][57339] Updated weights for policy 0, policy_version 667638 (0.0030) [2024-04-28 17:17:11,385][57339] Updated weights for policy 0, policy_version 667648 (0.0026) [2024-04-28 17:17:12,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10938761216. Throughput: 0: 55577.5. Samples: 1429102560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:17:12,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 17:17:15,012][57339] Updated weights for policy 0, policy_version 667658 (0.0029) [2024-04-28 17:17:17,072][57339] Updated weights for policy 0, policy_version 667668 (0.0031) [2024-04-28 17:17:17,169][57108] Fps is (10 sec: 62260.0, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 10939072512. Throughput: 0: 55582.2. Samples: 1429435940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:17:17,170][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 17:17:20,856][57339] Updated weights for policy 0, policy_version 667678 (0.0030) [2024-04-28 17:17:22,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 10939334656. Throughput: 0: 55677.8. Samples: 1429613540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:17:22,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 17:17:22,863][57339] Updated weights for policy 0, policy_version 667688 (0.0025) [2024-04-28 17:17:26,815][57339] Updated weights for policy 0, policy_version 667698 (0.0035) [2024-04-28 17:17:27,169][57108] Fps is (10 sec: 50789.9, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 10939580416. Throughput: 0: 55688.2. Samples: 1429944860. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:17:27,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 17:17:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000667699_10939580416.pth... [2024-04-28 17:17:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000666885_10926243840.pth [2024-04-28 17:17:28,840][57339] Updated weights for policy 0, policy_version 667708 (0.0030) [2024-04-28 17:17:32,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10939842560. Throughput: 0: 55775.7. Samples: 1430281080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:17:32,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 17:17:32,765][57339] Updated weights for policy 0, policy_version 667718 (0.0030) [2024-04-28 17:17:34,811][57339] Updated weights for policy 0, policy_version 667728 (0.0028) [2024-04-28 17:17:37,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10940153856. Throughput: 0: 55480.7. Samples: 1430438460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-04-28 17:17:37,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 17:17:38,609][57339] Updated weights for policy 0, policy_version 667738 (0.0029) [2024-04-28 17:17:40,538][57339] Updated weights for policy 0, policy_version 667748 (0.0028) [2024-04-28 17:17:42,169][57108] Fps is (10 sec: 57343.1, 60 sec: 54886.3, 300 sec: 55483.4). Total num frames: 10940416000. Throughput: 0: 55415.9. Samples: 1430774780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:17:42,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 17:17:44,404][57339] Updated weights for policy 0, policy_version 667758 (0.0027) [2024-04-28 17:17:45,728][57319] Signal inference workers to stop experience collection... 
(21350 times) [2024-04-28 17:17:45,767][57339] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-04-28 17:17:45,792][57319] Signal inference workers to resume experience collection... (21350 times) [2024-04-28 17:17:45,797][57339] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-04-28 17:17:46,987][57339] Updated weights for policy 0, policy_version 667768 (0.0025) [2024-04-28 17:17:47,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10940727296. Throughput: 0: 55576.9. Samples: 1431115780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:17:47,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 17:17:50,216][57339] Updated weights for policy 0, policy_version 667778 (0.0034) [2024-04-28 17:17:52,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56524.7, 300 sec: 55650.1). Total num frames: 10941005824. Throughput: 0: 56002.9. Samples: 1431288400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:17:52,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 17:17:52,981][57339] Updated weights for policy 0, policy_version 667788 (0.0027) [2024-04-28 17:17:56,101][57339] Updated weights for policy 0, policy_version 667798 (0.0028) [2024-04-28 17:17:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56525.2, 300 sec: 55650.1). Total num frames: 10941284352. Throughput: 0: 56016.7. Samples: 1431623300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:17:57,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 17:17:58,769][57339] Updated weights for policy 0, policy_version 667808 (0.0031) [2024-04-28 17:18:01,918][57339] Updated weights for policy 0, policy_version 667818 (0.0030) [2024-04-28 17:18:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10941546496. Throughput: 0: 55975.1. Samples: 1431954820. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:02,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 17:18:04,515][57339] Updated weights for policy 0, policy_version 667828 (0.0032) [2024-04-28 17:18:07,169][57108] Fps is (10 sec: 50789.9, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10941792256. Throughput: 0: 55581.7. Samples: 1432114720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:07,169][57108] Avg episode reward: [(0, '0.704')] [2024-04-28 17:18:07,684][57339] Updated weights for policy 0, policy_version 667838 (0.0027) [2024-04-28 17:18:10,622][57339] Updated weights for policy 0, policy_version 667848 (0.0032) [2024-04-28 17:18:12,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10942087168. Throughput: 0: 55661.0. Samples: 1432449600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:12,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 17:18:13,772][57339] Updated weights for policy 0, policy_version 667858 (0.0029) [2024-04-28 17:18:16,553][57339] Updated weights for policy 0, policy_version 667868 (0.0032) [2024-04-28 17:18:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 10942365696. Throughput: 0: 55540.7. Samples: 1432780420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:17,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 17:18:19,614][57339] Updated weights for policy 0, policy_version 667878 (0.0031) [2024-04-28 17:18:22,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10942660608. Throughput: 0: 55845.6. Samples: 1432951500. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:22,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 17:18:22,240][57339] Updated weights for policy 0, policy_version 667888 (0.0026) [2024-04-28 17:18:25,472][57339] Updated weights for policy 0, policy_version 667898 (0.0026) [2024-04-28 17:18:27,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 10942922752. Throughput: 0: 55808.2. Samples: 1433286140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:27,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 17:18:28,169][57339] Updated weights for policy 0, policy_version 667908 (0.0028) [2024-04-28 17:18:31,288][57339] Updated weights for policy 0, policy_version 667918 (0.0027) [2024-04-28 17:18:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10943217664. Throughput: 0: 55560.5. Samples: 1433616000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:32,169][57108] Avg episode reward: [(0, '0.740')] [2024-04-28 17:18:34,435][57339] Updated weights for policy 0, policy_version 667928 (0.0040) [2024-04-28 17:18:37,107][57339] Updated weights for policy 0, policy_version 667938 (0.0027) [2024-04-28 17:18:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10943496192. Throughput: 0: 55497.3. Samples: 1433785780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:37,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 17:18:37,537][57319] Signal inference workers to stop experience collection... (21400 times) [2024-04-28 17:18:37,569][57339] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-04-28 17:18:37,595][57319] Signal inference workers to resume experience collection... 
(21400 times) [2024-04-28 17:18:37,596][57339] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-04-28 17:18:40,592][57339] Updated weights for policy 0, policy_version 667948 (0.0030) [2024-04-28 17:18:42,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10943758336. Throughput: 0: 55531.5. Samples: 1434122220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:42,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 17:18:43,092][57339] Updated weights for policy 0, policy_version 667958 (0.0032) [2024-04-28 17:18:46,580][57339] Updated weights for policy 0, policy_version 667968 (0.0035) [2024-04-28 17:18:47,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 10944020480. Throughput: 0: 55515.1. Samples: 1434453000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:47,170][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 17:18:48,985][57339] Updated weights for policy 0, policy_version 667978 (0.0027) [2024-04-28 17:18:52,169][57108] Fps is (10 sec: 54066.6, 60 sec: 54886.3, 300 sec: 55594.5). Total num frames: 10944299008. Throughput: 0: 55338.6. Samples: 1434604960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:52,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 17:18:52,509][57339] Updated weights for policy 0, policy_version 667988 (0.0034) [2024-04-28 17:18:54,963][57339] Updated weights for policy 0, policy_version 667998 (0.0028) [2024-04-28 17:18:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 10944577536. Throughput: 0: 55088.4. Samples: 1434928580. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:18:57,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 17:18:58,341][57339] Updated weights for policy 0, policy_version 668008 (0.0035) [2024-04-28 17:19:00,836][57339] Updated weights for policy 0, policy_version 668018 (0.0026) [2024-04-28 17:19:02,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 10944888832. Throughput: 0: 55195.3. Samples: 1435264200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:19:02,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 17:19:04,279][57339] Updated weights for policy 0, policy_version 668028 (0.0026) [2024-04-28 17:19:06,734][57339] Updated weights for policy 0, policy_version 668038 (0.0032) [2024-04-28 17:19:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 10945150976. Throughput: 0: 55375.0. Samples: 1435443380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:19:07,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 17:19:10,025][57339] Updated weights for policy 0, policy_version 668048 (0.0029) [2024-04-28 17:19:12,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10945413120. Throughput: 0: 55296.0. Samples: 1435774460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:19:12,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:19:12,480][57339] Updated weights for policy 0, policy_version 668058 (0.0030) [2024-04-28 17:19:16,174][57339] Updated weights for policy 0, policy_version 668068 (0.0028) [2024-04-28 17:19:17,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10945675264. Throughput: 0: 55267.0. Samples: 1436103020. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-04-28 17:19:17,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 17:19:18,338][57339] Updated weights for policy 0, policy_version 668078 (0.0029) [2024-04-28 17:19:21,928][57339] Updated weights for policy 0, policy_version 668088 (0.0031) [2024-04-28 17:19:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10945970176. Throughput: 0: 55128.0. Samples: 1436266540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:19:22,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 17:19:24,226][57339] Updated weights for policy 0, policy_version 668098 (0.0040) [2024-04-28 17:19:27,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54886.4, 300 sec: 55483.5). Total num frames: 10946215936. Throughput: 0: 54964.0. Samples: 1436595600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:19:27,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 17:19:27,236][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000668105_10946232320.pth... [2024-04-28 17:19:27,280][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000667293_10932928512.pth [2024-04-28 17:19:27,997][57339] Updated weights for policy 0, policy_version 668108 (0.0026) [2024-04-28 17:19:30,067][57339] Updated weights for policy 0, policy_version 668118 (0.0029) [2024-04-28 17:19:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10946527232. Throughput: 0: 54993.8. Samples: 1436927720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:19:32,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:19:33,793][57339] Updated weights for policy 0, policy_version 668128 (0.0029) [2024-04-28 17:19:35,956][57339] Updated weights for policy 0, policy_version 668138 (0.0025) [2024-04-28 17:19:37,169][57108] Fps is (10 sec: 60621.1, 60 sec: 55432.6, 300 sec: 55650.1). 
Total num frames: 10946822144. Throughput: 0: 55458.9. Samples: 1437100600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:19:37,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 17:19:39,813][57339] Updated weights for policy 0, policy_version 668148 (0.0029) [2024-04-28 17:19:40,859][57319] Signal inference workers to stop experience collection... (21450 times) [2024-04-28 17:19:40,859][57319] Signal inference workers to resume experience collection... (21450 times) [2024-04-28 17:19:40,900][57339] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-04-28 17:19:40,900][57339] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-04-28 17:19:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10947084288. Throughput: 0: 55578.2. Samples: 1437429600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:19:42,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 17:19:42,255][57339] Updated weights for policy 0, policy_version 668158 (0.0030) [2024-04-28 17:19:45,698][57339] Updated weights for policy 0, policy_version 668168 (0.0033) [2024-04-28 17:19:47,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10947362816. Throughput: 0: 55422.5. Samples: 1437758220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:19:47,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 17:19:48,273][57339] Updated weights for policy 0, policy_version 668178 (0.0029) [2024-04-28 17:19:51,598][57339] Updated weights for policy 0, policy_version 668188 (0.0033) [2024-04-28 17:19:52,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 10947608576. Throughput: 0: 55219.1. Samples: 1437928240. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:19:52,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 17:19:54,185][57339] Updated weights for policy 0, policy_version 668198 (0.0032) [2024-04-28 17:19:57,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10947887104. Throughput: 0: 55225.8. Samples: 1438259620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:19:57,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 17:19:57,405][57339] Updated weights for policy 0, policy_version 668208 (0.0031) [2024-04-28 17:20:00,021][57339] Updated weights for policy 0, policy_version 668218 (0.0031) [2024-04-28 17:20:02,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54340.2, 300 sec: 55427.9). Total num frames: 10948149248. Throughput: 0: 55470.2. Samples: 1438599180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:02,170][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 17:20:03,199][57339] Updated weights for policy 0, policy_version 668228 (0.0025) [2024-04-28 17:20:05,790][57339] Updated weights for policy 0, policy_version 668238 (0.0032) [2024-04-28 17:20:07,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 10948460544. Throughput: 0: 55270.1. Samples: 1438753700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:07,170][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 17:20:09,148][57339] Updated weights for policy 0, policy_version 668248 (0.0026) [2024-04-28 17:20:11,595][57339] Updated weights for policy 0, policy_version 668258 (0.0026) [2024-04-28 17:20:12,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10948755456. Throughput: 0: 55330.1. Samples: 1439085460. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:12,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 17:20:15,344][57339] Updated weights for policy 0, policy_version 668268 (0.0027) [2024-04-28 17:20:17,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 10949033984. Throughput: 0: 55483.5. Samples: 1439424480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:17,169][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 17:20:17,427][57339] Updated weights for policy 0, policy_version 668278 (0.0027) [2024-04-28 17:20:21,154][57339] Updated weights for policy 0, policy_version 668288 (0.0028) [2024-04-28 17:20:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 10949328896. Throughput: 0: 55694.4. Samples: 1439606860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:22,170][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 17:20:23,282][57339] Updated weights for policy 0, policy_version 668298 (0.0027) [2024-04-28 17:20:26,975][57339] Updated weights for policy 0, policy_version 668308 (0.0025) [2024-04-28 17:20:27,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10949574656. Throughput: 0: 55806.7. Samples: 1439940900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:27,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 17:20:27,428][57319] Signal inference workers to stop experience collection... (21500 times) [2024-04-28 17:20:27,429][57319] Signal inference workers to resume experience collection... 
(21500 times) [2024-04-28 17:20:27,468][57339] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-04-28 17:20:27,469][57339] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-04-28 17:20:29,277][57339] Updated weights for policy 0, policy_version 668318 (0.0033) [2024-04-28 17:20:32,169][57108] Fps is (10 sec: 50791.0, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10949836800. Throughput: 0: 55946.4. Samples: 1440275800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:32,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:20:32,740][57339] Updated weights for policy 0, policy_version 668328 (0.0030) [2024-04-28 17:20:35,409][57339] Updated weights for policy 0, policy_version 668338 (0.0025) [2024-04-28 17:20:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10950131712. Throughput: 0: 55729.3. Samples: 1440436060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:37,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 17:20:38,492][57339] Updated weights for policy 0, policy_version 668348 (0.0025) [2024-04-28 17:20:41,146][57339] Updated weights for policy 0, policy_version 668358 (0.0026) [2024-04-28 17:20:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10950393856. Throughput: 0: 55744.8. Samples: 1440768140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:42,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 17:20:44,420][57339] Updated weights for policy 0, policy_version 668368 (0.0026) [2024-04-28 17:20:46,989][57339] Updated weights for policy 0, policy_version 668378 (0.0034) [2024-04-28 17:20:47,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10950721536. Throughput: 0: 55631.2. Samples: 1441102580. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:47,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 17:20:50,286][57339] Updated weights for policy 0, policy_version 668388 (0.0028) [2024-04-28 17:20:52,169][57108] Fps is (10 sec: 60621.3, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 10951000064. Throughput: 0: 56127.3. Samples: 1441279420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:52,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 17:20:52,877][57339] Updated weights for policy 0, policy_version 668398 (0.0026) [2024-04-28 17:20:56,139][57339] Updated weights for policy 0, policy_version 668408 (0.0029) [2024-04-28 17:20:57,169][57108] Fps is (10 sec: 55704.9, 60 sec: 56524.7, 300 sec: 55650.1). Total num frames: 10951278592. Throughput: 0: 56352.9. Samples: 1441621340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-04-28 17:20:57,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 17:20:58,664][57339] Updated weights for policy 0, policy_version 668418 (0.0029) [2024-04-28 17:21:02,099][57339] Updated weights for policy 0, policy_version 668428 (0.0030) [2024-04-28 17:21:02,169][57108] Fps is (10 sec: 52427.9, 60 sec: 56251.7, 300 sec: 55539.0). Total num frames: 10951524352. Throughput: 0: 56230.1. Samples: 1441954840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:02,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 17:21:04,423][57339] Updated weights for policy 0, policy_version 668438 (0.0029) [2024-04-28 17:21:07,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 10951786496. Throughput: 0: 55539.2. Samples: 1442106120. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:07,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 17:21:07,903][57339] Updated weights for policy 0, policy_version 668448 (0.0026) [2024-04-28 17:21:10,327][57339] Updated weights for policy 0, policy_version 668458 (0.0027) [2024-04-28 17:21:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 10952065024. Throughput: 0: 55571.5. Samples: 1442441620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:12,170][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 17:21:13,715][57339] Updated weights for policy 0, policy_version 668468 (0.0031) [2024-04-28 17:21:16,318][57339] Updated weights for policy 0, policy_version 668478 (0.0025) [2024-04-28 17:21:17,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10952359936. Throughput: 0: 55580.3. Samples: 1442776920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:17,169][57108] Avg episode reward: [(0, '0.704')] [2024-04-28 17:21:19,557][57339] Updated weights for policy 0, policy_version 668488 (0.0028) [2024-04-28 17:21:22,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10952654848. Throughput: 0: 55772.4. Samples: 1442945820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:22,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 17:21:22,413][57339] Updated weights for policy 0, policy_version 668498 (0.0028) [2024-04-28 17:21:25,519][57339] Updated weights for policy 0, policy_version 668508 (0.0037) [2024-04-28 17:21:26,172][57319] Signal inference workers to stop experience collection... (21550 times) [2024-04-28 17:21:26,172][57319] Signal inference workers to resume experience collection... 
(21550 times) [2024-04-28 17:21:26,182][57339] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-04-28 17:21:26,182][57339] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-04-28 17:21:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10952933376. Throughput: 0: 55787.0. Samples: 1443278560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:27,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:21:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000668514_10952933376.pth... [2024-04-28 17:21:27,235][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000667699_10939580416.pth [2024-04-28 17:21:28,255][57339] Updated weights for policy 0, policy_version 668518 (0.0025) [2024-04-28 17:21:31,217][57339] Updated weights for policy 0, policy_version 668528 (0.0033) [2024-04-28 17:21:32,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56524.7, 300 sec: 55650.1). Total num frames: 10953228288. Throughput: 0: 55720.7. Samples: 1443610020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:32,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 17:21:34,279][57339] Updated weights for policy 0, policy_version 668538 (0.0025) [2024-04-28 17:21:36,980][57339] Updated weights for policy 0, policy_version 668548 (0.0025) [2024-04-28 17:21:37,169][57108] Fps is (10 sec: 55706.9, 60 sec: 55978.7, 300 sec: 55483.5). Total num frames: 10953490432. Throughput: 0: 55632.9. Samples: 1443782900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:37,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 17:21:40,196][57339] Updated weights for policy 0, policy_version 668558 (0.0027) [2024-04-28 17:21:42,169][57108] Fps is (10 sec: 50791.3, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 10953736192. Throughput: 0: 55503.3. Samples: 1444118980. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:42,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 17:21:42,975][57339] Updated weights for policy 0, policy_version 668568 (0.0031) [2024-04-28 17:21:46,109][57339] Updated weights for policy 0, policy_version 668578 (0.0028) [2024-04-28 17:21:47,169][57108] Fps is (10 sec: 54066.0, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 10954031104. Throughput: 0: 55601.8. Samples: 1444456920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:47,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 17:21:48,994][57339] Updated weights for policy 0, policy_version 668588 (0.0033) [2024-04-28 17:21:51,873][57339] Updated weights for policy 0, policy_version 668598 (0.0029) [2024-04-28 17:21:52,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10954309632. Throughput: 0: 55778.7. Samples: 1444616160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:52,170][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 17:21:54,841][57339] Updated weights for policy 0, policy_version 668608 (0.0025) [2024-04-28 17:21:57,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 10954620928. Throughput: 0: 55724.9. Samples: 1444949240. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:21:57,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 17:21:57,758][57339] Updated weights for policy 0, policy_version 668618 (0.0026) [2024-04-28 17:22:00,610][57339] Updated weights for policy 0, policy_version 668628 (0.0028) [2024-04-28 17:22:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10954883072. Throughput: 0: 55718.7. Samples: 1445284260. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:22:02,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 17:22:03,871][57339] Updated weights for policy 0, policy_version 668638 (0.0027) [2024-04-28 17:22:06,551][57339] Updated weights for policy 0, policy_version 668648 (0.0028) [2024-04-28 17:22:07,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 10955145216. Throughput: 0: 55629.4. Samples: 1445449140. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:22:07,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 17:22:09,846][57339] Updated weights for policy 0, policy_version 668658 (0.0030) [2024-04-28 17:22:12,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 55427.9). Total num frames: 10955423744. Throughput: 0: 55710.3. Samples: 1445785520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:22:12,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 17:22:12,486][57339] Updated weights for policy 0, policy_version 668668 (0.0035) [2024-04-28 17:22:15,687][57339] Updated weights for policy 0, policy_version 668678 (0.0028) [2024-04-28 17:22:17,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 10955718656. Throughput: 0: 55778.2. Samples: 1446120040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:22:17,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 17:22:18,336][57339] Updated weights for policy 0, policy_version 668688 (0.0030) [2024-04-28 17:22:21,651][57339] Updated weights for policy 0, policy_version 668698 (0.0028) [2024-04-28 17:22:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10955964416. Throughput: 0: 55533.1. Samples: 1446281900. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:22:22,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 17:22:24,299][57339] Updated weights for policy 0, policy_version 668708 (0.0026) [2024-04-28 17:22:27,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10956242944. Throughput: 0: 55391.7. Samples: 1446611620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:22:27,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 17:22:27,508][57339] Updated weights for policy 0, policy_version 668718 (0.0030) [2024-04-28 17:22:30,322][57339] Updated weights for policy 0, policy_version 668728 (0.0031) [2024-04-28 17:22:30,865][57319] Signal inference workers to stop experience collection... (21600 times) [2024-04-28 17:22:30,865][57319] Signal inference workers to resume experience collection... (21600 times) [2024-04-28 17:22:30,882][57339] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-04-28 17:22:30,882][57339] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-04-28 17:22:32,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10956554240. Throughput: 0: 55224.2. Samples: 1446942000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-04-28 17:22:32,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 17:22:33,349][57339] Updated weights for policy 0, policy_version 668738 (0.0037) [2024-04-28 17:22:36,233][57339] Updated weights for policy 0, policy_version 668748 (0.0032) [2024-04-28 17:22:37,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 10956816384. Throughput: 0: 55365.7. Samples: 1447107620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:22:37,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 17:22:39,233][57339] Updated weights for policy 0, policy_version 668758 (0.0033) [2024-04-28 17:22:42,151][57339] Updated weights for policy 0, policy_version 668768 (0.0029) [2024-04-28 17:22:42,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55483.5). Total num frames: 10957094912. Throughput: 0: 55408.2. Samples: 1447442600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:22:42,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 17:22:45,199][57339] Updated weights for policy 0, policy_version 668778 (0.0026) [2024-04-28 17:22:47,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 10957357056. Throughput: 0: 55339.0. Samples: 1447774520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:22:47,169][57108] Avg episode reward: [(0, '0.697')] [2024-04-28 17:22:48,092][57339] Updated weights for policy 0, policy_version 668788 (0.0028) [2024-04-28 17:22:51,212][57339] Updated weights for policy 0, policy_version 668798 (0.0028) [2024-04-28 17:22:52,169][57108] Fps is (10 sec: 57342.8, 60 sec: 55978.6, 300 sec: 55538.9). Total num frames: 10957668352. Throughput: 0: 55484.7. Samples: 1447945960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:22:52,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:22:54,128][57339] Updated weights for policy 0, policy_version 668808 (0.0031) [2024-04-28 17:22:56,962][57339] Updated weights for policy 0, policy_version 668818 (0.0023) [2024-04-28 17:22:57,169][57108] Fps is (10 sec: 55705.3, 60 sec: 54886.3, 300 sec: 55483.4). Total num frames: 10957914112. Throughput: 0: 55359.5. Samples: 1448276700. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:22:57,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:22:59,834][57339] Updated weights for policy 0, policy_version 668828 (0.0026) [2024-04-28 17:23:02,169][57108] Fps is (10 sec: 52429.9, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 10958192640. Throughput: 0: 55519.8. Samples: 1448618420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:02,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 17:23:02,875][57339] Updated weights for policy 0, policy_version 668838 (0.0027) [2024-04-28 17:23:05,729][57339] Updated weights for policy 0, policy_version 668848 (0.0030) [2024-04-28 17:23:07,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10958487552. Throughput: 0: 55576.9. Samples: 1448782860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:07,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:23:08,839][57339] Updated weights for policy 0, policy_version 668858 (0.0034) [2024-04-28 17:23:11,709][57339] Updated weights for policy 0, policy_version 668868 (0.0037) [2024-04-28 17:23:12,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10958766080. Throughput: 0: 55634.0. Samples: 1449115140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:12,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 17:23:14,591][57339] Updated weights for policy 0, policy_version 668878 (0.0032) [2024-04-28 17:23:15,256][57319] Signal inference workers to stop experience collection... (21650 times) [2024-04-28 17:23:15,303][57339] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-04-28 17:23:15,315][57319] Signal inference workers to resume experience collection... 
(21650 times) [2024-04-28 17:23:15,322][57339] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-04-28 17:23:17,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55159.6, 300 sec: 55483.4). Total num frames: 10959028224. Throughput: 0: 55803.6. Samples: 1449453160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:17,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 17:23:17,487][57339] Updated weights for policy 0, policy_version 668888 (0.0028) [2024-04-28 17:23:20,454][57339] Updated weights for policy 0, policy_version 668898 (0.0027) [2024-04-28 17:23:22,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10959306752. Throughput: 0: 55787.2. Samples: 1449618040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:22,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 17:23:23,434][57339] Updated weights for policy 0, policy_version 668908 (0.0025) [2024-04-28 17:23:26,284][57339] Updated weights for policy 0, policy_version 668918 (0.0031) [2024-04-28 17:23:27,169][57108] Fps is (10 sec: 58981.7, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 10959618048. Throughput: 0: 55874.0. Samples: 1449956940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:27,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 17:23:27,246][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000668923_10959634432.pth... [2024-04-28 17:23:27,291][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000668105_10946232320.pth [2024-04-28 17:23:29,236][57339] Updated weights for policy 0, policy_version 668928 (0.0030) [2024-04-28 17:23:32,131][57339] Updated weights for policy 0, policy_version 668938 (0.0031) [2024-04-28 17:23:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10959880192. Throughput: 0: 55969.0. Samples: 1450293120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:32,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 17:23:35,057][57339] Updated weights for policy 0, policy_version 668948 (0.0026) [2024-04-28 17:23:37,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10960142336. Throughput: 0: 55852.7. Samples: 1450459320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:37,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 17:23:37,976][57339] Updated weights for policy 0, policy_version 668958 (0.0026) [2024-04-28 17:23:41,006][57339] Updated weights for policy 0, policy_version 668968 (0.0030) [2024-04-28 17:23:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10960437248. Throughput: 0: 55912.1. Samples: 1450792740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:42,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 17:23:43,721][57339] Updated weights for policy 0, policy_version 668978 (0.0029) [2024-04-28 17:23:46,868][57339] Updated weights for policy 0, policy_version 668988 (0.0028) [2024-04-28 17:23:47,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10960715776. Throughput: 0: 55885.2. Samples: 1451133260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:47,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 17:23:49,574][57339] Updated weights for policy 0, policy_version 668998 (0.0029) [2024-04-28 17:23:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10960994304. Throughput: 0: 55836.1. Samples: 1451295480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:52,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 17:23:52,588][57339] Updated weights for policy 0, policy_version 669008 (0.0030) [2024-04-28 17:23:55,448][57339] Updated weights for policy 0, policy_version 669018 (0.0029) [2024-04-28 17:23:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 10961272832. Throughput: 0: 55888.8. Samples: 1451630140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:23:57,170][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 17:23:58,543][57339] Updated weights for policy 0, policy_version 669028 (0.0031) [2024-04-28 17:24:01,172][57339] Updated weights for policy 0, policy_version 669038 (0.0025) [2024-04-28 17:24:02,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 10961567744. Throughput: 0: 55801.3. Samples: 1451964220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:24:02,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:24:04,547][57339] Updated weights for policy 0, policy_version 669048 (0.0027) [2024-04-28 17:24:06,917][57339] Updated weights for policy 0, policy_version 669058 (0.0026) [2024-04-28 17:24:07,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10961846272. Throughput: 0: 55972.4. Samples: 1452136800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:24:07,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 17:24:07,803][57319] Signal inference workers to stop experience collection... (21700 times) [2024-04-28 17:24:07,842][57339] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-04-28 17:24:07,857][57319] Signal inference workers to resume experience collection... 
(21700 times) [2024-04-28 17:24:07,860][57339] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-04-28 17:24:10,402][57339] Updated weights for policy 0, policy_version 669068 (0.0027) [2024-04-28 17:24:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10962108416. Throughput: 0: 55875.6. Samples: 1452471340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 17:24:12,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 17:24:12,753][57339] Updated weights for policy 0, policy_version 669078 (0.0030) [2024-04-28 17:24:16,312][57339] Updated weights for policy 0, policy_version 669088 (0.0031) [2024-04-28 17:24:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 10962403328. Throughput: 0: 55853.0. Samples: 1452806500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:24:17,169][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 17:24:18,707][57339] Updated weights for policy 0, policy_version 669098 (0.0031) [2024-04-28 17:24:22,086][57339] Updated weights for policy 0, policy_version 669108 (0.0030) [2024-04-28 17:24:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10962665472. Throughput: 0: 55755.5. Samples: 1452968320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:24:22,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 17:24:24,656][57339] Updated weights for policy 0, policy_version 669118 (0.0027) [2024-04-28 17:24:27,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10962944000. Throughput: 0: 55762.8. Samples: 1453302060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:24:27,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 17:24:27,981][57339] Updated weights for policy 0, policy_version 669128 (0.0031) [2024-04-28 17:24:30,413][57339] Updated weights for policy 0, policy_version 669138 (0.0030) [2024-04-28 17:24:32,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10963222528. Throughput: 0: 55720.2. Samples: 1453640660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:24:32,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 17:24:33,852][57339] Updated weights for policy 0, policy_version 669148 (0.0033) [2024-04-28 17:24:36,350][57339] Updated weights for policy 0, policy_version 669158 (0.0038) [2024-04-28 17:24:37,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10963517440. Throughput: 0: 55745.5. Samples: 1453804020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:24:37,178][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 17:24:39,674][57339] Updated weights for policy 0, policy_version 669168 (0.0028) [2024-04-28 17:24:42,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10963795968. Throughput: 0: 55799.2. Samples: 1454141100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:24:42,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 17:24:42,229][57339] Updated weights for policy 0, policy_version 669178 (0.0033) [2024-04-28 17:24:45,565][57339] Updated weights for policy 0, policy_version 669188 (0.0025) [2024-04-28 17:24:47,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 10964090880. Throughput: 0: 55849.3. Samples: 1454477440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:24:47,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 17:24:47,911][57339] Updated weights for policy 0, policy_version 669198 (0.0025) [2024-04-28 17:24:51,419][57339] Updated weights for policy 0, policy_version 669208 (0.0028) [2024-04-28 17:24:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10964353024. Throughput: 0: 55934.1. Samples: 1454653840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:24:52,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 17:24:53,744][57339] Updated weights for policy 0, policy_version 669218 (0.0028) [2024-04-28 17:24:57,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10964615168. Throughput: 0: 56022.3. Samples: 1454992340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:24:57,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 17:24:57,247][57339] Updated weights for policy 0, policy_version 669228 (0.0025) [2024-04-28 17:24:59,558][57339] Updated weights for policy 0, policy_version 669238 (0.0030) [2024-04-28 17:25:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10964893696. Throughput: 0: 55864.2. Samples: 1455320400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:02,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 17:25:03,130][57339] Updated weights for policy 0, policy_version 669248 (0.0035) [2024-04-28 17:25:05,600][57339] Updated weights for policy 0, policy_version 669258 (0.0033) [2024-04-28 17:25:07,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10965172224. Throughput: 0: 55867.0. Samples: 1455482340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:07,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 17:25:09,009][57339] Updated weights for policy 0, policy_version 669268 (0.0037) [2024-04-28 17:25:11,549][57339] Updated weights for policy 0, policy_version 669278 (0.0031) [2024-04-28 17:25:12,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 10965467136. Throughput: 0: 55763.5. Samples: 1455811420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:12,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 17:25:12,729][57319] Signal inference workers to stop experience collection... (21750 times) [2024-04-28 17:25:12,732][57319] Signal inference workers to resume experience collection... (21750 times) [2024-04-28 17:25:12,755][57339] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-04-28 17:25:12,755][57339] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-04-28 17:25:14,827][57339] Updated weights for policy 0, policy_version 669288 (0.0032) [2024-04-28 17:25:17,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 10965762048. Throughput: 0: 55794.4. Samples: 1456151420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:17,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 17:25:17,412][57339] Updated weights for policy 0, policy_version 669298 (0.0026) [2024-04-28 17:25:20,771][57339] Updated weights for policy 0, policy_version 669308 (0.0040) [2024-04-28 17:25:22,169][57108] Fps is (10 sec: 57343.0, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 10966040576. Throughput: 0: 56052.6. Samples: 1456326400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:22,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 17:25:23,359][57339] Updated weights for policy 0, policy_version 669318 (0.0035) [2024-04-28 17:25:26,581][57339] Updated weights for policy 0, policy_version 669328 (0.0026) [2024-04-28 17:25:27,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10966319104. Throughput: 0: 55961.3. Samples: 1456659360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:27,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 17:25:27,184][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000669332_10966335488.pth... [2024-04-28 17:25:27,243][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000668514_10952933376.pth [2024-04-28 17:25:29,200][57339] Updated weights for policy 0, policy_version 669338 (0.0026) [2024-04-28 17:25:32,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 10966564864. Throughput: 0: 55932.5. Samples: 1456994400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:32,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 17:25:32,458][57339] Updated weights for policy 0, policy_version 669348 (0.0026) [2024-04-28 17:25:35,191][57339] Updated weights for policy 0, policy_version 669358 (0.0032) [2024-04-28 17:25:37,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 10966843392. Throughput: 0: 55510.9. Samples: 1457151820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:37,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 17:25:38,225][57339] Updated weights for policy 0, policy_version 669368 (0.0028) [2024-04-28 17:25:40,944][57339] Updated weights for policy 0, policy_version 669378 (0.0029) [2024-04-28 17:25:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55594.5). 
Total num frames: 10967121920. Throughput: 0: 55512.8. Samples: 1457490420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:42,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 17:25:44,083][57339] Updated weights for policy 0, policy_version 669388 (0.0028) [2024-04-28 17:25:46,875][57339] Updated weights for policy 0, policy_version 669398 (0.0032) [2024-04-28 17:25:47,169][57108] Fps is (10 sec: 57342.4, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10967416832. Throughput: 0: 55618.1. Samples: 1457823220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:47,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 17:25:49,940][57339] Updated weights for policy 0, policy_version 669408 (0.0040) [2024-04-28 17:25:52,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 10967711744. Throughput: 0: 55892.7. Samples: 1457997500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-04-28 17:25:52,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 17:25:52,863][57339] Updated weights for policy 0, policy_version 669418 (0.0033) [2024-04-28 17:25:55,816][57339] Updated weights for policy 0, policy_version 669428 (0.0036) [2024-04-28 17:25:57,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 10967990272. Throughput: 0: 55969.7. Samples: 1458330060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:25:57,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 17:25:58,874][57339] Updated weights for policy 0, policy_version 669438 (0.0026) [2024-04-28 17:26:01,681][57339] Updated weights for policy 0, policy_version 669448 (0.0028) [2024-04-28 17:26:01,822][57319] Signal inference workers to stop experience collection... (21800 times) [2024-04-28 17:26:01,823][57319] Signal inference workers to resume experience collection... 
(21800 times) [2024-04-28 17:26:01,846][57339] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-04-28 17:26:01,847][57339] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-04-28 17:26:02,169][57108] Fps is (10 sec: 55703.5, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 10968268800. Throughput: 0: 55818.0. Samples: 1458663240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:02,170][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 17:26:04,571][57339] Updated weights for policy 0, policy_version 669458 (0.0026) [2024-04-28 17:26:07,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10968530944. Throughput: 0: 55631.4. Samples: 1458829800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:07,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 17:26:07,392][57339] Updated weights for policy 0, policy_version 669468 (0.0032) [2024-04-28 17:26:10,444][57339] Updated weights for policy 0, policy_version 669478 (0.0026) [2024-04-28 17:26:12,169][57108] Fps is (10 sec: 52430.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10968793088. Throughput: 0: 55747.1. Samples: 1459167980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:12,178][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 17:26:13,236][57339] Updated weights for policy 0, policy_version 669488 (0.0028) [2024-04-28 17:26:16,386][57339] Updated weights for policy 0, policy_version 669498 (0.0031) [2024-04-28 17:26:17,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 10969071616. Throughput: 0: 55787.1. Samples: 1459504820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:17,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 17:26:19,083][57339] Updated weights for policy 0, policy_version 669508 (0.0031) [2024-04-28 17:26:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 10969366528. Throughput: 0: 55808.6. Samples: 1459663220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:22,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 17:26:22,300][57339] Updated weights for policy 0, policy_version 669518 (0.0034) [2024-04-28 17:26:24,857][57339] Updated weights for policy 0, policy_version 669528 (0.0030) [2024-04-28 17:26:27,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10969661440. Throughput: 0: 55719.1. Samples: 1459997780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:27,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 17:26:28,019][57339] Updated weights for policy 0, policy_version 669538 (0.0027) [2024-04-28 17:26:30,687][57339] Updated weights for policy 0, policy_version 669548 (0.0039) [2024-04-28 17:26:32,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 10969939968. Throughput: 0: 55862.9. Samples: 1460337040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:32,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 17:26:33,702][57339] Updated weights for policy 0, policy_version 669558 (0.0026) [2024-04-28 17:26:36,521][57339] Updated weights for policy 0, policy_version 669568 (0.0031) [2024-04-28 17:26:37,169][57108] Fps is (10 sec: 58981.7, 60 sec: 56797.6, 300 sec: 55983.3). Total num frames: 10970251264. Throughput: 0: 56018.8. Samples: 1460518360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:37,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 17:26:40,063][57339] Updated weights for policy 0, policy_version 669578 (0.0027) [2024-04-28 17:26:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 10970497024. Throughput: 0: 56095.7. Samples: 1460854360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:42,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:26:42,349][57339] Updated weights for policy 0, policy_version 669588 (0.0029) [2024-04-28 17:26:45,874][57339] Updated weights for policy 0, policy_version 669598 (0.0035) [2024-04-28 17:26:47,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10970759168. Throughput: 0: 56121.0. Samples: 1461188680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:47,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:26:48,207][57339] Updated weights for policy 0, policy_version 669608 (0.0027) [2024-04-28 17:26:51,624][57339] Updated weights for policy 0, policy_version 669618 (0.0032) [2024-04-28 17:26:52,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10971021312. Throughput: 0: 55987.1. Samples: 1461349220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:52,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:26:54,051][57339] Updated weights for policy 0, policy_version 669628 (0.0028) [2024-04-28 17:26:57,169][57108] Fps is (10 sec: 57345.4, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 10971332608. Throughput: 0: 55892.2. Samples: 1461683120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:26:57,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 17:26:57,455][57339] Updated weights for policy 0, policy_version 669638 (0.0030) [2024-04-28 17:26:59,945][57339] Updated weights for policy 0, policy_version 669648 (0.0035) [2024-04-28 17:27:02,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55705.9, 300 sec: 55816.7). Total num frames: 10971611136. Throughput: 0: 55856.1. Samples: 1462018340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:27:02,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 17:27:03,401][57339] Updated weights for policy 0, policy_version 669658 (0.0027) [2024-04-28 17:27:05,650][57319] Signal inference workers to stop experience collection... (21850 times) [2024-04-28 17:27:05,687][57339] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-04-28 17:27:05,715][57319] Signal inference workers to resume experience collection... (21850 times) [2024-04-28 17:27:05,718][57339] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-04-28 17:27:05,830][57339] Updated weights for policy 0, policy_version 669668 (0.0029) [2024-04-28 17:27:07,169][57108] Fps is (10 sec: 55704.2, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10971889664. Throughput: 0: 56185.7. Samples: 1462191580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:27:07,170][57108] Avg episode reward: [(0, '0.514')] [2024-04-28 17:27:09,644][57339] Updated weights for policy 0, policy_version 669678 (0.0033) [2024-04-28 17:27:11,695][57339] Updated weights for policy 0, policy_version 669688 (0.0033) [2024-04-28 17:27:12,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 10972184576. Throughput: 0: 56146.2. Samples: 1462524360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:27:12,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 17:27:15,514][57339] Updated weights for policy 0, policy_version 669698 (0.0030) [2024-04-28 17:27:17,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 10972463104. Throughput: 0: 55967.0. Samples: 1462855560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:27:17,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:27:17,551][57339] Updated weights for policy 0, policy_version 669708 (0.0027) [2024-04-28 17:27:21,347][57339] Updated weights for policy 0, policy_version 669718 (0.0029) [2024-04-28 17:27:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 10972725248. Throughput: 0: 55738.3. Samples: 1463026580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:27:22,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 17:27:23,266][57339] Updated weights for policy 0, policy_version 669728 (0.0028) [2024-04-28 17:27:27,089][57339] Updated weights for policy 0, policy_version 669738 (0.0026) [2024-04-28 17:27:27,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 10972987392. Throughput: 0: 55699.0. Samples: 1463360820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 17:27:27,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 17:27:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000669738_10972987392.pth... [2024-04-28 17:27:27,239][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000668923_10959634432.pth [2024-04-28 17:27:29,617][57339] Updated weights for policy 0, policy_version 669748 (0.0035) [2024-04-28 17:27:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10973282304. Throughput: 0: 55739.7. Samples: 1463696960. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:27:32,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 17:27:32,989][57339] Updated weights for policy 0, policy_version 669758 (0.0025) [2024-04-28 17:27:35,336][57339] Updated weights for policy 0, policy_version 669768 (0.0026) [2024-04-28 17:27:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 10973560832. Throughput: 0: 55699.4. Samples: 1463855700. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:27:37,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 17:27:38,778][57339] Updated weights for policy 0, policy_version 669778 (0.0026) [2024-04-28 17:27:41,073][57339] Updated weights for policy 0, policy_version 669788 (0.0027) [2024-04-28 17:27:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10973839360. Throughput: 0: 55762.5. Samples: 1464192440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:27:42,169][57108] Avg episode reward: [(0, '0.504')] [2024-04-28 17:27:44,481][57339] Updated weights for policy 0, policy_version 669798 (0.0030) [2024-04-28 17:27:47,070][57339] Updated weights for policy 0, policy_version 669808 (0.0026) [2024-04-28 17:27:47,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 10974134272. Throughput: 0: 55743.6. Samples: 1464526800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:27:47,169][57108] Avg episode reward: [(0, '0.697')] [2024-04-28 17:27:50,369][57339] Updated weights for policy 0, policy_version 669818 (0.0031) [2024-04-28 17:27:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56524.7, 300 sec: 55927.8). Total num frames: 10974412800. Throughput: 0: 55826.4. Samples: 1464703760. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:27:52,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 17:27:53,018][57339] Updated weights for policy 0, policy_version 669828 (0.0032) [2024-04-28 17:27:56,272][57339] Updated weights for policy 0, policy_version 669838 (0.0030) [2024-04-28 17:27:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10974674944. Throughput: 0: 55781.5. Samples: 1465034520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:27:57,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 17:27:58,826][57339] Updated weights for policy 0, policy_version 669848 (0.0031) [2024-04-28 17:28:01,938][57339] Updated weights for policy 0, policy_version 669858 (0.0026) [2024-04-28 17:28:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10974953472. Throughput: 0: 55855.2. Samples: 1465369040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:02,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 17:28:04,636][57339] Updated weights for policy 0, policy_version 669868 (0.0033) [2024-04-28 17:28:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 10975232000. Throughput: 0: 55597.5. Samples: 1465528460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:07,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:28:07,462][57319] Signal inference workers to stop experience collection... (21900 times) [2024-04-28 17:28:07,463][57319] Signal inference workers to resume experience collection... 
(21900 times) [2024-04-28 17:28:07,482][57339] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-04-28 17:28:07,483][57339] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-04-28 17:28:07,708][57339] Updated weights for policy 0, policy_version 669878 (0.0031) [2024-04-28 17:28:10,579][57339] Updated weights for policy 0, policy_version 669888 (0.0030) [2024-04-28 17:28:12,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55159.4, 300 sec: 55816.6). Total num frames: 10975494144. Throughput: 0: 55550.6. Samples: 1465860600. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:12,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 17:28:13,785][57339] Updated weights for policy 0, policy_version 669898 (0.0034) [2024-04-28 17:28:16,668][57339] Updated weights for policy 0, policy_version 669908 (0.0032) [2024-04-28 17:28:17,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 10975772672. Throughput: 0: 55566.7. Samples: 1466197460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:17,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 17:28:19,863][57339] Updated weights for policy 0, policy_version 669918 (0.0028) [2024-04-28 17:28:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 10976067584. Throughput: 0: 55734.3. Samples: 1466363740. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:22,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 17:28:22,484][57339] Updated weights for policy 0, policy_version 669928 (0.0047) [2024-04-28 17:28:25,779][57339] Updated weights for policy 0, policy_version 669938 (0.0027) [2024-04-28 17:28:27,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 10976362496. Throughput: 0: 55733.7. Samples: 1466700460. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:27,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 17:28:28,442][57339] Updated weights for policy 0, policy_version 669948 (0.0029) [2024-04-28 17:28:31,494][57339] Updated weights for policy 0, policy_version 669958 (0.0027) [2024-04-28 17:28:32,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 10976624640. Throughput: 0: 55581.2. Samples: 1467027960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:32,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 17:28:34,334][57339] Updated weights for policy 0, policy_version 669968 (0.0030) [2024-04-28 17:28:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 10976903168. Throughput: 0: 55407.1. Samples: 1467197080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:37,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 17:28:37,490][57339] Updated weights for policy 0, policy_version 669978 (0.0028) [2024-04-28 17:28:40,633][57339] Updated weights for policy 0, policy_version 669988 (0.0029) [2024-04-28 17:28:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10977181696. Throughput: 0: 55650.1. Samples: 1467538780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:42,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 17:28:43,189][57339] Updated weights for policy 0, policy_version 669998 (0.0030) [2024-04-28 17:28:46,525][57339] Updated weights for policy 0, policy_version 670008 (0.0028) [2024-04-28 17:28:47,169][57108] Fps is (10 sec: 52428.7, 60 sec: 54886.3, 300 sec: 55705.6). Total num frames: 10977427456. Throughput: 0: 55660.3. Samples: 1467873760. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:47,169][57108] Avg episode reward: [(0, '0.512')] [2024-04-28 17:28:49,082][57339] Updated weights for policy 0, policy_version 670018 (0.0028) [2024-04-28 17:28:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 10977722368. Throughput: 0: 55737.3. Samples: 1468036640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:52,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:28:52,410][57339] Updated weights for policy 0, policy_version 670028 (0.0025) [2024-04-28 17:28:54,996][57339] Updated weights for policy 0, policy_version 670038 (0.0030) [2024-04-28 17:28:57,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10978017280. Throughput: 0: 55628.6. Samples: 1468363880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:28:57,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 17:28:58,182][57339] Updated weights for policy 0, policy_version 670048 (0.0032) [2024-04-28 17:29:00,991][57339] Updated weights for policy 0, policy_version 670058 (0.0027) [2024-04-28 17:29:02,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 10978312192. Throughput: 0: 55621.3. Samples: 1468700420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:29:02,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 17:29:04,198][57339] Updated weights for policy 0, policy_version 670068 (0.0032) [2024-04-28 17:29:06,782][57339] Updated weights for policy 0, policy_version 670078 (0.0032) [2024-04-28 17:29:07,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 10978557952. Throughput: 0: 55806.3. Samples: 1468875020. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-04-28 17:29:07,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 17:29:10,150][57339] Updated weights for policy 0, policy_version 670088 (0.0029) [2024-04-28 17:29:11,126][57319] Signal inference workers to stop experience collection... (21950 times) [2024-04-28 17:29:11,127][57319] Signal inference workers to resume experience collection... (21950 times) [2024-04-28 17:29:11,167][57339] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-04-28 17:29:11,167][57339] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-04-28 17:29:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 10978852864. Throughput: 0: 55789.4. Samples: 1469210980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:12,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 17:29:12,504][57339] Updated weights for policy 0, policy_version 670098 (0.0025) [2024-04-28 17:29:16,202][57339] Updated weights for policy 0, policy_version 670108 (0.0035) [2024-04-28 17:29:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10979131392. Throughput: 0: 55879.2. Samples: 1469542520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:17,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 17:29:18,251][57339] Updated weights for policy 0, policy_version 670118 (0.0028) [2024-04-28 17:29:21,939][57339] Updated weights for policy 0, policy_version 670128 (0.0035) [2024-04-28 17:29:22,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 10979377152. Throughput: 0: 55680.5. Samples: 1469702700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:22,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 17:29:24,426][57339] Updated weights for policy 0, policy_version 670138 (0.0027) [2024-04-28 17:29:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 10979672064. Throughput: 0: 55563.5. Samples: 1470039140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:27,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 17:29:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000670146_10979672064.pth... [2024-04-28 17:29:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000669332_10966335488.pth [2024-04-28 17:29:27,686][57339] Updated weights for policy 0, policy_version 670148 (0.0031) [2024-04-28 17:29:30,236][57339] Updated weights for policy 0, policy_version 670158 (0.0028) [2024-04-28 17:29:32,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 10979966976. Throughput: 0: 55430.4. Samples: 1470368120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:32,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 17:29:33,785][57339] Updated weights for policy 0, policy_version 670168 (0.0030) [2024-04-28 17:29:36,138][57339] Updated weights for policy 0, policy_version 670178 (0.0024) [2024-04-28 17:29:37,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 10980245504. Throughput: 0: 55617.8. Samples: 1470539440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:37,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 17:29:39,676][57339] Updated weights for policy 0, policy_version 670188 (0.0026) [2024-04-28 17:29:42,047][57339] Updated weights for policy 0, policy_version 670198 (0.0033) [2024-04-28 17:29:42,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55705.6, 300 sec: 55705.6). 
Total num frames: 10980524032. Throughput: 0: 55706.1. Samples: 1470870660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:42,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 17:29:45,400][57339] Updated weights for policy 0, policy_version 670208 (0.0025) [2024-04-28 17:29:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 10980802560. Throughput: 0: 55628.1. Samples: 1471203680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:47,169][57108] Avg episode reward: [(0, '0.701')] [2024-04-28 17:29:48,079][57339] Updated weights for policy 0, policy_version 670218 (0.0030) [2024-04-28 17:29:51,151][57339] Updated weights for policy 0, policy_version 670228 (0.0024) [2024-04-28 17:29:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 10981081088. Throughput: 0: 55454.6. Samples: 1471370480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:52,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 17:29:54,017][57339] Updated weights for policy 0, policy_version 670238 (0.0031) [2024-04-28 17:29:57,133][57339] Updated weights for policy 0, policy_version 670248 (0.0031) [2024-04-28 17:29:57,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 10981343232. Throughput: 0: 55492.9. Samples: 1471708160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:29:57,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 17:29:59,807][57339] Updated weights for policy 0, policy_version 670258 (0.0034) [2024-04-28 17:30:02,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 10981621760. Throughput: 0: 55590.6. Samples: 1472044100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:02,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 17:30:03,029][57339] Updated weights for policy 0, policy_version 670268 (0.0027) [2024-04-28 17:30:05,588][57339] Updated weights for policy 0, policy_version 670278 (0.0030) [2024-04-28 17:30:07,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 10981933056. Throughput: 0: 55591.0. Samples: 1472204300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:07,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 17:30:08,829][57339] Updated weights for policy 0, policy_version 670288 (0.0030) [2024-04-28 17:30:11,393][57339] Updated weights for policy 0, policy_version 670298 (0.0027) [2024-04-28 17:30:11,402][57319] Signal inference workers to stop experience collection... (22000 times) [2024-04-28 17:30:11,402][57319] Signal inference workers to resume experience collection... (22000 times) [2024-04-28 17:30:11,431][57339] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-04-28 17:30:11,431][57339] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-04-28 17:30:12,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10982195200. Throughput: 0: 55718.3. Samples: 1472546460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:12,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 17:30:14,569][57339] Updated weights for policy 0, policy_version 670308 (0.0032) [2024-04-28 17:30:17,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10982457344. Throughput: 0: 55907.5. Samples: 1472883960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:17,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 17:30:17,360][57339] Updated weights for policy 0, policy_version 670318 (0.0033) [2024-04-28 17:30:20,372][57339] Updated weights for policy 0, policy_version 670328 (0.0031) [2024-04-28 17:30:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 55761.2). Total num frames: 10982768640. Throughput: 0: 55883.2. Samples: 1473054180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:22,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 17:30:23,181][57339] Updated weights for policy 0, policy_version 670338 (0.0024) [2024-04-28 17:30:26,229][57339] Updated weights for policy 0, policy_version 670348 (0.0030) [2024-04-28 17:30:27,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 10983030784. Throughput: 0: 55958.7. Samples: 1473388800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:27,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:30:28,949][57339] Updated weights for policy 0, policy_version 670358 (0.0040) [2024-04-28 17:30:32,045][57339] Updated weights for policy 0, policy_version 670368 (0.0027) [2024-04-28 17:30:32,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10983309312. Throughput: 0: 55894.3. Samples: 1473718920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:32,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:30:34,887][57339] Updated weights for policy 0, policy_version 670378 (0.0030) [2024-04-28 17:30:37,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 10983587840. Throughput: 0: 55977.6. Samples: 1473889480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:37,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 17:30:37,822][57339] Updated weights for policy 0, policy_version 670388 (0.0036) [2024-04-28 17:30:40,858][57339] Updated weights for policy 0, policy_version 670398 (0.0029) [2024-04-28 17:30:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 10983882752. Throughput: 0: 55891.7. Samples: 1474223280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:42,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 17:30:43,817][57339] Updated weights for policy 0, policy_version 670408 (0.0034) [2024-04-28 17:30:46,624][57339] Updated weights for policy 0, policy_version 670418 (0.0032) [2024-04-28 17:30:47,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10984144896. Throughput: 0: 55765.8. Samples: 1474553560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:30:47,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 17:30:49,721][57339] Updated weights for policy 0, policy_version 670428 (0.0032) [2024-04-28 17:30:52,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 10984423424. Throughput: 0: 55939.1. Samples: 1474721560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:30:52,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 17:30:52,493][57339] Updated weights for policy 0, policy_version 670438 (0.0027) [2024-04-28 17:30:55,639][57339] Updated weights for policy 0, policy_version 670448 (0.0033) [2024-04-28 17:30:57,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 10984701952. Throughput: 0: 55749.3. Samples: 1475055180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:30:57,170][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 17:30:58,427][57339] Updated weights for policy 0, policy_version 670458 (0.0030) [2024-04-28 17:31:01,610][57339] Updated weights for policy 0, policy_version 670468 (0.0032) [2024-04-28 17:31:02,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 10984980480. Throughput: 0: 55620.0. Samples: 1475386860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:02,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 17:31:04,375][57339] Updated weights for policy 0, policy_version 670478 (0.0031) [2024-04-28 17:31:07,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 10985242624. Throughput: 0: 55526.6. Samples: 1475552880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:07,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 17:31:07,382][57339] Updated weights for policy 0, policy_version 670488 (0.0027) [2024-04-28 17:31:10,232][57339] Updated weights for policy 0, policy_version 670498 (0.0028) [2024-04-28 17:31:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 10985537536. Throughput: 0: 55493.4. Samples: 1475886000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:12,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 17:31:13,250][57339] Updated weights for policy 0, policy_version 670508 (0.0033) [2024-04-28 17:31:16,368][57339] Updated weights for policy 0, policy_version 670518 (0.0028) [2024-04-28 17:31:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 10985783296. Throughput: 0: 55505.3. Samples: 1476216660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:17,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 17:31:19,055][57339] Updated weights for policy 0, policy_version 670528 (0.0025) [2024-04-28 17:31:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 10986078208. Throughput: 0: 55337.5. Samples: 1476379660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:22,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 17:31:22,342][57339] Updated weights for policy 0, policy_version 670538 (0.0030) [2024-04-28 17:31:24,994][57339] Updated weights for policy 0, policy_version 670548 (0.0028) [2024-04-28 17:31:25,581][57319] Signal inference workers to stop experience collection... (22050 times) [2024-04-28 17:31:25,582][57319] Signal inference workers to resume experience collection... (22050 times) [2024-04-28 17:31:25,612][57339] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-04-28 17:31:25,612][57339] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-04-28 17:31:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10986340352. Throughput: 0: 55312.7. Samples: 1476712360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:27,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 17:31:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000670554_10986356736.pth... [2024-04-28 17:31:27,219][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000669738_10972987392.pth [2024-04-28 17:31:28,062][57339] Updated weights for policy 0, policy_version 670558 (0.0034) [2024-04-28 17:31:30,951][57339] Updated weights for policy 0, policy_version 670568 (0.0032) [2024-04-28 17:31:32,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10986651648. Throughput: 0: 55292.9. 
Samples: 1477041740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 17:31:33,837][57339] Updated weights for policy 0, policy_version 670578 (0.0028) [2024-04-28 17:31:36,794][57339] Updated weights for policy 0, policy_version 670588 (0.0031) [2024-04-28 17:31:37,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10986930176. Throughput: 0: 55548.0. Samples: 1477221220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:37,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:31:39,868][57339] Updated weights for policy 0, policy_version 670598 (0.0028) [2024-04-28 17:31:42,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 10987192320. Throughput: 0: 55431.6. Samples: 1477549600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:42,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 17:31:42,799][57339] Updated weights for policy 0, policy_version 670608 (0.0031) [2024-04-28 17:31:45,888][57339] Updated weights for policy 0, policy_version 670618 (0.0027) [2024-04-28 17:31:47,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 10987454464. Throughput: 0: 55454.0. Samples: 1477882300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:47,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 17:31:48,734][57339] Updated weights for policy 0, policy_version 670628 (0.0031) [2024-04-28 17:31:51,924][57339] Updated weights for policy 0, policy_version 670638 (0.0026) [2024-04-28 17:31:52,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10987732992. Throughput: 0: 55241.9. Samples: 1478038760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:52,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 17:31:54,500][57339] Updated weights for policy 0, policy_version 670648 (0.0027) [2024-04-28 17:31:57,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10988027904. Throughput: 0: 55322.7. Samples: 1478375520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:31:57,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 17:31:57,924][57339] Updated weights for policy 0, policy_version 670658 (0.0027) [2024-04-28 17:32:00,299][57339] Updated weights for policy 0, policy_version 670668 (0.0025) [2024-04-28 17:32:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 10988290048. Throughput: 0: 55386.2. Samples: 1478709040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:32:02,170][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 17:32:03,900][57339] Updated weights for policy 0, policy_version 670678 (0.0026) [2024-04-28 17:32:06,197][57339] Updated weights for policy 0, policy_version 670688 (0.0030) [2024-04-28 17:32:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 10988601344. Throughput: 0: 55565.3. Samples: 1478880100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:32:07,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 17:32:09,705][57339] Updated weights for policy 0, policy_version 670698 (0.0030) [2024-04-28 17:32:12,029][57339] Updated weights for policy 0, policy_version 670708 (0.0028) [2024-04-28 17:32:12,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 10988879872. Throughput: 0: 55677.3. Samples: 1479217840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:32:12,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 17:32:15,522][57339] Updated weights for policy 0, policy_version 670718 (0.0029) [2024-04-28 17:32:17,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10989125632. Throughput: 0: 55736.8. Samples: 1479549900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:32:17,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 17:32:17,870][57339] Updated weights for policy 0, policy_version 670728 (0.0030) [2024-04-28 17:32:21,518][57339] Updated weights for policy 0, policy_version 670738 (0.0033) [2024-04-28 17:32:22,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 10989404160. Throughput: 0: 55504.1. Samples: 1479718900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-04-28 17:32:22,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 17:32:24,013][57339] Updated weights for policy 0, policy_version 670748 (0.0032) [2024-04-28 17:32:27,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10989682688. Throughput: 0: 55528.1. Samples: 1480048360. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:32:27,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 17:32:27,389][57339] Updated weights for policy 0, policy_version 670758 (0.0023) [2024-04-28 17:32:29,862][57339] Updated weights for policy 0, policy_version 670768 (0.0029) [2024-04-28 17:32:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 10989944832. Throughput: 0: 55484.1. Samples: 1480379080. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:32:32,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 17:32:32,276][57319] Signal inference workers to stop experience collection... 
(22100 times) [2024-04-28 17:32:32,309][57339] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-04-28 17:32:32,337][57319] Signal inference workers to resume experience collection... (22100 times) [2024-04-28 17:32:32,337][57339] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-04-28 17:32:33,210][57339] Updated weights for policy 0, policy_version 670778 (0.0028) [2024-04-28 17:32:35,825][57339] Updated weights for policy 0, policy_version 670788 (0.0024) [2024-04-28 17:32:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 10990239744. Throughput: 0: 55616.9. Samples: 1480541520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:32:37,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 17:32:39,086][57339] Updated weights for policy 0, policy_version 670798 (0.0029) [2024-04-28 17:32:41,825][57339] Updated weights for policy 0, policy_version 670808 (0.0029) [2024-04-28 17:32:42,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10990534656. Throughput: 0: 55503.5. Samples: 1480873180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:32:42,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 17:32:45,142][57339] Updated weights for policy 0, policy_version 670818 (0.0026) [2024-04-28 17:32:47,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 10990813184. Throughput: 0: 55547.2. Samples: 1481208660. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:32:47,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 17:32:47,616][57339] Updated weights for policy 0, policy_version 670828 (0.0030) [2024-04-28 17:32:51,025][57339] Updated weights for policy 0, policy_version 670838 (0.0027) [2024-04-28 17:32:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 10991075328. Throughput: 0: 55450.7. 
Samples: 1481375380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:32:52,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 17:32:53,487][57339] Updated weights for policy 0, policy_version 670848 (0.0027) [2024-04-28 17:32:56,735][57339] Updated weights for policy 0, policy_version 670858 (0.0028) [2024-04-28 17:32:57,169][57108] Fps is (10 sec: 52427.8, 60 sec: 55159.3, 300 sec: 55538.9). Total num frames: 10991337472. Throughput: 0: 55428.3. Samples: 1481712120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:32:57,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 17:32:59,342][57339] Updated weights for policy 0, policy_version 670868 (0.0034) [2024-04-28 17:33:02,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10991616000. Throughput: 0: 55366.7. Samples: 1482041400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:02,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 17:33:02,747][57339] Updated weights for policy 0, policy_version 670878 (0.0035) [2024-04-28 17:33:05,384][57339] Updated weights for policy 0, policy_version 670888 (0.0027) [2024-04-28 17:33:07,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 10991910912. Throughput: 0: 55254.2. Samples: 1482205340. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:07,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 17:33:08,772][57339] Updated weights for policy 0, policy_version 670898 (0.0026) [2024-04-28 17:33:11,324][57339] Updated weights for policy 0, policy_version 670908 (0.0030) [2024-04-28 17:33:12,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 10992173056. Throughput: 0: 55344.9. Samples: 1482538880. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:12,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 17:33:14,585][57339] Updated weights for policy 0, policy_version 670918 (0.0028) [2024-04-28 17:33:17,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10992467968. Throughput: 0: 55413.2. Samples: 1482872680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:17,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 17:33:17,314][57339] Updated weights for policy 0, policy_version 670928 (0.0026) [2024-04-28 17:33:20,358][57339] Updated weights for policy 0, policy_version 670938 (0.0026) [2024-04-28 17:33:22,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10992746496. Throughput: 0: 55478.6. Samples: 1483038060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:22,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 17:33:23,094][57339] Updated weights for policy 0, policy_version 670948 (0.0026) [2024-04-28 17:33:26,300][57339] Updated weights for policy 0, policy_version 670958 (0.0032) [2024-04-28 17:33:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10993025024. Throughput: 0: 55569.2. Samples: 1483373800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:27,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 17:33:27,265][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000670962_10993041408.pth... [2024-04-28 17:33:27,307][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000670146_10979672064.pth [2024-04-28 17:33:28,961][57339] Updated weights for policy 0, policy_version 670968 (0.0029) [2024-04-28 17:33:31,771][57319] Signal inference workers to stop experience collection... 
(22150 times) [2024-04-28 17:33:31,771][57319] Signal inference workers to resume experience collection... (22150 times) [2024-04-28 17:33:31,781][57339] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-04-28 17:33:31,782][57339] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-04-28 17:33:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 10993287168. Throughput: 0: 55576.9. Samples: 1483709620. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:32,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 17:33:32,309][57339] Updated weights for policy 0, policy_version 670978 (0.0026) [2024-04-28 17:33:34,946][57339] Updated weights for policy 0, policy_version 670988 (0.0026) [2024-04-28 17:33:37,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 10993565696. Throughput: 0: 55414.1. Samples: 1483869020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:37,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 17:33:38,240][57339] Updated weights for policy 0, policy_version 670998 (0.0030) [2024-04-28 17:33:40,693][57339] Updated weights for policy 0, policy_version 671008 (0.0033) [2024-04-28 17:33:42,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 10993827840. Throughput: 0: 55234.8. Samples: 1484197680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:42,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 17:33:44,077][57339] Updated weights for policy 0, policy_version 671018 (0.0029) [2024-04-28 17:33:46,530][57339] Updated weights for policy 0, policy_version 671028 (0.0029) [2024-04-28 17:33:47,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 10994139136. Throughput: 0: 55416.4. Samples: 1484535140. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:47,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 17:33:49,873][57339] Updated weights for policy 0, policy_version 671038 (0.0028) [2024-04-28 17:33:52,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 10994417664. Throughput: 0: 55751.0. Samples: 1484714140. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:52,170][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 17:33:52,722][57339] Updated weights for policy 0, policy_version 671048 (0.0033) [2024-04-28 17:33:55,613][57339] Updated weights for policy 0, policy_version 671058 (0.0032) [2024-04-28 17:33:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 10994696192. Throughput: 0: 55687.1. Samples: 1485044800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:33:57,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 17:33:58,566][57339] Updated weights for policy 0, policy_version 671068 (0.0036) [2024-04-28 17:34:01,556][57339] Updated weights for policy 0, policy_version 671078 (0.0024) [2024-04-28 17:34:02,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 10994991104. Throughput: 0: 55679.6. Samples: 1485378260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 22.0) [2024-04-28 17:34:02,169][57108] Avg episode reward: [(0, '0.753')] [2024-04-28 17:34:04,467][57339] Updated weights for policy 0, policy_version 671088 (0.0033) [2024-04-28 17:34:07,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10995236864. Throughput: 0: 55754.3. Samples: 1485547000. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:07,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 17:34:07,334][57339] Updated weights for policy 0, policy_version 671098 (0.0027) [2024-04-28 17:34:10,224][57339] Updated weights for policy 0, policy_version 671108 (0.0028) [2024-04-28 17:34:12,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 10995515392. Throughput: 0: 55744.9. Samples: 1485882320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:12,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 17:34:13,203][57339] Updated weights for policy 0, policy_version 671118 (0.0027) [2024-04-28 17:34:16,186][57339] Updated weights for policy 0, policy_version 671128 (0.0030) [2024-04-28 17:34:17,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 10995810304. Throughput: 0: 55789.7. Samples: 1486220160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:17,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 17:34:19,020][57339] Updated weights for policy 0, policy_version 671138 (0.0025) [2024-04-28 17:34:22,047][57339] Updated weights for policy 0, policy_version 671148 (0.0039) [2024-04-28 17:34:22,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 10996088832. Throughput: 0: 55912.9. Samples: 1486385100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:22,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 17:34:24,933][57339] Updated weights for policy 0, policy_version 671158 (0.0031) [2024-04-28 17:34:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55650.0). Total num frames: 10996383744. Throughput: 0: 55956.9. Samples: 1486715740. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:27,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 17:34:27,810][57339] Updated weights for policy 0, policy_version 671168 (0.0025) [2024-04-28 17:34:30,794][57339] Updated weights for policy 0, policy_version 671178 (0.0028) [2024-04-28 17:34:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10996645888. Throughput: 0: 55960.5. Samples: 1487053360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:32,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:34:33,526][57339] Updated weights for policy 0, policy_version 671188 (0.0034) [2024-04-28 17:34:36,749][57339] Updated weights for policy 0, policy_version 671198 (0.0030) [2024-04-28 17:34:37,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 10996924416. Throughput: 0: 55701.5. Samples: 1487220700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:37,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 17:34:40,120][57339] Updated weights for policy 0, policy_version 671208 (0.0027) [2024-04-28 17:34:41,039][57319] Signal inference workers to stop experience collection... (22200 times) [2024-04-28 17:34:41,044][57319] Signal inference workers to resume experience collection... (22200 times) [2024-04-28 17:34:41,055][57339] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-04-28 17:34:41,080][57339] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-04-28 17:34:42,169][57108] Fps is (10 sec: 55704.6, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 10997202944. Throughput: 0: 55651.4. Samples: 1487549120. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:42,170][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 17:34:42,624][57339] Updated weights for policy 0, policy_version 671218 (0.0031) [2024-04-28 17:34:45,821][57339] Updated weights for policy 0, policy_version 671228 (0.0036) [2024-04-28 17:34:47,169][57108] Fps is (10 sec: 55704.1, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 10997481472. Throughput: 0: 55893.6. Samples: 1487893480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:47,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 17:34:48,397][57339] Updated weights for policy 0, policy_version 671238 (0.0028) [2024-04-28 17:34:51,874][57339] Updated weights for policy 0, policy_version 671248 (0.0029) [2024-04-28 17:34:52,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 10997743616. Throughput: 0: 55703.4. Samples: 1488053660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:52,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 17:34:54,196][57339] Updated weights for policy 0, policy_version 671258 (0.0031) [2024-04-28 17:34:57,169][57108] Fps is (10 sec: 54068.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 10998022144. Throughput: 0: 55628.2. Samples: 1488385580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:34:57,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 17:34:57,838][57339] Updated weights for policy 0, policy_version 671268 (0.0027) [2024-04-28 17:35:00,369][57339] Updated weights for policy 0, policy_version 671278 (0.0030) [2024-04-28 17:35:02,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10998317056. Throughput: 0: 55469.3. Samples: 1488716280. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:35:02,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 17:35:03,604][57339] Updated weights for policy 0, policy_version 671288 (0.0030) [2024-04-28 17:35:06,395][57339] Updated weights for policy 0, policy_version 671298 (0.0030) [2024-04-28 17:35:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 10998595584. Throughput: 0: 55642.7. Samples: 1488889020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:35:07,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 17:35:09,473][57339] Updated weights for policy 0, policy_version 671308 (0.0027) [2024-04-28 17:35:12,144][57339] Updated weights for policy 0, policy_version 671318 (0.0030) [2024-04-28 17:35:12,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 10998874112. Throughput: 0: 55745.4. Samples: 1489224280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:35:12,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:35:15,432][57339] Updated weights for policy 0, policy_version 671328 (0.0037) [2024-04-28 17:35:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 10999152640. Throughput: 0: 55623.5. Samples: 1489556420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:35:17,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 17:35:18,008][57339] Updated weights for policy 0, policy_version 671338 (0.0029) [2024-04-28 17:35:21,155][57339] Updated weights for policy 0, policy_version 671348 (0.0030) [2024-04-28 17:35:22,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 10999431168. Throughput: 0: 55537.7. Samples: 1489719900. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:35:22,178][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 17:35:24,009][57339] Updated weights for policy 0, policy_version 671358 (0.0028) [2024-04-28 17:35:27,073][57339] Updated weights for policy 0, policy_version 671368 (0.0032) [2024-04-28 17:35:27,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 10999693312. Throughput: 0: 55687.7. Samples: 1490055060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:35:27,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 17:35:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000671368_10999693312.pth... [2024-04-28 17:35:27,236][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000670554_10986356736.pth [2024-04-28 17:35:29,847][57339] Updated weights for policy 0, policy_version 671378 (0.0031) [2024-04-28 17:35:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 10999971840. Throughput: 0: 55450.1. Samples: 1490388720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:35:32,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 17:35:33,035][57339] Updated weights for policy 0, policy_version 671388 (0.0026) [2024-04-28 17:35:35,556][57339] Updated weights for policy 0, policy_version 671398 (0.0031) [2024-04-28 17:35:37,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 11000266752. Throughput: 0: 55632.5. Samples: 1490557120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:35:37,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 17:35:38,819][57339] Updated weights for policy 0, policy_version 671408 (0.0025) [2024-04-28 17:35:41,577][57339] Updated weights for policy 0, policy_version 671418 (0.0028) [2024-04-28 17:35:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.7, 300 sec: 55539.0). 
Total num frames: 11000528896. Throughput: 0: 55761.3. Samples: 1490894840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:35:42,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 17:35:44,546][57339] Updated weights for policy 0, policy_version 671428 (0.0029) [2024-04-28 17:35:47,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 11000807424. Throughput: 0: 55970.2. Samples: 1491234940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:35:47,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 17:35:47,453][57339] Updated weights for policy 0, policy_version 671438 (0.0029) [2024-04-28 17:35:50,329][57339] Updated weights for policy 0, policy_version 671448 (0.0030) [2024-04-28 17:35:52,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 11001069568. Throughput: 0: 55709.5. Samples: 1491395940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:35:52,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 17:35:53,365][57339] Updated weights for policy 0, policy_version 671458 (0.0033) [2024-04-28 17:35:53,988][57319] Signal inference workers to stop experience collection... (22250 times) [2024-04-28 17:35:53,989][57319] Signal inference workers to resume experience collection... (22250 times) [2024-04-28 17:35:54,002][57339] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-04-28 17:35:54,032][57339] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-04-28 17:35:56,132][57339] Updated weights for policy 0, policy_version 671468 (0.0035) [2024-04-28 17:35:57,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 11001380864. Throughput: 0: 55618.7. Samples: 1491727120. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:35:57,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 17:35:59,227][57339] Updated weights for policy 0, policy_version 671478 (0.0032) [2024-04-28 17:36:01,884][57339] Updated weights for policy 0, policy_version 671488 (0.0031) [2024-04-28 17:36:02,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11001659392. Throughput: 0: 55685.8. Samples: 1492062280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:02,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 17:36:05,163][57339] Updated weights for policy 0, policy_version 671498 (0.0039) [2024-04-28 17:36:07,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11001921536. Throughput: 0: 55882.6. Samples: 1492234620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:07,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 17:36:07,832][57339] Updated weights for policy 0, policy_version 671508 (0.0026) [2024-04-28 17:36:10,970][57339] Updated weights for policy 0, policy_version 671518 (0.0026) [2024-04-28 17:36:12,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11002232832. Throughput: 0: 55791.3. Samples: 1492565660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:12,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:36:13,954][57339] Updated weights for policy 0, policy_version 671528 (0.0031) [2024-04-28 17:36:16,805][57339] Updated weights for policy 0, policy_version 671538 (0.0031) [2024-04-28 17:36:17,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 11002494976. Throughput: 0: 55904.3. Samples: 1492904420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:17,169][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 17:36:19,821][57339] Updated weights for policy 0, policy_version 671548 (0.0030) [2024-04-28 17:36:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11002773504. Throughput: 0: 55781.4. Samples: 1493067280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:22,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 17:36:22,765][57339] Updated weights for policy 0, policy_version 671558 (0.0030) [2024-04-28 17:36:25,618][57339] Updated weights for policy 0, policy_version 671568 (0.0026) [2024-04-28 17:36:27,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 11003019264. Throughput: 0: 55795.1. Samples: 1493405620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:27,169][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 17:36:28,605][57339] Updated weights for policy 0, policy_version 671578 (0.0024) [2024-04-28 17:36:31,353][57339] Updated weights for policy 0, policy_version 671588 (0.0034) [2024-04-28 17:36:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11003330560. Throughput: 0: 55577.4. Samples: 1493735920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:32,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 17:36:34,441][57339] Updated weights for policy 0, policy_version 671598 (0.0030) [2024-04-28 17:36:37,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11003609088. Throughput: 0: 55829.3. Samples: 1493908260. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:37,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 17:36:37,208][57339] Updated weights for policy 0, policy_version 671608 (0.0030) [2024-04-28 17:36:40,315][57339] Updated weights for policy 0, policy_version 671618 (0.0028) [2024-04-28 17:36:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11003887616. Throughput: 0: 55907.9. Samples: 1494242980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:42,169][57108] Avg episode reward: [(0, '0.510')] [2024-04-28 17:36:42,986][57339] Updated weights for policy 0, policy_version 671628 (0.0032) [2024-04-28 17:36:46,217][57339] Updated weights for policy 0, policy_version 671638 (0.0030) [2024-04-28 17:36:47,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 11004182528. Throughput: 0: 55909.8. Samples: 1494578220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:47,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:36:47,580][57319] Signal inference workers to stop experience collection... (22300 times) [2024-04-28 17:36:47,581][57319] Signal inference workers to resume experience collection... (22300 times) [2024-04-28 17:36:47,594][57339] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-04-28 17:36:47,594][57339] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-04-28 17:36:48,664][57339] Updated weights for policy 0, policy_version 671648 (0.0033) [2024-04-28 17:36:52,044][57339] Updated weights for policy 0, policy_version 671658 (0.0033) [2024-04-28 17:36:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 11004444672. Throughput: 0: 55827.2. Samples: 1494746840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:52,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 17:36:54,574][57339] Updated weights for policy 0, policy_version 671668 (0.0027) [2024-04-28 17:36:57,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 11004706816. Throughput: 0: 55939.1. Samples: 1495082920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:36:57,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 17:36:57,879][57339] Updated weights for policy 0, policy_version 671678 (0.0028) [2024-04-28 17:37:00,535][57339] Updated weights for policy 0, policy_version 671688 (0.0030) [2024-04-28 17:37:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11004985344. Throughput: 0: 55890.8. Samples: 1495419500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:37:02,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 17:37:03,725][57339] Updated weights for policy 0, policy_version 671698 (0.0026) [2024-04-28 17:37:06,356][57339] Updated weights for policy 0, policy_version 671708 (0.0023) [2024-04-28 17:37:07,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 11005280256. Throughput: 0: 55974.0. Samples: 1495586120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:37:07,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 17:37:09,554][57339] Updated weights for policy 0, policy_version 671718 (0.0026) [2024-04-28 17:37:12,140][57339] Updated weights for policy 0, policy_version 671728 (0.0037) [2024-04-28 17:37:12,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11005591552. Throughput: 0: 55872.9. Samples: 1495919900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:37:12,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 17:37:15,407][57339] Updated weights for policy 0, policy_version 671738 (0.0033) [2024-04-28 17:37:17,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56251.7, 300 sec: 55816.6). Total num frames: 11005870080. Throughput: 0: 55880.3. Samples: 1496250540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-04-28 17:37:17,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 17:37:17,839][57339] Updated weights for policy 0, policy_version 671748 (0.0031) [2024-04-28 17:37:21,306][57339] Updated weights for policy 0, policy_version 671758 (0.0028) [2024-04-28 17:37:22,169][57108] Fps is (10 sec: 55704.6, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 11006148608. Throughput: 0: 56101.9. Samples: 1496432860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:37:22,170][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 17:37:23,741][57339] Updated weights for policy 0, policy_version 671768 (0.0026) [2024-04-28 17:37:27,169][57108] Fps is (10 sec: 52429.6, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 11006394368. Throughput: 0: 56139.2. Samples: 1496769240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:37:27,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 17:37:27,187][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000671778_11006410752.pth... [2024-04-28 17:37:27,190][57339] Updated weights for policy 0, policy_version 671778 (0.0033) [2024-04-28 17:37:27,234][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000670962_10993041408.pth [2024-04-28 17:37:28,797][57319] Signal inference workers to stop experience collection... 
(22350 times) [2024-04-28 17:37:28,832][57339] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-04-28 17:37:28,844][57319] Signal inference workers to resume experience collection... (22350 times) [2024-04-28 17:37:28,850][57339] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-04-28 17:37:29,664][57339] Updated weights for policy 0, policy_version 671788 (0.0033) [2024-04-28 17:37:32,169][57108] Fps is (10 sec: 50792.1, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 11006656512. Throughput: 0: 56128.6. Samples: 1497104000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:37:32,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 17:37:32,977][57339] Updated weights for policy 0, policy_version 671798 (0.0032) [2024-04-28 17:37:35,392][57339] Updated weights for policy 0, policy_version 671808 (0.0032) [2024-04-28 17:37:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11006951424. Throughput: 0: 55816.4. Samples: 1497258580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:37:37,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 17:37:38,876][57339] Updated weights for policy 0, policy_version 671818 (0.0027) [2024-04-28 17:37:41,351][57339] Updated weights for policy 0, policy_version 671828 (0.0025) [2024-04-28 17:37:42,169][57108] Fps is (10 sec: 58981.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11007246336. Throughput: 0: 55747.1. Samples: 1497591540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:37:42,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 17:37:44,640][57339] Updated weights for policy 0, policy_version 671838 (0.0031) [2024-04-28 17:37:47,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11007541248. Throughput: 0: 55788.5. Samples: 1497929980. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:37:47,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 17:37:47,350][57339] Updated weights for policy 0, policy_version 671848 (0.0028) [2024-04-28 17:37:50,388][57339] Updated weights for policy 0, policy_version 671858 (0.0032) [2024-04-28 17:37:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55872.3). Total num frames: 11007819776. Throughput: 0: 56060.7. Samples: 1498108840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:37:52,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 17:37:53,062][57339] Updated weights for policy 0, policy_version 671868 (0.0028) [2024-04-28 17:37:56,341][57339] Updated weights for policy 0, policy_version 671878 (0.0038) [2024-04-28 17:37:57,169][57108] Fps is (10 sec: 55705.1, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 11008098304. Throughput: 0: 56045.8. Samples: 1498441960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:37:57,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 17:37:58,913][57339] Updated weights for policy 0, policy_version 671888 (0.0029) [2024-04-28 17:38:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11008360448. Throughput: 0: 56087.2. Samples: 1498774460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:02,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 17:38:02,252][57339] Updated weights for policy 0, policy_version 671898 (0.0031) [2024-04-28 17:38:04,851][57339] Updated weights for policy 0, policy_version 671908 (0.0027) [2024-04-28 17:38:07,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 11008606208. Throughput: 0: 55610.5. Samples: 1498935320. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:07,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 17:38:08,012][57339] Updated weights for policy 0, policy_version 671918 (0.0029) [2024-04-28 17:38:10,793][57339] Updated weights for policy 0, policy_version 671928 (0.0030) [2024-04-28 17:38:12,169][57108] Fps is (10 sec: 52429.3, 60 sec: 54886.5, 300 sec: 55650.1). Total num frames: 11008884736. Throughput: 0: 55496.5. Samples: 1499266580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:12,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 17:38:13,984][57339] Updated weights for policy 0, policy_version 671938 (0.0030) [2024-04-28 17:38:16,546][57339] Updated weights for policy 0, policy_version 671948 (0.0030) [2024-04-28 17:38:17,169][57108] Fps is (10 sec: 58981.8, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 11009196032. Throughput: 0: 55494.8. Samples: 1499601280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:17,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 17:38:19,725][57339] Updated weights for policy 0, policy_version 671958 (0.0025) [2024-04-28 17:38:22,169][57108] Fps is (10 sec: 60620.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11009490944. Throughput: 0: 55943.1. Samples: 1499776020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:22,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 17:38:22,657][57339] Updated weights for policy 0, policy_version 671968 (0.0032) [2024-04-28 17:38:25,688][57339] Updated weights for policy 0, policy_version 671978 (0.0030) [2024-04-28 17:38:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11009769472. Throughput: 0: 55937.7. Samples: 1500108740. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:27,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 17:38:28,657][57339] Updated weights for policy 0, policy_version 671988 (0.0030) [2024-04-28 17:38:31,609][57339] Updated weights for policy 0, policy_version 671998 (0.0028) [2024-04-28 17:38:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 11010031616. Throughput: 0: 55815.5. Samples: 1500441680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:32,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:38:34,733][57339] Updated weights for policy 0, policy_version 672008 (0.0033) [2024-04-28 17:38:36,934][57319] Signal inference workers to stop experience collection... (22400 times) [2024-04-28 17:38:36,964][57339] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-04-28 17:38:36,990][57319] Signal inference workers to resume experience collection... (22400 times) [2024-04-28 17:38:36,994][57339] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-04-28 17:38:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 11010326528. Throughput: 0: 55615.1. Samples: 1500611520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:37,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:38:37,325][57339] Updated weights for policy 0, policy_version 672018 (0.0033) [2024-04-28 17:38:40,639][57339] Updated weights for policy 0, policy_version 672028 (0.0027) [2024-04-28 17:38:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11010588672. Throughput: 0: 55690.3. Samples: 1500948020. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:42,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 17:38:43,127][57339] Updated weights for policy 0, policy_version 672038 (0.0031) [2024-04-28 17:38:46,579][57339] Updated weights for policy 0, policy_version 672048 (0.0030) [2024-04-28 17:38:47,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 11010850816. Throughput: 0: 55732.0. Samples: 1501282400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:47,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 17:38:49,158][57339] Updated weights for policy 0, policy_version 672058 (0.0032) [2024-04-28 17:38:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11011145728. Throughput: 0: 55696.4. Samples: 1501441660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:52,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 17:38:52,316][57339] Updated weights for policy 0, policy_version 672068 (0.0027) [2024-04-28 17:38:54,940][57339] Updated weights for policy 0, policy_version 672078 (0.0026) [2024-04-28 17:38:57,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11011440640. Throughput: 0: 55807.8. Samples: 1501777940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-04-28 17:38:57,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 17:38:58,052][57339] Updated weights for policy 0, policy_version 672088 (0.0029) [2024-04-28 17:39:00,886][57339] Updated weights for policy 0, policy_version 672098 (0.0025) [2024-04-28 17:39:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11011702784. Throughput: 0: 55838.8. Samples: 1502114020. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:02,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:39:04,102][57339] Updated weights for policy 0, policy_version 672108 (0.0030) [2024-04-28 17:39:06,709][57339] Updated weights for policy 0, policy_version 672118 (0.0027) [2024-04-28 17:39:07,169][57108] Fps is (10 sec: 54066.7, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 11011981312. Throughput: 0: 55702.5. Samples: 1502282640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:07,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 17:39:10,086][57339] Updated weights for policy 0, policy_version 672128 (0.0030) [2024-04-28 17:39:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 11012259840. Throughput: 0: 55807.3. Samples: 1502620060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:12,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 17:39:12,472][57339] Updated weights for policy 0, policy_version 672138 (0.0031) [2024-04-28 17:39:16,074][57339] Updated weights for policy 0, policy_version 672148 (0.0029) [2024-04-28 17:39:17,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11012538368. Throughput: 0: 55815.2. Samples: 1502953360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:17,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 17:39:18,510][57339] Updated weights for policy 0, policy_version 672158 (0.0028) [2024-04-28 17:39:21,927][57339] Updated weights for policy 0, policy_version 672168 (0.0033) [2024-04-28 17:39:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11012816896. Throughput: 0: 55558.3. Samples: 1503111640. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:22,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 17:39:24,388][57339] Updated weights for policy 0, policy_version 672178 (0.0037) [2024-04-28 17:39:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11013079040. Throughput: 0: 55480.4. Samples: 1503444640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:27,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 17:39:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000672185_11013079040.pth... [2024-04-28 17:39:27,246][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000671368_10999693312.pth [2024-04-28 17:39:27,875][57339] Updated weights for policy 0, policy_version 672188 (0.0029) [2024-04-28 17:39:30,332][57339] Updated weights for policy 0, policy_version 672198 (0.0024) [2024-04-28 17:39:32,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11013373952. Throughput: 0: 55526.3. Samples: 1503781080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:32,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 17:39:33,836][57339] Updated weights for policy 0, policy_version 672208 (0.0026) [2024-04-28 17:39:36,210][57339] Updated weights for policy 0, policy_version 672218 (0.0023) [2024-04-28 17:39:37,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 11013652480. Throughput: 0: 55795.1. Samples: 1503952440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:37,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 17:39:39,632][57339] Updated weights for policy 0, policy_version 672228 (0.0025) [2024-04-28 17:39:42,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 11013931008. Throughput: 0: 55638.2. Samples: 1504281660. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:42,170][57108] Avg episode reward: [(0, '0.489')] [2024-04-28 17:39:42,195][57339] Updated weights for policy 0, policy_version 672238 (0.0030) [2024-04-28 17:39:43,655][57319] Signal inference workers to stop experience collection... (22450 times) [2024-04-28 17:39:43,706][57339] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-04-28 17:39:43,711][57319] Signal inference workers to resume experience collection... (22450 times) [2024-04-28 17:39:43,716][57339] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-04-28 17:39:45,375][57339] Updated weights for policy 0, policy_version 672248 (0.0032) [2024-04-28 17:39:47,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 11014209536. Throughput: 0: 55531.9. Samples: 1504612960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:47,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:39:48,240][57339] Updated weights for policy 0, policy_version 672258 (0.0033) [2024-04-28 17:39:51,291][57339] Updated weights for policy 0, policy_version 672268 (0.0032) [2024-04-28 17:39:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11014488064. Throughput: 0: 55459.2. Samples: 1504778300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:52,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 17:39:53,953][57339] Updated weights for policy 0, policy_version 672278 (0.0030) [2024-04-28 17:39:57,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11014750208. Throughput: 0: 55439.9. Samples: 1505114860. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:39:57,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 17:39:57,244][57339] Updated weights for policy 0, policy_version 672288 (0.0034) [2024-04-28 17:39:59,720][57339] Updated weights for policy 0, policy_version 672298 (0.0028) [2024-04-28 17:40:02,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11015028736. Throughput: 0: 55579.5. Samples: 1505454440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:40:02,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 17:40:03,125][57339] Updated weights for policy 0, policy_version 672308 (0.0026) [2024-04-28 17:40:05,702][57339] Updated weights for policy 0, policy_version 672318 (0.0031) [2024-04-28 17:40:07,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11015307264. Throughput: 0: 55471.3. Samples: 1505607860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:40:07,170][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 17:40:08,943][57339] Updated weights for policy 0, policy_version 672328 (0.0028) [2024-04-28 17:40:11,695][57339] Updated weights for policy 0, policy_version 672338 (0.0031) [2024-04-28 17:40:12,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11015585792. Throughput: 0: 55550.5. Samples: 1505944420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:40:12,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 17:40:14,949][57339] Updated weights for policy 0, policy_version 672348 (0.0031) [2024-04-28 17:40:17,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11015880704. Throughput: 0: 55454.4. Samples: 1506276540. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:40:17,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 17:40:17,669][57339] Updated weights for policy 0, policy_version 672358 (0.0030) [2024-04-28 17:40:20,779][57339] Updated weights for policy 0, policy_version 672368 (0.0030) [2024-04-28 17:40:22,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11016159232. Throughput: 0: 55392.0. Samples: 1506445080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:40:22,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 17:40:23,424][57339] Updated weights for policy 0, policy_version 672378 (0.0025) [2024-04-28 17:40:26,656][57339] Updated weights for policy 0, policy_version 672388 (0.0026) [2024-04-28 17:40:27,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 11016421376. Throughput: 0: 55537.0. Samples: 1506780820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:40:27,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 17:40:29,428][57339] Updated weights for policy 0, policy_version 672398 (0.0029) [2024-04-28 17:40:32,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 11016683520. Throughput: 0: 55604.6. Samples: 1507115160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:40:32,170][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 17:40:32,589][57339] Updated weights for policy 0, policy_version 672408 (0.0026) [2024-04-28 17:40:35,715][57339] Updated weights for policy 0, policy_version 672418 (0.0027) [2024-04-28 17:40:37,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11016978432. Throughput: 0: 55460.0. Samples: 1507274000. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 17:40:37,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 17:40:38,330][57339] Updated weights for policy 0, policy_version 672428 (0.0028) [2024-04-28 17:40:41,689][57339] Updated weights for policy 0, policy_version 672438 (0.0033) [2024-04-28 17:40:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11017240576. Throughput: 0: 55470.7. Samples: 1507611040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:40:42,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:40:44,241][57339] Updated weights for policy 0, policy_version 672448 (0.0029) [2024-04-28 17:40:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11017535488. Throughput: 0: 55409.3. Samples: 1507947860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:40:47,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 17:40:47,634][57339] Updated weights for policy 0, policy_version 672458 (0.0027) [2024-04-28 17:40:50,198][57339] Updated weights for policy 0, policy_version 672468 (0.0030) [2024-04-28 17:40:52,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 11017830400. Throughput: 0: 55865.1. Samples: 1508121780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:40:52,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 17:40:53,295][57339] Updated weights for policy 0, policy_version 672478 (0.0029) [2024-04-28 17:40:55,960][57339] Updated weights for policy 0, policy_version 672488 (0.0028) [2024-04-28 17:40:57,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 11018125312. Throughput: 0: 55786.7. Samples: 1508454820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:40:57,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 17:40:59,089][57339] Updated weights for policy 0, policy_version 672498 (0.0030) [2024-04-28 17:41:01,824][57339] Updated weights for policy 0, policy_version 672508 (0.0038) [2024-04-28 17:41:02,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11018387456. Throughput: 0: 55901.8. Samples: 1508792120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:02,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 17:41:04,842][57339] Updated weights for policy 0, policy_version 672518 (0.0029) [2024-04-28 17:41:07,088][57319] Signal inference workers to stop experience collection... (22500 times) [2024-04-28 17:41:07,118][57339] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-04-28 17:41:07,146][57319] Signal inference workers to resume experience collection... (22500 times) [2024-04-28 17:41:07,147][57339] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-04-28 17:41:07,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11018649600. Throughput: 0: 55973.8. Samples: 1508963900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:07,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 17:41:07,730][57339] Updated weights for policy 0, policy_version 672528 (0.0026) [2024-04-28 17:41:10,704][57339] Updated weights for policy 0, policy_version 672538 (0.0033) [2024-04-28 17:41:12,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11018911744. Throughput: 0: 55876.8. Samples: 1509295280. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:12,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 17:41:13,575][57339] Updated weights for policy 0, policy_version 672548 (0.0025) [2024-04-28 17:41:16,682][57339] Updated weights for policy 0, policy_version 672558 (0.0028) [2024-04-28 17:41:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11019206656. Throughput: 0: 55936.8. Samples: 1509632320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:17,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 17:41:19,409][57339] Updated weights for policy 0, policy_version 672568 (0.0031) [2024-04-28 17:41:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 11019468800. Throughput: 0: 55910.3. Samples: 1509789960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:22,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 17:41:22,760][57339] Updated weights for policy 0, policy_version 672578 (0.0027) [2024-04-28 17:41:25,172][57339] Updated weights for policy 0, policy_version 672588 (0.0030) [2024-04-28 17:41:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11019780096. Throughput: 0: 55923.0. Samples: 1510127580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:41:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000672594_11019780096.pth... [2024-04-28 17:41:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000671778_11006410752.pth [2024-04-28 17:41:28,556][57339] Updated weights for policy 0, policy_version 672598 (0.0030) [2024-04-28 17:41:30,965][57339] Updated weights for policy 0, policy_version 672608 (0.0033) [2024-04-28 17:41:32,169][57108] Fps is (10 sec: 60620.7, 60 sec: 56524.7, 300 sec: 55816.7). 
Total num frames: 11020075008. Throughput: 0: 55816.9. Samples: 1510459620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:32,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 17:41:34,491][57339] Updated weights for policy 0, policy_version 672618 (0.0031) [2024-04-28 17:41:36,854][57339] Updated weights for policy 0, policy_version 672628 (0.0032) [2024-04-28 17:41:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11020353536. Throughput: 0: 55930.5. Samples: 1510638660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:37,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 17:41:40,356][57339] Updated weights for policy 0, policy_version 672638 (0.0036) [2024-04-28 17:41:42,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11020599296. Throughput: 0: 55912.1. Samples: 1510970860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:42,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 17:41:42,675][57339] Updated weights for policy 0, policy_version 672648 (0.0034) [2024-04-28 17:41:46,042][57339] Updated weights for policy 0, policy_version 672658 (0.0031) [2024-04-28 17:41:47,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11020877824. Throughput: 0: 55977.8. Samples: 1511311120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:47,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 17:41:48,421][57339] Updated weights for policy 0, policy_version 672668 (0.0034) [2024-04-28 17:41:51,878][57339] Updated weights for policy 0, policy_version 672678 (0.0026) [2024-04-28 17:41:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11021172736. Throughput: 0: 55685.7. Samples: 1511469760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:52,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 17:41:54,464][57339] Updated weights for policy 0, policy_version 672688 (0.0023) [2024-04-28 17:41:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11021451264. Throughput: 0: 55766.1. Samples: 1511804760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:41:57,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 17:41:57,884][57339] Updated weights for policy 0, policy_version 672698 (0.0026) [2024-04-28 17:42:00,328][57339] Updated weights for policy 0, policy_version 672708 (0.0024) [2024-04-28 17:42:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 11021729792. Throughput: 0: 55701.3. Samples: 1512138880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:02,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 17:42:03,724][57339] Updated weights for policy 0, policy_version 672718 (0.0037) [2024-04-28 17:42:05,974][57339] Updated weights for policy 0, policy_version 672728 (0.0030) [2024-04-28 17:42:07,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11022024704. Throughput: 0: 56037.4. Samples: 1512311640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:07,169][57108] Avg episode reward: [(0, '0.715')] [2024-04-28 17:42:09,559][57339] Updated weights for policy 0, policy_version 672738 (0.0028) [2024-04-28 17:42:10,973][57319] Signal inference workers to stop experience collection... (22550 times) [2024-04-28 17:42:10,975][57319] Signal inference workers to resume experience collection... 
(22550 times) [2024-04-28 17:42:10,994][57339] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-04-28 17:42:10,994][57339] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-04-28 17:42:11,806][57339] Updated weights for policy 0, policy_version 672748 (0.0035) [2024-04-28 17:42:12,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56797.9, 300 sec: 55761.2). Total num frames: 11022319616. Throughput: 0: 56004.5. Samples: 1512647780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:12,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 17:42:15,450][57339] Updated weights for policy 0, policy_version 672758 (0.0033) [2024-04-28 17:42:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11022565376. Throughput: 0: 56002.3. Samples: 1512979720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:17,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 17:42:17,754][57339] Updated weights for policy 0, policy_version 672768 (0.0027) [2024-04-28 17:42:21,483][57339] Updated weights for policy 0, policy_version 672778 (0.0030) [2024-04-28 17:42:22,169][57108] Fps is (10 sec: 49152.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11022811136. Throughput: 0: 55395.2. Samples: 1513131440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:22,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 17:42:23,938][57339] Updated weights for policy 0, policy_version 672788 (0.0026) [2024-04-28 17:42:27,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11023106048. Throughput: 0: 55477.7. Samples: 1513467360. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:27,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 17:42:27,285][57339] Updated weights for policy 0, policy_version 672798 (0.0033) [2024-04-28 17:42:29,752][57339] Updated weights for policy 0, policy_version 672808 (0.0026) [2024-04-28 17:42:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 11023368192. Throughput: 0: 55293.3. Samples: 1513799320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:32,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 17:42:33,126][57339] Updated weights for policy 0, policy_version 672818 (0.0032) [2024-04-28 17:42:35,642][57339] Updated weights for policy 0, policy_version 672828 (0.0029) [2024-04-28 17:42:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11023663104. Throughput: 0: 55577.8. Samples: 1513970760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:37,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 17:42:38,979][57339] Updated weights for policy 0, policy_version 672838 (0.0023) [2024-04-28 17:42:41,635][57339] Updated weights for policy 0, policy_version 672848 (0.0027) [2024-04-28 17:42:42,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 11023958016. Throughput: 0: 55321.3. Samples: 1514294220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:42,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 17:42:44,933][57339] Updated weights for policy 0, policy_version 672858 (0.0029) [2024-04-28 17:42:47,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 11024236544. Throughput: 0: 55271.5. Samples: 1514626100. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:47,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 17:42:47,384][57339] Updated weights for policy 0, policy_version 672868 (0.0030) [2024-04-28 17:42:50,777][57339] Updated weights for policy 0, policy_version 672878 (0.0027) [2024-04-28 17:42:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11024498688. Throughput: 0: 55419.5. Samples: 1514805520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:52,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 17:42:53,432][57339] Updated weights for policy 0, policy_version 672888 (0.0031) [2024-04-28 17:42:56,972][57339] Updated weights for policy 0, policy_version 672898 (0.0029) [2024-04-28 17:42:57,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 11024760832. Throughput: 0: 55348.0. Samples: 1515138440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:42:57,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 17:42:59,377][57339] Updated weights for policy 0, policy_version 672908 (0.0027) [2024-04-28 17:43:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11025039360. Throughput: 0: 55246.7. Samples: 1515465820. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:02,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 17:43:02,941][57339] Updated weights for policy 0, policy_version 672918 (0.0027) [2024-04-28 17:43:03,606][57319] Signal inference workers to stop experience collection... (22600 times) [2024-04-28 17:43:03,641][57339] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-04-28 17:43:03,695][57319] Signal inference workers to resume experience collection... 
(22600 times) [2024-04-28 17:43:03,696][57339] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-04-28 17:43:05,270][57339] Updated weights for policy 0, policy_version 672928 (0.0033) [2024-04-28 17:43:07,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54613.3, 300 sec: 55650.0). Total num frames: 11025301504. Throughput: 0: 55347.1. Samples: 1515622060. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:07,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 17:43:08,883][57339] Updated weights for policy 0, policy_version 672938 (0.0032) [2024-04-28 17:43:11,405][57339] Updated weights for policy 0, policy_version 672948 (0.0033) [2024-04-28 17:43:12,169][57108] Fps is (10 sec: 58981.8, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 11025629184. Throughput: 0: 55225.8. Samples: 1515952520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:12,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:43:14,852][57339] Updated weights for policy 0, policy_version 672958 (0.0026) [2024-04-28 17:43:17,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11025891328. Throughput: 0: 55231.2. Samples: 1516284720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:17,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 17:43:17,317][57339] Updated weights for policy 0, policy_version 672968 (0.0030) [2024-04-28 17:43:20,695][57339] Updated weights for policy 0, policy_version 672978 (0.0028) [2024-04-28 17:43:22,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.7, 300 sec: 55594.6). Total num frames: 11026169856. Throughput: 0: 55328.6. Samples: 1516460540. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:22,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:43:23,338][57339] Updated weights for policy 0, policy_version 672988 (0.0024) [2024-04-28 17:43:26,627][57339] Updated weights for policy 0, policy_version 672998 (0.0030) [2024-04-28 17:43:27,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 11026415616. Throughput: 0: 55580.5. Samples: 1516795340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:27,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 17:43:27,325][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000673001_11026448384.pth... [2024-04-28 17:43:27,386][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000672185_11013079040.pth [2024-04-28 17:43:29,329][57339] Updated weights for policy 0, policy_version 673008 (0.0027) [2024-04-28 17:43:32,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 11026694144. Throughput: 0: 55586.7. Samples: 1517127500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:32,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 17:43:32,373][57339] Updated weights for policy 0, policy_version 673018 (0.0027) [2024-04-28 17:43:35,027][57339] Updated weights for policy 0, policy_version 673028 (0.0028) [2024-04-28 17:43:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 11026956288. Throughput: 0: 55126.6. Samples: 1517286220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:37,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:43:38,429][57339] Updated weights for policy 0, policy_version 673038 (0.0028) [2024-04-28 17:43:40,795][57339] Updated weights for policy 0, policy_version 673048 (0.0029) [2024-04-28 17:43:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 54886.4, 300 sec: 55594.5). 
Total num frames: 11027251200. Throughput: 0: 55039.0. Samples: 1517615200. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:42,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 17:43:44,217][57339] Updated weights for policy 0, policy_version 673058 (0.0024) [2024-04-28 17:43:46,675][57339] Updated weights for policy 0, policy_version 673068 (0.0029) [2024-04-28 17:43:47,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11027562496. Throughput: 0: 55264.4. Samples: 1517952720. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:47,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 17:43:50,158][57339] Updated weights for policy 0, policy_version 673078 (0.0020) [2024-04-28 17:43:50,462][57319] Signal inference workers to stop experience collection... (22650 times) [2024-04-28 17:43:50,519][57339] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-04-28 17:43:50,519][57319] Signal inference workers to resume experience collection... (22650 times) [2024-04-28 17:43:50,534][57339] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-04-28 17:43:52,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11027841024. Throughput: 0: 55803.5. Samples: 1518133220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-04-28 17:43:52,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 17:43:52,622][57339] Updated weights for policy 0, policy_version 673088 (0.0028) [2024-04-28 17:43:56,120][57339] Updated weights for policy 0, policy_version 673098 (0.0031) [2024-04-28 17:43:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11028119552. Throughput: 0: 55853.5. Samples: 1518465920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:43:57,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 17:43:58,483][57339] Updated weights for policy 0, policy_version 673108 (0.0024) [2024-04-28 17:44:01,894][57339] Updated weights for policy 0, policy_version 673118 (0.0027) [2024-04-28 17:44:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11028381696. Throughput: 0: 55861.2. Samples: 1518798480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:02,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 17:44:04,413][57339] Updated weights for policy 0, policy_version 673128 (0.0026) [2024-04-28 17:44:07,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11028643840. Throughput: 0: 55460.4. Samples: 1518956260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:07,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 17:44:07,771][57339] Updated weights for policy 0, policy_version 673138 (0.0036) [2024-04-28 17:44:10,401][57339] Updated weights for policy 0, policy_version 673148 (0.0025) [2024-04-28 17:44:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 11028922368. Throughput: 0: 55527.5. Samples: 1519294080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:12,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 17:44:13,571][57339] Updated weights for policy 0, policy_version 673158 (0.0023) [2024-04-28 17:44:16,229][57339] Updated weights for policy 0, policy_version 673168 (0.0027) [2024-04-28 17:44:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11029217280. Throughput: 0: 55601.3. Samples: 1519629560. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:17,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:44:19,443][57339] Updated weights for policy 0, policy_version 673178 (0.0030) [2024-04-28 17:44:22,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11029495808. Throughput: 0: 55825.5. Samples: 1519798360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:22,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 17:44:22,208][57339] Updated weights for policy 0, policy_version 673188 (0.0038) [2024-04-28 17:44:25,284][57339] Updated weights for policy 0, policy_version 673198 (0.0028) [2024-04-28 17:44:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11029774336. Throughput: 0: 55876.8. Samples: 1520129660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:27,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 17:44:28,028][57339] Updated weights for policy 0, policy_version 673208 (0.0035) [2024-04-28 17:44:31,166][57339] Updated weights for policy 0, policy_version 673218 (0.0028) [2024-04-28 17:44:32,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11030052864. Throughput: 0: 55800.9. Samples: 1520463760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:32,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 17:44:33,899][57339] Updated weights for policy 0, policy_version 673228 (0.0029) [2024-04-28 17:44:36,374][57319] Signal inference workers to stop experience collection... (22700 times) [2024-04-28 17:44:36,411][57339] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-04-28 17:44:36,467][57319] Signal inference workers to resume experience collection... 
(22700 times) [2024-04-28 17:44:36,467][57339] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-04-28 17:44:36,977][57339] Updated weights for policy 0, policy_version 673238 (0.0027) [2024-04-28 17:44:37,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56524.8, 300 sec: 55650.1). Total num frames: 11030347776. Throughput: 0: 55524.5. Samples: 1520631820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:37,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 17:44:39,775][57339] Updated weights for policy 0, policy_version 673248 (0.0032) [2024-04-28 17:44:42,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11030593536. Throughput: 0: 55584.1. Samples: 1520967200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:42,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 17:44:42,878][57339] Updated weights for policy 0, policy_version 673258 (0.0025) [2024-04-28 17:44:45,775][57339] Updated weights for policy 0, policy_version 673268 (0.0031) [2024-04-28 17:44:47,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11030872064. Throughput: 0: 55575.2. Samples: 1521299360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:47,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 17:44:48,915][57339] Updated weights for policy 0, policy_version 673278 (0.0030) [2024-04-28 17:44:51,614][57339] Updated weights for policy 0, policy_version 673288 (0.0032) [2024-04-28 17:44:52,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11031166976. Throughput: 0: 55621.7. Samples: 1521459240. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:52,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 17:44:54,753][57339] Updated weights for policy 0, policy_version 673298 (0.0026) [2024-04-28 17:44:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11031445504. Throughput: 0: 55464.6. Samples: 1521789980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:44:57,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 17:44:57,642][57339] Updated weights for policy 0, policy_version 673308 (0.0026) [2024-04-28 17:45:00,535][57339] Updated weights for policy 0, policy_version 673318 (0.0024) [2024-04-28 17:45:02,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11031707648. Throughput: 0: 55392.7. Samples: 1522122240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:45:02,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:45:03,435][57339] Updated weights for policy 0, policy_version 673328 (0.0024) [2024-04-28 17:45:06,443][57339] Updated weights for policy 0, policy_version 673338 (0.0028) [2024-04-28 17:45:07,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11032002560. Throughput: 0: 55412.7. Samples: 1522291940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:45:07,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 17:45:09,363][57339] Updated weights for policy 0, policy_version 673348 (0.0032) [2024-04-28 17:45:12,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 11032264704. Throughput: 0: 55507.5. Samples: 1522627500. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:45:12,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 17:45:12,469][57339] Updated weights for policy 0, policy_version 673358 (0.0031) [2024-04-28 17:45:15,171][57339] Updated weights for policy 0, policy_version 673368 (0.0030) [2024-04-28 17:45:17,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 11032526848. Throughput: 0: 55487.3. Samples: 1522960680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:45:17,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 17:45:18,363][57339] Updated weights for policy 0, policy_version 673378 (0.0031) [2024-04-28 17:45:21,114][57339] Updated weights for policy 0, policy_version 673388 (0.0027) [2024-04-28 17:45:22,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11032821760. Throughput: 0: 55233.7. Samples: 1523117340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:45:22,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 17:45:24,367][57339] Updated weights for policy 0, policy_version 673398 (0.0029) [2024-04-28 17:45:26,951][57339] Updated weights for policy 0, policy_version 673408 (0.0034) [2024-04-28 17:45:27,169][57108] Fps is (10 sec: 58981.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11033116672. Throughput: 0: 55239.7. Samples: 1523453000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:45:27,170][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 17:45:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000673408_11033116672.pth... [2024-04-28 17:45:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000672594_11019780096.pth [2024-04-28 17:45:28,785][57319] Signal inference workers to stop experience collection... 
(22750 times) [2024-04-28 17:45:28,809][57339] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-04-28 17:45:28,881][57319] Signal inference workers to resume experience collection... (22750 times) [2024-04-28 17:45:28,881][57339] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-04-28 17:45:30,094][57339] Updated weights for policy 0, policy_version 673418 (0.0026) [2024-04-28 17:45:32,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55159.3, 300 sec: 55538.9). Total num frames: 11033362432. Throughput: 0: 55414.8. Samples: 1523793040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-04-28 17:45:32,170][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:45:32,779][57339] Updated weights for policy 0, policy_version 673428 (0.0031) [2024-04-28 17:45:35,982][57339] Updated weights for policy 0, policy_version 673438 (0.0023) [2024-04-28 17:45:37,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11033657344. Throughput: 0: 55619.3. Samples: 1523962100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:45:37,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:45:38,773][57339] Updated weights for policy 0, policy_version 673448 (0.0024) [2024-04-28 17:45:41,819][57339] Updated weights for policy 0, policy_version 673458 (0.0036) [2024-04-28 17:45:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.3, 300 sec: 55594.5). Total num frames: 11033935872. Throughput: 0: 55582.8. Samples: 1524291220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:45:42,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 17:45:44,724][57339] Updated weights for policy 0, policy_version 673468 (0.0026) [2024-04-28 17:45:47,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11034230784. Throughput: 0: 55584.6. Samples: 1524623540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:45:47,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 17:45:47,588][57339] Updated weights for policy 0, policy_version 673478 (0.0028) [2024-04-28 17:45:51,033][57339] Updated weights for policy 0, policy_version 673488 (0.0032) [2024-04-28 17:45:52,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 11034476544. Throughput: 0: 55555.6. Samples: 1524791940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:45:52,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 17:45:53,396][57339] Updated weights for policy 0, policy_version 673498 (0.0026) [2024-04-28 17:45:56,939][57339] Updated weights for policy 0, policy_version 673508 (0.0026) [2024-04-28 17:45:57,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55432.3, 300 sec: 55539.0). Total num frames: 11034771456. Throughput: 0: 55646.2. Samples: 1525131580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:45:57,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 17:45:59,457][57339] Updated weights for policy 0, policy_version 673518 (0.0027) [2024-04-28 17:46:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.8, 300 sec: 55594.5). Total num frames: 11035049984. Throughput: 0: 55503.9. Samples: 1525458360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:02,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 17:46:02,667][57339] Updated weights for policy 0, policy_version 673528 (0.0033) [2024-04-28 17:46:05,364][57339] Updated weights for policy 0, policy_version 673538 (0.0024) [2024-04-28 17:46:07,169][57108] Fps is (10 sec: 55707.1, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11035328512. Throughput: 0: 55674.0. Samples: 1525622660. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:07,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 17:46:08,437][57339] Updated weights for policy 0, policy_version 673548 (0.0028) [2024-04-28 17:46:11,428][57339] Updated weights for policy 0, policy_version 673558 (0.0026) [2024-04-28 17:46:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 11035607040. Throughput: 0: 55736.5. Samples: 1525961140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:12,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 17:46:14,269][57339] Updated weights for policy 0, policy_version 673568 (0.0033) [2024-04-28 17:46:17,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 11035885568. Throughput: 0: 55603.3. Samples: 1526295180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:17,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 17:46:17,303][57339] Updated weights for policy 0, policy_version 673578 (0.0030) [2024-04-28 17:46:20,180][57339] Updated weights for policy 0, policy_version 673588 (0.0025) [2024-04-28 17:46:22,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.8, 300 sec: 55594.6). Total num frames: 11036180480. Throughput: 0: 55601.8. Samples: 1526464180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:22,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 17:46:22,990][57339] Updated weights for policy 0, policy_version 673598 (0.0030) [2024-04-28 17:46:26,076][57339] Updated weights for policy 0, policy_version 673608 (0.0032) [2024-04-28 17:46:27,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 11036442624. Throughput: 0: 55790.8. Samples: 1526801800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:27,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 17:46:28,789][57339] Updated weights for policy 0, policy_version 673618 (0.0031) [2024-04-28 17:46:31,983][57339] Updated weights for policy 0, policy_version 673628 (0.0035) [2024-04-28 17:46:32,169][57108] Fps is (10 sec: 54066.0, 60 sec: 55978.8, 300 sec: 55483.4). Total num frames: 11036721152. Throughput: 0: 55760.8. Samples: 1527132780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:32,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:46:34,752][57339] Updated weights for policy 0, policy_version 673638 (0.0027) [2024-04-28 17:46:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11036999680. Throughput: 0: 55646.2. Samples: 1527296020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:37,169][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 17:46:38,039][57339] Updated weights for policy 0, policy_version 673648 (0.0027) [2024-04-28 17:46:40,610][57339] Updated weights for policy 0, policy_version 673658 (0.0030) [2024-04-28 17:46:42,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55432.8, 300 sec: 55539.0). Total num frames: 11037261824. Throughput: 0: 55561.2. Samples: 1527631820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:42,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 17:46:43,820][57339] Updated weights for policy 0, policy_version 673668 (0.0032) [2024-04-28 17:46:46,482][57339] Updated weights for policy 0, policy_version 673678 (0.0027) [2024-04-28 17:46:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11037556736. Throughput: 0: 55669.0. Samples: 1527963460. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:47,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 17:46:49,719][57339] Updated weights for policy 0, policy_version 673688 (0.0027) [2024-04-28 17:46:52,151][57319] Signal inference workers to stop experience collection... (22800 times) [2024-04-28 17:46:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 11037835264. Throughput: 0: 55904.4. Samples: 1528138360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:52,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 17:46:52,183][57339] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-04-28 17:46:52,208][57319] Signal inference workers to resume experience collection... (22800 times) [2024-04-28 17:46:52,209][57339] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-04-28 17:46:52,328][57339] Updated weights for policy 0, policy_version 673698 (0.0025) [2024-04-28 17:46:55,515][57339] Updated weights for policy 0, policy_version 673708 (0.0028) [2024-04-28 17:46:57,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.9, 300 sec: 55594.5). Total num frames: 11038130176. Throughput: 0: 55684.1. Samples: 1528466920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:46:57,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 17:46:58,325][57339] Updated weights for policy 0, policy_version 673718 (0.0025) [2024-04-28 17:47:01,307][57339] Updated weights for policy 0, policy_version 673728 (0.0029) [2024-04-28 17:47:02,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 11038375936. Throughput: 0: 55646.7. Samples: 1528799280. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:47:02,170][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 17:47:04,223][57339] Updated weights for policy 0, policy_version 673738 (0.0025) [2024-04-28 17:47:07,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55705.4, 300 sec: 55427.9). Total num frames: 11038670848. Throughput: 0: 55610.8. Samples: 1528966680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:47:07,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 17:47:07,260][57339] Updated weights for policy 0, policy_version 673748 (0.0030) [2024-04-28 17:47:10,237][57339] Updated weights for policy 0, policy_version 673758 (0.0036) [2024-04-28 17:47:12,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 11038932992. Throughput: 0: 55489.5. Samples: 1529298820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-04-28 17:47:12,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 17:47:13,297][57339] Updated weights for policy 0, policy_version 673768 (0.0030) [2024-04-28 17:47:16,175][57339] Updated weights for policy 0, policy_version 673778 (0.0029) [2024-04-28 17:47:17,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11039211520. Throughput: 0: 55601.7. Samples: 1529634860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:47:17,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 17:47:19,130][57339] Updated weights for policy 0, policy_version 673788 (0.0034) [2024-04-28 17:47:22,064][57339] Updated weights for policy 0, policy_version 673798 (0.0030) [2024-04-28 17:47:22,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11039506432. Throughput: 0: 55574.7. Samples: 1529796880. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:47:22,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 17:47:25,073][57339] Updated weights for policy 0, policy_version 673808 (0.0031) [2024-04-28 17:47:27,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11039801344. Throughput: 0: 55550.5. Samples: 1530131600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:47:27,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 17:47:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000673816_11039801344.pth... [2024-04-28 17:47:27,229][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000673001_11026448384.pth [2024-04-28 17:47:27,777][57339] Updated weights for policy 0, policy_version 673818 (0.0027) [2024-04-28 17:47:31,110][57339] Updated weights for policy 0, policy_version 673828 (0.0031) [2024-04-28 17:47:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 11040047104. Throughput: 0: 55618.6. Samples: 1530466300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:47:32,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 17:47:33,738][57339] Updated weights for policy 0, policy_version 673838 (0.0031) [2024-04-28 17:47:37,005][57339] Updated weights for policy 0, policy_version 673848 (0.0025) [2024-04-28 17:47:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 11040325632. Throughput: 0: 55436.4. Samples: 1530633000. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:47:37,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 17:47:39,659][57339] Updated weights for policy 0, policy_version 673858 (0.0030) [2024-04-28 17:47:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 11040620544. Throughput: 0: 55544.4. Samples: 1530966420. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:47:42,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 17:47:42,851][57339] Updated weights for policy 0, policy_version 673868 (0.0034) [2024-04-28 17:47:45,865][57339] Updated weights for policy 0, policy_version 673878 (0.0030) [2024-04-28 17:47:47,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 11040882688. Throughput: 0: 55599.6. Samples: 1531301260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:47:47,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 17:47:48,766][57339] Updated weights for policy 0, policy_version 673888 (0.0033) [2024-04-28 17:47:51,917][57339] Updated weights for policy 0, policy_version 673898 (0.0027) [2024-04-28 17:47:52,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11041144832. Throughput: 0: 55491.7. Samples: 1531463800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:47:52,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 17:47:54,613][57339] Updated weights for policy 0, policy_version 673908 (0.0033) [2024-04-28 17:47:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 11041456128. Throughput: 0: 55495.4. Samples: 1531796120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:47:57,170][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 17:47:57,839][57339] Updated weights for policy 0, policy_version 673918 (0.0029) [2024-04-28 17:48:00,685][57339] Updated weights for policy 0, policy_version 673928 (0.0030) [2024-04-28 17:48:01,625][57319] Signal inference workers to stop experience collection... (22850 times) [2024-04-28 17:48:01,625][57319] Signal inference workers to resume experience collection... 
(22850 times) [2024-04-28 17:48:01,636][57339] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-04-28 17:48:01,653][57339] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-04-28 17:48:02,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 11041734656. Throughput: 0: 55453.5. Samples: 1532130260. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:02,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:48:03,576][57339] Updated weights for policy 0, policy_version 673938 (0.0044) [2024-04-28 17:48:06,586][57339] Updated weights for policy 0, policy_version 673948 (0.0030) [2024-04-28 17:48:07,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 11041996800. Throughput: 0: 55595.7. Samples: 1532298680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:07,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 17:48:09,527][57339] Updated weights for policy 0, policy_version 673958 (0.0047) [2024-04-28 17:48:12,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 11042258944. Throughput: 0: 55590.7. Samples: 1532633180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:12,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 17:48:12,395][57339] Updated weights for policy 0, policy_version 673968 (0.0025) [2024-04-28 17:48:15,523][57339] Updated weights for policy 0, policy_version 673978 (0.0029) [2024-04-28 17:48:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 11042553856. Throughput: 0: 55487.6. Samples: 1532963240. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:17,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 17:48:18,397][57339] Updated weights for policy 0, policy_version 673988 (0.0029) [2024-04-28 17:48:21,462][57339] Updated weights for policy 0, policy_version 673998 (0.0030) [2024-04-28 17:48:22,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 11042832384. Throughput: 0: 55386.1. Samples: 1533125380. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:22,170][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 17:48:24,375][57339] Updated weights for policy 0, policy_version 674008 (0.0026) [2024-04-28 17:48:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 11043094528. Throughput: 0: 55394.2. Samples: 1533459160. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:27,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 17:48:27,378][57339] Updated weights for policy 0, policy_version 674018 (0.0029) [2024-04-28 17:48:30,249][57339] Updated weights for policy 0, policy_version 674028 (0.0035) [2024-04-28 17:48:32,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11043389440. Throughput: 0: 55391.3. Samples: 1533793860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:32,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 17:48:33,170][57339] Updated weights for policy 0, policy_version 674038 (0.0030) [2024-04-28 17:48:35,952][57339] Updated weights for policy 0, policy_version 674048 (0.0029) [2024-04-28 17:48:37,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11043684352. Throughput: 0: 55542.2. Samples: 1533963200. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:37,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 17:48:39,145][57339] Updated weights for policy 0, policy_version 674058 (0.0027) [2024-04-28 17:48:41,754][57339] Updated weights for policy 0, policy_version 674068 (0.0032) [2024-04-28 17:48:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11043946496. Throughput: 0: 55652.2. Samples: 1534300460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:42,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 17:48:44,898][57339] Updated weights for policy 0, policy_version 674078 (0.0031) [2024-04-28 17:48:47,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11044225024. Throughput: 0: 55665.0. Samples: 1534635180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 25.0) [2024-04-28 17:48:47,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 17:48:47,746][57339] Updated weights for policy 0, policy_version 674088 (0.0029) [2024-04-28 17:48:50,591][57339] Updated weights for policy 0, policy_version 674098 (0.0027) [2024-04-28 17:48:52,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 11044503552. Throughput: 0: 55643.6. Samples: 1534802640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:48:52,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 17:48:53,403][57339] Updated weights for policy 0, policy_version 674108 (0.0031) [2024-04-28 17:48:56,344][57339] Updated weights for policy 0, policy_version 674118 (0.0027) [2024-04-28 17:48:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11044782080. Throughput: 0: 55683.0. Samples: 1535138920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:48:57,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 17:48:59,352][57339] Updated weights for policy 0, policy_version 674128 (0.0029) [2024-04-28 17:49:02,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11045060608. Throughput: 0: 55852.4. Samples: 1535476600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:02,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 17:49:02,198][57339] Updated weights for policy 0, policy_version 674138 (0.0029) [2024-04-28 17:49:05,221][57339] Updated weights for policy 0, policy_version 674148 (0.0030) [2024-04-28 17:49:07,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11045355520. Throughput: 0: 56027.3. Samples: 1535646600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:07,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 17:49:08,265][57339] Updated weights for policy 0, policy_version 674158 (0.0027) [2024-04-28 17:49:11,008][57339] Updated weights for policy 0, policy_version 674168 (0.0026) [2024-04-28 17:49:12,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 11045634048. Throughput: 0: 56048.1. Samples: 1535981320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:12,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 17:49:13,998][57339] Updated weights for policy 0, policy_version 674178 (0.0025) [2024-04-28 17:49:16,752][57339] Updated weights for policy 0, policy_version 674188 (0.0030) [2024-04-28 17:49:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 11045912576. Throughput: 0: 56054.6. Samples: 1536316320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:17,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 17:49:19,776][57339] Updated weights for policy 0, policy_version 674198 (0.0029) [2024-04-28 17:49:22,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 11046191104. Throughput: 0: 56118.7. Samples: 1536488540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:22,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 17:49:22,606][57339] Updated weights for policy 0, policy_version 674208 (0.0033) [2024-04-28 17:49:25,616][57339] Updated weights for policy 0, policy_version 674218 (0.0030) [2024-04-28 17:49:27,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11046453248. Throughput: 0: 56021.1. Samples: 1536821420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:27,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 17:49:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000674222_11046453248.pth... [2024-04-28 17:49:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000673408_11033116672.pth [2024-04-28 17:49:28,390][57319] Signal inference workers to stop experience collection... (22900 times) [2024-04-28 17:49:28,423][57339] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-04-28 17:49:28,446][57319] Signal inference workers to resume experience collection... (22900 times) [2024-04-28 17:49:28,447][57339] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-04-28 17:49:28,567][57339] Updated weights for policy 0, policy_version 674228 (0.0030) [2024-04-28 17:49:31,548][57339] Updated weights for policy 0, policy_version 674238 (0.0025) [2024-04-28 17:49:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11046731776. Throughput: 0: 55987.1. 
Samples: 1537154600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:32,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 17:49:34,410][57339] Updated weights for policy 0, policy_version 674248 (0.0026) [2024-04-28 17:49:37,169][57108] Fps is (10 sec: 57345.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11047026688. Throughput: 0: 56004.0. Samples: 1537322820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:37,169][57108] Avg episode reward: [(0, '0.512')] [2024-04-28 17:49:37,372][57339] Updated weights for policy 0, policy_version 674258 (0.0029) [2024-04-28 17:49:40,289][57339] Updated weights for policy 0, policy_version 674268 (0.0026) [2024-04-28 17:49:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11047305216. Throughput: 0: 55959.2. Samples: 1537657080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:42,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 17:49:43,230][57339] Updated weights for policy 0, policy_version 674278 (0.0025) [2024-04-28 17:49:46,101][57339] Updated weights for policy 0, policy_version 674288 (0.0025) [2024-04-28 17:49:47,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11047583744. Throughput: 0: 55895.0. Samples: 1537991880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:47,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:49:49,101][57339] Updated weights for policy 0, policy_version 674298 (0.0031) [2024-04-28 17:49:51,963][57339] Updated weights for policy 0, policy_version 674308 (0.0032) [2024-04-28 17:49:52,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11047862272. Throughput: 0: 55854.8. Samples: 1538160060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:52,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 17:49:55,061][57339] Updated weights for policy 0, policy_version 674318 (0.0029) [2024-04-28 17:49:57,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11048140800. Throughput: 0: 55836.8. Samples: 1538493980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:49:57,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 17:49:57,801][57339] Updated weights for policy 0, policy_version 674328 (0.0033) [2024-04-28 17:50:00,967][57339] Updated weights for policy 0, policy_version 674338 (0.0026) [2024-04-28 17:50:02,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11048419328. Throughput: 0: 55902.3. Samples: 1538831920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:50:02,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 17:50:03,752][57339] Updated weights for policy 0, policy_version 674348 (0.0031) [2024-04-28 17:50:06,915][57339] Updated weights for policy 0, policy_version 674358 (0.0034) [2024-04-28 17:50:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11048681472. Throughput: 0: 55675.6. Samples: 1538993940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:50:07,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 17:50:09,766][57339] Updated weights for policy 0, policy_version 674368 (0.0033) [2024-04-28 17:50:12,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11048976384. Throughput: 0: 55655.7. Samples: 1539325920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:50:12,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 17:50:12,702][57339] Updated weights for policy 0, policy_version 674378 (0.0031) [2024-04-28 17:50:15,734][57339] Updated weights for policy 0, policy_version 674388 (0.0033) [2024-04-28 17:50:17,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 11049271296. Throughput: 0: 55683.6. Samples: 1539660360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:50:17,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 17:50:18,768][57339] Updated weights for policy 0, policy_version 674398 (0.0027) [2024-04-28 17:50:21,436][57339] Updated weights for policy 0, policy_version 674408 (0.0033) [2024-04-28 17:50:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11049533440. Throughput: 0: 55670.9. Samples: 1539828020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:50:22,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:50:24,698][57339] Updated weights for policy 0, policy_version 674418 (0.0029) [2024-04-28 17:50:27,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 11049811968. Throughput: 0: 55701.8. Samples: 1540163660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 17:50:27,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 17:50:27,192][57339] Updated weights for policy 0, policy_version 674428 (0.0034) [2024-04-28 17:50:30,530][57339] Updated weights for policy 0, policy_version 674438 (0.0027) [2024-04-28 17:50:32,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11050090496. Throughput: 0: 55673.1. Samples: 1540497160. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:50:32,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 17:50:33,214][57339] Updated weights for policy 0, policy_version 674448 (0.0027) [2024-04-28 17:50:36,420][57339] Updated weights for policy 0, policy_version 674458 (0.0031) [2024-04-28 17:50:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 11050385408. Throughput: 0: 55566.1. Samples: 1540660540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:50:37,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 17:50:39,202][57339] Updated weights for policy 0, policy_version 674468 (0.0032) [2024-04-28 17:50:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11050631168. Throughput: 0: 55648.8. Samples: 1540998180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:50:42,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 17:50:42,302][57339] Updated weights for policy 0, policy_version 674478 (0.0031) [2024-04-28 17:50:42,710][57319] Signal inference workers to stop experience collection... (22950 times) [2024-04-28 17:50:42,756][57339] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-04-28 17:50:42,766][57319] Signal inference workers to resume experience collection... (22950 times) [2024-04-28 17:50:42,772][57339] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-04-28 17:50:45,025][57339] Updated weights for policy 0, policy_version 674488 (0.0032) [2024-04-28 17:50:47,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11050926080. Throughput: 0: 55578.0. Samples: 1541332940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:50:47,170][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 17:50:48,213][57339] Updated weights for policy 0, policy_version 674498 (0.0032) [2024-04-28 17:50:50,817][57339] Updated weights for policy 0, policy_version 674508 (0.0029) [2024-04-28 17:50:52,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 11051220992. Throughput: 0: 55836.0. Samples: 1541506560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:50:52,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:50:53,995][57339] Updated weights for policy 0, policy_version 674518 (0.0035) [2024-04-28 17:50:56,848][57339] Updated weights for policy 0, policy_version 674528 (0.0029) [2024-04-28 17:50:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11051483136. Throughput: 0: 55938.2. Samples: 1541843140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:50:57,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 17:50:59,712][57339] Updated weights for policy 0, policy_version 674538 (0.0026) [2024-04-28 17:51:02,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 11051778048. Throughput: 0: 55914.9. Samples: 1542176540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:02,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 17:51:02,714][57339] Updated weights for policy 0, policy_version 674548 (0.0028) [2024-04-28 17:51:05,602][57339] Updated weights for policy 0, policy_version 674558 (0.0033) [2024-04-28 17:51:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11052040192. Throughput: 0: 56029.4. Samples: 1542349340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:07,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 17:51:08,446][57339] Updated weights for policy 0, policy_version 674568 (0.0029) [2024-04-28 17:51:11,516][57339] Updated weights for policy 0, policy_version 674578 (0.0032) [2024-04-28 17:51:12,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 11052335104. Throughput: 0: 56073.8. Samples: 1542686980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:12,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 17:51:14,286][57339] Updated weights for policy 0, policy_version 674588 (0.0028) [2024-04-28 17:51:17,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11052580864. Throughput: 0: 56127.1. Samples: 1543022880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:17,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 17:51:17,308][57339] Updated weights for policy 0, policy_version 674598 (0.0027) [2024-04-28 17:51:20,175][57339] Updated weights for policy 0, policy_version 674608 (0.0026) [2024-04-28 17:51:22,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11052875776. Throughput: 0: 55957.0. Samples: 1543178600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:22,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 17:51:23,190][57339] Updated weights for policy 0, policy_version 674618 (0.0030) [2024-04-28 17:51:26,030][57339] Updated weights for policy 0, policy_version 674628 (0.0028) [2024-04-28 17:51:27,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 11053170688. Throughput: 0: 55979.2. Samples: 1543517240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:27,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 17:51:27,233][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000674633_11053187072.pth... [2024-04-28 17:51:27,276][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000673816_11039801344.pth [2024-04-28 17:51:28,965][57339] Updated weights for policy 0, policy_version 674638 (0.0029) [2024-04-28 17:51:31,807][57339] Updated weights for policy 0, policy_version 674648 (0.0028) [2024-04-28 17:51:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11053432832. Throughput: 0: 56038.9. Samples: 1543854680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:32,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 17:51:34,847][57339] Updated weights for policy 0, policy_version 674658 (0.0034) [2024-04-28 17:51:37,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 11053727744. Throughput: 0: 55964.7. Samples: 1544024980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:37,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 17:51:37,471][57319] Signal inference workers to stop experience collection... (23000 times) [2024-04-28 17:51:37,472][57319] Signal inference workers to resume experience collection... (23000 times) [2024-04-28 17:51:37,505][57339] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-04-28 17:51:37,505][57339] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-04-28 17:51:37,577][57339] Updated weights for policy 0, policy_version 674668 (0.0027) [2024-04-28 17:51:40,626][57339] Updated weights for policy 0, policy_version 674678 (0.0025) [2024-04-28 17:51:42,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11053989888. Throughput: 0: 55828.0. 
Samples: 1544355400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:42,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 17:51:43,530][57339] Updated weights for policy 0, policy_version 674688 (0.0031) [2024-04-28 17:51:46,451][57339] Updated weights for policy 0, policy_version 674698 (0.0028) [2024-04-28 17:51:47,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55816.6). Total num frames: 11054301184. Throughput: 0: 55762.1. Samples: 1544685840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:47,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 17:51:49,439][57339] Updated weights for policy 0, policy_version 674708 (0.0032) [2024-04-28 17:51:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 11054546944. Throughput: 0: 55588.8. Samples: 1544850840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:52,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 17:51:52,410][57339] Updated weights for policy 0, policy_version 674718 (0.0027) [2024-04-28 17:51:55,230][57339] Updated weights for policy 0, policy_version 674728 (0.0029) [2024-04-28 17:51:57,169][57108] Fps is (10 sec: 52430.3, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11054825472. Throughput: 0: 55530.4. Samples: 1545185840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:51:57,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 17:51:58,257][57339] Updated weights for policy 0, policy_version 674738 (0.0030) [2024-04-28 17:52:01,158][57339] Updated weights for policy 0, policy_version 674748 (0.0029) [2024-04-28 17:52:02,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11055120384. Throughput: 0: 55424.8. Samples: 1545517000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:52:02,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 17:52:04,049][57339] Updated weights for policy 0, policy_version 674758 (0.0029) [2024-04-28 17:52:07,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11055382528. Throughput: 0: 55604.3. Samples: 1545680800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-04-28 17:52:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:52:07,240][57339] Updated weights for policy 0, policy_version 674768 (0.0028) [2024-04-28 17:52:09,942][57339] Updated weights for policy 0, policy_version 674778 (0.0025) [2024-04-28 17:52:12,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 11055661056. Throughput: 0: 55508.9. Samples: 1546015140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:12,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 17:52:13,103][57339] Updated weights for policy 0, policy_version 674788 (0.0028) [2024-04-28 17:52:15,864][57339] Updated weights for policy 0, policy_version 674798 (0.0028) [2024-04-28 17:52:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11055939584. Throughput: 0: 55516.3. Samples: 1546352920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:17,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 17:52:18,867][57339] Updated weights for policy 0, policy_version 674808 (0.0032) [2024-04-28 17:52:21,663][57339] Updated weights for policy 0, policy_version 674818 (0.0038) [2024-04-28 17:52:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11056218112. Throughput: 0: 55501.9. Samples: 1546522560. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:22,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 17:52:24,815][57339] Updated weights for policy 0, policy_version 674828 (0.0024) [2024-04-28 17:52:27,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11056496640. Throughput: 0: 55554.7. Samples: 1546855360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:27,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 17:52:27,444][57339] Updated weights for policy 0, policy_version 674838 (0.0029) [2024-04-28 17:52:30,669][57339] Updated weights for policy 0, policy_version 674848 (0.0031) [2024-04-28 17:52:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11056775168. Throughput: 0: 55582.9. Samples: 1547187060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:32,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 17:52:32,917][57319] Signal inference workers to stop experience collection... (23050 times) [2024-04-28 17:52:32,952][57339] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-04-28 17:52:33,008][57319] Signal inference workers to resume experience collection... (23050 times) [2024-04-28 17:52:33,008][57339] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-04-28 17:52:33,392][57339] Updated weights for policy 0, policy_version 674858 (0.0031) [2024-04-28 17:52:36,596][57339] Updated weights for policy 0, policy_version 674868 (0.0034) [2024-04-28 17:52:37,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11057053696. Throughput: 0: 55629.5. Samples: 1547354160. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:37,169][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 17:52:39,370][57339] Updated weights for policy 0, policy_version 674878 (0.0027) [2024-04-28 17:52:42,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 11057332224. Throughput: 0: 55651.4. Samples: 1547690160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:42,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 17:52:42,510][57339] Updated weights for policy 0, policy_version 674888 (0.0031) [2024-04-28 17:52:45,255][57339] Updated weights for policy 0, policy_version 674898 (0.0034) [2024-04-28 17:52:47,169][57108] Fps is (10 sec: 54066.8, 60 sec: 54886.5, 300 sec: 55761.1). Total num frames: 11057594368. Throughput: 0: 55741.7. Samples: 1548025380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:47,169][57108] Avg episode reward: [(0, '0.691')] [2024-04-28 17:52:48,446][57339] Updated weights for policy 0, policy_version 674908 (0.0027) [2024-04-28 17:52:51,062][57339] Updated weights for policy 0, policy_version 674918 (0.0024) [2024-04-28 17:52:52,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11057872896. Throughput: 0: 55653.2. Samples: 1548185200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:52,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 17:52:54,501][57339] Updated weights for policy 0, policy_version 674928 (0.0031) [2024-04-28 17:52:57,046][57339] Updated weights for policy 0, policy_version 674938 (0.0025) [2024-04-28 17:52:57,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55978.4, 300 sec: 55761.1). Total num frames: 11058184192. Throughput: 0: 55641.5. Samples: 1548519020. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:52:57,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 17:53:00,531][57339] Updated weights for policy 0, policy_version 674948 (0.0035) [2024-04-28 17:53:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11058446336. Throughput: 0: 55482.3. Samples: 1548849620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:02,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 17:53:03,035][57339] Updated weights for policy 0, policy_version 674958 (0.0027) [2024-04-28 17:53:06,422][57339] Updated weights for policy 0, policy_version 674968 (0.0029) [2024-04-28 17:53:07,169][57108] Fps is (10 sec: 50792.0, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 11058692096. Throughput: 0: 55485.5. Samples: 1549019400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:07,169][57108] Avg episode reward: [(0, '0.723')] [2024-04-28 17:53:09,017][57339] Updated weights for policy 0, policy_version 674978 (0.0027) [2024-04-28 17:53:12,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11058970624. Throughput: 0: 55430.7. Samples: 1549349740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:12,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 17:53:12,312][57339] Updated weights for policy 0, policy_version 674988 (0.0028) [2024-04-28 17:53:14,825][57339] Updated weights for policy 0, policy_version 674998 (0.0029) [2024-04-28 17:53:17,169][57108] Fps is (10 sec: 57342.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11059265536. Throughput: 0: 55488.8. Samples: 1549684060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:17,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 17:53:18,157][57339] Updated weights for policy 0, policy_version 675008 (0.0029) [2024-04-28 17:53:20,731][57339] Updated weights for policy 0, policy_version 675018 (0.0032) [2024-04-28 17:53:22,169][57108] Fps is (10 sec: 57342.7, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11059544064. Throughput: 0: 55355.8. Samples: 1549845180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:22,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 17:53:24,086][57339] Updated weights for policy 0, policy_version 675028 (0.0025) [2024-04-28 17:53:26,763][57339] Updated weights for policy 0, policy_version 675038 (0.0030) [2024-04-28 17:53:27,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11059822592. Throughput: 0: 55251.0. Samples: 1550176460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:27,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 17:53:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000675038_11059822592.pth... [2024-04-28 17:53:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000674222_11046453248.pth [2024-04-28 17:53:30,234][57339] Updated weights for policy 0, policy_version 675048 (0.0025) [2024-04-28 17:53:32,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11060117504. Throughput: 0: 55146.1. Samples: 1550506960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:32,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 17:53:32,756][57339] Updated weights for policy 0, policy_version 675058 (0.0028) [2024-04-28 17:53:35,914][57339] Updated weights for policy 0, policy_version 675068 (0.0029) [2024-04-28 17:53:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55705.6). 
Total num frames: 11060379648. Throughput: 0: 55331.2. Samples: 1550675100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:37,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 17:53:38,473][57339] Updated weights for policy 0, policy_version 675078 (0.0027) [2024-04-28 17:53:41,801][57339] Updated weights for policy 0, policy_version 675088 (0.0025) [2024-04-28 17:53:42,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 11060641792. Throughput: 0: 55293.9. Samples: 1551007240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:42,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:53:44,681][57339] Updated weights for policy 0, policy_version 675098 (0.0030) [2024-04-28 17:53:47,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11060920320. Throughput: 0: 55425.3. Samples: 1551343760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-04-28 17:53:47,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 17:53:47,681][57339] Updated weights for policy 0, policy_version 675108 (0.0028) [2024-04-28 17:53:50,427][57339] Updated weights for policy 0, policy_version 675118 (0.0029) [2024-04-28 17:53:52,126][57319] Signal inference workers to stop experience collection... (23100 times) [2024-04-28 17:53:52,126][57319] Signal inference workers to resume experience collection... (23100 times) [2024-04-28 17:53:52,151][57339] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-04-28 17:53:52,151][57339] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-04-28 17:53:52,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11061215232. Throughput: 0: 55318.5. Samples: 1551508740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:53:52,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 17:53:53,476][57339] Updated weights for policy 0, policy_version 675128 (0.0024) [2024-04-28 17:53:56,329][57339] Updated weights for policy 0, policy_version 675138 (0.0032) [2024-04-28 17:53:57,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 11061493760. Throughput: 0: 55477.6. Samples: 1551846240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:53:57,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 17:53:59,398][57339] Updated weights for policy 0, policy_version 675148 (0.0034) [2024-04-28 17:54:02,119][57339] Updated weights for policy 0, policy_version 675158 (0.0025) [2024-04-28 17:54:02,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11061788672. Throughput: 0: 55410.8. Samples: 1552177540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:02,169][57108] Avg episode reward: [(0, '0.507')] [2024-04-28 17:54:05,382][57339] Updated weights for policy 0, policy_version 675168 (0.0030) [2024-04-28 17:54:07,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11062067200. Throughput: 0: 55894.9. Samples: 1552360440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:07,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 17:54:07,999][57339] Updated weights for policy 0, policy_version 675178 (0.0028) [2024-04-28 17:54:11,136][57339] Updated weights for policy 0, policy_version 675188 (0.0029) [2024-04-28 17:54:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11062329344. Throughput: 0: 55878.7. Samples: 1552691000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:12,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 17:54:13,952][57339] Updated weights for policy 0, policy_version 675198 (0.0031) [2024-04-28 17:54:16,855][57339] Updated weights for policy 0, policy_version 675208 (0.0029) [2024-04-28 17:54:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 11062607872. Throughput: 0: 55886.9. Samples: 1553021860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:17,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 17:54:19,757][57339] Updated weights for policy 0, policy_version 675218 (0.0030) [2024-04-28 17:54:22,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 11062870016. Throughput: 0: 55715.6. Samples: 1553182300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:22,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 17:54:22,807][57339] Updated weights for policy 0, policy_version 675228 (0.0028) [2024-04-28 17:54:26,086][57339] Updated weights for policy 0, policy_version 675238 (0.0028) [2024-04-28 17:54:27,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11063164928. Throughput: 0: 55886.3. Samples: 1553522120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:27,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 17:54:28,760][57339] Updated weights for policy 0, policy_version 675248 (0.0028) [2024-04-28 17:54:31,870][57339] Updated weights for policy 0, policy_version 675258 (0.0029) [2024-04-28 17:54:32,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11063443456. Throughput: 0: 55886.6. Samples: 1553858660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:32,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 17:54:34,522][57339] Updated weights for policy 0, policy_version 675268 (0.0028) [2024-04-28 17:54:37,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 11063721984. Throughput: 0: 55744.8. Samples: 1554017260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:37,170][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 17:54:37,545][57339] Updated weights for policy 0, policy_version 675278 (0.0031) [2024-04-28 17:54:40,317][57339] Updated weights for policy 0, policy_version 675288 (0.0025) [2024-04-28 17:54:42,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11064000512. Throughput: 0: 55776.4. Samples: 1554356180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:42,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 17:54:43,417][57339] Updated weights for policy 0, policy_version 675298 (0.0031) [2024-04-28 17:54:46,376][57339] Updated weights for policy 0, policy_version 675308 (0.0027) [2024-04-28 17:54:47,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 11064295424. Throughput: 0: 55935.6. Samples: 1554694640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:47,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 17:54:49,396][57339] Updated weights for policy 0, policy_version 675318 (0.0037) [2024-04-28 17:54:52,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11064557568. Throughput: 0: 55463.4. Samples: 1554856300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:52,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:54:52,590][57319] Signal inference workers to stop experience collection... 
(23150 times) [2024-04-28 17:54:52,591][57319] Signal inference workers to resume experience collection... (23150 times) [2024-04-28 17:54:52,604][57339] Updated weights for policy 0, policy_version 675328 (0.0026) [2024-04-28 17:54:52,620][57339] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-04-28 17:54:52,620][57339] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-04-28 17:54:55,123][57339] Updated weights for policy 0, policy_version 675338 (0.0026) [2024-04-28 17:54:57,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 11064836096. Throughput: 0: 55514.2. Samples: 1555189140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:54:57,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 17:54:58,347][57339] Updated weights for policy 0, policy_version 675348 (0.0025) [2024-04-28 17:55:00,896][57339] Updated weights for policy 0, policy_version 675358 (0.0034) [2024-04-28 17:55:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11065114624. Throughput: 0: 55519.5. Samples: 1555520240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:55:02,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 17:55:04,271][57339] Updated weights for policy 0, policy_version 675368 (0.0026) [2024-04-28 17:55:06,891][57339] Updated weights for policy 0, policy_version 675378 (0.0032) [2024-04-28 17:55:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11065409536. Throughput: 0: 55716.9. Samples: 1555689560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:55:07,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 17:55:10,044][57339] Updated weights for policy 0, policy_version 675388 (0.0027) [2024-04-28 17:55:12,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11065655296. Throughput: 0: 55582.7. 
Samples: 1556023340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:55:12,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 17:55:12,878][57339] Updated weights for policy 0, policy_version 675398 (0.0030) [2024-04-28 17:55:15,856][57339] Updated weights for policy 0, policy_version 675408 (0.0030) [2024-04-28 17:55:17,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11065933824. Throughput: 0: 55601.8. Samples: 1556360740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:55:17,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 17:55:18,689][57339] Updated weights for policy 0, policy_version 675418 (0.0027) [2024-04-28 17:55:21,725][57339] Updated weights for policy 0, policy_version 675428 (0.0035) [2024-04-28 17:55:22,169][57108] Fps is (10 sec: 58983.1, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11066245120. Throughput: 0: 55873.1. Samples: 1556531540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-04-28 17:55:22,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 17:55:24,537][57339] Updated weights for policy 0, policy_version 675438 (0.0030) [2024-04-28 17:55:27,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11066507264. Throughput: 0: 55808.6. Samples: 1556867560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:55:27,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:55:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000675446_11066507264.pth... 
[2024-04-28 17:55:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000674633_11053187072.pth [2024-04-28 17:55:27,563][57339] Updated weights for policy 0, policy_version 675448 (0.0027) [2024-04-28 17:55:30,425][57339] Updated weights for policy 0, policy_version 675458 (0.0026) [2024-04-28 17:55:32,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 11066785792. Throughput: 0: 55761.2. Samples: 1557203900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:55:32,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 17:55:33,237][57339] Updated weights for policy 0, policy_version 675468 (0.0027) [2024-04-28 17:55:36,103][57339] Updated weights for policy 0, policy_version 675478 (0.0027) [2024-04-28 17:55:37,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11067080704. Throughput: 0: 55954.6. Samples: 1557374260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:55:37,169][57108] Avg episode reward: [(0, '0.496')] [2024-04-28 17:55:39,097][57339] Updated weights for policy 0, policy_version 675488 (0.0029) [2024-04-28 17:55:42,048][57339] Updated weights for policy 0, policy_version 675498 (0.0029) [2024-04-28 17:55:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11067359232. Throughput: 0: 56007.6. Samples: 1557709480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:55:42,169][57108] Avg episode reward: [(0, '0.496')] [2024-04-28 17:55:45,001][57339] Updated weights for policy 0, policy_version 675508 (0.0024) [2024-04-28 17:55:46,587][57319] Signal inference workers to stop experience collection... (23200 times) [2024-04-28 17:55:46,594][57319] Signal inference workers to resume experience collection... 
(23200 times) [2024-04-28 17:55:46,618][57339] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-04-28 17:55:46,618][57339] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-04-28 17:55:47,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11067637760. Throughput: 0: 56091.5. Samples: 1558044360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:55:47,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 17:55:47,984][57339] Updated weights for policy 0, policy_version 675518 (0.0030) [2024-04-28 17:55:50,721][57339] Updated weights for policy 0, policy_version 675528 (0.0028) [2024-04-28 17:55:52,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11067899904. Throughput: 0: 56003.6. Samples: 1558209720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:55:52,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 17:55:53,965][57339] Updated weights for policy 0, policy_version 675538 (0.0031) [2024-04-28 17:55:56,654][57339] Updated weights for policy 0, policy_version 675548 (0.0026) [2024-04-28 17:55:57,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11068194816. Throughput: 0: 56126.6. Samples: 1558549040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:55:57,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:55:59,792][57339] Updated weights for policy 0, policy_version 675558 (0.0037) [2024-04-28 17:56:02,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 11068473344. Throughput: 0: 56051.5. Samples: 1558883060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:02,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 17:56:02,509][57339] Updated weights for policy 0, policy_version 675568 (0.0032) [2024-04-28 17:56:05,546][57339] Updated weights for policy 0, policy_version 675578 (0.0033) [2024-04-28 17:56:07,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 11068751872. Throughput: 0: 55947.8. Samples: 1559049200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:07,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 17:56:08,291][57339] Updated weights for policy 0, policy_version 675588 (0.0026) [2024-04-28 17:56:11,466][57339] Updated weights for policy 0, policy_version 675598 (0.0029) [2024-04-28 17:56:12,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11069014016. Throughput: 0: 55853.3. Samples: 1559380960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:12,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 17:56:14,277][57339] Updated weights for policy 0, policy_version 675608 (0.0031) [2024-04-28 17:56:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 11069292544. Throughput: 0: 55876.9. Samples: 1559718360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:17,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 17:56:17,398][57339] Updated weights for policy 0, policy_version 675618 (0.0025) [2024-04-28 17:56:20,046][57339] Updated weights for policy 0, policy_version 675628 (0.0025) [2024-04-28 17:56:22,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11069603840. Throughput: 0: 55843.7. Samples: 1559887220. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:22,170][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 17:56:23,209][57339] Updated weights for policy 0, policy_version 675638 (0.0033) [2024-04-28 17:56:26,039][57339] Updated weights for policy 0, policy_version 675648 (0.0027) [2024-04-28 17:56:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11069865984. Throughput: 0: 55828.0. Samples: 1560221740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:27,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:56:28,886][57339] Updated weights for policy 0, policy_version 675658 (0.0029) [2024-04-28 17:56:32,015][57339] Updated weights for policy 0, policy_version 675668 (0.0025) [2024-04-28 17:56:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11070144512. Throughput: 0: 55817.7. Samples: 1560556160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:32,170][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 17:56:34,778][57339] Updated weights for policy 0, policy_version 675678 (0.0031) [2024-04-28 17:56:37,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11070439424. Throughput: 0: 55856.7. Samples: 1560723280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:37,170][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 17:56:37,779][57339] Updated weights for policy 0, policy_version 675688 (0.0036) [2024-04-28 17:56:40,719][57339] Updated weights for policy 0, policy_version 675698 (0.0026) [2024-04-28 17:56:42,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11070685184. Throughput: 0: 55764.6. Samples: 1561058440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:42,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 17:56:43,500][57339] Updated weights for policy 0, policy_version 675708 (0.0028) [2024-04-28 17:56:46,551][57339] Updated weights for policy 0, policy_version 675718 (0.0031) [2024-04-28 17:56:47,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11070980096. Throughput: 0: 55661.0. Samples: 1561387800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:47,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 17:56:48,659][57319] Signal inference workers to stop experience collection... (23250 times) [2024-04-28 17:56:48,675][57339] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-04-28 17:56:48,757][57319] Signal inference workers to resume experience collection... (23250 times) [2024-04-28 17:56:48,758][57339] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-04-28 17:56:49,433][57339] Updated weights for policy 0, policy_version 675728 (0.0027) [2024-04-28 17:56:52,169][57108] Fps is (10 sec: 58981.6, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 11071275008. Throughput: 0: 55623.6. Samples: 1561552260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:52,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 17:56:52,322][57339] Updated weights for policy 0, policy_version 675738 (0.0029) [2024-04-28 17:56:55,262][57339] Updated weights for policy 0, policy_version 675748 (0.0036) [2024-04-28 17:56:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11071537152. Throughput: 0: 55780.0. Samples: 1561891060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:56:57,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 17:56:58,121][57339] Updated weights for policy 0, policy_version 675758 (0.0026) [2024-04-28 17:57:01,386][57339] Updated weights for policy 0, policy_version 675768 (0.0032) [2024-04-28 17:57:02,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 11071815680. Throughput: 0: 55831.3. Samples: 1562230760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-04-28 17:57:02,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 17:57:03,952][57339] Updated weights for policy 0, policy_version 675778 (0.0035) [2024-04-28 17:57:07,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11072077824. Throughput: 0: 55824.9. Samples: 1562399340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 17:57:07,340][57339] Updated weights for policy 0, policy_version 675788 (0.0029) [2024-04-28 17:57:09,726][57339] Updated weights for policy 0, policy_version 675798 (0.0035) [2024-04-28 17:57:12,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11072372736. Throughput: 0: 55891.4. Samples: 1562736860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:12,170][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 17:57:13,054][57339] Updated weights for policy 0, policy_version 675808 (0.0027) [2024-04-28 17:57:15,482][57339] Updated weights for policy 0, policy_version 675818 (0.0028) [2024-04-28 17:57:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11072634880. Throughput: 0: 55840.1. Samples: 1563068960. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:17,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 17:57:18,927][57339] Updated weights for policy 0, policy_version 675828 (0.0029) [2024-04-28 17:57:21,584][57339] Updated weights for policy 0, policy_version 675838 (0.0026) [2024-04-28 17:57:22,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11072929792. Throughput: 0: 55898.4. Samples: 1563238700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:22,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 17:57:24,842][57339] Updated weights for policy 0, policy_version 675848 (0.0027) [2024-04-28 17:57:27,169][57108] Fps is (10 sec: 58980.9, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 11073224704. Throughput: 0: 55878.3. Samples: 1563572980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:27,170][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 17:57:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000675856_11073224704.pth... [2024-04-28 17:57:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000675038_11059822592.pth [2024-04-28 17:57:27,691][57339] Updated weights for policy 0, policy_version 675858 (0.0024) [2024-04-28 17:57:30,569][57339] Updated weights for policy 0, policy_version 675868 (0.0031) [2024-04-28 17:57:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11073486848. Throughput: 0: 55927.2. Samples: 1563904520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:32,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 17:57:33,751][57339] Updated weights for policy 0, policy_version 675878 (0.0034) [2024-04-28 17:57:36,351][57339] Updated weights for policy 0, policy_version 675888 (0.0031) [2024-04-28 17:57:37,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55432.6, 300 sec: 55705.6). 
Total num frames: 11073765376. Throughput: 0: 56069.4. Samples: 1564075380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:37,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 17:57:38,468][57319] Signal inference workers to stop experience collection... (23300 times) [2024-04-28 17:57:38,506][57339] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-04-28 17:57:38,560][57319] Signal inference workers to resume experience collection... (23300 times) [2024-04-28 17:57:38,560][57339] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-04-28 17:57:39,680][57339] Updated weights for policy 0, policy_version 675898 (0.0028) [2024-04-28 17:57:42,147][57339] Updated weights for policy 0, policy_version 675908 (0.0033) [2024-04-28 17:57:42,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 11074076672. Throughput: 0: 56014.2. Samples: 1564411700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:42,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 17:57:45,383][57339] Updated weights for policy 0, policy_version 675918 (0.0027) [2024-04-28 17:57:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11074322432. Throughput: 0: 55914.1. Samples: 1564746900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:47,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 17:57:48,367][57339] Updated weights for policy 0, policy_version 675928 (0.0025) [2024-04-28 17:57:51,272][57339] Updated weights for policy 0, policy_version 675938 (0.0023) [2024-04-28 17:57:52,169][57108] Fps is (10 sec: 50790.4, 60 sec: 55159.5, 300 sec: 55594.6). Total num frames: 11074584576. Throughput: 0: 55697.2. Samples: 1564905720. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:52,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 17:57:54,170][57339] Updated weights for policy 0, policy_version 675948 (0.0033) [2024-04-28 17:57:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11074879488. Throughput: 0: 55533.1. Samples: 1565235840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:57:57,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 17:57:57,221][57339] Updated weights for policy 0, policy_version 675958 (0.0037) [2024-04-28 17:58:00,346][57339] Updated weights for policy 0, policy_version 675968 (0.0025) [2024-04-28 17:58:02,169][57108] Fps is (10 sec: 60621.3, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 11075190784. Throughput: 0: 55496.5. Samples: 1565566300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:58:02,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 17:58:03,084][57339] Updated weights for policy 0, policy_version 675978 (0.0033) [2024-04-28 17:58:06,309][57339] Updated weights for policy 0, policy_version 675988 (0.0028) [2024-04-28 17:58:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11075436544. Throughput: 0: 55512.0. Samples: 1565736740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:58:07,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 17:58:08,856][57339] Updated weights for policy 0, policy_version 675998 (0.0028) [2024-04-28 17:58:12,134][57339] Updated weights for policy 0, policy_version 676008 (0.0026) [2024-04-28 17:58:12,169][57108] Fps is (10 sec: 52427.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11075715072. Throughput: 0: 55486.8. Samples: 1566069880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:58:12,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 17:58:14,832][57339] Updated weights for policy 0, policy_version 676018 (0.0026) [2024-04-28 17:58:17,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11076009984. Throughput: 0: 55552.0. Samples: 1566404360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:58:17,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:58:17,901][57339] Updated weights for policy 0, policy_version 676028 (0.0032) [2024-04-28 17:58:20,999][57339] Updated weights for policy 0, policy_version 676038 (0.0031) [2024-04-28 17:58:22,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11076288512. Throughput: 0: 55460.5. Samples: 1566571100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:58:22,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 17:58:23,856][57339] Updated weights for policy 0, policy_version 676048 (0.0027) [2024-04-28 17:58:27,124][57339] Updated weights for policy 0, policy_version 676058 (0.0025) [2024-04-28 17:58:27,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 11076534272. Throughput: 0: 55507.1. Samples: 1566909520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:58:27,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 17:58:29,843][57339] Updated weights for policy 0, policy_version 676068 (0.0034) [2024-04-28 17:58:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11076829184. Throughput: 0: 55408.5. Samples: 1567240280. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:58:32,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 17:58:32,817][57339] Updated weights for policy 0, policy_version 676078 (0.0027) [2024-04-28 17:58:33,683][57319] Signal inference workers to stop experience collection... (23350 times) [2024-04-28 17:58:33,715][57339] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-04-28 17:58:33,743][57319] Signal inference workers to resume experience collection... (23350 times) [2024-04-28 17:58:33,744][57339] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-04-28 17:58:35,541][57339] Updated weights for policy 0, policy_version 676088 (0.0028) [2024-04-28 17:58:37,169][57108] Fps is (10 sec: 60621.0, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 11077140480. Throughput: 0: 55832.9. Samples: 1567418200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:58:37,170][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 17:58:38,586][57339] Updated weights for policy 0, policy_version 676098 (0.0025) [2024-04-28 17:58:41,434][57339] Updated weights for policy 0, policy_version 676108 (0.0032) [2024-04-28 17:58:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 11077402624. Throughput: 0: 56018.2. Samples: 1567756660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-04-28 17:58:42,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 17:58:44,493][57339] Updated weights for policy 0, policy_version 676118 (0.0029) [2024-04-28 17:58:47,169][57108] Fps is (10 sec: 50791.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11077648384. Throughput: 0: 56127.6. Samples: 1568092040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:58:47,169][57108] Avg episode reward: [(0, '0.704')] [2024-04-28 17:58:47,295][57339] Updated weights for policy 0, policy_version 676128 (0.0026) [2024-04-28 17:58:50,164][57339] Updated weights for policy 0, policy_version 676138 (0.0030) [2024-04-28 17:58:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 11077943296. Throughput: 0: 55978.7. Samples: 1568255780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:58:52,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 17:58:53,194][57339] Updated weights for policy 0, policy_version 676148 (0.0026) [2024-04-28 17:58:55,848][57339] Updated weights for policy 0, policy_version 676158 (0.0028) [2024-04-28 17:58:57,169][57108] Fps is (10 sec: 60619.5, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 11078254592. Throughput: 0: 56113.3. Samples: 1568594980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:58:57,170][57108] Avg episode reward: [(0, '0.491')] [2024-04-28 17:58:58,863][57339] Updated weights for policy 0, policy_version 676168 (0.0032) [2024-04-28 17:59:01,675][57339] Updated weights for policy 0, policy_version 676178 (0.0023) [2024-04-28 17:59:02,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11078516736. Throughput: 0: 56227.1. Samples: 1568934580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:02,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 17:59:04,636][57339] Updated weights for policy 0, policy_version 676188 (0.0024) [2024-04-28 17:59:07,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11078795264. Throughput: 0: 56220.5. Samples: 1569101020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:07,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 17:59:07,837][57339] Updated weights for policy 0, policy_version 676198 (0.0029) [2024-04-28 17:59:10,478][57339] Updated weights for policy 0, policy_version 676208 (0.0027) [2024-04-28 17:59:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 11079090176. Throughput: 0: 56148.8. Samples: 1569436220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:12,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 17:59:13,585][57339] Updated weights for policy 0, policy_version 676218 (0.0031) [2024-04-28 17:59:16,409][57339] Updated weights for policy 0, policy_version 676228 (0.0031) [2024-04-28 17:59:16,910][57319] Signal inference workers to stop experience collection... (23400 times) [2024-04-28 17:59:16,911][57319] Signal inference workers to resume experience collection... (23400 times) [2024-04-28 17:59:16,926][57339] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-04-28 17:59:16,926][57339] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-04-28 17:59:17,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11079368704. Throughput: 0: 56291.6. Samples: 1569773400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:17,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 17:59:19,296][57339] Updated weights for policy 0, policy_version 676238 (0.0028) [2024-04-28 17:59:22,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11079630848. Throughput: 0: 55993.8. Samples: 1569937920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:22,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 17:59:22,208][57339] Updated weights for policy 0, policy_version 676248 (0.0033) [2024-04-28 17:59:25,176][57339] Updated weights for policy 0, policy_version 676258 (0.0028) [2024-04-28 17:59:27,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 11079892992. Throughput: 0: 55900.0. Samples: 1570272160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 17:59:27,216][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000676264_11079909376.pth... [2024-04-28 17:59:27,270][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000675446_11066507264.pth [2024-04-28 17:59:28,218][57339] Updated weights for policy 0, policy_version 676268 (0.0027) [2024-04-28 17:59:31,002][57339] Updated weights for policy 0, policy_version 676278 (0.0027) [2024-04-28 17:59:32,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 11080204288. Throughput: 0: 55826.1. Samples: 1570604220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:32,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 17:59:34,228][57339] Updated weights for policy 0, policy_version 676288 (0.0030) [2024-04-28 17:59:36,885][57339] Updated weights for policy 0, policy_version 676298 (0.0026) [2024-04-28 17:59:37,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11080482816. Throughput: 0: 55875.0. Samples: 1570770160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:37,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 17:59:40,062][57339] Updated weights for policy 0, policy_version 676308 (0.0024) [2024-04-28 17:59:42,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55705.7, 300 sec: 55761.1). 
Total num frames: 11080744960. Throughput: 0: 55755.4. Samples: 1571103960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:42,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 17:59:42,616][57339] Updated weights for policy 0, policy_version 676318 (0.0026) [2024-04-28 17:59:46,000][57339] Updated weights for policy 0, policy_version 676328 (0.0025) [2024-04-28 17:59:47,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 11081039872. Throughput: 0: 55615.2. Samples: 1571437260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:47,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 17:59:48,514][57339] Updated weights for policy 0, policy_version 676338 (0.0030) [2024-04-28 17:59:51,743][57339] Updated weights for policy 0, policy_version 676348 (0.0031) [2024-04-28 17:59:52,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11081302016. Throughput: 0: 55591.0. Samples: 1571602620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:52,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 17:59:54,465][57339] Updated weights for policy 0, policy_version 676358 (0.0031) [2024-04-28 17:59:57,169][57108] Fps is (10 sec: 50790.8, 60 sec: 54886.6, 300 sec: 55705.6). Total num frames: 11081547776. Throughput: 0: 55678.0. Samples: 1571941720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 17:59:57,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 17:59:57,647][57339] Updated weights for policy 0, policy_version 676368 (0.0035) [2024-04-28 18:00:00,366][57339] Updated weights for policy 0, policy_version 676378 (0.0027) [2024-04-28 18:00:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 11081859072. Throughput: 0: 55602.1. Samples: 1572275500. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 18:00:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 18:00:03,481][57339] Updated weights for policy 0, policy_version 676388 (0.0026) [2024-04-28 18:00:06,162][57339] Updated weights for policy 0, policy_version 676398 (0.0027) [2024-04-28 18:00:06,541][57319] Signal inference workers to stop experience collection... (23450 times) [2024-04-28 18:00:06,541][57319] Signal inference workers to resume experience collection... (23450 times) [2024-04-28 18:00:06,551][57339] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-04-28 18:00:06,551][57339] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-04-28 18:00:07,169][57108] Fps is (10 sec: 60620.0, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 11082153984. Throughput: 0: 55687.5. Samples: 1572443860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 18:00:07,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 18:00:09,363][57339] Updated weights for policy 0, policy_version 676408 (0.0039) [2024-04-28 18:00:11,986][57339] Updated weights for policy 0, policy_version 676418 (0.0030) [2024-04-28 18:00:12,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 11082432512. Throughput: 0: 55742.5. Samples: 1572780580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 18:00:12,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 18:00:15,149][57339] Updated weights for policy 0, policy_version 676428 (0.0033) [2024-04-28 18:00:17,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11082694656. Throughput: 0: 55719.4. Samples: 1573111600. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 18:00:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 18:00:17,792][57339] Updated weights for policy 0, policy_version 676438 (0.0029) [2024-04-28 18:00:21,034][57339] Updated weights for policy 0, policy_version 676448 (0.0028) [2024-04-28 18:00:22,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11082973184. Throughput: 0: 55679.2. Samples: 1573275720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-04-28 18:00:22,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 18:00:23,614][57339] Updated weights for policy 0, policy_version 676458 (0.0027) [2024-04-28 18:00:27,046][57339] Updated weights for policy 0, policy_version 676468 (0.0029) [2024-04-28 18:00:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11083268096. Throughput: 0: 55847.3. Samples: 1573617100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:00:27,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 18:00:29,540][57339] Updated weights for policy 0, policy_version 676478 (0.0029) [2024-04-28 18:00:32,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 11083513856. Throughput: 0: 55857.3. Samples: 1573950840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:00:32,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 18:00:32,836][57339] Updated weights for policy 0, policy_version 676488 (0.0036) [2024-04-28 18:00:35,610][57339] Updated weights for policy 0, policy_version 676498 (0.0030) [2024-04-28 18:00:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11083825152. Throughput: 0: 55884.9. Samples: 1574117440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:00:37,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 18:00:38,549][57339] Updated weights for policy 0, policy_version 676508 (0.0031) [2024-04-28 18:00:41,572][57339] Updated weights for policy 0, policy_version 676518 (0.0029) [2024-04-28 18:00:42,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11084103680. Throughput: 0: 55768.4. Samples: 1574451300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:00:42,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 18:00:44,558][57339] Updated weights for policy 0, policy_version 676528 (0.0027) [2024-04-28 18:00:47,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55816.6). Total num frames: 11084365824. Throughput: 0: 55721.2. Samples: 1574782960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:00:47,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 18:00:47,583][57339] Updated weights for policy 0, policy_version 676538 (0.0026) [2024-04-28 18:00:48,209][57319] Signal inference workers to stop experience collection... (23500 times) [2024-04-28 18:00:48,210][57319] Signal inference workers to resume experience collection... (23500 times) [2024-04-28 18:00:48,231][57339] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-04-28 18:00:48,231][57339] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-04-28 18:00:50,524][57339] Updated weights for policy 0, policy_version 676548 (0.0031) [2024-04-28 18:00:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11084644352. Throughput: 0: 55854.8. Samples: 1574957320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:00:52,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 18:00:53,492][57339] Updated weights for policy 0, policy_version 676558 (0.0026) [2024-04-28 18:00:56,384][57339] Updated weights for policy 0, policy_version 676568 (0.0030) [2024-04-28 18:00:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 11084939264. Throughput: 0: 55768.4. Samples: 1575290160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:00:57,170][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 18:00:59,275][57339] Updated weights for policy 0, policy_version 676578 (0.0024) [2024-04-28 18:01:02,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11085201408. Throughput: 0: 55897.8. Samples: 1575627000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:02,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 18:01:02,212][57339] Updated weights for policy 0, policy_version 676588 (0.0031) [2024-04-28 18:01:05,039][57339] Updated weights for policy 0, policy_version 676598 (0.0033) [2024-04-28 18:01:07,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 11085463552. Throughput: 0: 55795.0. Samples: 1575786500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:07,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 18:01:08,238][57339] Updated weights for policy 0, policy_version 676608 (0.0028) [2024-04-28 18:01:10,958][57339] Updated weights for policy 0, policy_version 676618 (0.0031) [2024-04-28 18:01:12,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11085791232. Throughput: 0: 55600.5. Samples: 1576119120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:12,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 18:01:14,091][57339] Updated weights for policy 0, policy_version 676628 (0.0027) [2024-04-28 18:01:17,002][57339] Updated weights for policy 0, policy_version 676638 (0.0030) [2024-04-28 18:01:17,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11086053376. Throughput: 0: 55676.4. Samples: 1576456280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:17,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 18:01:19,953][57339] Updated weights for policy 0, policy_version 676648 (0.0022) [2024-04-28 18:01:22,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 11086315520. Throughput: 0: 55743.9. Samples: 1576625920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:22,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 18:01:22,821][57339] Updated weights for policy 0, policy_version 676658 (0.0030) [2024-04-28 18:01:25,799][57339] Updated weights for policy 0, policy_version 676668 (0.0024) [2024-04-28 18:01:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11086610432. Throughput: 0: 55740.7. Samples: 1576959640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:27,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 18:01:27,289][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000676674_11086626816.pth... [2024-04-28 18:01:27,339][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000675856_11073224704.pth [2024-04-28 18:01:28,553][57339] Updated weights for policy 0, policy_version 676678 (0.0027) [2024-04-28 18:01:31,636][57339] Updated weights for policy 0, policy_version 676688 (0.0034) [2024-04-28 18:01:32,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.7, 300 sec: 55705.6). 
Total num frames: 11086872576. Throughput: 0: 55689.1. Samples: 1577288960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:32,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 18:01:34,320][57319] Signal inference workers to stop experience collection... (23550 times) [2024-04-28 18:01:34,320][57319] Signal inference workers to resume experience collection... (23550 times) [2024-04-28 18:01:34,350][57339] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-04-28 18:01:34,353][57339] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-04-28 18:01:34,432][57339] Updated weights for policy 0, policy_version 676698 (0.0033) [2024-04-28 18:01:37,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 11087134720. Throughput: 0: 55488.0. Samples: 1577454280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:37,169][57108] Avg episode reward: [(0, '0.709')] [2024-04-28 18:01:37,545][57339] Updated weights for policy 0, policy_version 676708 (0.0030) [2024-04-28 18:01:40,284][57339] Updated weights for policy 0, policy_version 676718 (0.0032) [2024-04-28 18:01:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11087429632. Throughput: 0: 55543.2. Samples: 1577789600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:42,170][57108] Avg episode reward: [(0, '0.700')] [2024-04-28 18:01:43,466][57339] Updated weights for policy 0, policy_version 676728 (0.0028) [2024-04-28 18:01:46,095][57339] Updated weights for policy 0, policy_version 676738 (0.0034) [2024-04-28 18:01:47,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11087724544. Throughput: 0: 55486.7. Samples: 1578123900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:47,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 18:01:49,235][57339] Updated weights for policy 0, policy_version 676748 (0.0025) [2024-04-28 18:01:51,824][57339] Updated weights for policy 0, policy_version 676758 (0.0029) [2024-04-28 18:01:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11088003072. Throughput: 0: 55732.0. Samples: 1578294440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:52,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 18:01:55,103][57339] Updated weights for policy 0, policy_version 676768 (0.0030) [2024-04-28 18:01:57,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 11088265216. Throughput: 0: 55670.7. Samples: 1578624300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:01:57,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 18:01:57,801][57339] Updated weights for policy 0, policy_version 676778 (0.0029) [2024-04-28 18:02:01,018][57339] Updated weights for policy 0, policy_version 676788 (0.0033) [2024-04-28 18:02:02,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11088543744. Throughput: 0: 55588.1. Samples: 1578957740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:02,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 18:02:03,856][57339] Updated weights for policy 0, policy_version 676798 (0.0029) [2024-04-28 18:02:06,800][57339] Updated weights for policy 0, policy_version 676808 (0.0031) [2024-04-28 18:02:07,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11088838656. Throughput: 0: 55549.0. Samples: 1579125620. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:07,170][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 18:02:09,610][57339] Updated weights for policy 0, policy_version 676818 (0.0029) [2024-04-28 18:02:12,169][57108] Fps is (10 sec: 54067.2, 60 sec: 54886.4, 300 sec: 55761.1). Total num frames: 11089084416. Throughput: 0: 55609.9. Samples: 1579462080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:12,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 18:02:12,719][57339] Updated weights for policy 0, policy_version 676828 (0.0028) [2024-04-28 18:02:15,379][57339] Updated weights for policy 0, policy_version 676838 (0.0025) [2024-04-28 18:02:17,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11089379328. Throughput: 0: 55763.9. Samples: 1579798340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:17,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 18:02:18,573][57339] Updated weights for policy 0, policy_version 676848 (0.0035) [2024-04-28 18:02:20,603][57319] Signal inference workers to stop experience collection... (23600 times) [2024-04-28 18:02:20,604][57319] Signal inference workers to resume experience collection... (23600 times) [2024-04-28 18:02:20,624][57339] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-04-28 18:02:20,624][57339] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-04-28 18:02:21,218][57339] Updated weights for policy 0, policy_version 676858 (0.0031) [2024-04-28 18:02:22,169][57108] Fps is (10 sec: 60620.6, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11089690624. Throughput: 0: 55876.8. Samples: 1579968740. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:22,169][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 18:02:24,404][57339] Updated weights for policy 0, policy_version 676868 (0.0029) [2024-04-28 18:02:26,979][57339] Updated weights for policy 0, policy_version 676878 (0.0033) [2024-04-28 18:02:27,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11089969152. Throughput: 0: 55994.3. Samples: 1580309340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:27,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 18:02:30,333][57339] Updated weights for policy 0, policy_version 676888 (0.0027) [2024-04-28 18:02:32,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11090231296. Throughput: 0: 55873.4. Samples: 1580638200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:32,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 18:02:32,785][57339] Updated weights for policy 0, policy_version 676898 (0.0039) [2024-04-28 18:02:36,119][57339] Updated weights for policy 0, policy_version 676908 (0.0031) [2024-04-28 18:02:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11090509824. Throughput: 0: 55871.1. Samples: 1580808640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:37,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 18:02:38,502][57339] Updated weights for policy 0, policy_version 676918 (0.0026) [2024-04-28 18:02:41,894][57339] Updated weights for policy 0, policy_version 676928 (0.0027) [2024-04-28 18:02:42,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 11090804736. Throughput: 0: 56115.2. Samples: 1581149480. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:42,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 18:02:44,541][57339] Updated weights for policy 0, policy_version 676938 (0.0027) [2024-04-28 18:02:47,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11091050496. Throughput: 0: 56036.3. Samples: 1581479380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:47,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 18:02:47,870][57339] Updated weights for policy 0, policy_version 676948 (0.0027) [2024-04-28 18:02:50,420][57339] Updated weights for policy 0, policy_version 676958 (0.0027) [2024-04-28 18:02:52,169][57108] Fps is (10 sec: 54066.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11091345408. Throughput: 0: 55976.8. Samples: 1581644580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:52,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 18:02:53,912][57339] Updated weights for policy 0, policy_version 676968 (0.0034) [2024-04-28 18:02:56,444][57339] Updated weights for policy 0, policy_version 676978 (0.0029) [2024-04-28 18:02:57,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 11091607552. Throughput: 0: 55832.7. Samples: 1581974560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:02:57,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 18:02:59,760][57339] Updated weights for policy 0, policy_version 676988 (0.0026) [2024-04-28 18:03:02,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11091918848. Throughput: 0: 55622.8. Samples: 1582301360. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:03:02,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 18:03:02,283][57339] Updated weights for policy 0, policy_version 676998 (0.0028) [2024-04-28 18:03:05,657][57339] Updated weights for policy 0, policy_version 677008 (0.0030) [2024-04-28 18:03:07,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11092197376. Throughput: 0: 55782.7. Samples: 1582478960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:03:07,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 18:03:08,210][57339] Updated weights for policy 0, policy_version 677018 (0.0026) [2024-04-28 18:03:11,548][57339] Updated weights for policy 0, policy_version 677028 (0.0026) [2024-04-28 18:03:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11092459520. Throughput: 0: 55537.7. Samples: 1582808540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:03:12,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 18:03:14,369][57339] Updated weights for policy 0, policy_version 677038 (0.0031) [2024-04-28 18:03:17,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11092705280. Throughput: 0: 55628.5. Samples: 1583141480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:03:17,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 18:03:17,461][57339] Updated weights for policy 0, policy_version 677048 (0.0032) [2024-04-28 18:03:20,285][57339] Updated weights for policy 0, policy_version 677058 (0.0035) [2024-04-28 18:03:22,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 11093000192. Throughput: 0: 55397.4. Samples: 1583301520. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:03:22,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 18:03:23,334][57339] Updated weights for policy 0, policy_version 677068 (0.0027) [2024-04-28 18:03:26,269][57339] Updated weights for policy 0, policy_version 677078 (0.0026) [2024-04-28 18:03:27,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 11093278720. Throughput: 0: 55276.0. Samples: 1583636900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:03:27,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 18:03:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000677080_11093278720.pth... [2024-04-28 18:03:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000676264_11079909376.pth [2024-04-28 18:03:29,305][57339] Updated weights for policy 0, policy_version 677088 (0.0029) [2024-04-28 18:03:32,115][57339] Updated weights for policy 0, policy_version 677098 (0.0031) [2024-04-28 18:03:32,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11093573632. Throughput: 0: 55295.1. Samples: 1583967660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:03:32,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 18:03:34,979][57319] Signal inference workers to stop experience collection... (23650 times) [2024-04-28 18:03:34,980][57319] Signal inference workers to resume experience collection... (23650 times) [2024-04-28 18:03:34,998][57339] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-04-28 18:03:34,998][57339] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-04-28 18:03:35,236][57339] Updated weights for policy 0, policy_version 677108 (0.0027) [2024-04-28 18:03:37,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11093852160. Throughput: 0: 55513.4. 
Samples: 1584142680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 25.0) [2024-04-28 18:03:37,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 18:03:38,310][57339] Updated weights for policy 0, policy_version 677118 (0.0029) [2024-04-28 18:03:41,177][57339] Updated weights for policy 0, policy_version 677128 (0.0028) [2024-04-28 18:03:42,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.3, 300 sec: 55816.6). Total num frames: 11094114304. Throughput: 0: 55537.8. Samples: 1584473760. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:03:42,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 18:03:44,026][57339] Updated weights for policy 0, policy_version 677138 (0.0027) [2024-04-28 18:03:47,099][57339] Updated weights for policy 0, policy_version 677148 (0.0028) [2024-04-28 18:03:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11094392832. Throughput: 0: 55719.0. Samples: 1584808720. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:03:47,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 18:03:49,937][57339] Updated weights for policy 0, policy_version 677158 (0.0030) [2024-04-28 18:03:52,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 11094654976. Throughput: 0: 55326.3. Samples: 1584968640. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:03:52,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 18:03:52,908][57339] Updated weights for policy 0, policy_version 677168 (0.0023) [2024-04-28 18:03:55,723][57339] Updated weights for policy 0, policy_version 677178 (0.0029) [2024-04-28 18:03:57,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11094933504. Throughput: 0: 55434.1. Samples: 1585303080. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:03:57,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 18:03:58,775][57339] Updated weights for policy 0, policy_version 677188 (0.0031) [2024-04-28 18:04:01,603][57339] Updated weights for policy 0, policy_version 677198 (0.0029) [2024-04-28 18:04:02,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 11095228416. Throughput: 0: 55425.2. Samples: 1585635620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:02,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 18:04:04,629][57339] Updated weights for policy 0, policy_version 677208 (0.0025) [2024-04-28 18:04:07,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11095506944. Throughput: 0: 55691.0. Samples: 1585807620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:07,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 18:04:07,465][57339] Updated weights for policy 0, policy_version 677218 (0.0026) [2024-04-28 18:04:10,456][57339] Updated weights for policy 0, policy_version 677228 (0.0025) [2024-04-28 18:04:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11095801856. Throughput: 0: 55612.2. Samples: 1586139460. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:12,170][57108] Avg episode reward: [(0, '0.727')] [2024-04-28 18:04:13,345][57339] Updated weights for policy 0, policy_version 677238 (0.0029) [2024-04-28 18:04:16,172][57339] Updated weights for policy 0, policy_version 677248 (0.0030) [2024-04-28 18:04:17,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 11096080384. Throughput: 0: 55780.7. Samples: 1586477780. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:17,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 18:04:19,089][57339] Updated weights for policy 0, policy_version 677258 (0.0028) [2024-04-28 18:04:21,971][57339] Updated weights for policy 0, policy_version 677268 (0.0027) [2024-04-28 18:04:22,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 11096358912. Throughput: 0: 55604.8. Samples: 1586644900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:22,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:04:25,165][57339] Updated weights for policy 0, policy_version 677278 (0.0028) [2024-04-28 18:04:27,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11096604672. Throughput: 0: 55740.6. Samples: 1586982080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:27,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 18:04:27,899][57339] Updated weights for policy 0, policy_version 677288 (0.0026) [2024-04-28 18:04:31,004][57339] Updated weights for policy 0, policy_version 677298 (0.0031) [2024-04-28 18:04:32,170][57108] Fps is (10 sec: 52424.0, 60 sec: 55158.6, 300 sec: 55594.3). Total num frames: 11096883200. Throughput: 0: 55749.9. Samples: 1587317520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:32,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 18:04:33,753][57339] Updated weights for policy 0, policy_version 677308 (0.0029) [2024-04-28 18:04:36,775][57339] Updated weights for policy 0, policy_version 677318 (0.0041) [2024-04-28 18:04:37,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11097194496. Throughput: 0: 55871.4. Samples: 1587482860. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:37,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 18:04:39,746][57339] Updated weights for policy 0, policy_version 677328 (0.0029) [2024-04-28 18:04:40,889][57319] Signal inference workers to stop experience collection... (23700 times) [2024-04-28 18:04:40,926][57339] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-04-28 18:04:40,938][57319] Signal inference workers to resume experience collection... (23700 times) [2024-04-28 18:04:40,945][57339] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-04-28 18:04:42,169][57108] Fps is (10 sec: 57349.9, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11097456640. Throughput: 0: 55790.4. Samples: 1587813640. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:42,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 18:04:42,579][57339] Updated weights for policy 0, policy_version 677338 (0.0028) [2024-04-28 18:04:45,615][57339] Updated weights for policy 0, policy_version 677348 (0.0029) [2024-04-28 18:04:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 11097751552. Throughput: 0: 55714.9. Samples: 1588142780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:47,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 18:04:48,626][57339] Updated weights for policy 0, policy_version 677358 (0.0024) [2024-04-28 18:04:51,619][57339] Updated weights for policy 0, policy_version 677368 (0.0029) [2024-04-28 18:04:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11098013696. Throughput: 0: 55776.5. Samples: 1588317560. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:52,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 18:04:54,521][57339] Updated weights for policy 0, policy_version 677378 (0.0028) [2024-04-28 18:04:57,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11098292224. Throughput: 0: 55742.2. Samples: 1588647860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:04:57,170][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 18:04:57,474][57339] Updated weights for policy 0, policy_version 677388 (0.0029) [2024-04-28 18:05:00,430][57339] Updated weights for policy 0, policy_version 677398 (0.0027) [2024-04-28 18:05:02,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11098537984. Throughput: 0: 55604.2. Samples: 1588979980. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:05:02,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 18:05:03,395][57339] Updated weights for policy 0, policy_version 677408 (0.0030) [2024-04-28 18:05:06,215][57339] Updated weights for policy 0, policy_version 677418 (0.0029) [2024-04-28 18:05:07,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11098832896. Throughput: 0: 55363.3. Samples: 1589136240. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:05:07,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 18:05:09,371][57339] Updated weights for policy 0, policy_version 677428 (0.0028) [2024-04-28 18:05:12,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11099127808. Throughput: 0: 55245.2. Samples: 1589468120. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:05:12,169][57108] Avg episode reward: [(0, '0.475')] [2024-04-28 18:05:12,437][57339] Updated weights for policy 0, policy_version 677438 (0.0026) [2024-04-28 18:05:15,289][57339] Updated weights for policy 0, policy_version 677448 (0.0024) [2024-04-28 18:05:17,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11099406336. Throughput: 0: 55164.3. Samples: 1589799860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-04-28 18:05:17,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 18:05:18,224][57339] Updated weights for policy 0, policy_version 677458 (0.0028) [2024-04-28 18:05:21,219][57339] Updated weights for policy 0, policy_version 677468 (0.0027) [2024-04-28 18:05:22,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11099684864. Throughput: 0: 55440.8. Samples: 1589977700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:05:22,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 18:05:24,097][57339] Updated weights for policy 0, policy_version 677478 (0.0028) [2024-04-28 18:05:27,058][57339] Updated weights for policy 0, policy_version 677488 (0.0028) [2024-04-28 18:05:27,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.4, 300 sec: 55761.1). Total num frames: 11099963392. Throughput: 0: 55537.1. Samples: 1590312820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:05:27,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 18:05:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000677488_11099963392.pth... [2024-04-28 18:05:27,232][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000676674_11086626816.pth [2024-04-28 18:05:30,004][57339] Updated weights for policy 0, policy_version 677498 (0.0027) [2024-04-28 18:05:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55706.5, 300 sec: 55594.5). 
Total num frames: 11100225536. Throughput: 0: 55557.6. Samples: 1590642880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:05:32,170][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 18:05:33,047][57339] Updated weights for policy 0, policy_version 677508 (0.0035) [2024-04-28 18:05:36,015][57339] Updated weights for policy 0, policy_version 677518 (0.0026) [2024-04-28 18:05:37,169][57108] Fps is (10 sec: 52429.5, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 11100487680. Throughput: 0: 55140.8. Samples: 1590798900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:05:37,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 18:05:39,102][57339] Updated weights for policy 0, policy_version 677528 (0.0025) [2024-04-28 18:05:39,990][57319] Signal inference workers to stop experience collection... (23750 times) [2024-04-28 18:05:40,019][57339] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-04-28 18:05:40,085][57319] Signal inference workers to resume experience collection... (23750 times) [2024-04-28 18:05:40,086][57339] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-04-28 18:05:41,959][57339] Updated weights for policy 0, policy_version 677538 (0.0026) [2024-04-28 18:05:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11100782592. Throughput: 0: 55205.0. Samples: 1591132080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:05:42,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 18:05:45,021][57339] Updated weights for policy 0, policy_version 677548 (0.0028) [2024-04-28 18:05:47,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11101077504. Throughput: 0: 55184.6. Samples: 1591463280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:05:47,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 18:05:47,892][57339] Updated weights for policy 0, policy_version 677558 (0.0027) [2024-04-28 18:05:50,972][57339] Updated weights for policy 0, policy_version 677568 (0.0028) [2024-04-28 18:05:52,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11101356032. Throughput: 0: 55639.0. Samples: 1591640000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:05:52,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 18:05:53,898][57339] Updated weights for policy 0, policy_version 677578 (0.0023) [2024-04-28 18:05:56,987][57339] Updated weights for policy 0, policy_version 677588 (0.0036) [2024-04-28 18:05:57,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 11101601792. Throughput: 0: 55600.1. Samples: 1591970120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:05:57,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 18:05:59,919][57339] Updated weights for policy 0, policy_version 677598 (0.0029) [2024-04-28 18:06:02,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11101863936. Throughput: 0: 55681.0. Samples: 1592305500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:02,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 18:06:02,919][57339] Updated weights for policy 0, policy_version 677608 (0.0030) [2024-04-28 18:06:05,923][57339] Updated weights for policy 0, policy_version 677618 (0.0025) [2024-04-28 18:06:07,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 11102158848. Throughput: 0: 55271.2. Samples: 1592464900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:07,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 18:06:08,769][57339] Updated weights for policy 0, policy_version 677628 (0.0032) [2024-04-28 18:06:11,759][57339] Updated weights for policy 0, policy_version 677638 (0.0027) [2024-04-28 18:06:12,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11102437376. Throughput: 0: 55144.7. Samples: 1592794320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:12,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 18:06:14,579][57339] Updated weights for policy 0, policy_version 677648 (0.0034) [2024-04-28 18:06:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11102715904. Throughput: 0: 55217.9. Samples: 1593127680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:17,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 18:06:17,554][57339] Updated weights for policy 0, policy_version 677658 (0.0026) [2024-04-28 18:06:20,603][57339] Updated weights for policy 0, policy_version 677668 (0.0033) [2024-04-28 18:06:22,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.7, 300 sec: 55594.6). Total num frames: 11103010816. Throughput: 0: 55713.1. Samples: 1593305980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:22,169][57108] Avg episode reward: [(0, '0.700')] [2024-04-28 18:06:23,404][57339] Updated weights for policy 0, policy_version 677678 (0.0029) [2024-04-28 18:06:26,603][57339] Updated weights for policy 0, policy_version 677688 (0.0026) [2024-04-28 18:06:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 11103289344. Throughput: 0: 55684.9. Samples: 1593637900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:27,169][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 18:06:29,354][57339] Updated weights for policy 0, policy_version 677698 (0.0026) [2024-04-28 18:06:32,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 11103551488. Throughput: 0: 55778.2. Samples: 1593973300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:32,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 18:06:32,343][57339] Updated weights for policy 0, policy_version 677708 (0.0028) [2024-04-28 18:06:35,081][57339] Updated weights for policy 0, policy_version 677718 (0.0030) [2024-04-28 18:06:37,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 11103813632. Throughput: 0: 55392.6. Samples: 1594132660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:37,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 18:06:37,800][57319] Signal inference workers to stop experience collection... (23800 times) [2024-04-28 18:06:37,800][57319] Signal inference workers to resume experience collection... (23800 times) [2024-04-28 18:06:37,813][57339] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-04-28 18:06:37,814][57339] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-04-28 18:06:38,106][57339] Updated weights for policy 0, policy_version 677728 (0.0033) [2024-04-28 18:06:40,979][57339] Updated weights for policy 0, policy_version 677738 (0.0035) [2024-04-28 18:06:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11104108544. Throughput: 0: 55564.8. Samples: 1594470540. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:42,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 18:06:44,040][57339] Updated weights for policy 0, policy_version 677748 (0.0032) [2024-04-28 18:06:46,840][57339] Updated weights for policy 0, policy_version 677758 (0.0027) [2024-04-28 18:06:47,169][57108] Fps is (10 sec: 57342.2, 60 sec: 55159.3, 300 sec: 55538.9). Total num frames: 11104387072. Throughput: 0: 55564.7. Samples: 1594805920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:47,170][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 18:06:49,746][57339] Updated weights for policy 0, policy_version 677768 (0.0028) [2024-04-28 18:06:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11104681984. Throughput: 0: 55759.6. Samples: 1594974080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:52,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 18:06:52,643][57339] Updated weights for policy 0, policy_version 677778 (0.0033) [2024-04-28 18:06:55,576][57339] Updated weights for policy 0, policy_version 677788 (0.0026) [2024-04-28 18:06:57,169][57108] Fps is (10 sec: 58983.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11104976896. Throughput: 0: 55865.3. Samples: 1595308260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:06:57,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 18:06:58,406][57339] Updated weights for policy 0, policy_version 677798 (0.0029) [2024-04-28 18:07:01,366][57339] Updated weights for policy 0, policy_version 677808 (0.0028) [2024-04-28 18:07:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 11105239040. Throughput: 0: 55880.1. Samples: 1595642280. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:02,169][57108] Avg episode reward: [(0, '0.720')] [2024-04-28 18:07:04,286][57339] Updated weights for policy 0, policy_version 677818 (0.0032) [2024-04-28 18:07:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11105517568. Throughput: 0: 55631.0. Samples: 1595809380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:07,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 18:07:07,316][57339] Updated weights for policy 0, policy_version 677828 (0.0027) [2024-04-28 18:07:09,989][57339] Updated weights for policy 0, policy_version 677838 (0.0029) [2024-04-28 18:07:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11105779712. Throughput: 0: 55702.6. Samples: 1596144520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:12,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 18:07:13,158][57339] Updated weights for policy 0, policy_version 677848 (0.0036) [2024-04-28 18:07:15,842][57339] Updated weights for policy 0, policy_version 677858 (0.0029) [2024-04-28 18:07:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 11106058240. Throughput: 0: 55722.2. Samples: 1596480800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:17,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 18:07:19,014][57339] Updated weights for policy 0, policy_version 677868 (0.0030) [2024-04-28 18:07:21,885][57339] Updated weights for policy 0, policy_version 677878 (0.0033) [2024-04-28 18:07:22,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11106353152. Throughput: 0: 55773.3. Samples: 1596642460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:22,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 18:07:24,917][57339] Updated weights for policy 0, policy_version 677888 (0.0028) [2024-04-28 18:07:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11106631680. Throughput: 0: 55775.9. Samples: 1596980460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:27,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 18:07:27,230][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000677896_11106648064.pth... [2024-04-28 18:07:27,272][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000677080_11093278720.pth [2024-04-28 18:07:27,762][57339] Updated weights for policy 0, policy_version 677898 (0.0031) [2024-04-28 18:07:30,666][57339] Updated weights for policy 0, policy_version 677908 (0.0024) [2024-04-28 18:07:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 11106910208. Throughput: 0: 55687.9. Samples: 1597311860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:32,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 18:07:33,470][57339] Updated weights for policy 0, policy_version 677918 (0.0023) [2024-04-28 18:07:36,571][57339] Updated weights for policy 0, policy_version 677928 (0.0026) [2024-04-28 18:07:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56524.7, 300 sec: 55594.5). Total num frames: 11107205120. Throughput: 0: 55793.8. Samples: 1597484800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:37,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 18:07:39,432][57339] Updated weights for policy 0, policy_version 677938 (0.0029) [2024-04-28 18:07:42,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11107467264. Throughput: 0: 55939.9. Samples: 1597825560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:42,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 18:07:42,352][57339] Updated weights for policy 0, policy_version 677948 (0.0030) [2024-04-28 18:07:45,371][57339] Updated weights for policy 0, policy_version 677958 (0.0026) [2024-04-28 18:07:47,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55432.7, 300 sec: 55483.4). Total num frames: 11107713024. Throughput: 0: 55936.3. Samples: 1598159420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:47,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 18:07:47,734][57319] Signal inference workers to stop experience collection... (23850 times) [2024-04-28 18:07:47,765][57339] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-04-28 18:07:47,790][57319] Signal inference workers to resume experience collection... (23850 times) [2024-04-28 18:07:47,790][57339] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-04-28 18:07:48,200][57339] Updated weights for policy 0, policy_version 677968 (0.0028) [2024-04-28 18:07:51,124][57339] Updated weights for policy 0, policy_version 677978 (0.0028) [2024-04-28 18:07:52,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 11107991552. Throughput: 0: 55692.6. Samples: 1598315540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:52,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 18:07:54,156][57339] Updated weights for policy 0, policy_version 677988 (0.0034) [2024-04-28 18:07:57,139][57339] Updated weights for policy 0, policy_version 677998 (0.0035) [2024-04-28 18:07:57,169][57108] Fps is (10 sec: 60621.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11108319232. Throughput: 0: 55649.5. Samples: 1598648740. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:07:57,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 18:08:00,070][57339] Updated weights for policy 0, policy_version 678008 (0.0028) [2024-04-28 18:08:02,169][57108] Fps is (10 sec: 58980.8, 60 sec: 55705.4, 300 sec: 55539.0). Total num frames: 11108581376. Throughput: 0: 55641.6. Samples: 1598984680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:08:02,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 18:08:03,092][57339] Updated weights for policy 0, policy_version 678018 (0.0026) [2024-04-28 18:08:05,875][57339] Updated weights for policy 0, policy_version 678028 (0.0026) [2024-04-28 18:08:07,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 11108859904. Throughput: 0: 55850.7. Samples: 1599155740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:08:07,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 18:08:09,131][57339] Updated weights for policy 0, policy_version 678038 (0.0029) [2024-04-28 18:08:11,726][57339] Updated weights for policy 0, policy_version 678048 (0.0033) [2024-04-28 18:08:12,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 11109138432. Throughput: 0: 55768.5. Samples: 1599490040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:08:12,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 18:08:15,191][57339] Updated weights for policy 0, policy_version 678058 (0.0026) [2024-04-28 18:08:17,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11109433344. Throughput: 0: 55924.3. Samples: 1599828460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:08:17,170][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:08:17,622][57339] Updated weights for policy 0, policy_version 678068 (0.0030) [2024-04-28 18:08:21,577][57339] Updated weights for policy 0, policy_version 678078 (0.0030) [2024-04-28 18:08:22,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 11109662720. Throughput: 0: 55547.6. Samples: 1599984440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:08:22,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 18:08:23,531][57339] Updated weights for policy 0, policy_version 678088 (0.0029) [2024-04-28 18:08:27,169][57108] Fps is (10 sec: 50789.6, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 11109941248. Throughput: 0: 55365.2. Samples: 1600317000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:08:27,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 18:08:27,523][57339] Updated weights for policy 0, policy_version 678098 (0.0029) [2024-04-28 18:08:29,564][57339] Updated weights for policy 0, policy_version 678108 (0.0029) [2024-04-28 18:08:32,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11110252544. Throughput: 0: 55382.8. Samples: 1600651640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:08:32,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 18:08:33,208][57339] Updated weights for policy 0, policy_version 678118 (0.0029) [2024-04-28 18:08:34,381][57319] Signal inference workers to stop experience collection... (23900 times) [2024-04-28 18:08:34,383][57319] Signal inference workers to resume experience collection... 
(23900 times) [2024-04-28 18:08:34,394][57339] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-04-28 18:08:34,414][57339] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-04-28 18:08:35,500][57339] Updated weights for policy 0, policy_version 678128 (0.0026) [2024-04-28 18:08:37,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11110531072. Throughput: 0: 55649.1. Samples: 1600819760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:08:37,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 18:08:38,990][57339] Updated weights for policy 0, policy_version 678138 (0.0023) [2024-04-28 18:08:41,305][57339] Updated weights for policy 0, policy_version 678148 (0.0028) [2024-04-28 18:08:42,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11110809600. Throughput: 0: 55674.2. Samples: 1601154080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:08:42,169][57108] Avg episode reward: [(0, '0.495')] [2024-04-28 18:08:44,931][57339] Updated weights for policy 0, policy_version 678158 (0.0027) [2024-04-28 18:08:47,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11111088128. Throughput: 0: 55526.3. Samples: 1601483360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:08:47,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 18:08:47,333][57339] Updated weights for policy 0, policy_version 678168 (0.0033) [2024-04-28 18:08:50,840][57339] Updated weights for policy 0, policy_version 678178 (0.0028) [2024-04-28 18:08:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11111350272. Throughput: 0: 55447.5. Samples: 1601650880. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:08:52,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 18:08:53,192][57339] Updated weights for policy 0, policy_version 678188 (0.0030) [2024-04-28 18:08:56,688][57339] Updated weights for policy 0, policy_version 678198 (0.0035) [2024-04-28 18:08:57,169][57108] Fps is (10 sec: 52429.3, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 11111612416. Throughput: 0: 55520.5. Samples: 1601988460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:08:57,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 18:08:59,137][57339] Updated weights for policy 0, policy_version 678208 (0.0026) [2024-04-28 18:09:02,169][57108] Fps is (10 sec: 52428.9, 60 sec: 54886.6, 300 sec: 55483.5). Total num frames: 11111874560. Throughput: 0: 55484.6. Samples: 1602325260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:02,169][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 18:09:02,949][57339] Updated weights for policy 0, policy_version 678218 (0.0029) [2024-04-28 18:09:04,940][57339] Updated weights for policy 0, policy_version 678228 (0.0024) [2024-04-28 18:09:07,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.3, 300 sec: 55539.0). Total num frames: 11112185856. Throughput: 0: 55523.0. Samples: 1602482980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:07,170][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 18:09:08,787][57339] Updated weights for policy 0, policy_version 678238 (0.0031) [2024-04-28 18:09:11,035][57339] Updated weights for policy 0, policy_version 678248 (0.0035) [2024-04-28 18:09:12,169][57108] Fps is (10 sec: 60620.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11112480768. Throughput: 0: 55536.6. Samples: 1602816140. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:12,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 18:09:14,588][57339] Updated weights for policy 0, policy_version 678258 (0.0031) [2024-04-28 18:09:16,922][57339] Updated weights for policy 0, policy_version 678268 (0.0025) [2024-04-28 18:09:17,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11112759296. Throughput: 0: 55615.0. Samples: 1603154320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:17,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 18:09:20,377][57339] Updated weights for policy 0, policy_version 678278 (0.0026) [2024-04-28 18:09:22,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11113005056. Throughput: 0: 55725.0. Samples: 1603327380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:22,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 18:09:22,545][57319] Signal inference workers to stop experience collection... (23950 times) [2024-04-28 18:09:22,566][57339] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-04-28 18:09:22,601][57319] Signal inference workers to resume experience collection... (23950 times) [2024-04-28 18:09:22,601][57339] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-04-28 18:09:22,707][57339] Updated weights for policy 0, policy_version 678288 (0.0029) [2024-04-28 18:09:26,287][57339] Updated weights for policy 0, policy_version 678298 (0.0030) [2024-04-28 18:09:27,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.7, 300 sec: 55594.7). Total num frames: 11113283584. Throughput: 0: 55533.7. Samples: 1603653100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:27,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 18:09:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000678301_11113283584.pth... 
[2024-04-28 18:09:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000677488_11099963392.pth [2024-04-28 18:09:28,636][57339] Updated weights for policy 0, policy_version 678308 (0.0027) [2024-04-28 18:09:32,056][57339] Updated weights for policy 0, policy_version 678318 (0.0030) [2024-04-28 18:09:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 11113562112. Throughput: 0: 55670.8. Samples: 1603988540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:32,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 18:09:34,483][57339] Updated weights for policy 0, policy_version 678328 (0.0030) [2024-04-28 18:09:37,169][57108] Fps is (10 sec: 54067.2, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 11113824256. Throughput: 0: 55352.8. Samples: 1604141760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:37,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 18:09:37,824][57339] Updated weights for policy 0, policy_version 678338 (0.0024) [2024-04-28 18:09:40,247][57339] Updated weights for policy 0, policy_version 678348 (0.0029) [2024-04-28 18:09:42,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11114135552. Throughput: 0: 55322.6. Samples: 1604477980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 18:09:44,033][57339] Updated weights for policy 0, policy_version 678358 (0.0032) [2024-04-28 18:09:46,118][57339] Updated weights for policy 0, policy_version 678368 (0.0029) [2024-04-28 18:09:47,169][57108] Fps is (10 sec: 60620.3, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 11114430464. Throughput: 0: 55205.5. Samples: 1604809520. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:47,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 18:09:50,038][57339] Updated weights for policy 0, policy_version 678378 (0.0034) [2024-04-28 18:09:51,974][57339] Updated weights for policy 0, policy_version 678388 (0.0026) [2024-04-28 18:09:52,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 11114708992. Throughput: 0: 55751.9. Samples: 1604991820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:52,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 18:09:55,795][57339] Updated weights for policy 0, policy_version 678398 (0.0028) [2024-04-28 18:09:57,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11114954752. Throughput: 0: 55766.7. Samples: 1605325640. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:09:57,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 18:09:57,944][57339] Updated weights for policy 0, policy_version 678408 (0.0027) [2024-04-28 18:10:01,677][57339] Updated weights for policy 0, policy_version 678418 (0.0029) [2024-04-28 18:10:02,169][57108] Fps is (10 sec: 50791.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11115216896. Throughput: 0: 55598.4. Samples: 1605656240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:10:02,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 18:10:03,861][57339] Updated weights for policy 0, policy_version 678428 (0.0030) [2024-04-28 18:10:07,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55483.4). Total num frames: 11115495424. Throughput: 0: 55164.8. Samples: 1605809800. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:10:07,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 18:10:07,701][57339] Updated weights for policy 0, policy_version 678438 (0.0027) [2024-04-28 18:10:09,739][57339] Updated weights for policy 0, policy_version 678448 (0.0028) [2024-04-28 18:10:12,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11115790336. Throughput: 0: 55284.0. Samples: 1606140880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-04-28 18:10:12,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 18:10:13,351][57319] Signal inference workers to stop experience collection... (24000 times) [2024-04-28 18:10:13,351][57319] Signal inference workers to resume experience collection... (24000 times) [2024-04-28 18:10:13,377][57339] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-04-28 18:10:13,377][57339] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-04-28 18:10:13,468][57339] Updated weights for policy 0, policy_version 678458 (0.0031) [2024-04-28 18:10:15,707][57339] Updated weights for policy 0, policy_version 678468 (0.0035) [2024-04-28 18:10:17,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 11116068864. Throughput: 0: 55173.3. Samples: 1606471340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:10:17,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 18:10:19,307][57339] Updated weights for policy 0, policy_version 678478 (0.0034) [2024-04-28 18:10:21,722][57339] Updated weights for policy 0, policy_version 678488 (0.0027) [2024-04-28 18:10:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55594.6). Total num frames: 11116363776. Throughput: 0: 55763.1. Samples: 1606651100. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:10:22,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 18:10:25,176][57339] Updated weights for policy 0, policy_version 678498 (0.0026) [2024-04-28 18:10:27,169][57108] Fps is (10 sec: 58981.5, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11116658688. Throughput: 0: 55670.6. Samples: 1606983160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:10:27,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 18:10:27,521][57339] Updated weights for policy 0, policy_version 678508 (0.0028) [2024-04-28 18:10:31,166][57339] Updated weights for policy 0, policy_version 678518 (0.0029) [2024-04-28 18:10:32,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11116904448. Throughput: 0: 55774.9. Samples: 1607319380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:10:32,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 18:10:33,373][57339] Updated weights for policy 0, policy_version 678528 (0.0026) [2024-04-28 18:10:37,145][57339] Updated weights for policy 0, policy_version 678538 (0.0028) [2024-04-28 18:10:37,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11117166592. Throughput: 0: 55277.0. Samples: 1607479280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:10:37,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 18:10:39,286][57339] Updated weights for policy 0, policy_version 678548 (0.0027) [2024-04-28 18:10:42,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 11117428736. Throughput: 0: 55321.7. Samples: 1607815120. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:10:42,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 18:10:42,953][57339] Updated weights for policy 0, policy_version 678558 (0.0028) [2024-04-28 18:10:45,109][57339] Updated weights for policy 0, policy_version 678568 (0.0026) [2024-04-28 18:10:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54886.5, 300 sec: 55483.5). Total num frames: 11117723648. Throughput: 0: 55450.6. Samples: 1608151520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:10:47,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 18:10:48,734][57339] Updated weights for policy 0, policy_version 678578 (0.0033) [2024-04-28 18:10:51,036][57339] Updated weights for policy 0, policy_version 678588 (0.0029) [2024-04-28 18:10:52,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 11118034944. Throughput: 0: 55818.2. Samples: 1608321620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:10:52,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 18:10:54,721][57339] Updated weights for policy 0, policy_version 678598 (0.0032) [2024-04-28 18:10:56,900][57339] Updated weights for policy 0, policy_version 678608 (0.0027) [2024-04-28 18:10:57,169][57108] Fps is (10 sec: 60620.1, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 11118329856. Throughput: 0: 55847.9. Samples: 1608654040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:10:57,170][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 18:11:00,567][57339] Updated weights for policy 0, policy_version 678618 (0.0025) [2024-04-28 18:11:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 11118592000. Throughput: 0: 55867.8. Samples: 1608985400. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:02,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 18:11:02,849][57339] Updated weights for policy 0, policy_version 678628 (0.0028) [2024-04-28 18:11:06,313][57339] Updated weights for policy 0, policy_version 678638 (0.0029) [2024-04-28 18:11:07,169][57108] Fps is (10 sec: 52429.8, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 11118854144. Throughput: 0: 55694.4. Samples: 1609157340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:07,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 18:11:08,732][57339] Updated weights for policy 0, policy_version 678648 (0.0027) [2024-04-28 18:11:11,899][57319] Signal inference workers to stop experience collection... (24050 times) [2024-04-28 18:11:11,937][57339] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-04-28 18:11:11,954][57319] Signal inference workers to resume experience collection... (24050 times) [2024-04-28 18:11:11,956][57339] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-04-28 18:11:12,169][57108] Fps is (10 sec: 52429.9, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11119116288. Throughput: 0: 55684.2. Samples: 1609488940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:12,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 18:11:12,183][57339] Updated weights for policy 0, policy_version 678658 (0.0024) [2024-04-28 18:11:14,739][57339] Updated weights for policy 0, policy_version 678668 (0.0030) [2024-04-28 18:11:17,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 11119378432. Throughput: 0: 55694.1. Samples: 1609825620. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:17,170][57108] Avg episode reward: [(0, '0.715')] [2024-04-28 18:11:18,085][57339] Updated weights for policy 0, policy_version 678678 (0.0025) [2024-04-28 18:11:20,387][57339] Updated weights for policy 0, policy_version 678688 (0.0025) [2024-04-28 18:11:22,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11119689728. Throughput: 0: 55737.0. Samples: 1609987440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:22,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 18:11:23,891][57339] Updated weights for policy 0, policy_version 678698 (0.0025) [2024-04-28 18:11:26,256][57339] Updated weights for policy 0, policy_version 678708 (0.0025) [2024-04-28 18:11:27,169][57108] Fps is (10 sec: 60619.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11119984640. Throughput: 0: 55826.9. Samples: 1610327340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:27,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 18:11:27,283][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000678711_11120001024.pth... [2024-04-28 18:11:27,326][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000677896_11106648064.pth [2024-04-28 18:11:29,682][57339] Updated weights for policy 0, policy_version 678718 (0.0031) [2024-04-28 18:11:32,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11120263168. Throughput: 0: 55764.0. Samples: 1610660900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:32,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 18:11:32,233][57339] Updated weights for policy 0, policy_version 678728 (0.0027) [2024-04-28 18:11:35,654][57339] Updated weights for policy 0, policy_version 678738 (0.0036) [2024-04-28 18:11:37,169][57108] Fps is (10 sec: 55706.4, 60 sec: 56251.8, 300 sec: 55705.6). 
Total num frames: 11120541696. Throughput: 0: 55901.3. Samples: 1610837180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:37,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 18:11:38,123][57339] Updated weights for policy 0, policy_version 678748 (0.0026) [2024-04-28 18:11:41,523][57339] Updated weights for policy 0, policy_version 678758 (0.0031) [2024-04-28 18:11:42,169][57108] Fps is (10 sec: 54067.5, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 11120803840. Throughput: 0: 55898.9. Samples: 1611169480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:42,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 18:11:44,034][57339] Updated weights for policy 0, policy_version 678768 (0.0028) [2024-04-28 18:11:47,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 11121082368. Throughput: 0: 55954.0. Samples: 1611503320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:47,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 18:11:47,303][57339] Updated weights for policy 0, policy_version 678778 (0.0029) [2024-04-28 18:11:49,970][57339] Updated weights for policy 0, policy_version 678788 (0.0030) [2024-04-28 18:11:52,169][57108] Fps is (10 sec: 52428.5, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 11121328128. Throughput: 0: 55627.9. Samples: 1611660600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-04-28 18:11:52,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 18:11:53,241][57339] Updated weights for policy 0, policy_version 678798 (0.0025) [2024-04-28 18:11:55,811][57339] Updated weights for policy 0, policy_version 678808 (0.0027) [2024-04-28 18:11:57,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11121639424. Throughput: 0: 55675.4. Samples: 1611994340. 
Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:11:57,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 18:11:59,123][57339] Updated weights for policy 0, policy_version 678818 (0.0025) [2024-04-28 18:12:01,743][57339] Updated weights for policy 0, policy_version 678828 (0.0028) [2024-04-28 18:12:02,169][57108] Fps is (10 sec: 60620.3, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 11121934336. Throughput: 0: 55572.3. Samples: 1612326380. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:02,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 18:12:04,922][57339] Updated weights for policy 0, policy_version 678838 (0.0029) [2024-04-28 18:12:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 11122212864. Throughput: 0: 55739.9. Samples: 1612495740. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:07,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 18:12:07,668][57339] Updated weights for policy 0, policy_version 678848 (0.0030) [2024-04-28 18:12:10,988][57339] Updated weights for policy 0, policy_version 678858 (0.0028) [2024-04-28 18:12:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.5, 300 sec: 55705.6). Total num frames: 11122491392. Throughput: 0: 55643.2. Samples: 1612831280. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:12,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 18:12:13,499][57339] Updated weights for policy 0, policy_version 678868 (0.0031) [2024-04-28 18:12:15,938][57319] Signal inference workers to stop experience collection... (24100 times) [2024-04-28 18:12:15,939][57319] Signal inference workers to resume experience collection... 
(24100 times) [2024-04-28 18:12:15,966][57339] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-04-28 18:12:15,967][57339] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-04-28 18:12:16,828][57339] Updated weights for policy 0, policy_version 678878 (0.0027) [2024-04-28 18:12:17,169][57108] Fps is (10 sec: 54066.9, 60 sec: 56251.6, 300 sec: 55594.5). Total num frames: 11122753536. Throughput: 0: 55690.5. Samples: 1613166980. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:17,170][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 18:12:19,414][57339] Updated weights for policy 0, policy_version 678888 (0.0026) [2024-04-28 18:12:22,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11123032064. Throughput: 0: 55208.5. Samples: 1613321560. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:22,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 18:12:22,840][57339] Updated weights for policy 0, policy_version 678898 (0.0028) [2024-04-28 18:12:25,346][57339] Updated weights for policy 0, policy_version 678908 (0.0026) [2024-04-28 18:12:27,169][57108] Fps is (10 sec: 52427.4, 60 sec: 54886.2, 300 sec: 55483.4). Total num frames: 11123277824. Throughput: 0: 55256.8. Samples: 1613656060. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:27,170][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 18:12:28,799][57339] Updated weights for policy 0, policy_version 678918 (0.0032) [2024-04-28 18:12:31,184][57339] Updated weights for policy 0, policy_version 678928 (0.0028) [2024-04-28 18:12:32,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 11123572736. Throughput: 0: 55270.2. Samples: 1613990480. 
Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:32,169][57108] Avg episode reward: [(0, '0.702')] [2024-04-28 18:12:34,538][57339] Updated weights for policy 0, policy_version 678938 (0.0028) [2024-04-28 18:12:37,169][57108] Fps is (10 sec: 58984.0, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11123867648. Throughput: 0: 55539.4. Samples: 1614159880. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:37,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 18:12:37,323][57339] Updated weights for policy 0, policy_version 678948 (0.0028) [2024-04-28 18:12:40,473][57339] Updated weights for policy 0, policy_version 678958 (0.0029) [2024-04-28 18:12:42,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11124146176. Throughput: 0: 55563.1. Samples: 1614494680. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:42,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 18:12:43,128][57339] Updated weights for policy 0, policy_version 678968 (0.0034) [2024-04-28 18:12:46,302][57339] Updated weights for policy 0, policy_version 678978 (0.0029) [2024-04-28 18:12:47,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 11124441088. Throughput: 0: 55637.3. Samples: 1614830060. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:47,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 18:12:48,929][57339] Updated weights for policy 0, policy_version 678988 (0.0035) [2024-04-28 18:12:52,148][57339] Updated weights for policy 0, policy_version 678998 (0.0032) [2024-04-28 18:12:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55539.0). Total num frames: 11124703232. Throughput: 0: 55789.8. Samples: 1615006280. 
Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:52,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 18:12:54,615][57339] Updated weights for policy 0, policy_version 679008 (0.0030) [2024-04-28 18:12:57,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11124965376. Throughput: 0: 55722.7. Samples: 1615338800. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:12:57,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 18:12:57,946][57339] Updated weights for policy 0, policy_version 679018 (0.0027) [2024-04-28 18:13:00,659][57339] Updated weights for policy 0, policy_version 679028 (0.0029) [2024-04-28 18:13:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11125243904. Throughput: 0: 55643.7. Samples: 1615670940. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:13:02,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 18:13:03,845][57339] Updated weights for policy 0, policy_version 679038 (0.0031) [2024-04-28 18:13:03,978][57319] Signal inference workers to stop experience collection... (24150 times) [2024-04-28 18:13:04,017][57339] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-04-28 18:13:04,044][57319] Signal inference workers to resume experience collection... (24150 times) [2024-04-28 18:13:04,047][57339] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-04-28 18:13:06,560][57339] Updated weights for policy 0, policy_version 679048 (0.0031) [2024-04-28 18:13:07,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 11125522432. Throughput: 0: 55742.9. Samples: 1615830000. 
Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:13:07,170][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 18:13:09,729][57339] Updated weights for policy 0, policy_version 679058 (0.0032) [2024-04-28 18:13:12,169][57108] Fps is (10 sec: 58981.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11125833728. Throughput: 0: 55790.1. Samples: 1616166600. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:13:12,170][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 18:13:12,536][57339] Updated weights for policy 0, policy_version 679068 (0.0027) [2024-04-28 18:13:15,495][57339] Updated weights for policy 0, policy_version 679078 (0.0035) [2024-04-28 18:13:17,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11126095872. Throughput: 0: 55843.5. Samples: 1616503440. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:13:17,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 18:13:18,432][57339] Updated weights for policy 0, policy_version 679088 (0.0026) [2024-04-28 18:13:21,463][57339] Updated weights for policy 0, policy_version 679098 (0.0026) [2024-04-28 18:13:22,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 11126390784. Throughput: 0: 55974.8. Samples: 1616678740. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:13:22,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 18:13:24,230][57339] Updated weights for policy 0, policy_version 679108 (0.0033) [2024-04-28 18:13:27,169][57108] Fps is (10 sec: 55706.2, 60 sec: 56252.2, 300 sec: 55594.5). Total num frames: 11126652928. Throughput: 0: 55956.2. Samples: 1617012700. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:13:27,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 18:13:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000679117_11126652928.pth... 
[2024-04-28 18:13:27,237][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000678301_11113283584.pth [2024-04-28 18:13:27,357][57339] Updated weights for policy 0, policy_version 679118 (0.0029) [2024-04-28 18:13:30,161][57339] Updated weights for policy 0, policy_version 679128 (0.0031) [2024-04-28 18:13:32,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 11126915072. Throughput: 0: 55878.0. Samples: 1617344560. Policy #0 lag: (min: 2.0, avg: 11.8, max: 22.0) [2024-04-28 18:13:32,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 18:13:33,270][57339] Updated weights for policy 0, policy_version 679138 (0.0031) [2024-04-28 18:13:36,028][57339] Updated weights for policy 0, policy_version 679148 (0.0029) [2024-04-28 18:13:37,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55159.6, 300 sec: 55483.4). Total num frames: 11127177216. Throughput: 0: 55535.2. Samples: 1617505360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:13:37,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 18:13:39,044][57339] Updated weights for policy 0, policy_version 679158 (0.0033) [2024-04-28 18:13:41,875][57339] Updated weights for policy 0, policy_version 679168 (0.0033) [2024-04-28 18:13:42,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11127488512. Throughput: 0: 55613.4. Samples: 1617841400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:13:42,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 18:13:44,933][57339] Updated weights for policy 0, policy_version 679178 (0.0026) [2024-04-28 18:13:47,169][57108] Fps is (10 sec: 60621.1, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 11127783424. Throughput: 0: 55575.2. Samples: 1618171820. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:13:47,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 18:13:47,926][57339] Updated weights for policy 0, policy_version 679188 (0.0029) [2024-04-28 18:13:50,888][57339] Updated weights for policy 0, policy_version 679198 (0.0029) [2024-04-28 18:13:52,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 11128045568. Throughput: 0: 55849.6. Samples: 1618343220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:13:52,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:13:52,182][57319] Signal inference workers to stop experience collection... (24200 times) [2024-04-28 18:13:52,213][57339] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-04-28 18:13:52,238][57319] Signal inference workers to resume experience collection... (24200 times) [2024-04-28 18:13:52,239][57339] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-04-28 18:13:54,128][57339] Updated weights for policy 0, policy_version 679208 (0.0026) [2024-04-28 18:13:56,676][57339] Updated weights for policy 0, policy_version 679218 (0.0027) [2024-04-28 18:13:57,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 11128340480. Throughput: 0: 55777.5. Samples: 1618676580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:13:57,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 18:13:59,955][57339] Updated weights for policy 0, policy_version 679228 (0.0024) [2024-04-28 18:14:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11128602624. Throughput: 0: 55731.6. Samples: 1619011360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:02,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 18:14:02,447][57339] Updated weights for policy 0, policy_version 679238 (0.0028) [2024-04-28 18:14:05,663][57339] Updated weights for policy 0, policy_version 679248 (0.0037) [2024-04-28 18:14:07,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11128864768. Throughput: 0: 55357.8. Samples: 1619169840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:07,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 18:14:08,312][57339] Updated weights for policy 0, policy_version 679258 (0.0030) [2024-04-28 18:14:11,852][57339] Updated weights for policy 0, policy_version 679268 (0.0027) [2024-04-28 18:14:12,169][57108] Fps is (10 sec: 52428.6, 60 sec: 54886.5, 300 sec: 55483.5). Total num frames: 11129126912. Throughput: 0: 55368.3. Samples: 1619504280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:12,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 18:14:14,376][57339] Updated weights for policy 0, policy_version 679278 (0.0034) [2024-04-28 18:14:17,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11129421824. Throughput: 0: 55325.9. Samples: 1619834220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:17,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 18:14:17,590][57339] Updated weights for policy 0, policy_version 679288 (0.0029) [2024-04-28 18:14:20,152][57339] Updated weights for policy 0, policy_version 679298 (0.0026) [2024-04-28 18:14:22,169][57108] Fps is (10 sec: 60621.0, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 11129733120. Throughput: 0: 55712.0. Samples: 1620012400. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:22,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 18:14:23,312][57339] Updated weights for policy 0, policy_version 679308 (0.0028) [2024-04-28 18:14:26,029][57339] Updated weights for policy 0, policy_version 679318 (0.0032) [2024-04-28 18:14:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11129995264. Throughput: 0: 55618.8. Samples: 1620344240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:27,169][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 18:14:29,171][57339] Updated weights for policy 0, policy_version 679328 (0.0031) [2024-04-28 18:14:32,019][57339] Updated weights for policy 0, policy_version 679338 (0.0029) [2024-04-28 18:14:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 11130273792. Throughput: 0: 55754.7. Samples: 1620680780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:32,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 18:14:35,020][57339] Updated weights for policy 0, policy_version 679348 (0.0027) [2024-04-28 18:14:37,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11130519552. Throughput: 0: 55497.2. Samples: 1620840600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:37,170][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 18:14:37,985][57339] Updated weights for policy 0, policy_version 679358 (0.0031) [2024-04-28 18:14:41,013][57339] Updated weights for policy 0, policy_version 679368 (0.0034) [2024-04-28 18:14:42,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 11130798080. Throughput: 0: 55495.0. Samples: 1621173860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:42,170][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 18:14:43,728][57339] Updated weights for policy 0, policy_version 679378 (0.0030) [2024-04-28 18:14:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54886.3, 300 sec: 55483.5). Total num frames: 11131076608. Throughput: 0: 55572.8. Samples: 1621512140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:47,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 18:14:47,481][57339] Updated weights for policy 0, policy_version 679388 (0.0032) [2024-04-28 18:14:49,232][57319] Signal inference workers to stop experience collection... (24250 times) [2024-04-28 18:14:49,235][57319] Signal inference workers to resume experience collection... (24250 times) [2024-04-28 18:14:49,263][57339] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-04-28 18:14:49,263][57339] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-04-28 18:14:49,487][57339] Updated weights for policy 0, policy_version 679398 (0.0030) [2024-04-28 18:14:52,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11131387904. Throughput: 0: 55637.9. Samples: 1621673540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:52,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 18:14:53,375][57339] Updated weights for policy 0, policy_version 679408 (0.0030) [2024-04-28 18:14:55,395][57339] Updated weights for policy 0, policy_version 679418 (0.0030) [2024-04-28 18:14:57,169][57108] Fps is (10 sec: 60621.0, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11131682816. Throughput: 0: 55669.8. Samples: 1622009420. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:14:57,169][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 18:14:59,093][57339] Updated weights for policy 0, policy_version 679428 (0.0029) [2024-04-28 18:15:01,234][57339] Updated weights for policy 0, policy_version 679438 (0.0028) [2024-04-28 18:15:02,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11131961344. Throughput: 0: 55772.8. Samples: 1622344000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:15:02,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 18:15:05,265][57339] Updated weights for policy 0, policy_version 679448 (0.0026) [2024-04-28 18:15:07,115][57339] Updated weights for policy 0, policy_version 679458 (0.0028) [2024-04-28 18:15:07,169][57108] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11132239872. Throughput: 0: 55882.6. Samples: 1622527120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:15:07,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 18:15:11,000][57339] Updated weights for policy 0, policy_version 679468 (0.0025) [2024-04-28 18:15:12,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11132485632. Throughput: 0: 56042.6. Samples: 1622866160. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:12,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 18:15:12,830][57339] Updated weights for policy 0, policy_version 679478 (0.0029) [2024-04-28 18:15:16,746][57339] Updated weights for policy 0, policy_version 679488 (0.0026) [2024-04-28 18:15:17,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11132747776. Throughput: 0: 56017.7. Samples: 1623201580. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:17,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 18:15:18,719][57339] Updated weights for policy 0, policy_version 679498 (0.0027) [2024-04-28 18:15:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.4, 300 sec: 55483.5). Total num frames: 11133026304. Throughput: 0: 55803.1. Samples: 1623351740. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:22,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 18:15:22,524][57339] Updated weights for policy 0, policy_version 679508 (0.0031) [2024-04-28 18:15:24,745][57339] Updated weights for policy 0, policy_version 679518 (0.0034) [2024-04-28 18:15:27,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11133337600. Throughput: 0: 55804.1. Samples: 1623685040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:27,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 18:15:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000679525_11133337600.pth... [2024-04-28 18:15:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000678711_11120001024.pth [2024-04-28 18:15:28,401][57339] Updated weights for policy 0, policy_version 679528 (0.0027) [2024-04-28 18:15:30,608][57339] Updated weights for policy 0, policy_version 679538 (0.0033) [2024-04-28 18:15:32,169][57108] Fps is (10 sec: 60621.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11133632512. Throughput: 0: 55747.2. Samples: 1624020760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:32,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 18:15:34,251][57339] Updated weights for policy 0, policy_version 679548 (0.0025) [2024-04-28 18:15:36,290][57339] Updated weights for policy 0, policy_version 679558 (0.0025) [2024-04-28 18:15:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.7, 300 sec: 55816.7). 
Total num frames: 11133894656. Throughput: 0: 56115.0. Samples: 1624198720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:37,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 18:15:40,041][57339] Updated weights for policy 0, policy_version 679568 (0.0028) [2024-04-28 18:15:41,526][57319] Signal inference workers to stop experience collection... (24300 times) [2024-04-28 18:15:41,535][57319] Signal inference workers to resume experience collection... (24300 times) [2024-04-28 18:15:41,555][57339] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-04-28 18:15:41,555][57339] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-04-28 18:15:42,164][57339] Updated weights for policy 0, policy_version 679578 (0.0032) [2024-04-28 18:15:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 56798.0, 300 sec: 55872.2). Total num frames: 11134205952. Throughput: 0: 56122.3. Samples: 1624534920. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:42,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 18:15:45,964][57339] Updated weights for policy 0, policy_version 679588 (0.0031) [2024-04-28 18:15:47,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 11134451712. Throughput: 0: 56109.7. Samples: 1624868940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:47,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 18:15:47,994][57339] Updated weights for policy 0, policy_version 679598 (0.0028) [2024-04-28 18:15:51,719][57339] Updated weights for policy 0, policy_version 679608 (0.0029) [2024-04-28 18:15:52,169][57108] Fps is (10 sec: 50789.7, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 11134713856. Throughput: 0: 55506.2. Samples: 1625024900. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:52,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 18:15:53,939][57339] Updated weights for policy 0, policy_version 679618 (0.0033) [2024-04-28 18:15:57,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55159.5, 300 sec: 55594.6). Total num frames: 11134992384. Throughput: 0: 55431.1. Samples: 1625360560. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:15:57,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 18:15:57,500][57339] Updated weights for policy 0, policy_version 679628 (0.0026) [2024-04-28 18:15:59,646][57339] Updated weights for policy 0, policy_version 679638 (0.0034) [2024-04-28 18:16:02,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11135287296. Throughput: 0: 55444.0. Samples: 1625696560. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:02,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 18:16:03,425][57339] Updated weights for policy 0, policy_version 679648 (0.0030) [2024-04-28 18:16:05,759][57339] Updated weights for policy 0, policy_version 679658 (0.0036) [2024-04-28 18:16:07,169][57108] Fps is (10 sec: 60620.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11135598592. Throughput: 0: 55956.4. Samples: 1625869780. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:07,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 18:16:09,276][57339] Updated weights for policy 0, policy_version 679668 (0.0031) [2024-04-28 18:16:11,770][57339] Updated weights for policy 0, policy_version 679678 (0.0029) [2024-04-28 18:16:12,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11135860736. Throughput: 0: 55951.7. Samples: 1626202860. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:12,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 18:16:14,989][57339] Updated weights for policy 0, policy_version 679688 (0.0028) [2024-04-28 18:16:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 11136139264. Throughput: 0: 55979.9. Samples: 1626539860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:17,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 18:16:17,410][57339] Updated weights for policy 0, policy_version 679698 (0.0024) [2024-04-28 18:16:20,886][57339] Updated weights for policy 0, policy_version 679708 (0.0027) [2024-04-28 18:16:22,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56797.8, 300 sec: 55761.2). Total num frames: 11136434176. Throughput: 0: 55748.0. Samples: 1626707380. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:22,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 18:16:23,149][57339] Updated weights for policy 0, policy_version 679718 (0.0031) [2024-04-28 18:16:26,862][57339] Updated weights for policy 0, policy_version 679728 (0.0027) [2024-04-28 18:16:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 11136679936. Throughput: 0: 55880.7. Samples: 1627049560. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:27,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 18:16:28,998][57339] Updated weights for policy 0, policy_version 679738 (0.0030) [2024-04-28 18:16:32,169][57108] Fps is (10 sec: 49152.6, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 11136925696. Throughput: 0: 55881.9. Samples: 1627383620. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:32,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 18:16:32,699][57339] Updated weights for policy 0, policy_version 679748 (0.0028) [2024-04-28 18:16:34,873][57339] Updated weights for policy 0, policy_version 679758 (0.0025) [2024-04-28 18:16:37,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11137236992. Throughput: 0: 55843.7. Samples: 1627537860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:37,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 18:16:37,959][57319] Signal inference workers to stop experience collection... (24350 times) [2024-04-28 18:16:38,001][57339] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-04-28 18:16:38,014][57319] Signal inference workers to resume experience collection... (24350 times) [2024-04-28 18:16:38,020][57339] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-04-28 18:16:38,622][57339] Updated weights for policy 0, policy_version 679768 (0.0032) [2024-04-28 18:16:40,706][57339] Updated weights for policy 0, policy_version 679778 (0.0028) [2024-04-28 18:16:42,169][57108] Fps is (10 sec: 62258.2, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11137548288. Throughput: 0: 55897.2. Samples: 1627875940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:42,170][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 18:16:44,459][57339] Updated weights for policy 0, policy_version 679788 (0.0030) [2024-04-28 18:16:46,553][57339] Updated weights for policy 0, policy_version 679798 (0.0036) [2024-04-28 18:16:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 11137810432. Throughput: 0: 55869.9. Samples: 1628210700. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 25.0) [2024-04-28 18:16:47,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 18:16:50,241][57339] Updated weights for policy 0, policy_version 679808 (0.0030) [2024-04-28 18:16:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 11138105344. Throughput: 0: 56098.2. Samples: 1628394200. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:16:52,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 18:16:52,683][57339] Updated weights for policy 0, policy_version 679818 (0.0028) [2024-04-28 18:16:55,956][57339] Updated weights for policy 0, policy_version 679828 (0.0024) [2024-04-28 18:16:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11138367488. Throughput: 0: 56041.3. Samples: 1628724720. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:16:57,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 18:16:58,524][57339] Updated weights for policy 0, policy_version 679838 (0.0024) [2024-04-28 18:17:01,918][57339] Updated weights for policy 0, policy_version 679848 (0.0028) [2024-04-28 18:17:02,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11138646016. Throughput: 0: 56079.0. Samples: 1629063420. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:02,170][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 18:17:04,467][57339] Updated weights for policy 0, policy_version 679858 (0.0026) [2024-04-28 18:17:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 11138908160. Throughput: 0: 55879.7. Samples: 1629221960. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:07,169][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 18:17:07,742][57339] Updated weights for policy 0, policy_version 679868 (0.0027) [2024-04-28 18:17:10,341][57339] Updated weights for policy 0, policy_version 679878 (0.0028) [2024-04-28 18:17:12,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 11139170304. Throughput: 0: 55717.8. Samples: 1629556860. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:12,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 18:17:13,738][57339] Updated weights for policy 0, policy_version 679888 (0.0028) [2024-04-28 18:17:16,027][57339] Updated weights for policy 0, policy_version 679898 (0.0026) [2024-04-28 18:17:17,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11139481600. Throughput: 0: 55641.2. Samples: 1629887480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:17,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 18:17:19,512][57339] Updated weights for policy 0, policy_version 679908 (0.0032) [2024-04-28 18:17:21,725][57339] Updated weights for policy 0, policy_version 679918 (0.0035) [2024-04-28 18:17:22,169][57108] Fps is (10 sec: 60620.9, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 11139776512. Throughput: 0: 56084.8. Samples: 1630061680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:22,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 18:17:25,359][57339] Updated weights for policy 0, policy_version 679928 (0.0025) [2024-04-28 18:17:27,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56524.8, 300 sec: 55927.7). Total num frames: 11140071424. Throughput: 0: 56102.3. Samples: 1630400540. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:27,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 18:17:27,286][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000679937_11140087808.pth... [2024-04-28 18:17:27,335][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000679117_11126652928.pth [2024-04-28 18:17:27,479][57339] Updated weights for policy 0, policy_version 679938 (0.0030) [2024-04-28 18:17:31,190][57339] Updated weights for policy 0, policy_version 679948 (0.0035) [2024-04-28 18:17:32,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56797.8, 300 sec: 55816.7). Total num frames: 11140333568. Throughput: 0: 56114.1. Samples: 1630735840. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:32,169][57108] Avg episode reward: [(0, '0.693')] [2024-04-28 18:17:33,889][57339] Updated weights for policy 0, policy_version 679958 (0.0026) [2024-04-28 18:17:36,925][57339] Updated weights for policy 0, policy_version 679968 (0.0027) [2024-04-28 18:17:37,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11140612096. Throughput: 0: 55746.8. Samples: 1630902800. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:37,170][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 18:17:38,127][57319] Signal inference workers to stop experience collection... (24400 times) [2024-04-28 18:17:38,128][57319] Signal inference workers to resume experience collection... (24400 times) [2024-04-28 18:17:38,143][57339] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-04-28 18:17:38,144][57339] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-04-28 18:17:39,713][57339] Updated weights for policy 0, policy_version 679978 (0.0028) [2024-04-28 18:17:42,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.7, 300 sec: 55650.1). Total num frames: 11140857856. Throughput: 0: 55904.1. 
Samples: 1631240400. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:42,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 18:17:42,618][57339] Updated weights for policy 0, policy_version 679988 (0.0028) [2024-04-28 18:17:45,713][57339] Updated weights for policy 0, policy_version 679998 (0.0025) [2024-04-28 18:17:47,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 11141136384. Throughput: 0: 55935.0. Samples: 1631580500. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:47,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 18:17:48,564][57339] Updated weights for policy 0, policy_version 680008 (0.0028) [2024-04-28 18:17:51,598][57339] Updated weights for policy 0, policy_version 680018 (0.0031) [2024-04-28 18:17:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 11141414912. Throughput: 0: 55903.6. Samples: 1631737620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:52,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 18:17:54,488][57339] Updated weights for policy 0, policy_version 680028 (0.0025) [2024-04-28 18:17:57,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11141726208. Throughput: 0: 55813.8. Samples: 1632068480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:17:57,170][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 18:17:57,465][57339] Updated weights for policy 0, policy_version 680038 (0.0030) [2024-04-28 18:18:00,422][57339] Updated weights for policy 0, policy_version 680048 (0.0026) [2024-04-28 18:18:02,169][57108] Fps is (10 sec: 60620.6, 60 sec: 56251.9, 300 sec: 55927.8). Total num frames: 11142021120. Throughput: 0: 55807.1. Samples: 1632398800. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:18:02,169][57108] Avg episode reward: [(0, '0.473')] [2024-04-28 18:18:03,212][57339] Updated weights for policy 0, policy_version 680058 (0.0031) [2024-04-28 18:18:06,217][57339] Updated weights for policy 0, policy_version 680068 (0.0028) [2024-04-28 18:18:07,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 11142299648. Throughput: 0: 56130.3. Samples: 1632587540. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:18:07,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 18:18:09,084][57339] Updated weights for policy 0, policy_version 680078 (0.0031) [2024-04-28 18:18:12,080][57339] Updated weights for policy 0, policy_version 680088 (0.0029) [2024-04-28 18:18:12,169][57108] Fps is (10 sec: 54066.6, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 11142561792. Throughput: 0: 56006.6. Samples: 1632920840. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:18:12,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 18:18:14,820][57339] Updated weights for policy 0, policy_version 680098 (0.0025) [2024-04-28 18:18:17,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11142807552. Throughput: 0: 55948.5. Samples: 1633253520. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:18:17,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 18:18:17,966][57339] Updated weights for policy 0, policy_version 680108 (0.0025) [2024-04-28 18:18:20,873][57339] Updated weights for policy 0, policy_version 680118 (0.0025) [2024-04-28 18:18:22,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 11143086080. Throughput: 0: 55673.7. Samples: 1633408120. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:18:22,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 18:18:23,606][57339] Updated weights for policy 0, policy_version 680128 (0.0029) [2024-04-28 18:18:26,718][57339] Updated weights for policy 0, policy_version 680138 (0.0030) [2024-04-28 18:18:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11143380992. Throughput: 0: 55759.8. Samples: 1633749600. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-04-28 18:18:27,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 18:18:29,427][57339] Updated weights for policy 0, policy_version 680148 (0.0022) [2024-04-28 18:18:29,441][57319] Signal inference workers to stop experience collection... (24450 times) [2024-04-28 18:18:29,441][57319] Signal inference workers to resume experience collection... (24450 times) [2024-04-28 18:18:29,465][57339] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-04-28 18:18:29,465][57339] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-04-28 18:18:32,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.6, 300 sec: 55927.7). Total num frames: 11143675904. Throughput: 0: 55544.6. Samples: 1634080000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:18:32,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 18:18:33,213][57339] Updated weights for policy 0, policy_version 680158 (0.0039) [2024-04-28 18:18:35,382][57339] Updated weights for policy 0, policy_version 680168 (0.0030) [2024-04-28 18:18:37,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11143970816. Throughput: 0: 56056.9. Samples: 1634260180. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:18:37,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 18:18:39,076][57339] Updated weights for policy 0, policy_version 680178 (0.0038) [2024-04-28 18:18:41,038][57339] Updated weights for policy 0, policy_version 680188 (0.0025) [2024-04-28 18:18:42,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56524.6, 300 sec: 55816.7). Total num frames: 11144249344. Throughput: 0: 56089.8. Samples: 1634592520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:18:42,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 18:18:44,846][57339] Updated weights for policy 0, policy_version 680198 (0.0034) [2024-04-28 18:18:47,065][57339] Updated weights for policy 0, policy_version 680208 (0.0025) [2024-04-28 18:18:47,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56525.0, 300 sec: 55872.2). Total num frames: 11144527872. Throughput: 0: 56116.1. Samples: 1634924020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:18:47,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 18:18:50,785][57339] Updated weights for policy 0, policy_version 680218 (0.0029) [2024-04-28 18:18:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11144773632. Throughput: 0: 55579.4. Samples: 1635088620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:18:52,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 18:18:52,807][57339] Updated weights for policy 0, policy_version 680228 (0.0031) [2024-04-28 18:18:56,748][57339] Updated weights for policy 0, policy_version 680238 (0.0033) [2024-04-28 18:18:57,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11145052160. Throughput: 0: 55672.1. Samples: 1635426080. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:18:57,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 18:18:58,815][57339] Updated weights for policy 0, policy_version 680248 (0.0024) [2024-04-28 18:19:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55159.3, 300 sec: 55816.7). Total num frames: 11145330688. Throughput: 0: 55633.2. Samples: 1635757020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:02,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 18:19:02,571][57339] Updated weights for policy 0, policy_version 680258 (0.0027) [2024-04-28 18:19:04,667][57339] Updated weights for policy 0, policy_version 680268 (0.0034) [2024-04-28 18:19:07,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 11145625600. Throughput: 0: 55797.9. Samples: 1635919020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:07,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 18:19:08,345][57339] Updated weights for policy 0, policy_version 680278 (0.0028) [2024-04-28 18:19:10,386][57339] Updated weights for policy 0, policy_version 680288 (0.0027) [2024-04-28 18:19:12,169][57108] Fps is (10 sec: 58983.2, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 11145920512. Throughput: 0: 55641.8. Samples: 1636253480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 18:19:14,286][57339] Updated weights for policy 0, policy_version 680298 (0.0025) [2024-04-28 18:19:16,193][57339] Updated weights for policy 0, policy_version 680308 (0.0027) [2024-04-28 18:19:17,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56797.9, 300 sec: 55872.2). Total num frames: 11146215424. Throughput: 0: 55724.1. Samples: 1636587580. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:17,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 18:19:20,249][57339] Updated weights for policy 0, policy_version 680318 (0.0024) [2024-04-28 18:19:22,175][57108] Fps is (10 sec: 55672.9, 60 sec: 56519.3, 300 sec: 55871.1). Total num frames: 11146477568. Throughput: 0: 55715.3. Samples: 1636767700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:22,175][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 18:19:22,228][57339] Updated weights for policy 0, policy_version 680328 (0.0024) [2024-04-28 18:19:26,146][57339] Updated weights for policy 0, policy_version 680338 (0.0026) [2024-04-28 18:19:27,169][57108] Fps is (10 sec: 50789.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11146723328. Throughput: 0: 55823.8. Samples: 1637104600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:27,170][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 18:19:27,288][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000680343_11146739712.pth... [2024-04-28 18:19:27,338][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000679525_11133337600.pth [2024-04-28 18:19:28,126][57339] Updated weights for policy 0, policy_version 680348 (0.0031) [2024-04-28 18:19:28,647][57319] Signal inference workers to stop experience collection... (24500 times) [2024-04-28 18:19:28,647][57319] Signal inference workers to resume experience collection... (24500 times) [2024-04-28 18:19:28,660][57339] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-04-28 18:19:28,660][57339] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-04-28 18:19:32,001][57339] Updated weights for policy 0, policy_version 680358 (0.0030) [2024-04-28 18:19:32,169][57108] Fps is (10 sec: 50820.6, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 11146985472. Throughput: 0: 55831.1. 
Samples: 1637436420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:32,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 18:19:34,067][57339] Updated weights for policy 0, policy_version 680368 (0.0030) [2024-04-28 18:19:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.3, 300 sec: 55872.2). Total num frames: 11147280384. Throughput: 0: 55597.3. Samples: 1637590500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:37,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 18:19:37,774][57339] Updated weights for policy 0, policy_version 680378 (0.0036) [2024-04-28 18:19:39,928][57339] Updated weights for policy 0, policy_version 680388 (0.0029) [2024-04-28 18:19:42,169][57108] Fps is (10 sec: 60619.6, 60 sec: 55705.5, 300 sec: 55983.3). Total num frames: 11147591680. Throughput: 0: 55579.4. Samples: 1637927160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:42,170][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 18:19:43,648][57339] Updated weights for policy 0, policy_version 680398 (0.0034) [2024-04-28 18:19:45,722][57339] Updated weights for policy 0, policy_version 680408 (0.0026) [2024-04-28 18:19:47,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.3, 300 sec: 55816.6). Total num frames: 11147853824. Throughput: 0: 55588.9. Samples: 1638258520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:47,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 18:19:49,542][57339] Updated weights for policy 0, policy_version 680418 (0.0029) [2024-04-28 18:19:51,660][57339] Updated weights for policy 0, policy_version 680428 (0.0031) [2024-04-28 18:19:52,169][57108] Fps is (10 sec: 55706.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11148148736. Throughput: 0: 55867.5. Samples: 1638433060. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:52,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 18:19:55,325][57339] Updated weights for policy 0, policy_version 680438 (0.0027) [2024-04-28 18:19:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 11148427264. Throughput: 0: 55826.5. Samples: 1638765680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:19:57,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:19:57,407][57339] Updated weights for policy 0, policy_version 680448 (0.0031) [2024-04-28 18:20:01,243][57339] Updated weights for policy 0, policy_version 680458 (0.0034) [2024-04-28 18:20:02,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11148673024. Throughput: 0: 55828.0. Samples: 1639099840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:20:02,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 18:20:03,178][57339] Updated weights for policy 0, policy_version 680468 (0.0028) [2024-04-28 18:20:07,169][57108] Fps is (10 sec: 50791.6, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 11148935168. Throughput: 0: 55400.2. Samples: 1639260380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-04-28 18:20:07,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 18:20:07,226][57339] Updated weights for policy 0, policy_version 680478 (0.0026) [2024-04-28 18:20:09,069][57339] Updated weights for policy 0, policy_version 680488 (0.0026) [2024-04-28 18:20:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 54886.4, 300 sec: 55816.7). Total num frames: 11149213696. Throughput: 0: 55426.9. Samples: 1639598800. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:12,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 18:20:12,997][57339] Updated weights for policy 0, policy_version 680498 (0.0038) [2024-04-28 18:20:14,920][57339] Updated weights for policy 0, policy_version 680508 (0.0025) [2024-04-28 18:20:17,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55159.5, 300 sec: 55927.8). Total num frames: 11149524992. Throughput: 0: 55531.1. Samples: 1639935320. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:17,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 18:20:18,932][57339] Updated weights for policy 0, policy_version 680518 (0.0029) [2024-04-28 18:20:20,982][57339] Updated weights for policy 0, policy_version 680528 (0.0030) [2024-04-28 18:20:22,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55438.1, 300 sec: 55816.7). Total num frames: 11149803520. Throughput: 0: 55808.3. Samples: 1640101860. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:22,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 18:20:24,633][57339] Updated weights for policy 0, policy_version 680538 (0.0026) [2024-04-28 18:20:27,126][57339] Updated weights for policy 0, policy_version 680548 (0.0031) [2024-04-28 18:20:27,169][57108] Fps is (10 sec: 57342.8, 60 sec: 56251.7, 300 sec: 55816.6). Total num frames: 11150098432. Throughput: 0: 55871.9. Samples: 1640441400. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:27,170][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 18:20:30,374][57339] Updated weights for policy 0, policy_version 680558 (0.0026) [2024-04-28 18:20:32,169][57108] Fps is (10 sec: 58981.4, 60 sec: 56797.8, 300 sec: 55927.8). Total num frames: 11150393344. Throughput: 0: 55935.2. Samples: 1640775600. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:32,170][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 18:20:32,852][57339] Updated weights for policy 0, policy_version 680568 (0.0032) [2024-04-28 18:20:36,214][57339] Updated weights for policy 0, policy_version 680578 (0.0026) [2024-04-28 18:20:37,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11150639104. Throughput: 0: 55852.3. Samples: 1640946420. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:37,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 18:20:38,849][57339] Updated weights for policy 0, policy_version 680588 (0.0034) [2024-04-28 18:20:42,109][57339] Updated weights for policy 0, policy_version 680598 (0.0028) [2024-04-28 18:20:42,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 11150917632. Throughput: 0: 55888.3. Samples: 1641280640. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:42,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 18:20:44,848][57339] Updated weights for policy 0, policy_version 680608 (0.0031) [2024-04-28 18:20:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11151179776. Throughput: 0: 55810.5. Samples: 1641611320. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:47,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 18:20:47,861][57339] Updated weights for policy 0, policy_version 680618 (0.0024) [2024-04-28 18:20:48,911][57319] Signal inference workers to stop experience collection... (24550 times) [2024-04-28 18:20:48,917][57319] Signal inference workers to resume experience collection... 
(24550 times) [2024-04-28 18:20:48,924][57339] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-04-28 18:20:48,935][57339] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-04-28 18:20:50,607][57339] Updated weights for policy 0, policy_version 680628 (0.0024) [2024-04-28 18:20:52,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11151458304. Throughput: 0: 55758.6. Samples: 1641769520. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:52,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 18:20:53,740][57339] Updated weights for policy 0, policy_version 680638 (0.0033) [2024-04-28 18:20:56,342][57339] Updated weights for policy 0, policy_version 680648 (0.0029) [2024-04-28 18:20:57,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 11151769600. Throughput: 0: 55671.4. Samples: 1642104020. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:20:57,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 18:20:59,421][57339] Updated weights for policy 0, policy_version 680658 (0.0028) [2024-04-28 18:21:02,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 11152048128. Throughput: 0: 55625.8. Samples: 1642438480. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:02,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 18:21:02,296][57339] Updated weights for policy 0, policy_version 680668 (0.0024) [2024-04-28 18:21:05,417][57339] Updated weights for policy 0, policy_version 680678 (0.0027) [2024-04-28 18:21:07,169][57108] Fps is (10 sec: 55706.2, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 11152326656. Throughput: 0: 55876.8. Samples: 1642616320. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:07,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 18:21:08,322][57339] Updated weights for policy 0, policy_version 680688 (0.0029) [2024-04-28 18:21:11,342][57339] Updated weights for policy 0, policy_version 680698 (0.0025) [2024-04-28 18:21:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 11152605184. Throughput: 0: 55873.1. Samples: 1642955680. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:12,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 18:21:14,390][57339] Updated weights for policy 0, policy_version 680708 (0.0034) [2024-04-28 18:21:17,107][57339] Updated weights for policy 0, policy_version 680718 (0.0029) [2024-04-28 18:21:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11152883712. Throughput: 0: 55727.1. Samples: 1643283320. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:17,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 18:21:20,122][57339] Updated weights for policy 0, policy_version 680728 (0.0029) [2024-04-28 18:21:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11153145856. Throughput: 0: 55697.0. Samples: 1643452780. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:22,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:21:22,895][57339] Updated weights for policy 0, policy_version 680738 (0.0032) [2024-04-28 18:21:25,880][57339] Updated weights for policy 0, policy_version 680748 (0.0026) [2024-04-28 18:21:27,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.7, 300 sec: 55927.7). Total num frames: 11153424384. Throughput: 0: 55762.5. Samples: 1643789960. 
Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:27,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 18:21:27,185][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000680751_11153424384.pth... [2024-04-28 18:21:27,234][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000679937_11140087808.pth [2024-04-28 18:21:28,941][57339] Updated weights for policy 0, policy_version 680758 (0.0028) [2024-04-28 18:21:31,780][57339] Updated weights for policy 0, policy_version 680768 (0.0024) [2024-04-28 18:21:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 11153702912. Throughput: 0: 55849.6. Samples: 1644124540. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:32,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 18:21:34,924][57339] Updated weights for policy 0, policy_version 680778 (0.0030) [2024-04-28 18:21:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11153997824. Throughput: 0: 56126.0. Samples: 1644295200. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:37,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 18:21:37,863][57339] Updated weights for policy 0, policy_version 680788 (0.0028) [2024-04-28 18:21:40,872][57339] Updated weights for policy 0, policy_version 680798 (0.0028) [2024-04-28 18:21:42,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11154276352. Throughput: 0: 56055.8. Samples: 1644626520. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:42,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 18:21:43,727][57339] Updated weights for policy 0, policy_version 680808 (0.0024) [2024-04-28 18:21:46,857][57339] Updated weights for policy 0, policy_version 680818 (0.0033) [2024-04-28 18:21:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55761.1). 
Total num frames: 11154554880. Throughput: 0: 56019.9. Samples: 1644959380. Policy #0 lag: (min: 1.0, avg: 12.8, max: 23.0) [2024-04-28 18:21:47,169][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 18:21:49,596][57339] Updated weights for policy 0, policy_version 680828 (0.0031) [2024-04-28 18:21:52,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11154817024. Throughput: 0: 55660.4. Samples: 1645121040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:21:52,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 18:21:52,765][57339] Updated weights for policy 0, policy_version 680838 (0.0024) [2024-04-28 18:21:52,954][57319] Signal inference workers to stop experience collection... (24600 times) [2024-04-28 18:21:52,991][57339] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-04-28 18:21:53,044][57319] Signal inference workers to resume experience collection... (24600 times) [2024-04-28 18:21:53,045][57339] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-04-28 18:21:55,621][57339] Updated weights for policy 0, policy_version 680848 (0.0030) [2024-04-28 18:21:57,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 11155095552. Throughput: 0: 55560.8. Samples: 1645455920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:21:57,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 18:21:58,598][57339] Updated weights for policy 0, policy_version 680858 (0.0033) [2024-04-28 18:22:01,544][57339] Updated weights for policy 0, policy_version 680868 (0.0040) [2024-04-28 18:22:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11155374080. Throughput: 0: 55620.5. Samples: 1645786240. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:02,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 18:22:04,476][57339] Updated weights for policy 0, policy_version 680878 (0.0032) [2024-04-28 18:22:07,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 11155636224. Throughput: 0: 55477.6. Samples: 1645949280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:07,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 18:22:07,425][57339] Updated weights for policy 0, policy_version 680888 (0.0027) [2024-04-28 18:22:10,348][57339] Updated weights for policy 0, policy_version 680898 (0.0030) [2024-04-28 18:22:12,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11155931136. Throughput: 0: 55318.6. Samples: 1646279300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:12,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 18:22:13,260][57339] Updated weights for policy 0, policy_version 680908 (0.0028) [2024-04-28 18:22:16,254][57339] Updated weights for policy 0, policy_version 680918 (0.0031) [2024-04-28 18:22:17,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11156226048. Throughput: 0: 55212.2. Samples: 1646609100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:17,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 18:22:19,106][57339] Updated weights for policy 0, policy_version 680928 (0.0028) [2024-04-28 18:22:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11156471808. Throughput: 0: 55236.0. Samples: 1646780820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:22,170][57108] Avg episode reward: [(0, '0.693')] [2024-04-28 18:22:22,196][57339] Updated weights for policy 0, policy_version 680938 (0.0031) [2024-04-28 18:22:24,949][57339] Updated weights for policy 0, policy_version 680948 (0.0027) [2024-04-28 18:22:27,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11156750336. Throughput: 0: 55264.8. Samples: 1647113440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:27,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 18:22:28,152][57339] Updated weights for policy 0, policy_version 680958 (0.0029) [2024-04-28 18:22:30,951][57339] Updated weights for policy 0, policy_version 680968 (0.0031) [2024-04-28 18:22:32,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11157028864. Throughput: 0: 55258.8. Samples: 1647446020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:32,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 18:22:33,939][57339] Updated weights for policy 0, policy_version 680978 (0.0028) [2024-04-28 18:22:36,923][57339] Updated weights for policy 0, policy_version 680988 (0.0029) [2024-04-28 18:22:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 11157307392. Throughput: 0: 55358.7. Samples: 1647612180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:37,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 18:22:39,892][57339] Updated weights for policy 0, policy_version 680998 (0.0025) [2024-04-28 18:22:42,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55159.3, 300 sec: 55761.2). Total num frames: 11157585920. Throughput: 0: 55325.7. Samples: 1647945580. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:42,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 18:22:42,763][57339] Updated weights for policy 0, policy_version 681008 (0.0028) [2024-04-28 18:22:45,704][57339] Updated weights for policy 0, policy_version 681018 (0.0029) [2024-04-28 18:22:47,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11157880832. Throughput: 0: 55338.1. Samples: 1648276460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:47,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 18:22:48,761][57339] Updated weights for policy 0, policy_version 681028 (0.0030) [2024-04-28 18:22:51,551][57339] Updated weights for policy 0, policy_version 681038 (0.0031) [2024-04-28 18:22:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11158159360. Throughput: 0: 55408.1. Samples: 1648442640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:52,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 18:22:54,670][57339] Updated weights for policy 0, policy_version 681048 (0.0031) [2024-04-28 18:22:57,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11158405120. Throughput: 0: 55574.0. Samples: 1648780120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:22:57,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 18:22:57,450][57339] Updated weights for policy 0, policy_version 681058 (0.0030) [2024-04-28 18:23:00,629][57339] Updated weights for policy 0, policy_version 681068 (0.0029) [2024-04-28 18:23:02,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11158700032. Throughput: 0: 55591.1. Samples: 1649110700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:23:02,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 18:23:03,439][57339] Updated weights for policy 0, policy_version 681078 (0.0034) [2024-04-28 18:23:06,339][57339] Updated weights for policy 0, policy_version 681088 (0.0026) [2024-04-28 18:23:07,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11158978560. Throughput: 0: 55450.4. Samples: 1649276080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:23:07,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 18:23:08,906][57319] Signal inference workers to stop experience collection... (24650 times) [2024-04-28 18:23:08,911][57319] Signal inference workers to resume experience collection... (24650 times) [2024-04-28 18:23:08,921][57339] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-04-28 18:23:08,935][57339] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-04-28 18:23:09,290][57339] Updated weights for policy 0, policy_version 681098 (0.0026) [2024-04-28 18:23:12,112][57339] Updated weights for policy 0, policy_version 681108 (0.0037) [2024-04-28 18:23:12,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11159273472. Throughput: 0: 55529.4. Samples: 1649612260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:23:12,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 18:23:15,187][57339] Updated weights for policy 0, policy_version 681118 (0.0026) [2024-04-28 18:23:17,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 11159535616. Throughput: 0: 55516.3. Samples: 1649944260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:23:17,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 18:23:18,137][57339] Updated weights for policy 0, policy_version 681128 (0.0031) [2024-04-28 18:23:21,085][57339] Updated weights for policy 0, policy_version 681138 (0.0028) [2024-04-28 18:23:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11159814144. Throughput: 0: 55542.7. Samples: 1650111600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-04-28 18:23:22,169][57108] Avg episode reward: [(0, '0.722')] [2024-04-28 18:23:23,895][57339] Updated weights for policy 0, policy_version 681148 (0.0030) [2024-04-28 18:23:26,882][57339] Updated weights for policy 0, policy_version 681158 (0.0027) [2024-04-28 18:23:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11160092672. Throughput: 0: 55525.3. Samples: 1650444220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:23:27,170][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 18:23:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000681158_11160092672.pth... [2024-04-28 18:23:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000680343_11146739712.pth [2024-04-28 18:23:29,971][57339] Updated weights for policy 0, policy_version 681168 (0.0027) [2024-04-28 18:23:32,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 11160338432. Throughput: 0: 55617.8. Samples: 1650779260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:23:32,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 18:23:32,718][57339] Updated weights for policy 0, policy_version 681178 (0.0026) [2024-04-28 18:23:35,748][57339] Updated weights for policy 0, policy_version 681188 (0.0034) [2024-04-28 18:23:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55650.0). 
Total num frames: 11160666112. Throughput: 0: 55573.7. Samples: 1650943460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:23:37,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 18:23:38,653][57339] Updated weights for policy 0, policy_version 681198 (0.0026) [2024-04-28 18:23:41,681][57339] Updated weights for policy 0, policy_version 681208 (0.0027) [2024-04-28 18:23:42,169][57108] Fps is (10 sec: 60620.6, 60 sec: 55978.7, 300 sec: 55650.0). Total num frames: 11160944640. Throughput: 0: 55575.9. Samples: 1651281040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:23:42,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 18:23:44,549][57339] Updated weights for policy 0, policy_version 681218 (0.0026) [2024-04-28 18:23:47,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11161206784. Throughput: 0: 55773.5. Samples: 1651620500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:23:47,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 18:23:47,550][57339] Updated weights for policy 0, policy_version 681228 (0.0034) [2024-04-28 18:23:50,465][57339] Updated weights for policy 0, policy_version 681238 (0.0035) [2024-04-28 18:23:52,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11161485312. Throughput: 0: 55884.3. Samples: 1651790880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:23:52,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 18:23:53,252][57339] Updated weights for policy 0, policy_version 681248 (0.0024) [2024-04-28 18:23:56,715][57339] Updated weights for policy 0, policy_version 681258 (0.0031) [2024-04-28 18:23:57,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11161747456. Throughput: 0: 55812.1. Samples: 1652123800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:23:57,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 18:23:59,025][57339] Updated weights for policy 0, policy_version 681268 (0.0029) [2024-04-28 18:24:02,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11162025984. Throughput: 0: 55790.3. Samples: 1652454820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:02,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 18:24:02,634][57339] Updated weights for policy 0, policy_version 681278 (0.0031) [2024-04-28 18:24:04,416][57319] Signal inference workers to stop experience collection... (24700 times) [2024-04-28 18:24:04,423][57319] Signal inference workers to resume experience collection... (24700 times) [2024-04-28 18:24:04,446][57339] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-04-28 18:24:04,446][57339] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-04-28 18:24:04,839][57339] Updated weights for policy 0, policy_version 681288 (0.0029) [2024-04-28 18:24:07,169][57108] Fps is (10 sec: 57342.5, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 11162320896. Throughput: 0: 55748.2. Samples: 1652620280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:07,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 18:24:08,490][57339] Updated weights for policy 0, policy_version 681298 (0.0029) [2024-04-28 18:24:10,784][57339] Updated weights for policy 0, policy_version 681308 (0.0029) [2024-04-28 18:24:12,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.4, 300 sec: 55539.0). Total num frames: 11162599424. Throughput: 0: 55834.2. Samples: 1652956760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:12,170][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 18:24:14,224][57339] Updated weights for policy 0, policy_version 681318 (0.0027) [2024-04-28 18:24:16,670][57339] Updated weights for policy 0, policy_version 681328 (0.0034) [2024-04-28 18:24:17,169][57108] Fps is (10 sec: 58983.8, 60 sec: 56251.9, 300 sec: 55706.7). Total num frames: 11162910720. Throughput: 0: 55771.2. Samples: 1653288960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:17,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 18:24:20,013][57339] Updated weights for policy 0, policy_version 681338 (0.0027) [2024-04-28 18:24:22,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 11163172864. Throughput: 0: 55946.4. Samples: 1653461040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:22,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 18:24:22,356][57339] Updated weights for policy 0, policy_version 681348 (0.0037) [2024-04-28 18:24:25,851][57339] Updated weights for policy 0, policy_version 681358 (0.0028) [2024-04-28 18:24:27,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 11163435008. Throughput: 0: 55991.2. Samples: 1653800640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:27,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 18:24:28,277][57339] Updated weights for policy 0, policy_version 681368 (0.0023) [2024-04-28 18:24:31,607][57339] Updated weights for policy 0, policy_version 681378 (0.0033) [2024-04-28 18:24:32,169][57108] Fps is (10 sec: 54067.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11163713536. Throughput: 0: 55834.7. Samples: 1654133060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:32,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 18:24:34,211][57339] Updated weights for policy 0, policy_version 681388 (0.0026) [2024-04-28 18:24:37,169][57108] Fps is (10 sec: 55704.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11163992064. Throughput: 0: 55593.7. Samples: 1654292600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:37,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 18:24:37,483][57339] Updated weights for policy 0, policy_version 681398 (0.0028) [2024-04-28 18:24:40,108][57339] Updated weights for policy 0, policy_version 681408 (0.0028) [2024-04-28 18:24:42,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11164270592. Throughput: 0: 55678.2. Samples: 1654629320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:42,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 18:24:43,511][57339] Updated weights for policy 0, policy_version 681418 (0.0033) [2024-04-28 18:24:45,888][57339] Updated weights for policy 0, policy_version 681428 (0.0027) [2024-04-28 18:24:47,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11164549120. Throughput: 0: 55685.8. Samples: 1654960680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:47,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 18:24:49,399][57339] Updated weights for policy 0, policy_version 681438 (0.0029) [2024-04-28 18:24:51,703][57339] Updated weights for policy 0, policy_version 681448 (0.0033) [2024-04-28 18:24:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 11164844032. Throughput: 0: 55836.3. Samples: 1655132900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:52,169][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 18:24:55,184][57339] Updated weights for policy 0, policy_version 681458 (0.0030) [2024-04-28 18:24:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11165106176. Throughput: 0: 55810.4. Samples: 1655468220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:24:57,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 18:24:57,765][57339] Updated weights for policy 0, policy_version 681468 (0.0032) [2024-04-28 18:25:00,996][57339] Updated weights for policy 0, policy_version 681478 (0.0027) [2024-04-28 18:25:02,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11165384704. Throughput: 0: 55743.0. Samples: 1655797400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-04-28 18:25:02,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 18:25:02,985][57319] Signal inference workers to stop experience collection... (24750 times) [2024-04-28 18:25:03,018][57339] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-04-28 18:25:03,075][57319] Signal inference workers to resume experience collection... (24750 times) [2024-04-28 18:25:03,075][57339] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-04-28 18:25:03,911][57339] Updated weights for policy 0, policy_version 681488 (0.0028) [2024-04-28 18:25:06,923][57339] Updated weights for policy 0, policy_version 681498 (0.0031) [2024-04-28 18:25:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 11165663232. Throughput: 0: 55636.1. Samples: 1655964660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:07,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 18:25:09,827][57339] Updated weights for policy 0, policy_version 681508 (0.0030) [2024-04-28 18:25:12,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11165941760. Throughput: 0: 55533.7. Samples: 1656299660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:12,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 18:25:12,901][57339] Updated weights for policy 0, policy_version 681518 (0.0034) [2024-04-28 18:25:15,613][57339] Updated weights for policy 0, policy_version 681528 (0.0027) [2024-04-28 18:25:17,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 11166220288. Throughput: 0: 55574.6. Samples: 1656633920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:17,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 18:25:18,632][57339] Updated weights for policy 0, policy_version 681538 (0.0036) [2024-04-28 18:25:21,357][57339] Updated weights for policy 0, policy_version 681548 (0.0032) [2024-04-28 18:25:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55594.6). Total num frames: 11166498816. Throughput: 0: 55841.1. Samples: 1656805440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:22,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:25:24,445][57339] Updated weights for policy 0, policy_version 681558 (0.0025) [2024-04-28 18:25:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11166793728. Throughput: 0: 55771.0. Samples: 1657139020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:27,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 18:25:27,269][57339] Updated weights for policy 0, policy_version 681568 (0.0038) [2024-04-28 18:25:27,272][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000681568_11166810112.pth... [2024-04-28 18:25:27,321][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000680751_11153424384.pth [2024-04-28 18:25:30,699][57339] Updated weights for policy 0, policy_version 681578 (0.0029) [2024-04-28 18:25:32,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11167072256. Throughput: 0: 55668.4. Samples: 1657465760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:32,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 18:25:33,192][57339] Updated weights for policy 0, policy_version 681588 (0.0030) [2024-04-28 18:25:36,600][57339] Updated weights for policy 0, policy_version 681598 (0.0028) [2024-04-28 18:25:37,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11167318016. Throughput: 0: 55560.7. Samples: 1657633140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:37,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 18:25:39,188][57339] Updated weights for policy 0, policy_version 681608 (0.0028) [2024-04-28 18:25:42,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11167596544. Throughput: 0: 55503.2. Samples: 1657965860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:42,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 18:25:42,384][57339] Updated weights for policy 0, policy_version 681618 (0.0032) [2024-04-28 18:25:45,016][57339] Updated weights for policy 0, policy_version 681628 (0.0026) [2024-04-28 18:25:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55650.1). 
Total num frames: 11167875072. Throughput: 0: 55605.4. Samples: 1658299640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:47,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 18:25:48,209][57339] Updated weights for policy 0, policy_version 681638 (0.0030) [2024-04-28 18:25:51,308][57339] Updated weights for policy 0, policy_version 681648 (0.0028) [2024-04-28 18:25:52,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11168169984. Throughput: 0: 55399.5. Samples: 1658457640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:52,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 18:25:54,257][57339] Updated weights for policy 0, policy_version 681658 (0.0026) [2024-04-28 18:25:57,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11168432128. Throughput: 0: 55336.3. Samples: 1658789800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:25:57,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 18:25:57,434][57339] Updated weights for policy 0, policy_version 681668 (0.0030) [2024-04-28 18:25:58,348][57319] Signal inference workers to stop experience collection... (24800 times) [2024-04-28 18:25:58,374][57339] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-04-28 18:25:58,401][57319] Signal inference workers to resume experience collection... (24800 times) [2024-04-28 18:25:58,401][57339] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-04-28 18:26:00,254][57339] Updated weights for policy 0, policy_version 681678 (0.0028) [2024-04-28 18:26:02,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 11168710656. Throughput: 0: 55364.7. Samples: 1659125320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:26:02,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 18:26:03,220][57339] Updated weights for policy 0, policy_version 681688 (0.0032) [2024-04-28 18:26:05,952][57339] Updated weights for policy 0, policy_version 681698 (0.0026) [2024-04-28 18:26:07,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 11169021952. Throughput: 0: 55448.3. Samples: 1659300620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:26:07,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 18:26:09,220][57339] Updated weights for policy 0, policy_version 681708 (0.0030) [2024-04-28 18:26:11,811][57339] Updated weights for policy 0, policy_version 681718 (0.0030) [2024-04-28 18:26:12,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11169284096. Throughput: 0: 55398.3. Samples: 1659631940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:26:12,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 18:26:15,188][57339] Updated weights for policy 0, policy_version 681728 (0.0036) [2024-04-28 18:26:17,169][57108] Fps is (10 sec: 50790.3, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11169529856. Throughput: 0: 55413.7. Samples: 1659959380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:26:17,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 18:26:17,898][57339] Updated weights for policy 0, policy_version 681738 (0.0027) [2024-04-28 18:26:20,954][57339] Updated weights for policy 0, policy_version 681748 (0.0026) [2024-04-28 18:26:22,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11169808384. Throughput: 0: 55291.2. Samples: 1660121240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:26:22,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 18:26:23,938][57339] Updated weights for policy 0, policy_version 681758 (0.0037) [2024-04-28 18:26:26,732][57339] Updated weights for policy 0, policy_version 681768 (0.0029) [2024-04-28 18:26:27,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11170103296. Throughput: 0: 55228.4. Samples: 1660451140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:26:27,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 18:26:29,769][57339] Updated weights for policy 0, policy_version 681778 (0.0028) [2024-04-28 18:26:32,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 11170381824. Throughput: 0: 55258.3. Samples: 1660786260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:26:32,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 18:26:32,683][57339] Updated weights for policy 0, policy_version 681788 (0.0034) [2024-04-28 18:26:35,639][57339] Updated weights for policy 0, policy_version 681798 (0.0030) [2024-04-28 18:26:37,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 11170676736. Throughput: 0: 55450.2. Samples: 1660952900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:26:37,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 18:26:38,654][57339] Updated weights for policy 0, policy_version 681808 (0.0028) [2024-04-28 18:26:41,477][57339] Updated weights for policy 0, policy_version 681818 (0.0026) [2024-04-28 18:26:42,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 11170955264. Throughput: 0: 55467.0. Samples: 1661285820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-04-28 18:26:42,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 18:26:44,664][57339] Updated weights for policy 0, policy_version 681828 (0.0028) [2024-04-28 18:26:47,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11171201024. Throughput: 0: 55430.5. Samples: 1661619700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:26:47,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 18:26:47,361][57339] Updated weights for policy 0, policy_version 681838 (0.0027) [2024-04-28 18:26:50,528][57339] Updated weights for policy 0, policy_version 681848 (0.0032) [2024-04-28 18:26:52,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11171479552. Throughput: 0: 55325.0. Samples: 1661790240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:26:52,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 18:26:53,250][57339] Updated weights for policy 0, policy_version 681858 (0.0030) [2024-04-28 18:26:56,434][57339] Updated weights for policy 0, policy_version 681868 (0.0025) [2024-04-28 18:26:57,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55159.6, 300 sec: 55483.5). Total num frames: 11171741696. Throughput: 0: 55431.7. Samples: 1662126360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:26:57,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 18:26:58,459][57319] Signal inference workers to stop experience collection... (24850 times) [2024-04-28 18:26:58,490][57339] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-04-28 18:26:58,509][57319] Signal inference workers to resume experience collection... 
(24850 times) [2024-04-28 18:26:58,509][57339] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-04-28 18:26:59,001][57339] Updated weights for policy 0, policy_version 681878 (0.0030) [2024-04-28 18:27:02,167][57339] Updated weights for policy 0, policy_version 681888 (0.0031) [2024-04-28 18:27:02,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.4, 300 sec: 55650.1). Total num frames: 11172052992. Throughput: 0: 55549.8. Samples: 1662459120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:02,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 18:27:05,145][57339] Updated weights for policy 0, policy_version 681898 (0.0027) [2024-04-28 18:27:07,169][57108] Fps is (10 sec: 57343.5, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 11172315136. Throughput: 0: 55607.1. Samples: 1662623560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:07,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 18:27:07,879][57339] Updated weights for policy 0, policy_version 681908 (0.0031) [2024-04-28 18:27:10,992][57339] Updated weights for policy 0, policy_version 681918 (0.0029) [2024-04-28 18:27:12,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 11172626432. Throughput: 0: 55720.9. Samples: 1662958580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:12,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 18:27:13,849][57339] Updated weights for policy 0, policy_version 681928 (0.0029) [2024-04-28 18:27:16,819][57339] Updated weights for policy 0, policy_version 681938 (0.0035) [2024-04-28 18:27:17,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11172888576. Throughput: 0: 55666.5. Samples: 1663291260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:17,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 18:27:19,772][57339] Updated weights for policy 0, policy_version 681948 (0.0026) [2024-04-28 18:27:22,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11173150720. Throughput: 0: 55678.2. Samples: 1663458420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:22,170][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 18:27:22,619][57339] Updated weights for policy 0, policy_version 681958 (0.0025) [2024-04-28 18:27:25,652][57339] Updated weights for policy 0, policy_version 681968 (0.0027) [2024-04-28 18:27:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11173429248. Throughput: 0: 55691.6. Samples: 1663791940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:27,170][57108] Avg episode reward: [(0, '0.676')] [2024-04-28 18:27:27,273][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000681973_11173445632.pth... [2024-04-28 18:27:27,322][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000681158_11160092672.pth [2024-04-28 18:27:28,546][57339] Updated weights for policy 0, policy_version 681978 (0.0023) [2024-04-28 18:27:31,795][57339] Updated weights for policy 0, policy_version 681988 (0.0030) [2024-04-28 18:27:32,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11173707776. Throughput: 0: 55792.8. Samples: 1664130380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:32,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 18:27:34,457][57339] Updated weights for policy 0, policy_version 681998 (0.0031) [2024-04-28 18:27:37,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11174002688. Throughput: 0: 55552.3. Samples: 1664290100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:37,170][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 18:27:37,495][57339] Updated weights for policy 0, policy_version 682008 (0.0030) [2024-04-28 18:27:40,189][57339] Updated weights for policy 0, policy_version 682018 (0.0028) [2024-04-28 18:27:42,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11174264832. Throughput: 0: 55536.3. Samples: 1664625500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:42,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 18:27:43,422][57339] Updated weights for policy 0, policy_version 682028 (0.0026) [2024-04-28 18:27:46,194][57339] Updated weights for policy 0, policy_version 682038 (0.0031) [2024-04-28 18:27:47,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 11174576128. Throughput: 0: 55553.0. Samples: 1664959000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:47,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 18:27:49,379][57339] Updated weights for policy 0, policy_version 682048 (0.0031) [2024-04-28 18:27:52,168][57339] Updated weights for policy 0, policy_version 682058 (0.0028) [2024-04-28 18:27:52,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11174838272. Throughput: 0: 55743.2. Samples: 1665132000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:52,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 18:27:55,046][57339] Updated weights for policy 0, policy_version 682068 (0.0031) [2024-04-28 18:27:57,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11175100416. Throughput: 0: 55731.5. Samples: 1665466500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:27:57,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 18:27:58,008][57339] Updated weights for policy 0, policy_version 682078 (0.0031) [2024-04-28 18:28:00,763][57339] Updated weights for policy 0, policy_version 682088 (0.0031) [2024-04-28 18:28:02,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 11175395328. Throughput: 0: 55820.0. Samples: 1665803160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:28:02,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 18:28:04,026][57339] Updated weights for policy 0, policy_version 682098 (0.0028) [2024-04-28 18:28:06,587][57339] Updated weights for policy 0, policy_version 682108 (0.0029) [2024-04-28 18:28:07,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11175673856. Throughput: 0: 55768.4. Samples: 1665968000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:28:07,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 18:28:09,778][57339] Updated weights for policy 0, policy_version 682118 (0.0028) [2024-04-28 18:28:10,231][57319] Signal inference workers to stop experience collection... (24900 times) [2024-04-28 18:28:10,231][57319] Signal inference workers to resume experience collection... (24900 times) [2024-04-28 18:28:10,263][57339] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-04-28 18:28:10,263][57339] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-04-28 18:28:12,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11175952384. Throughput: 0: 55825.9. Samples: 1666304100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:28:12,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 18:28:12,709][57339] Updated weights for policy 0, policy_version 682128 (0.0035) [2024-04-28 18:28:15,557][57339] Updated weights for policy 0, policy_version 682138 (0.0030) [2024-04-28 18:28:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.0). Total num frames: 11176230912. Throughput: 0: 55808.5. Samples: 1666641760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:28:17,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 18:28:18,483][57339] Updated weights for policy 0, policy_version 682148 (0.0025) [2024-04-28 18:28:21,512][57339] Updated weights for policy 0, policy_version 682158 (0.0027) [2024-04-28 18:28:22,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11176525824. Throughput: 0: 56072.5. Samples: 1666813360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-04-28 18:28:22,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 18:28:24,190][57339] Updated weights for policy 0, policy_version 682168 (0.0028) [2024-04-28 18:28:27,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11176787968. Throughput: 0: 56050.2. Samples: 1667147760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:28:27,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 18:28:27,491][57339] Updated weights for policy 0, policy_version 682178 (0.0032) [2024-04-28 18:28:30,050][57339] Updated weights for policy 0, policy_version 682188 (0.0027) [2024-04-28 18:28:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.8, 300 sec: 55594.6). Total num frames: 11177066496. Throughput: 0: 56258.6. Samples: 1667490640. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:28:32,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 18:28:33,232][57339] Updated weights for policy 0, policy_version 682198 (0.0025) [2024-04-28 18:28:35,983][57339] Updated weights for policy 0, policy_version 682208 (0.0029) [2024-04-28 18:28:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 11177345024. Throughput: 0: 55931.1. Samples: 1667648900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:28:37,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 18:28:39,170][57339] Updated weights for policy 0, policy_version 682218 (0.0027) [2024-04-28 18:28:41,786][57339] Updated weights for policy 0, policy_version 682228 (0.0030) [2024-04-28 18:28:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11177639936. Throughput: 0: 55920.1. Samples: 1667982900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:28:42,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 18:28:45,030][57339] Updated weights for policy 0, policy_version 682238 (0.0026) [2024-04-28 18:28:47,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 11177934848. Throughput: 0: 55889.0. Samples: 1668318160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:28:47,169][57108] Avg episode reward: [(0, '0.704')] [2024-04-28 18:28:47,535][57339] Updated weights for policy 0, policy_version 682248 (0.0027) [2024-04-28 18:28:50,897][57339] Updated weights for policy 0, policy_version 682258 (0.0027) [2024-04-28 18:28:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11178196992. Throughput: 0: 56069.4. Samples: 1668491120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:28:52,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 18:28:53,417][57339] Updated weights for policy 0, policy_version 682268 (0.0030) [2024-04-28 18:28:56,832][57339] Updated weights for policy 0, policy_version 682278 (0.0029) [2024-04-28 18:28:57,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11178459136. Throughput: 0: 56024.9. Samples: 1668825220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:28:57,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 18:28:59,327][57339] Updated weights for policy 0, policy_version 682288 (0.0028) [2024-04-28 18:29:02,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.6, 300 sec: 55594.6). Total num frames: 11178721280. Throughput: 0: 55954.3. Samples: 1669159700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:02,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 18:29:02,517][57339] Updated weights for policy 0, policy_version 682298 (0.0030) [2024-04-28 18:29:05,281][57339] Updated weights for policy 0, policy_version 682308 (0.0031) [2024-04-28 18:29:07,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11179016192. Throughput: 0: 55764.0. Samples: 1669322740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:07,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 18:29:08,268][57339] Updated weights for policy 0, policy_version 682318 (0.0030) [2024-04-28 18:29:11,026][57339] Updated weights for policy 0, policy_version 682328 (0.0029) [2024-04-28 18:29:12,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 11179311104. Throughput: 0: 55896.8. Samples: 1669663120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:12,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 18:29:14,243][57339] Updated weights for policy 0, policy_version 682338 (0.0030) [2024-04-28 18:29:16,733][57339] Updated weights for policy 0, policy_version 682348 (0.0033) [2024-04-28 18:29:17,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11179606016. Throughput: 0: 55711.6. Samples: 1669997660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:17,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 18:29:20,030][57339] Updated weights for policy 0, policy_version 682358 (0.0031) [2024-04-28 18:29:22,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11179884544. Throughput: 0: 56096.8. Samples: 1670173260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:22,170][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 18:29:22,554][57339] Updated weights for policy 0, policy_version 682368 (0.0034) [2024-04-28 18:29:25,945][57339] Updated weights for policy 0, policy_version 682378 (0.0033) [2024-04-28 18:29:26,241][57319] Signal inference workers to stop experience collection... (24950 times) [2024-04-28 18:29:26,241][57319] Signal inference workers to resume experience collection... (24950 times) [2024-04-28 18:29:26,266][57339] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-04-28 18:29:26,266][57339] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-04-28 18:29:27,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 11180163072. Throughput: 0: 56202.2. Samples: 1670512000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:27,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 18:29:27,257][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000682384_11180179456.pth... 
[2024-04-28 18:29:27,306][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000681568_11166810112.pth [2024-04-28 18:29:28,434][57339] Updated weights for policy 0, policy_version 682388 (0.0029) [2024-04-28 18:29:31,817][57339] Updated weights for policy 0, policy_version 682398 (0.0027) [2024-04-28 18:29:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11180425216. Throughput: 0: 56121.4. Samples: 1670843620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:32,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 18:29:34,211][57339] Updated weights for policy 0, policy_version 682408 (0.0030) [2024-04-28 18:29:37,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 11180687360. Throughput: 0: 55704.4. Samples: 1670997820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:37,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 18:29:37,745][57339] Updated weights for policy 0, policy_version 682418 (0.0031) [2024-04-28 18:29:40,021][57339] Updated weights for policy 0, policy_version 682428 (0.0030) [2024-04-28 18:29:42,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 11180982272. Throughput: 0: 55701.1. Samples: 1671331780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:42,170][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 18:29:43,552][57339] Updated weights for policy 0, policy_version 682438 (0.0027) [2024-04-28 18:29:46,054][57339] Updated weights for policy 0, policy_version 682448 (0.0033) [2024-04-28 18:29:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11181260800. Throughput: 0: 55653.3. Samples: 1671664100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:47,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 18:29:49,467][57339] Updated weights for policy 0, policy_version 682458 (0.0029) [2024-04-28 18:29:52,133][57339] Updated weights for policy 0, policy_version 682468 (0.0032) [2024-04-28 18:29:52,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11181555712. Throughput: 0: 55794.5. Samples: 1671833500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:52,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 18:29:55,333][57339] Updated weights for policy 0, policy_version 682478 (0.0028) [2024-04-28 18:29:57,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11181817856. Throughput: 0: 55658.0. Samples: 1672167720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:29:57,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 18:29:57,909][57339] Updated weights for policy 0, policy_version 682488 (0.0028) [2024-04-28 18:30:01,151][57339] Updated weights for policy 0, policy_version 682498 (0.0027) [2024-04-28 18:30:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11182096384. Throughput: 0: 55687.4. Samples: 1672503600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 18:30:02,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 18:30:03,741][57339] Updated weights for policy 0, policy_version 682508 (0.0030) [2024-04-28 18:30:07,053][57339] Updated weights for policy 0, policy_version 682518 (0.0027) [2024-04-28 18:30:07,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11182374912. Throughput: 0: 55441.5. Samples: 1672668120. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:07,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 18:30:09,810][57339] Updated weights for policy 0, policy_version 682528 (0.0030) [2024-04-28 18:30:12,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 11182620672. Throughput: 0: 55232.8. Samples: 1672997480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:12,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 18:30:13,048][57339] Updated weights for policy 0, policy_version 682538 (0.0027) [2024-04-28 18:30:13,995][57319] Signal inference workers to stop experience collection... (25000 times) [2024-04-28 18:30:13,995][57319] Signal inference workers to resume experience collection... (25000 times) [2024-04-28 18:30:14,007][57339] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-04-28 18:30:14,008][57339] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-04-28 18:30:15,796][57339] Updated weights for policy 0, policy_version 682548 (0.0027) [2024-04-28 18:30:17,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11182915584. Throughput: 0: 55300.5. Samples: 1673332140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:17,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 18:30:19,102][57339] Updated weights for policy 0, policy_version 682558 (0.0030) [2024-04-28 18:30:21,692][57339] Updated weights for policy 0, policy_version 682568 (0.0027) [2024-04-28 18:30:22,169][57108] Fps is (10 sec: 57342.8, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 11183194112. Throughput: 0: 55584.7. Samples: 1673499140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:22,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 18:30:24,975][57339] Updated weights for policy 0, policy_version 682578 (0.0033) [2024-04-28 18:30:27,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 11183489024. Throughput: 0: 55573.0. Samples: 1673832560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:27,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 18:30:27,510][57339] Updated weights for policy 0, policy_version 682588 (0.0027) [2024-04-28 18:30:30,918][57339] Updated weights for policy 0, policy_version 682598 (0.0027) [2024-04-28 18:30:32,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11183767552. Throughput: 0: 55435.1. Samples: 1674158680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:32,170][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 18:30:33,659][57339] Updated weights for policy 0, policy_version 682608 (0.0029) [2024-04-28 18:30:36,865][57339] Updated weights for policy 0, policy_version 682618 (0.0029) [2024-04-28 18:30:37,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 11184013312. Throughput: 0: 55455.6. Samples: 1674329000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:37,170][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 18:30:39,736][57339] Updated weights for policy 0, policy_version 682628 (0.0034) [2024-04-28 18:30:42,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 11184291840. Throughput: 0: 55431.4. Samples: 1674662140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:42,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 18:30:42,767][57339] Updated weights for policy 0, policy_version 682638 (0.0027) [2024-04-28 18:30:45,821][57339] Updated weights for policy 0, policy_version 682648 (0.0035) [2024-04-28 18:30:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11184570368. Throughput: 0: 55386.7. Samples: 1674996000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:47,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 18:30:48,574][57339] Updated weights for policy 0, policy_version 682658 (0.0035) [2024-04-28 18:30:52,018][57339] Updated weights for policy 0, policy_version 682668 (0.0028) [2024-04-28 18:30:52,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54613.4, 300 sec: 55594.5). Total num frames: 11184832512. Throughput: 0: 55233.8. Samples: 1675153640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:52,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 18:30:54,547][57339] Updated weights for policy 0, policy_version 682678 (0.0033) [2024-04-28 18:30:57,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.3, 300 sec: 55650.0). Total num frames: 11185127424. Throughput: 0: 55202.1. Samples: 1675481580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:30:57,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 18:30:57,860][57339] Updated weights for policy 0, policy_version 682688 (0.0025) [2024-04-28 18:31:00,581][57339] Updated weights for policy 0, policy_version 682698 (0.0033) [2024-04-28 18:31:02,169][57108] Fps is (10 sec: 60620.0, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11185438720. Throughput: 0: 55088.3. Samples: 1675811120. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:31:02,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 18:31:03,830][57339] Updated weights for policy 0, policy_version 682708 (0.0031) [2024-04-28 18:31:06,456][57339] Updated weights for policy 0, policy_version 682718 (0.0035) [2024-04-28 18:31:07,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11185684480. Throughput: 0: 55354.1. Samples: 1675990060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:31:07,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 18:31:09,770][57339] Updated weights for policy 0, policy_version 682728 (0.0035) [2024-04-28 18:31:12,110][57319] Signal inference workers to stop experience collection... (25050 times) [2024-04-28 18:31:12,149][57339] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-04-28 18:31:12,169][57108] Fps is (10 sec: 50791.2, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11185946624. Throughput: 0: 55269.9. Samples: 1676319700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:31:12,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 18:31:12,175][57319] Signal inference workers to resume experience collection... (25050 times) [2024-04-28 18:31:12,179][57339] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-04-28 18:31:12,318][57339] Updated weights for policy 0, policy_version 682738 (0.0032) [2024-04-28 18:31:15,595][57339] Updated weights for policy 0, policy_version 682748 (0.0030) [2024-04-28 18:31:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11186225152. Throughput: 0: 55435.2. Samples: 1676653260. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:31:17,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 18:31:18,146][57339] Updated weights for policy 0, policy_version 682758 (0.0031) [2024-04-28 18:31:21,462][57339] Updated weights for policy 0, policy_version 682768 (0.0027) [2024-04-28 18:31:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54886.6, 300 sec: 55539.0). Total num frames: 11186487296. Throughput: 0: 55085.8. Samples: 1676807860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:31:22,169][57108] Avg episode reward: [(0, '0.497')] [2024-04-28 18:31:24,085][57339] Updated weights for policy 0, policy_version 682778 (0.0029) [2024-04-28 18:31:27,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54613.4, 300 sec: 55539.0). Total num frames: 11186765824. Throughput: 0: 55156.6. Samples: 1677144180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:31:27,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 18:31:27,185][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000682787_11186782208.pth... [2024-04-28 18:31:27,240][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000681973_11173445632.pth [2024-04-28 18:31:27,376][57339] Updated weights for policy 0, policy_version 682788 (0.0035) [2024-04-28 18:31:29,891][57339] Updated weights for policy 0, policy_version 682798 (0.0032) [2024-04-28 18:31:32,169][57108] Fps is (10 sec: 60621.4, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 11187093504. Throughput: 0: 55177.1. Samples: 1677478960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:31:32,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 18:31:33,288][57339] Updated weights for policy 0, policy_version 682808 (0.0031) [2024-04-28 18:31:35,701][57339] Updated weights for policy 0, policy_version 682818 (0.0030) [2024-04-28 18:31:37,169][57108] Fps is (10 sec: 60619.9, 60 sec: 55978.6, 300 sec: 55650.1). 
Total num frames: 11187372032. Throughput: 0: 55547.8. Samples: 1677653300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:31:37,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 18:31:39,304][57339] Updated weights for policy 0, policy_version 682828 (0.0034) [2024-04-28 18:31:41,538][57339] Updated weights for policy 0, policy_version 682838 (0.0031) [2024-04-28 18:31:42,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11187634176. Throughput: 0: 55609.0. Samples: 1677983980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 18:31:42,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 18:31:45,273][57339] Updated weights for policy 0, policy_version 682848 (0.0028) [2024-04-28 18:31:47,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11187912704. Throughput: 0: 55689.4. Samples: 1678317140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:31:47,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 18:31:47,502][57339] Updated weights for policy 0, policy_version 682858 (0.0035) [2024-04-28 18:31:51,202][57339] Updated weights for policy 0, policy_version 682868 (0.0032) [2024-04-28 18:31:52,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11188174848. Throughput: 0: 55304.8. Samples: 1678478780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:31:52,169][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 18:31:53,386][57339] Updated weights for policy 0, policy_version 682878 (0.0026) [2024-04-28 18:31:57,084][57339] Updated weights for policy 0, policy_version 682888 (0.0033) [2024-04-28 18:31:57,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 11188436992. Throughput: 0: 55406.6. Samples: 1678813000. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:31:57,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 18:31:59,218][57339] Updated weights for policy 0, policy_version 682898 (0.0032) [2024-04-28 18:32:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 54886.6, 300 sec: 55650.1). Total num frames: 11188731904. Throughput: 0: 55448.9. Samples: 1679148460. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:02,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 18:32:02,890][57339] Updated weights for policy 0, policy_version 682908 (0.0032) [2024-04-28 18:32:05,137][57339] Updated weights for policy 0, policy_version 682918 (0.0025) [2024-04-28 18:32:07,169][57108] Fps is (10 sec: 58981.2, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 11189026816. Throughput: 0: 55710.0. Samples: 1679314820. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:07,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 18:32:08,708][57339] Updated weights for policy 0, policy_version 682928 (0.0030) [2024-04-28 18:32:11,115][57339] Updated weights for policy 0, policy_version 682938 (0.0026) [2024-04-28 18:32:12,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 11189305344. Throughput: 0: 55598.1. Samples: 1679646100. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:12,170][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 18:32:14,766][57339] Updated weights for policy 0, policy_version 682948 (0.0027) [2024-04-28 18:32:16,918][57339] Updated weights for policy 0, policy_version 682958 (0.0030) [2024-04-28 18:32:17,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 11189583872. Throughput: 0: 55556.2. Samples: 1679979000. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:17,170][57108] Avg episode reward: [(0, '0.499')] [2024-04-28 18:32:20,476][57339] Updated weights for policy 0, policy_version 682968 (0.0028) [2024-04-28 18:32:22,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11189846016. Throughput: 0: 55483.3. Samples: 1680150040. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:22,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 18:32:22,726][57339] Updated weights for policy 0, policy_version 682978 (0.0024) [2024-04-28 18:32:26,239][57339] Updated weights for policy 0, policy_version 682988 (0.0029) [2024-04-28 18:32:27,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55705.6, 300 sec: 55594.6). Total num frames: 11190108160. Throughput: 0: 55682.3. Samples: 1680489680. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:27,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 18:32:27,358][57319] Signal inference workers to stop experience collection... (25100 times) [2024-04-28 18:32:27,359][57319] Signal inference workers to resume experience collection... (25100 times) [2024-04-28 18:32:27,371][57339] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-04-28 18:32:27,371][57339] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-04-28 18:32:28,518][57339] Updated weights for policy 0, policy_version 682998 (0.0029) [2024-04-28 18:32:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 11190386688. Throughput: 0: 55744.1. Samples: 1680825620. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:32,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 18:32:32,187][57339] Updated weights for policy 0, policy_version 683008 (0.0034) [2024-04-28 18:32:34,398][57339] Updated weights for policy 0, policy_version 683018 (0.0026) [2024-04-28 18:32:37,169][57108] Fps is (10 sec: 55705.7, 60 sec: 54886.6, 300 sec: 55594.5). Total num frames: 11190665216. Throughput: 0: 55645.0. Samples: 1680982800. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:37,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 18:32:37,978][57339] Updated weights for policy 0, policy_version 683028 (0.0028) [2024-04-28 18:32:40,399][57339] Updated weights for policy 0, policy_version 683038 (0.0034) [2024-04-28 18:32:42,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11190976512. Throughput: 0: 55763.9. Samples: 1681322380. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:42,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 18:32:43,897][57339] Updated weights for policy 0, policy_version 683048 (0.0030) [2024-04-28 18:32:46,238][57339] Updated weights for policy 0, policy_version 683058 (0.0029) [2024-04-28 18:32:47,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11191255040. Throughput: 0: 55817.7. Samples: 1681660260. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:47,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 18:32:49,620][57339] Updated weights for policy 0, policy_version 683068 (0.0034) [2024-04-28 18:32:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 11191533568. Throughput: 0: 55925.4. Samples: 1681831460. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:52,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 18:32:52,293][57339] Updated weights for policy 0, policy_version 683078 (0.0029) [2024-04-28 18:32:55,419][57339] Updated weights for policy 0, policy_version 683088 (0.0027) [2024-04-28 18:32:57,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 11191812096. Throughput: 0: 55934.4. Samples: 1682163140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:32:57,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 18:32:58,099][57339] Updated weights for policy 0, policy_version 683098 (0.0030) [2024-04-28 18:33:01,385][57339] Updated weights for policy 0, policy_version 683108 (0.0029) [2024-04-28 18:33:02,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11192090624. Throughput: 0: 56091.8. Samples: 1682503120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:33:02,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 18:33:03,969][57339] Updated weights for policy 0, policy_version 683118 (0.0027) [2024-04-28 18:33:07,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11192352768. Throughput: 0: 55957.1. Samples: 1682668120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:33:07,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 18:33:07,321][57339] Updated weights for policy 0, policy_version 683128 (0.0025) [2024-04-28 18:33:09,859][57339] Updated weights for policy 0, policy_version 683138 (0.0031) [2024-04-28 18:33:12,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.7, 300 sec: 55594.6). Total num frames: 11192631296. Throughput: 0: 55744.5. Samples: 1682998180. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:33:12,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 18:33:13,042][57339] Updated weights for policy 0, policy_version 683148 (0.0028) [2024-04-28 18:33:15,622][57339] Updated weights for policy 0, policy_version 683158 (0.0028) [2024-04-28 18:33:17,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11192926208. Throughput: 0: 55757.1. Samples: 1683334700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-04-28 18:33:17,170][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 18:33:18,778][57339] Updated weights for policy 0, policy_version 683168 (0.0024) [2024-04-28 18:33:21,415][57339] Updated weights for policy 0, policy_version 683178 (0.0026) [2024-04-28 18:33:22,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11193221120. Throughput: 0: 56030.2. Samples: 1683504160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:33:22,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 18:33:24,468][57319] Signal inference workers to stop experience collection... (25150 times) [2024-04-28 18:33:24,489][57339] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-04-28 18:33:24,526][57319] Signal inference workers to resume experience collection... (25150 times) [2024-04-28 18:33:24,526][57339] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-04-28 18:33:24,634][57339] Updated weights for policy 0, policy_version 683188 (0.0026) [2024-04-28 18:33:27,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56524.6, 300 sec: 55705.6). Total num frames: 11193499648. Throughput: 0: 55988.4. Samples: 1683841860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:33:27,169][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 18:33:27,273][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000683198_11193516032.pth... 
[2024-04-28 18:33:27,278][57339] Updated weights for policy 0, policy_version 683198 (0.0029) [2024-04-28 18:33:27,318][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000682384_11180179456.pth [2024-04-28 18:33:30,433][57339] Updated weights for policy 0, policy_version 683208 (0.0031) [2024-04-28 18:33:32,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 11193778176. Throughput: 0: 55838.6. Samples: 1684173000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:33:32,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 18:33:33,336][57339] Updated weights for policy 0, policy_version 683218 (0.0028) [2024-04-28 18:33:36,232][57339] Updated weights for policy 0, policy_version 683228 (0.0033) [2024-04-28 18:33:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56524.7, 300 sec: 55650.0). Total num frames: 11194056704. Throughput: 0: 55834.8. Samples: 1684344020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:33:37,169][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 18:33:39,196][57339] Updated weights for policy 0, policy_version 683238 (0.0028) [2024-04-28 18:33:42,051][57339] Updated weights for policy 0, policy_version 683248 (0.0030) [2024-04-28 18:33:42,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55594.6). Total num frames: 11194335232. Throughput: 0: 55965.9. Samples: 1684681600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:33:42,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 18:33:45,185][57339] Updated weights for policy 0, policy_version 683258 (0.0031) [2024-04-28 18:33:47,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11194580992. Throughput: 0: 55818.1. Samples: 1685014940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:33:47,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 18:33:47,990][57339] Updated weights for policy 0, policy_version 683268 (0.0027) [2024-04-28 18:33:51,150][57339] Updated weights for policy 0, policy_version 683278 (0.0040) [2024-04-28 18:33:52,169][57108] Fps is (10 sec: 52427.8, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11194859520. Throughput: 0: 55636.6. Samples: 1685171760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:33:52,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 18:33:53,899][57339] Updated weights for policy 0, policy_version 683288 (0.0032) [2024-04-28 18:33:57,018][57339] Updated weights for policy 0, policy_version 683298 (0.0028) [2024-04-28 18:33:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11195154432. Throughput: 0: 55651.9. Samples: 1685502520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:33:57,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 18:33:59,717][57339] Updated weights for policy 0, policy_version 683308 (0.0028) [2024-04-28 18:34:02,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.4, 300 sec: 55650.0). Total num frames: 11195432960. Throughput: 0: 55541.8. Samples: 1685834080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:02,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 18:34:02,852][57339] Updated weights for policy 0, policy_version 683318 (0.0025) [2024-04-28 18:34:05,673][57339] Updated weights for policy 0, policy_version 683328 (0.0029) [2024-04-28 18:34:07,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 11195711488. Throughput: 0: 55664.7. Samples: 1686009080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:07,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 18:34:08,838][57339] Updated weights for policy 0, policy_version 683338 (0.0024) [2024-04-28 18:34:11,606][57339] Updated weights for policy 0, policy_version 683348 (0.0026) [2024-04-28 18:34:12,169][57108] Fps is (10 sec: 57345.0, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 11196006400. Throughput: 0: 55566.0. Samples: 1686342320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:12,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 18:34:14,815][57339] Updated weights for policy 0, policy_version 683358 (0.0028) [2024-04-28 18:34:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11196268544. Throughput: 0: 55537.2. Samples: 1686672180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:17,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 18:34:17,525][57339] Updated weights for policy 0, policy_version 683368 (0.0027) [2024-04-28 18:34:20,766][57339] Updated weights for policy 0, policy_version 683378 (0.0026) [2024-04-28 18:34:22,169][57108] Fps is (10 sec: 50789.9, 60 sec: 54886.3, 300 sec: 55427.9). Total num frames: 11196514304. Throughput: 0: 55265.8. Samples: 1686830980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:22,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 18:34:23,297][57339] Updated weights for policy 0, policy_version 683388 (0.0027) [2024-04-28 18:34:26,785][57339] Updated weights for policy 0, policy_version 683398 (0.0028) [2024-04-28 18:34:27,169][57108] Fps is (10 sec: 52428.3, 60 sec: 54886.3, 300 sec: 55483.4). Total num frames: 11196792832. Throughput: 0: 55121.4. Samples: 1687162080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:27,170][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 18:34:29,308][57339] Updated weights for policy 0, policy_version 683408 (0.0029) [2024-04-28 18:34:32,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 11197087744. Throughput: 0: 55202.5. Samples: 1687499060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:32,170][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 18:34:32,633][57339] Updated weights for policy 0, policy_version 683418 (0.0031) [2024-04-28 18:34:35,243][57339] Updated weights for policy 0, policy_version 683428 (0.0028) [2024-04-28 18:34:37,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11197382656. Throughput: 0: 55461.3. Samples: 1687667520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:37,172][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 18:34:38,494][57339] Updated weights for policy 0, policy_version 683438 (0.0024) [2024-04-28 18:34:41,137][57339] Updated weights for policy 0, policy_version 683448 (0.0025) [2024-04-28 18:34:42,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55159.3, 300 sec: 55539.0). Total num frames: 11197644800. Throughput: 0: 55479.9. Samples: 1687999120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:42,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 18:34:44,524][57339] Updated weights for policy 0, policy_version 683458 (0.0029) [2024-04-28 18:34:45,433][57319] Signal inference workers to stop experience collection... (25200 times) [2024-04-28 18:34:45,434][57319] Signal inference workers to resume experience collection... 
(25200 times) [2024-04-28 18:34:45,462][57339] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-04-28 18:34:45,462][57339] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-04-28 18:34:47,029][57339] Updated weights for policy 0, policy_version 683468 (0.0029) [2024-04-28 18:34:47,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 11197939712. Throughput: 0: 55487.7. Samples: 1688331020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:47,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 18:34:50,290][57339] Updated weights for policy 0, policy_version 683478 (0.0030) [2024-04-28 18:34:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11198201856. Throughput: 0: 55290.6. Samples: 1688497160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:52,170][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 18:34:52,800][57339] Updated weights for policy 0, policy_version 683488 (0.0024) [2024-04-28 18:34:56,371][57339] Updated weights for policy 0, policy_version 683498 (0.0033) [2024-04-28 18:34:57,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55159.5, 300 sec: 55483.5). Total num frames: 11198464000. Throughput: 0: 55404.8. Samples: 1688835540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-04-28 18:34:57,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 18:34:58,707][57339] Updated weights for policy 0, policy_version 683508 (0.0032) [2024-04-28 18:35:02,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55159.6, 300 sec: 55483.4). Total num frames: 11198742528. Throughput: 0: 55466.4. Samples: 1689168160. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:02,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 18:35:02,258][57339] Updated weights for policy 0, policy_version 683518 (0.0033) [2024-04-28 18:35:04,719][57339] Updated weights for policy 0, policy_version 683528 (0.0026) [2024-04-28 18:35:07,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 11199021056. Throughput: 0: 55399.8. Samples: 1689323980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:07,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 18:35:08,137][57339] Updated weights for policy 0, policy_version 683538 (0.0032) [2024-04-28 18:35:10,577][57339] Updated weights for policy 0, policy_version 683548 (0.0026) [2024-04-28 18:35:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 11199315968. Throughput: 0: 55411.4. Samples: 1689655580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:12,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 18:35:14,007][57339] Updated weights for policy 0, policy_version 683558 (0.0025) [2024-04-28 18:35:16,539][57339] Updated weights for policy 0, policy_version 683568 (0.0027) [2024-04-28 18:35:17,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55432.6, 300 sec: 55594.6). Total num frames: 11199594496. Throughput: 0: 55343.3. Samples: 1689989500. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:17,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 18:35:20,032][57339] Updated weights for policy 0, policy_version 683578 (0.0029) [2024-04-28 18:35:22,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 11199873024. Throughput: 0: 55485.8. Samples: 1690164380. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:22,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 18:35:22,444][57339] Updated weights for policy 0, policy_version 683588 (0.0028) [2024-04-28 18:35:25,954][57339] Updated weights for policy 0, policy_version 683598 (0.0029) [2024-04-28 18:35:27,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 11200151552. Throughput: 0: 55500.5. Samples: 1690496640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:27,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 18:35:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000683603_11200151552.pth... [2024-04-28 18:35:27,234][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000682787_11186782208.pth [2024-04-28 18:35:28,496][57339] Updated weights for policy 0, policy_version 683608 (0.0026) [2024-04-28 18:35:31,787][57339] Updated weights for policy 0, policy_version 683618 (0.0028) [2024-04-28 18:35:32,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 11200397312. Throughput: 0: 55550.8. Samples: 1690830800. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:32,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 18:35:34,616][57339] Updated weights for policy 0, policy_version 683628 (0.0037) [2024-04-28 18:35:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 11200675840. Throughput: 0: 55456.5. Samples: 1690992700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:37,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 18:35:37,574][57339] Updated weights for policy 0, policy_version 683638 (0.0025) [2024-04-28 18:35:40,493][57339] Updated weights for policy 0, policy_version 683648 (0.0028) [2024-04-28 18:35:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.6, 300 sec: 55594.5). 
Total num frames: 11200970752. Throughput: 0: 55253.8. Samples: 1691321960. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:42,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 18:35:43,634][57339] Updated weights for policy 0, policy_version 683658 (0.0033) [2024-04-28 18:35:46,230][57339] Updated weights for policy 0, policy_version 683668 (0.0038) [2024-04-28 18:35:47,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 11201249280. Throughput: 0: 55443.4. Samples: 1691663120. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:47,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 18:35:47,642][57319] Signal inference workers to stop experience collection... (25250 times) [2024-04-28 18:35:47,673][57339] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-04-28 18:35:47,701][57319] Signal inference workers to resume experience collection... (25250 times) [2024-04-28 18:35:47,705][57339] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-04-28 18:35:49,362][57339] Updated weights for policy 0, policy_version 683678 (0.0026) [2024-04-28 18:35:52,060][57339] Updated weights for policy 0, policy_version 683688 (0.0031) [2024-04-28 18:35:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11201544192. Throughput: 0: 55763.8. Samples: 1691833340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:52,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 18:35:55,060][57339] Updated weights for policy 0, policy_version 683698 (0.0030) [2024-04-28 18:35:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 11201822720. Throughput: 0: 55825.7. Samples: 1692167740. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:35:57,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 18:35:57,973][57339] Updated weights for policy 0, policy_version 683708 (0.0025) [2024-04-28 18:36:00,874][57339] Updated weights for policy 0, policy_version 683718 (0.0037) [2024-04-28 18:36:02,169][57108] Fps is (10 sec: 57343.6, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11202117632. Throughput: 0: 55803.5. Samples: 1692500660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:36:02,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 18:36:03,852][57339] Updated weights for policy 0, policy_version 683728 (0.0030) [2024-04-28 18:36:06,835][57339] Updated weights for policy 0, policy_version 683738 (0.0034) [2024-04-28 18:36:07,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55705.9, 300 sec: 55650.1). Total num frames: 11202363392. Throughput: 0: 55763.3. Samples: 1692673720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:36:07,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 18:36:09,910][57339] Updated weights for policy 0, policy_version 683748 (0.0031) [2024-04-28 18:36:12,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11202658304. Throughput: 0: 55765.0. Samples: 1693006060. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:36:12,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 18:36:12,710][57339] Updated weights for policy 0, policy_version 683758 (0.0025) [2024-04-28 18:36:15,597][57339] Updated weights for policy 0, policy_version 683768 (0.0028) [2024-04-28 18:36:17,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11202936832. Throughput: 0: 55968.7. Samples: 1693349400. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:36:17,170][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 18:36:18,498][57339] Updated weights for policy 0, policy_version 683778 (0.0029) [2024-04-28 18:36:21,472][57339] Updated weights for policy 0, policy_version 683788 (0.0025) [2024-04-28 18:36:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11203198976. Throughput: 0: 55925.8. Samples: 1693509360. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:36:22,170][57108] Avg episode reward: [(0, '0.493')] [2024-04-28 18:36:24,233][57339] Updated weights for policy 0, policy_version 683798 (0.0023) [2024-04-28 18:36:27,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55538.9). Total num frames: 11203477504. Throughput: 0: 56023.4. Samples: 1693843020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:36:27,170][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:36:27,663][57339] Updated weights for policy 0, policy_version 683808 (0.0030) [2024-04-28 18:36:30,016][57339] Updated weights for policy 0, policy_version 683818 (0.0029) [2024-04-28 18:36:32,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 11203772416. Throughput: 0: 55907.2. Samples: 1694178940. Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:36:32,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 18:36:33,377][57339] Updated weights for policy 0, policy_version 683828 (0.0034) [2024-04-28 18:36:35,863][57339] Updated weights for policy 0, policy_version 683838 (0.0031) [2024-04-28 18:36:37,169][57108] Fps is (10 sec: 58984.1, 60 sec: 56524.9, 300 sec: 55705.6). Total num frames: 11204067328. Throughput: 0: 55999.2. Samples: 1694353300. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 23.0) [2024-04-28 18:36:37,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 18:36:39,128][57339] Updated weights for policy 0, policy_version 683848 (0.0029) [2024-04-28 18:36:41,615][57339] Updated weights for policy 0, policy_version 683858 (0.0031) [2024-04-28 18:36:42,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11204345856. Throughput: 0: 56039.7. Samples: 1694689520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:36:42,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 18:36:45,019][57339] Updated weights for policy 0, policy_version 683868 (0.0033) [2024-04-28 18:36:47,169][57108] Fps is (10 sec: 55704.7, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 11204624384. Throughput: 0: 56079.6. Samples: 1695024240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:36:47,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 18:36:47,660][57339] Updated weights for policy 0, policy_version 683878 (0.0029) [2024-04-28 18:36:50,917][57339] Updated weights for policy 0, policy_version 683888 (0.0027) [2024-04-28 18:36:52,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11204886528. Throughput: 0: 55951.9. Samples: 1695191560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:36:52,169][57108] Avg episode reward: [(0, '0.715')] [2024-04-28 18:36:52,854][57319] Signal inference workers to stop experience collection... (25300 times) [2024-04-28 18:36:52,855][57319] Signal inference workers to resume experience collection... 
(25300 times) [2024-04-28 18:36:52,868][57339] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-04-28 18:36:52,868][57339] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-04-28 18:36:53,630][57339] Updated weights for policy 0, policy_version 683898 (0.0031) [2024-04-28 18:36:56,681][57339] Updated weights for policy 0, policy_version 683908 (0.0028) [2024-04-28 18:36:57,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11205148672. Throughput: 0: 55915.6. Samples: 1695522260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:36:57,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 18:36:59,354][57339] Updated weights for policy 0, policy_version 683918 (0.0029) [2024-04-28 18:37:02,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11205443584. Throughput: 0: 55752.9. Samples: 1695858280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:02,170][57108] Avg episode reward: [(0, '0.459')] [2024-04-28 18:37:02,668][57339] Updated weights for policy 0, policy_version 683928 (0.0031) [2024-04-28 18:37:05,107][57339] Updated weights for policy 0, policy_version 683938 (0.0036) [2024-04-28 18:37:07,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 55650.1). Total num frames: 11205722112. Throughput: 0: 55808.8. Samples: 1696020760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:07,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 18:37:08,484][57339] Updated weights for policy 0, policy_version 683948 (0.0030) [2024-04-28 18:37:10,959][57339] Updated weights for policy 0, policy_version 683958 (0.0028) [2024-04-28 18:37:12,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11206017024. Throughput: 0: 55831.8. Samples: 1696355440. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:12,169][57108] Avg episode reward: [(0, '0.694')] [2024-04-28 18:37:14,554][57339] Updated weights for policy 0, policy_version 683968 (0.0028) [2024-04-28 18:37:16,934][57339] Updated weights for policy 0, policy_version 683978 (0.0029) [2024-04-28 18:37:17,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11206295552. Throughput: 0: 55704.9. Samples: 1696685660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:17,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 18:37:20,505][57339] Updated weights for policy 0, policy_version 683988 (0.0030) [2024-04-28 18:37:22,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 11206557696. Throughput: 0: 55713.8. Samples: 1696860420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:22,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 18:37:23,009][57339] Updated weights for policy 0, policy_version 683998 (0.0033) [2024-04-28 18:37:26,248][57339] Updated weights for policy 0, policy_version 684008 (0.0031) [2024-04-28 18:37:27,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11206819840. Throughput: 0: 55584.8. Samples: 1697190840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:27,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 18:37:27,251][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000684011_11206836224.pth... [2024-04-28 18:37:27,299][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000683198_11193516032.pth [2024-04-28 18:37:28,860][57339] Updated weights for policy 0, policy_version 684018 (0.0034) [2024-04-28 18:37:32,088][57339] Updated weights for policy 0, policy_version 684028 (0.0031) [2024-04-28 18:37:32,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.7, 300 sec: 55761.1). 
Total num frames: 11207114752. Throughput: 0: 55538.8. Samples: 1697523480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:32,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 18:37:34,566][57339] Updated weights for policy 0, policy_version 684038 (0.0030) [2024-04-28 18:37:37,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 11207376896. Throughput: 0: 55384.7. Samples: 1697683880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:37,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 18:37:38,086][57339] Updated weights for policy 0, policy_version 684048 (0.0031) [2024-04-28 18:37:40,539][57339] Updated weights for policy 0, policy_version 684058 (0.0027) [2024-04-28 18:37:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11207671808. Throughput: 0: 55483.5. Samples: 1698019020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:42,169][57108] Avg episode reward: [(0, '0.497')] [2024-04-28 18:37:44,101][57339] Updated weights for policy 0, policy_version 684068 (0.0032) [2024-04-28 18:37:46,516][57339] Updated weights for policy 0, policy_version 684078 (0.0033) [2024-04-28 18:37:47,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11207966720. Throughput: 0: 55360.8. Samples: 1698349520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:47,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 18:37:49,957][57339] Updated weights for policy 0, policy_version 684088 (0.0031) [2024-04-28 18:37:51,452][57319] Signal inference workers to stop experience collection... (25350 times) [2024-04-28 18:37:51,500][57339] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-04-28 18:37:51,509][57319] Signal inference workers to resume experience collection... 
(25350 times) [2024-04-28 18:37:51,516][57339] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-04-28 18:37:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11208245248. Throughput: 0: 55654.4. Samples: 1698525200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:52,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 18:37:52,265][57339] Updated weights for policy 0, policy_version 684098 (0.0032) [2024-04-28 18:37:55,879][57339] Updated weights for policy 0, policy_version 684108 (0.0030) [2024-04-28 18:37:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 11208507392. Throughput: 0: 55625.2. Samples: 1698858580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:37:57,169][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 18:37:58,227][57339] Updated weights for policy 0, policy_version 684118 (0.0034) [2024-04-28 18:38:01,816][57339] Updated weights for policy 0, policy_version 684128 (0.0026) [2024-04-28 18:38:02,169][57108] Fps is (10 sec: 50790.3, 60 sec: 55159.6, 300 sec: 55594.6). Total num frames: 11208753152. Throughput: 0: 55795.7. Samples: 1699196460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:38:02,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 18:38:04,217][57339] Updated weights for policy 0, policy_version 684138 (0.0032) [2024-04-28 18:38:07,169][57108] Fps is (10 sec: 52429.8, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 11209031680. Throughput: 0: 55315.9. Samples: 1699349640. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:38:07,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 18:38:07,644][57339] Updated weights for policy 0, policy_version 684148 (0.0032) [2024-04-28 18:38:10,145][57339] Updated weights for policy 0, policy_version 684158 (0.0031) [2024-04-28 18:38:12,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 11209326592. Throughput: 0: 55421.3. Samples: 1699684800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:38:12,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 18:38:13,516][57339] Updated weights for policy 0, policy_version 684168 (0.0025) [2024-04-28 18:38:16,042][57339] Updated weights for policy 0, policy_version 684178 (0.0028) [2024-04-28 18:38:17,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 11209605120. Throughput: 0: 55452.3. Samples: 1700018840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-04-28 18:38:17,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 18:38:19,473][57339] Updated weights for policy 0, policy_version 684188 (0.0032) [2024-04-28 18:38:22,083][57339] Updated weights for policy 0, policy_version 684198 (0.0026) [2024-04-28 18:38:22,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.5, 300 sec: 55594.6). Total num frames: 11209900032. Throughput: 0: 55640.7. Samples: 1700187700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:38:22,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 18:38:25,354][57339] Updated weights for policy 0, policy_version 684208 (0.0028) [2024-04-28 18:38:27,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 55650.0). Total num frames: 11210194944. Throughput: 0: 55645.6. Samples: 1700523080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:38:27,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 18:38:27,988][57339] Updated weights for policy 0, policy_version 684218 (0.0024) [2024-04-28 18:38:31,321][57339] Updated weights for policy 0, policy_version 684228 (0.0024) [2024-04-28 18:38:32,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11210457088. Throughput: 0: 55744.2. Samples: 1700858000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:38:32,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 18:38:33,861][57339] Updated weights for policy 0, policy_version 684238 (0.0028) [2024-04-28 18:38:37,162][57339] Updated weights for policy 0, policy_version 684248 (0.0026) [2024-04-28 18:38:37,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11210719232. Throughput: 0: 55453.2. Samples: 1701020600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:38:37,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 18:38:39,985][57339] Updated weights for policy 0, policy_version 684258 (0.0030) [2024-04-28 18:38:42,169][57108] Fps is (10 sec: 50790.7, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 11210964992. Throughput: 0: 55422.0. Samples: 1701352560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:38:42,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 18:38:43,086][57339] Updated weights for policy 0, policy_version 684268 (0.0029) [2024-04-28 18:38:45,765][57339] Updated weights for policy 0, policy_version 684278 (0.0028) [2024-04-28 18:38:47,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54613.5, 300 sec: 55539.0). Total num frames: 11211243520. Throughput: 0: 55198.6. Samples: 1701680400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:38:47,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 18:38:49,108][57339] Updated weights for policy 0, policy_version 684288 (0.0026) [2024-04-28 18:38:51,765][57339] Updated weights for policy 0, policy_version 684298 (0.0024) [2024-04-28 18:38:52,169][57108] Fps is (10 sec: 58981.3, 60 sec: 55159.3, 300 sec: 55594.5). Total num frames: 11211554816. Throughput: 0: 55434.0. Samples: 1701844180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:38:52,170][57108] Avg episode reward: [(0, '0.719')] [2024-04-28 18:38:55,001][57339] Updated weights for policy 0, policy_version 684308 (0.0027) [2024-04-28 18:38:55,250][57319] Signal inference workers to stop experience collection... (25400 times) [2024-04-28 18:38:55,250][57319] Signal inference workers to resume experience collection... (25400 times) [2024-04-28 18:38:55,279][57339] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-04-28 18:38:55,279][57339] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-04-28 18:38:57,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55432.7, 300 sec: 55594.6). Total num frames: 11211833344. Throughput: 0: 55373.1. Samples: 1702176580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:38:57,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 18:38:57,801][57339] Updated weights for policy 0, policy_version 684318 (0.0028) [2024-04-28 18:39:00,802][57339] Updated weights for policy 0, policy_version 684328 (0.0025) [2024-04-28 18:39:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.6, 300 sec: 55650.1). Total num frames: 11212128256. Throughput: 0: 55304.9. Samples: 1702507560. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:02,170][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 18:39:03,623][57339] Updated weights for policy 0, policy_version 684338 (0.0027) [2024-04-28 18:39:06,652][57339] Updated weights for policy 0, policy_version 684348 (0.0026) [2024-04-28 18:39:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 11212406784. Throughput: 0: 55567.1. Samples: 1702688220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:07,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 18:39:09,633][57339] Updated weights for policy 0, policy_version 684358 (0.0030) [2024-04-28 18:39:12,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 11212636160. Throughput: 0: 55494.2. Samples: 1703020320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:12,170][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 18:39:12,605][57339] Updated weights for policy 0, policy_version 684368 (0.0025) [2024-04-28 18:39:15,737][57339] Updated weights for policy 0, policy_version 684378 (0.0033) [2024-04-28 18:39:17,169][57108] Fps is (10 sec: 52427.8, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11212931072. Throughput: 0: 55461.6. Samples: 1703353780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:17,170][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 18:39:18,373][57339] Updated weights for policy 0, policy_version 684388 (0.0026) [2024-04-28 18:39:21,493][57339] Updated weights for policy 0, policy_version 684398 (0.0027) [2024-04-28 18:39:22,169][57108] Fps is (10 sec: 55706.4, 60 sec: 54886.4, 300 sec: 55594.6). Total num frames: 11213193216. Throughput: 0: 55184.0. Samples: 1703503880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:22,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 18:39:24,201][57339] Updated weights for policy 0, policy_version 684408 (0.0032) [2024-04-28 18:39:27,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 11213488128. Throughput: 0: 55315.8. Samples: 1703841780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:27,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 18:39:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000684417_11213488128.pth... [2024-04-28 18:39:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000683603_11200151552.pth [2024-04-28 18:39:27,514][57339] Updated weights for policy 0, policy_version 684418 (0.0032) [2024-04-28 18:39:30,109][57339] Updated weights for policy 0, policy_version 684428 (0.0028) [2024-04-28 18:39:32,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 11213766656. Throughput: 0: 55424.3. Samples: 1704174500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:32,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 18:39:33,325][57339] Updated weights for policy 0, policy_version 684438 (0.0028) [2024-04-28 18:39:35,901][57339] Updated weights for policy 0, policy_version 684448 (0.0029) [2024-04-28 18:39:37,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11214077952. Throughput: 0: 55604.1. Samples: 1704346360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:37,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 18:39:39,268][57339] Updated weights for policy 0, policy_version 684458 (0.0029) [2024-04-28 18:39:41,717][57339] Updated weights for policy 0, policy_version 684468 (0.0031) [2024-04-28 18:39:42,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.7, 300 sec: 55594.5). 
Total num frames: 11214340096. Throughput: 0: 55702.6. Samples: 1704683200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:42,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 18:39:45,205][57339] Updated weights for policy 0, policy_version 684478 (0.0030) [2024-04-28 18:39:47,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55978.7, 300 sec: 55594.6). Total num frames: 11214602240. Throughput: 0: 55719.3. Samples: 1705014920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:47,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 18:39:47,563][57339] Updated weights for policy 0, policy_version 684488 (0.0027) [2024-04-28 18:39:51,058][57339] Updated weights for policy 0, policy_version 684498 (0.0029) [2024-04-28 18:39:51,788][57319] Signal inference workers to stop experience collection... (25450 times) [2024-04-28 18:39:51,788][57319] Signal inference workers to resume experience collection... (25450 times) [2024-04-28 18:39:51,804][57339] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-04-28 18:39:51,804][57339] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-04-28 18:39:52,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 11214864384. Throughput: 0: 55469.7. Samples: 1705184360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:52,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 18:39:53,378][57339] Updated weights for policy 0, policy_version 684508 (0.0030) [2024-04-28 18:39:57,073][57339] Updated weights for policy 0, policy_version 684518 (0.0029) [2024-04-28 18:39:57,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 11215142912. Throughput: 0: 55541.5. Samples: 1705519680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-04-28 18:39:57,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 18:39:59,380][57339] Updated weights for policy 0, policy_version 684528 (0.0033) [2024-04-28 18:40:02,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11215437824. Throughput: 0: 55604.2. Samples: 1705855960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:02,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 18:40:02,830][57339] Updated weights for policy 0, policy_version 684538 (0.0026) [2024-04-28 18:40:05,134][57339] Updated weights for policy 0, policy_version 684548 (0.0028) [2024-04-28 18:40:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 11215716352. Throughput: 0: 55759.5. Samples: 1706013060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:07,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 18:40:09,014][57339] Updated weights for policy 0, policy_version 684558 (0.0033) [2024-04-28 18:40:10,938][57339] Updated weights for policy 0, policy_version 684568 (0.0032) [2024-04-28 18:40:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.9, 300 sec: 55650.1). Total num frames: 11216011264. Throughput: 0: 55645.1. Samples: 1706345800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:12,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 18:40:14,783][57339] Updated weights for policy 0, policy_version 684578 (0.0032) [2024-04-28 18:40:16,805][57339] Updated weights for policy 0, policy_version 684588 (0.0032) [2024-04-28 18:40:17,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.9, 300 sec: 55650.1). Total num frames: 11216289792. Throughput: 0: 55753.1. Samples: 1706683380. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:17,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 18:40:20,448][57339] Updated weights for policy 0, policy_version 684598 (0.0029) [2024-04-28 18:40:22,169][57108] Fps is (10 sec: 55704.6, 60 sec: 56251.6, 300 sec: 55650.0). Total num frames: 11216568320. Throughput: 0: 55867.0. Samples: 1706860380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:22,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 18:40:22,729][57339] Updated weights for policy 0, policy_version 684608 (0.0026) [2024-04-28 18:40:26,292][57339] Updated weights for policy 0, policy_version 684618 (0.0027) [2024-04-28 18:40:27,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 11216814080. Throughput: 0: 55847.6. Samples: 1707196340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 18:40:28,485][57339] Updated weights for policy 0, policy_version 684628 (0.0027) [2024-04-28 18:40:32,169][57108] Fps is (10 sec: 52429.9, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 11217092608. Throughput: 0: 55940.0. Samples: 1707532220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:32,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 18:40:32,350][57339] Updated weights for policy 0, policy_version 684638 (0.0029) [2024-04-28 18:40:34,269][57339] Updated weights for policy 0, policy_version 684648 (0.0027) [2024-04-28 18:40:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11217387520. Throughput: 0: 55649.7. Samples: 1707688600. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:37,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 18:40:38,224][57339] Updated weights for policy 0, policy_version 684658 (0.0028) [2024-04-28 18:40:40,099][57339] Updated weights for policy 0, policy_version 684668 (0.0026) [2024-04-28 18:40:42,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11217666048. Throughput: 0: 55652.4. Samples: 1708024040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:42,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 18:40:43,078][57319] Signal inference workers to stop experience collection... (25500 times) [2024-04-28 18:40:43,079][57319] Signal inference workers to resume experience collection... (25500 times) [2024-04-28 18:40:43,090][57339] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-04-28 18:40:43,091][57339] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-04-28 18:40:43,966][57339] Updated weights for policy 0, policy_version 684678 (0.0028) [2024-04-28 18:40:46,191][57339] Updated weights for policy 0, policy_version 684688 (0.0025) [2024-04-28 18:40:47,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11217960960. Throughput: 0: 55498.2. Samples: 1708353380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:47,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 18:40:49,727][57339] Updated weights for policy 0, policy_version 684698 (0.0032) [2024-04-28 18:40:51,911][57339] Updated weights for policy 0, policy_version 684708 (0.0027) [2024-04-28 18:40:52,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 55705.6). Total num frames: 11218255872. Throughput: 0: 56065.3. Samples: 1708536000. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:52,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 18:40:55,765][57339] Updated weights for policy 0, policy_version 684718 (0.0031) [2024-04-28 18:40:57,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56524.7, 300 sec: 55650.1). Total num frames: 11218534400. Throughput: 0: 56111.8. Samples: 1708870840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:40:57,170][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 18:40:57,849][57339] Updated weights for policy 0, policy_version 684728 (0.0034) [2024-04-28 18:41:01,526][57339] Updated weights for policy 0, policy_version 684738 (0.0031) [2024-04-28 18:41:02,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11218780160. Throughput: 0: 55967.1. Samples: 1709201900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:41:02,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 18:41:03,787][57339] Updated weights for policy 0, policy_version 684748 (0.0035) [2024-04-28 18:41:07,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11219058688. Throughput: 0: 55671.7. Samples: 1709365600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:41:07,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 18:41:07,382][57339] Updated weights for policy 0, policy_version 684758 (0.0028) [2024-04-28 18:41:09,628][57339] Updated weights for policy 0, policy_version 684768 (0.0031) [2024-04-28 18:41:12,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11219337216. Throughput: 0: 55701.6. Samples: 1709702920. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:41:12,170][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 18:41:13,242][57339] Updated weights for policy 0, policy_version 684778 (0.0030) [2024-04-28 18:41:15,538][57339] Updated weights for policy 0, policy_version 684788 (0.0032) [2024-04-28 18:41:17,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 11219615744. Throughput: 0: 55652.7. Samples: 1710036600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:41:17,170][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 18:41:19,003][57339] Updated weights for policy 0, policy_version 684798 (0.0025) [2024-04-28 18:41:21,460][57339] Updated weights for policy 0, policy_version 684808 (0.0029) [2024-04-28 18:41:22,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11219910656. Throughput: 0: 55989.3. Samples: 1710208120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:41:22,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 18:41:24,836][57339] Updated weights for policy 0, policy_version 684818 (0.0032) [2024-04-28 18:41:27,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 11220189184. Throughput: 0: 55965.8. Samples: 1710542500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:41:27,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 18:41:27,362][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000684828_11220221952.pth... [2024-04-28 18:41:27,365][57339] Updated weights for policy 0, policy_version 684828 (0.0027) [2024-04-28 18:41:27,408][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000684011_11206836224.pth [2024-04-28 18:41:30,594][57339] Updated weights for policy 0, policy_version 684838 (0.0028) [2024-04-28 18:41:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56251.7, 300 sec: 55594.5). 
Total num frames: 11220467712. Throughput: 0: 56098.2. Samples: 1710877800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:41:32,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 18:41:33,311][57339] Updated weights for policy 0, policy_version 684848 (0.0027) [2024-04-28 18:41:36,327][57339] Updated weights for policy 0, policy_version 684858 (0.0029) [2024-04-28 18:41:37,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11220746240. Throughput: 0: 55840.3. Samples: 1711048820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-04-28 18:41:37,170][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 18:41:39,224][57339] Updated weights for policy 0, policy_version 684868 (0.0027) [2024-04-28 18:41:42,026][57319] Signal inference workers to stop experience collection... (25550 times) [2024-04-28 18:41:42,065][57339] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-04-28 18:41:42,082][57319] Signal inference workers to resume experience collection... (25550 times) [2024-04-28 18:41:42,083][57339] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-04-28 18:41:42,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 11221024768. Throughput: 0: 55867.7. Samples: 1711384880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:41:42,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 18:41:42,196][57339] Updated weights for policy 0, policy_version 684878 (0.0026) [2024-04-28 18:41:45,068][57339] Updated weights for policy 0, policy_version 684888 (0.0025) [2024-04-28 18:41:47,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11221286912. Throughput: 0: 55814.5. Samples: 1711713560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:41:47,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 18:41:48,075][57339] Updated weights for policy 0, policy_version 684898 (0.0028) [2024-04-28 18:41:51,071][57339] Updated weights for policy 0, policy_version 684908 (0.0025) [2024-04-28 18:41:52,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 11221565440. Throughput: 0: 55796.5. Samples: 1711876440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:41:52,169][57108] Avg episode reward: [(0, '0.683')] [2024-04-28 18:41:54,056][57339] Updated weights for policy 0, policy_version 684918 (0.0029) [2024-04-28 18:41:57,015][57339] Updated weights for policy 0, policy_version 684928 (0.0032) [2024-04-28 18:41:57,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11221860352. Throughput: 0: 55767.7. Samples: 1712212460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:41:57,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 18:41:59,768][57339] Updated weights for policy 0, policy_version 684938 (0.0030) [2024-04-28 18:42:02,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11222155264. Throughput: 0: 55855.7. Samples: 1712550100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:02,169][57108] Avg episode reward: [(0, '0.675')] [2024-04-28 18:42:02,856][57339] Updated weights for policy 0, policy_version 684948 (0.0029) [2024-04-28 18:42:05,625][57339] Updated weights for policy 0, policy_version 684958 (0.0032) [2024-04-28 18:42:07,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 11222401024. Throughput: 0: 55792.8. Samples: 1712718800. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:07,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 18:42:08,929][57339] Updated weights for policy 0, policy_version 684968 (0.0026) [2024-04-28 18:42:11,530][57339] Updated weights for policy 0, policy_version 684978 (0.0041) [2024-04-28 18:42:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 11222695936. Throughput: 0: 55739.5. Samples: 1713050780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:12,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 18:42:14,901][57339] Updated weights for policy 0, policy_version 684988 (0.0027) [2024-04-28 18:42:17,169][57108] Fps is (10 sec: 58983.2, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 11222990848. Throughput: 0: 55643.7. Samples: 1713381760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:17,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 18:42:17,229][57339] Updated weights for policy 0, policy_version 684998 (0.0032) [2024-04-28 18:42:20,877][57339] Updated weights for policy 0, policy_version 685008 (0.0031) [2024-04-28 18:42:22,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11223252992. Throughput: 0: 55577.1. Samples: 1713549780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:22,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 18:42:22,991][57339] Updated weights for policy 0, policy_version 685018 (0.0027) [2024-04-28 18:42:26,716][57339] Updated weights for policy 0, policy_version 685028 (0.0027) [2024-04-28 18:42:27,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11223515136. Throughput: 0: 55609.0. Samples: 1713887280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 18:42:28,993][57339] Updated weights for policy 0, policy_version 685038 (0.0033) [2024-04-28 18:42:32,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11223810048. Throughput: 0: 55716.1. Samples: 1714220780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:32,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 18:42:32,606][57339] Updated weights for policy 0, policy_version 685048 (0.0030) [2024-04-28 18:42:34,926][57339] Updated weights for policy 0, policy_version 685058 (0.0026) [2024-04-28 18:42:36,951][57319] Signal inference workers to stop experience collection... (25600 times) [2024-04-28 18:42:36,988][57339] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-04-28 18:42:37,042][57319] Signal inference workers to resume experience collection... (25600 times) [2024-04-28 18:42:37,043][57339] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-04-28 18:42:37,169][57108] Fps is (10 sec: 60619.8, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 11224121344. Throughput: 0: 55874.6. Samples: 1714390800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:37,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 18:42:38,391][57339] Updated weights for policy 0, policy_version 685068 (0.0030) [2024-04-28 18:42:41,054][57339] Updated weights for policy 0, policy_version 685078 (0.0033) [2024-04-28 18:42:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11224367104. Throughput: 0: 55818.1. Samples: 1714724280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:42,169][57108] Avg episode reward: [(0, '0.511')] [2024-04-28 18:42:44,288][57339] Updated weights for policy 0, policy_version 685088 (0.0028) [2024-04-28 18:42:46,729][57339] Updated weights for policy 0, policy_version 685098 (0.0028) [2024-04-28 18:42:47,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11224645632. Throughput: 0: 55856.2. Samples: 1715063640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:47,170][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 18:42:50,174][57339] Updated weights for policy 0, policy_version 685108 (0.0026) [2024-04-28 18:42:52,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11224940544. Throughput: 0: 55862.2. Samples: 1715232600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:52,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 18:42:52,551][57339] Updated weights for policy 0, policy_version 685118 (0.0026) [2024-04-28 18:42:56,045][57339] Updated weights for policy 0, policy_version 685128 (0.0025) [2024-04-28 18:42:57,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11225235456. Throughput: 0: 56069.7. Samples: 1715573920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:42:57,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 18:42:58,311][57339] Updated weights for policy 0, policy_version 685138 (0.0026) [2024-04-28 18:43:01,872][57339] Updated weights for policy 0, policy_version 685148 (0.0024) [2024-04-28 18:43:02,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 11225464832. Throughput: 0: 56206.6. Samples: 1715911060. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:43:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 18:43:04,060][57339] Updated weights for policy 0, policy_version 685158 (0.0028) [2024-04-28 18:43:07,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11225759744. Throughput: 0: 56035.8. Samples: 1716071400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:43:07,178][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 18:43:07,880][57339] Updated weights for policy 0, policy_version 685168 (0.0037) [2024-04-28 18:43:09,842][57339] Updated weights for policy 0, policy_version 685178 (0.0026) [2024-04-28 18:43:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11226038272. Throughput: 0: 55977.2. Samples: 1716406260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 18:43:12,169][57108] Avg episode reward: [(0, '0.657')] [2024-04-28 18:43:13,723][57339] Updated weights for policy 0, policy_version 685188 (0.0035) [2024-04-28 18:43:15,874][57339] Updated weights for policy 0, policy_version 685198 (0.0033) [2024-04-28 18:43:17,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 11226333184. Throughput: 0: 55951.9. Samples: 1716738620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:43:17,169][57108] Avg episode reward: [(0, '0.696')] [2024-04-28 18:43:19,506][57339] Updated weights for policy 0, policy_version 685208 (0.0031) [2024-04-28 18:43:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11226595328. Throughput: 0: 56065.9. Samples: 1716913760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:43:22,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 18:43:22,311][57339] Updated weights for policy 0, policy_version 685218 (0.0023) [2024-04-28 18:43:25,268][57339] Updated weights for policy 0, policy_version 685228 (0.0031) [2024-04-28 18:43:27,169][57108] Fps is (10 sec: 57344.9, 60 sec: 56524.7, 300 sec: 55761.1). Total num frames: 11226906624. Throughput: 0: 56064.1. Samples: 1717247160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:43:27,174][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 18:43:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000685236_11226906624.pth... [2024-04-28 18:43:27,239][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000684417_11213488128.pth [2024-04-28 18:43:28,007][57339] Updated weights for policy 0, policy_version 685238 (0.0027) [2024-04-28 18:43:31,035][57339] Updated weights for policy 0, policy_version 685248 (0.0032) [2024-04-28 18:43:32,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11227185152. Throughput: 0: 55943.4. Samples: 1717581080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:43:32,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 18:43:33,707][57339] Updated weights for policy 0, policy_version 685258 (0.0027) [2024-04-28 18:43:34,284][57319] Signal inference workers to stop experience collection... (25650 times) [2024-04-28 18:43:34,286][57319] Signal inference workers to resume experience collection... 
(25650 times) [2024-04-28 18:43:34,315][57339] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-04-28 18:43:34,315][57339] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-04-28 18:43:36,812][57339] Updated weights for policy 0, policy_version 685268 (0.0033) [2024-04-28 18:43:37,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 11227447296. Throughput: 0: 56034.4. Samples: 1717754140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:43:37,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 18:43:39,563][57339] Updated weights for policy 0, policy_version 685278 (0.0027) [2024-04-28 18:43:42,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11227709440. Throughput: 0: 55965.4. Samples: 1718092360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:43:42,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 18:43:42,558][57339] Updated weights for policy 0, policy_version 685288 (0.0024) [2024-04-28 18:43:45,325][57339] Updated weights for policy 0, policy_version 685298 (0.0024) [2024-04-28 18:43:47,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 11228004352. Throughput: 0: 55945.7. Samples: 1718428620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:43:47,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 18:43:48,441][57339] Updated weights for policy 0, policy_version 685308 (0.0029) [2024-04-28 18:43:50,968][57339] Updated weights for policy 0, policy_version 685318 (0.0029) [2024-04-28 18:43:52,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 11228299264. Throughput: 0: 55944.2. Samples: 1718588880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:43:52,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 18:43:54,218][57339] Updated weights for policy 0, policy_version 685328 (0.0030) [2024-04-28 18:43:56,791][57339] Updated weights for policy 0, policy_version 685338 (0.0037) [2024-04-28 18:43:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 11228577792. Throughput: 0: 56122.1. Samples: 1718931760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:43:57,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 18:44:00,069][57339] Updated weights for policy 0, policy_version 685348 (0.0029) [2024-04-28 18:44:02,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56797.8, 300 sec: 55816.7). Total num frames: 11228872704. Throughput: 0: 56189.4. Samples: 1719267140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:02,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 18:44:02,755][57339] Updated weights for policy 0, policy_version 685358 (0.0030) [2024-04-28 18:44:05,798][57339] Updated weights for policy 0, policy_version 685368 (0.0026) [2024-04-28 18:44:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 11229151232. Throughput: 0: 56065.3. Samples: 1719436700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:07,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 18:44:08,932][57339] Updated weights for policy 0, policy_version 685378 (0.0026) [2024-04-28 18:44:11,522][57339] Updated weights for policy 0, policy_version 685388 (0.0029) [2024-04-28 18:44:12,169][57108] Fps is (10 sec: 54067.8, 60 sec: 56251.8, 300 sec: 55872.3). Total num frames: 11229413376. Throughput: 0: 56183.6. Samples: 1719775420. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:12,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 18:44:14,792][57339] Updated weights for policy 0, policy_version 685398 (0.0032) [2024-04-28 18:44:17,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 11229691904. Throughput: 0: 56170.0. Samples: 1720108740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:17,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 18:44:17,403][57339] Updated weights for policy 0, policy_version 685408 (0.0031) [2024-04-28 18:44:20,660][57339] Updated weights for policy 0, policy_version 685418 (0.0030) [2024-04-28 18:44:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11229954048. Throughput: 0: 55964.4. Samples: 1720272540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:22,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 18:44:23,106][57339] Updated weights for policy 0, policy_version 685428 (0.0032) [2024-04-28 18:44:26,320][57339] Updated weights for policy 0, policy_version 685438 (0.0032) [2024-04-28 18:44:27,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11230232576. Throughput: 0: 55914.7. Samples: 1720608520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:27,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 18:44:29,177][57339] Updated weights for policy 0, policy_version 685448 (0.0028) [2024-04-28 18:44:32,017][57339] Updated weights for policy 0, policy_version 685458 (0.0026) [2024-04-28 18:44:32,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11230543872. Throughput: 0: 55897.9. Samples: 1720944020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:32,169][57108] Avg episode reward: [(0, '0.507')] [2024-04-28 18:44:34,980][57339] Updated weights for policy 0, policy_version 685468 (0.0025) [2024-04-28 18:44:37,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11230822400. Throughput: 0: 56109.2. Samples: 1721113800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:37,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 18:44:37,882][57339] Updated weights for policy 0, policy_version 685478 (0.0029) [2024-04-28 18:44:40,992][57339] Updated weights for policy 0, policy_version 685488 (0.0027) [2024-04-28 18:44:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56524.9, 300 sec: 55927.8). Total num frames: 11231100928. Throughput: 0: 55995.8. Samples: 1721451560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:42,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 18:44:43,821][57339] Updated weights for policy 0, policy_version 685498 (0.0036) [2024-04-28 18:44:46,655][57339] Updated weights for policy 0, policy_version 685508 (0.0037) [2024-04-28 18:44:47,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 11231379456. Throughput: 0: 55928.4. Samples: 1721783920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:47,169][57108] Avg episode reward: [(0, '0.727')] [2024-04-28 18:44:49,643][57339] Updated weights for policy 0, policy_version 685518 (0.0025) [2024-04-28 18:44:51,182][57319] Signal inference workers to stop experience collection... (25700 times) [2024-04-28 18:44:51,182][57319] Signal inference workers to resume experience collection... 
(25700 times) [2024-04-28 18:44:51,207][57339] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-04-28 18:44:51,207][57339] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-04-28 18:44:52,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 55983.3). Total num frames: 11231657984. Throughput: 0: 55957.3. Samples: 1721954780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 18:44:52,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 18:44:52,453][57339] Updated weights for policy 0, policy_version 685528 (0.0038) [2024-04-28 18:44:55,803][57339] Updated weights for policy 0, policy_version 685538 (0.0036) [2024-04-28 18:44:57,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 11231952896. Throughput: 0: 55938.6. Samples: 1722292660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:44:57,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 18:44:58,252][57339] Updated weights for policy 0, policy_version 685548 (0.0028) [2024-04-28 18:45:01,770][57339] Updated weights for policy 0, policy_version 685558 (0.0035) [2024-04-28 18:45:02,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11232182272. Throughput: 0: 56173.0. Samples: 1722636520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:02,170][57108] Avg episode reward: [(0, '0.708')] [2024-04-28 18:45:04,257][57339] Updated weights for policy 0, policy_version 685568 (0.0028) [2024-04-28 18:45:07,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11232477184. Throughput: 0: 55995.6. Samples: 1722792340. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:07,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:45:07,827][57339] Updated weights for policy 0, policy_version 685578 (0.0028) [2024-04-28 18:45:10,010][57339] Updated weights for policy 0, policy_version 685588 (0.0030) [2024-04-28 18:45:12,169][57108] Fps is (10 sec: 60620.4, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 11232788480. Throughput: 0: 56033.7. Samples: 1723130040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:12,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 18:45:13,622][57339] Updated weights for policy 0, policy_version 685598 (0.0028) [2024-04-28 18:45:16,049][57339] Updated weights for policy 0, policy_version 685608 (0.0031) [2024-04-28 18:45:17,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 11233067008. Throughput: 0: 55904.0. Samples: 1723459700. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:17,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 18:45:19,345][57339] Updated weights for policy 0, policy_version 685618 (0.0031) [2024-04-28 18:45:21,778][57339] Updated weights for policy 0, policy_version 685628 (0.0031) [2024-04-28 18:45:22,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56524.7, 300 sec: 56038.8). Total num frames: 11233345536. Throughput: 0: 56105.3. Samples: 1723638540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:22,170][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 18:45:25,226][57339] Updated weights for policy 0, policy_version 685638 (0.0028) [2024-04-28 18:45:27,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 11233591296. Throughput: 0: 55982.1. Samples: 1723970760. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:27,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 18:45:27,266][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000685645_11233607680.pth... [2024-04-28 18:45:27,315][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000684828_11220221952.pth [2024-04-28 18:45:27,713][57339] Updated weights for policy 0, policy_version 685648 (0.0029) [2024-04-28 18:45:31,110][57339] Updated weights for policy 0, policy_version 685658 (0.0028) [2024-04-28 18:45:32,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 11233918976. Throughput: 0: 55897.9. Samples: 1724299320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:32,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:45:33,524][57339] Updated weights for policy 0, policy_version 685668 (0.0029) [2024-04-28 18:45:36,899][57339] Updated weights for policy 0, policy_version 685678 (0.0027) [2024-04-28 18:45:37,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 11234164736. Throughput: 0: 55838.3. Samples: 1724467500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:37,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 18:45:37,562][57319] Signal inference workers to stop experience collection... (25750 times) [2024-04-28 18:45:37,562][57319] Signal inference workers to resume experience collection... (25750 times) [2024-04-28 18:45:37,600][57339] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-04-28 18:45:37,600][57339] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-04-28 18:45:39,453][57339] Updated weights for policy 0, policy_version 685688 (0.0024) [2024-04-28 18:45:42,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11234426880. Throughput: 0: 55714.2. 
Samples: 1724799800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:42,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 18:45:42,957][57339] Updated weights for policy 0, policy_version 685698 (0.0025) [2024-04-28 18:45:45,264][57339] Updated weights for policy 0, policy_version 685708 (0.0031) [2024-04-28 18:45:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11234721792. Throughput: 0: 55521.0. Samples: 1725134960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:47,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 18:45:48,828][57339] Updated weights for policy 0, policy_version 685718 (0.0032) [2024-04-28 18:45:51,179][57339] Updated weights for policy 0, policy_version 685728 (0.0022) [2024-04-28 18:45:52,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 11235016704. Throughput: 0: 55576.9. Samples: 1725293300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:52,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 18:45:54,670][57339] Updated weights for policy 0, policy_version 685738 (0.0030) [2024-04-28 18:45:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 11235278848. Throughput: 0: 55429.6. Samples: 1725624360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:45:57,169][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 18:45:57,185][57339] Updated weights for policy 0, policy_version 685748 (0.0029) [2024-04-28 18:46:00,587][57339] Updated weights for policy 0, policy_version 685758 (0.0027) [2024-04-28 18:46:02,169][57108] Fps is (10 sec: 54066.9, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 11235557376. Throughput: 0: 55523.1. Samples: 1725958240. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:46:02,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 18:46:03,222][57339] Updated weights for policy 0, policy_version 685768 (0.0028) [2024-04-28 18:46:06,424][57339] Updated weights for policy 0, policy_version 685778 (0.0030) [2024-04-28 18:46:07,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 11235852288. Throughput: 0: 55293.1. Samples: 1726126720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:46:07,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 18:46:09,062][57339] Updated weights for policy 0, policy_version 685788 (0.0030) [2024-04-28 18:46:12,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.5, 300 sec: 55872.2). Total num frames: 11236098048. Throughput: 0: 55311.5. Samples: 1726459780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:46:12,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 18:46:12,232][57339] Updated weights for policy 0, policy_version 685798 (0.0034) [2024-04-28 18:46:15,032][57339] Updated weights for policy 0, policy_version 685808 (0.0035) [2024-04-28 18:46:17,169][57108] Fps is (10 sec: 50789.4, 60 sec: 54886.2, 300 sec: 55761.1). Total num frames: 11236360192. Throughput: 0: 55466.4. Samples: 1726795320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:46:17,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 18:46:18,099][57339] Updated weights for policy 0, policy_version 685818 (0.0027) [2024-04-28 18:46:18,846][57319] Signal inference workers to stop experience collection... (25800 times) [2024-04-28 18:46:18,846][57319] Signal inference workers to resume experience collection... 
(25800 times) [2024-04-28 18:46:18,865][57339] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-04-28 18:46:18,865][57339] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-04-28 18:46:20,955][57339] Updated weights for policy 0, policy_version 685828 (0.0026) [2024-04-28 18:46:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11236655104. Throughput: 0: 55354.2. Samples: 1726958440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:46:22,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 18:46:23,940][57339] Updated weights for policy 0, policy_version 685838 (0.0028) [2024-04-28 18:46:26,806][57339] Updated weights for policy 0, policy_version 685848 (0.0029) [2024-04-28 18:46:27,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11236933632. Throughput: 0: 55291.8. Samples: 1727287940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:46:27,170][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 18:46:29,966][57339] Updated weights for policy 0, policy_version 685858 (0.0028) [2024-04-28 18:46:32,169][57108] Fps is (10 sec: 55706.2, 60 sec: 54886.4, 300 sec: 55816.7). Total num frames: 11237212160. Throughput: 0: 55247.1. Samples: 1727621080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-04-28 18:46:32,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 18:46:32,630][57339] Updated weights for policy 0, policy_version 685868 (0.0033) [2024-04-28 18:46:35,890][57339] Updated weights for policy 0, policy_version 685878 (0.0030) [2024-04-28 18:46:37,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 11237507072. Throughput: 0: 55689.3. Samples: 1727799320. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:46:37,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 18:46:38,574][57339] Updated weights for policy 0, policy_version 685888 (0.0027) [2024-04-28 18:46:41,716][57339] Updated weights for policy 0, policy_version 685898 (0.0025) [2024-04-28 18:46:42,169][57108] Fps is (10 sec: 58982.2, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 11237801984. Throughput: 0: 55758.1. Samples: 1728133480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:46:42,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 18:46:44,434][57339] Updated weights for policy 0, policy_version 685908 (0.0030) [2024-04-28 18:46:47,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 11238047744. Throughput: 0: 55714.2. Samples: 1728465380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:46:47,170][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 18:46:47,580][57339] Updated weights for policy 0, policy_version 685918 (0.0029) [2024-04-28 18:46:50,252][57339] Updated weights for policy 0, policy_version 685928 (0.0029) [2024-04-28 18:46:52,169][57108] Fps is (10 sec: 50790.2, 60 sec: 54886.4, 300 sec: 55761.1). Total num frames: 11238309888. Throughput: 0: 55608.8. Samples: 1728629120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:46:52,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 18:46:53,405][57339] Updated weights for policy 0, policy_version 685938 (0.0024) [2024-04-28 18:46:56,390][57339] Updated weights for policy 0, policy_version 685948 (0.0025) [2024-04-28 18:46:57,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.4, 300 sec: 55816.7). Total num frames: 11238621184. Throughput: 0: 55682.2. Samples: 1728965480. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:46:57,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 18:46:59,208][57339] Updated weights for policy 0, policy_version 685958 (0.0031) [2024-04-28 18:47:02,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 11238883328. Throughput: 0: 55602.5. Samples: 1729297420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:02,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 18:47:02,361][57339] Updated weights for policy 0, policy_version 685968 (0.0030) [2024-04-28 18:47:04,991][57339] Updated weights for policy 0, policy_version 685978 (0.0027) [2024-04-28 18:47:07,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11239161856. Throughput: 0: 55684.6. Samples: 1729464240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:07,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 18:47:08,133][57339] Updated weights for policy 0, policy_version 685988 (0.0028) [2024-04-28 18:47:10,881][57339] Updated weights for policy 0, policy_version 685998 (0.0026) [2024-04-28 18:47:12,169][57108] Fps is (10 sec: 55704.2, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11239440384. Throughput: 0: 55863.0. Samples: 1729801780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:12,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 18:47:13,849][57339] Updated weights for policy 0, policy_version 686008 (0.0028) [2024-04-28 18:47:16,784][57339] Updated weights for policy 0, policy_version 686018 (0.0031) [2024-04-28 18:47:17,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11239735296. Throughput: 0: 55929.6. Samples: 1730137920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:17,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 18:47:19,699][57339] Updated weights for policy 0, policy_version 686028 (0.0030) [2024-04-28 18:47:22,169][57108] Fps is (10 sec: 55706.9, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 11239997440. Throughput: 0: 55648.9. Samples: 1730303520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:22,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 18:47:22,607][57339] Updated weights for policy 0, policy_version 686038 (0.0030) [2024-04-28 18:47:25,463][57339] Updated weights for policy 0, policy_version 686048 (0.0030) [2024-04-28 18:47:27,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11240275968. Throughput: 0: 55682.7. Samples: 1730639200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:27,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 18:47:27,229][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000686053_11240292352.pth... [2024-04-28 18:47:27,270][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000685236_11226906624.pth [2024-04-28 18:47:28,458][57339] Updated weights for policy 0, policy_version 686058 (0.0032) [2024-04-28 18:47:31,260][57339] Updated weights for policy 0, policy_version 686068 (0.0031) [2024-04-28 18:47:32,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 11240570880. Throughput: 0: 55686.2. Samples: 1730971260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:32,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 18:47:34,413][57339] Updated weights for policy 0, policy_version 686078 (0.0032) [2024-04-28 18:47:36,896][57319] Signal inference workers to stop experience collection... (25850 times) [2024-04-28 18:47:36,900][57319] Signal inference workers to resume experience collection... 
(25850 times) [2024-04-28 18:47:36,915][57339] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-04-28 18:47:36,915][57339] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-04-28 18:47:37,150][57339] Updated weights for policy 0, policy_version 686088 (0.0026) [2024-04-28 18:47:37,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 11240865792. Throughput: 0: 55749.8. Samples: 1731137860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:37,169][57108] Avg episode reward: [(0, '0.499')] [2024-04-28 18:47:40,232][57339] Updated weights for policy 0, policy_version 686098 (0.0028) [2024-04-28 18:47:42,169][57108] Fps is (10 sec: 52429.8, 60 sec: 54886.4, 300 sec: 55761.2). Total num frames: 11241095168. Throughput: 0: 55718.0. Samples: 1731472780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:42,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 18:47:43,158][57339] Updated weights for policy 0, policy_version 686108 (0.0026) [2024-04-28 18:47:45,999][57339] Updated weights for policy 0, policy_version 686118 (0.0032) [2024-04-28 18:47:47,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11241390080. Throughput: 0: 55805.1. Samples: 1731808660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:47,170][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 18:47:49,140][57339] Updated weights for policy 0, policy_version 686128 (0.0030) [2024-04-28 18:47:51,830][57339] Updated weights for policy 0, policy_version 686138 (0.0031) [2024-04-28 18:47:52,169][57108] Fps is (10 sec: 60620.2, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 11241701376. Throughput: 0: 55815.0. Samples: 1731975920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:52,170][57108] Avg episode reward: [(0, '0.493')] [2024-04-28 18:47:55,009][57339] Updated weights for policy 0, policy_version 686148 (0.0027) [2024-04-28 18:47:57,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 11241947136. Throughput: 0: 55815.3. Samples: 1732313460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:47:57,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 18:47:57,849][57339] Updated weights for policy 0, policy_version 686158 (0.0029) [2024-04-28 18:48:00,782][57339] Updated weights for policy 0, policy_version 686168 (0.0027) [2024-04-28 18:48:02,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11242242048. Throughput: 0: 55627.7. Samples: 1732641160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:48:02,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 18:48:03,746][57339] Updated weights for policy 0, policy_version 686178 (0.0030) [2024-04-28 18:48:06,731][57339] Updated weights for policy 0, policy_version 686188 (0.0033) [2024-04-28 18:48:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11242520576. Throughput: 0: 55601.7. Samples: 1732805600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:48:07,169][57108] Avg episode reward: [(0, '0.512')] [2024-04-28 18:48:09,640][57339] Updated weights for policy 0, policy_version 686198 (0.0026) [2024-04-28 18:48:12,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 11242782720. Throughput: 0: 55727.1. Samples: 1733146920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 18:48:12,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 18:48:12,714][57339] Updated weights for policy 0, policy_version 686208 (0.0028) [2024-04-28 18:48:15,544][57339] Updated weights for policy 0, policy_version 686218 (0.0027) [2024-04-28 18:48:17,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 11243044864. Throughput: 0: 55842.2. Samples: 1733484160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:48:17,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 18:48:18,507][57339] Updated weights for policy 0, policy_version 686228 (0.0027) [2024-04-28 18:48:21,528][57339] Updated weights for policy 0, policy_version 686238 (0.0036) [2024-04-28 18:48:22,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 11243339776. Throughput: 0: 55660.3. Samples: 1733642580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:48:22,170][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 18:48:24,412][57339] Updated weights for policy 0, policy_version 686248 (0.0026) [2024-04-28 18:48:27,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11243634688. Throughput: 0: 55585.7. Samples: 1733974140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:48:27,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 18:48:27,284][57339] Updated weights for policy 0, policy_version 686258 (0.0028) [2024-04-28 18:48:30,382][57339] Updated weights for policy 0, policy_version 686268 (0.0033) [2024-04-28 18:48:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 11243896832. Throughput: 0: 55563.6. Samples: 1734309020. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:48:32,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 18:48:33,156][57339] Updated weights for policy 0, policy_version 686278 (0.0026) [2024-04-28 18:48:36,270][57339] Updated weights for policy 0, policy_version 686288 (0.0029) [2024-04-28 18:48:37,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 11244191744. Throughput: 0: 55739.0. Samples: 1734484180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:48:37,170][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 18:48:39,072][57339] Updated weights for policy 0, policy_version 686298 (0.0035) [2024-04-28 18:48:41,319][57319] Signal inference workers to stop experience collection... (25900 times) [2024-04-28 18:48:41,330][57339] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-04-28 18:48:41,412][57319] Signal inference workers to resume experience collection... (25900 times) [2024-04-28 18:48:41,412][57339] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-04-28 18:48:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 11244453888. Throughput: 0: 55591.4. Samples: 1734815080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:48:42,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 18:48:42,210][57339] Updated weights for policy 0, policy_version 686308 (0.0025) [2024-04-28 18:48:44,957][57339] Updated weights for policy 0, policy_version 686318 (0.0031) [2024-04-28 18:48:47,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11244732416. Throughput: 0: 55738.8. Samples: 1735149420. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:48:47,170][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 18:48:47,975][57339] Updated weights for policy 0, policy_version 686328 (0.0027) [2024-04-28 18:48:50,950][57339] Updated weights for policy 0, policy_version 686338 (0.0028) [2024-04-28 18:48:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 11245010944. Throughput: 0: 55805.2. Samples: 1735316840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:48:52,170][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 18:48:53,751][57339] Updated weights for policy 0, policy_version 686348 (0.0028) [2024-04-28 18:48:56,659][57339] Updated weights for policy 0, policy_version 686358 (0.0031) [2024-04-28 18:48:57,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11245289472. Throughput: 0: 55749.8. Samples: 1735655660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:48:57,169][57108] Avg episode reward: [(0, '0.473')] [2024-04-28 18:48:59,617][57339] Updated weights for policy 0, policy_version 686368 (0.0029) [2024-04-28 18:49:02,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11245584384. Throughput: 0: 55702.4. Samples: 1735990760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:02,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 18:49:02,732][57339] Updated weights for policy 0, policy_version 686378 (0.0028) [2024-04-28 18:49:05,343][57339] Updated weights for policy 0, policy_version 686388 (0.0027) [2024-04-28 18:49:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11245862912. Throughput: 0: 55949.5. Samples: 1736160300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:07,178][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 18:49:08,643][57339] Updated weights for policy 0, policy_version 686398 (0.0026) [2024-04-28 18:49:11,289][57339] Updated weights for policy 0, policy_version 686408 (0.0032) [2024-04-28 18:49:12,169][57108] Fps is (10 sec: 57343.0, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 11246157824. Throughput: 0: 55967.8. Samples: 1736492700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:12,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 18:49:14,403][57339] Updated weights for policy 0, policy_version 686418 (0.0037) [2024-04-28 18:49:17,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11246419968. Throughput: 0: 56004.5. Samples: 1736829220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:17,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 18:49:17,266][57339] Updated weights for policy 0, policy_version 686428 (0.0030) [2024-04-28 18:49:20,168][57339] Updated weights for policy 0, policy_version 686438 (0.0029) [2024-04-28 18:49:22,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 11246698496. Throughput: 0: 55627.8. Samples: 1736987420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:22,169][57108] Avg episode reward: [(0, '0.714')] [2024-04-28 18:49:23,009][57339] Updated weights for policy 0, policy_version 686448 (0.0029) [2024-04-28 18:49:26,081][57339] Updated weights for policy 0, policy_version 686458 (0.0029) [2024-04-28 18:49:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11246960640. Throughput: 0: 55850.3. Samples: 1737328340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:27,178][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 18:49:27,262][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000686461_11246977024.pth... [2024-04-28 18:49:27,307][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000685645_11233607680.pth [2024-04-28 18:49:28,805][57339] Updated weights for policy 0, policy_version 686468 (0.0025) [2024-04-28 18:49:31,998][57339] Updated weights for policy 0, policy_version 686478 (0.0029) [2024-04-28 18:49:32,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11247255552. Throughput: 0: 55928.7. Samples: 1737666200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:32,178][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 18:49:34,692][57339] Updated weights for policy 0, policy_version 686488 (0.0034) [2024-04-28 18:49:37,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.9, 300 sec: 55761.1). Total num frames: 11247550464. Throughput: 0: 55893.1. Samples: 1737832020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:37,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 18:49:37,838][57339] Updated weights for policy 0, policy_version 686498 (0.0031) [2024-04-28 18:49:40,514][57339] Updated weights for policy 0, policy_version 686508 (0.0030) [2024-04-28 18:49:42,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 11247812608. Throughput: 0: 55724.1. Samples: 1738163240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:42,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 18:49:43,797][57339] Updated weights for policy 0, policy_version 686518 (0.0028) [2024-04-28 18:49:46,387][57339] Updated weights for policy 0, policy_version 686528 (0.0027) [2024-04-28 18:49:47,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56251.9, 300 sec: 55761.2). 
Total num frames: 11248107520. Throughput: 0: 55700.9. Samples: 1738497300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:47,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 18:49:49,942][57339] Updated weights for policy 0, policy_version 686538 (0.0030) [2024-04-28 18:49:50,553][57319] Signal inference workers to stop experience collection... (25950 times) [2024-04-28 18:49:50,553][57319] Signal inference workers to resume experience collection... (25950 times) [2024-04-28 18:49:50,573][57339] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-04-28 18:49:50,595][57339] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-04-28 18:49:52,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56524.9, 300 sec: 55761.1). Total num frames: 11248402432. Throughput: 0: 55871.5. Samples: 1738674520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-04-28 18:49:52,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 18:49:52,172][57339] Updated weights for policy 0, policy_version 686548 (0.0028) [2024-04-28 18:49:55,728][57339] Updated weights for policy 0, policy_version 686558 (0.0034) [2024-04-28 18:49:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 11248664576. Throughput: 0: 56045.1. Samples: 1739014720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:49:57,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 18:49:58,015][57339] Updated weights for policy 0, policy_version 686568 (0.0032) [2024-04-28 18:50:01,515][57339] Updated weights for policy 0, policy_version 686578 (0.0037) [2024-04-28 18:50:02,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11248926720. Throughput: 0: 56033.3. Samples: 1739350720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:02,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 18:50:03,996][57339] Updated weights for policy 0, policy_version 686588 (0.0026) [2024-04-28 18:50:07,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11249205248. Throughput: 0: 56165.7. Samples: 1739514880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:07,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 18:50:07,224][57339] Updated weights for policy 0, policy_version 686598 (0.0031) [2024-04-28 18:50:09,802][57339] Updated weights for policy 0, policy_version 686608 (0.0029) [2024-04-28 18:50:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11249500160. Throughput: 0: 56150.7. Samples: 1739855120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:12,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 18:50:12,981][57339] Updated weights for policy 0, policy_version 686618 (0.0032) [2024-04-28 18:50:15,758][57339] Updated weights for policy 0, policy_version 686628 (0.0024) [2024-04-28 18:50:17,169][57108] Fps is (10 sec: 58982.4, 60 sec: 56251.7, 300 sec: 55761.2). Total num frames: 11249795072. Throughput: 0: 56185.3. Samples: 1740194540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:17,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 18:50:18,898][57339] Updated weights for policy 0, policy_version 686638 (0.0030) [2024-04-28 18:50:21,712][57339] Updated weights for policy 0, policy_version 686648 (0.0024) [2024-04-28 18:50:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11250057216. Throughput: 0: 56293.2. Samples: 1740365220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:22,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 18:50:24,848][57339] Updated weights for policy 0, policy_version 686658 (0.0031) [2024-04-28 18:50:27,169][57108] Fps is (10 sec: 55705.7, 60 sec: 56524.9, 300 sec: 55705.6). Total num frames: 11250352128. Throughput: 0: 56174.1. Samples: 1740691080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:27,170][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 18:50:27,530][57339] Updated weights for policy 0, policy_version 686668 (0.0027) [2024-04-28 18:50:30,837][57339] Updated weights for policy 0, policy_version 686678 (0.0026) [2024-04-28 18:50:32,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11250630656. Throughput: 0: 56068.5. Samples: 1741020380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:32,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 18:50:33,419][57339] Updated weights for policy 0, policy_version 686688 (0.0023) [2024-04-28 18:50:36,799][57339] Updated weights for policy 0, policy_version 686698 (0.0032) [2024-04-28 18:50:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11250909184. Throughput: 0: 56054.7. Samples: 1741196980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:37,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 18:50:39,325][57339] Updated weights for policy 0, policy_version 686708 (0.0024) [2024-04-28 18:50:42,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11251154944. Throughput: 0: 55936.4. Samples: 1741531860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 18:50:42,641][57339] Updated weights for policy 0, policy_version 686718 (0.0033) [2024-04-28 18:50:44,286][57319] Signal inference workers to stop experience collection... (26000 times) [2024-04-28 18:50:44,334][57339] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-04-28 18:50:44,334][57319] Signal inference workers to resume experience collection... (26000 times) [2024-04-28 18:50:44,348][57339] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-04-28 18:50:45,333][57339] Updated weights for policy 0, policy_version 686728 (0.0027) [2024-04-28 18:50:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11251449856. Throughput: 0: 55805.4. Samples: 1741861960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:47,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 18:50:48,424][57339] Updated weights for policy 0, policy_version 686738 (0.0026) [2024-04-28 18:50:51,267][57339] Updated weights for policy 0, policy_version 686748 (0.0027) [2024-04-28 18:50:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 11251728384. Throughput: 0: 55759.2. Samples: 1742024040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:52,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 18:50:54,205][57339] Updated weights for policy 0, policy_version 686758 (0.0029) [2024-04-28 18:50:57,142][57339] Updated weights for policy 0, policy_version 686768 (0.0031) [2024-04-28 18:50:57,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 11252006912. Throughput: 0: 55529.6. Samples: 1742353960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:50:57,170][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 18:51:00,042][57339] Updated weights for policy 0, policy_version 686778 (0.0029) [2024-04-28 18:51:02,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11252301824. Throughput: 0: 55522.6. Samples: 1742693060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:51:02,170][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 18:51:02,931][57339] Updated weights for policy 0, policy_version 686788 (0.0026) [2024-04-28 18:51:05,833][57339] Updated weights for policy 0, policy_version 686798 (0.0025) [2024-04-28 18:51:07,169][57108] Fps is (10 sec: 58983.1, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 11252596736. Throughput: 0: 55604.9. Samples: 1742867440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:51:07,170][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 18:51:08,799][57339] Updated weights for policy 0, policy_version 686808 (0.0028) [2024-04-28 18:51:11,784][57339] Updated weights for policy 0, policy_version 686818 (0.0031) [2024-04-28 18:51:12,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 11252875264. Throughput: 0: 55822.8. Samples: 1743203100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:51:12,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 18:51:14,653][57339] Updated weights for policy 0, policy_version 686828 (0.0030) [2024-04-28 18:51:17,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 11253104640. Throughput: 0: 56001.6. Samples: 1743540460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:51:17,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 18:51:17,593][57339] Updated weights for policy 0, policy_version 686838 (0.0034) [2024-04-28 18:51:20,480][57339] Updated weights for policy 0, policy_version 686848 (0.0028) [2024-04-28 18:51:22,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11253399552. Throughput: 0: 55649.3. Samples: 1743701200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:51:22,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 18:51:23,387][57339] Updated weights for policy 0, policy_version 686858 (0.0030) [2024-04-28 18:51:23,570][57319] Signal inference workers to stop experience collection... (26050 times) [2024-04-28 18:51:23,570][57319] Signal inference workers to resume experience collection... (26050 times) [2024-04-28 18:51:23,584][57339] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-04-28 18:51:23,584][57339] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-04-28 18:51:26,183][57339] Updated weights for policy 0, policy_version 686868 (0.0026) [2024-04-28 18:51:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 11253661696. Throughput: 0: 55703.4. Samples: 1744038520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:51:27,170][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 18:51:27,322][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000686871_11253694464.pth... 
[2024-04-28 18:51:27,361][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000686053_11240292352.pth [2024-04-28 18:51:29,273][57339] Updated weights for policy 0, policy_version 686878 (0.0030) [2024-04-28 18:51:31,957][57339] Updated weights for policy 0, policy_version 686888 (0.0031) [2024-04-28 18:51:32,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55705.4, 300 sec: 55816.7). Total num frames: 11253972992. Throughput: 0: 55933.6. Samples: 1744378980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-04-28 18:51:32,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 18:51:35,108][57339] Updated weights for policy 0, policy_version 686898 (0.0027) [2024-04-28 18:51:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11254235136. Throughput: 0: 56134.5. Samples: 1744550100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:51:37,170][57108] Avg episode reward: [(0, '0.707')] [2024-04-28 18:51:38,106][57339] Updated weights for policy 0, policy_version 686908 (0.0031) [2024-04-28 18:51:40,854][57339] Updated weights for policy 0, policy_version 686918 (0.0026) [2024-04-28 18:51:42,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56524.8, 300 sec: 55927.8). Total num frames: 11254546432. Throughput: 0: 56168.2. Samples: 1744881520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:51:42,170][57108] Avg episode reward: [(0, '0.489')] [2024-04-28 18:51:43,838][57339] Updated weights for policy 0, policy_version 686928 (0.0026) [2024-04-28 18:51:46,657][57339] Updated weights for policy 0, policy_version 686938 (0.0026) [2024-04-28 18:51:47,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56251.6, 300 sec: 55983.3). Total num frames: 11254824960. Throughput: 0: 56107.5. Samples: 1745217900. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:51:47,170][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 18:51:49,714][57339] Updated weights for policy 0, policy_version 686948 (0.0029) [2024-04-28 18:51:52,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11255087104. Throughput: 0: 56026.3. Samples: 1745388620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:51:52,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 18:51:52,426][57339] Updated weights for policy 0, policy_version 686958 (0.0027) [2024-04-28 18:51:55,638][57339] Updated weights for policy 0, policy_version 686968 (0.0030) [2024-04-28 18:51:57,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11255365632. Throughput: 0: 56009.5. Samples: 1745723540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:51:57,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 18:51:58,419][57339] Updated weights for policy 0, policy_version 686978 (0.0028) [2024-04-28 18:52:01,455][57339] Updated weights for policy 0, policy_version 686988 (0.0034) [2024-04-28 18:52:02,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11255644160. Throughput: 0: 55883.9. Samples: 1746055240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:02,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 18:52:04,220][57339] Updated weights for policy 0, policy_version 686998 (0.0039) [2024-04-28 18:52:07,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 11255906304. Throughput: 0: 55905.6. Samples: 1746216960. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:07,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 18:52:07,342][57339] Updated weights for policy 0, policy_version 687008 (0.0025) [2024-04-28 18:52:10,081][57339] Updated weights for policy 0, policy_version 687018 (0.0024) [2024-04-28 18:52:12,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11256201216. Throughput: 0: 55888.6. Samples: 1746553500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:12,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 18:52:13,293][57339] Updated weights for policy 0, policy_version 687028 (0.0031) [2024-04-28 18:52:15,937][57339] Updated weights for policy 0, policy_version 687038 (0.0026) [2024-04-28 18:52:17,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11256479744. Throughput: 0: 55784.6. Samples: 1746889280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:17,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 18:52:19,198][57339] Updated weights for policy 0, policy_version 687048 (0.0032) [2024-04-28 18:52:21,724][57339] Updated weights for policy 0, policy_version 687058 (0.0031) [2024-04-28 18:52:22,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 11256774656. Throughput: 0: 55753.9. Samples: 1747059020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:22,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 18:52:25,095][57339] Updated weights for policy 0, policy_version 687068 (0.0030) [2024-04-28 18:52:27,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 11257036800. Throughput: 0: 55892.5. Samples: 1747396680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:27,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 18:52:27,642][57339] Updated weights for policy 0, policy_version 687078 (0.0028) [2024-04-28 18:52:27,645][57319] Signal inference workers to stop experience collection... (26100 times) [2024-04-28 18:52:27,645][57319] Signal inference workers to resume experience collection... (26100 times) [2024-04-28 18:52:27,662][57339] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-04-28 18:52:27,662][57339] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-04-28 18:52:30,910][57339] Updated weights for policy 0, policy_version 687088 (0.0036) [2024-04-28 18:52:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 11257315328. Throughput: 0: 55837.9. Samples: 1747730600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:32,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 18:52:33,464][57339] Updated weights for policy 0, policy_version 687098 (0.0026) [2024-04-28 18:52:36,914][57339] Updated weights for policy 0, policy_version 687108 (0.0025) [2024-04-28 18:52:37,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 11257593856. Throughput: 0: 55651.4. Samples: 1747892940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:37,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 18:52:39,253][57339] Updated weights for policy 0, policy_version 687118 (0.0027) [2024-04-28 18:52:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 11257872384. Throughput: 0: 55685.0. Samples: 1748229360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:42,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 18:52:42,827][57339] Updated weights for policy 0, policy_version 687128 (0.0038) [2024-04-28 18:52:45,321][57339] Updated weights for policy 0, policy_version 687138 (0.0034) [2024-04-28 18:52:47,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55432.7, 300 sec: 55761.2). Total num frames: 11258150912. Throughput: 0: 55690.4. Samples: 1748561300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:47,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 18:52:48,651][57339] Updated weights for policy 0, policy_version 687148 (0.0037) [2024-04-28 18:52:51,136][57339] Updated weights for policy 0, policy_version 687158 (0.0034) [2024-04-28 18:52:52,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11258445824. Throughput: 0: 55846.3. Samples: 1748730040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:52,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 18:52:54,479][57339] Updated weights for policy 0, policy_version 687168 (0.0036) [2024-04-28 18:52:56,976][57339] Updated weights for policy 0, policy_version 687178 (0.0029) [2024-04-28 18:52:57,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 11258724352. Throughput: 0: 55748.0. Samples: 1749062160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:52:57,169][57108] Avg episode reward: [(0, '0.697')] [2024-04-28 18:53:00,362][57339] Updated weights for policy 0, policy_version 687188 (0.0028) [2024-04-28 18:53:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11258986496. Throughput: 0: 55804.0. Samples: 1749400460. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:53:02,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 18:53:02,891][57339] Updated weights for policy 0, policy_version 687198 (0.0032) [2024-04-28 18:53:06,257][57339] Updated weights for policy 0, policy_version 687208 (0.0029) [2024-04-28 18:53:07,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.9, 300 sec: 55927.8). Total num frames: 11259281408. Throughput: 0: 55665.8. Samples: 1749563980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:53:07,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 18:53:08,899][57339] Updated weights for policy 0, policy_version 687218 (0.0034) [2024-04-28 18:53:12,031][57339] Updated weights for policy 0, policy_version 687228 (0.0028) [2024-04-28 18:53:12,169][57108] Fps is (10 sec: 55704.1, 60 sec: 55705.4, 300 sec: 55927.7). Total num frames: 11259543552. Throughput: 0: 55647.6. Samples: 1749900840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-04-28 18:53:12,170][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 18:53:14,642][57339] Updated weights for policy 0, policy_version 687238 (0.0024) [2024-04-28 18:53:17,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11259805696. Throughput: 0: 55699.5. Samples: 1750237080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:53:17,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 18:53:17,968][57339] Updated weights for policy 0, policy_version 687248 (0.0037) [2024-04-28 18:53:20,419][57339] Updated weights for policy 0, policy_version 687258 (0.0026) [2024-04-28 18:53:22,169][57108] Fps is (10 sec: 55707.2, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11260100608. Throughput: 0: 55745.1. Samples: 1750401460. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:53:22,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 18:53:23,852][57339] Updated weights for policy 0, policy_version 687268 (0.0028) [2024-04-28 18:53:26,437][57339] Updated weights for policy 0, policy_version 687278 (0.0025) [2024-04-28 18:53:27,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 11260379136. Throughput: 0: 55727.0. Samples: 1750737080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:53:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 18:53:27,253][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000687280_11260395520.pth... [2024-04-28 18:53:27,297][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000686461_11246977024.pth [2024-04-28 18:53:29,756][57339] Updated weights for policy 0, policy_version 687288 (0.0034) [2024-04-28 18:53:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11260657664. Throughput: 0: 55705.8. Samples: 1751068060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:53:32,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 18:53:32,558][57339] Updated weights for policy 0, policy_version 687298 (0.0031) [2024-04-28 18:53:33,608][57319] Signal inference workers to stop experience collection... (26150 times) [2024-04-28 18:53:33,609][57319] Signal inference workers to resume experience collection... (26150 times) [2024-04-28 18:53:33,634][57339] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-04-28 18:53:33,634][57339] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-04-28 18:53:35,585][57339] Updated weights for policy 0, policy_version 687308 (0.0030) [2024-04-28 18:53:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11260936192. Throughput: 0: 55770.1. 
Samples: 1751239700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:53:37,172][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 18:53:38,630][57339] Updated weights for policy 0, policy_version 687318 (0.0024) [2024-04-28 18:53:41,404][57339] Updated weights for policy 0, policy_version 687328 (0.0037) [2024-04-28 18:53:42,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11261231104. Throughput: 0: 55848.8. Samples: 1751575360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:53:42,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 18:53:44,400][57339] Updated weights for policy 0, policy_version 687338 (0.0036) [2024-04-28 18:53:47,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 11261493248. Throughput: 0: 55707.8. Samples: 1751907320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:53:47,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 18:53:47,248][57339] Updated weights for policy 0, policy_version 687348 (0.0034) [2024-04-28 18:53:50,628][57339] Updated weights for policy 0, policy_version 687358 (0.0034) [2024-04-28 18:53:52,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 11261755392. Throughput: 0: 55569.6. Samples: 1752064620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:53:52,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 18:53:53,102][57339] Updated weights for policy 0, policy_version 687368 (0.0025) [2024-04-28 18:53:56,409][57339] Updated weights for policy 0, policy_version 687378 (0.0025) [2024-04-28 18:53:57,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11262050304. Throughput: 0: 55583.8. Samples: 1752402100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:53:57,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 18:53:58,865][57339] Updated weights for policy 0, policy_version 687388 (0.0032) [2024-04-28 18:54:02,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11262312448. Throughput: 0: 55544.0. Samples: 1752736560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:02,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 18:54:02,348][57339] Updated weights for policy 0, policy_version 687398 (0.0032) [2024-04-28 18:54:04,803][57339] Updated weights for policy 0, policy_version 687408 (0.0028) [2024-04-28 18:54:07,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11262623744. Throughput: 0: 55632.7. Samples: 1752904940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:07,170][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 18:54:08,174][57339] Updated weights for policy 0, policy_version 687418 (0.0028) [2024-04-28 18:54:10,725][57339] Updated weights for policy 0, policy_version 687428 (0.0025) [2024-04-28 18:54:12,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55705.9, 300 sec: 55816.7). Total num frames: 11262885888. Throughput: 0: 55535.4. Samples: 1753236160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:12,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 18:54:14,082][57339] Updated weights for policy 0, policy_version 687438 (0.0025) [2024-04-28 18:54:16,555][57339] Updated weights for policy 0, policy_version 687448 (0.0038) [2024-04-28 18:54:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11263180800. Throughput: 0: 55499.1. Samples: 1753565520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:17,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 18:54:20,009][57339] Updated weights for policy 0, policy_version 687458 (0.0033) [2024-04-28 18:54:22,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11263442944. Throughput: 0: 55485.9. Samples: 1753736560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:22,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 18:54:22,394][57339] Updated weights for policy 0, policy_version 687468 (0.0031) [2024-04-28 18:54:26,053][57339] Updated weights for policy 0, policy_version 687478 (0.0031) [2024-04-28 18:54:27,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 11263705088. Throughput: 0: 55637.8. Samples: 1754079060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:27,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 18:54:28,208][57339] Updated weights for policy 0, policy_version 687488 (0.0028) [2024-04-28 18:54:31,934][57339] Updated weights for policy 0, policy_version 687498 (0.0031) [2024-04-28 18:54:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11263983616. Throughput: 0: 55622.4. Samples: 1754410320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:32,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 18:54:33,565][57319] Signal inference workers to stop experience collection... (26200 times) [2024-04-28 18:54:33,566][57319] Signal inference workers to resume experience collection... 
(26200 times) [2024-04-28 18:54:33,579][57339] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-04-28 18:54:33,579][57339] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-04-28 18:54:34,111][57339] Updated weights for policy 0, policy_version 687508 (0.0027) [2024-04-28 18:54:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 11264262144. Throughput: 0: 55626.9. Samples: 1754567820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:37,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 18:54:37,719][57339] Updated weights for policy 0, policy_version 687518 (0.0033) [2024-04-28 18:54:39,978][57339] Updated weights for policy 0, policy_version 687528 (0.0028) [2024-04-28 18:54:42,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11264557056. Throughput: 0: 55529.8. Samples: 1754900940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:42,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 18:54:43,656][57339] Updated weights for policy 0, policy_version 687538 (0.0027) [2024-04-28 18:54:45,832][57339] Updated weights for policy 0, policy_version 687548 (0.0030) [2024-04-28 18:54:47,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11264835584. Throughput: 0: 55602.9. Samples: 1755238700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:47,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 18:54:49,537][57339] Updated weights for policy 0, policy_version 687558 (0.0024) [2024-04-28 18:54:51,773][57339] Updated weights for policy 0, policy_version 687568 (0.0033) [2024-04-28 18:54:52,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 11265114112. Throughput: 0: 55674.4. Samples: 1755410280. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-04-28 18:54:52,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 18:54:55,288][57339] Updated weights for policy 0, policy_version 687578 (0.0028) [2024-04-28 18:54:57,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11265392640. Throughput: 0: 55662.6. Samples: 1755740980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:54:57,169][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 18:54:57,592][57339] Updated weights for policy 0, policy_version 687588 (0.0027) [2024-04-28 18:55:01,247][57339] Updated weights for policy 0, policy_version 687598 (0.0028) [2024-04-28 18:55:02,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11265654784. Throughput: 0: 55739.0. Samples: 1756073780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:02,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 18:55:03,564][57339] Updated weights for policy 0, policy_version 687608 (0.0028) [2024-04-28 18:55:07,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.6, 300 sec: 55705.6). Total num frames: 11265933312. Throughput: 0: 55656.5. Samples: 1756241100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:07,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 18:55:07,174][57339] Updated weights for policy 0, policy_version 687618 (0.0027) [2024-04-28 18:55:09,327][57339] Updated weights for policy 0, policy_version 687628 (0.0031) [2024-04-28 18:55:12,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11266211840. Throughput: 0: 55425.4. Samples: 1756573200. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:12,178][57108] Avg episode reward: [(0, '0.701')] [2024-04-28 18:55:13,050][57339] Updated weights for policy 0, policy_version 687638 (0.0027) [2024-04-28 18:55:15,156][57339] Updated weights for policy 0, policy_version 687648 (0.0033) [2024-04-28 18:55:17,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11266506752. Throughput: 0: 55483.4. Samples: 1756907080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:17,178][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 18:55:18,903][57339] Updated weights for policy 0, policy_version 687658 (0.0034) [2024-04-28 18:55:21,179][57339] Updated weights for policy 0, policy_version 687668 (0.0027) [2024-04-28 18:55:22,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11266768896. Throughput: 0: 55645.4. Samples: 1757071860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:22,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 18:55:24,825][57339] Updated weights for policy 0, policy_version 687678 (0.0030) [2024-04-28 18:55:27,106][57339] Updated weights for policy 0, policy_version 687688 (0.0030) [2024-04-28 18:55:27,169][57108] Fps is (10 sec: 57343.4, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 11267080192. Throughput: 0: 55638.9. Samples: 1757404700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:27,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 18:55:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000687688_11267080192.pth... [2024-04-28 18:55:27,232][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000686871_11253694464.pth [2024-04-28 18:55:30,831][57339] Updated weights for policy 0, policy_version 687698 (0.0032) [2024-04-28 18:55:32,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55978.6, 300 sec: 55705.6). 
Total num frames: 11267342336. Throughput: 0: 55564.5. Samples: 1757739100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:32,170][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 18:55:32,998][57339] Updated weights for policy 0, policy_version 687708 (0.0030) [2024-04-28 18:55:36,595][57339] Updated weights for policy 0, policy_version 687718 (0.0034) [2024-04-28 18:55:37,169][57108] Fps is (10 sec: 52430.4, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11267604480. Throughput: 0: 55380.5. Samples: 1757902400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:37,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 18:55:37,735][57319] Signal inference workers to stop experience collection... (26250 times) [2024-04-28 18:55:37,735][57319] Signal inference workers to resume experience collection... (26250 times) [2024-04-28 18:55:37,759][57339] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-04-28 18:55:37,759][57339] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-04-28 18:55:39,141][57339] Updated weights for policy 0, policy_version 687728 (0.0031) [2024-04-28 18:55:42,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 11267866624. Throughput: 0: 55445.3. Samples: 1758236020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 18:55:42,304][57339] Updated weights for policy 0, policy_version 687738 (0.0034) [2024-04-28 18:55:45,052][57339] Updated weights for policy 0, policy_version 687748 (0.0026) [2024-04-28 18:55:47,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 11268161536. Throughput: 0: 55512.5. Samples: 1758571840. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:47,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 18:55:48,203][57339] Updated weights for policy 0, policy_version 687758 (0.0033) [2024-04-28 18:55:50,926][57339] Updated weights for policy 0, policy_version 687768 (0.0026) [2024-04-28 18:55:52,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11268440064. Throughput: 0: 55515.8. Samples: 1758739320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:52,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 18:55:54,178][57339] Updated weights for policy 0, policy_version 687778 (0.0025) [2024-04-28 18:55:56,830][57339] Updated weights for policy 0, policy_version 687788 (0.0031) [2024-04-28 18:55:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11268718592. Throughput: 0: 55598.2. Samples: 1759075120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:55:57,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 18:55:59,957][57339] Updated weights for policy 0, policy_version 687798 (0.0034) [2024-04-28 18:56:02,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 11269013504. Throughput: 0: 55635.7. Samples: 1759410680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:56:02,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 18:56:02,698][57339] Updated weights for policy 0, policy_version 687808 (0.0027) [2024-04-28 18:56:05,827][57339] Updated weights for policy 0, policy_version 687818 (0.0032) [2024-04-28 18:56:07,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11269275648. Throughput: 0: 55749.3. Samples: 1759580580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:56:07,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 18:56:08,598][57339] Updated weights for policy 0, policy_version 687828 (0.0030) [2024-04-28 18:56:11,694][57339] Updated weights for policy 0, policy_version 687838 (0.0028) [2024-04-28 18:56:12,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11269554176. Throughput: 0: 55850.5. Samples: 1759917960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:56:12,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 18:56:14,460][57339] Updated weights for policy 0, policy_version 687848 (0.0032) [2024-04-28 18:56:17,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.6, 300 sec: 55650.1). Total num frames: 11269816320. Throughput: 0: 55952.2. Samples: 1760256940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:56:17,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 18:56:17,510][57339] Updated weights for policy 0, policy_version 687858 (0.0027) [2024-04-28 18:56:20,365][57339] Updated weights for policy 0, policy_version 687868 (0.0030) [2024-04-28 18:56:22,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55761.2). Total num frames: 11270111232. Throughput: 0: 55863.0. Samples: 1760416240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:56:22,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 18:56:23,347][57339] Updated weights for policy 0, policy_version 687878 (0.0035) [2024-04-28 18:56:26,340][57339] Updated weights for policy 0, policy_version 687888 (0.0028) [2024-04-28 18:56:27,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 11270406144. Throughput: 0: 56009.7. Samples: 1760756460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:56:27,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 18:56:29,145][57339] Updated weights for policy 0, policy_version 687898 (0.0029) [2024-04-28 18:56:31,531][57319] Signal inference workers to stop experience collection... (26300 times) [2024-04-28 18:56:31,531][57319] Signal inference workers to resume experience collection... (26300 times) [2024-04-28 18:56:31,572][57339] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-04-28 18:56:31,572][57339] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-04-28 18:56:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11270651904. Throughput: 0: 55939.1. Samples: 1761089100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-04-28 18:56:32,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 18:56:32,404][57339] Updated weights for policy 0, policy_version 687908 (0.0031) [2024-04-28 18:56:35,097][57339] Updated weights for policy 0, policy_version 687918 (0.0030) [2024-04-28 18:56:37,169][57108] Fps is (10 sec: 54066.0, 60 sec: 55705.3, 300 sec: 55594.5). Total num frames: 11270946816. Throughput: 0: 56003.8. Samples: 1761259500. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:56:37,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 18:56:38,244][57339] Updated weights for policy 0, policy_version 687928 (0.0029) [2024-04-28 18:56:40,893][57339] Updated weights for policy 0, policy_version 687938 (0.0029) [2024-04-28 18:56:42,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56251.8, 300 sec: 55650.1). Total num frames: 11271241728. Throughput: 0: 55947.7. Samples: 1761592760. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:56:42,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 18:56:44,056][57339] Updated weights for policy 0, policy_version 687948 (0.0038) [2024-04-28 18:56:46,879][57339] Updated weights for policy 0, policy_version 687958 (0.0027) [2024-04-28 18:56:47,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11271520256. Throughput: 0: 55872.2. Samples: 1761924940. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:56:47,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 18:56:49,947][57339] Updated weights for policy 0, policy_version 687968 (0.0027) [2024-04-28 18:56:52,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11271782400. Throughput: 0: 55865.2. Samples: 1762094520. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:56:52,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 18:56:52,726][57339] Updated weights for policy 0, policy_version 687978 (0.0033) [2024-04-28 18:56:55,724][57339] Updated weights for policy 0, policy_version 687988 (0.0030) [2024-04-28 18:56:57,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 11272060928. Throughput: 0: 55838.9. Samples: 1762430720. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:56:57,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 18:56:58,755][57339] Updated weights for policy 0, policy_version 687998 (0.0032) [2024-04-28 18:57:01,473][57339] Updated weights for policy 0, policy_version 688008 (0.0029) [2024-04-28 18:57:02,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 11272355840. Throughput: 0: 55643.9. Samples: 1762760920. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:02,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 18:57:04,474][57339] Updated weights for policy 0, policy_version 688018 (0.0025) [2024-04-28 18:57:07,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11272634368. Throughput: 0: 55868.4. Samples: 1762930320. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:07,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 18:57:07,252][57339] Updated weights for policy 0, policy_version 688028 (0.0026) [2024-04-28 18:57:10,228][57339] Updated weights for policy 0, policy_version 688038 (0.0026) [2024-04-28 18:57:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11272912896. Throughput: 0: 55834.2. Samples: 1763269000. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:12,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 18:57:13,064][57339] Updated weights for policy 0, policy_version 688048 (0.0030) [2024-04-28 18:57:16,117][57339] Updated weights for policy 0, policy_version 688058 (0.0026) [2024-04-28 18:57:17,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56524.6, 300 sec: 55705.6). Total num frames: 11273207808. Throughput: 0: 55982.5. Samples: 1763608320. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:17,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 18:57:18,975][57339] Updated weights for policy 0, policy_version 688068 (0.0033) [2024-04-28 18:57:21,945][57339] Updated weights for policy 0, policy_version 688078 (0.0031) [2024-04-28 18:57:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 11273486336. Throughput: 0: 55953.1. Samples: 1763777380. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:22,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 18:57:24,757][57339] Updated weights for policy 0, policy_version 688088 (0.0028) [2024-04-28 18:57:27,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11273748480. Throughput: 0: 56075.4. Samples: 1764116160. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:27,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 18:57:27,200][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000688096_11273764864.pth... [2024-04-28 18:57:27,244][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000687280_11260395520.pth [2024-04-28 18:57:27,624][57339] Updated weights for policy 0, policy_version 688098 (0.0032) [2024-04-28 18:57:30,558][57339] Updated weights for policy 0, policy_version 688108 (0.0037) [2024-04-28 18:57:32,169][57108] Fps is (10 sec: 54067.6, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11274027008. Throughput: 0: 56054.8. Samples: 1764447400. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:32,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 18:57:33,546][57339] Updated weights for policy 0, policy_version 688118 (0.0034) [2024-04-28 18:57:36,458][57339] Updated weights for policy 0, policy_version 688128 (0.0030) [2024-04-28 18:57:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.9, 300 sec: 55705.6). Total num frames: 11274305536. Throughput: 0: 55985.9. Samples: 1764613880. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:37,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 18:57:39,465][57339] Updated weights for policy 0, policy_version 688138 (0.0027) [2024-04-28 18:57:42,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11274600448. Throughput: 0: 55894.8. Samples: 1764945980. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:42,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 18:57:42,322][57339] Updated weights for policy 0, policy_version 688148 (0.0035) [2024-04-28 18:57:45,361][57339] Updated weights for policy 0, policy_version 688158 (0.0037) [2024-04-28 18:57:45,966][57319] Signal inference workers to stop experience collection... (26350 times) [2024-04-28 18:57:45,983][57339] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-04-28 18:57:46,066][57319] Signal inference workers to resume experience collection... (26350 times) [2024-04-28 18:57:46,066][57339] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-04-28 18:57:47,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11274862592. Throughput: 0: 56028.4. Samples: 1765282200. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:47,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 18:57:48,031][57339] Updated weights for policy 0, policy_version 688168 (0.0026) [2024-04-28 18:57:51,294][57339] Updated weights for policy 0, policy_version 688178 (0.0030) [2024-04-28 18:57:52,169][57108] Fps is (10 sec: 55706.2, 60 sec: 56251.9, 300 sec: 55705.6). Total num frames: 11275157504. Throughput: 0: 55992.6. Samples: 1765449980. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:52,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 18:57:54,098][57339] Updated weights for policy 0, policy_version 688188 (0.0038) [2024-04-28 18:57:57,092][57339] Updated weights for policy 0, policy_version 688198 (0.0028) [2024-04-28 18:57:57,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.9, 300 sec: 55761.1). Total num frames: 11275436032. Throughput: 0: 55847.1. Samples: 1765782120. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:57:57,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 18:57:59,974][57339] Updated weights for policy 0, policy_version 688208 (0.0030) [2024-04-28 18:58:02,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11275714560. Throughput: 0: 55769.5. Samples: 1766117940. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:58:02,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 18:58:02,946][57339] Updated weights for policy 0, policy_version 688218 (0.0029) [2024-04-28 18:58:05,897][57339] Updated weights for policy 0, policy_version 688228 (0.0033) [2024-04-28 18:58:07,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11275976704. Throughput: 0: 55744.5. Samples: 1766285880. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-04-28 18:58:07,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 18:58:08,856][57339] Updated weights for policy 0, policy_version 688238 (0.0026) [2024-04-28 18:58:11,741][57339] Updated weights for policy 0, policy_version 688248 (0.0024) [2024-04-28 18:58:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11276271616. Throughput: 0: 55750.6. Samples: 1766624940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:12,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 18:58:14,707][57339] Updated weights for policy 0, policy_version 688258 (0.0026) [2024-04-28 18:58:17,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 11276550144. Throughput: 0: 55820.4. Samples: 1766959320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:17,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 18:58:17,521][57339] Updated weights for policy 0, policy_version 688268 (0.0029) [2024-04-28 18:58:20,568][57339] Updated weights for policy 0, policy_version 688278 (0.0029) [2024-04-28 18:58:22,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11276828672. Throughput: 0: 55842.2. Samples: 1767126780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:22,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 18:58:23,315][57339] Updated weights for policy 0, policy_version 688288 (0.0036) [2024-04-28 18:58:26,492][57339] Updated weights for policy 0, policy_version 688298 (0.0032) [2024-04-28 18:58:27,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 11277123584. Throughput: 0: 55968.3. Samples: 1767464560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:27,169][57108] Avg episode reward: [(0, '0.493')] [2024-04-28 18:58:29,042][57339] Updated weights for policy 0, policy_version 688308 (0.0030) [2024-04-28 18:58:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 11277385728. Throughput: 0: 55983.6. Samples: 1767801460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:32,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 18:58:32,291][57339] Updated weights for policy 0, policy_version 688318 (0.0030) [2024-04-28 18:58:34,714][57339] Updated weights for policy 0, policy_version 688328 (0.0028) [2024-04-28 18:58:37,169][57108] Fps is (10 sec: 55706.7, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11277680640. Throughput: 0: 55934.6. Samples: 1767967040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:37,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 18:58:38,285][57339] Updated weights for policy 0, policy_version 688338 (0.0027) [2024-04-28 18:58:40,846][57339] Updated weights for policy 0, policy_version 688348 (0.0026) [2024-04-28 18:58:42,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11277942784. Throughput: 0: 55983.4. Samples: 1768301380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:42,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 18:58:44,205][57339] Updated weights for policy 0, policy_version 688358 (0.0025) [2024-04-28 18:58:46,639][57339] Updated weights for policy 0, policy_version 688368 (0.0030) [2024-04-28 18:58:47,169][57108] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 11278237696. Throughput: 0: 55866.6. Samples: 1768631940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:47,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 18:58:50,048][57339] Updated weights for policy 0, policy_version 688378 (0.0028) [2024-04-28 18:58:52,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11278499840. Throughput: 0: 55972.5. Samples: 1768804640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:52,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 18:58:52,578][57339] Updated weights for policy 0, policy_version 688388 (0.0038) [2024-04-28 18:58:55,826][57339] Updated weights for policy 0, policy_version 688398 (0.0032) [2024-04-28 18:58:57,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11278761984. Throughput: 0: 55913.6. Samples: 1769141060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:58:57,170][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 18:58:58,558][57339] Updated weights for policy 0, policy_version 688408 (0.0025) [2024-04-28 18:59:01,722][57339] Updated weights for policy 0, policy_version 688418 (0.0032) [2024-04-28 18:59:02,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11279089664. Throughput: 0: 55805.4. Samples: 1769470560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:02,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 18:59:03,321][57319] Signal inference workers to stop experience collection... (26400 times) [2024-04-28 18:59:03,323][57319] Signal inference workers to resume experience collection... (26400 times) [2024-04-28 18:59:03,352][57339] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-04-28 18:59:03,352][57339] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-04-28 18:59:04,304][57339] Updated weights for policy 0, policy_version 688428 (0.0028) [2024-04-28 18:59:07,169][57108] Fps is (10 sec: 55707.1, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11279319040. Throughput: 0: 55570.7. Samples: 1769627460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:07,169][57108] Avg episode reward: [(0, '0.526')] [2024-04-28 18:59:07,674][57339] Updated weights for policy 0, policy_version 688438 (0.0027) [2024-04-28 18:59:10,132][57339] Updated weights for policy 0, policy_version 688448 (0.0028) [2024-04-28 18:59:12,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11279613952. Throughput: 0: 55481.5. Samples: 1769961220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:12,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 18:59:13,516][57339] Updated weights for policy 0, policy_version 688458 (0.0028) [2024-04-28 18:59:16,020][57339] Updated weights for policy 0, policy_version 688468 (0.0025) [2024-04-28 18:59:17,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11279876096. Throughput: 0: 55433.2. Samples: 1770295960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:17,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 18:59:19,499][57339] Updated weights for policy 0, policy_version 688478 (0.0033) [2024-04-28 18:59:21,908][57339] Updated weights for policy 0, policy_version 688488 (0.0028) [2024-04-28 18:59:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11280187392. Throughput: 0: 55541.4. Samples: 1770466400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:22,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 18:59:25,391][57339] Updated weights for policy 0, policy_version 688498 (0.0031) [2024-04-28 18:59:27,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 11280449536. Throughput: 0: 55412.6. Samples: 1770794940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:27,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 18:59:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000688504_11280449536.pth... [2024-04-28 18:59:27,240][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000687688_11267080192.pth [2024-04-28 18:59:27,857][57339] Updated weights for policy 0, policy_version 688508 (0.0027) [2024-04-28 18:59:31,293][57339] Updated weights for policy 0, policy_version 688518 (0.0037) [2024-04-28 18:59:32,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55159.5, 300 sec: 55705.6). 
Total num frames: 11280695296. Throughput: 0: 55439.7. Samples: 1771126720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:32,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 18:59:33,674][57339] Updated weights for policy 0, policy_version 688528 (0.0024) [2024-04-28 18:59:37,142][57339] Updated weights for policy 0, policy_version 688538 (0.0020) [2024-04-28 18:59:37,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11281006592. Throughput: 0: 55333.6. Samples: 1771294660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:37,170][57108] Avg episode reward: [(0, '0.508')] [2024-04-28 18:59:39,720][57339] Updated weights for policy 0, policy_version 688548 (0.0031) [2024-04-28 18:59:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 11281268736. Throughput: 0: 55247.4. Samples: 1771627180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:42,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 18:59:43,038][57339] Updated weights for policy 0, policy_version 688558 (0.0031) [2024-04-28 18:59:45,749][57339] Updated weights for policy 0, policy_version 688568 (0.0031) [2024-04-28 18:59:47,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11281547264. Throughput: 0: 55350.6. Samples: 1771961340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-04-28 18:59:47,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 18:59:48,533][57319] Signal inference workers to stop experience collection... (26450 times) [2024-04-28 18:59:48,587][57319] Signal inference workers to resume experience collection... 
(26450 times) [2024-04-28 18:59:48,588][57339] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-04-28 18:59:48,607][57339] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-04-28 18:59:48,954][57339] Updated weights for policy 0, policy_version 688578 (0.0031) [2024-04-28 18:59:51,531][57339] Updated weights for policy 0, policy_version 688588 (0.0034) [2024-04-28 18:59:52,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11281825792. Throughput: 0: 55601.5. Samples: 1772129540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 18:59:52,170][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 18:59:54,901][57339] Updated weights for policy 0, policy_version 688598 (0.0027) [2024-04-28 18:59:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 11282120704. Throughput: 0: 55640.6. Samples: 1772465040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 18:59:57,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 18:59:57,357][57339] Updated weights for policy 0, policy_version 688608 (0.0026) [2024-04-28 19:00:00,793][57339] Updated weights for policy 0, policy_version 688618 (0.0031) [2024-04-28 19:00:02,169][57108] Fps is (10 sec: 55706.8, 60 sec: 54886.4, 300 sec: 55761.1). Total num frames: 11282382848. Throughput: 0: 55545.6. Samples: 1772795500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:02,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 19:00:03,132][57339] Updated weights for policy 0, policy_version 688628 (0.0032) [2024-04-28 19:00:06,631][57339] Updated weights for policy 0, policy_version 688638 (0.0028) [2024-04-28 19:00:07,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55705.4, 300 sec: 55761.1). Total num frames: 11282661376. Throughput: 0: 55545.5. Samples: 1772965960. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:07,170][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 19:00:09,200][57339] Updated weights for policy 0, policy_version 688648 (0.0028) [2024-04-28 19:00:12,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11282923520. Throughput: 0: 55536.4. Samples: 1773294080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:12,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 19:00:12,530][57339] Updated weights for policy 0, policy_version 688658 (0.0029) [2024-04-28 19:00:15,042][57339] Updated weights for policy 0, policy_version 688668 (0.0026) [2024-04-28 19:00:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11283202048. Throughput: 0: 55577.1. Samples: 1773627700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:17,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 19:00:18,493][57339] Updated weights for policy 0, policy_version 688678 (0.0027) [2024-04-28 19:00:20,920][57339] Updated weights for policy 0, policy_version 688688 (0.0028) [2024-04-28 19:00:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 11283496960. Throughput: 0: 55629.9. Samples: 1773798000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:22,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:00:24,324][57339] Updated weights for policy 0, policy_version 688698 (0.0033) [2024-04-28 19:00:26,886][57339] Updated weights for policy 0, policy_version 688708 (0.0040) [2024-04-28 19:00:27,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11283791872. Throughput: 0: 55586.5. Samples: 1774128580. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:27,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 19:00:30,262][57339] Updated weights for policy 0, policy_version 688718 (0.0026) [2024-04-28 19:00:32,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 11284086784. Throughput: 0: 55452.9. Samples: 1774456720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:32,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:00:33,115][57339] Updated weights for policy 0, policy_version 688728 (0.0030) [2024-04-28 19:00:36,267][57339] Updated weights for policy 0, policy_version 688738 (0.0030) [2024-04-28 19:00:37,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 11284316160. Throughput: 0: 55616.2. Samples: 1774632260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:37,169][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 19:00:39,216][57339] Updated weights for policy 0, policy_version 688748 (0.0025) [2024-04-28 19:00:41,555][57319] Signal inference workers to stop experience collection... (26500 times) [2024-04-28 19:00:41,562][57319] Signal inference workers to resume experience collection... (26500 times) [2024-04-28 19:00:41,585][57339] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-04-28 19:00:41,585][57339] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-04-28 19:00:42,146][57339] Updated weights for policy 0, policy_version 688758 (0.0026) [2024-04-28 19:00:42,169][57108] Fps is (10 sec: 52428.3, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11284611072. Throughput: 0: 55538.5. Samples: 1774964280. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:42,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 19:00:45,071][57339] Updated weights for policy 0, policy_version 688768 (0.0024) [2024-04-28 19:00:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11284873216. Throughput: 0: 55556.9. Samples: 1775295560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:47,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 19:00:47,936][57339] Updated weights for policy 0, policy_version 688778 (0.0030) [2024-04-28 19:00:51,003][57339] Updated weights for policy 0, policy_version 688788 (0.0032) [2024-04-28 19:00:52,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 11285135360. Throughput: 0: 55171.5. Samples: 1775448680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:52,170][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 19:00:53,858][57339] Updated weights for policy 0, policy_version 688798 (0.0027) [2024-04-28 19:00:56,909][57339] Updated weights for policy 0, policy_version 688808 (0.0027) [2024-04-28 19:00:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11285446656. Throughput: 0: 55293.4. Samples: 1775782280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:00:57,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:00:59,860][57339] Updated weights for policy 0, policy_version 688818 (0.0033) [2024-04-28 19:01:02,169][57108] Fps is (10 sec: 58983.4, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11285725184. Throughput: 0: 55228.6. Samples: 1776112980. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:01:02,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:01:02,903][57339] Updated weights for policy 0, policy_version 688828 (0.0027) [2024-04-28 19:01:06,290][57339] Updated weights for policy 0, policy_version 688838 (0.0028) [2024-04-28 19:01:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 11286003712. Throughput: 0: 55331.1. Samples: 1776287900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:01:07,169][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 19:01:08,721][57339] Updated weights for policy 0, policy_version 688848 (0.0024) [2024-04-28 19:01:12,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 11286233088. Throughput: 0: 55336.4. Samples: 1776618720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:01:12,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 19:01:12,252][57339] Updated weights for policy 0, policy_version 688858 (0.0036) [2024-04-28 19:01:14,460][57339] Updated weights for policy 0, policy_version 688868 (0.0025) [2024-04-28 19:01:17,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 11286544384. Throughput: 0: 55546.3. Samples: 1776956300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:01:17,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 19:01:17,891][57339] Updated weights for policy 0, policy_version 688878 (0.0027) [2024-04-28 19:01:20,411][57339] Updated weights for policy 0, policy_version 688888 (0.0027) [2024-04-28 19:01:22,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11286806528. Throughput: 0: 55280.9. Samples: 1777119900. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:01:22,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 19:01:23,737][57339] Updated weights for policy 0, policy_version 688898 (0.0033) [2024-04-28 19:01:26,309][57339] Updated weights for policy 0, policy_version 688908 (0.0028) [2024-04-28 19:01:27,169][57108] Fps is (10 sec: 54066.4, 60 sec: 54886.4, 300 sec: 55705.6). Total num frames: 11287085056. Throughput: 0: 55321.3. Samples: 1777453740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:01:27,170][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 19:01:27,223][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000688910_11287101440.pth... [2024-04-28 19:01:27,268][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000688096_11273764864.pth [2024-04-28 19:01:29,574][57319] Signal inference workers to stop experience collection... (26550 times) [2024-04-28 19:01:29,600][57339] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-04-28 19:01:29,673][57319] Signal inference workers to resume experience collection... (26550 times) [2024-04-28 19:01:29,673][57339] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-04-28 19:01:29,675][57339] Updated weights for policy 0, policy_version 688918 (0.0033) [2024-04-28 19:01:32,163][57339] Updated weights for policy 0, policy_version 688928 (0.0028) [2024-04-28 19:01:32,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 11287396352. Throughput: 0: 55276.9. Samples: 1777783020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:01:32,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 19:01:35,447][57339] Updated weights for policy 0, policy_version 688938 (0.0028) [2024-04-28 19:01:37,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11287658496. Throughput: 0: 55676.8. 
Samples: 1777954120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:01:37,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 19:01:38,088][57339] Updated weights for policy 0, policy_version 688948 (0.0026) [2024-04-28 19:01:41,242][57339] Updated weights for policy 0, policy_version 688958 (0.0029) [2024-04-28 19:01:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11287937024. Throughput: 0: 55623.1. Samples: 1778285320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:01:42,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 19:01:43,912][57339] Updated weights for policy 0, policy_version 688968 (0.0035) [2024-04-28 19:01:47,054][57339] Updated weights for policy 0, policy_version 688978 (0.0029) [2024-04-28 19:01:47,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11288215552. Throughput: 0: 55669.3. Samples: 1778618100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:01:47,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 19:01:49,774][57339] Updated weights for policy 0, policy_version 688988 (0.0038) [2024-04-28 19:01:52,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.8, 300 sec: 55650.1). Total num frames: 11288477696. Throughput: 0: 55481.8. Samples: 1778784580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:01:52,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 19:01:52,920][57339] Updated weights for policy 0, policy_version 688998 (0.0030) [2024-04-28 19:01:55,585][57339] Updated weights for policy 0, policy_version 689008 (0.0032) [2024-04-28 19:01:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11288772608. Throughput: 0: 55510.3. Samples: 1779116680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:01:57,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 19:01:58,848][57339] Updated weights for policy 0, policy_version 689018 (0.0028) [2024-04-28 19:02:01,671][57339] Updated weights for policy 0, policy_version 689028 (0.0031) [2024-04-28 19:02:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 11289034752. Throughput: 0: 55451.0. Samples: 1779451600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:02,169][57108] Avg episode reward: [(0, '0.692')] [2024-04-28 19:02:04,800][57339] Updated weights for policy 0, policy_version 689038 (0.0031) [2024-04-28 19:02:07,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11289346048. Throughput: 0: 55521.3. Samples: 1779618360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:07,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 19:02:07,430][57339] Updated weights for policy 0, policy_version 689048 (0.0027) [2024-04-28 19:02:10,514][57339] Updated weights for policy 0, policy_version 689058 (0.0029) [2024-04-28 19:02:12,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 11289591808. Throughput: 0: 55489.4. Samples: 1779950760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:12,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 19:02:13,587][57339] Updated weights for policy 0, policy_version 689068 (0.0028) [2024-04-28 19:02:16,543][57339] Updated weights for policy 0, policy_version 689078 (0.0027) [2024-04-28 19:02:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11289886720. Throughput: 0: 55630.5. Samples: 1780286400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:17,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 19:02:19,643][57339] Updated weights for policy 0, policy_version 689088 (0.0031) [2024-04-28 19:02:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11290165248. Throughput: 0: 55604.8. Samples: 1780456340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:22,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 19:02:22,464][57339] Updated weights for policy 0, policy_version 689098 (0.0024) [2024-04-28 19:02:25,348][57339] Updated weights for policy 0, policy_version 689108 (0.0032) [2024-04-28 19:02:27,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 11290427392. Throughput: 0: 55705.8. Samples: 1780792080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:27,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:02:28,211][57339] Updated weights for policy 0, policy_version 689118 (0.0029) [2024-04-28 19:02:31,112][57339] Updated weights for policy 0, policy_version 689128 (0.0036) [2024-04-28 19:02:32,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.3, 300 sec: 55650.0). Total num frames: 11290722304. Throughput: 0: 55718.5. Samples: 1781125440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:32,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:02:34,006][57339] Updated weights for policy 0, policy_version 689138 (0.0032) [2024-04-28 19:02:37,107][57339] Updated weights for policy 0, policy_version 689148 (0.0028) [2024-04-28 19:02:37,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11291000832. Throughput: 0: 55636.8. Samples: 1781288240. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:37,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:02:40,097][57339] Updated weights for policy 0, policy_version 689158 (0.0035) [2024-04-28 19:02:42,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11291262976. Throughput: 0: 55664.8. Samples: 1781621600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:42,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 19:02:42,959][57339] Updated weights for policy 0, policy_version 689168 (0.0029) [2024-04-28 19:02:44,916][57319] Signal inference workers to stop experience collection... (26600 times) [2024-04-28 19:02:44,917][57319] Signal inference workers to resume experience collection... (26600 times) [2024-04-28 19:02:44,949][57339] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-04-28 19:02:44,949][57339] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-04-28 19:02:45,970][57339] Updated weights for policy 0, policy_version 689178 (0.0033) [2024-04-28 19:02:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55538.9). Total num frames: 11291541504. Throughput: 0: 55640.4. Samples: 1781955420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:47,169][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 19:02:48,687][57339] Updated weights for policy 0, policy_version 689188 (0.0027) [2024-04-28 19:02:51,817][57339] Updated weights for policy 0, policy_version 689198 (0.0029) [2024-04-28 19:02:52,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.5, 300 sec: 55594.5). Total num frames: 11291836416. Throughput: 0: 55765.7. Samples: 1782127820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:52,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 19:02:54,609][57339] Updated weights for policy 0, policy_version 689208 (0.0027) [2024-04-28 19:02:57,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11292114944. Throughput: 0: 55761.0. Samples: 1782460000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:02:57,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 19:02:57,549][57339] Updated weights for policy 0, policy_version 689218 (0.0029) [2024-04-28 19:03:00,469][57339] Updated weights for policy 0, policy_version 689228 (0.0029) [2024-04-28 19:03:02,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11292377088. Throughput: 0: 55719.5. Samples: 1782793780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:03:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 19:03:03,747][57339] Updated weights for policy 0, policy_version 689238 (0.0029) [2024-04-28 19:03:06,709][57339] Updated weights for policy 0, policy_version 689248 (0.0030) [2024-04-28 19:03:07,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11292655616. Throughput: 0: 55589.4. Samples: 1782957860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-04-28 19:03:07,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 19:03:09,484][57339] Updated weights for policy 0, policy_version 689258 (0.0026) [2024-04-28 19:03:12,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11292934144. Throughput: 0: 55563.6. Samples: 1783292440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:12,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 19:03:12,652][57339] Updated weights for policy 0, policy_version 689268 (0.0032) [2024-04-28 19:03:15,363][57339] Updated weights for policy 0, policy_version 689278 (0.0033) [2024-04-28 19:03:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11293212672. Throughput: 0: 55634.5. Samples: 1783628980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:17,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 19:03:18,453][57339] Updated weights for policy 0, policy_version 689288 (0.0030) [2024-04-28 19:03:21,351][57339] Updated weights for policy 0, policy_version 689298 (0.0032) [2024-04-28 19:03:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 11293491200. Throughput: 0: 55655.3. Samples: 1783792720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:22,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:03:24,286][57339] Updated weights for policy 0, policy_version 689308 (0.0033) [2024-04-28 19:03:27,076][57339] Updated weights for policy 0, policy_version 689318 (0.0030) [2024-04-28 19:03:27,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 11293786112. Throughput: 0: 55742.8. Samples: 1784130020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 19:03:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000689318_11293786112.pth... [2024-04-28 19:03:27,222][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000688504_11280449536.pth [2024-04-28 19:03:30,305][57339] Updated weights for policy 0, policy_version 689328 (0.0032) [2024-04-28 19:03:32,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.7, 300 sec: 55483.4). 
Total num frames: 11294048256. Throughput: 0: 55694.3. Samples: 1784461660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:32,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 19:03:32,891][57339] Updated weights for policy 0, policy_version 689338 (0.0033) [2024-04-28 19:03:36,202][57339] Updated weights for policy 0, policy_version 689348 (0.0032) [2024-04-28 19:03:37,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 11294326784. Throughput: 0: 55619.3. Samples: 1784630680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:37,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 19:03:38,833][57339] Updated weights for policy 0, policy_version 689358 (0.0027) [2024-04-28 19:03:42,030][57339] Updated weights for policy 0, policy_version 689368 (0.0027) [2024-04-28 19:03:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 11294605312. Throughput: 0: 55606.1. Samples: 1784962280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:42,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 19:03:44,715][57339] Updated weights for policy 0, policy_version 689378 (0.0026) [2024-04-28 19:03:47,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.7, 300 sec: 55483.5). Total num frames: 11294867456. Throughput: 0: 55630.8. Samples: 1785297160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:47,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 19:03:47,959][57339] Updated weights for policy 0, policy_version 689388 (0.0035) [2024-04-28 19:03:50,612][57339] Updated weights for policy 0, policy_version 689398 (0.0031) [2024-04-28 19:03:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11295162368. Throughput: 0: 55617.2. Samples: 1785460640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:52,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 19:03:53,925][57339] Updated weights for policy 0, policy_version 689408 (0.0031) [2024-04-28 19:03:56,430][57339] Updated weights for policy 0, policy_version 689418 (0.0026) [2024-04-28 19:03:57,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 11295440896. Throughput: 0: 55649.7. Samples: 1785796680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:03:57,170][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 19:03:59,676][57339] Updated weights for policy 0, policy_version 689428 (0.0027) [2024-04-28 19:04:00,980][57319] Signal inference workers to stop experience collection... (26650 times) [2024-04-28 19:04:00,980][57319] Signal inference workers to resume experience collection... (26650 times) [2024-04-28 19:04:01,008][57339] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-04-28 19:04:01,008][57339] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-04-28 19:04:02,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 11295719424. Throughput: 0: 55558.2. Samples: 1786129100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:02,169][57108] Avg episode reward: [(0, '0.458')] [2024-04-28 19:04:02,429][57339] Updated weights for policy 0, policy_version 689438 (0.0027) [2024-04-28 19:04:05,684][57339] Updated weights for policy 0, policy_version 689448 (0.0027) [2024-04-28 19:04:07,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11295997952. Throughput: 0: 55666.7. Samples: 1786297720. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:07,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 19:04:08,410][57339] Updated weights for policy 0, policy_version 689458 (0.0025) [2024-04-28 19:04:11,585][57339] Updated weights for policy 0, policy_version 689468 (0.0030) [2024-04-28 19:04:12,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11296292864. Throughput: 0: 55630.2. Samples: 1786633380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:12,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 19:04:14,153][57339] Updated weights for policy 0, policy_version 689478 (0.0026) [2024-04-28 19:04:17,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55432.4, 300 sec: 55427.9). Total num frames: 11296538624. Throughput: 0: 55689.2. Samples: 1786967680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:17,170][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 19:04:17,452][57339] Updated weights for policy 0, policy_version 689488 (0.0031) [2024-04-28 19:04:20,066][57339] Updated weights for policy 0, policy_version 689498 (0.0028) [2024-04-28 19:04:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 11296833536. Throughput: 0: 55527.4. Samples: 1787129420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:22,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:04:23,339][57339] Updated weights for policy 0, policy_version 689508 (0.0026) [2024-04-28 19:04:26,000][57339] Updated weights for policy 0, policy_version 689518 (0.0029) [2024-04-28 19:04:27,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11297112064. Throughput: 0: 55545.3. Samples: 1787461820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:27,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 19:04:29,207][57339] Updated weights for policy 0, policy_version 689528 (0.0029) [2024-04-28 19:04:31,882][57339] Updated weights for policy 0, policy_version 689538 (0.0024) [2024-04-28 19:04:32,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 11297390592. Throughput: 0: 55515.8. Samples: 1787795380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:32,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 19:04:35,121][57339] Updated weights for policy 0, policy_version 689548 (0.0025) [2024-04-28 19:04:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11297669120. Throughput: 0: 55811.7. Samples: 1787972160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:37,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 19:04:37,711][57339] Updated weights for policy 0, policy_version 689558 (0.0023) [2024-04-28 19:04:41,219][57339] Updated weights for policy 0, policy_version 689568 (0.0029) [2024-04-28 19:04:42,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11297931264. Throughput: 0: 55665.4. Samples: 1788301620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:42,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 19:04:43,601][57339] Updated weights for policy 0, policy_version 689578 (0.0027) [2024-04-28 19:04:46,964][57339] Updated weights for policy 0, policy_version 689588 (0.0031) [2024-04-28 19:04:47,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11298209792. Throughput: 0: 55665.8. Samples: 1788634060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-04-28 19:04:47,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 19:04:49,456][57339] Updated weights for policy 0, policy_version 689598 (0.0023) [2024-04-28 19:04:52,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55483.4). Total num frames: 11298488320. Throughput: 0: 55388.4. Samples: 1788790200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:04:52,169][57108] Avg episode reward: [(0, '0.516')] [2024-04-28 19:04:52,817][57339] Updated weights for policy 0, policy_version 689608 (0.0031) [2024-04-28 19:04:55,276][57339] Updated weights for policy 0, policy_version 689618 (0.0031) [2024-04-28 19:04:57,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55432.4, 300 sec: 55538.9). Total num frames: 11298766848. Throughput: 0: 55488.3. Samples: 1789130360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:04:57,170][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 19:04:58,567][57339] Updated weights for policy 0, policy_version 689628 (0.0027) [2024-04-28 19:05:00,572][57319] Signal inference workers to stop experience collection... (26700 times) [2024-04-28 19:05:00,572][57319] Signal inference workers to resume experience collection... (26700 times) [2024-04-28 19:05:00,589][57339] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-04-28 19:05:00,589][57339] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-04-28 19:05:01,373][57339] Updated weights for policy 0, policy_version 689638 (0.0026) [2024-04-28 19:05:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55594.6). Total num frames: 11299061760. Throughput: 0: 55565.1. Samples: 1789468100. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:02,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:05:04,356][57339] Updated weights for policy 0, policy_version 689648 (0.0026) [2024-04-28 19:05:07,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11299340288. Throughput: 0: 55736.9. Samples: 1789637580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:07,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 19:05:07,356][57339] Updated weights for policy 0, policy_version 689658 (0.0036) [2024-04-28 19:05:10,408][57339] Updated weights for policy 0, policy_version 689668 (0.0033) [2024-04-28 19:05:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11299618816. Throughput: 0: 55795.2. Samples: 1789972600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:12,169][57108] Avg episode reward: [(0, '0.712')] [2024-04-28 19:05:13,270][57339] Updated weights for policy 0, policy_version 689678 (0.0029) [2024-04-28 19:05:16,206][57339] Updated weights for policy 0, policy_version 689688 (0.0025) [2024-04-28 19:05:17,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 11299897344. Throughput: 0: 55859.3. Samples: 1790309040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:17,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 19:05:19,008][57339] Updated weights for policy 0, policy_version 689698 (0.0028) [2024-04-28 19:05:22,040][57339] Updated weights for policy 0, policy_version 689708 (0.0030) [2024-04-28 19:05:22,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11300175872. Throughput: 0: 55564.3. Samples: 1790472560. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:22,169][57108] Avg episode reward: [(0, '0.695')] [2024-04-28 19:05:24,748][57339] Updated weights for policy 0, policy_version 689718 (0.0026) [2024-04-28 19:05:27,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 11300438016. Throughput: 0: 55589.3. Samples: 1790803140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:27,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 19:05:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000689724_11300438016.pth... [2024-04-28 19:05:27,225][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000688910_11287101440.pth [2024-04-28 19:05:28,126][57339] Updated weights for policy 0, policy_version 689728 (0.0027) [2024-04-28 19:05:30,610][57339] Updated weights for policy 0, policy_version 689738 (0.0027) [2024-04-28 19:05:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11300716544. Throughput: 0: 55570.5. Samples: 1791134740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:32,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 19:05:33,968][57339] Updated weights for policy 0, policy_version 689748 (0.0032) [2024-04-28 19:05:36,523][57339] Updated weights for policy 0, policy_version 689758 (0.0031) [2024-04-28 19:05:37,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11301027840. Throughput: 0: 56029.7. Samples: 1791311540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:37,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 19:05:39,756][57339] Updated weights for policy 0, policy_version 689768 (0.0031) [2024-04-28 19:05:42,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 11301306368. Throughput: 0: 55860.6. Samples: 1791644080. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:42,170][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 19:05:42,232][57339] Updated weights for policy 0, policy_version 689778 (0.0026) [2024-04-28 19:05:45,688][57339] Updated weights for policy 0, policy_version 689788 (0.0031) [2024-04-28 19:05:47,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 11301568512. Throughput: 0: 55728.7. Samples: 1791975900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:47,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 19:05:48,243][57339] Updated weights for policy 0, policy_version 689798 (0.0028) [2024-04-28 19:05:51,715][57339] Updated weights for policy 0, policy_version 689808 (0.0028) [2024-04-28 19:05:52,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55705.4, 300 sec: 55538.9). Total num frames: 11301830656. Throughput: 0: 55733.2. Samples: 1792145580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:52,170][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 19:05:54,044][57339] Updated weights for policy 0, policy_version 689818 (0.0027) [2024-04-28 19:05:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11302109184. Throughput: 0: 55707.5. Samples: 1792479440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:05:57,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 19:05:57,584][57339] Updated weights for policy 0, policy_version 689828 (0.0027) [2024-04-28 19:05:59,860][57339] Updated weights for policy 0, policy_version 689838 (0.0026) [2024-04-28 19:06:02,169][57108] Fps is (10 sec: 57345.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11302404096. Throughput: 0: 55609.7. Samples: 1792811480. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:06:02,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 19:06:03,436][57339] Updated weights for policy 0, policy_version 689848 (0.0029) [2024-04-28 19:06:05,587][57339] Updated weights for policy 0, policy_version 689858 (0.0033) [2024-04-28 19:06:07,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11302682624. Throughput: 0: 55852.5. Samples: 1792985920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:06:07,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:06:08,539][57319] Signal inference workers to stop experience collection... (26750 times) [2024-04-28 19:06:08,539][57319] Signal inference workers to resume experience collection... (26750 times) [2024-04-28 19:06:08,555][57339] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-04-28 19:06:08,555][57339] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-04-28 19:06:09,284][57339] Updated weights for policy 0, policy_version 689868 (0.0029) [2024-04-28 19:06:11,598][57339] Updated weights for policy 0, policy_version 689878 (0.0029) [2024-04-28 19:06:12,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 11302993920. Throughput: 0: 55961.7. Samples: 1793321420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:06:12,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 19:06:14,939][57339] Updated weights for policy 0, policy_version 689888 (0.0025) [2024-04-28 19:06:17,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 11303256064. Throughput: 0: 56021.3. Samples: 1793655700. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:06:17,170][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 19:06:17,344][57339] Updated weights for policy 0, policy_version 689898 (0.0032) [2024-04-28 19:06:21,057][57339] Updated weights for policy 0, policy_version 689908 (0.0024) [2024-04-28 19:06:22,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11303518208. Throughput: 0: 55896.1. Samples: 1793826860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:06:22,169][57108] Avg episode reward: [(0, '0.541')] [2024-04-28 19:06:23,133][57339] Updated weights for policy 0, policy_version 689918 (0.0027) [2024-04-28 19:06:26,905][57339] Updated weights for policy 0, policy_version 689928 (0.0031) [2024-04-28 19:06:27,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.5, 300 sec: 55538.9). Total num frames: 11303780352. Throughput: 0: 56025.7. Samples: 1794165240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-04-28 19:06:27,169][57108] Avg episode reward: [(0, '0.703')] [2024-04-28 19:06:29,009][57339] Updated weights for policy 0, policy_version 689938 (0.0023) [2024-04-28 19:06:32,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 11304058880. Throughput: 0: 56022.4. Samples: 1794496900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:06:32,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 19:06:32,975][57339] Updated weights for policy 0, policy_version 689948 (0.0034) [2024-04-28 19:06:34,934][57339] Updated weights for policy 0, policy_version 689958 (0.0031) [2024-04-28 19:06:37,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11304353792. Throughput: 0: 55774.7. Samples: 1794655440. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:06:37,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 19:06:38,804][57339] Updated weights for policy 0, policy_version 689968 (0.0026) [2024-04-28 19:06:40,917][57339] Updated weights for policy 0, policy_version 689978 (0.0028) [2024-04-28 19:06:42,169][57108] Fps is (10 sec: 58981.4, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11304648704. Throughput: 0: 55757.7. Samples: 1794988540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:06:42,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 19:06:44,786][57339] Updated weights for policy 0, policy_version 689988 (0.0026) [2024-04-28 19:06:46,723][57339] Updated weights for policy 0, policy_version 689998 (0.0024) [2024-04-28 19:06:47,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11304943616. Throughput: 0: 55643.4. Samples: 1795315440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:06:47,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 19:06:50,599][57339] Updated weights for policy 0, policy_version 690008 (0.0033) [2024-04-28 19:06:52,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11305205760. Throughput: 0: 55757.2. Samples: 1795495000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:06:52,170][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 19:06:52,708][57339] Updated weights for policy 0, policy_version 690018 (0.0037) [2024-04-28 19:06:56,440][57339] Updated weights for policy 0, policy_version 690028 (0.0028) [2024-04-28 19:06:57,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11305451520. Throughput: 0: 55791.1. Samples: 1795832020. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:06:57,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 19:06:58,423][57319] Signal inference workers to stop experience collection... (26800 times) [2024-04-28 19:06:58,423][57319] Signal inference workers to resume experience collection... (26800 times) [2024-04-28 19:06:58,434][57339] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-04-28 19:06:58,434][57339] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-04-28 19:06:58,541][57339] Updated weights for policy 0, policy_version 690038 (0.0032) [2024-04-28 19:07:02,169][57108] Fps is (10 sec: 50791.3, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 11305713664. Throughput: 0: 55822.8. Samples: 1796167720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 19:07:02,358][57339] Updated weights for policy 0, policy_version 690048 (0.0030) [2024-04-28 19:07:04,319][57339] Updated weights for policy 0, policy_version 690058 (0.0027) [2024-04-28 19:07:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11306008576. Throughput: 0: 55307.4. Samples: 1796315700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:07,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 19:07:08,234][57339] Updated weights for policy 0, policy_version 690068 (0.0026) [2024-04-28 19:07:10,095][57339] Updated weights for policy 0, policy_version 690078 (0.0027) [2024-04-28 19:07:12,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11306303488. Throughput: 0: 55180.5. Samples: 1796648360. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:12,169][57108] Avg episode reward: [(0, '0.697')] [2024-04-28 19:07:14,087][57339] Updated weights for policy 0, policy_version 690088 (0.0034) [2024-04-28 19:07:16,105][57339] Updated weights for policy 0, policy_version 690098 (0.0024) [2024-04-28 19:07:17,169][57108] Fps is (10 sec: 58982.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11306598400. Throughput: 0: 55246.5. Samples: 1796983000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:17,170][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 19:07:19,901][57339] Updated weights for policy 0, policy_version 690108 (0.0025) [2024-04-28 19:07:22,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11306876928. Throughput: 0: 55666.8. Samples: 1797160440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:22,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 19:07:22,210][57339] Updated weights for policy 0, policy_version 690118 (0.0031) [2024-04-28 19:07:25,810][57339] Updated weights for policy 0, policy_version 690128 (0.0025) [2024-04-28 19:07:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11307139072. Throughput: 0: 55686.8. Samples: 1797494440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:27,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 19:07:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000690133_11307139072.pth... [2024-04-28 19:07:27,232][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000689318_11293786112.pth [2024-04-28 19:07:28,186][57339] Updated weights for policy 0, policy_version 690138 (0.0026) [2024-04-28 19:07:31,621][57339] Updated weights for policy 0, policy_version 690148 (0.0026) [2024-04-28 19:07:32,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55594.5). 
Total num frames: 11307401216. Throughput: 0: 55814.8. Samples: 1797827100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:32,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 19:07:33,968][57339] Updated weights for policy 0, policy_version 690158 (0.0028) [2024-04-28 19:07:37,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11307679744. Throughput: 0: 55387.2. Samples: 1797987420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:37,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 19:07:37,446][57339] Updated weights for policy 0, policy_version 690168 (0.0030) [2024-04-28 19:07:39,904][57339] Updated weights for policy 0, policy_version 690178 (0.0026) [2024-04-28 19:07:42,169][57108] Fps is (10 sec: 55704.3, 60 sec: 55159.4, 300 sec: 55650.0). Total num frames: 11307958272. Throughput: 0: 55383.3. Samples: 1798324280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:42,170][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 19:07:43,233][57339] Updated weights for policy 0, policy_version 690188 (0.0030) [2024-04-28 19:07:45,661][57339] Updated weights for policy 0, policy_version 690198 (0.0034) [2024-04-28 19:07:47,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11308253184. Throughput: 0: 55435.2. Samples: 1798662300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:47,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:07:48,901][57319] Signal inference workers to stop experience collection... (26850 times) [2024-04-28 19:07:48,945][57339] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-04-28 19:07:48,962][57319] Signal inference workers to resume experience collection... 
(26850 times) [2024-04-28 19:07:48,963][57339] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-04-28 19:07:49,074][57339] Updated weights for policy 0, policy_version 690208 (0.0033) [2024-04-28 19:07:51,437][57339] Updated weights for policy 0, policy_version 690218 (0.0038) [2024-04-28 19:07:52,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55650.0). Total num frames: 11308531712. Throughput: 0: 55877.3. Samples: 1798830180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:52,170][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 19:07:54,970][57339] Updated weights for policy 0, policy_version 690228 (0.0029) [2024-04-28 19:07:57,169][57108] Fps is (10 sec: 58982.1, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 11308843008. Throughput: 0: 55899.6. Samples: 1799163840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:07:57,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 19:07:57,251][57339] Updated weights for policy 0, policy_version 690238 (0.0025) [2024-04-28 19:08:00,838][57339] Updated weights for policy 0, policy_version 690248 (0.0025) [2024-04-28 19:08:02,169][57108] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11309088768. Throughput: 0: 55937.4. Samples: 1799500180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:08:02,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 19:08:03,208][57339] Updated weights for policy 0, policy_version 690258 (0.0029) [2024-04-28 19:08:06,750][57339] Updated weights for policy 0, policy_version 690268 (0.0027) [2024-04-28 19:08:07,169][57108] Fps is (10 sec: 52428.0, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11309367296. Throughput: 0: 55776.3. Samples: 1799670380. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-04-28 19:08:07,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 19:08:09,063][57339] Updated weights for policy 0, policy_version 690278 (0.0027) [2024-04-28 19:08:12,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11309629440. Throughput: 0: 55788.4. Samples: 1800004920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:12,169][57108] Avg episode reward: [(0, '0.561')] [2024-04-28 19:08:12,571][57339] Updated weights for policy 0, policy_version 690288 (0.0027) [2024-04-28 19:08:14,831][57339] Updated weights for policy 0, policy_version 690298 (0.0030) [2024-04-28 19:08:17,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 11309907968. Throughput: 0: 55828.8. Samples: 1800339400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:17,169][57108] Avg episode reward: [(0, '0.665')] [2024-04-28 19:08:18,454][57339] Updated weights for policy 0, policy_version 690308 (0.0028) [2024-04-28 19:08:20,836][57339] Updated weights for policy 0, policy_version 690318 (0.0031) [2024-04-28 19:08:22,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11310186496. Throughput: 0: 55944.6. Samples: 1800504920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:22,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 19:08:24,196][57339] Updated weights for policy 0, policy_version 690328 (0.0029) [2024-04-28 19:08:27,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11310481408. Throughput: 0: 55969.1. Samples: 1800842880. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:27,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 19:08:27,212][57339] Updated weights for policy 0, policy_version 690338 (0.0028) [2024-04-28 19:08:30,029][57339] Updated weights for policy 0, policy_version 690348 (0.0028) [2024-04-28 19:08:32,169][57108] Fps is (10 sec: 58981.6, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 11310776320. Throughput: 0: 55760.7. Samples: 1801171540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:32,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:08:33,100][57339] Updated weights for policy 0, policy_version 690358 (0.0031) [2024-04-28 19:08:35,879][57339] Updated weights for policy 0, policy_version 690368 (0.0026) [2024-04-28 19:08:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11311038464. Throughput: 0: 55899.2. Samples: 1801345640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:37,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 19:08:38,812][57339] Updated weights for policy 0, policy_version 690378 (0.0029) [2024-04-28 19:08:41,871][57339] Updated weights for policy 0, policy_version 690388 (0.0030) [2024-04-28 19:08:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 11311333376. Throughput: 0: 55925.8. Samples: 1801680500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:42,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 19:08:44,637][57339] Updated weights for policy 0, policy_version 690398 (0.0028) [2024-04-28 19:08:46,634][57319] Signal inference workers to stop experience collection... (26900 times) [2024-04-28 19:08:46,675][57339] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-04-28 19:08:46,686][57319] Signal inference workers to resume experience collection... 
(26900 times) [2024-04-28 19:08:46,693][57339] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-04-28 19:08:47,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.4, 300 sec: 55650.1). Total num frames: 11311579136. Throughput: 0: 55953.7. Samples: 1802018100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:47,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 19:08:47,701][57339] Updated weights for policy 0, policy_version 690408 (0.0026) [2024-04-28 19:08:50,538][57339] Updated weights for policy 0, policy_version 690418 (0.0025) [2024-04-28 19:08:52,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.7, 300 sec: 55650.1). Total num frames: 11311857664. Throughput: 0: 55540.7. Samples: 1802169700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:52,169][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 19:08:53,674][57339] Updated weights for policy 0, policy_version 690428 (0.0028) [2024-04-28 19:08:56,346][57339] Updated weights for policy 0, policy_version 690438 (0.0027) [2024-04-28 19:08:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 54886.4, 300 sec: 55650.0). Total num frames: 11312136192. Throughput: 0: 55523.1. Samples: 1802503460. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:08:57,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:08:59,599][57339] Updated weights for policy 0, policy_version 690448 (0.0033) [2024-04-28 19:09:02,169][57108] Fps is (10 sec: 60620.2, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11312463872. Throughput: 0: 55547.0. Samples: 1802839020. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:02,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 19:09:02,170][57339] Updated weights for policy 0, policy_version 690458 (0.0033) [2024-04-28 19:09:05,474][57339] Updated weights for policy 0, policy_version 690468 (0.0028) [2024-04-28 19:09:07,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 11312726016. Throughput: 0: 55776.8. Samples: 1803014880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:07,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 19:09:08,259][57339] Updated weights for policy 0, policy_version 690478 (0.0030) [2024-04-28 19:09:11,251][57339] Updated weights for policy 0, policy_version 690488 (0.0031) [2024-04-28 19:09:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11313004544. Throughput: 0: 55675.1. Samples: 1803348260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:12,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 19:09:14,677][57339] Updated weights for policy 0, policy_version 690498 (0.0027) [2024-04-28 19:09:17,075][57339] Updated weights for policy 0, policy_version 690508 (0.0033) [2024-04-28 19:09:17,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 11313283072. Throughput: 0: 55800.1. Samples: 1803682540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:17,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 19:09:20,638][57339] Updated weights for policy 0, policy_version 690518 (0.0032) [2024-04-28 19:09:22,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 11313545216. Throughput: 0: 55472.8. Samples: 1803841920. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:22,170][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 19:09:23,065][57339] Updated weights for policy 0, policy_version 690528 (0.0026) [2024-04-28 19:09:26,694][57339] Updated weights for policy 0, policy_version 690538 (0.0032) [2024-04-28 19:09:27,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55159.5, 300 sec: 55594.6). Total num frames: 11313790976. Throughput: 0: 55528.9. Samples: 1804179300. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:27,169][57108] Avg episode reward: [(0, '0.688')] [2024-04-28 19:09:27,287][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000690541_11313823744.pth... [2024-04-28 19:09:27,333][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000689724_11300438016.pth [2024-04-28 19:09:29,012][57339] Updated weights for policy 0, policy_version 690548 (0.0027) [2024-04-28 19:09:32,169][57108] Fps is (10 sec: 52429.6, 60 sec: 54886.5, 300 sec: 55594.5). Total num frames: 11314069504. Throughput: 0: 55507.3. Samples: 1804515920. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:32,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 19:09:32,573][57339] Updated weights for policy 0, policy_version 690558 (0.0029) [2024-04-28 19:09:34,718][57339] Updated weights for policy 0, policy_version 690568 (0.0029) [2024-04-28 19:09:37,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 11314380800. Throughput: 0: 55772.9. Samples: 1804679480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:37,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 19:09:38,504][57339] Updated weights for policy 0, policy_version 690578 (0.0028) [2024-04-28 19:09:40,649][57339] Updated weights for policy 0, policy_version 690588 (0.0022) [2024-04-28 19:09:42,169][57108] Fps is (10 sec: 62258.6, 60 sec: 55978.6, 300 sec: 55872.2). 
Total num frames: 11314692096. Throughput: 0: 55766.2. Samples: 1805012940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:42,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 19:09:44,240][57339] Updated weights for policy 0, policy_version 690598 (0.0030) [2024-04-28 19:09:45,730][57319] Signal inference workers to stop experience collection... (26950 times) [2024-04-28 19:09:45,731][57319] Signal inference workers to resume experience collection... (26950 times) [2024-04-28 19:09:45,744][57339] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-04-28 19:09:45,745][57339] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-04-28 19:09:46,493][57339] Updated weights for policy 0, policy_version 690608 (0.0026) [2024-04-28 19:09:47,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11314954240. Throughput: 0: 55717.7. Samples: 1805346320. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-04-28 19:09:47,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 19:09:49,986][57339] Updated weights for policy 0, policy_version 690618 (0.0031) [2024-04-28 19:09:52,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 11315216384. Throughput: 0: 55788.4. Samples: 1805525360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:09:52,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:09:52,357][57339] Updated weights for policy 0, policy_version 690628 (0.0035) [2024-04-28 19:09:55,802][57339] Updated weights for policy 0, policy_version 690638 (0.0029) [2024-04-28 19:09:57,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11315478528. Throughput: 0: 55808.0. Samples: 1805859620. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:09:57,169][57108] Avg episode reward: [(0, '0.540')] [2024-04-28 19:09:58,156][57339] Updated weights for policy 0, policy_version 690648 (0.0028) [2024-04-28 19:10:01,623][57339] Updated weights for policy 0, policy_version 690658 (0.0032) [2024-04-28 19:10:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 11315757056. Throughput: 0: 55665.7. Samples: 1806187500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:02,170][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 19:10:03,997][57339] Updated weights for policy 0, policy_version 690668 (0.0031) [2024-04-28 19:10:07,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11316035584. Throughput: 0: 55590.8. Samples: 1806343500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:07,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 19:10:07,796][57339] Updated weights for policy 0, policy_version 690678 (0.0028) [2024-04-28 19:10:10,012][57339] Updated weights for policy 0, policy_version 690688 (0.0029) [2024-04-28 19:10:12,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11316330496. Throughput: 0: 55496.3. Samples: 1806676640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:12,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 19:10:13,760][57339] Updated weights for policy 0, policy_version 690698 (0.0031) [2024-04-28 19:10:15,850][57339] Updated weights for policy 0, policy_version 690708 (0.0027) [2024-04-28 19:10:17,169][57108] Fps is (10 sec: 60620.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11316641792. Throughput: 0: 55403.9. Samples: 1807009100. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:17,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 19:10:19,613][57339] Updated weights for policy 0, policy_version 690718 (0.0031) [2024-04-28 19:10:21,655][57339] Updated weights for policy 0, policy_version 690728 (0.0028) [2024-04-28 19:10:22,169][57108] Fps is (10 sec: 58983.1, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 11316920320. Throughput: 0: 55949.8. Samples: 1807197220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:22,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 19:10:25,447][57339] Updated weights for policy 0, policy_version 690738 (0.0033) [2024-04-28 19:10:27,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 11317149696. Throughput: 0: 55891.9. Samples: 1807528080. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:27,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 19:10:27,619][57339] Updated weights for policy 0, policy_version 690748 (0.0023) [2024-04-28 19:10:31,217][57339] Updated weights for policy 0, policy_version 690758 (0.0028) [2024-04-28 19:10:32,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11317428224. Throughput: 0: 55908.1. Samples: 1807862180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:32,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 19:10:33,462][57339] Updated weights for policy 0, policy_version 690768 (0.0023) [2024-04-28 19:10:37,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11317690368. Throughput: 0: 55311.7. Samples: 1808014380. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:37,169][57108] Avg episode reward: [(0, '0.555')] [2024-04-28 19:10:37,236][57339] Updated weights for policy 0, policy_version 690778 (0.0024) [2024-04-28 19:10:38,789][57319] Signal inference workers to stop experience collection... (27000 times) [2024-04-28 19:10:38,789][57319] Signal inference workers to resume experience collection... (27000 times) [2024-04-28 19:10:38,800][57339] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-04-28 19:10:38,800][57339] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-04-28 19:10:39,368][57339] Updated weights for policy 0, policy_version 690788 (0.0029) [2024-04-28 19:10:42,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55159.4, 300 sec: 55705.6). Total num frames: 11318001664. Throughput: 0: 55383.7. Samples: 1808351900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:42,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 19:10:43,001][57339] Updated weights for policy 0, policy_version 690798 (0.0029) [2024-04-28 19:10:45,083][57339] Updated weights for policy 0, policy_version 690808 (0.0031) [2024-04-28 19:10:47,169][57108] Fps is (10 sec: 58981.1, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11318280192. Throughput: 0: 55587.4. Samples: 1808688940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:47,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:10:48,799][57339] Updated weights for policy 0, policy_version 690818 (0.0033) [2024-04-28 19:10:51,037][57339] Updated weights for policy 0, policy_version 690828 (0.0029) [2024-04-28 19:10:52,169][57108] Fps is (10 sec: 58983.3, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11318591488. Throughput: 0: 55930.6. Samples: 1808860380. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:52,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:10:54,748][57339] Updated weights for policy 0, policy_version 690838 (0.0034) [2024-04-28 19:10:56,874][57339] Updated weights for policy 0, policy_version 690848 (0.0030) [2024-04-28 19:10:57,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 11318870016. Throughput: 0: 55930.7. Samples: 1809193520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:10:57,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 19:11:00,642][57339] Updated weights for policy 0, policy_version 690858 (0.0028) [2024-04-28 19:11:02,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11319115776. Throughput: 0: 56013.9. Samples: 1809529720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:11:02,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 19:11:02,727][57339] Updated weights for policy 0, policy_version 690868 (0.0027) [2024-04-28 19:11:06,317][57339] Updated weights for policy 0, policy_version 690878 (0.0029) [2024-04-28 19:11:07,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11319394304. Throughput: 0: 55531.4. Samples: 1809696140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:11:07,169][57108] Avg episode reward: [(0, '0.520')] [2024-04-28 19:11:08,460][57339] Updated weights for policy 0, policy_version 690888 (0.0028) [2024-04-28 19:11:12,060][57339] Updated weights for policy 0, policy_version 690898 (0.0031) [2024-04-28 19:11:12,169][57108] Fps is (10 sec: 55704.3, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11319672832. Throughput: 0: 55672.3. Samples: 1810033340. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:11:12,178][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 19:11:14,452][57339] Updated weights for policy 0, policy_version 690908 (0.0030) [2024-04-28 19:11:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11319951360. Throughput: 0: 55812.4. Samples: 1810373740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:11:17,169][57108] Avg episode reward: [(0, '0.686')] [2024-04-28 19:11:17,917][57339] Updated weights for policy 0, policy_version 690918 (0.0028) [2024-04-28 19:11:20,237][57339] Updated weights for policy 0, policy_version 690928 (0.0029) [2024-04-28 19:11:22,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 11320246272. Throughput: 0: 56222.0. Samples: 1810544380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:11:22,170][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 19:11:23,682][57339] Updated weights for policy 0, policy_version 690938 (0.0024) [2024-04-28 19:11:25,939][57339] Updated weights for policy 0, policy_version 690948 (0.0028) [2024-04-28 19:11:27,169][57108] Fps is (10 sec: 58981.9, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 11320541184. Throughput: 0: 56133.4. Samples: 1810877900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 20.0) [2024-04-28 19:11:27,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 19:11:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000690951_11320541184.pth... [2024-04-28 19:11:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000690133_11307139072.pth [2024-04-28 19:11:29,403][57339] Updated weights for policy 0, policy_version 690958 (0.0029) [2024-04-28 19:11:31,756][57339] Updated weights for policy 0, policy_version 690968 (0.0024) [2024-04-28 19:11:32,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56524.7, 300 sec: 55816.7). 
Total num frames: 11320819712. Throughput: 0: 56012.9. Samples: 1811209520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:11:32,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 19:11:35,254][57339] Updated weights for policy 0, policy_version 690978 (0.0028) [2024-04-28 19:11:36,885][57319] Signal inference workers to stop experience collection... (27050 times) [2024-04-28 19:11:36,921][57339] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-04-28 19:11:36,946][57319] Signal inference workers to resume experience collection... (27050 times) [2024-04-28 19:11:36,950][57339] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-04-28 19:11:37,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56797.7, 300 sec: 55761.1). Total num frames: 11321098240. Throughput: 0: 56180.4. Samples: 1811388500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:11:37,169][57108] Avg episode reward: [(0, '0.701')] [2024-04-28 19:11:37,551][57339] Updated weights for policy 0, policy_version 690988 (0.0024) [2024-04-28 19:11:41,166][57339] Updated weights for policy 0, policy_version 690998 (0.0026) [2024-04-28 19:11:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11321376768. Throughput: 0: 56363.5. Samples: 1811729880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:11:42,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 19:11:43,319][57339] Updated weights for policy 0, policy_version 691008 (0.0031) [2024-04-28 19:11:47,150][57339] Updated weights for policy 0, policy_version 691018 (0.0033) [2024-04-28 19:11:47,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11321638912. Throughput: 0: 56243.4. Samples: 1812060680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:11:47,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 19:11:49,143][57339] Updated weights for policy 0, policy_version 691028 (0.0026) [2024-04-28 19:11:52,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55159.5, 300 sec: 55761.2). Total num frames: 11321901056. Throughput: 0: 56058.8. Samples: 1812218780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:11:52,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 19:11:52,946][57339] Updated weights for policy 0, policy_version 691038 (0.0035) [2024-04-28 19:11:55,072][57339] Updated weights for policy 0, policy_version 691048 (0.0031) [2024-04-28 19:11:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 11322212352. Throughput: 0: 56026.0. Samples: 1812554500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:11:57,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 19:11:58,634][57339] Updated weights for policy 0, policy_version 691058 (0.0034) [2024-04-28 19:12:01,287][57339] Updated weights for policy 0, policy_version 691068 (0.0029) [2024-04-28 19:12:02,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11322490880. Throughput: 0: 55849.8. Samples: 1812886980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:02,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 19:12:04,481][57339] Updated weights for policy 0, policy_version 691078 (0.0027) [2024-04-28 19:12:07,169][57108] Fps is (10 sec: 55704.9, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11322769408. Throughput: 0: 55947.9. Samples: 1813062040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:07,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 19:12:07,358][57339] Updated weights for policy 0, policy_version 691088 (0.0027) [2024-04-28 19:12:10,324][57339] Updated weights for policy 0, policy_version 691098 (0.0026) [2024-04-28 19:12:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56525.0, 300 sec: 55816.7). Total num frames: 11323064320. Throughput: 0: 56021.0. Samples: 1813398840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:12,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 19:12:13,306][57339] Updated weights for policy 0, policy_version 691108 (0.0030) [2024-04-28 19:12:16,236][57339] Updated weights for policy 0, policy_version 691118 (0.0035) [2024-04-28 19:12:17,169][57108] Fps is (10 sec: 55706.2, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 11323326464. Throughput: 0: 56073.9. Samples: 1813732840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:17,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 19:12:18,984][57339] Updated weights for policy 0, policy_version 691128 (0.0026) [2024-04-28 19:12:21,921][57339] Updated weights for policy 0, policy_version 691138 (0.0030) [2024-04-28 19:12:22,169][57108] Fps is (10 sec: 54066.2, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 11323604992. Throughput: 0: 55938.5. Samples: 1813905740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:22,170][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 19:12:24,721][57339] Updated weights for policy 0, policy_version 691148 (0.0027) [2024-04-28 19:12:27,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11323867136. Throughput: 0: 55804.5. Samples: 1814241080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:27,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 19:12:27,958][57339] Updated weights for policy 0, policy_version 691158 (0.0028) [2024-04-28 19:12:30,546][57339] Updated weights for policy 0, policy_version 691168 (0.0025) [2024-04-28 19:12:31,365][57319] Signal inference workers to stop experience collection... (27100 times) [2024-04-28 19:12:31,408][57339] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-04-28 19:12:31,422][57319] Signal inference workers to resume experience collection... (27100 times) [2024-04-28 19:12:31,431][57339] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-04-28 19:12:32,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11324145664. Throughput: 0: 55725.9. Samples: 1814568340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:32,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 19:12:33,840][57339] Updated weights for policy 0, policy_version 691178 (0.0035) [2024-04-28 19:12:36,459][57339] Updated weights for policy 0, policy_version 691188 (0.0030) [2024-04-28 19:12:37,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11324440576. Throughput: 0: 56084.3. Samples: 1814742580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:37,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 19:12:39,603][57339] Updated weights for policy 0, policy_version 691198 (0.0029) [2024-04-28 19:12:42,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 11324735488. Throughput: 0: 56137.9. Samples: 1815080700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 19:12:42,174][57339] Updated weights for policy 0, policy_version 691208 (0.0030) [2024-04-28 19:12:45,584][57339] Updated weights for policy 0, policy_version 691218 (0.0029) [2024-04-28 19:12:47,169][57108] Fps is (10 sec: 57344.6, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11325014016. Throughput: 0: 56122.3. Samples: 1815412480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:47,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 19:12:48,279][57339] Updated weights for policy 0, policy_version 691228 (0.0029) [2024-04-28 19:12:51,501][57339] Updated weights for policy 0, policy_version 691238 (0.0025) [2024-04-28 19:12:52,169][57108] Fps is (10 sec: 54066.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 11325276160. Throughput: 0: 55988.9. Samples: 1815581540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:52,170][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 19:12:54,265][57339] Updated weights for policy 0, policy_version 691248 (0.0028) [2024-04-28 19:12:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11325554688. Throughput: 0: 56032.9. Samples: 1815920320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:12:57,169][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 19:12:57,196][57339] Updated weights for policy 0, policy_version 691258 (0.0032) [2024-04-28 19:13:00,240][57339] Updated weights for policy 0, policy_version 691268 (0.0030) [2024-04-28 19:13:02,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11325833216. Throughput: 0: 56042.2. Samples: 1816254740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:13:02,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:13:02,995][57339] Updated weights for policy 0, policy_version 691278 (0.0035) [2024-04-28 19:13:05,947][57339] Updated weights for policy 0, policy_version 691288 (0.0028) [2024-04-28 19:13:07,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11326095360. Throughput: 0: 55794.9. Samples: 1816416500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-04-28 19:13:07,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:13:08,866][57339] Updated weights for policy 0, policy_version 691298 (0.0032) [2024-04-28 19:13:11,787][57339] Updated weights for policy 0, policy_version 691308 (0.0025) [2024-04-28 19:13:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 11326390272. Throughput: 0: 55720.1. Samples: 1816748480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:12,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:13:14,877][57339] Updated weights for policy 0, policy_version 691318 (0.0031) [2024-04-28 19:13:17,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 11326685184. Throughput: 0: 55879.9. Samples: 1817082940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:17,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 19:13:17,707][57339] Updated weights for policy 0, policy_version 691328 (0.0031) [2024-04-28 19:13:20,845][57339] Updated weights for policy 0, policy_version 691338 (0.0031) [2024-04-28 19:13:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 11326963712. Throughput: 0: 55881.0. Samples: 1817257220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:22,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 19:13:23,683][57339] Updated weights for policy 0, policy_version 691348 (0.0034) [2024-04-28 19:13:26,595][57339] Updated weights for policy 0, policy_version 691358 (0.0027) [2024-04-28 19:13:27,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11327209472. Throughput: 0: 55790.1. Samples: 1817591260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:27,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 19:13:27,285][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000691359_11327225856.pth... [2024-04-28 19:13:27,332][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000690541_11313823744.pth [2024-04-28 19:13:29,420][57339] Updated weights for policy 0, policy_version 691368 (0.0026) [2024-04-28 19:13:32,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11327504384. Throughput: 0: 55845.9. Samples: 1817925540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:32,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 19:13:32,564][57339] Updated weights for policy 0, policy_version 691378 (0.0026) [2024-04-28 19:13:35,260][57339] Updated weights for policy 0, policy_version 691388 (0.0028) [2024-04-28 19:13:37,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11327782912. Throughput: 0: 55744.9. Samples: 1818090060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:37,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 19:13:38,468][57339] Updated weights for policy 0, policy_version 691398 (0.0028) [2024-04-28 19:13:41,450][57339] Updated weights for policy 0, policy_version 691408 (0.0030) [2024-04-28 19:13:42,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.6, 300 sec: 55927.8). 
Total num frames: 11328077824. Throughput: 0: 55702.2. Samples: 1818426920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:42,169][57108] Avg episode reward: [(0, '0.720')] [2024-04-28 19:13:44,210][57339] Updated weights for policy 0, policy_version 691418 (0.0032) [2024-04-28 19:13:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 11328339968. Throughput: 0: 55720.0. Samples: 1818762140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:47,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 19:13:47,412][57339] Updated weights for policy 0, policy_version 691428 (0.0027) [2024-04-28 19:13:49,130][57319] Signal inference workers to stop experience collection... (27150 times) [2024-04-28 19:13:49,130][57319] Signal inference workers to resume experience collection... (27150 times) [2024-04-28 19:13:49,159][57339] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-04-28 19:13:49,159][57339] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-04-28 19:13:49,944][57339] Updated weights for policy 0, policy_version 691438 (0.0032) [2024-04-28 19:13:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11328634880. Throughput: 0: 55794.6. Samples: 1818927260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:52,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 19:13:53,166][57339] Updated weights for policy 0, policy_version 691448 (0.0031) [2024-04-28 19:13:55,738][57339] Updated weights for policy 0, policy_version 691458 (0.0028) [2024-04-28 19:13:57,169][57108] Fps is (10 sec: 58982.8, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11328929792. Throughput: 0: 55944.0. Samples: 1819265960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:13:57,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 19:13:58,980][57339] Updated weights for policy 0, policy_version 691468 (0.0030) [2024-04-28 19:14:01,846][57339] Updated weights for policy 0, policy_version 691478 (0.0030) [2024-04-28 19:14:02,170][57108] Fps is (10 sec: 55697.3, 60 sec: 55977.3, 300 sec: 55816.4). Total num frames: 11329191936. Throughput: 0: 56112.8. Samples: 1819608100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:02,171][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 19:14:04,750][57339] Updated weights for policy 0, policy_version 691488 (0.0036) [2024-04-28 19:14:07,169][57108] Fps is (10 sec: 54066.4, 60 sec: 56251.6, 300 sec: 55816.7). Total num frames: 11329470464. Throughput: 0: 55920.7. Samples: 1819773660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:07,170][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 19:14:07,988][57339] Updated weights for policy 0, policy_version 691498 (0.0030) [2024-04-28 19:14:10,523][57339] Updated weights for policy 0, policy_version 691508 (0.0028) [2024-04-28 19:14:12,169][57108] Fps is (10 sec: 55714.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11329748992. Throughput: 0: 55917.8. Samples: 1820107560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:12,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 19:14:13,718][57339] Updated weights for policy 0, policy_version 691518 (0.0024) [2024-04-28 19:14:16,321][57339] Updated weights for policy 0, policy_version 691528 (0.0024) [2024-04-28 19:14:17,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11330043904. Throughput: 0: 55996.4. Samples: 1820445380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:17,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 19:14:19,378][57339] Updated weights for policy 0, policy_version 691538 (0.0029) [2024-04-28 19:14:22,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 11330306048. Throughput: 0: 56146.8. Samples: 1820616660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:22,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 19:14:22,180][57339] Updated weights for policy 0, policy_version 691548 (0.0027) [2024-04-28 19:14:25,261][57339] Updated weights for policy 0, policy_version 691558 (0.0034) [2024-04-28 19:14:27,169][57108] Fps is (10 sec: 55705.0, 60 sec: 56524.8, 300 sec: 56038.8). Total num frames: 11330600960. Throughput: 0: 56094.5. Samples: 1820951180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:27,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 19:14:28,182][57339] Updated weights for policy 0, policy_version 691568 (0.0028) [2024-04-28 19:14:30,992][57339] Updated weights for policy 0, policy_version 691578 (0.0028) [2024-04-28 19:14:32,169][57108] Fps is (10 sec: 58982.1, 60 sec: 56524.7, 300 sec: 55983.3). Total num frames: 11330895872. Throughput: 0: 56021.0. Samples: 1821283080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:32,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 19:14:33,945][57339] Updated weights for policy 0, policy_version 691588 (0.0027) [2024-04-28 19:14:36,712][57339] Updated weights for policy 0, policy_version 691598 (0.0029) [2024-04-28 19:14:37,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 11331141632. Throughput: 0: 56245.9. Samples: 1821458320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:37,169][57108] Avg episode reward: [(0, '0.707')] [2024-04-28 19:14:39,753][57339] Updated weights for policy 0, policy_version 691608 (0.0036) [2024-04-28 19:14:42,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11331420160. Throughput: 0: 56095.6. Samples: 1821790260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:42,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 19:14:42,674][57339] Updated weights for policy 0, policy_version 691618 (0.0031) [2024-04-28 19:14:45,480][57339] Updated weights for policy 0, policy_version 691628 (0.0029) [2024-04-28 19:14:47,169][57108] Fps is (10 sec: 57342.8, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 11331715072. Throughput: 0: 55868.8. Samples: 1822122120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-04-28 19:14:47,170][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 19:14:48,589][57339] Updated weights for policy 0, policy_version 691638 (0.0027) [2024-04-28 19:14:51,214][57339] Updated weights for policy 0, policy_version 691648 (0.0033) [2024-04-28 19:14:52,169][57108] Fps is (10 sec: 57342.4, 60 sec: 55978.5, 300 sec: 55983.2). Total num frames: 11331993600. Throughput: 0: 55754.5. Samples: 1822282620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:14:52,169][57108] Avg episode reward: [(0, '0.672')] [2024-04-28 19:14:54,537][57339] Updated weights for policy 0, policy_version 691658 (0.0029) [2024-04-28 19:14:55,303][57319] Signal inference workers to stop experience collection... (27200 times) [2024-04-28 19:14:55,348][57339] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-04-28 19:14:55,358][57319] Signal inference workers to resume experience collection... 
(27200 times) [2024-04-28 19:14:55,364][57339] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-04-28 19:14:56,942][57339] Updated weights for policy 0, policy_version 691668 (0.0029) [2024-04-28 19:14:57,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 11332288512. Throughput: 0: 55885.7. Samples: 1822622420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:14:57,170][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 19:15:00,434][57339] Updated weights for policy 0, policy_version 691678 (0.0035) [2024-04-28 19:15:02,169][57108] Fps is (10 sec: 55707.0, 60 sec: 55980.1, 300 sec: 55983.3). Total num frames: 11332550656. Throughput: 0: 55863.9. Samples: 1822959260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:02,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 19:15:02,764][57339] Updated weights for policy 0, policy_version 691688 (0.0026) [2024-04-28 19:15:06,259][57339] Updated weights for policy 0, policy_version 691698 (0.0027) [2024-04-28 19:15:07,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.9, 300 sec: 55983.3). Total num frames: 11332845568. Throughput: 0: 55832.9. Samples: 1823129140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:07,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 19:15:08,700][57339] Updated weights for policy 0, policy_version 691708 (0.0028) [2024-04-28 19:15:12,001][57339] Updated weights for policy 0, policy_version 691718 (0.0029) [2024-04-28 19:15:12,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 11333107712. Throughput: 0: 55913.3. Samples: 1823467280. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:12,170][57108] Avg episode reward: [(0, '0.528')] [2024-04-28 19:15:14,804][57339] Updated weights for policy 0, policy_version 691728 (0.0028) [2024-04-28 19:15:17,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11333369856. Throughput: 0: 55918.7. Samples: 1823799420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:17,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 19:15:17,783][57339] Updated weights for policy 0, policy_version 691738 (0.0031) [2024-04-28 19:15:20,878][57339] Updated weights for policy 0, policy_version 691748 (0.0030) [2024-04-28 19:15:22,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56251.7, 300 sec: 56038.9). Total num frames: 11333681152. Throughput: 0: 55880.9. Samples: 1823972960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:22,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 19:15:23,695][57339] Updated weights for policy 0, policy_version 691758 (0.0030) [2024-04-28 19:15:26,543][57339] Updated weights for policy 0, policy_version 691768 (0.0023) [2024-04-28 19:15:27,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 11333943296. Throughput: 0: 55843.9. Samples: 1824303240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:27,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 19:15:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000691769_11333943296.pth... [2024-04-28 19:15:27,235][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000690951_11320541184.pth [2024-04-28 19:15:29,636][57339] Updated weights for policy 0, policy_version 691778 (0.0029) [2024-04-28 19:15:32,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 56094.3). Total num frames: 11334238208. Throughput: 0: 55818.4. Samples: 1824633940. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:32,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 19:15:32,521][57339] Updated weights for policy 0, policy_version 691788 (0.0031) [2024-04-28 19:15:35,569][57339] Updated weights for policy 0, policy_version 691798 (0.0030) [2024-04-28 19:15:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 11334516736. Throughput: 0: 56267.4. Samples: 1824814640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:37,169][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 19:15:38,393][57339] Updated weights for policy 0, policy_version 691808 (0.0033) [2024-04-28 19:15:41,476][57339] Updated weights for policy 0, policy_version 691818 (0.0026) [2024-04-28 19:15:42,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 11334795264. Throughput: 0: 56082.7. Samples: 1825146140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:42,169][57108] Avg episode reward: [(0, '0.682')] [2024-04-28 19:15:44,258][57339] Updated weights for policy 0, policy_version 691828 (0.0027) [2024-04-28 19:15:47,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 11335041024. Throughput: 0: 56023.5. Samples: 1825480320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:47,169][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 19:15:47,356][57339] Updated weights for policy 0, policy_version 691838 (0.0031) [2024-04-28 19:15:49,966][57339] Updated weights for policy 0, policy_version 691848 (0.0030) [2024-04-28 19:15:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55705.9, 300 sec: 55816.7). Total num frames: 11335335936. Throughput: 0: 55814.7. Samples: 1825640800. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:52,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 19:15:53,191][57339] Updated weights for policy 0, policy_version 691858 (0.0033) [2024-04-28 19:15:55,726][57319] Signal inference workers to stop experience collection... (27250 times) [2024-04-28 19:15:55,751][57339] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-04-28 19:15:55,786][57319] Signal inference workers to resume experience collection... (27250 times) [2024-04-28 19:15:55,787][57339] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-04-28 19:15:55,894][57339] Updated weights for policy 0, policy_version 691868 (0.0025) [2024-04-28 19:15:57,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 11335614464. Throughput: 0: 55804.7. Samples: 1825978480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:15:57,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 19:15:58,895][57339] Updated weights for policy 0, policy_version 691878 (0.0022) [2024-04-28 19:16:01,843][57339] Updated weights for policy 0, policy_version 691888 (0.0023) [2024-04-28 19:16:02,169][57108] Fps is (10 sec: 57342.5, 60 sec: 55978.4, 300 sec: 55983.3). Total num frames: 11335909376. Throughput: 0: 55940.1. Samples: 1826316740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:16:02,170][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 19:16:04,734][57339] Updated weights for policy 0, policy_version 691898 (0.0030) [2024-04-28 19:16:07,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55432.5, 300 sec: 55927.8). Total num frames: 11336171520. Throughput: 0: 55789.7. Samples: 1826483500. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:16:07,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:16:07,782][57339] Updated weights for policy 0, policy_version 691908 (0.0032) [2024-04-28 19:16:10,590][57339] Updated weights for policy 0, policy_version 691918 (0.0030) [2024-04-28 19:16:12,169][57108] Fps is (10 sec: 55707.2, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 11336466432. Throughput: 0: 55777.4. Samples: 1826813220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:16:12,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 19:16:13,538][57339] Updated weights for policy 0, policy_version 691928 (0.0024) [2024-04-28 19:16:16,382][57339] Updated weights for policy 0, policy_version 691938 (0.0025) [2024-04-28 19:16:17,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 11336744960. Throughput: 0: 55908.1. Samples: 1827149800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:16:17,169][57108] Avg episode reward: [(0, '0.612')] [2024-04-28 19:16:19,286][57339] Updated weights for policy 0, policy_version 691948 (0.0027) [2024-04-28 19:16:22,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55705.4, 300 sec: 55872.2). Total num frames: 11337023488. Throughput: 0: 55707.0. Samples: 1827321460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:16:22,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 19:16:22,298][57339] Updated weights for policy 0, policy_version 691958 (0.0028) [2024-04-28 19:16:25,136][57339] Updated weights for policy 0, policy_version 691968 (0.0030) [2024-04-28 19:16:27,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11337285632. Throughput: 0: 55667.9. Samples: 1827651200. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-04-28 19:16:27,169][57108] Avg episode reward: [(0, '0.511')] [2024-04-28 19:16:28,310][57339] Updated weights for policy 0, policy_version 691978 (0.0028) [2024-04-28 19:16:31,029][57339] Updated weights for policy 0, policy_version 691988 (0.0030) [2024-04-28 19:16:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11337580544. Throughput: 0: 55755.9. Samples: 1827989340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:16:32,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:16:34,249][57339] Updated weights for policy 0, policy_version 691998 (0.0030) [2024-04-28 19:16:37,025][57339] Updated weights for policy 0, policy_version 692008 (0.0030) [2024-04-28 19:16:37,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11337859072. Throughput: 0: 55915.1. Samples: 1828156980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:16:37,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 19:16:39,969][57339] Updated weights for policy 0, policy_version 692018 (0.0033) [2024-04-28 19:16:42,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 11338137600. Throughput: 0: 55790.5. Samples: 1828489060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:16:42,169][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 19:16:42,762][57339] Updated weights for policy 0, policy_version 692028 (0.0029) [2024-04-28 19:16:45,826][57339] Updated weights for policy 0, policy_version 692038 (0.0031) [2024-04-28 19:16:47,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 56038.8). Total num frames: 11338432512. Throughput: 0: 55716.8. Samples: 1828823980. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:16:47,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 19:16:48,796][57339] Updated weights for policy 0, policy_version 692048 (0.0023) [2024-04-28 19:16:51,722][57339] Updated weights for policy 0, policy_version 692058 (0.0030) [2024-04-28 19:16:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.6, 300 sec: 55927.7). Total num frames: 11338711040. Throughput: 0: 55746.1. Samples: 1828992080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:16:52,170][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 19:16:54,810][57339] Updated weights for policy 0, policy_version 692068 (0.0023) [2024-04-28 19:16:57,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11338956800. Throughput: 0: 55932.4. Samples: 1829330180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:16:57,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 19:16:57,581][57339] Updated weights for policy 0, policy_version 692078 (0.0028) [2024-04-28 19:17:00,606][57339] Updated weights for policy 0, policy_version 692088 (0.0030) [2024-04-28 19:17:02,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 11339235328. Throughput: 0: 55939.1. Samples: 1829667060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:02,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 19:17:03,330][57339] Updated weights for policy 0, policy_version 692098 (0.0031) [2024-04-28 19:17:06,367][57339] Updated weights for policy 0, policy_version 692108 (0.0031) [2024-04-28 19:17:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11339513856. Throughput: 0: 55700.3. Samples: 1829827960. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:07,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 19:17:09,245][57339] Updated weights for policy 0, policy_version 692118 (0.0027) [2024-04-28 19:17:12,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 11339808768. Throughput: 0: 55824.9. Samples: 1830163320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:12,170][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 19:17:12,231][57339] Updated weights for policy 0, policy_version 692128 (0.0028) [2024-04-28 19:17:14,435][57319] Signal inference workers to stop experience collection... (27300 times) [2024-04-28 19:17:14,436][57319] Signal inference workers to resume experience collection... (27300 times) [2024-04-28 19:17:14,459][57339] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-04-28 19:17:14,460][57339] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-04-28 19:17:15,133][57339] Updated weights for policy 0, policy_version 692138 (0.0033) [2024-04-28 19:17:17,169][57108] Fps is (10 sec: 57342.8, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 11340087296. Throughput: 0: 55583.5. Samples: 1830490600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:17,169][57108] Avg episode reward: [(0, '0.709')] [2024-04-28 19:17:18,130][57339] Updated weights for policy 0, policy_version 692148 (0.0029) [2024-04-28 19:17:20,944][57339] Updated weights for policy 0, policy_version 692158 (0.0032) [2024-04-28 19:17:22,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 11340382208. Throughput: 0: 55957.3. Samples: 1830675060. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:22,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 19:17:23,830][57339] Updated weights for policy 0, policy_version 692168 (0.0028) [2024-04-28 19:17:26,780][57339] Updated weights for policy 0, policy_version 692178 (0.0037) [2024-04-28 19:17:27,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 11340660736. Throughput: 0: 56067.1. Samples: 1831012080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:27,169][57108] Avg episode reward: [(0, '0.677')] [2024-04-28 19:17:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000692179_11340660736.pth... [2024-04-28 19:17:27,225][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000691359_11327225856.pth [2024-04-28 19:17:29,633][57339] Updated weights for policy 0, policy_version 692188 (0.0027) [2024-04-28 19:17:32,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11340939264. Throughput: 0: 55940.7. Samples: 1831341320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:32,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 19:17:32,729][57339] Updated weights for policy 0, policy_version 692198 (0.0031) [2024-04-28 19:17:35,760][57339] Updated weights for policy 0, policy_version 692208 (0.0030) [2024-04-28 19:17:37,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 11341201408. Throughput: 0: 56005.3. Samples: 1831512320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:37,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 19:17:38,491][57339] Updated weights for policy 0, policy_version 692218 (0.0027) [2024-04-28 19:17:41,762][57339] Updated weights for policy 0, policy_version 692228 (0.0025) [2024-04-28 19:17:42,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55705.7, 300 sec: 55816.7). 
Total num frames: 11341479936. Throughput: 0: 55954.2. Samples: 1831848120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:42,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 19:17:44,220][57339] Updated weights for policy 0, policy_version 692238 (0.0025) [2024-04-28 19:17:47,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 11341758464. Throughput: 0: 55991.6. Samples: 1832186680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:47,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 19:17:47,549][57339] Updated weights for policy 0, policy_version 692248 (0.0029) [2024-04-28 19:17:50,029][57339] Updated weights for policy 0, policy_version 692258 (0.0030) [2024-04-28 19:17:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55432.7, 300 sec: 55872.2). Total num frames: 11342036992. Throughput: 0: 55906.6. Samples: 1832343760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:52,169][57108] Avg episode reward: [(0, '0.511')] [2024-04-28 19:17:53,453][57339] Updated weights for policy 0, policy_version 692268 (0.0027) [2024-04-28 19:17:56,047][57339] Updated weights for policy 0, policy_version 692278 (0.0026) [2024-04-28 19:17:57,169][57108] Fps is (10 sec: 58981.6, 60 sec: 56524.6, 300 sec: 55983.3). Total num frames: 11342348288. Throughput: 0: 56060.3. Samples: 1832686040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:17:57,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 19:17:59,217][57339] Updated weights for policy 0, policy_version 692288 (0.0026) [2024-04-28 19:18:01,988][57339] Updated weights for policy 0, policy_version 692298 (0.0029) [2024-04-28 19:18:02,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 11342610432. Throughput: 0: 56235.2. Samples: 1833021180. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-04-28 19:18:02,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 19:18:05,043][57339] Updated weights for policy 0, policy_version 692308 (0.0029) [2024-04-28 19:18:07,169][57108] Fps is (10 sec: 54068.1, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 11342888960. Throughput: 0: 55903.7. Samples: 1833190720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:07,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 19:18:07,798][57339] Updated weights for policy 0, policy_version 692318 (0.0029) [2024-04-28 19:18:10,959][57339] Updated weights for policy 0, policy_version 692328 (0.0031) [2024-04-28 19:18:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11343167488. Throughput: 0: 55969.8. Samples: 1833530720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:12,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 19:18:13,680][57339] Updated weights for policy 0, policy_version 692338 (0.0038) [2024-04-28 19:18:16,771][57339] Updated weights for policy 0, policy_version 692348 (0.0028) [2024-04-28 19:18:17,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11343446016. Throughput: 0: 56097.8. Samples: 1833865720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:17,170][57108] Avg episode reward: [(0, '0.684')] [2024-04-28 19:18:19,400][57339] Updated weights for policy 0, policy_version 692358 (0.0030) [2024-04-28 19:18:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 11343724544. Throughput: 0: 55938.4. Samples: 1834029540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:22,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 19:18:22,622][57339] Updated weights for policy 0, policy_version 692368 (0.0031) [2024-04-28 19:18:25,173][57339] Updated weights for policy 0, policy_version 692378 (0.0031) [2024-04-28 19:18:27,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 11344003072. Throughput: 0: 55968.6. Samples: 1834366720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:27,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:18:28,380][57319] Signal inference workers to stop experience collection... (27350 times) [2024-04-28 19:18:28,421][57339] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-04-28 19:18:28,444][57319] Signal inference workers to resume experience collection... (27350 times) [2024-04-28 19:18:28,450][57339] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-04-28 19:18:28,562][57339] Updated weights for policy 0, policy_version 692388 (0.0029) [2024-04-28 19:18:30,971][57339] Updated weights for policy 0, policy_version 692398 (0.0026) [2024-04-28 19:18:32,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 11344297984. Throughput: 0: 55861.2. Samples: 1834700440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:32,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 19:18:34,343][57339] Updated weights for policy 0, policy_version 692408 (0.0026) [2024-04-28 19:18:36,885][57339] Updated weights for policy 0, policy_version 692418 (0.0025) [2024-04-28 19:18:37,169][57108] Fps is (10 sec: 57345.5, 60 sec: 56251.9, 300 sec: 55927.8). Total num frames: 11344576512. Throughput: 0: 56173.8. Samples: 1834871580. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:37,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 19:18:40,272][57339] Updated weights for policy 0, policy_version 692428 (0.0029) [2024-04-28 19:18:42,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 11344838656. Throughput: 0: 56077.9. Samples: 1835209540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:42,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 19:18:42,791][57339] Updated weights for policy 0, policy_version 692438 (0.0025) [2024-04-28 19:18:46,182][57339] Updated weights for policy 0, policy_version 692448 (0.0029) [2024-04-28 19:18:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 11345133568. Throughput: 0: 55997.4. Samples: 1835541060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:47,169][57108] Avg episode reward: [(0, '0.648')] [2024-04-28 19:18:48,784][57339] Updated weights for policy 0, policy_version 692458 (0.0034) [2024-04-28 19:18:51,851][57339] Updated weights for policy 0, policy_version 692468 (0.0033) [2024-04-28 19:18:52,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.5, 300 sec: 55816.7). Total num frames: 11345395712. Throughput: 0: 55919.0. Samples: 1835707080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:52,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 19:18:54,676][57339] Updated weights for policy 0, policy_version 692478 (0.0032) [2024-04-28 19:18:57,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55928.0). Total num frames: 11345690624. Throughput: 0: 55792.4. Samples: 1836041380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:18:57,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 19:18:57,699][57339] Updated weights for policy 0, policy_version 692488 (0.0033) [2024-04-28 19:19:00,440][57339] Updated weights for policy 0, policy_version 692498 (0.0026) [2024-04-28 19:19:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11345969152. Throughput: 0: 55884.1. Samples: 1836380500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:19:02,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 19:19:03,535][57339] Updated weights for policy 0, policy_version 692508 (0.0026) [2024-04-28 19:19:06,271][57339] Updated weights for policy 0, policy_version 692518 (0.0029) [2024-04-28 19:19:07,169][57108] Fps is (10 sec: 57344.2, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 11346264064. Throughput: 0: 55829.8. Samples: 1836541880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:19:07,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 19:19:09,430][57339] Updated weights for policy 0, policy_version 692528 (0.0037) [2024-04-28 19:19:12,012][57339] Updated weights for policy 0, policy_version 692538 (0.0035) [2024-04-28 19:19:12,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 11346542592. Throughput: 0: 55840.7. Samples: 1836879540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:19:12,169][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 19:19:15,209][57339] Updated weights for policy 0, policy_version 692548 (0.0029) [2024-04-28 19:19:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55983.3). Total num frames: 11346821120. Throughput: 0: 56002.8. Samples: 1837220560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:19:17,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 19:19:17,799][57339] Updated weights for policy 0, policy_version 692558 (0.0026) [2024-04-28 19:19:21,017][57339] Updated weights for policy 0, policy_version 692568 (0.0035) [2024-04-28 19:19:22,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 11347099648. Throughput: 0: 55925.2. Samples: 1837388220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:19:22,170][57108] Avg episode reward: [(0, '0.527')] [2024-04-28 19:19:23,552][57339] Updated weights for policy 0, policy_version 692578 (0.0026) [2024-04-28 19:19:26,764][57339] Updated weights for policy 0, policy_version 692588 (0.0026) [2024-04-28 19:19:27,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 11347361792. Throughput: 0: 55992.1. Samples: 1837729180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:19:27,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 19:19:27,298][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000692590_11347394560.pth... [2024-04-28 19:19:27,345][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000691769_11333943296.pth [2024-04-28 19:19:29,530][57339] Updated weights for policy 0, policy_version 692598 (0.0037) [2024-04-28 19:19:32,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.7, 300 sec: 55872.2). Total num frames: 11347623936. Throughput: 0: 56120.5. Samples: 1838066480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:19:32,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 19:19:32,556][57339] Updated weights for policy 0, policy_version 692608 (0.0027) [2024-04-28 19:19:35,575][57339] Updated weights for policy 0, policy_version 692618 (0.0027) [2024-04-28 19:19:36,634][57319] Signal inference workers to stop experience collection... 
(27400 times) [2024-04-28 19:19:36,635][57319] Signal inference workers to resume experience collection... (27400 times) [2024-04-28 19:19:36,658][57339] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-04-28 19:19:36,659][57339] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-04-28 19:19:37,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 11347935232. Throughput: 0: 56066.2. Samples: 1838230060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:19:37,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 19:19:38,384][57339] Updated weights for policy 0, policy_version 692628 (0.0030) [2024-04-28 19:19:41,315][57339] Updated weights for policy 0, policy_version 692638 (0.0029) [2024-04-28 19:19:42,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11348197376. Throughput: 0: 56077.8. Samples: 1838564880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-04-28 19:19:42,169][57108] Avg episode reward: [(0, '0.727')] [2024-04-28 19:19:44,298][57339] Updated weights for policy 0, policy_version 692648 (0.0029) [2024-04-28 19:19:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11348492288. Throughput: 0: 55884.0. Samples: 1838895280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:19:47,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 19:19:47,228][57339] Updated weights for policy 0, policy_version 692658 (0.0031) [2024-04-28 19:19:50,288][57339] Updated weights for policy 0, policy_version 692668 (0.0030) [2024-04-28 19:19:52,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11348770816. Throughput: 0: 56082.7. Samples: 1839065600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:19:52,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 19:19:53,280][57339] Updated weights for policy 0, policy_version 692678 (0.0035) [2024-04-28 19:19:56,222][57339] Updated weights for policy 0, policy_version 692688 (0.0031) [2024-04-28 19:19:57,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 11349032960. Throughput: 0: 55990.7. Samples: 1839399120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:19:57,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 19:19:59,129][57339] Updated weights for policy 0, policy_version 692698 (0.0030) [2024-04-28 19:20:02,075][57339] Updated weights for policy 0, policy_version 692708 (0.0028) [2024-04-28 19:20:02,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11349327872. Throughput: 0: 55682.2. Samples: 1839726260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:02,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 19:20:05,060][57339] Updated weights for policy 0, policy_version 692718 (0.0032) [2024-04-28 19:20:07,169][57108] Fps is (10 sec: 54065.5, 60 sec: 55159.2, 300 sec: 55816.6). Total num frames: 11349573632. Throughput: 0: 55503.3. Samples: 1839885880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:07,170][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 19:20:08,017][57339] Updated weights for policy 0, policy_version 692728 (0.0029) [2024-04-28 19:20:10,908][57339] Updated weights for policy 0, policy_version 692738 (0.0027) [2024-04-28 19:20:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55927.8). Total num frames: 11349868544. Throughput: 0: 55503.5. Samples: 1840226840. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:12,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 19:20:13,827][57339] Updated weights for policy 0, policy_version 692748 (0.0029) [2024-04-28 19:20:16,884][57339] Updated weights for policy 0, policy_version 692758 (0.0028) [2024-04-28 19:20:17,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 11350147072. Throughput: 0: 55413.6. Samples: 1840560100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:17,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 19:20:19,910][57339] Updated weights for policy 0, policy_version 692768 (0.0027) [2024-04-28 19:20:22,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11350409216. Throughput: 0: 55327.1. Samples: 1840719780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:22,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 19:20:22,796][57339] Updated weights for policy 0, policy_version 692778 (0.0030) [2024-04-28 19:20:25,711][57339] Updated weights for policy 0, policy_version 692788 (0.0026) [2024-04-28 19:20:27,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11350704128. Throughput: 0: 55249.4. Samples: 1841051100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:27,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:20:28,731][57339] Updated weights for policy 0, policy_version 692798 (0.0031) [2024-04-28 19:20:31,551][57339] Updated weights for policy 0, policy_version 692808 (0.0030) [2024-04-28 19:20:32,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 11350982656. Throughput: 0: 55221.1. Samples: 1841380240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:32,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:20:34,688][57339] Updated weights for policy 0, policy_version 692818 (0.0028) [2024-04-28 19:20:37,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11351261184. Throughput: 0: 55234.5. Samples: 1841551160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:37,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:20:37,667][57339] Updated weights for policy 0, policy_version 692828 (0.0029) [2024-04-28 19:20:40,592][57339] Updated weights for policy 0, policy_version 692838 (0.0026) [2024-04-28 19:20:42,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 11351523328. Throughput: 0: 55177.7. Samples: 1841882120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:42,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 19:20:43,611][57339] Updated weights for policy 0, policy_version 692848 (0.0032) [2024-04-28 19:20:46,311][57319] Signal inference workers to stop experience collection... (27450 times) [2024-04-28 19:20:46,359][57339] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-04-28 19:20:46,367][57319] Signal inference workers to resume experience collection... (27450 times) [2024-04-28 19:20:46,376][57339] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-04-28 19:20:46,480][57339] Updated weights for policy 0, policy_version 692858 (0.0034) [2024-04-28 19:20:47,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55159.4, 300 sec: 55816.7). Total num frames: 11351801856. Throughput: 0: 55217.8. Samples: 1842211060. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:47,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 19:20:49,564][57339] Updated weights for policy 0, policy_version 692868 (0.0029) [2024-04-28 19:20:52,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55432.5, 300 sec: 55872.2). Total num frames: 11352096768. Throughput: 0: 55418.5. Samples: 1842379700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:52,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 19:20:52,389][57339] Updated weights for policy 0, policy_version 692878 (0.0033) [2024-04-28 19:20:55,353][57339] Updated weights for policy 0, policy_version 692888 (0.0029) [2024-04-28 19:20:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 11352358912. Throughput: 0: 55290.7. Samples: 1842714920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:20:57,169][57108] Avg episode reward: [(0, '0.549')] [2024-04-28 19:20:58,399][57339] Updated weights for policy 0, policy_version 692898 (0.0029) [2024-04-28 19:21:01,312][57339] Updated weights for policy 0, policy_version 692908 (0.0027) [2024-04-28 19:21:02,169][57108] Fps is (10 sec: 52428.2, 60 sec: 54886.3, 300 sec: 55761.1). Total num frames: 11352621056. Throughput: 0: 55223.5. Samples: 1843045160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:21:02,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 19:21:04,357][57339] Updated weights for policy 0, policy_version 692918 (0.0026) [2024-04-28 19:21:07,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.9, 300 sec: 55761.1). Total num frames: 11352915968. Throughput: 0: 55405.0. Samples: 1843213000. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:21:07,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 19:21:07,243][57339] Updated weights for policy 0, policy_version 692928 (0.0026) [2024-04-28 19:21:10,372][57339] Updated weights for policy 0, policy_version 692938 (0.0027) [2024-04-28 19:21:12,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11353194496. Throughput: 0: 55475.4. Samples: 1843547500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:21:12,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 19:21:13,046][57339] Updated weights for policy 0, policy_version 692948 (0.0024) [2024-04-28 19:21:16,352][57339] Updated weights for policy 0, policy_version 692958 (0.0025) [2024-04-28 19:21:17,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11353456640. Throughput: 0: 55563.7. Samples: 1843880600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:21:17,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 19:21:18,856][57339] Updated weights for policy 0, policy_version 692968 (0.0032) [2024-04-28 19:21:22,169][57108] Fps is (10 sec: 54068.0, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 11353735168. Throughput: 0: 55498.4. Samples: 1844048580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:21:22,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 19:21:22,246][57339] Updated weights for policy 0, policy_version 692978 (0.0029) [2024-04-28 19:21:24,956][57339] Updated weights for policy 0, policy_version 692988 (0.0028) [2024-04-28 19:21:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 11354013696. Throughput: 0: 55431.0. Samples: 1844376520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:21:27,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 19:21:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000692994_11354013696.pth... [2024-04-28 19:21:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000692179_11340660736.pth [2024-04-28 19:21:28,073][57339] Updated weights for policy 0, policy_version 692998 (0.0027) [2024-04-28 19:21:31,063][57339] Updated weights for policy 0, policy_version 693008 (0.0032) [2024-04-28 19:21:32,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 11354308608. Throughput: 0: 55477.3. Samples: 1844707540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:21:32,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 19:21:33,884][57339] Updated weights for policy 0, policy_version 693018 (0.0027) [2024-04-28 19:21:36,915][57339] Updated weights for policy 0, policy_version 693028 (0.0031) [2024-04-28 19:21:37,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11354570752. Throughput: 0: 55430.7. Samples: 1844874080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:21:37,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 19:21:39,774][57339] Updated weights for policy 0, policy_version 693038 (0.0033) [2024-04-28 19:21:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11354865664. Throughput: 0: 55446.2. Samples: 1845210000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:21:42,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 19:21:43,022][57339] Updated weights for policy 0, policy_version 693048 (0.0029) [2024-04-28 19:21:45,684][57339] Updated weights for policy 0, policy_version 693058 (0.0029) [2024-04-28 19:21:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55650.1). 
Total num frames: 11355127808. Throughput: 0: 55505.1. Samples: 1845542880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:21:47,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 19:21:48,770][57339] Updated weights for policy 0, policy_version 693068 (0.0026) [2024-04-28 19:21:51,676][57319] Signal inference workers to stop experience collection... (27500 times) [2024-04-28 19:21:51,680][57319] Signal inference workers to resume experience collection... (27500 times) [2024-04-28 19:21:51,685][57339] Updated weights for policy 0, policy_version 693078 (0.0030) [2024-04-28 19:21:51,695][57339] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-04-28 19:21:51,695][57339] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-04-28 19:21:52,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11355439104. Throughput: 0: 55579.5. Samples: 1845714080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:21:52,169][57108] Avg episode reward: [(0, '0.573')] [2024-04-28 19:21:54,509][57339] Updated weights for policy 0, policy_version 693088 (0.0031) [2024-04-28 19:21:57,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11355701248. Throughput: 0: 55727.2. Samples: 1846055220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:21:57,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 19:21:57,470][57339] Updated weights for policy 0, policy_version 693098 (0.0029) [2024-04-28 19:22:00,371][57339] Updated weights for policy 0, policy_version 693108 (0.0032) [2024-04-28 19:22:02,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55705.8, 300 sec: 55761.1). Total num frames: 11355963392. Throughput: 0: 55775.6. Samples: 1846390500. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:02,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 19:22:03,397][57339] Updated weights for policy 0, policy_version 693118 (0.0024) [2024-04-28 19:22:06,141][57339] Updated weights for policy 0, policy_version 693128 (0.0034) [2024-04-28 19:22:07,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11356258304. Throughput: 0: 55757.7. Samples: 1846557680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:07,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 19:22:09,326][57339] Updated weights for policy 0, policy_version 693138 (0.0029) [2024-04-28 19:22:12,089][57339] Updated weights for policy 0, policy_version 693148 (0.0030) [2024-04-28 19:22:12,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11356536832. Throughput: 0: 55808.2. Samples: 1846887880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:12,169][57108] Avg episode reward: [(0, '0.710')] [2024-04-28 19:22:15,144][57339] Updated weights for policy 0, policy_version 693158 (0.0028) [2024-04-28 19:22:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11356815360. Throughput: 0: 55909.3. Samples: 1847223460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:17,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 19:22:17,851][57339] Updated weights for policy 0, policy_version 693168 (0.0034) [2024-04-28 19:22:21,034][57339] Updated weights for policy 0, policy_version 693178 (0.0029) [2024-04-28 19:22:22,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11357077504. Throughput: 0: 55963.9. Samples: 1847392460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:22,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 19:22:23,659][57339] Updated weights for policy 0, policy_version 693188 (0.0028) [2024-04-28 19:22:26,769][57339] Updated weights for policy 0, policy_version 693198 (0.0027) [2024-04-28 19:22:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11357372416. Throughput: 0: 55958.1. Samples: 1847728120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:27,170][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 19:22:29,502][57339] Updated weights for policy 0, policy_version 693208 (0.0027) [2024-04-28 19:22:32,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11357650944. Throughput: 0: 55990.0. Samples: 1848062440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:32,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 19:22:32,742][57339] Updated weights for policy 0, policy_version 693218 (0.0030) [2024-04-28 19:22:35,319][57339] Updated weights for policy 0, policy_version 693228 (0.0027) [2024-04-28 19:22:37,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11357929472. Throughput: 0: 55843.1. Samples: 1848227020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:37,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 19:22:38,669][57339] Updated weights for policy 0, policy_version 693238 (0.0032) [2024-04-28 19:22:41,106][57339] Updated weights for policy 0, policy_version 693248 (0.0027) [2024-04-28 19:22:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11358208000. Throughput: 0: 55683.0. Samples: 1848560960. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:42,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 19:22:44,455][57339] Updated weights for policy 0, policy_version 693258 (0.0036) [2024-04-28 19:22:46,850][57339] Updated weights for policy 0, policy_version 693268 (0.0028) [2024-04-28 19:22:47,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11358502912. Throughput: 0: 55735.1. Samples: 1848898580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:47,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 19:22:50,224][57339] Updated weights for policy 0, policy_version 693278 (0.0029) [2024-04-28 19:22:51,364][57319] Signal inference workers to stop experience collection... (27550 times) [2024-04-28 19:22:51,364][57319] Signal inference workers to resume experience collection... (27550 times) [2024-04-28 19:22:51,389][57339] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-04-28 19:22:51,390][57339] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-04-28 19:22:52,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11358765056. Throughput: 0: 55814.6. Samples: 1849069340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:52,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 19:22:52,768][57339] Updated weights for policy 0, policy_version 693288 (0.0033) [2024-04-28 19:22:56,111][57339] Updated weights for policy 0, policy_version 693298 (0.0032) [2024-04-28 19:22:57,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11359043584. Throughput: 0: 55966.5. Samples: 1849406380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:22:57,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 19:22:58,645][57339] Updated weights for policy 0, policy_version 693308 (0.0034) [2024-04-28 19:23:01,989][57339] Updated weights for policy 0, policy_version 693318 (0.0029) [2024-04-28 19:23:02,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11359322112. Throughput: 0: 55847.9. Samples: 1849736620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:23:02,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 19:23:04,425][57339] Updated weights for policy 0, policy_version 693328 (0.0027) [2024-04-28 19:23:07,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11359617024. Throughput: 0: 55677.0. Samples: 1849897920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:07,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 19:23:07,932][57339] Updated weights for policy 0, policy_version 693338 (0.0027) [2024-04-28 19:23:10,645][57339] Updated weights for policy 0, policy_version 693348 (0.0029) [2024-04-28 19:23:12,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11359879168. Throughput: 0: 55867.7. Samples: 1850242160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:12,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 19:23:13,635][57339] Updated weights for policy 0, policy_version 693358 (0.0028) [2024-04-28 19:23:16,605][57339] Updated weights for policy 0, policy_version 693368 (0.0024) [2024-04-28 19:23:17,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11360157696. Throughput: 0: 55907.1. Samples: 1850578260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:17,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 19:23:19,404][57339] Updated weights for policy 0, policy_version 693378 (0.0028) [2024-04-28 19:23:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 55761.2). Total num frames: 11360452608. Throughput: 0: 55939.9. Samples: 1850744320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:22,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 19:23:22,386][57339] Updated weights for policy 0, policy_version 693388 (0.0029) [2024-04-28 19:23:25,336][57339] Updated weights for policy 0, policy_version 693398 (0.0030) [2024-04-28 19:23:27,169][57108] Fps is (10 sec: 58982.5, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11360747520. Throughput: 0: 56039.6. Samples: 1851082740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:27,170][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 19:23:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000693405_11360747520.pth... [2024-04-28 19:23:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000692590_11347394560.pth [2024-04-28 19:23:28,146][57339] Updated weights for policy 0, policy_version 693408 (0.0026) [2024-04-28 19:23:31,148][57339] Updated weights for policy 0, policy_version 693418 (0.0031) [2024-04-28 19:23:32,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11360976896. Throughput: 0: 56063.5. Samples: 1851421440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:32,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 19:23:33,817][57339] Updated weights for policy 0, policy_version 693428 (0.0030) [2024-04-28 19:23:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11361271808. Throughput: 0: 55939.1. Samples: 1851586600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:37,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 19:23:37,408][57339] Updated weights for policy 0, policy_version 693438 (0.0028) [2024-04-28 19:23:39,686][57339] Updated weights for policy 0, policy_version 693448 (0.0029) [2024-04-28 19:23:42,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 11361566720. Throughput: 0: 55891.3. Samples: 1851921480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:42,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 19:23:43,064][57339] Updated weights for policy 0, policy_version 693458 (0.0025) [2024-04-28 19:23:45,567][57339] Updated weights for policy 0, policy_version 693468 (0.0030) [2024-04-28 19:23:47,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11361861632. Throughput: 0: 56000.6. Samples: 1852256640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:47,169][57108] Avg episode reward: [(0, '0.506')] [2024-04-28 19:23:48,740][57339] Updated weights for policy 0, policy_version 693478 (0.0030) [2024-04-28 19:23:51,320][57339] Updated weights for policy 0, policy_version 693488 (0.0033) [2024-04-28 19:23:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11362140160. Throughput: 0: 56337.7. Samples: 1852433120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:52,170][57108] Avg episode reward: [(0, '0.702')] [2024-04-28 19:23:54,664][57339] Updated weights for policy 0, policy_version 693498 (0.0027) [2024-04-28 19:23:57,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 11362402304. Throughput: 0: 56167.1. Samples: 1852769680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:23:57,169][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 19:23:57,318][57339] Updated weights for policy 0, policy_version 693508 (0.0026) [2024-04-28 19:24:00,602][57339] Updated weights for policy 0, policy_version 693518 (0.0033) [2024-04-28 19:24:01,590][57319] Signal inference workers to stop experience collection... (27600 times) [2024-04-28 19:24:01,632][57339] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-04-28 19:24:01,649][57319] Signal inference workers to resume experience collection... (27600 times) [2024-04-28 19:24:01,649][57339] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-04-28 19:24:02,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56524.8, 300 sec: 55761.1). Total num frames: 11362713600. Throughput: 0: 56197.9. Samples: 1853107160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:24:02,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 19:24:03,135][57339] Updated weights for policy 0, policy_version 693528 (0.0033) [2024-04-28 19:24:06,443][57339] Updated weights for policy 0, policy_version 693538 (0.0027) [2024-04-28 19:24:07,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11362959360. Throughput: 0: 56091.2. Samples: 1853268420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:24:07,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 19:24:08,914][57339] Updated weights for policy 0, policy_version 693548 (0.0031) [2024-04-28 19:24:12,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11363237888. Throughput: 0: 56037.0. Samples: 1853604400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:24:12,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 19:24:12,354][57339] Updated weights for policy 0, policy_version 693558 (0.0031) [2024-04-28 19:24:14,655][57339] Updated weights for policy 0, policy_version 693568 (0.0030) [2024-04-28 19:24:17,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11363532800. Throughput: 0: 55955.1. Samples: 1853939420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:24:17,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 19:24:18,142][57339] Updated weights for policy 0, policy_version 693578 (0.0030) [2024-04-28 19:24:20,553][57339] Updated weights for policy 0, policy_version 693588 (0.0029) [2024-04-28 19:24:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 11363811328. Throughput: 0: 56061.9. Samples: 1854109380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:24:22,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 19:24:24,035][57339] Updated weights for policy 0, policy_version 693598 (0.0037) [2024-04-28 19:24:26,439][57339] Updated weights for policy 0, policy_version 693608 (0.0032) [2024-04-28 19:24:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11364089856. Throughput: 0: 55998.5. Samples: 1854441420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:24:27,170][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 19:24:29,838][57339] Updated weights for policy 0, policy_version 693618 (0.0029) [2024-04-28 19:24:32,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56797.9, 300 sec: 55761.2). Total num frames: 11364384768. Throughput: 0: 56003.5. Samples: 1854776800. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:24:32,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 19:24:32,209][57339] Updated weights for policy 0, policy_version 693628 (0.0027) [2024-04-28 19:24:35,725][57339] Updated weights for policy 0, policy_version 693638 (0.0030) [2024-04-28 19:24:37,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.6, 300 sec: 55761.1). Total num frames: 11364646912. Throughput: 0: 55736.7. Samples: 1854941280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:24:37,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 19:24:37,955][57339] Updated weights for policy 0, policy_version 693648 (0.0026) [2024-04-28 19:24:41,484][57339] Updated weights for policy 0, policy_version 693658 (0.0029) [2024-04-28 19:24:42,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11364909056. Throughput: 0: 55819.6. Samples: 1855281560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-04-28 19:24:42,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 19:24:44,296][57339] Updated weights for policy 0, policy_version 693668 (0.0032) [2024-04-28 19:24:47,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11365203968. Throughput: 0: 55797.0. Samples: 1855618020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:24:47,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 19:24:47,227][57339] Updated weights for policy 0, policy_version 693678 (0.0028) [2024-04-28 19:24:50,515][57339] Updated weights for policy 0, policy_version 693688 (0.0026) [2024-04-28 19:24:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11365482496. Throughput: 0: 55775.1. Samples: 1855778300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:24:52,170][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 19:24:53,460][57339] Updated weights for policy 0, policy_version 693698 (0.0025) [2024-04-28 19:24:56,392][57339] Updated weights for policy 0, policy_version 693708 (0.0026) [2024-04-28 19:24:57,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11365761024. Throughput: 0: 55725.8. Samples: 1856112060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:24:57,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 19:24:59,329][57339] Updated weights for policy 0, policy_version 693718 (0.0029) [2024-04-28 19:25:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 11366023168. Throughput: 0: 55583.2. Samples: 1856440660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:02,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 19:25:02,278][57339] Updated weights for policy 0, policy_version 693728 (0.0031) [2024-04-28 19:25:05,104][57339] Updated weights for policy 0, policy_version 693738 (0.0029) [2024-04-28 19:25:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11366334464. Throughput: 0: 55682.7. Samples: 1856615100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:07,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 19:25:08,080][57339] Updated weights for policy 0, policy_version 693748 (0.0028) [2024-04-28 19:25:09,030][57319] Signal inference workers to stop experience collection... (27650 times) [2024-04-28 19:25:09,056][57339] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-04-28 19:25:09,087][57319] Signal inference workers to resume experience collection... 
(27650 times) [2024-04-28 19:25:09,087][57339] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-04-28 19:25:11,051][57339] Updated weights for policy 0, policy_version 693758 (0.0025) [2024-04-28 19:25:12,169][57108] Fps is (10 sec: 55704.2, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 11366580224. Throughput: 0: 55674.1. Samples: 1856946760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:12,170][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 19:25:14,062][57339] Updated weights for policy 0, policy_version 693768 (0.0026) [2024-04-28 19:25:17,108][57339] Updated weights for policy 0, policy_version 693778 (0.0030) [2024-04-28 19:25:17,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 11366858752. Throughput: 0: 55586.2. Samples: 1857278180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:17,169][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 19:25:19,946][57339] Updated weights for policy 0, policy_version 693788 (0.0027) [2024-04-28 19:25:22,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11367153664. Throughput: 0: 55673.1. Samples: 1857446560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:22,170][57108] Avg episode reward: [(0, '0.538')] [2024-04-28 19:25:22,843][57339] Updated weights for policy 0, policy_version 693798 (0.0039) [2024-04-28 19:25:25,723][57339] Updated weights for policy 0, policy_version 693808 (0.0024) [2024-04-28 19:25:27,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55705.7, 300 sec: 55761.2). Total num frames: 11367432192. Throughput: 0: 55600.0. Samples: 1857783560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:27,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 19:25:27,267][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000693814_11367448576.pth... 
[2024-04-28 19:25:27,320][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000692994_11354013696.pth [2024-04-28 19:25:28,577][57339] Updated weights for policy 0, policy_version 693818 (0.0032) [2024-04-28 19:25:31,530][57339] Updated weights for policy 0, policy_version 693828 (0.0031) [2024-04-28 19:25:32,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55159.3, 300 sec: 55705.6). Total num frames: 11367694336. Throughput: 0: 55429.1. Samples: 1858112340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:32,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 19:25:34,548][57339] Updated weights for policy 0, policy_version 693838 (0.0033) [2024-04-28 19:25:37,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 11367972864. Throughput: 0: 55589.3. Samples: 1858279820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:37,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 19:25:37,678][57339] Updated weights for policy 0, policy_version 693848 (0.0027) [2024-04-28 19:25:40,572][57339] Updated weights for policy 0, policy_version 693858 (0.0032) [2024-04-28 19:25:42,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11368267776. Throughput: 0: 55637.4. Samples: 1858615740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:42,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 19:25:43,376][57339] Updated weights for policy 0, policy_version 693868 (0.0023) [2024-04-28 19:25:46,317][57339] Updated weights for policy 0, policy_version 693878 (0.0029) [2024-04-28 19:25:47,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11368529920. Throughput: 0: 55708.7. Samples: 1858947560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:47,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 19:25:49,055][57339] Updated weights for policy 0, policy_version 693888 (0.0028) [2024-04-28 19:25:52,068][57339] Updated weights for policy 0, policy_version 693898 (0.0032) [2024-04-28 19:25:52,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11368824832. Throughput: 0: 55634.3. Samples: 1859118640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:52,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 19:25:54,783][57339] Updated weights for policy 0, policy_version 693908 (0.0029) [2024-04-28 19:25:57,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11369086976. Throughput: 0: 55831.7. Samples: 1859459180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:25:57,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:25:58,055][57339] Updated weights for policy 0, policy_version 693918 (0.0028) [2024-04-28 19:26:00,924][57339] Updated weights for policy 0, policy_version 693928 (0.0028) [2024-04-28 19:26:02,169][57108] Fps is (10 sec: 57342.7, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11369398272. Throughput: 0: 55854.9. Samples: 1859791660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:26:02,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 19:26:03,797][57339] Updated weights for policy 0, policy_version 693938 (0.0030) [2024-04-28 19:26:06,668][57339] Updated weights for policy 0, policy_version 693948 (0.0026) [2024-04-28 19:26:07,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55159.4, 300 sec: 55761.2). Total num frames: 11369644032. Throughput: 0: 55894.7. Samples: 1859961820. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:26:07,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 19:26:09,717][57339] Updated weights for policy 0, policy_version 693958 (0.0032) [2024-04-28 19:26:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55927.7). Total num frames: 11369955328. Throughput: 0: 55774.5. Samples: 1860293420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:26:12,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 19:26:12,631][57339] Updated weights for policy 0, policy_version 693968 (0.0033) [2024-04-28 19:26:15,701][57339] Updated weights for policy 0, policy_version 693978 (0.0027) [2024-04-28 19:26:17,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11370217472. Throughput: 0: 56022.5. Samples: 1860633340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:26:17,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 19:26:18,402][57339] Updated weights for policy 0, policy_version 693988 (0.0029) [2024-04-28 19:26:19,114][57319] Signal inference workers to stop experience collection... (27700 times) [2024-04-28 19:26:19,134][57339] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-04-28 19:26:19,172][57319] Signal inference workers to resume experience collection... (27700 times) [2024-04-28 19:26:19,172][57339] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-04-28 19:26:21,453][57339] Updated weights for policy 0, policy_version 693998 (0.0030) [2024-04-28 19:26:22,169][57108] Fps is (10 sec: 54068.2, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 11370496000. Throughput: 0: 55906.8. Samples: 1860795620. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-04-28 19:26:22,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 19:26:24,199][57339] Updated weights for policy 0, policy_version 694008 (0.0029) [2024-04-28 19:26:27,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11370774528. Throughput: 0: 55887.6. Samples: 1861130680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:26:27,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 19:26:27,247][57339] Updated weights for policy 0, policy_version 694018 (0.0030) [2024-04-28 19:26:30,047][57339] Updated weights for policy 0, policy_version 694028 (0.0028) [2024-04-28 19:26:32,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11371053056. Throughput: 0: 56078.3. Samples: 1861471080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:26:32,170][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 19:26:32,981][57339] Updated weights for policy 0, policy_version 694038 (0.0025) [2024-04-28 19:26:35,945][57339] Updated weights for policy 0, policy_version 694048 (0.0036) [2024-04-28 19:26:37,169][57108] Fps is (10 sec: 57343.3, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 11371347968. Throughput: 0: 55998.9. Samples: 1861638600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:26:37,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 19:26:38,901][57339] Updated weights for policy 0, policy_version 694058 (0.0029) [2024-04-28 19:26:41,868][57339] Updated weights for policy 0, policy_version 694068 (0.0027) [2024-04-28 19:26:42,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 11371626496. Throughput: 0: 55914.3. Samples: 1861975320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:26:42,169][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 19:26:44,734][57339] Updated weights for policy 0, policy_version 694078 (0.0034) [2024-04-28 19:26:47,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11371888640. Throughput: 0: 56008.8. Samples: 1862312060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:26:47,170][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 19:26:47,922][57339] Updated weights for policy 0, policy_version 694088 (0.0028) [2024-04-28 19:26:50,582][57339] Updated weights for policy 0, policy_version 694098 (0.0032) [2024-04-28 19:26:52,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11372183552. Throughput: 0: 55892.5. Samples: 1862476980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:26:52,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 19:26:53,797][57339] Updated weights for policy 0, policy_version 694108 (0.0029) [2024-04-28 19:26:56,381][57339] Updated weights for policy 0, policy_version 694118 (0.0032) [2024-04-28 19:26:57,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11372445696. Throughput: 0: 55993.7. Samples: 1862813140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:26:57,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 19:26:59,705][57339] Updated weights for policy 0, policy_version 694128 (0.0031) [2024-04-28 19:27:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11372740608. Throughput: 0: 55867.8. Samples: 1863147400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:02,170][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 19:27:02,273][57339] Updated weights for policy 0, policy_version 694138 (0.0034) [2024-04-28 19:27:05,482][57339] Updated weights for policy 0, policy_version 694148 (0.0032) [2024-04-28 19:27:07,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11372986368. Throughput: 0: 55878.9. Samples: 1863310180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:07,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 19:27:08,224][57339] Updated weights for policy 0, policy_version 694158 (0.0029) [2024-04-28 19:27:11,355][57339] Updated weights for policy 0, policy_version 694168 (0.0024) [2024-04-28 19:27:12,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 11373281280. Throughput: 0: 55831.5. Samples: 1863643100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:12,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 19:27:14,000][57339] Updated weights for policy 0, policy_version 694178 (0.0028) [2024-04-28 19:27:17,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 11373559808. Throughput: 0: 55620.0. Samples: 1863973980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:17,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 19:27:17,358][57339] Updated weights for policy 0, policy_version 694188 (0.0030) [2024-04-28 19:27:19,467][57319] Signal inference workers to stop experience collection... (27750 times) [2024-04-28 19:27:19,468][57319] Signal inference workers to resume experience collection... 
(27750 times) [2024-04-28 19:27:19,504][57339] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-04-28 19:27:19,505][57339] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-04-28 19:27:19,984][57339] Updated weights for policy 0, policy_version 694198 (0.0030) [2024-04-28 19:27:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 11373821952. Throughput: 0: 55511.6. Samples: 1864136620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:22,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:27:23,237][57339] Updated weights for policy 0, policy_version 694208 (0.0040) [2024-04-28 19:27:25,901][57339] Updated weights for policy 0, policy_version 694218 (0.0028) [2024-04-28 19:27:27,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11374116864. Throughput: 0: 55392.9. Samples: 1864468000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:27,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 19:27:27,177][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000694221_11374116864.pth... [2024-04-28 19:27:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000693405_11360747520.pth [2024-04-28 19:27:29,139][57339] Updated weights for policy 0, policy_version 694228 (0.0026) [2024-04-28 19:27:31,723][57339] Updated weights for policy 0, policy_version 694238 (0.0027) [2024-04-28 19:27:32,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11374395392. Throughput: 0: 55253.6. Samples: 1864798460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:32,169][57108] Avg episode reward: [(0, '0.508')] [2024-04-28 19:27:34,926][57339] Updated weights for policy 0, policy_version 694248 (0.0032) [2024-04-28 19:27:37,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.6, 300 sec: 55816.7). 
Total num frames: 11374673920. Throughput: 0: 55401.8. Samples: 1864970060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:37,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 19:27:37,607][57339] Updated weights for policy 0, policy_version 694258 (0.0026) [2024-04-28 19:27:40,900][57339] Updated weights for policy 0, policy_version 694268 (0.0031) [2024-04-28 19:27:42,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11374936064. Throughput: 0: 55378.5. Samples: 1865305160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:42,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:27:43,784][57339] Updated weights for policy 0, policy_version 694278 (0.0033) [2024-04-28 19:27:46,782][57339] Updated weights for policy 0, policy_version 694288 (0.0031) [2024-04-28 19:27:47,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11375230976. Throughput: 0: 55392.0. Samples: 1865640040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:47,170][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 19:27:49,578][57339] Updated weights for policy 0, policy_version 694298 (0.0032) [2024-04-28 19:27:52,169][57108] Fps is (10 sec: 57343.5, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11375509504. Throughput: 0: 55483.2. Samples: 1865806920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:52,169][57108] Avg episode reward: [(0, '0.542')] [2024-04-28 19:27:52,497][57339] Updated weights for policy 0, policy_version 694308 (0.0027) [2024-04-28 19:27:55,392][57339] Updated weights for policy 0, policy_version 694318 (0.0031) [2024-04-28 19:27:57,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11375771648. Throughput: 0: 55449.4. Samples: 1866138340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:27:57,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 19:27:58,349][57339] Updated weights for policy 0, policy_version 694328 (0.0028) [2024-04-28 19:28:01,358][57339] Updated weights for policy 0, policy_version 694338 (0.0025) [2024-04-28 19:28:02,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11376050176. Throughput: 0: 55566.7. Samples: 1866474480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-04-28 19:28:02,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 19:28:04,344][57339] Updated weights for policy 0, policy_version 694348 (0.0028) [2024-04-28 19:28:07,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11376345088. Throughput: 0: 55635.5. Samples: 1866640220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:07,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 19:28:07,184][57339] Updated weights for policy 0, policy_version 694358 (0.0032) [2024-04-28 19:28:10,137][57339] Updated weights for policy 0, policy_version 694368 (0.0028) [2024-04-28 19:28:12,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11376640000. Throughput: 0: 55832.9. Samples: 1866980480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:12,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 19:28:12,900][57339] Updated weights for policy 0, policy_version 694378 (0.0031) [2024-04-28 19:28:16,046][57339] Updated weights for policy 0, policy_version 694388 (0.0034) [2024-04-28 19:28:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11376902144. Throughput: 0: 55916.4. Samples: 1867314700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:17,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 19:28:18,781][57339] Updated weights for policy 0, policy_version 694398 (0.0030) [2024-04-28 19:28:22,069][57339] Updated weights for policy 0, policy_version 694408 (0.0025) [2024-04-28 19:28:22,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11377180672. Throughput: 0: 55828.8. Samples: 1867482360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:22,170][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 19:28:24,741][57339] Updated weights for policy 0, policy_version 694418 (0.0026) [2024-04-28 19:28:27,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11377442816. Throughput: 0: 55751.9. Samples: 1867814000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:27,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 19:28:27,879][57339] Updated weights for policy 0, policy_version 694428 (0.0026) [2024-04-28 19:28:30,732][57339] Updated weights for policy 0, policy_version 694438 (0.0026) [2024-04-28 19:28:32,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.4, 300 sec: 55816.7). Total num frames: 11377737728. Throughput: 0: 55713.2. Samples: 1868147140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:32,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 19:28:33,562][57319] Signal inference workers to stop experience collection... (27800 times) [2024-04-28 19:28:33,589][57339] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-04-28 19:28:33,620][57319] Signal inference workers to resume experience collection... 
(27800 times) [2024-04-28 19:28:33,621][57339] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-04-28 19:28:33,730][57339] Updated weights for policy 0, policy_version 694448 (0.0029) [2024-04-28 19:28:36,417][57339] Updated weights for policy 0, policy_version 694458 (0.0031) [2024-04-28 19:28:37,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11377999872. Throughput: 0: 55666.5. Samples: 1868311920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:37,170][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 19:28:39,561][57339] Updated weights for policy 0, policy_version 694468 (0.0030) [2024-04-28 19:28:42,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11378294784. Throughput: 0: 55766.5. Samples: 1868647820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:42,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 19:28:42,369][57339] Updated weights for policy 0, policy_version 694478 (0.0029) [2024-04-28 19:28:45,511][57339] Updated weights for policy 0, policy_version 694488 (0.0033) [2024-04-28 19:28:47,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.8, 300 sec: 55761.1). Total num frames: 11378589696. Throughput: 0: 55653.3. Samples: 1868978880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:47,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 19:28:48,290][57339] Updated weights for policy 0, policy_version 694498 (0.0029) [2024-04-28 19:28:51,394][57339] Updated weights for policy 0, policy_version 694508 (0.0032) [2024-04-28 19:28:52,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11378868224. Throughput: 0: 55855.7. Samples: 1869153720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:52,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 19:28:54,040][57339] Updated weights for policy 0, policy_version 694518 (0.0032) [2024-04-28 19:28:57,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.9, 300 sec: 55650.1). Total num frames: 11379130368. Throughput: 0: 55917.4. Samples: 1869496760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:28:57,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 19:28:57,178][57339] Updated weights for policy 0, policy_version 694528 (0.0025) [2024-04-28 19:28:59,769][57339] Updated weights for policy 0, policy_version 694538 (0.0031) [2024-04-28 19:29:02,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11379392512. Throughput: 0: 55881.9. Samples: 1869829380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:29:02,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 19:29:03,072][57339] Updated weights for policy 0, policy_version 694548 (0.0036) [2024-04-28 19:29:05,783][57339] Updated weights for policy 0, policy_version 694558 (0.0033) [2024-04-28 19:29:07,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11379671040. Throughput: 0: 55584.9. Samples: 1869983680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:29:07,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 19:29:09,020][57339] Updated weights for policy 0, policy_version 694568 (0.0028) [2024-04-28 19:29:11,740][57339] Updated weights for policy 0, policy_version 694578 (0.0027) [2024-04-28 19:29:12,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11379965952. Throughput: 0: 55720.0. Samples: 1870321400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:29:12,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 19:29:14,802][57339] Updated weights for policy 0, policy_version 694588 (0.0026) [2024-04-28 19:29:17,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11380260864. Throughput: 0: 55807.8. Samples: 1870658480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:29:17,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 19:29:17,799][57339] Updated weights for policy 0, policy_version 694598 (0.0027) [2024-04-28 19:29:20,634][57339] Updated weights for policy 0, policy_version 694608 (0.0026) [2024-04-28 19:29:22,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 11380539392. Throughput: 0: 56048.2. Samples: 1870834080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:29:22,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 19:29:23,649][57339] Updated weights for policy 0, policy_version 694618 (0.0026) [2024-04-28 19:29:26,400][57339] Updated weights for policy 0, policy_version 694628 (0.0031) [2024-04-28 19:29:27,169][57108] Fps is (10 sec: 55704.8, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 11380817920. Throughput: 0: 55989.6. Samples: 1871167360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:29:27,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 19:29:27,229][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000694631_11380834304.pth... [2024-04-28 19:29:27,276][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000693814_11367448576.pth [2024-04-28 19:29:29,643][57339] Updated weights for policy 0, policy_version 694638 (0.0030) [2024-04-28 19:29:32,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 11381096448. Throughput: 0: 56007.5. Samples: 1871499220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:29:32,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:29:32,292][57339] Updated weights for policy 0, policy_version 694648 (0.0030) [2024-04-28 19:29:35,470][57339] Updated weights for policy 0, policy_version 694658 (0.0028) [2024-04-28 19:29:37,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11381342208. Throughput: 0: 55722.6. Samples: 1871661240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:29:37,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 19:29:38,108][57339] Updated weights for policy 0, policy_version 694668 (0.0027) [2024-04-28 19:29:41,266][57339] Updated weights for policy 0, policy_version 694678 (0.0028) [2024-04-28 19:29:42,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 11381620736. Throughput: 0: 55531.4. Samples: 1871995680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-04-28 19:29:42,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 19:29:44,039][57339] Updated weights for policy 0, policy_version 694688 (0.0029) [2024-04-28 19:29:47,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11381915648. Throughput: 0: 55654.2. Samples: 1872333820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:29:47,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 19:29:47,200][57339] Updated weights for policy 0, policy_version 694698 (0.0029) [2024-04-28 19:29:49,788][57339] Updated weights for policy 0, policy_version 694708 (0.0028) [2024-04-28 19:29:52,169][57108] Fps is (10 sec: 60621.4, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11382226944. Throughput: 0: 55975.6. Samples: 1872502580. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:29:52,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 19:29:53,080][57339] Updated weights for policy 0, policy_version 694718 (0.0031) [2024-04-28 19:29:54,866][57319] Signal inference workers to stop experience collection... (27850 times) [2024-04-28 19:29:54,867][57319] Signal inference workers to resume experience collection... (27850 times) [2024-04-28 19:29:54,894][57339] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-04-28 19:29:54,895][57339] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-04-28 19:29:55,788][57339] Updated weights for policy 0, policy_version 694728 (0.0031) [2024-04-28 19:29:57,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11382489088. Throughput: 0: 55875.9. Samples: 1872835820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:29:57,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:29:58,821][57339] Updated weights for policy 0, policy_version 694738 (0.0026) [2024-04-28 19:30:01,573][57339] Updated weights for policy 0, policy_version 694748 (0.0026) [2024-04-28 19:30:02,169][57108] Fps is (10 sec: 54067.6, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11382767616. Throughput: 0: 55863.2. Samples: 1873172320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:02,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 19:30:04,626][57339] Updated weights for policy 0, policy_version 694758 (0.0030) [2024-04-28 19:30:07,169][57108] Fps is (10 sec: 55704.5, 60 sec: 56251.5, 300 sec: 55816.7). Total num frames: 11383046144. Throughput: 0: 55778.7. Samples: 1873344140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:07,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 19:30:07,416][57339] Updated weights for policy 0, policy_version 694768 (0.0027) [2024-04-28 19:30:10,621][57339] Updated weights for policy 0, policy_version 694778 (0.0028) [2024-04-28 19:30:12,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11383308288. Throughput: 0: 55863.7. Samples: 1873681220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:12,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:30:13,279][57339] Updated weights for policy 0, policy_version 694788 (0.0025) [2024-04-28 19:30:16,507][57339] Updated weights for policy 0, policy_version 694798 (0.0027) [2024-04-28 19:30:17,169][57108] Fps is (10 sec: 52430.0, 60 sec: 55159.4, 300 sec: 55650.1). Total num frames: 11383570432. Throughput: 0: 55911.1. Samples: 1874015220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:17,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 19:30:19,094][57339] Updated weights for policy 0, policy_version 694808 (0.0026) [2024-04-28 19:30:22,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11383881728. Throughput: 0: 55995.2. Samples: 1874181020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:22,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 19:30:22,384][57339] Updated weights for policy 0, policy_version 694818 (0.0025) [2024-04-28 19:30:24,807][57339] Updated weights for policy 0, policy_version 694828 (0.0028) [2024-04-28 19:30:27,169][57108] Fps is (10 sec: 58981.7, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11384160256. Throughput: 0: 56080.4. Samples: 1874519300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:27,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:30:28,394][57339] Updated weights for policy 0, policy_version 694838 (0.0026) [2024-04-28 19:30:30,827][57339] Updated weights for policy 0, policy_version 694848 (0.0025) [2024-04-28 19:30:32,169][57108] Fps is (10 sec: 58981.9, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 11384471552. Throughput: 0: 56055.0. Samples: 1874856300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:32,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 19:30:34,405][57339] Updated weights for policy 0, policy_version 694858 (0.0034) [2024-04-28 19:30:36,709][57339] Updated weights for policy 0, policy_version 694868 (0.0027) [2024-04-28 19:30:37,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56524.7, 300 sec: 55816.7). Total num frames: 11384733696. Throughput: 0: 56248.4. Samples: 1875033760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:37,170][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 19:30:40,143][57339] Updated weights for policy 0, policy_version 694878 (0.0031) [2024-04-28 19:30:42,169][57108] Fps is (10 sec: 52429.1, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 11384995840. Throughput: 0: 56262.4. Samples: 1875367620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:42,169][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 19:30:42,574][57339] Updated weights for policy 0, policy_version 694888 (0.0028) [2024-04-28 19:30:45,800][57339] Updated weights for policy 0, policy_version 694898 (0.0027) [2024-04-28 19:30:47,169][57108] Fps is (10 sec: 55705.5, 60 sec: 56251.6, 300 sec: 55816.6). Total num frames: 11385290752. Throughput: 0: 56326.1. Samples: 1875707000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:47,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 19:30:48,510][57339] Updated weights for policy 0, policy_version 694908 (0.0029) [2024-04-28 19:30:51,764][57339] Updated weights for policy 0, policy_version 694918 (0.0028) [2024-04-28 19:30:52,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 11385536512. Throughput: 0: 55938.1. Samples: 1875861340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:52,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 19:30:54,338][57339] Updated weights for policy 0, policy_version 694928 (0.0029) [2024-04-28 19:30:57,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11385831424. Throughput: 0: 55938.6. Samples: 1876198460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:30:57,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 19:30:57,748][57339] Updated weights for policy 0, policy_version 694938 (0.0027) [2024-04-28 19:31:00,271][57319] Signal inference workers to stop experience collection... (27900 times) [2024-04-28 19:31:00,279][57319] Signal inference workers to resume experience collection... (27900 times) [2024-04-28 19:31:00,299][57339] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-04-28 19:31:00,299][57339] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-04-28 19:31:00,401][57339] Updated weights for policy 0, policy_version 694948 (0.0029) [2024-04-28 19:31:02,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11386126336. Throughput: 0: 55870.3. Samples: 1876529380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:31:02,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 19:31:03,695][57339] Updated weights for policy 0, policy_version 694958 (0.0034) [2024-04-28 19:31:06,141][57339] Updated weights for policy 0, policy_version 694968 (0.0032) [2024-04-28 19:31:07,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.9, 300 sec: 55816.7). Total num frames: 11386421248. Throughput: 0: 55935.0. Samples: 1876698100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:31:07,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:31:09,446][57339] Updated weights for policy 0, policy_version 694978 (0.0027) [2024-04-28 19:31:11,852][57339] Updated weights for policy 0, policy_version 694988 (0.0025) [2024-04-28 19:31:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11386683392. Throughput: 0: 55937.6. Samples: 1877036480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:31:12,169][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 19:31:15,323][57339] Updated weights for policy 0, policy_version 694998 (0.0025) [2024-04-28 19:31:17,169][57108] Fps is (10 sec: 52428.7, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11386945536. Throughput: 0: 56005.3. Samples: 1877376540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:31:17,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 19:31:17,905][57339] Updated weights for policy 0, policy_version 695008 (0.0037) [2024-04-28 19:31:21,234][57339] Updated weights for policy 0, policy_version 695018 (0.0027) [2024-04-28 19:31:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11387240448. Throughput: 0: 55695.2. Samples: 1877540040. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-04-28 19:31:22,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 19:31:23,714][57339] Updated weights for policy 0, policy_version 695028 (0.0025) [2024-04-28 19:31:26,959][57339] Updated weights for policy 0, policy_version 695038 (0.0025) [2024-04-28 19:31:27,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 11387518976. Throughput: 0: 55823.9. Samples: 1877879700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:31:27,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 19:31:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000695039_11387518976.pth... [2024-04-28 19:31:27,227][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000694221_11374116864.pth [2024-04-28 19:31:29,587][57339] Updated weights for policy 0, policy_version 695048 (0.0025) [2024-04-28 19:31:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11387797504. Throughput: 0: 55745.9. Samples: 1878215560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:31:32,169][57108] Avg episode reward: [(0, '0.621')] [2024-04-28 19:31:32,894][57339] Updated weights for policy 0, policy_version 695058 (0.0034) [2024-04-28 19:31:35,454][57339] Updated weights for policy 0, policy_version 695068 (0.0025) [2024-04-28 19:31:37,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11388092416. Throughput: 0: 56098.1. Samples: 1878385760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:31:37,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:31:38,566][57339] Updated weights for policy 0, policy_version 695078 (0.0029) [2024-04-28 19:31:41,236][57339] Updated weights for policy 0, policy_version 695088 (0.0031) [2024-04-28 19:31:42,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55872.2). 
Total num frames: 11388370944. Throughput: 0: 56075.2. Samples: 1878721840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:31:42,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 19:31:44,419][57339] Updated weights for policy 0, policy_version 695098 (0.0034) [2024-04-28 19:31:44,649][57319] Signal inference workers to stop experience collection... (27950 times) [2024-04-28 19:31:44,653][57319] Signal inference workers to resume experience collection... (27950 times) [2024-04-28 19:31:44,669][57339] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-04-28 19:31:44,669][57339] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-04-28 19:31:47,169][57108] Fps is (10 sec: 54068.3, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 11388633088. Throughput: 0: 56150.7. Samples: 1879056160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:31:47,169][57108] Avg episode reward: [(0, '0.699')] [2024-04-28 19:31:47,217][57339] Updated weights for policy 0, policy_version 695108 (0.0026) [2024-04-28 19:31:50,356][57339] Updated weights for policy 0, policy_version 695118 (0.0028) [2024-04-28 19:31:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11388911616. Throughput: 0: 56113.8. Samples: 1879223220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:31:52,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 19:31:53,081][57339] Updated weights for policy 0, policy_version 695128 (0.0033) [2024-04-28 19:31:56,138][57339] Updated weights for policy 0, policy_version 695138 (0.0025) [2024-04-28 19:31:57,169][57108] Fps is (10 sec: 57343.5, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11389206528. Throughput: 0: 55962.6. Samples: 1879554800. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:31:57,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 19:31:58,810][57339] Updated weights for policy 0, policy_version 695148 (0.0037) [2024-04-28 19:32:01,950][57339] Updated weights for policy 0, policy_version 695158 (0.0029) [2024-04-28 19:32:02,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11389485056. Throughput: 0: 55923.7. Samples: 1879893100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:02,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 19:32:04,811][57339] Updated weights for policy 0, policy_version 695168 (0.0027) [2024-04-28 19:32:07,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 11389730816. Throughput: 0: 55987.5. Samples: 1880059480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:07,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 19:32:07,728][57339] Updated weights for policy 0, policy_version 695178 (0.0029) [2024-04-28 19:32:10,730][57339] Updated weights for policy 0, policy_version 695188 (0.0031) [2024-04-28 19:32:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11390042112. Throughput: 0: 55857.4. Samples: 1880393280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:12,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 19:32:13,627][57339] Updated weights for policy 0, policy_version 695198 (0.0027) [2024-04-28 19:32:16,525][57339] Updated weights for policy 0, policy_version 695208 (0.0027) [2024-04-28 19:32:17,169][57108] Fps is (10 sec: 58982.9, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 11390320640. Throughput: 0: 55686.7. Samples: 1880721460. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:17,169][57108] Avg episode reward: [(0, '0.578')] [2024-04-28 19:32:19,522][57339] Updated weights for policy 0, policy_version 695218 (0.0032) [2024-04-28 19:32:22,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11390599168. Throughput: 0: 55661.1. Samples: 1880890500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:22,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 19:32:22,411][57339] Updated weights for policy 0, policy_version 695228 (0.0026) [2024-04-28 19:32:25,186][57339] Updated weights for policy 0, policy_version 695238 (0.0030) [2024-04-28 19:32:27,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11390861312. Throughput: 0: 55647.0. Samples: 1881225960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:27,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:32:28,245][57339] Updated weights for policy 0, policy_version 695248 (0.0026) [2024-04-28 19:32:30,997][57339] Updated weights for policy 0, policy_version 695258 (0.0028) [2024-04-28 19:32:32,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11391156224. Throughput: 0: 55434.5. Samples: 1881550720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:32,170][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 19:32:34,035][57319] Signal inference workers to stop experience collection... (28000 times) [2024-04-28 19:32:34,037][57319] Signal inference workers to resume experience collection... 
(28000 times) [2024-04-28 19:32:34,056][57339] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-04-28 19:32:34,056][57339] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-04-28 19:32:34,155][57339] Updated weights for policy 0, policy_version 695268 (0.0031) [2024-04-28 19:32:36,956][57339] Updated weights for policy 0, policy_version 695278 (0.0029) [2024-04-28 19:32:37,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55978.8, 300 sec: 55983.3). Total num frames: 11391451136. Throughput: 0: 55660.0. Samples: 1881727920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:37,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 19:32:39,990][57339] Updated weights for policy 0, policy_version 695288 (0.0026) [2024-04-28 19:32:42,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11391696896. Throughput: 0: 55818.2. Samples: 1882066620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:42,170][57108] Avg episode reward: [(0, '0.474')] [2024-04-28 19:32:42,901][57339] Updated weights for policy 0, policy_version 695298 (0.0033) [2024-04-28 19:32:45,736][57339] Updated weights for policy 0, policy_version 695308 (0.0039) [2024-04-28 19:32:47,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11391975424. Throughput: 0: 55747.1. Samples: 1882401720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:47,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 19:32:48,641][57339] Updated weights for policy 0, policy_version 695318 (0.0037) [2024-04-28 19:32:51,613][57339] Updated weights for policy 0, policy_version 695328 (0.0035) [2024-04-28 19:32:52,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 11392270336. Throughput: 0: 55738.2. Samples: 1882567700. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:52,170][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 19:32:54,409][57339] Updated weights for policy 0, policy_version 695338 (0.0035) [2024-04-28 19:32:57,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 11392548864. Throughput: 0: 55728.9. Samples: 1882901080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:32:57,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 19:32:57,586][57339] Updated weights for policy 0, policy_version 695348 (0.0031) [2024-04-28 19:33:00,254][57339] Updated weights for policy 0, policy_version 695358 (0.0029) [2024-04-28 19:33:02,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 11392811008. Throughput: 0: 55927.4. Samples: 1883238200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-04-28 19:33:02,170][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 19:33:03,575][57339] Updated weights for policy 0, policy_version 695368 (0.0035) [2024-04-28 19:33:06,205][57339] Updated weights for policy 0, policy_version 695378 (0.0029) [2024-04-28 19:33:07,169][57108] Fps is (10 sec: 57342.5, 60 sec: 56524.7, 300 sec: 55872.2). Total num frames: 11393122304. Throughput: 0: 55826.8. Samples: 1883402720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:07,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 19:33:09,327][57339] Updated weights for policy 0, policy_version 695388 (0.0027) [2024-04-28 19:33:11,977][57339] Updated weights for policy 0, policy_version 695398 (0.0032) [2024-04-28 19:33:12,169][57108] Fps is (10 sec: 60621.6, 60 sec: 56251.7, 300 sec: 55983.3). Total num frames: 11393417216. Throughput: 0: 55978.8. Samples: 1883745000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:12,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 19:33:15,354][57339] Updated weights for policy 0, policy_version 695408 (0.0031) [2024-04-28 19:33:17,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 11393679360. Throughput: 0: 56212.9. Samples: 1884080300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:17,169][57108] Avg episode reward: [(0, '0.522')] [2024-04-28 19:33:17,818][57339] Updated weights for policy 0, policy_version 695418 (0.0032) [2024-04-28 19:33:21,408][57339] Updated weights for policy 0, policy_version 695428 (0.0036) [2024-04-28 19:33:22,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55705.5, 300 sec: 55927.7). Total num frames: 11393941504. Throughput: 0: 55952.3. Samples: 1884245780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:22,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 19:33:23,781][57339] Updated weights for policy 0, policy_version 695438 (0.0028) [2024-04-28 19:33:27,097][57339] Updated weights for policy 0, policy_version 695448 (0.0026) [2024-04-28 19:33:27,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11394220032. Throughput: 0: 55843.5. Samples: 1884579580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:27,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 19:33:27,274][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000695449_11394236416.pth... [2024-04-28 19:33:27,318][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000694631_11380834304.pth [2024-04-28 19:33:29,667][57339] Updated weights for policy 0, policy_version 695458 (0.0025) [2024-04-28 19:33:32,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55705.6, 300 sec: 55927.8). Total num frames: 11394498560. Throughput: 0: 55858.9. Samples: 1884915380. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:32,169][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 19:33:32,925][57339] Updated weights for policy 0, policy_version 695468 (0.0028) [2024-04-28 19:33:35,633][57339] Updated weights for policy 0, policy_version 695478 (0.0030) [2024-04-28 19:33:37,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11394760704. Throughput: 0: 55974.9. Samples: 1885086560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:37,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 19:33:38,658][57339] Updated weights for policy 0, policy_version 695488 (0.0027) [2024-04-28 19:33:39,055][57319] Signal inference workers to stop experience collection... (28050 times) [2024-04-28 19:33:39,101][57339] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-04-28 19:33:39,106][57319] Signal inference workers to resume experience collection... (28050 times) [2024-04-28 19:33:39,115][57339] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-04-28 19:33:41,316][57339] Updated weights for policy 0, policy_version 695498 (0.0025) [2024-04-28 19:33:42,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11395055616. Throughput: 0: 55932.8. Samples: 1885418060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:42,169][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 19:33:44,403][57339] Updated weights for policy 0, policy_version 695508 (0.0033) [2024-04-28 19:33:47,112][57339] Updated weights for policy 0, policy_version 695518 (0.0028) [2024-04-28 19:33:47,169][57108] Fps is (10 sec: 60620.1, 60 sec: 56524.7, 300 sec: 55927.7). Total num frames: 11395366912. Throughput: 0: 55828.0. Samples: 1885750460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:47,170][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 19:33:50,431][57339] Updated weights for policy 0, policy_version 695528 (0.0026) [2024-04-28 19:33:52,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 11395629056. Throughput: 0: 56048.1. Samples: 1885924880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:52,170][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 19:33:52,932][57339] Updated weights for policy 0, policy_version 695538 (0.0029) [2024-04-28 19:33:56,500][57339] Updated weights for policy 0, policy_version 695548 (0.0028) [2024-04-28 19:33:57,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.6, 300 sec: 56038.8). Total num frames: 11395923968. Throughput: 0: 55948.8. Samples: 1886262700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:33:57,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 19:33:58,841][57339] Updated weights for policy 0, policy_version 695558 (0.0028) [2024-04-28 19:34:02,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.7, 300 sec: 55927.7). Total num frames: 11396169728. Throughput: 0: 55895.0. Samples: 1886595580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:34:02,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 19:34:02,514][57339] Updated weights for policy 0, policy_version 695568 (0.0024) [2024-04-28 19:34:04,679][57339] Updated weights for policy 0, policy_version 695578 (0.0026) [2024-04-28 19:34:07,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 11396448256. Throughput: 0: 55719.9. Samples: 1886753180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:34:07,170][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 19:34:08,356][57339] Updated weights for policy 0, policy_version 695588 (0.0026) [2024-04-28 19:34:10,600][57339] Updated weights for policy 0, policy_version 695598 (0.0026) [2024-04-28 19:34:12,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.3, 300 sec: 55816.7). Total num frames: 11396726784. Throughput: 0: 55781.2. Samples: 1887089740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:34:12,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 19:34:14,182][57339] Updated weights for policy 0, policy_version 695608 (0.0026) [2024-04-28 19:34:16,375][57339] Updated weights for policy 0, policy_version 695618 (0.0025) [2024-04-28 19:34:17,169][57108] Fps is (10 sec: 55706.4, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11397005312. Throughput: 0: 55772.6. Samples: 1887425140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:34:17,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 19:34:19,888][57339] Updated weights for policy 0, policy_version 695628 (0.0031) [2024-04-28 19:34:22,169][57108] Fps is (10 sec: 58982.6, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 11397316608. Throughput: 0: 55850.5. Samples: 1887599840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:34:22,170][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 19:34:22,351][57339] Updated weights for policy 0, policy_version 695638 (0.0027) [2024-04-28 19:34:25,708][57339] Updated weights for policy 0, policy_version 695648 (0.0030) [2024-04-28 19:34:27,169][57108] Fps is (10 sec: 58981.9, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 11397595136. Throughput: 0: 56015.0. Samples: 1887938740. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:34:27,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 19:34:28,375][57339] Updated weights for policy 0, policy_version 695658 (0.0035) [2024-04-28 19:34:31,572][57339] Updated weights for policy 0, policy_version 695668 (0.0032) [2024-04-28 19:34:31,938][57319] Signal inference workers to stop experience collection... (28100 times) [2024-04-28 19:34:31,983][57339] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-04-28 19:34:31,992][57319] Signal inference workers to resume experience collection... (28100 times) [2024-04-28 19:34:31,999][57339] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-04-28 19:34:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 56038.8). Total num frames: 11397873664. Throughput: 0: 56054.3. Samples: 1888272900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:34:32,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 19:34:34,175][57339] Updated weights for policy 0, policy_version 695678 (0.0032) [2024-04-28 19:34:37,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11398119424. Throughput: 0: 55843.7. Samples: 1888437840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:34:37,169][57108] Avg episode reward: [(0, '0.532')] [2024-04-28 19:34:37,397][57339] Updated weights for policy 0, policy_version 695688 (0.0029) [2024-04-28 19:34:40,176][57339] Updated weights for policy 0, policy_version 695698 (0.0027) [2024-04-28 19:34:42,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55927.7). Total num frames: 11398414336. Throughput: 0: 55851.6. Samples: 1888776020. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:34:42,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:34:43,096][57339] Updated weights for policy 0, policy_version 695708 (0.0029) [2024-04-28 19:34:46,062][57339] Updated weights for policy 0, policy_version 695718 (0.0033) [2024-04-28 19:34:47,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55432.5, 300 sec: 55816.7). Total num frames: 11398692864. Throughput: 0: 55847.2. Samples: 1889108700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:34:47,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 19:34:48,833][57339] Updated weights for policy 0, policy_version 695728 (0.0029) [2024-04-28 19:34:51,818][57339] Updated weights for policy 0, policy_version 695738 (0.0027) [2024-04-28 19:34:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.7, 300 sec: 55872.2). Total num frames: 11398971392. Throughput: 0: 56107.8. Samples: 1889278020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:34:52,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 19:34:54,791][57339] Updated weights for policy 0, policy_version 695748 (0.0032) [2024-04-28 19:34:57,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 11399249920. Throughput: 0: 56036.9. Samples: 1889611400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:34:57,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 19:34:57,914][57339] Updated weights for policy 0, policy_version 695758 (0.0031) [2024-04-28 19:35:00,623][57339] Updated weights for policy 0, policy_version 695768 (0.0032) [2024-04-28 19:35:02,169][57108] Fps is (10 sec: 57343.8, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 11399544832. Throughput: 0: 56007.6. Samples: 1889945480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:02,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 19:35:03,879][57339] Updated weights for policy 0, policy_version 695778 (0.0030) [2024-04-28 19:35:06,342][57339] Updated weights for policy 0, policy_version 695788 (0.0027) [2024-04-28 19:35:07,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 11399806976. Throughput: 0: 55907.7. Samples: 1890115680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:07,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 19:35:09,715][57339] Updated weights for policy 0, policy_version 695798 (0.0028) [2024-04-28 19:35:12,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.9, 300 sec: 56038.8). Total num frames: 11400101888. Throughput: 0: 55809.9. Samples: 1890450180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:12,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 19:35:12,262][57339] Updated weights for policy 0, policy_version 695808 (0.0027) [2024-04-28 19:35:15,562][57339] Updated weights for policy 0, policy_version 695818 (0.0029) [2024-04-28 19:35:17,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 11400364032. Throughput: 0: 55795.9. Samples: 1890783720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:17,170][57108] Avg episode reward: [(0, '0.519')] [2024-04-28 19:35:18,409][57339] Updated weights for policy 0, policy_version 695828 (0.0023) [2024-04-28 19:35:21,356][57339] Updated weights for policy 0, policy_version 695838 (0.0027) [2024-04-28 19:35:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.7, 300 sec: 55927.8). Total num frames: 11400658944. Throughput: 0: 55646.6. Samples: 1890941940. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:22,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 19:35:24,264][57339] Updated weights for policy 0, policy_version 695848 (0.0030) [2024-04-28 19:35:27,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 11400921088. Throughput: 0: 55589.4. Samples: 1891277540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:27,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 19:35:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000695857_11400921088.pth... [2024-04-28 19:35:27,242][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000695039_11387518976.pth [2024-04-28 19:35:27,367][57339] Updated weights for policy 0, policy_version 695858 (0.0027) [2024-04-28 19:35:30,388][57339] Updated weights for policy 0, policy_version 695868 (0.0029) [2024-04-28 19:35:30,648][57319] Signal inference workers to stop experience collection... (28150 times) [2024-04-28 19:35:30,688][57339] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-04-28 19:35:30,700][57319] Signal inference workers to resume experience collection... (28150 times) [2024-04-28 19:35:30,705][57339] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-04-28 19:35:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11401216000. Throughput: 0: 55612.5. Samples: 1891611260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:32,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 19:35:33,229][57339] Updated weights for policy 0, policy_version 695878 (0.0023) [2024-04-28 19:35:36,121][57339] Updated weights for policy 0, policy_version 695888 (0.0028) [2024-04-28 19:35:37,169][57108] Fps is (10 sec: 57342.4, 60 sec: 56251.5, 300 sec: 55927.7). Total num frames: 11401494528. Throughput: 0: 55558.7. 
Samples: 1891778180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:37,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:35:39,068][57339] Updated weights for policy 0, policy_version 695898 (0.0026) [2024-04-28 19:35:42,063][57339] Updated weights for policy 0, policy_version 695908 (0.0025) [2024-04-28 19:35:42,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11401756672. Throughput: 0: 55529.4. Samples: 1892110220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:42,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 19:35:44,968][57339] Updated weights for policy 0, policy_version 695918 (0.0027) [2024-04-28 19:35:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 11402051584. Throughput: 0: 55594.4. Samples: 1892447240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:47,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 19:35:47,992][57339] Updated weights for policy 0, policy_version 695928 (0.0030) [2024-04-28 19:35:51,157][57339] Updated weights for policy 0, policy_version 695938 (0.0029) [2024-04-28 19:35:52,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.6, 300 sec: 55927.8). Total num frames: 11402330112. Throughput: 0: 55632.4. Samples: 1892619140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:52,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 19:35:53,723][57339] Updated weights for policy 0, policy_version 695948 (0.0025) [2024-04-28 19:35:56,855][57339] Updated weights for policy 0, policy_version 695958 (0.0026) [2024-04-28 19:35:57,169][57108] Fps is (10 sec: 52429.7, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 11402575872. Throughput: 0: 55608.8. Samples: 1892952580. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:35:57,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 19:35:59,485][57339] Updated weights for policy 0, policy_version 695968 (0.0031) [2024-04-28 19:36:02,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11402854400. Throughput: 0: 55648.2. Samples: 1893287880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:36:02,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 19:36:02,573][57339] Updated weights for policy 0, policy_version 695978 (0.0030) [2024-04-28 19:36:05,376][57339] Updated weights for policy 0, policy_version 695988 (0.0027) [2024-04-28 19:36:07,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11403165696. Throughput: 0: 55813.4. Samples: 1893453540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:36:07,169][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 19:36:08,552][57339] Updated weights for policy 0, policy_version 695998 (0.0024) [2024-04-28 19:36:11,374][57339] Updated weights for policy 0, policy_version 696008 (0.0034) [2024-04-28 19:36:12,169][57108] Fps is (10 sec: 60620.3, 60 sec: 55978.6, 300 sec: 55983.3). Total num frames: 11403460608. Throughput: 0: 55949.7. Samples: 1893795280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:36:12,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 19:36:14,435][57339] Updated weights for policy 0, policy_version 696018 (0.0036) [2024-04-28 19:36:17,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.8, 300 sec: 55816.7). Total num frames: 11403706368. Throughput: 0: 55950.3. Samples: 1894129020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:36:17,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:36:17,203][57339] Updated weights for policy 0, policy_version 696028 (0.0028) [2024-04-28 19:36:20,204][57339] Updated weights for policy 0, policy_version 696038 (0.0026) [2024-04-28 19:36:21,133][57319] Signal inference workers to stop experience collection... (28200 times) [2024-04-28 19:36:21,133][57319] Signal inference workers to resume experience collection... (28200 times) [2024-04-28 19:36:21,148][57339] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-04-28 19:36:21,154][57339] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-04-28 19:36:22,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 11403984896. Throughput: 0: 55921.5. Samples: 1894294640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-04-28 19:36:22,170][57108] Avg episode reward: [(0, '0.574')] [2024-04-28 19:36:23,156][57339] Updated weights for policy 0, policy_version 696048 (0.0031) [2024-04-28 19:36:25,991][57339] Updated weights for policy 0, policy_version 696058 (0.0026) [2024-04-28 19:36:27,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11404279808. Throughput: 0: 55815.6. Samples: 1894621920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:36:27,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 19:36:29,075][57339] Updated weights for policy 0, policy_version 696068 (0.0026) [2024-04-28 19:36:31,990][57339] Updated weights for policy 0, policy_version 696078 (0.0029) [2024-04-28 19:36:32,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 11404541952. Throughput: 0: 55805.9. Samples: 1894958500. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:36:32,170][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 19:36:34,813][57339] Updated weights for policy 0, policy_version 696088 (0.0028) [2024-04-28 19:36:37,169][57108] Fps is (10 sec: 52427.9, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11404804096. Throughput: 0: 55669.6. Samples: 1895124280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:36:37,169][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 19:36:38,085][57339] Updated weights for policy 0, policy_version 696098 (0.0030) [2024-04-28 19:36:40,630][57339] Updated weights for policy 0, policy_version 696108 (0.0027) [2024-04-28 19:36:42,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11405115392. Throughput: 0: 55596.4. Samples: 1895454420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:36:42,169][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:36:43,930][57339] Updated weights for policy 0, policy_version 696118 (0.0030) [2024-04-28 19:36:46,475][57339] Updated weights for policy 0, policy_version 696128 (0.0027) [2024-04-28 19:36:47,169][57108] Fps is (10 sec: 57344.9, 60 sec: 55432.7, 300 sec: 55816.7). Total num frames: 11405377536. Throughput: 0: 55606.6. Samples: 1895790180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:36:47,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 19:36:49,705][57339] Updated weights for policy 0, policy_version 696138 (0.0029) [2024-04-28 19:36:52,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.6, 300 sec: 55761.2). Total num frames: 11405656064. Throughput: 0: 55793.8. Samples: 1895964260. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:36:52,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 19:36:52,424][57339] Updated weights for policy 0, policy_version 696148 (0.0030) [2024-04-28 19:36:55,366][57339] Updated weights for policy 0, policy_version 696158 (0.0029) [2024-04-28 19:36:57,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11405934592. Throughput: 0: 55659.6. Samples: 1896299960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:36:57,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:36:58,246][57339] Updated weights for policy 0, policy_version 696168 (0.0032) [2024-04-28 19:37:01,137][57339] Updated weights for policy 0, policy_version 696178 (0.0025) [2024-04-28 19:37:02,169][57108] Fps is (10 sec: 57344.3, 60 sec: 56251.8, 300 sec: 55927.8). Total num frames: 11406229504. Throughput: 0: 55649.0. Samples: 1896633220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:02,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 19:37:04,364][57339] Updated weights for policy 0, policy_version 696188 (0.0038) [2024-04-28 19:37:06,954][57339] Updated weights for policy 0, policy_version 696198 (0.0031) [2024-04-28 19:37:07,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55705.4, 300 sec: 55816.6). Total num frames: 11406508032. Throughput: 0: 55786.2. Samples: 1896805020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:07,170][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 19:37:10,329][57339] Updated weights for policy 0, policy_version 696208 (0.0029) [2024-04-28 19:37:11,145][57319] Signal inference workers to stop experience collection... (28250 times) [2024-04-28 19:37:11,149][57319] Signal inference workers to resume experience collection... 
(28250 times) [2024-04-28 19:37:11,179][57339] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-04-28 19:37:11,179][57339] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-04-28 19:37:12,169][57108] Fps is (10 sec: 54066.5, 60 sec: 55159.5, 300 sec: 55761.1). Total num frames: 11406770176. Throughput: 0: 55894.2. Samples: 1897137160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:12,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 19:37:12,976][57339] Updated weights for policy 0, policy_version 696218 (0.0030) [2024-04-28 19:37:16,023][57339] Updated weights for policy 0, policy_version 696228 (0.0027) [2024-04-28 19:37:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 11407065088. Throughput: 0: 55962.1. Samples: 1897476800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:17,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 19:37:18,603][57339] Updated weights for policy 0, policy_version 696238 (0.0029) [2024-04-28 19:37:21,804][57339] Updated weights for policy 0, policy_version 696248 (0.0037) [2024-04-28 19:37:22,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11407343616. Throughput: 0: 55790.3. Samples: 1897634840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:22,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 19:37:24,759][57339] Updated weights for policy 0, policy_version 696258 (0.0037) [2024-04-28 19:37:27,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11407622144. Throughput: 0: 55872.8. Samples: 1897968700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:27,170][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 19:37:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000696266_11407622144.pth... 
[2024-04-28 19:37:27,225][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000695449_11394236416.pth [2024-04-28 19:37:27,740][57339] Updated weights for policy 0, policy_version 696268 (0.0030) [2024-04-28 19:37:30,635][57339] Updated weights for policy 0, policy_version 696278 (0.0034) [2024-04-28 19:37:32,169][57108] Fps is (10 sec: 57344.7, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11407917056. Throughput: 0: 55974.2. Samples: 1898309020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:32,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 19:37:33,502][57339] Updated weights for policy 0, policy_version 696288 (0.0035) [2024-04-28 19:37:36,320][57339] Updated weights for policy 0, policy_version 696298 (0.0032) [2024-04-28 19:37:37,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56524.9, 300 sec: 55927.7). Total num frames: 11408195584. Throughput: 0: 55986.9. Samples: 1898483680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:37,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 19:37:39,261][57339] Updated weights for policy 0, policy_version 696308 (0.0027) [2024-04-28 19:37:42,119][57339] Updated weights for policy 0, policy_version 696318 (0.0026) [2024-04-28 19:37:42,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11408474112. Throughput: 0: 55984.4. Samples: 1898819260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:42,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 19:37:45,131][57339] Updated weights for policy 0, policy_version 696328 (0.0026) [2024-04-28 19:37:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11408752640. Throughput: 0: 56038.5. Samples: 1899154960. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:47,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 19:37:47,928][57339] Updated weights for policy 0, policy_version 696338 (0.0027) [2024-04-28 19:37:51,262][57339] Updated weights for policy 0, policy_version 696348 (0.0028) [2024-04-28 19:37:52,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11408998400. Throughput: 0: 55908.7. Samples: 1899320900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:52,169][57108] Avg episode reward: [(0, '0.523')] [2024-04-28 19:37:53,735][57339] Updated weights for policy 0, policy_version 696358 (0.0033) [2024-04-28 19:37:57,015][57339] Updated weights for policy 0, policy_version 696368 (0.0031) [2024-04-28 19:37:57,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55978.6, 300 sec: 55872.2). Total num frames: 11409293312. Throughput: 0: 56026.1. Samples: 1899658340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:37:57,178][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 19:37:59,491][57339] Updated weights for policy 0, policy_version 696378 (0.0030) [2024-04-28 19:38:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11409555456. Throughput: 0: 55973.6. Samples: 1899995600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-04-28 19:38:02,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 19:38:02,867][57339] Updated weights for policy 0, policy_version 696388 (0.0034) [2024-04-28 19:38:05,393][57339] Updated weights for policy 0, policy_version 696398 (0.0027) [2024-04-28 19:38:07,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11409866752. Throughput: 0: 56138.3. Samples: 1900161060. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:07,170][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 19:38:08,784][57339] Updated weights for policy 0, policy_version 696408 (0.0030) [2024-04-28 19:38:11,324][57339] Updated weights for policy 0, policy_version 696418 (0.0029) [2024-04-28 19:38:12,169][57108] Fps is (10 sec: 60620.7, 60 sec: 56524.9, 300 sec: 55872.2). Total num frames: 11410161664. Throughput: 0: 56110.8. Samples: 1900493680. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:12,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 19:38:14,640][57339] Updated weights for policy 0, policy_version 696428 (0.0026) [2024-04-28 19:38:17,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 11410423808. Throughput: 0: 56004.0. Samples: 1900829200. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:17,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 19:38:17,293][57319] Signal inference workers to stop experience collection... (28300 times) [2024-04-28 19:38:17,295][57319] Signal inference workers to resume experience collection... (28300 times) [2024-04-28 19:38:17,306][57339] Updated weights for policy 0, policy_version 696438 (0.0026) [2024-04-28 19:38:17,330][57339] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-04-28 19:38:17,330][57339] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-04-28 19:38:20,472][57339] Updated weights for policy 0, policy_version 696448 (0.0033) [2024-04-28 19:38:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55978.9, 300 sec: 55872.2). Total num frames: 11410702336. Throughput: 0: 55900.6. Samples: 1900999200. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:22,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 19:38:23,051][57339] Updated weights for policy 0, policy_version 696458 (0.0027) [2024-04-28 19:38:26,349][57339] Updated weights for policy 0, policy_version 696468 (0.0027) [2024-04-28 19:38:27,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11410980864. Throughput: 0: 55984.8. Samples: 1901338580. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:27,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 19:38:28,871][57339] Updated weights for policy 0, policy_version 696478 (0.0033) [2024-04-28 19:38:32,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11411226624. Throughput: 0: 56012.5. Samples: 1901675520. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:32,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 19:38:32,371][57339] Updated weights for policy 0, policy_version 696488 (0.0033) [2024-04-28 19:38:34,621][57339] Updated weights for policy 0, policy_version 696498 (0.0028) [2024-04-28 19:38:37,169][57108] Fps is (10 sec: 52428.4, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 11411505152. Throughput: 0: 55754.8. Samples: 1901829880. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:37,170][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 19:38:38,235][57339] Updated weights for policy 0, policy_version 696508 (0.0023) [2024-04-28 19:38:40,548][57339] Updated weights for policy 0, policy_version 696518 (0.0030) [2024-04-28 19:38:42,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11411800064. Throughput: 0: 55606.7. Samples: 1902160640. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:42,170][57108] Avg episode reward: [(0, '0.570')] [2024-04-28 19:38:44,171][57339] Updated weights for policy 0, policy_version 696528 (0.0025) [2024-04-28 19:38:46,325][57339] Updated weights for policy 0, policy_version 696538 (0.0028) [2024-04-28 19:38:47,169][57108] Fps is (10 sec: 62259.7, 60 sec: 56251.7, 300 sec: 55927.8). Total num frames: 11412127744. Throughput: 0: 55511.8. Samples: 1902493640. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:47,169][57108] Avg episode reward: [(0, '0.490')] [2024-04-28 19:38:49,937][57339] Updated weights for policy 0, policy_version 696548 (0.0026) [2024-04-28 19:38:52,129][57339] Updated weights for policy 0, policy_version 696558 (0.0030) [2024-04-28 19:38:52,169][57108] Fps is (10 sec: 60620.8, 60 sec: 56797.7, 300 sec: 55872.2). Total num frames: 11412406272. Throughput: 0: 55986.7. Samples: 1902680460. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:52,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 19:38:55,751][57339] Updated weights for policy 0, policy_version 696568 (0.0027) [2024-04-28 19:38:57,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55978.8, 300 sec: 55872.2). Total num frames: 11412652032. Throughput: 0: 56009.8. Samples: 1903014120. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:38:57,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 19:38:58,042][57339] Updated weights for policy 0, policy_version 696578 (0.0037) [2024-04-28 19:39:01,771][57339] Updated weights for policy 0, policy_version 696588 (0.0027) [2024-04-28 19:39:02,169][57108] Fps is (10 sec: 50791.0, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11412914176. Throughput: 0: 55958.3. Samples: 1903347320. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:39:02,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 19:39:03,862][57339] Updated weights for policy 0, policy_version 696598 (0.0024) [2024-04-28 19:39:07,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.6, 300 sec: 55761.2). Total num frames: 11413176320. Throughput: 0: 55535.9. Samples: 1903498320. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:39:07,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 19:39:07,508][57339] Updated weights for policy 0, policy_version 696608 (0.0026) [2024-04-28 19:39:09,707][57339] Updated weights for policy 0, policy_version 696618 (0.0029) [2024-04-28 19:39:12,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11413471232. Throughput: 0: 55524.6. Samples: 1903837180. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:39:12,169][57108] Avg episode reward: [(0, '0.560')] [2024-04-28 19:39:13,385][57339] Updated weights for policy 0, policy_version 696628 (0.0029) [2024-04-28 19:39:15,515][57339] Updated weights for policy 0, policy_version 696638 (0.0027) [2024-04-28 19:39:17,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11413766144. Throughput: 0: 55575.8. Samples: 1904176440. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:39:17,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:39:19,027][57319] Signal inference workers to stop experience collection... (28350 times) [2024-04-28 19:39:19,064][57339] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-04-28 19:39:19,119][57319] Signal inference workers to resume experience collection... 
(28350 times) [2024-04-28 19:39:19,119][57339] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-04-28 19:39:19,225][57339] Updated weights for policy 0, policy_version 696648 (0.0026) [2024-04-28 19:39:21,414][57339] Updated weights for policy 0, policy_version 696658 (0.0030) [2024-04-28 19:39:22,169][57108] Fps is (10 sec: 60620.1, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11414077440. Throughput: 0: 55965.0. Samples: 1904348300. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:39:22,170][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 19:39:25,212][57339] Updated weights for policy 0, policy_version 696668 (0.0030) [2024-04-28 19:39:27,169][57108] Fps is (10 sec: 58983.5, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 11414355968. Throughput: 0: 55972.1. Samples: 1904679380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:39:27,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 19:39:27,232][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000696678_11414372352.pth... [2024-04-28 19:39:27,236][57339] Updated weights for policy 0, policy_version 696678 (0.0027) [2024-04-28 19:39:27,289][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000695857_11400921088.pth [2024-04-28 19:39:31,032][57339] Updated weights for policy 0, policy_version 696688 (0.0024) [2024-04-28 19:39:32,169][57108] Fps is (10 sec: 52429.6, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11414601728. Throughput: 0: 55991.3. Samples: 1905013240. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:39:32,169][57108] Avg episode reward: [(0, '0.507')] [2024-04-28 19:39:33,104][57339] Updated weights for policy 0, policy_version 696698 (0.0029) [2024-04-28 19:39:37,115][57339] Updated weights for policy 0, policy_version 696708 (0.0031) [2024-04-28 19:39:37,169][57108] Fps is (10 sec: 50789.4, 60 sec: 55978.7, 300 sec: 55761.1). 
Total num frames: 11414863872. Throughput: 0: 55421.2. Samples: 1905174420. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:39:37,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 19:39:39,044][57339] Updated weights for policy 0, policy_version 696718 (0.0030) [2024-04-28 19:39:42,169][57108] Fps is (10 sec: 52427.4, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11415126016. Throughput: 0: 55405.1. Samples: 1905507360. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) [2024-04-28 19:39:42,170][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 19:39:42,927][57339] Updated weights for policy 0, policy_version 696728 (0.0027) [2024-04-28 19:39:44,861][57339] Updated weights for policy 0, policy_version 696738 (0.0030) [2024-04-28 19:39:47,169][57108] Fps is (10 sec: 54067.9, 60 sec: 54613.4, 300 sec: 55705.6). Total num frames: 11415404544. Throughput: 0: 55456.8. Samples: 1905842880. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:39:47,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 19:39:48,917][57339] Updated weights for policy 0, policy_version 696748 (0.0037) [2024-04-28 19:39:50,840][57339] Updated weights for policy 0, policy_version 696758 (0.0032) [2024-04-28 19:39:52,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55159.5, 300 sec: 55816.7). Total num frames: 11415715840. Throughput: 0: 55739.5. Samples: 1906006600. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:39:52,169][57108] Avg episode reward: [(0, '0.667')] [2024-04-28 19:39:54,770][57339] Updated weights for policy 0, policy_version 696768 (0.0031) [2024-04-28 19:39:56,676][57339] Updated weights for policy 0, policy_version 696778 (0.0030) [2024-04-28 19:39:57,169][57108] Fps is (10 sec: 62258.7, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11416027136. Throughput: 0: 55596.7. Samples: 1906339040. 
Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:39:57,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 19:40:00,770][57339] Updated weights for policy 0, policy_version 696788 (0.0026) [2024-04-28 19:40:02,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 11416289280. Throughput: 0: 55471.3. Samples: 1906672640. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:02,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:40:02,502][57339] Updated weights for policy 0, policy_version 696798 (0.0030) [2024-04-28 19:40:04,163][57319] Signal inference workers to stop experience collection... (28400 times) [2024-04-28 19:40:04,163][57319] Signal inference workers to resume experience collection... (28400 times) [2024-04-28 19:40:04,189][57339] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-04-28 19:40:04,189][57339] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-04-28 19:40:06,592][57339] Updated weights for policy 0, policy_version 696808 (0.0034) [2024-04-28 19:40:07,169][57108] Fps is (10 sec: 50790.8, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11416535040. Throughput: 0: 55377.4. Samples: 1906840280. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:07,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 19:40:08,569][57339] Updated weights for policy 0, policy_version 696818 (0.0026) [2024-04-28 19:40:12,169][57108] Fps is (10 sec: 50790.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11416797184. Throughput: 0: 55544.3. Samples: 1907178880. 
Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:12,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 19:40:12,460][57339] Updated weights for policy 0, policy_version 696828 (0.0034) [2024-04-28 19:40:14,388][57339] Updated weights for policy 0, policy_version 696838 (0.0026) [2024-04-28 19:40:17,169][57108] Fps is (10 sec: 52428.4, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 11417059328. Throughput: 0: 55574.0. Samples: 1907514080. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:17,169][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 19:40:18,324][57339] Updated weights for policy 0, policy_version 696848 (0.0032) [2024-04-28 19:40:20,312][57339] Updated weights for policy 0, policy_version 696858 (0.0027) [2024-04-28 19:40:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 54613.3, 300 sec: 55705.6). Total num frames: 11417354240. Throughput: 0: 55405.9. Samples: 1907667680. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:22,169][57108] Avg episode reward: [(0, '0.553')] [2024-04-28 19:40:24,303][57339] Updated weights for policy 0, policy_version 696868 (0.0026) [2024-04-28 19:40:26,094][57339] Updated weights for policy 0, policy_version 696878 (0.0026) [2024-04-28 19:40:27,169][57108] Fps is (10 sec: 60621.4, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 11417665536. Throughput: 0: 55392.7. Samples: 1908000020. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:27,169][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 19:40:30,330][57339] Updated weights for policy 0, policy_version 696888 (0.0037) [2024-04-28 19:40:32,071][57339] Updated weights for policy 0, policy_version 696898 (0.0025) [2024-04-28 19:40:32,169][57108] Fps is (10 sec: 62258.6, 60 sec: 56251.5, 300 sec: 55872.2). Total num frames: 11417976832. Throughput: 0: 55268.7. Samples: 1908329980. 
Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:32,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 19:40:35,983][57339] Updated weights for policy 0, policy_version 696908 (0.0028) [2024-04-28 19:40:37,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55705.7, 300 sec: 55761.1). Total num frames: 11418206208. Throughput: 0: 55578.7. Samples: 1908507640. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:37,175][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 19:40:38,172][57339] Updated weights for policy 0, policy_version 696918 (0.0026) [2024-04-28 19:40:41,737][57339] Updated weights for policy 0, policy_version 696928 (0.0033) [2024-04-28 19:40:42,169][57108] Fps is (10 sec: 50790.6, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11418484736. Throughput: 0: 55810.2. Samples: 1908850500. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:42,170][57108] Avg episode reward: [(0, '0.515')] [2024-04-28 19:40:44,313][57339] Updated weights for policy 0, policy_version 696938 (0.0028) [2024-04-28 19:40:47,169][57108] Fps is (10 sec: 54066.8, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 11418746880. Throughput: 0: 55734.1. Samples: 1909180680. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:47,170][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 19:40:47,641][57339] Updated weights for policy 0, policy_version 696948 (0.0033) [2024-04-28 19:40:50,021][57339] Updated weights for policy 0, policy_version 696958 (0.0030) [2024-04-28 19:40:52,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55159.6, 300 sec: 55761.1). Total num frames: 11419025408. Throughput: 0: 55357.8. Samples: 1909331380. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:52,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 19:40:53,109][57319] Signal inference workers to stop experience collection... 
(28450 times) [2024-04-28 19:40:53,110][57319] Signal inference workers to resume experience collection... (28450 times) [2024-04-28 19:40:53,143][57339] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-04-28 19:40:53,144][57339] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-04-28 19:40:53,546][57339] Updated weights for policy 0, policy_version 696968 (0.0031) [2024-04-28 19:40:55,689][57339] Updated weights for policy 0, policy_version 696978 (0.0027) [2024-04-28 19:40:57,169][57108] Fps is (10 sec: 55706.5, 60 sec: 54613.5, 300 sec: 55761.1). Total num frames: 11419303936. Throughput: 0: 55310.8. Samples: 1909667860. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:40:57,169][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 19:40:59,264][57339] Updated weights for policy 0, policy_version 696988 (0.0028) [2024-04-28 19:41:01,583][57339] Updated weights for policy 0, policy_version 696998 (0.0029) [2024-04-28 19:41:02,169][57108] Fps is (10 sec: 58981.6, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11419615232. Throughput: 0: 55318.2. Samples: 1910003400. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:41:02,170][57108] Avg episode reward: [(0, '0.563')] [2024-04-28 19:41:05,223][57339] Updated weights for policy 0, policy_version 697008 (0.0029) [2024-04-28 19:41:07,169][57108] Fps is (10 sec: 60620.6, 60 sec: 56251.8, 300 sec: 55761.1). Total num frames: 11419910144. Throughput: 0: 55979.2. Samples: 1910186740. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:41:07,169][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 19:41:07,827][57339] Updated weights for policy 0, policy_version 697018 (0.0032) [2024-04-28 19:41:11,146][57339] Updated weights for policy 0, policy_version 697028 (0.0029) [2024-04-28 19:41:12,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56524.8, 300 sec: 55872.2). Total num frames: 11420188672. Throughput: 0: 56089.8. 
Samples: 1910524060. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:41:12,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 19:41:13,593][57339] Updated weights for policy 0, policy_version 697038 (0.0030) [2024-04-28 19:41:16,888][57339] Updated weights for policy 0, policy_version 697048 (0.0032) [2024-04-28 19:41:17,169][57108] Fps is (10 sec: 54066.7, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 11420450816. Throughput: 0: 56192.1. Samples: 1910858620. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:41:17,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 19:41:19,341][57339] Updated weights for policy 0, policy_version 697058 (0.0029) [2024-04-28 19:41:22,169][57108] Fps is (10 sec: 54067.0, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11420729344. Throughput: 0: 55792.0. Samples: 1911018280. Policy #0 lag: (min: 0.0, avg: 14.2, max: 23.0) [2024-04-28 19:41:22,169][57108] Avg episode reward: [(0, '0.649')] [2024-04-28 19:41:22,708][57339] Updated weights for policy 0, policy_version 697068 (0.0025) [2024-04-28 19:41:25,499][57339] Updated weights for policy 0, policy_version 697078 (0.0032) [2024-04-28 19:41:27,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 11420991488. Throughput: 0: 55761.5. Samples: 1911359760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:41:27,169][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 19:41:27,214][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000697083_11421007872.pth... 
[2024-04-28 19:41:27,260][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000696266_11407622144.pth [2024-04-28 19:41:28,518][57339] Updated weights for policy 0, policy_version 697088 (0.0027) [2024-04-28 19:41:31,271][57339] Updated weights for policy 0, policy_version 697098 (0.0035) [2024-04-28 19:41:32,169][57108] Fps is (10 sec: 54067.6, 60 sec: 54886.6, 300 sec: 55816.7). Total num frames: 11421270016. Throughput: 0: 55985.1. Samples: 1911700000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:41:32,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 19:41:34,443][57339] Updated weights for policy 0, policy_version 697108 (0.0028) [2024-04-28 19:41:37,075][57339] Updated weights for policy 0, policy_version 697118 (0.0028) [2024-04-28 19:41:37,169][57108] Fps is (10 sec: 58981.7, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11421581312. Throughput: 0: 56321.2. Samples: 1911865840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:41:37,170][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 19:41:40,170][57339] Updated weights for policy 0, policy_version 697128 (0.0030) [2024-04-28 19:41:42,169][57108] Fps is (10 sec: 58982.0, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11421859840. Throughput: 0: 56229.2. Samples: 1912198180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:41:42,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 19:41:42,946][57339] Updated weights for policy 0, policy_version 697138 (0.0025) [2024-04-28 19:41:46,251][57339] Updated weights for policy 0, policy_version 697148 (0.0026) [2024-04-28 19:41:47,169][57108] Fps is (10 sec: 57344.5, 60 sec: 56798.0, 300 sec: 55927.7). Total num frames: 11422154752. Throughput: 0: 56175.7. Samples: 1912531300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:41:47,169][57108] Avg episode reward: [(0, '0.602')] [2024-04-28 19:41:49,000][57339] Updated weights for policy 0, policy_version 697158 (0.0035) [2024-04-28 19:41:52,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11422384128. Throughput: 0: 55899.6. Samples: 1912702220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:41:52,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 19:41:52,194][57339] Updated weights for policy 0, policy_version 697168 (0.0027) [2024-04-28 19:41:54,840][57339] Updated weights for policy 0, policy_version 697178 (0.0028) [2024-04-28 19:41:57,169][57108] Fps is (10 sec: 52428.9, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11422679040. Throughput: 0: 55852.0. Samples: 1913037400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:41:57,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 19:41:57,906][57339] Updated weights for policy 0, policy_version 697188 (0.0030) [2024-04-28 19:42:00,609][57319] Signal inference workers to stop experience collection... (28500 times) [2024-04-28 19:42:00,646][57339] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-04-28 19:42:00,700][57319] Signal inference workers to resume experience collection... (28500 times) [2024-04-28 19:42:00,700][57339] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-04-28 19:42:00,813][57339] Updated weights for policy 0, policy_version 697198 (0.0031) [2024-04-28 19:42:02,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.7, 300 sec: 55705.6). Total num frames: 11422941184. Throughput: 0: 55753.0. Samples: 1913367500. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:02,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 19:42:03,813][57339] Updated weights for policy 0, policy_version 697208 (0.0031) [2024-04-28 19:42:06,669][57339] Updated weights for policy 0, policy_version 697218 (0.0027) [2024-04-28 19:42:07,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 11423219712. Throughput: 0: 55697.7. Samples: 1913524680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:07,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 19:42:09,709][57339] Updated weights for policy 0, policy_version 697228 (0.0026) [2024-04-28 19:42:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11423498240. Throughput: 0: 55540.4. Samples: 1913859080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:12,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 19:42:12,656][57339] Updated weights for policy 0, policy_version 697238 (0.0026) [2024-04-28 19:42:15,480][57339] Updated weights for policy 0, policy_version 697248 (0.0027) [2024-04-28 19:42:17,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 11423793152. Throughput: 0: 55426.6. Samples: 1914194200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:17,169][57108] Avg episode reward: [(0, '0.625')] [2024-04-28 19:42:18,510][57339] Updated weights for policy 0, policy_version 697258 (0.0030) [2024-04-28 19:42:21,468][57339] Updated weights for policy 0, policy_version 697268 (0.0030) [2024-04-28 19:42:22,169][57108] Fps is (10 sec: 58982.8, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 11424088064. Throughput: 0: 55754.0. Samples: 1914374760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:22,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 19:42:24,530][57339] Updated weights for policy 0, policy_version 697278 (0.0031) [2024-04-28 19:42:27,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.7, 300 sec: 55705.6). Total num frames: 11424350208. Throughput: 0: 55766.8. Samples: 1914707680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:27,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 19:42:27,405][57339] Updated weights for policy 0, policy_version 697288 (0.0031) [2024-04-28 19:42:30,521][57339] Updated weights for policy 0, policy_version 697298 (0.0027) [2024-04-28 19:42:32,169][57108] Fps is (10 sec: 52428.8, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11424612352. Throughput: 0: 55781.9. Samples: 1915041480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:32,169][57108] Avg episode reward: [(0, '0.579')] [2024-04-28 19:42:33,227][57339] Updated weights for policy 0, policy_version 697308 (0.0029) [2024-04-28 19:42:36,312][57339] Updated weights for policy 0, policy_version 697318 (0.0027) [2024-04-28 19:42:37,169][57108] Fps is (10 sec: 54066.4, 60 sec: 55159.5, 300 sec: 55650.0). Total num frames: 11424890880. Throughput: 0: 55426.0. Samples: 1915196400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:37,170][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 19:42:38,964][57339] Updated weights for policy 0, policy_version 697328 (0.0030) [2024-04-28 19:42:42,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11425169408. Throughput: 0: 55480.9. Samples: 1915534040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:42,169][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 19:42:42,204][57339] Updated weights for policy 0, policy_version 697338 (0.0028) [2024-04-28 19:42:44,883][57339] Updated weights for policy 0, policy_version 697348 (0.0033) [2024-04-28 19:42:47,169][57108] Fps is (10 sec: 55705.9, 60 sec: 54886.4, 300 sec: 55761.1). Total num frames: 11425447936. Throughput: 0: 55543.9. Samples: 1915866980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:47,178][57108] Avg episode reward: [(0, '0.606')] [2024-04-28 19:42:48,447][57339] Updated weights for policy 0, policy_version 697358 (0.0032) [2024-04-28 19:42:50,785][57339] Updated weights for policy 0, policy_version 697368 (0.0027) [2024-04-28 19:42:52,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11425742848. Throughput: 0: 55760.5. Samples: 1916033900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:52,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:42:54,345][57339] Updated weights for policy 0, policy_version 697378 (0.0030) [2024-04-28 19:42:56,539][57339] Updated weights for policy 0, policy_version 697388 (0.0031) [2024-04-28 19:42:57,169][57108] Fps is (10 sec: 58983.1, 60 sec: 55978.7, 300 sec: 55872.2). Total num frames: 11426037760. Throughput: 0: 55713.4. Samples: 1916366180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:42:57,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 19:43:00,084][57339] Updated weights for policy 0, policy_version 697398 (0.0034) [2024-04-28 19:43:00,981][57319] Signal inference workers to stop experience collection... (28550 times) [2024-04-28 19:43:01,017][57339] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-04-28 19:43:01,071][57319] Signal inference workers to resume experience collection... 
(28550 times) [2024-04-28 19:43:01,071][57339] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-04-28 19:43:02,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.5, 300 sec: 55705.6). Total num frames: 11426299904. Throughput: 0: 55747.0. Samples: 1916702820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-04-28 19:43:02,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 19:43:02,342][57339] Updated weights for policy 0, policy_version 697408 (0.0025) [2024-04-28 19:43:06,087][57339] Updated weights for policy 0, policy_version 697418 (0.0033) [2024-04-28 19:43:07,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55978.8, 300 sec: 55650.1). Total num frames: 11426578432. Throughput: 0: 55616.4. Samples: 1916877500. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:07,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 19:43:08,332][57339] Updated weights for policy 0, policy_version 697428 (0.0027) [2024-04-28 19:43:12,048][57339] Updated weights for policy 0, policy_version 697438 (0.0025) [2024-04-28 19:43:12,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11426824192. Throughput: 0: 55527.9. Samples: 1917206440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:12,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 19:43:14,346][57339] Updated weights for policy 0, policy_version 697448 (0.0027) [2024-04-28 19:43:17,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11427102720. Throughput: 0: 55538.6. Samples: 1917540720. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:17,169][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 19:43:17,899][57339] Updated weights for policy 0, policy_version 697458 (0.0033) [2024-04-28 19:43:20,124][57339] Updated weights for policy 0, policy_version 697468 (0.0031) [2024-04-28 19:43:22,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55432.5, 300 sec: 55705.6). Total num frames: 11427414016. Throughput: 0: 55666.8. Samples: 1917701400. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:22,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:43:23,696][57339] Updated weights for policy 0, policy_version 697478 (0.0035) [2024-04-28 19:43:26,110][57339] Updated weights for policy 0, policy_version 697488 (0.0031) [2024-04-28 19:43:27,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11427692544. Throughput: 0: 55556.9. Samples: 1918034100. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:27,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 19:43:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000697491_11427692544.pth... [2024-04-28 19:43:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000696678_11414372352.pth [2024-04-28 19:43:29,512][57339] Updated weights for policy 0, policy_version 697498 (0.0031) [2024-04-28 19:43:31,930][57339] Updated weights for policy 0, policy_version 697508 (0.0031) [2024-04-28 19:43:32,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.7, 300 sec: 55872.3). Total num frames: 11427987456. Throughput: 0: 55555.2. Samples: 1918366960. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:32,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 19:43:35,541][57339] Updated weights for policy 0, policy_version 697518 (0.0032) [2024-04-28 19:43:37,169][57108] Fps is (10 sec: 57342.9, 60 sec: 56251.6, 300 sec: 55816.6). 
Total num frames: 11428265984. Throughput: 0: 55719.8. Samples: 1918541300. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:37,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 19:43:37,851][57339] Updated weights for policy 0, policy_version 697528 (0.0032) [2024-04-28 19:43:41,455][57339] Updated weights for policy 0, policy_version 697538 (0.0031) [2024-04-28 19:43:42,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 11428511744. Throughput: 0: 55768.3. Samples: 1918875760. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:42,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 19:43:43,661][57339] Updated weights for policy 0, policy_version 697548 (0.0027) [2024-04-28 19:43:47,169][57108] Fps is (10 sec: 50790.5, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 11428773888. Throughput: 0: 55789.2. Samples: 1919213340. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:47,170][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 19:43:47,299][57339] Updated weights for policy 0, policy_version 697558 (0.0031) [2024-04-28 19:43:49,568][57339] Updated weights for policy 0, policy_version 697568 (0.0030) [2024-04-28 19:43:52,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.4, 300 sec: 55594.5). Total num frames: 11429052416. Throughput: 0: 55259.9. Samples: 1919364200. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:52,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:43:53,118][57339] Updated weights for policy 0, policy_version 697578 (0.0029) [2024-04-28 19:43:55,383][57339] Updated weights for policy 0, policy_version 697588 (0.0020) [2024-04-28 19:43:57,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55432.3, 300 sec: 55761.1). Total num frames: 11429363712. Throughput: 0: 55422.9. Samples: 1919700480. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:43:57,169][57108] Avg episode reward: [(0, '0.586')] [2024-04-28 19:43:58,999][57339] Updated weights for policy 0, policy_version 697598 (0.0029) [2024-04-28 19:44:01,280][57339] Updated weights for policy 0, policy_version 697608 (0.0034) [2024-04-28 19:44:02,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11429642240. Throughput: 0: 55435.1. Samples: 1920035300. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:44:02,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 19:44:04,935][57339] Updated weights for policy 0, policy_version 697618 (0.0027) [2024-04-28 19:44:05,305][57319] Signal inference workers to stop experience collection... (28600 times) [2024-04-28 19:44:05,306][57319] Signal inference workers to resume experience collection... (28600 times) [2024-04-28 19:44:05,332][57339] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-04-28 19:44:05,332][57339] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-04-28 19:44:07,074][57339] Updated weights for policy 0, policy_version 697628 (0.0026) [2024-04-28 19:44:07,169][57108] Fps is (10 sec: 57345.4, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11429937152. Throughput: 0: 55870.7. Samples: 1920215580. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:44:07,169][57108] Avg episode reward: [(0, '0.652')] [2024-04-28 19:44:10,712][57339] Updated weights for policy 0, policy_version 697638 (0.0025) [2024-04-28 19:44:12,169][57108] Fps is (10 sec: 55705.2, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11430199296. Throughput: 0: 55984.4. Samples: 1920553400. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:44:12,170][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 19:44:13,144][57339] Updated weights for policy 0, policy_version 697648 (0.0036) [2024-04-28 19:44:16,503][57339] Updated weights for policy 0, policy_version 697658 (0.0027) [2024-04-28 19:44:17,169][57108] Fps is (10 sec: 52428.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 11430461440. Throughput: 0: 55861.4. Samples: 1920880720. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:44:17,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 19:44:18,893][57339] Updated weights for policy 0, policy_version 697668 (0.0033) [2024-04-28 19:44:22,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55159.3, 300 sec: 55483.4). Total num frames: 11430723584. Throughput: 0: 55555.2. Samples: 1921041280. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:44:22,170][57108] Avg episode reward: [(0, '0.637')] [2024-04-28 19:44:22,486][57339] Updated weights for policy 0, policy_version 697678 (0.0030) [2024-04-28 19:44:24,776][57339] Updated weights for policy 0, policy_version 697688 (0.0028) [2024-04-28 19:44:27,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11431018496. Throughput: 0: 55549.3. Samples: 1921375480. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:44:27,170][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 19:44:28,294][57339] Updated weights for policy 0, policy_version 697698 (0.0030) [2024-04-28 19:44:30,544][57339] Updated weights for policy 0, policy_version 697708 (0.0027) [2024-04-28 19:44:32,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55432.5, 300 sec: 55761.2). Total num frames: 11431313408. Throughput: 0: 55567.7. Samples: 1921713880. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:44:32,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 19:44:33,965][57339] Updated weights for policy 0, policy_version 697718 (0.0032) [2024-04-28 19:44:36,415][57339] Updated weights for policy 0, policy_version 697728 (0.0029) [2024-04-28 19:44:37,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55705.8, 300 sec: 55872.2). Total num frames: 11431608320. Throughput: 0: 56114.8. Samples: 1921889360. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:44:37,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 19:44:39,969][57339] Updated weights for policy 0, policy_version 697738 (0.0028) [2024-04-28 19:44:42,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 11431870464. Throughput: 0: 56048.9. Samples: 1922222680. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-04-28 19:44:42,178][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 19:44:42,431][57339] Updated weights for policy 0, policy_version 697748 (0.0029) [2024-04-28 19:44:45,885][57339] Updated weights for policy 0, policy_version 697758 (0.0027) [2024-04-28 19:44:47,169][57108] Fps is (10 sec: 54066.6, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11432148992. Throughput: 0: 55925.2. Samples: 1922551940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:44:47,170][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 19:44:48,321][57339] Updated weights for policy 0, policy_version 697768 (0.0032) [2024-04-28 19:44:51,667][57339] Updated weights for policy 0, policy_version 697778 (0.0025) [2024-04-28 19:44:52,169][57108] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 55594.5). Total num frames: 11432427520. Throughput: 0: 55597.1. Samples: 1922717460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:44:52,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 19:44:54,276][57339] Updated weights for policy 0, policy_version 697788 (0.0038) [2024-04-28 19:44:57,169][57108] Fps is (10 sec: 52430.0, 60 sec: 55159.7, 300 sec: 55539.0). Total num frames: 11432673280. Throughput: 0: 55649.5. Samples: 1923057620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:44:57,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 19:44:57,504][57339] Updated weights for policy 0, policy_version 697798 (0.0029) [2024-04-28 19:44:57,514][57319] Signal inference workers to stop experience collection... (28650 times) [2024-04-28 19:44:57,518][57319] Signal inference workers to resume experience collection... (28650 times) [2024-04-28 19:44:57,532][57339] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-04-28 19:44:57,532][57339] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-04-28 19:45:00,253][57339] Updated weights for policy 0, policy_version 697808 (0.0030) [2024-04-28 19:45:02,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11432968192. Throughput: 0: 55814.6. Samples: 1923392380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:02,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 19:45:03,360][57339] Updated weights for policy 0, policy_version 697818 (0.0030) [2024-04-28 19:45:06,234][57339] Updated weights for policy 0, policy_version 697828 (0.0028) [2024-04-28 19:45:07,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 11433263104. Throughput: 0: 55777.9. Samples: 1923551280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:07,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 19:45:09,433][57339] Updated weights for policy 0, policy_version 697838 (0.0028) [2024-04-28 19:45:12,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.6, 300 sec: 55816.7). Total num frames: 11433525248. Throughput: 0: 55641.4. Samples: 1923879340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:12,169][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 19:45:12,201][57339] Updated weights for policy 0, policy_version 697848 (0.0028) [2024-04-28 19:45:15,221][57339] Updated weights for policy 0, policy_version 697858 (0.0033) [2024-04-28 19:45:17,169][57108] Fps is (10 sec: 57344.0, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11433836544. Throughput: 0: 55512.4. Samples: 1924211940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:17,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:45:17,985][57339] Updated weights for policy 0, policy_version 697868 (0.0026) [2024-04-28 19:45:21,166][57339] Updated weights for policy 0, policy_version 697878 (0.0040) [2024-04-28 19:45:22,169][57108] Fps is (10 sec: 57343.7, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11434098688. Throughput: 0: 55478.6. Samples: 1924385900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:22,170][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 19:45:23,731][57339] Updated weights for policy 0, policy_version 697888 (0.0025) [2024-04-28 19:45:27,086][57339] Updated weights for policy 0, policy_version 697898 (0.0030) [2024-04-28 19:45:27,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11434360832. Throughput: 0: 55441.1. Samples: 1924717520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:27,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 19:45:27,182][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000697898_11434360832.pth... [2024-04-28 19:45:27,238][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000697083_11421007872.pth [2024-04-28 19:45:29,687][57339] Updated weights for policy 0, policy_version 697908 (0.0031) [2024-04-28 19:45:32,170][57108] Fps is (10 sec: 50786.6, 60 sec: 54885.7, 300 sec: 55594.4). Total num frames: 11434606592. Throughput: 0: 55431.1. Samples: 1925046380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:32,170][57108] Avg episode reward: [(0, '0.531')] [2024-04-28 19:45:33,002][57339] Updated weights for policy 0, policy_version 697918 (0.0029) [2024-04-28 19:45:35,775][57339] Updated weights for policy 0, policy_version 697928 (0.0030) [2024-04-28 19:45:37,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11434917888. Throughput: 0: 55355.2. Samples: 1925208440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:37,169][57108] Avg episode reward: [(0, '0.529')] [2024-04-28 19:45:38,839][57339] Updated weights for policy 0, policy_version 697938 (0.0027) [2024-04-28 19:45:41,544][57339] Updated weights for policy 0, policy_version 697948 (0.0031) [2024-04-28 19:45:42,169][57108] Fps is (10 sec: 57349.1, 60 sec: 55159.7, 300 sec: 55705.6). Total num frames: 11435180032. Throughput: 0: 55249.7. Samples: 1925543860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:42,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 19:45:44,678][57339] Updated weights for policy 0, policy_version 697958 (0.0040) [2024-04-28 19:45:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 11435474944. Throughput: 0: 55252.4. Samples: 1925878740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:47,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 19:45:47,793][57339] Updated weights for policy 0, policy_version 697968 (0.0032) [2024-04-28 19:45:50,496][57339] Updated weights for policy 0, policy_version 697978 (0.0026) [2024-04-28 19:45:52,169][57108] Fps is (10 sec: 58982.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11435769856. Throughput: 0: 55522.3. Samples: 1926049780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:52,169][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 19:45:53,861][57339] Updated weights for policy 0, policy_version 697988 (0.0029) [2024-04-28 19:45:56,699][57339] Updated weights for policy 0, policy_version 697998 (0.0027) [2024-04-28 19:45:57,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55978.4, 300 sec: 55650.0). Total num frames: 11436032000. Throughput: 0: 55663.4. Samples: 1926384200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:45:57,170][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 19:45:59,781][57339] Updated weights for policy 0, policy_version 698008 (0.0030) [2024-04-28 19:46:02,169][57108] Fps is (10 sec: 52427.8, 60 sec: 55432.3, 300 sec: 55538.9). Total num frames: 11436294144. Throughput: 0: 55611.4. Samples: 1926714460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:46:02,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 19:46:02,534][57339] Updated weights for policy 0, policy_version 698018 (0.0026) [2024-04-28 19:46:05,518][57339] Updated weights for policy 0, policy_version 698028 (0.0029) [2024-04-28 19:46:07,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11436572672. Throughput: 0: 55217.0. Samples: 1926870660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:46:07,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 19:46:08,310][57339] Updated weights for policy 0, policy_version 698038 (0.0027) [2024-04-28 19:46:11,478][57339] Updated weights for policy 0, policy_version 698048 (0.0029) [2024-04-28 19:46:12,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11436851200. Throughput: 0: 55277.2. Samples: 1927205000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:46:12,170][57108] Avg episode reward: [(0, '0.558')] [2024-04-28 19:46:14,564][57339] Updated weights for policy 0, policy_version 698058 (0.0040) [2024-04-28 19:46:14,927][57319] Signal inference workers to stop experience collection... (28700 times) [2024-04-28 19:46:14,927][57319] Signal inference workers to resume experience collection... (28700 times) [2024-04-28 19:46:14,949][57339] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-04-28 19:46:14,949][57339] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-04-28 19:46:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 54613.4, 300 sec: 55539.0). Total num frames: 11437113344. Throughput: 0: 55299.2. Samples: 1927534800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:46:17,169][57108] Avg episode reward: [(0, '0.551')] [2024-04-28 19:46:17,393][57339] Updated weights for policy 0, policy_version 698068 (0.0027) [2024-04-28 19:46:20,368][57339] Updated weights for policy 0, policy_version 698078 (0.0027) [2024-04-28 19:46:22,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11437424640. Throughput: 0: 55557.8. Samples: 1927708540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-04-28 19:46:22,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 19:46:23,177][57339] Updated weights for policy 0, policy_version 698088 (0.0034) [2024-04-28 19:46:26,138][57339] Updated weights for policy 0, policy_version 698098 (0.0028) [2024-04-28 19:46:27,169][57108] Fps is (10 sec: 58981.1, 60 sec: 55705.4, 300 sec: 55705.6). Total num frames: 11437703168. Throughput: 0: 55620.1. Samples: 1928046780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:46:27,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 19:46:29,135][57339] Updated weights for policy 0, policy_version 698108 (0.0029) [2024-04-28 19:46:32,036][57339] Updated weights for policy 0, policy_version 698118 (0.0040) [2024-04-28 19:46:32,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55979.4, 300 sec: 55539.0). Total num frames: 11437965312. Throughput: 0: 55512.0. Samples: 1928376780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:46:32,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 19:46:35,146][57339] Updated weights for policy 0, policy_version 698128 (0.0033) [2024-04-28 19:46:37,169][57108] Fps is (10 sec: 52429.5, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 11438227456. Throughput: 0: 55460.8. Samples: 1928545520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:46:37,170][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 19:46:37,935][57339] Updated weights for policy 0, policy_version 698138 (0.0030) [2024-04-28 19:46:41,045][57339] Updated weights for policy 0, policy_version 698148 (0.0027) [2024-04-28 19:46:42,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 11438505984. Throughput: 0: 55395.3. Samples: 1928876980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:46:42,169][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 19:46:43,882][57339] Updated weights for policy 0, policy_version 698158 (0.0035) [2024-04-28 19:46:47,000][57339] Updated weights for policy 0, policy_version 698168 (0.0029) [2024-04-28 19:46:47,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55594.5). Total num frames: 11438784512. Throughput: 0: 55430.0. Samples: 1929208800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:46:47,169][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 19:46:49,752][57339] Updated weights for policy 0, policy_version 698178 (0.0026) [2024-04-28 19:46:52,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54886.4, 300 sec: 55539.0). Total num frames: 11439063040. Throughput: 0: 55667.2. Samples: 1929375680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:46:52,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 19:46:52,777][57339] Updated weights for policy 0, policy_version 698188 (0.0034) [2024-04-28 19:46:55,620][57339] Updated weights for policy 0, policy_version 698198 (0.0027) [2024-04-28 19:46:57,169][57108] Fps is (10 sec: 58982.2, 60 sec: 55705.7, 300 sec: 55705.6). Total num frames: 11439374336. Throughput: 0: 55631.2. Samples: 1929708400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:46:57,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 19:46:58,496][57339] Updated weights for policy 0, policy_version 698208 (0.0023) [2024-04-28 19:47:01,497][57339] Updated weights for policy 0, policy_version 698218 (0.0029) [2024-04-28 19:47:02,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11439636480. Throughput: 0: 55600.0. Samples: 1930036800. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:02,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 19:47:04,490][57339] Updated weights for policy 0, policy_version 698228 (0.0028) [2024-04-28 19:47:07,169][57108] Fps is (10 sec: 52428.5, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11439898624. Throughput: 0: 55495.4. Samples: 1930205840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:07,170][57108] Avg episode reward: [(0, '0.668')] [2024-04-28 19:47:07,415][57339] Updated weights for policy 0, policy_version 698238 (0.0025) [2024-04-28 19:47:10,312][57319] Signal inference workers to stop experience collection... (28750 times) [2024-04-28 19:47:10,312][57319] Signal inference workers to resume experience collection... (28750 times) [2024-04-28 19:47:10,322][57339] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-04-28 19:47:10,331][57339] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-04-28 19:47:10,433][57339] Updated weights for policy 0, policy_version 698248 (0.0026) [2024-04-28 19:47:12,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11440177152. Throughput: 0: 55482.1. Samples: 1930543460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:12,169][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 19:47:13,242][57339] Updated weights for policy 0, policy_version 698258 (0.0036) [2024-04-28 19:47:16,727][57339] Updated weights for policy 0, policy_version 698268 (0.0027) [2024-04-28 19:47:17,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 11440439296. Throughput: 0: 55715.5. Samples: 1930883980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:17,169][57108] Avg episode reward: [(0, '0.544')] [2024-04-28 19:47:19,082][57339] Updated weights for policy 0, policy_version 698278 (0.0028) [2024-04-28 19:47:22,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 11440734208. Throughput: 0: 55275.3. Samples: 1931032900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:22,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 19:47:22,486][57339] Updated weights for policy 0, policy_version 698288 (0.0031) [2024-04-28 19:47:24,889][57339] Updated weights for policy 0, policy_version 698298 (0.0038) [2024-04-28 19:47:27,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 11441012736. Throughput: 0: 55228.8. Samples: 1931362280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:27,170][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:47:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000698304_11441012736.pth... [2024-04-28 19:47:27,225][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000697491_11427692544.pth [2024-04-28 19:47:28,855][57339] Updated weights for policy 0, policy_version 698308 (0.0028) [2024-04-28 19:47:30,954][57339] Updated weights for policy 0, policy_version 698318 (0.0031) [2024-04-28 19:47:32,169][57108] Fps is (10 sec: 58981.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11441324032. Throughput: 0: 55188.8. Samples: 1931692300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:32,170][57108] Avg episode reward: [(0, '0.485')] [2024-04-28 19:47:34,804][57339] Updated weights for policy 0, policy_version 698328 (0.0031) [2024-04-28 19:47:36,922][57339] Updated weights for policy 0, policy_version 698338 (0.0031) [2024-04-28 19:47:37,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55978.7, 300 sec: 55650.1). 
Total num frames: 11441586176. Throughput: 0: 55741.7. Samples: 1931884060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:37,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:47:40,590][57339] Updated weights for policy 0, policy_version 698348 (0.0026) [2024-04-28 19:47:42,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11441848320. Throughput: 0: 55781.8. Samples: 1932218580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:42,169][57108] Avg episode reward: [(0, '0.636')] [2024-04-28 19:47:42,573][57339] Updated weights for policy 0, policy_version 698358 (0.0029) [2024-04-28 19:47:46,363][57339] Updated weights for policy 0, policy_version 698368 (0.0030) [2024-04-28 19:47:47,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 11442110464. Throughput: 0: 55878.7. Samples: 1932551340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:47,169][57108] Avg episode reward: [(0, '0.673')] [2024-04-28 19:47:48,570][57339] Updated weights for policy 0, policy_version 698378 (0.0029) [2024-04-28 19:47:52,169][57108] Fps is (10 sec: 52429.2, 60 sec: 55159.5, 300 sec: 55372.4). Total num frames: 11442372608. Throughput: 0: 55296.6. Samples: 1932694180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:52,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 19:47:52,258][57339] Updated weights for policy 0, policy_version 698388 (0.0030) [2024-04-28 19:47:54,461][57339] Updated weights for policy 0, policy_version 698398 (0.0028) [2024-04-28 19:47:57,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55159.4, 300 sec: 55539.0). Total num frames: 11442683904. Throughput: 0: 55422.4. Samples: 1933037480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:47:57,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:47:58,184][57339] Updated weights for policy 0, policy_version 698408 (0.0028) [2024-04-28 19:48:00,371][57339] Updated weights for policy 0, policy_version 698418 (0.0034) [2024-04-28 19:48:02,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11442962432. Throughput: 0: 55238.0. Samples: 1933369680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-04-28 19:48:02,169][57108] Avg episode reward: [(0, '0.697')] [2024-04-28 19:48:04,158][57339] Updated weights for policy 0, policy_version 698428 (0.0031) [2024-04-28 19:48:04,849][57319] Signal inference workers to stop experience collection... (28800 times) [2024-04-28 19:48:04,854][57319] Signal inference workers to resume experience collection... (28800 times) [2024-04-28 19:48:04,870][57339] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-04-28 19:48:04,870][57339] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-04-28 19:48:06,112][57339] Updated weights for policy 0, policy_version 698438 (0.0024) [2024-04-28 19:48:07,169][57108] Fps is (10 sec: 58983.6, 60 sec: 56251.9, 300 sec: 55761.1). Total num frames: 11443273728. Throughput: 0: 55945.7. Samples: 1933550460. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:07,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 19:48:10,127][57339] Updated weights for policy 0, policy_version 698448 (0.0030) [2024-04-28 19:48:11,979][57339] Updated weights for policy 0, policy_version 698458 (0.0031) [2024-04-28 19:48:12,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56251.7, 300 sec: 55761.1). Total num frames: 11443552256. Throughput: 0: 55911.6. Samples: 1933878300. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:12,170][57108] Avg episode reward: [(0, '0.556')] [2024-04-28 19:48:15,861][57339] Updated weights for policy 0, policy_version 698468 (0.0032) [2024-04-28 19:48:17,169][57108] Fps is (10 sec: 50789.7, 60 sec: 55705.6, 300 sec: 55483.4). Total num frames: 11443781632. Throughput: 0: 55952.5. Samples: 1934210160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:17,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 19:48:17,896][57339] Updated weights for policy 0, policy_version 698478 (0.0030) [2024-04-28 19:48:21,717][57339] Updated weights for policy 0, policy_version 698488 (0.0029) [2024-04-28 19:48:22,169][57108] Fps is (10 sec: 49151.9, 60 sec: 55159.3, 300 sec: 55427.9). Total num frames: 11444043776. Throughput: 0: 55247.1. Samples: 1934370180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:22,170][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 19:48:23,645][57339] Updated weights for policy 0, policy_version 698498 (0.0026) [2024-04-28 19:48:27,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55159.6, 300 sec: 55372.4). Total num frames: 11444322304. Throughput: 0: 55200.5. Samples: 1934702600. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:27,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 19:48:27,600][57339] Updated weights for policy 0, policy_version 698508 (0.0036) [2024-04-28 19:48:29,519][57339] Updated weights for policy 0, policy_version 698518 (0.0023) [2024-04-28 19:48:32,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 11444633600. Throughput: 0: 55237.6. Samples: 1935037040. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:32,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 19:48:33,478][57339] Updated weights for policy 0, policy_version 698528 (0.0031) [2024-04-28 19:48:35,470][57339] Updated weights for policy 0, policy_version 698538 (0.0031) [2024-04-28 19:48:37,169][57108] Fps is (10 sec: 60620.5, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11444928512. Throughput: 0: 55988.3. Samples: 1935213660. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:37,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 19:48:39,378][57339] Updated weights for policy 0, policy_version 698548 (0.0032) [2024-04-28 19:48:41,352][57339] Updated weights for policy 0, policy_version 698558 (0.0027) [2024-04-28 19:48:42,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11445207040. Throughput: 0: 55699.7. Samples: 1935543960. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:42,170][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:48:45,099][57339] Updated weights for policy 0, policy_version 698568 (0.0033) [2024-04-28 19:48:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11445485568. Throughput: 0: 55654.7. Samples: 1935874140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:47,169][57108] Avg episode reward: [(0, '0.614')] [2024-04-28 19:48:47,177][57339] Updated weights for policy 0, policy_version 698578 (0.0034) [2024-04-28 19:48:51,120][57339] Updated weights for policy 0, policy_version 698588 (0.0029) [2024-04-28 19:48:52,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55705.4, 300 sec: 55427.9). Total num frames: 11445714944. Throughput: 0: 55274.5. Samples: 1936037820. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:52,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 19:48:52,347][57319] Signal inference workers to stop experience collection... (28850 times) [2024-04-28 19:48:52,352][57319] Signal inference workers to resume experience collection... (28850 times) [2024-04-28 19:48:52,367][57339] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-04-28 19:48:52,367][57339] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-04-28 19:48:53,244][57339] Updated weights for policy 0, policy_version 698598 (0.0029) [2024-04-28 19:48:57,016][57339] Updated weights for policy 0, policy_version 698608 (0.0029) [2024-04-28 19:48:57,169][57108] Fps is (10 sec: 50789.5, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 11445993472. Throughput: 0: 55332.3. Samples: 1936368260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:48:57,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 19:48:59,097][57339] Updated weights for policy 0, policy_version 698618 (0.0027) [2024-04-28 19:49:02,169][57108] Fps is (10 sec: 54068.0, 60 sec: 54886.4, 300 sec: 55316.8). Total num frames: 11446255616. Throughput: 0: 55405.5. Samples: 1936703400. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:02,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:49:02,802][57339] Updated weights for policy 0, policy_version 698628 (0.0029) [2024-04-28 19:49:05,099][57339] Updated weights for policy 0, policy_version 698638 (0.0028) [2024-04-28 19:49:07,169][57108] Fps is (10 sec: 57344.7, 60 sec: 54886.4, 300 sec: 55483.5). Total num frames: 11446566912. Throughput: 0: 55468.5. Samples: 1936866260. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:07,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 19:49:08,613][57339] Updated weights for policy 0, policy_version 698648 (0.0030) [2024-04-28 19:49:11,139][57339] Updated weights for policy 0, policy_version 698658 (0.0027) [2024-04-28 19:49:12,169][57108] Fps is (10 sec: 60621.0, 60 sec: 55159.6, 300 sec: 55594.5). Total num frames: 11446861824. Throughput: 0: 55408.1. Samples: 1937195960. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:12,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 19:49:14,579][57339] Updated weights for policy 0, policy_version 698668 (0.0026) [2024-04-28 19:49:17,012][57339] Updated weights for policy 0, policy_version 698678 (0.0028) [2024-04-28 19:49:17,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11447140352. Throughput: 0: 55361.4. Samples: 1937528300. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:17,170][57108] Avg episode reward: [(0, '0.562')] [2024-04-28 19:49:20,516][57339] Updated weights for policy 0, policy_version 698688 (0.0026) [2024-04-28 19:49:22,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55978.8, 300 sec: 55539.0). Total num frames: 11447402496. Throughput: 0: 55357.0. Samples: 1937704720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:22,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 19:49:22,951][57339] Updated weights for policy 0, policy_version 698698 (0.0030) [2024-04-28 19:49:26,307][57339] Updated weights for policy 0, policy_version 698708 (0.0024) [2024-04-28 19:49:27,169][57108] Fps is (10 sec: 54068.1, 60 sec: 55978.7, 300 sec: 55483.5). Total num frames: 11447681024. Throughput: 0: 55496.6. Samples: 1938041300. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:27,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:49:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000698711_11447681024.pth... [2024-04-28 19:49:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000697898_11434360832.pth [2024-04-28 19:49:28,758][57339] Updated weights for policy 0, policy_version 698718 (0.0028) [2024-04-28 19:49:32,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.7, 300 sec: 55372.4). Total num frames: 11447943168. Throughput: 0: 55555.6. Samples: 1938374140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:32,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 19:49:32,238][57339] Updated weights for policy 0, policy_version 698728 (0.0030) [2024-04-28 19:49:34,651][57339] Updated weights for policy 0, policy_version 698738 (0.0026) [2024-04-28 19:49:35,805][57319] Signal inference workers to stop experience collection... (28900 times) [2024-04-28 19:49:35,806][57319] Signal inference workers to resume experience collection... (28900 times) [2024-04-28 19:49:35,828][57339] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-04-28 19:49:35,828][57339] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-04-28 19:49:37,169][57108] Fps is (10 sec: 54067.0, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 11448221696. Throughput: 0: 55393.5. Samples: 1938530520. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:37,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 19:49:38,175][57339] Updated weights for policy 0, policy_version 698748 (0.0030) [2024-04-28 19:49:40,616][57339] Updated weights for policy 0, policy_version 698758 (0.0025) [2024-04-28 19:49:42,169][57108] Fps is (10 sec: 58981.9, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11448532992. Throughput: 0: 55416.1. 
Samples: 1938861980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:42,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 19:49:44,161][57339] Updated weights for policy 0, policy_version 698768 (0.0028) [2024-04-28 19:49:46,435][57339] Updated weights for policy 0, policy_version 698778 (0.0022) [2024-04-28 19:49:47,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55159.4, 300 sec: 55483.5). Total num frames: 11448795136. Throughput: 0: 55304.9. Samples: 1939192120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-04-28 19:49:47,169][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 19:49:49,879][57339] Updated weights for policy 0, policy_version 698788 (0.0028) [2024-04-28 19:49:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.9, 300 sec: 55650.0). Total num frames: 11449090048. Throughput: 0: 55636.0. Samples: 1939369880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:49:52,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 19:49:52,227][57339] Updated weights for policy 0, policy_version 698798 (0.0029) [2024-04-28 19:49:55,863][57339] Updated weights for policy 0, policy_version 698808 (0.0031) [2024-04-28 19:49:57,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 11449352192. Throughput: 0: 55701.1. Samples: 1939702520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:49:57,178][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 19:49:58,234][57339] Updated weights for policy 0, policy_version 698818 (0.0030) [2024-04-28 19:50:01,672][57339] Updated weights for policy 0, policy_version 698828 (0.0032) [2024-04-28 19:50:02,169][57108] Fps is (10 sec: 52428.1, 60 sec: 55978.5, 300 sec: 55427.9). Total num frames: 11449614336. Throughput: 0: 55775.6. Samples: 1940038200. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:02,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 19:50:04,173][57339] Updated weights for policy 0, policy_version 698838 (0.0027) [2024-04-28 19:50:07,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 11449892864. Throughput: 0: 55319.4. Samples: 1940194100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:07,170][57108] Avg episode reward: [(0, '0.661')] [2024-04-28 19:50:07,497][57339] Updated weights for policy 0, policy_version 698848 (0.0028) [2024-04-28 19:50:09,994][57339] Updated weights for policy 0, policy_version 698858 (0.0025) [2024-04-28 19:50:12,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55159.4, 300 sec: 55372.4). Total num frames: 11450171392. Throughput: 0: 55393.3. Samples: 1940534000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:12,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 19:50:13,238][57339] Updated weights for policy 0, policy_version 698868 (0.0024) [2024-04-28 19:50:15,963][57339] Updated weights for policy 0, policy_version 698878 (0.0027) [2024-04-28 19:50:17,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 11450466304. Throughput: 0: 55406.4. Samples: 1940867440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:17,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 19:50:19,256][57339] Updated weights for policy 0, policy_version 698888 (0.0025) [2024-04-28 19:50:21,772][57339] Updated weights for policy 0, policy_version 698898 (0.0035) [2024-04-28 19:50:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 11450744832. Throughput: 0: 55647.9. Samples: 1941034680. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:22,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 19:50:24,570][57319] Signal inference workers to stop experience collection... (28950 times) [2024-04-28 19:50:24,614][57339] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-04-28 19:50:24,630][57319] Signal inference workers to resume experience collection... (28950 times) [2024-04-28 19:50:24,631][57339] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-04-28 19:50:25,017][57339] Updated weights for policy 0, policy_version 698908 (0.0031) [2024-04-28 19:50:27,169][57108] Fps is (10 sec: 55706.9, 60 sec: 55705.7, 300 sec: 55650.2). Total num frames: 11451023360. Throughput: 0: 55668.2. Samples: 1941367040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:27,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 19:50:27,673][57339] Updated weights for policy 0, policy_version 698918 (0.0026) [2024-04-28 19:50:30,932][57339] Updated weights for policy 0, policy_version 698928 (0.0027) [2024-04-28 19:50:32,169][57108] Fps is (10 sec: 54067.7, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 11451285504. Throughput: 0: 55804.9. Samples: 1941703340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:32,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 19:50:33,562][57339] Updated weights for policy 0, policy_version 698938 (0.0031) [2024-04-28 19:50:36,849][57339] Updated weights for policy 0, policy_version 698948 (0.0029) [2024-04-28 19:50:37,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11451564032. Throughput: 0: 55536.9. Samples: 1941869040. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:37,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 19:50:39,531][57339] Updated weights for policy 0, policy_version 698958 (0.0028) [2024-04-28 19:50:42,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 11451842560. Throughput: 0: 55558.2. Samples: 1942202640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:42,178][57108] Avg episode reward: [(0, '0.664')] [2024-04-28 19:50:43,237][57339] Updated weights for policy 0, policy_version 698968 (0.0033) [2024-04-28 19:50:45,326][57339] Updated weights for policy 0, policy_version 698978 (0.0030) [2024-04-28 19:50:47,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 11452121088. Throughput: 0: 55444.5. Samples: 1942533200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:47,169][57108] Avg episode reward: [(0, '0.671')] [2024-04-28 19:50:48,943][57339] Updated weights for policy 0, policy_version 698988 (0.0029) [2024-04-28 19:50:51,111][57339] Updated weights for policy 0, policy_version 698998 (0.0033) [2024-04-28 19:50:52,169][57108] Fps is (10 sec: 54067.7, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 11452383232. Throughput: 0: 55810.3. Samples: 1942705560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:52,169][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 19:50:54,684][57339] Updated weights for policy 0, policy_version 699008 (0.0033) [2024-04-28 19:50:57,050][57339] Updated weights for policy 0, policy_version 699018 (0.0026) [2024-04-28 19:50:57,169][57108] Fps is (10 sec: 58981.4, 60 sec: 55978.6, 300 sec: 55650.1). Total num frames: 11452710912. Throughput: 0: 55608.2. Samples: 1943036380. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:50:57,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 19:51:00,633][57339] Updated weights for policy 0, policy_version 699028 (0.0027) [2024-04-28 19:51:02,169][57108] Fps is (10 sec: 58982.0, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 11452973056. Throughput: 0: 55532.5. Samples: 1943366400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:51:02,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 19:51:03,819][57339] Updated weights for policy 0, policy_version 699038 (0.0038) [2024-04-28 19:51:06,616][57339] Updated weights for policy 0, policy_version 699048 (0.0030) [2024-04-28 19:51:07,169][57108] Fps is (10 sec: 50791.7, 60 sec: 55432.6, 300 sec: 55483.5). Total num frames: 11453218816. Throughput: 0: 55549.0. Samples: 1943534380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:51:07,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 19:51:09,753][57339] Updated weights for policy 0, policy_version 699058 (0.0028) [2024-04-28 19:51:12,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11453497344. Throughput: 0: 55569.2. Samples: 1943867660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:51:12,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 19:51:12,715][57339] Updated weights for policy 0, policy_version 699068 (0.0030) [2024-04-28 19:51:13,449][57319] Signal inference workers to stop experience collection... (29000 times) [2024-04-28 19:51:13,449][57319] Signal inference workers to resume experience collection... 
(29000 times) [2024-04-28 19:51:13,462][57339] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-04-28 19:51:13,462][57339] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-04-28 19:51:15,805][57339] Updated weights for policy 0, policy_version 699078 (0.0027) [2024-04-28 19:51:17,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 11453775872. Throughput: 0: 55553.8. Samples: 1944203260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:51:17,169][57108] Avg episode reward: [(0, '0.655')] [2024-04-28 19:51:18,644][57339] Updated weights for policy 0, policy_version 699088 (0.0032) [2024-04-28 19:51:21,697][57339] Updated weights for policy 0, policy_version 699098 (0.0035) [2024-04-28 19:51:22,169][57108] Fps is (10 sec: 54067.2, 60 sec: 54886.4, 300 sec: 55372.4). Total num frames: 11454038016. Throughput: 0: 55293.7. Samples: 1944357260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:51:22,169][57108] Avg episode reward: [(0, '0.468')] [2024-04-28 19:51:24,594][57339] Updated weights for policy 0, policy_version 699108 (0.0029) [2024-04-28 19:51:27,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55159.2, 300 sec: 55483.4). Total num frames: 11454332928. Throughput: 0: 55250.6. Samples: 1944688920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-04-28 19:51:27,170][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 19:51:27,180][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000699117_11454332928.pth... [2024-04-28 19:51:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000698304_11441012736.pth [2024-04-28 19:51:27,494][57339] Updated weights for policy 0, policy_version 699118 (0.0028) [2024-04-28 19:51:30,459][57339] Updated weights for policy 0, policy_version 699128 (0.0027) [2024-04-28 19:51:32,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55978.6, 300 sec: 55650.1). 
Total num frames: 11454644224. Throughput: 0: 55162.7. Samples: 1945015520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:51:32,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 19:51:33,471][57339] Updated weights for policy 0, policy_version 699138 (0.0031) [2024-04-28 19:51:36,305][57339] Updated weights for policy 0, policy_version 699148 (0.0029) [2024-04-28 19:51:37,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 11454906368. Throughput: 0: 55311.8. Samples: 1945194600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:51:37,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 19:51:39,377][57339] Updated weights for policy 0, policy_version 699158 (0.0035) [2024-04-28 19:51:42,037][57339] Updated weights for policy 0, policy_version 699168 (0.0027) [2024-04-28 19:51:42,169][57108] Fps is (10 sec: 52429.0, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11455168512. Throughput: 0: 55410.9. Samples: 1945529860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:51:42,169][57108] Avg episode reward: [(0, '0.687')] [2024-04-28 19:51:45,041][57339] Updated weights for policy 0, policy_version 699178 (0.0024) [2024-04-28 19:51:47,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11455463424. Throughput: 0: 55542.5. Samples: 1945865820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:51:47,169][57108] Avg episode reward: [(0, '0.616')] [2024-04-28 19:51:47,812][57339] Updated weights for policy 0, policy_version 699188 (0.0025) [2024-04-28 19:51:51,234][57339] Updated weights for policy 0, policy_version 699198 (0.0027) [2024-04-28 19:51:52,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.6, 300 sec: 55427.9). Total num frames: 11455725568. Throughput: 0: 55367.9. Samples: 1946025940. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:51:52,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 19:51:53,538][57339] Updated weights for policy 0, policy_version 699208 (0.0033) [2024-04-28 19:51:56,994][57339] Updated weights for policy 0, policy_version 699218 (0.0029) [2024-04-28 19:51:57,169][57108] Fps is (10 sec: 52429.7, 60 sec: 54613.5, 300 sec: 55427.9). Total num frames: 11455987712. Throughput: 0: 55502.2. Samples: 1946365260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:51:57,169][57108] Avg episode reward: [(0, '0.535')] [2024-04-28 19:51:59,346][57339] Updated weights for policy 0, policy_version 699228 (0.0024) [2024-04-28 19:52:00,290][57319] Signal inference workers to stop experience collection... (29050 times) [2024-04-28 19:52:00,321][57339] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-04-28 19:52:00,339][57319] Signal inference workers to resume experience collection... (29050 times) [2024-04-28 19:52:00,340][57339] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-04-28 19:52:02,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11456282624. Throughput: 0: 55550.5. Samples: 1946703040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:02,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 19:52:02,914][57339] Updated weights for policy 0, policy_version 699238 (0.0033) [2024-04-28 19:52:05,258][57339] Updated weights for policy 0, policy_version 699248 (0.0026) [2024-04-28 19:52:07,169][57108] Fps is (10 sec: 60620.8, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 11456593920. Throughput: 0: 55855.1. Samples: 1946870740. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:07,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 19:52:08,685][57339] Updated weights for policy 0, policy_version 699258 (0.0032) [2024-04-28 19:52:11,252][57339] Updated weights for policy 0, policy_version 699268 (0.0031) [2024-04-28 19:52:12,169][57108] Fps is (10 sec: 60621.8, 60 sec: 56524.9, 300 sec: 55761.2). Total num frames: 11456888832. Throughput: 0: 56059.4. Samples: 1947211580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:12,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 19:52:14,669][57339] Updated weights for policy 0, policy_version 699278 (0.0026) [2024-04-28 19:52:17,143][57339] Updated weights for policy 0, policy_version 699288 (0.0029) [2024-04-28 19:52:17,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 11457134592. Throughput: 0: 56110.3. Samples: 1947540480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:17,169][57108] Avg episode reward: [(0, '0.603')] [2024-04-28 19:52:20,416][57339] Updated weights for policy 0, policy_version 699298 (0.0027) [2024-04-28 19:52:22,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 11457396736. Throughput: 0: 55816.3. Samples: 1947706320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:22,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 19:52:22,949][57339] Updated weights for policy 0, policy_version 699308 (0.0027) [2024-04-28 19:52:26,354][57339] Updated weights for policy 0, policy_version 699318 (0.0029) [2024-04-28 19:52:27,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55432.7, 300 sec: 55372.4). Total num frames: 11457658880. Throughput: 0: 55810.3. Samples: 1948041320. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:27,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 19:52:28,673][57339] Updated weights for policy 0, policy_version 699328 (0.0033) [2024-04-28 19:52:32,169][57108] Fps is (10 sec: 54066.9, 60 sec: 54886.4, 300 sec: 55427.9). Total num frames: 11457937408. Throughput: 0: 55859.3. Samples: 1948379480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:32,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 19:52:32,184][57339] Updated weights for policy 0, policy_version 699338 (0.0031) [2024-04-28 19:52:34,587][57339] Updated weights for policy 0, policy_version 699348 (0.0027) [2024-04-28 19:52:37,169][57108] Fps is (10 sec: 58981.3, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 11458248704. Throughput: 0: 55899.8. Samples: 1948541440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:37,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:52:37,991][57339] Updated weights for policy 0, policy_version 699358 (0.0030) [2024-04-28 19:52:40,456][57339] Updated weights for policy 0, policy_version 699368 (0.0029) [2024-04-28 19:52:42,169][57108] Fps is (10 sec: 60620.6, 60 sec: 56251.7, 300 sec: 55705.6). Total num frames: 11458543616. Throughput: 0: 55700.9. Samples: 1948871800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:42,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 19:52:43,844][57339] Updated weights for policy 0, policy_version 699378 (0.0032) [2024-04-28 19:52:46,411][57339] Updated weights for policy 0, policy_version 699388 (0.0025) [2024-04-28 19:52:47,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55978.9, 300 sec: 55761.1). Total num frames: 11458822144. Throughput: 0: 55756.6. Samples: 1949212080. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:47,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 19:52:49,483][57319] Signal inference workers to stop experience collection... (29100 times) [2024-04-28 19:52:49,484][57319] Signal inference workers to resume experience collection... (29100 times) [2024-04-28 19:52:49,495][57339] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-04-28 19:52:49,523][57339] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-04-28 19:52:49,594][57339] Updated weights for policy 0, policy_version 699398 (0.0027) [2024-04-28 19:52:52,110][57339] Updated weights for policy 0, policy_version 699408 (0.0031) [2024-04-28 19:52:52,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 11459100672. Throughput: 0: 55829.7. Samples: 1949383080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:52,170][57108] Avg episode reward: [(0, '0.521')] [2024-04-28 19:52:55,428][57339] Updated weights for policy 0, policy_version 699418 (0.0027) [2024-04-28 19:52:57,169][57108] Fps is (10 sec: 54067.2, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 11459362816. Throughput: 0: 55738.6. Samples: 1949719820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:52:57,169][57108] Avg episode reward: [(0, '0.557')] [2024-04-28 19:52:58,030][57339] Updated weights for policy 0, policy_version 699428 (0.0030) [2024-04-28 19:53:01,315][57339] Updated weights for policy 0, policy_version 699438 (0.0027) [2024-04-28 19:53:02,169][57108] Fps is (10 sec: 50791.2, 60 sec: 55432.7, 300 sec: 55372.4). Total num frames: 11459608576. Throughput: 0: 55797.0. Samples: 1950051340. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:53:02,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 19:53:03,974][57339] Updated weights for policy 0, policy_version 699448 (0.0028) [2024-04-28 19:53:07,070][57339] Updated weights for policy 0, policy_version 699458 (0.0034) [2024-04-28 19:53:07,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 11459919872. Throughput: 0: 55729.7. Samples: 1950214160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-04-28 19:53:07,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 19:53:09,912][57339] Updated weights for policy 0, policy_version 699468 (0.0025) [2024-04-28 19:53:12,169][57108] Fps is (10 sec: 60620.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11460214784. Throughput: 0: 55793.7. Samples: 1950552040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:12,170][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 19:53:12,776][57339] Updated weights for policy 0, policy_version 699478 (0.0030) [2024-04-28 19:53:15,615][57339] Updated weights for policy 0, policy_version 699488 (0.0032) [2024-04-28 19:53:17,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 11460493312. Throughput: 0: 55832.0. Samples: 1950891920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:17,169][57108] Avg episode reward: [(0, '0.622')] [2024-04-28 19:53:18,659][57339] Updated weights for policy 0, policy_version 699498 (0.0035) [2024-04-28 19:53:21,271][57339] Updated weights for policy 0, policy_version 699508 (0.0033) [2024-04-28 19:53:22,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56524.8, 300 sec: 55816.7). Total num frames: 11460788224. Throughput: 0: 56190.0. Samples: 1951069980. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:22,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:53:24,534][57339] Updated weights for policy 0, policy_version 699518 (0.0031) [2024-04-28 19:53:27,169][57108] Fps is (10 sec: 55705.4, 60 sec: 56524.7, 300 sec: 55650.1). Total num frames: 11461050368. Throughput: 0: 56250.2. Samples: 1951403060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:27,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 19:53:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000699527_11461050368.pth... [2024-04-28 19:53:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000698711_11447681024.pth [2024-04-28 19:53:27,353][57339] Updated weights for policy 0, policy_version 699528 (0.0033) [2024-04-28 19:53:30,361][57339] Updated weights for policy 0, policy_version 699538 (0.0028) [2024-04-28 19:53:32,169][57108] Fps is (10 sec: 52428.2, 60 sec: 56251.6, 300 sec: 55539.0). Total num frames: 11461312512. Throughput: 0: 56071.4. Samples: 1951735300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:32,170][57108] Avg episode reward: [(0, '0.685')] [2024-04-28 19:53:33,211][57339] Updated weights for policy 0, policy_version 699548 (0.0033) [2024-04-28 19:53:36,004][57339] Updated weights for policy 0, policy_version 699558 (0.0028) [2024-04-28 19:53:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11461591040. Throughput: 0: 55839.1. Samples: 1951895840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:37,170][57108] Avg episode reward: [(0, '0.545')] [2024-04-28 19:53:38,921][57339] Updated weights for policy 0, policy_version 699568 (0.0030) [2024-04-28 19:53:41,832][57339] Updated weights for policy 0, policy_version 699578 (0.0028) [2024-04-28 19:53:42,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55705.7, 300 sec: 55594.5). 
Total num frames: 11461885952. Throughput: 0: 55910.7. Samples: 1952235800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:42,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 19:53:44,829][57339] Updated weights for policy 0, policy_version 699588 (0.0029) [2024-04-28 19:53:47,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.3, 300 sec: 55705.6). Total num frames: 11462148096. Throughput: 0: 56067.2. Samples: 1952574380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:47,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 19:53:47,696][57339] Updated weights for policy 0, policy_version 699598 (0.0030) [2024-04-28 19:53:49,151][57319] Signal inference workers to stop experience collection... (29150 times) [2024-04-28 19:53:49,152][57319] Signal inference workers to resume experience collection... (29150 times) [2024-04-28 19:53:49,166][57339] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-04-28 19:53:49,166][57339] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-04-28 19:53:50,577][57339] Updated weights for policy 0, policy_version 699608 (0.0032) [2024-04-28 19:53:52,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11462459392. Throughput: 0: 56346.1. Samples: 1952749740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:52,170][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 19:53:53,908][57339] Updated weights for policy 0, policy_version 699618 (0.0030) [2024-04-28 19:53:56,440][57339] Updated weights for policy 0, policy_version 699628 (0.0027) [2024-04-28 19:53:57,169][57108] Fps is (10 sec: 58982.7, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11462737920. Throughput: 0: 56181.2. Samples: 1953080200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:53:57,170][57108] Avg episode reward: [(0, '0.626')] [2024-04-28 19:53:59,649][57339] Updated weights for policy 0, policy_version 699638 (0.0030) [2024-04-28 19:54:02,169][57108] Fps is (10 sec: 55706.4, 60 sec: 56797.8, 300 sec: 55761.1). Total num frames: 11463016448. Throughput: 0: 56020.9. Samples: 1953412860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:02,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 19:54:02,325][57339] Updated weights for policy 0, policy_version 699648 (0.0027) [2024-04-28 19:54:05,356][57339] Updated weights for policy 0, policy_version 699658 (0.0029) [2024-04-28 19:54:07,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55978.6, 300 sec: 55650.0). Total num frames: 11463278592. Throughput: 0: 55981.6. Samples: 1953589160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:07,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 19:54:08,010][57339] Updated weights for policy 0, policy_version 699668 (0.0028) [2024-04-28 19:54:11,341][57339] Updated weights for policy 0, policy_version 699678 (0.0027) [2024-04-28 19:54:12,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11463557120. Throughput: 0: 55907.5. Samples: 1953918900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:12,169][57108] Avg episode reward: [(0, '0.669')] [2024-04-28 19:54:14,064][57339] Updated weights for policy 0, policy_version 699688 (0.0030) [2024-04-28 19:54:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55705.5, 300 sec: 55705.6). Total num frames: 11463835648. Throughput: 0: 55953.3. Samples: 1954253200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:17,170][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 19:54:17,227][57339] Updated weights for policy 0, policy_version 699698 (0.0033) [2024-04-28 19:54:19,805][57339] Updated weights for policy 0, policy_version 699708 (0.0032) [2024-04-28 19:54:22,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 55761.1). Total num frames: 11464130560. Throughput: 0: 56132.8. Samples: 1954421820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:22,170][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 19:54:22,951][57339] Updated weights for policy 0, policy_version 699718 (0.0029) [2024-04-28 19:54:25,769][57339] Updated weights for policy 0, policy_version 699728 (0.0035) [2024-04-28 19:54:27,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11464392704. Throughput: 0: 56088.4. Samples: 1954759780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:27,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:54:28,880][57339] Updated weights for policy 0, policy_version 699738 (0.0026) [2024-04-28 19:54:31,629][57339] Updated weights for policy 0, policy_version 699748 (0.0030) [2024-04-28 19:54:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11464687616. Throughput: 0: 55943.7. Samples: 1955091840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:32,170][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:54:34,782][57339] Updated weights for policy 0, policy_version 699758 (0.0029) [2024-04-28 19:54:37,169][57108] Fps is (10 sec: 57344.1, 60 sec: 56251.8, 300 sec: 55705.6). Total num frames: 11464966144. Throughput: 0: 55833.1. Samples: 1955262220. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:37,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 19:54:37,382][57339] Updated weights for policy 0, policy_version 699768 (0.0031) [2024-04-28 19:54:40,500][57339] Updated weights for policy 0, policy_version 699778 (0.0028) [2024-04-28 19:54:42,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55761.1). Total num frames: 11465244672. Throughput: 0: 55941.8. Samples: 1955597580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:42,170][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 19:54:43,095][57339] Updated weights for policy 0, policy_version 699788 (0.0030) [2024-04-28 19:54:46,371][57339] Updated weights for policy 0, policy_version 699798 (0.0030) [2024-04-28 19:54:47,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.9, 300 sec: 55650.1). Total num frames: 11465506816. Throughput: 0: 56104.5. Samples: 1955937560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-04-28 19:54:47,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 19:54:48,993][57339] Updated weights for policy 0, policy_version 699808 (0.0028) [2024-04-28 19:54:52,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 11465801728. Throughput: 0: 55804.6. Samples: 1956100360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:54:52,169][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 19:54:52,275][57339] Updated weights for policy 0, policy_version 699818 (0.0027) [2024-04-28 19:54:54,780][57339] Updated weights for policy 0, policy_version 699828 (0.0029) [2024-04-28 19:54:57,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11466080256. Throughput: 0: 55936.8. Samples: 1956436060. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:54:57,170][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 19:54:58,067][57339] Updated weights for policy 0, policy_version 699838 (0.0035) [2024-04-28 19:55:00,694][57339] Updated weights for policy 0, policy_version 699848 (0.0027) [2024-04-28 19:55:02,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55705.5, 300 sec: 55816.7). Total num frames: 11466358784. Throughput: 0: 55948.0. Samples: 1956770860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:02,170][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 19:55:03,817][57339] Updated weights for policy 0, policy_version 699858 (0.0030) [2024-04-28 19:55:06,579][57339] Updated weights for policy 0, policy_version 699868 (0.0028) [2024-04-28 19:55:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 56251.8, 300 sec: 55872.2). Total num frames: 11466653696. Throughput: 0: 56194.3. Samples: 1956950560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:07,170][57108] Avg episode reward: [(0, '0.589')] [2024-04-28 19:55:09,485][57319] Signal inference workers to stop experience collection... (29200 times) [2024-04-28 19:55:09,485][57319] Signal inference workers to resume experience collection... (29200 times) [2024-04-28 19:55:09,515][57339] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-04-28 19:55:09,515][57339] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-04-28 19:55:09,594][57339] Updated weights for policy 0, policy_version 699878 (0.0029) [2024-04-28 19:55:12,169][57108] Fps is (10 sec: 55706.6, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 11466915840. Throughput: 0: 56160.1. Samples: 1957286980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:12,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 19:55:12,425][57339] Updated weights for policy 0, policy_version 699888 (0.0040) [2024-04-28 19:55:15,483][57339] Updated weights for policy 0, policy_version 699898 (0.0034) [2024-04-28 19:55:17,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.8, 300 sec: 55761.2). Total num frames: 11467194368. Throughput: 0: 56197.9. Samples: 1957620740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:17,169][57108] Avg episode reward: [(0, '0.536')] [2024-04-28 19:55:18,364][57339] Updated weights for policy 0, policy_version 699908 (0.0030) [2024-04-28 19:55:21,426][57339] Updated weights for policy 0, policy_version 699918 (0.0028) [2024-04-28 19:55:22,169][57108] Fps is (10 sec: 55704.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11467472896. Throughput: 0: 56016.8. Samples: 1957782980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:22,170][57108] Avg episode reward: [(0, '0.533')] [2024-04-28 19:55:24,080][57339] Updated weights for policy 0, policy_version 699928 (0.0028) [2024-04-28 19:55:27,169][57108] Fps is (10 sec: 57343.0, 60 sec: 56251.6, 300 sec: 55872.2). Total num frames: 11467767808. Throughput: 0: 56082.1. Samples: 1958121280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:27,170][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 19:55:27,181][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000699937_11467767808.pth... [2024-04-28 19:55:27,230][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000699117_11454332928.pth [2024-04-28 19:55:27,377][57339] Updated weights for policy 0, policy_version 699938 (0.0031) [2024-04-28 19:55:29,870][57339] Updated weights for policy 0, policy_version 699948 (0.0032) [2024-04-28 19:55:32,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55978.8, 300 sec: 55872.2). 
Total num frames: 11468046336. Throughput: 0: 56106.2. Samples: 1958462340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:32,169][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 19:55:33,053][57339] Updated weights for policy 0, policy_version 699958 (0.0035) [2024-04-28 19:55:35,766][57339] Updated weights for policy 0, policy_version 699968 (0.0031) [2024-04-28 19:55:37,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.5, 300 sec: 55872.2). Total num frames: 11468324864. Throughput: 0: 56183.4. Samples: 1958628620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:37,169][57108] Avg episode reward: [(0, '0.642')] [2024-04-28 19:55:39,087][57339] Updated weights for policy 0, policy_version 699978 (0.0026) [2024-04-28 19:55:41,608][57339] Updated weights for policy 0, policy_version 699988 (0.0037) [2024-04-28 19:55:42,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56251.7, 300 sec: 55927.7). Total num frames: 11468619776. Throughput: 0: 56107.6. Samples: 1958960900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:42,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 19:55:45,409][57339] Updated weights for policy 0, policy_version 699998 (0.0027) [2024-04-28 19:55:47,169][57108] Fps is (10 sec: 57344.8, 60 sec: 56524.8, 300 sec: 55983.3). Total num frames: 11468898304. Throughput: 0: 56072.2. Samples: 1959294100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:47,169][57108] Avg episode reward: [(0, '0.623')] [2024-04-28 19:55:47,521][57339] Updated weights for policy 0, policy_version 700008 (0.0028) [2024-04-28 19:55:51,137][57339] Updated weights for policy 0, policy_version 700018 (0.0029) [2024-04-28 19:55:52,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.6, 300 sec: 55761.2). Total num frames: 11469160448. Throughput: 0: 55801.3. Samples: 1959461620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:52,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 19:55:53,344][57339] Updated weights for policy 0, policy_version 700028 (0.0026) [2024-04-28 19:55:56,884][57339] Updated weights for policy 0, policy_version 700038 (0.0030) [2024-04-28 19:55:57,169][57108] Fps is (10 sec: 54067.4, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 11469438976. Throughput: 0: 55814.7. Samples: 1959798640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:55:57,169][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 19:55:59,312][57339] Updated weights for policy 0, policy_version 700048 (0.0033) [2024-04-28 19:56:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55927.7). Total num frames: 11469717504. Throughput: 0: 55805.3. Samples: 1960131980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:56:02,169][57108] Avg episode reward: [(0, '0.666')] [2024-04-28 19:56:02,849][57339] Updated weights for policy 0, policy_version 700058 (0.0027) [2024-04-28 19:56:05,276][57339] Updated weights for policy 0, policy_version 700068 (0.0026) [2024-04-28 19:56:07,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 11470012416. Throughput: 0: 55826.7. Samples: 1960295180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:56:07,178][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 19:56:08,751][57339] Updated weights for policy 0, policy_version 700078 (0.0030) [2024-04-28 19:56:11,036][57339] Updated weights for policy 0, policy_version 700088 (0.0030) [2024-04-28 19:56:12,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11470274560. Throughput: 0: 55638.0. Samples: 1960624980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:56:12,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 19:56:14,726][57339] Updated weights for policy 0, policy_version 700098 (0.0025) [2024-04-28 19:56:14,847][57319] Signal inference workers to stop experience collection... (29250 times) [2024-04-28 19:56:14,851][57319] Signal inference workers to resume experience collection... (29250 times) [2024-04-28 19:56:14,873][57339] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-04-28 19:56:14,874][57339] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-04-28 19:56:16,802][57339] Updated weights for policy 0, policy_version 700108 (0.0029) [2024-04-28 19:56:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 56251.7, 300 sec: 56038.8). Total num frames: 11470569472. Throughput: 0: 55554.6. Samples: 1960962300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:56:17,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 19:56:20,419][57339] Updated weights for policy 0, policy_version 700118 (0.0029) [2024-04-28 19:56:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55978.8, 300 sec: 55927.8). Total num frames: 11470831616. Throughput: 0: 55766.4. Samples: 1961138100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:56:22,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 19:56:22,776][57339] Updated weights for policy 0, policy_version 700128 (0.0026) [2024-04-28 19:56:26,327][57339] Updated weights for policy 0, policy_version 700138 (0.0029) [2024-04-28 19:56:27,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11471110144. Throughput: 0: 55919.2. Samples: 1961477260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-04-28 19:56:27,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 19:56:28,775][57339] Updated weights for policy 0, policy_version 700148 (0.0028) [2024-04-28 19:56:32,145][57339] Updated weights for policy 0, policy_version 700158 (0.0030) [2024-04-28 19:56:32,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55705.6, 300 sec: 55872.3). Total num frames: 11471388672. Throughput: 0: 55970.7. Samples: 1961812780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:56:32,169][57108] Avg episode reward: [(0, '0.653')] [2024-04-28 19:56:34,907][57339] Updated weights for policy 0, policy_version 700168 (0.0034) [2024-04-28 19:56:37,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55432.6, 300 sec: 55872.2). Total num frames: 11471650816. Throughput: 0: 55696.9. Samples: 1961967980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:56:37,170][57108] Avg episode reward: [(0, '0.658')] [2024-04-28 19:56:37,953][57339] Updated weights for policy 0, policy_version 700178 (0.0031) [2024-04-28 19:56:40,700][57339] Updated weights for policy 0, policy_version 700188 (0.0026) [2024-04-28 19:56:42,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.6, 300 sec: 55816.7). Total num frames: 11471929344. Throughput: 0: 55657.3. Samples: 1962303220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:56:42,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 19:56:43,909][57339] Updated weights for policy 0, policy_version 700198 (0.0025) [2024-04-28 19:56:46,517][57339] Updated weights for policy 0, policy_version 700208 (0.0027) [2024-04-28 19:56:47,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55705.6, 300 sec: 55983.3). Total num frames: 11472240640. Throughput: 0: 55560.5. Samples: 1962632200. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:56:47,169][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 19:56:49,735][57339] Updated weights for policy 0, policy_version 700218 (0.0030) [2024-04-28 19:56:52,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55983.3). Total num frames: 11472502784. Throughput: 0: 55826.3. Samples: 1962807360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:56:52,169][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 19:56:52,389][57339] Updated weights for policy 0, policy_version 700228 (0.0029) [2024-04-28 19:56:55,634][57339] Updated weights for policy 0, policy_version 700238 (0.0029) [2024-04-28 19:56:57,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55983.3). Total num frames: 11472797696. Throughput: 0: 55938.7. Samples: 1963142220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:56:57,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 19:56:58,230][57339] Updated weights for policy 0, policy_version 700248 (0.0033) [2024-04-28 19:57:01,445][57339] Updated weights for policy 0, policy_version 700258 (0.0030) [2024-04-28 19:57:02,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55705.7, 300 sec: 55816.7). Total num frames: 11473059840. Throughput: 0: 55806.3. Samples: 1963473580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:02,169][57108] Avg episode reward: [(0, '0.546')] [2024-04-28 19:57:04,064][57339] Updated weights for policy 0, policy_version 700268 (0.0027) [2024-04-28 19:57:07,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11473338368. Throughput: 0: 55547.9. Samples: 1963637760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:07,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 19:57:07,372][57339] Updated weights for policy 0, policy_version 700278 (0.0038) [2024-04-28 19:57:10,026][57339] Updated weights for policy 0, policy_version 700288 (0.0034) [2024-04-28 19:57:12,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55432.4, 300 sec: 55816.6). Total num frames: 11473600512. Throughput: 0: 55391.9. Samples: 1963969900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:12,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 19:57:13,336][57339] Updated weights for policy 0, policy_version 700298 (0.0027) [2024-04-28 19:57:15,905][57339] Updated weights for policy 0, policy_version 700308 (0.0031) [2024-04-28 19:57:17,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55432.5, 300 sec: 55927.7). Total num frames: 11473895424. Throughput: 0: 55297.2. Samples: 1964301160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:17,169][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 19:57:19,414][57339] Updated weights for policy 0, policy_version 700318 (0.0026) [2024-04-28 19:57:21,725][57339] Updated weights for policy 0, policy_version 700328 (0.0032) [2024-04-28 19:57:22,169][57108] Fps is (10 sec: 58982.7, 60 sec: 55978.6, 300 sec: 56038.8). Total num frames: 11474190336. Throughput: 0: 55654.6. Samples: 1964472440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:22,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 19:57:22,703][57319] Signal inference workers to stop experience collection... (29300 times) [2024-04-28 19:57:22,736][57339] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-04-28 19:57:22,756][57319] Signal inference workers to resume experience collection... 
(29300 times) [2024-04-28 19:57:22,756][57339] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-04-28 19:57:25,086][57339] Updated weights for policy 0, policy_version 700338 (0.0030) [2024-04-28 19:57:27,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55978.7, 300 sec: 56038.8). Total num frames: 11474468864. Throughput: 0: 55668.4. Samples: 1964808300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:27,169][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 19:57:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000700346_11474468864.pth... [2024-04-28 19:57:27,231][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000699527_11461050368.pth [2024-04-28 19:57:27,650][57339] Updated weights for policy 0, policy_version 700348 (0.0027) [2024-04-28 19:57:31,070][57339] Updated weights for policy 0, policy_version 700358 (0.0030) [2024-04-28 19:57:32,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55978.7, 300 sec: 55927.8). Total num frames: 11474747392. Throughput: 0: 55776.5. Samples: 1965142140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:32,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 19:57:33,571][57339] Updated weights for policy 0, policy_version 700368 (0.0031) [2024-04-28 19:57:36,948][57339] Updated weights for policy 0, policy_version 700378 (0.0030) [2024-04-28 19:57:37,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.8, 300 sec: 55816.7). Total num frames: 11475009536. Throughput: 0: 55514.8. Samples: 1965305520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:37,169][57108] Avg episode reward: [(0, '0.607')] [2024-04-28 19:57:39,436][57339] Updated weights for policy 0, policy_version 700388 (0.0026) [2024-04-28 19:57:42,169][57108] Fps is (10 sec: 50790.2, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11475255296. Throughput: 0: 55435.1. Samples: 1965636800. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:42,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 19:57:42,903][57339] Updated weights for policy 0, policy_version 700398 (0.0029) [2024-04-28 19:57:45,231][57339] Updated weights for policy 0, policy_version 700408 (0.0032) [2024-04-28 19:57:47,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 11475566592. Throughput: 0: 55566.1. Samples: 1965974060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:47,169][57108] Avg episode reward: [(0, '0.548')] [2024-04-28 19:57:48,615][57339] Updated weights for policy 0, policy_version 700418 (0.0031) [2024-04-28 19:57:51,135][57339] Updated weights for policy 0, policy_version 700428 (0.0029) [2024-04-28 19:57:52,169][57108] Fps is (10 sec: 58982.4, 60 sec: 55705.6, 300 sec: 55872.2). Total num frames: 11475845120. Throughput: 0: 55732.1. Samples: 1966145700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:52,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 19:57:54,549][57339] Updated weights for policy 0, policy_version 700438 (0.0031) [2024-04-28 19:57:56,959][57339] Updated weights for policy 0, policy_version 700448 (0.0026) [2024-04-28 19:57:57,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55705.5, 300 sec: 56038.8). Total num frames: 11476140032. Throughput: 0: 55720.1. Samples: 1966477300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:57:57,169][57108] Avg episode reward: [(0, '0.646')] [2024-04-28 19:58:00,507][57339] Updated weights for policy 0, policy_version 700458 (0.0024) [2024-04-28 19:58:02,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55872.2). Total num frames: 11476402176. Throughput: 0: 55699.1. Samples: 1966807620. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:58:02,170][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 19:58:03,054][57339] Updated weights for policy 0, policy_version 700468 (0.0029) [2024-04-28 19:58:06,275][57339] Updated weights for policy 0, policy_version 700478 (0.0031) [2024-04-28 19:58:07,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55816.6). Total num frames: 11476680704. Throughput: 0: 55568.8. Samples: 1966973040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-04-28 19:58:07,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 19:58:08,846][57339] Updated weights for policy 0, policy_version 700488 (0.0030) [2024-04-28 19:58:11,993][57339] Updated weights for policy 0, policy_version 700498 (0.0028) [2024-04-28 19:58:12,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55816.6). Total num frames: 11476959232. Throughput: 0: 55645.2. Samples: 1967312340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:12,170][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:58:14,674][57339] Updated weights for policy 0, policy_version 700508 (0.0026) [2024-04-28 19:58:17,118][57319] Signal inference workers to stop experience collection... (29350 times) [2024-04-28 19:58:17,118][57319] Signal inference workers to resume experience collection... (29350 times) [2024-04-28 19:58:17,150][57339] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-04-28 19:58:17,150][57339] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-04-28 19:58:17,169][57108] Fps is (10 sec: 54068.4, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11477221376. Throughput: 0: 55822.2. Samples: 1967654140. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:17,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 19:58:17,929][57339] Updated weights for policy 0, policy_version 700518 (0.0026) [2024-04-28 19:58:20,662][57339] Updated weights for policy 0, policy_version 700528 (0.0030) [2024-04-28 19:58:22,169][57108] Fps is (10 sec: 54067.3, 60 sec: 55159.4, 300 sec: 55761.1). Total num frames: 11477499904. Throughput: 0: 55634.4. Samples: 1967809080. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:22,170][57108] Avg episode reward: [(0, '0.656')] [2024-04-28 19:58:23,838][57339] Updated weights for policy 0, policy_version 700538 (0.0028) [2024-04-28 19:58:26,637][57339] Updated weights for policy 0, policy_version 700548 (0.0026) [2024-04-28 19:58:27,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 11477794816. Throughput: 0: 55625.1. Samples: 1968139940. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:27,170][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 19:58:29,891][57339] Updated weights for policy 0, policy_version 700558 (0.0029) [2024-04-28 19:58:32,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55432.4, 300 sec: 55872.2). Total num frames: 11478073344. Throughput: 0: 55532.1. Samples: 1968473000. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:32,169][57108] Avg episode reward: [(0, '0.530')] [2024-04-28 19:58:32,481][57339] Updated weights for policy 0, policy_version 700568 (0.0030) [2024-04-28 19:58:35,656][57339] Updated weights for policy 0, policy_version 700578 (0.0032) [2024-04-28 19:58:37,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.3, 300 sec: 55816.6). Total num frames: 11478351872. Throughput: 0: 55574.4. Samples: 1968646560. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:37,170][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 19:58:38,487][57339] Updated weights for policy 0, policy_version 700588 (0.0031) [2024-04-28 19:58:41,534][57339] Updated weights for policy 0, policy_version 700598 (0.0034) [2024-04-28 19:58:42,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11478614016. Throughput: 0: 55695.7. Samples: 1968983600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:42,169][57108] Avg episode reward: [(0, '0.493')] [2024-04-28 19:58:44,446][57339] Updated weights for policy 0, policy_version 700608 (0.0035) [2024-04-28 19:58:47,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55761.2). Total num frames: 11478908928. Throughput: 0: 55725.3. Samples: 1969315260. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:47,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 19:58:47,467][57339] Updated weights for policy 0, policy_version 700618 (0.0028) [2024-04-28 19:58:50,345][57339] Updated weights for policy 0, policy_version 700628 (0.0027) [2024-04-28 19:58:52,169][57108] Fps is (10 sec: 55705.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11479171072. Throughput: 0: 55646.1. Samples: 1969477100. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:52,169][57108] Avg episode reward: [(0, '0.564')] [2024-04-28 19:58:53,141][57339] Updated weights for policy 0, policy_version 700638 (0.0030) [2024-04-28 19:58:56,130][57339] Updated weights for policy 0, policy_version 700648 (0.0026) [2024-04-28 19:58:57,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11479465984. Throughput: 0: 55768.0. Samples: 1969821900. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:58:57,169][57108] Avg episode reward: [(0, '0.608')] [2024-04-28 19:58:58,933][57339] Updated weights for policy 0, policy_version 700658 (0.0030) [2024-04-28 19:59:02,000][57339] Updated weights for policy 0, policy_version 700668 (0.0028) [2024-04-28 19:59:02,169][57108] Fps is (10 sec: 57343.1, 60 sec: 55705.6, 300 sec: 55816.7). Total num frames: 11479744512. Throughput: 0: 55732.7. Samples: 1970162120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:02,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 19:59:04,792][57339] Updated weights for policy 0, policy_version 700678 (0.0034) [2024-04-28 19:59:07,169][57108] Fps is (10 sec: 57345.4, 60 sec: 55978.9, 300 sec: 55872.2). Total num frames: 11480039424. Throughput: 0: 55914.0. Samples: 1970325200. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:07,169][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 19:59:07,835][57339] Updated weights for policy 0, policy_version 700688 (0.0031) [2024-04-28 19:59:10,626][57339] Updated weights for policy 0, policy_version 700698 (0.0027) [2024-04-28 19:59:12,169][57108] Fps is (10 sec: 58983.1, 60 sec: 56251.9, 300 sec: 55927.8). Total num frames: 11480334336. Throughput: 0: 56062.4. Samples: 1970662740. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:12,169][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 19:59:13,587][57339] Updated weights for policy 0, policy_version 700708 (0.0027) [2024-04-28 19:59:16,630][57339] Updated weights for policy 0, policy_version 700718 (0.0025) [2024-04-28 19:59:17,169][57108] Fps is (10 sec: 55705.1, 60 sec: 56251.7, 300 sec: 55816.7). Total num frames: 11480596480. Throughput: 0: 56248.4. Samples: 1971004180. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:17,169][57108] Avg episode reward: [(0, '0.690')] [2024-04-28 19:59:19,436][57339] Updated weights for policy 0, policy_version 700728 (0.0027) [2024-04-28 19:59:22,169][57108] Fps is (10 sec: 54066.8, 60 sec: 56251.9, 300 sec: 55872.2). Total num frames: 11480875008. Throughput: 0: 56104.2. Samples: 1971171240. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:22,169][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 19:59:22,289][57339] Updated weights for policy 0, policy_version 700738 (0.0030) [2024-04-28 19:59:25,422][57339] Updated weights for policy 0, policy_version 700748 (0.0032) [2024-04-28 19:59:27,169][57108] Fps is (10 sec: 54066.7, 60 sec: 55705.6, 300 sec: 55761.1). Total num frames: 11481137152. Throughput: 0: 56109.1. Samples: 1971508520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:27,170][57108] Avg episode reward: [(0, '0.567')] [2024-04-28 19:59:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000700754_11481153536.pth... [2024-04-28 19:59:27,226][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000699937_11467767808.pth [2024-04-28 19:59:28,124][57339] Updated weights for policy 0, policy_version 700758 (0.0029) [2024-04-28 19:59:31,301][57339] Updated weights for policy 0, policy_version 700768 (0.0032) [2024-04-28 19:59:32,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11481432064. Throughput: 0: 56204.5. Samples: 1971844460. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:32,170][57108] Avg episode reward: [(0, '0.580')] [2024-04-28 19:59:34,005][57339] Updated weights for policy 0, policy_version 700778 (0.0030) [2024-04-28 19:59:36,647][57319] Signal inference workers to stop experience collection... 
(29400 times) [2024-04-28 19:59:36,649][57319] Signal inference workers to resume experience collection... (29400 times) [2024-04-28 19:59:36,680][57339] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-04-28 19:59:36,680][57339] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-04-28 19:59:37,169][57108] Fps is (10 sec: 55706.5, 60 sec: 55705.8, 300 sec: 55761.2). Total num frames: 11481694208. Throughput: 0: 56197.7. Samples: 1972006000. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:37,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 19:59:37,181][57339] Updated weights for policy 0, policy_version 700788 (0.0036) [2024-04-28 19:59:39,708][57339] Updated weights for policy 0, policy_version 700798 (0.0026) [2024-04-28 19:59:42,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55978.5, 300 sec: 55816.6). Total num frames: 11481972736. Throughput: 0: 55961.9. Samples: 1972340180. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:42,169][57108] Avg episode reward: [(0, '0.627')] [2024-04-28 19:59:43,122][57339] Updated weights for policy 0, policy_version 700808 (0.0025) [2024-04-28 19:59:45,474][57339] Updated weights for policy 0, policy_version 700818 (0.0024) [2024-04-28 19:59:47,169][57108] Fps is (10 sec: 58981.8, 60 sec: 56251.7, 300 sec: 55872.2). Total num frames: 11482284032. Throughput: 0: 55860.0. Samples: 1972675820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-04-28 19:59:47,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 19:59:48,979][57339] Updated weights for policy 0, policy_version 700828 (0.0027) [2024-04-28 19:59:51,469][57339] Updated weights for policy 0, policy_version 700838 (0.0027) [2024-04-28 19:59:52,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56524.6, 300 sec: 55872.2). Total num frames: 11482562560. Throughput: 0: 56110.0. Samples: 1972850160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 19:59:52,170][57108] Avg episode reward: [(0, '0.631')] [2024-04-28 19:59:54,934][57339] Updated weights for policy 0, policy_version 700848 (0.0031) [2024-04-28 19:59:57,169][57108] Fps is (10 sec: 54067.8, 60 sec: 55978.9, 300 sec: 55816.7). Total num frames: 11482824704. Throughput: 0: 55957.8. Samples: 1973180840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 19:59:57,169][57108] Avg episode reward: [(0, '0.539')] [2024-04-28 19:59:57,321][57339] Updated weights for policy 0, policy_version 700858 (0.0030) [2024-04-28 20:00:00,812][57339] Updated weights for policy 0, policy_version 700868 (0.0026) [2024-04-28 20:00:02,169][57108] Fps is (10 sec: 54067.9, 60 sec: 55978.7, 300 sec: 55761.2). Total num frames: 11483103232. Throughput: 0: 55824.1. Samples: 1973516260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:02,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 20:00:03,341][57339] Updated weights for policy 0, policy_version 700878 (0.0027) [2024-04-28 20:00:06,666][57339] Updated weights for policy 0, policy_version 700888 (0.0027) [2024-04-28 20:00:07,169][57108] Fps is (10 sec: 54066.9, 60 sec: 55432.5, 300 sec: 55761.1). Total num frames: 11483365376. Throughput: 0: 55596.9. Samples: 1973673100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:07,169][57108] Avg episode reward: [(0, '0.640')] [2024-04-28 20:00:09,197][57339] Updated weights for policy 0, policy_version 700898 (0.0031) [2024-04-28 20:00:12,169][57108] Fps is (10 sec: 50790.4, 60 sec: 54613.3, 300 sec: 55650.1). Total num frames: 11483611136. Throughput: 0: 55513.5. Samples: 1974006620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:12,169][57108] Avg episode reward: [(0, '0.719')] [2024-04-28 20:00:12,659][57339] Updated weights for policy 0, policy_version 700908 (0.0027) [2024-04-28 20:00:15,282][57339] Updated weights for policy 0, policy_version 700918 (0.0035) [2024-04-28 20:00:17,169][57108] Fps is (10 sec: 55704.8, 60 sec: 55432.4, 300 sec: 55761.1). Total num frames: 11483922432. Throughput: 0: 55484.2. Samples: 1974341260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:17,170][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 20:00:18,573][57339] Updated weights for policy 0, policy_version 700928 (0.0031) [2024-04-28 20:00:21,096][57339] Updated weights for policy 0, policy_version 700938 (0.0030) [2024-04-28 20:00:22,169][57108] Fps is (10 sec: 62259.3, 60 sec: 55978.7, 300 sec: 55816.7). Total num frames: 11484233728. Throughput: 0: 55720.9. Samples: 1974513440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:22,169][57108] Avg episode reward: [(0, '0.595')] [2024-04-28 20:00:24,456][57339] Updated weights for policy 0, policy_version 700948 (0.0026) [2024-04-28 20:00:26,868][57339] Updated weights for policy 0, policy_version 700958 (0.0033) [2024-04-28 20:00:27,169][57108] Fps is (10 sec: 58983.2, 60 sec: 56251.8, 300 sec: 55816.7). Total num frames: 11484512256. Throughput: 0: 55614.8. Samples: 1974842840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:27,169][57108] Avg episode reward: [(0, '0.566')] [2024-04-28 20:00:30,474][57339] Updated weights for policy 0, policy_version 700968 (0.0030) [2024-04-28 20:00:32,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11484790784. Throughput: 0: 55583.1. Samples: 1975177060. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:32,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 20:00:32,689][57339] Updated weights for policy 0, policy_version 700978 (0.0027) [2024-04-28 20:00:36,301][57339] Updated weights for policy 0, policy_version 700988 (0.0036) [2024-04-28 20:00:37,169][57108] Fps is (10 sec: 50789.7, 60 sec: 55432.3, 300 sec: 55594.5). Total num frames: 11485020160. Throughput: 0: 55383.5. Samples: 1975342420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:37,170][57108] Avg episode reward: [(0, '0.721')] [2024-04-28 20:00:38,713][57339] Updated weights for policy 0, policy_version 700998 (0.0030) [2024-04-28 20:00:39,254][57319] Signal inference workers to stop experience collection... (29450 times) [2024-04-28 20:00:39,255][57319] Signal inference workers to resume experience collection... (29450 times) [2024-04-28 20:00:39,273][57339] InferenceWorker_p0-w0: stopping experience collection (29450 times) [2024-04-28 20:00:39,273][57339] InferenceWorker_p0-w0: resuming experience collection (29450 times) [2024-04-28 20:00:42,169][57108] Fps is (10 sec: 50790.7, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11485298688. Throughput: 0: 55399.0. Samples: 1975673800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:42,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 20:00:42,273][57339] Updated weights for policy 0, policy_version 701008 (0.0033) [2024-04-28 20:00:44,542][57339] Updated weights for policy 0, policy_version 701018 (0.0029) [2024-04-28 20:00:47,169][57108] Fps is (10 sec: 55706.1, 60 sec: 54886.4, 300 sec: 55650.1). Total num frames: 11485577216. Throughput: 0: 55385.2. Samples: 1976008600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:47,169][57108] Avg episode reward: [(0, '0.674')] [2024-04-28 20:00:48,096][57339] Updated weights for policy 0, policy_version 701028 (0.0028) [2024-04-28 20:00:50,578][57339] Updated weights for policy 0, policy_version 701038 (0.0027) [2024-04-28 20:00:52,169][57108] Fps is (10 sec: 58982.3, 60 sec: 55432.6, 300 sec: 55761.1). Total num frames: 11485888512. Throughput: 0: 55568.4. Samples: 1976173680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:52,169][57108] Avg episode reward: [(0, '0.543')] [2024-04-28 20:00:54,175][57339] Updated weights for policy 0, policy_version 701048 (0.0025) [2024-04-28 20:00:56,537][57339] Updated weights for policy 0, policy_version 701058 (0.0026) [2024-04-28 20:00:57,169][57108] Fps is (10 sec: 57344.1, 60 sec: 55432.4, 300 sec: 55705.6). Total num frames: 11486150656. Throughput: 0: 55462.1. Samples: 1976502420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:00:57,169][57108] Avg episode reward: [(0, '0.601')] [2024-04-28 20:00:59,917][57339] Updated weights for policy 0, policy_version 701068 (0.0029) [2024-04-28 20:01:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11486429184. Throughput: 0: 55291.7. Samples: 1976829380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:01:02,169][57108] Avg episode reward: [(0, '0.596')] [2024-04-28 20:01:02,389][57339] Updated weights for policy 0, policy_version 701078 (0.0031) [2024-04-28 20:01:06,069][57339] Updated weights for policy 0, policy_version 701088 (0.0028) [2024-04-28 20:01:07,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11486707712. Throughput: 0: 55352.4. Samples: 1977004300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:01:07,169][57108] Avg episode reward: [(0, '0.518')] [2024-04-28 20:01:08,154][57339] Updated weights for policy 0, policy_version 701098 (0.0029) [2024-04-28 20:01:11,824][57339] Updated weights for policy 0, policy_version 701108 (0.0028) [2024-04-28 20:01:12,169][57108] Fps is (10 sec: 52429.1, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11486953472. Throughput: 0: 55564.5. Samples: 1977343240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:01:12,170][57108] Avg episode reward: [(0, '0.605')] [2024-04-28 20:01:14,104][57339] Updated weights for policy 0, policy_version 701118 (0.0025) [2024-04-28 20:01:17,169][57108] Fps is (10 sec: 50790.8, 60 sec: 54886.6, 300 sec: 55539.0). Total num frames: 11487215616. Throughput: 0: 55515.7. Samples: 1977675260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:01:17,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 20:01:17,744][57339] Updated weights for policy 0, policy_version 701128 (0.0029) [2024-04-28 20:01:19,954][57339] Updated weights for policy 0, policy_version 701138 (0.0030) [2024-04-28 20:01:22,169][57108] Fps is (10 sec: 57343.8, 60 sec: 54886.3, 300 sec: 55650.1). Total num frames: 11487526912. Throughput: 0: 55327.7. Samples: 1977832160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:01:22,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 20:01:23,725][57339] Updated weights for policy 0, policy_version 701148 (0.0035) [2024-04-28 20:01:25,880][57339] Updated weights for policy 0, policy_version 701158 (0.0028) [2024-04-28 20:01:27,169][57108] Fps is (10 sec: 60620.7, 60 sec: 55159.5, 300 sec: 55705.6). Total num frames: 11487821824. Throughput: 0: 55358.3. Samples: 1978164920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-04-28 20:01:27,169][57108] Avg episode reward: [(0, '0.689')] [2024-04-28 20:01:27,203][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000701162_11487838208.pth... [2024-04-28 20:01:27,245][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000700346_11474468864.pth [2024-04-28 20:01:29,651][57339] Updated weights for policy 0, policy_version 701168 (0.0030) [2024-04-28 20:01:31,862][57339] Updated weights for policy 0, policy_version 701178 (0.0028) [2024-04-28 20:01:32,169][57108] Fps is (10 sec: 58981.5, 60 sec: 55432.4, 300 sec: 55816.7). Total num frames: 11488116736. Throughput: 0: 55324.7. Samples: 1978498220. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:01:32,169][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 20:01:34,708][57319] Signal inference workers to stop experience collection... (29500 times) [2024-04-28 20:01:34,731][57339] InferenceWorker_p0-w0: stopping experience collection (29500 times) [2024-04-28 20:01:34,802][57319] Signal inference workers to resume experience collection... (29500 times) [2024-04-28 20:01:34,802][57339] InferenceWorker_p0-w0: resuming experience collection (29500 times) [2024-04-28 20:01:35,469][57339] Updated weights for policy 0, policy_version 701188 (0.0028) [2024-04-28 20:01:37,169][57108] Fps is (10 sec: 55704.9, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11488378880. Throughput: 0: 55611.1. Samples: 1978676180. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:01:37,169][57108] Avg episode reward: [(0, '0.638')] [2024-04-28 20:01:37,672][57339] Updated weights for policy 0, policy_version 701198 (0.0026) [2024-04-28 20:01:41,300][57339] Updated weights for policy 0, policy_version 701208 (0.0034) [2024-04-28 20:01:42,169][57108] Fps is (10 sec: 52430.1, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 11488641024. Throughput: 0: 55780.1. 
Samples: 1979012520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:01:42,169][57108] Avg episode reward: [(0, '0.581')] [2024-04-28 20:01:43,441][57339] Updated weights for policy 0, policy_version 701218 (0.0034) [2024-04-28 20:01:47,169][57108] Fps is (10 sec: 52429.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11488903168. Throughput: 0: 55791.2. Samples: 1979339980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:01:47,169][57108] Avg episode reward: [(0, '0.670')] [2024-04-28 20:01:47,188][57339] Updated weights for policy 0, policy_version 701228 (0.0029) [2024-04-28 20:01:49,235][57339] Updated weights for policy 0, policy_version 701238 (0.0026) [2024-04-28 20:01:52,169][57108] Fps is (10 sec: 54067.1, 60 sec: 54886.5, 300 sec: 55539.0). Total num frames: 11489181696. Throughput: 0: 55355.6. Samples: 1979495300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:01:52,169][57108] Avg episode reward: [(0, '0.663')] [2024-04-28 20:01:53,098][57339] Updated weights for policy 0, policy_version 701248 (0.0028) [2024-04-28 20:01:55,098][57339] Updated weights for policy 0, policy_version 701258 (0.0031) [2024-04-28 20:01:57,169][57108] Fps is (10 sec: 58983.0, 60 sec: 55705.8, 300 sec: 55705.6). Total num frames: 11489492992. Throughput: 0: 55197.1. Samples: 1979827100. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:01:57,169][57108] Avg episode reward: [(0, '0.559')] [2024-04-28 20:01:59,097][57339] Updated weights for policy 0, policy_version 701268 (0.0025) [2024-04-28 20:02:01,031][57339] Updated weights for policy 0, policy_version 701278 (0.0035) [2024-04-28 20:02:02,169][57108] Fps is (10 sec: 57344.0, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11489755136. Throughput: 0: 55228.4. Samples: 1980160540. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:02,169][57108] Avg episode reward: [(0, '0.585')] [2024-04-28 20:02:04,916][57339] Updated weights for policy 0, policy_version 701288 (0.0025) [2024-04-28 20:02:06,940][57339] Updated weights for policy 0, policy_version 701298 (0.0029) [2024-04-28 20:02:07,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55978.6, 300 sec: 55816.7). Total num frames: 11490066432. Throughput: 0: 55750.2. Samples: 1980340920. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:07,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 20:02:10,805][57339] Updated weights for policy 0, policy_version 701308 (0.0029) [2024-04-28 20:02:12,169][57108] Fps is (10 sec: 57343.1, 60 sec: 56251.6, 300 sec: 55705.6). Total num frames: 11490328576. Throughput: 0: 55719.4. Samples: 1980672300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:12,170][57108] Avg episode reward: [(0, '0.613')] [2024-04-28 20:02:12,916][57339] Updated weights for policy 0, policy_version 701318 (0.0033) [2024-04-28 20:02:16,811][57339] Updated weights for policy 0, policy_version 701328 (0.0032) [2024-04-28 20:02:17,169][57108] Fps is (10 sec: 50790.3, 60 sec: 55978.5, 300 sec: 55539.0). Total num frames: 11490574336. Throughput: 0: 55662.8. Samples: 1981003040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:17,170][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 20:02:19,000][57339] Updated weights for policy 0, policy_version 701338 (0.0029) [2024-04-28 20:02:22,169][57108] Fps is (10 sec: 52429.4, 60 sec: 55432.6, 300 sec: 55539.0). Total num frames: 11490852864. Throughput: 0: 55329.9. Samples: 1981166020. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:22,169][57108] Avg episode reward: [(0, '0.537')] [2024-04-28 20:02:22,715][57339] Updated weights for policy 0, policy_version 701348 (0.0028) [2024-04-28 20:02:25,075][57339] Updated weights for policy 0, policy_version 701358 (0.0029) [2024-04-28 20:02:27,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55159.3, 300 sec: 55538.9). Total num frames: 11491131392. Throughput: 0: 55262.4. Samples: 1981499340. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:27,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 20:02:28,517][57339] Updated weights for policy 0, policy_version 701368 (0.0025) [2024-04-28 20:02:30,962][57339] Updated weights for policy 0, policy_version 701378 (0.0030) [2024-04-28 20:02:32,169][57108] Fps is (10 sec: 55704.5, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 11491409920. Throughput: 0: 55312.6. Samples: 1981829060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:32,170][57108] Avg episode reward: [(0, '0.634')] [2024-04-28 20:02:34,337][57339] Updated weights for policy 0, policy_version 701388 (0.0029) [2024-04-28 20:02:34,698][57319] Signal inference workers to stop experience collection... (29550 times) [2024-04-28 20:02:34,735][57339] InferenceWorker_p0-w0: stopping experience collection (29550 times) [2024-04-28 20:02:34,762][57319] Signal inference workers to resume experience collection... (29550 times) [2024-04-28 20:02:34,762][57339] InferenceWorker_p0-w0: resuming experience collection (29550 times) [2024-04-28 20:02:36,688][57339] Updated weights for policy 0, policy_version 701398 (0.0031) [2024-04-28 20:02:37,169][57108] Fps is (10 sec: 57345.0, 60 sec: 55432.7, 300 sec: 55761.1). Total num frames: 11491704832. Throughput: 0: 55640.0. Samples: 1981999100. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:37,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 20:02:40,220][57339] Updated weights for policy 0, policy_version 701408 (0.0026) [2024-04-28 20:02:42,169][57108] Fps is (10 sec: 57345.1, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11491983360. Throughput: 0: 55659.4. Samples: 1982331780. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:42,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 20:02:42,530][57339] Updated weights for policy 0, policy_version 701418 (0.0027) [2024-04-28 20:02:46,254][57339] Updated weights for policy 0, policy_version 701428 (0.0030) [2024-04-28 20:02:47,169][57108] Fps is (10 sec: 55704.4, 60 sec: 55978.5, 300 sec: 55650.0). Total num frames: 11492261888. Throughput: 0: 55635.8. Samples: 1982664160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:47,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 20:02:48,593][57339] Updated weights for policy 0, policy_version 701438 (0.0029) [2024-04-28 20:02:52,169][57108] Fps is (10 sec: 52428.2, 60 sec: 55432.4, 300 sec: 55483.4). Total num frames: 11492507648. Throughput: 0: 55388.8. Samples: 1982833420. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:52,170][57108] Avg episode reward: [(0, '0.647')] [2024-04-28 20:02:52,187][57339] Updated weights for policy 0, policy_version 701448 (0.0030) [2024-04-28 20:02:54,612][57339] Updated weights for policy 0, policy_version 701458 (0.0029) [2024-04-28 20:02:57,169][57108] Fps is (10 sec: 52429.6, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 11492786176. Throughput: 0: 55367.7. Samples: 1983163840. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:02:57,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 20:02:58,094][57339] Updated weights for policy 0, policy_version 701468 (0.0030) [2024-04-28 20:03:00,432][57339] Updated weights for policy 0, policy_version 701478 (0.0028) [2024-04-28 20:03:02,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55159.3, 300 sec: 55539.0). Total num frames: 11493064704. Throughput: 0: 55410.6. Samples: 1983496520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:03:02,170][57108] Avg episode reward: [(0, '0.592')] [2024-04-28 20:03:03,915][57339] Updated weights for policy 0, policy_version 701488 (0.0026) [2024-04-28 20:03:06,367][57339] Updated weights for policy 0, policy_version 701498 (0.0026) [2024-04-28 20:03:07,169][57108] Fps is (10 sec: 57343.6, 60 sec: 54886.4, 300 sec: 55594.5). Total num frames: 11493359616. Throughput: 0: 55337.7. Samples: 1983656220. Policy #0 lag: (min: 0.0, avg: 8.0, max: 22.0) [2024-04-28 20:03:07,169][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 20:03:09,809][57339] Updated weights for policy 0, policy_version 701508 (0.0028) [2024-04-28 20:03:12,169][57108] Fps is (10 sec: 58982.9, 60 sec: 55432.6, 300 sec: 55705.6). Total num frames: 11493654528. Throughput: 0: 55413.5. Samples: 1983992940. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:12,169][57108] Avg episode reward: [(0, '0.565')] [2024-04-28 20:03:12,296][57339] Updated weights for policy 0, policy_version 701518 (0.0027) [2024-04-28 20:03:15,670][57339] Updated weights for policy 0, policy_version 701528 (0.0030) [2024-04-28 20:03:17,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11493916672. Throughput: 0: 55422.0. Samples: 1984323040. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:17,169][57108] Avg episode reward: [(0, '0.629')] [2024-04-28 20:03:18,600][57339] Updated weights for policy 0, policy_version 701538 (0.0026) [2024-04-28 20:03:21,423][57339] Updated weights for policy 0, policy_version 701548 (0.0029) [2024-04-28 20:03:22,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55978.7, 300 sec: 55650.1). Total num frames: 11494211584. Throughput: 0: 55524.9. Samples: 1984497720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:22,169][57108] Avg episode reward: [(0, '0.576')] [2024-04-28 20:03:24,385][57339] Updated weights for policy 0, policy_version 701558 (0.0028) [2024-04-28 20:03:27,169][57108] Fps is (10 sec: 55705.2, 60 sec: 55705.6, 300 sec: 55594.5). Total num frames: 11494473728. Throughput: 0: 55498.5. Samples: 1984829220. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:27,170][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 20:03:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000701567_11494473728.pth... [2024-04-28 20:03:27,228][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000700754_11481153536.pth [2024-04-28 20:03:27,410][57339] Updated weights for policy 0, policy_version 701568 (0.0030) [2024-04-28 20:03:30,564][57339] Updated weights for policy 0, policy_version 701578 (0.0029) [2024-04-28 20:03:32,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55432.7, 300 sec: 55539.0). Total num frames: 11494735872. Throughput: 0: 55462.0. Samples: 1985159940. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:32,169][57108] Avg episode reward: [(0, '0.577')] [2024-04-28 20:03:33,081][57319] Signal inference workers to stop experience collection... (29600 times) [2024-04-28 20:03:33,081][57319] Signal inference workers to resume experience collection... 
(29600 times) [2024-04-28 20:03:33,111][57339] InferenceWorker_p0-w0: stopping experience collection (29600 times) [2024-04-28 20:03:33,111][57339] InferenceWorker_p0-w0: resuming experience collection (29600 times) [2024-04-28 20:03:33,197][57339] Updated weights for policy 0, policy_version 701588 (0.0028) [2024-04-28 20:03:36,475][57339] Updated weights for policy 0, policy_version 701598 (0.0026) [2024-04-28 20:03:37,169][57108] Fps is (10 sec: 52429.0, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 11494998016. Throughput: 0: 55258.7. Samples: 1985320060. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:37,169][57108] Avg episode reward: [(0, '0.618')] [2024-04-28 20:03:39,153][57339] Updated weights for policy 0, policy_version 701608 (0.0027) [2024-04-28 20:03:42,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55159.5, 300 sec: 55539.0). Total num frames: 11495292928. Throughput: 0: 55320.1. Samples: 1985653240. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:42,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 20:03:42,236][57339] Updated weights for policy 0, policy_version 701618 (0.0029) [2024-04-28 20:03:45,075][57339] Updated weights for policy 0, policy_version 701628 (0.0029) [2024-04-28 20:03:47,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55159.7, 300 sec: 55594.5). Total num frames: 11495571456. Throughput: 0: 55322.4. Samples: 1985986020. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:47,169][57108] Avg episode reward: [(0, '0.619')] [2024-04-28 20:03:48,273][57339] Updated weights for policy 0, policy_version 701638 (0.0026) [2024-04-28 20:03:50,869][57339] Updated weights for policy 0, policy_version 701648 (0.0030) [2024-04-28 20:03:52,169][57108] Fps is (10 sec: 57343.4, 60 sec: 55978.7, 300 sec: 55594.5). Total num frames: 11495866368. Throughput: 0: 55602.2. Samples: 1986158320. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:52,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 20:03:54,038][57339] Updated weights for policy 0, policy_version 701658 (0.0027) [2024-04-28 20:03:56,704][57339] Updated weights for policy 0, policy_version 701668 (0.0028) [2024-04-28 20:03:57,169][57108] Fps is (10 sec: 57343.3, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11496144896. Throughput: 0: 55587.0. Samples: 1986494360. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:03:57,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 20:04:00,192][57339] Updated weights for policy 0, policy_version 701678 (0.0028) [2024-04-28 20:04:02,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.7, 300 sec: 55539.0). Total num frames: 11496423424. Throughput: 0: 55576.4. Samples: 1986823980. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:02,170][57108] Avg episode reward: [(0, '0.583')] [2024-04-28 20:04:02,667][57339] Updated weights for policy 0, policy_version 701688 (0.0027) [2024-04-28 20:04:05,893][57339] Updated weights for policy 0, policy_version 701698 (0.0027) [2024-04-28 20:04:07,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 11496685568. Throughput: 0: 55440.4. Samples: 1986992540. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:07,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 20:04:08,515][57339] Updated weights for policy 0, policy_version 701708 (0.0032) [2024-04-28 20:04:11,879][57339] Updated weights for policy 0, policy_version 701718 (0.0026) [2024-04-28 20:04:12,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55159.4, 300 sec: 55483.4). Total num frames: 11496964096. Throughput: 0: 55493.8. Samples: 1987326440. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:12,170][57108] Avg episode reward: [(0, '0.588')] [2024-04-28 20:04:14,291][57339] Updated weights for policy 0, policy_version 701728 (0.0029) [2024-04-28 20:04:17,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 11497242624. Throughput: 0: 55681.2. Samples: 1987665600. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:17,170][57108] Avg episode reward: [(0, '0.680')] [2024-04-28 20:04:17,725][57339] Updated weights for policy 0, policy_version 701738 (0.0034) [2024-04-28 20:04:20,063][57339] Updated weights for policy 0, policy_version 701748 (0.0024) [2024-04-28 20:04:22,169][57108] Fps is (10 sec: 57344.2, 60 sec: 55432.4, 300 sec: 55594.5). Total num frames: 11497537536. Throughput: 0: 55741.3. Samples: 1987828420. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:22,170][57108] Avg episode reward: [(0, '0.662')] [2024-04-28 20:04:23,484][57339] Updated weights for policy 0, policy_version 701758 (0.0025) [2024-04-28 20:04:26,052][57339] Updated weights for policy 0, policy_version 701768 (0.0026) [2024-04-28 20:04:27,169][57108] Fps is (10 sec: 57344.7, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11497816064. Throughput: 0: 55752.8. Samples: 1988162120. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:27,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 20:04:29,376][57339] Updated weights for policy 0, policy_version 701778 (0.0030) [2024-04-28 20:04:31,916][57339] Updated weights for policy 0, policy_version 701788 (0.0031) [2024-04-28 20:04:32,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55978.6, 300 sec: 55594.5). Total num frames: 11498094592. Throughput: 0: 55809.2. Samples: 1988497440. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:32,169][57108] Avg episode reward: [(0, '0.600')] [2024-04-28 20:04:35,116][57339] Updated weights for policy 0, policy_version 701798 (0.0036) [2024-04-28 20:04:36,253][57319] Signal inference workers to stop experience collection... (29650 times) [2024-04-28 20:04:36,290][57339] InferenceWorker_p0-w0: stopping experience collection (29650 times) [2024-04-28 20:04:36,316][57319] Signal inference workers to resume experience collection... (29650 times) [2024-04-28 20:04:36,317][57339] InferenceWorker_p0-w0: resuming experience collection (29650 times) [2024-04-28 20:04:37,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.8, 300 sec: 55594.5). Total num frames: 11498373120. Throughput: 0: 55733.4. Samples: 1988666320. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:37,169][57108] Avg episode reward: [(0, '0.552')] [2024-04-28 20:04:37,683][57339] Updated weights for policy 0, policy_version 701808 (0.0026) [2024-04-28 20:04:41,345][57339] Updated weights for policy 0, policy_version 701818 (0.0030) [2024-04-28 20:04:42,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55978.5, 300 sec: 55483.4). Total num frames: 11498651648. Throughput: 0: 55810.2. Samples: 1989005820. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:42,169][57108] Avg episode reward: [(0, '0.604')] [2024-04-28 20:04:43,516][57339] Updated weights for policy 0, policy_version 701828 (0.0030) [2024-04-28 20:04:47,115][57339] Updated weights for policy 0, policy_version 701838 (0.0031) [2024-04-28 20:04:47,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55427.9). Total num frames: 11498913792. Throughput: 0: 55828.0. Samples: 1989336240. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-04-28 20:04:47,169][57108] Avg episode reward: [(0, '0.569')] [2024-04-28 20:04:49,425][57339] Updated weights for policy 0, policy_version 701848 (0.0029) [2024-04-28 20:04:52,169][57108] Fps is (10 sec: 52429.6, 60 sec: 55159.6, 300 sec: 55427.9). Total num frames: 11499175936. Throughput: 0: 55593.9. Samples: 1989494260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:04:52,169][57108] Avg episode reward: [(0, '0.598')] [2024-04-28 20:04:52,899][57339] Updated weights for policy 0, policy_version 701858 (0.0027) [2024-04-28 20:04:55,258][57339] Updated weights for policy 0, policy_version 701868 (0.0026) [2024-04-28 20:04:57,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 11499470848. Throughput: 0: 55567.1. Samples: 1989826960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:04:57,169][57108] Avg episode reward: [(0, '0.550')] [2024-04-28 20:04:59,097][57339] Updated weights for policy 0, policy_version 701878 (0.0028) [2024-04-28 20:05:01,464][57339] Updated weights for policy 0, policy_version 701888 (0.0026) [2024-04-28 20:05:02,169][57108] Fps is (10 sec: 57342.9, 60 sec: 55432.5, 300 sec: 55539.0). Total num frames: 11499749376. Throughput: 0: 55446.2. Samples: 1990160680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:02,170][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 20:05:04,897][57339] Updated weights for policy 0, policy_version 701898 (0.0024) [2024-04-28 20:05:07,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11500044288. Throughput: 0: 55763.2. Samples: 1990337760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:07,170][57108] Avg episode reward: [(0, '0.679')] [2024-04-28 20:05:07,322][57339] Updated weights for policy 0, policy_version 701908 (0.0031) [2024-04-28 20:05:10,667][57339] Updated weights for policy 0, policy_version 701918 (0.0036) [2024-04-28 20:05:12,169][57108] Fps is (10 sec: 55706.7, 60 sec: 55705.8, 300 sec: 55539.0). Total num frames: 11500306432. Throughput: 0: 55700.1. Samples: 1990668620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:12,169][57108] Avg episode reward: [(0, '0.611')] [2024-04-28 20:05:13,059][57339] Updated weights for policy 0, policy_version 701928 (0.0032) [2024-04-28 20:05:16,472][57339] Updated weights for policy 0, policy_version 701938 (0.0034) [2024-04-28 20:05:17,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55705.7, 300 sec: 55427.9). Total num frames: 11500584960. Throughput: 0: 55593.0. Samples: 1990999120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:17,169][57108] Avg episode reward: [(0, '0.568')] [2024-04-28 20:05:19,017][57339] Updated weights for policy 0, policy_version 701948 (0.0028) [2024-04-28 20:05:22,169][57108] Fps is (10 sec: 55705.5, 60 sec: 55432.6, 300 sec: 55427.9). Total num frames: 11500863488. Throughput: 0: 55508.0. Samples: 1991164180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:22,169][57108] Avg episode reward: [(0, '0.737')] [2024-04-28 20:05:22,389][57339] Updated weights for policy 0, policy_version 701958 (0.0026) [2024-04-28 20:05:25,057][57339] Updated weights for policy 0, policy_version 701968 (0.0029) [2024-04-28 20:05:27,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55432.5, 300 sec: 55427.9). Total num frames: 11501142016. Throughput: 0: 55278.3. Samples: 1991493340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:27,169][57108] Avg episode reward: [(0, '0.615')] [2024-04-28 20:05:27,178][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000701974_11501142016.pth... [2024-04-28 20:05:27,224][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000701162_11487838208.pth [2024-04-28 20:05:28,298][57339] Updated weights for policy 0, policy_version 701978 (0.0034) [2024-04-28 20:05:30,912][57339] Updated weights for policy 0, policy_version 701988 (0.0032) [2024-04-28 20:05:32,169][57108] Fps is (10 sec: 54067.1, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 11501404160. Throughput: 0: 55308.1. Samples: 1991825100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:32,170][57108] Avg episode reward: [(0, '0.660')] [2024-04-28 20:05:34,184][57339] Updated weights for policy 0, policy_version 701998 (0.0030) [2024-04-28 20:05:36,659][57339] Updated weights for policy 0, policy_version 702008 (0.0028) [2024-04-28 20:05:37,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55705.6, 300 sec: 55650.1). Total num frames: 11501715456. Throughput: 0: 55646.2. Samples: 1991998340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:37,169][57108] Avg episode reward: [(0, '0.633')] [2024-04-28 20:05:39,994][57339] Updated weights for policy 0, policy_version 702018 (0.0028) [2024-04-28 20:05:42,169][57108] Fps is (10 sec: 58982.5, 60 sec: 55705.7, 300 sec: 55650.1). Total num frames: 11501993984. Throughput: 0: 55654.9. Samples: 1992331420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:42,169][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 20:05:42,714][57339] Updated weights for policy 0, policy_version 702028 (0.0027) [2024-04-28 20:05:45,995][57339] Updated weights for policy 0, policy_version 702038 (0.0029) [2024-04-28 20:05:47,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55483.5). 
Total num frames: 11502256128. Throughput: 0: 55587.8. Samples: 1992662120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:47,169][57108] Avg episode reward: [(0, '0.510')] [2024-04-28 20:05:48,760][57339] Updated weights for policy 0, policy_version 702048 (0.0029) [2024-04-28 20:05:51,847][57339] Updated weights for policy 0, policy_version 702058 (0.0032) [2024-04-28 20:05:52,169][57108] Fps is (10 sec: 52428.9, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 11502518272. Throughput: 0: 55336.6. Samples: 1992827900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:52,169][57108] Avg episode reward: [(0, '0.591')] [2024-04-28 20:05:53,193][57319] Signal inference workers to stop experience collection... (29700 times) [2024-04-28 20:05:53,215][57339] InferenceWorker_p0-w0: stopping experience collection (29700 times) [2024-04-28 20:05:53,249][57319] Signal inference workers to resume experience collection... (29700 times) [2024-04-28 20:05:53,249][57339] InferenceWorker_p0-w0: resuming experience collection (29700 times) [2024-04-28 20:05:54,493][57339] Updated weights for policy 0, policy_version 702068 (0.0024) [2024-04-28 20:05:57,169][57108] Fps is (10 sec: 54066.1, 60 sec: 55432.5, 300 sec: 55483.4). Total num frames: 11502796800. Throughput: 0: 55402.0. Samples: 1993161720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:05:57,170][57108] Avg episode reward: [(0, '0.650')] [2024-04-28 20:05:57,642][57339] Updated weights for policy 0, policy_version 702078 (0.0032) [2024-04-28 20:06:00,360][57339] Updated weights for policy 0, policy_version 702088 (0.0030) [2024-04-28 20:06:02,169][57108] Fps is (10 sec: 57342.8, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11503091712. Throughput: 0: 55523.8. Samples: 1993497700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:02,170][57108] Avg episode reward: [(0, '0.599')] [2024-04-28 20:06:03,570][57339] Updated weights for policy 0, policy_version 702098 (0.0028) [2024-04-28 20:06:06,237][57339] Updated weights for policy 0, policy_version 702108 (0.0028) [2024-04-28 20:06:07,169][57108] Fps is (10 sec: 57343.9, 60 sec: 55432.4, 300 sec: 55650.0). Total num frames: 11503370240. Throughput: 0: 55628.2. Samples: 1993667460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:07,169][57108] Avg episode reward: [(0, '0.575')] [2024-04-28 20:06:09,401][57339] Updated weights for policy 0, policy_version 702118 (0.0027) [2024-04-28 20:06:12,169][57108] Fps is (10 sec: 55706.8, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11503648768. Throughput: 0: 55700.2. Samples: 1993999840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:12,169][57108] Avg episode reward: [(0, '0.628')] [2024-04-28 20:06:12,209][57339] Updated weights for policy 0, policy_version 702128 (0.0028) [2024-04-28 20:06:15,317][57339] Updated weights for policy 0, policy_version 702138 (0.0026) [2024-04-28 20:06:17,169][57108] Fps is (10 sec: 55705.4, 60 sec: 55705.4, 300 sec: 55594.5). Total num frames: 11503927296. Throughput: 0: 55712.2. Samples: 1994332160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:17,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 20:06:17,982][57339] Updated weights for policy 0, policy_version 702148 (0.0027) [2024-04-28 20:06:21,157][57339] Updated weights for policy 0, policy_version 702158 (0.0025) [2024-04-28 20:06:22,169][57108] Fps is (10 sec: 52428.6, 60 sec: 55159.5, 300 sec: 55427.9). Total num frames: 11504173056. Throughput: 0: 55542.3. Samples: 1994497740. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:22,169][57108] Avg episode reward: [(0, '0.524')] [2024-04-28 20:06:23,766][57339] Updated weights for policy 0, policy_version 702168 (0.0032) [2024-04-28 20:06:27,090][57339] Updated weights for policy 0, policy_version 702178 (0.0034) [2024-04-28 20:06:27,169][57108] Fps is (10 sec: 55706.2, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 11504484352. Throughput: 0: 55489.6. Samples: 1994828460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:27,169][57108] Avg episode reward: [(0, '0.494')] [2024-04-28 20:06:29,738][57339] Updated weights for policy 0, policy_version 702188 (0.0032) [2024-04-28 20:06:32,169][57108] Fps is (10 sec: 57343.7, 60 sec: 55705.6, 300 sec: 55483.5). Total num frames: 11504746496. Throughput: 0: 55557.7. Samples: 1995162220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:32,169][57108] Avg episode reward: [(0, '0.525')] [2024-04-28 20:06:33,066][57339] Updated weights for policy 0, policy_version 702198 (0.0028) [2024-04-28 20:06:35,643][57339] Updated weights for policy 0, policy_version 702208 (0.0027) [2024-04-28 20:06:37,169][57108] Fps is (10 sec: 55706.3, 60 sec: 55432.6, 300 sec: 55594.5). Total num frames: 11505041408. Throughput: 0: 55660.4. Samples: 1995332620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:37,169][57108] Avg episode reward: [(0, '0.554')] [2024-04-28 20:06:38,959][57339] Updated weights for policy 0, policy_version 702218 (0.0027) [2024-04-28 20:06:41,496][57339] Updated weights for policy 0, policy_version 702228 (0.0028) [2024-04-28 20:06:42,169][57108] Fps is (10 sec: 57344.3, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11505319936. Throughput: 0: 55675.8. Samples: 1995667120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:42,169][57108] Avg episode reward: [(0, '0.572')] [2024-04-28 20:06:44,785][57339] Updated weights for policy 0, policy_version 702238 (0.0027) [2024-04-28 20:06:47,169][57108] Fps is (10 sec: 55705.0, 60 sec: 55705.5, 300 sec: 55650.0). Total num frames: 11505598464. Throughput: 0: 55666.3. Samples: 1996002680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:47,170][57108] Avg episode reward: [(0, '0.590')] [2024-04-28 20:06:47,417][57339] Updated weights for policy 0, policy_version 702248 (0.0026) [2024-04-28 20:06:50,527][57339] Updated weights for policy 0, policy_version 702258 (0.0032) [2024-04-28 20:06:52,169][57108] Fps is (10 sec: 55705.1, 60 sec: 55978.6, 300 sec: 55539.0). Total num frames: 11505876992. Throughput: 0: 55700.6. Samples: 1996173980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:52,169][57108] Avg episode reward: [(0, '0.624')] [2024-04-28 20:06:53,315][57339] Updated weights for policy 0, policy_version 702268 (0.0029) [2024-04-28 20:06:56,288][57339] Updated weights for policy 0, policy_version 702278 (0.0026) [2024-04-28 20:06:57,169][57108] Fps is (10 sec: 55706.1, 60 sec: 55978.8, 300 sec: 55594.5). Total num frames: 11506155520. Throughput: 0: 55849.3. Samples: 1996513060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:06:57,169][57108] Avg episode reward: [(0, '0.597')] [2024-04-28 20:06:59,154][57339] Updated weights for policy 0, policy_version 702288 (0.0033) [2024-04-28 20:07:02,151][57319] Signal inference workers to stop experience collection... (29750 times) [2024-04-28 20:07:02,151][57319] Signal inference workers to resume experience collection... (29750 times) [2024-04-28 20:07:02,169][57108] Fps is (10 sec: 55705.8, 60 sec: 55705.7, 300 sec: 55483.5). Total num frames: 11506434048. Throughput: 0: 55877.6. Samples: 1996846640. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:02,169][57108] Avg episode reward: [(0, '0.659')] [2024-04-28 20:07:02,174][57339] InferenceWorker_p0-w0: stopping experience collection (29750 times) [2024-04-28 20:07:02,174][57339] InferenceWorker_p0-w0: resuming experience collection (29750 times) [2024-04-28 20:07:02,259][57339] Updated weights for policy 0, policy_version 702298 (0.0029) [2024-04-28 20:07:05,031][57339] Updated weights for policy 0, policy_version 702308 (0.0026) [2024-04-28 20:07:07,169][57108] Fps is (10 sec: 55704.6, 60 sec: 55705.6, 300 sec: 55539.0). Total num frames: 11506712576. Throughput: 0: 55705.5. Samples: 1997004500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:07,169][57108] Avg episode reward: [(0, '0.645')] [2024-04-28 20:07:08,130][57339] Updated weights for policy 0, policy_version 702318 (0.0033) [2024-04-28 20:07:10,814][57339] Updated weights for policy 0, policy_version 702328 (0.0024) [2024-04-28 20:07:12,169][57108] Fps is (10 sec: 55705.7, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11506991104. Throughput: 0: 55849.4. Samples: 1997341680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:12,169][57108] Avg episode reward: [(0, '0.639')] [2024-04-28 20:07:14,013][57339] Updated weights for policy 0, policy_version 702338 (0.0031) [2024-04-28 20:07:16,671][57339] Updated weights for policy 0, policy_version 702348 (0.0029) [2024-04-28 20:07:17,169][57108] Fps is (10 sec: 57344.6, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 11507286016. Throughput: 0: 55886.6. Samples: 1997677120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:17,169][57108] Avg episode reward: [(0, '0.632')] [2024-04-28 20:07:19,959][57339] Updated weights for policy 0, policy_version 702358 (0.0026) [2024-04-28 20:07:22,169][57108] Fps is (10 sec: 55705.6, 60 sec: 56251.7, 300 sec: 55650.1). Total num frames: 11507548160. Throughput: 0: 55975.5. 
Samples: 1997851520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:22,169][57108] Avg episode reward: [(0, '0.584')] [2024-04-28 20:07:22,603][57339] Updated weights for policy 0, policy_version 702368 (0.0028) [2024-04-28 20:07:26,068][57339] Updated weights for policy 0, policy_version 702378 (0.0034) [2024-04-28 20:07:27,169][57108] Fps is (10 sec: 55706.0, 60 sec: 55978.8, 300 sec: 55705.6). Total num frames: 11507843072. Throughput: 0: 55917.3. Samples: 1998183400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:27,169][57108] Avg episode reward: [(0, '0.593')] [2024-04-28 20:07:27,179][57319] Saving /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000702383_11507843072.pth... [2024-04-28 20:07:27,222][57319] Removing /workspace/metta/train_dir/p2.fa.clean/checkpoint_p0/checkpoint_000701567_11494473728.pth [2024-04-28 20:07:28,510][57339] Updated weights for policy 0, policy_version 702388 (0.0030) [2024-04-28 20:07:31,889][57339] Updated weights for policy 0, policy_version 702398 (0.0030) [2024-04-28 20:07:32,169][57108] Fps is (10 sec: 54067.5, 60 sec: 55705.7, 300 sec: 55539.0). Total num frames: 11508088832. Throughput: 0: 55770.8. Samples: 1998512360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:32,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 20:07:34,305][57339] Updated weights for policy 0, policy_version 702408 (0.0034) [2024-04-28 20:07:37,169][57108] Fps is (10 sec: 54066.6, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11508383744. Throughput: 0: 55558.6. Samples: 1998674120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:37,170][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 20:07:37,729][57339] Updated weights for policy 0, policy_version 702418 (0.0025) [2024-04-28 20:07:40,296][57339] Updated weights for policy 0, policy_version 702428 (0.0029) [2024-04-28 20:07:42,169][57108] Fps is (10 sec: 57343.0, 60 sec: 55705.5, 300 sec: 55594.5). Total num frames: 11508662272. Throughput: 0: 55440.3. Samples: 1999007880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:42,170][57108] Avg episode reward: [(0, '0.571')] [2024-04-28 20:07:43,581][57339] Updated weights for policy 0, policy_version 702438 (0.0033) [2024-04-28 20:07:46,342][57339] Updated weights for policy 0, policy_version 702448 (0.0029) [2024-04-28 20:07:47,169][57108] Fps is (10 sec: 55705.6, 60 sec: 55705.6, 300 sec: 55705.6). Total num frames: 11508940800. Throughput: 0: 55431.0. Samples: 1999341040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:47,170][57108] Avg episode reward: [(0, '0.587')] [2024-04-28 20:07:49,606][57339] Updated weights for policy 0, policy_version 702458 (0.0031) [2024-04-28 20:07:52,118][57339] Updated weights for policy 0, policy_version 702468 (0.0028) [2024-04-28 20:07:52,169][57108] Fps is (10 sec: 57344.4, 60 sec: 55978.7, 300 sec: 55761.1). Total num frames: 11509235712. Throughput: 0: 55712.6. Samples: 1999511560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:52,170][57108] Avg episode reward: [(0, '0.635')] [2024-04-28 20:07:55,428][57339] Updated weights for policy 0, policy_version 702478 (0.0028) [2024-04-28 20:07:57,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55650.1). Total num frames: 11509481472. Throughput: 0: 55605.3. Samples: 1999843920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:07:57,169][57108] Avg episode reward: [(0, '0.654')] [2024-04-28 20:07:57,916][57339] Updated weights for policy 0, policy_version 702488 (0.0028) [2024-04-28 20:08:01,396][57339] Updated weights for policy 0, policy_version 702498 (0.0029) [2024-04-28 20:08:02,169][57108] Fps is (10 sec: 54067.0, 60 sec: 55705.5, 300 sec: 55650.1). Total num frames: 11509776384. Throughput: 0: 55483.5. Samples: 2000173880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:08:02,169][57108] Avg episode reward: [(0, '0.643')] [2024-04-28 20:08:03,966][57339] Updated weights for policy 0, policy_version 702508 (0.0034) [2024-04-28 20:08:05,257][57319] Signal inference workers to stop experience collection... (29800 times) [2024-04-28 20:08:05,258][57319] Signal inference workers to resume experience collection... (29800 times) [2024-04-28 20:08:05,285][57339] InferenceWorker_p0-w0: stopping experience collection (29800 times) [2024-04-28 20:08:05,285][57339] InferenceWorker_p0-w0: resuming experience collection (29800 times) [2024-04-28 20:08:07,169][57108] Fps is (10 sec: 54067.2, 60 sec: 55159.6, 300 sec: 55483.4). Total num frames: 11510022144. Throughput: 0: 55344.9. Samples: 2000342040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-04-28 20:08:07,169][57108] Avg episode reward: [(0, '0.641')] [2024-04-28 20:08:07,423][57339] Updated weights for policy 0, policy_version 702518 (0.0031) [2024-04-28 20:08:10,323][57339] Updated weights for policy 0, policy_version 702528 (0.0027) [2024-04-28 20:08:12,169][57108] Fps is (10 sec: 54067.6, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11510317056. Throughput: 0: 55296.9. Samples: 2000671760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:12,169][57108] Avg episode reward: [(0, '0.651')] [2024-04-28 20:08:13,232][57339] Updated weights for policy 0, policy_version 702538 (0.0031) [2024-04-28 20:08:16,092][57339] Updated weights for policy 0, policy_version 702548 (0.0027) [2024-04-28 20:08:17,169][57108] Fps is (10 sec: 55705.5, 60 sec: 54886.4, 300 sec: 55483.4). Total num frames: 11510579200. Throughput: 0: 55457.7. Samples: 2001007960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:17,169][57108] Avg episode reward: [(0, '0.630')] [2024-04-28 20:08:19,140][57339] Updated weights for policy 0, policy_version 702558 (0.0030) [2024-04-28 20:08:22,126][57339] Updated weights for policy 0, policy_version 702568 (0.0039) [2024-04-28 20:08:22,169][57108] Fps is (10 sec: 55705.3, 60 sec: 55432.5, 300 sec: 55594.5). Total num frames: 11510874112. Throughput: 0: 55407.6. Samples: 2001167460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:22,170][57108] Avg episode reward: [(0, '0.644')] [2024-04-28 20:08:25,094][57339] Updated weights for policy 0, policy_version 702578 (0.0035) [2024-04-28 20:08:27,169][57108] Fps is (10 sec: 57344.5, 60 sec: 55159.5, 300 sec: 55650.1). Total num frames: 11511152640. Throughput: 0: 55362.9. Samples: 2001499200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:27,169][57108] Avg episode reward: [(0, '0.609')] [2024-04-28 20:08:27,917][57339] Updated weights for policy 0, policy_version 702588 (0.0030) [2024-04-28 20:08:30,993][57339] Updated weights for policy 0, policy_version 702598 (0.0031) [2024-04-28 20:08:32,169][57108] Fps is (10 sec: 57343.6, 60 sec: 55978.5, 300 sec: 55761.1). Total num frames: 11511447552. Throughput: 0: 55309.8. Samples: 2001829980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:32,169][57108] Avg episode reward: [(0, '0.534')] [2024-04-28 20:08:33,854][57339] Updated weights for policy 0, policy_version 702608 (0.0031) [2024-04-28 20:08:36,799][57339] Updated weights for policy 0, policy_version 702618 (0.0031) [2024-04-28 20:08:37,169][57108] Fps is (10 sec: 55704.5, 60 sec: 55432.5, 300 sec: 55650.0). Total num frames: 11511709696. Throughput: 0: 55491.0. Samples: 2002008660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:37,170][57108] Avg episode reward: [(0, '0.620')] [2024-04-28 20:08:39,770][57339] Updated weights for policy 0, policy_version 702628 (0.0026) [2024-04-28 20:08:42,169][57108] Fps is (10 sec: 50791.2, 60 sec: 54886.6, 300 sec: 55539.0). Total num frames: 11511955456. Throughput: 0: 55510.7. Samples: 2002341900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:42,169][57108] Avg episode reward: [(0, '0.594')] [2024-04-28 20:08:42,706][57339] Updated weights for policy 0, policy_version 702638 (0.0026) [2024-04-28 20:08:45,620][57339] Updated weights for policy 0, policy_version 702648 (0.0027) [2024-04-28 20:08:47,169][57108] Fps is (10 sec: 54068.4, 60 sec: 55159.6, 300 sec: 55539.0). Total num frames: 11512250368. Throughput: 0: 55704.6. Samples: 2002680580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:47,169][57108] Avg episode reward: [(0, '0.547')] [2024-04-28 20:08:48,553][57339] Updated weights for policy 0, policy_version 702658 (0.0029) [2024-04-28 20:08:51,377][57339] Updated weights for policy 0, policy_version 702668 (0.0032) [2024-04-28 20:08:52,169][57108] Fps is (10 sec: 57343.0, 60 sec: 54886.3, 300 sec: 55539.0). Total num frames: 11512528896. Throughput: 0: 55347.4. Samples: 2002832680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:52,170][57108] Avg episode reward: [(0, '0.698')] [2024-04-28 20:08:54,463][57339] Updated weights for policy 0, policy_version 702678 (0.0027) [2024-04-28 20:08:57,169][57108] Fps is (10 sec: 57343.8, 60 sec: 55705.7, 300 sec: 55594.5). Total num frames: 11512823808. Throughput: 0: 55531.2. Samples: 2003170660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:08:57,169][57108] Avg episode reward: [(0, '0.681')] [2024-04-28 20:08:57,180][57339] Updated weights for policy 0, policy_version 702688 (0.0028) [2024-04-28 20:09:00,349][57339] Updated weights for policy 0, policy_version 702698 (0.0029) [2024-04-28 20:09:02,169][57108] Fps is (10 sec: 57344.8, 60 sec: 55432.6, 300 sec: 55650.1). Total num frames: 11513102336. Throughput: 0: 55476.1. Samples: 2003504380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:09:02,169][57108] Avg episode reward: [(0, '0.617')] [2024-04-28 20:09:03,232][57339] Updated weights for policy 0, policy_version 702708 (0.0027) [2024-04-28 20:09:06,008][57319] Signal inference workers to stop experience collection... (29850 times) [2024-04-28 20:09:06,008][57319] Signal inference workers to resume experience collection... (29850 times) [2024-04-28 20:09:06,050][57339] InferenceWorker_p0-w0: stopping experience collection (29850 times) [2024-04-28 20:09:06,050][57339] InferenceWorker_p0-w0: resuming experience collection (29850 times) [2024-04-28 20:09:06,125][57339] Updated weights for policy 0, policy_version 702718 (0.0028) [2024-04-28 20:09:07,169][57108] Fps is (10 sec: 58982.3, 60 sec: 56524.8, 300 sec: 55761.2). Total num frames: 11513413632. Throughput: 0: 55821.4. Samples: 2003679420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:09:07,169][57108] Avg episode reward: [(0, '0.678')] [2024-04-28 20:09:09,211][57339] Updated weights for policy 0, policy_version 702728 (0.0030) [2024-04-28 20:09:11,943][57339] Updated weights for policy 0, policy_version 702738 (0.0028) [2024-04-28 20:09:12,169][57108] Fps is (10 sec: 57343.2, 60 sec: 55978.6, 300 sec: 55705.6). Total num frames: 11513675776. Throughput: 0: 56031.3. Samples: 2004020620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:09:12,170][57108] Avg episode reward: [(0, '0.610')] [2024-04-28 20:09:14,992][57339] Updated weights for policy 0, policy_version 702748 (0.0026) [2024-04-28 20:09:17,169][57108] Fps is (10 sec: 50789.7, 60 sec: 55705.5, 300 sec: 55539.0). Total num frames: 11513921536. Throughput: 0: 56070.7. Samples: 2004353160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-04-28 20:09:17,170][57108] Avg episode reward: [(0, '0.582')] [2024-04-28 20:09:17,880][57339] Updated weights for policy 0, policy_version 702758 (0.0027)